General Utilities
Provides several utility functions.
-
uchuutools.utils.get_parser(filename, fields=None, drop_fields=None)
Returns a parser that parses a single line from the input ascii file.
Parameters:
- filename (string, required) – A filename containing Rockstar/Consistent-Trees data. Can be a compressed file if the compression is one of the types supported by the generic_reader function.
- fields (list of strings, optional, default: None) – Describes which specific columns in the input file to carry across to the hdf5 file. The default action is to convert ALL columns.
- drop_fields (list of strings, optional, default: None) – Describes which columns are not carried through to the hdf5 file. Processed after fields, i.e., you can specify fields=None to create an initial list of all columns in the ascii file, and then specify drop_fields = [colname2, colname7, ...], and those columns will not be present in the hdf5 output.
Returns: parser (an instance of BaseParseFields) – A parser that will parse a single line (read from a Rockstar/Consistent-Trees file) and create a tuple containing only the relevant columns.
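The order in which fields and drop_fields are applied can be illustrated with a minimal sketch. It does not call uchuutools; select_columns and the sample column names are hypothetical and only mirror the behaviour described above (start from all columns when fields is None, then remove anything listed in drop_fields).

```python
def select_columns(all_columns, fields=None, drop_fields=None):
    """Hypothetical helper: apply `fields` first, then `drop_fields`."""
    # fields=None means "start from every column in the file".
    if fields is None:
        selected = list(all_columns)
    else:
        wanted = set(fields)
        # Keep the column order of the input file.
        selected = [col for col in all_columns if col in wanted]
    # drop_fields is processed after fields.
    if drop_fields is not None:
        unwanted = set(drop_fields)
        selected = [col for col in selected if col not in unwanted]
    return selected


columns = ["id", "mvir", "rvir", "x", "y", "z"]  # made-up column names
print(select_columns(columns, drop_fields=["rvir", "z"]))
# ['id', 'mvir', 'x', 'y']
print(select_columns(columns, fields=["id", "x", "y"], drop_fields=["y"]))
# ['id', 'x']
```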
-
uchuutools.utils.get_approx_totnumhalos(input_file, ndatabytes=None)
Returns an (approximate) number of lines containing data in the input_file.
Assumes that the only comment lines in the file occur at the beginning. Comment lines are assumed to begin with ‘#’.
Parameters:
- input_file (string, required) – The input filename for the Rockstar/Consistent-Trees file.
- ndatabytes (integer, optional) – The total number of bytes being processed. If not passed, the entire disk size of the input_file minus the initial header lines will be used (i.e., assumes that the entire file is being processed).
Returns: approx_totnumhalos (integer) – The approximate number of halos in the input file. The actual number of halos should be close to, but can be smaller or greater than, the approximate value.
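One way such an estimate could be formed is sketched below: skip the leading comment lines, sample a few data lines to get a mean line length, and divide the number of data bytes by it. This is an assumption for illustration, not necessarily how uchuutools computes the value; approx_data_lines and its nsample parameter are hypothetical, and the sketch handles uncompressed files only.

```python
import os


def approx_data_lines(path, ndatabytes=None, nsample=100):
    """Hypothetical estimate: data bytes divided by the mean sampled line length."""
    with open(path, "rb") as f:
        header_bytes = 0
        line = f.readline()
        # Comment lines are assumed to occur only at the top of the file.
        while line.startswith(b"#"):
            header_bytes += len(line)
            line = f.readline()
        # Sample up to `nsample` data lines to measure their total length.
        sampled_bytes, nlines = 0, 0
        while line and nlines < nsample:
            sampled_bytes += len(line)
            nlines += 1
            line = f.readline()
    if nlines == 0:
        return 0
    if ndatabytes is None:
        # Default: the whole file minus the header is being processed.
        ndatabytes = os.path.getsize(path) - header_bytes
    return round(ndatabytes * nlines / sampled_bytes)
```

Because only the first few data lines are sampled, the result is exact only when all lines have the same length, which is consistent with the caveat that the true count can be smaller or greater.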
-
uchuutools.utils.generic_reader(filename, mode='rt')
Returns a file-reader with the capability to read line-by-line from both compressed and normal text files.
Parameters:
- filename (string, required) – The filename for the associated input/output. Can be a compressed (.bz2, .gz, .xz, .zip) file as well as a regular ascii file.
- mode (string, optional, default: ‘rt’ (readonly-text mode)) – Controls the kind of i/o operation that will be performed.
Returns: f (file handle, generator) – Yields a generator that has the readline feature (i.e., supports the paradigm for line in f:). This file-reader generator is suitable for use in with statements, e.g., with generic_reader(<fname>) as f:
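The general pattern behind such a reader can be sketched with the standard library alone. The function below is a hypothetical stand-in, not the uchuutools implementation: it picks an opener from the file suffix and wraps it in a context manager. Handling of .zip archives is left out of this sketch.

```python
import bz2
import gzip
import lzma
from contextlib import contextmanager

# Suffix -> stdlib opener; each of these accepts a text mode such as 'rt'.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


@contextmanager
def open_maybe_compressed(filename, mode="rt"):
    """Hypothetical stand-in: yield a line-iterable handle for plain or compressed text."""
    opener = open  # fall back to the builtin for regular ascii files
    for suffix, candidate in _OPENERS.items():
        if filename.endswith(suffix):
            opener = candidate
            break
    f = opener(filename, mode)
    try:
        yield f
    finally:
        # Always close the handle, even if the caller's block raises.
        f.close()
```

Used as with open_maybe_compressed(path) as f:, the handle supports for line in f: regardless of whether the file is compressed.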
-
uchuutools.utils.get_metadata(input_file)
Returns metadata information for input_file. Includes all comment lines in the header, the Rockstar/Consistent-Trees version, and the input catalog type (either Rockstar or Consistent-Trees).
Assumes that the only comment lines in the file occur at the beginning. Comment lines are assumed to begin with ‘#’.
Parameters: input_file (string, required) – The input filename for the Rockstar/Consistent-Trees file. Compressed files (‘.bz2’, ‘.gz’, ‘.xz’, ‘.zip’) are also allowed as valid kinds of input_file.
Returns: metadata_dict (dictionary) – The dictionary contains four key-value pairs corresponding to the keys: [‘metadata’, ‘version’, ‘catalog_type’, ‘headerline’].
- metadata (string) – All lines at the beginning of the file that start with the character ‘#’.
- version (string) – The Rockstar or Consistent-Trees version that was used to generate input_file.
- catalog_type (string) – One of [Rockstar, Consistent Trees, Consistent Trees (hlist)]; indicates what kind of catalog is contained in input_file.
- headerline (string) – The first line in the input file with any leading/trailing white-space, and any leading ‘#’, removed.
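The header handling described above can be sketched as follows. split_header is a hypothetical helper that covers only the metadata and headerline entries; extracting the version and catalog type depends on the exact header text written by Rockstar/Consistent-Trees and is not attempted here.

```python
def split_header(lines):
    """Hypothetical helper: collect the leading '#' lines and the cleaned first line."""
    header = []
    for line in lines:
        # Comment lines are assumed to occur only at the top of the file.
        if not line.startswith("#"):
            break
        header.append(line)
    metadata = "".join(header)
    # First line with surrounding white-space and any leading '#' removed.
    headerline = header[0].strip().lstrip("#") if header else ""
    return {"metadata": metadata, "headerline": headerline}


sample = ["#id mvir rvir\n", "#a comment line\n", "1 2.0 3.0\n"]  # made-up content
info = split_header(sample)
print(info["headerline"])
# id mvir rvir
```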
-
uchuutools.utils.resize_halo_datasets(halos_dset, new_size, write_halo_props_cont, dtype)
Resizes the halo datasets.
Parameters:
- halos_dset (dictionary, required)
- new_size (scalar integer, required)
- write_halo_props_cont (boolean, required) – Controls if the individual halo properties are written as distinct datasets such that any given property for ALL halos is written contiguously (structure of arrays, SOA).
- dtype (numpy datatype)
Returns: True on successful completion.
-
uchuutools.utils.check_and_decompress(fname)
Decompresses the input file (if necessary) and returns the decompressed filename.
Parameters: fname (string, required) – Input filename, can be compressed.
Returns: decomp_fname (string) – The decompressed filename.
-
uchuutools.utils.distribute_array_over_ntasks(cost_array, rank, ntasks)
Calculates the subscript range for the rank’th task such that the work-load is evenly distributed across ntasks.
Parameters:
- cost_array (numpy array, required) – Contains the cost associated with processing each element of the array.
- rank (integer, required) – The integer rank of the task for which the work-load division is computed.
- ntasks (integer, required) – Total number of tasks that the array should be (evenly) distributed across.
Returns: (start, stop) (a tuple of (np.int64, np.int64)) – Contains the initial and final subscripts that the rank’th task should process.
Note: start and stop are both inclusive, i.e., all elements from start to stop should be included. For python array indexing with slices, this translates to arr[start:stop + 1].
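A cost-balanced split with inclusive bounds can be sketched as below. This is one possible scheme chosen for illustration, not necessarily the algorithm used by uchuutools: each element is assigned to the task whose share of the total cost contains the cumulative cost preceding that element. split_by_cost is a hypothetical name, and costs are assumed to be non-negative.

```python
from bisect import bisect_left
from itertools import accumulate


def split_by_cost(cost_array, rank, ntasks):
    """Hypothetical sketch: inclusive (start, stop) for `rank`, balanced by cumulative cost."""
    if not 0 <= rank < ntasks:
        raise ValueError("rank must satisfy 0 <= rank < ntasks")
    costs = list(cost_array)
    if not costs:
        return 0, -1  # nothing to process
    # before[i] is the total cost of all elements preceding element i.
    before = [0] + list(accumulate(costs[:-1]))
    total = sum(costs)
    # Element i belongs to the task whose cost window contains before[i].
    start = bisect_left(before, rank * total / ntasks)
    if rank == ntasks - 1:
        stop = len(costs) - 1  # the last task takes everything that remains
    else:
        stop = bisect_left(before, (rank + 1) * total / ntasks) - 1
    # Both bounds are inclusive; stop == start - 1 marks an empty range.
    return start, stop


costs = [4, 1, 1, 1, 1]
start, stop = split_by_cost(costs, rank=1, ntasks=2)
print(costs[start:stop + 1])
# [1, 1, 1, 1]
```

Note the slice costs[start:stop + 1] in the usage line: because stop is inclusive, the + 1 is needed to pick up the last element.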
-
uchuutools.utils.check_for_contiguous_halos(h5_task_file, write_halo_props_cont)
Checks that the hdf5 file can be appended to with the requested writing of halo properties.
Parameters:
- h5_task_file (string, required) – An existing hdf5 file. The file may or may not contain halos, but the dataset (or datasets, depending on the value of write_halo_props_cont) for the halo properties should already be created.
- write_halo_props_cont (boolean, required) – Controls if the individual halo properties are written as distinct datasets such that any given property for ALL halos is written contiguously (structure of arrays, SOA).
Returns: True on successful completion.
-
uchuutools.utils.write_halos(halos_dset, halos_dset_offset, halos, nhalos_to_write, write_halo_props_cont)
Writes halos into the relevant dataset(s) within a hdf5 file.
Parameters:
- halos_dset (dictionary, required) – Contains the halos dataset(s) within a hdf5 file where either the entire halos array or the individual halo properties should be written to. See the parameter write_halo_props_cont for further details.
- halos_dset_offset (scalar integer, required) – Contains the index within the halos dataset(s) where the write should start.
- halos (numpy structured array, required) – An array containing the halo properties that should be written out into the hdf5 file. The entire array may not be written out; see the parameter nhalos_to_write.
- nhalos_to_write (scalar integer, required) – Number of halos from the halos array that should be written out. Can be smaller than the shape of the halos array.
- write_halo_props_cont (boolean, required) – Controls if the individual halo properties are written as distinct datasets such that any given property for ALL halos is written contiguously (structure of arrays, SOA).
Returns: True on successful completion of the write.
-
uchuutools.utils.update_container_h5_file(fname, h5files, standard_consistent_trees=True)
Writes the container hdf5 file that has external links to the hdf5 datafiles with the mergertree information.
Parameters:
- fname (string, required) – The name of the output container file (usually forest.h5). A new file is always created; however, if the file fname previously existed, then the external links are preserved.
- h5files (list of filenames, required) – The list of filenames that were either newly created or updated. If the container file fname exists, then the union of the filenames that already existed in fname and h5files will be used to create the external links.
- standard_consistent_trees (boolean, optional, default: True) – Specifies whether the input files were from a parallel Consistent-Trees code or the standard Consistent-Trees code. Assumed to be the standard (i.e., the public) version of the Consistent-Trees catalog.
Returns: True on successful completion of the write.