General Utilities

Provides several utility functions.

uchuutools.utils.get_parser(filename, fields=None, drop_fields=None)

Returns a parser that parses a single line of the input ascii file.

Parameters:
  • filename (string, required) – A filename containing Rockstar/Consistent-Trees data. Can be a compressed file if the compression is one of the types supported by the generic_reader function.
  • fields (list of strings, optional, default: None) – Describes which specific columns in the input file to carry across to the hdf5 file. Default action is to convert ALL columns.
  • drop_fields (list of strings, optional, default: None) – Describes which columns are not carried through to the hdf5 file. Processed after fields, i.e., you can specify fields=None to create an initial list of all columns in the ascii file, and then specify drop_fields = [colname2, colname7, ...], and those columns will not be present in the hdf5 output.
Returns:

parser (an instance of BaseParseFields) – A parser that will parse a single line (read from a Rockstar/Consistent-Trees file) and create a tuple containing only the relevant columns.
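The order in which fields and drop_fields are applied can be sketched in plain Python. This is a minimal illustration of the selection rule described above, not the BaseParseFields implementation: the helper names select_columns and parse_line are hypothetical, and the sketch leaves every value as a string rather than converting types.

```python
def select_columns(all_columns, fields=None, drop_fields=None):
    """Hypothetical helper: return the column names to keep, in file order."""
    # fields=None means "start from every column in the file".
    if fields is None:
        keep = list(all_columns)
    else:
        wanted = set(fields)
        keep = [name for name in all_columns if name in wanted]
    # drop_fields is applied after fields, so a dropped column never survives.
    if drop_fields is not None:
        dropped = set(drop_fields)
        keep = [name for name in keep if name not in dropped]
    return keep


def parse_line(line, all_columns, keep):
    """Hypothetical parser: split one ascii line and keep only selected columns."""
    values = line.split()
    index = {name: i for i, name in enumerate(all_columns)}
    # No type conversion is attempted here; values stay as strings.
    return tuple(values[index[name]] for name in keep)
```

With columns id, mvir, rvir, x, calling select_columns with fields=None and drop_fields=["rvir"] keeps id, mvir and x, in that order.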

uchuutools.utils.get_approx_totnumhalos(input_file, ndatabytes=None)

Returns an (approximate) number of lines containing data in the input_file.

Assumes that the only comment lines in the file occur at the beginning. Comment lines are assumed to begin with ‘#’.

Parameters:
  • input_file (string, required) – The input filename for the Rockstar/Consistent Trees file
  • ndatabytes (integer, optional) – The total number of bytes being processed. If not passed, the entire disk size of the input_file minus the initial header lines will be used (i.e. assumes that the entire file is being processed)
Returns:

approx_totnumhalos (integer) – The approximate number of halos in the input file. The actual number of halos should be close but can be smaller/greater than the approximate value.
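One plausible way to obtain such an estimate is to divide the number of data bytes by the mean length of a few sampled data lines. The sketch below assumes that approach; the source does not state how get_approx_totnumhalos computes its value. The function name approx_num_data_lines and the nsample parameter are hypothetical, and the sketch handles uncompressed files only.

```python
import os


def approx_num_data_lines(path, ndatabytes=None, nsample=100):
    """Hypothetical estimator: data bytes divided by mean sampled line length."""
    header_bytes = 0
    sample_bytes = 0
    nsampled = 0
    with open(path, "rb") as f:
        for line in f:
            if nsampled == 0 and line.startswith(b"#"):
                # Comment lines are assumed to occur only at the top.
                header_bytes += len(line)
                continue
            sample_bytes += len(line)
            nsampled += 1
            if nsampled >= nsample:
                break
    if nsampled == 0:
        return 0
    if ndatabytes is None:
        # Default: the whole file on disk minus the leading header lines.
        ndatabytes = os.path.getsize(path) - header_bytes
    return int(round(ndatabytes * nsampled / sample_bytes))
```

Because line lengths vary, the result can land slightly above or below the true count, which matches the caveat in the return description.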

uchuutools.utils.generic_reader(filename, mode='rt')

Returns a file-reader with capability to read line-by-line for both compressed and normal text files.

Parameters:
  • filename (string, required) – The filename for the associated input/output. Can be a compressed (.bz2, .gz, .xz, .zip) file as well as a regular ascii file
  • mode (string, optional, default: ‘rt’ (readonly-text mode)) – Controls the kind of i/o operation that will be performed
Returns:

f (file handle, generator) – Yields a generator that has the readline feature (i.e., supports the paradigm for line in f:). This file-reader generator is suitable for use in with statements, e.g., with generic_reader(<fname>) as f:
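A reader of this kind can be approximated with the standard library alone. The following is a minimal sketch, assuming that the compression type is chosen from the file extension; it is not the generic_reader source. The name open_maybe_compressed is hypothetical, and .zip archives are left out to keep the example short.

```python
import bz2
import gzip
import lzma
from contextlib import contextmanager

# Map of file extension -> opener; the built-in open() is the fallback.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


@contextmanager
def open_maybe_compressed(filename, mode="rt"):
    """Hypothetical stand-in: yield a line-iterable handle for plain or compressed text."""
    opener = open
    for ext, candidate in _OPENERS.items():
        if filename.endswith(ext):
            opener = candidate
            break
    f = opener(filename, mode)
    try:
        yield f
    finally:
        # Always release the handle, even if the caller raises.
        f.close()
```

Used as `with open_maybe_compressed(fname) as f:` followed by `for line in f:`, the calling code stays identical for compressed and uncompressed inputs.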

uchuutools.utils.get_metadata(input_file)

Returns metadata information for input_file. Includes all comment lines in the header, Rockstar/Consistent-Trees version, and the input catalog type (either Rockstar or Consistent-Trees).

Assumes that the only comment lines in the file occur at the beginning. Comment lines are assumed to begin with ‘#’.

Parameters:
  • input_file (string, required) – The input filename for the Rockstar/Consistent-Trees file. Compressed files (‘.bz2’, ‘.gz’, ‘.xz’, ‘.zip’) are also accepted.
Returns:
  • metadata_dict (dictionary) – The dictionary contains four key-value pairs corresponding to the keys: [‘metadata’, ‘version’, ‘catalog_type’, ‘headerline’].
  • metadata (string) – All lines in the beginning of the file that start with the character ‘#’.
  • version (string) – Rockstar or Consistent-Trees version that was used to generate input_file
  • catalog_type (string) – Is one of [Rockstar, Consistent Trees, Consistent Trees (hlist)] and indicates what kind of catalog is contained in input_file
  • headerline (string) – The first line in the input file with any leading/trailing white-space, and any leading ‘#’ removed
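The header-reading part of this behaviour can be sketched as follows. This is an illustration under the stated assumption that comments occur only at the top of the file. The helper name read_header is hypothetical, it reads uncompressed files only, and it omits the ‘version’ and ‘catalog_type’ keys because the rules for detecting them are not described here.

```python
def read_header(path):
    """Hypothetical helper: collect leading '#' lines and the cleaned first line."""
    comment_lines = []
    with open(path, "rt") as f:
        for line in f:
            if not line.startswith("#"):
                # Comments are assumed to occur only at the top of the file.
                break
            comment_lines.append(line)
    if comment_lines:
        # Strip surrounding white-space and any leading '#' characters.
        headerline = comment_lines[0].strip().lstrip("#").strip()
    else:
        headerline = ""
    return {"metadata": "".join(comment_lines), "headerline": headerline}
```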

uchuutools.utils.resize_halo_datasets(halos_dset, new_size, write_halo_props_cont, dtype)

Resizes the halo dataset(s) to new_size.

Parameters:
  • halos_dset (dictionary, required)
  • new_size (scalar integer, required)
  • write_halo_props_cont (boolean, required) – Controls if the individual halo properties are written as distinct datasets such that any given property for ALL halos is written contiguously (structure of arrays, SOA).
  • dtype (numpy datatype)
Returns:

Returns True on successful completion

uchuutools.utils.check_and_decompress(fname)

Decompresses the input file (if necessary) and returns the decompressed filename

Parameters:
  • fname (string, required) – Input filename, can be compressed
Returns:

decomp_fname (string) – The decompressed filename

uchuutools.utils.distribute_array_over_ntasks(cost_array, rank, ntasks)

Calculates the subscript range for the rank’th task such that the work-load is evenly distributed across ntasks.

Parameters:
  • cost_array (numpy array, required) – Contains the cost associated with processing each element of the array
  • rank (integer, required) – The integer rank for the task that we need to compute the work-load division for
  • ntasks (integer, required) – Total number of tasks that the array should be (evenly) distributed across
Returns:

(start, stop) (A tuple of (np.int64, np.int64)) – Contains the initial and final subscripts that the rank’th task should process.

Note: start, stop are both inclusive, i.e., all elements from start to stop should be included. For python array indexing with slices, this translates to arr[start:stop + 1].
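A cost-balanced split with inclusive bounds can be sketched using a running sum of the costs. The version below is an assumption about one reasonable scheme, not the uchuutools algorithm: it uses plain Python lists instead of numpy, returns Python integers rather than np.int64, and the name split_by_cost is hypothetical. A task that receives no elements gets stop == start - 1, so arr[start:stop + 1] is empty.

```python
from bisect import bisect_left
from itertools import accumulate


def split_by_cost(cost_array, rank, ntasks):
    """Hypothetical sketch: inclusive (start, stop) index range for task `rank`."""
    costs = list(cost_array)
    total = sum(costs)
    if total <= 0 or not 0 <= rank < ntasks:
        raise ValueError("need positive total cost and 0 <= rank < ntasks")
    # before[i] is the cost of all elements preceding i, scaled by ntasks.
    before = [0] + [c * ntasks for c in accumulate(costs)][:-1]
    # Task r owns the elements whose preceding cost lies in [r, r + 1) * total.
    start = bisect_left(before, rank * total)
    stop = bisect_left(before, (rank + 1) * total) - 1
    return start, stop
```

For costs [5, 1, 1, 1, 1, 1] and two tasks, rank 0 gets only the first element and rank 1 gets the remaining five, so each task carries a cost of 5.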

uchuutools.utils.check_for_contiguous_halos(h5_task_file, write_halo_props_cont)

Checks that the hdf5 file can be appended to with the requested writing of halo properties

Parameters:
  • h5_task_file (string, required) – An existing hdf5 file. The file may or may not contain halos, but the dataset (or datasets, depending on the value of write_halo_props_cont) for the halo properties should already be created
  • write_halo_props_cont (boolean, required) – Controls if the individual halo properties are written as distinct datasets such that any given property for ALL halos is written contiguously (structure of arrays, SOA).
Returns:

Returns True on successful completion

uchuutools.utils.write_halos(halos_dset, halos_dset_offset, halos, nhalos_to_write, write_halo_props_cont)

Writes halos into the relevant dataset(s) within a hdf5 file

Parameters:
  • halos_dset (dictionary, required) – Contains the halos dataset(s) within a hdf5 file where either the entire halos array or the individual halo properties should be written to. See parameter write_halo_props_cont for further details
  • halos_dset_offset (scalar integer, required) – Contains the index within the halos dataset(s) where the write should start
  • halos (numpy structured array, required) – An array containing the halo properties that should be written out into the hdf5 file. The entire array may not be written out, see the parameter nhalos_to_write
  • nhalos_to_write (scalar integer, required) – Number of halos from the halos array that should be written out. Can be smaller than the shape of the halos array
  • write_halo_props_cont (boolean, required) – Controls if the individual halo properties are written as distinct datasets such that any given property for ALL halos is written contiguously (structure of arrays, SOA).
Returns:

Returns True on successful completion of the write
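The difference between the two layouts selected by write_halo_props_cont can be illustrated without hdf5. In the sketch below, plain Python lists stand in for the hdf5 datasets and a list of dictionaries stands in for the numpy structured array. The function name write_records and the ‘halos’ key used for the single-dataset layout are hypothetical.

```python
def write_records(dsets, offset, halos, nhalos_to_write, props_contiguous):
    """Hypothetical sketch: copy the first nhalos_to_write records into dsets."""
    chunk = halos[:nhalos_to_write]
    end = offset + len(chunk)
    if props_contiguous:
        # Structure of arrays: one pre-sized list per halo property.
        for name, column in dsets.items():
            column[offset:end] = [h[name] for h in chunk]
    else:
        # Array of structures: a single pre-sized list of whole records.
        dsets["halos"][offset:end] = chunk
    return True
```

In the contiguous case, reading one property for all halos touches a single list, which is the access pattern the structure-of-arrays layout is meant to favour.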

uchuutools.utils.update_container_h5_file(fname, h5files, standard_consistent_trees=True)

Writes the container hdf5 file that has external links to the hdf5 datafiles with the mergertree information.

Parameters:
  • fname (string, required) – The name of the output container file (usually forest.h5). A new file is always created; if the file fname already existed, its external links are preserved.

  • h5files (list of filenames, required) – The list of filenames that were either newly created or updated.

    If the container file fname exists, then the union of the filenames that already existed in fname and h5files will be used to create the external links

  • standard_consistent_trees (boolean, optional, default: True) – Specifies whether the input files were produced by the standard Consistent-Trees code (True) or by a parallel Consistent-Trees code (False). Defaults to the standard (i.e., public) version of Consistent-Trees.

Returns:

Returns True on successful completion of the write