hipscat_import.catalog.resume_plan

Utility to hold the file-level pipeline execution plan.

Module Contents

Classes

ResumePlan

Container class for holding the state of each file in the pipeline plan.

class ResumePlan

Bases: hipscat_import.pipeline_resume_plan.PipelineResumePlan

Container class for holding the state of each file in the pipeline plan.

input_paths: List[hipscat.io.FilePointer]

resolved list of all files that will be used in the importer

map_files: List[Tuple[str, str]]

list of files (and job keys) that have yet to be mapped

split_keys: List[Tuple[str, str]]

list of files (and job keys) that have yet to be split

destination_pixel_map: List[Tuple[hipscat.pixel_math.healpix_pixel.HealpixPixel, List[hipscat.pixel_math.healpix_pixel.HealpixPixel], str]] | None

Fully resolved map of destination pixels to constituent smaller pixels

MAPPING_STAGE = 'mapping'

SPLITTING_STAGE = 'splitting'

REDUCING_STAGE = 'reducing'

HISTOGRAM_BINARY_FILE = 'mapping_histogram.npz'

HISTOGRAMS_DIR = 'histograms'

__post_init__()

Initialize the plan.

gather_plan()

Initialize the plan.

get_remaining_map_keys()

Gather the remaining mapping keys, dropping tasks that already succeeded, as identified by their histogram file names.

Returns:

list of mapping keys not found in files like /resume/path/mapping_key.npz
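The filtering described above can be sketched as follows. This is a minimal illustration and not the library's implementation: `remaining_map_keys` is a hypothetical helper, and it assumes that each finished mapping task leaves one `<mapping_key>.npz` file in the resume directory and that each entry of `map_files` is a `(mapping_key, file_path)` pair.

```python
from pathlib import Path


def remaining_map_keys(resume_dir, map_files):
    """Return the (key, file) pairs that have no histogram file yet.

    Hypothetical helper: assumes one `<key>.npz` file per finished task.
    """
    # The file stem ("map_57" for "map_57.npz") is treated as the task key.
    done = {path.stem for path in Path(resume_dir).glob("*.npz")}
    return [(key, file_path) for key, file_path in map_files if key not in done]
```

Calling this again after an interrupted run would return only the tasks that still need to be mapped.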

read_histogram(healpix_order)

Return the histogram, shaped for the given healpix_order. Sources are tried in this order:

  • Try to find a combined histogram

  • Otherwise, combine histograms from partials

  • Otherwise, return an empty histogram
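The three-step fallback above could look roughly like the sketch below. It is written under stated assumptions rather than taken from the library: `read_histogram_sketch` is a hypothetical function, the array is assumed to be stored under the name `histogram` inside each `.npz` file, and the file and directory names reuse the `HISTOGRAM_BINARY_FILE` and `HISTOGRAMS_DIR` values listed on this page. The histogram length uses the standard HEALPix pixel count, 12 * 4^order.

```python
from pathlib import Path

import numpy as np


def read_histogram_sketch(resume_dir, healpix_order):
    """Combined histogram if present, else the sum of partials, else zeros."""
    resume_dir = Path(resume_dir)
    n_pixels = 12 * 4**healpix_order  # HEALPix pixel count at this order

    # 1. Prefer a combined histogram if one was already written.
    combined = resume_dir / "mapping_histogram.npz"
    if combined.exists():
        with np.load(combined) as data:
            return data["histogram"]  # array name is an assumption

    # 2. Otherwise add up whatever partial histograms exist.
    # 3. With no partials the loop is skipped and the zeros are returned.
    total = np.zeros(n_pixels, dtype=np.int64)
    for partial in sorted((resume_dir / "histograms").glob("*.npz")):
        with np.load(partial) as data:
            total += data["histogram"]
    return total
```

Starting from a zero array means the "empty histogram" case needs no separate branch.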

classmethod partial_histogram_file(tmp_path, mapping_key: str)

File name for writing a histogram file to a special intermediate directory.

As a side effect, this method may create the special intermediate directory.

Parameters:
  • tmp_path (str) – where to write intermediate resume files.

  • mapping_key (str) – unique string for each mapping task (e.g. “map_57”)
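As an illustration of the directory side effect, the sketch below builds a per-task file name and creates the intermediate directory on demand. `partial_histogram_path` is a hypothetical stand-in, and the `histograms/<mapping_key>.npz` layout is an assumption based on the `HISTOGRAMS_DIR` constant and the `.npz` naming mentioned on this page.

```python
from pathlib import Path


def partial_histogram_path(tmp_path, mapping_key):
    """Return the path for one task's partial histogram file.

    Hypothetical layout: <tmp_path>/histograms/<mapping_key>.npz
    """
    directory = Path(tmp_path) / "histograms"
    # Side effect: create the intermediate directory if it is missing.
    directory.mkdir(parents=True, exist_ok=True)
    return directory / f"{mapping_key}.npz"
```

Using `exist_ok=True` keeps the call safe when several mapping tasks request their file names at about the same time.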

get_remaining_split_keys()

Gather the remaining splitting keys, dropping tasks that already succeeded, as identified by their done file names.

Returns:

list of splitting keys not found in files like /resume/path/split_key.done

classmethod splitting_key_done(tmp_path, splitting_key: str)

Mark a single splitting task as done

Parameters:
  • tmp_path (str) – where to write intermediate resume files.

  • splitting_key (str) – unique string for each splitting task (e.g. “split_57”)
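The done-file bookkeeping used by the splitting stage can be sketched as below. Both helpers are hypothetical, and the flat `<tmp_path>/<key>.done` layout is an assumption modelled on the `/resume/path/split_key.done` example above; the library's real layout may differ.

```python
from pathlib import Path


def mark_key_done(tmp_path, key):
    """Write an empty marker file recording that one task has finished."""
    directory = Path(tmp_path)
    directory.mkdir(parents=True, exist_ok=True)
    marker = directory / f"{key}.done"
    marker.touch()
    return marker


def remaining_keys(tmp_path, all_keys):
    """Return the keys that have no `.done` marker yet, in input order."""
    directory = Path(tmp_path)
    if not directory.is_dir():
        return list(all_keys)
    done = {path.stem for path in directory.glob("*.done")}
    return [key for key in all_keys if key not in done]
```

Because the markers are empty files, a restarted pipeline only has to list the directory to find out which tasks it can skip.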

classmethod reducing_key_done(tmp_path, reducing_key: str)

Mark a single reducing task as done

Parameters:
  • tmp_path (str) – where to write intermediate resume files.

  • reducing_key (str) – unique string for each reducing task (e.g. “3_57”)

wait_for_mapping(futures)

Wait for mapping futures to complete.

is_mapping_done() → bool

Check whether the mapping stage is complete, i.e. whether no files are left to map.

wait_for_splitting(futures)

Wait for splitting futures to complete.

is_splitting_done() → bool

Check whether the splitting stage is complete, i.e. whether no files are left to split.

get_reduce_items(destination_pixel_map=None)

Fetch a triple for each partition to reduce.

Triple contains:

  • destination pixel (healpix pixel with both order and pixel)

  • source pixels (list of pixels at mapping order)

  • reduce key (string of destination order+pixel)
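A minimal sketch of how such triples might be assembled is shown below. It is illustrative only: `Pixel` is a hypothetical stand-in for `HealpixPixel`, `reduce_items` is not the library method, and the `"<order>_<pixel>"` key format is inferred from the "3_57" example given for reducing keys on this page.

```python
from typing import List, NamedTuple, Tuple


class Pixel(NamedTuple):
    """Hypothetical stand-in for hipscat's HealpixPixel."""

    order: int
    pixel: int


def reduce_items(
    pixel_map: List[Tuple[Pixel, List[Pixel]]],
) -> List[Tuple[Pixel, List[Pixel], str]]:
    """Build one (destination, sources, reduce_key) triple per partition."""
    items = []
    for destination, sources in pixel_map:
        # Assumed key format: destination order and pixel joined by "_".
        reduce_key = f"{destination.order}_{destination.pixel}"
        items.append((destination, sources, reduce_key))
    return items
```

For a destination pixel 57 at order 3 with two source pixels at order 5, the resulting key would be `"3_57"`.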

is_reducing_done() → bool

Check whether the reducing stage is complete, i.e. whether no partitions are left to reduce.

wait_for_reducing(futures)

Wait for reducing futures to complete.