hipscat_import.runtime_arguments
Data class to hold common runtime arguments for dataset creation.
Module Contents
Classes
RuntimeArguments | Data class for holding runtime arguments
Functions
find_input_paths | Helper method to find input paths, given either a prefix and format, or an explicit list of paths.
- class RuntimeArguments
Data class for holding runtime arguments.
- output_storage_options: Dict[Any, Any] | None
Optional dictionary of abstract filesystem credentials for the output.
- resume: bool = True
If True, we try to read any existing intermediate files and continue to run the pipeline where we left off. If False, we start the import from scratch, overwriting any content of the output directory.
- progress_bar: bool = True
If True, a tqdm progress bar is displayed to give the user feedback on map-reduce progress.
- dask_tmp: str = ''
Directory for dask worker space. This should be local to where the pipeline executes, for fast reads and writes.
- resume_tmp: str = ''
Directory for intermediate resume files, when needed. See the Read the Docs (RTD) documentation for more info.
- completion_email_address: str = ''
If provided, send an email to the indicated address once the import pipeline has completed.
- catalog_path: hipscat.io.FilePointer | None
Constructed output path for the catalog, of the form <output_path>/<output_artifact_name>.
- tmp_path: hipscat.io.FilePointer | None
Constructed temp path. Defaults to tmp_dir, then dask_tmp; if neither is provided, a new temp directory is created under catalog_path.
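The fallback order described for catalog_path and tmp_path can be sketched in a few lines. This is only an illustration of the documented precedence, not the library's own code: the function name `resolve_paths` and the subdirectory name `"intermediate"` are hypothetical, and plain strings stand in for `hipscat.io.FilePointer`.

```python
import os


def resolve_paths(output_path, output_artifact_name, tmp_dir="", dask_tmp=""):
    """Hypothetical helper that mirrors the documented path precedence."""
    # catalog_path is <output_path>/<output_artifact_name>.
    catalog_path = os.path.join(output_path, output_artifact_name)

    if tmp_dir:
        # First choice: an explicit tmp_dir.
        tmp_path = tmp_dir
    elif dask_tmp:
        # Second choice: reuse the dask worker space.
        tmp_path = dask_tmp
    else:
        # Last resort: a directory under the catalog itself.
        # The name "intermediate" is an assumption made for this sketch.
        tmp_path = os.path.join(catalog_path, "intermediate")

    return catalog_path, tmp_path


catalog_path, tmp_path = resolve_paths("/data/out", "my_catalog", dask_tmp="/scratch/dask")
print(catalog_path)
print(tmp_path)
```

The sketch only computes the paths; per the description above, the last-resort directory would also need to be created before use.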
- find_input_paths(input_path='', file_matcher='', input_file_list=None, storage_options: Dict[Any, Any] | None = None)
Helper method to find input paths, given either a prefix and format, or an explicit list of paths.
- Parameters:
input_path (str) – prefix to search for
file_matcher (str) – matcher to use when searching for files
input_file_list (List[str]) – list of input paths
- Returns:
the matching files if input_path is provided; otherwise, input_file_list
- Raises:
FileNotFoundError – if no files are found at the input_path and the provided list is empty.
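The behaviour described above can be approximated on a local filesystem as follows. This is a stand-in written for illustration, not the library implementation: the name `find_input_paths_sketch` is hypothetical, it uses the standard-library `glob` module instead of an abstract filesystem, it ignores `storage_options`, and sorting the results is a choice made here for determinism.

```python
import glob
import os


def find_input_paths_sketch(input_path="", file_matcher="", input_file_list=None):
    """Local-only approximation of the documented lookup rules."""
    if input_path:
        # A prefix was given: search beneath it with the matcher, e.g. "*.csv".
        paths = sorted(glob.glob(os.path.join(input_path, file_matcher)))
    else:
        # No prefix: fall back to the explicit list of paths.
        paths = list(input_file_list or [])

    if not paths:
        # Nothing matched and no usable list was supplied.
        raise FileNotFoundError("No input files found")
    return paths


# An explicit list is returned unchanged when no prefix is given.
print(find_input_paths_sketch(input_file_list=["part_0.parquet", "part_1.parquet"]))
```

With a prefix, a call such as `find_input_paths_sketch(input_path="/data/raw", file_matcher="*.csv")` would return the matching CSV files under that directory, or raise `FileNotFoundError` if there are none.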