ConversionArguments

class ConversionArguments

Data class for holding conversion arguments. Most of its fields are inherited from RuntimeArguments.

Attributes

addl_hats_properties

Any additional keyword arguments you would like to provide when writing the hats.properties file for the final HATS table.

catalog_path

Constructed output path for the catalog, of the form <output_path>/<output_artifact_name>.
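
As a rough illustration of how such a path is composed, the sketch below joins the two pieces with the standard library. The helper name `build_catalog_path` and the example paths are hypothetical; the class itself builds this value for you, and its internals may differ.

```python
from pathlib import PurePosixPath


def build_catalog_path(output_path: str, output_artifact_name: str) -> PurePosixPath:
    """Join the base output path and the artifact name (illustrative helper only)."""
    return PurePosixPath(output_path) / output_artifact_name


# Hypothetical values, chosen only for the example.
catalog_path = build_catalog_path("/data/catalogs", "my_catalog")
print(catalog_path)
```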

completion_email_address

If provided, an email is sent to this address once the import pipeline has completed.

create_metadata

Create /dataset/_metadata parquet from all data partitions.

create_per_partition_stats

Create per_partition_statistics.parquet, based on footers from all data partitions.

create_thumbnail

Create /dataset/data_thumbnail.parquet from one row of each data partition.

dask_n_workers

Number of workers for the Dask client.

dask_threads_per_worker

Number of threads per Dask worker.

dask_tmp

Directory for Dask worker space.

delete_intermediate_parquet_files

If True, delete the smaller intermediate parquet files generated in the splitting stage once the relevant reducing stage is complete.

delete_resume_log_files

If True, delete task-level done files once each stage is complete. If False, all done marker files are kept at the end of the pipeline.

input_catalog_path

npix_parquet_name

Name of the pixel parquet file to be used when npix_suffix=/.

npix_suffix

Suffix for pixel data.

output_artifact_name

Short, convenient name for the catalog.

output_path

Base path where the new catalog should be written.

progress_bar

If True, display a progress bar to give the user feedback on map-reduce progress.

resume

If True, we try to read any existing intermediate files and continue to run the pipeline where we left off.

resume_tmp

Directory for intermediate resume files, when needed.

row_group_kwargs

Additional keyword arguments to use when creating row groups while writing files to parquet.

should_write_skymap

Whether to write skymap FITS files. Main catalogs should contain them.

simple_progress_bar

If displaying a progress bar, use a simple text-only progress bar instead of a widget.

skymap_alt_orders

Additional HEALPix orders at which to write a skymap.

tmp_base_path

Either tmp_dir or dask_tmp, if those were provided by the user.

tmp_dir

Path for storing intermediate files.

tmp_path

Constructed temp path. Defaults to tmp_dir, then dask_tmp; if neither is provided, a new temp directory is created under catalog_path.
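
The fallback order described here can be sketched as a small function. This is a simplified illustration, not the library's implementation: the function name `resolve_tmp_path` and the subdirectory name `intermediate` are assumptions, and the sketch only computes a path without creating any directory.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_tmp_path(
    catalog_path: str,
    tmp_dir: Optional[str] = None,
    dask_tmp: Optional[str] = None,
    fallback_name: str = "intermediate",  # assumed name, not taken from the docs
) -> PurePosixPath:
    """Pick a temp location using the documented precedence order."""
    if tmp_dir:
        # First choice: the user-supplied tmp_dir.
        return PurePosixPath(tmp_dir)
    if dask_tmp:
        # Second choice: the Dask worker space.
        return PurePosixPath(dask_tmp)
    # Last resort: a new directory under the catalog path.
    return PurePosixPath(catalog_path) / fallback_name


print(resolve_tmp_path("/data/catalogs/my_catalog", tmp_dir="/scratch/tmp"))
print(resolve_tmp_path("/data/catalogs/my_catalog", dask_tmp="/scratch/dask"))
print(resolve_tmp_path("/data/catalogs/my_catalog"))
```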

tqdm_kwargs

Additional arguments to pass to the tqdm progress bar.

write_table_kwargs

Additional keyword arguments to use when writing files to parquet (e.g. compression schemes).

Methods

__init__([output_path, ...])

extra_property_dict()

Generate additional HATS properties for this import run as a dictionary.

resume_kwargs_dict()

Convenience method to convert fields for resume functionality.

__init__(output_path: str | Path | UPath | None = None, output_artifact_name: str = '', addl_hats_properties: dict | None = None, npix_suffix: str = '.parquet', npix_parquet_name: str | None = None, write_table_kwargs: dict | None = None, row_group_kwargs: dict | None = None, should_write_skymap: bool = True, skymap_alt_orders: list[int] | None = None, create_thumbnail: bool = False, create_metadata: bool = True, create_per_partition_stats: bool = False, tmp_dir: str | Path | UPath | None = None, resume: bool = True, progress_bar: bool = True, simple_progress_bar: bool = False, tqdm_kwargs: dict | None = None, dask_tmp: str | Path | UPath | None = None, dask_n_workers: int = 1, dask_threads_per_worker: int = 1, resume_tmp: str | Path | UPath | None = None, delete_intermediate_parquet_files: bool = True, delete_resume_log_files: bool = True, completion_email_address: str = '', catalog_path: UPath | None = None, tmp_path: UPath | None = None, tmp_base_path: UPath | None = None, input_catalog_path: str | Path | UPath | None = None) -> None
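
Because the constructor takes many keyword arguments, it can help to collect them in a dictionary and check the keys before passing them on. The sketch below uses only the standard library: the parameter names are copied from the signature above, while the helper `unknown_arguments`, the example paths, and the `compression` entry in `write_table_kwargs` are assumptions made for illustration.

```python
# Parameter names copied from the __init__ signature documented above.
CONVERSION_PARAMETERS = frozenset({
    "output_path", "output_artifact_name", "addl_hats_properties",
    "npix_suffix", "npix_parquet_name", "write_table_kwargs",
    "row_group_kwargs", "should_write_skymap", "skymap_alt_orders",
    "create_thumbnail", "create_metadata", "create_per_partition_stats",
    "tmp_dir", "resume", "progress_bar", "simple_progress_bar",
    "tqdm_kwargs", "dask_tmp", "dask_n_workers", "dask_threads_per_worker",
    "resume_tmp", "delete_intermediate_parquet_files",
    "delete_resume_log_files", "completion_email_address",
    "catalog_path", "tmp_path", "tmp_base_path", "input_catalog_path",
})


def unknown_arguments(kwargs: dict) -> list[str]:
    """Return the keys that do not match any documented parameter name."""
    return sorted(set(kwargs) - CONVERSION_PARAMETERS)


# Hypothetical settings for a conversion run.
conversion_kwargs = {
    "input_catalog_path": "/data/input/my_catalog",  # hypothetical path
    "output_path": "/data/hats",                      # hypothetical path
    "output_artifact_name": "my_catalog",
    "dask_n_workers": 4,
    "dask_threads_per_worker": 1,
    "simple_progress_bar": True,
    # Assumption: the underlying parquet writer accepts a `compression` option.
    "write_table_kwargs": {"compression": "zstd"},
}

print(unknown_arguments(conversion_kwargs))
# With the package installed, the dictionary would then be unpacked as
# ConversionArguments(**conversion_kwargs).
```

Checking the keys first catches misspelled names such as `dask_dir` before the pipeline is configured.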
classmethod __new__(*args, **kwargs)