`hipscat_import.index.map_reduce`

Create a columnar index of a hipscat table, using dask for parallelization.

Module Contents

`read_leaf_file`(input_file, include_columns, ...)	Mapping function called once per input file.
`create_index`(args, client)	Read primary column, indexing column, and other payload data, and write to catalog directory.

read_leaf_file(input_file, include_columns, include_hipscat_index, drop_duplicates, storage_options)

Mapping function called once per input file.

Reads the leaf parquet file and returns its data with the appropriate columns selected and duplicates dropped.
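The column selection and duplicate handling described here can be illustrated with a minimal pandas sketch. This is not the library's implementation: the helper name `select_leaf_columns`, the `INDEX_COLUMN` constant, and the sample data are hypothetical, and the sketch works on an in-memory frame rather than reading a parquet file, so `input_file` and `storage_options` are left out.

```python
import pandas as pd

# Assumed name of the hipscat index column; for illustration only.
INDEX_COLUMN = "_hipscat_index"


def select_leaf_columns(leaf, include_columns, include_hipscat_index=True,
                        drop_duplicates=True):
    """Keep the requested columns of one leaf table; optionally drop duplicate rows.

    Hypothetical helper that mirrors the documented parameters, not the
    actual body of read_leaf_file.
    """
    columns = list(include_columns)
    # Carry the index column along only when the caller asks for it.
    if include_hipscat_index and INDEX_COLUMN not in columns:
        columns.append(INDEX_COLUMN)
    frame = leaf[columns]
    if drop_duplicates:
        # Rows that are identical across the kept columns collapse to one row.
        frame = frame.drop_duplicates()
    return frame


# Made-up leaf data: the first two rows differ only in "flag".
leaf = pd.DataFrame({
    "_hipscat_index": [10, 10, 11, 12],
    "object_id": [1, 1, 2, 3],
    "mag": [20.1, 20.1, 19.5, 21.0],
    "flag": [0, 1, 0, 0],
})
result = select_leaf_columns(leaf, ["object_id", "mag"])
```

In this sketch, duplicates are judged after the column selection, so rows that differ only in an excluded column (here `flag`) are treated as duplicates. Whether the real function applies the steps in that order is not stated in this reference.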

create_index(args, client)

Read primary column, indexing column, and other payload data, and write to catalog directory.