hipscat_import.margin_cache.margin_cache

Module Contents

Functions

_find_partition_margin_pixel_pairs(stats, margin_order)

Creates a DataFrame filled with many-to-many connections between the catalog partition pixels and the margin pixels at margin_order.

_create_margin_directory(stats, output_path, ...)

Creates directories for all the catalog partitions.

_map_to_margin_shards(client, args, partition_pixels, ...)

Create all the jobs for mapping partition files into the margin cache.

_reduce_margin_shards(client, args, partition_pixels)

Create all the jobs for reducing margin cache shards into singular files.

generate_margin_cache(args, client)

Generate a margin cache for a given input catalog.

_find_partition_margin_pixel_pairs(stats, margin_order)

Creates a DataFrame filled with many-to-many connections between the catalog partition pixels and the margin pixels at margin_order.
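To make "many-to-many" concrete, here is a minimal pure-Python sketch of the kind of pairing table described above. It is not the library's implementation: the input mapping, the pixel numbers, and the column names (`partition_order`, `partition_pixel`, `margin_pixel`) are all hypothetical, and the real function derives the margin pixels from `stats` and `margin_order` rather than from a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical toy input: each catalog partition, keyed by (order, pixel),
# lists the margin-order pixels assumed to border it. The numbers are made up.
partition_margins = {
    (0, 4): [70, 71, 96],
    (0, 5): [71, 96, 97],
}


def build_pairs(partition_margins):
    """Flatten the mapping into one row per (partition, margin pixel) pair."""
    rows = []
    for (order, pixel), margin_pixels in sorted(partition_margins.items()):
        for margin_pixel in margin_pixels:
            rows.append(
                {
                    "partition_order": order,
                    "partition_pixel": pixel,
                    "margin_pixel": margin_pixel,
                }
            )
    return rows


def partitions_by_margin_pixel(rows):
    """Invert the pairs: which partitions are connected to each margin pixel?"""
    lookup = defaultdict(list)
    for row in rows:
        lookup[row["margin_pixel"]].append(
            (row["partition_order"], row["partition_pixel"])
        )
    return dict(lookup)


pairs = build_pairs(partition_margins)
lookup = partitions_by_margin_pixel(pairs)
```

In this toy data each partition is paired with several margin pixels, and margin pixels 71 and 96 are each paired with both partitions, which is what makes the relation many-to-many rather than one-to-many.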

_create_margin_directory(stats, output_path, storage_options)

Creates directories for all the catalog partitions.

_map_to_margin_shards(client, args, partition_pixels, margin_pairs)

Create all the jobs for mapping partition files into the margin cache.

_reduce_margin_shards(client, args, partition_pixels)

Create all the jobs for reducing margin cache shards into singular files.
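The map and reduce steps can be pictured with a small serial sketch. This is only an illustration of the general map-then-reduce pattern under stated assumptions, not the module's code: the function names, the in-memory row dictionaries, the `margin_pixel` field, and the toy pixel numbers are all hypothetical, and the real functions submit jobs through the Dask `client` and work on partition files rather than Python lists.

```python
from collections import defaultdict


def map_to_shards(partition_rows, pairs):
    """Map step: route each row to every partition paired with its margin pixel.

    `pairs` is a list of (target_partition, margin_pixel) tuples.
    Returns shards keyed by (target_partition, source_partition).
    """
    targets = defaultdict(list)
    for partition, margin_pixel in pairs:
        targets[margin_pixel].append(partition)

    shards = defaultdict(list)
    for source, rows in partition_rows.items():
        for row in rows:
            for target in targets.get(row["margin_pixel"], []):
                shards[(target, source)].append(row)
    return dict(shards)


def reduce_shards(shards):
    """Reduce step: merge all shards for a target partition into one list."""
    merged = defaultdict(list)
    for (target, _source), rows in sorted(shards.items()):
        merged[target].extend(rows)
    return dict(merged)


# Hypothetical toy data: rows grouped by the partition they are stored in,
# each tagged with the margin-order pixel that contains it.
partition_rows = {
    4: [{"id": 1, "margin_pixel": 70}, {"id": 2, "margin_pixel": 65}],
    5: [{"id": 3, "margin_pixel": 96}],
    6: [{"id": 4, "margin_pixel": 112}],
}
# Hypothetical pairs: pixel 70 is in the margin of partitions 5 and 6, etc.
pairs = [(5, 70), (6, 70), (4, 96), (4, 112)]

shards = map_to_shards(partition_rows, pairs)
merged = reduce_shards(shards)
```

Keying the intermediate shards by both target and source partition means each map job writes to its own shard, so map jobs never contend for the same output; the reduce step then combines the shards for one target into a single result.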

generate_margin_cache(args, client)

Generate a margin cache for a given input catalog. The input catalog must be in hipscat format.

Parameters:
  • args (MarginCacheArguments) – A valid MarginCacheArguments object.

  • client (dask.distributed.Client) – A dask distributed client object.