Getting Started with HATS import

Installation

We recommend installing in a virtual environment, such as venv or conda. You may need to install or upgrade some dependencies for them to work with hats-import.

>> conda create -n <env_name> python=3.12
>> conda activate <env_name>
>> pip install hats-import
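
To confirm that the package is visible from the active environment, you can query its installed version from Python. This is a minimal sketch using only the standard library; the helper name installed_version is ours and is not part of hats-import.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "hats-import" is the distribution name used with pip.
found = installed_version("hats-import")
if found is None:
    print("hats-import is not installed in this environment")
else:
    print(f"hats-import {found}")
```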

Tip

Installing optional dependencies

Some extra dependencies make it easier to run hats-import in a Jupyter environment or to connect to a variety of remote file systems.

These can be installed with the full extra. The quotes keep shells such as zsh from interpreting the square brackets as a glob pattern.

>> pip install 'hats-import[full]'

Tip

Installing on Mac

healpy is an optional dependency of hats-import (included in the full extra) that supports converting from older HiPSCat catalogs. Native prebuilt binaries for healpy do not yet exist for Apple Silicon Macs, so we recommend installing healpy via conda before installing hats-import.

>> conda config --append channels conda-forge
>> conda install healpy
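
If you are unsure whether healpy ended up in the active environment, a quick check from Python can tell you without importing the module. This is a small sketch using the standard library; is_importable is a hypothetical helper name, not something hats-import provides.

```python
from importlib.util import find_spec


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


if is_importable("healpy"):
    print("healpy is available")
else:
    print("healpy is not installed in this environment")
```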

Setting up a pipeline

For each type of dataset the hats-import tool can generate, there is an argument container class that you will need to instantiate and populate with relevant arguments.
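
The general shape of an argument container can be sketched with a plain dataclass. Everything in this example is a hypothetical stand-in chosen for illustration: ExampleArguments, its fields, and the catalog_path property are not the hats-import API, and the real classes and their arguments are described in the dataset-specific notes.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExampleArguments:
    """Hypothetical argument container; not a hats-import class."""

    input_path: str
    output_path: str
    output_artifact_name: str

    def __post_init__(self):
        # Validate eagerly so mistakes surface before any work is scheduled.
        if not self.output_artifact_name:
            raise ValueError("output_artifact_name must not be empty")

    @property
    def catalog_path(self) -> Path:
        """Directory where this hypothetical pipeline would write its output."""
        return Path(self.output_path) / self.output_artifact_name


# Instantiate once, then hand the populated object to the pipeline.
args = ExampleArguments(
    input_path="data/raw",
    output_path="data/catalogs",
    output_artifact_name="my_catalog",
)
print(args.catalog_path)
```

Collecting and validating the settings in one object means a bad value fails at construction time rather than partway through a long run.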

See dataset-specific notes on arguments:

Once you have created your arguments object, pass it to the pipeline control and wait for it to finish. Running within a main guard can avoid some Python threading issues with Dask:

from dask.distributed import Client
from hats_import.pipeline import pipeline_with_client

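def main():
    # Build the argument container for your dataset type (elided here).
    args = ...
    # The context manager shuts the Dask client down when the pipeline ends.
    with Client(
        n_workers=10,
        threads_per_worker=1,
        # ... any other Client options
    ) as client:
        pipeline_with_client(args, client)

# Only run the pipeline when this file is executed as a script.
if __name__ == '__main__':
    main()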
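
The worker count of 10 in the example is only illustrative. One way to adapt it to the machine at hand is to cap it at the number of CPUs that Python reports. This is a sketch under our own assumptions: local_client_kwargs is a hypothetical helper, and suitable values also depend on your data size and available memory.

```python
import os


def local_client_kwargs(max_workers: int = 10) -> dict:
    """Build keyword arguments for dask.distributed.Client on one machine.

    Hypothetical helper: caps the worker count at the reported CPU count.
    """
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return {
        "n_workers": min(max_workers, cpu_count),
        "threads_per_worker": 1,
    }


kwargs = local_client_kwargs()
print(kwargs)
```

The resulting dictionary can be unpacked when creating the client inside main(), for example Client(**local_client_kwargs()).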