CsvReader
- class CsvReader
CSV reader that exposes the most commonly used CSV reading arguments.
The reader uses pandas.read_csv; additional arguments are described in the pandas documentation: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
- chunksize
Number of rows to read in a single iteration.
- Type:
int
- header
Row number(s) to use as the header containing the column names.
- Type:
int, list of int, None, default ‘infer’
- schema_file
Path to a parquet schema file. If provided, header names and column types are taken from the parquet schema metadata.
- Type:
str
- column_names
Names of the columns, for use when the file has no header row.
- Type:
list[str]
- type_map
Data types to use for the columns.
- Type:
dict
- parquet_kwargs
Additional keyword arguments used when reading the parquet schema metadata; these are passed to pandas.read_parquet. See https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html
- Type:
dict
- kwargs
Additional keyword arguments used when reading the CSV files; these are passed to pandas.read_csv. See https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
- Type:
dict
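The attributes above correspond closely to arguments of pandas.read_csv. The following is a minimal sketch that calls pandas directly rather than CsvReader itself. It assumes that column_names plays the role of the pandas `names` argument and that type_map plays the role of `dtype`; that mapping is an assumption, not something stated on this page. The CSV contents and column names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV with no header row (illustrative data only).
csv_text = "1,10.5,a\n2,20.5,b\n3,30.5,c\n4,40.5,d\n5,50.5,e\n"

# Stand-ins for the column_names and type_map attributes (assumed mapping
# to the pandas `names` and `dtype` arguments).
column_names = ["id", "flux", "band"]
type_map = {"id": "int64", "flux": "float64"}

# With chunksize set, read_csv returns an iterator of DataFrames instead of
# a single DataFrame; each DataFrame holds at most `chunksize` rows.
with pd.read_csv(
    io.StringIO(csv_text),
    chunksize=2,
    header=None,  # the file has no header row, so names are supplied
    names=column_names,
    dtype=type_map,
) as reader:
    chunks = list(reader)

chunk_lengths = [len(chunk) for chunk in chunks]
print(chunk_lengths)
```

A small chunksize is used here only to make the chunking visible. The default shown in the constructor signature is 500000 rows.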
Methods
__init__([chunksize, header, schema_file, ...])
read(input_file[, read_columns])    Read the input file, or chunk of the input file.
read_index_file(input_file[, upath_kwargs])    Read an "indexed" file.
regular_file_exists(input_file, **_kwargs)    Check that the input_file points to a single regular file.
- __init__(chunksize=500000, header='infer', schema_file=None, column_names=None, type_map=None, parquet_kwargs=None, upath_kwargs=None, **kwargs)
- classmethod __new__(*args, **kwargs)
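To illustrate how a constructor with this shape and a chunked read method can fit together, here is a hypothetical stand-in class. It is not the library's CsvReader: the class name `SketchCsvReader`, the handling of `read_columns` through the pandas `usecols` argument, and the omission of schema_file, parquet_kwargs, and upath_kwargs are all assumptions made to keep the sketch short.

```python
import io

import pandas as pd


class SketchCsvReader:
    """Hypothetical stand-in for illustration; not the library's CsvReader."""

    def __init__(self, chunksize=500_000, header="infer",
                 column_names=None, type_map=None, **kwargs):
        self.chunksize = chunksize
        self.header = header
        # Extra keyword arguments are stored and forwarded to pandas.read_csv.
        self.kwargs = dict(kwargs)
        if column_names is not None:
            self.kwargs["names"] = column_names  # assumed mapping
        if type_map is not None:
            self.kwargs["dtype"] = type_map  # assumed mapping

    def read(self, input_file, read_columns=None):
        """Yield DataFrames of at most `chunksize` rows each."""
        kwargs = dict(self.kwargs)
        if read_columns is not None:
            kwargs["usecols"] = read_columns  # assumed mapping
        with pd.read_csv(input_file, chunksize=self.chunksize,
                         header=self.header, **kwargs) as reader:
            yield from reader


# Usage with made-up data: three rows read two at a time, keeping one column.
reader = SketchCsvReader(chunksize=2, type_map={"id": "int64"})
csv_text = "id,flux\n1,10.5\n2,20.5\n3,30.5\n"
chunks = list(reader.read(io.StringIO(csv_text), read_columns=["id"]))
print([len(chunk) for chunk in chunks])
```

Writing `read` as a generator means only one chunk is held in memory at a time, which is the usual reason to set a chunksize for large files.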