riptable.rt_csv

Functions

load_csv_as_dataset(path_or_file[, column_names, ...])

Load a Dataset from a comma-separated value (CSV) file.

riptable.rt_csv.load_csv_as_dataset(path_or_file, column_names=None, converters=None, skip_rows=0, version=None, encoding='utf-8', **kwargs)

Load a Dataset from a comma-separated value (CSV) file.

Parameters:
  • path_or_file – A filename or a file-like object (for example, from open() or StringIO()). If the file needs a non-standard encoding, open it yourself and pass the file object.

  • column_names (list of str, optional) – List of column names, which must be legal Python variable names, or None to use the first row read from the file. Defaults to None.

  • converters (dict, optional) – Mapping of {column_name -> str2type converter}. Each converter should do its own error handling, return uniform types, and handle bad or missing data as desired. A column with no converter is left as a string. Defaults to None.

  • skip_rows (int) – Number of rows to skip before processing, defaults to 0.

  • version (int, optional) – Selects the implementation of the CSV parser used to read the input file. Defaults to None, in which case the function chooses the best available implementation.

  • encoding (str) – The text encoding of the CSV file, defaults to ‘utf-8’.

  • kwargs – Any additional CSV ‘dialect’ parameters.

Return type:

Dataset
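As an illustration of the converters parameter, the following is a minimal sketch rather than a definitive recipe. It assumes each converter is called with a single field string, which the "str2type" wording suggests but this page does not state outright. The helper names (to_float_or_nan, to_int_or_default, load_example) and the sample data are hypothetical.

```python
import io

def to_float_or_nan(text):
    """Convert a CSV field to float; bad or missing values become NaN."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return float("nan")

def to_int_or_default(text, default=-1):
    """Convert a CSV field to int; bad or missing values become `default`."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return default

# Hypothetical sample data: the first row supplies the column names.
CSV_TEXT = "symbol,price,qty\nAAA,10.5,100\nBBB,,200\nCCC,bad,x\n"

# Columns without a converter ("symbol" here) are left as strings.
converters = {"price": to_float_or_nan, "qty": to_int_or_default}

def load_example():
    """Load the sample CSV from an in-memory file-like object."""
    # Imported here so the converters can be exercised without riptable installed.
    from riptable.rt_csv import load_csv_as_dataset
    return load_csv_as_dataset(io.StringIO(CSV_TEXT), converters=converters)
```

Calling load_example() requires riptable to be installed. The converters are plain functions and can be checked on their own first, which makes it easier to see how bad or missing values are handled before loading a real file.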

Notes

For a dataset of shape (459302, 15) containing only strings, the timings are roughly:

  • version=0: 6.195947 s
  • version=1: 5.605156 s (default if pandas is not available)
  • version=2: 8.370234 s
  • version=3: 6.994191 s
  • version=4: 3.642205 s (only available if pandas is available, and the default if so)
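Timings like those above depend on the machine and the data, so it may be worth measuring on your own files. The sketch below is one possible way to do that. The helpers time_call and compare_versions are hypothetical, and because this page does not say how an unavailable version fails, any exception is simply recorded as None.

```python
import time

def time_call(func, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best

def compare_versions(loader, path, versions=(0, 1, 2, 3, 4)):
    """Time loader(path, version=v) for each v; failed versions map to None."""
    results = {}
    for v in versions:
        try:
            results[v] = time_call(loader, path, version=v)
        except Exception:
            # Assumption: an unavailable parser version raises some exception.
            results[v] = None
    return results
```

A call might look like compare_versions(load_csv_as_dataset, "data.csv"), where "data.csv" is a placeholder filename. Pass a filename rather than an open file object, since a file object would be exhausted after the first read.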