riptable.rt_csv
Functions
load_csv_as_dataset: Load a Dataset from a comma-separated value (CSV) file.
- riptable.rt_csv.load_csv_as_dataset(path_or_file, column_names=None, converters=None, skip_rows=0, version=None, encoding='utf-8', **kwargs)
Load a Dataset from a comma-separated value (CSV) file.
- Parameters:
path_or_file – A filename or a file-like object (for example, from open() or StringIO()). If you need a non-standard encoding, open the file yourself.
column_names (list of str, optional) – List of column names (each must be a legal Python variable name), or None to use the first row read from the file. Defaults to None.
converters (dict) – Mapping of {column_name -> string-to-type converter}. Each converter must do its own error handling, should return uniform types, and should handle bad or missing data as desired. A column with no converter is left as a string.
skip_rows (int) – Number of rows to skip before processing, defaults to 0.
version (int, optional) – Selects the implementation of the CSV parser used to read the input file. Defaults to None, in which case the function chooses the best available implementation.
encoding (str) – The text encoding of the CSV file, defaults to ‘utf-8’.
kwargs – Any additional csv ‘dialect’ parameters.
- Return type:
Dataset
Notes
For a dataset of shape (459302, 15) (all strings), the timings are roughly:
version=0: 6.195947s
version=1: 5.605156s (the default if pandas is not available)
version=2: 8.370234s
version=3: 6.994191s
version=4: 3.642205s (only available if pandas is available; the default if so)
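Examples
The converters and skip_rows parameters described above can be illustrated with a minimal standard-library sketch. This is not riptable's implementation: read_columns and to_float_or_nan are hypothetical helpers, and the sample data is invented for illustration. The sketch only mirrors the documented contract, in which each converter handles its own bad data and a column without a converter is left as a string.

```python
import csv
from io import StringIO


def to_float_or_nan(text):
    """Hypothetical converter: parse a float, mapping bad values to NaN."""
    try:
        return float(text)
    except ValueError:
        return float("nan")


def read_columns(file_obj, converters=None, skip_rows=0):
    """Illustrative stand-in for a CSV loader (not riptable's code).

    Skips `skip_rows` lines, takes the next row as the header, and applies
    the matching converter to every cell. Columns without a converter are
    left as strings.
    """
    converters = converters or {}
    for _ in range(skip_rows):
        next(file_obj, None)
    reader = csv.reader(file_obj)
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, text in zip(header, row):
            convert = converters.get(name, str)
            columns[name].append(convert(text))
    return columns


# Invented sample: one comment line to skip, a header row, two data rows.
sample = StringIO("# exported file\nsymbol,price\nAAA,1.5\nBBB,n/a\n")
columns = read_columns(sample, converters={"price": to_float_or_nan}, skip_rows=1)
```

With riptable installed, the corresponding call would presumably be load_csv_as_dataset(sample, converters={"price": to_float_or_nan}, skip_rows=1), which returns a Dataset rather than a plain dict. Whether the header row is read after the skipped rows, as in this sketch, is an assumption based on the skip_rows description.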