riptable.rt_multiset
Classes
- class riptable.rt_multiset.Multiset(input_value=None)
Bases:
riptable.rt_struct.Struct
Multisets contain datasets and/or multisets, where all contained datasets have the same number of rows. Multisets can provide a convenient namespace for closely related datasets, such as those loaded from a single HDF5 file or generated by an aggregation applied to a GroupBy object.
The columns within contained datasets may be displayed in an interleaved way. Example: Assume Jan and Feb are two datasets with 3 columns each:
Jan: Run1, Run2, Run3
Feb: Run1, Run2, Run3
A Multiset containing these datasets would display with a multi-line header:
Run1      Run2      Run3
Jan  Feb  Jan  Feb  Jan  Feb
One can access the Run1 column in the Jan dataset with the syntax: ms.Jan.Run1
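The interleaving described above can be modeled in a few lines of plain Python. This is only a sketch of the header layout, not riptable's display code: the `interleaved_header` helper and the dict-of-lists stand-in for datasets are assumptions made for illustration.

```python
# Plain-Python model of the interleaved header; this does not call riptable.
# Each "dataset" is represented only by its list of column names.
datasets = {
    "Jan": ["Run1", "Run2", "Run3"],
    "Feb": ["Run1", "Run2", "Run3"],
}

def interleaved_header(datasets):
    """Return (top_row, bottom_row) for a two-line, column-major header."""
    # Collect column names in first-seen order across all datasets.
    columns = []
    for cols in datasets.values():
        for col in cols:
            if col not in columns:
                columns.append(col)
    top, bottom = [], []
    for col in columns:
        owners = [name for name, cols in datasets.items() if col in cols]
        # The column name is written once and spans one slot per owning dataset.
        top.extend([col] + [""] * (len(owners) - 1))
        bottom.extend(owners)
    return top, bottom

top, bottom = interleaved_header(datasets)
print(top)
print(bottom)
```

The top row carries each column name once, and the bottom row lists the datasets that own that column, which matches the two-line header shown above.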
Examples
>>> import riptable as rt
>>> from numpy import nan
>>> ds = rt.Dataset({'somenans': [0., 1., 2., nan, 4., 5.], 'morestuff': ['A','B','C','D','E','F']})
>>> ds2 = rt.Dataset({'somenans': [0., 1., nan, 3., 4., 5.], 'morestuff': ['H','I','J','K','L','M']})
>>> ms = rt.Multiset({'test': ds, 'test2': ds2})
>>> ms
    somenans       morestuff
#   test   test2   test   test2
-   ----   -----   ----   -----
0   0.00    0.00   A      H
1   1.00    1.00   B      I
2   2.00     nan   C      J
3    nan    3.00   D      K
4   4.00    4.00   E      L
5   5.00    5.00   F      M

>>> ms['morestuff']
    morestuff
#   test   test2
-   ----   -----
0   A      H
1   B      I
2   C      J
3   D      K
4   E      L
5   F      M

>>> ms['test']
#   somenans   morestuff
-   --------   ---------
0       0.00   A
1       1.00   B
2       2.00   C
3        nan   D
4       4.00   E
5       5.00   F

>>> ms[[2,3],'somenans']
    somenans
#   test   test2
-   ----   -----
0   2.00     nan
1    nan    3.00

>>> ms[[2,3],'morestuff']
    morestuff
#   test   test2
-   ----   -----
0   C      J
1   D      K

>>> ms[[2,3],['morestuff','somenans']]
    morestuff      somenans
#   test   test2   test   test2
-   ----   -----   ----   -----
0   C      J       2.00     nan
1   D      K        nan    3.00
- property dtypes
Returns dictionary of dtype for each column.
- Returns:
Dictionary containing the dtype for each column in the Multiset.
- Return type:
dict
- __getitem__(index)
- Parameters:
index ((rowspec, colspec) or colspec) –
- Return type:
the indexed row(s), col(s), sub-dataset, or single value
Examples
>>> import riptable as rt
>>> from numpy import nan
>>> ds = rt.Dataset({'somenans': [0., 1., 2., nan, 4., 5.]})
>>> ds2 = rt.Dataset({'somenans': [0., 1., nan, 3., 4., 5.]})
>>> ms = rt.Multiset({'test': ds, 'test2': ds2})
>>> ms[2,:]
    somenans
#   test   test2
-   ----   -----
0   2.00     nan
- Raises:
IndexError – When an invalid column name is supplied.
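The (rowspec, colspec) form can be pictured as applying the same row and column selection to every contained dataset. The following is a rough plain-Python model rather than riptable code: dicts of lists stand in for Datasets, and the `select` helper is a hypothetical name introduced here.

```python
# Plain-Python model of Multiset indexing; dicts of lists stand in for Datasets.
ms = {
    "test":  {"somenans": [0.0, 1.0, 2.0], "morestuff": ["A", "B", "C"]},
    "test2": {"somenans": [0.0, 1.0, 9.0], "morestuff": ["H", "I", "J"]},
}

def select(ms, rows, col):
    """Model of ms[rows, col]: take the same rows of `col` from every dataset."""
    picked = {
        name: [ds[col][r] for r in rows]
        for name, ds in ms.items()
        if col in ds
    }
    if not picked:
        # Mirrors the documented IndexError for an invalid column name.
        raise IndexError(f"invalid column name: {col!r}")
    return picked

print(select(ms, [1, 2], "morestuff"))
```

Each dataset contributes its own slice of the requested column, which is why the real output shows one sub-column per dataset under a shared column name.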
- __len__()
- __repr__()
Return repr(self).
- __setitem__(index, value)
- Parameters:
index (colspec) –
value (A Dataset or Multiset) –
- Return type:
None
- Raises:
- __str__()
Return str(self).
- _autocomplete()
- static _build_col_headers(rootobject, rootdict)
Return a list of lists of ColHeaders.
Still testing. TODO: speed up this Python loop.
- _check_addtype(name, value)
Called from subclassed Struct when a new item is added.
- _copy(deep=False, rows=None, cols=None, base_index=0, cls=None)
Bracket indexing that returns a multiset will funnel into this routine.
- Parameters:
deep (bool) – If True, perform a deep copy of the column arrays. Must be the first argument and must be True or False; it cannot be None.
rows – Row mask.
cols – Column mask.
base_index – Used for head/tail slicing.
cls – Class of the return type, for subclass super() calls.
- static _depth_first(curobject, curdict, level, returnlist)
Returns the max depth and a list of dictionaries.
- _init_from_dict(dictionary)
- _last_row_stats()
- _repr_html_()
- abs(*args, **kwargs)
- all(*args, **kwargs)
For use in boolean contexts: Is it true that for all elements (val) either:
val casts to True, or
returns True for val.all() or all(val)
- Return type:
bool
- any(*args, **kwargs)
For use in boolean contexts: Does there exist an element (val) which either:
val casts to True, or
returns True for val.any() or any(val)
- Return type:
bool
Examples
>>> s = rt.Struct()
>>> s.a = rt.Dataset()
>>> s.any()
False
- apply(*args, **kwargs)
- apply_cols(*args, **kwargs)
- apply_rows(*args, **kwargs)
- astype(*args, **kwargs)
- cascade(funcname, *args, **kwargs)
Depth-first calling of functions, often into a Dataset. For each Dataset in the Multiset, the function will be called with the args and kwargs. The return result is expected to be a Dataset, which will then be added back into a new Multiset and returned to the caller.
- Parameters:
funcname (string or callable function) –
- Return type:
Multiset
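Examples
The depth-first traversal can be sketched in plain Python. This is a conceptual model and not riptable's implementation: nested dicts stand in for Multisets, dicts of lists stand in for Datasets, and `is_dataset`, `cascade`, and `head` are hypothetical helpers. Unlike the real method, this sketch accepts only a callable, not a function name string.

```python
# Plain-Python model of a depth-first cascade. Nested dicts stand in for
# Multisets and dicts of lists stand in for Datasets. Not riptable code.
def is_dataset(obj):
    # Assumption for this sketch: a "dataset" is a dict whose values are lists.
    return isinstance(obj, dict) and all(isinstance(v, list) for v in obj.values())

def cascade(ms, func, *args, **kwargs):
    """Apply func to every dataset, depth first, and rebuild the same nesting."""
    result = {}
    for name, item in ms.items():
        if is_dataset(item):
            # Leaf: call the function with the forwarded args and kwargs.
            result[name] = func(item, *args, **kwargs)
        else:
            # Nested container: recurse before moving to the next sibling.
            result[name] = cascade(item, func, *args, **kwargs)
    return result

def head(ds, n):
    """Keep the first n rows of every column."""
    return {col: values[:n] for col, values in ds.items()}

nested = {
    "q1": {"jan": {"x": [1, 2, 3]}, "feb": {"x": [4, 5, 6]}},
    "mar": {"x": [7, 8, 9]},
}
print(cascade(nested, head, 2))
```

The result is a new container with the same nesting as the input; the original is left unchanged.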
- copy(deep=True)
Returns a shallow or deep copy of the multiset. Defaults to a deep copy.
- Parameters:
deep (bool, default True) – Set to False for a shallow copy.
- describe(*args, **kwargs)
- fillna(*args, **kwargs)
- flatten(horizontal=True, delimiter='_', dset_col_name='Column')
Return a single dataset constructed by concatenating all of the datasets and flattened multisets contained within the multiset. Horizontal flattening will concatenate the datasets horizontally, prepending the dataset name to each dataset’s column names. Vertical flattening requires the names and order of columns in each dataset to be identical, adding a single column to the returned dataset containing the name of the dataset from which each row derives.
- Parameters:
horizontal (bool) – If True, concatenate the Datasets horizontally, otherwise vertically.
delimiter (char) – The character used when joining dataset and column names to create unique names.
dset_col_name (string) – For vertical flattening, the name for the column containing dataset names.
- Return type:
Dataset
- Raises:
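Examples
The difference between horizontal and vertical flattening can be illustrated with a plain-Python model. This is a sketch under stated assumptions, not riptable's implementation: dicts of lists stand in for Datasets, nested multisets are not handled, the dataset-name column is appended last, and raising ValueError for mismatched columns is a guess, since the exception type is not given here.

```python
# Plain-Python model of flatten(); dicts of lists stand in for Datasets.
def flatten(ms, horizontal=True, delimiter="_", dset_col_name="Column"):
    if horizontal:
        # Prepend the dataset name to each column name to keep names unique.
        flat = {}
        for name, ds in ms.items():
            for col, values in ds.items():
                flat[f"{name}{delimiter}{col}"] = list(values)
        return flat

    # Vertical: every dataset must have the same columns in the same order.
    col_orders = [list(ds) for ds in ms.values()]
    if any(order != col_orders[0] for order in col_orders):
        # Assumed exception type; the original does not name one.
        raise ValueError("vertical flattening needs identical column names and order")
    columns = col_orders[0]
    flat = {col: [] for col in columns}
    flat[dset_col_name] = []
    for name, ds in ms.items():
        nrows = len(ds[columns[0]])
        for col in columns:
            flat[col].extend(ds[col])
        # Record which dataset each appended row came from.
        flat[dset_col_name].extend([name] * nrows)
    return flat

ms = {
    "test":  {"a": [1, 2], "b": ["x", "y"]},
    "test2": {"a": [3, 4], "b": ["z", "w"]},
}
wide = flatten(ms)
tall = flatten(ms, horizontal=False)
print(wide)
print(tall)
```

Horizontal flattening keeps the row count and widens the result; vertical flattening keeps the column set and stacks the rows, using the extra column to preserve each row's origin.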
- keep(*args, **kwargs)
- label_fixup()
Automatically scan for column names that can be used as labels in display.
- label_set_names(listnames)
Set which column names can be used as labels in display.
- make_table(display_type)
Pretty-print code used by infrastructure.
- Parameters:
display_type – See rt.rt_enum.DS_DISPLAY_TYPES.
- Returns:
Display object or string.
- max(*args, **kwargs)
- mean(*args, **kwargs)
- min(*args, **kwargs)
- multiget(index, deep=False)
Returns a new Multiset representing a one-level sub-sampling of the original.
- Parameters:
index – An index specification.
deep (bool, default False) – If True, make deep copies.
- Return type:
A new Multiset.
- nanmax(*args, **kwargs)
- nanmean(*args, **kwargs)
- nanmin(*args, **kwargs)
- nanstd(*args, **kwargs)
- nansum(*args, **kwargs)
- nanvar(*args, **kwargs)
- pivot(*args, **kwargs)
- quantile(*args, **kwargs)
- sort_copy(*args, **kwargs)
- sort_inplace(*args, **kwargs)
- std(*args, **kwargs)
- sum(*args, **kwargs)
- trim(*args, **kwargs)
- var(*args, **kwargs)