riptable.rt_groupbynumba

Classes

GroupbyNumba

Holds all the functions for groupby

class riptable.rt_groupbynumba.GroupbyNumba

Bases: riptable.rt_groupbyops.GroupByOps

Holds all the functions for groupby

Only used when inherited

Child class must set self.grouping and self._dataset. Child class must also override the methods count and _calculate_all, and the property gb_keychain.

CORE_COUNT = 12
static _nb_fill_backend(iGroup, iFirstGroup, nCountGroup, binLow, binHigh, data, ret, fill_val, limit, direction)

Numba backend implementation for grouped fill_forward and fill_backward for all applicable dtypes.

Parameters:
  • iGroup (np.ndarray) – Arrays from a groupby object’s ‘get_groupings’ method

  • iFirstGroup (np.ndarray) – Arrays from a groupby object’s ‘get_groupings’ method

  • nCountGroup (np.ndarray) – Arrays from a groupby object’s ‘get_groupings’ method

  • binLow (int) – Indexes corresponding to the first and the last groups in iFirstGroup and nCountGroup

  • binHigh (int) – Indexes corresponding to the first and the last groups in iFirstGroup and nCountGroup

  • data (array) – The original data to be operated on

  • ret (array) – An empty array the same size as ‘data’ which will contain the processed data. Must be None for inplace operation

  • fill_val (scalar) – Same as the fill_val parameter of nb_fill_forward/nb_fill_backward: the value to use where there is no valid group value to propagate forward/backward. If fill_val is not specified, NaN and invalid values aren’t replaced where there is no valid group value to propagate forward/backward.

  • limit (int) – Same as the limit parameter of nb_fill_forward/nb_fill_backward: the maximum number of consecutive NaN or invalid values to fill. A value of 0 disables the limit.

  • direction (int (-1 or 1)) – direction = 1 corresponds to fill_forward, -1 corresponds to fill_backward
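
The parameters above can be illustrated with a minimal pure-NumPy sketch. It is not the Numba backend itself. It assumes that iGroup lists row indices ordered by group, and that iFirstGroup[g] and nCountGroup[g] give the offset and length of group g's slice within iGroup. The helper name grouped_fill_sketch is hypothetical, binLow/binHigh are omitted (every group is processed), only floating-point data with NaN as the invalid value is handled, and the interaction between limit and fill_val is a guess.

```python
import numpy as np

def grouped_fill_sketch(iGroup, iFirstGroup, nCountGroup, data,
                        fill_val=None, limit=0, direction=1):
    """Hypothetical reference version of a grouped fill; not riptable code."""
    ret = data.copy()
    for g in range(len(iFirstGroup)):
        start = iFirstGroup[g]
        rows = iGroup[start:start + nCountGroup[g]]  # row indices of group g
        if direction == -1:
            rows = rows[::-1]                        # walk the group backward
        have_last, last, run = False, np.nan, 0
        for r in rows:
            if not np.isnan(data[r]):
                have_last, last, run = True, data[r], 0
            elif have_last:
                # Assumption: limit <= 0 means "no limit".
                if limit <= 0 or run < limit:
                    ret[r] = last
                run += 1
            elif fill_val is not None:
                # No valid value to propagate yet in this group.
                ret[r] = fill_val
    return ret

# Build the assumed grouping arrays from integer keys.
keys = np.array([0, 1, 0, 1, 0, 0])
data = np.array([np.nan, 1.0, 2.0, np.nan, np.nan, np.nan])
iGroup = np.argsort(keys, kind="stable")
nCountGroup = np.bincount(keys)
iFirstGroup = np.cumsum(nCountGroup) - nCountGroup

forward = grouped_fill_sketch(iGroup, iFirstGroup, nCountGroup, data)
limited = grouped_fill_sketch(iGroup, iFirstGroup, nCountGroup, data, limit=1)
backward = grouped_fill_sketch(iGroup, iFirstGroup, nCountGroup, data,
                               fill_val=-1.0, direction=-1)
```

Unlike the documented backend, the sketch returns a new array instead of writing into a caller-supplied ret.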

_nb_groupbycalculateall(ikey, unique_rows, funcList, binLowList, binHighList, func_param)
_nb_groupbycalculateallpack(ikey, iGroup, iFirstGroup, nCountGroup, unique_rows, funcList, binLowList, binHighList, inplace, func_param)
_numbaEMA(iFirstGroup, nCountGroup, binLow, binHigh, data, ret, time, decayRate)
_numbaEMA2(iFirstGroup, nCountGroup, data, ret, time, decayRate)

For each group defined by the grouping arguments, sets ‘ret’ to a true EMA of the ‘data’ argument using the time argument as the time and the ‘decayRate’ as the decay rate.

Parameters:
  • iGroup (from a groupby object's 'get_groupings' method) –

  • iFirstGroup (from a groupby object's 'get_groupings' method) –

  • nCountGroup (from a groupby object's 'get_groupings' method) –

  • data (the original data to be operated on) –

  • ret (a blank array the same size as 'data' which will return the processed data) –

  • time (a list of times associated to the rows of data) –

  • decayRate (the decay rate (e based)) –

TODO: Error checking.
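
The description above does not spell out the recurrence, so the following is only a sketch of one common time-weighted EMA: each new value is blended with the previous EMA using the weight w = exp(-decayRate * dt), where dt is the time elapsed since the previous row of the same group. The function name grouped_ema_sketch and the grouping layout (iGroup holds row indices ordered by group; iFirstGroup and nCountGroup give each group's offset and length) are assumptions, and the actual riptable recurrence may differ.

```python
import numpy as np

def grouped_ema_sketch(iGroup, iFirstGroup, nCountGroup, data, time, decayRate):
    """Hypothetical per-group, time-weighted EMA; not riptable code."""
    ret = np.full(len(data), np.nan)
    for g in range(len(iFirstGroup)):
        start = iFirstGroup[g]
        rows = iGroup[start:start + nCountGroup[g]]
        if len(rows) == 0:
            continue
        # Seed the EMA with the first value of the group.
        ema = float(data[rows[0]])
        prev_t = time[rows[0]]
        ret[rows[0]] = ema
        for r in rows[1:]:
            w = np.exp(-decayRate * (time[r] - prev_t))  # weight of the old EMA
            ema = w * ema + (1.0 - w) * data[r]
            ret[r] = ema
            prev_t = time[r]
    return ret

keys = np.array([0, 1, 0])
data = np.array([1.0, 10.0, 3.0])
time = np.array([0.0, 0.0, 1.0])
iGroup = np.argsort(keys, kind="stable")
nCountGroup = np.bincount(keys)
iFirstGroup = np.cumsum(nCountGroup) - nCountGroup

# With decayRate = ln(2), one time unit halves the weight of the old EMA.
ema = grouped_ema_sketch(iGroup, iFirstGroup, nCountGroup, data, time, np.log(2.0))
```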

_numbaFillBackward(iFirstGroup, nCountGroup, data, ret)

Propagate backward non-NaN values within a group, overwriting NaN values. TODO: better documentation

_numbaFillForward(iFirstGroup, nCountGroup, data, ret)

Propagate forward non-NaN values within a group, overwriting NaN values. TODO: better documentation

_numbaTrim(iFirstGroup, nCountGroup, data, ret, x, y)

For each group defined by the grouping arguments, sets ‘ret’ to be a copy of the ‘data’ with elements below the ‘x’th percentile or above the ‘y’th percentile of the group set to nan.

Parameters:
  • iGroup (from a groupby object's 'get_groupings' method) –

  • iFirstGroup (from a groupby object's 'get_groupings' method) –

  • nCountGroup (from a groupby object's 'get_groupings' method) –

  • data (the original data to be operated on) –

  • ret (a blank array the same size as 'data' which will return the processed data) –

  • x (the lower percentile bound) –

  • y (the upper percentile bound) –
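
The percentile trim described above can be sketched in plain NumPy. This is an illustration under assumptions, not the Numba kernel: grouped_trim_sketch is a hypothetical name, the grouping arrays are assumed to follow the layout "iGroup holds row indices ordered by group; iFirstGroup and nCountGroup give each group's offset and length", and percentiles are computed with np.nanpercentile using its default linear interpolation, which may not match riptable's method.

```python
import numpy as np

def grouped_trim_sketch(iGroup, iFirstGroup, nCountGroup, data, x, y):
    """Hypothetical per-group percentile trim; not riptable code."""
    ret = data.astype(np.float64)          # astype returns a copy
    for g in range(len(iFirstGroup)):
        start = iFirstGroup[g]
        rows = iGroup[start:start + nCountGroup[g]]
        if len(rows) == 0:
            continue
        vals = data[rows]
        lo, hi = np.nanpercentile(vals, [x, y])   # group-specific bounds
        outside = (vals < lo) | (vals > hi)
        ret[rows[outside]] = np.nan
    return ret

# Two groups with very different scales, trimmed at the 5th/95th percentiles.
keys = np.array([0] * 11 + [1] * 3)
data = np.concatenate([np.arange(11.0), [100.0, 200.0, 300.0]])
iGroup = np.argsort(keys, kind="stable")
nCountGroup = np.bincount(keys)
iFirstGroup = np.cumsum(nCountGroup) - nCountGroup

trimmed = grouped_trim_sketch(iGroup, iFirstGroup, nCountGroup, data, 5, 95)
```

Because the bounds are computed per group, the smallest and largest value of each group are removed here, even though the second group's minimum is far above the first group's maximum.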

static _numba_fill_direction(direction, iGroup, iFirstGroup, nCountGroup, binLow, binHigh, data, ret, fill_val, limit)
_numbamin(unique_rows, binLow, binHigh, data, ret)
_numbasum(unique_rows, binLow, binHigh, data, ret)
grpFillBackward()

Propagate backward non-NaN values within a group, overwriting NaN values. TODO: better documentation

grpFillForward()

Propagate forward non-NaN values within a group, overwriting NaN values. TODO: better documentation

grpFillForwardBackward()

Propagate forward, then backward, non-NaN values within a group, overwriting NaN values. TODO: better documentation
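
The order of the two passes matters: a NaN that has valid neighbors on both sides takes the earlier value, and only leading NaNs take a later one. The sketch below shows this with plain NumPy. The helper names ffill_1d and fill_forward_backward_sketch are hypothetical, and the sketch groups rows by an integer key array rather than by the grouping arrays riptable uses internally.

```python
import numpy as np

def ffill_1d(values):
    """Forward fill NaNs in a 1-D float array; leading NaNs stay NaN."""
    idx = np.where(np.isnan(values), 0, np.arange(len(values)))
    np.maximum.accumulate(idx, out=idx)    # index of the last valid element
    return values[idx]

def fill_forward_backward_sketch(keys, data):
    """Hypothetical grouped forward-then-backward fill; not riptable code."""
    ret = data.copy()
    for k in np.unique(keys):
        rows = np.flatnonzero(keys == k)
        filled = ffill_1d(data[rows])            # forward pass
        filled = ffill_1d(filled[::-1])[::-1]    # backward pass on the result
        ret[rows] = filled
    return ret

keys = np.array([0, 0, 0, 0, 0, 1, 1])
data = np.array([np.nan, 1.0, np.nan, 2.0, np.nan, np.nan, 5.0])
both = fill_forward_backward_sketch(keys, data)
```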

grpTrim(x, y)

For each column, for each group, determine the x’th and y’th percentile of the data and set data below the x’th percentile or above the y’th percentile to nan.

Parameters:
  • grp (a groupby object) –

  • x (lower percentile) –

  • y (upper percentile) –

Returns:

  • A dataset with the values outside the given percentiles set to np.nan

TODO: Test column types to make sure that the numba code will work nicely.

nb_ema(*args, time=None, decay_rate=None, **kwargs)
Parameters:
  • time (an array of times (often in nanoseconds) associated to the rows of data) –

  • decay_rate (the scalar decay rate (e based)) –

nb_fill_backward(*args, fill_val, limit=0, inplace=False)

Replace NaN and invalid array values by propagating the next encountered valid group value backward.

Optionally, you can modify the original array if it’s not locked.

Parameters:
  • *args (array or list of arrays) – The array or arrays that contain NaN or invalid values you want to replace.

  • limit (int, default 0 (disabled)) – The maximum number of consecutive NaN or invalid values to fill. If there is a gap with more than this number of consecutive NaN or invalid values, the gap will be only partially filled. If no limit is specified, all consecutive NaN and invalid values are replaced.

  • fill_val (scalar, default None) – The value to use where there is no valid group value to propagate backward. If fill_val is not specified, NaN and invalid values aren’t replaced where there is no valid group value to propagate backward.

  • inplace (bool, default False) – If False, return a copy of the array. If True, modify original data. This will modify any other views on this object. This fails if the array is locked.

Returns:

The dataset (categorical) will be the same size and have the same dtypes as the original input.

Return type:

Dataset-like object

nb_fill_forward(*args, limit=0, fill_val=None, inplace=False)

Replace NaN and invalid array values by propagating the last encountered valid group value forward.

Optionally, you can modify the original array if it’s not locked.

Parameters:
  • *args (array or list of arrays) – The array or arrays that contain NaN or invalid values you want to replace.

  • limit (int, default 0 (disabled)) – The maximum number of consecutive NaN or invalid values to fill. If there is a gap with more than this number of consecutive NaN or invalid values, the gap will be only partially filled. If no limit is specified, all consecutive NaN and invalid values are replaced.

  • fill_val (scalar, default None) – The value to use where there is no valid group value to propagate forward. If fill_val is not specified, NaN and invalid values aren’t replaced where there is no valid group value to propagate forward.

  • inplace (bool, default False) – If False, return a copy of the array. If True, modify original data. This will modify any other views on this object. This fails if the array is locked.

Returns:

The dataset (categorical) will be the same size and have the same dtypes as the original input.

Return type:

Dataset-like object

nb_min(*args, **kwargs)

Compute min of group

nb_sum(*args, **kwargs)

Compute sum of group

nb_sum_punt_test(*args, **kwargs)

Compute sum of group