riptable.rt_numpy
Classes
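bool_ | The Riptable equivalent of numpy.bool_, with the concept of an invalid added. |
bytes_ | The Riptable equivalent of numpy.bytes_, with the concept of an invalid added. |
float32 | The Riptable equivalent of numpy.float32, with the concept of an invalid added. |
float64 | The Riptable equivalent of numpy.float64, with the concept of an invalid added. |
int0 | The Riptable equivalent of numpy.int64, with the concept of an invalid added. |
int16 | The Riptable equivalent of numpy.int16, with the concept of an invalid added. |
int32 | The Riptable equivalent of numpy.int32, with the concept of an invalid added. |
int64 | The Riptable equivalent of numpy.int64, with the concept of an invalid added. |
int8 | The Riptable equivalent of numpy.int8, with the concept of an invalid added. |
str_ | The Riptable equivalent of numpy.str_, with the concept of an invalid added. |
uint0 | The Riptable equivalent of numpy.uint64, with the concept of an invalid added. |
uint16 | The Riptable equivalent of numpy.uint16, with the concept of an invalid added. |
uint32 | The Riptable equivalent of numpy.uint32, with the concept of an invalid added. |
uint64 | The Riptable equivalent of numpy.uint64, with the concept of an invalid added. |
uint8 | The Riptable equivalent of numpy.uint8, with the concept of an invalid added. |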
Functions
|
|
|
This will check for numpy array first and call np.abs |
|
|
|
|
|
|
|
Return an array of evenly spaced values within a specified interval. |
|
|
|
|
|
|
|
|
|
|
|
|
|
Count the number of set (True) bits in an integer or in each integer within an array of |
|
|
|
Create a |
|
|
|
|
|
|
|
|
|
|
|
|
|
Calculate the 32-bit CRC of the data in an array using the Castagnoli polynomial (CRC32C). |
|
|
|
|
|
|
|
|
|
Return a new array of specified shape and type, without initializing entries. |
|
Return a new array with the same shape and type as the specified array, |
|
|
|
Return a new array of a specified shape and type, filled with a specified value. |
|
Return a full array with the same shape and type as a given array. |
|
Return the dtype of two arrays, or two scalars, or a scalar and an array. |
|
Return the dtype of an array, list, or builtin int, float, bool, str, bytes. |
|
Main routine used to groupby one or more keys. |
|
Find unique values in an array using a linear hashing algorithm. |
|
|
|
A routine often called after groupbyhash or groupbylex. |
|
see numpy hstack |
|
One-dimensional or two-dimensional linear interpolation with clipping. |
|
One-dimensional or two-dimensional linear interpolation without clipping. |
|
Return True for each finite element, False otherwise. |
|
Return True for each element that's positive or negative infinity, False otherwise. |
|
The ismember function is meant to mimic the ismember function in MATLAB. It takes two sets of data |
|
Return True for each element that's a NaN (Not a Number), False otherwise. |
|
Return True for each element that's a NaN (Not a Number) or zero, False otherwise. |
|
Return True for each non-finite element, False otherwise. |
|
Return True for each element that's not positive or negative infinity, |
|
Return True for each element that's not a NaN (Not a Number), False otherwise. |
|
Return True if the array is sorted, False otherwise. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
pass in a tuple or list of boolean arrays to AND together |
|
inplace version: pass in a tuple or list of boolean arrays to AND together |
|
pass in a tuple or list of boolean arrays to ANDNOT together |
|
inplace version: pass in a tuple or list of boolean arrays to ANDNOT together |
|
pass in a tuple or list of boolean arrays to OR together |
|
inplace version: pass in a tuple or list of boolean arrays to OR together |
|
pass in a tuple or list of boolean arrays to XOR together |
|
inplace version: pass in a tuple or list of boolean arrays to XOR together |
|
|
|
|
|
Compute the arithmetic mean of the values in the first argument. |
|
|
|
|
|
|
|
Returns 7 arrays to help navigate data. |
|
arg1: ndarray |
|
Replace the NaN or invalid values in an array with zeroes. |
|
|
|
|
|
|
|
Compute the arithmetic mean of the values in the first argument, ignoring NaNs. |
|
|
|
|
|
|
|
Compute the standard deviation of the values in the first argument, ignoring NaNs. |
|
Compute the sum of the values in the first argument, ignoring NaNs. |
|
Compute the variance of the values in the first argument, ignoring NaNs. |
|
Return a new array of the specified shape and data type, filled with ones. |
|
Return an array of ones with the same shape and data type as the specified array. |
|
|
|
This is roughly the equivalent of arr[mask] = arr2[mask]. |
|
|
|
|
|
This will check for numpy array first and call np.round |
|
see np.searchsorted |
|
|
|
|
|
|
|
Compute the standard deviation of the values in the first argument. |
|
Compute the sum of the values in the first argument. |
|
Construct an array by repeating a specified array a specified number of |
|
|
|
|
|
Returns the index location of the first occurrence of each key. |
|
Compute the variance of the values in the first argument. |
|
|
|
Return a new |
|
Return a new array of the specified shape and data type, filled with zeros. |
|
Return an array of zeros with the same shape and data type as the specified array. |
Attributes
|
|
|
- class riptable.rt_numpy.bool_(value)
Bases: numpy.bool_
The Riptable equivalent of numpy.bool_, with the concept of an invalid added.
See also
numpy.bool_, float32, float64, int8, uint8, int16, uint16, int32, uint32, int64, uint64, bytes_, str_
Examples
>>> rt.bool_.inv
False
- inv
- class riptable.rt_numpy.bytes_
Bases: numpy.bytes_
The Riptable equivalent of numpy.bytes_, with the concept of an invalid added.
See also
numpy.bytes_, float32, float64, int8, uint8, int16, uint16, int32, uint32, int64, uint64, str_, bool_
Examples
>>> rt.bytes_.inv
b''
- inv
- class riptable.rt_numpy.float32(value)
Bases: numpy.float32
The Riptable equivalent of numpy.float32, with the concept of an invalid added.
See also
numpy.float32, float64, int8, uint8, int16, uint16, int32, uint32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.float32.inv
nan
- inv
- class riptable.rt_numpy.float64(value)
Bases: numpy.float64
The Riptable equivalent of numpy.float64, with the concept of an invalid added.
See also
numpy.float64, float32, int8, uint8, int16, uint16, int32, uint32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.float64.inv
nan
- inv
- class riptable.rt_numpy.int0(value)
Bases: int64
The Riptable equivalent of numpy.int64, with the concept of an invalid added.
See also
numpy.int64, float32, float64, int8, uint8, int16, uint16, int32, uint32, uint64, bytes_, str_, bool_
Examples
>>> rt.int64.inv
-9223372036854775808
- class riptable.rt_numpy.int16(value)
Bases: numpy.int16
The Riptable equivalent of numpy.int16, with the concept of an invalid added.
See also
numpy.int16, float32, float64, int8, uint8, uint16, int32, uint32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.int16.inv
-32768
- inv
- class riptable.rt_numpy.int32(value)
Bases: numpy.int32
The Riptable equivalent of numpy.int32, with the concept of an invalid added.
See also
numpy.int32, float32, float64, int8, uint8, int16, uint16, uint32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.int32.inv
-2147483648
- inv
- class riptable.rt_numpy.int64(value)
Bases: numpy.int64
The Riptable equivalent of numpy.int64, with the concept of an invalid added.
See also
numpy.int64, float32, float64, int8, uint8, int16, uint16, int32, uint32, uint64, bytes_, str_, bool_
Examples
>>> rt.int64.inv
-9223372036854775808
- inv
- class riptable.rt_numpy.int8(value)
Bases: numpy.int8
The Riptable equivalent of numpy.int8, with the concept of an invalid added.
See also
numpy.int8, float32, float64, uint8, int16, uint16, int32, uint32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.int8.inv
-128
- inv
- class riptable.rt_numpy.str_
Bases: numpy.str_
The Riptable equivalent of numpy.str_, with the concept of an invalid added.
See also
numpy.str_, float32, float64, int8, uint8, int16, uint16, int32, uint32, int64, uint64, bytes_, bool_
Examples
>>> rt.str_.inv
''
- inv
- class riptable.rt_numpy.uint0(value)
Bases: uint64
The Riptable equivalent of numpy.uint64, with the concept of an invalid added.
See also
numpy.uint64, float32, float64, int8, uint8, int16, uint16, int32, uint32, int64, bytes_, str_, bool_
Examples
>>> rt.uint64.inv
18446744073709551615
- class riptable.rt_numpy.uint16(value)
Bases: numpy.uint16
The Riptable equivalent of numpy.uint16, with the concept of an invalid added.
See also
numpy.uint16, float32, float64, int8, uint8, int16, int32, uint32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.uint16.inv
65535
- inv
- class riptable.rt_numpy.uint32(value)
Bases: numpy.uint32
The Riptable equivalent of numpy.uint32, with the concept of an invalid added.
See also
numpy.uint32, float32, float64, int8, uint8, int16, uint16, int32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.uint32.inv
4294967295
- inv
- class riptable.rt_numpy.uint64(value)
Bases: numpy.uint64
The Riptable equivalent of numpy.uint64, with the concept of an invalid added.
See also
numpy.uint64, float32, float64, int8, uint8, int16, uint16, int32, uint32, int64, bytes_, str_, bool_
Examples
>>> rt.uint64.inv
18446744073709551615
- inv
- class riptable.rt_numpy.uint8(value)
Bases: numpy.uint8
The Riptable equivalent of numpy.uint8, with the concept of an invalid added.
See also
numpy.uint8, float32, float64, int8, int16, uint16, int32, uint32, int64, uint64, bytes_, str_, bool_
Examples
>>> rt.uint8.inv
255
- inv
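The integer examples above suggest a pattern: each signed type reports its minimum value as the invalid, and each unsigned type reports its maximum. The sketch below reproduces the documented values from the bit width alone. The helper name `integer_invalid` is hypothetical and not part of Riptable; the pattern is inferred from the examples, not from the implementation.

```python
def integer_invalid(bits: int, signed: bool) -> int:
    """Return the sentinel implied by the documented ``inv`` values.

    Assumption (inferred from the examples above): signed types use
    their minimum value, unsigned types use their maximum value.
    """
    if signed:
        return -(1 << (bits - 1))   # e.g. int8 -> -128
    return (1 << bits) - 1          # e.g. uint8 -> 255


# Print the sentinel for every documented integer width.
for bits in (8, 16, 32, 64):
    print(f"int{bits}.inv  = {integer_invalid(bits, signed=True)}")
    print(f"uint{bits}.inv = {integer_invalid(bits, signed=False)}")
```

The floating-point, string, bytes, and boolean invalids (nan, '', b'', False) do not follow this rule and are listed in the class examples above.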
- riptable.rt_numpy._searchsorted(array, v, side='left', sorter=None)
- riptable.rt_numpy.abs(*args, **kwargs)
This will check for numpy array first and call np.abs
- riptable.rt_numpy.absolute(*args, **kwargs)
- riptable.rt_numpy.all(*args, **kwargs)
- riptable.rt_numpy.any(*args, **kwargs)
- riptable.rt_numpy.arange(*args, **kwargs)
Return an array of evenly spaced values within a specified interval.
The half-open interval includes start but excludes stop: [start, stop).
For integer arguments the function is roughly equivalent to the Python built-in range, but returns a FastArray rather than a range instance.
When using a non-integer step, such as 0.1, it’s often better to use numpy.linspace().
For additional warnings, see numpy.arange().
- Parameters:
start (int or float, default 0) – Start of interval. The interval includes this value.
stop (int or float) – End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of the output.
step (int or float, default 1) – Spacing between values. For any output out, this is the distance between two adjacent values: out[i+1] - out[i]. If step is specified as a positional argument, start must also be given.
dtype (str or NumPy dtype or Riptable dtype, optional) – The type of the output array. If dtype is not given, the data type is inferred from the other input arguments.
like (array_like, optional) – Reference object to allow the creation of arrays that are not NumPy arrays. If an array-like passed in as like supports the __array_function__ protocol, the result will be defined by it. In this case, it ensures the creation of an array object compatible with that passed in via this argument.
- Returns:
A FastArray of evenly spaced numbers within the specified interval. For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point overflow, this rule may result in the last element of the output being greater than stop.
- Return type:
FastArray
See also
numpy.arange, riptable.ones, riptable.ones_like, riptable.zeros, riptable.zeros_like, riptable.empty, riptable.empty_like, riptable.full, riptable.arange, Categorical.full
Examples
>>> rt.arange(3)
FastArray([0, 1, 2])
>>> rt.arange(3.0)
FastArray([ 0., 1., 2.])
>>> rt.arange(3, 7)
FastArray([3, 4, 5, 6])
>>> rt.arange(3, 7, 2)
FastArray([3, 5])
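The length rule quoted under Returns, ceil((stop - start)/step), can be illustrated without Riptable. The sketch below is a pure-Python approximation of that rule; `arange_length` and `arange_values` are hypothetical names used only for illustration, and the sketch makes no claim about how Riptable or NumPy compute the values internally.

```python
import math


def arange_length(start, stop, step=1):
    """Number of elements implied by ceil((stop - start) / step)."""
    if step == 0:
        raise ValueError("step must be nonzero")
    return max(0, math.ceil((stop - start) / step))


def arange_values(start, stop, step=1):
    """Build the values as start + i * step for i in [0, length)."""
    return [start + i * step for i in range(arange_length(start, stop, step))]


print(arange_values(0, 3))      # mirrors rt.arange(3)
print(arange_values(3, 7))      # mirrors rt.arange(3, 7)
print(arange_values(3, 7, 2))   # mirrors rt.arange(3, 7, 2)
```

Because the length is computed from a floating point division, a non-integer step can produce one more element than expected, which is why the text above recommends numpy.linspace() in that case.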
- riptable.rt_numpy.argsort(*args, **kwargs)
- riptable.rt_numpy.asanyarray(a, dtype=None, order=None)
- riptable.rt_numpy.asarray(a, dtype=None, order=None)
- riptable.rt_numpy.assoc_copy(key1, key2, arr)
- Parameters:
key1 (ndarray / list thereof or a Dataset) – NumPy arrays to match against; all arrays must be the same length.
key2 (ndarray / list thereof or a Dataset) – NumPy arrays that will be matched with key1; all arrays must be the same length.
arr (ndarray / Dataset) – An array or Dataset the same length as the key2 arrays, which will be mapped to the size of key1. In the case of an array, the output will be cast to FastArray to accommodate fancy indexing with sentinel values.
- Returns:
A new array the same length as the key1 arrays, in which the input arr has been mapped from key2 to key1. The array’s dtype will match the dtype of the input array (the third parameter). However, the output will be a FastArray when the input is a NumPy array, so that fancy indexing with sentinels works correctly.
- Return type:
array_like
Examples
>>> np.random.seed(12345)
>>> ds = rt.Dataset({'time': rt.arange(200_000_000.0)})
>>> ds.data = np.random.randint(7, size=200_000_000)
>>> ds.symbol = rt.Cat(1 + rt.arange(200_000_000) % 7, ['AAPL','AMZN', 'FB', 'GOOG', 'IBM','MSFT','UBER'])
>>> dsa = rt.Dataset({'data': rt.repeat(rt.arange(7), 7),
...                   'symbol': rt.tile(rt.FastArray(['AAPL','AMZN', 'FB', 'GOOG', 'IBM','MSFT','UBER']), 7),
...                   'time': 48 - rt.arange(49.0)})
>>> rt.assoc_copy([ds.symbol, ds.data], [dsa.symbol, dsa.data], dsa.time)
FastArray([13., 5., 46., ..., 5., 11., 24.])
- riptable.rt_numpy.assoc_index(key1, key2)
- Parameters:
key1 (ndarray / list thereof or a Dataset) – NumPy arrays to match against; all arrays must be the same length.
key2 (ndarray / list thereof or a Dataset) – NumPy arrays that will be matched with key1; all arrays must be the same length.
- Returns:
fancy_index – Fancy index where the index of key2 is matched against key1; if there was no match, the minimum integer (aka sentinel) is the index value.
- Return type:
ndarray of ints
Examples
>>> np.random.seed(12345)
>>> ds = rt.Dataset({'time': rt.arange(200_000_000.0)})
>>> ds.data = np.random.randint(7, size=200_000_000)
>>> ds.symbol = rt.Cat(1 + rt.arange(200_000_000) % 7, ['AAPL','AMZN', 'FB', 'GOOG', 'IBM','MSFT','UBER'])
>>> dsa = rt.Dataset({'data': rt.repeat(rt.arange(7), 7),
...                   'symbol': rt.tile(rt.FastArray(['AAPL','AMZN', 'FB', 'GOOG', 'IBM','MSFT','UBER']), 7)})
>>> rt.assoc_index([ds.symbol, ds.data], [dsa.symbol, dsa.data])
FastArray([35, 43, 2, ..., 43, 37, 24])
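The matching performed by assoc_index and assoc_copy can be pictured with a small pure-Python sketch. It is an illustration of the described behavior under stated assumptions, not Riptable's implementation: the names `assoc_index_sketch` and `assoc_copy_sketch` are hypothetical, the sentinel is assumed to be the minimum 32-bit integer, the first matching key2 row is assumed to win when key2 contains duplicates, and unmatched rows in the copy are shown as None where Riptable would use an invalid value.

```python
INT32_MIN = -(2 ** 31)  # assumed sentinel ("the minimum integer")


def assoc_index_sketch(key1, key2, sentinel=INT32_MIN):
    """For each row of key1, return the index of the matching row in key2."""
    position = {}
    for j, row in enumerate(key2):
        position.setdefault(row, j)  # assumption: first occurrence wins
    return [position.get(row, sentinel) for row in key1]


def assoc_copy_sketch(key1, key2, arr, sentinel=INT32_MIN):
    """Map arr (aligned with key2) onto the rows of key1."""
    index = assoc_index_sketch(key1, key2, sentinel)
    return [arr[j] if j != sentinel else None for j in index]


# Multikey rows are modeled as tuples built with zip.
key1 = list(zip(["AAPL", "FB", "IBM", "AAPL"], [0, 1, 2, 5]))
key2 = list(zip(["AAPL", "FB", "IBM"], [0, 1, 2]))
times = [48.0, 47.0, 46.0]

print(assoc_index_sketch(key1, key2))
print(assoc_copy_sketch(key1, key2, times))
```

The last row of key1, ('AAPL', 5), has no counterpart in key2, so its index is the sentinel.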
- riptable.rt_numpy.bincount(*args, **kwargs)
- riptable.rt_numpy.bitcount(a)
Count the number of set (True) bits in an integer or in each integer within an array of integers. This operation is also known as population count or Hamming weight.
- Parameters:
a (int or sequence or numpy.array) – A Python integer, a sequence of integers, or a NumPy integer array.
- Returns:
If the input is a Python int, the return value is an int. If the input is a sequence or NumPy array, the return value is a NumPy array with dtype int8.
- Return type:
int or numpy.array
Examples
>>> arr = rt.FastArray([741858, 77285, 916765, 395393, 347556, 896425, 921598, 86398])
>>> rt.bitcount(arr)
FastArray([10, 10, 14, 5, 9, 12, 14, 10], dtype=int8)
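For readers unfamiliar with population count, a minimal pure-Python sketch follows. `bitcount_sketch` is a hypothetical name, it handles non-negative integers only, and it returns a plain list rather than an int8 array.

```python
def bitcount_sketch(a):
    """Count set bits in a non-negative int, or in each int of a sequence."""
    if isinstance(a, int):
        return bin(a).count("1")
    return [bin(x).count("1") for x in a]


# 741858 is 0xB51E2: the hex digits B, 5, 1, E, 2 hold 3 + 2 + 1 + 3 + 1 set bits.
print(bitcount_sketch(741858))          # 10, as in the first element above
print(bitcount_sketch([741858, 77285])) # [10, 10]
```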
- riptable.rt_numpy.bool_to_fancy(arr, both=False)
- Parameters:
arr (ndarray of bools) – A boolean array of True/False values.
both (bool) – Controls whether to return the indices of both the True and the False elements in arr. Defaults to False.
- Returns:
fancy_index (ndarray of ints) – Fancy index array of where the True values are. If both is True, the array has two sections: the first slice holds the indices of the True values and the second slice holds the indices of the False values. In that case the True count is also returned.
true_count (int, optional) – When both is True, this value is returned to indicate how many True values were in arr; it is then used to slice fancy_index into two slices indicating where the True and False values are, respectively, within arr.
Notes
Runs in parallel.
Examples
>>> np.random.seed(12345)
>>> bools = np.random.randint(2, size=20, dtype=np.int8).astype(bool)
>>> rt.bool_to_fancy(bools)
FastArray([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 17, 18, 19])
Setting the both parameter to True causes the function to return an array containing the indices of the True values in arr followed by the indices of the False values, along with the number (count) of True values. This count can be used to slice the returned array into the True indices and the False indices.
>>> fancy_index, true_count = rt.bool_to_fancy(bools, both=True)
>>> fancy_index[:true_count], fancy_index[true_count:]
(FastArray([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 17, 18, 19]), FastArray([ 0, 11, 13, 14, 16]))
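The layout of the result when both is True can be sketched in plain Python. `bool_to_fancy_sketch` is a hypothetical stand-in that returns lists instead of FastArrays and runs serially; it only illustrates the index layout described above.

```python
def bool_to_fancy_sketch(arr, both=False):
    """Return indices of True values; optionally append indices of False values."""
    true_idx = [i for i, flag in enumerate(arr) if flag]
    if not both:
        return true_idx
    false_idx = [i for i, flag in enumerate(arr) if not flag]
    # One combined index plus the count needed to split it again.
    return true_idx + false_idx, len(true_idx)


bools = [False, True, True, False, True]
fancy_index, true_count = bool_to_fancy_sketch(bools, both=True)
print(fancy_index[:true_count])  # positions of True
print(fancy_index[true_count:])  # positions of False
```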
- riptable.rt_numpy.cat2keys(key1, key2, filter=None, ordered=True, sort_gb=False, invalid=False, fuse=False)
Create a Categorical from two keys or two Categorical objects with all possible unique combinations.
Notes
Code assumes Categoricals are base 1.
- Parameters:
key1 (Categorical, ndarray, or list of ndarray) – If a list of arrays is passed for this parameter, all arrays in the list must have the same length.
key2 (Categorical, ndarray, or list of ndarray) – If a list of arrays is passed for this parameter, all arrays in the list must have the same length.
filter (ndarray of bool, optional) – Only valid when invalid is set to True.
ordered (bool, default True) – Only applies when key1 or key2 is not a Categorical.
sort_gb (bool, default False) – Only applies when key1 or key2 is not a Categorical.
invalid (bool, default False) – Specifies whether or not to insert the invalid when creating the n x m unique matrix.
fuse (bool, default False) – When True, forces the resulting Categorical to have 2 keys, one for rows and one for columns.
- Returns:
A multikey Categorical that has at least 2 keys.
- Return type:
Categorical
Examples
The following examples demonstrate using cat2keys on keys as lists and arrays, lists of arrays, and Categoricals. In each of the examples, you can determine the unique combinations by zipping together the values at the same position in each array of the category dictionary.
Create a multikey Categorical from two lists of equal length:
>>> rt.cat2keys(list('abc'), list('xyz'))
Categorical([(a, x), (b, y), (c, z)]) Length: 3
  FastArray([1, 5, 9], dtype=int64) Base Index: 1
  {'key_0': FastArray([b'a', b'b', b'c', b'a', b'b', b'c', b'a', b'b', b'c'], dtype='|S1'), 'key_01': FastArray([b'x', b'x', b'x', b'y', b'y', b'y', b'z', b'z', b'z'], dtype='|S1')} Unique count: 9
>>> rt.cat2keys(np.array(list('abc')), np.array(list('xyz')))
Categorical([(a, x), (b, y), (c, z)]) Length: 3
  FastArray([1, 5, 9], dtype=int64) Base Index: 1
  {'key_0': FastArray([b'a', b'b', b'c', b'a', b'b', b'c', b'a', b'b', b'c'], dtype='|S1'), 'key_01': FastArray([b'x', b'x', b'x', b'y', b'y', b'y', b'z', b'z', b'z'], dtype='|S1')} Unique count: 9
>>> key1, key2 = [rt.FA(list('abc')), rt.FA(list('def'))], [rt.FA(list('uvw')), rt.FA(list('xyz'))]
>>> cat = rt.cat2keys(key1, key2)
>>> cat
Categorical([(a, d, u, x), (b, e, v, y), (c, f, w, z)]) Length: 3
  FastArray([1, 5, 9], dtype=int64) Base Index: 1
  {'key_0': FastArray([b'a', b'b', b'c', b'a', b'b', b'c', b'a', b'b', b'c'], dtype='|S1'), 'key_1': FastArray([b'd', b'e', b'f', b'd', b'e', b'f', b'd', b'e', b'f'], dtype='|S1'), 'key_01': FastArray([b'u', b'u', b'u', b'v', b'v', b'v', b'w', b'w', b'w'], dtype='|S1'), 'key_11': FastArray([b'x', b'x', b'x', b'y', b'y', b'y', b'z', b'z', b'z'], dtype='|S1')} Unique count: 9
>>> cat.category_dict
{'key_0': FastArray([b'a', b'b', b'c', b'a', b'b', b'c', b'a', b'b', b'c'], dtype='|S1'),
 'key_1': FastArray([b'd', b'e', b'f', b'd', b'e', b'f', b'd', b'e', b'f'], dtype='|S1'),
 'key_01': FastArray([b'u', b'u', b'u', b'v', b'v', b'v', b'w', b'w', b'w'], dtype='|S1'),
 'key_11': FastArray([b'x', b'x', b'x', b'y', b'y', b'y', b'z', b'z', b'z'], dtype='|S1')}
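The codes [1, 5, 9] and the nine-entry category arrays in the first example can be reproduced with a short pure-Python sketch. It assumes single-key inputs, sorted unique values (matching ordered=True), and a layout in which the first key varies fastest; `cat2keys_sketch` is a hypothetical name, and the layout is inferred from the printed output rather than from Riptable's source.

```python
def cat2keys_sketch(key1, key2):
    """Return (codes, key_0, key_01) for two equal-length single-key inputs."""
    if len(key1) != len(key2):
        raise ValueError("key1 and key2 must have the same length")
    uniq1 = sorted(set(key1))
    uniq2 = sorted(set(key2))
    pos1 = {value: i for i, value in enumerate(uniq1)}
    pos2 = {value: i for i, value in enumerate(uniq2)}
    n1 = len(uniq1)
    # Enumerate all n x m combinations; the first key varies fastest.
    key_0 = [u1 for _ in uniq2 for u1 in uniq1]
    key_01 = [u2 for u2 in uniq2 for _ in uniq1]
    # Base-1 code of each input row within that combination table.
    codes = [pos1[a] + pos2[b] * n1 + 1 for a, b in zip(key1, key2)]
    return codes, key_0, key_01


codes, key_0, key_01 = cat2keys_sketch(list("abc"), list("xyz"))
print(codes)                     # [1, 5, 9]
print(list(zip(key_0, key_01)))  # all 9 combinations
```

Zipping key_0 with key_01 yields every combination, and code 5 points at the fifth pair, ('b', 'y').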
- riptable.rt_numpy.ceil(*args, **kwargs)
- riptable.rt_numpy.combine2keys(key1, key2, unique_count1, unique_count2, filter=None)
- Parameters:
key1 (ndarray of ints) – First index array (int8, int16, int32 or int64).
key2 (ndarray of ints) – Second index array (int8, int16, int32 or int64).
unique_count1 (int) – Number of unique values in key1 (often returned by groupbyhash / groupbylex).
unique_count2 (int) – Number of unique values in key2.
filter (ndarray of bools, optional) – Boolean array with the same length as the key1 array; defaults to None.
- Returns:
Two arrays: iKey (for 2 dims) and nCountGroup.
The bin array is a 1-based index array in which each False filter value sets the index to 0.
nCountGroup is an int32 array of size (unique_count1 + 1) * (unique_count2 + 1).
- riptable.rt_numpy.combine_accum1_filter(key1, unique_count1, filter=None)
- Parameters:
key1 (ndarray of ints) – Index array (int8, int16, int32 or int64). Must be base 1; if it is base 0, increment it by 1. Often referred to as iKey or the bin array for categoricals.
unique_count1 (int) – Maximum number of uniques in the key1 array.
filter (ndarray of bool, optional) – Boolean array the same length as the key1 array; defaults to None.
- Returns:
iKey – A new 1-based index array in which each False filter value sets the index to 0. Its dtype matches the dtype of key1.
iFirstKey – An int32 array; the fixup for first, since some bins may have been removed.
unique_count – An int32 value that is the new unique_count1. It is the length of iFirstKey.
Example
>>> a = rt.arange(20) % 10
>>> b = a.astype('S')
>>> c = rt.Cat(b)
>>> rt.combine_accum1_filter(c, c.unique_count, rt.logical(rt.arange(20) % 2))
{'iKey': FastArray([0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5], dtype=int8),
 'iFirstKey': FastArray([1, 3, 5, 7, 9]),
 'unique_count': 5}
- riptable.rt_numpy.combine_accum2_filter(key1, key2, unique_count1, unique_count2, filter=None)
- Parameters:
key1 (ndarray of ints) – First index array (int8, int16, int32 or int64).
key2 (ndarray of ints) – Second index array (int8, int16, int32 or int64).
unique_count1 (int) – Maximum number of unique values in key1.
unique_count2 (int) – Maximum number of unique values in key2.
filter (ndarray of bools, optional) – Boolean array with the same length as the key1 array; defaults to None.
- Returns:
Two arrays: iKey (for 2 dims) and nCountGroup.
The bin array is a 1-based index array in which each False filter value sets the index to 0.
nCountGroup is an int32 array of size (unique_count1 + 1) * (unique_count2 + 1).
- riptable.rt_numpy.combine_filter(key, filter)
- Parameters:
key (ndarray of ints) – Index array (int8, int16, int32 or int64).
filter (ndarray of bools) – Boolean array the same length as key.
- Returns:
A 1-based index array in which each False filter value sets the index to 0. The result is equivalent to key * filter or np.where(filter, key, 0).
- Return type:
ndarray of ints
Notes
This routine can run in parallel.
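The masking rule described under Returns is simple enough to show in plain Python. `combine_filter_sketch` is a hypothetical, serial stand-in that works on lists; it illustrates the rule only and says nothing about how the parallel routine is implemented.

```python
def combine_filter_sketch(key, mask):
    """Keep each 1-based bin where mask is True; send the rest to bin 0."""
    if len(key) != len(mask):
        raise ValueError("key and mask must have the same length")
    return [k if keep else 0 for k, keep in zip(key, mask)]


print(combine_filter_sketch([1, 2, 3, 1], [True, False, True, False]))
```

Bin 0 is the conventional place for filtered-out rows in a base-1 index, which is why the key must not use 0 for a real group.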
- riptable.rt_numpy.concatenate(*args, **kwargs)
- riptable.rt_numpy.crc32c(arr)
Calculate the 32-bit CRC of the data in an array using the Castagnoli polynomial (CRC32C).
This function does not consider the array’s shape or strides when calculating the CRC; it simply calculates the CRC value over the entire buffer described by the array.
- Parameters:
arr –
- Returns:
The 32-bit CRC value calculated from the array data.
- Return type:
Notes
- TODO: Warn when the array has non-default striding, as that is not currently respected by the implementation of this function.
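For reference, a bitwise CRC32C over a raw byte buffer looks like the sketch below. It uses the common CRC-32C parameters (reflected Castagnoli polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF); whether Riptable's crc32c uses the same initial and final values is an assumption, and `crc32c_sketch` is a hypothetical name. A production implementation would normally use a table or hardware instructions instead of this bit-by-bit loop.

```python
def crc32c_sketch(data: bytes) -> int:
    """Bit-by-bit CRC-32C (Castagnoli) over a flat byte buffer."""
    poly = 0x82F63B78  # reflected form of the Castagnoli polynomial
    crc = 0xFFFFFFFF   # assumed initial value (standard CRC-32C)
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF  # assumed final XOR (standard CRC-32C)


# Standard check value for CRC-32C.
print(hex(crc32c_sketch(b"123456789")))  # 0xe3069283
```

Because the input is just a flat sequence of bytes, two arrays that share the same underlying buffer produce the same value regardless of shape, which matches the behavior described above.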
- riptable.rt_numpy.crc64(arr)
- riptable.rt_numpy.cumsum(*args, **kwargs)
- riptable.rt_numpy.diff(*args, **kwargs)
- riptable.rt_numpy.double(a)
- riptable.rt_numpy.empty(shape, dtype=float, order='C')
Return a new array of specified shape and type, without initializing entries.
- Parameters:
shape (int or tuple of int) – Shape of the empty array, e.g., (2, 3) or 2. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
dtype (str or NumPy dtype or Riptable dtype, default numpy.float64) – The desired data type for the array.
order ({'C', 'F'}, default 'C') – Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
- Returns:
A new FastArray of uninitialized (arbitrary) data of the specified shape and type.
- Return type:
FastArray
See also
riptable.empty_like, riptable.ones, riptable.ones_like, riptable.zeros, riptable.zeros_like, riptable.empty, riptable.full, Categorical.full
Notes
Unlike zeros, empty doesn’t set the array values to zero, so it may be marginally faster. On the other hand, it requires the user to manually set all the values in the array, so it should be used with caution.
Examples
>>> rt.empty(5)  # uninitialized
FastArray([0. , 0.25, 0.5 , 0.75, 1. ])
>>> rt.empty(5, dtype = int)  # uninitialized
FastArray([80288976, 0, 0, 0, 1])
- riptable.rt_numpy.empty_like(array, dtype=None, order='K', subok=True, shape=None)
Return a new array with the same shape and type as the specified array, without initializing entries.
- Parameters:
array (array) – The shape and data type of array define the same attributes of the returned array. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
dtype (str or NumPy dtype or Riptable dtype, optional) – Overrides the data type of the result.
order ({'K', 'C', 'F', or 'A'}, default 'K') – Overrides the memory layout of the result. 'K' (the default) means match the layout of array as closely as possible. 'C' means row-major (C-style); 'F' means column-major (Fortran-style); 'A' means 'F' if array is Fortran-contiguous, 'C' otherwise.
subok (bool, default True) – If True (the default), then the newly created array will use the sub-class type of array; otherwise it will be a base-class array.
shape (int or sequence of ints, optional) – Overrides the shape of the result. If order='K' and the number of dimensions is unchanged, it will try to keep the same order; otherwise, order='C' is implied. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
- Returns:
A new FastArray of uninitialized (arbitrary) data with the same shape and type as array.
- Return type:
FastArray
See also
riptable.empty, riptable.ones, riptable.ones_like, riptable.zeros, riptable.zeros_like, riptable.full, Categorical.full
Examples
>>> a = rt.FastArray([1, 2, 3, 4])
>>> rt.empty_like(a)  # uninitialized
FastArray([ 1814376192, 1668069856, -1994737310, 746250422])
>>> rt.empty_like(a, dtype = float)  # uninitialized
FastArray([0.25, 0.5 , 0.75, 1. ])
- riptable.rt_numpy.floor(*args, **kwargs)
- riptable.rt_numpy.full(shape, fill_value, dtype=None, order='C')
Return a new array of a specified shape and type, filled with a specified value.
- Parameters:
shape (int or sequence of int) – Shape of the new array, e.g., (2, 3) or 2. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
fill_value (scalar or array) – Fill value. For 1-dimensional arrays, only scalar values are accepted.
dtype (str or NumPy dtype or Riptable dtype, optional) – The desired data type for the array. The default is the data type that would result from creating a FastArray with the specified fill_value: rt.FastArray(fill_value).dtype.
order ({'C', 'F'}, default 'C') – Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
- Returns:
A new FastArray of the specified shape and type, filled with the specified value.
- Return type:
FastArray
See also
Categorical.full, riptable.ones, riptable.ones_like, riptable.zeros, riptable.zeros_like, riptable.empty, riptable.empty_like
Examples
>>> rt.full(5, 2)
FastArray([2, 2, 2, 2, 2])
>>> rt.full(5, 2.0)
FastArray([2., 2., 2., 2., 2.])
Specify a data type:
>>> rt.full(5, 2, dtype = float)
FastArray([2., 2., 2., 2., 2.])
- riptable.rt_numpy.full_like(a, fill_value, dtype=None, order='K', subok=True, shape=None)
Return a full array with the same shape and type as a given array.
- Parameters:
a (array) – The shape and data type of a define the same attributes of the returned array. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
fill_value (scalar or array_like) – Fill value.
dtype (str or NumPy dtype or Riptable dtype, optional) – Overrides the data type of the result.
order ({'C', 'F', 'A', or 'K'}, default 'K') – Overrides the memory layout of the result. 'C' means row-major (C-style), 'F' means column-major (Fortran-style), 'A' means 'F' if a is Fortran-contiguous, 'C' otherwise. 'K' means match the layout of a as closely as possible.
subok (bool, default True) – If True (the default), then the newly created array will use the sub-class type of a; otherwise it will be a base-class array.
shape (int or sequence of int, optional) – Overrides the shape of the result. If order='K' and the number of dimensions is unchanged, it will try to keep the same order; otherwise, order='C' is implied. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
- Returns:
A FastArray with the same shape and data type as the specified array, filled with fill_value.
- Return type:
FastArray
See also
riptable.ones, riptable.zeros, riptable.zeros_like, riptable.empty, riptable.empty_like, riptable.full
Examples
>>> a = rt.FastArray([1, 2, 3, 4])
>>> rt.full_like(a, 9)
FastArray([9, 9, 9, 9])
>>> rt.ones_like(a, dtype = float)
FastArray([1., 1., 1., 1.])
- riptable.rt_numpy.get_common_dtype(x, y)
Return the common dtype of two arrays, two scalars, or a scalar and an array.
Plain Python ints are typed as int32 or int64 (never int8 or int16). Used in where, put, take, and putmask.
- Parameters:
x (scalar or array_like) – A scalar and/or array to find the common dtype of.
y (scalar or array_like) – A scalar and/or array to find the common dtype of.
- Returns:
The data type (dtype) common to both x and y. If the objects don’t have exactly the same dtype, returns the dtype which both types could be implicitly coerced to.
- Return type:
data-type
Examples
>>> rt.get_common_dtype('test','hello')
dtype('<U5')
>>> rt.get_common_dtype(14,'hello')
dtype('<U16')
>>> rt.get_common_dtype(14,b'hello')
dtype('<S16')
>>> rt.get_common_dtype(14, 17)
dtype('int32')
>>> rt.get_common_dtype(rt.arange(10), rt.arange(10.0))
dtype('float64')
>>> rt.get_common_dtype(rt.arange(10).astype(bool), True)
dtype('bool')
- riptable.rt_numpy.get_dtype(val)
Return the dtype of an array, list, or builtin int, float, bool, str, or bytes.
- Parameters:
val – An object to get the dtype of.
- Returns:
The data-type (dtype) for val (if it has a dtype), or a dtype compatible with val.
- Return type:
data-type
Notes
For a Python integer, int32 or int64 is used (never an unsigned type). For a Python float, float64 is always returned. For a string, a U or S dtype with the string’s size is returned.
TODO: consider pushing down into C++
Examples
>>> rt.get_dtype(10)
dtype('int32')
>>> rt.get_dtype(123.45)
dtype('float64')
>>> rt.get_dtype('hello')
dtype('<U5')
>>> rt.get_dtype(b'hello')
dtype('S5')
- riptable.rt_numpy.groupby(list_arrays, filter=None, cutoffs=None, base_index=1, lex=False, rec=False, pack=False, hint_size=0)
Main routine used to groupby one or more keys.
- Parameters:
list_arrays (list of ndarray) – A list of NumPy arrays to hash on (multikey). All arrays must be the same size.
filter (ndarray of bool, optional) – A boolean array the same length as the arrays in list_arrays, used to pre-filter the input data before passing it to the grouping algorithm. Defaults to None.
cutoffs (ndarray, optional) – An int64 array of cutoffs.
base_index (int) –
lex (bool, default False) – If True, groupbylex is called; if False, groupbyhash is called.
rec (bool) – When True, a record array is created and then the data is sorted. A record array is faster, but may not produce a true lexicographical sort. Defaults to False. Only applicable when lex is True.
pack (bool) – Set to True to also return iGroup, iFirstGroup, and nCountGroup; defaults to False. This is only meaningful when using hash-based grouping; when lex is True, the sorting-based grouping always computes and returns this information.
hint_size (int) – An integer hint if the number of unique keys is known in advance; defaults to zero. Only applicable when using hash-based grouping (i.e., lex is False).
Notes
Ends up calling groupbyhash or groupbylex.
See also
- riptable.rt_numpy.groupbyhash(list_arrays, hint_size=0, filter=None, hash_mode=2, cutoffs=None, pack=False)
Find unique values in an array using a linear hashing algorithm.
Find unique values in an array using a linear hashing algorithm; it will then bin each group according to first appearance. The zero bin is reserved for anything filtered out.
- Parameters:
list_arrays (ndarray or list of ndarray) – A single numpy array or a list of numpy arrays to hash on (multikey). All arrays must be the same size.
hint_size (int, optional) – An integer hint if the number of unique keys is known in advance, defaults to zero.
filter (ndarray of bool, optional) – A boolean filter to pre-filter the values on, defaults to None.
hash_mode (int) – Setting for controlling the hashing mode; defaults to 2. Users generally should not override the default value of this parameter.
cutoffs (ndarray, optional) – An int64 array of cutoffs, defaults to None.
pack (bool) – Set to True to return iGroup, iFirstGroup, nCountGroup also; defaults to False.
- Returns:
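A dictionary with the following entries:
'iKey' – array the same size as the multikey input; for each row, the bin number of the unique key to which that row belongs.
'iFirstKey' – array with one entry per unique key; the index of the first row containing that unique key.
'unique_count' – the number of unique keys (not including the zero bin).
The dictionary also contains 'iGroup', 'iFirstGroup', and 'nCountGroup', which are None unless pack is True (see the examples below).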
Examples
>>> np.random.seed(12345)
>>> c = np.random.randint(0, 8000, 2_000_000)
>>> rt.groupbyhash(c)
{'iKey': FastArray([ 1, 2, 3, ..., 6061, 7889, 3002]),
 'iFirstKey': FastArray([ 0, 1, 2, ..., 67072, 67697, 68250]),
 'unique_count': 8000,
 'iGroup': None,
 'iFirstGroup': None,
 'nCountGroup': None}
The 'pack' parameter can be overridden to True to calculate additional information about the relationship between elements in the input array and their group. Note this information is the same type of information groupbylex returns by default. (The output shown here has only 8 unique keys and indices below 10,000, so it appears to come from a smaller input such as the c = np.random.randint(0, 8, 10_000) array used in the groupbypack example, not the 2,000,000-element array defined above.)
>>> rt.groupbyhash(c, pack=True)
{'iKey': FastArray([1, 2, 2, ..., 4, 6, 1]),
 'iFirstKey': FastArray([ 0, 1, 3, 4, 6, 14, 18, 20]),
 'unique_count': 8,
 'iGroup': FastArray([ 0, 9, 21, ..., 9988, 9991, 9992]),
 'iFirstGroup': FastArray([ 0, 0, 1213, 2465, 3761, 4987, 6239, 7522, 8797]),
 'nCountGroup': FastArray([ 0, 1213, 1252, 1296, 1226, 1252, 1283, 1275, 1203])}
The output from groupbyhash is useful as an input to rc.BinCount:
>>> x = rt.groupbyhash(c)
>>> rc.BinCount(x['iKey'], x['unique_count'] + 1)
FastArray([ 0, 251, 262, ..., 239, 217, 246])
A filter (boolean array) can be passed to groupbyhash; this causes groupbyhash to only operate on the elements of the input array where the filter has a corresponding True value.
>>> f = (c % 3).astype(bool)
>>> rt.groupbyhash(c, filter=f)
{'iKey': FastArray([ 0, 1, 2, ..., 0, 5250, 1973]),
 'iFirstKey': FastArray([ 1, 2, 3, ..., 54422, 58655, 68250]),
 'unique_count': 5333,
 'iGroup': None,
 'iFirstGroup': None,
 'nCountGroup': None}
The groupbyhash function can also operate on multikeys (tuple keys).
>>> d = np.random.randint(0, 8000, 2_000_000)
>>> rt.groupbyhash([c, d])
{'iKey': FastArray([ 1, 2, 3, ..., 1968854, 1968855, 1968856]),
 'iFirstKey': FastArray([ 0, 1, 2, ..., 1999997, 1999998, 1999999]),
 'unique_count': 1968856,
 'iGroup': None,
 'iFirstGroup': None,
 'nCountGroup': None}
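The binning described for groupbyhash can be sketched in plain Python. This is only an illustration of the documented output (bins numbered from 1 in order of first appearance, bin 0 reserved for filtered-out rows), not riptable's hashing implementation; the helper name `first_appearance_bins` and its `keep` argument are hypothetical.

```python
def first_appearance_bins(values, keep=None):
    """Assign 1-based bin numbers in order of first appearance.

    `keep` is an optional list of booleans; rows where it is False are
    placed in the reserved zero bin. (Hypothetical helper, not riptable API.)
    """
    bin_of = {}      # key -> bin number (1-based)
    ikey = []        # bin number for every input row
    ifirstkey = []   # row index at which each bin first appeared
    for row, value in enumerate(values):
        if keep is not None and not keep[row]:
            ikey.append(0)          # filtered out: zero bin
            continue
        if value not in bin_of:
            bin_of[value] = len(bin_of) + 1
            ifirstkey.append(row)
        ikey.append(bin_of[value])
    return {"iKey": ikey, "iFirstKey": ifirstkey, "unique_count": len(bin_of)}


# Single key with a filter: row 3 is excluded and lands in bin 0.
result = first_appearance_bins([7, 3, 7, 5, 3], keep=[True, True, True, False, True])
# result == {'iKey': [1, 2, 1, 0, 2], 'iFirstKey': [0, 1], 'unique_count': 2}

# Multikey: zip the key columns so each row becomes a tuple key.
multi = first_appearance_bins(list(zip([1, 1, 2, 1], ["a", "b", "a", "a"])))
# multi['iKey'] == [1, 2, 3, 1]
```

A dictionary keyed on the row value (or a tuple of values for a multikey) plays the role of the hash table here.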
- riptable.rt_numpy.groupbylex(list_arrays, filter=None, cutoffs=None, base_index=1, rec=False)
- Parameters:
list_arrays (ndarray or list of ndarray) – A list of numpy arrays to hash on (multikey). All arrays must be the same size.
filter (ndarray of bool, optional) – A boolean array of true/false filters, defaults to None.
cutoffs (ndarray, optional) – INT64 array of cutoffs
base_index (int) –
rec (bool) – When set to true, a record array is created, and then the data is sorted. A record array is faster, but may not produce a true lexicographical sort. Defaults to False.
- Returns:
A dict with six entries:
iKey – array the same size as the multikey input; for each row, the unique key to which that row belongs.
iFirstKey – array with one entry per unique key; the index of the first row for that unique key.
unique_count – the number of unique keys.
iGroup – the result from lexsort (a fancy-index sort of list_arrays).
iFirstGroup – array with one entry per unique key, plus one; the offset into iGroup.
nCountGroup – array with one entry per unique key, plus one; the length of the slice in iGroup.
Examples
>>> a = rt.arange(100).astype('S')
>>> f = rt.logical(rt.arange(100) % 3)
>>> rt.groupbylex([a], filter=f)
{'iKey': FastArray([ 0, 1, 9, 0, 23, 31, 0, 45, 53, 0, 2, 3, 0, 4, 5, 0, 6, 7, 0, 8, 10, 0, 11, 12, 0, 13, 14, 0, 15, 16, 0, 17, 18, 0, 19, 20, 0, 21, 22, 0, 24, 25, 0, 26, 27, 0, 28, 29, 0, 30, 32, 0, 33, 34, 0, 35, 36, 0, 37, 38, 0, 39, 40, 0, 41, 42, 0, 43, 44, 0, 46, 47, 0, 48, 49, 0, 50, 51, 0, 52, 54, 0, 55, 56, 0, 57, 58, 0, 59, 60, 0, 61, 62, 0, 63, 64, 0, 65, 66, 0]),
 'iFirstKey': FastArray([ 1, 10, 11, 13, 14, 16, 17, 19, 2, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 4, 40, 41, 43, 44, 46, 47, 49, 5, 50, 52, 53, 55, 56, 58, 59, 61, 62, 64, 65, 67, 68, 7, 70, 71, 73, 74, 76, 77, 79, 8, 80, 82, 83, 85, 86, 88, 89, 91, 92, 94, 95, 97, 98]),
 'unique_count': 66,
 'iGroup': FastArray([ 1, 10, 11, 13, 14, 16, 17, 19, 2, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 4, 40, 41, 43, 44, 46, 47, 49, 5, 50, 52, 53, 55, 56, 58, 59, 61, 62, 64, 65, 67, 68, 7, 70, 71, 73, 74, 76, 77, 79, 8, 80, 82, 83, 85, 86, 88, 89, 91, 92, 94, 95, 97, 98]),
 'iFirstGroup': FastArray([66, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65]),
 'nCountGroup': FastArray([34, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])}
- riptable.rt_numpy.groupbypack(ikey, ncountgroup, unique_count=None, cutoffs=None)
A routine often called after groupbyhash or groupbylex. Operates on binned integer arrays only (int8, int16, int32, or int64).
- Parameters:
ikey (ndarray of ints) – iKey from groupbyhash or groupbylex
ncountgroup (ndarray of ints, optional) – From rc.BinCount or hash. If passed in, it is returned unchanged as part of this function’s output.
unique_count (int, optional) – Required if ncountgroup is None; otherwise not needed. Must include the 0 bin, so 1 is often added to the number of unique keys.
cutoffs (array_like, optional) – Cutoff array for parallel processing.
- Returns:
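A dict of three arrays:
'iGroup' – array the same size as ikey, in which rows with the same unique key are grouped together.
'iFirstGroup' – array sized by the number of unique keys; indexes into iGroup. In the example below it also has an entry for the zero bin.
'nCountGroup' – array sized by the number of unique keys; how many rows are in each group. In the example below it also has an entry for the zero bin.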
Examples
>>> np.random.seed(12345)
>>> c = np.random.randint(0, 8, 10_000)
>>> x = rt.groupbyhash(c)
>>> ncountgroup = rc.BinCount(x['iKey'], x['unique_count'] + 1)
>>> rt.groupbypack(x['iKey'], ncountgroup)
{'iGroup': FastArray([ 0, 9, 21, ..., 9988, 9991, 9992]),
 'iFirstGroup': FastArray([ 0, 0, 1213, 2465, 3761, 4987, 6239, 7522, 8797]),
 'nCountGroup': FastArray([ 0, 1213, 1252, 1296, 1226, 1252, 1283, 1275, 1203])}
The sum of the entries in the nCountGroup array returned by groupbypack matches the length of the original array.
>>> rt.groupbypack(x['iKey'], ncountgroup)['nCountGroup'].sum()
10000
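The relationship between iKey and the three packed arrays can be sketched in plain Python. This is a rough illustration under stated assumptions, not riptable's algorithm: the helper name `pack_groups` is hypothetical, the counts include the zero bin (as in the example above), and no rows are assigned to the zero bin, since how riptable places such rows is not covered here.

```python
def pack_groups(ikey, nbins):
    """Derive iGroup, iFirstGroup and nCountGroup from an iKey list.

    `nbins` counts every bin including the zero bin (unique_count + 1).
    (Hypothetical helper, not riptable API.)
    """
    # nCountGroup: how many rows fall in each bin.
    ncountgroup = [0] * nbins
    for k in ikey:
        ncountgroup[k] += 1

    # iFirstGroup: running sum of the counts gives each bin's offset.
    ifirstgroup = [0] * nbins
    for b in range(1, nbins):
        ifirstgroup[b] = ifirstgroup[b - 1] + ncountgroup[b - 1]

    # iGroup: scatter row indices into their bin's slice, keeping row order.
    cursor = list(ifirstgroup)
    igroup = [0] * len(ikey)
    for row, k in enumerate(ikey):
        igroup[cursor[k]] = row
        cursor[k] += 1

    return {"iGroup": igroup, "iFirstGroup": ifirstgroup, "nCountGroup": ncountgroup}


packed = pack_groups([1, 2, 1, 3, 2, 1], nbins=4)
# packed['nCountGroup'] == [0, 3, 2, 1]
# packed['iFirstGroup'] == [0, 0, 3, 5]
# packed['iGroup']      == [0, 2, 5, 1, 4, 3]

# Rows belonging to bin 2 are a contiguous slice of iGroup.
start = packed["iFirstGroup"][2]
rows_in_bin_2 = packed["iGroup"][start:start + packed["nCountGroup"][2]]
# rows_in_bin_2 == [1, 4]
```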
- riptable.rt_numpy.hstack(tup, dtype=None, **kwargs)
See numpy hstack. The riptable version can also take a dtype (it will convert all arrays to that dtype while stacking). The riptable version preserves sentinels and is multithreaded. For special classes like categorical and dataset, it checks whether the class has its own hstack and, if so, calls that.
- riptable.rt_numpy.interp(x, xp, fp)
One-dimensional or two-dimensional linear interpolation with clipping.
Returns the one-dimensional piecewise linear interpolant to a function with given discrete data points (xp, fp), evaluated at x.
- Parameters:
x (array of float32 or float64) – The x-coordinates at which to evaluate the interpolated values.
xp (1-D or 2-D sequence of float32 or float64) – The x-coordinates of the data points, must be increasing if argument period is not specified. Otherwise, xp is internally sorted after normalizing the periodic boundaries with xp = xp % period.
fp (1-D or 2-D sequence of float32 or float64) – The y-coordinates of the data points, same length as xp.
- Returns:
y – The interpolated values, same shape as x.
- Return type:
See also
np.interp, rt.interp_extrap
Notes
The riptable version does not handle the kwargs left/right, whereas np does. The riptable version handles floats or doubles, whereas np is always a double. Riptable will warn if the first parameter is a float32 but xp or fp is a double.
- riptable.rt_numpy.interp_extrap(x, xp, fp)
One-dimensional or two-dimensional linear interpolation without clipping.
Returns the one-dimensional piecewise linear interpolant to a function with given discrete data points (xp, fp), evaluated at x.
See also
np.interp, rt.interp
Notes
The riptable version handles floats or doubles, whereas np is always a double.
2D mode is auto-detected based on xp/fp.
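The difference between interpolation with and without clipping can be sketched for the one-dimensional case in plain Python. This is an illustration only: the helper `interp_1d` and its `extrapolate` flag are hypothetical, and "without clipping" is assumed here to mean that the first or last segment is extended linearly beyond the data range.

```python
from bisect import bisect_right


def interp_1d(x, xp, fp, extrapolate=False):
    """Piecewise linear interpolation of a single value.

    xp must be increasing. With extrapolate=False, values outside
    [xp[0], xp[-1]] are clipped to fp[0] or fp[-1]; with True, the first
    or last segment is extended. (Hypothetical helper, not riptable API.)
    """
    if not extrapolate:
        if x <= xp[0]:
            return fp[0]
        if x >= xp[-1]:
            return fp[-1]
    # Choose the segment [xp[i], xp[i + 1]], clamped to a valid index.
    i = min(max(bisect_right(xp, x) - 1, 0), len(xp) - 2)
    slope = (fp[i + 1] - fp[i]) / (xp[i + 1] - xp[i])
    return fp[i] + slope * (x - xp[i])


xp = [0.0, 1.0, 2.0]
fp = [0.0, 10.0, 30.0]
points = (-1.0, 0.5, 1.5, 3.0)

clipped = [interp_1d(x, xp, fp) for x in points]
# clipped == [0.0, 5.0, 20.0, 30.0]

extended = [interp_1d(x, xp, fp, extrapolate=True) for x in points]
# extended == [-10.0, 5.0, 20.0, 50.0]
```

Inside the data range the two modes agree; they differ only for points left of xp[0] or right of xp[-1].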
- riptable.rt_numpy.isfinite(*args, **kwargs)
Return True for each finite element, False otherwise.
A value is considered to be finite if it’s not positive or negative infinity or a NaN (Not a Number).
- Parameters:
*args – See numpy.isfinite.
**kwargs – See numpy.isfinite.
- Returns:
For array input, a FastArray of booleans is returned that’s True for each element that’s finite, False otherwise. For scalar input, a boolean is returned.
- Return type:
FastArray or bool
See also
riptable.isnotfinite, riptable.isinf, riptable.isnotinf, FastArray.isfinite, FastArray.isnotfinite, FastArray.isinf, FastArray.isnotinf
Dataset.mask_or_isfinite
Return a boolean array that’s True for each Dataset row that has at least one finite value.
Dataset.mask_and_isfinite
Return a boolean array that’s True for each Dataset row that contains all finite values.
Dataset.mask_or_isinf
Return a boolean array that’s True for each Dataset row that has at least one value that’s positive or negative infinity.
Dataset.mask_and_isinf
Return a boolean array that’s True for each Dataset row that contains all infinite values.
Examples
>>> a = rt.FastArray([rt.inf, -rt.inf, rt.nan, 0])
>>> rt.isfinite(a)
FastArray([False, False, False, True])
>>> rt.isfinite(1)
True
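The finite / infinite / NaN classification can be illustrated with Python's standard math module, which applies the same IEEE 754 categories to scalars. This sketch does not call riptable; it only shows that every float falls into exactly one of the three categories.

```python
import math

values = [math.inf, -math.inf, math.nan, 0.0]

finite = [math.isfinite(v) for v in values]
infinite = [math.isinf(v) for v in values]
not_a_number = [math.isnan(v) for v in values]

# finite       == [False, False, False, True]
# infinite     == [True, True, False, False]
# not_a_number == [False, False, True, False]

# "Not finite" is the union of "infinite" and "NaN".
not_finite = [i or n for i, n in zip(infinite, not_a_number)]
```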
- riptable.rt_numpy.isinf(*args, **kwargs)
Return True for each element that’s positive or negative infinity, False otherwise.
- Parameters:
*args – See numpy.isinf.
**kwargs – See numpy.isinf.
- Returns:
For array input, a FastArray of booleans is returned that’s True for each element that’s positive or negative infinity, False otherwise. For scalar input, a boolean is returned.
- Return type:
FastArray or bool
See also
riptable.isnotinf, riptable.isfinite, riptable.isnotfinite, FastArray.isinf, FastArray.isnotinf, FastArray.isfinite, FastArray.isnotfinite
Dataset.mask_or_isfinite
Return a boolean array that’s True for each Dataset row that has at least one finite value.
Dataset.mask_and_isfinite
Return a boolean array that’s True for each Dataset row that contains all finite values.
Dataset.mask_or_isinf
Return a boolean array that’s True for each Dataset row that has at least one value that’s positive or negative infinity.
Dataset.mask_and_isinf
Return a boolean array that’s True for each Dataset row that contains all infinite values.
Examples
>>> a = rt.FastArray([rt.inf, -rt.inf, rt.nan, 0])
>>> rt.isinf(a)
FastArray([ True, True, False, False])
>>> rt.isinf(1)
False
- riptable.rt_numpy.ismember(a, b, h=2, hint_size=0, base_index=0)
The ismember function is meant to mimic the ismember function in MATLAB. It takes two sets of data and returns two arrays: a boolean array, and an array of indices giving the first occurrence in b of each element of a, otherwise NaN (shown as the integer invalid value, such as -128 for int8, in the examples below).
- Parameters:
a – A python list (strings), python tuple (strings), chararray, ndarray of unicode strings, or ndarray of int32, int64, float32, or float64.
b – A list with the same constraints as a. Note: if a contains string data, b must also contain string data. If it contains different numerical data, casting will occur in either a or b.
h – There are currently two different hashing functions that can be used to execute ismember. Depending on the size, type, and number of matches in the data, the hashes perform differently. Currently accepts 1 or 2: 1 = PRIME number (might be faster for floats and uses less memory); 2 = MASK using power of 2 (usually faster but uses more memory).
hint_size (int, default 0) – For large arrays with a low unique count, setting this value to 4*expected unique count may speed up hashing.
base_index (int, default 0) – When set to 1 the first return argument is no longer a boolean array but an integer that is 1 or 0. A return value of 1 indicates there exist values in b that do not exist in a.
- Returns:
c (int or np.ndarray of bool) – A boolean array the same size as a indicating whether or not the element at the corresponding index in a was found in b.
d (np.ndarray of int) – An array of indices the same size as a, each of which indicates where an element in a first occurred in b, or NaN otherwise.
- Raises:
TypeError – input must be ndarray, python list, or python tuple
ValueError – data must be int32, int64, float32, float64, chararray, or unicode strings. If a contains string data, b must also contain string data and vice versa.
Examples
>>> a = [1.0, 2.0, 3.0, 4.0]
>>> b = [1.0, 3.0, 4.0, 4.0]
>>> c,d = ismember(a,b)
>>> c
FastArray([ True, False, True, True])
>>> d
FastArray([ 0, -128, 1, 2], dtype=int8)
NaN values do not behave the same way as other elements. A NaN in the first array will not register as existing in the second array. This is the expected behavior (to match MATLAB NaN handling):
>>> a = FastArray([1.,2.,3.,np.nan])
>>> b = FastArray([2.,3.,np.nan])
>>> c,d = ismember(a,b)
>>> c
FastArray([False, True, True, False])
>>> d
FastArray([-128, 0, 1, -128], dtype=int8)
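The two return values of ismember, including the NaN rule, can be sketched with a dictionary lookup in plain Python. This is not riptable's hashing code: the helper name `ismember_sketch` is hypothetical, and it reports a missing match as None where the examples above show an integer invalid value such as -128.

```python
import math


def _is_nan(item):
    return isinstance(item, float) and math.isnan(item)


def ismember_sketch(a, b):
    """Return (found, where) for each element of a looked up in b.

    found[i] is True if a[i] occurs in b; where[i] is the index of its
    first occurrence in b, or None if absent. NaN never matches.
    (Hypothetical helper, not riptable API.)
    """
    first_index = {}
    for i, item in enumerate(b):
        if not _is_nan(item):
            first_index.setdefault(item, i)  # keep only the first occurrence

    found, where = [], []
    for item in a:
        idx = None if _is_nan(item) else first_index.get(item)
        found.append(idx is not None)
        where.append(idx)
    return found, where


c, d = ismember_sketch([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 4.0, 4.0])
# c == [True, False, True, True]; d == [0, None, 1, 2]

c_nan, d_nan = ismember_sketch([1.0, 2.0, 3.0, math.nan], [2.0, 3.0, math.nan])
# c_nan == [False, True, True, False]; d_nan == [None, 0, 1, None]
```

NaNs are skipped explicitly because a Python dictionary would otherwise match the same NaN object by identity.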
- riptable.rt_numpy.isnan(*args, **kwargs)
Return True for each element that’s a NaN (Not a Number), False otherwise.
- Parameters:
*args – See numpy.isnan.
**kwargs – See numpy.isnan.
- Returns:
For array input, a FastArray of booleans is returned that’s True for each element that’s a NaN, False otherwise. For scalar input, a boolean is returned.
- Return type:
FastArray or bool
See also
riptable.isnotnan, riptable.isnanorzero, FastArray.isnan, FastArray.isnotnan, FastArray.notna, FastArray.isnanorzero, Categorical.isnan, Categorical.isnotnan, Categorical.notna, Date.isnan, Date.isnotnan, DateTimeNano.isnan, DateTimeNano.isnotnan
Dataset.mask_or_isnan
Return a boolean array that’s True for each Dataset row that contains at least one NaN.
Dataset.mask_and_isnan
Return a boolean array that’s True for each all-NaN Dataset row.
Examples
>>> a = rt.FastArray([rt.nan, rt.inf, 2])
>>> rt.isnan(a)
FastArray([ True, False, False])
>>> rt.isnan(0)
False
- riptable.rt_numpy.isnanorzero(*args, **kwargs)
Return True for each element that’s a NaN (Not a Number) or zero, False otherwise.
- Parameters:
*args – See numpy.isnan.
**kwargs – See numpy.isnan.
- Returns:
For array input, a FastArray of booleans is returned that’s True for each element that’s a NaN or zero, False otherwise. For scalar input, a boolean is returned.
- Return type:
FastArray or bool
See also
FastArray.isnanorzero, riptable.isnan, riptable.isnotnan, FastArray.isnan, FastArray.isnotnan, Categorical.isnan, Categorical.isnotnan, Date.isnan, Date.isnotnan, DateTimeNano.isnan, DateTimeNano.isnotnan
Dataset.mask_or_isnan
Return a boolean array that’s True for each Dataset row that contains at least one NaN.
Dataset.mask_and_isnan
Return a boolean array that’s True for each all-NaN Dataset row.
Examples
>>> a = rt.FastArray([0, rt.nan, rt.inf, 3])
>>> rt.isnanorzero(a)
FastArray([ True, True, False, False])
>>> rt.isnanorzero(0)
True
- riptable.rt_numpy.isnotfinite(*args, **kwargs)
Return True for each non-finite element, False otherwise.
A value is considered to be finite if it’s not positive or negative infinity or a NaN (Not a Number).
- Parameters:
*args – See numpy.isfinite.
**kwargs – See numpy.isfinite.
- Returns:
For array input, a FastArray of booleans is returned that’s True for each non-finite element, False otherwise. For scalar input, a boolean is returned.
- Return type:
FastArray or bool
See also
riptable.isfinite, riptable.isinf, riptable.isnotinf, FastArray.isfinite, FastArray.isnotfinite, FastArray.isinf, FastArray.isnotinf
Dataset.mask_or_isfinite
Return a boolean array that’s True for each Dataset row that has at least one finite value.
Dataset.mask_and_isfinite
Return a boolean array that’s True for each Dataset row that contains all finite values.
Dataset.mask_or_isinf
Return a boolean array that’s True for each Dataset row that has at least one value that’s positive or negative infinity.
Dataset.mask_and_isinf
Return a boolean array that’s True for each Dataset row that contains all infinite values.
Examples
>>> a = rt.FastArray([rt.inf, -rt.inf, rt.nan, 0])
>>> rt.isnotfinite(a)
FastArray([ True, True, True, False])
>>> rt.isnotfinite(1)
False
- riptable.rt_numpy.isnotinf(*args, **kwargs)
Return True for each element that’s not positive or negative infinity, False otherwise.
- Parameters:
*args – See numpy.isinf.
**kwargs – See numpy.isinf.
- Returns:
For array input, a FastArray of booleans is returned that’s True for each element that’s not positive or negative infinity, False otherwise. For scalar input, a boolean is returned.
- Return type:
FastArray or bool
See also
riptable.isinf, FastArray.isnotinf, FastArray.isinf, riptable.isfinite, riptable.isnotfinite, FastArray.isfinite, FastArray.isnotfinite
Dataset.mask_or_isfinite
Return a boolean array that’s True for each Dataset row that has at least one finite value.
Dataset.mask_and_isfinite
Return a boolean array that’s True for each Dataset row that contains all finite values.
Dataset.mask_or_isinf
Return a boolean array that’s True for each Dataset row that has at least one value that’s positive or negative infinity.
Dataset.mask_and_isinf
Return a boolean array that’s True for each Dataset row that contains all infinite values.
Examples
>>> a = rt.FastArray([rt.inf, -rt.inf, rt.nan, 0])
>>> rt.isnotinf(a)
FastArray([False, False, True, True])
>>> rt.isnotinf(1)
True
- riptable.rt_numpy.isnotnan(*args, **kwargs)
Return True for each element that’s not a NaN (Not a Number), False otherwise.
- Parameters:
*args – See numpy.isnan.
**kwargs – See numpy.isnan.
- Returns:
For array input, a FastArray of booleans is returned that’s True for each element that’s not a NaN, False otherwise. For scalar input, a boolean is returned.
- Return type:
FastArray or bool
See also
riptable.isnan, riptable.isnanorzero, FastArray.isnan, FastArray.isnotnan, FastArray.notna, FastArray.isnanorzero, Categorical.isnan, Categorical.isnotnan, Categorical.notna, Date.isnan, Date.isnotnan, DateTimeNano.isnan, DateTimeNano.isnotnan
Dataset.mask_or_isnan
Return a boolean array that’s True for each Dataset row that contains at least one NaN.
Dataset.mask_and_isnan
Return a boolean array that’s True for each all-NaN Dataset row.
Examples
>>> a = rt.FastArray([rt.nan, rt.inf, 2])
>>> rt.isnotnan(a)
FastArray([False, True, True])
>>> rt.isnotnan(0)
True
- riptable.rt_numpy.issorted(*args)
Return True if the array is sorted, False otherwise.
NaNs at the end of an array are considered sorted.
- Parameters:
*args (ndarray) – The array to check. It must be one-dimensional and contiguous.
- Returns:
True if the array is sorted, False otherwise.
- Return type:
See also
FastArray.issorted
Examples
>>> a = rt.FastArray(['a', 'c', 'b'])
>>> rt.issorted(a)
False
>>> a = rt.FastArray([1.0, 2.0, 3.0, rt.nan])
>>> rt.issorted(a)
True
>>> cat = rt.Categorical(['a', 'a', 'a', 'b', 'b'])
>>> rt.issorted(cat)
True
>>> dt = rt.Date.range('20190201', '20190208')
>>> rt.issorted(dt)
True
>>> dtn = rt.DateTimeNano(['6/30/19', '1/30/19'], format='%m/%d/%y', from_tz='NYC')
>>> rt.issorted(dtn)
False
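The rule that trailing NaNs still count as sorted can be sketched in plain Python. The helper `is_sorted` is hypothetical and only mirrors the documented behaviour for a one-dimensional sequence; it assumes "sorted" means non-decreasing, which is consistent with the Categorical example above where repeated values are accepted.

```python
import math


def is_sorted(values):
    """Return True if values are non-decreasing, allowing trailing NaNs.

    (Hypothetical helper, not riptable API.)
    """
    end = len(values)
    # Ignore any run of NaNs at the end of the sequence.
    while end > 0 and isinstance(values[end - 1], float) and math.isnan(values[end - 1]):
        end -= 1
    head = values[:end]
    # A NaN in the middle fails here, because comparisons with NaN are False.
    return all(head[i] <= head[i + 1] for i in range(len(head) - 1))


unsorted_strings = is_sorted(["a", "c", "b"])              # False
trailing_nan = is_sorted([1.0, 2.0, 3.0, math.nan])        # True
repeated_values = is_sorted(["a", "a", "a", "b", "b"])     # True
nan_in_middle = is_sorted([1.0, math.nan, 2.0])            # False
```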
- riptable.rt_numpy.lexsort(*args, **kwargs)
- riptable.rt_numpy.log(*args, **kwargs)
- riptable.rt_numpy.log10(*args, **kwargs)
- riptable.rt_numpy.logical(a)
- riptable.rt_numpy.makeifirst(key, unique_count, filter=None)
- Parameters:
key (ndarray of ints) – Index array (int8, int16, int32 or int64).
unique_count (int) – Maximum number of unique values in key array.
filter (ndarray of bools, optional) – Boolean array same length as key array, defaults to None.
- Returns:
index – An index array of the same dtype and length of the key passed in. The index array will have the invalid value for the array’s dtype set at any locations it could not find a first occurrence.
- Return type:
ndarray of ints
Notes
makeifirst will NOT reduce the index/ikey unique size even when a filter is passed. Based on the integer dtype int8/16/32/64, all locations that have no first will be set to invalid. If an invalid is used as a riptable fancy index, it will pull in another invalid, for example the empty string ‘’.
- riptable.rt_numpy.makeilast(key, unique_count, filter=None)
- Parameters:
key (ndarray of ints) – Index array (int8, int16, int32 or int64).
unique_count (int) – Maximum number of unique values in key array.
filter (ndarray of bools, optional) – Boolean array same length as key array, defaults to None.
- Returns:
index – An index array of the same dtype and length of the key passed in. The index array will have the invalid value for the array’s dtype set at any locations it could not find a last occurrence.
- Return type:
ndarray of ints
Notes
makeilast will NOT reduce the index/ikey unique size even when a filter is passed. Based on the integer dtype int8/16/32/64, all locations that have no last will be set to invalid. If an invalid is used as a riptable fancy index, it will pull in another invalid, for example the empty string ‘’.
- riptable.rt_numpy.makeinext(key, unique_count)
- Parameters:
key (ndarray of integers) – index array (int8, int16, int32 or int64)
unique_count (int) – max uniques in ‘key’ array
- Returns:
An index array of the same dtype and length, giving the next row for each location. Any location for which no next row could be found is set to -MAX_INT.
- riptable.rt_numpy.makeiprev(key, unique_count)
- Parameters:
key (ndarray of integers) – index array (int8, int16, int32 or int64)
unique_count (int) – max uniques in ‘key’ array
- Returns:
An index array in which any location for which no previous row could be found is set to -MAX_INT.
- riptable.rt_numpy.mask_and(*args, **kwargs)
Pass in a tuple or list of boolean arrays to AND together.
- riptable.rt_numpy.mask_andi(*args, **kwargs)
In-place version: pass in a tuple or list of boolean arrays to AND together.
- riptable.rt_numpy.mask_andnot(*args, **kwargs)
Pass in a tuple or list of boolean arrays to ANDNOT together.
- riptable.rt_numpy.mask_andnoti(*args, **kwargs)
In-place version: pass in a tuple or list of boolean arrays to ANDNOT together.
- riptable.rt_numpy.mask_or(*args, **kwargs)
Pass in a tuple or list of boolean arrays to OR together.
- riptable.rt_numpy.mask_ori(*args, **kwargs)
In-place version: pass in a tuple or list of boolean arrays to OR together.
- riptable.rt_numpy.mask_xor(*args, **kwargs)
Pass in a tuple or list of boolean arrays to XOR together.
- riptable.rt_numpy.mask_xori(*args, **kwargs)
In-place version: pass in a tuple or list of boolean arrays to XOR together.
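Combining several boolean masks element by element can be sketched in plain Python with functools.reduce. The helper `combine_masks` is hypothetical and works on ordinary lists, not FastArrays; it covers AND, OR and XOR only, since the exact operand order of the ANDNOT variants and the in-place behaviour of the `i` variants are not described above.

```python
from functools import reduce
from operator import and_, or_, xor


def combine_masks(masks, op):
    """Fold a list of equal-length boolean lists element-wise with `op`.

    (Hypothetical helper, not riptable API.)
    """
    return [reduce(op, column) for column in zip(*masks)]


m1 = [True, True, False, False]
m2 = [True, False, True, False]
m3 = [True, True, True, False]

anded = combine_masks([m1, m2, m3], and_)
# anded == [True, False, False, False]

ored = combine_masks([m1, m2, m3], or_)
# ored == [True, True, True, False]

xored = combine_masks([m1, m2, m3], xor)
# xored == [True, False, False, False]
```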
- riptable.rt_numpy.max(*args, **kwargs)
- riptable.rt_numpy.maximum(x1, x2, *args, **kwargs)
- riptable.rt_numpy.mean(*args, filter=None, dtype=None, **kwargs)
Compute the arithmetic mean of the values in the first argument.
When possible, rt.mean(x, *args) calls x.mean(*args); look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below.
For example, FastArray.mean accepts the filter and dtype keyword arguments, but Dataset.mean does not.
- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the mean calculation. If the filter is uniformly False, rt.mean returns a ZeroDivisionError.
dtype (rt.dtype or numpy.dtype, default float64) – The data type of the result. For a FastArray x, x.mean(dtype = my_type) is equivalent to my_type(x.mean()).
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s mean.
- Return type:
scalar or Dataset
See also
nanmean
Computes the mean, ignoring NaNs.
Dataset.mean
Computes the mean of numerical Dataset columns.
FastArray.mean
Computes the mean of FastArray values.
GroupByOps.mean
Computes the mean of each group. Used by Categorical objects.
Notes
The dtype keyword for rt.mean specifies the data type of the result. This differs from numpy.mean, where it specifies the data type used to compute the mean.
Examples
>>> a = rt.FastArray([1, 3, 5, 7])
>>> rt.mean(a)
4.0
With a dtype specified:
>>> a = rt.FastArray([1, 3, 5, 7])
>>> rt.mean(a, dtype = rt.int32)
4
With a filter:
>>> a = rt.FastArray([1, 3, 5, 7])
>>> b = rt.FastArray([False, True, False, True])
>>> rt.mean(a, filter = b)
5.0
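The documented meaning of the filter and dtype keywords can be sketched in plain Python. The helper `filtered_mean` and its parameter names are hypothetical; the point is that the filter selects elements before averaging and that the result type is applied to the finished mean, as in my_type(x.mean()).

```python
def filtered_mean(values, keep=None, result_type=float):
    """Mean of the selected values, converted to result_type afterwards.

    An all-False `keep` leaves nothing to average, so the division below
    fails with ZeroDivisionError. (Hypothetical helper, not riptable API.)
    """
    selected = [v for i, v in enumerate(values) if keep is None or keep[i]]
    mean = sum(selected) / len(selected)
    return result_type(mean)  # the type is applied to the result, not the inputs


a = [1, 3, 5, 7]

plain = filtered_mean(a)                                    # 4.0
as_int = filtered_mean(a, result_type=int)                  # 4
masked = filtered_mean(a, keep=[False, True, False, True])  # 5.0
```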
- riptable.rt_numpy.median(*args, **kwargs)
- riptable.rt_numpy.min(*args, **kwargs)
- riptable.rt_numpy.minimum(x1, x2, *args, **kwargs)
- riptable.rt_numpy.multikeyhash(*args)
Returns 7 arrays to help navigate data.
- Returns:
key – the unique occurrence
nth – the nth unique occurrence
bktsize – how many unique occurrences occur
next, prev – index to the next and previous unique occurrence
first, last – index to the first and last unique occurrence
Examples
>>> myarr = rt.arange(10) % 3
>>> myarr
FastArray([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
>>> mkgrp = rt.Dataset(rt.multikeyhash([myarr]).asdict())
>>> mkgrp.a = myarr
>>> mkgrp
#   key   nth   bktsize   next   prev   first   last   a
-   ---   ---   -------   ----   ----   -----   ----   -
0     1     1         4      3     -1       0      9   0
1     2     1         3      4     -1       1      7   1
2     3     1         3      5     -1       2      8   2
3     1     2         4      6      0       0      9   0
4     2     2         3      7      1       1      7   1
5     3     2         3      8      2       2      8   2
6     1     3         4      9      3       0      9   0
7     2     3         3     -1      4       1      7   1
8     3     3         3     -1      5       2      8   2
9     1     4         4     -1      6       0      9   0
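The seven navigation arrays can be reproduced for the example above with a short plain-Python sketch. The helper `multikey_navigation` is hypothetical and is not how riptable computes these arrays; it only shows what each column means, using -1 where there is no next or previous row, as in the table.

```python
def multikey_navigation(values):
    """Compute key, nth, bktsize, next, prev, first and last for each row.

    For a multikey, pass a list of tuples. (Hypothetical helper, not riptable API.)
    """
    # Collect the row indices of every distinct value, in order of appearance.
    rows = {}
    for i, v in enumerate(values):
        rows.setdefault(v, []).append(i)

    names = ("key", "nth", "bktsize", "next", "prev", "first", "last")
    out = {name: [0] * len(values) for name in names}
    # Dictionaries preserve insertion order, so bins follow first appearance.
    for bin_number, idxs in enumerate(rows.values(), start=1):
        for pos, i in enumerate(idxs):
            out["key"][i] = bin_number
            out["nth"][i] = pos + 1
            out["bktsize"][i] = len(idxs)
            out["next"][i] = idxs[pos + 1] if pos + 1 < len(idxs) else -1
            out["prev"][i] = idxs[pos - 1] if pos > 0 else -1
            out["first"][i] = idxs[0]
            out["last"][i] = idxs[-1]
    return out


nav = multikey_navigation([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
# nav['key']  == [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
# nav['next'] == [3, 4, 5, 6, 7, 8, 9, -1, -1, -1]
# nav['prev'] == [-1, -1, -1, 0, 1, 2, 3, 4, 5, 6]
```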
- riptable.rt_numpy.nan_to_num(*args, **kwargs)
- Parameters:
arg1 (ndarray) – The input array.
- Returns:
ndarray with nan_to_num applied.
Notes
If you want to do this in place, contact TJD.
- riptable.rt_numpy.nan_to_zero(a)
Replace the NaN or invalid values in an array with zeroes.
This is an in-place operation – the input array is returned after being modified.
- Parameters:
a (ndarray) – The input array.
- Returns:
The input array a (after it’s been modified).
- Return type:
ndarray
- riptable.rt_numpy.nanargmax(*args, **kwargs)
- riptable.rt_numpy.nanargmin(*args, **kwargs)
- riptable.rt_numpy.nanmax(*args, **kwargs)
- riptable.rt_numpy.nanmean(*args, filter=None, dtype=None, **kwargs)
Compute the arithmetic mean of the values in the first argument, ignoring NaNs.
If all values in the first argument are NaNs, 0.0 is returned.
When possible, rt.nanmean(x, *args) calls x.nanmean(*args); look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below.
For example, FastArray.nanmean accepts the filter and dtype keyword arguments, but Dataset.nanmean does not.
- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the mean calculation. If the filter is uniformly False, rt.nanmean returns a ZeroDivisionError.
dtype (rt.dtype or numpy.dtype, default float64) – The data type of the result. For a FastArray x, x.nanmean(dtype = my_type) is equivalent to my_type(x.nanmean()).
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s mean.
- Return type:
scalar or Dataset
See also
mean
Computes the mean.
Dataset.nanmean
Computes the mean of numerical Dataset columns, ignoring NaNs.
FastArray.nanmean
Computes the mean of FastArray values, ignoring NaNs.
GroupByOps.nanmean
Computes the mean of each group, ignoring NaNs. Used by Categorical objects.
Notes
The dtype keyword for rt.nanmean specifies the data type of the result. This differs from numpy.nanmean, where it specifies the data type used to compute the mean.
Examples
>>> a = rt.FastArray([1, 3, 5, rt.nan])
>>> rt.nanmean(a)
3.0
With a dtype specified:
>>> a = rt.FastArray([1, 3, 5, rt.nan])
>>> rt.nanmean(a, dtype = rt.int32)
3
With a filter:
>>> a = rt.FastArray([1, 3, 5, rt.nan])
>>> b = rt.FastArray([False, True, True, True])
>>> rt.nanmean(a, filter = b)
4.0
- riptable.rt_numpy.nanmedian(*args, **kwargs)
- riptable.rt_numpy.nanmin(*args, **kwargs)
- riptable.rt_numpy.nanpercentile(*args, **kwargs)
- riptable.rt_numpy.nanstd(*args, filter=None, dtype=None, **kwargs)
Compute the standard deviation of the values in the first argument, ignoring NaNs.
If all values in the first argument are NaNs, NaN is returned.
Riptable uses the convention that ddof = 1, meaning the standard deviation of [x_1, ..., x_n] is defined by std = sqrt(1/(n - 1) * sum((x_i - mean)**2)) (note the n - 1 instead of n). This differs from NumPy, which uses ddof = 0 by default.
When possible, rt.nanstd(x, *args) calls x.nanstd(*args); look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below.
For example, FastArray.nanstd accepts the filter and dtype keyword arguments, but Dataset.nanstd does not.
- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the standard deviation calculation. If the filter is uniformly False, rt.nanstd returns a ZeroDivisionError.
dtype (rt.dtype or numpy.dtype, default float64) – The data type of the result. For a FastArray x, x.nanstd(dtype = my_type) is equivalent to my_type(x.nanstd()).
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s standard deviation.
- Return type:
scalar or Dataset
See also
std
Computes the standard deviation.
FastArray.nanstd
Computes the standard deviation of FastArray values, ignoring NaNs.
Dataset.nanstd
Computes the standard deviation of numerical Dataset columns, ignoring NaNs.
GroupByOps.nanstd
Computes the standard deviation of each group, ignoring NaNs. Used by Categorical objects.
Notes
The dtype keyword for rt.nanstd specifies the data type of the result. This differs from numpy.nanstd, where it specifies the data type used to compute the standard deviation.
Examples
>>> a = rt.FastArray([1, 2, 3, rt.nan])
>>> rt.nanstd(a)
1.0
With a dtype specified:
>>> a = rt.FastArray([1, 2, 3, rt.nan])
>>> rt.nanstd(a, dtype = rt.int32)
1
With a filter:
>>> a = rt.FastArray([1, 2, 3, rt.nan])
>>> b = rt.FastArray([False, True, True, True])
>>> rt.nanstd(a, filter = b)
0.7071067811865476
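The effect of the ddof = 1 convention can be checked with Python's standard statistics module, where stdev divides by n - 1 and pstdev divides by n. This sketch does not call riptable; it drops NaNs by hand and reuses the values from the examples above.

```python
import math
import statistics

data = [1.0, 2.0, 3.0, math.nan]
keep = [False, True, True, True]

# Drop NaNs first, as a NaN-ignoring function would.
clean = [v for v in data if not math.isnan(v)]

sample_std = statistics.stdev(clean)        # divides by n - 1 (ddof = 1)
population_std = statistics.pstdev(clean)   # divides by n     (ddof = 0)

# With the filter applied as well, only 2.0 and 3.0 remain.
filtered = [v for v, k in zip(data, keep) if k and not math.isnan(v)]
filtered_std = statistics.stdev(filtered)

print(sample_std, population_std, filtered_std)
```

For these values the sample standard deviation is 1.0 and the filtered result is the square root of 0.5, matching the documented outputs, while the ddof = 0 convention gives the square root of 2/3.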
- riptable.rt_numpy.nansum(*args, filter=None, dtype=None, **kwargs)
Compute the sum of the values in the first argument, ignoring NaNs.
If all values in the first argument are NaNs,
0.0
is returned.When possible,
rt.nansum(x, *args)
callsx.nansum(*args)
; look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below.For example,
FastArray.nansum
accepts thefilter
anddtype
keyword arguments, butDataset.nansum
does not.- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the sum calculation. If the filter is uniformly
False
,rt.nansum
returns0.0
.dtype (rt.dtype or numpy.dtype, default float64) – The data type of the result. For a
FastArray
x
,x.nansum(dtype = my_type)
is equivalent tomy_type(x.nansum())
.
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s sum.
- Return type:
scalar or Dataset
See also
sum
Sums the values of the input.
FastArray.nansum
Sums the values of a FastArray, ignoring NaNs.
Dataset.nansum
Sums the values of numerical Dataset columns, ignoring NaNs.
GroupByOps.nansum
Sums the values of each group, ignoring NaNs. Used by Categorical objects.
Notes
The dtype keyword for rt.nansum specifies the data type of the result. This differs from numpy.nansum, where it specifies the data type used to compute the sum.
Examples
>>> a = rt.FastArray([1, 3, 5, 7, rt.nan])
>>> rt.nansum(a)
16.0
With a dtype specified:
>>> a = rt.FastArray([1.0, 3.0, 5.0, 7.0, rt.nan])
>>> rt.nansum(a, dtype = rt.int32)
16
With a filter:
>>> a = rt.FastArray([1, 3, 5, 7, rt.nan])
>>> b = rt.FastArray([False, True, False, True, True])
>>> rt.nansum(a, filter = b)
10.0
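The filter, NaN, and dtype rules described for nansum can be modeled in plain Python. The following is a minimal sketch, not Riptable's implementation: nansum_sketch is a hypothetical helper, it works on lists rather than FastArray objects, and it assumes the dtype argument is any callable that converts the finished result.

```python
import math


def nansum_sketch(values, filter=None, dtype=float):
    """Hypothetical pure-Python model of the documented nansum rules."""
    if filter is None:
        filter = [True] * len(values)
    # Keep elements whose filter entry is True and that are not NaN.
    kept = [v for v, keep in zip(values, filter) if keep and not math.isnan(v)]
    # sum() of an empty list is 0, so all-NaN or all-False input gives 0.0.
    total = float(sum(kept))
    # Documented dtype rule: convert the finished result, not the inputs.
    return dtype(total)


nan = float("nan")
print(nansum_sketch([1, 3, 5, 7, nan]))                  # 16.0
print(nansum_sketch([1, 3, 5, 7, nan],
                    filter=[False, True, False, True, True]))  # 10.0
print(nansum_sketch([1.0, 3.0, 5.0, 7.0, nan], dtype=int))     # 16
```

The three calls mirror the three documented examples; the last one illustrates that the conversion happens after the sum is computed.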
- riptable.rt_numpy.nanvar(*args, filter=None, dtype=None, **kwargs)
Compute the variance of the values in the first argument, ignoring NaNs.
If all values in the first argument are NaNs, NaN is returned.
Riptable uses the convention that ddof = 1, meaning the variance of [x_1, ..., x_n] is defined by var = 1/(n - 1) * sum((x_i - mean)**2) (note the n - 1 instead of n). This differs from NumPy, which uses ddof = 0 by default.
When possible, rt.nanvar(x, *args) calls x.nanvar(*args); look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below.
For example, FastArray.nanvar accepts the filter and dtype keyword arguments, but Dataset.nanvar does not.
- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the variance calculation. If the filter is uniformly False, rt.nanvar returns a ZeroDivisionError.
dtype (rt.dtype or numpy.dtype, default float64) – The data type of the result. For a FastArray x, x.nanvar(dtype = my_type) is equivalent to my_type(x.nanvar()).
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s variance.
- Return type:
scalar or Dataset
See also
var
Computes the variance.
FastArray.nanvar
Computes the variance of FastArray values, ignoring NaNs.
Dataset.nanvar
Computes the variance of numerical Dataset columns, ignoring NaNs.
GroupByOps.nanvar
Computes the variance of each group, ignoring NaNs. Used by Categorical objects.
Notes
The dtype keyword for rt.nanvar specifies the data type of the result. This differs from numpy.nanvar, where it specifies the data type used to compute the variance.
Examples
>>> a = rt.FastArray([1, 2, 3, rt.nan])
>>> rt.nanvar(a)
1.0
With a dtype specified:
>>> a = rt.FastArray([1, 2, 3, rt.nan])
>>> rt.nanvar(a, dtype = rt.int32)
1
With a filter:
>>> a = rt.FastArray([1, 2, 3, rt.nan])
>>> b = rt.FastArray([False, True, True, True])
>>> rt.nanvar(a, filter = b)
0.5
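To make the ddof = 1 convention concrete, here is a minimal sketch of a NaN-ignoring sample variance in plain Python. nanvar_sketch is a hypothetical helper that operates on lists, and it deliberately does not model the empty cases (all NaNs, or a uniformly False filter).

```python
import math


def nanvar_sketch(values, filter=None):
    """Hypothetical model: sample variance (ddof = 1) of kept, non-NaN values."""
    if filter is None:
        filter = [True] * len(values)
    # Drop filtered-out elements and NaNs before computing anything.
    kept = [v for v, keep in zip(values, filter) if keep and not math.isnan(v)]
    n = len(kept)
    mean = sum(kept) / n
    # Sum of squared deviations divided by n - 1, not n.
    return sum((v - mean) ** 2 for v in kept) / (n - 1)


nan = float("nan")
print(nanvar_sketch([1, 2, 3, nan]))                                    # 1.0
print(nanvar_sketch([1, 2, 3, nan], filter=[False, True, True, True]))  # 0.5
```

In the filtered call only 2 and 3 remain, so the mean is 2.5 and the squared deviations sum to 0.5, which is then divided by n - 1 = 1.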
- riptable.rt_numpy.ones(shape, dtype=None, order='C', *, like=None)
Return a new array of the specified shape and data type, filled with ones.
- Parameters:
shape (int or sequence of int) – Shape of the new array, e.g., (2, 3) or 2. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
dtype (str or NumPy dtype or Riptable dtype, default numpy.float64) – The desired data type for the array.
order ({'C', 'F'}, default 'C') – Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
like (array_like, default None) – Reference object to allow the creation of arrays that are not NumPy arrays. If an array-like passed in as like supports the __array_function__ protocol, the result will be defined by it. In this case, it ensures the creation of an array object compatible with that passed in via this argument.
- Returns:
A new FastArray of the specified shape and type, filled with ones.
- Return type:
FastArray
See also
riptable.ones_like, riptable.zeros, riptable.zeros_like, riptable.empty, riptable.empty_like, riptable.full
Examples
>>> rt.ones(5)
FastArray([1., 1., 1., 1., 1.])
>>> rt.ones(5, dtype='int8')
FastArray([1, 1, 1, 1, 1], dtype=int8)
- riptable.rt_numpy.ones_like(a, dtype=None, order='K', subok=True, shape=None)
Return an array of ones with the same shape and data type as the specified array.
- Parameters:
a (array) – The shape and data type of a define the same attributes of the returned array. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
dtype (str or NumPy dtype or Riptable dtype, optional) – Overrides the data type of the result.
order ({'C', 'F', 'A', or 'K'}, default 'K') – Overrides the memory layout of the result. 'C' means row-major (C-style), 'F' means column-major (Fortran-style), 'A' means 'F' if a is Fortran-contiguous, 'C' otherwise. 'K' means match the layout of a as closely as possible.
subok (bool, default True) – If True (the default), then the newly created array will use the sub-class type of a, otherwise it will be a base-class array.
shape (int or sequence of int, optional) – Overrides the shape of the result. If order='K' and the number of dimensions is unchanged, it will try to keep the same order; otherwise, order='C' is implied. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
- Returns:
A FastArray with the same shape and data type as the specified array, filled with ones.
- Return type:
FastArray
See also
riptable.ones, riptable.zeros, riptable.zeros_like, riptable.empty, riptable.empty_like, riptable.full
Examples
>>> a = rt.FastArray([1, 2, 3, 4])
>>> rt.ones_like(a)
FastArray([1, 1, 1, 1])
>>> rt.ones_like(a, dtype = float)
FastArray([1., 1., 1., 1.])
- riptable.rt_numpy.percentile(*args, **kwargs)
- riptable.rt_numpy.putmask(a, mask, values)
This is roughly the equivalent of arr[mask] = arr2[mask].
Examples
>>> arr = rt.FastArray([10, 10, 10, 10])
>>> arr2 = rt.FastArray([1, 2, 3, 4])
>>> mask = rt.FastArray([False, True, True, False])
>>> rt.putmask(arr, mask, arr2)
>>> arr
FastArray([10, 2, 3, 10])
It’s important to note that arr and arr2 are presumed to be the same length; otherwise, the values in arr2 are repeated until they have the same dimension.
It should NOT be used to replace this operation:
>>> arr = rt.FastArray([10, 10, 10, 10])
>>> arr2 = rt.FastArray([1, 2])
>>> mask = rt.FastArray([False, True, True, False])
>>> arr[mask] = arr2
>>> arr
FastArray([10, 1, 2, 10])
With putmask, arr2 is repeated to create rt.FastArray([1, 2, 1, 2]) before performing the operation, hence the different result.
>>> arr = rt.FastArray([10, 10, 10, 10])
>>> arr2 = rt.FastArray([1, 2])
>>> mask = rt.FastArray([False, True, True, False])
>>> rt.putmask(arr, mask, arr2)
>>> arr
FastArray([10, 2, 1, 10])
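The difference between the two operations above comes down to positional versus sequential use of the replacement values. The sketch below models both on plain lists. putmask_sketch and masked_assign_sketch are hypothetical helpers, and unlike rt.putmask they return a new list instead of modifying the target in place.

```python
def putmask_sketch(target, mask, values):
    """Hypothetical model of putmask: repeat values, then pick by position."""
    n = len(target)
    # Repeat values until they match the target length, e.g. [1, 2] -> [1, 2, 1, 2].
    repeated = [values[i % len(values)] for i in range(n)]
    # At each masked position, take the repeated value at that same position.
    return [repeated[i] if mask[i] else target[i] for i in range(n)]


def masked_assign_sketch(target, mask, values):
    """Hypothetical model of arr[mask] = values: values are consumed in order."""
    remaining = iter(values)
    return [next(remaining) if m else t for t, m in zip(target, mask)]


arr = [10, 10, 10, 10]
mask = [False, True, True, False]
print(putmask_sketch(arr, mask, [1, 2]))        # [10, 2, 1, 10]
print(masked_assign_sketch(arr, mask, [1, 2]))  # [10, 1, 2, 10]
```

When the values have the same length as the target, the repeat step changes nothing and putmask_sketch(arr, mask, [1, 2, 3, 4]) gives [10, 2, 3, 10], matching the first example.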
- riptable.rt_numpy.reindex_fast(index, array)
- riptable.rt_numpy.reshape(*args, **kwargs)
- riptable.rt_numpy.round(*args, **kwargs)
This will check for numpy array first and call np.round
- riptable.rt_numpy.searchsorted(a, v, side='left', sorter=None)
See np.searchsorted. side='leftplus' is a new option in Riptable where values > get a 0.
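For readers unfamiliar with the side argument inherited from np.searchsorted, the standard 'left' and 'right' options correspond to the two binary-search insertion points below. This sketch uses Python's bisect module as a stand-in and does not attempt to model 'leftplus', whose behavior is not fully described here.

```python
import bisect

sorted_values = [10, 20, 20, 30]

# side='left': first index where 20 could be inserted while keeping order.
left = bisect.bisect_left(sorted_values, 20)
# side='right': index just past the last existing 20.
right = bisect.bisect_right(sorted_values, 20)

print(left, right)  # 1 3
```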
- riptable.rt_numpy.single(a)
- riptable.rt_numpy.sort(*args, **kwargs)
- riptable.rt_numpy.sortinplaceindirect(*args, **kwargs)
- riptable.rt_numpy.std(*args, filter=None, dtype=None, **kwargs)
Compute the standard deviation of the values in the first argument.
Riptable uses the convention that ddof = 1, meaning the standard deviation of [x_1, ..., x_n] is defined by std = sqrt(1/(n - 1) * sum((x_i - mean)**2)) (note the n - 1 instead of n). This differs from NumPy, which uses ddof = 0 by default.
When possible, rt.std(x, *args) calls x.std(*args); look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below.
For example, FastArray.std accepts the filter and dtype keyword arguments, but Dataset.std does not.
- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the standard deviation calculation. If the filter is uniformly False, rt.std returns a ZeroDivisionError.
dtype (rt.dtype or numpy.dtype, default float64) – The data type of the result. For a FastArray x, x.std(dtype = my_type) is equivalent to my_type(x.std()).
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s standard deviation.
- Return type:
scalar or Dataset
See also
nanstd
Computes the standard deviation, ignoring NaNs.
FastArray.std
Computes the standard deviation of FastArray values.
Dataset.std
Computes the standard deviation of numerical Dataset columns.
GroupByOps.std
Computes the standard deviation of each group. Used by Categorical objects.
Notes
The dtype keyword for rt.std specifies the data type of the result. This differs from numpy.std, where it specifies the data type used to compute the standard deviation.
Examples
>>> a = rt.FastArray([1, 2, 3])
>>> rt.std(a)
1.0
With a dtype specified:
>>> a = rt.FastArray([1, 2, 3])
>>> rt.std(a, dtype = rt.int32)
1
With a filter:
>>> a = rt.FastArray([1, 2, 3])
>>> b = rt.FA([False, True, True])
>>> rt.std(a, filter = b)
0.7071067811865476
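The standard deviation is the square root of the variance, so the ddof choice changes the divisor under the root. As a rough check that needs no Riptable install, the sketch below computes both conventions by hand and compares them with Python's statistics module, where stdev uses n - 1 and pstdev uses n.

```python
import math
import statistics

data = [1.0, 2.0, 3.0]
n = len(data)
mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)

# ddof = 1 (the convention described for rt.std): divide by n - 1.
std_ddof1 = math.sqrt(squared_deviations / (n - 1))
# ddof = 0 (NumPy's default): divide by n.
std_ddof0 = math.sqrt(squared_deviations / n)

print(std_ddof1)  # 1.0
print(math.isclose(std_ddof1, statistics.stdev(data)))   # True
print(math.isclose(std_ddof0, statistics.pstdev(data)))  # True
```

The same data therefore gives 1.0 under ddof = 1 but roughly 0.816 under ddof = 0, which is the gap to expect when comparing rt.std output against a default numpy.std call.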
- riptable.rt_numpy.sum(*args, filter=None, dtype=None, **kwargs)
Compute the sum of the values in the first argument.
When possible, rt.sum(x, *args) calls x.sum(*args); look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below. For example, Dataset.sum() does not accept the filter or dtype keyword arguments.
For FastArray.sum, see numpy.sum for documentation but note the following:
- Until a reported bug is fixed, the dtype keyword argument may not work as expected:
  - Riptable data types (for example, rt.float64) are ignored.
  - NumPy integer data types (for example, numpy.int32) are also ignored.
  - NumPy floating point data types are applied as numpy.float64.
- If you include another NumPy parameter (for example, axis=0), the NumPy implementation of sum will be used and the dtype will be used to compute the sum.
- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the sum calculation.
dtype (rt.dtype or numpy.dtype, optional) – The data type of the result. By default, for integer input the result dtype is int64 and for floating point input the result dtype is float64. See the notes above about using this keyword argument with FastArray objects as input.
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s sum.
- Return type:
scalar or Dataset
See also
nansum
Sums the values, ignoring NaNs.
FastArray.sum
Sums the values of a FastArray.
Dataset.sum
Sums the values of numerical Dataset columns.
GroupByOps.sum
Sums the values of each group. Used by Categorical objects.
Examples
>>> a = rt.FastArray([1, 3, 5, 7])
>>> rt.sum(a)
16
>>> a = rt.FastArray([1.0, 3.0, 5.0, 7.0])
>>> rt.sum(a)
16.0
- riptable.rt_numpy.tile(arr, reps)
Construct an array by repeating a specified array a specified number of times.
- Parameters:
arr (array or scalar) – The input array or scalar.
reps (int or array of int) – The number of repetitions of arr along each axis. For examples of tile used with multi-dimensional arrays, see numpy.tile(). Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
- Returns:
A new FastArray of the repeated input arrays.
- Return type:
FastArray
See also
riptable.repeat
Construct an array by repeating each element of a specified array.
Examples
Tile a scalar:
>>> rt.tile(2, 5)
FastArray([2, 2, 2, 2, 2])
Tile an array:
>>> x = rt.FA([1, 2, 3, 4])
>>> rt.tile(x, 2)
FastArray([1, 2, 3, 4, 1, 2, 3, 4])
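The distinction between tile and the related repeat is easy to mix up: tile repeats the whole sequence, while repeat repeats each element in place. A minimal one-dimensional sketch on plain lists follows; tile_sketch and repeat_sketch are hypothetical helpers, not Riptable functions.

```python
def tile_sketch(values, reps):
    """Hypothetical 1-D model of tile: the whole sequence, reps times over."""
    return list(values) * reps


def repeat_sketch(values, reps):
    """Hypothetical 1-D model of repeat: each element reps times, in place."""
    return [v for v in values for _ in range(reps)]


x = [1, 2, 3, 4]
print(tile_sketch(x, 2))    # [1, 2, 3, 4, 1, 2, 3, 4]
print(repeat_sketch(x, 2))  # [1, 1, 2, 2, 3, 3, 4, 4]
```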
- riptable.rt_numpy.transpose(*args, **kwargs)
- riptable.rt_numpy.trunc(*args, **kwargs)
- riptable.rt_numpy.unique32(list_keys, hintSize=0, filter=None)
Returns the index location of the first occurrence of each key.
- Parameters:
list_keys (list of ndarray) – A list of numpy arrays to hash on (multikey); if there is just one item it still needs to be in a list such as [array1].
hintSize (int) – Integer hint if the number of unique keys (in list_keys) is known in advance, defaults to 0.
filter (ndarray of bools, optional) – Boolean array used to pre-filter the array(s) in list_keys prior to processing them, defaults to None.
- Returns:
An array the size of the total unique values; the array contains the INDEX to the first occurrence of the unique value. The second array contains the INDEX to the last occurrence of the unique value.
- Return type:
ndarray of ints
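The multikey idea can be sketched with a dictionary keyed on row tuples. This is only an illustration of "index of the first (and last) occurrence of each unique key". first_last_index_sketch is a hypothetical helper, and the ordering of the results by first appearance is an assumption rather than documented behavior of unique32.

```python
def first_last_index_sketch(list_keys, filter=None):
    """Hypothetical model: first and last row index of each unique multikey."""
    first, last = {}, {}
    # zip(*list_keys) turns the list of key columns into one tuple per row.
    for i, key in enumerate(zip(*list_keys)):
        if filter is not None and not filter[i]:
            continue  # rows excluded by the filter are never seen
        first.setdefault(key, i)  # only the first sighting is recorded
        last[key] = i             # every sighting overwrites the last index
    # Assumption: results are ordered by first appearance of each key.
    return list(first.values()), [last[k] for k in first]


names = ["x", "y", "x", "z", "y"]
codes = [1, 2, 1, 3, 2]
print(first_last_index_sketch([names, codes]))  # ([0, 1, 3], [2, 4, 3])
```

A single key column still goes in a list, as in first_last_index_sketch([names]), mirroring the [array1] requirement noted for list_keys.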
- riptable.rt_numpy.var(*args, filter=None, dtype=None, **kwargs)
Compute the variance of the values in the first argument.
Riptable uses the convention that ddof = 1, meaning the variance of [x_1, ..., x_n] is defined by var = 1/(n - 1) * sum((x_i - mean)**2) (note the n - 1 instead of n). This differs from NumPy, which uses ddof = 0 by default.
When possible, rt.var(x, *args) calls x.var(*args); look there for documentation. In particular, note whether the called function accepts the keyword arguments listed below.
For example, FastArray.var accepts the filter and dtype keyword arguments, but Dataset.var does not.
- Parameters:
filter (array of bool, default None) – Specifies which elements to include in the variance calculation. If the filter is uniformly False, rt.var returns a ZeroDivisionError.
dtype (rt.dtype or numpy.dtype, default float64) – The data type of the result. For a FastArray x, x.var(dtype = my_type) is equivalent to my_type(x.var()).
- Returns:
Scalar for FastArray input. For Dataset input, returns a Dataset consisting of a row with each numerical column’s variance.
- Return type:
scalar or Dataset
See also
nanvar
Computes the variance, ignoring NaNs.
FastArray.var
Computes the variance of FastArray values.
Dataset.var
Computes the variance of numerical Dataset columns.
GroupByOps.var
Computes the variance of each group. Used by Categorical objects.
Notes
The dtype keyword for rt.var specifies the data type of the result. This differs from numpy.var, where it specifies the data type used to compute the variance.
Examples
>>> a = rt.FastArray([1, 2, 3])
>>> rt.var(a)
1.0
With a dtype specified:
>>> a = rt.FastArray([1, 2, 3])
>>> rt.var(a, dtype = rt.int32)
1
With a filter:
>>> a = rt.FastArray([1, 2, 3])
>>> b = rt.FastArray([False, True, True])
>>> rt.var(a, filter = b)
0.5
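Python's statistics module happens to expose both variance conventions under separate names, which makes for a quick sanity check of the numbers above: variance divides by n - 1 (ddof = 1) and pvariance divides by n (ddof = 0). This is a stdlib stand-in for illustration, not a call into Riptable or NumPy.

```python
import statistics

data = [1.0, 2.0, 3.0]

# Sample variance, n - 1 divisor: the ddof = 1 convention described for rt.var.
sample_var = statistics.variance(data)
# Population variance, n divisor: the ddof = 0 convention NumPy uses by default.
population_var = statistics.pvariance(data)

print(sample_var)                       # 1.0
print(population_var < sample_var)      # True
print(statistics.variance([2.0, 3.0]))  # 0.5
```

The last line corresponds to the filtered example, where only 2 and 3 are included in the calculation.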
- riptable.rt_numpy.vstack(arrlist, dtype=None, order='C')
- Parameters:
arrlist (list of 1d numpy arrays of the same length) – These arrays are considered the columns.
order ({'C', 'F'}, default 'C') – 'C' is row-major; 'F' is column-major.
dtype (default None) – Can specify the final dtype for order='F' only.
WARNING: When order='F', Riptable vstack returns a different shape from np.vstack, since it tries to keep the first dimension the same length while keeping the data contiguous.
If order='F' is not passed, order='C' is assumed.
If Riptable fails, then normal np.vstack is called.
For large arrays, Riptable can run in parallel while converting to the dtype on the fly.
- Returns:
A 2d array that is column-major and can be inserted into a Dataset. Use v[:,0] then v[:,1] to access the columns, instead of v[0] and v[1], which would be the method with np.vstack.
See also
np.vstack, np.column_stack
Examples
>>> a = rt.arange(100)
>>> b = rt.arange(100.0)
>>> v = rt.vstack([a,b], order='F')
>>> v.strides
(8, 800)
>>> v.flags
C_CONTIGUOUS : False
F_CONTIGUOUS : True
>>> v.shape
(100, 2)
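The strides in the example follow directly from the memory layout: in a contiguous column-major array the first axis moves one element at a time, and the second axis jumps over a whole column. The sketch below recomputes them by hand. strides_sketch is a hypothetical helper, and the 8-byte item size assumes the stacked result is a 64-bit type.

```python
def strides_sketch(shape, itemsize, order="C"):
    """Hypothetical helper: byte strides of a contiguous array of this shape."""
    strides = [0] * len(shape)
    step = itemsize
    # 'F' (column-major): the first axis varies fastest.
    # 'C' (row-major): the last axis varies fastest.
    axes = range(len(shape)) if order == "F" else reversed(range(len(shape)))
    for axis in axes:
        strides[axis] = step
        step *= shape[axis]
    return tuple(strides)


# Column-major (100, 2): each column is one contiguous run of 100 items.
print(strides_sketch((100, 2), 8, order="F"))  # (8, 800)
# Row-major (2, 100), the shape np.vstack would give for the same inputs.
print(strides_sketch((2, 100), 8, order="C"))  # (800, 8)
```

Either way each input array stays contiguous in memory; the order='F' result simply presents the inputs as columns (v[:,0], v[:,1]) rather than rows.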
- riptable.rt_numpy.where(condition, x=None, y=None)
Return a new FastArray or Categorical with elements from x or y depending on whether condition is True.
For 1-dimensional arrays, this function is equivalent to:
[xv if c else yv for c, xv, yv in zip(condition, x, y)]
If only condition is provided, this function returns a tuple containing an integer FastArray with the indices where the condition is True. Note that this usage of where is not supported for FastArray objects of more than one dimension.
Note also that this case of where uses riptable.bool_to_fancy(). Using bool_to_fancy directly is preferred, as it behaves correctly for subclasses.
- Parameters:
condition (bool or array of bool) – Where True, yield x, otherwise yield y.
x (scalar, array, or callable, optional) – The value to use where condition is True. If x is provided, y must also be provided, and x and y should be the same type. If x is an array, a callable that returns an array, or a Categorical, it must be the same length as condition. The value of x that corresponds to the True value is used.
y (scalar, array, or callable, optional) – The value to use where condition is False. If y is provided, x must also be provided, and x and y should be the same type. If y is an array, a callable that returns an array, or a Categorical, it must be the same length as condition. The value of y that corresponds to the False value is used.
- Returns:
If x and y are Categorical objects, a Categorical is returned. Otherwise, if x and y are provided, a FastArray is returned. When only condition is provided, a tuple is returned containing an integer FastArray with the indices where the condition is True.
- Return type:
FastArray or Categorical or tuple
See also
FastArray.where
Replace values where a given condition is False.
riptable.bool_to_fancy
The function called when x and y are omitted.
Examples
condition is a comparison that creates an array of booleans, and x and y are scalars:
>>> a = rt.FastArray(rt.arange(5))
>>> a
FastArray([0, 1, 2, 3, 4])
>>> rt.where(a < 2, 100, 200)
FastArray([100, 100, 200, 200, 200])
condition and x are same-length arrays, and y is a scalar:
>>> condition = rt.FastArray([False, False, True, True, True])
>>> x = rt.FastArray([100, 101, 102, 103, 104])
>>> y = 200
>>> rt.where(condition, x, y)
FastArray([200, 200, 102, 103, 104])
When x and y are Categorical objects, a Categorical is returned:
>>> primary_traders = rt.Cat(['John', 'Mary', 'John', 'Mary', 'John', 'Mary'])
>>> secondary_traders = rt.Cat(['Chris', 'Duncan', 'Chris', 'Duncan', 'Duncan', 'Chris'])
>>> is_primary = rt.FA([True, True, False, True, False, True])
>>> rt.where(is_primary, primary_traders, secondary_traders)
Categorical([John, Mary, Chris, Mary, Duncan, Mary]) Length: 6
  FastArray([3, 4, 1, 4, 2, 4], dtype=int8) Base Index: 1
  FastArray([b'Chris', b'Duncan', b'John', b'Mary'], dtype='|S6') Unique count: 4
When x and y are Date objects, a FastArray of integers is returned that can be converted to a Date (other datetime objects are similar):
>>> x = rt.Date(['20230101', '20220101', '20210101'])
>>> y = rt.Date(['20190101', '20180101', '20170101'])
>>> condition = x > rt.Date(['20211231'])
>>> rt.where(condition, x, y)
FastArray([19358, 18993, 17167])
>>> rt.Date(_)
Date(['2023-01-01', '2022-01-01', '2017-01-01'])
When only a condition is provided, a tuple is returned containing a FastArray with the indices where the condition is True:
>>> a = rt.FastArray([10, 20, 30, 40, 50])
>>> rt.where(a < 40)
(FastArray([0, 1, 2]),)
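The list-comprehension equivalence quoted in the description can be extended into a small pure-Python model that also covers scalar arguments and the condition-only form. where_sketch is a hypothetical helper working on lists; it ignores Categorical, Date, and callable inputs, and treats any non-list x or y as a scalar to be broadcast.

```python
def where_sketch(condition, x=None, y=None):
    """Hypothetical 1-D model of where on plain Python lists."""
    if x is None and y is None:
        # Condition-only form: a one-element tuple of the True indices.
        return ([i for i, c in enumerate(condition) if c],)
    n = len(condition)
    # Broadcast scalars so that x and y line up with condition.
    xs = x if isinstance(x, list) else [x] * n
    ys = y if isinstance(y, list) else [y] * n
    # The documented 1-dimensional equivalence.
    return [xv if c else yv for c, xv, yv in zip(condition, xs, ys)]


a = [0, 1, 2, 3, 4]
print(where_sketch([v < 2 for v in a], 100, 200))
# [100, 100, 200, 200, 200]
print(where_sketch([False, False, True, True, True],
                   [100, 101, 102, 103, 104], 200))
# [200, 200, 102, 103, 104]
print(where_sketch([v < 40 for v in [10, 20, 30, 40, 50]]))
# ([0, 1, 2],)
```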
- riptable.rt_numpy.zeros(*args, **kwargs)
Return a new array of the specified shape and data type, filled with zeros.
- Parameters:
shape (int or sequence of int) – Shape of the new array, e.g., (2, 3) or 2. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
dtype (str or NumPy dtype or Riptable dtype, default numpy.float64) – The desired data type for the array.
order ({'C', 'F'}, default 'C') – Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
like (array_like, default None) – Reference object to allow the creation of arrays that are not NumPy arrays. If an array-like passed in as like supports the __array_function__ protocol, the result will be defined by it. In this case, it ensures the creation of an array object compatible with that passed in via this argument.
- Returns:
A new FastArray of the specified shape and type, filled with zeros.
- Return type:
FastArray
See also
riptable.zeros_like, riptable.ones, riptable.ones_like, riptable.empty, riptable.empty_like, riptable.full
Examples
>>> rt.zeros(5)
FastArray([0., 0., 0., 0., 0.])
>>> rt.zeros(5, dtype = 'int8')
FastArray([0, 0, 0, 0, 0], dtype=int8)
- riptable.rt_numpy.zeros_like(a, dtype=None, order='k', subok=True, shape=None)
Return an array of zeros with the same shape and data type as the specified array.
- Parameters:
a (array) – The shape and data type of a define the same attributes of the returned array. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
dtype (str or NumPy dtype or Riptable dtype, optional) – Overrides the data type of the result.
order ({'C', 'F', 'A', or 'K'}, default 'K') – Overrides the memory layout of the result. 'C' means row-major (C-style), 'F' means column-major (Fortran-style), 'A' means 'F' if a is Fortran-contiguous, 'C' otherwise. 'K' means match the layout of a as closely as possible.
subok (bool, default True) – If True (the default), then the newly created array will use the sub-class type of a, otherwise it will be a base-class array.
shape (int or sequence of int, optional) – Overrides the shape of the result. If order='K' and the number of dimensions is unchanged, it will try to keep the same order; otherwise, order='C' is implied. Note that although multi-dimensional arrays are technically supported by Riptable, you may get unexpected results when working with them.
- Returns:
A FastArray with the same shape and data type as the specified array, filled with zeros.
- Return type:
FastArray
See also
riptable.zeros, riptable.ones, riptable.ones_like, riptable.empty, riptable.empty_like, riptable.full
Examples
>>> a = rt.FastArray([1, 2, 3, 4])
>>> rt.zeros_like(a)
FastArray([0, 0, 0, 0])
>>> rt.zeros_like(a, dtype = float)
FastArray([0., 0., 0., 0.])
- riptable.rt_numpy.asanyarray
- riptable.rt_numpy.asarray