riptable.rt_compressedarray
Classes
- class riptable.rt_compressedarray.CompressedArray(arr)
Bases: riptable.rt_fastarray.FastArray
A FastArray is a 1-dimensional array of items that are the same data type.
Because it’s a subclass of NumPy’s numpy.ndarray, all ndarray functions and attributes can be used with FastArray objects. However, Riptable optimizes many of NumPy’s functions to make them faster and more memory-efficient. Riptable has also added some methods.
FastArray objects with more than 1 dimension are not supported.
See NumPy’s docs for details on all ndarray methods and attributes.
- Parameters:
arr (array, iterable, or scalar value) – Contains data to be stored in the FastArray.
**kwargs – Additional keyword arguments to be passed to the function.
Notes
To improve performance, FastArray objects take over some of NumPy’s universal functions (ufuncs), use array recycling and multiple threads, and pass certain method calls to Bottleneck.
Note that whenever Riptable has implemented its own version of an existing NumPy method, a call to the NumPy method results in a call to the optimized Riptable version instead. We encourage users to call the Riptable method directly to avoid any confusion about which method is actually being called.
See the list of NumPy Methods Optimized by Riptable for FastArrays.
Examples
Construct a FastArray
Pass a list to the constructor:
>>> import riptable as rt
>>> rt.FastArray([1, 2, 3, 4, 5])
FastArray([1, 2, 3, 4, 5])
>>> # NOTE: rt.FA also works.
>>> rt.FA([1.0, 2.0, 3.0, 4.0, 5.0])
FastArray([1., 2., 3., 4., 5.])
Or use a utility function:
>>> rt.full(10, 0.7)
FastArray([0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7])
>>> rt.arange(10)
FastArray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
You can optionally specify a data type:
>>> x = rt.FastArray([3, 6, 10], dtype=rt.float64)
>>> x, x.dtype
(FastArray([ 3., 6., 10.]), dtype('float64'))
>>> # Using a string shortcut:
>>> x = rt.FastArray([3, 6, 10], dtype='float64')
>>> x, x.dtype
(FastArray([ 3., 6., 10.]), dtype('float64'))
By default, characters are stored as byte strings. When unicode=True, the FastArray allows Unicode characters.
>>> rt.FA(list('abc'), unicode=True)
FastArray(['a', 'b', 'c'], dtype='<U1')
To convert an existing NumPy array, use the FastArray constructor.
>>> import numpy as np
>>> np_arr = np.array([1, 2, 3])
>>> rt.FA(np_arr)
FastArray([1, 2, 3])
To view the NumPy array as a FastArray (which is slightly less expensive than using the constructor), use the view method.
>>> fa = np_arr.view(rt.FA)
>>> fa
FastArray([1, 2, 3])
To view it as a NumPy array again:
>>> fa.view(np.ndarray)
array([1, 2, 3])
>>> # Alternatively:
>>> fa._np
array([1, 2, 3])
Get a Subset of a FastArray
You can use standard Python slicing notation or fancy indexing to access a subset of a FastArray.
>>> # Create a FastArray:
>>> array = rt.arange(8)**2
>>> array
FastArray([0, 1, 4, 9, 16, 25, 36, 49])
>>> # Use Python slicing to get elements 2, 3, and 4:
>>> array[2:5]
FastArray([4, 9, 16])
>>> # Use fancy indexing to get elements 2, 4, and 1 (in that order):
>>> array[[2, 4, 1]]
FastArray([4, 16, 1])
For more details, see the examples for 1-dimensional arrays in NumPy’s docs: Indexing on ndarrays.
Note that slicing creates a view of the array and does not copy the underlying data; modifying the slice modifies the original array. Fancy indexing creates a copy of the extracted data; modifying this array does not modify the original array.
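The view-versus-copy distinction above can be checked with a short sketch. It uses plain NumPy arrays as a stand-in, on the assumption that FastArray, as an ndarray subclass, follows the same slicing and fancy-indexing semantics; the variable names are illustrative only.

```python
import numpy as np

# Stand-in for rt.arange(8)**2 (assumption: FastArray behaves like ndarray here).
base = np.arange(8) ** 2

# Slicing returns a view: writing through it changes the original array.
sliced = base[2:5]
sliced[0] = -1

# Fancy indexing returns a copy: writing to it leaves the original untouched.
picked = base[[2, 4, 1]]
picked[0] = 99

print(base[2])                          # the write through the slice is visible
print(np.shares_memory(base, sliced))   # the slice shares memory with base
print(np.shares_memory(base, picked))   # the fancy-indexed result does not
```

If a slice needs to be modified independently of the original, calling copy() on it first avoids the aliasing.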
You can also pass a Boolean mask array.
>>> # Create a Boolean mask:
>>> evenMask = (array % 2 == 0)
>>> evenMask
FastArray([True, False, True, False, True, False, True, False])
>>> # Index using the Boolean mask:
>>> array[evenMask]
FastArray([0, 4, 16, 36])
How to Subclass FastArray
Include the required class definition:
>>> class TestSubclass(rt.FastArray):
...     def __new__(cls, arr, **args):
...         # Before this call, arr needs to be a np.ndarray instance.
...         return arr.view(cls)
...     def __init__(self, arr, **args):
...         pass
If the subclass is computable, you might define your own math operations. In these operations, you might define what the subclass can be computed with. For examples of new definitions, see the DateTimeNano class.
Common operations to hook are comparisons (__eq__(), __ne__(), __gt__(), __lt__(), __le__(), __ge__()) and basic math functions (__add__(), __sub__(), __mul__(), etc.).
Bracket indexing operations are very common. If the subclass needs to set or return a value other than that in the underlying array, you need to take over __getitem__() or __setitem__().
Indexing is also used in display. For regular console/notebook display, you need to take over:
_repr_html_() (for JupyterLab and Jupyter notebooks)
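As an illustration of taking over __getitem__(), here is a minimal sketch. It subclasses numpy.ndarray rather than FastArray so that it runs without Riptable, and the Cents class, its storage convention, and its conversion rule are all hypothetical; a real FastArray subclass may need additional hooks.

```python
import numpy as np

class Cents(np.ndarray):
    """Hypothetical subclass: stores integer cents, reads scalars back as dollars."""

    def __new__(cls, arr):
        # Same pattern as the required class definition: view an ndarray as the subclass.
        return np.asarray(arr, dtype=np.int64).view(cls)

    def __getitem__(self, index):
        result = super().__getitem__(index)
        if isinstance(result, np.ndarray):
            # Slices and fancy indexing still return the subclass, unconverted.
            return result
        # A single element is returned as a value other than the stored one.
        return float(result) / 100.0

prices = Cents([150, 275, 1000])
print(prices[0])                    # scalar read goes through the conversion
print(type(prices[0:2]).__name__)   # slicing keeps the subclass
print(prices.view(np.ndarray)[1])   # the underlying stored value is unchanged
```

Only scalar reads are converted here; whether slices should also be transformed is a design choice that depends on what the subclass represents.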
If the array is being displayed in a Dataset and you require certain formatting, you need to define two more methods:
display_query_properties()
Returns an ItemFormat object (see rt.Utils.rt_display_properties).
display_convert_func()
The conversion function returned by display_query_properties() must return a string. Each item being displayed, the result of __getitem__() at a single index, will go through this function individually, accompanied by an ItemFormat object.
Many Riptable operations need to return arrays of the same class they received. To ensure that your subclass will retain its special properties, you need to take over newclassfrominstance(). Failure to take this over will often result in an object with uninitialized variables.
copy() is another method that is called generically in Riptable routines, and needs to be taken over to retain subclass properties.
For a view of the underlying FastArray, you can use the _fa property.
- allowed_funcs = ['decompress', 'view']
- __getattribute__(attr)
Block all FastArray operations. See allowed_funcs class global.
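The allow-list idea described here can be sketched in plain Python. This is not Riptable's implementation: GuardedBlob, its _payload attribute, and the rule that lets double-underscore names through are assumptions made for illustration; only the allowed_funcs name and its two entries come from the listing above.

```python
class GuardedBlob:
    """Hypothetical stand-in that blocks attribute access outside an allow-list."""

    allowed_funcs = ["decompress", "view"]

    def __init__(self, payload):
        # Attribute assignment does not go through __getattribute__.
        self._payload = payload

    def __getattribute__(self, attr):
        # Assumption: dunder names stay reachable so basic object machinery works.
        if attr.startswith("__") or attr in type(self).allowed_funcs:
            return object.__getattribute__(self, attr)
        raise AttributeError(f"'{attr}' is blocked; allowed: {type(self).allowed_funcs}")

    def decompress(self):
        # Bypass the guard explicitly to reach the stored data.
        return object.__getattribute__(self, "_payload")

    def view(self):
        # Placeholder for an allowed operation; returns the object itself.
        return self

blob = GuardedBlob([1, 2, 3])
print(blob.decompress())

try:
    blob.sum  # not in allowed_funcs, so the lookup is rejected
except AttributeError as exc:
    print("blocked:", exc)
```

Under this pattern, a caller would have to call decompress() before using any ordinary array operations.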
- __repr__()
Return repr(self).
- __str__()
Return str(self).
- _build_string()
- decompress()