riptable.rt_accumtable
Classes
Enables the creation of tables with values calculated by various reducing functions. |
Functions
|
Apply reducing functions to multiple arrays that are grouped by a |
|
Generate a |
|
Generate a |
- class riptable.rt_accumtable.AccumTable(cat_rows, cat_cols, filter=None, showfilter=False)[source]
Bases:
riptable.rt_accum2.Accum2
Enables the creation of tables with values calculated by various reducing functions.
AccumTable is a wrapper on Accum2 and can generate tables with multiple footer rows and margin columns, which represent values calculated by a variety of reducing functions.
An AccumTable holds multiple tables at once. For example, an AccumTable can hold the tables calculated by the mean, sum, and variance reducing functions. All tables in the AccumTable are grouped by the same two Categorical objects.
Each table in the AccumTable has these three parts:
Inner table - a table of values calculated by a reducing function and indexed by row and column groups.
Footer row - a row on the bottom margin that contains the calculated value for each column group.
Margin column - a column on the right margin that contains the calculated value for each row group.
After creating an AccumTable, you can generate a Dataset to view the calculated values as a table. You can customize the generated table by specifying one inner table, a set of footer rows, and a set of margin columns.
You create an AccumTable and generate a table with the following multistep process:
1. Pass two Categorical objects to create an AccumTable and to specify the row and column groups.
2. Add tables to the AccumTable by setting its elements to Dataset objects of values calculated by a reducing function. For a list of reducing functions, see Reducing Functions Supported by Categoricals.
3. Specify which summary rows and columns you want to include in a generated table using set_footer_rows() and set_margin_columns().
4. Generate a table view with the specified summary rows and columns using gen().
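The three parts of a table (inner table, footer row, margin column) can be illustrated without riptable. The following is a minimal pure-Python sketch for the sum reducing function only; the helper name sum_table is hypothetical and is not part of the riptable API, and the data mirrors the Groups, Letters, and Ints columns used in the examples on this page.

```python
from collections import defaultdict

# Sample data mirroring the Groups, Letters, and Ints columns used on this page.
groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
letters = ["A", "B", "C", "A", "C"]
ints = [0, 1, 2, 3, 4]


def sum_table(row_keys, col_keys, values):
    """Hypothetical helper: sum `values` grouped by two keys.

    Returns the inner table, the footer row, the margin column,
    and the grand total.
    """
    inner = defaultdict(int)   # (row group, column group) -> sum
    footer = defaultdict(int)  # column group -> sum over all row groups
    margin = defaultdict(int)  # row group -> sum over all column groups
    total = 0
    for r, c, v in zip(row_keys, col_keys, values):
        inner[(r, c)] += v
        footer[c] += v
        margin[r] += v
        total += v
    return dict(inner), dict(footer), dict(margin), total


inner, footer, margin, total = sum_table(groups, letters, ints)
print(inner[("Group1", "A")])  # one inner-table cell
print(footer)                  # footer row, one value per column group
print(margin)                  # margin column, one value per row group
print(total)                   # bottom-right corner value
```

For sum, each footer or margin value equals the sum of the inner cells it summarizes. For other reducing functions, such as mean or variance, the footer and margin values are calculated over the whole column or row group rather than derived from the inner cells.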
- Parameters:
cat_rows (Categorical) – The row groups used to accumulate the values.
cat_cols (Categorical) – The column groups used to accumulate the values.
filter (ndarray) – Boolean mask array applied to arrays before grouping, reducing, and addition to the AccumTable.
showfilter (bool) – Controls whether the returned table contains row or column groups that result entirely in 0 or nan when the filter is applied.
See also
rt_accum2.Accum2
The parent class for AccumTable.
rt_categorical.Categorical
A class that efficiently stores an array of repeated strings and is used for groupby operations.
rt_groupbyops.GroupByOps
A class that holds the reducing functions used to create an AccumTable.
Examples
Construct a Dataset for the following examples:
>>> import riptable as rt
>>> ds = rt.Dataset()
>>> ds.Zeros = [0, 0, 0, 0, 0]
>>> ds.Ones = [1, 1, 1, 1, 1]
>>> ds.Twos = [2, 2, 2, 2, 2]
>>> ds.Nans = [rt.nan, rt.nan, rt.nan, rt.nan, rt.nan]
>>> ds.Ints = [0, 1, 2, 3, 4]
>>> ds.Groups = rt.Cat(["Group1", "Group2", "Group1", "Group1", "Group2"])
>>> ds.Letters = rt.Cat(["A", "B", "C", "A", "C"])
>>> ds
#   Zeros   Ones   Twos   Nans   Ints   Groups   Letters
-   -----   ----   ----   ----   ----   ------   -------
0       0      1      2    nan      0   Group1   A
1       0      1      2    nan      1   Group2   B
2       0      1      2    nan      2   Group1   C
3       0      1      2    nan      3   Group1   A
4       0      1      2    nan      4   Group2   C

[5 rows x 7 columns] total bytes: 225.0 B
Create an AccumTable
Pass two Categorical objects to create the row and column groups for the AccumTable:
>>> at = rt.AccumTable(ds.Groups, ds.Letters)
>>> at
Inner Tables: []
Margin Columns: []
Footer Rows: []
The AccumTable doesn’t yet hold any inner tables. Add a table using a reducing function. This example adds a table with values calculated by count():
>>> at["Count"] = at.count()
>>> at["Count"]
*Groups   A   B   C   Count
-------   -   -   -   -----
Group1    2   0   1       3
Group2    0   1   1       2
-------   -   -   -   -----
  Count   2   1   2       5

[2 rows x 5 columns] total bytes: 52.0 B
The AccumTable now holds the Count table:
>>> at
Inner Tables: ['Count']
Margin Columns: ['Count']
Footer Rows: ['Count']
Add more tables to the AccumTable using different reducing functions:
>>> at["Sum Ints"] = at.sum(ds.Ints)
>>> at["Mean Double"] = at.mean(ds.Ints * ds.Twos)
>>> at["Variance Ints"] = at.var(ds.Ints)
>>> at
Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints']
Margin Columns: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints']
Footer Rows: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints']
Generate a table with multiple summary rows and columns using gen(). Pass the name of the inner table that you want to include in the generated table:
>>> at.gen("Sum Ints")
*Groups            A      B      C   Sum Ints   Count   Mean Double   Variance Ints
-------------   ----   ----   ----   --------   -----   -----------   -------------
Group1             3      0      2          5       3          3.33            2.33
Group2             0      1      4          5       2          5.00            4.50
-------------   ----   ----   ----   --------   -----   -----------   -------------
     Sum Ints      3      1      6         10
        Count      2      1      2                  5
  Mean Double   3.00   2.00   6.00                             4.00
Variance Ints   4.50    nan   2.00                                             2.00

[2 rows x 8 columns] total bytes: 124.0 B
By default, all summary rows and columns appear in the generated table. Specify which summary rows and columns appear using set_footer_rows() and set_margin_columns():
>>> at.set_footer_rows(["Count", "Sum Ints"])
>>> at.set_margin_columns(["Variance Ints"])
>>> at
Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints']
Margin Columns: ['Variance Ints']
Footer Rows: ['Count', 'Sum Ints']
Generate the table with the specified summary rows and columns:
>>> at.gen("Sum Ints")
*Groups    A   B   C   Sum Ints   Variance Ints
--------   -   -   -   --------   -------------
Group1     3   0   2          5            2.33
Group2     0   1   4          5            4.50
--------   -   -   -   --------   -------------
Sum Ints   3   1   6         10
   Count   2   1   2

[2 rows x 6 columns] total bytes: 92.0 B
- __getitem__(index)[source]
Return the inner table, footer row, and margin column corresponding to index.
- Parameters:
index (str) – Name of the inner table, footer row, and margin column to return.
- Returns:
The inner table, footer row, and margin column corresponding to index.
- Return type:
- Raises:
IndexError – If index is not a string.
- __repr__()[source]
Return a string representation of the AccumTable.
- Returns:
The AccumTable as a string.
- Return type:
- __setitem__(name, ds)[source]
Add an inner table, corresponding footer row, and corresponding margin column to the AccumTable.
- Parameters:
- Raises:
IndexError – If name is not a string.
ValueError – If ds is not a Dataset.
- gen(table_name=None, format=None, ref_table=None, remove_blanks=True)[source]
Generate a table with one inner table and multiple footer rows and margin columns from an AccumTable.
- Parameters:
table_name (str, optional) – The name of the inner table that appears in the generated table. If not provided, the last-created inner table appears in the generated table.
format (dict of {str : func}, optional) – (Not yet implemented) A dictionary used to specify the formatting of each cell in the table. Each key is a formatting type, such as “bold”, “color”, and “background”, and each value is a function that applies conditional formatting to each table cell. For example, format={"bold": lambda v: v > 0} applies bold formatting to all cells with positive values.
ref_table (str or Dataset, optional) – (Not yet implemented) The name of an AccumTable or a Dataset of the same shape that acts as a format reference for the generated table.
remove_blanks (bool, default True) – Controls whether rows and columns consisting entirely of 0 and nan are removed from the generated table.
- Returns:
A table generated from the AccumTable, including footer rows and margin columns.
- Return type:
See also
rt_accumtable.AccumTable
The class containing gen().
rt_accumtable.AccumTable.set_footer_rows()
The method that sets the footer rows for the rt_accumtable.AccumTable and its generated tables.
rt_accumtable.AccumTable.set_margin_columns()
The method that sets the margin columns for the rt_accumtable.AccumTable and its generated tables.
Examples
Construct a
Dataset
for the following examples:>>> ds = rt.Dataset() >>> ds.Zeros = [0, 0, 0, 0, 0] >>> ds.Ones = [1, 1, 1, 1, 1] >>> ds.Twos = [2, 2, 2, 2, 2] >>> ds.Nans = [rt.nan, rt.nan, rt.nan, rt.nan, rt.nan] >>> ds.Ints = [0, 1, 2, 3, 4] >>> ds.Groups = rt.Cat(["Group1", "Group2", "Group1", "Group1", "Group2"]) >>> ds.Letters = rt.Cat(["A", "B", "C", "A", "C"]) >>> ds # Zeros Ones Twos Nans Ints Groups Letters - ----- ---- ---- ---- ---- ------ ------- 0 0 1 2 nan 0 Group1 A 1 0 1 2 nan 1 Group2 B 2 0 1 2 nan 2 Group1 C 3 0 1 2 nan 3 Group1 A 4 0 1 2 nan 4 Group2 C [5 rows x 7 columns] total bytes: 225.0 B
Construct an
AccumTable
from that data:>>> at = rt.AccumTable(ds.Groups, ds.Letters) >>> at["Count"] = at.count() >>> at["Sum Ints"] = at.sum(ds.Ints) >>> at["Mean Double"] = at.mean(ds.Ints * ds.Twos) >>> at Inner Tables: ['Count', 'Sum Ints', 'Mean Double'] Margin Columns: ['Count', 'Sum Ints', 'Mean Double'] Footer Rows: ['Count', 'Sum Ints', 'Mean Double']
Generate a table from this
AccumTable
using default parameter values:>>> at.gen() *Groups A B C Mean Double Count Sum Ints ----------- ---- ---- ---- ----------- ----- -------- Group1 3.00 nan 4.00 3.33 3 5 Group2 nan 2.00 8.00 5.00 2 5 ----------- ---- ---- ---- ----------- ----- -------- Mean Double 3.00 2.00 6.00 4.00 Count 2 1 2 5 Sum Ints 3 1 6 10 [2 rows x 7 columns] total bytes: 108.0 B
Without specifying table_name, the last-created inner table, Mean Double, appears as the generated inner table and the first footer row and margin column.
Pass an inner table name to generate a specific table:
>>> at.gen("Sum Ints") *Groups A B C Sum Ints Count Mean Double ----------- ---- ---- ---- -------- ----- ----------- Group1 3 0 2 5 3 3.33 Group2 0 1 4 5 2 5.00 ----------- ---- ---- ---- -------- ----- ----------- Sum Ints 3 1 6 10 Count 2 1 2 5 Mean Double 3.00 2.00 6.00 4.00 [2 rows x 7 columns] total bytes: 108.0 B
- set_footer_rows(rows)[source]
Specify the footer rows that appear in a generated AccumTable.
Pass a list of inner table names to set the corresponding footer rows for the AccumTable instance. The footer rows contain values calculated by a reducing function and grouped by the AccumTable columns.
When you generate a table using gen(), the footer row corresponding to the inner table appears first. Then, the remaining footer rows appear in the order you passed them to set_footer_rows().
Passing an empty list removes all footer rows from the generated table, except for the footer row corresponding to the inner table.
- Parameters:
rows (list) – A list of inner table names, in the order you want the footer rows to appear in a generated table.
See also
rt_accumtable.AccumTable
The class containing set_footer_rows().
rt_accumtable.AccumTable.gen()
The method that generates a table from an rt_accumtable.AccumTable.
rt_accumtable.AccumTable.set_margin_columns()
The method that sets the margin columns for the rt_accumtable.AccumTable and its generated tables.
Examples
Construct a
Dataset
for the following examples:>>> ds = rt.Dataset() >>> ds.Zeros = [0, 0, 0, 0, 0] >>> ds.Ones = [1, 1, 1, 1, 1] >>> ds.Twos = [2, 2, 2, 2, 2] >>> ds.Nans = [rt.nan, rt.nan, rt.nan, rt.nan, rt.nan] >>> ds.Ints = [0, 1, 2, 3, 4] >>> ds.Groups = rt.Cat(["Group1", "Group2", "Group1", "Group1", "Group2"]) >>> ds.Letters = rt.Cat(["A", "B", "C", "A", "C"]) >>> ds # Zeros Ones Twos Nans Ints Groups Letters - ----- ---- ---- ---- ---- ------ ------- 0 0 1 2 nan 0 Group1 A 1 0 1 2 nan 1 Group2 B 2 0 1 2 nan 2 Group1 C 3 0 1 2 nan 3 Group1 A 4 0 1 2 nan 4 Group2 C [5 rows x 7 columns] total bytes: 225.0 B
Construct an
AccumTable
from that data:>>> at = rt.AccumTable(ds.Groups, ds.Letters) >>> at["Count"] = at.count() >>> at["Sum Ints"] = at.sum(ds.Ints) >>> at["Mean Double"] = at.mean(ds.Ints * ds.Twos) >>> at["Variance Ints"] = at.var(ds.Ints) >>> at Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Margin Columns: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Footer Rows: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints']
When you generate a table from the
AccumTable
without setting the footer rows, all footer rows appear in the generated table:>>> at.gen("Sum Ints") *Groups A B C Sum Ints Count Mean Double Variance Ints ------------- ---- ---- ---- -------- ----- ----------- ------------- Group1 3 0 2 5 3 3.33 2.33 Group2 0 1 4 5 2 5.00 4.50 ------------- ---- ---- ---- -------- ----- ----------- ------------- Sum Ints 3 1 6 10 Count 2 1 2 5 Mean Double 3.00 2.00 6.00 4.00 Variance Ints 4.50 nan 2.00 2.00 [2 rows x 8 columns] total bytes: 124.0 B
Pass a list of inner table names from the
AccumTable
to set the corresponding footer rows in a generated table:>>> at.set_footer_rows(["Variance Ints", "Count"]) >>> at Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Margin Columns: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Footer Rows: ['Variance Ints', 'Count']
Generate a table to see the new set of footer rows:
>>> at.gen("Sum Ints") *Groups A B C Sum Ints Count Mean Double Variance Ints ------------- ---- --- ---- -------- ----- ----------- ------------- Group1 3 0 2 5 3 3.33 2.33 Group2 0 1 4 5 2 5.00 4.50 ------------- ---- --- ---- -------- ----- ----------- ------------- Sum Ints 3 1 6 10 Variance Ints 4.50 nan 2.00 2.00 Count 2 1 2 5 [2 rows x 8 columns] total bytes: 124.0 B
Pass an empty list to remove all footer rows from the generated table, except for the footer row corresponding to the inner table. In this example, the Sum Ints footer row remains in the generated table:
>>> at.set_footer_rows([]) >>> at Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Margin Columns: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Footer Rows: [] >>> at.gen("Sum Ints") *Groups A B C Sum Ints Count Mean Double Variance Ints -------- - - - -------- ----- ----------- ------------- Group1 3 0 2 5 3 3.33 2.33 Group2 0 1 4 5 2 5.00 4.50 -------- - - - -------- ----- ----------- ------------- Sum Ints 3 1 6 10 [2 rows x 8 columns] total bytes: 124.0 B
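The ordering rule for footer rows (the footer row of the inner table first, then the remaining footer rows in the order passed to set_footer_rows()) can be written as a small function. This is only a sketch of the documented behavior, not riptable's implementation; footer_order is a hypothetical helper name. The same rule is described for margin columns and set_margin_columns().

```python
def footer_order(table_name, footer_rows):
    """Hypothetical helper: order of footer rows in a generated table.

    The footer row matching the inner table comes first; the remaining
    footer rows follow in the order they were passed.
    """
    return [table_name] + [name for name in footer_rows if name != table_name]


# Mirrors the calls shown in the examples on this page.
print(footer_order("Sum Ints", ["Variance Ints", "Count"]))
print(footer_order("Sum Ints", ["Count", "Sum Ints"]))
print(footer_order("Sum Ints", []))
```

The three calls reproduce the footer row labels shown in the generated tables above: Sum Ints, Variance Ints, Count; then Sum Ints, Count; then Sum Ints alone.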
- set_margin_columns(cols)[source]
Specify the margin columns that appear in a generated AccumTable.
Pass a list of inner table names to set the corresponding margin columns for the AccumTable instance. The margin columns contain values calculated by a reducing function and grouped by the AccumTable rows.
When you generate a table using gen(), the margin column corresponding to the inner table appears first. Then, the remaining margin columns appear in the order you passed them to set_margin_columns().
Passing an empty list removes all margin columns from the generated table, except for the margin column corresponding to the inner table.
- Parameters:
cols (list of str) – A list of inner table names, in the order you want the margin columns to appear in a generated table.
See also
rt_accumtable.AccumTable
The class containing set_margin_columns().
rt_accumtable.AccumTable.gen()
The method that generates a table from an rt_accumtable.AccumTable.
rt_accumtable.AccumTable.set_footer_rows()
The method that sets the footer rows for the rt_accumtable.AccumTable and its generated tables.
Examples
Construct a
Dataset
for the following examples:>>> ds = rt.Dataset() >>> ds.Zeros = [0, 0, 0, 0, 0] >>> ds.Ones = [1, 1, 1, 1, 1] >>> ds.Twos = [2, 2, 2, 2, 2] >>> ds.Nans = [rt.nan, rt.nan, rt.nan, rt.nan, rt.nan] >>> ds.Ints = [0, 1, 2, 3, 4] >>> ds.Groups = rt.Cat(["Group1", "Group2", "Group1", "Group1", "Group2"]) >>> ds.Letters = rt.Cat(["A", "B", "C", "A", "C"]) >>> ds # Zeros Ones Twos Nans Ints Groups Letters - ----- ---- ---- ---- ---- ------ ------- 0 0 1 2 nan 0 Group1 A 1 0 1 2 nan 1 Group2 B 2 0 1 2 nan 2 Group1 C 3 0 1 2 nan 3 Group1 A 4 0 1 2 nan 4 Group2 C [5 rows x 7 columns] total bytes: 225.0 B
Construct an
AccumTable
from that data:>>> at = rt.AccumTable(ds.Groups, ds.Letters) >>> at["Count"] = at.count() >>> at["Sum Ints"] = at.sum(ds.Ints) >>> at["Mean Double"] = at.mean(ds.Ints * ds.Twos) >>> at["Variance Ints"] = at.var(ds.Ints) >>> at Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Margin Columns: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Footer Rows: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints']
When you generate a table from the
AccumTable
without setting the margin columns, all margin columns appear in the generated table:>>> at.gen("Sum Ints") *Groups A B C Sum Ints Count Mean Double Variance Ints ------------- ---- ---- ---- -------- ----- ----------- ------------- Group1 3 0 2 5 3 3.33 2.33 Group2 0 1 4 5 2 5.00 4.50 ------------- ---- ---- ---- -------- ----- ----------- ------------- Sum Ints 3 1 6 10 Count 2 1 2 5 Mean Double 3.00 2.00 6.00 4.00 Variance Ints 4.50 nan 2.00 2.00 [2 rows x 8 columns] total bytes: 124.0 B
Pass a list of inner table names from the
AccumTable
to set the corresponding margin columns in a generated table:>>> at.set_margin_columns(["Variance Ints", "Count"]) >>> at Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Margin Columns: ['Variance Ints', 'Count'] Footer Rows: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints']
Generate a table to see the new set of margin columns:
>>> at.gen("Sum Ints") *Groups A B C Sum Ints Variance Ints Count ------------- ---- ---- ---- -------- ------------- ----- Group1 3 0 2 5 2.33 3 Group2 0 1 4 5 4.50 2 ------------- ---- ---- ---- -------- ------------- ----- Sum Ints 3 1 6 10 Count 2 1 2 5 Mean Double 3.00 2.00 6.00 Variance Ints 4.50 nan 2.00 2.00 [2 rows x 7 columns] total bytes: 108.0 B
Pass an empty list to remove all margin columns from the generated table, except for the margin column corresponding to the inner table. In this example, the Sum Ints margin column remains in the generated table:
>>> at.set_margin_columns([]) >>> at Inner Tables: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] Margin Columns: [] Footer Rows: ['Count', 'Sum Ints', 'Mean Double', 'Variance Ints'] >>> at.gen("Sum Ints") *Groups A B C Sum Ints ------------- ---- ---- ---- -------- Group1 3 0 2 5 Group2 0 1 4 5 ------------- ---- ---- ---- -------- Sum Ints 3 1 6 10 Count 2 1 2 Mean Double 3.00 2.00 6.00 Variance Ints 4.50 nan 2.00 [2 rows x 5 columns] total bytes: 76.0 B
- riptable.rt_accumtable.accum_cols(cat, val_list, name_list=None, filt_list=None, func_list='nansum', remove_blanks=False)[source]
Apply reducing functions to multiple arrays that are grouped by a Categorical.
The returned Dataset contains values calculated by a reducing function for each Categorical group from each of the arrays in val_list. It also contains the calculated value for each of the original arrays in the Total row.
accum_cols() supports only reducing functions that take an array as a parameter. For example, count() isn’t valid, as it doesn’t accept an array as an input argument. For a list of reducing functions, see Reducing Functions Supported by Categoricals.
- Parameters:
cat (Categorical) – A Categorical that specifies the groups for reducing the val_list arrays.
val_list (array or list of arrays) –
Array or list of arrays that func_list is applied to. accum_cols() returns an array for each element in val_list.
If an element of val_list is itself a two-element list of two arrays, accum_cols() calculates a ratio between the values calculated by a reducing function for the two arrays. accum_ratio() performs this calculation using cat, the two arrays, the respective filter, and the respective reducing function as arguments.
If the second element of the two-element list is "p" or "P", accum_cols() calculates a ratio displayed as a percentage between the individual values of a table calculated with a reducing function and the calculated value of the entire AccumTable. accum_ratiop() performs this calculation using cat, the first element of the two-element list, the respective filter, and the respective reducing function as arguments.
name_list (list, optional) – List of column names in the returned Dataset. If not provided, the returned columns have names colN.
filt_list (array of bool or list of array of bool, optional) – Either a filter array that applies to all arrays in val_list or a list of filters, where each filter applies to the respective array in val_list. Each filter must be the same length as the arrays in val_list.
func_list (str or list of str, default "nansum") –
Either a string of the name of a reducing function (for example, "sum" or "nanmean") or a list of strings of reducing function names. Passing a string applies the single reducing function to all arrays in val_list. Passing a list of strings applies each reducing function to the respective array in val_list. Note the following two exceptions:
If you pass more functions than there are arrays in val_list, the extra functions without respective arrays in val_list are ignored.
If you pass fewer functions than arrays, the returned Dataset contains only the same number of columns as there are functions in func_list.
remove_blanks (bool, default False) – If True, removes rows and columns that consist entirely of 0 or nan from the returned Dataset.
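The two exceptions noted for func_list behave like pairing two sequences and stopping at the shorter one. The sketch below illustrates that pairing with Python's built-in zip(); whether accum_cols() uses zip() internally is an assumption, and pair_arrays_with_functions is a hypothetical helper name.

```python
def pair_arrays_with_functions(val_names, func_list):
    """Hypothetical helper: pair each array name with a reducing function name."""
    if isinstance(func_list, str):
        # A single function name applies to every array.
        func_list = [func_list] * len(val_names)
    # zip() stops at the shorter input, so extra functions are ignored and
    # missing functions shorten the result.
    return list(zip(val_names, func_list))


arrays = ["Zeros", "Ones", "Twos"]
print(pair_arrays_with_functions(arrays, "sum"))
print(pair_arrays_with_functions(arrays, ["sum", "mean", "var", "nansum"]))
print(pair_arrays_with_functions(arrays, ["sum", "mean"]))
```

The second call drops the extra "nansum" function, and the third call yields only two pairs, matching the two-column result described for a func_list shorter than val_list.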
- Returns:
A table of the values calculated by the reducing functions for each element of val_list.
- Return type:
See also
rt_accum2.Accum2
The parent class for AccumTable.
rt_accumtable.AccumTable
A wrapper on Accum2 that enables the creation of tables that combine the results of multiple tables generated from the Accum2 object.
rt_categorical.Categorical
A class that efficiently stores an array of repeated strings and is used for groupby operations.
rt_groupbyops.GroupByOps
A class that holds the reducing functions used by accum_cols().
Examples
Construct a
Dataset
for the following examples:>>> ds = rt.Dataset() >>> ds.Zeros = [0, 0, 0, 0, 0] >>> ds.Ones = [1, 1, 1, 1, 1] >>> ds.Twos = [2, 2, 2, 2, 2] >>> ds.Nans = [rt.nan, rt.nan, rt.nan, rt.nan, rt.nan] >>> ds.Ints = [0, 1, 2, 3, 4] >>> ds.Groups = ["Group1", "Group2", "Group1", "Group1", "Group2"] >>> ds.Groups = rt.Cat(ds.Groups) >>> ds # Zeros Ones Twos Nans Ints Groups - ----- ---- ---- ---- ---- ------ 0 0 1 2 nan 0 Group1 1 0 1 2 nan 1 Group2 2 0 1 2 nan 2 Group1 3 0 1 2 nan 3 Group1 4 0 1 2 nan 4 Group2 [5 rows x 6 columns] total bytes: 217.0 B
Apply one reducing function to all arrays
Pass a single function name as a string to
func_list
. This example applies thesum()
reducing function to all arrays inval_list
:>>> rt.accum_cols(cat=ds.Groups, ... val_list=[ds.Zeros, ds.Ones, ds.Twos, ds.Nans, ds.Ints], ... func_list="sum") *Groups col0 col1 col2 col3 col4 ------- ---- ---- ---- ---- ---- Group1 0 3 6 nan 5 Group2 0 2 4 nan 5 ------- ---- ---- ---- ---- ---- Total 0 5 10 nan 10 [2 rows x 6 columns] total bytes: 92.0 B
Without passing a name_list to accum_cols(), the default column names appear in the returned table.
Apply a different reducing function to each array
Pass a list of function names as strings to
func_list
. This example applies a respective function infunc_list
to each of the arrays inval_list
:>>> rt.accum_cols(cat=ds.Groups, ... val_list=[ds.Zeros, ds.Ones, ds.Twos, ds.Nans, ds.Ints], ... name_list=["Zeros sum", "Ones mean", "Twos var", "NaNs nansum", "Ints mean"], ... func_list=["sum", "mean", "var", "nansum", "mean"]) *Groups Zeros sum Ones mean Twos var NaNs nansum Ints mean ------- --------- --------- -------- ----------- --------- Group1 0 1.00 0.00 0.00 1.67 Group2 0 1.00 0.00 0.00 2.50 ------- --------- --------- -------- ----------- --------- Total 0 1.00 0.00 0.00 2.00 [2 rows x 6 columns] total bytes: 92.0 B
Include ratio arrays
Pass a list of two arrays to
val_list
to return the ratio of the values calculated bysum()
for the two arrays:>>> rt.accum_cols(cat=ds.Groups, ... val_list=[ds.Ints, ds.Ones, [ds.Ints, ds.Ones]], ... name_list=["Ints sum", "Ones sum", "Ints:Ones sum ratio"], ... func_list="sum") *Groups Ints sum Ones sum Ints:Ones sum ratio ------- -------- -------- ------------------- Group1 5 3 1.67 Group2 5 2 2.50 ------- -------- -------- ------------------- Total 10 5 2.00 [2 rows x 4 columns] total bytes: 60.0 B
The values returned for the two-element list in
val_list
are ratios between the values calculated bysum()
for Ints as the numerator and for Ones as the denominator.accum_cols()
usesaccum_ratio()
to calculate this ratio. In the previous example,accum_ratio()
is passed the following arguments:>>> ints_ones_ratio = rt.accum_ratio(cat1=ds.Groups, ... cat2=rt.Categorical(np.full(ds.Groups.shape[0], 1, dtype=np.int8), ["NotGrouped"]), ... val1=ds.Ints, ... val2=ds.Ones, ... func1="sum", ... func2="sum", ... remove_blanks=False) >>> ints_ones_ratio["NotGrouped"] FastArray([1.66666667, 2.5 ])
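The ratio values can be checked by hand. The following pure-Python sketch reproduces the arithmetic behind the Ints:Ones sum ratio column, using plain dictionaries instead of riptable objects; group_sums is a hypothetical helper name.

```python
groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
ints = [0, 1, 2, 3, 4]
ones = [1, 1, 1, 1, 1]


def group_sums(keys, values):
    """Hypothetical helper: sum `values` per group key."""
    sums = {}
    for key, value in zip(keys, values):
        sums[key] = sums.get(key, 0) + value
    return sums


numer = group_sums(groups, ints)  # sum of Ints per group
denom = group_sums(groups, ones)  # sum of Ones per group
ratio = {group: numer[group] / denom[group] for group in numer}
total_ratio = sum(ints) / sum(ones)  # value in the Total row

print(ratio)
print(total_ratio)
```

Group1 is 5 / 3 and Group2 is 5 / 2, which round to the 1.67 and 2.50 displayed in the table, and the Total row is 10 / 5.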
Include percent arrays
Pass two-element lists with an array and
"p"
toval_list
to return the ratio of the values calculated bysum()
for the grouped array values compared to the total value for the array, displayed as a percent:>>> rt.accum_cols(cat=ds.Groups, ... val_list=[ds.Ones, ds.Ints, [ds.Ones, "p"], [ds.Ints, "p"]], ... name_list=["Ones sum", "Ints sum", "Ones percent", "Ints percent"], ... func_list="sum") *Groups Ones sum Ints sum Ones percent Ints percent ------- -------- -------- ------------ ------------ Group1 3 5 60.00 50.00 Group2 2 5 40.00 50.00 ------- -------- -------- ------------ ------------ Total 5 10 100.00 100.00 [2 rows x 5 columns] total bytes: 76.0 B
The values returned for the two-element lists are the percent ratios. Group1 of the Ones sum column is 3 and the Total for Ones sum is 5. The ratio of these two numbers as a percent is 60.00, as displayed in the Ones percent column.
accum_cols()
usesaccum_ratiop()
to calculate this percent ratio. In the previous example,accum_ratiop()
is passed the following arguments to calculate the Ones percent column:>>> ones_ratiop = rt.accum_ratiop(cat1=ds.Groups, ... cat2=rt.Categorical(np.full(ds.Groups.shape[0], 1, dtype=np.int8), ["NotGrouped"]), ... val=ds.Ones, ... filter=None, ... func="sum", ... norm_by="T", ... include_total=False, ... remove_blanks=False) >>> ones_ratiop["NotGrouped"] FastArray([60., 40.])
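The percent columns follow the same pattern: each group value is divided by the total for the array and multiplied by 100. A minimal pure-Python sketch of that arithmetic, assuming the sum reducing function; percent_of_total is a hypothetical helper name, not a riptable function.

```python
groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
ones = [1, 1, 1, 1, 1]
ints = [0, 1, 2, 3, 4]


def percent_of_total(keys, values):
    """Hypothetical helper: each group's sum as a percentage of the overall sum."""
    sums = {}
    for key, value in zip(keys, values):
        sums[key] = sums.get(key, 0) + value
    total = sum(values)
    return {key: 100 * group_sum / total for key, group_sum in sums.items()}


ones_percent = percent_of_total(groups, ones)
ints_percent = percent_of_total(groups, ints)
print(ones_percent)
print(ints_percent)
```

The results match the Ones percent and Ints percent columns: 60 and 40 for Ones, 50 and 50 for Ints.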
Filter all arrays with a single boolean mask
Pass an array of booleans to
filt_list
to filter all arrays inval_list
:>>> greater_3_filter = ds.Ints > 3 >>> rt.accum_cols(cat=ds.Groups, ... val_list=[ds.Zeros, ds.Ones, ds.Twos, ds.Nans, ds.Ints], ... name_list=["Zeros sum", "Ones sum", "Twos sum", "NaNs sum", "Ints sum"], ... filt_list=greater_3_filter, ... func_list="sum") *Groups Zeros sum Ones sum Twos sum NaNs sum Ints sum ------- --------- -------- -------- -------- -------- Group1 0 0 0 0.00 0 Group2 0 1 2 nan 4 ------- --------- -------- -------- -------- -------- Total 0 1 2 nan 4 [2 rows x 6 columns] total bytes: 92.0 B
Filter each array with a different boolean mask
Pass a list of boolean arrays to
filt_list
to filter the respective arrays inval_list
:>>> even_zeros = ds.Zeros % 2 == 0 >>> even_ones = ds.Ones % 2 == 0 >>> even_twos = ds.Twos % 2 == 0 >>> even_nans = ds.Nans % 2 == 0 >>> even_ints = ds.Ints % 2 == 0 >>> rt.accum_cols(cat=ds.Groups, ... val_list=[ds.Zeros, ds.Ones, ds.Twos, ds.Nans, ds.Ints], ... name_list=["Zeros sum", "Ones sum", "Twos sum", "NaNs sum", "Ints sum"], ... filt_list=[even_zeros, even_ones, even_twos, even_nans, even_ints], ... func_list="sum") *Groups Zeros sum Ones sum Twos sum NaNs sum Ints sum ------- --------- -------- -------- -------- -------- Group1 0 0 6 0.00 2 Group2 0 0 4 0.00 4 ------- --------- -------- -------- -------- -------- Total 0 0 10 0.00 6 [2 rows x 6 columns] total bytes: 92.0 B
Remove blank values
Pass
True
toremove_blanks
to remove all rows and columns from the returnedDataset
that consist entirely of0
ornan
:>>> rt.accum_cols(cat=ds.Groups, ... val_list=[ds.Zeros, ds.Ones, ds.Twos, ds.Nans, ds.Ints], ... name_list=["Zeros sum", "Ones sum", "Twos sum", "NaNs sum", "Ints sum"], ... func_list="sum", ... remove_blanks=True) *Groups Ones sum Twos sum Ints sum ------- -------- -------- -------- Group1 3 6 5 Group2 2 4 5 ------- -------- -------- -------- Total 5 10 10 [2 rows x 4 columns] total bytes: 60.0 B
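The effect of remove_blanks on columns can be sketched in pure Python. The helper names is_blank and remove_blank_columns are hypothetical, and the table is a plain dictionary of column name to values rather than a Dataset. Only columns are handled here; rows would be treated analogously.

```python
import math


def is_blank(value):
    """Return True for 0 and nan, the two values treated as blank."""
    return value == 0 or (isinstance(value, float) and math.isnan(value))


def remove_blank_columns(table):
    """Hypothetical helper: drop columns that consist entirely of 0 or nan."""
    return {
        name: column
        for name, column in table.items()
        if not all(is_blank(value) for value in column)
    }


# Per-group sums taken from the accum_cols() examples on this page.
table = {
    "Zeros sum": [0, 0],
    "Ones sum": [3, 2],
    "Twos sum": [6, 4],
    "NaNs sum": [math.nan, math.nan],
    "Ints sum": [5, 5],
}
kept = remove_blank_columns(table)
print(list(kept))
```

The Zeros sum and NaNs sum columns are dropped, leaving the same three columns as the accum_cols() call with remove_blanks=True.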
- riptable.rt_accumtable.accum_ratio(cat1, cat2=None, val1=None, val2=None, filt1=None, filt2=None, func1='nansum', func2=None, return_table=False, include_numer=False, include_denom=True, remove_blanks=False)[source]
Generate a Dataset of ratios between values calculated by reducing functions for two arrays.
accum_ratio() performs the following actions:
1. Creates an AccumTable using the groups of cat1 and cat2.
2. Aggregates the data from the val1 and val2 arrays according to the func1 and func2 reducing functions.
3. Calculates a ratio between the values calculated by the reducing functions for val1 and val2.
4. Returns either a Dataset or an AccumTable, depending on the value of return_table.
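The actions listed above can be approximated in pure Python to make the arithmetic concrete. This sketch assumes sum as the reducing function for both arrays (the data has no NaNs, so it agrees with nansum here) and maps division by zero to nan; both choices, and the helper names grouped_sums, safe_ratio, and ratio_table, are assumptions of the sketch rather than riptable's implementation.

```python
import math
from collections import defaultdict

groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
letters = ["A", "B", "C", "A", "C"]
ints = [0, 1, 2, 3, 4]
ones = [1, 1, 1, 1, 1]


def grouped_sums(rows, cols, values, mask=None):
    """Sum `values` per (row group, column group), honoring an optional mask."""
    cells = defaultdict(float)
    for i, (r, c, v) in enumerate(zip(rows, cols, values)):
        if mask is None or mask[i]:
            cells[(r, c)] += v
    return cells


def safe_ratio(numer, denom):
    """Divide, returning nan when the denominator is 0 (an assumption)."""
    return numer / denom if denom != 0 else math.nan


def ratio_table(rows, cols, val1, val2, filt1=None, filt2=None):
    """Hypothetical helper: inner table, margin column, and footer row of ratios."""
    if filt2 is None:
        filt2 = filt1  # the denominator filter defaults to the numerator filter
    numer = grouped_sums(rows, cols, val1, filt1)
    denom = grouped_sums(rows, cols, val2, filt2)
    row_keys, col_keys = sorted(set(rows)), sorted(set(cols))
    inner = {(r, c): safe_ratio(numer[(r, c)], denom[(r, c)])
             for r in row_keys for c in col_keys}
    margin = {r: safe_ratio(sum(numer[(r, c)] for c in col_keys),
                            sum(denom[(r, c)] for c in col_keys))
              for r in row_keys}
    footer = {c: safe_ratio(sum(numer[(r, c)] for r in row_keys),
                            sum(denom[(r, c)] for r in row_keys))
              for c in col_keys}
    return inner, margin, footer


# Unfiltered ratios of Ints to Ones.
inner, margin, footer = ratio_table(groups, letters, ints, ones)

# Filtered ratios: numerator keeps Letters == "C", denominator keeps even Ints.
c_filter = [letter == "C" for letter in letters]
even_filter = [value % 2 == 0 for value in ints]
inner_f, margin_f, footer_f = ratio_table(groups, letters, ints, ones,
                                          c_filter, even_filter)
print(inner[("Group1", "A")], margin["Group2"], footer["C"])
print(inner_f[("Group1", "A")], margin_f["Group1"], footer_f["C"])
```

Note that the margin and footer ratios are computed from the summed numerators and denominators, not by averaging the inner ratios.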
By default, accum_ratio() returns a Dataset with a "Ratio" inner table. If return_table is set to True, the function returns an AccumTable, which can be converted to a Dataset using the gen() method. Generating a Dataset gives you more control over which inner table, footer rows, and margin columns are included in the result.
accum_ratio() supports only reducing functions that take an array as a parameter. For example, count() isn’t valid, as it doesn’t accept an array as an input argument. For a list of reducing functions, see Reducing Functions Supported by Categoricals.
- Parameters:
cat1 (Categorical) – The row groups used to accumulate the values.
cat2 (Categorical, optional) – The column groups used to accumulate the values. If not provided, accum_ratio() uses a Categorical with a single group, "NotGrouped".
val1 (array) – The numerator for the calculated ratio.
val2 (array) – The denominator for the calculated ratio.
filt1 (array of bool, optional) – Boolean filter for the val1 array. The filter array must be the same length as val1 and val2.
filt2 (array of bool, optional) – Boolean filter for the val2 array. The filter array must be the same length as val1 and val2. If not provided, the filter is the same as filt1.
func1 (str, default "nansum") – String of the name of the reducing function (for example, "sum" or "nanmean") used to reduce val1 before calculating the ratio.
func2 (str, optional) – String of the name of the reducing function (for example, "sum" or "nanmean") used to reduce val2 before calculating the ratio. If not provided, func1 is applied to val2.
return_table (bool, default False) – If False (the default), returns a Dataset with the calculated ratio. If set to True, returns an AccumTable from which you can generate a Dataset. The returned AccumTable has "Numer", "Denom", and "Ratio" inner tables, footer rows, and margin columns.
include_numer (bool, default False) – If set to True, includes the values calculated by the reducing function for val1 as a row and column in the returned table. Ignored if return_table is True.
include_denom (bool, default True) – If True (the default), includes the values calculated by the reducing function for val2 as a row and column in the returned table. Ignored if return_table is True.
remove_blanks (bool, default False) – If set to True, removes rows and columns that consist entirely of 0 or nan from the returned table.
- Returns:
Either a Dataset with a view of the calculated ratio, or an AccumTable, depending on return_table.
- Return type:
See also
rt_accum2.Accum2
The parent class for AccumTable.
rt_accumtable.AccumTable
A wrapper on Accum2 that enables the creation of tables that combine the results of multiple tables generated from the Accum2 object.
rt_categorical.Categorical
A class that efficiently stores an array of repeated strings and is used for groupby operations.
rt_groupbyops.GroupByOps
A class that holds the reducing functions used by accum_ratio().
Examples
Construct a
Dataset
for the following examples:>>> ds = rt.Dataset() >>> ds.Zeros = [0, 0, 0, 0, 0] >>> ds.Ones = [1, 1, 1, 1, 1] >>> ds.Twos = [2, 2, 2, 2, 2] >>> ds.Nans = [rt.nan, rt.nan, rt.nan, rt.nan, rt.nan] >>> ds.Ints = [0, 1, 2, 3, 4] >>> ds.Groups = rt.Cat(["Group1", "Group2", "Group1", "Group1", "Group2"]) >>> ds.Letters = rt.Cat(["A", "B", "C", "A", "C"]) >>> ds # Zeros Ones Twos Nans Ints Groups Letters - ----- ---- ---- ---- ---- ------ ------- 0 0 1 2 nan 0 Group1 A 1 0 1 2 nan 1 Group2 B 2 0 1 2 nan 2 Group1 C 3 0 1 2 nan 3 Group1 A 4 0 1 2 nan 4 Group2 C [5 rows x 7 columns] total bytes: 225.0 B
Calculate a ratio between the values calculated by a reducing function
This example returns a Dataset that holds ratios between the values calculated by the default reducing function (nansum()) for Ints and Ones.

>>> rt.accum_ratio(cat1=ds.Groups,
...                cat2=ds.Letters,
...                val1=ds.Ints,
...                val2=ds.Ones)
*Groups      A      B      C   Ratio   Denom
-------   ----   ----   ----   -----   -----
Group1    1.50    nan   2.00    1.67       3
Group2     nan   1.00   4.00    2.50       2
-------   ----   ----   ----   -----   -----
  Ratio   1.50   1.00   3.00    2.00
  Denom      2      1      2       5

[2 rows x 6 columns] total bytes: 92.0 B
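To make the arithmetic behind this table explicit, the following is a minimal plain-Python sketch that recomputes the same ratios without riptable. It is not how accum_ratio() is implemented; the helper names `group_nansum` and `ratio` are hypothetical and exist only for illustration.

```python
import math
from collections import defaultdict

# Data copied from the example Dataset above.
groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
letters = ["A", "B", "C", "A", "C"]
ints = [0, 1, 2, 3, 4]
ones = [1, 1, 1, 1, 1]


def group_nansum(keys, values):
    """Sum values per key, skipping NaN (a hypothetical stand-in for nansum)."""
    totals = defaultdict(float)
    for key, value in zip(keys, values):
        if not math.isnan(value):
            totals[key] += value
    return totals


def ratio(numer, denom):
    """Divide, returning NaN where the denominator is zero."""
    return numer / denom if denom else math.nan


cells = list(zip(groups, letters))
numer = group_nansum(cells, ints)
denom = group_nansum(cells, ones)

# Inner table: one ratio per (row group, column group) cell.
inner = {
    (g, l): ratio(numer.get((g, l), 0.0), denom.get((g, l), 0.0))
    for g in sorted(set(groups))
    for l in sorted(set(letters))
}

# Margin column: ratio of the per-row sums.
row_numer, row_denom = group_nansum(groups, ints), group_nansum(groups, ones)
margin = {g: ratio(row_numer[g], row_denom[g]) for g in row_numer}

# Footer row: ratio of the per-column sums.
col_numer, col_denom = group_nansum(letters, ints), group_nansum(letters, ones)
footer = {l: ratio(col_numer[l], col_denom[l]) for l in col_numer}

# Bottom-right corner: ratio of the grand sums.
corner = ratio(sum(ints), sum(ones))
```

Note that the margin and footer values are ratios of sums, not sums or means of the inner ratios. For example, the Group1 margin is 5 / 3, not the average of 1.50 and 2.00.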
Return an AccumTable
Pass True to return_table to return an AccumTable instead of a Dataset:

>>> returned_accumtable = rt.accum_ratio(cat1=ds.Groups,
...                                      cat2=ds.Letters,
...                                      val1=ds.Ints,
...                                      val2=ds.Ones,
...                                      func1="nansum",
...                                      return_table=True)
>>> returned_accumtable
Inner Tables: ['Numer', 'Denom', 'Ratio']
Margin Columns: ['Numer', 'Denom', 'Ratio']
Footer Rows: ['Numer', 'Denom', 'Ratio']
Use gen() to create a Dataset from the returned AccumTable:

>>> returned_accumtable.gen()
*Groups      A      B      C   Ratio   Numer   Denom
-------   ----   ----   ----   -----   -----   -----
Group1    1.50    nan   2.00    1.67       5       3
Group2     nan   1.00   4.00    2.50       5       2
-------   ----   ----   ----   -----   -----   -----
  Ratio   1.50   1.00   3.00    2.00
  Numer      3      1      6      10
  Denom      2      1      2       5

[2 rows x 7 columns] total bytes: 108.0 B
Filter the arrays before calculating ratios
Pass filters to filt1 and filt2 to filter val1 and val2 before reducing and ratio calculation:

>>> c_filter = ds.Letters == "C"
>>> even_filter = ds.Ints % 2 == 0
>>> rt.accum_ratio(cat1=ds.Groups,
...                cat2=ds.Letters,
...                val1=ds.Ints,
...                val2=ds.Ones,
...                func1="nansum",
...                filt1=c_filter,
...                filt2=even_filter)
*Groups      A     B      C   Ratio   Denom
-------   ----   ---   ----   -----   -----
Group1    0.00   nan   2.00    1.00       2
Group2     nan   nan   4.00    4.00       1
-------   ----   ---   ----   -----   -----
  Ratio   0.00   nan   3.00    2.00
  Denom      1     0      2       3

[2 rows x 6 columns] total bytes: 92.0 B
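The two filters act independently: filt1 masks val1 and filt2 masks val2 before each is reduced. The plain-Python sketch below reproduces the inner table of the filtered example under that reading. It does not use riptable, and the helper `filtered_sum` is a hypothetical name introduced for illustration.

```python
import math

# Data copied from the example Dataset above.
groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
letters = ["A", "B", "C", "A", "C"]
ints = [0, 1, 2, 3, 4]
ones = [1, 1, 1, 1, 1]

# Plain-list equivalents of the boolean filter arrays in the example.
c_filter = [letter == "C" for letter in letters]
even_filter = [value % 2 == 0 for value in ints]


def filtered_sum(values, mask, keys, key):
    """Sum the values whose key matches and whose mask entry is True."""
    return sum(v for v, m, k in zip(values, mask, keys) if m and k == key)


cells = list(zip(groups, letters))
ratios = {}
for cell in [(g, l) for g in ("Group1", "Group2") for l in ("A", "B", "C")]:
    numer = filtered_sum(ints, c_filter, cells, cell)     # filt1 masks val1
    denom = filtered_sum(ones, even_filter, cells, cell)  # filt2 masks val2
    # A cell with nothing left in the denominator has no defined ratio.
    ratios[cell] = numer / denom if denom else math.nan
```

This shows why Group1/A is 0.00 rather than nan: the numerator filter removes every "A" row, but the denominator filter still keeps one even row in that cell.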
Remove blank rows and columns
Pass True to remove_blanks to remove the rows and columns that consist entirely of 0 or nan. This example removes the blank rows and columns from the filtered Dataset:

>>> c_filter = ds.Letters == "C"
>>> even_filter = ds.Ints % 2 == 0
>>> rt.accum_ratio(cat1=ds.Groups,
...                cat2=ds.Letters,
...                val1=ds.Ints,
...                val2=ds.Ones,
...                func1="nansum",
...                filt1=c_filter,
...                filt2=even_filter,
...                remove_blanks=True)
*Groups      C   Ratio   Denom
-------   ----   -----   -----
Group1    2.00    1.00       2
Group2    4.00    4.00       1
-------   ----   -----   -----
  Ratio   3.00    2.00
  Denom      2       3

[2 rows x 4 columns] total bytes: 60.0 B
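As a rough illustration of the blank test described above, this plain-Python sketch drops the columns of the filtered inner table whose cells are all 0 or nan. The `is_blank` helper is a hypothetical name, and the sketch covers only columns; the same check applied across each row would handle rows.

```python
import math

# Inner table from the filtered example above, keyed by column, then by row.
table = {
    "A": {"Group1": 0.0, "Group2": math.nan},
    "B": {"Group1": math.nan, "Group2": math.nan},
    "C": {"Group1": 2.0, "Group2": 4.0},
}


def is_blank(value):
    """Treat 0 and NaN as blank."""
    return value == 0 or math.isnan(value)


# Keep a column only if at least one of its cells is not blank.
kept = {
    name: column
    for name, column in table.items()
    if not all(is_blank(v) for v in column.values())
}
```

Column A is removed even though it mixes 0.00 and nan, because every one of its cells counts as blank.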
Include non-ratio values calculated by reducing functions
Pass True to include_numer and include_denom to add summary rows and columns with the non-ratio values calculated by the reducing functions. Numer contains values for val1 calculated with func1. Denom contains values for val2 calculated with func2. This example doesn't include func2, so accum_ratio() uses func1 for val2.

>>> rt.accum_ratio(cat1=ds.Groups,
...                cat2=ds.Letters,
...                val1=ds.Ints,
...                val2=ds.Ones,
...                func1="nansum",
...                include_numer=True,
...                include_denom=True)
*Groups      A      B      C   Ratio   Numer   Denom
-------   ----   ----   ----   -----   -----   -----
Group1    1.50    nan   2.00    1.67       5       3
Group2     nan   1.00   4.00    2.50       5       2
-------   ----   ----   ----   -----   -----   -----
  Ratio   1.50   1.00   3.00    2.00
  Numer      3      1      6      10
  Denom      2      1      2       5

[2 rows x 7 columns] total bytes: 108.0 B
- riptable.rt_accumtable.accum_ratiop(cat1, cat2=None, val=None, filter=None, func='nansum', norm_by='T', include_total=True, remove_blanks=False, filt=None)[source]
Generate a Dataset of ratios, displayed as percentages, between the individual values of a table calculated with a reducing function and the value of the entire AccumTable, its rows, or its columns calculated with the same reducing function.
accum_ratiop() performs the following actions:
Creates an AccumTable using the groups of cat1 and cat2.
Aggregates the data from the val array according to the func reducing function.
Calculates a ratio as a percentage for each cell in the inner table, footer row, and margin column. The numerator of each ratio is the calculated value for the cell, and the denominator is the calculated value for that row, that column, or the table, depending on the value of norm_by.
Generates and returns a Dataset from the AccumTable with percentage values.
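The steps above can be sketched in plain Python for the default case, where func is "nansum" and norm_by is "T". This is only an illustration of the arithmetic, not riptable's implementation; `group_sum` and `as_percent` are hypothetical helper names, and the data is the Ints, Groups, and Letters columns used in the examples for this function.

```python
from collections import defaultdict

# Data matching the Ints, Groups, and Letters columns of the example Dataset.
groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
letters = ["A", "B", "C", "A", "C"]
ints = [0, 1, 2, 3, 4]


def group_sum(keys, values):
    """Sum values per key (stands in for the default "nansum" reduction)."""
    totals = defaultdict(float)
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)


# With norm_by="T", every ratio shares one denominator: the table total.
total = float(sum(ints))


def as_percent(sums):
    """Express each reduced value as a percentage of the table total."""
    return {key: 100.0 * value / total for key, value in sums.items()}


inner = as_percent(group_sum(zip(groups, letters), ints))  # inner table
margin = as_percent(group_sum(groups, ints))               # margin column
footer = as_percent(group_sum(letters, ints))              # footer row
```

Cells with no data, such as Group1/B, are simply absent from `inner` in this sketch, whereas the generated Dataset displays them as 0.00 for nansum.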
accum_ratiop() supports only reducing functions that take an array as a parameter. For example, count() isn't valid, as it doesn't accept an array as an input argument. For a list of reducing functions, see Reducing Functions Supported by Categoricals.
- Parameters:
cat1 (Categorical) – The row groups used for accumulation.
cat2 (Categorical, optional) – The column groups used for accumulation. If not provided, accum_ratiop() uses a Categorical with a single group, "NotGrouped".
val (array) – The array used as the numerator for the percentage calculation.
filter (array of bool, optional) – Filter for val. The filter array must be the same length as val. Replaces the deprecated filt parameter.
func (str, default "nansum") – Name of the reducing function used to reduce val before calculating the percentage.
norm_by ({"T", "C", "R"}, default "T") – Controls the values used as the denominator for the ratio calculation:
"T" selects the calculated value for the entire AccumTable.
"C" selects the calculated value for each column.
"R" selects the calculated value for each row.
include_total (bool, default True) – If True (the default), adds a summary row and column of values calculated by func to the returned Dataset.
remove_blanks (bool, default False) – If True, removes rows and columns that consist entirely of 0 or nan from the returned table.
filt (array of bool, optional) – Deprecated and replaced with filter.
- Returns:
A table of percent ratios for the array.
- Return type:
Dataset
See also
rt_accum2.Accum2
The parent class for AccumTable.
rt_accumtable.AccumTable
A wrapper on Accum2 that enables the creation of tables that combine the results of multiple tables generated from the Accum2 object.
rt_categorical.Categorical
A class that efficiently stores an array of repeated strings and is used for groupby operations.
rt_groupbyops.GroupByOps
A class that holds the reducing functions used by accum_ratiop().
Examples
Construct a Dataset for the following examples:

>>> ds = rt.Dataset()
>>> ds.Zeros = [0, 0, 0, 0, 0]
>>> ds.Ones = [1, 1, 1, 1, 1]
>>> ds.Twos = [2, 2, 2, 2, 2]
>>> ds.Ints = [0, 1, 2, 3, 4]
>>> ds.Groups = rt.Cat(["Group1", "Group2", "Group1", "Group1", "Group2"])
>>> ds.Letters = rt.Cat(["A", "B", "C", "A", "C"])
>>> ds
#   Zeros   Ones   Twos   Ints   Groups   Letters
-   -----   ----   ----   ----   ------   -------
0       0      1      2      0   Group1   A
1       0      1      2      1   Group2   B
2       0      1      2      2   Group1   C
3       0      1      2      3   Group1   A
4       0      1      2      4   Group2   C

[5 rows x 6 columns] total bytes: 185.0 B
Calculate percentages relative to the total
With the default func ("nansum") and norm_by ("T"), each cell is shown as a percentage of the sum over the entire table:

>>> rt.accum_ratiop(cat1=ds.Groups,
...                 cat2=ds.Letters,
...                 val=ds.Ints)
*Groups          A       B       C   TotalRatio   Total
----------   -----   -----   -----   ----------   -----
Group1       30.00    0.00   20.00        50.00       5
Group2        0.00   10.00   40.00        50.00       5
----------   -----   -----   -----   ----------   -----
TotalRatio   30.00   10.00   60.00       100.00
     Total       3       1       6           10

[2 rows x 6 columns] total bytes: 92.0 B

Pass "nanmean" to func to calculate the ratio as a percentage between the mean for each inner table cell and the total mean:

>>> rt.accum_ratiop(cat1=ds.Groups,
...                 cat2=ds.Letters,
...                 val=ds.Ints,
...                 func="nanmean")
*Groups          A       B        C   TotalRatio   Total
----------   -----   -----   ------   ----------   -----
Group1       75.00     nan   100.00        83.33    1.67
Group2         nan   50.00   200.00       125.00    2.50
----------   -----   -----   ------   ----------   -----
TotalRatio   75.00   50.00   150.00       100.00
     Total    1.50    1.00     3.00         2.00

[2 rows x 6 columns] total bytes: 92.0 B
Calculate percentages relative to the row
Pass "R" to norm_by:

>>> rt.accum_ratiop(cat1=ds.Groups,
...                 cat2=ds.Letters,
...                 val=ds.Ints,
...                 func="nanmean",
...                 norm_by="R",
...                 include_total=False)
*Groups          A       B        C   TotalRatio
----------   -----   -----   ------   ----------
Group1       90.00     nan   120.00       100.00
Group2         nan   40.00   160.00       100.00
----------   -----   -----   ------   ----------
TotalRatio   75.00   50.00   150.00       100.00

[2 rows x 5 columns] total bytes: 76.0 B
Calculate percentages relative to the column
Pass "C" to norm_by:

>>> rt.accum_ratiop(cat1=ds.Groups,
...                 cat2=ds.Letters,
...                 val=ds.Ints,
...                 func="nanmean",
...                 norm_by="C",
...                 include_total=False)
*Groups           A        B        C   TotalRatio
----------   ------   ------   ------   ----------
Group1       100.00      nan    66.67        83.33
Group2          nan   100.00   133.33       125.00
----------   ------   ------   ------   ----------
TotalRatio   100.00   100.00   100.00       100.00

[2 rows x 5 columns] total bytes: 76.0 B
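To compare the three norm_by settings side by side, the plain-Python sketch below recomputes the Group1/C cell from the "nanmean" examples against each possible denominator. It uses the standard library instead of riptable, and `mean_where` is a hypothetical helper introduced for illustration.

```python
from statistics import fmean

# Data matching the Ints, Groups, and Letters columns of the example Dataset.
groups = ["Group1", "Group2", "Group1", "Group1", "Group2"]
letters = ["A", "B", "C", "A", "C"]
ints = [0, 1, 2, 3, 4]


def mean_where(predicate):
    """Mean of the Ints values whose (group, letter) pair satisfies predicate."""
    picked = [v for g, l, v in zip(groups, letters, ints) if predicate(g, l)]
    return fmean(picked)


cell = mean_where(lambda g, l: (g, l) == ("Group1", "C"))  # numerator
row = mean_where(lambda g, l: g == "Group1")               # denominator for "R"
column = mean_where(lambda g, l: l == "C")                 # denominator for "C"
table = fmean(ints)                                        # denominator for "T"

# The numerator stays the same; only the denominator changes with norm_by.
percent = {
    "T": 100 * cell / table,
    "R": 100 * cell / row,
    "C": 100 * cell / column,
}
```

Rounded to two decimals, these agree with the Group1/C entries in the three "nanmean" tables above: 100.00 for "T", 120.00 for "R", and 66.67 for "C".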
Filter the array before calculating percentages
Create a filter for val and pass it to filter. This example selects the data in val that belongs to the "C" group of Letters:

>>> c_filter = ds.Letters == "C"
>>> rt.accum_ratiop(cat1=ds.Groups,
...                 cat2=ds.Letters,
...                 val=ds.Ints,
...                 filter=c_filter,
...                 func="nansum",
...                 include_total=False)
*Groups         A      B        C   TotalRatio
----------   ----   ----   ------   ----------
Group1       0.00   0.00    33.33        33.33
Group2       0.00   0.00    66.67        66.67
----------   ----   ----   ------   ----------
TotalRatio   0.00   0.00   100.00       100.00

[2 rows x 5 columns] total bytes: 76.0 B
Remove blank rows and columns
Pass True to remove_blanks to remove the rows and columns that consist entirely of 0 or nan. This example removes the blank rows and columns from the filtered Dataset:

>>> rt.accum_ratiop(cat1=ds.Groups,
...                 cat2=ds.Letters,
...                 val=ds.Ints,
...                 filter=c_filter,
...                 func="nansum",
...                 include_total=False,
...                 remove_blanks=True)
*Groups           C   TotalRatio
----------   ------   ----------
Group1        33.33        33.33
Group2        66.67        66.67
----------   ------   ----------
TotalRatio   100.00       100.00

[2 rows x 3 columns] total bytes: 44.0 B
Include the total values calculated by reducing functions
Pass True to include_total to add a "Total" row and column to the returned Dataset. The total represents the values calculated by the reducing function before the percentage calculation.

>>> rt.accum_ratiop(cat1=ds.Groups,
...                 cat2=ds.Letters,
...                 val=ds.Ints,
...                 include_total=True)
*Groups          A       B       C   TotalRatio   Total
----------   -----   -----   -----   ----------   -----
Group1       30.00    0.00   20.00        50.00       5
Group2        0.00   10.00   40.00        50.00       5
----------   -----   -----   -----   ----------   -----
TotalRatio   30.00   10.00   60.00       100.00
     Total       3       1       6           10

[2 rows x 6 columns] total bytes: 92.0 B