class Tabel

class tabel.tabel.Tabel(datastruct=None, columns=None, copy=True)

Tabel datastructure

Data table with rows and columns; rows are numbered, columns are named. Data is stored by column (column store): each column has a fixed datatype, and datatypes may vary from column to column.

Parameters:
  • datastruct (object) – list, tuple, ndarray or dict of lists, tuples, ndarrays or elements; or a pandas.DataFrame. The data is given as a list of columns. See tabel.T for a convenience function to transpose a list of records.
  • columns (list of strings) – Column names, ignored when keys are part of the datastruct (dict and pandas.DataFrame). If omitted, names are generated automatically as strings of the column number.
  • copy (boolean) – Whether to make a copy of the data or to reference the current memory location (when possible). Default: True.
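
Because datastruct is expected as a list of columns, data that arrives as a list of records (rows) has to be transposed first. The sketch below shows that transposition in plain Python. It does not call tabel.T; the helper name records_to_columns is hypothetical and only illustrates the idea.

```python
def records_to_columns(records):
    """Turn a list of equal-length records (rows) into a list of columns.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    # zip(*records) pairs up the first items, the second items, and so on.
    return [list(column) for column in zip(*records)]


records = [
    ("John", 1.82, False),
    ("Joe", 1.65, False),
    ("Jane", 2.15, True),
]
columns = records_to_columns(records)
# columns now holds one list per column, in the order expected by datastruct.
```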


  1. It is possible to create an empty Tabel instance and later add data using the tabel.Tabel.append and/or tabel.Tabel.__setitem__ methods.
  2. It is possible to add or manipulate data directly through the instance attributes (for example tabel.Tabel.columns). Use the tabel.Tabel.valid method to check whether the manipulated structure is still valid.
  3. If one or more (but not all) of the columns contain a single element, this element is repeated to match the length of the other columns.
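
The repetition of single-element columns described in the last note can be pictured with the following plain-Python sketch. It is an illustration under assumptions, not the library's code: the helper name broadcast_columns is hypothetical, and treating a bare scalar the same as a one-element list is an assumption of this sketch.

```python
def broadcast_columns(columns):
    """Repeat single-element columns to the length of the longest column.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    # Assumption: a bare scalar counts as a column with a single element.
    as_lists = [list(c) if isinstance(c, (list, tuple)) else [c] for c in columns]
    lengths = [len(c) for c in as_lists if len(c) != 1]
    target = max(lengths) if lengths else 1
    return [c * target if len(c) == 1 else c for c in as_lists]


columns = broadcast_columns([["John", "Joe", "Jane"], [1.8], True])
# The second and third columns are repeated to three rows each.
```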


To initialize a Tabel, call the constructor with the data in column lists:

>>> from tabel import Tabel
>>> Tabel( [ ["John", "Joe", "Jane"],
...          [1.82, 1.65, 2.15],
...          [False, False, True] ],
...       columns = ["Name", "Height", "Married"])
 Name   |   Height |   Married
 John   |     1.82 |         0
 Joe    |     1.65 |         0
 Jane   |     2.15 |         1
3 rows ['<U4', '<f8', '|b1']



Indexing and slicing parts of a Tabel.

Slicing and indexing mostly follow Numpy array and Python list conventions.


key (r, c):
r can be a single integer, a boolean array, an integer iterable or a slice object. c can be a single integer or string, a boolean array, an integer or string iterable or a slice object.
key (int, string) :
When only a single string or int is supplied, it is taken to address a whole single column (string) or a whole single row (int).


Depending on key, four different types can be returned.

element ():
Returned if both the row place and the column place are a single integer (or a string for the column place), addressing a single element in the Tabel. The element can be of any datatype supported by numpy.ndarray.
column (ndarray):
Returned if the column place is a single string or integer, addressing a single column, and the row place is either absent or not an integer.
row (tuple) :
Returned if the row place is a single integer, addressing a single row, and the column place is either absent or not a single integer/string.
Tabel (Tabel) :
Returned if a tuple key (r, c) is provided with anything other than an integer for the row place and anything other than a single integer/string for the column place.
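
The relation between key and returned type listed above can be summarised as a small decision function. This is a sketch of the documented rules only, written in plain Python; the function name returned_kind is hypothetical, it does not inspect any data, and key forms not described above are rejected by assumption.

```python
def returned_kind(key):
    """Name the kind of object the rules above describe for a given key.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    def single_int(x):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(x, int) and not isinstance(x, bool)

    def single_col(x):
        return single_int(x) or isinstance(x, str)

    if not isinstance(key, tuple):
        if isinstance(key, str):
            return "column"      # e.g. tbl["Name"]
        if single_int(key):
            return "row"         # e.g. tbl[0]
        raise TypeError("key form not covered by this sketch")

    r, c = key
    if single_int(r) and single_col(c):
        return "element"         # e.g. tbl[0, 0]
    if single_col(c):
        return "column"          # e.g. tbl[:, "Name"]
    if single_int(r):
        return "row"             # e.g. tbl[0, :]
    return "Tabel"               # e.g. tbl[:, 1:3]


kinds = [
    returned_kind((0, 0)),
    returned_kind((slice(None), "Name")),
    returned_kind((0, slice(None))),
    returned_kind((slice(None), slice(1, 3))),
]
```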


Tabel and np.ndarray objects returned from slicing reference the data of the original Tabel object, unless the rows were indexed with a boolean list/array. Other returned types (single elements and row tuples) are not references. Changes made to a referencing slice will be reflected in the original Tabel. Appending or joining Tabels or adding/renaming columns will never be reflected in the original Tabel object. Use the copy function to make a full copy of the object.

Raises:KeyError – When a key references an invalid or non-existent part of the data.


>>> tbl[:, 1:3]
   Height |   Married
     1.82 |         0
     1.65 |         0
     2.15 |         1
3 rows ['<f8', '|b1']
>>> tbl[0, 0]
>>> tbl["Name"]
array(['John', 'Joe', 'Jane'], dtype='<U4')
>>> tbl[0]
('John', 1.82, False)


Tabel.__setitem__(key, value)

Setting a slice of a Tabel

Setting slices, like getting them, mostly follows numpy conventions. Specifically, the rules for the key are the same as for tabel.Tabel.__getitem__, with the same relation between key and expected type for the value. In addition, this method can also be used to add new columns.

Parameters:
  • key (r, c) –

    r can be a single integer, a boolean array, an integer iterable or a slice object.

    c can be a single integer or string, a boolean array, an integer or string iterable or a slice object.

    To address a single element in the Tabel object the key should be a tuple of (r, c) with r a single integer addressing the row and c a single integer or string addressing the column of the element to be changed.

  • key (int, string) – When only a single string or int is supplied, it is taken to address a whole single column (string) or a whole single row (int).
  • value (object) –

    The type the value needs to have depends on the key provided.

    element :
    A single element of the same type, or a type convertible to the same, as the column targeted as a destination. See tabel.Tabel.dtype to get the type of the columns.
    column :
    An array or list of elements, each element of the same type, or a type convertible to the same, as the column targeted as a destination. If a new column is targeted, a single element can be provided, in which case it will be replicated along all rows.
    row :
    A tuple of elements, each of the same type, or a type convertible to the same, as the column targeted as a destination. The length of the tuple should match the number of columns addressed.
    Tabel :
    Not currently implemented.

Returns:Nothing. Change in-place.


When changing a column, two syntaxes give approximately the same result, with one notable difference. Using a slice object “:” for the rows changes all elements of the existing column to the new element(s) provided. If just the column name is provided, with no indication for the rows, then the whole column is replaced with the column provided.

>>> tbl = Tabel( [ ["John", "Joe", "Jane"], [1.82, 1.65, 2.15],
...              [False, False, True] ], columns = ["Name", "Height", "Married"])
>>> tbl[:, "Name"] = [1, 2, 3]
>>> tbl
   Name |   Height |   Married
      1 |     1.82 |         0
      2 |     1.65 |         0
      3 |     2.15 |         1
3 rows ['<U4', '<f8', '|b1']
>>> tbl["Name"] = [1, 2, 3]
>>> tbl
   Name |   Height |   Married
      1 |     1.82 |         0
      2 |     1.65 |         0
      3 |     2.15 |         1
3 rows ['<i8', '<f8', '|b1']

Note how in the first case the type of the Name column stays “<U4”, while in the second case the type of the Name column changes to “<i8”.



Pretty print using tabulate.


>>> tbl
 Name   |   Height |   Married
 John   |     1.82 |         0
 Joe    |     1.65 |         0
 Jane   |     2.15 |         1
3 rows ['<U4', '<f8', '|b1']



Append new Tabel to the current Tabel.

Append a Tabel or pandas.DataFrame to the end of this Tabel. Each column is appended to the matching column of the instance invoking the method.

Parameters:tbl (Tabel) – Tabel with the same columns as the current Tabel; the order of the columns does not need to match. Columns do not need to match if the current Tabel has zero length. Besides Tabel objects, pandas.DataFrame objects are also allowed.
Returns:Nothing, change in-place.



Append a row record at the end of the Tabel.

Appends a single row at the end of the Tabel.

Parameters:row (dict, list, tuple) – The row to be appended to the Tabel. If a dict is provided, the keys should match the column names of the Tabel. If a list or tuple is provided, the length and order should match the columns of the Tabel. Columns do not need to match if the current Tabel has zero length.
Returns:Nothing. Change in-place.
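
The difference between a dict row and a list/tuple row can be sketched in plain Python as follows. This only illustrates the matching rules stated above on ordinary lists; the helper name append_row is hypothetical, the exceptions raised are assumptions, and the zero-length case is not covered.

```python
def append_row(names, columns, row):
    """Append one row to a list of column lists, in place.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    if isinstance(row, dict):
        # Dict rows are matched by column name, so key order is irrelevant.
        if set(row) != set(names):
            raise KeyError("row keys should match the column names")
        values = [row[name] for name in names]
    else:
        # List and tuple rows are matched by position.
        if len(row) != len(names):
            raise ValueError("row length should match the number of columns")
        values = list(row)
    for column, value in zip(columns, values):
        column.append(value)


names = ["Name", "Height"]
columns = [["John"], [1.82]]
append_row(names, columns, {"Height": 1.65, "Name": "Joe"})  # matched by name
append_row(names, columns, ("Jane", 2.15))                   # matched by position
```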


Tabel.join(tbl, key, jointype=u'inner', **kwargs)

Join two tables with key or keys

Joins two tables using the key or list of keys provided, adding the columns of the argument table to the current table. One-to-one joins only.

Parameters:
  • tbl (Tabel) – The right-hand Tabel to be joined to the current Tabel.
  • key (string or list) – Name of the column to be used as the key, or names of the columns to be used as keys. Both Tabels should have exactly one column for each named key, and the elements of the key columns should have no duplicates (one-to-one joins only).
  • jointype (string) – Type of the join to be performed: inner, outer or leftouter. If inner, returns the elements common to both Tabels. If outer, returns the common elements as well as the elements of the left Tabel not in the right Tabel and the elements of the right Tabel not in the left Tabel. If leftouter, returns the common elements and the elements of the left Tabel not in the right Tabel.
  • kwargs – keyword arguments passed on to numpy.lib.recfunctions.join_by.

Returns:Nothing. The Tabel is joined in place with the input.


Join a Tabel into the current Tabel matching on column ‘a’:

>>> tbl = Tabel({"a":list(range(4)), "b": ['a', 'b'] *2})
>>> tbl_b = Tabel({"a":list(range(4)), "c": ['d', 'e'] *2})
>>> tbl.join(tbl_b, "a")
>>> tbl
   a | b   | c
   0 | a   | d
   1 | b   | e
   2 | a   | d
   3 | b   | e
4 rows ['<i8', '<U1', '<U1']
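
The three jointype values differ only in which key values survive the join. The plain-Python sketch below illustrates that on two small mappings from key to value. It does not call numpy.lib.recfunctions.join_by and does not use Tabel.join_fill_value; the function name join_sketch is hypothetical, None as the fill value is an assumption, and the row order may differ from what the library produces.

```python
def join_sketch(left, right, jointype="inner", fill=None):
    """One-to-one join of two {key: value} mappings.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    if jointype == "inner":
        keys = [k for k in left if k in right]            # common keys only
    elif jointype == "leftouter":
        keys = list(left)                                 # every left key
    elif jointype == "outer":
        keys = list(left) + [k for k in right if k not in left]
    else:
        raise ValueError("unknown jointype: %r" % (jointype,))
    # Missing values on either side are replaced by the fill value.
    return [(k, left.get(k, fill), right.get(k, fill)) for k in keys]


left = {0: "a", 1: "b", 2: "a"}
right = {1: "e", 2: "d", 3: "e"}
inner = join_sketch(left, right, "inner")
leftouter = join_sketch(left, right, "leftouter")
outer = join_sketch(left, right, "outer")
```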


Tabel.group_by(group_cols, aggregate_fie_col)

Groups and aggregates Tabel.

Parameters:
  • group_cols (list) – list of string names of the columns to be grouped by.
  • aggregate_fie_col (list) – list of tuples (function, column), where function is the function to be applied to aggregate and column is the string name of the column. The function should take a 1D array as input; the returned value is treated as a single element.

Returns:Tabel object with the requested columns.


Grouping by ‘b’ and then by ‘a’, aggregating by taking the sum of the ‘a’ elements and the first ‘c’ element of each group:

>>> tbl = Tabel({'a':[10, 20, 30, 40]*3, 'b':["100", "200"]*6, 'c':[100, 200]*6})
>>> import numpy as np
>>> from tabel import first
>>> tbl.group_by(['b', 'a'], [ (np.sum, 'a'), (first, 'c')])
   b |   a |   a_sum |   c_first
 100 |  10 |      30 |       100
 100 |  20 |       0 |
 100 |  30 |      90 |       100
 100 |  40 |       0 |
 200 |  10 |       0 |
 200 |  20 |      60 |       200
 200 |  30 |       0 |
 200 |  40 |     120 |       200
8 rows ['<U3', '<i8', '<i8', '|O']
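
The grouping and aggregation idea can be sketched with the standard library alone. The sketch below works on a dict of column lists and only emits key combinations that actually occur in the data, so it is not meant to reproduce the output above row for row. The names group_by_sketch and the local first function are hypothetical stand-ins, and the built-in sum replaces np.sum.

```python
def first(values):
    """Stand-in for an aggregate that returns the first element of a group."""
    return values[0]


def group_by_sketch(columns, group_cols, aggregate_fie_col):
    """Group a dict of column lists and aggregate per group.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    n_rows = len(next(iter(columns.values())))
    groups = {}
    for i in range(n_rows):
        key = tuple(columns[name][i] for name in group_cols)
        groups.setdefault(key, []).append(i)   # row indices per group
    result = []
    for key in sorted(groups):
        row = dict(zip(group_cols, key))
        for func, col in aggregate_fie_col:
            values = [columns[col][i] for i in groups[key]]
            # Output column is named "<column>_<function name>".
            row[col + "_" + func.__name__] = func(values)
        result.append(row)
    return result


columns = {
    "a": [10, 20, 30, 40] * 3,
    "b": ["100", "200"] * 6,
    "c": [100, 200] * 6,
}
result = group_by_sketch(columns, ["b", "a"], [(sum, "a"), (first, "c")])
```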



Sort the Tabel.

Sorts the Tabel in place according to the columns provided. Rows always stay together; only the order of the rows is affected.

Parameters:columns (string or list) – column name or list of column names to sort by, listed in order.
Returns:Nothing. Sorting in-place.


>>> tbl = Tabel({'a':['b', 'g', 'd'], 'b':list(range(3))})
>>> tbl.sort('a')
>>> tbl
 a   |   b
 b   |   0
 d   |   2
 g   |   1
3 rows ['<U1', '<i8']
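
That rows stay together while sorting means a single row order is computed from the sort columns and then applied to every column. A plain-Python sketch of that idea, on a dict of column lists, could look as follows. The function name sort_columns is hypothetical and this is not the library's implementation.

```python
def sort_columns(columns, by):
    """Sort a dict of column lists in place by one or more columns.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    if isinstance(by, str):
        by = [by]
    n_rows = len(columns[by[0]])
    # Compute one row order from the sort columns ...
    order = sorted(range(n_rows), key=lambda i: tuple(columns[name][i] for name in by))
    # ... and apply it to every column, so each row stays intact.
    for name in list(columns):
        columns[name] = [columns[name][i] for i in order]


columns = {"a": ["b", "g", "d"], "b": [0, 1, 2]}
sort_columns(columns, "a")
```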



Returns a type-converted tabel.

Converts the tabel according to the provided list of dtypes and returns a new Tabel instance.

Parameters:dtypes (list) – list of valid numpy dtypes in the order of the columns. The list should have the same length as the number of columns present (see Tabel.shape). See Tabel.dtype for the current types of the Tabel.
Returns:Tabel object with the columns converted to the new dtype.


Tabel.save(filename, fmt=u'auto')

Save to file

Saves the Tabel data including a header with the column names to a file of the specified name in the current directory or the directory specified.

Parameters:
  • filename (str) – filename, should include path
  • fmt (str) –

    formatting, valid values are: ‘auto’, ‘csv’, ‘npz’, ‘gz’

    auto :
    Determine the filetype from the file extension.
    csv :
    Write to a csv file using Python's csv module.
    gz :
    Write to csv using Python's csv module and compress using the standard gzip module.
    npz :
    Write to the compressed numpy native binary format.
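
The ‘auto’ and ‘csv’ options can be pictured with the standard library as follows. This is a sketch under assumptions and not the code behind Tabel.save: the helper names detect_format and write_csv are hypothetical, and both the case-insensitive extension match and the ValueError for unknown extensions are assumptions.

```python
import csv
import io
import os


def detect_format(filename):
    """Pick a format name from the file extension (hypothetical helper)."""
    extension = os.path.splitext(filename)[1].lower()  # assumption: ignore case
    formats = {".csv": "csv", ".gz": "gz", ".npz": "npz"}
    if extension not in formats:
        raise ValueError("cannot determine format from %r" % (filename,))
    return formats[extension]


def write_csv(handle, names, columns):
    """Write a header with the column names, then the data row by row."""
    writer = csv.writer(handle)
    writer.writerow(names)
    writer.writerows(zip(*columns))  # transpose columns into rows


buffer = io.StringIO()  # stands in for an opened file
write_csv(buffer, ["Name", "Height"], [["John", "Joe"], [1.82, 1.65]])
csv_text = buffer.getvalue()
```

For the gz option, the same writer could be pointed at a handle opened with gzip.open(filename, "wt", newline="").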





Dump all data as a dict of columns.

Keys are the column names and values are the column numpy.ndarrays. Useful when transferring to a pandas DataFrame.



Tabel shape.

Returns:tuple (r, c) with r the number of rows and c the number of columns.


Tabel.__len__()



List of dtypes of the data columns.



Check whether the current data structure is valid.

Returns:(bool) True if the Tabel internal structure is valid.


This currently checks that all columns have the same length and that the number of columns equals the number of column names.
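
The two conditions of the validity check can be written down in a few lines of plain Python. This is a sketch of the conditions as described here, on ordinary lists; the function name is_valid is hypothetical and it is not the code behind tabel.Tabel.valid.

```python
def is_valid(names, columns):
    """Check the two structural conditions on names and column lists.

    Hypothetical helper for illustration; not part of the tabel package.
    """
    same_count = len(names) == len(columns)               # one name per column
    same_length = len({len(c) for c in columns}) <= 1     # all columns equally long
    return same_count and same_length


ok = is_valid(["Name", "Height"], [["John", "Joe"], [1.82, 1.65]])
ragged = is_valid(["Name", "Height"], [["John", "Joe"], [1.82]])
unnamed = is_valid(["Name"], [["John", "Joe"], [1.82, 1.65]])
```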

class attributes

Tabel.repr_layout = u'presto'
Tabel.max_repr_rows = 20
Tabel.join_fill_value = {u'float': nan, u'integer': 999999, u'string': u''}