Python API Documentation

Kastore provides a simple Python interface for loading and storing key-array mappings in kastore format.

kastore.load(file, read_all=False, key_encoding='utf-8', engine='python')

Loads a store from the specified file.

Parameters
  • file (str) – The path of the file to load, or a file-like object with a read() method.

  • read_all (bool) – If True, read the entire file into memory. This optimisation can be useful when all the data will be needed, as it saves a little overhead. Defaults to False.

  • key_encoding (str) – The encoding to use when converting the keys from raw bytes.

  • engine (str) – The underlying implementation to use.

Returns

A dict-like object mapping the key-array pairs.

kastore.loads(encoded_data, key_encoding='utf-8')

Loads a store from the specified bytes object.

Parameters
  • encoded_data (bytes) – The encoded kastore data as returned by dumps() or read from a file written by dump().

  • key_encoding (str) – The encoding to use when converting the keys from raw bytes.

Returns

A dict-like object mapping the key-array pairs.

kastore.dump(data, file, key_encoding='utf-8', engine='python')

Dumps a store to the specified file.

Parameters
  • data (dict) – A dictionary-like object mapping string keys to numpy arrays.

  • file (str) – The path of the file to write the store to, or a file-like object with a write() method.

  • key_encoding (str) – The encoding to use when converting the keys to raw bytes.

  • engine (str) – The underlying implementation to use.

kastore.dumps(data, key_encoding='utf-8')

Encodes the specified data in kastore form and returns the resulting bytes.

Parameters
  • data (dict) – A dictionary-like object mapping string keys to numpy arrays.

  • key_encoding (str) – The encoding to use when converting the keys to raw bytes.

Returns

The bytes encoding of the specified data in kastore format.

Return type

bytes

kastore.get_include()

Returns the directory path where include files for the kastore C API are to be found.

Exceptions

kastore.KastoreException
kastore.FileFormatError
kastore.VersionTooNewError
kastore.VersionTooOldError

Example

Here is a simple example of using kastore to save some numpy arrays to a file and load them again.

import kastore
import numpy as np

data = {"one": np.arange(5, dtype=np.int8), "two": np.arange(5, dtype=np.uint64)}
kastore.dump(data, "tmp.kas")

kas = kastore.load("tmp.kas")
print(list(kas.items()))

Running this code chunk gives us:

[('one', array([0, 1, 2, 3, 4], dtype=int8)), ('two', array([0, 1, 2, 3, 4], dtype=uint64))]

We can also get a useful summary of a kastore file using the command line interface:

$ python3 -m kastore ls -lH tmp.kas
int8   5  5B one
uint64 5 40B two

The output here shows us that the array corresponding to key one has type int8, 5 elements and consumes 5 bytes of space. The array for key two also has 5 elements but has type uint64 and therefore consumes 40 bytes of space.
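
The sizes reported here follow from the number of elements multiplied by the size of each element. The short numpy-only check below reproduces that arithmetic for the two arrays used in this example.

```python
import numpy as np

one = np.arange(5, dtype=np.int8)
two = np.arange(5, dtype=np.uint64)

# nbytes is the element count multiplied by the per-element size in bytes.
for name, arr in [("one", one), ("two", two)]:
    print(name, arr.dtype, arr.size, arr.itemsize, arr.nbytes)
# one int8 5 1 5
# two uint64 5 8 40
```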

Please see the output of python3 -m kastore --help for more help on this command line interface.

The Python module gives a read-only view of the kastore file. To add more data to an existing store, we therefore need to load it, convert it to a dict (which is efficient, as the underlying arrays won’t be copied), and dump it back out again. For instance, here’s how we might add a new key to the previous example:

kas_dict = dict(kas)
print(kas_dict)
# {'one': array([0, 1, 2, 3, 4], dtype=int8), 'two': array([0, 1, 2, 3, 4], dtype=uint64)}

kas_dict["three"] = np.array([0.5772, 2.7818, 3.1415])
kastore.dump(kas_dict, "tmp2.kas")

After this, we get:

$ python3 -m kastore ls -lH tmp2.kas
int8    5  5B one
float64 3 24B three
uint64  5 40B two

indicating that the key “three” has three float64 entries.