Python API Documentation
Kastore provides a simple Python interface for loading and storing key-array mappings in the kastore format.
kastore.load(file, read_all=False, key_encoding='utf-8', engine='python')
Loads a store from the specified file.
- Parameters
file (str) – The path of the file to load, or a file-like object with a read() method.
read_all (bool) – If True, read the entire file into memory. This optimisation can be useful when all the data will be needed, as it saves a little overhead. Defaults to False.
key_encoding (str) – The encoding to use when converting the keys from raw bytes.
engine (str) – The underlying implementation to use.
- Returns
A dict-like object mapping the key-array pairs.
kastore.loads(encoded_data, key_encoding='utf-8')
Loads a store from the specified bytes object.
kastore.dump(data, file, key_encoding='utf-8', engine='python')
Dumps a store to the specified file.
kastore.dumps(data, key_encoding='utf-8')
Encodes the specified data in kastore form and returns the resulting bytes.
kastore.get_include()
Returns the directory path where include files for the kastore C API are to be found.
Exceptions
kastore.KastoreException
kastore.FileFormatError
kastore.VersionTooNewError
kastore.VersionTooOldError
Example
Here is a simple example of using kastore to save some numpy arrays to a file and load them again.
import kastore
import numpy as np
data = {"one": np.arange(5, dtype=np.int8), "two": np.arange(5, dtype=np.uint64)}
kastore.dump(data, "tmp.kas")
kas = kastore.load("tmp.kas")
print(list(kas.items()))
Running this code chunk gives us:
[('one', array([0, 1, 2, 3, 4], dtype=int8)), ('two', array([0, 1, 2, 3, 4], dtype=uint64))]
We can also get a useful summary of a kastore file using the command line interface:
$ python3 -m kastore ls -lH tmp.kas
int8 5 5B one
uint64 5 40B two
The output here shows us that the array corresponding to key one has type int8, 5 elements and consumes 5 bytes of space. The array for key two also has 5 elements but has type uint64 and therefore consumes 40 bytes of space.
Please see the output of python3 -m kastore --help for more help on this command line interface.
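The sizes reported by the ls command follow from the element count multiplied by the size of each element. As a quick check with numpy alone (no kastore file is needed for this sketch), the same arithmetic can be reproduced as follows:

```python
import numpy as np

one = np.arange(5, dtype=np.int8)
two = np.arange(5, dtype=np.uint64)

# nbytes is the number of elements multiplied by the per-element size in bytes.
for name, arr in [("one", one), ("two", two)]:
    print(name, arr.dtype, len(arr), arr.itemsize, arr.nbytes)
# one int8 5 1 5
# two uint64 5 8 40
```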
The Python module gives a read-only view of the kastore file, so to add more data to an existing store, we need to load it, convert it to a dict (which is efficient, as the underlying arrays won’t be copied), and dump it back out again. For instance, here’s how we might add a new key to the previous example:
kas_dict = dict(kas)
print(kas_dict)
# {'one': array([0, 1, 2, 3, 4], dtype=int8), 'two': array([0, 1, 2, 3, 4], dtype=uint64)}
kas_dict["three"] = np.array([0.5772, 2.7818, 3.1415])
kastore.dump(kas_dict, "tmp2.kas")
After this, we get:
$ python3 -m kastore ls -lH tmp2.kas
int8 5 5B one
float64 3 24B three
uint64 5 40B two
indicating that the key “three” has three float64 entries.