awkward
#
Memory Usage#
ak.Array
(or ak.Record
) works as a view of the original data with a structure given by ak.Array.layout
. Let’s start with a simple array:
>>> size = 100_000
>>> data = ak.zip({"x":np.ones(size, dtype=np.int64), "y":np.zeros(size, dtype=np.int64)})
and consider the following use cases.
Counting Memory for Multiple Arrays#
Most of the operations will only wrap the ak.Array
with another layer instead of making a copy (with some exceptions). For example:
>>> selected = data[np.random.choice(size, 10)]
>>> data.nbytes + selected.nbytes
3200080
But the actual memory usage is about 1.6MB
instead, as they are sharing the same content.
Garbage Collection#
Each instance will hold a reference to the original data. If only the following is executed:
>>> del data
>>> gc.collect()
the memory will not be released until the selected
is also deleted.
Concatenate#
ak.concatenate()
can very expensive in some cases. For example:
TODO: add example