Basic I/O
A wrapper around the ROOT file I/O functions uproot.reading.open(), uproot._dask.dask() and uproot.writing.writable.recreate().
Note
Readers will use the following default options for uproot.open():
object_cache = None
array_cache = None
for uproot.dask():
open_files = False
and for both:
timeout = 180
Warning
Writers will always overwrite the output file if it exists.
Todo
Use dask_awkward.new_scalar_object() to return object.
TTree
- class heptools.root.TreeWriter(name='Events', parents=True, basket_size=..., **options)
uproot.recreate() with remote file support and TBasket size control.
- Parameters:
name (str, optional, default='Events') – Default name of tree.
parents (bool, optional, default=True) – Create parent directories if they do not exist.
basket_size (int, optional) – Size of TBasket. If not given, a new TBasket will be created for each extend() call.
**options (dict, optional) – Additional options passed to uproot.recreate().
- __call__(path, name=None)
Set output path.
- Parameters:
path (PathLike) – Path to output ROOT file.
name (str, optional) – Name of tree. If given, it will temporarily override the tree name.
- Returns:
self (TreeWriter)
- __exit__(*exc)
If no exception is raised, move the temporary file to the output path and store Chunk information to tree.
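The write-then-move behaviour described for __exit__ can be illustrated with a minimal standard-library sketch. AtomicOutput and its write() method are hypothetical names introduced only for this illustration; this is not the TreeWriter implementation. Discarding the temporary file when an exception is raised is an assumption, since the description above only covers the success case.

```python
import os
import shutil
import tempfile


class AtomicOutput:
    """Hypothetical helper: write to a temporary file, publish it only on success."""

    def __init__(self, path):
        self.path = path
        self._tmp = None

    def __enter__(self):
        # Create the temporary file in the system temp directory.
        fd, self._tmp = tempfile.mkstemp(suffix=".tmp")
        os.close(fd)
        return self

    def write(self, text):
        # Append text to the temporary file; return self so calls can be chained.
        with open(self._tmp, "a") as f:
            f.write(text)
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # No exception: create parent directories and move the file into place.
            os.makedirs(os.path.dirname(os.path.abspath(self.path)), exist_ok=True)
            shutil.move(self._tmp, self.path)
        else:
            # Assumption: on failure the partial output is discarded.
            os.remove(self._tmp)
        return False  # never swallow the exception
```

With this pattern the output path either holds a complete file or is left untouched, which is the main reason to write to a temporary location first.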
- extend(data)
Extend the TTree with data using uproot.writing.writable.WritableTree.extend().
Note
The VirtualArray in awkward < 2.0 will be materialized.
- Parameters:
data (RecordLike) – Data to extend.
- Returns:
self (TreeWriter)
- class heptools.root.TreeReader(branch_filter=None, transform=None, **options)
Read data from Chunk.
- Parameters:
branch_filter (Callable[[set[str]], set[str]], optional) – A function to select branches. If not given, all branches will be read.
transform (Callable[[RecordLike], RecordLike], optional) – A function to transform the data after reading. If not given, no transformation will be applied.
**options (dict, optional) – Additional options passed to uproot.open().
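As an illustration of the branch_filter signature, Callable[[set[str]], set[str]], the following sketch defines a filter that keeps only kinematic branches. The function name keep_kinematics and the branch names are hypothetical, chosen only for this example.

```python
import re


def keep_kinematics(branches: set[str]) -> set[str]:
    """Hypothetical filter: keep only branches ending in _pt, _eta or _phi."""
    pattern = re.compile(r".*_(pt|eta|phi)$")
    return {name for name in branches if pattern.match(name)}


# Hypothetical branch names, standing in for the branches found in a TTree.
available = {"Jet_pt", "Jet_eta", "Jet_phi", "Jet_mass", "run", "event"}
selected = keep_kinematics(available)
print(sorted(selected))  # ['Jet_eta', 'Jet_phi', 'Jet_pt']
```

Going by the signature above, such a function would be passed as TreeReader(branch_filter=keep_kinematics).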
- arrays(source, library='ak', **options)
Read source into array.
- Parameters:
source (Chunk) – Chunk of TTree.
library (Literal['ak', 'np', 'pd'], optional, default='ak') – The library used to represent arrays.
library='ak': return ak.Array.
library='pd': return pandas.DataFrame.
library='np': return dict of numpy.ndarray.
**options (dict, optional) – Additional options passed to uproot.behaviors.TBranch.HasBranches.arrays().
- Returns:
RecordLike – Data from TTree.
- concat(*sources, library='ak', **options)
Read sources into one array. The branches of sources must be the same after filtering.
Todo
Add multiprocessing support.
- iterate(*sources, step=..., library='ak', mode='partition', **options)
Iterate over sources.
- Parameters:
sources (tuple[Chunk]) – One or more chunks of TTree.
step (int, optional) – Number of entries to read in each iteration step. If not given, the chunk size will be used and the mode will be ignored.
library (Literal['ak', 'np', 'pd'], optional, default='ak') – The library used to represent arrays.
mode (Literal['balance', 'partition'], optional, default='partition') – The mode to generate iteration steps.
mode='balance': use balance(). The length of the output arrays is not guaranteed to be step, but no concatenation is needed.
mode='partition': use partition(). The length of the output arrays is guaranteed to be step except for the last one, but concatenation is needed.
**options (dict, optional) – Additional options passed to arrays().
- Yields:
RecordLike – A chunk of data from TTree.
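The trade-off between the two modes can be sketched in plain Python. The functions partition_steps and balance_steps below are hypothetical and are not the balance() and partition() referenced above; they only illustrate, under assumed splitting rules, why fixed-size steps may span several chunks (and so need concatenation) while per-chunk splitting does not.

```python
def partition_steps(sizes, step):
    """Hypothetical: fixed-size steps that may cross chunk boundaries."""
    steps, current, room = [], [], step
    for i, size in enumerate(sizes):
        start = 0
        while start < size:
            take = min(room, size - start)
            current.append((i, start, start + take))  # (chunk index, start, stop)
            start += take
            room -= take
            if room == 0:  # the step is full, begin a new one
                steps.append(current)
                current, room = [], step
    if current:  # the last step may be shorter than `step`
        steps.append(current)
    return steps


def balance_steps(sizes, step):
    """Hypothetical: split each chunk into near-equal pieces close to `step`."""
    steps = []
    for i, size in enumerate(sizes):
        if size == 0:
            continue
        n = max(1, round(size / step))  # number of pieces for this chunk
        base, extra = divmod(size, n)
        start = 0
        for k in range(n):
            stop = start + base + (1 if k < extra else 0)
            steps.append([(i, start, stop)])  # never crosses a chunk boundary
            start = stop
    return steps


def lengths(steps):
    """Number of entries covered by each step."""
    return [sum(stop - start for _, start, stop in s) for s in steps]


sizes = [5, 7]  # entries per chunk (made-up numbers)
print(lengths(partition_steps(sizes, 4)))  # [4, 4, 4]
print(lengths(balance_steps(sizes, 4)))    # [5, 4, 3]
```

In the partition sketch the second step reads from both chunks, so its pieces have to be concatenated. In the balance sketch every step stays within one chunk, at the cost of uneven step lengths.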
- dask(*sources, partition=..., library='ak')
Read sources into delayed arrays.
- Parameters:
- Returns:
DelayedRecordLike – Delayed data from TTree.
dask
- heptools.root.merge.resize(path, *sources, step, chunk_size=..., writer_options=None, reader_options=None, clean_source=True, transform=None, dask=False)
merge() sources into Chunk and clean() sources after merging.
- Parameters:
path (PathLike) – Path to output ROOT file.
sources (tuple[Chunk]) – Chunks to merge.
step (int) – Number of entries to read and write in each iteration step.
chunk_size (int, optional) – Number of entries in each chunk. If not given, all entries will be merged into one chunk.
writer_options (dict, optional) – Additional options passed to TreeWriter.
reader_options (dict, optional) – Additional options passed to TreeReader.
clean_source (bool, optional, default=True) – If True, remove the source chunk after moving.
transform (Callable[[ak.Array], ak.Array], optional) – A function to transform the array before writing.
dask (bool, optional, default=False) – If True, return a Delayed object.
- Returns:
list[Chunk] or Delayed – Merged chunks.
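The effect of chunk_size on the number of output chunks can be sketched as follows. The helper chunk_ranges is hypothetical and not part of heptools. How the remainder is distributed is not stated above, so this sketch simply assumes that the last chunk is allowed to be shorter.

```python
def chunk_ranges(total, chunk_size=None):
    """Hypothetical: entry ranges of the output chunks for `total` entries."""
    if chunk_size is None:
        # No chunk_size given: everything goes into a single chunk.
        return [(0, total)]
    # Assumption: fixed-size chunks, with a shorter final chunk if needed.
    return [
        (start, min(start + chunk_size, total))
        for start in range(0, total, chunk_size)
    ]


print(chunk_ranges(10))     # [(0, 10)]
print(chunk_ranges(10, 4))  # [(0, 4), (4, 8), (8, 10)]
```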
- heptools.root.merge.merge(path, *sources, step, writer_options=None, reader_options=None, transform=None, dask=False)
Merge sources into one Chunk.
- Parameters:
path (PathLike) – Path to output ROOT file.
sources (tuple[Chunk]) – Chunks to merge.
step (int) – Number of entries to read and write in each iteration step.
writer_options (dict, optional) – Additional options passed to TreeWriter.
reader_options (dict, optional) – Additional options passed to TreeReader.
transform (Callable[[ak.Array], ak.Array], optional) – A function to transform the array before writing.
dask (bool, optional, default=False) – If True, return a Delayed object.
- Returns:
Chunk or Delayed – Merged chunk.