dc.contributor.author: White, Malcolm CA
dc.contributor.author: Zhang, Zhendong
dc.contributor.author: Bai, Tong
dc.contributor.author: Qiu, Hongrui
dc.contributor.author: Chang, Hilary
dc.contributor.author: Nakata, Nori
dc.date.accessioned: 2026-04-28T14:58:53Z
dc.date.available: 2026-04-28T14:58:53Z
dc.date.issued: 2023-04-12
dc.identifier.uri: https://hdl.handle.net/1721.1/165708
dc.description.abstract: Modern high-performance computing (HPC) tasks overwhelm conventional geophysical data formats. We describe a new data schema called HDF5eis (read H-D-F-size) for handling big multidimensional time series data from environmental sensors in HPC applications and implement a freely available Python application programming interface (API) for building and processing HDF5eis files. HDF5eis augments the popular Hierarchical Data Format 5 with a minimal set of additional conventions that facilitate fast and flexible data input and output protocols for regularly sampled (in time) data with any number of dimensions. HDF5eis supports arbitrary ancillary data (e.g., metadata) storage in columnar format or as UTF-8 encoded byte streams alongside time series data. Our HDF5eis API enables simple and efficient access to big data sets distributed across a potentially large number of small heterogeneous files through a single point of access. HDF5eis outperforms conventional seismic data formats by up to two orders of magnitude in terms of random read access times. We contribute HDF5eis as an operational tool and an experimental draft proposal that will help establish the next generation of data standards in the earth sciences. [en_US]
dc.language.iso: en
dc.publisher: Society of Exploration Geophysicists [en_US]
dc.relation.isversionof: https://doi.org/10.1190/geo2022-0448.1 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Society of Exploration Geophysicists [en_US]
dc.title: HDF5eis: A storage and input/output solution for big multidimensional time series data from environmental sensors [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Malcolm C. A. White, Zhendong Zhang, Tong Bai, Hongrui Qiu, Hilary Chang, Nori Nakata; HDF5eis: A storage and input/output solution for big multidimensional time series data from environmental sensors. Geophysics 2023; 88 (3): F29–F38. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Earth, Atmospheric, and Planetary Sciences [en_US]
dc.relation.journal: Geophysics [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2026-04-28T14:54:22Z
dspace.orderedauthors: White, MCA; Zhang, Z; Bai, T; Qiu, H; Chang, H; Nakata, N [en_US]
dspace.date.submission: 2026-04-28T14:54:24Z
mit.journal.volume: 88 [en_US]
mit.journal.issue: 3 [en_US]
mit.license: PUBLISHER_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]