DASCore

A Python Library for Distributed Acoustic Sensing

Derrick Chambers, Eileen Martin, Ge Jin

DASDAE

Distributed Acoustic Sensing Data Analysis Ecosystem

DASDAE: Goals

  • Collection of open-source DFOS libraries and applications
  • Facilitate research and education
  • Enable DFOS interoperability and fusion
  • Reduce code redundancy

DASDAE: People

DASDAE: DASCore’s Place

DASCore: Goals


  • Common building block for future DASDAE codes
  • Provide reference processing implementations
  • Implement basic, customizable visualizations
  • Read/write common data formats
  • Manage data sets

DASCore: Inspiration


  • Nearly 1.8 million downloads
  • Used by ~700 repos (128 packages)
  • 97 unique contributors
  • 1,600 academic citations

DASCore: Non-Goals


What DASCore is not:

  • Application for a particular use case
  • Graphical User Interface
  • Machine-learning library
  • Massively parallel framework

DASCore: Installation



conda

conda install dascore -c conda-forge


pip

pip install dascore

DASCore: Data Structures


Patch

Spool

DASCore: Patch


Patch: Contiguous data and metadata

  • Contains three types of metadata
    • dims - dimension labels
    • coords - dimension values
    • attrs - scalar metadata
  • Maintains metadata consistency
  • Strives to be immutable
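
The three kinds of metadata and the "update returns a new object" idea can be sketched with a small frozen dataclass. This is only an illustration: `ToyPatch` and its fields are hypothetical names, it uses nested tuples instead of arrays, and it is not how DASCore implements its Patch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyPatch:
    """Hypothetical stand-in for a patch; not DASCore's implementation."""

    data: tuple    # rows of samples, one row per entry of the first dimension
    dims: tuple    # dimension labels, e.g. ("distance", "time")
    coords: dict   # dimension label -> tuple of coordinate values
    attrs: dict    # scalar metadata, e.g. {"tag": "random"}

    def __post_init__(self):
        # Keep metadata consistent: each dimension's coordinates must have
        # the same length as the data along that dimension.
        shape = (len(self.data), len(self.data[0]))
        for dim, size in zip(self.dims, shape):
            if len(self.coords[dim]) != size:
                raise ValueError(f"coords for {dim!r} do not match data shape")

    def update_attrs(self, **changes):
        # Return a new object instead of mutating this one.
        return replace(self, attrs={**self.attrs, **changes})


patch = ToyPatch(
    data=((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)),
    dims=("distance", "time"),
    coords={"distance": (0, 1), "time": (0.0, 0.004, 0.008)},
    attrs={"tag": "random"},
)
new = patch.update_attrs(tag="experiment1")
```

Because every method returns a new object, the original `patch` is left unchanged, which is also what makes method chaining safe.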

DASCore: Patch


  • We start by importing dascore
import dascore as dc

DASCore: Patch


  • Then we read a DAS file
import dascore as dc

patch = dc.read('path_to_das_file.h5')[0]

DASCore: Patch


  • Or we can use one of the example patches
import dascore as dc

patch = dc.get_example_patch()

DASCore: Patch


  • Processing methods are chained together
import dascore as dc
patch = dc.get_example_patch()

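out = (
    patch.decimate(time=8)  # downsample by a factor of 8 along time
    .detrend(dim='distance')  # remove the trend along distance
    .pass_filter(time=(None, 10))  # low-pass: keep frequencies below 10 Hz
)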

DASCore: Patch


  • Simple visualizations are accessed through .viz
import dascore as dc
patch = dc.get_example_patch()

patch.viz.waterfall(show=True)

DASCore: Patch


  • Accessing metadata: dimensions
import dascore as dc
patch = dc.get_example_patch()

print(patch.dims)


('distance', 'time')

DASCore: Patch


  • Accessing metadata: coordinates
import dascore as dc
patch = dc.get_example_patch()

print(patch.coords)


➤ Coordinates (distance: 300, time: 2000)
    *distance: CoordRange( min: 0 max: 299 step: 1 shape: (300,) dtype: int64 units: m )
    *time: CoordRange( min: 2017-09-18 max: 2017-09-18T00:00:07.996 step: 0.004s shape: (2000,) dtype: datetime64[ns] units: s )
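
As a quick sanity check on the output above, the last time value follows from the start time, the step, and the number of samples. The sketch below uses only the standard library and the numbers printed above; the variable names are illustrative.

```python
from datetime import datetime, timedelta

start = datetime(2017, 9, 18)      # min of the time coordinate
step = timedelta(milliseconds=4)   # step of 0.004 s
n_samples = 2000                   # length of the time dimension

# The last sample sits (n_samples - 1) steps after the first one.
last = start + (n_samples - 1) * step
print(last.isoformat(timespec="milliseconds"))  # 2017-09-18T00:00:07.996
```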

DASCore: Patch


  • Accessing metadata: attributes
import dascore as dc
patch = dc.get_example_patch()

print(patch.attrs)


{'data_type': '', 'data_category': '', 'data_units': None, 'instrument_id': '', 'acquisition_id': '', 'tag': 'random', 'station': '', 'network': '', 'history': [], 'dims': 'distance,time', 'coords': <FrozenDict {'distance': CoordSummary(dtype='int64', min=0, max=299, step=1, units=<Quantity(1, 'meter')>), 'time': CoordSummary(dtype='datetime64', min=numpy.datetime64('2017-09-18T00:00:00.000000000'), max=numpy.datetime64('2017-09-18T00:00:07.996000000'), step=numpy.timedelta64(4000000,'ns'), units=<Quantity(1, 'second')>)}>, 'category': 'DAS'}

DASCore: Patch


  • Updating metadata
import dascore as dc
patch = dc.get_example_patch()

new = patch.update_attrs(time_min='2015-01-01T10')

DASCore: Patch


  • Escape hatches: numpy arrays
import dascore as dc
patch = dc.get_example_patch()

array = patch.data

DASCore: Patch


  • Escape hatches: xarray DataArray
import dascore as dc
patch = dc.get_example_patch()

data_array = patch.to_xarray()

DASCore: Spool


Spool: Collection of Patches

  • Encapsulates access to data sources
    • In-memory (list of patches)
    • On-disk (directory of files)
    • Remote (data centers)*
  • Orchestrates batch processing
  • Operates lazily
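
To make "operates lazily" concrete, here is a minimal sketch of a collection that stores only its sources and loads data when an item is requested. `ToySpool`, `fake_loader`, and the file names are hypothetical; this is an illustration of the idea, not DASCore's Spool.

```python
class ToySpool:
    """Hypothetical lazy collection; not DASCore's implementation."""

    def __init__(self, sources, loader):
        self._sources = list(sources)  # e.g. file paths; nothing is read yet
        self._loader = loader          # callable that turns a source into data
        self.load_count = 0            # how many loads this spool has done

    def __len__(self):
        return len(self._sources)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing only narrows the list of sources; still no loading.
            return ToySpool(self._sources[index], self._loader)
        self.load_count += 1
        return self._loader(self._sources[index])

    def __iter__(self):
        for i in range(len(self)):
            yield self[i]


def fake_loader(name):
    # Stand-in for reading a file from disk.
    return f"patch from {name}"


spool = ToySpool(["a.h5", "b.h5", "c.h5"], fake_loader)
sub_spool = spool[:2]   # no data loaded
first = sub_spool[0]    # one load happens here
```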

DASCore: Spool


  • Getting a spool from: A single file
import dascore as dc

spool = dc.spool('path_to_das_file.h5')

DASCore: Spool


  • Getting a spool from: A collection of patches
import dascore as dc

patch_list = [dc.get_example_patch()]
spool = dc.spool(patch_list)

DASCore: Spool


  • Getting a spool from: A directory of DAS files
import dascore as dc

spool = dc.spool('das_directory').update()

DASCore: Spool


  • Getting a spool from: DASCore’s example data set
import dascore as dc

spool = dc.get_example_spool()

DASCore: Spool


  • Patches are accessed via indexing
import dascore as dc
spool = dc.get_example_spool()

patch = spool[0]

DASCore: Spool


  • Or via iteration
import dascore as dc
spool = dc.get_example_spool()

for patch in spool:
  ...

DASCore: Spool


  • Sub-spools can be created with slices
import dascore as dc
spool = dc.get_example_spool()

sub_spool = spool[:2]

DASCore: Spool


  • Spools are filtered with select
import dascore as dc
spool = dc.get_example_spool()

filtered_spool = spool.select(
  tag='experiment1',
  time=(None, '2021-01-01'),
  distance=(10, 100)
)
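
One way to read the `(low, high)` tuples above is as a closed range where `None` leaves that side open. The sketch below applies that reading to a few made-up metadata records; `within` and `records` are hypothetical, and real selection may also trim the data inside each patch, which this sketch does not attempt.

```python
def within(value, bounds):
    """True if value lies in (low, high); None leaves that side open."""
    low, high = bounds
    return (low is None or low <= value) and (high is None or value <= high)


# Made-up records; ISO date strings compare correctly as plain strings.
records = [
    {"tag": "experiment1", "time": "2020-06-01"},
    {"tag": "experiment1", "time": "2021-03-01"},
    {"tag": "experiment2", "time": "2020-06-01"},
]

kept = [
    r for r in records
    if r["tag"] == "experiment1" and within(r["time"], (None, "2021-01-01"))
]
```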

DASCore: Spool


  • Data are chunked with chunk
import dascore as dc
spool = dc.get_example_spool()

chunked_spool = spool.chunk(
    time=60,  # chunk length along time, in seconds
    overlap=10,  # overlap between adjacent chunks, in seconds
)
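
The arithmetic behind a chunk length and an overlap can be sketched as follows, assuming that adjacent chunks share `overlap` seconds so each new chunk starts `length - overlap` seconds after the previous one. `chunk_windows` is a hypothetical helper, and this sketch drops a trailing partial window; it does not claim to match DASCore's handling of partial chunks.

```python
def chunk_windows(start, stop, length, overlap=0):
    """Return (begin, end) windows of `length` that fit inside [start, stop]."""
    if overlap >= length:
        raise ValueError("overlap must be smaller than length")
    step = length - overlap  # distance between consecutive window starts
    windows = []
    begin = start
    while begin + length <= stop:
        windows.append((begin, begin + length))
        begin += step
    return windows


# 300 seconds of data, 60 s windows, 10 s shared between neighbors.
windows = chunk_windows(0, 300, length=60, overlap=10)
print(windows)
# [(0, 60), (50, 110), (100, 160), (150, 210), (200, 260)]
```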

DASCore: Spool


  • chunk can also merge adjacent/overlapping data
import dascore as dc
spool = dc.get_example_spool()

chunked_spool = spool.chunk(
    time=None,  # None: merge as much adjacent/overlapping data as possible
)
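
Merging adjacent or overlapping data can be pictured as merging time intervals. The sketch below is a generic interval merge with a hypothetical `merge_segments` helper and made-up segment bounds; it illustrates the concept only and is not DASCore's merging logic.

```python
def merge_segments(segments, tolerance=0.0):
    """Merge (begin, end) segments that touch or overlap."""
    merged = []
    for begin, end in sorted(segments):
        if merged and begin <= merged[-1][1] + tolerance:
            # Touches or overlaps the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # A gap separates this segment from the previous one.
            merged.append((begin, end))
    return merged


merged = merge_segments([(0, 10), (10, 20), (18, 30), (45, 60)])
print(merged)  # [(0, 30), (45, 60)]
```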

DASCore: Supported Formats


  • Terra15
  • TDMS*
  • Silixa HDF5*
  • WAV
  • DASDAE
  • SEGY*
  • APSensing*
  • Optasense*
  • Zarr*

DASCore: Future Work


  • Remote data spool
  • Additional file formats
  • Finalizing metadata schema
  • DASDAE format improvements
  • Performance improvements
  • Array interoperability (PyTorch, JAX, NumPy)
  • Testing with Ray/Dask

Acknowledgements

  • Colorado School of Mines ITS and CIARC

  • NSF Geoinformatics grant 2148614

  • DOE STTR grant DE-SC0022478, subcontract 7026-DOE-1T/MINES with Luna Innovations

  • NIOSH funding for Derrick Chambers PhD studies

  • AFRL Acknowledgement: “This material is based on research sponsored by Air Force Research Laboratory (AFRL) under agreement number FA9453-21-2-0018. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon.”

  • AFRL Disclaimer: “The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Air Force Research Laboratory (AFRL) or the U.S. Government.”

DASCore: Final Note


  • DASCore is very new
  • There will be:
    • Bugs
    • API improvements
  • You can help shape its future!