scida: scalable analysis for scientific big data

Chris Byrohl, Dylan Nelson

JOSS · October 2023

scida is a Python package for scalable analysis of large scientific datasets. It provides a unified interface for reading and analyzing data from various astrophysical simulations and observations, with built-in support for distributed computing via dask.

Key Features

  • Unified Interface: Read data from AREPO, GADGET, SWIFT, and GIZMO simulations, as well as observational datasets, through a consistent API
  • Lazy Evaluation: Data is loaded only when needed, which makes it possible to work with datasets larger than memory
  • Scalable: Built on dask for parallel and distributed computing
  • Unit-Aware: Automatic unit handling and conversion via pint
  • Extensible: Easy to add support for new data formats
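
The lazy-evaluation feature listed above can be illustrated with a small conceptual sketch. It uses only the Python standard library and is not scida's or dask's API: the names read_chunks and LazyMean are hypothetical, and the "dataset" is generated on the fly rather than read from a simulation file. The point is only that building the computation reads nothing, and that evaluating it holds a single chunk in memory at a time.

```python
from typing import Callable, Iterator, List


def read_chunks(n_items: int, chunk_size: int) -> Iterator[List[float]]:
    """Hypothetical chunked reader: yields one chunk at a time.

    A real reader would pull each chunk from disk; here the values are
    generated so that the sketch is self-contained.
    """
    for start in range(0, n_items, chunk_size):
        stop = min(start + chunk_size, n_items)
        yield [float(i) for i in range(start, stop)]


class LazyMean:
    """Deferred mean: no chunk is read until compute() is called."""

    def __init__(self, chunk_source: Callable[[], Iterator[List[float]]]):
        self._chunk_source = chunk_source
        self.chunks_read = 0  # counts how many chunks have been loaded

    def compute(self) -> float:
        total, count = 0.0, 0
        for chunk in self._chunk_source():
            # Only the current chunk is held in memory.
            self.chunks_read += 1
            total += sum(chunk)
            count += len(chunk)
        if count == 0:
            raise ValueError("cannot take the mean of an empty dataset")
        return total / count


# Describe the computation; nothing is read at this point.
lazy = LazyMean(lambda: read_chunks(n_items=1_000_000, chunk_size=100_000))
reads_before = lazy.chunks_read

# Trigger the evaluation, one chunk at a time.
mean = lazy.compute()
reads_after = lazy.chunks_read
```

Libraries such as dask generalize this pattern by recording a graph of per-chunk tasks and running them in parallel or across machines; the sketch above evaluates the chunks sequentially.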

Visual Impressions

scida can be used to analyze and visualize a wide variety of astrophysical datasets.

Cosmological Simulations

  • TNG100: metallicity projection at z=2
  • THESAN: neutral hydrogen projection at z=6
  • FLAMINGO: density projection at z=2
  • SIMBA: temperature projection at z=2

Observational Data

  • SDSS DR16: sky map in Aitoff projection