sheap.Utils.SpectralReaders module

FITS Spectrum Readers

This module provides utilities for reading spectra from several survey and simulation formats (SDSS, DESI, PyQSO, and custom simulations). It also includes parallel and batched readers for handling many files efficiently, plus a sequential reader as a fallback.

Readers

  • fits_reader_sdss: SDSS spectra (PLATE-MJD-FIBERID format)

  • fits_reader_desi: DESI spectra

  • fits_reader_pyqso: PyQSO pipeline spectra

  • fits_reader_simulation: Simulated spectra

Batching / Parallel utilities

  • parallel_reader

  • batched_reader

  • sequential_reader

batched_reader(paths, batch_size=8, function=<function fits_reader_sdss>)[source]

Batch reader that processes files batch_size at a time to limit peak memory usage.

Parameters:
  • paths (list of str) – Paths to FITS files.

  • batch_size (int, optional) – Number of files to read per batch.

  • function (callable or str, optional) – Reader function or key in READER_FUNCTIONS.

Returns:

  • coords (np.ndarray) – Stacked coordinates from all batches.

  • spectra_reshaped (str) – Placeholder (currently unused).

  • spectra_raw (list of np.ndarray) – All raw spectra arrays.
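
The batching idea can be illustrated with a minimal, standard-library sketch. It is not the implementation of batched_reader: the names toy_reader, iter_batches, and batched_read_sketch are hypothetical, and the toy reader only imitates the documented (data_array, header_array) return order without touching any FITS file.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def toy_reader(path: str) -> Tuple[list, list]:
    """Hypothetical stand-in for a FITS reader: returns (spectrum, coords)."""
    spectrum = [[1.0, 2.0], [0.5, 0.6], [0.1, 0.1]]  # fake rows: wavelength, flux, error
    coords = [float(len(path)), 0.0]                  # fake (RA, DEC) pair
    return spectrum, coords


def iter_batches(paths: Sequence[str], batch_size: int = 8) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths` holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(paths), batch_size):
        yield list(paths[start:start + batch_size])


def batched_read_sketch(
    paths: Sequence[str],
    batch_size: int = 8,
    reader: Callable[[str], Tuple[list, list]] = toy_reader,
) -> Tuple[list, list]:
    """Read one batch at a time and accumulate coordinates and spectra."""
    coords, spectra = [], []
    for batch in iter_batches(paths, batch_size):
        results = [reader(p) for p in batch]  # only one batch is in flight
        spectra.extend(s for s, _ in results)
        coords.extend(c for _, c in results)
    return coords, spectra
```

With 19 paths and batch_size=8, iter_batches yields batches of 8, 8, and 3 paths, so the last batch is simply shorter rather than padded.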

fits_reader_desi(file)[source]

Read a DESI FITS spectrum.

Parameters:

file (str) – Path to DESI FITS file.

Returns:

  • data_array (np.ndarray) – Array with shape (3, n_pix): [wavelength, flux, error].

  • header_array (np.ndarray) – Array with RA and DEC from header.

fits_reader_pyqso(file)[source]

Read a PyQSO-format spectrum.

Parameters:

file (str) – Path to PyQSO FITS file.

Returns:

  • spectra (np.ndarray) – Array with shape (3, n_pix): [wavelength, flux, error].

  • header_array (list) – Empty list (no coords stored).

fits_reader_sdss(file)[source]

Read an SDSS FITS spectrum.

Parameters:

file (str) – Path to SDSS FITS file.

Returns:

  • data_array (np.ndarray) – Array with shape (4, n_pix): [wavelength, flux, error, wdisp].

  • header_array (np.ndarray) – Array with RA and DEC from header.
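
As a hedged illustration of how the documented (4, n_pix) layout might be consumed, the sketch below builds a synthetic array with made-up values instead of calling fits_reader_sdss. The row order follows the description above; the signal-to-noise computation is only an example of downstream use, not something the module is documented to do.

```python
import numpy as np

# Synthetic stand-in for the documented (4, n_pix) array; all values are made up.
n_pix = 5
data_array = np.vstack([
    np.linspace(3800.0, 9200.0, n_pix),  # row 0: wavelength
    np.full(n_pix, 10.0),                # row 1: flux
    np.full(n_pix, 2.0),                 # row 2: error
    np.full(n_pix, 1.2),                 # row 3: wdisp
])

# Unpacking iterates over the first axis, giving one 1-D array per row.
wavelength, flux, error, wdisp = data_array

# Per-pixel signal-to-noise, leaving 0 where the error is not positive.
snr = np.divide(flux, error, out=np.zeros_like(flux), where=error > 0)
```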

fits_reader_simulation(file, chanel=1, template=False)[source]

Read a simulated spectrum from a FITS file.

Parameters:
  • file (str) – Path to simulation FITS file.

  • chanel (int, default=1) – HDU extension index to read.

  • template (bool, default=False) – If True, reads template arrays.

Returns:

  • data_array (np.ndarray) – Array with shape (n_channels, n_pix).

  • header_array (list) – Empty list or metadata, depending on template.

parallel_reader(paths, n_cpu=2, function=<function fits_reader_sdss>, **kwargs)[source]

Parallel reader using multiprocessing.

Parameters:
  • paths (list of str) – Paths to FITS files.

  • n_cpu (int, optional) – Number of processes to use (default=2, per the signature).

  • function (callable or str, optional) – Reader function or key in READER_FUNCTIONS.

Returns:

  • coords (np.ndarray) – Coordinates from headers (RA, DEC).

  • spectra_reshaped (list) – Placeholder for reshaped spectra (currently empty).

  • spectra (list of np.ndarray) – Raw spectra arrays.
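
The pattern of mapping a reader over many paths with a worker pool can be sketched as follows. This is not the implementation of parallel_reader: to stay self-contained it uses a thread pool from concurrent.futures rather than multiprocessing, the names toy_reader and pooled_read_sketch are hypothetical, and forwarding **kwargs to the reader is an assumption, since the source does not document what the extra keyword arguments do.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_reader(path):
    """Hypothetical stand-in reader returning (spectrum, coords)."""
    return [[1.0], [2.0], [0.1]], [float(len(path)), 0.0]


def pooled_read_sketch(paths, n_workers=2, reader=toy_reader, **kwargs):
    """Apply `reader` to every path using a pool of worker threads."""

    def _call(path):
        # Assumption: extra keyword arguments are passed on to the reader.
        return reader(path, **kwargs)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map returns results in the same order as `paths`.
        results = list(pool.map(_call, paths))

    coords = [c for _, c in results]
    spectra = [s for s, _ in results]
    return coords, spectra
```

Because the results come back in input order, coords[i] and spectra[i] both correspond to paths[i], which keeps coordinates and spectra aligned regardless of which worker finishes first.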

sequential_reader(paths, function=<function fits_reader_sdss>)[source]

Sequential FITS reader (fallback for debugging).

Parameters:
  • paths (list of str) – Paths to FITS files.

  • function (callable or str, optional) – Reader function or key in READER_FUNCTIONS.

Returns:

  • coords (np.ndarray) – Coordinates from headers (RA, DEC).

  • spectra_reshaped (np.ndarray) – Reshaped spectra array.

  • spectra (list of np.ndarray) – Raw spectra arrays.
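
The "callable or str" convention for the function argument can be sketched with a small registry lookup followed by a plain loop. Everything here is hypothetical: TOY_READERS stands in for READER_FUNCTIONS, whose actual keys are not listed in this documentation, and the toy readers return fixed values instead of reading files.

```python
def toy_sdss(path):
    """Hypothetical reader: four-row spectrum plus (RA, DEC)."""
    return [[1.0], [2.0], [0.1], [1.2]], [10.0, -5.0]


def toy_pyqso(path):
    """Hypothetical reader: three-row spectrum and no coordinates."""
    return [[1.0], [2.0], [0.1]], []


# Hypothetical registry; the real READER_FUNCTIONS keys may differ.
TOY_READERS = {"sdss": toy_sdss, "pyqso": toy_pyqso}


def resolve_reader(function):
    """Return `function` if it is callable, else look it up by key."""
    if callable(function):
        return function
    try:
        return TOY_READERS[function]
    except KeyError:
        raise ValueError(
            f"Unknown reader {function!r}; expected one of {sorted(TOY_READERS)}"
        ) from None


def sequential_read_sketch(paths, function=toy_sdss):
    """Read files one at a time, which makes failures easy to trace."""
    reader = resolve_reader(function)
    coords, spectra = [], []
    for path in paths:
        spectrum, header = reader(path)
        spectra.append(spectrum)
        coords.append(header)
    return coords, spectra
```

Resolving the reader once, before the loop, means a misspelled key fails immediately instead of after some files have already been read.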