mangadap.util.sampling module

Provides a set of functions to handle resampling.


License

Copyright © 2019, SDSS-IV/MaNGA Pipeline Group


class mangadap.util.sampling.Resample(y, e=None, mask=None, x=None, xRange=None, xBorders=None, inLog=False, newx=None, newRange=None, newBorders=None, newpix=None, newLog=True, newdx=None, base=10.0, ext_value=0.0, conserve=False, step=True, covar=False)

Bases: object

Resample regularly or irregularly sampled data to a new grid using integration.

This is a generalization of the routine ppxf.ppxf_util.log_rebin() provided by Michele Cappellari in the pPXF package.

The abscissa coordinates (x) or the pixel borders (xBorders) for the data (y) should be provided for irregularly sampled data. If the input data are linearly or geometrically sampled (the latter flagged by inLog=True), the abscissa coordinates can instead be generated from the input range (xRange), which gives the (geometric) centers of the first and last grid points. If x, xBorders, and xRange are all None, the function assumes grid coordinates of x = numpy.arange(y.shape[-1]).

The function resamples the data by constructing the borders of the output grid using the new* keywords and integrating the input function between those borders. The output data will be set to ext_value for any data beyond the abscissa limits of the input data.

The data to resample (y) can be a 1D or 2D array; the abscissa coordinates must always be 1D. If y is 2D, the resampling is performed along the last axis (i.e., axis=-1).

The nominal assumption is that the provided function is a step function based on the provided input (i.e., step=True). If the output grid is substantially finer than the input grid, the assumption of a step function will be very apparent. To assume the function is instead linearly interpolated between each provided point, choose step=False; higher-order interpolations are not provided.
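
The step-function integration described above can be illustrated with a minimal NumPy sketch. This is not the Resample implementation: the helper name rebin_step and its arguments are hypothetical, and the output borders are assumed to lie within the input borders. The input is treated as constant across each input pixel, its cumulative integral is evaluated at the output borders, and the integral over each output pixel is divided by the output pixel width.

```python
import numpy as np

def rebin_step(y, borders, new_borders):
    """Integrate a step function over new pixels (hypothetical helper).

    y holds density values, one per input pixel; borders has len(y)+1
    entries; new_borders is assumed to lie within the input borders.
    """
    y = np.asarray(y, dtype=float)
    borders = np.asarray(borders, dtype=float)
    new_borders = np.asarray(new_borders, dtype=float)
    # Cumulative integral of the step function at each input border.
    cumulative = np.concatenate(([0.0], np.cumsum(y * np.diff(borders))))
    # The integral of a step function is piecewise linear, so linear
    # interpolation of the cumulative integral is exact.
    integral = np.interp(new_borders, borders, cumulative)
    return np.diff(integral) / np.diff(new_borders)

y = np.array([1.0, 3.0, 5.0, 7.0])
# Coarser grid: each output pixel averages two input pixels.
coarse = rebin_step(y, np.arange(5.0), np.array([0.0, 2.0, 4.0]))
# Finer grid: each input value is simply repeated, which makes the
# step-function assumption apparent.
fine = rebin_step(y, np.arange(5.0), np.linspace(0.0, 4.0, 9))
```

On the finer grid every input value appears twice in a row, which is the visible signature of the step-function assumption mentioned above.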

If errors are provided, a nominal error propagation is performed to provide the errors in the resampled data.

Warning

Depending on the details of the resampling, the output errors are likely highly correlated. Any later analysis of the resampled function should account for this.

The covariance in the resampled pixels can be constructed by setting covar=True; however, this is currently only supported when step=True. If no errors are provided and covar=True, the computed matrix is the correlation matrix instead of the covariance matrix. Given that the resampling is the same for all vectors, only one correlation matrix will be calculated if no errors are provided, even if the input y is 2D. If the input data to be resampled is 2D and errors are provided, a covariance matrix is calculated for each vector in y. Beware that this can be an expensive computation.
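
Because resampling is a linear operation, the output covariance follows from standard linear error propagation. The sketch below is only an illustration under stated assumptions: the matrix A is hand-built (three unit-width input pixels rebinned to two output pixels of width 1.5), the errors are made up, and using unit variances to obtain a correlation matrix is one plausible reading of the no-error case, not a statement of how Resample computes it.

```python
import numpy as np

# Stand-in resampling operator, hand-built for illustration:
# 3 unit-width input pixels -> 2 output pixels of width 1.5.
A = np.array([[2/3, 1/3, 0.0],
              [0.0, 1/3, 2/3]])

e = np.array([0.1, 0.2, 0.1])   # hypothetical 1-sigma input errors

# Independent input errors give a diagonal input covariance, which is
# propagated through the linear operator.
covariance = A @ np.diag(e**2) @ A.T

# Assumption: without errors, use unit variances and normalize by the
# diagonal to obtain a correlation matrix.
unit = A @ A.T
sigma = np.sqrt(np.diag(unit))
correlation = unit / np.outer(sigma, sigma)
```

The two output pixels share the middle input pixel, so the off-diagonal correlation is non-zero (0.2 in this example) even though the input errors are independent.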

The conserve keyword sets how the units of the input data should be treated. If conserve=False, the input data are expected to be in density units (i.e., per x coordinate unit) such that the integral over \(dx\) is independent of the units of \(x\) (e.g., flux per angstrom, a flux density). If conserve=True, the value of the data is assumed to have been integrated over the size of each pixel (i.e., units of flux). In that case, \(y\) is first converted to units of per step in \(x\) such that the integral before and after the resampling is the same. For example, if \(y\) is a spectrum in units of flux, the function first converts the units to flux density and then computes the integral over each new pixel to produce the new spectrum in units of flux.
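
The difference between the two unit conventions can be sketched with plain NumPy. The helper rebin_density is hypothetical and assumes a step function with output borders inside the input borders; it is not the Resample code, only an analogue of the behavior described above.

```python
import numpy as np

def rebin_density(density, borders, new_borders):
    # Integrate a step function between the new borders and return the
    # result in density units (hypothetical helper).
    cumulative = np.concatenate(([0.0], np.cumsum(density * np.diff(borders))))
    integral = np.interp(new_borders, borders, cumulative)
    return np.diff(integral) / np.diff(new_borders)

borders = np.arange(5.0)                 # four input pixels of width 1
new_borders = np.array([0.0, 2.0, 4.0])  # two output pixels of width 2
values = np.array([1.0, 3.0, 5.0, 7.0])

# Analogue of conserve=True: flux -> flux density -> rebin -> flux.
density = values / np.diff(borders)
new_flux = rebin_density(density, borders, new_borders) * np.diff(new_borders)

# Analogue of conserve=False: the values are already densities.
new_density = rebin_density(values, borders, new_borders)
```

In the first case the sum of the values is preserved (16 before and after); in the second the output is the width-weighted mean of the input within each new pixel.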

Todo

  • Allow for higher order interpolations.

  • Enable covariance matrix calculations for step=False.

  • Provide examples

Parameters:
  • y (numpy.ndarray, numpy.ma.MaskedArray) – Data values to resample. The shape can be 1D or 2D. If 1D, the shape must be \((N_{\rm pix},)\); otherwise, it must be \((N_y,N_{\rm pix})\). I.e., the length of the last axis must match the input coordinates.

  • e (numpy.ndarray, numpy.ma.MaskedArray, optional) – Errors in the data that should be resampled. The shape must match the input y array. These data are used to perform a nominal calculation of the error in the resampled array.

  • mask (numpy.ndarray, optional) – A boolean array indicating values in y that should be ignored during the resampling (values to ignore are set to True, just like in a numpy.ma.MaskedArray). The mask used during the resampling is the union of this object and the masks of y and e, if either is provided as a numpy.ma.MaskedArray object.

  • x (numpy.ndarray, optional) – Abscissa coordinates for the data, which do not need to be regularly sampled. If the pixel borders are not provided, they are assumed to be half-way between adjacent pixels, and the first and last borders are assumed to be equidistant about the provided value. If these coordinates are not provided, they are determined by the input borders, the input range, or just assumed to be the indices, \(0..N_{\rm pix}-1\).

  • xRange (array-like, optional) – A two-element array with the starting and ending value for the coordinates of the centers of the first and last pixels in y. Default is \([0,N_{\rm pix}-1]\).

  • xBorders (numpy.ndarray, optional) – An array with the borders of each pixel that must have a length of \(N_{\rm pix}+1\).

  • inLog (bool, optional) – Flag that the input is logarithmically binned, primarily meaning that the coordinates are at the geometric center of each pixel and the centers are spaced logarithmically. If false, the sampling is expected to be linear.

  • newx (array-like, optional) – Abscissa coordinates for the output data, which do not need to be a regular grid. If this is provided, the pixel borders are assumed to be half-way between adjacent pixels, and the first and last borders are assumed to be equidistant about the provided value. If these coordinates are not provided, they are determined by the new range, the new number of pixels, and/or the new pixel width (and whether or not the new grid should be logarithmically binned). If this is provided, newRange, newpix, newLog, and newdx are all ignored.

  • newRange (array-like, optional) – A two-element array with the (geometric) centers of the first and last pixel in the output vector. If not provided, assumed to be the same as the input range.

  • newBorders (array-like, optional) – An array with the borders of each pixel in the resampled vectors.

  • newpix (int, optional) – Number of pixels for the output vector. If not provided, assumed to be the same as the input vector.

  • newLog (bool, optional) – The output vector should be logarithmically binned.

  • newdx (float, optional) – The sampling step for the output vector. If newLog=True, this must be the change in the logarithm of \(x\) for the output vector! If not provided, the sampling is set by the output range (see newRange above) and number of pixels (see newpix above).

  • base (float, optional) – The base of the logarithm used for both input and output sampling, if specified. The default is 10; use numpy.exp(1) for natural logarithm.

  • ext_value (float, optional) – Set extrapolated values to the provided float. If set to None, values are just set to the linear extrapolation of the data beyond the provided limits; use ext_value=None with caution!

  • conserve (bool, optional) – Conserve the integral of the input vector. For example, if the input vector is a spectrum in flux units, you should conserve the flux in the resampling; if the spectrum is in units of flux density, you do not want to conserve the integral.

  • step (bool, optional) – Treat the input function as a step function during the resampling integration. If False, use a linear interpolation between pixel samples.

  • covar (bool, optional) – Calculate the covariance matrix between pixels in the resampled vector. Can only be used if step=True. If no error vector is provided (e), the result is a correlation matrix.
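
The union of masks described for the mask parameter can be sketched as follows. The arrays are made up for illustration, and the expression is an assumption about how such a union can be formed with NumPy, not a copy of the class internals.

```python
import numpy as np

y = np.ma.MaskedArray([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])
e = np.ma.MaskedArray([0.1, 0.1, 0.1, 0.1], mask=[False, False, True, False])
mask = np.array([False, False, False, True])

# getmaskarray returns an all-False array for unmasked input, so the
# same expression also works for plain numpy.ndarray objects.
combined = mask | np.ma.getmaskarray(y) | np.ma.getmaskarray(e)
```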

x

The coordinates of the function on input.

Type:

numpy.ndarray

xborders

The borders of the input pixel samples.

Type:

numpy.ndarray

y

The function to resample.

Type:

numpy.ndarray

e

The 1-sigma errors in the function to resample.

Type:

numpy.ndarray

m

The boolean mask for the input function.

Type:

numpy.ndarray

outx

The coordinates of the function on output.

Type:

numpy.ndarray

outborders

The borders of the output pixel samples.

Type:

numpy.ndarray

outy

The resampled function.

Type:

numpy.ndarray

oute

The resampled 1-sigma errors.

Type:

numpy.ndarray

outf

The fraction of each output pixel that includes valid data from the input function.

Type:

numpy.ndarray

covar

The covariance or correlation matrices for the resampled vectors.

Type:

Covariance

Raises:

ValueError – Raised if more than one of x, xRange, or xBorders are provided, if more than one of newx, newRange, or newBorders are provided, if y is not a numpy.ndarray, if y is not 1D or 2D, if the covariance is requested but step is False, if the shapes of the provided errors or mask do not match y, if there is insufficient information to construct the input or output grid, or if either xRange or newRange is not a two-element array.

static _coordinate_grid(x=None, rng=None, nx=None, dx=None, borders=None, log=False, base=10.0)

Use the provided information to construct the coordinate grid and the grid borders.

_resample_linear(v, quad=False)

Resample the vectors.

_resample_step(v, quad=False)

Resample the vectors.

_resample_step_matrix()

Build a matrix such that

\[y = \mathbf{A} x\]

where \(x\) is the input vector, \(y\) is the resampled vector, and \(\mathbf{A}\) is the matrix operation that resamples \(x\).
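
One way to build such a matrix for step-function resampling of density values is from the overlap between each output pixel and each input pixel. The sketch below is an assumption about the construction, not the method's actual code; the helper name step_resample_matrix is hypothetical.

```python
import numpy as np

def step_resample_matrix(borders, new_borders):
    """Return A such that y_new = A @ y for step-function resampling
    of density values (hypothetical helper)."""
    borders = np.asarray(borders, dtype=float)
    new_borders = np.asarray(new_borders, dtype=float)
    # Lower and upper limits of the intersection of output pixel i
    # (rows) with input pixel j (columns).
    lo = np.maximum(new_borders[:-1, None], borders[None, :-1])
    hi = np.minimum(new_borders[1:, None], borders[None, 1:])
    # Overlap length; zero where the pixels do not intersect.
    overlap = np.clip(hi - lo, 0.0, None)
    # Normalize by the output pixel width to stay in density units.
    return overlap / np.diff(new_borders)[:, None]

A = step_resample_matrix(np.arange(4.0), np.array([0.0, 1.5, 3.0]))
y_new = A @ np.array([1.0, 3.0, 5.0])
```

When the output grid is fully covered by the input grid, each row of A sums to one, so a constant input is returned unchanged.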

mangadap.util.sampling.angstroms_per_pixel(wave, log=False, base=10.0, regular=True)

Return a vector with the angstroms per pixel at each channel.

When regular=True, the function assumes that the wavelengths are either sampled linearly or geometrically. Otherwise, it calculates the size of each pixel as the difference between the wavelength coordinates. The first and last pixels are assumed to have a width as determined by assuming the coordinate is at its center.

Note

If regular is False and log is True, the code does not assume the wavelength coordinates are at the geometric center of the pixel.

Parameters:
  • wave (numpy.ndarray) – (Geometric) centers of the spectrum pixels in angstroms.

  • log (bool, optional) – The vector is geometrically sampled.

  • base (float, optional) – Base of the logarithm used in the geometric sampling.

  • regular (bool, optional) – The vector is regularly sampled.

Returns:

The angstroms per pixel.

Return type:

numpy.ndarray
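
The three cases described above (regular linear, regular geometric, and irregular sampling) can be sketched with NumPy. The helper pixel_widths is hypothetical, and the geometric case uses the first-order relation \(d\lambda = \lambda \ln(b)\, d\log_b\lambda\), which is an assumption about the intended calculation.

```python
import numpy as np

def pixel_widths(wave, log=False, base=10.0, regular=True):
    """Approximate pixel widths in angstroms (hypothetical helper)."""
    wave = np.asarray(wave, dtype=float)
    if not regular:
        # Borders half-way between centers; the end pixels are taken
        # to be symmetric about their centers.
        mid = (wave[1:] + wave[:-1]) / 2
        first = wave[0] - (mid[0] - wave[0])
        last = wave[-1] + (wave[-1] - mid[-1])
        return np.diff(np.concatenate(([first], mid, [last])))
    if log:
        # First-order width: lambda * ln(base) * d(log_base lambda).
        dlog = np.mean(np.diff(np.log(wave))) / np.log(base)
        return wave * np.log(base) * dlog
    return np.full(wave.size, np.mean(np.diff(wave)))

linear = pixel_widths(np.array([4000.0, 4001.0, 4002.0]))
irregular = pixel_widths(np.array([4000.0, 4001.0, 4003.0]), regular=False)
log_wave = 10**(3.6 + 1e-4 * np.arange(5))
geometric = pixel_widths(log_wave, log=True)
```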

mangadap.util.sampling.borders_to_centers(borders, log=False)

Convert a set of bin borders to bin centers.

Grid borders need not be regularly spaced.

Parameters:
  • borders (numpy.ndarray) – Borders for adjoining bins.

  • log (bool, optional) – Return the geometric center instead of the linear center of the bins.

Returns:

The vector of bin centers.

Return type:

numpy.ndarray
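
A minimal sketch of the conversion, assuming the linear center is the arithmetic mean of the two bin edges and the geometric center is their geometric mean. The helper name centers_from_borders is hypothetical.

```python
import numpy as np

def centers_from_borders(borders, log=False):
    borders = np.asarray(borders, dtype=float)
    if log:
        # Geometric mean of the two edges of each bin.
        return np.sqrt(borders[:-1] * borders[1:])
    # Arithmetic mean of the two edges of each bin.
    return (borders[:-1] + borders[1:]) / 2

linear = centers_from_borders([0.0, 1.0, 3.0, 6.0])
geometric = centers_from_borders([1.0, 4.0, 16.0], log=True)
```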

mangadap.util.sampling.centers_to_borders(x, log=False)

Convert a set of bin centers to bounding edges.

Grid centers need not be regularly spaced. The first edge of the first bin and the last edge of the last bin are assumed to be equidistant from the center of the 2nd and penultimate bins, respectively.

Parameters:
  • x (numpy.ndarray) – Centers of adjoining bins.

  • log (bool, optional) – Adopt a geometric binning instead of a linear binning.

Returns:

The vector with the coordinates of adjoining bin edges.

Return type:

numpy.ndarray
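
The inverse conversion can be sketched as below. The helper borders_from_centers is hypothetical, and the treatment of the two outer edges (mirroring the nearest interior border about the first and last centers) is an assumption made for this illustration.

```python
import numpy as np

def borders_from_centers(x, log=False):
    x = np.asarray(x, dtype=float)
    if log:
        # Work in log space so the borders are geometric midpoints.
        return np.exp(borders_from_centers(np.log(x)))
    # Interior borders are half-way between adjacent centers.
    mid = (x[:-1] + x[1:]) / 2
    # Assumption: the outer edges mirror the nearest interior border
    # about the first and last centers.
    first = 2 * x[0] - mid[0]
    last = 2 * x[-1] - mid[-1]
    return np.concatenate(([first], mid, [last]))

linear = borders_from_centers([1.0, 2.0, 4.0])
geometric = borders_from_centers([2.0, 8.0, 32.0], log=True)
```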

mangadap.util.sampling.grid_borders(rng, npix, log=False, base=10.0)

Determine the borders of bin edges in a grid.

Parameters:
  • rng (array-like) – Two-element array with the (geometric) centers of the first and last pixel in the grid.

  • npix (int) – Number of pixels in the grid.

  • log (bool, optional) – The input range is (to be) logarithmically sampled.

  • base (float, optional) – The base of the logarithmic sampling. Use numpy.exp(1.) for the natural logarithm.

Returns:

Returns a numpy.ndarray with the grid borders with shape (npix+1,) and the step size per grid point. If log=True, the latter is the geometric step.

Return type:

tuple

mangadap.util.sampling.grid_centers(rng, npix, log=False, base=10.0)

Determine the (geometric) center of pixels in a grid.

Parameters:
  • rng (array-like) – Two-element array with the (geometric) centers of the first and last pixel in the grid.

  • npix (int) – Number of pixels in the grid.

  • log (bool, optional) – The input range is (to be) logarithmically sampled.

  • base (float, optional) – The base of the logarithmic sampling. Use numpy.exp(1.) for the natural logarithm.

Returns:

Returns a numpy.ndarray with the grid pixel (geometric) centers with shape (npix,) and the step size per grid point. If log=True, the latter is the geometric step.

Return type:

tuple
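
The relation between the range, the number of pixels, the centers, and the borders of a regular grid can be sketched as follows. The helper regular_grid is hypothetical; in particular, it returns the step in the logarithm of the coordinate when log=True, which is an assumption about what the geometric step means here.

```python
import numpy as np

def regular_grid(rng, npix, log=False, base=10.0):
    """Return (centers, borders, step) for a regular grid
    (hypothetical helper)."""
    lim = np.asarray(rng, dtype=float)
    if log:
        # Build the grid in log_base(x) and convert back at the end.
        lim = np.log(lim) / np.log(base)
    # rng holds the centers of the first and last pixels, so there are
    # npix-1 steps between them.
    step = (lim[1] - lim[0]) / (npix - 1)
    centers = lim[0] + step * np.arange(npix)
    borders = lim[0] - step / 2 + step * np.arange(npix + 1)
    if log:
        centers, borders = base**centers, base**borders
    return centers, borders, step

centers, borders, step = regular_grid([1.0, 5.0], 5)
log_centers, log_borders, log_step = regular_grid([10.0, 1000.0], 3, log=True)
```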

mangadap.util.sampling.grid_npix(rng=None, dx=None, log=False, base=10.0, default=None)

Determine the number of pixels needed for a given grid.

Parameters:
  • rng (array-like, optional) – Two-element array with the starting and ending x coordinate of the pixel centers to divide into pixels of a given width. If log is True, this must still be the linear value of the x coordinate, not log(x)!

  • dx (float, optional) – Linear or logarithmic pixel width.

  • log (bool, optional) – Flag that the range should be logarithmically binned.

  • base (float, optional) – Base for the logarithm

  • default (int, optional) – Default number of pixels to use. The default is returned if either rng or dx are not provided.

Returns:

Returns the number of pixels needed to cover rng with pixels of width dx and a two-element numpy.ndarray with the range adjusted such that it is spanned by an exact integer number of pixels of size dx.

Return type:

tuple

Raises:

ValueError – Raised if the range is not a two-element vector.
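
A minimal sketch of the calculation, assuming the number of pixels is rounded down so that the adjusted range never extends past the requested upper limit. The helper npix_for_range is hypothetical, and the rounding choice is an assumption.

```python
import numpy as np

def npix_for_range(rng, dx, log=False, base=10.0):
    rng = np.asarray(rng, dtype=float)
    if rng.size != 2:
        raise ValueError('Range must be a two-element vector.')
    # rng is always given in linear units; convert if log-binned.
    lim = np.log(rng) / np.log(base) if log else rng.copy()
    # Assumption: round down so the grid never extends past rng[1].
    npix = int(np.floor((lim[1] - lim[0]) / dx)) + 1
    # Adjust the upper limit to the center of the last full pixel.
    lim[1] = lim[0] + dx * (npix - 1)
    return npix, (base**lim if log else lim)

npix, adjusted = npix_for_range([0.0, 10.0], 3.0)
```

Here a range of 0 to 10 with a step of 3 gives four pixel centers (0, 3, 6, 9), so the adjusted range ends at 9.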

mangadap.util.sampling.spectral_coordinate_step(wave, log=False, base=10.0)

Return the uniform sampling step for the input wavelength vector.

If the sampling is logarithmic, return the change in the logarithm of the wavelength; otherwise, return the linear step in angstroms.

Parameters:
  • wave (numpy.ndarray) – Wavelength coordinates of each spectral channel in angstroms.

  • log (bool, optional) – Input spectrum has been sampled geometrically.

  • base (float, optional) – If sampled geometrically, the sampling is done using a logarithm with this base. For natural logarithm, use numpy.exp(1).

Returns:

Spectral sampling step, either in angstroms (log=False) or in log(angstroms) (log=True).

Return type:

float

Raises:

ValueError – Raised if the wavelength vector is not linearly or log-linearly sampled.
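
The check for uniform sampling and the returned step can be sketched as follows. The helper uniform_step is hypothetical, and the relative tolerance used to decide whether the sampling is uniform is an assumption.

```python
import numpy as np

def uniform_step(wave, log=False, base=10.0):
    wave = np.asarray(wave, dtype=float)
    # Compare steps in log_base(wave) for geometric sampling.
    coo = np.log(wave) / np.log(base) if log else wave
    steps = np.diff(coo)
    # Assumption: a relative tolerance decides whether the steps are
    # uniform.
    if not np.allclose(steps, steps[0], rtol=1e-6, atol=0.0):
        raise ValueError('Wavelength vector is not uniformly sampled.')
    return float(np.mean(steps))

linear_step = uniform_step(np.array([4000.0, 4001.0, 4002.0]))
log_step = uniform_step(10**(3.6 + 1e-4 * np.arange(5)), log=True)
```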

mangadap.util.sampling.spectrum_velocity_scale(wave)

Determine the velocity sampling of an input wavelength vector when log sampled.

Note

The wavelength vector is assumed to be geometrically sampled! However, the input units are expected to be in angstroms, not, e.g., log(angstrom).

Parameters:

wave (numpy.ndarray) – Wavelength coordinates of each spectral channel in angstroms. It is expected that the spectrum has been sampled geometrically.

Returns:

Velocity scale of the spectrum in km/s.

Return type:

float
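
For a geometrically sampled spectrum the step in \(\ln\lambda\) is constant, and the velocity scale follows from \(\Delta v = c\,\Delta\ln\lambda\). The sketch below illustrates this relation with a hypothetical helper, velocity_scale, and a hard-coded speed of light; it is not the function's actual code.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def velocity_scale(wave):
    wave = np.asarray(wave, dtype=float)
    # For geometric sampling, d(ln lambda) is constant and
    # dv = c * d(ln lambda).
    return C_KMS * float(np.mean(np.diff(np.log(wave))))

# Wavelengths in angstroms, sampled with a step of 1e-4 in log10.
wave = 10**(3.6 + 1e-4 * np.arange(100))
velscale = velocity_scale(wave)
```

A step of 1e-4 in the base-10 logarithm of the wavelength corresponds to a velocity scale of about 69 km/s per pixel.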