# Covariance¶

Note

This page describes the MaNGA DAP Covariance object in general terms. For its specific use in the DAP for handling spatial covariance in the MaNGA datacubes, see Spatial Covariance.

The DAP Covariance object is a general utility for constructing, visualizing, and storing two- and three-dimensional covariance matrices.

## Construction¶

Beyond the nominal instantiation method, there are numerous methods used to construct a Covariance object.

1. The simplest approaches are if you have a variance vector or a pre-calculated covariance matrix as a dense array and you want to construct a Covariance object for further calculations:

import numpy

# Construct the Covariance object just from a variance vector
var = numpy.ones(3, dtype=float)
covar = Covariance.from_variance(var)
# Should be the same as the identity matrix.
assert numpy.array_equal(covar.toarray(), numpy.identity(3))

# Construct the Covariance object from a pre-calculated array
c = numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=-2) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=-1) \
+ numpy.diag(numpy.full(10, 1.0, dtype=float), k=0) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=1) \
+ numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=2)
covar = Covariance.from_array(c)
# Should be the same as the identity matrix.
assert numpy.array_equal(covar.toarray(), c)


Note the use of the toarray() to access the array; see Accessing the covariance data.

2. You can construct a covariance matrix based on samples from a distribution:

import numpy

# Build a bogus covariance matrix
m = numpy.zeros(10, dtype=float)
c = numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=-2) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=-1) \
+ numpy.diag(numpy.full(10, 1.0, dtype=float), k=0) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=1) \
+ numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=2)

# Draw samples
s = numpy.random.multivariate_normal(m, c, size=100000)

# Instantiate
covar = Covariance.from_samples(s.T, cov_tol=0.1)

# Check the values are very nearly the same as the input
assert numpy.all(numpy.absolute(c - covar.toarray()) < 0.02)


Here, we’ve drawn samples from a known multivariate normal distribution with a given covariance between its 10 axes and checked the reconstruction of the covariance matrix from those samples using from_samples(). This construction method is a simple wrapper for numpy.cov and from_array().

3. You can construct the covariance matrix that results from a matrix multiplication:

import numpy

# Build a bogus covariance matrix
c = numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=-2) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=-1) \
+ numpy.diag(numpy.full(10, 1.0, dtype=float), k=0) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=1) \
+ numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=2)

# Vector and matrix to multiply
x = numpy.ones(10, dtype=float)
t = numpy.zeros((3,10), dtype=float)
t[0,0] = 1.0
t[1,2] = 1.0
t[2,4] = 1.0
# Apply the matrix multiplication (not used; just for illustration)
y = numpy.dot(t, x)
# Expected covariance from the result
_c = numpy.diag(numpy.full(3-1, 0.2, dtype=float), k=-1) \
+ numpy.diag(numpy.full(3, 1.0, dtype=float), k=0) \
+ numpy.diag(numpy.full(3-1, 0.2, dtype=float), k=1)
covar = Covariance.from_matrix_multiplication(t, c)
assert numpy.array_equal(covar.toarray(), _c)

4. Finally, you can construct the covariance matrix from a previous instance that was saved to a fits file using the Fits file I/O methods.

## Accessing the covariance data¶

The Covariance object is primarily a storage and IO utility. Internally, the object only keeps the upper triangle of the matrix, which means that use of the cov attribute is not recommended unless you know what you’re doing.

Also note that the object can be either “2D” or “3D”. When the object is 3D (covar.dim == 3), this is just a convenience for storing multiple covariance matrices that have the same shape.

There are two ways to access the full covariance matrix, the full() and toarray() methods depending on whether you want a sparse or dense matrix, respectively. Also note that the use of the full() method requires you to specify a single channel for 3D objects. The output of these two methods can be used as you would use any scipy.sparse.csr_matrix or numpy.ndarray object.

To show the covariance matrix, you can use its show() method to quickly produce a plot, which is a simple wrapper for the toarray() method and pyplot.imshow.

## Toggling between covariance and correlation matrices¶

The Covariance object allows you to toggle between the full covariance matrix, $${\mathbf C}$$ and a correlation matrix, $${\mathbf \rho}$$, where:

$\rho_{ij} = \frac{C_{ij}}{(V_i V_j)^{1/2}}$

where $${\mathbf V}$$ is the variance vector (the diagonal elements of $${\mathbf C}$$). To convert a Covariance object to a correlation matrix (or ensure that it already is one), use to_correlation(). To revert back to a covariance matrix, use revert_correlation().

## Fits file I/O methods¶

Covariance objects can be saved as a binary table in a fits file using the write() method. To reload the covariance matrix, use the from_fits() instantiation method:

import numpy

ofile = 'test_covar_io.fits'

# Build a bogus covariance matrix
m = numpy.zeros(10, dtype=float)
c = numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=-2) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=-1) \
+ numpy.diag(numpy.full(10, 1.0, dtype=float), k=0) \
+ numpy.diag(numpy.full(10-1, 0.5, dtype=float), k=1) \
+ numpy.diag(numpy.full(10-2, 0.2, dtype=float), k=2)

# Draw samples
s = numpy.random.multivariate_normal(m, c, size=100000)

# Instantiate
covar = Covariance.from_samples(s.T, cov_tol=0.1)
# Write
covar.write(ofile)

The details of how the covariance data are stored are described by the write() method.