Execution
The MaNGA DAP was originally developed to analyze data from the SDSS-IV/MaNGA Survey, using the reduction products provided by the MaNGA data-reduction pipeline (DRP). However, it is now straightforward to apply the MaNGA DAP to any IFU datacube.
Below, we describe how to use the MaNGA DAP to analyze MaNGA datacubes, as well as how to execute the survey-level batch mode that analyzes all the MaNGA datacubes within a given directory structure.
To analyze non-MaNGA datacubes, see both How to Fit a non-MaNGA Datacube and the description of the MaNGA DAP Analysis Plans. Also, for an example of how to fit single spectra using MaNGA DAP core algorithms, see How to Fit One Spectrum.
Input files
DAP AnalysisPlan
The DAP uses a toml file to define parameters used by its modules to analyze the provided datacube. This is described in detail by Analysis Plans.
Default AnalysisPlan
If executed without an AnalysisPlan parameter file, the command-line execution of the DAP will use a default plan; see mangadap.config.analysisplan.AnalysisPlan.default() and default_plan.toml.
The DAP Datacube Configuration File
The DAP uses a configuration (ini) file to set the datacube to be analyzed and to provide some relevant metadata. These configuration files are generated at the survey level by write_config(). However, we also provide the write_dap_config script, which will generate the relevant configuration files if you have the DRPall or DRPComplete file. As with all the DAP scripts, you can use the -h command-line option to get the usage:
$ write_dap_config -h
usage: write_dap_config [-h] (-c DRPCOMPLETE | -a DRPALL) [--sres_ext SRES_EXT]
[--sres_fill SRES_FILL] [--covar_ext COVAR_EXT]
[--drpver DRPVER] [--redux_path REDUX_PATH]
[--directory_path DIRECTORY_PATH] [-o]
plate ifudesign ofile
Generate a DAP input configuration file
positional arguments:
plate Plate number
ifudesign IFU design number
ofile Output file name
options:
-h, --help show this help message and exit
-c DRPCOMPLETE, --drpcomplete DRPCOMPLETE
DRP complete fits file (default: None)
-a DRPALL, --drpall DRPALL
DRPall fits file (default: None)
--sres_ext SRES_EXT Spectral resolution extension to use. Default set by
MaNGADataCube class. (default: None)
--sres_fill SRES_FILL
If present, use interpolation to fill any masked pixels
in the spectral resolution vectors. Default set by
MaNGADataCube class. (default: None)
--covar_ext COVAR_EXT
Use this extension to define the spatial correlation
matrix. Default set by MaNGADataCube class. (default:
None)
--drpver DRPVER DRP version. Default set by MaNGADataCube class.
(default: None)
--redux_path REDUX_PATH
Path to the top-level DRP reduction directory. Default
set by MaNGADataCube class. (default: None)
--directory_path DIRECTORY_PATH
Exact path to the directory with the MaNGA DRP datacube.
The name of the file itself must match the nominal MaNGA
DRP naming convention. Default set by MaNGADataCube
class. (default: None)
-o, --overwrite Overwrite any existing files. (default: False)
To construct the configuration file for datacube 7815-3702, run:
write_dap_config 7815 3702 mangadap-7815-3702.ini -a drpall-v3_0_1.fits
to produce:
# Auto-generated configuration file
# Fri 28 Feb 2020 16:57:19
[default]
drpver
redux_path
directory_path
plate = 7815
ifu = 3702
log = True
sres_ext
sres_fill
covar_ext
z = 2.9382300e-02
vdisp
ell = 1.1084400e-01
pa = 1.6324500e+02
reff = 3.7749500e+00
Use the relevant keywords to change the paths or the extensions used for the spectral resolution and spatial correlation matrix (e.g., GCORREL).
Note
The metadata included in the configuration file is used by the DAP during the analysis; however, not all of these metadata parameters are explicitly required. See how the data is checked/used in mangadap.datacube.datacube.DataCube.populate_metadata().
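As an illustration of the file layout above, the following minimal sketch reads the example configuration with Python's standard configparser module. It assumes that keys listed without a value (e.g., drpver) can be treated as unset; how the DAP itself parses the file is not shown here, and read_dap_config is a hypothetical helper, not part of the DAP.

```python
import configparser

# Text copied from the example configuration file shown above.
CONFIG_TEXT = """\
# Auto-generated configuration file
[default]
drpver
redux_path
directory_path
plate = 7815
ifu = 3702
log = True
sres_ext
sres_fill
covar_ext
z = 2.9382300e-02
vdisp
ell = 1.1084400e-01
pa = 1.6324500e+02
reff = 3.7749500e+00
"""


def read_dap_config(text):
    """Parse the ini text; keys listed without a value come back as None."""
    # allow_no_value is an assumption made for this sketch so that bare keys parse.
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(text)
    return dict(parser["default"])


cfg = read_dap_config(CONFIG_TEXT)
plateifu = f"{cfg['plate']}-{cfg['ifu']}"
redshift = float(cfg["z"])
unset = sorted(key for key, value in cfg.items() if value is None)
print(plateifu, redshift, unset)
```

A check like this can be a quick way to confirm which metadata entries were left empty before starting a long analysis.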
DAP command-line script
The main DAP script is manga_dap, which is a simple wrapper of manga_dap(). With the DAP installed, you can call the script directly from the command line:
$ manga_dap -h
usage: manga_dap [-h] (-c CONFIG | -f CUBEFILE)
[--cube_module [CUBE_MODULE ...]] [-p PLAN]
[--plan_module [PLAN_MODULE ...]] [--dbg] [--log LOG] [-v]
[-o OUTPUT_PATH]
Perform analysis of integral-field data.
options:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
Configuration file used to instantiate the relevant
DataCube derived class. (default: None)
-f CUBEFILE, --cubefile CUBEFILE
Name of the file with the datacube data. Must be
possible to instantiate the relevant DataCube derived
class directly from the file only. (default: None)
--cube_module [CUBE_MODULE ...]
The name of the module that contains the DataCube
derived class used to read the data. (default:
mangadap.datacube.MaNGADataCube)
-p PLAN, --plan PLAN TOML file with analysis plan. If not provided, a default
plan is used. (default: None)
--plan_module [PLAN_MODULE ...]
The name of the module used to define the analysis plan
and the output paths. (default:
mangadap.config.manga.MaNGAAnalysisPlan)
--dbg Run manga_dap in debug mode (default: False)
--log LOG File name for runtime log (default: None)
-v, --verbose Set verbosity level; can be omitted and set up to -vv
(default: 0)
-o OUTPUT_PATH, --output_path OUTPUT_PATH
Top-level directory for the DAP output files; default
path is set by the provided analysis plan object (see
plan_module). (default: None)
The DAP allows you to define your own datacube class, as long as it is derived from DataCube. You can then specify that your data should be instantiated with that derived class using the cube_module argument; this defaults to mangadap.datacube.MaNGADataCube, expecting that you’re analyzing a MaNGA datacube.
When running the DAP on a MaNGA datacube, you have to provide a configuration file; however, for derived classes, you may be able to fully instantiate the relevant data just using the datacube file, which is why we’ve provided the -f option.
Note that the analysis plan file is an optional argument. If it is not given, the DAP will use the Default AnalysisPlan. As discussed in Analysis Plans, you can also provide a derived class for the analysis plan, which mainly provides fine-tuned control over the output directory structure. If you’re analyzing a non-MaNGA datacube, you will likely be fine just using the analysis plan base class. However, because manga_dap defaults to using the MaNGA-specific plan class, you’ll need to specify the base class (mangadap.config.analysisplan.AnalysisPlan) with the plan_module argument when executing the code.
To run the DAP on a single datacube using the default analysis plan, and assuming you have the DRPall file and the relevant LOGCUBE and LOGRSS files (see the warning below) in the current directory, you could execute the DAP as follows:
write_dap_config 7815 3702 mangadap-7815-3702.ini -a drpall-v3_1_1.fits --directory_path .
manga_dap -c mangadap-7815-3702.ini -vv --log mangadap-7815-3702.log -o dap_output
This will analyze the datacube for observation 7815-3702 using the default analysis plan, with verbose output and a log written to mangadap-7815-3702.log, and with the root directory for all the DAP output (except for the log) set to dap_output. These commands can be successfully run in the $MANGADAP_DIR/mangadap/data/remote directory if you’ve run the download_test_data.py script.
Warning
When running the DAP on a MaNGA datacube, you should have both the DRP LOGRSS and LOGCUBE files in the directory_path if you want to account for the Spatial Covariance! If the LOGRSS files are not present, the DAP will throw a warning and continue, which means that the warning can get buried among all the other messages and is likely to be missed.
Programmatic execution
The manga_dap executable script is simple, in that it only (1) reads the datacube, (2) sets the analysis plan, and (3) executes the main DAP analysis wrapper function, mangadap.survey.manga_dap.manga_dap(). It is then straightforward to construct a script that executes the DAP programmatically for many datacubes.
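One way to loop over many datacubes is sketched below. Rather than calling manga_dap() directly (its signature is not described here), this sketch drives the manga_dap command-line script through subprocess, using only options listed in the help output above. The helper names and the mangadap-<plate>-<ifu> file naming are assumptions taken from the single-cube example, and the configuration files are assumed to exist already.

```python
import shlex
import subprocess


def manga_dap_command(plate, ifu, output_path="dap_output"):
    """Build the manga_dap call for one datacube from flags in the help text."""
    root = f"mangadap-{plate}-{ifu}"
    return ["manga_dap", "-c", f"{root}.ini", "--log", f"{root}.log",
            "-vv", "-o", output_path]


def run_all(plateifus, dry_run=True):
    """Run (or, when dry_run is set, only print) one command per datacube."""
    commands = [manga_dap_command(plate, ifu) for plate, ifu in plateifus]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops the loop at the first failing datacube.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: nothing is executed, the commands are only printed.
commands = run_all([(7815, 3702), (7443, 12701)])
```

Keeping dry_run=True as the default makes it easy to inspect the commands before committing to a long run.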
Batch execution using automatically generated scripts
Note
Version 4.x of the DAP has never been used for a batch execution/analysis of the MaNGA data. The code should work properly. If it doesn’t, please Submit an issue.
The survey-level execution of the DAP uses the rundap script, which is a simple wrapper of rundap. This script
sets up the DAP output directory structure,
either confirms that a provided list of datacubes to analyze exist on disk or trawls the DRP directory structure to find all or some subset of available datacubes to analyze,
creates The DAP Datacube Configuration File for each plateifu to be analyzed,
creates a script file for each plateifu that can be sourced to execute the DAP and the associated QA plots,
creates scripts that execute the plate-level QA plots,
creates scripts that build the DAPall file and its QA plots, and
submits the scripts to the Utah cluster.
The last step uses an SDSS python package called pbs, which isn’t required for the more general-purpose use of the rundap script discussed here. With the DAP installed, you can call the script directly from the command line:
$ rundap -h
usage: rundap [-h] [--overwrite] [-v] [--quiet] [--print_version]
[--drpver DRPVER] [--redux_path REDUX_PATH] [--dapver DAPVER]
[--analysis_path ANALYSIS_PATH] [--plan_file PLAN_FILE]
[--platelist PLATELIST] [--ifudesignlist IFUDESIGNLIST]
[--list_file LIST_FILE] [--combinatorics] [--sres_ext SRES_EXT]
[--sres_fill SRES_FILL] [--covar_ext COVAR_EXT] [--on_disk]
[--can_analyze] [--log] [--no_proc] [--no_plots] [--post]
[--post_plots] [--label LABEL] [--nodes NODES] [--cpus CPUS]
[--fast QOS] [--umask UMASK] [--walltime WALLTIME] [--toughness]
[--create] [--submit] [--progress] [--queue QUEUE]
Perform analysis of integral-field data.
options:
-h, --help show this help message and exit
--overwrite if all selected, will run dap for all
plates/ifudesigns/modes regardless of state (default:
False)
-v, --verbose Set verbosity level for manga_dap; can be omitted and
set up to -vv (default: 0)
--quiet suppress screen output (default: False)
--print_version print DAP version and stop (default: False)
--drpver DRPVER MaNGA DRP version for analysis; $MANGADRP_VER by default
(default: None)
--redux_path REDUX_PATH
main DRP output path (default: None)
--dapver DAPVER optional output version, different from product version.
This *only* affects the output directory structure. It
does *not* select the version of the DAP to use.
(default: None)
--analysis_path ANALYSIS_PATH
main DAP output path (default: None)
--plan_file PLAN_FILE
parameter file with the MaNGA DAP execution plan to use
instead of the default (default: None)
--platelist PLATELIST
set list of plates to reduce (default: None)
--ifudesignlist IFUDESIGNLIST
set list of ifus to reduce (default: None)
--list_file LIST_FILE
A file with the list of plates and ifudesigns to analyze
(default: None)
--combinatorics force execution of all permutations of the provided
lists (default: False)
--sres_ext SRES_EXT Spectral resolution extension to use. Default set by
MaNGADataCube class. (default: None)
--sres_fill SRES_FILL
If present, use interpolation to fill any masked pixels
in the spectral resolution vectors. Default set by
MaNGADataCube class. (default: None)
--covar_ext COVAR_EXT
Use this extension to define the spatial correlation
matrix. Default set by MaNGADataCube class. (default:
None)
--on_disk When using the DRPall file to collate the data for input
to the DAP, search for available DRP files on disk
instead of using the DRPall file content. (default:
False)
--can_analyze Only construct script files for datacubes that
can/should be analyzed by the DAP. See :func:`~mangadap.
survey.drpcomplete.DRPComplete.can_analyze`. (default:
False)
--log Have the main DAP executable produce a log file
(default: False)
--no_proc Do NOT perform the main DAP processing steps (default:
False)
--no_plots Do NOT create QA plots (default: False)
--post Create/Submit the post-processing scripts (default:
False)
--post_plots Create/Submit the post-processing plotting scripts
(default: False)
--label LABEL label for cluster job (default: mangadap)
--nodes NODES number of nodes to use in cluster (default: 1)
--cpus CPUS number of cpus to use per node. Default is to use all
available; otherwise, set to minimum of provided number
and number of processors per node (default: None)
--fast QOS qos state (default: None)
--umask UMASK umask bit for cluster job (default: 0027)
--walltime WALLTIME walltime for cluster job (default: 240:00:00)
--toughness turn off hard keyword for cluster submission (default:
True)
--create use the pbs package to create the cluster scripts
(default: False)
--submit submit the scripts to the cluster (default: False)
--progress instead of closing the script, report the progress of
the analysis on the cluster; this is required if you
want to submit the DAPall script immediately after
completing the individual cube analysis (default: False)
--queue QUEUE set the destination queue (default: None)
If a DAP AnalysisPlan is not provided, the scripts will use the Default AnalysisPlan.
An example call of this script that will only construct scripts for the analysis of observation 7443-12701 using the default AnalysisPlan is:
rundap --platelist 7443 --ifudesignlist 12701 --redux_path /path/with/drp/output/drpver --analysis_path /path/for/dap/output/drpver/dapver -vv --log
In this call, we’ve specified that the DRP data is in /path/with/drp/output/drpver and that the DAP output should be placed in /path/for/dap/output/drpver/dapver instead of using the default DAP Directory Structure.
The script file this call produces is written to /path/for/dap/output/drpver/dapver/log/[time]/7443/12701/mangadap-7443-12701-LOGCUBE, where [time] is a time stamp of when rundap was executed. (If you execute rundap multiple times, it will create new directories using new time stamps each time.) The lines of the script file for each plate-ifu:
touch the *.started file,
execute manga_dap,
execute a series of QA plotting scripts, and
touch the *.done file.
The example script generated by the above command would look something like this:
# Auto-generated batch file
# Wed 06 Apr 2022 15:15:26
touch /path/for/dap/output/drpver/dapver/log/06Apr2022T22.15.26UTC/7443/12701/mangadap-7443-12701-LOGCUBE.started
OMP_NUM_THREADS=1 manga_dap -c /path/for/dap/output/drpver/dapver/common/7443/12701/mangadap-7443-12701-LOGCUBE.ini -o /path/for/dap/output/drpver/dapver --log /path/for/dap/output/drpver/dapver/log/06Apr2022T22.15.26UTC/7443/12701/mangadap-7443-12701-LOGCUBE.log -vv
OMP_NUM_THREADS=1 dap_ppxffit_qa -c /path/for/dap/output/drpver/dapver/common/7443/12701/mangadap-7443-12701-LOGCUBE.ini -o /path/for/dap/output/drpver/dapver -b 2.5
OMP_NUM_THREADS=1 spotcheck_dap_maps -c /path/for/dap/output/drpver/dapver/common/7443/12701/mangadap-7443-12701-LOGCUBE.ini -o /path/for/dap/output/drpver/dapver -b 2.5
OMP_NUM_THREADS=1 dap_fit_residuals -c /path/for/dap/output/drpver/dapver/common/7443/12701/mangadap-7443-12701-LOGCUBE.ini -o /path/for/dap/output/drpver/dapver -b 2.5
touch /path/for/dap/output/drpver/dapver/log/06Apr2022T22.15.26UTC/7443/12701/mangadap-7443-12701-LOGCUBE.done
To execute the script, you would then run:
source /path/for/dap/output/drpver/dapver/log/06Apr2022T22.15.26UTC/7443/12701/mangadap-7443-12701-LOGCUBE
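Because each script touches a *.started file first and a *.done file last, those marker files give a rough view of progress. The sketch below classifies a script by which markers exist; job_status is a hypothetical helper, not part of the DAP, and the demonstration creates made-up marker files in a temporary directory. Note that the example script shown above simply runs its lines in order, so a *.done file only shows that the last line was reached, not that manga_dap succeeded.

```python
import tempfile
from pathlib import Path


def job_status(script):
    """Classify a rundap script by its *.started and *.done marker files."""
    script = Path(script)
    started = script.with_name(script.name + ".started").exists()
    done = script.with_name(script.name + ".done").exists()
    if done:
        return "done"
    if started:
        return "started"  # still running, or stopped before the final touch
    return "not started"


# Demonstration in a temporary directory with made-up marker files.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "mangadap-7443-12701-LOGCUBE"
    before = job_status(script)
    Path(str(script) + ".started").touch()
    running = job_status(script)
    Path(str(script) + ".done").touch()
    finished = job_status(script)

print(before, running, finished)
```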
The rundap script allows you to construct scripts for all datacubes it can find on disk, all IFUs on a given plate, all combinations of a set of plate and IFU numbers, or for a specified list of plateifu IDs.
Note
The rundap script constructs the DRPComplete object and writes its associated fits file; see DRPComplete database. The data compiled into this database is pulled from the DRPall file, but some corrections may be applied to the NSA redshift or photometry; see, e.g., Redshift Fix File.
To write the post-processing scripts, execute rundap with the --post and --post_plots options. This produces two additional types of scripts:
Scripts to produce QA plots for all IFUs on a given plate. This file is written to, e.g., /path/for/dap/output/drpver/dapver/log/06Apr2022T22.15.26UTC/7443/7443_fitqa and looks like this:
# Auto-generated batch file
# Thu 07 Apr 2022 13:33:27
touch /path/for/dap/output/drpver/dapver/log/07Apr2022T20.33.27UTC/10001/10001_fitqa.started
OMP_NUM_THREADS=1 dap_plate_fit_qa 10001 --analysis_path /path/for/dap/output/drpver/dapver --plan_file /path/for/dap/output/drpver/dapver/log/07Apr2022T20.33.27UTC/plan.toml
touch /path/for/dap/output/drpver/dapver/log/07Apr2022T20.33.27UTC/10001/10001_fitqa.done
A script that builds the DAPall database and writes its QA plots. This file is written to, e.g., /path/for/dap/output/drpver/dapver/log/06Apr2022T22.15.26UTC/build_dapall and looks like this:
# Auto-generated batch file
# Thu 07 Apr 2022 13:33:27
touch /path/for/dap/output/drpver/dapver/log/07Apr2022T20.33.27UTC/build_dapall.started
OMP_NUM_THREADS=1 construct_dapall --drpver v3_1_1 -r /path/with/drp/output/drpver --dapver 3.1.0 -a /path/for/dap/output/drpver/dapver --plan_file /path/for/dap/output/drpver/dapver/log/07Apr2022T20.33.27UTC/plan.toml -vv
OMP_NUM_THREADS=1 dapall_qa --drpver v3_1_1 --redux_path /path/with/drp/output/drpver --dapver 3.1.0 --analysis_path /path/for/dap/output/drpver/dapver --plan_file /path/for/dap/output/drpver/dapver/log/07Apr2022T20.33.27UTC/plan.toml
touch /path/for/dap/output/drpver/dapver/log/07Apr2022T20.33.27UTC/build_dapall.done
In the automated run of the DAP, any entry in the DRPComplete database that meets the criteria set by mangadap.survey.drpcomplete.DRPComplete.can_analyze() will be analyzed. Currently the relevant criteria are:
MANGAID != NULL
MANGA_TARGET1 > 0 | MANGA_TARGET3 > 0
VEL > -500
An important consequence of the selection above is that any target without a provided redshift will not be analyzed by the DAP, unless it has a replacement redshift in the Redshift Fix File. Ancillary targets that were not analyzed by the DAP were most likely skipped because a redshift was not available.
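The selection can be pictured with the following stand-alone sketch. It is not the DAP's can_analyze() implementation: it assumes the three criteria are combined with a logical AND, that a missing MANGAID appears as None, an empty string, or the string "NULL", and that a missing redshift shows up as a large negative VEL. The row values are made up for illustration.

```python
def meets_criteria(entry):
    """Apply the three listed selection criteria to one database row."""
    # Assumption: a missing ID is stored as None, "" or "NULL".
    has_id = entry["MANGAID"] not in (None, "", "NULL")
    is_target = entry["MANGA_TARGET1"] > 0 or entry["MANGA_TARGET3"] > 0
    has_redshift = entry["VEL"] > -500
    return has_id and is_target and has_redshift


# Hypothetical rows; the values are invented for this example.
galaxy = {"MANGAID": "1-12345", "MANGA_TARGET1": 4096,
          "MANGA_TARGET3": 0, "VEL": 8800.0}
no_redshift = dict(galaxy, VEL=-999.0)
not_a_target = dict(galaxy, MANGA_TARGET1=0)

selected = [meets_criteria(row) for row in (galaxy, no_redshift, not_a_target)]
print(selected)
```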
Execution Recovery
If the code faults, all of the modules completed up until the fault occurred should have created a “reference” file that will effectively allow the code to pick up where it left off. The mechanism used to determine if it can do this relies simply on the existence of the expected output file. This means that if you change one of the parameters in your input Analysis Plans file without changing the keyword identifier for that module parameter set, the code may read in the existing file and keep going without incorporating your parameter change.
If you’re testing the performance with different parameter values, either perform the testing with different output directories, always remember to change the keyword for the relevant module parameter set, set the overwrite keyword for the relevant module (and each subsequent module!) to True, or just nuke the directory and start again.
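The pitfall described above can be reproduced with a toy version of the mechanism. This is only a sketch of an existence-based check, assuming one reference file per keyword; the function names, the JSON file format, and the "degree" parameter are all invented for illustration and do not reflect the DAP's actual reference files.

```python
import json
import tempfile
from pathlib import Path


def run_module(key, params, compute, output_dir, overwrite=False):
    """Reuse the reference file named after `key` if it exists; otherwise compute."""
    reference = Path(output_dir) / f"{key}.json"
    if reference.exists() and not overwrite:
        # Only the file's existence is checked, not the parameters that made it.
        return json.loads(reference.read_text())
    result = compute(**params)
    reference.write_text(json.dumps(result))
    return result


def toy_fit(degree):
    """Stand-in for an analysis module; it just records its parameter."""
    return {"degree": degree}


with tempfile.TemporaryDirectory() as out:
    first = run_module("FIT", {"degree": 8}, toy_fit, out)
    # Same keyword, new parameter: the old file is silently reused.
    stale = run_module("FIT", {"degree": 10}, toy_fit, out)
    # Either a new keyword or overwrite=True picks up the change.
    renamed = run_module("FIT10", {"degree": 10}, toy_fit, out)
    forced = run_module("FIT", {"degree": 10}, toy_fit, out, overwrite=True)

print(first, stale, renamed, forced)
```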
Quality Assessment Plots
The survey-level execution of the MaNGA DAP constructs automatically generated plots that provide spotchecks of the performance of the DAP; see Quality Assessment Plots. The main QA plots for the analysis of a single datacube have a similar calling sequence as the main DAP command-line script. The three main scripts are spotcheck_dap_maps, dap_ppxffit_qa, and dap_fit_residuals. Following the example execution of the DAP command-line script above, these can be created using the following subsequent calls:
spotcheck_dap_maps -c mangadap-7815-3702.ini -o dap_output
dap_ppxffit_qa -c mangadap-7815-3702.ini -o dap_output
dap_fit_residuals -c mangadap-7815-3702.ini -o dap_output
Local Environment Setup for Survey-Level MaNGA Analysis
The DAP uses environmental variables to define the paths to specific data and other repositories, when executed for MaNGA data. If these are not defined, default values will be used; see the initialization of the mangadap.config module. The relevant environmental variables, their default, and their usage are provided below.
Variable | Default | Comments |
---|---|---|
MANGADRP_VER | | Version of the DRP, used for path construction |
MANGA_SPECTRO_REDUX | | Root path for the reduced data |
MANGADAP_VER | | Version of the DAP, used for path construction |
MANGA_SPECTRO_ANALYSIS | | Root path for the analysis data |
These environmental variables can be added to, e.g., your .bash_profile file in your home directory or be included in a script that is sourced when you want to run the DAP. The lines added to your .bash_profile file could look something like this:
export MANGA_SPECTRO_REDUX=/path/with/drp/output
export MANGADRP_VER=v3_1_1
export MANGA_SPECTRO_ANALYSIS=/path/for/dap/output
export MANGADAP_VER=3.1.0
Note
Importantly, note that $MANGADAP_VER is only used to set the path names, not to select the specific version of the DAP that should be used. The version of the DAP used is always the one installed by your python environment.
The DAP checks that these variables are defined every time it is imported.
Some of these same variables are defined by Marvin. It is possible to have both Marvin and the DAP point to the same directory, but beware that this may mean that some of the files get overwritten!
Two additional variables ($MANGACORE_VER and $MANGACORE_DIR) are used in a specific mode of survey-level execution of the DAP. However, this is a niche usage mode and is effectively never used. See Batch execution using automatically generated scripts.
The DAP expects to find the DRP LOGCUBE and LOGRSS files in the directory $MANGA_SPECTRO_REDUX/$MANGADRP_VER/[PLATE]/stack, where [PLATE] is the desired plate number. The LOGRSS files are required if you want to properly account for Spatial Covariance. This path can be altered when executing the DAP.
The DAP expects to find/write data to $MANGA_SPECTRO_ANALYSIS/$MANGADRP_VER/$MANGADAP_VER. This path can be altered when executing the DAP, but the subdirectory structure used by the DAP to organize its outputs within this root directory cannot currently be changed.
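The two path conventions above can be written out as a short sketch. The helper names are hypothetical and the code does not reproduce how mangadap.config builds its paths; it only joins the variables in the order given in the text. The variables are passed in as a plain mapping (os.environ could be passed instead), using the example values from the .bash_profile lines above.

```python
from pathlib import PurePosixPath


def drp_stack_dir(env, plate):
    """Directory expected to hold the DRP LOGCUBE and LOGRSS files for a plate."""
    return (PurePosixPath(env["MANGA_SPECTRO_REDUX"])
            / env["MANGADRP_VER"] / str(plate) / "stack")


def dap_root(env):
    """Root directory where the DAP reads and writes its analysis products."""
    return (PurePosixPath(env["MANGA_SPECTRO_ANALYSIS"])
            / env["MANGADRP_VER"] / env["MANGADAP_VER"])


# Example values copied from the .bash_profile lines above.
env = {
    "MANGA_SPECTRO_REDUX": "/path/with/drp/output",
    "MANGADRP_VER": "v3_1_1",
    "MANGA_SPECTRO_ANALYSIS": "/path/for/dap/output",
    "MANGADAP_VER": "3.1.0",
}

stack = str(drp_stack_dir(env, 7815))
root = str(dap_root(env))
print(stack)
print(root)
```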