desitransfer API¶
desitransfer¶
DESI data transfer infrastructure.
desitransfer.common¶
Code needed by all scripts.
- desitransfer.common.ensure_scratch(directories)[source]¶
Try an alternate temporary directory if the primary temporary directory is unavailable.
- desitransfer.common.exclude_years(start_year)[source]¶
Generate rsync --exclude statements of the form --exclude 2020*.
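A minimal sketch of the kind of argument list described above. The name exclude_years_sketch and the explicit end_year parameter are hypothetical and are not part of desitransfer; the real function takes only start_year, and how it chooses the last excluded year is not stated here.

```python
def exclude_years_sketch(start_year, end_year):
    """Return rsync arguments excluding every year in [start_year, end_year).

    Hypothetical helper for illustration only.
    """
    args = []
    for year in range(start_year, end_year):
        # Each year contributes two list items: the flag and its glob pattern.
        args += ['--exclude', f'{year:d}*']
    return args


print(exclude_years_sketch(2018, 2021))
```

Keeping the flag and the pattern as separate list items means the result can be appended directly to a command list without shell quoting.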
- desitransfer.common.idle_time(start=8, end=12, tz=None)[source]¶
Determine whether we are in an idle time during the day.
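One way such a check could look, as a sketch only. The return type of the real function is not documented here, so this version assumes a boolean; the name idle_time_sketch and the explicit now argument are hypothetical, and the tz parameter is left out.

```python
import datetime as dt


def idle_time_sketch(now, start=8, end=12):
    """Return True if the hour of `now` falls in the window [start, end).

    Hypothetical helper: `now` is passed in explicitly so the example is
    deterministic, and time-zone handling is omitted.
    """
    return start <= now.hour < end


print(idle_time_sketch(dt.datetime(2020, 1, 24, 9, 30)))   # inside the window
print(idle_time_sketch(dt.datetime(2020, 1, 24, 13, 0)))   # outside the window
```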
- desitransfer.common.rsync(s, d, test=False, config='dts', reverse=False)[source]¶
Set up rsync command.
- Parameters:
- Returns:
A list suitable for passing to subprocess.Popen.
- Return type:
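To illustrate what "a list suitable for passing to subprocess.Popen" means, here is a rough sketch. The function rsync_sketch, the choice of rsync flags, and the destination path are assumptions made for this example; they do not reproduce the flags that desitransfer actually uses, and the config and reverse parameters are omitted.

```python
def rsync_sketch(source, destination, test=False):
    """Build an rsync command as a list of strings (hypothetical flags)."""
    command = ['rsync', '--recursive', '--links', '--times', '--verbose']
    if test:
        # --dry-run is a standard rsync option: report actions, copy nothing.
        command.append('--dry-run')
    # A trailing slash makes rsync copy the directory contents,
    # not the directory itself.
    command += [source.rstrip('/') + '/', destination.rstrip('/') + '/']
    return command


command = rsync_sketch('dts:/exposures/nightwatch/20200124',
                       '/tmp/nightwatch/20200124', test=True)
print(command)
# The list can be handed to subprocess.Popen as-is, with no shell quoting:
# proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
```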
- desitransfer.common.today()[source]¶
Today’s date in DESI “NIGHT” format, YYYYMMDD.
This formulation, with the offset 7/24+0.5, is inherited from previous nightwatch transfer scripts.
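The offset can be read as a number of days: 7/24 is seven hours (UTC to MST) and 0.5 is twelve hours, so the NIGHT value changes at local noon rather than at midnight. The sketch below assumes the offset is subtracted from the current UTC time; night_sketch is a hypothetical name, and the UTC time is passed in explicitly to keep the example deterministic.

```python
import datetime as dt


def night_sketch(utc_now):
    """Convert a UTC datetime to a DESI NIGHT string (YYYYMMDD).

    Assumption: the offset 7/24+0.5 is in days and is subtracted from UTC.
    """
    offset = dt.timedelta(days=7/24 + 0.5)  # 19 hours in total
    return (utc_now - offset).strftime('%Y%m%d')


# 18:30 UTC is 11:30 MST: still the previous NIGHT.
print(night_sketch(dt.datetime(2020, 1, 25, 18, 30)))
# 19:30 UTC is 12:30 MST: the new NIGHT has started.
print(night_sketch(dt.datetime(2020, 1, 25, 19, 30)))
```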
desitransfer.daemon¶
Entry point for desi_transfer_daemon.
- class desitransfer.daemon.TransferDaemon(options)[source]¶
Manage data transfer configuration, options, and operations.
- Parameters:
options (argparse.Namespace) – The parsed command-line options.
- _configure_log(debug)[source]¶
Re-configure the default logger returned by desiutil.log.
- Parameters:
debug (bool) – If True set the log level to DEBUG.
- backup(d, night, status)[source]¶
Final sync and backup for a specific night.
- Parameters:
d (collections.namedtuple()) – Configuration for the destination directory.
night (str) – Night to check.
status (desitransfer.status.TransferStatus) – The status object associated with d.
- Returns:
True indicates the backup ran to completion and the transfer status should be updated to reflect that.
- Return type:
Notes
12:00 MST = 19:00 UTC, plus one hour just to be safe, so after 20:00 UTC.
- catchup(d, night, status, backup=False)[source]¶
Do a “catch-up” transfer to catch delayed files in the morning, rather than at noon.
- Parameters:
d (collections.namedtuple()) – Configuration for the destination directory.
night (str) – Night to check.
status (desitransfer.status.TransferStatus) – The status object associated with d.
backup (bool) – If True, this catch-up is happening immediately prior to tape backup.
Notes
07:00 MST = 14:00 UTC.
This script can do nothing about exposures that were never linked into the DTS area at KPNO in the first place.
- checksum(checksum_file, status)[source]¶
Verify checksum associated with checksum_file and report status.
The status is reported via log messages and messages passed to the status object, not via a return value.
- Parameters:
checksum_file (str) – The checksum file.
status (desitransfer.status.TransferStatus) – The associated status object.
- checksum_lock()[source]¶
See if checksums are being computed at KPNO.
- Returns:
True if checksums are being computed.
- Return type:
- directory(d)[source]¶
Data transfer operations for a single destination directory.
- Parameters:
d (collections.namedtuple()) – Configuration for the destination directory.
- exposure(d, link, status)[source]¶
Data transfer operations for a single exposure.
This method will unconditionally install an exposure directory in the destination, regardless of any transfer or checksum errors.
- Parameters:
d (collections.namedtuple()) – Configuration for the destination directory.
link (str) – The exposure path.
status (desitransfer.status.TransferStatus) – The status object associated with d.
- desitransfer.daemon._options()[source]¶
Parse command-line options for desi_transfer_daemon.
- Returns:
The parsed command-line options.
- Return type:
- desitransfer.daemon._popen(command)[source]¶
Simple wrapper for subprocess.Popen to avoid repeated code.
- Parameters:
command (list) – Command to pass to subprocess.Popen.
- Returns:
The returncode, standard output and standard error.
- Return type:
tuple()
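A wrapper of this shape might look roughly like the following. This is a sketch, not the desitransfer implementation: the name popen_sketch is hypothetical, and decoding the output to text is an assumption.

```python
import subprocess
import sys


def popen_sketch(command):
    """Run command and return (returncode, stdout, stderr) as text."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()  # waits for the process to finish
    return (proc.returncode, out.decode('utf-8'), err.decode('utf-8'))


# Use the running Python interpreter as a portable stand-in for a real command.
status, out, err = popen_sketch([sys.executable, '-c', 'print("ok")'])
print(status, out.strip())
```

Returning all three values in one tuple lets each caller decide how to log the output and whether a non-zero return code is fatal.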
- desitransfer.daemon.lock_directory(directory, test=False)[source]¶
Set a directory and its contents read-only.
- desitransfer.daemon.main()[source]¶
Entry point for desi_transfer_daemon.
- Returns:
An integer suitable for passing to sys.exit().
- Return type:
- desitransfer.daemon.rsync_night(source, destination, night, test=False)[source]¶
Run an rsync command on an entire night, for example, to pick up delayed files.
- desitransfer.daemon.unlock_directory(directory, test=False)[source]¶
Set a directory and its contents user-writeable.
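The lock and unlock operations can be pictured as a recursive permission change. The sketch below is an illustration under assumptions: lock_sketch and unlock_sketch are hypothetical names, the exact permission bits used by desitransfer are not stated here, the test parameter is omitted, and symbolic links are not given special treatment.

```python
import os
import stat
import tempfile


def _chmod_tree(directory, change):
    """Apply change(mode) -> mode to directory and everything below it."""
    paths = [directory]
    for root, dirs, files in os.walk(directory):
        paths += [os.path.join(root, name) for name in dirs + files]
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, change(mode))


def lock_sketch(directory):
    """Remove every write bit (user, group, other)."""
    no_write = ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    _chmod_tree(directory, lambda mode: mode & no_write)


def unlock_sketch(directory):
    """Restore the user write bit only."""
    _chmod_tree(directory, lambda mode: mode | stat.S_IWUSR)


with tempfile.TemporaryDirectory() as tmp:
    night = os.path.join(tmp, '20200124')
    os.mkdir(night)
    exposure = os.path.join(night, 'exposure.txt')
    with open(exposure, 'w') as handle:
        handle.write('data\n')
    lock_sketch(night)
    locked_mode = stat.S_IMODE(os.stat(exposure).st_mode)
    unlock_sketch(night)  # unlock again so the temporary directory can be removed
    unlocked_mode = stat.S_IMODE(os.stat(exposure).st_mode)

print(oct(locked_mode), oct(unlocked_mode))
```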
- desitransfer.daemon.verify_checksum(checksum_file)[source]¶
Verify checksums supplied with the raw data.
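As an illustration of checksum verification, the sketch below assumes a checksum file in the format written by sha256sum (one "digest  filename" pair per line) with the data files stored next to it. That format, the name verify_checksum_sketch, the file names, and the convention of returning an error count are all assumptions for this example.

```python
import hashlib
import os
import tempfile


def verify_checksum_sketch(checksum_file):
    """Return the number of files that are missing or fail their SHA-256 check."""
    directory = os.path.dirname(checksum_file)
    errors = 0
    with open(checksum_file) as handle:
        for line in handle:
            if not line.strip():
                continue
            expected, name = line.split(None, 1)
            name = name.strip().lstrip('*')  # '*' marks binary mode in sha256sum
            try:
                with open(os.path.join(directory, name), 'rb') as data:
                    actual = hashlib.sha256(data.read()).hexdigest()
            except FileNotFoundError:
                errors += 1
                continue
            if actual != expected:
                errors += 1
    return errors


with tempfile.TemporaryDirectory() as tmp:
    data_file = os.path.join(tmp, 'example.dat')
    with open(data_file, 'wb') as handle:
        handle.write(b'raw data')
    checksum_file = os.path.join(tmp, 'checksum.sha256sum')
    with open(checksum_file, 'w') as handle:
        handle.write(hashlib.sha256(b'raw data').hexdigest() + '  example.dat\n')
    good = verify_checksum_sketch(checksum_file)
    with open(data_file, 'wb') as handle:
        handle.write(b'corrupted')  # simulate a damaged transfer
    bad = verify_checksum_sketch(checksum_file)

print(good, bad)
```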
desitransfer.daily¶
Entry point for desi_daily_transfer.
- class desitransfer.daily.DailyDirectory(source, destination, extra=[], dirlinks=False)[source]¶
Simple object to hold daily transfer configuration.
- Parameters:
- desitransfer.daily._config(timeframe)[source]¶
Wrap configuration so that module can be imported without environment variables set.
- desitransfer.daily._options()[source]¶
Parse command-line options for desi_daily_transfer.
- Returns:
The parsed command-line options.
- Return type:
- desitransfer.daily.main()[source]¶
Entry point for desi_daily_transfer.
- Returns:
An integer suitable for passing to sys.exit().
- Return type:
desitransfer.nightwatch¶
Sync KPNO nightwatch. Due to differences in timing and directory structure, this is kept separate from the raw data transfer daemon.
A cronjob running as desi@dtn01.nersc.gov ensures that this daemon is running.
Catchup on a specific night:
NIGHT=20200124 && rsync -rlvt --exclude-from ${DESITRANSFER}/py/desitransfer/data/desi_nightwatch_transfer_exclude.txt dts:/exposures/nightwatch/${NIGHT}/ /global/cfs/cdirs/desi/spectro/nightwatch/kpno/${NIGHT}/
By-hand startup sequence (bash shell):
source /global/common/software/desi/desi_environment.sh datatran
module load desitransfer
nohup nice -19 ${DESITRANSFER}/bin/desi_nightwatch_transfer &> /dev/null &
tail -f ${DESI_ROOT}/spectro/nightwatch/desi_nightwatch_transfer.log
- desitransfer.nightwatch._configure_log(debug)[source]¶
Re-configure the default logger returned by desiutil.log.
- Parameters:
debug (bool) – If True set the log level to DEBUG.
- desitransfer.nightwatch._options()[source]¶
Parse command-line options for desi_nightwatch_transfer.
- Returns:
The parsed command-line options.
- Return type:
- desitransfer.nightwatch.main()[source]¶
Entry point for desi_nightwatch_transfer.
- Returns:
An integer suitable for passing to sys.exit().
- Return type:
desitransfer.spacewatch¶
Download Spacewatch data from a server at KPNO.
Notes
Spacewatch data rolls over at 00:00 UTC = 17:00 MST.
The data relevant to the previous night, say 20231030, would be downloaded on the morning of 20231031.
Therefore to obtain all data of interest, just download the files that have already appeared in 2023/10/31/ (Spacewatch directory structure) the morning after DESI night 20231030.
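The mapping from a DESI night to the Spacewatch directory described in these notes amounts to adding one day and reformatting the date. A small sketch, with the hypothetical name spacewatch_directory_sketch:

```python
import datetime as dt


def spacewatch_directory_sketch(night):
    """Map a DESI night (YYYYMMDD) to the Spacewatch directory to download."""
    date = dt.datetime.strptime(night, '%Y%m%d') + dt.timedelta(days=1)
    return date.strftime('%Y/%m/%d/')


print(spacewatch_directory_sketch('20231030'))  # 2023/10/31/
```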
- class desitransfer.spacewatch.SpacewatchHTMLParser(*args, **kwargs)[source]¶
Extract JPG files from an HTML index.
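A parser of this kind can be built on html.parser.HTMLParser from the standard library. The sketch below is an assumption about the general approach, not the class itself: JPGLinkParserSketch, its jpg_files attribute, and the sample file name are hypothetical.

```python
from html.parser import HTMLParser


class JPGLinkParserSketch(HTMLParser):
    """Collect the href of every link that points at a .jpg file."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.jpg_files = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value and value.lower().endswith('.jpg'):
                self.jpg_files.append(value)


parser = JPGLinkParserSketch()
parser.feed('<html><body><a href="../">Parent</a>'
            '<a href="20231031_010203.jpg">image</a>'
            '<a href="index.html">index</a></body></html>')
print(parser.jpg_files)
```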
- desitransfer.spacewatch._options()[source]¶
Parse command-line options for desi_spacewatch_transfer.
- Returns:
The parsed command-line options.
- Return type:
- desitransfer.spacewatch.download_jpg(files, destination, overwrite=False, test=False)[source]¶
Download files to destination.
- desitransfer.spacewatch.main()[source]¶
Entry point for desi_spacewatch_transfer.
- Returns:
An integer suitable for passing to sys.exit().
- Return type:
desitransfer.status¶
Entry point for desi_transfer_status.
- class desitransfer.status.TransferStatus(directory, install=False, year=None)[source]¶
Simple object for interacting with desitransfer status reports.
- Parameters:
- _handle_malformed()[source]¶
Handle malformed JSON files.
This function will save the malformed file to a .bad file for later analysis, and write an empty array to a new status file.
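The recovery step described above could be sketched as follows. The function names, the status file name, and the choice to rename (rather than copy) the malformed file are assumptions for illustration; only the ".bad" suffix and the empty array come from the description.

```python
import json
import os
import tempfile


def handle_malformed_sketch(status_file):
    """Move a malformed JSON file aside and start a fresh, empty one."""
    bad_file = status_file + '.bad'
    os.replace(status_file, bad_file)  # keep the bad file for later analysis
    with open(status_file, 'w') as handle:
        json.dump([], handle)  # an empty JSON array
    return bad_file


def load_status_sketch(status_file):
    """Load status data, recovering from a malformed file."""
    try:
        with open(status_file) as handle:
            return json.load(handle)
    except json.JSONDecodeError:
        handle_malformed_sketch(status_file)
        return []


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'status.json')  # hypothetical file name
    with open(path, 'w') as handle:
        handle.write('[{"night": 20200124')  # truncated, invalid JSON
    status = load_status_sketch(path)
    saved = os.path.exists(path + '.bad')
    with open(path) as handle:
        fresh = handle.read()

print(status, saved, fresh)
```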
- find(night, exposure=None, stage=None)[source]¶
Find status entries that match night, etc.
- Parameters:
- Returns:
list or dict – If only night is set, return a dict containing information on all exposures for that night. If exposure is not set, return a dict keyed by exposure containing all data matching stage for that night. If stage is not set, return a list containing indexes pointing to all data about that exposure. If both exposure and stage are set, return a list of indexes pointing to the data for exposure filtered on stage.
- Return type:
dict
- Raises:
KeyError – If night is not yet defined.
- desitransfer.status._options()[source]¶
Parse command-line options for desi_transfer_status.
- Returns:
The parsed command-line options.
- Return type:
- desitransfer.status.main()[source]¶
Entry point for desi_transfer_status.
- Returns:
An integer suitable for passing to sys.exit().
- Return type:
desitransfer.tucson¶
Entry point for desi_tucson_transfer.
- desitransfer.tucson._configure_log(debug)[source]¶
Re-configure the default logger returned by desiutil.log.
- Parameters:
debug (bool) – If True set the log level to DEBUG.
- desitransfer.tucson._get_proc(directories, exclude, src, dst, options, nice=5)[source]¶
Prepare the next download directory for processing.
- Parameters:
directories (list) – A list of directories to process.
exclude (set) – Do not process directories in this set.
src (str) – Root source directory.
dst (str) – Root destination directory.
options (argparse.Namespace) – The parsed command-line options.
nice (int, optional) – Lower-priority transfers will be run with this value passed to os.nice(), default 5.
- Returns:
A tuple containing information about the process.
- Return type:
- desitransfer.tucson._options()[source]¶
Parse command-line options for desi_tucson_transfer.
- Returns:
The parsed command-line options.
- Return type:
- desitransfer.tucson._rsync(src, dst, d, checksum=False)[source]¶
Construct an rsync command to transfer d.
- desitransfer.tucson.main()[source]¶
Entry point for desi_tucson_transfer.
- Returns:
An integer suitable for passing to sys.exit().
- Return type: