Zstash HPSS Archive
Zstash – HPSS long-term archiving solution for E3SM
Zstash is written in Python 3 using standard libraries. Its design is intentionally minimalistic to provide an effective long-term HPSS archiving solution without creating an overly complicated (and hard to maintain) tool.
Key features:
- Files are archived into standard tar files with a user-specified maximum size optimized for HPSS storage, typically 128 to 256 GB.
- Tar files are created locally first, then transferred to HPSS. If HPSS is unavailable or the user chooses not to use it, the tar files are kept in a local archive instead.
- Checksums (md5) of input files are computed on the fly during archiving. For large files, this saves considerable time compared to running separate checksumming and archiving steps. Checksums are computed on the fly again during extraction to verify file integrity.
- Checksums and additional metadata (size, modification time, tar file and offset) are stored in an SQLite3 index database.
- The SQLite3 database enables faster retrieval of individual files by providing the containing tar file and offset (location) within that tar file.
- Parallel extraction is supported for additional performance.
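
The archiving and retrieval features above can be illustrated with a minimal sketch using only the Python standard library. This is not Zstash's actual code: the names HashingReader, archive_files and extract_file, the table layout, and the choice to return file contents as bytes are all assumptions made for illustration. The sketch computes the md5 while each file is streamed into the tar file, stores the metadata and the member's byte offset in an SQLite3 table, and later uses that offset to seek directly to a single file.

```python
import hashlib
import os
import sqlite3
import tarfile

CHUNK = 1024 * 1024  # read size used when verifying extracted data


class HashingReader:
    """Hypothetical file-like wrapper: updates an MD5 digest on every read."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        self.md5 = hashlib.md5()

    def read(self, size=-1):
        data = self._fileobj.read(size)
        self.md5.update(data)
        return data


def archive_files(paths, tar_path, db_path):
    """Write paths into one tar file and record each member in the index."""
    con = sqlite3.connect(db_path)
    # Illustrative schema only; the real Zstash schema may differ.
    con.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(name TEXT, size INTEGER, mtime REAL, md5 TEXT, tar TEXT, offset INTEGER)"
    )
    with open(tar_path, "wb") as raw, tarfile.open(fileobj=raw, mode="w") as tar:
        for path in paths:
            info = tar.gettarinfo(path, arcname=os.path.basename(path))
            offset = raw.tell()  # byte position where this member's header starts
            with open(path, "rb") as src:
                reader = HashingReader(src)
                tar.addfile(info, reader)  # tarfile pulls the data through read()
            con.execute(
                "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?)",
                (info.name, info.size, info.mtime, reader.md5.hexdigest(),
                 os.path.basename(tar_path), offset),
            )
    con.commit()
    con.close()


def extract_file(name, db_path, tar_dir):
    """Return the bytes of one archived file, verifying its MD5 while reading."""
    con = sqlite3.connect(db_path)
    row = con.execute(
        "SELECT tar, offset, md5 FROM files WHERE name = ?", (name,)
    ).fetchone()
    con.close()
    if row is None:
        raise KeyError(name)
    tar_name, offset, expected = row
    with open(os.path.join(tar_dir, tar_name), "rb") as raw:
        raw.seek(offset)  # jump straight to the member, skipping earlier ones
        # Assumption: CPython's tarfile starts parsing at the current position
        # of a file object it is handed in uncompressed read mode.
        with tarfile.open(fileobj=raw, mode="r:") as tar:
            member = tar.next()
            reader = HashingReader(tar.extractfile(member))
            chunks = []
            while True:
                block = reader.read(CHUNK)
                if not block:
                    break
                chunks.append(block)
    if reader.md5.hexdigest() != expected:
        raise ValueError(f"checksum mismatch for {name}")
    return b"".join(chunks)
```

Because the data passes through the hashing wrapper on its way into the tar file, each input file is read only once. On retrieval, the stored offset lets the reader skip every earlier member instead of scanning the tar file from the start, which is where the index pays off for large archives.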
Documentation
- Zstash Documentation
- GitHub (source code)
- Getting Started
- Tutorial