SPRING2 Assays and Quality Modes¶
This guide explains the higher-level data behaviors that affect how SPRING2 stores and restores reads.
Assay Selection¶
Use --assay / -y to choose an assay explicitly:
Supported values are:
autodnarnaatacbisulfitesc-rnasc-atacsc-bisulfite
auto is the default. During compression, SPRING2 samples the startup portion
of the input and classifies the dataset using assay-specific heuristics and the
bundled assay reference.
For technical background on the assay reference, see ../assays/ASSAY_REFERENCE.md.
Cellular Barcode Length¶
Use --cb-len when single-cell assays store cellular barcodes in the start of
R1 instead of a dedicated I1 lane:
Notes:
- Valid range is
1..64. - When
--I1is present, barcode length is inferred from the I1 read length and--cb-lenis ignored.
Assay-Aware Compression Behaviors¶
Depending on assay and confidence, SPRING2 can enable internal transforms that remain reversible for lossless archives.
Examples include:
- RNA and scRNA: poly-A/T tail stripping when assay detection strongly suggests it is safe
- ATAC and scATAC: terminal adapter read-through stripping when the overlap is strong enough
- scRNA and sc-bisulfite: cellular barcode prefix extraction from R1 when no I1 lane is used
- grouped scRNA with index lanes: trailing index tokens in IDs can be reconstructed from decoded I1/I2 reads
These behaviors are recorded in archive metadata and surfaced by preview mode.
Quality Modes¶
Use --qmod / -q to choose how quality values are stored.
Lossless¶
Default mode:
This preserves qualities bit-for-bit.
QVZ¶
qvz <ratio> enables lossy quality compression using QVZ. The ratio roughly
tracks bits used per quality value.
Illumina 8-level Binning¶
Binary Thresholding¶
This maps qualities to high when >= threshold and to low otherwise.
--strip Modes¶
Use --strip / -s to intentionally discard archive components:
i: identifierso: original orderq: quality scores
Example:
Be careful: stripping changes what can be faithfully restored later.
When to Prefer Explicit Assay Selection¶
Prefer --assay over auto when:
- you already know the assay and want deterministic behavior
- startup samples are atypical or low quality
- you are benchmarking and want consistent cross-run settings