Sequence Compression Benchmark

wrap-xm has no data for 23 of the selected datasets, so these datasets have been removed from the selection:

Spironucleus salmonicida GCA_000497125.1,
Tieghemostelium lacteum GCA_001606155.1,
Fusarium graminearum PH-1 GCF_000240135.3,
Salpingoeca rosetta GCA_000188695.1,
PDB,
Homo sapiens GRCh38 peptides all,
Chondrus crispus GCA_000350225.2,
NCBI Virus RefSeq Protein,
Mitochondrion,
UniProtKB Reviewed (Swiss-Prot),
UCSC hg38 7way knownCanonical-exonNuc,
Kappaphycus alvarezii GCA_002205965.2,
NCBI Virus Complete Nucleotide Human,
SILVA 132 LSURef,
UCSC hg38 20way knownCanonical-exonNuc,
Strongylocentrotus purpuratus GCF_000002235.4,
SILVA 132 SSURef Nr99,
Influenza,
Helicobacter,
NCBI SARS-CoV-2 random-100k,
SILVA 132 SSURef,
Homo sapiens GCA_000001405.28,
Picea abies GCA_900067695.1

Comparing 21 settings of 2 compressors

Step 1. Select test data

Genomes (less repetitive)
Other datasets (more repetitive)
Aggregate results from multiple datasets using: sum or average
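
The two aggregation options above can be illustrated with a minimal sketch. The page does not state how it combines per-dataset results, so the function name `aggregate`, the `method` strings, and the plain arithmetic mean are assumptions made only for illustration.

```python
from statistics import mean


def aggregate(values, method):
    """Combine per-dataset measurements into one number.

    Hypothetical helper: `method` mirrors the page's "sum" and "average"
    options; the benchmark's real aggregation code is not shown in the source.
    """
    if not values:
        raise ValueError("at least one dataset value is required")
    if method == "sum":
        return sum(values)
    if method == "average":
        return mean(values)
    raise ValueError(f"unknown aggregation method: {method!r}")


# Example: compressed sizes (in MB) of one compressor on three datasets.
sizes_mb = [10, 20, 30]
total_mb = aggregate(sizes_mb, "sum")
average_mb = aggregate(sizes_mb, "average")
```

A sum lets the largest datasets dominate the result, while an average of per-dataset values gives every dataset the same weight once the values are relative (for example, compression ratios).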

Step 2. Select compressors to compare

Compare:
Sequence compressors
General-purpose compressors
Copy (no compression)
Wrappers
Include compressors
Use results from tests
Only best setting(s) in terms of
Sort by
Reverse sort order
Show only top entries
Link speed: Mbit/s (for estimating transfer time)
Show all values relative to
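
The link speed option is described as being for estimating transfer time. A minimal sketch of such an estimate follows, assuming 1 Mbit = 1,000,000 bits and no protocol overhead; the function name `transfer_time_seconds` is hypothetical, and the page's exact formula is not given in the source.

```python
def transfer_time_seconds(size_bytes, link_speed_mbit_s):
    """Estimate how long a file of `size_bytes` takes to send over a link.

    Assumptions (not from the source): decimal megabits (10**6 bits),
    a constant link speed, and no latency or protocol overhead.
    """
    if link_speed_mbit_s <= 0:
        raise ValueError("link speed must be positive")
    size_bits = size_bytes * 8
    return size_bits / (link_speed_mbit_s * 1_000_000)


# Example: a 125 MB compressed file over a 100 Mbit/s link.
seconds = transfer_time_seconds(125_000_000, 100)
```

With an estimate like this, a slower but stronger compressor can be compared against a faster but weaker one by adding the transfer time to the measured compression and decompression times.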

Select individual compressors:
Select individual compressor settings:

Step 3. Configure output

Table

Column chart

Scatterplot

Columns to show:

Value to plot:
Scale: linear or logarithmic
Chart size: x pixels
Highlight specialized vs general-purpose compressors
X axis:
Fixed range: ..
Scale: linear or logarithmic
Y axis:
Fixed range: ..
Scale: linear or logarithmic