Pure-Rust Zstandard codec with a production-grade decoder, dictionary handle reuse, and an actively-improved encoder. Builds with plain cargo — no cmake, no system zstd, no FFI. no_std ready for embedded.

```sh
cargo add structured-zstd
```

```rust
use structured_zstd::encoding::{compress_to_vec, CompressionLevel};
let compressed = compress_to_vec(&b"hello world"[..], CompressionLevel::from_level(7));
```

For no_std builds disable the default features:

```sh
cargo add structured-zstd --no-default-features
```

The decoder ships per-CPU-tier SIMD kernels, each behind a cargo feature
(all on by default; the tier is picked at runtime with std, or at compile
time from target_feature on no_std): kernel_scalar, kernel_sse2,
kernel_bmi2, kernel_avx2, kernel_vbmi2 (x86) and kernel_neon,
kernel_sve (aarch64). The scalar kernel is always compiled (it is the
mandatory fallback), so kernel_scalar is a marker that gates no code;
disabling the SIMD tiers is what trims the binary. A scalar-only build —
--no-default-features (or, equivalently, naming the marker explicitly) —
compiles out the per-tier SIMD kernel dispatch, its BMI2/AVX2/VBMI2/NEON
trampolines, and the explicit SSE2/NEON intrinsics in the small fixed-size
copy primitives — all gated on the matching kernel_* feature. These features
control the crate's own explicit SIMD only; the compiler's autovectorizer may
still emit vector instructions from ordinary scalar code regardless:

```sh
cargo add structured-zstd --no-default-features --features kernel_scalar
```

Release notes for every version live in zstd/CHANGELOG.md (maintained by release-plz).
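
For std builds, the runtime tier selection described above might look roughly like the following sketch. `KernelTier` and `pick_tier` are hypothetical names used for illustration, not the crate's internal API. The sketch covers only a subset of the x86 tiers and assumes the feature list above is ordered from least to most capable.

```rust
/// Hypothetical tier names for illustration; not the crate's own types.
/// Variant order doubles as the assumed capability order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum KernelTier {
    Scalar,
    Sse2,
    Bmi2,
    Avx2,
}

/// Probe the host CPU once and return the most capable tier it supports.
/// On non-x86 targets the probing block is compiled out and the scalar
/// fallback is returned.
fn pick_tier() -> KernelTier {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return KernelTier::Avx2;
        }
        if std::is_x86_feature_detected!("bmi2") {
            return KernelTier::Bmi2;
        }
        if std::is_x86_feature_detected!("sse2") {
            return KernelTier::Sse2;
        }
    }
    KernelTier::Scalar
}

fn main() {
    println!("selected tier: {:?}", pick_tier());
}
```

On no_std there is no runtime probe available in this form, which is consistent with the compile-time `target_feature` selection the text describes.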
Complete RFC 8878 implementation, including dictionary-backed streams, raw / RLE / compressed blocks, and the full Zstandard frame format with optional content checksums.
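
To make the frame format concrete, here is a minimal sketch of reading the first five bytes of a frame as laid out in RFC 8878: the little-endian magic number followed by the frame header descriptor byte. `FrameFlags` and `parse_frame_flags` are hypothetical names for illustration and are not part of this crate's API. The sketch does not validate the reserved bit.

```rust
/// Magic number that opens every Zstandard frame (stored little-endian).
const ZSTD_MAGIC: u32 = 0xFD2F_B528;

/// Hypothetical summary of the frame header descriptor byte.
#[derive(Debug, PartialEq, Eq)]
struct FrameFlags {
    single_segment: bool,
    content_checksum: bool,
    /// Size in bytes of the Frame_Content_Size field (0 means absent).
    fcs_field_size: usize,
    /// Size in bytes of the Dictionary_ID field (0 means absent).
    dict_id_field_size: usize,
}

fn parse_frame_flags(src: &[u8]) -> Option<FrameFlags> {
    let magic = u32::from_le_bytes([*src.get(0)?, *src.get(1)?, *src.get(2)?, *src.get(3)?]);
    if magic != ZSTD_MAGIC {
        return None;
    }
    let descriptor = *src.get(4)?;
    let single_segment = descriptor & 0b0010_0000 != 0;
    let content_checksum = descriptor & 0b0000_0100 != 0;
    // Bits 7-6: Frame_Content_Size_flag. A flag of 0 still means a
    // 1-byte field when Single_Segment_flag is set.
    let fcs_field_size = match descriptor >> 6 {
        0 => usize::from(single_segment),
        1 => 2,
        2 => 4,
        _ => 8,
    };
    // Bits 1-0: Dictionary_ID_flag.
    let dict_id_field_size = match descriptor & 0b11 {
        0 => 0,
        1 => 1,
        2 => 2,
        _ => 4,
    };
    Some(FrameFlags { single_segment, content_checksum, fcs_field_size, dict_id_field_size })
}

fn main() {
    // Magic + descriptor 0x24: single segment, content checksum present.
    let header = [0x28, 0xB5, 0x2F, 0xFD, 0x24];
    println!("{:?}", parse_frame_flags(&header));
}
```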
All standard compression levels are wired and produce valid Zstandard frames decodable by both this crate and upstream C zstd:
- Named presets: `Fastest` (≈1), `Default` (≈3), `Better` (≈7), `Best` (≈11)
- Numeric levels: `0..=22` and negative ultra-fast levels via `CompressionLevel::from_level(n)` (C zstd-compatible numbering)
- Fine-grained parameters: override individual knobs (`windowLog`, `hashLog`, `chainLog`, `searchLog`, `minMatch`, `targetLength`, `strategy`) and activate long-distance matching via `CompressionParameters::builder(...)`, the drop-in equivalent of C zstd's `ZSTD_CCtx_setParameter` surface
- Streaming encoder via `std::io::Write`
- Dictionary compression with the same dictionary format C zstd consumes
- Frame Content Size: `FrameCompressor` writes FCS automatically; `StreamingEncoder` requires `set_pledged_content_size()` before the first write
- Content checksums (opt-in)

The encoder is undergoing an architectural rewrite — see #111 for the roadmap.
Behind the dict_builder feature flag, the dictionary module can:
- build raw dictionaries with COVER (`create_raw_dict_from_source`)
- build raw dictionaries with FastCOVER (`create_fastcover_raw_dict_from_source`)
- finalize raw content into the full zstd dictionary format (`finalize_raw_dict`)
- train + finalize in one pure-Rust flow (`create_fastcover_dict_from_source`)

Internal: compression strategy backends
| Level range | Strategy | Backend |
|---|---|---|
| 1-2 | Fast | Simple matcher |
| 3-4 | Dfast | Dfast two-tier hash |
| 5 | Greedy | Row matcher (lazy_depth=0) |
| 6-15 | Lazy / Lazy2 | HashChain (lazy_depth=1 or 2) |
| 16-17 | BtOpt | HashChain candidates + btopt price parser |
| 18 | BtUltra | HashChain candidates + btultra price parser |
| 19-22 | BtUltra2 | HashChain candidates + btultra2 dual-profile parse |

The level → strategy column matches donor ZSTD_defaultCParameters[0] at zstd/lib/compress/clevels.h:25-50 (srcSize > 256 KiB tier). Donor routes greedy/lazy/lazy2 through its row-based matchfinder when windowLog > 14; we route Greedy through the row matcher (matches donor) but Lazy/Lazy2 through the hash-chain matcher — an intentional architectural difference, not an oversight.
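
The table above can be read as a simple lookup. The sketch below encodes it as a `match`; the `Strategy` enum and both function names are hypothetical stand-ins for illustration, not the crate's internal types. Because the table does not say where Lazy ends and Lazy2 begins, the sketch keeps them as one combined variant, and it returns `None` for any level the table does not list.

```rust
/// Hypothetical mirror of the strategy column in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    Fast,
    Dfast,
    Greedy,
    LazyOrLazy2,
    BtOpt,
    BtUltra,
    BtUltra2,
}

/// Map a positive numeric level to the strategy listed in the table.
fn strategy_for_level(level: i32) -> Option<Strategy> {
    match level {
        1..=2 => Some(Strategy::Fast),
        3..=4 => Some(Strategy::Dfast),
        5 => Some(Strategy::Greedy),
        6..=15 => Some(Strategy::LazyOrLazy2),
        16..=17 => Some(Strategy::BtOpt),
        18 => Some(Strategy::BtUltra),
        19..=22 => Some(Strategy::BtUltra2),
        _ => None, // levels outside the table are not covered here
    }
}

/// Name the match-finding backend the table pairs with each strategy.
fn backend_for(strategy: Strategy) -> &'static str {
    match strategy {
        Strategy::Fast => "Simple matcher",
        Strategy::Dfast => "Dfast two-tier hash",
        Strategy::Greedy => "Row matcher",
        Strategy::LazyOrLazy2
        | Strategy::BtOpt
        | Strategy::BtUltra
        | Strategy::BtUltra2 => "HashChain",
    }
}

fn main() {
    for level in [1, 5, 7, 19] {
        let strategy = strategy_for_level(level).expect("level is in the table");
        println!("level {level}: {strategy:?} via {}", backend_for(strategy));
    }
}
```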
Per-merge benchmarks publish to GitHub Pages: structured-world.github.io/structured-zstd/dev/bench.
The CI matrix covers x86_64-linux-gnu, i686-linux-gnu, and x86_64-musl; the dashboard exposes per-target / stage / scenario / level filtering. The encoder architecture rewrite (#111) is the active surface for compression-side work; the public benchmark report tracks the delta vs upstream C zstd over time. A dedicated dashboard section also tracks the WebAssembly build (simd128 + scalar) against the most popular npm wasm zstd, @bokuweb/zstd-wasm, over time.
See BENCHMARKS.md for the methodology — small payloads, entropy extremes, a 100 MiB large-stream scenario, repository corpus fixtures, and optional local Silesia corpora.

```rust
use structured_zstd::encoding::{compress, compress_to_vec, CompressionLevel};
let data: &[u8] = b"hello world";
// Named level
let compressed = compress_to_vec(data, CompressionLevel::Fastest);
// Numeric level (C zstd compatible: 0 = default, 1-22, negative for ultra-fast)
let compressed = compress_to_vec(data, CompressionLevel::from_level(7));
```

```rust
use structured_zstd::encoding::{CompressionLevel, StreamingEncoder};
use std::io::Write;
let mut out = Vec::new();
let mut encoder = StreamingEncoder::new(&mut out, CompressionLevel::Fastest);
encoder.write_all(b"hello ")?;
encoder.write_all(b"world")?;
encoder.finish()?;
# Ok::<(), std::io::Error>(())
```

Override individual compression knobs (the drop-in equivalent of C zstd's
ZSTD_CCtx_setParameter). Every knob left unset inherits the base level's
default, so a parameter set that overrides nothing reproduces plain
level-based compression. Long-distance matching is off at every level preset
and is activated only here:

```rust
use structured_zstd::encoding::{
    compress_with_parameters, CompressionLevel, CompressionParameters, Strategy,
};
let data: &[u8] = b"hello world";
let params = CompressionParameters::builder(CompressionLevel::Level(19))
    .window_log(22)
    .strategy(Strategy::Btultra2)
    .enable_long_distance_matching(true)
    .build()
    .expect("parameters within bounds");
let compressed = compress_with_parameters(data, &params);
```

Each parameter's valid range is queryable via CParameter::bounds() (the
analogue of ZSTD_cParam_getBounds); the builder validates every set knob.
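
The two rules stated above (unset knobs inherit the base level's default, and every set knob is checked against its bounds) can be illustrated with a small standalone builder. Everything in this sketch is hypothetical: the `Params`, `ParamsBuilder`, and `OutOfBounds` types, the two knobs chosen, and the bound and default values are placeholders for illustration, not the crate's actual API or limits.

```rust
use std::ops::RangeInclusive;

// Placeholder bounds for illustration only.
const WINDOW_LOG_BOUNDS: RangeInclusive<u32> = 10..=31;
const MIN_MATCH_BOUNDS: RangeInclusive<u32> = 3..=7;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Params {
    window_log: u32,
    min_match: u32,
}

#[derive(Debug, PartialEq, Eq)]
struct OutOfBounds {
    knob: &'static str,
    value: u32,
}

/// Collects overrides on top of a base parameter set.
struct ParamsBuilder {
    base: Params,
    window_log: Option<u32>,
    min_match: Option<u32>,
}

/// Resolve one knob: inherit the default when unset, validate when set.
fn resolve(
    knob: &'static str,
    value: Option<u32>,
    bounds: &RangeInclusive<u32>,
    default: u32,
) -> Result<u32, OutOfBounds> {
    match value {
        None => Ok(default),
        Some(v) if bounds.contains(&v) => Ok(v),
        Some(v) => Err(OutOfBounds { knob, value: v }),
    }
}

impl ParamsBuilder {
    fn new(base: Params) -> Self {
        Self { base, window_log: None, min_match: None }
    }
    fn window_log(mut self, v: u32) -> Self {
        self.window_log = Some(v);
        self
    }
    fn min_match(mut self, v: u32) -> Self {
        self.min_match = Some(v);
        self
    }
    fn build(self) -> Result<Params, OutOfBounds> {
        Ok(Params {
            window_log: resolve("window_log", self.window_log, &WINDOW_LOG_BOUNDS, self.base.window_log)?,
            min_match: resolve("min_match", self.min_match, &MIN_MATCH_BOUNDS, self.base.min_match)?,
        })
    }
}

fn main() {
    let base = Params { window_log: 23, min_match: 3 }; // placeholder defaults
    println!("{:?}", ParamsBuilder::new(base).window_log(22).build());
    println!("{:?}", ParamsBuilder::new(base).min_match(99).build());
}
```

Validating at `build()` rather than in each setter keeps the setters infallible and chainable, which matches the shape of the builder call shown earlier.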

```rust
use structured_zstd::decoding::StreamingDecoder;
use structured_zstd::io::Read;
let compressed_data: Vec<u8> = vec![];
let mut source: &[u8] = &compressed_data;
let mut decoder = StreamingDecoder::new(&mut source).unwrap();
let mut result = Vec::new();
decoder.read_to_end(&mut result).unwrap();
```

```rust
use structured_zstd::decoding::{DictionaryHandle, FrameDecoder, StreamingDecoder};
use structured_zstd::io::Read;
let compressed: Vec<u8> = vec![];
let dict_bytes: Vec<u8> = vec![];
let mut output = vec![0u8; 1024];
// Parse dictionary once, then reuse handle.
let handle = DictionaryHandle::decode_dict(&dict_bytes).unwrap();
let mut decoder = FrameDecoder::new();
let _written = decoder
    .decode_all_with_dict_handle(compressed.as_slice(), &mut output, &handle)
    .unwrap();
// Compatibility path: pass raw dictionary bytes directly.
let mut decoder = FrameDecoder::new();
let _written = decoder
    .decode_all_with_dict_bytes(compressed.as_slice(), &mut output, &dict_bytes)
    .unwrap();
// Streaming helpers exist for both handle- and bytes-based paths.
let mut source: &[u8] = &compressed;
let mut stream = StreamingDecoder::new_with_dictionary_handle(&mut source, &handle).unwrap();
let mut sink = Vec::new();
stream.read_to_end(&mut sink).unwrap();
```

Behind the lsm Cargo feature (default off), structured-zstd
exposes a typed SkippableFrame API
(structured_zstd::skippable) for storage-format authors who need
to interleave application metadata with zstd data, plus a
block-subset partial decoder: FrameDecoder::decode_blocks_partial(src, start_block, end_block, resume, emit_resume) decodes only the inner
blocks covering a requested range (skipping the trailing ones) and
preserves the clean prefix on a corrupt block, while
FrameEmitInfo::decompressed_byte_range(block_index) returns the
decompressed byte range of a given block, so a range query can locate
which inner blocks cover a target byte window. For incremental /
resumable decoding, pass emit_resume = true to capture a ResumeState
(cross-block entropy tables + repcode history + next-block coordinates)
in PartialDecode::resume_state, then feed it back via the resume
argument (ResumeInput { window_prime, state }) to continue from a later
block WITHOUT re-decompressing the prefix — even across a dropped (cold)
decoder. Enable on the command line:

```sh
cargo add structured-zstd --features lsm
```

or in Cargo.toml:

```toml
[dependencies]
structured-zstd = { version = "0", features = ["lsm"] }
```

The ecosystem registry of allocated skippable-frame magic variants and the allocation policy live in docs/SKIPPABLE_MAGIC_ALLOCATIONS.md.
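
For readers unfamiliar with skippable frames, the wire layout defined in RFC 8878 is small enough to sketch by hand: a 4-byte little-endian magic in the range 0x184D2A50 to 0x184D2A5F (the low nibble is the variant), a 4-byte little-endian payload length, then the payload. The functions below are hypothetical helpers written for illustration only; they are not the typed `structured_zstd::skippable` API, whose exact shape is not shown here.

```rust
use std::convert::TryFrom;

/// Base magic for skippable frames; the low 4 bits select the variant.
const SKIPPABLE_MAGIC_BASE: u32 = 0x184D_2A50;

/// Serialize a skippable frame. Returns None if the variant does not fit
/// in 4 bits or the payload length does not fit in a u32.
fn write_skippable(variant: u8, payload: &[u8]) -> Option<Vec<u8>> {
    if variant > 0x0F {
        return None;
    }
    let len = u32::try_from(payload.len()).ok()?;
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(SKIPPABLE_MAGIC_BASE | u32::from(variant)).to_le_bytes());
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
    Some(out)
}

/// Read a little-endian u32 at `offset`, or None if out of range.
fn read_u32_le(src: &[u8], offset: usize) -> Option<u32> {
    let b = src.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parse one skippable frame from the front of `src`.
/// Returns (variant, payload, remaining bytes after the frame).
fn read_skippable(src: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    let magic = read_u32_le(src, 0)?;
    if magic & 0xFFFF_FFF0 != SKIPPABLE_MAGIC_BASE {
        return None;
    }
    let len = read_u32_le(src, 4)? as usize;
    let end = 8usize.checked_add(len)?;
    let payload = src.get(8..end)?;
    Some(((magic & 0x0F) as u8, payload, &src[end..]))
}

fn main() {
    let mut stream = write_skippable(3, b"meta").expect("valid variant");
    stream.extend_from_slice(b"...zstd frame bytes would follow...");
    let (variant, payload, rest) = read_skippable(&stream).expect("well-formed frame");
    println!("variant {variant}, {} payload bytes, {} bytes remaining", payload.len(), rest.len());
}
```

A decoder that does not recognize a variant can still step over the frame using the length field, which is what makes this layout suitable for interleaving application metadata with zstd data.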
JavaScript / TypeScript consumers can use the codec from npm — no native addons, no build step:

```sh
npm install @structured-world/structured-zstd
```

```js
import { compress, decompress } from "@structured-world/structured-zstd";
const framed = await compress(new TextEncoder().encode("hello"), 19);
const plain = await decompress(framed);
```

The package ships two WebAssembly payloads, one built with the simd128 SIMD tier and one scalar, and selects the fast one at runtime from the host engine's capabilities. Pure ESM, strict TypeScript types. Frames interoperate with native zstd. Source lives in zstd-wasm/; see the package README.

Maintained fork of KillingSpark/zstd-rs (ruzstd) by the Structured World Foundation. We sync periodically with upstream but maintain an independent development trajectory focused on the CoordiNode database engine's per-label dictionary needs.
Apache License 2.0. Contributions will be published under the same Apache 2.0 license.