Engineering · Featured
Delta: Lossless Token Sequence Compression for Large Language Models
A technical exposition on dictionary-based compression, algorithmic guarantees, and the economics of inference cost reduction.
Feb 2, 2026 · 22 min read
Nikhil Srivastava