
10x faster tokenization
Integrating with the IREE tokenizer for a 10x uplift in tokenization performance.
ZML is a production inference stack, purpose-built to decouple AI workloads from proprietary hardware. Any model, many hardwares, one codebase, peak performance.