ReDrafter delivers 2.7x more tokens per second compared to traditional auto-regression ReDrafter could reduce latency for users while using fewer GPUs Apple hasn’t said when ReDrafter will be deployed on rival AI GPUs from AMD and Intel Apple has announced a collaboration with Nvidia to accelerate large language model inference using its open source technology, Recurrent Drafter (or ReDrafter for …
Read More »