#speculative-decoding

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python
2169
6 months ago

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

Python
1182
7 hours ago

Scalable and robust tree-based speculative decoding algorithm

Python
342
3 months ago

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

Python
286
4 days ago

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python
247
8 months ago

REST: Retrieval-Based Speculative Decoding, NAACL 2024

C
199
5 months ago

LLM Inference on consumer devices

Python
105
1 month ago

[NeurIPS'23] Speculative Decoding with Big Little Decoder

Python
90
1 year ago

From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation

Python
86
1 month ago

Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan et al. 2023.

Python
47
5 months ago

[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

Python
45
2 months ago

Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

Python
39
1 year ago

PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

C++
29
5 months ago

Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.

Python
23
1 month ago

Minimal C implementation of speculative decoding based on llama2.c

C
21
9 months ago

Official Implementation of LANTERN (ICLR'25) and LANTERN++ (ICLRW-SCOPE'25)

Python
11
1 month ago

Accelerating LLM inference with techniques like speculative decoding, quantization, and kernel fusion, focusing on implementing state-of-the-art research papers.

Python
7
1 month ago

Official Implementation of "GRIFFIN: Effective Token Alignment for Faster Speculative Decoding"

Python
6
2 months ago

Unofficial implementation of Token Recycling self-speculative decoding method.

Python
5
5 months ago