local-inference

High-speed Large Language Model Serving for Local Deployment

C++
8306
18 days ago

[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration

Python
226
9 months ago

Lightweight 6 GB VRAM Gradio web app with auto-installer for running AuraFlow locally: no cloud, no clutter (see the sketch after this entry).

Python
3
2 months ago
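
Purely as an illustration of the kind of app the entry above describes (not the repository's actual code), a minimal Gradio front end around AuraFlow loaded through Hugging Face diffusers might look roughly like this; the "fal/AuraFlow" model id, fp16 dtype, and attention-slicing call are assumptions:

# Hypothetical sketch, not the repo's code: a tiny Gradio text-to-image UI
# around AuraFlow via diffusers (assumes a CUDA GPU and the "fal/AuraFlow"
# checkpoint on the Hugging Face Hub).
import gradio as gr
import torch
from diffusers import AuraFlowPipeline

pipe = AuraFlowPipeline.from_pretrained("fal/AuraFlow", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_attention_slicing()  # trades some speed for lower peak VRAM

def generate(prompt: str):
    # One image per request; a modest step count keeps the demo responsive.
    return pipe(prompt, num_inference_steps=30).images[0]

gr.Interface(fn=generate, inputs="text", outputs="image",
             title="AuraFlow (local)").launch()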

Study Buddy is a desktop application that provides AI tutoring without requiring internet access or accounts

TypeScript
3
2 months ago

Verify claims with AI agents that debate over scraped evidence, using local language models (see the sketch after this entry).

Python
1
3 months ago
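
As an illustration of the debate pattern that entry points at (not the repository's implementation), a stripped-down version with a pro agent, a con agent, and a judge, all served by a local Ollama endpoint, could look like this; the endpoint, model name, and prompts are assumptions:

# Hypothetical sketch, not the repo's code: two agents debate a claim over
# fixed evidence, then a judge gives a verdict, all via a local Ollama server.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default endpoint
MODEL = "llama3"                                     # assumed local model

def ask(prompt: str) -> str:
    resp = requests.post(OLLAMA_URL, json={"model": MODEL, "prompt": prompt,
                                           "stream": False}, timeout=300)
    resp.raise_for_status()
    return resp.json()["response"]

def verify(claim: str, evidence: str, rounds: int = 2) -> str:
    debate = []
    for _ in range(rounds):
        pro = ask(f"Argue the claim is TRUE using only the evidence.\n"
                  f"Claim: {claim}\nEvidence: {evidence}\nDebate so far: {debate}")
        con = ask(f"Argue the claim is FALSE using only the evidence.\n"
                  f"Claim: {claim}\nEvidence: {evidence}\nDebate so far: {debate}")
        debate += [("pro", pro), ("con", con)]
    return ask(f"You are the judge. Given this debate: {debate}\n"
               f"Answer SUPPORTED, REFUTED, or NOT ENOUGH INFO for: {claim}")

if __name__ == "__main__":
    print(verify("The Eiffel Tower is in Berlin.",
                 "The Eiffel Tower is a wrought-iron tower in Paris, France."))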

Script that performs RAG and uses a local LLM for Q&A (see the sketch after this entry).

Python
0
1 year ago
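
As a rough illustration of what such a script might look like (not the repository's actual code), here is a minimal RAG loop with naive keyword retrieval standing in for a real vector store, and a local model served by Ollama; the endpoint and model name are assumptions:

# Hypothetical sketch, not the repo's code: naive keyword retrieval over text
# chunks, then answer generation by a local model through Ollama's HTTP API.
import requests

def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    # Word-overlap scoring is a stand-in for an embedding-based retriever.
    q_words = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return ranked[:k]

def answer(question: str, chunks: list[str]) -> str:
    context = "\n\n".join(retrieve(question, chunks))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    resp = requests.post("http://localhost:11434/api/generate",   # assumed endpoint
                         json={"model": "llama3", "prompt": prompt, "stream": False})
    return resp.json()["response"]

if __name__ == "__main__":
    docs = ["Llamas are South American camelids related to alpacas.",
            "The capital of France is Paris."]
    print(answer("Where do llamas come from?", docs))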

Script that takes a .wav audio file, performs speech-to-text with OpenAI/Whisper, and then uses Llama3 to produce a summary and action points from the generated transcript (see the sketch after this entry).

Python
0
1 year ago
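
As a rough illustration of the pipeline described above (not the repository's actual code), the core steps could be sketched as follows, assuming the openai-whisper package and a local Ollama server hosting Llama3:

# Hypothetical sketch, not the repo's code: .wav -> Whisper transcript ->
# Llama3 summary with action points via a local Ollama server.
import sys
import requests
import whisper

def transcribe(wav_path: str) -> str:
    model = whisper.load_model("base")          # small checkpoint for a quick demo
    return model.transcribe(wav_path)["text"]

def summarize(transcript: str) -> str:
    prompt = ("Summarize the following transcript and list the action points "
              "as bullets:\n\n" + transcript)
    resp = requests.post("http://localhost:11434/api/generate",  # assumed endpoint
                         json={"model": "llama3", "prompt": prompt, "stream": False})
    return resp.json()["response"]

if __name__ == "__main__":
    print(summarize(transcribe(sys.argv[1])))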

Hands-on experiments with the DSPy framework using local Ollama models. Features basic QA systems, multimodal image processing with LLaVA, and interactive Jupyter notebooks. Privacy-focused with local inference and no API costs (see the sketch after this entry).

Jupyter Notebook
0
1 month ago
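
As a rough illustration of the basic QA piece only (not the repository's notebooks), a minimal DSPy program against a local Ollama model could look like this; it assumes the DSPy 2.5+ style dspy.LM API and a pulled llama3 model, both of which may differ from what the repo actually uses:

# Hypothetical sketch, not the repo's code: a declarative DSPy QA module
# backed by a local Ollama model (assumes DSPy's LiteLLM-style dspy.LM).
import dspy

lm = dspy.LM("ollama_chat/llama3", api_base="http://localhost:11434", api_key="")
dspy.configure(lm=lm)

qa = dspy.Predict("question -> answer")   # signature: input field -> output field
print(qa(question="What are the benefits of local LLM inference?").answer)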