Repository navigation
local-inference
High-speed Large Language Model Serving for Local Deployment
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Tool for testing different large language models without writing code.
Local LLM Inference Library
LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
Study Buddy is a desktop application that provides AI tutoring without requiring internet access or accounts.
Verify claims using AI agents that debate using scraped evidence and local language models.
Script that performs RAG and uses a local LLM for Q&A.
Script that takes a .wav audio file, performs speech-to-text with OpenAI Whisper, and then uses Llama 3 to generate a summary and action points from the transcript.
Hands-on experiments with the DSPy framework using local Ollama models. Features basic QA systems, multimodal image processing with LLaVA, and interactive Jupyter notebooks. Privacy-focused with local inference and no API costs.
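Several of the projects above share the same core pattern: retrieve the most relevant local document chunk, then hand it to a locally served LLM as context. A minimal sketch of that retrieval step is below; the chunks, query, and prompt template are illustrative, and the final model call is left as a comment since it depends on which local runtime (e.g. Ollama) you use.

```python
# Minimal sketch of the retrieval step in a local RAG pipeline.
# The document chunks and query below are illustrative examples.
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words term-frequency vector for a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks, query, k=1):
    """Return the k chunks most similar to the query."""
    qv = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(vectorize(c), qv),
                    reverse=True)
    return ranked[:k]

chunks = [
    "Ollama serves quantized LLMs on local hardware.",
    "Whisper transcribes .wav audio files to text.",
]
question = "How do I transcribe audio locally?"
context = retrieve(chunks, question)[0]
prompt = (f"Answer using only this context:\n{context}\n\n"
          f"Question: {question}")
# The prompt would then be sent to a locally running model,
# e.g. via an Ollama chat request.
print(context)
```

Real projects would replace the bag-of-words similarity with embedding vectors from a local model, but the retrieve-then-prompt flow is the same.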