local-inference
High-speed Large Language Model Serving for Local Deployment
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Tool for testing different large language models without writing code.
Local LLM Inference Library
LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).
Study Buddy is a desktop application that provides AI tutoring without requiring internet access or accounts.
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
Verify claims using AI agents that debate using scraped evidence and local language models.
Script that performs RAG and uses a local LLM for Q&A.
Script that takes a .wav audio file, performs speech-to-text using OpenAI Whisper, and then uses Llama 3 to generate a summary and action points from the transcript.
Hands-on experiments with the DSPy framework using local Ollama models. Features basic QA systems, multimodal image processing with LLaVA, and interactive Jupyter notebooks. Privacy-focused with local inference and no API costs.
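Several of the projects above follow the same RAG-with-a-local-LLM pattern: retrieve the text chunk most relevant to a question, then pass it to a locally served model as context. A minimal sketch of that retrieval step, with the model call stubbed out (a real implementation would call a local runtime such as Ollama; all names here are illustrative):

```python
# Minimal RAG sketch: bag-of-words retrieval plus a stubbed local LLM call.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str]) -> str:
    # Return the chunk most similar to the question.
    q = Counter(question.lower().split())
    return max(chunks, key=lambda c: cosine(q, Counter(c.lower().split())))

def generate(prompt: str) -> str:
    # Stub: a real implementation would send this prompt to a local LLM.
    return prompt

def answer(question: str, chunks: list[str]) -> str:
    context = retrieve(question, chunks)
    return generate(f"Context: {context}\nQuestion: {question}\nAnswer:")

docs = [
    "Whisper transcribes speech to text.",
    "Ollama serves large language models locally.",
]
print(retrieve("how do I run models locally with ollama", docs))
```

Production systems replace the bag-of-words step with embedding-based vector search, but the control flow (retrieve, build prompt, generate) is the same.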