local-inference

High-speed Large Language Model Serving for Local Deployment

C++ · 8178 stars · updated 2 months ago

[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration

Python · 208 stars · updated 5 months ago
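
As a rough illustration of what CPU-GPU orchestration for MoE inference can mean, here is a minimal toy sketch (an assumption on my part, not the code of the repository above): a few "hot" experts live on the GPU, the remaining experts stay on the CPU, and each token is dispatched to whichever device hosts its routed expert.

```python
# Toy sketch of CPU-GPU expert placement for MoE inference (an assumption,
# not the repository's actual code): hot experts sit on the GPU, the rest
# stay on the CPU, and tokens run wherever their routed expert lives.
import torch
import torch.nn as nn

HIDDEN = 64
NUM_EXPERTS = 4
GPU = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
CPU = torch.device("cpu")

# Hypothetical placement: experts 0-1 on the GPU, experts 2-3 on the CPU.
placement = {0: GPU, 1: GPU, 2: CPU, 3: CPU}

experts = nn.ModuleList(
    [nn.Linear(HIDDEN, HIDDEN).to(placement[i]) for i in range(NUM_EXPERTS)]
)
router = nn.Linear(HIDDEN, NUM_EXPERTS).to(GPU)


@torch.no_grad()
def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """Top-1 MoE layer: run each token on the device hosting its chosen expert."""
    x = x.to(GPU)
    expert_ids = router(x).argmax(dim=-1)  # route every token to one expert
    out = torch.empty_like(x)
    for i, expert in enumerate(experts):
        mask = expert_ids == i
        if mask.any():
            # Ship the selected tokens to the expert's device, compute, ship back.
            out[mask] = expert(x[mask].to(placement[i])).to(GPU)
    return out


tokens = torch.randn(8, HIDDEN)
print(moe_forward(tokens).shape)  # torch.Size([8, 64])
```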

Pascal · 12 stars · updated 3 months ago

Script that performs RAG and uses a local LLM for Q&A

Python · 0 stars · updated 7 months ago
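
A minimal sketch of the RAG-plus-local-LLM pattern that the script above describes, under assumptions of my own: retrieval here uses TF-IDF similarity from scikit-learn, and `ask_local_llm` is a hypothetical stand-in for whatever local model wrapper the script actually calls.

```python
# Minimal RAG sketch (assumptions, not the repository's code): retrieve the
# most relevant documents by TF-IDF similarity, then ask a local LLM to
# answer using only the retrieved context.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "The warranty covers manufacturing defects for two years.",
    "Returns are accepted within 30 days with a receipt.",
    "Shipping is free on orders above 50 euros.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)


def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, doc_matrix)[0]
    top = scores.argsort()[::-1][:k]
    return [documents[i] for i in top]


def ask_local_llm(prompt: str) -> str:
    # Hypothetical placeholder: replace with a call to the local LLM
    # (e.g. a llama.cpp or Ollama binding) used by the actual script.
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"


def answer(question: str) -> str:
    """Build a context-grounded prompt and ask the local LLM."""
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return ask_local_llm(prompt)


print(answer("How long is the warranty?"))
```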

Script that takes a .wav audio file, performs speech-to-text using OpenAI/Whisper, and then uses Llama3 to generate a summary and action points from the transcript

Python · 0 stars · updated 7 months ago
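
A minimal sketch of the pipeline the entry above describes, under stated assumptions: transcription uses the openai/whisper Python package, and Ollama is assumed as the local Llama3 runner; the file name `meeting.wav` and the prompt wording are hypothetical.

```python
# Sketch of the .wav -> Whisper -> Llama3 pipeline (assumptions, not the
# repository's code): transcribe the audio, then ask a local Llama3 model
# for a summary and action points.
import whisper
import ollama


def transcribe(wav_path: str) -> str:
    """Speech-to-text with a small Whisper model."""
    model = whisper.load_model("base")
    return model.transcribe(wav_path)["text"]


def summarize(transcript: str) -> str:
    """Ask a local Llama3 model (via Ollama, assumed here) for summary + action points."""
    prompt = (
        "Summarize the following transcript and list the action points:\n\n"
        + transcript
    )
    response = ollama.chat(
        model="llama3", messages=[{"role": "user", "content": prompt}]
    )
    return response["message"]["content"]


if __name__ == "__main__":
    text = transcribe("meeting.wav")  # hypothetical input file
    print(summarize(text))
```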