Large Language Models (LLMs)

Frameworks, tools, and resources for Large Language Models (LLMs), including training, inference (vLLM, llama.cpp), and RAG.

Repositories

Ollama is a lightweight framework for running and managing open-source large language models locally. It provides a simple CLI and REST API for building AI applications, supporting models like Llama, Gemma, and Mistral with easy integration into various tools and platforms.

Go
165.8k
a day ago
langchain-ai/langchain

LangChain is a framework for building agents and LLM-powered applications. It helps chain together interoperable components and third-party integrations to simplify AI application development while future-proofing decisions as technology evolves.

Python
130.5k
an hour ago
open-webui/open-webui

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for RAG, making it a powerful AI deployment solution.

Python
128.2k
7 hours ago

DeepSeek-V3 is a powerful Mixture-of-Experts (MoE) language model with 671B total parameters and 37B activated per token. It features Multi-head Latent Attention, FP8 training, and multi-token prediction, achieving performance comparable to leading closed-source models while maintaining training efficiency and stability.

Python
102.3k
7 months ago

DeepSeek-R1 is a first-generation reasoning model achieving performance comparable to OpenAI-o1 in math, code, and reasoning tasks. It features a 671B parameter MoE architecture and is open-sourced under the MIT license, together with smaller distilled models.

92.0k
9 months ago

GPT4All is an open-source ecosystem that enables you to run powerful large language models (LLMs) privately on everyday desktops and laptops. No API calls or GPUs required—just download the app and start chatting with local AI models.

C++
77.2k
10 months ago

vLLM is a fast and easy-to-use library for LLM inference and serving. Originally developed at UC Berkeley, it features state-of-the-art throughput, efficient memory management with PagedAttention, continuous batching, and seamless integration with Hugging Face models.

Python
73.9k
an hour ago

A comprehensive academic AI assistant supporting multiple LLMs (GPT/GLM/Qwen/DeepSeek). Specialized in paper translation, polishing, code analysis, and academic writing, with a modular plugin system and customizable shortcuts.

Python
70.3k
2 months ago

Official Meta Llama 2 inference code repository. Provides a minimal implementation to load and run Llama models (7B-70B parameters) for text completion and chat applications. Includes tokenizer code, example scripts, and a script for downloading the model weights for local deployment.

Python
59.2k
a year ago

xAI's Grok-1: A 314B parameter Mixture-of-Experts model with a JAX implementation. Open-source weights and architecture for advanced AI research and deployment.

Python
51.5k
2 years ago

LlamaIndex is an open-source data framework for building LLM applications with retrieval-augmented generation (RAG). It provides data connectors, indexing tools, and query interfaces to enhance LLMs with private data.

Python
47.8k
2 days ago

A comprehensive Gradio web UI for running Large Language Models locally with 100% privacy. Supports text generation, vision models, tool-calling, training, image generation, and OpenAI-compatible API.

Python
46.3k
7 hours ago

Microsoft's official inference framework for 1-bit LLMs, providing fast and lossless inference on CPU and GPU with optimized kernels for efficient edge deployment.

Python
36.3k
12 days ago

LightRAG is a lightweight and efficient Retrieval-Augmented Generation framework that integrates knowledge graphs with vector retrieval for enhanced document understanding. It supports multiple storage backends, multimodal processing, and provides both API and Web UI interfaces.

Python
29.8k
6 hours ago

Qwen3 is an advanced open-source LLM series by Alibaba Cloud, featuring dual thinking/non-thinking modes, 1M token context, multilingual support, and state-of-the-art reasoning capabilities for complex problem-solving.

Python
27.0k
2 months ago
huggingface/open-r1

Open R1 is a community-driven project to fully reproduce DeepSeek-R1's reasoning capabilities. It provides training pipelines, evaluation scripts, and datasets for SFT, GRPO, and data generation, enabling transparent AI reasoning model development.

Python
26.0k
4 months ago
SillyTavern/SillyTavern

A powerful local LLM frontend for power users, supporting multiple AI APIs, image generation, TTS, and extensive customization options for immersive roleplaying experiences.

JavaScript
24.7k
7 hours ago