Repository navigation
ai-infra
- Website
- Wikipedia
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑💻 Video Tutorials.
A.I.G (AI-Infra-Guard) is a comprehensive, intelligent, and easy-to-use AI Red Teaming platform developed by Tencent Zhuque Lab.
SpargeAttention: A training-free sparse attention that can accelerate any model inference.
RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.
High performance distributed cache system. Built by Rust.
Transform your pythonic research to an artifact that engineers can deploy easily.
This is a landscape of the infrastructure that powers the generative AI ecosystem
Cloud Native ML/DL Platform
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model
TME: Structured memory engine for LLM agents to plan, rollback, and reason across multi-step tasks.
vgpu.rs is the fractional GPU & vgpu-hypervisor implementation written in Rust
A curated list of awesome tools, frameworks, platforms, and resources for building scalable and efficient AI infrastructure, including distributed training, model serving, MLOps, and deployment.
OriginDL: A distributed deep learning framework Built from scratch
This repository contains a list of various service-specific Azure Landing Zone implementation options.
Memory Management Service, a Long Term Memory Solution for AI