Repository navigation
QwQ
🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability
A higher-performance OpenAI-compatible LLM service than `vllm serve`: a pure C++ implementation built on GRPS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function calling, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.
Ollama load-balancing server | A high-performance, easy-to-configure open-source load balancer optimized for Ollama workloads. It helps improve your application's availability and response speed while ensuring efficient use of system resources.
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
Deploy open-source LLMs on AWS in minutes — with OpenAI-compatible APIs and a powerful CLI/SDK toolkit.
To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models
Encrypts multiple files under multiple keys into a single large file. Given a correct key, the corresponding file can be extracted. Intended for hiding files under coercion.
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
Breaking the long thought processes of o1-like LLMs, such as DeepSeek-R1 and QwQ
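Several of the services listed above expose OpenAI-compatible HTTP APIs, so they can be queried interchangeably with the same client code. A minimal sketch of building a request for such an endpoint follows; the base URL (`http://localhost:8000/v1`) and model name (`qwen3`) are placeholder assumptions, not values taken from any of the projects above.

```python
import json

# Assumption: an OpenAI-compatible server (vLLM, Ollama, or similar)
# is running locally; adjust BASE_URL and the model name to your deployment.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build the JSON body for POST {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.7,
    }

payload = build_chat_request("qwen3", "Hello!")
print(json.dumps(payload, indent=2))

# To actually send the request (requires a running server):
#   import urllib.request
#   req = urllib.request.Request(
#       f"{BASE_URL}/chat/completions",
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json",
#                "Authorization": "Bearer EMPTY"},
#   )
#   reply = json.loads(urllib.request.urlopen(req).read())
#   print(reply["choices"][0]["message"]["content"])
```

Because the request and response shapes follow the OpenAI Chat Completions format, swapping backends is usually just a matter of changing `BASE_URL` and the model name.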