streetfighterai

OpenGenerativeAI / llm-colosseum

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM

genai large-language-models benchmark streetfighterai

Jupyter Notebook

1448

175

6 months ago

jiseongHAN / StreetFighterRL

StreetFighter RL (Beta)

artificial-intelligence reinforcement-learning rl streetfighterai gym retro

Python

7 months ago

mennahasan31 / llm_benchmark

llm_benchmark is a comprehensive benchmarking tool for evaluating the performance of various Large Language Models (LLMs) on a range of natural language processing tasks. It provides a standardized framework for comparing different models based on accuracy, speed, and efficiency.

ai-tools alibaba anthropic benchmark evals evaluation evaluation-metrics humaneval information-seeking mistral natural-language-processing openai reasoning streetfighterai

7 months ago