streetfighterai

Benchmark LLMs by having them fight in Street Fighter 3! A new way to evaluate the quality of an LLM.

Jupyter Notebook
1448
6 months ago

llm_benchmark is a comprehensive benchmarking tool for evaluating the performance of various Large Language Models (LLMs) on a range of natural language processing tasks. It provides a standardized framework for comparing different models based on accuracy, speed, and efficiency.

0
7 months ago