streetfighterai

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM.

Jupyter Notebook · 1444 · 5 months ago

llm_benchmark is a comprehensive benchmarking tool for evaluating the performance of various Large Language Models (LLMs) on a range of natural language processing tasks. It provides a standardized framework for comparing different models based on accuracy, speed, and efficiency.
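The description above mentions comparing models on accuracy, speed, and efficiency. As a rough, hypothetical sketch of how such a comparison harness could be structured (the function, metric, and model names below are illustrative assumptions, not the llm_benchmark API), a model can be treated as any callable from prompt to answer and scored on exact-match accuracy and average latency:

```python
# Hypothetical model-comparison sketch; names here are assumptions,
# not taken from the llm_benchmark project itself.
import time
from typing import Callable

def evaluate(model: Callable[[str], str], tasks: list[tuple[str, str]]) -> dict:
    """Score one model on exact-match accuracy and average latency
    over a list of (prompt, expected_answer) pairs."""
    correct = 0
    total_time = 0.0
    for prompt, expected in tasks:
        start = time.perf_counter()
        answer = model(prompt)
        total_time += time.perf_counter() - start
        correct += int(answer.strip() == expected)
    return {
        "accuracy": correct / len(tasks),
        "avg_latency_s": total_time / len(tasks),
    }

if __name__ == "__main__":
    tasks = [("2 + 2 =", "4"), ("Capital of France?", "Paris")]
    # Stand-in "models": any callable mapping a prompt to a string answer.
    models = {
        "always-four": lambda p: "4",
        "echo-upper": lambda p: p.upper(),
    }
    for name, fn in models.items():
        print(name, evaluate(fn, tasks))
```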

0 · 6 months ago