LMSYS Chatbot Arena
Quickly evaluate large language models (LLMs) head-to-head using the LMSYS Chatbot Arena. This crowdsourced platform lets you compare two anonymized models side by side, and the resulting human votes are aggregated to benchmark model performance.
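The rankings behind the Arena are built from exactly this kind of pairwise vote. As a rough sketch of the idea (the real leaderboard uses a more involved statistical fit, and the model names and vote log below are made up for illustration), here is a minimal Elo-style update in Python:

```python
# Minimal Elo-style rating from pairwise votes (Python 3, stdlib only).
# Illustrative only: not LMSYS's actual ranking code.
from collections import defaultdict

K = 32  # update step size; real systems tune or replace this

def expected(r_a: float, r_b: float) -> float:
    """Probability that the model rated r_a beats the model rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def record_vote(ratings: dict, a: str, b: str, score_a: float) -> None:
    """score_a: 1.0 = A wins, 0.0 = B wins, 0.5 = tie."""
    e_a = expected(ratings[a], ratings[b])
    ratings[a] += K * (score_a - e_a)
    ratings[b] += K * ((1.0 - score_a) - (1.0 - e_a))

ratings = defaultdict(lambda: 1000.0)  # every model starts at the same rating
votes = [  # hypothetical vote log: (model A, model B, score for A)
    ("model_x", "model_y", 1.0),
    ("model_y", "model_z", 0.5),
    ("model_x", "model_z", 1.0),
]
for a, b, score_a in votes:
    record_vote(ratings, a, b, score_a)

for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.0f}")
```

Each vote nudges the winner up and the loser down, in proportion to how surprising the outcome was given the models' current ratings.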
5 Steps
1. Access the Arena: Navigate to the LMSYS Chatbot Arena website to begin your LLM evaluation.
2. Start a New Battle: Click 'New Battle' to initiate a fresh comparison between two randomly selected, anonymous LLMs.
3. Interact and Evaluate: Prompt both models with the same query, then compare their responses for quality, coherence, and helpfulness (the sketch after this list mirrors this blind A/B flow).
4. Submit Your Vote: Select the model you believe performed better, or choose 'Tie'/'Neither'. You can also leave optional written feedback.
5. Reveal and Learn: After you vote, the models' names are revealed. Check the Arena leaderboard and statistics to see how the models rank overall.
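To make the flow above concrete, here is a toy Python sketch of the same blind A/B pattern: the same prompt goes to two models, the responses are shown in shuffled order, the judge votes, and only then are the names revealed. This is not the Arena's code; the `models` dict and its lambdas are placeholder stand-ins for real model calls.

```python
import random

def blind_battle(prompt: str, models: dict) -> None:
    """Run one anonymized A/B round: same prompt to both models,
    shuffled presentation, human vote, then the reveal."""
    (name_1, fn_1), (name_2, fn_2) = random.sample(list(models.items()), 2)
    answers = [(name_1, fn_1(prompt)), (name_2, fn_2(prompt))]
    random.shuffle(answers)  # hide which model produced which answer

    for label, (_, text) in zip("AB", answers):
        print(f"--- Response {label} ---\n{text}\n")

    vote = input("Better response? [A/B/tie]: ").strip().lower()
    reveal = {label: name for label, (name, _) in zip("AB", answers)}
    print("Reveal:", reveal)
    print("Your vote:", vote)

# Hypothetical stand-ins for real model calls (e.g., API requests):
models = {
    "model_x": lambda p: f"model_x's answer to: {p}",
    "model_y": lambda p: f"model_y's answer to: {p}",
}
blind_battle("Explain Elo ratings in one sentence.", models)
```

Shuffling the presentation order is what keeps the comparison blind: the judge's vote cannot be biased by model reputation, which is the Arena's core design choice.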