Root Judge, a groundbreaking LLM that sets a new standard for reliable, customizable, and locally deployable evaluation models.
Root Judge is a powerful mid-sized model that enables reliable and customizable LLM system evaluations. Root Judge was post-trained from Llama-3.3-70B-Instruct on a high-quality, human-annotated dataset mix covering pairwise preference judgments and multi-turn instruction following with source citing.
Root Judge was tested to support complex, user-defined rating rubrics over large context sizes, provide granular qualitative feedback, and support structured evaluation outputs and tool calling. Released under the Apache 2.0 license, Root Judge is an open and accessible model suitable for developers and companies seeking cost-effective and rapid evaluations using custom rubrics.
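To make "user-defined rating rubrics" and "structured evaluation outputs" concrete, here is a minimal Python sketch of how a caller might assemble a rubric-based rating prompt and validate a JSON verdict returned by a judge model. Everything in it is an assumption for illustration: the names `build_judge_prompt`, `parse_verdict`, and `Verdict`, the prompt wording, and the JSON shape are hypothetical and are not Root Judge's documented prompt format or API. The call to the model itself is deliberately left out.

```python
import json
from dataclasses import dataclass


@dataclass
class Verdict:
    """Structured result of one evaluation (hypothetical shape)."""
    score: int
    justification: str


def build_judge_prompt(rubric: dict[int, str], question: str, response: str) -> str:
    """Assemble a rating prompt from a user-defined rubric.

    The wording and layout are illustrative assumptions, not an official template.
    """
    rubric_lines = "\n".join(
        f"{score}: {description}" for score, description in sorted(rubric.items())
    )
    return (
        "Rate the response against the rubric below.\n"
        f"Rubric:\n{rubric_lines}\n\n"
        f"Question:\n{question}\n\n"
        f"Response:\n{response}\n\n"
        'Reply with JSON only: {"score": <int>, "justification": "<text>"}'
    )


def parse_verdict(raw: str, rubric: dict[int, str]) -> Verdict:
    """Parse the judge's JSON reply and check the score is a rubric level."""
    data = json.loads(raw)
    score = int(data["score"])
    if score not in rubric:
        raise ValueError(f"score {score} is not a level defined in the rubric")
    return Verdict(score=score, justification=str(data["justification"]))
```

In practice, the string returned by `build_judge_prompt` would be sent to whichever inference endpoint hosts the model, and the raw reply passed to `parse_verdict`. Rejecting scores outside the rubric keeps malformed judge output from silently entering downstream metrics.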
What is Root Judge based on?
How was Root Judge trained and optimized?
How can developers use Root Judge?
How does Root Judge perform compared to Llama-3.3-70B-Instruct?
What are the recommended specifications for using Root Judge?
What license is Root Judge released under?
Which languages does Root Judge support?