RagaliQ

RagaliQ Documentation

RagaliQ is an open-source Python framework for evaluating and testing LLM and RAG applications. It provides automated hallucination detection, faithfulness metrics, answer relevance scoring, and context precision and context recall evaluation, all powered by an LLM-as-Judge architecture.

Getting Started

Core Concepts

Evaluators

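RagaliQ ships with five built-in evaluators for comprehensive RAG pipeline testing, covering hallucination detection, faithfulness, answer relevance, context precision, and context recall.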
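For intuition about the two retrieval-side metrics, here is a minimal sketch of context precision and context recall in their simplest, unranked form. It is not RagaliQ's implementation or API: the function names and the boolean judgments are hypothetical, and a real evaluator may weight results by rank or obtain the judgments from an LLM judge.

```python
from typing import Sequence


def _fraction_true(flags: Sequence[bool]) -> float:
    """Share of True values in `flags`; 0.0 for an empty sequence."""
    if not flags:
        return 0.0
    return sum(flags) / len(flags)


def context_precision(chunk_is_relevant: Sequence[bool]) -> float:
    """Fraction of retrieved chunks judged relevant to the question."""
    return _fraction_true(chunk_is_relevant)


def context_recall(fact_is_supported: Sequence[bool]) -> float:
    """Fraction of reference-answer facts that the retrieved context supports."""
    return _fraction_true(fact_is_supported)


# Hypothetical per-item judgments for a single test case.
retrieved_is_relevant = [True, False, True, True]  # 3 of 4 chunks are relevant
reference_fact_is_supported = [True, True, False]  # 2 of 3 facts are covered

precision = context_precision(retrieved_is_relevant)
recall = context_recall(reference_fact_is_supported)
print(f"context precision = {precision:.2f}")  # 0.75
print(f"context recall    = {recall:.2f}")     # 0.67
```

Low precision suggests the retriever returns noise; low recall suggests it misses information the answer needs.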

LLM-as-Judge

RagaliQ uses Claude or OpenAI models as a semantic judge to evaluate response quality. This approach catches nuanced errors that keyword-matching and embedding-similarity approaches miss.
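The general LLM-as-Judge pattern can be sketched as follows. This is an illustration of the technique only, not RagaliQ's API: the prompt text, `judge_faithfulness`, and `fake_llm` are hypothetical names, and the model call is injected as a plain callable so the example runs without network access or API keys.

```python
import json
from typing import Callable

# Hypothetical grading prompt; doubled braces are literal braces for str.format.
JUDGE_PROMPT = """You are grading a RAG answer for faithfulness.
Context:
{context}

Answer:
{answer}

Reply with JSON only: {{"score": <float between 0 and 1>, "reason": "<short explanation>"}}"""


def judge_faithfulness(context: str, answer: str, call_llm: Callable[[str], str]) -> dict:
    """Ask a judge model whether `answer` is supported by `context`."""
    prompt = JUDGE_PROMPT.format(context=context, answer=answer)
    raw = call_llm(prompt)              # the judge's raw text reply
    verdict = json.loads(raw)           # raises ValueError on malformed JSON
    score = min(1.0, max(0.0, float(verdict["score"])))  # clamp to [0, 1]
    return {"score": score, "reason": str(verdict.get("reason", ""))}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns a fixed verdict."""
    return '{"score": 0.2, "reason": "The answer names a city the context never mentions."}'


result = judge_faithfulness(
    context="The Eiffel Tower is in Paris.",
    answer="The Eiffel Tower is in Berlin.",
    call_llm=fake_llm,
)
print(result["score"], result["reason"])
```

In practice `call_llm` would wrap a request to the chosen judge model. Keeping it as a parameter makes the scoring logic testable with a deterministic stub.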

Pytest Integration

RagaliQ integrates natively with pytest, so RAG quality tests run alongside your existing unit tests using familiar fixtures and markers.
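As a rough picture of what a fixture-and-marker quality test can look like, the sketch below uses only plain pytest features. It does not use RagaliQ's own fixtures or markers, which are not shown here: `toy_faithfulness`, `rag_case`, the `rag_quality` marker, and the threshold are all hypothetical, and the word-overlap scorer is a deliberately crude placeholder for a real evaluator.

```python
import pytest

THRESHOLD = 0.7  # hypothetical pass mark for this example


def toy_faithfulness(answer: str, context: str) -> float:
    """Placeholder scorer: share of answer words that also appear in the context."""
    answer_words = answer.lower().split()
    context_words = set(context.lower().split())
    if not answer_words:
        return 0.0
    return sum(w in context_words for w in answer_words) / len(answer_words)


@pytest.fixture
def rag_case() -> dict:
    """One hypothetical test case: retrieved context plus the generated answer."""
    return {
        "context": "ragaliq is installed with pip",
        "answer": "ragaliq is installed with pip",
    }


@pytest.mark.rag_quality  # custom marker; register it under `markers` in pytest.ini
def test_answer_is_grounded(rag_case: dict) -> None:
    score = toy_faithfulness(rag_case["answer"], rag_case["context"])
    assert score >= THRESHOLD
```

With the marker registered, `pytest -m rag_quality` runs only the quality tests, while a plain `pytest` run executes them together with the rest of the suite.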

Installation

pip install ragaliq