RAGAS faithfulness formula
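Faithfulness = supported_claims / total_claims, where the claims are the individual statements extracted from the generated answer, and a claim counts as supported when it can be inferred from the retrieved context. A score of 0.4 means 60% of the answer's claims are not backed by the retrieved context. The metric treats these as hallucinations, although some may be true statements the model drew from its own knowledge rather than from retrieval. Use a different, stronger model as the evaluator (Gemini Pro, GPT-4o) than the one being evaluated: a model judging its own output tends to favor it. Run evals in CI and block the merge if faithfulness drops below a set threshold after any prompt or retrieval change.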
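A minimal sketch of the arithmetic and the CI gate, in plain Python. It does not call the RAGAS library; it assumes the claim counts already come from an evaluator model. The function names, the 0.8 threshold, and the use of the mean over the eval set are illustrative assumptions, not values from RAGAS.

```python
from typing import Sequence


def faithfulness(supported_claims: int, total_claims: int) -> float:
    """Fraction of the answer's claims that the judge marked as supported."""
    if total_claims <= 0:
        raise ValueError("total_claims must be positive")
    if not 0 <= supported_claims <= total_claims:
        raise ValueError("supported_claims must be between 0 and total_claims")
    return supported_claims / total_claims


def gate_exit_code(scores: Sequence[float], threshold: float = 0.8) -> int:
    """Return 0 if mean faithfulness meets the threshold, else 1.

    The 0.8 default is an assumed value; pick one from your own baseline runs.
    """
    if not scores:
        raise ValueError("no scores to evaluate")
    mean_score = sum(scores) / len(scores)
    return 0 if mean_score >= threshold else 1


# Example: 2 of 5 claims supported, so 3 of 5 (60%) are unsupported.
score = faithfulness(supported_claims=2, total_claims=5)
unsupported_share = 1 - score
```

In a CI job, a script would pass the result of gate_exit_code to sys.exit so that a non-zero code fails the pipeline and blocks the merge.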