libeval
Verified
libeval - RAG evaluation system. Evaluator orchestrates quality assessment using LLM-as-judge patterns. CriteriaEvaluator scores responses against rubrics. RecallEvaluator measures retrieval performance. TraceEvaluator analyzes execution traces. EvalStore persists results. Use for automated quality testing, RAG pipeline evaluation, and agent performance testing.
Install the skill
Skills are third-party code from public GitHub repositories. SkillHub scans for known malicious patterns but cannot guarantee security. Review the source code before installing.
Global install (user level):
npx skillhub install majiayu000/claude-skill-registry/libeval

Install in the current project:

npx skillhub install majiayu000/claude-skill-registry/libeval --project

Suggested path: ~/.claude/skills/libeval/
SKILL.md contents
---
name: libeval
description: >
  libeval - RAG evaluation system. Evaluator orchestrates quality assessment
  using LLM-as-judge patterns. CriteriaEvaluator scores responses against
  rubrics. RecallEvaluator measures retrieval performance. TraceEvaluator
  analyzes execution traces. EvalStore persists results. Use for automated
  quality testing, RAG pipeline evaluation, and agent performance testing
---
# libeval Skill
## When to Use
- Evaluating RAG agent response quality
- Measuring retrieval recall and precision
- Running automated quality assessments
- Benchmarking agent performance over time
## Key Concepts
**Evaluator**: Main orchestrator that runs test cases through the agent and
collects metrics.
**CriteriaEvaluator**: Uses LLM-as-judge to score responses against defined
criteria and rubrics.
**RecallEvaluator**: Measures how well the retrieval system returns relevant
documents.
**TraceEvaluator**: Analyzes execution traces for performance and correctness.
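The recall metric that RecallEvaluator reports can be sketched from first principles: recall is the fraction of relevant documents that the retriever actually returned. This is a standalone illustration of the metric, not libeval's implementation; the function name and signature are assumptions.

```javascript
// Sketch of the recall metric (not libeval's actual code):
// recall = |relevant ∩ retrieved| / |relevant|
function recall(retrievedIds, relevantIds) {
  const relevant = new Set(relevantIds);
  // Count retrieved documents that appear in the relevant set
  const hits = retrievedIds.filter((id) => relevant.has(id)).length;
  // Guard against an empty relevant set
  return relevant.size === 0 ? 0 : hits / relevant.size;
}

// 2 of 3 relevant docs were retrieved
console.log(recall(["a", "b", "x"], ["a", "b", "c"])); // 0.666…
```

Precision is the mirror image (hits divided by the number of retrieved documents); a retrieval evaluator typically reports both.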
## Usage Patterns
### Pattern 1: Run evaluation suite
```javascript
// Run the full test suite and print aggregate metrics
import { Evaluator } from "@copilot-ld/libeval";

const evaluator = new Evaluator(config);
const results = await evaluator.run(testCases);
console.log(results.summary);
```
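The source does not document the shape of a test case, but something along these lines is typical for an LLM-as-judge suite. Every field name here is a guess for illustration; check libeval's actual types before relying on them.

```javascript
// Hypothetical test-case shape -- field names are assumptions,
// not libeval's documented schema.
const testCases = [
  {
    id: "faq-001",
    input: "How do I rotate an API key?",
    expected: "Steps referencing the key-rotation endpoint",
  },
];

console.log(testCases[0].id); // "faq-001"
```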
### Pattern 2: Criteria-based evaluation
```javascript
// Score a single response against a rubric via LLM-as-judge
import { CriteriaEvaluator } from "@copilot-ld/libeval";

const criteria = new CriteriaEvaluator(llmClient);
const score = await criteria.evaluate(response, rubric);
```
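The `rubric` argument above is undocumented in this page. A minimal sketch of what a criteria rubric often contains, under the assumption that it is a plain object (the real libeval format may differ):

```javascript
// Hypothetical rubric object -- structure is an assumption,
// not libeval's documented rubric format.
const rubric = {
  name: "groundedness",
  description: "Answer must be supported by the retrieved context",
  scale: { min: 1, max: 5 }, // judge returns an integer in this range
};

console.log(rubric.name); // "groundedness"
```

An LLM-as-judge evaluator typically serializes such a rubric into the judge prompt and parses the returned score.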
## Integration
Configured via config/eval.yml. Run via `make eval`. Uses libllm for
LLM-as-judge.
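The page names config/eval.yml but does not show its contents. As a hedged sketch only, such a file might hold keys along these lines; every key below is an assumption, so consult the repository's actual config/eval.yml:

```yaml
# Hypothetical config/eval.yml sketch -- keys are assumptions,
# not libeval's documented configuration.
judge:
  model: gpt-4o        # model used for LLM-as-judge scoring
suite:
  cases: eval/cases/   # directory of test-case definitions
store:
  path: eval/results/  # where EvalStore persists results
```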