EvalForge
Automated LLM evaluation pipeline generator
Describe your GenAI use case and EvalForge generates the complete evaluation infrastructure: metrics, synthetic test data, scheduled pipelines, and drift detection.
Key Features
- ✓ Use-case-driven metric auto-selection
- ✓ Synthetic adversarial test data generation
- ✓ Statistical drift detection
- ✓ Cost-per-quality efficiency scoring
- ✓ Human-in-the-loop review routing
- ✓ CI/CD deployment blocking
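
As an illustration of how statistical drift detection could feed CI/CD deployment blocking, here is a minimal sketch. It is not EvalForge's actual implementation: the function names `permutation_p_value` and `should_block_deploy`, the choice of a permutation test, and the `alpha` threshold are all assumptions made for this example. It compares per-example quality scores from a baseline run against a candidate run and blocks only when the mean score dropped and the drop is unlikely to be chance.

```python
import random
import statistics


def permutation_p_value(baseline, current, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in mean score.

    Hypothetical helper: the test EvalForge actually uses is not documented here.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(current) - statistics.fmean(baseline))
    pooled = list(baseline) + list(current)
    n_base = len(baseline)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign scores to the two groups at random and re-measure the gap.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[n_base:]) - statistics.fmean(pooled[:n_base])
        )
        if diff >= observed:
            extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


def should_block_deploy(baseline, current, alpha=0.01):
    """Return True when scores dropped and the drop is statistically significant.

    `alpha` is an assumed significance threshold, not an EvalForge default.
    """
    dropped = statistics.fmean(current) < statistics.fmean(baseline)
    return dropped and permutation_p_value(baseline, current) < alpha
```

In a CI job, a gate like this would typically run after the evaluation step and exit non-zero when `should_block_deploy` returns `True`, so that the pipeline stops before deployment. A permutation test is used here because it makes no assumption about the distribution of scores; a parametric test would be cheaper on large evaluation sets.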
Tech Stack
Python, TypeScript, Step Functions, EventBridge, Bedrock