conKurrence
AI evaluation toolkit that measures inter-rater agreement (Fleiss' κ, Kendall's W) across multiple LLM providers. Evaluate prompt reliability, detect contested outputs, and track consensus trends over time.
0 installs
Trust: 34 — Low
Science
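The listing names Fleiss' κ as one of the agreement statistics the toolkit computes; conKurrence's actual API is not documented here, so as an illustrative sketch, Fleiss' κ over a subjects-by-categories count matrix (rows = prompts, columns = rating categories, cell = how many LLM raters chose that category) can be computed like this. The function name `fleiss_kappa` is hypothetical, not part of the package.

```python
from typing import Sequence

def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories count matrix.

    counts[i][j] = number of raters assigning subject i to category j.
    Assumes every subject is rated by the same number of raters.
    (Hypothetical helper -- not the conKurrence API.)
    """
    N = len(counts)            # number of subjects
    n = sum(counts[0])         # raters per subject
    k = len(counts[0])         # number of categories
    # Observed per-subject agreement P_i
    P = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P) / N
    # Marginal category proportions and chance agreement P_e
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)

# Three prompts, three raters each, unanimous on every prompt -> kappa = 1
print(fleiss_kappa([[3, 0], [0, 3], [3, 0]]))
```

Perfect agreement yields κ = 1; agreement no better than chance yields κ ≤ 0, which is the kind of signal a "contested output" check would act on.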
