Help developers discover the tools you use. Get visibility for your team's tech choices and contribute to the community's knowledge.
It is a tool for testing and evaluating LLM output quality. With this tool, you can systematically test prompts, models, and RAGs with predefined test cases. It can be utilized as a CLI, a library, or integrated into CI/CD pipelines. | CI failures are painful to debug. SentinelQA gives you run summaries, flaky test detection, regression analysis, visual diffs and AI-generated action items. |
Evaluate quality and catch regressions;
Speed up evaluations with caching and concurrency;
Score outputs automatically by defining test cases | QA, DevOps, Test Intelligence, AI, Analytics, Test Debugging |
Statistics | |
GitHub Stars 9.0K | GitHub Stars - |
GitHub Forks 760 | GitHub Forks - |
Stacks 0 | Stacks 0 |
Followers 0 | Followers 1 |
Votes 0 | Votes 1 |
Integrations | |
| No integrations available | |

BrowserStack is the leading test platform built for developers & QAs to expand test coverage, scale & optimize testing with cross-browser, real device cloud, accessibility, visual testing, test management, and test observability.

TestRail helps you manage and track your software testing efforts and organize your QA department. Its intuitive web-based user interface makes it easy to create test cases, manage test runs and coordinate your entire testing process.

Manage all aspects of software quality; integrate with JIRA and various test tools, foster collaboration and gain real-time visibility.

It is a platform for building production-grade LLM applications. It lets you debug, test, evaluate, and monitor chains and intelligent agents built on any LLM framework and seamlessly integrates with LangChain, the go-to open source framework for building with LLMs.

Transform basic prompts into expert-level AI instructions. Enhance, benchmark & optimize prompts for ChatGPT, Claude, Gemini & more.

The collaborative testing platform for LLM applications and agents. Your whole team defines quality requirements together, Rhesis generates thousands of test scenarios covering edge cases, simulates realistic multi-turn conversations, and delivers actionable reviews. Testing infrastructure built for Gen AI.

The most advanced, consistent, and effective AI humanizer on the market. Instantly transform AI-generated text into undetectable, human-like writing in one click.

Vivgrid is an AI agent infrastructure platform that helps developers and startups build, observe, evaluate, and deploy AI agents with safety guardrails and global low-latency inference. Support for GPT-5, Gemini 2.5 Pro, and DeepSeek-V3. Start free with $200 monthly credits. Ship production-ready AI agents confidently.

Practice your interview skills with AI-powered interviewers. Simulate real interview scenarios and improve your performance. Get instant feedback. Get complete overview and a plan with next steps to improve.

Is a debate simulator powered by the top 5 LLM's. Generate endless discussions and debates on any topic. It's like reddit - but powered by AI.