It is a platform for structured prompt engineering. It helps you develop, test, and monitor your structured LLM tasks using templates, queries, collections, and functions. | It is an AI-powered LLMOps platform that enables developers to build continuously improving LLM-powered applications and ship them to production. |
Design your prompt templates in an extended playground; Test prompts on entire query collections at once; Define test queries with expected result JSON schemas or values; Follow the entire history of your runs and tests | Manage LLM data; Continuously monitor the health of your LLM apps; Optimize your LLM app via a rich debugger; Easy programmatic integration |
Statistics | |
GitHub Stars - | GitHub Stars 97 |
GitHub Forks - | GitHub Forks 13 |
Stacks 0 | Stacks 0 |
Followers 3 | Followers 0 |
Votes 0 | Votes 0 |
Integrations | |

It is a platform for building production-grade LLM applications. It lets you debug, test, evaluate, and monitor chains and intelligent agents built on any LLM framework and seamlessly integrates with LangChain, the go-to open source framework for building with LLMs.

The collaborative testing platform for LLM applications and agents. Your whole team defines quality requirements together, Rhesis generates thousands of test scenarios covering edge cases, simulates realistic multi-turn conversations, and delivers actionable reviews. Testing infrastructure built for Gen AI.

Vivgrid is an AI agent infrastructure platform that helps developers and startups build, observe, evaluate, and deploy AI agents with safety guardrails and global low-latency inference. Support for GPT-5, Gemini 2.5 Pro, and DeepSeek-V3. Start free with $200 monthly credits. Ship production-ready AI agents confidently.

Is this image AI-generated? Free AI detector with 99.7% accuracy detects fake photos, deepfakes, and AI images from DALL-E, Midjourney, Stable Diffusion. No signup required.

CI failures are painful to debug. SentinelQA gives you run summaries, flaky test detection, regression analysis, visual diffs and AI-generated action items.

It improves the cost, performance, and accuracy of Gen AI apps. Integration takes under two minutes, after which it monitors all of your LLM requests and makes your app more resilient, secure, performant, and accurate.

It is an AI observability and LLM evaluation platform designed to help ML and LLM engineers and data scientists surface model issues more quickly, resolve their root causes, and ultimately improve model performance.

It is the leading observability platform trusted by high-performing teams to help maintain the quality and performance of ML models, LLMs, and data pipelines.

It is the toolkit for evaluating and developing robust and reliable AI agents. Build compliant virtual employees with observability, evals, and replay analytics. No more black boxes and prompt guessing.

It is an interactive AI evaluation platform for exploring, debugging, and sharing how your AI systems perform. Evaluate any task and data type with Zeno's modular views, which support everything from chatbot conversations to object detection and audio transcription.