It is a simple-to-use, open-source evaluation framework for LLM applications. It is similar to Pytest but specialized for unit testing LLM applications. It evaluates performance based on metrics such as hallucination, answer relevancy, RAGAS, etc., using LLMs and various other NLP models locally on your machine. | It is a platform for building production-grade LLM applications. It lets you debug, test, evaluate, and monitor chains and intelligent agents built on any LLM framework and seamlessly integrates with LangChain, the go-to open source framework for building with LLMs. |
Simple functions to unit test LLM applications in the CLI;
Gain insights to quickly iterate towards optimal hyperparameters;
Evaluate existing LLM applications built with other frameworks | Collaborate with teammates to get app behavior just right;
A unified DevOps platform for your LLM applications;
The platform for your LLM development lifecycle;
Develop with greater visibility |
Statistics | |
GitHub Stars 11.9K | GitHub Stars - |
GitHub Forks 1.0K | GitHub Forks - |
Stacks 1 | Stacks 6 |
Followers 1 | Followers 5 |
Votes 0 | Votes 1 |
Integrations | |

That transforms AI-generated content into natural, undetectable human-like writing. Bypass AI detection systems with intelligent text humanization technology

It is a framework built around LLMs. It can be used for chatbots, generative question-answering, summarization, and much more. The core idea of the library is that we can “chain” together different components to create more advanced use cases around LLMs.

It allows you to run open-source large language models, such as Llama 2, locally.

It is a project that provides a central interface to connect your LLMs with external data. It offers you a comprehensive toolset trading off cost and performance.

It is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain. It extends the LangChain Expression Language with the ability to coordinate multiple chains (or actors) across multiple steps of computation in a cyclic manner.

It is a new scripting language to automate your interaction with a Large Language Model (LLM), namely OpenAI. The ultimate goal is to create a natural language programming experience. The syntax of GPTScript is largely natural language, making it very easy to learn and use.

The collaborative testing platform for LLM applications and agents. Your whole team defines quality requirements together, Rhesis generates thousands of test scenarios covering edge cases, simulates realistic multi-turn conversations, and delivers actionable reviews. Testing infrastructure built for Gen AI.

Vivgrid is an AI agent infrastructure platform that helps developers and startups build, observe, evaluate, and deploy AI agents with safety guardrails and global low-latency inference. Support for GPT-5, Gemini 2.5 Pro, and DeepSeek-V3. Start free with $200 monthly credits. Ship production-ready AI agents confidently.

Is this image AI-generated? Free AI detector with 99.7% accuracy detects fake photos, deepfakes, and AI images from DALL-E, Midjourney, Stable Diffusion. No signup required.

It is an open-source library designed to help developers build conversational streaming user interfaces in JavaScript and TypeScript. The SDK supports React/Next.js, Svelte/SvelteKit, and Vue/Nuxt as well as Node.js, Serverless, and the Edge Runtime.