
Evaluate, monitor, and debug LLM apps with realistic simulations.
6 months ago
LangWatch has been a lifesaver. We were flying blind on our LLM costs and quality. Now we have a dashboard that shows our token usage and latency and tracks user feedback on responses. The automated evaluation for things like toxicity and PII detection gives us peace of mind. It's the Datadog for LLMs.
8 months ago
LangWatch is excellent for high-level monitoring of our LLM application's health. For deep, trace-level debugging of a complex chain, I find LangSmith to be a bit more powerful. But for a production dashboard to monitor cost, latency, and user sentiment, LangWatch is simpler and more focused.
12 months ago
We integrated LangWatch and immediately discovered that a small percentage of our users were responsible for a huge portion of our token costs due to a poorly designed prompt. We fixed it and cut our OpenAI bill by 30%. The tool paid for itself almost instantly. The team is also super responsive in their Slack channel.