The first chaos engineering infra for AI agents

Stress test your AI agents before they stress you.

Find AI failures before your users do. Chaosync simulates real-world chaos so you can fix issues during development.

terminal
$ pip install chaosync
# Start testing in 2 minutes
# No complex setup required
# Works with your existing AI agents

Works with all major AI frameworks

LangChain
AutoGen
Google ADK
CrewAI
HTTP APIs
Custom Agents

Community-driven chaos testing

Access pre-built chaos scenarios. Share your own. Learn from production failures.

Concurrency Storm

by @chaosync-team

1.2k

Tests agent behavior under high concurrency with 429 errors and rate limits

langchain · rate-limits · advanced
✓ 15,234 runs

Hallucination Hunter

by @ml-safety

892

Detects when agents generate false information under API timeouts

autogen · hallucination · safety
✓ 8,456 runs

Cost Explosion Detector

by @finops-team

2.1k

Catches infinite loops and token usage explosions before production

all-frameworks · cost · critical
✓ 24,892 runs

Start with community-tested scenarios

Browse scenarios by framework, failure type, or popularity. Every test includes assertions and fix recommendations.

coming soon
from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(agent_path="./my_agent.py")

# Install from marketplace
runner.load_test_by_id("concurrency_test")

# Or load from YAML
runner.load_test("./my_chaos_test.yaml")

# Run with one line
results = runner.run_test(agent)

Why traditional testing is broken

LLM-based testing can't catch system-level failures

                     Traditional Testing                   Chaosync
Testing Method       LLM judges LLM outputs                Injects real system failures
Coverage             Happy path scenarios only             Access to marketplace chaos tests
Failure Detection    After production incidents            Before deployment
Test Duration        Hours to days per scenario            2 minutes for full suite
Edge Cases           Manual scenario writing               Auto-generated from your code
Cost                 $1000s in API calls per test suite    10x cheaper with isolated testing

The fundamental problem with LLM-based testing

Testing LLMs with LLMs is like asking a student to grade their own exam. You can't expect them to detect failures they themselves are prone to.

Start finding failures in 2 minutes

No complex setup. No infrastructure changes. Just results.

1

Initialize Runner

Point to your agent file or API endpoint

from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(
    agent_path="./my_agent.py"
)
2

Configure Chaos

Select scenarios or auto-detect from code

# Auto-detect chaos scenarios
results = runner.run_test(
    scenario="chaos.yaml",
    auto_detect=True
)
3

Get Results

Detailed failure report with fixes

✓ 47 failures found
⚠ 12 critical
→ View full report

# Fix recommendations included
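
The same report is available programmatically. A minimal sketch of reading it, where the failures, severity, name, and fix_recommendation fields are illustrative assumptions, not documented API:

from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(agent_path="./my_agent.py")
results = runner.run_test(scenario="chaos.yaml")

# Print each critical failure with its suggested fix.
# Field names below are assumptions -- check the real report schema.
for failure in results.failures:
    if failure.severity == "critical":
        print(failure.name)
        print("Fix:", failure.fix_recommendation)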
production_chaos_test.yaml
id: production_agent_test
name: Customer Support Agent Chaos Test
framework: langchain
agent_type: conversational

chaos:
  - type: latency
    latency_ms: 5000
    probability: 0.3

  - type: api_failures
    error_codes: [429, 500, 503]
    failure_rate: 0.15

  - type: hallucination
    hallucination_rate: 0.05
    inject_false_facts: true

  - type: rate_limit
    tokens_per_minute: 1000

scenarios:
  - name: high_load_test
    input:
      question: "I need help with my order #12345"
    expected_behavior: "graceful_degradation"

assertions:
  - type: response_time
    max_time_ms: 30000
  - type: no_hallucination
    check_facts: true
  - type: error_handling
    must_have_fallback: true
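
Running this file end to end uses the same two calls as the marketplace snippet above; a minimal sketch, where calling run_test() with no arguments (relying on agent_path instead of passing the agent explicitly) is an assumption:

from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(
    agent_path="./my_agent.py"
)

# Load the chaos config defined above
runner.load_test("./production_chaos_test.yaml")

# Run every scenario and assertion in the file
results = runner.run_test()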

95% of AI agent deployments fail silently

Go beyond basic logs and traces. Run regular diagnostics on your agent during development.
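
One way to keep those diagnostics regular is to gate your test suite on a chaos run. A hedged sketch as a pytest-style test, where the no-argument run_test() call and the failures/severity fields are illustrative assumptions rather than documented API:

from chaosync import ChaosSyncRunner

def test_agent_survives_chaos():
    # Run the chaos suite on every CI build
    runner = ChaosSyncRunner(agent_path="./my_agent.py")
    runner.load_test("./production_chaos_test.yaml")
    results = runner.run_test()

    # Fail the build on any critical finding (assumed schema)
    critical = [f for f in results.failures if f.severity == "critical"]
    assert not critical, f"{len(critical)} critical chaos failures found"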

Every day without chaos testing is another day closer to your AI horror story making headlines

Free tier includes 1,000 chaos tests/month • No credit card required • Test your AI in 2 minutes