The first chaos engineering infra for AI agents

Stress test your AI agents before they stress you.

Find AI failures before your users do. Chaosync simulates real-world chaos so you can fix issues during development.

terminal
$ pip install chaosync
# Start testing in 2 minutes
# No complex setup required
# Works with your existing AI agents

Works with all major AI frameworks

LangChain
AutoGen
Google ADK
CrewAI
HTTP APIs
Custom Agents

Community-driven chaos testing

Access pre-built chaos scenarios. Share your own. Learn from production failures.

Concurrency Storm

by @chaosync-team

1.2k

Tests agent behavior under high concurrency with 429 errors and rate limits

langchain · rate-limits · advanced
✓ 15,234 runs

Hallucination Hunter

by @ml-safety

892

Detects when agents generate false information under API timeouts

autogen · hallucination · safety
✓ 8,456 runs

Cost Explosion Detector

by @finops-team

2.1k

Catches infinite loops and token usage explosions before production

all-frameworks · cost · critical
✓ 24,892 runs

Start with community-tested scenarios

Browse scenarios by framework, failure type, or popularity. Every test includes assertions and fix recommendations.

coming soon
from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(agent_path="./my_agent.py")

# Install from marketplace
runner.load_test_by_id("concurrency_test")

# Or load from YAML
runner.load_test("./my_chaos_test.yaml")

# Run with one line
results = runner.run_test(agent)

Why traditional testing is broken

LLM-based testing can't catch system-level failures

                     Traditional Testing                   Chaosync
Testing Method       LLM judges LLM outputs                Injects real system failures
Coverage             Happy path scenarios only             Access to marketplace chaos tests
Failure Detection    After production incidents            Before deployment
Test Duration        Hours to days per scenario            2 minutes for full suite
Edge Cases           Manual scenario writing               Auto-generated from your code
Cost                 $1000s in API calls per test suite    10x cheaper with isolated testing

The fundamental problem with LLM-based testing

Testing LLMs with LLMs is like asking a student to grade their own exam. You can't expect them to detect failures they themselves are prone to.

Start finding failures in 2 minutes

No complex setup. No infrastructure changes. Just results.

1

Initialize Runner

Point to your agent file or API endpoint

from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(
    agent_path="./my_agent.py"
)
2

Configure Chaos

Select scenarios or auto-detect from code

# Auto-detect chaos scenarios
results = runner.run_test(
    scenario="chaos.yaml",
    auto_detect=True
)
3

Get Results

Detailed failure report with fixes

✓ 47 failures found
⚠ 12 critical
→ View full report

# Fix recommendations included
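
The same report is available programmatically. A minimal sketch of reading it, where the failures, severity, name, and fix_recommendation fields are illustrative assumptions, not documented API:

from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(agent_path="./my_agent.py")
results = runner.run_test(scenario="chaos.yaml")

# Print each critical failure with its suggested fix.
# Field names below are assumptions -- check the real report schema.
for failure in results.failures:
    if failure.severity == "critical":
        print(failure.name)
        print("Fix:", failure.fix_recommendation)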
production_chaos_test.yaml
id: production_agent_test
name: Customer Support Agent Chaos Test
framework: langchain
agent_type: conversational

chaos:
  - type: latency
    latency_ms: 5000
    probability: 0.3

  - type: api_failures
    error_codes: [429, 500, 503]
    failure_rate: 0.15

  - type: hallucination
    hallucination_rate: 0.05
    inject_false_facts: true

  - type: rate_limit
    tokens_per_minute: 1000

scenarios:
  - name: high_load_test
    input:
      question: "I need help with my order #12345"
    expected_behavior: "graceful_degradation"

assertions:
  - type: response_time
    max_time_ms: 30000
  - type: no_hallucination
    check_facts: true
  - type: error_handling
    must_have_fallback: true
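
Running this file end to end uses the same two calls as the marketplace snippet above; a minimal sketch, where calling run_test() with no arguments (relying on agent_path instead of passing the agent explicitly) is an assumption:

from chaosync import ChaosSyncRunner

runner = ChaosSyncRunner(
    agent_path="./my_agent.py"
)

# Load the chaos config defined above
runner.load_test("./production_chaos_test.yaml")

# Run every scenario and assertion in the file
results = runner.run_test()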

95% of AI agent deployments fail silently

Go beyond basic logs and traces. Run regular diagnostics on your agent during development.
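
One way to keep those diagnostics regular is to gate your test suite on a chaos run. A hedged sketch as a pytest-style test, where the no-argument run_test() call and the failures/severity fields are illustrative assumptions rather than documented API:

from chaosync import ChaosSyncRunner

def test_agent_survives_chaos():
    # Run the chaos suite on every CI build
    runner = ChaosSyncRunner(agent_path="./my_agent.py")
    runner.load_test("./production_chaos_test.yaml")
    results = runner.run_test()

    # Fail the build on any critical finding (assumed schema)
    critical = [f for f in results.failures if f.severity == "critical"]
    assert not critical, f"{len(critical)} critical chaos failures found"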

Every day without chaos testing is another day closer to your AI horror story making headlines

Free tier includes 1,000 chaos tests/month • No credit card required • Test your AI in 2 minutes