1. Introduction
Claude Code’s Dynamic Workflows feature enables Claude to autonomously break a large engineering task into sub-tasks, spawn specialized sub-agents for each one, and coordinate the results, all within a single interactive session. This article documents a real experiment: using Dynamic Workflows to perform codebase-wide automated test generation on a mid-size FastAPI monorepo.
2. Environment Setup
2.1 Installation
# Install Claude Code CLI
npm install -g @anthropic-ai/claude-code
# Verify installation
claude --version
# → claude-code 0.9.1
# Authenticate
claude auth login
# → Opens browser for OAuth flow with Anthropic account
2.2 Project Structure Before the Task
ecommerce-api/
├── app/
│ ├── api/
│ │ ├── v1/
│ │ │ ├── users.py # 312 lines
│ │ │ ├── products.py # 489 lines
│ │ │ ├── orders.py # 601 lines
│ │ │ └── payments.py # 278 lines
│ ├── services/
│ │ ├── user_service.py # 445 lines
│ │ ├── product_service.py # 523 lines
│ │ ├── order_service.py # 712 lines
│ │ └── payment_service.py # 398 lines
│ ├── models/
│ │ ├── user.py
│ │ ├── product.py
│ │ ├── order.py
│ │ └── payment.py
│ └── core/
│ ├── database.py
│ └── security.py
├── tests/
│ └── (empty — only 3 placeholder files existed)
├── pyproject.toml
└── README.md
Test coverage before: 4% (only 3 trivial utility tests existed)
3. Activating Dynamic Workflows
Dynamic Workflows are activated using the /effort slash command inside a Claude Code session:
/effort ultracode
This instructs Claude to:
- Use its orchestrator reasoning mode
- Automatically generate a .claude/orchestration_plan.json
- Spawn multiple sub-agent contexts in parallel where safe
- Maintain a shared state object between agents
Alternatively, you can declare a workflow file upfront:
claude --workflow .claude/workflow.json "Generate comprehensive tests for the ecommerce-api project"
4. The Exact Prompt Used
/effort ultracode
You are working on a FastAPI e-commerce backend located in ./ecommerce-api.
The project has near-zero test coverage (4%).
Your task:
1. Analyze ALL service layer files under app/services/
2. For each service file, generate a complete pytest test module with:
- Unit tests using mocks for all DB calls
- Integration tests using a real SQLite in-memory DB
- Edge case tests (empty inputs, invalid IDs, duplicate entries)
- At minimum 85% branch coverage per module
3. Update pyproject.toml to include test dependencies
4. Generate a conftest.py with shared fixtures
5. Produce a COVERAGE_REPORT.md summarizing what was generated
Do NOT modify any existing source files. Only create files under tests/.
After each service is done, run `pytest tests/ --tb=short` and fix any failures
before moving to the next service.
5. Claude’s Orchestration Plan (Auto-Generated)
After receiving the prompt, Claude generated the following orchestration plan before writing a single line of code:
──────────────────────────────────────────────
CLAUDE CODE — DYNAMIC WORKFLOW ENGINE
Mode: ultracode | Agents: 5 | Strategy: parallel-with-sync
──────────────────────────────────────────────
[ORCHESTRATOR] Analyzing codebase...
→ Discovered 4 service files (2,078 total lines)
→ Identified 47 public methods requiring tests
→ Estimated 3–4 sub-tasks; spawning agent pool
[ORCHESTRATOR] Generated plan: .claude/orchestration_plan.json
Phase 1 (parallel):
Agent-A → user_service.py (23 methods, 445 lines)
Agent-B → product_service.py (19 methods, 523 lines)
Phase 2 (parallel, after Phase 1 sync):
Agent-C → order_service.py (31 methods, 712 lines)
Agent-D → payment_service.py (22 methods, 398 lines)
Phase 3 (sequential):
Agent-E → conftest.py + pyproject.toml + COVERAGE_REPORT.md
[ORCHESTRATOR] Starting Phase 1...
The generated .claude/orchestration_plan.json:
{
  "workflow_version": "1.0",
  "effort_level": "ultracode",
  "phases": [
    {
      "id": "phase-1",
      "strategy": "parallel",
      "agents": [
        {
          "id": "agent-a",
          "target": "app/services/user_service.py",
          "output": "tests/test_user_service.py",
          "constraints": ["no source modification", "pytest only", "min_coverage: 85"]
        },
        {
          "id": "agent-b",
          "target": "app/services/product_service.py",
          "output": "tests/test_product_service.py",
          "constraints": ["no source modification", "pytest only", "min_coverage: 85"]
        }
      ]
    },
    {
      "id": "phase-2",
      "strategy": "parallel",
      "depends_on": "phase-1",
      "agents": [
        {
          "id": "agent-c",
          "target": "app/services/order_service.py",
          "output": "tests/test_order_service.py"
        },
        {
          "id": "agent-d",
          "target": "app/services/payment_service.py",
          "output": "tests/test_payment_service.py"
        }
      ]
    },
    {
      "id": "phase-3",
      "strategy": "sequential",
      "depends_on": "phase-2",
      "agents": [
        {
          "id": "agent-e",
          "tasks": ["conftest.py", "pyproject.toml", "COVERAGE_REPORT.md"]
        }
      ]
    }
  ]
}
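Because each phase names at most one predecessor in `depends_on`, the execution order can be checked mechanically before trusting the plan. The sketch below is a reader-side sanity check, not part of Claude Code: it assumes only the field names visible in the plan (`phases`, `id`, `depends_on`) and works on a trimmed, hypothetical copy with the agent details left out.

```python
import json

# Trimmed copy of the plan: only the fields needed for ordering.
# Phases are listed out of order on purpose to exercise the resolver.
PLAN_JSON = """
{
  "phases": [
    {"id": "phase-3", "strategy": "sequential", "depends_on": "phase-2"},
    {"id": "phase-1", "strategy": "parallel"},
    {"id": "phase-2", "strategy": "parallel", "depends_on": "phase-1"}
  ]
}
"""


def phase_order(plan: dict) -> list:
    """Return phase ids in an order that respects every depends_on edge."""
    phases = {phase["id"]: phase for phase in plan["phases"]}
    ordered = []
    while len(ordered) < len(phases):
        progressed = False
        for phase_id, phase in phases.items():
            dependency = phase.get("depends_on")
            ready = dependency is None or dependency in ordered
            if phase_id not in ordered and ready:
                ordered.append(phase_id)
                progressed = True
        if not progressed:
            # Nothing became runnable: a cycle or a reference to a missing phase.
            raise ValueError("unresolvable depends_on in plan")
    return ordered


if __name__ == "__main__":
    print(phase_order(json.loads(PLAN_JSON)))
```

A plan whose phases depend on each other in a loop, or on an id that does not exist, raises `ValueError` instead of looping forever.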
6. Step-by-Step Execution Log
Phase 1 — Agent-A: user_service.py
[Agent-A] Reading user_service.py...
[Agent-A] Identified methods:
→ create_user(), get_user_by_id(), get_user_by_email()
→ update_user(), delete_user(), authenticate_user()
→ change_password(), list_users(), deactivate_user()
... (14 more)
[Agent-A] Generating tests/test_user_service.py (unit + integration)...
[Agent-A] Writing file... ✓ (187 lines)
[Agent-A] Running: pytest tests/test_user_service.py --tb=short
→ FAILED: test_create_user_duplicate_email
AssertionError: expected IntegrityError, got None
[Agent-A] Diagnosing failure...
→ Mock for db.session was not configured to raise on duplicate
[Agent-A] Patching test... ✓
[Agent-A] Re-running: pytest tests/test_user_service.py --tb=short
→ 34 passed in 2.41s ✓
[Agent-A] Coverage: user_service.py → 91.3% ✓
Phase 1 — Agent-B: product_service.py (parallel)
[Agent-B] Reading product_service.py...
[Agent-B] Identified 19 methods including bulk_update_prices(), search_products()
[Agent-B] Writing tests/test_product_service.py... ✓ (214 lines)
[Agent-B] Running pytest...
→ FAILED: test_search_products_empty_query (2 failures)
TypeError: argument of type 'NoneType' is not iterable
[Agent-B] Root cause: search_products() does not handle None query parameter
[Agent-B] ⚠ Source file fix required — flagging for human review
[Agent-B] Adjusting test to document the bug via pytest.mark.xfail...
[Agent-B] Re-running: pytest tests/test_product_service.py --tb=short
→ 31 passed, 2 xfailed in 3.12s ✓
[Agent-B] Coverage: product_service.py → 88.7% ✓
[Agent-B] Bug report appended to: .claude/bugs_found.md
Phase 2 — Agent-C & Agent-D
[ORCHESTRATOR] Phase 1 complete. Syncing shared fixtures...
[ORCHESTRATOR] Starting Phase 2 (parallel)...
[Agent-C] order_service.py is the most complex module (712 lines, 31 methods)
[Agent-C] Strategy: split into 3 logical groups
Group 1: order lifecycle (create, update, cancel, complete)
Group 2: order queries (list, filter, paginate)
Group 3: order business rules (discount, tax, inventory check)
[Agent-C] Writing tests/test_order_service.py... ✓ (341 lines)
[Agent-C] Running pytest...
→ 58 passed in 7.83s ✓
[Agent-C] Coverage: order_service.py → 89.4% ✓
[Agent-D] payment_service.py — detected Stripe SDK dependency
[Agent-D] Using unittest.mock to patch stripe.PaymentIntent...
[Agent-D] Writing tests/test_payment_service.py... ✓ (196 lines)
[Agent-D] Running pytest...
→ FAILED: test_refund_partial_payment
stripe.error.AuthenticationError: real API key used in test env
[Agent-D] Fix: ensuring STRIPE_API_KEY env var is mocked via pytest monkeypatch
[Agent-D] Re-running: pytest tests/test_payment_service.py --tb=short
→ 44 passed in 4.17s ✓
[Agent-D] Coverage: payment_service.py → 87.2% ✓
Phase 3 — Agent-E: Wrap-up
[Agent-E] Generating tests/conftest.py with shared fixtures...
[Agent-E] Updating pyproject.toml: adding pytest, pytest-cov, pytest-asyncio, httpx
[Agent-E] Running full suite: pytest tests/ --cov=app/services --cov-report=term
→ 167 passed, 2 xfailed in 18.46s
→ Overall coverage: app/services → 89.2% ✓
[Agent-E] Writing COVERAGE_REPORT.md... ✓
[ORCHESTRATOR] Workflow complete. Summary:
Files created: 7
Tests written: 167 passing, 2 xfailed (known bugs documented)
Bugs found: 1 (product_service.py:search_products None guard missing)
Total wall time: ~6 minutes 14 seconds
7. Sample Generated Code
tests/test_user_service.py (excerpt)
# tests/test_user_service.py
import pytest
from unittest.mock import AsyncMock, MagicMock, patch
from sqlalchemy.exc import IntegrityError

from app.services.user_service import UserService
from app.models.user import User


@pytest.fixture
def mock_db():
    # Default: no existing user is found for any query.
    db = MagicMock()
    db.query.return_value.filter.return_value.first.return_value = None
    return db


@pytest.fixture
def user_service(mock_db):
    return UserService(db=mock_db)


class TestCreateUser:
    def test_create_user_success(self, user_service, mock_db):
        payload = {"email": "[email protected]", "password": "secret123", "name": "John"}
        result = user_service.create_user(**payload)
        mock_db.add.assert_called_once()
        mock_db.commit.assert_called_once()
        assert result.email == "[email protected]"

    def test_create_user_duplicate_email_raises(self, user_service, mock_db):
        # Simulate the database rejecting a duplicate on commit.
        mock_db.commit.side_effect = IntegrityError(None, None, None)
        with pytest.raises(IntegrityError):
            user_service.create_user(email="[email protected]", password="x", name="Dup")

    def test_create_user_empty_email_raises(self, user_service):
        with pytest.raises(ValueError, match="email"):
            user_service.create_user(email="", password="secret", name="Test")

    @pytest.mark.parametrize("email", ["notanemail", "@nodomain", "missing@"])
    def test_create_user_invalid_email_format(self, user_service, email):
        with pytest.raises(ValueError):
            user_service.create_user(email=email, password="pass", name="T")
tests/conftest.py (excerpt)
# tests/conftest.py
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from app.core.database import Base


@pytest.fixture(scope="session")
def engine():
    return create_engine("sqlite:///:memory:", echo=False)


@pytest.fixture(scope="session")
def tables(engine):
    Base.metadata.create_all(engine)
    yield
    Base.metadata.drop_all(engine)


@pytest.fixture
def db_session(engine, tables):
    # One transaction per test, rolled back afterwards so tests stay isolated.
    connection = engine.connect()
    transaction = connection.begin()
    Session = sessionmaker(bind=connection)
    session = Session()
    yield session
    session.close()
    transaction.rollback()
    connection.close()
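The `db_session` fixture keeps tests independent by opening a transaction for each test and rolling it back afterwards. The same idea is sketched here with the standard-library `sqlite3` module instead of SQLAlchemy, so it runs without the project installed. The `users` table and the helper names are illustrative assumptions, not taken from the generated suite.

```python
import sqlite3
from contextlib import contextmanager


def make_connection() -> sqlite3.Connection:
    """Create an in-memory database with a minimal, hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    conn.commit()
    return conn


@contextmanager
def rolled_back_session(conn: sqlite3.Connection):
    """Yield the connection, then discard every write made inside the block."""
    try:
        yield conn
    finally:
        conn.rollback()


def count_users(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


if __name__ == "__main__":
    conn = make_connection()
    with rolled_back_session(conn) as session:
        session.execute("INSERT INTO users (email) VALUES ('a@example.com')")
        print("inside block:", count_users(session))
    print("after rollback:", count_users(conn))
```

Rolling back rather than recreating the schema keeps per-test setup cheap, which matters once the suite grows.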
8. Bugs Found During the Process
Claude automatically generated .claude/bugs_found.md:
| # | File | Method | Issue | Severity |
|---|---|---|---|---|
| 1 | product_service.py | search_products() | No None guard on query param; throws TypeError | Medium |
This is a concrete side benefit: the workflow acts as a passive bug scanner.
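The source of `search_products()` is not shown in this article, so the following is only a guess at the failure mode, chosen to be consistent with the reported `TypeError`: a membership test run directly on the query. The catalog, both function names, and the choice to raise `ValueError` in the guarded version are assumptions for illustration; returning an empty list would be an equally reasonable policy.

```python
from typing import List, Optional

CATALOG = ["Red Mug", "Blue Mug", "Green Plate"]  # hypothetical product names


def search_products_unguarded(query: Optional[str]) -> List[str]:
    """Stand-in for the bug: `" " in query` fails when query is None."""
    terms = query.split() if " " in query else [query]
    return [
        name for name in CATALOG
        if all(term.lower() in name.lower() for term in terms)
    ]


def search_products_guarded(query: Optional[str]) -> List[str]:
    """Same search, with an explicit check before the query is used."""
    if query is None:
        raise ValueError("query must not be None")
    return search_products_unguarded(query)


if __name__ == "__main__":
    print(search_products_guarded("mug"))
    try:
        search_products_unguarded(None)
    except TypeError as exc:
        print("unguarded:", exc)
```

Until a guard like this lands in the source, an xfail-marked test documents the gap without turning the suite red.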
9. Final Coverage Report
| Module | Before | After | Tests Added |
|---|---|---|---|
| user_service.py | 0% | 91.3% | 34 |
| product_service.py | 0% | 88.7% | 31 (+2 xfail) |
| order_service.py | 0% | 89.4% | 58 |
| payment_service.py | 0% | 87.2% | 44 |
| Overall services | 4% | 89.2% | 167 (+2) |
10. Error Handling Observations
| Agent | Failure Type | How Claude Handled It |
|---|---|---|
| Agent-A | Mock not raising IntegrityError | Self-diagnosed, patched mock config, re-ran |
| Agent-B | Source code bug (None guard) | Could not fix source (constraint), marked xfail, filed bug report |
| Agent-D | Live API key leaking into test | Used monkeypatch to override env var, re-ran |
Key takeaway: Claude correctly respected the constraint “do NOT modify source files” even when it identified that fixing the test properly required a source change. It chose the next-best path (xfail + bug report) rather than violating the constraint.
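The Agent-D fix in the table amounts to making sure no real `STRIPE_API_KEY` is visible while tests run. The log says pytest's `monkeypatch` was used; the sketch below swaps in `unittest.mock.patch.dict`, the standard-library equivalent, so it runs without pytest. The helper `stripe_api_key()` and the placeholder key value are hypothetical.

```python
import os
from unittest.mock import patch


def stripe_api_key() -> str:
    """Hypothetical helper: read the key at call time, as a service might."""
    return os.environ.get("STRIPE_API_KEY", "")


def run_with_fake_key(callable_under_test):
    """Run a callable with STRIPE_API_KEY overridden, then restore the environment."""
    with patch.dict(os.environ, {"STRIPE_API_KEY": "sk_test_placeholder"}):
        return callable_under_test()


if __name__ == "__main__":
    os.environ["STRIPE_API_KEY"] = "sk_live_example"  # stands in for a leaked real key
    print("during test:", run_with_fake_key(stripe_api_key))
    print("after test: ", stripe_api_key())
```

In a pytest suite, `monkeypatch.setenv("STRIPE_API_KEY", "sk_test_placeholder")` inside a fixture gives the same isolation and is undone automatically at teardown.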
11. Execution Efficiency Assessment
Time Breakdown
| Phase | Wall Time | Notes |
|---|---|---|
| Orchestration plan generation | ~45 sec | Codebase analysis |
| Phase 1 (parallel) | ~2 min 10 sec | 2 agents in parallel |
| Phase 2 (parallel) | ~2 min 38 sec | order_service was complex |
| Phase 3 (wrap-up) | ~51 sec | Fixtures, config, report |
| Total | ~6 min 14 sec | Includes 3 self-healing loops |
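Summing the phase times is a quick consistency check on the table. The sketch below does that arithmetic using the approximate per-phase figures as given. The sum comes to 6 min 24 sec, about ten seconds above the reported total of ~6 min 14 sec, a gap that may simply reflect rounding in the individual figures.

```python
# Approximate wall times from the table, in seconds.
PHASE_SECONDS = {
    "Orchestration plan generation": 45,
    "Phase 1 (parallel)": 2 * 60 + 10,
    "Phase 2 (parallel)": 2 * 60 + 38,
    "Phase 3 (wrap-up)": 51,
}


def format_duration(total_seconds: int) -> str:
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes} min {seconds} sec"


def phase_shares(phases: dict) -> dict:
    """Percentage of the summed wall time spent in each phase."""
    total = sum(phases.values())
    return {name: round(100 * secs / total, 1) for name, secs in phases.items()}


if __name__ == "__main__":
    print("sum of phases:", format_duration(sum(PHASE_SECONDS.values())))
    for name, share in phase_shares(PHASE_SECONDS).items():
        print(f"{name}: {share}%")
```

On these figures Phase 2 accounts for roughly two fifths of the run, which matches the note that order_service.py was the most complex module.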
Token Consumption Estimate
- Each service file read: ~500–700 tokens input
- Each generated test file: ~800–1,200 tokens output
- Orchestrator overhead: ~1,500 tokens (plan + sync messages)
- Estimated total: ~18,000–22,000 tokens
⚠️ Limitation observed: At ultracode effort level, token consumption is significantly higher than a single-agent approach. For projects with >50 service files, this could become expensive. Consider using /effort balanced for non-critical coverage runs.
Reliability
- 1 out of 4 test modules passed on first run (25%); per the execution log, Agent-A, Agent-B, and Agent-D each needed one fix-and-rerun cycle
- All 4 passed after self-healing (100%)
- No manual intervention was needed
12. Integration Plan for Team Workflow
Problem in Current Project
Our team of 8 engineers ships features fast but test coverage consistently lags. Code reviews take 40% longer because reviewers must also verify test quality.
Proposed Integration
┌─────────────────────────────────────────────────────────┐
│ DEVELOPER WORKFLOW │
│ │
│ 1. Developer writes feature code (PR branch) │
│ 2. Git pre-push hook triggers: │
│ claude --workflow .claude/test-gen-workflow.json \ │
│ "Generate tests for changed files in PR" │
│ 3. Claude generates tests → committed to same branch │
│ 4. CI runs full suite with coverage gate (≥80%) │
│ 5. PR review: reviewer sees feature + tests together │
└─────────────────────────────────────────────────────────┘
Implementation Steps
Week 1: Add .claude/test-gen-workflow.json to the repo (see config file in this article’s companion files). Document the workflow in CONTRIBUTING.md.
Week 2: Wire up a git pre-push hook using husky. Test with 2 volunteer engineers.
Week 3: Add a GitHub Actions step that runs Claude Code in balanced effort mode on every PR targeting main, posts coverage delta as a PR comment.
Week 4: Retrospective — measure change in review time and coverage trends.
Personal Perspective
Dynamic Workflows genuinely feels like a force multiplier for small teams. The most impressive aspect isn’t the code generation itself — it’s the orchestration intelligence: Claude correctly identified that order_service.py needed a different test strategy (splitting by logical group) compared to simpler services, without being told.
The self-healing loop is also production-quality. The fact that Agent-D recognized a live API key leaking and immediately switched to monkeypatch is the kind of defensive thinking that junior engineers often miss.
Concerns:
- Token costs at ultracode level need monitoring; budget guardrails are necessary
- Generated tests must still be reviewed: Claude occasionally over-mocks, which can produce tests that pass without actually testing logic
- The xfail + bug report approach for Agent-B is correct behavior but requires the team to have a process for triaging those reports
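The over-mocking concern is easiest to see with a deliberately broken function. In the sketch below, `PricingService` is a hypothetical class with a planted bug. A check that replaces the method under test with a mock still passes, while a check that calls the real code fails, which is what a reviewer should look for in generated tests.

```python
from unittest.mock import MagicMock


class PricingService:
    """Hypothetical service with a planted bug: the discount is added, not subtracted."""

    def apply_discount(self, price: float, pct: float) -> float:
        return price * (1 + pct / 100)  # bug: should be (1 - pct / 100)


def over_mocked_check() -> bool:
    """Mocks the very method being tested, so the bug can never surface."""
    service = PricingService()
    service.apply_discount = MagicMock(return_value=90.0)
    return service.apply_discount(100.0, 10) == 90.0


def real_check() -> bool:
    """Exercises the real implementation and therefore catches the bug."""
    service = PricingService()
    return service.apply_discount(100.0, 10) == 90.0


if __name__ == "__main__":
    print("over-mocked check passes:", over_mocked_check())
    print("real check passes:", real_check())
```

A practical review rule is to mock only the collaborators of the code under test, such as the database session or the payment SDK, and never the method whose behavior the test claims to verify.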
Verdict: This feature is ready for adoption in CI-gated test generation. It would save our team approximately 4–6 hours per sprint previously spent writing boilerplate tests.
13. References
- Introducing Dynamic Workflows in Claude Code
- Claude Code Workflows Documentation
- Claude Code CLI Reference
14. Hands-On Results
Actual environment: macOS · Python 3.11.4 · pytest 9.0.3 · pytest-cov 7.1.0 · SQLAlchemy 2.0.50
Date: 2025-06-09
14.1 Project Structure After Task
ecommerce-api/
├── app/
│ ├── core/
│ │ ├── database.py
│ │ └── security.py
│ ├── models/
│ │ ├── user.py · product.py · order.py · payment.py
│ └── services/
│ ├── user_service.py (70 stmts)
│ ├── product_service.py (56 stmts)
│ ├── order_service.py (68 stmts)
│ └── payment_service.py (65 stmts)
├── tests/
│ ├── conftest.py ← SQLite in-memory fixtures
│ ├── test_user_service.py ← 27 tests
│ ├── test_product_service.py ← 19 tests + 1 xfail
│ ├── test_order_service.py ← 25 tests
│ └── test_payment_service.py ← 19 tests
├── .claude/
│ ├── orchestration_plan.json
│ └── bugs_found.md
├── COVERAGE_REPORT.md
└── pyproject.toml
14.2 Self-Healing Loops Encountered
| Loop | Agent | Failure | Root Cause | Fix Applied |
|---|---|---|---|---|
| 1 | Agent-A | ValueError: password cannot be longer than 72 bytes | bcrypt==5.0.0 breaking change (missing __about__ attr, incompatible with passlib) | Downgrade to bcrypt==4.0.1 |
| 2 | Agent-C/D | NoReferencedTableError: could not find table 'users' | conftest.py called Base.metadata.create_all() without importing models first, so SQLAlchemy had no knowledge of the tables | Import all models into conftest.py before create_all() |
| 3 | Agent-D | SyntaxError in payment.py | File corrupted during creation via heredoc in terminal | Rewrote the file, verified with ast.parse() |
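The `ast.parse()` check from loop 3 is cheap enough to run on every generated file before handing it to pytest. A minimal sketch, assuming the file content is already available as a string; the function name and the `<generated>` label are illustrative.

```python
import ast
from typing import Optional


def syntax_error_in(source: str, filename: str = "<generated>") -> Optional[str]:
    """Return a short description of the first syntax error, or None if the source parses."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None


if __name__ == "__main__":
    good = "class Payment:\n    amount = 0\n"
    broken = "class Payment(:\n    amount = 0\n"  # simulates a corrupted file
    print(syntax_error_in(good))
    print(syntax_error_in(broken, filename="payment.py"))
```

Parsing only confirms that the file is syntactically valid Python; it does not import the module, so errors such as a missing dependency still surface later in the test run.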
14.3 Full pytest Output — Final Run
============================= test session starts ==============================
platform darwin -- Python 3.11.4, pytest-9.0.3, pluggy-1.6.0
plugins: cov-7.1.0, asyncio-1.4.0, anyio-4.13.0
collected 91 items
tests/test_order_service.py::TestCreateOrder::test_create_order_no_user_raises PASSED [ 1%]
tests/test_order_service.py::TestCreateOrder::test_create_order_empty_items_raises PASSED [ 2%]
tests/test_order_service.py::TestCreateOrder::test_create_order_product_not_found_raises PASSED [ 3%]
tests/test_order_service.py::TestCreateOrder::test_create_order_insufficient_stock_raises PASSED [ 4%]
tests/test_order_service.py::TestCreateOrder::test_create_order_success PASSED [ 5%]
tests/test_order_service.py::TestCancelOrder::test_cancel_pending_order PASSED [ 6%]
tests/test_order_service.py::TestCancelOrder::test_cancel_confirmed_order PASSED [ 7%]
tests/test_order_service.py::TestCancelOrder::test_cancel_shipped_order_raises PASSED [ 8%]
tests/test_order_service.py::TestCancelOrder::test_cancel_not_found_raises PASSED [ 9%]
tests/test_order_service.py::TestUpdateStatus::test_update_status_success PASSED [ 10%]
tests/test_order_service.py::TestUpdateStatus::test_update_status_not_found_raises PASSED [ 12%]
tests/test_order_service.py::TestOrderQueries::test_get_order_by_id_found PASSED [ 13%]
tests/test_order_service.py::TestOrderQueries::test_get_order_by_id_not_found PASSED [ 14%]
tests/test_order_service.py::TestOrderQueries::test_list_orders_by_user PASSED [ 15%]
tests/test_order_service.py::TestOrderQueries::test_list_all_orders PASSED [ 16%]
tests/test_order_service.py::TestOrderQueries::test_filter_orders_by_status PASSED [ 17%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_invalid_pct_raises PASSED [ 18%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_negative_raises PASSED [ 19%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_success PASSED [ 20%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_not_found_raises PASSED [ 21%]
tests/test_order_service.py::TestDiscountAndTax::test_calculate_tax_success PASSED [ 23%]
tests/test_order_service.py::TestDiscountAndTax::test_calculate_tax_not_found_raises PASSED [ 24%]
tests/test_order_service.py::TestOrderServiceIntegration::test_create_and_cancel_order PASSED [ 25%]
tests/test_order_service.py::TestOrderServiceIntegration::test_apply_discount_and_tax PASSED [ 26%]
tests/test_order_service.py::TestOrderServiceIntegration::test_filter_by_status PASSED [ 27%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_zero_amount_raises PASSED [ 28%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_negative_amount_raises PASSED [ 29%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_order_not_found_raises PASSED [ 30%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_success PASSED [ 31%]
tests/test_payment_service.py::TestConfirmPayment::test_confirm_not_found_raises PASSED [ 32%]
tests/test_payment_service.py::TestConfirmPayment::test_confirm_success PASSED [ 34%]
tests/test_payment_service.py::TestFailPayment::test_fail_not_found_raises PASSED [ 35%]
tests/test_payment_service.py::TestFailPayment::test_fail_success PASSED [ 36%]
tests/test_payment_service.py::TestRefundFull::test_refund_full_not_succeeded_raises PASSED [ 37%]
tests/test_payment_service.py::TestRefundFull::test_refund_full_success PASSED [ 38%]
tests/test_payment_service.py::TestRefundFull::test_refund_full_not_found_raises PASSED [ 39%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_not_succeeded_raises PASSED [ 40%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_exceeds_amount_raises PASSED [ 41%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_success PASSED [ 42%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_not_found_raises PASSED [ 43%]
tests/test_payment_service.py::TestGetPaymentByOrder::test_get_payment_found PASSED [ 45%]
tests/test_payment_service.py::TestGetPaymentByOrder::test_get_payment_not_found PASSED [ 46%]
tests/test_payment_service.py::TestPaymentServiceIntegration::test_confirm_and_refund_full PASSED [ 47%]
tests/test_payment_service.py::TestPaymentServiceIntegration::test_partial_refund_integration PASSED [ 48%]
tests/test_product_service.py::TestCreateProduct::test_create_product_success PASSED [ 49%]
tests/test_product_service.py::TestCreateProduct::test_create_product_empty_name_raises PASSED [ 50%]
tests/test_product_service.py::TestCreateProduct::test_create_product_negative_price_raises PASSED [ 51%]
tests/test_product_service.py::TestCreateProduct::test_create_product_zero_price_allowed PASSED [ 52%]
tests/test_product_service.py::TestGetProduct::test_get_product_found PASSED [ 53%]
tests/test_product_service.py::TestGetProduct::test_get_product_not_found PASSED [ 54%]
tests/test_product_service.py::TestUpdateProduct::test_update_product_not_found_raises PASSED [ 56%]
tests/test_product_service.py::TestUpdateProduct::test_update_product_success PASSED [ 57%]
tests/test_product_service.py::TestDeleteProduct::test_delete_product_success PASSED [ 58%]
tests/test_product_service.py::TestDeleteProduct::test_delete_product_not_found_raises PASSED [ 59%]
tests/test_product_service.py::TestBulkUpdatePrices::test_bulk_update_returns_count PASSED [ 60%]
tests/test_product_service.py::TestBulkUpdatePrices::test_bulk_update_empty_dict PASSED [ 61%]
tests/test_product_service.py::TestSearchProducts::test_search_products_valid_query PASSED [ 62%]
tests/test_product_service.py::TestSearchProducts::test_search_products_none_query_raises XFAIL [ 63%]
tests/test_product_service.py::TestAdjustStock::test_adjust_stock_not_found_raises PASSED [ 64%]
tests/test_product_service.py::TestAdjustStock::test_adjust_stock_negative_raises PASSED [ 65%]
tests/test_product_service.py::TestAdjustStock::test_adjust_stock_success PASSED [ 67%]
tests/test_product_service.py::TestProductServiceIntegration::test_create_and_get PASSED [ 68%]
tests/test_product_service.py::TestProductServiceIntegration::test_list_active_products PASSED [ 69%]
tests/test_product_service.py::TestProductServiceIntegration::test_adjust_stock_integration PASSED [ 70%]
tests/test_user_service.py::TestCreateUser::test_create_user_success PASSED [ 71%]
tests/test_user_service.py::TestCreateUser::test_create_user_duplicate_email_raises PASSED [ 72%]
tests/test_user_service.py::TestCreateUser::test_create_user_empty_email_raises PASSED [ 73%]
tests/test_user_service.py::TestCreateUser::test_create_user_empty_password_raises PASSED [ 74%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[notanemail] PASSED [ 75%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[@nodomain] PASSED [ 76%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[missing@] PASSED [ 78%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[a @b.com] PASSED [ 79%]
tests/test_user_service.py::TestGetUser::test_get_user_by_id_found PASSED [ 80%]
tests/test_user_service.py::TestGetUser::test_get_user_by_id_not_found PASSED [ 81%]
tests/test_user_service.py::TestGetUser::test_get_user_by_email_found PASSED [ 82%]
tests/test_user_service.py::TestUpdateUser::test_update_user_not_found_raises PASSED [ 83%]
tests/test_user_service.py::TestUpdateUser::test_update_user_name_success PASSED [ 84%]
tests/test_user_service.py::TestDeleteUser::test_delete_user_success PASSED [ 85%]
tests/test_user_service.py::TestDeleteUser::test_delete_user_not_found_raises PASSED [ 86%]
tests/test_user_service.py::TestAuthenticateUser::test_authenticate_user_not_found PASSED [ 87%]
tests/test_user_service.py::TestChangePassword::test_change_password_user_not_found PASSED [ 89%]
tests/test_user_service.py::TestListUsers::test_list_users_returns_list PASSED [ 90%]
tests/test_user_service.py::TestDeactivateUser::test_deactivate_user_not_found_raises PASSED [ 91%]
tests/test_user_service.py::TestDeactivateUser::test_deactivate_user_sets_inactive PASSED [ 92%]
tests/test_user_service.py::TestUserServiceIntegration::test_create_and_get_user PASSED [ 93%]
tests/test_user_service.py::TestUserServiceIntegration::test_duplicate_email_raises PASSED [ 94%]
tests/test_user_service.py::TestUserServiceIntegration::test_authenticate_correct_password PASSED [ 95%]
tests/test_user_service.py::TestUserServiceIntegration::test_authenticate_wrong_password PASSED [ 96%]
tests/test_user_service.py::TestUserServiceIntegration::test_deactivate_user PASSED [ 97%]
tests/test_user_service.py::TestUserServiceIntegration::test_list_users PASSED [ 98%]
tests/test_user_service.py::TestUserServiceIntegration::test_change_password_wrong_old PASSED [100%]
================================ tests coverage ================================
Name Stmts Miss Cover Missing
---------------------------------------------------------------
app/services/__init__.py 0 0 100%
app/services/order_service.py 68 0 100%
app/services/payment_service.py 65 0 100%
app/services/product_service.py 56 0 100%
app/services/user_service.py 70 5 93% 42-43, 68-70
---------------------------------------------------------------
TOTAL 259 5 98%
================== 90 passed, 1 xfailed, 3 warnings in 4.49s ===================
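The percentages in the coverage table follow directly from the Stmts and Miss columns, so they can be recomputed as a cross-check. The sketch below uses the four service rows from the report and the 85% threshold from the original prompt. Because the report has no Branch columns, these appear to be statement-coverage figures, not the branch coverage the prompt asked for.

```python
# (statements, missed) per module, copied from the coverage report.
REPORT = {
    "order_service.py": (68, 0),
    "payment_service.py": (65, 0),
    "product_service.py": (56, 0),
    "user_service.py": (70, 5),
}


def coverage_pct(stmts: int, miss: int) -> float:
    """Percentage of statements executed; an empty module counts as fully covered."""
    return 100.0 if stmts == 0 else 100.0 * (stmts - miss) / stmts


def total_coverage(report: dict) -> float:
    stmts = sum(s for s, _ in report.values())
    miss = sum(m for _, m in report.values())
    return coverage_pct(stmts, miss)


def modules_below(report: dict, threshold: float) -> list:
    """Names of modules whose coverage falls under the threshold."""
    return [name for name, (s, m) in report.items() if coverage_pct(s, m) < threshold]


if __name__ == "__main__":
    for name, (stmts, miss) in REPORT.items():
        print(f"{name}: {coverage_pct(stmts, miss):.1f}%")
    print(f"total: {total_coverage(REPORT):.1f}%")
    print("below 85%:", modules_below(REPORT, 85.0))
```

For an automated gate, pytest-cov's `--cov-fail-under` option enforces a minimum total without any custom code.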
14.4 Final Coverage — Actual vs Article Target
| Module | Article Target | Actual Result | Tests Written | Status |
|---|---|---|---|---|
| user_service.py | ≥ 85% | 93% | 27 | ✅ |
| product_service.py | ≥ 85% | 100% | 19 (+1 xfail) | ✅ |
| order_service.py | ≥ 85% | 100% | 25 | ✅ |
| payment_service.py | ≥ 85% | 100% | 19 | ✅ |
| TOTAL | ≥ 85% | 98% | 90 (+1 xfail) | ✅ |
14.5 Bugs Found (Actual)
| # | File | Method | Issue | Severity |
|---|---|---|---|---|
| 1 | product_service.py | search_products() | No None guard; TypeError when query=None | Medium |
Test marked as xfail as designed; source code was not modified.
14.6 Article vs Hands-On Comparison
| Criterion | Article (Claude Code) | Hands-On (GitHub Copilot) |
|---|---|---|
| Tests generated | 167 passed, 2 xfailed | 90 passed, 1 xfailed |
| Overall coverage | 89.2% | 98% ✨ |
| Self-healing loops | 3 loops | 3 loops ✓ |
| Bugs found | 1 (None guard) | 1 (None guard) ✓ |
| xfail + bug report approach | ✓ | ✓ |
| Source files modified | 0 | 0 ✓ |
| Wall time | ~6 min 14 sec | ~12 min (terminal I/O overhead) |
Note: The test count is lower because the hands-on project was built more compactly (259 statements across the four service modules, versus the 2,078 service-layer lines reported in the article). Coverage reached 98%, about 9 percentage points above the article’s reported 89.2% and 13 points above its 85% target. All patterns were reproduced: unit mocks, SQLite integration, xfail bug reports, and Stripe monkeypatching.