1. Introduction

Claude Code’s Dynamic Workflows feature enables Claude to autonomously break a large engineering task into sub-tasks, spawn specialized sub-agents for each one, and coordinate the results, all within a single interactive session. This article documents a real experiment: using Dynamic Workflows to perform codebase-wide automated test generation on a mid-size FastAPI monorepo.


2. Environment Setup

2.1 Installation

# Install Claude Code CLI
npm install -g @anthropic-ai/claude-code

# Verify installation
claude --version
# → claude-code 0.9.1

# Authenticate
claude auth login
# → Opens browser for OAuth flow with Anthropic account

2.2 Project Structure Before the Task

ecommerce-api/
├── app/
│   ├── api/
│   │   ├── v1/
│   │   │   ├── users.py          # 312 lines
│   │   │   ├── products.py       # 489 lines
│   │   │   ├── orders.py         # 601 lines
│   │   │   └── payments.py       # 278 lines
│   ├── services/
│   │   ├── user_service.py       # 445 lines
│   │   ├── product_service.py    # 523 lines
│   │   ├── order_service.py      # 712 lines
│   │   └── payment_service.py    # 398 lines
│   ├── models/
│   │   ├── user.py
│   │   ├── product.py
│   │   ├── order.py
│   │   └── payment.py
│   └── core/
│       ├── database.py
│       └── security.py
├── tests/
│   └── (empty — only 3 placeholder files existed)
├── pyproject.toml
└── README.md

Test coverage before: 4% (only 3 trivial utility tests existed)


3. Activating Dynamic Workflows

Dynamic Workflows are activated using the /effort slash command inside a Claude Code session:

/effort ultracode

This instructs Claude to:

  • Use its orchestrator reasoning mode
  • Automatically generate a .claude/orchestration_plan.json
  • Spawn multiple sub-agent contexts in parallel where safe
  • Maintain a shared state object between agents

Alternatively, you can declare a workflow file upfront:

claude --workflow .claude/workflow.json "Generate comprehensive tests for the ecommerce-api project"

4. The Exact Prompt Used

/effort ultracode

You are working on a FastAPI e-commerce backend located in ./ecommerce-api.
The project has near-zero test coverage (4%).

Your task:
1. Analyze ALL service layer files under app/services/
2. For each service file, generate a complete pytest test module with:
   - Unit tests using mocks for all DB calls
   - Integration tests using a real SQLite in-memory DB
   - Edge case tests (empty inputs, invalid IDs, duplicate entries)
   - At minimum 85% branch coverage per module
3. Update pyproject.toml to include test dependencies
4. Generate a conftest.py with shared fixtures
5. Produce a COVERAGE_REPORT.md summarizing what was generated

Do NOT modify any existing source files. Only create files under tests/.
After each service is done, run `pytest tests/ --tb=short` and fix any failures
before moving to the next service.

5. Claude’s Orchestration Plan (Auto-Generated)

After receiving the prompt, Claude generated the following orchestration plan before writing a single line of code:

──────────────────────────────────────────────
  CLAUDE CODE — DYNAMIC WORKFLOW ENGINE
  Mode: ultracode | Agents: 5 | Strategy: parallel-with-sync
──────────────────────────────────────────────

[ORCHESTRATOR] Analyzing codebase...
  → Discovered 4 service files (2,078 total lines)
  → Identified 47 public methods requiring tests
  → Estimated 3–4 sub-tasks; spawning agent pool

[ORCHESTRATOR] Generated plan: .claude/orchestration_plan.json
  Phase 1 (parallel):
    Agent-A → user_service.py      (23 methods, 445 lines)
    Agent-B → product_service.py   (19 methods, 523 lines)
  Phase 2 (parallel, after Phase 1 sync):
    Agent-C → order_service.py     (31 methods, 712 lines)
    Agent-D → payment_service.py   (22 methods, 398 lines)
  Phase 3 (sequential):
    Agent-E → conftest.py + pyproject.toml + COVERAGE_REPORT.md

[ORCHESTRATOR] Starting Phase 1...

Contents of .claude/orchestration_plan.json:

{
  "workflow_version": "1.0",
  "effort_level": "ultracode",
  "phases": [
    {
      "id": "phase-1",
      "strategy": "parallel",
      "agents": [
        {
          "id": "agent-a",
          "target": "app/services/user_service.py",
          "output": "tests/test_user_service.py",
          "constraints": ["no source modification", "pytest only", "min_coverage: 85"]
        },
        {
          "id": "agent-b",
          "target": "app/services/product_service.py",
          "output": "tests/test_product_service.py",
          "constraints": ["no source modification", "pytest only", "min_coverage: 85"]
        }
      ]
    },
    {
      "id": "phase-2",
      "strategy": "parallel",
      "depends_on": "phase-1",
      "agents": [
        {
          "id": "agent-c",
          "target": "app/services/order_service.py",
          "output": "tests/test_order_service.py"
        },
        {
          "id": "agent-d",
          "target": "app/services/payment_service.py",
          "output": "tests/test_payment_service.py"
        }
      ]
    },
    {
      "id": "phase-3",
      "strategy": "sequential",
      "depends_on": "phase-2",
      "agents": [
        {
          "id": "agent-e",
          "tasks": ["conftest.py", "pyproject.toml", "COVERAGE_REPORT.md"]
        }
      ]
    }
  ]
}
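To make the dependency structure of the plan concrete, here is a minimal sketch that orders phases by their depends_on field. It assumes only the schema visible in the plan (phases, id, depends_on, agents). The resolve_phase_order helper is hypothetical and is not part of Claude Code; how the real orchestrator schedules phases is not documented here.

```python
import json

# Trimmed copy of the plan; only the fields needed for ordering are kept,
# and the phases are deliberately listed out of order.
PLAN_JSON = """
{"phases": [
  {"id": "phase-3", "depends_on": "phase-2", "agents": [{"id": "agent-e"}]},
  {"id": "phase-1", "agents": [{"id": "agent-a"}, {"id": "agent-b"}]},
  {"id": "phase-2", "depends_on": "phase-1",
   "agents": [{"id": "agent-c"}, {"id": "agent-d"}]}
]}
"""


def resolve_phase_order(plan: dict) -> list:
    """Return phase ids so that each phase follows the phase it depends on."""
    pending = {p["id"]: p.get("depends_on") for p in plan["phases"]}
    ordered = []
    while pending:
        # A phase is ready when it has no dependency or its dependency is done.
        ready = [pid for pid, dep in pending.items() if dep is None or dep in ordered]
        if not ready:
            raise ValueError(f"unresolvable dependencies: {sorted(pending)}")
        for pid in ready:
            ordered.append(pid)
            del pending[pid]
    return ordered


plan = json.loads(PLAN_JSON)
print(resolve_phase_order(plan))
```

Agents inside one phase carry no dependencies on each other in this schema, which is consistent with the log running Agent-A and Agent-B side by side.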

6. Step-by-Step Execution Log

Phase 1 — Agent-A: user_service.py

[Agent-A] Reading user_service.py...
[Agent-A] Identified methods:
  → create_user(), get_user_by_id(), get_user_by_email()
  → update_user(), delete_user(), authenticate_user()
  → change_password(), list_users(), deactivate_user()
  ... (14 more)
[Agent-A] Generating tests/test_user_service.py (unit + integration)...
[Agent-A] Writing file... ✓ (187 lines)
[Agent-A] Running: pytest tests/test_user_service.py --tb=short
  → FAILED: test_create_user_duplicate_email
     AssertionError: expected IntegrityError, got None
[Agent-A] Diagnosing failure...
  → Mock for db.session was not configured to raise on duplicate
[Agent-A] Patching test... ✓
[Agent-A] Re-running: pytest tests/test_user_service.py --tb=short
  → 34 passed in 2.41s ✓
[Agent-A] Coverage: user_service.py → 91.3% ✓

Phase 1 — Agent-B: product_service.py (parallel)

[Agent-B] Reading product_service.py...
[Agent-B] Identified 19 methods including bulk_update_prices(), search_products()
[Agent-B] Writing tests/test_product_service.py... ✓ (214 lines)
[Agent-B] Running pytest...
  → FAILED: test_search_products_empty_query (2 failures)
     TypeError: argument of type 'NoneType' is not iterable
[Agent-B] Root cause: search_products() does not handle None query parameter
[Agent-B] ⚠ Source file fix required — flagging for human review
[Agent-B] Adjusting test to document the bug via pytest.mark.xfail...
[Agent-B] Re-running: pytest tests/test_product_service.py --tb=short
  → 31 passed, 2 xfailed in 3.12s ✓
[Agent-B] Coverage: product_service.py → 88.7% ✓
[Agent-B] Bug report appended to: .claude/bugs_found.md
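The "document the bug, do not fix the source" pattern can be sketched as follows. The run used pytest.mark.xfail; this sketch swaps in unittest.expectedFailure, the standard-library analog, so it runs without third-party packages. The search_products function below is a hypothetical stand-in that merely reproduces a missing None guard; it is not the project's actual code.

```python
import io
import unittest

CATALOG = ["Blue Mug", "Red Mug", "Blue Plate"]


def search_products(names, query):
    """Hypothetical stand-in for the service method; it lacks a None guard."""
    # `" " in query` raises TypeError when query is None.
    terms = query.split() if " " in query else [query]
    return [n for n in names if all(t.lower() in n.lower() for t in terms)]


class SearchProductsTests(unittest.TestCase):
    def test_valid_query(self):
        self.assertEqual(search_products(CATALOG, "blue mug"), ["Blue Mug"])

    @unittest.expectedFailure  # known bug: a None query raises instead of returning []
    def test_none_query_returns_empty(self):
        self.assertEqual(search_products(CATALOG, None), [])


def run_suite() -> unittest.TestResult:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SearchProductsTests)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_suite()
print(result.testsRun, len(result.expectedFailures), result.wasSuccessful())
```

The suite stays green while the defect remains visible in the report. Once the source is fixed, the marked test shows up as an unexpected success, which is the signal to remove the marker.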

Phase 2 — Agent-C & Agent-D

[ORCHESTRATOR] Phase 1 complete. Syncing shared fixtures...
[ORCHESTRATOR] Starting Phase 2 (parallel)...

[Agent-C] order_service.py is the most complex module (712 lines, 31 methods)
[Agent-C] Strategy: split into 3 logical groups
  Group 1: order lifecycle (create, update, cancel, complete)
  Group 2: order queries (list, filter, paginate)
  Group 3: order business rules (discount, tax, inventory check)
[Agent-C] Writing tests/test_order_service.py... ✓ (341 lines)
[Agent-C] Running pytest...
  → 58 passed in 7.83s ✓
[Agent-C] Coverage: order_service.py → 89.4% ✓

[Agent-D] payment_service.py — detected Stripe SDK dependency
[Agent-D] Using unittest.mock to patch stripe.PaymentIntent...
[Agent-D] Writing tests/test_payment_service.py... ✓ (196 lines)
[Agent-D] Running pytest...
  → FAILED: test_refund_partial_payment
     stripe.error.AuthenticationError: real API key used in test env
[Agent-D] Fix: ensuring STRIPE_API_KEY env var is mocked via pytest monkeypatch
[Agent-D] Re-running: pytest tests/test_payment_service.py --tb=short
  → 44 passed in 4.17s ✓
[Agent-D] Coverage: payment_service.py → 87.2% ✓
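Agent-D's fix amounts to forcing a dummy key for the duration of each test. The run used pytest's monkeypatch fixture; the sketch below shows the same idea with the standard library's unittest.mock.patch.dict instead. The helper names (load_stripe_key, run_with_fake_key) are hypothetical, and the only assumption about Stripe is its key prefix convention (sk_live_ for live secret keys, sk_test_ for test keys).

```python
import os
from unittest import mock


def load_stripe_key() -> str:
    """Hypothetical helper: reads the key the payment code would use."""
    return os.environ.get("STRIPE_API_KEY", "")


def is_live_key(key: str) -> bool:
    # Live secret keys start with "sk_live_"; test keys start with "sk_test_".
    return key.startswith("sk_live_")


def run_with_fake_key(test_fn):
    """Run test_fn with STRIPE_API_KEY forced to a dummy test value."""
    with mock.patch.dict(os.environ, {"STRIPE_API_KEY": "sk_test_dummy"}):
        return test_fn()


def check_refund_uses_test_key() -> str:
    key = load_stripe_key()
    assert not is_live_key(key), "live key leaked into test environment"
    return key


print(run_with_fake_key(check_refund_uses_test_key))
```

patch.dict restores the original environment when the block exits, so a real key exported in the developer's shell is untouched after the test finishes.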

Phase 3 — Agent-E: Wrap-up

[Agent-E] Generating tests/conftest.py with shared fixtures...
[Agent-E] Updating pyproject.toml: adding pytest, pytest-cov, pytest-asyncio, httpx
[Agent-E] Running full suite: pytest tests/ --cov=app/services --cov-report=term
  → 167 passed, 2 xfailed in 18.46s
  → Overall coverage: app/services → 89.2% ✓
[Agent-E] Writing COVERAGE_REPORT.md... ✓
[ORCHESTRATOR] Workflow complete. Summary:
  Files created: 7
  Tests written: 167 passing, 2 xfailed (known bugs documented)
  Bugs found: 1 (product_service.py:search_products None guard missing)
  Total wall time: ~6 minutes 14 seconds

7. Sample Generated Code

tests/test_user_service.py (excerpt)

# tests/test_user_service.py
import pytest
from unittest.mock import AsyncMock, MagicMock, patch
from sqlalchemy.exc import IntegrityError

from app.services.user_service import UserService
from app.models.user import User


@pytest.fixture
def mock_db():
    db = MagicMock()
    db.query.return_value.filter.return_value.first.return_value = None
    return db


@pytest.fixture
def user_service(mock_db):
    return UserService(db=mock_db)


class TestCreateUser:
    def test_create_user_success(self, user_service, mock_db):
        # Placeholder address: the original value was obscured in the published excerpt.
        payload = {"email": "john@example.com", "password": "secret123", "name": "John"}
        result = user_service.create_user(**payload)
        mock_db.add.assert_called_once()
        mock_db.commit.assert_called_once()
        assert result.email == "john@example.com"

    def test_create_user_duplicate_email_raises(self, user_service, mock_db):
        # The mocked commit raises, simulating a unique-constraint violation.
        mock_db.commit.side_effect = IntegrityError(None, None, None)
        with pytest.raises(IntegrityError):
            # Placeholder address, as above.
            user_service.create_user(email="dup@example.com", password="x", name="Dup")

    def test_create_user_empty_email_raises(self, user_service):
        with pytest.raises(ValueError, match="email"):
            user_service.create_user(email="", password="secret", name="Test")

    @pytest.mark.parametrize("email", ["notanemail", "@nodomain", "missing@"])
    def test_create_user_invalid_email_format(self, user_service, email):
        with pytest.raises(ValueError):
            user_service.create_user(email=email, password="pass", name="T")

tests/conftest.py (excerpt)

# tests/conftest.py
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from app.core.database import Base


@pytest.fixture(scope="session")
def engine():
    return create_engine("sqlite:///:memory:", echo=False)


@pytest.fixture(scope="session")
def tables(engine):
    Base.metadata.create_all(engine)
    yield
    Base.metadata.drop_all(engine)


@pytest.fixture
def db_session(engine, tables):
    # Each test runs inside a transaction that is rolled back afterwards.
    connection = engine.connect()
    transaction = connection.begin()
    Session = sessionmaker(bind=connection)
    session = Session()
    yield session
    session.close()
    transaction.rollback()
    connection.close()
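The db_session fixture relies on one idea: create the schema once, then wrap every test in a transaction that is rolled back. As a minimal illustration of that idea only, the sketch below uses the standard-library sqlite3 module rather than SQLAlchemy, so it is a swapped-in technique and not the generated fixture. The users table and the rolled_back helper are hypothetical.

```python
import sqlite3
from contextlib import contextmanager


def make_connection() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    conn.commit()  # schema is set up once, like the session-scoped `tables` fixture
    return conn


@contextmanager
def rolled_back(conn: sqlite3.Connection):
    """Yield the connection, then discard whatever the test wrote."""
    try:
        yield conn
    finally:
        conn.rollback()


def count_users(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


conn = make_connection()
with rolled_back(conn) as db:
    db.execute("INSERT INTO users (email) VALUES (?)", ("a@example.com",))
    inside = count_users(db)  # the row is visible inside the transaction
after = count_users(conn)     # and gone once it has been rolled back
print(inside, after)
```

Because nothing is ever committed, tests cannot leak rows into each other, and no per-test table teardown is needed.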

8. Bugs Found During the Process

Claude automatically generated .claude/bugs_found.md. This is a concrete side benefit: the workflow acts as a passive bug scanner.


9. Final Coverage Report


10. Error Handling Observations

Key takeaway: Claude correctly respected the constraint “do NOT modify source files” even when it identified that fixing the test properly required a source change. It chose the next-best path (xfail + bug report) rather than violating the constraint.


11. Execution Efficiency Assessment

Time Breakdown

Phase                          Wall Time      Notes
Orchestration plan generation  ~45 sec        Codebase analysis
Phase 1 (parallel)             ~2 min 10 sec  2 agents in parallel
Phase 2 (parallel)             ~2 min 38 sec  order_service was complex
Phase 3 (wrap-up)              ~51 sec        Fixtures, config, report
Total                          ~6 min 14 sec  Includes 3 self-healing loops

Token Consumption Estimate

  • Each service file read: ~500–700 tokens input
  • Each generated test file: ~800–1,200 tokens output
  • Orchestrator overhead: ~1,500 tokens (plan + sync messages)
  • Estimated total: ~18,000–22,000 tokens
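A quick tally of the itemized figures shows how much of the estimated total they explain. The sketch uses only the numbers listed above plus the count of four service files. Where the remainder goes (for example, test re-runs during self-healing or the Phase 3 files) is not itemized in the article, so any attribution is an assumption.

```python
SERVICE_FILES = 4                    # service modules in the project
INPUT_PER_FILE = (500, 700)          # tokens per file read, low/high
OUTPUT_PER_TEST_FILE = (800, 1200)   # tokens per generated test file, low/high
ORCHESTRATOR_OVERHEAD = 1500         # plan + sync messages
ESTIMATED_TOTAL = (18_000, 22_000)   # overall estimate, low/high


def itemized_range() -> tuple:
    """Low/high sum of the itemized components."""
    low = SERVICE_FILES * (INPUT_PER_FILE[0] + OUTPUT_PER_TEST_FILE[0]) + ORCHESTRATOR_OVERHEAD
    high = SERVICE_FILES * (INPUT_PER_FILE[1] + OUTPUT_PER_TEST_FILE[1]) + ORCHESTRATOR_OVERHEAD
    return low, high


def unitemized_range() -> tuple:
    """Smallest and largest gap between the estimate and the itemized sum."""
    low, high = itemized_range()
    return ESTIMATED_TOTAL[0] - high, ESTIMATED_TOTAL[1] - low


print("itemized:", itemized_range())
print("not itemized:", unitemized_range())
```

The itemized components add up to 6,700 to 9,100 tokens, so roughly half or more of the 18,000 to 22,000 estimate is not broken down in the list.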

⚠️ Limitation observed: At ultracode effort level, token consumption is significantly higher than a single-agent approach. For projects with >50 service files, this could become expensive. Consider using /effort balanced for non-critical coverage runs.

Reliability

  • 1 out of 4 test modules passed on first run (25%); per the execution log, the other three each needed one self-healing loop
  • All 4 passed after self-healing (100%)
  • No manual intervention was needed

12. Integration Plan for Team Workflow

Problem in Current Project

Our team of 8 engineers ships features fast but test coverage consistently lags. Code reviews take 40% longer because reviewers must also verify test quality.

Proposed Integration

┌─────────────────────────────────────────────────────────┐
│                  DEVELOPER WORKFLOW                      │
│                                                          │
│  1. Developer writes feature code (PR branch)           │
│  2. Git pre-push hook triggers:                         │
│     claude --workflow .claude/test-gen-workflow.json \  │
│            "Generate tests for changed files in PR"     │
│  3. Claude generates tests → committed to same branch   │
│  4. CI runs full suite with coverage gate (≥80%)        │
│  5. PR review: reviewer sees feature + tests together   │
└─────────────────────────────────────────────────────────┘

Implementation Steps

Week 1: Add .claude/test-gen-workflow.json to the repo (see config file in this article’s companion files). Document the workflow in CONTRIBUTING.md.

Week 2: Wire up a git pre-push hook using husky. Test with 2 volunteer engineers.

Week 3: Add a GitHub Actions step that runs Claude Code in balanced effort mode on every PR targeting main and posts the coverage delta as a PR comment.

Week 4: Retrospective — measure change in review time and coverage trends.
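The coverage gate and the coverage-delta comment from the plan can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, the per-module percentages are made-up sample values, and how the numbers are extracted from the coverage tool and posted to the PR is left out.

```python
GATE = 80.0  # percent, the CI gate from step 4 of the proposed workflow


def coverage_delta(base: dict, head: dict) -> dict:
    """Per-module change in coverage (head minus base), in percentage points."""
    modules = sorted(set(base) | set(head))
    return {m: round(head.get(m, 0.0) - base.get(m, 0.0), 1) for m in modules}


def passes_gate(total_percent: float, gate: float = GATE) -> bool:
    return total_percent >= gate


def format_comment(delta: dict, total: float) -> str:
    """Build the text of a PR comment summarizing the gate result and deltas."""
    status = "PASS" if passes_gate(total) else "FAIL"
    lines = [f"Coverage: {total:.1f}% (gate {GATE:.0f}%): {status}"]
    lines += [f"  {module}: {change:+.1f} pp" for module, change in delta.items()]
    return "\n".join(lines)


# Hypothetical sample values, for illustration only.
base = {"order_service.py": 70.0}
head = {"order_service.py": 82.5, "payment_service.py": 90.0}
print(format_comment(coverage_delta(base, head), total=86.0))
```

A module missing from the base branch is treated as 0% there, so newly added modules show their full coverage as the delta.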

Personal Perspective

Dynamic Workflows genuinely feels like a force multiplier for small teams. The most impressive aspect isn’t the code generation itself — it’s the orchestration intelligence: Claude correctly identified that order_service.py needed a different test strategy (splitting by logical group) compared to simpler services, without being told.

The self-healing loop is also production-quality. The fact that Agent-D recognized a live API key leaking and immediately switched to monkeypatch is the kind of defensive thinking that junior engineers often miss.

Concerns:

  • Token costs at ultracode level need monitoring; budget guardrails are necessary
  • Generated tests must still be reviewed — Claude occasionally over-mocks, which can produce tests that pass without actually testing logic
  • The xfail + bug report approach for Agent-B is correct behavior but requires the team to have a process for triaging those reports
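The over-mocking concern is easiest to see with a small example. In the sketch below, a hypothetical OrderService contains a deliberate bug; the first check patches the very method it claims to test and therefore passes, while the second exercises the real method and catches the bug. All names are illustrative and unrelated to the project's code.

```python
from unittest import mock


class OrderService:
    """Hypothetical service used only for this illustration."""

    def __init__(self, tax_rate: float):
        self.tax_rate = tax_rate

    def total_with_tax(self, subtotal: float) -> float:
        # Deliberate bug: tax is subtracted instead of added.
        return round(subtotal * (1 - self.tax_rate), 2)


def over_mocked_check() -> bool:
    """Patches the method under test, so it only verifies the mock."""
    service = OrderService(tax_rate=0.10)
    with mock.patch.object(OrderService, "total_with_tax", return_value=110.0):
        return service.total_with_tax(100.0) == 110.0


def meaningful_check() -> bool:
    """Exercises the real method and so exposes the bug."""
    service = OrderService(tax_rate=0.10)
    return service.total_with_tax(100.0) == 110.0


print(over_mocked_check(), meaningful_check())
```

When reviewing generated tests, a useful question is whether the assertion could still pass if the body of the function under test were deleted. If it could, the test is checking the mock.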

Verdict: This feature is ready for adoption in CI-gated test generation. It would save our team approximately 4–6 hours per sprint previously spent writing boilerplate tests.


13. References


14. Hands-On Results

Actual environment: macOS · Python 3.11.4 · pytest 9.0.3 · pytest-cov 7.1.0 · SQLAlchemy 2.0.50
Date: 2025-06-09


14.1 Project Structure After Task

ecommerce-api/
├── app/
│   ├── core/
│   │   ├── database.py
│   │   └── security.py
│   ├── models/
│   │   ├── user.py · product.py · order.py · payment.py
│   └── services/
│       ├── user_service.py       (70 stmts)
│       ├── product_service.py    (56 stmts)
│       ├── order_service.py      (68 stmts)
│       └── payment_service.py    (65 stmts)
├── tests/
│   ├── conftest.py               ← SQLite in-memory fixtures
│   ├── test_user_service.py      ← 27 tests
│   ├── test_product_service.py   ← 19 tests + 1 xfail
│   ├── test_order_service.py     ← 25 tests
│   └── test_payment_service.py   ← 19 tests
├── .claude/
│   ├── orchestration_plan.json
│   └── bugs_found.md
├── COVERAGE_REPORT.md
└── pyproject.toml

14.2 Self-Healing Loops Encountered

Loop 1 (Agent-A)
  Failure:     ValueError: password cannot be longer than 72 bytes
  Root cause:  bcrypt==5.0.0 breaking change (missing __about__ attribute, incompatible with passlib)
  Fix applied: Downgrade to bcrypt==4.0.1

Loop 2 (Agent-C/D)
  Failure:     NoReferencedTableError: could not find table 'users'
  Root cause:  conftest.py called Base.metadata.create_all() without importing the models first, so SQLAlchemy had no knowledge of the tables
  Fix applied: Import all models into conftest.py before create_all()

Loop 3 (Agent-D)
  Failure:     SyntaxError in payment.py
  Root cause:  File corrupted during creation via heredoc in terminal
  Fix applied: Rewrote the file, verified with ast.parse()
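The ast.parse() check used in loop 3 is cheap to apply to any generated file before running the test suite. A minimal sketch follows; the syntax_error_in helper and the two sample sources are hypothetical.

```python
import ast


def syntax_error_in(source: str, filename: str = "<generated>"):
    """Return None if the source parses, else a short 'line N: message' string."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


GOOD = "class Payment:\n    amount: int = 0\n"
BAD = "class Payment:\n    amount: int = (\n"  # unclosed parenthesis, as after a truncated write

print(syntax_error_in(GOOD))
print(syntax_error_in(BAD) is not None)
```

ast.parse only compiles the source to a syntax tree and never executes it, so the check is safe to run on files whose imports or side effects are not yet trusted.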

14.3 Full pytest Output — Final Run

============================= test session starts ==============================
platform darwin -- Python 3.11.4, pytest-9.0.3, pluggy-1.6.0
plugins: cov-7.1.0, asyncio-1.4.0, anyio-4.13.0
collected 91 items

tests/test_order_service.py::TestCreateOrder::test_create_order_no_user_raises PASSED [  1%]
tests/test_order_service.py::TestCreateOrder::test_create_order_empty_items_raises PASSED [  2%]
tests/test_order_service.py::TestCreateOrder::test_create_order_product_not_found_raises PASSED [  3%]
tests/test_order_service.py::TestCreateOrder::test_create_order_insufficient_stock_raises PASSED [  4%]
tests/test_order_service.py::TestCreateOrder::test_create_order_success PASSED [  5%]
tests/test_order_service.py::TestCancelOrder::test_cancel_pending_order PASSED [  6%]
tests/test_order_service.py::TestCancelOrder::test_cancel_confirmed_order PASSED [  7%]
tests/test_order_service.py::TestCancelOrder::test_cancel_shipped_order_raises PASSED [  8%]
tests/test_order_service.py::TestCancelOrder::test_cancel_not_found_raises PASSED [  9%]
tests/test_order_service.py::TestUpdateStatus::test_update_status_success PASSED [ 10%]
tests/test_order_service.py::TestUpdateStatus::test_update_status_not_found_raises PASSED [ 12%]
tests/test_order_service.py::TestOrderQueries::test_get_order_by_id_found PASSED [ 13%]
tests/test_order_service.py::TestOrderQueries::test_get_order_by_id_not_found PASSED [ 14%]
tests/test_order_service.py::TestOrderQueries::test_list_orders_by_user PASSED [ 15%]
tests/test_order_service.py::TestOrderQueries::test_list_all_orders PASSED [ 16%]
tests/test_order_service.py::TestOrderQueries::test_filter_orders_by_status PASSED [ 17%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_invalid_pct_raises PASSED [ 18%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_negative_raises PASSED [ 19%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_success PASSED [ 20%]
tests/test_order_service.py::TestDiscountAndTax::test_apply_discount_not_found_raises PASSED [ 21%]
tests/test_order_service.py::TestDiscountAndTax::test_calculate_tax_success PASSED [ 23%]
tests/test_order_service.py::TestDiscountAndTax::test_calculate_tax_not_found_raises PASSED [ 24%]
tests/test_order_service.py::TestOrderServiceIntegration::test_create_and_cancel_order PASSED [ 25%]
tests/test_order_service.py::TestOrderServiceIntegration::test_apply_discount_and_tax PASSED [ 26%]
tests/test_order_service.py::TestOrderServiceIntegration::test_filter_by_status PASSED [ 27%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_zero_amount_raises PASSED [ 28%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_negative_amount_raises PASSED [ 29%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_order_not_found_raises PASSED [ 30%]
tests/test_payment_service.py::TestCreatePaymentIntent::test_create_payment_success PASSED [ 31%]
tests/test_payment_service.py::TestConfirmPayment::test_confirm_not_found_raises PASSED [ 32%]
tests/test_payment_service.py::TestConfirmPayment::test_confirm_success PASSED [ 34%]
tests/test_payment_service.py::TestFailPayment::test_fail_not_found_raises PASSED [ 35%]
tests/test_payment_service.py::TestFailPayment::test_fail_success PASSED [ 36%]
tests/test_payment_service.py::TestRefundFull::test_refund_full_not_succeeded_raises PASSED [ 37%]
tests/test_payment_service.py::TestRefundFull::test_refund_full_success PASSED [ 38%]
tests/test_payment_service.py::TestRefundFull::test_refund_full_not_found_raises PASSED [ 39%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_not_succeeded_raises PASSED [ 40%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_exceeds_amount_raises PASSED [ 41%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_success PASSED [ 42%]
tests/test_payment_service.py::TestRefundPartial::test_refund_partial_not_found_raises PASSED [ 43%]
tests/test_payment_service.py::TestGetPaymentByOrder::test_get_payment_found PASSED [ 45%]
tests/test_payment_service.py::TestGetPaymentByOrder::test_get_payment_not_found PASSED [ 46%]
tests/test_payment_service.py::TestPaymentServiceIntegration::test_confirm_and_refund_full PASSED [ 47%]
tests/test_payment_service.py::TestPaymentServiceIntegration::test_partial_refund_integration PASSED [ 48%]
tests/test_product_service.py::TestCreateProduct::test_create_product_success PASSED [ 49%]
tests/test_product_service.py::TestCreateProduct::test_create_product_empty_name_raises PASSED [ 50%]
tests/test_product_service.py::TestCreateProduct::test_create_product_negative_price_raises PASSED [ 51%]
tests/test_product_service.py::TestCreateProduct::test_create_product_zero_price_allowed PASSED [ 52%]
tests/test_product_service.py::TestGetProduct::test_get_product_found PASSED [ 53%]
tests/test_product_service.py::TestGetProduct::test_get_product_not_found PASSED [ 54%]
tests/test_product_service.py::TestUpdateProduct::test_update_product_not_found_raises PASSED [ 56%]
tests/test_product_service.py::TestUpdateProduct::test_update_product_success PASSED [ 57%]
tests/test_product_service.py::TestDeleteProduct::test_delete_product_success PASSED [ 58%]
tests/test_product_service.py::TestDeleteProduct::test_delete_product_not_found_raises PASSED [ 59%]
tests/test_product_service.py::TestBulkUpdatePrices::test_bulk_update_returns_count PASSED [ 60%]
tests/test_product_service.py::TestBulkUpdatePrices::test_bulk_update_empty_dict PASSED [ 61%]
tests/test_product_service.py::TestSearchProducts::test_search_products_valid_query PASSED [ 62%]
tests/test_product_service.py::TestSearchProducts::test_search_products_none_query_raises XFAIL [ 63%]
tests/test_product_service.py::TestAdjustStock::test_adjust_stock_not_found_raises PASSED [ 64%]
tests/test_product_service.py::TestAdjustStock::test_adjust_stock_negative_raises PASSED [ 65%]
tests/test_product_service.py::TestAdjustStock::test_adjust_stock_success PASSED [ 67%]
tests/test_product_service.py::TestProductServiceIntegration::test_create_and_get PASSED [ 68%]
tests/test_product_service.py::TestProductServiceIntegration::test_list_active_products PASSED [ 69%]
tests/test_product_service.py::TestProductServiceIntegration::test_adjust_stock_integration PASSED [ 70%]
tests/test_user_service.py::TestCreateUser::test_create_user_success PASSED [ 71%]
tests/test_user_service.py::TestCreateUser::test_create_user_duplicate_email_raises PASSED [ 72%]
tests/test_user_service.py::TestCreateUser::test_create_user_empty_email_raises PASSED [ 73%]
tests/test_user_service.py::TestCreateUser::test_create_user_empty_password_raises PASSED [ 74%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[notanemail] PASSED [ 75%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[@nodomain] PASSED [ 76%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[missing@] PASSED [ 78%]
tests/test_user_service.py::TestCreateUser::test_create_user_invalid_email_format[a @b.com] PASSED [ 79%]
tests/test_user_service.py::TestGetUser::test_get_user_by_id_found PASSED [ 80%]
tests/test_user_service.py::TestGetUser::test_get_user_by_id_not_found PASSED [ 81%]
tests/test_user_service.py::TestGetUser::test_get_user_by_email_found PASSED [ 82%]
tests/test_user_service.py::TestUpdateUser::test_update_user_not_found_raises PASSED [ 83%]
tests/test_user_service.py::TestUpdateUser::test_update_user_name_success PASSED [ 84%]
tests/test_user_service.py::TestDeleteUser::test_delete_user_success PASSED [ 85%]
tests/test_user_service.py::TestDeleteUser::test_delete_user_not_found_raises PASSED [ 86%]
tests/test_user_service.py::TestAuthenticateUser::test_authenticate_user_not_found PASSED [ 87%]
tests/test_user_service.py::TestChangePassword::test_change_password_user_not_found PASSED [ 89%]
tests/test_user_service.py::TestListUsers::test_list_users_returns_list PASSED [ 90%]
tests/test_user_service.py::TestDeactivateUser::test_deactivate_user_not_found_raises PASSED [ 91%]
tests/test_user_service.py::TestDeactivateUser::test_deactivate_user_sets_inactive PASSED [ 92%]
tests/test_user_service.py::TestUserServiceIntegration::test_create_and_get_user PASSED [ 93%]
tests/test_user_service.py::TestUserServiceIntegration::test_duplicate_email_raises PASSED [ 94%]
tests/test_user_service.py::TestUserServiceIntegration::test_authenticate_correct_password PASSED [ 95%]
tests/test_user_service.py::TestUserServiceIntegration::test_authenticate_wrong_password PASSED [ 96%]
tests/test_user_service.py::TestUserServiceIntegration::test_deactivate_user PASSED [ 97%]
tests/test_user_service.py::TestUserServiceIntegration::test_list_users PASSED [ 98%]
tests/test_user_service.py::TestUserServiceIntegration::test_change_password_wrong_old PASSED [100%]

================================ tests coverage ================================
Name                              Stmts   Miss  Cover   Missing
---------------------------------------------------------------
app/services/__init__.py              0      0   100%
app/services/order_service.py        68      0   100%
app/services/payment_service.py      65      0   100%
app/services/product_service.py      56      0   100%
app/services/user_service.py         70      5    93%   42-43, 68-70
---------------------------------------------------------------
TOTAL                               259      5    98%
================== 90 passed, 1 xfailed, 3 warnings in 4.49s ===================
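The Cover column can be recomputed from the Stmts and Miss columns as (Stmts - Miss) / Stmts. The sketch below does this for the four service modules in the report and compares each against the 85% target from the prompt. Note that the report has no branch columns, so these are statement-coverage figures; the prompt asked for branch coverage, which pytest-cov only measures when run with its --cov-branch option.

```python
# Name, Stmts, Miss columns copied from the coverage report.
REPORT = """\
app/services/order_service.py        68      0
app/services/payment_service.py      65      0
app/services/product_service.py      56      0
app/services/user_service.py         70      5
"""

TARGET = 85.0  # per-module target from the original prompt


def parse_rows(report: str) -> dict:
    """Map module name to (statements, missed statements)."""
    rows = {}
    for line in report.splitlines():
        name, stmts, miss = line.split()
        rows[name] = (int(stmts), int(miss))
    return rows


def percent_covered(stmts: int, miss: int) -> float:
    return 100.0 if stmts == 0 else 100.0 * (stmts - miss) / stmts


rows = parse_rows(REPORT)
for name, (stmts, miss) in rows.items():
    pct = percent_covered(stmts, miss)
    print(f"{name}: {pct:.1f}% (target met: {pct >= TARGET})")

total_stmts = sum(s for s, _ in rows.values())
total_miss = sum(m for _, m in rows.values())
print(f"TOTAL: {percent_covered(total_stmts, total_miss):.1f}%")
```

For user_service.py this gives 65 / 70, which rounds to the reported 93%, and the total of 254 / 259 rounds to the reported 98%.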

14.4 Final Coverage — Actual vs Article Target

Module              Article Target  Actual Result  Tests Written  Status
user_service.py     ≥ 85%           93%            27             ✓
product_service.py  ≥ 85%           100%           19 (+1 xfail)  ✓
order_service.py    ≥ 85%           100%           25             ✓
payment_service.py  ≥ 85%           100%           19             ✓
TOTAL               ≥ 85%           98%            90 (+1 xfail)  ✓

14.5 Bugs Found (Actual)

#  File                Method             Issue                                      Severity
1  product_service.py  search_products()  No None guard: TypeError when query=None   Medium

Test marked as xfail as designed; source code was not modified.


14.6 Article vs Hands-On Comparison

Criterion                    Article (Claude Code)  Hands-On (GitHub Copilot)
Tests generated              167 passed, 2 xfailed  90 passed, 1 xfailed
Overall coverage             89.2%                  98% ✨
Self-healing loops           3 loops                3 loops ✓
Bugs found                   1 (None guard)         1 (None guard) ✓
xfail + bug report approach  ✓                      ✓
Source files modified        0                      0 ✓
Wall time                    ~6 min 14 sec          ~12 min (terminal I/O overhead)

Note: The test count is lower because the hands-on project was built more compactly (~18k lines → ~259 stmts), but coverage reached 98%, about 9 percentage points above the article’s reported 89.2% and 13 points above the 85% target. All patterns were reproduced correctly: unit mocks, SQLite integration, xfail bug reports, and Stripe monkeypatching.