
Building With Claude Code: A Production Agency's Workflow

By DOT · Founder @ DOTxLabs
Published May 6, 2026 · 8 min read

A production Claude Code workflow at an AI-first agency follows a daily rhythm: humans handle architecture and planning, Claude Code executes implementation tasks under supervision, and humans review before each commit. This structured pattern ships features 40-60% faster than traditional development while maintaining production-grade quality through TypeScript strict mode, automated testing, and explicit prompt patterns referenced from a CLAUDE.md project file.

Claude Code is an agentic coding tool developed by Anthropic that operates as a command-line interface with full repository context. Unlike code completion tools that suggest the next line, Claude Code understands the entire project — every file, every type definition, every test — and executes multi-step development tasks autonomously. This document describes how DOTxLabs uses Claude Code as the primary development tool for shipping production client applications.

The Workflow: How Production Development Actually Happens

A typical day of Claude Code-assisted development at DOTxLabs follows a specific rhythm that maximizes AI throughput while maintaining architectural integrity:

Morning: Architecture and Planning (Human-Led)

The developer reviews requirements, designs data models, defines API contracts, and makes security decisions. These are judgment-heavy tasks where AI assists with research but humans make final calls.

Output: a clear specification for implementation tasks, often captured in GitHub issues or a local task list.

Implementation: Claude Code Executes (AI-Led, Human-Supervised)

Each implementation task is delegated to Claude Code with a structured prompt:

Task: Build the tenant invitation system
Context: Existing auth in src/lib/auth/, RLS policies in supabase/migrations/
Requirements:
- Admin can invite users via email
- Invited users get a signup link with pre-filled tenant association
- RLS policy ensures invited users only see their assigned tenant
- Server action for the invite form in src/app/admin/team/
Output: migration file, server action, form component, and test
Verify: run type check and tests after implementation

Claude Code then creates the necessary files, writes the migration, implements the server action, builds the UI component, and runs verification. The developer reviews the diff, adjusts any business logic nuances, and commits.

Review: Quality Gates (Human-Led)

Every Claude Code output passes through:

  1. TypeScript strict mode — catches type errors at compile time
  2. Human diff review — ensures architectural coherence
  3. Automated test suite — validates behavior
  4. RLS verification — confirms tenant isolation holds

This is not "AI writes code and we ship it blind." It's "AI handles implementation volume while humans maintain architectural oversight." The distinction matters for production reliability.

What Claude Code Handles Well (Measured by Acceptance Rate)

Based on 6 months of production use across client projects, these task types have the highest first-pass acceptance rates:

| Task Type | Acceptance Rate | Notes |
|-----------|:---------------:|-------|
| CRUD API routes + validation | 92% | Zod schemas, server actions, error handling |
| React components from specs | 88% | When given explicit props interface and design reference |
| Supabase RLS policies | 85% | When tenant model is clearly defined |
| Test suites for existing code | 90% | Unit and integration tests |
| TypeScript type definitions | 95% | Infers correctly from usage patterns |
| Database migrations | 82% | Simple schema changes; complex joins need review |
| Documentation from code | 94% | README, API docs, inline comments |
| Refactoring existing code | 87% | Extract functions, improve naming, reduce duplication |

"Acceptance rate" means the generated code requires no modifications beyond minor style adjustments before passing review.

What Claude Code Struggles With (Still Human Territory)

| Task Type | Issue | Our Approach |
|-----------|-------|--------------|
| Novel architecture decisions | No pattern to reference | Human designs, Claude implements |
| Cross-cutting security concerns | Threat modeling requires judgment | Human defines policy, Claude implements checks |
| Performance optimization | Requires profiling data Claude doesn't see | Human identifies bottleneck, Claude refactors |
| Third-party API quirks | Undocumented behavior | Human debugs, Claude wraps solution |
| UX micro-interactions | Subjective quality judgment | Human designs, Claude implements |

The pattern is clear: Claude Code excels at implementation within defined parameters and struggles with tasks requiring judgment about undefined parameters. This maps perfectly to the traditional senior-developer/junior-developer split — except Claude Code works at 5x the speed of a junior developer and doesn't need mentoring.

The CLAUDE.md File: Project-Level AI Instructions

Every DOTxLabs project includes a CLAUDE.md file at the repository root. This file tells Claude Code how to behave within the project context:

# Project: [Client Portal Name]

## Stack
- Next.js 14 (App Router)
- Supabase (PostgreSQL + RLS)
- TypeScript strict
- Tailwind CSS
- Zod for validation

## Conventions
- Server actions in src/app/_actions/
- Shared types in src/types/
- Database queries through Supabase client only (no raw SQL in app code)
- All forms use Zod schemas for validation
- Error handling: return typed Result objects, never throw in server actions

## Testing
- Vitest for unit tests
- Test files co-located: component.test.ts next to component.tsx
- RLS tests in supabase/tests/

## Security
- All data access through Supabase client (RLS enforced)
- Never expose tenant_id in URLs
- Server-side session validation on every protected route
- File uploads: validate type and size server-side

Claude Code reads this file automatically and adapts all generated code to match these conventions. The result is consistent output that follows project standards without repeated prompting.

Prompt Patterns That Work in Production

Pattern 1: Specification-First

Implement the [feature] according to this spec:

Input: [describe inputs]
Output: [describe expected behavior]
Constraints: [list non-negotiable requirements]
Location: [target files/directories]
Reference: [existing patterns to follow — file paths]

After implementation, run `npm run typecheck && npm test`.

Pattern 2: Test-First

Write failing tests for [feature] first, then implement until they pass.

Test file: src/app/_actions/[feature].test.ts
Test cases:
1. [happy path]
2. [edge case]
3. [error case]
4. [authorization case — wrong tenant should fail]

Then implement in src/app/_actions/[feature].ts to make tests green.

Pattern 3: Refactor With Safety Net

Refactor [target file/function] to [improvement goal].

Constraints:
- All existing tests must continue passing
- No changes to the public API/exports
- Extract [specific pattern] into shared utility

Run tests before and after. Show me the diff.

Economics: Why This Workflow Changes Agency Pricing

The Claude Code workflow produces measurable cost savings that DOTxLabs passes to clients:

Traditional agency model:

  • Senior developer: 2 hours architecture + 2 hours review = 4 hours
  • Junior developer: 16 hours implementation
  • Total: 20 developer-hours per feature
  • At $150/hour blended: $3,000 per feature

AI-first agency model (DOTxLabs):

  • Senior developer: 2 hours architecture + 1 hour review = 3 hours
  • Claude Code: 30 minutes implementation (developer time: prompt + review)
  • Total: 3.5 developer-hours per feature
  • At $150/hour: $525 per feature

The roughly 80% reduction in developer-hours per feature means either lower prices for clients or higher margins for the agency. DOTxLabs splits the difference: clients pay less, and the agency maintains healthy margins on compressed timelines.
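The comparison reduces to straightforward arithmetic, using the hour figures quoted above:

```typescript
// Cost per feature at the $150/hour blended rate.
function featureCost(humanHours: number, rate = 150): number {
  return humanHours * rate;
}

const traditional = featureCost(4 + 16); // 20 developer-hours → $3,000
const aiFirst = featureCost(3 + 0.5);    // 3.5 developer-hours → $525
const reduction = 1 - aiFirst / traditional; // 0.825, i.e. ~80% fewer dollars per feature
```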

Guardrails: Preventing AI Mistakes in Production

AI-generated code can contain subtle bugs. The production workflow includes multiple safety layers:

TypeScript strict mode is non-negotiable. Every project runs "strict": true in tsconfig. This catches 70%+ of AI-generated errors at compile time — wrong return types, missing null checks, incorrect generics.
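A minimal sketch of that compile-time gate (tsconfig accepts JSONC comments; the two flags beyond `strict` are optional extras we assume here, and real projects layer Next.js and path options on top):

```json
{
  "compilerOptions": {
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "noImplicitOverride": true
  }
}
```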

RLS policies are tested independently. Automated tests verify that Tenant A's credentials cannot access Tenant B's data. These tests run on every commit and block deployment if they fail.
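The invariant those tests assert can be modeled in miniature. The real suite runs two authenticated Supabase clients against the database; this dependency-free sketch (hypothetical names) just shows the property being checked:

```typescript
interface Row {
  tenantId: string;
  body: string;
}

// Stands in for an RLS policy along the lines of:
// USING (tenant_id = current_session_tenant())
function visibleRows(rows: Row[], sessionTenantId: string): Row[] {
  return rows.filter((r) => r.tenantId === sessionTenantId);
}

const table: Row[] = [
  { tenantId: "tenant-a", body: "A's record" },
  { tenantId: "tenant-b", body: "B's record" },
];

// Invariant: a session scoped to Tenant A never sees Tenant B's rows.
const seenByA = visibleRows(table, "tenant-a");
```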

Human review focuses on logic, not syntax. The reviewer's job isn't to catch typos (TypeScript and ESLint do that). It's to verify that the business logic is correct, the security model holds, and the user experience makes sense.

Canary deployments for critical changes. Major features deploy to a preview environment first, serve real traffic from a subset of users, and promote to production only after metrics confirm no regressions.

When to Use Claude Code vs. When to Code Manually

| Situation | Use Claude Code | Code Manually |
|-----------|:---------------:|:-------------:|
| Building a CRUD feature with known patterns | Yes | |
| Designing a new system architecture | | Yes |
| Writing a migration for a schema change | Yes | |
| Debugging a production incident | | Yes |
| Adding tests to existing code | Yes | |
| Optimizing a critical hot path | | Yes |
| Implementing an auth flow from spec | Yes | |
| Deciding between architectural approaches | | Yes |
| Refactoring for readability | Yes | |
| Writing client-facing documentation | Yes | |

The rule: if the task has a clear specification and established patterns to follow, Claude Code handles it. If the task requires judgment about unknowns, a human handles it.

Summary

Claude Code transforms agency development economics by handling 60-70% of implementation tasks at 5x the speed of manual coding, while maintaining production quality through TypeScript strict mode, automated testing, RLS verification, and human architectural review. The workflow is not "let AI write everything" — it's "let AI handle volume while humans handle judgment." This produces better outcomes at lower cost because human attention is concentrated on the decisions that actually require it.

For agencies evaluating whether to adopt Claude Code: the entry cost is zero (it runs on your existing codebase), the learning curve is 1-2 weeks of prompt pattern development, and the productivity gain is measurable within the first client project. The only requirement is TypeScript and a well-structured codebase — which you should have anyway.

Frequently asked questions

  • What is Claude Code and how do agencies use it?

    Claude Code is an agentic coding tool from Anthropic that operates directly on a full codebase with complete project context. Unlike chat-based AI assistants, Claude Code understands your entire repository — file structure, type definitions, dependencies, test suites — and generates code that integrates correctly. Agencies use it as their primary development tool for implementation, testing, refactoring, and documentation.

  • How does Claude Code differ from GitHub Copilot or ChatGPT?

    Claude Code operates on the full project context rather than individual files. It can create new files, modify existing ones, run tests, and execute multi-step implementation tasks autonomously. Copilot suggests single-line or single-function completions. ChatGPT generates code snippets without repository awareness. Claude Code is an agent that executes development tasks end-to-end.

  • Is code generated by Claude Code production-ready?

    With proper workflow guardrails, yes. Claude Code generates TypeScript that passes strict type checking, follows project conventions, and integrates with existing patterns. The agency workflow includes human architectural review, automated testing, and type system validation. The code quality is comparable to a skilled mid-level developer output with senior oversight.

  • How much faster is development with Claude Code?

    For standard implementation tasks — CRUD operations, authentication flows, dashboard components, API integrations — Claude Code reduces development time by 40-60%. A feature that takes a developer 8 hours typically takes 2-3 hours with Claude Code (including prompt engineering, review, and iteration). Complex architectural decisions still require full human attention.

  • What tasks should NOT be delegated to Claude Code?

    Three categories: (1) Novel architectural decisions where there's no established pattern to follow. (2) Security-critical logic where human judgment about threat models is essential. (3) Business logic that requires deep domain understanding of the client's industry. Claude Code excels at implementation, not at deciding what to implement.

  • How do agencies ensure Claude Code follows their coding standards?

    Through CLAUDE.md project files that define conventions, TypeScript strict mode that enforces type safety, ESLint rules that catch style violations, and explicit prompt patterns that reference project standards. Claude Code reads these configuration files and adapts its output to match. The project context acts as a persistent coding standard that Claude Code respects.

  • Can Claude Code handle Next.js App Router and server components?

    Yes. Claude Code has strong understanding of Next.js 14 patterns including server components, client components, route handlers, middleware, and the App Router file conventions. It correctly distinguishes between server and client code boundaries, handles async server components, and generates appropriate 'use client' directives.

  • What is the prompt engineering workflow for Claude Code in production?

    Production prompt engineering follows a pattern: (1) Define the task with explicit acceptance criteria. (2) Reference relevant existing files as context. (3) Specify output constraints (file locations, naming conventions, test requirements). (4) Request verification steps (run tests, check types). This structured approach produces consistent, high-quality output across different task types.
