This post is not a tool list. It is the workflow I distilled after repeated iteration in real projects.
The process is straightforward: define constraints first, execute in parallel with multiple models, then converge with a unified review gate.

1. Design Stage

In the design stage, I align on information architecture and produce a first draft I can iterate on as quickly as possible.

1) Wireframe Sketches

  • wiretext: Quickly turns page structure into wireframes, useful for confirming information hierarchy first.
  • mockdown: Expresses page block layout in text, useful for early discussion and requirement clarification.

2) UI Shell and Variants

  • Pencil: Generates editable UI drafts through prompts.
  • variant ai: Starts from a style close to the target and refines it with prompts; I then export .js and continue structural refinement with Claude Code or Codex.

2. Development Stage

1) Baseline: Multi-Agent + Git Worktree

My current approach for simpler requirements:

  1. Split one requirement into 2-4 relatively independent subtasks.
  2. Bind each subtask to an isolated worktree and session.
  3. Merge small verifiable increments first, then converge cross-module changes.

This significantly reduces context pollution and task interference.
There is a boundary, though: when task coupling is high, over-splitting worktrees can increase integration cost.
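
The binding of subtasks to worktrees can be sketched in a few lines of Python. This is a minimal illustration, not part of any tool mentioned here: the `task/<name>` branch prefix, the `-wt-` directory suffix, and the `main` base branch are all assumptions chosen for the example. It relies only on the standard `git worktree add -b <branch> <path> <base>` form.

```python
import subprocess
from pathlib import Path


def worktree_commands(repo: Path, subtasks: list[str], base: str = "main") -> list[list[str]]:
    """Build one `git worktree add` command per subtask (nothing is executed here)."""
    commands = []
    for name in subtasks:
        # A sibling directory per subtask keeps each agent session's files isolated.
        # The "-wt-" suffix and "task/" prefix are naming assumptions for this sketch.
        path = repo.parent / f"{repo.name}-wt-{name}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", f"task/{name}", str(path), base]
        )
    return commands


def create_worktrees(repo: Path, subtasks: list[str], base: str = "main") -> None:
    """Run the commands in order; check=True stops at the first failure."""
    for cmd in worktree_commands(repo, subtasks, base):
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it easy to print and inspect the plan before touching the repository. Each resulting directory then gets its own agent session.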

2) Third-Party Tooling Landscape

I used to rely on:

  • superset: a parallel worktree orchestration tool.
  • happy: remote control for Claude Code.

Now that Claude officially covers multi-worktree workflows and some remote capabilities, I treat these tools as optional CLI enhancements.

3. CCG Workflow: Turning Multi-Model Collaboration into Process

In my stack, ccg-workflow acts as the orchestration layer: Claude Code coordinates, frontend tasks are routed to Gemini first, backend tasks to Codex first, then Claude performs unified review and lands patches.

1) Installation and Prerequisites

npx ccg-workflow
  • Requires Node.js 20+.
  • Optionally install Codex CLI (backend-focused) and Gemini CLI (frontend-focused).

2) Common Commands

  • Full workflow: /ccg:workflow
  • Planning + execution split: /ccg:plan + /ccg:execute
  • Task-oriented: /ccg:frontend, /ccg:backend, /ccg:debug, /ccg:optimize, /ccg:review

Plan first, then execute:

/ccg:plan Implement user authentication
/ccg:execute .claude/plan/user-auth.md

This pattern reduces context loss from session switching and minimizes goal drift during implementation.

3) Built-in CCG Capabilities

  • Spec-driven (/ccg:spec-*):
    • Suitable for complex requirements with many constraints and high drift risk.
    • Value: it compresses "free generation" into "constraint-driven execution".
  • Parallel implementation (/ccg:team-*):
    • Suitable for tasks that can be split into 3+ independent modules.
    • Value: reduces context cross-talk via /clear + file-state passing.

4) Configuration and Pitfalls

  • Tune timeout parameters in ~/.claude/settings.json (such as CODEX_TIMEOUT, BASH_DEFAULT_TIMEOUT_MS).
  • Choose MCP tools by complexity: ContextWeaver + Context7 / Playwright / DeepWiki / Exa.
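
If you prefer to script the timeout change rather than edit the file by hand, a sketch along these lines may help. It assumes the variables live as string values under a top-level "env" object in settings.json; verify that against your own file before running it. The values in the usage note are placeholders, not recommendations.

```python
import json
from pathlib import Path


def with_timeouts(settings: dict, timeouts: dict[str, str]) -> dict:
    """Return a copy of settings with the timeout variables merged into its "env" map."""
    merged = dict(settings)
    env = dict(merged.get("env", {}))  # copy so the caller's dict is not mutated
    env.update(timeouts)
    merged["env"] = env
    return merged


def update_settings_file(path: Path, timeouts: dict[str, str]) -> None:
    """Read, merge, and write the settings file back, preserving unrelated keys."""
    current = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = with_timeouts(current, timeouts)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

A call might look like `update_settings_file(Path.home() / ".claude" / "settings.json", {"BASH_DEFAULT_TIMEOUT_MS": "600000"})`. Check the unit each variable expects before choosing a value.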

4. Skills as Reusable Process

Common skills resources:

How I use them:

  • superpowers: process orchestration (automatic planning, staged execution, completion checks).
  • claude-mem: context compression and memory management (useful for long-running workflows).
  • react-best-practices: performance and code quality constraints during implementation.

5. MCP Capability Expansion

I currently use Lighthouse MCP primarily:

  • @danielsogl/lighthouse-mcp: full-featured, with support for Core Web Vitals, budgets, and more.

Two recurring pitfalls:

  1. When running Lighthouse, explicitly ask the LLM to test both desktop and mobile. Otherwise, many automated flows run mobile-only.
  2. Automated scores can differ from manual DevTools runs. Use trends and bottleneck localization as the main signal, not a single absolute score.
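
Both pitfalls can be turned into a small check. The sketch below assumes you have already collected several runs as dictionaries with hypothetical `form_factor` and `score` keys; it does not call Lighthouse or the MCP server itself. It compares per-form-factor medians against a baseline instead of trusting one run, and it fails loudly if a form factor was never audited.

```python
from statistics import median


def summarize_runs(runs: list[dict]) -> dict[str, float]:
    """Median performance score per form factor across repeated runs."""
    by_form_factor: dict[str, list[float]] = {}
    for run in runs:
        by_form_factor.setdefault(run["form_factor"], []).append(run["score"])
    return {ff: median(scores) for ff, scores in by_form_factor.items()}


def regressions(baseline: dict[str, float], current: dict[str, float],
                tolerance: float = 5.0) -> dict[str, float]:
    """Form factors whose median dropped by more than `tolerance` points."""
    drops = {}
    for form_factor, base_score in baseline.items():
        if form_factor not in current:
            # Catches the "mobile-only" pitfall: a form factor was never audited.
            raise ValueError(f"no {form_factor} runs in the current audit")
        delta = current[form_factor] - base_score
        if delta < -tolerance:
            drops[form_factor] = delta
    return drops
```

The 5-point tolerance is an arbitrary default for illustration; pick one that matches the run-to-run noise you actually observe.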

If your site is hosted on Cloudflare and crawler/audit access fails, check robots.txt and caching policy first, then purge robots.txt-related cache if needed.
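
To see whether robots.txt itself is the blocker, you can parse its text with Python's standard `urllib.robotparser` and ask which user agents it rejects. This is a sketch under assumptions: it takes the robots.txt body as a string, so you can feed it both the copy served at the edge and the copy at the origin and compare. The agent name in the usage note is an assumption about how the audit identifies itself.

```python
from urllib.robotparser import RobotFileParser


def blocked_agents(robots_txt: str, url: str, agents: list[str]) -> list[str]:
    """Return the user agents that this robots.txt text forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]
```

For example, `blocked_agents(text, "https://example.com/", ["Chrome-Lighthouse"])` returns an empty list when the audit agent is allowed. If the edge copy blocks an agent that the origin copy allows, a stale cached robots.txt is the likely cause.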

6. Testing and Review Pipeline

1) Explicitly Require Three Test Layers in the Prompt

After implementation, I use this reusable instruction:

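Based on the current changes, add and run the missing tests. Requirements:
1) Unit tests: cover core functions, boundary conditions, and exception paths.
2) Integration tests: cover cross-module collaboration, data flow, and interface contracts.
3) End-to-end (e2e) tests: cover key user paths (success flows + failure flows).

Execution requirements:
- Write the tests first, then fix the implementation until everything passes.
- For each test type, output the list of newly added cases, the commands run, and a summary of results.
- Flag risk points that are still uncovered and recommend additional tests.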

2) Cross-Model Code Review

After tests pass, I run review with different models and focus on:

  1. Potential bugs and boundary failures.
  2. Readability and maintainability.
  3. Performance, complexity, and scalability risks.
  4. Whether test coverage matches business risk.

The key gain of multi-model review is reducing single-model blind spots.
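
One way to make that gain visible is to merge the findings from each reviewer and mark the ones only a single model raised. The sketch below is illustrative: the `Finding` shape and category names are assumptions, and it expects review output that has already been parsed into structured records.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    model: str      # which reviewer produced it, e.g. "claude", "codex", "gemini"
    file: str
    line: int
    category: str   # e.g. "bug", "readability", "performance", "coverage"
    note: str


def merge_findings(findings: list[Finding]) -> list[dict]:
    """Group findings by location and category, recording which models raised each."""
    grouped: dict[tuple[str, int, str], list[Finding]] = defaultdict(list)
    for finding in findings:
        grouped[(finding.file, finding.line, finding.category)].append(finding)

    merged = []
    for (file, line, category), group in sorted(grouped.items()):
        models = sorted({f.model for f in group})
        merged.append({
            "file": file,
            "line": line,
            "category": category,
            "models": models,
            # Raised by one model only: a candidate blind spot for the others.
            "single_model": len(models) == 1,
            "notes": [f.note for f in group],
        })
    return merged
```

Findings flagged `single_model` are not necessarily wrong; they are the ones most worth a closer look in the human review pass.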

3) Human Review

After AI review, I run a manual review pass to verify:

  1. The requirement is actually met end to end, not just satisfied on paper.
  2. No obvious regression risk is introduced.
  3. Naming, boundary handling, and error handling align with team conventions.
  4. Tests cover the highest-risk paths.

The role of human review is not to repeat AI output; it is to provide business-context judgment and final risk containment.

4) GitHub App Auto Review After CI

After the PR is submitted and CI passes, I run another automated review round with GitHub Apps, usually including:

I am also vibe-building a testing-focused pr-agent to automatically summarize "test results + risk points + review suggestions" into PRs/Issues and improve review decisions.

7. Current Reusable End-to-End Workflow

  1. Use wireframe tools to define information hierarchy and page structure.
  2. Use UI generation tools to get an editable first draft quickly.
  3. Prioritize the MVP during development, then parallelize independent subtasks through worktrees.
  4. For complex requirements, lock constraints with /ccg:plan or /ccg:spec-* before implementation.
  5. Before merge, run unified review (/ccg:review + manual spot checks), then do stabilization refactoring (robustness, maintainability, edge handling).

The core benefit of this workflow: high early velocity, stable mid-stage quality, and lower long-term maintenance cost.