Hi there 👋
Welcome to Gerald Chen's tech blog: notes and deep dives on frontend development, JavaScript, AI agents, and the modern web.
Feel free to look around, or visit the About page to learn more.
Featured
-
Loop Engineering: From Writing Prompts to Designing Loops That Run Agents for You
The "Loop Engineering" term Addy Osmani popularized in June isn't a replacement for prompt engineering: it's about swapping you out as "the person who hits enter" and turning you into "the person who designs the loop". This post walks through the five components plus a state, and the three debts that are the part of Osmani's framing most worth remembering (verification debt / comprehension debt / cognitive surrender). Along the way, it explains why I disagree with him placing the loop above the harness.
-
Two Days with Claude Fable 5: The 5 Things Every API Integrator Actually Has to Change
Anthropic shipped Fable 5 on 6/9, swapping the Opus/Sonnet/Haiku naming for Fable/Mythos. But the things that actually force every Claude API integrator to touch code aren't the names — they're the new stop_reason refusal, the forced adaptive thinking, and the "you must wire up a fallback model" architecture. After two days driving it inside Claude Code, here's the integration-side detail you need.
-
Prompt, Context, Harness, Agentic: The Four Nested Layers of LLM Apps — and Knowing Which One You're Stuck In
Prompt engineer, context engineer, harness engineer, agentic engineer — these aren't four competing job titles. They're nested layers of concern, from a single instruction to an entire autonomous system. Understand how the four layers relate, and you'll know exactly which one you're optimizing every time you get stuck.
-
Electron for React Developers: 9 Things, Ranked by How Hard They Hit
This isn't a tutorial on installing electron-builder. It's a primer for web developers with a few years of React experience, organized by actual impact from biggest to smallest — from the fundamental process model, to the native-module pit everyone falls into, to tooling and debugging. Reading it should save you about a week of trial and error.
-
WebContentsView in Electron 30: How to Build Multi-View Apps After BrowserView's Deprecation
BrowserView is officially deprecated in Electron 30; the new API is the WebContentsView + BaseWindow combo. This post breaks down the new model: key differences from iframe/webview/BrowserView, migration code diffs, how to lay out multi-view apps, and a few easy-to-hit pitfalls.
-
Version Management for Electron Apps in Practice: Auto Updates, Version Checks, and a Better User Experience
A deep dive into the full version management lifecycle for Electron apps — from versioning conventions to auto-update mechanics, user notifications, and rollback strategies. Built on real-world practices with electron-updater and semantic versioning to deliver a user-friendly update experience.
-
How Electron Desktop Wallets Should Store Private Keys: safeStorage Isn't Enough — Learn from MetaMask and Phantom
A few days ago I wrote a teardown of safeStorage — the conclusion was that it's fine for ordinary API keys. But what if you need to store a wallet private key worth tens of thousands of dollars? safeStorage falls short. This post looks at how real desktop wallets like MetaMask and Phantom do it, and why the trio of "encryption + master password + short-lived unlock" is unavoidable.
-
Letting Claude Code Touch My Real-Money Trading Code: The Lines I Refused to Cross
I've spent 10 months using Claude Code on a real-money futures trading project. This is an honest retrospective: the AI never touched the money directly (I'm not that bold), but it did write code on the critical order/stop-loss/close-position paths. Here are the boundaries I held, where AI genuinely helped, and the moments I had to take over.
-
Flutter Desktop vs Electron: What Migration Patterns in 2026 Tell Us About Choosing a Desktop Framework
No boring comparison tables here. This post reverse-engineers the decision logic from what real products did in 2026 — why VS Code / Slack / Claude Desktop are still betting on Electron, why the Ubuntu 26.04 desktop went all-in on Flutter, and why Teams and Zed walked away.
-
Cracking Open the Electron safeStorage Black Box: AES-128-CBC, a Hardcoded IV, and the Things Nobody Tells You
safeStorage is Electron's recommended API for storing secrets, but its implementation details are rarely discussed. This post cracks open the source: roughly 100 lines of C++ wrapping Chromium's OSCrypt, AES-128-CBC, an IV hardcoded to 16 spaces, and PBKDF2 with a single iteration. Paired with real cases — VS Code credentials read directly by extensions, VoidStealer grabbing the master key with a hardware breakpoint — it ends with a threat-model-based storage decision table.
-
7 "Anti-AI-Tone" Principles I Distilled After Writing 80+ Blog Posts with Claude Code
Roughly 70% of the posts from blog080 through blog166 on this blog were written with Claude Code's help, yet readers almost never notice. Here are the 7 "anti-AI-tone" principles I distilled: the goal isn't to make AI sound less like AI, it's to make AI sound like you. Includes the automated check script from my blog-preflight Skill.
-
AI Model Comparison, Mid-2026 Edition: Two Months After blog080, the Model Layer Has Turned Over
blog080 was written in early March 2026. Two-plus months later, GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro have all shipped, and open-source flagships GLM-5.1/Qwen 3 Coder have closed the gap to within 5-15 points of closed models. This is the May update: what changed, and how to adjust your March picks.
-
AI Tooling Supply Chain Security Checklist: 8 Defense Principles Distilled from the Vercel and Nx Console Incidents
Neither the Vercel breach nor the Nx Console incident was a protocol vulnerability—both were credential governance failures. This post distills these two AI tooling supply chain attacks into 8 defense principles plus a 1-hour audit checklist, covering OAuth least privilege, secret tiering, managed device isolation, and IDE extension credential isolation—a security playbook indie developers and small teams can act on immediately.
-
Claude Code Multi-Agent Orchestration Plugins Compared 2026: Choosing Between Ruflo, Maestro, Claude Octopus, and Codex Peer Review
A head-to-head comparison of multi-agent orchestration plugins: Ruflo calls itself the "leading Claude orchestration platform" but underdelivers in execution, Maestro stays lightweight, Claude Octopus runs reviews across 8 models in parallel, and Codex Peer Review gates merges behind three sequential reviewers. From architecture to measured token costs — a decision framework for indie developers.
-
Claude Code Workflow Plugins Compared (2026): Superpowers, Shipyard, Ralph Loop, Maestro, or Karpathy CLAUDE.md?
The Claude Code ecosystem has splintered into 100+ plugins as of May. This post zooms in on the "workflow methodology" category—Superpowers, Shipyard, Ralph Loop, Maestro, and Karpathy CLAUDE.md. Design philosophy, context overhead, fit, and combination strategies, plus a decision tree for indie developers.
-
Astro 5 to 6, Fully Documented: Real Migration Data from a 48-Page Blog — the Official "2x Faster" Claim Doesn't Hold for Small Blogs
I upgraded my own blog (48 pages, Astro 5.16.6) to Astro 6.3.1 and recorded what actually changed, whether builds got faster, and what broke. Verdict: near-zero migration cost for a small blog, but the official "2x faster" claim doesn't hold at 48 pages — measured build times were essentially flat.
-
AI Agent Persistent Memory Architectures Compared: File-Based vs Vector Retrieval, Benchmarked with a blog-preflight Subagent
I hooked the same Subagent up to both Claude Code's built-in file-based memory and mem0's vector retrieval, then compared token cost, recall quality, and cross-session learning. The result: concrete thresholds for which approach fits which data scale, plus a look at procedural memory—the weakest but most promising direction.
-
Claude Code's Five-Layer Architecture Explained: How MCP, Skills, Agent, Subagents, and Agent Teams Work Together
Anthropic officially describes Claude Code as a five-layer architecture: MCP for connectivity, Skills for task knowledge, Agent as the main worker, Subagents for parallel isolation, and Agent Teams for coordination. This post breaks down each layer's role and collaboration patterns, with a real-world example from my blog's blog-preflight Skill showing three layers working together.
-
Frontman Deep Dive: What an AI Agent Can Do When It Sees Your Code from the Browser, Paired with Frontend Skills
Cursor and Claude Code both start from source code, but a frontend engineer's real work happens in the browser—the actual color on hover, the real DOM after SSR, the re-render triggered by the third useState. Frontman works in the opposite direction: from the browser back to the code. This post breaks down its architecture and combines it with Anthropic's frontend-design Skill and others into a complete frontend AI workflow.
-
Claude Code Skills in Practice: Building a Reusable Cross-Project Skill from Scratch
Pull your repetitive playbooks, checklists, and multi-step workflows out of CLAUDE.md and turn them into Skills. Using a "pre-publish blog check" as the running example, this post covers the SKILL.md structure, every frontmatter field, context fork isolation, the boundaries with Slash Commands and Sub Agents, plus debugging and sharing.
-
GPT-5.5 vs Claude Opus 4.6 vs Gemini 2.5 Pro: Coding Capability Comparison 2026
A 2026 head-to-head coding comparison of the leading large language models: benchmark numbers, pricing, and real-world coding performance for GPT-5.5, Claude Opus 4.6, and Gemini 2.5 Pro — to help you pick the right model for everyday development.
-
Choosing a React Chart Library: Recharts vs. ECharts vs. Nivo vs. Lightweight Charts
An in-depth comparison of the leading React chart libraries in 2026—Recharts, Apache ECharts, Nivo, and TradingView Lightweight Charts—across three core scenarios (candlestick charts, bar charts, treemaps), with recommendations based on performance, bundle size, and ease of use.
-
The 2026 AI Coding Tools Scorecard: An Honest Review of Claude Code, Cursor, Copilot, Windsurf, and Gemini CLI
An in-depth comparison of the leading AI coding tools in 2026: Claude Code, Cursor, GitHub Copilot, Windsurf, Trae, Cline, Gemini CLI, and Aider — covering real-world data, current pricing, and use cases to help developers pick the right tool.
-
AI Agent Success Rates Jumped from 12% to 66%: How Frontend Developers Should Prepare for the Era of 'Usable' Agents
Stanford's 2026 AI Index report shows AI agent success rates on real computer tasks jumped from 12% to 66% in a single year—just 6 percentage points shy of the human baseline. Here's what that means for frontend developers and how to adjust your workflow to take advantage of this inflection point.
-
AI Toolchain Supply Chain Security: A Full Post-Mortem of the Vercel Breach
A post-mortem of the April 2026 Vercel breach and its full attack chain: Roblox cheat script → Lumma Stealer → over-permissive OAuth → SSO lateral movement → leaked environment variables. We break down the security blind spot behind each link and the defenses developers should put in place.
-
One CLAUDE.md File, 44K Stars in a Week: Karpathy's Four Principles for AI Coding
A breakdown of how the forrestchang/andrej-karpathy-skills repo gained 44K stars in a single week: Karpathy's four principles for AI coding (think before coding, simplicity first, surgical changes, goal-driven execution), and how to use them directly in Claude Code.
-
Two Claude Code Environment Variables You've Probably Never Used: EFFORT_LEVEL and ADDITIONAL_DIRECTORIES_CLAUDE_MD
A deep dive into two underrated Claude Code environment variables: CLAUDE_CODE_EFFORT_LEVEL controls the reasoning effort tier, and CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD enables sharing rules across projects — with complete configuration examples and use cases.
-
Inside the Axios Poisoning: How a North Korean APT Infected Millions of Developer Environments in 3 Hours
In March 2026, the axios npm package was hijacked by a North Korean state-level APT, planting a RAT into millions of developer environments within 3 hours. This post breaks down two separate but related incidents: the full supply-chain poisoning attack chain, and the technical mechanics and real-world exploitability debate around CVE-2026-40175 (CVSS 10.0).
-
The AI Agent Security Landscape: From the ClawHavoc Poisoning to Cisco DefenseClaw and Microsoft's Governance Toolkit
A ClawHavoc-style supply chain attack poisons 1,184 agent skills and hits 300,000 users; within two weeks, Cisco and Microsoft ship agent security tooling. This post breaks down the threat model, compares the two defense architectures, and walks through real integration code.
-
Getting Started with Claude Managed Agents: Let Anthropic Run Your Agent Loop
Claude Managed Agents, which entered public beta in April 2026, moves the agent loop, tool execution, and sandboxed runtime entirely into Anthropic's cloud—three API calls are all it takes to get an autonomous agent running. This post walks through the core concepts, demonstrates the full workflow with real code, and compares it against building your own.
-
Hermes Agent in Practice: Embedding an AI Assistant into Your Development Workflow
Not another feature rundown of Hermes — this is what it's actually like after wiring it into a real development workflow: code review, requirement breakdown, doc generation, scheduled monitoring. Which scenarios genuinely help, and which ones will bite you.
-
A Deep Dive into Claude Code Hooks: Making the AI Coding Tool Truly Fit Your Workflow
Claude Code Hooks might be the most underrated AI coding feature out there. This post starts with how the three hook types fire, then walks through 10+ real configurations from my blog agent, tooling site, and daily work to show how Hooks can make Claude Code truly part of your workflow.
-
Hermes Agent Review: OpenClaw's Successor, a Multi-Platform AI Assistant with a Built-In Learning Loop
Hermes Agent is an open-source AI assistant framework from Nous Research, featuring a self-learning loop, cross-platform messaging integration, and cron scheduling, with one-command migration from OpenClaw configs. This post covers its core features, where it fits, and its limitations.
-
After OpenClaw Shut Down: Rebuilding a Multi-Agent Automation Setup with the Claude Code CLI
When the third-party AI agent framework OpenClaw shut down, I rebuilt the entire multi-agent automation experience with the Claude Code CLI: Telegram bots, scheduled tasks, session persistence — plus every pitfall I hit along the way.
-
Flash-MoE: Running a 397B-Parameter Model on a MacBook at 4.4 token/s
A developer built Flash-MoE in 24 hours: it runs the 397B-parameter Qwen3.5 model on a 48GB MacBook Pro at 4.4 token/s, using only about 6GB of RAM and no cloud GPUs. We break down how it works: SSD streaming, Metal shader optimization, and MoE sparse activation.
-
Apifox Supply Chain Attack Post-Mortem: Your SSH Keys May Already Be Compromised
In March 2026, the Apifox desktop client was hit by a supply chain attack: a JS file on the official CDN was replaced with a malicious version that stole users' SSH keys, Git credentials, and other sensitive data. A technical breakdown of the attack chain, blast radius, and how to check if you were affected.
-
GitHub Squad: Drop an AI Dev Team Straight into Your Repo
Squad, an open-source project, lets you spin up an AI dev team inside your repo with two commands—an architect, frontend dev, backend dev, and tester, each with their own job, collaborating on top of Copilot. A look at its architecture, what it's like to use, and the multi-agent collaboration patterns behind it.
-
Computer-Use: When AI Agents No Longer Need APIs
AI agents are learning to operate computers the way humans do—reading the screen, clicking the mouse, typing on the keyboard. From Anthropic's Claude Computer Use to Microsoft's CUA to OpenAI's Operator, Computer-Use is redefining what "software integration" means.
-
Google Stitch's Big Update: UI Design in Natural Language — Should Figma Be Worried?
Google Stitch just got a major update, evolving from a simple prompt-to-mockup tool into an AI-native design canvas. Infinite canvas, voice interaction, the DESIGN.md design system format, MCP integration — five big features shipped at once. Free, powered by Gemini, and aimed squarely at Figma Make.
-
Karpathy's AutoResearch: Letting an AI Agent Run 700 ML Experiments on Its Own
A deep dive into Karpathy's open-source AutoResearch project: how a 630-line Python script lets an AI agent run ML experiments autonomously on a single GPU, completing 700 experiments in two days and finding 20 effective optimizations. From architecture to practical applications, here's why every developer should pay attention to the "Karpathy Loop" pattern.
-
Reading the MCP 2026 Roadmap: From Local Tools to Production-Grade Agent Infrastructure
MCP (Model Context Protocol) has published its 2026 roadmap with four priority areas: transport evolution, agent communication, governance maturity, and enterprise readiness. A technical breakdown of the concrete problems and proposed solutions in each area.
-
MiroFish: A Swarm Intelligence Prediction Engine a College Senior Built in 10 Days with Vibe Coding
A deep dive into the architecture and design of MiroFish, a swarm intelligence prediction engine. Built by a college senior in 10 days with Vibe Coding, it topped GitHub Trending and landed a 30M RMB investment from Shanda. Here's what it got right, technically.
-
When AI Agents Learn to Pay: A Deep Dive into the x402 Protocol and Agent Payment Infrastructure
How Coinbase's x402 protocol revived the HTTP 402 status code to let AI agents pay for API calls with stablecoins. From protocol design to hands-on code, from the Stripe/Mastercard competitive landscape to the reality of $28K daily volume — a full breakdown of the agent payments space.
-
Browser-Native AI in Practice: The Complete Guide to Chrome Built-in AI APIs
A deep dive into Chrome's built-in Gemini Nano model and the browser-native AI APIs (Prompt, Translation, Summarization) — from technical architecture to production practice, showing how to build privacy-first local AI apps, with complete code examples and performance analysis.
-
OpenClaw Automation in Practice: Building a 24/7 AI Assistant with Cron + Heartbeat
A deep dive into OpenClaw's Cron Job and Heartbeat mechanisms—from choosing between them to engineering practice—with a real production case study covering error handling, state management, and cost optimization.
-
The 2026 AI Model Landscape: Hands-On Comparison of 12 Leading Models from China and Abroad
Hands-on benchmarks of 12 AI models (GPT-4o, Claude 3.5, Gemini 2.0, Qwen 2.5, GLM-4, Kimi, and more) across 6 real-world scenarios including code generation, Chinese writing, and reasoning. Includes performance scores, monthly cost comparisons, and a decision tree to help you pick the right model.
-
Building AI Agents with Long-Term Memory: From Design Patterns to Production
A deep dive into building episodic, semantic, and procedural long-term memory for AI Agents, with a complete technical architecture, code implementation, and production optimization strategies.
-
AI Agent-Driven Development: The Paradigm Shift from Tools to Workflows
How AI Agents are evolving from "assistive tools" into "collaborative partners" and reshaping the modern developer workflow. Based on real project experience, a deep dive into three core capabilities — context awareness, proactive execution, and tool orchestration — with complete code examples.
-
The AI Agent Skills Standardization War: Architecture, Security, and Ecosystem Evolution
A deep dive into the technical architecture, security models, and governance mechanisms of MCP, Agent Skills, and Skills.sh — unpacking the design-philosophy clash behind real-world security incidents, with a practical selection guide for enterprises and developers.
-
AI Agent Multi-Task Collaboration in Practice: From Monolith to Distributed Workflows
How to design and build a multi-task collaboration system for AI Agents, covering task decomposition, state management, and error recovery. A hands-on look at agent collaboration architecture through a real blog-publishing workflow.
-
Why Do AI Agents Ignore Your "Hard Rules"? A Deep Postmortem of Two Real Incidents
A deep postmortem of two real production incidents, analyzing why AI agents systematically ignore explicit rules and how to design constraint mechanisms that actually hold. Covers the technical root causes, common patterns, and actionable solutions.
-
AI Agent Memory Systems in Practice: OpenClaw Memory Best Practices
A deep dive into OpenClaw's memory architecture, from file layout to retrieval tuning, with actionable best practices for managing AI Agent memory.
-
From Command Line to Conversational Programming: Building a Personal Dev Assistant with AI Agents
A deep dive into building an AI dev assistant from scratch with memory, tool calling, and task planning: a hands-on look at AI-native development patterns.
-
The Complete Guide to AI Agent Skills: Give Your AI Assistant Superpowers
What are Skills? How do you install and use them in Claude Code, Codex, OpenClaw, and other AI tools? Explore skills.sh and awesome-openclaw-skills, tap into 3,000+ community Skills, and turn your AI Agent from a generalist assistant into a domain expert.
-
A Lightweight Electron Alternative: A Deep Dive into electrobun
12MB vs 150MB, 14KB incremental updates, full-stack TypeScript. How does electrobun redefine desktop app development with Bun + Zig? A complete breakdown of its architecture, performance, and hands-on usage.
-
AI Agent Frontend Workflow (Part 3): Cost Optimization and Team Collaboration Best Practices
How do you keep AI Agent token costs under control? How do you deal with hallucinations? How do you roll it out across a team? This post shares battle-tested optimization strategies and collaboration practices, backed by real cost data.
-
AI Agent Frontend Workflow (Part 4): What's Next, and the Open Source Tool Landscape
The series finale. From Copilot to autonomous agents, from closed to open source — this post maps where AI agents are heading, compares the major tools, and explores how the developer's role is changing. Complete learning roadmap included.
-
Hard-Won Lessons from Multi-Agent Collaboration: How One config.patch Nearly Took Down the Whole System
Real-world lessons from a week of running a multi-agent system: a config management incident, TypeScript import errors, a wrong publish date, cost optimization in practice, and best practices for team collaboration.
-
AI Agent Frontend Workflow (Part 2): Intelligent Code Review and Automated Testing
Use AI Agents for intelligent code review and automated test case generation. From Git Hook integration to E2E testing, this post shares a complete hands-on setup with real project metrics.
-
Safely Exposing a Local AI Assistant to the Internet: SSH Reverse Tunnels in Practice
Expose a locally running OpenClaw Gateway to the public internet with an SSH reverse tunnel plus an Nginx reverse proxy, accessible via your own domain. All data stays local, with multiple layers of security — at zero cost.
-
AI Agent Frontend Workflow (Part 1): Understanding Agents and Automated Component Generation in Practice
An AI Agent is not just a chatbot — it's an intelligent assistant that can invoke tools and manage context on its own. This post breaks down how Agents actually work, then walks through a hands-on React component generator to show how AI can reshape the frontend development workflow.
-
OpenClaw Multi-Agent Setup in Practice: Pitfalls and Best Practices
Setting up multiple OpenClaw agents and multiple Telegram accounts comes with plenty of pitfalls. Based on hands-on experience, this post covers every common problem and its fix so you can skip the pain.
-
AI Agent Dev Tools Compared 2026: Claude Code vs OpenClaw vs Cursor — Which One Should You Pick
A hands-on comparison of Claude Code, OpenClaw, and Cursor — the three big AI coding tools. From runtime model, memory systems, and model support to skill mechanisms, this guide covers how to choose an AI agent dev tool in 2026, plus a deep dive into their config systems and a cross-tool migration guide.