
Claude Code + Computer Use: When the Terminal Isn't Enough

You just built a new onboarding flow. The tests pass. The types check. But does it actually look right when a real user taps through it on a phone screen?
That last-mile verification has always been manual. You open the simulator, tap through the screens, take screenshots, and compare them against the design. Claude could write the code, but it couldn't verify the visual result.
Now it can. Computer use lets Claude control your actual desktop from the Claude Code Desktop app — open native apps, click through UI, take screenshots, and verify changes on screen.

What It Actually Does

Claude takes screenshots of your screen, identifies UI elements, and controls the mouse and keyboard to interact with them. It's not simulating a browser — it's controlling your actual desktop, the same way a remote support tool would.
This means it can work with anything that has a GUI: native Mac/Windows apps, iOS/Android simulators, web apps in any browser, even hardware control panels.

Getting Started

Step 1: Open the Claude Code Desktop app (not the terminal CLI).
Step 2: Go to Settings and enable Computer Use. The OS will ask for accessibility permissions — Claude needs these to control mouse and keyboard.
Step 3: Ask Claude to do something visual:

"Open the iOS simulator, tap through the onboarding flow, and screenshot each step"
Claude takes a screenshot, identifies the first button, clicks it, takes another screenshot, and continues. Before each action, it shows you what it's about to do and asks for approval.
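To make that loop concrete, here is a toy model in Python of the screenshot, propose, approve, act cycle described above. It is a sketch for intuition only, not how Claude is implemented: `FakeScreen`, `Action`, `propose`, and `run` are all hypothetical names, and the "screenshot" is just the name of the current screen rather than pixels.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """One proposed UI action (hypothetical shape)."""
    kind: str    # only "click" exists in this toy model
    target: str  # label of the UI element to act on


@dataclass
class FakeScreen:
    """Stand-in for a real display: an ordered list of onboarding screens."""
    screens: list[str]
    index: int = 0

    def screenshot(self) -> str:
        # A real tool would return pixels; here it is the screen's name.
        return self.screens[self.index]

    def apply(self, action: Action) -> None:
        # Clicking "Next" advances to the following screen.
        if action.kind == "click" and action.target == "Next":
            self.index += 1


def propose(shot: str, last_screen: str) -> Optional[Action]:
    """Hypothetical planner: click "Next" until the last screen is visible."""
    return None if shot == last_screen else Action("click", "Next")


def run(screen: FakeScreen, approve: Callable[[Action], bool]) -> list[str]:
    """Screenshot, propose an action, ask for approval, act, repeat."""
    seen = []
    while True:
        shot = screen.screenshot()
        seen.append(shot)
        action = propose(shot, screen.screens[-1])
        # Stop when there is nothing left to do or the user says no.
        if action is None or not approve(action):
            break
        screen.apply(action)
    return seen


flow = FakeScreen(["Welcome", "Permissions", "Done"])
visited = run(flow, approve=lambda action: True)
```

With every action approved, `visited` ends up holding one entry per screen. Swapping in `approve=lambda action: False` stops the loop after the first screenshot, which is the point of the approval step: nothing touches the screen without a yes.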


Where This Changes Things

End-to-end visual verification. You tell Claude "build a settings page with dark mode toggle" and then "open the app and verify the toggle works." Same conversation, code to verification.
Testing in simulators. Claude can drive the iOS simulator or Android emulator. Tap through flows, fill forms, trigger edge cases. Not a replacement for automated tests, but perfect for the exploratory testing you'd do manually.
Proprietary tools without APIs. Some tools only have a GUI — design apps, admin panels, legacy systems. Claude can operate them the way you would.
Screenshots as bug reports. Ask Claude to reproduce a bug visually and screenshot each step. Instant reproduction steps with evidence.
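If you want the "screenshots as bug reports" idea in a form you can paste into an issue tracker, a small formatter helps. This is a minimal sketch of one way to do it: the `Step` fields, the file names, and the report layout are my own assumptions, not something Claude produces in this format.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One reproduction step (hypothetical structure)."""
    action: str      # what was done, e.g. "Tap 'Continue'"
    observed: str    # what the screen showed afterwards
    screenshot: str  # path to the screenshot taken after the action


def render_report(title: str, steps: list[Step]) -> str:
    """Turn recorded steps into plain-text reproduction instructions."""
    lines = [f"Bug: {title}", "", "Steps to reproduce:"]
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step.action}")
        lines.append(f"   Observed: {step.observed}")
        lines.append(f"   Screenshot: {step.screenshot}")
    return "\n".join(lines)


# Example data is made up for illustration.
report = render_report(
    "Dark mode toggle does not persist",
    [
        Step("Open Settings", "Settings page is shown", "step-1.png"),
        Step("Turn on dark mode", "Theme switches to dark", "step-2.png"),
        Step("Restart the app", "Theme is light again", "step-3.png"),
    ],
)
print(report)
```

Keeping the observed state next to each screenshot path makes the report useful even when the images are not attached.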

A Few Things I Noticed

💡 Sequential tasks work best. "Open this app, click here, type this, click there" is reliable. "Find the best route through this complex UI" is less so. Give specific steps when you can.
💡 Screenshots are the bottleneck. Each screenshot takes a moment to process. If your task involves 30 clicks, it'll feel slower than doing it yourself. Use it for verification, not for speed.
💡 It reads text from screenshots accurately. Error messages, console output, status indicators — Claude reads them from the screenshot and can act on what it sees.
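To see why a 30-click task feels slow, a back-of-the-envelope estimate is enough. The timings below are placeholder assumptions I picked for illustration, not measurements; plug in what you actually observe.

```python
def total_seconds(actions: int,
                  seconds_per_screenshot: float,
                  seconds_per_approval: float = 0.0) -> float:
    """Rough wall-clock estimate: every action pays for one screenshot
    plus the time you spend approving it."""
    return actions * (seconds_per_screenshot + seconds_per_approval)


# Assumed values, for illustration only.
ASSUMED_SCREENSHOT_SECONDS = 3.0  # processing time per screenshot
ASSUMED_APPROVAL_SECONDS = 1.0    # time to read and approve each action
ASSUMED_MANUAL_SECONDS = 1.0      # time for you to click once yourself

with_claude = total_seconds(30, ASSUMED_SCREENSHOT_SECONDS, ASSUMED_APPROVAL_SECONDS)
by_hand = total_seconds(30, ASSUMED_MANUAL_SECONDS)
print(f"With Claude: {with_claude:.0f}s, by hand: {by_hand:.0f}s")
```

Under these assumptions the 30-click task takes about two minutes with Claude against half a minute by hand. The per-step cost is fixed, so the gap grows with the number of clicks, which is why short verification passes are the better fit.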


Honest Limitations

Desktop app only. Computer use requires screen access, which the terminal CLI doesn't have. You need the Claude Code Desktop app.
Can't handle rapid animations. If your UI has fast transitions or animations, Claude might screenshot at the wrong moment and misinterpret the state.
No multi-monitor awareness yet. Claude works on your primary display. If the app you want to control is on a secondary monitor, move it first.
Research preview. Computer use is still a research preview, and Claude occasionally misclicks or misidentifies UI elements, especially in dense interfaces. Always watch the first few actions to calibrate your trust.
⚠️ Security consideration. You're giving Claude control of your mouse and keyboard. It asks before each action, but think about what's visible on screen — passwords, sensitive data, personal messages. Minimize unrelated windows first.
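The animation limitation above shows up in any screenshot-driven automation, not just here. A common mitigation in your own scripts is to keep capturing until two consecutive frames match before reading the screen. The sketch below shows that technique; I don't know whether Claude does anything similar internally, and `capture` is a hypothetical callable you would supply yourself.

```python
import time
from typing import Callable


def wait_until_stable(capture: Callable[[], bytes],
                      interval: float = 0.2,
                      max_frames: int = 20) -> bytes:
    """Return the first frame identical to the one captured just before it.

    `capture` is any zero-argument function returning the current frame
    as bytes (hypothetical; supply your own screenshot function).
    """
    previous = capture()
    for _ in range(max_frames):
        time.sleep(interval)
        current = capture()
        if current == previous:
            return current  # nothing changed between captures: settled
        previous = current
    raise TimeoutError("screen did not settle within the frame budget")


# Simulated transition: three changing frames, then the UI comes to rest.
frames = iter([b"fade-1", b"fade-2", b"final", b"final"])
settled = wait_until_stable(lambda: next(frames), interval=0.0)
```

Exact byte equality is the simplest possible check. A blinking cursor or a clock in the status bar would defeat it, so a real script would more likely compare a cropped region or tolerate small differences.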


Setting Up

  1. Install the Claude Code Desktop app (Mac or Windows)
  2. Open Settings → enable Computer Use
  3. Grant accessibility permissions when the OS asks
  4. Start a conversation and ask Claude to interact with your screen
Full details are in the Computer use guide.