GPT-5.4 Computer Use Safety Checklist for Repeatable Agent Runs

A first successful GPT-5.4 computer use demo can be misleading. The agent opens a browser, clicks the right buttons, reads the page, and finishes a task that would have taken a human several minutes. That success is useful, but it is not the same thing as a safe repeatable workflow. The real question comes after the demo: what prevents the next run from clicking the wrong domain, obeying hostile page text, submitting a form too early, or leaking private information into a tool call?

This follow-up is not about building your first desktop automation. It is about adding a safety layer before you let GPT-5.4 computer use handle repeated work. OpenAI documents GPT-5.4 as a model for agentic and professional workflows, and the API model page lists computer use among the supported Responses API tools. That makes safety design more important, not less. A stronger model still needs boundaries, approvals, logging, and stop conditions.

Separate the task from the control layer

Start by separating the useful task from the control layer around it. The task might be "download the weekly report from this dashboard" or "copy shipping status into this internal form." The control layer answers different questions: which websites are allowed, which actions are blocked, which fields are sensitive, when a human must approve, and what evidence is saved after the run.

This separation keeps the prompt from becoming the only safety mechanism. Prompts matter, but a prompt is not a permission system. If the workflow should never submit payments, delete records, change account settings, or message customers without approval, that rule should exist in the runner, the tool policy, or the review step. The model should receive the rule, but the environment should enforce it where possible.

A practical first pass is to classify every action into one of three levels. Read actions inspect a page or copy visible text. Draft actions prepare data but do not submit it. Commit actions create external effects: sending, buying, deleting, publishing, deploying, changing permissions, or updating customer data. Most computer-use workflows should allow read actions freely, allow draft actions with logging, and require human confirmation for commit actions.
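A minimal gate for that classification might look like the sketch below. The action names mirror the policy example later in this piece; `gate_action` and the `approve` callback are illustrative names, not part of any OpenAI API.

import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("computer-use-runner")

# Illustrative mapping from action name to risk level.
ACTION_LEVELS = {
    "navigate": "read",
    "read_visible_text": "read",
    "fill_draft_field": "draft",
    "download_report": "draft",
    "submit_form": "commit",
    "send_message": "commit",
    "delete_record": "commit",
    "change_permissions": "commit",
}

def gate_action(action_name: str, approve) -> bool:
    """Allow reads, log drafts, and route commits to a human approver."""
    level = ACTION_LEVELS.get(action_name)
    if level is None:
        # An unclassified action is a stop condition, not a default-allow.
        raise RuntimeError(f"Unclassified action {action_name!r}; stopping run.")
    if level == "draft":
        log.info("draft action: %s", action_name)
    if level == "commit":
        return approve(action_name)  # human decision, recorded by the approver
    return True

The important property is the failure mode: an action the policy has never seen stops the run instead of slipping through.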

Write a small policy before the next run

Do not wait for a mistake to decide what the agent may do. Write the policy before the workflow becomes routine. Keep it short enough that a human reviewer can understand it, but specific enough that it can guide implementation.

computer_use_policy:
  workflow: "weekly-report-download"
  allowed_domains:
    - "reports.example.com"
    - "login.example.com"
  blocked_domains:
    - "personal-email.example"
    - "banking.example"
  allowed_actions:
    - "navigate"
    - "read_visible_text"
    - "download_report"
    - "fill_draft_field"
  approval_required_for:
    - "submit_form"
    - "send_message"
    - "delete_record"
    - "change_permissions"
    - "enter_payment_or_tax_data"
    - "acknowledge_pending_safety_check"
  stop_conditions:
    - "unexpected_domain"
    - "login_or_mfa_prompt"
    - "page_text_instructs_agent_to_ignore_user"
    - "sensitive_personal_data_visible"
    - "download_filename_does_not_match_expected_pattern"
  log_fields:
    - "start_time"
    - "allowed_domain_checks"
    - "computer_call_actions"
    - "approval_decisions"
    - "final_status"

The domain list is especially important. OpenAI's computer-use documentation recommends allowlists or blocklists for websites, actions, and users. A domain allowlist reduces the damage from a wrong navigation step or a malicious link inside a page. It also gives the runner a clean reason to stop instead of improvising.
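Here is a sketch of that check in Python, assuming the policy above is saved as `computer_use_policy.yaml` (the file name is an assumption) and that PyYAML is installed:

import yaml
from urllib.parse import urlparse

with open("computer_use_policy.yaml") as f:
    policy = yaml.safe_load(f)["computer_use_policy"]

def domain_allowed(url: str) -> bool:
    """True only if the URL's host matches the allowlist and not the blocklist."""
    host = (urlparse(url).hostname or "").lower()
    if any(host == d or host.endswith("." + d) for d in policy["blocked_domains"]):
        return False
    return any(host == d or host.endswith("." + d) for d in policy["allowed_domains"])

Matching on the host with an explicit `"." + d` suffix keeps a lookalike such as `evil-reports.example.com` from passing as `reports.example.com`, and anything that fails the check becomes a clean `unexpected_domain` stop.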

Treat pending safety checks as a handoff

OpenAI's computer-use guide describes safety checks that can be raised for issues such as malicious instructions, irrelevant domains, and sensitive domains. When a pending safety check appears, the next request can include acknowledged safety checks so the workflow can proceed. That mechanism should not become an invisible auto-continue button.

The safer pattern is to treat a pending safety check as a handoff. Pause the loop, show the human what the agent is trying to do, show the current page or domain, explain which safety check fired, and require a deliberate approval decision. If the task is low-risk and the warning is expected, the human can continue. If the page is unrelated, hostile, or sensitive, the correct action is to stop the run and preserve the trace.

  1. Pause the computer-use loop when `pending_safety_checks` is not empty.
  2. Display the current URL, screenshot, planned action, and safety-check code to the reviewer.
  3. Require explicit approval before passing `acknowledged_safety_checks` into the next request (see the sketch after this list).
  4. Log the approval decision and the reason for continuing or stopping.
  5. Resume only if the action still matches the original task and approved domain list.
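A sketch of that handoff inside the loop, using the `pending_safety_checks` and `acknowledged_safety_checks` fields described in OpenAI's computer-use guide. The model string, the tool parameters, and the `ask_reviewer` and screenshot helpers are assumptions to verify against the current docs:

from openai import OpenAI

client = OpenAI()

def resume_after_review(prev_response, call, screenshot_b64, current_url, ask_reviewer):
    """Echo safety checks back only after an explicit human approval."""
    if call.pending_safety_checks:
        approved = ask_reviewer(current_url, screenshot_b64,
                                call.action, call.pending_safety_checks)
        if not approved:
            raise RuntimeError("Reviewer stopped the run; preserve the trace.")
    return client.responses.create(
        model="gpt-5.4",  # assumption: confirm the exact model string on the model page
        previous_response_id=prev_response.id,
        tools=[{"type": "computer_use_preview", "display_width": 1280,
                "display_height": 800, "environment": "browser"}],
        input=[{
            "type": "computer_call_output",
            "call_id": call.call_id,
            "acknowledged_safety_checks": [
                {"id": c.id, "code": c.code, "message": c.message}
                for c in call.pending_safety_checks
            ],
            "output": {"type": "computer_screenshot",
                       "image_url": f"data:image/png;base64,{screenshot_b64}"},
            "current_url": current_url,  # helps safety checks and your own logs
        }],
        truncation="auto",
    )

If the reviewer declines, the run ends with the trace intact, which is exactly the outcome the policy's stop conditions describe.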

Where possible, pass the current URL back with the computer call output, as the sketch above does. OpenAI notes that this can improve the accuracy of safety checks. It also helps your own logs, because the action is tied to a visible location instead of a vague screenshot.

Protect private data before it enters the loop

Computer use can see what is on the screen. That means screen hygiene matters. Close unrelated tabs, hide password managers, remove private spreadsheets, sign out of accounts that are not part of the workflow, and use a dedicated browser profile for the agent. If the task can run against a staging account or demo tenant, use that first.
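One way to get that isolation is a persistent Playwright context pointed at a profile directory that exists only for the agent. The profile path and starting URL below are illustrative:

from pathlib import Path
from playwright.sync_api import sync_playwright

# A profile used only by the agent: no personal logins, no saved cards, no extensions.
AGENT_PROFILE = Path.home() / ".agent-profiles" / "weekly-report-download"

with sync_playwright() as p:
    context = p.chromium.launch_persistent_context(
        user_data_dir=str(AGENT_PROFILE),
        headless=False,
        viewport={"width": 1280, "height": 800},
    )
    page = context.pages[0] if context.pages else context.new_page()
    page.goto("https://reports.example.com")  # the one dashboard the task needs

Resetting the profile between unrelated workflows keeps one task's session from leaking into the next.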

OpenAI's agent safety guidance highlights prompt injection and private data leakage as risks in agent workflows. For computer use, both risks are practical. A page can display text that tries to manipulate the model, and a visible private document can be accidentally summarized, copied, or submitted. Treat web pages, emails, tickets, PDFs, and chat messages as untrusted data unless your workflow has verified them.

Do not give the agent a logged-in browser with every personal account active. A narrow session is safer than a powerful one. If the workflow only needs one dashboard, the browser profile should contain that dashboard and little else.

Review the run like a system, not a chat

After each test run, review the trace as an operational record. Look for unnecessary navigation, repeated failed clicks, attempts to leave the allowed domain, form submissions without review, and any moment where the agent treated page text as an instruction. The goal is not to blame the model for every rough edge. The goal is to harden the workflow until the safe path is the easy path.

A repeatable run should produce a simple summary: task, start time, end time, domains visited, files downloaded, fields drafted, approvals requested, approvals granted, and final status. Avoid logging secrets or full customer records. Metadata is usually enough to debug behavior without creating a new privacy problem.
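That summary is easy to capture as a small structured record. The field set below mirrors the list above, and writing it as JSON keeps runs diffable:

import json
from dataclasses import asdict, dataclass, field

@dataclass
class RunSummary:
    task: str
    start_time: str
    end_time: str = ""
    domains_visited: list[str] = field(default_factory=list)
    files_downloaded: list[str] = field(default_factory=list)
    fields_drafted: int = 0
    approvals_requested: int = 0
    approvals_granted: int = 0
    final_status: str = "incomplete"

def write_summary(summary: RunSummary, path: str) -> None:
    # Metadata only: no secrets, no customer records, no page contents.
    with open(path, "w") as f:
        json.dump(asdict(summary), f, indent=2)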

Use the same rule for every future automation: if the workflow can affect money, permissions, customer communication, production data, or public content, keep a human approval gate. GPT-5.4 can make computer-use workflows more capable, but capability does not remove the need for containment.
