Important
- This is a preview feature.
- Preview features aren't meant for production use and might have restricted functionality. These features are subject to supplemental terms of use, and are available before an official release so that customers can get early access and provide feedback.
Windows 365 for Agents is an MCP server for full operational control of a Windows 365 cloud PC. Use this MCP server to drive a real Windows environment through desktop interaction (mouse, keyboard, screen capture, command execution), browser automation via Microsoft Edge, and semantic UI inspection via Windows UI Automation.
Note
- Existing connections that use previous versions of Microsoft MCP servers remain supported.
- For all new connections, use the latest Windows 365 Agents MCP server, which exposes tools across desktop, browser, and accessibility capabilities.
- Browser automation operates on Microsoft Edge. Edge launches automatically on the first browser tool call. mcp_desktop_focus_browser can also target Chrome or Firefox, but DOM-level browser tools only operate on the Edge instance.
To learn more about Windows 365 for Agents, see Windows 365 for Agents documentation.
Overview
| Server ID | Display name | Description |
|---|---|---|
| mcp_W365AServer | Windows 365 for Agents MCP server | Full operational control of a Windows 365 cloud PC, including desktop interaction, browser automation, and UI inspection. |
Available tools
mcp_desktop_move_mouse
Move the cursor to a screen position. Use mcp_desktop_click instead if you intend to click at the destination.
Required parameters:
- x: X coordinate in screen pixels
- y: Y coordinate in screen pixels
mcp_desktop_click
Click at a position, or at the current cursor location if coordinates are omitted. Supports single-click, double-click, and all five mouse buttons.
Optional parameters:
- x: X coordinate in screen pixels (omit for current position)
- y: Y coordinate in screen pixels (omit for current position)
- button: Left, Right, Middle, Forward, or Backward (default Left)
- clickCount: 1 = single click, 2 = double click (default 1)
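As a rough illustration of how these parameters could be sent, the sketch below builds a request for the standard MCP `tools/call` JSON-RPC method. The helper name `build_tool_call` and the coordinates are hypothetical, and transport and authentication are left out because this page does not describe them.

```python
import json

def build_tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    # Drop optional parameters left as None so the server can apply its defaults.
    cleaned = {key: value for key, value in arguments.items() if value is not None}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": cleaned},
    }

# Double-click with the left button at (640, 360); the coordinates are illustrative.
request = build_tool_call(
    "mcp_desktop_click",
    {"x": 640, "y": 360, "button": "Left", "clickCount": 2},
)
print(json.dumps(request, indent=2))
```

Passing `{"x": None, "y": None}` produces an empty argument object, which corresponds to clicking at the current cursor location.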
mcp_desktop_get_cursor_position
Return the current cursor coordinates. No parameters. Returns {cursorX, cursorY}.
mcp_desktop_drag_mouse
Drag from one position to another. Useful for moving objects, resizing windows, or pixel-precise scrolling.
Required parameters:
- startX: Start X coordinate
- startY: Start Y coordinate
- endX: End X coordinate
- endY: End Y coordinate
Optional parameters:
- button: Left, Right, or Middle (default Left)
mcp_desktop_scroll
Scroll at a position using notch units (not pixels). Three notches is approximately one page.
Required parameters:
- x: Scroll position X
- y: Scroll position Y
Optional parameters:
- deltaX: Horizontal notches, positive = right (default 0)
- deltaY: Vertical notches, positive = down (default 0)
Note
Values are clamped to the range [-20, 20].
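Because each call is clamped to [-20, 20] notches, a longer scroll has to be split across several calls. The following is a minimal client-side sketch of that split, using the documented approximation of three notches per page. The helper names are hypothetical.

```python
NOTCH_LIMIT = 20       # documented clamp range is [-20, 20]
NOTCHES_PER_PAGE = 3   # documented approximation: three notches is about one page

def clamp_notches(notches):
    """Clamp a notch count to the range the server accepts."""
    return max(-NOTCH_LIMIT, min(NOTCH_LIMIT, int(notches)))

def scroll_calls_for_pages(x, y, pages):
    """Split a vertical scroll of `pages` pages into in-range argument sets."""
    remaining = round(pages * NOTCHES_PER_PAGE)
    calls = []
    while remaining != 0:
        step = clamp_notches(remaining)
        calls.append({"x": x, "y": y, "deltaY": step})
        remaining -= step
    return calls

# Ten pages down is about 30 notches, which needs two calls.
for arguments in scroll_calls_for_pages(500, 400, 10):
    print(arguments)
```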
mcp_desktop_type_text
Type text via keyboard simulation. For keyboard shortcuts, use mcp_desktop_press_keys. For web form fields, use mcp_browser_type.
Required parameters:
- text: Text to type
mcp_desktop_press_keys
Press a key combination simultaneously. Supports modifier keys, function keys, and standard keys.
Required parameters:
- keys: Array of key names to press together (for example, ["ctrl","c"], ["alt","tab"], ["ctrl","shift","s"])
mcp_desktop_take_screenshot
Capture the full screen or a cropped region as a PNG image (base64-encoded).
Optional parameters:
- x: Crop region left edge
- y: Crop region top edge
- width: Crop region width
- height: Crop region height
Note
Provide all four crop parameters together, or omit all four for a full-screen capture.
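The all-or-nothing crop rule can be checked before a request is sent. The sketch below validates the crop arguments and sanity-checks a decoded image. Both helper names are hypothetical, and how the base64 string is extracted from the tool response is not specified here, so `decode_png` simply takes it as input.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file

def screenshot_arguments(x=None, y=None, width=None, height=None):
    """Return {} for a full-screen capture, or all four crop values together."""
    crop = {"x": x, "y": y, "width": width, "height": height}
    provided = [key for key, value in crop.items() if value is not None]
    if not provided:
        return {}
    if len(provided) != 4:
        missing = sorted(set(crop) - set(provided))
        raise ValueError(f"crop needs all four parameters; missing: {missing}")
    return crop

def decode_png(base64_png):
    """Decode a base64 string and confirm it starts with the PNG signature."""
    data = base64.b64decode(base64_png)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return data

print(screenshot_arguments())                  # full screen
print(screenshot_arguments(0, 0, 800, 600))    # cropped region
```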
mcp_desktop_analyze_screen
Perform OCR on the entire screen. No parameters. Returns {fullText, averageConfidence, boxes[{text, confidence, x, y, width, height}], width, height}.
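One way to use the returned boxes is to look up a piece of text and derive a point to click. The sketch below assumes that each box's x and y mark its top-left corner and that a higher confidence is better. Neither is stated on this page, and the sample result is invented for illustration.

```python
def find_text_center(ocr_result, needle, min_confidence=0.0):
    """Return the center of the most confident box containing `needle`, or None."""
    needle = needle.lower()
    matches = [
        box for box in ocr_result["boxes"]
        if needle in box["text"].lower() and box["confidence"] >= min_confidence
    ]
    if not matches:
        return None
    best = max(matches, key=lambda box: box["confidence"])
    # Assumption: (x, y) is the top-left corner of the bounding box.
    return (best["x"] + best["width"] // 2, best["y"] + best["height"] // 2)

# Hypothetical OCR result using the documented field names.
sample = {
    "fullText": "Save Cancel",
    "averageConfidence": 0.85,
    "boxes": [
        {"text": "Save", "confidence": 0.9, "x": 100, "y": 200, "width": 40, "height": 20},
        {"text": "Cancel", "confidence": 0.8, "x": 160, "y": 200, "width": 60, "height": 20},
    ],
    "width": 1920,
    "height": 1080,
}
print(find_text_center(sample, "save"))
```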
mcp_desktop_get_screen_size
Return the screen resolution. No parameters. Returns {width, height}.
mcp_desktop_list_windows
List all visible windows with their titles, positions, and dimensions. No parameters. Returns an array of {title, processName, handle, x, y, width, height}.
mcp_desktop_activate_window
Bring a window to the foreground using a fuzzy title match.
Required parameters:
- titlePattern: Partial window title (case-insensitive substring)
mcp_desktop_focus_browser
Focus a browser window (Edge, Chrome, or Firefox), optionally filtered by URL or title.
Optional parameters:
- pattern: URL or title substring to match (omit for any browser window)
mcp_desktop_close_window
Gracefully close a window by fuzzy title match. System-critical processes are protected and cannot be closed.
Required parameters:
- titlePattern: Partial window title (80% match threshold)
Returns {matchedTitle, processName, closed}.
mcp_desktop_execute_shell_command
Run a shell command in a sandboxed environment. Commands are validated against an allow list and dangerous patterns are blocked.
Required parameters:
- command: Command to execute
Optional parameters:
- cwd: Working directory
- timeoutMs: Timeout in milliseconds (default 30000, max 30000)
Note
- Allowed commands: git, npm, dotnet, python, cargo, node, pip, dir, mkdir, del, copy, move, robocopy, findstr, where, and type.
- Blocked patterns include shell metacharacters (|, ;, &, <, >), environment variable expansion (%VAR%), interpreter eval flags (python -c or node -e), git config --global, npm -g, path-prefixed executables, rm -rf, sudo, and disk/system commands.
- stdout and stderr are each truncated at 32 KB.
- Use mcp_desktop_execute_python_code for arbitrary computation.
Returns {stdout, stderr, exitCode, success, timedOut, resourceLimitsApplied}.
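A client can catch many rejections early by checking a command against the rules listed above before sending it. The sketch below is only an approximation of those documented rules, not the server's actual validator, which remains authoritative. The function name `preflight` is hypothetical.

```python
import re

ALLOWED_COMMANDS = {
    "git", "npm", "dotnet", "python", "cargo", "node", "pip", "dir",
    "mkdir", "del", "copy", "move", "robocopy", "findstr", "where", "type",
}
BLOCKED_CHARACTERS = set("|;&<>")

def preflight(command):
    """Return the reasons a command would likely be rejected (empty if none found)."""
    tokens = command.split()
    if not tokens:
        return ["empty command"]
    problems = []
    if any(character in BLOCKED_CHARACTERS for character in command):
        problems.append("shell metacharacter")
    if re.search(r"%[^%\s]+%", command):
        problems.append("environment variable expansion")
    executable = tokens[0].lower()
    if "\\" in executable or "/" in executable:
        problems.append("path-prefixed executable")
    elif executable not in ALLOWED_COMMANDS:
        problems.append("command not on the allow list")
    if (executable == "python" and "-c" in tokens[1:]) or (
        executable == "node" and "-e" in tokens[1:]
    ):
        problems.append("interpreter eval flag")
    return problems

print(preflight("git log --oneline -20"))
print(preflight("git log | findstr fix"))
```

An empty list only means that none of these simple checks fired. The server applies further rules (for example, for disk and system commands) that this sketch does not model.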
mcp_desktop_execute_python_code
Execute Python code in a sandboxed environment with resource limits. Ideal for data processing, calculations, file I/O, and any computation that goes beyond simple shell commands.
Required parameters:
- code: Python code (max 262,144 characters).
Optional parameters:
- cwd: Working directory
- timeoutMs: Timeout in milliseconds (default 30000, max 30000).
Returns the same schema as mcp_desktop_execute_shell_command.
mcp_desktop_wait_milliseconds
Pause execution to allow animations or transitions to complete. Do not use in polling loops—use mcp_browser_wait_for for DOM polling.
Required parameters:
- ms: Wait duration in milliseconds (clamped to [0, 5000])
mcp_browser_navigate
Navigate to a URL and wait for the page to load.
Required parameters:
- url: Full URL including protocol (for example, https://example.com)
mcp_browser_back
Navigate back in browser history. No parameters.
mcp_browser_forward
Navigate forward in browser history. No parameters.
mcp_browser_reload
Reload the current page. No parameters.
mcp_browser_get_url
Return the current page URL as a plain string. No parameters.
mcp_browser_get_title
Return the current page title as a plain string. No parameters.
mcp_browser_get_text
Return the visible page text content as a plain string. No parameters. Truncated at 512 KB.
mcp_browser_get_html
Return the full page HTML source as a plain string. No parameters. Truncated at 512 KB.
mcp_browser_click
Click a DOM element by CSS selector. More reliable than coordinate-based clicking for web content.
Required parameters:
- selector: CSS selector (for example, #submit-btn or a.nav-link)
mcp_browser_type
Type text into a form element by CSS selector.
Required parameters:
- selector: CSS selector of the input element.
- text: Text to type
mcp_browser_query_text
Get the text content of the first element matching a CSS selector.
Required parameters:
- selector: CSS selector
mcp_browser_wait_for
Wait for a DOM element to appear. Useful for dynamic content that loads asynchronously.
Required parameters:
- selector: CSS selector to wait for
Optional parameters:
- timeoutMs: Timeout in milliseconds (default 5000, max 30000)
mcp_browser_eval_js
Evaluate a JavaScript expression in the page context and return the result as a string.
Required parameters:
- expression: JavaScript expression that returns a string
Note
If your expression returns an object or number, convert it to a string explicitly (for example, JSON.stringify(obj) or .toString()).
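On the calling side, one convenient pattern is to wrap every expression in JSON.stringify and decode the returned string as JSON. The two helpers below are hypothetical and only illustrate that round trip.

```python
import json

def as_json_expression(expression):
    """Wrap a JavaScript expression so that it evaluates to a JSON string."""
    return f"JSON.stringify({expression})"

def parse_eval_result(result):
    """Decode the string returned for an expression built by as_json_expression."""
    return json.loads(result)

# The wrapped expression is what would be passed as the `expression` parameter.
print(as_json_expression("document.links.length"))

# Decoding example strings of the kind such an expression would return.
print(parse_eval_result("42"))
print(parse_eval_result('{"title": "Home", "links": 3}'))
```

Note that JSON.stringify(undefined) evaluates to undefined rather than a string, so expressions that may be undefined need a fallback value.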
mcp_browser_list_tabs
List all open tabs with their index, title, and URL. No parameters. Returns an array of {index, title, url}.
mcp_browser_switch_tab
Switch to a tab by index.
Required parameters:
- tabIndex: 0-based tab index
mcp_browser_new_tab
Open a new tab, optionally navigating to a URL.
Optional parameters:
- url: URL to open (blank tab if omitted)
Returns {index, title, url}.
mcp_browser_close_tab
Close a tab by index.
Required parameters:
- tabIndex: 0-based tab index
mcp_browser_screenshot
Capture a PNG screenshot of the browser viewport only (not the full screen). No parameters. Returns a base64-encoded PNG.
mcp_accessibility_get_accessibility_tree
Retrieve the UI element tree for the foreground window. Each element includes its role, name, value, and screen coordinates.
Optional parameters:
- maxDepth: Maximum tree traversal depth, 1-10 (default 3)
- maxElements: Maximum elements to return, 1-2000 (default 500)
Returns a hierarchical tree of {role, name, value, x, y, width, height, children[...]}.
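Since the result is nested, a small recursive walk is often enough to pull out the elements of interest. The sketch below filters by role using the documented field names. The sample tree is invented for illustration.

```python
def iter_elements(node):
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children") or []:
        yield from iter_elements(child)

def elements_with_role(tree, role):
    """Collect every element in the tree whose role matches exactly."""
    return [element for element in iter_elements(tree) if element.get("role") == role]

# Hypothetical tree; only the fields needed for the example are filled in.
sample_tree = {
    "role": "Window", "name": "Settings", "children": [
        {"role": "Button", "name": "OK", "children": []},
        {"role": "Pane", "name": "", "children": [
            {"role": "Button", "name": "Cancel", "children": []},
            {"role": "TextBox", "name": "Search"},
        ]},
    ],
}
print([element["name"] for element in elements_with_role(sample_tree, "Button")])
```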
mcp_accessibility_find_ui_element
Search for UI elements by text content, accessibility role, or name (case-insensitive substring). Returns matching elements with their clickable screen coordinates.
Optional parameters:
- text: Text to search for (used as name if name omitted)
- role: UI role filter (Button, TextBox, CheckBox, MenuItem, ComboBox, and more)
- name: Accessible name (takes precedence over text if both provided)
- windowHandle: Target window handle (null = foreground window)
Note
At least one of text, role, or name must be provided. Returns an array of {role, name, value, x, y, width, height}.
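The "at least one of" rule and the name-over-text precedence can be mirrored in a small argument builder. This is a sketch with hypothetical helper names. It also assumes that omitting windowHandle is equivalent to passing null, which this page does not confirm.

```python
def find_element_arguments(text=None, role=None, name=None, window_handle=None):
    """Build arguments for mcp_accessibility_find_ui_element, enforcing the documented rule."""
    if text is None and role is None and name is None:
        raise ValueError("provide at least one of text, role, or name")
    arguments = {"text": text, "role": role, "name": name, "windowHandle": window_handle}
    # Assumption: leaving windowHandle out targets the foreground window, like null.
    return {key: value for key, value in arguments.items() if value is not None}

def effective_name(text=None, name=None):
    """Name wins when both are given; otherwise text is used as the name."""
    return name if name is not None else text

print(find_element_arguments(role="MenuItem", name="Settings"))
print(effective_name(text="Save", name="Save As"))
```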
Key features
Desktop interaction
- Click, double-click, right-click, and five-button mouse control.
- Pixel-precise drag and drop.
- Notch-based scrolling (three notches ≈ one page).
- Keyboard typing and multi-key shortcut combos.
- Cursor position tracking.
- Screen resolution detection.
Screen capture and analysis
- Full-screen or cropped PNG screenshots.
- OCR of the full screen with per-region confidence scores and bounding boxes.
- Browser-viewport-only screenshots for web content.
Window management
- Enumerate all visible windows with positions and dimensions.
- Activate windows by fuzzy title match.
- Focus browser windows (Edge, Chrome, Firefox) optionally filtered by URL or title.
- Graceful window close with protection for system-critical processes.
Command execution
- Sandboxed shell commands with an allow list (git, npm, dotnet, python, cargo, node, pip, dir, mkdir, del, copy, move, robocopy, findstr, where, type).
- Sandboxed Python execution up to 262,144 characters of code.
- Working-directory and per-call timeout control (max 30 seconds).
- Resource limits and hardened block list against shell metacharacters, eval flags, privilege escalation, and destructive operations.
Browser automation
- Navigate, back, forward, reload.
- Read page URL, title, visible text (512 KB cap), and full HTML (512 KB cap).
- DOM-level click, type, and text query by CSS selector.
- Wait for dynamic elements with configurable timeout.
- Evaluate JavaScript expressions in the page context.
- Multi-tab management: list, switch, open, close.
- Runs on Microsoft Edge, launched automatically on first use.
UI accessibility
- Retrieve the Windows UI Automation tree for the foreground window with configurable depth and element count.
- Find UI elements by text, role, or accessible name.
- Returns clickable screen coordinates for precise targeting of buttons, text boxes, checkboxes, menu items, and combo boxes.
Timing and synchronization
- Short one-shot pauses via mcp_desktop_wait_milliseconds (max five seconds).
- DOM-level polling via mcp_browser_wait_for (max 30 seconds).
Notes
- All coordinates are in screen pixels with (0,0) at the top-left corner. Coordinates from mcp_desktop_take_screenshot, mcp_desktop_analyze_screen, mcp_accessibility_find_ui_element, and mcp_desktop_list_windows all share the same coordinate space.
- A cursor failsafe is active: if the cursor moves within five pixels of any screen corner, mouse operations are cancelled. Avoid targeting the extreme edges of the screen.
- Shell pipe operators (|), semicolons (;), ampersands (&), and redirection operators (>, <) are blocked. To transform command output, capture it and process it with mcp_desktop_execute_python_code.
- Interpreter eval flags are blocked, so python -c "..." and node -e "..." are rejected. Use mcp_desktop_execute_python_code for Python code, or write code to a file first.
- Command stdout and stderr are truncated at 32 KB each. Use flags to limit verbose output (for example, git log --oneline -20) or write the output to a file and read it separately.
- Maximum timeout for mcp_desktop_execute_shell_command and mcp_desktop_execute_python_code is 30 seconds. For longer work, break it into smaller steps or launch a background process from Python and poll.
- There is no dedicated file read/write tool. Read files with mcp_desktop_execute_shell_command using the type command; write files with mcp_desktop_execute_python_code using Python's built-in file I/O. Shell output redirection (>, >>) is blocked.
- mcp_browser_eval_js always returns a string. Convert objects or numbers explicitly before returning.
- Browser DOM tools (mcp_browser_click, mcp_browser_type, mcp_browser_eval_js, etc.) operate only on the Microsoft Edge instance. mcp_desktop_focus_browser can focus Chrome or Firefox windows, but DOM tools will not target them.
- mcp_desktop_take_screenshot requires all four crop parameters (x, y, width, height) together, or none for a full-screen capture.
- mcp_desktop_scroll uses notch units (clamped to [-20, 20]), not pixels. Three notches is approximately one page.
- mcp_accessibility_find_ui_element requires at least one of text, role, or name. When both text and name are provided, name takes precedence.
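The cursor failsafe in the notes above can be anticipated on the client by checking a target point before moving or clicking. The sketch below reads "within five pixels of any screen corner" as a square zone around each corner, which is a conservative assumption rather than the server's documented geometry. Both helper names are hypothetical.

```python
CORNER_MARGIN = 5  # documented failsafe distance, in pixels

def near_corner(x, y, screen_width, screen_height, margin=CORNER_MARGIN):
    """Return True if (x, y) lies in the assumed square zone around a screen corner."""
    near_x = x <= margin or x >= screen_width - 1 - margin
    near_y = y <= margin or y >= screen_height - 1 - margin
    return near_x and near_y

def safe_point(x, y, screen_width, screen_height, margin=CORNER_MARGIN):
    """Nudge a point inward only when it falls inside a corner zone."""
    if not near_corner(x, y, screen_width, screen_height, margin):
        return (x, y)
    safe_x = min(max(x, margin + 1), screen_width - 2 - margin)
    safe_y = min(max(y, margin + 1), screen_height - 2 - margin)
    return (safe_x, safe_y)

# Screen size as it might be reported by mcp_desktop_get_screen_size.
print(near_corner(0, 0, 1920, 1080))
print(safe_point(1919, 0, 1920, 1080))
```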
Common use cases
Fill out a web form
- Call mcp_browser_navigate to open the target page.
- Call mcp_browser_wait_for to wait for the form to load.
- Call mcp_browser_type to fill each field by CSS selector.
- Call mcp_browser_click to submit the form.
- Call mcp_browser_wait_for to wait for the confirmation element.
- Call mcp_browser_get_text to read and verify the result.
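The form-filling steps above can be sketched as an ordered plan of tool names and arguments. The function name, URL, and selectors below are hypothetical, and actually dispatching each step to the server is left to whatever MCP client is in use.

```python
def form_fill_plan(url, form_selector, fields, submit_selector, confirmation_selector):
    """Return the ordered (tool name, arguments) pairs for filling out one form."""
    steps = [
        ("mcp_browser_navigate", {"url": url}),
        ("mcp_browser_wait_for", {"selector": form_selector}),
    ]
    # One mcp_browser_type call per field, in the order the fields were given.
    steps += [
        ("mcp_browser_type", {"selector": selector, "text": text})
        for selector, text in fields.items()
    ]
    steps += [
        ("mcp_browser_click", {"selector": submit_selector}),
        ("mcp_browser_wait_for", {"selector": confirmation_selector, "timeoutMs": 10000}),
        ("mcp_browser_get_text", {}),
    ]
    return steps

plan = form_fill_plan(
    url="https://example.com/signup",
    form_selector="form#signup",
    fields={"#name": "Ada Lovelace", "#email": "ada@example.com"},
    submit_selector="#submit-btn",
    confirmation_selector=".confirmation",
)
for tool, arguments in plan:
    print(tool, arguments)
```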
Automate a desktop application
- Call mcp_desktop_activate_window to bring the application to the foreground.
- Call mcp_desktop_take_screenshot to capture the current state.
- Call mcp_accessibility_find_ui_element to locate a button or field by name.
- Call mcp_desktop_click on the element's reported coordinates.
- Call mcp_desktop_type_text to enter data.
- Call mcp_desktop_press_keys for shortcuts (for example, ["ctrl","s"] to save).
- Call mcp_desktop_take_screenshot to verify the result.
Extract data from a web page
- Call mcp_browser_navigate to open the page.
- Call mcp_browser_get_text to extract visible text content.
- Call mcp_desktop_execute_python_code to parse and process the extracted data.
- Call mcp_browser_eval_js to query specific values via JavaScript when text extraction isn't enough.
Run development tasks
- Call mcp_desktop_execute_shell_command for git pull, npm install, and dotnet build.
- Call mcp_desktop_take_screenshot to capture build output.
- Call mcp_desktop_execute_python_code to analyze logs or test results.
- Call mcp_browser_navigate to open a local dev server in the browser.
- Call mcp_browser_screenshot to capture the rendered page.
Read and write files
- Read a file with mcp_desktop_execute_shell_command using type C:\path\to\file.txt.
- Write a file with mcp_desktop_execute_python_code using Python's open(...) and write(...).
- Verify with mcp_desktop_execute_shell_command using dir C:\path\to\output.txt.
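For the write step, the Python source sent as the code parameter can be generated on the client. The sketch below builds such a snippet, checks it against the documented 262,144-character limit, and runs it locally against a temporary directory as a dry run. The helper name `write_file_code` is hypothetical.

```python
import tempfile
from pathlib import Path

MAX_CODE_CHARS = 262_144  # documented limit for the `code` parameter

def write_file_code(path, content):
    """Generate Python source that writes `content` to `path` as UTF-8."""
    code = (
        "from pathlib import Path\n"
        f"Path({str(path)!r}).write_text({content!r}, encoding='utf-8')\n"
    )
    if len(code) > MAX_CODE_CHARS:
        raise ValueError("generated code exceeds the 262,144-character limit")
    return code

# Local dry run: execute the generated snippet against a temporary directory.
target = Path(tempfile.mkdtemp()) / "output.txt"
exec(write_file_code(target, "hello\n"))
print(target.read_text(encoding="utf-8"))
```

Content too large to embed in one snippet would need to be written in several appending steps, which this sketch does not cover.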
Navigate complex UI with accessibility
- Call mcp_accessibility_get_accessibility_tree to understand the full UI structure.
- Call mcp_accessibility_find_ui_element to find a specific control (for example, role: "MenuItem", name: "Settings").
- Call mcp_desktop_click using the element's reported coordinates.
- Call mcp_accessibility_find_ui_element again to find the next control in the dialog.
- Call mcp_desktop_type_text or mcp_desktop_click to interact with it.
Keep a long-running session alive
- Send any MCP request at least once every 30 minutes to prevent idle eviction. mcp_desktop_get_screen_size is lightweight and works well as a heartbeat.
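A heartbeat of this kind can be sketched as a small tracker that fires only when no other request has been sent recently. The class name, the `send` callback, and the five-minute safety margin are assumptions made for illustration. Only the 30-minute idle limit comes from the text above.

```python
import time

IDLE_LIMIT_SECONDS = 30 * 60     # documented idle eviction window
SAFETY_MARGIN_SECONDS = 5 * 60   # assumption: send the heartbeat five minutes early

class Heartbeat:
    """Track the last request time and send a lightweight call when idle too long."""

    def __init__(self, send, clock=time.monotonic):
        self._send = send      # callable(tool_name, arguments) supplied by the caller
        self._clock = clock
        self._last = clock()

    def note_request(self):
        """Record that some MCP request was just sent."""
        self._last = self._clock()

    def tick(self):
        """Send a heartbeat if the idle time is close to the limit; return True if sent."""
        idle = self._clock() - self._last
        if idle < IDLE_LIMIT_SECONDS - SAFETY_MARGIN_SECONDS:
            return False
        self._send("mcp_desktop_get_screen_size", {})
        self.note_request()
        return True

# Demonstration with a fake clock so that no real waiting is needed.
now = [0.0]
sent = []
heartbeat = Heartbeat(lambda name, arguments: sent.append(name), clock=lambda: now[0])
now[0] = 26 * 60
print(heartbeat.tick(), sent)
```

Calling `note_request` after every regular tool call keeps the heartbeat from firing during active use.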