New Microsoft Foundry Agent via Responses API rejects local image input as Base64 data URL and byte array

Question

New Microsoft Foundry Agent via Responses API rejects local image input as Base64 data URL and byte array

Tim-Niklas Wennemann 0

Hello Microsoft Support,

we are seeing an issue with the new Microsoft Foundry Agents via the Responses API when sending a local image as part of the user message.

What works

text-only input

image by public URL

What fails

local PNG passed as Base64 data URL

local PNG passed as raw byte array through SDK methods

Example failing image part:

{
  "type": "input_image",
  "image_url": "data:image/png;base64,...",
  "detail": "auto"
}

Returned error:

{
  "code": "invalid_payload",
  "message": "The provided data does not match the expected schema",
  "param": "/",
  "type": "invalid_request_error",
  "details": []
}

We reproduced this in:

C#

Python

raw REST

So this does not appear to be limited to one SDK.

Also important: the same pattern is used in the sample repo for the Foundry Agent Web App, and this scenario worked for us about one week ago: https://github.com/microsoft-foundry/foundry-agent-webapp

Could you confirm whether local image input is currently supported for Foundry Agents through the Responses API, or whether this is a regression?

Best regards

0 comments

2 answers

Your answer

Answer 1

Hi @Tim-Niklas Wennemann,

Thank you for contacting Microsoft Support and for providing detailed reproduction steps across multiple SDKs. This information was helpful in validating the behavior.

At present, Microsoft Foundry Agents via the Responses API support image inputs only through publicly accessible URLs.

Supported scenarios:

Text inputs

Images referenced via HTTP(S) URLs

Not currently supported:

Base64-encoded images (e.g., data:image/png;base64,...)

Raw byte arrays passed through SDKs

When Base64 strings or byte arrays are used, the request fails schema validation and returns the following error:

invalid_payload: The provided data does not match the expected schema

This behavior aligns with the current API contract, where the input_image.image_url field expects a URL-based reference rather than inline image data.

You noted that this scenario appeared to work previously, including in the sample repository. Based on our review:

The sample application likely uses images that are already hosted and accessible via URL (either directly or through a backend upload step), or

There may have been a recent service-side change that enforces stricter schema validation

We have shared this observation internally to confirm whether this represents an intentional change or a regression.

To proceed with image inputs, we recommend the following approaches:

Upload the image to a publicly accessible location, such as Azure Blob Storage (optionally secured with a SAS token), and provide the HTTPS URL in the request
For local testing scenarios, host the image using a lightweight local HTTP server and reference it via a local URL (e.g., http://localhost:<port>/image.png)

We have forwarded feedback to the product team regarding support for inline image inputs (Base64 and byte arrays). At this time, there is no confirmed timeline for this capability.

If this functionality is critical for your use case, we recommend continuing with the URL-based approach and monitoring official documentation for updates.

Please let us know if you need assistance with implementing the suggested workaround or validating your request format.

Tim-Niklas Wennemann 0 Reputation points

2026-04-25T10:02:52.29+00:00

I also opened a Github Issue. They fixed it a few days ago and it is working now again

Answer 2

Local/base64 image input is supported in some Microsoft Foundry scenarios, but the context only documents this capability for classic Foundry chat completions and not specifically for Agents via the Responses API.

The documented, supported image patterns in the provided context are:

Azure OpenAI image generation APIs
- Input is text-only; images are not sent as data: URLs or byte arrays.
- Output images are returned as base64 in a b64_json field.
- Example response:
```
     {
       "created": 1698116662,
       "data": [
         {
           "b64_json": "<base64 image data>"
         }
       ]
     }
```
- Common errors like InvalidPayload are raised when the payload does not match the expected schema.
Classic Foundry chat completions with images
- Images can be provided either as base64-encoded data URLs or as URLs from an accessible cloud location.
- The documentation explicitly notes that the model can read content from an accessible cloud URL and that images are tokenized as patches.
- It also notes constraints such as “some models support only one image per turn” and that adding multiple images can cause errors.
Foundry Agents image generation tool
- The image generation tool uses gpt-image-1 and returns base64-encoded images in the agent response (for example, in image_generation_call output items).
- The tool is focused on generating images from text prompts, not on accepting arbitrary local image input.

In the provided context there is no schema definition or example for sending local images (base64 data URLs or raw byte arrays) into Foundry Agents via the Responses API. The only documented image-related behavior for Agents is:

Using the image generation tool, which returns base64 image data in the response.
General troubleshooting for image generation (deployment issues, headers, content filtering, regional limitations).

Given this, the invalid_payload error with message "The provided data does not match the expected schema" is consistent with the documented behavior when the request body does not conform to the expected schema for the API in question. The context does not provide a schema that accepts:

{
  "type": "input_image",
  "image_url": "data:image/png;base64,...",
  "detail": "auto"
}

for Agents via the Responses API.

From the available documentation, only the following can be stated:

Image generation with gpt-image-1 is supported for Agents, and images are returned as base64 in the response.
Classic chat completions (not Agents) support base64 data URLs or public URLs as image input.
An InvalidPayload/invalid_payload error is expected when the payload does not match the API’s documented schema.

The context does not contain an explicit statement confirming or denying current support for local image input to Foundry Agents via the Responses API, nor does it document any recent regression or breaking change for this scenario. No further conclusion can be drawn from the provided material.

References:

Share via

New Microsoft Foundry Agent via Responses API rejects local image input as Base64 data URL and byte array

2 answers

Your answer