
Can't store completions in Azure Foundry

Rodrigo Maldonado 0 Reputation points
2026-04-16T00:51:39.1733333+00:00

Hi, I was trying to follow this guide to save completions from one of the models that I've deployed in the Azure Foundry service https://learn.microsoft.com/en-us/azure/foundry-classic/openai/how-to/stored-completions?tabs=rest-api . I'm testing this curl but it's not working for me. What could I be missing or is this a bug?

curl https://<YOUR-RESOURCE-NAME>.cognitiveservices.azure.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -d '{
    "model": "gpt-5.4-nano",
    "store": true,
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'



Note: PII audited on the support side.

Foundry Models

A catalog of AI models in Microsoft Foundry that you can discover, compare, and deploy using Azure’s built‑in tools for evaluation, fine‑tuning, and inference


2 answers

Sort by: Most helpful
  1. Manas Mohanty 16,670 Reputation points Microsoft External Staff Moderator
    2026-05-04T08:45:26.7866667+00:00

    Hey Rodrigo Maldonado,

    We have tagged this issue with the relevant engineering ticket.

    It appears to be a regression in the chat completions store feature.

    We shall keep you posted as we make progress.

    Thank you for your input on the forum.

    0 comments No comments

  2. Sina Salam 28,606 Reputation points Volunteer Moderator
    2026-04-27T17:02:22.88+00:00

    Hello Rodrigo Maldonado,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you cannot store completions in Azure Foundry.

    This isn’t necessarily a bug. Azure AI Foundry and Azure OpenAI do not persist completions automatically, so in practice the root cause is usually the absence of a storage layer on your side.

    What you will need to do are the following:

    1. You must explicitly capture the model’s response in code: `response = client.chat.completions.create(model="gpt-4o", messages=messages)`, then `output_text = response.choices[0].message.content`.
    2. Choose a persistent store matching your pattern: Azure Blob Storage for raw logs, Azure Cosmos DB for conversation history, or Azure SQL Database for structured apps.
    3. Insert the captured output into Cosmos DB: `container.create_item({"id": str(uuid.uuid4()), "user_id": user_id, "prompt": messages, "response": output_text, "timestamp": datetime.utcnow().isoformat()})`.
    4. Add observability with Azure Monitor Application Insights to track latency, token usage, and failures. See the link here - https://learn.microsoft.com/azure/azure-monitor/app/app-insights-overview.
    5. If you use Azure AI Foundry, Prompt Flow provides built‑in execution tracking and lineage but does not replace long‑term storage - https://learn.microsoft.com/azure/ai-studio/how-to/prompt-flow.
    6. For multi‑turn experiences, implement conversation memory by storing structured message arrays such as `{"session_id": "abc123", "messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}`.
    7. Finally, lock data down with Azure’s default encryption at rest, mask sensitive fields, and apply RBAC; refer to the encryption fundamentals: https://learn.microsoft.com/azure/security/fundamentals/encryption-atrest.
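    The capture-and-store steps above (1, 3, and 6) can be sketched as follows. This is a minimal, stdlib-only sketch: `build_record` and `append_turn` are hypothetical helper names, the response text is stubbed rather than fetched from a live deployment, and the actual `container.create_item` call is shown as a comment because it requires the `azure-cosmos` SDK and a provisioned container:

```python
import uuid
from datetime import datetime, timezone

def build_record(user_id: str, messages: list, output_text: str) -> dict:
    """Build the document to persist, matching the Cosmos DB shape in step 3."""
    return {
        "id": str(uuid.uuid4()),    # Cosmos DB items require a unique "id"
        "user_id": user_id,
        "prompt": messages,         # keep the full prompt for traceability
        "response": output_text,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

def append_turn(session: dict, role: str, content: str) -> dict:
    """Append one message to the multi-turn structure from step 6."""
    session["messages"].append({"role": role, "content": content})
    return session

# Step 1: capture the response text (stubbed here instead of a live API call).
messages = [{"role": "user", "content": "Hello!"}]
output_text = "Hello! How can I help?"

# Step 3: build the document and persist it.
record = build_record("user-42", messages, output_text)
# container.create_item(record)  # requires azure-cosmos and a live container

# Step 6: maintain conversation memory across turns.
session = {"session_id": "abc123", "messages": []}
append_turn(session, "user", "Hello!")
append_turn(session, "assistant", output_text)
```

    The record dict deliberately mirrors the fields in step 3, so swapping the stubbed `output_text` for a real `response.choices[0].message.content` is the only change needed against a live deployment.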

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close out the thread by upvoting and accepting the answer if it helped.

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.