
New Azure AI Foundry VS Old Azure AI Foundry

nishant garg 20 Reputation points
2026-04-22T10:36:41.5666667+00:00

Hi,

I am currently facing issues while working with Azure AI Foundry (both the new and old experiences), and I’m unable to proceed due to differences in behavior between the two.

Scenario:

  • I am using Azure AI Foundry agents along with Azure AI Search for a RAG-based implementation.

I need both:

  • Conversation threading (to maintain chat history)
  • Retrieval details (chunk IDs and scores) to show sources to users


Issues observed:

**Agent visibility mismatch**

  • Agents created in the New Foundry are not visible in the Old Foundry.
  • Similarly, agents created in the Old Foundry are not visible in the New Foundry.

**New Foundry limitations**

  • I can successfully call the agent via API.
  • However, I am unable to maintain conversation threads.
  • I also cannot retrieve past conversation history.

**Old Foundry limitations**

  • Conversation threading works correctly when using backend code.
  • However, I am not receiving chunk IDs or relevance scores.
  • Because of this, I cannot properly show source attribution to users.
                                

Current blocker:

New Foundry → No proper thread management / history

Old Foundry → No chunk metadata (IDs, scores)

Because of these limitations, I am unable to complete my RAG-based implementation.

Question: What is the recommended approach to:

Maintain conversation threads and

Retrieve chunk-level metadata (IDs, scores) for source attribution

Should I:

Fully switch to one Foundry experience?

Use a hybrid approach?

Or is there a specific configuration/API I might be missing?

Any guidance or best practices would be greatly appreciated.

Thanks!

Foundry Agent Service

A fully managed platform in Microsoft Foundry for hosting, scaling, and securing AI agents built with any supported framework or model


2 answers

Sort by: Most helpful
  1. Karnam Venkata Rajeswari 2,395 Reputation points Microsoft External Staff Moderator
    2026-04-25T17:26:50.2733333+00:00

    Hello @nishant garg ,

    Welcome to Microsoft Q&A, and thank you for reaching out to us.

    The behavior observed aligns with the current architecture of Azure AI Foundry Agent Service, where conversation state management and retrieval metadata are handled through separate but complementary components rather than a unified configuration.

    For conversation threading and chat history, the new Foundry experience provides built-in support through structured runtime components:

    • Conversations act as the primary mechanism for maintaining multi-turn state. Reusing the same conversation ID preserves history across requests.
    • Memory (preview) enables longer-term continuity by storing relevant context across sessions through configurable memory stores.

    In scenarios requiring higher control or durability, external storage can also be considered:

    • Persist conversation history in a database (Cosmos DB / SQL / Redis)
    • Maintain identifiers such as conversation_id / session_id
    • Rehydrate context into each request when invoking the agent

    This ensures predictable behavior across environments and avoids reliance on preview features if stability is a primary concern.
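
    The external-storage pattern above can be sketched with a minimal store. Here an in-memory SQLite database stands in for Cosmos DB / SQL / Redis, and the function names (`save_message`, `get_history`) are illustrative helpers, not part of any Foundry SDK:

```python
import sqlite3

# In-memory SQLite stands in for a durable store (Cosmos DB / SQL / Redis).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "  conversation_id TEXT, seq INTEGER, role TEXT, content TEXT)"
)

def save_message(conversation_id: str, role: str, content: str) -> None:
    """Append one turn to the stored conversation history."""
    seq = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM messages WHERE conversation_id = ?",
        (conversation_id,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO messages VALUES (?, ?, ?, ?)",
        (conversation_id, seq, role, content),
    )

def get_history(conversation_id: str) -> list[dict]:
    """Rehydrate the full history, oldest turn first, ready to prepend to a request."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY seq",
        (conversation_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]

save_message("conv-1", "user", "What is RAG?")
save_message("conv-1", "assistant", "Retrieval-augmented generation.")
print(get_history("conv-1"))
```

    On each agent invocation, call `get_history(conversation_id)` and prepend the result to the request, then `save_message` for both the new user turn and the agent's reply.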

    For retrieval and source attribution (chunk IDs, relevance scores), the agent abstraction does not expose detailed retrieval outputs. These are produced by Azure AI Search and must be accessed directly.

    The recommended approach is to:

    • Use Azure AI Search for retrieval operations
    • Configure the index with retrievable fields (e.g., chunk_id, content, metadata)
    • Perform vector or hybrid queries to return relevant chunks along with ranking signals
    • Pass selected chunks into the agent prompt for grounding
    • Display metadata (IDs, scores, citations) at the application layer
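
    The retrieval steps above can be sketched as follows. `search_chunks` uses the `azure-search-documents` client (the endpoint, index name, and field names such as `chunk_id` are placeholders for your own index schema), and `format_citations` is a plain application-layer formatter:

```python
def search_chunks(endpoint: str, index_name: str, api_key: str,
                  query: str, top: int = 5) -> list[dict]:
    """Query Azure AI Search and return chunk IDs, relevance scores, and content.

    Requires: pip install azure-search-documents
    The field names (chunk_id, content) are assumptions about the index schema.
    """
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(endpoint, index_name, AzureKeyCredential(api_key))
    results = client.search(search_text=query, select=["chunk_id", "content"], top=top)
    # Each result carries its relevance score under the '@search.score' key.
    return [
        {"chunk_id": r["chunk_id"], "score": r["@search.score"], "content": r["content"]}
        for r in results
    ]

def format_citations(chunks: list[dict]) -> list[str]:
    """Render chunk metadata as user-facing source attributions."""
    return [
        f"[{i}] {c['chunk_id']} (score: {c['score']:.2f})"
        for i, c in enumerate(chunks, start=1)
    ]

print(format_citations([{"chunk_id": "doc1-p3", "score": 2.71, "content": "..."}]))
```

    Because the query is made directly against Azure AI Search, the chunk IDs and scores are available to the application regardless of what the agent abstraction exposes.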

    To meet both requirements together, the following practical architecture pattern is recommended:

    1. Conversation state (threading) - Use Foundry Conversations for multi-turn interactions, and optionally enable Memory stores for cross-session continuity.
    2. Retrieval layer (attribution) - Use Azure AI Search for document retrieval and metadata, then return chunk details as part of the search response.
    3. Application orchestration:
      • Retrieve context from Azure AI Search
      • Fetch conversation history (from conversations or external store)
      • Construct prompt (history + retrieved content)
      • Invoke Foundry agent for response generation
      • Store new messages and associated metadata
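
    The prompt-construction step of that orchestration can be sketched as a pure function. The layout of the grounding block below is an illustrative choice, not a Foundry requirement:

```python
def build_prompt(history: list[dict], chunks: list[dict], question: str) -> str:
    """Combine chat history and retrieved chunks into one grounded prompt."""
    history_text = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    sources_text = "\n".join(f"[{c['chunk_id']}] {c['content']}" for c in chunks)
    return (
        "Answer using only the sources below and cite chunk IDs.\n\n"
        f"Sources:\n{sources_text}\n\n"
        f"Conversation so far:\n{history_text}\n\n"
        f"User: {question}"
    )

prompt = build_prompt(
    history=[
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
    ],
    chunks=[{"chunk_id": "doc1-p3", "content": "RAG grounds answers in retrieved text."}],
    question="What is RAG?",
)
print(prompt)
```

    The resulting string is what gets sent to the agent; because the chunk IDs travel inside the prompt, the model can cite them and the application can map citations back to the search metadata it already holds.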

    Regarding hybrid usage:

    • Combining old Foundry for memory and new Foundry for retrieval is not recommended
    • Both experiences operate on separate architectures and do not share state
    • A unified design within the new Foundry ecosystem provides better scalability and maintainability

    In summary:

    • Conversation continuity is achieved through Foundry conversations and optional memory stores, while retrieval metadata is obtained from Azure AI Search.
    • Combining these layers explicitly at the application level provides a complete and production-ready solution for RAG scenarios requiring both threading and source attribution.


    Thank you


  2. Q&A Assist
    2026-04-22T10:37:12.8533333+00:00
    AI-generated content: this answer was created with AI from Q&A Assist.

    Agents and conversations are not shared between the Foundry (classic) experience and the new Foundry Agent Service. Each experience has its own agent model and APIs, and agents created in one are not expected to appear in the other. The recommended path is to standardize on the new Foundry Agent Service and its Responses API rather than using a hybrid of classic and new.

    From the platform perspective:

    • The classic experience uses the older Assistants/threads/runs model.
    • The new experience uses Agents v2 with conversations/items/responses.
    • Threads → Conversations, Runs → Responses, Assistants/agents → Agent Versions.

    In the new experience:

    • Conversation state and history are handled via the Conversations and Responses APIs, which must be called through the OpenAI client for the project, not the project client itself.
    • The correct pattern is:
      • Use the project client only for agent creation and versioning.
      • Use the OpenAI client (obtained from the project client) for conversations and responses.

    If conversation history appears unavailable or threading is not working, the most common causes in the new experience are:

    • Calling conversation/response methods on the project client instead of the OpenAI client.
    • Using older SDKs that still target the classic Assistants/threads/runs model.

    The migration guidance explicitly calls out that in the new API:

    • Conversations and responses use the OpenAI client.
    • Agent creation and versioning remain on the project client.

    For example (language-specific details from the migration guidance):

    • Python: call project.get_openai_client() and then use that client for conversations and responses.
    • C#: use projectClient.ProjectOpenAIClient.GetProjectResponsesClientForAgent().
    • JavaScript: call projectClient.getOpenAIClient().
    • Java: use AgentsClientBuilder to build a ResponsesClient.

    If code is still using the removed create_agent() pattern or older SDKs, it must be updated to the new create_version() / Responses API pattern to get full conversation support.

    Regarding RAG and retrieval metadata (chunk IDs, scores):

    • The context indicates that File Search, Azure AI Search, and other grounding tools are supported in both classic and new Foundry, but it does not provide detailed schema for chunk-level metadata in responses.
    • It does not describe any limitation in the new Agent Service that would prevent retrieval metadata from being available; instead, it focuses on API shape changes and tool availability.

    Given the documented direction of the platform and the migration guidance, the recommended approach is:

    1. Migrate agent implementations to the new Foundry Agent Service (Agents v2 / Responses API) rather than mixing classic and new.
    2. Ensure the latest SDKs are used (azure-ai-projects 2.x or equivalent) and that:
      • Agent versions are created via the project client (create_version() / CreateAgentVersionAsync() / createAgentVersion()).
      • Conversations and responses are created and managed via the OpenAI client obtained from the project client.
    3. Start new conversations after migration; historical thread data from classic is not migrated and remains only in the previous API.

    If conversation history and threading still do not appear in the new experience after using the correct client and SDK versions, the next step is to:

    • Verify that the Foundry resource is in a region that supports the Responses API.
    • Confirm that the code is not mixing Assistants API calls with a Responses API endpoint (which would result in 404/MethodNotAllowed and missing behavior).

    The context does not describe a supported or recommended hybrid pattern where classic is used solely for threading and new for retrieval metadata. The platform guidance is to upgrade to Foundry and migrate to the new Agents v2/Responses API model, then start new conversations there.


