Custom Neural Model Training Stuck at 0% and Unable to View Cognitive Services Quotas

Benson Mbaaro 5 Reputation points
2026-03-06T10:54:13.6666667+00:00

Hello Microsoft Support,

I’m experiencing an issue with Azure Document Intelligence (Custom Neural Model training) where every training operation gets stuck at 0% progress indefinitely. This happens even when creating brand‑new projects, new containers, and even new Document Intelligence resources.

I have trained models successfully in the past, and those training runs completed; several models are now stuck. The region is UK South.

When I delete the stuck models via the CLI, I get a 204 response (success), but the models continue to show as running in the model list.

Additionally, when I attempt to check Cognitive Services usage or quotas through the Azure CLI/ARM management API, the result is always empty, meaning I cannot see any Cognitive Services quota information for my subscription or region.

Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, Foundry Tools is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.

3 answers

  1. Anshika Varshney 10,145 Reputation points Microsoft External Staff Moderator
    2026-05-01T07:27:40.14+00:00

    Hi Benson Mbaaro,

    Thanks for sharing the details. Two separate things appear to be happening here, and both point toward service-level behavior or quota/permission issues rather than anything wrong in your code.

    From your description

    • training jobs are stuck at 0 percent
    • deleting models returns success but still shows running
    • quota information is empty

    this combination usually indicates either

    • quota or usage limits not being correctly applied or visible
    • permission issue while reading quota data
    • or backend state issue in the service

    This matches similar patterns seen with Document Intelligence custom neural model behavior where training jobs get accepted but do not progress.

    First thing to check is quota and limits

    Even on Standard tier, there are still limits involved

    For custom neural models

    • there is a limit on total number of models per resource
    • there is also tracking of training usage time
    • limits may still block progress even if request is accepted

    If any of these limits are reached, jobs can stay stuck at 0 percent without error.

    Check model count in your resource

    There is a maximum limit for custom neural models per resource

    If you are close to that limit

    • new training operations may not start properly
    • delete operations may look successful but not reflect immediately

    Try reducing older or unused models and then retry training
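
    A minimal sketch of the model-count check, assuming the v4.0 REST "get resource details" call (GET {endpoint}/documentintelligence/info) returns a customDocumentModels object with count and limit fields. The function name and the sample payload are hypothetical, and the count covers all custom models on the resource, not only neural ones, so verify the field names against the API version you use.

```python
def model_headroom(resource_info: dict) -> dict:
    """Summarize how close a resource is to its custom model limit.

    `resource_info` is the parsed JSON body of the resource details call.
    The `customDocumentModels.count` / `.limit` field names are an assumption.
    """
    models = resource_info.get("customDocumentModels", {})
    count = int(models.get("count", 0))
    limit = int(models.get("limit", 0))
    return {
        "count": count,
        "limit": limit,
        "remaining": max(limit - count, 0),
        # A limit of 0 means the payload had no limit, so do not flag it.
        "at_limit": limit > 0 and count >= limit,
    }


# Hypothetical payload for illustration only.
sample_info = {"customDocumentModels": {"count": 498, "limit": 500}}
print(model_headroom(sample_info))
```

    If "remaining" is small or "at_limit" is true, deleting unused models before retrying training is the first thing to try.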

    Check training usage behavior

    Even though Standard tier allows more usage

    • training still depends on available capacity and usage tracking
    • sometimes usage reset or tracking may not reflect immediately

    If training worked earlier and suddenly stopped, it can be related to usage tracking or backend state.

    Check quota visibility issue

    When the quota page returns empty results, the cause is often a permissions issue rather than a problem with the resource.

    Make sure your account has

    • required role at subscription level to read usage data

    If the role is missing

    • quota APIs may return empty
    • even though resource exists and works

    This is a common reason for blank quota results in portal or CLI.
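
    To rule out a wrong subscription or location before looking at roles, it can help to build the usages request explicitly. The sketch below assumes the ARM "Usages - List" route for Microsoft.CognitiveServices and the 2023-05-01 API version; the helper names and the placeholder subscription ID are hypothetical, and the code only builds the URL and interprets a parsed response, it does not call Azure.

```python
from urllib.parse import quote

ARM_ENDPOINT = "https://management.azure.com"


def usages_url(subscription_id: str, location: str,
               api_version: str = "2023-05-01") -> str:
    """Build the ARM URL that lists Cognitive Services usages for a location.

    The route and api-version are assumptions; check the ARM REST reference.
    """
    return (
        f"{ARM_ENDPOINT}/subscriptions/{quote(subscription_id)}"
        f"/providers/Microsoft.CognitiveServices"
        f"/locations/{quote(location)}/usages?api-version={api_version}"
    )


def describe_usages(payload: dict) -> str:
    """Turn a parsed usages response into a short diagnostic message."""
    items = payload.get("value", [])
    if not items:
        return "empty: verify role assignment, subscription and location"
    return f"{len(items)} usage entries returned"


# Placeholder subscription ID for illustration only.
print(usages_url("00000000-0000-0000-0000-000000000000", "uksouth"))
print(describe_usages({"value": []}))
```

    An empty "value" list with a 200 status is the case described above; a 403 would point more directly at a missing role.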

    Check region and model availability

    You mentioned UK South

    Sometimes

    • specific regions may have temporary issues
    • or backend processing delays

    You can try

    • creating a test resource in another region
    • running same training

    If it works there, then the issue is region-specific.

    Important observation

    Your case has these signs

    • valid resource and key
    • training accepted but no progress
    • model delete not reflecting
    • quota not visible

    This combination strongly points to a control plane or backend state issue, not a request or configuration mistake

    Things you can try as workaround

    Before retrying

    • create a fresh resource and try one small training dataset
    • check if new model progresses beyond 0 percent
    • keep input small to simplify testing

    This helps confirm whether issue is tied to resource state or not.
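
    One way to make "no progress" concrete is to filter the operations list for build operations that are old but still at 0 percent. This is a sketch under assumptions: it expects entries shaped like the v4.0 operations list (operationId, kind, status, percentCompleted, createdDateTime), the 2-hour threshold is arbitrary, and the function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-03-06T08:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def stalled_builds(operations, now, max_age=timedelta(hours=2)):
    """Return IDs of build operations still at 0% after `max_age`.

    Field names and status values are assumptions about the response shape.
    """
    stalled = []
    for op in operations:
        if op.get("kind") != "documentModelBuild":
            continue
        if op.get("status") not in ("notStarted", "running"):
            continue
        if op.get("percentCompleted", 0) > 0:
            continue
        if now - parse_ts(op["createdDateTime"]) >= max_age:
            stalled.append(op["operationId"])
    return stalled


# Hypothetical operations for illustration only.
sample_ops = [
    {"operationId": "op-1", "kind": "documentModelBuild", "status": "running",
     "percentCompleted": 0, "createdDateTime": "2026-03-06T08:00:00Z"},
    {"operationId": "op-2", "kind": "documentModelBuild", "status": "succeeded",
     "percentCompleted": 100, "createdDateTime": "2026-03-05T08:00:00Z"},
]
now = datetime(2026, 3, 6, 11, 0, tzinfo=timezone.utc)
print(stalled_builds(sample_ops, now))
```

    Running the same filter against the fresh test resource and the original resource shows whether the stall follows the resource or the region.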

    In short

    • training stuck at 0 usually indicates quota or backend state issue
    • empty quota result is often due to missing permissions
    • model limit or usage limits can block training silently
    • region level issues can also impact behavior

    I hope this helps. From what you shared, your setup looks correct; this seems more like a service-side or quota visibility issue than anything wrong in your implementation.

    Do let me know if you have any further queries.

    If this answers your query, please click "Accept Answer" and select "Yes" for "Was this answer helpful".

    Thank you!


  2. Benson Mbaaro 5 Reputation points
    2026-03-06T11:23:00.6666667+00:00

    I am on S0 and my model only has approximately 500 pages. I'm not sure it is a quota issue, since it's just the beginning of the month and the quota should have reset. I have added only two new pages this month; the rest have trained fine several times over the past couple of months.

  3. Q&A Assist
    2026-03-06T10:54:34.5166667+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    The behavior matches a quota/limit or backend provisioning issue affecting custom neural model training and quota visibility.

    For Document Intelligence v4.0 custom neural models, the following limits apply per resource:

    • Training dataset size (neural and generative): 1 GB, not adjustable on either Free (F0) or Standard (S0).
    • Max number of pages for neural/generative training: 50,000 pages, not adjustable.
    • Custom neural model training time:
      • Free (F0): 10 hours per month.
      • Standard (S0): No hard limit, but billed by the hour, with 10 free hours each month and the ability to request increases.

    If the monthly 10‑hour free training allowance is exhausted on F0, or if a custom quota configuration is mis‑applied on S0, training jobs can remain in a non‑progressing state while still being accepted. Since training time is tracked per calendar month, the counter resets monthly. For higher training volume on S0, a support request is required to increase the monthly training limit.

    In addition, there is a hard limit on the number of custom neural models per resource:

    • Max number of neural models: 100 on Free (F0), 500 on Standard (S0), not adjustable.

    If the resource is at or beyond this limit, model management operations can behave inconsistently (for example, models appearing to remain in a running state even after delete calls return 204). Cleaning up models below the limit and then re‑checking can help, but if the portal or control plane is stuck, support intervention is needed.
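
    To document the "204 but still listed" symptom for a support request, the deleted IDs can be compared with a fresh model list. This is a sketch that assumes the list-models response carries a value array whose entries have a modelId field; the function name and sample IDs are hypothetical.

```python
def still_listed(deleted_ids, list_response: dict):
    """Return deleted model IDs that still appear in a model list response.

    Assumes the response shape {"value": [{"modelId": ...}, ...]}.
    """
    listed = {m["modelId"] for m in list_response.get("value", [])}
    return sorted(set(deleted_ids) & listed)


# Hypothetical data for illustration only.
deleted = ["invoice-v3", "invoice-v4"]
listing = {"value": [{"modelId": "invoice-v4"}, {"modelId": "receipt-v1"}]}
print(still_listed(deleted, listing))
```

    Any ID returned here was deleted with a success status yet remains visible, which is the evidence support will likely ask for.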

    For the separate issue where Cognitive Services quotas appear empty in the portal/CLI:

    • When the quota page is empty or shows no allocations, the required role at subscription level is Cognitive Services Usages Reader. Without this role, quota data will not be returned even though the resource exists.
    • If the Request quota button is disabled, Owner or Contributor role on the subscription is required. Some model/region combinations do not support quota increases.
    • After any approved quota change, allow up to 15 minutes for propagation and refresh the Quota page. If the change is still not visible after 24 hours, Azure support must be contacted.
    • If a specific model’s quota cannot be found, verify that the model is available in the selected region.

    Given the symptoms:

    1. Verify that the subscription has Cognitive Services Usages Reader at the subscription scope and that the correct subscription and region (UK South) are selected when viewing quotas.
    2. Confirm whether the resource is on Free (F0) and whether the 10 hours/month of custom neural training may have been consumed. If so, either wait for the next calendar month or move to Standard (S0) and, if needed, request a training‑limit increase.
    3. Check that the number of neural models is below the documented maximum (100 on F0, 500 on S0). If near or above the limit, reduce the number of models and then retry training.
    4. If quota pages remain empty despite correct roles and the training jobs continue to stay at 0% or models remain stuck in Running after 204 deletes, open an Azure support request. This is necessary both to:
      • Investigate and fix the control‑plane state for the stuck models and training jobs in UK South.
      • Investigate the quota/usage visibility issue for Cognitive Services on the subscription.

    These actions align with the documented limits and the guidance to use Azure support when quota changes or control‑plane state do not behave as expected.

