Skip to main content

Command Palette

Search for a command to run...

Azure AI Foundry: Production-Grade Agent Deployment

How to take marketplace agents from developer tools to production-grade services with Azure AI Foundry.

Updated
8 min read
Azure AI Foundry: Production-Grade Agent Deployment
P
Pushp Vashisht is working as a Software Engineer II at Microsoft, Ireland. For more information pay a visit at: pushp.ovh

Difficulty: Advanced | Prerequisites: Azure subscription, Azure AI Foundry access

TL;DR: AI Foundry is where marketplace agents graduate to production. Skills become Prompt Flow nodes, agents become deployable endpoints, rules become system prompts + content safety filters. Use evaluations for automated quality gates before promoting agents.


What is Azure AI Foundry?

Azure AI Foundry (formerly Azure AI Studio) is Microsoft's platform for building, deploying, and managing production AI applications. While the other platforms in this series focus on developer productivity, Azure AI Foundry is where agents graduate to production workloads: serving end users, processing data at scale, and integrating into business-critical systems.

Key differentiators for marketplace integration:

  • Agent Service: deploy conversational agents as managed services

  • Prompt Flow: visual workflow orchestration for complex agent logic

  • Model Catalog: access to GPT-4o, Claude, Llama, Phi, and more

  • Evaluations: built-in testing and quality assessment

  • Safety: content filtering, jailbreak detection, grounding checks

  • Enterprise security: VNET, private endpoints, managed identity, RBAC

Why Azure AI Foundry for the Marketplace?

The other platforms (Claude Code, Copilot, Cursor) help developers build software faster. Azure AI Foundry helps you deploy AI agents as products: agents that serve thousands of users, handle production traffic, and meet enterprise SLAs.

Stage Platform Use Case
Development Claude Code, Cursor Build and iterate on agent logic
Code Assistance GitHub Copilot Get AI help while coding
Business Users Copilot Studio Low-code agents for non-developers
Production Azure AI Foundry Deploy agents as scalable services

Scenario 1: Deploying a Marketplace Agent as an API

Situation: The incident responder agent is so useful in Teams that leadership wants it available as an API for integration with the internal incident management dashboard.

Step 1: Create the project in Azure AI Foundry

# Using Azure CLI
az ai project create \
  --name incident-responder \
  --resource-group ai-agents-rg \
  --hub-name my-ai-hub

Step 2: Define the agent using Prompt Flow

Translate the marketplace agent definition into a Prompt Flow:

# flow.dag.yaml
inputs:
  incident_id:
    type: string
    description: The incident ID to triage

outputs:
  triage_result:
    type: string
    reference: ${triage.output}

nodes:
  - name: fetch_incident
    type: python
    source:
      type: code
      path: fetch_incident.py
    inputs:
      incident_id: ${inputs.incident_id}

  - name: search_tsg
    type: python
    source:
      type: code
      path: search_tsg.py
    inputs:
      service_name: ${fetch_incident.output.service}
      error_type: ${fetch_incident.output.error_type}

  - name: triage
    type: llm
    source:
      type: code
      path: triage_prompt.jinja2
    inputs:
      incident_details: ${fetch_incident.output}
      tsg_results: ${search_tsg.output}
      deployment_model: gpt-4o

Step 3: Apply marketplace rules as system prompts

The marketplace's rules translate directly to the LLM node's system prompt:

{# triage_prompt.jinja2 #}
system:
You are an incident triage agent for our organization.

## Rules (from AI Agent Marketplace)
- Never expose internal infrastructure details in responses
- Always recommend human escalation for Sev1/Sev2 incidents
- Log all triage decisions for audit purposes
- Follow the principle of least privilege when suggesting access

## Triage Workflow
1. Summarize the incident in 2-3 sentences
2. Identify the likely root cause based on error patterns
3. Find the matching TSG and present key steps
4. Recommend severity classification
5. Suggest the owning team for escalation

user:
Triage this incident:
{{ incident_details }}

Relevant TSGs found:
{{ tsg_results }}

Step 4: Add safety and evaluation

# evaluations/eval_triage.py
from azure.ai.evaluation import evaluate, GroundednessEvaluator, RelevanceEvaluator

results = evaluate(
    data="test_incidents.jsonl",
    evaluators={
        "groundedness": GroundednessEvaluator(model_config),
        "relevance": RelevanceEvaluator(model_config),
    },
    target=triage_flow,
)

Step 5: Deploy as a managed endpoint

az ai deployment create \
  --name incident-responder-prod \
  --project incident-responder \
  --flow ./flow.dag.yaml \
  --instance-type Standard_DS3_v2 \
  --instance-count 2

Now the agent is available as a REST API:

curl -X POST https://incident-responder-prod.azurewebsites.net/score \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"incident_id": "123456"}'

Scenario 2: Multi-Agent Orchestration

Situation: You want a production system where multiple marketplace agents collaborate. One fetches data, one analyzes, and one recommends actions.

Agent orchestration with Azure AI Agent Service

from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import AgentConfig

client = AIProjectClient(
    credential=DefaultAzureCredential(),
    project="my-ai-hub/ai-agents",
)

# Create specialized agents from marketplace definitions
data_agent = client.agents.create(
    name="data-fetcher",
    instructions=open("marketplace/agents/data-fetcher.md").read(),
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}, ado_tool, kusto_tool],
)

analysis_agent = client.agents.create(
    name="analyzer",
    instructions=open("marketplace/agents/analyzer.md").read(),
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}],
)

# Orchestrate
thread = client.agents.threads.create()

# Step 1: Data agent fetches context
client.agents.threads.runs.create(
    thread_id=thread.id,
    agent_id=data_agent.id,
    instructions="Fetch the last 7 days of deployment data for auth-service",
)

# Step 2: Analysis agent processes results
client.agents.threads.runs.create(
    thread_id=thread.id,
    agent_id=analysis_agent.id,
    instructions="Analyze the deployment data and identify anomalies",
)

Scenario 3: Grounding Agents in Enterprise Data

Situation: Your agents need access to proprietary data (internal wikis, code repos, incident history) in a secure, managed way.

Use Azure AI Search as a knowledge base

# Index your marketplace documentation and org knowledge
from azure.search.documents.indexes import SearchIndexClient

index_client = SearchIndexClient(
    endpoint="https://my-search.search.windows.net",
    credential=DefaultAzureCredential(),
)

# Create index for TSGs, runbooks, and marketplace docs
index = SearchIndex(
    name="org-knowledge",
    fields=[
        SimpleField(name="id", type="Edm.String", key=True),
        SearchableField(name="title", type="Edm.String"),
        SearchableField(name="content", type="Edm.String"),
        SimpleField(name="source", type="Edm.String", filterable=True),
        SimpleField(name="plugin", type="Edm.String", filterable=True),
    ],
)

Then ground your agents:

agent = client.agents.create(
    name="knowledge-assistant",
    instructions="You are an engineering knowledge assistant...",
    model="gpt-4o",
    tools=[{
        "type": "azure_ai_search",
        "azure_ai_search": {
            "index_name": "org-knowledge",
            "endpoint": "https://my-search.search.windows.net",
        }
    }],
)

How Marketplace Concepts Map to Azure AI Foundry

Marketplace Concept Azure AI Foundry Equivalent
Skill (.md) Prompt Flow node or agent instructions
Agent (workflow) Prompt Flow DAG or Agent Service agent
Rule (constraint) System prompt + content safety filters
Hook (automation) Prompt Flow trigger or Azure Function
MCP Server Tool definition (code interpreter, Azure AI Search, custom)
Plugin (package) AI Foundry project with deployable endpoint

Scenario 4: Evaluation Pipeline for Marketplace Agents

Situation: Before promoting an agent from dev to prod, you need automated quality checks.

Build an evaluation pipeline

# azure-pipelines.yml or GitHub Actions
name: Agent Evaluation Pipeline

on:
  pull_request:
    paths:
      - 'plugins/*/agents/**'
      - 'plugins/*/skills/**'

jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run agent evaluations
        run: |
          python -m pytest evaluations/ \
            --agent-dir plugins/$CHANGED_PLUGIN \
            --eval-dataset test_cases.jsonl \
            --metrics groundedness,relevance,coherence,safety \
            --threshold 0.8

      - name: Safety check
        run: |
          python evaluations/safety_check.py \
            --agent-dir plugins/$CHANGED_PLUGIN \
            --adversarial-dataset adversarial_prompts.jsonl \
            --max-failures 0

Tips for Marketplace Authors Targeting Azure AI Foundry

  1. Design for production from the start. Think about latency, cost, and reliability

  2. Use Prompt Flow for complex workflows. It provides observability that raw code doesn't

  3. Leverage evaluations. Automated quality gates prevent regressions

  4. Security is non-negotiable. Use managed identities, VNET, and RBAC

  5. Monitor costs. Production agents can be expensive; track token usage per agent

  6. Version your agents. Use the same versioning as marketplace plugins (semver)

  7. Start with a single model. Optimize model selection after you have evaluation data

The Full Lifecycle

Marketplace Plugin (dev)
    |
    v
Claude Code / Cursor (build & iterate)
    |
    v
GitHub Copilot (code with AI assistance)
    |
    v
Copilot Studio (business user testing)
    |
    v
Azure AI Foundry (production deployment)
    |
    v
Monitoring & Evaluation (continuous improvement)
    |
    v
Back to Marketplace (updated plugin)

Quick Setup Checklist

  1. Create an Azure AI Foundry hub and project

  2. Deploy a model (GPT-4o recommended for agents)

  3. Translate marketplace skills into Prompt Flow nodes

  4. Add marketplace rules as system prompt content

  5. Configure tool connections (Azure AI Search, custom APIs)

  6. Build evaluation dataset from real incident/test data

  7. Run evaluations (groundedness, relevance, safety)

  8. Deploy to managed endpoint with autoscale

  9. Set up monitoring and alerting on the endpoint

  10. Wire endpoint into your application or Teams bot

Mapping reference:

Marketplace Azure AI Foundry
Skill Prompt Flow node / agent instructions
Agent Prompt Flow DAG / Agent Service agent
Rule System prompt + content safety filters
Hook Azure Function trigger
MCP Server Tool definition (search, code interpreter, custom)
Plugin Project with deployable endpoint

Series Conclusion

Across this 3-part series, we've covered:

  1. Part 1: Why the marketplace pattern works for scaling AI across an enterprise

  2. Part 2: How to build and extend a marketplace from the boilerplate

  3. Part 3: How to integrate with every major AI platform

The key insight: you don't need to pick one tool. The marketplace is the abstraction layer that lets different teams use different tools while sharing the same capabilities, rules, and governance.

Start small. Seed with 3-5 high-value plugins. Let adoption drive what to build next. And remember: the hardest part isn't the technology. It's getting the first five contributors.

Good luck building your marketplace.

Enterprise AI Agent Marketplace

Part 7 of 10

Most organizations let every team pick their own AI tool. Some use Claude Code, others use GitHub Copilot, others Cursor or Copilot Studio. The result: duplicated workflows, inconsistent governance, and AI capabilities trapped inside individual teams. This series shows you how to fix that with an internal AI Agent Marketplace: a shared catalog of skills, agents, and rules that every team can install into whichever AI platform they already use. Consistency without forced standardization. Inside you'll find a 3-part core walkthrough (why the pattern matters, how to build one, how to integrate it), 5 platform-specific guides (Claude Code, GitHub Copilot, Copilot Studio, Cursor, Azure AI Foundry), and 5 bonus posts covering ROI modeling, a 90-day adoption playbook, 15 ready-to-build plugin recipes, real case studies across company sizes, and an AI maturity model. A production-ready boilerplate repository ships alongside the series so you can fork and customize on day one. Who it's for: platform engineers, engineering managers, and governance teams who want AI adoption to scale without becoming a sprawl of disconnected experiments.

Up next

The AI Agent Marketplace ROI Calculator

The spreadsheet that gets leadership to say yes.