Azure AI Foundry: Production-Grade Agent Deployment
How to take marketplace agents from developer tools to production-grade services with Azure AI Foundry.

Difficulty: Advanced | Prerequisites: Azure subscription, Azure AI Foundry access
TL;DR: AI Foundry is where marketplace agents graduate to production. Skills become Prompt Flow nodes, agents become deployable endpoints, rules become system prompts + content safety filters. Use evaluations for automated quality gates before promoting agents.
What is Azure AI Foundry?
Azure AI Foundry (formerly Azure AI Studio) is Microsoft's platform for building, deploying, and managing production AI applications. While the other platforms in this series focus on developer productivity, Azure AI Foundry is where agents graduate to production workloads: serving end users, processing data at scale, and integrating into business-critical systems.
Key differentiators for marketplace integration:
Agent Service: deploy conversational agents as managed services
Prompt Flow: visual workflow orchestration for complex agent logic
Model Catalog: access to GPT-4o, Claude, Llama, Phi, and more
Evaluations: built-in testing and quality assessment
Safety: content filtering, jailbreak detection, grounding checks
Enterprise security: VNET, private endpoints, managed identity, RBAC
Why Azure AI Foundry for the Marketplace?
The other platforms (Claude Code, Copilot, Cursor) help developers build software faster. Azure AI Foundry helps you deploy AI agents as products: agents that serve thousands of users, handle production traffic, and meet enterprise SLAs.
| Stage | Platform | Use Case |
|---|---|---|
| Development | Claude Code, Cursor | Build and iterate on agent logic |
| Code Assistance | GitHub Copilot | Get AI help while coding |
| Business Users | Copilot Studio | Low-code agents for non-developers |
| Production | Azure AI Foundry | Deploy agents as scalable services |
Scenario 1: Deploying a Marketplace Agent as an API
Situation: The incident responder agent is so useful in Teams that leadership wants it available as an API for integration with the internal incident management dashboard.
Step 1: Create the project in Azure AI Foundry
# Using Azure CLI
az ai project create \
--name incident-responder \
--resource-group ai-agents-rg \
--hub-name my-ai-hub
Step 2: Define the agent using Prompt Flow
Translate the marketplace agent definition into a Prompt Flow:
# flow.dag.yaml
inputs:
incident_id:
type: string
description: The incident ID to triage
outputs:
triage_result:
type: string
reference: ${triage.output}
nodes:
- name: fetch_incident
type: python
source:
type: code
path: fetch_incident.py
inputs:
incident_id: ${inputs.incident_id}
- name: search_tsg
type: python
source:
type: code
path: search_tsg.py
inputs:
service_name: ${fetch_incident.output.service}
error_type: ${fetch_incident.output.error_type}
- name: triage
type: llm
source:
type: code
path: triage_prompt.jinja2
inputs:
incident_details: ${fetch_incident.output}
tsg_results: ${search_tsg.output}
deployment_model: gpt-4o
Step 3: Apply marketplace rules as system prompts
The marketplace's rules translate directly to the LLM node's system prompt:
{# triage_prompt.jinja2 #}
system:
You are an incident triage agent for our organization.
## Rules (from AI Agent Marketplace)
- Never expose internal infrastructure details in responses
- Always recommend human escalation for Sev1/Sev2 incidents
- Log all triage decisions for audit purposes
- Follow the principle of least privilege when suggesting access
## Triage Workflow
1. Summarize the incident in 2-3 sentences
2. Identify the likely root cause based on error patterns
3. Find the matching TSG and present key steps
4. Recommend severity classification
5. Suggest the owning team for escalation
user:
Triage this incident:
{{ incident_details }}
Relevant TSGs found:
{{ tsg_results }}
Step 4: Add safety and evaluation
# evaluations/eval_triage.py
from azure.ai.evaluation import evaluate, GroundednessEvaluator, RelevanceEvaluator
results = evaluate(
data="test_incidents.jsonl",
evaluators={
"groundedness": GroundednessEvaluator(model_config),
"relevance": RelevanceEvaluator(model_config),
},
target=triage_flow,
)
Step 5: Deploy as a managed endpoint
az ai deployment create \
--name incident-responder-prod \
--project incident-responder \
--flow ./flow.dag.yaml \
--instance-type Standard_DS3_v2 \
--instance-count 2
Now the agent is available as a REST API:
curl -X POST https://incident-responder-prod.azurewebsites.net/score \
-H "Authorization: Bearer $TOKEN" \
-d '{"incident_id": "123456"}'
Scenario 2: Multi-Agent Orchestration
Situation: You want a production system where multiple marketplace agents collaborate. One fetches data, one analyzes, and one recommends actions.
Agent orchestration with Azure AI Agent Service
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import AgentConfig
client = AIProjectClient(
credential=DefaultAzureCredential(),
project="my-ai-hub/ai-agents",
)
# Create specialized agents from marketplace definitions
data_agent = client.agents.create(
name="data-fetcher",
instructions=open("marketplace/agents/data-fetcher.md").read(),
model="gpt-4o",
tools=[{"type": "code_interpreter"}, ado_tool, kusto_tool],
)
analysis_agent = client.agents.create(
name="analyzer",
instructions=open("marketplace/agents/analyzer.md").read(),
model="gpt-4o",
tools=[{"type": "code_interpreter"}],
)
# Orchestrate
thread = client.agents.threads.create()
# Step 1: Data agent fetches context
client.agents.threads.runs.create(
thread_id=thread.id,
agent_id=data_agent.id,
instructions="Fetch the last 7 days of deployment data for auth-service",
)
# Step 2: Analysis agent processes results
client.agents.threads.runs.create(
thread_id=thread.id,
agent_id=analysis_agent.id,
instructions="Analyze the deployment data and identify anomalies",
)
Scenario 3: Grounding Agents in Enterprise Data
Situation: Your agents need access to proprietary data (internal wikis, code repos, incident history) in a secure, managed way.
Use Azure AI Search as a knowledge base
# Index your marketplace documentation and org knowledge
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(
endpoint="https://my-search.search.windows.net",
credential=DefaultAzureCredential(),
)
# Create index for TSGs, runbooks, and marketplace docs
index = SearchIndex(
name="org-knowledge",
fields=[
SimpleField(name="id", type="Edm.String", key=True),
SearchableField(name="title", type="Edm.String"),
SearchableField(name="content", type="Edm.String"),
SimpleField(name="source", type="Edm.String", filterable=True),
SimpleField(name="plugin", type="Edm.String", filterable=True),
],
)
Then ground your agents:
agent = client.agents.create(
name="knowledge-assistant",
instructions="You are an engineering knowledge assistant...",
model="gpt-4o",
tools=[{
"type": "azure_ai_search",
"azure_ai_search": {
"index_name": "org-knowledge",
"endpoint": "https://my-search.search.windows.net",
}
}],
)
How Marketplace Concepts Map to Azure AI Foundry
| Marketplace Concept | Azure AI Foundry Equivalent |
|---|---|
| Skill (.md) | Prompt Flow node or agent instructions |
| Agent (workflow) | Prompt Flow DAG or Agent Service agent |
| Rule (constraint) | System prompt + content safety filters |
| Hook (automation) | Prompt Flow trigger or Azure Function |
| MCP Server | Tool definition (code interpreter, Azure AI Search, custom) |
| Plugin (package) | AI Foundry project with deployable endpoint |
Scenario 4: Evaluation Pipeline for Marketplace Agents
Situation: Before promoting an agent from dev to prod, you need automated quality checks.
Build an evaluation pipeline
# azure-pipelines.yml or GitHub Actions
name: Agent Evaluation Pipeline
on:
pull_request:
paths:
- 'plugins/*/agents/**'
- 'plugins/*/skills/**'
jobs:
evaluate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run agent evaluations
run: |
python -m pytest evaluations/ \
--agent-dir plugins/$CHANGED_PLUGIN \
--eval-dataset test_cases.jsonl \
--metrics groundedness,relevance,coherence,safety \
--threshold 0.8
- name: Safety check
run: |
python evaluations/safety_check.py \
--agent-dir plugins/$CHANGED_PLUGIN \
--adversarial-dataset adversarial_prompts.jsonl \
--max-failures 0
Tips for Marketplace Authors Targeting Azure AI Foundry
Design for production from the start. Think about latency, cost, and reliability
Use Prompt Flow for complex workflows. It provides observability that raw code doesn't
Leverage evaluations. Automated quality gates prevent regressions
Security is non-negotiable. Use managed identities, VNET, and RBAC
Monitor costs. Production agents can be expensive; track token usage per agent
Version your agents. Use the same versioning as marketplace plugins (semver)
Start with a single model. Optimize model selection after you have evaluation data
The Full Lifecycle
Marketplace Plugin (dev)
|
v
Claude Code / Cursor (build & iterate)
|
v
GitHub Copilot (code with AI assistance)
|
v
Copilot Studio (business user testing)
|
v
Azure AI Foundry (production deployment)
|
v
Monitoring & Evaluation (continuous improvement)
|
v
Back to Marketplace (updated plugin)
Quick Setup Checklist
Create an Azure AI Foundry hub and project
Deploy a model (GPT-4o recommended for agents)
Translate marketplace skills into Prompt Flow nodes
Add marketplace rules as system prompt content
Configure tool connections (Azure AI Search, custom APIs)
Build evaluation dataset from real incident/test data
Run evaluations (groundedness, relevance, safety)
Deploy to managed endpoint with autoscale
Set up monitoring and alerting on the endpoint
Wire endpoint into your application or Teams bot
Mapping reference:
| Marketplace | Azure AI Foundry |
|---|---|
| Skill | Prompt Flow node / agent instructions |
| Agent | Prompt Flow DAG / Agent Service agent |
| Rule | System prompt + content safety filters |
| Hook | Azure Function trigger |
| MCP Server | Tool definition (search, code interpreter, custom) |
| Plugin | Project with deployable endpoint |
Series Conclusion
Across this 3-part series, we've covered:
Part 1: Why the marketplace pattern works for scaling AI across an enterprise
Part 2: How to build and extend a marketplace from the boilerplate
Part 3: How to integrate with every major AI platform
The key insight: you don't need to pick one tool. The marketplace is the abstraction layer that lets different teams use different tools while sharing the same capabilities, rules, and governance.
Start small. Seed with 3-5 high-value plugins. Let adoption drive what to build next. And remember: the hardest part isn't the technology. It's getting the first five contributors.
Good luck building your marketplace.



