Foundry | The .NET Blog

Microsoft Foundry April 2026: Foundry Local GA, GPT-5.5, CodeAct with Hyperlight

Emiliano Montesdeoca — Tue, 02 Jun 2026 00:00:00 +0000

A busy month for Microsoft Foundry. Here are the announcements that matter most.

Foundry Local Is Generally Available

Foundry Local — Microsoft’s cross-platform local AI runtime — graduates from preview to GA on Windows, macOS (Apple Silicon), and Linux x64. Production-ready local model inference with a developer-friendly SDK. The 1.1 release (detailed in a separate post) adds transcription, embeddings, and Responses API support.

GPT-5.5

The latest GPT-5 family model is now available in Foundry. Default quota for Tier 5 and Tier 6 subscriptions. If you’ve been working with earlier GPT-5 variants, this is worth evaluating for your use cases.

Agent Framework Tracing in Foundry

Two tracing features ship in preview this month:

Microsoft Agent Framework tracing — MAF agents can now emit OpenTelemetry traces into Foundry. Debug agent behavior, trace multi-step execution, surface latency and errors across tool calls. This fills a real gap: knowing what your agent actually did in production, not just what it returned.

Hosted-agent tracing — Sessions, tool calls, and run steps from hosted agents also surface in Foundry traces. Same observability story extended to the hosted tier.

CodeAct with Hyperlight (Alpha)

This is the most technically interesting addition: Agent Framework can now execute Python code inside Hyperlight micro-virtual machines.

CodeAct is the pattern where an agent generates and executes Python code as a tool. The obvious concern is security — you’re running model-generated code. Hyperlight’s micro-VMs provide process-level isolation with near-native startup time, making sandboxed code execution practical without the overhead of full containers or VMs.

For agentic workflows where code execution is necessary, this is a significant safety improvement over running code in the host process.

Agent Monitoring Dashboard (Preview)

A unified operations dashboard combining token usage, latency, run success rate, and evaluator scores in one view. The distinction from regular observability dashboards: it includes evaluation results alongside operational metrics, so you can correlate “the agent is slower” with “evaluator scores dropped” — or confirm they’re unrelated.

Continuous Evaluation Custom Evaluators (Preview)

You can now bring your own code-based or prompt-based evaluators into continuous evaluation pipelines. Previously, continuous eval was limited to built-in evaluators. Custom evaluators let you enforce team-specific quality criteria in your production monitoring loop.

Agent Inventory in Control Plane

The Foundry Control Plane Operate view now shows all supported agents across a subscription: Foundry agents, Azure SRE Agent, Logic Apps agent loops, and registered custom agents. One view to understand what’s deployed and where.

Original post: What’s new in Microsoft Foundry | April 2026

Your Local MAF Agent Just Got a Production Home

Emiliano Montesdeoca — Sat, 30 May 2026 00:00:00 +0000

Getting an agent to work locally is the fun part. The tricky part is everything that comes after: deploying it without losing your mind, managing sessions, setting up identity, wiring observability. Usually that means a lot of custom infrastructure glue.

Foundry Hosted Agents just removed most of that glue for Microsoft Agent Framework (MAF) users.

What Foundry Hosted Agents Actually Does

When you deploy a MAF agent to Foundry Hosted Agents, the platform handles a surprisingly long list of things you’d otherwise build yourself:

Scale to zero — your agent costs nothing while idle and spins back up automatically
Per-session VM-isolated sandboxes — every user session gets its own sandbox with filesystem persistence that survives scale-down events
Built-in Entra ID — each agent gets its own identity so it can call Foundry models, Toolbox, and Azure services without secrets baked into the image
Versioned deployments — every deployment is an immutable snapshot, with blue/green and canary rollout support
Zero-config observability — APPLICATIONINSIGHTS_CONNECTION_STRING is injected at runtime so MAF’s OpenTelemetry traces flow into App Insights automatically

That last one is genuinely nice. No extra wiring, no additional config. Traces just show up.

The Code Difference Is Tiny

This is what I appreciate most about this integration. You don’t rewrite your agent. You just wrap it:

In .NET:

using Microsoft.Agents.AI.Foundry.Hosting;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddFoundryResponses(agent);

var app = builder.Build();
app.MapFoundryResponses();

app.Run();

In Python:

server = ResponsesHostServer(agent)
server.run()

That’s it. The same logic you tested locally is what runs in production. The platform wraps it in the session management, identity, and scaling infrastructure.

Two Protocols, One Agent

Hosted Agents support two endpoint styles:

Responses (/responses) — OpenAI-compatible, manages conversation history and streaming. Good default for chat-shaped agents.
Invocations (/invocations) — you define the request/response schema. Good for non-conversational workflows.

If you’re building something that looks like a conversation, start with Responses. If you’re building an API-shaped agent that takes structured input and returns structured output, Invocations gives you the flexibility.

The Deployment Flow with `azd`

When you run azd up with a MAF agent:

Optionally creates a Foundry project and deploys a model
Packages your code and pushes an image to Azure Container Registry
Provisions compute from the ACR image
Assigns a dedicated Entra ID to the agent
Exposes a stable endpoint (https://{project_endpoint}/agents/{agent_name})
Handles everything else from that point on

Sessions persist for up to 30 days. Idle compute is deprovisioned after 15 minutes and restored transparently on the next request. From the agent’s perspective, nothing changed.

Wrapping Up

The distance between “working locally” and “running in production” has historically been long and painful for AI agents. Foundry Hosted Agents + MAF closes that gap significantly. If you already have a local agent built with Agent Framework, this is worth trying today.

The team says GA is coming soon — this is currently in preview. Check the MAF Hosted Agent Integration docs and the .NET samples to get started.

Original article: From Local to Production: Deploy Your Microsoft Agent Framework Agent with Foundry Hosted Agents

Foundry Local 1.1: Real-Time Transcription, Embeddings, and the Responses API

Emiliano Montesdeoca — Thu, 28 May 2026 00:00:00 +0000

Foundry Local 1.0 proved the concept: run AI models locally on Windows, macOS (Apple Silicon), and Linux x64 with a developer-friendly SDK. Version 1.1 adds three capabilities that cover a lot of real production use cases.

Live Audio Transcription

The most significant new feature: real-time speech-to-text streaming directly from the microphone. Captions, voice UIs, meeting transcription, accessibility tooling — all running locally with zero cloud dependency.

The API is session-based and streams results as they arrive, with is_final markers to distinguish interim from finalized text. Available across all language bindings: JavaScript, C#, Python, and Rust.

Load a streaming speech model from the catalog, create a session with audio settings (sample rate, channels, language), start it, push raw PCM audio chunks, and consume the async stream of results. The post has full Python and C# examples.

Text Embeddings

Semantic search, RAG pipelines, clustering, similarity matching — these all require embeddings. Foundry Local 1.1 adds embedding model support so you can generate vectors locally from the same SDK, without sending data to a cloud endpoint.

For applications where data residency matters or where you’re processing sensitive content, local embedding generation is a meaningful capability.

Responses API

Foundry Local now supports the Responses API — the structured interface designed for agentic interactions. This adds:

Tool calling — let locally-running models invoke tools you define
Multimodal vision-language input — pass image + text to vision-capable models
Compatible with the standard API shape, so existing agents targeting OpenAI’s Responses API work against local models

Package Size Improvements

Two changes reduce the JavaScript package size:

The koffi FFI layer has been replaced with a custom Node-API C addon
WebGPU execution provider ships as a separate plugin, so applications that don’t need GPU acceleration don’t pay the size cost

The C# SDK now targets lower framework versions for broader .NET compatibility.

Why This Matters

The three capabilities together — transcription, embeddings, tool calling — cover the core building blocks of many AI applications. Running them locally means:

No internet required
No per-token costs
No data leaving the machine
Consistent latency regardless of network conditions

Foundry Local is the right choice for edge scenarios, privacy-sensitive workloads, offline applications, or anything where you want to avoid cloud dependency during development.

Original post: Foundry Local 1.1: Live Transcription, Embeddings, and Responses API

GPT-5.5 Is Here and It's Coming to Azure Foundry — What .NET Developers Need to Know

Emiliano Montesdeoca — Sat, 25 Apr 2026 00:00:00 +0000

Microsoft just announced that GPT-5.5 is generally available in Microsoft Foundry. If you’ve been building agents on Azure, this is the update you’ve been waiting for.

Let me break down what actually changed and why it matters for developers building on this stack.

The GPT-5 progression

It helps to understand the arc. This isn’t just a version bump:

GPT-5: unified reasoning and speed into a single system
GPT-5.4: stronger multi-step reasoning, early agentic capabilities for enterprise use
GPT-5.5: deeper long-context reasoning, more reliable agentic execution, improved computer-use accuracy, better token efficiency

Each step has been deliberately aimed at production agentic workloads. GPT-5.5 continues that arc with a specific focus on sustained, high-stakes professional workflows — not just one-shot queries.

What’s actually different

Improved agentic coding: GPT-5.5 holds context across large codebases, can diagnose architectural-level failures, and anticipates downstream testing requirements. That last point is interesting — the model reasons about what else a fix affects before making a move. Less back-and-forth to get to a working result.

Token efficiency: Higher-quality outputs with fewer tokens and fewer retries. This translates directly to lower cost and latency for production deployments. If you’re running agents at scale, this compounds fast.

Long-context analysis: Handles extensive documents, codebases, and multi-session histories without losing the thread. For agentic workflows that maintain large working state, this matters.

There’s also a GPT-5.5 Pro variant for the most demanding enterprise workloads — deeper reasoning, higher cost.

Pricing

Model	Input ($/M tokens)	Cached Input	Output ($/M tokens)
GPT-5.5	$5.00	$0.50	$30.00
GPT-5.5 Pro	$30.00	$3.00	$180.00

GPT-5.5 is priced at the same input rate as GPT-5 but the token efficiency improvements mean you’re actually paying less per useful output. Worth running a benchmark on your specific workload before committing.

Why Foundry matters here

Access to a frontier model is just the starting point. What matters for .NET developers is how you operationalize it.

Foundry Agent Service lets you define agents in YAML or wire them up with Microsoft Agent Framework, GitHub Copilot SDK, LangGraph, or OpenAI Agents SDK — and run them as isolated hosted agents with:

A persistent filesystem
A distinct Microsoft Entra identity
Scale-to-zero pricing

One command to deploy. No infrastructure to manage. Your agents get GPT-5.5 as the model underneath.

Getting started

If you’re already using Azure AI Foundry, GPT-5.5 shows up as a new model option. Point your client at it and you’re done:

// C# — just update the model name
AIAgent agent = aiProjectClient
 .AsAIAgent("gpt-5.5", instructions: "You are a helpful assistant.", name: "MyAgent");

If you haven’t tried Foundry yet, ai.azure.com is where to start. The model catalog has a direct link to try GPT-5.5.

Wrapping up

GPT-5.5 is a real step forward for production agentic workloads. The combination of better long-context handling, improved agentic execution, and token efficiency makes it worth evaluating for anything you’re running at scale.

The frontier is moving fast. Keep building.

See the full announcement for the complete feature breakdown and enterprise details.

Foundry's RFT Just Got Cheaper and Smarter — Here's What Changed

Emiliano Montesdeoca — Sat, 18 Apr 2026 00:00:00 +0000

If you’re building .NET apps that rely on fine-tuned models, this month’s Foundry updates are worth paying attention to. Reinforcement Fine-Tuning just got more accessible and significantly cheaper.

The full details are in the official announcement, but here’s the practical breakdown.

Global Training for o4-mini

o4-mini is the go-to model for reasoning-heavy and agentic workloads. The big news: you can now launch fine-tuning jobs from 13+ Azure regions with lower per-token training rates compared to Standard training. Same infrastructure, same quality, broader reach.

If your team is spread across geographies, this matters. You’re no longer pinned to a handful of regions to train.

Here’s the REST API call to kick off a global training job:

curl -X POST "https://<your-resource>.openai.azure.com/openai/fine_tuning/jobs?api-version=2025-04-01-preview" \
 -H "Content-Type: application/json" \
 -H "api-key: $AZURE_OPENAI_API_KEY" \
 -d '{
 "model": "o4-mini",
 "training_file": "<your-training-file-id>",
 "method": {
 "type": "reinforcement",
 "reinforcement": {
 "grader": {
 "type": "string_check",
 "name": "answer-check",
 "input": "{{sample.output_text}}",
 "reference": "{{item.reference_answer}}",
 "operation": "eq"
 }
 }
 },
 "hyperparameters": {
 "n_epochs": 2,
 "compute_multiplier": 1.0
 },
 "trainingType": "globalstandard"
 }'

That trainingType: globalstandard flag is the key difference.

New Model Graders: GPT-4.1 Family

Graders define the reward signal your model optimizes against. Until now, model-based graders were limited to a smaller set of models. Now you get three new options: GPT-4.1, GPT-4.1-mini, and GPT-4.1-nano.

When should you reach for model graders instead of deterministic ones? When your task output is open-ended, when you need partial credit scoring across multiple dimensions, or when you’re building agentic workflows where tool-call correctness depends on semantic context.

Here’s the thing – the tiering strategy is practical:

GPT-4.1-nano for initial iterations. Low cost, fast feedback loops.
GPT-4.1-mini once your grading rubric is stable and you need higher fidelity.
GPT-4.1 for production grading or complex rubrics where every scoring decision counts.

You can even mix grader types in a single RFT job. Use string-match for the “correct answer” dimension and a model grader for evaluating reasoning quality. That flexibility is honestly what makes this useful for real workloads.

The RFT Data Format Gotcha

This trips people up. RFT data format is different from SFT. The last message in each row must be a User or Developer role – not Assistant. The expected answer goes in a top-level key like reference_answer that the grader references directly.

If you’ve been doing supervised fine-tuning and want to switch to RFT, you need to restructure your training data. Don’t skip this step or your jobs will fail silently.

Why This Matters for .NET Developers

If you’re calling fine-tuned models from your .NET apps through the Azure OpenAI SDK, cheaper training means you can iterate more aggressively. The model grader options mean you can fine-tune for nuanced tasks – not just exact-match scenarios. And the best practices guide on GitHub will save you real debugging time.

Start small. Ten to a hundred samples. Simple grader. Validate the loop. Then scale.

Connect Your MCP Servers on Azure Functions to Foundry Agents — Here's How

Emiliano Montesdeoca — Fri, 10 Apr 2026 00:00:00 +0000

Here’s something I love about the MCP ecosystem: you build your server once, and it works everywhere. VS Code, Visual Studio, Cursor, ChatGPT — every MCP client can discover and use your tools. Now, Microsoft is adding another consumer to that list: Foundry agents.

Lily Ma from the Azure SDK team published a practical guide on connecting MCP servers deployed to Azure Functions with Microsoft Foundry agents. If you already have an MCP server, this is pure value-add — no rebuilding required.

Why this combination makes sense

Azure Functions gives you scalable infrastructure, built-in auth, and serverless billing for hosting MCP servers. Microsoft Foundry gives you AI agents that can reason, plan, and take actions. Connecting the two means your custom tools — querying a database, calling a business API, running validation logic — become capabilities that enterprise AI agents can discover and use autonomously.

The key point: your MCP server stays the same. You’re just adding Foundry as another consumer. The same tools that work in your VS Code setup now power an AI agent your team or customers interact with.

Authentication options

This is where the post really adds value. Four auth methods depending on your scenario:

Method	Use Case
Key-based (default)	Development or servers without Entra auth
Microsoft Entra	Production with managed identities
OAuth identity passthrough	Production where each user authenticates individually
Unauthenticated	Dev/testing or public data only

For production, Microsoft Entra with agent identity is the recommended path. OAuth identity passthrough is for when user context matters — the agent prompts users to sign in, and each request carries the user’s own token.

Setting it up

The high-level flow:

Deploy your MCP server to Azure Functions — samples available for .NET, Python, TypeScript, and Java
Enable built-in MCP authentication on your function app
Get your endpoint URL — https://<FUNCTION_APP_NAME>.azurewebsites.net/runtime/webhooks/mcp
Add the MCP server as a tool in Foundry — navigate to your agent in the portal, add a new MCP tool, provide endpoint and credentials

Then test it in the Agent Builder playground by sending a prompt that would trigger one of your tools.

My take

The composability story here is getting really strong. Build your MCP server once in .NET (or Python, TypeScript, Java), deploy to Azure Functions, and every MCP-compatible client can use it — coding tools, chat apps, and now enterprise AI agents. That’s a “write once, use everywhere” pattern that actually works.

For .NET developers specifically, the Azure Functions MCP extension makes this straightforward. You define your tools as Azure Functions, deploy, and you’ve got a production-grade MCP server with all the security and scaling Azure Functions provides.

Wrapping up

If you have MCP tools running on Azure Functions, connecting them to Foundry agents is a quick win — your custom tools become enterprise AI capabilities with proper auth and no code changes to the server itself.

Read the full guide for step-by-step instructions on each authentication method, and check the detailed docs for production setups.

Microsoft Foundry March 2026 — GPT-5.4, Agent Service GA, and the SDK Refresh That Changes Everything

Emiliano Montesdeoca — Fri, 10 Apr 2026 00:00:00 +0000

The monthly “What’s New in Microsoft Foundry” posts are usually a mix of incremental improvements and the occasional headline feature. The March 2026 edition? It’s basically all headline features. Foundry Agent Service goes GA, GPT-5.4 ships for production, the SDK gets a major stable release, and Fireworks AI brings open model inference to Azure. Let me break down what matters for .NET developers.

Foundry Agent Service is production-ready

This is the big one. The next-gen agent runtime is generally available — built on the OpenAI Responses API, wire-compatible with OpenAI agents, and open to models from multiple providers. If you’re building with the Responses API today, migrating to Foundry adds enterprise security, private networking, Entra RBAC, full tracing, and evaluation on top of your existing agent logic.

from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import PromptAgentDefinition

project_client = AIProjectClient(
 endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
 credential=DefaultAzureCredential()
)

agent = project_client.agents.create_version(
 agent_name="my-enterprise-agent",
 definition=PromptAgentDefinition(
 model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
 instructions="You are a helpful assistant.",
 ),
)

Key additions: end-to-end private networking, MCP auth expansion (including OAuth passthrough), Voice Live preview for speech-to-speech agents, and hosted agents in 6 new regions.

GPT-5.4 — reliability over raw intelligence

GPT-5.4 isn’t about being smarter. It’s about being more reliable. Stronger reasoning over long interactions, better instruction adherence, fewer mid-workflow failures, and integrated computer use capabilities. For production agents, that reliability matters way more than benchmark scores.

Model	Pricing (per M tokens)	Best For
GPT-5.4 (≤272K)	$2.50 / $15 output	Production agents, coding, document workflows
GPT-5.4 Pro	$30 / $180 output	Deep analysis, scientific reasoning
GPT-5.4 Mini	Cost-effective	Classification, extraction, lightweight tool calls

The smart play is a routing strategy: GPT-5.4 Mini handles high-volume, low-latency work while GPT-5.4 takes the reasoning-heavy requests.

The SDK is finally stable

azure-ai-projects SDK shipped stable releases across all languages — Python 2.0.0, JS/TS 2.0.0, Java 2.0.0, and .NET 2.0.0 (April 1). The azure-ai-agents dependency is gone — everything lives under AIProjectClient. Install with pip install azure-ai-projects and the package bundles openai and azure-identity as direct dependencies.

For .NET developers, this means a single NuGet package for the full Foundry surface. No more juggling separate agent SDKs.

Fireworks AI brings open models to Azure

Perhaps the most architecturally interesting addition: Fireworks AI processing 13+ trillion tokens daily at ~180K requests/second, now available through Foundry. DeepSeek V3.2, gpt-oss-120b, Kimi K2.5, and MiniMax M2.5 at launch.

The real story is bring-your-own-weights — upload quantized or fine-tuned weights from anywhere without changing the serving stack. Deploy via serverless pay-per-token or provisioned throughput.

Other highlights

Phi-4 Reasoning Vision 15B — multimodal reasoning for charts, diagrams, and document layouts
Evaluations GA — out-of-the-box evaluators with continuous production monitoring piped into Azure Monitor
Priority Processing (Preview) — dedicated compute lane for latency-sensitive workloads
Voice Live — speech-to-speech runtime that connects directly to Foundry agents
Tracing GA — end-to-end agent trace inspection with sort and filter
PromptFlow deprecation — migration to Microsoft Framework Workflows by January 2027

Wrapping up

March 2026 is a turning point for Foundry. The Agent Service GA, stable SDKs across all languages, GPT-5.4 for reliable production agents, and open model inference via Fireworks AI — the platform is ready for serious workloads.

Read the full roundup and build your first agent to get started.

Azure DevOps MCP Server Lands in Microsoft Foundry: What This Means for Your AI Agents

Emiliano Montesdeoca — Thu, 26 Mar 2026 00:00:00 +0000

MCP (Model Context Protocol) has been having a moment. If you’ve been following the AI agent ecosystem, you’ve probably noticed MCP servers popping up everywhere — giving agents the ability to interact with external tools and services through a standardized protocol.

Now the Azure DevOps MCP Server is available in Microsoft Foundry, and this is one of those integrations that makes you think about the practical possibilities.

What’s actually happening here

Microsoft already released the Azure DevOps MCP Server as a public preview — that’s the MCP server itself. What’s new is the Foundry integration. You can now add the Azure DevOps MCP Server to your Foundry agents directly from the tool catalog.

For those not familiar with Foundry yet: it’s Microsoft’s unified platform for building and managing AI-powered applications and agents at scale. Model access, orchestration, evaluation, deployment — all in one place.

Setting it up

The setup is surprisingly straightforward:

In your Foundry agent, go to Add Tools > Catalog
Search for “Azure DevOps”
Select the Azure DevOps MCP Server (preview) and click Create
Enter your organization name and connect

That’s it. Your agent now has access to Azure DevOps tools.

Controlling what your agent can access

Here’s the part I appreciate: you’re not stuck with an all-or-nothing approach. You can specify which tools are available to your agent. So if you only want it to read work items but not touch pipelines, you can configure that. Principle of least privilege, applied to your AI agents.

This matters for enterprise scenarios where you don’t want an agent accidentally triggering a deployment pipeline because someone asked it to “help with the release.”

Why this is interesting for .NET teams

Think about what this enables in practice:

Sprint planning assistants — agents that can pull work items, analyze velocity data, and suggest sprint capacity
Code review bots — agents that understand your PR context because they can actually read your repos and linked work items
Incident response — agents that can create work items, query recent deployments, and correlate bugs with recent changes
Developer onboarding — “What should I work on?” gets a real answer backed by actual project data

For .NET teams already using Azure DevOps for their CI/CD pipelines and project management, having an AI agent that can actually interact with those systems directly is a significant step toward useful automation (not just chatbot-as-a-service).

The bigger MCP picture

This is part of a broader trend: MCP servers are becoming the standard way AI agents interact with the outside world. We’re seeing them for GitHub, Azure DevOps, databases, SaaS APIs — and Foundry is becoming the hub where these connections all come together.

If you’re building agents in the .NET ecosystem, MCP is worth paying attention to. The protocol is standardized, the tooling is maturing, and the Foundry integration makes it accessible without having to manually wire up server connections.

Wrapping up

The Azure DevOps MCP Server in Foundry is in preview, so expect it to evolve. But the core workflow is solid: connect, configure tool access, and let your agents work with your DevOps data. If you’re already in the Foundry ecosystem, this is a few clicks away. Give it a try and see what workflows you can build.

Check out the full announcement for the step-by-step setup and more details.

Foundry Agent Service is GA: What Actually Matters for .NET Agent Builders

Emiliano Montesdeoca — Thu, 26 Mar 2026 00:00:00 +0000

Let’s be honest — building an AI agent prototype is the easy part. The hard part is everything after: getting it into production with proper network isolation, running evaluations that actually mean something, handling compliance requirements, and not breaking things at 2 AM.

The Foundry Agent Service just went GA, and this release is laser-focused on that “everything after” gap.

Built on the Responses API

Here’s the headline: the next-gen Foundry Agent Service is built on the OpenAI Responses API. If you’re already building with that wire protocol, migrating to Foundry is minimal code changes. What you gain: enterprise security, private networking, Entra RBAC, full tracing, and evaluation — on top of your existing agent logic.

The architecture is intentionally open. You’re not locked to one model provider or one orchestration framework. Use DeepSeek for planning, OpenAI for generation, LangGraph for orchestration — the runtime handles the consistency layer.

from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import PromptAgentDefinition

with (
 DefaultAzureCredential() as credential,
 AIProjectClient(endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
 credential=credential) as project_client,
 project_client.get_openai_client() as openai_client,
):
 agent = project_client.agents.create_version(
 agent_name="my-enterprise-agent",
 definition=PromptAgentDefinition(
 model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
 instructions="You are a helpful assistant.",
 ),
 )

 conversation = openai_client.conversations.create()
 response = openai_client.responses.create(
 conversation=conversation.id,
 input="What are best practices for building AI agents?",
 extra_body={
 "agent_reference": {"name": agent.name, "type": "agent_reference"}
 },
 )
 print(response.output_text)

If you’re coming from the azure-ai-agents package, agents are now first-class operations on AIProjectClient in azure-ai-projects. Drop the standalone pin and use get_openai_client() to drive responses.

Private networking: the enterprise blocker removed

This is the feature that unblocks enterprise adoption. Foundry now supports full end-to-end private networking with BYO VNet:

No public egress — agent traffic never touches the public internet
Container/subnet injection into your network for local communication
Tool connectivity included — MCP servers, Azure AI Search, Fabric data agents all operate over private paths

That last point is critical. It’s not just inference calls that stay private — every tool invocation and retrieval call stays inside your network boundary too. For teams operating under data classification policies that prohibit external routing, this is what was missing.

MCP authentication done right

MCP server connections now support the full spectrum of auth patterns:

Auth method	When to use
Key-based	Simple shared access for org-wide internal tools
Entra Agent Identity	Service-to-service; the agent authenticates as itself
Entra Managed Identity	Per-project isolation; no credential management
OAuth Identity Passthrough	User-delegated access; agent acts on behalf of users

OAuth Identity Passthrough is the interesting one. When users need to grant an agent access to their personal data — their OneDrive, their Salesforce org, a SaaS API scoped by user — the agent acts on their behalf with standard OAuth flows. No shared system identity pretending to be everyone.

Voice Live: speech-to-speech without the plumbing

Adding voice to an agent used to mean stitching together STT, LLM, and TTS — three services, three latency hops, three billing surfaces, all synchronized by hand. Voice Live collapses that into a single managed API with:

Semantic voice activity and end-of-turn detection (understands meaning, not just silence)
Server-side noise suppression and echo cancellation
Barge-in support (users can interrupt mid-response)

Voice interactions go through the same agent runtime as text. Same evaluators, same traces, same cost visibility. For customer support, field service, or accessibility scenarios, this replaces what previously required a custom audio pipeline.

Evaluations: from checkbox to continuous monitoring

This is where Foundry gets serious about production quality. The evaluation system now has three layers:

Out-of-the-box evaluators — coherence, relevance, groundedness, retrieval quality, safety. Connect to a dataset or live traffic and get scores back.
Custom evaluators — encode your own business logic, tone standards, and domain-specific compliance rules.
Continuous evaluation — Foundry samples live production traffic, runs your evaluator suite, and surfaces results through dashboards. Set Azure Monitor alerts for when groundedness drops or safety thresholds breach.

Everything publishes to Azure Monitor Application Insights. Agent quality, infrastructure health, cost, and app telemetry — all in one place.

eval_object = openai_client.evals.create(
 name="Agent Quality Evaluation",
 data_source_config=DataSourceConfigCustom(
 type="custom",
 item_schema={
 "type": "object",
 "properties": {"query": {"type": "string"}},
 "required": ["query"],
 },
 include_sample_schema=True,
 ),
 testing_criteria=[
 {
 "type": "azure_ai_evaluator",
 "name": "fluency",
 "evaluator_name": "builtin.fluency",
 "initialization_parameters": {
 "deployment_name": os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"]
 },
 "data_mapping": {
 "query": "{{item.query}}",
 "response": "{{sample.output_text}}",
 },
 },
 ],
)

Six new regions for hosted agents

Hosted agents are now available in East US, North Central US, Sweden Central, Southeast Asia, Japan East, and more. This matters for data residency requirements and for compressing latency when your agent runs close to its data sources.

Why this matters for .NET developers

Even though the code samples in the GA announcement are Python-first, the underlying infrastructure is language-agnostic — and the .NET SDK for azure-ai-projects follows the same patterns. The Responses API, the evaluation framework, the private networking, the MCP auth — all of this is available from .NET.

If you’ve been waiting for AI agents to go from “cool demo” to “I can actually ship this at work,” this GA release is the signal. Private networking, proper auth, continuous evaluation, and production monitoring are the pieces that were missing.

Wrapping up

Foundry Agent Service is available now. Install the SDK, open the portal, and start building. The quickstart guide takes you from zero to a running agent in minutes.

For the full technical deep-dive with all code samples, check the GA announcement.

From Laptop to Production: Deploying AI Agents to Microsoft Foundry with Two Commands

Emiliano Montesdeoca — Thu, 26 Mar 2026 00:00:00 +0000

You know that gap between “it works on my machine” and “it’s deployed and serving traffic”? For AI agents, that gap has been painfully wide. You need to provision resources, deploy models, wire up identity, set up monitoring — and that’s before anyone can actually call your agent.

The Azure Developer CLI just made this a two-command affair.

The new `azd ai agent` workflow

Let me walk through what this actually looks like. You have an AI agent project — let’s say a hotel concierge agent. It works locally. You want it running on Microsoft Foundry.

azd ai agent init
azd up

That’s it. Two commands. azd ai agent init scaffolds the infrastructure-as-code in your repo, and azd up provisions everything on Azure and publishes your agent. You get a direct link to your agent in the Foundry portal.

What happens under the hood

The init command generates real, inspectable Bicep templates in your repo:

A Foundry Resource (top-level container)
A Foundry Project (where your agent lives)
Model deployment configuration (GPT-4o, etc.)
Managed identity with proper RBAC role assignments
azure.yaml for the service map
agent.yaml with agent metadata and environment variables

Here’s the key part: you own all of this. It’s versioned Bicep in your repo. You can inspect it, customize it, and commit it alongside your agent code. No magic black boxes.

The dev inner loop

What I really like is the local development story. When you’re iterating on agent logic, you don’t want to redeploy every time you change a prompt:

azd ai agent run

This starts your agent locally. Pair it with azd ai agent invoke to send test prompts, and you’ve got a tight feedback loop. Edit code, restart, invoke, repeat.

The invoke command is smart about routing too — when a local agent is running, it targets that automatically. When it’s not, it hits the remote endpoint.

Real-time monitoring

This is the feature that sold me. Once your agent is deployed:

azd ai agent monitor --follow

Every request and response flowing through your agent streams to your terminal in real time. For debugging production issues, this is invaluable. No digging through log analytics, no waiting for metrics to aggregate — you see what’s happening right now.

The full command set

Here’s the quick reference:

Command	What it does
`azd ai agent init`	Scaffold a Foundry agent project with IaC
`azd up`	Provision Azure resources and deploy the agent
`azd ai agent invoke`	Send prompts to the remote or local agent
`azd ai agent run`	Run the agent locally for development
`azd ai agent monitor`	Stream real-time logs from the published agent
`azd ai agent show`	Check agent health and status
`azd down`	Clean up all Azure resources

Why this matters for .NET developers

Even though the sample in the announcement is Python-based, the infrastructure story is language-agnostic. Your .NET agent gets the same Bicep scaffolding, the same managed identity setup, the same monitoring pipeline. And if you’re already using azd for your .NET Aspire apps or Azure deployments, this fits right into your existing workflow.

The deployment gap for AI agents has been one of the biggest friction points in the ecosystem. Going from a working prototype to a production endpoint with proper identity, networking, and monitoring shouldn’t require a week of DevOps work. Now it requires two commands and a few minutes.

Wrapping up

azd ai agent is available now. If you’ve been putting off deploying your AI agents because the infrastructure setup felt like too much work, give this a shot. Check out the full walkthrough for the complete step-by-step including frontend chat app integration.

Foundry | The .NET Blog

Microsoft Foundry April 2026: Foundry Local GA, GPT-5.5, CodeAct with Hyperlight

Foundry Local Is Generally Available

GPT-5.5

Agent Framework Tracing in Foundry

CodeAct with Hyperlight (Alpha)

Agent Monitoring Dashboard (Preview)

Continuous Evaluation Custom Evaluators (Preview)

Agent Inventory in Control Plane

Your Local MAF Agent Just Got a Production Home

What Foundry Hosted Agents Actually Does

The Code Difference Is Tiny

Two Protocols, One Agent

The Deployment Flow with azd

Wrapping Up

Foundry Local 1.1: Real-Time Transcription, Embeddings, and the Responses API

Live Audio Transcription

Text Embeddings

Responses API

Package Size Improvements

Why This Matters

GPT-5.5 Is Here and It's Coming to Azure Foundry — What .NET Developers Need to Know

The GPT-5 progression

What’s actually different

Pricing

Why Foundry matters here

Getting started

Wrapping up

Foundry's RFT Just Got Cheaper and Smarter — Here's What Changed

Global Training for o4-mini

New Model Graders: GPT-4.1 Family

The RFT Data Format Gotcha

Why This Matters for .NET Developers

Connect Your MCP Servers on Azure Functions to Foundry Agents — Here's How

Why this combination makes sense

Authentication options

Setting it up

My take

Wrapping up

Microsoft Foundry March 2026 — GPT-5.4, Agent Service GA, and the SDK Refresh That Changes Everything

Foundry Agent Service is production-ready

GPT-5.4 — reliability over raw intelligence

The SDK is finally stable

Fireworks AI brings open models to Azure

Other highlights

Wrapping up

Azure DevOps MCP Server Lands in Microsoft Foundry: What This Means for Your AI Agents

What’s actually happening here

Setting it up

Controlling what your agent can access

Why this is interesting for .NET teams

The bigger MCP picture

Wrapping up

Foundry Agent Service is GA: What Actually Matters for .NET Agent Builders

Built on the Responses API

Private networking: the enterprise blocker removed

MCP authentication done right

Voice Live: speech-to-speech without the plumbing

Evaluations: from checkbox to continuous monitoring

Six new regions for hosted agents

Why this matters for .NET developers

Wrapping up

From Laptop to Production: Deploying AI Agents to Microsoft Foundry with Two Commands

The new azd ai agent workflow

What happens under the hood

The dev inner loop

Real-time monitoring

The full command set

Why this matters for .NET developers

Wrapping up

The Deployment Flow with `azd`

The new `azd ai agent` workflow