Using Microsoft Entra ID To Authenticate With Model Context Protocol Servers
Quick context #
Let’s talk about AI, but not in the context you think. Specifically, let’s talk about authentication for Model Context Protocol (MCP) servers. You can see some of them in the official documentation.
While this blog post is not designed to be a crash course in MCP or MCP servers, what you need to know is that MCP is nothing other than an attempt to create a standard protocol through which AI models can be given richer context for a user’s request from external sources (e.g., cloud providers, source code repositories, and more). Anthropic announced it in November of last year.
The desired auth flow #
Not too long ago, Anthropic put forward a draft specification that outlines how authentication and authorization work in the context of MCP. It’s pretty vanilla in terms of what you’d expect from an OAuth-based implementation, but it gets a bit trickier when we try to integrate Microsoft Entra ID into it.
There are two stages here, in my eyes:
- Authenticate the client with the server. For now, let’s assume that I am not doing any custom authorization on the server. I just want the client to get an Entra ID token, pass it to the server (which will validate it), and get the user on their merry way.
- The server can use the token to do an on-behalf-of token exchange. This will enable me to then have the server talk to other downstream APIs, well, on behalf of the user that authenticated with the client.
Sounds fairly straightforward. If you’re a fan of sequence diagrams, the flow I wanted to get to is this:
So far, nothing too unusual. But you see, the expectations from an MCP server are ever so slightly different than your typical OAuth identity provider (IdP). One such thing is dynamic client registration. And while it’s not a requirement per spec, there are caveats about how client registration should be done. In our case, and if you’re at least somewhat familiar with Entra ID, you might already know that client registration is nothing other than app registration. The problem is that we can’t do it dynamically if we are unauthenticated (or don’t have admin permissions, which we can’t assume we’ll always have).
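To make that expectation concrete, dynamic client registration (RFC 7591) boils down to the client making an unauthenticated POST against a registration endpoint and getting a client ID back. Here’s a hedged sketch of what an MCP client conceptually expects; the endpoint path and metadata are illustrative, and Entra ID has no unauthenticated equivalent:

```typescript
// Illustrative only: a dynamic client registration request per RFC 7591.
// Entra ID does not expose an unauthenticated endpoint like this, which is
// exactly the gap described above.
const registrationResponse = await fetch("http://localhost:3001/register", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    client_name: "Example MCP client",
    redirect_uris: ["http://localhost:8090/callback"],
    token_endpoint_auth_method: "none", // public client, no secret
  }),
});

const clientInfo = await registrationResponse.json();
console.log("Registered client:", clientInfo.client_id);
```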
Given the current TypeScript SDK-based implementation, we have a way to make it work regardless. Let’s take a closer peek at my tinkering.
The two clients #
If you’ve used Entra ID before, you likely already know that there are two types of clients that we can deal with - public clients, which cannot safely store secrets, and confidential clients, which can. You can read more about it in the Entra ID documentation.
A public client is a desktop, mobile, or web frontend - putting anything truly secret there is less than ideal because such clients can be reverse-engineered, or they can encounter people like me who are pretty liberal with their use of Fiddler to inspect local machine traffic. If a “secret” is, at any point, used to issue a request to an IdP, there’s a chance it can be captured.
A confidential client, on the other hand, typically runs on the server or in some kind of protected context. No user should have access to that server or the code that runs on it, therefore the application could save secrets there and keep them safer than on a local client. I say “safer” because having any secret credentials is inherently unsafe and if you use those, you should probably move to something like managed identities (which now supports federation, by the way). But I digress.
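To make the distinction concrete, here’s a minimal MSAL Node sketch of the two client types; the IDs, tenant, and secret below are placeholders rather than values from the demo:

```typescript
import {
  PublicClientApplication,
  ConfidentialClientApplication,
} from "@azure/msal-node";

// Public client: no secret. Think desktop apps, CLIs, or anything a user can
// poke at with Fiddler - tokens are obtained interactively with PKCE.
const publicClient = new PublicClientApplication({
  auth: {
    clientId: "<public-client-app-id>",
    authority: "https://login.microsoftonline.com/<tenant-id>",
  },
});

// Confidential client: holds a secret (or better, a certificate or managed
// identity) and runs in a protected, server-side context.
const confidentialClient = new ConfidentialClientApplication({
  auth: {
    clientId: "<confidential-client-app-id>",
    authority: "https://login.microsoftonline.com/<tenant-id>",
    clientSecret: process.env.CLIENT_SECRET!, // never ship this to a client device
  },
});
```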
With the introduction of the auth spec for MCP, we clearly see that there is an expectation to let clients authenticate with MCP servers. That is - if there is a server running somewhere (locally or otherwise), we don’t want to have its tools or resources be available to anyone that can craft a Server-Sent Events (SSE) request. Hence, the need for client-to-server authentication. Once you authenticate with the MCP server, you can have a myriad of ways for the server itself to talk to other resources, such as third-party APIs (say, the Dropboxes and Stripes of the world).
So, what we have in play in our scenario is two classes of clients. First, the MCP client (be it Claude Desktop, Cursor, or any other tool) is for all intents and purposes a public client. The MCP server can be a public client too if it runs locally on the box. It can also act as a confidential client while on the box, given that many APIs generate a client ID and client secret and don’t support public client scenarios. But what it can’t be is a public client when it runs remotely.
Let’s say you create an MCP server that runs somewhere in the cloud. If you want to authenticate a user in, you can’t have that user log into the server, run some auth processes, and then get back to their own machine with a token. Instead, the server would rely on the client to somehow provide it the credentials it needs. The client then needs to trigger some kind of authentication flow and go through the “dance” to get the right artifacts. But how?
Unenlightened clients #
The challenge here is that we do not have access to the client code. Typically, if you use Entra ID (and this entire blog post assumes that you do) and you are building a public client application, you can use a library like MSAL to request a token on the client. The library can use all sorts of fancy bells and whistles, like relying on authentication brokers, to produce an access token that can then be passed to the server (i.e., an API) for whatever that server needs it for.

I think I see where this is going. If you want to generate an access token on the client, and you don’t have access to the client code, that means that we can’t integrate MSAL on the client.
Precisely, and that’s what I refer to as “unenlightened clients” - clients that have no concept of Entra ID or the libraries that can provide user authentication capabilities. So, we need to think differently. The MCP TypeScript SDK introduces the OAuthServerProvider
construct, which can be implemented as an OAuth proxy.
Instead of the client trying to handle all the public client flows, it asks the server - “Hey, can you please tell the identity provider that there is a user trying to authenticate? Ask them to give me a URL I can send the user to enter their credentials.” I’ve implemented a very simple technical demonstration on how that can be done.
Check out the code.

There are a few components worth calling out here:

- `EntraIdServerAuthProvider` - implements the aforementioned interface in a way that allows us to “proxy” requests from the client to Entra ID.
- `EntraIdAuthRouter` - responsible for establishing the basic endpoints for the authorization logic. It effectively re-implements the `mcpAuthRouter` because I needed to ensure that the metadata document correctly accounts for more complex issuer URLs. This is technically not needed if you are using `localhost` as the proxy, since we’re not really tweaking the metadata document much.
- `EntraIdTokenHandler` - a token handler that is a re-implementation of the built-in token handler. I needed to remove the local PKCE verification because it’s already done by Entra ID when we exchange the auth code for an access token.
- `CustomBearerMiddleware` - I needed to make some tweaks to how token expiration is handled - in the current SDK implementation, an expiry time of `0` would let the request go through, even though it should throw a `401 Unauthenticated`. This fixes the gap.
- `ClientWithVerifier` - extends `OAuthClientInformationFull` with a `verifier`, which we need to exchange the auth code for an access token.
The rest of the server is your typical MCP server scaffolding - I tried to keep it as minimal as possible. It has only one tool - for the authenticated user, get the information about them from Microsoft Graph.
Digging through the code #
When a client knows that it needs to get new credentials, it will invoke the `/authorize` endpoint that is exposed by the MCP server through its OIDC metadata document (just like we have in Entra ID). When that endpoint is invoked, the auth router on the server will, in turn, invoke the `authorize` function from the provider, which is responsible for either kicking off the first step in the flow or taking the user to a third-party IdP to complete the process.
In our case, we’re sending users to Entra ID, and the implementation looks like this:
```typescript
/**
 * Authorizes a client and redirects to Entra ID login
 * @param client - Client information
 * @param params - Authorization parameters
 * @param res - Express response object
 */
async authorize(client: OAuthClientInformationFull, params: AuthorizationParams, res: Response): Promise<void> {
  console.log("Authorizing client ", client.client_id);

  try {
    const redirectUri = client.redirect_uris[0] as string;
    const redirectUrl = new URL(redirectUri);

    if (redirectUrl.hostname !== 'localhost' && redirectUrl.hostname !== '127.0.0.1') {
      throw new Error(`Invalid redirect URI: ${redirectUri}. Only localhost redirects are allowed.`);
    }

    const codeChallenge = params.codeChallenge as string;
    const codeChallengeMethod = 'S256';

    if (!this._config.clientId) {
      res.status(400).send("Missing client ID configuration");
      return;
    }

    this._msalClient.getAuthCodeUrl({
      scopes: this._config.scopes,
      redirectUri: redirectUri,
      responseMode: "query",
      codeChallenge: codeChallenge,
      codeChallengeMethod: codeChallengeMethod
    })
      .then(authUrl => {
        res.redirect(authUrl);
      })
      .catch(error => {
        console.error("Error generating auth URL:", error);
        res.status(500).send("Authentication error occurred");
      });
  } catch (error) {
    console.error("Authorization setup error:", error);
    res.status(500).send("Failed to initialize authentication");
  }
}
```
The logic here is - take the information from the client, like its redirect URL and the code challenge (part of PKCE), and have the server, using its registered public client Entra ID application, construct the auth code URL with `getAuthCodeUrl` from MSAL Node. Then send that full URL to the client.
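For reference, the URL that `getAuthCodeUrl` produces is just the standard Entra ID authorization endpoint with the client’s details stitched in - roughly like this, with placeholder values:

```typescript
// Roughly what the proxy hands back to the MCP client (placeholder values):
const exampleAuthUrl =
  "https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/authorize" +
  "?client_id=<public-client-app-id>" +
  "&response_type=code" +
  "&response_mode=query" +
  "&redirect_uri=http%3A%2F%2Flocalhost%3A8090%2Fcallback" +
  "&scope=<configured-scopes>" +
  "&code_challenge=<challenge-from-the-mcp-client>" +
  "&code_challenge_method=S256";
```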

Ah, I see - so in this case the server is… kind of being the public client? It’s doing the work of getting the data from the client and then “channeling” the steps for the client, because we can’t really have the client do the proper OAuth dance natively?
Indeed! That is because, remember, we have two hops here that we need to take care of:
- For the first hop (`Client` to `Server`) we want to authenticate with user credentials. That is, we’re protecting the server so that unauthenticated users can’t just issue arbitrary requests for data.
- The second hop (`Server` to `Resource`) is responsible for then talking to a protected API - whether it’s with a derivative of the Entra ID credentials or not isn’t really important, but on that second hop the server can act with a different client registration than in the first hop. Separation of client responsibilities.
Long story longer, we effectively have a “fake public client,” but for the purposes of this demo it’s OK. The `authorize` call will produce the URL we need to authenticate the user and pass it to the actual client; the client will authenticate the user and pass the required artifacts (auth code and state) back to the server, and the server will exchange them for an access token. Which brings me to `exchangeAuthorizationCode`:
```typescript
/**
 * Exchanges an authorization code for a bearer access token. The server in this
 * context does not cache the tokens in any capacity, but rather gives that responsibility
 * to the client, who will request a new token when needed.
 * @param client - Client with verifier
 * @param authorizationCode - The authorization code
 * @returns Promise with OAuth tokens
 */
async exchangeAuthorizationCode(client: ClientWithVerifier, authorizationCode: string): Promise<OAuthTokens> {
  try {
    const redirectUri = client.redirect_uris[0] as string;
    const redirectUrl = new URL(redirectUri);

    if (redirectUrl.hostname !== 'localhost' && redirectUrl.hostname !== '127.0.0.1') {
      throw new Error(`Invalid redirect URI: ${redirectUri}. Only localhost redirects are allowed.`);
    }

    const tokenResponse = await this._msalClient.acquireTokenByCode({
      code: authorizationCode,
      scopes: this._config.scopes,
      redirectUri: redirectUri,
      codeVerifier: client.verifier,
    });

    if (!tokenResponse) {
      throw new Error("Failed to acquire token");
    }

    // Return the tokens in the format expected by OAuthTokens
    return {
      access_token: tokenResponse.accessToken,
      token_type: 'Bearer',
      expires_in: tokenResponse.expiresOn ?
        Math.floor((tokenResponse.expiresOn.getTime() - Date.now()) / 1000) :
        3600, // Default to 1 hour if expiration is not provided
      scope: tokenResponse.scopes.join(' ')
    };
  } catch (error) {
    console.error("Error exchanging authorization code for tokens:", error);
    throw new Error(`Failed to exchange authorization code: ${error instanceof Error ? error.message : String(error)}`);
  }
}
```
This part of the flow exchanges the auth code for an actual token with the help of MSAL Node once again - `acquireTokenByCode` does the work for us, and we now have an access token that we can get to the client. This token can now be used by the client when it talks to the MCP server, because when we instantiate the server, we’re using `requireBearerAuth`. This automatically ensures that all requests coming from the client are authenticated, and if they are not, the flow falls back to `authorize` on the server:
```typescript
app.get("/sse", requireBearerAuth({
  provider,
  requiredScopes: ["forerunner.mcp.act"]
}), async (req, res) => {
  console.log("Received connection");

  transport = new SSEServerTransport("/message", res);
  await server.connect(transport);

  server.onclose = async () => {
    await cleanup();
    await server.close();
    process.exit(0);
  };
});
```
The `exchangeAuthorizationCode` function shown earlier is also a tweak on the built-in implementation of that logic in the MCP TypeScript SDK. I mentioned above that I created an entity called `ClientWithVerifier` that is responsible for keeping the verifier blob with us when the call to exchange the auth code for the token is made. Injecting the verifier is important because without it we can’t acquire a token by code - this is not yet implemented in the “stock” interface.
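For completeness, the shape of that entity is tiny. A sketch of what it could look like is below; the exact definition and import path live in the demo repo and the SDK, so treat this as illustrative:

```typescript
// Illustrative sketch - the SDK import path may differ between versions.
import type { OAuthClientInformationFull } from "@modelcontextprotocol/sdk/shared/auth.js";

type ClientWithVerifier = OAuthClientInformationFull & {
  /** PKCE code verifier captured during /authorize, required by acquireTokenByCode. */
  verifier: string;
};
```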

OK, but hold on a second - looking at the code, and just to make sure I understand - the client is sending all requests to the server instead of to Entra ID? When the client invokes the /authorize
endpoint, it’s talking to the MCP server, not to login.microsoftonline.com
?
Yes, that’s right! You can take a peek at the configuration to see that the server is used as a proxy:
```typescript
app.use(entraIdAuthRouter({
  provider: provider,
  issuerUrl: new URL('http://localhost:3001'),
  serviceDocumentationUrl: new URL('https://den.dev'),
  authorizationOptions: {},
  tokenOptions: {}
}));
```
Notice that I am using an entraIdAuthRouter
(a custom variant of the default mcpAuthRouter
), as I mentioned above. It’s responsible for generating the OIDC metadata document that will point clients to the right endpoints. The challenge with pointing directly to Entra ID is that it doesn’t support the things that the MCP server is expected to support, like dynamic client registration. So, if we just use the Entra ID endpoints, it won’t work.
This led me to the current implementation, where the server acts as a proxy that merely “ferries” the requests but behind the scenes uses Entra ID. The implementation that I have in the demo is barebones - it does no caching and has no ability to refresh tokens - but it shows how this could be implemented in the future.
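To make the proxying a bit more tangible, the metadata document served by the router ends up pointing everything at the local server instead of at Entra ID - roughly along these lines (field names follow RFC 8414; the values are placeholders for the demo setup, not copied from the repo):

```typescript
// Illustrative shape of the authorization server metadata the proxy could
// serve at /.well-known/oauth-authorization-server for issuer localhost:3001.
const exampleMetadata = {
  issuer: "http://localhost:3001",
  authorization_endpoint: "http://localhost:3001/authorize", // proxied to Entra ID
  token_endpoint: "http://localhost:3001/token",             // exchanges the code via MSAL
  registration_endpoint: "http://localhost:3001/register",   // dynamic client registration
  response_types_supported: ["code"],
  code_challenge_methods_supported: ["S256"],
  service_documentation: "https://den.dev",
};
```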
Testing the flow #
There’s quite a bit of code in the repository that I highly recommend you step through and see how it works. A blog post is not enough to cover it all. For now, though, let’s try and test things out. To do that, I will start the server with npm run start
. If all is well, we should see a confirmation that it’s running on port 3001.

Yep, that sure enough looks like a running server. Next, let’s try to use MCP Inspector, an open source tool that was designed to help debug MCP servers. What’s great about MCP Inspector is that it already supports OAuth flows between client and server, something no other MCP adopter, like Claude Desktop or Cursor, can say at the time of this blog post.

Erm… Doesn’t this kind of defeat the purpose of this whole effort if no other MCP client supports auth? What’s the use of this functionality?
The auth spec and its implementation through the TypeScript SDK are in their very early stages, hence the lack of client adoption. It will get there, just not yet. That being said, let’s try to test what we can with MCP Inspector. To use the tool, I can run `npx @modelcontextprotocol/inspector`, which will automatically install and activate the full-stack app. I can then open the target URL in the browser and connect to our very own MCP server SSE endpoint (`http://localhost:3001/sse`). When I initiate the connection, the authentication flow should be triggered automatically.

Boom! I managed to use Microsoft Entra ID for user authentication and invoked a tool that talked to Microsoft Graph with an OBO token. Not bad for a day’s work!

I see you keep talking about SSE and its implementation on the server. But we know that MCP supports more than just SSE. What about other transports, like `stdio`? Do they support authentication too?
Unfortunately, at this time, only the HTTP+SSE transport supports OAuth-based authentication. For other transports, more creative means might be necessary, and those are way outside the scope of this write-up.
Token verification #
An important part of the flow that we should also take a closer look at is the one responsible for token verification. A security best practice is to always ensure that the API verifies inbound tokens - we can’t assume that what we get from the client is a valid token destined for the API. Assuming that we’re getting a proper JSON Web Token (JWT) from Entra ID, verification can be implemented through `verifyAccessToken` in the auth provider, like this (this function is part of the standard interface):
```typescript
/**
 * Verifies an access token and returns authentication information.
 * This method is invoked in the context of the bearerAuth infra inside
 * the auth middleware. The middleware gets an AuthInfo object back and then
 * checks if all required scopes are included or the token has expired. It
 * assumes that the bulk of validation (beyond that) happens here.
 * @param token - The access token to verify
 * @returns Promise with authentication information
 */
async verifyAccessToken(token: string): Promise<AuthInfo> {
  try {
    const decodedToken = jwt.decode(token, { complete: true });

    if (!decodedToken || typeof decodedToken === 'string' || !decodedToken.header.kid) {
      throw new Error('Invalid token format');
    }

    const payload = decodedToken.payload as jwt.JwtPayload;

    const openIdConfigUrl = `https://login.microsoftonline.com/${this._config.tenantId}/v2.0/.well-known/openid-configuration`;
    const openIdConfigResponse = await fetch(openIdConfigUrl);
    const openIdConfigData = await openIdConfigResponse.json();
    const jwksUri = openIdConfigData.jwks_uri;

    const keyClient = jwksClient({
      jwksUri: jwksUri,
      cache: true,
      cacheMaxEntries: 5,
      cacheMaxAge: 600000 // 10 minutes
    });

    const getSigningKey = (kid: string): Promise<jwksClient.SigningKey> => {
      return new Promise((resolve, reject) => {
        keyClient.getSigningKey(kid, (err, key) => {
          if (err) {
            reject(err);
            return;
          }
          if (!key) {
            reject(new Error('Signing key not found'));
            return;
          }
          resolve(key);
        });
      });
    };

    const key = await getSigningKey(decodedToken.header.kid);
    const signingKey = key.getPublicKey();

    jwt.verify(token, signingKey, {
      audience: this._config.apiClientId,
      issuer: `https://login.microsoftonline.com/${this._config.tenantId}/v2.0`
    });

    return {
      clientId: Array.isArray(payload.aud) ? payload.aud[0] : payload.aud || '',
      token: token,
      expiresAt: payload.exp || 0,
      scopes: (payload.scp || '').split(' ').filter(Boolean),
    };
  } catch (error) {
    console.error('Token processing failed:', error);

    // Given that inside the middleware the failure occurs if expiration is less than
    // now, we can return an object with whatever is in the token and set the expiration
    // to 0. This will cause the middleware to fail the check and return a 401, which
    // should restart the authentication flow.
    return {
      clientId: '',
      token: token,
      expiresAt: 0,
      scopes: ["forerunner.mcp.act"],
    };
  }
}
```
Let’s break down the steps:
- The token is decoded, along with the complete header.
- We check if the decoded token is valid and contains the `kid` (key ID) property in the header.
- If the token format is invalid, throw an error. No reason to continue.
- Construct the OpenID configuration URL using the tenant ID from the server configuration. We only work with Entra ID tokens here, so the format is predictable.
- Fetch the OpenID configuration data from the constructed URL.
- Extract the JWKS (JSON Web Key Set) URI from the OpenID configuration data.
- Configure a JWKS client with the JWKS URI and caching settings.
- Fetch the signing key with the `kid` from the decoded token.
- Extract the public key from the signing key.
- Verify the token with the public key, audience, and issuer from the configuration.
If all is well, the function will return an `AuthInfo` object (as expected by the MCP client) containing:
- `clientId` - the client ID from the token payload.
- `token` - the original token.
- `expiresAt` - the expiration time from the token payload.
- `scopes` - the scopes from the token payload, split into an array.
The fallback `AuthInfo` object with `expiresAt` set to `0` (an expiration time of `0` means the token is immediately expired, not indefinitely valid) will cause the middleware to fail the check and return an HTTP `401 Unauthenticated` status, prompting a restart of the authentication flow.
You can, of course, add additional logic to inspect other token claims, but this is outside the scope of this blog post (you can spot the trend - a lot of complexity is not covered in this sample).
Bootstrapping on-behalf-of inside tools #
I glossed over a pretty big thing above, though, and that is bootstrapping the server to do on-behalf-of (OBO) token exchange. When the MCP client talks to the MCP server, it includes the token we got from Entra ID in your run-of-the-mill `Authorization` header. That header is used for the `/sse` and `/message` endpoints but is not necessarily passed through to the tools that might need it. To fix that, I implemented a bit of a hacky solution:
```typescript
app.post("/message", requireBearerAuth({
  provider,
  requiredScopes: ["forerunner.mcp.act"]
}), async (req, res) => {
  console.log("Received message");

  const authHeader = req.headers.authorization;
  const token = authHeader?.split(' ')[1];

  const rawBody = await getRawBody(req, {
    limit: '1mb',
    encoding: 'utf-8'
  });

  const messageBody = JSON.parse(rawBody.toString());

  if (!messageBody.params) {
    messageBody.params = {};
  }
  messageBody.params.context = { token };

  await transport.handlePostMessage(req, res, messageBody);
});
```
We grab the request inside `/message`, extract the auth token, and then reconstruct the body that is expected downstream along with a new `params.context.token` property that would otherwise not exist. You must use `params` to tweak the outbound JSON because using any other property at another nesting level will just break the process, but `params` can be as flexible as you need.
Then, inside the tool implementation, I can just access the request parameters:
```typescript
const context = request.params?.context as { token?: string } | undefined;
const token = context?.token;
```
With this token in hand, it’s now possible to do proper OBO exchanges and invoke other APIs:
```typescript
const oboRequest = {
  oboAssertion: token,
  scopes: ["User.Read"],
};

const response = await confidentialClient.acquireTokenOnBehalfOf(oboRequest);

if (!response?.accessToken) {
  throw new Error("Failed to acquire token for Microsoft Graph");
}

const graphClient = Client.init({
  authProvider: (done) => {
    done(null, response.accessToken);
  }
});

const user = await graphClient
  .api('/me')
  .select('displayName,mail,userPrincipalName')
  .get();
```
With a bit of tweaking, I can make this code more flexible and reusable, but the encouraging part is that the extensibility point does exist for passing additional metadata into tool invocations.
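For example, one way to make it more reusable would be to wrap the exchange in a small helper that every tool can share - a rough sketch that is not part of the demo, where `confidentialClient` is the MSAL confidential client the server already holds:

```typescript
import { ConfidentialClientApplication } from "@azure/msal-node";

// Hypothetical helper, not in the demo repo: exchange the user's token for a
// downstream access token via OBO. Caching and refresh are deliberately left
// out, just like in the rest of the sample.
async function getDownstreamToken(
  confidentialClient: ConfidentialClientApplication,
  userToken: string,
  scopes: string[]
): Promise<string> {
  const result = await confidentialClient.acquireTokenOnBehalfOf({
    oboAssertion: userToken,
    scopes,
  });

  if (!result?.accessToken) {
    throw new Error(`OBO exchange failed for scopes: ${scopes.join(", ")}`);
  }

  return result.accessToken;
}

// Usage inside a tool, with the token pulled from request.params.context as shown above:
// const graphToken = await getDownstreamToken(confidentialClient, token, ["User.Read"]);
```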
Conclusion #
Given that the current SDK and the specification are in the early design stages, there are a few rough edges, but it works, and it works well enough to get Entra ID plugged in without major issues. There’s a wrench I will throw into your plans if your organization uses special policies (e.g., Conditional Access or token binding requirements) - the flow that I outlined above doesn’t quite vibe with those integrations, as they require the client to be compliant and to use advanced capabilities such as authentication brokers. I’ve already kickstarted this conversation with the Entra ID team to see what recommendations there may be for customers as more of them adopt credential protection features (which is the future, anyway).