Data residency for AI products: where does your data actually go


Joanna Larson
4 July 2026

An enterprise buyer asks you a question that sounds simple: where is our data actually processed and stored? For a traditional SaaS product, the answer is usually short: the cloud region your database sits in. For an AI product, the honest answer is considerably more complicated. Your data does not just sit somewhere; it passes through several places on its way to and from the AI model, and each of those places has its own residency answer. This article explains what data residency means for an AI product, where your data really goes, and what to check before you promise a buyer an answer you cannot back up.

Why this is harder for AI products than for typical software

For a conventional application, data residency usually means one thing: where your database and application servers physically run. You choose a cloud region, and that is largely the end of the story. An AI product has more moving parts, and residency has to be answered for each of them, not just once.

Think about what happens when your product makes a single AI-powered request. The prompt is sent somewhere for inference. It may be temporarily cached to speed up future requests. If you are using retrieval-augmented generation, relevant content is pulled from a vector database that lives somewhere. The response comes back and may be logged for monitoring or abuse detection. If you collect feedback for fine-tuning, that becomes another store of data, potentially in another location entirely. Each of these is a place your data rests, even briefly, and residency means being able to say where every one of them is, not just where your main application runs.
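One way to make this concrete is to write the request path down as data. The sketch below is illustrative only: the `Touchpoint` type, the component names, and the region labels are all hypothetical placeholders rather than any provider's real settings, and "unknown" marks a stage you have not yet verified.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Touchpoint:
    """One place a request's data rests, even briefly."""
    stage: str   # what happens to the data at this step
    system: str  # hypothetical component name
    region: str  # hypothetical region label; "unknown" until verified


# Illustrative values only: replace with what you have verified for your setup.
REQUEST_PATH = [
    Touchpoint("inference", "model-api", "eu"),
    Touchpoint("prompt cache", "model-api-cache", "unknown"),
    Touchpoint("retrieval", "vector-db", "eu"),
    Touchpoint("response logging", "monitoring", "us"),
    Touchpoint("feedback for fine-tuning", "feedback-store", "unknown"),
]


def stages_by_region(path):
    """Group stage names by the region where each one stores or processes data."""
    grouped = defaultdict(list)
    for tp in path:
        grouped[tp.region].append(tp.stage)
    return dict(grouped)


if __name__ == "__main__":
    for region, stages in stages_by_region(REQUEST_PATH).items():
        print(f"{region}: {', '.join(stages)}")
```

Grouping by region rather than by component makes the uncomfortable entries stand out: anything listed under "unknown" is a question you cannot yet answer for a buyer.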

Where your data actually goes, provider by provider

Here is the practical picture as it stands, though you should always verify current specifics directly with each provider before making a commitment to a buyer, because these settings and regional offerings change and expand regularly.

  • OpenAI. OpenAI offers data residency for eligible API customers, letting you choose to store and process data within Europe rather than in the default location. This is not automatic. It typically needs to be enabled at the account or project level, and eligibility and exact scope can vary, so check your specific setup rather than assuming residency is in place because OpenAI advertises it.
  • Anthropic. Anthropic offers EU data residency for enterprise API customers directly. Separately, Claude models are also available through cloud platforms such as AWS Bedrock and Google Vertex AI, which offer their own EU regional endpoints. These are different paths with different contractual relationships, so if you are running Claude through a cloud platform, your residency terms sit with that platform, not with Anthropic directly.
  • Azure OpenAI. Microsoft's Azure OpenAI Service runs OpenAI models within Azure's own EU regions. It is often regarded as the strongest European residency posture among the major routes to these models, because it is backed by Microsoft's broader EU data commitments, which cover most of what happens within the region.
  • AWS Bedrock and Google Vertex AI. Both offer EU regional endpoints for the models they host, though which specific models are available in which region can lag behind the US and varies by provider, so check availability for the exact model you use, not just the platform generally.

The consistent theme across all of these is that direct model provider APIs typically default to global routing, and getting a genuine regional or EU commitment usually requires you to actively request or configure it, rather than it being the default state the moment you sign up.

The mistake that catches startups out

The most common mistake is assuming that because a provider offers EU or UK residency somewhere in their documentation, your specific usage automatically has it. It almost never works that way. Residency is typically something you opt into, at a specific account, project, or endpoint level, and it needs to be checked and confirmed for your actual configuration, not assumed from a general claim on a provider's website.

The second common mistake is thinking about residency only in terms of where the model itself runs, and forgetting the other places data touches. A team might correctly pin their inference calls to a European endpoint and consider the job done, while a caching feature, a logging pipeline, or a fine-tuning dataset quietly continues to use a different, non-regional default elsewhere in the same platform. Read the residency documentation for each specific feature you use, not just the headline regional commitment for the platform as a whole, because exceptions are common and are not always obvious.
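The second mistake is easy to express as a check. The following is a minimal sketch, assuming you keep a per-feature record of regions yourself: the feature names, the region labels, and the `ALLOWED_REGIONS` commitment are hypothetical, and `None` stands for a setting nobody has verified yet.

```python
# Hypothetical commitment made to a buyer.
ALLOWED_REGIONS = {"eu"}

# Hypothetical per-feature settings; None means "not yet verified".
FEATURE_REGIONS = {
    "inference": "eu",
    "prompt caching": "global",
    "logging pipeline": "us",
    "fine-tuning dataset": None,
}


def residency_gaps(feature_regions, allowed):
    """Return features whose region is unverified or outside the allowed set."""
    gaps = {}
    for feature, region in feature_regions.items():
        if region is None:
            gaps[feature] = "unverified"
        elif region not in allowed:
            gaps[feature] = f"runs in {region}"
    return gaps


if __name__ == "__main__":
    for feature, problem in residency_gaps(FEATURE_REGIONS, ALLOWED_REGIONS).items():
        print(f"{feature}: {problem}")
```

Treating an unverified feature as a gap, rather than assuming it inherits the inference region, mirrors the point above: pinned inference says nothing about caching, logging, or fine-tuning data.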

What to actually check and document

If a buyer asks where their data goes, or if you simply want to know the honest answer yourself, work through this systematically.

  • List every AI provider and every distinct feature of each provider your product uses, since inference, caching, logging, and fine-tuning can each have separate residency behaviour within the same provider.
  • For each one, confirm the actual current residency setting for your account, not the general claim on the provider's marketing page.
  • Check whether residency is enabled by request or configuration rather than by default, and if so, confirm it has genuinely been switched on for you.
  • Document what you find, including the date you checked, since these settings and offerings change and your documentation should reflect your current actual state, not a one-time assumption.
  • If you use a vector database for retrieval, check its region separately, because it is a distinct store of data with its own residency answer, often overlooked because attention focuses on the model provider.
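The checklist above can be kept as a small dated record rather than a one-off note. This is a sketch under stated assumptions: the `ResidencyCheck` fields and the 90-day review window are illustrative choices rather than a standard, and the example values are placeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ResidencyCheck:
    """One verified (or not yet verified) residency fact for a provider feature."""
    provider: str           # placeholder provider name
    feature: str            # e.g. inference, caching, logging, vector store
    region: str             # region as confirmed for your own account
    opt_in_required: bool   # residency must be requested or configured
    opt_in_confirmed: bool  # you have confirmed it is switched on for you
    checked_on: date        # when you last verified this entry


def needs_review(check, today, max_age_days=90):
    """True if a required opt-in is unconfirmed or the entry is older than max_age_days.

    The 90-day default is an arbitrary illustration, not a regulatory figure.
    """
    if check.opt_in_required and not check.opt_in_confirmed:
        return True
    return today - check.checked_on > timedelta(days=max_age_days)


if __name__ == "__main__":
    today = date(2026, 7, 4)
    register = [
        ResidencyCheck("example-model-api", "inference", "eu", True, True, date(2026, 6, 1)),
        ResidencyCheck("example-model-api", "logging", "eu", True, False, date(2026, 6, 1)),
        ResidencyCheck("example-vector-db", "retrieval", "eu", False, False, date(2026, 1, 10)),
    ]
    for entry in register:
        status = "REVIEW" if needs_review(entry, today) else "ok"
        print(f"{entry.provider} / {entry.feature}: {status}")
```

Storing the check date alongside each entry is what lets you tell a buyer not only where the data goes but how recently you confirmed it.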

Why this matters beyond the technical answer

For many enterprise buyers, particularly those in regulated sectors or with EU operations, a vague or overconfident answer about data residency is a serious red flag. The topic is genuinely complicated, and a confident-sounding wrong answer is worse than an honest, uncertain one followed by a promise to confirm. Buyers who ask this question are often testing whether you understand your own architecture, not just looking for a specific region name.

This is also increasingly a regulatory question, not just a contractual preference. For AI systems classed as high-risk under the EU AI Act, and for data protection obligations more broadly, being able to demonstrate where personal data is actually processed and stored is becoming a genuine compliance requirement, not simply something a cautious buyer happens to ask about.

The honest takeaway

Data residency for an AI product is not a single answer. It is a map of the several places your data touches, each with its own configuration, and each capable of quietly defaulting somewhere you did not intend. Do not promise a buyer a residency commitment you have not verified for your specific setup. Check each provider, each feature within that provider, and your own vector store and logging pipeline. Document what you find, and revisit it periodically, because these offerings expand and change.

Get this right and you can answer one of the more technical questions in a security review with genuine confidence, rather than a guess dressed up as certainty.

