GDPR and the OpenAI API: what UK AI startups actually need to do


Joanna Larson
8 min read
18 June 2026

If you are a UK AI startup sending data to the OpenAI API, you have probably asked yourself whether you are GDPR compliant, and found that most of the answers online are written by lawyers in the abstract. They tell you that you need a lawful basis and appropriate safeguards, which is true and entirely unhelpful when what you actually want is the concrete list of things to do. This guide is the practical version: what to sign, what to configure, and what to write down.

It is written for the founder or engineer who is integrating the OpenAI API and wants to get the data protection side genuinely right, not just sound compliant. It is not legal advice, but it is the practical groundwork that any sensible legal review will expect you to have done.

The core problem, in brief

When your product sends data to the OpenAI API, and that data relates to identifiable people, you are sharing personal data with a third party that processes it on your behalf. Under UK GDPR that makes OpenAI your processor and you the controller, and it triggers a specific set of obligations. The good news is that OpenAI provides what you need to meet them. The catch is that you have to actively put it in place, because none of it happens automatically just because you opened an account.

Step one, sign the Data Processing Addendum

This is the non-negotiable foundation, and it is the step most often skipped. UK GDPR requires a contract between you and any processor handling personal data on your behalf. OpenAI provides exactly this in the form of its Data Processing Addendum, or DPA.

A few practical points that catch people out.

  • For paid API use, the DPA is incorporated into OpenAI's commercial terms, but you should confirm you have actually accepted those commercial terms rather than assuming they apply. Check, and keep the record.
  • A personal account using a personal email is the classic mistake. If you are running company workloads through a personal account, fix that first, because the right contractual terms need to be in place at the company level.
  • Save the executed agreement somewhere your team can find it instantly. If you ever have a breach to notify or a buyer asking for evidence, hunting for the DPA is not a position you want to be in.

Without this in place, every API call carrying personal data is processing without the contract the law requires. Signing it takes minutes. There is genuinely no reason not to.

Step two, understand what OpenAI does and does not do by default

You cannot document your compliance if you do not know the actual behaviour. Here are the practical facts as they stand, and you should verify the current detail against OpenAI's own documentation when you set this up, because providers update terms.

  • Training. OpenAI does not use data submitted through the API to train its models by default. This is an important and genuine protection, and it is the default for API and business use, unlike the consumer ChatGPT tiers.
  • Retention. By default OpenAI may retain API inputs and outputs for a short period, around thirty days, for abuse monitoring, after which they are deleted unless OpenAI is legally required to keep them. This residual retention is the thing that surprises founders, so account for it explicitly rather than assuming zero retention.
  • Zero data retention. For eligible enterprise use cases, OpenAI offers a zero data retention option on supported endpoints, which removes that abuse monitoring window. It is not a self-serve toggle. It is approval-gated through their sales team, so if your data sensitivity warrants it, you need to request it.
  • Data residency. OpenAI offers data residency in various regions including the UK and Europe for eligible customers, but it requires configuration and is not the default. If your buyers need EU or UK residency, this is something you set up, not something you get automatically.

The theme across all of these is the same. The protections exist, but several of them are off by default or require a request. Compliance is in the configuration, not just the signature.
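One way to keep that theme visible is to track what you have actually verified and list what is still outstanding. The sketch below is illustrative only: `ProviderSetup` and `open_actions` are hypothetical names for your own internal checklist, not part of any OpenAI SDK, and the flags are set by you after checking your account and OpenAI's current documentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderSetup:
    """Hypothetical snapshot of what you have verified. Not an OpenAI API object."""
    dpa_accepted: bool = False             # DPA confirmed on a company account
    zero_retention_approved: bool = False  # approval-gated, so False until granted
    residency_configured: bool = False     # residency is not the default
    needs_zero_retention: bool = False     # your own judgement on data sensitivity
    needs_residency: bool = False          # usually driven by buyer requirements


def open_actions(setup: ProviderSetup) -> List[str]:
    """Return the protections that still need someone to act on them."""
    actions = []
    if not setup.dpa_accepted:
        actions.append("Confirm the DPA is in place on a company account")
    if setup.needs_zero_retention and not setup.zero_retention_approved:
        actions.append("Request zero data retention; default retention still applies")
    if setup.needs_residency and not setup.residency_configured:
        actions.append("Configure data residency; it is not the default")
    return actions


if __name__ == "__main__":
    for action in open_actions(ProviderSetup(needs_residency=True)):
        print("-", action)
```

A fresh `ProviderSetup()` always reports the DPA as outstanding, which mirrors the point that nothing is in place just because an account exists.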

Step three, minimise what you send in the first place

The most robust protection is not sending personal data you do not need to send. Before your application calls the API, build in a step that reduces the personal data in the payload to the minimum the task actually requires. Where you can strip or mask identifiers without harming the result, do it.

This is good engineering and good data protection at once. It reduces your exposure, it strengthens your answer when a buyer asks how you handle their data, and it aligns with the data minimisation principle that sits at the heart of GDPR. The less personal data that leaves your system, the smaller every downstream risk becomes.
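As a rough illustration of such a pre-send step, the sketch below keeps only an allow-list of fields and masks email addresses and UK-style phone numbers in free text. Everything here is an assumption for the example: the field names, the placeholder tokens, and the regular expressions, which are deliberately simple and will both miss some identifiers (names, addresses) and occasionally mask numbers that are not phone numbers. Treat it as a starting point, not a complete redaction layer.

```python
import re

# Deliberately simple patterns; assumptions for illustration, not exhaustive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
UK_PHONE_RE = re.compile(r"(?:\+44\s?|\b0)\d{2,4}[\s-]?\d{3,4}[\s-]?\d{3,4}\b")

# Hypothetical allow-list: only these fields ever leave your system.
ALLOWED_FIELDS = {"ticket_text", "product"}


def minimise(text: str) -> str:
    """Mask emails and UK-style phone numbers in free text."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = UK_PHONE_RE.sub("[PHONE]", text)
    return text


def minimise_record(record: dict) -> dict:
    """Drop fields that are not on the allow-list and mask the rest."""
    return {
        key: minimise(value) if isinstance(value, str) else value
        for key, value in record.items()
        if key in ALLOWED_FIELDS
    }


if __name__ == "__main__":
    ticket = {
        "ticket_text": "Contact jo@example.com or 07700 900123 about the refund",
        "customer_name": "Jo Example",
        "product": "Pro plan",
    }
    # Only the minimised payload would be passed on to the API call.
    print(minimise_record(ticket))
```

An allow-list is used rather than a block-list so that a newly added field is excluded by default until someone decides the task needs it.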

Step four, handle the lawful basis and transparency

Two GDPR fundamentals apply to your use of the API, and both are your responsibility as the controller, not OpenAI's.

You need a lawful basis for processing the personal data through the API, and you need to be transparent with the people whose data it is. In practice this means your privacy notice should explain that you use a third-party AI provider to process data, what is sent, and that the provider does not train on it and acts under a data processing agreement. If your AI is user-facing, such as a chatbot, a short version of this disclosure at the point of interaction is good practice, and it also helps you meet the separate transparency expectations coming under the EU AI Act.

Step five, document everything

This is the step that turns good practice into something you can prove, and proof is what buyers and regulators actually want. Documentation is also where most startups fall short, not because they lack the controls but because they never wrote them down.

  • Keep the executed DPA on file and accessible.
  • Record your configuration choices, such as retention settings, zero data retention status, and data residency region, ideally with dated evidence.
  • Maintain a simple data flow description showing what personal data is sent to the API and why.
  • Note OpenAI as a processor in your records of processing, and be aware of its sub-processors, since your buyers may ask about the chain.
  • For higher-risk uses, consider whether a data protection impact assessment is warranted, and if so, complete one.
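The list above can be kept as a small, dated, machine-readable record alongside your other compliance files. The following is a minimal sketch under stated assumptions: the `ProcessorRecord` and `DataFlow` structures, their field names, and the example values are all hypothetical, and they are not a prescribed format for records of processing.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import date
from typing import Dict, List


@dataclass
class DataFlow:
    """One description of what personal data is sent to the API and why."""
    purpose: str
    personal_data_sent: List[str]
    why_needed: str


@dataclass
class ProcessorRecord:
    """Hypothetical record entry for one processor."""
    processor: str
    dpa_location: str              # where the executed DPA is filed
    settings: Dict[str, str]       # configuration choices as you verified them
    data_flows: List[DataFlow] = field(default_factory=list)
    recorded_on: str = field(default_factory=lambda: date.today().isoformat())

    def to_json(self) -> str:
        # asdict() also converts the nested DataFlow entries to plain dicts.
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    record = ProcessorRecord(
        processor="OpenAI",
        dpa_location="shared-drive/legal/openai-dpa.pdf",  # example path only
        settings={
            "retention": "default abuse monitoring window",
            "zero_data_retention": "not requested",
            "data_residency": "not configured",
        },
        data_flows=[
            DataFlow(
                purpose="Summarise support tickets",
                personal_data_sent=["ticket text with emails masked"],
                why_needed="The summary needs the customer's description",
            )
        ],
    )
    print(record.to_json())
```

Committing the output to version control each time a setting changes gives you the dated evidence trail without any extra tooling.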

When an enterprise buyer's security team asks how you handle personal data in your AI pipeline, this documentation is the difference between a confident, same-day answer and a scramble that stalls the deal.

Where this connects to your AI security more broadly

Getting the GDPR and OpenAI API setup right is essential, but it is worth being clear about what it does and does not cover. It addresses the lawful and contractual handling of personal data. It does not, by itself, address whether your AI can be manipulated through prompt injection, or whether one customer's data could surface in another's results through your application. Those are AI security questions that sit alongside your data protection work, and enterprise buyers increasingly ask about both.

The strongest position is to treat the GDPR setup as one layer of a properly secured AI product, not the whole of it. Do the data protection groundwork described here, and make sure the product around it is genuinely secure too.

The honest takeaway

Using the OpenAI API in a UK AI startup can absolutely be done compliantly, and it is more about doing a handful of concrete things than wrestling with legal theory. Sign the DPA, understand and configure the retention and residency settings, minimise what you send, get your lawful basis and transparency right, and document all of it. Do those, and you can answer the data protection question with confidence.

The founders who get caught out are not usually the ones who did something egregious. They are the ones who assumed that using the service was enough, never signed the DPA, never configured the settings, and never wrote anything down. Avoid that, and you are most of the way there.

