Module 14 min

What AI actually does with what you paste

TL;DR

When you paste something into a free or public AI tool, it can be stored, reviewed by humans, and used to train future models. Even when companies say "we don't train on your data," that promise often applies only to their paid enterprise tier, not the free one your team probably uses.

The mental model that fixes 80% of bad AI behavior

Picture this: you're standing in an airport terminal. You hand a stranger a printed copy of whatever you're about to paste into ChatGPT. You ask them to read it, then summarize it back to you.

That stranger:

Might remember what they read
Might tell their friends about the interesting parts
Might be writing a book and use your content as inspiration
Will not sign an NDA

If you wouldn't hand the document to the stranger in the airport, don't paste it into a public AI.

That's it. That's the whole framework.

What "public AI" actually means

Tool	Free tier	Paid personal tier	Enterprise tier
ChatGPT	Trains on your data by default	Trains by default (opt-out exists)	Does not train
Claude.ai	Does not train on conversations	Does not train	Does not train
Gemini	Trains on conversations	Mixed — check setting	Does not train
Copilot (M365)	n/a — only enterprise	n/a	Does not train

This table changes constantly. Your security team should keep the current version posted somewhere. The point isn't to memorize it; the point is to assume the worst until proven otherwise.
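For teams that want to build the "assume the worst" rule into an internal checklist or tool, here is a minimal sketch of a default-deny lookup. The dictionary name, the function name, and the tier labels are hypothetical. The values only mirror the table above, so they will go stale just as the table does. Anything not explicitly listed as safe, including "mixed" and "n/a" entries, is treated as if it trains on your data.

```python
# Hypothetical policy table: (tool, tier) -> True if pasted data may be used for training.
# Values mirror the table above and are NOT authoritative; check your security team's
# current version before relying on them.
TRAINS_ON_DATA = {
    ("chatgpt", "free"): True,
    ("chatgpt", "paid_personal"): True,   # opt-out exists, but training is the default
    ("chatgpt", "enterprise"): False,
    ("claude.ai", "free"): False,
    ("claude.ai", "paid_personal"): False,
    ("claude.ai", "enterprise"): False,
    ("gemini", "free"): True,
    ("gemini", "paid_personal"): True,    # "mixed" in the table, so assume the worst
    ("gemini", "enterprise"): False,
    ("copilot (m365)", "enterprise"): False,
}


def may_train_on_data(tool: str, tier: str) -> bool:
    """Return True unless the (tool, tier) pair is explicitly listed as not training."""
    key = (tool.strip().lower(), tier.strip().lower())
    # Unknown tools or tiers fall through to True: assume the worst until proven otherwise.
    return TRAINS_ON_DATA.get(key, True)


if __name__ == "__main__":
    print(may_train_on_data("ChatGPT", "enterprise"))
    print(may_train_on_data("SomeNewTool", "free"))
```

The design choice that matters is the default in `.get(key, True)`: a tool nobody has reviewed yet is treated the same as one known to train on your data.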

The two real risks

Data exposure: Your data is now somewhere it shouldn't be. It may appear in another user's results, be reviewed by a human contractor, or be stolen if the AI vendor is breached.
You don't know what you don't know: The AI may sound confident but be completely wrong, and nothing in its answer will warn you. (Module 4 covers this.)

Real incident: Samsung, 2023

Samsung engineers pasted proprietary source code into ChatGPT to help debug it. Within weeks, Samsung had to ban public AI tools company-wide. The code is presumably still inside OpenAI's data, somewhere.

The engineers weren't reckless people. They were trying to do their jobs faster. That's the most common path to a leak: someone trying to be helpful.

Knowledge check

Knowledge check 1

You're about to paste a customer's email into ChatGPT to help draft a response. The email mentions their account number and a complaint. What should you do?

Knowledge check 2

Your company has a paid ChatGPT Team plan. Is it safe to paste internal documents into it?