What AI actually does with what you paste
When you paste something into a free or public AI tool, it can be stored, reviewed by humans, and used to train future models. Even when companies say "we don't train on your data," that often only applies to their paid enterprise tier — not the free one your team probably uses.
The mental model that fixes 80% of bad AI behavior
Picture this: you're standing in an airport terminal. You hand a stranger a printed copy of whatever you're about to paste into ChatGPT. You ask them to read it, then summarize it back to you.
That stranger:
- Might remember what they read
- Might tell their friends about the interesting parts
- Might be writing a book and use your content as inspiration
- Will not sign an NDA
If you wouldn't hand the document to the stranger in the airport, don't paste it into a public AI.
That's it. That's the whole framework.
What "public AI" actually means
| Tool | Free tier | Paid personal tier | Enterprise tier |
|---|---|---|---|
| ChatGPT | Trains on your data by default | Trains by default (opt-out exists) | Does not train |
| Claude.ai | Depends on your privacy setting (check it) | Depends on your privacy setting (check it) | Does not train |
| Gemini | Trains on conversations | Mixed — check setting | Does not train |
| Copilot (M365) | n/a — only enterprise | n/a | Does not train |
This table changes constantly. Your security team should keep the current version posted somewhere. The point isn't to memorize it. The point is to assume the worst until proven otherwise.
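For teams that want to turn "assume the worst" into something checkable, the rule can be sketched as a default-deny lookup. This is a minimal illustration, not a vetted policy: the names `POLICY_SNAPSHOT` and `may_train_on_input` are hypothetical, the entries are an illustrative subset copied from the table, and the real values should come from your security team's current version.

```python
# Hypothetical snapshot of a few rows from the table; values go stale quickly.
# True means "assume your input may be used for training".
POLICY_SNAPSHOT = {
    ("ChatGPT", "free"): True,
    ("ChatGPT", "paid_personal"): True,   # opt-out exists, but on by default
    ("ChatGPT", "enterprise"): False,
    ("Gemini", "free"): True,
    ("Gemini", "paid_personal"): True,    # "mixed" is treated as the worst case
    ("Gemini", "enterprise"): False,
    ("Copilot (M365)", "enterprise"): False,
}


def may_train_on_input(tool: str, tier: str) -> bool:
    """Return True unless the snapshot explicitly says this tier does not train.

    Unknown tools and unknown tiers fall through to True: assume the worst
    until proven otherwise.
    """
    return POLICY_SNAPSHOT.get((tool, tier), True)


if __name__ == "__main__":
    for tool, tier in [("ChatGPT", "enterprise"), ("SomeNewTool", "free")]:
        verdict = "assume it trains" if may_train_on_input(tool, tier) else "does not train"
        print(f"{tool} / {tier}: {verdict}")
```

The design choice that matters is the default: a tool or tier missing from the snapshot is treated as unsafe, so a stale table fails closed rather than open.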
The two real risks
- Data exposure: Your data is now somewhere it shouldn't be. It may appear in another user's results, be reviewed by a human contractor, or be stolen if the AI vendor is breached.
- You don't know what you don't know: The AI may sound confident but be completely wrong. (Module 4 covers this.)
Real incident: Samsung, 2023
Samsung engineers pasted proprietary source code into ChatGPT to help debug it. Within weeks, Samsung had to ban public AI tools company-wide. The code is presumably still inside OpenAI's data, somewhere.
The engineers weren't reckless people. They were trying to do their jobs faster. That's the most common path to a leak: someone trying to be helpful.
Knowledge check
You're about to paste a customer's email into ChatGPT to help draft a response. The email mentions their account number and a complaint. What should you do?
Your company has a paid ChatGPT Team plan. Is it safe to paste internal documents into it?