ForgeAwareness
Module 14 min

What AI actually does with what you paste

TL;DR

When you paste something into a free or public AI tool, it can be stored, reviewed by humans, and used to train future models. Even when companies say "we don't train on your data," that often only applies to their paid enterprise tier — not the free one your team probably uses.

The mental model that fixes 80% of bad AI behavior

Picture this: you're standing in an airport terminal. You hand a stranger a printed copy of whatever you're about to paste into ChatGPT. You ask them to read it, then summarize it back to you.

That stranger:

  • Might remember what they read
  • Might tell their friends about the interesting parts
  • Might be writing a book and use your content as inspiration
  • Will not sign an NDA

If you wouldn't hand the document to the stranger in the airport, don't paste it into a public AI.

That's it. That's the whole framework.

What "public AI" actually means

Tool           | Free tier                       | Paid personal tier                 | Enterprise tier
---------------|---------------------------------|------------------------------------|----------------
ChatGPT        | Trains on your data by default  | Trains by default (opt-out exists) | Does not train
Claude.ai      | Does not train on conversations | Does not train                     | Does not train
Gemini         | Trains on conversations         | Mixed (check settings)             | Does not train
Copilot (M365) | n/a (enterprise only)           | n/a                                | Does not train

This table changes constantly. Your security team should keep the current version posted somewhere. The point isn't to memorize it. The point is to assume the worst until proven otherwise.
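For teams that want to encode "assume the worst until proven otherwise" in a script or internal tool, here is a minimal sketch. The dictionary is only a hypothetical snapshot of the table above (it will go stale), and the names `TRAINS_ON_DATA` and `may_train_on_data` are made up for illustration. The idea is that anything not explicitly listed as "does not train" is treated as if it does.

```python
# Hypothetical snapshot of the table above. These entries go stale quickly;
# the current version should come from your security team.
TRAINS_ON_DATA = {
    ("ChatGPT", "free"): True,
    ("ChatGPT", "paid personal"): True,  # opt-out exists, but on by default
    ("ChatGPT", "enterprise"): False,
    ("Claude.ai", "free"): False,
    ("Claude.ai", "paid personal"): False,
    ("Claude.ai", "enterprise"): False,
    ("Gemini", "free"): True,
    ("Gemini", "enterprise"): False,
    ("Copilot (M365)", "enterprise"): False,
}


def may_train_on_data(tool: str, tier: str) -> bool:
    """Return True unless the table explicitly says this tier does not train.

    Unknown tools, unknown tiers, and "mixed" entries all fall through
    to the default of True, which is the assume-the-worst rule.
    """
    return TRAINS_ON_DATA.get((tool, tier), True)


if __name__ == "__main__":
    print(may_train_on_data("ChatGPT", "enterprise"))    # listed as not training
    print(may_train_on_data("Gemini", "paid personal"))  # "mixed", so assume it trains
    print(may_train_on_data("SomeNewTool", "free"))      # unknown, so assume it trains
```

The design choice that matters is the default in `.get(..., True)`: a missing or ambiguous entry is treated as risky rather than safe.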

The two real risks

  1. Data exposure: Your data is now somewhere it shouldn't be. It may appear in another user's results, be reviewed by a human contractor, or be stolen if the AI vendor is breached.
  2. You don't know what you don't know: The AI may sound confident but be completely wrong. (Module 4 covers this.)
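To make the data-exposure risk concrete, here is a rough sketch of a check that could run on text before anyone pastes it. The two patterns and the names `PATTERNS` and `flag_sensitive` are assumptions chosen for illustration. Real sensitive data takes many more forms, so a clean result does not mean the text is safe to share.

```python
import re

# Hypothetical patterns for illustration only. They catch obvious cases
# and will miss names, addresses, source code, contracts, and much more.
PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "long number": re.compile(r"\b\d{6,}\b"),  # possible account or card number
}


def flag_sensitive(text: str) -> list[str]:
    """Return the names of the patterns that appear somewhere in the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    draft = "Account 12345678 was charged twice. Reply to pat@example.com."
    for hit in flag_sensitive(draft):
        print(f"Found a possible {hit}; think twice before pasting.")
```

A check like this is a reminder, not a gate: the airport-stranger test still applies to text that triggers no pattern at all.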

Real incident: Samsung, 2023

Samsung engineers pasted proprietary source code into ChatGPT to help debug it. Within weeks, Samsung banned public AI tools company-wide. Once submitted, the code sat on OpenAI's servers, outside Samsung's control and difficult to retrieve or delete.

The engineers weren't reckless people. They were trying to do their jobs faster. That's the most common path to a leak: someone trying to be helpful.

Knowledge check

Knowledge check 1

You're about to paste a customer's email into ChatGPT to help draft a response. The email mentions their account number and a complaint. What should you do?

Knowledge check 2

Your company has a paid ChatGPT Team plan. Is it safe to paste internal documents into it?