What counts as PII
Most data-protection frameworks split PII into two tiers:
Direct identifiers uniquely identify a person on their own:
- Full name
- Email address
- Phone number
- Government IDs (SSN, passport, national ID)
- Physical address
- Date of birth combined with name
Indirect (quasi-)identifiers identify a person only when combined with other data:
- IP address
- Device ID (IDFA, AAID)
- Cookie ID
- Hashed email or phone number
- Browser fingerprint
- Precise location data
- Behavioral data tied to a user
GDPR uses “personal data” as a broader umbrella that includes both tiers: any information relating to an identified or identifiable natural person.
Why PII matters in marketing tech
Three places it shows up:
- Tracking: the more PII flows into your pixels and server-side events, the more constrained you are by consent and data-sharing rules.
- Ad platform syncs: sending customer lists to Meta or Google requires hashing the PII first. Sending it in plaintext is non-compliant.
- AI and analytics access: letting an LLM or BI tool query customer data raises the question of who (or what) is allowed to see PII in raw form.
Hashing as the standard middle ground
The dominant pattern for sharing PII with platforms: SHA-256 hash before transmission.
Email jane@example.com → SHA-256 → c47c3a9c..
Meta, Google, and TikTok all expect hashed PII for audience matching. A SHA-256 hash cannot be inverted directly, but a determined attacker can reverse it by brute force or dictionary lookup when the inputs are guessable, as emails and phone numbers are. The hash therefore isn’t true anonymisation, but it does protect against casual leakage.
For most uses, the hash is the right transmission format. Plaintext PII should stay inside your perimeter.
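A minimal sketch of the hash-before-transmission pattern in Python. The function name `hash_email` is illustrative, and the normalisation step (trim whitespace, lowercase) is an assumption here; each platform documents its own exact rules, so check those before relying on this.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the hex SHA-256 digest of a normalised email address.

    Normalisation (trim + lowercase) is an assumption for this sketch;
    check each platform's documentation for its exact rules.
    """
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Differently formatted inputs collapse to the same identifier.
print(hash_email("  Jane@Example.com ") == hash_email("jane@example.com"))
```

Normalising before hashing matters because the hash of `Jane@Example.com` and the hash of `jane@example.com` are otherwise unrelated strings, and the match would silently fail.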
PII vs anonymous data
Truly anonymous data is rare. Aggregated metrics (“3,200 visitors this week”) are anonymous. A database row with a hashed email is pseudonymous: anyone who holds the original email, a lookup table, or the salt or key used can re-identify it. GDPR treats pseudonymous data as personal data, even though it’s safer than raw PII.
For analytics use cases that don’t need user-level granularity, design to operate on aggregates. Many “user-level” insights can be served by cohort-level statistics, and the privacy posture is dramatically easier.
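As a rough sketch of operating on aggregates, the snippet below counts rows per cohort and drops any cohort below a minimum size, so that very small groups cannot single out individuals. The threshold `MIN_COHORT_SIZE`, the function name, and the field names are all hypothetical; the suppression step is an added assumption rather than something the text above prescribes.

```python
from collections import Counter

MIN_COHORT_SIZE = 10  # hypothetical threshold; choose one that fits your policy


def cohort_counts(rows, key, min_size=MIN_COHORT_SIZE):
    """Count rows per cohort and drop cohorts smaller than min_size."""
    counts = Counter(row[key] for row in rows)
    return {cohort: n for cohort, n in counts.items() if n >= min_size}


# Illustrative data: 25 users on the "free" plan, 3 on "pro".
rows = (
    [{"user_id": f"u{i}", "plan": "free"} for i in range(25)]
    + [{"user_id": f"v{i}", "plan": "pro"} for i in range(3)]
)
print(cohort_counts(rows, "plan"))  # {'free': 25}
```

The output contains no user-level identifiers at all, which is what makes the privacy posture simpler than a per-user table.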
PII and AI agents
A specific concern in 2026: feeding raw PII into LLM tools. Emerging best practices:
- Use read-only access patterns so AI tools can answer questions without exfiltrating raw PII
- Aggregate or pseudonymise data before exposing it to AI
- Audit and log every AI query against customer data
This is one reason MCP-style architectures matter: they constrain what an AI tool can see and do, instead of granting blanket data access.
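The practices above could be combined in a thin gateway that sits between the AI tool and customer data. This is only a sketch under stated assumptions: `SECRET_KEY`, `PII_FIELDS`, and both function names are hypothetical, and it uses a keyed HMAC-SHA-256 token rather than a plain SHA-256 hash, so the tokens are for internal use and will not match the hashes ad platforms expect.

```python
import hashlib
import hmac
import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("ai_audit")

SECRET_KEY = b"replace-me"        # hypothetical; load from a secret store
PII_FIELDS = {"email", "phone"}   # hypothetical field names


def pseudonymise(row: dict) -> dict:
    """Replace PII values with truncated keyed HMAC-SHA-256 tokens."""
    safe = {}
    for field, value in row.items():
        if field in PII_FIELDS:
            digest = hmac.new(SECRET_KEY, str(value).encode("utf-8"), hashlib.sha256)
            safe[field] = digest.hexdigest()[:16]
        else:
            safe[field] = value
    return safe


def rows_for_ai(tool_name: str, query: str, rows: list) -> list:
    """Log the request, then return only pseudonymised copies of the rows."""
    audit_log.info("tool=%s query=%r rows=%d", tool_name, query, len(rows))
    return [pseudonymise(row) for row in rows]


print(rows_for_ai("assistant", "plans by user", [{"email": "jane@example.com", "plan": "pro"}]))
```

The tool still sees a stable token per person, so it can count or join, but it never receives the raw email, and every request leaves an audit line.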
Common mistakes
- Treating IP addresses as not-PII. In most jurisdictions they count as PII.
- Storing PII indefinitely. GDPR requires storage limitation. Decide retention periods per data category and enforce them.
- Hashing identifiers in different ways across systems. An unsalted hash, a salted hash, and a hash from a different function yield three different identifiers for the same person. Pick one approach per identifier.
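The last mistake is easy to demonstrate. In this small sketch the salt value is hypothetical and SHA-1 stands in for "a different hash function"; the point is only that the three outputs cannot be joined.

```python
import hashlib

email = "jane@example.com"
salt = "s3cr3t"  # hypothetical salt, for illustration only

unsalted_sha256 = hashlib.sha256(email.encode("utf-8")).hexdigest()
salted_sha256 = hashlib.sha256((salt + email).encode("utf-8")).hexdigest()
unsalted_sha1 = hashlib.sha1(email.encode("utf-8")).hexdigest()

# Three systems, three unrelated identifiers for the same person.
identifiers = {unsalted_sha256, salted_sha256, unsalted_sha1}
print(len(identifiers))  # 3
```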
FAQ about PII (Personally Identifiable Information)
What is PII?
PII (Personally Identifiable Information) is any data that can identify a specific individual, directly or in combination with other available data. It covers obvious identifiers (name, email, phone) and indirect ones (IP address, device ID, cookie ID).
What is sensitive PII?
Sensitive PII is a subset that requires extra protection: government IDs, financial account numbers, health information, biometric data, and religious or political affiliation. Most data protection laws apply stricter rules to sensitive PII.
Is a hashed email still PII?
Yes, in most jurisdictions. Hashing protects against casual leakage, but the hash is deterministic (the same email always produces the same value), so it still uniquely identifies a person and can be matched against a list of known or guessed emails. GDPR treats pseudonymised data as personal data.
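A short sketch of why an unsalted hash stays re-identifiable: anyone holding a list of candidate emails can hash each one and compare. The function names and the candidate list are illustrative only.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def reidentify(target_hash, candidate_emails):
    """Return the candidate whose hash matches target_hash, or None."""
    for email in candidate_emails:
        if sha256_hex(email) == target_hash:
            return email
    return None


leaked_hash = sha256_hex("jane@example.com")
candidates = ["john@example.com", "jane@example.com", "joe@example.com"]
print(reidentify(leaked_hash, candidates))  # jane@example.com
```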
Can I send PII to ad platforms?
Only in hashed form (SHA-256) and only with proper consent. Meta, Google, and TikTok all expect hashed PII for audience matching. Plaintext PII transmission to ad platforms is non-compliant.