Your Documents Are Full of Personal Information. Here's How to Share Them Safely.
A practical guide to POPIA-compliant document redaction for South African businesses.
Every day, South African businesses share documents — patient files forwarded to specialists, employee records sent to auditors, legal files exchanged between firms, financial statements submitted to regulators. Every one of those documents is a potential POPIA liability.
The Protection of Personal Information Act (POPIA) has been fully enforceable since July 2021, and the Information Regulator is no longer warming up. In July 2023, the Department of Justice and Constitutional Development was hit with a R5 million administrative fine (50% of the R10 million maximum) after it failed to comply with an enforcement notice issued over a ransomware attack that compromised over 1,200 files. During 2024, the Regulator issued enforcement notices against Lancet Laboratories, the IEC, Blouberg Municipality, and WhatsApp, and conducted over 30 compliance assessments, including against law firms. By the 2024/25 financial year, 2,374 data breaches had been reported to the Regulator, with that number rising 40% into 2025. The message is clear: enforcement is accelerating.
But POPIA compliance isn't just about avoiding fines. It's about something more fundamental — the ability to share information safely while respecting people's privacy. That's the problem SureDox was built to solve.
What POPIA Actually Protects
Before you can redact effectively, you need to understand what counts as personal information. POPIA's definition in Section 1 is deliberately broad. Personal information means any information relating to an identifiable, living, natural person (and in some cases, a juristic person like a company). The Act provides a detailed list that includes:
Demographic and identity information — race, gender, sex, pregnancy, marital status, national or ethnic origin, colour, sexual orientation, age, and birth details.
Health and wellbeing — physical health, mental health, disability, and medical history. This extends to genetic and genomic data: gene variants, test results, chromosomal information, and family genetic history.
Identifiers — ID numbers, passport numbers, driver's licences, email addresses, physical addresses, phone numbers, location data, and online identifiers.
Personal history — education, medical, financial, criminal, and employment history.
Biometrics — fingerprints, facial recognition data, voice patterns, and related data.
Beliefs and views — religion, conscience, culture, language, and personal opinions.
Correspondence — private or confidential communications.
Then there's a category the Act calls special personal information under Section 26, which carries even stricter processing rules: religious or philosophical beliefs, race or ethnic origin, trade union membership, political persuasion, health or sex life, biometric information, and criminal behaviour. Processing this data is prohibited unless one of the authorisations in Section 27 applies, such as the data subject's consent.
The critical thing to understand: POPIA doesn't say you can't process this information. It says you must have a lawful basis to do so, you must protect it appropriately, and — crucially — you can only use it for the specific purpose it was collected for. The moment you want to use a document for a secondary purpose, you have a problem.
The Secondary Use Problem
This is where most organisations get stuck. You collected a patient's medical records to treat them. Now a researcher wants to study treatment outcomes across 500 patients. You gathered employee files for HR management. Now an auditor needs to review them. A law firm compiled case files for litigation. Now a junior associate needs training material.
In each case, the original purpose of collection doesn't cover the new use. POPIA's Condition 4 (Section 15) restricts further processing unless it's compatible with the original purpose. You have three options:
- Get fresh consent from every data subject — often impractical or impossible at scale.
- Don't share the documents — which blocks legitimate business needs like research, training, auditing, and quality assurance.
- De-identify the information — remove or mask the personal information so the document can be shared without exposing anyone's identity.
POPIA itself defines "de-identify" in Section 1 as deleting any information that identifies the data subject, can be used by a reasonably foreseeable method to identify them, or can be linked by such a method to other information that identifies them. Once data is de-identified to the extent that it cannot be re-identified, it falls outside POPIA's scope (Section 6). The document becomes shareable.
This is redaction done right: not destroying the document, but surgically removing the personal information while preserving everything else.
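To make the idea concrete, here is a minimal Python sketch of masking two kinds of direct identifier in plain text: South African ID numbers and email addresses. It is an illustration only, not a description of how SureDox works. The function names (`luhn_valid`, `looks_like_sa_id`, `mask_identifiers`) and the placeholder labels are hypothetical. It assumes the standard 13-digit ID format, which starts with a YYMMDD date of birth and ends with a Luhn check digit.

```python
import re
from datetime import datetime

# 13 consecutive digits that are not part of a longer run of digits.
ID_CANDIDATE = re.compile(r"(?<!\d)\d{13}(?!\d)")
# Deliberately simple email pattern; real-world addresses can be messier.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_sa_id(candidate: str) -> bool:
    """Check that the YYMMDD prefix is a real date and the checksum holds."""
    try:
        datetime.strptime(candidate[:6], "%y%m%d")
    except ValueError:
        return False
    return luhn_valid(candidate)


def mask_identifiers(text: str) -> str:
    """Replace plausible SA ID numbers and email addresses with placeholders."""
    def replace_id(match: re.Match) -> str:
        value = match.group()
        return "[ID NUMBER]" if looks_like_sa_id(value) else value

    text = ID_CANDIDATE.sub(replace_id, text)
    return EMAIL.sub("[EMAIL]", text)


print(mask_identifiers("Patient ID 8001015009087, contact thandi@example.co.za"))
```

Validating the date and checksum keeps the sketch from masking every 13-digit reference number it meets. Pattern matching of this kind only catches direct identifiers, though. Names, addresses, and details that identify someone in combination need more than regular expressions, and masking alone does not guarantee the result meets POPIA's definition of de-identified.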
Why Redaction Beats Deletion
Many organisations default to one extreme or the other — they either keep everything (and accept the compliance risk) or destroy documents entirely (and lose institutional knowledge). Neither approach serves the business well.
Consider what you lose when you destroy documents instead of redacting them:
- Medical research stalls. A hospital can't study treatment patterns across anonymised patient records if those records no longer exist.
- Legal precedent disappears. A law firm can't build a knowledge base of case strategies if previous files are shredded after matters close.
- Audit trails break. A financial services firm can't demonstrate historical compliance if supporting documents have been purged.
- Training suffers. New employees can't learn from real-world examples if all sample documents have been destroyed.
Redaction preserves the utility of the document — the medical findings, the legal reasoning, the financial patterns, the procedural examples — while removing only the elements that identify specific individuals. It's the middle ground POPIA implicitly encourages through its de-identification provisions.
Real-World Scenarios
The specialist who needs a second opinion. A cardiologist in Johannesburg wants to consult with a colleague in Cape Town about a complex case. The patient file contains the patient's name, ID number, address, medical aid details, and full medical history. Under POPIA, forwarding the unredacted file to another practitioner for an informal opinion (outside the direct treatment relationship) is risky. With automated redaction, the cardiologist can share the clinical findings, test results, and treatment history — everything the colleague needs to advise — without exposing the patient's identity.
The law firm conducting internal training. A commercial litigation practice wants to use real case files to train junior associates on strategy and drafting. Those files contain client names, counterparty details, witness information, and confidential business terms. Manually redacting a 200-page trial bundle is a week's work for a paralegal. Automated redaction produces a first pass in minutes, ready for human review, and the result is training material that is realistic and educational without compromising client confidentiality or POPIA compliance.
The HR department facing an audit. An external auditor needs to review employment practices across 150 employee files: contracts, disciplinary records, performance reviews. Each file contains names, ID numbers, salary details, and disciplinary outcomes, and potentially health information (special personal information under POPIA). The HR team can redact all identifying information, giving the auditor access to the substance of their employment practices while protecting employee privacy.
The researcher studying financial inclusion. A microfinance institution wants to share loan application data with university researchers studying financial access in underserved communities. The applications contain names, ID numbers, addresses, income details, and employment history. Redacted versions preserve the demographic and financial data the researchers need while removing the details that identify an applicant, provided the remaining fields cannot be combined to single anyone out. That makes the research POPIA-compliant by design.
The insurer processing historical claims. An insurance company is building an AI model to detect fraudulent claims. The training data includes thousands of historical claim files with policyholder names, medical reports, and bank details. Using unredacted data for model training would likely fall foul of POPIA's further processing limitation (Section 15), because the claims were not collected for that purpose. Redacted claims data provides the patterns the model needs without the personal information it doesn't.
What SureDox Does
SureDox is an AI-powered document redaction platform built specifically for POPIA compliance. Upload a PDF, and the system automatically detects personal information across all POPIA categories — names, ID numbers, addresses, health information, financial details, biometric references, and more.
The platform uses intelligent context classification to distinguish between information that genuinely identifies a person (which must be redacted) and standard template data that appears in every document of that type (which doesn't need redaction). This prevents the over-redaction problem that plagues simpler tools — where redacting a bank's branch address or a standard fee schedule destroys the document's usefulness without any privacy benefit.
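One simple way to picture the difference between template data and identifying data is document frequency: a string that shows up in nearly every document of the same type is probably boilerplate. The Python sketch below illustrates that heuristic only. It is not SureDox's classifier, and the function names and the 0.8 threshold are assumptions chosen for the example.

```python
from collections import Counter


def template_strings(documents: list[list[str]], threshold: float = 0.8) -> set[str]:
    """Return detected strings that occur in at least `threshold` of the documents.

    Each document is given as the list of candidate strings detected in it.
    """
    doc_count: Counter[str] = Counter()
    for detected in documents:
        doc_count.update(set(detected))  # count each string once per document
    cutoff = threshold * len(documents)
    return {s for s, n in doc_count.items() if n >= cutoff}


def items_to_redact(detected: list[str], templates: set[str]) -> list[str]:
    """Keep only the detected strings that are not template data."""
    return [s for s in detected if s not in templates]


# Three statements from the same (made-up) bank branch.
statements = [
    ["12 Main Road, Sandton", "Thandi Nkosi"],
    ["12 Main Road, Sandton", "Pieter Botha"],
    ["12 Main Road, Sandton", "Aisha Khan"],
]
templates = template_strings(statements)
print(items_to_redact(statements[0], templates))
```

Here the branch address appears in every statement and is treated as template data, while each customer name appears once and stays on the redaction list. A frequency count is a blunt instrument: a small batch, or one customer who appears in most of the files, will fool it, which is why context matters.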
Every detected item is presented for human review before redaction is applied. You see exactly what will be redacted and why, with the ability to approve, reject, or adjust individual items. The final output is a properly redacted PDF with a full audit log — the kind of documentation the Information Regulator expects to see.
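The review step can be pictured as a small record per detected item plus an append-only log of decisions. The Python sketch below is a hypothetical data model, not SureDox's actual schema or API. The `Detection` fields, the `review` function, and the log layout are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Detection:
    item_id: int
    category: str  # e.g. "ID number" or "address"
    page: int
    reason: str  # why the item was flagged
    decision: str = "pending"  # "pending", "approved" or "rejected"


def review(detection: Detection, decision: str, reviewer: str, audit_log: list[dict]) -> None:
    """Record a reviewer's decision on one detection and log it."""
    if decision not in {"approved", "rejected"}:
        raise ValueError(f"unknown decision: {decision!r}")
    detection.decision = decision
    # The log stores what was decided and why, never the redacted value itself.
    audit_log.append({
        "item_id": detection.item_id,
        "category": detection.category,
        "page": detection.page,
        "reason": detection.reason,
        "decision": decision,
        "reviewer": reviewer,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })


def approved_items(detections: list[Detection]) -> list[Detection]:
    """Return only the detections a reviewer has approved for redaction."""
    return [d for d in detections if d.decision == "approved"]


log: list[dict] = []
found = [
    Detection(1, "ID number", 3, "13-digit number passing checksum"),
    Detection(2, "address", 1, "matches branch letterhead"),
]
review(found[0], "approved", "reviewer@example.com", log)
review(found[1], "rejected", "reviewer@example.com", log)
print([d.item_id for d in approved_items(found)], len(log))
```

Keeping the original value out of the log is a deliberate choice in this sketch: an audit trail that repeats the personal information it was meant to remove would itself need protecting.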
The platform handles both text-based and scanned documents, including handwritten notes, using different processing approaches optimised for each. Whether it's a typed contract, a scanned medical record, or a handwritten form, SureDox processes it.
Getting Started
POPIA compliance isn't optional, and the Information Regulator's enforcement activity is only increasing. But compliance doesn't have to mean choosing between privacy and productivity. With the right redaction approach, you can share documents confidently — preserving the information your business needs while protecting the people that information belongs to.
SureDox offers free credits on signup so you can process your first documents at no cost. Upload a document, review the detected personal information, and download a redacted version — all in minutes.
SureDox is built by Boone and Boo (Pty) Ltd, operating as a POPIA Operator that processes personal information on behalf of clients. For questions about our compliance approach, see our Privacy Policy and Terms of Service.
References & Further Reading
- POPIA Full Text — The Protection of Personal Information Act as enacted by Parliament, formatted as a navigable website.
- Section 1 — Definitions — Including the definitions of "personal information" and "de-identify."
- Section 26 — Special Personal Information — Prohibition on processing sensitive categories without authorisation.
- Section 15 — Further Processing Limitation — Restrictions on using personal information for secondary purposes.
- Section 19 — Security Safeguards — Requirements for technical and organisational security measures.
- Chapter 11 — Offences, Penalties and Administrative Fines — Fines up to R10 million and imprisonment up to 10 years.
- Bowmans: Information Regulator Issues First Fine of R5 Million — Analysis of the Department of Justice enforcement action (July 2023).
- Michalsons: Information Regulator Aims to Step Up Enforcement — Summary of 2024 enforcement notices against Lancet, IEC, WhatsApp, and Blouberg Municipality.
- ITWeb: InfoReg Exposes POPIA Violators as Data Breaches Mount — 2,374 breaches reported in 2024/25, with a 40% year-on-year increase.
- Cliffe Dekker Hofmeyr: Recent Privacy Updates in South Africa — Lancet Laboratories R100,000 fine and broader enforcement trends (December 2025).
- Information Regulator South Africa — Official website of the enforcement body.