Where Does Your Data Go? A Complete Guide to How SureDox Protects Your Documents

3 February 2026 | SureDox Team | 9 min read | Security

Reading time: 6 minutes


When you upload a sensitive document to any online tool, one question matters more than anything else: what happens to my data?

It's a fair question. Headlines about data breaches, AI companies training on user data, and cloud storage leaks have made everyone — rightly — cautious about where their information ends up.

At SureDox, we built our entire platform around a simple principle: your documents are yours. We process them, we protect them, and we never use them for anything other than what you asked us to do.

This post walks you through exactly what happens to your document from the moment you upload it to the moment you download the redacted version. No technical jargon. No hand-waving. Just a clear, honest explanation.


The Journey of Your Document: Step by Step

Here's the complete path your document takes through SureDox, from upload to download.

Step 1: You Upload Your Document

When you upload a PDF to SureDox, it travels from your browser to our servers over an encrypted connection (TLS/HTTPS). This is the same encryption technology your bank uses. Anyone intercepting the data in transit would see nothing but scrambled nonsense.
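For readers who want to see what "encrypted in transit" means in practice, here is a minimal Python sketch of a client that refuses anything older than TLS 1.2 and insists on a valid server certificate. It uses Python's standard `ssl` module purely as an illustration. It is not SureDox's actual upload code, and the function name `strict_tls_context` is hypothetical.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate and hostname checks on."""
    # create_default_context() turns on certificate verification and
    # hostname checking for client connections by default.
    context = ssl.create_default_context()
    # Refuse legacy protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = strict_tls_context()
```

A connection made with a context like this fails outright if the server cannot prove its identity, rather than silently falling back to something weaker.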

Your document lands in Cloud Storage for Firebase, which is built on Google Cloud Storage. It's stored in a private container that only your account can access; no other SureDox user can see or reach your files.

At rest, your document is encrypted using AES-256, the same standard used by governments and financial institutions worldwide. Even if someone physically stole a hard drive from Google's data centre, your files would be unreadable.

Step 2: We Analyse the Document Type

Before we start looking for personal information, SureDox examines your PDF to determine what kind of document it is. Is it a typed, text-based document like a contract or invoice? Or is it a scanned, handwritten, or image-heavy document?

This analysis happens entirely within our own systems on Google Cloud. Your document never leaves our infrastructure for this step. The result determines which processing path we use — and this is where it's worth explaining how our AI partners handle your data.
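One common way to make this kind of decision is to check how much selectable text each page already contains. The sketch below assumes that approach. The function name, the 50-character threshold, and the "text" and "ocr" labels are illustrative assumptions, not a description of SureDox's internal logic.

```python
from typing import Sequence


def choose_processing_path(
    page_texts: Sequence[str], min_chars_per_page: int = 50
) -> str:
    """Return "text" if at least half the pages have a usable text layer, else "ocr".

    page_texts holds the text already extracted from each page of the PDF.
    A scanned or handwritten page typically yields little or no text.
    """
    if not page_texts:
        # Nothing extractable at all: treat the file as a scan.
        return "ocr"
    pages_with_text = sum(
        1 for text in page_texts if len(text.strip()) >= min_chars_per_page
    )
    return "text" if pages_with_text / len(page_texts) >= 0.5 else "ocr"
```

Under these assumptions, a typed contract would be routed down the text path, while a photographed form with no text layer would go to OCR.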

Step 3: AI Detection — The Critical Part

This is the step most people worry about, so let's be completely transparent.

For text-based documents, we use the Anthropic Claude API to detect personal information. Here's what that means for your data:

  • SureDox uses a commercial API key under Anthropic's Commercial Terms of Service.
  • Under these terms, Anthropic does not train its AI models on your data. This isn't an opt-out situation — commercial API data is excluded from training entirely, by contract.
  • Your document content is sent to Anthropic's servers, analysed there, and the results are returned to us. Anthropic retains API logs for a maximum of 7 days for abuse prevention, after which they are automatically deleted.
  • Anthropic offers a Zero Data Retention option for organisations with the strictest requirements — logs are processed for real-time safety checks and then immediately discarded.

To be absolutely clear: the content of your documents is never used to train, improve, or develop any AI model. This is a contractual guarantee under Anthropic's commercial terms, not a setting that can be toggled on or off.

For scanned or handwritten documents, we use DocuPipe for OCR (optical character recognition) and PII detection. DocuPipe's privacy commitments are equally strong:

  • Your files are never shared with any third party, unless legally compelled.
  • All data is encrypted in transit and at rest.
  • DocuPipe employees can access files only when needed to keep the service operating, and their access is audited and logged.
  • DocuPipe acts as a data processor — you (through SureDox) remain the controller of your data.
  • When you delete your data, DocuPipe removes all uploaded documents, schemas, and derived outputs from their active systems within 30 days.

Step 4: Results Come Back to SureDox

The AI returns a structured list of detected personal information — names, ID numbers, addresses, phone numbers, and so on — along with their exact locations in the document. This is stored in your project within Firestore (Google's cloud database), encrypted at rest and accessible only to your account.
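To make "a structured list" concrete, here is a minimal sketch of what one detection record might look like and how a JSON response could be parsed into it. The field names (`kind`, `text`, `page`, `box`) and the JSON layout are assumptions chosen for illustration. The actual schema SureDox stores in Firestore is not described in this post.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    kind: str                               # e.g. "name", "id_number", "phone"
    text: str                               # the detected value
    page: int                               # 1-based page number
    box: tuple[float, float, float, float]  # (x0, y0, x1, y1) on that page


def parse_detections(raw_json: str) -> list[Detection]:
    """Convert a JSON array of detection objects into Detection records."""
    return [
        Detection(
            kind=item["kind"],
            text=item["text"],
            page=int(item["page"]),
            box=tuple(float(v) for v in item["box"]),
        )
        for item in json.loads(raw_json)
    ]
```

Keeping the location alongside each value is what lets the review screen highlight the exact spot on the page.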

At this point, the AI's job is done. Apart from the short-lived provider logs and retention windows described in Step 3, your document content no longer sits with any third party; only the detection results live in your SureDox account.

Step 5: You Review and Redact

You see the detected items highlighted on a preview of your document. You can accept, reject, or adjust any detection before redacting. Nothing is permanently changed until you confirm.

This step happens entirely within your browser and our Firebase backend. No external services are involved.
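The review logic can be pictured as a simple filter: only detections you have explicitly accepted go forward, and an adjusted box replaces the suggested one. The sketch below is a hypothetical model of that idea. The `ReviewItem` type and its fields are assumptions, not SureDox's real data model.

```python
from dataclasses import dataclass
from typing import Optional

Box = tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class ReviewItem:
    detection_id: str
    box: Box
    decision: str = "pending"          # "pending", "accept", or "reject"
    adjusted_box: Optional[Box] = None  # set when the user resizes the box


def boxes_to_redact(items: list[ReviewItem]) -> list[Box]:
    """Return the final boxes for accepted items, preferring any adjusted box."""
    return [
        item.adjusted_box if item.adjusted_box is not None else item.box
        for item in items
        if item.decision == "accept"
    ]
```

Anything still pending or rejected is left untouched, which matches the rule that nothing changes until you confirm.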

Step 6: Download Your Redacted Document

When you download, SureDox generates a new PDF with your chosen redactions permanently applied. The redacted content is truly removed, not just hidden under a black box that leaves the original text in the file, where it can still be selected, copied, and pasted elsewhere (the mistake that exposed hidden names in the Epstein files and the Manafort court filings).
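The difference between covering and removing is easier to see with a toy model. The sketch below represents a page as a list of text spans plus any rectangles drawn on top. It is a simplified illustration, not a real PDF library and not SureDox's implementation, and all the names in it are hypothetical.

```python
from dataclasses import dataclass, field

Box = tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Span:
    text: str
    box: Box


@dataclass
class Page:
    spans: list[Span]
    rectangles: list[Box] = field(default_factory=list)

    def extract_text(self) -> str:
        # Copy-paste and text extraction read the spans, not the drawn shapes.
        return " ".join(span.text for span in self.spans)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def cover_only(page: Page, area: Box) -> Page:
    """Unsafe: draws a rectangle but leaves the underlying text in place."""
    return Page(spans=list(page.spans), rectangles=page.rectangles + [area])


def redact(page: Page, area: Box) -> Page:
    """Safe: drops every span under the area, then draws the rectangle."""
    kept = [span for span in page.spans if not overlaps(span.box, area)]
    return Page(spans=kept, rectangles=page.rectangles + [area])
```

In this model, a covered page looks identical on screen to a redacted one, but `extract_text()` on the covered page still returns the hidden words.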

Your redacted document is yours to keep. The original remains in your SureDox account until you choose to delete it.


What We Don't Do

Sometimes what a company doesn't do matters more than what it does. Here's our list:

We don't train AI on your data. Our commercial API agreements with Anthropic and DocuPipe contractually prohibit this.

We don't sell your data. Not to advertisers, not to data brokers, not to anyone. Ever.

We don't share your documents. The only parties that ever see your document content are the AI services processing them on your behalf — under strict data processing agreements.

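We don't keep your data forever. You control your documents. Delete them when you're done and they're removed from your SureDox account; the sub-processor retention windows described in Step 3 (up to 7 days for Anthropic API logs, up to 30 days for DocuPipe) still apply.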

We don't use your data for marketing. Your uploaded documents are processed for redaction and nothing else.


How This Fits POPIA

Under the Protection of Personal Information Act, SureDox operates as an Operator (Section 1) — we process personal information on your behalf, under your instructions. You remain the Responsible Party who determines why and how the data is processed.

This means:

  • We process your data only for the purpose you instructed: redaction (Section 13, purpose specification).
  • We implement appropriate security safeguards to protect your data (Section 19).
  • We have Operator Agreements in place with our sub-processors (Anthropic and DocuPipe) that bind them to equivalent data protection standards (Section 21).
  • We support your obligation to ensure data is processed lawfully, by giving you tools to de-identify personal information before sharing documents for secondary purposes (Section 15, further processing limitation).

The Trust Chain

Every link in the chain that handles your data has clear, contractual obligations:

You → SureDox: Our Terms of Service and Privacy Policy set out exactly how we handle your data, including our role as a POPIA Operator.

SureDox → Google Cloud (Firebase): Google acts as a data processor under their Data Processing and Security Terms. Firebase services are certified under ISO 27001, SOC 1, SOC 2, and SOC 3.

SureDox → Anthropic (Claude API): Anthropic's Commercial Terms guarantee no model training on API data. Retention is limited to 7 days maximum, with zero-retention options available.

SureDox → DocuPipe: DocuPipe's Terms of Service and Privacy Notice confirm files are never shared with third parties, are encrypted in transit and at rest, and employee access is audited.


Questions You Should Ask Any Redaction Tool

If you're evaluating SureDox or any other redaction tool, here are the questions that matter:

  1. Is my data used to train AI models? (Ours isn't — by contract, not just by policy.)
  2. Where is my data stored, and is it encrypted? (Google Cloud, AES-256 at rest, TLS in transit.)
  3. Who else can access my documents? (Only the AI services processing them, under data processing agreements.)
  4. How long is my data retained? (You control this — delete when ready.)
  5. Are redactions permanent? (Yes — we remove the data, not just cover it up.)
  6. Does the tool operate under a recognised privacy framework? (POPIA Operator, with sub-processor agreements in place.)

Transparency Is the Point

We're building SureDox for organisations that handle sensitive information every day — law firms, HR departments, healthcare providers, financial services. These are people who can't afford to guess about where their data goes.

That's why we've written this post. Not because a regulation required it, but because we believe that if you're trusting us with personal information, you deserve to know exactly what happens to it.

If you have questions about anything in this post, or if you need specific documentation for your compliance team, reach out. We're happy to provide our Operator Agreement, security documentation, or technical details about any part of our infrastructure.

Your data. Your control. That's the deal.


SureDox is a POPIA-compliant document redaction platform built by Boone and Boo (Pty) Ltd. Get started for free or view a demo to see how it works.

