Is It Safe to Remove Metadata Online? How MetaDocu Works
Yes, and your files never leave your device. MetaDocu removes hidden metadata entirely inside your browser. When you open a Word, Excel, or PDF file, it is read into local memory, scanned and cleaned by a WebAssembly engine, and saved back as a download. Nothing is uploaded to any server, there is no cloud copy, and no account is required; once the page has loaded you can even go offline and still clean a file. That makes it safe in a way upload-based tools can't match: there is no file transmission to intercept, no server-side retention to trust, and no server-side copy to breach. Below we explain how the in-browser processing works, compare local and upload-based cleaning, and show which hidden fields each format leaks and how MetaDocu removes them.
How in-browser cleaning works
Four steps, all on your machine — no server ever sees your file.
1. Your file loads locally. When you drop in a document, the browser reads its bytes into your device's memory. No upload request is made.
2. WebAssembly parses it in memory. A compiled WebAssembly engine unzips the OOXML parts or walks the PDF object tree at near-native speed, right in the browser sandbox.
3. Metadata fields are stripped. Author, company, timestamps, RSIDs, comments, XMP, and embedded-image EXIF/GPS are located and physically removed from the file structure.
4. You download the clean file. The rebuilt, byte-clean file is handed back to you as a normal download, plus a verification report of what was removed.
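The four steps above can be sketched in a few lines. This is a Python stand-in under stated assumptions, not MetaDocu's WebAssembly engine: the helper names `build_sample_package` and `clean_package`, the sample author names, and the two-field list are hypothetical, and only `docProps/core.xml` is touched.

```python
import io
import re
import zipfile

# Hypothetical sample data: a minimal OOXML-style package built in memory.
CORE_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:creator>Jane Doe</dc:creator>"
    "<cp:lastModifiedBy>John Roe</cp:lastModifiedBy>"
    "</cp:coreProperties>"
)

# Illustrative subset of fields to empty; a real cleaner covers many more.
CORE_FIELDS = ("dc:creator", "cp:lastModifiedBy")


def build_sample_package() -> bytes:
    """Create a tiny zip with a core-properties part and a content part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("docProps/core.xml", CORE_XML)
        zf.writestr("word/document.xml", "<w:document>Hello</w:document>")
    return buf.getvalue()


def clean_package(data: bytes):
    """Unzip in memory, empty the listed fields, rebuild; return (bytes, report)."""
    removed = []
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            payload = src.read(name)
            if name == "docProps/core.xml":
                text = payload.decode("utf-8")
                for tag in CORE_FIELDS:
                    # Keep the element, drop its text: emptied, not just hidden.
                    pattern = rf"(<{tag}\b[^>]*>)[^<]+(</{tag}>)"
                    text, count = re.subn(pattern, r"\1\2", text)
                    if count:
                        removed.append(tag)
                payload = text.encode("utf-8")
            dst.writestr(name, payload)  # every other part is copied unchanged
    return out.getvalue(), removed
```

Calling `clean_package(build_sample_package())` returns the rebuilt bytes together with the list of fields that were emptied, which plays the role of the verification report in step 4.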
Local processing vs. upload-based tools
Why in-browser cleaning removes the whole category of server-side risk.
| Aspect | MetaDocu (in-browser) | Upload-based online tools |
|---|---|---|
| Where your file goes | Stays in your browser's memory | Transmitted to a remote server |
| Network transfer of file | None | Required (your file is sent) |
| Server copy / retention | Never created | May be cached, logged, or retained |
| Breach / interception surface | None — nothing in transit | Exists at upload, storage, and processing |
| Works offline | Yes, after the page loads | No |
| Account required | No | Often yes |
What hidden metadata each format exposes
Office and PDF files carry far more than their visible content. Here is the sensitive data each format can leak — and how MetaDocu removes it, locally.
| Hidden field | Formats | What it exposes | Risk | How MetaDocu removes it |
|---|---|---|---|---|
| Author / Creator: dc:creator · PDF /Author | Word (.docx), Excel (.xlsx), PDF | The real name (or Office sign-in name) of whoever first created the file, often your full legal name. | High | Cleared from the OOXML core properties / PDF Info dictionary in browser memory; the field is emptied, not just hidden. |
| Last Modified By: cp:lastModifiedBy | Word (.docx), Excel (.xlsx) | The name of the last person to save the file; exposes internal reviewers and collaboration chains. | High | Stripped from the core properties XML so no editor identity remains. |
| Company: Company (app.xml) | Word (.docx), Excel (.xlsx) | The organization name baked in from your Office licence; reveals your employer even on a personal document. | Medium | Removed from the extended (app) properties part. |
| Manager: Manager (app.xml) | Word (.docx), Excel (.xlsx) | The manager name some templates embed; leaks your reporting line. | Medium | Cleared from the extended properties. |
| Template path: Template (app.xml) | Word (.docx), Excel (.xlsx) | An absolute file path to the template (e.g. C:\Users\<you>\…), which leaks your account name and local folder layout. | High | Path is wiped so no local filesystem clue ships with the file. |
| Application & version: Application/AppVersion · PDF /Producer · /Creator | Word (.docx), Excel (.xlsx), PDF | The exact software and version used, a fingerprint for targeting known vulnerabilities or deanonymizing authors. | Low | Normalized/removed from app properties and the PDF Producer/Creator fields. |
| Revision number: cp:revision | Word (.docx), Excel (.xlsx) | How many times the file was saved; hints at how heavily a 'final' document was reworked. | Low | Reset in the core properties. |
| Total editing time: TotalTime (app.xml) | Word (.docx), Excel (.xlsx) | Cumulative minutes spent editing; can contradict claims about when/how long work was done. | Low | Zeroed out in the extended properties. |
| Created / Modified dates: dcterms:created/modified · PDF /CreationDate /ModDate | Word (.docx), Excel (.xlsx), PDF | Precise creation and last-edit timestamps that build a timeline of your activity. | Medium | Removed or reset so no editing timeline leaks. |
| Title / Subject / Keywords: dc:title, dc:subject, cp:keywords · PDF /Title /Subject /Keywords | Word (.docx), Excel (.xlsx), PDF | Internal codenames, client names, or tags left in the properties even when not shown in the document text. | Medium | Cleared from both OOXML properties and the PDF Info dictionary. |
| Revision save IDs (RSID): w:rsid in settings.xml + run-level rsids | Word (.docx) | Random per-editing-session IDs; documents that share them can be linked to a common origin, and so to the same author or machine, across files. | Medium | RSID nodes are physically stripped from the document XML, breaking cross-file correlation. |
| Tracked changes & comments: w:ins/w:del, comments.xml, people.xml | Word (.docx) | Deleted text, internal review notes and commenter names that stay inside the file when markup is only hidden; comments can remain even after 'accepting all'. | High | Comment and revision parts are removed so no hidden review history ships. |
| Custom properties: custom.xml | Word (.docx), Excel (.xlsx) | Bespoke fields added by DMS/templates (matter numbers, classifications, internal IDs). | Medium | The custom properties part is cleared. |
| XMP metadata stream: /Metadata XMP packet (xmpMM:DocumentID, InstanceID, History) | PDF | A second copy of author/tool data plus document/instance IDs that survive even when the Info dictionary is cleared. | High | The XMP packet is removed alongside the Info dictionary so no duplicate metadata remains. |
| Image EXIF (camera & software): EXIF Make/Model/Software/DateTimeOriginal in embedded images | Embedded images | Camera make/model, capture time and editing software of photos embedded in the document. | Medium | EXIF segments are byte-stripped from embedded images while keeping the picture intact. |
| Image GPS coordinates: EXIF GPSLatitude/GPSLongitude in embedded images | Embedded images | The exact latitude/longitude where a photo was taken, which can pinpoint your home or office. | High | GPS EXIF tags are wiped so no location ships with the file. |
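To make the last two rows concrete, here is a minimal sketch of byte-stripping EXIF from a JPEG. It is written in Python for readability and is an assumption about one reasonable approach, not MetaDocu's actual code; the function name `strip_exif` is hypothetical. It relies on the standard JPEG layout: a start-of-image marker, then length-prefixed segments, with EXIF (including GPS tags) carried in an APP1 segment whose payload begins with `Exif\0\0`.

```python
import struct

SOI = b"\xff\xd8"            # start-of-image marker
APP1 = 0xE1                  # segment type that carries EXIF
SOS = 0xDA                   # start of scan: compressed pixels follow
EXIF_HEADER = b"Exif\x00\x00"


def strip_exif(jpeg: bytes) -> bytes:
    """Return the JPEG with every EXIF APP1 segment dropped.

    Sketch only: assumes each marker before SOS is followed by a 2-byte
    big-endian length, and does not handle padding or standalone markers.
    """
    if jpeg[:2] != SOI:
        raise ValueError("not a JPEG stream")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed segment marker")
        marker = jpeg[pos + 1]
        if marker == SOS:
            # Image data runs to the end of the file; copy it untouched.
            out += jpeg[pos:]
            return bytes(out)
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]  # marker + length + payload
        is_exif = marker == APP1 and segment[4:10] == EXIF_HEADER
        if not is_exif:
            out += segment
        pos += 2 + length
    out += jpeg[pos:]
    return bytes(out)
```

Because only whole APP1 segments are skipped and the scan data is copied byte for byte, the picture itself is not re-encoded, which matches the "keeping the picture intact" wording in the table.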
Clean your document before you share it
Scan, remove, and verify hidden metadata 100% in your browser — nothing uploaded.
Frequently Asked Questions
Is it safe to remove metadata online? Are my files uploaded?
Yes, it's safe — and no, your files are never uploaded. MetaDocu runs entirely in your browser using WebAssembly. When you open a document, it's read into your device's local memory, scanned and cleaned there, and saved back as a download. No file bytes are ever sent to a server, so there's no upload to intercept, no cloud copy to leak, and no account required. The only network request is loading the page itself — after that you could disconnect from the internet and still clean a file. That's fundamentally different from upload-based online tools, which transmit your contract, resume, or spreadsheet to a remote server you don't control.
How does MetaDocu's in-browser processing work?
MetaDocu uses WebAssembly, compiled code that runs at near-native speed inside your browser's sandbox. When you drop in a Word, Excel, or PDF file, the browser reads its bytes into local memory. A WebAssembly engine then parses the document structure directly: it unzips the OOXML parts (or walks the PDF object tree), locates every metadata field (author, company, timestamps, RSIDs, XMP, embedded-image EXIF) and rewrites the file without them. The cleaned bytes are handed back as a normal download. Every step happens on your machine; the file never touches a server, so there is no upload to wait for and cleaning works offline once the page has loaded.
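As one concrete instance of "locate a field and rewrite the file without it", consider RSIDs in Word XML. The sketch below is a Python illustration under assumptions, not MetaDocu's engine: `strip_rsids` is a hypothetical name, it uses regular expressions where a production cleaner would more likely use a real XML parser, and it assumes RSID values are written as eight hexadecimal digits.

```python
import re

# Attributes such as w:rsidR, w:rsidRPr, w:rsidRDefault, w:rsidP on paragraphs and runs.
RSID_ATTR = re.compile(r'\s+w:rsid[A-Za-z]*="[0-9A-Fa-f]{8}"')

# The <w:rsids> table in settings.xml that lists every session ID.
RSID_TABLE = re.compile(r"<w:rsids>.*?</w:rsids>", re.DOTALL)


def strip_rsids(xml: str) -> str:
    """Remove the RSID table and all RSID attributes from a WordprocessingML string."""
    xml = RSID_TABLE.sub("", xml)
    return RSID_ATTR.sub("", xml)
```

The text inside `w:t` elements is never matched by either pattern, so the visible content is left alone while the identifiers that allow cross-file correlation are gone.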
Why is local processing safer than upload-based tools?
Local processing is safer because your file never leaves your device, so the entire category of server-side risk disappears. Upload-based tools must transmit your document to a remote server, where it may be cached, logged, retained, processed by third parties, or exposed in a breach — all outside your control. With MetaDocu's in-browser approach there's no transmission, no server copy, and no retention policy to trust: the cleaning happens in your browser's sandbox and the only file that exists is the one on your disk. For sensitive documents — contracts, resumes, legal disclosures, financial spreadsheets — that difference is the whole point.
How can I verify my file is clean after removing metadata?
After cleaning, MetaDocu shows a verification report confirming what was removed and that no sensitive metadata remains, so you don't have to take it on faith. You can also verify independently: open the downloaded file and check its properties — in Word or Excel via File → Info → Properties, or in a PDF viewer via Document Properties — and the author, company, timestamps, and other fields should be empty. Because MetaDocu strips the fields from the file's internal XML/PDF structure rather than hiding them in the interface, the values are physically gone, not just blanked on screen.
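If you prefer to check the file structure directly rather than through an application's properties dialog, a short script can do it for Word and Excel files. The following is a hedged sketch in Python, independent of MetaDocu: `leftover_metadata` is a hypothetical helper that opens the package as a zip and lists every non-empty value in the standard property parts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Standard OOXML property parts; not every file contains all three.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml", "docProps/custom.xml")


def leftover_metadata(package: bytes) -> dict:
    """Return {part: {element name: value}} for each non-empty property found."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        names = set(zf.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(zf.read(part))
            values = {}
            for elem in root.iter():
                text = (elem.text or "").strip()
                if elem is not root and text:
                    local_name = elem.tag.rsplit("}", 1)[-1]  # drop the namespace
                    values[local_name] = text
            if values:
                found[part] = values
    return found
```

An empty result means none of those parts holds a value. Note that this sketch reports everything that is non-empty, including harmless counters such as page or word totals in `app.xml`, so a non-empty result is a prompt to look, not proof of a leak.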
Does local processing change my file's content or formatting?
No. Cleaning metadata doesn't change your document's visible content, formatting, or layout. MetaDocu only edits the metadata fields (author, company, timestamps, revision IDs, embedded-image EXIF, PDF Info/XMP) that live alongside your content, not the text, tables, formulas, images, or styles. Your Word document reads identically, your Excel formulas and charts keep working, and your PDF pages render the same. The one intentional exception is hidden content you asked to remove: tracked-change history and comments are stripped along with the metadata fields. The output looks and behaves like the original, minus the privacy risks.
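One way to see for yourself which parts of a Word or Excel file were touched is to hash every part of the original and the cleaned package and compare. This is a minimal Python sketch with hypothetical helper names (`part_digests`, `changed_parts`), not a MetaDocu feature. Expect the property parts to differ; `word/document.xml` can also differ when RSIDs or tracked changes were stripped, and embedded images differ when EXIF was removed.

```python
import hashlib
import io
import zipfile


def part_digests(package: bytes) -> dict:
    """Map each part name in the package to the SHA-256 of its uncompressed bytes."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return {name: hashlib.sha256(zf.read(name)).hexdigest()
                for name in zf.namelist()}


def changed_parts(original: bytes, cleaned: bytes) -> list:
    """List parts that were added, removed, or modified between the two packages."""
    before = part_digests(original)
    after = part_digests(cleaned)
    return sorted(name for name in before.keys() | after.keys()
                  if before.get(name) != after.get(name))
```

Any part that is not in the returned list is byte-identical in both files, so styles, charts, and worksheets that do not appear there were not modified at all.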