Compress Scanned PDF Documents

Reduce the file size of scanned receipts, contracts, and forms by 50-70% without losing readability and without uploading sensitive scans to a server.

The Scenario

Scanned documents are the largest PDFs most people encounter. A single page scanned at 300 DPI typically produces a 1-3 MB file, so a 20-page scanned contract can reach 20-60 MB: too large for email, slow to open, and expensive to store. These are also among the most sensitive documents: scanned IDs, signed contracts, medical records, tax returns.
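
The size estimate above is simple multiplication, and a rough sketch makes it easy to try other page counts. The per-page range comes from the text; the function name and the 25 MB attachment limit are assumptions for illustration only, since real limits vary by email provider.

```python
def estimate_scan_size_mb(pages, per_page_mb=(1.0, 3.0)):
    """Return the (low, high) estimated size in MB for a scanned document."""
    low, high = per_page_mb
    return pages * low, pages * high

# Assumed attachment limit for illustration; real limits vary by provider.
ASSUMED_EMAIL_LIMIT_MB = 25

low, high = estimate_scan_size_mb(20)
print(f"20 pages: {low:.0f}-{high:.0f} MB")
print("May exceed the assumed email limit:", high > ASSUMED_EMAIL_LIMIT_MB)
```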

Why Privacy Matters Here

Scanned PDFs contain embedded raster images at high resolution. Each page is essentially a photograph of the original document. This makes them 10-50x larger than equivalent text-based PDFs. Compressing them on a server means uploading megabytes of your most personal documents — scans of your passport, tax returns, or medical records.

How to Do It

1. Select scanned PDF

Drop your scanned document into the compress tool. Large files (20+ MB) may take a moment to load into browser memory.

2. Compress

The tool identifies and removes duplicate embedded objects, strips EXIF metadata from scanned images, and optimizes the PDF structure. Expect a 50-70% reduction for typical scans.

3. Verify readability

Open the compressed file and check that text is still readable at normal zoom. Browser-based compression preserves the original image quality — no re-encoding occurs.
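
The duplicate-removal part of the compress step can be illustrated with a small sketch. This shows the general idea of content-hash deduplication, not the tool's actual code: the dictionary of object IDs to raw bytes is a hypothetical stand-in for the embedded streams (images, fonts, ICC profiles) in a PDF, and the function names are made up for this example.

```python
import hashlib

def deduplicate_streams(objects):
    """Map each object ID to the ID of the first object with identical bytes.

    `objects` is a dict of {object_id: raw_bytes}, a hypothetical stand-in
    for the embedded streams in a PDF.
    """
    first_seen = {}  # content hash -> canonical object ID
    remap = {}       # object ID -> canonical object ID
    for obj_id, data in objects.items():
        digest = hashlib.sha256(data).hexdigest()
        canonical = first_seen.setdefault(digest, obj_id)
        remap[obj_id] = canonical
    return remap

def bytes_saved(objects, remap):
    """Total size of the objects that are duplicates of an earlier object."""
    return sum(len(data) for obj_id, data in objects.items()
               if remap[obj_id] != obj_id)

# Toy data: the same (fake) ICC profile embedded once per page.
icc = b"fake-icc-profile" * 100
objects = {1: icc, 2: b"page-1-image", 3: icc, 4: b"page-2-image", 5: icc}

remap = deduplicate_streams(objects)
print(remap)
print("Bytes that could be dropped:", bytes_saved(objects, remap))
```

In a real PDF, every reference to a duplicate object would also need to be rewritten to point at the canonical copy before the duplicate is removed; that bookkeeping is omitted here.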

Tips

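  • If you are scanning documents specifically for digital storage, scan at 200 DPI instead of 300 DPI. Pixel count scales with the square of the resolution, so a 200 DPI page has about 44% of the pixels of a 300 DPI page, and the resulting PDF will be roughly 50-55% smaller at scan time.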
  • Batch your scans: merge all pages first, then compress the single merged file. Compression is more effective on larger files because there are more opportunities to deduplicate shared resources.
  • Color scans compress more than grayscale ones, but they also start out larger. If the original is a black-and-white document, scanning in grayscale mode produces a smaller starting file.
  • For receipts that you need to keep for tax purposes, compress and store — the IRS accepts digital records as long as they are legible.
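
The DPI tip above follows from simple arithmetic: a scan's pixel count grows with the square of its resolution. The sketch below assumes a US Letter page (8.5 x 11 inches) and a hypothetical helper name. Actual file sizes depend on the scanner's image compression, so treat the ratio as a rough guide rather than an exact prediction.

```python
def page_pixels(dpi, width_in=8.5, height_in=11.0):
    """Pixel count of a page scanned at `dpi` (defaults assume US Letter)."""
    return round(width_in * dpi) * round(height_in * dpi)

p300 = page_pixels(300)
p200 = page_pixels(200)
ratio = p200 / p300

print(f"300 DPI: {p300:,} pixels")
print(f"200 DPI: {p200:,} pixels")
print(f"200 DPI has {ratio:.0%} of the pixels of 300 DPI")
```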

Why Browser-Based Processing Matters

Scanned documents are inherently personal — they are images of physical papers you chose to digitize. Passport scans, signed contracts, medical forms, tax returns. Browser-based compression ensures these images never pass through any network connection.

Frequently Asked Questions

Why do scanned PDFs compress so much more than regular PDFs?

Scanned PDFs contain large embedded images (one per page). These images often have redundant color data, duplicate ICC profiles, and unoptimized encoding. Removing this redundancy yields large savings. Text-based PDFs are already compact because text consumes very little space.

Will compression make scanned text harder to read?

No. Browser-based compression does not re-encode the embedded images. It removes structural overhead around them. The pixel data in your scans is untouched.
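
To see how metadata can be removed without touching pixel data, consider a JPEG image, where EXIF data lives in APP1 header segments that precede the compressed image data. The sketch below is an illustration of that idea, not the tool's implementation. It assumes a well-formed JPEG whose header contains only length-prefixed segments, and the function name is hypothetical.

```python
import struct

def strip_app1_segments(jpeg: bytes) -> bytes:
    """Return the JPEG with APP1 (EXIF/XMP) segments removed.

    Only header segments before Start of Scan are examined; the
    compressed image data after it is copied through unchanged.
    """
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed JPEG: expected a marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # Start of Scan: the rest is image data
            break
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker != 0xE1:  # keep every segment except APP1
            out += jpeg[pos:pos + 2 + length]
        pos += 2 + length
    out += jpeg[pos:]  # scan data and end marker, byte for byte
    return bytes(out)

# Synthetic example: SOI, APP0, APP1 (EXIF), then scan data and EOI.
app0 = b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00"
app1 = b"\xff\xe1" + struct.pack(">H", 8) + b"Exif\x00\x00"
scan = b"\xff\xda" + struct.pack(">H", 2) + b"\x12\x34" + b"\xff\xd9"
original = b"\xff\xd8" + app0 + app1 + scan

stripped = strip_app1_segments(original)
print(len(original), "->", len(stripped), "bytes")
```

Because the bytes after the Start of Scan marker are copied verbatim, the decoded image is identical before and after stripping.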

Can I compress a scanned PDF that has been OCR-processed?

Yes. OCR adds a transparent text layer over the scanned images. Compression preserves both the image layer and the OCR text layer. Search functionality is maintained.
