Dangerzone - Document Sanitization

Convert potentially dangerous PDFs and Office documents into safe PDFs

🎯 Threat Model

What Problem Does This Solve?

PDF and Office documents can contain:

  • Embedded malware and exploits
  • Tracking beacons that phone home when opened
  • Active content (JavaScript, macros) that can compromise your system
  • Hidden layers and metadata that can leak identifying information

Opening untrusted documents is a major attack vector.

How Dangerzone Protects You

  1. Container Isolation: Opens the document inside a disposable container
  2. Pixel Conversion: Renders each page to raw pixels, so embedded code cannot survive
  3. Safe Reconstruction: Rebuilds a clean PDF from those pixels
  4. Metadata Stripping: Discards the original metadata, since only the pixels carry over

Result: A safe, visually equivalent copy of the document without any embedded threats.
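As a rough sanity check on the result, you can grep a PDF for the names that usually accompany active content. This is a heuristic sketch only: the helper names `list_active_markers` and `check_clean` are hypothetical, the marker list is an assumption, and PDFs can hide such names inside compressed object streams, so an empty result is not proof of safety.

```shell
# Heuristic sketch: look for PDF name objects that usually accompany active content.
# The marker list is an assumption and is deliberately short.
list_active_markers() {
  # -a: treat the binary PDF as text, -o: print only the matched names
  grep -a -o -E '/(JavaScript|OpenAction|Launch|EmbeddedFile|RichMedia)' "$1" | sort -u
}

check_clean() {
  markers=$(list_active_markers "$1")
  if [ -n "$markers" ]; then
    printf 'active-content markers in %s:\n%s\n' "$1" "$markers" >&2
    return 1
  fi
  return 0
}
```

Running `check_clean` on the original and on the sanitized copy shows whether the visible markers disappeared during conversion.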

🚀 Installation

cd ~/github/privacy-toolkit
./tools/dangerzone/install.sh

Or run directly:

bash <(curl -fsSL https://raw.githubusercontent.com/YOUR_USERNAME/privacy-toolkit/main/tools/dangerzone/install.sh)
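If you would rather read the installer before running it, a minimal sketch is to download it to a file first. The helper name `fetch_installer` and the destination path are hypothetical; the URL is the same placeholder as above.

```shell
# Hypothetical helper: save the installer to disk instead of piping it into bash.
fetch_installer() {
  url="$1"
  dest="$2"
  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
  curl -fsSL -o "$dest" "$url" || return 1
  # Refuse an empty download
  [ -s "$dest" ] || return 1
  # Parse the script without executing it, to catch truncated downloads
  bash -n "$dest"
}

# Example (placeholder URL, review the file before running it):
# fetch_installer "https://raw.githubusercontent.com/YOUR_USERNAME/privacy-toolkit/main/tools/dangerzone/install.sh" /tmp/dz-install.sh
# less /tmp/dz-install.sh && bash /tmp/dz-install.sh
```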

📖 Usage

GUI Method

  1. Launch "Dangerzone" from your application menu
  2. Drag and drop a PDF or Office document
  3. Wait for sanitization (can take a few minutes for large docs)
  4. Get your safe PDF

Right-Click Method

  1. Right-click any PDF file
  2. Select "Open with Dangerzone"
  3. The sanitized PDF is saved in the same directory, with a -safe.pdf suffix

Command Line

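# Sanitize a PDF (the command-line entry point is dangerzone-cli;
# plain "dangerzone" launches the GUI)
dangerzone-cli document.pdf

# Sanitize multiple files
dangerzone-cli file1.pdf file2.docx file3.xlsx

# Choose the output file (single input only)
dangerzone-cli --output-filename /safe/directory/document-safe.pdf document.pdf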
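For a whole folder of downloads, a small wrapper can feed files to the CLI one at a time and report failures. This is a sketch under assumptions: `sanitize_dir` and `DZ_CMD` are hypothetical names, the entry point is assumed to be `dangerzone-cli`, and output files are assumed to end in `-safe.pdf`.

```shell
# Assumption: the CLI entry point is dangerzone-cli; override DZ_CMD if yours differs.
DZ_CMD="${DZ_CMD:-dangerzone-cli}"

# Sanitize every supported document in a directory (non-recursive).
sanitize_dir() {
  local dir="$1" file failed=0
  for file in "$dir"/*; do
    [ -f "$file" ] || continue
    case "$file" in
      *-safe.pdf) continue ;;  # assumed output suffix, skip earlier results
      *.pdf|*.docx|*.xlsx|*.pptx|*.odt|*.ods|*.odp) ;;
      *) continue ;;           # not a supported format
    esac
    if ! "$DZ_CMD" "$file"; then
      echo "FAILED: $file" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every conversion succeeded
  [ "$failed" -eq 0 ]
}

# Example: sanitize_dir ~/Downloads
```

Processing one file per invocation keeps a single bad document from hiding which file failed.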

🔍 When to Use Dangerzone

Always sanitize:

  • Documents from email attachments
  • Downloads from the internet
  • Documents from USB drives
  • Any document from untrusted sources
  • Journalist source materials
  • Legal documents from unknown parties

Less critical:

  • Documents you created yourself
  • Documents from verified, trusted colleagues (but still good practice!)

⚠️ Limitations

  • File Size: Output files are usually larger than the originals (pixel-based PDFs)
  • Processing Time: Can be slow for large documents (every page must be rendered)
  • Text Selection: Text becomes images (not searchable/selectable unless OCR is enabled)
  • Forms: Interactive PDF forms become static

Trade-off: Security vs. convenience. Dangerzone prioritizes security.
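To see the file-size cost on your own documents, a short sketch can compare an original with its sanitized copy. The helper name `report_growth` is hypothetical, and the file names in the example are placeholders.

```shell
# Print the size of an original and its sanitized copy, plus the ratio.
report_growth() {
  # $(( ... )) strips the padding some wc implementations add
  orig_bytes=$(( $(wc -c < "$1") ))
  safe_bytes=$(( $(wc -c < "$2") ))
  [ "$orig_bytes" -gt 0 ] || return 1
  echo "$orig_bytes -> $safe_bytes bytes ($((safe_bytes * 100 / orig_bytes))% of original)"
}

# Example: report_growth document.pdf document-safe.pdf
```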

🛠️ Technical Details

Architecture:

  • Uses Podman or Docker for containerization
  • Parses the untrusted document inside an isolated sandbox
  • Converts Office documents to PDF with LibreOffice, then rasterizes pages with GraphicsMagick
  • Rebuilds the PDF from pixel data only
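Because the sandbox depends on a container runtime, it can help to confirm one is installed before troubleshooting anything else. A minimal sketch, where `find_container_runtime` is a hypothetical helper and the default candidates are simply the two runtimes named above:

```shell
# Print the first container runtime found on PATH; fail if none is available.
find_container_runtime() {
  # With no arguments, check the two runtimes named above
  [ "$#" -gt 0 ] || set -- podman docker
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Example:
# runtime=$(find_container_runtime) || echo "No container runtime found" >&2
```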

Supported Formats:

  • PDF
  • Microsoft Office: .docx, .xlsx, .pptx
  • LibreOffice: .odt, .ods, .odp

🔗 Resources

  • mat2: Metadata removal (keeps documents editable)
  • ExifCleaner: Quick metadata stripping
  • Qubes OS: Full system isolation for maximum security

Maintained by: Freedom of the Press Foundation
License: AGPL-3.0
Last Updated: 2025-11-12