This guide will show you, from start to finish, how to:
- Install and configure tools for scrubbing metadata
- Use those tools to scrub metadata
- Follow some best practices for handling images and PDFs that you intend to publish
Installation
Windows
Before you begin you will need to install LibreOffice, Exiftool, and QPDF.
While other document editors might be suitable, we recommend LibreOffice due to its open-source nature and highly suggest avoiding Microsoft Office.
Prerequisites
- Ensure you have administrative privileges for installations and PATH edits.
- Winget must be available (included in Windows 10/11 via the App Installer; update it from the Microsoft Store if needed). If winget is not recognized in Command Prompt, install it manually from Microsoft's official GitHub repository: https://github.com/microsoft/winget-cli.
LibreOffice
To install LibreOffice on Windows, download the installer from the official download page and run it: https://www.libreoffice.org/download/download-libreoffice/
Exiftool
To install exiftool on Windows, first open a Command Prompt window (press Win+R, type cmd.exe, and press Enter), then paste in the following command and press Enter. Follow the on-screen instructions for the installer:
winget install --id OliverBetz.ExifTool -e
QPDF
Install qpdf by entering the following command into the Command Prompt:
winget install --id QPDF.QPDF -e
Run the following to check the version that was installed. Take note of it:
winget list --id QPDF.QPDF
Open C:\Program Files and look for a folder titled qpdf 12.2.0. Yours may have a different version number. Copy the file path to the bin folder. The file path will be something like C:\Program Files\qpdf 12.2.0\bin.
Search for "Edit the system environment variables" in the Windows search bar and open it. Click the "Environment Variables" button. Under "User variables", select "Path", then click "Edit". A new window will open. Click "New" and paste the path to the qpdf bin folder. Click OK to close each window, then open a new Command Prompt window so that the updated PATH takes effect.
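Because the qpdf folder name includes a version number, it can help to look the folder up rather than type it from memory. The following is a minimal Python sketch of that lookup, assuming qpdf was installed under C:\Program Files as described above. The function name find_qpdf_bin_dirs is made up for this example and is not part of qpdf or winget.

```python
from pathlib import Path


def find_qpdf_bin_dirs(program_files=Path(r"C:\Program Files")):
    """List the bin folder of every 'qpdf <version>' folder, e.g. qpdf 12.2.0."""
    # Assumption: the installer names its folder "qpdf <version>", as in this guide.
    return sorted(p for p in program_files.glob("qpdf*/bin") if p.is_dir())


if __name__ == "__main__":
    # Print each candidate so it can be pasted into the PATH editor.
    for bin_dir in find_qpdf_bin_dirs():
        print(bin_dir)
```

If nothing is printed, qpdf was probably installed somewhere else, and the folder has to be located by hand.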
Linux
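Install exiftool and qpdf through your package manager. Package names vary by distribution; on Debian and Ubuntu, for example, ExifTool is packaged as libimage-exiftool-perl.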
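On either platform, it is worth confirming that both tools can be found before moving on. The sketch below is one way to do that from Python using only the standard library. The function name find_missing_tools is made up for this example, and the check only confirms that a command exists on PATH, not that it works.

```python
import shutil

# Command names as this guide types them at the prompt.
REQUIRED_TOOLS = ("exiftool", "qpdf")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("exiftool and qpdf are both available.")
```

On Windows, run this from a newly opened prompt, since a prompt that was already open will not see PATH changes.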
Scrubbing the Data
Scrub Images
NOTE: It is important to scrub images before embedding them into your document. Scrubbing a PDF does not scrub any images inside of it.
NOTE: You should take backups of any images you perform this process on. Consider copying them into a new folder and then working on those copies.
These instructions focus on JPEGs but can be adapted for other formats, like PNG or TIFF. Use -ext png for PNGs, or specify multiple formats with -ext jpg -ext png. Scrub only the images you'll embed; if your document includes vectors or other embeds (e.g., from LibreOffice Draw), they must be scrubbed separately. Make sure that all of the images have been scrubbed before you proceed to creating your PDF.
Open a cmd.exe window and change directory to the new image folder (you did read the notes at the top of this section, right?). The command to do this is below. Be sure to use the correct file path:
cd C:\[path to folder]\images
Scrub all JPEG images that will be inserted into the doc using exiftool. The command to do this is below. Be sure to use the correct file path:
exiftool -overwrite_original -all= -r -ext jpg C:\[path to folder]\images
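The backup advice and the scrub command can be combined so that only copies are ever modified. The following Python sketch shows one possible way to do this, assuming exiftool is on PATH. The helper names build_scrub_command and scrub_copy are invented for this example; the exiftool options are the same ones used in the command above.

```python
import shutil
import subprocess


def build_scrub_command(folder, extensions=("jpg",)):
    """Build this section's exiftool command for one folder."""
    command = ["exiftool", "-overwrite_original", "-all=", "-r"]
    for ext in extensions:
        # One -ext option per format, e.g. -ext jpg -ext png
        command += ["-ext", ext]
    command.append(str(folder))
    return command


def scrub_copy(source, working, extensions=("jpg",)):
    """Copy source into a new working folder, then scrub only the copies."""
    # copytree raises an error if the working folder already exists,
    # so an earlier working copy is never silently overwritten.
    shutil.copytree(source, working)
    return subprocess.run(build_scrub_command(working, extensions), check=True)
```

Calling scrub_copy with an originals folder and a not-yet-existing working folder leaves the originals untouched, and the scrubbed copies are the ones to embed in the document.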
Create Your PDF
We recommend using LibreOffice to draft your documentation as we cannot fully trust Microsoft Office to not embed data into the resulting PDF in ways we have not yet discovered.
In LibreOffice, export the finished document to PDF through the "File > Export As > Export as PDF" menu. For other word processing software, such as MS Word (which we do not recommend that you use), save the document as a PDF; do not "print to PDF".
Scrub Your PDF
Scrub all PDF metadata with the command below. Be sure to use the correct file path:
exiftool -overwrite_original -all= C:\[path to folder]\input.pdf
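Then, after scrubbing, linearize the PDF with QPDF. ExifTool's PDF edits are reversible: it appends its changes as an incremental update and leaves the original metadata in the file. Linearizing rewrites the file, which discards that leftover data. Run the command from the folder that contains the PDF, or use full file paths: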
qpdf --linearize input.pdf output.pdf
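The order of these two commands matters, so it can be useful to run them as a single step. The sketch below is a minimal Python wrapper, assuming both tools are on PATH. The function names build_pdf_commands and scrub_pdf are hypothetical; the commands themselves are the two shown above.

```python
import subprocess


def build_pdf_commands(input_pdf, output_pdf):
    """Return this section's two commands in the order they must run."""
    return [
        # Step 1: remove the metadata with exiftool, in place.
        ["exiftool", "-overwrite_original", "-all=", input_pdf],
        # Step 2: rewrite the file with qpdf so the old data is discarded.
        ["qpdf", "--linearize", input_pdf, output_pdf],
    ]


def scrub_pdf(input_pdf, output_pdf):
    """Run both steps, stopping at the first command that fails."""
    for command in build_pdf_commands(input_pdf, output_pdf):
        subprocess.run(command, check=True)
```

Note that qpdf exits with a non-zero status when it only emits warnings, so check=True treats warnings as failures. That is a cautious choice for this sketch rather than a requirement.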
To verify that your PDF has been scrubbed, run:
exiftool -a -G1 -s output.pdf
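The output should show only minimal metadata, such as the file name, file size, PDF version, and page count. No user-specific information, such as an author name, creating application, or creation date, should remain.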
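Reading the report by eye is easy to get wrong on a long listing. The sketch below shows one way to scan the text printed by the verification command for tags that often carry personal details. It assumes the report uses the "[Group] TagName : value" layout produced by the -G1 and -s options. The watch list is illustrative rather than exhaustive, and the name find_leftovers is made up for this example.

```python
import re

# Illustrative, not exhaustive: tag names that commonly carry personal details.
WATCHED_TAGS = {"Author", "Creator", "Producer", "Title", "Subject",
                "Keywords", "CreateDate", "ModifyDate"}

# Matches report lines of the form "[Group]   TagName   : value".
LINE = re.compile(r"^\[(?P<group>[^\]]+)\]\s+(?P<tag>\S+)\s*:\s?(?P<value>.*)$")


def find_leftovers(report, watched=WATCHED_TAGS):
    """Return (group, tag, value) for each watched tag found in the report."""
    leftovers = []
    for line in report.splitlines():
        match = LINE.match(line)
        if match and match["tag"] in watched:
            leftovers.append((match["group"], match["tag"], match["value"]))
    return leftovers
```

An empty result only means that none of the watched tags were found. It is still worth reading the full report before publishing.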
You are now the proud author of a PDF document which has its metadata scrubbed. We know that this process is cumbersome and are working to have a cleaner solution available to the community soon.