Clean

The clean utility produces a cleaned version of an input PDF. It accepts a range of options; the full list is printed when mutool clean is run with no arguments:


\begin{lstlisting}
$ mutool clean
usage: mutool clean [options] input.pdf [outp...
... streams
pages comma separated list of page numbers and ranges
\end{lstlisting}

The arguments here are fairly self-explanatory, and usage is best explained with a few examples.

First, and most simply, clean can be used to try to repair broken files. Many PDF files found in the wild are broken: some have been corrupted by transmission or archiving problems, but a disappointing number were simply created by bad PDF-writing software. Running a clean pass will attempt to repair such a file:


\begin{lstlisting}
mutool clean in.pdf out.pdf
\end{lstlisting}

Individual pages (or page ranges) can be extracted from a PDF. For example:


\begin{lstlisting}
mutool clean -gggg in.pdf out.pdf 1-10,12
\end{lstlisting}

This extracts pages 1 to 10 and page 12 of in.pdf and writes them to a new file, out.pdf. The -gggg option ensures that unused objects are dropped from the PDF.

An 8-page PDF might be rearranged into booklet form using:


\begin{lstlisting}
mutool clean -gggg in.pdf out.pdf 8,1,7,2,6,3,5,4
\end{lstlisting}
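
For longer documents the page list can be generated rather than typed by hand. The following is a minimal sketch, assuming a POSIX shell and an even page count; the helper is a hypothetical shell function, not part of mutool, and it simply reproduces the ordering pattern used in the command above (last, first, second-to-last, second, and so on):

\begin{lstlisting}
# booklet_order N: print the page order N,1,N-1,2,... for an N-page file.
# Hypothetical helper, not part of mutool; assumes N is even.
booklet_order() {
    n=$1
    lo=1
    hi=$n
    order=""
    while [ "$lo" -lt "$hi" ]; do
        # Append the next pair, adding a comma only after the first pair.
        order="${order:+$order,}$hi,$lo"
        lo=$((lo + 1))
        hi=$((hi - 1))
    done
    printf '%s\n' "$order"
}

# Example use:
#   mutool clean -gggg in.pdf out.pdf "$(booklet_order 8)"
\end{lstlisting}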

Finally, a more specialized but very common example: if someone reports a problem on page 4 of a given PDF, the following command will extract that page and expand its content streams, without decompressing the images or the fonts:


\begin{lstlisting}
mutool clean -difgggg in.pdf out.pdf 4
\end{lstlisting}

If this file still exhibits the same problem, it is generally far easier to debug than the original was.
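
When the faulty page is not yet known, the same command can be repeated for several candidate pages. The following is a minimal sketch, assuming a POSIX shell; the function and the page-N.pdf output names are illustrative choices, not anything mutool provides, and only the options already shown above are used:

\begin{lstlisting}
# extract_pages IN PAGE...: write each listed page of IN to its own file.
# Hypothetical wrapper; output files are named page-<number>.pdf.
extract_pages() {
    in=$1
    shift
    for p in "$@"; do
        # Same options as the single-page example above.
        mutool clean -difgggg "$in" "page-$p.pdf" "$p" || return 1
    done
}

# Example use:
#   extract_pages in.pdf 3 4 5
\end{lstlisting}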