綾小路龍之介の素人思考

[pdf] Comparing ways to compress a PDF and reduce its file size

When a PDF file is too large, how can its file size be reduced?

Consider reducing the file size of a PDF file like the one below.

$ pdfinfo *******.pdf
Creator:        pdftk 2.01 - www.pdftk.com
Producer:       itext-paulo-155 (itextpdf.sf.net-lowagie.com)
CreationDate:   Fri Mar 14 16:44:32 2014
ModDate:        Fri Mar 14 16:44:32 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      149551168 bytes
Optimized:      no
PDF version:    1.7

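Comparison of file sizes after processing. With the gs presets that change the image resolution (/screen, /ebook, /printer, /prepress), the degradation can be quite noticeable, depending on the image size and the display resolution. By contrast, gs with /default produced the smallest file here (about 7 % of the original) with no visible degradation of the images. If the images in the original PDF are high resolution to begin with, the presets that change the resolution only add processing time and make the degradation more visible, so nothing is gained. The other commands compress far less than gs.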

Processing and results

Process                       user + sys (s)   File size (bytes)   % of original
Original file                 -                149551168           100
gs -dPDFSETTINGS=/default     335.661          10673382            7.14
gs -dPDFSETTINGS=/screen      559.167          24716043            16.53
gs -dPDFSETTINGS=/ebook       545.506          25966350            17.36
gs -dPDFSETTINGS=/printer     351.718          12383032            8.28
gs -dPDFSETTINGS=/prepress    354.478          14157700            9.47
pdfopt                        24.118           149536440           99.99
pdftk compress                18.854           147370039           98.54
qpdf --linearize              39.898           148749860           99.46
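
The user + sys column and the percentage column can be reproduced from the `time` and `pdfinfo` output in the transcripts. The following is a minimal sketch, not the script used for this post; the helper names `to_seconds`, `cpu_seconds`, and `pct` are made up for illustration.

```shell
# Convert a `time`-style value such as "5m20.432s" to seconds.
to_seconds() {
  printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Sum user and sys CPU time, both given in `time` format.
cpu_seconds() {
  u=$(to_seconds "$1")
  s=$(to_seconds "$2")
  awk -v u="$u" -v s="$s" 'BEGIN { printf "%.3f\n", u + s }'
}

# Size of a processed file as a percentage of the original size.
pct() {
  awk -v out="$1" -v orig="$2" 'BEGIN { printf "%.2f\n", out / orig * 100 }'
}
```

For the /default run, `cpu_seconds 5m20.432s 0m15.229s` gives 335.661 and `pct 10673382 149551168` gives 7.14.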

Concrete command examples.

$ time gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/default -dNOPAUSE -dQUIET -dBATCH -sOutputFile=*******-default.pdf *******.pdf
warning: ignoring invalid option raw

   **** Warning: File has insufficient data for an image.
warning: ignoring invalid option raw
warning: ignoring invalid option raw
warning: ignoring invalid option raw

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> itext-paulo-155 (itextpdf.sf.net-lowagie.com) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.


real    12m1.275s
user    5m20.432s
sys     0m15.229s
$ pdfinfo *******-default.pdf
Producer:       GPL Ghostscript 9.05
CreationDate:   Sat Mar 15 02:07:56 2014
ModDate:        Sat Mar 15 02:07:56 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      10673382 bytes
Optimized:      no
PDF version:    1.4
$ time gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=*******-screen.pdf *******.pdf
warning: ignoring invalid option raw

   **** Warning: File has insufficient data for an image.
warning: ignoring invalid option raw
warning: ignoring invalid option raw
warning: ignoring invalid option raw


   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> itext-paulo-155 (itextpdf.sf.net-lowagie.com) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.


real    18m31.787s
user    8m52.941s
sys     0m26.226s
$ pdfinfo *******-screen.pdf
Producer:       GPL Ghostscript 9.05
CreationDate:   Sat Mar 15 02:44:24 2014
ModDate:        Sat Mar 15 02:44:24 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      24716043 bytes
Optimized:      no
PDF version:    1.4
$ time gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=*******-ebook.pdf *******.pdf
warning: ignoring invalid option raw

   **** Warning: File has insufficient data for an image.
warning: ignoring invalid option raw
warning: ignoring invalid option raw
warning: ignoring invalid option raw

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> itext-paulo-155 (itextpdf.sf.net-lowagie.com) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.


real    18m1.188s
user    8m31.068s
sys     0m34.438s
$ pdfinfo *******-ebook.pdf
Producer:       GPL Ghostscript 9.05
CreationDate:   Sat Mar 15 03:02:57 2014
ModDate:        Sat Mar 15 03:02:57 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      25966350 bytes
Optimized:      no
PDF version:    1.4
$ time gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -dNOPAUSE -dQUIET -dBATCH -sOutputFile=*******-printer.pdf *******.pdf
GPL Ghostscript 9.05: Set UseCIEColor for UseDeviceIndependentColor to work properly.
warning: ignoring invalid option raw

   **** Warning: File has insufficient data for an image.
warning: ignoring invalid option raw
warning: ignoring invalid option raw
warning: ignoring invalid option raw

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> itext-paulo-155 (itextpdf.sf.net-lowagie.com) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.


real    11m30.515s
user    5m22.464s
sys     0m29.254s
$ pdfinfo *******-printer.pdf
Producer:       GPL Ghostscript 9.05
CreationDate:   Sat Mar 15 03:21:05 2014
ModDate:        Sat Mar 15 03:21:05 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      12383032 bytes
Optimized:      no
PDF version:    1.4
$ time gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/prepress -dNOPAUSE -dQUIET -dBATCH -sOutputFile=*******-prepress.pdf *******.pdf
warning: ignoring invalid option raw

   **** Warning: File has insufficient data for an image.
warning: ignoring invalid option raw
warning: ignoring invalid option raw
warning: ignoring invalid option raw

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> itext-paulo-155 (itextpdf.sf.net-lowagie.com) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.


real    10m4.432s
user    5m24.456s
sys     0m30.022s
$ pdfinfo *******-prepress.pdf
Producer:       GPL Ghostscript 9.05
CreationDate:   Sat Mar 15 03:32:31 2014
ModDate:        Sat Mar 15 03:32:31 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      14157700 bytes
Optimized:      no
PDF version:    1.4
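
The five Ghostscript runs above differ only in the preset name, so they can be driven from a loop. This is a minimal sketch, assuming `gs` is on the PATH; the function names `out_name` and `compress_all` are placeholders and are not part of the original runs.

```shell
# Print the output file name for a given input PDF and preset,
# e.g. "input.pdf" + "screen" -> "input-screen.pdf".
out_name() {
  printf '%s-%s.pdf\n' "${1%.pdf}" "$2"
}

# Run pdfwrite once per preset with the same flags as the transcripts above.
compress_all() {
  in=$1
  for preset in default screen ebook printer prepress; do
    out=$(out_name "$in" "$preset")
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
       "-dPDFSETTINGS=/$preset" -dNOPAUSE -dQUIET -dBATCH \
       "-sOutputFile=$out" "$in" || echo "gs failed for /$preset" >&2
    # Report the resulting size in bytes next to the file name.
    if [ -f "$out" ]; then
      printf '%s %s\n' "$out" "$(wc -c < "$out" | tr -d ' ')"
    fi
  done
}
```

Calling `compress_all input.pdf` (with a hypothetical `input.pdf`) writes one output file per preset and prints each file name with its size, which makes it easy to see which preset wins for a particular document.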

The concrete contents of the switches that can be passed to Ghostscript's PDFSETTINGS option (/prepress, /ebook, /screen, /printer, /default) are shown below. In general, the output file size increases in the order /screen < /ebook < /printer < /prepress. To check what each switch actually sets, run the following.

$ gs -q -dPDFSETTINGS=/prepress -o /dev/null -sDEVICE=pdfwrite -c "currentpagedevice {exch ==only ( ) print == } forall" | sort | less

For example, the following shows which settings differ between /prepress and /screen.

$ diff <(gs -q -dPDFSETTINGS=/prepress -o /dev/null -sDEVICE=pdfwrite -c "currentpagedevice {exch ==only ( ) print == } forall" | sort) <(gs -q -dPDFSETTINGS=/screen -o /dev/null -sDEVICE=pdfwrite -c "currentpagedevice {exch ==only ( ) print == } forall" | sort)
9c9
< /.NeverEmbed []
---
> /.NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats]
20c20
< /AutoRotatePages /None
---
> /AutoRotatePages /PageByPage
33c33
< /CannotEmbedFontPolicy /Error
---
> /CannotEmbedFontPolicy /Warning
36c36
< /ColorConversionStrategy /LeaveColorUnchanged
---
> /ColorConversionStrategy /sRGB
41c41
< /ColorImageDownsampleType /Bicubic
---
> /ColorImageDownsampleType /Average
43c43
< /ColorImageResolution 300
---
> /ColorImageResolution 72
53c53
< /CreateJobTicket true
---
> /CreateJobTicket false
61c61
< /DoThumbnails true
---
> /DoThumbnails false
90c90
< /GrayImageDownsampleType /Bicubic
---
> /GrayImageDownsampleType /Average
92c92
< /GrayImageResolution 300
---
> /GrayImageResolution 72
127c127
< /MonoImageDownsampleType /Bicubic
---
> /MonoImageDownsampleType /Average
129c129
< /MonoImageResolution 1200
---
> /MonoImageResolution 300
131c131
< /NeverEmbed []
---
> /NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats]
163c163
< /PreBandThreshold false
---
> /PreBandThreshold true
166c166
< /PreserveEPSInfo true
---
> /PreserveEPSInfo false
168,169c168,169
< /PreserveOPIComments true
< /PreserveOverprintSettings true
---
> /PreserveOPIComments false
> /PreserveOverprintSettings false
193c193
< /UCRandBGInfo /Preserve
---
> /UCRandBGInfo /Remove
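
The same comparison works for any pair of presets. The sketch below wraps the command above in two helper functions (`preset_settings` and `preset_diff` are hypothetical names) and uses temporary files instead of process substitution, so that it also runs in a plain POSIX shell.

```shell
# Dump the pdfwrite device settings for one preset, sorted by key.
preset_settings() {
  gs -q "-dPDFSETTINGS=/$1" -o /dev/null -sDEVICE=pdfwrite \
     -c "currentpagedevice {exch ==only ( ) print == } forall" | sort
}

# Show the settings that differ between two presets.
preset_diff() {
  a=$(mktemp) || return 1
  b=$(mktemp) || return 1
  preset_settings "$1" > "$a"
  preset_settings "$2" > "$b"
  diff "$a" "$b"
  rc=$?
  rm -f "$a" "$b"
  return "$rc"
}
```

For example, `preset_diff ebook screen` compares the two lowest-resolution presets. As with plain `diff`, the exit status is 1 when the presets differ.
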
$ gs -dNODISPLAY -c ".distillersettings {exch ==only ( ) print ===} forall quit" | less
GPL Ghostscript 9.05 (2012-02-08)
Copyright (C) 2010 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
/prepress << /MonoImageResolution 1200 /ColorImageDownsampleType /Bicubic /PreserveEPSInfo true /ColorConversionStrategy /LeaveColorUnchanged /GrayImageDownsampleType /Bicubic /EmbedAllFonts true /CannotEmbedFontPolicy /Error /PreserveOPIComments true /GrayImageResolution 300 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /ColorImageResolution 300 /PreserveOverprintSettings true /CreateJobTicket true /AutoRotatePages /None /MonoImageDownsampleType /Bicubic /NeverEmbed [] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /CompatibilityLevel 1.4 /UCRandBGInfo /Preserve /DoThumbnails true >>
/PSL2Printer << /CompatibilityLevel 1.2 /TransferFunctionInfo /Preserve /MonoImageResolution 1200 /PreserveEPSInfo true /CompressFonts true /ColorImageDownsampleType /Bicubic /GrayImageDownsampleType /Bicubic /ColorConversionStrategy /LeaveColorUnchanged /EmbedAllFonts true /ColorACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /CannotEmbedFontPolicy /Error /PreserveOPIComments true /CompressPages true /GrayImageResolution 600 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /ColorImageResolution 600 /PreserveOverprintSettings true /AutoRotatePages /None /MonoImageDownsampleType /Bicubic /ASCII85EncodePages true /MaxViewerMemorySize 8000000 /NeverEmbed [] /PreserveHalftoneInfo true /UCRandBGInfo /Preserve /DoThumbnails false >>
/ebook << /MonoImageResolution 300 /ColorImageDownsampleType /Bicubic /PreserveEPSInfo false /ColorConversionStrategy /sRGB /GrayImageDownsampleType /Bicubic /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments false /GrayImageResolution 150 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /ColorImageResolution 150 /PreserveOverprintSettings false /CreateJobTicket false /AutoRotatePages /All /MonoImageDownsampleType /Bicubic /NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /CompatibilityLevel 1.4 /UCRandBGInfo /Remove /DoThumbnails false >>
/screen << /MonoImageResolution 300 /ColorImageDownsampleType /Average /PreserveEPSInfo false /ColorConversionStrategy /sRGB /GrayImageDownsampleType /Average /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments false /GrayImageResolution 72 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /ColorImageResolution 72 /PreserveOverprintSettings false /CreateJobTicket false /AutoRotatePages /PageByPage /MonoImageDownsampleType /Average /NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /CompatibilityLevel 1.3 /UCRandBGInfo /Remove /DoThumbnails false >>
/printer << /MonoImageResolution 1200 /ColorImageDownsampleType /Bicubic /PreserveEPSInfo true /ColorConversionStrategy /UseDeviceIndependentColor /GrayImageDownsampleType /Bicubic /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments true /GrayImageResolution 300 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.4 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /ColorImageResolution 300 /PreserveOverprintSettings true /CreateJobTicket true /AutoRotatePages /None /MonoImageDownsampleType /Bicubic /NeverEmbed [] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.4 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /CompatibilityLevel 1.4 /UCRandBGInfo /Preserve /DoThumbnails false >>
/default << /DoThumbnails false /PreserveEPSInfo true /ColorConversionStrategy /LeaveColorUnchanged /DownsampleMonoImages false /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments true /GrayACSImageDict << /VSamples [2 1 1 2] /QFactor 0.9 /Blend 1 /HSamples [2 1 1 2] >> /DownsampleColorImages false /PreserveOverprintSettings true /CreateJobTicket false /AutoRotatePages /PageByPage /NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats] /ColorACSImageDict << /VSamples [2 1 1 2] /QFactor 0.9 /Blend 1 /HSamples [2 1 1 2] >> /DownsampleGrayImages false /UCRandBGInfo /Preserve /Optimize false >>
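
The dump above is one long line per preset, which makes single values hard to spot. As a rough sketch (the `setting_of` helper is an assumption for illustration, and it only handles simple whitespace-separated key/value pairs, not keys inside the nested image dictionaries), one key can be pulled out for every preset like this:

```shell
# Read a .distillersettings dump on stdin and print "<preset> <value>"
# for the requested key. Presets that do not set the key print nothing.
setting_of() {
  awk -v key="/$1" '
    $2 == "<<" {
      for (i = 3; i < NF; i++)
        if ($i == key) { print $1, $(i + 1); break }
    }'
}
```

Piping the `.distillersettings` command above into `setting_of ColorImageResolution` reports, for the 9.05 dump shown here, 300 for /prepress, 600 for /PSL2Printer, 150 for /ebook, 72 for /screen, and 300 for /printer. Nothing is printed for /default, which leaves color image downsampling switched off.
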
$ time pdfopt *******.pdf *******-pdfopt.pdf
   **** Considering object with an invalid number 11997 as null.
   **** Considering object with an invalid number 11998 as null.
   **** Considering object with an invalid number 11997 as null.
   **** Considering object with an invalid number 11998 as null.

real    1m28.014s
user    0m14.029s
sys     0m10.089s
$ pdfinfo *******-pdfopt.pdf
Creator:        pdftk 2.01 - www.pdftk.com
Producer:       itext-paulo-155 (itextpdf.sf.net-lowagie.com)
CreationDate:   Fri Mar 14 16:44:32 2014
ModDate:        Fri Mar 14 16:44:32 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      149536440 bytes
Optimized:      no
PDF version:    1.7
$ time pdftk *******.pdf cat output *******-pdftk.pdf compress

real    1m0.180s
user    0m8.469s
sys     0m10.385s
$ pdfinfo *******-pdftk.pdf
Creator:        pdftk 1.44 - www.pdftk.com
Producer:       itext-paulo-155 (itextpdf.sf.net-lowagie.com)
CreationDate:   Sat Mar 15 02:33:44 2014
ModDate:        Sat Mar 15 02:33:44 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      147370039 bytes
Optimized:      no
PDF version:    1.4
$ time qpdf --linearize *******.pdf *******-qpdf.pdf

real    1m23.840s
user    0m33.274s
sys     0m6.624s
$ pdfinfo *******-qpdf.pdf
Creator:        pdftk 2.01 - www.pdftk.com
Producer:       itext-paulo-155 (itextpdf.sf.net-lowagie.com)
CreationDate:   Fri Mar 14 16:44:32 2014
ModDate:        Fri Mar 14 16:44:32 2014
Tagged:         no
Pages:          238
Encrypted:      no
Page size:      595.276 x 785.197 pts
File size:      148749860 bytes
Optimized:      yes
PDF version:    1.7

ChangeLog

  1. Posted: 2010-09-17T15:04:30+09:00
  2. Modified: 2010-09-17T15:04:30+09:00
  3. Generated: 2017-06-23T23:09:18+09:00