bugtex4ht - Bugs: bug #343, Package pdfpages

bug #343: Package pdfpages

Submitted by:  Michal Hoftich <michal_h21>
Submitted on:  Mon 05 Dec 2016 02:20:24 PM EET  
Category: None
Priority: 5 - Normal
Severity: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open

Sun 22 Jan 2017 08:55:03 PM EET, comment #5:

The bottom line seems to be that post-processing the png from gs can indeed reduce the file size, but I doubt it is worth the trouble of invoking another external program.

Perhaps Ghostscript itself has options to control its png output and achieve smaller sizes that way, but I didn't look.
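
As a starting point for that unexplored avenue: Ghostscript does ship PNG devices other than pngalpha, among them png16m, pnggray, png256, png16 and pngmono. The sketch below renders the same PDF with several of them so the sizes can be compared. It has not been measured here, so no claim is made about which device wins; `gs_png` and `GS_RUNNER` are hypothetical names introduced for illustration, and small2e.pdf is assumed to be a one-page file in the current directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of tex4ht: render a PDF to PNG with a
# chosen Ghostscript output device.
#   $1 = device (e.g. pngalpha, png16m, pnggray, png256)
#   $2 = input PDF, $3 = output PNG
# Setting GS_RUNNER=echo prints the gs arguments instead of running gs.
gs_png() {
    ${GS_RUNNER:-gs} -q -dSAFER -dBATCH -dNOPAUSE \
        -sDEVICE="$1" -r72x72 -sOutputFile="$3" "$2"
}

# Compare output sizes per device; skipped if gs or the input is missing.
# Assumes a one-page PDF: without %d in the output name, later pages
# would overwrite earlier ones.
if command -v gs >/dev/null 2>&1 && [ -f small2e.pdf ]; then
    for dev in pngalpha png16m pnggray png256; do
        gs_png "$dev" small2e.pdf "gs-$dev.png" && ls -l "gs-$dev.png"
    done
fi
```

The devices are not interchangeable: pnggray drops color and only pngalpha keeps transparency, so a smaller file may come at the cost of a different image.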

Meanwhile, any pdfpages support would be better than none :).


Karl Berry <karl>
Project Administrator
Sun 22 Jan 2017 08:52:44 PM EET, comment #4:

Regarding pdf to png conversion, I finally took a few minutes to try to
get to the bottom of it. (Additional discussion on mailing list,

I started with pdflatex small2e.tex. Resulting PDF is 60587 bytes.
I saw the same basic results you did: convert small2e.pdf convert.png
resulted in a smaller file than your rungs invocation:

-rw-rw-r-- 1 karl root 9262 Jan 22 10:26 convert.png
-rw-rw-r-- 1 karl root 19189 Jan 22 10:15 rungs.png

I wondered if the precise gs invocation would make a difference.
So I ran
strace -vfs 9999 convert small2e.pdf convert.png >&/tmp/str
where the options to strace make it display everything.
The (voluminous) output shows gs being invoked this way,
except with temporary filenames:

-dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 \
-sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
-r72x72 -sOutputFile=gsmagick.png \

But, running this results in the same file size as rungs; I was surprised:
-rw-rw-r-- 1 karl root 19189 Jan 22 10:29 gsmagick.png

Ok, so then I ran
convert -debug all small2e.pdf convert.png >&/tmp/deb
to get a sense of what convert thought it was doing.

And indeed, I see it running gs as we expected, but then doing
postprocessing on the png file:
Searching for module "PNG" using filename "png.la"
Enter ReadPNGImage()

Ok, so I am led to believe that convert is smarter than gs about how to
use png compression features (or whatever), and this seems plausible.

Finally, running it through netpbm results in an even smaller file:
pngtopnm convert.png | pnmtopng >pngto.png; ls -l pngto.png
-rw-rw-r-- 1 karl root 4185 Jan 22 10:34 pngto.png

Meanwhile, identify shows that the netpbm output is "PseudoClass" (uses a color
table) rather than "DirectClass" (separate color per pixel):

$ identify pngto.png convert.png
pngto.png PNG 612x792 612x792+0+0 8-bit PseudoClass 2c 4.18KB 0.000u 0:00.000
convert.png[1] PNG 612x792 612x792+0+0 8-bit DirectClass 9.26KB 0.000u 0:00.000

Some discussion at http://www.imagemagick.org/discourse-server/viewtopic.php?t=16706.

And no doubt with additional options one could get imagemagick to do
that too, or netpbm not to, or whatever, but it doesn't matter :).
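
If the extra pass were ever wanted, the netpbm round trip above could be wrapped so that it is optional and never makes things worse. This is only a sketch: `shrink_png` is a hypothetical name, and it assumes that losing transparency is acceptable, since pngtopnm without further options does not carry the alpha channel through.

```shell
#!/bin/sh
# Hypothetical helper: recompress a PNG through netpbm and keep the
# result only if it is strictly smaller than the original.
# Does nothing if pngtopnm/pnmtopng are missing or the conversion fails.
shrink_png() {
    src=$1
    tmp=$src.tmp.$$
    # $(wc -c) is left unquoted on purpose so that any leading blanks
    # some wc implementations print are stripped by word splitting.
    if command -v pngtopnm >/dev/null 2>&1 \
        && command -v pnmtopng >/dev/null 2>&1 \
        && pngtopnm "$src" 2>/dev/null | pnmtopng >"$tmp" 2>/dev/null \
        && [ -s "$tmp" ] \
        && [ $(wc -c <"$tmp") -lt $(wc -c <"$src") ]; then
        mv "$tmp" "$src"
    else
        rm -f "$tmp"
    fi
}

# Example: shrink the gs output from above, if it exists.
if [ -f rungs.png ]; then
    shrink_png rungs.png && ls -l rungs.png
fi
```

Because the original file is replaced only after a successful and smaller conversion, a system without netpbm simply keeps the Ghostscript output unchanged.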

Karl Berry <karl>
Project Administrator
Wed 14 Dec 2016 05:34:03 PM EET, comment #3:

Thanks Karl.

The pdfpages support still isn't ready; I should put it together before I forget it.

Michal Hoftich <michal_h21>
Project Member
Wed 14 Dec 2016 02:16:59 AM EET, comment #2:

I committed the reordered tex4ht.env to TL, r42704.
(In Master/texmf-dist/tex4ht/base/unix)

You have some pdfpages support to commit, Michal?

Thanks ...

Karl Berry <karl>
Project Administrator
Sat 10 Dec 2016 02:42:48 AM EET, comment #1:

If I'm understanding correctly, it would be fine (good) to commit the new tex4ht.env[-unix], independent of the actual pdfpages stuff you've done?

For the record, regarding tex4ht.env-win32: it bears no resemblance to the texmf-dist/tex4ht/base/win32/tex4ht.env that is used in TeX Live. Many (10-15?) years ago, the TL version was hacked (by Staszek W of GUST, as I recall) to be somewhat more portable, and use Unix-style paths. There has been no effort to get back in sync.

Karl Berry <karl>
Project Administrator
Mon 05 Dec 2016 02:20:24 PM EET, original submission:

Package pdfpages is not supported by tex4ht. This is not really a surprise, as we operate in DVI mode, but I was able to put together some basic support; see my answer on TeX.sx [1].

The support really is basic: it handles just the `\includepdf[pages={1,2,3}]{filename.pdf}` form. The more advanced forms, which include several PDF files, impose them, etc., are not supported. I also added support for the `page` option of `\includegraphics`, so it is also possible to use `\includegraphics[page=number]{filename.pdf}`.

Before I add this to the sources, what is the best and most portable way of converting PDF to bitmap formats? My solution uses ImageMagick, but I guess that it is not bundled with TL on Windows, is it?
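
For reference, one candidate that avoids ImageMagick is to call Ghostscript directly and pick the page with its `-dFirstPage` and `-dLastPage` options. The following is a sketch under assumptions, not what the TeX.sx answer does: `pdf_page_to_png` and `GS_RUNNER` are hypothetical names, and the png16m device and 72 dpi resolution are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper: render one page of a PDF to a PNG with gs.
#   $1 = page number (1-based), $2 = input PDF, $3 = output PNG
# Setting GS_RUNNER=echo prints the gs arguments instead of running gs.
pdf_page_to_png() {
    ${GS_RUNNER:-gs} -q -dSAFER -dBATCH -dNOPAUSE \
        -dFirstPage="$1" -dLastPage="$1" \
        -sDEVICE=png16m -r72x72 -sOutputFile="$3" "$2"
}

# Dry run for the pages of \includepdf[pages={1,2,3}]{filename.pdf}:
# prints the arguments gs would receive for each page.
for p in 1 2 3; do
    GS_RUNNER=echo pdf_page_to_png "$p" filename.pdf "filename-$p.png"
done
```

Dropping `GS_RUNNER=echo` runs gs for real. Whether a gs executable can be relied on in every TeX Live installation is exactly the open question here, so the command name may need to come from tex4ht.env rather than being hard-coded.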

[1] http://tex.stackexchange.com/a/342380/2891

Michal Hoftich <michal_h21>
Project Member


No files currently attached


Depends on the following items: None found

Items that depend on this one: None found

