bugtex4ht - Bugs: bug #343, Package pdfpages

 
 

bug #343: Package pdfpages

Submitted by:  Michal Hoftich <michal_h21>
Submitted on:  Mon Dec 5 12:20:24 2016  
 
Category: None
Priority: 5 - Normal
Severity: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Closed


Fri Jul 17 15:52:47 2020, comment #6:

I think we got Pdfpages to work in the meantime.

Michal Hoftich <michal_h21>
Project Member
Sun Jan 22 18:55:03 2017, comment #5:

The bottom line seems to be that post-processing the png from gs can indeed reduce the file size, but I doubt it is worth the trouble of invoking another external program.

Perhaps Ghostscript itself has options to control its png output and achieve smaller sizes that way, but I didn't look.

Meanwhile, any pdfpages support would be better than none :).

Thanks,
Karl

Karl Berry <karl>
Project Administrator
Sun Jan 22 18:52:44 2017, comment #4:

Regarding pdf to png conversion, I finally took a few minutes to try to
get to the bottom of it. (Additional discussion on mailing list,
http://tug.org/pipermail/tex4ht/2016q4/001682.html)

I started with pdflatex small2e.tex. Resulting PDF is 60587 bytes.
I saw the same basic results you did: convert small2e.pdf convert.png
resulted in a smaller file than your rungs invocation:

-rw-rw-r-- 1 karl root 9262 Jan 22 10:26 convert.png
-rw-rw-r-- 1 karl root 19189 Jan 22 10:15 rungs.png

I wondered if the precise gs invocation would make a difference.
So I ran
strace -vfs 9999 convert small2e.pdf convert.png >&/tmp/str
where the options to strace make it display everything.
The (voluminous) output shows gs being invoked this way,
except with temporary filenames:

gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT \
-dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 \
-sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
-r72x72 -sOutputFile=gsmagick.png \
small2e.pdf

But, running this results in the same file size as rungs; I was surprised:
-rw-rw-r-- 1 karl root 19189 Jan 22 10:29 gsmagick.png
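
For anyone who wants to repeat the experiment from a script, here is a minimal sketch that rebuilds the gs command line shown above. The function names gs_png_command and pdf_to_png are made up for this illustration, and the sketch assumes the Ghostscript executable is called "gs" and is on PATH; it is not how tex4ht or convert actually drive gs.

```python
import shutil
import subprocess


def gs_png_command(pdf, png, resolution=72, gs="gs"):
    """Return an argument list mirroring the gs call seen in the strace output."""
    return [
        gs, "-q", "-dQUIET", "-dSAFER", "-dBATCH", "-dNOPAUSE", "-dNOPROMPT",
        "-dMaxBitmap=500000000", "-dAlignToPixels=0", "-dGridFitTT=2",
        "-sDEVICE=pngalpha", "-dTextAlphaBits=4", "-dGraphicsAlphaBits=4",
        f"-r{resolution}x{resolution}", f"-sOutputFile={png}", pdf,
    ]


def pdf_to_png(pdf, png, resolution=72):
    """Run gs if it is found on PATH; return False when it is missing."""
    gs = shutil.which("gs")  # assumption: the binary is named "gs"
    if gs is None:
        return False
    subprocess.run(gs_png_command(pdf, png, resolution, gs), check=True)
    return True
```

Calling pdf_to_png("small2e.pdf", "gsmagick.png") should then be equivalent to the command above, apart from the file names.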

Ok, so then I ran
convert -debug all small2e.pdf convert.png >&/tmp/deb
to get a sense of what convert thought it was doing.

And indeed, I see it running gs as we expected, but then doing
postprocessing on the png file:
Searching for module "PNG" using filename "png.la"
...
Enter ReadPNGImage()
...

Ok, so I am led to believe that convert is smarter than gs about how to
use png compression features (or whatever), and this seems plausible.

Finally, running it through netpbm results in an even smaller file:
pngtopnm convert.png | pnmtopng >pngto.png; ls -l pngto.png
-rw-rw-r-- 1 karl root 4185 Jan 22 10:34 pngto.png

Meanwhile, identify shows that the netpbm output is "PseudoClass" (uses color
table) rather than "DirectClass" (separate color per pixel):

$ identify pngto.png convert.png
pngto.png PNG 612x792 612x792+0+0 8-bit PseudoClass 2c 4.18KB 0.000u 0:00.000
convert.png[1] PNG 612x792 612x792+0+0 8-bit DirectClass 9.26KB 0.000u 0:00.000
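
To illustrate why a color-table PNG can come out smaller than a direct-color one, the sketch below writes the same two-color image both ways using only the Python standard library and compares the sizes. The pixel pattern, the image size and the 1-bit palette depth are assumptions chosen for the example; this is not what netpbm or ImageMagick do internally.

```python
import struct
import zlib

W, H = 160, 120  # small hypothetical "page"; width is a multiple of 8


def is_ink(x, y):
    """Made-up pattern standing in for black text on a white page."""
    in_text_line = 10 <= x < 150 and 10 <= y < 110 and y % 12 < 7
    return in_text_line and (x * x + 3 * y) % 7 < 3


def chunk(kind, data):
    """One PNG chunk: length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def png(bit_depth, color_type, rows, palette=None):
    """Assemble a non-interlaced PNG; every scanline uses filter type 0."""
    ihdr = struct.pack(">IIBBBBB", W, H, bit_depth, color_type, 0, 0, 0)
    body = b"".join(b"\x00" + row for row in rows)
    out = b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", ihdr)
    if palette is not None:
        out += chunk(b"PLTE", palette)
    return out + chunk(b"IDAT", zlib.compress(body, 9)) + chunk(b"IEND", b"")


# Color type 2 (direct color): three bytes (R, G, B) per pixel.
rgb_rows = [
    b"".join(b"\x00\x00\x00" if is_ink(x, y) else b"\xff\xff\xff" for x in range(W))
    for y in range(H)
]
# Color type 3 (color table): one palette index per pixel, 8 pixels per byte.
index_rows = [
    bytes(sum(is_ink(x0 + i, y) << (7 - i) for i in range(8)) for x0 in range(0, W, 8))
    for y in range(H)
]

truecolor = png(8, 2, rgb_rows)
# Palette entry 0 is white, entry 1 is black.
indexed = png(1, 3, index_rows, palette=b"\xff\xff\xff\x00\x00\x00")
print(len(truecolor), len(indexed))
```

The byte at offset 25 of a PNG file is the IHDR color type (2 for direct RGB, 3 for a color table), so the same distinction can be checked on gs or convert output without identify.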

Some discussion at http://www.imagemagick.org/discourse-server/viewtopic.php?t=16706.

And no doubt with additional options one could get imagemagick to do
that too, or netpbm not to, or whatever, but it doesn't matter :).

Karl Berry <karl>
Project Administrator
Wed Dec 14 15:34:03 2016, comment #3:

Thanks Karl.

pdfpages support still isn't ready; I should put it together before I forget it.

Michal Hoftich <michal_h21>
Project Member
Wed Dec 14 00:16:59 2016, comment #2:

I committed the reordered tex4ht.env to TL, r42704.
(In Master/texmf-dist/tex4ht/base/unix)

You have some pdfpages support to commit, Michal?

Thanks ...

Karl Berry <karl>
Project Administrator
Sat Dec 10 00:42:48 2016, comment #1:

if i'm understanding correctly, it would be fine (good) to commit the new tex4ht.env[-unix], independent of the actual pdfpages stuff you've done?

for the record, regarding tex4ht.env-win32: it bears no resemblance to the texmf-dist/tex4ht/base/win32/tex4ht.env that is used in TeX Live. Many (10-15?) years ago, the TL version was hacked (by Staszek W of GUST, as I recall) to be somewhat more portable, and use Unix-style paths. There has been no effort to get back in sync.

Karl Berry <karl>
Project Administrator
Mon Dec 5 12:20:24 2016, original submission:

Package pdfpages is not supported by tex4ht. This is not really a surprise, as we operate in DVI mode, but I was able to add some basic support; see my answer on TeX.sx [1].

It really is just basic support: it handles only the `\includepdf[pages={1,2,3}]{filename.pdf}` form. The more advanced forms, which include several pdf files, impose them, etc., are not supported. I also added support for the `page` option of `\includegraphics`, so it is also possible to use `\includegraphics[page=number]{filename.pdf}`.
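
As a rough illustration of what the simple `pages={1,2,3}` form implies for the conversion step, the sketch below turns such a list into one ImageMagick command per page. It relies on ImageMagick's `file.pdf[N]` syntax, which counts pages from 0. The helper names and the output file naming are hypothetical, only plain comma-separated page numbers are handled, and this is not the code from the TeX.sx answer.

```python
def parse_pages(option):
    """Turn the value of pages={...}, e.g. "{1,2,3}" or "1,2,3", into [1, 2, 3]."""
    return [int(p) for p in option.strip("{} ").split(",") if p.strip()]


def convert_commands(pdf, pages, stem="page"):
    """Build one convert call per requested page (hypothetical naming scheme)."""
    # ImageMagick numbers PDF pages from 0, pdfpages from 1.
    return [
        ["convert", f"{pdf}[{n - 1}]", f"{stem}-{n}.png"]
        for n in parse_pages(pages)
    ]


for cmd in convert_commands("filename.pdf", "{1,2,3}"):
    print(" ".join(cmd))
```

Ranges such as `2-5` or the `-` shorthand would need extra parsing, which is left out here.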

Before I add this to the sources, what is the best and most portable way of converting pdf to bitmap formats? In my solution, Imagemagick is used, but I guess that it is not bundled with TL on Windows, is it?

[1] http://tex.stackexchange.com/a/342380/2891
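
On the portability question, one possible approach is to probe for whichever converter is installed before choosing a command. The sketch below is only an assumption about how that could look: the candidate names and their order are guesses ("rungs" is the wrapper mentioned in this thread, "gswin64c" and "gswin32c" are the usual console Ghostscript names on Windows), and find_converter is a made-up helper.

```python
import shutil

# Candidate executables in order of preference (an assumption, not a tex4ht rule).
# Note: on Windows, "convert" may resolve to an unrelated system utility,
# so it is tried last.
CANDIDATES = ("rungs", "gs", "gswin64c", "gswin32c", "magick", "convert")


def find_converter(which=shutil.which, candidates=CANDIDATES):
    """Return (name, path) of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return name, path
    return None


print(find_converter())
```

The `which` parameter exists only so the lookup can be exercised without any of the tools installed.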

Michal Hoftich <michal_h21>
Project Member

 

No files currently attached

 

Depends on the following items: None found

Items that depend on this one: None found

 

Carbon-Copy List
  • -unavailable- added by karl (Posted a comment)
  • -unavailable- added by michal_h21 (Submitted the item)

1 latest change follows.

Date: Fri Jul 17 15:52:47 2020
Changed by: michal_h21
Updated field: Open/Closed
Previous value => Replaced by: Open => Closed