bugtex4ht - Bugs: bug #433, Ligature fi becoming ø in avant...

 
 

bug #433: Ligature fi becoming ø in avant garde

Submitted by:  Hilmar Preusse <hpreusse>
Submitted on:  Sun Jul 28 19:02:04 2019  
 
Category: None
Priority: 5 - Normal
Severity: 3 - Minor
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Closed

Mon Aug 19 01:04:46 2019, comment #2:

I committed the new versions of the adobe htf files to TeX Live (r51905).

I also realized that a number of htf files were being generated in tex4ht dev but weren't in TL yet, so I committed those too (r51906).

Finally, I noticed that three files in TL were no longer being generated (as far as I could see), so I removed them:
unicode/ec/eccc.htf
unicode/pxfonts/pxbsyc.htf
unicode/txfonts/t1x.htf
(in texmf-dist/tex4ht/ht-fonts). Guess we'll see if that causes trouble.

More discrepancies remain between dev and TL, but we'll resolve those another time. Closing this ...

Karl Berry <karl>
Project Administrator
Fri Aug 16 14:22:19 2019, comment #1:

Thanks for the report. The 8-bit fonts can sometimes be a bit of a mess. Even if a standard encoding is used, some characters, especially ligatures, can be non-standard, which results in such errors. tex4ht uses special files with conversion tables between font characters and Unicode. These tables were originally written by hand, so they may contain errors.
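
To illustrate how a single wrong table entry can produce exactly this symptom, here is a minimal Python sketch. The dictionaries are hypothetical stand-ins for a conversion table, not the real .htf file format, and the names `T1_LIGATURES`, `OT1_SLOTS` and `decode` are made up for the example. It relies on one known fact about the encodings: slot 0x1C holds the fi ligature in T1 but holds ø in OT1. Whether an OT1-style entry was the actual cause in the avant table is an assumption; the report does not say.

```python
# Illustrative slot-to-Unicode tables. These are hypothetical Python dicts,
# not the real .htf file format used by tex4ht.
T1_LIGATURES = {0x1B: "\uFB00", 0x1C: "\uFB01", 0x1D: "\uFB02"}  # ff, fi, fl in T1
OT1_SLOTS = {0x1B: "\u0153", 0x1C: "\u00F8", 0x1D: "\u00C6"}     # oe, o-stroke, AE in OT1


def decode(slots, table):
    """Map font slots to Unicode; slots missing from the table are read as ASCII."""
    return "".join(table.get(slot, chr(slot)) for slot in slots)


# "first" as typeset in a T1 font: the fi ligature occupies slot 0x1C.
word = [0x1C, ord("r"), ord("s"), ord("t")]

correct = decode(word, T1_LIGATURES)  # table matches the font encoding
wrong = decode(word, OT1_SLOTS)       # assumed mismatch: OT1 entries for a T1 font
print(correct, wrong)
```

With the matching table the word decodes to "ﬁrst" (starting with U+FB01); with the mismatched one it decodes to "ørst", as in the report.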

I've fixed the files using Htfgen [1], a tool that can generate conversion tables automatically. I've fixed a lot of bugs and added new features to this tool, so the updates for text fonts should be mostly automatic in the future. Math fonts are a different story, as they often contain unique glyphs that are not even in Unicode.

[1] https://github.com/michal-h21/htfgen

Michal Hoftich <michal_h21>
Project Member
Sun Jul 28 19:02:04 2019, original submission:

The following input:

\documentclass{article}
\usepackage{avant}
\usepackage[T1]{fontenc}
\begin{document}
first
\textsf{first}
\end{document}

with the following command:

mk4ht htlatex a.tex "html,uni-html4,css2,charset=utf-8,info" " -cunihtf -utf8"

gets the following HTML output:

first
ørst

Notice how the ﬁ (fi ligature; U+FB01) becomes ø (small o with stroke; U+00F8).

Please also note that the problem disappears with some other fonts but not all. The original reporter listed the font families below.

These fonts work:
- computer modern (default)
- lmodern
- charter
- palatino
- times (both serif and sans-serif)
- utopia

These do not:
- avant
- bookman (both serif and sans-serif broken)
- chancery
- newcent (both serif and sans-serif)

Hilmar Preusse <hpreusse>

 


1 latest change follows.

Date: Mon Aug 19 01:08:09 2019
Changed by: karl
Updated field: Open/Closed
Previous value => Replaced by: Open => Closed