bugtex4ht - Bugs: bug #434, HTML mode is spewing some XML code

bug #434: HTML mode is spewing some XML code

Submitted by:  Hilmar Preusse <hpreusse>
Submitted on:  Tue 30 Jul 2019 07:06:28 PM EEST  
 
Category: None
Priority: 5 - Normal
Severity: 3 - Minor
Status: Wont Fix
Privacy: Public
Assigned to: None
Open/Closed: Closed

Mon 16 Dec 2019 09:22:05 PM EET, comment #4:

Hi Hilmar - well, I agree it is a bug, since html4 is being requested. Unfortunately <br /> and <hr /> are ubiquitous in tex4ht-html4.tex; I guess Eitan didn't know about the problem.

It does not seem straightforward at all to factor them all out so they are conditional on html4. And I believe all that code is used for other html variants, so just removing the slashes would mean xhtml output etc. would no longer validate, so that's not viable.

Since html5 is pretty widespread by now, it's hard for me and Michal to work up a lot of enthusiasm for tinkering with the html4 output. I know that's sad, but there it is. Of course reasonable patches would be welcome (good luck :).

Karl Berry <karl>
Project Administrator
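
For anyone who needs HTML 4 output in the meantime, one possible workaround is to post-process the generated files rather than patch tex4ht-html4.tex. The following is only a minimal sketch and not part of tex4ht: the function name strip_xml_slashes is hypothetical, the element list is the set of EMPTY elements from HTML 4.01, and the regex-based approach assumes no attribute value contains a literal ">" and does not treat comments or script content specially.

```python
import re

# Elements declared EMPTY in HTML 4.01; only these may lose the XML-style slash.
HTML4_VOID = (
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
)

# Matches e.g. <br />, <br/>, <hr class="endfloat" /> (case-insensitive).
_SELF_CLOSING = re.compile(
    r"<(" + "|".join(HTML4_VOID) + r")\b([^<>]*?)\s*/>",
    re.IGNORECASE,
)


def strip_xml_slashes(html: str) -> str:
    """Rewrite XML-style empty tags such as <br /> as HTML 4 style <br>.

    Hypothetical helper for illustration; it is applied to the finished
    HTML file, so XHTML and HTML5 output from tex4ht is left untouched.
    """
    return _SELF_CLOSING.sub(r"<\1\2>", html)


if __name__ == "__main__":
    sample = 'Hello,<br class="newline" />world<hr class="endfloat" />'
    print(strip_xml_slashes(sample))
    # prints: Hello,<br class="newline">world<hr class="endfloat">
```

Because the filter runs only on files that were requested as html4, it sidesteps the problem that the same tex4ht code is shared with the other HTML variants.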
Fri 23 Aug 2019 04:39:07 PM EEST, comment #3:

Ah, that is unfortunate. But is it really worth fixing HTML 4 stuff in 2019? I think we should aim at least at XHTML.

Michal Hoftich <michal_h21>
Project Member
Mon 19 Aug 2019 04:07:51 AM EEST, comment #2:

I'm not sure about this, Michal. If it's only claiming to be "traditional" HTML, not XHTML or HTML5, those XML-ish statements actually break browsers and validation. I've tried to add <hr/> and the like to plain HTML before, and it didn't work out well.

Karl Berry <karl>
Project Administrator
Tue 13 Aug 2019 03:25:38 PM EEST, comment #1:

I don't think this is a bug. tex4ht now produces HTML5 in XML serialization by default; the other supported HTML doctype is XHTML. As post-processing that involves XML tools may be used by tex4ht, it is really not a good idea to produce HTML files that are not well-formed XML at the same time.

Michal Hoftich <michal_h21>
Project Member
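
The well-formedness concern in comment #1 can be illustrated with a generic XML parser. This is a minimal sketch using Python's standard library, not any tool that tex4ht itself runs; the helper name is_well_formed and the wrapping root element are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a root element."""
    try:
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    xml_style = 'Hello,<br class="newline" />world'    # what tex4ht emits
    html4_style = 'Hello,<br class="newline">world'    # what HTML 4 expects

    print(is_well_formed(xml_style))    # prints: True
    print(is_well_formed(html4_style))  # prints: False
```

The HTML 4 form fails because an XML parser treats <br> as an opening tag that is never closed, which is why output meant for XML-based post-processing keeps the trailing slash.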
Tue 30 Jul 2019 07:06:28 PM EEST, original submission:

https://bugs.debian.org/536380
Hope the bug is still valid.

Command used:
mk4ht htlatex manual-web.tex "html,uni-html4,frames,css2,charset=utf-8,info" " -cunihtf -utf8"

Despite not having any XHTML or XML options enabled, it is outputting various bits of XML:

<br />
<hr class="endfloat" />
<br class="newline" />

Minimal example for the <br /> part:

\documentclass{article}
\begin{document}

Hello,\\
world
\end{document}
-----
Thanks!

Hilmar Preusse <hpreusse>

2 latest changes follow.

Date                             Changed By  Updated Field  Previous Value => Replaced By
Mon 16 Dec 2019 09:22:05 PM EET  karl        Status         None => Wont Fix
                                             Open/Closed    Open => Closed