bugtex4ht - Bugs: bug #340, Math issues in the ODF export

 
 

bug #340: Math issues in the ODF export

Submitted by:  Michal Hoftich <michal_h21>
Submitted on:  Tue Nov 22 15:39:54 2016  
 
Category: None
Priority: 5 - Normal
Severity: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Closed


Thu Oct 5 15:25:59 2023, comment #10:

It seems that most of these issues were resolved in the meantime. The only remaining issue was incorrect support for some relation operators:

2. Instead of "<" characters at start of array columns, upside-down "?" are displayed.

I think it is LibreOffice's bug, but as a workaround, I've found that inserting an empty <mrow></mrow> before these operators works. The fix is included in the development version of make4ht.
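
A minimal sketch of what such a workaround could look like as an XML post-processing step. It is written in Python rather than the Lua used by make4ht, and it is not the code from the make4ht development version. The `RELATIONS` set and the function name `add_empty_mrow` are assumptions made for illustration; the report only names "<" as an affected operator.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# Keep the "math:" prefix used in ODF formula files when serializing.
ET.register_namespace("math", MATHML_NS)

# Assumption: operators that need the workaround (the report only names "<").
RELATIONS = {"<"}


def q(name):
    """Return the namespace-qualified ElementTree tag for a MathML element."""
    return f"{{{MATHML_NS}}}{name}"


def add_empty_mrow(root):
    """Insert an empty mrow before each matching mo element; return the count."""
    inserted = 0
    # Iterate over snapshots so that inserting elements does not disturb the loops.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == q("mo") and (child.text or "").strip() in RELATIONS:
                index = list(parent).index(child)
                parent.insert(index, ET.Element(q("mrow")))
                inserted += 1
    return inserted


# The minimal example from the original submission, without the xlink declaration.
SAMPLE = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    "<math:mo>&lt;</math:mo><math:mi> c</math:mi></math:math>"
)

root = ET.fromstring(SAMPLE)
count = add_empty_mrow(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

ElementTree writes the empty element in its self-closing form, which is equivalent to <mrow></mrow> in XML.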

Michal Hoftich <michal_h21>
Project Member
Sat Nov 26 12:40:11 2016, comment #9:

Thanks Karl, I thought that there was no XSLT processor included.

I think that xtpipes works; for example, cross-references don't work without it. As I understand it, it combines a SAX processor with XSLT or calls to Java.

I guess that everything xtpipes does can be done using Lua and the new Domobject, which will be used in the next make4ht version. Maybe the Domobject will also be usable for converting mathml to StarMath or TeX annotations.

Michal Hoftich <michal_h21>
Project Member
Thu Nov 24 23:04:12 2016, comment #8:

there is no xslt processor in TL, and not likely to be -- those are huge programs with monumental prerequisites. xquery, xpath, etc. it's a completely different and foreign world from tex.

however, if xslt would be useful, don't let that stop you. people who want to can install it through their system, if they don't already have it. that's the only feasible way to do so, seems to me.

as far as i know, xtpipes is not functional and never has been for anyone except eitan. i've never looked into what it does, let alone what would be needed to get it working. lots of java, for starters.

Karl Berry <karl>
Project Administrator
Thu Nov 24 19:52:33 2016, comment #7:

Thanks Karl.

Yes, maybe it is not a bad idea to contact the LO people before I try to write a mathml to StarMath converter in Lua.

I have one question. Is any xslt processor included in TeX Live? I know that our xtpipes includes an xslt processor, but does anybody know how it works?

I am interested in mathml to TeX conversion and found one potential solution [1], which is based on xslt. I would like to see whether it could be used to add TeX as an annotation for mathml.

[1] https://github.com/transpect/mml2tex

Michal Hoftich <michal_h21>
Project Member
Thu Nov 24 00:01:37 2016, comment #6:

committed the new mathml.4ht and ooffice-mml.4ht to TL.

regarding star(t)math, i suppose only the LO people can help, if they invented it. maybe they'd be willing to work with you on improving the overall situation wrt tex4ht/LO/mathml ...

Karl Berry <karl>
Project Administrator
Wed Nov 23 14:52:28 2016, comment #5:

I've just found that LO can read mathml files without the prefix; I must have made some mistake previously when it didn't work. The issue with prefix-less mathml is that processing with xtpipes doesn't work.

I've also found an interesting thread [1], where odt export from tex4ht is discussed. It seems that, at least 7 years ago, fixing bugs in mathml import wasn't important for the OO devs (I totally understand that they didn't have enough developers, let alone developers who understand mathml). It also seems that the annotation in StarMath format is required for a valid odf file. Which again leads to a question: is there any usable mathml to StarMath converter?

[1] https://bz.apache.org/ooo/show_bug.cgi?id=69088

Michal Hoftich <michal_h21>
Project Member
Wed Nov 23 09:00:20 2016, comment #4:

Yes, you can push it to TL, it fixes the issue with <mfenced>. We should leave this issue open, as we should add prefixes to all mathml attributes and address LO's mathml handling.

Michal Hoftich <michal_h21>
Project Member
Wed Nov 23 08:57:07 2016, comment #3:

Thanks Karl. The html+mathml produced from the sample document is valid. I can't find a functional ODF validator; the online one [1] gives me "Internal server error". I've tried to validate the mathml files included in the ODF with the HTML validator. It failed because of the math: prefix; when I removed it, they passed as valid. LibreOffice doesn't open the mathml files without the namespace, so this shouldn't be an issue. So I guess this really is a bug in LibreOffice.
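
A rough sketch of how the prefix could be removed before passing a formula file to a validator that expects unprefixed MathML. The function name `strip_math_prefix` is hypothetical, and the regex approach is an assumption chosen for brevity: it only handles element names and the namespace declaration, not prefixed attributes.

```python
import re


def strip_math_prefix(xml_text, prefix="math"):
    """Remove a namespace prefix from element names and from its xmlns declaration."""
    # <math:mo> and </math:mo> become <mo> and </mo>.
    text = re.sub(rf"(</?){re.escape(prefix)}:", r"\1", xml_text)
    # xmlns:math="..." becomes the default namespace declaration xmlns="...".
    return text.replace(f"xmlns:{prefix}=", "xmlns=")


SAMPLE = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    "<math:mo>&lt;</math:mo><math:mi> c</math:mi></math:math>"
)

result = strip_math_prefix(SAMPLE)
print(result)
```

The elements stay in the MathML namespace, because the prefixed declaration is turned into a default namespace declaration rather than dropped.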

I've also figured out the substance of another issue, which I found earlier but didn't understand. Word can open ODT files, but it cannot display the math from tex4ht by default. But when you open the ODT file in LibreOffice and save it, then it can be read. It seems that LO converts the mathml to its own format called StarMath, which can then be edited in LO's equation editor. Word seems to understand only StarMath, so it can read the math in ODT files only after it is added by LO.

I can't find much information about StarMath. There is an element reference [2], and it seems that it is based on Troff's Eqn format [3]. I can't find a mathml to StarMath or Eqn converter, so we probably need to rely on LO's mathml support, or write a custom mathml to StarMath converter.

BTW, I've created a simple DOM library for LuaXML; the next version of make4ht will provide Lua filters based on it, in addition to regular expression filters. It can do some really funny stuff.

[1] https://odf-validator.rhcloud.com/
[2] https://wiki.documentfoundation.org/images/2/26/MG44-MathGuide.pdf
[3] http://manpages.ubuntu.com/manpages/precise/man1/eqn.1.html

Michal Hoftich <michal_h21>
Project Member
Wed Nov 23 00:48:57 2016, comment #2:

p.s. regarding your commit (r200), is this something i should push into TL now? thanks for all ...

Karl Berry <karl>
Project Administrator
Wed Nov 23 00:48:08 2016, comment #1:

Hi Michal - regarding whether the mathml fragment is ok, I can only suggest passing it (a whole doc using it) through the W3C (or other) validator and see. It's hard to imagine what could be wrong with it, but who knows. (And whether reporting bugs to libreoffice is worthwhile, I also don't know.)

Karl Berry <karl>
Project Administrator
Tue Nov 22 15:39:54 2016, original submission:

When I tried to compile the code from a question on TeX.sx [1], I found several issues in ODT export:

1. The braces from \left are small, they don't cover the three lines in the multi-line equation.

2. Instead of "<" characters at start of array columns, upside-down "?" are displayed.

Ad 1: The definitions of \Configure{left} and \Configure{right} are redefined in ooffice-mml.4ht, which is generated from tex4ht-ooffice.tex. There is a comment in the sources:

> OO doesn't seem to hono mfenced


I've tried to delete the configurations for `left` and `right` from ooffice-mml.4ht, so that the default mathml configuration was used. This resulted in brackets of the correct size but the wrong form: "(" was used instead of "{", and the right bracket shouldn't be displayed at all.

I took a look at the generated mathml code. For each math, one file named "filename-m{count}/content.xml" is created. The automatically sized brackets are contained in an `<mfenced>` element, with attributes `left` and `right`, where the bracket character is specified. The mathml used in odf uses the `math:` prefix on each element, and this prefix must also be used on attributes. In our case the `left` and `right` attributes didn't have this prefix, so they weren't taken into account and the default brackets were used.

The prefixes are added using the `\a:mathml` command in tex4ht-mathml.tex; it is empty by default, but ooffice uses the `math:` prefix. It is used on all element names and on most attributes, but it is missing on some of them, in particular in all configurations which use the `<mfenced>` element.

I will add the prefix for the attributes to all configurations which use the `<mfenced>` element and remove the configurations of "left" and "right" from ooffice-mml.4ht. But I guess there are many more instances of prefix-less attributes which need to be fixed.
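
To illustrate what the prefixed attributes look like, here is a small Python sketch that adds the prefix in a post-processing step. This is not how the fix is made in the .4ht sources, where the prefix comes from `\a:mathml`; it only shows the intended result. The function name `prefix_mfenced_attributes` and the sample fragment are made up for illustration. The sample uses `open` and `close`, the delimiter attributes that MathML defines for `mfenced`, but the function does not depend on the attribute names.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# Serialize the MathML namespace with the "math:" prefix, as in ODF formula files.
ET.register_namespace("math", MATHML_NS)


def prefix_mfenced_attributes(root):
    """Move every unqualified attribute of mfenced elements into the MathML namespace."""
    for fenced in root.iter(f"{{{MATHML_NS}}}mfenced"):
        # Iterate over a snapshot of the names because the dict is modified in the loop.
        for name in list(fenced.attrib):
            if not name.startswith("{"):
                value = fenced.attrib.pop(name)
                fenced.set(f"{{{MATHML_NS}}}{name}", value)
    return root


# Hypothetical fragment: a left brace with no visible right delimiter.
SAMPLE = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    '<math:mfenced open="{" close="" separators="">'
    "<math:mi>x</math:mi></math:mfenced></math:math>"
)

root = prefix_mfenced_attributes(ET.fromstring(SAMPLE))
result = ET.tostring(root, encoding="unicode")
print(result)
```

A similar loop over all elements instead of only `mfenced` could be used to look for the other prefix-less attributes.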

Also, maybe it is worth checking whether all mathml fixes in ooffice-mml.4ht are really useful, or if there were only some minor bugs in mathml.4ht as in this case.

Ad 2: It seems to be a bug in LibreOffice's mathml handling. A minimal example which shows this issue is `${} < c$`.

This results in the following mathml:

<math:math xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <math:mo>&lt;</math:mo> <math:mi> c</math:mi></math:math>

This seems like valid code, and Firefox, for instance, has no problem displaying it. The problem can be fixed if we add a `<math:mtext />` tag before `<math:mo>`.

So my question is: is it really a bug in LO, or is there also some issue with the mathml from tex4ht? If it is a bug in tex4ht, can we insert `<mtext />` automatically in place of {} in the math context? Or is some post-processing of the XML needed?

[1] http://tex.stackexchange.com/q/340322/2891

Michal Hoftich <michal_h21>
Project Member

 


 

    1 latest change follows.

    Date                     Changed By  Updated Field  Previous Value => Replaced By
    Thu Oct 5 15:26:10 2023  michal_h21  Open/Closed    Open => Closed