Mon Apr 29 10:32:57 2024, comment #2:
Hi Nasser, this is not specific to TeX4ht; it comes from l3regex, which is used to clean the MathJax code. See this question, where the same issue was found: https://tex.stackexchange.com/q/654633/2891
In theory, you could skip the regex cleaning using this configuration file:
%%%%%%%%%%%%%%%%
\Preamble{xhtml}
\ExplSyntaxOn
\cs_set_protected:Npn \alteqtoks #1
{
\HCode{\detokenize{#1}}
}
\ExplSyntaxOff
\begin{document}
\EndPreamble
%%%%%%%%%%%%%%%%%%%%%
The downside is that you can end up with incorrect characters in your HTML, in particular unescaped <, > and &. These can cause MathJax to fail.
Also, to break your long strings, you can use this prefilter:
%%%%%%%%%%%%%%%
local max_len = 255
for line in io.lines() do
  line = line:gsub("\r", "")
  local str_len = string.len(line)
  if str_len > max_len then
    -- if the line is longer than max_len, break it into shorter segments
    local curr_pos = max_len
    local prev_pos = 1
    while curr_pos < str_len do
      -- find the next command preceded by spaces, starting at the current position
      local space_start, cmd_start = string.find(line, "%s+\\", curr_pos)
      -- no further break point: leave the loop and print the remainder below
      if not space_start then break end
      -- print the segment up to (not including) the spaces before the command
      print(string.sub(line, prev_pos, space_start - 1))
      -- the next segment starts at the backslash of that command
      prev_pos = cmd_start
      -- we must move the current position
      curr_pos = cmd_start + max_len
    end
    -- print the rest of the line
    print(string.sub(line, prev_pos, str_len))
  else
    print(line)
  end
end
%%%%%%%%%%%%
Use it as:
$ texlua format.lua < original.tex > formated.tex
It breaks your long lines into much shorter chunks, which prevents the "! Unable to read an entire line---bufsize=200000" error I encountered. I suppose you have set this limit higher on your system; I had not, so I couldn't compile your file originally.
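If you want to check that the prefilter did its job, something like the following might help. It is only a sketch, assuming a POSIX shell with awk available; long_lines is a hypothetical helper name, not part of TeX4ht or make4ht. It lists the line number and length of every line longer than a given limit (255 by default, to match max_len in the prefilter). The prefilter only breaks before a command preceded by whitespace, so a few lines may legitimately stay longer than the limit.

```shell
# Hypothetical helper: print "line number: length" for every line of file $1
# that is longer than $2 characters (default 255, matching max_len).
long_lines() {
  awk -v max="${2:-255}" 'length($0) > max { print FNR ": " length($0) }' "$1"
}

# Example: long_lines formated.tex
```

It prints nothing when every line fits within the limit.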