Sat Mar 4 14:01:23 2023, original submission:
NB: this is the same as https://tex.stackexchange.com/q/678200/56076
(sorry for cross-posting) :)
I have the following situation:
- a LaTeX file with a macro that tex4ht usually translates into a Unicode character (e.g. `\ldots`, which becomes `…`)
- a citation with a non-ASCII character in the author's name (e.g. the `í` in `Albarracín`)
- I would like to generate an XHTML file with htlatex
The procedure works, but the resulting file mixes two encodings: the character produced by the LaTeX macro is encoded in UTF-8, while the non-ASCII character in the author's name is encoded in Latin-1. AFAICT, htlatex includes the bbl file and reads it as if it were Latin-1.
Is there anything I could do to fix this behavior? :)
(I'm working on `pdfTeX, Version 3.141592653-2.6-1.40.24 (TeX Live 2022/Arch Linux)`)
Here is an MWE, followed by the commands that I run:
```latex
%% File mwe.tex
\documentclass{article}
\usepackage[backend=biber]{biblatex}
\begin{filecontents}{\jobname.bib}
@Article{Albarracin2000,
year = {2000},
volume = {1},
issue = {2},
pages = {3},
author = {Anyone Albarracín},
title = {A beautiful paper.},
journaltitle = {Some Journal}
}
\end{filecontents}
\addbibresource{\jobname.bib}
\begin{document}
I Am a Scientist\ldots\ Ask Me Anything
\parencite{Albarracin2000}
\printbibliography
\end{document}
```
```sh
htlatex mwe.tex "xhtml" "-cunihtf -utf8" "" ""
biber mwe
htlatex mwe.tex "xhtml" "-cunihtf -utf8" "" ""
```
and the result:
```sh
$ file mwe.html
mwe.html: XML 1.0 document, Non-ISO extended-ASCII text
$ grep -a -e 'Anyone Albarra' -e Scientist --color mwe.html
<!--l. 22--><p class="noindent" >I Am a Scientist… Ask Me Anything [<a
<!--l. 26--><p class="noindent" >Anyone Albarrac�n. “A beautiful paper.” In: <span
```