#203 strange behavior with large \romannumeral

Milestone: Future
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2025-09-06
Created: 2025-09-01
Creator: karl berry
Private: No

Hans discovered some strange behavior of XeTeX with extremely large \romannumeral expansion. His ConTeXt input file:

\starttext
\tttf % so no features, just checking / processing
\dostepwiserecurse{0}{"FFFFFF}{"FFF}{
     \setbox0\hbox{\romannumeral#1}
%     \setbox0\hpack{\romannumeral#1}
     \writestatus{!!!!}{\the\wd0}
     % xetex     : runtime: 74.5 seconds
     %           : gets slower and crawls, reports --32768.0pt, mem stable
}
\stoptext

He writes:

XeTeX clips to 32K and then shows an extra minus
which indicates several problems: some overflow check (not sure why
because that's the hpack routine which should be the same as pdftex) so
we clip, and then some print value issue (also weird as it looks like a
value is interpreted wrong, duplicate minus check or so).
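
One possible reading of the doubled minus, sketched in Python: if the box width ends up as exactly -2^31 sp, a print routine shaped like `print_scaled` in tex.web would emit `--32768.0`, because negating -2^31 in 32-bit arithmetic yields -2^31 again. The routine below is a transcription of that procedure as I understand it; the helper names (`wrap32`, `c_div`, `c_mod`) and the 32-bit wraparound model are assumptions made for illustration, not code taken from XeTeX. It does not explain why the width would be pinned at that value on one machine and not another.

```python
UNITY = 1 << 16  # scaled points (sp) per pt, as in tex.web


def wrap32(n):
    """Model signed 32-bit two's-complement wraparound (an assumption)."""
    return (n + 2**31) % 2**32 - 2**31


def c_div(a, b):
    """Integer division that truncates toward zero, like C's `/`."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def c_mod(a, b):
    """Remainder matching c_div, like C's `%`."""
    return a - b * c_div(a, b)


def print_scaled(s):
    """Format a scaled value the way tex.web's print_scaled does."""
    out = ""
    if s < 0:
        out += "-"
        s = wrap32(-s)  # for s == -2**31 this wraps back to -2**31
    out += str(c_div(s, UNITY)) + "."  # a still-negative s prints a second "-"
    s = 10 * c_mod(s, UNITY) + 5
    delta = 10
    while True:
        if delta > UNITY:
            s = s + 0o100000 - 50000  # round the final digit
        out += str(c_div(s, UNITY))
        s = 10 * c_mod(s, UNITY)
        delta *= 10
        if s <= delta:
            break
    return out


print(print_scaled(-65536) + "pt")   # an ordinary negative width
print(print_scaled(-2**31) + "pt")   # the 32-bit minimum
```

Under these assumptions the second call prints `--32768.0pt`, the string reported in the ticket, while ordinary negative values print with a single minus.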

Discussion

  • Anonymous

    Anonymous - 2025-09-02

    A context-free (just plain-TeX-based) testcase would be helpful here. It's not immediately clear to me what ends up going into the \hbox in the example.

    FWIW, I tried the following, which I thought might be more-or-less equivalent:

    \newcount\i \i=0
    \newcount\step \step="FFF
    \newcount\stop \stop="FFFFFF
    
    \loop
      \ifnum\i < \stop
        \setbox0=\hbox{\romannumeral\i}
        \immediate\write16{!!!! \the\wd0}
        \advance\i by \step
        \repeat
    
    \end
    

    but did not see the behavior described. I do see the logged dimension "wrapping" to negative at 32k, but no extra minus; no slowing to a crawl; and this behavior matches what pdfTeX (or even plain TeX) does with the same file.
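
The wrap at 32k is what one would expect if box widths are accumulated in signed 32-bit scaled points with no overflow check: 2^31 sp is exactly 32768pt. A minimal Python sketch of that arithmetic follows; the 32-bit wraparound model and the sample widths are assumptions chosen for illustration, not measurements from any engine.

```python
UNITY = 65536            # scaled points (sp) per pt
MAX_DIMEN = 2**30 - 1    # \maxdimen in sp, i.e. 16383.99998pt


def wrap32(n):
    """Model signed 32-bit two's-complement wraparound (an assumption)."""
    return (n + 2**31) % 2**32 - 2**31


def pt_to_sp(pt):
    """Convert points to scaled points, rounding to the nearest sp."""
    return round(pt * UNITY)


# Hypothetical box widths: below \maxdimen, between \maxdimen and 2^31 sp,
# and past 2^31 sp.
for pt in (16000.0, 32755.0, 32800.0):
    sp = pt_to_sp(pt)
    stored = wrap32(sp)
    print(f"{pt}pt -> {stored} sp = {stored / UNITY}pt"
          f" (exceeds maxdimen: {sp > MAX_DIMEN})")
```

In this model a width of 32755pt is already past `\maxdimen` but still representable, while 32800pt wraps to -32736pt, which matches the single-minus negative values seen with the plain TeX file above.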

     
  • karl berry

    karl berry - 2025-09-02

    Hello anonymous - thanks for your reply. If I thought anyone would ever actually debug xetex again, I'd ask Hans or figure out how to reproduce it outside of ConTeXt. Or figure out exactly how to run it in ConTeXt. As it is, there are far, far more important bugs already submitted, so I'm not inclined to spend more time on this obscurity ...

     
  • Jonathan Kew

    Jonathan Kew - 2025-09-03

    Hi Karl - that was me, I just hadn't logged in to SF when I first posted. (Only realized when it went into the moderation queue.)

    Regarding debugging, I know it's been sadly neglected, but if there were a simple plain- or ini-based testcase for this, there's some chance it might be an obvious fix. But I can guarantee I'm not going to dig through layers of ConTeXt macros to figure out what it's doing.

    Maybe it's sufficiently obscure that it won't really affect anyone, though.

     
  • karl berry

    karl berry - 2025-09-04

    This runs under initex. And it shows that the problem is apparently Windows-specific, as Hans is running on Windows. I get the same results for pdftex and xetex in tl25 on x86_64-linux, as you did. What's different in Akira's build is not something I want to delve into ...

    \catcode`\{=1 \catcode`\}=2 \catcode`\^=7 \newlinechar=`^^J
    \font\tenrm=cmr10 \tenrm \def\space{ }
    %
    \setbox0\hbox{\romannumeral 3910725}
    \message{^^J[3910725: \the\wd0 \space cf. 32722.85156pt]^^J}
    %
    \setbox0\hbox{\romannumeral 3914820}
    \message{[3914820: \the\wd0 \space cf. 32755.45313pt]^^J}
    %
    \setbox0\hbox{\romannumeral 3918915}
    \message{[3918915: \the\wd0 \space cf. --32768.0pt]^^J}
    %
    \setbox0\hbox{\romannumeral 3923010}
    \message{[3923010: \the\wd0 \space cf. --32768.0pt]^^J}
    \end
    
     
    • Jonathan Kew

      Jonathan Kew - 2025-09-04

      Thanks for looking further. Weird!

      My guess is there's some kind of compiler or standard-library option in Akira's build that is causing overflow to be handled differently from what we see on the other platforms.

      I've started an install on a Windows machine to see how it behaves for me....

       
      • Jonathan Kew

        Jonathan Kew - 2025-09-04

        Nope, I can't reproduce this. I installed TL2025 on Windows, and I get exactly the same result from xetex there as I do on macOS, and the same as I get from pdftex: see screenshot.

        Can you confirm that your example does produce the unexpected results on Windows for you? Are you using the exact same xetex version, or is there maybe a post-TL2025 build that's broken something?

         
  • karl berry

    karl berry - 2025-09-04

    P.S. If you have any time/interest in looking into things, just glancing over the open bug list shows a variety of things that presumably affect people's real-life output. Stacking diacritics (182), all the rtl stuff, etc. Unfortunately almost all the bugs are reported from "nobody" so it's hard to tell what might be important, i.e., from the latex team.

    P.P.S. There's also #185, where Ross Alexander sent in patches to purportedly "fix" an observable minute line breaking difference with the other engines, but so far as I know no one has ever gotten to the bottom of why the difference exists (just blindly installing the patch seems just as likely to make things worse as better). There are some threads on tex-live, e.g., starting at https://tug.org/pipermail/tex-live/2024-April/050355.html, and more references in the bug. I can't imagine that anyone but you will ever get to the bottom of it.

    Thanks.

     
  • karl berry

    karl berry - 2025-09-06

    I can't run anything on Windows, so no, I can't confirm :(.

    There's nothing post-TL25 (or pre-TL25) that would affect this, as far as I know. Nothing serious has changed in the xetex sources in years, apart from the JP devs making it work better with over-BMP chars. Anyway, Hans was using the released TL25 binary.

    Thanks for looking into it so carefully. Guess there's nothing to be done.

     
    • Jonathan Kew

      Jonathan Kew - 2025-09-06

      I guess my remaining question, then, would be whether Hans has reproduced the issue with that simplified example, or has he only encountered it within ConTeXt?

      If it's somehow unique to his machine, or if he can't reproduce it outside of the ConTeXt system, then I don't see anything actionable here at the moment. I'm not going to attempt to understand and debug a whole ConTeXt environment.

       
  • karl berry

    karl berry - 2025-09-06

    I sent Hans the plain TeX example, but I don't think he did anything with it. Feel free to write him yourself if you want, or let's just forget the whole thing. Sorry for the noise. So many more important things to spend time on ...

     
