#3371 imm: immxml-merge fails when merging XML files with different encodings in Python 2

5.26.02
fixed
None
defect
imm
tools
major
False
2026-02-27
2025-04-28
No

In Python 2, the immxml-merge tool takes the output file's encoding from the first input file. If the first file is not encoded in UTF-8 but a later input file requires UTF-8, immxml-merge fails with a UnicodeEncodeError when it writes the merged result:

encoding in first source xml document: None
Traceback (most recent call last):
  File "./src/imm/tools/immxml-merge", line 627, in <module>
    main(sys.argv[1:])
  File "./src/imm/tools/immxml-merge", line 619, in main
    merged_doc.save_result()
  File "./src/imm/tools/immxml-merge", line 389, in save_result
    file_object.write(string + "\n")
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc5' in position 229: ordinal not in range(128)
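
The failure can be reproduced in isolation. The sketch below is not taken from immxml-merge; `write_with_encoding` is a hypothetical helper that makes explicit the encode step Python 2 performs implicitly (with the default `ascii` codec) when a `unicode` string is written to a plain file object. It uses the same character, `u'\xc5'`, that appears in the traceback.

```python
import io

# u'\xc5' is the character reported in the traceback above.
text = u"name=\xc5"


def write_with_encoding(text, encoding):
    """Hypothetical helper: encode explicitly, then write the bytes."""
    buf = io.BytesIO()
    buf.write((text + u"\n").encode(encoding))
    return buf.getvalue()


# Encoding with ASCII fails, as in the reported traceback.
try:
    write_with_encoding(text, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding with UTF-8 succeeds for the same text.
utf8_bytes = write_with_encoding(text, "utf-8")
```

Here `ascii_failed` ends up `True`, while the UTF-8 call returns the encoded bytes without error.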

Related

Wiki: ChangeLog-5.26.02

Discussion

  • Nguyen Huynh Tai

    • status: unassigned --> assigned
    • assigned_to: Nguyen Huynh Tai
    • Component: unknown --> imm
    • Part: - --> tools
     
  • Gary Lee

    Gary Lee - 2025-05-03
    • Milestone: 5.25.04 --> 5.25.09
     
  • Nguyen Huynh Tai

    • status: assigned --> fixed
     
  • Nguyen Huynh Tai

    commit c035b53056d1cce502a4fea40d5d48e6770df73e (HEAD -> develop, origin/develop)
    Author: tai.h.nguyen <tai.h.nguyen@endava.com>
    Date:   Mon Apr 28 17:57:18 2025 +0700
    
        imm: Fix immxml-merge failure caused by encoding mismatch [#3371]
    
        The immxml-merge tool uses the encoding of the first input file to set the
        output file's encoding. If any subsequent input files require UTF-8 encoding
        but the first file is not encoded in UTF-8, then immxml-merge will fail.
    
        UTF-8 encoding will be applied to the output files if any input file
        uses UTF-8 encoding.
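
The rule described in the commit message could be sketched as follows. This is an illustration only, not the code from the commit: `declared_encoding` and `pick_output_encoding` are hypothetical names, and the fallback to the first file's encoding when no input declares UTF-8 is an assumption based on the behaviour described in the ticket.

```python
from xml.dom import minidom


def declared_encoding(xml_bytes):
    """Return the encoding named in the XML declaration, or None."""
    return minidom.parseString(xml_bytes).encoding


def pick_output_encoding(encodings):
    """Hypothetical helper: UTF-8 wins if any input declares it."""
    for enc in encodings:
        if enc is not None and enc.lower() in ("utf-8", "utf8"):
            return "utf-8"
    # Assumption: otherwise keep the first file's encoding, as before the fix.
    return encodings[0] if encodings else None


inputs = [
    # First file: no XML declaration, so no declared encoding.
    b"<a/>",
    # Second file: declares UTF-8 and contains a non-ASCII character.
    u'<?xml version="1.0" encoding="UTF-8"?><b name="\xc5"/>'.encode("utf-8"),
]

encodings = [declared_encoding(data) for data in inputs]
chosen = pick_output_encoding(encodings)
```

With these two inputs the first document reports no encoding, matching the "encoding in first source xml document: None" line in the report, and the chosen output encoding is UTF-8 because the second document declares it.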
    
     
