gh-79516: allow msgfmt.py to compile multiple input po files #10875

Open
s-ball wants to merge 70 commits into python:main from s-ball:multi_inputs

Conversation

@s-ball

@s-ball s-ball commented Dec 3, 2018

msgfmt.py (from Tools/i18n) can now reliably be passed more than one input po file.

In addition, its central make function can now reliably be called repeatedly, which fixes bpo-9741.

https://bugs.python.org/issue35335

Test option processing, and conversion of one single file with or
without the -o option.
Also test the little-documented behaviour of merging two input files
with the -o option.
Also show that it is now possible to build multiple po files in one
single script call.
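
The two modes described above (merging all inputs into one output with -o, or compiling each input separately) could be pictured with the sketch below. It is only an illustration, not the msgfmt.py code: plan_outputs is a hypothetical helper, and it assumes that the default output name is obtained by replacing the input file's extension with .mo.

```python
from pathlib import Path


def plan_outputs(inputs, outfile=None):
    """Return a list of (input_files, output_file) compilation jobs.

    Hypothetical helper for illustration only.
    With an explicit outfile, all inputs are merged into that single file.
    Without it, each input gets its own output, named (by assumption)
    after the input with its extension replaced by ".mo".
    """
    if outfile is not None:
        return [(list(inputs), outfile)]
    return [([name], str(Path(name).with_suffix(".mo"))) for name in inputs]


if __name__ == "__main__":
    # One merged catalog versus one catalog per input file.
    print(plan_outputs(["fr.po", "de.po"], "all.mo"))
    print(plan_outputs(["fr.po", "de.po"]))
```
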
@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Our records indicate we have not received your CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

s-ball and others added 3 commits January 23, 2025 19:32
When merging main into multi_inputs, the reference to os_helper was
erroneously removed.
In 2018, all imports came from the test.support package. They are now
split among various subpackages.
@s-ball
Author

s-ball commented Jan 23, 2025

My patches now successfully pass all tests. Is there anything else I should do?

@zware zware changed the title bpo-35335: explicitely allows msgfmt.py to compile more than one input po files gh-79516: explicitely allows msgfmt.py to compile more than one input po files Feb 25, 2025
s-ball added 2 commits March 19, 2025 13:52
Fixes a small bug introduced in previous commit (changed behaviour when an input file has an extension other than .po)
@s-ball
Author

s-ball commented Mar 20, 2025

OK, the tests for the infile/outfile computations are passing.

Back to the duplicate ids question, it should not be that hard:

  • add a lid variable in process to record the lno value when a msgid is found
  • pass that lid along with infile to the add method and store tuples (msgstr, infile, lid) in the messages dict
  • in add, test whether a key is present before adding a new record and abort with a message - we now have the file and line number of the previous and current id
  • in generate replace the 2 occurrences of messages[id] with messages[id][0]
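
The steps listed above could be sketched roughly as follows. This is a standalone illustration, not the msgfmt.py code: the module-level MESSAGES dict, the simplified add signature, the translations helper, and the DuplicateMsgidError exception are all hypothetical names, and the real add function may take different parameters.

```python
# Standalone sketch of duplicate msgid detection (illustrative names only).

MESSAGES = {}  # msgid -> (msgstr, infile, lid)


class DuplicateMsgidError(Exception):
    """Raised when a msgid is defined a second time."""


def add(msgid, msgstr, infile, lid):
    """Store a translation with the file and line where its msgid was found."""
    if msgid in MESSAGES:
        # The stored tuple gives the location of the earlier definition.
        _, prev_file, prev_lid = MESSAGES[msgid]
        raise DuplicateMsgidError(
            f"{infile}:{lid}: duplicate msgid {msgid!r}, "
            f"first defined at {prev_file}:{prev_lid}"
        )
    MESSAGES[msgid] = (msgstr, infile, lid)


def translations():
    """Yield (msgid, msgstr) pairs; only element [0] of each tuple is used."""
    for msgid in sorted(MESSAGES):
        yield msgid, MESSAGES[msgid][0]
```

Raising an exception here stands in for "abort with a message"; a command-line tool would more likely print the message to stderr and exit with a non-zero status.
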

For the tests, the current test_both_with_outputfile should fail because the empty msgid is duplicated in the second file. We should just add a new PO file without header to have a passing test.

This would allow better compatibility with GNU msgfmt.

Do you think that this should go into this PR or into a new one?

@s-ball
Author

s-ball commented Mar 21, 2025

I started working on that point and ran into questions of behavior rather than technique. GNU msgfmt chokes if more than one file contains a header, but the header is the only place where the encoding of a PO file can be declared. Since GNU gettext ships a number of auxiliary programs that help manage a set of PO files, this behavior is acceptable there, but msgfmt.py has no such tools, so it would be a serious limitation. My opinion is that msgfmt.py should accept a header in every file but store only the first (or the last?) in the resulting MO file. That way users could process PO files in encodings other than UTF-8. But I would love to hear other opinions on that...
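
As a rough illustration of the "accept a header in every file, keep only the first" option, here is a minimal sketch. It is not taken from msgfmt.py: merge_catalogs and header_charset are hypothetical helpers, each catalog is assumed to be already parsed into a dict whose empty msgid holds the header text, and non-header duplicates are simply overwritten by later files in this sketch.

```python
from email.parser import HeaderParser


def header_charset(header, default="utf-8"):
    """Return the charset declared in a PO header's Content-Type line.

    Falls back to `default` when no charset is declared.
    """
    charset = HeaderParser().parsestr(header).get_content_charset()
    return charset or default


def merge_catalogs(catalogs):
    """Merge (infile, entries) pairs, keeping only the first header seen.

    `entries` maps msgid to msgstr; the empty msgid is the PO header.
    """
    merged = {}
    for _infile, entries in catalogs:
        for msgid, msgstr in entries.items():
            if msgid == "":
                # Later headers are accepted but not stored.
                merged.setdefault("", msgstr)
            else:
                merged[msgid] = msgstr
    return merged
```

Each file's own header would still be needed while reading that file, since it is what declares the encoding of its entries; only the header written to the MO file is affected by the "first wins" rule.
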

If you think that this point requires a longer discussion, or that it should be discussed in a different place, then we should handle it in a follow-up PR. If you think that just making a special case for the header is reasonable, then I could implement it in a couple of days.

s-ball and others added 3 commits April 4, 2025 09:39
The parameter order of compile_messages was changed in a previous commit to allow compiling multiple PO files.
@StanFromIreland
Member

There will be a lot of conflicts with my hashing pr… this pr changes quite a lot. What do we want to do?

@s-ball
Author

s-ball commented Apr 4, 2025

I was not very happy to reverse the parameter order of compile_messages, because I thought it could disturb users who were used to the previous order. If there are concurrent PRs (which I was absolutely not aware of), this is clearly a blocking point. IMHO the only possible ways forward are to sequence the PRs if there are only two of them, or to revert the modification of compile_messages and use a different function to compile multiple PO files. I am sorry if I have made a mistake here. It is my first PR and I really do not know the conventions. Anyway, are there other sources of conflict?

@StanFromIreland
Member

I guess it is up to @serhiy-storchaka, who is the expert on msgfmt if I remember correctly, to decide in what order we want these (and who will have to deal with the conflicts).

@serhiy-storchaka
Member

Before proceeding with this issue, we must solve other issues: --exclude-file, --omit-header, use the source file encoding. I think they are important for this issue.

@s-ball
Author

s-ball commented Apr 4, 2025

If I understand correctly, I should just leave this PR as it is now, without even worrying about conflicts, and only come back when the other issues are fixed. Not really a problem, but do not forget to wake me when it is time to fix the conflicts... Anyway, it is far from very important even for me, as I have understood that I had to use a private copy for my own project. I have already been working that way for more than 6 years, after all... Stupid me again! If there are conflicts with other changes, it means that those changes have been merged and that the conflicts need to be resolved... I will just keep coming here from time to time and fix them when I see them. But unless advised otherwise, I will not try to handle the duplicate ids question before the PR is merged in its current state, to avoid adding other sources of possible conflicts.

@python python deleted a comment Apr 7, 2025
@python-cla-bot

python-cla-bot bot commented Apr 18, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

s-ball and others added 3 commits May 20, 2025 16:42
@github-actions

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Sep 21, 2025
@s-ball
Author

s-ball commented Oct 7, 2025

This PR was marked as stale because of long inactivity. I was able to fix a conflict with the main branch and merge main, but that did not remove the stale label. Should I do anything else?

@StanFromIreland StanFromIreland removed the stale Stale PR or inactive for long period of time. label Oct 7, 2025
@github-actions

github-actions bot commented Jan 1, 2026

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Jan 1, 2026
@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Mar 24, 2026
9 participants