public inbox for gcc-patches@gcc.gnu.org
* [Patch] c++/7765
@ 2002-10-28  7:43 Wolfgang Bangerth
  2002-10-28  8:32 ` c++/7765 [Patch] Zack Weinberg
  0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Bangerth @ 2002-10-28  7:43 UTC (permalink / raw)
  To: gcc-bugs, gcc-patches, gcc-gnats



Hi,
yesterday I dug a little deeper into this error, which is new in 3.2 and 
present CVS. I believe that this is the reason: in the french translation 
gcc/po/fr.po, we find

> #: cp/call.c:2842
> msgid "%s for `%T %s %T' operator"
> msgstr "%s pour l'opérateur «%t [%T]»"
>
> #: cp/call.c:2845
> msgid "%s for `%s %T' operator"
> msgstr "%s pour l'opérateur «%t [%T]»"

This ultimately leads to the error because %t is not recognized by 
cp/error.c(cp_printer). I confirmed that the appended patch fixes this. I 
have no idea whom to contact for i18n matters, so could someone please take 
care of the necessary steps?
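
To illustrate the failure mode, here is a minimal sketch in C. It is not
the actual cp_printer or diagnostic.c code: the function name
first_unknown_escape and its "known" parameter are made up for this
example, and the set of letters passed in is an assumption, not the real
list accepted by the C++ front end. It walks a format string and returns
the first %-escape whose letter is not in the accepted set, which is the
situation that ends in abort () for the translated string above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first %-escape in FMT whose conversion letter
   is not listed in KNOWN, or NULL if every escape is recognized.
   "%%" is treated as a literal percent sign.  Flags, widths and
   precisions are deliberately not handled in this sketch.  */
const char *
first_unknown_escape (const char *fmt, const char *known)
{
  const char *p;

  assert (fmt != NULL && known != NULL);

  for (p = fmt; *p != '\0'; p++)
    {
      if (*p != '%')
        continue;
      if (p[1] == '%')
        {
          p++;                  /* Skip the escaped percent sign.  */
          continue;
        }
      if (p[1] == '\0' || strchr (known, p[1]) == NULL)
        return p;               /* Dangling '%' or unknown letter.  */
      p++;                      /* Skip the recognized letter.  */
    }
  return NULL;
}
```

Called with the msgid and a hypothetical accepted set such as "sdT", it
returns NULL; called with the old msgstr it returns a pointer to "%t".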

Since this problem did not occur with at least 2.95, it would be nice if 
it could be fixed for 3.2.1. Present CVS will also need this fix.

Regards
  Wolfgang

-------------------------------------------------------------------------
Wolfgang Bangerth           email:              bangerth@ticam.utexas.edu
                            www:   http://www.ticam.utexas.edu/~bangerth/


Index: po/fr.po
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/po/fr.po,v
retrieving revision 1.3.2.2
diff -c -r1.3.2.2 fr.po
*** po/fr.po	10 May 2002 14:52:05 -0000	1.3.2.2
--- po/fr.po	28 Oct 2002 15:30:16 -0000
***************
*** 13122,13132 ****
  
  #: cp/call.c:2842
  msgid "%s for `%T %s %T' operator"
! msgstr "%s pour l'opérateur «%t [%T]»"
  
  #: cp/call.c:2845
  msgid "%s for `%s %T' operator"
! msgstr "%s pour l'opérateur «%t [%T]»"
  
  #: cp/call.c:2937
  msgid "ISO C++ forbids omitting the middle term of a ?: expression"
--- 13122,13132 ----
  
  #: cp/call.c:2842
  msgid "%s for `%T %s %T' operator"
! msgstr "%s pour l'opérateur «%T [%T]»"
  
  #: cp/call.c:2845
  msgid "%s for `%s %T' operator"
! msgstr "%s pour l'opérateur «%T [%T]»"
  
  #: cp/call.c:2937
  msgid "ISO C++ forbids omitting the middle term of a ?: expression"
Index: diagnostic.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/diagnostic.c,v
retrieving revision 1.78
diff -c -r1.78 diagnostic.c
*** diagnostic.c	23 Jan 2002 19:34:08 -0000	1.78
--- diagnostic.c	28 Oct 2002 15:30:16 -0000
***************
*** 733,739 ****
            if (!buffer->format_decoder || !(*buffer->format_decoder) (buffer))
              {
                /* Hmmm.  The front-end failed to install a format translator
!                  but called us with an unrecognized format.  Sorry.  */
                abort ();
              }
          }
--- 733,741 ----
            if (!buffer->format_decoder || !(*buffer->format_decoder) (buffer))
              {
                /* Hmmm.  The front-end failed to install a format translator
!                  or called us with an unrecognized format. (Maybe also just
!                  the translated string contained an invalid format.)
!                  Sorry.  */
                abort ();
              }
          }





* Re: c++/7765 [Patch]
  2002-10-28  7:43 [Patch] c++/7765 Wolfgang Bangerth
@ 2002-10-28  8:32 ` Zack Weinberg
  2002-10-28  8:48   ` Wolfgang Bangerth
  2002-10-28 10:23   ` Joseph S. Myers
  0 siblings, 2 replies; 8+ messages in thread
From: Zack Weinberg @ 2002-10-28  8:32 UTC (permalink / raw)
  To: Wolfgang Bangerth; +Cc: gcc-bugs, gcc-patches, gcc-gnats, Michel Robitaille

On Mon, Oct 28, 2002 at 04:43:22PM +0100, Wolfgang Bangerth wrote:

>   msgid "%s for `%T %s %T' operator"
> ! msgstr "%s pour l'opérateur «%T [%T]»"
>   
>   #: cp/call.c:2845
>   msgid "%s for `%s %T' operator"
> ! msgstr "%s pour l'opérateur «%T [%T]»"

You're on the right track here, but these are still wrong.
Translations must not change the sequence of %-escapes *at all*.
(We haven't yet implemented %1$s notation in diagnostic.c; it's
needed, but until then...) And the square brackets were blindly
copied from the message above; they don't belong.

 msgid "%s for `%T %s %T' operator"
 msgstr "%s pour l'opérateur «%T %s %T»"

 msgid "%s for `%s %T' operator"
 msgstr "%s pour l'opérateur «%s %T»"

If this weren't in the translations it would qualify as an obvious
bugfix.  I would say, go ahead and check it in (with this change);
but we need to get the fix propagated to the translation project.
Michel, you are listed as the translator for this file.  Would you
please incorporate these changes into the next version of the file,
and also check for other places where the %-escapes are inconsistent
between the msgid and the msgstr?
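
A check for this rule could look roughly like the following C sketch.
It is only an illustration: the names next_escape and escapes_match are
invented here, and it looks only at the single character after each '%',
so flags and width specifiers are not understood.

```c
#include <assert.h>
#include <stddef.h>

/* Advance *P past the next %-escape and return its conversion letter,
   or return '\0' once the end of the string is reached.  "%%" is
   skipped because it consumes no argument.  */
static char
next_escape (const char **p)
{
  const char *s = *p;

  while (*s != '\0')
    {
      if (s[0] == '%' && s[1] == '%')
        s += 2;
      else if (s[0] == '%' && s[1] != '\0')
        {
          *p = s + 2;
          return s[1];
        }
      else
        s++;
    }
  *p = s;
  return '\0';
}

/* Return 1 if MSGID and MSGSTR contain the same conversion letters in
   the same order, and 0 otherwise.  */
int
escapes_match (const char *msgid, const char *msgstr)
{
  char a, b;

  assert (msgid != NULL && msgstr != NULL);

  do
    {
      a = next_escape (&msgid);
      b = next_escape (&msgstr);
      if (a != b)
        return 0;
    }
  while (a != '\0');
  return 1;
}
```

With this, the corrected msgstr for the first message passes, while the
"%T [%T]" version fails because its third escape is %T where the msgid
has %s.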

zw


* Re: c++/7765 [Patch]
  2002-10-28  8:32 ` c++/7765 [Patch] Zack Weinberg
@ 2002-10-28  8:48   ` Wolfgang Bangerth
  2002-10-28  9:59     ` Zack Weinberg
  2002-10-28 10:23   ` Joseph S. Myers
  1 sibling, 1 reply; 8+ messages in thread
From: Wolfgang Bangerth @ 2002-10-28  8:48 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc-bugs, gcc-patches, Michel Robitaille


Zack, Michel,

> If this weren't in the translations it would qualify as an obvious
> bugfix.  I would say, go ahead and check it in (with this change);

I can't, someone has to do that for me. Note also the changed comment in
the patch, I believe the old one had an "and" and "or" confused.

> but we need to get the fix propagated to the translation project.
> Michel, you are listed as the translator for this file.  Would you
> please incorporate these changes into the next version of the file,
> and also check for other places where the %-escapes are inconsistent
> between the msgid and the msgstr?

I'm presently about to do just that, with a small script. There are
literally dozens of such cases in the Danish translation, and at least ten
more in the French one, where formats don't match (things are a little bit
complicated, since a % might be followed by a character that needs to be
escaped in perl...). I'm surprised nobody's been paying attention to this
rather obvious problem. Also, do I understand you correctly that the
_order_ of formats needs to be preserved? (One would think so if things
are called via functions with an ellipsis at the end.) This is going to be
a headache for translators, because it restricts their choice of wording,
and it can also not be checked automatically, if the same format appears
more than once in the text.

This is probably going to be too much just for me, but once it's running,
I'd be happy to share the script.

Regards
  Wolfgang

-------------------------------------------------------------------------
Wolfgang Bangerth          email: wolfgang.bangerth@iwr.uni-heidelberg.de
                             www: http://gaia.iwr.uni-heidelberg.de/~wolf



* Re: c++/7765 [Patch]
  2002-10-28  8:48   ` Wolfgang Bangerth
@ 2002-10-28  9:59     ` Zack Weinberg
  2002-10-28 10:57       ` Wolfgang Bangerth
  0 siblings, 1 reply; 8+ messages in thread
From: Zack Weinberg @ 2002-10-28  9:59 UTC (permalink / raw)
  To: Wolfgang Bangerth; +Cc: gcc-bugs, gcc-patches, Michel Robitaille

On Mon, Oct 28, 2002 at 05:47:49PM +0100, Wolfgang Bangerth wrote:
> 
> Zack, Michel,
> 
> > If this weren't in the translations it would qualify as an obvious
> > bugfix.  I would say, go ahead and check it in (with this change);
> 
> I can't, someone has to do that for me. Note also the changed comment in
> the patch, I believe the old one had an "and" and "or" confused.

Okay.  I will apply your comment change shortly, and use your script
to audit the .po files and clean them up.

> I'm presently about to do just that, with a small script. There are
> literally dozens of such cases in the Danish translation, and at least ten
> more in the French one, where formats don't match (things are a little bit
> complicated, since a % might be followed by a character that needs to be
> escaped in perl...).

Joy.

> I'm surprised nobody's been paying attention to this rather obvious
> problem.

msgfmt has logic to check for this stuff, but we can't use it because
it only knows about printf %-escapes, not our extensions.  And it
doesn't reliably know which strings are going to undergo printf
processing.  (The built-in heuristic of scanning strings for %[a-z] is
worse than useless, because a translation may introduce a % where
there wasn't one before.)

> Also, do I understand you correctly that the
> _order_ of formats needs to be preserved? (One would think so if things
> are called via functions with an ellipsis at the end.) This is going to be
> a headache for translators, because it restricts their choice of wording,
> and it can also not be checked automatically, if the same format appears
> more than once in the text.

Yes, the order of formats must presently be preserved.  I realize that
this interferes with proper translation.  What we need to do is
implement the SVR4 "%1$x" printf extension: this allows you to write

 msgid "statement about %d %s"
 msgstr "statement about %2$s in quantity %1$d"

(I will get to this eventually, but the list of things I will get to
eventually has items on it from 1998, so don't hold your breath.
Patches for diagnostic.c are welcome.)
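
For what it is worth, once such notation exists a translation could be
validated along these lines. This is a sketch under stated assumptions,
not anything in diagnostic.c: collect_letters, positional_ok and
MAX_ARGS are hypothetical names, only one letter per escape is
understood, and it does not check that every argument is actually used.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_ARGS 16

/* Store the conversion letters of the plain escapes in FMT into
   LETTERS, in order.  Return how many were found, or -1 if there are
   more than MAX_ARGS.  */
static int
collect_letters (const char *fmt, char letters[MAX_ARGS])
{
  int n = 0;

  for (; *fmt != '\0'; fmt++)
    {
      if (*fmt != '%')
        continue;
      fmt++;
      if (*fmt == '\0')
        break;                  /* Dangling '%' at end of string.  */
      if (*fmt == '%')
        continue;               /* "%%" consumes no argument.  */
      if (n == MAX_ARGS)
        return -1;
      letters[n++] = *fmt;
    }
  return n;
}

/* Return 1 if every "%N$c" escape in MSGSTR refers to argument N of
   MSGID and uses the same letter c that MSGID uses for it.  */
int
positional_ok (const char *msgid, const char *msgstr)
{
  char letters[MAX_ARGS];
  int nargs;
  const char *p;

  assert (msgid != NULL && msgstr != NULL);

  nargs = collect_letters (msgid, letters);
  if (nargs < 0)
    return 0;

  for (p = msgstr; *p != '\0'; p++)
    {
      int index = 0;

      if (*p != '%')
        continue;
      p++;
      if (*p == '%')
        continue;
      while (isdigit ((unsigned char) *p))
        {
          index = index * 10 + (*p - '0');
          if (index > MAX_ARGS)
            return 0;
          p++;
        }
      if (index < 1 || index > nargs || *p != '$')
        return 0;               /* Unnumbered or out-of-range escape.  */
      p++;
      if (*p != letters[index - 1])
        return 0;               /* Letter disagrees with the msgid.  */
    }
  return 1;
}
```

The example pair above passes this check; swapping the two numbers so
that %1$s points at the integer argument does not.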

zw


* Re: c++/7765 [Patch]
  2002-10-28  8:32 ` c++/7765 [Patch] Zack Weinberg
  2002-10-28  8:48   ` Wolfgang Bangerth
@ 2002-10-28 10:23   ` Joseph S. Myers
  1 sibling, 0 replies; 8+ messages in thread
From: Joseph S. Myers @ 2002-10-28 10:23 UTC (permalink / raw)
  To: Zack Weinberg
  Cc: Wolfgang Bangerth, gcc-bugs, gcc-patches, gcc-gnats, Michel Robitaille

On Mon, 28 Oct 2002, Zack Weinberg wrote:

> but we need to get the fix propagated to the translation project.

Somewhere the requirement for translation fixes to go via the translation 
project should be documented (I'd suggest both in contribute.html and in 
the list of upstream packages in codingconventions.html).

Mark's also asked for more detailed documentation in branching.html of how
to regenerate gcc.pot and send it to the translation project.

-- 
Joseph S. Myers
jsm28@cam.ac.uk


* Re: c++/7765 [Patch]
  2002-10-28  9:59     ` Zack Weinberg
@ 2002-10-28 10:57       ` Wolfgang Bangerth
  2002-11-05  0:26         ` Kai Henningsen
  0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Bangerth @ 2002-10-28 10:57 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc-bugs, gcc-patches, Michel Robitaille


> > I'm presently about to do just that, with a small script. There are
> > literally dozens of such cases in the Danish translation, and at least ten
> > more in the French one, where formats don't match (things are a little bit
> > complicated, since a % might be followed by a character that needs to be
> > escaped in perl...).
> 
> Joy.

I may have misunderstood the format at first when I said that. The present 
version of the script only checks for %., where initially I thought I 
would have to take care of %%. as well, where . could have been in 
[()\[\]\\]. But then, if one would do it correctly, one would have to 
check for length-width-size-height and whatnot specifiers as well, the 
full set of what printf et al understand. I did not do that -- there are 
more obvious problems that can be fixed first.


> > Also, do I understand you correctly that the
> > _order_ of formats needs to be preserved? (One would think so if things
> > are called via functions with an ellipsis at the end.) This is going to be
> > a headache for translators, because it restricts their choice of wording,
> > and it can also not be checked automatically, if the same format appears
> > more than once in the text.
> 
> Yes, the order of formats must presently be preserved.  I realize that
> this interferes with proper translation.  What we need to do is
> implement the SVR4 "%1$x" printf extension: this allows you to write
> 
>  msgid "statement about %d %s"
>  msgstr "statement about %2$s in quantity %1$d"
> 
> (I will get to this eventually, but the list of things I will get to
> eventually has items on it from 1998, so don't hold your breath.
> Patches for diagnostic.c are welcome.)

Doing this first, before cleaning up what is there now, may be even 
the simpler way: about 1000 (half of the total) are ordering problems. 
These could, to a large extent, probably be fixed if the SVR4 syntax would 
be there, without even knowing the language in question. Maybe even 
semiautomatically.
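
A rough idea of what such a semiautomatic fix might look like, as a C
sketch. Everything here is an assumption made for illustration: the
function number_escapes is hypothetical, it presumes the "%N$c" syntax
has been implemented, and it only handles the easy case where every
conversion letter occurs at most once in the msgid, so that a letter in
the msgstr identifies its argument unambiguously. Messages with a
repeated letter are rejected and would still need a human.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy MSGSTR into OUT (capacity OUTSIZE), turning each plain "%c"
   into "%N$c" where N is the position of letter c in MSGID.  Return 0
   on success, or -1 if MSGID repeats a letter, MSGSTR uses a letter
   MSGID lacks, or OUT is too small.  */
int
number_escapes (const char *msgid, const char *msgstr,
                char *out, size_t outsize)
{
  char letters[16];
  int nargs = 0;
  size_t used = 0;
  const char *p;

  assert (msgid != NULL && msgstr != NULL && out != NULL && outsize > 0);
  out[0] = '\0';

  /* Collect the msgid letters, refusing duplicates.  */
  for (p = msgid; *p != '\0'; p++)
    {
      if (*p != '%')
        continue;
      p++;
      if (*p == '\0')
        return -1;
      if (*p == '%')
        continue;
      if (nargs == 16 || memchr (letters, *p, (size_t) nargs) != NULL)
        return -1;
      letters[nargs++] = *p;
    }

  /* Copy the msgstr, numbering each escape by its letter.  */
  for (p = msgstr; *p != '\0'; p++)
    {
      int n;

      if (p[0] == '%' && p[1] != '%')
        {
          const char *hit
            = p[1] != '\0' ? memchr (letters, p[1], (size_t) nargs) : NULL;

          if (hit == NULL)
            return -1;
          n = snprintf (out + used, outsize - used, "%%%d$%c",
                        (int) (hit - letters) + 1, p[1]);
          p++;
        }
      else if (p[0] == '%')
        {
          n = snprintf (out + used, outsize - used, "%%%%");
          p++;
        }
      else
        n = snprintf (out + used, outsize - used, "%c", *p);

      if (n < 0 || (size_t) n >= outsize - used)
        return -1;
      used += (size_t) n;
    }
  return 0;
}
```

Given the msgid "statement about %d %s" and a reordered msgstr
"statement about %s in quantity %d", this produces
"statement about %2$s in quantity %1$d".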

Regards
  Wolfgang

-------------------------------------------------------------------------
Wolfgang Bangerth              email:           bangerth@ticam.utexas.edu
                               www: http://www.ticam.utexas.edu/~bangerth



* Re: c++/7765 [Patch]
  2002-10-28 10:57       ` Wolfgang Bangerth
@ 2002-11-05  0:26         ` Kai Henningsen
  2002-11-05  9:50           ` Zack Weinberg
  0 siblings, 1 reply; 8+ messages in thread
From: Kai Henningsen @ 2002-11-05  0:26 UTC (permalink / raw)
  To: gcc-patches

bangerth@ticam.utexas.edu (Wolfgang Bangerth)  wrote on 28.10.02 in <Pine.LNX.4.44.0210281251240.736-100000@gandalf.ticam.utexas.edu>:

> > Yes, the order of formats must presently be preserved.  I realize that
> > this interferes with proper translation.  What we need to do is
> > implement the SVR4 "%1$x" printf extension: this allows you to write
> >
> >  msgid "statement about %d %s"
> >  msgstr "statement about %2$s in quantity %1$d"
> >
> > (I will get to this eventually, but the list of things I will get to
> > eventually has items on it from 1998, so don't hold your breath.
> > Patches for diagnostic.c are welcome.)
>
> Doing this first, before cleaning up what is there now, may be even
> the simpler way: about 1000 (half of the total) are ordering problems.
> These could, to a large extent, probably be fixed if the SVR4 syntax would
> be there, without even knowing the language in question. Maybe even
> semiautomatically.

Unfortunately, that's not exactly trivial. Maybe one could steal the  
relevant code from glibc?

Essentially, one would have to first build up a list of all specifiers in  
the format string, hope that that gives no holes with unused parameters in  
them where one does not know their type, then process the varargs with the  
type info one has into that list, and finally do another pass to output  
the whole mess - first and third pass in format string sequence, second  
pass in argument sequence.
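
The three passes can be sketched in C as follows. This is neither the
glibc code nor a patch for diagnostic.c: format_positional, arg_value
and MAX_ARGS are names made up for this illustration, only %d and %s
are supported, argument numbers are limited to a single digit, and
unnumbered escapes simply take the next sequential argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 9

union arg_value { int i; const char *s; };

/* P points just after a '%'.  Store the 1-based argument number in
   *INDEX (0 if the escape is unnumbered) and the conversion letter in
   *LETTER; return a pointer past the escape.  */
static const char *
parse_escape (const char *p, int *index, char *letter)
{
  *index = 0;
  if (p[0] >= '1' && p[0] <= '9' && p[1] == '$')
    {
      *index = p[0] - '0';
      p += 2;
    }
  *letter = *p;
  return *p != '\0' ? p + 1 : p;
}

/* Format FMT into OUT; return the length written, or -1 on error.  */
int
format_positional (char *out, size_t outsize, const char *fmt, ...)
{
  char types[MAX_ARGS + 1] = { 0 };     /* types[n]: letter for arg n.  */
  union arg_value values[MAX_ARGS + 1];
  int nargs = 0, next = 1, n, index, len;
  size_t used = 0;
  char letter;
  const char *p;
  va_list ap;

  assert (out != NULL && outsize > 0 && fmt != NULL);
  out[0] = '\0';

  /* Pass 1, in format order: learn the type of each argument.  */
  for (p = fmt; *p != '\0'; )
    {
      if (*p++ != '%')
        continue;
      if (*p == '%')
        {
          p++;
          continue;
        }
      p = parse_escape (p, &index, &letter);
      if (index == 0)
        index = next++;
      if (index > MAX_ARGS || letter == '\0' || strchr ("ds", letter) == NULL)
        return -1;
      if (types[index] != 0 && types[index] != letter)
        return -1;              /* One argument, two different types.  */
      types[index] = letter;
      if (index > nargs)
        nargs = index;
    }
  for (n = 1; n <= nargs; n++)
    if (types[n] == 0)
      return -1;                /* A hole: type unknown, cannot skip it.  */

  /* Pass 2, in argument order: fetch the values.  */
  va_start (ap, fmt);
  for (n = 1; n <= nargs; n++)
    if (types[n] == 'd')
      values[n].i = va_arg (ap, int);
    else
      values[n].s = va_arg (ap, const char *);
  va_end (ap);

  /* Pass 3, in format order again: produce the output.  */
  next = 1;
  for (p = fmt; *p != '\0'; )
    {
      if (*p != '%')
        len = snprintf (out + used, outsize - used, "%c", *p++);
      else if (p[1] == '%')
        {
          len = snprintf (out + used, outsize - used, "%%");
          p += 2;
        }
      else
        {
          p = parse_escape (p + 1, &index, &letter);
          if (index == 0)
            index = next++;
          if (letter == 'd')
            len = snprintf (out + used, outsize - used, "%d", values[index].i);
          else
            len = snprintf (out + used, outsize - used, "%s", values[index].s);
        }
      if (len < 0 || (size_t) len >= outsize - used)
        return -1;
      used += (size_t) len;
    }
  return (int) used;
}
```

Calling it with "statement about %2$s in quantity %1$d", 3, "apples"
yields "statement about apples in quantity 3", whereas "%2$s only" is
rejected because argument 1 would be a hole of unknown type.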

It's not a particularly obvious piece of code. Actually, the glibc code is  
even worse IIRC because not only does it optimize the case where there are  
no n$ sequences (leading to a significantly different logic flow), it also  
IIRC tries to do several different printf variants with the same source -  
a macro jungle not quite like the worst places in gcc but definitely  
tending in that direction.

It is, however, possible to drastically clean it up by not trying to make  
it do just as much, which is what I did when reusing it for GNUstep (which  
does implement an additional %@ escape for objects which know how to  
generate a text description of themselves, incidentally; also uses its own  
locale system).

MfG Kai


* Re: c++/7765 [Patch]
  2002-11-05  0:26         ` Kai Henningsen
@ 2002-11-05  9:50           ` Zack Weinberg
  0 siblings, 0 replies; 8+ messages in thread
From: Zack Weinberg @ 2002-11-05  9:50 UTC (permalink / raw)
  To: Kai Henningsen; +Cc: gcc-patches

On Tue, Nov 05, 2002 at 08:58:00AM +0200, Kai Henningsen wrote:
> bangerth@ticam.utexas.edu (Wolfgang Bangerth)  wrote on 28.10.02 in <Pine.LNX.4.44.0210281251240.736-100000@gandalf.ticam.utexas.edu>:
> 
> > > Yes, the order of formats must presently be preserved.  I realize that
> > > this interferes with proper translation.  What we need to do is
> > > implement the SVR4 "%1$x" printf extension: this allows you to write
> > >
> > >  msgid "statement about %d %s"
> > >  msgstr "statement about %2$s in quantity %1$d"
> > >
> > > (I will get to this eventually, but the list of things I will get to
> > > eventually has items on it from 1998, so don't hold your breath.
> > > Patches for diagnostic.c are welcome.)
> >
> > Doing this first, before cleaning up what is there now, may be even
> > the simpler way: about 1000 (half of the total) are ordering problems.
> > These could, to a large extent, probably be fixed if the SVR4 syntax would
> > be there, without even knowing the language in question. Maybe even
> > semiautomatically.
> 
> Unfortunately, that's not exactly trivial. Maybe one could steal the  
> relevant code from glibc?
> 
> Essentially, one would have to first build up a list of all specifiers in  
> the format string, hope that that gives no holes with unused parameters in  
> them where one does not know their type, then process the varargs with the  
> type info one has into that list, and finally do another pass to output  
> the whole mess - first and third pass in format string sequence, second  
> pass in argument sequence.

I'm working on just this.  It may take me a while - it is indeed
nonobvious, and I don't intend to recycle the glibc code because it's
just too messy.

It may well be too invasive for 3.3 - we'll see.

zw

