public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 0/12] detect quoting and punctuation problems in diagnostics
@ 2019-05-14 21:31 Martin Sebor
  2019-05-15 11:40 ` Martin Liška
  2019-05-16 10:02 ` Martin Liška
  0 siblings, 2 replies; 4+ messages in thread
From: Martin Sebor @ 2019-05-14 21:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: Martin Liška

Near the end of every release a bunch of problem reports are
raised for various punctuation, quoting, and spelling issues
in GCC.  In GCC 9 a total 28 such issues were submitted.
A fair number of them are discovered as new or changed
diagnostics are being translated.  A checker was developed
to help find some of these as well many others by scanning
the gcc.pot file: contrib/check-internal-format-escaping.py.
The checker automates the process of finding these issues but
it doesn't prevent them.  Being external to GCC the checker
cannot easily distinguish between message strings that are
expected to be translated and those that aren't.

To help avoid introducing the most glaring subset of these
problems as early as possible while making it possible to
differentiate messages that are expected to conform from those
that need not, the attached patch adds a simple checker to GCC:
-Wformat-diag.  My goal is to improve consistency of messages
and relieve the burden at the end of each release both on
translators and those who then have to fix the problems.

The warning detects some of the following:

  *  unquoted operators consisting of two or more characters
     (e.g., == or ?:)
  *  unquoted keywords, types, GCC built-ins and command line
     options, or identifiers with underscores
  *  unquoted apostrophes and contractions like in can't
  *  informal abbreviations (arg or reg)
  *  unbalanced paired tokens (brackets, parentheses, or quotes)
  *  duplicate or trailing spaces (it allows multiple spaces at
     the beginning of a diagnostic since that's used by the C++
     front-end as an "indentation")
  *  unintended control characters
  *  inconsistent sentence capitalization

To avoid diagnosing direct calls to pretty-printer functions
that often format parts of the same diagnostic in multiple steps,
the solution introduces the "_raw__" suffix to the __gcc_diag__
attributes that the warning then uses to avoid checking those
calls.  Functions with the raw attributes aren't checked.

I tested the patch kit as a whole in an x86_64-linux build and
fixed all the issues it pointed out except for 43 instances of
the new warning in the Go front-end that I wasn't sure how best
to handle (I will follow up on that with Ian).  I also adjusted
the affected tests.

There were just a few issues specific to the i386 back-end but they
had an impact on 9 tests.  The impact of fixing the same kinds of
issues in other back-ends is likely to be similar.  Rather than
trying to do the clean up across all supported targets I think it's
better to handle the rest of the issues over time and with the help
of back end maintainers who can more easily verify that tests still
pass.  With than in mind, I have prevented the warning from causing
bootstrap failures. Once the cleanup is complete I expect to remove
this and include the new warning in -Werror.

Since a few files in GCC "misuse" the diagnostic machinery for
debugging output leading up to internal errors (instead of
calling a pp_xxx function) I suppressed the new warning in those
files via #pragma GCC diagnostic ignored "-Wformat-diag".  If we
agree that calling pp_format or some such is the way to go I can
work replacing those calls as a subsequent step.

The subset should cover the following bug reports (some already
resolved):

90176 - diagnostics should generally contain underscore only
         inside quotes
90162 - exclamation mark in diagnostic!!!!!1111!!!!
90158 - aarch64: wrong quotation in diagnostics
90157 - aarch64: unnecessary abbreviation in diagnostic
90156 - add linter check suggesting to replace %<%s%> with %qs
90149 - diagnostics containing BIT_FIELD_REF don't conform to
         diagnostics guideline
90121 - extra space in error message
90117 - Replace %<%s%> with %qs
90011 - [9 Regression] trailing space in diagnostic
79477 - Please write code more translator-friendly (unquoted options)
89936 - wrong punctuation in tree-profile.c (%<%s%>)

Although the changes are mechanical, since I made them all by hand
I broke up the patch into a series to make it easier to review:

[PATCH 1/12] implement -Wformat-diag to detect quoting and spelling
              issues in GCC diagnostics
[PATCH 2/12] fix diagnostic quoting/spelling in ada
[PATCH 3/12] fix diagnostic quoting/spelling in Brig
[PATCH 4/12] fix diagnostic quoting/spelling in the C front-end
[PATCH 5/12] fix diagnostic quoting/spelling in c-family
[PATCH 6/12] fix diagnostic quoting/spelling in C++
[PATCH 7/12] fix diagnostic quoting/spelling in libgcc
[PATCH 8/12] fix diagnostic quoting/spelling in the middle-end
[PATCH 9/12] adjust tests to quoting/spelling diagnostics fixes
[PATCH 10/12] fix diagnostic quoting/spelling in D
[PATCH 11/12] fix diagnostic quoting/spelling issues in i386 back-end
[PATCH 12/12] fix diagnostic quoting/spelling issues in ObjC

Martin

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-16 15:01 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-14 21:31 [PATCH 0/12] detect quoting and punctuation problems in diagnostics Martin Sebor
2019-05-15 11:40 ` Martin Liška
2019-05-16 10:02 ` Martin Liška
2019-05-16 15:01   ` Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).