* [PATCH 0/12] detect quoting and punctuation problems in diagnostics @ 2019-05-14 21:31 Martin Sebor 2019-05-15 11:40 ` Martin Liška 2019-05-16 10:02 ` Martin Liška 0 siblings, 2 replies; 4+ messages in thread From: Martin Sebor @ 2019-05-14 21:31 UTC (permalink / raw) To: gcc-patches; +Cc: Martin Liška Near the end of every release a bunch of problem reports are raised for various punctuation, quoting, and spelling issues in GCC. In GCC 9 a total 28 such issues were submitted. A fair number of them are discovered as new or changed diagnostics are being translated. A checker was developed to help find some of these as well many others by scanning the gcc.pot file: contrib/check-internal-format-escaping.py. The checker automates the process of finding these issues but it doesn't prevent them. Being external to GCC the checker cannot easily distinguish between message strings that are expected to be translated and those that aren't. To help avoid introducing the most glaring subset of these problems as early as possible while making it possible to differentiate messages that are expected to conform from those that need not, the attached patch adds a simple checker to GCC: -Wformat-diag. My goal is to improve consistency of messages and relieve the burden at the end of each release both on translators and those who then have to fix the problems. The warning detects some of the following: * unquoted operators consisting of two or more characters (e.g., == or ?:) * unquoted keywords, types, GCC built-ins and command line options, or identifiers with underscores * unquoted apostrophes and contractions like in can't * informal abbreviations (arg or reg) * unbalanced paired tokens (brackets, parentheses, or quotes) * duplicate or trailing spaces (it allows multiple spaces at the beginning of a diagnostic since that's used by the C++ front-end as an "indentation") * unintended control characters * inconsistent sentence capitalization To avoid diagnosing direct calls to pretty-printer functions that often format parts of the same diagnostic in multiple steps, the solution introduces the "_raw__" suffix to the __gcc_diag__ attributes that the warning then uses to avoid checking those calls. Functions with the raw attributes aren't checked. I tested the patch kit as a whole in an x86_64-linux build and fixed all the issues it pointed out except for 43 instances of the new warning in the Go front-end that I wasn't sure how best to handle (I will follow up on that with Ian). I also adjusted the affected tests. There were just a few issues specific to the i386 back-end but they had an impact on 9 tests. The impact of fixing the same kinds of issues in other back-ends is likely to be similar. Rather than trying to do the clean up across all supported targets I think it's better to handle the rest of the issues over time and with the help of back end maintainers who can more easily verify that tests still pass. With than in mind, I have prevented the warning from causing bootstrap failures. Once the cleanup is complete I expect to remove this and include the new warning in -Werror. Since a few files in GCC "misuse" the diagnostic machinery for debugging output leading up to internal errors (instead of calling a pp_xxx function) I suppressed the new warning in those files via #pragma GCC diagnostic ignored "-Wformat-diag". If we agree that calling pp_format or some such is the way to go I can work replacing those calls as a subsequent step. The subset should cover the following bug reports (some already resolved): 90176 - diagnostics should generally contain underscore only inside quotes 90162 - exclamation mark in diagnostic!!!!!1111!!!! 90158 - aarch64: wrong quotation in diagnostics 90157 - aarch64: unnecessary abbreviation in diagnostic 90156 - add linter check suggesting to replace %<%s%> with %qs 90149 - diagnostics containing BIT_FIELD_REF don't conform to diagnostics guideline 90121 - extra space in error message 90117 - Replace %<%s%> with %qs 90011 - [9 Regression] trailing space in diagnostic 79477 - Please write code more translator-friendly (unquoted options) 89936 - wrong punctuation in tree-profile.c (%<%s%>) Although the changes are mechanical, since I made them all by hand I broke up the patch into a series to make it easier to review: [PATCH 1/12] implement -Wformat-diag to detect quoting and spelling issues in GCC diagnostics [PATCH 2/12] fix diagnostic quoting/spelling in ada [PATCH 3/12] fix diagnostic quoting/spelling in Brig [PATCH 4/12] fix diagnostic quoting/spelling in the C front-end [PATCH 5/12] fix diagnostic quoting/spelling in c-family [PATCH 6/12] fix diagnostic quoting/spelling in C++ [PATCH 7/12] fix diagnostic quoting/spelling in libgcc [PATCH 8/12] fix diagnostic quoting/spelling in the middle-end [PATCH 9/12] adjust tests to quoting/spelling diagnostics fixes [PATCH 10/12] fix diagnostic quoting/spelling in D [PATCH 11/12] fix diagnostic quoting/spelling issues in i386 back-end [PATCH 12/12] fix diagnostic quoting/spelling issues in ObjC Martin ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 0/12] detect quoting and punctuation problems in diagnostics 2019-05-14 21:31 [PATCH 0/12] detect quoting and punctuation problems in diagnostics Martin Sebor @ 2019-05-15 11:40 ` Martin Liška 2019-05-16 10:02 ` Martin Liška 1 sibling, 0 replies; 4+ messages in thread From: Martin Liška @ 2019-05-15 11:40 UTC (permalink / raw) To: Martin Sebor, gcc-patches On 5/14/19 11:31 PM, Martin Sebor wrote: > Near the end of every release a bunch of problem reports are > raised for various punctuation, quoting, and spelling issues > in GCC. In GCC 9 a total 28 such issues were submitted. Hi Martin. Great that you prepared the patchset. > A fair number of them are discovered as new or changed > diagnostics are being translated. A checker was developed > to help find some of these as well many others by scanning > the gcc.pot file: contrib/check-internal-format-escaping.py. > The checker automates the process of finding these issues but > it doesn't prevent them. Being external to GCC the checker > cannot easily distinguish between message strings that are > expected to be translated and those that aren't. > > To help avoid introducing the most glaring subset of these > problems as early as possible while making it possible to > differentiate messages that are expected to conform from those > that need not, the attached patch adds a simple checker to GCC: > -Wformat-diag. My goal is to improve consistency of messages > and relieve the burden at the end of each release both on > translators and those who then have to fix the problems. I like the idea! One limitation compared to the linter is that one needs to have a test-suite coverage for errors and warnings. Moreover, one needs to have that for all target supported by GCC. > > The warning detects some of the following: > >  * unquoted operators consisting of two or more characters >    (e.g., == or ?:) >  * unquoted keywords, types, GCC built-ins and command line >    options, or identifiers with underscores >  * unquoted apostrophes and contractions like in can't >  * informal abbreviations (arg or reg) >  * unbalanced paired tokens (brackets, parentheses, or quotes) >  * duplicate or trailing spaces (it allows multiple spaces at >    the beginning of a diagnostic since that's used by the C++ >    front-end as an "indentation") >  * unintended control characters >  * inconsistent sentence capitalization You implemented very complex checks! Nice. > > To avoid diagnosing direct calls to pretty-printer functions > that often format parts of the same diagnostic in multiple steps, > the solution introduces the "_raw__" suffix to the __gcc_diag__ > attributes that the warning then uses to avoid checking those > calls. Functions with the raw attributes aren't checked. > > I tested the patch kit as a whole in an x86_64-linux build and > fixed all the issues it pointed out except for 43 instances of > the new warning in the Go front-end that I wasn't sure how best > to handle (I will follow up on that with Ian). I also adjusted > the affected tests. > > There were just a few issues specific to the i386 back-end but they > had an impact on 9 tests. The impact of fixing the same kinds of > issues in other back-ends is likely to be similar. Rather than > trying to do the clean up across all supported targets I think it's > better to handle the rest of the issues over time and with the help > of back end maintainers who can more easily verify that tests still > pass. With than in mind, I have prevented the warning from causing > bootstrap failures. Once the cleanup is complete I expect to remove > this and include the new warning in -Werror. The approach works form me. Martin > > Since a few files in GCC "misuse" the diagnostic machinery for > debugging output leading up to internal errors (instead of > calling a pp_xxx function) I suppressed the new warning in those > files via #pragma GCC diagnostic ignored "-Wformat-diag". If we > agree that calling pp_format or some such is the way to go I can > work replacing those calls as a subsequent step. > > The subset should cover the following bug reports (some already > resolved): > > 90176 - diagnostics should generally contain underscore only >        inside quotes > 90162 - exclamation mark in diagnostic!!!!!1111!!!! > 90158 - aarch64: wrong quotation in diagnostics > 90157 - aarch64: unnecessary abbreviation in diagnostic > 90156 - add linter check suggesting to replace %<%s%> with %qs > 90149 - diagnostics containing BIT_FIELD_REF don't conform to >        diagnostics guideline > 90121 - extra space in error message > 90117 - Replace %<%s%> with %qs > 90011 - [9 Regression] trailing space in diagnostic > 79477 - Please write code more translator-friendly (unquoted options) > 89936 - wrong punctuation in tree-profile.c (%<%s%>) > > Although the changes are mechanical, since I made them all by hand > I broke up the patch into a series to make it easier to review: > > [PATCH 1/12] implement -Wformat-diag to detect quoting and spelling >             issues in GCC diagnostics > [PATCH 2/12] fix diagnostic quoting/spelling in ada > [PATCH 3/12] fix diagnostic quoting/spelling in Brig > [PATCH 4/12] fix diagnostic quoting/spelling in the C front-end > [PATCH 5/12] fix diagnostic quoting/spelling in c-family > [PATCH 6/12] fix diagnostic quoting/spelling in C++ > [PATCH 7/12] fix diagnostic quoting/spelling in libgcc > [PATCH 8/12] fix diagnostic quoting/spelling in the middle-end > [PATCH 9/12] adjust tests to quoting/spelling diagnostics fixes > [PATCH 10/12] fix diagnostic quoting/spelling in D > [PATCH 11/12] fix diagnostic quoting/spelling issues in i386 back-end > [PATCH 12/12] fix diagnostic quoting/spelling issues in ObjC > > Martin ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 0/12] detect quoting and punctuation problems in diagnostics 2019-05-14 21:31 [PATCH 0/12] detect quoting and punctuation problems in diagnostics Martin Sebor 2019-05-15 11:40 ` Martin Liška @ 2019-05-16 10:02 ` Martin Liška 2019-05-16 15:01 ` Martin Sebor 1 sibling, 1 reply; 4+ messages in thread From: Martin Liška @ 2019-05-16 10:02 UTC (permalink / raw) To: Martin Sebor, gcc-patches Hi. Maybe I've install the patches wrongly, but I see following error on ppc64le during bootstrap in stage2: /home/marxin/Programming/gcc/objdir/./prev-gcc/xg++ -B/home/marxin/Programming/gcc/objdir/./prev-gcc/ -B/usr/local/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++ -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknn-linux-gnu/libstdc++-v3/src/.libs -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/homearxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/home/marxin/Programming/gcc/libstdc++-v3/libsupc++ -L/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -L/home/marxin/Programming/gcc/objr/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-tribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libdecnumber -I../../gcc/../libcnumber/dpd -I../libdecnumber -I../../gcc/../libbacktrace -o tree.o -MT tree.o -MMD -MP -MF ./.deps/tree.TPo ../../gcc/tree.c ../../gcc/tree.c: In function ‘void omp_clause_check_failed(const_tree, const char*, int, const char*, omp_clause_code)’: ../../gcc/tree.c:10012:67: error: unknown conversion type character ‘k’ in format [-Werror=format=] 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " | ^ ../../gcc/tree.c:10013:10: error: format ‘%s’ expects argument of type ‘char*’, but argument 3 has type ‘int’ [-Werror=format=] 10013 | "in %s, at %s:%d", | ~^ | | | char* | %d ../../gcc/tree.c:10013:20: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘const char*’ [-Werror=format=] 10013 | "in %s, at %s:%d", | ~^ | | | int | %s 10014 | omp_clause_code_name[code], TREE_CODE (node), 10015 | function, trim_filename (file), line); | ~~~~~~~~~~~~~~~~~~~~ | | | const char* ../../gcc/tree.c:10012:19: error: too many arguments for format [-Werror=format-extra-args] 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 10013 | "in %s, at %s:%d", | ~~~~~~~~~~~~~~~~~ Martin ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 0/12] detect quoting and punctuation problems in diagnostics 2019-05-16 10:02 ` Martin Liška @ 2019-05-16 15:01 ` Martin Sebor 0 siblings, 0 replies; 4+ messages in thread From: Martin Sebor @ 2019-05-16 15:01 UTC (permalink / raw) To: Martin Liška, gcc-patches On 5/16/19 4:02 AM, Martin Liška wrote: > Hi. > > Maybe I've install the patches wrongly, but I see following error on ppc64le > during bootstrap in stage2: I also noticed it yesterday on x86_64. The %qk was vestige of an earlier attempt to use the pretty-printer to format TREE_CODEs. I have this in my tree that fixes it but let me post an updated patch. --- a/gcc/tree.c +++ b/gcc/tree.c @@ -10009,8 +10009,10 @@ void omp_clause_check_failed (const_tree node, const char *file, int line, const char *function, enum omp_clause_code code) { - internal_error ("tree check: expected omp_clause %s, have %s in %s, at %s:%d", - omp_clause_code_name[code], get_tree_code_name (TREE_CODE (node)), + internal_error ("tree check: expected %<omp_clause %s%>, have %qs " + "in %s, at %s:%d", + omp_clause_code_name[code], + get_tree_code_name (TREE_CODE (node)), function, trim_filename (file), line); } As a heads up, my latest log still shows a few testsuite failures that I need to clean up. Those I've looked at are all missing adjustments to expected dg-warning output. ! FAIL: 20_util/any/misc/any_cast_neg.cc (3: +3) ! FAIL: gcc.dg/gcc_diag-11.c (1: +1) ! FAIL: g++.dg/ubsan/pr63956.C (21: +21) ! FAIL: gnat.dg/inline3.adb (2: +2) ! FAIL: gnat.dg/inline5.adb (2: +2) ! FAIL: gnat.dg/inline7.adb (2: +2) ! FAIL: gnat.dg/inline9.adb (2: +2) ! FAIL: objc.dg/method-19.m (2: +2) ! FAIL: objc.dg/protocol-qualifier-2.m (2: +2) ! FAIL: obj-c++.dg/protocol-qualifier-2.mm (2: +2) Martin > > /home/marxin/Programming/gcc/objdir/./prev-gcc/xg++ -B/home/marxin/Programming/gcc/objdir/./prev-gcc/ -B/usr/local/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++ -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknn-linux-gnu/libstdc++-v3/src/.libs -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/homearxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/home/marxin/Programming/gcc/libstdc++-v3/libsupc++ -L/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -L/home/marxin/Programming/gcc/objr/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-tribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libdecnumber -I../../gcc/../libcnumber/dpd -I../libdecnumber -I../../gcc/../libbacktrace -o tree.o -MT tree.o -MMD -MP -MF ./.deps/tree.TPo ../../gcc/tree.c > ../../gcc/tree.c: In function ‘void omp_clause_check_failed(const_tree, const char*, int, const char*, omp_clause_code)’: > ../../gcc/tree.c:10012:67: error: unknown conversion type character ‘k’ in format [-Werror=format=] > 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " > | ^ > ../../gcc/tree.c:10013:10: error: format ‘%s’ expects argument of type ‘char*’, but argument 3 has type ‘int’ [-Werror=format=] > 10013 | "in %s, at %s:%d", > | ~^ > | | > | char* > | %d > ../../gcc/tree.c:10013:20: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘const char*’ [-Werror=format=] > 10013 | "in %s, at %s:%d", > | ~^ > | | > | int > | %s > 10014 | omp_clause_code_name[code], TREE_CODE (node), > 10015 | function, trim_filename (file), line); > | ~~~~~~~~~~~~~~~~~~~~ > | | > | const char* > ../../gcc/tree.c:10012:19: error: too many arguments for format [-Werror=format-extra-args] > 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > 10013 | "in %s, at %s:%d", > | ~~~~~~~~~~~~~~~~~ > > Martin > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-05-16 15:01 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-05-14 21:31 [PATCH 0/12] detect quoting and punctuation problems in diagnostics Martin Sebor 2019-05-15 11:40 ` Martin Liška 2019-05-16 10:02 ` Martin Liška 2019-05-16 15:01 ` Martin Sebor
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).