From: David Malcolm <dmalcolm@redhat.com>
To: Marek Polacek <polacek@redhat.com>
Cc: Joseph Myers <joseph@codesourcery.com>,
Jakub Jelinek <jakub@redhat.com>,
Martin Sebor <msebor@redhat.com>,
GCC Patches <gcc-patches@gcc.gnu.org>,
David Malcolm <dmalcolm@redhat.com>
Subject: [committed] libcpp: escape non-ASCII source bytes in -Wbidi-chars= [PR103026]
Date: Wed, 17 Nov 2021 17:45:14 -0500 [thread overview]
Message-ID: <20211117224515.2484625-1-dmalcolm@redhat.com> (raw)
In-Reply-To: <YZRxcWcQgVRRZz6p@redhat.com>
This flags rich_locations associated with -Wbidi-chars= so that
non-ASCII bytes will be escaped when printing the source lines
(using the diagnostics support I added in
r12-4825-gbd5e882cf6e0def3dd1bc106075d59a303fe0d1e).
In particular, this ensures that the printed source lines will
be pure ASCII, and thus the visual ordering of the characters
will be the same as the logical ordering.
Before:
Wbidi-chars-1.c: In function ‘main’:
Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
6 | /* } if (isAdmin) begin admins only */
| ^
Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
9 | /* end admins only { */
| ^
Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
6 | int LRE__PDF_\u202c;
| ^
Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
8 | int LRE_\u202a_PDF__;
| ^
Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
10 | const char *s1 = "LRE__PDF_\u202c";
| ^
Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
12 | const char *s2 = "LRE_\u202a_PDF_";
| ^
After:
Wbidi-chars-1.c: In function ‘main’:
Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
6 | /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
| ^
Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
9 | /* end admins only <U+202E> { <U+2066>*/
| ^
Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
6 | int LRE_<U+202A>_PDF_\u202c;
| ^
Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
8 | int LRE_\u202a_PDF_<U+202C>_;
| ^
Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
10 | const char *s1 = "LRE_<U+202A>_PDF_\u202c";
| ^
Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
12 | const char *s2 = "LRE_\u202a_PDF_<U+202C>";
| ^
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5355-g1a7f2c0774129750fdf73e9f1b78f0ce983c9ab3.
libcpp/ChangeLog:
PR preprocessor/103026
* lex.c (maybe_warn_bidi_on_close): Use a rich_location
and call set_escape_on_output (true) on it.
(maybe_warn_bidi_on_char): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
libcpp/lex.c | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 6a4fbce6030..8290bc637cd 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -1427,9 +1427,11 @@ maybe_warn_bidi_on_close (cpp_reader *pfile, const uchar *p)
const location_t loc
= linemap_position_for_column (pfile->line_table,
CPP_BUF_COLUMN (pfile->buffer, p));
- cpp_warning_with_line (pfile, CPP_W_BIDIRECTIONAL, loc, 0,
- "unpaired UTF-8 bidirectional control character "
- "detected");
+ rich_location rich_loc (pfile->line_table, loc);
+ rich_loc.set_escape_on_output (true);
+ cpp_warning_at (pfile, CPP_W_BIDIRECTIONAL, &rich_loc,
+ "unpaired UTF-8 bidirectional control character "
+ "detected");
}
/* We're done with this context. */
bidi::on_close ();
@@ -1454,6 +1456,9 @@ maybe_warn_bidi_on_char (cpp_reader *pfile, const uchar *p, bidi::kind kind,
const location_t loc
= linemap_position_for_column (pfile->line_table,
CPP_BUF_COLUMN (pfile->buffer, p));
+ rich_location rich_loc (pfile->line_table, loc);
+ rich_loc.set_escape_on_output (true);
+
/* It seems excessive to warn about a PDI/PDF that is closing
an opened context because we've already warned about the
opening character. Except warn when we have a UCN x UTF-8
@@ -1462,20 +1467,20 @@ maybe_warn_bidi_on_char (cpp_reader *pfile, const uchar *p, bidi::kind kind,
{
if (warn_bidi == bidirectional_unpaired
&& bidi::current_ctx_ucn_p () != ucn_p)
- cpp_warning_with_line (pfile, CPP_W_BIDIRECTIONAL, loc, 0,
- "UTF-8 vs UCN mismatch when closing "
- "a context by \"%s\"", bidi::to_str (kind));
+ cpp_warning_at (pfile, CPP_W_BIDIRECTIONAL, &rich_loc,
+ "UTF-8 vs UCN mismatch when closing "
+ "a context by \"%s\"", bidi::to_str (kind));
}
else if (warn_bidi == bidirectional_any)
{
if (kind == bidi::kind::PDF || kind == bidi::kind::PDI)
- cpp_warning_with_line (pfile, CPP_W_BIDIRECTIONAL, loc, 0,
- "\"%s\" is closing an unopened context",
- bidi::to_str (kind));
+ cpp_warning_at (pfile, CPP_W_BIDIRECTIONAL, &rich_loc,
+ "\"%s\" is closing an unopened context",
+ bidi::to_str (kind));
else
- cpp_warning_with_line (pfile, CPP_W_BIDIRECTIONAL, loc, 0,
- "found problematic Unicode character \"%s\"",
- bidi::to_str (kind));
+ cpp_warning_at (pfile, CPP_W_BIDIRECTIONAL, &rich_loc,
+ "found problematic Unicode character \"%s\"",
+ bidi::to_str (kind));
}
}
/* We're done with this context. */
--
2.26.3
next prev parent reply other threads:[~2021-11-17 22:46 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-01 16:36 [PATCH] libcpp: Implement -Wbidirectional for CVE-2021-42574 [PR103026] Marek Polacek
2021-11-01 22:10 ` Joseph Myers
2021-11-02 17:18 ` [PATCH v2] " Marek Polacek
2021-11-02 19:20 ` Martin Sebor
2021-11-02 19:52 ` Marek Polacek
2021-11-08 21:33 ` Marek Polacek
2021-11-15 17:28 ` [PATCH] libcpp: Implement -Wbidi-chars " Marek Polacek
2021-11-15 23:15 ` David Malcolm
2021-11-16 19:50 ` [PATCH v2] " Marek Polacek
2021-11-16 23:00 ` David Malcolm
2021-11-17 0:37 ` [PATCH v3] " Marek Polacek
2021-11-17 2:28 ` David Malcolm
2021-11-17 3:05 ` Marek Polacek
2021-11-17 22:45 ` David Malcolm [this message]
2021-11-17 22:45 ` [PATCH 2/2] libcpp: capture and underline ranges in -Wbidi-chars= [PR103026] David Malcolm
2021-11-17 23:01 ` Marek Polacek
2021-11-30 8:38 ` [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026] Stephan Bergmann
2021-11-30 13:26 ` Marek Polacek
2021-11-30 15:00 ` Stephan Bergmann
2021-11-30 15:27 ` Marek Polacek
2022-01-14 9:23 ` Stephan Bergmann
2022-01-14 13:28 ` Marek Polacek
2022-01-14 14:52 ` Stephan Bergmann
2021-11-02 20:57 ` [PATCH 0/2] Re: [PATCH] libcpp: Implement -Wbidirectional " David Malcolm
2021-11-02 20:58 ` [PATCH 1/2] Flag CPP_W_BIDIRECTIONAL so that source lines are escaped David Malcolm
2021-11-02 21:07 ` David Malcolm
2021-11-02 20:58 ` [PATCH 2/2] Capture locations of bidi chars and underline ranges David Malcolm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211117224515.2484625-1-dmalcolm@redhat.com \
--to=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=joseph@codesourcery.com \
--cc=msebor@redhat.com \
--cc=polacek@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).