From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <0cfee94e654530c36615d8445c74d22e125915a1.camel@redhat.com>
Subject: Re: [PATCH 1/2] Flag CPP_W_BIDIRECTIONAL so that source lines are escaped
From: David Malcolm
To: Marek Polacek, GCC Patches
Cc: Joseph Myers, Jakub Jelinek
Date: Tue, 02 Nov 2021 17:07:25 -0400
In-Reply-To: <20211102205801.1202228-2-dmalcolm@redhat.com>
References: <20211101163652.36794-1-polacek@redhat.com>
 <20211102205801.1202228-1-dmalcolm@redhat.com>
 <20211102205801.1202228-2-dmalcolm@redhat.com>
User-Agent: Evolution 3.38.4 (3.38.4-1.fc33)
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
List-Id: Gcc-patches mailing list

On Tue, 2021-11-02 at 16:58 -0400, David Malcolm wrote:
> Before:
>
>   Wbidirectional-1.c: In function ‘main’:
>   Wbidirectional-1.c:6:43: warning: unpaired UTF-8 bidirectional character detected [-Wbidirectional=]
>       6 |     /*‮ } ⁦if (isAdmin)⁩ ⁦ begin admins only */
>         |                                           ^
>   Wbidirectional-1.c:9:28: warning: unpaired UTF-8 bidirectional character detected [-Wbidirectional=]
>       9 |     /* end admins only ‮ { ⁦*/
>         |                            ^
>
>   Wbidirectional-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidirectional=]
>       6 | int LRE_‪_PDF_\u202c;
>         |               ^
>
> After setting rich_loc.set_escape_on_output (true):
>
>   Wbidirectional-1.c:6:43: warning: unpaired UTF-8 bidirectional character detected [-Wbidirectional=]
>       6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
>         |                                                                           ^
>   Wbidirectional-1.c:9:28: warning: unpaired UTF-8 bidirectional character detected [-Wbidirectional=]
>       9 |     /* end admins only <U+202E> { <U+2066>*/
>         |                                            ^
>
>   Wbidirectional-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidirectional=]
>       6 | int LRE_<U+202A>_PDF_\u202c;
>         |                       ^
>
> libcpp/ChangeLog:
>         * lex.c (maybe_warn_bidi_on_close): Use a rich_location
>         and call set_escape_on_output (true) on it.
>         (maybe_warn_bidi_on_char): Likewise.
>
> Signed-off-by: David Malcolm

[...snip...]

To be more explicit: part of the benefit of escaping non-ASCII bytes in
the source line is that it further mitigates CVE-2021-42574, since it
"defangs" the bidi control characters, turning everything into ASCII so
that the user can see the logical ordering of the characters directly.

A similar consideration applies to homoglyph attacks: once escaped, a
lookalike non-ASCII character no longer resembles its ASCII counterpart.

Dave
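
As a rough illustration of the "defanging" idea (a sketch in Python, not the libcpp/diagnostics code; the helper name escape_non_ascii is hypothetical), replacing every non-ASCII code point with a visible <U+XXXX> escape leaves a line that is pure ASCII and therefore always displayed in logical order:

```python
def escape_non_ascii(line: str) -> str:
    """Return LINE with each non-ASCII code point shown as <U+XXXX>.

    Hypothetical helper for illustration only; it works on decoded
    code points and does not handle invalid UTF-8 bytes.
    """
    return "".join(
        ch if ord(ch) < 0x80 else f"<U+{ord(ch):04X}>"
        for ch in line
    )


# U+202A (LEFT-TO-RIGHT EMBEDDING) hidden inside an identifier; the
# trailing \u202c is a literal UCN spelled in ASCII, so it is untouched.
src = "int LRE_\u202a_PDF_\\u202c;"
print(escape_non_ascii(src))  # int LRE_<U+202A>_PDF_\u202c;

# Homoglyph case: Cyrillic U+0430 looks like Latin "a" until escaped.
print(escape_non_ascii("is\u0430dmin"))  # is<U+0430>dmin
```

The same transformation covers both cases mentioned above: bidi controls become visible tokens that cannot reorder the surrounding text, and homoglyphs stop looking like the ASCII letters they imitate.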