From: "Fāng-ruì Sòng" <maskray@google.com>
To: Joseph Myers <joseph@codesourcery.com>
Cc: Paul Eggert <eggert@cs.ucla.edu>, libc-alpha@sourceware.org
Subject: Re: [PATCH] regex: Unnest nested functions in regcomp.c
Date: Wed, 27 Oct 2021 21:12:03 -0700 [thread overview]
Message-ID: <CAFP8O3K17-e0UmF3QT_5sJ7FxOE2Aft9g=iOF0KwEn-Fq-VMGA@mail.gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.22.394.2110271609080.1681686@digraph.polyomino.org.uk>
On Wed, Oct 27, 2021 at 9:11 AM Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Wed, 27 Oct 2021, Paul Eggert wrote:
>
> > On 10/26/21 22:29, Fangrui Song wrote:
> > > collseqwc, table_size, symb_table, extra are now initialized to appease
> > > GCC -Werror=maybe-uninitialized false positive.
> >
> > Are the diagnostics really false positives? As I understand it, the modified
> > code would have undefined behavior if these variables were not initialized.
Are sorry, the diagnostics are legitimate.
collseqwc, table_size, symb_table, extra are now passed as arguments
so they need to be initialized.
Previously they were accessed from `auto inline` nested functions.
With the delayed access GCC did not warn.
In file included from regex.c:74:
regcomp.c: In function ‘parse_expression’:
regcomp.c:3102:11: error: ‘table_size’ may be used uninitialized in
this function [-Werror=maybe-uninitialized]
3102 | int32_t table_size;
| ^~~~~~~~~~
I don't know how to compare performance difference but the new version
is just 16 bytes larger in .text on x86-64 (gcc -O2).
% ~/projects/bloaty/Release/bloaty /tmp/c/regex1.o -- /tmp/c/regex.o
FILE SIZE VM SIZE
-------------- --------------
+118% +20 [ = ] 0 [Unmapped]
+0.0% +16 +0.0% +16 .text
-5.6% -4 [ = ] 0 .data
+0.0% +32 +0.0% +16 TOTAL
The code move apparently triggers large assembly move in GCC's code generation.
The 16 byte .text increase is likely due to the now needed zero
initialization for the four variables.
% diff -u =(llvm-objdump -d --symbolize-operands -M intel
--no-leading-addr --no-show-raw-insn /tmp/c/regex.o) =(llvm-objdump -d
--symbolize-operands -M intel --no-leading-addr --no-show-raw-insn
/tmp/c/regex1.o)
...
@@ -13236,7 +13236,11 @@
mov r15d, ebx
test ebx, ebx
jne <L33>
-<L102>:
+ mov qword ptr [rsp + 96], 0
+ mov qword ptr [rsp + 120], 0
+ mov dword ptr [rsp + 116], 0
+ mov qword ptr [rsp + 88], 0
+<L94>:
mov esi, 1
mov edi, 32
call <L34>
> And if they are false positives, the glibc convention in such cases is
> generally to use the DIAG_* macros from libc-diag.h to suppress the
> warnings, with appropriate comments explaining why they are false
> positives, rather than adding an initialization that should not be
> necessary. (Sometimes another approach is more appropriate to warning
> suppression - for example, a call to __builtin_unreachable that enables
> the compiler to see that certain paths through the code cannot actually
> occur.)
Thanks for the tip.
> --
> Joseph S. Myers
> joseph@codesourcery.com
next prev parent reply other threads:[~2021-10-28 4:12 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-27 5:29 Fangrui Song
2021-10-27 9:35 ` Paul Eggert
2021-10-27 16:10 ` Joseph Myers
2021-10-28 4:12 ` Fāng-ruì Sòng [this message]
2021-11-02 2:40 ` Carlos O'Donell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAFP8O3K17-e0UmF3QT_5sJ7FxOE2Aft9g=iOF0KwEn-Fq-VMGA@mail.gmail.com' \
--to=maskray@google.com \
--cc=eggert@cs.ucla.edu \
--cc=joseph@codesourcery.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).