public inbox for gcc-bugs@sourceware.org
* [Bug c++/112513] New: Misoptimization of argument
@ 2023-11-13 13:33 alexander.grund@tu-dresden.de
  2023-11-13 17:53 ` [Bug c++/112513] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: alexander.grund@tu-dresden.de @ 2023-11-13 13:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513

            Bug ID: 112513
           Summary: Misoptimization of argument
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alexander.grund@tu-dresden.de
  Target Milestone: ---

Created attachment 56569
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56569&action=edit
Preprocessed source

In the NVIDIA NCCL library (https://github.com/NVIDIA/nccl) I came across a
SIGSEGV in __strncmp_sse42 that happens "sometimes" when the code is compiled
with GCC 12 at -O2 and higher, but does not happen at lower optimization
levels or with GCC 11.

The original stacktrace looks like this:
#0  0x00002aaaabd82e3a in __strcmp_sse42 () from /lib64/libc.so.6
#1  0x00002aab18d83a6e in xmlGetAttrIndex (index=<synthetic pointer>,
attrName=0x2aab18e2820c "familyid", node=0x2aae9c108160) at graph/xml.h:67
#2  xmlSetAttrInt (value=143, attrName=0x2aab18e2820c "familyid",
node=0x2aae9c108160) at graph/xml.h:167
#3  ncclTopoGetXmlFromCpu (cpuNode=cpuNode@entry=0x2aae9c108160,
xml=xml@entry=0x2aae9c0d1f20) at graph/xml.cc:436

Moving the `strncmp(key, attrName, MAX_STR_LEN) == 0` out into a separate
function to see the arguments in the debugger shows this backtrace:
#0  0x00002aaaabd83c00 in __strncmp_sse42 () from /lib64/libc.so.6
#1  0x00002aab18d75a9f in cmpFromXml (attrName=0x89300800 <error: Cannot access
memory at address 0x89300800>, key=0x2aaeac107eb0 "numaid") at graph/xml.h:65
#2  xmlGetAttrIndex (index=<synthetic pointer>, attrName=0x89300800 <error:
Cannot access memory at address 0x89300800>, node=0x2aaeac107db0) at
graph/xml.h:73
#3  xmlSetAttrInt (node=node@entry=0x2aaeac107db0,
attrName=attrName@entry=0x89300800 <error: Cannot access memory at address
0x89300800>, value=143) at graph/xml.h:174
#4  0x00002aab18d77de4 in ncclTopoGetXmlFromCpu
(cpuNode=cpuNode@entry=0x2aaeac107db0, xml=xml@entry=0x2aaeac0d1b70) at
graph/xml.cc:437
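
For reference, the extracted helper might look roughly like the sketch below.
This is an assumption based only on the frame shown in the backtrace: the
`noinline` attribute and the value given for `MAX_STR_LEN` are guesses and are
not taken from the attached source.

```cpp
#include <cstring>

// Assumed value for illustration; the real constant comes from graph/xml.h.
#define MAX_STR_LEN 255

// Hypothetical reconstruction of the extracted comparison. noinline is an
// assumption: it keeps the call in its own frame so that the debugger can
// show `key` and `attrName` as separate arguments.
__attribute__((noinline))
bool cmpFromXml(const char* key, const char* attrName) {
  return strncmp(key, attrName, MAX_STR_LEN) == 0;
}
```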

So it looks like the `attrName` parameter gets corrupted somehow. The callsite
of `xmlSetAttrInt` is `NCCLCHECK(xmlSetAttrInt(cpuNode, "familyid",
familyId));`, so that parameter is a string constant already used earlier by
`NCCLCHECK(xmlGetAttrIndex(cpuNode, "familyid", &index));`

I suspect the `index` parameter is involved.
Many modifications make the bug disappear, such as removing the `NCCLCHECK`
macro (basically an `if (error) return error;` wrapper) or adding fprintf
statements to xmlGetAttrIndex or cmpFromXml.
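
To make the call pattern concrete, here is a simplified, self-contained sketch
of its shape. It is not the NCCL source: `Result`, `Node`, `Attr`, `CHECK`,
`getAttrIndex`, `setAttrInt` and both size constants are hypothetical
stand-ins for the real types, for `NCCLCHECK`, and for `xmlGetAttrIndex` and
`xmlSetAttrInt`. It only illustrates an out-parameter `index` filled in behind
an early-return wrapper, and is not expected to reproduce the crash.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical stand-ins; the real definitions are in NCCL, not in this report.
enum Result { Success = 0, Failure = 1 };
constexpr int kMaxStrLen = 255;  // assumed size
constexpr int kMaxAttrs = 16;    // assumed size

struct Attr { char key[kMaxStrLen + 1]; char value[kMaxStrLen + 1]; };
struct Node { Attr attrs[kMaxAttrs]; int nAttrs; };

// The wrapper as described in the report: return early if the call fails.
#define CHECK(call) \
  do { Result res_ = (call); if (res_ != Success) return res_; } while (0)

// Writes the position of attrName into *index, or -1 if it is absent.
static Result getAttrIndex(Node* node, const char* attrName, int* index) {
  *index = -1;
  for (int a = 0; a < node->nAttrs; a++) {
    if (strncmp(node->attrs[a].key, attrName, kMaxStrLen) == 0) {
      *index = a;
      break;
    }
  }
  return Success;
}

// Looks the attribute up, appends it if missing, then stores the value.
static Result setAttrInt(Node* node, const char* attrName, int value) {
  int index;
  CHECK(getAttrIndex(node, attrName, &index));
  if (index == -1) {
    if (node->nAttrs >= kMaxAttrs) return Failure;
    index = node->nAttrs++;
    strncpy(node->attrs[index].key, attrName, kMaxStrLen);
    node->attrs[index].key[kMaxStrLen] = '\0';
  }
  snprintf(node->attrs[index].value, sizeof(node->attrs[index].value), "%d", value);
  return Success;
}
```

In this sketch `attrName` is passed through unchanged from the caller to the
`strncmp` call, which is the value that appears corrupted in the second
backtrace.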

The compile command is `g++ -fPIC -fvisibility=hidden -std=c++11 -O2 -g -ggdb3
-c graph/xml.cc`; the preprocessed source is attached. It needs minimization,
but as the crash only happens when the code is compiled into a library used by
a Python package from a script, I don't know how to do that. So I hope there
will be something obvious to someone familiar with the optimizations in GCC.

Thread overview: 5+ messages
2023-11-13 13:33 [Bug c++/112513] New: Misoptimization of argument alexander.grund@tu-dresden.de
2023-11-13 17:53 ` [Bug c++/112513] " pinskia at gcc dot gnu.org
2023-11-14  9:52 ` alexander.grund@tu-dresden.de
2023-11-14 10:00 ` pinskia at gcc dot gnu.org
2023-11-14 11:50 ` alexander.grund@tu-dresden.de
