public inbox for gcc-bugs@sourceware.org
* [Bug c++/112513] New: Misoptimization of argument
@ 2023-11-13 13:33 alexander.grund@tu-dresden.de
  2023-11-13 17:53 ` [Bug c++/112513] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: alexander.grund@tu-dresden.de @ 2023-11-13 13:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513

            Bug ID: 112513
           Summary: Misoptimization of argument
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alexander.grund@tu-dresden.de
  Target Milestone: ---

Created attachment 56569
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56569&action=edit
Preprocessed source

In the NVIDIA NCCL library (https://github.com/NVIDIA/nccl) I came across a
SIGSEGV in __strncmp_sse42 that happens "sometimes" when compiled with GCC 12
at -O2 and higher, but does not happen at lower optimization levels or with
GCC 11.

The original stacktrace looks like this:
#0  0x00002aaaabd82e3a in __strcmp_sse42 () from /lib64/libc.so.6
#1  0x00002aab18d83a6e in xmlGetAttrIndex (index=<synthetic pointer>,
attrName=0x2aab18e2820c "familyid", node=0x2aae9c108160) at graph/xml.h:67
#2  xmlSetAttrInt (value=143, attrName=0x2aab18e2820c "familyid",
node=0x2aae9c108160) at graph/xml.h:167
#3  ncclTopoGetXmlFromCpu (cpuNode=cpuNode@entry=0x2aae9c108160,
xml=xml@entry=0x2aae9c0d1f20) at graph/xml.cc:436

Moving the `strncmp(key, attrName, MAX_STR_LEN) == 0` out into a separate
function to see the arguments in the debugger shows this backtrace:
#0  0x00002aaaabd83c00 in __strncmp_sse42 () from /lib64/libc.so.6
#1  0x00002aab18d75a9f in cmpFromXml (attrName=0x89300800 <error: Cannot access
memory at address 0x89300800>, key=0x2aaeac107eb0 "numaid") at graph/xml.h:65
#2  xmlGetAttrIndex (index=<synthetic pointer>, attrName=0x89300800 <error:
Cannot access memory at address 0x89300800>, node=0x2aaeac107db0) at
graph/xml.h:73
#3  xmlSetAttrInt (node=node@entry=0x2aaeac107db0,
attrName=attrName@entry=0x89300800 <error: Cannot access memory at address
0x89300800>, value=143) at graph/xml.h:174
#4  0x00002aab18d77de4 in ncclTopoGetXmlFromCpu
(cpuNode=cpuNode@entry=0x2aaeac107db0, xml=xml@entry=0x2aaeac0d1b70) at
graph/xml.cc:437

So it looks like the `attrName` parameter gets corrupted somehow. The callsite
of `xmlSetAttrInt` is `NCCLCHECK(xmlSetAttrInt(cpuNode, "familyid",
familyId));`, so that parameter is a string constant already used earlier by
`NCCLCHECK(xmlGetAttrIndex(cpuNode, "familyid", &index));`

I suspect the `index` parameter is involved.
Many modifications make the bug disappear, such as removing the `NCCLCHECK`
macro (basically an `if(error) return error;` wrapper) or adding fprintf
statements to xmlGetAttrIndex or cmpFromXml.

The compile command is `g++ -fPIC -fvisibility=hidden -std=c++11 -O2 -g -ggdb3
-c graph/xml.cc`; the preprocessed source is attached. It needs minimization,
but since the crash only happens when the code is compiled into a library that
is used by a Python package from a script, I don't know how to reduce it. So I
hope there is something obvious to someone familiar with the optimizations in
GCC.

* [Bug c++/112513] Misoptimization of argument
  2023-11-13 13:33 [Bug c++/112513] New: Misoptimization of argument alexander.grund@tu-dresden.de
@ 2023-11-13 17:53 ` pinskia at gcc dot gnu.org
  2023-11-14  9:52 ` alexander.grund@tu-dresden.de
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-11-13 17:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
           Keywords|                            |inline-asm
         Resolution|---                         |INVALID

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
    asm volatile("cpuid" : "=a" (cpuid1.val) : "a" (1) : "memory");

cpuid touches EAX, EBX, ECX, and EDX registers but in the above inline-asm,
only eax is marked as being touched.

https://faydoc.tripod.com/cpu/cpuid.htm
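
For illustration, a minimal sketch of an inline-asm statement that tells the
compiler about all four registers by listing each as an output. The helper
name `cpuid_leaf` and the struct `CpuidRegs` are hypothetical and do not come
from NCCL; the sketch assumes GCC or a compatible compiler on x86 and simply
reports failure elsewhere.

```cpp
#include <cstdint>

// Hypothetical container for the four registers written by CPUID.
struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

// Hypothetical helper: executes CPUID for `leaf` and stores the results.
// Returns false when not built with a GCC-compatible compiler for x86.
inline bool cpuid_leaf(std::uint32_t leaf, CpuidRegs& out) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // Every register that CPUID writes appears as an output operand, so the
    // compiler knows it cannot keep live values in EBX, ECX or EDX across
    // the instruction.
    asm volatile("cpuid"
                 : "=a"(out.eax), "=b"(out.ebx), "=c"(out.ecx), "=d"(out.edx)
                 : "a"(leaf));
    return true;
#else
    (void)leaf;
    out = CpuidRegs{0, 0, 0, 0};
    return false;
#endif
}
```

Calling `cpuid_leaf(1, regs)` would then leave the value the original code
stored in `cpuid1.val` in `regs.eax`. Listing the registers as outputs is one
option; naming them in the clobber list is another when the values are not
needed.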

* [Bug c++/112513] Misoptimization of argument
  2023-11-13 13:33 [Bug c++/112513] New: Misoptimization of argument alexander.grund@tu-dresden.de
  2023-11-13 17:53 ` [Bug c++/112513] " pinskia at gcc dot gnu.org
@ 2023-11-14  9:52 ` alexander.grund@tu-dresden.de
  2023-11-14 10:00 ` pinskia at gcc dot gnu.org
  2023-11-14 11:50 ` alexander.grund@tu-dresden.de
  3 siblings, 0 replies; 5+ messages in thread
From: alexander.grund@tu-dresden.de @ 2023-11-14  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513

Alexander Grund <alexander.grund@tu-dresden.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |FIXED

--- Comment #2 from Alexander Grund <alexander.grund@tu-dresden.de> ---
I'm having trouble understanding the exact syntax of `asm`. It looks like
`a(...)` in the input/output list refers to read/write of the EAX register and
"b" for EBX etc., doesn't it?
But in the clobber list I'd need to use `"eax"`, not a..d. I haven't been able
to confirm this via the docs
(https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers).
I.e. I would have expected to be able to use the same "a".."d" there too.

Anyway if I understand you correctly the correct instruction would be: `asm
volatile("cpuid" : "=a" (cpuid1.val) : "a" (1) : "ebx", "ecx", "edx",
"memory");`

Did I get that right?


* [Bug c++/112513] Misoptimization of argument
  2023-11-13 13:33 [Bug c++/112513] New: Misoptimization of argument alexander.grund@tu-dresden.de
  2023-11-13 17:53 ` [Bug c++/112513] " pinskia at gcc dot gnu.org
  2023-11-14  9:52 ` alexander.grund@tu-dresden.de
@ 2023-11-14 10:00 ` pinskia at gcc dot gnu.org
  2023-11-14 11:50 ` alexander.grund@tu-dresden.de
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-11-14 10:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |INVALID

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think that is correct. You could always use cpuid.h and __cpuid instead.

* [Bug c++/112513] Misoptimization of argument
  2023-11-13 13:33 [Bug c++/112513] New: Misoptimization of argument alexander.grund@tu-dresden.de
                   ` (2 preceding siblings ...)
  2023-11-14 10:00 ` pinskia at gcc dot gnu.org
@ 2023-11-14 11:50 ` alexander.grund@tu-dresden.de
  3 siblings, 0 replies; 5+ messages in thread
From: alexander.grund@tu-dresden.de @ 2023-11-14 11:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513

--- Comment #4 from Alexander Grund <alexander.grund@tu-dresden.de> ---
Thank you, I replaced that by

    unsigned unused;
    __cpuid(1, cpuid1.val, unused, unused, unused);

and it works in the setup I have.
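
A slightly more defensive variant of the same idea, as a sketch only: it uses
`__get_cpuid` from `cpuid.h`, which returns 0 when the requested leaf is not
supported, and gives each output register its own variable instead of sharing
one `unused`. The helper name `read_cpuid1_eax` is hypothetical, and the code
assumes GCC or Clang on x86.

```cpp
#include <cstdint>
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID_H_SKETCH 1
#endif

// Hypothetical helper: stores EAX of CPUID leaf 1 in `eax_out`.
// Returns false if the leaf is unsupported or the target is not x86.
inline bool read_cpuid1_eax(std::uint32_t& eax_out) {
#ifdef HAVE_CPUID_H_SKETCH
    // One distinct variable per output register.
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
        return false;
    }
    (void)ebx; (void)ecx; (void)edx;  // not needed here
    eax_out = eax;
    return true;
#else
    eax_out = 0;
    return false;
#endif
}
```

The value written to `eax_out` corresponds to what the snippet above stores
in `cpuid1.val`.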
