public inbox for gcc-bugs@sourceware.org
* [Bug c++/112513] New: Misoptimization of argument
@ 2023-11-13 13:33 alexander.grund@tu-dresden.de
2023-11-13 17:53 ` [Bug c++/112513] " pinskia at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: alexander.grund@tu-dresden.de @ 2023-11-13 13:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513
Bug ID: 112513
Summary: Misoptimization of argument
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: alexander.grund@tu-dresden.de
Target Milestone: ---
Created attachment 56569
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56569&action=edit
Preprocessed source
In the NVIDIA NCCL library (https://github.com/NVIDIA/nccl) I came across a
SIGSEGV in __strncmp_sse42 that happens "sometimes" when compiled with GCC 12
at -O2 and higher, but does not happen at lower optimization levels or with
GCC 11.
The original stacktrace looks like this:
#0 0x00002aaaabd82e3a in __strcmp_sse42 () from /lib64/libc.so.6
#1 0x00002aab18d83a6e in xmlGetAttrIndex (index=<synthetic pointer>,
attrName=0x2aab18e2820c "familyid", node=0x2aae9c108160) at graph/xml.h:67
#2 xmlSetAttrInt (value=143, attrName=0x2aab18e2820c "familyid",
node=0x2aae9c108160) at graph/xml.h:167
#3 ncclTopoGetXmlFromCpu (cpuNode=cpuNode@entry=0x2aae9c108160,
xml=xml@entry=0x2aae9c0d1f20) at graph/xml.cc:436
Moving the `strncmp(key, attrName, MAX_STR_LEN) == 0` out into a separate
function to see the arguments in the debugger shows this backtrace:
#0 0x00002aaaabd83c00 in __strncmp_sse42 () from /lib64/libc.so.6
#1 0x00002aab18d75a9f in cmpFromXml (attrName=0x89300800 <error: Cannot access
memory at address 0x89300800>, key=0x2aaeac107eb0 "numaid") at graph/xml.h:65
#2 xmlGetAttrIndex (index=<synthetic pointer>, attrName=0x89300800 <error:
Cannot access memory at address 0x89300800>, node=0x2aaeac107db0) at
graph/xml.h:73
#3 xmlSetAttrInt (node=node@entry=0x2aaeac107db0,
attrName=attrName@entry=0x89300800 <error: Cannot access memory at address
0x89300800>, value=143) at graph/xml.h:174
#4 0x00002aab18d77de4 in ncclTopoGetXmlFromCpu
(cpuNode=cpuNode@entry=0x2aaeac107db0, xml=xml@entry=0x2aaeac0d1b70) at
graph/xml.cc:437
So it looks like the `attrName` parameter gets corrupted somehow. The callsite
of `xmlSetAttrInt` is `NCCLCHECK(xmlSetAttrInt(cpuNode, "familyid",
familyId));`, so that parameter is a string constant already used earlier by
`NCCLCHECK(xmlGetAttrIndex(cpuNode, "familyid", &index));`
I suspect the `index` parameter to be involved.
Many modifications cause the bug to disappear, such as removing the `NCCLCHECK`
macro (basically an `if(error) return error;`-wrapper) or adding
fprintf-statements into xmlGetAttrIndex or cmpFromXml.
The compile command is `g++ -fPIC -fvisibility=hidden -std=c++11 -O2 -g -ggdb3
-c graph/xml.cc`; the preprocessed source is attached. The test case needs
minimization, but since the crash only happens when the code is compiled into
a library used by a Python package from a script, I don't know how to do that.
So I hope there is something obvious to someone familiar with the
optimizations in GCC.
* [Bug c++/112513] Misoptimization of argument
From: pinskia at gcc dot gnu.org @ 2023-11-13 17:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Keywords| |inline-asm
Resolution|--- |INVALID
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
asm volatile("cpuid" : "=a" (cpuid1.val) : "a" (1) : "memory");
cpuid touches EAX, EBX, ECX, and EDX registers but in the above inline-asm,
only eax is marked as being touched.
https://faydoc.tripod.com/cpu/cpuid.htm
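One way to express this to the compiler is to declare all four registers as
outputs. The following is only a minimal sketch, not the NCCL code: the struct
`CpuidRegs` and the helper `cpuid_all_outputs` are hypothetical names made up
for illustration, and it assumes an x86 target built with GCC or Clang.

```cpp
#include <cstdint>

// Hypothetical container for the four registers written by CPUID.
struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

// Hypothetical helper (not part of NCCL): executes CPUID for `leaf`
// with subleaf 0 and returns everything the instruction wrote.
inline CpuidRegs cpuid_all_outputs(std::uint32_t leaf) {
    CpuidRegs r{};
    // "=a", "=b", "=c", "=d" bind the outputs to EAX, EBX, ECX and EDX, so
    // the compiler knows that all four registers are overwritten and will
    // not keep a live value (such as a pointer argument) in any of them.
    // ECX is also passed as an input because some leaves read a subleaf
    // index from it.
    asm volatile("cpuid"
                 : "=a"(r.eax), "=b"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
                 : "a"(leaf), "c"(0u));
    return r;
}
```

With every written register listed as an operand, no clobber entries for
EBX, ECX or EDX are needed.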
* [Bug c++/112513] Misoptimization of argument
From: alexander.grund@tu-dresden.de @ 2023-11-14 9:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513
Alexander Grund <alexander.grund@tu-dresden.de> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|INVALID |FIXED
--- Comment #2 from Alexander Grund <alexander.grund@tu-dresden.de> ---
I'm having trouble understanding the exact syntax of `asm`. It looks like
`"a" (...)` in the input/output list refers to a read/write of the EAX
register, "b" to EBX, etc., doesn't it?
But in the clobber list I'd need to use `"eax"`, not a..d. I haven't been able
to confirm this via the docs
(https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers).
I.e. I would have expected to be able to use the same "a".."d" there too.
Anyway if I understand you correctly the correct instruction would be: `asm
volatile("cpuid" : "=a" (cpuid1.val) : "a" (1) : "ebx", "ecx", "edx",
"memory");`
Did I get that right?
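For reference, the distinction asked about here can be sketched as follows:
as far as I understand GCC's extended asm, the single-letter constraints
("a", "b", ...) are only valid on input and output operands, while the
clobber list takes register names as strings. The helper name
`cpuid_eax_with_clobbers` is hypothetical, and the sketch assumes an x86
target and a leaf that does not read a subleaf from ECX (such as leaf 1).

```cpp
#include <cstdint>

// Hypothetical helper: returns only EAX of CPUID `leaf` and tells the
// compiler that the other three registers are destroyed.
inline std::uint32_t cpuid_eax_with_clobbers(std::uint32_t leaf) {
    std::uint32_t eax;
    asm volatile("cpuid"
                 : "=a"(eax)      // output operand: constraint letter "a" = EAX
                 : "a"(leaf)      // input operand: leaf number passed in EAX
                 : "ebx", "ecx", "edx",  // clobbers: register names, not letters
                   "memory");     // kept from the original statement
    return eax;
}
```

A register may not appear both as an operand and in the clobber list, which
is why EAX is absent from the clobbers here.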
* [Bug c++/112513] Misoptimization of argument
From: pinskia at gcc dot gnu.org @ 2023-11-14 10:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|FIXED |INVALID
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think that is correct. You could always use cpuid.h and __cpuid instead.
* [Bug c++/112513] Misoptimization of argument
From: alexander.grund@tu-dresden.de @ 2023-11-14 11:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112513
--- Comment #4 from Alexander Grund <alexander.grund@tu-dresden.de> ---
Thank you, I replaced that by
unsigned unused;
__cpuid(1, cpuid1.val, unused, unused, unused);
and it works in the setup I have.
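A variant of the same idea, shown only as a sketch: `<cpuid.h>` also provides
`__get_cpuid`, which checks that the requested leaf is supported and writes
through four pointers. Using a separate variable per register avoids naming
the same variable for several outputs. The helper name `cpuid_leaf1_eax` is
hypothetical, and the sketch assumes an x86 target with GCC or Clang.

```cpp
#include <cpuid.h>  // GCC/Clang x86 header providing __cpuid and __get_cpuid

// Hypothetical helper: returns EAX of CPUID leaf 1 (the processor signature
// with family, model and stepping), or 0 if the leaf is not supported.
inline unsigned cpuid_leaf1_eax() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    // __get_cpuid returns 0 when the requested leaf is unavailable.
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    return eax;
}
```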