public inbox for gcc-bugs@sourceware.org
* [Bug target/113729] New: Missing APX NDD optimization
@ 2024-02-02 21:44 hjl.tools at gmail dot com
2024-02-04 2:04 ` [Bug target/113729] " liuhongt at gcc dot gnu.org
2024-02-04 7:29 ` liuhongt at gcc dot gnu.org
0 siblings, 2 replies; 3+ messages in thread
From: hjl.tools at gmail dot com @ 2024-02-02 21:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729
Bug ID: 113729
Summary: Missing APX NDD optimization
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, hongyuw at gcc dot gnu.org
Target Milestone: ---
Target: x86-64
APX spec has
---
Unlike the merge-upper behavior at a destination register of a typical x86
integer instruction when OSIZE is 8b or 16b, the NDD register is always
zero-uppered
---
But GCC 14 generates:
[hjl@gnu-tgl-3 pr113711]$ cat b.c
extern unsigned char b;
unsigned int
foo (void)
{
return 200 + b;
}
[hjl@gnu-tgl-3 pr113711]$ make b.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -mapxf
-O3 -S b.c
[hjl@gnu-tgl-3 pr113711]$ cat b.s
.file "b.c"
.text
.p2align 4
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
movzbl b(%rip), %eax
addl $200, %eax
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.ident "GCC: (GNU) 14.0.1 20240202 (experimental)"
.section .note.G
[hjl@gnu-tgl-3 pr113711]$
addb $200, b(%rip), %al
should be generated.
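
To make the quoted spec text concrete, here is a minimal C sketch of how a destination register write could be modeled. It is a software model only, not GCC or hardware code; the function name `write_dest` and its parameters are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model of an x86-64 integer destination write.
 *   old_reg    : register contents before the instruction
 *   result     : value produced by the operation
 *   osize_bits : operand size (8, 16, 32 or 64)
 *   is_ndd     : nonzero for an APX NDD destination
 */
static uint64_t write_dest(uint64_t old_reg, uint64_t result,
                           unsigned osize_bits, int is_ndd)
{
    uint64_t mask = (osize_bits == 64)
                        ? ~UINT64_C(0)
                        : ((UINT64_C(1) << osize_bits) - 1);

    result &= mask;

    /* 32-bit writes always zero the upper half; NDD does so for 8b/16b too. */
    if (osize_bits == 32 || is_ndd)
        return result;

    /* Legacy 8b/16b write: bits above the operand size are merged in. */
    return (old_reg & ~mask) | result;
}
```

With an old register value of 0xDEADBEEF00001234 and an 8-bit result of 0x90, the legacy path keeps the stale upper bits, while the NDD path leaves just 0x90 in the full register.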
* [Bug target/113729] Missing APX NDD optimization
From: liuhongt at gcc dot gnu.org @ 2024-02-04 2:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729
Hongtao Liu <liuhongt at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
--- Comment #1 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
At the GIMPLE level the expression is (int) b + 200, which differs from
(int)(unsigned char)(b + 200) when the 8-bit addition overflows, e.g. b = 200
(400 versus 144).
For
extern unsigned char b;
unsigned char
foo (void)
{
return 200 + b;
}
GCC already generates
sub al, BYTE PTR b[rip], 56
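
The difference between the two forms can be checked directly in C. The sketch below is only an illustration; the helper names are hypothetical and do not come from GCC.

```c
/* The original test case: b is promoted to int before the addition. */
static unsigned int add_after_promotion(unsigned char b)
{
    return 200 + b;
}

/* What an 8-bit add followed by zero extension would compute. */
static unsigned int add_8bit_then_zero_extend(unsigned char b)
{
    return (unsigned char)(200 + b);
}

/* Count the 8-bit inputs for which the two functions disagree. */
static int count_mismatches(void)
{
    int n = 0;

    for (int b = 0; b <= 255; b++) {
        unsigned char v = (unsigned char)b;

        if (add_after_promotion(v) != add_8bit_then_zero_extend(v))
            n++;
    }
    return n;
}
```

For b = 200 the first function returns 400 and the second returns 144. The two agree only while 200 + b fits in 8 bits, i.e. for b < 56.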
* [Bug target/113729] Missing APX NDD optimization
From: liuhongt at gcc dot gnu.org @ 2024-02-04 7:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729
--- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
extern unsigned char b;
int
foo (void)
{
return (unsigned char)(200 + b);
}
gcc -O2 -mapxf
foo():
subb $56, b(%rip), %al
movzbl %al, %eax
ret
And this can be optimized to
foo():
subb $56, b(%rip), %al
ret
Note, if we want to optimize it in pass_combine, the pattern needs to generate
explicit APX NDD instructions, since the APX non-NDD form will not clear the
upper bits.
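
As a rough check of why the movzbl is redundant only for the NDD form, the sketch below models both variants of the subb in C. It is a software model under stated assumptions, not compiler code: the function names and the stale register value are hypothetical.

```c
#include <stdint.h>

/* The test case from this comment. */
static int foo_reference(unsigned char b)
{
    return (unsigned char)(200 + b);
}

/* Model of NDD "subb $56, b(%rip), %al": the 8-bit result is zero-uppered. */
static uint32_t foo_ndd_model(unsigned char b)
{
    uint64_t rax = (uint8_t)(b - 56);

    return (uint32_t)rax; /* value read back as %eax */
}

/* Model of a non-NDD 8-bit write: stale bits above %al are preserved. */
static uint32_t foo_legacy_model(uint64_t stale_rax, unsigned char b)
{
    uint64_t rax = (stale_rax & ~UINT64_C(0xFF)) | (uint8_t)(b - 56);

    return (uint32_t)rax;
}

/* Returns 1 if the NDD model matches the reference for every input. */
static int ndd_model_matches_everywhere(void)
{
    for (int b = 0; b <= 255; b++)
        if ((uint32_t)foo_reference((unsigned char)b)
            != foo_ndd_model((unsigned char)b))
            return 0;
    return 1;
}
```

Subtracting 56 and adding 200 are the same operation modulo 256, so the NDD model agrees with the reference for all 256 inputs without a separate zero extension. The legacy model returns the wrong value as soon as the stale register contents have any bit set above bit 7.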