public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path
@ 2021-06-20 9:37 steinar+gcc at gunderson dot no
2021-06-20 9:41 ` [Bug tree-optimization/101139] " pinskia at gcc dot gnu.org
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: steinar+gcc at gunderson dot no @ 2021-06-20 9:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139
Bug ID: 101139
Summary: Unable to remove double byteswap in fast path
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: steinar+gcc at gunderson dot no
Target Milestone: ---
The following code is reduced from a real interpreter:
extern void (*a[])();
int d, e, h, l;
typedef struct {
char ab;
} f;
f g;
short i();
short m68ki_read_imm_16() {
short j, k;
int b = d;
f f = g;
if (b < h)
return __builtin_bswap16((&f.ab)[0]);
k = i();
short c = k;
j = __builtin_bswap16(c);
return j;
}
int b() {
short m;
do {
m = m68ki_read_imm_16();
short c = m;
l = __builtin_bswap16(c);
a[l]();
} while (e);
return e;
}
Compiling with arm-linux-gnueabihf-gcc-10 -O2 yields this interesting sequence
in the function:
b .L11
.L15:
ldrb r3, [r5, #8] @ zero_extendqisi2
rev16 r3, r3
uxth r3, r3
.L10:
rev16 r3, r3
uxth r3, r3
The original code intention was to have a reusable function that returned in
big-endian, but that a specific use of it would be able to ignore endianness
into a table lookup, removing the double-swap entirely. GCC can normally do
that, but it seems that the branch in m68ki_read_imm_16() somehow gets in the
way. Just to be clear, I expect zero rev16 instructions altogether in b() when
m68ki_read_imm_16() is inlined.
The problem is not ARM-specific; x86 shows a similar problematic sequence:
leaq a(%rip), %rbx
jmp .L11
.p2align 4,,10
.p2align 3
.L15:
movsbw g(%rip), %ax
rolw $8, %ax
.L10:
rolw $8, %ax
movzwl %ax, %edx
Also verified with
gcc version 12.0.0 20210527 (experimental) [master revision
262e75d22c3:7bb6b9b2f47:9d3a953ec4d2695e9a6bfa5f22655e2aea47a973] (Debian
20210527-1)
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/101139] Unable to remove double byteswap in fast path
2021-06-20 9:37 [Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path steinar+gcc at gunderson dot no
@ 2021-06-20 9:41 ` pinskia at gcc dot gnu.org
2021-06-20 9:48 ` pinskia at gcc dot gnu.org
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-06-20 9:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2021-06-20
Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
Depends on| |13563
Status|UNCONFIRMED |ASSIGNED
Keywords| |missed-optimization
Ever confirmed|0 |1
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
if (b_13 < h.0_15)
goto <bb 4>; [51.12%]
else
goto <bb 5>; [48.88%]
<bb 4> [local count: 548896825]:
_16 = (short unsigned int) f$ab_14;
_17 = (int) _16;
_18 = __builtin_bswap16 (_17);
goto <bb 6>; [100.00%]
<bb 5> [local count: 524845000]:
k_22 = i ();
c.1_23 = (short unsigned int) k_22;
_24 = (int) c.1_23;
_25 = __builtin_bswap16 (_24);
<bb 6> [local count: 1073741824]:
# prephitmp_32 = PHI <_18(4), _25(5)>
Basically the same issue as PR 13563.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=13563
[Bug 13563] if-conversion not agressive enough
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/101139] Unable to remove double byteswap in fast path
2021-06-20 9:37 [Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path steinar+gcc at gunderson dot no
2021-06-20 9:41 ` [Bug tree-optimization/101139] " pinskia at gcc dot gnu.org
@ 2021-06-20 9:48 ` pinskia at gcc dot gnu.org
2021-06-21 6:37 ` rguenth at gcc dot gnu.org
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-06-20 9:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> if (b_13 < h.0_15)
> goto <bb 4>; [51.12%]
> else
> goto <bb 5>; [48.88%]
>
> <bb 4> [local count: 548896825]:
> _16 = (short unsigned int) f$ab_14;
> _17 = (int) _16;
> _18 = __builtin_bswap16 (_17);
> goto <bb 6>; [100.00%]
>
> <bb 5> [local count: 524845000]:
> k_22 = i ();
> c.1_23 = (short unsigned int) k_22;
> _24 = (int) c.1_23;
> _25 = __builtin_bswap16 (_24);
>
> <bb 6> [local count: 1073741824]:
> # prephitmp_32 = PHI <_18(4), _25(5)>
>
> Basically the same issue as PR 13563.
Once that issue is fixed we should get:
if (b_13 < h.0_15)
goto <bb 4>; [51.12%]
else
goto <bb 5>; [48.88%]
<bb 4> [local count: 548896825]:
_16 = (short unsigned int) f$ab_14;
goto <bb 6>; [100.00%]
<bb 5> [local count: 524845000]:
k_22 = i ();
c.1_23 = (short unsigned int) k_22;
<bb 6> [local count: 1073741824]:
# c.1_23 = PHI <_16(4), c.1_23(5)>
_24 = (int) c.1_23;
prephitmp_32 = __builtin_bswap16 (_32);
_2 = (int) prephitmp_32;
_3 = __builtin_bswap16 (_2);
_4 = (int) _3;
Which then should just optimize later on to: if (b_13 < h.0_15)
goto <bb 4>; [51.12%]
else
goto <bb 5>; [48.88%]
<bb 4> [local count: 548896825]:
_16 = (short unsigned int) f$ab_14;
goto <bb 6>; [100.00%]
<bb 5> [local count: 524845000]:
k_22 = i ();
c.1_23 = (short unsigned int) k_22;
<bb 6> [local count: 1073741824]:
# c.1_23 = PHI <_16(4), c.1_23(5)>
_24 = (int) c.1_23;
_4 = _24;
(If I did this conversion right)
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/101139] Unable to remove double byteswap in fast path
2021-06-20 9:37 [Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path steinar+gcc at gunderson dot no
2021-06-20 9:41 ` [Bug tree-optimization/101139] " pinskia at gcc dot gnu.org
2021-06-20 9:48 ` pinskia at gcc dot gnu.org
@ 2021-06-21 6:37 ` rguenth at gcc dot gnu.org
2021-06-26 16:51 ` steinar+gcc at gunderson dot no
2024-04-20 7:25 ` pinskia at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-06-21 6:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
One odd thing is that while __builtin_bswap16 is declared as taking an uint16
argument the frontend promotes it to 'int' and that stays that way in GIMPLE:
_16 = (short unsigned int) f$ab_14;
_17 = (int) _16;
_18 = __builtin_bswap16 (_17);
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/101139] Unable to remove double byteswap in fast path
2021-06-20 9:37 [Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path steinar+gcc at gunderson dot no
` (2 preceding siblings ...)
2021-06-21 6:37 ` rguenth at gcc dot gnu.org
@ 2021-06-26 16:51 ` steinar+gcc at gunderson dot no
2024-04-20 7:25 ` pinskia at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: steinar+gcc at gunderson dot no @ 2021-06-26 16:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139
--- Comment #4 from Steinar H. Gunderson <steinar+gcc at gunderson dot no> ---
Yes, the integer promotion actually costs some performance. It happens on both
x86 and Arm. Should I file that as a separate bug?
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/101139] Unable to remove double byteswap in fast path
2021-06-20 9:37 [Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path steinar+gcc at gunderson dot no
` (3 preceding siblings ...)
2021-06-26 16:51 ` steinar+gcc at gunderson dot no
@ 2024-04-20 7:25 ` pinskia at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-20 7:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 57993
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57993&action=edit
Patch but it does not work for the code in this testcase
I have to look into why it is not working for the testcase in comment #0
(factor_out_conditional_operation is not even called) but it does work for:
```
short f(int a, short b, short c)
{
if (a)
return __builtin_bswap16(b);
return __builtin_bswap16(c);
}
```
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-04-20 7:25 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-20 9:37 [Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path steinar+gcc at gunderson dot no
2021-06-20 9:41 ` [Bug tree-optimization/101139] " pinskia at gcc dot gnu.org
2021-06-20 9:48 ` pinskia at gcc dot gnu.org
2021-06-21 6:37 ` rguenth at gcc dot gnu.org
2021-06-26 16:51 ` steinar+gcc at gunderson dot no
2024-04-20 7:25 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).