public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression
@ 2023-04-11 19:19 klaus.doldinger64 at googlemail dot com
2023-04-11 19:27 ` [Bug target/109476] " klaus.doldinger64 at googlemail dot com
` (18 more replies)
0 siblings, 19 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-11 19:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
Bug ID: 109476
Summary: Missing optimization for 8bit/8bit multiplication /
regression
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: klaus.doldinger64 at googlemail dot com
Target Milestone: ---
For avr-gcc > 4.6.4 the follwing function
uint16_t mul(const uint8_t a, const uint16_t b) {
return static_cast<uint8_t>((b >> 8) + 0) * a ;
}
produces suboptimal
mul(unsigned char, unsigned int):
mov r18,r23
ldi r19,0
mov r20,r24
mul r20,r18
movw r24,r0
mul r20,r19
add r25,r0
clr __zero_reg__
ret
whereas avr-gcc 4.6.4 produces optimal
mul(unsigned char, unsigned int):
mul r23,r24
movw r24,r0
clr r1
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
@ 2023-04-11 19:27 ` klaus.doldinger64 at googlemail dot com
2023-04-12 11:14 ` rguenth at gcc dot gnu.org
` (17 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-11 19:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #1 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
Inetristingly changing the function to
uint16_t mul(const uint8_t a, const uint16_t b) {
return static_cast<uint8_t>((b >> 8) + 1) * a ;
}
produces optimal
mul(unsigned char, unsigned int):
subi r23,lo8(-(1))
mul r23,r24
movw r24,r0
clr __zero_reg__
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
2023-04-11 19:27 ` [Bug target/109476] " klaus.doldinger64 at googlemail dot com
@ 2023-04-12 11:14 ` rguenth at gcc dot gnu.org
2023-04-12 11:33 ` klaus.doldinger64 at googlemail dot com
` (16 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-04-12 11:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu.org
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I get
_Z3mulhj:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
mov r22,r24
mov r24,r23
ldi r23,0
ldi r25,0
rcall __mulhi3
but probably because you didn't specify the actual command to compile and
I'm using the wrong subarchitecture.
What probably regressed is that we're promoting the multiplication to
'int'. The same happens for the + 1 variant for me though.
Can you share the -march required to have 16bit multiplication?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
2023-04-11 19:27 ` [Bug target/109476] " klaus.doldinger64 at googlemail dot com
2023-04-12 11:14 ` rguenth at gcc dot gnu.org
@ 2023-04-12 11:33 ` klaus.doldinger64 at googlemail dot com
2023-04-12 12:21 ` [Bug rtl-optimization/109476] " rguenth at gcc dot gnu.org
` (15 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-12 11:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #3 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
Sorry, I forgot to mention the flags: -Os -mmcu=atmega328
Maybe CompilerExplorer ist also usefull:
https://godbolt.org/z/zsq6PT1xb
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (2 preceding siblings ...)
2023-04-12 11:33 ` klaus.doldinger64 at googlemail dot com
@ 2023-04-12 12:21 ` rguenth at gcc dot gnu.org
2023-04-12 15:43 ` segher at gcc dot gnu.org
` (14 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-04-12 12:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|target |rtl-optimization
Ever confirmed|0 |1
Last reconfirmed| |2023-04-12
Status|UNCONFIRMED |NEW
CC| |iant at google dot com,
| |law at gcc dot gnu.org,
| |segher at gcc dot gnu.org
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
In the good case (with + 1) combine succeeds:
Trying 9 -> 11:
9: r55:HI=zero_extend(r54:QI)
REG_DEAD r54:QI
11: r52:HI=r55:HI*r56:HI
REG_DEAD r56:HI
REG_DEAD r55:HI
Successfully matched this instruction:
(set (reg:HI 52)
(mult:HI (zero_extend:HI (reg:QI 54))
(reg:HI 56 [ a ])))
allowing combination of insns 9 and 11
original costs 4 + 28 = 32
replacement cost 20
deferring deletion of insn with uid = 9.
modifying insn i3 11: r52:HI=zero_extend(r54:QI)*r56:HI
REG_DEAD r54:QI
REG_DEAD r56:HI
deferring rescan insn with uid = 11.
and then
Trying 10 -> 11:
10: r56:HI=zero_extend(r61:QI)
REG_DEAD r61:QI
11: r52:HI=zero_extend(r54:QI)*r56:HI
REG_DEAD r54:QI
REG_DEAD r56:HI
Successfully matched this instruction:
(set (reg:HI 52)
(mult:HI (zero_extend:HI (reg:QI 54))
(zero_extend:HI (reg:QI 61))))
allowing combination of insns 10 and 11
original costs 4 + 20 = 24
replacement cost 12
deferring deletion of insn with uid = 10.
modifying insn i3 11: r52:HI=zero_extend(r54:QI)*zero_extend(r61:QI)
REG_DEAD r61:QI
REG_DEAD r54:QI
deferring rescan insn with uid = 11.
in the bad case instead
Trying 8 -> 9:
8: r52:HI=zero_extend(r55:QI)
REG_DEAD r55:QI
9: r50:HI=r51:HI*r52:HI
REG_DEAD r52:HI
REG_DEAD r51:HI
Successfully matched this instruction:
(set (reg:HI 50)
(mult:HI (zero_extend:HI (reg:QI 55))
(reg:HI 51 [ b+1 ])))
allowing combination of insns 8 and 9
original costs 4 + 28 = 32
replacement cost 20
deferring deletion of insn with uid = 8.
modifying insn i3 9: r50:HI=zero_extend(r55:QI)*r51:HI
REG_DEAD r55:QI
REG_DEAD r51:HI
deferring rescan insn with uid = 9.
Trying 20 -> 9:
20: r51:HI#1=0
9: r50:HI=zero_extend(r55:QI)*r51:HI
REG_DEAD r55:QI
REG_DEAD r51:HI
Can't combine i2 into i3
that's because the RTL into combine in the bad case is
(insn 19 22 20 2 (set (subreg:QI (reg:HI 51 [ b+1 ]) 0)
(reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
(expr_list:REG_DEAD (reg:QI 54 [ b+1 ])
(nil)))
(insn 20 19 8 2 (set (subreg:QI (reg:HI 51 [ b+1 ]) 1)
(const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
(nil))
(insn 8 20 9 2 (set (reg:HI 52 [ a ])
(zero_extend:HI (reg/v:QI 48 [ a ]))) "t.ii":4:49 635
{zero_extendqihi2}
(expr_list:REG_DEAD (reg/v:QI 48 [ a ])
(nil)))
(insn 9 8 14 2 (set (reg:HI 50)
(mult:HI (reg:HI 51 [ b+1 ])
(reg:HI 52 [ a ]))) "t.ii":4:47 328 {*mulhi3_enh_split}
(expr_list:REG_DEAD (reg:HI 52 [ a ])
(expr_list:REG_DEAD (reg:HI 51 [ b+1 ])
(nil))))
so the 'b' operand of the multiplication is now not a zero_extend:HI
but instead a two instruction set. The first subreg pass produces
this IL, turning
(insn 3 2 4 2 (set (reg/v:HI 49 [ b ])
(reg:HI 22 r22 [ b ])) "t.ii":3:49 101 {*movhi_split}
(nil))
(insn 7 4 8 2 (set (reg:HI 51)
(lshiftrt:HI (reg/v:HI 49 [ b ])
(const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
(nil))
into
(insn 17 2 18 2 (set (reg:QI 53 [ b ])
(reg:QI 22 r22 [ b ])) "t.ii":3:49 86 {movqi_insn_split}
(nil))
(insn 18 17 4 2 (set (reg:QI 54 [ b+1 ])
(reg:QI 23 r23 [ b+1 ])) "t.ii":3:49 86 {movqi_insn_split}
(nil))
(insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
(reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
(nil))
(insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
(const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
(nil))
with -fno-split-wide-types the rotate doesn't get a zero_extend so the
multiplication pattern doesn't match either.
I think that possibly the lower subreg pass should more optimally
handle the situation, creating
(insn ... (set (zero_extend:HI (reg:QI 54 [ b + 1])))
here. I'm quite sure combine/forwprop cannot combine the seemingly
unrelated subreg sets. resolve_shift_zext seems to be supposed to
handle this and it receives
(insn 7 4 8 2 (set (reg:HI 51)
(lshiftrt:HI (concatn/v:HI [
(reg:QI 53 [ b ])
(reg:QI 54 [ b+1 ])
])
(const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
(nil))
maybe for GET_CODE (op) != ASHIFTRT && offset1 == 0 && shift_count <=
BITS_PER_WORD this can be directly emitted as zero_extend (if supported
by the target).
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (3 preceding siblings ...)
2023-04-12 12:21 ` [Bug rtl-optimization/109476] " rguenth at gcc dot gnu.org
@ 2023-04-12 15:43 ` segher at gcc dot gnu.org
2023-04-12 16:48 ` klaus.doldinger64 at googlemail dot com
` (13 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: segher at gcc dot gnu.org @ 2023-04-12 15:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Correct, this certainly can not be done by combine, it see two independent
pseudos here. For hard registers it *can* do many tricks, but not for
pseudos like this.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (4 preceding siblings ...)
2023-04-12 15:43 ` segher at gcc dot gnu.org
@ 2023-04-12 16:48 ` klaus.doldinger64 at googlemail dot com
2023-04-12 18:43 ` roger at nextmovesoftware dot com
` (12 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-12 16:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #6 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
The following "solution"
constexpr uint16_t mul(const uint8_t a, const uint16_t b) {
const auto aa = std::bit_cast<std::array<uint8_t, 2>>(b);
return aa[1] * a;
}
or
constexpr uint16_t mul(const uint8_t a, const uint16_t b) {
uint8_t aa[2];
std::memcpy(aa, &b, 2);
return aa[1] * a;
}
gives optimal result ;-)
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (5 preceding siblings ...)
2023-04-12 16:48 ` klaus.doldinger64 at googlemail dot com
@ 2023-04-12 18:43 ` roger at nextmovesoftware dot com
2023-04-12 20:20 ` klaus.doldinger64 at googlemail dot com
` (11 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-04-12 18:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |roger at nextmovesoftware dot com
--- Comment #7 from Roger Sayle <roger at nextmovesoftware dot com> ---
Created attachment 54843
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54843&action=edit
proposed patch
Proposed patch. Does this look reasonable?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (6 preceding siblings ...)
2023-04-12 18:43 ` roger at nextmovesoftware dot com
@ 2023-04-12 20:20 ` klaus.doldinger64 at googlemail dot com
2023-04-12 20:56 ` segher at gcc dot gnu.org
` (10 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-12 20:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #8 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
Yes. Looks like the patch does its job.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (7 preceding siblings ...)
2023-04-12 20:20 ` klaus.doldinger64 at googlemail dot com
@ 2023-04-12 20:56 ` segher at gcc dot gnu.org
2023-04-13 7:01 ` rguenth at gcc dot gnu.org
` (9 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: segher at gcc dot gnu.org @ 2023-04-12 20:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #9 from Segher Boessenkool <segher at gcc dot gnu.org> ---
That patch looks fine :-)
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (8 preceding siblings ...)
2023-04-12 20:56 ` segher at gcc dot gnu.org
@ 2023-04-13 7:01 ` rguenth at gcc dot gnu.org
2023-04-13 12:55 ` klaus.doldinger64 at googlemail dot com
` (8 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-04-13 7:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Roger Sayle from comment #7)
> Created attachment 54843 [details]
> proposed patch
>
> Proposed patch. Does this look reasonable?
Yes, but doesn't this work for all GET_CODE (op) != ASHIFTRT?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (9 preceding siblings ...)
2023-04-13 7:01 ` rguenth at gcc dot gnu.org
@ 2023-04-13 12:55 ` klaus.doldinger64 at googlemail dot com
2023-04-13 15:19 ` segher at gcc dot gnu.org
` (7 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-13 12:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #11 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
After testing some more code, I got this ICE:
/home/lmeier/Projekte/wmucpp/boards/rcfoc/gimbal_sbus_01.cc: In static member
function 'static void GlobalFsm<BusDevs>::ratePeriodic() [with BusDevs
=BusDevs<External::Bus::NoBus<Devices<0> > >]':
/home/lmeier/Projekte/wmucpp/boards/rcfoc/gimbal_sbus_01.cc:426:5: error:
unrecognizable insn:
426 | }
| ^
(insn 1584 1583 25 3 (set (concatn:HI [
(reg:QI 800)
(reg:QI 801 [+1 ])
])
(reg:HI 826)) "../../include0/external/sbus/sbus.h":226:69 -1
(nil))
during RTL pass: subreg1
/home/lmeier/Projekte/wmucpp/boards/rcfoc/gimbal_sbus_01.cc:426:5: internal
compiler error: in extract_insn, at recog.cc:2791
0x79f2cb _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
../../gcc/rtl-error.cc:108
0x79f2e7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
../../gcc/rtl-error.cc:116
0x79dc77 extract_insn(rtx_insn*)
../../gcc/recog.cc:2791
0x1a43d91 decompose_multiword_subregs
../../gcc/lower-subreg.cc:1651
0x1a447ca execute
../../gcc/lower-subreg.cc:1790
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (10 preceding siblings ...)
2023-04-13 12:55 ` klaus.doldinger64 at googlemail dot com
@ 2023-04-13 15:19 ` segher at gcc dot gnu.org
2023-04-13 17:57 ` klaus.doldinger64 at googlemail dot com
` (6 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: segher at gcc dot gnu.org @ 2023-04-13 15:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #12 from Segher Boessenkool <segher at gcc dot gnu.org> ---
With the modified compiler? Does it ICE with an unmodified compiler as well?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (11 preceding siblings ...)
2023-04-13 15:19 ` segher at gcc dot gnu.org
@ 2023-04-13 17:57 ` klaus.doldinger64 at googlemail dot com
2023-04-13 21:49 ` roger at nextmovesoftware dot com
` (5 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-13 17:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #13 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
Yes, the ICE is with the patch. I'm pretty sure that this does not happen
without the patch, but I will that check again.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (12 preceding siblings ...)
2023-04-13 17:57 ` klaus.doldinger64 at googlemail dot com
@ 2023-04-13 21:49 ` roger at nextmovesoftware dot com
2023-04-14 7:42 ` klaus.doldinger64 at googlemail dot com
` (4 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-04-13 21:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #14 from Roger Sayle <roger at nextmovesoftware dot com> ---
My apologies for the delay/issues. My bootstrap and regression testing of this
patch (on x86_64-pc-linux-gnu) revealed an issue or two (including the reported
ICE). My plan was to fix/resolve all these before posting a concrete
submission to gcc-patches. The general approach is solid (thanks to everyone
that agreed this was the correct place to fix things) but it's the corner cases
(such as RTL sharing) that all need to be addressed prior to a reviewable
submission.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (13 preceding siblings ...)
2023-04-13 21:49 ` roger at nextmovesoftware dot com
@ 2023-04-14 7:42 ` klaus.doldinger64 at googlemail dot com
2023-04-14 7:46 ` klaus.doldinger64 at googlemail dot com
` (3 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-14 7:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #15 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
Just checked actual gcc 13.0.1 without the patch: then no ICE accurs.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (14 preceding siblings ...)
2023-04-14 7:42 ` klaus.doldinger64 at googlemail dot com
@ 2023-04-14 7:46 ` klaus.doldinger64 at googlemail dot com
2023-04-23 20:22 ` roger at nextmovesoftware dot com
` (2 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: klaus.doldinger64 at googlemail dot com @ 2023-04-14 7:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #16 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
(In reply to Roger Sayle from comment #14)
> My apologies for the delay/issues. My bootstrap and regression testing of
> this patch (on x86_64-pc-linux-gnu) revealed an issue or two (including the
> reported ICE). My plan was to fix/resolve all these before posting a
> concrete submission to gcc-patches.
We all appreciate your great effort in this case! Please don't hesitate to send
here some patches to test with. I'll be happy to test your patches!
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (15 preceding siblings ...)
2023-04-14 7:46 ` klaus.doldinger64 at googlemail dot com
@ 2023-04-23 20:22 ` roger at nextmovesoftware dot com
2023-04-28 13:22 ` cvs-commit at gcc dot gnu.org
2023-05-07 8:30 ` roger at nextmovesoftware dot com
18 siblings, 0 replies; 20+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-04-23 20:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |roger at nextmovesoftware dot com
--- Comment #17 from Roger Sayle <roger at nextmovesoftware dot com> ---
I've submitted an improved version of my patch for review:
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616527.html
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (16 preceding siblings ...)
2023-04-23 20:22 ` roger at nextmovesoftware dot com
@ 2023-04-28 13:22 ` cvs-commit at gcc dot gnu.org
2023-05-07 8:30 ` roger at nextmovesoftware dot com
18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-28 13:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:
https://gcc.gnu.org/g:650c36ec461a722d9c65e82512b4c3aeec2ffee1
commit r14-335-g650c36ec461a722d9c65e82512b4c3aeec2ffee1
Author: Roger Sayle <roger@nextmovesoftware.com>
Date: Fri Apr 28 14:21:53 2023 +0100
PR rtl-optimization/109476: Use ZERO_EXTEND instead of zeroing a SUBREG.
This patch fixes PR rtl-optimization/109476, which is a code quality
regression affecting AVR. The cause is that the lower-subreg pass is
sometimes overly aggressive, lowering the LSHIFTRT below:
(insn 7 4 8 2 (set (reg:HI 51)
(lshiftrt:HI (reg/v:HI 49 [ b ])
(const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
(nil))
into a pair of QImode SUBREG assignments:
(insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
(reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
(nil))
(insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
(const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
(nil))
but this idiom, SETs of SUBREGs, interferes with combine's ability
to associate/fuse instructions. The solution, on targets that
have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
is to split/lower LSHIFTRT to a ZERO_EXTEND.
To answer Richard's question in comment #10 of the bugzilla PR,
the function resolve_shift_zext is called with one of four RTX
codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
LSHIFTRT can the setting of low_part and high_part SUBREGs be
replaced by a ZERO_EXTEND. For ASHIFTRT, we require a sign
extension, so don't set the high_part to zero; if we're splitting
a ZERO_EXTEND then it doesn't make sense to replace it with a
ZERO_EXTEND, and for ASHIFT we've played games to swap the
high_part and low_part SUBREGs, so that we assign the low_part
to zero (for double word shifts by greater than word size bits).
2023-04-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/109476
* lower-subreg.cc: Include explow.h for force_reg.
(find_decomposable_shift_zext): Pass an additional SPEED_P
argument.
If decomposing a suitable LSHIFTRT and we're not splitting
ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
instead of setting a high part SUBREG to zero, which helps combine.
(decompose_multiword_subregs): Update call to resolve_shift_zext.
gcc/testsuite/ChangeLog
PR rtl-optimization/109476
* gcc.target/avr/mmcu/pr109476.c: New test case.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
` (17 preceding siblings ...)
2023-04-28 13:22 ` cvs-commit at gcc dot gnu.org
@ 2023-05-07 8:30 ` roger at nextmovesoftware dot com
18 siblings, 0 replies; 20+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-05-07 8:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Target Milestone|--- |14.0
Resolution|--- |FIXED
--- Comment #19 from Roger Sayle <roger at nextmovesoftware dot com> ---
This should now be fixed on mainline.
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2023-05-07 8:31 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-11 19:19 [Bug c++/109476] New: Missing optimization for 8bit/8bit multiplication / regression klaus.doldinger64 at googlemail dot com
2023-04-11 19:27 ` [Bug target/109476] " klaus.doldinger64 at googlemail dot com
2023-04-12 11:14 ` rguenth at gcc dot gnu.org
2023-04-12 11:33 ` klaus.doldinger64 at googlemail dot com
2023-04-12 12:21 ` [Bug rtl-optimization/109476] " rguenth at gcc dot gnu.org
2023-04-12 15:43 ` segher at gcc dot gnu.org
2023-04-12 16:48 ` klaus.doldinger64 at googlemail dot com
2023-04-12 18:43 ` roger at nextmovesoftware dot com
2023-04-12 20:20 ` klaus.doldinger64 at googlemail dot com
2023-04-12 20:56 ` segher at gcc dot gnu.org
2023-04-13 7:01 ` rguenth at gcc dot gnu.org
2023-04-13 12:55 ` klaus.doldinger64 at googlemail dot com
2023-04-13 15:19 ` segher at gcc dot gnu.org
2023-04-13 17:57 ` klaus.doldinger64 at googlemail dot com
2023-04-13 21:49 ` roger at nextmovesoftware dot com
2023-04-14 7:42 ` klaus.doldinger64 at googlemail dot com
2023-04-14 7:46 ` klaus.doldinger64 at googlemail dot com
2023-04-23 20:22 ` roger at nextmovesoftware dot com
2023-04-28 13:22 ` cvs-commit at gcc dot gnu.org
2023-05-07 8:30 ` roger at nextmovesoftware dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).