* shift/extract SHIFT_COUNT_TRUNCATED combine bug
@ 2014-04-08 20:09 Mike Stump
2015-01-12 22:25 ` Jeff Law
From: Mike Stump @ 2014-04-08 20:09 UTC (permalink / raw)
To: GCC Patches; +Cc: Richard Sandiford, Eric Botcazou
Something broke in the compiler to cause combine to incorrectly optimize:
(insn 12 11 13 3 (set (reg:SI 604 [ D.6102 ])
        (lshiftrt:SI (subreg/s/u:SI (reg/v:DI 601 [ x ]) 0)
            (reg:SI 602 [ D.6103 ]))) t.c:47 4436 {lshrsi3}
     (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
        (nil)))

(insn 13 12 14 3 (set (reg:SI 605)
        (and:SI (reg:SI 604 [ D.6102 ])
            (const_int 1 [0x1]))) t.c:47 3658 {andsi3}
     (expr_list:REG_DEAD (reg:SI 604 [ D.6102 ])
        (nil)))

(insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
        (zero_extend:DI (reg:SI 605))) t.c:47 4616 {zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 605)
        (nil)))
into:
(insn 11 10 12 3 (set (reg:SI 602 [ D.6103 ])
        (not:SI (subreg:SI (reg:DI 595 [ D.6102 ]) 0))) t.c:47 3732 {one_cmplsi2}
     (expr_list:REG_DEAD (reg:DI 595 [ D.6102 ])
        (nil)))

(note 12 11 13 3 NOTE_INSN_DELETED)
(note 13 12 14 3 NOTE_INSN_DELETED)

(insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
        (zero_extract:DI (reg/v:DI 601 [ x ])
            (const_int 1 [0x1])
            (reg:SI 602 [ D.6103 ]))) t.c:47 4668 {c2_extzvdi}
     (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
        (nil)))
This shows up in:
FAIL: gcc.c-torture/execute/builtin-bitops-1.c execution, -Og -g
for me.
diff --git a/gcc/combine.c b/gcc/combine.c
index 708691f..c1f50ff 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7245,6 +7245,18 @@ make_extraction (enum machine_mode mode, rtx inner, HOST_WIDE_INT pos,
extraction_mode = insn.field_mode;
}
+ /* On a SHIFT_COUNT_TRUNCATED machine, we can't promote the mode of
+ the extract to a larger size on a variable extract, as previously
+ the position might have been optimized to change a bit of the
+ index of the starting bit that would have been ignored before,
+ but, with a larger mode, will then not be. If we wanted to do
+ this, we'd have to mask out those bits or prove that those bits
+ are 0. */
+ if (SHIFT_COUNT_TRUNCATED
+ && pos_rtx
+ && GET_MODE_BITSIZE (extraction_mode) > GET_MODE_BITSIZE (mode))
+ extraction_mode = mode;
+
/* Never narrow an object, since that might not be safe. */
if (mode != VOIDmode
is sufficient to never widen variable extracts on SHIFT_COUNT_TRUNCATED machines. So, the question is, how did people expect this to work? I didn't spot what changed recently to cause the bad code-gen. The optimization of sub into not is ok, despite how funny it looks, because it feeds into the extract, which we know by SHIFT_COUNT_TRUNCATED is safe.
Is the patch a reasonable way to fix this?
* Re: shift/extract SHIFT_COUNT_TRUNCATED combine bug
2014-04-08 20:09 shift/extract SHIFT_COUNT_TRUNCATED combine bug Mike Stump
@ 2015-01-12 22:25 ` Jeff Law
2015-01-13 10:11 ` Richard Biener
From: Jeff Law @ 2015-01-12 22:25 UTC (permalink / raw)
To: Mike Stump, GCC Patches; +Cc: Richard Sandiford, Eric Botcazou
On 04/08/14 14:07, Mike Stump wrote:
> Something broke in the compiler to cause combine to incorrectly optimize:
>
> (insn 12 11 13 3 (set (reg:SI 604 [ D.6102 ])
> (lshiftrt:SI (subreg/s/u:SI (reg/v:DI 601 [ x ]) 0)
> (reg:SI 602 [ D.6103 ]))) t.c:47 4436 {lshrsi3}
> (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
> (nil)))
> (insn 13 12 14 3 (set (reg:SI 605)
> (and:SI (reg:SI 604 [ D.6102 ])
> (const_int 1 [0x1]))) t.c:47 3658 {andsi3}
> (expr_list:REG_DEAD (reg:SI 604 [ D.6102 ])
> (nil)))
> (insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
> (zero_extend:DI (reg:SI 605))) t.c:47 4616 {zero_extendsidi2}
> (expr_list:REG_DEAD (reg:SI 605)
> (nil)))
>
> into:
>
> (insn 11 10 12 3 (set (reg:SI 602 [ D.6103 ])
> (not:SI (subreg:SI (reg:DI 595 [ D.6102 ]) 0))) t.c:47 3732 {one_cmplsi2}
> (expr_list:REG_DEAD (reg:DI 595 [ D.6102 ])
> (nil)))
> (note 12 11 13 3 NOTE_INSN_DELETED)
> (note 13 12 14 3 NOTE_INSN_DELETED)
> (insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
> (zero_extract:DI (reg/v:DI 601 [ x ])
> (const_int 1 [0x1])
> (reg:SI 602 [ D.6103 ]))) t.c:47 4668 {c2_extzvdi}
> (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
> (nil)))
>
> This shows up in:
>
> FAIL: gcc.c-torture/execute/builtin-bitops-1.c execution, -Og -g
>
> for me.
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 708691f..c1f50ff 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -7245,6 +7245,18 @@ make_extraction (enum machine_mode mode, rtx inner, HOST_WIDE_INT pos,
> extraction_mode = insn.field_mode;
> }
>
> + /* On a SHIFT_COUNT_TRUNCATED machine, we can't promote the mode of
> + the extract to a larger size on a variable extract, as previously
> + the position might have been optimized to change a bit of the
> + index of the starting bit that would have been ignored before,
> + but, with a larger mode, will then not be. If we wanted to do
> + this, we'd have to mask out those bits or prove that those bits
> + are 0. */
> + if (SHIFT_COUNT_TRUNCATED
> + && pos_rtx
> + && GET_MODE_BITSIZE (extraction_mode) > GET_MODE_BITSIZE (mode))
> + extraction_mode = mode;
> +
> /* Never narrow an object, since that might not be safe. */
>
> if (mode != VOIDmode
>
> is sufficient to never widen variable extracts on SHIFT_COUNT_TRUNCATED machines. So, the question is, how did people expect this to work? I didn't spot what changed recently to cause the bad code-gen. The optimization of sub into not is ok, despite how funny it looks, because it feeds into the extract, which we know by SHIFT_COUNT_TRUNCATED is safe.
>
> Is the patch a reasonable way to fix this?
On a SHIFT_COUNT_TRUNCATED target, I don't think it's ever OK to widen a
shift, variable or constant.
In the case of a variable shift, we could easily have eliminated the
masking code before or during combine. For a constant shift amount we
could have adjusted the constant (see SHIFT_COUNT_TRUNCATED in cse.c)
I think it's just an oversight and it has simply never bitten us before.
jeff
* Re: shift/extract SHIFT_COUNT_TRUNCATED combine bug
2015-01-12 22:25 ` Jeff Law
@ 2015-01-13 10:11 ` Richard Biener
2015-01-13 17:00 ` Jeff Law
2015-01-13 18:25 ` Segher Boessenkool
From: Richard Biener @ 2015-01-13 10:11 UTC (permalink / raw)
To: Jeff Law; +Cc: Mike Stump, GCC Patches, Richard Sandiford, Eric Botcazou
On Mon, Jan 12, 2015 at 11:12 PM, Jeff Law <law@redhat.com> wrote:
> On 04/08/14 14:07, Mike Stump wrote:
>>
>> Something broke in the compiler to cause combine to incorrectly optimize:
>>
>> (insn 12 11 13 3 (set (reg:SI 604 [ D.6102 ])
>> (lshiftrt:SI (subreg/s/u:SI (reg/v:DI 601 [ x ]) 0)
>> (reg:SI 602 [ D.6103 ]))) t.c:47 4436 {lshrsi3}
>> (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
>> (nil)))
>> (insn 13 12 14 3 (set (reg:SI 605)
>> (and:SI (reg:SI 604 [ D.6102 ])
>> (const_int 1 [0x1]))) t.c:47 3658 {andsi3}
>> (expr_list:REG_DEAD (reg:SI 604 [ D.6102 ])
>> (nil)))
>> (insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
>> (zero_extend:DI (reg:SI 605))) t.c:47 4616 {zero_extendsidi2}
>> (expr_list:REG_DEAD (reg:SI 605)
>> (nil)))
>>
>> into:
>>
>> (insn 11 10 12 3 (set (reg:SI 602 [ D.6103 ])
>> (not:SI (subreg:SI (reg:DI 595 [ D.6102 ]) 0))) t.c:47 3732
>> {one_cmplsi2}
>> (expr_list:REG_DEAD (reg:DI 595 [ D.6102 ])
>> (nil)))
>> (note 12 11 13 3 NOTE_INSN_DELETED)
>> (note 13 12 14 3 NOTE_INSN_DELETED)
>> (insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
>> (zero_extract:DI (reg/v:DI 601 [ x ])
>> (const_int 1 [0x1])
>> (reg:SI 602 [ D.6103 ]))) t.c:47 4668 {c2_extzvdi}
>> (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
>> (nil)))
>>
>> This shows up in:
>>
>> FAIL: gcc.c-torture/execute/builtin-bitops-1.c execution, -Og -g
>>
>> for me.
>>
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index 708691f..c1f50ff 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -7245,6 +7245,18 @@ make_extraction (enum machine_mode mode, rtx inner,
>> HOST_WIDE_INT pos,
>> extraction_mode = insn.field_mode;
>> }
>>
>> + /* On a SHIFT_COUNT_TRUNCATED machine, we can't promote the mode of
>> + the extract to a larger size on a variable extract, as previously
>> + the position might have been optimized to change a bit of the
>> + index of the starting bit that would have been ignored before,
>> + but, with a larger mode, will then not be. If we wanted to do
>> + this, we'd have to mask out those bits or prove that those bits
>> + are 0. */
>> + if (SHIFT_COUNT_TRUNCATED
>> + && pos_rtx
>> + && GET_MODE_BITSIZE (extraction_mode) > GET_MODE_BITSIZE (mode))
>> + extraction_mode = mode;
>> +
>> /* Never narrow an object, since that might not be safe. */
>>
>> if (mode != VOIDmode
>>
>> is sufficient to never widen variable extracts on SHIFT_COUNT_TRUNCATED
>> machines. So, the question is, how did people expect this to work? I
>> didn’t spot what changed recently to cause the bad code-gen. The
>> optimization of sub into not is ok, despite how funny it looks, because it
>> feeds into the extract, which we know by SHIFT_COUNT_TRUNCATED is safe.
>>
>> Is the patch a reasonable way to fix this?
>
> On a SHIFT_COUNT_TRUNCATED target, I don't think it's ever OK to widen a
> shift, variable or constant.
>
> In the case of a variable shift, we could easily have eliminated the masking
> code before or during combine. For a constant shift amount we could have
> adjusted the constant (see SHIFT_COUNT_TRUNCATED in cse.c)
>
> I think it's just an oversight and it has simply never bitten us before.
IMHO SHIFT_COUNT_TRUNCATED should be removed and instead
backends should provide shift patterns with a (and:QI ...) for the
shift amount which simply will omit that operation if suitable.
Richard.
> jeff
* Re: shift/extract SHIFT_COUNT_TRUNCATED combine bug
2015-01-13 10:11 ` Richard Biener
@ 2015-01-13 17:00 ` Jeff Law
2015-01-13 18:25 ` Segher Boessenkool
From: Jeff Law @ 2015-01-13 17:00 UTC (permalink / raw)
To: Richard Biener; +Cc: Mike Stump, GCC Patches, Richard Sandiford, Eric Botcazou
On 01/13/15 02:51, Richard Biener wrote:
>> On a SHIFT_COUNT_TRUNCATED target, I don't think it's ever OK to widen a
>> shift, variable or constant.
>>
>> In the case of a variable shift, we could easily have eliminated the masking
>> code before or during combine. For a constant shift amount we could have
>> adjusted the constant (see SHIFT_COUNT_TRUNCATED in cse.c)
>>
>> I think it's just an oversight and it has simply never bit us before.
>
> IMHO SHIFT_COUNT_TRUNCATED should be removed and instead
> backends should provide shift patterns with a (and:QI ...) for the
> shift amount which simply will omit that operation if suitable.
Perhaps. I'm certainly not wed to the concept of SHIFT_COUNT_TRUNCATED. I
don't see that getting addressed in the gcc-5 timeframe.
aarch64, alpha, epiphany, iq2000, lm32, m32r, mep, microblaze, mips,
mn103, nds32, pa, sparc, stormy16, tilepro, v850 and xtensa are the
current SHIFT_COUNT_TRUNCATED targets.
Jeff
* Re: shift/extract SHIFT_COUNT_TRUNCATED combine bug
2015-01-13 10:11 ` Richard Biener
2015-01-13 17:00 ` Jeff Law
@ 2015-01-13 18:25 ` Segher Boessenkool
2015-01-14 9:24 ` Richard Biener
From: Segher Boessenkool @ 2015-01-13 18:25 UTC (permalink / raw)
To: Richard Biener
Cc: Jeff Law, Mike Stump, GCC Patches, Richard Sandiford, Eric Botcazou
On Tue, Jan 13, 2015 at 10:51:27AM +0100, Richard Biener wrote:
> IMHO SHIFT_COUNT_TRUNCATED should be removed and instead
> backends should provide shift patterns with a (and:QI ...) for the
> shift amount which simply will omit that operation if suitable.
Note that that catches less though, e.g. in
int f(int x, int n) { return x << ((2*n) & 31); }
without SHIFT_COUNT_TRUNCATED it will try to match an AND with 30,
not with 31.
Segher
* Re: shift/extract SHIFT_COUNT_TRUNCATED combine bug
2015-01-13 18:25 ` Segher Boessenkool
@ 2015-01-14 9:24 ` Richard Biener
2015-01-14 14:35 ` Segher Boessenkool
From: Richard Biener @ 2015-01-14 9:24 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Jeff Law, Mike Stump, GCC Patches, Richard Sandiford, Eric Botcazou
On Tue, Jan 13, 2015 at 6:38 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Tue, Jan 13, 2015 at 10:51:27AM +0100, Richard Biener wrote:
>> IMHO SHIFT_COUNT_TRUNCATED should be removed and instead
>> backends should provide shift patterns with a (and:QI ...) for the
>> shift amount which simply will omit that operation if suitable.
>
> Note that that catches less though, e.g. in
>
> int f(int x, int n) { return x << ((2*n) & 31); }
>
> without SHIFT_COUNT_TRUNCATED it will try to match an AND with 30,
> not with 31.
But even with SHIFT_COUNT_TRUNCATED you cannot omit the
and as it clears the LSB. Only at a higher level might we be tempted
to drop the & 31 while it still persists in its original form (not sure
if fold does that - I don't see SHIFT_COUNT_TRUNCATED mentioned there).
Richard.
>
> Segher
* Re: shift/extract SHIFT_COUNT_TRUNCATED combine bug
2015-01-14 9:24 ` Richard Biener
@ 2015-01-14 14:35 ` Segher Boessenkool
From: Segher Boessenkool @ 2015-01-14 14:35 UTC (permalink / raw)
To: Richard Biener
Cc: Jeff Law, Mike Stump, GCC Patches, Richard Sandiford, Eric Botcazou
On Wed, Jan 14, 2015 at 10:10:24AM +0100, Richard Biener wrote:
> On Tue, Jan 13, 2015 at 6:38 PM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > On Tue, Jan 13, 2015 at 10:51:27AM +0100, Richard Biener wrote:
> >> IMHO SHIFT_COUNT_TRUNCATED should be removed and instead
> >> backends should provide shift patterns with a (and:QI ...) for the
> >> shift amount which simply will omit that operation if suitable.
> >
> > Note that that catches less though, e.g. in
> >
> > int f(int x, int n) { return x << ((2*n) & 31); }
> >
> > without SHIFT_COUNT_TRUNCATED it will try to match an AND with 30,
> > not with 31.
>
> But even with SHIFT_COUNT_TRUNCATED you cannot omit the
> and as it clears the LSB.
The 2*n already does that.
Before combine, we have something like
t1 = n << 1
t2 = t1 & 30
ret = x << t2
(it actually has some register copies to more temporaries), and on
SHIFT_COUNT_TRUNCATED targets where the first two insns don't combine,
e.g. m32r, currently combine ends up with
t1 = n << 1
ret = x << t1
while it doesn't without SHIFT_COUNT_TRUNCATED if you only have a
x << (n & 31) pattern.
I'm all for eradicating SHIFT_COUNT_TRUNCATED; just pointing out that
it is not trivial to fully replace (just the important, obvious cases
are easy).
Segher