* peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer?
@ 2019-09-23 11:56 Jozef Lawrynowicz
2019-09-23 13:25 ` Richard Biener
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Jozef Lawrynowicz @ 2019-09-23 11:56 UTC (permalink / raw)
To: gcc
For msp430-elf in the large memory model there are PSImode (20-bit) pointers.
POINTERS_EXTEND_UNSIGNED == 1 and "char" is unsigned by default.
We get poor code generated for the following code snippet:
const int table[2] = {1, 2};
int
foo (char i)
{
return table[i];
}
The RTL generated by expand uses two insns to convert "i" to a register's
natural mode; there is a sign extension which would be unnecessary if the first
instruction had a PSImode register as the lvalue:
(insn 2 4 3 2 (set (reg/v:HI 25 [ i ])
(zero_extend:HI (reg:QI 12 R12 [ i ])))
(nil))
.....
(insn 7 6 8 2 (set (reg:PSI 28)
(subreg:PSI (sign_extend:SI (reg/v:HI 25 [ i ])) 0))
(nil))
All we really need is:
(insn (set (reg:PSI 28 [ i ])
(zero_extend:PSI (reg:QI 12 R12 [ i ])))
(nil))
The zero extend is implicit with byte sized register operations, so this would
result in assembly such as:
MOV.B R12, R12
My question is whether this is the type of thing that should be handled with a
peephole optimization or if it is worth trying to fix the initial RTL generated
by expand (or in a later RTL pass e.g. combine)?
Thanks,
Jozef
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer?
2019-09-23 11:56 peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer? Jozef Lawrynowicz
@ 2019-09-23 13:25 ` Richard Biener
2019-09-23 13:30 ` Jeff Law
2019-09-23 15:56 ` Segher Boessenkool
2 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2019-09-23 13:25 UTC (permalink / raw)
To: Jozef Lawrynowicz; +Cc: GCC Development
On Mon, Sep 23, 2019 at 1:56 PM Jozef Lawrynowicz <jozefl.gcc@gmail.com> wrote:
>
> For msp430-elf in the large memory model there are PSImode (20-bit) pointers.
> POINTERS_EXTEND_UNSIGNED == 1 and "char" is unsigned by default.
>
> We get poor code generated for the following code snippet:
>
> const int table[2] = {1, 2};
>
> int
> foo (char i)
> {
> return table[i];
> }
>
> The RTL generated by expand uses two insns to convert "i" to a register's
> natural mode; there is a sign extension which would be unnecessary if the first
> instruction had a PSImode register as the lvalue:
>
> (insn 2 4 3 2 (set (reg/v:HI 25 [ i ])
> (zero_extend:HI (reg:QI 12 R12 [ i ])))
> (nil))
> .....
> (insn 7 6 8 2 (set (reg:PSI 28)
> (subreg:PSI (sign_extend:SI (reg/v:HI 25 [ i ])) 0))
> (nil))
>
> All we really need is:
>
> (insn (set (reg:PSI 28 [ i ])
> (zero_extend:PSI (reg:QI 12 R12 [ i ])))
> (nil))
>
> The zero extend is implicit with byte sized register operations, so this would
> result in assembly such as:
> MOV.B R12, R12
>
> My question is whether this is the type of thing that should be handled with a
> peephole optimization or if it is worth trying to fix the initial RTL generated
> by expand (or in a later RTL pass e.g. combine)?
The C frontend promotes 'i' already and likely that promotion "sticks" via
your choice of SIZETYPE.
But yes, preferably RTL expansion would behave better here.
Richard.
>
> Thanks,
> Jozef
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer?
2019-09-23 11:56 peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer? Jozef Lawrynowicz
2019-09-23 13:25 ` Richard Biener
@ 2019-09-23 13:30 ` Jeff Law
2019-09-23 17:02 ` Jozef Lawrynowicz
2019-09-23 15:56 ` Segher Boessenkool
2 siblings, 1 reply; 7+ messages in thread
From: Jeff Law @ 2019-09-23 13:30 UTC (permalink / raw)
To: Jozef Lawrynowicz, gcc
On 9/23/19 5:56 AM, Jozef Lawrynowicz wrote:
> For msp430-elf in the large memory model there are PSImode (20-bit) pointers.
> POINTERS_EXTEND_UNSIGNED == 1 and "char" is unsigned by default.
>
> We get poor code generated for the following code snippet:
>
> const int table[2] = {1, 2};
>
> int
> foo (char i)
> {
> return table[i];
> }
>
> The RTL generated by expand uses two insns to convert "i" to a register's
> natural mode; there is a sign extension which would be unnecessary if the first
> instruction had a PSImode register as the lvalue:
>
> (insn 2 4 3 2 (set (reg/v:HI 25 [ i ])
> (zero_extend:HI (reg:QI 12 R12 [ i ])))
> (nil))
> .....
> (insn 7 6 8 2 (set (reg:PSI 28)
> (subreg:PSI (sign_extend:SI (reg/v:HI 25 [ i ])) 0))
> (nil))
>
> All we really need is:
>
> (insn (set (reg:PSI 28 [ i ])
> (zero_extend:PSI (reg:QI 12 R12 [ i ])))
> (nil))
>
> The zero extend is implicit with byte sized register operations, so this would
> result in assembly such as:
> MOV.B R12, R12
>
> My question is whether this is the type of thing that should be handled with a
> peephole optimization or if it is worth trying to fix the initial RTL generated
> by expand (or in a later RTL pass e.g. combine)?
Ideally it'd be fixed earlier, but that can be painful due to the way
promotions work.
I certainly recall doing some combiner patterns to catch the more
egregious codegen issues on the mn102 port (similar, but with 24bit
pointers). I think I mostly focused on stuff that showed up with
integer loop indices and their annoying promotion to ptrdiff_t when used
for array indexing.
I'd bet if you extracted mn10200.md out of the archive the patterns
would be obvious ;-)
In this specific case I'd think a combiner pattern ought to do the right
thing except for the fact that you're probably dealing with hard
registers which combine no longer handles as aggressively.
jeff
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer?
2019-09-23 11:56 peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer? Jozef Lawrynowicz
2019-09-23 13:25 ` Richard Biener
2019-09-23 13:30 ` Jeff Law
@ 2019-09-23 15:56 ` Segher Boessenkool
2019-09-23 17:56 ` Jozef Lawrynowicz
2 siblings, 1 reply; 7+ messages in thread
From: Segher Boessenkool @ 2019-09-23 15:56 UTC (permalink / raw)
To: Jozef Lawrynowicz; +Cc: gcc
On Mon, Sep 23, 2019 at 12:56:20PM +0100, Jozef Lawrynowicz wrote:
> (insn 2 4 3 2 (set (reg/v:HI 25 [ i ])
> (zero_extend:HI (reg:QI 12 R12 [ i ])))
> (nil))
> .....
> (insn 7 6 8 2 (set (reg:PSI 28)
> (subreg:PSI (sign_extend:SI (reg/v:HI 25 [ i ])) 0))
> (nil))
>
> All we really need is:
>
> (insn (set (reg:PSI 28 [ i ])
> (zero_extend:PSI (reg:QI 12 R12 [ i ])))
> (nil))
>
> The zero extend is implicit with byte sized register operations, so this would
> result in assembly such as:
> MOV.B R12, R12
>
> My question is whether this is the type of thing that should be handled with a
> peephole optimization or if it is worth trying to fix the initial RTL generated
> by expand (or in a later RTL pass e.g. combine)?
combine does (well, it did in June, I don't think things changed since then)
Trying 2 -> 7:
2: r25:HI=zero_extend(R12:QI)
REG_DEAD R12:QI
7: r28:PSI=sign_extend(r25:HI)#0
REG_DEAD r25:HI
Failed to match this instruction:
(set (reg:PSI 28 [ i ])
(sign_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ i ]))))
Failed to match this instruction:
(set (reg:PSI 28 [ i ])
(sign_extend:PSI (and:HI (reg:HI 12 R12)
(const_int 255 [0xff]))))
It could have just done that as
(set (reg:PSI 28)
(zero_extend:PSI (reg:QI 12)))
as far as I can see? Do you already have a machine pattern that matches
that?
Please file a PR for this. Thanks!
Segher
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer?
2019-09-23 13:30 ` Jeff Law
@ 2019-09-23 17:02 ` Jozef Lawrynowicz
0 siblings, 0 replies; 7+ messages in thread
From: Jozef Lawrynowicz @ 2019-09-23 17:02 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc
On Mon, 23 Sep 2019 07:30:21 -0600
Jeff Law <law@redhat.com> wrote:
> On 9/23/19 5:56 AM, Jozef Lawrynowicz wrote:
> > For msp430-elf in the large memory model there are PSImode (20-bit) pointers.
> > POINTERS_EXTEND_UNSIGNED == 1 and "char" is unsigned by default.
> >
> > We get poor code generated for the following code snippet:
> >
> > const int table[2] = {1, 2};
> >
> > int
> > foo (char i)
> > {
> > return table[i];
> > }
> >
> > The RTL generated by expand uses two insns to convert "i" to a register's
> > natural mode; there is a sign extension which would be unnecessary if the first
> > instruction had a PSImode register as the lvalue:
> >
> > (insn 2 4 3 2 (set (reg/v:HI 25 [ i ])
> > (zero_extend:HI (reg:QI 12 R12 [ i ])))
> > (nil))
> > .....
> > (insn 7 6 8 2 (set (reg:PSI 28)
> > (subreg:PSI (sign_extend:SI (reg/v:HI 25 [ i ])) 0))
> > (nil))
> >
> > All we really need is:
> >
> > (insn (set (reg:PSI 28 [ i ])
> > (zero_extend:PSI (reg:QI 12 R12 [ i ])))
> > (nil))
> >
> > The zero extend is implicit with byte sized register operations, so this would
> > result in assembly such as:
> > MOV.B R12, R12
> >
> > My question is whether this is the type of thing that should be handled with a
> > peephole optimization or if it is worth trying to fix the initial RTL generated
> > by expand (or in a later RTL pass e.g. combine)?
> Ideally it'd be fixed earlier, but that can be painful due to the way
> promotions work.
>
> I certainly recall doing some combiner patterns to catch the more
> egregious codegen issues on the mn102 port (similar, but with 24bit
> pointers). I think I mostly focused on stuff that showed up with
> integer loop indices and their annoying promotion to ptrdiff_t when used
> for array indexing.
>
> I'd bet if you extracted mn10200.md out of the archive the patterns
> would be obvious ;-)
Ok thanks for the tip, I'll take a look. I'm sure there's likely to be other
stuff that could be applicable to msp430 as well.
Jozef
>
> In this specific case I'd think a combiner pattern ought to do the right
> thing except for the fact that you're probably dealing with hard
> registers which combine no longer handles as aggressively.
>
> jeff
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer?
2019-09-23 15:56 ` Segher Boessenkool
@ 2019-09-23 17:56 ` Jozef Lawrynowicz
2019-09-23 22:53 ` Segher Boessenkool
0 siblings, 1 reply; 7+ messages in thread
From: Jozef Lawrynowicz @ 2019-09-23 17:56 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: gcc
On Mon, 23 Sep 2019 10:56:55 -0500
Segher Boessenkool <segher@kernel.crashing.org> wrote:
> On Mon, Sep 23, 2019 at 12:56:20PM +0100, Jozef Lawrynowicz wrote:
> > (insn 2 4 3 2 (set (reg/v:HI 25 [ i ])
> > (zero_extend:HI (reg:QI 12 R12 [ i ])))
> > (nil))
> > .....
> > (insn 7 6 8 2 (set (reg:PSI 28)
> > (subreg:PSI (sign_extend:SI (reg/v:HI 25 [ i ])) 0))
> > (nil))
> >
> > All we really need is:
> >
> > (insn (set (reg:PSI 28 [ i ])
> > (zero_extend:PSI (reg:QI 12 R12 [ i ])))
> > (nil))
> >
> > The zero extend is implicit with byte sized register operations, so this would
> > result in assembly such as:
> > MOV.B R12, R12
> >
> > My question is whether this is the type of thing that should be handled with a
> > peephole optimization or if it is worth trying to fix the initial RTL generated
> > by expand (or in a later RTL pass e.g. combine)?
>
> combine does (well, it did in June, I don't think things changed since then)
>
> Trying 2 -> 7:
> 2: r25:HI=zero_extend(R12:QI)
> REG_DEAD R12:QI
> 7: r28:PSI=sign_extend(r25:HI)#0
> REG_DEAD r25:HI
> Failed to match this instruction:
> (set (reg:PSI 28 [ i ])
> (sign_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ i ]))))
> Failed to match this instruction:
> (set (reg:PSI 28 [ i ])
> (sign_extend:PSI (and:HI (reg:HI 12 R12)
> (const_int 255 [0xff]))))
>
> It could have just done that as
>
> (set (reg:PSI 28)
> (zero_extend:PSI (reg:QI 12)))
>
> as far as I can see? Do you already have a machine pattern that matches
> that?
Yes that combination is what I was expecting to see. I guess since char is
unsigned, a zero extend is needed to widen it, and then for the following insn a
sign extend is added to widen the HImode integer.
There isn't currently a pattern which matches that, but adding it
doesn't fix the issue which is why I thought it might need to be fixed earlier
in RTL generation.
Here is the pattern that is missing:
+(define_insn "movqipsi"
+ [(set (match_operand:PSI 0 "register_operand" "=r,r")
+ (zero_extend:PSI (match_operand:QI 1 "general_operand" "rYs,m")))]
+ "msp430x"
+ "@
+ MOV.B\t%1, %0
+ MOV%X1.B\t%1, %0"
+)
+
So will combine potentially be able to do away with (sign_extend (zero_extend))
in certain situations? I filed PR/91865 which has all the relevant details
from this thread.
I can add the following nameless insn and combine will catch this case and
generate the better, smaller code:
+(define_insn "*movqihipsi"
+ [(set (match_operand:PSI 0 "register_operand" "=r,r")
+ (sign_extend:PSI (zero_extend:HI (match_operand:QI 1 "general_operand"
"rYs,m"))))]
+ "msp430x"
+ "@
+ MOV.B\t%1, %0
+ MOV%X1.B\t%1, %0"
+)
+
Thanks,
Jozef
>
> Please file a PR for this. Thanks!
>
>
> Segher
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer?
2019-09-23 17:56 ` Jozef Lawrynowicz
@ 2019-09-23 22:53 ` Segher Boessenkool
0 siblings, 0 replies; 7+ messages in thread
From: Segher Boessenkool @ 2019-09-23 22:53 UTC (permalink / raw)
To: Jozef Lawrynowicz; +Cc: gcc
On Mon, Sep 23, 2019 at 06:56:12PM +0100, Jozef Lawrynowicz wrote:
> > It could have just done that as
> >
> > (set (reg:PSI 28)
> > (zero_extend:PSI (reg:QI 12)))
> >
> > as far as I can see? Do you already have a machine pattern that matches
> > that?
>
> Yes that combination is what I was expecting to see. I guess since char is
> unsigned, a zero extend is needed to widen it, and then for the following insn a
> sign extend is added to widen the HImode integer.
Yeah, but a sign_extend:M1 of a zero_extend:M2 of an M3 (with M2>M3) is
just a zero_extend:M1 of that M3. Somehow combine (or simplify-rtx)
missed that; or come to think of it, it probably does that for MODE_INT
things, but this is a MODE_PARTIAL_INT. Hrm, would it be correct then.
I think it is? M2 is a normal MODE_INT, all bits in M2 are defined, the
M2 value always is non-negative, the sign_extend:M1 always is the same
as the zero_extend:M1 would be.
> There isn't currently a pattern which matches that, but adding it
> doesn't fix the issue which is why I thought it might need to be fixed earlier
> in RTL generation.
> Here is the pattern that is missing:
>
> +(define_insn "movqipsi"
> + [(set (match_operand:PSI 0 "register_operand" "=r,r")
> + (zero_extend:PSI (match_operand:QI 1 "general_operand" "rYs,m")))]
> + "msp430x"
> + "@
> + MOV.B\t%1, %0
> + MOV%X1.B\t%1, %0"
> +)
> +
>
> So will combine potentially be able to do away with (sign_extend (zero_extend))
> in certain situations?
Yes.
> I filed PR/91865 which has all the relevant details from this thread.
Thanks!
> I can add the following nameless insn and combine will catch this case and
> generate the better, smaller code:
>
> +(define_insn "*movqihipsi"
> + [(set (match_operand:PSI 0 "register_operand" "=r,r")
> + (sign_extend:PSI (zero_extend:HI (match_operand:QI 1 "general_operand"
> "rYs,m"))))]
> + "msp430x"
> + "@
> + MOV.B\t%1, %0
> + MOV%X1.B\t%1, %0"
> +)
Good to hear that already works! combine should come up with the simpler
formulation though, let me see what I can do.
Segher
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-09-23 22:53 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-23 11:56 peephole or expand optimization for fixing poor code generated in conversion from QImode to PSImode pointer? Jozef Lawrynowicz
2019-09-23 13:25 ` Richard Biener
2019-09-23 13:30 ` Jeff Law
2019-09-23 17:02 ` Jozef Lawrynowicz
2019-09-23 15:56 ` Segher Boessenkool
2019-09-23 17:56 ` Jozef Lawrynowicz
2019-09-23 22:53 ` Segher Boessenkool
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).