public inbox for gcc@gcc.gnu.org
* Question regarding preventing optimizing out of register in expansion
From: Peryt, Sebastian @ 2018-06-21 11:42 UTC
  To: gcc; +Cc: Peryt, Sebastian

Hi,

I'd appreciate it if someone could advise me on a builtin expansion I'm currently writing.

High-level description of what I want to do:

I have 2 operands in my builtin.
First, I set a register (reg1) to the value of operand1 (op1).
Second, I call my instruction (it uses reg1 implicitly and updates it).
Finally, I set operand2 (op2) to the value of reg1.

The simplified implementation I have in i386.c:

reg1 = gen_reg_rtx (mode);
emit_insn (gen_rtx_SET (reg1, op1));
emit_clobber (reg1);

emit_insn (gen_myinstruction ());

emit_insn (gen_rtx_SET (op2, reg1));

Everything works fine at -O0, but at higher optimization levels the code that
sets reg1 (the lines before emit_clobber) is optimized out.
I already tried moving emit_clobber to just after the assignment, but it doesn't help.

Could you please suggest how I can prevent it from happening?

Thanks,
Sebastian


* Re: Question regarding preventing optimizing out of register in expansion
From: Nathan Sidwell @ 2018-06-21 13:12 UTC
  To: Peryt, Sebastian, gcc

On 06/21/2018 05:20 AM, Peryt, Sebastian wrote:
> Hi,
> 
> I'd appreciate it if someone could advise me on a builtin expansion I'm currently writing.
> 
> High-level description of what I want to do:
> 
> I have 2 operands in my builtin.

IIUC you're defining an UNSPEC.

> First, I set a register (reg1) to the value of operand1 (op1).
> Second, I call my instruction (it uses reg1 implicitly and updates it).

Here is your error -- NEVER have implicit register settings.  The data 
flow analysers need accurate information.


> The simplified implementation I have in i386.c:
> 
> reg1 = gen_reg_rtx (mode);
> emit_insn (gen_rtx_SET (reg1, op1));
> emit_clobber (reg1);

At this point reg1 is dead.  That means the previous set of reg1 from 
op1 is unneeded and can be deleted.

> emit_insn (gen_myinstruction ());

This instruction has no inputs or outputs, and is not marked volatile(?) 
so can be deleted.

> emit_insn (gen_rtx_SET (op2,reg1));

And this is storing a value from a dead register.

You need something like:
   rtx reg1 = force_reg (mode, op1);
   rtx reg2 = gen_reg_rtx (mode);
   emit_insn (gen_my_insn (reg2, reg1));
   emit_insn (gen_rtx_SET (op2, reg2));

Your instruction should be an UNSPEC that shows what the inputs and outputs 
are.  That tells the optimizers what depends on what, even though the 
compiler has no clue what the transform itself does.

nathan
-- 
Nathan Sidwell
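
To make the liveness argument in the reply above concrete, here is a toy model in C. It is only a sketch and is not GCC's data-flow or DCE code: the names toy_reg, toy_insn, toy_dce, implicit_seq and explicit_seq are made up for illustration, each instruction is assumed to have at most one written and one read register, and a clobber is modeled simply as a write with no inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register numbers for the toy model (not GCC regnos). */
enum toy_reg { NO_REG = -1, OP1, OP2, REG1, REG2 };

struct toy_insn {
    int def;               /* register written (or clobbered), NO_REG if none */
    int use;               /* register read, NO_REG if none */
    bool has_side_effects; /* stand-in for "volatile": never deleted */
};

/* Walk the sequence backwards, tracking live registers as a bitmask.
   Returns a bitmask with bit i set if insns[i] survives. */
unsigned toy_dce(const struct toy_insn *insns, size_t n, unsigned live_out)
{
    unsigned live = live_out, kept = 0;
    assert(n <= 32);
    for (size_t i = n; i-- > 0;) {
        const struct toy_insn *in = &insns[i];
        bool def_live = in->def != NO_REG && (live & (1u << in->def)) != 0;
        if (!in->has_side_effects && !def_live)
            continue;                     /* nothing depends on it: deleted */
        kept |= 1u << i;
        if (in->def != NO_REG)
            live &= ~(1u << in->def);     /* value is (re)defined here */
        if (in->use != NO_REG)
            live |= 1u << in->use;        /* operand must be live above */
    }
    return kept;
}

/* Original expansion: the instruction's use of reg1 is implicit. */
const struct toy_insn implicit_seq[] = {
    { REG1,   OP1,    false },  /* 0: reg1 = op1 */
    { REG1,   NO_REG, false },  /* 1: clobber reg1 */
    { NO_REG, NO_REG, false },  /* 2: myinstruction, no operands */
    { OP2,    REG1,   false },  /* 3: op2 = reg1 */
};

/* Suggested expansion: input and output are explicit operands. */
const struct toy_insn explicit_seq[] = {
    { REG1, OP1,  false },      /* 0: reg1 = op1 */
    { REG2, REG1, false },      /* 1: reg2 = unspec (reg1) */
    { OP2,  REG2, false },      /* 2: op2 = reg2 */
};
```

With op2 live on exit, toy_dce (implicit_seq, 4, 1u << OP2) keeps only insns 1 and 3, so both the initial set of reg1 and the operand-less instruction disappear. toy_dce (explicit_seq, 3, 1u << OP2) keeps all three insns, because each one feeds the next.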


* RE: Question regarding preventing optimizing out of register in expansion
From: Peryt, Sebastian @ 2018-06-21 14:06 UTC
  To: Nathan Sidwell, gcc; +Cc: Peryt, Sebastian

Thank you very much! Your suggestions helped me figure this out.

Sebastian



* RE: Question regarding preventing optimizing out of register in expansion
From: Peryt, Sebastian @ 2018-06-26  9:26 UTC
  To: Nathan Sidwell, gcc; +Cc: Peryt, Sebastian

After some more digging and adjusting, I found additional cases where registers are optimized out,
so I decided to continue this thread to keep the discussion in one place.

With some changes, the simplified implementation of my expansion is as follows:
tmp_op0 = gen_reg_rtx (mode);
emit_move_insn (tmp_op0, op0);
tmp_op1 = gen_reg_rtx (mode);
emit_move_insn (tmp_op1, op1);

// This is the important part
reg = gen_rtx_REG (wide_mode, XMM2_REG);
emit_insn (gen_rtx_SET (reg, tmp_op1));

emit_insn (gen_myinsn (op2, reg));

emit_insn (gen_rtx_SET (tmp_op0, reg));
////

And my md pattern is as follows:
(define_insn "myinsn"
  [(unspec [(match_operand:SI 0 "register_operand" "r")
            (match_operand:V4SI 1 "vector_operand")]
            UNSPEC_MYINSN)
   (clobber (reg:V4SI XMM2_REG))]
  "TARGET_MYTARGET"
  "instr\t%0"
  [(set_attr "type" "other")])

This works like a charm at any optimization level, producing something like this:

movdqu  %eax, %xmm2
instr      %edx
movups  %xmm2, %eax

Unfortunately, when I build with the additional -mavx2 or -mavx512f flag, the first move (from reg to xmm2) is
optimized out. I'm using those extra flags because I also want to use YMM2 and ZMM2 in my instruction.

Does anyone have an idea why this might happen, and how it can be overcome?

Thanks,
Sebastian



* Re: Question regarding preventing optimizing out of register in expansion
From: Peter Bergner @ 2018-06-26 19:20 UTC
  To: Peryt, Sebastian; +Cc: Nathan Sidwell, gcc

On 6/26/18 4:05 AM, Peryt, Sebastian wrote:
> With some changes simplified implementation of my expansion is as follows:
> tmp_op0 = gen_reg_rtx (mode);
> emit_move_insn (tmp_op0, op0);

You set tmp_op0 here, and then....


> emit_insn (gen_rtx_SET (tmp_op0, reg));

You set it again here without ever using it above, so it's dead code,
which explains why it's removed.

Peter
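
The "set, then set again without a use in between" pattern described above can be checked mechanically. The following is a minimal sketch in C, not anything taken from GCC: toy_reg, toy_insn, overwritten_before_use and first_attempt are hypothetical names, and the sequence is a simplified transcription of the expansion quoted above (op2 and the clobber in the md pattern are ignored).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register numbers for the toy model (not GCC regnos). */
enum toy_reg { NO_REG = -1, OP0, OP1, TMP_OP0, TMP_OP1, XMM2 };

struct toy_insn {
    int def;  /* register written, NO_REG if none */
    int use;  /* register read, NO_REG if none */
};

/* True if the value written by insns[i] is overwritten by a later
   instruction before any instruction reads it. */
bool overwritten_before_use(const struct toy_insn *insns, size_t n, size_t i)
{
    int r = insns[i].def;
    if (r == NO_REG)
        return false;
    for (size_t j = i + 1; j < n; j++) {
        if (insns[j].use == r)
            return false;   /* read first: the store is needed */
        if (insns[j].def == r)
            return true;    /* rewritten first: the store is dead */
    }
    return false;           /* never rewritten: assume it is live on exit */
}

/* Simplified form of the expansion quoted above. */
const struct toy_insn first_attempt[] = {
    { TMP_OP0, OP0     },  /* 0: tmp_op0 = op0 */
    { TMP_OP1, OP1     },  /* 1: tmp_op1 = op1 */
    { XMM2,    TMP_OP1 },  /* 2: xmm2 = tmp_op1 */
    { NO_REG,  XMM2    },  /* 3: myinsn (op2, xmm2) */
    { TMP_OP0, XMM2    },  /* 4: tmp_op0 = xmm2 */
};
```

Here overwritten_before_use (first_attempt, 5, 0) is true: nothing reads tmp_op0 between insn 0 and insn 4, so the first move is a dead store. The same call for insns 1 and 2 returns false, since their results are read by the instruction that follows.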



* RE: Question regarding preventing optimizing out of register in expansion
From: Peryt, Sebastian @ 2018-06-26 19:46 UTC
  To: Peter Bergner; +Cc: gcc, Peryt, Sebastian

> Subject: Re: Question regarding preventing optimizing out of register in
> expansion
> 
> On 6/26/18 4:05 AM, Peryt, Sebastian wrote:
> > With some changes simplified implementation of my expansion is as follows:
> > tmp_op0 = gen_reg_rtx (mode);
> > emit_move_insn (tmp_op0, op0);
> 
> You set tmp_op0 here, and then....
> 
> 
> > emit_insn (gen_rtx_SET (tmp_op0, reg));
> 
> You set it again here without ever using it above, so it's dead code, which
> explains why it's removed.

Oh.... My bad - I oversimplified my code. Now I can see it.

This should be more accurate:
tmp_op0 = gen_reg_rtx (mode);
emit_move_insn (tmp_op0, op0);
tmp_op1 = gen_reg_rtx (mode);
emit_move_insn (tmp_op1, op1);

// This is the important part
reg = gen_rtx_REG (wide_mode, XMM2_REG);
op3 = gen_rtx_PLUS (mode, tmp_op1, GEN_INT (128));
emit_insn (gen_rtx_SET (reg, op3));

emit_insn (gen_myinsn (op2, reg));

op3 = gen_rtx_PLUS (mode, tmp_op0, GEN_INT (128));
emit_insn (gen_rtx_SET (op3, reg));
////

Also, I'd like to point out once more that without the additional -mavx or -mavx2
I get the expected register moves both before and after my instr. With those options
I get only the one *after*. This is the part I especially don't understand.




* Re: Question regarding preventing optimizing out of register in expansion
From: Jeff Law @ 2018-06-27 10:26 UTC
  To: Peryt, Sebastian, Peter Bergner; +Cc: gcc

On 06/26/2018 01:20 PM, Peryt, Sebastian wrote:
> Also, I'd like to point out once more that without the additional -mavx or -mavx2
> I get the expected register moves both before and after my instr. With those options
> I get only the one *after*. This is the part I especially don't understand.

I don't know the details of what you're doing, but the expansion phase
may be trying to make the operands you provide fit the predicates of the
expanders or named patterns you're using.  It may also be the case that
copies are created as a result of other define_expands, etc.

The way to figure this out is to note the insn # for the unexpected
copy.  Then put a breakpoint in emit_insn that is conditional on
cur_insn_uid having that value.  You can then walk up the call chain and
try to ascertain why those copies were made.

jeff
