public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Fix PR44281 (bad RA with global regs)
       [not found] <56C78C62.7050204@t-online.de>
@ 2016-02-19 22:03 ` Bernd Schmidt
  2016-02-22 14:37   ` Richard Biener
                     ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Bernd Schmidt @ 2016-02-19 22:03 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 689 bytes --]

In this PR, we generate unnecessarily bad code for code that declares a 
global register var. Since global regs get added to fixed_regs, IRA 
never considers them as candidates. However, we do seem to have proper 
data flow information for them. In the testcase, the global reg dies, 
some operations are done on temporary results, and the final result 
stored back in the global reg. We can achieve the desired code 
generation by reusing the global reg for those temporaries.

Bootstrapped and tested on x86_64-linux. Ok? An argument could be made 
not to use this for gcc-6 since global register vars are both not very 
important and not very well represented in the testsuite.


Bernd

[-- Attachment #2: global-regalloc.diff --]
[-- Type: text/x-patch, Size: 3189 bytes --]

	PR rtl-optimization/44281
	* hard-reg-set.h (struct target_hard_regs): New field
	x_fixed_nonglobal_reg_set.
	(fixed_nonglobal_reg_set): New macro.
	* reginfo.c (init_reg_sets_1): Initialize it.
	* ira.c (setup_alloc_regs): Use fixed_nonglobal_reg_set instead
	of fixed_reg_set.

	PR rtl-optimization/44281
	* gcc.target/i386/pr44281.c: New test.

Index: gcc/hard-reg-set.h
===================================================================
--- gcc/hard-reg-set.h	(revision 233451)
+++ gcc/hard-reg-set.h	(working copy)
@@ -660,6 +660,12 @@ struct target_hard_regs {
      across calls even if we are willing to save and restore them.  */
   HARD_REG_SET x_call_fixed_reg_set;
 
+  /* Contains registers that are fixed use -- i.e. in fixed_reg_set -- but
+     only if they are not merely part of that set because they are global
+     regs.  Global regs that are not otherwise fixed can still take part
+     in register allocation.  */
+  HARD_REG_SET x_fixed_nonglobal_reg_set;
+
   /* Contains 1 for registers that are set or clobbered by calls.  */
   /* ??? Ideally, this would be just call_used_regs plus global_regs, but
      for someone's bright idea to have call_used_regs strictly include
@@ -722,6 +728,8 @@ extern struct target_hard_regs *this_tar
   (this_target_hard_regs->x_fixed_regs)
 #define fixed_reg_set \
   (this_target_hard_regs->x_fixed_reg_set)
+#define fixed_nonglobal_reg_set \
+  (this_target_hard_regs->x_fixed_nonglobal_reg_set)
 #define call_used_regs \
   (this_target_hard_regs->x_call_used_regs)
 #define call_really_used_regs \
Index: gcc/ira.c
===================================================================
--- gcc/ira.c	(revision 233451)
+++ gcc/ira.c	(working copy)
@@ -512,7 +512,7 @@ setup_alloc_regs (bool use_hard_frame_p)
 #ifdef ADJUST_REG_ALLOC_ORDER
   ADJUST_REG_ALLOC_ORDER;
 #endif
-  COPY_HARD_REG_SET (no_unit_alloc_regs, fixed_reg_set);
+  COPY_HARD_REG_SET (no_unit_alloc_regs, fixed_nonglobal_reg_set);
   if (! use_hard_frame_p)
     SET_HARD_REG_BIT (no_unit_alloc_regs, HARD_FRAME_POINTER_REGNUM);
   setup_class_hard_regs ();
Index: gcc/reginfo.c
===================================================================
--- gcc/reginfo.c	(revision 233451)
+++ gcc/reginfo.c	(working copy)
@@ -449,6 +449,7 @@ init_reg_sets_1 (void)
     }
 
   COPY_HARD_REG_SET (call_fixed_reg_set, fixed_reg_set);
+  COPY_HARD_REG_SET (fixed_nonglobal_reg_set, fixed_reg_set);
 
   /* Preserve global registers if called more than once.  */
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
Index: gcc/testsuite/gcc.target/i386/pr44281.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr44281.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr44281.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-std=gnu99 -O2" } */
+/* { dg-final { scan-assembler "salq\[ \\t\]+\\\$8, %rbx" } } */
+
+#include <stdint.h>
+
+register uint64_t global_flag_stack __asm__("rbx");
+
+void push_flag_into_global_reg_var(uint64_t a, uint64_t b) {
+  uint64_t flag = (a==b);
+  global_flag_stack <<= 8;
+  global_flag_stack  |= flag;
+}


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-19 22:03 ` Fix PR44281 (bad RA with global regs) Bernd Schmidt
@ 2016-02-22 14:37   ` Richard Biener
  2016-02-26 15:41     ` Bernd Schmidt
  2016-02-22 17:53   ` Jeff Law
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Richard Biener @ 2016-02-22 14:37 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Fri, Feb 19, 2016 at 11:03 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> In this PR, we generate unnecessarily bad code for code that declares a
> global register var. Since global regs get added to fixed_regs, IRA never
> considers them as candidates. However, we do seem to have proper data flow
> information for them. In the testcase, the global reg dies, some operations
> are done on temporary results, and the final result stored back in the
> global reg. We can achieve the desired code generation by reusing the global
> reg for those temporaries.
>
> Bootstrapped and tested on x86_64-linux. Ok? An argument could be made not
> to use this for gcc-6 since global register vars are both not very important
> and not very well represented in the testsuite.

Do calls properly clobber them even if they are not in the set of
call-clobbered regs?
Are asm()s properly using/clobbering them?  I think you are allowed to use them
in asm()s without adding constraints for them?

I'd say this is sth for GCC 7.

Thanks,
Richard.

>
> Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-19 22:03 ` Fix PR44281 (bad RA with global regs) Bernd Schmidt
  2016-02-22 14:37   ` Richard Biener
@ 2016-02-22 17:53   ` Jeff Law
  2016-02-22 18:29     ` Michael Matz
  2016-04-27 20:59   ` Jeff Law
  2016-07-13 15:30   ` Dominik Vogt
  3 siblings, 1 reply; 21+ messages in thread
From: Jeff Law @ 2016-02-22 17:53 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

On 02/19/2016 03:03 PM, Bernd Schmidt wrote:
> In this PR, we generate unnecessarily bad code for code that declares a
> global register var. Since global regs get added to fixed_regs, IRA
> never considers them as candidates. However, we do seem to have proper
> data flow information for them. In the testcase, the global reg dies,
> some operations are done on temporary results, and the final result
> stored back in the global reg. We can achieve the desired code
> generation by reusing the global reg for those temporaries.
>
> Bootstrapped and tested on x86_64-linux. Ok? An argument could be made
> not to use this for gcc-6 since global register vars are both not very
> important and not very well represented in the testsuite.
>
>
> Bernd
>
> global-regalloc.diff
>
>
> 	PR rtl-optimization/44281
> 	* hard-reg-set.h (struct target_hard_regs): New field
> 	x_fixed_nonglobal_reg_set.
> 	(fixed_nonglobal_reg_set): New macro.
> 	* reginfo.c (init_reg_sets_1): Initialize it.
> 	* ira.c (setup_alloc_regs): Use fixed_nonglobal_reg_set instead
> 	of fixed_reg_set.
>
> 	PR rtl-optimization/44281
> 	* gcc.target/i386/pr44281.c: New test.
It sounds like Richi wants to table this to gcc-7, which I can 
understand.  So I'm not going to dig into the patch itself, just touch 
on the high level stuff.

Global register variables are currently documented as "The register is 
reserved entirely for this use, and will not be allocated for any other 
purpose".  That would need to be fixed.

When David W. and I were working through the docs on this stuff in 2015, 
I don't think that either of us considered the possibility that the 
register could be used for temporaries between a use of the global 
register var (where it dies) and an assignment to the global register 
var (rebirth).

As long as our dataflow is good, using the global register var for a 
temporary in that window seems like a useful thing to do.  So I think 
the biggest concern would be any kind of implicit use.  I'm thinking of 
function calls where the callee expects the global register var to have 
the right value, perhaps asms as well (someone might reference the 
global var in the asm, but not list it in the sets/uses/clobbers because 
it wasn't really necessary in the past).

Thoughts?

jeff


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-22 17:53   ` Jeff Law
@ 2016-02-22 18:29     ` Michael Matz
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Matz @ 2016-02-22 18:29 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches

Hi,

On Mon, 22 Feb 2016, Jeff Law wrote:

> > never considers them as candidates. However, we do seem to have proper
> > data flow information for them.

IMO one of the points of global reg vars, and its long-standing 
documentation to that effect, is that we do not have proper data flow 
information ...

> temporary in that window seems like a useful thing to do.  So I think 
> the biggest concern would be any kind of implicit use.  I'm thinking of 
> function calls where the callee expects the global register var to have 
> the right value, perhaps asms as well (someone might reference the 
> global var in the asm, but not list it in the sets/uses/clobbers because 
> it wasn't really necessary in the past).

... exactly because of this.  Some jit-like code uses global reg vars to 
reserve registers for the generated code.  It would have to add 
-ffixed-<name> compilation options instead of only the global vars after 
the patch.  I think the advantages of having "good RA" with global reg 
vars are dubious enough to not warrant breaking backward compatibility.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-22 14:37   ` Richard Biener
@ 2016-02-26 15:41     ` Bernd Schmidt
  2016-02-26 18:15       ` Jeff Law
  2016-02-29 17:07       ` Michael Matz
  0 siblings, 2 replies; 21+ messages in thread
From: Bernd Schmidt @ 2016-02-26 15:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On 02/22/2016 03:37 PM, Richard Biener wrote:

> Do calls properly clobber them even if they are not in the set of
> call-clobbered regs?
> Are asm()s properly using/clobbering them?  I think you are allowed to use them
> in asm()s without adding constraints for them?

Calls do, asms currently don't AFAICT. Not sure whether it's allowed to 
use them, but I think it should be straightforward to adjust df-scan.

Michael Matz wrote:

> Some jit-like code uses global reg vars to
> reserve registers for the generated code.  It would have to add
> -ffixed-<name> compilation options instead of only the global vars after
> the patch.  I think the advantages of having "good RA" with global reg
> vars are dubious enough to not warrant breaking backward compatibility.

I don't see the need for new options - if we have proper live info for 
calls and asms, what other reason could there be for incomplete liveness 
information?

RB:
 > I'd say this is sth for GCC 7.

Not really objecting to that. Would you ack it if I extended df-scan for 
the asm case?


Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-26 15:41     ` Bernd Schmidt
@ 2016-02-26 18:15       ` Jeff Law
       [not found]         ` <CAFiYyc1GGJtT_QQ3UtzhyACB5Ru=XDsJsG6vQyBs1_rONDdDLw@mail.gmail.com>
  2016-02-29 17:07       ` Michael Matz
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff Law @ 2016-02-26 18:15 UTC (permalink / raw)
  To: Bernd Schmidt, Richard Biener; +Cc: GCC Patches

On 02/26/2016 08:41 AM, Bernd Schmidt wrote:
> On 02/22/2016 03:37 PM, Richard Biener wrote:
>
>> Do calls properly clobber them even if they are not in the set of
>> call-clobbered regs?
>> Are asm()s properly using/clobbering them?  I think you are allowed to
>> use them
>> in asm()s without adding constraints for them?
>
> Calls do, asms currently don't AFAICT. Not sure whether it's allowed to
> use them, but I think it should be straightforward to adjust df-scan.
The other case that came to mind was signal handlers.  What happens if 
we're using the global register as a scratch, we hit a memory reference 
that faults and inside the signal handler the original code expects to 
be able to use the global register the user set up?

If that's a valid use scenario, then there's probably all kinds of DF 
stuff that we'd need to expose.

jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
       [not found]         ` <CAFiYyc1GGJtT_QQ3UtzhyACB5Ru=XDsJsG6vQyBs1_rONDdDLw@mail.gmail.com>
@ 2016-02-29 13:38           ` Bernd Schmidt
  2016-02-29 15:18             ` Richard Biener
  0 siblings, 1 reply; 21+ messages in thread
From: Bernd Schmidt @ 2016-02-29 13:38 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: GCC Patches

On 02/27/2016 08:12 PM, Richard Biener wrote:
>
>
> Am Freitag, 26. Februar 2016 schrieb Jeff Law :
>
>     The other case that came to mind was signal handlers.  What happens
>     if we're using the global register as a scratch, we hit a memory
>     reference that faults and inside the signal handler the original
>     code expects to be able to use the global register the user set up?
>
>     If that's a valid use scenario, then there's probably all kinds of
>     DF stuff that we'd need to expose.
>
> I'd say that's a valid assumption.  Though maybe we want to be able to
> change semantics with a flag here.

A flag seems like overkill, I don't think people are likely to enable 
that ever.

So what's the consensus here, closed wontfix?


Bernd


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-29 13:38           ` Bernd Schmidt
@ 2016-02-29 15:18             ` Richard Biener
  2016-02-29 15:22               ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Biener @ 2016-02-29 15:18 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Jeff Law, GCC Patches

On Mon, Feb 29, 2016 at 2:38 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 02/27/2016 08:12 PM, Richard Biener wrote:
>>
>>
>>
>> Am Freitag, 26. Februar 2016 schrieb Jeff Law :
>>
>>     The other case that came to mind was signal handlers.  What happens
>>     if we're using the global register as a scratch, we hit a memory
>>     reference that faults and inside the signal handler the original
>>     code expects to be able to use the global register the user set up?
>>
>>     If that's a valid use scenario, then there's probably all kinds of
>>     DF stuff that we'd need to expose.
>>
>> I'd say that's a valid assumption.  Though maybe we want to be able to
>> change semantics with a flag here.
>
>
> A flag seems like overkill, I don't think people are likely to enable that
> ever.
>
> So what's the consensus here, closed wontfix?

I think so, and maybe update documentation to reflect the discussion.

Richard.

>
> Bernd
>
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-29 15:18             ` Richard Biener
@ 2016-02-29 15:22               ` Jeff Law
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff Law @ 2016-02-29 15:22 UTC (permalink / raw)
  To: Richard Biener, Bernd Schmidt; +Cc: GCC Patches

On 02/29/2016 08:18 AM, Richard Biener wrote:
> On Mon, Feb 29, 2016 at 2:38 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
>> On 02/27/2016 08:12 PM, Richard Biener wrote:
>>>
>>>
>>>
>>> Am Freitag, 26. Februar 2016 schrieb Jeff Law :
>>>
>>>      The other case that came to mind was signal handlers.  What happens
>>>      if we're using the global register as a scratch, we hit a memory
>>>      reference that faults and inside the signal handler the original
>>>      code expects to be able to use the global register the user set up?
>>>
>>>      If that's a valid use scenario, then there's probably all kinds of
>>>      DF stuff that we'd need to expose.
>>>
>>> I'd say that's a valid assumption.  Though maybe we want to be able to
>>> change semantics with a flag here.
>>
>>
>> A flag seems like overkill, I don't think people are likely to enable that
>> ever.
>>
>> So what's the consensus here, closed wontfix?
>
> I think so, and maybe update documentation to reflect the discussion.
Agreed (closed/wontfix).  I think the signal handler case essentially 
kills the ability to use global regs as scratches.

We could create another style declaration for a global-like register 
which has different semantics, but I don't think that's wise.

I'd either add a comment summarizing the discussion or a pointer back to 
the discussion in the archives.

jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-26 15:41     ` Bernd Schmidt
  2016-02-26 18:15       ` Jeff Law
@ 2016-02-29 17:07       ` Michael Matz
  2016-02-29 17:13         ` Bernd Schmidt
  1 sibling, 1 reply; 21+ messages in thread
From: Michael Matz @ 2016-02-29 17:07 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Richard Biener, GCC Patches

Hi,

On Fri, 26 Feb 2016, Bernd Schmidt wrote:

> Calls do, asms currently don't AFAICT. Not sure whether it's allowed to 
> use them, but I think it should be straightforward to adjust df-scan.
> 
> > Some jit-like code uses global reg vars to reserve registers for the 
> > generated code.  It would have to add -ffixed-<name> compilation 
> > options instead of only the global vars after the patch.  I think the 
> > advantages of having "good RA" with global reg vars are dubious enough 
> > to not warrant breaking backward compatibility.
> 
> I don't see the need for new options - if we have proper live info for 
> calls and asms, what other reason could there be for incomplete liveness 
> information?

[now moot, since wontfix, but still:]

Well, but we don't have proper live info, as you already said yourself 
above.  What I mean to say is, the following is currently proper use of 
global reg vars:

----
register uint64_t ugh __asm__("rbx"); //r11, whatever
void write_into_ugh (void)
{
  asm volatile ("mov 1, %%rbx" :::);
  assert (ugh == 1);
}
----

%rbx would have to be implicitly used/clobbered by the asm.  In addition 
it would have to be used by all function entries and exits (so that a 
function body where the global reg var is merely visible but not used 
doesn't accidentally clobber that register) and of course by calls.

AFAICS this would fix the code in push_flag_into_global_reg_var, because 
then indeed you'd have proper live info, including proper deaths.  But for 
everything a little more complicated than this it'd basically be the same 
as putting the reg into fixed_regs[].

FWIW: signal handlers need no consideration (if they were allowed to 
inspect/alter global reg vars we would have lost and no improvement on 
fixed_regs[] could be done).  They are explicitely documented to not be 
able to access global reg vars.  (They already can't accidentally clobber 
the register in question because signal handlers don't do that)

So, I think it can be made to work, but not something for GCC 6, and if 
the additional bit for all calls/asms/function-entry-exits really is worth 
it ... I just don't know.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-29 17:07       ` Michael Matz
@ 2016-02-29 17:13         ` Bernd Schmidt
  2016-02-29 18:50           ` Michael Matz
  0 siblings, 1 reply; 21+ messages in thread
From: Bernd Schmidt @ 2016-02-29 17:13 UTC (permalink / raw)
  To: Michael Matz; +Cc: Richard Biener, GCC Patches

On 02/29/2016 06:07 PM, Michael Matz wrote:

> %rbx would have to be implicitly used/clobbered by the asm.  In addition
> it would have to be used by all function entries and exits (so that a
> function body where the global reg var is merely visible but not used
> doesn't accidentally clobber that register) and of course by calls.

Nearly all this exists as of today. From df-scan.c:

static void
df_get_call_refs (struct df_collection_rec *collection_rec,
[...]
       else if (global_regs[i])
         {
           /* Calls to const functions cannot access any global 
registers and
              calls to pure functions cannot set them.  All other calls may
              reference any of the global registers, so they are recorded as
              used. */
           if (!RTL_CONST_CALL_P (insn_info->insn))

and

static void
df_get_entry_block_def_set (bitmap entry_block_defs)
{
[...]
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
       if (global_regs[i])
         bitmap_set_bit (entry_block_defs, i);

and

   /* Mark all global registers, and all registers used by the
      epilogue as being live at the end of the function since they
      may be referenced by our caller.  */
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     if (global_regs[i] || EPILOGUE_USES (i))
       bitmap_set_bit (exit_block_uses, i);

> FWIW: signal handlers need no consideration (if they were allowed to
> inspect/alter global reg vars we would have lost and no improvement on
> fixed_regs[] could be done).  They are explicitely documented to not be
> able to access global reg vars.  (They already can't accidentally clobber
> the register in question because signal handlers don't do that)

Oh, they can't modify the register in question because the OS would 
restore it? Hadn't thought of that.

Ok so maybe reopen and apply my patch for gcc-7, with a tweak for asms.


Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-29 17:13         ` Bernd Schmidt
@ 2016-02-29 18:50           ` Michael Matz
  2016-02-29 19:46             ` Mikael Pettersson
  0 siblings, 1 reply; 21+ messages in thread
From: Michael Matz @ 2016-02-29 18:50 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Richard Biener, GCC Patches

Hi,

On Mon, 29 Feb 2016, Bernd Schmidt wrote:

> On 02/29/2016 06:07 PM, Michael Matz wrote:
> 
> > %rbx would have to be implicitly used/clobbered by the asm.  In addition
> > it would have to be used by all function entries and exits (so that a
> > function body where the global reg var is merely visible but not used
> > doesn't accidentally clobber that register) and of course by calls.
> 
> Nearly all this exists as of today. From df-scan.c:

Okay, that looks good, I agree (modulo the asms).

> > FWIW: signal handlers need no consideration (if they were allowed to
> > inspect/alter global reg vars we would have lost and no improvement on
> > fixed_regs[] could be done).  They are explicitely documented to not be
> > able to access global reg vars.  (They already can't accidentally clobber
> > the register in question because signal handlers don't do that)
> 
> Oh, they can't modify the register in question because the OS would 
> restore it?

Yep.

> Ok so maybe reopen and apply my patch for gcc-7, with a tweak for asms.

That seems workable.  At least I can't imagine other implicit uses of such 
registers.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-29 18:50           ` Michael Matz
@ 2016-02-29 19:46             ` Mikael Pettersson
  2016-02-29 20:09               ` Michael Matz
  0 siblings, 1 reply; 21+ messages in thread
From: Mikael Pettersson @ 2016-02-29 19:46 UTC (permalink / raw)
  To: Michael Matz; +Cc: Bernd Schmidt, Richard Biener, GCC Patches

Michael Matz writes:
 > > > FWIW: signal handlers need no consideration (if they were allowed to
 > > > inspect/alter global reg vars we would have lost and no improvement on
 > > > fixed_regs[] could be done).  They are explicitely documented to not be
 > > > able to access global reg vars.  (They already can't accidentally clobber
 > > > the register in question because signal handlers don't do that)
 > > 
 > > Oh, they can't modify the register in question because the OS would 
 > > restore it?
 > 
 > Yep.

Well, almost.  While it is true that a signal handler cannot *accidentally*
clobber the register state of the interrupted thread, it can in fact access
and update any part of that state via the ucontext_t passed to it.  Doing so
is uncommon, but not unheard of and not even that difficult -- I've done it
myself in several different runtime systems.

The code in a signal handler cannot assume that global register variables
are in sync with the interrupted thread, or that plain assignments to them
are reflected back, but that's not GCC's fault, nor is it GCC's job to make
that happen.

/Mikael

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-29 19:46             ` Mikael Pettersson
@ 2016-02-29 20:09               ` Michael Matz
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Matz @ 2016-02-29 20:09 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: Bernd Schmidt, Richard Biener, GCC Patches

Hi,

On Mon, 29 Feb 2016, Mikael Pettersson wrote:

> Well, almost.  While it is true that a signal handler cannot 
> *accidentally* clobber the register state of the interrupted thread, it 
> can in fact access and update any part of that state via the ucontext_t 
> passed to it.  Doing so is uncommon, but not unheard of and not even 
> that difficult -- I've done it myself in several different runtime 
> systems.

Yeah, well, sure.  That's not clobbering the registers directly though, 
but setting it up so that the kernel does it on return :)  If you do that, 
you have to have a special sig-handler anyway, lest it clobbers other 
registers that are currently in use by the interrupted piece of code.

> The code in a signal handler cannot assume that global register 
> variables are in sync with the interrupted thread, or that plain 
> assignments to them are reflected back, but that's not GCC's fault, nor 
> is it GCC's job to make that happen.

And it's documented to not happen (reliably anyway), so all is fine.


Ciai,
Michael.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-19 22:03 ` Fix PR44281 (bad RA with global regs) Bernd Schmidt
  2016-02-22 14:37   ` Richard Biener
  2016-02-22 17:53   ` Jeff Law
@ 2016-04-27 20:59   ` Jeff Law
  2016-04-27 21:02     ` Bernd Schmidt
  2016-07-13 15:30   ` Dominik Vogt
  3 siblings, 1 reply; 21+ messages in thread
From: Jeff Law @ 2016-04-27 20:59 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

On 02/19/2016 03:03 PM, Bernd Schmidt wrote:
> In this PR, we generate unnecessarily bad code for code that declares a
> global register var. Since global regs get added to fixed_regs, IRA
> never considers them as candidates. However, we do seem to have proper
> data flow information for them. In the testcase, the global reg dies,
> some operations are done on temporary results, and the final result
> stored back in the global reg. We can achieve the desired code
> generation by reusing the global reg for those temporaries.
>
> Bootstrapped and tested on x86_64-linux. Ok? An argument could be made
> not to use this for gcc-6 since global register vars are both not very
> important and not very well represented in the testsuite.
>
>
> Bernd
>
> global-regalloc.diff
>
>
> 	PR rtl-optimization/44281
> 	* hard-reg-set.h (struct target_hard_regs): New field
> 	x_fixed_nonglobal_reg_set.
> 	(fixed_nonglobal_reg_set): New macro.
> 	* reginfo.c (init_reg_sets_1): Initialize it.
> 	* ira.c (setup_alloc_regs): Use fixed_nonglobal_reg_set instead
> 	of fixed_reg_set.
>
> 	PR rtl-optimization/44281
> 	* gcc.target/i386/pr44281.c: New test.
So  my recollection of where we left things was that we needed to 
dataflow information fixed for implicit uses/sets in asms.  With that 
infrastructure in place I think we'll be ready to resolve this old 
issue.  Right?

jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-04-27 20:59   ` Jeff Law
@ 2016-04-27 21:02     ` Bernd Schmidt
  2016-04-27 21:08       ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Bernd Schmidt @ 2016-04-27 21:02 UTC (permalink / raw)
  To: Jeff Law, GCC Patches

On 04/27/2016 10:59 PM, Jeff Law wrote:
>>     PR rtl-optimization/44281
>>     * hard-reg-set.h (struct target_hard_regs): New field
>>     x_fixed_nonglobal_reg_set.
>>     (fixed_nonglobal_reg_set): New macro.
>>     * reginfo.c (init_reg_sets_1): Initialize it.
>>     * ira.c (setup_alloc_regs): Use fixed_nonglobal_reg_set instead
>>     of fixed_reg_set.
>>
>>     PR rtl-optimization/44281
>>     * gcc.target/i386/pr44281.c: New test.
> So  my recollection of where we left things was that we needed to
> dataflow information fixed for implicit uses/sets in asms.  With that
> infrastructure in place I think we'll be ready to resolve this old
> issue.  Right?

Yeah. I had prepared the following, and I was planning on getting around 
to retesting it with current trunk on top of the other patch, but I 
haven't quite got there yet.


Bernd

Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c       (revision 234341)
+++ gcc/df-scan.c       (working copy)
@@ -3223,11 +3223,22 @@ df_insn_refs_collect (struct df_collecti
          }
      }

+  int flags = (is_cond_exec) ? DF_REF_CONDITIONAL : 0;
    /* For CALL_INSNs, first record DF_REF_BASE register defs, as well as
       uses from CALL_INSN_FUNCTION_USAGE. */
    if (CALL_P (insn_info->insn))
-    df_get_call_refs (collection_rec, bb, insn_info,
-                     (is_cond_exec) ? DF_REF_CONDITIONAL : 0);
+    df_get_call_refs (collection_rec, bb, insn_info, flags);
+
+  if (asm_noperands (PATTERN (insn_info->insn)) >= 0)
+    for (unsigned i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+      if (global_regs[i])
+       {
+         /* As with calls, asm statements reference all global regs. */
+         df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
+                        NULL, bb, insn_info, DF_REF_REG_USE, flags);
+         df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
+                        NULL, bb, insn_info, DF_REF_REG_DEF, flags);
+       }

    /* Record other defs.  These should be mostly for DF_REF_REGULAR, so
       that a qsort on the defs is unnecessary in most cases.  */

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-04-27 21:02     ` Bernd Schmidt
@ 2016-04-27 21:08       ` Jeff Law
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff Law @ 2016-04-27 21:08 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

On 04/27/2016 03:02 PM, Bernd Schmidt wrote:
> On 04/27/2016 10:59 PM, Jeff Law wrote:
>>>     PR rtl-optimization/44281
>>>     * hard-reg-set.h (struct target_hard_regs): New field
>>>     x_fixed_nonglobal_reg_set.
>>>     (fixed_nonglobal_reg_set): New macro.
>>>     * reginfo.c (init_reg_sets_1): Initialize it.
>>>     * ira.c (setup_alloc_regs): Use fixed_nonglobal_reg_set instead
>>>     of fixed_reg_set.
>>>
>>>     PR rtl-optimization/44281
>>>     * gcc.target/i386/pr44281.c: New test.
>> So  my recollection of where we left things was that we needed to
>> dataflow information fixed for implicit uses/sets in asms.  With that
>> infrastructure in place I think we'll be ready to resolve this old
>> issue.  Right?
>
> Yeah. I had prepared the following, and I was planning on getting around
> to retesting it with current trunk on top of the other patch, but I
> haven't quite got there yet.
Looks like it'll do the trick to me.  Consider the combination of the 
two patches pre-approved once you've run it through the usual testing.

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-02-19 22:03 ` Fix PR44281 (bad RA with global regs) Bernd Schmidt
                     ` (2 preceding siblings ...)
  2016-04-27 20:59   ` Jeff Law
@ 2016-07-13 15:30   ` Dominik Vogt
  2016-07-13 17:43     ` Bernd Schmidt
  3 siblings, 1 reply; 21+ messages in thread
From: Dominik Vogt @ 2016-07-13 15:30 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches, Andreas Krebbel

On Fri, Feb 19, 2016 at 11:03:02PM +0100, Bernd Schmidt wrote:
> In this PR, we generate unnecessarily bad code for code that
> declares a global register var. Since global regs get added to
> fixed_regs, IRA never considers them as candidates. However, we do
> seem to have proper data flow information for them. In the testcase,
> the global reg dies, some operations are done on temporary results,
> and the final result stored back in the global reg. We can achieve
> the desired code generation by reusing the global reg for those
> temporaries.
> 
> Bootstrapped and tested on x86_64-linux. Ok? An argument could be
> made not to use this for gcc-6 since global register vars are both
> not very important and not very well represented in the testsuite.

> 	PR rtl-optimization/44281
> 	* hard-reg-set.h (struct target_hard_regs): New field
> 	x_fixed_nonglobal_reg_set.
> 	(fixed_nonglobal_reg_set): New macro.
> 	* reginfo.c (init_reg_sets_1): Initialize it.
> 	* ira.c (setup_alloc_regs): Use fixed_nonglobal_reg_set instead
> 	of fixed_reg_set.
> 
> 	PR rtl-optimization/44281
> 	* gcc.target/i386/pr44281.c: New test.

Unfortunately this patch (or whatever got actually committed) has
broken the gcc.target/s390/pr679443.c test case, which is a bit
fishy (see code snippet below).  I assign most registers to global
variables and then use some complicated arithmetics with the goal
that the pointer stored in the first argument gets saved on the
stack and reloaded to a different register.  Before this patch the
test case just needed three registers to do its work (r2, r3, r4).
With the patch it currently causes an error in the reload pass

  error: unable to find a register to spill

-- snip --
/* Block all registers except the first three argument registers.  */
register long r0 asm ("r0");
register long r1 asm ("r1");
register long r5 asm ("r5");
register long r6 asm ("r6");
register long r7 asm ("r7");
register long r8 asm ("r8");
register long r9 asm ("r9");
register long r10 asm ("r10");
register long r11 asm ("r11");

...

void foo (struct s_t *ps, int c, int i)
{
  /* Uses r2 as address register.  */
  ps->f1 = c;
  /* The calculation of the value is so expensive that it's cheaper to spill ps
     to the stack and reload it later (into a different register).
     ==> Uses r4 as address register.*/
  ps->f2 = i + i % 3;
  /* If dead store elimination fails to detect that the address in r2 during
     the first assignment is an alias of the address in r4 during the second
     assignment, it eliminates the first assignment and the f1 field is not
     written (bug).  */
}
-- snip --

If a fourth register is available, the ICE goes away, but the
pointer remains in r2, rendering the test case useless.

So, what should be done about this?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-07-13 15:30   ` Dominik Vogt
@ 2016-07-13 17:43     ` Bernd Schmidt
  2016-07-14  9:25       ` Dominik Vogt
  0 siblings, 1 reply; 21+ messages in thread
From: Bernd Schmidt @ 2016-07-13 17:43 UTC (permalink / raw)
  To: vogt, GCC Patches, Andreas Krebbel

On 07/13/2016 05:29 PM, Dominik Vogt wrote:

> Unfortunately this patch (or whatever got actually committed) has
> broken the gcc.target/s390/pr679443.c test case, which is a bit
> fishy (see code snippet below).  I assign most registers to global
> variables and then use some complicated arithmetics with the goal
> that the pointer stored in the first argument gets saved on the
> stack and reloaded to a different register.  Before this patch the
> test case just needed three registers to do its work (r2, r3, r4).
> With the patch it currently causes an error in the reload pass
>
>   error: unable to find a register to spill

Might be useful to see the dump_reload output.

> If a fourth register is available, the ICE goes away, but the
> pointer remains in r2, rendering the test case useless.

I don't think I quite understand what you're trying to do here, but it 
does look dodgy. Whatever this is testing would probably be better as a 
part of a unit test.


Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-07-13 17:43     ` Bernd Schmidt
@ 2016-07-14  9:25       ` Dominik Vogt
  2016-07-20 12:14         ` Dominik Vogt
  0 siblings, 1 reply; 21+ messages in thread
From: Dominik Vogt @ 2016-07-14  9:25 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches, Andreas Krebbel

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

On Wed, Jul 13, 2016 at 07:43:13PM +0200, Bernd Schmidt wrote:
> On 07/13/2016 05:29 PM, Dominik Vogt wrote:
> 
> >Unfortunately this patch (or whatever got actually committed) has
> >broken the gcc.target/s390/pr679443.c test case, which is a bit
> >fishy (see code snippet below).  I assign most registers to global
> >variables and then use some complicated arithmetics with the goal
> >that the pointer stored in the first argument gets saved on the
> >stack and reloaded to a different register.  Before this patch the
> >test case just needed three registers to do its work (r2, r3, r4).
> >With the patch it currently causes an error in the reload pass
> >
> >  error: unable to find a register to spill
> 
> Might be useful to see the dump_reload output.

Attached.

> >If a fourth register is available, the ICE goes away, but the
> >pointer remains in r2, rendering the test case useless.
> 
> I don't think I quite understand what you're trying to do here,

Alias detection of the memory pointed to by the first register.
There was some hard to trigger bug where writing a bitfield in a
struct would also overwrite the unselected bits of the
corresponding word.  See here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67443

> but it does look dodgy.

It is.  The problem is that I can't tell Gcc to move the address
to a different register directly, so I had to do some voodoo
to achieve that.  The test case seemed to be the most robust
approach ...

> Whatever this is testing would probably be
> better as a part of a unit test.o

I've spent a lot of time to port the test to Power but was unable
to reproduce the effect.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

[-- Attachment #2: pr67443.c.256r.reload --]
[-- Type: text/plain, Size: 14515 bytes --]


;; Function foo (foo, funcdef_no=0, decl_uid=2002, cgraph_uid=0, symbol_order=9)


********** Local #1: **********

	   Spilling non-eliminable hard regs: 15 35
New elimination table:
Can eliminate 34 to 15 (offset=160, prev_offset=0)
Can eliminate 34 to 11 (offset=160, prev_offset=0)
Can eliminate 32 to 15 (offset=160, prev_offset=0)
Can eliminate 32 to 11 (offset=160, prev_offset=0)
Can't eliminate 35 to 15 (offset=0, prev_offset=0)
Can't eliminate 35 to 11 (offset=0, prev_offset=0)
Can eliminate 13 to 13 (offset=0, prev_offset=0)
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            alt=4: Bad operand -- refuse
          alt=12,overall=0,losers=0,rld_nregs=0
	 Choosing alt 12 in insn 4:  (0) d  (1) d {*movdi_64}
            0 Non input pseudo reload: reject++
            1 Matching alt: reject+=2
            2 Non-pseudo reload: reject+=2
            2 Non input pseudo reload: reject++
          alt=0,overall=24,losers=3,rld_nregs=4
            0 Non input pseudo reload: reject++
            1 Matching alt: reject+=2
            2 Non input pseudo reload: reject++
          alt=1,overall=22,losers=3,rld_nregs=3
	 Choosing alt 1 in insn 9:  (0) d  (1) 0  (2) T {divmodtidi3}
      Creating newreg=78 from oldreg=71, assigning class GENERAL_REGS to r78
    9: r78:TI=zero_extend(r78:TI#8%[`*.LC0'])<<0x40|zero_extend(r78:TI#8/[`*.LC0'])
    Inserting insn reload before:
   23: clobber r78:TI
   24: r78:TI#8=r67:DI
    Inserting insn reload after:
   25: r71:TI=r78:TI

            0 Non input pseudo reload: reject++
          alt=0,overall=613,losers=2,rld_nregs=2
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=1,overall=2,losers=0,rld_nregs=0
	 Choosing alt 1 in insn 25:  (0) S  (1) d {movti}
            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=4: Bad operand -- refuse
            0 Non pseudo reload: reject++
          alt=12,overall=1,losers=0,rld_nregs=0
	 Choosing alt 12 in insn 24:  (0) d  (1) d {*movdi_64}
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            alt=4: Bad operand -- refuse
          alt=12,overall=0,losers=0,rld_nregs=0
	 Choosing alt 12 in insn 2:  (0) d  (1) d {*movdi_64}
          alt=0,overall=6,losers=1,rld_nregs=1
            alt=2: Bad operand -- refuse
            2 Spill pseudo into memory: reject+=3
            alt=6,overall=15,losers=2 -- refuse
          alt=0,overall=6,losers=1,rld_nregs=1
            alt=2: Bad operand -- refuse
            2 Non pseudo reload: reject++
          alt=6,overall=1,losers=0,rld_nregs=0
  Commutative operand exchange in insn 12
	 Choosing alt 6 in insn 12:  (0) d  (1) 0  (2) R {*addsi3}
            alt=0: Bad operand -- refuse
            1 Spill pseudo into memory: reject+=3
            alt=1: Bad operand -- refuse
          alt=2,overall=0,losers=0,rld_nregs=0
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0,overall=15,losers=2 -- refuse
            1 Non input pseudo reload: reject++
            alt=1,overall=7,losers=1 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=2,overall=17,losers=2 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=3,overall=17,losers=2 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=4,overall=17,losers=2 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=5,overall=17,losers=2 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=7,overall=17,losers=2 -- refuse
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=10,overall=10,losers=1 -- refuse
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=11,overall=10,losers=1 -- refuse
	 Choosing alt 2 in insn 14:  (0) d  (1) 0  (2) N0HSF {*andsi3_zarch}
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
          alt=0,overall=609,losers=1,rld_nregs=1
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Spill pseudo into memory: reject+=3
          alt=2,overall=20,losers=2,rld_nregs=1
          alt=4,overall=0,losers=0,rld_nregs=0
	 Choosing alt 4 in insn 8:  (0) R  (1) d {*movqi}
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=4: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            2 Non-pseudo reload: reject+=2
            2 Non input pseudo reload: reject++
          alt=5,overall=26,losers=3,rld_nregs=2
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            2 Non input pseudo reload: reject++
          alt=7,overall=18,losers=2,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=10: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=11: Bad operand -- refuse
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            1 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=4: Bad operand -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            2 Non-pseudo reload: reject+=2
            2 Non input pseudo reload: reject++
            alt=5,overall=26,losers=3 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=7,overall=17,losers=2,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=10: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=11: Bad operand -- refuse
  Commutative operand exchange in insn 16
	 Choosing alt 7 in insn 16:  (0) d  (1) 0  (2) R {*andsi3_zarch}
      Creating newreg=79 from oldreg=76, assigning class GENERAL_REGS to r79
   16: {r79:SI=r79:SI&[r65:DI];clobber %cc:CC;}
      REG_UNUSED %cc:CC
    Inserting insn reload before:
   26: r79:SI=[`*.LC1']
    Inserting insn reload after:
   27: r76:SI=r79:SI

            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            1 Non pseudo reload: reject++
          alt=6,overall=1,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 27:  (0) d  (1) d {*movsi_zarch}
            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=6,overall=610,losers=1,rld_nregs=1
            0 Non pseudo reload: reject++
          alt=7,overall=1,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 26:  (0) d  (1) R {*movsi_zarch}
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
          alt=3,overall=6,losers=1,rld_nregs=1
            2 Spill pseudo into memory: reject+=3
            alt=5,overall=15,losers=2 -- refuse
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=7,overall=10,losers=1 -- refuse
            0 Spill pseudo into memory: reject+=3
            0 Non input pseudo reload: reject++
            alt=8,overall=10,losers=1 -- refuse
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
          alt=3,overall=0,losers=0,rld_nregs=0
  Commutative operand exchange in insn 17
	 Choosing alt 3 in insn 17:  (0) d  (1) 0  (2) d {*iorsi3_zarch}
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
          alt=6,overall=609,losers=1,rld_nregs=1
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Spill pseudo into memory: reject+=3
          alt=7,overall=18,losers=2,rld_nregs=1
          alt=9,overall=0,losers=0,rld_nregs=0
	 Choosing alt 9 in insn 18:  (0) R  (1) d {*movsi_zarch}
	Elimination 13 to 13 is not possible anymore
    13 is not eliminable at all
	   Spilling non-eliminable hard regs: 13 15 35

********** Pseudo live ranges #1: **********

  BB 2
   Insn 18: point = 0
   Insn 17: point = 1
   Insn 27: point = 3
	Hard reg 3 is preferable by r79 with profit 1000
   Insn 16: point = 5
   Insn 26: point = 5
   Insn 8: point = 6
   Insn 14: point = 6
   Insn 12: point = 8
   Insn 2: point = 10
   Insn 25: point = 11
   Insn 9: point = 13
   Insn 24: point = 15
   Insn 23: point = 15
   Insn 4: point = 16
 r65: [0..10]
 r67: [9..16]
 r71: [9..11]
 r72: [7..8]
 r74: [2..6]
 r76: [2..3]
 r77: [0..1]
 r78: [12..15]
 r79: [4..5]
Compressing live ranges: from 17 to 12 - 70%
Ranges after the compression:
 r65: [0..9]
 r67: [8..11]
 r71: [8..9]
 r72: [6..7]
 r74: [2..5]
 r76: [2..3]
 r77: [0..1]
 r78: [10..11]
 r79: [4..5]

********** Inheritance #1: **********

EBB 2
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=80 from oldreg=71, assigning class GENERAL_REGS to inheritance r80
    Original reg change 71->80 (bb2):
   25: r80:TI=r78:TI
      REG_DEAD r78:TI
    Add original<-inheritance after:
   28: r71:TI=r80:TI

    Inheritance reuse change 71->80 (bb2):
   12: {r72:SI=r67:DI#4+r80:TI#4;clobber %cc:CC;}
      REG_DEAD r80:TI
      REG_DEAD r67:DI
      REG_UNUSED %cc:CC
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
	    Removing dead insn:
    28: r71:TI=r80:TI
deleting insn with uid = 28.

********** Pseudo live ranges #2: **********

  BB 2
   Insn 18: point = 0
   Insn 17: point = 1
   Insn 27: point = 3
	Hard reg 3 is preferable by r79 with profit 1000
   Insn 16: point = 5
   Insn 26: point = 5
   Insn 8: point = 6
   Insn 14: point = 6
   Insn 12: point = 8
   Insn 2: point = 10
   Insn 25: point = 11
	   Creating copy r78->r80@1000
   Insn 9: point = 13
   Insn 24: point = 15
   Insn 23: point = 15
   Insn 4: point = 16
 r65: [0..10]
 r67: [9..16]
 r72: [7..8]
 r74: [2..6]
 r76: [2..3]
 r77: [0..1]
 r78: [12..15]
 r79: [4..5]
 r80: [9..11]
Compressing live ranges: from 17 to 12 - 70%
Ranges after the compression:
 r65: [0..9]
 r67: [8..11]
 r72: [6..7]
 r74: [2..5]
 r76: [2..3]
 r77: [0..1]
 r78: [10..11]
 r79: [4..5]
 r80: [8..9]

********** Assignment #1: **********

	 Assigning to 78 (cl=GENERAL_REGS, orig=71, freq=5000, tfirst=78, tfreq=5000)...
	 Trying 1:
	 Trying 2:
	 Trying 3: spill 67(freq=3000)
	 Trying 4: spill 67(freq=3000)
	 Trying 5:
	 Trying 0:
	 Trying 11:
	 Trying 10:
	 Trying 9:
	 Trying 8:
	 Trying 7:
	 Trying 6:
	 Trying 14:
	 Trying 13:
	 Assigning to 80 (cl=GENERAL_REGS, orig=71, freq=2000, tfirst=78, tfreq=5000)...
	 Assigning to 79 (cl=GENERAL_REGS, orig=76, freq=3000, tfirst=79, tfreq=3000)...
	   Assign 3 to reload r79 (freq=3000)
  2nd iter for reload pseudo assignments:
	 Reload r78 assignment failure
	  Spill  r67(hr=4, freq=3000)
	 Assigning to 78 (cl=GENERAL_REGS, orig=71, freq=5000, tfirst=78, tfreq=5000)...
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 0:
	 Trying 11:
	 Trying 10:
	 Trying 9:
	 Trying 8:
	 Trying 7:
	 Trying 6:
	 Trying 14:
	 Trying 13:
	   Assign 1 to reload r78 (freq=5000)
	Hard reg 1 is preferable by r80 with profit 1000

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Fix PR44281 (bad RA with global regs)
  2016-07-14  9:25       ` Dominik Vogt
@ 2016-07-20 12:14         ` Dominik Vogt
  0 siblings, 0 replies; 21+ messages in thread
From: Dominik Vogt @ 2016-07-20 12:14 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches, Andreas Krebbel

On Thu, Jul 14, 2016 at 10:24:38AM +0100, Dominik Vogt wrote:
> On Wed, Jul 13, 2016 at 07:43:13PM +0200, Bernd Schmidt wrote:
> > On 07/13/2016 05:29 PM, Dominik Vogt wrote:
> > 
> > >Unfortunately this patch (or whatever got actually committed) has
> > >broken the gcc.target/s390/pr679443.c test case, which is a bit
> > >fishy (see code snippet below).  I assign most registers to global
> > >variables and then use some complicated arithmetics with the goal
> > >that the pointer stored in the first argument gets saved on the
> > >stack and reloaded to a different register.  Before this patch the
> > >test case just needed three registers to do its work (r2, r3, r4).
> > >With the patch it currently causes an error in the reload pass
> > >
> > >  error: unable to find a register to spill
> > 
> > Might be useful to see the dump_reload output.
> 
> Attached.
> 
> > >If a fourth register is available, the ICE goes away, but the
> > >pointer remains in r2, rendering the test case useless.
> > 
> > I don't think I quite understand what you're trying to do here,
> 
> Alias detection of the memory pointed to by the first register.
> There was some hard to trigger bug where writing a bitfield in a
> struct would also overwrite the unselected bits of the
> corresponding word.  See here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67443

I've made a patch for the testcase:
  https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01232.html
That fixes the problem for s390/s390x, but I cannot tell wether
the patch to global register variable allocation has a problem or
not.  If you need any more information just give me a shout.
Otherwise I'll not track this issue any further.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2016-07-20 12:14 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <56C78C62.7050204@t-online.de>
2016-02-19 22:03 ` Fix PR44281 (bad RA with global regs) Bernd Schmidt
2016-02-22 14:37   ` Richard Biener
2016-02-26 15:41     ` Bernd Schmidt
2016-02-26 18:15       ` Jeff Law
     [not found]         ` <CAFiYyc1GGJtT_QQ3UtzhyACB5Ru=XDsJsG6vQyBs1_rONDdDLw@mail.gmail.com>
2016-02-29 13:38           ` Bernd Schmidt
2016-02-29 15:18             ` Richard Biener
2016-02-29 15:22               ` Jeff Law
2016-02-29 17:07       ` Michael Matz
2016-02-29 17:13         ` Bernd Schmidt
2016-02-29 18:50           ` Michael Matz
2016-02-29 19:46             ` Mikael Pettersson
2016-02-29 20:09               ` Michael Matz
2016-02-22 17:53   ` Jeff Law
2016-02-22 18:29     ` Michael Matz
2016-04-27 20:59   ` Jeff Law
2016-04-27 21:02     ` Bernd Schmidt
2016-04-27 21:08       ` Jeff Law
2016-07-13 15:30   ` Dominik Vogt
2016-07-13 17:43     ` Bernd Schmidt
2016-07-14  9:25       ` Dominik Vogt
2016-07-20 12:14         ` Dominik Vogt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).