public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [Patch, AVR]: Fix PR46779
@ 2011-06-09 17:24 Georg-Johann Lay
  2011-06-09 18:50 ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-09 17:24 UTC (permalink / raw)
  To: gcc-patches; +Cc: Anatoly Sokolov, Denis Chertykov, Eric B. Weddington

[-- Attachment #1: Type: text/plain, Size: 1040 bytes --]

This is a tentative patch to fix PR46779 and hopefully also related
issues like PR45291.

In the present version of avr_hard_regno_mode_ok, QImode is forbidden
in r29/r28. If a 16-bit value or composite is allocated to r28, this
can lead to odd subregs like
  (set (subreg:QI (reg:HI r28) 0) ...)
These subregs are produced by IRA and reload treats the subreg with a
RELOAD_WRITE. As reload spills r28 to another hard reg that can access
QI, there will be no input reload. Therefore, if r29 already contains
data that data will get garbaged. See also the discussion around

http://gcc.gnu.org/ml/gcc/2011-06/msg00005.html

Tested with two regressions less compared to unpatched compiler.
Testcases that now pass are:

* gcc.target/avr/pr46779-1.c
* gcc.target/avr/pr46779-2.c

Johann

--
	PR target/46779
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.

[-- Attachment #2: pr46779.diff --]
[-- Type: text/x-patch, Size: 3357 bytes --]

Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(Revision 174701)
+++ config/avr/avr.c	(Arbeitskopie)
@@ -6276,26 +6276,35 @@ jump_over_one_insn_p (rtx insn, rtx dest
 int
 avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
 {
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
+  /* Don't allocate data to non-GENERAL_REGS registers.  */
+  
+  if (regno >= 32)
     return 0;
 
-  /* The only thing that can go into registers r28:r29 is a Pmode.  */
-  if (regno == REG_Y && mode == Pmode)
+  /* Any GENERAL_REGS register can hold 8-bit values.  */
+  /* FIXME:
+     8-bit values must not be disallowed for R28 or R29.  Disallowing
+     QI et al. in these registers might lead to code like
+         (set (subreg:QI (reg:HI 28)) ...)
+     which will result in wrong code because reload does not handle
+     SUBREGs of hard regsisters like this.  This could be fixed in reload.
+     However, it appears that fixing reload is not wanted by reload people.  */
+  
+  if (GET_MODE_SIZE (mode) == 1)
     return 1;
+  
+  /* Disallow big registers that overlap the frame pointer.
+     This will hopefully reduce the number of spill failures.  */
+  
+  if (GET_MODE_SIZE (mode) > 2
+      && regno <= REG_Y
+      && regno + GET_MODE_SIZE (mode) >= REG_Y + 1)
+    {
+      return 0;
+    }
 
-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
-
-  if (mode == QImode)
-    return 1;
-
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
-    return 0;
-
-  /* All modes larger than QImode should start in an even register.  */
+  /* All modes larger than 8 bits should start in an even register.  */
+  
   return !(regno & 1);
 }
 
@@ -6422,13 +6431,23 @@ avr_hard_regno_scratch_ok (unsigned int
       && !df_regs_ever_live_p (regno))
     return false;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (regno == REG_Y || regno == REG_Y + 1))
+    {
+      return false;
+    }
+
   return true;
 }
 
 /* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
 
 int
-avr_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
+avr_hard_regno_rename_ok (unsigned int old_reg,
 			  unsigned int new_reg)
 {
   /* Interrupt functions can only use registers that have already been
@@ -6439,6 +6458,17 @@ avr_hard_regno_rename_ok (unsigned int o
       && !df_regs_ever_live_p (new_reg))
     return 0;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (old_reg == REG_Y || old_reg == REG_Y + 1
+          || new_reg == REG_Y || new_reg == REG_Y + 1))
+    {
+      return 0;
+    }
+  
   return 1;
 }
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-09 17:24 [Patch, AVR]: Fix PR46779 Georg-Johann Lay
@ 2011-06-09 18:50 ` Denis Chertykov
  2011-06-09 19:43   ` Georg-Johann Lay
  2011-06-10  9:27   ` Georg-Johann Lay
  0 siblings, 2 replies; 47+ messages in thread
From: Denis Chertykov @ 2011-06-09 18:50 UTC (permalink / raw)
  To: Georg-Johann Lay; +Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
> This is a tentative patch to fix PR46779 and hopefully also related
> issues like PR45291.
>
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
+  /* Don't allocate data to non-GENERAL_REGS registers.  */
+
+  if (regno >= 32)
     return 0;

I think that we not need in this code.
GCC core must bother about this.

+
+  if (GET_MODE_SIZE (mode) == 1)
     return 1;

I'm agree with this.

+
+  /* Disallow big registers that overlap the frame pointer.
+     This will hopefully reduce the number of spill failures.  */
+
+  if (GET_MODE_SIZE (mode) > 2
+      && regno <= REG_Y
+      && regno + GET_MODE_SIZE (mode) >= REG_Y + 1)
+    {
+      return 0;
+    }

Fragment from GCC info:
--------------------------------------
HARD_REGNO_MODE_OK (regno, mode)A C expression that is nonzero if it
is permissible to store a value of mode mode in hard register number
regno (or in several registers starting with that one). For a machine
where all registers are equivalent, a suitable definition is

#define HARD_REGNO_MODE_OK(REGNO, MODE) 1

You need not include code to check for the numbers of fixed registers,
because the allocation mechanism considers them to be always occupied.
-----------------------------------------
Again, GCC core must bother about this.

-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
-
-  if (mode == QImode)
-    return 1;
-
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
-    return 0;
-

This is a right change.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-09 18:50 ` Denis Chertykov
@ 2011-06-09 19:43   ` Georg-Johann Lay
  2011-06-09 19:51     ` Denis Chertykov
  2011-06-09 20:03     ` Georg-Johann Lay
  2011-06-10  9:27   ` Georg-Johann Lay
  1 sibling, 2 replies; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-09 19:43 UTC (permalink / raw)
  To: Denis Chertykov; +Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington

Denis Chertykov schrieb:
> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>> This is a tentative patch to fix PR46779 and hopefully also related
>> issues like PR45291.
>>
> -  /* Disallow QImode in stack pointer regs.  */
> -  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
> +  /* Don't allocate data to non-GENERAL_REGS registers.  */
> +
> +  if (regno >= 32)
>      return 0;
> 
> I think that we not need in this code.
> GCC core must bother about this.

I am unsure about that. There is the "q" constraint that is used for
SP. register_operand will accept SP because regno_reg_class does not
return NO_REGS for SP. So something must stop the register allocator
from allocating ordinary data to SP.

In the current md there is "*movhi_sp" insn prior to "*movhi" insn.
Besides that it is strongly discouraged to have more than one move
insn for one mode, not each and every place looks at insn conditions.

Moreover, if there is an insn like
(set (reg:HI 100)
     (reg:HI 101))
that matches "*movhi", what will keep IRA from allocating one of the
values to SP, i.e. chose alternative "q,r" or "r,q"?

> +
> +  if (GET_MODE_SIZE (mode) == 1)
>      return 1;
> 
> I'm agree with this.

> +  /* Disallow big registers that overlap the frame pointer.
> +     This will hopefully reduce the number of spill failures.  */
> +
> +  if (GET_MODE_SIZE (mode) > 2
> +      && regno <= REG_Y
> +      && regno + GET_MODE_SIZE (mode) >= REG_Y + 1)
> +    {
> +      return 0;
> +    }
> 
> Fragment from GCC info:
> --------------------------------------
> HARD_REGNO_MODE_OK (regno, mode)A C expression that is nonzero if it
> is permissible to store a value of mode mode in hard register number
> regno (or in several registers starting with that one). For a machine
> where all registers are equivalent, a suitable definition is
> 
> #define HARD_REGNO_MODE_OK(REGNO, MODE) 1
> 
> You need not include code to check for the numbers of fixed registers,
> because the allocation mechanism considers them to be always occupied.
> -----------------------------------------
> Again, GCC core must bother about this.

I agree with you. However, I think that the risk of spill failure
should be minimized. I have no idea how to fix a splill failure like
the following that occurs in testsuite (with -Os, no matter if the
patch is applied or not):

./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: unable
to find a register to spill in class 'POINTER_REGS'
./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: this is
the insn:
(insn 61 60 62 10 (set (reg/v:SI 12 r12 [orig:73 b0 ] [73])
        (mem:SI (subreg:HI (reg/v:SI 70 [ srcp2 ]) 0) [2 *D.2218_56+0
S4 A8])) ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:64 12 {*movsi}
     (nil))
./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: internal
compiler error: in spill_failure, at reload1.c:2113

Actually I have no idea if the snip in avr_hard_regno_mode_ok actually
would reduce the risk of spill failure :-/

> -  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
> -  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
> -    return 0;
> -
> -  if (mode == QImode)
> -    return 1;
> -
> -  /* Modes larger than QImode occupy consecutive registers.  */
> -  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
> -    return 0;
> -
> 
> This is a right change.
> 
> Denis.

Johann

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-09 19:43   ` Georg-Johann Lay
@ 2011-06-09 19:51     ` Denis Chertykov
  2011-06-10 10:23       ` Georg-Johann Lay
  2011-06-09 20:03     ` Georg-Johann Lay
  1 sibling, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-09 19:51 UTC (permalink / raw)
  To: Georg-Johann Lay; +Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov schrieb:
>> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>>> This is a tentative patch to fix PR46779 and hopefully also related
>>> issues like PR45291.
>>>
>> -  /* Disallow QImode in stack pointer regs.  */
>> -  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
>> +  /* Don't allocate data to non-GENERAL_REGS registers.  */
>> +
>> +  if (regno >= 32)
>>      return 0;
>>
>> I think that we not need in this code.
>> GCC core must bother about this.
>
> I am unsure about that. There is the "q" constraint that is used for
> SP. register_operand will accept SP because regno_reg_class does not
> return NO_REGS for SP. So something must stop the register allocator
> from allocating ordinary data to SP.

I know, this is a FIXED_REGISTERS.

> In the current md there is "*movhi_sp" insn prior to "*movhi" insn.
> Besides that it is strongly discouraged to have more than one move
> insn for one mode, not each and every place looks at insn conditions.

I'm agree. May be better to rewrite it.

>
> Moreover, if there is an insn like
> (set (reg:HI 100)
>     (reg:HI 101))
> that matches "*movhi", what will keep IRA from allocating one of the
> values to SP, i.e. chose alternative "q,r" or "r,q"?

FIXED_REGISTERS.

>> +    {
>> +      return 0;
>> +    }
>>
>> Fragment from GCC info:
>> --------------------------------------
>> HARD_REGNO_MODE_OK (regno, mode)A C expression that is nonzero if it
>> is permissible to store a value of mode mode in hard register number
>> regno (or in several registers starting with that one). For a machine
>> where all registers are equivalent, a suitable definition is
>>
>> #define HARD_REGNO_MODE_OK(REGNO, MODE) 1
>>
>> You need not include code to check for the numbers of fixed registers,
>> because the allocation mechanism considers them to be always occupied.
>> -----------------------------------------
>> Again, GCC core must bother about this.
>
> I agree with you. However, I think that the risk of spill failure
> should be minimized. I have no idea how to fix a splill failure like
> the following that occurs in testsuite (with -Os, no matter if the
> patch is applied or not):
>
> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: unable
> to find a register to spill in class 'POINTER_REGS'
> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: this is
> the insn:
> (insn 61 60 62 10 (set (reg/v:SI 12 r12 [orig:73 b0 ] [73])
>        (mem:SI (subreg:HI (reg/v:SI 70 [ srcp2 ]) 0) [2 *D.2218_56+0
> S4 A8])) ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:64 12 {*movsi}
>     (nil))
> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: internal
> compiler error: in spill_failure, at reload1.c:2113
>
> Actually I have no idea if the snip in avr_hard_regno_mode_ok actually
> would reduce the risk of spill failure :-/

I think, no.
I will try to debug reload for pr38051.c (It will be a long process 1-2 weeks)

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-09 19:43   ` Georg-Johann Lay
  2011-06-09 19:51     ` Denis Chertykov
@ 2011-06-09 20:03     ` Georg-Johann Lay
  1 sibling, 0 replies; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-09 20:03 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov, Eric B. Weddington

Georg-Johann Lay schrieb:
> Denis Chertykov schrieb:
> 
>>2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>>
>>>This is a tentative patch to fix PR46779 and hopefully also related
>>>issues like PR45291.
>>>
>>
>>-  /* Disallow QImode in stack pointer regs.  */
>>-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
>>+  /* Don't allocate data to non-GENERAL_REGS registers.  */
>>+
>>+  if (regno >= 32)
>>     return 0;
>>
>>I think that we not need in this code.
>>GCC core must bother about this.
> 
> 
> I am unsure about that. There is the "q" constraint that is used for
> SP. register_operand will accept SP because regno_reg_class does not
> return NO_REGS for SP. So something must stop the register allocator
> from allocating ordinary data to SP.
> 
> In the current md there is "*movhi_sp" insn prior to "*movhi" insn.
> Besides that it is strongly discouraged to have more than one move
> insn for one mode, not each and every place looks at insn conditions.
> 
> Moreover, if there is an insn like
> (set (reg:HI 100)
>      (reg:HI 101))
> that matches "*movhi", what will keep IRA from allocating one of the
> values to SP, i.e. chose alternative "q,r" or "r,q"?

Ok, SP is a fixed register, I missed that above :-)

For the "*movhi_sp" I will provide a patch too as soon as I find the time.

Johann

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-09 18:50 ` Denis Chertykov
  2011-06-09 19:43   ` Georg-Johann Lay
@ 2011-06-10  9:27   ` Georg-Johann Lay
  2011-06-21 17:17     ` Georg-Johann Lay
  1 sibling, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-10  9:27 UTC (permalink / raw)
  To: Denis Chertykov; +Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington

[-- Attachment #1: Type: text/plain, Size: 2236 bytes --]

Denis Chertykov schrieb:
> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>> This is a tentative patch to fix PR46779 and hopefully also related
>> issues like PR45291.
>>
> -  /* Disallow QImode in stack pointer regs.  */
> -  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
> +  /* Don't allocate data to non-GENERAL_REGS registers.  */
> +
> +  if (regno >= 32)
>      return 0;
> 
> I think that we not need in this code.
> GCC core must bother about this.
> 
> +
> +  if (GET_MODE_SIZE (mode) == 1)
>      return 1;
> 
> I'm agree with this.
> 
> +
> +  /* Disallow big registers that overlap the frame pointer.
> +     This will hopefully reduce the number of spill failures.  */
> +
> +  if (GET_MODE_SIZE (mode) > 2
> +      && regno <= REG_Y
> +      && regno + GET_MODE_SIZE (mode) >= REG_Y + 1)
> +    {
> +      return 0;
> +    }
> 
> Fragment from GCC info:
> --------------------------------------
> HARD_REGNO_MODE_OK (regno, mode)A C expression that is nonzero if it
> is permissible to store a value of mode mode in hard register number
> regno (or in several registers starting with that one). For a machine
> where all registers are equivalent, a suitable definition is
> 
> #define HARD_REGNO_MODE_OK(REGNO, MODE) 1
> 
> You need not include code to check for the numbers of fixed registers,
> because the allocation mechanism considers them to be always occupied.
> -----------------------------------------
> Again, GCC core must bother about this.
> 
> -  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
> -  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
> -    return 0;
> -
> -  if (mode == QImode)
> -    return 1;
> -
> -  /* Modes larger than QImode occupy consecutive registers.  */
> -  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
> -    return 0;
> -
> 
> This is a right change.
> 
> Denis.

So the patch turns avr_hard_regno_mode_ok into plain vanilla...

Johann

--

	PR target/46779
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.


[-- Attachment #2: pr46779.diff --]
[-- Type: text/x-patch, Size: 2999 bytes --]

Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(Revision 174701)
+++ config/avr/avr.c	(Arbeitskopie)
@@ -6276,26 +6276,20 @@ jump_over_one_insn_p (rtx insn, rtx dest
 int
 avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
 {
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
-    return 0;
-
-  /* The only thing that can go into registers r28:r29 is a Pmode.  */
-  if (regno == REG_Y && mode == Pmode)
-    return 1;
-
-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
-
-  if (mode == QImode)
+  /* Any GENERAL_REGS register can hold 8-bit values.  */
+  /* FIXME:
+     8-bit values must not be disallowed for R28 or R29.  Disallowing
+     QI et al. in these registers might lead to code like
+         (set (subreg:QI (reg:HI 28)) ...)
+     which will result in wrong code because reload does not handle
+     SUBREGs of hard regsisters like this.  This could be fixed in reload.
+     However, it appears that fixing reload is not wanted by reload people.  */
+  
+  if (GET_MODE_SIZE (mode) == 1)
     return 1;
-
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
-    return 0;
-
-  /* All modes larger than QImode should start in an even register.  */
+  
+  /* All modes larger than 8 bits should start in an even register.  */
+  
   return !(regno & 1);
 }
 
@@ -6422,13 +6416,23 @@ avr_hard_regno_scratch_ok (unsigned int
       && !df_regs_ever_live_p (regno))
     return false;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (regno == REG_Y || regno == REG_Y + 1))
+    {
+      return false;
+    }
+
   return true;
 }
 
 /* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
 
 int
-avr_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
+avr_hard_regno_rename_ok (unsigned int old_reg,
 			  unsigned int new_reg)
 {
   /* Interrupt functions can only use registers that have already been
@@ -6439,6 +6443,17 @@ avr_hard_regno_rename_ok (unsigned int o
       && !df_regs_ever_live_p (new_reg))
     return 0;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (old_reg == REG_Y || old_reg == REG_Y + 1
+          || new_reg == REG_Y || new_reg == REG_Y + 1))
+    {
+      return 0;
+    }
+  
   return 1;
 }
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-09 19:51     ` Denis Chertykov
@ 2011-06-10 10:23       ` Georg-Johann Lay
  2011-06-12 10:34         ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-10 10:23 UTC (permalink / raw)
  To: Denis Chertykov; +Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington

Denis Chertykov schrieb:
> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>> Denis Chertykov schrieb:
>>> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>> I agree with you. However, I think that the risk of spill failure
>> should be minimized. I have no idea how to fix a splill failure like
>> the following that occurs in testsuite (with -Os, no matter if the
>> patch is applied or not):
>>
>> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: unable
>> to find a register to spill in class 'POINTER_REGS'
>> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: this is
>> the insn:
>> (insn 61 60 62 10 (set (reg/v:SI 12 r12 [orig:73 b0 ] [73])
>>        (mem:SI (subreg:HI (reg/v:SI 70 [ srcp2 ]) 0) [2 *D.2218_56+0
>> S4 A8])) ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:64 12 {*movsi}
>>     (nil))
>> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: internal
>> compiler error: in spill_failure, at reload1.c:2113
>>
>> Actually I have no idea if the snip in avr_hard_regno_mode_ok actually
>> would reduce the risk of spill failure :-/
> 
> I think, no.
> I will try to debug reload for pr38051.c (It will be a long process 1-2 weeks)
> 
> Denis.

Thanks for doing that! Debugging reload is really no fun...

I see spill fails for these test cases with -mmcu=atmega128

* gcc.c-torture/execute/pr38051.c (-Os)
* gcc.dg/pr43670.c      (-O -ftree-vrp -fcompare-debug)
* gcc.dg/20030324-1.c   (-O -fstrict-aliasing -fgcse)

all complainins contain RTX like

   (mem:SI/SF (subreg:HI (reg:SI xxx) 0/2))


Then I have a question on spill failures:

There is PR46278, an optimization flaw that goes as follows:

The avr BE defines fake addressing mode X+const that has to be written
down in asm as
  X += const
  a = *X
  X -= const

The comment says that this is just to cover a corner case, however the
new register allocator uses this case more often or even greedily.
There is no way to describe the cost for such an access, and as X has
lower prologue/epilogue cost than Y, X is preferred over Y often.

In 4.7, I see that flaw less often than in 4.5.  However, I think the
best way is not to lie at the register allocator and not to supply a
fake instruction like that.

So I started to work on a fix. The changes in avr.h are:

	* config/avr/avr.h (BASE_REG_CLASS): Remove.
	(REG_OK_FOR_BASE_NOSTRICT_P): Remove.
	(REG_OK_FOR_BASE_STRICT_P): Remove.
	(MODE_CODE_BASE_REG_CLASS): New Define.
	(REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.

The macros REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS
allow a better, fine grained control over addressing modes for each
hard register and allow to support X without fake instructions. The
code quality actually improves, but I see a new spill failure for the
test case

* gcc.c-torture/compile/950612-1.c

On the one hand, that test case has heavy long long use and is no
"real world code" to run on AVR. On the other hand, I don't think
trading code quality here against ICE there is a good idea.

What do you think about that? As I have no idea how to fix a spill
failure in the backend, I stopped working on a patch.

Then I observed trouble with DI patterns during libgcc build and had
to remove

* "zero_extendqidi2"
* "zero_extendhidi2"
* "zero_extendsidi2"

These are "orphan" insns: they deal with DI without having movdi
support so I removed them.

Johann

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-10 10:23       ` Georg-Johann Lay
@ 2011-06-12 10:34         ` Denis Chertykov
  2011-06-13 18:10           ` Georg-Johann Lay
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-12 10:34 UTC (permalink / raw)
  To: Georg-Johann Lay; +Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> Then I have a question on spill failures:
>
> There is PR46278, an optimization flaw that goes as follows:
>
> The avr BE defines fake addressing mode X+const that has to be written
> down in asm as
>  X += const
>  a = *X
>  X -= const
>
> The comment says that this is just to cover a corner case, however the
> new register allocator uses this case more often or even greedily.
> There is no way to describe the cost for such an access, and as X has
> lower prologue/epilogue cost than Y, X is preferred over Y often.
>
> In 4.7, I see that flaw less often than in 4.5.  However, I think the
> best way is not to lie at the register allocator and not to supply a
> fake instruction like that.
>
> So I started to work on a fix. The changes in avr.h are:
>
>        * config/avr/avr.h (BASE_REG_CLASS): Remove.
>        (REG_OK_FOR_BASE_NOSTRICT_P): Remove.
>        (REG_OK_FOR_BASE_STRICT_P): Remove.
>        (MODE_CODE_BASE_REG_CLASS): New Define.
>        (REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.
>
> The macros REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS
> allow a better, fine grained control over addressing modes for each
> hard register and allow to support X without fake instructions. The
> code quality actually improves, but I see a new spill failure for the
> test case
>
> * gcc.c-torture/compile/950612-1.c
>
> On the one hand, that test case has heavy long long use and is no
> "real world code" to run on AVR. On the other hand, I don't think
> trading code quality here against ICE there is a good idea.
>
> What do you think about that? As I have no idea how to fix a spill
> failure in the backend, I stopped working on a patch.

Ohhh. It's a most complicated case for GCC for AVR.
Look carefully at `out_movqi_r_mr'.
There are even two fake addressing modes:
1. [Y + infinite-dslacement];
2. [X + (0...63)].
I have spent a many hours (days, months) to debug GCC (especially avr port
and reload) for right addressing modes.
I have stopped on this code.
AVR have a limited memory addressing and GCC can't handle it in native form.
Because of that I have supported a fake adddressing modes.
(Richard Henderson have a different oppinion: GCC can, AVR port can't)
IMHO that three limited pointer registers is not enough for C compiler.
Even more with frame pointer it's only two and X is a very limited.

1. [Y + infinite-dslacement] it's helper for reload addressing.
For addressing too long local/spilled variable. (Y + more-than-63)

2. [X + (0...63)] for another one "normal" pointer address.

> Then I observed trouble with DI patterns during libgcc build and had
> to remove
>
> * "zero_extendqidi2"
> * "zero_extendhidi2"
> * "zero_extendsidi2"
>
> These are "orphan" insns: they deal with DI without having movdi
> support so I removed them.

This seems strange for me.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-12 10:34         ` Denis Chertykov
@ 2011-06-13 18:10           ` Georg-Johann Lay
  2011-06-13 22:37             ` Denis Chertykov
  2011-06-22  3:28             ` Hans-Peter Nilsson
  0 siblings, 2 replies; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-13 18:10 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington, Richard Henderson

[In CCing Richard Henderson]

Denis Chertykov schrieb:
> 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> 
>>Then I have a question on spill failures:
>>
>>There is PR46278, an optimization flaw that goes as follows:
>>
>>The avr BE defines fake addressing mode X+const that has to be written
>>down in asm as
>> X += const
>> a = *X
>> X -= const
>>
>>The comment says that this is just to cover a corner case, however the
>>new register allocator uses this case more often or even greedily.
>>There is no way to describe the cost for such an access, and as X has
>>lower prologue/epilogue cost than Y, X is preferred over Y often.
>>
>>In 4.7, I see that flaw less often than in 4.5.  However, I think the
>>best way is not to lie at the register allocator and not to supply a
>>fake instruction like that.
>>
>>So I started to work on a fix. The changes in avr.h are:
>>
>>       * config/avr/avr.h (BASE_REG_CLASS): Remove.
>>       (REG_OK_FOR_BASE_NOSTRICT_P): Remove.
>>       (REG_OK_FOR_BASE_STRICT_P): Remove.
>>       (MODE_CODE_BASE_REG_CLASS): New Define.
>>       (REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.
>>
>>The macros REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS
>>allow a better, fine grained control over addressing modes for each
>>hard register and allow to support X without fake instructions. The
>>code quality actually improves, but I see a new spill failure for the
>>test case
>>
>>* gcc.c-torture/compile/950612-1.c
>>
>>On the one hand, that test case has heavy long long use and is no
>>"real world code" to run on AVR. On the other hand, I don't think
>>trading code quality here against ICE there is a good idea.
>>
>>What do you think about that? As I have no idea how to fix a spill
>>failure in the backend, I stopped working on a patch.
> 
> Ohhh. It's a most complicated case for GCC for AVR.

So you think is is pointless/discouraged to give a more realistic 
description of AVR addressing be means of MODE_CODE_BASE_REG_CLASS 
(instead of BASE_REG_CLASS) resp. REGNO_MODE_CODE_OK_FOR_BASE_P?

> Look carefully at `out_movqi_r_mr'.
> There are even two fake addressing modes:
> 1. [Y + infinite-dslacement];
> 2. [X + (0...63)].

Yes, I know. The first is introduced by avr_legitimate_address_p and the 
second appears to be artifact of LEGITIMIZE_RELOAD_ADDRESS.

The changes are basically MODE_CODE_BASE_REG_CLASS (introduced in 4.2) 
and a rewrite of avr_legitimate_address_p. The changes aim at a better 
addressing for X and to minimize fake addresses.

> I have spent a many hours (days, months) to debug GCC (especially avr port
> and reload) for right addressing modes.
> I have stopped on this code.
> AVR have a limited memory addressing and GCC can't handle it in native form.
> Because of that I have supported a fake adddressing modes.

I assume the code is from prior to 4.2 when 
REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS had not been 
available so that supporting X required some hacking.
All that would still be fine; however the new register allocator leads 
to code that noone would accept. Accessing a structure through a pointer 
is not uncommon, not even on AVR. So if Z is used for, say accessing 
flash, X appears to be the best register.

The shortcoming in GCC is that there is no way to give costs of 
addressing (TARGET_ADDRESS_COST does different things).

So take a look what avr-gcc compiles here:
   http://gcc.gnu.org/bugzilla/attachment.cgi?id=22242
I saw similar complains in forums on the web.

> (Richard Henderson have a different opinion: GCC can, AVR port can't)

What does he mean with that?

> IMHO that three limited pointer registers is not enough for C compiler.
> Even more with frame pointer it's only two and X is a very limited.

The current implementation has several oddities like

* allowing SUBREGs in avr-legitimate_address_p
* changing BASE_REG_CLASS on the fly (by means of reload_completed)

isn't that supposed to cause trouble?

> 1. [Y + infinite-dslacement] it's helper for reload addressing.
> For addressing too long local/spilled variable. (Y + more-than-63)
> 
> 2. [X + (0...63)] for another one "normal" pointer address.

____________________

>>Then I observed trouble with DI patterns during libgcc build and had
>>to remove
>>
>>* "zero_extendqidi2"
>>* "zero_extendhidi2"
>>* "zero_extendsidi2"
>>
>>These are "orphan" insns: they deal with DI without having movdi
>>support so I removed them.
> 
> This seems strange for me.

As far as I know, to support a mode a respective mov insn is needed, 
which is not the case for DI. I don't know the exact rationale behind 
that (reloading?), just read is on gcc list by Ian Taylor (and also that 
it is stronly discouraged to have more than one mov insn per mode).

So if the requirement to have mov insn is dropped and without the burden 
to implement movdi, it would be rather easy to implement adddi3 and 
subdi3 for avr...

Johann

> Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-13 18:10           ` Georg-Johann Lay
@ 2011-06-13 22:37             ` Denis Chertykov
  2011-06-14 10:39               ` Georg-Johann Lay
  2011-06-22  3:28             ` Hans-Peter Nilsson
  1 sibling, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-13 22:37 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington, Richard Henderson

2011/6/13 Georg-Johann Lay <avr@gjlay.de>:
>
> So you think is is pointless/discouraged to give a more realistic
> description of AVR addressing be means of MODE_CODE_BASE_REG_CLASS (instead
> of BASE_REG_CLASS) resp. REGNO_MODE_CODE_OK_FOR_BASE_P?
>
>> Look carefully at `out_movqi_r_mr'.
>> There are even two fake addressing modes:
>> 1. [Y + infinite-dslacement];
>> 2. [X + (0...63)].
>
> Yes, I know. The first is introduced by avr_legitimate_address_p and the
> second appears to be artifact of LEGITIMIZE_RELOAD_ADDRESS.
>
> The changes are basically MODE_CODE_BASE_REG_CLASS (introduced in 4.2) and a
> rewrite of avr_legitimate_address_p. The changes aim at a better addressing
> for X and to minimize fake addresses.
>
>> I have spent a many hours (days, months) to debug GCC (especially avr port
>> and reload) for right addressing modes.
>> I have stopped on this code.
>> AVR have a limited memory addressing and GCC can't handle it in native
>> form.
>> Because of that I have supported a fake adddressing modes.
>
> I assume the code is from prior to 4.2 when REGNO_MODE_CODE_OK_FOR_BASE_P
> and MODE_CODE_BASE_REG_CLASS had not been available so that supporting X
> required some hacking.
> All that would still be fine; however the new register allocator leads to
> code that noone would accept. Accessing a structure through a pointer is not
> uncommon, not even on AVR. So if Z is used for, say accessing flash, X
> appears to be the best register.
>
> The shortcoming in GCC is that there is no way to give costs of addressing
> (TARGET_ADDRESS_COST does different things).
>
> So take a look what avr-gcc compiles here:
>  http://gcc.gnu.org/bugzilla/attachment.cgi?id=22242
> I saw similar complains in forums on the web.
>
>> (Richard Henderson have a different opinion: GCC can, AVR port can't)
>
> What does he mean with that?
>
>> IMHO that three limited pointer registers is not enough for C compiler.
>> Even more with frame pointer it's only two and X is a very limited.
>
> The current implementation has several oddities like
>
> * allowing SUBREGs in avr-legitimate_address_p
> * changing BASE_REG_CLASS on the fly (by means of reload_completed)
>
> isn't that supposed to cause trouble?

You can try to remove all oddities and check results.
Definitely something changed in GCC core since I wrote addressing code.

>>>
>>> * "zero_extendqidi2"
>>> * "zero_extendhidi2"
>>> * "zero_extendsidi2"
>>>
>>> These are "orphan" insns: they deal with DI without having movdi
>>> support so I removed them.
>>
>> This seems strange for me.
>
> As far as I know, to support a mode a respective mov insn is needed, which
> is not the case for DI. I don't know the exact rationale behind that
> (reloading?), just read is on gcc list by Ian Taylor (and also that it is
> stronly discouraged to have more than one mov insn per mode).
>
> So if the requirement to have mov insn is dropped and without the burden to
> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
> avr...

Personally, I don't like to maintain 64bits integers in AVR port, but
may be somebody use it.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-13 22:37             ` Denis Chertykov
@ 2011-06-14 10:39               ` Georg-Johann Lay
  2011-06-14 11:38                 ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-14 10:39 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 4555 bytes --]

Denis Chertykov schrieb:
> 2011/6/13 Georg-Johann Lay <avr@gjlay.de>:
>> So you think is is pointless/discouraged to give a more realistic
>> description of AVR addressing be means of MODE_CODE_BASE_REG_CLASS (instead
>> of BASE_REG_CLASS) resp. REGNO_MODE_CODE_OK_FOR_BASE_P?
>>
>>> Look carefully at `out_movqi_r_mr'.
>>> There are even two fake addressing modes:
>>> 1. [Y + infinite-dslacement];
>>> 2. [X + (0...63)].
>> Yes, I know. The first is introduced by avr_legitimate_address_p and the
>> second appears to be artifact of LEGITIMIZE_RELOAD_ADDRESS.
>>
>> The changes are basically MODE_CODE_BASE_REG_CLASS (introduced in 4.2) and a
>> rewrite of avr_legitimate_address_p. The changes aim at a better addressing
>> for X and to minimize fake addresses.
>>
>>> I have spent a many hours (days, months) to debug GCC (especially avr port
>>> and reload) for right addressing modes.
>>> I have stopped on this code.
>>> AVR have a limited memory addressing and GCC can't handle it in native
>>> form.
>>> Because of that I have supported a fake adddressing modes.
>> I assume the code is from prior to 4.2 when REGNO_MODE_CODE_OK_FOR_BASE_P
>> and MODE_CODE_BASE_REG_CLASS had not been available so that supporting X
>> required some hacking.
>> All that would still be fine; however the new register allocator leads to
>> code that noone would accept. Accessing a structure through a pointer is not
>> uncommon, not even on AVR. So if Z is used for, say accessing flash, X
>> appears to be the best register.
>>
>> The shortcoming in GCC is that there is no way to give costs of addressing
>> (TARGET_ADDRESS_COST does different things).
>>
>> So take a look what avr-gcc compiles here:
>>  http://gcc.gnu.org/bugzilla/attachment.cgi?id=22242
>> I saw similar complains in forums on the web.
>>
>>> (Richard Henderson have a different opinion: GCC can, AVR port can't)
>> What does he mean with that?
>>
>>> IMHO that three limited pointer registers is not enough for C compiler.
>>> Even more with frame pointer it's only two and X is a very limited.
>> The current implementation has several oddities like
>>
>> * allowing SUBREGs in avr-legitimate_address_p
>> * changing BASE_REG_CLASS on the fly (by means of reload_completed)
>>
>> isn't that supposed to cause trouble?
> 
> You can try to remove all oddities and check results.
> Definitely something changed in GCC core since I wrote addressing code.
> 
> 
> Denis.

For your interest, here is a patch that shows the changes in
addressing mode.

Note that the

* LEGITIMIZE_RELOAD_ADDRESS is disabled. This is because I am
  unsure about how it should look like. The special cases for X
  are no more needed, and for Y and Z it might be good to have
  intermediate addresses with, say offset =0 mod 60, so that
  big offsets can be reached with addr + const, 0<= const < 60.

* patch already includes patch for pr46779 from
  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html

* As I said above, I removed orphan DI insns.

So if you have a look into reload, it might also be interesting what
it does with this changes.

Johann

--

	* config/avr/avr.h (BASE_REG_CLASS): Remove.
	(REG_OK_FOR_BASE_NOSTRICT_P): Remove.
	(REG_OK_FOR_BASE_STRICT_P): Remove.
	(LEGITIMIZE_RELOAD_ADDRESS): Remove.
	(MODE_CODE_BASE_REG_CLASS): New Define.
	(REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.

	* config/avr/avr.c: (avr_legitimate_address_p): Rewrite to allow
	addresses that can actually be handled by hardware.
	(avr_regno_mode_code_ok_for_base_p): New global Function.
	(avr_mode_code_base_reg_class): New global Function.
	(avr_hard_regno_mode_ok): Allow QI in all GPRs.
	(avr_reg_ok_for_addr): New static function.
	(avr_regno_reg_class): Change return type from enum reg_class to
	reg_class_t.
	(reg_class_tab): Set base type to reg_class_t. Return smallest
	register class for each register.

	* config/avr/avr.md: ("*sbrx_branch<mode>"): Disallow DI in mode.
	("rotl<mode>3"): Ditto.
	("*movqi"): Remove constraint 'Q'.
	("*movsi"): Ditto.
	("*movsf"): Ditto.
	("*ashlqi3", "ashrqi3", "*lshrqi3"): Ditto.
	("ashlhi3", "ashrhi3", "lshrhi3"): Ditto.
	("ashlsi3", "ashrsi3", "lshrsi3"): Ditto.
	("*movhi_sp"): Remove insn.
	("zero_extendqidi2"): Remove insn_and_split.
	("zero_extendhidi2"): Remove insn_and_split.
	("zero_extendsidi2"): Remove insn_and_split.

	* config/avr/avr-protos.h
	(secondary_input_reload_class): Remove prototype.
	(avr_mode_code_base_reg_class): New prototype.
	(avr_regno_mode_code_ok_for_base_p): New prototype.
	(avr_legitimize_reload_address): New prototype.

[-- Attachment #2: new-addr-v3.diff --]
[-- Type: text/x-patch, Size: 22348 bytes --]

Index: config/avr/avr-protos.h
===================================================================
--- config/avr/avr-protos.h	(Revision 175011)
+++ config/avr/avr-protos.h	(Arbeitskopie)
@@ -24,7 +24,7 @@
 
 extern int function_arg_regno_p (int r);
 extern void avr_cpu_cpp_builtins (struct cpp_reader * pfile);
-extern enum reg_class avr_regno_reg_class (int r);
+extern reg_class_t avr_regno_reg_class (unsigned int r);
 extern void asm_globalize_label (FILE *file, const char *name);
 extern void avr_asm_declare_function_name (FILE *, const char *, tree);
 extern void order_regs_for_local_alloc (void);
@@ -88,9 +88,6 @@ extern int extra_constraint_Q (rtx x);
 extern int adjust_insn_length (rtx insn, int len);
 extern const char *output_reload_inhi (rtx insn, rtx *operands, int *len);
 extern const char *output_reload_insisf (rtx insn, rtx *operands, int *len);
-extern enum reg_class secondary_input_reload_class (enum reg_class,
-						    enum machine_mode,
-						    rtx);
 extern void notice_update_cc (rtx body, rtx insn);
 extern void print_operand (FILE *file, rtx x, int code);
 extern void print_operand_address (FILE *file, rtx addr);
@@ -109,6 +106,8 @@ extern RTX_CODE avr_normalize_condition
 extern int compare_eq_p (rtx insn);
 extern void out_shift_with_cnt (const char *templ, rtx insn,
 				rtx operands[], int *len, int t_len);
+extern reg_class_t avr_mode_code_base_reg_class (enum machine_mode, RTX_CODE, RTX_CODE);
+extern bool avr_regno_mode_code_ok_for_base_p (int, enum machine_mode, RTX_CODE, RTX_CODE);
 extern rtx avr_incoming_return_addr_rtx (void);
 #endif /* RTX_CODE */
 
Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(Revision 175011)
+++ config/avr/avr.md	(Arbeitskopie)
@@ -246,8 +246,8 @@ (define_expand "movqi"
   ")
 
 (define_insn "*movqi"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,d,Qm,r,q,r,*r")
-	(match_operand:QI 1 "general_operand"       "rL,i,rL,Qm,r,q,i"))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,d,m,r,q,r,*r")
+	(match_operand:QI 1 "general_operand"       "rL,i,rL,m,r,q,i"))]
   "(register_operand (operands[0],QImode)
     || register_operand (operands[1], QImode) || const0_rtx == operands[1])"
   "* return output_movqi (insn, operands, NULL);"
@@ -295,15 +295,6 @@ (define_expand "movhi"
     }
 }")
 
-(define_insn "*movhi_sp"
-  [(set (match_operand:HI 0 "register_operand" "=q,r")
-        (match_operand:HI 1 "register_operand"  "r,q"))]
-  "((stack_register_operand(operands[0], HImode) && register_operand (operands[1], HImode))
-    || (register_operand (operands[0], HImode) && stack_register_operand(operands[1], HImode)))"
-  "* return output_movhi (insn, operands, NULL);"
-  [(set_attr "length" "5,2")
-   (set_attr "cc" "none,none")])
-
 (define_insn "movhi_sp_r_irq_off"
   [(set (match_operand:HI 0 "stack_register_operand" "=q")
         (unspec_volatile:HI [(match_operand:HI 1 "register_operand"  "r")] 
@@ -425,8 +416,8 @@ (define_insn "*reload_insi"
 
 
 (define_insn "*movsi"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,Qm,!d,r")
-        (match_operand:SI 1 "general_operand"       "r,L,Qm,rL,i,i"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,m,!d,r")
+        (match_operand:SI 1 "general_operand"       "r,L,m,rL,i,i"))]
   "(register_operand (operands[0],SImode)
     || register_operand (operands[1],SImode) || const0_rtx == operands[1])"
   "* return output_movsisf (insn, operands, NULL);"
@@ -451,8 +442,8 @@ (define_expand "movsf"
 }")
 
 (define_insn "*movsf"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r,Qm,!d,r")
-        (match_operand:SF 1 "general_operand"       "r,G,Qm,r,F,F"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r,m,!d,r")
+        (match_operand:SF 1 "general_operand"       "r,G,m,r,F,F"))]
   "register_operand (operands[0], SFmode)
    || register_operand (operands[1], SFmode)"
   "* return output_movsisf (insn, operands, NULL);"
@@ -1518,8 +1509,8 @@ (define_mode_attr rotx [(DI "&r,&r,X") (
 (define_mode_attr rotsmode [(DI "QI") (SI "HI") (HI "QI")])
 
 (define_expand "rotl<mode>3"
-  [(parallel [(set (match_operand:HIDI 0 "register_operand" "")
-		   (rotate:HIDI (match_operand:HIDI 1 "register_operand" "")
+  [(parallel [(set (match_operand:HISI 0 "register_operand" "")
+		   (rotate:HISI (match_operand:HISI 1 "register_operand" "")
 				(match_operand:VOID 2 "const_int_operand" "")))
 		(clobber (match_dup 3))])]
   ""
@@ -1618,7 +1609,7 @@ (define_split ; ashlqi3_const6
 (define_insn "*ashlqi3"
   [(set (match_operand:QI 0 "register_operand"           "=r,r,r,r,!d,r,r")
 	(ashift:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0,0")
-		   (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,Qm")))]
+		   (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,m")))]
   ""
   "* return ashlqi3_out (insn, operands, NULL);"
   [(set_attr "length" "5,0,1,2,4,6,9")
@@ -1627,7 +1618,7 @@ (define_insn "*ashlqi3"
 (define_insn "ashlhi3"
   [(set (match_operand:HI 0 "register_operand"           "=r,r,r,r,r,r,r")
 	(ashift:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0")
-		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashlhi3_out (insn, operands, NULL);"
   [(set_attr "length" "6,0,2,2,4,10,10")
@@ -1636,7 +1627,7 @@ (define_insn "ashlhi3"
 (define_insn "ashlsi3"
   [(set (match_operand:SI 0 "register_operand"           "=r,r,r,r,r,r,r")
 	(ashift:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0")
-		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashlsi3_out (insn, operands, NULL);"
   [(set_attr "length" "8,0,4,4,8,10,12")
@@ -1725,7 +1716,7 @@ (define_insn "*ashlsi3_const"
 (define_insn "ashrqi3"
   [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,r,r")
 	(ashiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,m")))]
   ""
   "* return ashrqi3_out (insn, operands, NULL);"
   [(set_attr "length" "5,0,1,2,5,9")
@@ -1734,7 +1725,7 @@ (define_insn "ashrqi3"
 (define_insn "ashrhi3"
   [(set (match_operand:HI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashrhi3_out (insn, operands, NULL);"
   [(set_attr "length" "6,0,2,4,4,10,10")
@@ -1743,7 +1734,7 @@ (define_insn "ashrhi3"
 (define_insn "ashrsi3"
   [(set (match_operand:SI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashrsi3_out (insn, operands, NULL);"
   [(set_attr "length" "8,0,4,6,8,10,12")
@@ -1833,7 +1824,7 @@ (define_split	; lshrqi3_const6
 (define_insn "*lshrqi3"
   [(set (match_operand:QI 0 "register_operand"             "=r,r,r,r,!d,r,r")
 	(lshiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,m")))]
   ""
   "* return lshrqi3_out (insn, operands, NULL);"
   [(set_attr "length" "5,0,1,2,4,6,9")
@@ -1842,7 +1833,7 @@ (define_insn "*lshrqi3"
 (define_insn "lshrhi3"
   [(set (match_operand:HI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return lshrhi3_out (insn, operands, NULL);"
   [(set_attr "length" "6,0,2,2,4,10,10")
@@ -1851,7 +1842,7 @@ (define_insn "lshrhi3"
 (define_insn "lshrsi3"
   [(set (match_operand:SI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return lshrsi3_out (insn, operands, NULL);"
   [(set_attr "length" "8,0,4,4,8,10,12")
@@ -2124,54 +2115,6 @@ (define_insn_and_split "zero_extendhisi2
   operands[3] = simplify_gen_subreg (HImode, operands[0], SImode, high_off);
 })
 
-(define_insn_and_split "zero_extendqidi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-        (zero_extend:DI (match_operand:QI 1 "register_operand" "r")))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 2) (zero_extend:SI (match_dup 1)))
-   (set (match_dup 3) (const_int 0))]
-{
-  unsigned int low_off = subreg_lowpart_offset (SImode, DImode);
-  unsigned int high_off = subreg_highpart_offset (SImode, DImode);
-
-  operands[2] = simplify_gen_subreg (SImode, operands[0], DImode, low_off);
-  operands[3] = simplify_gen_subreg (SImode, operands[0], DImode, high_off);
-})
-
-(define_insn_and_split "zero_extendhidi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-        (zero_extend:DI (match_operand:HI 1 "register_operand" "r")))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 2) (zero_extend:SI (match_dup 1)))
-   (set (match_dup 3) (const_int 0))]
-{
-  unsigned int low_off = subreg_lowpart_offset (SImode, DImode);
-  unsigned int high_off = subreg_highpart_offset (SImode, DImode);
-
-  operands[2] = simplify_gen_subreg (SImode, operands[0], DImode, low_off);
-  operands[3] = simplify_gen_subreg (SImode, operands[0], DImode, high_off);
-})
-
-(define_insn_and_split "zero_extendsidi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-        (zero_extend:DI (match_operand:SI 1 "register_operand" "r")))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 2) (match_dup 1))
-   (set (match_dup 3) (const_int 0))]
-{
-  unsigned int low_off = subreg_lowpart_offset (SImode, DImode);
-  unsigned int high_off = subreg_highpart_offset (SImode, DImode);
-
-  operands[2] = simplify_gen_subreg (SImode, operands[0], DImode, low_off);
-  operands[3] = simplify_gen_subreg (SImode, operands[0], DImode, high_off);
-})
-
 ;;<=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=>
 ;; compare
 
@@ -2430,7 +2373,7 @@ (define_insn "*sbrx_branch<mode>"
   [(set (pc)
         (if_then_else
 	 (match_operator 0 "eqne_operator"
-			 [(zero_extract:QIDI
+			 [(zero_extract:QISI
 			   (match_operand:VOID 1 "register_operand" "r")
 			   (const_int 1)
 			   (match_operand 2 "const_int_operand" "n"))
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(Revision 175011)
+++ config/avr/avr.c	(Arbeitskopie)
@@ -343,18 +343,27 @@ avr_help (void)
 
 /*  return register class from register number.  */
 
-static const enum reg_class reg_class_tab[]={
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS, /* r0 - r15 */
-  LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,
-  LD_REGS,                      /* r16 - 23 */
-  ADDW_REGS,ADDW_REGS,          /* r24,r25 */
-  POINTER_X_REGS,POINTER_X_REGS, /* r26,27 */
-  POINTER_Y_REGS,POINTER_Y_REGS, /* r28,r29 */
-  POINTER_Z_REGS,POINTER_Z_REGS, /* r30,r31 */
-  STACK_REG,STACK_REG           /* SPL,SPH */
+static const reg_class_t reg_class_tab[] =
+  {
+    /* r0 */
+    R0_REG,
+    /* r1 - r15 */
+    NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+    NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+    NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+    /* r16 - r23 */
+    SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
+    SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
+    /* r24, r25 */
+    ADDW_REGS, ADDW_REGS,
+    /* r26, r27 */
+    POINTER_X_REGS, POINTER_X_REGS,
+    /* r28, r29 */
+    POINTER_Y_REGS, POINTER_Y_REGS,
+    /* r30, r31 */
+    POINTER_Z_REGS, POINTER_Z_REGS,
+    /* SPL, SPH */
+    STACK_REG, STACK_REG
 };
 
 /* Function to set up the backend function structure.  */
@@ -367,8 +376,8 @@ avr_init_machine_status (void)
 
 /* Return register class for register R.  */
 
-enum reg_class
-avr_regno_reg_class (int r)
+reg_class_t
+avr_regno_reg_class (unsigned int r)
 {
   if (r <= 33)
     return reg_class_tab[r];
@@ -1146,74 +1155,75 @@ avr_cannot_modify_jumps_p (void)
 }
 
 
-/* Return nonzero if X (an RTX) is a legitimate memory address on the target
-   machine for a memory operand of mode MODE.  */
+/* Helper function for `avr_legitimate_address_p'.  */
+
+static inline int
+avr_reg_ok_for_addr (rtx reg, int strict)
+{
+  return (REG_P (reg)
+          && (avr_regno_mode_code_ok_for_base_p (REGNO (reg), QImode, MEM, SCRATCH)
+              || (!strict && REGNO (reg) >= FIRST_PSEUDO_REGISTER)));
+}
+
+
+/* Implement `TARGET_LEGITIMATE_ADDRESS_P'.  */
 
 bool
 avr_legitimate_address_p (enum machine_mode mode, rtx x, bool strict)
 {
-  enum reg_class r = NO_REGS;
-  
-  if (TARGET_ALL_DEBUG)
-    {
-      fprintf (stderr, "mode: (%s) %s %s %s %s:",
-	       GET_MODE_NAME(mode),
-	       strict ? "(strict)": "",
-	       reload_completed ? "(reload_completed)": "",
-	       reload_in_progress ? "(reload_in_progress)": "",
-	       reg_renumber ? "(reg_renumber)" : "");
-      if (GET_CODE (x) == PLUS
-	  && REG_P (XEXP (x, 0))
-	  && GET_CODE (XEXP (x, 1)) == CONST_INT
-	  && INTVAL (XEXP (x, 1)) >= 0
-	  && INTVAL (XEXP (x, 1)) <= MAX_LD_OFFSET (mode)
-	  && reg_renumber
-	  )
-	fprintf (stderr, "(r%d ---> r%d)", REGNO (XEXP (x, 0)),
-		 true_regnum (XEXP (x, 0)));
-      debug_rtx (x);
-    }
-  if (!strict && GET_CODE (x) == SUBREG)
-	x = SUBREG_REG (x);
-  if (REG_P (x) && (strict ? REG_OK_FOR_BASE_STRICT_P (x)
-                    : REG_OK_FOR_BASE_NOSTRICT_P (x)))
-    r = POINTER_REGS;
-  else if (CONSTANT_ADDRESS_P (x))
-    r = ALL_REGS;
-  else if (GET_CODE (x) == PLUS
-           && REG_P (XEXP (x, 0))
-	   && GET_CODE (XEXP (x, 1)) == CONST_INT
-	   && INTVAL (XEXP (x, 1)) >= 0)
-    {
-      int fit = INTVAL (XEXP (x, 1)) <= MAX_LD_OFFSET (mode);
-      if (fit)
-	{
-	  if (! strict
-	      || REGNO (XEXP (x,0)) == REG_X
-	      || REGNO (XEXP (x,0)) == REG_Y
-	      || REGNO (XEXP (x,0)) == REG_Z)
-	    r = BASE_POINTER_REGS;
-	  if (XEXP (x,0) == frame_pointer_rtx
-	      || XEXP (x,0) == arg_pointer_rtx)
-	    r = BASE_POINTER_REGS;
-	}
-      else if (frame_pointer_needed && XEXP (x,0) == frame_pointer_rtx)
-	r = POINTER_Y_REGS;
-    }
-  else if ((GET_CODE (x) == PRE_DEC || GET_CODE (x) == POST_INC)
-           && REG_P (XEXP (x, 0))
-           && (strict ? REG_OK_FOR_BASE_STRICT_P (XEXP (x, 0))
-               : REG_OK_FOR_BASE_NOSTRICT_P (XEXP (x, 0))))
-    {
-      r = POINTER_REGS;
-    }
-  if (TARGET_ALL_DEBUG)
+  bool ok = false;
+
+  switch (GET_CODE (x))
     {
-      fprintf (stderr, "   ret = %c\n", r + '0');
+    case REG:
+      ok = avr_reg_ok_for_addr (x, strict);
+      if (strict
+          && DImode == mode
+          && REG_X == REGNO (x))
+        {
+          ok = false;
+        }
+      break;
+
+    case POST_INC:
+    case PRE_DEC:
+      ok = avr_reg_ok_for_addr (XEXP (x, 0), strict);
+      break;
+
+    case SYMBOL_REF:
+    case CONST_INT:
+    case CONST:
+      ok = true;
+      break;
+      
+    case PLUS:
+      {
+        rtx op0 = XEXP (x, 0);
+        rtx op1 = XEXP (x, 1);
+            
+        if (REG_P (op0)
+            && CONST_INT_P (op1))
+          {
+            ok = (avr_reg_ok_for_addr (op0, strict)
+                  && INTVAL (op1) >= 0
+                  && INTVAL (op1) <= MAX_LD_OFFSET (mode));
+            
+            if (strict
+                && REG_X == REGNO (op0))
+              {
+                ok = false;
+              }
+          }
+        break;
+      }
+      
+    default:
+      break;
     }
-  return r == NO_REGS ? 0 : (int)r;
-}
 
+  return ok;
+}
+ 
 /* Attempts to replace X with a valid
    memory address for an operand of mode MODE  */
 
@@ -6278,27 +6288,74 @@ jump_over_one_insn_p (rtx insn, rtx dest
 int
 avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
 {
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
-    return 0;
+  /* Any GENERAL_REGS register can hold 8-bit values.  */
+  /* FIXME:  8-bit values must not be disallowed for R28 or R29.
+     Disallowing QI et al. in these registers might lead to code like
+         (set (subreg:QI (reg:HI 28) n) ...)
+     which will result in wrong code because reload does not handle
+     SUBREGs of hard regsisters like this.  This could be fixed in reload.
+     However, it appears that fixing reload is not wanted by reload people.  */
+  
+  if (GET_MODE_SIZE (mode) == 1)
+     return 1;
+   
+   /* All modes larger than 8 bits should start in an even register.  */
+   
+  return regno % 2 == 0;
+}
 
-  /* The only thing that can go into registers r28:r29 is a Pmode.  */
-  if (regno == REG_Y && mode == Pmode)
-    return 1;
 
-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
+/* Worker function for `MODE_CODE_BASE_REG_CLASS'.  */
 
-  if (mode == QImode)
-    return 1;
+reg_class_t
+avr_mode_code_base_reg_class (enum machine_mode mode ATTRIBUTE_UNUSED,
+                              RTX_CODE outer_code, RTX_CODE index_code ATTRIBUTE_UNUSED)
+{
+  reg_class_t rclass = BASE_POINTER_REGS;
+  
+  switch (outer_code)
+    {
+    case MEM:
+    case POST_INC:
+    case PRE_DEC:
+      rclass = POINTER_REGS;
+      break;
+      
+    default:
+      break;
+    }
+  
+  return rclass;
+}
 
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
-    return 0;
 
-  /* All modes larger than QImode should start in an even register.  */
-  return !(regno & 1);
+/* Worker function for `REGNO_MODE_CODE_OK_FOR_BASE_P'.  */
+
+bool
+avr_regno_mode_code_ok_for_base_p (int regno, enum machine_mode mode ATTRIBUTE_UNUSED,
+                                   RTX_CODE outer_code, RTX_CODE index_code ATTRIBUTE_UNUSED)
+{
+  bool ok;
+  
+  switch (outer_code)
+    {
+    case PLUS:
+      ok = regno == REG_Z || regno == REG_Y;
+      break;
+      
+    case MEM: /* plain reg */
+    case POST_INC:
+    case PRE_DEC:
+      ok = regno == REG_Z || regno == REG_Y || regno == REG_X;
+      break;
+      
+    default:
+      ok = false;
+      break;
+      
+    }
+
+  return ok;
 }
 
 const char *
@@ -6424,13 +6481,23 @@ avr_hard_regno_scratch_ok (unsigned int
       && !df_regs_ever_live_p (regno))
     return false;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (regno == REG_Y || regno == REG_Y + 1))
+    {
+      return false;
+    }
+
   return true;
 }
 
 /* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
 
 int
-avr_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
+avr_hard_regno_rename_ok (unsigned int old_reg,
 			  unsigned int new_reg)
 {
   /* Interrupt functions can only use registers that have already been
@@ -6441,6 +6508,17 @@ avr_hard_regno_rename_ok (unsigned int o
       && !df_regs_ever_live_p (new_reg))
     return 0;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (old_reg == REG_Y || old_reg == REG_Y + 1
+          || new_reg == REG_Y || new_reg == REG_Y + 1))
+    {
+      return 0;
+    }
+  
   return 1;
 }
 
Index: config/avr/avr.h
===================================================================
--- config/avr/avr.h	(Revision 175011)
+++ config/avr/avr.h	(Arbeitskopie)
@@ -294,21 +294,13 @@ enum reg_class {
 
 #define REGNO_REG_CLASS(R) avr_regno_reg_class(R)
 
-#define BASE_REG_CLASS (reload_completed ? BASE_POINTER_REGS : POINTER_REGS)
+#define MODE_CODE_BASE_REG_CLASS(mode, outer_code, index_code)  \
+  avr_mode_code_base_reg_class (mode, outer_code, index_code)
 
 #define INDEX_REG_CLASS NO_REGS
 
-#define REGNO_OK_FOR_BASE_P(r) (((r) < FIRST_PSEUDO_REGISTER		\
-				 && ((r) == REG_X			\
-				     || (r) == REG_Y			\
-				     || (r) == REG_Z			\
-				     || (r) == ARG_POINTER_REGNUM))	\
-				|| (reg_renumber			\
-				    && (reg_renumber[r] == REG_X	\
-					|| reg_renumber[r] == REG_Y	\
-					|| reg_renumber[r] == REG_Z	\
-					|| (reg_renumber[r]		\
-					    == ARG_POINTER_REGNUM))))
+#define REGNO_MODE_CODE_OK_FOR_BASE_P(num, mode, outer_code, index_code) \
+  avr_regno_mode_code_ok_for_base_p (num, mode, outer_code, index_code) 
 
 #define REGNO_OK_FOR_INDEX_P(NUM) 0
 
@@ -371,16 +363,11 @@ extern int avr_reg_order[];
 
 #define MAX_REGS_PER_ADDRESS 1
 
-#define REG_OK_FOR_BASE_NOSTRICT_P(X) \
-  (REGNO (X) >= FIRST_PSEUDO_REGISTER || REG_OK_FOR_BASE_STRICT_P(X))
-
-#define REG_OK_FOR_BASE_STRICT_P(X) REGNO_OK_FOR_BASE_P (REGNO (X))
-
 /* LEGITIMIZE_RELOAD_ADDRESS will allow register R26/27 to be used, where it
    is no worse than normal base pointers R28/29 and R30/31. For example:
    If base offset is greater than 63 bytes or for R++ or --R addressing.  */
    
-#define LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, IND_LEVELS, WIN)    \
+#define _LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, IND_LEVELS, WIN)    \
 do {									    \
   if (1&&(GET_CODE (X) == POST_INC || GET_CODE (X) == PRE_DEC))	    \
     {									    \

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-14 10:39               ` Georg-Johann Lay
@ 2011-06-14 11:38                 ` Denis Chertykov
  2011-06-14 21:38                   ` Georg-Johann Lay
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-14 11:38 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington, Richard Henderson

2011/6/14 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov schrieb:
>> 2011/6/13 Georg-Johann Lay <avr@gjlay.de>:
>>> So you think is is pointless/discouraged to give a more realistic
>>> description of AVR addressing be means of MODE_CODE_BASE_REG_CLASS (instead
>>> of BASE_REG_CLASS) resp. REGNO_MODE_CODE_OK_FOR_BASE_P?
>>>
>>>> Look carefully at `out_movqi_r_mr'.
>>>> There are even two fake addressing modes:
>>>> 1. [Y + infinite-dslacement];
>>>> 2. [X + (0...63)].
>>> Yes, I know. The first is introduced by avr_legitimate_address_p and the
>>> second appears to be artifact of LEGITIMIZE_RELOAD_ADDRESS.
>>>
>>> The changes are basically MODE_CODE_BASE_REG_CLASS (introduced in 4.2) and a
>>> rewrite of avr_legitimate_address_p. The changes aim at a better addressing
>>> for X and to minimize fake addresses.
>>>
>>>> I have spent a many hours (days, months) to debug GCC (especially avr port
>>>> and reload) for right addressing modes.
>>>> I have stopped on this code.
>>>> AVR have a limited memory addressing and GCC can't handle it in native
>>>> form.
>>>> Because of that I have supported a fake adddressing modes.
>>> I assume the code is from prior to 4.2 when REGNO_MODE_CODE_OK_FOR_BASE_P
>>> and MODE_CODE_BASE_REG_CLASS had not been available so that supporting X
>>> required some hacking.
>>> All that would still be fine; however the new register allocator leads to
>>> code that noone would accept. Accessing a structure through a pointer is not
>>> uncommon, not even on AVR. So if Z is used for, say accessing flash, X
>>> appears to be the best register.
>>>
>>> The shortcoming in GCC is that there is no way to give costs of addressing
>>> (TARGET_ADDRESS_COST does different things).
>>>
>>> So take a look what avr-gcc compiles here:
>>>  http://gcc.gnu.org/bugzilla/attachment.cgi?id=22242
>>> I saw similar complains in forums on the web.
>>>
>>>> (Richard Henderson have a different opinion: GCC can, AVR port can't)
>>> What does he mean with that?
>>>
>>>> IMHO that three limited pointer registers is not enough for C compiler.
>>>> Even more with frame pointer it's only two and X is a very limited.
>>> The current implementation has several oddities like
>>>
>>> * allowing SUBREGs in avr-legitimate_address_p
>>> * changing BASE_REG_CLASS on the fly (by means of reload_completed)
>>>
>>> isn't that supposed to cause trouble?
>>
>> You can try to remove all oddities and check results.
>> Definitely something changed in GCC core since I wrote addressing code.
>>
>>
>> Denis.
>
> For your interest, here is a patch that shows the changes in
> addressing mode.

Generally, the patch seems as a "right thing". I like it.

How about a regression testing and code quality.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-14 11:38                 ` Denis Chertykov
@ 2011-06-14 21:38                   ` Georg-Johann Lay
  2011-06-14 23:04                     ` Richard Henderson
  2011-06-15 17:58                     ` Richard Henderson
  0 siblings, 2 replies; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-14 21:38 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington, Richard Henderson

Denis Chertykov schrieb:
> 2011/6/14 Georg-Johann Lay <avr@gjlay.de>:
> 
>>Denis Chertykov schrieb:
>>
>>>2011/6/13 Georg-Johann Lay <avr@gjlay.de>:
>>>
>>>>So you think is is pointless/discouraged to give a more realistic
>>>>description of AVR addressing be means of MODE_CODE_BASE_REG_CLASS (instead
>>>>of BASE_REG_CLASS) resp. REGNO_MODE_CODE_OK_FOR_BASE_P?
>>>>
>>>>
>>>>>Look carefully at `out_movqi_r_mr'.
>>>>>There are even two fake addressing modes:
>>>>>1. [Y + infinite-dslacement];
>>>>>2. [X + (0...63)].
>>>>
>>>>Yes, I know. The first is introduced by avr_legitimate_address_p and the
>>>>second appears to be artifact of LEGITIMIZE_RELOAD_ADDRESS.
>>>>
>>>>The changes are basically MODE_CODE_BASE_REG_CLASS (introduced in 4.2) and a
>>>>rewrite of avr_legitimate_address_p. The changes aim at a better addressing
>>>>for X and to minimize fake addresses.
>>>>
>>>>
>>>>>I have spent a many hours (days, months) to debug GCC (especially avr port
>>>>>and reload) for right addressing modes.
>>>>>I have stopped on this code.
>>>>>AVR have a limited memory addressing and GCC can't handle it in native
>>>>>form.
>>>>>Because of that I have supported a fake adddressing modes.
>>>>
>>>>I assume the code is from prior to 4.2 when REGNO_MODE_CODE_OK_FOR_BASE_P
>>>>and MODE_CODE_BASE_REG_CLASS had not been available so that supporting X
>>>>required some hacking.
>>>>All that would still be fine; however the new register allocator leads to
>>>>code that noone would accept. Accessing a structure through a pointer is not
>>>>uncommon, not even on AVR. So if Z is used for, say accessing flash, X
>>>>appears to be the best register.
>>>>
>>>>The shortcoming in GCC is that there is no way to give costs of addressing
>>>>(TARGET_ADDRESS_COST does different things).
>>>>
>>>>So take a look what avr-gcc compiles here:
>>>> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22242
>>>>I saw similar complains in forums on the web.
>>>>
>>>>
>>>>>(Richard Henderson have a different opinion: GCC can, AVR port can't)
>>>>
>>>>What does he mean with that?
>>>>
>>>>
>>>>>IMHO that three limited pointer registers is not enough for C compiler.
>>>>>Even more with frame pointer it's only two and X is a very limited.
>>>>
>>>>The current implementation has several oddities like
>>>>
>>>>* allowing SUBREGs in avr-legitimate_address_p
>>>>* changing BASE_REG_CLASS on the fly (by means of reload_completed)
>>>>
>>>>isn't that supposed to cause trouble?
>>>
>>>You can try to remove all oddities and check results.
>>>Definitely something changed in GCC core since I wrote addressing code.
>>>
>>>Denis.
>>
>>For your interest, here is a patch that shows the changes in
>>addressing mode.

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01029.html

> Generally, the patch seems as a "right thing". I like it.
> 
> How about a regression testing and code quality.
> 
> Denis.

I tested on some handcrafted examples and on the code attached to 
PR46278. The generated code looked very good and so I started regression 
testing but found at spill fail in
   gcc.c-torture/compile/950612-1.c

As I don't know how to fix a spill failure I stopped working on the 
patch. Perhaps it would help to have fake y+const addresses; I didn't try.

As far as I remember, reload emits inefficient code in some situations 
when it tries to fit address into available addressing mode by adding 
constant. I could fix that by new constraint alternative in addhi3 insn, 
something like "=*!r,r,n". But that are just details.

The major work left to be done are fixing spill fails and implementing 
appropriate LEGITIMIZE_RELOAD_ADDRESS which are beyond my skills.

BTW, fixing PR48459, Richard ran immediately into a spill failure during 
newlib build:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48459#c20

Johann

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-14 21:38                   ` Georg-Johann Lay
@ 2011-06-14 23:04                     ` Richard Henderson
  2011-06-15 17:58                     ` Richard Henderson
  1 sibling, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2011-06-14 23:04 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov, Eric B. Weddington

On 06/14/2011 02:29 PM, Georg-Johann Lay wrote:
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01029.html

It does look like a step in the right direction.

> I tested on some handcrafted examples and on the code attached to
> PR46278. The generated code looked very good and so I started
> regression testing but found at spill fail in
>   gcc.c-torture/compile/950612-1.c

What compiler options?  I tried -O{1,2,s,3} and they all passed.

> The major work left to be done are fixing spill fails and
> implementing appropriate LEGITIMIZE_RELOAD_ADDRESS which are beyond
> my skills.

L_R_A is a tad tricky, but there are good examples to copy from.

But you shouldn't need that to fix spill failures.  L_R_A should
simply be able to provide slightly better sharing between addresses.
Not that I expect that there will be many instances of that, since
you'll quickly run out of registers in which to share anything.

> BTW, fixing PR48459, Richard ran immediately into a spill failure during newlib build:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48459#c20

FWIW, that spill failure vanishes with your patch.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-14 21:38                   ` Georg-Johann Lay
  2011-06-14 23:04                     ` Richard Henderson
@ 2011-06-15 17:58                     ` Richard Henderson
  2011-06-15 21:58                       ` Georg-Johann Lay
  1 sibling, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2011-06-15 17:58 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov, Eric B. Weddington

On 06/14/2011 02:29 PM, Georg-Johann Lay wrote:
> I tested on some handcrafted examples and on the code attached to
> PR46278. The generated code looked very good and so I started
> regression testing but found at spill fail in
>   gcc.c-torture/compile/950612-1.c

I reproduced this today.

The Problem is that we really have run out of registers.  X and Z
are both in use, and we're attempting to spill them for caller-save.
There is no register in which to load fp+158 so that we can save
either X or Z.

You're going to have to have some support for fp+large somewhere.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-15 17:58                     ` Richard Henderson
@ 2011-06-15 21:58                       ` Georg-Johann Lay
  2011-06-15 22:20                         ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-15 21:58 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov, Eric B. Weddington

Richard Henderson schrieb:
> On 06/14/2011 02:29 PM, Georg-Johann Lay wrote:
> 
>>I tested on some handcrafted examples and on the code attached to
>>PR46278. The generated code looked very good and so I started
>>regression testing but found at spill fail in
>>  gcc.c-torture/compile/950612-1.c
> 
> I reproduced this today.
> 
> The Problem is that we really have run out of registers.  X and Z
> are both in use, and we're attempting to spill them for caller-save.
> There is no register in which to load fp+158 so that we can save
> either X or Z.

Thanks for the analysis, that makes thinks clearer.

But I still wonder what's the very problem. Any architecture has limited 
reg+const addressing and limited number of address registers.
I definetely saw architectures run out of registers and reload manages 
to access stack beyond reg+maxoff without any hacks and clear and 
straight forward backend.

Bigger machines definetely have bigger maxoff and more address regs, but 
the software that's intended to run on them is considerably more 
complicated. Yet gcc doesn't run into problems and reload can cope with 
that.

> You're going to have to have some support for fp+large somewhere.

So there must be something fundamentally different between avr and other 
ports?

Is there a minimum requirement for add<pmode>3 insn like "r,r,n" 
alternative so that it can add a const without reload?

Johann

> r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-15 21:58                       ` Georg-Johann Lay
@ 2011-06-15 22:20                         ` Richard Henderson
  2011-06-16  3:46                           ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2011-06-15 22:20 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov, Eric B. Weddington

On 06/15/2011 01:40 PM, Georg-Johann Lay wrote:
> But I still wonder what's the very problem. Any architecture has
> limited reg+const addressing and limited number of address
> registers. I definetely saw architectures run out of registers and
> reload manages to access stack beyond reg+maxoff without any hacks
> and clear and straight forward backend.

Well, this is probably the only architecture that has *no*
non-fixed, call-saved base registers.

Indeed, I can work around this particular crash by either
hacking Z to be call-saved, or hacking the frame pointer to
not be required.  The former of course changes the abi, and
the second produces awful code due to too many copies from
the stack pointer.  So neither option is "preferred".



r~   

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-15 22:20                         ` Richard Henderson
@ 2011-06-16  3:46                           ` Richard Henderson
  2011-06-16 11:37                             ` Denis Chertykov
  2011-06-23 20:48                             ` Denis Chertykov
  0 siblings, 2 replies; 47+ messages in thread
From: Richard Henderson @ 2011-06-16  3:46 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov, Eric B. Weddington

On 06/15/2011 02:58 PM, Richard Henderson wrote:
> Indeed, I can work around this particular crash by either
> hacking Z to be call-saved, or hacking the frame pointer to
> not be required.  The former of course changes the abi, and
> the second produces awful code due to too many copies from
> the stack pointer.  So neither option is "preferred".

Perhaps I spoke too soon re the frame pointer.  The old
code is even worse.

   text	   data	    bss	    dec	    hex	filename
  10032	     25	      0	  10057	   2749	bld-avr-orig/gcc/z.o
   5816	     25	      0	   5841	   16d1	bld-avr-new/gcc/z.o

That said, there's a crash building libgcc so there's
still work to do.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16  3:46                           ` Richard Henderson
@ 2011-06-16 11:37                             ` Denis Chertykov
  2011-06-16 12:01                               ` Denis Chertykov
  2011-06-16 14:36                               ` Richard Henderson
  2011-06-23 20:48                             ` Denis Chertykov
  1 sibling, 2 replies; 47+ messages in thread
From: Denis Chertykov @ 2011-06-16 11:37 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/16 Richard Henderson <rth@redhat.com>:
> On 06/15/2011 02:58 PM, Richard Henderson wrote:
>> Indeed, I can work around this particular crash by either
>> hacking Z to be call-saved, or hacking the frame pointer to
>> not be required.  The former of course changes the abi, and
>> the second produces awful code due to too many copies from
>> the stack pointer.  So neither option is "preferred".
>
> Perhaps I spoke too soon re the frame pointer.  The old
> code is even worse.
>
>   text    data     bss     dec     hex filename
>  10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>   5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>
> That said, there's a crash building libgcc so there's
> still work to do.

@rth (while you are diving into AVR microworld ;-)
May be you can give a suggestion to change the AVR abi.
I have tuned the abi for code size almost 13 years ago.
The register pressure to r18-r31 are very high.

I have a set of experiments with omitting the frame poiner and I found that
better to support fake addressing modes (infinite displacement for frame
pointer).

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 11:37                             ` Denis Chertykov
@ 2011-06-16 12:01                               ` Denis Chertykov
  2011-06-16 14:07                                 ` Richard Henderson
  2011-06-16 14:36                               ` Richard Henderson
  1 sibling, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-16 12:01 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/16 Denis Chertykov <chertykov@gmail.com>:
> 2011/6/16 Richard Henderson <rth@redhat.com>:
>> On 06/15/2011 02:58 PM, Richard Henderson wrote:
>>> Indeed, I can work around this particular crash by either
>>> hacking Z to be call-saved, or hacking the frame pointer to
>>> not be required.  The former of course changes the abi, and
>>> the second produces awful code due to too many copies from
>>> the stack pointer.  So neither option is "preferred".
>>
>> Perhaps I spoke too soon re the frame pointer.  The old
>> code is even worse.
>>
>>   text    data     bss     dec     hex filename
>>  10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>   5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>>
>> That said, there's a crash building libgcc so there's
>> still work to do.
>
> @rth (while you are diving into AVR microworld ;-)
> May be you can give a suggestion to change the AVR abi.
> I have tuned the abi for code size almost 13 years ago.
> The register pressure to r18-r31 are very high.
>
> I have a set of experiments with omitting the frame poiner and I found that
> better to support fake addressing modes (infinite displacement for frame
> pointer).

I forgot to said that suggestion from Bernd Schmidt about
HARD_FRAME_POINTER_REGNUM seems very useful:
> Maybe it would help for your port to define a separate
> FRAME_POINTER_REGNUM, able to hold an HImode value, which then gets
> eliminated to HARD_FRAME_POINTER_REGNUM? This mechanism is used on many
> other ports if you need examples.

It's not related to addressing modes but it's related to frame pointer bugs.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 12:01                               ` Denis Chertykov
@ 2011-06-16 14:07                                 ` Richard Henderson
  2011-06-16 18:18                                   ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2011-06-16 14:07 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

[-- Attachment #1: Type: text/plain, Size: 702 bytes --]

On 06/16/2011 04:46 AM, Denis Chertykov wrote:
> I forgot to said that suggestion from Bernd Schmidt about
> HARD_FRAME_POINTER_REGNUM seems very useful:
>> Maybe it would help for your port to define a separate
>> FRAME_POINTER_REGNUM, able to hold an HImode value, which then gets
>> eliminated to HARD_FRAME_POINTER_REGNUM? This mechanism is used on many
>> other ports if you need examples.
> 
> It's not related to addressing modes but it's related to frame pointer bugs.

Yes, that was the first thing I did while investigating G-J's
spill failure.  It didn't help that, of course, but I kept it
around in my tree anyway.

This probably doesn't apply by itself, as is, but for the record...


r~

[-- Attachment #2: commit-avr-sfp --]
[-- Type: text/plain, Size: 16314 bytes --]

commit 0ca9df0a8c2722e8a555c2c37d7d7b40467bc5d4
Author: Richard Henderson <rth@redhat.com>
Date:   Wed Jun 15 14:03:19 2011 -0700

    avr: Add soft frame pointer register.

	* config/avr/avr.c (avr_can_eliminate): Simplify.
	(avr_initial_elimination_offset): Likewise.
	(expand_prologue): Use hard_frame_pointer_rtx.
	(expand_epilogue): Likewise.
	(avr_legitimize_address): Gut.
	(avr_hard_regno_nregs): New.
	(avr_hard_regno_ok): Allow only Pmode for arg and frame_pointers.
	(avr_regno_mode_code_ok_for_base_b): Handle arg and frame pointers.
	* config/avr/avr.h (FIXED_REGISTERS): Adjust arg pointer,
	add soft frame pointer.
	(CALL_USED_REGISTERS): Likewise.
	(REG_CLASS_CONTENTS): Likewise.
	(REGISTER_NAMES): Likewise.
	(HARD_REGNO_NREGS): Use avr_hard_regno_nregs.
	(HARD_FRAME_POINTER_REGNUM): New.
	(FRAME_POINTER_REGNUM): Use soft frame pointer.
	(ELIMINABLE_REGS): Eliminate from the soft frame pointer,
	remove the HARD_FRAME_POINTER self-elimination.

diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index 2af123c..5c60e2e 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -98,6 +98,7 @@ extern int byte_immediate_operand (rtx op, enum machine_mode mode);
 extern int test_hard_reg_class (enum reg_class rclass, rtx x);
 extern int jump_over_one_insn_p (rtx insn, rtx dest);
 
+extern int avr_hard_regno_nregs (int regno, enum machine_mode mode);
 extern int avr_hard_regno_mode_ok (int regno, enum machine_mode mode);
 extern void final_prescan_insn (rtx insn, rtx *operand, int num_operands);
 extern int avr_simplify_comparison_p (enum machine_mode mode,
diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index c6087fe..e1d6b50 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -493,29 +493,28 @@ avr_regs_to_save (HARD_REG_SET *set)
 /* Return true if register FROM can be eliminated via register TO.  */
 
 bool
-avr_can_eliminate (const int from, const int to)
+avr_can_eliminate (int from ATTRIBUTE_UNUSED, int to)
 {
-  return ((from == ARG_POINTER_REGNUM && to == FRAME_POINTER_REGNUM)
-	  || ((from == FRAME_POINTER_REGNUM 
-	       || from == FRAME_POINTER_REGNUM + 1)
-	      && !frame_pointer_needed));
+  return to == HARD_FRAME_POINTER_REGNUM;
 }
 
 /* Compute offset between arg_pointer and frame_pointer.  */
 
 int
-avr_initial_elimination_offset (int from, int to)
+avr_initial_elimination_offset (int from, int to ATTRIBUTE_UNUSED)
 {
-  if (from == FRAME_POINTER_REGNUM && to == STACK_POINTER_REGNUM)
-    return 0;
-  else
-    {
-      int offset = frame_pointer_needed ? 2 : 0;
-      int avr_pc_size = AVR_HAVE_EIJMP_EICALL ? 3 : 2;
+  int offset = 0;
 
+  if (from == ARG_POINTER_REGNUM)
+    {
+      offset += AVR_HAVE_EIJMP_EICALL ? 3 : 2;
+      offset += frame_pointer_needed ? 2 : 0;
       offset += avr_regs_to_save (NULL);
-      return get_frame_size () + (avr_pc_size) + 1 + offset;
+      offset += get_frame_size ();
+      offset += 1; /* post-dec stack space */
     }
+
+  return offset;
 }
 
 /* Actual start of frame is virtual_stack_vars_rtx this is offset from 
@@ -747,12 +746,12 @@ expand_prologue (void)
 	 notes to the front.  Thus we build them in the reverse order of
 	 how we want dwarf2out to process them.  */
 
-      /* The function does always set frame_pointer_rtx, but whether that
+      /* The function does always set hard_frame_pointer_rtx, but whether that
 	 is going to be permanent in the function is frame_pointer_needed.  */
       add_reg_note (insn, REG_CFA_ADJUST_CFA,
 		    gen_rtx_SET (VOIDmode,
 				 (frame_pointer_needed
-				  ? frame_pointer_rtx : stack_pointer_rtx),
+				  ? hard_frame_pointer_rtx : stack_pointer_rtx),
 				 plus_constant (stack_pointer_rtx,
 						-(size + live_seq))));
 
@@ -793,7 +792,7 @@ expand_prologue (void)
 
           if (!size)
             {
-              insn = emit_move_insn (frame_pointer_rtx, stack_pointer_rtx);
+              insn = emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
               RTX_FRAME_RELATED_P (insn) = 1;
             }
           else
@@ -816,12 +815,12 @@ expand_prologue (void)
                 {
                   /* The high byte (r29) doesn't change.  Prefer 'subi'
 		     (1 cycle) over 'sbiw' (2 cycles, same size).  */
-                  myfp = gen_rtx_REG (QImode, FRAME_POINTER_REGNUM);
+                  myfp = gen_rtx_REG (QImode, HARD_FRAME_POINTER_REGNUM);
                 }
               else 
                 {
                   /*  Normal sized addition.  */
-                  myfp = frame_pointer_rtx;
+                  myfp = hard_frame_pointer_rtx;
                 }
 
 	      /* Method 1-Adjust frame pointer.  */
@@ -833,12 +832,12 @@ expand_prologue (void)
 		 instead indicate that the entire operation is complete after
 		 the frame pointer subtraction is done.  */
 
-              emit_move_insn (frame_pointer_rtx, stack_pointer_rtx);
+              emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
 
               insn = emit_move_insn (myfp, plus_constant (myfp, -size));
               RTX_FRAME_RELATED_P (insn) = 1;
 	      add_reg_note (insn, REG_CFA_ADJUST_CFA,
-			    gen_rtx_SET (VOIDmode, frame_pointer_rtx,
+			    gen_rtx_SET (VOIDmode, hard_frame_pointer_rtx,
 					 plus_constant (stack_pointer_rtx,
 							-size)));
 
@@ -847,23 +846,23 @@ expand_prologue (void)
 		 need not be annotated at all.  */
 	      if (AVR_HAVE_8BIT_SP)
 		{
-		  emit_move_insn (stack_pointer_rtx, frame_pointer_rtx);
+		  emit_move_insn (stack_pointer_rtx, hard_frame_pointer_rtx);
 		}
 	      else if (TARGET_NO_INTERRUPTS 
 		       || cfun->machine->is_signal
 		       || cfun->machine->is_OS_main)
 		{
 		  emit_insn (gen_movhi_sp_r_irq_off (stack_pointer_rtx, 
-						     frame_pointer_rtx));
+						     hard_frame_pointer_rtx));
 		}
 	      else if (cfun->machine->is_interrupt)
 		{
 		  emit_insn (gen_movhi_sp_r_irq_on (stack_pointer_rtx, 
-						    frame_pointer_rtx));
+						    hard_frame_pointer_rtx));
 		}
 	      else
 		{
-		  emit_move_insn (stack_pointer_rtx, frame_pointer_rtx);
+		  emit_move_insn (stack_pointer_rtx, hard_frame_pointer_rtx);
 		}
 
 	      fp_plus_insns = get_insns ();
@@ -880,7 +879,7 @@ expand_prologue (void)
 		  insn = emit_move_insn (stack_pointer_rtx, insn);
 		  RTX_FRAME_RELATED_P (insn) = 1;
 		  
-		  insn = emit_move_insn (frame_pointer_rtx, stack_pointer_rtx);
+		  insn = emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
 		  RTX_FRAME_RELATED_P (insn) = 1;
 
 		  sp_plus_insns = get_insns ();
@@ -997,13 +996,13 @@ expand_epilogue (bool sibcall_p)
       if (frame_pointer_needed)
 	{
           /*  Get rid of frame.  */
-	  emit_move_insn(frame_pointer_rtx,
-                         gen_rtx_PLUS (HImode, frame_pointer_rtx,
-                                       gen_int_mode (size, HImode)));
+	  emit_move_insn (hard_frame_pointer_rtx,
+                          gen_rtx_PLUS (HImode, hard_frame_pointer_rtx,
+                                        gen_int_mode (size, HImode)));
 	}
       else
 	{
-          emit_move_insn (frame_pointer_rtx, stack_pointer_rtx);
+          emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
 	}
 	
       emit_insn (gen_epilogue_restores (gen_int_mode (live_seq, HImode)));
@@ -1022,12 +1021,12 @@ expand_epilogue (bool sibcall_p)
                 {
                   /* The high byte (r29) doesn't change - prefer 'subi' 
                      (1 cycle) over 'sbiw' (2 cycles, same size).  */
-                  myfp = gen_rtx_REG (QImode, FRAME_POINTER_REGNUM);
+                  myfp = gen_rtx_REG (QImode, HARD_FRAME_POINTER_REGNUM);
                 }
               else 
                 {
                   /* Normal sized addition.  */
-                  myfp = frame_pointer_rtx;
+                  myfp = hard_frame_pointer_rtx;
                 }
 	      
               /* Method 1-Adjust frame pointer.  */
@@ -1038,22 +1037,22 @@ expand_epilogue (bool sibcall_p)
 	      /* Copy to stack pointer.  */
 	      if (AVR_HAVE_8BIT_SP)
 		{
-		  emit_move_insn (stack_pointer_rtx, frame_pointer_rtx);
+		  emit_move_insn (stack_pointer_rtx, hard_frame_pointer_rtx);
 		}
 	      else if (TARGET_NO_INTERRUPTS 
 		       || cfun->machine->is_signal)
 		{
 		  emit_insn (gen_movhi_sp_r_irq_off (stack_pointer_rtx, 
-						     frame_pointer_rtx));
+						     hard_frame_pointer_rtx));
 		}
 	      else if (cfun->machine->is_interrupt)
 		{
 		  emit_insn (gen_movhi_sp_r_irq_on (stack_pointer_rtx, 
-						    frame_pointer_rtx));
+						    hard_frame_pointer_rtx));
 		}
 	      else
 		{
-		  emit_move_insn (stack_pointer_rtx, frame_pointer_rtx);
+		  emit_move_insn (stack_pointer_rtx, hard_frame_pointer_rtx);
 		}
 
 	      fp_plus_insns = get_insns ();
@@ -1228,32 +1227,8 @@ avr_legitimate_address_p (enum machine_mode mode, rtx x, bool strict)
    memory address for an operand of mode MODE  */
 
 rtx
-avr_legitimize_address (rtx x, rtx oldx, enum machine_mode mode)
+avr_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED, enum machine_mode mode ATTRIBUTE_UNUSED)
 {
-  x = oldx;
-  if (TARGET_ALL_DEBUG)
-    {
-      fprintf (stderr, "legitimize_address mode: %s", GET_MODE_NAME(mode));
-      debug_rtx (oldx);
-    }
-  
-  if (GET_CODE (oldx) == PLUS
-      && REG_P (XEXP (oldx,0)))
-    {
-      if (REG_P (XEXP (oldx,1)))
-	x = force_reg (GET_MODE (oldx), oldx);
-      else if (GET_CODE (XEXP (oldx, 1)) == CONST_INT)
-	{
-	  int offs = INTVAL (XEXP (oldx,1));
-	  if (frame_pointer_rtx != XEXP (oldx,0))
-	    if (offs > MAX_LD_OFFSET (mode))
-	      {
-		if (TARGET_ALL_DEBUG)
-		  fprintf (stderr, "force_reg (big offset)\n");
-		x = force_reg (GET_MODE (oldx), oldx);
-	      }
-	}
-    }
   return x;
 }
 
@@ -6096,8 +6071,7 @@ extra_constraint_Q (rtx x)
 	return 1;		/* allocate pseudos */
       else if (regno == REG_Z || regno == REG_Y)
 	return 1;		/* strictly check */
-      else if (xx == frame_pointer_rtx
-	       || xx == arg_pointer_rtx)
+      else if (xx == frame_pointer_rtx || xx == arg_pointer_rtx)
 	return 1;		/* XXX frame & arg pointer checks */
     }
   return 0;
@@ -6280,6 +6254,18 @@ jump_over_one_insn_p (rtx insn, rtx dest)
   return dest_addr - jump_addr == get_attr_length (insn) + 1;
 }
 
+/* Returns the number of registers required to hold a value of MODE.  */
+
+int
+avr_hard_regno_nregs (int regno, enum machine_mode mode)
+{
+  /* The fake registers are designed to hold exactly a pointer.  */
+  if (regno == ARG_POINTER_REGNUM || regno == FRAME_POINTER_REGNUM)
+    return 1;
+
+  return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+}
+
 /* Returns 1 if a value of mode MODE can be stored starting with hard
    register number REGNO.  On the enhanced core, anything larger than
    1 byte must start in even numbered register for "movw" to work
@@ -6288,6 +6274,10 @@ jump_over_one_insn_p (rtx insn, rtx dest)
 int
 avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
 {
+  /* The fake registers are designed to hold exactly a pointer.  */
+  if (regno == ARG_POINTER_REGNUM || regno == FRAME_POINTER_REGNUM)
+    return mode == Pmode;
+
   /* Any GENERAL_REGS register can hold 8-bit values.  */
   /* FIXME:  8-bit values must not be disallowed for R28 or R29.
      Disallowing QI et al. in these registers might lead to code like
@@ -6299,8 +6289,7 @@ avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
   if (GET_MODE_SIZE (mode) == 1)
      return 1;
    
-   /* All modes larger than 8 bits should start in an even register.  */
-   
+  /* All modes larger than 8 bits should start in an even register.  */
   return regno % 2 == 0;
 }
 
@@ -6332,27 +6321,35 @@ avr_mode_code_base_reg_class (enum machine_mode mode ATTRIBUTE_UNUSED,
 /* Worker function for `REGNO_MODE_CODE_OK_FOR_BASE_P'.  */
 
 bool
-avr_regno_mode_code_ok_for_base_p (int regno, enum machine_mode mode ATTRIBUTE_UNUSED,
-                                   RTX_CODE outer_code, RTX_CODE index_code ATTRIBUTE_UNUSED)
+avr_regno_mode_code_ok_for_base_p (int regno,
+				   enum machine_mode mode ATTRIBUTE_UNUSED,
+                                   RTX_CODE outer_code,
+				   RTX_CODE index_code ATTRIBUTE_UNUSED)
 {
   bool ok;
   
+  ok = (regno == REG_Z
+        || regno == REG_Y
+        || regno == ARG_POINTER_REGNUM
+        || regno == FRAME_POINTER_REGNUM);
+
   switch (outer_code)
     {
     case PLUS:
-      ok = regno == REG_Z || regno == REG_Y;
+      /* Computed above */
       break;
       
     case MEM: /* plain reg */
     case POST_INC:
     case PRE_DEC:
-      ok = regno == REG_Z || regno == REG_Y || regno == REG_X;
+      /* As above, but also X.  */
+      if (regno == REG_X)
+	ok = true;
       break;
       
     default:
       ok = false;
       break;
-      
     }
 
   return ok;
diff --git a/gcc/config/avr/avr.h b/gcc/config/avr/avr.h
index 9c64f96..426ddec 100644
--- a/gcc/config/avr/avr.h
+++ b/gcc/config/avr/avr.h
@@ -191,7 +191,8 @@ extern GTY(()) section *progmem_section;
   0,0,/* r28 r29 */\
   0,0,/* r30 r31 */\
   1,1,/*  STACK */\
-  1,1 /* arg pointer */  }
+  1,  /* arg pointer */\
+  1   /* frame pointer */  }
 
 #define CALL_USED_REGISTERS {			\
   1,1,/* r0 r1 */				\
@@ -211,7 +212,8 @@ extern GTY(()) section *progmem_section;
     0,0,/* r28 r29 */				\
     1,1,/* r30 r31 */				\
     1,1,/*  STACK */				\
-    1,1 /* arg pointer */  }
+    1,  /* arg pointer */			\
+    1   /* frame pointer */ }
 
 #define REG_ALLOC_ORDER {			\
     24,25,					\
@@ -229,7 +231,7 @@ extern GTY(()) section *progmem_section;
 #define ADJUST_REG_ALLOC_ORDER order_regs_for_local_alloc ()
 
 
-#define HARD_REGNO_NREGS(REGNO, MODE) ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
+#define HARD_REGNO_NREGS(REGNO, MODE) avr_hard_regno_nregs(REGNO, MODE)
 
 #define HARD_REGNO_MODE_OK(REGNO, MODE) avr_hard_regno_mode_ok(REGNO, MODE)
 
@@ -279,17 +281,17 @@ enum reg_class {
   {3 << REG_Z,0x00000000},      /* POINTER_Z_REGS, r30 - r31 */		\
   {0x00000000,0x00000003},	/* STACK_REG, STACK */			\
   {(3 << REG_Y) | (3 << REG_Z),						\
-     0x00000000},		/* BASE_POINTER_REGS, r28 - r31 */	\
+     0x0000000c},		/* BASE_POINTER_REGS, r28 - r31,ap,fp */\
   {(3 << REG_X) | (3 << REG_Y) | (3 << REG_Z),				\
-     0x00000000},		/* POINTER_REGS, r26 - r31 */		\
+     0x0000000c},		/* POINTER_REGS, r26 - r31,ap,fp */	\
   {(3 << REG_X) | (3 << REG_Y) | (3 << REG_Z) | (3 << REG_W),		\
      0x00000000},		/* ADDW_REGS, r24 - r31 */		\
   {0x00ff0000,0x00000000},	/* SIMPLE_LD_REGS r16 - r23 */          \
   {(3 << REG_X)|(3 << REG_Y)|(3 << REG_Z)|(3 << REG_W)|(0xff << 16),	\
-     0x00000000},	/* LD_REGS, r16 - r31 */			\
+     0x0000000c},	/* LD_REGS, r16 - r31 */			\
   {0x0000ffff,0x00000000},	/* NO_LD_REGS  r0 - r15 */              \
-  {0xffffffff,0x00000000},	/* GENERAL_REGS, r0 - r31 */		\
-  {0xffffffff,0x00000003}	/* ALL_REGS */				\
+  {0xffffffff,0x0000000c},	/* GENERAL_REGS, r0 - r31,ap,fp */	\
+  {0xffffffff,0x0000000f}	/* ALL_REGS */				\
 }
 
 #define REGNO_REG_CLASS(R) avr_regno_reg_class(R)
@@ -322,16 +324,18 @@ enum reg_class {
 
 #define STACK_POINTER_REGNUM 32
 
-#define FRAME_POINTER_REGNUM REG_Y
+#define HARD_FRAME_POINTER_REGNUM REG_Y
 
 #define ARG_POINTER_REGNUM 34
+#define FRAME_POINTER_REGNUM 35
 
 #define STATIC_CHAIN_REGNUM 2
 
 #define ELIMINABLE_REGS {					\
-      {ARG_POINTER_REGNUM, FRAME_POINTER_REGNUM},		\
-	{FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM}		\
-       ,{FRAME_POINTER_REGNUM+1,STACK_POINTER_REGNUM+1}}
+      { ARG_POINTER_REGNUM, STACK_POINTER_REGNUM },		\
+      { ARG_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM },	\
+      { FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM },		\
+      { FRAME_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM }}
 
 #define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET)			\
   OFFSET = avr_initial_elimination_offset (FROM, TO)
@@ -582,7 +586,7 @@ sprintf (STRING, "*.%s%lu", PREFIX, (unsigned long)(NUM))
     "r8","r9","r10","r11","r12","r13","r14","r15",	\
     "r16","r17","r18","r19","r20","r21","r22","r23",	\
     "r24","r25","r26","r27","r28","r29","r30","r31",	\
-    "__SP_L__","__SP_H__","argL","argH"}
+    "__SP_L__","__SP_H__","ap","fp"}
 
 #define FINAL_PRESCAN_INSN(insn, operand, nop) final_prescan_insn (insn, operand,nop)
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 11:37                             ` Denis Chertykov
  2011-06-16 12:01                               ` Denis Chertykov
@ 2011-06-16 14:36                               ` Richard Henderson
  2011-06-16 14:52                                 ` Bernd Schmidt
  2011-06-16 18:57                                 ` Denis Chertykov
  1 sibling, 2 replies; 47+ messages in thread
From: Richard Henderson @ 2011-06-16 14:36 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

On 06/16/2011 04:34 AM, Denis Chertykov wrote:
> @rth (while you are diving into AVR microworld ;-)
> May be you can give a suggestion to change the AVR abi.
> I have tuned the abi for code size almost 13 years ago.
> The register pressure to r18-r31 are very high.

As far as I can see, the ABI is pretty good.  There's nothing
that I would say that should obviously be changed.

I might totally drop support for DImode.  Let "long long" map
to SImode.  If you want 64-bit data, you probably don't want
to select an 8-bit microcontroller.

That might take a bit of surgery on the way we currently build
libgcc though.

> I have a set of experiments with omitting the frame poiner and I found that
> better to support fake addressing modes (infinite displacement for frame
> pointer).

The biggest problem that I see, from the 950612-1.c test case
with the current handling of the "infinite displacement frame
pointer", is that the adjustments to the frame pointer are 
never exposed as separate instructions, so there's never a
chance to optimize them.

So once you build a stack frame larger than 64 bytes, things
get worse and worse.  You wind up with code like

        subi r28,lo8(-133)
        sbci r29,hi8(-133)
        ld r18,Y
        subi r28,lo8(133)
        sbci r29,hi8(133)
        subi r28,lo8(-134)
        sbci r29,hi8(-134)
        ld r19,Y
        subi r28,lo8(134)
        sbci r29,hi8(134)

where obviously the 4 middle instructions could be eliminated.

If we came out of reload (or shortly thereafter via split) with
these as separate patterns, we might be able to eliminate them
via an existing optimization pass.  OTOH, an existing pass might
refuse to operate on the frame pointer because the frame pointer
is usually Special.

One could write an avr-specific pass for this:
  Scan the code for references to the frame pointer.
  Record the offsets of the uses.
  Compute a sliding 64-byte window that satisfies the maximum
	number of uses within the region.
  Insert adjustments to the frame pointer.
  Arrange for the dwarf2out routines to scan the whole function.
	(If alloca has been used, the frame pointer anchors the
	CFA, and the unwind info will need adjusting throughout
	the function.  Otherwise gdb will fail entirely.)

That last point probably depends on Bernd Schmidt's work in
dwarf2out for shrink-wrapping...


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 14:36                               ` Richard Henderson
@ 2011-06-16 14:52                                 ` Bernd Schmidt
  2011-06-16 15:13                                   ` Richard Henderson
  2011-06-16 18:57                                 ` Denis Chertykov
  1 sibling, 1 reply; 47+ messages in thread
From: Bernd Schmidt @ 2011-06-16 14:52 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Denis Chertykov, Georg-Johann Lay, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington

On 06/16/2011 04:25 PM, Richard Henderson wrote:
> That last point probably depends on Bernd Schmidt's work in
> dwarf2out for shrink-wrapping...

Did you have a chance to look at the ia64 unwind code? I probably won't
be able to get back to it for a while yet since I'm working on C6X tuning...


Bernd

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 14:52                                 ` Bernd Schmidt
@ 2011-06-16 15:13                                   ` Richard Henderson
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Henderson @ 2011-06-16 15:13 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Denis Chertykov, Georg-Johann Lay, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington

On 06/16/2011 07:35 AM, Bernd Schmidt wrote:
> Did you have a chance to look at the ia64 unwind code? I probably won't
> be able to get back to it for a while yet since I'm working on C6X tuning...

I was looking at it before I moved house a month ago, and then
totally forgot about it until today.  I should get back on that...


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 14:07                                 ` Richard Henderson
@ 2011-06-16 18:18                                   ` Denis Chertykov
  2011-06-16 18:57                                     ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-16 18:18 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/16 Richard Henderson <rth@redhat.com>:
> On 06/16/2011 04:46 AM, Denis Chertykov wrote:
>> I forgot to said that suggestion from Bernd Schmidt about
>> HARD_FRAME_POINTER_REGNUM seems very useful:
>>> Maybe it would help for your port to define a separate
>>> FRAME_POINTER_REGNUM, able to hold an HImode value, which then gets
>>> eliminated to HARD_FRAME_POINTER_REGNUM? This mechanism is used on many
>>> other ports if you need examples.
>>
>> It's not related to addressing modes but it's related to frame pointer bugs.
>
> Yes, that was the first thing I did while investigating G-J's
> spill failure.  It didn't help that, of course, but I kept it
> around in my tree anyway.
>
> This probably doesn't apply by itself, as is, but for the record...

Only one question why you removed avr_legitimize_address ?

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 18:18                                   ` Denis Chertykov
@ 2011-06-16 18:57                                     ` Richard Henderson
  2011-06-16 19:33                                       ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2011-06-16 18:57 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]

On 06/16/2011 11:09 AM, Denis Chertykov wrote:
> Only one question why you removed avr_legitimize_address ?

It doesn't actually do anything useful.

All it does is, for a subset of inputs, force the address into
a register.  This is the same as the fallback action.  That is,
any non-legitimate address is forced into a register.

A more useful version of legitimize_address would be to split
a large offset into two pieces, the large one to be CSE'd and
the small one that fits into a memory offset.

E.g. the following with which I was experimenting.


r~

[-- Attachment #2: commit-avr-split-address --]
[-- Type: text/plain, Size: 6870 bytes --]

commit 034bf089f69d13ebab4765439163824a946913bb
Author: Richard Henderson <rth@redhat.com>
Date:   Wed Jun 15 15:35:57 2011 -0700

    avr: Split address offsets in legitimize{_reload,}_address.

diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index 5c60e2e..3045587 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -110,6 +110,8 @@ extern void out_shift_with_cnt (const char *templ, rtx insn,
 extern reg_class_t avr_mode_code_base_reg_class (enum machine_mode, RTX_CODE, RTX_CODE);
 extern bool avr_regno_mode_code_ok_for_base_p (int, enum machine_mode, RTX_CODE, RTX_CODE);
 extern rtx avr_incoming_return_addr_rtx (void);
+extern rtx avr_legitimize_reload_address (rtx, enum machine_mode,
+					  int, int, int, int);
 #endif /* RTX_CODE */
 
 #ifdef HAVE_MACHINE_MODES
diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index 045da63..78c2467 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -1160,7 +1160,7 @@ static inline int
 avr_reg_ok_for_addr (rtx reg, int strict)
 {
   return (REG_P (reg)
-          && (avr_regno_mode_code_ok_for_base_p (REGNO (reg), QImode, MEM, SCRATCH)
+          && (avr_regno_mode_code_ok_for_base_p (REGNO (reg), QImode, MEM, UNKNOWN)
               || (!strict && REGNO (reg) >= FIRST_PSEUDO_REGISTER)));
 }
 
@@ -1226,12 +1226,88 @@ avr_legitimate_address_p (enum machine_mode mode, rtx x, bool strict)
 /* Attempts to replace X with a valid
    memory address for an operand of mode MODE  */
 
-rtx
-avr_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED, enum machine_mode mode ATTRIBUTE_UNUSED)
+static rtx
+avr_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
+                        enum machine_mode mode)
 {
+  if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1)))
+    {
+      HOST_WIDE_INT addend = INTVAL (XEXP (x, 1));
+
+      if (addend > MAX_LD_OFFSET (mode))
+	{
+	  HOST_WIDE_INT hi, lo;
+
+	  x = XEXP (x, 0);
+	  if (!REG_P (x)
+	      || !avr_regno_mode_code_ok_for_base_p (REGNO (x), mode,
+						     PLUS, UNKNOWN))
+	    x = force_reg (Pmode, x);
+
+	  lo = addend & 63;
+	  hi = addend - lo;
+	  x = force_reg (Pmode, plus_constant (x, hi));
+	  return plus_constant (x, lo);
+	}
+    }
+
   return x;
 }
 
+rtx
+avr_legitimize_reload_address (rtx x, enum machine_mode mode,
+			       int opnum, int type, int addr_type,
+			       int ind_levels ATTRIBUTE_UNUSED)
+{
+  /* We must recognize output that we have already generated ourselves.  */
+  if (GET_CODE (x) == PLUS
+      && GET_CODE (XEXP (x, 0)) == PLUS
+      && REG_P (XEXP (XEXP (x, 0), 0))
+      && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+      && CONST_INT_P (XEXP (x, 1)))
+    {
+      push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
+		   BASE_POINTER_REGS, GET_MODE (x), VOIDmode, 0, 0,
+		   opnum, (enum reload_type) addr_type);
+      return x;
+    }
+
+  /* We wish to handle large displacements off a register by splitting
+     the addend into two parts.  This may allow some sharing.  */
+  if (GET_CODE (x) == PLUS
+      && REG_P (XEXP (x, 0))
+      && CONST_INT_P (XEXP (x, 1)))
+    {
+      HOST_WIDE_INT val = INTVAL (XEXP (x, 1));
+      HOST_WIDE_INT hi, lo;
+
+      lo = val & 63;
+      hi = val - lo;
+
+      if (val > MAX_LD_OFFSET (mode) && hi && lo)
+	{
+          /* Reload the high part into a base reg; leave the low part
+	     in the mem directly.  */
+          x = plus_constant (XEXP (x, 0), hi);
+          x = gen_rtx_PLUS (Pmode, x, GEN_INT (lo));
+
+          push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
+		       BASE_POINTER_REGS, GET_MODE (x), VOIDmode, 0, 0,
+		       opnum, (enum reload_type) addr_type);
+          return x;
+	}
+    }
+
+  if (GET_CODE (x) == POST_INC || GET_CODE (x) == PRE_DEC)
+    {
+      push_reload (XEXP (x, 0), NULL, &XEXP (x,0), NULL,
+	           POINTER_REGS, GET_MODE (x), VOIDmode, 0, 0,
+		   opnum, (enum reload_type) type);
+      return x;
+    }
+
+  return NULL_RTX;
+}
 
 /* Return a pointer register name as a string.  */
 
diff --git a/gcc/config/avr/avr.h b/gcc/config/avr/avr.h
index 426ddec..9cb19d1 100644
--- a/gcc/config/avr/avr.h
+++ b/gcc/config/avr/avr.h
@@ -367,50 +367,20 @@ extern int avr_reg_order[];
 
 #define MAX_REGS_PER_ADDRESS 1
 
-/* LEGITIMIZE_RELOAD_ADDRESS will allow register R26/27 to be used, where it
-   is no worse than normal base pointers R28/29 and R30/31. For example:
-   If base offset is greater than 63 bytes or for R++ or --R addressing.  */
-   
-#define _LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, IND_LEVELS, WIN)    \
-do {									    \
-  if (1&&(GET_CODE (X) == POST_INC || GET_CODE (X) == PRE_DEC))	    \
-    {									    \
-      push_reload (XEXP (X,0), XEXP (X,0), &XEXP (X,0), &XEXP (X,0),	    \
-	           POINTER_REGS, GET_MODE (X),GET_MODE (X) , 0, 0,	    \
-		   OPNUM, RELOAD_OTHER);				    \
-      goto WIN;								    \
-    }									    \
-  if (GET_CODE (X) == PLUS						    \
-      && REG_P (XEXP (X, 0))						    \
-      && (reg_equiv_constant (REGNO (XEXP (X, 0))) == 0)		    \
-      && GET_CODE (XEXP (X, 1)) == CONST_INT				    \
-      && INTVAL (XEXP (X, 1)) >= 1)					    \
-    {									    \
-      int fit = INTVAL (XEXP (X, 1)) <= (64 - GET_MODE_SIZE (MODE));	    \
-      if (fit)								    \
-	{								    \
-          if (reg_equiv_address (REGNO (XEXP (X, 0))) != 0)		    \
-	    {								    \
-	      int regno = REGNO (XEXP (X, 0));				    \
-	      rtx mem = make_memloc (X, regno);				    \
-	      push_reload (XEXP (mem,0), NULL, &XEXP (mem,0), NULL,         \
-		           POINTER_REGS, Pmode, VOIDmode, 0, 0,		    \
-		           1, ADDR_TYPE (TYPE));			    \
-	      push_reload (mem, NULL_RTX, &XEXP (X, 0), NULL,		    \
-		           BASE_POINTER_REGS, GET_MODE (X), VOIDmode, 0, 0, \
-		           OPNUM, TYPE);				    \
-	      goto WIN;							    \
-	    }								    \
-	}								    \
-      else if (! (frame_pointer_needed && XEXP (X,0) == frame_pointer_rtx)) \
-	{								    \
-	  push_reload (X, NULL_RTX, &X, NULL,				    \
-		       POINTER_REGS, GET_MODE (X), VOIDmode, 0, 0,	    \
-		       OPNUM, TYPE);					    \
-          goto WIN;							    \
-	}								    \
-    }									    \
-} while(0)
+/* Try a machine-dependent way of reloading an illegitimate address
+   operand.  If we find one, push the reload and jump to WIN.  This
+   macro is used in only one place: `find_reloads_address' in reload.c.  */
+
+#define LEGITIMIZE_RELOAD_ADDRESS(X,MODE,OPNUM,TYPE,IND_L,WIN)		     \
+do {									     \
+  rtx new_x = avr_legitimize_reload_address (X, MODE, OPNUM,                 \
+					     TYPE, ADDR_TYPE (TYPE), IND_L); \
+  if (new_x)								     \
+    {									     \
+      X = new_x;							     \
+      goto WIN;								     \
+    }									     \
+} while (0)
 
 #define BRANCH_COST(speed_p, predictable_p) 0
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 14:36                               ` Richard Henderson
  2011-06-16 14:52                                 ` Bernd Schmidt
@ 2011-06-16 18:57                                 ` Denis Chertykov
  1 sibling, 0 replies; 47+ messages in thread
From: Denis Chertykov @ 2011-06-16 18:57 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/16 Richard Henderson <rth@redhat.com>:
> On 06/16/2011 04:34 AM, Denis Chertykov wrote:
>> @rth (while you are diving into AVR microworld ;-)
>> May be you can give a suggestion to change the AVR abi.
>> I have tuned the abi for code size almost 13 years ago.
>> The register pressure to r18-r31 are very high.
>
> As far as I can see, the ABI is pretty good.  There's nothing
> that I would say that should obviously be changed.
>
> I might totally drop support for DImode.  Let "long long" map
> to SImode.  If you want 64-bit data, you probably don't want
> to select an 8-bit microcontroller.
>
> That might take a bit of surgery on the way we currently build
> libgcc though.
>
>> I have a set of experiments with omitting the frame poiner and I found that
>> better to support fake addressing modes (infinite displacement for frame
>> pointer).
>
> The biggest problem that I see, from the 950612-1.c test case
> with the current handling of the "infinite displacement frame
> pointer", is that the adjustments to the frame pointer are
> never exposed as separate instructions, so there's never a
> chance to optimize them.

Yes. I'm agree.

> So once you build a stack frame larger than 64 bytes, things
> get worse and worse.

May be something like this:
The port must not permit infinite displacements before reload.
  (all addressing modes must have a right form)
Fake addressing mode enabled only after reload_in_progress.
  (only small part of spilled/local variables will be addressed by
fake addressing mode.)

I don't like fake addressing mode and I don't want to support it at
all, but may be it's a best way to work around 'unable to find a
register to spill' problem. (the current AVR addressing code written
in this way)

>  You wind up with code like
>
>        subi r28,lo8(-133)
>        sbci r29,hi8(-133)
>        ld r18,Y
>        subi r28,lo8(133)
>        sbci r29,hi8(133)
>        subi r28,lo8(-134)
>        sbci r29,hi8(-134)
>        ld r19,Y
>        subi r28,lo8(134)
>        sbci r29,hi8(134)
>
> where obviously the 4 middle instructions could be eliminated.
>
> If we came out of reload (or shortly thereafter via split) with
> these as separate patterns, we might be able to eliminate them
> via an existing optimization pass.  OTOH, an existing pass might
> refuse to operate on the frame pointer because the frame pointer
> is usually Special.

IMHO the source of the problem is a REGISTER ALLOCATOR because it is
not handle a register elimination and strict insn constraints.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 18:57                                     ` Richard Henderson
@ 2011-06-16 19:33                                       ` Denis Chertykov
  2011-06-16 19:42                                         ` Richard Henderson
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-16 19:33 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/16 Richard Henderson <rth@redhat.com>:
> On 06/16/2011 11:09 AM, Denis Chertykov wrote:
>> Only one question why you removed avr_legitimize_address ?
>
> It doesn't actually do anything useful.
>
> All it does is, for a subset of inputs, force the address into
> a register.  This is the same as the fallback action.  That is,
> any non-legitimate address is forced into a register.
>
> A more useful version of legitimize_address would be to split
> a large offset into two pieces, the large one to be CSE'd and
> the small one that fits into a memory offset.
>
> E.g. the following with which I was experimenting.

Are you working in public git repository ?
Can I chekout it ?

If not then may be you can drop a full patch against svn.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 19:33                                       ` Denis Chertykov
@ 2011-06-16 19:42                                         ` Richard Henderson
  2011-06-16 20:04                                           ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2011-06-16 19:42 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

On 06/16/2011 12:17 PM, Denis Chertykov wrote:
> Are you working in public git repository ?
> Can I chekout it ?
> 
> If not then may be you can drop a full patch against svn.

I've pushed it to 

  git://gcc.gnu.org/git/gcc.git rth/avr-addressing

As I said yesterday, there's still a crash in libgcc, so
we've merely exchanged one crash for another from G-J's
patch.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16 19:42                                         ` Richard Henderson
@ 2011-06-16 20:04                                           ` Denis Chertykov
  0 siblings, 0 replies; 47+ messages in thread
From: Denis Chertykov @ 2011-06-16 20:04 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/16 Richard Henderson <rth@redhat.com>:
> On 06/16/2011 12:17 PM, Denis Chertykov wrote:
>> Are you working in public git repository ?
>> Can I chekout it ?
>>
>> If not then may be you can drop a full patch against svn.
>
> I've pushed it to
>
>  git://gcc.gnu.org/git/gcc.git rth/avr-addressing

Thank you.

> As I said yesterday, there's still a crash in libgcc, so
> we've merely exchanged one crash for another from G-J's
> patch.

It's not a problem for experiments with new addressing code.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-10  9:27   ` Georg-Johann Lay
@ 2011-06-21 17:17     ` Georg-Johann Lay
  0 siblings, 0 replies; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-21 17:17 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: gcc-patches, Anatoly Sokolov, Eric B. Weddington, Richard Henderson

Do you think we can split the patch intended to fix PR46779

> 	PR target/46779
> 	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
> 	In particular, allow 8-bit values in r28 and r29.
> 	(avr_hard_regno_scratch_ok): Disallow any register that might be
> 	part of the frame pointer.
> 	(avr_hard_regno_rename_ok): Same.

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html

from the other work discussed in this thread that aims at fixing
PR45291 which is "just" an optimization issue because of fake X
addressing?

These PR are two different things.

Johann

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-13 18:10           ` Georg-Johann Lay
  2011-06-13 22:37             ` Denis Chertykov
@ 2011-06-22  3:28             ` Hans-Peter Nilsson
  2011-06-22 17:03               ` Georg-Johann Lay
  1 sibling, 1 reply; 47+ messages in thread
From: Hans-Peter Nilsson @ 2011-06-22  3:28 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington, Richard Henderson

On Mon, 13 Jun 2011, Georg-Johann Lay wrote:
> [In CCing Richard Henderson]
> Denis Chertykov schrieb:
> > 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:

> > > Then I observed trouble with DI patterns during libgcc build and had
> > > to remove
> > >
> > > * "zero_extendqidi2"
> > > * "zero_extendhidi2"
> > > * "zero_extendsidi2"
> > >
> > > These are "orphan" insns: they deal with DI without having movdi
> > > support so I removed them.
> >
> > This seems strange for me.
>
> As far as I know, to support a mode a respective mov insn is needed,

For the record, not in general, just if you have patterns
operating on DImode.  I.e. if you always have to call into
libgcc for every operation, you're fine with just SImode, as the
access will be split into SImode accesses.  (That reload can't
split the access is arguably a wart.)

It's even documented, "node Standard Names" for mov@var{m}:
"If there are patterns accepting operands in larger modes,
@samp{mov@var{m}} must be defined for integer modes of those
sizes."

> which is
> not the case for DI. I don't know the exact rationale behind that
> (reloading?),

Yes.  (I ran into problems with this myself long ago.)

> just read is on gcc list by Ian Taylor (and also that it is
> stronly discouraged to have more than one mov insn per mode).

That is correct.

> So if the requirement to have mov insn is dropped and without the burden to
> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
> avr...

Resist the temptation... I see you did. :)

brgds, H-P

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-22  3:28             ` Hans-Peter Nilsson
@ 2011-06-22 17:03               ` Georg-Johann Lay
  2011-06-23 12:51                 ` Hans-Peter Nilsson
  2011-06-23 13:00                 ` Hans-Peter Nilsson
  0 siblings, 2 replies; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-22 17:03 UTC (permalink / raw)
  To: Hans-Peter Nilsson
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington, Richard Henderson

Hans-Peter Nilsson schrieb:
> On Mon, 13 Jun 2011, Georg-Johann Lay wrote:
>> [In CCing Richard Henderson]
>> Denis Chertykov schrieb:
>>> 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> 
>>>> Then I observed trouble with DI patterns during libgcc build and had
>>>> to remove
>>>>
>>>> * "zero_extendqidi2"
>>>> * "zero_extendhidi2"
>>>> * "zero_extendsidi2"
>>>>
>>>> These are "orphan" insns: they deal with DI without having movdi
>>>> support so I removed them.
>>> This seems strange for me.
>> As far as I know, to support a mode a respective mov insn is needed,
> 
> For the record, not in general, just if you have patterns
> operating on DImode.  I.e. if you always have to call into
> libgcc for every operation, you're fine with just SImode, as the
> access will be split into SImode accesses.  (That reload can't
> split the access is arguably a wart.)

For avr it's actually split in QImode (word_mode), SImode would be
more efficient.

> It's even documented, "node Standard Names" for mov@var{m}:
> "If there are patterns accepting operands in larger modes,
> @samp{mov@var{m}} must be defined for integer modes of those
> sizes."

Thanks for pointing that out.

For avr that means:  There is movsf pattern that is implemented less
efficient than movsi.  So removing movsf could improve code a bit.
Besides efficiency, code for movsi and movsf can be the same on avr.

>> which is
>> not the case for DI. I don't know the exact rationale behind that
>> (reloading?),
> 
> Yes.  (I ran into problems with this myself long ago.)

So the zero_extend*di2 pattern are bogus because there is no movdi.

>> just read is on gcc list by Ian Taylor (and also that it is
>> stronly discouraged to have more than one mov insn per mode).
> 
> That is correct.
> 
>> So if the requirement to have mov insn is dropped and without the burden to
>> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
>> avr...
> 
> Resist the temptation... I see you did. :)

The preferred handling is still that optabs cared for calling __adddi3
if there is no adddi3 pattern... The target would have to care for
implementing __adddi3 so generic libgcc need not to be changed and IMO
changing libgcc for that would not be adequate.

Johann

> brgds, H-P

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-22 17:03               ` Georg-Johann Lay
@ 2011-06-23 12:51                 ` Hans-Peter Nilsson
  2011-06-23 13:00                 ` Hans-Peter Nilsson
  1 sibling, 0 replies; 47+ messages in thread
From: Hans-Peter Nilsson @ 2011-06-23 12:51 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington, Richard Henderson



On Wed, 22 Jun 2011, Georg-Johann Lay wrote:

> Hans-Peter Nilsson schrieb:
> > On Mon, 13 Jun 2011, Georg-Johann Lay wrote:
> >> [In CCing Richard Henderson]
> >> Denis Chertykov schrieb:
> >>> 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> >
> >>>> Then I observed trouble with DI patterns during libgcc build and had
> >>>> to remove
> >>>>
> >>>> * "zero_extendqidi2"
> >>>> * "zero_extendhidi2"
> >>>> * "zero_extendsidi2"
> >>>>
> >>>> These are "orphan" insns: they deal with DI without having movdi
> >>>> support so I removed them.
> >>> This seems strange for me.
> >> As far as I know, to support a mode a respective mov insn is needed,
> >
> > For the record, not in general, just if you have patterns
> > operating on DImode.  I.e. if you always have to call into
> > libgcc for every operation, you're fine with just SImode, as the
> > access will be split into SImode accesses.  (That reload can't
> > split the access is arguably a wart.)
>
> For avr it's actually split in QImode (word_mode), SImode would be
> more efficient.
>
> > It's even documented, "node Standard Names" for mov@var{m}:
> > "If there are patterns accepting operands in larger modes,
> > @samp{mov@var{m}} must be defined for integer modes of those
> > sizes."
>
> Thanks for pointing that out.
>
> For avr that means:  There is movsf pattern that is implemented less
> efficient than movsi.  So removing movsf could improve code a bit.
> Besides efficiency, code for movsi and movsf can be the same on avr.
>
> >> which is
> >> not the case for DI. I don't know the exact rationale behind that
> >> (reloading?),
> >
> > Yes.  (I ran into problems with this myself long ago.)
>
> So the zero_extend*di2 pattern are bogus because there is no movdi.
>
> >> just read is on gcc list by Ian Taylor (and also that it is
> >> stronly discouraged to have more than one mov insn per mode).
> >
> > That is correct.
> >
> >> So if the requirement to have mov insn is dropped and without the burden to
> >> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
> >> avr...
> >
> > Resist the temptation... I see you did. :)
>
> The preferred handling is still that optabs cared for calling __adddi3
> if there is no adddi3 pattern... The target would have to care for
> implementing __adddi3 so generic libgcc need not to be changed and IMO
> changing libgcc for that would not be adequate.
>
> Johann
>
> > brgds, H-P
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-22 17:03               ` Georg-Johann Lay
  2011-06-23 12:51                 ` Hans-Peter Nilsson
@ 2011-06-23 13:00                 ` Hans-Peter Nilsson
  1 sibling, 0 replies; 47+ messages in thread
From: Hans-Peter Nilsson @ 2011-06-23 13:00 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Denis Chertykov, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington, Richard Henderson

Sorry for the earlier semi-empty mail (just quoting G-J), I
meant to cancel it.  Happy midsummer.

brgds, H-P

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-16  3:46                           ` Richard Henderson
  2011-06-16 11:37                             ` Denis Chertykov
@ 2011-06-23 20:48                             ` Denis Chertykov
  2011-06-23 22:04                               ` Richard Henderson
  1 sibling, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-23 20:48 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/16 Richard Henderson <rth@redhat.com>:
> On 06/15/2011 02:58 PM, Richard Henderson wrote:
>> Indeed, I can work around this particular crash by either
>> hacking Z to be call-saved, or hacking the frame pointer to
>> not be required.  The former of course changes the abi, and
>> the second produces awful code due to too many copies from
>> the stack pointer.  So neither option is "preferred".
>
> Perhaps I spoke too soon re the frame pointer.  The old
> code is even worse.
>
>   text    data     bss     dec     hex filename
>  10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>   5816      25       0    5841    16d1 bld-avr-new/gcc/z.o

Richard, can you send me this z.c file ?
Right now I'm notice that new code is worse.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-23 20:48                             ` Denis Chertykov
@ 2011-06-23 22:04                               ` Richard Henderson
  2011-06-26 20:03                                 ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2011-06-23 22:04 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

On 06/23/2011 01:15 PM, Denis Chertykov wrote:
>>   text    data     bss     dec     hex filename
>>  10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>   5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
> 
> Richard, can you send me this z.c file ?
> Right now I'm notice that new code is worse.

That's gcc.c-torture/compile/950612-1.c.


r~

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-23 22:04                               ` Richard Henderson
@ 2011-06-26 20:03                                 ` Denis Chertykov
  2011-06-26 20:51                                   ` Georg-Johann Lay
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-26 20:03 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Georg-Johann Lay, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/24 Richard Henderson <rth@redhat.com>:
> On 06/23/2011 01:15 PM, Denis Chertykov wrote:
>>>   text    data     bss     dec     hex filename
>>>  10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>>   5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>>
>> Richard, can you send me this z.c file ?
>> Right now I'm notice that new code is worse.
>
> That's gcc.c-torture/compile/950612-1.c.
>

I have founded that postreload optimizations can't handle results of
new L_R_A code.
I think that it's can be handled by CSE (postreload).

This is a fragment from 950612-1.c:

---------- you can skip it I have a short version below -------
(insn 5186 23 5187 2 (set (reg:HI 30 r30)
        (const_int 128 [0x80])) c950612-1.c:20 9 {*movhi}
     (nil))

(insn 5187 5186 5189 2 (set (reg:HI 30 r30)
        (plus:HI (reg:HI 30 r30)
            (reg/f:HI 28 r28))) c950612-1.c:20 21 {*addhi3}
     (expr_list:REG_EQUIV (plus:HI (reg/f:HI 28 r28)
            (const_int 128 [0x80]))
        (nil)))

(insn 5189 5187 2451 2 (set (mem/c:HI (plus:HI (reg:HI 30 r30)
                (const_int 1 [0x1])) [8 %sfp+129 S2 A8])
        (reg:HI 18 r18)) c950612-1.c:20 9 {*movhi}
     (nil))

(note 2451 5189 2537 2 NOTE_INSN_DELETED)

(note 2537 2451 1651 2 NOTE_INSN_DELETED)

(note 1651 2537 3180 2 NOTE_INSN_DELETED)

(note 3180 1651 5191 2 NOTE_INSN_DELETED)

(insn 5191 3180 5192 2 (set (reg:HI 30 r30)
        (const_int 128 [0x80])) c950612-1.c:132 9 {*movhi}
     (nil))

(insn 5192 5191 5071 2 (set (reg:HI 30 r30)
        (plus:HI (reg:HI 30 r30)
            (reg/f:HI 28 r28))) c950612-1.c:132 21 {*addhi3}
     (expr_list:REG_EQUIV (plus:HI (reg/f:HI 28 r28)
            (const_int 128 [0x80]))
        (nil)))

(insn 5071 5192 5073 2 (set (mem/c:HI (plus:HI (reg:HI 30 r30)
                (const_int 3 [0x3])) [8 %sfp+131 S2 A8])
        (reg/v/f:HI 8 r8 [orig:211 pc ] [211])) c950612-1.c:132 9 {*movhi}
     (nil))
------------------------------------------------------

Insns 5186,5187 equal to 5191,5192 and 5191,5192 can be removed, but
reload_cse_regs_1 operate only on one insn.

5186 Z=128
5187 Z=Y+128       ; REG_EQUIV Z=Y+128
5189 HI:[Z+1]=HI:R18
...(deleted insns)
5191 Z=128
5192 Z=Y+128       ; REG_EQUIV Z=Y+128

(5191,5192) really a one three addressing add Z=Y+128.
Insns (5191,5192) exists because AVR havn't 3 addressing add.
Insn 5191 destroy REG_EQUIV (it's right), but reload_cse_regs_1 can't
optimize insns 5191,5192.

Right now I have only one idea:
1. create peephole2 for joining such insns (5191,5192);
2. inside machine dependent pass rerun postreload and may be gcse2;
3. split joined insns to originals.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-26 20:03                                 ` Denis Chertykov
@ 2011-06-26 20:51                                   ` Georg-Johann Lay
  2011-06-27  8:41                                     ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-26 20:51 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov, Eric B. Weddington

Denis Chertykov schrieb:
> 2011/6/24 Richard Henderson <rth@redhat.com>:
> 
>>On 06/23/2011 01:15 PM, Denis Chertykov wrote:
>>
>>>>  text    data     bss     dec     hex filename
>>>> 10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>>>  5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>>>
>>>Richard, can you send me this z.c file ?
>>>Right now I'm notice that new code is worse.
>>
>>That's gcc.c-torture/compile/950612-1.c.
> 
> I have founded that postreload optimizations can't handle results of
> new L_R_A code.
> I think that it's can be handled by CSE (postreload).
> 
> This is a fragment from 950612-1.c:
> 
> ---------- you can skip it I have a short version below -------
> (insn 5186 23 5187 2 (set (reg:HI 30 r30)
>         (const_int 128 [0x80])) c950612-1.c:20 9 {*movhi}
>      (nil))
> 
> (insn 5187 5186 5189 2 (set (reg:HI 30 r30)
>         (plus:HI (reg:HI 30 r30)
>             (reg/f:HI 28 r28))) c950612-1.c:20 21 {*addhi3}
>      (expr_list:REG_EQUIV (plus:HI (reg/f:HI 28 r28)
>             (const_int 128 [0x80]))
>         (nil)))
> 
> (insn 5189 5187 2451 2 (set (mem/c:HI (plus:HI (reg:HI 30 r30)
>                 (const_int 1 [0x1])) [8 %sfp+129 S2 A8])
>         (reg:HI 18 r18)) c950612-1.c:20 9 {*movhi}
>      (nil))
> 
> (note 2451 5189 2537 2 NOTE_INSN_DELETED)
> 
> (note 2537 2451 1651 2 NOTE_INSN_DELETED)
> 
> (note 1651 2537 3180 2 NOTE_INSN_DELETED)
> 
> (note 3180 1651 5191 2 NOTE_INSN_DELETED)
> 
> (insn 5191 3180 5192 2 (set (reg:HI 30 r30)
>         (const_int 128 [0x80])) c950612-1.c:132 9 {*movhi}
>      (nil))
> 
> (insn 5192 5191 5071 2 (set (reg:HI 30 r30)
>         (plus:HI (reg:HI 30 r30)
>             (reg/f:HI 28 r28))) c950612-1.c:132 21 {*addhi3}
>      (expr_list:REG_EQUIV (plus:HI (reg/f:HI 28 r28)
>             (const_int 128 [0x80]))
>         (nil)))
> 
> (insn 5071 5192 5073 2 (set (mem/c:HI (plus:HI (reg:HI 30 r30)
>                 (const_int 3 [0x3])) [8 %sfp+131 S2 A8])
>         (reg/v/f:HI 8 r8 [orig:211 pc ] [211])) c950612-1.c:132 9 {*movhi}
>      (nil))
> ------------------------------------------------------
> 
> Insns 5186,5187 equal to 5191,5192 and 5191,5192 can be removed, but
> reload_cse_regs_1 operate only on one insn.
> 
> 5186 Z=128
> 5187 Z=Y+128       ; REG_EQUIV Z=Y+128
> 5189 HI:[Z+1]=HI:R18
> ...(deleted insns)
> 5191 Z=128
> 5192 Z=Y+128       ; REG_EQUIV Z=Y+128
> 
> (5191,5192) really a one three addressing add Z=Y+128.
> Insns (5191,5192) exists because AVR havn't 3 addressing add.
> Insn 5191 destroy REG_EQUIV (it's right), but reload_cse_regs_1 can't
> optimize insns 5191,5192.

Did you try to add constraint alternative to *addhi3?
Like "*!d,d,n" or even "*!r,r,n"

I saw some code improvement with that alternative.

Johann

> Right now I have only one idea:
> 1. create peephole2 for joining such insns (5191,5192);
> 2. inside machine dependent pass rerun postreload and may be gcse2;
> 3. split joined insns to originals.
> 
> Denis.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-26 20:51                                   ` Georg-Johann Lay
@ 2011-06-27  8:41                                     ` Denis Chertykov
  2011-06-27  9:19                                       ` Georg-Johann Lay
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-27  8:41 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/6/26 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov schrieb:
>>
>> 2011/6/24 Richard Henderson <rth@redhat.com>:
>>
>>> On 06/23/2011 01:15 PM, Denis Chertykov wrote:
>>>
>>>>>  text    data     bss     dec     hex filename
>>>>> 10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>>>>  5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>>>>
>>>> Richard, can you send me this z.c file ?
>>>> Right now I'm notice that new code is worse.
>>>
>>> That's gcc.c-torture/compile/950612-1.c.
>>
>> I have founded that postreload optimizations can't handle results of
>> new L_R_A code.
>> I think that it's can be handled by CSE (postreload).
> Did you try to add constraint alternative to *addhi3?
> Like "*!d,d,n" or even "*!r,r,n"
>
> I saw some code improvement with that alternative.

I'm trying:

(define_insn "*addhi3"
  [(set (match_operand:HI 0 "register_operand" "=r,!w,!w,d,r,r,!d")
 	(plus:HI
 	 (match_operand:HI 1 "register_operand" "%0,0,0,0,0,0,!r")
 	 (match_operand:HI 2 "nonmemory_operand" "r,I,J,i,P,N,!ri")))]
  ""
  "@
 	add %A0,%A2\;adc %B0,%B2
 	adiw %A0,%2
 	sbiw %A0,%n2
 	subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))
 	sec\;adc %A0,__zero_reg__\;adc %B0,__zero_reg__
 	sec\;sbc %A0,__zero_reg__\;sbc %B0,__zero_reg__
        #"
  [(set_attr "length" "2,1,1,2,3,3,4")
   (set_attr "cc" "set_n,set_czn,set_czn,set_czn,set_n,set_n,set_n")])

;; Special split three addressing addhi3
;; to make postreload optimization possible
(define_split ; addhi3 !d,!r,!ri
  [(set (match_operand:HI 0 "d_register_operand" "")
	(plus:HI (match_operand:HI 1 "register_operand" "")
		 (match_operand:HI 2 "nonmemory_operand" "")))]
  "reload_completed"
  [(set (match_dup 0) (match_dup 2))
   (set (match_dup 0) (plus:HI (match_dup 0) (match_dup 1)))]
  "")

The main problem for me is that the new addressing mode produce a
worse code in many tests.
Although, results for gcc.c-torture/compile/950612-1.c is
significantly better with new addressing.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-27  8:41                                     ` Denis Chertykov
@ 2011-06-27  9:19                                       ` Georg-Johann Lay
  2011-06-27 10:17                                         ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-06-27  9:19 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov, Eric B. Weddington

Denis Chertykov wrote:
> 2011/6/26 Georg-Johann Lay <avr@gjlay.de>:
>> Denis Chertykov schrieb:
>>> 2011/6/24 Richard Henderson <rth@redhat.com>:
>>>
>>>> On 06/23/2011 01:15 PM, Denis Chertykov wrote:
>>>>
>>>>>>  text    data     bss     dec     hex filename
>>>>>> 10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>>>>>  5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>>>>> Richard, can you send me this z.c file ?
>>>>> Right now I'm notice that new code is worse.
>>>> That's gcc.c-torture/compile/950612-1.c.
>>> I have founded that postreload optimizations can't handle results of
>>> new L_R_A code.
>>> I think that it's can be handled by CSE (postreload).
>> Did you try to add constraint alternative to *addhi3?
>> Like "*!d,d,n" or even "*!r,r,n"
>>
>> I saw some code improvement with that alternative.
> 
> I'm trying:
> 
> (define_insn "*addhi3"
>   [(set (match_operand:HI 0 "register_operand" "=r,!w,!w,d,r,r,!d")
>  	(plus:HI
>  	 (match_operand:HI 1 "register_operand" "%0,0,0,0,0,0,!r")
>  	 (match_operand:HI 2 "nonmemory_operand" "r,I,J,i,P,N,!ri")))]
>   ""
>   "@
>  	add %A0,%A2\;adc %B0,%B2
>  	adiw %A0,%2
>  	sbiw %A0,%n2
>  	subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))
>  	sec\;adc %A0,__zero_reg__\;adc %B0,__zero_reg__
>  	sec\;sbc %A0,__zero_reg__\;sbc %B0,__zero_reg__
>         #"
>   [(set_attr "length" "2,1,1,2,3,3,4")
>    (set_attr "cc" "set_n,set_czn,set_czn,set_czn,set_n,set_n,set_n")])
> 

That split will split always:

> ;; Special split three addressing addhi3
> ;; to make postreload optimization possible
> (define_split ; addhi3 !d,!r,!ri
>   [(set (match_operand:HI 0 "d_register_operand" "")
> 	(plus:HI (match_operand:HI 1 "register_operand" "")
> 		 (match_operand:HI 2 "nonmemory_operand" "")))]
>   "reload_completed"
     && REGNO(operands[0]) != REGNO(operands[1])"
>   [(set (match_dup 0) (match_dup 2))
>    (set (match_dup 0) (plus:HI (match_dup 0) (match_dup 1)))]
>   "")
Maybe it can also restrict to const_int_operand in #2 and then it's
best to
    (set (match_dup 0)
         (match_dup 1))
    (set (match_dup 0)
         (plus:HI (match_dup 0)
                  (match_dup 2)))
> 
> The main problem for me is that the new addressing mode produce a
> worse code in many tests.

You have an example source?

> Although, results for gcc.c-torture/compile/950612-1.c is
> significantly better with new addressing.

That testcase is pathologic for AVR...

> 
> Denis.
> 

Johann

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-27  9:19                                       ` Georg-Johann Lay
@ 2011-06-27 10:17                                         ` Denis Chertykov
  2011-07-07  9:59                                           ` Georg-Johann Lay
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-06-27 10:17 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov, Eric B. Weddington

[-- Attachment #1: Type: text/plain, Size: 2683 bytes --]

2011/6/27 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov wrote:
>> 2011/6/26 Georg-Johann Lay <avr@gjlay.de>:
>>> Denis Chertykov schrieb:
>>>> 2011/6/24 Richard Henderson <rth@redhat.com>:
>>>>
>>>>> On 06/23/2011 01:15 PM, Denis Chertykov wrote:
>>>>>
>>>>>>>  text    data     bss     dec     hex filename
>>>>>>> 10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>>>>>>  5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>>>>>> Richard, can you send me this z.c file ?
>>>>>> Right now I'm notice that new code is worse.
>>>>> That's gcc.c-torture/compile/950612-1.c.
>>>> I have founded that postreload optimizations can't handle results of
>>>> new L_R_A code.
>>>> I think that it's can be handled by CSE (postreload).
>>> Did you try to add constraint alternative to *addhi3?
>>> Like "*!d,d,n" or even "*!r,r,n"
>>>
>>> I saw some code improvement with that alternative.
>>
>> I'm trying:
>>
>> (define_insn "*addhi3"
>>   [(set (match_operand:HI 0 "register_operand" "=r,!w,!w,d,r,r,!d")
>>       (plus:HI
>>        (match_operand:HI 1 "register_operand" "%0,0,0,0,0,0,!r")
>>        (match_operand:HI 2 "nonmemory_operand" "r,I,J,i,P,N,!ri")))]
>>   ""
>>   "@
>>       add %A0,%A2\;adc %B0,%B2
>>       adiw %A0,%2
>>       sbiw %A0,%n2
>>       subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))
>>       sec\;adc %A0,__zero_reg__\;adc %B0,__zero_reg__
>>       sec\;sbc %A0,__zero_reg__\;sbc %B0,__zero_reg__
>>         #"
>>   [(set_attr "length" "2,1,1,2,3,3,4")
>>    (set_attr "cc" "set_n,set_czn,set_czn,set_czn,set_n,set_n,set_n")])
>>
>
> That split will split always:
>
>> ;; Special split three addressing addhi3
>> ;; to make postreload optimization possible
>> (define_split ; addhi3 !d,!r,!ri
>>   [(set (match_operand:HI 0 "d_register_operand" "")
>>       (plus:HI (match_operand:HI 1 "register_operand" "")
>>                (match_operand:HI 2 "nonmemory_operand" "")))]
>>   "reload_completed"
>     && REGNO(operands[0]) != REGNO(operands[1])"
>>   [(set (match_dup 0) (match_dup 2))
>>    (set (match_dup 0) (plus:HI (match_dup 0) (match_dup 1)))]
>>   "")
> Maybe it can also restrict to const_int_operand in #2 and then it's
> best to
>    (set (match_dup 0)
>         (match_dup 1))
>    (set (match_dup 0)
>         (plus:HI (match_dup 0)
>                  (match_dup 2)))

Thanks for suggestions.

>> The main problem for me is that the new addressing mode produce a
>> worse code in many tests.
>
> You have an example source?

In attachment.

Denis.

[-- Attachment #2: pr.c --]
[-- Type: text/x-csrc, Size: 6116 bytes --]

#include <stdarg.h>
#include <string.h>
#include <stdio.h>

#if 0
unsigned long simple_strtoul(const char *cp,char **endp,unsigned int base)
{
	unsigned long result = 0,value;

	if (!base) {
		base = 10;
		if (*cp == '0') {
			base = 8;
			cp++;
			if ((*cp == 'x') && isxdigit(cp[1])) {
				cp++;
				base = 16;
			}
		}
	}
	while (isxdigit(*cp) && (value = is_digit(*cp) ? *cp-'0' : (islower(*cp)
	    ? toupper(*cp) : *cp)-'A'+10) < base) {
		result = result*base + value;
		cp++;
	}
	if (endp)
		*endp = (char *)cp;
	return result;
}
#endif

/* we use this so that we can do without the ctype library */

static int skip_atoi(const char **s)
{
	int i=0;

	while (is_digit(**s))
		i = i*10 + *((*s)++) - '0';
	return i;
}

#define ZEROPAD	1		/* pad with zero */
#define SIGN	2		/* unsigned/signed long */
#define PLUS	4		/* show plus */
#define SPACE	8		/* space if plus */
#define LEFT	16		/* left justified */
#define SPECIAL	32		/* 0x */
#define LARGE	64		/* use 'ABCDEF' instead of 'abcdef' */

#define do_div(n,base) ({ \
int __res; \
__res = ((unsigned long long) n) % (unsigned) base; \
n = ((unsigned long long) n) / (unsigned) base; \
__res; })
     
static char * number(char * str, long long num, int base, int size, int precision
	,int type)
{
	char c,sign,tmp[20];
	const char *digits="0123456789abcdef";
	int i;
	/*
	if (type & LARGE)
		digits = "0123456789ABCDEF";
	*/
	if (type & LEFT)
		type &= ~ZEROPAD;
	if (base < 2 || base > 36)
		return 0;
	c = (type & ZEROPAD) ? '0' : ' ';
	sign = 0;

	if (type & SIGN) {
		if (num < 0) {
			sign = '-';
			num = -num;
			size--;
		} else if (type & PLUS) {
			sign = '+';
			size--;
		} else if (type & SPACE) {
			sign = ' ';
			size--;
		}
	}
	if (type & SPECIAL) {
		if (base == 16)
			size -= 2;
		else if (base == 8)
			size--;
	}
	i = 0;
	if (num == 0)
		tmp[i++]='0';
	else while (num != 0)
	            tmp[i++] = digits[do_div(num,base)];
	if (i > precision)
		precision = i;
	size -= precision;
	if (!(type&(ZEROPAD+LEFT)))
		while(size-->0)
			*str++ = ' ';
	if (sign)
		*str++ = sign;
	if (type & SPECIAL) {
		if (base==8)
			*str++ = '0';
		else if (base==16) {
			*str++ = '0';
			*str++ = 'x';
		}
	}
	if (!(type & LEFT))
		while (size-- > 0)
			*str++ = c;
	while (i < precision--)
		*str++ = '0';
	while (i-- > 0)
		*str++ = tmp[i];
	while (size-- > 0)
		*str++ = ' ';
	return str;
}

static int vsprintf(char *buf, const char *fmt, va_list args)
{
	int len;
	unsigned long long num;
	int i, base;
	char * str;
	const char *s;

	int flags;		/* flags to number() */

	int field_width;	/* width of output field */
	int precision;		/* min. # of digits for integers; max
				   number of chars for from string */
	int qualifier;		/* 'h', 'l', or 'L' for integer fields */

	for (str=buf ; *fmt ; ++fmt) {
		if (*fmt != '%') {
			*str++ = *fmt;
			continue;
		}
			
		/* process flags */
		flags = 0;
		repeat:
			++fmt;		/* this also skips first '%' */
			switch (*fmt) {
				case '-': flags |= LEFT; goto repeat;
				case '+': flags |= PLUS; goto repeat;
				case ' ': flags |= SPACE; goto repeat;
				case '#': flags |= SPECIAL; goto repeat;
				case '0': flags |= ZEROPAD; goto repeat;
				}
		
		/* get field width */
		field_width = -1;
		if (is_digit(*fmt))
			field_width = skip_atoi(&fmt);
		else if (*fmt == '*') {
			++fmt;
			/* it's the next argument */
			field_width = va_arg(args, int);
			if (field_width < 0) {
				field_width = -field_width;
				flags |= LEFT;
			}
		}

		/* get the precision */
		precision = -1;
#if 0
		if (*fmt == '.') {
			++fmt;	
			if (is_digit(*fmt))
				precision = skip_atoi(&fmt);
			else if (*fmt == '*') {
				++fmt;
				/* it's the next argument */
				precision = va_arg(args, int);
			}
			if (precision < 0)
				precision = 0;
		}
#endif
		/* get the conversion qualifier */
		qualifier = -1;
		if (*fmt == 'h' || *fmt == 'l' || *fmt == 'L') {
			qualifier = *fmt;
			++fmt;
		}

		/* default base */
		base = 10;

		switch (*fmt) {
		  /*
		case 'c':
			if (!(flags & LEFT))
				while (--field_width > 0)
					*str++ = ' ';
			*str++ = (unsigned char) va_arg(args, int);
			while (--field_width > 0)
				*str++ = ' ';
			continue;
		  */
		case 's':
			s = va_arg(args, char *);
			if (!s)
				s = "<NULL>";

			len = strnlen(s, precision);

			if (!(flags & LEFT))
				while (len < field_width--)
					*str++ = ' ';
			for (i = 0; i < len; ++i)
				*str++ = *s++;
			while (len < field_width--)
				*str++ = ' ';
			continue;
			/*
		case 'p':
			if (field_width == -1) {
				field_width = 2*sizeof(void *);
				flags |= ZEROPAD;
			}
			str = number(str,
				(unsigned long) va_arg(args, void *), 16,
				field_width, precision, flags);
			continue;
			*/
			/*
		case 'n':
			if (qualifier == 'l') {
				long * ip = va_arg(args, long *);
				*ip = (str - buf);
			} else {
				int * ip = va_arg(args, int *);
				*ip = (str - buf);
			}
			continue;
			*/
		/* integer number formats - set up the flags and "break" */
			/*
		case 'o':
			base = 8;
			break;
			*/
			/*
		case 'X':
			flags |= LARGE;
			*/
		case 'x':
			base = 16;
			break;

		case 'd':
		case 'i':
			flags |= SIGN;
		case 'u':
			break;

		default:
			if (*fmt != '%')
				*str++ = '%';
			if (*fmt)
				*str++ = *fmt;
			else
				--fmt;
			continue;
		}
		if (qualifier == 'L')
		  num = va_arg(args, unsigned long long);
		else if (qualifier == 'l')
			if (flags & SIGN)
				num = va_arg(args, long);
			else
				num = va_arg(args, unsigned long);
		else if (qualifier == 'h') {
			if (flags & SIGN)
				num = va_arg(args, short);
			else
				num = va_arg(args, unsigned short);
		} else if (flags & SIGN)
			num = va_arg(args, int);
		else
			num = va_arg(args, unsigned int);
		str = number(str, num, base, field_width, precision, flags);
	}
	*str = '\0';
	return str-buf;
}

/*
int sprintf(char * buf, const char *fmt, ...)
{
	va_list args;
	int i;

	va_start(args, fmt);
	i=vsprintf(buf,fmt,args);
	va_end(args);
	return i;
}
*/

int printf(const char *fmt, ...)
{
  char buf[100];
  va_list args;
  int i;

  va_start(args, fmt);
  i=vsprintf(buf,fmt,args);
  send_str (buf);
  va_end(args);
  return i;
}


[-- Attachment #3: sort.c --]
[-- Type: text/x-csrc, Size: 1195 bytes --]

/* This demonstrates some of the compiler's common-subexpression*/
/* elimination capabilities.  For example, inspect the code */
/* generated for procedure Sort_array.  See the Programmer's    */
/* Guide for how to request an assembly listing on your host.   */

typedef unsigned char boolean;

void Sort_array();
long Tab[100];

main () {
   long I,J,K,L;

for (L = 0; L < 1000; L++) {
   /* Initialize the table that will be sorted. */
   K = 0;
   for (I = 9; I >= 0; I--)
      for (J = I*10; J < (I+1)*10; J++)
     Tab[K++] = J&1 ? J+1 : J-1;

/*   Print_array(); */
   Sort_array(Tab,99);     /* Sort it. */
/*   Print_array(); */
}
/* */ exit(0); /* */
}

void
Sort_array (Tab,Last)
     int Tab[];
     long Last;
{
   boolean Swap;
   int Temp,I;

   do
     {
       Swap = 0;
       for (I = 0; I < Last; I++)
	 if (Tab[I] > Tab[I+1])
	   {
	     Temp = Tab[I];
	     Tab[I] = Tab[I+1];
	     Tab[I+1] = Temp;
	     Swap = 1;
	   }
     } while (Swap);
}

void Print_array() {
   int I,J;
   /*printf("\nArray Contents:\n");*/
   for (I=0; I<=9; I++) {
      /*printf("%5d:",10*I); */
      for (J=0; J<=9; J++); /*printf("%5d",Tab[10*I+J]); */
      /* printf("\n");*/
      }
}


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-06-27 10:17                                         ` Denis Chertykov
@ 2011-07-07  9:59                                           ` Georg-Johann Lay
  2011-07-07 18:21                                             ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-07-07  9:59 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov, Eric B. Weddington

[-- Attachment #1: Type: text/plain, Size: 1858 bytes --]

Denis Chertykov wrote:
> 2011/6/27 Georg-Johann Lay:
>> Denis Chertykov wrote:

>>> The main problem for me is that the new addressing mode produce a
>>> worse code in many tests.
>> You have an example source?
> 
> In attachment.
> 
> Denis.

Hi Denis.

I had a look at the sources you sent.

sort.c:
=======

There is some difference because of register allocation, but the new
code does not look awfully bad, just a bit different because of
different register allocation that might need some more bytes.

The difference is *not* because of deny fake X addressing, it's
because of the new avr_hard_regno_mode_ok implementation to fix
PR46779.  When I add


  if (GET_MODE_SIZE (mode) == 1)
     return 1;

+     if (SImode == mode && regno == 28)
+       return 0;

   return regno % 2 == 0;

to that function, the difference in code disappears.

pr.c:
=====

I get the following sizes with pr-0 the original compile and pr qith
my patch:

>avr-size pr-0.o
   text    data     bss     dec     hex filename
   2824      24       0    2848     b20 pr-0.o
>avr-size pr.o
   text    data     bss     dec     hex filename
   2564      24       0    2588     a1c pr.o

So the size actually decreased significantly.  Avoiding SI in
avr_hard_regno_mode_ok like above does not change code size.

============

Note that I did *not* use the version from the git repository; I could
not get reasonable code out of it (even after some fixes).  Hundreds
of testsuite crashes...

I used the initial patch that I posted; I attached it again for
reference. Note that LEGITIMIZE_RELOAD_ADDRESS is still not
implemented there.

============

Did you decide about the fix for PR46779?

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html

Is it ok to commit?

I think fix PR46779 and fix fake X addresses (PR46278) should be
separate patches and not intermixed.

Johann


[-- Attachment #2: new-addr-v4.diff --]
[-- Type: text/x-patch, Size: 21751 bytes --]

Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 175956)
+++ config/avr/avr.md	(working copy)
@@ -246,8 +246,8 @@ (define_expand "movqi"
   ")
 
 (define_insn "*movqi"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,d,Qm,r,q,r,*r")
-	(match_operand:QI 1 "general_operand"       "rL,i,rL,Qm,r,q,i"))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,d,m,r,q,r,*r")
+	(match_operand:QI 1 "general_operand"       "rL,i,rL,m,r,q,i"))]
   "(register_operand (operands[0],QImode)
     || register_operand (operands[1], QImode) || const0_rtx == operands[1])"
   "* return output_movqi (insn, operands, NULL);"
@@ -295,15 +295,6 @@ (define_expand "movhi"
     }
 }")
 
-(define_insn "*movhi_sp"
-  [(set (match_operand:HI 0 "register_operand" "=q,r")
-        (match_operand:HI 1 "register_operand"  "r,q"))]
-  "((stack_register_operand(operands[0], HImode) && register_operand (operands[1], HImode))
-    || (register_operand (operands[0], HImode) && stack_register_operand(operands[1], HImode)))"
-  "* return output_movhi (insn, operands, NULL);"
-  [(set_attr "length" "5,2")
-   (set_attr "cc" "none,none")])
-
 (define_insn "movhi_sp_r_irq_off"
   [(set (match_operand:HI 0 "stack_register_operand" "=q")
         (unspec_volatile:HI [(match_operand:HI 1 "register_operand"  "r")] 
@@ -427,8 +418,8 @@ (define_insn "*reload_insi"
 
 
 (define_insn "*movsi"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,Qm,!d,r")
-        (match_operand:SI 1 "general_operand"       "r,L,Qm,rL,i,i"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,m,!d,r")
+        (match_operand:SI 1 "general_operand"       "r,L,m,rL,i,i"))]
   "(register_operand (operands[0],SImode)
     || register_operand (operands[1],SImode) || const0_rtx == operands[1])"
   {
@@ -455,8 +446,8 @@ (define_expand "movsf"
 }")
 
 (define_insn "*movsf"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r,Qm,!d,r")
-        (match_operand:SF 1 "general_operand"       "r,G,Qm,rG,F,F"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r,m,!d,r")
+        (match_operand:SF 1 "general_operand"       "r,G,m,rG,F,F"))]
   "register_operand (operands[0], SFmode)
    || register_operand (operands[1], SFmode)
    || operands[1] == CONST0_RTX (SFmode)"
@@ -1592,8 +1583,8 @@ (define_mode_attr rotx [(DI "&r,&r,X") (
 (define_mode_attr rotsmode [(DI "QI") (SI "HI") (HI "QI")])
 
 (define_expand "rotl<mode>3"
-  [(parallel [(set (match_operand:HIDI 0 "register_operand" "")
-		   (rotate:HIDI (match_operand:HIDI 1 "register_operand" "")
+  [(parallel [(set (match_operand:HISI 0 "register_operand" "")
+		   (rotate:HISI (match_operand:HISI 1 "register_operand" "")
 				(match_operand:VOID 2 "const_int_operand" "")))
 		(clobber (match_dup 3))])]
   ""
@@ -1692,7 +1683,7 @@ (define_split ; ashlqi3_const6
 (define_insn "*ashlqi3"
   [(set (match_operand:QI 0 "register_operand"           "=r,r,r,r,!d,r,r")
 	(ashift:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0,0")
-		   (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,Qm")))]
+		   (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,m")))]
   ""
   "* return ashlqi3_out (insn, operands, NULL);"
   [(set_attr "length" "5,0,1,2,4,6,9")
@@ -1701,7 +1692,7 @@ (define_insn "*ashlqi3"
 (define_insn "ashlhi3"
   [(set (match_operand:HI 0 "register_operand"           "=r,r,r,r,r,r,r")
 	(ashift:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0")
-		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashlhi3_out (insn, operands, NULL);"
   [(set_attr "length" "6,0,2,2,4,10,10")
@@ -1710,7 +1701,7 @@ (define_insn "ashlhi3"
 (define_insn "ashlsi3"
   [(set (match_operand:SI 0 "register_operand"           "=r,r,r,r,r,r,r")
 	(ashift:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0")
-		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		   (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashlsi3_out (insn, operands, NULL);"
   [(set_attr "length" "8,0,4,4,8,10,12")
@@ -1799,7 +1790,7 @@ (define_insn "*ashlsi3_const"
 (define_insn "ashrqi3"
   [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,r,r")
 	(ashiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,m")))]
   ""
   "* return ashrqi3_out (insn, operands, NULL);"
   [(set_attr "length" "5,0,1,2,5,9")
@@ -1808,7 +1799,7 @@ (define_insn "ashrqi3"
 (define_insn "ashrhi3"
   [(set (match_operand:HI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashrhi3_out (insn, operands, NULL);"
   [(set_attr "length" "6,0,2,4,4,10,10")
@@ -1817,7 +1808,7 @@ (define_insn "ashrhi3"
 (define_insn "ashrsi3"
   [(set (match_operand:SI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return ashrsi3_out (insn, operands, NULL);"
   [(set_attr "length" "8,0,4,6,8,10,12")
@@ -1907,7 +1898,7 @@ (define_split	; lshrqi3_const6
 (define_insn "*lshrqi3"
   [(set (match_operand:QI 0 "register_operand"             "=r,r,r,r,!d,r,r")
 	(lshiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,K,n,n,m")))]
   ""
   "* return lshrqi3_out (insn, operands, NULL);"
   [(set_attr "length" "5,0,1,2,4,6,9")
@@ -1916,7 +1907,7 @@ (define_insn "*lshrqi3"
 (define_insn "lshrhi3"
   [(set (match_operand:HI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return lshrhi3_out (insn, operands, NULL);"
   [(set_attr "length" "6,0,2,2,4,10,10")
@@ -1925,7 +1916,7 @@ (define_insn "lshrhi3"
 (define_insn "lshrsi3"
   [(set (match_operand:SI 0 "register_operand"             "=r,r,r,r,r,r,r")
 	(lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0")
-		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,Qm")))]
+		     (match_operand:QI 2 "general_operand"  "r,L,P,O,K,n,m")))]
   ""
   "* return lshrsi3_out (insn, operands, NULL);"
   [(set_attr "length" "8,0,4,4,8,10,12")
@@ -2198,54 +2189,6 @@ (define_insn_and_split "zero_extendhisi2
   operands[3] = simplify_gen_subreg (HImode, operands[0], SImode, high_off);
 })
 
-(define_insn_and_split "zero_extendqidi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-        (zero_extend:DI (match_operand:QI 1 "register_operand" "r")))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 2) (zero_extend:SI (match_dup 1)))
-   (set (match_dup 3) (const_int 0))]
-{
-  unsigned int low_off = subreg_lowpart_offset (SImode, DImode);
-  unsigned int high_off = subreg_highpart_offset (SImode, DImode);
-
-  operands[2] = simplify_gen_subreg (SImode, operands[0], DImode, low_off);
-  operands[3] = simplify_gen_subreg (SImode, operands[0], DImode, high_off);
-})
-
-(define_insn_and_split "zero_extendhidi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-        (zero_extend:DI (match_operand:HI 1 "register_operand" "r")))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 2) (zero_extend:SI (match_dup 1)))
-   (set (match_dup 3) (const_int 0))]
-{
-  unsigned int low_off = subreg_lowpart_offset (SImode, DImode);
-  unsigned int high_off = subreg_highpart_offset (SImode, DImode);
-
-  operands[2] = simplify_gen_subreg (SImode, operands[0], DImode, low_off);
-  operands[3] = simplify_gen_subreg (SImode, operands[0], DImode, high_off);
-})
-
-(define_insn_and_split "zero_extendsidi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-        (zero_extend:DI (match_operand:SI 1 "register_operand" "r")))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 2) (match_dup 1))
-   (set (match_dup 3) (const_int 0))]
-{
-  unsigned int low_off = subreg_lowpart_offset (SImode, DImode);
-  unsigned int high_off = subreg_highpart_offset (SImode, DImode);
-
-  operands[2] = simplify_gen_subreg (SImode, operands[0], DImode, low_off);
-  operands[3] = simplify_gen_subreg (SImode, operands[0], DImode, high_off);
-})
-
 ;;<=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=><=>
 ;; compare
 
@@ -2504,7 +2447,7 @@ (define_insn "*sbrx_branch<mode>"
   [(set (pc)
         (if_then_else
 	 (match_operator 0 "eqne_operator"
-			 [(zero_extract:QIDI
+			 [(zero_extract:QISI
 			   (match_operand:VOID 1 "register_operand" "r")
 			   (const_int 1)
 			   (match_operand 2 "const_int_operand" "n"))
Index: config/avr/avr-protos.h
===================================================================
--- config/avr/avr-protos.h	(revision 175956)
+++ config/avr/avr-protos.h	(working copy)
@@ -24,7 +24,7 @@
 
 extern int function_arg_regno_p (int r);
 extern void avr_cpu_cpp_builtins (struct cpp_reader * pfile);
-extern enum reg_class avr_regno_reg_class (int r);
+extern reg_class_t avr_regno_reg_class (unsigned int r);
 extern void asm_globalize_label (FILE *file, const char *name);
 extern void avr_asm_declare_function_name (FILE *, const char *, tree);
 extern void order_regs_for_local_alloc (void);
@@ -107,6 +107,8 @@ extern RTX_CODE avr_normalize_condition
 extern int compare_eq_p (rtx insn);
 extern void out_shift_with_cnt (const char *templ, rtx insn,
 				rtx operands[], int *len, int t_len);
+extern reg_class_t avr_mode_code_base_reg_class (enum machine_mode, RTX_CODE, RTX_CODE);
+extern bool avr_regno_mode_code_ok_for_base_p (int, enum machine_mode, RTX_CODE, RTX_CODE);
 extern rtx avr_incoming_return_addr_rtx (void);
 #endif /* RTX_CODE */
 
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 175956)
+++ config/avr/avr.c	(working copy)
@@ -279,18 +279,27 @@ avr_option_override (void)
 
 /*  return register class from register number.  */
 
-static const enum reg_class reg_class_tab[]={
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS, /* r0 - r15 */
-  LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,
-  LD_REGS,                      /* r16 - 23 */
-  ADDW_REGS,ADDW_REGS,          /* r24,r25 */
-  POINTER_X_REGS,POINTER_X_REGS, /* r26,27 */
-  POINTER_Y_REGS,POINTER_Y_REGS, /* r28,r29 */
-  POINTER_Z_REGS,POINTER_Z_REGS, /* r30,r31 */
-  STACK_REG,STACK_REG           /* SPL,SPH */
+static const reg_class_t reg_class_tab[] =
+  {
+    /* r0 */
+    R0_REG,
+    /* r1 - r15 */
+    NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+    NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+    NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+    /* r16 - r23 */
+    SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
+    SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
+    /* r24, r25 */
+    ADDW_REGS, ADDW_REGS,
+    /* r26, r27 */
+    POINTER_X_REGS, POINTER_X_REGS,
+    /* r28, r29 */
+    POINTER_Y_REGS, POINTER_Y_REGS,
+    /* r30, r31 */
+    POINTER_Z_REGS, POINTER_Z_REGS,
+    /* SPL, SPH */
+    STACK_REG, STACK_REG
 };
 
 /* Function to set up the backend function structure.  */
@@ -303,8 +312,8 @@ avr_init_machine_status (void)
 
 /* Return register class for register R.  */
 
-enum reg_class
-avr_regno_reg_class (int r)
+reg_class_t
+avr_regno_reg_class (unsigned int r)
 {
   if (r <= 33)
     return reg_class_tab[r];
@@ -1082,74 +1091,75 @@ avr_cannot_modify_jumps_p (void)
 }
 
 
-/* Return nonzero if X (an RTX) is a legitimate memory address on the target
-   machine for a memory operand of mode MODE.  */
+/* Helper function for `avr_legitimate_address_p'.  */
+
+static inline int
+avr_reg_ok_for_addr (rtx reg, int strict)
+{
+  return (REG_P (reg)
+          && (avr_regno_mode_code_ok_for_base_p (REGNO (reg), QImode, MEM, SCRATCH)
+              || (!strict && REGNO (reg) >= FIRST_PSEUDO_REGISTER)));
+}
+
+
+/* Implement `TARGET_LEGITIMATE_ADDRESS_P'.  */
 
 bool
 avr_legitimate_address_p (enum machine_mode mode, rtx x, bool strict)
 {
-  enum reg_class r = NO_REGS;
-  
-  if (TARGET_ALL_DEBUG)
-    {
-      fprintf (stderr, "mode: (%s) %s %s %s %s:",
-	       GET_MODE_NAME(mode),
-	       strict ? "(strict)": "",
-	       reload_completed ? "(reload_completed)": "",
-	       reload_in_progress ? "(reload_in_progress)": "",
-	       reg_renumber ? "(reg_renumber)" : "");
-      if (GET_CODE (x) == PLUS
-	  && REG_P (XEXP (x, 0))
-	  && GET_CODE (XEXP (x, 1)) == CONST_INT
-	  && INTVAL (XEXP (x, 1)) >= 0
-	  && INTVAL (XEXP (x, 1)) <= MAX_LD_OFFSET (mode)
-	  && reg_renumber
-	  )
-	fprintf (stderr, "(r%d ---> r%d)", REGNO (XEXP (x, 0)),
-		 true_regnum (XEXP (x, 0)));
-      debug_rtx (x);
-    }
-  if (!strict && GET_CODE (x) == SUBREG)
-	x = SUBREG_REG (x);
-  if (REG_P (x) && (strict ? REG_OK_FOR_BASE_STRICT_P (x)
-                    : REG_OK_FOR_BASE_NOSTRICT_P (x)))
-    r = POINTER_REGS;
-  else if (CONSTANT_ADDRESS_P (x))
-    r = ALL_REGS;
-  else if (GET_CODE (x) == PLUS
-           && REG_P (XEXP (x, 0))
-	   && GET_CODE (XEXP (x, 1)) == CONST_INT
-	   && INTVAL (XEXP (x, 1)) >= 0)
-    {
-      int fit = INTVAL (XEXP (x, 1)) <= MAX_LD_OFFSET (mode);
-      if (fit)
-	{
-	  if (! strict
-	      || REGNO (XEXP (x,0)) == REG_X
-	      || REGNO (XEXP (x,0)) == REG_Y
-	      || REGNO (XEXP (x,0)) == REG_Z)
-	    r = BASE_POINTER_REGS;
-	  if (XEXP (x,0) == frame_pointer_rtx
-	      || XEXP (x,0) == arg_pointer_rtx)
-	    r = BASE_POINTER_REGS;
-	}
-      else if (frame_pointer_needed && XEXP (x,0) == frame_pointer_rtx)
-	r = POINTER_Y_REGS;
-    }
-  else if ((GET_CODE (x) == PRE_DEC || GET_CODE (x) == POST_INC)
-           && REG_P (XEXP (x, 0))
-           && (strict ? REG_OK_FOR_BASE_STRICT_P (XEXP (x, 0))
-               : REG_OK_FOR_BASE_NOSTRICT_P (XEXP (x, 0))))
-    {
-      r = POINTER_REGS;
-    }
-  if (TARGET_ALL_DEBUG)
+  bool ok = false;
+
+  switch (GET_CODE (x))
     {
-      fprintf (stderr, "   ret = %c\n", r + '0');
+    case REG:
+      ok = avr_reg_ok_for_addr (x, strict);
+      if (strict
+          && DImode == mode
+          && REG_X == REGNO (x))
+        {
+          ok = false;
+        }
+      break;
+
+    case POST_INC:
+    case PRE_DEC:
+      ok = avr_reg_ok_for_addr (XEXP (x, 0), strict);
+      break;
+
+    case SYMBOL_REF:
+    case CONST_INT:
+    case CONST:
+      ok = true;
+      break;
+      
+    case PLUS:
+      {
+        rtx op0 = XEXP (x, 0);
+        rtx op1 = XEXP (x, 1);
+            
+        if (REG_P (op0)
+            && CONST_INT_P (op1))
+          {
+            ok = (avr_reg_ok_for_addr (op0, strict)
+                  && INTVAL (op1) >= 0
+                  && INTVAL (op1) <= MAX_LD_OFFSET (mode));
+            
+            if (strict
+                && REG_X == REGNO (op0))
+              {
+                ok = false;
+              }
+          }
+        break;
+      }
+      
+    default:
+      break;
     }
-  return r == NO_REGS ? 0 : (int)r;
-}
 
+  return ok;
+}
+ 
 /* Attempts to replace X with a valid
    memory address for an operand of mode MODE  */
 
@@ -6118,27 +6128,74 @@ jump_over_one_insn_p (rtx insn, rtx dest
 int
 avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
 {
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
-    return 0;
+  /* Any GENERAL_REGS register can hold 8-bit values.  */
+  /* FIXME:  8-bit values must not be disallowed for R28 or R29.
+     Disallowing QI et al. in these registers might lead to code like
+         (set (subreg:QI (reg:HI 28) n) ...)
+     which will result in wrong code because reload does not handle
+     SUBREGs of hard regsisters like this.  This could be fixed in reload.
+     However, it appears that fixing reload is not wanted by reload people.  */
+  
+  if (GET_MODE_SIZE (mode) == 1)
+     return 1;
 
-  /* The only thing that can go into registers r28:r29 is a Pmode.  */
-  if (regno == REG_Y && mode == Pmode)
-    return 1;
+   /* All modes larger than 8 bits should start in an even register.  */
+   
+  return regno % 2 == 0;
+}
 
-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
 
-  if (mode == QImode)
-    return 1;
+/* Worker function for `MODE_CODE_BASE_REG_CLASS'.  */
 
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
-    return 0;
+reg_class_t
+avr_mode_code_base_reg_class (enum machine_mode mode ATTRIBUTE_UNUSED,
+                              RTX_CODE outer_code, RTX_CODE index_code ATTRIBUTE_UNUSED)
+{
+  reg_class_t rclass = BASE_POINTER_REGS;
+  
+  switch (outer_code)
+    {
+    case MEM:
+    case POST_INC:
+    case PRE_DEC:
+      rclass = POINTER_REGS;
+      break;
+      
+    default:
+      break;
+    }
+  
+  return rclass;
+}
+
+
+/* Worker function for `REGNO_MODE_CODE_OK_FOR_BASE_P'.  */
+
+bool
+avr_regno_mode_code_ok_for_base_p (int regno, enum machine_mode mode ATTRIBUTE_UNUSED,
+                                   RTX_CODE outer_code, RTX_CODE index_code ATTRIBUTE_UNUSED)
+{
+  bool ok;
+  
+  switch (outer_code)
+    {
+    case PLUS:
+      ok = regno == REG_Z || regno == REG_Y;
+      break;
+      
+    case MEM: /* plain reg */
+    case POST_INC:
+    case PRE_DEC:
+      ok = regno == REG_Z || regno == REG_Y || regno == REG_X;
+      break;
+      
+    default:
+      ok = false;
+      break;
+      
+    }
 
-  /* All modes larger than QImode should start in an even register.  */
-  return !(regno & 1);
+  return ok;
 }
 
 const char *
@@ -6410,13 +6467,23 @@ avr_hard_regno_scratch_ok (unsigned int
       && !df_regs_ever_live_p (regno))
     return false;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (regno == REG_Y || regno == REG_Y + 1))
+    {
+      return false;
+    }
+
   return true;
 }
 
 /* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
 
 int
-avr_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
+avr_hard_regno_rename_ok (unsigned int old_reg,
 			  unsigned int new_reg)
 {
   /* Interrupt functions can only use registers that have already been
@@ -6427,6 +6494,17 @@ avr_hard_regno_rename_ok (unsigned int o
       && !df_regs_ever_live_p (new_reg))
     return 0;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (old_reg == REG_Y || old_reg == REG_Y + 1
+          || new_reg == REG_Y || new_reg == REG_Y + 1))
+    {
+      return 0;
+    }
+  
   return 1;
 }
 
Index: config/avr/avr.h
===================================================================
--- config/avr/avr.h	(revision 175811)
+++ config/avr/avr.h	(working copy)
@@ -292,21 +292,13 @@ enum reg_class {
 
 #define REGNO_REG_CLASS(R) avr_regno_reg_class(R)
 
-#define BASE_REG_CLASS (reload_completed ? BASE_POINTER_REGS : POINTER_REGS)
+#define MODE_CODE_BASE_REG_CLASS(mode, outer_code, index_code)  \
+  avr_mode_code_base_reg_class (mode, outer_code, index_code)
 
 #define INDEX_REG_CLASS NO_REGS
 
-#define REGNO_OK_FOR_BASE_P(r) (((r) < FIRST_PSEUDO_REGISTER		\
-				 && ((r) == REG_X			\
-				     || (r) == REG_Y			\
-				     || (r) == REG_Z			\
-				     || (r) == ARG_POINTER_REGNUM))	\
-				|| (reg_renumber			\
-				    && (reg_renumber[r] == REG_X	\
-					|| reg_renumber[r] == REG_Y	\
-					|| reg_renumber[r] == REG_Z	\
-					|| (reg_renumber[r]		\
-					    == ARG_POINTER_REGNUM))))
+#define REGNO_MODE_CODE_OK_FOR_BASE_P(num, mode, outer_code, index_code) \
+  avr_regno_mode_code_ok_for_base_p (num, mode, outer_code, index_code) 
 
 #define REGNO_OK_FOR_INDEX_P(NUM) 0
 
@@ -369,16 +361,11 @@ extern int avr_reg_order[];
 
 #define MAX_REGS_PER_ADDRESS 1
 
-#define REG_OK_FOR_BASE_NOSTRICT_P(X) \
-  (REGNO (X) >= FIRST_PSEUDO_REGISTER || REG_OK_FOR_BASE_STRICT_P(X))
-
-#define REG_OK_FOR_BASE_STRICT_P(X) REGNO_OK_FOR_BASE_P (REGNO (X))
-
 /* LEGITIMIZE_RELOAD_ADDRESS will allow register R26/27 to be used, where it
    is no worse than normal base pointers R28/29 and R30/31. For example:
    If base offset is greater than 63 bytes or for R++ or --R addressing.  */
    
-#define LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, IND_LEVELS, WIN)    \
+#define _LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, IND_LEVELS, WIN)    \
 do {									    \
   if (1&&(GET_CODE (X) == POST_INC || GET_CODE (X) == PRE_DEC))	    \
     {									    \

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-07-07  9:59                                           ` Georg-Johann Lay
@ 2011-07-07 18:21                                             ` Denis Chertykov
  2011-07-08 10:12                                               ` Georg-Johann Lay
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-07-07 18:21 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov, Eric B. Weddington

2011/7/7 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov wrote:
>> 2011/6/27 Georg-Johann Lay:
>>> Denis Chertykov wrote:
>
>>>> The main problem for me is that the new addressing mode produce a
>>>> worse code in many tests.
>>> You have an example source?
>>
>> In attachment.
>>
>> Denis.
>
> Hi Denis.
>
> I had a look at the sources you sent.
>
> sort.c:
> =======
>
> There is some difference because of register allocation, but the new
> code does not look awfully bad, just a bit different because of
> different register allocation that might need some more bytes.
>
> The difference is *not* because of deny fake X addressing, it's
> because of the new avr_hard_regno_mode_ok implementation to fix
> PR46779.  When I add
>
>
>  if (GET_MODE_SIZE (mode) == 1)
>     return 1;
>
> +     if (SImode == mode && regno == 28)
> +       return 0;
>
>   return regno % 2 == 0;
>
> to that function, the difference in code disappears.
>


I have attached my version of addressing experiments.
(on rietveld http://codereview.appspot.com/4674046/)
It's against Richard's git://gcc.gnu.org/git/gcc.git rth/avr-addressing
(may be you compare my regressions with your. It's interesting.)

I still have a regression for sort.c (but I didn't analyzed the code):
   text	   data	    bss	    dec	    hex	filename
    364	      0	      0	    364	    16c	/tmp/svn.o
    446	      0	      0	    446	    1be	/tmp/addr.o

> pr.c:
> =====
>
> I get the following sizes with pr-0 the original compile and pr qith
> my patch:
>
>>avr-size pr-0.o
>   text    data     bss     dec     hex filename
>   2824      24       0    2848     b20 pr-0.o
>>avr-size pr.o
>   text    data     bss     dec     hex filename
>   2564      24       0    2588     a1c pr.o
>
> So the size actually decreased significantly.  Avoiding SI in
> avr_hard_regno_mode_ok like above does not change code size.

Right now I'm also has a win for pr.c with new addressing code:
   text	   data	    bss	    dec	    hex	filename
   1226	     24	      0	   1250	    4e2	/tmp/svn.o
   1176	     24	      0	   1200	    4b0	/tmp/addr.o


But, from time to time (while experimenting) regression was here.

>
> ============
>
> Note that I did *not* use the version from the git repository; I could
> not get reasonable code out of it (even after some fixes).  Hundreds
> of testsuite crashes...

IMHO it's because of `avr_frame_pointer_required_p' and `avr_can_eliminate'.
(look at my attached patch)

>
> I used the initial patch that I posted; I attached it again for
> reference. Note that LEGITIMIZE_RELOAD_ADDRESS is still not
> implemented there.

I think you can copy it from rth/addressing.

>
> ============
>
> Did you decide about the fix for PR46779?
>
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html
>
> Is it ok to commit?

I forgot about testsuite regressions for this patch.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-07-07 18:21                                             ` Denis Chertykov
@ 2011-07-08 10:12                                               ` Georg-Johann Lay
  2011-07-08 10:25                                                 ` Denis Chertykov
  0 siblings, 1 reply; 47+ messages in thread
From: Georg-Johann Lay @ 2011-07-08 10:12 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington, Eric Botcazou, Bernd Schmidt

[-- Attachment #1: Type: text/plain, Size: 3205 bytes --]

CCed Eric and Bernd.

Denis Chertykov wrote:
>> Did you decide about the fix for PR46779?
>>
>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html
>>
>> Is it ok to commit?
> 
> I forgot about testsuite regressions for this patch.
> 
> Denis.


There were no new regressions:
  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00747.html

However, with the actual trunk (SVN 175991), I get two more
spill fails for following sources:

./gcc.c-torture/compile/pr32349.c -O1 -mmcu=atmega128

 pr30338.c: In function 'testload_func':
pr30338.c:13:1: error: unable to find a register to spill in class
'POINTER_REGS'
pr30338.c:13:1: error: this is the insn:
(insn 14 13 15 2 (set (reg:QI 24 r24 [orig:73 *D.1963_37 ] [73])
        (mem:QI (subreg:HI (reg:SI 71) 0) [0 *D.1963_37+0 S1 A8]))
pr30338.c:9 4 {*movqi}
     (expr_list:REG_DEAD (reg:SI 71)
        (nil)))
pr30338.c:13:1: internal compiler error: in spill_failure, at
reload1.c:2120



./gcc.c-torture/compile/pr32349.c -S -O3 -funroll-loops

pr32349.c: In function 'foo':
pr32349.c:26:1: error: unable to find a register to spill in class
'POINTER_REGS'
pr32349.c:26:1: error: this is the insn:
(insn 175 197 177 10 (set (reg/v:SI 234 [ m ])
        (mem:SI (post_inc:HI (reg:HI 16 r16 [orig:192 ivtmp.18 ]
[192])) [3 MEM[base: D.1996_74, offset: 0B]+0 S4 A8])) pr32349.c:18 12
{*movsi}
     (expr_list:REG_INC (reg:HI 16 r16 [orig:192 ivtmp.18 ] [192])
        (nil)))
pr32349.c:26:1: internal compiler error: in spill_failure, at
reload1.c:2120


(1)
I can fix *both* fails with additional test in avr_hard_regno_mode_ok:

+   if (GET_MODE_SIZE (mode) >= 4
+       && regno >= REG_X)
+     return 0;

(2)
I can fix the first fail but *not* the second by not allow SUBREGs in
avr_legitimate_address_p:

-   if (!strict && GET_CODE (x) == SUBREG) */
- 	x = SUBREG_REG (x); */


(2) Looks very reasonble, Eric Botcazou proposed it because he ran
into problems:
   http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01367.html

(1) Appears to be hackish, but it should be ok.  If code breaks
because of that is's *definitely* a reload bug (e.g. SI-subreg of DI).

Even the original avr_hard_regno_mode_ok is ok IMO because if a
machine says "I can hold HI in 28 but not QI in 29" reload has to
handle it (except a machine must allow word_mode in *all* it's
GENERAL_REGS, don't know if that's a must).

I made a patch for reload, too:
   http://gcc.gnu.org/ml/gcc/2011-06/msg00005.html

Because IRA generates SUBREG of hardreg (which old lreg/greg handled
ok) and reload does not handle it correctly.  It generates a spill but
without the needed input reload so that one part of the register is
missing.

reload blames IRA or BE, IRA blames reload, BE blames IRA, etc...


I didn't rerun the testsuite with (1) or/and (2), I'd like both (1)
and (2) in the compiler.  What do you think?

For reference, I attached the patch again.  It's like the original
patch, just with some comment change.

Johann


	PR target/46779
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.
	

[-- Attachment #2: pr46779.diff --]
[-- Type: text/x-patch, Size: 3016 bytes --]

Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 175991)
+++ config/avr/avr.c	(working copy)
@@ -6118,26 +6118,21 @@ jump_over_one_insn_p (rtx insn, rtx dest
 int
 avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
 {
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
-    return 0;
-
-  /* The only thing that can go into registers r28:r29 is a Pmode.  */
-  if (regno == REG_Y && mode == Pmode)
-    return 1;
-
-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
-
-  if (mode == QImode)
+  /* NOTE: 8-bit values must not be disallowed for R28 or R29.
+        Disallowing QI et al. in these regs might lead to code like
+            (set (subreg:QI (reg:HI 28) n) ...)
+        which will result in wrong code because reload does not
+        handle SUBREGs of hard regsisters like this.
+        This could be fixed in reload.  However, it appears
+        that fixing reload is not wanted by reload people.  */
+  
+  /* Any GENERAL_REGS register can hold 8-bit values.  */
+  
+  if (GET_MODE_SIZE (mode) == 1)
     return 1;
-
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
-    return 0;
-
-  /* All modes larger than QImode should start in an even register.  */
+  
+  /* All modes larger than 8 bits should start in an even register.  */
+  
   return !(regno & 1);
 }
 
@@ -6410,13 +6405,23 @@ avr_hard_regno_scratch_ok (unsigned int
       && !df_regs_ever_live_p (regno))
     return false;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (regno == REG_Y || regno == REG_Y + 1))
+    {
+      return false;
+    }
+
   return true;
 }
 
 /* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
 
 int
-avr_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
+avr_hard_regno_rename_ok (unsigned int old_reg,
 			  unsigned int new_reg)
 {
   /* Interrupt functions can only use registers that have already been
@@ -6427,6 +6432,17 @@ avr_hard_regno_rename_ok (unsigned int o
       && !df_regs_ever_live_p (new_reg))
     return 0;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (old_reg == REG_Y || old_reg == REG_Y + 1
+          || new_reg == REG_Y || new_reg == REG_Y + 1))
+    {
+      return 0;
+    }
+  
   return 1;
 }
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-07-08 10:12                                               ` Georg-Johann Lay
@ 2011-07-08 10:25                                                 ` Denis Chertykov
  2011-07-08 12:16                                                   ` Georg-Johann Lay
  0 siblings, 1 reply; 47+ messages in thread
From: Denis Chertykov @ 2011-07-08 10:25 UTC (permalink / raw)
  To: Georg-Johann Lay
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington, Eric Botcazou, Bernd Schmidt

2011/7/8 Georg-Johann Lay <avr@gjlay.de>:
> CCed Eric and Bernd.
>
> Denis Chertykov wrote:
>>> Did you decide about the fix for PR46779?
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html
>>>
>>> Is it ok to commit?
>>
>> I forgot about testsuite regressions for this patch.
>>
>> Denis.
>
>
> There were no new regressions:
>  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00747.html
>
> However, with the actual trunk (SVN 175991), I get two more
> spill fails for following sources:
>
> ./gcc.c-torture/compile/pr32349.c -O1 -mmcu=atmega128
>
>  pr30338.c: In function 'testload_func':
> pr30338.c:13:1: error: unable to find a register to spill in class
> 'POINTER_REGS'
> pr30338.c:13:1: error: this is the insn:
> (insn 14 13 15 2 (set (reg:QI 24 r24 [orig:73 *D.1963_37 ] [73])
>        (mem:QI (subreg:HI (reg:SI 71) 0) [0 *D.1963_37+0 S1 A8]))
> pr30338.c:9 4 {*movqi}
>     (expr_list:REG_DEAD (reg:SI 71)
>        (nil)))
> pr30338.c:13:1: internal compiler error: in spill_failure, at
> reload1.c:2120
>
>
>
> ./gcc.c-torture/compile/pr32349.c -S -O3 -funroll-loops
>
> pr32349.c: In function 'foo':
> pr32349.c:26:1: error: unable to find a register to spill in class
> 'POINTER_REGS'
> pr32349.c:26:1: error: this is the insn:
> (insn 175 197 177 10 (set (reg/v:SI 234 [ m ])
>        (mem:SI (post_inc:HI (reg:HI 16 r16 [orig:192 ivtmp.18 ]
> [192])) [3 MEM[base: D.1996_74, offset: 0B]+0 S4 A8])) pr32349.c:18 12
> {*movsi}
>     (expr_list:REG_INC (reg:HI 16 r16 [orig:192 ivtmp.18 ] [192])
>        (nil)))
> pr32349.c:26:1: internal compiler error: in spill_failure, at
> reload1.c:2120
>
>
> (1)
> I can fix *both* fails with additional test in avr_hard_regno_mode_ok:
>
> +   if (GET_MODE_SIZE (mode) >= 4
> +       && regno >= REG_X)
> +     return 0;
>
> (2)
> I can fix the first fail but *not* the second by not allow SUBREGs in
> avr_legitimate_address_p:
>
> -   if (!strict && GET_CODE (x) == SUBREG) */
> -       x = SUBREG_REG (x); */
>
>
> (2) Looks very reasonble, Eric Botcazou proposed it because he ran
> into problems:
>   http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01367.html
>
> (1) Appears to be hackish, but it should be ok.  If code breaks
> because of that is's *definitely* a reload bug (e.g. SI-subreg of DI).
>
> Even the original avr_hard_regno_mode_ok is ok IMO because if a
> machine says "I can hold HI in 28 but not QI in 29" reload has to
> handle it (except a machine must allow word_mode in *all* it's
> GENERAL_REGS, don't know if that's a must).
>
> I made a patch for reload, too:
>   http://gcc.gnu.org/ml/gcc/2011-06/msg00005.html
>
> Because IRA generates SUBREG of hardreg (which old lreg/greg handled
> ok) and reload does not handle it correctly.  It generates a spill but
> without the needed input reload so that one part of the register is
> missing.
>
> reload blames IRA or BE, IRA blames reload, BE blames IRA, etc...
>
>
> I didn't rerun the testsuite with (1) or/and (2), I'd like both (1)
> and (2) in the compiler.  What do you think?

I think that AVR is a stress test for GCC core. We are on the edge.
IMHO your patch is a change one tweaks to another.
It's not needed if it adds regressions.

Denis.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Patch, AVR]: Fix PR46779
  2011-07-08 10:25                                                 ` Denis Chertykov
@ 2011-07-08 12:16                                                   ` Georg-Johann Lay
  0 siblings, 0 replies; 47+ messages in thread
From: Georg-Johann Lay @ 2011-07-08 12:16 UTC (permalink / raw)
  To: Denis Chertykov
  Cc: Richard Henderson, gcc-patches, Anatoly Sokolov,
	Eric B. Weddington, Eric Botcazou, Bernd Schmidt

[-- Attachment #1: Type: text/plain, Size: 5959 bytes --]

Denis Chertykov wrote:
> 2011/7/8 Georg-Johann Lay <avr@gjlay.de>:
>> CCed Eric and Bernd.
>>
>> Denis Chertykov wrote:
>>>> Did you decide about the fix for PR46779?
>>>>
>>>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html
>>>>
>>>> Is it ok to commit?
>>> I forgot about testsuite regressions for this patch.
>>>
>>> Denis.
>>
>> There were no new regressions:
>>  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00747.html
>>
>> However, with the actual trunk (SVN 175991), I get two more
>> spill fails for following sources:
>>
>> ./gcc.c-torture/compile/pr32349.c -O1 -mmcu=atmega128
>>
>>  pr30338.c: In function 'testload_func':
>> pr30338.c:13:1: error: unable to find a register to spill in class
>> 'POINTER_REGS'
>> pr30338.c:13:1: error: this is the insn:
>> (insn 14 13 15 2 (set (reg:QI 24 r24 [orig:73 *D.1963_37 ] [73])
>>        (mem:QI (subreg:HI (reg:SI 71) 0) [0 *D.1963_37+0 S1 A8]))
>> pr30338.c:9 4 {*movqi}
>>     (expr_list:REG_DEAD (reg:SI 71)
>>        (nil)))
>> pr30338.c:13:1: internal compiler error: in spill_failure, at
>> reload1.c:2120
>>
>>
>>
>> ./gcc.c-torture/compile/pr32349.c -S -O3 -funroll-loops
>>
>> pr32349.c: In function 'foo':
>> pr32349.c:26:1: error: unable to find a register to spill in class
>> 'POINTER_REGS'
>> pr32349.c:26:1: error: this is the insn:
>> (insn 175 197 177 10 (set (reg/v:SI 234 [ m ])
>>        (mem:SI (post_inc:HI (reg:HI 16 r16 [orig:192 ivtmp.18 ]
>> [192])) [3 MEM[base: D.1996_74, offset: 0B]+0 S4 A8])) pr32349.c:18 12
>> {*movsi}
>>     (expr_list:REG_INC (reg:HI 16 r16 [orig:192 ivtmp.18 ] [192])
>>        (nil)))
>> pr32349.c:26:1: internal compiler error: in spill_failure, at
>> reload1.c:2120
>>
>>
>> (1)
>> I can fix *both* fails with additional test in avr_hard_regno_mode_ok:
>>
>> +   if (GET_MODE_SIZE (mode) >= 4
>> +       && regno >= REG_X)
>> +     return 0;
>>
>> (2)
>> I can fix the first fail but *not* the second by not allow SUBREGs in
>> avr_legitimate_address_p:
>>
>> -   if (!strict && GET_CODE (x) == SUBREG) */
>> -       x = SUBREG_REG (x); */
>>
>>
>> (2) Looks very reasonble, Eric Botcazou proposed it because he ran
>> into problems:
>>   http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01367.html
>>
>> (1) Appears to be hackish, but it should be ok.  If code breaks
>> because of that is's *definitely* a reload bug (e.g. SI-subreg of DI).
>>
>> Even the original avr_hard_regno_mode_ok is ok IMO because if a
>> machine says "I can hold HI in 28 but not QI in 29" reload has to
>> handle it (except a machine must allow word_mode in *all* it's
>> GENERAL_REGS, don't know if that's a must).
>>
>> I made a patch for reload, too:
>>   http://gcc.gnu.org/ml/gcc/2011-06/msg00005.html
>>
>> Because IRA generates SUBREG of hardreg (which old lreg/greg handled
>> ok) and reload does not handle it correctly.  It generates a spill but
>> without the needed input reload so that one part of the register is
>> missing.
>>
>> reload blames IRA or BE, IRA blames reload, BE blames IRA, etc...
>>
>>
>> I didn't rerun the testsuite with (1) or/and (2), I'd like both (1)
>> and (2) in the compiler.  What do you think?
> 
> I think that AVR is a stress test for GCC core. We are on the edge.
> IMHO your patch is a change one tweaks to another.
> It's not needed if it adds regressions.
> 
> Denis.

Reran testsuite against newer version (175991).

First the good news.

Following tests pass, that's the reason for the patch:

* gcc.target/avr/pr46779-1.c
* gcc.target/avr/pr46779-2.c

These tests now pass, too.  They all came up with spill fail both with
and without original patch, but pass with (1) and (2) added:

* gcc.c-torture/execute/pr38051.c (-Os)
* gcc.dg/20030324-1.c (-O -fstrict-aliasing -fgcse)
* gcc.dg/pr43670.c (-O -ftree-vrp -fcompare-debug)

And here the not-so-good news. There's additional ICE in reload:

* gcc.dg/pr32912-2.c (-Os)

pr32912-2.c:23:1: internal compiler error: in find_valid_class, at
reload.c:708
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
compiler exited with status 1
output is:
pr32912-2.c: In function 'bar':
pr32912-2.c:23:1: internal compiler error: in find_valid_class, at
reload.c:708


But look at the source!

#if(__SIZEOF_INT__ >= 4)
typedef int __m128i __attribute__ ((__vector_size__ (16)));
#else
typedef long __m128i __attribute__ ((__vector_size__ (16)));
#endif

That's no sensible on AVR at all!
The stack trace:

Breakpoint 1, fancy_abort (file=0x8896f38
"../../../gcc.gnu.org/trunk/gcc/reload.c", line=708,
function=0x8896fa8 "find_valid_class") at
../../../gcc.gnu.org/trunk/gcc/diagnostic.c:893
(gdb) bt
#0  fancy_abort (file=0x8896f38
"../../../gcc.gnu.org/trunk/gcc/reload.c", line=708,
function=0x8896fa8 "find_valid_class") at
../../../gcc.gnu.org/trunk/gcc/diagnostic.c:893
#1  0x0845a9d4 in find_valid_class (outer=SImode, inner=TImode, n=12,
dest_regno=16) at ../../../gcc.gnu.org/trunk/gcc/reload.c:708
#2  0x0845bfd5 in push_reload (in=0x0, out=0xb7dfe2c4, inloc=0x0,
outloc=0xb7dfe2d4, rclass=GENERAL_REGS, inmode=VOIDmode,
outmode=SImode, strict_low=0, optional=0, opnum=0,
type=RELOAD_FOR_OUTPUT) at ../../../gcc.gnu.org/trunk/gcc/reload.c:1196
#3  0x08462d52 in find_reloads (insn=0xb7dff144, replace=0,
ind_levels=0, live_known=1, reload_reg_p=0x89607c0) at
../../../gcc.gnu.org/trunk/gcc/reload.c:4015
#4  0x08470af9 in calculate_needs_all_insns (global=1) at
../../../gcc.gnu.org/trunk/gcc/reload1.c:1525

Note the TImode, this ICE is no harm for AVR!

As gcc.dg/pr32912-2.c is completely irrelevant for AVR,
here is the patch for review that integrates (1) and (2).

Ok?

Johann

	PR target/46779
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.
	(avr_legitimate_address_p): Don't allow SUBREGs.


[-- Attachment #2: pr46779.diff --]
[-- Type: text/x-patch, Size: 3648 bytes --]

Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 175991)
+++ config/avr/avr.c	(working copy)
@@ -1109,8 +1109,7 @@ avr_legitimate_address_p (enum machine_m
 		 true_regnum (XEXP (x, 0)));
       debug_rtx (x);
     }
-  if (!strict && GET_CODE (x) == SUBREG)
-	x = SUBREG_REG (x);
+  
   if (REG_P (x) && (strict ? REG_OK_FOR_BASE_STRICT_P (x)
                     : REG_OK_FOR_BASE_NOSTRICT_P (x)))
     r = POINTER_REGS;
@@ -6118,26 +6117,30 @@ jump_over_one_insn_p (rtx insn, rtx dest
 int
 avr_hard_regno_mode_ok (int regno, enum machine_mode mode)
 {
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
-    return 0;
-
-  /* The only thing that can go into registers r28:r29 is a Pmode.  */
-  if (regno == REG_Y && mode == Pmode)
-    return 1;
-
-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
-
-  if (mode == QImode)
+  /* NOTE: 8-bit values must not be disallowed for R28 or R29.
+        Disallowing QI et al. in these regs might lead to code like
+            (set (subreg:QI (reg:HI 28) n) ...)
+        which will result in wrong code because reload does not
+        handle SUBREGs of hard regsisters like this.
+        This could be fixed in reload.  However, it appears
+        that fixing reload is not wanted by reload people.  */
+  
+  /* Any GENERAL_REGS register can hold 8-bit values.  */
+  
+  if (GET_MODE_SIZE (mode) == 1)
     return 1;
 
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
+  /* FIXME: Ideally, the following test is not needed.
+        However, it turned out that it can reduce the number
+        of spill fails.  AVR and it's poor endowment with
+        address registers is extreme stress test for reload.  */
+  
+  if (GET_MODE_SIZE (mode) >= 4
+      && regno >= REG_X)
     return 0;
 
-  /* All modes larger than QImode should start in an even register.  */
+  /* All modes larger than 8 bits should start in an even register.  */
+  
   return !(regno & 1);
 }
 
@@ -6410,13 +6413,23 @@ avr_hard_regno_scratch_ok (unsigned int
       && !df_regs_ever_live_p (regno))
     return false;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (regno == REG_Y || regno == REG_Y + 1))
+    {
+      return false;
+    }
+
   return true;
 }
 
 /* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
 
 int
-avr_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
+avr_hard_regno_rename_ok (unsigned int old_reg,
 			  unsigned int new_reg)
 {
   /* Interrupt functions can only use registers that have already been
@@ -6427,6 +6440,17 @@ avr_hard_regno_rename_ok (unsigned int o
       && !df_regs_ever_live_p (new_reg))
     return 0;
 
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && (old_reg == REG_Y || old_reg == REG_Y + 1
+          || new_reg == REG_Y || new_reg == REG_Y + 1))
+    {
+      return 0;
+    }
+  
   return 1;
 }
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2011-07-08 12:03 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-09 17:24 [Patch, AVR]: Fix PR46779 Georg-Johann Lay
2011-06-09 18:50 ` Denis Chertykov
2011-06-09 19:43   ` Georg-Johann Lay
2011-06-09 19:51     ` Denis Chertykov
2011-06-10 10:23       ` Georg-Johann Lay
2011-06-12 10:34         ` Denis Chertykov
2011-06-13 18:10           ` Georg-Johann Lay
2011-06-13 22:37             ` Denis Chertykov
2011-06-14 10:39               ` Georg-Johann Lay
2011-06-14 11:38                 ` Denis Chertykov
2011-06-14 21:38                   ` Georg-Johann Lay
2011-06-14 23:04                     ` Richard Henderson
2011-06-15 17:58                     ` Richard Henderson
2011-06-15 21:58                       ` Georg-Johann Lay
2011-06-15 22:20                         ` Richard Henderson
2011-06-16  3:46                           ` Richard Henderson
2011-06-16 11:37                             ` Denis Chertykov
2011-06-16 12:01                               ` Denis Chertykov
2011-06-16 14:07                                 ` Richard Henderson
2011-06-16 18:18                                   ` Denis Chertykov
2011-06-16 18:57                                     ` Richard Henderson
2011-06-16 19:33                                       ` Denis Chertykov
2011-06-16 19:42                                         ` Richard Henderson
2011-06-16 20:04                                           ` Denis Chertykov
2011-06-16 14:36                               ` Richard Henderson
2011-06-16 14:52                                 ` Bernd Schmidt
2011-06-16 15:13                                   ` Richard Henderson
2011-06-16 18:57                                 ` Denis Chertykov
2011-06-23 20:48                             ` Denis Chertykov
2011-06-23 22:04                               ` Richard Henderson
2011-06-26 20:03                                 ` Denis Chertykov
2011-06-26 20:51                                   ` Georg-Johann Lay
2011-06-27  8:41                                     ` Denis Chertykov
2011-06-27  9:19                                       ` Georg-Johann Lay
2011-06-27 10:17                                         ` Denis Chertykov
2011-07-07  9:59                                           ` Georg-Johann Lay
2011-07-07 18:21                                             ` Denis Chertykov
2011-07-08 10:12                                               ` Georg-Johann Lay
2011-07-08 10:25                                                 ` Denis Chertykov
2011-07-08 12:16                                                   ` Georg-Johann Lay
2011-06-22  3:28             ` Hans-Peter Nilsson
2011-06-22 17:03               ` Georg-Johann Lay
2011-06-23 12:51                 ` Hans-Peter Nilsson
2011-06-23 13:00                 ` Hans-Peter Nilsson
2011-06-09 20:03     ` Georg-Johann Lay
2011-06-10  9:27   ` Georg-Johann Lay
2011-06-21 17:17     ` Georg-Johann Lay

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).