public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Reload bug
@ 2003-04-08 18:52 Eric Botcazou
  2003-04-08 23:16 ` Jan Hubicka
  2003-04-09  9:13 ` Eric Botcazou
  0 siblings, 2 replies; 41+ messages in thread
From: Eric Botcazou @ 2003-04-08 18:52 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]

Hello Jan,

I've narrowed down the bug reported as PR optimization/10233, a regression 
from GCC 3.2 present on the 3.2 branch since 3.2.1 (but latent on all active 
branches), to this patch

Thu Oct 17 17:14:07 CEST 2002  Jan Hubicka  <jh@suse.cz>

	PR opt/7630
	* reload.c (reload_inner_reg_of_subreg): New argument output;
	(push_reload): Update call.


We are trying to reload

	(subreg:SI (reg/v:DI 29 rmm0 [68]) 4)

as the 'in' argument of push_reload.

Before your patch, the compiler correctly detected that the whole register 
needs to be reloaded because we can't access the high part of MMX regs.

But now reload_inner_reg_of_subreg returns 0 for this subreg because its 
return value depends upon the context: if we pass it an 'in' argument, it 
will not check the following condition

  /* If the outer part is a word or smaller, INNER larger than a
     word and the number of regs for INNER is not the same as the
     number of words in INNER, then INNER will need reloading.  */

which happens to apply to this subreg.

Then the subsequent logic is fooled and we end up reloading

	(reg/v:SI 29 emm0)


The rationale for the patch is there:
http://gcc.gnu.org/ml/gcc-patches/2002-10/msg00628.html

The last sentence: "However it is completely irrelevant for input operand in 
all cases - we always can move just the small part of register and use it 
directly, so I've added new operand to bypass this logic in the input 
operand case."

Did you intend to disable the condition for low parts only? (there appears to 
be no testcase associated with the patch so it's hard to tell). If so, does 
the following patch fit your needs?

-- 
Eric Botcazou


	* reload.c (reload_inner_reg_of_subreg): Require reloading for
	non-low parts of input operands.

[-- Attachment #2: pr10233.diff --]
[-- Type: text/x-diff, Size: 1007 bytes --]

Index: reload.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/reload.c,v
retrieving revision 1.178.2.4.2.6
diff -u -p -r1.178.2.4.2.6 reload.c
--- reload.c	29 Mar 2003 19:30:39 -0000	1.178.2.4.2.6
+++ reload.c	8 Apr 2003 17:40:34 -0000
@@ -826,10 +826,12 @@ reload_inner_reg_of_subreg (x, mode, out
      word and the number of regs for INNER is not the same as the
      number of words in INNER, then INNER will need reloading.  */
   return (GET_MODE_SIZE (mode) <= UNITS_PER_WORD
-	  && output
 	  && GET_MODE_SIZE (GET_MODE (inner)) > UNITS_PER_WORD
 	  && ((GET_MODE_SIZE (GET_MODE (inner)) / UNITS_PER_WORD)
-	      != HARD_REGNO_NREGS (REGNO (inner), GET_MODE (inner))));
+	      != HARD_REGNO_NREGS (REGNO (inner), GET_MODE (inner)))
+	  /* This is not necessary for low parts of input operands
+	     because we can always access them directly.  */
+	  && (output || ! subreg_lowpart_p (x)));
 }
 
 /* Record one reload that needs to be performed.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-08 18:52 Reload bug Eric Botcazou
@ 2003-04-08 23:16 ` Jan Hubicka
  2003-04-09  0:44   ` Jan Hubicka
                     ` (2 more replies)
  2003-04-09  9:13 ` Eric Botcazou
  1 sibling, 3 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-08 23:16 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> Hello Jan,
> 
> I've narrowed down the bug reported as PR optimization/10233, a regression 
> from GCC 3.2 present on the 3.2 branch since 3.2.1 (but latent on all active 
> branches), to this patch
> 
> Thu Oct 17 17:14:07 CEST 2002  Jan Hubicka  <jh@suse.cz>
> 
> 	PR opt/7630
> 	* reload.c (reload_inner_reg_of_subreg): New argument output;
> 	(push_reload): Update call.
> 
> 
> We are trying to reload
> 
> 	(subreg:SI (reg/v:DI 29 rmm0 [68]) 4)
> 
> as the 'in' argument of push_reload.
> 
> Before your patch, the compiler correctly detected that the whole register 
> needs to be reloaded because we can't access the high part of MMX regs.

How it did detected that?  I believe it is just pessimization masking a
bug as currently there is no way how to realize that subreg:SI of MMX
with offset 4 register is not valid (and with offset 0 is).

Thinking about it again, I think we can rely on HARD_REGNO_NREGS
and compute how large the inidividular registers really are, but I don't
think this is used anywhere in the code.

It should not get recognized earlier than in reload as we should not
allocate value with subreg undoable in MMX register into MMX register
(unless rest of insns dictates otherwise).  We've chatted about this
with Richard
> 
> But now reload_inner_reg_of_subreg returns 0 for this subreg because its 
> return value depends upon the context: if we pass it an 'in' argument, it 
> will not check the following condition
> 
>   /* If the outer part is a word or smaller, INNER larger than a
>      word and the number of regs for INNER is not the same as the
>      number of words in INNER, then INNER will need reloading.  */
> 
> which happens to apply to this subreg.
> 
> Then the subsequent logic is fooled and we end up reloading
> 
> 	(reg/v:SI 29 emm0)
> 
> 
> The rationale for the patch is there:
> http://gcc.gnu.org/ml/gcc-patches/2002-10/msg00628.html
> 
> The last sentence: "However it is completely irrelevant for input operand in 
> all cases - we always can move just the small part of register and use it 
> directly, so I've added new operand to bypass this logic in the input 
> operand case."
> 
> Did you intend to disable the condition for low parts only? (there appears to 

No, it is valid for high parts too as long as the register is possible
(imagine the xmm being replaced by eax).  I think we can add extra test
realizing how large the registers are when index is nonzero.  This will
get bit tricky to get right when indexes are nonzero for lowparts.
Richard?

Honza
> be no testcase associated with the patch so it's hard to tell). If so, does 
> the following patch fit your needs?
> 
> -- 
> Eric Botcazou
> 
> 
> 	* reload.c (reload_inner_reg_of_subreg): Require reloading for
> 	non-low parts of input operands.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-08 23:16 ` Jan Hubicka
@ 2003-04-09  0:44   ` Jan Hubicka
  2003-04-09  7:00     ` Eric Botcazou
  2003-04-09  3:00   ` Eric Botcazou
  2003-04-09  8:57   ` Eric Botcazou
  2 siblings, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-09  0:44 UTC (permalink / raw)
  To: Jan Hubicka, rth; +Cc: Eric Botcazou, gcc

> > Hello Jan,
> > 
> > I've narrowed down the bug reported as PR optimization/10233, a regression 
> > from GCC 3.2 present on the 3.2 branch since 3.2.1 (but latent on all active 
> > branches), to this patch
> > 
> > Thu Oct 17 17:14:07 CEST 2002  Jan Hubicka  <jh@suse.cz>
> > 
> > 	PR opt/7630
> > 	* reload.c (reload_inner_reg_of_subreg): New argument output;
> > 	(push_reload): Update call.
> > 
> > 
> > We are trying to reload
> > 
> > 	(subreg:SI (reg/v:DI 29 rmm0 [68]) 4)
> > 
> > as the 'in' argument of push_reload.
> > 
> > Before your patch, the compiler correctly detected that the whole register 
> > needs to be reloaded because we can't access the high part of MMX regs.
> 
> How it did detected that?  I believe it is just pessimization masking a
> bug as currently there is no way how to realize that subreg:SI of MMX
> with offset 4 register is not valid (and with offset 0 is).
> 
> Thinking about it again, I think we can rely on HARD_REGNO_NREGS
> and compute how large the inidividular registers really are, but I don't
> think this is used anywhere in the code.
> 
> It should not get recognized earlier than in reload as we should not
> allocate value with subreg undoable in MMX register into MMX register
> (unless rest of insns dictates otherwise).  We've chatted about this
> with Richard

Actually I think the problem is just that the patch has never get into
3.2 line.
The patch is pretty involved:
	* combine.c (simplify_set): Reverse order of ragumetns to
	REG_CANNOT_CHANGE_MODE_P
	* df.c (df_def_record_1): Likewise.
	* recog.c (register_operand): Likewise.
	* simplify-rtx.c (simplify_subreg): Likewise.
	* hard-reg-set.h (REG_CANNOT_CHANGE_MODE_P): Update use of
	CANNOT_CHANGE_MODE_CLASS.
	* regclass.c (cannot_change_mode_set_regs, invalid_mode_change_p):
	Likewise.
	* reload.c (push_reload): Likewise.
	* alpha.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
	* ia64.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
	* mips.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
	* mips-protos.h (mips_cannot_change_mode_class): Update prototype.
	* mips.c (mips_cannot_change_mode_class): Update.
	* pa64-regs.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
	* rs6000.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
	* s390.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
	* sh.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
	* sh-protos.h (sh_cannot_change_mode_class): Update prototype.
	* sh.c (sh_cannot_change_mode_class): Update.
	* i386.h (CANNOT_CHANGE_MODE_CLASS): New.
	* tm.texi (CANNOT_CHANGE_MODE_CLASS): Update documentation.
and I believe that the MMX suport in 3.2 is in so bad shape that I would
simply close the bugreport claiming that 3.3 fixes that.

Honza

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-08 23:16 ` Jan Hubicka
  2003-04-09  0:44   ` Jan Hubicka
@ 2003-04-09  3:00   ` Eric Botcazou
  2003-04-09  9:49     ` Jan Hubicka
  2003-04-09  8:57   ` Eric Botcazou
  2 siblings, 1 reply; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09  3:00 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> How it did detected that?

  /* If the outer part is a word or smaller, INNER larger than a
     word and the number of regs for INNER is not the same as the
     number of words in INNER, then INNER will need reloading.  */
  return (GET_MODE_SIZE (mode) <= UNITS_PER_WORD
	  && GET_MODE_SIZE (GET_MODE (inner)) > UNITS_PER_WORD
	  && ((GET_MODE_SIZE (GET_MODE (inner)) / UNITS_PER_WORD)
	      != HARD_REGNO_NREGS (REGNO (inner), GET_MODE (inner))))

> I believe it is just pessimization masking a bug as currently there is no
> way how to realize that subreg:SI of MMX with offset 4 register is not 
> valid (and with offset 0 is).

Of course, it's pessimization when offset ==0, but it's correctness when 
offset == 4. As for the choice between correctness and pessimization...

> It should not get recognized earlier than in reload as we should not
> allocate value with subreg undoable in MMX register into MMX register
> (unless rest of insns dictates otherwise).  We've chatted about this
> with Richard

Then the culprit is local alloc.

> No, it is valid for high parts too as long as the register is possible
> (imagine the xmm being replaced by eax).  I think we can add extra test
> realizing how large the registers are when index is nonzero.  This will
> get bit tricky to get right when indexes are nonzero for lowparts.

Then why not use ! lowpart?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09  0:44   ` Jan Hubicka
@ 2003-04-09  7:00     ` Eric Botcazou
  0 siblings, 0 replies; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09  7:00 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: rth, gcc

> Actually I think the problem is just that the patch has never get into
> 3.2 line.
> The patch is pretty involved:
> 	* combine.c (simplify_set): Reverse order of ragumetns to
> 	REG_CANNOT_CHANGE_MODE_P
> 	* df.c (df_def_record_1): Likewise.
> 	* recog.c (register_operand): Likewise.
> 	* simplify-rtx.c (simplify_subreg): Likewise.
> 	* hard-reg-set.h (REG_CANNOT_CHANGE_MODE_P): Update use of
> 	CANNOT_CHANGE_MODE_CLASS.
> 	* regclass.c (cannot_change_mode_set_regs, invalid_mode_change_p):
> 	Likewise.
> 	* reload.c (push_reload): Likewise.
> 	* alpha.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
> 	* ia64.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
> 	* mips.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
> 	* mips-protos.h (mips_cannot_change_mode_class): Update prototype.
> 	* mips.c (mips_cannot_change_mode_class): Update.
> 	* pa64-regs.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
> 	* rs6000.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
> 	* s390.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
> 	* sh.h (CANNOT_CHANGE_MODE_CLASS): Update definition.
> 	* sh-protos.h (sh_cannot_change_mode_class): Update prototype.
> 	* sh.c (sh_cannot_change_mode_class): Update.
> 	* i386.h (CANNOT_CHANGE_MODE_CLASS): New.
> 	* tm.texi (CANNOT_CHANGE_MODE_CLASS): Update documentation.

Did your fix for PR opt/7630 go in before this patch?

> and I believe that the MMX suport in 3.2 is in so bad shape that I would
> simply close the bugreport claiming that 3.3 fixes that.

I would have do that without hesitation if the bug was not a regression from 
GCC 3.2.0; now it's up to the release manager to decide.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-08 23:16 ` Jan Hubicka
  2003-04-09  0:44   ` Jan Hubicka
  2003-04-09  3:00   ` Eric Botcazou
@ 2003-04-09  8:57   ` Eric Botcazou
  2003-04-09  9:45     ` Jan Hubicka
  2 siblings, 1 reply; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09  8:57 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> No, it is valid for high parts too as long as the register is possible
> (imagine the xmm being replaced by eax).

Did I forget to say that this example is bogus? The logic doesn't of course 
trigger for %eax so we don't pessimize anything in that case.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-08 18:52 Reload bug Eric Botcazou
  2003-04-08 23:16 ` Jan Hubicka
@ 2003-04-09  9:13 ` Eric Botcazou
  2003-04-09 11:25   ` Jan Hubicka
  1 sibling, 1 reply; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09  9:13 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 626 bytes --]

> Thu Oct 17 17:14:07 CEST 2002  Jan Hubicka  <jh@suse.cz>
>
> 	PR opt/7630
> 	* reload.c (reload_inner_reg_of_subreg): New argument output;
> 	(push_reload): Update call.

This patch is really bogus, at least on the 3.2 branch: sure, GCC doesn't ICE 
when compiling pr7630.c any more but now it silently produces wrong code!

What you have done is to teach the reload pass to silently transform any 
non-accessible part of a multi-word register into the low part.

Please revert on the 3.2 branch or consider applying the correction I posted 
(which of course brings back the ICE for PR opt/7630).

Thanks.

-- 
Eric Botcazou

[-- Attachment #2: pr7630.c --]
[-- Type: text/x-csrc, Size: 911 bytes --]

/* PR optimization/7630 */
/* Originator: Jeremy Buhler <jbuhler@cs.wustl.edu> */
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O -fPIC -march=pentium4" } */

typedef union {
    double value;
    struct {
        unsigned int lsw;
        unsigned int msw;
    } parts;
} js_ieee_double_shape_type;

#define JSDOUBLE_HI32(x)  ({js_ieee_double_shape_type sh_u;  \
                            sh_u.value = (x);                \
                            sh_u.parts.msw; })


enum JSOp {JSOP_URSH, JSOP_DIV};
struct JSParseNode {
    enum JSOp     pn_op;
    double        dval;
};

int js_FoldConstants(struct JSParseNode *pn)
{
    double d;
   
    switch (pn->pn_op) {
    case JSOP_URSH:
        d = 0;
       
    case JSOP_DIV:
        if (JSDOUBLE_HI32(d) ^ JSDOUBLE_HI32(d))
            d = 0;
        break;
       
    default:;
    }
   
    pn->dval = d;
    return 0;
}

[-- Attachment #3: pr7630.s --]
[-- Type: text/x-asm, Size: 793 bytes --]

	.file	"pr7630.c"
	.section	.rodata.cst8,"aM",@progbits,8
	.align 8
.LC0:
	.long	0
	.long	0
	.text
.globl js_FoldConstants
	.type	js_FoldConstants,@function
js_FoldConstants:
	fldz
	pushl	%ebp
	movl	%esp, %ebp
	subl	$16, %esp
	movl	%ebx, -4(%ebp)
	call	.LPR0
	addl	$_GLOBAL_OFFSET_TABLE_, %ebx
	movl	8(%ebp), %edx
	movl	(%edx), %eax
	testl	%eax, %eax
	je	.L3
	cmpl	$1, %eax
	je	.L4
	jmp	.L2
.L3:
	fstp	%st(0)
	fldl	.LC0@GOTOFF(%ebx)
.L4:
	fstl	-16(%ebp)
	movsd	-16(%ebp), %xmm0
	movd	%xmm0, %eax
	cmpl	%eax, %eax
	fldl	.LC0@GOTOFF(%ebx)
	fcmove	%st(1), %st
	fstp	%st(1)
.L2:
	fstpl	4(%edx)
	movl	$0, %eax
	movl	-4(%ebp), %ebx
	movl	%ebp, %esp
	popl	%ebp
	ret
.Lfe1:
	.size	js_FoldConstants,.Lfe1-js_FoldConstants
.LPR0:
	movl	(%esp), %ebx
	ret
	.ident	"GCC: (GNU) 3.2.3 20030405 (prerelease)"

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09  8:57   ` Eric Botcazou
@ 2003-04-09  9:45     ` Jan Hubicka
  2003-04-09  9:50       ` Eric Botcazou
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-09  9:45 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > No, it is valid for high parts too as long as the register is possible
> > (imagine the xmm being replaced by eax).
> 
> Did I forget to say that this example is bogus? The logic doesn't of course 
> trigger for %eax so we don't pessimize anything in that case.

What would prevent it from match on (subreg:DI (reg eax) 4)? That was
the original problem that made me to change the behaviour (we can't
reload instructions needing two such subregged operands in -fPIC as we
simply run out of registers)

Honza
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09  3:00   ` Eric Botcazou
@ 2003-04-09  9:49     ` Jan Hubicka
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-09  9:49 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > How it did detected that?
> 
>   /* If the outer part is a word or smaller, INNER larger than a
>      word and the number of regs for INNER is not the same as the
>      number of words in INNER, then INNER will need reloading.  */
>   return (GET_MODE_SIZE (mode) <= UNITS_PER_WORD
> 	  && GET_MODE_SIZE (GET_MODE (inner)) > UNITS_PER_WORD
> 	  && ((GET_MODE_SIZE (GET_MODE (inner)) / UNITS_PER_WORD)
> 	      != HARD_REGNO_NREGS (REGNO (inner), GET_MODE (inner))))
> 
> > I believe it is just pessimization masking a bug as currently there is no
> > way how to realize that subreg:SI of MMX with offset 4 register is not 
> > valid (and with offset 0 is).
> 
> Of course, it's pessimization when offset ==0, but it's correctness when 
> offset == 4. As for the choice between correctness and pessimization...
> 
> > It should not get recognized earlier than in reload as we should not
> > allocate value with subreg undoable in MMX register into MMX register
> > (unless rest of insns dictates otherwise).  We've chatted about this
> > with Richard
> 
> Then the culprit is local alloc.

This is approximately what the patch I quoted does - it makes regclass
to realize that it can't change mode this way.  (it is not correct
either as it prohibits (subreg:SI xmm:DI 0) that is valid but we don't
have interface to declare this properly.
> 
> > No, it is valid for high parts too as long as the register is possible
> > (imagine the xmm being replaced by eax).  I think we can add extra test
> > realizing how large the registers are when index is nonzero.  This will
> > get bit tricky to get right when indexes are nonzero for lowparts.
> 
> Then why not use ! lowpart?

Because you can still have (subreg:sI eax:DI 4)

Honza
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09  9:45     ` Jan Hubicka
@ 2003-04-09  9:50       ` Eric Botcazou
  2003-04-09 14:52         ` Jan Hubicka
  0 siblings, 1 reply; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09  9:50 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> What would prevent it from match on (subreg:DI (reg eax) 4)?

GET_MODE_SIZE (GET_MODE (inner)) / UNITS_PER_WORD == 2
HARD_REGNO_NREGS (REGNO (inner), GET_MODE (inner)) == 2

> That was the original problem that made me to change the behaviour (we
> can't reload instructions needing two such subregged operands in -fPIC as
> we simply run out of registers)

No, the problem was related to SSE regs.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09  9:13 ` Eric Botcazou
@ 2003-04-09 11:25   ` Jan Hubicka
  2003-04-09 12:04     ` Eric Botcazou
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-09 11:25 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > Thu Oct 17 17:14:07 CEST 2002  Jan Hubicka  <jh@suse.cz>
> >
> > 	PR opt/7630
> > 	* reload.c (reload_inner_reg_of_subreg): New argument output;
> > 	(push_reload): Update call.
> 
> This patch is really bogus, at least on the 3.2 branch: sure, GCC doesn't ICE 
> when compiling pr7630.c any more but now it silently produces wrong code!

Urm.  The use of SSE is completely broken idea here..  Will have to look
why register allocator decided that.

> 
> What you have done is to teach the reload pass to silently transform any 
> non-accessible part of a multi-word register into the low part.

In the case the subreg is undable (ie it is in the SSE), or can this
happen in the integer register too?

> 
> Please revert on the 3.2 branch or consider applying the correction I posted 
> (which of course brings back the ICE for PR opt/7630).
Or we can backport the CANNOT_CHANGE_MODE_PATCH.  I would be happy about
any alternative for 3.2

Honza
> 
> Thanks.
> 
> -- 
> Eric Botcazou


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09 11:25   ` Jan Hubicka
@ 2003-04-09 12:04     ` Eric Botcazou
  2003-04-09 18:05       ` Jan Hubicka
  0 siblings, 1 reply; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09 12:04 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> > What you have done is to teach the reload pass to silently transform any
> > non-accessible part of a multi-word register into the low part.
>
> In the case the subreg is undable (ie it is in the SSE), or can this
> happen in the integer register too?

Only when the logic should have triggered, i.e with MMX and SSE regs but not 
with general regs since it doesn't trigger for them.

> Or we can backport the CANNOT_CHANGE_MODE_PATCH.

Big patch.

> I would be happy about any alternative for 3.2

Well, there aren't too many possibilities:
- either we prevent such invalid subregs from being created in the first 
place (but it appears that doing so brings about collateral damages in the 
form of (subreg:SI xmm:DI 0) being now rejected?),
- or we restore the machinery that was able to handle them.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09  9:50       ` Eric Botcazou
@ 2003-04-09 14:52         ` Jan Hubicka
  2003-04-09 18:10           ` Eric Botcazou
  2003-04-10 14:25           ` Eric Botcazou
  0 siblings, 2 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-09 14:52 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > What would prevent it from match on (subreg:DI (reg eax) 4)?
> 
> GET_MODE_SIZE (GET_MODE (inner)) / UNITS_PER_WORD == 2
> HARD_REGNO_NREGS (REGNO (inner), GET_MODE (inner)) == 2

Hmm, I guess I see it finally :)  Thanks for the patience.

One problem is that reload is really dealing with the impossible subregs
incorrectly.  See simd-* testcases on gcc 3.2 compiled with SSE support
- the (subreg:DF (XMM0:V3DF) 8) is produced, passed trought reload and
then assembled as XMM0 as the registers are of size 16.  I think we
should at least abort when the SUBREG_BYTE is not represent via
subreg_regno_offset.  Will try to make patch for that in mainline.

Concerning this testcase, the reload used to manage to get around this
problem.  It didn't get around intentionally and just because the
instruction contrain does not accept SSE register so reload is needed
anyway.  The idea of reloading whole SSE register into integer is no-go
as can be see from the other testcases.  There are two cases when the
reload is really needed:
1) we are not able to access the SUBREG_BYTE part because of
   the problem above because SUBREG_BYTE is not divisible by size of the
   register (there is no way to get the size of the register, but I guess
   we can use GET_MODE_SIZE / HARD_REGNO_NREGS) and we will also need to
   compensate the lowparts,

   This is the problem we are seeing.  It is IMO questionable whether it
   is valid to see such scenario and whether local alloc can produce
   such a subregs that has no representation.  My CANNOT_CHANGE_MODE_P
   patch avoids that as it causes the problem described above.

2) We are outputing into the register and the HARD_REGNO_NREGS is
   decreasing so we may convert subreg rewriting just part of the
   register to the subreg rewriting the lower part and clobbering upper
   part.  This is what I beleive the conditional was originally invented
   for.

Are we in the sync now?
It seems to be that the conditional is still overconservative for 2) as
it is still possible that the subreg is large enought to fit into
HARD_REGNO_NREGS of the destination and it now ignores 1).  It is
question whether 1) can legaly happen at all but I see it does for 3.2
(and does not for 3.3).
Honza
> 
> > That was the original problem that made me to change the behaviour (we
> > can't reload instructions needing two such subregged operands in -fPIC as
> > we simply run out of registers)
> 
> No, the problem was related to SSE regs.
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09 12:04     ` Eric Botcazou
@ 2003-04-09 18:05       ` Jan Hubicka
  2003-04-09 18:26         ` Eric Botcazou
  2003-04-09 21:23         ` Richard Henderson
  0 siblings, 2 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-09 18:05 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > > What you have done is to teach the reload pass to silently transform any
> > > non-accessible part of a multi-word register into the low part.
> >
> > In the case the subreg is undable (ie it is in the SSE), or can this
> > happen in the integer register too?
> 
> Only when the logic should have triggered, i.e with MMX and SSE regs but not 
> with general regs since it doesn't trigger for them.
> 
> > Or we can backport the CANNOT_CHANGE_MODE_PATCH.
> 
> Big patch.
> 
> > I would be happy about any alternative for 3.2
> 
> Well, there aren't too many possibilities:
> - either we prevent such invalid subregs from being created in the first 
> place (but it appears that doing so brings about collateral damages in the 
> form of (subreg:SI xmm:DI 0) being now rejected?),

This is the big patch, so probably not good choice.

> - or we restore the machinery that was able to handle them.

As I tried to explain in the other email, I believe we never had any
working reliably.  For 3.2 it is probably OK to simply revert the patch
after all.  I can do so if maintainer approves that.
I just want to be sure that for 3.3/3.4 it is dealt correct that I am
not quite certain about at this point :((

Honza
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09 14:52         ` Jan Hubicka
@ 2003-04-09 18:10           ` Eric Botcazou
  2003-04-09 19:15             ` Jan Hubicka
  2003-04-10 14:25           ` Eric Botcazou
  1 sibling, 1 reply; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09 18:10 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> Concerning this testcase, the reload used to manage to get around this
> problem.  It didn't get around intentionally and just because the
> instruction contrain does not accept SSE register so reload is needed
> anyway.  The idea of reloading whole SSE register into integer is no-go
> as can be see from the other testcases.  There are two cases when the
> reload is really needed:
> 1) we are not able to access the SUBREG_BYTE part because of
>    the problem above because SUBREG_BYTE is not divisible by size of the
>    register (there is no way to get the size of the register, but I guess
>    we can use GET_MODE_SIZE / HARD_REGNO_NREGS) and we will also need to
>    compensate the lowparts,
>
>    This is the problem we are seeing.  It is IMO questionable whether it
>    is valid to see such scenario and whether local alloc can produce
>    such a subregs that has no representation.  My CANNOT_CHANGE_MODE_P
>    patch avoids that as it causes the problem described above.
>
> 2) We are outputing into the register and the HARD_REGNO_NREGS is
>    decreasing so we may convert subreg rewriting just part of the
>    register to the subreg rewriting the lower part and clobbering upper
>    part.  This is what I beleive the conditional was originally invented
>    for.
>
> Are we in the sync now?

I don't fully understand the second point. Could you give an example?

> It seems to be that the conditional is still overconservative for 2) as
> it is still possible that the subreg is large enought to fit into
> HARD_REGNO_NREGS of the destination and it now ignores 1).  It is
> question whether 1) can legaly happen at all but I see it does for 3.2
> (and does not for 3.3).

Are you sure of that for 3.3? Because we will silently miscompile too if 
there are a few leaks.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09 18:05       ` Jan Hubicka
@ 2003-04-09 18:26         ` Eric Botcazou
  2003-04-09 21:23         ` Richard Henderson
  1 sibling, 0 replies; 41+ messages in thread
From: Eric Botcazou @ 2003-04-09 18:26 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 496 bytes --]

> For 3.2 it is probably OK to simply revert the patch
> after all.  I can do so if maintainer approves that.

Thanks. Please mention the reference (PR optimization/10233) in the ChangeLog 
entry. I've attached a testcase gcc.dg/i386-mmx-3.c .

> I just want to be sure that for 3.3/3.4 it is dealt correct that I am
> not quite certain about at this point :((

Perhaps instrument the code to see if invalid subregs reach the logic that 
would have triggered before your patch.

-- 
Eric Botcazou

[-- Attachment #2: i386-mmx-3.c --]
[-- Type: text/x-csrc, Size: 571 bytes --]

/* PR optimization/10233 */
/* Originator: <dean-gcc@arctic.org> */
/* { dg-do run { target i?86-*-* } } */
/* { dg-options "-std=c99 -O3 -mmmx" } */

#include <mmintrin.h>

extern void abort(void);

typedef union {
  unsigned long long uq[1];
  __m64 m;
} mm_t __attribute__((aligned(8)));

static int foo(mm_t *p, const mm_t *q)
{
  mm_t t;

  t.m = _mm_slli_pi16(p->m, 1);
  return t.uq[0] == q->uq[0];
}

int main(void)
{
   mm_t m1, m2;

   m1.uq[0] = 0x4001800180018001ULL;
   m2.uq[0] = 0x0002000200020002ULL;

   if (foo(&m1, &m2))
      abort();

   return 0;
}

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09 18:10           ` Eric Botcazou
@ 2003-04-09 19:15             ` Jan Hubicka
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-09 19:15 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > Concerning this testcase, the reload used to manage to get around this
> > problem.  It didn't get around intentionally and just because the
> > instruction contrain does not accept SSE register so reload is needed
> > anyway.  The idea of reloading whole SSE register into integer is no-go
> > as can be see from the other testcases.  There are two cases when the
> > reload is really needed:
> > 1) we are not able to access the SUBREG_BYTE part because of
> >    the problem above because SUBREG_BYTE is not divisible by size of the
> >    register (there is no way to get the size of the register, but I guess
> >    we can use GET_MODE_SIZE / HARD_REGNO_NREGS) and we will also need to
> >    compensate the lowparts,
> >
> >    This is the problem we are seeing.  It is IMO questionable whether it
> >    is valid to see such scenario and whether local alloc can produce
> >    such a subregs that has no representation.  My CANNOT_CHANGE_MODE_P
> >    patch avoids that as it causes the problem described above.
> >
> > 2) We are outputing into the register and the HARD_REGNO_NREGS is
> >    decreasing so we may convert subreg rewriting just part of the
> >    register to the subreg rewriting the lower part and clobbering upper
> >    part.  This is what I beleive the conditional was originally invented
> >    for.
> >
> > Are we in the sync now?
> 
> I don't fully understand the second point. Could you give an example?

It would be something like (subreg:SI (reg:TI) 8))
that is "set 8-12th byte of TIregister) when HARD_REGNO_NREGS is 4, but
"set 8-12th byte and clobber 12th-16th byte) when HARD_REGNO_NREGS is 2.
Not sure if it happens in real code...

Honza
> 
> > It seems to be that the conditional is still overconservative for 2) as
> > it is still possible that the subreg is large enought to fit into
> > HARD_REGNO_NREGS of the destination and it now ignores 1).  It is
> > question whether 1) can legaly happen at all but I see it does for 3.2
> > (and does not for 3.3).
> 
> Are you sure of that for 3.3? Because we will silently miscompile too if 
> there are a few leaks.
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09 18:05       ` Jan Hubicka
  2003-04-09 18:26         ` Eric Botcazou
@ 2003-04-09 21:23         ` Richard Henderson
  1 sibling, 0 replies; 41+ messages in thread
From: Richard Henderson @ 2003-04-09 21:23 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Eric Botcazou, gcc

On Wed, Apr 09, 2003 at 06:50:23PM +0200, Jan Hubicka wrote:
> For 3.2 it is probably OK to simply revert the patch
> after all.  I can do so if maintainer approves that.

Please do.


r~

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-09 14:52         ` Jan Hubicka
  2003-04-09 18:10           ` Eric Botcazou
@ 2003-04-10 14:25           ` Eric Botcazou
  2003-04-10 16:31             ` Jan Hubicka
  2003-04-10 20:21             ` Eric Botcazou
  1 sibling, 2 replies; 41+ messages in thread
From: Eric Botcazou @ 2003-04-10 14:25 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> Concerning this testcase, the reload used to manage to get around this
> problem.  It didn't get around intentionally and just because the
> instruction contrain does not accept SSE register so reload is needed
> anyway.

I think you're completely right: the reload pass has no specific 
infrastructure for dealing with invalid subregs. It may "fix" these subregs, 
but only if it happens that the operand needs reloading because of the insn 
constraints.

I now have an example: PR target/10286

We pass

(insn 33 30 35 (set (subreg:SI (reg/v:DI 67) 4)
        (mem/f:SI (plus:SI (reg/f:SI 16 argp)
                (const_int 4 [0x4])) [0 hi+0 S4 A32])) 45 {*movsi_1} 
(insn_list 30 (nil))

to reload, which produces

(insn 33 30 35 (set (reg:SI 29 emm0)
        (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
                (const_int 12 [0xc])) [0 hi+0 S4 A32])) 45 {*movsi_1} 
(insn_list 30 (nil))
    (nil))

that is, the only reload for insn 33 is

Reloads for insn # 33
Reload 0: reload_in (SI) = (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
                                                        (const_int 12 [0xc])) 
[0 hi+0 S4 A32])
	GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), optional
	reload_in_reg: (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
                                                        (const_int 12 [0xc])) 
[0 hi+0 S4 A32])


Incidentally, PR target/10286 is a regression from GCC 3.2 present in
GCC 3.2.1 and later. The culprit is this patch:

Sat Oct 19 15:49:14 CEST 2002  Jan Hubicka  <jh@suse.cz>

	* mmintrin.h (__m64): typedef it to v2si.
	(_mm_cvtsi32_si64, _mm_cvtsi32_si64_mm_sll_pi16,
	_mm_sll_pi32, _mm_sll_pi64, _mm_slli_pi64, _mm_sra_pi16,
	_mm_sra_pi32, _mm_srl_pi16, _mm_srl_pi32, _mm_srl_pi64,
	_mm_srli_pi64, _mm_and_si64, _mm_andnot_si64,
	_mm_or_si64, _mm_xor_si64): Add neccesary casts.
	* xmmintrin.h (_mm_setzero_si64): Likewise.

See http://gcc.gnu.org/ml/gcc-patches/2002-10/msg01091.html

Before the patch, __m64 was defined as 'unsigned long long' so (%eax, %edx) 
was used instead of %rmm0 and we didn't produce invalid subregs. I think 
this patch has opened a can of worms and the MMX intrinsics are seriously 
broken on the 3.2 branch now.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-10 14:25           ` Eric Botcazou
@ 2003-04-10 16:31             ` Jan Hubicka
  2003-04-10 16:35               ` Jan Hubicka
  2003-04-10 20:21             ` Eric Botcazou
  1 sibling, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-10 16:31 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > Concerning this testcase, the reload used to manage to get around this
> > problem.  It didn't get around intentionally and just because the
> > instruction contrain does not accept SSE register so reload is needed
> > anyway.
> 
> I think you're completely right: the reload pass has no specific 
> infrastructure for dealing with invalid subregs. It may "fix" these subregs, 
> but only if it happens that the operand needs reloading because of the insn 
> constraints.
> 
> I now have an example: PR target/10286
> 
> We pass
> 
> (insn 33 30 35 (set (subreg:SI (reg/v:DI 67) 4)
>         (mem/f:SI (plus:SI (reg/f:SI 16 argp)
>                 (const_int 4 [0x4])) [0 hi+0 S4 A32])) 45 {*movsi_1} 
> (insn_list 30 (nil))
> 
> to reload, which produces
> 
> (insn 33 30 35 (set (reg:SI 29 emm0)
>         (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
>                 (const_int 12 [0xc])) [0 hi+0 S4 A32])) 45 {*movsi_1} 
> (insn_list 30 (nil))
>     (nil))
> 
> that is, the only reload for insn 33 is
> 
> Reloads for insn # 33
> Reload 0: reload_in (SI) = (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
>                                                         (const_int 12 [0xc])) 
> [0 hi+0 S4 A32])
> 	GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), optional
> 	reload_in_reg: (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
>                                                         (const_int 12 [0xc])) 
> [0 hi+0 S4 A32])
> 
> 
> Incidentally, PR target/10286 is a regression from GCC 3.2 present in
> GCC 3.2.1 and later. The culprit is this patch:
> 
> Sat Oct 19 15:49:14 CEST 2002  Jan Hubicka  <jh@suse.cz>
> 
> 	* mmintrin.h (__m64): typedef it to v2si.
> 	(_mm_cvtsi32_si64, _mm_cvtsi32_si64_mm_sll_pi16,
> 	_mm_sll_pi32, _mm_sll_pi64, _mm_slli_pi64, _mm_sra_pi16,
> 	_mm_sra_pi32, _mm_srl_pi16, _mm_srl_pi32, _mm_srl_pi64,
> 	_mm_srli_pi64, _mm_and_si64, _mm_andnot_si64,
> 	_mm_or_si64, _mm_xor_si64): Add neccesary casts.
> 	* xmmintrin.h (_mm_setzero_si64): Likewise.
> 
> See http://gcc.gnu.org/ml/gcc-patches/2002-10/msg01091.html
> 
> Before the patch, __m64 was defined as 'unsigned long long' so (%eax, %edx) 
> was used instead of %rmm0 and we didn't produce invalid subregs. I think 
> this patch has opened a can of worms and the MMX intrinsics are seriously 
> broken on the 3.2 branch now.

They always have been - I've fixed 50+ different bugs exposed by simple
testsuite executing each individual intrincisc with few diferent args,
so I would not brother about this on the 3.2.  3.3 is different issue
and should be the first GCC where intrincisc are generaly usable.  I
hope.

Will look into this in the detail this evening, now I have about -10
minutes of time.  Thanks!

Honza
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-10 16:31             ` Jan Hubicka
@ 2003-04-10 16:35               ` Jan Hubicka
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-10 16:35 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Eric Botcazou, gcc

> > > Concerning this testcase, the reload used to manage to get around this
> > > problem.  It didn't get around intentionally and just because the
> > > instruction contrain does not accept SSE register so reload is needed
> > > anyway.
> > 
> > I think you're completely right: the reload pass has no specific 
> > infrastructure for dealing with invalid subregs. It may "fix" these subregs, 
> > but only if it happens that the operand needs reloading because of the insn 
> > constraints.
> > 
> > I now have an example: PR target/10286
> > 
> > We pass
> > 
> > (insn 33 30 35 (set (subreg:SI (reg/v:DI 67) 4)
> >         (mem/f:SI (plus:SI (reg/f:SI 16 argp)
> >                 (const_int 4 [0x4])) [0 hi+0 S4 A32])) 45 {*movsi_1} 
> > (insn_list 30 (nil))
> > 
> > to reload, which produces
> > 
> > (insn 33 30 35 (set (reg:SI 29 emm0)
> >         (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
> >                 (const_int 12 [0xc])) [0 hi+0 S4 A32])) 45 {*movsi_1} 
> > (insn_list 30 (nil))
> >     (nil))
> > 
> > that is, the only reload for insn 33 is
> > 
> > Reloads for insn # 33
> > Reload 0: reload_in (SI) = (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
> >                                                         (const_int 12 [0xc])) 
> > [0 hi+0 S4 A32])
> > 	GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), optional
> > 	reload_in_reg: (mem/f:SI (plus:SI (reg/f:SI 6 ebp)
> >                                                         (const_int 12 [0xc])) 
> > [0 hi+0 S4 A32])
> > 
> > 
> > Incidentally, PR target/10286 is a regression from GCC 3.2 present in
> > GCC 3.2.1 and later. The culprit is this patch:
> > 
> > Sat Oct 19 15:49:14 CEST 2002  Jan Hubicka  <jh@suse.cz>
> > 
> > 	* mmintrin.h (__m64): typedef it to v2si.
> > 	(_mm_cvtsi32_si64, _mm_cvtsi32_si64_mm_sll_pi16,
> > 	_mm_sll_pi32, _mm_sll_pi64, _mm_slli_pi64, _mm_sra_pi16,
> > 	_mm_sra_pi32, _mm_srl_pi16, _mm_srl_pi32, _mm_srl_pi64,
> > 	_mm_srli_pi64, _mm_and_si64, _mm_andnot_si64,
> > 	_mm_or_si64, _mm_xor_si64): Add neccesary casts.
> > 	* xmmintrin.h (_mm_setzero_si64): Likewise.
> > 
> > See http://gcc.gnu.org/ml/gcc-patches/2002-10/msg01091.html
> > 
> > Before the patch, __m64 was defined as 'unsigned long long' so (%eax, %edx) 
> > was used instead of %rmm0 and we didn't produce invalid subregs. I think 
> > this patch has opened a can of worms and the MMX intrinsics are seriously 
> > broken on the 3.2 branch now.
> 
> They always have been - I've fixed 50+ different bugs exposed by simple
> testsuite executing each individual intrincisc with few diferent args,
> so I would not brother about this on the 3.2.  3.3 is different issue
> and should be the first GCC where intrincisc are generaly usable.  I
> hope.
> 
> Will look into this in the detail this evening, now I have about -10
> minutes of time.  Thanks!

Perhaps I would vote for simply disabling the xmm intrincisc support in
the 3.2.x compiler by adding #error into the headers.  It is better than
confusing user by completely broken interface especially one so dificult
to use as xmm/mmx intrincisc are.

Honza
> 
> Honza
> > 
> > -- 
> > Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-10 14:25           ` Eric Botcazou
  2003-04-10 16:31             ` Jan Hubicka
@ 2003-04-10 20:21             ` Eric Botcazou
  2003-04-10 20:43               ` Jan Hubicka
  2003-04-10 20:51               ` Reload bug Dale Johannesen
  1 sibling, 2 replies; 41+ messages in thread
From: Eric Botcazou @ 2003-04-10 20:21 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> I think you're completely right: the reload pass has no specific
> infrastructure for dealing with invalid subregs. It may "fix" these
> subregs, but only if it happens that the operand needs reloading because
> of the insn constraints.

I've found why: the code has been disabled in find_reloads()

		  /* This following hunk of code should no longer be
		     needed at all with SUBREG_BYTE.  If you need this
		     code back, please explain to me why so I can
		     fix the real problem.  -DaveM */
#if 0
		  /* Subreg of a hard reg which can't handle the subreg's mode
		     or which would handle that mode in the wrong number of
		     registers for subregging to work.  */
		  || (GET_CODE (operand) == REG
		      && REGNO (operand) < FIRST_PSEUDO_REGISTER
		      && ((GET_MODE_SIZE (operand_mode[i]) <= UNITS_PER_WORD
			   && (GET_MODE_SIZE (GET_MODE (operand))
			       > UNITS_PER_WORD)
			   && ((GET_MODE_SIZE (GET_MODE (operand))
				/ UNITS_PER_WORD)
			       != HARD_REGNO_NREGS (REGNO (operand),
						    GET_MODE (operand))))
			  || ! HARD_REGNO_MODE_OK (REGNO (operand) + offset,
						   operand_mode[i])))
#endif

Re-enabling it fix PR target/10286.

Now the question: what is the replacement machinery that is supposed to be 
doing the work?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-10 20:21             ` Eric Botcazou
@ 2003-04-10 20:43               ` Jan Hubicka
  2003-04-11 14:44                 ` Eric Botcazou
  2003-04-10 20:51               ` Reload bug Dale Johannesen
  1 sibling, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-10 20:43 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > I think you're completely right: the reload pass has no specific
> > infrastructure for dealing with invalid subregs. It may "fix" these
> > subregs, but only if it happens that the operand needs reloading because
> > of the insn constraints.
> 
> I've found why: the code has been disabled in find_reloads()
> 
> 		  /* This following hunk of code should no longer be
> 		     needed at all with SUBREG_BYTE.  If you need this
> 		     code back, please explain to me why so I can
> 		     fix the real problem.  -DaveM */
> #if 0
> 		  /* Subreg of a hard reg which can't handle the subreg's mode
> 		     or which would handle that mode in the wrong number of
> 		     registers for subregging to work.  */
> 		  || (GET_CODE (operand) == REG
> 		      && REGNO (operand) < FIRST_PSEUDO_REGISTER
> 		      && ((GET_MODE_SIZE (operand_mode[i]) <= UNITS_PER_WORD
> 			   && (GET_MODE_SIZE (GET_MODE (operand))
> 			       > UNITS_PER_WORD)
> 			   && ((GET_MODE_SIZE (GET_MODE (operand))
> 				/ UNITS_PER_WORD)
> 			       != HARD_REGNO_NREGS (REGNO (operand),
> 						    GET_MODE (operand))))
> 			  || ! HARD_REGNO_MODE_OK (REGNO (operand) + offset,
> 						   operand_mode[i])))
> #endif
> 
> Re-enabling it fix PR target/10286.
> 
> Now the question: what is the replacement machinery that is supposed to be 
> doing the work?

Hmm, I've wondered about this piece of code as well today in the train.
I think Davej was wrong to disable it without approripate replacement.
The code is wrong for post-SUBREG_BYTE patch as it checks that the
partial regs have exactly size of one word, but we do allow subregs on
registers of different sizes.  We need to verify that offset can
represent the register exactly.

Looking as subreg_regno_offset, I think we need to practically check
that division in:
  return (y_offset / (mode_multiple / nregs_multiple)) * nregs_ymode;
Does not round.  When it rounds we are having registers too wide and we
must reload.
All the other divisions should be safe from SUBREG definition that is
already verified by my simplify_subreg code.
We also should add a trap to subreg_regno_offset in the mainline...
Seems to make sense?

Honza

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-10 20:21             ` Eric Botcazou
  2003-04-10 20:43               ` Jan Hubicka
@ 2003-04-10 20:51               ` Dale Johannesen
  1 sibling, 0 replies; 41+ messages in thread
From: Dale Johannesen @ 2003-04-10 20:51 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Dale Johannesen, Jan Hubicka, gcc

On Thursday, April 10, 2003, at 01:01  PM, Eric Botcazou wrote:
>> I think you're completely right: the reload pass has no specific
>> infrastructure for dealing with invalid subregs. It may "fix" these
>> subregs, but only if it happens that the operand needs reloading 
>> because
>> of the insn constraints.
>
> I've found why: the code has been disabled in find_reloads()
>
> 		  /* This following hunk of code should no longer be
> 		     needed at all with SUBREG_BYTE.  If you need this
> 		     code back, please explain to me why so I can
> 		     fix the real problem.  -DaveM */
>
> Re-enabling it fix PR target/10286.
>
> Now the question: what is the replacement machinery that is supposed 
> to be
> doing the work?

Ask DaveM?  He does seem to have volunteered.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-10 20:43               ` Jan Hubicka
@ 2003-04-11 14:44                 ` Eric Botcazou
  2003-04-11 17:49                   ` Jan Hubicka
                                     ` (3 more replies)
  0 siblings, 4 replies; 41+ messages in thread
From: Eric Botcazou @ 2003-04-11 14:44 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> The code is wrong for post-SUBREG_BYTE patch as it checks that the
> partial regs have exactly size of one word, but we do allow subregs on
> registers of different sizes.  We need to verify that offset can
> represent the register exactly.

Is it really wrong or is it incomplete now, in the post-SUBREG_BYTE era?

> Looking as subreg_regno_offset, I think we need to practically check
> that division in:
>   return (y_offset / (mode_multiple / nregs_multiple)) * nregs_ymode;
> Does not round.  When it rounds we are having registers too wide and we
> must reload.
> All the other divisions should be safe from SUBREG definition that is
> already verified by my simplify_subreg code.
> We also should add a trap to subreg_regno_offset in the mainline...
> Seems to make sense?

Do you mean that we can generate

	(subreg:HI (reg/v:SI 67) 2)

and we currently have no means to fix it during the reload pass?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-11 14:44                 ` Eric Botcazou
@ 2003-04-11 17:49                   ` Jan Hubicka
  2003-04-11 18:09                   ` Jan Hubicka
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-11 17:49 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > The code is wrong for post-SUBREG_BYTE patch as it checks that the
> > partial regs have exactly size of one word, but we do allow subregs on
> > registers of different sizes.  We need to verify that offset can
> > represent the register exactly.
> 
> Is it really wrong or is it incomplete now, in the post-SUBREG_BYTE era?

Yes, it would force, for instance subregs of SFmode registers in 64bit
sparc to be reloaded all the time (the original motivation for
SUBREG_BYTE)
> 
> > Looking as subreg_regno_offset, I think we need to practically check
> > that division in:
> >   return (y_offset / (mode_multiple / nregs_multiple)) * nregs_ymode;
> > Does not round.  When it rounds we are having registers too wide and we
> > must reload.
> > All the other divisions should be safe from SUBREG definition that is
> > already verified by my simplify_subreg code.
> > We also should add a trap to subreg_regno_offset in the mainline...
> > Seems to make sense?
> 
> Do you mean that we can generate
> 
> 	(subreg:HI (reg/v:SI 67) 2)
> 
> and we currently have no means to fix it during the reload pass?

Yes, we basically rely on the fact that we never generate them as we
originaly generated everything on word basic.

Honza
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-11 14:44                 ` Eric Botcazou
  2003-04-11 17:49                   ` Jan Hubicka
@ 2003-04-11 18:09                   ` Jan Hubicka
  2003-04-11 19:01                   ` Jan Hubicka
  2003-04-11 19:07                   ` Jan Hubicka
  3 siblings, 0 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-11 18:09 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > The code is wrong for post-SUBREG_BYTE patch as it checks that the
> > partial regs have exactly size of one word, but we do allow subregs on
> > registers of different sizes.  We need to verify that offset can
> > represent the register exactly.
> 
> Is it really wrong or is it incomplete now, in the post-SUBREG_BYTE era?
> 
> > Looking as subreg_regno_offset, I think we need to practically check
> > that division in:
> >   return (y_offset / (mode_multiple / nregs_multiple)) * nregs_ymode;
> > Does not round.  When it rounds we are having registers too wide and we
> > must reload.
> > All the other divisions should be safe from SUBREG definition that is
> > already verified by my simplify_subreg code.
> > We also should add a trap to subreg_regno_offset in the mainline...
> > Seems to make sense?
> 
> Do you mean that we can generate
> 
> 	(subreg:HI (reg/v:SI 67) 2)
> 
> and we currently have no means to fix it during the reload pass?

Forgot to mention, I have patch in testing, will send it as soon as I
will be able to access the net (it is down for some reason now)

Honza
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-11 14:44                 ` Eric Botcazou
  2003-04-11 17:49                   ` Jan Hubicka
  2003-04-11 18:09                   ` Jan Hubicka
@ 2003-04-11 19:01                   ` Jan Hubicka
  2003-04-11 19:07                   ` Jan Hubicka
  3 siblings, 0 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-11 19:01 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > The code is wrong for post-SUBREG_BYTE patch as it checks that the
> > partial regs have exactly size of one word, but we do allow subregs on
> > registers of different sizes.  We need to verify that offset can
> > represent the register exactly.
> 
> Is it really wrong or is it incomplete now, in the post-SUBREG_BYTE era?
Hi,
this is the patch I am currently playing with.  The
subreg_offset_representable_p should return false when we can't
represent the subreg and should abort in the cases I believe current
implemetnation would get entirely confused.  I am not quite sure I got
it right, but you may want to take a look :)

Index: emit-rtl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/emit-rtl.c,v
retrieving revision 1.318
diff -c -3 -p -r1.318 emit-rtl.c
*** emit-rtl.c	4 Apr 2003 22:44:02 -0000	1.318
--- emit-rtl.c	11 Apr 2003 18:06:01 -0000
*************** subreg_hard_regno (x, check_mode)
*** 1080,1086 ****
      abort ();
    if (check_mode && ! HARD_REGNO_MODE_OK (base_regno, GET_MODE (reg)))
      abort ();
! 
    /* Catch non-congruent offsets too.  */
    byte_offset = SUBREG_BYTE (x);
    if ((byte_offset % GET_MODE_SIZE (mode)) != 0)
--- 1080,1090 ----
      abort ();
    if (check_mode && ! HARD_REGNO_MODE_OK (base_regno, GET_MODE (reg)))
      abort ();
! #ifdef ENABLE_CHECKING
!   if (!subreg_offset_representable_p (REGNO (reg), GET_MODE (reg),
! 			  	      SUBREG_BYTE (x), mode))
!     abort ();
! #endif
    /* Catch non-congruent offsets too.  */
    byte_offset = SUBREG_BYTE (x);
    if ((byte_offset % GET_MODE_SIZE (mode)) != 0)
Index: reload.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/reload.c,v
retrieving revision 1.211
diff -c -3 -p -r1.211 reload.c
*** reload.c	26 Mar 2003 07:48:13 -0000	1.211
--- reload.c	11 Apr 2003 18:06:01 -0000
*************** find_reloads (insn, replace, ind_levels,
*** 2880,2885 ****
--- 2880,2891 ----
  	      if (GET_CODE (SUBREG_REG (operand)) == REG
  		  && REGNO (SUBREG_REG (operand)) < FIRST_PSEUDO_REGISTER)
  		{
+ 		  if (!subreg_offset_representable_p
+ 			(REGNO (SUBREG_REG (operand)),
+ 			 GET_MODE (SUBREG_REG (operand)),
+ 			 SUBREG_BYTE (operand),
+ 			 GET_MODE (operand)))
+ 		     force_reload = 1;
  		  offset += subreg_regno_offset (REGNO (SUBREG_REG (operand)),
  						 GET_MODE (SUBREG_REG (operand)),
  						 SUBREG_BYTE (operand),
*************** find_reloads (insn, replace, ind_levels,
*** 2935,2960 ****
  			  )
  #endif
  		      )
- 		  /* This following hunk of code should no longer be
- 		     needed at all with SUBREG_BYTE.  If you need this
- 		     code back, please explain to me why so I can
- 		     fix the real problem.  -DaveM */
- #if 0
- 		  /* Subreg of a hard reg which can't handle the subreg's mode
- 		     or which would handle that mode in the wrong number of
- 		     registers for subregging to work.  */
- 		  || (GET_CODE (operand) == REG
- 		      && REGNO (operand) < FIRST_PSEUDO_REGISTER
- 		      && ((GET_MODE_SIZE (operand_mode[i]) <= UNITS_PER_WORD
- 			   && (GET_MODE_SIZE (GET_MODE (operand))
- 			       > UNITS_PER_WORD)
- 			   && ((GET_MODE_SIZE (GET_MODE (operand))
- 				/ UNITS_PER_WORD)
- 			       != HARD_REGNO_NREGS (REGNO (operand),
- 						    GET_MODE (operand))))
- 			  || ! HARD_REGNO_MODE_OK (REGNO (operand) + offset,
- 						   operand_mode[i])))
- #endif
  		  )
  		force_reload = 1;
  	    }
--- 2941,2946 ----
*************** find_reloads_address_1 (mode, x, context
*** 5271,5276 ****
--- 5257,5275 ----
  						       GET_MODE (SUBREG_REG (orig_op1)),
  						       SUBREG_BYTE (orig_op1),
  						       GET_MODE (orig_op1))));
+ 	  }
+ 	/* Plus in the index register may be created only as a result of
+ 	   register remateralization for expresion like &localvar*4.  Reload it.
+ 	   It may be possible to combine the displacement on the outer level,
+ 	   but it is probably not worthwhile to do so.  */
+ 	if (context)
+ 	  {
+ 	    find_reloads_address (GET_MODE (x), loc, XEXP (x, 0), &XEXP (x, 0),
+ 				  opnum, ADDR_TYPE (type), ind_levels, insn);
+ 	    push_reload (*loc, NULL_RTX, loc, (rtx*) 0,
+ 			 (context ? INDEX_REG_CLASS : MODE_BASE_REG_CLASS (mode)),
+ 			 GET_MODE (x), VOIDmode, 0, 0, opnum, type);
+ 	    return 1;
  	  }
  
  	if (code0 == MULT || code0 == SIGN_EXTEND || code0 == TRUNCATE

+   nregs_ymode = HARD_REGNO_NREGS (xregno, ymode);
+ 
+   /* paradoxical subregs are always valid.  */
+   if (offset == 0
+       && nregs_ymode > nregs_xmode
+       && (GET_MODE_SIZE (ymode) > UNITS_PER_WORD
+ 	  ? WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN))
+     return true;
+ 
+   /* Lowpart subregs are always valid.  */
+   if (offset == subreg_lowpart_offset (ymode, xmode))
+     return true;
+ 
+ #ifdef ENABLE_CHECKING
+   /* This should always pass, otherwise we don't know how to verify the
+      constraint. 
+ 
+      These conditions may be relaxed but subreg_offset would need to be
+      redesigned.  */
+   if (GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)
+       || GET_MODE_SIZE (ymode) % nregs_ymode
+       || mode_for_size (GET_MODE_SIZE (ymode) / nregs_ymode,
+ 	      		MODE_INT, 0) == VOIDmode
+       || nregs_xmode % nregs_ymode)
+     abort ();
+ #endif
+ 
+   /* The XMODE value can be seen as an vector of NREGS_XMODE
+      values.  The subreg must represent an lowpart of given field.
+      Compute what field it is.  */
+   offset -= subreg_lowpart_offset (mode_for_size (GET_MODE_SIZE (ymode)
+ 			  			  / nregs_ymode,
+ 						  MODE_INT, 0), xmode);
+ 
+   /* size of ymode must not be greater than the size of xmode.  */
+   mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode);
+   if (mode_multiple == 0)
+     abort ();
+ 
+   y_offset = offset / GET_MODE_SIZE (ymode);
+   nregs_multiple =  nregs_xmode / nregs_ymode;
+ #ifdef ENABLE_CHECKING
+   if (offset % GET_MODE_SIZE (ymode)
+       || mode_multiple % nregs_multiple)
+     abort ();
+ #endif
+   return (!(y_offset % (mode_multiple / nregs_multiple)));
+ }
+ 
  /* Return the final regno that a subreg expression refers to.  */
  unsigned int
  subreg_regno (x)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-11 14:44                 ` Eric Botcazou
                                     ` (2 preceding siblings ...)
  2003-04-11 19:01                   ` Jan Hubicka
@ 2003-04-11 19:07                   ` Jan Hubicka
  2003-04-12 14:55                     ` Eric Botcazou
  3 siblings, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-11 19:07 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

oops, I stripped the patch too much last time.
I am attaching full one.  It fixes the 3.2 testcases I have...
Does it appear to make sense to you?
Will have to test it on big endian machine too...

Index: emit-rtl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/emit-rtl.c,v
retrieving revision 1.249.2.10.2.2
diff -c -3 -p -r1.249.2.10.2.2 emit-rtl.c
*** emit-rtl.c	3 Feb 2003 18:01:37 -0000	1.249.2.10.2.2
--- emit-rtl.c	11 Apr 2003 18:25:46 -0000
*************** subreg_hard_regno (x, check_mode)
*** 799,805 ****
      abort ();
    if (check_mode && ! HARD_REGNO_MODE_OK (base_regno, GET_MODE (reg)))
      abort ();
! 
    /* Catch non-congruent offsets too.  */
    byte_offset = SUBREG_BYTE (x);
    if ((byte_offset % GET_MODE_SIZE (mode)) != 0)
--- 799,809 ----
      abort ();
    if (check_mode && ! HARD_REGNO_MODE_OK (base_regno, GET_MODE (reg)))
      abort ();
! #ifdef ENABLE_CHECKING
!   if (!subreg_offset_representable_p (REGNO (reg), GET_MODE (reg),
! 			  	      SUBREG_BYTE (x), mode))
!     abort ();
! #endif
    /* Catch non-congruent offsets too.  */
    byte_offset = SUBREG_BYTE (x);
    if ((byte_offset % GET_MODE_SIZE (mode)) != 0)
Index: reload.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/reload.c,v
retrieving revision 1.178.2.4.2.4
diff -c -3 -p -r1.178.2.4.2.4 reload.c
*** reload.c	24 Oct 2002 08:59:49 -0000	1.178.2.4.2.4
--- reload.c	11 Apr 2003 18:25:48 -0000
*************** reload_inner_reg_of_subreg (x, mode, out
*** 817,822 ****
--- 817,829 ----
        || REGNO (inner) >= FIRST_PSEUDO_REGISTER)
      return 0;
  
+   if (!subreg_offset_representable_p
+ 	(REGNO (SUBREG_REG (x)),
+ 		GET_MODE (SUBREG_REG (x)),
+ 		SUBREG_BYTE (x),
+ 		GET_MODE (x)))
+     return 1;
+ 
    /* If INNER is not ok for MODE, then INNER will need reloading.  */
    if (! HARD_REGNO_MODE_OK (subreg_regno (x), mode))
      return 1;
*************** find_reloads (insn, replace, ind_levels,
*** 2856,2861 ****
--- 2863,2874 ----
  	      if (GET_CODE (SUBREG_REG (operand)) == REG
  		  && REGNO (SUBREG_REG (operand)) < FIRST_PSEUDO_REGISTER)
  		{
+ 		  if (!subreg_offset_representable_p
+ 			(REGNO (SUBREG_REG (operand)),
+ 			 GET_MODE (SUBREG_REG (operand)),
+ 			 SUBREG_BYTE (operand),
+ 			 GET_MODE (operand)))
+ 		     force_reload = 1;
  		  offset += subreg_regno_offset (REGNO (SUBREG_REG (operand)),
  						 GET_MODE (SUBREG_REG (operand)),
  						 SUBREG_BYTE (operand),
*************** find_reloads (insn, replace, ind_levels,
*** 2911,2936 ****
  			  )
  #endif
  		      )
- 		  /* This following hunk of code should no longer be
- 		     needed at all with SUBREG_BYTE.  If you need this
- 		     code back, please explain to me why so I can
- 		     fix the real problem.  -DaveM */
- #if 0
- 		  /* Subreg of a hard reg which can't handle the subreg's mode
- 		     or which would handle that mode in the wrong number of
- 		     registers for subregging to work.  */
- 		  || (GET_CODE (operand) == REG
- 		      && REGNO (operand) < FIRST_PSEUDO_REGISTER
- 		      && ((GET_MODE_SIZE (operand_mode[i]) <= UNITS_PER_WORD
- 			   && (GET_MODE_SIZE (GET_MODE (operand))
- 			       > UNITS_PER_WORD)
- 			   && ((GET_MODE_SIZE (GET_MODE (operand))
- 				/ UNITS_PER_WORD)
- 			       != HARD_REGNO_NREGS (REGNO (operand),
- 						    GET_MODE (operand))))
- 			  || ! HARD_REGNO_MODE_OK (REGNO (operand) + offset,
- 						   operand_mode[i])))
- #endif
  		  )
  		force_reload = 1;
  	    }
--- 2924,2929 ----
Index: rtlanal.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtlanal.c,v
retrieving revision 1.126.2.2.4.1
diff -c -3 -p -r1.126.2.2.4.1 rtlanal.c
*** rtlanal.c	9 Jan 2003 13:18:41 -0000	1.126.2.2.4.1
--- rtlanal.c	11 Apr 2003 18:25:48 -0000
*************** subreg_regno_offset (xregno, xmode, offs
*** 3045,3050 ****
--- 3045,3121 ----
    return (y_offset / (mode_multiple / nregs_multiple)) * nregs_ymode;
  }
  
+ /* This function returns true when the offset is representable via
+    subreg_offset in the given regno.
+    xregno - A regno of an inner hard subreg_reg (or what will become one).
+    xmode  - The mode of xregno.
+    offset - The byte offset.
+    ymode  - The mode of a top level SUBREG (or what may become one).
+    RETURN - The regno offset which would be used.  */
+ bool
+ subreg_offset_representable_p (xregno, xmode, offset, ymode)
+      unsigned int xregno;
+      enum machine_mode xmode;
+      unsigned int offset;
+      enum machine_mode ymode;
+ {
+   int nregs_xmode, nregs_ymode;
+   int mode_multiple, nregs_multiple;
+   int y_offset;
+ 
+   if (xregno >= FIRST_PSEUDO_REGISTER)
+     abort ();
+ 
+   nregs_xmode = HARD_REGNO_NREGS (xregno, xmode);
+   nregs_ymode = HARD_REGNO_NREGS (xregno, ymode);
+ 
+   /* paradoxical subregs are always valid.  */
+   if (offset == 0
+       && nregs_ymode > nregs_xmode
+       && (GET_MODE_SIZE (ymode) > UNITS_PER_WORD
+ 	  ? WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN))
+     return true;
+ 
+   /* Lowpart subregs are always valid.  */
+   if (offset == subreg_lowpart_offset (ymode, xmode))
+     return true;
+ 
+ #ifdef ENABLE_CHECKING
+   /* This should always pass, otherwise we don't know how to verify the
+      constraint. 
+ 
+      These conditions may be relaxed but subreg_offset would need to be
+      redesigned.  */
+   if (GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)
+       || GET_MODE_SIZE (ymode) % nregs_ymode
+       || mode_for_size (GET_MODE_SIZE (ymode) / nregs_ymode,
+ 	      		MODE_INT, 0) == VOIDmode
+       || nregs_xmode % nregs_ymode)
+     abort ();
+ #endif
+ 
+   /* The XMODE value can be seen as an vector of NREGS_XMODE
+      values.  The subreg must represent an lowpart of given field.
+      Compute what field it is.  */
+   offset -= subreg_lowpart_offset (mode_for_size (GET_MODE_SIZE (ymode)
+ 			  			  / nregs_ymode,
+ 						  MODE_INT, 0), xmode);
+ 
+   /* size of ymode must not be greater than the size of xmode.  */
+   mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode);
+   if (mode_multiple == 0)
+     abort ();
+ 
+   y_offset = offset / GET_MODE_SIZE (ymode);
+   nregs_multiple =  nregs_xmode / nregs_ymode;
+ #ifdef ENABLE_CHECKING
+   if (offset % GET_MODE_SIZE (ymode)
+       || mode_multiple % nregs_multiple)
+     abort ();
+ #endif
+   return (!(y_offset % (mode_multiple / nregs_multiple)));
+ }
+ 
  /* Return the final regno that a subreg expression refers to.  */
  unsigned int 
  subreg_regno (x)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-11 19:07                   ` Jan Hubicka
@ 2003-04-12 14:55                     ` Eric Botcazou
  2003-04-12 17:45                       ` Jan Hubicka
  2003-04-12 17:55                       ` Make reload to avoid invalid subregs Jan Hubicka
  0 siblings, 2 replies; 41+ messages in thread
From: Eric Botcazou @ 2003-04-12 14:55 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> I am attaching full one.  It fixes the 3.2 testcases I have...
> Does it appear to make sense to you?

The locations from which you call the new subreg_offset_representable_p() 
look sensible. But I think there are some duplicate logic in 
reload_inner_reg_of_subreg() now.

+ /* This function returns true when the offset is representable via
+    subreg_offset in the given regno.

The function appears to assume that reg XREGNO supports mode YMODE. Is that 
by design? If so, I think you should mention it in the header comment.

+    RETURN - The regno offset which would be used.  */

Pasto.

+ bool
+ subreg_offset_representable_p (xregno, xmode, offset, ymode)
+      unsigned int xregno;
+      enum machine_mode xmode;
+      unsigned int offset;
+      enum machine_mode ymode;
+ {
+   int nregs_xmode, nregs_ymode;
+   int mode_multiple, nregs_multiple;
+   int y_offset;
+ 
+   if (xregno >= FIRST_PSEUDO_REGISTER)
+     abort ();
+ 
+   nregs_xmode = HARD_REGNO_NREGS (xregno, xmode);
+   nregs_ymode = HARD_REGNO_NREGS (xregno, ymode);
+ 
+   /* paradoxical subregs are always valid.  */
+   if (offset == 0
+       && nregs_ymode > nregs_xmode
+       && (GET_MODE_SIZE (ymode) > UNITS_PER_WORD
+ 	  ? WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN))
+     return true;
+ 
+   /* Lowpart subregs are always valid.  */
+   if (offset == subreg_lowpart_offset (ymode, xmode))
+     return true;

I think you should mention that the second "if" catches paradoxical subregs 
too.

+ #ifdef ENABLE_CHECKING
+   /* This should always pass, otherwise we don't know how to verify the
+      constraint. 
+ 
+      These conditions may be relaxed but subreg_offset would need to be
+      redesigned.  */
+   if (GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)
+       || GET_MODE_SIZE (ymode) % nregs_ymode
+       || mode_for_size (GET_MODE_SIZE (ymode) / nregs_ymode,
+ 	      		MODE_INT, 0) == VOIDmode
+       || nregs_xmode % nregs_ymode)
+     abort ();
+ #endif
+ 
+   /* The XMODE value can be seen as an vector of NREGS_XMODE
+      values.  The subreg must represent an lowpart of given field.
+      Compute what field it is.  */
+   offset -= subreg_lowpart_offset (mode_for_size (GET_MODE_SIZE (ymode)
+ 			  			  / nregs_ymode,
+ 						  MODE_INT, 0), xmode);

I'm clueless here: I think I understand what you want to do but I have a 
mixed feeling about the formula. In particular,

	GET_MODE_SIZE (ymode) / nregs_ymode

looks weird for a size. Didn't you forget to re-multiply by nregs_ymode?

+   /* size of ymode must not be greater than the size of xmode.  */
+   mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode);
+   if (mode_multiple == 0)
+     abort ();

All paradoxical subregs should already have been caught at that point, no?

+   y_offset = offset / GET_MODE_SIZE (ymode);
+   nregs_multiple =  nregs_xmode / nregs_ymode;
+ #ifdef ENABLE_CHECKING
+   if (offset % GET_MODE_SIZE (ymode)
+       || mode_multiple % nregs_multiple)
+     abort ();
+ #endif
+   return (!(y_offset % (mode_multiple / nregs_multiple)));
+ }

I think you're right about the condition.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-12 14:55                     ` Eric Botcazou
@ 2003-04-12 17:45                       ` Jan Hubicka
  2003-04-13 19:57                         ` Eric Botcazou
  2003-04-12 17:55                       ` Make reload to avoid invalid subregs Jan Hubicka
  1 sibling, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-12 17:45 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > I am attaching full one.  It fixes the 3.2 testcases I have...
> > Does it appear to make sense to you?
> 
> The locations from which you call the new subreg_offset_representable_p() 
> look sensible. But I think there are some duplicate logic in 
> reload_inner_reg_of_subreg() now.
Yes, I would like to remove it separately so in the case of problem we
will be able to determine what part has actually caused that.
> 
> + /* This function returns true when the offset is representable via
> +    subreg_offset in the given regno.
> 
> The function appears to assume that reg XREGNO supports mode YMODE. Is that 
> by design? If so, I think you should mention it in the header comment.

Hmm, I am not certain what should happen there.  We can either verify it
here or the place we are doing so already...
> +   nregs_xmode = HARD_REGNO_NREGS (xregno, xmode);
> +   nregs_ymode = HARD_REGNO_NREGS (xregno, ymode);
> + 
> +   /* paradoxical subregs are always valid.  */
> +   if (offset == 0
> +       && nregs_ymode > nregs_xmode
> +       && (GET_MODE_SIZE (ymode) > UNITS_PER_WORD
> + 	  ? WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN))
> +     return true;
> + 
> +   /* Lowpart subregs are always valid.  */
> +   if (offset == subreg_lowpart_offset (ymode, xmode))
> +     return true;
> 
> I think you should mention that the second "if" catches paradoxical subregs 
> too.

Hmm, you are right. Then we can remove the first if too :)
> 
> + #ifdef ENABLE_CHECKING
> +   /* This should always pass, otherwise we don't know how to verify the
> +      constraint. 
> + 
> +      These conditions may be relaxed but subreg_offset would need to be
> +      redesigned.  */
> +   if (GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)
> +       || GET_MODE_SIZE (ymode) % nregs_ymode
> +       || mode_for_size (GET_MODE_SIZE (ymode) / nregs_ymode,
> + 	      		MODE_INT, 0) == VOIDmode
> +       || nregs_xmode % nregs_ymode)
> +     abort ();
> + #endif
> + 
> +   /* The XMODE value can be seen as an vector of NREGS_XMODE
> +      values.  The subreg must represent an lowpart of given field.
> +      Compute what field it is.  */
> +   offset -= subreg_lowpart_offset (mode_for_size (GET_MODE_SIZE (ymode)
> + 			  			  / nregs_ymode,
> + 						  MODE_INT, 0), xmode);
> 
> I'm clueless here: I think I understand what you want to do but I have a 
> mixed feeling about the formula. In particular,
> 
> 	GET_MODE_SIZE (ymode) / nregs_ymode
> 
> looks weird for a size. Didn't you forget to re-multiply by nregs_ymode?

No, I am actually trying to compute the size of the one ymode register
in order to be able to compensate the lowpart operation.
Imagine  (subreg:QI (reg:DI) 3) is valid on little endian with
nregs_ymode == 2
> 
> +   /* size of ymode must not be greater than the size of xmode.  */
> +   mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode);
> +   if (mode_multiple == 0)
> +     abort ();
> 
> All paradoxical subregs should already have been caught at that point, no?

Yes, it is just pasto.

Honza
> 
> +   y_offset = offset / GET_MODE_SIZE (ymode);
> +   nregs_multiple =  nregs_xmode / nregs_ymode;
> + #ifdef ENABLE_CHECKING
> +   if (offset % GET_MODE_SIZE (ymode)
> +       || mode_multiple % nregs_multiple)
> +     abort ();
> + #endif
> +   return (!(y_offset % (mode_multiple / nregs_multiple)));
> + }
> 
> I think you're right about the condition.
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Make reload to avoid invalid subregs
  2003-04-12 14:55                     ` Eric Botcazou
  2003-04-12 17:45                       ` Jan Hubicka
@ 2003-04-12 17:55                       ` Jan Hubicka
  2003-04-17 22:32                         ` Richard Henderson
  1 sibling, 1 reply; 41+ messages in thread
From: Jan Hubicka @ 2003-04-12 17:55 UTC (permalink / raw)
  To: Eric Botcazou, gcc-patches, rth; +Cc: Jan Hubicka, gcc

Hi,
this patch should fix number of PRs related to invalid subregs
(like (subreg:SI (xmm) 4)) and reloads that often leads to silent
misscompilations.  New function subreg_representable_p is introduced
to verify all constraints under wich we would produce wrong code
and reload now deals with it.  Some code already did exist in reload to
cope with this but part of it was ifdeffed out by Dave and the other
part is still SUBREG_WORD centric.  I would like to remove it in
followup patch but would like to do so step by step.

Bootstrapped/regtested mainline on i386 and PPC
Ok for mainline/3.3/3.2?

Honza

Sat Apr 12 07:47:53 MST 2003  Jan Hubicka  <jh@suse.cz>
			      Eric Botcazou
	* emit-rtl.c (subreg_hard_regno): Check subreg_offset_representable_p
	* reload.c (reload_inner_reg_of_subreg): Likewise.
	(find_reloads): Likewise; remove #if 0 hunk we replace.
	* rtl.h (subreg_representable_p): Declare.
	* rtlanal.c (subreg_representable_p): New function.
Index: emit-rtl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/emit-rtl.c,v
retrieving revision 1.303
diff -c -3 -p -r1.303 emit-rtl.c
*** emit-rtl.c	20 Nov 2002 21:52:58 -0000	1.303
--- emit-rtl.c	12 Apr 2003 14:46:26 -0000
*************** subreg_hard_regno (x, check_mode)
*** 925,931 ****
      abort ();
    if (check_mode && ! HARD_REGNO_MODE_OK (base_regno, GET_MODE (reg)))
      abort ();
! 
    /* Catch non-congruent offsets too.  */
    byte_offset = SUBREG_BYTE (x);
    if ((byte_offset % GET_MODE_SIZE (mode)) != 0)
--- 925,935 ----
      abort ();
    if (check_mode && ! HARD_REGNO_MODE_OK (base_regno, GET_MODE (reg)))
      abort ();
! #ifdef ENABLE_CHECKING
!   if (!subreg_offset_representable_p (REGNO (reg), GET_MODE (reg),
! 			  	      SUBREG_BYTE (x), mode))
!     abort ();
! #endif
    /* Catch non-congruent offsets too.  */
    byte_offset = SUBREG_BYTE (x);
    if ((byte_offset % GET_MODE_SIZE (mode)) != 0)
Index: reload.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/reload.c,v
retrieving revision 1.199.2.6
diff -c -3 -p -r1.199.2.6 reload.c
*** reload.c	26 Mar 2003 07:53:57 -0000	1.199.2.6
--- reload.c	12 Apr 2003 14:46:27 -0000
*************** reload_inner_reg_of_subreg (x, mode, out
*** 819,824 ****
--- 819,831 ----
        || REGNO (inner) >= FIRST_PSEUDO_REGISTER)
      return 0;
  
+   if (!subreg_offset_representable_p
+ 	(REGNO (SUBREG_REG (x)),
+ 		GET_MODE (SUBREG_REG (x)),
+ 		SUBREG_BYTE (x),
+ 		GET_MODE (x)))
+     return 1;
+ 
    /* If INNER is not ok for MODE, then INNER will need reloading.  */
    if (! HARD_REGNO_MODE_OK (subreg_regno (x), mode))
      return 1;
*************** find_reloads (insn, replace, ind_levels,
*** 2869,2874 ****
--- 2876,2887 ----
  	      if (GET_CODE (SUBREG_REG (operand)) == REG
  		  && REGNO (SUBREG_REG (operand)) < FIRST_PSEUDO_REGISTER)
  		{
+ 		  if (!subreg_offset_representable_p
+ 			(REGNO (SUBREG_REG (operand)),
+ 			 GET_MODE (SUBREG_REG (operand)),
+ 			 SUBREG_BYTE (operand),
+ 			 GET_MODE (operand)))
+ 		     force_reload = 1;
  		  offset += subreg_regno_offset (REGNO (SUBREG_REG (operand)),
  						 GET_MODE (SUBREG_REG (operand)),
  						 SUBREG_BYTE (operand),
*************** find_reloads (insn, replace, ind_levels,
*** 2924,2949 ****
  			  )
  #endif
  		      )
- 		  /* This following hunk of code should no longer be
- 		     needed at all with SUBREG_BYTE.  If you need this
- 		     code back, please explain to me why so I can
- 		     fix the real problem.  -DaveM */
- #if 0
- 		  /* Subreg of a hard reg which can't handle the subreg's mode
- 		     or which would handle that mode in the wrong number of
- 		     registers for subregging to work.  */
- 		  || (GET_CODE (operand) == REG
- 		      && REGNO (operand) < FIRST_PSEUDO_REGISTER
- 		      && ((GET_MODE_SIZE (operand_mode[i]) <= UNITS_PER_WORD
- 			   && (GET_MODE_SIZE (GET_MODE (operand))
- 			       > UNITS_PER_WORD)
- 			   && ((GET_MODE_SIZE (GET_MODE (operand))
- 				/ UNITS_PER_WORD)
- 			       != HARD_REGNO_NREGS (REGNO (operand),
- 						    GET_MODE (operand))))
- 			  || ! HARD_REGNO_MODE_OK (REGNO (operand) + offset,
- 						   operand_mode[i])))
- #endif
  		  )
  		force_reload = 1;
  	    }
--- 2937,2942 ----
Index: rtl.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtl.h,v
retrieving revision 1.375.2.4
diff -c -3 -p -r1.375.2.4 rtl.h
*** rtl.h	24 Mar 2003 17:59:36 -0000	1.375.2.4
--- rtl.h	12 Apr 2003 14:46:28 -0000
*************** extern unsigned int subreg_regno_offset 
*** 1033,1038 ****
--- 1033,1042 ----
  							 enum machine_mode, 
  							 unsigned int, 
  							 enum machine_mode));
+ extern bool subreg_representable_p 	PARAMS ((unsigned int, 
+ 							 enum machine_mode, 
+ 							 unsigned int, 
+ 							 enum machine_mode));
  extern unsigned int subreg_regno 	PARAMS ((rtx));
  
  /* 1 if RTX is a subreg containing a reg that is already known to be
Index: rtlanal.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtlanal.c,v
retrieving revision 1.140.4.2
diff -c -3 -p -r1.140.4.2 rtlanal.c
*** rtlanal.c	11 Feb 2003 20:47:56 -0000	1.140.4.2
--- rtlanal.c	12 Apr 2003 14:46:28 -0000
*************** subreg_regno_offset (xregno, xmode, offs
*** 3159,3164 ****
--- 3159,3225 ----
    return (y_offset / (mode_multiple / nregs_multiple)) * nregs_ymode;
  }
  
+ /* This function returns true when the offset is representable via
+    subreg_offset in the given regno.
+    xregno - A regno of an inner hard subreg_reg (or what will become one).
+    xmode  - The mode of xregno.
+    offset - The byte offset.
+    ymode  - The mode of a top level SUBREG (or what may become one).  */
+ bool
+ subreg_offset_representable_p (xregno, xmode, offset, ymode)
+      unsigned int xregno;
+      enum machine_mode xmode;
+      unsigned int offset;
+      enum machine_mode ymode;
+ {
+   int nregs_xmode, nregs_ymode;
+   int mode_multiple, nregs_multiple;
+   int y_offset;
+ 
+   if (xregno >= FIRST_PSEUDO_REGISTER)
+     abort ();
+ 
+   nregs_xmode = HARD_REGNO_NREGS (xregno, xmode);
+   nregs_ymode = HARD_REGNO_NREGS (xregno, ymode);
+ 
+   /* Lowpart subregs (including paradoxical) are always valid.  */
+   if (offset == subreg_lowpart_offset (ymode, xmode))
+     return true;
+ 
+ #ifdef ENABLE_CHECKING
+   /* This should always pass, otherwise we don't know how to verify the
+      constraint. 
+ 
+      These conditions may be relaxed but subreg_offset would need to be
+      redesigned.  */
+   if (GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)
+       || GET_MODE_SIZE (ymode) % nregs_ymode
+       || mode_for_size (GET_MODE_SIZE (ymode) / nregs_ymode,
+ 	      		MODE_INT, 0) == VOIDmode
+       || nregs_xmode % nregs_ymode)
+     abort ();
+ #endif
+ 
+   /* The XMODE value can be seen as an vector of NREGS_XMODE
+      values.  The subreg must represent an lowpart of given field.
+      Compute what field it is.  */
+   offset -= subreg_lowpart_offset (mode_for_size (GET_MODE_SIZE (ymode)
+ 			  			  / nregs_ymode,
+ 						  MODE_INT, 0), xmode);
+ 
+   /* size of ymode must not be greater than the size of xmode.  */
+   mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode);
+ 
+   y_offset = offset / GET_MODE_SIZE (ymode);
+   nregs_multiple =  nregs_xmode / nregs_ymode;
+ #ifdef ENABLE_CHECKING
+   if (offset % GET_MODE_SIZE (ymode)
+       || mode_multiple % nregs_multiple)
+     abort ();
+ #endif
+   return (!(y_offset % (mode_multiple / nregs_multiple)));
+ }
+ 
  /* Return the final regno that a subreg expression refers to.  */
  unsigned int
  subreg_regno (x)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-12 17:45                       ` Jan Hubicka
@ 2003-04-13 19:57                         ` Eric Botcazou
  2003-04-13 20:04                           ` Jan Hubicka
  0 siblings, 1 reply; 41+ messages in thread
From: Eric Botcazou @ 2003-04-13 19:57 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc

> > The function appears to assume that reg XREGNO supports mode YMODE. Is
> > that by design? If so, I think you should mention it in the header
> > comment.
>
> Hmm, I am not certain what should happen there.  We can either verify it
> here or the place we are doing so already...

I think we should verify that the subreg is really representable. For low 
parts, it's HARD_REGNO_MODE_OK (xregno, ymode) == 1. Are you sure that we 
should not verify that HARD_REGNO_MODE_OK (xregno + offset, ymode) == 1
at the very end of the function either, like in the original code?

> > I think you should mention that the second "if" catches paradoxical
> > subregs too.
>
> Hmm, you are right. Then we can remove the first if too :)

Even for big-endian platforms? I think the first "if" in the original code 
catches paradoxical subregs on big-endian platforms and the second "if" 
catches them on little-endian platforms.

> No, I am actually trying to compute the size of the one ymode register
> in order to be able to compensate the lowpart operation.
> Imagine (subreg:QI (reg:DI) 3) is valid on little endian with
> nregs_ymode == 2

I presume you mean big-endian? And nregs_xmode == 2?

Then wouldn't the following formula be better?

offset -= subreg_lowpart_offset (mode_for_size (GET_MODE_SIZE (ymode)
                                                 / nregs_ymode, MODE_INT, 0),
                                         mode_for_size (GET_MODE_SIZE (xmode)
                                                / nregs_xmode, MODE_INT, 0));

It looks certainly more homogeneous than the former.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  2003-04-13 19:57                         ` Eric Botcazou
@ 2003-04-13 20:04                           ` Jan Hubicka
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Hubicka @ 2003-04-13 20:04 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jan Hubicka, gcc

> > > The function appears to assume that reg XREGNO supports mode YMODE. Is
> > > that by design? If so, I think you should mention it in the header
> > > comment.
> >
> > Hmm, I am not certain what should happen there.  We can either verify it
> > here or the place we are doing so already...
> 
> I think we should verify that the subreg is really representable. For low 
> parts, it's HARD_REGNO_MODE_OK (xregno, ymode) == 1. Are you sure that we 
> should not verify that HARD_REGNO_MODE_OK (xregno + offset, ymode) == 1
> at the very end of the function either, like in the original code?

My original plan was to check only cases where subreg_regno_offset would
reuslt in misscompilation.  In the case HARD_REGNO_MODE_OK is false or
CLASS_CANNOT_CHANGE_MODE is set, we would catch it elsewhere, but I
agree that we can add it as incemental step.
> 
> > > I think you should mention that the second "if" catches paradoxical
> > > subregs too.
> >
> > Hmm, you are right. Then we can remove the first if too :)
> 
> Even for big-endian platforms? I think the first "if" in the original code 
> catches paradoxical subregs on big-endian platforms and the second "if" 
> catches them on little-endian platforms.

Paradoxical subreg has always 0 offset and the function to compute
offset knows how to return 0 in this case so it will work.
> 
> > No, I am actually trying to compute the size of the one ymode register
> > in order to be able to compensate the lowpart operation.
> > Imagine (subreg:QI (reg:DI) 3) is valid on little endian with
> > nregs_ymode == 2
> 
> I presume you mean big-endian? And nregs_xmode == 2?
> 
> Then wouldn't the following formula be better?
> 
> offset -= subreg_lowpart_offset (mode_for_size (GET_MODE_SIZE (ymode)
>                                                  / nregs_ymode, MODE_INT, 0),
>                                          mode_for_size (GET_MODE_SIZE (xmode)
>                                                 / nregs_xmode, MODE_INT, 0));
> 
> It looks certainly more homogeneous than the former.

This would result in same value as when the outer mode is bigger than
word, it must be multiple of it and the original code would do
paradoxical subreg and return 0, while this code would do subregs of
equal sizes.

Honza
> 
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Make reload to avoid invalid subregs
  2003-04-12 17:55                       ` Make reload to avoid invalid subregs Jan Hubicka
@ 2003-04-17 22:32                         ` Richard Henderson
  0 siblings, 0 replies; 41+ messages in thread
From: Richard Henderson @ 2003-04-17 22:32 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Eric Botcazou, gcc-patches, gcc

On Sat, Apr 12, 2003 at 04:52:57PM +0200, Jan Hubicka wrote:
> 	* emit-rtl.c (subreg_hard_regno): Check subreg_offset_representable_p
> 	* reload.c (reload_inner_reg_of_subreg): Likewise.
> 	(find_reloads): Likewise; remove #if 0 hunk we replace.
> 	* rtl.h (subreg_representable_p): Declare.
> 	* rtlanal.c (subreg_representable_p): New function.

Ok.


r~

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  1999-09-02  2:15   ` Andreas Schwab
@ 1999-09-30 18:02     ` Andreas Schwab
  0 siblings, 0 replies; 41+ messages in thread
From: Andreas Schwab @ 1999-09-30 18:02 UTC (permalink / raw)
  To: law; +Cc: gcc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1542 bytes --]

Jeffrey A Law <law@cygnus.com> writes:

|> I'll note this doesn't occur in the mainline tree, but I have no idea if that
|> is because the bug has actually been fixed or because we're just not triggering
|> the problem anymore.

I have been able to reproduce the bug with CVS-yesterday, but not right
now.  I think this is due to the ia32 merge, due to the change in the
subdi3 pattern.  With the yesterday's compiler the offending insn looked
like this before global alloc:

(insn 83 77 85 (parallel[ 
            (set (reg/v:DI 26)
                (minus:DI (reg:DI 31)
                    (reg:DI 38)))
            (clobber (scratch:SI))
        ] ) 215 {subdi3} (insn_list 77 (nil))
    (expr_list:REG_DEAD (reg:DI 38)
        (expr_list:REG_UNUSED (scratch:SI)
            (nil))))

with (reg:DI 31) being equivalent to the const_double.  Today I'm getting
this:

(insn 251 74 80 (set (reg/v:DI 28)
        (reg:DI 35)) 66 {*movdi_2} (nil)
    (nil))

(insn 80 251 82 (parallel [
            (set (reg/v:DI 28)
                (minus:DI (reg/v:DI 28)
                    (reg:DI 41)))
            (clobber (reg:CC 17 flags))
        ]) 179 {subdi3} (insn_list 74 (nil))
    (expr_list:REG_DEAD (reg:DI 41)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

where (reg:DI 35) now contains the const_double.

Andreas.

-- 
Andreas Schwab                                  "And now for something
schwab@suse.de                                   completely different."
SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  1999-09-02  0:32 ` Jeffrey A Law
  1999-09-02  2:15   ` Andreas Schwab
@ 1999-09-30 18:02   ` Jeffrey A Law
  1 sibling, 0 replies; 41+ messages in thread
From: Jeffrey A Law @ 1999-09-30 18:02 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: gcc

  In message < jeyaeq4p4n.fsf@hawking.suse.de >you write:
[ ... ]

  > $ b=/cvs/test/i686-linux/egcs/gcc; $b/xgcc -B$b/ -O2 -fpic -S random.c -da
  > random.c: In function `Default_RandInt':
  > random.c:34: Internal compiler error in `change_address', at emit-rtl.c:151
  > 6
  > Please submit a full bug report.
  > See <URL: http://www.gnu.org/software/gcc/faq.html#bugreport > for instructio
  > ns.
  > 
[ ... ]
  > During global alloc register 31 is replaced by (mem:DI (symbol_ref)), but
  > a symbol_ref is not a valid memory operand due to PIC.  In other words,
  > register 31 is _not_ equivalent to what insn 256 claims.
Note, REG_EQUIV indicates that a pseudo is equivalent to a particular VALUE,
not whether or not the REG_EQUIV expression can be directly substituted into
the insn without additional work.

So insn 256 is fine.

I'll note this doesn't occur in the mainline tree, but I have no idea if that
is because the bug has actually been fixed or because we're just not triggering
the problem anymore.

jeff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Reload bug
  1999-09-01  8:40 Andreas Schwab
  1999-09-02  0:32 ` Jeffrey A Law
@ 1999-09-30 18:02 ` Andreas Schwab
  1 sibling, 0 replies; 41+ messages in thread
From: Andreas Schwab @ 1999-09-30 18:02 UTC (permalink / raw)
  To: gcc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1993 bytes --]

The following function generates a compiler crash in final:

$ cat random.c
unsigned int GetRandom (void);

long long
Default_RandInt (long long Max)
{
  long long f, m, r, a, b;
  long long s;

  do
    {
      m = Max;
      s = 0;
      f = 1;
      while (m > 1)
	{
	  if (m >= 4294967296)
	    r = GetRandom ();
	  else
	    {
	      a = 4294967296 - 4294967296 % m;
	      do
		b = GetRandom ();
	      while (!(b < a));
	      r = b % m;
	    }
	  s += r * f;
	  f = f * 4294967296;
	  m = (m - 1) / 4294967296 + 1;
	}
    }
  while (!(s < Max || Max == 0));

  return s;
}
$ b=/cvs/test/i686-linux/egcs/gcc; $b/xgcc -B$b/ -O2 -fpic -S random.c -da
random.c: In function `Default_RandInt':
random.c:34: Internal compiler error in `change_address', at emit-rtl.c:1516
Please submit a full bug report.
See <URL: http://www.gnu.org/software/gcc/faq.html#bugreport > for instructions.

The problem is in the following two insns:

(insn 256 254 10 (set (reg:DI 31)
        (mem/u:DI (plus:SI (reg:SI 3 %ebx)
                (const (unspec[ 
                            (symbol_ref/u:SI ("*.LC0"))
                        ]  7))) 0)) 79 {movdi+1} (nil)
    (expr_list:REG_EQUIV (const_double (mem/u:DI (symbol_ref/u:SI ("*.LC0")) 0) 0 [0x0] 1 [0x1] 0 [0x0])
        (nil)))

(insn 83 77 85 (parallel[ 
            (set (reg/v:DI 26)
                (minus:DI (reg:DI 31)
                    (reg:DI 38)))
            (clobber (scratch:SI))
        ] ) 215 {subdi3} (insn_list 77 (nil))
    (expr_list:REG_DEAD (reg:DI 38)
        (expr_list:REG_UNUSED (scratch:SI)
            (nil))))

During global alloc register 31 is replaced by (mem:DI (symbol_ref)), but
a symbol_ref is not a valid memory operand due to PIC.  In other words,
register 31 is _not_ equivalent to what insn 256 claims.

-- 
Andreas Schwab                                  "And now for something
schwab@suse.de                                   completely different."
SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  1999-09-02  0:32 ` Jeffrey A Law
@ 1999-09-02  2:15   ` Andreas Schwab
  1999-09-30 18:02     ` Andreas Schwab
  1999-09-30 18:02   ` Jeffrey A Law
  1 sibling, 1 reply; 41+ messages in thread
From: Andreas Schwab @ 1999-09-02  2:15 UTC (permalink / raw)
  To: law; +Cc: gcc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1542 bytes --]

Jeffrey A Law <law@cygnus.com> writes:

|> I'll note this doesn't occur in the mainline tree, but I have no idea if that
|> is because the bug has actually been fixed or because we're just not triggering
|> the problem anymore.

I have been able to reproduce the bug with CVS-yesterday, but not right
now.  I think this is due to the ia32 merge, due to the change in the
subdi3 pattern.  With the yesterday's compiler the offending insn looked
like this before global alloc:

(insn 83 77 85 (parallel[ 
            (set (reg/v:DI 26)
                (minus:DI (reg:DI 31)
                    (reg:DI 38)))
            (clobber (scratch:SI))
        ] ) 215 {subdi3} (insn_list 77 (nil))
    (expr_list:REG_DEAD (reg:DI 38)
        (expr_list:REG_UNUSED (scratch:SI)
            (nil))))

with (reg:DI 31) being equivalent to the const_double.  Today I'm getting
this:

(insn 251 74 80 (set (reg/v:DI 28)
        (reg:DI 35)) 66 {*movdi_2} (nil)
    (nil))

(insn 80 251 82 (parallel [
            (set (reg/v:DI 28)
                (minus:DI (reg/v:DI 28)
                    (reg:DI 41)))
            (clobber (reg:CC 17 flags))
        ]) 179 {subdi3} (insn_list 74 (nil))
    (expr_list:REG_DEAD (reg:DI 41)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

where (reg:DI 35) now contains the const_double.

Andreas.

-- 
Andreas Schwab                                  "And now for something
schwab@suse.de                                   completely different."
SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Reload bug
  1999-09-01  8:40 Andreas Schwab
@ 1999-09-02  0:32 ` Jeffrey A Law
  1999-09-02  2:15   ` Andreas Schwab
  1999-09-30 18:02   ` Jeffrey A Law
  1999-09-30 18:02 ` Andreas Schwab
  1 sibling, 2 replies; 41+ messages in thread
From: Jeffrey A Law @ 1999-09-02  0:32 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: gcc

  In message < jeyaeq4p4n.fsf@hawking.suse.de >you write:
[ ... ]

  > $ b=/cvs/test/i686-linux/egcs/gcc; $b/xgcc -B$b/ -O2 -fpic -S random.c -da
  > random.c: In function `Default_RandInt':
  > random.c:34: Internal compiler error in `change_address', at emit-rtl.c:151
  > 6
  > Please submit a full bug report.
  > See <URL: http://www.gnu.org/software/gcc/faq.html#bugreport > for instructio
  > ns.
  > 
[ ... ]
  > During global alloc register 31 is replaced by (mem:DI (symbol_ref)), but
  > a symbol_ref is not a valid memory operand due to PIC.  In other words,
  > register 31 is _not_ equivalent to what insn 256 claims.
Note, REG_EQUIV indicates that a pseudo is equivalent to a particular VALUE,
not whether or not the REG_EQUIV expression can be directly substituted into
the insn without additional work.

So insn 256 is fine.

I'll note this doesn't occur in the mainline tree, but I have no idea if that
is because the bug has actually been fixed or because we're just not triggering
the problem anymore.

jeff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Reload bug
@ 1999-09-01  8:40 Andreas Schwab
  1999-09-02  0:32 ` Jeffrey A Law
  1999-09-30 18:02 ` Andreas Schwab
  0 siblings, 2 replies; 41+ messages in thread
From: Andreas Schwab @ 1999-09-01  8:40 UTC (permalink / raw)
  To: gcc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1993 bytes --]

The following function generates a compiler crash in final:

$ cat random.c
unsigned int GetRandom (void);

long long
Default_RandInt (long long Max)
{
  long long f, m, r, a, b;
  long long s;

  do
    {
      m = Max;
      s = 0;
      f = 1;
      while (m > 1)
	{
	  if (m >= 4294967296)
	    r = GetRandom ();
	  else
	    {
	      a = 4294967296 - 4294967296 % m;
	      do
		b = GetRandom ();
	      while (!(b < a));
	      r = b % m;
	    }
	  s += r * f;
	  f = f * 4294967296;
	  m = (m - 1) / 4294967296 + 1;
	}
    }
  while (!(s < Max || Max == 0));

  return s;
}
$ b=/cvs/test/i686-linux/egcs/gcc; $b/xgcc -B$b/ -O2 -fpic -S random.c -da
random.c: In function `Default_RandInt':
random.c:34: Internal compiler error in `change_address', at emit-rtl.c:1516
Please submit a full bug report.
See <URL: http://www.gnu.org/software/gcc/faq.html#bugreport > for instructions.

The problem is in the following two insns:

(insn 256 254 10 (set (reg:DI 31)
        (mem/u:DI (plus:SI (reg:SI 3 %ebx)
                (const (unspec[ 
                            (symbol_ref/u:SI ("*.LC0"))
                        ]  7))) 0)) 79 {movdi+1} (nil)
    (expr_list:REG_EQUIV (const_double (mem/u:DI (symbol_ref/u:SI ("*.LC0")) 0) 0 [0x0] 1 [0x1] 0 [0x0])
        (nil)))

(insn 83 77 85 (parallel[ 
            (set (reg/v:DI 26)
                (minus:DI (reg:DI 31)
                    (reg:DI 38)))
            (clobber (scratch:SI))
        ] ) 215 {subdi3} (insn_list 77 (nil))
    (expr_list:REG_DEAD (reg:DI 38)
        (expr_list:REG_UNUSED (scratch:SI)
            (nil))))

During global alloc register 31 is replaced by (mem:DI (symbol_ref)), but
a symbol_ref is not a valid memory operand due to PIC.  In other words,
register 31 is _not_ equivalent to what insn 256 claims.

-- 
Andreas Schwab                                  "And now for something
schwab@suse.de                                   completely different."
SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2003-04-17 21:44 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-08 18:52 Reload bug Eric Botcazou
2003-04-08 23:16 ` Jan Hubicka
2003-04-09  0:44   ` Jan Hubicka
2003-04-09  7:00     ` Eric Botcazou
2003-04-09  3:00   ` Eric Botcazou
2003-04-09  9:49     ` Jan Hubicka
2003-04-09  8:57   ` Eric Botcazou
2003-04-09  9:45     ` Jan Hubicka
2003-04-09  9:50       ` Eric Botcazou
2003-04-09 14:52         ` Jan Hubicka
2003-04-09 18:10           ` Eric Botcazou
2003-04-09 19:15             ` Jan Hubicka
2003-04-10 14:25           ` Eric Botcazou
2003-04-10 16:31             ` Jan Hubicka
2003-04-10 16:35               ` Jan Hubicka
2003-04-10 20:21             ` Eric Botcazou
2003-04-10 20:43               ` Jan Hubicka
2003-04-11 14:44                 ` Eric Botcazou
2003-04-11 17:49                   ` Jan Hubicka
2003-04-11 18:09                   ` Jan Hubicka
2003-04-11 19:01                   ` Jan Hubicka
2003-04-11 19:07                   ` Jan Hubicka
2003-04-12 14:55                     ` Eric Botcazou
2003-04-12 17:45                       ` Jan Hubicka
2003-04-13 19:57                         ` Eric Botcazou
2003-04-13 20:04                           ` Jan Hubicka
2003-04-12 17:55                       ` Make reload to avoid invalid subregs Jan Hubicka
2003-04-17 22:32                         ` Richard Henderson
2003-04-10 20:51               ` Reload bug Dale Johannesen
2003-04-09  9:13 ` Eric Botcazou
2003-04-09 11:25   ` Jan Hubicka
2003-04-09 12:04     ` Eric Botcazou
2003-04-09 18:05       ` Jan Hubicka
2003-04-09 18:26         ` Eric Botcazou
2003-04-09 21:23         ` Richard Henderson
  -- strict thread matches above, loose matches on Subject: below --
1999-09-01  8:40 Andreas Schwab
1999-09-02  0:32 ` Jeffrey A Law
1999-09-02  2:15   ` Andreas Schwab
1999-09-30 18:02     ` Andreas Schwab
1999-09-30 18:02   ` Jeffrey A Law
1999-09-30 18:02 ` Andreas Schwab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).