public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Non-representable subregs of subword pseudoregs
@ 2004-11-13 13:17 Eric Botcazou
  2004-11-13 13:45 ` Jan Hubicka
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Eric Botcazou @ 2004-11-13 13:17 UTC (permalink / raw)
  To: gcc; +Cc: Jan Hubicka

Hello,

I've run into a surprising (to me) behavior of the reload pass: it apparently 
doesn't know how to reload non-representable subregs of subword pseudoregs.
For example on SPARC64:

(insn 22 21 23 1 (set (reg:SI 109 [ D.1130 ])
        (subreg:SI (reg:DI 126) 0)) 51 {*movsi_insn} (nil)
    (nil))

leads to

Reloads for insn # 22
Reload 0: reload_in (SI) = (reg:SI 1 %g1)
	GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), can't combine
	reload_in_reg: (subreg:SI (reg:DI 1 %g1 [126]) 0)
	reload_reg_rtx: (reg:SI 5 %g5)

(insn 61 21 22 1 (set (reg:SI 5 %g5)
        (reg:SI 1 %g1)) 51 {*movsi_insn} (nil)
    (nil))

(insn 22 61 23 1 (set (reg:SI 4 %g4 [orig:109 D.1130 ] [109])
        (reg:SI 5 %g5)) 51 {*movsi_insn} (nil)
    (nil))

which is of course wrong since the target is big-endian.


A reload is scheduled because find_reload correctly diagnoses that the subreg 
is not representable:

	  while (GET_CODE (operand) == SUBREG)
	    {
	      /* Offset only matters when operand is a REG and
		 it is a hard reg.  This is because it is passed
		 to reg_fits_class_p if it is a REG and all pseudos
		 return 0 from that function.  */
	      if (REG_P (SUBREG_REG (operand))
		  && REGNO (SUBREG_REG (operand)) < FIRST_PSEUDO_REGISTER)
		{
		  if (!subreg_offset_representable_p
			(REGNO (SUBREG_REG (operand)),
			 GET_MODE (SUBREG_REG (operand)),
			 SUBREG_BYTE (operand),
			 GET_MODE (operand)))
		     force_reload = 1;

However, reload_inner_reg_of_subreg returns false for it (for 2 reasons: it's 
an input reload and the function only looks at output reloads, and subregs of 
subword regs are rejected) so push_reload silently turns the invalid subreg 
into the low part:

  /* If IN is a SUBREG of a hard register, make a new REG.  This
     simplifies some of the cases below.  */

  if (in != 0 && GET_CODE (in) == SUBREG && REG_P (SUBREG_REG (in))
      && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER
      && ! dont_remove_subreg)
    in = gen_rtx_REG (GET_MODE (in), subreg_regno (in));


Is this a known limitation of the reload pass or am I missing anything else?
[Jan, I've CCed you because you devised subreg_offset_representable_p.]


The non-representable subreg is created by emit_move_insn_1 when it is invoked 
on pseudo regs and CONCATs with complex modes.  Note that it already has a 
special provision in order not to create non-representable subregs of hard 
regs:

	  /* If this is a complex value with each part being smaller than a
	     word, the usual calling sequence will likely pack the pieces into
	     a single register.  Unfortunately, SUBREG of hard registers only
	     deals in terms of words, so we have a problem converting input
	     arguments to the CONCAT of two registers that is used elsewhere
	     for complex values.  If this is before reload, we can copy it into
	     memory and reload.  FIXME, we should see about using extract and
	     insert on integer registers, but complex short and complex char
	     variables should be rarely used.  */
	  if (GET_MODE_BITSIZE (mode) < 2 * BITS_PER_WORD
	      && (reload_in_progress | reload_completed) == 0)
	    {
	      int packed_dest_p
		= (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER);
	      int packed_src_p
		= (REG_P (y) && REGNO (y) < FIRST_PSEUDO_REGISTER);

	      if (packed_dest_p || packed_src_p)

but it doesn't apply here because we are dealing with pseudo regs.


How is this supposed to work?  Is it simply illegal to create non-lowpart 
subregs of subword pseudo regs?

Thanks in advance.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [cft] subreg validation patch
@ 2004-11-14 18:03 Ulrich Weigand
  2004-11-15 20:12 ` Richard Henderson
  0 siblings, 1 reply; 21+ messages in thread
From: Ulrich Weigand @ 2004-11-14 18:03 UTC (permalink / raw)
  To: rth; +Cc: gcc, ebotcazou, jh

Richard Henderson wrote:

>Could everyone please try the following on their favorite platform
>and report back with new failures?

I've tried your patch on s390-ibm-linux and s390x-ibm-linux.  On both
platforms, bootstrap fails when building stage2, because of a bug in
regmove_optimize.  The code there tries to convert a paradoxical
subreg move into a move with the opposite subreg on the other side,
but completely messes up the subreg offset.  The following patch
appears to fix that problem:

Index: gcc/regmove.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/regmove.c,v
retrieving revision 1.164
diff -c -p -r1.164 regmove.c
*** gcc/regmove.c	9 Oct 2004 19:19:24 -0000	1.164
--- gcc/regmove.c	14 Nov 2004 02:56:01 -0000
*************** regmove_optimize (rtx f, int nregs, FILE
*** 1150,1159 ****
  		  && GET_MODE_SIZE (GET_MODE (dst))
  		     >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst))))
  		{
- 		  src_subreg
- 		    = gen_rtx_SUBREG (GET_MODE (SUBREG_REG (dst)),
- 				      src, SUBREG_BYTE (dst));
  		  dst = SUBREG_REG (dst);
  		}
  	      if (!REG_P (dst)
  		  || REGNO (dst) < FIRST_PSEUDO_REGISTER)
--- 1150,1157 ----
  		  && GET_MODE_SIZE (GET_MODE (dst))
  		     >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst))))
  		{
  		  dst = SUBREG_REG (dst);
+ 		  src_subreg = gen_lowpart (GET_MODE (dst), src);
  		}
  	      if (!REG_P (dst)
  		  || REGNO (dst) < FIRST_PSEUDO_REGISTER)


With that patch, bootstrap and regtest complete on s390-ibm-linux
without regressions.  On s390x, however, I get a couple of 
regressions:
> FAIL: gcc.c-torture/execute/20040709-1.c compilation,  -O1
> UNRESOLVED: gcc.c-torture/execute/20040709-1.c execution,  -O1
> FAIL: gcc.c-torture/execute/20040709-1.c compilation,  -O2
> UNRESOLVED: gcc.c-torture/execute/20040709-1.c execution,  -O2
> FAIL: gcc.c-torture/execute/930930-2.c compilation,  -O1
> UNRESOLVED: gcc.c-torture/execute/930930-2.c execution,  -O1
> FAIL: gcc.c-torture/execute/930930-2.c compilation,  -O2
> UNRESOLVED: gcc.c-torture/execute/930930-2.c execution,  -O2
> FAIL: gcc.c-torture/execute/930930-2.c compilation,  -O3 -fomit-frame-pointer
> UNRESOLVED: gcc.c-torture/execute/930930-2.c execution,  -O3 -fomit-frame-pointer
> FAIL: gcc.c-torture/execute/930930-2.c compilation,  -O3 -g
> UNRESOLVED: gcc.c-torture/execute/930930-2.c execution,  -O3 -g
> FAIL: gcc.c-torture/execute/930930-2.c compilation,  -Os
> UNRESOLVED: gcc.c-torture/execute/930930-2.c execution,  -Os
> FAIL: gcc.c-torture/execute/stdarg-3.c compilation,  -O1
> UNRESOLVED: gcc.c-torture/execute/stdarg-3.c execution,  -O1
> FAIL: gcc.c-torture/execute/stdarg-3.c compilation,  -O2
> UNRESOLVED: gcc.c-torture/execute/stdarg-3.c execution,  -O2
> FAIL: gcc.c-torture/execute/stdarg-3.c compilation,  -O3 -fomit-frame-pointer
> UNRESOLVED: gcc.c-torture/execute/stdarg-3.c execution,  -O3 -fomit-frame-pointer
> FAIL: gcc.c-torture/execute/stdarg-3.c compilation,  -O3 -fomit-frame-pointer -funroll-loops
> UNRESOLVED: gcc.c-torture/execute/stdarg-3.c execution,  -O3 -fomit-frame-pointer -funroll-loops
> FAIL: gcc.c-torture/execute/stdarg-3.c compilation,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
> UNRESOLVED: gcc.c-torture/execute/stdarg-3.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
> FAIL: gcc.c-torture/execute/stdarg-3.c compilation,  -O3 -g
> UNRESOLVED: gcc.c-torture/execute/stdarg-3.c execution,  -O3 -g
> FAIL: gcc.c-torture/execute/stdarg-3.c compilation,  -Os
> UNRESOLVED: gcc.c-torture/execute/stdarg-3.c execution,  -Os

These all have the same reason: this piece of code in store_bit_field

  if (bitpos == 0
      && bitsize == GET_MODE_BITSIZE (fieldmode)
      && (!MEM_P (op0)
          ? ((GET_MODE_SIZE (fieldmode) >= UNITS_PER_WORD
             || GET_MODE_SIZE (GET_MODE (op0)) == GET_MODE_SIZE (fieldmode))
             && byte_offset % GET_MODE_SIZE (fieldmode) == 0)
          : (! SLOW_UNALIGNED_ACCESS (fieldmode, MEM_ALIGN (op0))
             || (offset * BITS_PER_UNIT % bitsize == 0
                 && MEM_ALIGN (op0) % GET_MODE_BITSIZE (fieldmode) == 0))))
    {
      if (GET_MODE (op0) != fieldmode)
        {
          if (GET_CODE (op0) == SUBREG)
            {
              /* Else we've got some float mode source being extracted
                 into a different float mode destination -- this
                 combination of subregs results in Severe Tire
                 Damage.  */
              gcc_assert (GET_MODE (SUBREG_REG (op0)) == fieldmode
                          || GET_MODE_CLASS (fieldmode) == MODE_INT
                          || GET_MODE_CLASS (fieldmode) == MODE_PARTIAL_INT);
              op0 = SUBREG_REG (op0);
            }
          if (REG_P (op0))
            op0 = gen_rtx_SUBREG (fieldmode, op0, byte_offset);
          else
            op0 = adjust_address (op0, fieldmode, offset);
        }
      emit_move_insn (op0, value);
      return value;
    }

calls gen_rtx_SUBREG to form a DFmode subreg of one of the subwords of
a TImode register.

I'm not sure how to fix that, not least because I still don't understand
why this subreg is now supposed to be invalid in the first place ...


One other question about your patch:

+   /* For hard registers, we already have all of these rules collected in
+      subreg_offset_representable_p.  */
+   if (reg && REG_P (reg) && HARD_REGISTER_P (reg))
+     return subreg_offset_representable_p (REGNO (reg), imode, offset, omode);

This check is not quite the same as the check done in simplify_subreg.
In particular, you neither check for REG_CANNOT_CHANGE_MODE_P, nor is
there the special checks for frame-related registers.  While I'm not
quite sure I fully understand those checks, shouldn't the rules be
the same here as in simplify_subreg?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  Linux on zSeries Development
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [cft] subreg validation patch
@ 2004-11-16 14:03 Ulrich Weigand
  2004-11-16 21:02 ` Richard Henderson
  2004-11-17  1:01 ` Richard Henderson
  0 siblings, 2 replies; 21+ messages in thread
From: Ulrich Weigand @ 2004-11-16 14:03 UTC (permalink / raw)
  To: rth; +Cc: gcc, gcc-patches, ebotcazou, jh

I wrote:

>Richard Henderson <rth@redhat.com> wrote on 11/15/2004 11:02:32 PM:
>
>> Hmm.  How about lowpart_subreg?  At least then we won't wind up
>> with nested subregs.
>
>src is guaranteed to satisfy REG_P at this point, so this shouldn't
>be an issue.  (Also, regmove.c already uses gen_lowpart_SUBREG at
>another similar location.)

On second thought, you're probably right: gen_lowpart_SUBREG may
abort (with your change), and we certainly don't want that here.
So I've changed the patch to use lowpart_subreg, and just ignore
that move if the subreg cannot be generated.

Bootstraped/regtested on s390-ibm-linux and s390x-ibm-linux
(with your patch applied as well).

OK for mainline?

Bye,
Ulrich


ChangeLog:

	* regmove.c (regmove_optimize): Use lowpart_subreg instead of
	gen_rtx_SUBREG with incorrect offset to compute SRC_SUBREG.

Index: gcc/regmove.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/regmove.c,v
retrieving revision 1.164
diff -c -p -r1.164 regmove.c
*** gcc/regmove.c	9 Oct 2004 19:19:24 -0000	1.164
--- gcc/regmove.c	16 Nov 2004 11:44:46 -0000
*************** regmove_optimize (rtx f, int nregs, FILE
*** 1150,1159 ****
  		  && GET_MODE_SIZE (GET_MODE (dst))
  		     >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst))))
  		{
- 		  src_subreg
- 		    = gen_rtx_SUBREG (GET_MODE (SUBREG_REG (dst)),
- 				      src, SUBREG_BYTE (dst));
  		  dst = SUBREG_REG (dst);
  		}
  	      if (!REG_P (dst)
  		  || REGNO (dst) < FIRST_PSEUDO_REGISTER)
--- 1150,1160 ----
  		  && GET_MODE_SIZE (GET_MODE (dst))
  		     >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst))))
  		{
  		  dst = SUBREG_REG (dst);
+ 		  src_subreg = lowpart_subreg (GET_MODE (dst),
+ 					       src, GET_MODE (src));
+ 		  if (!src_subreg)
+ 		    continue;
  		}
  	      if (!REG_P (dst)
  		  || REGNO (dst) < FIRST_PSEUDO_REGISTER)

-- 
  Dr. Ulrich Weigand
  Linux on zSeries Development
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2004-11-16 21:02 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-13 13:17 Non-representable subregs of subword pseudoregs Eric Botcazou
2004-11-13 13:45 ` Jan Hubicka
2004-11-13 22:55 ` Richard Sandiford
2004-11-13 23:31   ` Eric Botcazou
2004-11-13 23:06 ` [cft] subreg validation patch Richard Henderson
2004-11-13 23:50   ` Eric Botcazou
2004-11-14  0:02     ` Richard Henderson
2004-11-14  0:21       ` Eric Botcazou
2004-11-14  4:41         ` Richard Henderson
2004-11-14  5:12     ` Richard Henderson
2004-11-14  6:14       ` Geert Bosch
2004-11-14  7:46       ` Richard Henderson
2004-11-14 10:50         ` Eric Botcazou
2004-11-14 18:03 Ulrich Weigand
2004-11-15 20:12 ` Richard Henderson
2004-11-15 22:02   ` Ulrich Weigand
2004-11-15 22:08     ` Richard Henderson
2004-11-15 23:22       ` Ulrich Weigand
2004-11-16 14:03 Ulrich Weigand
2004-11-16 21:02 ` Richard Henderson
2004-11-17  1:01 ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).