Re: [PATCH] adds powerpc-*-freebsd? to mainline

public inbox for gcc-patches@gcc.gnu.org

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
  2001-11-13 15:03     ` Geoff Keating
@ 2001-11-13 15:03       ` David Edelsohn
  2001-11-13 15:03         ` David O'Brien
  2001-11-13 15:03         ` Richard Henderson
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2001-11-13 15:03 UTC (permalink / raw)
  To: Geoff Keating; +Cc: shebs, obrien, gcc-patches

>>>>> Geoff Keating writes:

>> Not to mention that this is ugly and cumbersome.

Geoff> I couldn't think of a better solution, neither could David, and I
Geoff> find the solution grows on me.  Isn't this exactly what header files
Geoff> are for?

	If this now is a required file for config/rs6000/sysv4.h, why not
#include it right before the macros are used instead of adding it
throughout config.gcc?  If the two no longer can be separated, it makes no
sense for everyone to repeat that boilerplate.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
  2001-11-13 15:03   ` David Edelsohn
  2001-11-13 15:03     ` Geoff Keating
@ 2001-11-13 15:03     ` David O'Brien
  1 sibling, 0 replies; 875+ messages in thread
From: David O'Brien @ 2001-11-13 15:03 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, shebs, gcc-patches

On Wed, Nov 21, 2001 at 12:19:25AM -0500, David Edelsohn wrote:
> Geoff> Why do you object?
> 
> 	Because a FreeBSD maintainer for any architecture now can break
> all PowerPC ELF targets.  Not to mention that this is ugly and
> cumbersome. 

Only with a syntax error.  The values used from freebsd-spec.h cannot
break any other target.
 
-- 
-- David  (obrien@FreeBSD.org)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
@ 2001-11-13 15:03 David Edelsohn
  2001-11-13 15:03 ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2001-11-13 15:03 UTC (permalink / raw)
  To: Geoff Keating, Stan Shebs, David O'Brien; +Cc: gcc-patches

>>>>> Geoff writes:

Geoff> I don't think that would be a problem.

	I have a problem with that.  Adding freebsd-spec.h to the generic
PowerPC ELF targets is fine, but adding it to every ELF target is not
acceptable to me.  I do not agree with the config.gcc changes in this
patch and it needs to be revised or reverted.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
  2001-11-13 15:03   ` David Edelsohn
@ 2001-11-13 15:03     ` Geoff Keating
  2001-11-13 15:03       ` David Edelsohn
  2001-11-13 15:03     ` David O'Brien
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2001-11-13 15:03 UTC (permalink / raw)
  To: dje; +Cc: shebs, obrien, gcc-patches

> cc: shebs@apple.com, obrien@FreeBSD.org, gcc-patches@gcc.gnu.org
> Date: Wed, 21 Nov 2001 00:19:25 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff Keating writes:
> 
> Geoff> Why do you object?
> 
> 	Because a FreeBSD maintainer for any architecture now can break
> all PowerPC ELF targets.

Well,
(a) so let's tell them not to do that.
(b) I really don't think it's likely that they'll be able to somehow
    break things in a file that defines five macros, unless
    of course they forget to 'cvs add' it (hint hint David)
(c) and anyway, the automated regression tester will catch them.

>  Not to mention that this is ugly and cumbersome.

I couldn't think of a better solution, neither could David, and I
find the solution grows on me.  Isn't this exactly what header files
are for?

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
  2001-11-13 15:03 [PATCH] adds powerpc-*-freebsd? to mainline David Edelsohn
@ 2001-11-13 15:03 ` Geoff Keating
  2001-11-13 15:03   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2001-11-13 15:03 UTC (permalink / raw)
  To: dje; +Cc: shebs, obrien, gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Tue, 20 Nov 2001 22:56:40 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff writes:
> 
> Geoff> I don't think that would be a problem.
> 
> 	I have a problem with that.  Adding freebsd-spec.h to the generic
> PowerPC ELF targets is fine, but adding it to every ELF target is not
> acceptable to me.  I do not agree with the config.gcc changes in this
> patch and it needs to be revised or reverted.

Why do you object?

The only differences from the similar definitions for Linux (and,
once, Solaris) in srv4.h are that this is in a separate file and is
shared with other freebsd OSs.

It's worth noting that there are only 'generic' PowerPC ELF targets.
All ELF targets just take the generic target and change the defaults,
but all the functionality is always there.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
  2001-11-13 15:03       ` David Edelsohn
  2001-11-13 15:03         ` David O'Brien
@ 2001-11-13 15:03         ` Richard Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2001-11-13 15:03 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, shebs, obrien, gcc-patches

On Wed, Nov 21, 2001 at 01:03:05AM -0500, David Edelsohn wrote:
> 	If this now is a required file for config/rs6000/sysv4.h, why not
> #include it right before the macros are used instead of adding it
> throughout config.gcc?

Putting it in config.gcc means that Makefile dependencies come out right.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
  2001-11-13 15:03       ` David Edelsohn
@ 2001-11-13 15:03         ` David O'Brien
  2001-11-13 15:03         ` Richard Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: David O'Brien @ 2001-11-13 15:03 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, shebs, gcc-patches

On Wed, Nov 21, 2001 at 01:03:05AM -0500, David Edelsohn wrote:
> 	If this now is a required file for config/rs6000/sysv4.h, why not
> #include it right before the macros are used instead of adding it
> throughout config.gcc?

That is the style I was taught by Jeff Law.  He spoke as if it was a
rather absolute rule.  (maybe I misunderstood him)

-- 
-- David  (obrien@FreeBSD.org)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] adds powerpc-*-freebsd? to mainline
  2001-11-13 15:03 ` Geoff Keating
@ 2001-11-13 15:03   ` David Edelsohn
  2001-11-13 15:03     ` Geoff Keating
  2001-11-13 15:03     ` David O'Brien
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2001-11-13 15:03 UTC (permalink / raw)
  To: Geoff Keating; +Cc: shebs, obrien, gcc-patches

>>>>> Geoff Keating writes:

Geoff> Why do you object?

	Because a FreeBSD maintainer for any architecture now can break
all PowerPC ELF targets.  Not to mention that this is ugly and
cumbersome. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* PATCH, rs6000 (alpha?) long const
@ 2001-12-29  7:03 Tom Rix
  2001-12-29 11:40 ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Tom Rix @ 2001-12-29  7:03 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 552 bytes --]

This patch fixes an incorrect add of a large immediate in
rs6000_emit_set_long_const.    d2 can not be added because it is greater
than 2^16.

To reproduce the problem, run the testsuite with the -maix64 switch.  A
large number of internal compiler errors are generated.   The one I
looked at specifically was execute/920410-1.c

With the patch, -maix64 c failures are reduced from 216  to 115.

Looks like alpha also uses this logic.  Can someone familiar with alpha
comment on the old logic's correctness?

Tom

--
Tom Rix
GCC Engineer
trix@redhat.com



[-- Attachment #2: long_const-001fa.patch --]
[-- Type: text/plain, Size: 2939 bytes --]

2001-12-28  Tom Rix  <trix@redhat.com>

	* config/rs6000/rs6000.c (rs6000_emit_set_long_const): Fix invalid add 
	of large immediate.
 
diff -rcp gcc-old/gcc/config/rs6000/rs6000.c gcc/gcc/config/rs6000/rs6000.c
*** gcc-old/gcc/config/rs6000/rs6000.c	Fri Dec 28 10:00:02 2001
--- gcc/gcc/config/rs6000/rs6000.c	Fri Dec 28 22:59:44 2001
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2002,2008 ****
      }
    else
      {
!       HOST_WIDE_INT d1, d2, d3, d4;
  
    /* Decompose the entire word */
  #if HOST_BITS_PER_WIDE_INT >= 64
--- 2002,2008 ----
      }
    else
      {
!       HOST_WIDE_INT d1, d2, d2_s, d3, d4;
  
    /* Decompose the entire word */
  #if HOST_BITS_PER_WIDE_INT >= 64
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2011,2016 ****
--- 2011,2017 ----
        d1 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
        c1 -= d1;
        d2 = ((c1 & 0xffffffff) ^ 0x80000000) - 0x80000000;
+       d2_s = d2 >> 16;
        c1 = (c1 - d2) >> 32;
        d3 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
        c1 -= d3;
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2021,2026 ****
--- 2022,2028 ----
        d1 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
        c1 -= d1;
        d2 = ((c1 & 0xffffffff) ^ 0x80000000) - 0x80000000;
+       d2_s = d2 >> 16;
        if (c1 != d2)
  	abort ();
        c2 += (d2 < 0);
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2039,2056 ****
  	    emit_move_insn (dest,
  			    gen_rtx_PLUS (DImode, dest, GEN_INT (d3)));
  	}
!       else
  	emit_move_insn (dest, GEN_INT (d3));
  
        /* Shift it into place */
        if (d3 != 0 || d4 != 0)
! 	emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
  
        /* Add in the low bits.  */
        if (d2 != 0)
! 	emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d2)));
        if (d1 != 0)
! 	emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d1)));
      }
  
    return dest;
--- 2041,2074 ----
  	    emit_move_insn (dest,
  			    gen_rtx_PLUS (DImode, dest, GEN_INT (d3)));
  	}
!       else if (d3 != 0)
  	emit_move_insn (dest, GEN_INT (d3));
  
        /* Shift it into place */
        if (d3 != 0 || d4 != 0)
!  	if (d2 != 0) 
!  	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (16)));
!  	else 
! 	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
  
        /* Add in the low bits.  */
        if (d2 != 0)
! 	{
! 	  if (d3 != 0 || d4 != 0)
! 	    {
!  	      emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, 
!  						  GEN_INT (d2_s)));
!  	      emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest,  
!  						    GEN_INT (16)));
! 	    }
! 	  else
! 	    emit_move_insn (dest, GEN_INT (d2));
! 	}
        if (d1 != 0)
! 	if (d2 != 0 || d3 != 0 || d4 != 0)
! 	  emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d1)));
! 	else
! 	  emit_move_insn (dest, GEN_INT (d1));
      }
  
    return dest;

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const
  2001-12-29  7:03 PATCH, rs6000 (alpha?) long const Tom Rix
@ 2001-12-29 11:40 ` Richard Henderson
  2001-12-29 12:40   ` Tom Rix
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2001-12-29 11:40 UTC (permalink / raw)
  To: Tom Rix; +Cc: gcc-patches

On Sat, Dec 29, 2001 at 09:22:23AM -0600, Tom Rix wrote:
> Looks like alpha also uses this logic.  Can someone familar with alpha
> comment on the old logic's correctness?

The 'L' constraint is for exactly this sort of constant.  This
is supported by the "ldah" insn on alpha, and "addis" on ppc64.



r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const
  2001-12-29 11:40 ` Richard Henderson
@ 2001-12-29 12:40   ` Tom Rix
  2001-12-29 17:00     ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Tom Rix @ 2001-12-29 12:40 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

Richard Henderson wrote:

> On Sat, Dec 29, 2001 at 09:22:23AM -0600, Tom Rix wrote:
> > Looks like alpha also uses this logic.  Can someone familar with alpha
> > comment on the old logic's correctness?
>
> The 'L' constraint is for exactly this sort of constant.  This
> is supported by the "ldah" insn on alpha, and "addis" on ppc64.
>

Yes.  The internal errors complained about not meeting the insn's 'L'
constraint.   In the case I looked at,  rs6000_emit_set_long_const was
called from rs6000_emit_allocate_stack, via try_split.    I believe the
usual constraint checking is bypassed, and I was worried alpha would have
the same problem.

Tom

>
> r~

--
Tom Rix
GCC Engineer
trix@redhat.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const
  2001-12-29 12:40   ` Tom Rix
@ 2001-12-29 17:00     ` Richard Henderson
  2001-12-29 18:37       ` Tom Rix
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2001-12-29 17:00 UTC (permalink / raw)
  To: Tom Rix; +Cc: gcc-patches

On Sat, Dec 29, 2001 at 03:33:35PM -0600, Tom Rix wrote:
> Yes.  The internal errors complained about not meeting the insn's 'L'
> contraint.

What is the number that didn't match?

> I believe the usual contraint checking is bypassed...

No.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const
  2001-12-29 17:00     ` Richard Henderson
@ 2001-12-29 18:37       ` Tom Rix
  2001-12-29 20:45         ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Tom Rix @ 2001-12-29 18:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

Richard Henderson wrote:

> On Sat, Dec 29, 2001 at 03:33:35PM -0600, Tom Rix wrote:
> > Yes.  The internal errors complained about not meeting the insn's 'L'
> > contraint.
>
> What is the number that didn't match?

-131072
split from -160128

Tom

--
Tom Rix
GCC Engineer
trix@redhat.com



^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const
  2001-12-29 18:37       ` Tom Rix
@ 2001-12-29 20:45         ` Richard Henderson
  2001-12-29 21:24           ` PATCH, rs6000 (alpha?) long const --verbose Tom Rix
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2001-12-29 20:45 UTC (permalink / raw)
  To: Tom Rix; +Cc: gcc-patches

On Sat, Dec 29, 2001 at 09:06:39PM -0600, Tom Rix wrote:
> > What is the number that didn't match?
> 
> -131072

That's 0xffff_ffff_fffe_0000, which is definitely valid.  So
why doesn't that insn get recognized?


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread
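
As a rough illustration of the 'L' constraint discussed above (a signed
16-bit constant shifted left 16 bits, the form that addis and ldah can
encode), the C sketch below tests whether a value has that shape.  The
helper name fits_shifted_simm16 is hypothetical and is not GCC's own
predicate; it only models the arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: return true if V is a signed
   16-bit value shifted left by 16 bits.  */
bool
fits_shifted_simm16 (int64_t v)
{
  /* The low 16 bits must all be clear.  */
  if ((v & 0xffff) != 0)
    return false;

  /* The rest must fit in the signed 32-bit range.  */
  return v >= -(INT64_C (1) << 31) && v < (INT64_C (1) << 31);
}
```

Under this model -131072 (0xffff_ffff_fffe_0000) qualifies, which matches
the observation that the constant itself is valid and that the failure
must come from somewhere else.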

* Re: PATCH, rs6000 (alpha?) long const --verbose
  2001-12-29 20:45         ` Richard Henderson
@ 2001-12-29 21:24           ` Tom Rix
  2001-12-29 23:01             ` Richard Henderson
  2001-12-29 23:02             ` Richard Henderson
  0 siblings, 2 replies; 875+ messages in thread
From: Tom Rix @ 2001-12-29 21:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

Richard Henderson wrote:

> On Sat, Dec 29, 2001 at 09:06:39PM -0600, Tom Rix wrote:
> > > What is the number that didn't match?
> >
> > -131072
>
> That's 0xffff_ffff_fffe_0000, which is definitely valid.  So
> why doesn't that insn get recognized?

I believe rs6000_emit_allocate_stack is called after the usual
recognition/splitting.   From inspection of the function:

if (TARGET_UPDATE)
    {
      if (size > 32767)          << --checking for large stack
        {
          /* Need a note here so that try_split doesn't get confused.  */
          if (get_last_insn() == NULL_RTX)
            emit_note (0, NOTE_INSN_DELETED);
          insn = emit_move_insn (tmp_reg, todec);
          try_split (PATTERN (insn), insn, 0);   << --splitting the r1 = r1 - bigstack
          todec = tmp_reg;
        }

    }

If this was a normal add (a = a - 131072),  yes the constraint in adddi3
would have worked.    I am guessing the check for the size means this
add happens after normal constraint checking/splitting  and the split is
being done here the hard way.

The split that try_split uses is :

;; Split a load of a large constant into the appropriate five-instruction
;; sequence.  Handle anything in a constant number of insns.
;; When non-easy constants can go in the TOC, this should use
;; easy_fp_constant predicate.
(define_split
  [(set (match_operand:DI 0 "gpc_reg_operand" "")
        (match_operand:DI 1 "const_int_operand" ""))]
  "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
  [(set (match_dup 0) (match_dup 2))
   (set (match_dup 0) (plus:DI (match_dup 0) (match_dup 3)))]
  "
{ rtx tem = rs6000_emit_set_const (operands[0], DImode, operands[1], 5);

  if (tem == operands[0])
    DONE;
  else
    FAIL;
}")

Here is the rtl dump of the stack allocation from flow2 :


(note 3 2 5 NOTE_INSN_FUNCTION_BEG)

(note 5 3 24 30033180 NOTE_INSN_BLOCK_BEG)

;; Start of basic block 0, registers live: 1 [1] 65 [lr]
(note 24 5 29 [bb 0] NOTE_INSN_BASIC_BLOCK)

(insn 29 24 31 (set (reg:DI 0 r0)
        (reg:DI 65 lr)) -1 (nil)
    (expr_list:REG_DEAD (reg:DI 65 lr)
        (nil)))

(insn/f 31 29 35 (set (mem:DI (plus:DI (reg/f:DI 1 r1)
                (const_int 16 [0x10])) [0 S8 A64])
        (reg:DI 0 r0)) -1 (insn_list 29 (nil))
    (expr_list:REG_DEAD (reg:DI 0 r0)
        (expr_list:REG_FRAME_RELATED_EXPR (set (mem:DI (plus:DI (reg/f:DI 1 r1)
                        (const_int 16 [0x10])) [0 S8 A64])
                (reg:DI 65 lr))
            (nil))))

-- start of rs6000_emit_set_long_const

<<<         emit_move_insn (dest, GEN_INT (d3));

(insn 35 31 37 (set (reg:DI 0 r0)
        (const_int 0 [0x0])) -1 (nil)
    (nil))

<<<   if (d2 != 0)
        emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d2)));

(insn 37 35 39 (set (reg:DI 0 r0)            <<< this is the problem >>>
        (plus:DI (reg:DI 0 r0)
            (const_int -131072 [0xfffffffffffe0000]))) -1 (insn_list 35 (nil))
    (nil))


<<<      if (d1 != 0)
        emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d1)));

(insn 39 37 40 (set (reg:DI 0 r0)
        (plus:DI (reg:DI 0 r0)
            (const_int -29040 [0xffffffffffff8e90]))) -1 (insn_list 37 (nil))
    (nil))

-- end of rs6000_emit_set_long const

(insn/f 40 39 41 (parallel[
            (set (mem:DI (plus:DI (reg/f:DI 1 r1)
                        (reg:DI 0 r0)) [0 S8 A64])
                (reg/f:DI 1 r1))
            (set (reg/f:DI 1 r1)
                (plus:DI (reg/f:DI 1 r1)
                    (reg:DI 0 r0)))
        ] ) -1 (insn_list 39 (nil))
    (expr_list:REG_DEAD (reg:DI 0 r0)
        (expr_list:REG_FRAME_RELATED_EXPR (set (reg/f:DI 1 r1)
                (plus:DI (reg/f:DI 1 r1)
                    (const_int -160112 [0xfffffffffffd8e90])))
            (nil))))

(note 41 40 9 NOTE_INSN_PROLOGUE_END)

>
>

The new, improved add emits:

(insn 35 31 37 (set (reg:DI 0 r0)
        (const_int -131072 [0xfffffffffffe0000])) -1 (nil)
    (nil))

(insn 37 35 38 (set (reg:DI 0 r0)
        (plus:DI (reg:DI 0 r0)
            (const_int -29040 [0xffffffffffff8e90]))) -1 (insn_list 35 (nil))
    (nil))

Which is fairly close to the way it used to work before
rs6000_emit_set_long_const.  If you want to see that,  I will have to
set output to --verbose --verbose :)~

If rs6000_emit_set_long_const is called early,  the error of the add
will be hidden by splitting it again as if it were a normal addition of
a large immediate.  If it is called late, the error shows up as a fatal
compiler error.

Tom

--
Tom Rix
GCC Engineer
trix@redhat.com



^ permalink raw reply	[flat|nested] 875+ messages in thread
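
The two constants in the RTL dump above come from the sign-adjusted
decomposition in rs6000_emit_set_long_const.  The C sketch below
reproduces the first steps of that arithmetic outside the compiler.  The
struct and function names are hypothetical, the remainder is obtained by
division instead of the patch's ">> 32" so as to avoid shifting a
negative value, and overflow at the extremes of the 64-bit range is
ignored.

```c
#include <stdint.h>

/* Hypothetical container for the pieces of a 64-bit constant.  */
struct long_const_parts
{
  int64_t d1;    /* sign-extended low 16 bits */
  int64_t d2;    /* sign-extended bits 16..31, low 16 bits zero */
  int64_t high;  /* what is left for the upper 32 bits */
};

struct long_const_parts
split_long_const (int64_t c1)
{
  struct long_const_parts p;

  /* Same formula as in the patch: sign-extend the low 16 bits.  */
  p.d1 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
  c1 -= p.d1;

  /* Sign-extend the low 32 bits of the remainder.  */
  p.d2 = ((c1 & INT64_C (0xffffffff)) ^ INT64_C (0x80000000))
         - INT64_C (0x80000000);

  /* c1 - d2 is a multiple of 2^32, so this division is exact.  */
  p.high = (c1 - p.d2) / (INT64_C (1) << 32);
  return p;
}
```

For the stack adjustment of -160112 shown in the dump this yields
d1 = -29040, d2 = -131072 and nothing for the upper half, so the sequence
reduces to one move or add per non-zero piece.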

* Re: PATCH, rs6000 (alpha?) long const --verbose
  2001-12-29 21:24           ` PATCH, rs6000 (alpha?) long const --verbose Tom Rix
@ 2001-12-29 23:01             ` Richard Henderson
  2001-12-29 23:02             ` Richard Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2001-12-29 23:01 UTC (permalink / raw)
  To: Tom Rix; +Cc: gcc-patches

On Sat, Dec 29, 2001 at 11:42:37PM -0600, Tom Rix wrote:
>           try_split (PATTERN (insn), insn, 0);   << --splitting the r1 =

Hum.  I'm of the opinion that this is fairly dodgy.  Try this.


r~



Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.270
diff -c -p -d -r1.270 rs6000.c
*** rs6000.c	2001/12/29 09:07:56	1.270
--- rs6000.c	2001/12/30 06:49:38
*************** rs6000_emit_set_const (dest, mode, sourc
*** 1951,1963 ****
  {
    HOST_WIDE_INT c0, c1;
  
!   if (mode == QImode || mode == HImode || mode == SImode)
      {
-       if (dest == NULL)
-         dest = gen_reg_rtx (mode);
        emit_insn (gen_rtx_SET (VOIDmode, dest, source));
        return dest;
      }
  
    if (GET_CODE (source) == CONST_INT)
      {
--- 1951,1976 ----
  {
    HOST_WIDE_INT c0, c1;
  
!   if (dest == NULL)
!     dest = gen_reg_rtx (mode);
! 
!   if (num_insns_constant (source, mode) == 1)
      {
        emit_insn (gen_rtx_SET (VOIDmode, dest, source));
        return dest;
      }
+   else if (mode == SImode)
+     {
+       if (GET_CODE (source) != CONST_INT)
+ 	abort ();
+       c0 = INTVAL (source);
+       emit_insn (gen_rtx_SET (VOIDmode, dest,
+ 			      GEN_INT (c0 & ~ (HOST_WIDE_INT) 0xffff)));
+       emit_insn (gen_rtx_SET (VOIDmode, dest,
+ 			      gen_rtx_IOR (mode, dest,
+ 				      	   GEN_INT (c0 & 0xffff))));
+       return dest;
+     }
  
    if (GET_CODE (source) == CONST_INT)
      {
*************** rs6000_emit_allocate_stack (size, copy_r
*** 7755,7769 ****
    if (TARGET_UPDATE)
      {
        if (size > 32767)
! 	{
! 	  /* Need a note here so that try_split doesn't get confused.  */
! 	  if (get_last_insn() == NULL_RTX)
! 	    emit_note (0, NOTE_INSN_DELETED);
! 	  insn = emit_move_insn (tmp_reg, todec);
! 	  try_split (PATTERN (insn), insn, 0);
! 	  todec = tmp_reg;
! 	}
!       
        if (Pmode == SImode)
  	insn = emit_insn (gen_movsi_update (stack_reg, stack_reg, 
  					    todec, stack_reg));
--- 7768,7775 ----
    if (TARGET_UPDATE)
      {
        if (size > 32767)
! 	todec = rs6000_emit_set_const (tmp_reg, Pmode, todec, 5);
! 
        if (Pmode == SImode)
  	insn = emit_insn (gen_movsi_update (stack_reg, stack_reg, 
  					    todec, stack_reg));

^ permalink raw reply	[flat|nested] 875+ messages in thread
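
The SImode path in the patch above sets the destination to the constant
with its low 16 bits masked off and then ORs the low 16 bits back in.  A
minimal C sketch of that split follows; the function names are
hypothetical and the code only models the masking, not the RTL emission.

```c
#include <stdint.h>

/* c0 with the low 16 bits cleared, as in GEN_INT (c0 & ~0xffff).  */
int64_t
high_part (int64_t c0)
{
  return c0 & ~(int64_t) 0xffff;
}

/* The low 16 bits as an unsigned quantity, as in GEN_INT (c0 & 0xffff).  */
int64_t
low_part (int64_t c0)
{
  return c0 & 0xffff;
}

/* The two parts share no bits, so IOR rebuilds the original value.  */
int64_t
recombine (int64_t c0)
{
  return high_part (c0) | low_part (c0);
}
```

Because the low part is never negative here, no carry correction is
needed, unlike the PLUS-based decomposition where each piece is
sign-extended.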

* Re: PATCH, rs6000 (alpha?) long const --verbose
  2001-12-29 21:24           ` PATCH, rs6000 (alpha?) long const --verbose Tom Rix
  2001-12-29 23:01             ` Richard Henderson
@ 2001-12-29 23:02             ` Richard Henderson
  2002-01-01 12:21               ` PATCH, rs6000 (alpha?) long const take 2 Tom Rix
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2001-12-29 23:02 UTC (permalink / raw)
  To: Tom Rix; +Cc: gcc-patches

On Sat, Dec 29, 2001 at 11:42:37PM -0600, Tom Rix wrote:
> (insn 37 35 39 (set (reg:DI 0 r0)            <<< this is the problem >>>
> 
>         (plus:DI (reg:DI 0 r0)
>             (const_int -131072 [0xfffffffffffe0000]))) -1 (insn_list 35
> (nil))
>     (nil))

Actually, the real problem is that addis must be used
with some register other than r0.  So the temp register
chosen should be other than r0.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread
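
The r0 restriction can be modelled in a few lines of C.  In addi and
addis, a base register field of 0 means the literal value zero and not
the contents of r0.  The function name and parameters below are
hypothetical; this is a sketch of the instruction's arithmetic, not of
anything in GCC.

```c
#include <stdint.h>

/* Hypothetical model of addis: RA is the base register number,
   RA_CONTENTS is what that register currently holds, SIMM is the
   16-bit signed immediate.  */
int64_t
addis_value (int ra, int64_t ra_contents, int16_t simm)
{
  /* Register number 0 in the base field reads as zero.  */
  int64_t base = (ra == 0) ? 0 : ra_contents;

  /* The immediate is sign-extended and shifted left 16 bits.  */
  return base + (int64_t) simm * 65536;
}
```

With a base of r12 holding -29040 and an immediate of -2 the result is
-160112, whereas naming r0 as the base drops the register contents and
gives -131072.  This is why an add of a shifted immediate into r0 cannot
be expressed with addis.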

* Re: PATCH, rs6000 (alpha?) long const take 2
  2001-12-29 23:02             ` Richard Henderson
@ 2002-01-01 12:21               ` Tom Rix
  2002-01-01 13:46                 ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Tom Rix @ 2002-01-01 12:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 581 bytes --]

Richard Henderson wrote:

> On Sat, Dec 29, 2001 at 11:42:37PM -0600, Tom Rix wrote:
> > (insn 37 35 39 (set (reg:DI 0 r0)            <<< this is the problem >>>
> >
> >         (plus:DI (reg:DI 0 r0)
> >             (const_int -131072 [0xfffffffffffe0000]))) -1 (insn_list 35
> > (nil))
> >     (nil))
>
> Actually, the real problem is that addis must be used
> with some register other than r0.  So the temp register
> chosen should be other than r0.

Yes.
Here is the original patch, modified to use addis when the register is
not r0.

Tom


--
Tom Rix
GCC Engineer
trix@redhat.com



[-- Attachment #2: long_const-002fa.patch --]
[-- Type: text/plain, Size: 3307 bytes --]

2002-01-01  Tom Rix  <trix@redhat.com>

	* config/rs6000/rs6000.c (rs6000_emit_set_long_const): Fix for use by
	rs6000_emit_allocate_stack.

diff -rcp gcc-old/gcc/config/rs6000/rs6000.c gcc/gcc/config/rs6000/rs6000.c
*** gcc-old/gcc/config/rs6000/rs6000.c	Tue Jan  1 07:12:35 2002
--- gcc/gcc/config/rs6000/rs6000.c	Tue Jan  1 07:34:25 2002
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2002,2008 ****
      }
    else
      {
!       HOST_WIDE_INT d1, d2, d3, d4;
  
    /* Decompose the entire word */
  #if HOST_BITS_PER_WIDE_INT >= 64
--- 2002,2013 ----
      }
    else
      {
!       HOST_WIDE_INT d1, d2, d2_s, d3, d4;
! 
!       /* This function is called by rs6000_emit_allocate_stack after reload 
! 	 with a dest of r0.  r0 is an invalid register for addsi.  Use an addi 
! 	 and a shift instead.  */
!       int regnum = REGNO (dest);
  
    /* Decompose the entire word */
  #if HOST_BITS_PER_WIDE_INT >= 64
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2011,2016 ****
--- 2016,2022 ----
        d1 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
        c1 -= d1;
        d2 = ((c1 & 0xffffffff) ^ 0x80000000) - 0x80000000;
+       d2_s = d2 >> 16;
        c1 = (c1 - d2) >> 32;
        d3 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
        c1 -= d3;
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2021,2026 ****
--- 2027,2033 ----
        d1 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
        c1 -= d1;
        d2 = ((c1 & 0xffffffff) ^ 0x80000000) - 0x80000000;
+       d2_s = d2 >> 16;
        if (c1 != d2)
  	abort ();
        c2 += (d2 < 0);
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2039,2056 ****
  	    emit_move_insn (dest,
  			    gen_rtx_PLUS (DImode, dest, GEN_INT (d3)));
  	}
!       else
  	emit_move_insn (dest, GEN_INT (d3));
  
        /* Shift it into place */
        if (d3 != 0 || d4 != 0)
! 	emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
  
        /* Add in the low bits.  */
        if (d2 != 0)
! 	emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d2)));
        if (d1 != 0)
! 	emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d1)));
      }
  
    return dest;
--- 2046,2085 ----
  	    emit_move_insn (dest,
  			    gen_rtx_PLUS (DImode, dest, GEN_INT (d3)));
  	}
!       else if (d3 != 0)
  	emit_move_insn (dest, GEN_INT (d3));
  
        /* Shift it into place */
        if (d3 != 0 || d4 != 0)
!  	if (regnum == 0 && d2 != 0) 
!  	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (16)));
!  	else 
! 	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
  
        /* Add in the low bits.  */
        if (d2 != 0)
! 	{
! 	  if (d3 != 0 || d4 != 0)
! 	    {
! 	      if (regnum == 0)
! 		{
! 		  emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, 
! 						      GEN_INT (d2_s)));
! 		  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest,  
! 							GEN_INT (16)));
! 		}
! 	      else
! 		emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, 
! 						    GEN_INT (d2)));
! 	    }
! 	  else
! 	    emit_move_insn (dest, GEN_INT (d2));
! 	}
        if (d1 != 0)
! 	if (d2 != 0 || d3 != 0 || d4 != 0)
! 	  emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d1)));
! 	else
! 	  emit_move_insn (dest, GEN_INT (d1));
      }
  
    return dest;


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-01 12:21               ` PATCH, rs6000 (alpha?) long const take 2 Tom Rix
@ 2002-01-01 13:46                 ` Richard Henderson
  2002-01-02 13:20                   ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-01-01 13:46 UTC (permalink / raw)
  To: Tom Rix; +Cc: gcc-patches

On Tue, Jan 01, 2002 at 03:16:57PM -0600, Tom Rix wrote:
> 	* config/rs6000/rs6000.c (rs6000_emit_set_long_const): Fix for use by
> 	rs6000_emit_allocate_stack.

Ok.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-01 13:46                 ` Richard Henderson
@ 2002-01-02 13:20                   ` Geoff Keating
  2002-01-02 13:22                     ` Richard Henderson
  2002-01-02 20:44                     ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Geoff Keating @ 2002-01-02 13:20 UTC (permalink / raw)
  To: trix; +Cc: Richard Henderson, gcc-patches

Richard Henderson <rth@redhat.com> writes:

> On Tue, Jan 01, 2002 at 03:16:57PM -0600, Tom Rix wrote:
> > 	* config/rs6000/rs6000.c (rs6000_emit_set_long_const): Fix for use by
> > 	rs6000_emit_allocate_stack.
> 
> Ok.

No, not ok!

!       /* This function is called by rs6000_emit_allocate_stack after reload 
! 	 with a dest of r0.  r0 is an invalid register for addsi.  Use an addi 
! 	 and a shift instead.  */

The code sequence to load a value into r0 is like this:

	lis %r0,0x1234
	ori %r0,%r0,0x5678

If you already have a value in r0's high bits (and know the low bits
are 0), you can use

	oris %r0,%r0,0x9ABC
	ori %r0,%r0,0xDEF0

So the full 5-word load sequence looks like:

	lis %r0,0x1234
	ori %r0,%r0,0x5678
	sli %r0,%r0,32
	oris %r0,%r0,0x9ABC
	ori %r0,%r0,0xDEF0

There is no need for additional shift instructions.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread
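
The five-instruction sequence above can be checked with a small C model.
The functions below are hypothetical stand-ins for the instructions: lis
sign-extends its shifted immediate, while ori and oris zero-extend
theirs, which is what lets the sequence work without any correction
terms.

```c
#include <stdint.h>

/* lis: load the immediate shifted left 16 bits, sign-extended.  */
uint64_t
lis (int16_t simm)
{
  return (uint64_t) ((int64_t) simm * 65536);
}

/* ori: OR in a zero-extended 16-bit immediate.  */
uint64_t
ori (uint64_t ra, uint16_t uimm)
{
  return ra | uimm;
}

/* oris: OR in a zero-extended 16-bit immediate shifted left 16 bits.  */
uint64_t
oris (uint64_t ra, uint16_t uimm)
{
  return ra | ((uint64_t) uimm << 16);
}

/* The full sequence from the message, with the shift written in C.  */
uint64_t
load_const_sequence (void)
{
  uint64_t r0 = lis (0x1234);
  r0 = ori (r0, 0x5678);
  r0 <<= 32;
  r0 = oris (r0, 0x9ABC);
  r0 = ori (r0, 0xDEF0);
  return r0;
}
```

The model returns 0x123456789ABCDEF0.  Note that 0x9ABC has its top bit
set; a sign-extending add would need the upper half adjusted to
compensate, which the OR-based form avoids.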

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-02 13:20                   ` Geoff Keating
@ 2002-01-02 13:22                     ` Richard Henderson
  2002-01-02 20:44                     ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-01-02 13:22 UTC (permalink / raw)
  To: Geoff Keating; +Cc: trix, gcc-patches

On Wed, Jan 02, 2002 at 01:14:19PM -0800, Geoff Keating wrote:
> 	oris %r0,%r0,0x9ABC

Doh.  I forgot that oris didn't sign-extend the constant.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-02 13:20                   ` Geoff Keating
  2002-01-02 13:22                     ` Richard Henderson
@ 2002-01-02 20:44                     ` David Edelsohn
  2002-01-03  0:52                       ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-01-02 20:44 UTC (permalink / raw)
  To: Geoff Keating, trix, Richard Henderson; +Cc: gcc-patches

	Do we want to change rs6000_emit_set_long_const to use gen_rtx_IOR
instead of gen_rtx_PLUS? 

	Also, if explicitly allocating r0 was a problem, r12 should have
been used instead.  Adding extra shifts was wrong on a number of levels. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-02 20:44                     ` David Edelsohn
@ 2002-01-03  0:52                       ` Richard Henderson
  2002-01-03  8:06                         ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-01-03  0:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, trix, gcc-patches

On Wed, Jan 02, 2002 at 11:26:12PM -0500, David Edelsohn wrote:
> 	Do we want to change rs6000_emit_set_long_const to use get_rtx_IOR
> instead of gen_rtx_PLUS? 
> 
> 	Also, if explicitly allocting r0 was a problem, r12 should have
> been used instead.  Adding extra shifts was wrong on a number of levels. 

I think changing rs6000_emit_set_long_const so that it always
works is the right thing.  If you want to choose another register,
that's fine.

That said, IOR clearly works better than PLUS because of the r0 issue.
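
The r0 issue can be sketched as follows, assuming the usual PowerPC rule that an rA field of 0 in addi means the literal value 0 rather than the contents of r0, while ori reads whichever GPR it is given. The helper names below are hypothetical and only model the instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `addi rD,rA,SI`.  RA is the register number in the rA field
   and RA_VALUE is what that register currently holds.  When the field
   is 0 the hardware uses the literal 0 as the base, not r0.  */
uint64_t model_addi(unsigned ra, uint64_t ra_value, int16_t si)
{
  uint64_t base = (ra == 0) ? 0 : ra_value;
  return base + (uint64_t) (int64_t) si;
}

/* Model of `ori rA,rS,UI`: any GPR, including r0, is read normally.  */
uint64_t model_ori(uint64_t rs_value, uint16_t ui)
{
  return rs_value | ui;
}

/* Hypothetical usage: finish building 0x12345678 from a partial value.  */
void demo_r0_issue(void)
{
  uint64_t partial = 0x12340000;

  assert(model_addi(12, partial, 0x5678) == 0x12345678); /* r12 works.  */
  assert(model_addi(0, partial, 0x5678) == 0x5678);      /* r0 is lost.  */
  assert(model_ori(partial, 0x5678) == 0x12345678);      /* any GPR.  */
}
```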


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-03  0:52                       ` Richard Henderson
@ 2002-01-03  8:06                         ` David Edelsohn
  2002-01-04 12:04                           ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-01-03  8:06 UTC (permalink / raw)
  To: Richard Henderson, Geoff Keating, trix, gcc-patches

>>>>> Richard Henderson writes:

Richard> I think changing rs6000_emit_set_long_const so that it always
Richard> works is the right thing.  If you want to choose another register,
Richard> that's fine.

	rs6000_emit_set_long_const() does always work when called from the
MD file because the MD file register constraints ensure that the
instructions always will have appropriate registers.

	Only, *ONLY*, rs6000_emit_allocate_stack() forces r0 into a
pattern for which it was not designed:

  rtx tmp_reg = gen_rtx_REG (Pmode, 0);

rs6000_emit_allocate_stack() is the problem and what should be fixed, not
rs6000_emit_set_long_const().  rs6000_emit_set_long_const() was the
symptom and was correct.  Fix the problem, not the symptom.

Richard> That said, IOR clearly works better than PLUS because of the r0 issue.

	If rs6000_emit_allocate_stack() cannot be rewritten in terms of
r12 for this large stack frame case, *IT* should be modified to use
gen_iorsi3() instead of gen_addsi3() for that case, not modifying
rs6000_emit_set_long_const().

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-03  8:06                         ` David Edelsohn
@ 2002-01-04 12:04                           ` Geoff Keating
  2002-01-04 14:31                             ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-01-04 12:04 UTC (permalink / raw)
  To: dje; +Cc: rth, trix, gcc-patches

> Date: Thu, 03 Jan 2002 11:04:19 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> 	If rs6000_emit_allocate_stack() cannot be rewritten in terms of
> r12 for this large stack frame case, *IT* should be modified to use
> gen_iorsi3() instead of gen_addsi3() for that case, not modifying
> rs6000_emit_set_long_const().

Wouldn't it be better to improve rs6000_emit_set_long_const so that it
can handle r0 as a destination?  That would make it more useful as a
routine, and would mean that if some other piece of code needs to set
a register, which might be r0, to a constant,
rs6000_emit_set_long_const could be used without needing any changes.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 (alpha?) long const take 2
  2002-01-04 12:04                           ` Geoff Keating
@ 2002-01-04 14:31                             ` David Edelsohn
  2002-01-10 14:00                               ` PATCH, rs6000 long const take 3 Tom Rix
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-01-04 14:31 UTC (permalink / raw)
  To: Geoff Keating; +Cc: rth, trix, gcc-patches

>>>>> Geoff Keating writes:

Geoff> Wouldn't it be better to improve rs6000_emit_set_long_const so that it
Geoff> can handle r0 as a destination?  That would make it more useful as a
Geoff> routine, and would mean that if some other piece of code needs to set
Geoff> a register, which might be r0, to a constant,
Geoff> rs6000_emit_set_long_const could be used without needing any changes.

	Are you suggesting / recommending the function be implemented only
in terms of IOR (which works with any GPR) instead of PLUS or that the
function should try to choose between IOR and PLUS based on the most
efficient algorithm for the particular constant and register?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* PATCH, rs6000 long const take 3
  2002-01-04 14:31                             ` David Edelsohn
@ 2002-01-10 14:00                               ` Tom Rix
  2002-01-10 14:08                                 ` Richard Henderson
  2002-01-10 14:20                                 ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Tom Rix @ 2002-01-10 14:00 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, rth, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1056 bytes --]

David Edelsohn wrote:

> >>>>> Geoff Keating writes:
>
> Geoff> Wouldn't it be better to improve rs6000_emit_set_long_const so that it
> Geoff> can handle r0 as a destination?  That would make it more useful as a
> Geoff> routine, and would mean that if some other piece of code needs to set
> Geoff> a register, which might be r0, to a constant,
> Geoff> rs6000_emit_set_long_const could be used without needing any changes.
>
>         Are you suggesting / recommending the function be implemented only
> in terms of IOR (which works with any GPR) instead of PLUS

Yes.  Here is a patch to replace the PLUS with IOR

> or that the
> function should try to choose between IOR and PLUS based on the most
> efficient algorithm for the particular constant and register?

PLUS favors 0xffff_ffff_8xxx_xxxx and half words of 0xffff's.
IOR favors 0x0000_0000_7xxx_xxxx and half words of 0x0000's.

Other than that they are evenly matched.
I don't think there is enough of a win to check for the PLUS algorithm.
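
As a rough illustration of the IOR approach in the attached patch, the sketch below splits a 64-bit constant into four unsigned 16-bit pieces and rebuilds it with a sign-extended load, ORs, and shifts, following the same four cases. It is a standalone model and not the GCC code: build_const_ior, sext16, and sext32 are names invented here, and the instruction mnemonics in the comments are only a guess at what each step would roughly become.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 or 32 bits with the xor/subtract trick.  */
int64_t sext16(uint64_t v)
{
  return (int64_t) ((v & 0xffff) ^ 0x8000) - 0x8000;
}

int64_t sext32(uint64_t v)
{
  return (int64_t) ((v & 0xffffffffULL) ^ 0x80000000ULL) - 0x80000000LL;
}

/* Rebuild C using only a sign-extending load, IOR, and left shifts.  */
uint64_t build_const_ior(uint64_t c)
{
  uint64_t ud1 = c & 0xffff;
  uint64_t ud2 = (c >> 16) & 0xffff;
  uint64_t ud3 = (c >> 32) & 0xffff;
  uint64_t ud4 = (c >> 48) & 0xffff;
  uint64_t r;

  if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000))
      || (ud4 == 0 && ud3 == 0 && ud2 == 0 && !(ud1 & 0x8000)))
    {
      r = (uint64_t) sext16(ud1);          /* roughly: li */
    }
  else if ((ud4 == 0xffff && ud3 == 0xffff && (ud2 & 0x8000))
           || (ud4 == 0 && ud3 == 0 && !(ud2 & 0x8000)))
    {
      r = (uint64_t) sext32(ud2 << 16);    /* roughly: lis */
      if (ud1 != 0)
        r |= ud1;                          /* roughly: ori */
    }
  else if ((ud4 == 0xffff && (ud3 & 0x8000))
           || (ud4 == 0 && !(ud3 & 0x8000)))
    {
      r = (uint64_t) sext32(ud3 << 16);    /* roughly: lis */
      if (ud2 != 0)
        r |= ud2;                          /* roughly: ori */
      r <<= 16;                            /* shift left by 16 */
      if (ud1 != 0)
        r |= ud1;                          /* roughly: ori */
    }
  else
    {
      r = (uint64_t) sext32(ud4 << 16);    /* roughly: lis */
      if (ud3 != 0)
        r |= ud3;                          /* roughly: ori */
      r <<= 32;                            /* shift left by 32 */
      if (ud2 != 0)
        r |= ud2 << 16;                    /* roughly: oris */
      if (ud1 != 0)
        r |= ud1;                          /* roughly: ori */
    }
  return r;
}
```

Because every step after the first is an OR or a shift, none of them depends on the destination being a register other than r0.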

Tom


--
Tom Rix
GCC Engineer
trix@redhat.com



[-- Attachment #2: long_const-004fa.patch --]
[-- Type: text/plain, Size: 4898 bytes --]

2002-01-10  Tom Rix  <trix@redhat.com>

	* config/rs6000/rs6000.c (rs6000_emit_set_long_const): Use ior for 
	TARGET_POWERPC64.

diff -rcp gcc-old/gcc/config/rs6000/rs6000.c gcc/gcc/config/rs6000/rs6000.c
*** gcc-old/gcc/config/rs6000/rs6000.c	Fri Jan  4 19:51:37 2002
--- gcc/gcc/config/rs6000/rs6000.c	Thu Jan 10 10:46:43 2002
*************** rs6000_emit_set_long_const (dest, c1, c2
*** 2002,2087 ****
      }
    else
      {
!       HOST_WIDE_INT d1, d2, d2_s, d3, d4;
  
!       /* This function is called by rs6000_emit_allocate_stack after reload 
! 	 with a dest of r0.  r0 is an invalid register for addsi.  Use an addi 
! 	 and a shift instead.  */
!       int regnum = REGNO (dest);
! 
!   /* Decompose the entire word */
  #if HOST_BITS_PER_WIDE_INT >= 64
!       if (c2 != -(c1 < 0))
! 	abort ();
!       d1 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
!       c1 -= d1;
!       d2 = ((c1 & 0xffffffff) ^ 0x80000000) - 0x80000000;
!       d2_s = d2 >> 16;
!       c1 = (c1 - d2) >> 32;
!       d3 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
!       c1 -= d3;
!       d4 = ((c1 & 0xffffffff) ^ 0x80000000) - 0x80000000;
!       if (c1 != d4)
! 	abort ();
! #else
!       d1 = ((c1 & 0xffff) ^ 0x8000) - 0x8000;
!       c1 -= d1;
!       d2 = ((c1 & 0xffffffff) ^ 0x80000000) - 0x80000000;
!       d2_s = d2 >> 16;
!       if (c1 != d2)
! 	abort ();
!       c2 += (d2 < 0);
!       d3 = ((c2 & 0xffff) ^ 0x8000) - 0x8000;
!       c2 -= d3;
!       d4 = ((c2 & 0xffffffff) ^ 0x80000000) - 0x80000000;
!       if (c2 != d4)
! 	abort ();
  #endif
  
!       /* Construct the high word */
!       if (d4 != 0)
  	{
! 	  emit_move_insn (dest, GEN_INT (d4));
! 	  if (d3 != 0)
! 	    emit_move_insn (dest,
! 			    gen_rtx_PLUS (DImode, dest, GEN_INT (d3)));
  	}
-       else if (d3 != 0)
- 	emit_move_insn (dest, GEN_INT (d3));
  
!       /* Shift it into place */
!       if (d3 != 0 || d4 != 0)
!  	if (regnum == 0 && d2 != 0) 
!  	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (16)));
!  	else 
! 	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
  
!       /* Add in the low bits.  */
!       if (d2 != 0)
  	{
! 	  if (d3 != 0 || d4 != 0)
! 	    {
! 	      if (regnum == 0)
! 		{
! 		  emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, 
! 						      GEN_INT (d2_s)));
! 		  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest,  
! 							GEN_INT (16)));
! 		}
! 	      else
! 		emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, 
! 						    GEN_INT (d2)));
! 	    }
  	  else
! 	    emit_move_insn (dest, GEN_INT (d2));
  	}
-       if (d1 != 0)
- 	if (d2 != 0 || d3 != 0 || d4 != 0)
- 	  emit_move_insn (dest, gen_rtx_PLUS (DImode, dest, GEN_INT (d1)));
- 	else
- 	  emit_move_insn (dest, GEN_INT (d1));
      }
- 
    return dest;
  }
  
--- 2002,2071 ----
      }
    else
      {
!       HOST_WIDE_INT ud1, ud2, ud3, ud4;
  
!       ud1 = c1 & 0xffff;
!       ud2 = (c1 & 0xffff0000) >> 16;
  #if HOST_BITS_PER_WIDE_INT >= 64
!       c2 = c1 >> 32;
  #endif
+       ud3 = c2 & 0xffff;
+       ud4 = (c2 & 0xffff0000) >> 16;
  
!       if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000)) 
! 	  || (ud4 == 0 && ud3 == 0 && ud2 == 0 && ! (ud1 & 0x8000)))
  	{
! 	  if (ud1 & 0x8000)
! 	    emit_move_insn (dest, GEN_INT (((ud1  ^ 0x8000) -  0x8000)));
! 	  else
! 	    emit_move_insn (dest, GEN_INT (ud1));
  	}
  
!       else if ((ud4 == 0xffff && ud3 == 0xffff && (ud2 & 0x8000)) 
! 	       || (ud4 == 0 && ud3 == 0 && ! (ud2 & 0x8000)))
! 	{
! 	  if (ud2 & 0x8000)
! 	    emit_move_insn (dest, GEN_INT (((ud2 << 16) ^ 0x80000000) 
! 					   - 0x80000000));
! 	  else
! 	    emit_move_insn (dest, GEN_INT (ud2 << 16));
! 	  if (ud1)
! 	    emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
! 	}
!       else if ((ud4 == 0xffff && (ud3 & 0x8000)) 
! 	       || (ud4 == 0 && ! (ud3 & 0x8000)))
! 	{
! 	  if (ud3 & 0x8000)
! 	    emit_move_insn (dest, GEN_INT (((ud3 << 16) ^ 0x80000000) 
! 					   - 0x80000000));
! 	  else
! 	    emit_move_insn (dest, GEN_INT (ud3 << 16));
  
! 	  if (ud2)
! 	    emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud2)));
! 	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (16)));
! 	  if (ud1)
! 	    emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
! 	}
!       else 
  	{
! 	  if (ud4 & 0x8000)
! 	    emit_move_insn (dest, GEN_INT (((ud4 << 16) ^ 0x80000000) 
! 					   - 0x80000000));
  	  else
! 	    emit_move_insn (dest, GEN_INT (ud4 << 16));
! 
! 	  if (ud3)
! 	    emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud3)));
! 
! 	  emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
! 	  if (ud2)
! 	    emit_move_insn (dest, gen_rtx_IOR (DImode, dest, 
! 					       GEN_INT (ud2 << 16)));	
! 	  if (ud1)
! 	    emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
  	}
      }
    return dest;
  }
  


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 long const take 3
  2002-01-10 14:00                               ` PATCH, rs6000 long const take 3 Tom Rix
@ 2002-01-10 14:08                                 ` Richard Henderson
  2002-01-10 14:20                                 ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-01-10 14:08 UTC (permalink / raw)
  To: Tom Rix; +Cc: David Edelsohn, Geoff Keating, gcc-patches

On Thu, Jan 10, 2002 at 04:52:46PM -0600, Tom Rix wrote:
> Other than that they are evenly matched.
> I don't think there is enough of a win to check for the PLUS algorithm.

Depending on where this is used.  If it is ever used in early rtl
generation, PLUS favours being merged with address loads.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: PATCH, rs6000 long const take 3
  2002-01-10 14:00                               ` PATCH, rs6000 long const take 3 Tom Rix
  2002-01-10 14:08                                 ` Richard Henderson
@ 2002-01-10 14:20                                 ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-01-10 14:20 UTC (permalink / raw)
  To: Tom Rix; +Cc: Geoff Keating, rth, gcc-patches

	Yes, this looks okay, as long as you change

if (ud2)

to

if (ud2 != 0)

etc.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* ppc call_value* fixes (plus minor apple gripe)
@ 2002-01-28  1:05 Aldy Hernandez
  2002-01-28  7:26 ` Stan Shebs
                   ` (3 more replies)
  0 siblings, 4 replies; 875+ messages in thread
From: Aldy Hernandez @ 2002-01-28  1:05 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn, Stan Shebs

[-- Attachment #1: Type: text/plain, Size: 5474 bytes --]

<gripe>
both of the problems described here had obviously
been debugged and fixed by apple, and at least one of them was coded
after my altivec changes went in, so it would have been VERY simple
to submit the fix to the gcc list as well.

guys, if you have fixes in your local tree that are very obviously correct,
and will undoubtedly be encountered by others, post them.  it saves
the rest of us cycles that could be used to do other cool things in gcc
(and not spent re-fixing things that have already been fixed).
</gripe>

find_reloads was dying trying to fit the register constraints of
*call_value_nonlocal_sysv.  the first operand needed a "v" constraint added.
likewise for a couple more call_value* patterns.  i didn't mess with the AIX
call_value patterns because currently there is no AIX altivec, but i could
add it if someone wants it.

i also added a new register class because find_reloads was picking the
wrong register for the pattern:

	(define_insn "*call_value_nonlocal_sysv"
	  [(set (match_operand 0 "" "=fgv,fgv,fgv,fgv")

upon seeing f and g, the biggest subunion containing them was set to
NON_SPECIAL_REGS, then when it tried to get the union of that and
ALTIVEC_REGS it got confused.  i noticed apple-gcc has moved
NON_SPECIAL_REGS before ALTIVEC_REGS to fix the problem, but
i believe the proper solution is to add a register class encompassing all
three register classes.
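
to make the bitmask side of this concrete, here is a small standalone sketch using the REG_CLASS_CONTENTS values from the patch below.  the reg_mask type and the mask_* helpers are hypothetical and are not how gcc itself represents register sets; they only show that the new class is the bitwise union of the three existing ones and that NON_SPECIAL_REGS does not contain ALTIVEC_REGS.

```c
#include <assert.h>
#include <stdint.h>

#define MASK_WORDS 4

/* Hypothetical stand-in for one REG_CLASS_CONTENTS row.  */
typedef struct { uint32_t w[MASK_WORDS]; } reg_mask;

/* Values copied from the rs6000.h hunk in this patch.  */
const reg_mask general_regs = {{ 0xffffffff, 0x00000000, 0x00000008, 0x00000000 }};
const reg_mask float_regs = {{ 0x00000000, 0xffffffff, 0x00000000, 0x00000000 }};
const reg_mask altivec_regs = {{ 0x00000000, 0x00000000, 0xffffe000, 0x00001fff }};
const reg_mask non_special_regs = {{ 0xffffffff, 0xffffffff, 0x00000008, 0x00000000 }};
const reg_mask gen_or_float_or_altivec_regs = {{ 0xffffffff, 0xffffffff, 0xffffe008, 0x00001fff }};

/* Bitwise union of two register sets.  */
reg_mask mask_union(reg_mask a, reg_mask b)
{
  reg_mask r;
  for (int i = 0; i < MASK_WORDS; i++)
    r.w[i] = a.w[i] | b.w[i];
  return r;
}

/* Nonzero when every register in A is also in B.  */
int mask_subset(reg_mask a, reg_mask b)
{
  for (int i = 0; i < MASK_WORDS; i++)
    if (a.w[i] & ~b.w[i])
      return 0;
  return 1;
}

/* Nonzero when A and B hold exactly the same registers.  */
int mask_equal(reg_mask a, reg_mask b)
{
  return mask_subset(a, b) && mask_subset(b, a);
}

/* Hypothetical usage: check the relationships described above.  */
void demo_reg_classes(void)
{
  reg_mask all3 = mask_union(mask_union(general_regs, float_regs),
                             altivec_regs);

  assert(mask_equal(all3, gen_or_float_or_altivec_regs));
  assert(mask_equal(mask_union(general_regs, float_regs), non_special_regs));
  assert(!mask_subset(altivec_regs, non_special_regs));
}
```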

so... this patch:
	a) fixes the call_value patterns to handle vector return values
	b) adds GEN_OR_FLOAT_OR_ALTIVEC_REGS

ok?

2002-01-28  Aldy Hernandez  <aldyh@redhat.com>

	* config/rs6000/rs6000.h (reg_class): New class
	GEN_OR_FLOAT_OR_ALTIVEC_REGS.
	(REG_CLASS_NAMES): Same.
	(REG_CLASS_CONTENTS): Same.

	* rs6000.md ("*call_value_local32"): Support vector registers.
	("*call_value_local64"): Same.
	("*call_value_nonlocal_sysv"): Same.

Index: config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/uberbaum/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.176
diff -c -p -r1.176 rs6000.h
*** rs6000.h	2002/01/22 02:36:52	1.176
--- rs6000.h	2002/01/28 07:30:17
*************** enum reg_class
*** 1046,1051 ****
--- 1046,1052 ----
     GENERAL_REGS,
     FLOAT_REGS,
     ALTIVEC_REGS,
+   GEN_OR_FLOAT_OR_ALTIVEC_REGS,
     VRSAVE_REGS,
     NON_SPECIAL_REGS,
     MQ_REGS,
*************** enum reg_class
*** 1073,1078 ****
--- 1074,1080 ----
     "GENERAL_REGS",							\
     "FLOAT_REGS",								\
     "ALTIVEC_REGS",							\
+   "GEN_OR_FLOAT_OR_ALTIVEC_REGS",                                       \
     "VRSAVE_REGS",							\
     "NON_SPECIAL_REGS",							\
     "MQ_REGS",								\
*************** enum reg_class
*** 1099,1104 ****
--- 1101,1107 ----
     { 0xffffffff, 0x00000000, 0x00000008, 0x00000000 }, /* GENERAL_REGS */     \
     { 0x00000000, 0xffffffff, 0x00000000, 0x00000000 }, /* FLOAT_REGS */       \
     { 0x00000000, 0x00000000, 0xffffe000, 0x00001fff }, /* ALTIVEC_REGS */     \
+   { 0xffffffff, 0xffffffff, 0xffffe008, 0x00001fff }, /* GEN_OR_FLOAT_OR_ALTIVEC_REGS */ \
     { 0x00000000, 0x00000000, 0x00000000, 0x00002000 }, /* VRSAVE_REGS */	     \
     { 0xffffffff, 0xffffffff, 0x00000008, 0x00000000 }, /* NON_SPECIAL_REGS */ \
     { 0x00000000, 0x00000000, 0x00000001, 0x00000000 }, /* MQ_REGS */	     \
Index: config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/uberbaum/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.161
diff -c -p -r1.161 rs6000.md
*** rs6000.md	2002/01/25 17:52:43	1.161
--- rs6000.md	2002/01/28 07:30:28
***************
*** 9878,9884 ****
      (set_attr "length" "4,8")])

   (define_insn "*call_value_local32"
!   [(set (match_operand 0 "" "=fg,fg")
   	(call (mem:SI (match_operand:SI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
--- 9878,9884 ----
      (set_attr "length" "4,8")])

   (define_insn "*call_value_local32"
!   [(set (match_operand 0 "" "=fgv,fgv")
   	(call (mem:SI (match_operand:SI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
***************
*** 9899,9905 ****


   (define_insn "*call_value_local64"
!   [(set (match_operand 0 "" "=fg,fg")
   	(call (mem:SI (match_operand:DI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
--- 9899,9905 ----


   (define_insn "*call_value_local64"
!   [(set (match_operand 0 "" "=fgv,fgv")
   	(call (mem:SI (match_operand:DI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
***************
*** 10067,10073 ****
      (set_attr "length" "4,8,4,8")])

   (define_insn "*call_value_nonlocal_sysv"
!   [(set (match_operand 0 "" "=fg,fg,fg,fg")
   	(call (mem:SI (match_operand:SI 1 "call_operand" "cl,cl,s,s"))
   	      (match_operand 2 "" "g,g,g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n,O,n"))
--- 10067,10073 ----
      (set_attr "length" "4,8,4,8")])

   (define_insn "*call_value_nonlocal_sysv"
!   [(set (match_operand 0 "" "=fgv,fgv,fgv,fgv")
   	(call (mem:SI (match_operand:SI 1 "call_operand" "cl,cl,s,s"))
   	      (match_operand 2 "" "g,g,g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n,O,n"))

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28  1:05 ppc call_value* fixes (plus minor apple gripe) Aldy Hernandez
@ 2002-01-28  7:26 ` Stan Shebs
  2002-01-29 14:30   ` Aldy Hernandez
  2002-01-28  7:43 ` Geoff Keating
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 875+ messages in thread
From: Stan Shebs @ 2002-01-28  7:26 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches, David Edelsohn

Aldy Hernandez wrote:
> 
> <gripe>
> both of the problems described here had obviously
> been debugged and fixed by apple, and at least one of them was coded
> after my altivec changes went in, so it would have been VERY simple
> to submit the fix to the gcc list as well.

Unfortunately, our actual experience over the past year has been
otherwise - a certain amount of time has to be set aside for
nitpicking and debate over even the apparently straightforward
changes.  I'm not griping (OK, a little :-) ), but it's clearly
not a zero-cost activity.

> guys, if you have fixes in your local tree that are very obviously correct,
> and will undoubtedly be encountered by others, post them. it saves
> the rest of us cycles that could be used to do other cool things in gcc
> (and not spent re-fixing things that have already been fixed).
> </gripe>

Sorry - as you know, the Apple version has many local differences,
particularly for AltiVec code, and any submission candidate patch
needs a whole separate round of testing with FSF sources, to be sure
that the patch does not depend on any of our local changes that have
been deemed permanently unacceptable for FSF GCC.

> [...] i noticed apple-gcc has moved
> NON_SPECIAL_REGS before ALTIVEC_REGS to fix the problem, but
> i believe the proper solution is to add a register class encompassing all
> three register classes.

This is a perfect example - I did something expedient, basically
to get our version working again after an import of FSF code,
but was pretty sure it was wrong, and I didn't want to submit
a hack that I couldn't justify.  Worse, since the patch only
mattered if one had more complete AltiVec support than was in
FSF GCC at the time, there was no way to validate it or any
better patch using only FSF code.

I can assure you that the parallel maintenance of versions consumes
way more of my personal time and effort than anybody else's, so I
totally sympathize when there is evidence that it is inefficient.
But Apple is not going to give up its requirements for GCC just
because they're not desired in FSF GCC, nor vice versa, so we live
with the permanent fork instead.

Stan

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28  1:05 ppc call_value* fixes (plus minor apple gripe) Aldy Hernandez
  2002-01-28  7:26 ` Stan Shebs
@ 2002-01-28  7:43 ` Geoff Keating
  2002-01-28  7:49 ` David Edelsohn
  2002-01-28 11:12 ` Richard Henderson
  3 siblings, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2002-01-28  7:43 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches

Aldy Hernandez <aldyh@redhat.com> writes:

...
> i also added a new register class because find_reloads was picking the
> wrong register for the pattern:
> 
> 	(define_insn "*call_value_nonlocal_sysv"
> 	  [(set (match_operand 0 "" "=fgv,fgv,fgv,fgv")
> 
> upon seeing f and g, the biggest subunion containing them was set to
> NON_SPECIAL_REGS, then when it tried to get the union of that and
> ALTIVEC_REGS it got confused.  i noticed apple-gcc has moved
> NON_SPECIAL_REGS before ALTIVEC_REGS to fix the problem, but
> i believe the proper solution is to add a register class encompassing
> all three register classes.
> 
> so... this patch:
> 	a) fixes the call_value patterns to handle vector return values

That part is OK, although of course it won't work without the next bit.

> 	b) adds GEN_OR_FLOAT_OR_ALTIVEC_REGS
> 
> ok?

Could you change the name of NON_SPECIAL_REGS too?  I don't think the
altivec registers are "special" in that sense...

Actually, I think it'd probably be better to create a
GEN_OR_FLOAT_REGS class, and then make NON_SPECIAL_REGS contain
altivec registers too.  There are two places where NON_SPECIAL_REGS is
used, in PREFERRED_RELOAD_CLASS and secondary_reload_class; the
secondary_reload_class case should be 
(regno == -1 || FP_REGNO_P (regno)) && reg_class_subset_p (FLOAT_REGS, class)
and for PREFERRED_RELOAD_CLASS, you probably want to add a
GEN_OR_ALTIVEC_REGS and use that.

> 2002-01-28 Aldy Hernandez <aldyh@redhat.com>
> 
> 	* config/rs6000/rs6000.h (reg_class): New class
> 	GEN_OR_FLOAT_OR_ALTIVEC_REGS.
> 	(REG_CLASS_NAMES): Same.
> 	(REG_CLASS_CONTENTS): Same.
> 
> 	* rs6000.md ("*call_value_local32"): Support vector registers.
> 	("*call_value_local64"): Same.
> 	("*call_value_nonlocal_sysv"): Same.
-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28  1:05 ppc call_value* fixes (plus minor apple gripe) Aldy Hernandez
  2002-01-28  7:26 ` Stan Shebs
  2002-01-28  7:43 ` Geoff Keating
@ 2002-01-28  7:49 ` David Edelsohn
  2002-01-28 11:12 ` Richard Henderson
  3 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-01-28  7:49 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches, Stan Shebs

2002-01-28  Aldy Hernandez  <aldyh@redhat.com>

	* config/rs6000/rs6000.h (reg_class): New class
	GEN_OR_FLOAT_OR_ALTIVEC_REGS.
	(REG_CLASS_NAMES): Same.
	(REG_CLASS_CONTENTS): Same.

	* rs6000.md ("*call_value_local32"): Support vector registers.
	("*call_value_local64"): Same.
	("*call_value_nonlocal_sysv"): Same.

This patch is fine.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28  1:05 ppc call_value* fixes (plus minor apple gripe) Aldy Hernandez
                   ` (2 preceding siblings ...)
  2002-01-28  7:49 ` David Edelsohn
@ 2002-01-28 11:12 ` Richard Henderson
  2002-01-28 12:52   ` David Edelsohn
  2002-01-28 22:44   ` Aldy Hernandez
  3 siblings, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2002-01-28 11:12 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches, David Edelsohn, Stan Shebs

On Mon, Jan 28, 2002 at 06:33:00PM +1100, Aldy Hernandez wrote:
>   (define_insn "*call_value_local32" 
> -   [(set (match_operand 0 "" "=fg,fg") 
> +   [(set (match_operand 0 "" "=fgv,fgv") 

This is silly.  This operand will _always_ be a hard reg right
from the beginning of code generation.  This should be written

	(match_operand 0 "" "")

so that reload does nothing with it at all.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28 11:12 ` Richard Henderson
@ 2002-01-28 12:52   ` David Edelsohn
  2002-01-28 22:44   ` Aldy Hernandez
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-01-28 12:52 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches, Stan Shebs

	The output constraints on call_value have been around for a very
long time.  If we remove them, we need to remove them from all call_value
patterns. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28 11:12 ` Richard Henderson
  2002-01-28 12:52   ` David Edelsohn
@ 2002-01-28 22:44   ` Aldy Hernandez
  2002-01-29 10:51     ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Aldy Hernandez @ 2002-01-28 22:44 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches, David Edelsohn, Stan Shebs

> This is silly.  This operand will _always_ be a hard reg right
> from the beginning of code generation.  This should be written
>
> 	(match_operand 0 "" "")
>
> so that reload does nothing with it at all.

weee!!!

ok, here's the new patch.  regression tested on darwin.

david, i fixed all the call_value patterns.

ok?

--
Aldy Hernandez                                E-mail: aldyh@redhat.com
Professional Gypsy Lost in Australia
Red Hat, Inc.

2002-01-29  Aldy Hernandez  <aldyh@redhat.com>

	* rs6000.md ("*call_value_local32"): Remove constraints.
	("*call_value_local64"): Same.
	("*call_value_indirect_nonlocal_aix32"): Same.
	("*call_value_nonlocal_aix32"): Same.
	("*call_value_indirect_nonlocal_aix64"): Same.
	("*call_value_nonlocal_aix64"): Same.
	("*call_value_nonlocal_sysv"): Same.

Index: config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/uberbaum/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.161
diff -c -p -r1.161 rs6000.md
*** rs6000.md	2002/01/25 17:52:43	1.161
--- rs6000.md	2002/01/29 05:50:23
***************
*** 9878,9884 ****
      (set_attr "length" "4,8")])

   (define_insn "*call_value_local32"
!   [(set (match_operand 0 "" "=fg,fg")
   	(call (mem:SI (match_operand:SI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
--- 9878,9884 ----
      (set_attr "length" "4,8")])

   (define_insn "*call_value_local32"
!   [(set (match_operand 0 "" "")
   	(call (mem:SI (match_operand:SI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
***************
*** 9899,9905 ****


   (define_insn "*call_value_local64"
!   [(set (match_operand 0 "" "=fg,fg")
   	(call (mem:SI (match_operand:DI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
--- 9899,9905 ----


   (define_insn "*call_value_local64"
!   [(set (match_operand 0 "" "")
   	(call (mem:SI (match_operand:DI 1 "current_file_function_operand" "s,s"))
   	      (match_operand 2 "" "g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n"))
***************
*** 9976,9982 ****
      (set_attr "length" "8")])

   (define_insn "*call_value_indirect_nonlocal_aix32"
!   [(set (match_operand 0 "" "=fg")
   	(call (mem:SI (match_operand:SI 1 "register_operand" "cl"))
   	      (match_operand 2 "" "g")))
      (use (reg:SI 2))
--- 9976,9982 ----
      (set_attr "length" "8")])

   (define_insn "*call_value_indirect_nonlocal_aix32"
!   [(set (match_operand 0 "" "")
   	(call (mem:SI (match_operand:SI 1 "register_operand" "cl"))
   	      (match_operand 2 "" "g")))
      (use (reg:SI 2))
***************
*** 9990,9996 ****
      (set_attr "length" "8")])

   (define_insn "*call_value_nonlocal_aix32"
!   [(set (match_operand 0 "" "=fg")
   	(call (mem:SI (match_operand:SI 1 "call_operand" "s"))
   	      (match_operand 2 "" "g")))
      (use (match_operand:SI 3 "immediate_operand" "O"))
--- 9990,9996 ----
      (set_attr "length" "8")])

   (define_insn "*call_value_nonlocal_aix32"
!   [(set (match_operand 0 "" "")
   	(call (mem:SI (match_operand:SI 1 "call_operand" "s"))
   	      (match_operand 2 "" "g")))
      (use (match_operand:SI 3 "immediate_operand" "O"))
***************
*** 10003,10009 ****
      (set_attr "length" "8")])

   (define_insn "*call_value_indirect_nonlocal_aix64"
!   [(set (match_operand 0 "" "=fg")
   	(call (mem:SI (match_operand:DI 1 "register_operand" "cl"))
   	      (match_operand 2 "" "g")))
      (use (reg:DI 2))
--- 10003,10009 ----
      (set_attr "length" "8")])

   (define_insn "*call_value_indirect_nonlocal_aix64"
!   [(set (match_operand 0 "" "")
   	(call (mem:SI (match_operand:DI 1 "register_operand" "cl"))
   	      (match_operand 2 "" "g")))
      (use (reg:DI 2))
***************
*** 10017,10023 ****
      (set_attr "length" "8")])

   (define_insn "*call_value_nonlocal_aix64"
!   [(set (match_operand 0 "" "=fg")
   	(call (mem:SI (match_operand:DI 1 "call_operand" "s"))
   	      (match_operand 2 "" "g")))
      (use (match_operand:SI 3 "immediate_operand" "O"))
--- 10017,10023 ----
      (set_attr "length" "8")])

   (define_insn "*call_value_nonlocal_aix64"
!   [(set (match_operand 0 "" "")
   	(call (mem:SI (match_operand:DI 1 "call_operand" "s"))
   	      (match_operand 2 "" "g")))
      (use (match_operand:SI 3 "immediate_operand" "O"))
***************
*** 10067,10073 ****
      (set_attr "length" "4,8,4,8")])

   (define_insn "*call_value_nonlocal_sysv"
!   [(set (match_operand 0 "" "=fg,fg,fg,fg")
   	(call (mem:SI (match_operand:SI 1 "call_operand" "cl,cl,s,s"))
   	      (match_operand 2 "" "g,g,g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n,O,n"))
--- 10067,10073 ----
      (set_attr "length" "4,8,4,8")])

   (define_insn "*call_value_nonlocal_sysv"
!   [(set (match_operand 0 "" "")
   	(call (mem:SI (match_operand:SI 1 "call_operand" "cl,cl,s,s"))
   	      (match_operand 2 "" "g,g,g,g")))
      (use (match_operand:SI 3 "immediate_operand" "O,n,O,n"))

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28 22:44   ` Aldy Hernandez
@ 2002-01-29 10:51     ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-01-29 10:51 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: Richard Henderson, gcc-patches, Stan Shebs

2002-01-29  Aldy Hernandez  <aldyh@redhat.com>

	* rs6000.md ("*call_value_local32"): Remove constraints.
	("*call_value_local64"): Same.
	("*call_value_indirect_nonlocal_aix32"): Same.
	("*call_value_nonlocal_aix32"): Same.
	("*call_value_indirect_nonlocal_aix64"): Same.
	("*call_value_nonlocal_aix64"): Same.
	("*call_value_nonlocal_sysv"): Same.

This patch looks okay and works so far in my preliminary test.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc call_value* fixes (plus minor apple gripe)
  2002-01-28  7:26 ` Stan Shebs
@ 2002-01-29 14:30   ` Aldy Hernandez
  0 siblings, 0 replies; 875+ messages in thread
From: Aldy Hernandez @ 2002-01-29 14:30 UTC (permalink / raw)
  To: Stan Shebs; +Cc: gcc-patches, David Edelsohn

> Unfortunately, our actual experience over the past year has been
> otherwise - a certain amount of time has to be set aside for
> nitpicking and debate over even the apparently straightforward
> changes.  I'm not griping (OK, a little :-) ), but it's clearly
> not a zero-cost activity.

yes, i am aware it takes a lot of extra-hours work time.

perhaps, if you know it's something that will definitely have
to be fixed later, distill a small testcase and mail it to me.

> because they're not desired in FSF GCC, nor vice versa, so we live
> with the permanent fork instead.

forked trees?  we have the mother of all forked trees here at
redhat. :)

--
Aldy Hernandez                                E-mail: aldyh@redhat.com
Professional Gypsy Lost in Australia
Red Hat, Inc.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] PowerPC fsel PR5217
@ 2002-02-03 19:14 David Edelsohn
  2002-02-03 21:14 ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-02-03 19:14 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	This problem seems to be a mismatch between the mode of the
if_then_else in the generated RTL versus the machine description.  I think
this mixed-mode pattern can only be generated by GCC internally, not by any
machine description pattern, so we might as well follow the canonical
modes that GCC chooses.

	Does this seem correct to you, Geoff?  Or should the if_then_else
be the other mode?

David


Index: rs6000.md
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.162
diff -c -p -r1.162 rs6000.md
*** rs6000.md	2002/01/29 22:49:55	1.162
--- rs6000.md	2002/02/04 00:46:26
***************
*** 5084,5090 ****
  
  (define_insn "*fseldfsf4"
    [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
! 	(if_then_else:SF (ge (match_operand:DF 1 "gpc_reg_operand" "f")
  			     (match_operand:DF 4 "zero_fp_constant" "F"))
  			 (match_operand:SF 2 "gpc_reg_operand" "f")
  			 (match_operand:SF 3 "gpc_reg_operand" "f")))]
--- 5084,5090 ----
  
  (define_insn "*fseldfsf4"
    [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
! 	(if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "f")
  			     (match_operand:DF 4 "zero_fp_constant" "F"))
  			 (match_operand:SF 2 "gpc_reg_operand" "f")
  			 (match_operand:SF 3 "gpc_reg_operand" "f")))]
***************
*** 5248,5254 ****
  
  (define_insn "*fselsfdf4"
    [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
! 	(if_then_else:DF (ge (match_operand:SF 1 "gpc_reg_operand" "f")
  			     (match_operand:SF 4 "zero_fp_constant" "F"))
  			 (match_operand:DF 2 "gpc_reg_operand" "f")
  			 (match_operand:DF 3 "gpc_reg_operand" "f")))]
--- 5248,5254 ----
  
  (define_insn "*fselsfdf4"
    [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
! 	(if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "f")
  			     (match_operand:SF 4 "zero_fp_constant" "F"))
  			 (match_operand:DF 2 "gpc_reg_operand" "f")
  			 (match_operand:DF 3 "gpc_reg_operand" "f")))]

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-03 19:14 [PATCH] PowerPC fsel PR5217 David Edelsohn
@ 2002-02-03 21:14 ` Geoff Keating
  2002-02-03 21:40   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-02-03 21:14 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Sun, 03 Feb 2002 19:57:07 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> 	This problem seems to be a mismatch between the mode of the
> if_then_else in the generated RTL versus the machine description.  I think
> this mixed mode pattern only can be generated by GCC internally, not any
> machine description pattern, so we might as well follow the canonical
> modes that GCC chooses.
> 
> 	Does this seem correct to you, Geoff?  Or should the if_then_else
> be the other mode?

The patterns you're changing seem to have been correct originally.
The IF_THEN_ELSE should have the same mode as the two alternatives and
the same mode as the destination of the SET, correct?

>   (define_insn "*fseldfsf4"
>     [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
> ! 	(if_then_else:SF (ge (match_operand:DF 1 "gpc_reg_operand" "f")
>   			     (match_operand:DF 4 "zero_fp_constant" "F"))
>   			 (match_operand:SF 2 "gpc_reg_operand" "f")
>   			 (match_operand:SF 3 "gpc_reg_operand" "f")))]

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-03 21:14 ` Geoff Keating
@ 2002-02-03 21:40   ` David Edelsohn
  2002-02-03 22:08     ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-02-03 21:40 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

>>>>> Geoff Keating writes:

Geoff> The patterns you're changing seem to have been correct originally.
Geoff> The IF_THEN_ELSE should have the same mode as the two alternatives and
Geoff> the same mode as the destination of the SET, correct?

	This is what I am not sure about.  I am pretty sure that I created
those patterns because GCC was not generating fsel for those mixed cases.
I am pretty sure that I created the patterns to match what GCC was
generating, not because the RTL was canonically correct.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-03 21:40   ` David Edelsohn
@ 2002-02-03 22:08     ` Geoff Keating
  2002-02-03 22:45       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-02-03 22:08 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Sun, 03 Feb 2002 23:50:12 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff Keating writes:
> 
> Geoff> The patterns you're changing seem to have been correct originally.
> Geoff> The IF_THEN_ELSE should have the same mode as the two alternatives and
> Geoff> the same mode as the destination of the SET, correct?
> 
> 	This is what I am not sure about.  I am pretty sure that I created
> those patterns because GCC was not generating fsel for those mixed cases.
> I am pretty sure that I created the patterns to match what GCC was
> generating, not because the RTL was canonically correct.

In ifcvt.c, there is:

tmp = gen_rtx_fmt_ee (code, GET_MODE (if_info->cond), cmp_a, cmp_b);
tmp = gen_rtx_IF_THEN_ELSE (GET_MODE (x), tmp, vtrue, vfalse);
tmp = gen_rtx_SET (VOIDmode, x, tmp);

which generates an IF_THEN_ELSE with the same mode as the destination
of the SET.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-03 22:08     ` Geoff Keating
@ 2002-02-03 22:45       ` David Edelsohn
  2002-02-03 23:14         ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-02-03 22:45 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

>>>>> Geoff Keating writes:

Geoff> tmp = gen_rtx_fmt_ee (code, GET_MODE (if_info->cond), cmp_a, cmp_b);
Geoff> tmp = gen_rtx_IF_THEN_ELSE (GET_MODE (x), tmp, vtrue, vfalse);
Geoff> tmp = gen_rtx_SET (VOIDmode, x, tmp);

Geoff> which generates an IF_THEN_ELSE with the same mode as the destination
Geoff> of the SET.

	combine.c uses VOIDmode and cselib.c appears to use the mode of
the source:

            src = gen_rtx_IF_THEN_ELSE (GET_MODE (src), cond, src, dest);

which is what my pattern matches.

	So the question is which way do we want to canonicalize this?
Then we can document it and match the decision in the PowerPC port.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-03 22:45       ` David Edelsohn
@ 2002-02-03 23:14         ` Geoff Keating
  2002-02-04  9:26           ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-02-03 23:14 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Mon, 04 Feb 2002 00:14:23 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff Keating writes:
> 
> Geoff> tmp = gen_rtx_fmt_ee (code, GET_MODE (if_info->cond), cmp_a, cmp_b);
> Geoff> tmp = gen_rtx_IF_THEN_ELSE (GET_MODE (x), tmp, vtrue, vfalse);
> Geoff> tmp = gen_rtx_SET (VOIDmode, x, tmp);
> 
> Geoff> which generates an IF_THEN_ELSE with the same mode as the destination
> Geoff> of the SET.
> 
> 	combine.c uses VOIDmode and cselib.c appears to use the mode of
> the source:
> 
>             src = gen_rtx_IF_THEN_ELSE (GET_MODE (src), cond, src, dest);
> 
> which is what my pattern matches.

Um...  In your patch, you had

***************
*** 5084,5090 ****
  
  (define_insn "*fseldfsf4"
    [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
! 	(if_then_else:SF (ge (match_operand:DF 1 "gpc_reg_operand" "f")
  			     (match_operand:DF 4 "zero_fp_constant" "F"))
  			 (match_operand:SF 2 "gpc_reg_operand" "f")
  			 (match_operand:SF 3 "gpc_reg_operand" "f")))]
--- 5084,5090 ----
  
  (define_insn "*fseldfsf4"
    [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
! 	(if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "f")
  			     (match_operand:DF 4 "zero_fp_constant" "F"))
  			 (match_operand:SF 2 "gpc_reg_operand" "f")
  			 (match_operand:SF 3 "gpc_reg_operand" "f")))]

In this case, 'src' is '(match_operand:SF 2 "gpc_reg_operand" "f")',
and has mode SFmode.  

> 	So the question is which way do we want to canonicalize this?
> Then we can document it and match the decision in the PowerPC port.

We certainly want the mode of the IF_THEN_ELSE to be the mode of the
value it produces; this is how almost all other RTL is done.  VOIDmode
would just be annoying.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-03 23:14         ` Geoff Keating
@ 2002-02-04  9:26           ` David Edelsohn
  2002-02-04 10:24             ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-02-04  9:26 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	I do not understand your reply.  The current version of the
pattern expects the MODE of IF_THEN_ELSE to match the source.  The code in
cselib.c creates IF_THEN_ELSE with the MODE based on src, not on cond.  I
am not saying it is correct.

	My patch changes the MODE of IF_THEN_ELSE to match the condition,
which is what the code in ifcvt.c creates.  The purpose of my patch was to
create a strawman to start a discussion because I have been harshly
criticized before when simply asking for guidance without a patch.

	Changing the pattern without having GCC consistently generate
matching RTL is not effective.  If the canonical form should have the MODE
match the condition, then it appears that cselib.c should be changed.

	I am reporting facts of contradictions within GCC, not making
recommendations.  I am asking for people familiar with this code and this
design to give some advice and help make the necessary changes in the
common parts of GCC.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-04  9:26           ` David Edelsohn
@ 2002-02-04 10:24             ` Geoff Keating
  2002-02-04 10:40               ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-02-04 10:24 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Mon, 04 Feb 2002 12:10:09 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> 	I do not understand your reply.  The current version of the
> pattern expects the MODE of IF_THEN_ELSE to match the source.  The code in
> cselib.c creates IF_THEN_ELSE with the MODE based on src, not on cond.  I
> am not saying it is correct.
> 
> 	My patch changes the MODE of IF_THEN_ELSE to match the condition,
> which is what the code in ifcvt.c creates. 

This is what I am trying to say:

On my reading, ifcvt.c does not create IF_THEN_ELSE RTL in which the
mode of the IF_THEN_ELSE matches the condition.

Why do you think that it does?

I think it doesn't because in this chunk of code from ifcvt.c:

tmp = gen_rtx_fmt_ee (code, GET_MODE (if_info->cond), cmp_a, cmp_b);
tmp = gen_rtx_IF_THEN_ELSE (GET_MODE (x), tmp, vtrue, vfalse);
tmp = gen_rtx_SET (VOIDmode, x, tmp);

the mode of the condition will be GET_MODE (if_info->cond), but
the IF_THEN_ELSE is created with mode GET_MODE (x).

Since GET_MODE(x) and GET_MODE(vtrue) and GET_MODE(vfalse) should be
the same thing (unless vtrue or vfalse are CONST_INTs), that makes
this code equivalent to the code in cselib.c:

src = gen_rtx_IF_THEN_ELSE (GET_MODE (src), cond, src, dest);

... except in the case where 'src' has VOIDmode, which is probably a
bug in cselib.c.  It should really be using GET_MODE (dest) which we
know isn't a CONST_INT.
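
To make the distinction concrete, here is a toy C model.  It is not GCC
code: toy_mode, toy_rtx and both functions are invented for this sketch.
It contrasts taking the IF_THEN_ELSE mode from the SET destination, as
ifcvt.c does with GET_MODE (x), with taking it from an arm that may be a
CONST_INT and therefore VOIDmode.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for machine modes; not GCC's real enum.  */
enum toy_mode { TOY_VOIDmode, TOY_SFmode, TOY_DFmode };

/* Toy stand-in for an rtx: only the mode matters for this sketch.  */
struct toy_rtx { enum toy_mode mode; };

/* Mode taken from the SET destination.  A destination is a register or
   memory, never a CONST_INT, so this always names the mode of the value
   the IF_THEN_ELSE produces.  */
enum toy_mode
toy_mode_from_dest (const struct toy_rtx *dest)
{
  assert (dest != NULL);
  return dest->mode;
}

/* Mode taken from a source arm.  If the arm is a CONST_INT its mode is
   VOIDmode, and the IF_THEN_ELSE ends up with VOIDmode as well.  */
enum toy_mode
toy_mode_from_src (const struct toy_rtx *src)
{
  assert (src != NULL);
  return src->mode;
}
```

With an SFmode destination and a CONST_INT arm, the first function yields
the SFmode stand-in and the second the VOIDmode stand-in, which is the
case described above.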

> The purpose of my patch was to
> create a strawman to start a discussion because I have been harshly
> criticized before when simply asking for guidance without a patch.

Yes, your patch was very helpful.  It indicated that we're somehow
talking at cross purposes.

> 	Changing the pattern without having GCC consistently generating
> matching RTL is not effective.  

Yes, I agree.  GCC should be generating the correct RTL.

> If the canonical form should have the MODE
> match the condition, then cselib.c appears that it should be
> changed.

It certainly shouldn't be matching the condition.  It should be
matching the mode of the result of the IF_THEN_ELSE.

> 	I am reporting facts of contradictions within GCC, not making
> recommendations.  I am asking for people familiar with this code and this
> design to give some advice and help make the necessary changes in the
> common parts of GCC.

I hope I am being helpful...

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-04 10:24             ` Geoff Keating
@ 2002-02-04 10:40               ` David Edelsohn
  2002-02-04 11:18                 ` Dale Johannesen
  2002-02-04 11:44                 ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-02-04 10:40 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	I somehow thought you were saying that ifcvt.c uses the mode of
the operands of the condition, sorry.

	So this patch should be all that is necessary.  Do you have any
objection to the patch?

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-04 10:40               ` David Edelsohn
@ 2002-02-04 11:18                 ` Dale Johannesen
  2002-02-04 11:44                 ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Dale Johannesen @ 2002-02-04 11:18 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Dale Johannesen, Geoff Keating, gcc-patches


This earlier fix
http://gcc.gnu.org/ml/gcc-patches/2001-11/msg00236.html
is related, but it seems I did not fix the entire problem.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-04 10:40               ` David Edelsohn
  2002-02-04 11:18                 ` Dale Johannesen
@ 2002-02-04 11:44                 ` Geoff Keating
  2002-02-05 11:20                   ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-02-04 11:44 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Mon, 04 Feb 2002 13:24:31 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> 	I somehow thought you were saying that ifcvt.c uses the mode of
> the operands of the condition, sorry.
> 
> 	So this patch should be all that is necessary.  Do you have any
> objection to the patch?

The patch does exactly the opposite thing to what I was recommending!
Did you perhaps generate it reversed?

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-04 11:44                 ` Geoff Keating
@ 2002-02-05 11:20                   ` David Edelsohn
  2002-02-05 12:47                     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-02-05 11:20 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	Okay, so those patterns already are correct.  The problem is that
something in GCC is generating

(set (reg:SF)
     (if_then_else:DF (ge (reg:DF)
			  (const_double:DF 0))
		      (reg:SF)
		      (reg:SF)))

Note the if_then_else:DF.  This only exists in gcc-3.0 branch.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-05 11:20                   ` David Edelsohn
@ 2002-02-05 12:47                     ` David Edelsohn
  2002-02-05 14:17                       ` Mark Mitchell
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-02-05 12:47 UTC (permalink / raw)
  To: Geoff Keating, Mark Mitchell; +Cc: gcc-patches

	So the problem is probably that the GET_MODE (dest) patch was not
back-ported to the GCC-3.0 branch.

http://gcc.gnu.org/ml/gcc-patches/2001-11/msg00236.html

2001-11-05  Dale Johannesen  <dalej@apple.com>

         * config/rs6000/rs6000.c (rs6000_emit_cmove):  Derive mode
           of the IF_THEN_ELSE from the result, not the operands.

Mark: may this patch be back-ported to GCC 3.0 branch?

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC fsel PR5217
  2002-02-05 12:47                     ` David Edelsohn
@ 2002-02-05 14:17                       ` Mark Mitchell
  0 siblings, 0 replies; 875+ messages in thread
From: Mark Mitchell @ 2002-02-05 14:17 UTC (permalink / raw)
  To: David Edelsohn, Geoff Keating; +Cc: gcc-patches



--On Tuesday, February 05, 2002 03:27:00 PM -0500 David Edelsohn 
<dje@watson.ibm.com> wrote:

> 	So the problem is probably that the GET_MODE (dest) patch was not
> back-ported to the GCC-3.0 branch.

Yes, it's fine to apply this patch.

Thanks,

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* f build dies with: undefined reference to `lookup_name'
@ 2002-03-06  6:54 Andrew Cagney
  2002-03-06  8:29 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Andrew Cagney @ 2002-03-06  6:54 UTC (permalink / raw)
  To: gcc-patches

Hello,

I'm trying to get some patches committed but am finding that my build 
dies while building fortran with:

gcc -DIN_GCC    -g -O -W -Wall -Wwrite-strings -Wstrict-prototypes 
-Wmissing-prototypes -Wtraditional -pedantic -Wno-long-long 
-DHAVE_CONFIG_H  -o f771 f/bad.o f/bit.o f/bld.o f/com.o f/data.o 
f/equiv.o f/expr.o f/global.o f/implic.o f/info.o f/intrin.o f/lab.o 
f/lex.o f/malloc.o f/name.o f/parse.o f/src.o f/st.o f/sta.o f/stb.o 
f/stc.o f/std.o f/ste.o f/storag.o f/stp.o f/str.o f/sts.o f/stt.o 
f/stu.o f/stv.o f/stw.o f/symbol.o f/target.o f/top.o f/type.o 
f/version.o f/where.o main.o libbackend.a   ../libiberty/libiberty.a
libbackend.a(varasm.o): In function `weak_finish':
/home/scratch/PENDING/GCC-2002-02-02-switch-default/gcc/gcc/varasm.c:5111: 
undefined reference to `lookup_name'

on a powerpc-unknown-netbsd1.5ZA native build.

I suspect this change:

2002-03-01  Alan Modra  <amodra@bigpond.net.au>
             David Edelsohn  <edelsohn@gnu.org>

	.....
         (weak_finish): Use ASM_WEAKEN_DECL. Try to find decl.
         (remove_from_pending_weak_list): Declare and define for
         ASM_WEAKEN_DECL.

Andrew

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06  6:54 f build dies with: undefined reference to `lookup_name' Andrew Cagney
@ 2002-03-06  8:29 ` David Edelsohn
  2002-03-06  8:53   ` Andrew Cagney
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-06  8:29 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: gcc-patches

> I'm trying to get some patches committed but am finding that my build 
> dies while building fortran with:
> /home/scratch/PENDING/GCC-2002-02-02-switch-default/gcc/gcc/varasm.c:5111: 
> undefined reference to `lookup_name'

	The patch works on Linux and AIX and eABI.  What is different
about the NetBSD configuration?  The function is declared in c-tree.h and
defined in c-decl.c. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06  8:29 ` David Edelsohn
@ 2002-03-06  8:53   ` Andrew Cagney
  2002-03-06 10:18     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Andrew Cagney @ 2002-03-06  8:53 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

>> I'm trying to get some patches committed but am finding that my build 
>> dies in while building fortran with:
>> /home/scratch/PENDING/GCC-2002-02-02-switch-default/gcc/gcc/varasm.c:5111: 
>> undefined reference to `lookup_name'

> 
> 	The patch works on Linux and AIX and eABI.  What is different
> about the NetBSD configuration?  The function is declared in c-tree.h and
> defined in c-decl.c. 

Fortran?  The archive libbackend.a (which the fortran compiler linked 
against) contains varasm.o but does not contain c-decl.o.  Looking at 
gcc/Makefile.in, C_AND_OBJC_OBJS includes c-decl.o but OBJS (used for 
libbackend.a) does not.

Andrew


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06  8:53   ` Andrew Cagney
@ 2002-03-06 10:18     ` David Edelsohn
  2002-03-06 10:59       ` Richard Henderson
  2002-03-06 11:43       ` Stan Shebs
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-03-06 10:18 UTC (permalink / raw)
  To: Andrew Cagney, Alan Modra, Richard Henderson; +Cc: gcc-patches

>>>>> Andrew Cagney writes:

Andrew> Fortran?  The archive libbackend.a (which the fortran compiler linked 
Andrew> against) contains varasm.o but does not contain c-decl.o.  Looking at 
Andrew> gcc/Makefile.in, C_AND_OBJC_OBJS includes c-decl.o but OBJS (used for 
Andrew> libbackend.a) does not.

	I am surprised that this did not cause any bootstrap problems
during all of the testing.

	The problem appears to be that all of the pragma and weak support
is in varasm.c, but it only is needed for C-like languages.  Because this
is tied with varasm, I don't think that one can separate it out into a
file only compiled for C-like languages.

	varasm.c includes c-tree.h, so there was no reason to believe that
those symbols were not safe.

	gcc/cp/decl.c defines lookup_name, but with an extra parameter
which currently is random garbage when this is used from the C++
front-end.  Sigh.

	gcc/f/com.c defines lookup_name_current_level, but not
lookup_name, copied directly from c-decl.c (even the comment referring to
lookup_name!)

	I think that one needs to define lookup_name in each language or
define a similar function in common code included in libbackend.a.  I
don't see how one can avoid referencing some symbol from varasm.c, even if
it never is called for some languages.
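
	One way the "similar function in common code" idea might look is
sketched below.  This is only an illustration under assumed names:
hook_decl, toy_lookup_hook and the other identifiers are hypothetical and
do not exist in GCC.  Common code owns a function pointer with a default
that finds nothing, and a C-like front end installs its own lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a declaration node.  */
struct hook_decl { const char *name; };

typedef struct hook_decl *(*toy_lookup_fn) (const char *);

/* Default for front ends that cannot look up a decl by name.  */
struct hook_decl *
toy_lookup_none (const char *name)
{
  (void) name;
  return NULL;
}

/* Lives in common code, so the back end always links.  */
toy_lookup_fn toy_lookup_hook = toy_lookup_none;

/* What a C-like front end might install: a flat, global name table.  */
static struct hook_decl toy_c_decls[] = { { "foo" }, { "bar" } };

struct hook_decl *
toy_c_lookup (const char *name)
{
  size_t n = sizeof toy_c_decls / sizeof toy_c_decls[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (toy_c_decls[i].name, name) == 0)
      return &toy_c_decls[i];
  return NULL;
}

/* Back-end caller: goes through the hook and never names a front-end
   symbol directly.  */
struct hook_decl *
toy_find_decl_for_weak (const char *name)
{
  assert (toy_lookup_hook != NULL);
  return toy_lookup_hook (name);
}
```

	With the default hook every lookup returns NULL, so a front end
that never installs its own function still links and simply never finds a
decl.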

	Maybe someone else has a better idea which I am missing.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 10:18     ` David Edelsohn
@ 2002-03-06 10:59       ` Richard Henderson
  2002-03-06 11:27         ` David Edelsohn
                           ` (3 more replies)
  2002-03-06 11:43       ` Stan Shebs
  1 sibling, 4 replies; 875+ messages in thread
From: Richard Henderson @ 2002-03-06 10:59 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Cagney, Alan Modra, gcc-patches

On Wed, Mar 06, 2002 at 01:17:59PM -0500, David Edelsohn wrote:
> 	I think that one needs to define lookup_name in each language or
> define a similar function in common code included in libbackend.a.  I
> don't see how one can avoid referencing some symbol from varasm.c, even if
> it never is called for some languages.
> 
> 	Maybe someone else has a better idea which I am missing.

You can't look up decls from varasm.c.  You have to have
them passed down to you.

And you're right.  All the queueing of identifiers should
probably happen in c-common.c or something.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 10:59       ` Richard Henderson
@ 2002-03-06 11:27         ` David Edelsohn
  2002-03-06 12:41           ` Richard Henderson
  2002-03-06 15:40         ` David Edelsohn
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-06 11:27 UTC (permalink / raw)
  To: Richard Henderson, Andrew Cagney, Alan Modra, gcc-patches

>>>>> Richard Henderson writes:

Richard> You can't look up decls from varasm.c.  You have to have
Richard> them passed down to you.

	The whole problem, and what Alan is trying to get around, is that
one may not know the DECL at the time one sees the #pragma weak.  This is
why I had my ugly patch to c-pragma.c differentiating identifiers from
strings, which also did not work in all cases covered by Alan's patch.

	One wants to be able to say

#pragma weak foo
extern int foo(void);

int main (void)
{
  if (foo)
    return (*foo) ();
  return 0;
}

At the point that the pragma is parsed, one has not seen the DECL for foo.
GCC's attribute extension weakens the DECL, but #pragma weak just puts the
string on a list.  Even if we walk the weak list when creating any DECL to
mark it weak if we have seen a #pragma weak, we still need to be able to
find the DECL again when we finish the weak symbols if the weak symbol was
not already emitted.

	The names of the symbols that need to be weakened need to be
deferred as long as possible, which is why Alan is looking up the names in
weak_finish when GCC is about to emit the symbols.  That is the point
where we have committed to emitting the symbol names and have our last
chance to find a DECL corresponding to the string.  Otherwise, we are back
to weakening strings without any knowledge of a DECL to which it might
correspond.
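
A minimal sketch of the deferral described above, not GCC code: the pragma handler can only record a name, and the name is tied to a declaration later, when one appears. The type `toy_decl` and the functions `note_pragma_weak` and `apply_pending_weak` are hypothetical stand-ins invented for illustration; real DECLs are `tree` nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DECL; not GCC's tree type.  */
struct toy_decl
{
  const char *name;
  int weak;
};

/* One "#pragma weak NAME" seen so far.  DECL stays NULL until a
   declaration with that name shows up.  */
struct pending_weak
{
  const char *name;
  struct toy_decl *decl;
};

#define MAX_PENDING 16

static struct pending_weak pending[MAX_PENDING];
static size_t n_pending;

/* Called when the pragma is parsed: only the name is known.
   Returns 1 on success, 0 if the toy table is full.  */
int
note_pragma_weak (const char *name)
{
  if (n_pending == MAX_PENDING)
    return 0;
  pending[n_pending].name = name;
  pending[n_pending].decl = NULL;
  n_pending++;
  return 1;
}

/* Called when a declaration is created: if its name was the subject
   of an earlier pragma, mark it weak and remember the DECL so it can
   be found again when the weak symbols are finally emitted.  */
void
apply_pending_weak (struct toy_decl *decl)
{
  size_t i;

  for (i = 0; i < n_pending; i++)
    if (pending[i].decl == NULL && strcmp (pending[i].name, decl->name) == 0)
      {
        pending[i].decl = decl;
        decl->weak = 1;
        return;
      }
}
```

Entries whose `decl` is still NULL at the end of the file are the case discussed above: a string was weakened with no DECL known for it.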

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 10:18     ` David Edelsohn
  2002-03-06 10:59       ` Richard Henderson
@ 2002-03-06 11:43       ` Stan Shebs
  1 sibling, 0 replies; 875+ messages in thread
From: Stan Shebs @ 2002-03-06 11:43 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Cagney, Alan Modra, Richard Henderson, gcc-patches

David Edelsohn wrote:
> 
>         The problem appears to be that all of the pragma and weak support
> is in varasm.c, but it only is needed for C-like languages.  Because this
> is tied with varasm, I don't think that one can separate it out into a
> file only compiled for C-like languages.
> 
>         varasm.c includes c-tree.h, so there was no reason to believe that
> those symbols were not safe.

That sounds like a bogosity in varasm.c.

>         gcc/cp/decl.c defines lookup_name, but with an extra parameter
> which currently is random garbage when this is used from the C++
> front-end.  Sigh.

I can recount my personal unfortunate experience with this; Apple's
2.95.2 port called lookup_name from the backend, but I was never
able to get this to work completely correctly in 3.x for all the
languages, despite various experiments with lookup_name workalikes
in the C++ frontend.  Shouldn't have been surprising, because
"names" in C++ are not flat and global as in other languages.

In the end, I went to section info encoding, which worked much
better.

For your case, it sounds like you want code in c-common.c and
friends, and maybe some new langhooks called from varasm.c.

Stan

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 11:27         ` David Edelsohn
@ 2002-03-06 12:41           ` Richard Henderson
  2002-03-06 14:18             ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-03-06 12:41 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Cagney, Alan Modra, gcc-patches

On Wed, Mar 06, 2002 at 02:27:36PM -0500, David Edelsohn wrote:
> 	The whole problem, and what Alan is trying to get around, is that
> one may not know the DECL at the time one sees the #pragma weak.

Yes, I know.

> Otherwise, we are back to weakening strings without any knowledge of a
> DECL to which it might correspond.

Not at all.  You just have to do it in the C/C++ front end rather
than in the generic part of the compiler.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 12:41           ` Richard Henderson
@ 2002-03-06 14:18             ` David Edelsohn
  2002-03-06 14:22               ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-06 14:18 UTC (permalink / raw)
  To: Richard Henderson, Andrew Cagney, Alan Modra, gcc-patches

>>>>> Richard Henderson writes:

>> Otherwise, we are back to weakening strings without any knowledge of a
>> DECL to which it might correspond.

Richard> Not at all.  You just have to do it in the C/C++ front end rather
Richard> than in the generic part of the compiler.

	Do what?  What is "it"?  Have each language provide its own
definition of weak_finish() instead of lookup_name()?

	Maybe Alan understands what you mean.  I think moving some of this
into the C/C++ front-end is going to start a domino effect of other
dependencies.  I cannot see a clean place to make a modular cut, but I
guess I'm just being dense.

Sorry, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 14:18             ` David Edelsohn
@ 2002-03-06 14:22               ` Richard Henderson
  2002-03-06 15:06                 ` David Edelsohn
  2002-03-06 15:18                 ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2002-03-06 14:22 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Cagney, Alan Modra, gcc-patches

On Wed, Mar 06, 2002 at 05:18:16PM -0500, David Edelsohn wrote:
> 	Do what?  What is "it"?  Have each language provide its own
> definition of weak_finish() instead of lookup_name()?

Basically, yes.  Though I would actually remove weak_finish
entirely and process #pragma weak forward declarations in
finish_decl and finish_function or something.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 14:22               ` Richard Henderson
@ 2002-03-06 15:06                 ` David Edelsohn
  2002-03-06 15:07                   ` Richard Henderson
  2002-03-06 15:18                 ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-06 15:06 UTC (permalink / raw)
  To: Richard Henderson, Andrew Cagney, Alan Modra, gcc-patches

>>>>> Richard Henderson writes:

Richard> Basically, yes.  Though I would actually remove weak_finish
Richard> entirely and process #pragma weak forward declarations in
Richard> finish_decl and finish_function or something.

	That means weak_decls now needs to be global, but it is defined in
varasm.c so we will not have any undefined symbols.

	As far as finish_decl or finish_function or something, what if I
have

#pragma weak a
#pragma weak b
#pragma weak c

extern a();

/* use a */

extern b();

/* use b */

extern c();

/* use c */

Currently, weak_finish() runs at the end of the file, not after each function.
There isn't any language-dependent finish_file().  It seems that we would
need to add that to each language, or just add weak_finish to each
language, possibly a no-op.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 15:06                 ` David Edelsohn
@ 2002-03-06 15:07                   ` Richard Henderson
  2002-03-06 15:09                     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-03-06 15:07 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Cagney, Alan Modra, gcc-patches

On Wed, Mar 06, 2002 at 05:39:05PM -0500, David Edelsohn wrote:
> 	That means weak_decls now needs to be global, but it is defined in
> varasm.c so we will not have any undefined symbols.

No, the *entire* processing of #pragma weak should be moved.

> Current weak_finish() is run at the end of the file, not each function.

Yeah, so?  Means we'd properly get DECL_WEAK set asap.

Which in fact fixes a bug.  Consider

	#pragma weak a
	int a;
	int foo() { return a; }

	int b __attribute__((weak));
	int bar() { return b; }

Compiled on alpha-linux, with ./cc1 -O -fno-common,

foo:
        ldah $1,a($29)          !gprelhigh
        ldl $0,a($1)            !gprellow
        ret $31,($26),1

bar:
        ldq $1,b($29)           !literal
        ldl $0,0($1)
        ret $31,($26),1

We've miscompiled foo.  It should look like bar, since the weak
binding means that the variable may not be resolved within the
current module.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 15:07                   ` Richard Henderson
@ 2002-03-06 15:09                     ` David Edelsohn
  2002-03-06 15:13                       ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-06 15:09 UTC (permalink / raw)
  To: Richard Henderson, Andrew Cagney, Alan Modra, gcc-patches

>>>>> Richard Henderson writes:

Richard> On Wed, Mar 06, 2002 at 05:39:05PM -0500, David Edelsohn wrote:
>> That means weak_decls now needs to be global, but it is defined in
>> varasm.c so we will not have any undefined symbols.

Richard> No, the *entire* processing of #pragma weak should be moved.

	I don't mean move everything.  weak_finish walks weak_decls list.
We now need to walk the list outside varasm.c which means that the list
head cannot be static.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 15:09                     ` David Edelsohn
@ 2002-03-06 15:13                       ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-03-06 15:13 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Cagney, Alan Modra, gcc-patches

On Wed, Mar 06, 2002 at 06:09:36PM -0500, David Edelsohn wrote:
> Richard> No, the *entire* processing of #pragma weak should be moved.
> 
> 	I don't mean move everything.

I do.

> weak_finish walks weak_decls list.

Yes.

> We now need to walk the list outside varasm.c which means that the list
> head cannot be static.

No, it means all of

	weak_decls
	mark_weak_decls
	add_weak
	weak_finish
	remove_from_pending_weak_list

should be moved and/or rewritten for c-common.c.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 14:22               ` Richard Henderson
  2002-03-06 15:06                 ` David Edelsohn
@ 2002-03-06 15:18                 ` Alan Modra
  2002-03-06 19:01                   ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-03-06 15:18 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Andrew Cagney, gcc-patches

On Wed, Mar 06, 2002 at 02:22:49PM -0800, Richard Henderson wrote:
> On Wed, Mar 06, 2002 at 05:18:16PM -0500, David Edelsohn wrote:
> > 	Do what?  What is "it"?  Have each language provide its own
> > definition of weak_finish() instead of lookup_name()?
> 
> Basically, yes.  Though I would actually remove weak_finish
> entirely and process #pragma weak forward declarations in
> finish_decl and finish_function or something.

How about looking through the weak_decls list from pushdecl?
Something like the following.  Totally untested btw; I'm not even
sure I've got the list handling right.  I'd need to do something
similar in cp/decl.c too.
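
The list handling in question can be tried in isolation. This is only a sketch under assumptions: `weak_entry`, `add_entry`, `tie_decl` and `count_pending` are hypothetical names, names are compared with strcmp rather than by identifier pointer, and the DECL is an opaque pointer. The idea is that entries that already have a DECL sit at the front of the list, and a tail pointer marks where the still-pending "#pragma weak" entries begin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct weak_entry
{
  struct weak_entry *next;
  const void *decl;             /* NULL while only a pragma was seen.  */
  const char *name;
};

static struct weak_entry *weak_list;
/* Points at the link that leads to the first pending entry.  */
static struct weak_entry **pending_tail = &weak_list;

/* Add an entry.  Entries with a DECL go in front of the pending part;
   entries without one become the first pending entry.  */
int
add_entry (const void *decl, const char *name)
{
  struct weak_entry *e = malloc (sizeof *e);

  if (e == NULL)
    return 0;
  e->decl = decl;
  e->name = name;
  e->next = *pending_tail;
  *pending_tail = e;
  if (decl)
    pending_tail = &e->next;
  return 1;
}

/* Tie DECL to a pending entry called NAME.  The entry is unlinked and
   re-inserted just before the pending part.  Returns 1 if found.  */
int
tie_decl (const void *decl, const char *name)
{
  struct weak_entry **p;
  struct weak_entry *e;

  for (p = pending_tail; (e = *p) != NULL; p = &e->next)
    if (strcmp (e->name, name) == 0)
      {
        e->decl = decl;
        *p = e->next;
        e->next = *pending_tail;
        *pending_tail = e;
        pending_tail = &e->next;
        return 1;
      }
  return 0;
}

/* Number of entries still waiting for a DECL.  */
size_t
count_pending (void)
{
  size_t n = 0;
  const struct weak_entry *e;

  for (e = *pending_tail; e != NULL; e = e->next)
    n++;
  return n;
}
```

Scanning from the tail pointer means a declaration only ever searches the entries that have no DECL yet, at the cost of having to keep the tail pointer correct whenever an entry is removed.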

Index: gcc/c-decl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-decl.c,v
retrieving revision 1.305
diff -u -p -r1.305 c-decl.c
--- c-decl.c	2002/03/05 02:34:05	1.305
+++ c-decl.c	2002/03/06 23:11:24
@@ -2289,6 +2289,8 @@ pushdecl (x)
 		pedwarn ("`%s' was declared `extern' and later `static'",
 			 IDENTIFIER_POINTER (name));
 	    }
+
+	  tie_decl_to_weaks (x);
 	}
       else
 	{
Index: gcc/varasm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/varasm.c,v
retrieving revision 1.255
diff -u -p -r1.255 varasm.c
--- varasm.c	2002/03/03 04:50:53	1.255
+++ varasm.c	2002/03/06 23:11:28
@@ -42,7 +42,6 @@ Software Foundation, 59 Temple Place - S
 #include "obstack.h"
 #include "hashtab.h"
 #include "c-pragma.h"
-#include "c-tree.h"
 #include "ggc.h"
 #include "langhooks.h"
 #include "tm_p.h"
@@ -5036,6 +5035,7 @@ struct weak_syms
 };
 
 static struct weak_syms * weak_decls;
+static struct weak_syms ** weak_decls_tail = &weak_decls;
 
 /* Mark weak_decls for garbage collection.  */
 
@@ -5065,15 +5065,37 @@ add_weak (decl, name, value)
   if (weak == NULL)
     return 0;
 
-  weak->next = weak_decls;
   weak->decl = decl;
   weak->name = name;
   weak->value = value;
-  weak_decls = weak;
+  weak->next = *weak_decls_tail;
+  *weak_decls_tail = weak;
+  if (decl)
+    weak_decls_tail = &weak->next;
 
   return 1;
 }
 
+void
+tie_decl_to_weaks (decl)
+     tree decl;
+{
+  tree name = DECL_NAME (decl);
+  struct weak_syms **p;
+  struct weak_syms *weak;
+
+  for (p = weak_decls_tail; (weak = *p) != NULL; p = &weak->next)
+    if (weak->name == name)
+      {
+	weak->decl = decl;
+	*p = weak->next;
+	weak->next = *weak_decls_tail;
+	*weak_decls_tail = weak;
+	weak_decls_tail = &weak->next;
+	break;
+      }
+}
+
 /* Declare DECL to be a weak symbol.  */
 
 void
@@ -5103,14 +5125,7 @@ weak_finish ()
       for (t = weak_decls; t != NULL; t = t->next)
 	{
 #ifdef ASM_WEAKEN_DECL
-	  tree decl = t->decl;
-	  if (decl == NULL_TREE)
-	    {
-	      tree name = get_identifier (t->name);
-	      if (name)
-		decl = lookup_name (name);
-	    }
-	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
+	  ASM_WEAKEN_DECL (asm_out_file, t->decl, t->name, t->value);
 #else
 #ifdef ASM_OUTPUT_WEAK_ALIAS
 	  ASM_OUTPUT_WEAK_ALIAS (asm_out_file, t->name, t->value);
-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 10:59       ` Richard Henderson
  2002-03-06 11:27         ` David Edelsohn
@ 2002-03-06 15:40         ` David Edelsohn
  2002-03-14 11:34         ` David Edelsohn
  2002-03-14 16:00         ` David Edelsohn
  3 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-03-06 15:40 UTC (permalink / raw)
  To: Richard Henderson, Andrew Cagney, Alan Modra, gcc-patches

	I thought that the other varasm.c functions called into the weak
functions, but they only look at flags in DECLs.  Moving all of the
functions supporting weak in varasm.c to c-common.c is a clean solution.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 15:18                 ` Alan Modra
@ 2002-03-06 19:01                   ` Alan Modra
  2002-03-10 14:27                     ` Andrew Cagney
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-03-06 19:01 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Andrew Cagney, gcc-patches

On Thu, Mar 07, 2002 at 09:48:29AM +1030, Alan Modra wrote:
> How about looking through the weak_decls list from pushdecl?

This seems to work.  Tested by building c,c++,f powerpc64-linux, and
using the following testcase.

cat >weakf.c <<EOF
extern int foo (void) __attribute__ ((weak));

#pragma weak bar
extern int bar (void);

extern int var1 __attribute__ ((weak));

#pragma weak var2
extern int var2;

extern int def (void) __attribute__ ((weak));
int def (void) { return var1; }

#pragma weak zzz = def
#pragma weak xxx = def2

#pragma weak def2
int def2 (void) { return var2; }

#pragma weak oops
static int oops (void) { return zzz () + xxx (); }

int main (void)
{
  if (foo)
    return (*foo) ();
  if (bar)
    return (*bar) ();
  if (&var1)
    return var1;
  if (&var2)
    return var2;
  return def ();
}
EOF

powerpc-linux and i686-linux bootstrap in progress.

BTW, how should we treat "#pragma weak foo = foo" ?  My c-pragma.c patch
makes it the same as "#pragma weak foo", but maybe an error is more
appropriate.  This change (or doing something slightly different in
decl_pending_weak) is necessary as decl_pending_weak needs to look
through the weak list twice to handle:

#pragma weak fun
#pragma weak fun2 = fun
int fun (void);

gcc/ChangeLog
	* c-decl.c (pushdecl): Call decl_pending_weak.
	* output.h (decl_pending_weak): Declare.
	* varasm.c (weak_decls_tail): New.
	(add_weak): Manipulate weak_decls_tail.
	(decl_pending_weak): New.
	(weak_finish): Don't lookup_name here.
	(remove_from_pending_weak_list): Fix up weak_decls_tail.  Don't
	strcmp identifiers.
	* c-pragma.c (handle_pragma_weak): Try to find a decl for value.
	Check value != name.

gcc/cp/ChangeLog
	* decl.c (pushdecl): Call decl_pending_weak.

Index: c-decl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-decl.c,v
retrieving revision 1.305
diff -u -p -r1.305 c-decl.c
--- c-decl.c	2002/03/05 02:34:05	1.305
+++ c-decl.c	2002/03/07 02:44:37
@@ -2242,6 +2242,14 @@ pushdecl (x)
 
 	  IDENTIFIER_GLOBAL_VALUE (name) = x;
 
+	  /* Check for a function or var decl weakened by "#pragma weak".  */
+	  if ((TREE_CODE (x) == FUNCTION_DECL || TREE_CODE (x) == VAR_DECL)
+	      && decl_pending_weak (x))
+	    {
+	      TREE_PUBLIC (name) = 1;
+	      DECL_WEAK (x) = 1;
+	    }
+
 	  /* We no longer care about any previous block level declarations.  */
 	  IDENTIFIER_LIMBO_VALUE (name) = 0;
 
Index: c-pragma.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-pragma.c,v
retrieving revision 1.47
diff -u -p -r1.47 c-pragma.c
--- c-pragma.c	2002/03/01 06:00:32	1.47
+++ c-pragma.c	2002/03/07 02:44:37
@@ -26,6 +26,7 @@ Software Foundation, 59 Temple Place - S
 #include "function.h"
 #include "cpplib.h"
 #include "c-pragma.h"
+#include "c-tree.h"
 #include "flags.h"
 #include "toplev.h"
 #include "ggc.h"
@@ -281,8 +282,9 @@ static void
 handle_pragma_weak (dummy)
      cpp_reader *dummy ATTRIBUTE_UNUSED;
 {
-  tree name, value, x;
+  tree name, value, x, decl;
   enum cpp_ttype t;
+  const char *valstr;
 
   value = 0;
 
@@ -298,8 +300,14 @@ handle_pragma_weak (dummy)
   if (t != CPP_EOF)
     warning ("junk at end of #pragma weak");
 
-  add_weak (NULL_TREE, IDENTIFIER_POINTER (name),
-	    value ? IDENTIFIER_POINTER (value) : NULL);
+  decl = NULL_TREE;
+  valstr = NULL;
+  if (value && value != name)
+    {
+      decl = lookup_name (value);
+      valstr = IDENTIFIER_POINTER (value);
+    }
+  add_weak (decl, IDENTIFIER_POINTER (name), valstr);
 }
 #endif
 
Index: output.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/output.h,v
retrieving revision 1.96
diff -u -p -r1.96 output.h
--- output.h	2002/03/01 06:00:33	1.96
+++ output.h	2002/03/07 02:44:37
@@ -231,6 +231,10 @@ extern void mergeable_constant_section	P
 
 /* Declare DECL to be a weak symbol.  */
 extern void declare_weak		PARAMS ((tree));
+
+/* Look for DECL on "#pragma weak" list.  If found return 1 and tie
+   decl to list, otherwise return 0.  */
+extern int decl_pending_weak		PARAMS ((tree));
 #endif /* TREE_CODE */
 
 /* Emit any pending weak declarations.  */
Index: varasm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/varasm.c,v
retrieving revision 1.255
diff -u -p -r1.255 varasm.c
--- varasm.c	2002/03/03 04:50:53	1.255
+++ varasm.c	2002/03/07 02:44:43
@@ -42,7 +42,6 @@ Software Foundation, 59 Temple Place - S
 #include "obstack.h"
 #include "hashtab.h"
 #include "c-pragma.h"
-#include "c-tree.h"
 #include "ggc.h"
 #include "langhooks.h"
 #include "tm_p.h"
@@ -5036,6 +5035,7 @@ struct weak_syms
 };
 
 static struct weak_syms * weak_decls;
+static struct weak_syms ** weak_decls_tail = &weak_decls;
 
 /* Mark weak_decls for garbage collection.  */
 
@@ -5050,7 +5050,8 @@ mark_weak_decls (arg)
 }
 
 /* Add function NAME to the weak symbols list.  VALUE is a weak alias
-   associated with NAME.  */
+   associated with NAME.  If DECL is 0, we are being called from
+   #pragma weak handler.  *WEAK_DECLS_TAIL points to such entries.  */
 
 int
 add_weak (decl, name, value)
@@ -5065,15 +5066,53 @@ add_weak (decl, name, value)
   if (weak == NULL)
     return 0;
 
-  weak->next = weak_decls;
   weak->decl = decl;
   weak->name = name;
   weak->value = value;
-  weak_decls = weak;
+  weak->next = *weak_decls_tail;
+  *weak_decls_tail = weak;
+  if (decl)
+    weak_decls_tail = &weak->next;
 
   return 1;
 }
 
+/* Look for DECL on "#pragma weak" list.  If found return 1 and tie
+   decl to list, otherwise return 0.  */
+
+int
+decl_pending_weak (decl)
+     tree decl;
+{
+  const char *name = IDENTIFIER_POINTER (DECL_NAME (decl));
+  struct weak_syms **p;
+  struct weak_syms *weak;
+
+  for (p = weak_decls_tail; (weak = *p) != NULL; p = &weak->next)
+    if (weak->value == name)
+      {
+	weak->decl = decl;
+	*p = weak->next;
+	weak->next = *weak_decls_tail;
+	*weak_decls_tail = weak;
+	weak_decls_tail = &weak->next;
+	break;
+      }
+
+  for (p = weak_decls_tail; (weak = *p) != NULL; p = &weak->next)
+    if (weak->name == name)
+      {
+	weak->decl = decl;
+	*p = weak->next;
+	weak->next = *weak_decls_tail;
+	*weak_decls_tail = weak;
+	weak_decls_tail = &weak->next;
+	return 1;
+      }
+
+  return 0;
+}
+
 /* Declare DECL to be a weak symbol.  */
 
 void
@@ -5103,14 +5142,7 @@ weak_finish ()
       for (t = weak_decls; t != NULL; t = t->next)
 	{
 #ifdef ASM_WEAKEN_DECL
-	  tree decl = t->decl;
-	  if (decl == NULL_TREE)
-	    {
-	      tree name = get_identifier (t->name);
-	      if (name)
-		decl = lookup_name (name);
-	    }
-	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
+	  ASM_WEAKEN_DECL (asm_out_file, t->decl, t->name, t->value);
 #else
 #ifdef ASM_OUTPUT_WEAK_ALIAS
 	  ASM_OUTPUT_WEAK_ALIAS (asm_out_file, t->name, t->value);
@@ -5140,9 +5172,11 @@ remove_from_pending_weak_list (name)
   for (p = &weak_decls; *p; )
     {
       t = *p;
-      if (strcmp (name, t->name) == 0)
+      if (name == t->name)
         {
           *p = t->next;
+	  if (weak_decls_tail == &t->next)
+	    weak_decls_tail = p;
           free (t);
         }
       else
Index: cp/decl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/decl.c,v
retrieving revision 1.869
diff -u -p -r1.869 decl.c
--- decl.c	2002/03/03 14:07:29	1.869
+++ decl.c	2002/03/07 02:44:58
@@ -4087,6 +4087,14 @@ pushdecl (x)
  		  || TREE_CODE (x) == TEMPLATE_DECL))
  	    SET_IDENTIFIER_NAMESPACE_VALUE (name, x);
 
+	  /* Check for a function or var decl weakened by "#pragma weak".  */
+	  if ((TREE_CODE (x) == FUNCTION_DECL || TREE_CODE (x) == VAR_DECL)
+	      && decl_pending_weak (x))
+	    {
+	      TREE_PUBLIC (name) = 1;
+	      DECL_WEAK (x) = 1;
+	    }
+
 	  /* Don't forget if the function was used via an implicit decl.  */
 	  if (IDENTIFIER_IMPLICIT_DECL (name)
 	      && TREE_USED (IDENTIFIER_IMPLICIT_DECL (name)))

^ permalink raw reply	[flat|nested] 875+ messages in thread

* biggest alignment for sysv4.h altivec
@ 2002-03-08 14:25 Aldy Hernandez
  2002-03-08 14:49 ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Aldy Hernandez @ 2002-03-08 14:25 UTC (permalink / raw)
  To: dje, gcc-patches

obvious.

committing to branch and mainline.

2002-03-08  Aldy Hernandez  <aldyh@redhat.com>

	* config/rs6000/sysv4.h (BIGGEST_ALIGNMENT): Change for altivec.

Index: config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/uberbaum/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.84
diff -c -p -r1.84 sysv4.h
*** sysv4.h	2002/02/19 19:40:41	1.84
--- sysv4.h	2002/03/08 22:23:47
*************** do {									\
*** 385,391 ****
  
  /* No data type wants to be aligned rounder than this.  */
  #undef	BIGGEST_ALIGNMENT
! #define BIGGEST_ALIGNMENT (TARGET_EABI ? 64 : 128)
  
  /* An expression for the alignment of a structure field FIELD if the
     alignment computed in the usual way is COMPUTED.  */
--- 385,391 ----
  
  /* No data type wants to be aligned rounder than this.  */
  #undef	BIGGEST_ALIGNMENT
! #define BIGGEST_ALIGNMENT ((TARGET_EABI && !TARGET_ALTIVEC) ? 64 : 128)
  
  /* An expression for the alignment of a structure field FIELD if the
     alignment computed in the usual way is COMPUTED.  */

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 14:25 biggest alignment for sysv4.h altivec Aldy Hernandez
@ 2002-03-08 14:49 ` Geoff Keating
  2002-03-08 14:52   ` Aldy Hernandez
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-03-08 14:49 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches

Aldy Hernandez <aldyh@redhat.com> writes:

> obvious.

Not obvious to me.

Why didn't you just change it to 128 always?

> committing to branch and mainline.
> 
> 2002-03-08  Aldy Hernandez  <aldyh@redhat.com>
> 
> 	* config/rs6000/sysv4.h (BIGGEST_ALIGNMENT): Change for altivec.
> 
> Index: config/rs6000/sysv4.h
> ===================================================================
> RCS file: /cvs/uberbaum/gcc/config/rs6000/sysv4.h,v
> retrieving revision 1.84
> diff -c -p -r1.84 sysv4.h
> *** sysv4.h	2002/02/19 19:40:41	1.84
> --- sysv4.h	2002/03/08 22:23:47
> *************** do {									\
> *** 385,391 ****
>   
>   /* No data type wants to be aligned rounder than this.  */
>   #undef	BIGGEST_ALIGNMENT
> ! #define BIGGEST_ALIGNMENT (TARGET_EABI ? 64 : 128)
>   
>   /* An expression for the alignment of a structure field FIELD if the
>      alignment computed in the usual way is COMPUTED.  */
> --- 385,391 ----
>   
>   /* No data type wants to be aligned rounder than this.  */
>   #undef	BIGGEST_ALIGNMENT
> ! #define BIGGEST_ALIGNMENT ((TARGET_EABI && !TARGET_ALTIVEC) ? 64 : 128)
>   
>   /* An expression for the alignment of a structure field FIELD if the
>      alignment computed in the usual way is COMPUTED.  */

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 14:49 ` Geoff Keating
@ 2002-03-08 14:52   ` Aldy Hernandez
  2002-03-08 15:16     ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Aldy Hernandez @ 2002-03-08 14:52 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches


On Saturday, March 9, 2002, at 09:49  AM, Geoff Keating wrote:

> Aldy Hernandez <aldyh@redhat.com> writes:
>
>> obvious.
>
> Not obvious to me.

this is for the sysv4.h header file, not just for altivec.

do you want me to change BIGGEST_ALIGNMENT to 128 for every
sysv4.h variant?

aldy

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 14:52   ` Aldy Hernandez
@ 2002-03-08 15:16     ` Geoff Keating
  2002-03-08 15:26       ` Aldy Hernandez
                         ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Geoff Keating @ 2002-03-08 15:16 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches

Aldy Hernandez <aldyh@redhat.com> writes:

> On Saturday, March 9, 2002, at 09:49  AM, Geoff Keating wrote:
> 
> > Aldy Hernandez <aldyh@redhat.com> writes:
> >
> >> obvious.
> >
> > Not obvious to me.
> 
> this is for the sysv4.h header file, not just for altivec.
> 
> do you want me to change BIGGEST_ALIGNMENT to 128 for every
> sysv4.h variant?

I would look at it the other way.  Why should BIGGEST_ALIGNMENT
differ?  It doesn't change when we switch on -msoft-float, even though
if we don't have hardware doubles then we don't need 64-bit alignment.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 15:16     ` Geoff Keating
@ 2002-03-08 15:26       ` Aldy Hernandez
  2002-03-08 15:48         ` Geoff Keating
  2002-03-08 18:52       ` Richard Henderson
  2002-03-08 20:09       ` David Edelsohn
  2 siblings, 1 reply; 875+ messages in thread
From: Aldy Hernandez @ 2002-03-08 15:26 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

>> this is for the sysv4.h header file, not just for altivec.
>>
>> do you want me to change BIGGEST_ALIGNMENT to 128 for every
>> sysv4.h variant?
>
> I would look at it the other way.  Why should BIGGEST_ALIGNMENT
> differ?  It doesn't change when we switch on -msoft-float, even though
> if we don't have hardware doubles then we don't need 64-bit alignment.

BIGGEST_ALIGNMENT was already set to 64 in the eabi case, all i'm
doing is forcing it to 128 when eabi && altivec.

i can come up with a testcase that fails without the patch.

jeff johnston has some code for printf with vectors that either needs
this patch or needs -mstrict-align (or -mno-eabi :)).

cheers

--
Aldy Hernandez                                E-mail: aldyh@redhat.com
Professional Gypsy Lost in Australia
Red Hat, Inc.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 15:26       ` Aldy Hernandez
@ 2002-03-08 15:48         ` Geoff Keating
  2002-03-08 15:53           ` Aldy Hernandez
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-03-08 15:48 UTC (permalink / raw)
  To: aldyh; +Cc: gcc-patches

> Date: Sat, 9 Mar 2002 10:27:15 +1100
> Cc: gcc-patches@gcc.gnu.org
> From: Aldy Hernandez <aldyh@redhat.com>
> 
> >> this is for the sysv4.h header file, not just for altivec.
> >>
> >> do you want me to change BIGGEST_ALIGNMENT to 128 for every
> >> sysv4.h variant?
> >
> > I would look at it the other way.  Why should BIGGEST_ALIGNMENT
> > differ?  It doesn't change when we switch on -msoft-float, even though
> > if we don't have hardware doubles then we don't need 64-bit alignment.
> 
> BIGGEST_ALIGNMENT was already set to 64 in the eabi case, all i'm
> doing is forcing it to 128 when eabi && altivec.

Sure.  Why only when altivec?  Why not always?

> i can come up with a testcase that fails without the patch.

Yes, I'm sure some patch is needed.  The question is whether this
particular patch is the best choice.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 15:48         ` Geoff Keating
@ 2002-03-08 15:53           ` Aldy Hernandez
  2002-03-08 17:34             ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Aldy Hernandez @ 2002-03-08 15:53 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

>> BIGGEST_ALIGNMENT was already set to 64 in the eabi case, all i'm
>> doing is forcing it to 128 when eabi && altivec.
>
> Sure.  Why only when altivec?  Why not always?

pffttt, sure i don't care.  i don't know why it was conditionalized
on TARGET_EABI in the first place?  it could have been me for all
i know.

i don't think increasing alignment is ever a problem :)

if dje and you agree, i'm all about setting it to 128.
better yet, removing the definition altogether, since rs6000.h
already sets it to 128 unconditionally.

cheerios
aldy

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 15:53           ` Aldy Hernandez
@ 2002-03-08 17:34             ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-03-08 17:34 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: Geoff Keating, gcc-patches

On Sat, Mar 09, 2002 at 10:53:57AM +1100, Aldy Hernandez wrote:
> i don't think increasing alignment is ever a problem :)

Yes, it is.  BIGGEST_ALIGNMENT affects structure layout.
It affects how __attribute__((aligned)) is treated.
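
A small illustration of why alignment is a layout matter, assuming a compiler that accepts GCC's aligned attribute and a 4-byte int. It does not touch BIGGEST_ALIGNMENT itself; it only shows that raising the alignment of a member moves its offset and grows the enclosing structure, so two compilers that disagree on alignment disagree on layout. The struct and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Member with its natural alignment.  */
struct plain
{
  char tag;
  int payload[4];
};

/* Same members, but the payload is forced to a 16-byte (128-bit)
   boundary, as a vector type would be.  */
struct vec_aligned
{
  char tag;
  int payload[4] __attribute__ ((aligned (16)));
};

/* Offset of the payload in each layout.  */
size_t
plain_payload_offset (void)
{
  return offsetof (struct plain, payload);
}

size_t
aligned_payload_offset (void)
{
  return offsetof (struct vec_aligned, payload);
}
```

Code built with one layout cannot safely pass such a structure to code built with the other.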


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 15:16     ` Geoff Keating
  2002-03-08 15:26       ` Aldy Hernandez
@ 2002-03-08 18:52       ` Richard Henderson
  2002-03-08 20:09       ` David Edelsohn
  2 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-03-08 18:52 UTC (permalink / raw)
  To: Geoff Keating; +Cc: Aldy Hernandez, gcc-patches

On Fri, Mar 08, 2002 at 03:16:39PM -0800, Geoff Keating wrote:
> I would look at it the other way.  Why should BIGGEST_ALIGNMENT
> differ?

BIGGEST_ALIGNMENT affects the ABI.  You can't change it
arbitrarily.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 15:16     ` Geoff Keating
  2002-03-08 15:26       ` Aldy Hernandez
  2002-03-08 18:52       ` Richard Henderson
@ 2002-03-08 20:09       ` David Edelsohn
  2002-03-09  2:11         ` Geoff Keating
  2 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-08 20:09 UTC (permalink / raw)
  To: Geoff Keating, aldyh; +Cc: gcc-patches

>>>>> Geoff Keating writes:

| I would look at it the other way.  Why should BIGGEST_ALIGNMENT
| differ?  It doesn't change when we switch on -msoft-float, even though
| if we don't have hardware doubles then we don't need 64-bit alignment.

	The Altivec ABI is an extension to the SVR4 and Darwin/AIX ABIs.
It increases the biggest alignment to 128 bits.

	-msoft-float is not a new or extended ABI.  It is the original ABI
with FPRs disabled and floating point emulated in GPRs.  It is intended to
be compatible with the original ABI and the original ABI's alignment
rules. 

	The two issues are not similar enough to use as justification or
precedent for the other.

	  BIGGEST_ALIGNMENT needs to change because the Altivec ABI says
that it is different.  One cannot uniformly change BIGGEST_ALIGNMENT to
128 because that is not what the SVR4 ABI specifies.  One needs the
conditional as Aldy has proposed.
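
	A rough sketch of the conditional being discussed, written as a
plain C function rather than the real target macro.  The parameter names
stand in for TARGET_EABI and TARGET_ALTIVEC; the function itself is
hypothetical and only mirrors Aldy's description (64 for EABI, 128 once
AltiVec is enabled, 128 elsewhere).

```c
#include <stdbool.h>

/* Biggest alignment in bits for the given (hypothetical) target
   flags: 64 for plain EABI, 128 in every other case.  */
int
biggest_alignment_bits (bool eabi, bool altivec)
{
  return (eabi && !altivec) ? 64 : 128;
}
```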

Thanks, David

* Re: biggest alignment for sysv4.h altivec
  2002-03-08 20:09       ` David Edelsohn
@ 2002-03-09  2:11         ` Geoff Keating
  2002-03-09 15:09           ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-03-09  2:11 UTC (permalink / raw)
  To: dje; +Cc: aldyh, gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Fri, 08 Mar 2002 23:08:18 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff Keating writes:
> 
> | I would look at it the other way.  Why should BIGGEST_ALIGNMENT
> | differ?  It doesn't change when we switch on -msoft-float, even though
> | if we don't have hardware doubles then we don't need 64-bit alignment.
> 
> 	The Altivec ABI is an extension to the SVR4 and Darwin/AIX ABIs.
> It increases the biggest alignment to 128 bits.
> 
> 	-msoft-float is not a new or extended ABI.  It is the original ABI
> with FPRs disabled and floating point emulated in GPRs.  It is intended to
> be compatible with the original ABI and the original ABI's alignment
> rules. 
> 
> 	The two issues are not similar enough to use as justification or
> precedent for the other.
> 
> 	  BIGGEST_ALIGNMENT needs to change because the Altivec ABI says
> that it is different.  One cannot uniformly change BIGGEST_ALIGNMENT to
> 128 because that is not what the SVR4 ABI specifies.  One needs the
> conditional as Aldy has proposed.

That's an interesting point, and helped clarify my thinking on this a
lot.  Thanks!

I presume you meant EABI when you said SVR4 ABI.  The SVR4 ABI has
always had 128-bit alignment of the stack frame and other objects.

The published EABI also has certain objects that do require 128-bit
alignment.  Although the stack is not 128-bit aligned, for
compatibility with the SVR4 ABI it has always been the case that, if
'long double' was implemented (as in the -mlong-double-128 flag), then

struct x {
  int a;
  long double d;
};

has had 'd' aligned to a 128-bit boundary (even though if it was on
the stack or a global it would be aligned only to a 64-bit boundary,
and in fact there's no guarantee the structure will be properly
aligned).  I don't think we do this properly now; we should, and one
step towards doing it is changing BIGGEST_ALIGNMENT.
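
A testcase along these lines could start from something like the sketch
below.  It is only an illustration, not the testcase I will commit: the
helper names are made up, and the values it reports depend entirely on
the target and on whether 'long double' is 128 bits there.

```c
#include <stddef.h>

struct x
{
  int a;
  long double d;
};

/* Byte offset of member D inside struct x.  */
size_t
offset_of_d (void)
{
  return offsetof (struct x, d);
}

/* Alignment the compiler requires for long double, in bytes.  */
size_t
align_of_long_double (void)
{
  return _Alignof (long double);
}
```

On a target that gives a 128-bit 'long double' 128-bit alignment inside
structures, one would expect offset_of_d to be a multiple of 16; a real
testcase would check that under -mlong-double-128.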

And now, in a neat twist of logic, I can argue that -mno-long-double-128
is just like -msoft-float.

We have different restrictions on changing GCC's EABI implementation
than changing the SVR4 ABI implementation, especially if it's fixing
bugs.  Changing the SVR4 ABI implementation, even to fix a bug, is
difficult because of the problems it causes with binary compatibility
under Linux.  The EABI is intended for embedded targets and so has
fewer compatibility requirements.

I will therefore commit a patch to change BIGGEST_ALIGNMENT and a
testcase to check that the long-double structure alignment is correct.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

* Re: biggest alignment for sysv4.h altivec
  2002-03-09  2:11         ` Geoff Keating
@ 2002-03-09 15:09           ` David Edelsohn
  2002-03-09 16:17             ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-09 15:09 UTC (permalink / raw)
  To: Geoff Keating; +Cc: aldyh, gcc-patches

>>>>> Geoff Keating writes:

Geoff> I will therefore commit a patch to change BIGGEST_ALIGNMENT and a
Geoff> testcase to check that the long-double structure alignment is correct.

	Does correct mean correct for the long double mode in effect or
uniformly 128 bit alignment?  While long double is implemented as 64-bit
double, the object is only 64 bits, not 64 bits embedded in 128 bits.
Having eABI use 128-bit alignment when -mlong-double-128 is in effect does
seem correct.

Thanks, David

* Re: biggest alignment for sysv4.h altivec
  2002-03-09 15:09           ` David Edelsohn
@ 2002-03-09 16:17             ` Geoff Keating
  0 siblings, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2002-03-09 16:17 UTC (permalink / raw)
  To: dje; +Cc: aldyh, gcc-patches

> cc: aldyh@redhat.com, gcc-patches@gcc.gnu.org
> Date: Sat, 09 Mar 2002 18:08:52 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff Keating writes:
> 
> Geoff> I will therefore commit a patch to change BIGGEST_ALIGNMENT and a
> Geoff> testcase to check that the long-double structure alignment is correct.
> 
> 	Does correct mean correct for the long double mode in effect or
> uniformly 128 bit alignment?  While long double is implemented as 64-bit
> double, the object is only 64 bits, not 64 bits embedded in 128 bits.
> Having eABI use 128-bit alignment when -mlong-double-128 is in effect does
> seem correct.

The change should have effect only when 'long double' is 128 bits.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 19:01                   ` Alan Modra
@ 2002-03-10 14:27                     ` Andrew Cagney
  2002-03-10 14:34                       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Andrew Cagney @ 2002-03-10 14:27 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, David Edelsohn, gcc-patches

> gcc/ChangeLog
> 	* c-decl.c (pushdecl): Call decl_pending_weak.
> 	* output.h (decl_pending_weak): Declare.
> 	* varasm.c (weak_decls_tail): New.
> 	(add_weak): Manipulate weak_decls_tail.
> 	(decl_pending_weak): New.
> 	(weak_finish): Don't lookup_name here.
> 	(remove_from_pending_weak_list): Fix up weak_decls_tail.  Don't
> 	strcmp identifiers.
> 	* c-pragma.c (handle_pragma_weak): Try to find a decl for value.
> 	Check value != name.

How is this patch going?

Andrew



* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-10 14:27                     ` Andrew Cagney
@ 2002-03-10 14:34                       ` David Edelsohn
  2002-03-10 16:00                         ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-10 14:34 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: Alan Modra, Richard Henderson, gcc-patches

>>>>> Andrew Cagney writes:

Andrew> How is this patch going?

	We have two proposed patches: Alan's patch moving the weak support
for pushdecl and Richard's proposal to move the current weak support from
varasm.c to c-common.c.

	Alan has not commented on Richard's proposal and Richard has not
commented on Alan's proposal.  Moving the weak machinery to c-common seems
like a good shift to me because it is self-contained and pragma weak is
C/C++/Obj-C specific.
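
	For anyone following along who has not used the feature, a
minimal sketch of pragma weak from the user's side.  The function names
are invented for the example; it assumes a target and assembler that
support weak symbols.

```c
/* Mark fallback_handler as weak before it is defined, so a strong
   definition in another object file would take precedence at link
   time.  With no such definition, this one is used.  */
#pragma weak fallback_handler

int
fallback_handler (void)
{
  return 42;
}

/* Call whichever definition of fallback_handler the linker chose.  */
int
call_handler (void)
{
  return fallback_handler ();
}
```

The front end has to remember such a pragma until it knows which
declaration it applies to, which is the pending list the two proposals
are moving around.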

David

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-10 14:34                       ` David Edelsohn
@ 2002-03-10 16:00                         ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-03-10 16:00 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Cagney, Alan Modra, gcc-patches

On Sun, Mar 10, 2002 at 05:34:14PM -0500, David Edelsohn wrote:
> 	Alan has not commented on Richard's proposal and Richard has not
> commented on Alan's proposal.  Moving the weak machinery to c-common seems
> like a good shift to me because it is self-contained and pragma weak is
> C/C++/Obj-C specific.

I started on a patch, but got sidetracked in the middle.
Perhaps today or tomorrow.


r~

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 10:59       ` Richard Henderson
  2002-03-06 11:27         ` David Edelsohn
  2002-03-06 15:40         ` David Edelsohn
@ 2002-03-14 11:34         ` David Edelsohn
  2002-03-14 12:02           ` Neil Booth
  2002-03-14 13:47           ` Geoff Keating
  2002-03-14 16:00         ` David Edelsohn
  3 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-03-14 11:34 UTC (permalink / raw)
  To: Richard Henderson, Andrew Cagney, Alan Modra, gcc-patches

	How about a patch like the following?

	Note that remove_from_pending_weak_list now needs to be global
because it is accessed by the recently added varasm.c:globalize_decl().

Thanks, David


	* output.h: Declare remove_from_pending_weak_list.
	* langhooks.h (lang_hooks): Add symbol_finish.
	* langhooks-def.h (LANG_HOOKS_SYMBOL_FINISH): New.
	(LANG_HOOKS_INITIALIZER): Use it.
	* varasm.c: Move weak support from here ...
	* c-common.c: ... to here.
	(c_common_init_options): Add weak_decls to ggc roots.
	* c-lang.c (LANG_HOOKS_SYMBOL_FINISH): Define.
	* objc/objc-lang.c: Same.
	* cp/cp-lang.c: Same.

Index: output.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/output.h,v
retrieving revision 1.96
diff -c -p -r1.96 output.h
*** output.h	2002/03/01 06:00:33	1.96
--- output.h	2002/03/14 19:23:50
*************** extern const char *get_insn_template PAR
*** 139,144 ****
--- 139,147 ----
     associated with NAME.  */
  extern int add_weak PARAMS ((tree, const char *, const char *));
  
+ /* Remove function NAME from the weak symbols list.  */
+ extern void remove_from_pending_weak_list	PARAMS ((const char *));
+ 
  /* Functions in flow.c */
  extern void allocate_for_life_analysis	PARAMS ((void));
  extern int regno_uninitialized		PARAMS ((unsigned int));
Index: langhooks-def.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/langhooks-def.h,v
retrieving revision 1.10
diff -c -p -r1.10 langhooks-def.h
*** langhooks-def.h	2002/03/08 19:20:47	1.10
--- langhooks-def.h	2002/03/14 19:23:50
*************** void lhd_tree_inlining_end_inlining		PAR
*** 67,72 ****
--- 67,73 ----
  #define LANG_HOOKS_NAME			"GNU unknown"
  #define LANG_HOOKS_IDENTIFIER_SIZE	sizeof (struct lang_identifier)
  #define LANG_HOOKS_INIT			lhd_do_nothing
+ #define LANG_HOOKS_SYMBOL_FINISH	lhd_do_nothing
  #define LANG_HOOKS_FINISH		lhd_do_nothing
  #define LANG_HOOKS_CLEAR_BINDING_STACK	lhd_clear_binding_stack
  #define LANG_HOOKS_INIT_OPTIONS		lhd_do_nothing
*************** int lhd_tree_dump_type_quals			PARAMS ((
*** 140,145 ****
--- 141,147 ----
    LANG_HOOKS_DECODE_OPTION, \
    LANG_HOOKS_POST_OPTIONS, \
    LANG_HOOKS_INIT, \
+   LANG_HOOKS_SYMBOL_FINISH, \
    LANG_HOOKS_FINISH, \
    LANG_HOOKS_CLEAR_BINDING_STACK, \
    LANG_HOOKS_GET_ALIAS_SET, \
Index: langhooks.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/langhooks.h,v
retrieving revision 1.17
diff -c -p -r1.17 langhooks.h
*** langhooks.h	2002/03/08 19:20:47	1.17
--- langhooks.h	2002/03/14 19:23:50
*************** struct lang_hooks
*** 101,106 ****
--- 101,110 ----
       immediately.  */
    const char * (*init) PARAMS ((const char *));
  
+   /* Called near the end of compilation, to emit any pending symbols,
+      such as weak declarations.  */
+   void (*symbol_finish) PARAMS ((void));
+ 
    /* Called at the end of compilation, as a finalizer.  */
    void (*finish) PARAMS ((void));
  
Index: c-common.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/c-common.c,v
retrieving revision 1.297
diff -c -p -r1.297 c-common.c
*** c-common.c	2002/03/13 01:42:29	1.297
--- c-common.c	2002/03/14 19:23:50
*************** boolean_increment (code, arg)
*** 4023,4028 ****
--- 4023,4156 ----
    return val;
  }
  \f
+ /* This structure contains any weak symbol declarations waiting
+    to be emitted.  */
+ struct weak_syms
+ {
+   struct weak_syms * next;
+   tree decl;
+   const char * name;
+   const char * value;
+ };
+ 
+ static struct weak_syms * weak_decls;
+ 
+ static void mark_weak_decls		PARAMS ((void *));
+ 
+ /* Mark weak_decls for garbage collection.  */
+ 
+ static void
+ mark_weak_decls (arg)
+      void *arg;
+ {
+   struct weak_syms *t;
+ 
+   for (t = *(struct weak_syms **) arg; t != NULL; t = t->next)
+     ggc_mark_tree (t->decl);
+ }
+ 
+ /* Add function NAME to the weak symbols list.  VALUE is a weak alias
+    associated with NAME.  */
+ 
+ int
+ add_weak (decl, name, value)
+      tree decl;
+      const char *name;
+      const char *value;
+ {
+   struct weak_syms *weak;
+ 
+   weak = (struct weak_syms *) xmalloc (sizeof (struct weak_syms));
+ 
+   if (weak == NULL)
+     return 0;
+ 
+   weak->next = weak_decls;
+   weak->decl = decl;
+   weak->name = name;
+   weak->value = value;
+   weak_decls = weak;
+ 
+   return 1;
+ }
+ 
+ /* Declare DECL to be a weak symbol.  */
+ 
+ void
+ declare_weak (decl)
+      tree decl;
+ {
+   if (! TREE_PUBLIC (decl))
+     error_with_decl (decl, "weak declaration of `%s' must be public");
+   else if (TREE_ASM_WRITTEN (decl))
+     error_with_decl (decl, "weak declaration of `%s' must precede definition");
+   else if (SUPPORTS_WEAK)
+     add_weak (decl, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)), NULL);
+   else
+     warning_with_decl (decl, "weak declaration of `%s' not supported");
+ 
+   DECL_WEAK (decl) = 1;
+ }
+ 
+ /* Emit any pending weak declarations.  */
+ 
+ void
+ weak_finish ()
+ {
+   if (SUPPORTS_WEAK)
+     {
+       struct weak_syms *t;
+       for (t = weak_decls; t != NULL; t = t->next)
+ 	{
+ #ifdef ASM_WEAKEN_DECL
+ 	  tree decl = t->decl;
+ 	  if (decl == NULL_TREE)
+ 	    {
+ 	      tree name = get_identifier (t->name);
+ 	      if (name)
+ 		decl = lookup_name (name);
+ 	    }
+ 	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
+ #else
+ #ifdef ASM_OUTPUT_WEAK_ALIAS
+ 	  ASM_OUTPUT_WEAK_ALIAS (asm_out_file, t->name, t->value);
+ #else
+ #ifdef ASM_WEAKEN_LABEL
+ 	  if (t->value)
+ 	    abort ();
+ 	  ASM_WEAKEN_LABEL (asm_out_file, t->name);
+ #endif
+ #endif
+ #endif
+ 	}
+     }
+ }
+ 
+ /* Remove NAME from the pending list of weak symbols.  This prevents
+    the compiler from emitting multiple .weak directives which confuses
+    some assemblers.  */
+ #if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
+ void
+ remove_from_pending_weak_list (name)
+      const char *name;
+ {
+   struct weak_syms *t;
+   struct weak_syms **p;
+ 
+   for (p = &weak_decls; *p; )
+     {
+       t = *p;
+       if (strcmp (name, t->name) == 0)
+         {
+           *p = t->next;
+           free (t);
+         }
+       else
+         p = &(t->next);
+     }
+ }
+ #endif /* defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL) */
+ \f
  /* Handle C and C++ default attributes.  */
  
  enum built_in_attribute
*************** c_common_init_options (lang)
*** 4058,4063 ****
--- 4186,4193 ----
  
    /* Mark as "unspecified" (see c_common_post_options).  */
    flag_bounds_check = -1;
+ 
+   ggc_add_root (&weak_decls, 1, sizeof weak_decls, mark_weak_decls);
  }
  
  /* Post-switch processing.  */
Index: c-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/c-lang.c,v
retrieving revision 1.76
diff -c -p -r1.76 c-lang.c
*** c-lang.c	2002/03/13 01:42:30	1.76
--- c-lang.c	2002/03/14 19:23:50
*************** Software Foundation, 59 Temple Place - S
*** 23,28 ****
--- 23,29 ----
  #include "config.h"
  #include "system.h"
  #include "tree.h"
+ #include "output.h"
  #include "c-tree.h"
  #include "langhooks.h"
  #include "langhooks-def.h"
*************** static void c_post_options PARAMS ((void
*** 37,42 ****
--- 38,46 ----
  #define LANG_HOOKS_NAME "GNU C"
  #undef LANG_HOOKS_INIT
  #define LANG_HOOKS_INIT c_init
+ #undef LANG_HOOKS_SYMBOL_FINISH
+ #define LANG_HOOKS_SYMBOL_FINISH weak_finish
+ #undef LANG_HOOKS_INIT_OPTIONS
  #undef LANG_HOOKS_FINISH
  #define LANG_HOOKS_FINISH c_common_finish
  #undef LANG_HOOKS_INIT_OPTIONS
Index: cp/cp-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/cp/cp-lang.c,v
retrieving revision 1.13
diff -c -p -r1.13 cp-lang.c
*** cp-lang.c	2002/03/13 01:42:39	1.13
--- cp-lang.c	2002/03/14 19:23:50
*************** Boston, MA 02111-1307, USA.  */
*** 22,27 ****
--- 22,28 ----
  #include "config.h"
  #include "system.h"
  #include "tree.h"
+ #include "output.h"
  #include "cp-tree.h"
  #include "c-common.h"
  #include "toplev.h"
*************** static bool ok_to_generate_alias_set_for
*** 35,40 ****
--- 36,43 ----
  #define LANG_HOOKS_NAME "GNU C++"
  #undef LANG_HOOKS_INIT
  #define LANG_HOOKS_INIT cxx_init
+ #undef LANG_HOOKS_SYMBOL_FINISH
+ #define LANG_HOOKS_SYMBOL_FINISH weak_finish
  #undef LANG_HOOKS_FINISH
  #define LANG_HOOKS_FINISH cxx_finish
  #undef LANG_HOOKS_CLEAR_BINDING_STACK
Index: objc/objc-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/objc/objc-lang.c,v
retrieving revision 1.4
diff -c -p -r1.4 objc-lang.c
*** objc-lang.c	2002/03/13 01:42:43	1.4
--- objc-lang.c	2002/03/14 19:23:50
*************** Boston, MA 02111-1307, USA.  */
*** 22,27 ****
--- 22,28 ----
  #include "config.h"
  #include "system.h"
  #include "tree.h"
+ #include "output.h"
  #include "c-tree.h"
  #include "c-common.h"
  #include "toplev.h"
*************** static void objc_post_options           
*** 36,41 ****
--- 37,44 ----
  #define LANG_HOOKS_NAME "GNU Objective-C"  
  #undef LANG_HOOKS_INIT
  #define LANG_HOOKS_INIT objc_init
+ #undef LANG_HOOKS_SYMBOL_FINISH
+ #define LANG_HOOKS_SYMBOL_FINISH weak_finish
  #undef LANG_HOOKS_FINISH
  #define LANG_HOOKS_FINISH c_common_finish
  #undef LANG_HOOKS_INIT_OPTIONS
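
	For reference, the list handling in the c-common.c hunk boils
down to the standalone sketch below.  It is a simplified illustration,
not the patch itself: it uses plain malloc instead of xmalloc, drops the
decl and value fields, and all of the names are made up.

```c
#include <stdlib.h>
#include <string.h>

/* One pending weak symbol; the list is singly linked.  */
struct pending_weak
{
  struct pending_weak *next;
  const char *name;
};

static struct pending_weak *pending_head;

/* Push NAME onto the pending list.  Return 1 on success, 0 if the
   allocation failed.  */
int
add_pending (const char *name)
{
  struct pending_weak *w = malloc (sizeof *w);

  if (w == NULL)
    return 0;

  w->next = pending_head;
  w->name = name;
  pending_head = w;
  return 1;
}

/* Remove every entry called NAME.  P always points at the link that
   leads to the current node, so unlinking needs no special case for
   the head of the list.  */
void
remove_pending (const char *name)
{
  struct pending_weak **p = &pending_head;

  while (*p != NULL)
    {
      struct pending_weak *t = *p;

      if (strcmp (name, t->name) == 0)
        {
          *p = t->next;
          free (t);
        }
      else
        p = &t->next;
    }
}

/* Number of entries still pending.  */
int
count_pending (void)
{
  int n = 0;

  for (const struct pending_weak *t = pending_head; t != NULL; t = t->next)
    n++;
  return n;
}
```

Removing by name drops every matching entry, which is what keeps a
symbol from being emitted twice if it was queued more than once.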

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 11:34         ` David Edelsohn
@ 2002-03-14 12:02           ` Neil Booth
  2002-03-14 13:47           ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Neil Booth @ 2002-03-14 12:02 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, Andrew Cagney, Alan Modra, gcc-patches

David Edelsohn wrote:-

> + void
> + weak_finish ()
> + {

Since this is a hook common to the 3 C languages, could you call it
c_weak_finish instead?  The other hooks (mostly) follow this convention.
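
Roughly, the pattern looks like the sketch below: a default no-op, which
a front end overrides with its own c_-prefixed function before the
initializer macro is expanded.  Everything here carries a _sketch or
HOOKS_ name to make clear it is a cut-down illustration and not the real
langhooks code.

```c
/* Cut-down stand-in for struct lang_hooks.  */
struct lang_hooks_sketch
{
  void (*symbol_finish) (void);
  void (*finish) (void);
};

static int weak_finish_calls;

/* Default hook: do nothing.  */
static void
lhd_do_nothing_sketch (void)
{
}

/* What a C-family front end would install; here it only counts
   how often it was called.  */
static void
c_weak_finish_sketch (void)
{
  weak_finish_calls++;
}

/* Defaults, as a langhooks-def.h style header would provide them.  */
#define HOOKS_SYMBOL_FINISH lhd_do_nothing_sketch
#define HOOKS_FINISH lhd_do_nothing_sketch

/* Front-end override, as c-lang.c would do it.  */
#undef HOOKS_SYMBOL_FINISH
#define HOOKS_SYMBOL_FINISH c_weak_finish_sketch

#define HOOKS_INITIALIZER { HOOKS_SYMBOL_FINISH, HOOKS_FINISH }

static const struct lang_hooks_sketch hooks = HOOKS_INITIALIZER;

/* Run the end-of-compilation hooks in order and report how many
   times the weak hook fired.  */
int
run_end_of_compilation (void)
{
  hooks.symbol_finish ();
  hooks.finish ();
  return weak_finish_calls;
}
```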

Thanks,

Neil.

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 11:34         ` David Edelsohn
  2002-03-14 12:02           ` Neil Booth
@ 2002-03-14 13:47           ` Geoff Keating
  2002-03-14 14:07             ` David Edelsohn
  2002-03-14 15:24             ` David Edelsohn
  1 sibling, 2 replies; 875+ messages in thread
From: Geoff Keating @ 2002-03-14 13:47 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches


David Edelsohn <dje@watson.ibm.com> writes:

> 	How about a patch like the following?
> 
> 	Note that remove_from_pending_weak_list now needs to be global
> because it is accessed by the recently added varasm.c:globalize_decl().

With the change that Neil suggested, suitable testing, and the posting
of the missing patch to varasm.c (which I presume just deletes stuff),
this patch is OK.

BTW, the reason the regression tester didn't notice this is because it
thinks the assembler is very old and doesn't support weak symbols :-(.
This is a bug in configure.in: it needs to search for the assembler
the same way as the C code does (and then we could stop the C code
from searching at all and get it to just use the configured location).

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 13:47           ` Geoff Keating
@ 2002-03-14 14:07             ` David Edelsohn
  2002-03-14 15:02               ` Geoff Keating
  2002-03-14 15:24             ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-14 14:07 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	Okay now?

David


	* output.h (remove_from_pending_weak_list): Declare.
	(weak_finish): Delete.
	* langhooks.h (lang_hooks): Add symbol_finish.
	* langhooks-def.h (LANG_HOOKS_SYMBOL_FINISH): New.
	(LANG_HOOKS_INITIALIZER): Use it.
	* c-common.h (c_weak_finish): Declare.
	* varasm.c: Move weak support from here ...
	* c-common.c: ... to here.
	(c_common_init): Add weak_decls to ggc roots.
	* c-lang.c (LANG_HOOKS_SYMBOL_FINISH): Define.
	* objc/objc-lang.c: Same.
	* cp/cp-lang.c: Same.
	* Makefile.in (c-lang.o): Depend on c-common.h.

Index: output.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/output.h,v
retrieving revision 1.96
diff -c -p -r1.96 output.h
*** output.h	2002/03/01 06:00:33	1.96
--- output.h	2002/03/14 22:01:38
*************** extern const char *get_insn_template PAR
*** 139,144 ****
--- 139,147 ----
     associated with NAME.  */
  extern int add_weak PARAMS ((tree, const char *, const char *));
  
+ /* Remove function NAME from the weak symbols list.  */
+ extern void remove_from_pending_weak_list	PARAMS ((const char *));
+ 
  /* Functions in flow.c */
  extern void allocate_for_life_analysis	PARAMS ((void));
  extern int regno_uninitialized		PARAMS ((unsigned int));
*************** extern void mergeable_constant_section	P
*** 232,240 ****
  /* Declare DECL to be a weak symbol.  */
  extern void declare_weak		PARAMS ((tree));
  #endif /* TREE_CODE */
- 
- /* Emit any pending weak declarations.  */
- extern void weak_finish			PARAMS ((void));
  
  /* Decode an `asm' spec for a declaration as a register name.
     Return the register number, or -1 if nothing specified,
--- 235,240 ----
Index: langhooks-def.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/langhooks-def.h,v
retrieving revision 1.10
diff -c -p -r1.10 langhooks-def.h
*** langhooks-def.h	2002/03/08 19:20:47	1.10
--- langhooks-def.h	2002/03/14 22:01:39
*************** void lhd_tree_inlining_end_inlining		PAR
*** 67,72 ****
--- 67,73 ----
  #define LANG_HOOKS_NAME			"GNU unknown"
  #define LANG_HOOKS_IDENTIFIER_SIZE	sizeof (struct lang_identifier)
  #define LANG_HOOKS_INIT			lhd_do_nothing
+ #define LANG_HOOKS_SYMBOL_FINISH	lhd_do_nothing
  #define LANG_HOOKS_FINISH		lhd_do_nothing
  #define LANG_HOOKS_CLEAR_BINDING_STACK	lhd_clear_binding_stack
  #define LANG_HOOKS_INIT_OPTIONS		lhd_do_nothing
*************** int lhd_tree_dump_type_quals			PARAMS ((
*** 140,145 ****
--- 141,147 ----
    LANG_HOOKS_DECODE_OPTION, \
    LANG_HOOKS_POST_OPTIONS, \
    LANG_HOOKS_INIT, \
+   LANG_HOOKS_SYMBOL_FINISH, \
    LANG_HOOKS_FINISH, \
    LANG_HOOKS_CLEAR_BINDING_STACK, \
    LANG_HOOKS_GET_ALIAS_SET, \
Index: langhooks.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/langhooks.h,v
retrieving revision 1.17
diff -c -p -r1.17 langhooks.h
*** langhooks.h	2002/03/08 19:20:47	1.17
--- langhooks.h	2002/03/14 22:01:39
*************** struct lang_hooks
*** 101,106 ****
--- 101,110 ----
       immediately.  */
    const char * (*init) PARAMS ((const char *));
  
+   /* Called near the end of compilation, to emit any pending symbols,
+      such as weak declarations.  */
+   void (*symbol_finish) PARAMS ((void));
+ 
    /* Called at the end of compilation, as a finalizer.  */
    void (*finish) PARAMS ((void));
  
Index: varasm.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/varasm.c,v
retrieving revision 1.256
diff -c -p -r1.256 varasm.c
*** varasm.c	2002/03/13 14:20:17	1.256
--- varasm.c	2002/03/14 22:01:40
*************** static unsigned HOST_WIDE_INT array_size
*** 167,176 ****
  static unsigned min_align		PARAMS ((unsigned, unsigned));
  static void output_constructor		PARAMS ((tree, HOST_WIDE_INT,
  						 unsigned int));
- static void mark_weak_decls		PARAMS ((void *));
- #if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
- static void remove_from_pending_weak_list	PARAMS ((const char *));
- #endif
  static void globalize_decl		PARAMS ((tree));
  static void maybe_assemble_visibility	PARAMS ((tree));
  static int in_named_entry_eq		PARAMS ((const PTR, const PTR));
--- 167,172 ----
*************** output_constructor (exp, size, align)
*** 4995,5126 ****
  }
  
  
- /* This structure contains any weak symbol declarations waiting
-    to be emitted.  */
- struct weak_syms
- {
-   struct weak_syms * next;
-   tree decl;
-   const char * name;
-   const char * value;
- };
- 
- static struct weak_syms * weak_decls;
- 
- /* Mark weak_decls for garbage collection.  */
- 
- static void
- mark_weak_decls (arg)
-      void *arg;
- {
-   struct weak_syms *t;
- 
-   for (t = *(struct weak_syms **) arg; t != NULL; t = t->next)
-     ggc_mark_tree (t->decl);
- }
- 
- /* Add function NAME to the weak symbols list.  VALUE is a weak alias
-    associated with NAME.  */
- 
- int
- add_weak (decl, name, value)
-      tree decl;
-      const char *name;
-      const char *value;
- {
-   struct weak_syms *weak;
- 
-   weak = (struct weak_syms *) xmalloc (sizeof (struct weak_syms));
- 
-   if (weak == NULL)
-     return 0;
- 
-   weak->next = weak_decls;
-   weak->decl = decl;
-   weak->name = name;
-   weak->value = value;
-   weak_decls = weak;
- 
-   return 1;
- }
- 
- /* Declare DECL to be a weak symbol.  */
- 
- void
- declare_weak (decl)
-      tree decl;
- {
-   if (! TREE_PUBLIC (decl))
-     error_with_decl (decl, "weak declaration of `%s' must be public");
-   else if (TREE_ASM_WRITTEN (decl))
-     error_with_decl (decl, "weak declaration of `%s' must precede definition");
-   else if (SUPPORTS_WEAK)
-     add_weak (decl, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)), NULL);
-   else
-     warning_with_decl (decl, "weak declaration of `%s' not supported");
- 
-   DECL_WEAK (decl) = 1;
- }
- 
- /* Emit any pending weak declarations.  */
- 
- void
- weak_finish ()
- {
-   if (SUPPORTS_WEAK)
-     {
-       struct weak_syms *t;
-       for (t = weak_decls; t != NULL; t = t->next)
- 	{
- #ifdef ASM_WEAKEN_DECL
- 	  tree decl = t->decl;
- 	  if (decl == NULL_TREE)
- 	    {
- 	      tree name = get_identifier (t->name);
- 	      if (name)
- 		decl = lookup_name (name);
- 	    }
- 	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
- #else
- #ifdef ASM_OUTPUT_WEAK_ALIAS
- 	  ASM_OUTPUT_WEAK_ALIAS (asm_out_file, t->name, t->value);
- #else
- #ifdef ASM_WEAKEN_LABEL
- 	  if (t->value)
- 	    abort ();
- 	  ASM_WEAKEN_LABEL (asm_out_file, t->name);
- #endif
- #endif
- #endif
- 	}
-     }
- }
- 
- /* Remove NAME from the pending list of weak symbols.  This prevents
-    the compiler from emitting multiple .weak directives which confuses
-    some assemblers.  */
- #if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
- static void
- remove_from_pending_weak_list (name)
-      const char *name;
- {
-   struct weak_syms *t;
-   struct weak_syms **p;
- 
-   for (p = &weak_decls; *p; )
-     {
-       t = *p;
-       if (strcmp (name, t->name) == 0)
-         {
-           *p = t->next;
-           free (t);
-         }
-       else
-         p = &(t->next);
-     }
- }
- #endif /* defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL) */
- 
  /* Emit the assembly bits to indicate that DECL is globally visible.  */
  
  static void
--- 4991,4996 ----
*************** init_varasm_once ()
*** 5282,5288 ****
  		mark_const_hash_entry);
    ggc_add_root (&const_str_htab, 1, sizeof const_str_htab,
  		mark_const_str_htab);
-   ggc_add_root (&weak_decls, 1, sizeof weak_decls, mark_weak_decls);
  
    const_alias_set = new_alias_set ();
  }
--- 5152,5157 ----
Index: c-common.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/c-common.c,v
retrieving revision 1.297
diff -c -p -r1.297 c-common.c
*** c-common.c	2002/03/13 01:42:29	1.297
--- c-common.c	2002/03/14 22:01:40
*************** boolean_increment (code, arg)
*** 4023,4028 ****
--- 4023,4156 ----
    return val;
  }
  \f
+ /* This structure contains any weak symbol declarations waiting
+    to be emitted.  */
+ struct weak_syms
+ {
+   struct weak_syms * next;
+   tree decl;
+   const char * name;
+   const char * value;
+ };
+ 
+ static struct weak_syms * weak_decls;
+ 
+ static void mark_weak_decls		PARAMS ((void *));
+ 
+ /* Mark weak_decls for garbage collection.  */
+ 
+ static void
+ mark_weak_decls (arg)
+      void *arg;
+ {
+   struct weak_syms *t;
+ 
+   for (t = *(struct weak_syms **) arg; t != NULL; t = t->next)
+     ggc_mark_tree (t->decl);
+ }
+ 
+ /* Add function NAME to the weak symbols list.  VALUE is a weak alias
+    associated with NAME.  */
+ 
+ int
+ add_weak (decl, name, value)
+      tree decl;
+      const char *name;
+      const char *value;
+ {
+   struct weak_syms *weak;
+ 
+   weak = (struct weak_syms *) xmalloc (sizeof (struct weak_syms));
+ 
+   if (weak == NULL)
+     return 0;
+ 
+   weak->next = weak_decls;
+   weak->decl = decl;
+   weak->name = name;
+   weak->value = value;
+   weak_decls = weak;
+ 
+   return 1;
+ }
+ 
+ /* Declare DECL to be a weak symbol.  */
+ 
+ void
+ declare_weak (decl)
+      tree decl;
+ {
+   if (! TREE_PUBLIC (decl))
+     error_with_decl (decl, "weak declaration of `%s' must be public");
+   else if (TREE_ASM_WRITTEN (decl))
+     error_with_decl (decl, "weak declaration of `%s' must precede definition");
+   else if (SUPPORTS_WEAK)
+     add_weak (decl, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)), NULL);
+   else
+     warning_with_decl (decl, "weak declaration of `%s' not supported");
+ 
+   DECL_WEAK (decl) = 1;
+ }
+ 
+ /* Emit any pending weak declarations.  */
+ 
+ void
+ c_weak_finish ()
+ {
+   if (SUPPORTS_WEAK)
+     {
+       struct weak_syms *t;
+       for (t = weak_decls; t != NULL; t = t->next)
+ 	{
+ #ifdef ASM_WEAKEN_DECL
+ 	  tree decl = t->decl;
+ 	  if (decl == NULL_TREE)
+ 	    {
+ 	      tree name = get_identifier (t->name);
+ 	      if (name)
+ 		decl = lookup_name (name);
+ 	    }
+ 	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
+ #else
+ #ifdef ASM_OUTPUT_WEAK_ALIAS
+ 	  ASM_OUTPUT_WEAK_ALIAS (asm_out_file, t->name, t->value);
+ #else
+ #ifdef ASM_WEAKEN_LABEL
+ 	  if (t->value)
+ 	    abort ();
+ 	  ASM_WEAKEN_LABEL (asm_out_file, t->name);
+ #endif
+ #endif
+ #endif
+ 	}
+     }
+ }
+ 
+ /* Remove NAME from the pending list of weak symbols.  This prevents
+    the compiler from emitting multiple .weak directives which confuses
+    some assemblers.  */
+ #if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
+ static void
+ remove_from_pending_weak_list (name)
+      const char *name;
+ {
+   struct weak_syms *t;
+   struct weak_syms **p;
+ 
+   for (p = &weak_decls; *p; )
+     {
+       t = *p;
+       if (strcmp (name, t->name) == 0)
+         {
+           *p = t->next;
+           free (t);
+         }
+       else
+         p = &(t->next);
+     }
+ }
+ #endif /* defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL) */
+ \f
  /* Handle C and C++ default attributes.  */
  
  enum built_in_attribute
*************** c_common_init (filename)
*** 4112,4117 ****
--- 4240,4247 ----
  
    if (!c_attrs_initialized)
      c_init_attributes ();
+ 
+   ggc_add_root (&weak_decls, 1, sizeof weak_decls, mark_weak_decls);
  
    return filename;
  }
Index: c-common.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/c-common.h,v
retrieving revision 1.117
diff -c -p -r1.117 c-common.h
*** c-common.h	2002/03/13 01:42:29	1.117
--- c-common.h	2002/03/14 22:01:41
*************** extern tree build_va_arg			PARAMS ((tree
*** 551,556 ****
--- 551,557 ----
  extern void c_common_init_options		PARAMS ((enum c_language_kind));
  extern void c_common_post_options		PARAMS ((void));
  extern const char *c_common_init		PARAMS ((const char *));
+ extern void c_weak_finish			PARAMS ((void));
  extern void c_common_finish			PARAMS ((void));
  extern HOST_WIDE_INT c_common_get_alias_set	PARAMS ((tree));
  extern bool c_promoting_integer_type_p		PARAMS ((tree));
Index: c-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/c-lang.c,v
retrieving revision 1.76
diff -c -p -r1.76 c-lang.c
*** c-lang.c	2002/03/13 01:42:30	1.76
--- c-lang.c	2002/03/14 22:01:41
*************** Software Foundation, 59 Temple Place - S
*** 23,29 ****
--- 23,30 ----
  #include "config.h"
  #include "system.h"
  #include "tree.h"
  #include "c-tree.h"
+ #include "c-common.h"
  #include "langhooks.h"
  #include "langhooks-def.h"
  
*************** static void c_post_options PARAMS ((void
*** 37,42 ****
--- 39,46 ----
  #define LANG_HOOKS_NAME "GNU C"
  #undef LANG_HOOKS_INIT
  #define LANG_HOOKS_INIT c_init
+ #undef LANG_HOOKS_SYMBOL_FINISH
+ #define LANG_HOOKS_SYMBOL_FINISH c_weak_finish
  #undef LANG_HOOKS_FINISH
  #define LANG_HOOKS_FINISH c_common_finish
  #undef LANG_HOOKS_INIT_OPTIONS
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/egcs/gcc/Makefile.in,v
retrieving revision 1.837
diff -c -p -r1.837 Makefile.in
*** Makefile.in	2002/03/12 05:40:30	1.837
--- Makefile.in	2002/03/14 22:01:42
*************** c-decl.o : c-decl.c $(CONFIG_H) $(SYSTEM
*** 1152,1158 ****
  c-typeck.o : c-typeck.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(C_TREE_H) \
      $(TARGET_H) flags.h intl.h output.h $(EXPR_H) $(RTL_H) toplev.h $(TM_P_H)
  c-lang.o : c-lang.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(C_TREE_H) \
!     langhooks.h langhooks-def.h
  c-lex.o : c-lex.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(RTL_H) c-lex.h \
      debug.h $(C_TREE_H) \
      c-pragma.h input.h intl.h flags.h toplev.h output.h \
--- 1152,1158 ----
  c-typeck.o : c-typeck.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(C_TREE_H) \
      $(TARGET_H) flags.h intl.h output.h $(EXPR_H) $(RTL_H) toplev.h $(TM_P_H)
  c-lang.o : c-lang.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(C_TREE_H) \
!     langhooks.h langhooks-def.h c-common.h
  c-lex.o : c-lex.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(RTL_H) c-lex.h \
      debug.h $(C_TREE_H) \
      c-pragma.h input.h intl.h flags.h toplev.h output.h \
Index: objc/objc-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/objc/objc-lang.c,v
retrieving revision 1.4
diff -c -p -r1.4 objc-lang.c
*** objc-lang.c	2002/03/13 01:42:43	1.4
--- objc-lang.c	2002/03/14 22:01:42
*************** static void objc_post_options           
*** 36,41 ****
--- 36,43 ----
  #define LANG_HOOKS_NAME "GNU Objective-C"  
  #undef LANG_HOOKS_INIT
  #define LANG_HOOKS_INIT objc_init
+ #undef LANG_HOOKS_SYMBOL_FINISH
+ #define LANG_HOOKS_SYMBOL_FINISH c_weak_finish
  #undef LANG_HOOKS_FINISH
  #define LANG_HOOKS_FINISH c_common_finish
  #undef LANG_HOOKS_INIT_OPTIONS
Index: cp/cp-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/cp/cp-lang.c,v
retrieving revision 1.13
diff -c -p -r1.13 cp-lang.c
*** cp-lang.c	2002/03/13 01:42:39	1.13
--- cp-lang.c	2002/03/14 22:01:42
*************** static bool ok_to_generate_alias_set_for
*** 35,40 ****
--- 35,42 ----
  #define LANG_HOOKS_NAME "GNU C++"
  #undef LANG_HOOKS_INIT
  #define LANG_HOOKS_INIT cxx_init
+ #undef LANG_HOOKS_SYMBOL_FINISH
+ #define LANG_HOOKS_SYMBOL_FINISH c_weak_finish
  #undef LANG_HOOKS_FINISH
  #define LANG_HOOKS_FINISH cxx_finish
  #undef LANG_HOOKS_CLEAR_BINDING_STACK

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 14:07             ` David Edelsohn
@ 2002-03-14 15:02               ` Geoff Keating
  0 siblings, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2002-03-14 15:02 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches


>	Okay now?

Yes, this is OK.  Thanks!

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 13:47           ` Geoff Keating
  2002-03-14 14:07             ` David Edelsohn
@ 2002-03-14 15:24             ` David Edelsohn
  2002-03-14 16:57               ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-14 15:24 UTC (permalink / raw)
  To: Geoff Keating, Richard Henderson; +Cc: gcc-patches

	Actually, this doesn't work for the reason that I originally
thought: other parts of varasm.c reach into the weak support, e.g.,
globalize_decl() calls remove_from_pending_weak_list().
remove_from_pending_weak_list() now is in c-common, so that fails for F77
and Java.

	I can turn globalize_decl() into a hook and maybe assemble_alias()
and assemble_visibility() while I am at it.  globalize_decl() now wants
asm_out_file. 

	Basically, this keeps leading back to varasm.c and pulling out
of it things that seem appropriate for that file.  In effect, it is
turning varasm.c into a giant lang_hook.

	I think a better approach may be to turn lookup_name into a
lang_hook because we still need to distinguish between the different
languages for that anyway.

	I am going to roll this back and follow that path.
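
A minimal sketch of the hook idea, assuming a much-simplified model: the
struct and function names below (struct hooks, find_decl, and so on) are
made up for illustration and are not GCC's actual declarations.  The
language-independent code only ever calls through the table, and a front
end with no C-style name lookup gets a default that finds nothing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a declaration node.  */
struct decl { const char *name; };

/* A table of per-language callbacks, loosely modelled on the idea of
   struct lang_hooks.  */
struct hooks
{
  const char *name;
  struct decl *(*lookup_name) (const char *);
};

/* Default for front ends with no notion of C-style name lookup.  */
static struct decl *
default_lookup_name (const char *name)
{
  (void) name;
  return NULL;
}

/* Declarations known to the hypothetical C-like front end.  */
static struct decl c_decls[] = { { "foo" }, { "bar" } };

static struct decl *
c_lookup_name (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof c_decls / sizeof c_decls[0]; i++)
    if (strcmp (c_decls[i].name, name) == 0)
      return &c_decls[i];
  return NULL;
}

static const struct hooks c_hooks = { "C-like", c_lookup_name };
static const struct hooks f_hooks = { "Fortran-like", default_lookup_name };

/* Language-independent code: never names a front-end function directly,
   so it links no matter which front end supplies the table.  */
static struct decl *
find_decl (const struct hooks *h, const char *name)
{
  return (*h->lookup_name) (name);
}
```

With this shape, find_decl (&f_hooks, "bar") simply returns NULL instead
of producing an undefined reference at link time.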

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-06 10:59       ` Richard Henderson
                           ` (2 preceding siblings ...)
  2002-03-14 11:34         ` David Edelsohn
@ 2002-03-14 16:00         ` David Edelsohn
  3 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-03-14 16:00 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	Okay, how about this patch?

	* langhooks.h (lang_hooks): Add lookup_name.
	* langhooks-def.h (LANG_HOOKS_LOOKUP_NAME): Define.
	(LANG_HOOKS_INITIALIZER): Use it.
	* varasm.c (weak_finish): Call lookup_name via lang_hooks.
	* c-lang.c (LANG_HOOKS_LOOKUP_NAME): Define.
	* objc/objc-lang.c (LANG_HOOKS_LOOKUP_NAME): Define.
	* cp/cp-lang.c (cxx_lookup_name): New function.
	(LANG_HOOKS_LOOKUP_NAME): Use it.

Index: varasm.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/varasm.c,v
retrieving revision 1.256
diff -c -p -r1.256 varasm.c
*** varasm.c	2002/03/13 14:20:17	1.256
--- varasm.c	2002/03/14 23:53:33
*************** weak_finish ()
*** 5078,5084 ****
  	    {
  	      tree name = get_identifier (t->name);
  	      if (name)
! 		decl = lookup_name (name);
  	    }
  	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
  #else
--- 5078,5084 ----
  	    {
  	      tree name = get_identifier (t->name);
  	      if (name)
! 		decl = (*lang_hooks.lookup_name) (name);
  	    }
  	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
  #else
Index: langhooks-def.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/langhooks-def.h,v
retrieving revision 1.10
diff -c -p -r1.10 langhooks-def.h
*** langhooks-def.h	2002/03/08 19:20:47	1.10
--- langhooks-def.h	2002/03/14 23:53:33
*************** void lhd_tree_inlining_end_inlining		PAR
*** 78,83 ****
--- 78,84 ----
  #define LANG_HOOKS_STATICP		lhd_staticp
  #define LANG_HOOKS_DUP_LANG_SPECIFIC_DECL lhd_do_nothing_t
  #define LANG_HOOKS_UNSAVE_EXPR_NOW	lhd_unsave_expr_now
+ #define LANG_HOOKS_LOOKUP_NAME		lhd_return_tree
  #define LANG_HOOKS_HONOR_READONLY	false
  #define LANG_HOOKS_PRINT_STATISTICS	lhd_do_nothing
  #define LANG_HOOKS_PRINT_XNODE		lhd_print_tree_nothing
*************** int lhd_tree_dump_type_quals			PARAMS ((
*** 148,153 ****
--- 149,155 ----
    LANG_HOOKS_STATICP, \
    LANG_HOOKS_DUP_LANG_SPECIFIC_DECL, \
    LANG_HOOKS_UNSAVE_EXPR_NOW, \
+   LANG_HOOKS_LOOKUP_NAME, \
    LANG_HOOKS_HONOR_READONLY, \
    LANG_HOOKS_PRINT_STATISTICS, \
    LANG_HOOKS_PRINT_XNODE, \
Index: langhooks.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/langhooks.h,v
retrieving revision 1.17
diff -c -p -r1.17 langhooks.h
*** langhooks.h	2002/03/08 19:20:47	1.17
--- langhooks.h	2002/03/14 23:53:33
*************** struct lang_hooks
*** 137,142 ****
--- 137,145 ----
       things are cleared out.  */
    tree (*unsave_expr_now) PARAMS ((tree));
  
+   /* Find DECL for NAME.  */
+   tree (*lookup_name) PARAMS ((tree));
+ 
    /* Nonzero if TYPE_READONLY and TREE_READONLY should always be honored.  */
    bool honor_readonly;
  
Index: c-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/c-lang.c,v
retrieving revision 1.76
diff -c -p -r1.76 c-lang.c
*** c-lang.c	2002/03/13 01:42:30	1.76
--- c-lang.c	2002/03/14 23:53:33
*************** static void c_post_options PARAMS ((void
*** 57,62 ****
--- 57,64 ----
  #define LANG_HOOKS_SET_YYDEBUG c_set_yydebug
  #undef LANG_HOOKS_DUP_LANG_SPECIFIC_DECL
  #define LANG_HOOKS_DUP_LANG_SPECIFIC_DECL c_dup_lang_specific_decl
+ #undef LANG_HOOKS_LOOKUP_NAME
+ #define LANG_HOOKS_LOOKUP_NAME lookup_name
  
  #undef LANG_HOOKS_TREE_INLINING_CANNOT_INLINE_TREE_FN
  #define LANG_HOOKS_TREE_INLINING_CANNOT_INLINE_TREE_FN \
Index: objc/objc-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/objc/objc-lang.c,v
retrieving revision 1.4
diff -c -p -r1.4 objc-lang.c
*** objc-lang.c	2002/03/13 01:42:43	1.4
--- objc-lang.c	2002/03/14 23:53:33
*************** static void objc_post_options           
*** 52,57 ****
--- 52,60 ----
  #define LANG_HOOKS_PRINT_IDENTIFIER c_print_identifier
  #undef LANG_HOOKS_SET_YYDEBUG
  #define LANG_HOOKS_SET_YYDEBUG c_set_yydebug
+ #undef LANG_HOOKS_LOOKUP_NAME
+ #define LANG_HOOKS_LOOKUP_NAME lookup_name
+ 
  /* Inlining hooks same as the C front end.  */
  #undef LANG_HOOKS_TREE_INLINING_CANNOT_INLINE_TREE_FN
  #define LANG_HOOKS_TREE_INLINING_CANNOT_INLINE_TREE_FN \
Index: cp/cp-lang.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/cp/cp-lang.c,v
retrieving revision 1.13
diff -c -p -r1.13 cp-lang.c
*** cp-lang.c	2002/03/13 01:42:39	1.13
--- cp-lang.c	2002/03/14 23:53:33
*************** Boston, MA 02111-1307, USA.  */
*** 28,33 ****
--- 28,34 ----
  #include "langhooks.h"
  #include "langhooks-def.h"
  
+ static tree cxx_lookup_name PARAMS ((tree));
  static HOST_WIDE_INT cxx_get_alias_set PARAMS ((tree));
  static bool ok_to_generate_alias_set_for_type PARAMS ((tree));
  
*************** static bool ok_to_generate_alias_set_for
*** 55,60 ****
--- 56,63 ----
  #define LANG_HOOKS_DUP_LANG_SPECIFIC_DECL cxx_dup_lang_specific_decl
  #undef LANG_HOOKS_UNSAVE_EXPR_NOW
  #define LANG_HOOKS_UNSAVE_EXPR_NOW cxx_unsave_expr_now
+ #undef LANG_HOOKS_LOOKUP_NAME
+ #define LANG_HOOKS_LOOKUP_NAME cxx_lookup_name
  #undef LANG_HOOKS_PRINT_STATISTICS
  #define LANG_HOOKS_PRINT_STATISTICS cxx_print_statistics
  #undef LANG_HOOKS_PRINT_XNODE
*************** static bool ok_to_generate_alias_set_for
*** 99,104 ****
--- 102,116 ----
  
  /* Each front end provides its own hooks, for toplev.c.  */
  const struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
+ 
+ /* Lookup NAME.  */
+ 
+ static tree
+ cxx_lookup_name (t)
+      tree t;
+ {
+   return lookup_name (t, 0);
+ }
  
  /* Check if a C++ type is safe for aliasing.
     Return TRUE if T safe for aliasing FALSE otherwise.  */

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 15:24             ` David Edelsohn
@ 2002-03-14 16:57               ` Alan Modra
  2002-03-14 18:05                 ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-03-14 16:57 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, Richard Henderson, gcc-patches

On Thu, Mar 14, 2002 at 06:24:04PM -0500, David Edelsohn wrote:
> 
> 	I think a better approach may be to turn lookup_name into a
> lang_hook because we still need to distinguish between the different
> languages for that anyway.
> 
> 	I am going to roll this back and follow that path.

IMO adding these hooks isn't improving matters at all.  The major
problem with the code as of a day or so ago was the call to lookup_decl
from varasm.c.  The patch I posted at
http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00308.html cures this, and
also addresses the problem rth mentioned in
http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00298.html

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 16:57               ` Alan Modra
@ 2002-03-14 18:05                 ` Geoff Keating
  2002-03-14 18:35                   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-03-14 18:05 UTC (permalink / raw)
  To: amodra; +Cc: dje, rth, gcc-patches

> Date: Fri, 15 Mar 2002 11:26:41 +1030
> From: Alan Modra <amodra@bigpond.net.au>

> On Thu, Mar 14, 2002 at 06:24:04PM -0500, David Edelsohn wrote:
> > 
> > 	I think a better approach may be to turn lookup_name into a
> > lang_hook because we still need to distinguish between the different
> > languages for that anyway.
> > 
> > 	I am going to roll this back and follow that path.
> 
> IMO adding these hooks isn't improving matters at all.  The major
> problem with the code as of a day or so ago was the call to lookup_decl
> from varasm.c.  The patch I posted at
> http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00308.html cures this, and
> also addresses the problem rth mentioned in
> http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00298.html

OK, we have two proposals, both of which I've looked at and I think I
understand:

- David's (revised) proposal, which is to add an extra lang_hook to
  allow varasm.c to call into the language backends to do name lookup;
and
- Alan Modra's proposal, which moves the initial name lookup to the
  backends, and provides a routine in varasm.c which backends can call
  before creating a decl to see if they should mark the decl weak.

Now, I have two comments about these that make me think neither is
the final answer, although they both represent progress.

Firstly, how do these work in C++?  In particular, if I write

void bar(void);
namespace foo {
#pragma weak bar
  extern void bar(void);
  struct c {
    void bar(void);
  };
}
namespace bat {
  extern void bar(void);
}
void doit(void)
{
  struct foo::c cc;
  bar();
  foo::bar();
  cc.bar();
  bat::bar();
}

do I get foo::bar weak, ::bar weak, everything named 'bar' weak, or
something else?  I believe the correct thing is to have foo::bar,
only, be weak.  The current compiler seems to make 
'extern "C" bar' weak, which indicates (at least to me) that no-one
thought about this at all :-).  I believe that Alan's patch will
change this behaviour in an improving direction, but I think from the
description it will pick ::bar, because ::bar's decl is visible at the
point of the pragma under the name 'bar'.

[It's this question that makes me think that the code should really be
part of a language backend.  Anything that can't work right for both C
and C++ is clearly so language-specific that it probably doesn't make
any sense at all for, say, Ada, and so it's not surprising that
there are compile errors.]

Secondly, why is this particular code in varasm.c at all?  I liked
what David originally tried to do by moving it out (to c-common.c?),
but I think it will require at least part of Alan's patch to avoid the
problems that David hit.

The reason I ask this question is that in some other languages, you
can do this sort of thing in a more organized fashion by writing
things like

attribute bar weak;
attribute bar watched, synchronized;
...
type bar procedure;

and so on, and those languages wouldn't be able to use this mechanism;
it's very specialized for the C languages, and so I would think it
should be somewhere like c-common.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 18:05                 ` Geoff Keating
@ 2002-03-14 18:35                   ` David Edelsohn
  2002-03-14 20:07                     ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-14 18:35 UTC (permalink / raw)
  To: Geoff Keating; +Cc: amodra, rth, gcc-patches

>>>>> Geoff Keating writes:

Geoff> Firstly, how do these work in C++?  In particular, if I write

Geoff> do I get foo::bar weak, ::bar weak, everything named 'bar' weak, or
Geoff> something else?  I believe the correct thing is to have foo::bar,
Geoff> only, be weak.  The current compiler seems to make 
Geoff> 'extern "C" bar' weak, which indicates (at least to me) that no-one
Geoff> thought about this at all :-).  I believe that Alan's patch will
Geoff> change this behaviour in an improving direction, but I think from the
Geoff> description it will pick ::bar, because ::bar's decl is visible at the
Geoff> point of the pragma under the name 'bar'.

	Your intuition is wrong with respect to the Solaris C Compiler
semantics for #pragma.

#pragma weak bar

makes "bar" weak.  No mangling.  If you want mangled bar weak, you need to
explicitly state the mangled string in the pragma.  If you want to define
some new, incompatible semantics for #pragma, that is fine, but I'm not
working on it.
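
A minimal sketch of the plain C case, where no mangling is involved and
the name in the pragma is the same as the source identifier.  The
function name default_handler is made up for illustration, and this
assumes a GCC-compatible compiler and an object format that supports
weak symbols.

```c
/* Hypothetical default implementation; the name is illustrative only.  */
#pragma weak default_handler

/* Emitted as a weak definition: a strong default_handler defined in
   another object file would be chosen by the linker instead.  */
int
default_handler (void)
{
  return 42;
}
```

In C++ the same pragma would, under the semantics described above, have
to spell out the mangled symbol to reach a function that is not
extern "C".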

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 18:35                   ` David Edelsohn
@ 2002-03-14 20:07                     ` Geoff Keating
  2002-03-14 21:10                       ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-03-14 20:07 UTC (permalink / raw)
  To: dje; +Cc: amodra, rth, gcc-patches

> cc: amodra@bigpond.net.au, rth@redhat.com, gcc-patches@gcc.gnu.org
> Date: Thu, 14 Mar 2002 21:35:25 -0500
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff Keating writes:
> 
> Geoff> Firstly, how do these work in C++?  In particular, if I write
> 
> Geoff> do I get foo::bar weak, ::bar weak, everything named 'bar' weak, or
> Geoff> something else?  I believe the correct thing is to have foo::bar,
> Geoff> only, be weak.  The current compiler seems to make 
> Geoff> 'extern "C" bar' weak, which indicates (at least to me) that no-one
> Geoff> thought about this at all :-).  I believe that Alan's patch will
> Geoff> change this behaviour in an improving direction, but I think from the
> Geoff> description it will pick ::bar, because ::bar's decl is visible at the
> Geoff> point of the pragma under the name 'bar'.
> 
> 	Your intuition is wrong with respect to the Solaris C Compiler
> semantics for #pragma.

I presume you mean 'C++'.

> #pragma weak bar
> 
> makes "bar" weak.  No mangling.  If you want mangled bar weak, you need to
> explicitly state the mangled string in the pragma.  If you want to define
> some new, incompatible semantics for #pragma, that is fine, but I'm not
> working on it.

In that case, the existing code and Alan's patch are incorrect,
because lookup_name doesn't look up mangled names.  Problems should be
experienced with something like the following:

namespace bar {
  extern "C" void foo(void);
#pragma weak foo
}
extern int foo;

because when weak_finish is being run, 'foo' binds to the toplevel,
which is not the right thing.

Now, this is a different kind of problem.  Mangled names are certainly
in the scope of the middle-end.  So perhaps the correct fix is to stop
using lookup_name and instead search through the assembler names.
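
A rough sketch of what searching by assembler name could look like,
under a deliberately simplified model: struct decl, the table, and both
functions are hypothetical, and the mangled strings are the Itanium C++
ABI spellings for the functions in the earlier example, used here purely
as sample data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-simplified declaration record.  */
struct decl
{
  const char *qualified_name;	/* How the programmer refers to it.  */
  const char *asm_name;		/* What the assembler sees.  */
  int weak;
};

static struct decl decls[] = {
  { "::bar",    "_Z3barv",       0 },
  { "foo::bar", "_ZN3foo3barEv", 0 },
  { "bat::bar", "_ZN3bat3barEv", 0 },
};

/* Find the declaration whose assembler name is exactly ASM_NAME.
   Scopes play no part in the search.  */
static struct decl *
find_by_asm_name (const char *asm_name)
{
  size_t i;

  for (i = 0; i < sizeof decls / sizeof decls[0]; i++)
    if (strcmp (decls[i].asm_name, asm_name) == 0)
      return &decls[i];
  return NULL;
}

/* Mark the matching declaration weak.  Returns 1 on a match, else 0.  */
static int
apply_pragma_weak (const char *asm_name)
{
  struct decl *d = find_by_asm_name (asm_name);

  if (d == NULL)
    return 0;
  d->weak = 1;
  return 1;
}
```

Here apply_pragma_weak ("_ZN3foo3barEv") touches only foo::bar, while
apply_pragma_weak ("bar") matches nothing, because no declaration in the
table has the unmangled assembler name "bar".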

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 20:07                     ` Geoff Keating
@ 2002-03-14 21:10                       ` Richard Henderson
  2002-03-14 23:03                         ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-03-14 21:10 UTC (permalink / raw)
  To: Geoff Keating; +Cc: dje, amodra, gcc-patches

On Thu, Mar 14, 2002 at 08:07:37PM -0800, Geoff Keating wrote:
> Now, this is a different kind of problem.  Mangled names are certainly
> in the scope of the middle-end.  So perhaps the correct fix is to stop
> using lookup_name and instead search through the assembler names.

Indeed, identifier_global_value is the function we want.

Sorry for taking so long to get back to this, David.  I'm
currently testing the following, which I believe addresses
the issues I raised wrt moving code for the pragma itself,
that Geoff raised wrt C++, and that I raise wrt getting
DECL_WEAK set on the decl.
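
As a sketch of the general shape only (not the patch itself): names seen
in a pragma before any declaration are parked on a pending list, and
each new declaration is checked against that list when it is created.
The names note_pragma_weak and maybe_apply_weak and the struct layouts
are invented for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Names from "#pragma weak NAME" that have no declaration yet.  */
struct pending
{
  struct pending *next;
  const char *name;
};

static struct pending *pending_weaks;

/* Hypothetical, much-simplified declaration record.  */
struct decl
{
  const char *asm_name;
  int weak;
};

/* Record NAME until a matching declaration shows up.  */
static void
note_pragma_weak (const char *name)
{
  struct pending *p = malloc (sizeof *p);

  if (p == NULL)
    abort ();
  p->next = pending_weaks;
  p->name = name;
  pending_weaks = p;
}

/* Called when declaration D is created: if a pragma named it earlier,
   mark it weak and drop the pending entry.  */
static void
maybe_apply_weak (struct decl *d)
{
  struct pending **pp;

  for (pp = &pending_weaks; *pp != NULL; pp = &(*pp)->next)
    if (strcmp ((*pp)->name, d->asm_name) == 0)
      {
	struct pending *t = *pp;

	*pp = t->next;
	free (t);
	d->weak = 1;
	return;
      }
}
```

Because the weak flag is set at declaration time, nothing has to look
the name up again at the end of compilation.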

I also need to formalize the rest of my test cases for
aliases and c++.


r~


	* c-decl.c: Include c-pragma.h.
	(start_decl, start_function): Invoke maybe_apply_pragma_weak.
	(finish_function): Tidy.
	* c-pragma.c: Include c-common.h.
	(pending_weaks, apply_pragma_weak, maybe_apply_pragma_weak): New.
	(handle_pragma_weak): Use them.
	(init_pragma): Register pending_weaks.
	* c-pragma.h (maybe_apply_pragma_weak): Declare.
	* print-tree.c (print_node): Print DECL_WEAK.
	* varasm.c (mark_weak_decls): Remove.
	(remove_from_pending_weak_list): Remove.
	(add_weak): Remove.
	(asm_emit_uninitialised): Call globalize_decl for weak commons.
	(weak_decls): Make a tree_list.
	(declare_weak): Cons weak_decls directly.
	(globalize_decl): Remove weak_decls elements directly.
	(weak_finish): Simplify weak_decls walk.  Don't weaken unused
	symbols.  Don't pretend to handle aliases.
	(init_varasm_once): Update weak_decls registry.

	* cp/decl.c: Include c-pragma.h.
	(start_decl, start_function): Invoke maybe_apply_pragma_weak.

	* gcc.dg/weak-1.c: New.

Index: c-decl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-decl.c,v
retrieving revision 1.305
diff -c -p -d -u -r1.305 c-decl.c
--- c-decl.c	2002/03/05 02:34:05	1.305
+++ c-decl.c	2002/03/15 05:02:16
@@ -46,6 +46,7 @@ Software Foundation, 59 Temple Place - S
 #include "debug.h"
 #include "timevar.h"
 #include "c-common.h"
+#include "c-pragma.h"
 
 /* In grokdeclarator, distinguish syntactic contexts of declarators.  */
 enum decl_context
@@ -3403,6 +3404,10 @@ start_decl (declarator, declspecs, initi
   /* Set attributes here so if duplicate decl, will have proper attributes.  */
   decl_attributes (&decl, attributes, 0);
 
+  /* If #pragma weak was used, mark the decl weak now.  */
+  if (current_binding_level == global_binding_level)
+    maybe_apply_pragma_weak (decl);
+
   if (TREE_CODE (decl) == FUNCTION_DECL
       && DECL_DECLARED_INLINE_P (decl)
       && DECL_UNINLINABLE (decl)
@@ -6042,6 +6047,10 @@ start_function (declspecs, declarator, a
 
   decl_attributes (&decl1, attributes, 0);
 
+  /* If #pragma weak was used, mark the decl weak now.  */
+  if (current_binding_level == global_binding_level)
+    maybe_apply_pragma_weak (decl1);
+
   if (DECL_DECLARED_INLINE_P (decl1)
       && DECL_UNINLINABLE (decl1)
       && lookup_attribute ("noinline", DECL_ATTRIBUTES (decl1)))
@@ -6691,9 +6700,11 @@ finish_function (nested)
 {
   tree fndecl = current_function_decl;
 
-/*  TREE_READONLY (fndecl) = 1;
-    This caused &foo to be of type ptr-to-const-function
-    which then got a warning when stored in a ptr-to-function variable.  */
+#if 0
+  /* This caused &foo to be of type ptr-to-const-function which then
+     got a warning when stored in a ptr-to-function variable.  */
+  TREE_READONLY (fndecl) = 1;
+#endif
 
   poplevel (1, 0, 1);
   BLOCK_SUPERCONTEXT (DECL_INITIAL (fndecl)) = fndecl;
@@ -6755,6 +6766,7 @@ finish_function (nested)
     {
       /* Generate RTL for the body of this function.  */
       c_expand_body (fndecl, nested, 1);
+
       /* Let the error reporting routines know that we're outside a
 	 function.  For a nested function, this value is used in
 	 pop_c_function_context and then reset via pop_function_context.  */
Index: c-pragma.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-pragma.c,v
retrieving revision 1.47
diff -c -p -d -u -r1.47 c-pragma.c
--- c-pragma.c	2002/03/01 06:00:32	1.47
+++ c-pragma.c	2002/03/15 05:02:16
@@ -30,6 +30,7 @@ Software Foundation, 59 Temple Place - S
 #include "toplev.h"
 #include "ggc.h"
 #include "c-lex.h"
+#include "c-common.h"
 #include "output.h"
 #include "tm_p.h"
 
@@ -55,9 +56,9 @@ static struct align_stack * alignment_st
    maximum_field_alignment in effect.  When the final pop_alignment() 
    happens, we restore the value to this, not to a value of 0 for
    maximum_field_alignment.  Value is in bits.  */
-static int  default_alignment;
+static int default_alignment;
 #define SET_GLOBAL_ALIGNMENT(ALIGN) \
-(default_alignment = maximum_field_alignment = (ALIGN))
+  (default_alignment = maximum_field_alignment = (ALIGN))
 
 static void push_alignment PARAMS ((int, tree));
 static void pop_alignment  PARAMS ((tree));
@@ -69,7 +70,6 @@ push_alignment (alignment, id)
      int alignment;
      tree id;
 {
-  
   if (alignment_stack == NULL
       || alignment_stack->alignment != alignment
       || id != NULL_TREE)
@@ -274,14 +274,53 @@ handle_pragma_pack (dummy)
 #endif  /* HANDLE_PRAGMA_PACK */
 
 #ifdef HANDLE_PRAGMA_WEAK
+static void apply_pragma_weak PARAMS ((tree, tree));
 static void handle_pragma_weak PARAMS ((cpp_reader *));
 
+static tree pending_weaks;
+
+static void
+apply_pragma_weak (decl, value)
+     tree decl, value;
+{
+  if (value)
+    decl_attributes (&decl, build_tree_list (get_identifier ("alias"),
+				             build_tree_list (NULL, value)),
+		     0);
+  declare_weak (decl);
+}
+
+void
+maybe_apply_pragma_weak (decl)
+     tree decl;
+{
+  tree *p, t, id;
+
+  /* Copied from the check in set_decl_assembler_name.  */
+  if (TREE_CODE (decl) == FUNCTION_DECL
+      || (TREE_CODE (decl) == VAR_DECL 
+          && (TREE_STATIC (decl) 
+              || DECL_EXTERNAL (decl) 
+              || TREE_PUBLIC (decl))))
+    id = DECL_ASSEMBLER_NAME (decl);
+  else
+    return;
+
+  for (p = &pending_weaks; (t = *p) ; p = &TREE_CHAIN (t))
+    if (id == TREE_PURPOSE (t))
+      {
+	apply_pragma_weak (decl, TREE_VALUE (t));
+	*p = TREE_CHAIN (t);
+	break;
+      }
+}
+
 /* #pragma weak name [= value] */
 static void
 handle_pragma_weak (dummy)
      cpp_reader *dummy ATTRIBUTE_UNUSED;
 {
-  tree name, value, x;
+  tree name, value, x, decl;
   enum cpp_ttype t;
 
   value = 0;
@@ -298,10 +337,19 @@ handle_pragma_weak (dummy)
   if (t != CPP_EOF)
     warning ("junk at end of #pragma weak");
 
-  add_weak (NULL_TREE, IDENTIFIER_POINTER (name),
-	    value ? IDENTIFIER_POINTER (value) : NULL);
+  decl = identifier_global_value (name);
+  if (decl && TREE_CODE_CLASS (TREE_CODE (decl)) == 'd')
+    apply_pragma_weak (decl, value);
+  else
+    pending_weaks = tree_cons (name, value, pending_weaks);
 }
-#endif
+#else
+void
+maybe_apply_pragma_weak (decl)
+     tree decl ATTRIBUTE_UNUSED;
+{
+}
+#endif /* HANDLE_PRAGMA_WEAK */
 
 void
 init_pragma ()
@@ -311,6 +359,7 @@ init_pragma ()
 #endif
 #ifdef HANDLE_PRAGMA_WEAK
   cpp_register_pragma (parse_in, 0, "weak", handle_pragma_weak);
+  ggc_add_tree_root (&pending_weaks, 1);
 #endif
 #ifdef REGISTER_TARGET_PRAGMAS
   REGISTER_TARGET_PRAGMAS (parse_in);
Index: c-pragma.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-pragma.h,v
retrieving revision 1.27
diff -c -p -d -u -r1.27 c-pragma.h
--- c-pragma.h	2002/03/01 06:00:32	1.27
+++ c-pragma.h	2002/03/15 05:02:16
@@ -53,4 +53,6 @@ extern void cpp_register_pragma PARAMS (
 					 void (*) PARAMS ((cpp_reader *))));
 #endif
 
+extern void maybe_apply_pragma_weak PARAMS ((tree));
+
 #endif /* GCC_C_PRAGMA_H */
Index: print-tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/print-tree.c,v
retrieving revision 1.55
diff -c -p -d -u -r1.55 print-tree.c
--- print-tree.c	2002/03/03 21:09:46	1.55
+++ print-tree.c	2002/03/15 05:02:16
@@ -317,6 +317,8 @@ print_node (file, prefix, node, indent)
 	fputs (" common", file);
       if (DECL_EXTERNAL (node))
 	fputs (" external", file);
+      if (DECL_WEAK (node))
+	fputs (" weak", file);
       if (DECL_REGISTER (node) && TREE_CODE (node) != FIELD_DECL
 	  && TREE_CODE (node) != FUNCTION_DECL
 	  && TREE_CODE (node) != LABEL_DECL)
Index: varasm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/varasm.c,v
retrieving revision 1.256
diff -c -p -d -u -r1.256 varasm.c
--- varasm.c	2002/03/13 14:20:17	1.256
+++ varasm.c	2002/03/15 05:02:16
@@ -167,10 +167,6 @@ static unsigned HOST_WIDE_INT array_size
 static unsigned min_align		PARAMS ((unsigned, unsigned));
 static void output_constructor		PARAMS ((tree, HOST_WIDE_INT,
 						 unsigned int));
-static void mark_weak_decls		PARAMS ((void *));
-#if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
-static void remove_from_pending_weak_list	PARAMS ((const char *));
-#endif
 static void globalize_decl		PARAMS ((tree));
 static void maybe_assemble_visibility	PARAMS ((tree));
 static int in_named_entry_eq		PARAMS ((const PTR, const PTR));
@@ -1399,6 +1395,17 @@ asm_emit_uninitialised (decl, name, size
 	destination = asm_dest_common;
     }
 
+  switch (destination)
+    {
+    case asm_dest_common:
+      if (! DECL_WEAK (decl))
+	break;
+    case asm_dest_bss:
+      globalize_decl (decl);
+    default:
+      break;
+    }
+
   if (flag_shared_data)
     {
       switch (destination)
@@ -1429,7 +1436,6 @@ asm_emit_uninitialised (decl, name, size
     {
 #ifdef ASM_EMIT_BSS
     case asm_dest_bss:
-      globalize_decl (decl);
       ASM_EMIT_BSS (decl, name, size, rounded);
       break;
 #endif
@@ -4993,56 +4999,10 @@ output_constructor (exp, size, align)
   if (total_bytes < size)
     assemble_zeros (size - total_bytes);
 }
-
 
-/* This structure contains any weak symbol declarations waiting
+/* This TREE_LIST contains any weak symbol declarations waiting
    to be emitted.  */
-struct weak_syms
-{
-  struct weak_syms * next;
-  tree decl;
-  const char * name;
-  const char * value;
-};
-
-static struct weak_syms * weak_decls;
-
-/* Mark weak_decls for garbage collection.  */
-
-static void
-mark_weak_decls (arg)
-     void *arg;
-{
-  struct weak_syms *t;
-
-  for (t = *(struct weak_syms **) arg; t != NULL; t = t->next)
-    ggc_mark_tree (t->decl);
-}
-
-/* Add function NAME to the weak symbols list.  VALUE is a weak alias
-   associated with NAME.  */
-
-int
-add_weak (decl, name, value)
-     tree decl;
-     const char *name;
-     const char *value;
-{
-  struct weak_syms *weak;
-
-  weak = (struct weak_syms *) xmalloc (sizeof (struct weak_syms));
-
-  if (weak == NULL)
-    return 0;
-
-  weak->next = weak_decls;
-  weak->decl = decl;
-  weak->name = name;
-  weak->value = value;
-  weak_decls = weak;
-
-  return 1;
-}
+static tree weak_decls;
 
 /* Declare DECL to be a weak symbol.  */
 
@@ -5055,7 +5015,10 @@ declare_weak (decl)
   else if (TREE_ASM_WRITTEN (decl))
     error_with_decl (decl, "weak declaration of `%s' must precede definition");
   else if (SUPPORTS_WEAK)
-    add_weak (decl, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)), NULL);
+    {
+      if (! DECL_WEAK (decl))
+	weak_decls = tree_cons (NULL, decl, weak_decls);
+    }
   else
     warning_with_decl (decl, "weak declaration of `%s' not supported");
 
@@ -5067,59 +5030,30 @@ declare_weak (decl)
 void
 weak_finish ()
 {
-  if (SUPPORTS_WEAK)
+  tree t;
+
+  for (t = weak_decls; t ; t = TREE_CHAIN (t))
     {
-      struct weak_syms *t;
-      for (t = weak_decls; t != NULL; t = t->next)
-	{
+      tree decl = TREE_VALUE (t);
+      const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
+
+      if (! TREE_USED (decl))
+	continue;
+
 #ifdef ASM_WEAKEN_DECL
-	  tree decl = t->decl;
-	  if (decl == NULL_TREE)
-	    {
-	      tree name = get_identifier (t->name);
-	      if (name)
-		decl = lookup_name (name);
-	    }
-	  ASM_WEAKEN_DECL (asm_out_file, decl, t->name, t->value);
-#else
-#ifdef ASM_OUTPUT_WEAK_ALIAS
-	  ASM_OUTPUT_WEAK_ALIAS (asm_out_file, t->name, t->value);
+      ASM_WEAKEN_DECL (asm_out_file, decl, name, NULL);
 #else
 #ifdef ASM_WEAKEN_LABEL
-	  if (t->value)
-	    abort ();
-	  ASM_WEAKEN_LABEL (asm_out_file, t->name);
+      ASM_WEAKEN_LABEL (asm_out_file, name);
+#else
+#ifdef ASM_OUTPUT_WEAK_ALIAS
+      warning ("only weak aliases are supported in this configuration");
+      return;
 #endif
 #endif
 #endif
-	}
-    }
-}
-
-/* Remove NAME from the pending list of weak symbols.  This prevents
-   the compiler from emitting multiple .weak directives which confuses
-   some assemblers.  */
-#if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
-static void
-remove_from_pending_weak_list (name)
-     const char *name;
-{
-  struct weak_syms *t;
-  struct weak_syms **p;
-
-  for (p = &weak_decls; *p; )
-    {
-      t = *p;
-      if (strcmp (name, t->name) == 0)
-        {
-          *p = t->next;
-          free (t);
-        }
-      else
-        p = &(t->next);
     }
 }
-#endif /* defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL) */
 
 /* Emit the assembly bits to indicate that DECL is globally visible.  */
 
@@ -5132,18 +5066,26 @@ globalize_decl (decl)
 #if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
   if (DECL_WEAK (decl))
     {
+      tree *p, t;
+
 #ifdef ASM_WEAKEN_DECL
       ASM_WEAKEN_DECL (asm_out_file, decl, name, 0);
 #else
       ASM_WEAKEN_LABEL (asm_out_file, name);
 #endif
+
       /* Remove this function from the pending weak list so that
 	 we do not emit multiple .weak directives for it.  */
-      remove_from_pending_weak_list (name);
+      for (p = &weak_decls; (t = *p) ; p = &TREE_CHAIN (t))
+	if (TREE_VALUE (t) == decl)
+	  {
+	    *p = TREE_CHAIN (t);
+	    break;
+	  }
       return;
     }
-  /* else */
 #endif
+
   ASM_GLOBALIZE_LABEL (asm_out_file, name);
 }
 
@@ -5168,7 +5110,6 @@ assemble_alias (decl, target)
   if (TREE_PUBLIC (decl))
     {
       globalize_decl (decl);
-
       maybe_assemble_visibility (decl);
     }
 
@@ -5282,7 +5223,7 @@ init_varasm_once ()
 		mark_const_hash_entry);
   ggc_add_root (&const_str_htab, 1, sizeof const_str_htab,
 		mark_const_str_htab);
-  ggc_add_root (&weak_decls, 1, sizeof weak_decls, mark_weak_decls);
+  ggc_add_tree_root (&weak_decls, 1);
 
   const_alias_set = new_alias_set ();
 }
Index: cp/decl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/decl.c,v
retrieving revision 1.873
diff -c -p -d -u -r1.873 decl.c
--- decl.c	2002/03/13 17:12:22	1.873
+++ decl.c	2002/03/15 05:02:16
@@ -46,6 +46,7 @@ Boston, MA 02111-1307, USA.  */
 #include "tm_p.h"
 #include "target.h"
 #include "c-common.h"
+#include "c-pragma.h"
 #include "diagnostic.h"
 
 extern const struct attribute_spec *lang_attribute_table;
@@ -7256,6 +7257,10 @@ start_decl (declarator, declspecs, initi
   /* Set attributes here so if duplicate decl, will have proper attributes.  */
   cplus_decl_attributes (&decl, attributes, 0);
 
+  /* If #pragma weak was used, mark the decl weak now.  */
+  if (current_binding_level == global_binding_level)
+    maybe_apply_pragma_weak (decl);
+
   if (TREE_CODE (decl) == FUNCTION_DECL
       && DECL_DECLARED_INLINE_P (decl)
       && DECL_UNINLINABLE (decl)
@@ -13475,6 +13480,10 @@ start_function (declspecs, declarator, a
 	return 0;
 
       cplus_decl_attributes (&decl1, attrs, 0);
+
+      /* If #pragma weak was used, mark the decl weak now.  */
+      if (current_binding_level == global_binding_level)
+	maybe_apply_pragma_weak (decl1);
 
       fntype = TREE_TYPE (decl1);
 
Index: testsuite/gcc.dg/weak-1.c
===================================================================
RCS file: weak-1.c
diff -N weak-1.c
--- /dev/null	Tue May  5 13:32:27 1998
+++ weak-1.c	Thu Mar 14 21:02:16 2002
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* COFF does not support weak, and dg doesn't support UNSUPPORTED.  */
+/* { dg-do compile { xfail *-*-coff i?86-pc-cygwin } } */
+
+/* { dg-final { scan-assembler "weak[^ 	]*[ 	]_?a" } } */
+/* { dg-final { scan-assembler "weak[^ 	]*[ 	]_?b" } } */
+/* { dg-final { scan-assembler "weak[^ 	]*[ 	]_?c" } } */
+/* { dg-final { scan-assembler "weak[^ 	]*[ 	]_?d" } } */
+/* { dg-final { scan-assembler "weak[^ 	]*[ 	]_?e" } } */
+/* { dg-final { scan-assembler "weak[^ 	]*[ 	]_?g" } } */
+/* { dg-final { scan-assembler-not "weak[^ 	]*[ 	]_?i" } } */
+/* { dg-final { scan-assembler "weak[^ 	]*[ 	]_?j" } } */
+
+#pragma weak a
+int a;
+
+int b;
+#pragma weak b
+
+#pragma weak c
+extern int c;
+int c;
+
+extern int d;
+#pragma weak d
+int d;
+
+#pragma weak e
+void e(void) { }
+
+#if 0
+/* This permutation is illegal.  */
+void f(void) { }
+#pragma weak f
+#endif
+
+#pragma weak g
+int g = 1;
+
+#if 0
+/* This permutation is illegal.  */
+int h = 1;
+#pragma weak h
+#endif
+
+#pragma weak i
+extern int i;
+
+#pragma weak j
+extern int j;
+int use_j() { return j; }

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 21:10                       ` Richard Henderson
@ 2002-03-14 23:03                         ` Richard Henderson
  2002-03-15  8:20                           ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-03-14 23:03 UTC (permalink / raw)
  To: Geoff Keating, dje, amodra, gcc-patches

Ok, this passed regression testing on alpha-linux, so I went ahead
and checked it in to mainline.  Did this need to get backported to
the branch?


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-14 23:03                         ` Richard Henderson
@ 2002-03-15  8:20                           ` David Edelsohn
  2002-03-15 17:01                             ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-03-15  8:20 UTC (permalink / raw)
  To: Richard Henderson, Geoff Keating, amodra, gcc-patches

>>>>> Richard Henderson writes:

Richard> Ok, this passed regression testing on alpha-linux, so I went ahead
Richard> and checked it in to mainline.  Did this need to get backported to
Richard> the branch?

	Yes, we need this patch on the 3.1 branch as well for glibc on
ppc64 Linux. 

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-15  8:20                           ` David Edelsohn
@ 2002-03-15 17:01                             ` Richard Henderson
  2002-03-19 16:40                               ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-03-15 17:01 UTC (permalink / raw)
  To: David Edelsohn; +Cc: amodra, gcc-patches

On Fri, Mar 15, 2002 at 11:20:11AM -0500, David Edelsohn wrote:
> Richard> Ok, this passed regression testing on alpha-linux, so I went ahead
> Richard> and checked it in to mainline.  Did this need to get backported to
> Richard> the branch?
> 
> 	Yes, we need this patch on the 3.1 branch as well for glibc on
> ppc64 Linux. 

Ok, applied.  Note that Alan's ASM_WEAKEN_DECL patch is not
yet on the branch, which is what you were after in the first
place.  I'll let him or one of you merge that.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-15 17:01                             ` Richard Henderson
@ 2002-03-19 16:40                               ` Alan Modra
  2002-03-19 17:02                                 ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-03-19 16:40 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

On Fri, Mar 15, 2002 at 05:00:59PM -0800, Richard Henderson wrote:
> 
> Ok, applied.  Note that Alan's ASM_WEAKEN_DECL patch is not

One or two things were missed.

	* defaults.h (SUPPORTS_WEAK): Set if ASM_WEAKEN_DECL.
	* varasm.c (assemble_alias): Use ASM_WEAKEN_DECL.
	* doc/tm.texi (ASM_WEAKEN_DECL): Document.
	(ASM_WEAKEN_LABEL): Mention ASM_WEAKEN_DECL.
	(SUPPORTS_WEAK): Likewise.

OK to apply to 3.1 ?
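
For readers unfamiliar with the macro, here is a minimal sketch of the
ASM_WEAKEN_DECL contract as the tm.texi hunk in this patch words it.  It
is an illustration under assumptions, not any target's definition:
toy_weaken_decl is a hypothetical helper, it formats into a buffer rather
than writing to a stdio stream, it ignores the decl argument, and the
.weak/.set spelling assumes GNU as on an ELF target.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Model of ASM_WEAKEN_DECL (stream, decl, name, value).  With a NULL
   VALUE, just make NAME weak.  Otherwise also equate NAME to VALUE,
   giving a weak alias.  Returns what snprintf returns.  */
int
toy_weaken_decl (char *buf, size_t size, const char *name,
                 const char *value)
{
  if (value != NULL)
    return snprintf (buf, size, "\t.weak\t%s\n\t.set\t%s,%s\n",
                     name, name, value);
  return snprintf (buf, size, "\t.weak\t%s\n", name);
}

void
toy_weaken_decl_demo (void)
{
  char buf[128];

  /* Plain weak symbol, as for `#pragma weak foo'.  */
  toy_weaken_decl (buf, sizeof buf, "foo", NULL);
  assert (strcmp (buf, "\t.weak\tfoo\n") == 0);

  /* Weak alias, as for `#pragma weak foo = bar'.  */
  toy_weaken_decl (buf, sizeof buf, "foo", "bar");
  assert (strcmp (buf, "\t.weak\tfoo\n\t.set\tfoo,bar\n") == 0);
}
```

A real definition would presumably use the decl as well, since access to
the decl is the stated reason for having the macro at all.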

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

diff -urpN -xCVS -x*~ -xTAGS gcc-ppc64-31.orig/gcc/defaults.h gcc-ppc64-31/gcc/defaults.h
--- gcc-ppc64-31.orig/gcc/defaults.h	Mon Mar 18 19:16:31 2002
+++ gcc-ppc64-31/gcc/defaults.h	Wed Mar 20 09:54:47 2002
@@ -1,5 +1,5 @@
 /* Definitions of various defaults for tm.h macros.
-   Copyright (C) 1992, 1996, 1997, 1998, 1999, 2000, 2001
+   Copyright (C) 1992, 1996, 1997, 1998, 1999, 2000, 2001, 2002
    Free Software Foundation, Inc.
    Contributed by Ron Guilmette (rfg@monkeys.com)
 
@@ -158,7 +158,7 @@ do { ASM_OUTPUT_LABEL(FILE,LABEL_ALTERNA
 
 /* This determines whether or not we support weak symbols.  */
 #ifndef SUPPORTS_WEAK
-#ifdef ASM_WEAKEN_LABEL
+#if defined (ASM_WEAKEN_LABEL) || defined (ASM_WEAKEN_DECL)
 #define SUPPORTS_WEAK 1
 #else
 #define SUPPORTS_WEAK 0
diff -urpN -xCVS -x*~ -xTAGS gcc-ppc64-31.orig/gcc/varasm.c gcc-ppc64-31/gcc/varasm.c
--- gcc-ppc64-31.orig/gcc/varasm.c	Mon Mar 18 19:16:31 2002
+++ gcc-ppc64-31/gcc/varasm.c	Wed Mar 20 09:54:47 2002
@@ -5106,12 +5106,16 @@ assemble_alias (decl, target)
   ASM_OUTPUT_DEF (asm_out_file, name, IDENTIFIER_POINTER (target));
 #endif
   TREE_ASM_WRITTEN (decl) = 1;
-#else
-#ifdef ASM_OUTPUT_WEAK_ALIAS
+#else /* !ASM_OUTPUT_DEF */
+#if defined (ASM_OUTPUT_WEAK_ALIAS) || defined (ASM_WEAKEN_DECL)
   if (! DECL_WEAK (decl))
     warning ("only weak aliases are supported in this configuration");
 
+#ifdef ASM_WEAKEN_DECL
+  ASM_WEAKEN_DECL (asm_out_file, decl, name, IDENTIFIER_POINTER (target));
+#else
   ASM_OUTPUT_WEAK_ALIAS (asm_out_file, name, IDENTIFIER_POINTER (target));
+#endif
   TREE_ASM_WRITTEN (decl) = 1;
 #else
   warning ("alias definitions not supported in this configuration; ignored");
diff -urpN -xCVS -x*~ -xTAGS gcc-ppc64-31.orig/gcc/doc/tm.texi gcc-ppc64-31/gcc/doc/tm.texi
--- gcc-ppc64-31.orig/gcc/doc/tm.texi	Mon Mar 18 19:16:31 2002
+++ gcc-ppc64-31/gcc/doc/tm.texi	Wed Mar 20 09:54:47 2002
@@ -6174,7 +6174,7 @@ itself; before and after that, output th
 for making that name global, and a newline.
 
 @findex ASM_WEAKEN_LABEL
-@item ASM_WEAKEN_LABEL
+@item ASM_WEAKEN_LABEL (@var{stream}, @var{name})
 A C statement (sans semicolon) to output to the stdio stream
 @var{stream} some commands that will make the label @var{name} weak;
 that is, available for reference from other files but only used if
@@ -6183,18 +6183,29 @@ no other definition is available.  Use t
 itself; before and after that, output the additional assembler syntax
 for making that name weak, and a newline.
 
-If you don't define this macro, GCC will not support weak
-symbols and you should not define the @code{SUPPORTS_WEAK} macro.
+If you don't define this macro or @code{ASM_WEAKEN_DECL}, GCC will not
+support weak symbols and you should not define the @code{SUPPORTS_WEAK}
+macro.
+
+@findex ASM_WEAKEN_DECL
+@item ASM_WEAKEN_DECL (@var{stream}, @var{decl}, @var{name}, @var{value})
+Combines (and replaces) the function of @code{ASM_WEAKEN_LABEL} and
+@code{ASM_OUTPUT_WEAK_ALIAS}, allowing access to the associated function
+or variable decl.  If @var{value} is not @code{NULL}, this C statement
+should output to the stdio stream @var{stream} assembler code which
+defines (equates) the weak symbol @var{name} to have the value
+@var{value}.  If @var{value} is @code{NULL}, it should output commands
+to make @var{name} weak.
 
 @findex SUPPORTS_WEAK
 @item SUPPORTS_WEAK
 A C expression which evaluates to true if the target supports weak symbols.
 
 If you don't define this macro, @file{defaults.h} provides a default
-definition.  If @code{ASM_WEAKEN_LABEL} is defined, the default
-definition is @samp{1}; otherwise, it is @samp{0}.  Define this macro if
-you want to control weak symbol support with a compiler flag such as
-@option{-melf}.
+definition.  If either @code{ASM_WEAKEN_LABEL} or @code{ASM_WEAKEN_DECL}
+is defined, the default definition is @samp{1}; otherwise, it is
+@samp{0}.  Define this macro if you want to control weak symbol support
+with a compiler flag such as @option{-melf}.
 
 @findex MAKE_DECL_ONE_ONLY (@var{decl})
 @item MAKE_DECL_ONE_ONLY

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: f build dies with: undefined reference to `lookup_name'
  2002-03-19 16:40                               ` Alan Modra
@ 2002-03-19 17:02                                 ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-03-19 17:02 UTC (permalink / raw)
  To: gcc-patches

On Wed, Mar 20, 2002 at 11:10:17AM +1030, Alan Modra wrote:
> 	* defaults.h (SUPPORTS_WEAK): Set if ASM_WEAKEN_DECL.
> 	* varasm.c (assemble_alias): Use ASM_WEAKEN_DECL.
> 	* doc/tm.texi (ASM_WEAKEN_DECL): Document.
> 	(ASM_WEAKEN_LABEL): Mention ASM_WEAKEN_DECL.
> 	(SUPPORTS_WEAK): Likewise.
> 
> OK to apply to 3.1 ?

Yes.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* thread-local storage: c front end and generic backend patch
@ 2002-05-21 18:54 Richard Henderson
  2002-05-22  4:25 ` Joseph S. Myers
                   ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Richard Henderson @ 2002-05-21 18:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Joseph S. Myers

[-- Attachment #1: Type: text/plain, Size: 789 bytes --]

The following adds support in the C front end for a new 
storage specifier keyword "__thread" that marks a variable
to be allocated in storage private to every extant thread.
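
A small usage sketch of the extension, assuming a target with TLS and
POSIX threads (the toy_* names are made up for the example): each thread
sees its own instance of the variable, starting from the initial value
rather than from whatever the creating thread stored.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* One instance of this variable exists per thread.  */
static __thread int tls_counter;

static void *
toy_worker (void *arg)
{
  int *seen = arg;
  int i;

  /* This thread's copy starts at 0, whatever other threads stored.  */
  for (i = 0; i < 3; i++)
    tls_counter++;
  *seen = tls_counter;
  return NULL;
}

/* Run two workers; store what each saw in SEEN and return the calling
   thread's own copy of the counter.  */
int
toy_tls_run (int seen[2])
{
  pthread_t threads[2];
  int i;

  tls_counter = 100;
  for (i = 0; i < 2; i++)
    if (pthread_create (&threads[i], NULL, toy_worker, &seen[i]) != 0)
      abort ();
  for (i = 0; i < 2; i++)
    if (pthread_join (threads[i], NULL) != 0)
      abort ();
  return tls_counter;
}

void
toy_tls_demo (void)
{
  int seen[2] = { 0, 0 };
  int mine = toy_tls_run (seen);

  /* Each worker counted from 0; the caller's 100 was not disturbed.  */
  assert (seen[0] == 3 && seen[1] == 3);
  assert (mine == 100);
}
```

On some systems this needs -pthread (or -lpthread) at link time.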

A similar patch for the C++ front end will follow directly;
I wanted to split that out for ease of review by the C++
front end folk.

Joseph, the extend.texi documentation has some user-level
description of the extension.  I've tried to come up with
a set of edits for C99, but I'm not sure where to put them,
or exactly what form they should take.  Thoughts?

There is a fledgeling testsuite here, but it won't get run
until you have target support as well.  I have x86 support
completed, and it'll get submitted as soon as I clean it up
properly and add some autoconf detection logic for binutils
support.


r~

[-- Attachment #2: tls-iso-changes --]
[-- Type: text/plain, Size: 1456 bytes --]

ISO/IEC 9899:1999 edits for thread-local storage:

6.2.4  Storage durations of objects

P3: Add new paragraph before

	An object whose identifier is declared with the storage-class
	specifier @code{__thread} has @dfn{thread storage duration}.
	Its lifetime is the entire execution of the thread, and its
	stored value is initialized only once, prior to thread startup.

6.5.3.2  Address and indirection operators

P3:
	[ I don't think any change is required here.  The explicit
	  semantics of @code{&} require that the result be the address
	  of the tls variable within the current thread, but I don't
	  see that at odds with "returns the address of its operand". ]

6.6 Constant expressions

P9:
	[ No change here, since we've defined @code{__thread} variables
	  to have thread storage duration, not static storage duration,
	  and that isn't listed as legal.  ]

6.7.1 Storage-class specifiers

P1: Add @code{__thread}.

P2: Change to

        With the exception of @code{__thread}, at most one storage-class
        specifier may be given [...].  The @code{__thread} specifier may
	be used alone, or immediately following @code{extern} or
	@code{static}.

P6: Add new paragraph after

        The declaration of an identifier for a variable that has
        block scope that specifies @code{__thread} shall also
        specify either @code{extern} or @code{static}.

        The @code{__thread} specifier shall be used only with
        variables.
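
Illustration of the 6.7.1 wording above, as a sketch only (variable and
function names are invented for the example; the rejected forms are left
in a comment because a compiler implementing these edits is expected to
diagnose them):

```c
#include <assert.h>

/* File scope: `__thread' alone, or immediately after `extern' or
   `static'.  */
extern __thread int tls_extern;   /* declaration */
__thread int tls_extern = 5;      /* matching definition */
static __thread int tls_static;

int
toy_next_id (void)
{
  /* Block scope: `static' (or `extern') is required as well.  */
  static __thread int next_id;

  return ++next_id;
}

/* Expected to be rejected under the proposed rules:

     __thread static int bad_order;      `__thread' before `static'
     void f (void) { __thread int x; }   block scope, no static/extern
     __thread void g (void);             not a variable  */

void
toy_tls_forms_demo (void)
{
  assert (tls_extern == 5);

  tls_static = 7;
  assert (tls_static == 7);

  /* The block-scope counter is per thread and persists across calls.  */
  assert (toy_next_id () == 1);
  assert (toy_next_id () == 2);
}
```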

[-- Attachment #3: d-tls-32-1 --]
[-- Type: text/plain, Size: 37346 bytes --]

        * c-common.h (enum rid): Add RID_THREAD.
        * c-decl.c (start_decl): Do not set DECL_COMMON for tls variables.
        (grokdeclarator): Grok __thread.
        * c-parse.in (reswords): Add __thread.
        (rid_to_yy): Add RID_THREAD.

        * tree.h (DECL_THREAD_LOCAL): New.
        (struct tree_decl): Add thread_local_flag.
        * print-tree.c (print_node): Dump DECL_THREAD_LOCAL.
        * tree.c (staticp): TLS variables are not static.

        * target-def.h (TARGET_HAVE_TLS): New.
        * target.h (have_tls): New.
        * output.h (SECTION_TLS): New.
        * varasm.c (assemble_variable): TLS variables can't be common for now.
        (default_section_type_flags): Handle .tdata and .tbss.
        (default_elf_asm_named_section): Handle SECTION_TLS.
        (categorize_decl_for_section): Handle DECL_THREAD_LOCAL.

        * flags.h (flag_tls_default): Declare.
        * toplev.c (flag_tls_default): Define.
        (display_help): Display help for it.
        (decode_f_option): Set it.

        * doc/extend.texi (Thread-Local): New node describing language-level
        thread-local storage.
        * doc/invoke.texi (-ftls-model): Document.

        * fixinc/inclhack.def (thread_keyword): New.
        * fixinc/fixincl.x: Rebuild.

cp/
        * lex.c (rid_to_yy): Add RID_THREAD.

testsuite/
        * gcc.dg/tls/tls.exp, gcc.dg/tls/trivial.c, gcc.dg/tls/diag-1.c,
        gcc.dg/tls/diag-2.c, gcc.dg/tls/init-1.c: New directory and files.


Index: c-common.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-common.h,v
retrieving revision 1.136
diff -c -p -d -r1.136 c-common.h
*** c-common.h	18 May 2002 19:02:01 -0000	1.136
--- c-common.h	21 May 2002 22:10:18 -0000
*************** enum rid
*** 58,64 ****
    RID_VOLATILE, RID_SIGNED,  RID_AUTO,  RID_RESTRICT,
  
    /* C extensions */
!   RID_BOUNDED, RID_UNBOUNDED, RID_COMPLEX,
  
    /* C++ */
    RID_FRIEND, RID_VIRTUAL, RID_EXPLICIT, RID_EXPORT, RID_MUTABLE,
--- 58,64 ----
    RID_VOLATILE, RID_SIGNED,  RID_AUTO,  RID_RESTRICT,
  
    /* C extensions */
!   RID_BOUNDED, RID_UNBOUNDED, RID_COMPLEX, RID_THREAD,
  
    /* C++ */
    RID_FRIEND, RID_VIRTUAL, RID_EXPLICIT, RID_EXPORT, RID_MUTABLE,
Index: c-decl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-decl.c,v
retrieving revision 1.326
diff -c -p -d -r1.326 c-decl.c
*** c-decl.c	18 May 2002 19:02:02 -0000	1.326
--- c-decl.c	21 May 2002 22:10:19 -0000
*************** start_decl (declarator, declspecs, initi
*** 3350,3358 ****
    /* ANSI specifies that a tentative definition which is not merged with
       a non-tentative definition behaves exactly like a definition with an
       initializer equal to zero.  (Section 3.7.2)
!      -fno-common gives strict ANSI behavior.  Usually you don't want it.
!      This matters only for variables with external linkage.  */
!   if (!initialized && (! flag_no_common || ! TREE_PUBLIC (decl)))
      DECL_COMMON (decl) = 1;
  
    /* Set attributes here so if duplicate decl, will have proper attributes.  */
--- 3350,3368 ----
    /* ANSI specifies that a tentative definition which is not merged with
       a non-tentative definition behaves exactly like a definition with an
       initializer equal to zero.  (Section 3.7.2)
! 
!      -fno-common gives strict ANSI behavior, though this tends to break
!      a large body of code that grew up without this rule.
! 
!      Thread-local variables are never common, since there's no entrenched
!      body of code to break, and it allows more efficient variable references
!      in the presence of dynamic linking.  */
! 
!   if (TREE_CODE (decl) == VAR_DECL
!       && !initialized
!       && TREE_PUBLIC (decl)
!       && !DECL_THREAD_LOCAL (decl)
!       && !flag_no_common)
      DECL_COMMON (decl) = 1;
  
    /* Set attributes here so if duplicate decl, will have proper attributes.  */
*************** grokdeclarator (declarator, declspecs, d
*** 3933,3939 ****
  	  enum rid i = C_RID_CODE (id);
  	  if ((int) i <= (int) RID_LAST_MODIFIER)
  	    {
! 	      if (i == RID_LONG && (specbits & (1 << (int) i)))
  		{
  		  if (longlong)
  		    error ("`long long long' is too long for GCC");
--- 3943,3949 ----
  	  enum rid i = C_RID_CODE (id);
  	  if ((int) i <= (int) RID_LAST_MODIFIER)
  	    {
! 	      if (i == RID_LONG && (specbits & (1 << (int) RID_LONG)))
  		{
  		  if (longlong)
  		    error ("`long long long' is too long for GCC");
*************** grokdeclarator (declarator, declspecs, d
*** 3947,3952 ****
--- 3957,3975 ----
  		}
  	      else if (specbits & (1 << (int) i))
  		pedwarn ("duplicate `%s'", IDENTIFIER_POINTER (id));
+ 
+ 	      /* Diagnose "__thread extern".  Recall that this list
+ 		 is in the reverse order seen in the text.  */
+ 	      if (i == RID_THREAD
+ 		  && (specbits & (1 << (int) RID_EXTERN
+ 				  | 1 << (int) RID_STATIC)))
+ 		{
+ 		  if (specbits & 1 << (int) RID_EXTERN)
+ 		    error ("`__thread' before `extern'");
+ 		  else
+ 		    error ("`__thread' before `static'");
+ 		}
+ 
  	      specbits |= 1 << (int) i;
  	      goto found;
  	    }
*************** grokdeclarator (declarator, declspecs, d
*** 4196,4201 ****
--- 4219,4230 ----
      if (specbits & 1 << (int) RID_REGISTER) nclasses++;
      if (specbits & 1 << (int) RID_TYPEDEF) nclasses++;
  
+     /* "static __thread" and "extern __thread" are allowed.  */
+     if ((specbits & (1 << (int) RID_THREAD
+ 		     | 1 << (int) RID_STATIC
+ 		     | 1 << (int) RID_EXTERN)) == (1 << (int) RID_THREAD))
+       nclasses++;
+ 
      /* Warn about storage classes that are invalid for certain
         kinds of declarations (parameters, typenames, etc.).  */
  
*************** grokdeclarator (declarator, declspecs, d
*** 4205,4211 ****
  	     && (specbits
  		 & ((1 << (int) RID_REGISTER)
  		    | (1 << (int) RID_AUTO)
! 		    | (1 << (int) RID_TYPEDEF))))
        {
  	if (specbits & 1 << (int) RID_AUTO
  	    && (pedantic || current_binding_level == global_binding_level))
--- 4234,4241 ----
  	     && (specbits
  		 & ((1 << (int) RID_REGISTER)
  		    | (1 << (int) RID_AUTO)
! 		    | (1 << (int) RID_TYPEDEF)
! 		    | (1 << (int) RID_THREAD))))
        {
  	if (specbits & 1 << (int) RID_AUTO
  	    && (pedantic || current_binding_level == global_binding_level))
*************** grokdeclarator (declarator, declspecs, d
*** 4214,4221 ****
  	  error ("function definition declared `register'");
  	if (specbits & 1 << (int) RID_TYPEDEF)
  	  error ("function definition declared `typedef'");
  	specbits &= ~((1 << (int) RID_TYPEDEF) | (1 << (int) RID_REGISTER)
! 		      | (1 << (int) RID_AUTO));
        }
      else if (decl_context != NORMAL && nclasses > 0)
        {
--- 4244,4253 ----
  	  error ("function definition declared `register'");
  	if (specbits & 1 << (int) RID_TYPEDEF)
  	  error ("function definition declared `typedef'");
+ 	if (specbits & 1 << (int) RID_THREAD)
+ 	  error ("function definition declared `__thread'");
  	specbits &= ~((1 << (int) RID_TYPEDEF) | (1 << (int) RID_REGISTER)
! 		      | (1 << (int) RID_AUTO) | (1 << (int) RID_THREAD));
        }
      else if (decl_context != NORMAL && nclasses > 0)
        {
*************** grokdeclarator (declarator, declspecs, d
*** 4238,4244 ****
  	      }
  	    specbits &= ~((1 << (int) RID_TYPEDEF) | (1 << (int) RID_REGISTER)
  			  | (1 << (int) RID_AUTO) | (1 << (int) RID_STATIC)
! 			  | (1 << (int) RID_EXTERN));
  	  }
        }
      else if (specbits & 1 << (int) RID_EXTERN && initialized && ! funcdef_flag)
--- 4270,4276 ----
  	      }
  	    specbits &= ~((1 << (int) RID_TYPEDEF) | (1 << (int) RID_REGISTER)
  			  | (1 << (int) RID_AUTO) | (1 << (int) RID_STATIC)
! 			  | (1 << (int) RID_EXTERN) | (1 << (int) RID_THREAD));
  	  }
        }
      else if (specbits & 1 << (int) RID_EXTERN && initialized && ! funcdef_flag)
*************** grokdeclarator (declarator, declspecs, d
*** 4249,4260 ****
  	else
  	  error ("`%s' has both `extern' and initializer", name);
        }
!     else if (specbits & 1 << (int) RID_EXTERN && funcdef_flag
! 	     && current_binding_level != global_binding_level)
!       error ("nested function `%s' declared `extern'", name);
!     else if (current_binding_level == global_binding_level
! 	     && specbits & (1 << (int) RID_AUTO))
!       error ("top-level declaration of `%s' specifies `auto'", name);
    }
  
    /* Now figure out the structure of the declarator proper.
--- 4281,4305 ----
  	else
  	  error ("`%s' has both `extern' and initializer", name);
        }
!     else if (current_binding_level == global_binding_level)
!       {
! 	if (specbits & 1 << (int) RID_AUTO)
! 	  error ("top-level declaration of `%s' specifies `auto'", name);
!       }
!     else
!       {
! 	if (specbits & 1 << (int) RID_EXTERN && funcdef_flag)
! 	  error ("nested function `%s' declared `extern'", name);
! 	else if ((specbits & (1 << (int) RID_THREAD
! 			       | 1 << (int) RID_EXTERN
! 			       | 1 << (int) RID_STATIC))
! 		 == (1 << (int) RID_THREAD))
! 	  {
! 	    error ("function-scope `%s' implicitly auto and declared `__thread'",
! 		   name);
! 	    specbits &= ~(1 << (int) RID_THREAD);
! 	  }
!       }
    }
  
    /* Now figure out the structure of the declarator proper.
*************** grokdeclarator (declarator, declspecs, d
*** 4842,4847 ****
--- 4887,4894 ----
  	  pedwarn ("invalid storage class for function `%s'", name);
  	if (specbits & (1 << (int) RID_REGISTER))
  	  error ("invalid storage class for function `%s'", name);
+ 	if (specbits & (1 << (int) RID_THREAD))
+ 	  error ("invalid storage class for function `%s'", name);
  	/* Function declaration not at top level.
  	   Storage classes other than `extern' are not allowed
  	   and `extern' makes no difference.  */
*************** grokdeclarator (declarator, declspecs, d
*** 4934,4955 ****
  	  pedwarn_with_decl (decl, "variable `%s' declared `inline'");
  
  	DECL_EXTERNAL (decl) = extern_ref;
  	/* At top level, the presence of a `static' or `register' storage
  	   class specifier, or the absence of all storage class specifiers
  	   makes this declaration a definition (perhaps tentative).  Also,
  	   the absence of both `static' and `register' makes it public.  */
  	if (current_binding_level == global_binding_level)
  	  {
! 	    TREE_PUBLIC (decl)
! 	      = !(specbits
! 		  & ((1 << (int) RID_STATIC) | (1 << (int) RID_REGISTER)));
! 	    TREE_STATIC (decl) = ! DECL_EXTERNAL (decl);
  	  }
  	/* Not at top level, only `static' makes a static definition.  */
  	else
  	  {
  	    TREE_STATIC (decl) = (specbits & (1 << (int) RID_STATIC)) != 0;
! 	    TREE_PUBLIC (decl) = DECL_EXTERNAL (decl);
  	  }
        }
  
--- 4981,5012 ----
  	  pedwarn_with_decl (decl, "variable `%s' declared `inline'");
  
  	DECL_EXTERNAL (decl) = extern_ref;
+ 
  	/* At top level, the presence of a `static' or `register' storage
  	   class specifier, or the absence of all storage class specifiers
  	   makes this declaration a definition (perhaps tentative).  Also,
  	   the absence of both `static' and `register' makes it public.  */
  	if (current_binding_level == global_binding_level)
  	  {
! 	    TREE_PUBLIC (decl) = !(specbits & ((1 << (int) RID_STATIC)
! 					       | (1 << (int) RID_REGISTER)));
! 	    TREE_STATIC (decl) = !extern_ref;
  	  }
  	/* Not at top level, only `static' makes a static definition.  */
  	else
  	  {
  	    TREE_STATIC (decl) = (specbits & (1 << (int) RID_STATIC)) != 0;
! 	    TREE_PUBLIC (decl) = extern_ref;
! 	  }
! 
! 	if (specbits & 1 << (int) RID_THREAD)
! 	  {
! 	    if (targetm.have_tls)
! 	      DECL_THREAD_LOCAL (decl) = 1;
! 	    else
! 	      /* A mere warning is sure to result in improper semantics
! 		 at runtime.  Don't bother to allow this to compile.  */
! 	      error ("thread-local storage not supported for this target");
  	  }
        }
  
Index: c-parse.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-parse.in,v
retrieving revision 1.139
diff -c -p -d -r1.139 c-parse.in
*** c-parse.in	27 Apr 2002 06:53:06 -0000	1.139
--- c-parse.in	21 May 2002 22:10:19 -0000
*************** static const struct resword reswords[] =
*** 3343,3348 ****
--- 3343,3349 ----
    { "__restrict__",	RID_RESTRICT,	0 },
    { "__signed",		RID_SIGNED,	0 },
    { "__signed__",	RID_SIGNED,	0 },
+   { "__thread",		RID_THREAD,	0 },
    { "__typeof",		RID_TYPEOF,	0 },
    { "__typeof__",	RID_TYPEOF,	0 },
    { "__unbounded",	RID_UNBOUNDED,	0 },
*************** static const short rid_to_yy[RID_MAX] =
*** 3438,3443 ****
--- 3439,3445 ----
    /* RID_BOUNDED */	TYPE_QUAL,
    /* RID_UNBOUNDED */	TYPE_QUAL,
    /* RID_COMPLEX */	TYPESPEC,
+   /* RID_THREAD */	SCSPEC,
  
    /* C++ */
    /* RID_FRIEND */	0,
Index: flags.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/flags.h,v
retrieving revision 1.84
diff -c -p -d -r1.84 flags.h
*** flags.h	15 May 2002 09:00:01 -0000	1.84
--- flags.h	21 May 2002 22:10:19 -0000
*************** extern int flag_dump_unnumbered;
*** 458,467 ****
  
  extern int flag_pedantic_errors;
  
! /* Nonzero means generate position-independent code.
!    This is not fully implemented yet.  */
  
  extern int flag_pic;
  
  /* Nonzero means generate extra code for exception handling and enable
     exception handling.  */
--- 458,478 ----
  
  extern int flag_pedantic_errors;
  
! /* Nonzero means generate position-independent code.  1 vs 2 for a 
!    target-dependent "small" or "large" mode.  */
  
  extern int flag_pic;
+ 
+ /* Set to the default thread-local storage (tls) model to use.  */
+ 
+ enum tls_model {
+   TLS_MODEL_GLOBAL_DYNAMIC,
+   TLS_MODEL_LOCAL_DYNAMIC,
+   TLS_MODEL_INITIAL_EXEC,
+   TLS_MODEL_LOCAL_EXEC
+ };
+ 
+ extern enum tls_model flag_tls_default;
  
  /* Nonzero means generate extra code for exception handling and enable
     exception handling.  */
Index: output.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/output.h,v
retrieving revision 1.105
diff -c -p -d -r1.105 output.h
*** output.h	19 May 2002 09:50:11 -0000	1.105
--- output.h	21 May 2002 22:10:19 -0000
*************** extern void no_asm_to_stream PARAMS ((FI
*** 507,513 ****
  #define SECTION_STRINGS  0x10000	/* contains zero terminated strings without
  					   embedded zeros */
  #define SECTION_OVERRIDE 0x20000	/* allow override of default flags */
! #define SECTION_MACH_DEP 0x40000	/* subsequent bits reserved for target */
  
  extern unsigned int get_named_section_flags PARAMS ((const char *));
  extern bool set_named_section_flags	PARAMS ((const char *, unsigned int));
--- 507,514 ----
  #define SECTION_STRINGS  0x10000	/* contains zero terminated strings without
  					   embedded zeros */
  #define SECTION_OVERRIDE 0x20000	/* allow override of default flags */
! #define SECTION_TLS	 0x40000	/* contains thread-local storage */
! #define SECTION_MACH_DEP 0x80000	/* subsequent bits reserved for target */
  
  extern unsigned int get_named_section_flags PARAMS ((const char *));
  extern bool set_named_section_flags	PARAMS ((const char *, unsigned int));
Index: print-tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/print-tree.c,v
retrieving revision 1.57
diff -c -p -d -r1.57 print-tree.c
*** print-tree.c	20 May 2002 18:06:54 -0000	1.57
--- print-tree.c	21 May 2002 22:10:19 -0000
*************** print_node (file, prefix, node, indent)
*** 352,357 ****
--- 352,359 ----
  
        if (TREE_CODE (node) == VAR_DECL && DECL_IN_TEXT_SECTION (node))
  	fputs (" in-text-section", file);
+       if (TREE_CODE (node) == VAR_DECL && DECL_THREAD_LOCAL (node))
+ 	fputs (" thread-local", file);
  
        if (TREE_CODE (node) == PARM_DECL && DECL_TRANSPARENT_UNION (node))
  	fputs (" transparent-union", file);
Index: target-def.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/target-def.h,v
retrieving revision 1.28
diff -c -p -d -r1.28 target-def.h
*** target-def.h	19 May 2002 09:50:12 -0000	1.28
--- target-def.h	21 May 2002 22:10:19 -0000
*************** Foundation, 59 Temple Place - Suite 330,
*** 110,115 ****
--- 110,119 ----
  #define TARGET_HAVE_NAMED_SECTIONS false
  #endif
  
+ #ifndef TARGET_HAVE_TLS
+ #define TARGET_HAVE_TLS false
+ #endif
+ 
  #ifndef TARGET_ASM_EXCEPTION_SECTION
  #define TARGET_ASM_EXCEPTION_SECTION default_exception_section
  #endif
*************** Foundation, 59 Temple Place - Suite 330,
*** 244,249 ****
--- 248,254 ----
    TARGET_STRIP_NAME_ENCODING,			\
    TARGET_HAVE_NAMED_SECTIONS,			\
    TARGET_HAVE_CTORS_DTORS,			\
+   TARGET_HAVE_TLS				\
  }
  
  #include "hooks.h"
Index: target.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/target.h,v
retrieving revision 1.30
diff -c -p -d -r1.30 target.h
*** target.h	19 May 2002 09:50:14 -0000	1.30
--- target.h	21 May 2002 22:10:19 -0000
*************** struct gcc_target
*** 256,261 ****
--- 256,264 ----
    /* True if "native" constructors and destructors are supported,
       false if we're using collect2 for the job.  */
    bool have_ctors_dtors;
+ 
+   /* True if thread-local storage is supported.  */
+   bool have_tls;
  };
  
  extern struct gcc_target targetm;
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.628
diff -c -p -d -r1.628 toplev.c
*** toplev.c	19 May 2002 08:31:47 -0000	1.628
--- toplev.c	21 May 2002 22:10:19 -0000
*************** int flag_shared_data;
*** 685,696 ****
  int flag_delayed_branch;
  
  /* Nonzero if we are compiling pure (sharable) code.
!    Value is 1 if we are doing reasonable (i.e. simple
!    offset into offset table) pic.  Value is 2 if we can
!    only perform register offsets.  */
  
  int flag_pic;
  
  /* Nonzero means generate extra code for exception handling and enable
     exception handling.  */
  
--- 685,699 ----
  int flag_delayed_branch;
  
  /* Nonzero if we are compiling pure (sharable) code.
!    Value is 1 if we are doing "small" pic; value is 2 if we're doing
!    "large" pic.  */
  
  int flag_pic;
  
+ /* Set to the default thread-local storage (tls) model to use.  */
+ 
+ enum tls_model flag_tls_default;
+ 
  /* Nonzero means generate extra code for exception handling and enable
     exception handling.  */
  
*************** display_help ()
*** 3547,3552 ****
--- 3550,3556 ----
    printf (_("  -finline-limit=<number> Limits the size of inlined functions to <number>\n"));
    printf (_("  -fmessage-length=<number> Limits diagnostics messages lengths to <number> characters per line.  0 suppresses line-wrapping\n"));
    printf (_("  -fdiagnostics-show-location=[once | every-line] Indicates how often source location information should be emitted, as prefix, at the beginning of diagnostics when line-wrapping\n"));
+   printf (_("  -ftls-model=[global-dynamic | local-dynamic | initial-exec | local-exec] Indicates the default thread-local storage code generation model\n"));
  
    for (i = ARRAY_SIZE (f_options); i--;)
      {
*************** decode_f_option (arg)
*** 3824,3829 ****
--- 3828,3846 ----
  	read_integral_parameter (option_value, arg - 2,
  				 MAX_INLINE_INSNS);
        set_param_value ("max-inline-insns", val);
+     }
+   else if ((option_value = skip_leading_substring (arg, "tls-model=")))
+     {
+       if (strcmp (option_value, "global-dynamic") == 0)
+ 	flag_tls_default = TLS_MODEL_GLOBAL_DYNAMIC;
+       else if (strcmp (option_value, "local-dynamic") == 0)
+ 	flag_tls_default = TLS_MODEL_LOCAL_DYNAMIC;
+       else if (strcmp (option_value, "initial-exec") == 0)
+ 	flag_tls_default = TLS_MODEL_INITIAL_EXEC;
+       else if (strcmp (option_value, "local-exec") == 0)
+ 	flag_tls_default = TLS_MODEL_LOCAL_EXEC;
+       else
+ 	warning ("`%s': unknown tls-model option", arg - 2);
      }
  #ifdef INSN_SCHEDULING
    else if ((option_value = skip_leading_substring (arg, "sched-verbose=")))
Index: tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.c,v
retrieving revision 1.257
diff -c -p -d -r1.257 tree.c
*** tree.c	9 May 2002 22:48:33 -0000	1.257
--- tree.c	21 May 2002 22:10:19 -0000
*************** staticp (arg)
*** 1349,1360 ****
      case FUNCTION_DECL:
        /* Nested functions aren't static, since taking their address
  	 involves a trampoline.  */
!       return (decl_function_context (arg) == 0 || DECL_NO_STATIC_CHAIN (arg))
! 	&& ! DECL_NON_ADDR_CONST_P (arg);
  
      case VAR_DECL:
!       return (TREE_STATIC (arg) || DECL_EXTERNAL (arg))
! 	&& ! DECL_NON_ADDR_CONST_P (arg);
  
      case CONSTRUCTOR:
        return TREE_STATIC (arg);
--- 1349,1361 ----
      case FUNCTION_DECL:
        /* Nested functions aren't static, since taking their address
  	 involves a trampoline.  */
!       return ((decl_function_context (arg) == 0 || DECL_NO_STATIC_CHAIN (arg))
! 	      && ! DECL_NON_ADDR_CONST_P (arg));
  
      case VAR_DECL:
!       return ((TREE_STATIC (arg) || DECL_EXTERNAL (arg))
! 	      && ! DECL_THREAD_LOCAL (arg)
! 	      && ! DECL_NON_ADDR_CONST_P (arg));
  
      case CONSTRUCTOR:
        return TREE_STATIC (arg);
Index: tree.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.h,v
retrieving revision 1.336
diff -c -p -d -r1.336 tree.h
*** tree.h	12 May 2002 21:42:00 -0000	1.336
--- tree.h	21 May 2002 22:10:19 -0000
*************** struct tree_type
*** 1615,1620 ****
--- 1615,1624 ----
  /* In a FUNCTION_DECL, nonzero if the function cannot be inlined.  */
  #define DECL_UNINLINABLE(NODE) (FUNCTION_DECL_CHECK (NODE)->decl.uninlinable)
  
+ /* In a VAR_DECL, nonzero if the data should be allocated from 
+    thread-local storage.  */
+ #define DECL_THREAD_LOCAL(NODE) (VAR_DECL_CHECK (NODE)->decl.thread_local_flag)
+ 
  /* In a FUNCTION_DECL, the saved representation of the body of the
     entire function.  Usually a COMPOUND_STMT, but in C++ this may also
     be a RETURN_INIT, CTOR_INITIALIZER, or TRY_BLOCK.  */
*************** struct tree_decl
*** 1793,1799 ****
    unsigned non_addressable : 1;
    unsigned user_align : 1;
    unsigned uninlinable : 1;
!   /* Three unused bits.  */
  
    unsigned lang_flag_0 : 1;
    unsigned lang_flag_1 : 1;
--- 1797,1804 ----
    unsigned non_addressable : 1;
    unsigned user_align : 1;
    unsigned uninlinable : 1;
!   unsigned thread_local_flag : 1;
!   /* Two unused bits.  */
  
    unsigned lang_flag_0 : 1;
    unsigned lang_flag_1 : 1;
Index: varasm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/varasm.c,v
retrieving revision 1.283
diff -c -p -d -r1.283 varasm.c
*** varasm.c	19 May 2002 20:17:50 -0000	1.283
--- varasm.c	21 May 2002 22:10:19 -0000
*************** assemble_variable (decl, top_level, at_e
*** 1586,1604 ****
  
    /* Handle uninitialized definitions.  */
  
!   if ((DECL_INITIAL (decl) == 0 || DECL_INITIAL (decl) == error_mark_node
! #if defined ASM_EMIT_BSS
!        || (flag_zero_initialized_in_bss
! 	   && initializer_zerop (DECL_INITIAL (decl)))
! #endif
!        )
!       /* If the target can't output uninitialized but not common global data
! 	 in .bss, then we have to use .data.  */
! #if ! defined ASM_EMIT_BSS
!       && DECL_COMMON (decl)
  #endif
!       && DECL_SECTION_NAME (decl) == NULL_TREE
!       && ! dont_output_data)
      {
        unsigned HOST_WIDE_INT size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
        unsigned HOST_WIDE_INT rounded = size;
--- 1586,1613 ----
  
    /* Handle uninitialized definitions.  */
  
!   /* If the decl has been given an explicit section name, then it
!      isn't common, and shouldn't be handled as such.  */
!   if (DECL_SECTION_NAME (decl) || dont_output_data)
!     ;
!   /* We don't implement common thread-local data at present.  */
!   else if (DECL_THREAD_LOCAL (decl))
!     {
!       if (DECL_COMMON (decl))
! 	sorry ("thread-local COMMON data not implemented");
!     }
! #ifndef ASM_EMIT_BSS
!   /* If the target can't output uninitialized but not common global data
!      in .bss, then we have to use .data.  */
!   /* ??? We should handle .bss via select_section mechanisms rather than
!      via special target hooks.  That would eliminate this special case.  */
!   else if (!DECL_COMMON (decl))
!     ;
  #endif
!   else if (DECL_INITIAL (decl) == 0
! 	   || DECL_INITIAL (decl) == error_mark_node
!            || (flag_zero_initialized_in_bss
! 	       && initializer_zerop (DECL_INITIAL (decl))))
      {
        unsigned HOST_WIDE_INT size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
        unsigned HOST_WIDE_INT rounded = size;
*************** default_section_type_flags (decl, name, 
*** 5101,5109 ****
        || strncmp (name, ".gnu.linkonce.b.", 16) == 0
        || strcmp (name, ".sbss") == 0
        || strncmp (name, ".sbss.", 6) == 0
!       || strncmp (name, ".gnu.linkonce.sb.", 17) == 0)
      flags |= SECTION_BSS;
  
    return flags;
  }
  
--- 5110,5123 ----
        || strncmp (name, ".gnu.linkonce.b.", 16) == 0
        || strcmp (name, ".sbss") == 0
        || strncmp (name, ".sbss.", 6) == 0
!       || strncmp (name, ".gnu.linkonce.sb.", 17) == 0
!       || strcmp (name, ".tbss") == 0)
      flags |= SECTION_BSS;
  
+   if (strcmp (name, ".tdata") == 0
+       || strcmp (name, ".tbss") == 0)
+     flags |= SECTION_TLS;
+ 
    return flags;
  }
  
*************** default_elf_asm_named_section (name, fla
*** 5146,5151 ****
--- 5160,5167 ----
      *f++ = 'M';
    if (flags & SECTION_STRINGS)
      *f++ = 'S';
+   if (flags & SECTION_TLS)
+     *f++ = 'T';
    *f = '\0';
  
    if (flags & SECTION_BSS)
*************** categorize_decl_for_section (decl, reloc
*** 5353,5360 ****
    else
      ret = SECCAT_RODATA;
  
    /* If the target uses small data sections, select it.  */
!   if ((*targetm.in_small_data_p) (decl))
      {
        if (ret == SECCAT_BSS)
  	ret = SECCAT_SBSS;
--- 5369,5385 ----
    else
      ret = SECCAT_RODATA;
  
+   /* There are no read-only thread-local sections.  */
+   if (TREE_CODE (decl) == VAR_DECL && DECL_THREAD_LOCAL (decl))
+     {
+       if (ret == SECCAT_BSS)
+ 	ret = SECCAT_TBSS;
+       else
+ 	ret = SECCAT_TDATA;
+     }
+ 
    /* If the target uses small data sections, select it.  */
!   else if ((*targetm.in_small_data_p) (decl))
      {
        if (ret == SECCAT_BSS)
  	ret = SECCAT_SBSS;
Index: cp/lex.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/lex.c,v
retrieving revision 1.277
diff -c -p -d -r1.277 lex.c
*** cp/lex.c	25 Apr 2002 06:24:34 -0000	1.277
--- cp/lex.c	21 May 2002 22:10:21 -0000
*************** const short rid_to_yy[RID_MAX] =
*** 474,479 ****
--- 474,480 ----
    /* RID_BOUNDED */	0,
    /* RID_UNBOUNDED */	0,
    /* RID_COMPLEX */	TYPESPEC,
+   /* RID_THREAD */	0,
  
    /* C++ */
    /* RID_FRIEND */	SCSPEC,
Index: doc/extend.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/extend.texi,v
retrieving revision 1.70
diff -c -p -d -r1.70 extend.texi
*** doc/extend.texi	11 May 2002 16:25:04 -0000	1.70
--- doc/extend.texi	21 May 2002 22:10:22 -0000
*************** extensions, accepted by GCC in C89 mode 
*** 432,437 ****
--- 432,438 ----
  * Target Builtins::     Built-in functions specific to particular targets.
  * Pragmas::             Pragmas accepted by GCC.
  * Unnamed Fields::      Unnamed struct/union fields within structs/unions.
+ * Thread-Local::        Per-thread variables.
  @end menu
  
  @node Statement Exprs
*************** struct @{
*** 6164,6169 ****
--- 6165,6219 ----
  It is ambiguous which @code{a} is being referred to with @samp{foo.a}.
  Such constructs are not supported and must be avoided.  In the future,
  such constructs may be detected and treated as compilation errors.
+ 
+ @node Thread-Local
+ @section Thread-Local Storage
+ @cindex Thread-Local Storage
+ @cindex TLS
+ @cindex __thread
+ 
+ Thread-local storage (TLS) is a mechanism by which variables are
+ allocated such that there is one instance of the variable per extant
+ thread.  The run-time model GCC uses to implement this originates
+ in the IA-64 processor-specific ABI, but has since been migrated
+ to other processors as well.  It requires significant support from
+ the linker (@command{ld}), dynamic linker (@command{ld.so}), and
+ system libraries (@file{libc.so} and @file{libpthread.so}), so it
+ is not supported everywhere.
+ 
+ At the user level, the extension is visible with a new storage
+ class keyword: @code{__thread}.  For example:
+ 
+ @example
+ __thread int i;
+ extern __thread struct state s;
+ static __thread char *p;
+ @end example
+ 
+ The @code{__thread} specifier may be used alone or with the @code{extern}
+ or @code{static} specifiers, but with no other storage class specifier.
+ When used with @code{extern} or @code{static}, @code{__thread} must appear
+ immediately after the other storage class specifier.
+ 
+ The @code{__thread} specifier may be applied to any global, file-scoped
+ static, function-scoped static, or class-scoped static variable.  It may
+ not be applied to function-scoped automatic or class-scoped member variables.
+ 
+ When the address-of operator is applied to a thread-local variable, it is
+ evaluated at run-time and returns the address of the current thread's
+ instance of that variable.  An address so obtained may be used by any
+ thread.  When a thread terminates, any pointers to thread-local variables
+ in that thread become invalid.
+ 
+ No static initialization may refer to the address of a thread-local variable.
+ 
+ In C++, a thread-local variable may not be initialized by a static
+ constructor.
+ 
+ See @uref{http://people.redhat.com/drepper/tls.pdf,
+ ELF Handling For Thread-Local Storage} for a detailed explanation of
+ the four thread-local storage addressing models, and how the run-time
+ is expected to function.
  
  @node C++ Extensions
  @chapter Extensions to the C++ Language
Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.146
diff -c -p -d -r1.146 invoke.texi
*** doc/invoke.texi	18 May 2002 19:02:02 -0000	1.146
--- doc/invoke.texi	21 May 2002 22:10:22 -0000
*************** in the following sections.
*** 677,683 ****
  -fverbose-asm  -fpack-struct  -fstack-check @gol
  -fstack-limit-register=@var{reg}  -fstack-limit-symbol=@var{sym} @gol
  -fargument-alias  -fargument-noalias @gol
! -fargument-noalias-global  -fleading-underscore}
  @end table
  
  @menu
--- 677,683 ----
  -fverbose-asm  -fpack-struct  -fstack-check @gol
  -fstack-limit-register=@var{reg}  -fstack-limit-symbol=@var{sym} @gol
  -fargument-alias  -fargument-noalias @gol
! -fargument-noalias-global  -fleading-underscore -ftls-model=@var{model}}
  @end table
  
  @menu
*************** is to help link with legacy assembly cod
*** 9915,9920 ****
--- 9915,9928 ----
  
  Be warned that you should know what you are doing when invoking this
  option, and that not all targets provide complete support for it.
+ 
+ @item -ftls-model=@var{model}
+ Alter the thread-local storage model to be used (@pxref{Thread-Local}).
+ The @var{model} argument should be one of @code{global-dynamic},
+ @code{local-dynamic}, @code{initial-exec} or @code{local-exec}.
+ 
+ The default without @option{-fpic} is @code{initial-exec}; with
+ @option{-fpic} the default is @code{global-dynamic}.
  @end table
  
  @c man end
Index: fixinc/inclhack.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/fixinc/inclhack.def,v
retrieving revision 1.128
diff -c -p -d -r1.128 inclhack.def
*** fixinc/inclhack.def	14 May 2002 00:33:14 -0000	1.128
--- fixinc/inclhack.def	21 May 2002 22:10:23 -0000
***************
*** 1,4 ****
- 
  /* -*- Mode: C -*-  */
  
  autogen definitions fixincl;
--- 1,3 ----
*************** fix = {
*** 2885,2890 ****
--- 2884,2903 ----
      "extern char*\tbsearch(void*,size_t,size_t);\n";
  };
  
+ 
+ /*
+  * __thread is now a keyword.
+  */
+ fix = {
+     hackname	= thread_keyword;
+     files	= "pthread.h";
+     files	= "bits/sigthread.h";
+     select	= "pthread_t __thread";
+ 
+     sed		= "s/pthread_t __thread\\([^a-z0-9_]\\)/pthread_t __thr\\1/";
+ 
+     test_text	= "extern int pthread_kill (pthread_t __thread, int __signo);";
+ };
  
  /*
   *  if the #if says _cplusplus, not the double underscore __cplusplus
Index: testsuite/gcc.dg/tls/diag-1.c
===================================================================
RCS file: testsuite/gcc.dg/tls/diag-1.c
diff -N testsuite/gcc.dg/tls/diag-1.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- testsuite/gcc.dg/tls/diag-1.c	21 May 2002 22:51:13 -0000
***************
*** 0 ****
--- 1,11 ----
+ /* Valid __thread specifiers.  */
+ 
+ __thread int g1;
+ extern __thread int g2;
+ static __thread int g3;
+ 
+ void foo()
+ {
+   extern __thread int l1;
+   static __thread int l2;
+ }
Index: testsuite/gcc.dg/tls/diag-2.c
===================================================================
RCS file: testsuite/gcc.dg/tls/diag-2.c
diff -N testsuite/gcc.dg/tls/diag-2.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- testsuite/gcc.dg/tls/diag-2.c	21 May 2002 22:51:13 -0000
***************
*** 0 ****
--- 1,21 ----
+ /* Invalid __thread specifiers.  */
+ 
+ __thread extern int g1;		/* { dg-error "`__thread' before `extern'" } */
+ __thread static int g2;		/* { dg-error "`__thread' before `static'" } */
+ __thread __thread int g3;	/* { dg-error "duplicate `__thread'" } */
+ typedef __thread int g4;	/* { dg-error "multiple storage classes" } */
+ 
+ void foo()
+ {
+   __thread int l1;		/* { dg-error "implicitly auto and declared `__thread'" } */
+   auto __thread int l2;		/* { dg-error "multiple storage classes" } */
+   __thread extern int l3;	/* { dg-error "`__thread' before `extern'" } */
+   register __thread int l4;	/* { dg-error "multiple storage classes" } */
+ }
+ 
+ __thread void f1 ();		/* { dg-error "invalid storage class for function" } */
+ extern __thread void f2 ();	/* { dg-error "invalid storage class for function" } */
+ static __thread void f3 ();	/* { dg-error "invalid storage class for function" } */
+ __thread void f4 () { }		/* { dg-error "function definition declared `__thread'" } */
+ 
+ void bar(__thread int p1);	/* { dg-error "storage class specified for parameter" } */
Index: testsuite/gcc.dg/tls/init-1.c
===================================================================
RCS file: testsuite/gcc.dg/tls/init-1.c
diff -N testsuite/gcc.dg/tls/init-1.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- testsuite/gcc.dg/tls/init-1.c	21 May 2002 22:51:13 -0000
***************
*** 0 ****
--- 1,4 ----
+ /* Invalid initializations.  */
+ 
+ extern __thread int i;
+ int *p = &i;	/* { dg-error "initializer element is not constant" } */
Index: testsuite/gcc.dg/tls/tls.exp
===================================================================
RCS file: testsuite/gcc.dg/tls/tls.exp
diff -N testsuite/gcc.dg/tls/tls.exp
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- testsuite/gcc.dg/tls/tls.exp	21 May 2002 22:51:13 -0000
***************
*** 0 ****
--- 1,45 ----
+ #   Copyright (C) 2002 Free Software Foundation, Inc.
+ 
+ # This program is free software; you can redistribute it and/or modify
+ # it under the terms of the GNU General Public License as published by
+ # the Free Software Foundation; either version 2 of the License, or
+ # (at your option) any later version.
+ # 
+ # This program is distributed in the hope that it will be useful,
+ # but WITHOUT ANY WARRANTY; without even the implied warranty of
+ # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ # GNU General Public License for more details.
+ # 
+ # You should have received a copy of the GNU General Public License
+ # along with this program; if not, write to the Free Software
+ # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.  
+ 
+ # GCC testsuite that uses the `dg.exp' driver.
+ 
+ # Load support procs.
+ load_lib gcc-dg.exp
+ 
+ # Test for thread-local data supported by the platform.  If it
+ # isn't, everything will fail with the "not supported" message.
+ 
+ set comp_output [gcc_target_compile \
+ 		"$srcdir/$subdir/trivial.c" "trivial.S" assembly ""]
+ if { [string match "*not supported*" $comp_output] } {
+   return 0
+ }
+ 
+ # If a testcase doesn't have special options, use these.
+ global DEFAULT_CFLAGS
+ if ![info exists DEFAULT_CFLAGS] then {
+     set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+ }
+ 
+ # Initialize `dg'.
+ dg-init
+ 
+ # Main loop.
+ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
+         "" $DEFAULT_CFLAGS
+ 
+ # All done.
+ dg-finish
Index: testsuite/gcc.dg/tls/trivial.c
===================================================================
RCS file: testsuite/gcc.dg/tls/trivial.c
diff -N testsuite/gcc.dg/tls/trivial.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- testsuite/gcc.dg/tls/trivial.c	21 May 2002 22:51:13 -0000
***************
*** 0 ****
--- 1 ----
+ __thread int i;

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-21 18:54 thread-local storage: c front end and generic backend patch Richard Henderson
@ 2002-05-22  4:25 ` Joseph S. Myers
  2002-05-22 13:53 ` Mark Mitchell
  2002-07-11  9:00 ` David Edelsohn
  2 siblings, 0 replies; 875+ messages in thread
From: Joseph S. Myers @ 2002-05-22  4:25 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

On Tue, 21 May 2002, Richard Henderson wrote:

> Joseph, the extend.texi documentation has some user-level
> description of the extension.  I've tried to come up with
> a set of edits for C99, but I'm not sure where to put them,
> or exactly what form they should take.  Thoughts?

I think they should go in extend.texi just after the user-level
documentation.

-- 
Joseph S. Myers
jsm28@cam.ac.uk

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-21 18:54 thread-local storage: c front end and generic backend patch Richard Henderson
  2002-05-22  4:25 ` Joseph S. Myers
@ 2002-05-22 13:53 ` Mark Mitchell
  2002-05-22 14:22   ` Richard Henderson
  2002-07-11  9:00 ` David Edelsohn
  2 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2002-05-22 13:53 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches; +Cc: Joseph S. Myers

--On Tuesday, May 21, 2002 06:09:01 PM -0700 Richard Henderson 
<rth@redhat.com> wrote:

> The following adds support in the C front end for a new
> storage specifier keyword "__thread" that marks a variable
> to be allocated in storage private to every extant thread.

Is this really necessary?

The traditional approach (i.e., making structures containing data
and then stuffing them into TLS) is workable, if slightly
cumbersome.

It's great to have the proposed standardese.  That's very helpful.

There's no definition of "thread" in the standard, so saying that
these variables have the lifetime of "the thread" is probably
insufficient.  "Prior to thread startup" is also a bit unclear; we
don't know how to start a thread or what it means for it to be
started.  Does the initialization take place in the parent or the
child?  (That matters if, say, longjmps occur in the midst of the
initializers.)

You probably also need to say something about interactions between
__thread variables and initialization order.  For example:

  __thread int i = j;
  __thread int j = 7;

Does "i" get 0, or 7, or is this undefined behavior?

Similarly:

  extern __thread int i;

  int f() { return i + 2; }

  __thread int i = f();

Is:

  extern __thread int i;

  extern int i;

  int i;

legal?  (Can you drop the "__thread" after you've said it once?  What
if you don't say it the first time?)

In C++, what happens if an exception is thrown during construction of
a __thread variable?

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 13:53 ` Mark Mitchell
@ 2002-05-22 14:22   ` Richard Henderson
  2002-05-22 14:44     ` Gabriel Dos Reis
                       ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Richard Henderson @ 2002-05-22 14:22 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc-patches, Joseph S. Myers

On Wed, May 22, 2002 at 01:41:51PM -0700, Mark Mitchell wrote:
> Is this really necessary?
> 
> The traditional approach (i.e., making structures containing data
> and then stuffing them into TLS) is workable, if slightly
> cumbersome.

This mechanism, or rather the run-time mechanism behind this,
is significantly faster.  It also solves a number of problems
with dynamic loading and wanting to share TLS data across DSO
boundaries.

> There's no definition of "thread" in the standard, so saying that
> these variables have the lifetime of "the thread" is probably
> insufficient.  "Prior to thread startup" is also a bit unclear; we
> don't know how to start a thread or what it means for it to be
> started.

This is intentionally vague.  Why would we mention pthread_create
here when the run-time might actually use some other thread library?

> Does the initialization take place in the parent or the
> child?  (That matters if, say, longjmps occur in the midst of the
> initializers.)

There are no run-time initializers for TLS data.  They are
specifically disallowed.

The run-time mechanism here is that TLS data is initialized via
block-copy from a PT_TLS ELF program segment.  So we can't have
constructors of any form.
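
A minimal sketch of what that means for user code, assuming a POSIX
threads run-time (the mechanism itself does not require pthreads) and
using hypothetical names (counter, read_counter and the two helpers):
every new thread starts from the constant initial image, regardless of
what other threads have done to their own copies.

```c
#include <pthread.h>
#include <stddef.h>

/* Constant initializer: allowed, it becomes part of the TLS image.
   Something like "__thread int counter = f();" would be rejected,
   because it needs a run-time initializer.  */
static __thread int counter = 7;

/* Thread body: report the value this thread starts with, then
   change this thread's private copy only.  */
static void *read_counter(void *arg)
{
    int *out = arg;
    *out = counter;
    counter += 100;
    return NULL;
}

/* Modify the caller's copy, then ask a fresh thread what it sees.
   Returns -1 if the thread could not be created.  */
int value_seen_by_new_thread(void)
{
    int seen = -1;
    pthread_t t;

    counter = 42;
    if (pthread_create(&t, NULL, read_counter, &seen) != 0)
        return -1;
    pthread_join(t, NULL);
    return seen;
}

/* The caller's copy is unaffected by the other thread's update.  */
int value_in_calling_thread(void)
{
    return counter;
}
```

Here the new thread should see the initial value rather than the
caller's modified one, and the caller's copy should be untouched by
the other thread's increment.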

I should track down a copy of C++98 so that I can make similar
standardese changes there.  That would have made this clearer
for you from the start.

> Is:
> 
>   extern __thread int i;
> 
>   extern int i;
> 
>   int i;
> 
> legal?  (Can you drop the "__thread" after you've said it once?  What
> if you don't say it the first time?)

Ooo good point.  I'd tend to want to force you to have it in the 
earliest declaration, but allow it to be dropped subsequently.

The rationale being that while

	extern int i;
	int foo() { return i; }
	static int i;

works, a similar case with __thread does not.  Further, one would
like to be able to do

	extern __thread int errno;

rather than the current best-effort in glibc:

	#define errno (*__errno_location ())

But old programs have a tendency to redefine errno themselves
"just in case".

Alternately, most of these programs have been fixed over the last
few years because of the above define.  It might be sufficient to
have libc do

	extern __thread int errno;
	#define errno errno

since most of the fixes to the programs have been of the form

	#ifndef errno
	extern int errno;
	#endif

Then we can reasonably enforce the restriction that every declaration
must contain the __thread specifier.
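
A minimal sketch of why the self-referential define is enough, using a
hypothetical variable my_errno in place of errno so as not to collide
with the real libc header; LEGACY_DECL_SEEN and set_and_read are
likewise made up for illustration:

```c
/* What the library header would provide.  */
extern __thread int my_errno;
#define my_errno my_errno   /* expands to itself, but is now "defined" */

/* What a typical fixed-up old program does.  Because the macro is
   defined, the conflicting non-__thread declaration is skipped.  */
#ifndef my_errno
extern int my_errno;
#define LEGACY_DECL_SEEN 1
#else
#define LEGACY_DECL_SEEN 0
#endif

/* The single definition, carrying the __thread specifier.  */
__thread int my_errno;

/* Ordinary uses of the name are unaffected by the macro.  */
int set_and_read(int v)
{
    my_errno = v;
    return my_errno;
}
```

The preprocessor never re-expands a macro inside its own expansion, so
the define costs nothing at the point of use; it only makes the
#ifndef guard in the old program take the safe branch.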

Thoughts?

r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 14:22   ` Richard Henderson
@ 2002-05-22 14:44     ` Gabriel Dos Reis
  2002-05-22 14:55       ` Joseph S. Myers
  2002-05-22 14:52     ` Mark Mitchell
  2002-05-22 16:46     ` Alexandre Oliva
  2 siblings, 1 reply; 875+ messages in thread
From: Gabriel Dos Reis @ 2002-05-22 14:44 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Mark Mitchell, gcc-patches, Joseph S. Myers

Richard Henderson <rth@redhat.com> writes:

[...]

| > There's no definition of "thread" in the standard, so saying that
| > these variables have the lifetime of "the thread" is probably
| > insufficient.  "Prior to thread startup" is also a bit unclear; we
| > don't know how to start a thread or what it means for it to be
| > started.
| 
| This is intentionally vague.  Why would we mention pthread_create
| here when the run-time might actually use some other thread library?

Being more specific about the semantics is really needed for proving
and/or maintaining the invariant that a change to the compiler (be it an
advanced optimization or implementation of standard features) does not
break the semantics of that extension.  In short, we need to know what
it is supposed to mean in order to prove that some change is correct.

I think the issue is important enough that we shouldn't gloss over it.

-- Gaby

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 14:22   ` Richard Henderson
  2002-05-22 14:44     ` Gabriel Dos Reis
@ 2002-05-22 14:52     ` Mark Mitchell
  2002-05-22 15:01       ` Richard Henderson
  2002-05-22 16:46     ` Alexandre Oliva
  2 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2002-05-22 14:52 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches, Joseph S. Myers

--On Wednesday, May 22, 2002 02:17:55 PM -0700 Richard Henderson 
<rth@redhat.com> wrote:

> On Wed, May 22, 2002 at 01:41:51PM -0700, Mark Mitchell wrote:
>> Is this really necessary?
>>
>> The traditional approach (i.e., making structures containing data
>> and then stuffing them into TLS) is workable, if slightly
>> cumbersome.
>
> This mechanism, or rather the run-time mechanism behind this,
> is significantly faster.  It also solves a number of problems
> with dynamic loading and wanting to share TLS data across DSO
> boundaries.

I see.

Well, heck, a new extension proposal gives me a chance to be an
annoying language lawyer. :-)

>> There's no definition of "thread" in the standard, so saying that
>> these variables have the lifetime of "the thread" is probably
>> insufficient.  "Prior to thread startup" is also a bit unclear; we
>> don't know how to start a thread or what it means for it to be
>> started.
>
> This is intentionally vague.  Why would we mention pthread_create
> here when the run-time might actually use some other thread library?

Agreed; still if we're going to talk about threads we have to have some
model of what a thread is.  Even if we say "you create threads by some
implementation-defined means", we have to say that a thread is a flow
of control, that there are no guarantees about the values of
global variables that are not marked with __thread when accessed
simultaneously from multiple threads, etc.

In other words, by introducing __thread we've promoted threads to a
linguistic concept.  When it was just "pthread_create", well, that's
just some function with whatever behavior it needs to have, it's not
our problem from a language point of view.

Now, all of a sudden, it is.

>> Does the initialization take place in the parent or the
>> child?  (That matters if, say, longjmps occur in the midst of the
>> initializers.)
>
> There are no run-time initializers for TLS data.  They are
> specifically disallowed.

I missed that point.  That's great; it gets rid of one whole
set of issues.

(I assume that, in C++:

  __thread const int i = 7;
  __thread int j = i;

is OK.  In C++, this would not be a dynamic initialization, if there
were no __thread modifier, and it seems like you could do this in
the implementation you're envisioning as well.)

> Ooo good point.  I'd tend to want to force you to have it in the
> earliest declaration, but allow it to be dropped subsequently.

That seems best to me too.

We also need to note whether any GNU variable attributes don't work
with __thread variables.  I don't see any reason any of them shouldn't
work, except maybe "section", if there are any restrictions on what
sections get into the ELF segment in question.  (For example, if you
put a __thread variable in the ordinary .data section, is that going
to work?)  We can just say you get undefined behavior if you use the
section attribute with __thread without noting which sections might
happen to work.  Same might go for "shared" on Windows.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 14:44     ` Gabriel Dos Reis
@ 2002-05-22 14:55       ` Joseph S. Myers
  0 siblings, 0 replies; 875+ messages in thread
From: Joseph S. Myers @ 2002-05-22 14:55 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Richard Henderson, Mark Mitchell, gcc-patches

On 22 May 2002, Gabriel Dos Reis wrote:

> | This is intentionally vague.  Why would we mention pthread_create
> | here when the run-time might actually use some other thread library?
> 
> Being more specific about the semantics is really needed for proving
> and/or maintaining the invariant that a change to the compiler (be it an
> advanced optimization or implementation of standard features) does not
> break the semantics of that extension.  In short, we need to know what
> it is supposed to mean in order to prove that some change is correct.
> 
> I think the issue is important enough that we shouldn't gloss over it.

There is the underlying problem here that POSIX needs to come with a set
of edits for the C standard to explain how it profiles that standard to
provide for threads (and for signal handling that isn't always undefined
behavior).  However, it doesn't.  In turn there are the problems of lack
of precise definition of object semantics in the C standard, which should
be addressed for it to serve properly as a base standard being profiled to
add threads, concurrency, etc..

-- 
Joseph S. Myers
jsm28@cam.ac.uk

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 14:52     ` Mark Mitchell
@ 2002-05-22 15:01       ` Richard Henderson
  2002-05-22 15:13         ` Jakub Jelinek
  2002-05-22 15:39         ` Mark Mitchell
  0 siblings, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2002-05-22 15:01 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc-patches, Joseph S. Myers

On Wed, May 22, 2002 at 02:35:24PM -0700, Mark Mitchell wrote:
> Agreed; still if we're going to talk about threads we have to have some
> model of what a thread is.  Even if we say "you create threads by some
> implementation-defined means", we have to say that a thread is a flow
> of control, that there are no guarantees about the values of
> global variables that are not marked with __thread when accessed
> simultaneously from multiple threads, etc.

Ok, I'll see what I can come up with.

> (I assume that, in C++:
> 
>   __thread const int i = 7;
>   __thread int j = i;
> 
> is OK.  In C++, this would not be a dynamic initialization, if there
> were no __thread modifier, and it seems like you could do this in
> the implementation you're envisioning as well.)

*shrug* So long as the compiler does the constant propagation,
then sure this should be ok.

> I don't see any reason any of them shouldn't
> work, except maybe "section", if there are any restrictions on what
> sections get into the ELF segment in question.

There aren't.  There's a new section flag SHF_TLS that can be
applied to any random section.

> (For example, if you put a __thread variable in the ordinary .data
> section, is that going to work?)

That should fail already because we already disallow variables
to be placed in sections for which there is a flag mismatch.
Thus you can't do

	const int x __attribute__((section("foo"))) = 1;
	int y __attribute__((section("foo"))) = 1;

because X wants FOO to have SHF_WRITE clear, and Y wants
SHF_WRITE set.

I'll add a test for it though, since it's quite a bit more
important here than with read-only vs read-write -- one could
have also taken the position that FOO receives the loosest
read/write/execute bits required for its contents.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 15:01       ` Richard Henderson
@ 2002-05-22 15:13         ` Jakub Jelinek
  2002-05-22 15:36           ` Richard Henderson
  2002-05-29 23:48           ` Fergus Henderson
  2002-05-22 15:39         ` Mark Mitchell
  1 sibling, 2 replies; 875+ messages in thread
From: Jakub Jelinek @ 2002-05-22 15:13 UTC (permalink / raw)
  To: Richard Henderson, Mark Mitchell, gcc-patches, Joseph S. Myers

On Wed, May 22, 2002 at 02:55:18PM -0700, Richard Henderson wrote:
> > (I assume that, in C++:
> > 
> >   __thread const int i = 7;
> >   __thread int j = i;
> > 
> > is OK.  In C++, this would not be a dynamic initialization, if there
> > were no __thread modifier, and it seems like you could do this in
> > the implementation you're envisioning as well.)
> 
> *shrug* So long as the compiler does the constant propagation,
> then sure this should be ok.

Shouldn't __thread const int i = 7; be either forbidden, or have __thread
ignored for it (in order of preference)?
It makes no sense to use __thread here (if the value is constant,
it has the same value for all the threads), there is no .trodata section
and we'd have problems with section flags...

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 15:13         ` Jakub Jelinek
@ 2002-05-22 15:36           ` Richard Henderson
  2002-05-22 15:42             ` Mark Mitchell
  2002-05-29 23:48           ` Fergus Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-05-22 15:36 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Mark Mitchell, gcc-patches, Joseph S. Myers

On Wed, May 22, 2002 at 06:00:26PM -0400, Jakub Jelinek wrote:
> Shouldn't __thread const int i = 7; be either forbidden, or have __thread
> ignored for it (in order of preference)?

It is questionable, sure, but illegal?  I'll think about it.
Perhaps a warning only...

> It makes no sense to use __thread here (if the value is constant,
> it has the same value for all the threads), there is no .trodata section
> and we'd have problems with section flags...

Currently categorize_decl_for_section discards the read-only
bit for tls variables; default_section_type_flags could do the
same thing and solve the problem.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 15:01       ` Richard Henderson
  2002-05-22 15:13         ` Jakub Jelinek
@ 2002-05-22 15:39         ` Mark Mitchell
  2002-05-22 16:30           ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2002-05-22 15:39 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches, Joseph S. Myers



--On Wednesday, May 22, 2002 02:55:18 PM -0700 Richard Henderson 
<rth@redhat.com> wrote:

> On Wed, May 22, 2002 at 02:35:24PM -0700, Mark Mitchell wrote:
>> Agreed; still if we're going to talk about threads we have to have some
>> model of what a thread is.  Even if we say "you create threads by some
>> implementation-defined means", we have to say that a thread is a flow
>> of control, that there are no guarantees about the values of
>> global variables that are not marked with __thread when accessed
>> simultaneously from multiple threads, etc.
>
> Ok, I'll see what I can come up with.

Great.  As Joseph says, things are a bit of a mess in this regard.

It's probably not fair to ask you to solve all the problems...

>> (I assume that, in C++:
>>
>>   __thread const int i = 7;
>>   __thread int j = i;
>>
>> is OK.  In C++, this would not be a dynamic initialization, if there
>> were no __thread modifier, and it seems like you could do this in
>> the implementation you're envisioning as well.)
>
> *shrug* So long as the compiler does the constant propagation,
> then sure this should be ok.

It should... I guess the language issue is that in C++ "const int i"
is an integral constant expression -- and it still is even with __thread.
I don't think we need to say anything about that; I'm just musing.

OK, I'm done picking it apart... :-)

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 15:36           ` Richard Henderson
@ 2002-05-22 15:42             ` Mark Mitchell
  2002-05-22 15:56               ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2002-05-22 15:42 UTC (permalink / raw)
  To: Richard Henderson, Jakub Jelinek; +Cc: gcc-patches, Joseph S. Myers



--On Wednesday, May 22, 2002 03:13:07 PM -0700 Richard Henderson 
<rth@redhat.com> wrote:

> On Wed, May 22, 2002 at 06:00:26PM -0400, Jakub Jelinek wrote:
>> Shouldn't __thread const int i = 7; be either forbidden, or have __thread
>> ignored for it (in order of preference)?
>
> It is questionable, sure, but illegal?  I'll think about it.
> Perhaps a warning only...

I don't see any reason for it to be illegal.  It may be wasteful, but
that's all.

Sometimes you don't even know what type you've got; especially in
templates:

  template <class T>
  struct S {
    __thread static T t;
  };

  f(S<const int>::t);

That one happens to be "const", but only because the template got
instantiated that way.

I don't think I'd even warn.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 15:42             ` Mark Mitchell
@ 2002-05-22 15:56               ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-05-22 15:56 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Jakub Jelinek, gcc-patches, Joseph S. Myers

On Wed, May 22, 2002 at 03:35:15PM -0700, Mark Mitchell wrote:
>   f(S<const int>::t);
> 
> I don't think I'd even warn.

Right.  New test case added.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 15:39         ` Mark Mitchell
@ 2002-05-22 16:30           ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-05-22 16:30 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc-patches, Joseph S. Myers

On Wed, May 22, 2002 at 03:32:49PM -0700, Mark Mitchell wrote:
> > *shrug* So long as the compiler does the constant propagation,
> > then sure this should be ok.
> 
> It should...

Test case added.  It does work as expected.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 14:22   ` Richard Henderson
  2002-05-22 14:44     ` Gabriel Dos Reis
  2002-05-22 14:52     ` Mark Mitchell
@ 2002-05-22 16:46     ` Alexandre Oliva
  2002-05-22 16:53       ` Richard Henderson
  2 siblings, 1 reply; 875+ messages in thread
From: Alexandre Oliva @ 2002-05-22 16:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Mark Mitchell, gcc-patches, Joseph S. Myers

On May 22, 2002, Richard Henderson <rth@redhat.com> wrote:

> The run-time mechanism here is that TLS data is initialized via
> block-copy from a PT_TLS ELF program segment.  So we can't have
> constructors of any form.

Sounds like you're describing POD types.

-- 
Alexandre Oliva   Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer                  aoliva@{cygnus.com, redhat.com}
CS PhD student at IC-Unicamp        oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist                Professional serial bug killer

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 16:46     ` Alexandre Oliva
@ 2002-05-22 16:53       ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-05-22 16:53 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Mark Mitchell, gcc-patches, Joseph S. Myers

On Wed, May 22, 2002 at 08:44:11PM -0300, Alexandre Oliva wrote:
> > The run-time mechanism here is that TLS data is initialized via
> > block-copy from a PT_TLS ELF program segment.  So we can't have
> > constructors of any form.
> 
> Sounds like you're describing POD types.

Well, more than that, as Jason noted with "int i = f();".


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* rs6000.c:output_toc
@ 2002-05-23  7:02             ` Alan Modra
  2002-05-23  9:21               ` rs6000.c:output_toc David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-05-23  7:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Fixes assembly output like
  .tc ID_ffffffffcccccccc_cccccccd[TC],0xffffffffcccccccccccccccd
and the resultant assembler complaint
  Warning: bignum truncated to 8 bytes
when compiling on a host with 64 bit longs.
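
A minimal standalone sketch of the problem and the fix, not the actual
rs6000.c code: format_doubleword is a hypothetical helper, and it casts
to unsigned long before masking, which the patch itself does not do.
When a 32-bit word is held sign-extended in a 64-bit long, "%lx"
prints up to 16 digits for it; masking to the low 32 bits keeps each
half at no more than 8 digits.

```c
#include <stddef.h>
#include <stdio.h>

/* Format two 32-bit halves, each stored in a host long, as one
   64-bit hex constant.  The masks discard any sign-extension bits
   that a 64-bit host long may carry above bit 31.  */
void format_doubleword(char *buf, size_t size, long high, long low)
{
    unsigned long h = (unsigned long) high & 0xffffffffUL;
    unsigned long l = (unsigned long) low & 0xffffffffUL;

    snprintf(buf, size, "0x%lx%08lx", h, l);
}
```

With the masks in place the result is the same whether the host long
is 32 or 64 bits wide, which is the property the assembler needs.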

gcc/ChangeLog
	* config/rs6000/rs6000.c (output_toc): Mask longs to 32 bits.

OK for mainline and branch?

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.324
diff -u -p -r1.324 rs6000.c
--- gcc/config/rs6000/rs6000.c	23 May 2002 02:26:45 -0000	1.324
+++ gcc/config/rs6000/rs6000.c	23 May 2002 13:19:30 -0000
@@ -10256,8 +10256,10 @@ output_toc (file, x, labelno, mode)
 	  if (TARGET_MINIMAL_TOC)
 	    fputs (DOUBLE_INT_ASM_OP, file);
 	  else
-	    fprintf (file, "\t.tc FD_%lx_%lx[TC],", k[0], k[1]);
-	  fprintf (file, "0x%lx%08lx\n", k[0], k[1]);
+	    fprintf (file, "\t.tc FD_%lx_%lx[TC],",
+		     k[0] & 0xffffffff, k[1] & 0xffffffff);
+	  fprintf (file, "0x%lx%08lx\n",
+		   k[0] & 0xffffffff, k[1] & 0xffffffff);
 	  return;
 	}
       else
@@ -10265,8 +10267,10 @@ output_toc (file, x, labelno, mode)
 	  if (TARGET_MINIMAL_TOC)
 	    fputs ("\t.long ", file);
 	  else
-	    fprintf (file, "\t.tc FD_%lx_%lx[TC],", k[0], k[1]);
-	  fprintf (file, "0x%lx,0x%lx\n", k[0], k[1]);
+	    fprintf (file, "\t.tc FD_%lx_%lx[TC],",
+		     k[0] & 0xffffffff, k[1] & 0xffffffff);
+	  fprintf (file, "0x%lx,0x%lx\n",
+		   k[0] & 0xffffffff, k[1] & 0xffffffff);
 	  return;
 	}
     }
@@ -10283,8 +10287,8 @@ output_toc (file, x, labelno, mode)
 	  if (TARGET_MINIMAL_TOC)
 	    fputs (DOUBLE_INT_ASM_OP, file);
 	  else
-	    fprintf (file, "\t.tc FS_%lx[TC],", l);
-	  fprintf (file, "0x%lx00000000\n", l);
+	    fprintf (file, "\t.tc FS_%lx[TC],", l & 0xffffffff);
+	  fprintf (file, "0x%lx00000000\n", l & 0xffffffff);
 	  return;
 	}
       else
@@ -10292,8 +10296,8 @@ output_toc (file, x, labelno, mode)
 	  if (TARGET_MINIMAL_TOC)
 	    fputs ("\t.long ", file);
 	  else
-	    fprintf (file, "\t.tc FS_%lx[TC],", l);
-	  fprintf (file, "0x%lx\n", l);
+	    fprintf (file, "\t.tc FS_%lx[TC],", l & 0xffffffff);
+	  fprintf (file, "0x%lx\n", l & 0xffffffff);
 	  return;
 	}
     }
@@ -10343,8 +10347,10 @@ output_toc (file, x, labelno, mode)
 	  if (TARGET_MINIMAL_TOC)
 	    fputs (DOUBLE_INT_ASM_OP, file);
 	  else
-	    fprintf (file, "\t.tc ID_%lx_%lx[TC],", (long) high, (long) low);
-	  fprintf (file, "0x%lx%08lx\n", (long) high, (long) low);
+	    fprintf (file, "\t.tc ID_%lx_%lx[TC],",
+		     (long) high & 0xffffffff, (long) low & 0xffffffff);
+	  fprintf (file, "0x%lx%08lx\n",
+		   (long) high & 0xffffffff, (long) low & 0xffffffff);
 	  return;
 	}
       else
@@ -10355,16 +10361,17 @@ output_toc (file, x, labelno, mode)
 		fputs ("\t.long ", file);
 	      else
 		fprintf (file, "\t.tc ID_%lx_%lx[TC],",
-			 (long) high, (long) low);
-	      fprintf (file, "0x%lx,0x%lx\n", (long) high, (long) low);
+			 (long) high & 0xffffffff, (long) low & 0xffffffff);
+	      fprintf (file, "0x%lx,0x%lx\n",
+		       (long) high & 0xffffffff, (long) low & 0xffffffff);
 	    }
 	  else
 	    {
 	      if (TARGET_MINIMAL_TOC)
 		fputs ("\t.long ", file);
 	      else
-		fprintf (file, "\t.tc IS_%lx[TC],", (long) low);
-	      fprintf (file, "0x%lx\n", (long) low);
+		fprintf (file, "\t.tc IS_%lx[TC],", (long) low & 0xffffffff);
+	      fprintf (file, "0x%lx\n", (long) low & 0xffffffff);
 	    }
 	  return;
 	}

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: rs6000.c:output_toc
  2002-05-23  7:02             ` rs6000.c:output_toc Alan Modra
@ 2002-05-23  9:21               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-05-23  9:21 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/rs6000.c (output_toc): Mask longs to 32 bits.

Yes, this is fine.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-22 15:13         ` Jakub Jelinek
  2002-05-22 15:36           ` Richard Henderson
@ 2002-05-29 23:48           ` Fergus Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: Fergus Henderson @ 2002-05-29 23:48 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On 22-May-2002, Jakub Jelinek <jakub@redhat.com> wrote:
> Shouldn't __thread const int i = 7; be either forbidden, or have __thread
> ignored for it (in order of preference)?

Ignoring __thread here would give the wrong semantics --
the constant should have different addresses for different threads.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* PowerPC cleanup and Power4
@ 2002-06-09  8:10 David Edelsohn
  2002-06-09  8:24 ` Neil Booth
  2002-06-09 10:05 ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-06-09  8:10 UTC (permalink / raw)
  To: gcc-patches

	The following patch cleans up some cruft and formatting in the
rs6000 port and adds preliminary, basic Power4 support.

	While exploring the scheduling, I noticed that the cr_logical
attribute had not been applied to mfcr/mtcrf instructions.  This bumps
performance on processors with multiple SCIUs where cr_logical was being
scheduled inefficiently.

David

	* config/rs6000/{aix43.h,aix51.h} (ASM_CPU_SPEC): Add power3
	synonym for 630.  Add power4.  Remove embedded processors.  Use -m604
	assembler option.
	(CPP_CPU_SPEC): Add power3 and power4.
	(PROCESSOR_DEFAULT): Change to 604e.
	* config/rs6000/rs6000.h (ASM_CPU_SPEC): Similar additions.
	(CPP_CPU_SPEC): Similar additions.
	(enum processor_type): Add POWER4.
	(RTX_COSTS): Add POWER4.
	* config/rs6000/linux64.h (PROCESSOR_DEFAULT): Define.
	* config/rs6000/rs6000.c (rs6000_override_options): Add power4.
	(rs6000_adjust_cost): Add 603, 604, 604e, 620, 630, Power4 to
	branch adjustment.
	(rs6000_issue_rate): Add Power4.
	* config/rs6000/rs6000.md (cpu attr): Add power4.
	(iu compare): Remove 604, 604e, 620, 630.
	Add basic Power4 scheduling information.
	(mfcr/mtcrf): Change type attribute to cr_logical.

Index: aix43.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/aix43.h,v
retrieving revision 1.23
diff -c -p -r1.23 aix43.h
*** aix43.h	23 May 2002 02:26:45 -0000	1.23
--- aix43.h	9 Jun 2002 14:40:29 -0000
*************** do {									\
*** 75,80 ****
--- 75,82 ----
  %{mcpu=common: -mcom} \
  %{mcpu=power: -mpwr} \
  %{mcpu=power2: -mpwr2} \
+ %{mcpu=power3: -m604} \
+ %{mcpu=power4: -m604} \
  %{mcpu=powerpc: -mppc} \
  %{mcpu=rios: -mpwr} \
  %{mcpu=rios1: -mpwr} \
*************** do {									\
*** 82,89 ****
  %{mcpu=rsc: -mpwr} \
  %{mcpu=rsc1: -mpwr} \
  %{mcpu=rs64a: -mppc} \
- %{mcpu=403: -mppc} \
- %{mcpu=505: -mppc} \
  %{mcpu=601: -m601} \
  %{mcpu=602: -mppc} \
  %{mcpu=603: -m603} \
--- 84,89 ----
*************** do {									\
*** 91,99 ****
  %{mcpu=604: -m604} \
  %{mcpu=604e: -m604} \
  %{mcpu=620: -mppc} \
! %{mcpu=630: -mppc} \
! %{mcpu=821: -mppc} \
! %{mcpu=860: -mppc}"
  
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
--- 91,97 ----
  %{mcpu=604: -m604} \
  %{mcpu=604e: -m604} \
  %{mcpu=620: -mppc} \
! %{mcpu=630: -m604}"
  
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
*************** do {									\
*** 135,140 ****
--- 133,140 ----
  %{mcpu=common: -D_ARCH_COM} \
  %{mcpu=power: -D_ARCH_PWR} \
  %{mcpu=power2: -D_ARCH_PWR2} \
+ %{mcpu=power3: -D_ARCH_PPC} \
+ %{mcpu=power4: -D_ARCH_PPC} \
  %{mcpu=powerpc: -D_ARCH_PPC} \
  %{mcpu=rios: -D_ARCH_PWR} \
  %{mcpu=rios1: -D_ARCH_PWR} \
*************** do {									\
*** 142,158 ****
  %{mcpu=rsc: -D_ARCH_PWR} \
  %{mcpu=rsc1: -D_ARCH_PWR} \
  %{mcpu=rs64a: -D_ARCH_PPC} \
- %{mcpu=403: -D_ARCH_PPC} \
- %{mcpu=505: -D_ARCH_PPC} \
  %{mcpu=601: -D_ARCH_PPC -D_ARCH_PWR} \
  %{mcpu=602: -D_ARCH_PPC} \
  %{mcpu=603: -D_ARCH_PPC} \
  %{mcpu=603e: -D_ARCH_PPC} \
  %{mcpu=604: -D_ARCH_PPC} \
  %{mcpu=620: -D_ARCH_PPC} \
! %{mcpu=630: -D_ARCH_PPC} \
! %{mcpu=821: -D_ARCH_PPC} \
! %{mcpu=860: -D_ARCH_PPC}"
  
  #undef	CPP_DEFAULT_SPEC
  #define CPP_DEFAULT_SPEC "-D_ARCH_COM"
--- 142,154 ----
  %{mcpu=rsc: -D_ARCH_PWR} \
  %{mcpu=rsc1: -D_ARCH_PWR} \
  %{mcpu=rs64a: -D_ARCH_PPC} \
  %{mcpu=601: -D_ARCH_PPC -D_ARCH_PWR} \
  %{mcpu=602: -D_ARCH_PPC} \
  %{mcpu=603: -D_ARCH_PPC} \
  %{mcpu=603e: -D_ARCH_PPC} \
  %{mcpu=604: -D_ARCH_PPC} \
  %{mcpu=620: -D_ARCH_PPC} \
! %{mcpu=630: -D_ARCH_PPC}"
  
  #undef	CPP_DEFAULT_SPEC
  #define CPP_DEFAULT_SPEC "-D_ARCH_COM"
*************** do {									\
*** 161,167 ****
  #define TARGET_DEFAULT MASK_NEW_MNEMONICS
  
  #undef PROCESSOR_DEFAULT
! #define PROCESSOR_DEFAULT PROCESSOR_PPC604
  
  /* Define this macro as a C expression for the initializer of an
     array of string to tell the driver program which options are
--- 157,163 ----
  #define TARGET_DEFAULT MASK_NEW_MNEMONICS
  
  #undef PROCESSOR_DEFAULT
! #define PROCESSOR_DEFAULT PROCESSOR_PPC604e
  
  /* Define this macro as a C expression for the initializer of an
     array of string to tell the driver program which options are
Index: aix51.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/aix51.h,v
retrieving revision 1.12
diff -c -p -r1.12 aix51.h
*** aix51.h	23 May 2002 02:26:45 -0000	1.12
--- aix51.h	9 Jun 2002 14:40:29 -0000
*************** do {									\
*** 75,80 ****
--- 75,82 ----
  %{mcpu=common: -mcom} \
  %{mcpu=power: -mpwr} \
  %{mcpu=power2: -mpwr2} \
+ %{mcpu=power3: -m604} \
+ %{mcpu=power4: -m604} \
  %{mcpu=powerpc: -mppc} \
  %{mcpu=rios: -mpwr} \
  %{mcpu=rios1: -mpwr} \
*************** do {									\
*** 82,89 ****
  %{mcpu=rsc: -mpwr} \
  %{mcpu=rsc1: -mpwr} \
  %{mcpu=rs64a: -mppc} \
- %{mcpu=403: -mppc} \
- %{mcpu=505: -mppc} \
  %{mcpu=601: -m601} \
  %{mcpu=602: -mppc} \
  %{mcpu=603: -m603} \
--- 84,89 ----
*************** do {									\
*** 91,99 ****
  %{mcpu=604: -m604} \
  %{mcpu=604e: -m604} \
  %{mcpu=620: -mppc} \
! %{mcpu=630: -mppc} \
! %{mcpu=821: -mppc} \
! %{mcpu=860: -mppc}"
  
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
--- 91,97 ----
  %{mcpu=604: -m604} \
  %{mcpu=604e: -m604} \
  %{mcpu=620: -mppc} \
! %{mcpu=630: -m604}"
  
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
*************** do {									\
*** 135,140 ****
--- 133,140 ----
  %{mcpu=common: -D_ARCH_COM} \
  %{mcpu=power: -D_ARCH_PWR} \
  %{mcpu=power2: -D_ARCH_PWR2} \
+ %{mcpu=power3: -D_ARCH_PPC} \
+ %{mcpu=power4: -D_ARCH_PPC} \
  %{mcpu=powerpc: -D_ARCH_PPC} \
  %{mcpu=rios: -D_ARCH_PWR} \
  %{mcpu=rios1: -D_ARCH_PWR} \
*************** do {									\
*** 142,158 ****
  %{mcpu=rsc: -D_ARCH_PWR} \
  %{mcpu=rsc1: -D_ARCH_PWR} \
  %{mcpu=rs64a: -D_ARCH_PPC} \
- %{mcpu=403: -D_ARCH_PPC} \
- %{mcpu=505: -D_ARCH_PPC} \
  %{mcpu=601: -D_ARCH_PPC -D_ARCH_PWR} \
  %{mcpu=602: -D_ARCH_PPC} \
  %{mcpu=603: -D_ARCH_PPC} \
  %{mcpu=603e: -D_ARCH_PPC} \
  %{mcpu=604: -D_ARCH_PPC} \
  %{mcpu=620: -D_ARCH_PPC} \
! %{mcpu=630: -D_ARCH_PPC} \
! %{mcpu=821: -D_ARCH_PPC} \
! %{mcpu=860: -D_ARCH_PPC}"
  
  #undef	CPP_DEFAULT_SPEC
  #define CPP_DEFAULT_SPEC "-D_ARCH_COM"
--- 142,154 ----
  %{mcpu=rsc: -D_ARCH_PWR} \
  %{mcpu=rsc1: -D_ARCH_PWR} \
  %{mcpu=rs64a: -D_ARCH_PPC} \
  %{mcpu=601: -D_ARCH_PPC -D_ARCH_PWR} \
  %{mcpu=602: -D_ARCH_PPC} \
  %{mcpu=603: -D_ARCH_PPC} \
  %{mcpu=603e: -D_ARCH_PPC} \
  %{mcpu=604: -D_ARCH_PPC} \
  %{mcpu=620: -D_ARCH_PPC} \
! %{mcpu=630: -D_ARCH_PPC}"
  
  #undef	CPP_DEFAULT_SPEC
  #define CPP_DEFAULT_SPEC "-D_ARCH_COM"
*************** do {									\
*** 161,167 ****
  #define TARGET_DEFAULT MASK_NEW_MNEMONICS
  
  #undef PROCESSOR_DEFAULT
! #define PROCESSOR_DEFAULT PROCESSOR_PPC604
  
  /* Define this macro as a C expression for the initializer of an
     array of string to tell the driver program which options are
--- 157,163 ----
  #define TARGET_DEFAULT MASK_NEW_MNEMONICS
  
  #undef PROCESSOR_DEFAULT
! #define PROCESSOR_DEFAULT PROCESSOR_PPC604e
  
  /* Define this macro as a C expression for the initializer of an
     array of string to tell the driver program which options are
Index: linux64.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/linux64.h,v
retrieving revision 1.15
diff -c -p -r1.15 linux64.h
*** linux64.h	5 Jun 2002 03:56:27 -0000	1.15
--- linux64.h	9 Jun 2002 14:40:29 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 31,36 ****
--- 31,39 ----
  #define TARGET_DEFAULT \
    (MASK_POWERPC | MASK_POWERPC64 | MASK_64BIT | MASK_NEW_MNEMONICS)
  
+ #undef PROCESSOR_DEFAULT
+ #define PROCESSOR_DEFAULT PROCESSOR_PPC630
+ 
  #undef  CPP_DEFAULT_SPEC
  #define CPP_DEFAULT_SPEC "-D_ARCH_PPC64"
  
Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.329
diff -c -p -r1.329 rs6000.c
*** rs6000.c	4 Jun 2002 07:09:27 -0000	1.329
--- rs6000.c	9 Jun 2002 14:40:30 -0000
*************** rs6000_override_options (default_cpu)
*** 353,358 ****
--- 353,361 ----
  	 {"power3", PROCESSOR_PPC630,
  	    MASK_POWERPC | MASK_PPC_GFXOPT | MASK_NEW_MNEMONICS,
  	    POWER_MASKS | MASK_PPC_GPOPT},
+ 	 {"power4", PROCESSOR_POWER4,
+ 	    MASK_POWERPC | MASK_PPC_GFXOPT | MASK_NEW_MNEMONICS,
+ 	    POWER_MASKS | MASK_PPC_GPOPT},
  	 {"powerpc", PROCESSOR_POWERPC,
  	    MASK_POWERPC | MASK_NEW_MNEMONICS,
  	    POWER_MASKS | POWERPC_OPT_MASKS | MASK_POWERPC64},
*************** rs6000_adjust_cost (insn, link, dep_insn
*** 10696,10713 ****
        switch (get_attr_type (insn))
  	{
  	case TYPE_JMPREG:
!           /* Tell the first scheduling pass about the latency between
  	     a mtctr and bctr (and mtlr and br/blr).  The first
  	     scheduling pass will not know about this latency since
  	     the mtctr instruction, which has the latency associated
  	     to it, will be generated by reload.  */
!           return TARGET_POWER ? 5 : 4;
  	case TYPE_BRANCH:
  	  /* Leave some extra cycles between a compare and its
  	     dependent branch, to inhibit expensive mispredicts.  */
! 	  if ((rs6000_cpu_attr == CPU_PPC750
!                || rs6000_cpu_attr == CPU_PPC7400
!                || rs6000_cpu_attr == CPU_PPC7450)
  	      && recog_memoized (dep_insn)
  	      && (INSN_CODE (dep_insn) >= 0)
  	      && (get_attr_type (dep_insn) == TYPE_COMPARE
--- 10699,10722 ----
        switch (get_attr_type (insn))
  	{
  	case TYPE_JMPREG:
! 	  /* Tell the first scheduling pass about the latency between
  	     a mtctr and bctr (and mtlr and br/blr).  The first
  	     scheduling pass will not know about this latency since
  	     the mtctr instruction, which has the latency associated
  	     to it, will be generated by reload.  */
! 	  return TARGET_POWER ? 5 : 4;
  	case TYPE_BRANCH:
  	  /* Leave some extra cycles between a compare and its
  	     dependent branch, to inhibit expensive mispredicts.  */
! 	  if ((rs6000_cpu_attr == CPU_PPC603
! 	       || rs6000_cpu_attr == CPU_PPC604
! 	       || rs6000_cpu_attr == CPU_PPC604E
! 	       || rs6000_cpu_attr == CPU_PPC620
! 	       || rs6000_cpu_attr == CPU_PPC630
! 	       || rs6000_cpu_attr == CPU_PPC750
! 	       || rs6000_cpu_attr == CPU_PPC7400
! 	       || rs6000_cpu_attr == CPU_PPC7450
! 	       || rs6000_cpu_attr == CPU_POWER4)
  	      && recog_memoized (dep_insn)
  	      && (INSN_CODE (dep_insn) >= 0)
  	      && (get_attr_type (dep_insn) == TYPE_COMPARE
*************** rs6000_issue_rate ()
*** 10788,10793 ****
--- 10797,10803 ----
    case CPU_PPC604E:
    case CPU_PPC620:
    case CPU_PPC630:
+   case CPU_POWER4:
      return 4;
    default:
      return 1;
Index: rs6000.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.206
diff -c -p -r1.206 rs6000.h
*** rs6000.h	4 Jun 2002 07:09:32 -0000	1.206
--- rs6000.h	9 Jun 2002 14:40:30 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 58,63 ****
--- 58,65 ----
  %{mcpu=common: -D_ARCH_COM} \
  %{mcpu=power: -D_ARCH_PWR} \
  %{mcpu=power2: -D_ARCH_PWR2} \
+ %{mcpu=power3: -D_ARCH_PPC} \
+ %{mcpu=power4: -D_ARCH_PPC} \
  %{mcpu=powerpc: -D_ARCH_PPC} \
  %{mcpu=rios: -D_ARCH_PWR} \
  %{mcpu=rios1: -D_ARCH_PWR} \
*************** Boston, MA 02111-1307, USA.  */
*** 98,103 ****
--- 100,107 ----
  %{mcpu=common: -mcom} \
  %{mcpu=power: -mpwr} \
  %{mcpu=power2: -mpwrx} \
+ %{mcpu=power3: -m604} \
+ %{mcpu=power4: -m604} \
  %{mcpu=powerpc: -mppc} \
  %{mcpu=rios: -mpwr} \
  %{mcpu=rios1: -mpwr} \
*************** Boston, MA 02111-1307, USA.  */
*** 116,121 ****
--- 120,126 ----
  %{mcpu=604: -mppc} \
  %{mcpu=604e: -mppc} \
  %{mcpu=620: -mppc} \
+ %{mcpu=630: -m604} \
  %{mcpu=740: -mppc} \
  %{mcpu=7400: -mppc} \
  %{mcpu=7450: -mppc} \
*************** enum processor_type
*** 395,401 ****
     PROCESSOR_PPC630,
     PROCESSOR_PPC750,
     PROCESSOR_PPC7400,
!    PROCESSOR_PPC7450
  };
  
  extern enum processor_type rs6000_cpu;
--- 400,407 ----
     PROCESSOR_PPC630,
     PROCESSOR_PPC750,
     PROCESSOR_PPC7400,
!    PROCESSOR_PPC7450,
!    PROCESSOR_POWER4
  };
  
  extern enum processor_type rs6000_cpu;
*************** do {									     \
*** 2298,2303 ****
--- 2304,2310 ----
          return COSTS_N_INSNS (4);					\
        case PROCESSOR_PPC620:						\
        case PROCESSOR_PPC630:						\
+       case PROCESSOR_POWER4:						\
          return (GET_CODE (XEXP (X, 1)) != CONST_INT			\
  		? GET_MODE (XEXP (X, 1)) != DImode			\
  		? COSTS_N_INSNS (5) : COSTS_N_INSNS (7)			\
*************** do {									     \
*** 2337,2342 ****
--- 2344,2350 ----
  	return COSTS_N_INSNS (20);					\
        case PROCESSOR_PPC620:						\
        case PROCESSOR_PPC630:						\
+       case PROCESSOR_POWER4:						\
          return (GET_MODE (XEXP (X, 1)) != DImode			\
  		? COSTS_N_INSNS (21)					\
  		: COSTS_N_INSNS (37));					\
Index: rs6000.md
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.189
diff -c -p -r1.189 rs6000.md
*** rs6000.md	19 May 2002 17:10:47 -0000	1.189
--- rs6000.md	9 Jun 2002 14:40:30 -0000
***************
*** 56,62 ****
  ;; Processor type -- this attribute must exactly match the processor_type
  ;; enumeration in rs6000.h.
  
! (define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450"
    (const (symbol_ref "rs6000_cpu_attr")))
  
  ; (define_function_unit NAME MULTIPLICITY SIMULTANEITY
--- 56,62 ----
  ;; Processor type -- this attribute must exactly match the processor_type
  ;; enumeration in rs6000.h.
  
! (define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450,power4"
    (const (symbol_ref "rs6000_cpu_attr")))
  
  ; (define_function_unit NAME MULTIPLICITY SIMULTANEITY
***************
*** 375,416 ****
--- 375,426 ----
    (and (eq_attr "type" "cr_logical")
         (eq_attr "cpu" "ppc7450"))
    1 1)
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "vecsimple")
         (eq_attr "cpu" "ppc7450"))
    1 2 [(eq_attr "type" "vecsimple")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "vecsimple")
         (eq_attr "cpu" "ppc7450"))
    1 1 [(eq_attr "type" "!vecsimple")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "veccomplex")
         (eq_attr "cpu" "ppc7450"))
    4 2 [(eq_attr "type" "veccomplex")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "veccomplex")
         (eq_attr "cpu" "ppc7450"))
    4 1 [(eq_attr "type" "!veccomplex")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "veccmp")
         (eq_attr "cpu" "ppc7450"))
    2 2 [(eq_attr "type" "veccmp")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "veccmp")
         (eq_attr "cpu" "ppc7450"))
    2 1 [(eq_attr "type" "!veccmp")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "vecfloat")
         (eq_attr "cpu" "ppc7450"))
    4 2 [(eq_attr "type" "vecfloat")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "vecfloat")
         (eq_attr "cpu" "ppc7450"))
    4 1 [(eq_attr "type" "!vecfloat")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "vecperm")
         (eq_attr "cpu" "ppc7450"))
    2 2 [(eq_attr "type" "vecperm")])
+ 
  (define_function_unit "vec_alu2" 2 0
    (and (eq_attr "type" "vecperm")
         (eq_attr "cpu" "ppc7450"))
***************
*** 489,495 ****
  
  (define_function_unit "iu" 1 0
    (and (eq_attr "type" "compare,delayed_compare")
!        (eq_attr "cpu" "rs64a,mpccore,ppc403,ppc405,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630"))
    3 1)
  
  ; some extra cycles added by TARGET_SCHED_ADJUST_COST between compare
--- 499,505 ----
  
  (define_function_unit "iu" 1 0
    (and (eq_attr "type" "compare,delayed_compare")
!        (eq_attr "cpu" "rs64a,mpccore,ppc403,ppc405,ppc601,ppc603"))
    3 1)
  
  ; some extra cycles added by TARGET_SCHED_ADJUST_COST between compare
***************
*** 699,720 ****
  
  ; RIOS2 has two symmetric FPUs.
  (define_function_unit "fpu2" 2 0
!   (and (eq_attr "type" "fp")
!        (eq_attr "cpu" "rios2"))
!   2 1)
! 
! (define_function_unit "fpu2" 2 0
!   (and (eq_attr "type" "fp")
!        (eq_attr "cpu" "ppc630"))
!   3 1)
! 
! (define_function_unit "fpu2" 2 0
!   (and (eq_attr "type" "dmul")
         (eq_attr "cpu" "rios2"))
    2 1)
  
  (define_function_unit "fpu2" 2 0
!   (and (eq_attr "type" "dmul")
         (eq_attr "cpu" "ppc630"))
    3 1)
  
--- 709,720 ----
  
  ; RIOS2 has two symmetric FPUs.
  (define_function_unit "fpu2" 2 0
!   (and (eq_attr "type" "fp,dmul")
         (eq_attr "cpu" "rios2"))
    2 1)
  
  (define_function_unit "fpu2" 2 0
!   (and (eq_attr "type" "fp,dmul")
         (eq_attr "cpu" "ppc630"))
    3 1)
  
***************
*** 748,753 ****
--- 748,854 ----
         (eq_attr "cpu" "ppc630"))
    26 26)
  
+ ;; Power4
+ (define_function_unit "lsu2" 2 0
+   (and (eq_attr "type" "load")
+        (eq_attr "cpu" "power4"))
+   3 1)
+ 
+ (define_function_unit "lsu2" 2 0
+   (and (eq_attr "type" "fpload")
+        (eq_attr "cpu" "power4"))
+   5 1)
+ 
+ (define_function_unit "lsu2" 2 0
+   (and (eq_attr "type" "store,fpstore")
+        (eq_attr "cpu" "power4"))
+   1 1)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "integer")
+        (eq_attr "cpu" "power4"))
+   2 1)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "imul,lmul")
+        (eq_attr "cpu" "power4"))
+   7 6)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "imul2")
+        (eq_attr "cpu" "power4"))
+   5 4)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "imul3")
+        (eq_attr "cpu" "power4"))
+   4 3)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "idiv")
+        (eq_attr "cpu" "power4"))
+   36 35)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "ldiv")
+        (eq_attr "cpu" "power4"))
+   68 67)
+ 
+ (define_function_unit "imuldiv" 1 0
+   (and (eq_attr "type" "idiv")
+        (eq_attr "cpu" "power4"))
+   36 35)
+ 
+ (define_function_unit "imuldiv" 1 0
+   (and (eq_attr "type" "ldiv")
+        (eq_attr "cpu" "power4"))
+   68 67)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "compare,delayed_compare")
+        (eq_attr "cpu" "power4"))
+   2 1)
+ 
+ (define_function_unit "iu2" 2 0
+   (and (eq_attr "type" "mtjmpr")
+        (eq_attr "cpu" "power4"))
+   3 1)
+ 
+ (define_function_unit "bpu" 1 0
+   (and (eq_attr "type" "mtjmpr")
+        (eq_attr "cpu" "power4"))
+   3 1)
+ 
+ (define_function_unit "bpu" 1 0
+   (and (eq_attr "type" "jmpreg,branch")
+        (eq_attr "cpu" "power4"))
+   2 1)
+ 
+ (define_function_unit "cru" 1 0
+   (and (eq_attr "type" "cr_logical")
+        (eq_attr "cpu" "power4"))
+   4 1)
+ 
+ (define_function_unit "fpu2" 2 0
+   (and (eq_attr "type" "fp,dmul")
+        (eq_attr "cpu" "power4"))
+   6 1)
+ 
+ (define_function_unit "fpu2" 2 0
+   (and (eq_attr "type" "fpcompare")
+        (eq_attr "cpu" "power4"))
+   8 2)
+ 
+ (define_function_unit "fpu2" 2 0
+   (and (eq_attr "type" "sdiv,ddiv")
+        (eq_attr "cpu" "power4"))
+   33 28)
+ 
+ (define_function_unit "fpu2" 2 0
+   (and (eq_attr "type" "ssqrt,dsqrt")
+        (eq_attr "cpu" "power4"))
+   40 35)
+ 
  \f
  ;; Start with fixed-point load and store insns.  Here we put only the more
  ;; complex forms.  Basic data transfer is done later.
***************
*** 7778,7784 ****
     mr %0,%1
     {l%U1%X1|lwz%U1%X1} %0,%1
     {st%U0%U1|stw%U0%U1} %1,%0"
!   [(set_attr "type" "*,*,*,compare,*,*,load,store")
     (set_attr "length" "*,*,12,*,8,*,*,*")])
  \f
  ;; For floating-point, we normally deal with the floating-point registers
--- 7879,7885 ----
     mr %0,%1
     {l%U1%X1|lwz%U1%X1} %0,%1
     {st%U0%U1|stw%U0%U1} %1,%0"
!   [(set_attr "type" "cr_logical,cr_logical,cr_logical,cr_logical,cr_logical,*,load,store")
     (set_attr "length" "*,*,12,*,8,*,*,*")])
  \f
  ;; For floating-point, we normally deal with the floating-point registers
***************
*** 10585,10591 ****
  			    (const_int 0)]))]
    ""
    "%D1mfcr %0\;{rlinm|rlwinm} %0,%0,%J1,1"
!   [(set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
--- 10686,10693 ----
  			    (const_int 0)]))]
    ""
    "%D1mfcr %0\;{rlinm|rlwinm} %0,%0,%J1,1"
!   [(set_attr "type" "cr_logical")
!    (set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
***************
*** 10594,10600 ****
  			    (const_int 0)]))]
    "TARGET_POWERPC64"
    "%D1mfcr %0\;{rlinm|rlwinm} %0,%0,%J1,1"
!   [(set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y")
--- 10696,10703 ----
  			    (const_int 0)]))]
    "TARGET_POWERPC64"
    "%D1mfcr %0\;{rlinm|rlwinm} %0,%0,%J1,1"
!   [(set_attr "type" "cr_logical")
!    (set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y")
***************
*** 10650,10656 ****
  
    return \"%D1mfcr %0\;{rlinm|rlwinm} %0,%0,%4,%5,%5\";
  }"
!  [(set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y")
--- 10753,10760 ----
  
    return \"%D1mfcr %0\;{rlinm|rlwinm} %0,%0,%4,%5,%5\";
  }"
!   [(set_attr "type" "cr_logical")
!    (set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y")
***************
*** 10719,10727 ****
  	(match_operator:SI 4 "scc_comparison_operator"
  			   [(match_operand 5 "cc_reg_operand" "y")
  			    (const_int 0)]))]
!    "REGNO (operands[2]) != REGNO (operands[5])"
!    "%D1%D4mfcr %3\;{rlinm|rlwinm} %0,%3,%J1,1\;{rlinm|rlwinm} %3,%3,%J4,1"
!    [(set_attr "length" "20")])
  
  (define_peephole
    [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
--- 10823,10832 ----
  	(match_operator:SI 4 "scc_comparison_operator"
  			   [(match_operand 5 "cc_reg_operand" "y")
  			    (const_int 0)]))]
!   "REGNO (operands[2]) != REGNO (operands[5])"
!   "%D1%D4mfcr %3\;{rlinm|rlwinm} %0,%3,%J1,1\;{rlinm|rlwinm} %3,%3,%J4,1"
!   [(set_attr "type" "cr_logical")
!    (set_attr "length" "20")])
  
  (define_peephole
    [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
***************
*** 10732,10740 ****
  	(match_operator:DI 4 "scc_comparison_operator"
  			   [(match_operand 5 "cc_reg_operand" "y")
  			    (const_int 0)]))]
!    "TARGET_POWERPC64 && REGNO (operands[2]) != REGNO (operands[5])"
!    "%D1%D4mfcr %3\;{rlinm|rlwinm} %0,%3,%J1,1\;{rlinm|rlwinm} %3,%3,%J4,1"
!    [(set_attr "length" "20")])
  
  ;; There are some scc insns that can be done directly, without a compare.
  ;; These are faster because they don't involve the communications between
--- 10837,10846 ----
  	(match_operator:DI 4 "scc_comparison_operator"
  			   [(match_operand 5 "cc_reg_operand" "y")
  			    (const_int 0)]))]
!   "TARGET_POWERPC64 && REGNO (operands[2]) != REGNO (operands[5])"
!   "%D1%D4mfcr %3\;{rlinm|rlwinm} %0,%3,%J1,1\;{rlinm|rlwinm} %3,%3,%J4,1"
!   [(set_attr "type" "cr_logical")
!    (set_attr "length" "20")])
  
  ;; There are some scc insns that can be done directly, without a compare.
  ;; These are faster because they don't involve the communications between
***************
*** 13727,13733 ****
          (unspec:SI [(reg:CC 68) (reg:CC 69) (reg:CC 70) (reg:CC 71) 
  		    (reg:CC 72)	(reg:CC 73) (reg:CC 74) (reg:CC 75)] 19))]
    ""
!   "mfcr %0")
  
  (define_insn "*stmw"
   [(match_parallel 0 "stmw_operation"
--- 13833,13840 ----
          (unspec:SI [(reg:CC 68) (reg:CC 69) (reg:CC 70) (reg:CC 71) 
  		    (reg:CC 72)	(reg:CC 73) (reg:CC 74) (reg:CC 75)] 19))]
    ""
!   "mfcr %0"
!   [(set_attr "type" "cr_logical")])
  
  (define_insn "*stmw"
   [(match_parallel 0 "stmw_operation"
***************
*** 13799,13815 ****
      mask |= INTVAL (XVECEXP (SET_SRC (XVECEXP (operands[0], 0, i)), 0, 1));
    operands[4] = GEN_INT (mask);
    return \"mtcrf %4,%2\";
! }")
  
  (define_insn ""
!  [(set (match_operand:CC 0 "cc_reg_operand" "=y")
!        (unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
! 		   (match_operand 2 "immediate_operand" "n")] 20))]
!  "GET_CODE (operands[0]) == REG 
!   && CR_REGNO_P (REGNO (operands[0]))
!   && GET_CODE (operands[2]) == CONST_INT
!   && INTVAL (operands[2]) == 1 << (75 - REGNO (operands[0]))"
!  "mtcrf %R0,%1")
  
  ; The load-multiple instructions have similar properties.
  ; Note that "load_multiple" is a name known to the machine-independent
--- 13906,13924 ----
      mask |= INTVAL (XVECEXP (SET_SRC (XVECEXP (operands[0], 0, i)), 0, 1));
    operands[4] = GEN_INT (mask);
    return \"mtcrf %4,%2\";
! }"
!   [(set_attr "type" "cr_logical")])
  
  (define_insn ""
!   [(set (match_operand:CC 0 "cc_reg_operand" "=y")
!         (unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
! 		    (match_operand 2 "immediate_operand" "n")] 20))]
!   "GET_CODE (operands[0]) == REG 
!    && CR_REGNO_P (REGNO (operands[0]))
!    && GET_CODE (operands[2]) == CONST_INT
!    && INTVAL (operands[2]) == 1 << (75 - REGNO (operands[0]))"
!   "mtcrf %R0,%1"
!   [(set_attr "type" "cr_logical")])
  
  ; The load-multiple instructions have similar properties.
  ; Note that "load_multiple" is a name known to the machine-independent

* Re: PowerPC cleanup and Power4
  2002-06-09  8:10 PowerPC cleanup and Power4 David Edelsohn
@ 2002-06-09  8:24 ` Neil Booth
  2002-06-09  8:31   ` David Edelsohn
  2002-06-09 10:05 ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Neil Booth @ 2002-06-09  8:24 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn wrote:-

> *************** do {									\
> *** 135,140 ****
> --- 133,140 ----
>   %{mcpu=common: -D_ARCH_COM} \
>   %{mcpu=power: -D_ARCH_PWR} \
>   %{mcpu=power2: -D_ARCH_PWR2} \
> + %{mcpu=power3: -D_ARCH_PPC} \
> + %{mcpu=power4: -D_ARCH_PPC} \
>   %{mcpu=powerpc: -D_ARCH_PPC} \
>   %{mcpu=rios: -D_ARCH_PWR} \
>   %{mcpu=rios1: -D_ARCH_PWR} \
> *************** do {									\
> *** 142,158 ****
>   %{mcpu=rsc: -D_ARCH_PWR} \
>   %{mcpu=rsc1: -D_ARCH_PWR} \
>   %{mcpu=rs64a: -D_ARCH_PPC} \
> - %{mcpu=403: -D_ARCH_PPC} \
> - %{mcpu=505: -D_ARCH_PPC} \
>   %{mcpu=601: -D_ARCH_PPC -D_ARCH_PWR} \
>   %{mcpu=602: -D_ARCH_PPC} \
>   %{mcpu=603: -D_ARCH_PPC} \
>   %{mcpu=603e: -D_ARCH_PPC} \
>   %{mcpu=604: -D_ARCH_PPC} \
>   %{mcpu=620: -D_ARCH_PPC} \
> ! %{mcpu=630: -D_ARCH_PPC} \
> ! %{mcpu=821: -D_ARCH_PPC} \
> ! %{mcpu=860: -D_ARCH_PPC}"

Is there any chance of persuading you to get rid of all this spec
hackery for CPP macros, and to use TARGET_OS_CPP_BUILTINS and
TARGET_CPU_CPP_BUILTINS instead, as is done in the alpha/ port
and others?  It has many advantages.

Neil.
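
A minimal sketch of the hook-based approach suggested above, under stated
assumptions: the stub builtin_define below only records names and stands in
for GCC's real function of the same name, and the enum subset,
macro_defined_p, and cpu_cpp_builtins are hypothetical, trimmed-down
illustrations rather than the rs6000 port's code.  The macro choices mirror
the CPP_CPU_SPEC entries quoted in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the processor enumeration.  */
enum processor_type
{
  PROCESSOR_RIOS1,
  PROCESSOR_PPC601,
  PROCESSOR_PPC630,
  PROCESSOR_POWER4
};

/* Stand-in for GCC's builtin_define: record the name instead of
   handing it to the preprocessor.  */
#define MAX_DEFS 8
static const char *defs[MAX_DEFS];
static size_t ndefs;

static void
builtin_define (const char *name)
{
  assert (ndefs < MAX_DEFS);
  defs[ndefs++] = name;
}

/* Return nonzero if NAME has been recorded.  */
static int
macro_defined_p (const char *name)
{
  size_t i;

  for (i = 0; i < ndefs; i++)
    if (strcmp (defs[i], name) == 0)
      return 1;
  return 0;
}

/* Choose the architecture macros in code, instead of keeping one
   %{mcpu=...} spec entry per processor name.  */
static void
cpu_cpp_builtins (enum processor_type cpu)
{
  switch (cpu)
    {
    case PROCESSOR_RIOS1:
      builtin_define ("_ARCH_PWR");
      break;
    case PROCESSOR_PPC601:
      /* The 601 spec entry defines both macros.  */
      builtin_define ("_ARCH_PPC");
      builtin_define ("_ARCH_PWR");
      break;
    default:
      builtin_define ("_ARCH_PPC");
      break;
    }
}
```

With this shape, adding a processor such as Power4 needs no new entry as
long as it falls under the default case, whereas the spec-string form needs
a line per -mcpu name in every target header that overrides the spec.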

* Re: PowerPC cleanup and Power4
  2002-06-09  8:24 ` Neil Booth
@ 2002-06-09  8:31   ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-06-09  8:31 UTC (permalink / raw)
  To: Neil Booth; +Cc: gcc-patches

>>>>> Neil Booth writes:

Neil> Is there any chance of persuading you to get rid of all this spec
Neil> hackery for CPP macros, and to use TARGET_OS_CPP_BUILTINS and
Neil> TARGET_CPU_CPP_BUILTINS instead, as is done in the alpha/ port
Neil> and others?  It has many advantages.

	I believe that Red Hat is planning to do this.

David

* Re: PowerPC cleanup and Power4
  2002-06-09  8:10 PowerPC cleanup and Power4 David Edelsohn
  2002-06-09  8:24 ` Neil Booth
@ 2002-06-09 10:05 ` Geoff Keating
  2002-06-09 10:33   ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-06-09 10:05 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches


Hi David,

David Edelsohn <dje@watson.ibm.com> writes:

> 	The following patch cleans up some cruft and formatting in the
> rs6000 port and adds preliminary, basic Power4 support.
> 
> 	While exploring the scheduling, I noticed that cr_logical
> attribute had not been applied to mfcr/mtcrf instructions.  This bumps
> performance on processors with multiple SCIUs where cr_logical was being
> scheduled inefficiently.

How does this interact with processors where mfcr/mtcrf instructions
aren't done in the same function unit as other CR logical instructions
(like the 750)?

> *** linux64.h	5 Jun 2002 03:56:27 -0000	1.15
> --- linux64.h	9 Jun 2002 14:40:29 -0000
> *************** Boston, MA 02111-1307, USA.  */
> *** 31,36 ****
> --- 31,39 ----
>   #define TARGET_DEFAULT \
>     (MASK_POWERPC | MASK_POWERPC64 | MASK_64BIT | MASK_NEW_MNEMONICS)
>   
> + #undef PROCESSOR_DEFAULT
> + #define PROCESSOR_DEFAULT PROCESSOR_PPC630
> + 
>   #undef  CPP_DEFAULT_SPEC
>   #define CPP_DEFAULT_SPEC "-D_ARCH_PPC64"

This almost certainly doesn't do what you wanted.  More likely, you
want to change PROCESSOR_DEFAULT64.

I would also suggest changing PROCESSOR_DEFAULT64 for AIX, since the
default is currently rios1.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

* Re: PowerPC cleanup and Power4
  2002-06-09 10:05 ` Geoff Keating
@ 2002-06-09 10:33   ` David Edelsohn
  2002-06-09 12:08     ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-06-09 10:33 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

>>>>> Geoff Keating writes:

Geoff> How does this interact with processors where mfcr/mtcrf instructions
Geoff> aren't done in the same function unit as other CR logical instructions
Geoff> (like the 750)?

	Slight improvement because the 750 performs them in the complex
integer unit.  Previously the instructions were considered integer and
assumed to dispatch on both simple integer pipelines, which was even
worse.  It's not perfect, but it's a net improvement for the moment.

Geoff> This almost certainly doesn't do what you wanted.  More likely, you
Geoff> want to change PROCESSOR_DEFAULT64.

	Oops, right, fixed.

Geoff> I would also suggest changing PROCESSOR_DEFAULT64 for AIX, since the
Geoff> default is currently rios1.

	PROCESSOR_DEFAULT64 never was defined as rios1.  PROCESSOR_DEFAULT
is defined as rios1 in rs6000.h, but overridden in aix43.h and aix51.h.

David
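
A minimal sketch of the distinction discussed above, assuming (it is not
shown in this thread) that the port picks its default processor roughly as
"64-bit ? PROCESSOR_DEFAULT64 : PROCESSOR_DEFAULT".  The enum subset, the
initial values, and default_processor are hypothetical stand-ins; only the
#undef/#define override pattern is taken from the patch.

```c
#include <assert.h>

/* Hypothetical subset of the processor enumeration.  */
enum processor_type
{
  PROCESSOR_RIOS1,
  PROCESSOR_RS64A,
  PROCESSOR_PPC604e,
  PROCESSOR_PPC630
};

/* Generic defaults, as a base header might provide them (assumed values).  */
#define PROCESSOR_DEFAULT   PROCESSOR_RIOS1
#define PROCESSOR_DEFAULT64 PROCESSOR_RS64A

/* A target header included later overrides only the 32-bit default,
   in the style of the aix43.h and aix51.h hunks.  */
#undef  PROCESSOR_DEFAULT
#define PROCESSOR_DEFAULT PROCESSOR_PPC604e

/* Select the default for a 32-bit (0) or 64-bit (1) target.  */
static enum processor_type
default_processor (int target_64bit)
{
  assert (target_64bit == 0 || target_64bit == 1);
  return target_64bit ? PROCESSOR_DEFAULT64 : PROCESSOR_DEFAULT;
}
```

Under that assumption, redefining PROCESSOR_DEFAULT in a 64-bit-only header
has no effect on the processor actually selected, which is why
PROCESSOR_DEFAULT64 is the macro to change there.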

* Re: PowerPC cleanup and Power4
  2002-06-09 10:33   ` David Edelsohn
@ 2002-06-09 12:08     ` Geoff Keating
  2002-06-09 12:15       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-06-09 12:08 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches

> Date: Sun, 09 Jun 2002 13:33:34 -0400
> From: David Edelsohn <dje@watson.ibm.com>

> Geoff> I would also suggest changing PROCESSOR_DEFAULT64 for AIX, since the
> Geoff> default is currently rios1.
> 
> 	PROCESSOR_DEFAULT64 never was defined as rios1.  PROCESSOR_DEFAULT
> is defined as rios1 in rs6000.h, but overridden in aix43.h and aix51.h.

Sorry, I meant "the default is currently rs64k".  I expect that most
systems running aix 4.3 are not rs64k any more...

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

* Re: PowerPC cleanup and Power4
  2002-06-09 12:08     ` Geoff Keating
@ 2002-06-09 12:15       ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-06-09 12:15 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

>>>>> Geoff Keating writes:

Geoff> Sorry, I meant "the default is currently rs64k".  I expect that most
Geoff> systems running aix 4.3 are not rs64k any more...

	For AIX, most are business systems running on something from the
RS64 processor family.  Scientific systems would be Power3-based,
but those users probably don't use the default anyway.  I doubt many AIX
users are compiling 64-bit apps with GCC anyway.

	Scientific Power3 systems are probably a good assumption for
64-bit PowerPC Linux.

David

* convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
@ 2002-07-02 21:18 Matt Kraai
  2002-07-03  8:10 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Matt Kraai @ 2002-07-02 21:18 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1695 bytes --]

Howdy,

The following patch converts 32-bit PowerPC GNU/Linux to use
TARGET_OS_CPP_BUILTINS.  It was bootstrapped and regression
tested on powerpc-unknown-linux-gnu.

OK to commit?


If desired, I can also submit a patch which converts the other
targets which include rs6000/sysv4.h.  I will only test on
powerpc-unknown-linux-gnu and powerpc-eabisim, however.

Matt

	* config/rs6000/linux.h (CPP_PREDEFINES): Remove.
	(TARGET_OS_CPP_BUILTINS): New.

Index: gcc/config/rs6000/linux.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux.h,v
retrieving revision 1.31
diff -c -3 -p -r1.31 linux.h
*** gcc/config/rs6000/linux.h	3 Dec 2001 00:49:41 -0000	1.31
--- gcc/config/rs6000/linux.h	2 Jul 2002 17:27:45 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 28,35 ****
  #undef MD_STARTFILE_PREFIX
  
  #undef CPP_PREDEFINES
! #define CPP_PREDEFINES \
!  "-DPPC -D__ELF__ -Dpowerpc -Acpu=powerpc -Amachine=powerpc"
  
  #undef	CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
--- 28,43 ----
  #undef MD_STARTFILE_PREFIX
  
  #undef CPP_PREDEFINES
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define_std ("__ELF__");     \
!       builtin_define_std ("powerpc");     \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  
  #undef	CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
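
For readers comparing the old "-DPPC ... -Dpowerpc" string with the new
calls, here is a simplified model of which names builtin_define_std is
understood to define.  It is a sketch under assumptions, not GCC's
implementation: std_names and STD_NAME_LEN are hypothetical, and the model
assumes a name in the user's namespace does not end in an underscore.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STD_NAME_LEN 64

/* Fill OUT with the names defined for MACRO and return how many.
   FLAG_ISO is nonzero when an ISO standard mode was requested.  */
static int
std_names (const char *macro, int flag_iso, char out[3][STD_NAME_LEN])
{
  size_t len = strlen (macro);
  int n = 0;

  assert (len >= 1 && len + 5 <= STD_NAME_LEN);

  /* Reserved names ("__x" or "_X") are defined exactly as given.  */
  if (macro[0] == '_'
      && (macro[1] == '_' || (macro[1] >= 'A' && macro[1] <= 'Z')))
    {
      strcpy (out[n++], macro);
      return n;
    }

  /* User-namespace names get "__name" and "__name__" ...  */
  snprintf (out[0], STD_NAME_LEN, "%s%s",
	    macro[0] == '_' ? "_" : "__", macro);
  snprintf (out[1], STD_NAME_LEN, "%s__", out[0]);
  n = 2;

  /* ... and the bare name only outside ISO mode.  */
  if (!flag_iso)
    strcpy (out[n++], macro);
  return n;
}
```

In this model "PPC" yields __PPC, __PPC__ and, outside ISO mode, PPC, while
"__ELF__" is already reserved and yields only itself, so a plain
builtin_define call would express that case more directly.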

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-02 21:18 convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS Matt Kraai
@ 2002-07-03  8:10 ` David Edelsohn
  2002-07-03  9:15   ` Matt Kraai
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-03  8:10 UTC (permalink / raw)
  To: kraai; +Cc: gcc-patches

	I think it would be much better to convert the entire PowerPC port
at once instead of each target separately -- that means AIX and VxWorks,
etc.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-03  8:10 ` David Edelsohn
@ 2002-07-03  9:15   ` Matt Kraai
  2002-07-03  9:25     ` Stan Shebs
                       ` (3 more replies)
  0 siblings, 4 replies; 875+ messages in thread
From: Matt Kraai @ 2002-07-03  9:15 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 24072 bytes --]

On Wed, Jul 03, 2002 at 10:51:19AM -0400, David Edelsohn wrote:
> 	I think it would be much better to convert the entire PowerPC port
> at once instead of each target separately -- that means AIX and VxWorks,
> etc.

OK.  The following patch converts the entire port.  I'm
currently bootstrapping it on powerpc-unknown-linux-gnu.  I'd
appreciate other people testing it on some other PowerPC
platforms.

Matt

	* config/rs6000/aix.h: Convert CPP_PREDEFINES to
	TARGET_OS_CPP_BUILTINS.
	* config/rs6000/aix31.h: Likewise.
	* config/rs6000/aix41.h: Likewise.
	* config/rs6000/aix43.h: Likewise.
	* config/rs6000/aix51.h: Likewise.
	* config/rs6000/beos.h: Likewise.
	* config/rs6000/darwin.h: Likewise.
	* config/rs6000/eabi.h: Likewise.
	* config/rs6000/eabisim.h: Likewise.
	* config/rs6000/linux.h: Likewise.
	* config/rs6000/linux64.h: Likewise.
	* config/rs6000/lynx.h: Likewise.
	* config/rs6000/mach.h: Likewise.
	* config/rs6000/rtems.h: Likewise.
	* config/rs6000/sysv4.h: Likewise.
	* config/rs6000/vxppc.h: Likewise.

Index: gcc/config/rs6000/aix.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix.h,v
retrieving revision 1.29
diff -c -3 -p -r1.29 aix.h
*** gcc/config/rs6000/aix.h	11 Jun 2002 23:14:47 -0000	1.29
--- gcc/config/rs6000/aix.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 41,48 ****
  #define LINK_LIBGCC_SPECIAL_1
  
  /* Names to predefine in the preprocessor for this target machine.  */
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_AIX -D_AIX32 -D_LONG_LONG \
! -Asystem=unix -Asystem=aix -Acpu=rs6000 -Amachine=rs6000"
  
  /* Define appropriate architecture macros for preprocessor depending on
     target switches.  */
--- 41,60 ----
  #define LINK_LIBGCC_SPECIAL_1
  
  /* Names to predefine in the preprocessor for this target machine.  */
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define_std ("_IBMR2");     \
!       builtin_define_std ("_POWER");     \
!       builtin_define_std ("_AIX");       \
!       builtin_define_std ("_AIX32");     \
!       builtin_define_std ("_LONG_LONG"); \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=aix");     \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!     }                                    \
!   while (0)
  
  /* Define appropriate architecture macros for preprocessor depending on
     target switches.  */
Index: gcc/config/rs6000/aix31.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix31.h,v
retrieving revision 1.8
diff -c -3 -p -r1.8 aix31.h
*** gcc/config/rs6000/aix31.h	15 Nov 2001 05:21:06 -0000	1.8
--- gcc/config/rs6000/aix31.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 60,67 ****
  }
  
  /* AIX 3.2 defined _AIX32, but older versions do not.  */
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_AIX -Asystem=unix -Asystem=aix -Acpu=rs6000 -Amachine=rs6000"
  
  /* AIX 3.1 uses bit 15 in CROR as the magic nop.  */
  #undef RS6000_CALL_GLUE
--- 60,77 ----
  }
  
  /* AIX 3.2 defined _AIX32, but older versions do not.  */
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define_std ("_IBMR2");     \
!       builtin_define_std ("_AIX");       \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=aix");     \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!     }                                    \
!   while (0)
  
  /* AIX 3.1 uses bit 15 in CROR as the magic nop.  */
  #undef RS6000_CALL_GLUE
Index: gcc/config/rs6000/aix41.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix41.h,v
retrieving revision 1.16
diff -c -3 -p -r1.16 aix41.h
*** gcc/config/rs6000/aix41.h	11 Jun 2002 23:14:47 -0000	1.16
--- gcc/config/rs6000/aix41.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 33,41 ****
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_AIX -D_AIX32 -D_AIX41 \
! -D_LONG_LONG -Asystem=unix -Asystem=aix"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
--- 33,52 ----
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define_std ("_IBMR2");     \
!       builtin_define_std ("_POWER");     \
!       builtin_define_std ("_AIX");       \
!       builtin_define_std ("_AIX32");     \
!       builtin_define_std ("_AIX41");     \
!       builtin_define_std ("_LONG_LONG"); \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=aix");     \
!     }                                    \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
Index: gcc/config/rs6000/aix43.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix43.h,v
retrieving revision 1.26
diff -c -3 -p -r1.26 aix43.h
*** gcc/config/rs6000/aix43.h	12 Jun 2002 03:06:25 -0000	1.26
--- gcc/config/rs6000/aix43.h	3 Jul 2002 15:38:09 -0000
*************** do {									\
*** 96,104 ****
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_AIX -D_AIX32 -D_AIX41 -D_AIX43 \
! -D_LONG_LONG -Asystem=unix -Asystem=aix"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
--- 96,116 ----
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define_std ("_IBMR2");     \
!       builtin_define_std ("_POWER");     \
!       builtin_define_std ("_AIX");       \
!       builtin_define_std ("_AIX32");     \
!       builtin_define_std ("_AIX41");     \
!       builtin_define_std ("_AIX43");     \
!       builtin_define_std ("_LONG_LONG"); \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=aix");     \
!     }                                    \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
Index: gcc/config/rs6000/aix51.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix51.h,v
retrieving revision 1.15
diff -c -3 -p -r1.15 aix51.h
*** gcc/config/rs6000/aix51.h	12 Jun 2002 03:06:26 -0000	1.15
--- gcc/config/rs6000/aix51.h	3 Jul 2002 15:38:09 -0000
*************** do {									\
*** 96,104 ****
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_LONG_LONG \
! -D_AIX -D_AIX32 -D_AIX41 -D_AIX43 -D_AIX51 -Asystem=unix -Asystem=aix"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}	\
--- 96,117 ----
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define_std ("_IBMR2");     \
!       builtin_define_std ("_POWER");     \
!       builtin_define_std ("_LONG_LONG"); \
!       builtin_define_std ("_AIX");       \
!       builtin_define_std ("_AIX32");     \
!       builtin_define_std ("_AIX41");     \
!       builtin_define_std ("_AIX43");     \
!       builtin_define_std ("_AIX51");     \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=aix");     \
!     }                                    \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}	\
Index: gcc/config/rs6000/beos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/beos.h,v
retrieving revision 1.10
diff -c -3 -p -r1.10 beos.h
*** gcc/config/rs6000/beos.h	11 Jun 2002 23:14:47 -0000	1.10
--- gcc/config/rs6000/beos.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 38,46 ****
  #undef ASM_SPEC
  #define ASM_SPEC "-u %(asm_cpu)"
  
! #undef CPP_PREDEFINES
  /* __POWERPC__ must be defined for some header files */
! #define CPP_PREDEFINES "-D__BEOS__ -D__POWERPC__ -Asystem=beos -Acpu=powerpc -Amachine=powerpc"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}"
--- 38,55 ----
  #undef ASM_SPEC
  #define ASM_SPEC "-u %(asm_cpu)"
  
! #undef TARGET_OS_CPP_BUILTINS
  /* __POWERPC__ must be defined for some header files */
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("__BEOS__");    \
!       builtin_define_std ("__POWERPC__"); \
!       builtin_assert ("system=beos");     \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}"
Index: gcc/config/rs6000/darwin.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin.h,v
retrieving revision 1.21
diff -c -3 -p -r1.21 darwin.h
*** gcc/config/rs6000/darwin.h	11 Jun 2002 23:14:47 -0000	1.21
--- gcc/config/rs6000/darwin.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 44,50 ****
  #define SUBTARGET_OVERRIDE_OPTIONS  \
    rs6000_altivec_abi = 1;
  
! #define CPP_PREDEFINES "-D__ppc__ -D__POWERPC__ -D__NATURAL_ALIGNMENT__ -D__MACH__ -D__APPLE__"
  
  /* We want -fPIC by default, unless we're using -static to compile for
     the kernel or some such.  */
--- 44,59 ----
  #define SUBTARGET_OVERRIDE_OPTIONS  \
    rs6000_altivec_abi = 1;
  
! #define TARGET_OS_CPP_BUILTINS()                    \
!   do                                                \
!     {                                               \
!       builtin_define_std ("__ppc__");               \
!       builtin_define_std ("__POWERPC__");           \
!       builtin_define_std ("__NATURAL_ALIGNMENT__"); \
!       builtin_define_std ("__MACH__");              \
!       builtin_define_std ("__APPLE__");             \
!     }                                               \
!   while (0)
  
  /* We want -fPIC by default, unless we're using -static to compile for
     the kernel or some such.  */
Index: gcc/config/rs6000/eabi.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/eabi.h,v
retrieving revision 1.5
diff -c -3 -p -r1.5 eabi.h
*** gcc/config/rs6000/eabi.h	2 Nov 2000 23:29:12 -0000	1.5
--- gcc/config/rs6000/eabi.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 31,36 ****
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Embedded)");
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES \
!   "-DPPC -D__embedded__ -Asystem=embedded -Acpu=powerpc -Amachine=powerpc"
--- 31,44 ----
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Embedded)");
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()           \
!   do                                       \
!     {                                      \
!       builtin_define_std ("PPC");          \
!       builtin_define_std ("__embedded__"); \
!       builtin_assert ("system=embedded");  \
!       builtin_assert ("cpu=powerpc");      \
!       builtin_assert ("machine=powerpc");  \
!     }                                      \
!   while (0)
Index: gcc/config/rs6000/eabisim.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/eabisim.h,v
retrieving revision 1.4
diff -c -3 -p -r1.4 eabisim.h
*** gcc/config/rs6000/eabisim.h	2 Nov 2000 23:29:12 -0000	1.4
--- gcc/config/rs6000/eabisim.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 23,31 ****
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Simulated)");
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES \
!   "-DPPC -D__embedded__ -D__simulator__ -Asystem=embedded -Asystem=simulator -Acpu=powerpc -Amachine=powerpc"
  
  /* Make the simulator the default */
  #undef	LIB_DEFAULT_SPEC
--- 23,41 ----
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Simulated)");
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()            \
!   do                                        \
!     {                                       \
!       builtin_define_std ("PPC");           \
!       builtin_define_std ("__embedded__");  \
!       builtin_define_std ("__simulator__"); \
!       builtin_assert ("system=embedded");   \
!       builtin_assert ("system=simulator");  \
!       builtin_assert ("cpu=powerpc");       \
!       builtin_assert ("machine=powerpc");   \
!     }                                       \
!   while (0)
  
  /* Make the simulator the default */
  #undef	LIB_DEFAULT_SPEC
Index: gcc/config/rs6000/linux.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux.h,v
retrieving revision 1.31
diff -c -3 -p -r1.31 linux.h
*** gcc/config/rs6000/linux.h	3 Dec 2001 00:49:41 -0000	1.31
--- gcc/config/rs6000/linux.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 27,35 ****
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES \
!  "-DPPC -D__ELF__ -Dpowerpc -Acpu=powerpc -Amachine=powerpc"
  
  #undef	CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
--- 27,43 ----
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define_std ("__ELF__");     \
!       builtin_define_std ("powerpc");     \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  
  #undef	CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.19
diff -c -3 -p -r1.19 linux64.h
*** gcc/config/rs6000/linux64.h	12 Jun 2002 03:06:26 -0000	1.19
--- gcc/config/rs6000/linux64.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 106,116 ****
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef  CPP_PREDEFINES
! #define CPP_PREDEFINES \
!  "-D_PPC_ -D__PPC__ -D_PPC64_ -D__PPC64__ -D__powerpc__ -D__powerpc64__ \
!   -D_PIC_ -D__PIC__ -D__ELF__ \
!   -Acpu=powerpc64 -Amachine=powerpc64"
  
  #undef  CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
--- 106,128 ----
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()            \
!   do                                        \
!     {                                       \
!       builtin_define_std ("_PPC_");         \
!       builtin_define_std ("__PPC__");       \
!       builtin_define_std ("_PPC64_");       \
!       builtin_define_std ("__PPC64__");     \
!       builtin_define_std ("__powerpc__");   \
!       builtin_define_std ("__powerpc64__"); \
!       builtin_define_std ("_PIC_");         \
!       builtin_define_std ("__PIC__");       \
!       builtin_define_std ("__ELF__");       \
!       builtin_assert ("cpu=powerpc64");     \
!       builtin_assert ("machine=powerpc64"); \
!     }                                       \
!   while (0)
  
  #undef  CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
Index: gcc/config/rs6000/lynx.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/lynx.h,v
retrieving revision 1.10
diff -c -3 -p -r1.10 lynx.h
*** gcc/config/rs6000/lynx.h	18 May 2002 23:47:17 -0000	1.10
--- gcc/config/rs6000/lynx.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 53,60 ****
  #undef DEFAULT_SIGNED_CHAR
  #define DEFAULT_SIGNED_CHAR 1
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-Acpu=rs6000 -Amachine=rs6000 -Asystem=lynx -Asystem=unix -DLynx -D_IBMR2 -Dunix -Drs6000 -Dlynx -DLYNX"
  
  #undef LINK_SPEC
  #define LINK_SPEC "-T0x10001000 -H0x1000 -D0x20000000 -btextro -bhalt:4 -bnodelcsect -bnso -bro -bnoglink %{v} %{b*}"
--- 53,74 ----
  #undef DEFAULT_SIGNED_CHAR
  #define DEFAULT_SIGNED_CHAR 1
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!       builtin_assert ("system=lynx");    \
!       builtin_assert ("system=unix");    \
!       builtin_define_std ("Lynx");       \
!       builtin_define_std ("_IBMR2");     \
!       builtin_define_std ("unix");       \
!       builtin_define_std ("rs6000");     \
!       builtin_define_std ("lynx");       \
!       builtin_define_std ("LYNX");       \
!     }                                    \
!   while (0)
  
  #undef LINK_SPEC
  #define LINK_SPEC "-T0x10001000 -H0x1000 -D0x20000000 -btextro -bhalt:4 -bnodelcsect -bnso -bro -bnoglink %{v} %{b*}"
Index: gcc/config/rs6000/mach.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/mach.h,v
retrieving revision 1.6
diff -c -3 -p -r1.6 mach.h
*** gcc/config/rs6000/mach.h	20 Nov 2001 19:43:28 -0000	1.6
--- gcc/config/rs6000/mach.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 26,33 ****
  #define TARGET_VERSION fprintf (stderr, " (Mach-RS/6000)");
  
  /* We don't define AIX under MACH; instead we define `unix'.  */
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-Drios -D_IBMR2 -Dunix -Asystem=unix -Asystem=mach -Acpu=rs6000 -Amachine=rs6000"
  
  /* Define different binder options for MACH.  */
  #undef LINK_SPEC
--- 26,44 ----
  #define TARGET_VERSION fprintf (stderr, " (Mach-RS/6000)");
  
  /* We don't define AIX under MACH; instead we define `unix'.  */
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define_std ("rios");       \
!       builtin_define_std ("_IBMR2");     \
!       builtin_define_std ("unix");       \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=mach");    \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!     }                                    \
!   while (0)
  
  /* Define different binder options for MACH.  */
  #undef LINK_SPEC
Index: gcc/config/rs6000/rtems.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rtems.h,v
retrieving revision 1.14
diff -c -3 -p -r1.14 rtems.h
*** gcc/config/rs6000/rtems.h	12 Apr 2002 13:35:00 -0000	1.14
--- gcc/config/rs6000/rtems.h	3 Jul 2002 15:38:09 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 21,26 ****
  
  /* Specify predefined symbols in preprocessor.  */
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-DPPC -D__rtems__ \
!    -Asystem=rtems -Acpu=powerpc -Amachine=powerpc"
--- 21,34 ----
  
  /* Specify predefined symbols in preprocessor.  */
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define_std ("__rtems__");   \
!       builtin_assert ("system=rtems");    \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.97
diff -c -3 -p -r1.97 sysv4.h
*** gcc/config/rs6000/sysv4.h	11 Jun 2002 23:14:47 -0000	1.97
--- gcc/config/rs6000/sysv4.h	3 Jul 2002 15:38:10 -0000
*************** do {						\
*** 808,816 ****
  #define	TARGET_VERSION fprintf (stderr, " (PowerPC System V.4)");
  #endif
  \f
! #ifndef	CPP_PREDEFINES
! #define	CPP_PREDEFINES \
!   "-DPPC -Dunix -D__svr4__ -Asystem=unix -Asystem=svr4 -Acpu=powerpc -Amachine=powerpc"
  #endif
  
  /* Pass various options to the assembler.  */
--- 808,826 ----
  #define	TARGET_VERSION fprintf (stderr, " (PowerPC System V.4)");
  #endif
  \f
! #ifndef	TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define_std ("unix");        \
!       builtin_define_std ("__svr4__");    \
!       builtin_assert ("system=unix");     \
!       builtin_assert ("system=svr4");     \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  #endif
  
  /* Pass various options to the assembler.  */
Index: gcc/config/rs6000/vxppc.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/vxppc.h,v
retrieving revision 1.9
diff -c -3 -p -r1.9 vxppc.h
*** gcc/config/rs6000/vxppc.h	11 Jun 2002 23:14:47 -0000	1.9
--- gcc/config/rs6000/vxppc.h	3 Jul 2002 15:38:10 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 39,48 ****
  #undef	LINK_OS_DEFAULT_SPEC
  #define LINK_OS_DEFAULT_SPEC "%(link_os_vxworks)"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "\
! -D__vxworks -D__vxworks__ -Asystem=vxworks -Asystem=embedded \
! -Acpu=powerpc -Amachine=powerpc"
  
  /* We use stabs-in-elf for debugging */
  #undef PREFERRED_DEBUGGING_TYPE
--- 39,56 ----
  #undef	LINK_OS_DEFAULT_SPEC
  #define LINK_OS_DEFAULT_SPEC "%(link_os_vxworks)"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("__vxworks");   \
!       builtin_define_std ("__vxworks__"); \
!       builtin_assert ("system=vxworks");  \
!       builtin_assert ("system=embedded"); \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  
  /* We use stabs-in-elf for debugging */
  #undef PREFERRED_DEBUGGING_TYPE

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-03  9:15   ` Matt Kraai
@ 2002-07-03  9:25     ` Stan Shebs
  2002-07-03  9:32     ` David Edelsohn
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 875+ messages in thread
From: Stan Shebs @ 2002-07-03  9:25 UTC (permalink / raw)
  To: Matt Kraai; +Cc: David Edelsohn, gcc-patches

Matt Kraai wrote:
> 
> On Wed, Jul 03, 2002 at 10:51:19AM -0400, David Edelsohn wrote:
> >       I think it would be much better to convert the entire PowerPC port
> > at once instead of each target separately -- that means AIX and VxWorks,
> > etc.
> 
> OK.  The following patch converts the entire port.  I'm
> currently bootstrapping it on powerpc-unknown-linux-gnu.  I'd
> appreciate other people testing it on some other PowerPC
> platforms.

The Darwin part looks good, I'm happy with you just checking it in.

Stan

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-03  9:15   ` Matt Kraai
  2002-07-03  9:25     ` Stan Shebs
@ 2002-07-03  9:32     ` David Edelsohn
  2002-07-03  9:36     ` Jason R Thorpe
  2002-07-03 23:50     ` Alan Modra
  3 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-07-03  9:32 UTC (permalink / raw)
  To: Matt Kraai; +Cc: gcc-patches

	* config/rs6000/aix.h: Convert CPP_PREDEFINES to
	TARGET_OS_CPP_BUILTINS.
	* config/rs6000/aix31.h: Likewise.
	* config/rs6000/aix41.h: Likewise.
	* config/rs6000/aix43.h: Likewise.
	* config/rs6000/aix51.h: Likewise.
	* config/rs6000/beos.h: Likewise.
	* config/rs6000/darwin.h: Likewise.
	* config/rs6000/eabi.h: Likewise.
	* config/rs6000/eabisim.h: Likewise.
	* config/rs6000/linux.h: Likewise.
	* config/rs6000/linux64.h: Likewise.
	* config/rs6000/lynx.h: Likewise.
	* config/rs6000/mach.h: Likewise.
	* config/rs6000/rtems.h: Likewise.
	* config/rs6000/sysv4.h: Likewise.
	* config/rs6000/vxppc.h: Likewise.

This looks fine to check in.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-03  9:15   ` Matt Kraai
  2002-07-03  9:25     ` Stan Shebs
  2002-07-03  9:32     ` David Edelsohn
@ 2002-07-03  9:36     ` Jason R Thorpe
  2002-07-03 10:29       ` Matt Kraai
  2002-07-03 23:50     ` Alan Modra
  3 siblings, 1 reply; 875+ messages in thread
From: Jason R Thorpe @ 2002-07-03  9:36 UTC (permalink / raw)
  To: kraai; +Cc: gcc-patches

On Wed, Jul 03, 2002 at 08:52:47AM -0700, Matt Kraai wrote:

 >   /* Names to predefine in the preprocessor for this target machine.  */
 > ! #define TARGET_OS_CPP_BUILTINS()         \
 > !   do                                     \
 > !     {                                    \
 > !       builtin_define_std ("_IBMR2");     \
 > !       builtin_define_std ("_POWER");     \
 > !       builtin_define_std ("_AIX");       \
 > !       builtin_define_std ("_AIX32");     \
 > !       builtin_define_std ("_LONG_LONG"); \

All these which already have _ at the beginning of their names don't
need builtin_define_std, only builtin_define.

 > --- 38,55 ----
 >   #undef ASM_SPEC
 >   #define ASM_SPEC "-u %(asm_cpu)"
 >   
 > ! #undef TARGET_OS_CPP_BUILTINS
 >   /* __POWERPC__ must be defined for some header files */
 > ! #define TARGET_OS_CPP_BUILTINS()          \
 > !   do                                      \
 > !     {                                     \
 > !       builtin_define_std ("__BEOS__");    \
 > !       builtin_define_std ("__POWERPC__"); \

...ditto for these.

 > --- 31,44 ----
 >   #undef TARGET_VERSION
 >   #define TARGET_VERSION fprintf (stderr, " (PowerPC Embedded)");
 >   
 > ! #undef TARGET_OS_CPP_BUILTINS
 > ! #define TARGET_OS_CPP_BUILTINS()           \
 > !   do                                       \
 > !     {                                      \
 > !       builtin_define_std ("PPC");          \

This one, however, is a correct use of builtin_define_std.
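
To illustrate the distinction, here is a minimal standalone sketch of
the behaviour tm.texi documents for builtin_define_std.  It is not the
GCC implementation: in_user_namespace and model_define_std are
hypothetical names made up for this example, and the model only
collects the names that would be defined instead of calling into
cpplib.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a GCC function.  Returns 1 if NAME lies in
   the user's namespace, i.e. it does not start with "__" or with '_'
   followed by an uppercase letter.  */
static int
in_user_namespace (const char *name)
{
  assert (name != NULL);
  if (name[0] == '_'
      && (name[1] == '_' || isupper ((unsigned char) name[1])))
    return 0;
  return 1;
}

/* Hypothetical model of the documented builtin_define_std behaviour.
   Copies the names that would be defined into OUT and returns how
   many there are.  ISO is nonzero when an ISO standard was requested.  */
static int
model_define_std (const char *name, int iso, char out[3][64])
{
  size_t len;
  int n = 0;

  assert (strlen (name) > 0 && strlen (name) < 59);

  if (!in_user_namespace (name))
    {
      /* Reserved name: defined once, unconditionally, which is all
         that builtin_define would do as well.  */
      strcpy (out[n++], name);
      return n;
    }

  /* "__name": pad the leading underscores up to two.  */
  snprintf (out[n], 64, "%s%s", name[0] == '_' ? "_" : "__", name);
  n++;

  /* "__name__": the same, with trailing underscores padded to two.  */
  len = strlen (out[0]);
  strcpy (out[n], out[0]);
  if (out[0][len - 1] != '_')
    strcat (out[n], "__");
  else if (out[0][len - 2] != '_')
    strcat (out[n], "_");
  n++;

  /* The original spelling is defined only outside ISO mode.  */
  if (!iso)
    strcpy (out[n++], name);
  return n;
}
```

Under this model "PPC" gives __PPC, __PPC__ and, outside ISO mode, PPC,
while "_AIX" and "__BEOS__" give only themselves, so the extra work done
by builtin_define_std buys nothing for them.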

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-03  9:36     ` Jason R Thorpe
@ 2002-07-03 10:29       ` Matt Kraai
  0 siblings, 0 replies; 875+ messages in thread
From: Matt Kraai @ 2002-07-03 10:29 UTC (permalink / raw)
  To: Jason R Thorpe, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 26124 bytes --]

On Wed, Jul 03, 2002 at 09:35:26AM -0700, Jason R Thorpe wrote:
> On Wed, Jul 03, 2002 at 08:52:47AM -0700, Matt Kraai wrote:
> 
>  >   /* Names to predefine in the preprocessor for this target machine.  */
>  > ! #define TARGET_OS_CPP_BUILTINS()         \
>  > !   do                                     \
>  > !     {                                    \
>  > !       builtin_define_std ("_IBMR2");     \
>  > !       builtin_define_std ("_POWER");     \
>  > !       builtin_define_std ("_AIX");       \
>  > !       builtin_define_std ("_AIX32");     \
>  > !       builtin_define_std ("_LONG_LONG"); \
> 
> All these which already have _ at the beginning of their names don't
> need builtin_define_std, only builtin_define.

The following patch documents this in tm.texi and passes `make
info' and `make dvi'.  OK to commit?

	* doc/tm.texi (TARGET_CPU_CPP_BUILTINS): Document when
	builtin_define is preferred over builtin_define_std.

Index: gcc/doc/tm.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/tm.texi,v
retrieving revision 1.143
diff -c -3 -r1.143 tm.texi
*** gcc/doc/tm.texi	27 Jun 2002 17:19:04 -0000	1.143
--- gcc/doc/tm.texi	3 Jul 2002 17:07:42 -0000
***************
*** 609,614 ****
--- 609,617 ----
  @code{__mips__} and possibly @code{_mips}, and passing @code{_ABI64}
  defines only @code{_ABI64}.
  
+ For object-like macros which do not lie in the user's namespace, use
+ @code{builtin_define}.
+ 
  You can also test for the C dialect being compiled.  The variable
  @code{c_language} is set to one of @code{clk_c}, @code{clk_cplusplus}
  or @code{clk_objective_c}.  Note that if we are preprocessing

>  >   #undef ASM_SPEC
>  >   #define ASM_SPEC "-u %(asm_cpu)"
>  >   
>  > ! #undef TARGET_OS_CPP_BUILTINS
>  >   /* __POWERPC__ must be defined for some header files */
>  > ! #define TARGET_OS_CPP_BUILTINS()          \
>  > !   do                                      \
>  > !     {                                     \
>  > !       builtin_define_std ("__BEOS__");    \
>  > !       builtin_define_std ("__POWERPC__"); \
> 
> ...ditto for these.
> 
>  >   #undef TARGET_VERSION
>  >   #define TARGET_VERSION fprintf (stderr, " (PowerPC Embedded)");
>  >   
>  > ! #undef TARGET_OS_CPP_BUILTINS
>  > ! #define TARGET_OS_CPP_BUILTINS()           \
>  > !   do                                       \
>  > !     {                                      \
>  > !       builtin_define_std ("PPC");          \
> 
> This one, however, is a correct use of builtin_define_std.

I've converted the definitions of all macros which do not lie in
the user's namespace to use builtin_define instead of
builtin_define_std.  If the bootstrap and regression testing on
powerpc-unknown-linux-gnu are successful, OK to commit?

Matt

	* aix.h: Convert from CPP_PREDEFINES to TARGET_OS_CPP_BUILTINS.
	* aix31.h: Likewise.
	* aix41.h: Likewise.
	* aix43.h: Likewise.
	* aix51.h: Likewise.
	* beos.h: Likewise.
	* darwin.h: Likewise.
	* eabi.h: Likewise.
	* eabisim.h: Likewise.
	* linux.h: Likewise.
	* linux64.h: Likewise.
	* lynx.h: Likewise.
	* mach.h: Likewise.
	* rtems.h: Likewise.
	* sysv4.h: Likewise.
	* vxppc.h: Likewise.

Index: gcc/config/rs6000/aix.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix.h,v
retrieving revision 1.29
diff -c -3 -p -r1.29 aix.h
*** gcc/config/rs6000/aix.h	11 Jun 2002 23:14:47 -0000	1.29
--- gcc/config/rs6000/aix.h	3 Jul 2002 17:08:49 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 41,48 ****
  #define LINK_LIBGCC_SPECIAL_1
  
  /* Names to predefine in the preprocessor for this target machine.  */
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_AIX -D_AIX32 -D_LONG_LONG \
! -Asystem=unix -Asystem=aix -Acpu=rs6000 -Amachine=rs6000"
  
  /* Define appropriate architecture macros for preprocessor depending on
     target switches.  */
--- 41,60 ----
  #define LINK_LIBGCC_SPECIAL_1
  
  /* Names to predefine in the preprocessor for this target machine.  */
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define ("_IBMR2");         \
!       builtin_define ("_POWER");         \
!       builtin_define ("_AIX");           \
!       builtin_define ("_AIX32");         \
!       builtin_define ("_LONG_LONG");     \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=aix");     \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!     }                                    \
!   while (0)
  
  /* Define appropriate architecture macros for preprocessor depending on
     target switches.  */
Index: gcc/config/rs6000/aix31.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix31.h,v
retrieving revision 1.8
diff -c -3 -p -r1.8 aix31.h
*** gcc/config/rs6000/aix31.h	15 Nov 2001 05:21:06 -0000	1.8
--- gcc/config/rs6000/aix31.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 60,67 ****
  }
  
  /* AIX 3.2 defined _AIX32, but older versions do not.  */
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_AIX -Asystem=unix -Asystem=aix -Acpu=rs6000 -Amachine=rs6000"
  
  /* AIX 3.1 uses bit 15 in CROR as the magic nop.  */
  #undef RS6000_CALL_GLUE
--- 60,77 ----
  }
  
  /* AIX 3.2 defined _AIX32, but older versions do not.  */
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define ("_IBMR2");         \
!       builtin_define ("_AIX");           \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=aix");     \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!     }                                    \
!   while (0)
  
  /* AIX 3.1 uses bit 15 in CROR as the magic nop.  */
  #undef RS6000_CALL_GLUE
Index: gcc/config/rs6000/aix41.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix41.h,v
retrieving revision 1.16
diff -c -3 -p -r1.16 aix41.h
*** gcc/config/rs6000/aix41.h	11 Jun 2002 23:14:47 -0000	1.16
--- gcc/config/rs6000/aix41.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 33,41 ****
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_AIX -D_AIX32 -D_AIX41 \
! -D_LONG_LONG -Asystem=unix -Asystem=aix"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
--- 33,52 ----
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()      \
!   do                                  \
!     {                                 \
!       builtin_define ("_IBMR2");      \
!       builtin_define ("_POWER");      \
!       builtin_define ("_AIX");        \
!       builtin_define ("_AIX32");      \
!       builtin_define ("_AIX41");      \
!       builtin_define ("_LONG_LONG");  \
!       builtin_assert ("system=unix"); \
!       builtin_assert ("system=aix");  \
!     }                                 \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
Index: gcc/config/rs6000/aix43.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix43.h,v
retrieving revision 1.26
diff -c -3 -p -r1.26 aix43.h
*** gcc/config/rs6000/aix43.h	12 Jun 2002 03:06:25 -0000	1.26
--- gcc/config/rs6000/aix43.h	3 Jul 2002 17:08:50 -0000
*************** do {									\
*** 96,104 ****
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_AIX -D_AIX32 -D_AIX41 -D_AIX43 \
! -D_LONG_LONG -Asystem=unix -Asystem=aix"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
--- 96,116 ----
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()      \
!   do                                  \
!     {                                 \
!       builtin_define ("_IBMR2");      \
!       builtin_define ("_POWER");      \
!       builtin_define ("_AIX");        \
!       builtin_define ("_AIX32");      \
!       builtin_define ("_AIX41");      \
!       builtin_define ("_AIX43");      \
!       builtin_define ("_LONG_LONG");  \
!       builtin_assert ("system=unix"); \
!       builtin_assert ("system=aix");  \
!     }                                 \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}\
Index: gcc/config/rs6000/aix51.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/aix51.h,v
retrieving revision 1.15
diff -c -3 -p -r1.15 aix51.h
*** gcc/config/rs6000/aix51.h	12 Jun 2002 03:06:26 -0000	1.15
--- gcc/config/rs6000/aix51.h	3 Jul 2002 17:08:50 -0000
*************** do {									\
*** 96,104 ****
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-D_IBMR2 -D_POWER -D_LONG_LONG \
! -D_AIX -D_AIX32 -D_AIX41 -D_AIX43 -D_AIX51 -Asystem=unix -Asystem=aix"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}	\
--- 96,117 ----
  #undef	ASM_DEFAULT_SPEC
  #define ASM_DEFAULT_SPEC "-mcom"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()      \
!   do                                  \
!     {                                 \
!       builtin_define ("_IBMR2");      \
!       builtin_define ("_POWER");      \
!       builtin_define ("_LONG_LONG");  \
!       builtin_define ("_AIX");        \
!       builtin_define ("_AIX32");      \
!       builtin_define ("_AIX41");      \
!       builtin_define ("_AIX43");      \
!       builtin_define ("_AIX51");      \
!       builtin_assert ("system=unix"); \
!       builtin_assert ("system=aix");  \
!     }                                 \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}	\
Index: gcc/config/rs6000/beos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/beos.h,v
retrieving revision 1.10
diff -c -3 -p -r1.10 beos.h
*** gcc/config/rs6000/beos.h	11 Jun 2002 23:14:47 -0000	1.10
--- gcc/config/rs6000/beos.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 38,46 ****
  #undef ASM_SPEC
  #define ASM_SPEC "-u %(asm_cpu)"
  
! #undef CPP_PREDEFINES
  /* __POWERPC__ must be defined for some header files */
! #define CPP_PREDEFINES "-D__BEOS__ -D__POWERPC__ -Asystem=beos -Acpu=powerpc -Amachine=powerpc"
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}"
--- 38,55 ----
  #undef ASM_SPEC
  #define ASM_SPEC "-u %(asm_cpu)"
  
! #undef TARGET_OS_CPP_BUILTINS
  /* __POWERPC__ must be defined for some header files */
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define ("__BEOS__");        \
!       builtin_define ("__POWERPC__");     \
!       builtin_assert ("system=beos");     \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  
  #undef CPP_SPEC
  #define CPP_SPEC "%{posix: -D_POSIX_SOURCE}"
Index: gcc/config/rs6000/darwin.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin.h,v
retrieving revision 1.21
diff -c -3 -p -r1.21 darwin.h
*** gcc/config/rs6000/darwin.h	11 Jun 2002 23:14:47 -0000	1.21
--- gcc/config/rs6000/darwin.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 44,50 ****
  #define SUBTARGET_OVERRIDE_OPTIONS  \
    rs6000_altivec_abi = 1;
  
! #define CPP_PREDEFINES "-D__ppc__ -D__POWERPC__ -D__NATURAL_ALIGNMENT__ -D__MACH__ -D__APPLE__"
  
  /* We want -fPIC by default, unless we're using -static to compile for
     the kernel or some such.  */
--- 44,59 ----
  #define SUBTARGET_OVERRIDE_OPTIONS  \
    rs6000_altivec_abi = 1;
  
! #define TARGET_OS_CPP_BUILTINS()                \
!   do                                            \
!     {                                           \
!       builtin_define ("__ppc__");               \
!       builtin_define ("__POWERPC__");           \
!       builtin_define ("__NATURAL_ALIGNMENT__"); \
!       builtin_define ("__MACH__");              \
!       builtin_define ("__APPLE__");             \
!     }                                           \
!   while (0)
  
  /* We want -fPIC by default, unless we're using -static to compile for
     the kernel or some such.  */
Index: gcc/config/rs6000/eabi.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/eabi.h,v
retrieving revision 1.5
diff -c -3 -p -r1.5 eabi.h
*** gcc/config/rs6000/eabi.h	2 Nov 2000 23:29:12 -0000	1.5
--- gcc/config/rs6000/eabi.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 31,36 ****
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Embedded)");
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES \
!   "-DPPC -D__embedded__ -Asystem=embedded -Acpu=powerpc -Amachine=powerpc"
--- 31,44 ----
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Embedded)");
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define ("__embedded__");    \
!       builtin_assert ("system=embedded"); \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
Index: gcc/config/rs6000/eabisim.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/eabisim.h,v
retrieving revision 1.4
diff -c -3 -p -r1.4 eabisim.h
*** gcc/config/rs6000/eabisim.h	2 Nov 2000 23:29:12 -0000	1.4
--- gcc/config/rs6000/eabisim.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 23,31 ****
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Simulated)");
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES \
!   "-DPPC -D__embedded__ -D__simulator__ -Asystem=embedded -Asystem=simulator -Acpu=powerpc -Amachine=powerpc"
  
  /* Make the simulator the default */
  #undef	LIB_DEFAULT_SPEC
--- 23,41 ----
  #undef TARGET_VERSION
  #define TARGET_VERSION fprintf (stderr, " (PowerPC Simulated)");
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()           \
!   do                                       \
!     {                                      \
!       builtin_define_std ("PPC");          \
!       builtin_define ("__embedded__");     \
!       builtin_define ("__simulator__");    \
!       builtin_assert ("system=embedded");  \
!       builtin_assert ("system=simulator"); \
!       builtin_assert ("cpu=powerpc");      \
!       builtin_assert ("machine=powerpc");  \
!     }                                      \
!   while (0)
  
  /* Make the simulator the default */
  #undef	LIB_DEFAULT_SPEC
Index: gcc/config/rs6000/linux.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux.h,v
retrieving revision 1.31
diff -c -3 -p -r1.31 linux.h
*** gcc/config/rs6000/linux.h	3 Dec 2001 00:49:41 -0000	1.31
--- gcc/config/rs6000/linux.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 27,35 ****
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES \
!  "-DPPC -D__ELF__ -Dpowerpc -Acpu=powerpc -Amachine=powerpc"
  
  #undef	CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
--- 27,43 ----
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define ("__ELF__");         \
!       builtin_define_std ("powerpc");     \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  
  #undef	CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.19
diff -c -3 -p -r1.19 linux64.h
*** gcc/config/rs6000/linux64.h	12 Jun 2002 03:06:26 -0000	1.19
--- gcc/config/rs6000/linux64.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 106,116 ****
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef  CPP_PREDEFINES
! #define CPP_PREDEFINES \
!  "-D_PPC_ -D__PPC__ -D_PPC64_ -D__PPC64__ -D__powerpc__ -D__powerpc64__ \
!   -D_PIC_ -D__PIC__ -D__ELF__ \
!   -Acpu=powerpc64 -Amachine=powerpc64"
  
  #undef  CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
--- 106,128 ----
  #undef MD_EXEC_PREFIX
  #undef MD_STARTFILE_PREFIX
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()            \
!   do                                        \
!     {                                       \
!       builtin_define ("_PPC_");             \
!       builtin_define ("__PPC__");           \
!       builtin_define ("_PPC64_");           \
!       builtin_define ("__PPC64__");         \
!       builtin_define ("__powerpc__");       \
!       builtin_define ("__powerpc64__");     \
!       builtin_define ("_PIC_");             \
!       builtin_define ("__PIC__");           \
!       builtin_define ("__ELF__");           \
!       builtin_assert ("cpu=powerpc64");     \
!       builtin_assert ("machine=powerpc64"); \
!     }                                       \
!   while (0)
  
  #undef  CPP_OS_DEFAULT_SPEC
  #define CPP_OS_DEFAULT_SPEC "%(cpp_os_linux)"
Index: gcc/config/rs6000/lynx.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/lynx.h,v
retrieving revision 1.10
diff -c -3 -p -r1.10 lynx.h
*** gcc/config/rs6000/lynx.h	18 May 2002 23:47:17 -0000	1.10
--- gcc/config/rs6000/lynx.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 53,60 ****
  #undef DEFAULT_SIGNED_CHAR
  #define DEFAULT_SIGNED_CHAR 1
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-Acpu=rs6000 -Amachine=rs6000 -Asystem=lynx -Asystem=unix -DLynx -D_IBMR2 -Dunix -Drs6000 -Dlynx -DLYNX"
  
  #undef LINK_SPEC
  #define LINK_SPEC "-T0x10001000 -H0x1000 -D0x20000000 -btextro -bhalt:4 -bnodelcsect -bnso -bro -bnoglink %{v} %{b*}"
--- 53,74 ----
  #undef DEFAULT_SIGNED_CHAR
  #define DEFAULT_SIGNED_CHAR 1
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!       builtin_assert ("system=lynx");    \
!       builtin_assert ("system=unix");    \
!       builtin_define_std ("Lynx");       \
!       builtin_define ("_IBMR2");         \
!       builtin_define_std ("unix");       \
!       builtin_define_std ("rs6000");     \
!       builtin_define_std ("lynx");       \
!       builtin_define_std ("LYNX");       \
!     }                                    \
!   while (0)
  
  #undef LINK_SPEC
  #define LINK_SPEC "-T0x10001000 -H0x1000 -D0x20000000 -btextro -bhalt:4 -bnodelcsect -bnso -bro -bnoglink %{v} %{b*}"
Index: gcc/config/rs6000/mach.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/mach.h,v
retrieving revision 1.6
diff -c -3 -p -r1.6 mach.h
*** gcc/config/rs6000/mach.h	20 Nov 2001 19:43:28 -0000	1.6
--- gcc/config/rs6000/mach.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 26,33 ****
  #define TARGET_VERSION fprintf (stderr, " (Mach-RS/6000)");
  
  /* We don't define AIX under MACH; instead we define `unix'.  */
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-Drios -D_IBMR2 -Dunix -Asystem=unix -Asystem=mach -Acpu=rs6000 -Amachine=rs6000"
  
  /* Define different binder options for MACH.  */
  #undef LINK_SPEC
--- 26,44 ----
  #define TARGET_VERSION fprintf (stderr, " (Mach-RS/6000)");
  
  /* We don't define AIX under MACH; instead we define `unix'.  */
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()         \
!   do                                     \
!     {                                    \
!       builtin_define_std ("rios");       \
!       builtin_define ("_IBMR2");         \
!       builtin_define_std ("unix");       \
!       builtin_assert ("system=unix");    \
!       builtin_assert ("system=mach");    \
!       builtin_assert ("cpu=rs6000");     \
!       builtin_assert ("machine=rs6000"); \
!     }                                    \
!   while (0)
  
  /* Define different binder options for MACH.  */
  #undef LINK_SPEC
Index: gcc/config/rs6000/rtems.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rtems.h,v
retrieving revision 1.14
diff -c -3 -p -r1.14 rtems.h
*** gcc/config/rs6000/rtems.h	12 Apr 2002 13:35:00 -0000	1.14
--- gcc/config/rs6000/rtems.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 21,26 ****
  
  /* Specify predefined symbols in preprocessor.  */
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "-DPPC -D__rtems__ \
!    -Asystem=rtems -Acpu=powerpc -Amachine=powerpc"
--- 21,34 ----
  
  /* Specify predefined symbols in preprocessor.  */
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define ("__rtems__");       \
!       builtin_assert ("system=rtems");    \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.97
diff -c -3 -p -r1.97 sysv4.h
*** gcc/config/rs6000/sysv4.h	11 Jun 2002 23:14:47 -0000	1.97
--- gcc/config/rs6000/sysv4.h	3 Jul 2002 17:08:50 -0000
*************** do {						\
*** 808,816 ****
  #define	TARGET_VERSION fprintf (stderr, " (PowerPC System V.4)");
  #endif
  \f
! #ifndef	CPP_PREDEFINES
! #define	CPP_PREDEFINES \
!   "-DPPC -Dunix -D__svr4__ -Asystem=unix -Asystem=svr4 -Acpu=powerpc -Amachine=powerpc"
  #endif
  
  /* Pass various options to the assembler.  */
--- 808,826 ----
  #define	TARGET_VERSION fprintf (stderr, " (PowerPC System V.4)");
  #endif
  \f
! #ifndef	TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define_std ("PPC");         \
!       builtin_define_std ("unix");        \
!       builtin_define ("__svr4__");        \
!       builtin_assert ("system=unix");     \
!       builtin_assert ("system=svr4");     \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  #endif
  
  /* Pass various options to the assembler.  */
Index: gcc/config/rs6000/vxppc.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/vxppc.h,v
retrieving revision 1.9
diff -c -3 -p -r1.9 vxppc.h
*** gcc/config/rs6000/vxppc.h	11 Jun 2002 23:14:47 -0000	1.9
--- gcc/config/rs6000/vxppc.h	3 Jul 2002 17:08:50 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 39,48 ****
  #undef	LINK_OS_DEFAULT_SPEC
  #define LINK_OS_DEFAULT_SPEC "%(link_os_vxworks)"
  
! #undef CPP_PREDEFINES
! #define CPP_PREDEFINES "\
! -D__vxworks -D__vxworks__ -Asystem=vxworks -Asystem=embedded \
! -Acpu=powerpc -Amachine=powerpc"
  
  /* We use stabs-in-elf for debugging */
  #undef PREFERRED_DEBUGGING_TYPE
--- 39,56 ----
  #undef	LINK_OS_DEFAULT_SPEC
  #define LINK_OS_DEFAULT_SPEC "%(link_os_vxworks)"
  
! #undef TARGET_OS_CPP_BUILTINS
! #define TARGET_OS_CPP_BUILTINS()          \
!   do                                      \
!     {                                     \
!       builtin_define ("__vxworks");       \
!       builtin_define ("__vxworks__");     \
!       builtin_assert ("system=vxworks");  \
!       builtin_assert ("system=embedded"); \
!       builtin_assert ("cpu=powerpc");     \
!       builtin_assert ("machine=powerpc"); \
!     }                                     \
!   while (0)
  
  /* We use stabs-in-elf for debugging */
  #undef PREFERRED_DEBUGGING_TYPE

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-03  9:15   ` Matt Kraai
                       ` (2 preceding siblings ...)
  2002-07-03  9:36     ` Jason R Thorpe
@ 2002-07-03 23:50     ` Alan Modra
  2002-07-04  9:22       ` David Edelsohn
  3 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-07-03 23:50 UTC (permalink / raw)
  To: David Edelsohn, gcc-patches

On Wed, Jul 03, 2002 at 08:52:47AM -0700, Matt Kraai wrote:
> Index: gcc/config/rs6000/linux64.h
[snip]
> --- 106,128 ----
>   #undef MD_EXEC_PREFIX
>   #undef MD_STARTFILE_PREFIX
>   
> ! #undef TARGET_OS_CPP_BUILTINS
> ! #define TARGET_OS_CPP_BUILTINS()            \
> !   do                                        \
> !     {                                       \
> !       builtin_define_std ("_PPC_");         \
> !       builtin_define_std ("__PPC__");       \
> !       builtin_define_std ("_PPC64_");       \
> !       builtin_define_std ("__PPC64__");     \
> !       builtin_define_std ("__powerpc__");   \
> !       builtin_define_std ("__powerpc64__"); \
> !       builtin_define_std ("_PIC_");         \
> !       builtin_define_std ("__PIC__");       \
> !       builtin_define_std ("__ELF__");       \
> !       builtin_assert ("cpu=powerpc64");     \
> !       builtin_assert ("machine=powerpc64"); \
> !     }                                       \
> !   while (0)

Since powerpc64-linux is a relatively new target, is there good reason
to define _PPC_, _PPC64_ and _PIC_?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-03 23:50     ` Alan Modra
@ 2002-07-04  9:22       ` David Edelsohn
  2002-07-08 18:27         ` Matt Kraai
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-04  9:22 UTC (permalink / raw)
  To: gcc-patches

> Since powerpc64-linux is a relatively new target, is there good reason
> to define _PPC_, _PPC64_ and _PIC_?

	I guess we can remove those macros.  I guess GNU likes double
underscores, even when it is unnecessary for upper case names.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-04  9:22       ` David Edelsohn
@ 2002-07-08 18:27         ` Matt Kraai
  2002-07-08 19:05           ` Geoff Keating
  2002-07-08 19:16           ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Matt Kraai @ 2002-07-08 18:27 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Thu, Jul 04, 2002 at 12:19:19PM -0400, David Edelsohn wrote:
> > Since powerpc64-linux is a relatively new target, is there good reason
> > to define _PPC_, _PPC64_ and _PIC_?
> 
> 	I guess we can remove those macros.  I guess GNU likes double
> underscores, even when it is unnecessary for upper case names.

The following patch removes the aforementioned macros.  Is it
possible to test it on powerpc-unknown-linux-gnu, and if so,
how?

Matt

	* config/rs6000/linux64.h (CPP_PREDEFINES): Do not
	define _PPC_, _PPC64_, or _PIC_.

Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.19
diff -c -3 -p -r1.19 linux64.h
*** gcc/config/rs6000/linux64.h	12 Jun 2002 03:06:26 -0000	1.19
--- gcc/config/rs6000/linux64.h	8 Jul 2002 22:55:55 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 108,115 ****
  
  #undef  CPP_PREDEFINES
  #define CPP_PREDEFINES \
!  "-D_PPC_ -D__PPC__ -D_PPC64_ -D__PPC64__ -D__powerpc__ -D__powerpc64__ \
!   -D_PIC_ -D__PIC__ -D__ELF__ \
    -Acpu=powerpc64 -Amachine=powerpc64"
  
  #undef  CPP_OS_DEFAULT_SPEC
--- 108,114 ----
  
  #undef  CPP_PREDEFINES
  #define CPP_PREDEFINES \
!  "-D__PPC__ -D__PPC64__ -D__powerpc__ -D__powerpc64__ -D__PIC__ -D__ELF__ \
    -Acpu=powerpc64 -Amachine=powerpc64"
  
  #undef  CPP_OS_DEFAULT_SPEC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-08 18:27         ` Matt Kraai
@ 2002-07-08 19:05           ` Geoff Keating
  2002-07-08 19:16           ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2002-07-08 19:05 UTC (permalink / raw)
  To: Matt Kraai; +Cc: gcc-patches

Matt Kraai <kraai@alumni.cmu.edu> writes:

> On Thu, Jul 04, 2002 at 12:19:19PM -0400, David Edelsohn wrote:
> > > Since powerpc64-linux is a relatively new target, is there good reason
> > > to define _PPC_, _PPC64_ and _PIC_?
> > 
> > 	I guess we can remove those macros.  I guess GNU likes double
> > underscores, even when it is unnecessary for upper case names.
> 
> The following patch removes the aforementioned macros.  Is it
> possible to test it on powerpc-unknown-linux-gnu, and if so,
> how?

Um, bootstrap on a powerpc-linux machine?

The patch clearly can't affect powerpc-linux, because that header file
isn't used there.

> 	* config/rs6000/linux64.h (CPP_PREDEFINES): Do not
> 	define _PPC_, _PPC64_, or _PIC_.
> 
> Index: gcc/config/rs6000/linux64.h
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
> retrieving revision 1.19
> diff -c -3 -p -r1.19 linux64.h
> *** gcc/config/rs6000/linux64.h	12 Jun 2002 03:06:26 -0000	1.19
> --- gcc/config/rs6000/linux64.h	8 Jul 2002 22:55:55 -0000
> *************** Boston, MA 02111-1307, USA.  */
> *** 108,115 ****
>   
>   #undef  CPP_PREDEFINES
>   #define CPP_PREDEFINES \
> !  "-D_PPC_ -D__PPC__ -D_PPC64_ -D__PPC64__ -D__powerpc__ -D__powerpc64__ \
> !   -D_PIC_ -D__PIC__ -D__ELF__ \
>     -Acpu=powerpc64 -Amachine=powerpc64"
>   
>   #undef  CPP_OS_DEFAULT_SPEC
> --- 108,114 ----
>   
>   #undef  CPP_PREDEFINES
>   #define CPP_PREDEFINES \
> !  "-D__PPC__ -D__PPC64__ -D__powerpc__ -D__powerpc64__ -D__PIC__ -D__ELF__ \
>     -Acpu=powerpc64 -Amachine=powerpc64"
>   
>   #undef  CPP_OS_DEFAULT_SPEC

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-08 18:27         ` Matt Kraai
  2002-07-08 19:05           ` Geoff Keating
@ 2002-07-08 19:16           ` David Edelsohn
  2002-07-09  0:37             ` Matt Kraai
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-08 19:16 UTC (permalink / raw)
  To: Matt Kraai; +Cc: gcc-patches

	Just remove those macros with single underscores from your patch
converting to the new infrastructure.

	Are you going to apply your patch or do you need someone to do it
for you? 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS
  2002-07-08 19:16           ` David Edelsohn
@ 2002-07-09  0:37             ` Matt Kraai
  0 siblings, 0 replies; 875+ messages in thread
From: Matt Kraai @ 2002-07-09  0:37 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Mon, Jul 08, 2002 at 10:05:13PM -0400, David Edelsohn wrote:
> 	Just remove those macros with single underscores from your patch
> converting to the new infrastructure.
> 
> 	Are you going to apply your patch or do you need someone to do it
> for you? 

I wasn't sure whether I should commit it.  Neil Booth's reply
indicated that he didn't think I should.  Thus, I was planning
to clean up the CPP_PREDEFINES handling first, and then
transition to TARGET_OS_CPP_BUILTINS.

Matt

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-05-21 18:54 thread-local storage: c front end and generic backend patch Richard Henderson
  2002-05-22  4:25 ` Joseph S. Myers
  2002-05-22 13:53 ` Mark Mitchell
@ 2002-07-11  9:00 ` David Edelsohn
  2002-07-11 11:02   ` Richard Henderson
  2 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-11  9:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

	The tls patch has changed GCC's behavior so that it no longer
emits uninitialized data in the common section, as the GCC Internals
documentation has stated would occur if BSS macros were not defined.

	* c-decl.c (start_decl): Do not set DECL_COMMON for tls variables.

-   if (!initialized && (! flag_no_common || ! TREE_PUBLIC (decl)))
+   if (TREE_CODE (decl) == VAR_DECL
+       && !initialized
+       && TREE_PUBLIC (decl)
+       && !DECL_THREAD_LOCAL (decl)
+       && !flag_no_common)
      DECL_COMMON (decl) = 1;

	This part of your patch from May did more than just change the
behavior for tls variables: it limited DECL_COMMON to variables marked
TREE_PUBLIC.  Prior to your patch, variables without the TREE_PUBLIC flag were
marked DECL_COMMON.

At the point where varasm.c:assemble_variable() considers uninitialized
data, it tests DECL_COMMON if no BSS macro is defined:

#ifndef ASM_EMIT_BSS
  /* If the target can't output uninitialized but not common global data
     in .bss, then we have to use .data.  */
  /* ??? We should handle .bss via select_section mechanisms rather than
     via special target hooks.  That would eliminate this special case.  */
  else if (!DECL_COMMON (decl))
    ;
#endif
  else if (DECL_INITIAL (decl) == 0
           || DECL_INITIAL (decl) == error_mark_node
           || (flag_zero_initialized_in_bss
               && initializer_zerop (DECL_INITIAL (decl))))
    {
...
      asm_emit_uninitialised (decl, name, size, rounded);
    }

The tm.texi section for ASM_OUTPUT_BSS states:

If this macro and @code{ASM_OUTPUT_ALIGNED_BSS} are not defined then
@code{ASM_OUTPUT_COMMON} or @code{ASM_OUTPUT_ALIGNED_COMMON} or
@code{ASM_OUTPUT_ALIGNED_DECL_COMMON} is used.

and BSS_SECTION_ASM_OP:

If not defined, and neither @code{ASM_OUTPUT_BSS} nor
@code{ASM_OUTPUT_ALIGNED_BSS} are defined, uninitialized global data will
be output in the data section if @option{-fno-common} is passed, otherwise
@code{ASM_OUTPUT_COMMON} will be used.

ASM_OUTPUT_COMMON no longer is used in the absence of -fno-common as
documented.  All systems without BSS definitions now are using initialized
data instead of common.  If that is intentional, all targets without BSS
definitions need to be updated.

David
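
As an illustration of the behavior change described in the message above, the two conditions from the quoted hunk can be modeled in plain C. This is only a sketch, not GCC code: `struct decl_flags`, `common_before`, `common_after` and `decl_common_demo` are hypothetical names, and the struct stands in for the tree flags tested in start_decl.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the decl properties tested in start_decl.  */
struct decl_flags
{
  bool is_var;        /* TREE_CODE (decl) == VAR_DECL */
  bool initialized;   /* the declaration has an initializer */
  bool is_public;     /* TREE_PUBLIC (decl) */
  bool thread_local;  /* DECL_THREAD_LOCAL (decl) */
};

/* Condition before the May patch (the "-" line of the hunk).  */
static bool
common_before (struct decl_flags d, bool flag_no_common)
{
  return !d.initialized && (!flag_no_common || !d.is_public);
}

/* Condition after the May patch (the "+" lines of the hunk).  */
static bool
common_after (struct decl_flags d, bool flag_no_common)
{
  return d.is_var && !d.initialized && d.is_public
	 && !d.thread_local && !flag_no_common;
}

/* A file-scope "static int x;": not public, no initializer, not tls.  */
static const struct decl_flags static_uninit = { true, false, false, false };

/* A global "int y;": public, no initializer, not tls.  */
static const struct decl_flags public_uninit = { true, false, true, false };

void
decl_common_demo (void)
{
  /* The non-public variable used to be DECL_COMMON and no longer is.  */
  assert (common_before (static_uninit, false));
  assert (!common_after (static_uninit, false));

  /* The public variable is DECL_COMMON under both conditions.  */
  assert (common_before (public_uninit, false));
  assert (common_after (public_uninit, false));
}
```

In this model the only case that changes without -fno-common is the non-public variable, which is the case that now falls into the `!DECL_COMMON (decl)` branch when no BSS macro is defined.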

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: thread-local storage: c front end and generic backend patch
  2002-07-11  9:00 ` David Edelsohn
@ 2002-07-11 11:02   ` Richard Henderson
  2002-07-26 11:08     ` [PATCH] " David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-07-11 11:02 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Thu, Jul 11, 2002 at 11:28:57AM -0400, David Edelsohn wrote:
> 	This part of your patch from May did more than just change the
> behavior for tls variables; it limited DECL_COMMON to variables marked
> TREE_PUBLIC.  Prior to your patch, variables without the TREE_PUBLIC flag were
> marked DECL_COMMON.

Hmm.  Ok.  I can see why I made the change -- COMMON does not
make sense without PUBLIC.  I'll adjust assemble_variable and
its subroutines to compensate.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: target/7282: powerpc64 SImode in FPR
       [not found] <20020712071414.GR30362@bubble.sa.bigpond.net.au>
@ 2002-07-13  4:58 ` Alan Modra
  2002-07-13  7:25   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-07-13  4:58 UTC (permalink / raw)
  To: gcc-gnats; +Cc: gcc-patches, David Edelsohn

This cures the ICE.  Thanks to dje for putting me on the right track.

gcc/ChangeLog
	PR 7282
	* config/rs6000/rs6000.md (floatsidf2): Enable for POWERPC64.
	(floatsidf_ppc64): New insn_and_split.

Checked with gcc-3.1 i686-linux -> powerpc64-linux cross-compiler.
Bootstrapping and regression testing powerpc-linux mainline just to
be sure.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

diff -up gcc-ppc64-31.orig/gcc/config/rs6000/rs6000.md gcc-ppc64-31/gcc/config/rs6000/rs6000.md
--- gcc-ppc64-31.orig/gcc/config/rs6000/rs6000.md	2002-07-04 19:40:32.000000000 +0930
+++ gcc-ppc64-31/gcc/config/rs6000/rs6000.md	2002-07-13 20:26:05.000000000 +0930
@@ -5271,9 +5428,18 @@
 	      (clobber (match_dup 4))
 	      (clobber (match_dup 5))
 	      (clobber (match_dup 6))])]
-  "! TARGET_POWERPC64 && TARGET_HARD_FLOAT"
+  "TARGET_HARD_FLOAT"
   "
 {
+  if (TARGET_POWERPC64)
+    {
+      rtx mem = assign_stack_temp (DImode, GET_MODE_SIZE (DImode), 0);
+      rtx t1 = gen_reg_rtx (DImode);
+      rtx t2 = gen_reg_rtx (DImode);
+      emit_insn (gen_floatsidf_ppc64 (operands[0], operands[1], mem, t1, t2));
+      DONE;
+    }
+
   operands[2] = force_reg (SImode, GEN_INT (0x43300000));
   operands[3] = force_reg (DFmode, rs6000_float_const (\"4503601774854144\", DFmode));
   operands[4] = assign_stack_temp (DFmode, GET_MODE_SIZE (DFmode), 0);
@@ -5456,6 +5622,22 @@
   "fcfid %0,%1"
   [(set_attr "type" "fp")])
 
+(define_insn_and_split "floatsidf_ppc64"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+	(float:DF (match_operand:SI 1 "gpc_reg_operand" "*f")))
+   (clobber (match_operand:DI 2 "memory_operand" "=o"))
+   (clobber (match_operand:DI 3 "gpc_reg_operand" "=r"))
+   (clobber (match_operand:DI 4 "gpc_reg_operand" "=f"))]
+  "TARGET_POWERPC64 && TARGET_HARD_FLOAT"
+  "#"
+  ""
+  [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
+   (set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 2))
+   (set (match_dup 0) (float:DF (match_dup 4)))]
+  ""
+  [(set_attr "type" "fp")])
+
 (define_insn "fix_truncdfdi2"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=*f")
 	(fix:DI (match_operand:DF 1 "gpc_reg_operand" "f")))]

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: target/7282: powerpc64 SImode in FPR
  2002-07-13  4:58 ` target/7282: powerpc64 SImode in FPR Alan Modra
@ 2002-07-13  7:25   ` David Edelsohn
  2002-07-13 23:36     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-13  7:25 UTC (permalink / raw)
  To: gcc-gnats, gcc-patches

	This looks okay except for

+  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+	(float:DF (match_operand:SI 1 "gpc_reg_operand" "*f")))
+   (clobber (match_operand:DI 2 "memory_operand" "=o"))
+   (clobber (match_operand:DI 3 "gpc_reg_operand" "=r"))
+   (clobber (match_operand:DI 4 "gpc_reg_operand" "=f"))]

The input SImode operand should have constraint "r", not "*f".  The whole
point of this pattern is to move the SImode operand from the GPR to the
FPR because GCC sometimes gets confused when asked to do this itself.
SImode is not allowed in FPRs, so the "*f" constraint is contradictory.

David
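
To make the data path of the pattern concrete, the four sets in the split can be modeled in portable C. This is a sketch under assumptions, not the generated code: the function names are hypothetical, `memcpy` stands in for the store to and load from the DImode stack temporary, and the final cast stands in for fcfid. The unsigned variant corresponds to the floatunssidf case raised in this thread and differs only in using zero extension.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Signed case: SImode value starts in a GPR (operand 1, constraint "r").  */
double
float_si_via_fpr (int32_t si)
{
  int64_t gpr = (int64_t) si;       /* (set op3 (sign_extend:DI op1)) */
  unsigned char slot[8];            /* op2: DImode stack temporary */
  int64_t fpr_bits;                 /* op4: the same 64 bits, now in an FPR */

  assert (sizeof slot == sizeof gpr);
  memcpy (slot, &gpr, sizeof slot);       /* (set op2 op3): store from GPR */
  memcpy (&fpr_bits, slot, sizeof slot);  /* (set op4 op2): load into FPR */
  return (double) fpr_bits;         /* (set op0 (float:DF op4)): fcfid */
}

/* Unsigned case: identical except that the extension is zero_extend.  */
double
float_unssi_via_fpr (uint32_t usi)
{
  int64_t gpr = (int64_t) (uint64_t) usi; /* (zero_extend:DI op1) */
  unsigned char slot[8];
  int64_t fpr_bits;

  assert (sizeof slot == sizeof gpr);
  memcpy (slot, &gpr, sizeof slot);
  memcpy (&fpr_bits, slot, sizeof slot);
  return (double) fpr_bits;         /* value is non-negative, so signed
				       conversion gives the right result */
}
```

The model shows why the input belongs in a GPR: the 32-bit value is only ever extended in an integer register, and what reaches the FPR is always a full 64-bit quantity.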

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: target/7282: powerpc64 SImode in FPR
  2002-07-13  7:25   ` David Edelsohn
@ 2002-07-13 23:36     ` Alan Modra
  2002-07-14  7:59       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-07-13 23:36 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-gnats, gcc-patches

On Sat, Jul 13, 2002 at 10:18:54AM -0400, David Edelsohn wrote:
> 	This looks okay except for
> 
> +  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
> +	(float:DF (match_operand:SI 1 "gpc_reg_operand" "*f")))
> +   (clobber (match_operand:DI 2 "memory_operand" "=o"))
> +   (clobber (match_operand:DI 3 "gpc_reg_operand" "=r"))
> +   (clobber (match_operand:DI 4 "gpc_reg_operand" "=f"))]
> 
> The input SImode operand should have constraint "r", not "*f".  The whole
> point of this pattern is to move the SImode operand from the GPR to the
> FPR because GCC sometimes gets confused when asked to do this itself.
> SImode is not allowed in FPRs, so the "*f" constraint is contradictory.

Revised patch follows, incorporating your floatunssidf2 suggestion too.

	PR target/7282
	* config/rs6000/rs6000.md (floatsidf2): Enable for POWERPC64.
	(floatunssidf2): Likewise.
	(floatsidf_ppc64): New insn_and_split.
	(floatunssidf_ppc64): Likewise.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.192
diff -u -p -r1.192 rs6000.md
--- gcc/config/rs6000/rs6000.md	3 Jul 2002 14:41:22 -0000	1.192
+++ gcc/config/rs6000/rs6000.md	14 Jul 2002 05:38:17 -0000
@@ -5350,9 +5350,18 @@
 	      (clobber (match_dup 4))
 	      (clobber (match_dup 5))
 	      (clobber (match_dup 6))])]
-  "! TARGET_POWERPC64 && TARGET_HARD_FLOAT"
+  "TARGET_HARD_FLOAT"
   "
 {
+  if (TARGET_POWERPC64)
+    {
+      rtx mem = assign_stack_temp (DImode, GET_MODE_SIZE (DImode), 0);
+      rtx t1 = gen_reg_rtx (DImode);
+      rtx t2 = gen_reg_rtx (DImode);
+      emit_insn (gen_floatsidf_ppc64 (operands[0], operands[1], mem, t1, t2));
+      DONE;
+    }
+
   operands[2] = force_reg (SImode, GEN_INT (0x43300000));
   operands[3] = force_reg (DFmode, CONST_DOUBLE_ATOF (\"4503601774854144\", DFmode));
   operands[4] = assign_stack_temp (DFmode, GET_MODE_SIZE (DFmode), 0);
@@ -5417,9 +5426,19 @@
 	      (use (match_dup 3))
 	      (clobber (match_dup 4))
 	      (clobber (match_dup 5))])]
-  "! TARGET_POWERPC64 && TARGET_HARD_FLOAT"
+  "TARGET_HARD_FLOAT"
   "
 {
+  if (TARGET_POWERPC64)
+    {
+      rtx mem = assign_stack_temp (DImode, GET_MODE_SIZE (DImode), 0);
+      rtx t1 = gen_reg_rtx (DImode);
+      rtx t2 = gen_reg_rtx (DImode);
+      emit_insn (gen_floatunssidf_ppc64 (operands[0], operands[1], mem,
+					 t1, t2));
+      DONE;
+    }
+
   operands[2] = force_reg (SImode, GEN_INT (0x43300000));
   operands[3] = force_reg (DFmode, CONST_DOUBLE_ATOF (\"4503599627370496\", DFmode));
   operands[4] = assign_stack_temp (DFmode, GET_MODE_SIZE (DFmode), 0);
@@ -5533,6 +5552,38 @@
 	(float:DF (match_operand:DI 1 "gpc_reg_operand" "*f")))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT"
   "fcfid %0,%1"
+  [(set_attr "type" "fp")])
+
+(define_insn_and_split "floatsidf_ppc64"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+	(float:DF (match_operand:SI 1 "gpc_reg_operand" "r")))
+   (clobber (match_operand:DI 2 "memory_operand" "=o"))
+   (clobber (match_operand:DI 3 "gpc_reg_operand" "=r"))
+   (clobber (match_operand:DI 4 "gpc_reg_operand" "=f"))]
+  "TARGET_POWERPC64 && TARGET_HARD_FLOAT"
+  "#"
+  ""
+  [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
+   (set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 2))
+   (set (match_dup 0) (float:DF (match_dup 4)))]
+  ""
+  [(set_attr "type" "fp")])
+
+(define_insn_and_split "floatunssidf_ppc64"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+	(unsigned_float:DF (match_operand:SI 1 "gpc_reg_operand" "r")))
+   (clobber (match_operand:DI 2 "memory_operand" "=o"))
+   (clobber (match_operand:DI 3 "gpc_reg_operand" "=r"))
+   (clobber (match_operand:DI 4 "gpc_reg_operand" "=f"))]
+  "TARGET_POWERPC64 && TARGET_HARD_FLOAT"
+  "#"
+  ""
+  [(set (match_dup 3) (zero_extend:DI (match_dup 1)))
+   (set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 2))
+   (set (match_dup 0) (float:DF (match_dup 4)))]
+  ""
   [(set_attr "type" "fp")])
 
 (define_insn "fix_truncdfdi2"

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: target/7282: powerpc64 SImode in FPR
  2002-07-13 23:36     ` Alan Modra
@ 2002-07-14  7:59       ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-07-14  7:59 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

	This is fine to commit to the mainline, but please remove

[(set_attr "type" "fp")]

from the define_insn_and_split patterns.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
       [not found] <20020625081846.10430.qmail@sources.redhat.com>
@ 2002-07-15  2:43 ` Alan Modra
  2002-07-15  5:22   ` Alan Modra
  2002-07-15 12:51   ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-15  2:43 UTC (permalink / raw)
  To: d.mueller; +Cc: gcc-gnats, gcc-patches, David Edelsohn, geoffk

This patch cures the testcase.  The !using_store_multiple code tests
whether regs are live before saving.  We need to do something similar
for using_store_multiple, in case all regs need not be saved.

	* config/rs6000/rs6000.c (rs6000_emit_prologue): Trim saved regs
	for -mmultiple case like -mno-multiple case.
	(rs6000_emit_epilogue): Likewise.

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.291.2.13
diff -u -p -r1.291.2.13 rs6000.c
--- gcc/config/rs6000/rs6000.c	23 May 2002 23:22:44 -0000	1.291.2.13
+++ gcc/config/rs6000/rs6000.c	15 Jul 2002 09:06:31 -0000
@@ -8836,30 +8836,45 @@ rs6000_emit_prologue ()
      the store-multiple instructions.  */
   if (using_store_multiple)
     {
-      rtvec p, dwarfp;
-      int i;
-      p = rtvec_alloc (32 - info->first_gp_reg_save);
-      dwarfp = rtvec_alloc (32 - info->first_gp_reg_save);
-      for (i = 0; i < 32 - info->first_gp_reg_save; i++)
+      int n = info->first_gp_reg_save;
+
+      while (n < 32
+	     && !((regs_ever_live[n]
+		   && ! call_used_regs[n])
+		  || (n == RS6000_PIC_OFFSET_TABLE_REGNUM
+		      && ((DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+			  || (DEFAULT_ABI == ABI_DARWIN && flag_pic)))))
+	n++;
+
+      if (n < 32)
 	{
-	  rtx addr, reg, mem;
-	  reg = gen_rtx_REG (reg_mode, info->first_gp_reg_save + i);
-	  addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, 
-			       GEN_INT (info->gp_save_offset 
-					+ sp_offset 
-					+ reg_size * i));
-	  mem = gen_rtx_MEM (reg_mode, addr);
-	  set_mem_alias_set (mem, rs6000_sr_alias_set);
+	  rtvec p, dwarfp;
+	  int i;
+
+	  p = rtvec_alloc (32 - n);
+	  dwarfp = rtvec_alloc (32 - n);
+	  for (i = 0; i < 32 - n; i++)
+	    {
+	      rtx addr, reg, mem;
+	      reg = gen_rtx_REG (reg_mode, n + i);
+	      addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, 
+				   GEN_INT (info->gp_save_offset 
+					    + sp_offset 
+					    + reg_size * i));
+	      mem = gen_rtx_MEM (reg_mode, addr);
+	      set_mem_alias_set (mem, rs6000_sr_alias_set);
 
-	  RTVEC_ELT (p, i) = gen_rtx_SET (VOIDmode, mem, reg);
+	      RTVEC_ELT (p, i) = gen_rtx_SET (VOIDmode, mem, reg);
+	    }
+	  insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
+	  rs6000_frame_related (insn, frame_ptr_rtx, info->total_size, 
+				NULL_RTX, NULL_RTX);
 	}
-      insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
-      rs6000_frame_related (insn, frame_ptr_rtx, info->total_size, 
-			    NULL_RTX, NULL_RTX);
     }
   else
     {
       int i;
+
       for (i = 0; i < 32 - info->first_gp_reg_save; i++)
 	if ((regs_ever_live[info->first_gp_reg_save+i] 
 	     && ! call_used_regs[info->first_gp_reg_save+i])
@@ -9226,24 +9241,38 @@ rs6000_emit_epilogue (sibcall)
      the load-multiple instructions.  */
   if (using_load_multiple)
     {
-      rtvec p;
-      p = rtvec_alloc (32 - info->first_gp_reg_save);
-      for (i = 0; i < 32 - info->first_gp_reg_save; i++)
-	{
-	  rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, 
-				   GEN_INT (info->gp_save_offset 
-					    + sp_offset 
-					    + reg_size * i));
-	  rtx mem = gen_rtx_MEM (reg_mode, addr);
+      int n = info->first_gp_reg_save;
 
-	  set_mem_alias_set (mem, rs6000_sr_alias_set);
+      while (n < 32
+	     && !((regs_ever_live[n]
+		   && ! call_used_regs[n])
+		  || (n == RS6000_PIC_OFFSET_TABLE_REGNUM
+		      && ((DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+			  || (DEFAULT_ABI == ABI_DARWIN && flag_pic)))))
+	n++;
+
+      if (n < 32)
+	{
+	  rtvec p;
 
-	  RTVEC_ELT (p, i) = 
-	    gen_rtx_SET (VOIDmode,
-			 gen_rtx_REG (reg_mode, info->first_gp_reg_save + i),
-			 mem);
+	  p = rtvec_alloc (32 - n);
+	  for (i = 0; i < 32 - n; i++)
+	    {
+	      rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx, 
+				       GEN_INT (info->gp_save_offset 
+						+ sp_offset 
+						+ reg_size * i));
+	      rtx mem = gen_rtx_MEM (reg_mode, addr);
+
+	      set_mem_alias_set (mem, rs6000_sr_alias_set);
+
+	      RTVEC_ELT (p, i) = 
+		gen_rtx_SET (VOIDmode,
+			     gen_rtx_REG (reg_mode, n + i),
+			     mem);
+	    }
+	  emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
 	}
-      emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
     }
   else
     for (i = 0; i < 32 - info->first_gp_reg_save; i++)

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15  2:43 ` other/7114: ICE building strcoll.op from glibc-2.2.5 Alan Modra
@ 2002-07-15  5:22   ` Alan Modra
  2002-07-15 12:51   ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-15  5:22 UTC (permalink / raw)
  To: David Edelsohn, geoffk; +Cc: gcc-patches

On Mon, Jul 15, 2002 at 06:56:03PM +0930, Alan Modra wrote:
> This patch cures the testcase.  The !using_store_multiple code tests
> whether regs are live before saving.  We need to do something similar
> for using_store_multiple, in case all regs need not be saved.
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_prologue): Trim saved regs
> 	for -mmultiple case like -mno-multiple case.
> 	(rs6000_emit_epilogue): Likewise.

While this does cure the ICE, on further investigation, I'm not happy.
I think trimming off the saved regs in the -mno-multiple case is
wrong, and the above patch just copies wrong code.  Why isn't
first_reg_to_save giving us the right number?

The answer to this is that r30 isn't being marked as used in the
current_function_needs_context case.  PR 5967 is one result.
What's the right way to fix this?  Using get_hard_reg_initial_val
for the static chain reg?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15  2:43 ` other/7114: ICE building strcoll.op from glibc-2.2.5 Alan Modra
  2002-07-15  5:22   ` Alan Modra
@ 2002-07-15 12:51   ` Geoff Keating
  2002-07-15 16:54     ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-07-15 12:51 UTC (permalink / raw)
  To: amodra; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

> Date: Mon, 15 Jul 2002 18:56:03 +0930
> From: Alan Modra <amodra@bigpond.net.au>

> This patch cures the testcase.  The !using_store_multiple code tests
> whether regs are live before saving.  We need to do something similar
> for using_store_multiple, in case all regs need not be saved.

Those registers are actually saved, whether they need to be or not,
correct?

So the RTL generated is an accurate representation of the instruction,
and the bug must be elsewhere.


-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15 12:51   ` Geoff Keating
@ 2002-07-15 16:54     ` Alan Modra
  2002-07-15 18:38       ` Alan Modra
  2002-07-16 10:46       ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-15 16:54 UTC (permalink / raw)
  To: Geoff Keating; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

On Mon, Jul 15, 2002 at 12:43:02PM -0700, Geoff Keating wrote:
> > Date: Mon, 15 Jul 2002 18:56:03 +0930
> > From: Alan Modra <amodra@bigpond.net.au>
> 
> > This patch cures the testcase.  The !using_store_multiple code tests
> > whether regs are live before saving.  We need to do something similar
> > for using_store_multiple, in case all regs need not be saved.
> 
> Those registers are actually saved, whether they need to be or not,
> correct?
> 
> So the RTL generated is an accurate representation of the instruction,
> and the bug must be elsewhere.

The testcase saves r30 and r31, but both are marked unused (don't
appear in regs_ever_live).  Later rtl analysis decides that the
save instruction can be eliminated, thus the ICE.  The real bug is
that r30 is not marked used when current_function_needs_context.
This is also the reason for PR5967.

The code that I copied from the !using_store_multiple case just
papers over this bug.  So the above patch merely makes -mmultiple
and -mno-multiple consistently wrong.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15 16:54     ` Alan Modra
@ 2002-07-15 18:38       ` Alan Modra
  2002-07-15 22:08         ` Richard Henderson
  2002-07-16 11:23         ` Geoff Keating
  2002-07-16 10:46       ` Geoff Keating
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-15 18:38 UTC (permalink / raw)
  To: Geoff Keating, d.mueller, gcc-gnats, gcc-patches, dje

On Tue, Jul 16, 2002 at 09:20:27AM +0930, Alan Modra wrote:
> The testcase saves r30 and r31, but both are marked unused (don't
> appear in regs_ever_live).  Later rtl analysis decides that the
> save instruction can be eliminated, thus the ICE.  The real bug is
> that r30 is not marked used when current_function_needs_context.
> This is also the reason for PR5967.
> 
> The code that I copied from the !using_store_multiple case just
> papers over this bug.  So the above patch merely makes -mmultiple
> and -mno-multiple consistently wrong.

It gets worse.  rs6000/sysv4.h sets PROFILE_BEFORE_PROLOGUE.  That
means it is completely wrong for first_reg_to_save to decide r30
needs saving when profiling and current_function_needs_context.
It's too late to think about saving r30.  What really needs to
happen is one of

a) Don't use PROFILE_BEFORE_PROLOGUE.
or
b) Save the static chain reg on the stack somewhere before the mcount
   call.
or
c) Clobber r30 in a call to a nested function when profiling is
   enabled.

People would probably scream if we went with (a) as special purpose
implementations of mcount might make assumptions about the stack.
So, anyone want to speak up on the best way to solve this?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15 18:38       ` Alan Modra
@ 2002-07-15 22:08         ` Richard Henderson
  2002-07-16  0:03           ` Alan Modra
  2002-07-16 11:23         ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-07-15 22:08 UTC (permalink / raw)
  To: Geoff Keating, d.mueller, gcc-gnats, gcc-patches, dje

On Tue, Jul 16, 2002 at 11:08:16AM +0930, Alan Modra wrote:
> It gets worse.  rs6000/sysv4.h sets PROFILE_BEFORE_PROLOGUE.

Make rs6000/sysv4.h use PROFILE_HOOK instead.

I don't know what the ppc svr4 _mcount abi looks like, but I know that
you can emit any sort of rtl you want, which pretty much means that you
can avoid all of these register allocation problems by making an actual
register allocator take care of them.

r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15 22:08         ` Richard Henderson
@ 2002-07-16  0:03           ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-16  0:03 UTC (permalink / raw)
  To: Richard Henderson, Geoff Keating, d.mueller, gcc-gnats, gcc-patches, dje

On Mon, Jul 15, 2002 at 09:31:24PM -0700, Richard Henderson wrote:
> On Tue, Jul 16, 2002 at 11:08:16AM +0930, Alan Modra wrote:
> > It gets worse.  rs6000/sysv4.h sets PROFILE_BEFORE_PROLOGUE.
> 
> Make rs6000/sysv4.h use PROFILE_HOOK instead.

Aye, that's the nice way to do it.  However, on powerpc64-linux,
I've had kernel people complaining that the profiling code isn't
what they want:  All those register saves from the prologue
preceding the mcount call apparently are messing up accurate
count values, and it's hard for an mcount implementation to
adjust times, or so I'm told.  I implemented a simple hack to
do PROFILE_BEFORE_PROLOGUE on powerpc64-linux for people who
want it.

I suspect we'll get the same sort of complaint if we change
powerpc mcount.  There's also the issue that some special-purpose
mcount functions may expect to be called before the stack has
been adjusted.

I'm rapidly approaching the point where I either give up on this
problem, or simply remove support for profiling on nested
functions.  The current code just doesn't work.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15 16:54     ` Alan Modra
  2002-07-15 18:38       ` Alan Modra
@ 2002-07-16 10:46       ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2002-07-16 10:46 UTC (permalink / raw)
  To: amodra; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

> Date: Tue, 16 Jul 2002 09:20:27 +0930
> From: Alan Modra <amodra@bigpond.net.au>

> On Mon, Jul 15, 2002 at 12:43:02PM -0700, Geoff Keating wrote:
> > > Date: Mon, 15 Jul 2002 18:56:03 +0930
> > > From: Alan Modra <amodra@bigpond.net.au>
> > 
> > > This patch cures the testcase.  The !using_store_multiple code tests
> > > whether regs are live before saving.  We need to do something similar
> > > for using_store_multiple, in case all regs need not be saved.
> > 
> > Those registers are actually saved, whether they need to be or not,
> > correct?
> > 
> > So the RTL generated is an accurate representation of the instruction,
> > and the bug must be elsewhere.
> 
> The testcase saves r30 and r31, but both are marked unused (don't
> appear in regs_ever_live).  Later rtl analysis decides that the
> save instruction can be eliminated, thus the ICE.  The real bug is
> that r30 is not marked used when current_function_needs_context.
> This is also the reason for PR5967.
> 
> The code that I copied from the !using_store_multiple case just
> papers over this bug.  So the above patch merely makes -mmultiple
> and -mno-multiple consistently wrong.

No, the code below causes unused registers to actually not be saved,
which is correct (it does sometimes happen that all uses of a register
are eliminated after reload).  This can be done when individual loads
and stores are being used, you just don't emit that store.  It can't
be done when store-multiple is being used.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>
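
The distinction drawn in the message above, between individual stores and a store-multiple, can be sketched as register masks. This is a simplified model, not code from rs6000.c: the function names and the `needs_save` array are hypothetical, and `needs_save[n]` stands for the liveness test used in the patch (roughly `regs_ever_live[n] && !call_used_regs[n]`, plus the PIC register case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_GPRS 32

/* Individual stores: each register is tested, so any one can be skipped.
   Bit n of the result is set when register n is stored.  */
uint32_t
saved_individually (const bool needs_save[NUM_GPRS], int first)
{
  uint32_t mask = 0;
  assert (first >= 0 && first <= NUM_GPRS);
  for (int n = first; n < NUM_GPRS; n++)
    if (needs_save[n])
      mask |= UINT32_C (1) << n;
  return mask;
}

/* Store-multiple: always covers the contiguous run first..31.  */
uint32_t
saved_by_store_multiple (int first)
{
  uint32_t mask = 0;
  assert (first >= 0 && first <= NUM_GPRS);
  for (int n = first; n < NUM_GPRS; n++)
    mask |= UINT32_C (1) << n;
  return mask;
}

/* The scan added by the patch under discussion: skip leading registers
   that need no save.  It can only move the start of the run.  */
int
trim_first (const bool needs_save[NUM_GPRS], int first)
{
  int n = first;
  assert (first >= 0 && first <= NUM_GPRS);
  while (n < NUM_GPRS && !needs_save[n])
    n++;
  return n;
}
```

For example, if only r29 and r31 need saving, individual stores write exactly those two, while a store-multiple starting at r29 also writes r30; trimming cannot remove a register from the middle of the run.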

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-15 18:38       ` Alan Modra
  2002-07-15 22:08         ` Richard Henderson
@ 2002-07-16 11:23         ` Geoff Keating
  2002-07-16 18:51           ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-07-16 11:23 UTC (permalink / raw)
  To: amodra; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

> Date: Tue, 16 Jul 2002 11:08:16 +0930
> From: Alan Modra <amodra@bigpond.net.au>
> 
> On Tue, Jul 16, 2002 at 09:20:27AM +0930, Alan Modra wrote:
> > The testcase saves r30 and r31, but both are marked unused (don't
> > appear in regs_ever_live).  Later rtl analysis decides that the
> > save instruction can be eliminated, thus the ICE.  The real bug is
> > that r30 is not marked used when current_function_needs_context.
> > This is also the reason for PR5967.
> > 
> > The code that I copied from the !using_store_multiple case just
> > papers over this bug.  So the above patch merely makes -mmultiple
> > and -mno-multiple consistently wrong.
> 
> It gets worse.  rs6000/sysv4.h sets PROFILE_BEFORE_PROLOGUE.  That
> means it is completely wrong for first_reg_to_save to decide r30
> needs saving when profiling and current_function_needs_context.
> It's too late to think about saving r30.  What really needs to
> happen is one of
> 
> a) Don't use PROFILE_BEFORE_PROLOGUE.
> or
> b) Save the static chain reg on the stack somewhere before the mcount
>    call.
> or
> c) Clobber r30 in a call to a nested function when profiling is
>    enabled.
> 
> People would probably scream if we went with (a) as special purpose
> implementations of mcount might make assumptions about the stack.
> So, anyone want to speak up on the best way to solve this?

The profiling function isn't allowed to clobber r30.  It never has
been, so this should be no surprise.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-16 11:23         ` Geoff Keating
@ 2002-07-16 18:51           ` Alan Modra
  2002-07-16 22:07             ` Alan Modra
  2002-07-17  0:58             ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-16 18:51 UTC (permalink / raw)
  To: Geoff Keating; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

On Tue, Jul 16, 2002 at 10:48:14AM -0700, Geoff Keating wrote:
> The profiling function isn't allowed to clobber r30.  It never has
> been, so this should be no surprise.

I'm not quite sure what to make of this response.  We're talking about
this code from rs6000.c:10479

      if (current_function_needs_context)
	asm_fprintf (file, "\tmr %s,%s\n",
		     reg_names[30], reg_names[STATIC_CHAIN_REGNUM]);
      fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
      if (current_function_needs_context)
	asm_fprintf (file, "\tmr %s,%s\n",
		     reg_names[STATIC_CHAIN_REGNUM], reg_names[30]);

This is currently emitted _before_ the prologue in the nested function,
thus trashes r30.  I was considering the idea of adding a clobber of
r30 to CALL_INSN_FUNCTION_USAGE when calling a nested function.  That's
a workable solution, but means you need to zap r30 on all calls via
function pointers too.
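
To make the clobber concrete, here is a small C model of the sequence
quoted above.  It is only a sketch, not GCC code: the register array,
fake_mcount and profiler_stub_old are hypothetical names, and r11 as
the static chain register is an assumption about STATIC_CHAIN_REGNUM
on rs6000.

```c
#include <assert.h>

enum { NREGS = 32, STATIC_CHAIN = 11, R30 = 30 };

/* Stand-in for mcount: it may clobber volatile registers such as r11,
   but it preserves r30 as the ABI requires.  */
static void fake_mcount(long regs[NREGS])
{
  regs[STATIC_CHAIN] = -1;
}

/* Model of: mr r30,r11; bl mcount; mr r11,r30.
   Runs before the prologue, so nothing has saved the caller's r30.  */
void profiler_stub_old(long regs[NREGS])
{
  regs[R30] = regs[STATIC_CHAIN];
  fake_mcount(regs);
  regs[STATIC_CHAIN] = regs[R30];
}

/* Returns 1 if the caller's r30 value survives the stub, else 0.  */
int caller_r30_survives(long caller_r30, long chain)
{
  long regs[NREGS] = {0};

  regs[R30] = caller_r30;
  regs[STATIC_CHAIN] = chain;
  profiler_stub_old(regs);
  /* The static chain itself is always restored correctly.  */
  assert(regs[STATIC_CHAIN] == chain);
  return regs[R30] == caller_r30;
}
```

In this model the static chain always comes back intact, but the
caller's r30 only survives when it happened to equal the static chain
value already.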

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-16 18:51           ` Alan Modra
@ 2002-07-16 22:07             ` Alan Modra
  2002-07-17  0:58             ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-16 22:07 UTC (permalink / raw)
  To: Geoff Keating, d.mueller, gcc-gnats, gcc-patches, dje

	PR target/5967, PR other/7114
	* config/rs6000/rs6000.c (first_reg_to_save): Remove bogus
	adjustments to first_reg for profiling case.
	(output_function_profiler): Correct lr save slot for ABI_AIX_NODESC.
	Disable profiling for nested functions on ABI_V4, and for 64 bit
	code on both ABI_V4 and ABI_AIX_NODESC.  Save static chain reg
	to sp + 12 on ABI_AIX_NODESC.

Rationale:

	* config/rs6000/rs6000.c (first_reg_to_save): Remove bogus
	adjustments to first_reg for profiling case.
first_reg_to_save doesn't need to do anything special for any of these
registers as profiling is done via PROFILE_HOOK when ABI_AIX or
ABI_DARWIN.  The normal register allocation code will set up
regs_ever_live for us.  We're killing profiling on nested functions
when ABI_V4.

	(output_function_profiler): Correct lr save slot for ABI_AIX_NODESC.
ABI_AIX_NODESC saves lr to sp + 8.  This change is perhaps a little
contentious as existing mcount implementations may take into account
the current ABI breakage.

	Disable profiling for nested functions on ABI_V4
See the comment below.

	and for 64 bit code on both ABI_V4 and ABI_AIX_NODESC.
The instructions emitted here are 32 bit ones.

	Save static chain reg to sp + 12 on ABI_AIX_NODESC.
We need to save it somewhere, or disable profiling of nested functions
for ABI_AIX_NODESC too.
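
As a summary of the slots this version of the patch uses, a minimal
sketch follows.  The enum and the two helper functions are
hypothetical and exist only for illustration; the offsets are the
save_lr and save_chain values from the patch below, and -1 stands for
"no slot, profiling of nested functions is disabled".

```c
#include <assert.h>

/* Local stand-ins for the two ABIs handled by this switch case.  */
enum abi { ABI_V4, ABI_AIX_NODESC };

/* Byte offset from sp (r1) at which the profiler stub stores lr.  */
int lr_save_offset(enum abi abi)
{
  assert(abi == ABI_V4 || abi == ABI_AIX_NODESC);
  return abi == ABI_V4 ? 4 : 8;
}

/* Byte offset from sp for the static chain register, or -1 when the
   ABI offers no slot (ABI_V4 in this version of the patch).  */
int static_chain_save_offset(enum abi abi)
{
  assert(abi == ABI_V4 || abi == ABI_AIX_NODESC);
  return abi == ABI_AIX_NODESC ? 12 : -1;
}
```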

Bootstrapped (with a slightly different patch) and regression tested
on powerpc-linux.  I'm re-running the bootstrap now.  Built
powerpc-linux and powerpc64-linux glibc with --enable-profile to test
the ABI_V4 and ABI_AIX changes.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.343
diff -u -p -r1.343 rs6000.c
--- rs6000.c	16 Jul 2002 02:16:41 -0000	1.343
+++ rs6000.c	17 Jul 2002 03:43:03 -0000
@@ -7357,53 +7357,6 @@ first_reg_to_save ()
 		    || (DEFAULT_ABI == ABI_DARWIN && flag_pic)))))
       break;
 
-  if (current_function_profile)
-    {
-      /* AIX must save/restore every register that contains a parameter
-	 before/after the .__mcount call plus an additional register
-	 for the static chain, if needed; use registers from 30 down to 22
-	 to do this.  */
-      if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
-	{
-	  int last_parm_reg, profile_first_reg;
-
-	  /* Figure out last used parameter register.  The proper thing
-	     to do is to walk incoming args of the function.  A function
-	     might have live parameter registers even if it has no
-	     incoming args.  */
-	  for (last_parm_reg = 10;
-	       last_parm_reg > 2 && ! regs_ever_live [last_parm_reg];
-	       last_parm_reg--)
-	    ;
-
-	  /* Calculate first reg for saving parameter registers
-	     and static chain.
-	     Skip reg 31 which may contain the frame pointer.  */
-	  profile_first_reg = (33 - last_parm_reg
-			       - (current_function_needs_context ? 1 : 0));
-#if TARGET_MACHO
-          /* Need to skip another reg to account for R31 being PICBASE
-             (when flag_pic is set) or R30 being used as the frame
-             pointer (when flag_pic is not set).  */
-          --profile_first_reg;
-#endif
-	  /* Do not save frame pointer if no parameters needs to be saved.  */
-	  if (profile_first_reg == 31)
-	    profile_first_reg = 32;
-
-	  if (first_reg > profile_first_reg)
-	    first_reg = profile_first_reg;
-	}
-
-      /* SVR4 may need one register to preserve the static chain.  */
-      else if (current_function_needs_context)
-	{
-	  /* Skip reg 31 which may contain the frame pointer.  */
-	  if (first_reg > 30)
-	    first_reg = 30;
-	}
-    }
-
 #if TARGET_MACHO
   if (flag_pic && current_function_uses_pic_offset_table &&
       (first_reg > RS6000_PIC_OFFSET_TABLE_REGNUM))
@@ -10430,6 +10383,8 @@ output_function_profiler (file, labelno)
   int labelno;
 {
   char buf[100];
+  int save_lr = 8;
+  int save_chain = 12;
 
   ASM_GENERATE_INTERNAL_LABEL (buf, "LP", labelno);
   switch (DEFAULT_ABI)
@@ -10438,13 +10393,35 @@ output_function_profiler (file, labelno)
       abort ();
 
     case ABI_V4:
+      if (current_function_needs_context)
+	{
+	  /* There's no safe way to save STATIC_CHAIN_REGNUM around
+	     the mcount call.  The ABI_V4 stack has no slot available,
+	     and since we are PROFILE_BEFORE_PROLOGUE, we can't use a
+	     call-saved register.  Adjusting the stack to give us some
+	     space might confuse special purpose mcount functions.
+	     And finally, clobbering a callee saved register for the
+	     purpose of saving the static chain when calling a nested
+	     function is difficult to get right;  You might be calling
+	     a nested function via a function pointer.  */
+	  warning ("no profiling on nested functions for this ABI");
+	  return;
+	}
+      save_lr = 4;
+      /* Fall through.  */
+
     case ABI_AIX_NODESC:
+      if (!TARGET_32BIT)
+	{
+	  warning ("no profiling of 64-bit code for this ABI");
+	  return;
+	}
       fprintf (file, "\tmflr %s\n", reg_names[0]);
       if (flag_pic == 1)
 	{
 	  fputs ("\tbl _GLOBAL_OFFSET_TABLE_@local-4\n", file);
-	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
-		       reg_names[0], reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[0], save_lr, reg_names[1]);
 	  asm_fprintf (file, "\tmflr %s\n", reg_names[12]);
 	  asm_fprintf (file, "\t{l|lwz} %s,", reg_names[0]);
 	  assemble_name (file, buf);
@@ -10452,8 +10429,8 @@ output_function_profiler (file, labelno)
 	}
       else if (flag_pic > 1)
 	{
-	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
-		       reg_names[0], reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[0], save_lr, reg_names[1]);
 	  /* Now, we need to get the address of the label.  */
 	  fputs ("\tbl 1f\n\t.long ", file);
 	  assemble_name (file, buf);
@@ -10469,20 +10446,25 @@ output_function_profiler (file, labelno)
 	  asm_fprintf (file, "\t{liu|lis} %s,", reg_names[12]);
 	  assemble_name (file, buf);
 	  fputs ("@ha\n", file);
-	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
-		       reg_names[0], reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[0], save_lr, reg_names[1]);
 	  asm_fprintf (file, "\t{cal|la} %s,", reg_names[0]);
 	  assemble_name (file, buf);
 	  asm_fprintf (file, "@l(%s)\n", reg_names[12]);
 	}
 
       if (current_function_needs_context)
-	asm_fprintf (file, "\tmr %s,%s\n",
-		     reg_names[30], reg_names[STATIC_CHAIN_REGNUM]);
-      fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
-      if (current_function_needs_context)
-	asm_fprintf (file, "\tmr %s,%s\n",
-		     reg_names[STATIC_CHAIN_REGNUM], reg_names[30]);
+	{
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[STATIC_CHAIN_REGNUM],
+		       save_chain, reg_names[1]);
+	  fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
+	  asm_fprintf (file, "\t{l|lwz} %s,%d(%s)\n",
+		       reg_names[STATIC_CHAIN_REGNUM],
+		       save_chain, reg_names[1]);
+	}
+      else
+	fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
       break;
 
     case ABI_AIX:

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-16 18:51           ` Alan Modra
  2002-07-16 22:07             ` Alan Modra
@ 2002-07-17  0:58             ` Geoff Keating
  2002-07-17  2:04               ` Alan Modra
  2002-07-17  8:45               ` David Edelsohn
  1 sibling, 2 replies; 875+ messages in thread
From: Geoff Keating @ 2002-07-17  0:58 UTC (permalink / raw)
  To: amodra; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

> Date: Wed, 17 Jul 2002 11:17:39 +0930
> From: Alan Modra <amodra@bigpond.net.au>

> On Tue, Jul 16, 2002 at 10:48:14AM -0700, Geoff Keating wrote:
> > The profiling function isn't allowed to clobber r30.  It never has
> > been, so this should be no surprise.
> 
> I'm not quite sure what to make of this response.  We're talking about
> this code from rs6000.c:10479
> 
>       if (current_function_needs_context)
> 	asm_fprintf (file, "\tmr %s,%s\n",
> 		     reg_names[30], reg_names[STATIC_CHAIN_REGNUM]);
>       fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
>       if (current_function_needs_context)
> 	asm_fprintf (file, "\tmr %s,%s\n",
> 		     reg_names[STATIC_CHAIN_REGNUM], reg_names[30]);

I see, I was confused.  I thought we were already using r30 for
STATIC_CHAIN_REGNUM.  The ABI specifies r31, but I see we can't use
that if we want trampolines to be efficient.

> This is currently emitted _before_ the prologue in the nested function,
> thus trashes r30.  I was considering the idea of adding a clobber of
> r30 to CALL_INSN_FUNCTION_USAGE when calling a nested function.  That's
> a workable solution, but means you need to zap r30 on all calls via
> function pointers too.

From David's mail message when he put the code in:

     I have ripped out all of the stack PUSH/POP stuff that was
     causing ABI problems and replaced it with explicit moves to a
     temporary register.  This includes having the SVR4 ABI act more
     like AIX using a register instead of the dangerous stack
     save/restore game.  I could not test the SVR4 changes, so I would
     appreciate if the LinuxPPC testers would make sure that I have
     not broken anything when profiling is enabled.

So, thanks for testing it!  We now know this doesn't work. :-)

So, why don't we go back to the push/pop implementation, but this time
do it properly?  We'd only need to push/pop in the (rare)
nested-function case.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17  0:58             ` Geoff Keating
@ 2002-07-17  2:04               ` Alan Modra
  2002-07-17 10:42                 ` David Edelsohn
  2002-07-17 12:10                 ` Geoff Keating
  2002-07-17  8:45               ` David Edelsohn
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-17  2:04 UTC (permalink / raw)
  To: Geoff Keating; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

On Wed, Jul 17, 2002 at 12:07:11AM -0700, Geoff Keating wrote:
> So, why don't we go back to the push/pop implementation, but this time
> do it properly?  We'd only need to push/pop in the (rare)
> nested-function case.

I wasn't aware that powerpc used that scheme previously, and therefore
was worried that some mcount implementation might peek at the stack.

Here we go.

	* config/rs6000/rs6000.c (first_reg_to_save): Remove bogus
	adjustments to first_reg for profiling case.
	(output_function_profiler): Correct lr save slot for ABI_AIX_NODESC.
	Disable profiling for 64 bit code on both ABI_V4 and ABI_AIX_NODESC.
	Save static chain reg to sp + 12 on ABI_AIX_NODESC.
	* config/rs6000/sysv4.h (ASM_OUTPUT_REG_PUSH): Define.
	(ASM_OUTPUT_REG_POP): Define.
	* config/rs6000/linux64.h (ASM_OUTPUT_REG_PUSH): Undef.
	(ASM_OUTPUT_REG_POP): Undef.

Rationale:

	* config/rs6000/rs6000.c (first_reg_to_save): Remove bogus
	adjustments to first_reg for profiling case.
first_reg_to_save doesn't need to do anything special for any of these
registers as profiling is done via PROFILE_HOOK when ABI_AIX or
ABI_DARWIN.  The normal register allocation code will set up
regs_ever_live for us.  We're also not trying to use a reg when ABI_V4.

	(output_function_profiler): Correct lr save slot for ABI_AIX_NODESC.
ABI_AIX_NODESC saves lr to sp + 8.  This change is perhaps a little
contentious as existing mcount implementations may take into account
the current ABI breakage.

	Disable profiling for 64 bit code on both ABI_V4 and ABI_AIX_NODESC.
The instructions emitted here are 32 bit ones.  Fix this with a later
patch.

	Save static chain reg to sp + 12 on ABI_AIX_NODESC.
We need to save it somewhere.  This seems a likely spot.

	* config/rs6000/sysv4.h (ASM_OUTPUT_REG_PUSH): Define.
	(ASM_OUTPUT_REG_POP): Define.
Code resurrected from prior to Mon Mar 15 22:45:25 1999 delta, but
with DEFAULT_ABI == ABI_V4 test added.


powerpc-linux bootstrap on mainline seems to be broken at the moment.

internal compiler error: Internal compiler error
 in tree_low_cst, at tree.c:3312

so I'm in the process of bootstrapping this one on the 3.1 branch.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.344
diff -u -p -r1.344 rs6000.c
--- gcc/config/rs6000/rs6000.c	16 Jul 2002 20:59:03 -0000	1.344
+++ gcc/config/rs6000/rs6000.c	17 Jul 2002 08:19:35 -0000
@@ -7356,53 +7356,6 @@ first_reg_to_save ()
 		    || (DEFAULT_ABI == ABI_DARWIN && flag_pic)))))
       break;
 
-  if (current_function_profile)
-    {
-      /* AIX must save/restore every register that contains a parameter
-	 before/after the .__mcount call plus an additional register
-	 for the static chain, if needed; use registers from 30 down to 22
-	 to do this.  */
-      if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
-	{
-	  int last_parm_reg, profile_first_reg;
-
-	  /* Figure out last used parameter register.  The proper thing
-	     to do is to walk incoming args of the function.  A function
-	     might have live parameter registers even if it has no
-	     incoming args.  */
-	  for (last_parm_reg = 10;
-	       last_parm_reg > 2 && ! regs_ever_live [last_parm_reg];
-	       last_parm_reg--)
-	    ;
-
-	  /* Calculate first reg for saving parameter registers
-	     and static chain.
-	     Skip reg 31 which may contain the frame pointer.  */
-	  profile_first_reg = (33 - last_parm_reg
-			       - (current_function_needs_context ? 1 : 0));
-#if TARGET_MACHO
-          /* Need to skip another reg to account for R31 being PICBASE
-             (when flag_pic is set) or R30 being used as the frame
-             pointer (when flag_pic is not set).  */
-          --profile_first_reg;
-#endif
-	  /* Do not save frame pointer if no parameters needs to be saved.  */
-	  if (profile_first_reg == 31)
-	    profile_first_reg = 32;
-
-	  if (first_reg > profile_first_reg)
-	    first_reg = profile_first_reg;
-	}
-
-      /* SVR4 may need one register to preserve the static chain.  */
-      else if (current_function_needs_context)
-	{
-	  /* Skip reg 31 which may contain the frame pointer.  */
-	  if (first_reg > 30)
-	    first_reg = 30;
-	}
-    }
-
 #if TARGET_MACHO
   if (flag_pic && current_function_uses_pic_offset_table &&
       (first_reg > RS6000_PIC_OFFSET_TABLE_REGNUM))
@@ -10429,6 +10382,7 @@ output_function_profiler (file, labelno)
   int labelno;
 {
   char buf[100];
+  int save_lr = 8;
 
   ASM_GENERATE_INTERNAL_LABEL (buf, "LP", labelno);
   switch (DEFAULT_ABI)
@@ -10437,13 +10391,21 @@ output_function_profiler (file, labelno)
       abort ();
 
     case ABI_V4:
+      save_lr = 4;
+      /* Fall through.  */
+
     case ABI_AIX_NODESC:
+      if (!TARGET_32BIT)
+	{
+	  warning ("no profiling of 64-bit code for this ABI");
+	  return;
+	}
       fprintf (file, "\tmflr %s\n", reg_names[0]);
       if (flag_pic == 1)
 	{
 	  fputs ("\tbl _GLOBAL_OFFSET_TABLE_@local-4\n", file);
-	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
-		       reg_names[0], reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[0], save_lr, reg_names[1]);
 	  asm_fprintf (file, "\tmflr %s\n", reg_names[12]);
 	  asm_fprintf (file, "\t{l|lwz} %s,", reg_names[0]);
 	  assemble_name (file, buf);
@@ -10451,8 +10413,8 @@ output_function_profiler (file, labelno)
 	}
       else if (flag_pic > 1)
 	{
-	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
-		       reg_names[0], reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[0], save_lr, reg_names[1]);
 	  /* Now, we need to get the address of the label.  */
 	  fputs ("\tbl 1f\n\t.long ", file);
 	  assemble_name (file, buf);
@@ -10468,27 +10430,32 @@ output_function_profiler (file, labelno)
 	  asm_fprintf (file, "\t{liu|lis} %s,", reg_names[12]);
 	  assemble_name (file, buf);
 	  fputs ("@ha\n", file);
-	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
-		       reg_names[0], reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[0], save_lr, reg_names[1]);
 	  asm_fprintf (file, "\t{cal|la} %s,", reg_names[0]);
 	  assemble_name (file, buf);
 	  asm_fprintf (file, "@l(%s)\n", reg_names[12]);
 	}
 
-      if (current_function_needs_context)
-	asm_fprintf (file, "\tmr %s,%s\n",
-		     reg_names[30], reg_names[STATIC_CHAIN_REGNUM]);
-      fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
-      if (current_function_needs_context)
-	asm_fprintf (file, "\tmr %s,%s\n",
-		     reg_names[STATIC_CHAIN_REGNUM], reg_names[30]);
+      if (current_function_needs_context && DEFAULT_ABI == ABI_AIX_NODESC)
+	{
+	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
+		       reg_names[STATIC_CHAIN_REGNUM],
+		       12, reg_names[1]);
+	  fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
+	  asm_fprintf (file, "\t{l|lwz} %s,%d(%s)\n",
+		       reg_names[STATIC_CHAIN_REGNUM],
+		       12, reg_names[1]);
+	}
+      else
+	/* ABI_V4 saves the static chain reg with ASM_OUTPUT_REG_PUSH.  */
+	fprintf (file, "\tbl %s\n", RS6000_MCOUNT);
       break;
 
     case ABI_AIX:
     case ABI_DARWIN:
       /* Don't do anything, done in output_profile_hook ().  */
       break;
-
     }
 }
 
Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.98
diff -u -p -r1.98 sysv4.h
--- gcc/config/rs6000/sysv4.h	10 Jul 2002 00:33:51 -0000	1.98
+++ gcc/config/rs6000/sysv4.h	17 Jul 2002 08:19:36 -0000
@@ -736,6 +736,38 @@ do {									\
   ASM_OUTPUT_ALIGNED_LOCAL (FILE, NAME, SIZE, ALIGN);			\
 } while (0)
 
+/* This is how to output code to push a register on the stack.
+   It need not be very fast code.
+
+   On the rs6000, we must keep the backchain up to date.  In order
+   to simplify things, always allocate 16 bytes for a push (System V
+   wants to keep stack aligned to a 16 byte boundary).  */
+
+#define	ASM_OUTPUT_REG_PUSH(FILE, REGNO)				\
+do {									\
+  if (DEFAULT_ABI == ABI_V4)						\
+    asm_fprintf (FILE,							\
+		 (TARGET_32BIT						\
+		  ? "\t{stu|stwu} %s,-16(%s)\n\t{st|stw} %s,12(%s)\n"	\
+		  : "\tstdu %s,-32(%s)\n\tstd %s,24(%s)\n"),		\
+		 reg_names[1], reg_names[1], reg_names[REGNO],		\
+		 reg_names[1]);						\
+} while (0)
+
+/* This is how to output an insn to pop a register from the stack.
+   It need not be very fast code.  */
+
+#define	ASM_OUTPUT_REG_POP(FILE, REGNO)					\
+do {									\
+  if (DEFAULT_ABI == ABI_V4)						\
+    asm_fprintf (FILE,							\
+		 (TARGET_32BIT						\
+		  ? "\t{l|lwz} %s,12(%s)\n\t{ai|addic} %s,%s,16\n"	\
+		  : "\tld %s,24(%s)\n\t{ai|addic} %s,%s,32\n"),		\
+		 reg_names[REGNO], reg_names[1], reg_names[1],		\
+		 reg_names[1]);						\
+} while (0)
+
 /* Switch  Recognition by gcc.c.  Add -G xx support.  */
 
 /* Override svr4.h definition.  */
Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.21
diff -u -p -r1.21 linux64.h
--- gcc/config/rs6000/linux64.h	11 Jul 2002 00:23:16 -0000	1.21
+++ gcc/config/rs6000/linux64.h	17 Jul 2002 08:19:25 -0000
@@ -329,3 +329,7 @@ do									\
     sym_lineno += 1;							\
   }									\
 while (0)
+
+/* Override sysv4.h as these are ABI_V4 only.  */
+#undef	ASM_OUTPUT_REG_PUSH
+#undef	ASM_OUTPUT_REG_POP

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17  0:58             ` Geoff Keating
  2002-07-17  2:04               ` Alan Modra
@ 2002-07-17  8:45               ` David Edelsohn
  2002-07-17 12:26                 ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-17  8:45 UTC (permalink / raw)
  To: Geoff Keating; +Cc: amodra, d.mueller, gcc-gnats, gcc-patches

>>>>> Geoff Keating writes:

Geoff> So, thanks for testing it!  We now know this doesn't work. :-)

	The patch was tested on SVR4 by Franz and showed no new failures:

http://gcc.gnu.org/ml/gcc-patches/1999-03n/msg00584.html

PUSH/POP cannot work on PowerPC.  On AIX PUSH/POP were corrupting the
stack.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17  2:04               ` Alan Modra
@ 2002-07-17 10:42                 ` David Edelsohn
  2002-07-17 12:10                 ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-07-17 10:42 UTC (permalink / raw)
  To: Alan Modra; +Cc: Geoff Keating, d.mueller, gcc-gnats, gcc-patches

	If you want to define PUSH/POP for sysv4.h, that's fine.  I would
recommend removing the !TARGET_32BIT case for those macros and only using
the PowerPC mnemonics, to simplify things.

>        * config/rs6000/rs6000.c (first_reg_to_save): Remove bogus
>        adjustments to first_reg for profiling case.
> first_reg_to_save doesn't need to do anything special for any of these
> registers as profiling is done via PROFILE_HOOK when ABI_AIX or
> ABI_DARWIN.  The normal register allocation code will set up
> regs_ever_live for us.  We're also not trying to use a reg when ABI_V4.

	I guess this works now because of the scheduled prologue.

	I think this is okay, once the trunk can bootstrap again and the
patch can be tested there.  Joern has a patch which fixes the regrename.c
bug he introduced

http://gcc.gnu.org/ml/gcc-patches/2002-07/msg00840.html

but no one with global write privileges has approved it.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17  2:04               ` Alan Modra
  2002-07-17 10:42                 ` David Edelsohn
@ 2002-07-17 12:10                 ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2002-07-17 12:10 UTC (permalink / raw)
  To: amodra; +Cc: d.mueller, gcc-gnats, gcc-patches, dje

> Date: Wed, 17 Jul 2002 18:30:49 +0930
> From: Alan Modra <amodra@bigpond.net.au>
> Cc: d.mueller@elsoft.ch, gcc-gnats@gcc.gnu.org, gcc-patches@gcc.gnu.org,
>    dje@watson.ibm.com

> On Wed, Jul 17, 2002 at 12:07:11AM -0700, Geoff Keating wrote:
> > So, why don't we go back to the push/pop implementation, but this time
> > do it properly?  We'd only need to push/pop in the (rare)
> > nested-function case.
> 
> I wasn't aware that powerpc used that scheme previously, and therefore
> was worried that some mcount implementation might peek at the stack.
> 
> Here we go.
> 
> 	* config/rs6000/rs6000.c (first_reg_to_save): Remove bogus
> 	adjustments to first_reg for profiling case.
> 	(output_function_profiler): Correct lr save slot for ABI_AIX_NODESC.
> 	Disable profiling for 64 bit code on both ABI_V4 and ABI_AIX_NODESC.
> 	Save static chain reg to sp + 12 on ABI_AIX_NODESC.
> 	* config/rs6000/sysv4.h (ASM_OUTPUT_REG_PUSH): Define.
> 	(ASM_OUTPUT_REG_POP): Define.
> 	* config/rs6000/linux64.h (ASM_OUTPUT_REG_PUSH): Undef.
> 	(ASM_OUTPUT_REG_POP): Undef.

This is OK.  I'm pretty sure it's right, and completely sure it's no
worse than before. :-)


-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17  8:45               ` David Edelsohn
@ 2002-07-17 12:26                 ` Geoff Keating
  2002-07-17 14:05                   ` David Edelsohn
  2002-07-17 19:20                   ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: Geoff Keating @ 2002-07-17 12:26 UTC (permalink / raw)
  To: dje; +Cc: amodra, d.mueller, gcc-gnats, gcc-patches

> cc: amodra@bigpond.net.au, d.mueller@elsoft.ch, gcc-gnats@gcc.gnu.org,
>    gcc-patches@gcc.gnu.org
> Date: Wed, 17 Jul 2002 11:42:02 -0400
> From: David Edelsohn <dje@watson.ibm.com>
> 
> >>>>> Geoff Keating writes:
> 
> Geoff> So, thanks for testing it!  We now know this doesn't work. :-)
> 
> 	The patch was tested on SVR4 by Franz and showed no new failures:
> 
> http://gcc.gnu.org/ml/gcc-patches/1999-03n/msg00584.html

The testsuite probably doesn't check that r30 is not clobbered in a
nested function when profiling is switched on.  Actually, Alan, could
you write a test for that?  (That's something I missed while reviewing
your patch.)

> PUSH/POP cannot work on PowerPC.  On AIX PUSH/POP were corrupting the
> stack.

Because they were implemented wrongly?

Clearly, push/pop can work, because procedures push and pop call
frames all the time.  It's just necessary to do it the right way. 

For AIX, of course, push/pop is unnecessary, and Alan's patch didn't
add it.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17 12:26                 ` Geoff Keating
@ 2002-07-17 14:05                   ` David Edelsohn
  2002-07-17 19:20                   ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-07-17 14:05 UTC (permalink / raw)
  To: Geoff Keating; +Cc: amodra, d.mueller, gcc-gnats, gcc-patches

>>>>> Geoff Keating writes:

>> PUSH/POP cannot work on PowerPC.  On AIX PUSH/POP were corrupting the
>> stack.

Geoff> Because they were implemented wrongly?

Geoff> Clearly, push/pop can work, because procedures push and pop call
Geoff> frames all the time.  It's just necessary to do it the right way. 

	We are discussing PUSH/POP in the context of the macros to save
and restore individual registers used when generating a profiler call in
final.c, so GCC's ability to correctly allocate call frames is irrelevant.

	The discussion about the PUSH/POP macros in March 1999 provides
more background about the problem: the macros were not respecting the
ABI-defined location of dynamic allocation on the stack.  PUSH/POP ideally
should work like alloca and open a hole in the stack frame at the alloca
location.  The PUSH/POP macros modify the stack pointer without updating
the compiler's internal knowledge about the stack layout and assume that
the alloca area is adjacent to the top of the stack, so adjusting the
stack pointer will not have any bad consequences, e.g., if a function call
is invoked.

	On AIX, things went haywire because the PUSH/POP macros only
copied the backchain pointer, not all of the ABI-specified area between
the backchain pointer and the alloca area.  When GCC called the profiler
function, the call frame had random garbage in ABI-specified locations,
causing the application to run off the rails when an ABI stack location
was accessed in the called function.  Copying the backchain *and* CR *and*
LR may be sufficient for the AIX case for this particular use.

	Because SVR4 PowerPC invokes the profiler before the prologue and
has a slightly different stack layout, the simplistic definition of
PUSH/POP may work correctly.
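
	For illustration only, the 32-bit push/pop sequence from the
sysv4.h patch (stwu r1,-16(r1); stw reg,12(r1) and lwz reg,12(r1);
addic r1,r1,16) can be modelled in C as below.  This is a sketch under
stated assumptions: the machine struct and helper names are
hypothetical, and the stack is reduced to a small word array with r1
as a byte offset into it.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };

typedef struct {
  uint32_t mem[STACK_BYTES / 4];  /* word-addressed model of the stack */
  uint32_t sp;                    /* r1, as a byte offset into mem */
} machine;

static void store_word(machine *m, uint32_t addr, uint32_t value)
{
  assert(addr % 4 == 0 && addr < STACK_BYTES);
  m->mem[addr / 4] = value;
}

static uint32_t load_word(const machine *m, uint32_t addr)
{
  assert(addr % 4 == 0 && addr < STACK_BYTES);
  return m->mem[addr / 4];
}

/* Model of: stwu r1,-16(r1); stw reg,12(r1).
   stwu stores the old r1 at the new stack top, which keeps the
   backchain word valid, and then updates r1.  */
void reg_push(machine *m, uint32_t value)
{
  uint32_t old_sp = m->sp;

  assert(old_sp >= 16);
  m->sp = old_sp - 16;
  store_word(m, m->sp, old_sp);      /* backchain */
  store_word(m, m->sp + 12, value);  /* saved register */
}

/* Model of: lwz reg,12(r1); addic r1,r1,16.  */
uint32_t reg_pop(machine *m)
{
  uint32_t value = load_word(m, m->sp + 12);

  m->sp += 16;
  return value;
}
```

	The model only tracks the backchain word and the saved
register; it says nothing about the other ABI-specified slots that
caused the AIX problem described above.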

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17 12:26                 ` Geoff Keating
  2002-07-17 14:05                   ` David Edelsohn
@ 2002-07-17 19:20                   ` Alan Modra
  2002-07-17 19:45                     ` David Edelsohn
  2002-07-17 20:50                     ` David Edelsohn
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-17 19:20 UTC (permalink / raw)
  To: Geoff Keating; +Cc: dje, gcc-patches

On Wed, Jul 17, 2002 at 12:10:12PM -0700, Geoff Keating wrote:
> The testsuite probably doesn't check that r30 is not clobbered in a
> nested function when profiling is switched on.  Actually, Alan, could
> you write a test for that?

Can we assume a profiling library is available?

Implements dje's suggestion re. !TARGET_32BIT
gcc/ChangeLog
	* config/rs6000/sysv4.h (ASM_OUTPUT_REG_PUSH): Remove 64-bit support.
	(ASM_OUTPUT_REG_POP): Likewise.

gcc/testsuite/ChangeLog
	* gcc.dg/nest.c: New.

Rather embarrassingly, this new testcase fails with a segfault on
powerpc64-linux, both before and after my change to the profiling
code.  So I have another bug to look into.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.99
diff -u -p -r1.99 sysv4.h
--- gcc/config/rs6000/sysv4.h	18 Jul 2002 00:15:31 -0000	1.99
+++ gcc/config/rs6000/sysv4.h	18 Jul 2002 00:45:01 -0000
@@ -747,9 +747,7 @@ do {									\
 do {									\
   if (DEFAULT_ABI == ABI_V4)						\
     asm_fprintf (FILE,							\
-		 (TARGET_32BIT						\
-		  ? "\t{stu|stwu} %s,-16(%s)\n\t{st|stw} %s,12(%s)\n"	\
-		  : "\tstdu %s,-32(%s)\n\tstd %s,24(%s)\n"),		\
+		 "\t{stu|stwu} %s,-16(%s)\n\t{st|stw} %s,12(%s)\n",	\
 		 reg_names[1], reg_names[1], reg_names[REGNO],		\
 		 reg_names[1]);						\
 } while (0)
@@ -761,9 +759,7 @@ do {									\
 do {									\
   if (DEFAULT_ABI == ABI_V4)						\
     asm_fprintf (FILE,							\
-		 (TARGET_32BIT						\
-		  ? "\t{l|lwz} %s,12(%s)\n\t{ai|addic} %s,%s,16\n"	\
-		  : "\tld %s,24(%s)\n\t{ai|addic} %s,%s,32\n"),		\
+		 "\t{l|lwz} %s,12(%s)\n\t{ai|addic} %s,%s,16\n",	\
 		 reg_names[REGNO], reg_names[1], reg_names[1],		\
 		 reg_names[1]);						\
 } while (0)
Index: gcc/testsuite/gcc.dg/nest.c
===================================================================
RCS file: gcc/testsuite/gcc.dg/nest.c
diff -N gcc/testsuite/gcc.dg/nest.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ gcc/testsuite/gcc.dg/nest.c	18 Jul 2002 00:45:10 -0000
@@ -0,0 +1,21 @@
+/* PR 5967, PR 7114 */
+/* { dg-do run } */
+/* { dg-options "-O2 -pg" } */
+
+long foo (long x)
+{
+  long i, sum = 0;
+  long bar (long z) { return z * 2; }
+
+  for (i = 0; i < x; i++)
+    sum += bar (i);
+
+  return sum;
+}
+
+int main (void)
+{
+  if (foo(10) != 90)
+    abort ();
+  return 0;
+}


* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17 19:20                   ` Alan Modra
@ 2002-07-17 19:45                     ` David Edelsohn
  2002-07-17 20:50                     ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-07-17 19:45 UTC (permalink / raw)
  To: Alan Modra; +Cc: Geoff Keating, gcc-patches

	* config/rs6000/sysv4.h (ASM_OUTPUT_REG_PUSH): Remove 64-bit support.
	(ASM_OUTPUT_REG_POP): Likewise.

This is fine.  You probably should change the {l|lwz} to just lwz, etc.
because those macros only target PowerPC mnemonics.
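
For readers unfamiliar with the brace syntax in those strings, here is a
minimal sketch of how a "{alt0|alt1}" group reduces to a single mnemonic.
The helper expand_dialect is hypothetical and is not GCC's asm_fprintf; it
assumes alternative 0 is the old POWER mnemonic and alternative 1 the
PowerPC one, as the strings in the patch suggest.

```c
#include <stddef.h>

/* Copy TMPL to OUT (capacity OUT_SIZE), keeping only alternative number
   DIALECT from every "{alt0|alt1|...}" group.  Hypothetical helper for
   illustration only; it is not the routine GCC uses.  */
static void
expand_dialect (const char *tmpl, int dialect, char *out, size_t out_size)
{
  size_t n = 0;
  int in_group = 0;	/* nonzero while between '{' and '}' */
  int alt = 0;		/* index of the alternative being scanned */

  if (out_size == 0)
    return;

  for (; *tmpl != '\0'; tmpl++)
    {
      char c = *tmpl;

      if (!in_group && c == '{')
	{
	  in_group = 1;
	  alt = 0;
	}
      else if (in_group && c == '|')
	alt++;
      else if (in_group && c == '}')
	in_group = 0;
      else if ((!in_group || alt == dialect) && n + 1 < out_size)
	out[n++] = c;
    }
  out[n] = '\0';
}
```

With dialect 1 the template "{l|lwz} 3,12(1)" reduces to "lwz 3,12(1)",
which is why a macro that only ever targets PowerPC mnemonics could spell
out lwz directly.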

David


* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17 19:20                   ` Alan Modra
  2002-07-17 19:45                     ` David Edelsohn
@ 2002-07-17 20:50                     ` David Edelsohn
  2002-07-17 20:52                       ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-17 20:50 UTC (permalink / raw)
  To: Alan Modra; +Cc: Geoff Keating, gcc-patches

	The nest.c testcase does not fail on AIX -- neither 32-bit mode
nor 64-bit mode.  I thought that user mode linux64 basically used the same
profiling definitions as AIX, so I am surprised that I do not see the same
failure.  Note that I have not installed your patch changing
first_reg_to_save. 

David


* Re: other/7114: ICE building strcoll.op from glibc-2.2.5
  2002-07-17 20:50                     ` David Edelsohn
@ 2002-07-17 20:52                       ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2002-07-17 20:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, gcc-patches

On Wed, Jul 17, 2002 at 10:45:17PM -0400, David Edelsohn wrote:
>  I thought that user mode linux64 basically used the same
> profiling definitions as AIX, so I am surprised that I do not see the same
> failure.

It's bombing inside glibc.  Not a gcc problem.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-11 11:02   ` Richard Henderson
@ 2002-07-26 11:08     ` David Edelsohn
  2002-07-27 15:40       ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-26 11:08 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

>>>>> Richard Henderson writes:

> Hmm.  Ok.  I can see why I made the change -- COMMON does not
> make sense without PUBLIC.  I'll adjust assemble_variable and
> its subroutines to compensate.

	I think this actually stems from the question of what DECL_COMMON
means.  tree.h says:

/* Nonzero for a given ..._DECL node means that this node should be
   put in .common, if possible.  If a DECL_INITIAL is given, and it
   is not error_mark_node, then the decl cannot be put in .common.  */
#define DECL_COMMON(NODE) (DECL_CHECK (NODE)->decl.common_flag)


Is ".common" all common or only global common?

	The change to c-decl.c testing TREE_PUBLIC implies that
DECL_COMMON only means global common.  This means that GCC has no way to
mark a decl as local common, other than uninitialized and not public.
And, currently, GCC only *emits* local common if the target also
supports ASM_OUTPUT_BSS.

	There is no connection between a target supporting local common
and supporting BSS, so the logic has been confused somewhere.

	Two apparent solutions are:

1) Either DECL_COMMON should mean any common, in which case it should not
depend on TREE_PUBLIC(decl) and c-decl should be fixed.

2) Or, DECL_COMMON only means global common, in which case
varasm.c:assemble_variable() should test ASM_EMIT_BSS and, if not defined,
avoid invoking asm_emit_uninitialized() for the *one* case it is required:
uninitialized, public symbol that is not common.

	I have implemented option 2 above and submit it for review.  As I
say in the comment, it seems that it would be cleaner to localize the test
to asm_emit_uninitialized and return failure instead of spreading the test
across two functions and aborting if called the wrong way.

David


	* varasm.c (assemble_variable): Narrow test for uninitialized
	without BSS target support.

Index: varasm.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/varasm.c,v
retrieving revision 1.296
diff -c -p -r1.296 varasm.c
*** varasm.c	26 Jun 2002 15:16:01 -0000	1.296
--- varasm.c	26 Jul 2002 16:47:51 -0000
*************** assemble_variable (decl, top_level, at_e
*** 1598,1604 ****
       in .bss, then we have to use .data.  */
    /* ??? We should handle .bss via select_section mechanisms rather than
       via special target hooks.  That would eliminate this special case.  */
!   else if (!DECL_COMMON (decl))
      ;
  #endif
    else if (DECL_INITIAL (decl) == 0
--- 1598,1606 ----
       in .bss, then we have to use .data.  */
    /* ??? We should handle .bss via select_section mechanisms rather than
       via special target hooks.  That would eliminate this special case.  */
!   /* Duplicate BSS test in asm_emit_uninitialized instead of having it
!      return success or failure for that case.  Shrug.  */
!   else if (TREE_PUBLIC (decl) && !DECL_COMMON (decl))
      ;
  #endif
    else if (DECL_INITIAL (decl) == 0


* Re: [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-26 11:08     ` [PATCH] " David Edelsohn
@ 2002-07-27 15:40       ` Richard Henderson
  2002-07-27 16:18         ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-07-27 15:40 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Fri, Jul 26, 2002 at 01:16:03PM -0400, David Edelsohn wrote:
> I think this actually stems from the question of what DECL_COMMON means.

I think of it in terms of the ELF STT_COMMON definition, which is
the same as that which a.out used.

> Is ".common" all common or only global common?

Only global, I'd say.  Local common doesn't make any sense when
you think of the linker semantics.

> 	There is no connection between a target supporting local common
> and supporting BSS, so the logic has been confused somewhere.

Local common _is_ BSS.  Perhaps the target doesn't support changing
to a section via .bss and emitting symbols, but that's the effect
of e.g. the .lcomm directive.

> 	* varasm.c (assemble_variable): Narrow test for uninitialized
> 	without BSS target support.

This is about what I had in mind.

r~


* Re: [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-27 15:40       ` Richard Henderson
@ 2002-07-27 16:18         ` David Edelsohn
  2002-07-29 11:02           ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-27 16:18 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

	I think of data, common, bss, weak, etc. mapping into the
following, basic characteristics:

1) Initialized Data (data section) versus Uninitialized Data (bss
section, blank static storage initialized with zeros).

2) Global scope versus Local scope.

3) Multiple symbol definitions merged (common) or error (no-common).

	Common specifies global, uninitialized data which is merged with
other definitions of the same global symbol by the linker.  Local common
specifies local, uninitialized data which is merged with other definitions
of the same local symbol by the *assembler*.  If a data section global
symbol definition does not override a common symbol, the common symbol is
instantiated in the BSS section.

	Explicitly defining a BSS symbol specifies global, uninitialized
data which *cannot* be merged and is an error if a duplicate definition of
the global symbol appears in another module.  In other words, BSS but not
COMMON.

	ASM_OUTPUT_COMMON generates an uninitialized, global, common-label
name.  ASM_OUTPUT_LOCAL generates an uninitialized, local, common-label
name.  ASM_OUTPUT_BSS generates an uninitialized, global name.
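
As a rough sketch of that mapping, the choice among the three macros for an
uninitialized object depends only on scope and on whether duplicates may be
merged.  The enum and function names below are hypothetical and exist only
for illustration; they are not identifiers from varasm.c.

```c
#include <stdbool.h>

/* Destination for an uninitialized object.  Hypothetical names; each one
   is annotated with the output macro it stands for.  */
enum uninit_dest
{
  DEST_LOCAL,	/* ASM_OUTPUT_LOCAL: local scope, common-label name */
  DEST_COMMON,	/* ASM_OUTPUT_COMMON: global scope, merged by the linker */
  DEST_BSS	/* ASM_OUTPUT_BSS: global scope, duplicates are an error */
};

/* Pick a destination from the two characteristics that matter once the
   object is known to be uninitialized.  */
static enum uninit_dest
classify_uninitialized (bool is_public, bool is_common)
{
  if (!is_public)
    return DEST_LOCAL;		/* scope is local; common flag is ignored */
  return is_common ? DEST_COMMON : DEST_BSS;
}
```

Only the last case, public but not common, needs direct BSS support from
the target.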

David> Is ".common" all common or only global common?

Richard> Only global, I'd say.  Local common doesn't make any sense when
Richard> you think of the linker semantics.

	Right, because it's assembler semantics, not linker semantics.

Richard> Local common _is_ BSS.  Perhaps the target doesn't support changing
Richard> to a section via .bss and emitting symbols, but that's the effect
Richard> of e.g. the .lcomm directive.

	Well, sort of.  Local common is allocated in the BSS section, but
its scope is local, while a BSS symbol can have global scope. And local
common symbols can be merged by the assembler, while pure BSS symbols
cannot have duplicates.

	What I meant by no connection between local common and BSS is that
there is no connection between an assembler/linker allowing GCC to emit
unique BSS symbol definitions which cannot be merged (using
ASM_OUTPUT_BSS) and allowing uninitialized data through local common.

	The current logic in varasm.c means that if the target does not
support directly emitting a BSS global symbol, then it does not support
local common either and no uninitialized local data is generated.

	If you want GCC DECL_COMMON to mean linker-only common, fine.

David> 	* varasm.c (assemble_variable): Narrow test for uninitialized
David> 	without BSS target support.

Richard> This is about what I had in mind.

	Any additional change you want me to make, such as localizing this
logic in asm_emit_uninitialized, or can I commit the patch as posted?

Thanks, David


* Re: [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-27 16:18         ` David Edelsohn
@ 2002-07-29 11:02           ` Richard Henderson
  2002-07-29 11:36             ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-07-29 11:02 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Sat, Jul 27, 2002 at 06:40:28PM -0400, David Edelsohn wrote:
> Local common specifies local, uninitialized data which is merged
> with other definitions of the same local symbol by the *assembler*.

I suppose, yes, but gcc has no need for such a feature.  We will
never emit duplicates of the same local symbol for the assembler
to merge.

> 	Any additional change you want me to make, such as localizing this
> logic in asm_emit_uninitialized, or can I commit the patch as posted?

Well, you can't really localize the change any more since the
change to assemble_variable is required in order for
asm_emit_uninitialized to be called.

So, the patch is ok as-is.


r~


* Re: [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-29 11:02           ` Richard Henderson
@ 2002-07-29 11:36             ` David Edelsohn
  2002-07-29 15:30               ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-29 11:36 UTC (permalink / raw)
  To: gcc-patches

>>>>> Richard Henderson writes:

>> Any additional change you want me to make, such as localizing this
>> logic in asm_emit_uninitialized, or can I commit the patch as posted?

Richard> Well, you can't really localize the change any more since the
Richard> change to assemble_variable is required in order for
Richard> asm_emit_uninitialized to be called.

Richard> So, the patch is ok as-is.

	Okay.

	Just for clarity, what I suggested is for asm_emit_uninitialized()
to return success or failure instead of assemble_variable() avoiding the
function for cases it cannot handle.  The code currently is

  if ()
    ...
#ifndef ASM_EMIT_BSS
    else if (! DECL_COMMON)
    ; /* skip next test */
#endif
  else if (BSS)
    {
      asm_emit_uninitialized();
      return;
    }
  /* initialized variables and no defined BSS */

It could be reworked as

  if ()
    ...
  else if (BSS)
    {
      if (asm_emit_uninitialized())
        return;
    }

where asm_emit_uninitialized() returns FALSE for decls requiring
ASM_EMIT_BSS if that macro is undefined.
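
A self-contained sketch of that control flow, under simplifying
assumptions: the target's BSS capability is passed as a run-time flag
rather than tested with #ifdef, and all names (placement, emit_uninitialized,
place_variable) are hypothetical stand-ins, not the varasm.c functions.

```c
#include <stdbool.h>

/* Where a variable ends up.  Hypothetical names for illustration.  */
enum placement { PLACED_LOCAL, PLACED_COMMON, PLACED_BSS, PLACED_DATA };

/* Try to place an uninitialized variable.  Returns false for the one case
   that needs global BSS support when the target lacks it.  */
static bool
emit_uninitialized (bool is_public, bool is_common,
		    bool target_has_global_bss, enum placement *where)
{
  if (!is_public)
    *where = PLACED_LOCAL;
  else if (is_common)
    *where = PLACED_COMMON;
  else if (target_has_global_bss)
    *where = PLACED_BSS;
  else
    return false;		/* caller must fall through to .data */
  return true;
}

/* Caller: return early on success, otherwise treat the variable like
   initialized data.  */
static enum placement
place_variable (bool has_initializer, bool is_public, bool is_common,
		bool target_has_global_bss)
{
  enum placement where;

  if (!has_initializer
      && emit_uninitialized (is_public, is_common,
			     target_has_global_bss, &where))
    return where;

  /* Initialized definitions, and uninitialized ones with no usable BSS.  */
  return PLACED_DATA;
}
```

The point of the shape is that the caller no longer needs to know which
cases the helper cannot handle; the test lives in one place.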

David


* Re: [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-29 11:36             ` David Edelsohn
@ 2002-07-29 15:30               ` Richard Henderson
  2002-07-29 22:10                 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-07-29 15:30 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Mon, Jul 29, 2002 at 02:09:35PM -0400, David Edelsohn wrote:
> It could be reworked as
> 
>   if ()
>     ...
>   else if (BSS)
>     {
>       if (asm_emit_uninitialized())
>         return;
>     }
> 
> where asm_emit_uninitialized() returns FALSE for decls requiring
> ASM_EMIT_BSS if that macro is undefined.

Oh, I see.  Yes, that would be cleaner.  If you'd like to
work on that, I'd be grateful.


r~


* Re: [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-29 15:30               ` Richard Henderson
@ 2002-07-29 22:10                 ` David Edelsohn
  2002-07-30  9:41                   ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-07-29 22:10 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

>>>>> Richard Henderson writes:

Richard> Oh, I see.  Yes, that would be cleaner.  If you'd like to
Richard> work on that, I'd be grateful.

	* varasm.c (asm_emit_uninitialized): Return false if global BSS
	and ASM_EMIT_BSS not supported by target.
	(assemble_variable): Do not duplicate uninitialized logic.
	Fall through if asm_emit_uninitialized failed.

Index: varasm.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/varasm.c,v
retrieving revision 1.297
diff -c -p -r1.297 varasm.c
*** varasm.c	29 Jul 2002 19:01:55 -0000	1.297
--- varasm.c	30 Jul 2002 04:54:39 -0000
*************** static void asm_output_aligned_bss	PARAM
*** 172,178 ****
  #endif /* BSS_SECTION_ASM_OP */
  static hashval_t const_str_htab_hash	PARAMS ((const void *x));
  static int const_str_htab_eq		PARAMS ((const void *x, const void *y));
! static void asm_emit_uninitialised	PARAMS ((tree, const char*, int, int));
  static void resolve_unique_section	PARAMS ((tree, int, int));
  static void mark_weak                   PARAMS ((tree));
  \f
--- 172,178 ----
  #endif /* BSS_SECTION_ASM_OP */
  static hashval_t const_str_htab_hash	PARAMS ((const void *x));
  static int const_str_htab_eq		PARAMS ((const void *x, const void *y));
! static bool asm_emit_uninitialised	PARAMS ((tree, const char*, int, int));
  static void resolve_unique_section	PARAMS ((tree, int, int));
  static void mark_weak                   PARAMS ((tree));
  \f
*************** assemble_string (p, size)
*** 1350,1356 ****
  #endif
  #endif
  
! static void
  asm_emit_uninitialised (decl, name, size, rounded)
       tree decl;
       const char *name;
--- 1350,1356 ----
  #endif
  #endif
  
! static bool
  asm_emit_uninitialised (decl, name, size, rounded)
       tree decl;
       const char *name;
*************** asm_emit_uninitialised (decl, name, size
*** 1365,1377 ****
    }
    destination = asm_dest_local;
  
    if (TREE_PUBLIC (decl))
      {
! #if defined ASM_EMIT_BSS
!       if (! DECL_COMMON (decl))
  	destination = asm_dest_bss;
!       else
  #endif
  	destination = asm_dest_common;
      }
  
--- 1365,1381 ----
    }
    destination = asm_dest_local;
  
+   /* ??? We should handle .bss via select_section mechanisms rather than
+      via special target hooks.  That would eliminate this special case.  */
    if (TREE_PUBLIC (decl))
      {
!       if (!DECL_COMMON (decl))
! #ifdef ASM_EMIT_BSS
  	destination = asm_dest_bss;
! #else
! 	return false;
  #endif
+       else
  	destination = asm_dest_common;
      }
  
*************** asm_emit_uninitialised (decl, name, size
*** 1420,1426 ****
        abort ();
      }
  
!   return;
  }
  
  /* Assemble everything that is needed for a variable or function declaration.
--- 1424,1430 ----
        abort ();
      }
  
!   return true;
  }
  
  /* Assemble everything that is needed for a variable or function declaration.
*************** assemble_variable (decl, top_level, at_e
*** 1593,1608 ****
        if (DECL_COMMON (decl))
  	sorry ("thread-local COMMON data not implemented");
      }
- #ifndef ASM_EMIT_BSS
-   /* If the target can't output uninitialized but not common global data
-      in .bss, then we have to use .data.  */
-   /* ??? We should handle .bss via select_section mechanisms rather than
-      via special target hooks.  That would eliminate this special case.  */
-   /* Duplicate BSS test in asm_emit_uninitialized instead of having it
-      return success or failure for that case.  Shrug.  */
-   else if (TREE_PUBLIC (decl) && !DECL_COMMON (decl))
-     ;
- #endif
    else if (DECL_INITIAL (decl) == 0
  	   || DECL_INITIAL (decl) == error_mark_node
  	   || (flag_zero_initialized_in_bss
--- 1597,1602 ----
*************** assemble_variable (decl, top_level, at_e
*** 1629,1637 ****
  	  (decl, "requested alignment for %s is greater than implemented alignment of %d",rounded);
  #endif
  
!       asm_emit_uninitialised (decl, name, size, rounded);
! 
!       return;
      }
  
    /* Handle initialized definitions.
--- 1623,1632 ----
  	  (decl, "requested alignment for %s is greater than implemented alignment of %d",rounded);
  #endif
  
!       /* If the target cannot output uninitialized but not common global data
! 	 in .bss, then we have to use .data, so fall through.  */
!       if (asm_emit_uninitialised (decl, name, size, rounded))
! 	return;
      }
  
    /* Handle initialized definitions.


* Re: [PATCH] Re: thread-local storage: c front end and generic backend patch
  2002-07-29 22:10                 ` David Edelsohn
@ 2002-07-30  9:41                   ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-07-30  9:41 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Tue, Jul 30, 2002 at 01:04:06AM -0400, David Edelsohn wrote:
> 	* varasm.c (asm_emit_uninitialized): Return false if global BSS
> 	and ASM_EMIT_BSS not supported by target.
> 	(assemble_variable): Do not duplicate uninitialized logic.
> 	Fall through if asm_emit_uninitialized failed.

Ok.

Thanks for cleaning this up.


r~


* power4 branch hints
@ 2002-08-01 18:39 Alan Modra
  2002-08-01 18:47 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-08-01 18:39 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

See the comment below.

	* config/rs6000/rs6000.c (output_cbranch): Hint differently for power4.
	* config/rs6000/rs6000.h (enum processor_type): Comment on ordering.

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.355
diff -u -p -r1.355 rs6000.c
--- gcc/config/rs6000/rs6000.c	2 Aug 2002 01:08:01 -0000	1.355
+++ gcc/config/rs6000/rs6000.c	2 Aug 2002 01:21:55 -0000
@@ -8438,21 +8438,30 @@ output_cbranch (op, label, reversed, ins
   
   /* Maybe we have a guess as to how likely the branch is.  
      The old mnemonics don't have a way to specify this information.  */
+  pred = "";
   note = find_reg_note (insn, REG_BR_PROB, NULL_RTX);
   if (note != NULL_RTX)
     {
       /* PROB is the difference from 50%.  */
       int prob = INTVAL (XEXP (note, 0)) - REG_BR_PROB_BASE / 2;
-      
-      /* For branches that are very close to 50%, assume not-taken.  */
-      if (abs (prob) > REG_BR_PROB_BASE / 20
-	  && ((prob > 0) ^ need_longbranch))
-	pred = "+";
-      else
-	pred = "-";
+      int cpu_version1_arch = rs6000_cpu < PROCESSOR_POWER4;
+
+      /* Only hint for highly probable/improbable branches on newer
+	 cpus as static prediction overrides processor dynamic
+	 prediction.  For older cpus we may as well always hint, but
+	 assume not taken for branches that are very close to 50% as a
+	 mispredicted taken branch is more expensive than a
+	 mispredicted not-taken branch.  */ 
+      if (cpu_version1_arch
+	  || abs (prob) > REG_BR_PROB_BASE / 100 * 48)
+	{
+	  if (abs (prob) > REG_BR_PROB_BASE / 20
+	      && ((prob > 0) ^ need_longbranch))
+	    pred = "+";
+	  else
+	    pred = "-";
+	}
     }
-  else
-    pred = "";
 
   if (label == NULL)
     s += sprintf (s, "{b%sr|b%slr%s} ", ccode, ccode, pred);
Index: gcc/config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.217
diff -u -p -r1.217 rs6000.h
--- gcc/config/rs6000/rs6000.h	1 Aug 2002 21:18:34 -0000	1.217
+++ gcc/config/rs6000/rs6000.h	2 Aug 2002 01:22:00 -0000
@@ -341,7 +341,10 @@ extern int target_flags;
 /* This is meant to be redefined in the host dependent files */
 #define SUBTARGET_SWITCHES
 
-/* Processor type.  Order must match cpu attribute in MD file.  */
+/* Processor type.  Order must match cpu attribute in MD file.
+   Please keep all Power4 type processors using "at" branch hints
+   after PROCESSOR_POWER4, and those using the "y" branch hints,
+   before.  */
 enum processor_type
  {
    PROCESSOR_RIOS1,

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: power4 branch hints
  2002-08-01 18:39 power4 branch hints Alan Modra
@ 2002-08-01 18:47 ` David Edelsohn
  2002-08-01 19:50   ` Alan Modra
  2002-08-02 13:25   ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-08-01 18:47 UTC (permalink / raw)
  To: Alan Modra, Geoff Keating; +Cc: gcc-patches

+      int cpu_version1_arch = rs6000_cpu < PROCESSOR_POWER4;

	I am extremely conflicted about having a fragile test like the one
above versus 

	int at_hints = (rs6000_cpu == PROCESSOR_POWER4)

which adds yet another place potentially requiring tweaking for new
processors. 

	Geoff, do you have any preferences?

David


* Re: power4 branch hints
  2002-08-01 18:47 ` David Edelsohn
@ 2002-08-01 19:50   ` Alan Modra
  2002-08-01 20:25     ` David Edelsohn
  2002-08-02 13:25   ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-08-01 19:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: Geoff Keating, David Edelsohn

On Thu, Aug 01, 2002 at 09:47:01PM -0400, David Edelsohn wrote:
> +      int cpu_version1_arch = rs6000_cpu < PROCESSOR_POWER4;
> 
> 	I am extremely conflicted about having a fragile test like the one
> above versus 

OK, we came to agreement in private email.  For the record, this is
what I'm committing.

	* config/rs6000/rs6000.c (output_cbranch): Hint differently for power4.

Index: config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.355
diff -u -p -r1.355 rs6000.c
--- config/rs6000/rs6000.c	2 Aug 2002 01:08:01 -0000	1.355
+++ config/rs6000/rs6000.c	2 Aug 2002 02:46:19 -0000
@@ -8438,21 +8438,30 @@ output_cbranch (op, label, reversed, ins
   
   /* Maybe we have a guess as to how likely the branch is.  
      The old mnemonics don't have a way to specify this information.  */
+  pred = "";
   note = find_reg_note (insn, REG_BR_PROB, NULL_RTX);
   if (note != NULL_RTX)
     {
       /* PROB is the difference from 50%.  */
       int prob = INTVAL (XEXP (note, 0)) - REG_BR_PROB_BASE / 2;
-      
-      /* For branches that are very close to 50%, assume not-taken.  */
-      if (abs (prob) > REG_BR_PROB_BASE / 20
-	  && ((prob > 0) ^ need_longbranch))
-	pred = "+";
-      else
-	pred = "-";
+      bool always_hint = rs6000_cpu != PROCESSOR_POWER4;
+
+      /* Only hint for highly probable/improbable branches on newer
+	 cpus as static prediction overrides processor dynamic
+	 prediction.  For older cpus we may as well always hint, but
+	 assume not taken for branches that are very close to 50% as a
+	 mispredicted taken branch is more expensive than a
+	 mispredicted not-taken branch.  */ 
+      if (always_hint
+	  || abs (prob) > REG_BR_PROB_BASE / 100 * 48)
+	{
+	  if (abs (prob) > REG_BR_PROB_BASE / 20
+	      && ((prob > 0) ^ need_longbranch))
+	    pred = "+";
+	  else
+	    pred = "-";
+	}
     }
-  else
-    pred = "";
 
   if (label == NULL)
     s += sprintf (s, "{b%sr|b%slr%s} ", ccode, ccode, pred);


-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: power4 branch hints
  2002-08-01 19:50   ` Alan Modra
@ 2002-08-01 20:25     ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-08-01 20:25 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/rs6000.c (output_cbranch): Hint differently for power4.

Yes, this is fine.

Thanks, David


* Re: power4 branch hints
  2002-08-01 18:47 ` David Edelsohn
  2002-08-01 19:50   ` Alan Modra
@ 2002-08-02 13:25   ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2002-08-02 13:25 UTC (permalink / raw)
  To: dje; +Cc: amodra, gcc-patches

> cc: gcc-patches@gcc.gnu.org
> Date: Thu, 01 Aug 2002 21:47:01 -0400
> From: David Edelsohn <dje@watson.ibm.com>
> 
> +      int cpu_version1_arch = rs6000_cpu < PROCESSOR_POWER4;
> 
> 	I am extremely conflicted about having a fragile test like the one
> above versus 
> 
> 	int at_hints = (rs6000_cpu == PROCESSOR_POWER4)
> 
> which adds yet another place potentially requiring tweaking for new
> processors. 
> 
> 	Geoff, do you have any preferences?

I think I slightly prefer the second, because the first one depends on
ordering of an enum.
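
For illustration, the hint selection with an explicit equality test might
look like the sketch below.  It is a stand-alone approximation, not the
rs6000.c code: cpu_kind and branch_hint are hypothetical names, and
REG_BR_PROB_BASE is assumed to be 10000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define REG_BR_PROB_BASE 10000	/* assumed probability scale */

/* Hypothetical stand-in for the processor enum; only equality with
   CPU_POWER4 is tested, so the order of enumerators does not matter.  */
enum cpu_kind { CPU_OLDER, CPU_POWER4 };

/* Return "+", "-" or "" for a branch whose taken-probability is
   NOTE_PROB out of REG_BR_PROB_BASE.  */
static const char *
branch_hint (int note_prob, enum cpu_kind cpu, bool need_longbranch)
{
  assert (note_prob >= 0 && note_prob <= REG_BR_PROB_BASE);

  /* PROB is the difference from 50%.  */
  int prob = note_prob - REG_BR_PROB_BASE / 2;
  bool always_hint = cpu != CPU_POWER4;

  /* On power4, leave all but highly biased branches to the dynamic
     predictor.  */
  if (!always_hint && abs (prob) <= REG_BR_PROB_BASE / 100 * 48)
    return "";

  /* Branches very close to 50% are hinted as not taken.  */
  if (abs (prob) > REG_BR_PROB_BASE / 20
      && ((prob > 0) ^ need_longbranch))
    return "+";
  return "-";
}
```

With these assumptions a 70% branch gets "+" on an older cpu and no hint on
power4, while a 99% branch gets "+" on both.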

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>


* [RFC] PowerPC select_section / unique_section
@ 2002-08-21 11:09 David Edelsohn
  2002-08-21 11:21 ` Franz Sirl
  2002-08-21 18:54 ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-08-21 11:09 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	While investigating a problem with PowerPC ELF section decisions,
I realized that the functions need to treat PPC64 as if flag_pic were set.
I also noticed a discrepancy between the PowerPC definition and the
default definition in varasm.c: rs6000_elf_select_section should not
default to readonly when reloc is defined.

	My remaining concern is that the readonly algorithm in
rs6000_elf_unique_section does not match the algorithm in
rs6000_elf_select_section: the handling of decl CONSTRUCTOR and the
handling of default readonly.  Should the definitions match?
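
	To make the intended readonly rule concrete, here is a simplified,
stand-alone sketch of the decision for the ELF select_section case.  The
struct and function names are hypothetical, the tree predicates are reduced
to booleans, and the small-data adjustment is left out.

```c
#include <stdbool.h>

/* Hypothetical, flattened view of the node being placed.  */
enum node_kind { NODE_STRING_CST, NODE_VAR_DECL, NODE_CONSTRUCTOR, NODE_OTHER };

struct decl_info
{
  enum node_kind kind;
  bool readonly;	/* stands for TREE_READONLY */
  bool side_effects;	/* stands for TREE_SIDE_EFFECTS */
  bool constant_init;	/* has an initializer that is a valid constant */
  bool constant;	/* stands for TREE_CONSTANT on the node itself */
};

/* Decide whether the object may live in a read-only section.  ABI_AIX
   is treated like PIC: a relocation then forces a writable section.  */
static bool
section_is_readonly (const struct decl_info *d, bool flag_pic, bool abi_aix,
		     bool reloc, bool writable_strings)
{
  bool needs_runtime_reloc = (flag_pic || abi_aix) && reloc;

  switch (d->kind)
    {
    case NODE_STRING_CST:
      return !writable_strings;
    case NODE_VAR_DECL:
      return (!needs_runtime_reloc
	      && d->readonly
	      && !d->side_effects
	      && d->constant_init);
    case NODE_CONSTRUCTOR:
      return !needs_runtime_reloc && !d->side_effects && d->constant;
    default:
      /* The default is no longer unconditionally read-only.  */
      return !needs_runtime_reloc;
    }
}
```

	Keeping the rule in one predicate like this is also one way the
select_section and unique_section variants could be made to agree.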

Thanks, David


	* config/rs6000/rs6000.c (rs6000_elf_select_section): Treat
	DEFAULT_ABI == ABI_AIX like PIC.  Test PIC & reloc for readonly
	default.
	(rs6000_elf_unique_section): Treat DEFAULT_ABI == ABI_AIX like
	PIC.
	(rs6000_xcoff_select_section): Update to recent readonly
	algorithm.

Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.364
diff -c -p -r1.364 rs6000.c
*** rs6000.c	19 Aug 2002 16:32:53 -0000	1.364
--- rs6000.c	21 Aug 2002 16:42:25 -0000
*************** rs6000_elf_select_section (decl, reloc, 
*** 12432,12439 ****
       unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED;
  {
    int size = int_size_in_bytes (TREE_TYPE (decl));
!   int needs_sdata;
!   int readonly;
    static void (* const sec_funcs[4]) PARAMS ((void)) = {
      &readonly_data_section,
      &sdata2_section,
--- 12462,12469 ----
       unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED;
  {
    int size = int_size_in_bytes (TREE_TYPE (decl));
!   bool needs_sdata;
!   bool readonly;
    static void (* const sec_funcs[4]) PARAMS ((void)) = {
      &readonly_data_section,
      &sdata2_section,
*************** rs6000_elf_select_section (decl, reloc, 
*** 12447,12468 ****
  		 && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
  
    if (TREE_CODE (decl) == STRING_CST)
!     readonly = ! flag_writable_strings;
    else if (TREE_CODE (decl) == VAR_DECL)
!     readonly = (! (flag_pic && reloc)
  		&& TREE_READONLY (decl)
! 		&& ! TREE_SIDE_EFFECTS (decl)
  		&& DECL_INITIAL (decl)
  		&& DECL_INITIAL (decl) != error_mark_node
  		&& TREE_CONSTANT (DECL_INITIAL (decl)));
    else if (TREE_CODE (decl) == CONSTRUCTOR)
!     readonly = (! (flag_pic && reloc)
! 		&& ! TREE_SIDE_EFFECTS (decl)
  		&& TREE_CONSTANT (decl));
    else
!     readonly = 1;
    if (needs_sdata && rs6000_sdata != SDATA_EABI)
!     readonly = 0;
    
    (*sec_funcs[(readonly ? 0 : 2) + (needs_sdata ? 1 : 0)])();
  }
--- 12477,12499 ----
  		 && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
  
    if (TREE_CODE (decl) == STRING_CST)
!     readonly = !flag_writable_strings;
    else if (TREE_CODE (decl) == VAR_DECL)
!     readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
  		&& TREE_READONLY (decl)
! 		&& !TREE_SIDE_EFFECTS (decl)
  		&& DECL_INITIAL (decl)
  		&& DECL_INITIAL (decl) != error_mark_node
  		&& TREE_CONSTANT (DECL_INITIAL (decl)));
    else if (TREE_CODE (decl) == CONSTRUCTOR)
!     readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
! 		&& !TREE_SIDE_EFFECTS (decl)
  		&& TREE_CONSTANT (decl));
    else
!     readonly = !((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc);
! 
    if (needs_sdata && rs6000_sdata != SDATA_EABI)
!     readonly = false;
    
    (*sec_funcs[(readonly ? 0 : 2) + (needs_sdata ? 1 : 0)])();
  }
*************** rs6000_elf_unique_section (decl, reloc)
*** 12501,12517 ****
      sec = 6;
    else
      {
!       int readonly;
!       int needs_sdata;
        int size;
  
-       readonly = 1;
        if (TREE_CODE (decl) == STRING_CST)
! 	readonly = ! flag_writable_strings;
        else if (TREE_CODE (decl) == VAR_DECL)
! 	readonly = (! (flag_pic && reloc)
  		    && TREE_READONLY (decl)
! 		    && ! TREE_SIDE_EFFECTS (decl)
  		    && TREE_CONSTANT (DECL_INITIAL (decl)));
  
        size = int_size_in_bytes (TREE_TYPE (decl));
--- 12532,12547 ----
      sec = 6;
    else
      {
!       bool readonly = true;
!       bool needs_sdata;
        int size;
  
        if (TREE_CODE (decl) == STRING_CST)
! 	readonly = !flag_writable_strings;
        else if (TREE_CODE (decl) == VAR_DECL)
! 	readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
  		    && TREE_READONLY (decl)
! 		    && !TREE_SIDE_EFFECTS (decl)
  		    && TREE_CONSTANT (DECL_INITIAL (decl)));
  
        size = int_size_in_bytes (TREE_TYPE (decl));
*************** rs6000_elf_unique_section (decl, reloc)
*** 12523,12529 ****
        if (DECL_INITIAL (decl) == 0
  	  || DECL_INITIAL (decl) == error_mark_node)
  	sec = 4;
!       else if (! readonly)
  	sec = 2;
        else
  	sec = 0;
--- 12553,12559 ----
        if (DECL_INITIAL (decl) == 0
  	  || DECL_INITIAL (decl) == error_mark_node)
  	sec = 4;
!       else if (!readonly)
  	sec = 2;
        else
  	sec = 0;
*************** xcoff_asm_named_section (name, flags)
*** 13104,13131 ****
  }
  
  static void
! rs6000_xcoff_select_section (exp, reloc, align)
!      tree exp;
       int reloc;
       unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED;
  {
!   if ((TREE_CODE (exp) == STRING_CST
!        && ! flag_writable_strings)
!       || (TREE_CODE_CLASS (TREE_CODE (exp)) == 'd'
! 	  && TREE_READONLY (exp) && ! TREE_THIS_VOLATILE (exp)
! 	  && DECL_INITIAL (exp)
! 	  && (DECL_INITIAL (exp) == error_mark_node
! 	      || TREE_CONSTANT (DECL_INITIAL (exp)))
! 	  && ! (reloc)))
      {
!       if (TREE_PUBLIC (exp))
          read_only_data_section ();
        else
          read_only_private_data_section ();
      }
    else
      {
!       if (TREE_PUBLIC (exp))
          data_section ();
        else
          private_data_section ();
--- 13134,13172 ----
  }
  
  static void
! rs6000_xcoff_select_section (decl, reloc, align)
!      tree decl;
       int reloc;
       unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED;
  {
!   bool readonly = false;
! 
!   if (TREE_CODE (decl) == STRING_CST)
!     readonly = !flag_writable_strings;
!   else if (TREE_CODE (decl) == VAR_DECL)
!     readonly = (!reloc
! 		&& TREE_READONLY (decl)
! 		&& !TREE_SIDE_EFFECTS (decl)
! 		&& DECL_INITIAL (decl)
! 		&& DECL_INITIAL (decl) != error_mark_node
! 		&& TREE_CONSTANT (DECL_INITIAL (decl)));
!   else if (TREE_CODE (decl) == CONSTRUCTOR)
!     readonly = (!reloc
! 		&& !TREE_SIDE_EFFECTS (decl)
! 		&& TREE_CONSTANT (decl));
!   else
!     readonly = !reloc;
! 
!   if (readonly)
      {
!       if (TREE_PUBLIC (decl))
          read_only_data_section ();
        else
          read_only_private_data_section ();
      }
    else
      {
!       if (TREE_PUBLIC (decl))
          data_section ();
        else
          private_data_section ();

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 11:09 [RFC] PowerPC select_section / unique_section David Edelsohn
@ 2002-08-21 11:21 ` Franz Sirl
  2002-08-21 11:29   ` David Edelsohn
  2002-08-21 18:54 ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Franz Sirl @ 2002-08-21 11:21 UTC (permalink / raw)
  To: David Edelsohn, Geoff Keating; +Cc: gcc-patches

On Wednesday, 21 August 2002 19:20, David Edelsohn wrote:
> While investigating a problem with PowerPC ELF section decisions,
> I realized that the functions need to treat PPC64 as if flag_pic were set.
> I also noticed a discrepancy between the PowerPC definition and the
> default definition in varasm.c: rs6000_elf_select_section should not
> default to readonly when reloc is defined.
>
> 	My remaining concern is that the readonly algorithm in
> rs6000_elf_unique_section does not match the algorithm in
> rs6000_elf_select_section: the handling of decl CONSTRUCTOR and the
> handling of default readonly.  Should the definitions match?

See

	<http://gcc.gnu.org/ml/gcc-patches/2002-05/subjects.html#01551>

for a patch I suggested to unify the handling, and the follow-up discussion on
how to better re-use the stuff in varasm.c. I haven't gotten around to
implementing this yet, unfortunately (and it doesn't look like I will be able to
tackle this anytime soon).

Franz.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 11:21 ` Franz Sirl
@ 2002-08-21 11:29   ` David Edelsohn
  2002-08-21 12:01     ` Franz Sirl
  2002-08-21 19:02     ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-08-21 11:29 UTC (permalink / raw)
  To: Franz Sirl; +Cc: Geoff Keating, gcc-patches

>>>>> Franz Sirl writes:

Franz> I suggested to unify the handling and the followup discussion on 
Franz> how to better re-use the stuff in varasm.c. I haven't gotten around to
Franz> implementing this yet, unfortunately (and it doesn't look like I will be able to
Franz> tackle this anytime soon).

	PowerPC cannot use the default functions in varasm.c because of
its assumptions (e.g., flag_pic).  I already investigated that and I do
remember the earlier discussion.

	My patch is not a cleanup, it's a correctness issue.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 11:29   ` David Edelsohn
@ 2002-08-21 12:01     ` Franz Sirl
  2002-08-21 12:15       ` David Edelsohn
  2002-08-21 19:02     ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Franz Sirl @ 2002-08-21 12:01 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, gcc-patches

On Wednesday, 21 August 2002 20:25, David Edelsohn wrote:
> >>>>> Franz Sirl writes:
>
> Franz> I suggested to unify the handling and the followup discussion on
> Franz> how to better re-use the stuff in varasm.c. I haven't gotten around to
> Franz> implementing this yet, unfortunately (and it doesn't look like I will be able to
> Franz> tackle this anytime soon).
>
> 	PowerPC cannot use the default functions in varasm.c because of
> its assumptions (e.g., flag_pic).  I already investigated that and I do
> remember the earlier discussion.

Well, we could save flag_pic around calling these functions or change the 
flag_pic checks in varasm.c to a target hook.

> 	My patch is not a cleanup, it's a correctness issue.

I understand that, but I consider reusing as much as possible of the varasm
functions a long-term correctness issue.

Franz.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 12:01     ` Franz Sirl
@ 2002-08-21 12:15       ` David Edelsohn
  2002-08-29 18:01         ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-08-21 12:15 UTC (permalink / raw)
  To: Franz Sirl; +Cc: Geoff Keating, gcc-patches

>>>>> Franz Sirl writes:

Franz> Well, we could save flag_pic around calling these functions or change the 
Franz> flag_pic checks in varasm.c to a target hook.

	Yes, I considered setting and unsetting flag_pic, but that's
ugly.  The only way I see to use the default functions is to add a "pic"
parameter to the functions instead of using the GCC flag_pic global.
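
A minimal sketch of the "pic" parameter idea, under a simplified model.
The names `struct decl_info` and `decl_is_readonly` are hypothetical and
exist only for illustration; they are not GCC's tree API or the actual
varasm.c functions. The caller decides what counts as PIC, so a port could
pass something like (flag_pic || DEFAULT_ABI == ABI_AIX) without touching
the global.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the properties discussed in
   this thread; this is not GCC's tree representation.  */
struct decl_info
{
  bool is_readonly;       /* models TREE_READONLY */
  bool has_side_effects;  /* models TREE_SIDE_EFFECTS */
  bool init_is_constant;  /* models TREE_CONSTANT (DECL_INITIAL) */
};

/* Decide read-only placement.  PIC is supplied by the caller instead
   of being read from a global flag.  */
static bool
decl_is_readonly (const struct decl_info *d, int reloc, bool pic)
{
  assert (d);

  return (!(pic && reloc)
          && d->is_readonly
          && !d->has_side_effects
          && d->init_is_constant);
}
```

With this shape, a constant initializer that needs relocations is treated
as read-only only when the caller says the code is not PIC.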

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 11:09 [RFC] PowerPC select_section / unique_section David Edelsohn
  2002-08-21 11:21 ` Franz Sirl
@ 2002-08-21 18:54 ` Alan Modra
  2002-08-21 18:59   ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-08-21 18:54 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, gcc-patches

On Wed, Aug 21, 2002 at 01:20:18PM -0400, David Edelsohn wrote:
> 	* config/rs6000/rs6000.c (rs6000_elf_select_section): Treat
> 	DEFAULT_ABI == ABI_AIX like PIC.  Test PIC & reloc for readonly
> 	default.
> 	(rs6000_elf_unique_section): Treat DEFAULT_ABI == ABI_AIX like
> 	PIC.
> 	(rs6000_xcoff_select_section): Update to recent readonly
> 	algorithm.

Why not use decl_readonly_section?  rs6000_elf_select_section ends up
looking like:

  needs_sdata = (size > 0 
		 && size <= g_switch_value
		 && rs6000_sdata != SDATA_NONE
		 && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));

  readonly = 0;
  if (!needs_sdata || rs6000_sdata == SDATA_EABI)
    readonly = decl_readonly_section (decl, reloc);
  
  (*sec_funcs[(readonly ? 0 : 2) + (needs_sdata ? 1 : 0)])();

Similarly for the other functions.  I've used this on gcc-3.2 based
powerpc64-linux for a couple of weeks.  Patch available on request.
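
To make the table indexing above easier to follow, here is a small
standalone sketch. It assumes the same ordering as the sec_funcs table in
rs6000_elf_select_section (readonly, sdata2, data, sdata); the strings are
illustrative labels only, and `pick_section` and its parameters are
hypothetical names, not part of any patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Labels in the same order as the sec_funcs table: adding 2 selects
   the writable half, adding 1 selects the small-data variant.  */
static const char *const section_names[4] = {
  ".rodata",  /* read-only, not small data */
  ".sdata2",  /* read-only, small data */
  ".data",    /* writable, not small data */
  ".sdata"    /* writable, small data */
};

/* DECL_READONLY models the result of the read-only test for the decl.
   Small data is only allowed to stay read-only under EABI.  */
static const char *
pick_section (bool decl_readonly, bool needs_sdata, bool sdata_is_eabi)
{
  bool readonly = false;
  int idx;

  if (!needs_sdata || sdata_is_eabi)
    readonly = decl_readonly;

  idx = (readonly ? 0 : 2) + (needs_sdata ? 1 : 0);
  assert (idx >= 0 && idx < 4);
  return section_names[idx];
}
```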

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 18:54 ` Alan Modra
@ 2002-08-21 18:59   ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-08-21 18:59 UTC (permalink / raw)
  To: Alan Modra; +Cc: Geoff Keating, gcc-patches

>>>>> Alan Modra writes:

Alan> Why not use decl_readonly_section?
Alan> Similarly for the other functions.  I've used this on gcc-3.2 based
Alan> powerpc64-linux for a couple of weeks.  Patch available on request.

	As I replied to Franz, testing flag_pic is not sufficient for
PPC64.  Any patch using the varasm.c default functions without further
changes is wrong.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 11:29   ` David Edelsohn
  2002-08-21 12:01     ` Franz Sirl
@ 2002-08-21 19:02     ` Alan Modra
  2002-08-21 19:19       ` David Edelsohn
  2002-08-29 18:03       ` Richard Henderson
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-08-21 19:02 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Franz Sirl, Geoff Keating, gcc-patches

On Wed, Aug 21, 2002 at 02:25:06PM -0400, David Edelsohn wrote:
> 	PowerPC cannot use the default functions in varasm.c because of
> its assumptions (e.g., flag_pic).

Ouch.  Forget my suggestion re decl_readonly_section.  Hmm, would
you accept a patch that cleans up flag_pic in rs6000.c?  ie. has
flag_pic always set for ABI_AIX?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 19:02     ` Alan Modra
@ 2002-08-21 19:19       ` David Edelsohn
  2002-08-21 19:25         ` Alan Modra
  2002-08-29 18:03       ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-08-21 19:19 UTC (permalink / raw)
  To: Alan Modra; +Cc: Franz Sirl, Geoff Keating, gcc-patches

>>>>> Alan Modra writes:

Alan> Ouch.  Forget my suggestion re decl_readonly_section.  Hmm, would
Alan> you accept a patch that cleans up flag_pic in rs6000.c?  ie. has
Alan> flag_pic always set for ABI_AIX?

	Have you actually tried that?  flag_pic interferes with ABI_AIX
TOC register usage.  ABI_AIX does not use the flag_pic machinery.

	Now that I have fixed the ASM_GLOBALIZE_LABEL breakage, I am going
to get back to testing the select_section patch for GCC 3.2 and GCC 3.3.
We can look at cleaning up select_section and unique_section for GCC 3.4,
but it's too late for GCC 3.2 and GCC 3.3.  I only am modifying the
current form of those functions enough to fix bugs.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 19:19       ` David Edelsohn
@ 2002-08-21 19:25         ` Alan Modra
  2002-08-21 21:28           ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-08-21 19:25 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Franz Sirl, Geoff Keating, gcc-patches

On Wed, Aug 21, 2002 at 10:07:13PM -0400, David Edelsohn wrote:
> >>>>> Alan Modra writes:
> 
> Alan> Ouch.  Forget my suggestion re decl_readonly_section.  Hmm, would
> Alan> you accept a patch that cleans up flag_pic in rs6000.c?  ie. has
> Alan> flag_pic always set for ABI_AIX?
> 
> 	Have you actually tried that?

Not yet, but I think it's worth doing.  Setting flag_pic to zero when
we are PIC amounts to lying to the rest of the compiler.

>  flag_pic interferes with ABI_AIX
> TOC register usage.  ABI_AIX does not use the flag_pic machinery.

Yes, I know.  I'm talking about adding a DEFAULT_ABI test with all
uses of flag_pic in rs6000/*.  Many already do so.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 19:25         ` Alan Modra
@ 2002-08-21 21:28           ` David Edelsohn
  2002-08-29 18:19             ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-08-21 21:28 UTC (permalink / raw)
  To: Alan Modra; +Cc: Franz Sirl, Geoff Keating, gcc-patches

>>>>> Alan Modra writes:

Alan> Setting flag_pic to zero when
Alan> we are PIC amounts to lying to the rest of the compiler.

	Not really.  GCC handled PIC just fine before flag_pic was
introduced.  flag_pic really means flag_got.  It really means enable a
particular way of implementing PIC support, not enable PIC codegen.  It
would be nice if flag_pic meant PIC, but the meaning is confused with the
implementation. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 12:15       ` David Edelsohn
@ 2002-08-29 18:01         ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-08-29 18:01 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Franz Sirl, Geoff Keating, gcc-patches

On Wed, Aug 21, 2002 at 03:01:20PM -0400, David Edelsohn wrote:
> Yes, I considered setting and unsetting flag_pic, but that's ugly.

Nevertheless, see ia64_aix_unique_section and ia64_aix_select_rtx_section
in which exactly this is done.  Personally, I think this is the best
solution.
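
The save-and-restore pattern can be sketched as follows. This is a toy
model, not the ia64 code: `flag_pic_model`, `generic_is_readonly`, and
`target_is_readonly` are hypothetical names standing in for the global flag,
a generic routine that consults it, and a target wrapper.

```c
#include <assert.h>

/* Stand-in for a compiler-wide global such as flag_pic.  */
static int flag_pic_model = 0;

/* Stand-in for a generic routine that consults the global.  */
static int
generic_is_readonly (int reloc)
{
  return !(flag_pic_model && reloc);
}

/* Target wrapper: force the global on for the duration of the call,
   then restore whatever value it had before.  */
static int
target_is_readonly (int reloc)
{
  int saved = flag_pic_model;
  int result;

  flag_pic_model = 1;
  result = generic_is_readonly (reloc);
  flag_pic_model = saved;

  assert (flag_pic_model == saved);
  return result;
}
```

The rest of the compiler never observes the temporary value, which is the
attraction; the cost is that every wrapper has to remember to restore it.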


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 19:02     ` Alan Modra
  2002-08-21 19:19       ` David Edelsohn
@ 2002-08-29 18:03       ` Richard Henderson
  2002-08-30  7:15         ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-08-29 18:03 UTC (permalink / raw)
  To: David Edelsohn, Franz Sirl, Geoff Keating, gcc-patches

On Thu, Aug 22, 2002 at 11:29:31AM +0930, Alan Modra wrote:
> Ouch.  Forget my suggestion re decl_readonly_section.  Hmm, would
> you accept a patch that cleans up flag_pic in rs6000.c?  ie. has
> flag_pic always set for ABI_AIX?

That will pessimize ppc64-elf name binding.  (I.e. binds_local_p.)


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-21 21:28           ` David Edelsohn
@ 2002-08-29 18:19             ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-08-29 18:19 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Franz Sirl, Geoff Keating, gcc-patches

On Wed, Aug 21, 2002 at 10:25:26PM -0400, David Edelsohn wrote:
> Alan> Setting flag_pic to zero when
> Alan> we are PIC amounts to lying to the rest of the compiler.
> 
> 	Not really.  GCC handled PIC just fine before flag_pic was
> introduced.  flag_pic really means flag_got.  It really means enable a
> particular way of implementing PIC support, not enable PIC codegen.  It
> would be nice if flag_pic meant PIC, but the meaning is confused with the
> implementation. 

Indeed.  For ELF what it actually means is "compiling for
inclusion in a shared library".  Existing usage of the
command-line switch is way too entrenched to rename it.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-29 18:03       ` Richard Henderson
@ 2002-08-30  7:15         ` Alan Modra
  2002-08-30  8:27           ` Alan Modra
  2002-08-30  8:42           ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-08-30  7:15 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Franz Sirl, Geoff Keating; +Cc: gcc-patches

On Thu, Aug 29, 2002 at 06:01:18PM -0700, Richard Henderson wrote:
> On Thu, Aug 22, 2002 at 11:29:31AM +0930, Alan Modra wrote:
> > Ouch.  Forget my suggestion re decl_readonly_section.  Hmm, would
> > you accept a patch that cleans up flag_pic in rs6000.c?  ie. has
> > flag_pic always set for ABI_AIX?
> 
> That will pessimize ppc64-elf name binding.  (I.e. binds_local_p.)

Uh, oh.  If I understand binds_local_p and mark_constant_function
correctly, setting flag_pic is actually a bug-fix when compiling
powerpc64-linux shared libs.  We have the standard ELF binding of
global syms.  That is, global functions may be overridden by functions
in another shared library or by the main application.

So powerpc64-linux-gcc should allow -fpic/PIC to twiddle flag_pic for
binds_local_p, and users should set -fPIC when compiling shared libs
as is common on other ELF targets.  We could use another flag, because
like that annoying rs6000.c warning says "all code is position
independent" on ppc64, but that would make powerpc64-linux just that
more odd.  Lots of packages set -fPIC to mean "compile me code for a
shared library".
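
As a rough illustration of the binding rule described above, here is a
deliberately simplified model. `symbol_binds_locally` and its parameters are
hypothetical; GCC's real binds_local_p hook takes more into account than
this sketch does.

```c
#include <assert.h>
#include <stdbool.h>

/* PIC_LEVEL models flag_pic: 0 when neither option is given, 1 for
   -fpic, 2 for -fPIC.  Nonzero is taken to mean "compiling for
   inclusion in a shared library".  */
static bool
symbol_binds_locally (bool is_public, int pic_level)
{
  assert (pic_level >= 0 && pic_level <= 2);

  /* File-local symbols can never be overridden.  */
  if (!is_public)
    return true;

  /* A global symbol compiled for a shared library may be overridden
     by another shared library or by the main application.  */
  return pic_level == 0;
}
```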

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30  7:15         ` Alan Modra
@ 2002-08-30  8:27           ` Alan Modra
  2002-08-30  8:42           ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2002-08-30  8:27 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Franz Sirl, Geoff Keating,
	gcc-patches

On Fri, Aug 30, 2002 at 11:00:23PM +0930, Alan Modra wrote:
> So powerpc64-linux-gcc should allow -fpic/PIC to twiddle flag_pic for
> binds_local_p

Just a note to save any duplication of effort:
I have a patch that does this and makes rs6000/* effectively ignore
flag_pic with regard to code generation when the ABI is ABI_AIX.  I'll post
it in the morning if my overnight builds are successful.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30  7:15         ` Alan Modra
  2002-08-30  8:27           ` Alan Modra
@ 2002-08-30  8:42           ` David Edelsohn
  2002-08-30 11:17             ` Franz Sirl
  2002-08-30 17:32             ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-08-30  8:42 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

>>>>> Alan Modra writes:

Alan> Uh, oh.  If I understand binds_local_p and mark_constant_function
Alan> correctly, setting flag_pic is actually a bug-fix when compiling
Alan> powerpc64-linux shared libs.  We have the standard ELF binding of
Alan> global syms.  That is, global functions may be overridden by functions
Alan> in another shared library or by the main application.

Alan> So powerpc64-linux-gcc should allow -fpic/PIC to twiddle flag_pic for
Alan> binds_local_p, and users should set -fPIC when compiling shared libs
Alan> as is common on other ELF targets.  We could use another flag, because
Alan> like that annoying rs6000.c warning says "all code is position
Alan> independent" on ppc64, but that would make powerpc64-linux just that
Alan> more odd.  Lots of packages set -fPIC to mean "compile me code for a
Alan> shared library".

	This is what the patch that I applied to both gcc-3.2 and the
trunk already does, without utilizing the generic infrastructure.  The
PowerPC port currently does not use the targetm.binds_local_p.  We can
discuss evolving to the generic infrastructure for GCC 3.4.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30  8:42           ` David Edelsohn
@ 2002-08-30 11:17             ` Franz Sirl
  2002-08-30 11:26               ` Franz Sirl
  2002-08-30 11:29               ` David Edelsohn
  2002-08-30 17:32             ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: Franz Sirl @ 2002-08-30 11:17 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Richard Henderson, Geoff Keating, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1509 bytes --]

At 17:27 30.08.2002, David Edelsohn wrote:
> >>>>> Alan Modra writes:
>
>Alan> Uh, oh.  If I understand binds_local_p and mark_constant_function
>Alan> correctly, setting flag_pic is actually a bug-fix when compiling
>Alan> powerpc64-linux shared libs.  We have the standard ELF binding of
>Alan> global syms.  That is, global functions may be overridden by functions
>Alan> in another shared library or by the main application.
>
>Alan> So powerpc64-linux-gcc should allow -fpic/PIC to twiddle flag_pic for
>Alan> binds_local_p, and users should set -fPIC when compiling shared libs
>Alan> as is common on other ELF targets.  We could use another flag, because
>Alan> like that annoying rs6000.c warning says "all code is position
>Alan> independent" on ppc64, but that would make powerpc64-linux just that
>Alan> more odd.  Lots of packages set -fPIC to mean "compile me code for a
>Alan> shared library".
>
>         This is what the patch that I applied to both gcc-3.2 and the
>trunk already does, without utilizing the generic infrastructure.  The
>PowerPC port currently does not use the targetm.binds_local_p.  We can
>discuss evolving to the generic infrastructure for GCC 3.4.

What might still be acceptable for 3.3 is the attached patch, which
basically just copies the varasm routines to rs6000.c and adds SDATA2 and
AIXELF PIC handling. Briefly tested on powerpc-linux-gnu without
regressions. Can the AIXELF people give it a try? Sorry, no ChangeLog yet;
I'm in a hurry right now.

Franz.



[-- Attachment #2: gcc-ppcsections-1.patch --]
[-- Type: application/octet-stream, Size: 15128 bytes --]

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.368
diff -u -p -r1.368 rs6000.c
--- gcc/config/rs6000/rs6000.c	23 Aug 2002 18:02:22 -0000	1.368
+++ gcc/config/rs6000/rs6000.c	30 Aug 2002 18:08:39 -0000
@@ -196,9 +196,11 @@ static unsigned int rs6000_elf_section_t
 							   int));
 static void rs6000_elf_asm_out_constructor PARAMS ((rtx, int));
 static void rs6000_elf_asm_out_destructor PARAMS ((rtx, int));
-static void rs6000_elf_select_section PARAMS ((tree, int,
-						 unsigned HOST_WIDE_INT));
-static void rs6000_elf_unique_section PARAMS ((tree, int));
+static bool rs6000_in_small_data_p PARAMS ((tree));
+
+void rs6000_elf_select_section PARAMS ((tree, int,
+					unsigned HOST_WIDE_INT));
+void rs6000_elf_unique_section PARAMS ((tree, int));
 static void rs6000_elf_select_rtx_section PARAMS ((enum machine_mode, rtx,
 						   unsigned HOST_WIDE_INT));
 static void rs6000_elf_encode_section_info PARAMS ((tree, int));
@@ -308,7 +310,8 @@ static const char alt_reg_names[][8] =
 #define TARGET_ATTRIBUTE_TABLE rs6000_attribute_table
 #undef TARGET_SET_DEFAULT_TYPE_ATTRIBUTES
 #define TARGET_SET_DEFAULT_TYPE_ATTRIBUTES rs6000_set_default_type_attributes
-
+#undef TARGET_IN_SMALL_DATA_P
+#define TARGET_IN_SMALL_DATA_P rs6000_in_small_data_p
 #undef TARGET_ASM_ALIGNED_DI_OP
 #define TARGET_ASM_ALIGNED_DI_OP DOUBLE_INT_ASM_OP
 
@@ -6503,7 +6506,7 @@ mtcrf_operation (op, mode)
       maskval = 1 << (MAX_CR_REGNO - REGNO (SET_DEST (exp)));
       
       if (GET_CODE (unspec) != UNSPEC
-	  || XINT (unspec, 1) != 20
+	  || XINT (unspec, 1) != UNSPEC_MOVESI_TO_CR
 	  || XVECLEN (unspec, 0) != 2
 	  || XVECEXP (unspec, 0, 0) != src_reg
 	  || GET_CODE (XVECEXP (unspec, 0, 1)) != CONST_INT
@@ -9633,7 +9636,7 @@ get_TOC_alias_set ()
 }   
 
 /* This retuns nonzero if the current function uses the TOC.  This is
-   determined by the presence of (unspec ... 7), which is generated by
+   determined by the presence of (unspec ... UNSPEC_TOCADR), which is generated by
    the various load_toc_* patterns.  */
 
 int
@@ -9650,7 +9653,7 @@ uses_TOC () 
 	  if (GET_CODE (pat) == PARALLEL) 
 	    for (i = 0; i < XVECLEN (PATTERN (insn), 0); i++)
 	      if (GET_CODE (XVECEXP (PATTERN (insn), 0, i)) == UNSPEC 
-		 && XINT (XVECEXP (PATTERN (insn), 0, i), 1) == 7)
+		 && XINT (XVECEXP (PATTERN (insn), 0, i), 1) == UNSPEC_TOCADR)
 		  return 1;
 	}
     return 0;
@@ -10845,7 +10848,7 @@ rs6000_emit_epilogue (sibcall)
 		RTVEC_ELT (r, 1) = GEN_INT (1 << (7-i));
 		RTVEC_ELT (p, ndx) =
 		  gen_rtx_SET (VOIDmode, gen_rtx_REG (CCmode, CR0_REGNO+i), 
-			       gen_rtx_UNSPEC (CCmode, r, 20));
+			       gen_rtx_UNSPEC (CCmode, r, UNSPEC_MOVESI_TO_CR));
 		ndx++;
 	      }
 	  emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
@@ -12416,137 +12419,293 @@ rs6000_elf_select_rtx_section (mode, x, 
     default_elf_select_rtx_section (mode, x, align);
 }
 
-/* A C statement or statements to switch to the appropriate
-   section for output of DECL.  DECL is either a `VAR_DECL' node
-   or a constant of some sort.  RELOC indicates whether forming
-   the initial value of DECL requires link-time relocations.  */
 
-static void
-rs6000_elf_select_section (decl, reloc, align)
+static bool
+rs6000_in_small_data_p (decl)
      tree decl;
-     int reloc;
-     unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED;
 {
-  int size = int_size_in_bytes (TREE_TYPE (decl));
-  bool needs_sdata;
-  bool readonly;
-  static void (* const sec_funcs[4]) PARAMS ((void)) = {
-    &readonly_data_section,
-    &sdata2_section,
-    &data_section,
-    &sdata_section
-  };
+  /* We want to merge strings, so we never consider them small data.  */
+  if (TREE_CODE (decl) == STRING_CST)
+    return false;
   
-  needs_sdata = (size > 0 
-		 && size <= g_switch_value
-		 && rs6000_sdata != SDATA_NONE
-		 && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
+  if (TREE_CODE (decl) == VAR_DECL && DECL_SECTION_NAME (decl))
+    {
+      const char *section = TREE_STRING_POINTER (DECL_SECTION_NAME (decl));
+      if (strcmp (section, ".sdata") == 0
+	  || strcmp (section, ".sdata2") == 0
+	  || strcmp (section, ".sbss") == 0)
+	return true;
+    }
+  else
+    {
+      int size = int_size_in_bytes (TREE_TYPE (decl));
+      return (size > 0
+	      && size <= g_switch_value
+	      && rs6000_sdata != SDATA_NONE
+	      && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
+    }
+}
 
-  if (TREE_CODE (decl) == STRING_CST)
-    readonly = !flag_writable_strings;
+/* A helper function for default_elf_select_section and
+   default_elf_unique_section.  Categorizes the DECL.  */
+
+enum rs6000_section_category
+{
+  SECCAT_TEXT,
+
+  SECCAT_RODATA,
+  SECCAT_RODATA_MERGE_STR,
+  SECCAT_RODATA_MERGE_STR_INIT,
+  SECCAT_RODATA_MERGE_CONST,
+
+  SECCAT_DATA,
+
+  /* To optimize loading of shared programs, define following subsections
+     of data section:
+        _REL    Contains data that has relocations, so they get grouped
+                together and dynamic linker will visit fewer pages in memory.
+        _RO     Contains data that is otherwise read-only.  This is useful
+                with prelinking as most relocations won't be dynamically
+                linked and thus stay read only.
+        _LOCAL  Marks data containing relocations only to local objects.
+                These relocations will get fully resolved by prelinking.  */
+  SECCAT_DATA_REL,
+  SECCAT_DATA_REL_LOCAL,
+  SECCAT_DATA_REL_RO,
+  SECCAT_DATA_REL_RO_LOCAL,
+
+  SECCAT_SDATA,
+  SECCAT_SDATA2,
+  SECCAT_TDATA,
+
+  SECCAT_BSS,
+  SECCAT_SBSS,
+  SECCAT_TBSS
+};
+
+static enum rs6000_section_category rs6000_categorize_decl_for_section PARAMS ((tree, int, int));
+
+static enum rs6000_section_category
+rs6000_categorize_decl_for_section (decl, reloc, pic)
+     tree decl;
+     int reloc;
+     int pic;
+{
+  enum rs6000_section_category ret;
+
+  if (TREE_CODE (decl) == FUNCTION_DECL)
+    return SECCAT_TEXT;
+  else if (TREE_CODE (decl) == STRING_CST)
+    {
+      if (flag_writable_strings)
+        return SECCAT_DATA;
+      else
+        return SECCAT_RODATA_MERGE_STR;
+    }
   else if (TREE_CODE (decl) == VAR_DECL)
-    readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
-		&& TREE_READONLY (decl)
-		&& !TREE_SIDE_EFFECTS (decl)
-		&& DECL_INITIAL (decl)
-		&& DECL_INITIAL (decl) != error_mark_node
-		&& TREE_CONSTANT (DECL_INITIAL (decl)));
+    {
+      if (DECL_INITIAL (decl) == NULL
+          || DECL_INITIAL (decl) == error_mark_node)
+        ret = SECCAT_BSS;
+      else if (! TREE_READONLY (decl)
+               || TREE_SIDE_EFFECTS (decl)
+               || ! TREE_CONSTANT (DECL_INITIAL (decl)))
+        {
+          if (pic && (reloc & 2))
+            ret = SECCAT_DATA_REL;
+          else if (pic && reloc)
+            ret = SECCAT_DATA_REL_LOCAL;
+          else
+            ret = SECCAT_DATA;
+        }
+      else if (pic && (reloc & 2))
+        ret = SECCAT_DATA_REL_RO;
+      else if (pic && reloc)
+        ret = SECCAT_DATA_REL_RO_LOCAL;
+      else if (flag_merge_constants < 2)
+        /* C and C++ don't allow different variables to share the same
+           location.  -fmerge-all-constants allows even that (at the
+           expense of not conforming).  */
+        ret = SECCAT_RODATA;
+      else if (TREE_CODE (DECL_INITIAL (decl)) == STRING_CST)
+        ret = SECCAT_RODATA_MERGE_STR_INIT;
+      else
+        ret = SECCAT_RODATA_MERGE_CONST;
+    }
   else if (TREE_CODE (decl) == CONSTRUCTOR)
-    readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
-		&& !TREE_SIDE_EFFECTS (decl)
-		&& TREE_CONSTANT (decl));
+    {
+      if ((pic && reloc)
+          || TREE_SIDE_EFFECTS (decl)
+          || ! TREE_CONSTANT (decl))
+        ret = SECCAT_DATA;
+      else
+        ret = SECCAT_RODATA;
+    }
   else
-    readonly = !((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc);
+    ret = SECCAT_RODATA;
 
-  if (needs_sdata && rs6000_sdata != SDATA_EABI)
-    readonly = false;
-  
-  (*sec_funcs[(readonly ? 0 : 2) + (needs_sdata ? 1 : 0)])();
+  /* There are no read-only thread-local sections.  */
+  if (TREE_CODE (decl) == VAR_DECL && DECL_THREAD_LOCAL (decl))
+    {
+      if (ret == SECCAT_BSS)
+        ret = SECCAT_TBSS;
+      else
+        ret = SECCAT_TDATA;
+    }
+
+  /* If the target uses small data sections, select it.  */
+  else if ((*targetm.in_small_data_p) (decl))
+    {
+      if (ret == SECCAT_BSS)
+        ret = SECCAT_SBSS;
+      else
+        ret = SECCAT_SDATA;
+    }
+
+  /* With EABI we put small readonly data into .sdata2.  */
+  if (ret == SECCAT_RODATA && rs6000_sdata == SDATA_EABI
+      && (*targetm.in_small_data_p) (decl))
+    ret = SECCAT_SDATA2;
+
+  return ret;
 }
 
-/* A C statement to build up a unique section name, expressed as a
-   STRING_CST node, and assign it to DECL_SECTION_NAME (decl).
-   RELOC indicates whether the initial value of EXP requires
-   link-time relocations.  If you do not define this macro, GCC will use
-   the symbol name prefixed by `.' as the section name.  Note - this
-   macro can now be called for uninitialized data items as well as
-   initialised data and functions.  */
+/* Select a section based on the above categorization.  */
 
-static void
+void
+rs6000_elf_select_section (decl, reloc, align)
+     tree decl;
+     int reloc;
+     unsigned HOST_WIDE_INT align;
+{
+  bool pic = flag_pic || DEFAULT_ABI == ABI_AIX;
+
+  switch (rs6000_categorize_decl_for_section (decl, reloc, pic))
+    {
+    case SECCAT_TEXT:
+      /* We're not supposed to be called on FUNCTION_DECLs.  */
+      abort ();
+    case SECCAT_RODATA:
+      readonly_data_section ();
+      break;
+    case SECCAT_RODATA_MERGE_STR:
+      mergeable_string_section (decl, align, 0);
+      break;
+    case SECCAT_RODATA_MERGE_STR_INIT:
+      mergeable_string_section (DECL_INITIAL (decl), align, 0);
+      break;
+    case SECCAT_RODATA_MERGE_CONST:
+      mergeable_constant_section (DECL_MODE (decl), align, 0);
+      break;
+    case SECCAT_DATA:
+      data_section ();
+      break;
+    case SECCAT_DATA_REL:
+      named_section (NULL_TREE, ".data.rel", reloc);
+      break;
+    case SECCAT_DATA_REL_LOCAL:
+      named_section (NULL_TREE, ".data.rel.local", reloc);
+      break;
+    case SECCAT_DATA_REL_RO:
+      named_section (NULL_TREE, ".data.rel.ro", reloc);
+      break;
+    case SECCAT_DATA_REL_RO_LOCAL:
+      named_section (NULL_TREE, ".data.rel.ro.local", reloc);
+      break;
+    case SECCAT_SDATA:
+      named_section (NULL_TREE, ".sdata", reloc);
+      break;
+    case SECCAT_SDATA2:
+      named_section (NULL_TREE, ".sdata2", reloc);
+      break;
+    case SECCAT_TDATA:
+      named_section (NULL_TREE, ".tdata", reloc);
+      break;
+    case SECCAT_BSS:
+#ifdef BSS_SECTION_ASM_OP
+      bss_section ();
+#else
+      named_section (NULL_TREE, ".bss", reloc);
+#endif
+      break;
+    case SECCAT_SBSS:
+      named_section (NULL_TREE, ".sbss", reloc);
+      break;
+    case SECCAT_TBSS:
+      named_section (NULL_TREE, ".tbss", reloc);
+      break;
+    default:
+      abort ();
+    }
+}
+
+/* Construct a unique section name based on the decl name and the
+   categorization performed above.  */
+
+void
 rs6000_elf_unique_section (decl, reloc)
      tree decl;
      int reloc;
 {
-  int len;
-  int sec;
-  const char *name;
+  bool one_only = DECL_ONE_ONLY (decl);
+  bool pic = flag_pic || DEFAULT_ABI == ABI_AIX;
+  const char *prefix, *name;
+  size_t nlen, plen;
   char *string;
-  const char *prefix;
 
-  static const char *const prefixes[7][2] =
-  {
-    { ".rodata.", ".gnu.linkonce.r." },
-    { ".sdata2.", ".gnu.linkonce.s2." },
-    { ".data.",   ".gnu.linkonce.d." },
-    { ".sdata.",  ".gnu.linkonce.s." },
-    { ".bss.",    ".gnu.linkonce.b." },
-    { ".sbss.",   ".gnu.linkonce.sb." },
-    { ".text.",   ".gnu.linkonce.t." }
-  };
-
-  if (TREE_CODE (decl) == FUNCTION_DECL)
-    sec = 6;
-  else
+  switch (rs6000_categorize_decl_for_section (decl, reloc, pic))
     {
-      bool readonly;
-      bool needs_sdata;
-      int size;
-
-      if (TREE_CODE (decl) == STRING_CST)
-	readonly = !flag_writable_strings;
-      else if (TREE_CODE (decl) == VAR_DECL)
-	readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
-		    && TREE_READONLY (decl)
-		    && !TREE_SIDE_EFFECTS (decl)
-		    && TREE_CONSTANT (DECL_INITIAL (decl)));
-      else
-	readonly = !((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc);
-
-      size = int_size_in_bytes (TREE_TYPE (decl));
-      needs_sdata = (size > 0 
-		     && size <= g_switch_value
-		     && rs6000_sdata != SDATA_NONE
-		     && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
-
-      if (DECL_INITIAL (decl) == NULL
-	  || DECL_INITIAL (decl) == error_mark_node)
-	sec = 4;
-      else if (!readonly)
-	sec = 2;
-      else
-	sec = 0;
-
-      if (needs_sdata)
-	{
-	  /* .sdata2 is only for EABI.  */
-	  if (sec == 0 && rs6000_sdata != SDATA_EABI)
-	    sec = 2;
-	  sec += 1;
-	}
+    case SECCAT_TEXT:
+      prefix = one_only ? ".gnu.linkonce.t." : ".text.";
+      break;
+    case SECCAT_RODATA:
+    case SECCAT_RODATA_MERGE_STR:
+    case SECCAT_RODATA_MERGE_STR_INIT:
+    case SECCAT_RODATA_MERGE_CONST:
+      prefix = one_only ? ".gnu.linkonce.r." : ".rodata.";
+      break;
+    case SECCAT_DATA:
+    case SECCAT_DATA_REL:
+    case SECCAT_DATA_REL_LOCAL:
+    case SECCAT_DATA_REL_RO:
+    case SECCAT_DATA_REL_RO_LOCAL:
+      prefix = one_only ? ".gnu.linkonce.d." : ".data.";
+      break;
+    case SECCAT_SDATA:
+      prefix = one_only ? ".gnu.linkonce.s." : ".sdata.";
+      break;
+    case SECCAT_SDATA2:
+      prefix = one_only ? ".gnu.linkonce.s2." : ".sdata2.";
+      break;
+    case SECCAT_BSS:
+      prefix = one_only ? ".gnu.linkonce.b." : ".bss.";
+      break;
+    case SECCAT_SBSS:
+      prefix = one_only ? ".gnu.linkonce.sb." : ".sbss.";
+      break;
+    case SECCAT_TDATA:
+      prefix = one_only ? ".gnu.linkonce.td." : ".tdata.";
+      break;
+    case SECCAT_TBSS:
+      prefix = one_only ? ".gnu.linkonce.tb." : ".tbss.";
+      break;
+    default:
+      abort ();
     }
+  plen = strlen (prefix);
 
   name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
-  name = (*targetm.strip_name_encoding) (name);
-  prefix = prefixes[sec][DECL_ONE_ONLY (decl)];
-  len    = strlen (name) + strlen (prefix);
-  string = alloca (len + 1);
-  
-  sprintf (string, "%s%s", prefix, name);
-  
-  DECL_SECTION_NAME (decl) = build_string (len, string);
+  name = (* targetm.strip_name_encoding) (name);
+  nlen = strlen (name);
+
+  string = alloca (nlen + plen + 1);
+  memcpy (string, prefix, plen);
+  memcpy (string + plen, name, nlen + 1);
+
+  DECL_SECTION_NAME (decl) = build_string (nlen + plen, string);
 }
 
-\f
+
 /* If we are referencing a function that is static or is known to be
    in this file, make the SYMBOL_REF special.  We can use this to indicate
    that we can branch to this function without emitting a no-op after the

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30 11:17             ` Franz Sirl
@ 2002-08-30 11:26               ` Franz Sirl
  2002-08-30 11:29               ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: Franz Sirl @ 2002-08-30 11:26 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Richard Henderson, Geoff Keating, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1931 bytes --]

At 20:15 30.08.2002, David Edelsohn wrote:
>At 17:27 30.08.2002, David Edelsohn wrote:
>> >>>>> Alan Modra writes:
>>
>>Alan> Uh, oh.  If I understand binds_local_p and mark_constant_function
>>Alan> correctly, setting flag_pic is actually a bug-fix when compiling
>>Alan> powerpc64-linux shared libs.  We have the standard ELF binding of
>>Alan> global syms.  That is, global functions may be overridden by functions
>>Alan> in another shared library or by the main application.
>>
>>Alan> So powerpc64-linux-gcc should allow -fpic/PIC to twiddle flag_pic for
>>Alan> binds_local_p, and users should set -fPIC when compiling shared libs
>>Alan> as is common on other ELF targets.  We could use another flag, because
>>Alan> like that annoying rs6000.c warning says "all code is position
>>Alan> independent" on ppc64, but that would make powerpc64-linux just that
>>Alan> more odd.  Lots of packages set -fPIC to mean "compile me code for a
>>Alan> shared library".
>>
>>         This is what the patch that I applied to both gcc-3.2 and the
>>trunk already does, without utilizing the generic infrastructure.  The
>>PowerPC port currently does not use the targetm.binds_local_p.  We can
>>discuss evolving to the generic infrastructure for GCC 3.4.
>
>What might be acceptable for 3.3 still is the appended patch which 
>basically just copies the varasm routines to rs6000.c and adds SDATA2 and 
>AIXELF PIC handling. Briefly tested on powerpc-linux-gnu without 
>regressions. Can the AIXELF people give it a try? Sorry, no changelog yet, 
>I'm in a hurry right now.

Argh, forgot to remove the define_constant hunks; here is the corrected patch.

One more comment: while working on this, it occurred to me that it's not
that easy to re-use the varasm code for 3.4 with the current structure, even
if I add a target hook for categorize_decl_for_section as suggested. The
aborts in the generic functions make re-use difficult...

Franz.
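
[Editorial sketch] The last step of rs6000_categorize_decl_for_section in the
attached patch refines a base category for thread-local and small data. A
reduced, standalone version of that step might look like the code below. The
names seccat, decl_info and refine_category are hypothetical and exist only
for this illustration; the three flags stand in for DECL_THREAD_LOCAL,
(*targetm.in_small_data_p) and rs6000_sdata == SDATA_EABI. One caveat:
reading the posted hunk, the .sdata2 test comes after the small-data branch
has already replaced SECCAT_RODATA, so it looks as if it can never fire. The
sketch assumes the intent was to test for .sdata2 first; that is a guess, not
something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for enum rs6000_section_category in the patch.  */
enum seccat
{
  SECCAT_RODATA,
  SECCAT_DATA,
  SECCAT_BSS,
  SECCAT_SDATA,
  SECCAT_SDATA2,
  SECCAT_SBSS,
  SECCAT_TDATA,
  SECCAT_TBSS
};

/* Hypothetical flattened view of the properties the patch queries.  */
struct decl_info
{
  bool is_thread_local;  /* DECL_THREAD_LOCAL (decl) */
  bool in_small_data;    /* (*targetm.in_small_data_p) (decl) */
  bool eabi_sdata;       /* rs6000_sdata == SDATA_EABI */
};

/* Refine BASE for thread-local and small data.  Unlike the posted
   hunk, the .sdata2 case is tested before the generic small-data
   case (assumed intent, see the note above).  */
enum seccat
refine_category (enum seccat base, const struct decl_info *d)
{
  assert (d != NULL);

  /* There are no read-only thread-local sections.  */
  if (d->is_thread_local)
    return base == SECCAT_BSS ? SECCAT_TBSS : SECCAT_TDATA;

  if (!d->in_small_data)
    return base;

  /* With EABI, small read-only data goes into .sdata2.  */
  if (base == SECCAT_RODATA && d->eabi_sdata)
    return SECCAT_SDATA2;

  return base == SECCAT_BSS ? SECCAT_SBSS : SECCAT_SDATA;
}
```

Returning early from each case keeps the ordering problem visible: whichever
test runs first wins, so the more specific .sdata2 case has to precede the
generic small-data case.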

[-- Attachment #2: gcc-ppcsections-1a.patch --]
[-- Type: application/octet-stream, Size: 13732 bytes --]

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.368
diff -u -p -r1.368 rs6000.c
--- gcc/config/rs6000/rs6000.c	23 Aug 2002 18:02:22 -0000	1.368
+++ gcc/config/rs6000/rs6000.c	30 Aug 2002 18:08:39 -0000
@@ -196,9 +196,11 @@ static unsigned int rs6000_elf_section_t
 							   int));
 static void rs6000_elf_asm_out_constructor PARAMS ((rtx, int));
 static void rs6000_elf_asm_out_destructor PARAMS ((rtx, int));
-static void rs6000_elf_select_section PARAMS ((tree, int,
-						 unsigned HOST_WIDE_INT));
-static void rs6000_elf_unique_section PARAMS ((tree, int));
+static bool rs6000_in_small_data_p PARAMS ((tree));
+
+void rs6000_elf_select_section PARAMS ((tree, int,
+					unsigned HOST_WIDE_INT));
+void rs6000_elf_unique_section PARAMS ((tree, int));
 static void rs6000_elf_select_rtx_section PARAMS ((enum machine_mode, rtx,
 						   unsigned HOST_WIDE_INT));
 static void rs6000_elf_encode_section_info PARAMS ((tree, int));
@@ -308,7 +310,8 @@ static const char alt_reg_names[][8] =
 #define TARGET_ATTRIBUTE_TABLE rs6000_attribute_table
 #undef TARGET_SET_DEFAULT_TYPE_ATTRIBUTES
 #define TARGET_SET_DEFAULT_TYPE_ATTRIBUTES rs6000_set_default_type_attributes
-
+#undef TARGET_IN_SMALL_DATA_P
+#define TARGET_IN_SMALL_DATA_P rs6000_in_small_data_p
 #undef TARGET_ASM_ALIGNED_DI_OP
 #define TARGET_ASM_ALIGNED_DI_OP DOUBLE_INT_ASM_OP
 
@@ -12416,137 +12419,293 @@ rs6000_elf_select_rtx_section (mode, x, 
     default_elf_select_rtx_section (mode, x, align);
 }
 
-/* A C statement or statements to switch to the appropriate
-   section for output of DECL.  DECL is either a `VAR_DECL' node
-   or a constant of some sort.  RELOC indicates whether forming
-   the initial value of DECL requires link-time relocations.  */
 
-static void
-rs6000_elf_select_section (decl, reloc, align)
+static bool
+rs6000_in_small_data_p (decl)
      tree decl;
-     int reloc;
-     unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED;
 {
-  int size = int_size_in_bytes (TREE_TYPE (decl));
-  bool needs_sdata;
-  bool readonly;
-  static void (* const sec_funcs[4]) PARAMS ((void)) = {
-    &readonly_data_section,
-    &sdata2_section,
-    &data_section,
-    &sdata_section
-  };
+  /* We want to merge strings, so we never consider them small data.  */
+  if (TREE_CODE (decl) == STRING_CST)
+    return false;
   
-  needs_sdata = (size > 0 
-		 && size <= g_switch_value
-		 && rs6000_sdata != SDATA_NONE
-		 && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
+  if (TREE_CODE (decl) == VAR_DECL && DECL_SECTION_NAME (decl))
+    {
+      const char *section = TREE_STRING_POINTER (DECL_SECTION_NAME (decl));
+      if (strcmp (section, ".sdata") == 0
+	  || strcmp (section, ".sdata2") == 0
+	  || strcmp (section, ".sbss") == 0)
+	return true;
+    }
+  else
+    {
+      int size = int_size_in_bytes (TREE_TYPE (decl));
+      return (size > 0
+	      && size <= g_switch_value
+	      && rs6000_sdata != SDATA_NONE
+	      && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
+    }
+}
 
-  if (TREE_CODE (decl) == STRING_CST)
-    readonly = !flag_writable_strings;
+/* A helper function for default_elf_select_section and
+   default_elf_unique_section.  Categorizes the DECL.  */
+
+enum rs6000_section_category
+{
+  SECCAT_TEXT,
+
+  SECCAT_RODATA,
+  SECCAT_RODATA_MERGE_STR,
+  SECCAT_RODATA_MERGE_STR_INIT,
+  SECCAT_RODATA_MERGE_CONST,
+
+  SECCAT_DATA,
+
+  /* To optimize loading of shared programs, define following subsections
+     of data section:
+        _REL    Contains data that has relocations, so they get grouped
+                together and dynamic linker will visit fewer pages in memory.
+        _RO     Contains data that is otherwise read-only.  This is useful
+                with prelinking as most relocations won't be dynamically
+                linked and thus stay read only.
+        _LOCAL  Marks data containing relocations only to local objects.
+                These relocations will get fully resolved by prelinking.  */
+  SECCAT_DATA_REL,
+  SECCAT_DATA_REL_LOCAL,
+  SECCAT_DATA_REL_RO,
+  SECCAT_DATA_REL_RO_LOCAL,
+
+  SECCAT_SDATA,
+  SECCAT_SDATA2,
+  SECCAT_TDATA,
+
+  SECCAT_BSS,
+  SECCAT_SBSS,
+  SECCAT_TBSS
+};
+
+static enum rs6000_section_category rs6000_categorize_decl_for_section PARAMS ((tree, int, int));
+
+static enum rs6000_section_category
+rs6000_categorize_decl_for_section (decl, reloc, pic)
+     tree decl;
+     int reloc;
+     int pic;
+{
+  enum rs6000_section_category ret;
+
+  if (TREE_CODE (decl) == FUNCTION_DECL)
+    return SECCAT_TEXT;
+  else if (TREE_CODE (decl) == STRING_CST)
+    {
+      if (flag_writable_strings)
+        return SECCAT_DATA;
+      else
+        return SECCAT_RODATA_MERGE_STR;
+    }
   else if (TREE_CODE (decl) == VAR_DECL)
-    readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
-		&& TREE_READONLY (decl)
-		&& !TREE_SIDE_EFFECTS (decl)
-		&& DECL_INITIAL (decl)
-		&& DECL_INITIAL (decl) != error_mark_node
-		&& TREE_CONSTANT (DECL_INITIAL (decl)));
+    {
+      if (DECL_INITIAL (decl) == NULL
+          || DECL_INITIAL (decl) == error_mark_node)
+        ret = SECCAT_BSS;
+      else if (! TREE_READONLY (decl)
+               || TREE_SIDE_EFFECTS (decl)
+               || ! TREE_CONSTANT (DECL_INITIAL (decl)))
+        {
+          if (pic && (reloc & 2))
+            ret = SECCAT_DATA_REL;
+          else if (pic && reloc)
+            ret = SECCAT_DATA_REL_LOCAL;
+          else
+            ret = SECCAT_DATA;
+        }
+      else if (pic && (reloc & 2))
+        ret = SECCAT_DATA_REL_RO;
+      else if (pic && reloc)
+        ret = SECCAT_DATA_REL_RO_LOCAL;
+      else if (flag_merge_constants < 2)
+        /* C and C++ don't allow different variables to share the same
+           location.  -fmerge-all-constants allows even that (at the
+           expense of not conforming).  */
+        ret = SECCAT_RODATA;
+      else if (TREE_CODE (DECL_INITIAL (decl)) == STRING_CST)
+        ret = SECCAT_RODATA_MERGE_STR_INIT;
+      else
+        ret = SECCAT_RODATA_MERGE_CONST;
+    }
   else if (TREE_CODE (decl) == CONSTRUCTOR)
-    readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
-		&& !TREE_SIDE_EFFECTS (decl)
-		&& TREE_CONSTANT (decl));
+    {
+      if ((pic && reloc)
+          || TREE_SIDE_EFFECTS (decl)
+          || ! TREE_CONSTANT (decl))
+        ret = SECCAT_DATA;
+      else
+        ret = SECCAT_RODATA;
+    }
   else
-    readonly = !((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc);
+    ret = SECCAT_RODATA;
 
-  if (needs_sdata && rs6000_sdata != SDATA_EABI)
-    readonly = false;
-  
-  (*sec_funcs[(readonly ? 0 : 2) + (needs_sdata ? 1 : 0)])();
+  /* There are no read-only thread-local sections.  */
+  if (TREE_CODE (decl) == VAR_DECL && DECL_THREAD_LOCAL (decl))
+    {
+      if (ret == SECCAT_BSS)
+        ret = SECCAT_TBSS;
+      else
+        ret = SECCAT_TDATA;
+    }
+
+  /* If the target uses small data sections, select it.  */
+  else if ((*targetm.in_small_data_p) (decl))
+    {
+      if (ret == SECCAT_BSS)
+        ret = SECCAT_SBSS;
+      else
+        ret = SECCAT_SDATA;
+    }
+
+  /* With EABI we put small readonly data into .sdata2.  */
+  if (ret == SECCAT_RODATA && rs6000_sdata == SDATA_EABI
+      && (*targetm.in_small_data_p) (decl))
+    ret = SECCAT_SDATA2;
+
+  return ret;
 }
 
-/* A C statement to build up a unique section name, expressed as a
-   STRING_CST node, and assign it to DECL_SECTION_NAME (decl).
-   RELOC indicates whether the initial value of EXP requires
-   link-time relocations.  If you do not define this macro, GCC will use
-   the symbol name prefixed by `.' as the section name.  Note - this
-   macro can now be called for uninitialized data items as well as
-   initialised data and functions.  */
+/* Select a section based on the above categorization.  */
 
-static void
+void
+rs6000_elf_select_section (decl, reloc, align)
+     tree decl;
+     int reloc;
+     unsigned HOST_WIDE_INT align;
+{
+  bool pic = flag_pic || DEFAULT_ABI == ABI_AIX;
+
+  switch (rs6000_categorize_decl_for_section (decl, reloc, pic))
+    {
+    case SECCAT_TEXT:
+      /* We're not supposed to be called on FUNCTION_DECLs.  */
+      abort ();
+    case SECCAT_RODATA:
+      readonly_data_section ();
+      break;
+    case SECCAT_RODATA_MERGE_STR:
+      mergeable_string_section (decl, align, 0);
+      break;
+    case SECCAT_RODATA_MERGE_STR_INIT:
+      mergeable_string_section (DECL_INITIAL (decl), align, 0);
+      break;
+    case SECCAT_RODATA_MERGE_CONST:
+      mergeable_constant_section (DECL_MODE (decl), align, 0);
+      break;
+    case SECCAT_DATA:
+      data_section ();
+      break;
+    case SECCAT_DATA_REL:
+      named_section (NULL_TREE, ".data.rel", reloc);
+      break;
+    case SECCAT_DATA_REL_LOCAL:
+      named_section (NULL_TREE, ".data.rel.local", reloc);
+      break;
+    case SECCAT_DATA_REL_RO:
+      named_section (NULL_TREE, ".data.rel.ro", reloc);
+      break;
+    case SECCAT_DATA_REL_RO_LOCAL:
+      named_section (NULL_TREE, ".data.rel.ro.local", reloc);
+      break;
+    case SECCAT_SDATA:
+      named_section (NULL_TREE, ".sdata", reloc);
+      break;
+    case SECCAT_SDATA2:
+      named_section (NULL_TREE, ".sdata2", reloc);
+      break;
+    case SECCAT_TDATA:
+      named_section (NULL_TREE, ".tdata", reloc);
+      break;
+    case SECCAT_BSS:
+#ifdef BSS_SECTION_ASM_OP
+      bss_section ();
+#else
+      named_section (NULL_TREE, ".bss", reloc);
+#endif
+      break;
+    case SECCAT_SBSS:
+      named_section (NULL_TREE, ".sbss", reloc);
+      break;
+    case SECCAT_TBSS:
+      named_section (NULL_TREE, ".tbss", reloc);
+      break;
+    default:
+      abort ();
+    }
+}
+
+/* Construct a unique section name based on the decl name and the
+   categorization performed above.  */
+
+void
 rs6000_elf_unique_section (decl, reloc)
      tree decl;
      int reloc;
 {
-  int len;
-  int sec;
-  const char *name;
+  bool one_only = DECL_ONE_ONLY (decl);
+  bool pic = flag_pic || DEFAULT_ABI == ABI_AIX;
+  const char *prefix, *name;
+  size_t nlen, plen;
   char *string;
-  const char *prefix;
 
-  static const char *const prefixes[7][2] =
-  {
-    { ".rodata.", ".gnu.linkonce.r." },
-    { ".sdata2.", ".gnu.linkonce.s2." },
-    { ".data.",   ".gnu.linkonce.d." },
-    { ".sdata.",  ".gnu.linkonce.s." },
-    { ".bss.",    ".gnu.linkonce.b." },
-    { ".sbss.",   ".gnu.linkonce.sb." },
-    { ".text.",   ".gnu.linkonce.t." }
-  };
-
-  if (TREE_CODE (decl) == FUNCTION_DECL)
-    sec = 6;
-  else
+  switch (rs6000_categorize_decl_for_section (decl, reloc, pic))
     {
-      bool readonly;
-      bool needs_sdata;
-      int size;
-
-      if (TREE_CODE (decl) == STRING_CST)
-	readonly = !flag_writable_strings;
-      else if (TREE_CODE (decl) == VAR_DECL)
-	readonly = (!((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc)
-		    && TREE_READONLY (decl)
-		    && !TREE_SIDE_EFFECTS (decl)
-		    && TREE_CONSTANT (DECL_INITIAL (decl)));
-      else
-	readonly = !((flag_pic || DEFAULT_ABI == ABI_AIX) && reloc);
-
-      size = int_size_in_bytes (TREE_TYPE (decl));
-      needs_sdata = (size > 0 
-		     && size <= g_switch_value
-		     && rs6000_sdata != SDATA_NONE
-		     && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)));
-
-      if (DECL_INITIAL (decl) == NULL
-	  || DECL_INITIAL (decl) == error_mark_node)
-	sec = 4;
-      else if (!readonly)
-	sec = 2;
-      else
-	sec = 0;
-
-      if (needs_sdata)
-	{
-	  /* .sdata2 is only for EABI.  */
-	  if (sec == 0 && rs6000_sdata != SDATA_EABI)
-	    sec = 2;
-	  sec += 1;
-	}
+    case SECCAT_TEXT:
+      prefix = one_only ? ".gnu.linkonce.t." : ".text.";
+      break;
+    case SECCAT_RODATA:
+    case SECCAT_RODATA_MERGE_STR:
+    case SECCAT_RODATA_MERGE_STR_INIT:
+    case SECCAT_RODATA_MERGE_CONST:
+      prefix = one_only ? ".gnu.linkonce.r." : ".rodata.";
+      break;
+    case SECCAT_DATA:
+    case SECCAT_DATA_REL:
+    case SECCAT_DATA_REL_LOCAL:
+    case SECCAT_DATA_REL_RO:
+    case SECCAT_DATA_REL_RO_LOCAL:
+      prefix = one_only ? ".gnu.linkonce.d." : ".data.";
+      break;
+    case SECCAT_SDATA:
+      prefix = one_only ? ".gnu.linkonce.s." : ".sdata.";
+      break;
+    case SECCAT_SDATA2:
+      prefix = one_only ? ".gnu.linkonce.s2." : ".sdata2.";
+      break;
+    case SECCAT_BSS:
+      prefix = one_only ? ".gnu.linkonce.b." : ".bss.";
+      break;
+    case SECCAT_SBSS:
+      prefix = one_only ? ".gnu.linkonce.sb." : ".sbss.";
+      break;
+    case SECCAT_TDATA:
+      prefix = one_only ? ".gnu.linkonce.td." : ".tdata.";
+      break;
+    case SECCAT_TBSS:
+      prefix = one_only ? ".gnu.linkonce.tb." : ".tbss.";
+      break;
+    default:
+      abort ();
     }
+  plen = strlen (prefix);
 
   name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
-  name = (*targetm.strip_name_encoding) (name);
-  prefix = prefixes[sec][DECL_ONE_ONLY (decl)];
-  len    = strlen (name) + strlen (prefix);
-  string = alloca (len + 1);
-  
-  sprintf (string, "%s%s", prefix, name);
-  
-  DECL_SECTION_NAME (decl) = build_string (len, string);
+  name = (* targetm.strip_name_encoding) (name);
+  nlen = strlen (name);
+
+  string = alloca (nlen + plen + 1);
+  memcpy (string, prefix, plen);
+  memcpy (string + plen, name, nlen + 1);
+
+  DECL_SECTION_NAME (decl) = build_string (nlen + plen, string);
 }
 
-\f
+
 /* If we are referencing a function that is static or is known to be
    in this file, make the SYMBOL_REF special.  We can use this to indicate
    that we can branch to this function without emitting a no-op after the
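
[Editorial sketch] The name construction at the end of
rs6000_elf_unique_section above replaces sprintf with two memcpy calls. A
standalone version of the same idea is shown below. The helper name
make_unique_section_name and the use of malloc are assumptions made for this
illustration; the patch itself uses alloca and hands the result to
build_string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: join a section prefix such as
   ".rodata." or ".gnu.linkonce.r." with a symbol name.  The caller
   frees the result.  Returns NULL if allocation fails.  */
char *
make_unique_section_name (const char *prefix, const char *name)
{
  size_t plen, nlen;
  char *string;

  assert (prefix != NULL && name != NULL);
  plen = strlen (prefix);
  nlen = strlen (name);

  string = malloc (plen + nlen + 1);
  if (string == NULL)
    return NULL;

  /* Copy the prefix without its terminator, then the name together
     with its terminator, as the patch does.  */
  memcpy (string, prefix, plen);
  memcpy (string + plen, name, nlen + 1);
  return string;
}
```

Because both lengths are already known, the second memcpy can copy nlen + 1
bytes and pick up the terminating NUL without a separate store.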

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30 11:17             ` Franz Sirl
  2002-08-30 11:26               ` Franz Sirl
@ 2002-08-30 11:29               ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-08-30 11:29 UTC (permalink / raw)
  To: Franz Sirl; +Cc: Alan Modra, Richard Henderson, Geoff Keating, gcc-patches

>>>>> Franz Sirl writes:

Franz> What might be acceptable for 3.3 still is the appended patch which 
Franz> basically just copies the varasm routines to rs6000.c and adds SDATA2 and 
Franz> AIXELF PIC handling. Briefly tested on powerpc-linux-gnu without 
Franz> regressions. Can the AIXELF people give it a try?

	No, this is not acceptable for GCC 3.3.  This is not a bugfix.

	For GCC 3.4 we can explore merging the PowerPC functionality in
with the default implementation in varasm.c -- either with an explicit
"pic" argument or modifying flag_pic around the calls. 

	None of the larger, intrusive patches are open for discussion for
GCC 3.3.

David
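
[Editorial sketch] The two approaches mentioned above for GCC 3.4, an explicit
"pic" argument versus modifying flag_pic around the calls, can be contrasted
in a few lines. Everything here is hypothetical: flag_pic_sketch stands in for
GCC's global flag_pic, and none of the function names exist in varasm.c or
rs6000.c.

```c
#include <stdbool.h>

/* Stand-in for the global flag_pic (hypothetical).  */
int flag_pic_sketch = 0;

/* Option 1: the categorizer receives "pic" explicitly, so a target
   whose ABI is always position independent can pass true.  */
bool
needs_data_rel (bool pic, int reloc)
{
  return pic && reloc != 0;
}

/* A generic routine that consults the global flag.  */
bool
generic_needs_data_rel (int reloc)
{
  return needs_data_rel (flag_pic_sketch != 0, reloc);
}

/* Option 2: the target wrapper temporarily overrides the global
   around the call to the generic routine and restores it afterwards.  */
bool
target_needs_data_rel (bool abi_is_pic, int reloc)
{
  int saved = flag_pic_sketch;
  bool result;

  if (abi_is_pic)
    flag_pic_sketch = 1;
  result = generic_needs_data_rel (reloc);
  flag_pic_sketch = saved;
  return result;
}
```

The explicit argument keeps the dependency visible in the signature; the
save-and-restore variant needs no change to the generic code but relies on
every caller restoring the flag.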

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30  8:42           ` David Edelsohn
  2002-08-30 11:17             ` Franz Sirl
@ 2002-08-30 17:32             ` Alan Modra
  2002-08-30 18:17               ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-08-30 17:32 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

On Fri, Aug 30, 2002 at 11:27:25AM -0400, David Edelsohn wrote:
> >>>>> Alan Modra writes:
> 
> Alan> Uh, oh.  If I understand binds_local_p and mark_constant_function
> Alan> correctly, setting flag_pic is actually a bug-fix when compiling
> Alan> powerpc64-linux shared libs.  We have the standard ELF binding of
> Alan> global syms.  That is, global functions may be overridden by functions
> Alan> in another shared library or by the main application.
> 
> Alan> So powerpc64-linux-gcc should allow -fpic/PIC to twiddle flag_pic for
> Alan> binds_local_p, and users should set -fPIC when compiling shared libs
> Alan> as is common on other ELF targets.  We could use another flag, because
> Alan> like that annoying rs6000.c warning says "all code is position
> Alan> independent" on ppc64, but that would make powerpc64-linux just that
> Alan> more odd.  Lots of packages set -fPIC to mean "compile me code for a
> Alan> shared library".
> 
> 	This is what the patch that I applied to both gcc-3.2 and the
> trunk already does, without utilizing the generic infrastructure.  The
> PowerPC port currently does not use the targetm.binds_local_p.  We can
> discuss evolving to the generic infrastructure for GCC 3.4.

The PowerPC back-end code doesn't use binds_local_p, but
mark_constant_function does.

I think I'm correct in claiming that powerpc64-linux-gcc lacks a way
to say "I want this code compiled for a shared library;  Don't try to
analyze global functions for pure/const as the function in this file
may not be the one called at runtime."
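
[Editorial sketch] A small example of the situation described above, assuming
the file is built into a shared library on an ELF target. The function names
are made up for illustration.

```c
/* Hypothetical contents of a file compiled into a shared library.  */

/* Global symbol: under standard ELF rules the main program or another
   shared library may supply its own scale(), and calls resolved
   through the dynamic symbol table then reach that definition instead.  */
int
scale (int x)
{
  return 2 * x;
}

/* Static symbol: always binds within this file.  */
static int
scale_local (int x)
{
  return 2 * x;
}

/* Analyzing or inlining scale() here is only safe if the compiler
   knows that this file's definition is the one called at run time.  */
int
via_global (int x)
{
  return scale (x) + 1;
}

/* scale_local() cannot be overridden, so it is always safe to analyze.  */
int
via_local (int x)
{
  return scale_local (x) + 1;
}
```

With nothing overriding scale(), both callers compute the same value. The
difference only appears when another module defines scale(), which is why the
compiler needs a way to be told that the code is destined for a shared
library.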

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30 17:32             ` Alan Modra
@ 2002-08-30 18:17               ` Richard Henderson
  2002-08-30 18:48                 ` Geoff Keating
  2002-09-02 15:41                 ` [RFC] PowerPC select_section / unique_section David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2002-08-30 18:17 UTC (permalink / raw)
  To: David Edelsohn, Franz Sirl, Geoff Keating, gcc-patches

On Sat, Aug 31, 2002 at 09:19:30AM +0930, Alan Modra wrote:
> The PowerPC back-end code doesn't use binds_local_p, but
> mark_constant_function does.

The inliner will use it in a moment as well.  There's an outstanding
bug wrt -O3 that inlines functions that it shouldn't.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30 18:17               ` Richard Henderson
@ 2002-08-30 18:48                 ` Geoff Keating
  2002-08-30 19:40                   ` -finline-functions vs -fpic Richard Henderson
  2002-09-02 15:41                 ` [RFC] PowerPC select_section / unique_section David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-08-30 18:48 UTC (permalink / raw)
  To: rth; +Cc: dje, Franz.Sirl-kernel, gcc-patches

> Date: Fri, 30 Aug 2002 18:09:27 -0700
> From: Richard Henderson <rth@redhat.com>

> On Sat, Aug 31, 2002 at 09:19:30AM +0930, Alan Modra wrote:
> > The PowerPC back-end code doesn't use binds_local_p, but
> > mark_constant_function does.
> 
> The inliner will use it in a moment as well.  There's an outstanding
> bug wrt -O3 that inlines functions that it shouldn't.

FYI, I tried to fix this the obvious way,

Index: gcc/gcc/c-decl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-decl.c,v
retrieving revision 1.349
diff -p -u -p -r1.349 c-decl.c
--- gcc/gcc/c-decl.c    21 Aug 2002 16:31:34 -0000      1.349
+++ gcc/gcc/c-decl.c    31 Aug 2002 01:33:02 -0000
@@ -4546,9 +4546,14 @@ grokdeclarator (declarator, declspecs, d
              }
          }
        /* If -finline-functions, assume it can be inlined.  This does
-          two things: let the function be deferred until it is actually
-          needed, and let dwarf2 know that the function is inlinable.  */
-       else if (flag_inline_trees == 2 && initialized)
+          two things: let the function be deferred until it is
+          actually needed, and let dwarf2 know that the function is
+          inlinable.  Don't inline anything that might not actually
+          be this function from this file---this is done here because
+          if the user explicitly specifies 'inline' then we want to
+          inline anyway.  */
+       else if (flag_inline_trees == 2 && initialized
+                && (*targetm.binds_local_p) (decl))
          { 
            DECL_INLINE (decl) = 1;
            DECL_DECLARED_INLINE_P (decl) = 0;

(and equivalently for C++) but this doesn't allow globally-visible
functions to be inlined when not -fpic on x86.  Possibly this means
binds_local_p is wrong on x86.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* -finline-functions vs -fpic
  2002-08-30 18:48                 ` Geoff Keating
@ 2002-08-30 19:40                   ` Richard Henderson
  2002-08-30 20:57                     ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-08-30 19:40 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

On Fri, Aug 30, 2002 at 06:43:25PM -0700, Geoff Keating wrote:
> > The inliner will use it in a moment as well.  There's an outstanding
> > bug wrt -O3 that inlines functions that it shouldn't.
> 
> FYI, I tried to fix this the obvious way,
[...]
> (and equivalently for C++) but this doesn't allow globally-visible
> functions to be inlined when not -fpic on x86.  Possibly this means
> binds_local_p is wrong on x86.

Nope.  The decl isn't properly constructed yet.  In particular,
DECL_EXTERNAL is still set, and won't be cleared until start_function.

I thought about moving where we set DECL_INLINE, or changing where
we set DECL_EXTERNAL, but the former didn't seem particularly clean,
and the latter made me nervous.

The following, however, makes me happy.  I'll fix up the C++ front
end similarly in a moment.


r~



	* c-objc-common.c: Include target.h.
	(c_cannot_inline_tree_fn): Don't auto-inline functions that
	don't bind locally.  Factor setting DECL_UNINLINABLE.
	* Makefile.in (c-objc-common.o): Update.

Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.939
diff -c -p -d -r1.939 Makefile.in
*** Makefile.in	22 Aug 2002 04:29:34 -0000	1.939
--- Makefile.in	31 Aug 2002 02:21:22 -0000
*************** c-lex.o : c-lex.c $(CONFIG_H) $(SYSTEM_H
*** 1190,1196 ****
  c-objc-common.o : c-objc-common.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) \
      $(C_TREE_H) $(RTL_H) insn-config.h integrate.h $(EXPR_H) $(C_TREE_H) \
      flags.h toplev.h tree-inline.h diagnostic.h integrate.h $(VARRAY_H) \
!     langhooks.h $(GGC_H) gt-c-objc-common.h
  c-aux-info.o : c-aux-info.c  $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(C_TREE_H) \
      flags.h toplev.h
  c-convert.o : c-convert.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) flags.h toplev.h \
--- 1190,1196 ----
  c-objc-common.o : c-objc-common.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) \
      $(C_TREE_H) $(RTL_H) insn-config.h integrate.h $(EXPR_H) $(C_TREE_H) \
      flags.h toplev.h tree-inline.h diagnostic.h integrate.h $(VARRAY_H) \
!     langhooks.h $(GGC_H) gt-c-objc-common.h $(TARGET_H)
  c-aux-info.o : c-aux-info.c  $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(C_TREE_H) \
      flags.h toplev.h
  c-convert.o : c-convert.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) flags.h toplev.h \
Index: c-objc-common.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-objc-common.c,v
retrieving revision 1.15
diff -c -p -d -r1.15 c-objc-common.c
*** c-objc-common.c	16 Jul 2002 02:16:31 -0000	1.15
--- c-objc-common.c	31 Aug 2002 02:21:22 -0000
*************** Software Foundation, 59 Temple Place - S
*** 34,39 ****
--- 34,40 ----
  #include "varray.h"
  #include "ggc.h"
  #include "langhooks.h"
+ #include "target.h"
  
  static bool c_tree_printer PARAMS ((output_buffer *, text_info *));
  static tree inline_forbidden_p PARAMS ((tree *, int *, void *));
*************** c_cannot_inline_tree_fn (fnp)
*** 150,160 ****
        && lookup_attribute ("always_inline", DECL_ATTRIBUTES (fn)) == NULL)
      return 1;
  
    if (! function_attribute_inlinable_p (fn))
!     {
!       DECL_UNINLINABLE (fn) = 1;
!       return 1;
!     }
  
    /* If a function has pending sizes, we must not defer its
       compilation, and we can't inline it as a tree.  */
--- 151,163 ----
        && lookup_attribute ("always_inline", DECL_ATTRIBUTES (fn)) == NULL)
      return 1;
  
+   /* Don't auto-inline anything that might not be bound within 
+      this unit of translation.  */
+   if (!DECL_DECLARED_INLINE_P (fn) && !(*targetm.binds_local_p) (fn))
+     goto cannot_inline;
+ 
    if (! function_attribute_inlinable_p (fn))
!     goto cannot_inline;
  
    /* If a function has pending sizes, we must not defer its
       compilation, and we can't inline it as a tree.  */
*************** c_cannot_inline_tree_fn (fnp)
*** 164,173 ****
        put_pending_sizes (t);
  
        if (t)
! 	{
! 	  DECL_UNINLINABLE (fn) = 1;
! 	  return 1;
! 	}
      }
  
    if (DECL_CONTEXT (fn))
--- 167,173 ----
        put_pending_sizes (t);
  
        if (t)
! 	goto cannot_inline;
      }
  
    if (DECL_CONTEXT (fn))
*************** c_cannot_inline_tree_fn (fnp)
*** 175,184 ****
        /* If a nested function has pending sizes, we may have already
           saved them.  */
        if (DECL_LANG_SPECIFIC (fn)->pending_sizes)
! 	{
! 	  DECL_UNINLINABLE (fn) = 1;
! 	  return 1;
! 	}
      }
    else
      {
--- 175,181 ----
        /* If a nested function has pending sizes, we may have already
           saved them.  */
        if (DECL_LANG_SPECIFIC (fn)->pending_sizes)
! 	goto cannot_inline;
      }
    else
      {
*************** c_cannot_inline_tree_fn (fnp)
*** 201,212 ****
      }
      
    if (walk_tree (&DECL_SAVED_TREE (fn), inline_forbidden_p, fn, NULL))
!     {
!       DECL_UNINLINABLE (fn) = 1;
!       return 1;
!     }
  
    return 0;
  }
  
  /* Called from check_global_declarations.  */
--- 198,210 ----
      }
      
    if (walk_tree (&DECL_SAVED_TREE (fn), inline_forbidden_p, fn, NULL))
!     goto cannot_inline;
  
    return 0;
+ 
+  cannot_inline:
+   DECL_UNINLINABLE (fn) = 1;
+   return 1;
  }
  
  /* Called from check_global_declarations.  */

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: -finline-functions vs -fpic
  2002-08-30 19:40                   ` -finline-functions vs -fpic Richard Henderson
@ 2002-08-30 20:57                     ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-08-30 20:57 UTC (permalink / raw)
  To: gcc-patches

On Fri, Aug 30, 2002 at 07:28:33PM -0700, Richard Henderson wrote:
> The following, however, makes me happy.  I'll fix up the C++ front
> end similarly in a moment.

Like so.


r~

        * tree.c: Include target.h.
        (cp_cannot_inline_tree_fn): Don't auto-inline functions that
        don't bind locally.
        * Make-lang.in (cp/tree.o): Update.

Index: cp/Make-lang.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/Make-lang.in,v
retrieving revision 1.121
diff -c -p -d -r1.121 Make-lang.in
*** cp/Make-lang.in	8 Aug 2002 09:10:36 -0000	1.121
--- cp/Make-lang.in	31 Aug 2002 02:36:15 -0000
*************** cp/method.o: cp/method.c $(CXX_TREE_H) t
*** 280,286 ****
  cp/cvt.o: cp/cvt.c $(CXX_TREE_H) cp/decl.h flags.h toplev.h convert.h
  cp/search.o: cp/search.c $(CXX_TREE_H) stack.h flags.h toplev.h $(RTL_H)
  cp/tree.o: cp/tree.c $(CXX_TREE_H) flags.h toplev.h $(GGC_H) $(RTL_H) \
!   insn-config.h integrate.h tree-inline.h real.h gt-cp-tree.h
  cp/ptree.o: cp/ptree.c $(CXX_TREE_H) $(SYSTEM_H)
  cp/rtti.o: cp/rtti.c $(CXX_TREE_H) flags.h toplev.h
  cp/except.o: cp/except.c $(CXX_TREE_H) flags.h $(RTL_H) except.h toplev.h \
--- 280,286 ----
  cp/cvt.o: cp/cvt.c $(CXX_TREE_H) cp/decl.h flags.h toplev.h convert.h
  cp/search.o: cp/search.c $(CXX_TREE_H) stack.h flags.h toplev.h $(RTL_H)
  cp/tree.o: cp/tree.c $(CXX_TREE_H) flags.h toplev.h $(GGC_H) $(RTL_H) \
!   insn-config.h integrate.h tree-inline.h real.h gt-cp-tree.h $(TARGET_H)
  cp/ptree.o: cp/ptree.c $(CXX_TREE_H) $(SYSTEM_H)
  cp/rtti.o: cp/rtti.c $(CXX_TREE_H) flags.h toplev.h
  cp/except.o: cp/except.c $(CXX_TREE_H) flags.h $(RTL_H) except.h toplev.h \
Index: cp/tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/tree.c,v
retrieving revision 1.296
diff -c -p -d -r1.296 tree.c
*** cp/tree.c	25 Aug 2002 04:57:15 -0000	1.296
--- cp/tree.c	31 Aug 2002 02:36:15 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 32,37 ****
--- 32,38 ----
  #include "insn-config.h"
  #include "integrate.h"
  #include "tree-inline.h"
+ #include "target.h"
  
  static tree bot_manip PARAMS ((tree *, int *, void *));
  static tree bot_replace PARAMS ((tree *, int *, void *));
*************** cp_cannot_inline_tree_fn (fnp)
*** 2209,2214 ****
--- 2210,2223 ----
        fn = *fnp = instantiate_decl (fn, /*defer_ok=*/0);
        if (TI_PENDING_TEMPLATE_FLAG (DECL_TEMPLATE_INFO (fn)))
  	return 1;
+     }
+ 
+   /* Don't auto-inline anything that might not be bound within
+      this unit of translation.  */
+   if (!DECL_DECLARED_INLINE_P (fn) && !(*targetm.binds_local_p) (fn))
+     {
+       DECL_UNINLINABLE (fn) = 1;
+       return 1;
      }
  
    if (varargs_function_p (fn))

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-08-30 18:17               ` Richard Henderson
  2002-08-30 18:48                 ` Geoff Keating
@ 2002-09-02 15:41                 ` David Edelsohn
  2002-09-02 16:32                   ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-09-02 15:41 UTC (permalink / raw)
  To: Richard Henderson, Alan Modra, Franz Sirl, Geoff Keating; +Cc: gcc-patches

>>>>> Richard Henderson writes:

Richard> The inliner will use it in a moment as well.  There's an outstanding
Richard> bug wrt -O3 that inlines functions that it shouldn't.

	Either this is a bug which needs to be fixed in GCC 3.2/3.3 or it
can wait until GCC 3.4.

	If this needs to be fixed in GCC 3.2/3.3, then either it can be
fixed by tweaking the current PowerPC-specific functions or we need to
extend the default functionality in varasm.c so that the PowerPC port can
tie into it.

	I am happy to help extend the current varasm.c infrastructure for
the PowerPC port to use in the GCC 3.3 release.  I do not believe that
creating yet another different, private set of functions in rs6000.c for
the PowerPC port (even in the short term) is necessary or a good strategy.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 15:41                 ` [RFC] PowerPC select_section / unique_section David Edelsohn
@ 2002-09-02 16:32                   ` Alan Modra
  2002-09-02 16:51                     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-09-02 16:32 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

You can simply allow -fpic to be accepted by ppc64 gcc, and qualify
all the occurrences of flag_pic in rs6000/* with a test on
DEFAULT_ABI.  No special private functions, no modifications to
varasm.c, and as a bonus, a reduction in the size of rs6000.o.
We (linux people) need a flag to say "build for a shared library".
Too many packages use -fpic for this purpose to warrant choosing
some other flag for PowerPC64.

As I said before, I have a patch that does this but David curtly
told me, without explanation, "Don't bother" posting it.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 16:32                   ` Alan Modra
@ 2002-09-02 16:51                     ` David Edelsohn
  2002-09-02 17:13                       ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-09-02 16:51 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

	Again, no one has said what is incorrect about GCC's current
code generation, other than Richard's report about inlining and -O3.  Not
what is inefficient, but what produces incorrect code.

	GCC targeted for AIX and linuxppc64 currently always should
produce code appropriate for shared libraries, with the most recent
changes, other than the problem reported above.  If GCC should produce
different code without -fpic, that is a different question and not a bug.

Alan> We (linux people) need a flag to say "build for a shared library".

	You have not justified this statement.

	All I have seen so far is a bunch of proposed patches and
assumptions without analysis (other than Richard).  When I see a report of
an actual problem, then we can discuss how to solve it, not patches in
search of a bug.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 16:51                     ` David Edelsohn
@ 2002-09-02 17:13                       ` Alan Modra
  2002-09-02 17:57                         ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-09-02 17:13 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

On Mon, Sep 02, 2002 at 07:51:07PM -0400, David Edelsohn wrote:
> Alan> We (linux people) need a flag to say "build for a shared library".
> 
> 	You have not justified this statement.

See http://gcc.gnu.org/ml/gcc-patches/2002-08/msg01821.html

Note "global functions may be overridden".  A silly example, but gcc
cannot optimize the calls to foo in the following if this code appears
in an ELF shared library, as the actual foo called might do something
completely different, like open an emacs window.

int foo (void)
{
  return 0;
}

int bar (void)
{
  return foo () + foo ();
}
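
A minimal sketch of why folding bar is unsafe, assuming a toy model in
which the dynamic linker's binding for foo is represented by a function
pointer.  The names foo_binding, lib_foo, app_foo and interpose_foo are
hypothetical and exist only for this illustration; a real ELF system
resolves the call through the PLT/GOT (or TOC glue) instead.

```c
#include <assert.h>
#include <stddef.h>

/* The library's own definition of foo, as in the example above.  */
int lib_foo (void) { return 0; }

/* A definition of foo supplied by some other module.  */
int app_foo (void) { return 21; }

/* Stand-in for the binding the dynamic linker would fill in.  */
static int (*foo_binding) (void) = lib_foo;

/* bar as it must behave in a shared library: both calls honour
   whatever foo currently resolves to.  */
int bar_interposable (void)
{
  return foo_binding () + foo_binding ();
}

/* bar after the compiler inlined the local foo and folded 0 + 0.  */
int bar_inlined (void)
{
  return 0;
}

/* Simulate the dynamic linker resolving foo to another module.  */
void interpose_foo (int (*replacement) (void))
{
  assert (replacement != NULL);
  foo_binding = replacement;
}
```

Before any override the two versions of bar agree.  After
interpose_foo (app_foo) only bar_interposable sees the new definition,
which is the behaviour an ELF shared library is expected to have.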

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 17:13                       ` Alan Modra
@ 2002-09-02 17:57                         ` David Edelsohn
  2002-09-02 18:27                           ` Alan Modra
  2002-09-03  8:41                           ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-09-02 17:57 UTC (permalink / raw)
  To: Alan Modra, Richard Henderson; +Cc: Franz Sirl, Geoff Keating, gcc-patches

	To fix the need for GCC to know that it is compiling in PIC mode,
the linuxppc64 and AIX targets can:

1) allow flag_pic to be set on the commandline (creating an artificial
distinction between PIC and non-PIC object files), or

2) uniformly set flag_pic (creating problems because of the other uses of
flag_pic in the compiler), or

3) set and unset flag_pic around calls to the select section and
default_binds_local_p functions, or

4) modify the default select section and default_binds_local_p functions
to accept a "pic" argument (and override targetm.binds_local_p for
PowerPC).

	I prefer option 4 to allow finer-grained control, allow greater
compiler optimization opportunities, and use better programming practice.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 17:57                         ` David Edelsohn
@ 2002-09-02 18:27                           ` Alan Modra
  2002-09-02 18:49                             ` David Edelsohn
                                               ` (2 more replies)
  2002-09-03  8:41                           ` Geoff Keating
  1 sibling, 3 replies; 875+ messages in thread
From: Alan Modra @ 2002-09-02 18:27 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

On Mon, Sep 02, 2002 at 08:57:31PM -0400, David Edelsohn wrote:
> 	To fix the need for GCC to know that it is compiling in PIC mode,
> the linuxppc64 and AIX targets can:
> 
> 1) allow flag_pic to be set on the commandline (creating an artificial
> distinction between PIC and non-PIC object files), or

This is the one we want.  In our case -fpic doesn't mean "position
independent code", as we're always that sort of PIC.  Instead it just
means create code for shared library linking semantics.  Note that
-fpic means both PIC code _and_ shared library code generation on
other targets.  Which is perhaps unfortunate as they are really two
separate issues.

> 2) uniformly set flag_pic (creating problems because of the other uses of
> flag_pic in the compiler), or

No, because then you lose optimization opportunities, as rth pointed
out.

> 3) set and unset flag_pic around calls to the select section and
> default_binds_local_p functions, or
> 
> 4) modify the default select section and default_binds_local_p functions
> to accept a "pic" argument (and override targetm.binds_local_p for
> PowerPC).
> 
> 	I prefer option 4 to allow finer-grained control, allow greater
> compiler optimization opportunities, and use better programming practice.

3) and 4) don't address how gcc should be told to generate code for
shared libraries.  You seem to be assuming some flag other than -fpic
would be used.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 18:27                           ` Alan Modra
@ 2002-09-02 18:49                             ` David Edelsohn
  2002-09-02 19:41                               ` Alan Modra
  2002-09-02 20:17                               ` Richard Henderson
  2002-09-02 20:11                             ` Jeff Sturm
  2002-09-03  9:29                             ` Mark Mitchell
  2 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2002-09-02 18:49 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

>>>>> Alan Modra writes:

Alan> 3) and 4) don't address how gcc should be told to generate code for
Alan> shared libraries.  You seem to be assuming some flag other than -fpic
Alan> would be used.

	I am not assuming any additional flags.

	The question is whether we ever need to distinguish between code
generation for shared libraries and code generation for regular object
files.  AIX compilers never make any distinction.

	AIX defaults to tight binding within shared objects, but AIX 4.2
and above added the runtime-linking flag to allow SVR4-type symbol
overriding.  No changes have been necessary to IBM's compilers or to GCC
to support this.  There is no "prepare for runtime-linking" compiler
option.

	GCC always accesses globals through the TOC and always calls
public functions through glue, so symbols always can be overridden -- at
least until the recent binds_local_p changes.

	It is not obvious that GCC should behave differently for shared
libraries and regular object files when targeting ABI_AIX. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 18:49                             ` David Edelsohn
@ 2002-09-02 19:41                               ` Alan Modra
  2002-09-02 19:59                                 ` David Edelsohn
  2002-09-02 20:17                               ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-09-02 19:41 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

On Mon, Sep 02, 2002 at 09:49:11PM -0400, David Edelsohn wrote:
> 	It is not obvious that GCC should behave differently for shared
> libraries and regular object files when targeting ABI_AIX. 

I don't know how to make it any more obvious to you.  Do you disagree
with:

1) Current mainline powerpc64-linux gcc, in the face of inlining and
   const function folding optimizations, is broken when building
   shared libraries.

2) Therefore these optimizations must be turned off when building
   shared libraries.

3) If shared and regular objects are to be built the same, then these
   optimizations must be turned off for regular objects too.

You may make the decision that AIX gcc doesn't need these
optimizations, but please don't hobble PowerPC64 Linux.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 19:41                               ` Alan Modra
@ 2002-09-02 19:59                                 ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-09-02 19:59 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

>>>>> Alan Modra writes:

Alan> You may make the decision that AIX gcc doesn't need these
Alan> optimizations, but please don't hobble PowerPC64 Linux.

	Please tone it down and stop making this personal.

	This type of decision needs input from others, including other
experts on calling conventions and compilers within IBM.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 18:27                           ` Alan Modra
  2002-09-02 18:49                             ` David Edelsohn
@ 2002-09-02 20:11                             ` Jeff Sturm
  2002-09-02 20:19                               ` David Edelsohn
  2002-09-03  9:29                             ` Mark Mitchell
  2 siblings, 1 reply; 875+ messages in thread
From: Jeff Sturm @ 2002-09-02 20:11 UTC (permalink / raw)
  To: Alan Modra
  Cc: David Edelsohn, Richard Henderson, Franz Sirl, Geoff Keating,
	gcc-patches

On Tue, 3 Sep 2002, Alan Modra wrote:
> Note that -fpic means both PIC code _and_ shared library code generation
> on other targets.

Some targets.  Unless I'm very mistaken, all code is PIC on alpha, so that
-fPIC has little effect apart from binds_local_p.

Irrespective of PPC, wouldn't it be desirable to have uniformity across
GNU/Linux targets over this issue?

Jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 18:49                             ` David Edelsohn
  2002-09-02 19:41                               ` Alan Modra
@ 2002-09-02 20:17                               ` Richard Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-09-02 20:17 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Franz Sirl, Geoff Keating, gcc-patches

On Mon, Sep 02, 2002 at 09:49:11PM -0400, David Edelsohn wrote:
> 	GCC always accesses globals through the TOC and always calls
> public functions through glue, so symbols always can be overridden -- at
> least until the recent binds_local_p changes.

This is actually incorrect.  Until the binds_local_p changes, various
parts of gcc did various checks (or not), somewhat at random.  The
binds_local_p change "merely" centralizes the logic.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 20:11                             ` Jeff Sturm
@ 2002-09-02 20:19                               ` David Edelsohn
  2002-09-03  0:16                                 ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-09-02 20:19 UTC (permalink / raw)
  To: Jeff Sturm
  Cc: Alan Modra, Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches

>>>>> Jeff Sturm writes:

Jeff> Some targets.  Unless I'm very mistaken, all code is PIC on alpha, so that
Jeff> -fPIC has little effect apart from binds_local_p.

Jeff> Irrespective of PPC, wouldn't it be desirable to have uniformity across
Jeff> GNU/Linux targets over this issue?

	Which seems like a good argument for default_binds_local_p and
others using a "pic" function parameter instead of grabbing the global
flag_pic.  This would allow -fpic/-fPIC to affect targetm.binds_local_p
while not affecting the other functions.

	For instance, rs6000_override_options could reset the GCC global
flag_pic to zero while remembering the original value privately and then
using that value to affect binds_local_p.

	These uses of flag_pic seem like orthogonal decisions to me, and
ports should have finer-grained control, which is available only if
binds_local_p and select_section accept arguments.
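
	A rough sketch of the idea, as a toy model rather than rs6000.c
code: toy_override_options, saved_shlib, toy_decl and toy_binds_local_p
are hypothetical names, and flag_pic here is a local stand-in for GCC's
global of the same name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for GCC's global flag_pic (nonzero for -fpic/-fPIC).  */
int flag_pic;

/* Port-private copy of what the user asked for.  */
static int saved_shlib;

/* Toy declaration carrying only the property this sketch needs.  */
struct toy_decl
{
  bool is_public;
};

/* Called once after option parsing: remember the request, then clear
   the global so the rest of the compiler sees a non-PIC compilation.  */
void toy_override_options (void)
{
  saved_shlib = flag_pic;
  flag_pic = 0;
}

/* Only the binding decision consults the remembered value.  */
bool toy_binds_local_p (const struct toy_decl *decl)
{
  assert (decl != NULL);
  if (!decl->is_public)
    return true;		/* static symbols are always local */
  return !saved_shlib;		/* globals may be overridden if shlib */
}
```

	With this split, -fpic changes the answer for public symbols
only, and every other test of flag_pic behaves as if the option had not
been given.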

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 20:19                               ` David Edelsohn
@ 2002-09-03  0:16                                 ` Richard Henderson
  2002-09-03  8:22                                   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-09-03  0:16 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Jeff Sturm, Alan Modra, Franz Sirl, Geoff Keating, gcc-patches

On Mon, Sep 02, 2002 at 11:19:11PM -0400, David Edelsohn wrote:
> Jeff> Some targets.  Unless I'm very mistaken, all code is PIC on alpha,
> Jeff> so that -fPIC has little effect apart from binds_local_p.

Well, it also changes what sections data with relocations are placed in.
But I think you underestimate what effect binds_local_p has -- variables
that are known to be local are addressed with gp-relative relocations
instead of through the got, calls within the same UOT use bsr, etc.

> Jeff> Irrespective of PPC, wouldn't it be desirable to have uniformity
> Jeff> across GNU/Linux targets over this issue?

Yes.  But aside from PPC, I think we largely have that already.

> 	Which seems like a good argument for default_binds_local_p and
> others using a "pic" function parameter instead of grabbing the global
> flag_pic.  This would allow -fpic/-fPIC to affect targetm.binds_local_p
> while not affecting the other functions.

I don't see how that follows.  In fact, I think that would simply
be confusing to the many places that want to call binds_local_p.

I would be willing to extract the bulk of default_binds_local_p
into a default_binds_local_p_1 that did have a "shlib" parameter,
and similar for select_section, but I don't want the choice of
what value of flag_pic to pass to binds_local_p to reside at all
of the call sites.
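
Roughly the shape I have in mind, as a simplified stand-alone model.
The struct toy_decl and the toy_ prefixes are hypothetical; the real
function works on a tree and tests more conditions than the four shown
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for GCC's global flag_pic.  */
int flag_pic;

/* Toy declaration with a few of the properties that matter.  */
struct toy_decl
{
  bool is_public;	/* visible outside this translation unit */
  bool is_external;	/* defined in some other object */
  bool is_weak;		/* weak or linkonce */
};

/* The bulk of the logic, with the policy passed in explicitly.  */
bool toy_binds_local_p_1 (const struct toy_decl *decl, int shlib)
{
  assert (decl != NULL);
  if (!decl->is_public)
    return true;
  if (decl->is_external)
    return false;
  if (decl->is_weak)
    return false;
  if (shlib)
    return false;	/* any global name can be overridden */
  return true;
}

/* The entry point every caller keeps using; only this wrapper decides
   which value to pass.  */
bool toy_binds_local_p (const struct toy_decl *decl)
{
  return toy_binds_local_p_1 (decl, flag_pic);
}
```

A port that wants a different policy would install its own hook that
calls the _1 function with its own argument, and the call sites stay
unchanged.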

r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-03  0:16                                 ` Richard Henderson
@ 2002-09-03  8:22                                   ` David Edelsohn
  2002-09-03  9:04                                     ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-09-03  8:22 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jeff Sturm, Alan Modra, Franz Sirl, Geoff Keating, gcc-patches

>>>>> Richard Henderson writes:

Richard> I would be willing to extract the bulk of default_binds_local_p
Richard> into a default_binds_local_p_1 that did have a "shlib" parameter,
Richard> and similar for select_section, but I don't want the choice of
Richard> what value of flag_pic to pass to binds_local_p to reside at all
Richard> of the call sites.

	Like the appended patch?

	Do you want all of the default_* functions wrapped around a
default_*_1 function instead of adding flag_pic at the call sites?

Thanks, David


	* varasm.c (default_binds_local_p): Rename as
	default_binds_local_p_1 with shlib parameter.  Use original name
	to call new function with flag_pic.

Index: varasm.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/varasm.c,v
retrieving revision 1.303
diff -c -p -r1.303 varasm.c
*** varasm.c	21 Aug 2002 02:41:44 -0000	1.303
--- varasm.c	3 Sep 2002 15:15:36 -0000
*************** default_strip_name_encoding (str)
*** 5202,5209 ****
     wrt cross-module name binding.  */
  
  bool
! default_binds_local_p (exp)
       tree exp;
  {
    bool local_p;
  
--- 5205,5213 ----
     wrt cross-module name binding.  */
  
  bool
! default_binds_local_p_1 (exp, shlib)
       tree exp;
+      int shlib;
  {
    bool local_p;
  
*************** default_binds_local_p (exp)
*** 5224,5230 ****
      local_p = false;
    /* If PIC, then assume that any global name can be overridden by
       symbols resolved from other modules.  */
!   else if (flag_pic)
      local_p = false;
    /* Uninitialized COMMON variable may be unified with symbols
       resolved from other modules.  */
--- 5228,5234 ----
      local_p = false;
    /* If PIC, then assume that any global name can be overridden by
       symbols resolved from other modules.  */
!   else if (shlib)
      local_p = false;
    /* Uninitialized COMMON variable may be unified with symbols
       resolved from other modules.  */
*************** default_binds_local_p (exp)
*** 5238,5243 ****
--- 5242,5254 ----
      local_p = true;
  
    return local_p;
+ }
+ 
+ bool
+ default_binds_local_p (exp)
+      tree exp;
+ {
+   return default_binds_local_p_1 (exp, flag_pic);
  }
  
  /* Default function to output code that will globalize a label.  A
Index: output.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/output.h,v
retrieving revision 1.109
diff -c -p -r1.109 output.h
*** output.h	21 Aug 2002 02:41:44 -0000	1.109
--- output.h	3 Sep 2002 15:15:37 -0000
*************** extern void default_elf_select_rtx_secti
*** 537,542 ****
--- 537,543 ----
  						    unsigned HOST_WIDE_INT));
  extern const char *default_strip_name_encoding PARAMS ((const char *));
  extern bool default_binds_local_p PARAMS ((tree));
+ extern bool default_binds_local_p_1 PARAMS ((tree, int));
  extern void default_globalize_label PARAMS ((FILE *, const char *));
  
  /* Emit data for vtable gc for GNU binutils.  */

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 17:57                         ` David Edelsohn
  2002-09-02 18:27                           ` Alan Modra
@ 2002-09-03  8:41                           ` Geoff Keating
  2002-09-03  9:50                             ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2002-09-03  8:41 UTC (permalink / raw)
  To: dje; +Cc: amodra, rth, Franz.Sirl-kernel, gcc-patches

> cc: Franz Sirl <Franz.Sirl-kernel@lauterbach.com>,
>    Geoff Keating <geoffk@redhat.com>, gcc-patches@gcc.gnu.org
> Date: Mon, 02 Sep 2002 20:57:31 -0400
> From: David Edelsohn <dje@watson.ibm.com>
> 
> 	To fix the need for GCC to know that it is compiling in PIC mode,
> the linuxppc64 and AIX targets can:

Do you need to do this for AIX?  I thought the symbol binding rules
for AIX shared libraries were different, they didn't permit overriding
a symbol in a shared library.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-03  8:22                                   ` David Edelsohn
@ 2002-09-03  9:04                                     ` Richard Henderson
  2002-09-03 10:40                                       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-09-03  9:04 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Jeff Sturm, Alan Modra, Franz Sirl, Geoff Keating, gcc-patches

On Tue, Sep 03, 2002 at 11:21:18AM -0400, David Edelsohn wrote:
> 	Like the appended patch?

Yes, thanks.

> 	Do you want all of the default_* functions wrapped around a
> default_*_1 function instead of adding flag_pic at the call sites?

I think so, yes.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-02 18:27                           ` Alan Modra
  2002-09-02 18:49                             ` David Edelsohn
  2002-09-02 20:11                             ` Jeff Sturm
@ 2002-09-03  9:29                             ` Mark Mitchell
  2 siblings, 0 replies; 875+ messages in thread
From: Mark Mitchell @ 2002-09-03  9:29 UTC (permalink / raw)
  To: Alan Modra, David Edelsohn
  Cc: Richard Henderson, Franz Sirl, Geoff Keating, gcc-patches



--On Tuesday, September 03, 2002 10:57:54 AM +0930 Alan Modra 
<amodra@bigpond.net.au> wrote:

> On Mon, Sep 02, 2002 at 08:57:31PM -0400, David Edelsohn wrote:
>> 	To fix the need for GCC to know that it is compiling in PIC mode,
>> the linuxppc64 and AIX targets can:
>>
>> 1) allow flag_pic to be set on the commandline (creating an artificial
>> distinction between PIC and non-PIC object files), or
>
> This is the one we want.  In our case -fpic doesn't mean "position
> independent code", as we're always that sort of PIC.  Instead it just
> means create code for shared library linking semantics.  Note that
> -fpic means both PIC code _and_ shared library code generation on
> other targets.  Which is perhaps unfortunate as they are really two
> separate issues.

So why not fix this?

Add -fshared-lib, and have -fpic turn it on.  Then, test flag_shared_lib
where you mean that, and flag_pic where you mean that.  There's no
compatibility problem for existing code, and on systems where the two
ideas are different, people now have a way of saying which mode they
want.

(You have to notice -fpic -fno-pic which now means something different
than it did before, and warn about that, but that will be a rare case.)
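
A sketch of the option handling, under stated assumptions: -fshared-lib
and flag_shared_lib are only the proposal above, not existing GCC
options, and struct toy_flags and toy_handle_option are hypothetical
names.  What -fno-pic should do to the shared-library setting is left
open above; this sketch takes one reading, in which it is kept and the
combination is flagged for a warning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_flags
{
  int pic;		/* position-independent addressing */
  int shared_lib;	/* shared-library binding semantics */
  int warned;		/* set when -fno-pic follows -fpic */
};

/* Process one command-line option, in order of appearance.  */
void toy_handle_option (struct toy_flags *f, const char *arg)
{
  assert (f != NULL && arg != NULL);
  if (strcmp (arg, "-fpic") == 0)
    {
      f->pic = 1;
      f->shared_lib = 1;	/* -fpic turns the new flag on */
    }
  else if (strcmp (arg, "-fshared-lib") == 0)
    f->shared_lib = 1;
  else if (strcmp (arg, "-fno-pic") == 0)
    {
      if (f->pic)
	f->warned = 1;		/* meaning changed: warn the user */
      f->pic = 0;
    }
}
```

Code that cares about addressing would then test pic, and code that
cares about symbol overriding would test shared_lib.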

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-03  8:41                           ` Geoff Keating
@ 2002-09-03  9:50                             ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-09-03  9:50 UTC (permalink / raw)
  To: Geoff Keating; +Cc: amodra, rth, Franz.Sirl-kernel, gcc-patches

>>>>> Geoff Keating writes:

Geoff> Do you need to do this for AIX?  I thought the symbol binding rules
Geoff> for AIX shared libraries were different, they didn't permit overriding
Geoff> a symbol in a shared library.

	Because AIX does support a run-time linking mode, this could be an
issue.  IBM's compilers currently do not change their code generation to
make this easier or more difficult.

	The two issues for both AIX and linuxppc64 are placing certain
types of data in writeable versus read-only sections and the ability to
interpose functions (binds_local_p).

	My current belief is that AIX and linuxppc64 always should default
to PIC and shareable, and that -fpic should only affect binds_local_p to
ensure that symbols can be interposed.  The 64-bit PowerPC SVR4 ABI
already does have a non-PIC mode which GCC does not generate, so having a
non-PIC mode for PIC code would add confusion.  Some of the choice of
sections depends on the PowerPC architecture and how best to utilize it.

	I don't think that any policy change from the current defaults
should be made without a lot more investigation -- which I am doing.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-03  9:04                                     ` Richard Henderson
@ 2002-09-03 10:40                                       ` David Edelsohn
  2002-09-03 13:44                                         ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2002-09-03 10:40 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jeff Sturm, Alan Modra, Franz Sirl, Geoff Keating, gcc-patches

>>>>> Richard Henderson writes:

>> Do you want all of the default_* functions wrapped around a
>> default_*_1 function instead of adding flag_pic at the call sites?

Richard> I think so, yes.

	How about the following patch?  Then we can move on to integrating
sdata. 

Thanks, David

	* varasm.c (default_section_type_flags): Append _1 to name with
	shlib parameter.  Use original name to call new function with
	implicit flag_pic.
	(decl_readonly_section): Likewise.
	(default_elf_select_section): Likewise.
	(default_unique_section): Likewise.
	(default_binds_local_p): Likewise.
	(categorize_decl_for_section): Add shlib parameter to use in place
	of implicit flag_pic.
	* output.h: Declare new functions with _1 and shlib argument.

Index: varasm.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/varasm.c,v
retrieving revision 1.303
diff -c -p -r1.303 varasm.c
*** varasm.c	21 Aug 2002 02:41:44 -0000	1.303
--- varasm.c	3 Sep 2002 17:29:08 -0000
*************** init_varasm_once ()
*** 4686,4701 ****
     read-only for a const data decl, and writable for a non-const data decl.  */
  
  unsigned int
! default_section_type_flags (decl, name, reloc)
       tree decl;
       const char *name;
       int reloc;
  {
    unsigned int flags;
  
    if (decl && TREE_CODE (decl) == FUNCTION_DECL)
      flags = SECTION_CODE;
!   else if (decl && decl_readonly_section (decl, reloc))
      flags = 0;
    else
      flags = SECTION_WRITE;
--- 4686,4702 ----
     read-only for a const data decl, and writable for a non-const data decl.  */
  
  unsigned int
! default_section_type_flags_1 (decl, name, reloc, shlib)
       tree decl;
       const char *name;
       int reloc;
+      int shlib;
  {
    unsigned int flags;
  
    if (decl && TREE_CODE (decl) == FUNCTION_DECL)
      flags = SECTION_CODE;
!   else if (decl && decl_readonly_section_1 (decl, reloc, shlib))
      flags = 0;
    else
      flags = SECTION_WRITE;
*************** default_section_type_flags (decl, name, 
*** 4725,4730 ****
--- 4726,4740 ----
    return flags;
  }
  
+ unsigned int
+ default_section_type_flags (decl, name, reloc)
+      tree decl;
+      const char *name;
+      int reloc;
+ {
+   return default_section_type_flags_1 (decl, name, reloc, flag_pic);
+ }
+ 
  /* Output assembly to switch to section NAME with attribute FLAGS.
     Four variants for common object file formats.  */
  
*************** enum section_category
*** 4913,4924 ****
    SECCAT_TBSS
  };
  
! static enum section_category categorize_decl_for_section PARAMS ((tree, int));
  
  static enum section_category
! categorize_decl_for_section (decl, reloc)
       tree decl;
       int reloc;
  {
    enum section_category ret;
  
--- 4923,4936 ----
    SECCAT_TBSS
  };
  
! static enum section_category
! categorize_decl_for_section PARAMS ((tree, int, int));
  
  static enum section_category
! categorize_decl_for_section (decl, reloc, shlib)
       tree decl;
       int reloc;
+      int shlib;
  {
    enum section_category ret;
  
*************** categorize_decl_for_section (decl, reloc
*** 4940,4955 ****
  	       || TREE_SIDE_EFFECTS (decl)
  	       || ! TREE_CONSTANT (DECL_INITIAL (decl)))
  	{
! 	  if (flag_pic && (reloc & 2))
  	    ret = SECCAT_DATA_REL;
! 	  else if (flag_pic && reloc)
  	    ret = SECCAT_DATA_REL_LOCAL;
  	  else
  	    ret = SECCAT_DATA;
  	}
!       else if (flag_pic && (reloc & 2))
  	ret = SECCAT_DATA_REL_RO;
!       else if (flag_pic && reloc)
  	ret = SECCAT_DATA_REL_RO_LOCAL;
        else if (flag_merge_constants < 2)
  	/* C and C++ don't allow different variables to share the same
--- 4952,4967 ----
  	       || TREE_SIDE_EFFECTS (decl)
  	       || ! TREE_CONSTANT (DECL_INITIAL (decl)))
  	{
! 	  if (shlib && (reloc & 2))
  	    ret = SECCAT_DATA_REL;
! 	  else if (shlib && reloc)
  	    ret = SECCAT_DATA_REL_LOCAL;
  	  else
  	    ret = SECCAT_DATA;
  	}
!       else if (shlib && (reloc & 2))
  	ret = SECCAT_DATA_REL_RO;
!       else if (shlib && reloc)
  	ret = SECCAT_DATA_REL_RO_LOCAL;
        else if (flag_merge_constants < 2)
  	/* C and C++ don't allow different variables to share the same
*************** categorize_decl_for_section (decl, reloc
*** 4963,4969 ****
      }
    else if (TREE_CODE (decl) == CONSTRUCTOR)
      {
!       if ((flag_pic && reloc)
  	  || TREE_SIDE_EFFECTS (decl)
  	  || ! TREE_CONSTANT (decl))
  	ret = SECCAT_DATA;
--- 4975,4981 ----
      }
    else if (TREE_CODE (decl) == CONSTRUCTOR)
      {
!       if ((shlib && reloc)
  	  || TREE_SIDE_EFFECTS (decl)
  	  || ! TREE_CONSTANT (decl))
  	ret = SECCAT_DATA;
*************** categorize_decl_for_section (decl, reloc
*** 4995,5005 ****
  }
  
  bool
! decl_readonly_section (decl, reloc)
       tree decl;
       int reloc;
  {
!   switch (categorize_decl_for_section (decl, reloc))
      {
      case SECCAT_RODATA:
      case SECCAT_RODATA_MERGE_STR:
--- 5007,5018 ----
  }
  
  bool
! decl_readonly_section_1 (decl, reloc, shlib)
       tree decl;
       int reloc;
+      int shlib;
  {
!   switch (categorize_decl_for_section (decl, reloc, shlib))
      {
      case SECCAT_RODATA:
      case SECCAT_RODATA_MERGE_STR:
*************** decl_readonly_section (decl, reloc)
*** 5013,5027 ****
      }
  }
  
  /* Select a section based on the above categorization.  */
  
  void
! default_elf_select_section (decl, reloc, align)
       tree decl;
       int reloc;
       unsigned HOST_WIDE_INT align;
  {
!   switch (categorize_decl_for_section (decl, reloc))
      {
      case SECCAT_TEXT:
        /* We're not supposed to be called on FUNCTION_DECLs.  */
--- 5026,5049 ----
      }
  }
  
+ bool
+ decl_readonly_section (decl, reloc)
+      tree decl;
+      int reloc;
+ {
+   return decl_readonly_section_1 (decl, reloc, flag_pic);
+ }
+ 
  /* Select a section based on the above categorization.  */
  
  void
! default_elf_select_section_1 (decl, reloc, align, shlib)
       tree decl;
       int reloc;
       unsigned HOST_WIDE_INT align;
+      int shlib;
  {
!   switch (categorize_decl_for_section (decl, reloc, shlib))
      {
      case SECCAT_TEXT:
        /* We're not supposed to be called on FUNCTION_DECLs.  */
*************** default_elf_select_section (decl, reloc,
*** 5077,5096 ****
      }
  }
  
  /* Construct a unique section name based on the decl name and the
     categorization performed above.  */
  
  void
! default_unique_section (decl, reloc)
       tree decl;
       int reloc;
  {
    bool one_only = DECL_ONE_ONLY (decl);
    const char *prefix, *name;
    size_t nlen, plen;
    char *string;
  
!   switch (categorize_decl_for_section (decl, reloc))
      {
      case SECCAT_TEXT:
        prefix = one_only ? ".gnu.linkonce.t." : ".text.";
--- 5099,5128 ----
      }
  }
  
+ void
+ default_elf_select_section (decl, reloc, align)
+      tree decl;
+      int reloc;
+      unsigned HOST_WIDE_INT align;
+ {
+   default_elf_select_section_1 (decl, reloc, align, flag_pic);
+ }
+ 
  /* Construct a unique section name based on the decl name and the
     categorization performed above.  */
  
  void
! default_unique_section_1 (decl, reloc, shlib)
       tree decl;
       int reloc;
+      int shlib;
  {
    bool one_only = DECL_ONE_ONLY (decl);
    const char *prefix, *name;
    size_t nlen, plen;
    char *string;
  
!   switch (categorize_decl_for_section (decl, reloc, shlib))
      {
      case SECCAT_TEXT:
        prefix = one_only ? ".gnu.linkonce.t." : ".text.";
*************** default_unique_section (decl, reloc)
*** 5140,5145 ****
--- 5172,5185 ----
  }
  
  void
+ default_unique_section (decl, reloc)
+      tree decl;
+      int reloc;
+ {
+   default_unique_section_1 (decl, reloc, flag_pic);
+ }
+ 
+ void
  default_select_rtx_section (mode, x, align)
       enum machine_mode mode ATTRIBUTE_UNUSED;
       rtx x;
*************** default_strip_name_encoding (str)
*** 5202,5209 ****
     wrt cross-module name binding.  */
  
  bool
! default_binds_local_p (exp)
       tree exp;
  {
    bool local_p;
  
--- 5242,5250 ----
     wrt cross-module name binding.  */
  
  bool
! default_binds_local_p_1 (exp, shlib)
       tree exp;
+      int shlib;
  {
    bool local_p;
  
*************** default_binds_local_p (exp)
*** 5224,5230 ****
      local_p = false;
    /* If PIC, then assume that any global name can be overridden by
       symbols resolved from other modules.  */
!   else if (flag_pic)
      local_p = false;
    /* Uninitialized COMMON variable may be unified with symbols
       resolved from other modules.  */
--- 5265,5271 ----
      local_p = false;
    /* If PIC, then assume that any global name can be overridden by
       symbols resolved from other modules.  */
!   else if (shlib)
      local_p = false;
    /* Uninitialized COMMON variable may be unified with symbols
       resolved from other modules.  */
*************** default_binds_local_p (exp)
*** 5238,5243 ****
--- 5279,5291 ----
      local_p = true;
  
    return local_p;
+ }
+ 
+ bool
+ default_binds_local_p (exp)
+      tree exp;
+ {
+   return default_binds_local_p_1 (exp, flag_pic);
  }
  
  /* Default function to output code that will globalize a label.  A
Index: output.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/output.h,v
retrieving revision 1.109
diff -c -p -r1.109 output.h
*** output.h	21 Aug 2002 02:41:44 -0000	1.109
--- output.h	3 Sep 2002 17:29:08 -0000
*************** extern rtx this_is_asm_operands;
*** 468,473 ****
--- 468,474 ----
  /* Decide whether DECL needs to be in a writable section.
     RELOC is the same as for SELECT_SECTION.  */
  extern bool decl_readonly_section PARAMS ((tree, int));
+ extern bool decl_readonly_section_1 PARAMS ((tree, int, int));
  
  /* User label prefix in effect for this compilation.  */
  extern const char *user_label_prefix;
*************** extern bool named_section_first_declarat
*** 508,513 ****
--- 509,517 ----
  union tree_node;
  extern unsigned int default_section_type_flags PARAMS ((union tree_node *,
  							const char *, int));
+ extern unsigned int default_section_type_flags_1 PARAMS ((union tree_node *,
+ 							  const char *,
+ 							  int, int));
  
  extern void default_no_named_section PARAMS ((const char *, unsigned int));
  extern void default_elf_asm_named_section PARAMS ((const char *, unsigned int));
*************** extern void default_select_section PARAM
*** 530,542 ****
--- 534,550 ----
  					    unsigned HOST_WIDE_INT));
  extern void default_elf_select_section PARAMS ((tree, int,
  						unsigned HOST_WIDE_INT));
+ extern void default_elf_select_section_1 PARAMS ((tree, int,
+ 						  unsigned HOST_WIDE_INT, int));
  extern void default_unique_section PARAMS ((tree, int));
+ extern void default_unique_section_1 PARAMS ((tree, int, int));
  extern void default_select_rtx_section PARAMS ((enum machine_mode, rtx,
  						unsigned HOST_WIDE_INT));
  extern void default_elf_select_rtx_section PARAMS ((enum machine_mode, rtx,
  						    unsigned HOST_WIDE_INT));
  extern const char *default_strip_name_encoding PARAMS ((const char *));
  extern bool default_binds_local_p PARAMS ((tree));
+ extern bool default_binds_local_p_1 PARAMS ((tree, int));
  extern void default_globalize_label PARAMS ((FILE *, const char *));
  
  /* Emit data for vtable gc for GNU binutils.  */
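
The wrapper pattern used throughout the patch above can be sketched in
isolation.  This is only an illustration, not GCC code: flag_pic here is a
plain global stand-in, and categorize_1, categorize and the enum values
are hypothetical names chosen for the example.

```c
/* Stand-in for GCC's global flag_pic; hypothetical for this sketch.  */
int flag_pic = 0;

enum cat { CAT_DATA, CAT_DATA_REL, CAT_DATA_REL_LOCAL };

/* Explicit form: the caller decides whether shared-library rules apply.  */
enum cat
categorize_1 (int reloc, int shlib)
{
  if (shlib && (reloc & 2))
    return CAT_DATA_REL;
  else if (shlib && reloc)
    return CAT_DATA_REL_LOCAL;
  return CAT_DATA;
}

/* Original name: keeps the old signature and supplies flag_pic, so
   existing callers see no change in behaviour.  */
enum cat
categorize (int reloc)
{
  return categorize_1 (reloc, flag_pic);
}
```

A port that needs different behaviour can call the _1 form with its own
shlib value, while everything else keeps calling the original name.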

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC select_section / unique_section
  2002-09-03 10:40                                       ` David Edelsohn
@ 2002-09-03 13:44                                         ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2002-09-03 13:44 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Jeff Sturm, Alan Modra, Franz Sirl, Geoff Keating, gcc-patches

On Tue, Sep 03, 2002 at 01:40:19PM -0400, David Edelsohn wrote:
> 	* varasm.c (default_section_type_flags): Append _1 to name with
> 	shlib parameter.  Use original name to call new function with
> 	implicit flag_pic.
> 	(decl_readonly_section): Likewise.
> 	(default_elf_select_section): Likewise.
> 	(default_unique_section): Likewise.
> 	(default_binds_local_p): Likewise.
> 	(categorize_decl_for_section): Add shlib parameter to use in place
> 	of implicit flag_pic.
> 	* output.h: Declare new functions with _1 and shlib argument.

Ok.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: SYMBOL_REF_FLAG
       [not found]             ` <200209140219.WAA25730@makai.watson.ibm.com>
@ 2002-09-13 19:59               ` Alan Modra
  2002-09-13 22:17                 ` SYMBOL_REF_FLAG Richard Henderson
  2002-09-14  0:01                 ` SYMBOL_REF_FLAG David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2002-09-13 19:59 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Fri, Sep 13, 2002 at 10:19:45PM -0400, David Edelsohn wrote:
> 	Yes, this should be changed in all locations.

OK here we go.

	* config/rs6000/rs6000.c (rs6000_elf_encode_section_info): Use
	targetm.binds_local_p to set SYMBOL_REF_FLAG.
	(rs6000_xcoff_encode_section_info): Likewise.
	* config/rs6000/xcoff.h (ASM_DECLARE_FUNCTION_NAME): Likewise.

OK mainline?  This should probably also be fixed on the gcc-3.2 branch,
and the flag_pic issue too.  Alternatively, if that is too large a fix
for the branch to risk destabilizing rs6000-aix gcc, I can continue
putting such fixes in a patchset for powerpc64-linux.
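
As a rough model of what the change amounts to: callers stop repeating the
TREE_ASM_WRITTEN / TREE_PUBLIC / DECL_WEAK test and ask a hook instead.
The struct decl, struct target_hooks and function names below are
hypothetical simplifications for illustration, not the real GCC types.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a GCC decl.  */
struct decl
{
  bool asm_written;  /* models TREE_ASM_WRITTEN */
  bool is_public;    /* models TREE_PUBLIC */
  bool is_weak;      /* models DECL_WEAK */
};

/* The open-coded test that the patch removes.  */
bool
old_local_test (const struct decl *d)
{
  return (d->asm_written || !d->is_public) && !d->is_weak;
}

/* Hypothetical hook table modelled on targetm.binds_local_p.  */
struct target_hooks
{
  bool (*binds_local_p) (const struct decl *);
};

/* Callers consult the hook, so the policy lives in one place.  */
bool
symbol_ref_flag_for (const struct target_hooks *t, const struct decl *d)
{
  return t->binds_local_p (d);
}
```

Here the hook is simply pointed at the old test; the point of the sketch
is only that a different predicate can be plugged in without touching the
callers.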

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.377
diff -u -p -r1.377 rs6000.c
--- gcc/config/rs6000/rs6000.c	13 Sep 2002 01:40:43 -0000	1.377
+++ gcc/config/rs6000/rs6000.c	14 Sep 2002 02:35:27 -0000
@@ -12424,8 +12424,7 @@ rs6000_elf_encode_section_info (decl, fi
   if (TREE_CODE (decl) == FUNCTION_DECL)
     {
       rtx sym_ref = XEXP (DECL_RTL (decl), 0);
-      if ((TREE_ASM_WRITTEN (decl) || ! TREE_PUBLIC (decl))
-          && ! DECL_WEAK (decl))
+      if ((*targetm.binds_local_p) (decl))
 	SYMBOL_REF_FLAG (sym_ref) = 1;
 
       if (DEFAULT_ABI == ABI_AIX)
@@ -13121,8 +13120,7 @@ rs6000_xcoff_encode_section_info (decl, 
      int first ATTRIBUTE_UNUSED;
 {
   if (TREE_CODE (decl) == FUNCTION_DECL
-      && (TREE_ASM_WRITTEN (decl) || ! TREE_PUBLIC (decl))
-      && ! DECL_WEAK (decl))
+      && (*targetm.binds_local_p) (decl))
     SYMBOL_REF_FLAG (XEXP (DECL_RTL (decl), 0)) = 1;
 }
 
Index: gcc/config/rs6000/xcoff.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/xcoff.h,v
retrieving revision 1.36
diff -u -p -r1.36 xcoff.h
--- gcc/config/rs6000/xcoff.h	11 Sep 2002 17:36:06 -0000	1.36
+++ gcc/config/rs6000/xcoff.h	14 Sep 2002 02:47:20 -0000
@@ -266,7 +266,7 @@ toc_section ()						\
 
 #define ASM_DECLARE_FUNCTION_NAME(FILE,NAME,DECL)		\
 { rtx sym_ref = XEXP (DECL_RTL (DECL), 0);			\
-  if (!DECL_WEAK (DECL))					\
+  if ((*targetm.binds_local_p) (DECL))				\
     SYMBOL_REF_FLAG (sym_ref) = 1;				\
   if (TREE_PUBLIC (DECL))					\
     {								\

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: SYMBOL_REF_FLAG
  2002-09-13 19:59               ` SYMBOL_REF_FLAG Alan Modra
@ 2002-09-13 22:17                 ` Richard Henderson
  2002-09-14  0:09                   ` SYMBOL_REF_FLAG David Edelsohn
  2002-09-14  0:01                 ` SYMBOL_REF_FLAG David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2002-09-13 22:17 UTC (permalink / raw)
  To: David Edelsohn, gcc-patches

On Sat, Sep 14, 2002 at 12:29:50PM +0930, Alan Modra wrote:
> This should probably also be fixed on the gcc-3.2 branch,
> and the flag_pic issue too.

3.2 doesn't have binds_local_p, so the fix is going to
be uglier by far.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: SYMBOL_REF_FLAG
  2002-09-13 19:59               ` SYMBOL_REF_FLAG Alan Modra
  2002-09-13 22:17                 ` SYMBOL_REF_FLAG Richard Henderson
@ 2002-09-14  0:01                 ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-09-14  0:01 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

	* config/rs6000/rs6000.c (rs6000_elf_encode_section_info): Use
	targetm.binds_local_p to set SYMBOL_REF_FLAG.
	(rs6000_xcoff_encode_section_info): Likewise.
	* config/rs6000/xcoff.h (ASM_DECLARE_FUNCTION_NAME): Likewise.

Yes, this is okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: SYMBOL_REF_FLAG
  2002-09-13 22:17                 ` SYMBOL_REF_FLAG Richard Henderson
@ 2002-09-14  0:09                   ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-09-14  0:09 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, gcc-patches

	I think the flag_pic issues require changes that are too invasive
to be fixed on the GCC 3.2 branch.  There wasn't powerpc64-linux support
before, so it cannot be a regression.  The flag_pic stuff needs to remain
as patches in the private source tree of backported GCC 3.3 functionality.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* powerpc fix for gcc.dg/asm-names.c failure
@ 2002-10-03 19:25             ` Alan Modra
  2002-10-03 19:32               ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2002-10-03 19:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Fixes gcc.dg/asm-names.c failure on powerpc64.  Stripping out '*' from
the function name too early meant that a duplicate label was emitted,
as asm-names.c has two "main" functions with different asm names.

Note that the traceback table still uses the C function name.  I'm not
sure whether this needs changing too.

	* config/rs6000/rs6000.c (rs6000_output_function_epilogue): Use a
	name for the tbtab label that depends on the function asm name.
	Don't output tbtab label unless optional_tbtab.
	(output_mi_thunk): Formatting.

OK mainline?  Bootstrapped and regression tested powerpc-linux,
xcompiled and regression tested powerpc64-linux.

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.387
diff -u -p -r1.387 rs6000.c
--- gcc/config/rs6000/rs6000.c	28 Sep 2002 15:29:44 -0000	1.387
+++ gcc/config/rs6000/rs6000.c	4 Oct 2002 00:43:11 -0000
@@ -10967,7 +10967,7 @@ rs6000_output_function_epilogue (file, s
   if (DEFAULT_ABI == ABI_AIX && ! flag_inhibit_size_directive
       && rs6000_traceback != traceback_none)
     {
-      const char *fname = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);
+      const char *fname = NULL;
       const char *language_string = lang_hooks.name;
       int fixed_parms = 0, float_parms = 0, parm_info = 0;
       int i;
@@ -10980,15 +10980,17 @@ rs6000_output_function_epilogue (file, s
       else
 	optional_tbtab = !optimize_size && !TARGET_ELF;
 
-      while (*fname == '.')	/* V.4 encodes . in the name */
-	fname++;
-
-      /* Need label immediately before tbtab, so we can compute its offset
-	 from the function start.  */
-      if (*fname == '*')
-	++fname;
-      ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LT");
-      ASM_OUTPUT_LABEL (file, fname);
+      if (optional_tbtab)
+	{
+	  fname = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);
+	  while (*fname == '.')	/* V.4 encodes . in the name */
+	    fname++;
+
+	  /* Need label immediately before tbtab, so we can compute
+	     its offset from the function start.  */
+	  ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LT");
+	  ASM_OUTPUT_LABEL (file, fname);
+	}
 
       /* The .tbtab pseudo-op can only be used for the first eight
 	 expressions, since it can't handle the possibly variable
@@ -11160,6 +11162,8 @@ rs6000_output_function_epilogue (file, s
       /* Omit this list of longs, because there are no CTL anchors.  */
 
       /* Length of function name.  */
+      if (*fname == '*')
+	++fname;
       fprintf (file, "\t.short %d\n", (int) strlen (fname));
 
       /* Function name.  */
@@ -11285,7 +11289,6 @@ output_mi_thunk (file, thunk_fndecl, del
 			      TYPE_ATTRIBUTES (TREE_TYPE (function)))
 	  || lookup_attribute ("shortcall",
 			       TYPE_ATTRIBUTES (TREE_TYPE (function)))))
-
     {
       fprintf (file, "\tb %s", prefix);
       assemble_name (file, fname);
@@ -11320,7 +11323,7 @@ output_mi_thunk (file, thunk_fndecl, del
 	  if (TARGET_ELF)
 	    function_section (current_function_decl);
 	  else
-	    text_section();
+	    text_section ();
 	  if (TARGET_MINIMAL_TOC)
 	    asm_fprintf (file, (TARGET_32BIT)
 			 ? "\t{l|lwz} %s,%s(%s)\n" : "\tld %s,%s(%s)\n", r12,

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc fix for gcc.dg/asm-names.c failure
  2002-10-03 19:25             ` powerpc fix for gcc.dg/asm-names.c failure Alan Modra
@ 2002-10-03 19:32               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2002-10-03 19:32 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/rs6000.c (rs6000_output_function_epilogue): Use a
	name for the tbtab label that depends on the function asm name.
	Don't output tbtab label unless optional_tbtab.
	(output_mi_thunk): Formatting.


Okay.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] fold-const.c use of BRANCH_COST
@ 2003-04-09 19:34 David Edelsohn
  2003-04-11  4:11 ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-04-09 19:34 UTC (permalink / raw)
  To: gcc-patches

	While investigating PowerPC performance, I have discovered that
the threshold levels used to test BRANCH_COST are not always correct for
PowerPC.  Instead of arguing to change the threshold, I propose allowing
the test to be overridden in the machine description, with the default
being the former BRANCH_COST-based test.

	If the patch below is approved, I will add the appropriate
documentation as well.  The motivation is:

Unmodified:
   200.sixtrack      1100       651       169*     1100       535      206*
   254.gap           1100       249       442*     1100       262      419*
   255.vortex        1900       283       672*     1900       275      691*

BRANCH_COST=1:
   200.sixtrack      1100      1492      73.7*     1100       743      148*
   254.gap           1100       248       444*     1100       239      460*
   255.vortex        1900       293       647*     1900       276      688*

Modified fold-const.c only:
   200.sixtrack      1100       649       169*     1100       675      163*
   254.gap           1100       281       391*     1100       249      442*
   255.vortex        1900       295       644*     1900       275      692*

Modified fold-const.c:fold_range_test() only:
   200.sixtrack      1100       964       114*     1100       529      208*
   254.gap           1100       245       449*     1100       243      452*
   255.vortex        1900       284       668*     1900       274      694*

Yes, BRANCH_COST=1 really does degrade 200.sixtrack that much -- it is
completely repeatable.

Thanks, David


	* fold-const.c (fold_range_test): Add new macro defaulting
	to BRANCH_COST.

Index: fold-const.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/fold-const.c,v
retrieving revision 1.242
diff -c -p -r1.242 fold-const.c
*** fold-const.c	23 Mar 2003 22:57:25 -0000	1.242
--- fold-const.c	9 Apr 2003 19:25:26 -0000
*************** fold_range_test (exp)
*** 3449,3455 ****
    /* On machines where the branch cost is expensive, if this is a
       short-circuited branch and the underlying object on both sides
       is the same, make a non-short-circuit operation.  */
!   else if (BRANCH_COST >= 2
  	   && lhs != 0 && rhs != 0
  	   && (TREE_CODE (exp) == TRUTH_ANDIF_EXPR
  	       || TREE_CODE (exp) == TRUTH_ORIF_EXPR)
--- 3449,3460 ----
    /* On machines where the branch cost is expensive, if this is a
       short-circuited branch and the underlying object on both sides
       is the same, make a non-short-circuit operation.  */
! 
! #ifndef RANGE_TEST_SHORT_CIRCUIT
! #define RANGE_TEST_SHORT_CIRCUIT (BRANCH_COST >= 2)
! #endif
! 
!   else if (RANGE_TEST_SHORT_CIRCUIT
  	   && lhs != 0 && rhs != 0
  	   && (TREE_CODE (exp) == TRUTH_ANDIF_EXPR
  	       || TREE_CODE (exp) == TRUTH_ORIF_EXPR)
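
For illustration, here is how the #ifndef default in the patch above would
interact with a port-level override.  The BRANCH_COST value and the
override to 0 are assumptions made up for this sketch; they are not taken
from any real target header, and the two helper functions are hypothetical.

```c
/* Hypothetical target setting; a real port defines BRANCH_COST itself.  */
#define BRANCH_COST 3

/* A port that wants to opt out could define this in its target header
   (hypothetical override; this is what the patch makes possible).  */
#define RANGE_TEST_SHORT_CIRCUIT 0

/* Default from the patch: used only if the port did not override.  */
#ifndef RANGE_TEST_SHORT_CIRCUIT
#define RANGE_TEST_SHORT_CIRCUIT (BRANCH_COST >= 2)
#endif

/* What the former BRANCH_COST-based test would have decided.  */
int
former_threshold_met (void)
{
  return BRANCH_COST >= 2;
}

/* Nonzero when the range test would be rewritten as a
   non-short-circuit operation.  */
int
use_non_short_circuit (int have_lhs, int have_rhs)
{
  return RANGE_TEST_SHORT_CIRCUIT && have_lhs && have_rhs;
}
```

With the override removed, the default applies and the behaviour matches
the old BRANCH_COST >= 2 test.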

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-09 19:34 [PATCH] fold-const.c use of BRANCH_COST David Edelsohn
@ 2003-04-11  4:11 ` Richard Henderson
  2003-04-11  4:23   ` Andrew Pinski
  2003-04-11  4:24   ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2003-04-11  4:11 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Wed, Apr 09, 2003 at 03:34:31PM -0400, David Edelsohn wrote:
> Yes, BRANCH_COST=1 really does degrade 200.sixtrack that much -- it is
> completely repeatable.

But why would you want to set BRANCH_COST that low?


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  4:11 ` Richard Henderson
@ 2003-04-11  4:23   ` Andrew Pinski
  2003-04-11 17:47     ` Geoff Keating
  2003-04-11  4:24   ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Andrew Pinski @ 2003-04-11  4:23 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Andrew Pinski, David Edelsohn, gcc-patches

Because with BRANCH_COST set to 3, gcc on PPC produces
serializing instructions (adde, subfe, subfme, subfze) on some 
processors, like 750 and 7400.
It also sometimes produces smaller code.

Thanks,
Andrew Pinski

On Friday, Apr 11, 2003, at 00:09 US/Eastern, Richard Henderson wrote:

> On Wed, Apr 09, 2003 at 03:34:31PM -0400, David Edelsohn wrote:
>> Yes, BRANCH_COST=1 really does degrade 200.sixtrack that much -- it is
>> completely repeatable.
>
> But why would you want to set BRANCH_COST that low?
>
>
> r~
>
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  4:11 ` Richard Henderson
  2003-04-11  4:23   ` Andrew Pinski
@ 2003-04-11  4:24   ` David Edelsohn
  2003-04-11  4:43     ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-04-11  4:24 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

>>>>> Richard Henderson writes:

>> Yes, BRANCH_COST=1 really does degrade 200.sixtrack that much -- it is
>> completely repeatable.

Richard> But why would you want to set BRANCH_COST that low?

	Because one of the design points for the POWER/PowerPC
architecture was "branches are free" and good branch prediction makes the
cost very low.  I do not want to set BRANCH_COST that low, but it is hard
to argue when setting BRANCH_COST=1 does improve performance of some
testcases.

	The issue is what effect of BRANCH_COST=1 is actually improving
performance.  My investigation led me to fold_range_test.

	I would like to change the heuristic used by fold_range_test for
PowerPC.  Instead of arguing that the threshold should be some arbitrary
value greater than the setting of BRANCH_COST on PowerPC, it seems more
useful to add port-specific finer granularity of control.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  4:24   ` David Edelsohn
@ 2003-04-11  4:43     ` Richard Henderson
  2003-04-11  4:58       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2003-04-11  4:43 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Fri, Apr 11, 2003 at 12:24:49AM -0400, David Edelsohn wrote:
> 	I would like to change the heuristic used by fold_range_test for
> PowerPC.  Instead of arguing that the threshold should be some arbitrary
> value greater than the setting of BRANCH_COST on PowerPC, it seems more
> useful to add port-specific finer granularity of control.

Well, ok.  I just wonder if you'll wind up with a whole set of
knobs for ifcvt.c too...


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  4:43     ` Richard Henderson
@ 2003-04-11  4:58       ` David Edelsohn
  2003-04-11  5:11         ` Richard Henderson
  2003-04-11 17:08         ` Dale Johannesen
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2003-04-11  4:58 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

>>>>> Richard Henderson writes:

Richard> Well, ok.  I just wonder if you'll wind up with a whole set of
Richard> knobs for ifcvt.c too...

	PowerPC won't, because it looks like the threshold in ifcvt.c is
okay already, but what is wrong with more knobs?  The problem is that
BRANCH_COST is too coarse an adjustment and it seems to be tested against
arbitrary values or values chosen for whichever processor the optimization
was developed on.  It just does not capture enough detail about the effect on
various architectures and processors, i.e., one avoids the branch but the
alternative code sequence may be worse, such as serializations.

	As you probably can infer, Apple sets BRANCH_COST=1 in their
sources because it had a positive effect on their testcases.  That is not
really the correct value, so I investigated the source of the benefits and
would like to provide that benefit without the negative side-effects.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  4:58       ` David Edelsohn
@ 2003-04-11  5:11         ` Richard Henderson
  2003-04-11 14:41           ` David Edelsohn
  2003-04-11 17:08         ` Dale Johannesen
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2003-04-11  5:11 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Fri, Apr 11, 2003 at 12:57:56AM -0400, David Edelsohn wrote:
> 	PowerPC won't, because it looks like the threshold in ifcvt.c is
> okay already, but what is wrong with more knobs?

Nothing, except that it'd be better to go at them in a more
organized sort of way than a basketful of random defines.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  5:11         ` Richard Henderson
@ 2003-04-11 14:41           ` David Edelsohn
  2003-04-11 14:47             ` Jan Hubicka
  2003-04-11 17:49             ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2003-04-11 14:41 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

>>>>> Richard Henderson writes:

Richard> Nothing, except that it'd be better to go at them in a more
Richard> organized sort of way than a basketful of random defines.

	We can add the knobs for both uses of BRANCH_COST in fold-const.c
in this pass, if you want.  BRANCH_COST isn't actually used in that many
parts of GCC.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 14:41           ` David Edelsohn
@ 2003-04-11 14:47             ` Jan Hubicka
  2003-04-11 15:51               ` David Edelsohn
  2003-04-11 17:49             ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2003-04-11 14:47 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, gcc-patches

> >>>>> Richard Henderson writes:
> 
> Richard> Nothing, except that it'd be better to go at them in a more
> Richard> organized sort of way than a basketful of random defines.
> 
> 	We can add the knobs for both uses of BRANCH_COST in fold-const.c
> in this pass, if you want.  BRANCH_COST isn't actually used in that many
> parts of GCC.

I would like to see this happen.  Definitely, i386 CPUs are another case
where the BRANCH_COST choice doesn't match.  It would be nice to have
separate knobs for when optimizing for size or speed (decided using
maybe_hot_bb_p inside ifcvt.c).

Honza
> 
> David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 14:47             ` Jan Hubicka
@ 2003-04-11 15:51               ` David Edelsohn
  2003-04-11 16:57                 ` Jan Hubicka
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-04-11 15:51 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Henderson, gcc-patches

>>>>> Jan Hubicka writes:

Jan> I would like to see this happen.  Definitely, i386 CPUs are another case
Jan> where the BRANCH_COST choice doesn't match.  It would be nice to have
Jan> separate knobs for when optimizing for size or speed (decided using
Jan> maybe_hot_bb_p inside ifcvt.c).

	If we are going to separate all of the thresholds, should they be
target hooks or macros?

The basic breakdown is the following:

1) ifcvt.c
  a) MAX_CONDITIONAL_EXECUTE (BRANCH_COST + 1)
  b) store_flag normalize (true/false BRANCH_COST >= 2)
  c) store_flag normalize (default BRANCH_COST >= 3)
  d) store_flag addcc (BRANCH_COST >= 2)
  e) store_flag mask (BRANCH_COST >= 2)
  f) cmove arith (BRANCH_COST >= 5)

2) expmed.c
  a) div quotient jump vs shift (BRANCH_COST < 1 or < 3)
  b) emit_store_flag SCC (BRANCH_COST > 0)

3) expr.c
  a) trinary const (BRANCH_COST >= 3)
       call do_store_flag cheap (BRANCH_COST <= 1)
  b) do_store_flag SCC (BRANCH_COST >= 0)

4) fold-const.c
  a) range_test short-circuit (BRANCH_COST >= 2)
  b) fold_truthop unconditionally evaluate RHS (BRANCH_COST >= 2)


David

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 15:51               ` David Edelsohn
@ 2003-04-11 16:57                 ` Jan Hubicka
  2003-04-11 16:58                   ` David Edelsohn
  2003-04-21 17:24                   ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Jan Hubicka @ 2003-04-11 16:57 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jan Hubicka, Richard Henderson, gcc-patches

> >>>>> Jan Hubicka writes:
> 
> Jan> I would like to see this happen.  Definitely i386 CPUs are another
> Jan> case where the BRANCH_COST choice doesn't match.  It would be nice
> Jan> to have separate knobs for optimizing for size or speed (decided
> Jan> using maybe_hot_bb_p inside ifcvt.c).
> 
> 	If we are going to separate all of the thresholds, should they be
> target hooks or macros?

I believe we are moving from macros to hooks, so it should be a hook.
I think it can be a bitmap (i.e. one value) so we don't need to have so
many of them.  This raises the question of how we will deal with adding
new transformations...
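
A minimal sketch of the bitmap idea, assuming hypothetical flag and
function names (none of them exist in GCC).  It covers only a subset of
the uses, and the thresholds are taken from the breakdown quoted below:

```c
#include <assert.h>

/* Hypothetical flags, one bit per transformation that currently
   tests BRANCH_COST.  None of these names exist in GCC.  */
enum branch_transform
{
  BT_IFCVT_STORE_FLAG_ADDCC = 1 << 0,  /* ifcvt.c: BRANCH_COST >= 2 */
  BT_IFCVT_STORE_FLAG_MASK  = 1 << 1,  /* ifcvt.c: BRANCH_COST >= 2 */
  BT_IFCVT_CMOVE_ARITH      = 1 << 2,  /* ifcvt.c: BRANCH_COST >= 5 */
  BT_EXPR_TRINARY_CONST     = 1 << 3,  /* expr.c: BRANCH_COST >= 3 */
  BT_FOLD_RANGE_TEST        = 1 << 4,  /* fold-const.c: BRANCH_COST >= 2 */
  BT_FOLD_TRUTHOP           = 1 << 5   /* fold-const.c: BRANCH_COST >= 2 */
};

/* Default hook body: reproduce the existing thresholds from a single
   branch cost value.  */
unsigned int
default_branch_transforms (int branch_cost)
{
  unsigned int mask = 0;

  assert (branch_cost >= 0);
  if (branch_cost >= 2)
    mask |= BT_IFCVT_STORE_FLAG_ADDCC | BT_IFCVT_STORE_FLAG_MASK
	    | BT_FOLD_RANGE_TEST | BT_FOLD_TRUTHOP;
  if (branch_cost >= 3)
    mask |= BT_EXPR_TRINARY_CONST;
  if (branch_cost >= 5)
    mask |= BT_IFCVT_CMOVE_ARITH;
  return mask;
}

/* A target that keeps a high branch cost but opts out of a single
   transformation clears just that bit.  */
unsigned int
example_target_branch_transforms (int branch_cost)
{
  return (default_branch_transforms (branch_cost)
	  & ~(unsigned int) BT_FOLD_RANGE_TEST);
}
```

With one bit per transformation, a new transformation needs a new flag
and a default, which is the maintenance question raised above.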
> 
> The basic breakdown is the following:
> 
> 1) ifcvt.c
>   a) MAX_CONDITIONAL_EXECUTE (BRANCH_COST + 1)
>   b) store_flag normalize (true/false BRANCH_COST >= 2)
>   c) store_flag normalize (default BRANCH_COST >= 3)
>   d) store_flag addcc (BRANCH_COST >= 2)
>   e) store_flag mask (BRANCH_COST >= 2)
>   f) cmove arith (BRANCH_COST >= 5)
> 
> 2) expmed.c
>   a) div quotient jump vs shift (BRANCH_COST < 1 or < 3)
>   b) emit_store_flag SCC (BRANCH_COST > 0)
> 
> 3) expr.c
>   a) trinary const (BRANCH_COST >= 3)
>        call do_store_flag cheap (BRANCH_COST <= 1)
>   b) do_store_flag SCC (BRANCH_COST >= 0)
> 
> 4) fold-const.c
>   a) range_test short-circuit (BRANCH_COST >= 2)
>   b) fold_truthop unconditionally evaluate RHS (BRANCH_COST >= 2)

optabs.c for instance is missing in your list.

Honza
> 
> 
> David

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 16:57                 ` Jan Hubicka
@ 2003-04-11 16:58                   ` David Edelsohn
  2003-04-21 17:24                   ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-04-11 16:58 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Henderson, gcc-patches

>>>>> Jan Hubicka writes:

Jan> optabs.c for instance is missing in your list.

	Yes, sorry, I forgot to list that.  It's choosing how to compute
absolute value.

David

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  4:58       ` David Edelsohn
  2003-04-11  5:11         ` Richard Henderson
@ 2003-04-11 17:08         ` Dale Johannesen
  2003-04-11 17:54           ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Dale Johannesen @ 2003-04-11 17:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Dale Johannesen, Richard Henderson, gcc-patches

On Thursday, April 10, 2003, at 09:57  PM, David Edelsohn wrote:
>>>>>> Richard Henderson writes:
>
> 	As you probably can infer, Apple sets BRANCH_COST=1 in their
> sources because it had a positive effect on their testcases.  That is
> not really the correct value, so I investigated the source of the
> benefits and would like to provide that benefit without the negative
> side-effects.

Yes, Apple has done this since the 2.95 days, and it is a consistent
winner.  Every so often somebody tries setting it back up to 2 or 3
but it's always been a lose overall.  Despite David's 3 examples, SPEC
still does better overall with BRANCH_COST==1.

As for "not really the correct value", I don't see why not.  Branches
really are very cheap on the ppc, typically 0 or 1 cycle.  If the
macro is being used to mean something other than "cost of a branch",
I'd say that indicates a problem in its use or naming.  I'm glad to
see David's attempt to treat the different uses of it differently; I
think that's a good approach.

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11  4:23   ` Andrew Pinski
@ 2003-04-11 17:47     ` Geoff Keating
  2003-04-12  2:48       ` Segher Boessenkool
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2003-04-11 17:47 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: David Edelsohn, gcc-patches

Andrew Pinski <pinskia@physics.uc.edu> writes:

> Because with BRANCH_COST set to 3, gcc on PPC produces
> serializing instructions (adde, subfe, subfme, subfze) on some
> processors, like 750 and 7400.
> It also sometimes produces smaller code.

If these instructions are serializing, it's probably not useful for
GCC to be producing them at all on these processors; they'll always be
slower than a branch.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 14:41           ` David Edelsohn
  2003-04-11 14:47             ` Jan Hubicka
@ 2003-04-11 17:49             ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2003-04-11 17:49 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> >>>>> Richard Henderson writes:
> 
> Richard> Nothing, except that it'd be better to go at them in a more
> Richard> organized sort of way than a basketful of random defines.
> 
> 	We can add the knobs for both uses of BRANCH_COST in fold-const.c
> in this pass, if you want.  BRANCH_COST isn't actually used in that many
> parts of GCC.

It'd be better if we could assign a meaning to each macro, like
"this is the relative cost of a branch compared to an integer
instruction" rather than "this is the knob you tweak to affect this
code path which may or may not affect this optimisation".

-- 
- Geoffrey Keating <geoffk@geoffk.org>

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 17:08         ` Dale Johannesen
@ 2003-04-11 17:54           ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-04-11 17:54 UTC (permalink / raw)
  To: Dale Johannesen, Richard Henderson; +Cc: gcc-patches

>>>>> Dale Johannesen writes:

> As for "not really the correct value", I don't see why not.  Branches
> really are very cheap on the ppc, typically 0 or 1 cycle.  If the
> macro is being used to mean something other than "cost of a branch",
> I'd say that indicates a problem in its use or naming.  I'm glad to
> see David's attempt to treat the different uses of it differently; I
> think that's a good approach.

	I think the two problems with BRANCH_COST are:

1) Possibly arbitrary thresholds.  Yes, the alternate code is better
   (ignoring all other issues) if branches are expensive, but what is
   the threshold for "expensive"?

2) Cost of alternate code.  The non-branch code may generate more
   expensive instruction sequences on the target architecture, e.g.,
   materializing condition codes in GPRs.  BRANCH_COST and the threshold
   value do not take those case-by-case issues into account.

So a large value for BRANCH_COST may be correct, but its benefit is
getting wiped out by the missed architecture-specific optimizations.
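
The two problems can be illustrated with a small sketch.  The function
names and the cost values used with them are hypothetical and are not
taken from GCC; costs are assumed to be in the same relative units as
BRANCH_COST:

```c
#include <assert.h>

/* Existing style of test: a fixed threshold applied to the branch
   cost alone.  The cost of the replacement code is never consulted.  */
int
use_branchless_threshold (int branch_cost, int threshold)
{
  return branch_cost >= threshold;
}

/* Hypothetical alternative: weigh the branch against the cost of the
   non-branch sequence that would replace it on this target.  */
int
use_branchless_compare (int branch_cost, int branchless_cost)
{
  assert (branch_cost >= 0 && branchless_cost >= 0);
  return branchless_cost <= branch_cost;
}
```

With a branch cost of 3 and a threshold of 2, the first test selects the
non-branch code even if that sequence costs 4 on the target; the second
test would reject it.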

David

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 17:47     ` Geoff Keating
@ 2003-04-12  2:48       ` Segher Boessenkool
  0 siblings, 0 replies; 875+ messages in thread
From: Segher Boessenkool @ 2003-04-12  2:48 UTC (permalink / raw)
  To: Geoff Keating; +Cc: Andrew Pinski, David Edelsohn, gcc-patches

Geoff Keating wrote:
> Andrew Pinski <pinskia@physics.uc.edu> writes:
> 
>>Because with BRANCH_COST set to 3, gcc on PPC produces
>>serializing instructions (adde, subfe, subfme, subfze) on some
>>processors, like 750 and 7400.
>>It also sometimes produces smaller code.
> 
> If these instructions are serializing, it's probably not useful for
> GCC to be producing them at all on these processors; they'll always be
> slower than a branch.

On G3 and G4, these insns are only execution serialized (because
XER is not renamed), not completion serialized or worse.  They
can be quite useful, especially if interleaved with integer, load
or AltiVec insns; it's certainly not worse than branching in
most cases.  Current GCC seems to like them a little bit *too*
much, though.


Segher

[Execution serialization means: the insn won't execute until all
prior insns have completed; for most such insns the results of
the insn are only available after the insn has completed.]


* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-11 16:57                 ` Jan Hubicka
  2003-04-11 16:58                   ` David Edelsohn
@ 2003-04-21 17:24                   ` David Edelsohn
  2003-04-21 18:00                     ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-04-21 17:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

	Where do we stand with respect to my proposal to allow
finer-grained control of heuristics involving BRANCH_COST?  If you want a
more organized re-organization, a little guidance would help.  I do not
see an easy way to group the changes into categories that are appropriate
to all targets because I am trying to address the problem of the
non-branch implementation possibly being slower on some processors with
high branch cost.

David

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-21 17:24                   ` David Edelsohn
@ 2003-04-21 18:00                     ` Richard Henderson
  2003-04-22 15:02                       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2003-04-21 18:00 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Mon, Apr 21, 2003 at 01:24:23PM -0400, David Edelsohn wrote:
> 	Where do we stand with respect to my proposal to allow
> finer-grained control of heuristics involving BRANCH_COST?  If you want a
> more organized re-organization, a little guidance would help.

I don't have any, sorry.  I guess going with what you have is
fine until we come up with a more unified scheme.


r~

* Re: [PATCH] fold-const.c use of BRANCH_COST
  2003-04-21 18:00                     ` Richard Henderson
@ 2003-04-22 15:02                       ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-04-22 15:02 UTC (permalink / raw)
  To: gcc-patches

	FYI, appended is the complete patch I applied.

David

	* fold-const.c (fold_range_test): Use RANGE_TEST_NON_SHORT_CIRCUIT
	macro defaulting to original BRANCH_COST heuristic.
	* doc/tm.texi (RANGE_TEST_NON_SHORT_CIRCUIT): Document.

	* config/rs6000/rs6000.h (RANGE_TEST_NON_SHORT_CIRCUIT): Define.
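
A rough source-level illustration of what the macro controls, as a
sketch only: the function names are hypothetical, BRANCH_COST is given
a stand-in value, and the real transformation happens on trees inside
fold_range_test rather than on C source.

```c
#include <assert.h>

#ifndef BRANCH_COST
#define BRANCH_COST 3		/* stand-in value for this sketch */
#endif

/* Same default pattern as the patch: a target header may define the
   macro first; otherwise fall back to the BRANCH_COST heuristic.  */
#ifndef RANGE_TEST_NON_SHORT_CIRCUIT
#define RANGE_TEST_NON_SHORT_CIRCUIT (BRANCH_COST >= 2)
#endif

/* Short-circuit form: the second comparison is skipped, via a
   branch, when the first one already decides the result.  */
int
is_edge_short_circuit (int x)
{
  return x == 2 || x == 7;
}

/* Non-short-circuit form: both comparisons are always evaluated and
   combined with a bitwise OR, so no branch is needed between them.  */
int
is_edge_non_short_circuit (int x)
{
  return (x == 2) | (x == 7);
}

/* Both forms must agree for every input; only the generated code
   differs.  */
void
check_forms_agree (int lo, int hi)
{
  int x;

  for (x = lo; x <= hi; x++)
    assert (is_edge_short_circuit (x) == is_edge_non_short_circuit (x));
}
```

A target that defines RANGE_TEST_NON_SHORT_CIRCUIT as 0 keeps the first
form for this case regardless of its BRANCH_COST.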

Index: fold-const.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/fold-const.c,v
retrieving revision 1.247
diff -c -p -r1.247 fold-const.c
*** fold-const.c	16 Apr 2003 21:33:19 -0000	1.247
--- fold-const.c	21 Apr 2003 18:21:19 -0000
*************** merge_ranges (pin_p, plow, phigh, in0_p,
*** 3414,3419 ****
--- 3414,3423 ----
    return 1;
  }
  \f
+ #ifndef RANGE_TEST_NON_SHORT_CIRCUIT
+ #define RANGE_TEST_NON_SHORT_CIRCUIT (BRANCH_COST >= 2)
+ #endif
+ 
  /* EXP is some logical combination of boolean tests.  See if we can
     merge it into some range test.  Return the new tree if so.  */
  
*************** fold_range_test (exp)
*** 3450,3456 ****
    /* On machines where the branch cost is expensive, if this is a
       short-circuited branch and the underlying object on both sides
       is the same, make a non-short-circuit operation.  */
!   else if (BRANCH_COST >= 2
  	   && lhs != 0 && rhs != 0
  	   && (TREE_CODE (exp) == TRUTH_ANDIF_EXPR
  	       || TREE_CODE (exp) == TRUTH_ORIF_EXPR)
--- 3454,3460 ----
    /* On machines where the branch cost is expensive, if this is a
       short-circuited branch and the underlying object on both sides
       is the same, make a non-short-circuit operation.  */
!   else if (RANGE_TEST_NON_SHORT_CIRCUIT
  	   && lhs != 0 && rhs != 0
  	   && (TREE_CODE (exp) == TRUTH_ANDIF_EXPR
  	       || TREE_CODE (exp) == TRUTH_ORIF_EXPR)
Index: tm.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/tm.texi,v
retrieving revision 1.214
diff -c -p -r1.214 tm.texi
*** tm.texi	20 Apr 2003 18:20:39 -0000	1.214
--- tm.texi	21 Apr 2003 18:33:01 -0000
*************** function address than to call an address
*** 5529,5534 ****
--- 5529,5540 ----
  Define this macro if it is as good or better for a function to call
  itself with an explicit address than to call an address kept in a
  register.
+ 
+ @findex RANGE_TEST_NON_SHORT_CIRCUIT
+ @item RANGE_TEST_NON_SHORT_CIRCUIT
+ Define this macro if a non-short-circuit operation produced by
+ @samp{fold_range_test ()} is optimal.  This macro defaults to true if
+ @code{BRANCH_COST} is greater than or equal to the value 2.
  @end table
  
  @deftypefn {Target Hook} bool TARGET_RTX_COSTS (rtx @var{x}, int @var{code}, int @var{outer_code}, int *@var{total})
Index: rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.267
diff -c -p -r1.267 rs6000.h
*** rs6000.h	17 Apr 2003 23:18:57 -0000	1.267
--- rs6000.h	22 Apr 2003 14:43:20 -0000
*************** extern int rs6000_default_long_calls;
*** 992,997 ****
--- 992,1001 ----
  
  #define BRANCH_COST 3
  
+ /* Override BRANCH_COST heuristic which empirically produces worse
+    performance for fold_range_test().  */
+ 
+ #define RANGE_TEST_NON_SHORT_CIRCUIT 0
  
  /* A fixed register used at prologue and epilogue generation to fix
     addressing modes.  The SPE needs heavy addressing fixes at the last

* function parms in regs, patch 3 of 3
@ 2003-04-24 15:34 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2003-04-24 15:34 UTC (permalink / raw)
  To: gcc-patches

This patch defines a new macro, BLOCK_REG_PADDING, and uses it to
determine what to do with odd-sized struct pieces in regs.  I wanted
to make use of FUNCTION_ARG_PADDING in its default definition, so
that meant passing TYPE to emit_group_{load,store}.  Anyway, what
this does is allow powerpc64 to pad structs upward, downward and
every which way to our heart's content, in registers, without using
the PARALLELs invented for Irix support.  All controlled in one place,
rs6000.c:function_arg_padding.  Hmm.  It might be a good idea to make
the default PAD_VARARGS_DOWN use FUNCTION_ARG_PADDING too.

The rs6000 code has been tested using AGGREGATE_PADDING_FIXED == 0,
AGGREGATE_PADDING_FIXED == 1 and AGGREGATES_PAD_UPWARD_ALWAYS == 0,
AGGREGATE_PADDING_FIXED == 1 and AGGREGATES_PAD_UPWARD_ALWAYS == 1,
with and without MUST_PASS_IN_STACK redefined.  All seems well.
Of course, redefining MUST_PASS_IN_STACK means that code compiled
with previous versions of gcc might not be compatible with a new
gcc, but I think we need to suffer the pain now so that we're
compliant with the ABI.
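
As a standalone sketch of the shift computation that the padding
direction now controls in emit_group_load: the function name and the
explicit endianness parameter are hypothetical, and BITS_PER_UNIT is
assumed to be 8.

```c
#include <assert.h>

enum direction { none, upward, downward };

#define BITS_PER_UNIT 8		/* assumption: 8-bit bytes */

/* Shift, in bits, needed to move the trailing fragment of a struct
   from the lsb of a register to where the padding direction says it
   belongs.  BYTELEN is the size of the register piece, SSIZE the
   size of the struct and BYTEPOS the offset of the piece, all in
   bytes.  Returns 0 when no shift is needed.  */
int
trailing_fragment_shift (int bytes_big_endian, enum direction padding,
			 int bytepos, int bytelen, int ssize)
{
  int shift = 0;

  assert (ssize > bytepos);	/* at least one live byte in the piece */
  if (bytepos + bytelen > ssize
      && padding == (bytes_big_endian ? upward : downward))
    shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
  return shift;
}
```

For a 12-byte struct passed in 8-byte registers, the second piece
overruns the struct by 4 bytes, so padding upward on a big-endian
target gives a shift of 32 bits, while padding downward gives none.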

	* expr.h (struct locate_and_pad_arg_data): Add where_pad.
	(BLOCK_REG_PADDING): Define.
	(emit_group_load, emit_group_store): Adjust declarations.
	* expr.c (emit_group_load): Add "type" param, and use
	BLOCK_REG_PADDING to determine need for a shift.  Optimize non-
	aligned accesses if !SLOW_UNALIGNED_ACCESS.
	(emit_group_store): Likewise.
	(emit_push_insn, expand_assignment, store_expr, expand_expr): Adjust
	emit_group_load and emit_group_store calls.
	* calls.c (store_unaligned_arguments_into_pseudos): Tidy.  Use
	BLOCK_REG_PADDING to determine whether we need endian_correction.
	(load_register_parameters): Localize vars.  Handle shifting of
	small values to the correct end of regs.  Adjust emit_group_load
	call.
	(expand_call, emit_library_call_value_1): Adjust emit_group_load
	and emit_group_store calls.
	* function.c (assign_parms): Set mem alignment for stack slots.
	Adjust emit_group_store call.  Store values at the "wrong" end
	of regs to the stack.  Use BLOCK_REG_PADDING.
	(locate_and_pad_parm): Save where_pad.
	(expand_function_end): Adjust emit_group_load call.
	* stmt.c (expand_value_return): Adjust emit_group_load call.
	* Makefile.in (calls.o): Depend on $(OPTABS_H).

	* config/rs6000/linux64.h (TARGET_BIG_ENDIAN): Redefine as 1.
	(FIXED_R13): Delete.
	(AGGREGATE_PADDING_FIXED): Define.
	(MUST_PASS_IN_STACK): Define.
	* config/rs6000/rs6000.h (struct rs6000_args): Remove orig_nargs.
	(PAD_VARARGS_DOWN): Define in terms of FUNCTION_ARG_PADDING.
	* config/rs6000/rs6000.c (init_cumulative_args): Don't set orig_nargs.
	(function_arg_padding): !AGGREGATE_PADDING_FIXED compatibility code.
	Act on AGGREGATES_PAD_UPWARD_ALWAYS.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

diff -urp gcc2/gcc/expr.h gcc3/gcc/expr.h
--- gcc2/gcc/expr.h	2003-04-24 15:31:56.000000000 +0930
+++ gcc3/gcc/expr.h	2003-04-24 21:07:17.000000000 +0930
@@ -93,6 +93,8 @@ struct locate_and_pad_arg_data
   /* The amount that the stack pointer needs to be adjusted to
      force alignment for the next argument.  */
   struct args_size alignment_pad;
+  /* Which way we should pad this arg.  */
+  enum direction where_pad;
 };
 #endif
 
@@ -150,6 +152,14 @@ do {							\
       ? downward : upward))
 #endif
 
+/* Specify padding for the last element of a block move between
+   registers and memory.  FIRST is non-zero if this is the only
+   element.  */
+#ifndef BLOCK_REG_PADDING
+#define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
+  (!(FIRST) ? upward : FUNCTION_ARG_PADDING (MODE, TYPE))
+#endif
+
 /* Supply a default definition for FUNCTION_ARG_BOUNDARY.  Normally, we let
    FUNCTION_ARG_PADDING, which also pads the length, handle any needed
    alignment.  */
@@ -416,19 +426,21 @@ extern void move_block_from_reg PARAMS (
 /* Generate a non-consecutive group of registers represented by a PARALLEL.  */
 extern rtx gen_group_rtx PARAMS ((rtx));
 
+#ifdef TREE_CODE
 /* Load a BLKmode value into non-consecutive registers represented by a
    PARALLEL.  */
-extern void emit_group_load PARAMS ((rtx, rtx, int));
+extern void emit_group_load PARAMS ((rtx, rtx, tree, int));
+#endif
 
 /* Move a non-consecutive group of registers represented by a PARALLEL into
    a non-consecutive group of registers represented by a PARALLEL.  */
 extern void emit_group_move PARAMS ((rtx, rtx));
 
+#ifdef TREE_CODE
 /* Store a BLKmode value from non-consecutive registers represented by a
    PARALLEL.  */
-extern void emit_group_store PARAMS ((rtx, rtx, int));
+extern void emit_group_store PARAMS ((rtx, rtx, tree, int));
 
-#ifdef TREE_CODE
 /* Copy BLKmode object from a set of registers.  */
 extern rtx copy_blkmode_from_reg PARAMS ((rtx, rtx, tree));
 #endif
diff -urp gcc2/gcc/expr.c gcc3/gcc/expr.c
--- gcc2/gcc/expr.c	2003-04-24 14:46:04.000000000 +0930
+++ gcc3/gcc/expr.c	2003-04-24 21:07:17.000000000 +0930
@@ -2212,19 +2212,15 @@ gen_group_rtx (orig)
   return gen_rtx_PARALLEL (GET_MODE (orig), gen_rtvec_v (length, tmps));
 }
 
-/* Emit code to move a block SRC to a block DST, where DST is non-consecutive
-   registers represented by a PARALLEL.  SSIZE represents the total size of
-   block SRC in bytes, or -1 if not known.  */
-/* ??? If SSIZE % UNITS_PER_WORD != 0, we make the blatant assumption that
-   the balance will be in what would be the low-order memory addresses, i.e.
-   left justified for big endian, right justified for little endian.  This
-   happens to be true for the targets currently using this support.  If this
-   ever changes, a new target macro along the lines of FUNCTION_ARG_PADDING
-   would be needed.  */
+/* Emit code to move a block ORIG_SRC of type TYPE to a block DST,
+   where DST is non-consecutive registers represented by a PARALLEL.
+   SSIZE represents the total size of block ORIG_SRC in bytes, or -1
+   if not known.  */ 
 
 void
-emit_group_load (dst, orig_src, ssize)
+emit_group_load (dst, orig_src, type, ssize)
      rtx dst, orig_src;
+     tree type;
      int ssize;
 {
   rtx *tmps, src;
@@ -2253,7 +2249,11 @@ emit_group_load (dst, orig_src, ssize)
       /* Handle trailing fragments that run over the size of the struct.  */
       if (ssize >= 0 && bytepos + (HOST_WIDE_INT) bytelen > ssize)
 	{
-	  shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
+	  /* Arrange to shift the fragment to where it belongs.
+	     extract_bit_field loads to the lsb of the reg.  */
+	  if (BLOCK_REG_PADDING (GET_MODE (orig_src), type, i == start)
+	      == (BYTES_BIG_ENDIAN ? upward : downward))
+	    shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
 	  bytelen = ssize - bytepos;
 	  if (bytelen <= 0)
 	    abort ();
@@ -2278,7 +2278,8 @@ emit_group_load (dst, orig_src, ssize)
 
       /* Optimize the access just a bit.  */
       if (GET_CODE (src) == MEM
-	  && MEM_ALIGN (src) >= GET_MODE_ALIGNMENT (mode)
+	  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (src))
+	      || MEM_ALIGN (src) >= GET_MODE_ALIGNMENT (mode))
 	  && bytepos * BITS_PER_UNIT % GET_MODE_ALIGNMENT (mode) == 0
 	  && bytelen == GET_MODE_SIZE (mode))
 	{
@@ -2321,7 +2322,7 @@ emit_group_load (dst, orig_src, ssize)
 				     bytepos * BITS_PER_UNIT, 1, NULL_RTX,
 				     mode, mode, ssize);
 
-      if (BYTES_BIG_ENDIAN && shift)
+      if (shift)
 	expand_binop (mode, ashl_optab, tmps[i], GEN_INT (shift),
 		      tmps[i], 0, OPTAB_WIDEN);
     }
@@ -2353,13 +2354,16 @@ emit_group_move (dst, src)
 		    XEXP (XVECEXP (src, 0, i), 0));
 }
 
-/* Emit code to move a block SRC to a block DST, where SRC is non-consecutive
-   registers represented by a PARALLEL.  SSIZE represents the total size of
-   block DST, or -1 if not known.  */
+/* Emit code to move a block SRC to a block ORIG_DST of type TYPE,
+   where SRC is non-consecutive registers represented by a PARALLEL.
+   SSIZE represents the total size of block ORIG_DST, or -1 if not
+   known.  */
 
 void
-emit_group_store (orig_dst, src, ssize)
-     rtx orig_dst, src;
+emit_group_store (orig_dst, src, type, ssize)
+     rtx orig_dst;
+     rtx src;
+     tree type ATTRIBUTE_UNUSED;
      int ssize;
 {
   rtx *tmps, dst;
@@ -2404,8 +2408,8 @@ emit_group_store (orig_dst, src, ssize)
 	 the temporary.  */
 
       temp = assign_stack_temp (GET_MODE (dst), ssize, 0);
-      emit_group_store (temp, src, ssize);
-      emit_group_load (dst, temp, ssize);
+      emit_group_store (temp, src, type, ssize);
+      emit_group_load (dst, temp, type, ssize);
       return;
     }
   else if (GET_CODE (dst) != MEM && GET_CODE (dst) != CONCAT)
@@ -2426,7 +2430,10 @@ emit_group_store (orig_dst, src, ssize)
       /* Handle trailing fragments that run over the size of the struct.  */
       if (ssize >= 0 && bytepos + (HOST_WIDE_INT) bytelen > ssize)
 	{
-	  if (BYTES_BIG_ENDIAN)
+	  /* store_bit_field always takes its value from the lsb.
+	     Move the fragment to the lsb if it's not already there.  */
+	  if (BLOCK_REG_PADDING (GET_MODE (orig_dst), type, i == start)
+	      == (BYTES_BIG_ENDIAN ? upward : downward))
 	    {
 	      int shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
 	      expand_binop (mode, ashr_optab, tmps[i], GEN_INT (shift),
@@ -2459,7 +2466,8 @@ emit_group_store (orig_dst, src, ssize)
 
       /* Optimize the access just a bit.  */
       if (GET_CODE (dest) == MEM
-	  && MEM_ALIGN (dest) >= GET_MODE_ALIGNMENT (mode)
+	  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (dest))
+	      || MEM_ALIGN (dest) >= GET_MODE_ALIGNMENT (mode))
 	  && bytepos * BITS_PER_UNIT % GET_MODE_ALIGNMENT (mode) == 0
 	  && bytelen == GET_MODE_SIZE (mode))
 	emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
@@ -3991,7 +3999,7 @@ emit_push_insn (x, mode, type, size, ali
       /* Handle calls that pass values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       if (GET_CODE (reg) == PARALLEL)
-	emit_group_load (reg, x, -1);  /* ??? size? */
+	emit_group_load (reg, x, type, -1);
       else
 	move_block_to_reg (REGNO (reg), x, partial, mode);
     }
@@ -4194,7 +4202,8 @@ expand_assignment (to, from, want_value,
       /* Handle calls that return values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       if (GET_CODE (to_rtx) == PARALLEL)
-	emit_group_load (to_rtx, value, int_size_in_bytes (TREE_TYPE (from)));
+	emit_group_load (to_rtx, value, TREE_TYPE (from),
+			 int_size_in_bytes (TREE_TYPE (from)));
       else if (GET_MODE (to_rtx) == BLKmode)
 	emit_block_move (to_rtx, value, expr_size (from), BLOCK_OP_NORMAL);
       else
@@ -4228,7 +4237,8 @@ expand_assignment (to, from, want_value,
       temp = expand_expr (from, 0, GET_MODE (to_rtx), 0);
 
       if (GET_CODE (to_rtx) == PARALLEL)
-	emit_group_load (to_rtx, temp, int_size_in_bytes (TREE_TYPE (from)));
+	emit_group_load (to_rtx, temp, TREE_TYPE (from),
+			 int_size_in_bytes (TREE_TYPE (from)));
       else
 	emit_move_insn (to_rtx, temp);
 
@@ -4641,7 +4651,8 @@ store_expr (exp, target, want_value)
       /* Handle calls that return values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       else if (GET_CODE (target) == PARALLEL)
-	emit_group_load (target, temp, int_size_in_bytes (TREE_TYPE (exp)));
+	emit_group_load (target, temp, TREE_TYPE (exp),
+			 int_size_in_bytes (TREE_TYPE (exp)));
       else if (GET_MODE (temp) == BLKmode)
 	emit_block_move (target, temp, expr_size (exp),
 			 (want_value & 2
@@ -9182,7 +9193,7 @@ expand_expr (exp, target, tmode, modifie
 		    /* Handle calls that pass values in multiple
 		       non-contiguous locations.  The Irix 6 ABI has examples
 		       of this.  */
-		    emit_group_store (memloc, op0,
+		    emit_group_store (memloc, op0, inner_type,
 				      int_size_in_bytes (inner_type));
 		  else
 		    emit_move_insn (memloc, op0);
diff -urp gcc2/gcc/calls.c gcc3/gcc/calls.c
--- gcc2/gcc/calls.c	2003-04-24 11:25:45.000000000 +0930
+++ gcc3/gcc/calls.c	2003-04-24 21:07:17.000000000 +0930
@@ -27,6 +27,7 @@ Software Foundation, 59 Temple Place - S
 #include "tree.h"
 #include "flags.h"
 #include "expr.h"
+#include "optabs.h"
 #include "libfuncs.h"
 #include "function.h"
 #include "regs.h"
@@ -1015,22 +1016,22 @@ store_unaligned_arguments_into_pseudos (
 	    < (unsigned int) MIN (BIGGEST_ALIGNMENT, BITS_PER_WORD)))
       {
 	int bytes = int_size_in_bytes (TREE_TYPE (args[i].tree_value));
-	int big_endian_correction = 0;
-
-	args[i].n_aligned_regs
-	  = args[i].partial ? args[i].partial
-	    : (bytes + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD;
+	int nregs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+	int endian_correction = 0;
 
+	args[i].n_aligned_regs = args[i].partial ? args[i].partial : nregs;
 	args[i].aligned_regs = (rtx *) xmalloc (sizeof (rtx)
 						* args[i].n_aligned_regs);
 
-	/* Structures smaller than a word are aligned to the least
-	   significant byte (to the right).  On a BYTES_BIG_ENDIAN machine,
+	/* Structures smaller than a word are normally aligned to the
+	   least significant byte.  On a BYTES_BIG_ENDIAN machine,
 	   this means we must skip the empty high order bytes when
 	   calculating the bit offset.  */
-	if (BYTES_BIG_ENDIAN
-	    && bytes < UNITS_PER_WORD)
-	  big_endian_correction = (BITS_PER_WORD  - (bytes * BITS_PER_UNIT));
+	if (bytes < UNITS_PER_WORD
+	    && (BLOCK_REG_PADDING (args[i].mode,
+				   TREE_TYPE (args[i].tree_value), 1)
+		== downward))
+	  endian_correction = BITS_PER_WORD - bytes * BITS_PER_UNIT;
 
 	for (j = 0; j < args[i].n_aligned_regs; j++)
 	  {
@@ -1039,6 +1040,8 @@ store_unaligned_arguments_into_pseudos (
 	    int bitsize = MIN (bytes * BITS_PER_UNIT, BITS_PER_WORD);
 
 	    args[i].aligned_regs[j] = reg;
+	    word = extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
+				      word_mode, word_mode, BITS_PER_WORD);
 
 	    /* There is no need to restrict this code to loading items
 	       in TYPE_ALIGN sized hunks.  The bitfield instructions can
@@ -1054,11 +1057,8 @@ store_unaligned_arguments_into_pseudos (
 	    emit_move_insn (reg, const0_rtx);
 
 	    bytes -= bitsize / BITS_PER_UNIT;
-	    store_bit_field (reg, bitsize, big_endian_correction, word_mode,
-			     extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
-						word_mode, word_mode,
-						BITS_PER_WORD),
-			     BITS_PER_WORD);
+	    store_bit_field (reg, bitsize, endian_correction, word_mode,
+			     word, BITS_PER_WORD);
 	  }
       }
 }
@@ -1689,34 +1689,45 @@ load_register_parameters (args, num_actu
     {
       rtx reg = ((flags & ECF_SIBCALL)
 		 ? args[i].tail_call_reg : args[i].reg);
-      int partial = args[i].partial;
-      int nregs;
-
       if (reg)
 	{
+	  int partial = args[i].partial;
+	  int nregs;
+	  int size = 0;
 	  rtx before_arg = get_last_insn ();
 	  /* Set to non-negative if must move a word at a time, even if just
 	     one word (e.g, partial == 1 && mode == DFmode).  Set to -1 if
 	     we just use a normal move insn.  This value can be zero if the
 	     argument is a zero size structure with no fields.  */
-	  nregs = (partial ? partial
-		   : (TYPE_MODE (TREE_TYPE (args[i].tree_value)) == BLKmode
-		      ? ((int_size_in_bytes (TREE_TYPE (args[i].tree_value))
-			  + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD)
-		      : -1));
+	  nregs = -1;
+	  if (partial)
+	    nregs = partial;
+	  else if (TYPE_MODE (TREE_TYPE (args[i].tree_value)) == BLKmode)
+	    {
+	      size = int_size_in_bytes (TREE_TYPE (args[i].tree_value));
+	      nregs = (size + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD;
+	    }
+	  else
+	    size = GET_MODE_SIZE (args[i].mode);
 
 	  /* Handle calls that pass values in multiple non-contiguous
 	     locations.  The Irix 6 ABI has examples of this.  */
 
 	  if (GET_CODE (reg) == PARALLEL)
-	    emit_group_load (reg, args[i].value,
-			     int_size_in_bytes (TREE_TYPE (args[i].tree_value)));
+	    {
+	      tree type = TREE_TYPE (args[i].tree_value);
+	      emit_group_load (reg, args[i].value, type,
+			       int_size_in_bytes (type));
+	    }
 
 	  /* If simple case, just do move.  If normal partial, store_one_arg
 	     has already loaded the register for us.  In all other cases,
 	     load the register(s) from memory.  */
 
-	  else if (nregs == -1)
+	  else if (nregs == -1
+		   && !(size < UNITS_PER_WORD
+			&& (args[i].locate.where_pad
+			    == (BYTES_BIG_ENDIAN ? upward : downward))))
 	    emit_move_insn (reg, args[i].value);
 
 	  /* If we have pre-computed the values to put in the registers in
@@ -1728,9 +1739,42 @@ load_register_parameters (args, num_actu
 			      args[i].aligned_regs[j]);
 
 	  else if (partial == 0 || args[i].pass_on_stack)
-	    move_block_to_reg (REGNO (reg),
-			       validize_mem (args[i].value), nregs,
-			       args[i].mode);
+	    {
+	      rtx mem = validize_mem (args[i].value);
+
+	      /* Handle case where we have a value that needs shifting
+		 up to the msb.  eg. a QImode value and we're padding
+		 upward on a BYTES_BIG_ENDIAN machine.  */
+	      if (nregs == -1)
+		{
+		  rtx ri = gen_rtx_REG (word_mode, REGNO (reg));
+		  rtx x;
+		  int shift = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+		  x = expand_binop (word_mode, ashl_optab, mem,
+				    GEN_INT (shift), ri, 1, OPTAB_WIDEN);
+		  if (x != ri)
+		    emit_move_insn (ri, x);
+		}
+
+	      /* Handle a BLKmode that needs shifting.  */
+	      else if (nregs == 1 && size < UNITS_PER_WORD
+		       && args[i].locate.where_pad == downward)
+		{
+		  rtx tem = operand_subword_force (mem, 0, args[i].mode);
+		  rtx ri = gen_rtx_REG (word_mode, REGNO (reg));
+		  rtx x = gen_reg_rtx (word_mode);
+		  int shift = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+		  optab dir = BYTES_BIG_ENDIAN ? lshr_optab : ashl_optab;
+
+		  emit_move_insn (x, tem);
+		  x = expand_binop (word_mode, dir, x, GEN_INT (shift),
+				    ri, 1, OPTAB_WIDEN);
+		  if (x != ri)
+		    emit_move_insn (ri, x);
+		}
+	      else
+		move_block_to_reg (REGNO (reg), mem, nregs, args[i].mode);
+	    }
 
 	  /* When a parameter is a block, and perhaps in other cases, it is
 	     possible that it did a load from an argument slot that was
@@ -3225,7 +3269,7 @@ expand_call (exp, target, ignore)
 	    }
 
 	  if (! rtx_equal_p (target, valreg))
-	    emit_group_store (target, valreg,
+	    emit_group_store (target, valreg, TREE_TYPE (exp),
 			      int_size_in_bytes (TREE_TYPE (exp)));
 
 	  /* We can not support sibling calls for this case.  */
@@ -3976,7 +4020,7 @@ emit_library_call_value_1 (retval, orgfu
       /* Handle calls that pass values in multiple non-contiguous
 	 locations.  The PA64 has examples of this for library calls.  */
       if (reg != 0 && GET_CODE (reg) == PARALLEL)
-	emit_group_load (reg, val, GET_MODE_SIZE (GET_MODE (val)));
+	emit_group_load (reg, val, NULL_TREE, GET_MODE_SIZE (GET_MODE (val)));
       else if (reg != 0 && partial == 0)
 	emit_move_insn (reg, val);
 
@@ -4080,7 +4124,7 @@ emit_library_call_value_1 (retval, orgfu
 	  if (GET_CODE (valreg) == PARALLEL)
 	    {
 	      temp = gen_reg_rtx (outmode);
-	      emit_group_store (temp, valreg, outmode);
+	      emit_group_store (temp, valreg, NULL_TREE, outmode);
 	      valreg = temp;
 	    }
 
@@ -4123,7 +4167,7 @@ emit_library_call_value_1 (retval, orgfu
 	{
 	  if (value == 0)
 	    value = gen_reg_rtx (outmode);
-	  emit_group_store (value, valreg, outmode);
+	  emit_group_store (value, valreg, NULL_TREE, outmode);
 	}
       else if (value != 0)
 	emit_move_insn (value, valreg);
diff -urp gcc2/gcc/function.c gcc3/gcc/function.c
--- gcc2/gcc/function.c	2003-04-24 17:52:45.000000000 +0930
+++ gcc3/gcc/function.c	2003-04-24 21:07:17.000000000 +0930
@@ -4623,6 +4623,8 @@ assign_parms (fndecl)
 						  offset_rtx));
 
 	set_mem_attributes (stack_parm, parm, 1);
+	if (entry_parm && MEM_ATTRS (stack_parm)->align < PARM_BOUNDARY)
+	  set_mem_align (stack_parm, PARM_BOUNDARY);
 
 	/* Set also REG_ATTRS if parameter was passed in a register.  */
 	if (entry_parm)
@@ -4654,6 +4656,7 @@ assign_parms (fndecl)
 	     locations.  The Irix 6 ABI has examples of this.  */
 	  if (GET_CODE (entry_parm) == PARALLEL)
 	    emit_group_store (validize_mem (stack_parm), entry_parm,
+			      TREE_TYPE (parm),
 			      int_size_in_bytes (TREE_TYPE (parm)));
 
 	  else
@@ -4757,7 +4760,10 @@ assign_parms (fndecl)
 
 	 Set DECL_RTL to that place.  */
 
-      if (nominal_mode == BLKmode || GET_CODE (entry_parm) == PARALLEL)
+      if (nominal_mode == BLKmode
+	  || (locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
+	      && GET_MODE_SIZE (promoted_mode) < UNITS_PER_WORD)
+	  || GET_CODE (entry_parm) == PARALLEL)
 	{
 	  /* If a BLKmode arrives in registers, copy it to a stack slot.
 	     Handle calls that pass values in multiple non-contiguous
@@ -4793,7 +4799,7 @@ assign_parms (fndecl)
 	      /* Handle calls that pass values in multiple non-contiguous
 		 locations.  The Irix 6 ABI has examples of this.  */
 	      if (GET_CODE (entry_parm) == PARALLEL)
-		emit_group_store (mem, entry_parm, size);
+		emit_group_store (mem, entry_parm, TREE_TYPE (parm), size);
 
 	      /* If SIZE is that of a mode no bigger than a word, just use
 		 that mode's store operation.  */
@@ -4802,7 +4808,10 @@ assign_parms (fndecl)
 		  enum machine_mode mode
 		    = mode_for_size (size * BITS_PER_UNIT, MODE_INT, 0);
 
-		  if (mode != BLKmode)
+		  if (mode != BLKmode
+		      && (size == UNITS_PER_WORD
+			  || (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
+			      != (BYTES_BIG_ENDIAN ? upward : downward))))
 		    {
 		      rtx reg = gen_rtx_REG (mode, REGNO (entry_parm));
 		      emit_move_insn (change_address (mem, mode, 0), reg);
@@ -4813,7 +4822,8 @@ assign_parms (fndecl)
 		     to memory.  Note that the previous test doesn't
 		     handle all cases (e.g. SIZE == 3).  */
 		  else if (size != UNITS_PER_WORD
-			   && BYTES_BIG_ENDIAN)
+			   && (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
+			       == downward))
 		    {
 		      rtx tem, x;
 		      int by = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
@@ -5411,6 +5421,7 @@ locate_and_pad_parm (passed_mode, type, 
     = type ? size_in_bytes (type) : size_int (GET_MODE_SIZE (passed_mode));
   where_pad = FUNCTION_ARG_PADDING (passed_mode, type);
   boundary = FUNCTION_ARG_BOUNDARY (passed_mode, type);
+  locate->where_pad = where_pad;
 
 #ifdef ARGS_GROW_DOWNWARD
   locate->slot_offset.constant = -initial_offset_ptr->constant;
@@ -7142,6 +7153,7 @@ expand_function_end (filename, line, end
 		emit_group_move (real_decl_rtl, decl_rtl);
 	      else
 		emit_group_load (real_decl_rtl, decl_rtl,
+				 TREE_TYPE (decl_result),
 				 int_size_in_bytes (TREE_TYPE (decl_result)));
 	    }
 	  else
diff -urp gcc2/gcc/stmt.c gcc3/gcc/stmt.c
--- gcc2/gcc/stmt.c	2003-04-22 18:55:34.000000000 +0930
+++ gcc3/gcc/stmt.c	2003-04-24 21:07:18.000000000 +0930
@@ -3008,7 +3008,7 @@ expand_value_return (val)
 	val = convert_modes (mode, old_mode, val, unsignedp);
 #endif
       if (GET_CODE (return_reg) == PARALLEL)
-	emit_group_load (return_reg, val, int_size_in_bytes (type));
+	emit_group_load (return_reg, val, type, int_size_in_bytes (type));
       else
 	emit_move_insn (return_reg, val);
     }
diff -urp gcc2/gcc/Makefile.in gcc3/gcc/Makefile.in
--- gcc2/gcc/Makefile.in	2003-04-24 15:47:39.000000000 +0930
+++ gcc3/gcc/Makefile.in	2003-04-24 21:14:33.000000000 +0930
@@ -1531,7 +1531,7 @@ builtins.o : builtins.c $(CONFIG_H) $(SY
    $(RECOG_H) output.h typeclass.h hard-reg-set.h toplev.h hard-reg-set.h \
    except.h $(TM_P_H) $(PREDICT_H) libfuncs.h real.h langhooks.h
 calls.o : calls.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) flags.h \
-   $(EXPR_H) langhooks.h $(TARGET_H) \
+   $(EXPR_H) $(OPTABS_H) langhooks.h $(TARGET_H) \
    libfuncs.h $(REGS_H) toplev.h output.h function.h $(TIMEVAR_H) $(TM_P_H) cgraph.h except.h
 expmed.o : expmed.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
    flags.h insn-config.h $(EXPR_H) $(OPTABS_H) $(RECOG_H) real.h \
diff -urp gcc2/gcc/config/rs6000/linux64.h gcc3/gcc/config/rs6000/linux64.h
--- gcc2/gcc/config/rs6000/linux64.h	2003-04-22 18:55:34.000000000 +0930
+++ gcc3/gcc/config/rs6000/linux64.h	2003-04-24 21:07:17.000000000 +0930
@@ -49,6 +49,10 @@
 #undef	TARGET_64BIT
 #define	TARGET_64BIT		1
 
+/* And we're always big-endian.  */
+#undef	TARGET_BIG_ENDIAN
+#define TARGET_BIG_ENDIAN 1
+
 /* 64-bit PowerPC Linux always has a TOC.  */
 #undef  TARGET_NO_TOC
 #define TARGET_NO_TOC		0
@@ -143,8 +147,25 @@
 #undef  JUMP_TABLES_IN_TEXT_SECTION
 #define JUMP_TABLES_IN_TEXT_SECTION 1
 
-/* 64-bit PowerPC Linux always has GPR13 fixed.  */
-#define FIXED_R13		1
+/* The linux ppc64 ABI isn't explicit on whether aggregates smaller
+   than a doubleword should be padded upward or downward.  You could
+   reasonably assume that they follow the normal rules for structure
+   layout treating the parameter area as any other block of memory,
+   then map the reg param area to registers.  ie. pad upward.
+   Setting both of the following defines results in this behaviour.
+   Setting just the first one will result in aggregates that fit in a
+   doubleword being padded downward, and others being padded upward.
+   Not a bad idea as this results in struct { int x; } being passed
+   the same way as an int.  */
+#define AGGREGATE_PADDING_FIXED 1
+/* #define AGGREGATES_PAD_UPWARD_ALWAYS 1 */
+
+/* We don't want anything in the reg parm area being passed on the
+   stack.  */
+#define MUST_PASS_IN_STACK(MODE, TYPE)				\
+  ((TYPE) != 0							\
+   && (TREE_CODE (TYPE_SIZE (TYPE)) != INTEGER_CST		\
+       || TREE_ADDRESSABLE (TYPE)))
 
 /* __throw will restore its own return address to be the same as the
    return address of the function that the throw is being made to.
diff -urp gcc2/gcc/config/rs6000/rs6000.h gcc3/gcc/config/rs6000/rs6000.h
--- gcc2/gcc/config/rs6000/rs6000.h	2003-04-23 11:29:04.000000000 +0930
+++ gcc3/gcc/config/rs6000/rs6000.h	2003-04-24 21:07:17.000000000 +0930
@@ -1701,7 +1701,6 @@ typedef struct rs6000_args
   int fregno;			/* next available FP register */
   int vregno;			/* next available AltiVec register */
   int nargs_prototype;		/* # args left in the current prototype */
-  int orig_nargs;		/* Original value of nargs_prototype */
   int prototype;		/* Whether a prototype was defined */
   int call_cookie;		/* Do special things for this call */
   int sysv_gregno;		/* next available GP register */
@@ -1832,13 +1831,8 @@ typedef struct rs6000_args
 #define EXPAND_BUILTIN_VA_ARG(valist, type) \
   rs6000_va_arg (valist, type)
 
-/* For AIX, the rule is that structures are passed left-aligned in
-   their stack slot.  However, GCC does not presently do this:
-   structures which are the same size as integer types are passed
-   right-aligned, as if they were in fact integers.  This only
-   matters for structures of size 1 or 2, or 4 when TARGET_64BIT.
-   ABI_V4 does not use std_expand_builtin_va_arg.  */
-#define PAD_VARARGS_DOWN (TYPE_MODE (type) != BLKmode)
+#define PAD_VARARGS_DOWN \
+   (FUNCTION_ARG_PADDING (TYPE_MODE (type), type) == downward)
 
 /* Define this macro to be a nonzero value if the location where a function
    argument is passed depends on whether or not it is a named argument.  */
diff -urp gcc2/gcc/config/rs6000/rs6000.c gcc3/gcc/config/rs6000/rs6000.c
--- gcc2/gcc/config/rs6000/rs6000.c	2003-04-24 14:46:04.000000000 +0930
+++ gcc3/gcc/config/rs6000/rs6000.c	2003-04-24 21:43:35.000000000 +0930
@@ -3132,8 +3132,6 @@ init_cumulative_args (cum, fntype, libna
   else
     cum->nargs_prototype = 0;
 
-  cum->orig_nargs = cum->nargs_prototype;
-
   /* Check for a longcall attribute.  */
   if (fntype
       && lookup_attribute ("longcall", TYPE_ATTRIBUTES (fntype))
@@ -3172,8 +3170,37 @@ function_arg_padding (mode, type)
      enum machine_mode mode;
      tree type;
 {
+#if !AGGREGATE_PADDING_FIXED
+  /* GCC used to pass structures of the same size as integer types as
+     if they were in fact integers, ignoring FUNCTION_ARG_PADDING.
+     ie. Structures of size 1 or 2 (or 4 when TARGET_64BIT) were
+     passed padded downward, except that -mstrict-align further
+     muddied the water in that multi-component structures of 2 and 4
+     bytes in size were passed padded upward.
+
+     The following arranges for best compatibility with previous
+     versions of gcc, but removes the -mstrict-align dependency.  */
+  if (BYTES_BIG_ENDIAN)
+    {
+      HOST_WIDE_INT size = 0;
+
+      if (mode == BLKmode)
+	{
+	  if (type && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
+	    size = int_size_in_bytes (type);
+	}
+      else
+	size = GET_MODE_SIZE (mode);
+
+      if (size == 1 || size == 2 || size == 4)
+	return downward;
+    }
+  return upward;
+#else
+#if AGGREGATES_PAD_UPWARD_ALWAYS
   if (type != 0 && AGGREGATE_TYPE_P (type))
     return upward;
+#endif
 
   /* This is the default definition.  */
   return (! BYTES_BIG_ENDIAN
@@ -3183,6 +3210,7 @@ function_arg_padding (mode, type)
                  && int_size_in_bytes (type) < (PARM_BOUNDARY / BITS_PER_UNIT))
               : GET_MODE_BITSIZE (mode) < PARM_BOUNDARY)
              ? downward : upward));
+#endif
 }
 
 /* If defined, a C expression that gives the alignment boundary, in bits,

^ permalink raw reply	[flat|nested] 875+ messages in thread

* function parms in regs, patch 1 of 3
@ 2003-04-24 15:34 Alan Modra
  2003-04-25 22:44 ` Richard Henderson
  2003-05-02  5:06 ` function parms in regs, patch 1 " Jim Wilson
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2003-04-24 15:34 UTC (permalink / raw)
  To: gcc-patches

I've been playing with gcc's function call code over the last week or
so.  The goal was to properly pass structures by value in registers,
according to the PowerPC64 ABI, and avoid the problems illustrated
in PR 10397 and PR 10408.  What started out as a relatively
straightforward patch using PARALLELs in rs6000.c:function_arg has grown
a little, so much so that it needs splitting up.

Patch 1 of 3 (this one) mostly mucks about with locate_and_pad_parm.
Patch 2 of 3 moves special case code out of move_block_from_reg.
Patch 3 of 3 is the new code to handle big-endian reg parms, and
rs6000 specific changes.

Patches 1 and 2 can probably be applied in reverse order as they're
independent, although I haven't tested that.  Regression tested
powerpc64-linux after each patch cumulatively applied.  I'll also
bootstrap and regression test i686-linux and powerpc-linux again
(these were tested on earlier versions of the patch).

This patch
a) Packages struct arg_data fields set by locate_and_pad_parm into a
   separate struct, reducing the number of function parms.  The resulting
   change to callers happens to fix a bug in emit_library_call_value_1
   where alignment_pad was being set for each arg in one loop, then used
   in another loop for each arg.  ie. we used the last arg's
   alignment_pad for all args.
b) Calculates slot_offset in locate_and_pad_parm.  This requires an
   extra input parm to specify partial-in-reg parms, which also lets us
   correct the arg size calculation in the function instead of adjusting
   it externally.  Incidentally fixes a bug in calls.c:
   initialize_argument_information ARGS_GROW_DOWNWARD calculation of
   slot_offset for variable size args by deleting the buggy code.
c) Fixes bugs in locate_and_pad_parm handling of initial_offset_ptr.
   I can't see anything that guarantees when initial_offset_ptr->var
   is non-NULL that initial_offset_ptr->constant == 0, but that's what
   the old code assumes.
d) Removes the hacks in locate_and_pad_parm that fudged the stack offset
   for pad-down reg parms.  Instead we choose slot_offset externally.
   Note! For !ARGS_GROW_DOWNWARD, downward padding and non-BLKmode args,
   this changes the stack home for a reg arg, which shouldn't be a
   problem.  I'm raising the point here for full disclosure.  :)
e) Optimizes a few things in assign_parms.
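
To make the bug described in (a) concrete, here is a minimal standalone
sketch.  It is not GCC code: struct arg_loc, locate() and the pad values
are hypothetical stand-ins for struct locate_and_pad_arg_data and
locate_and_pad_parm, chosen only to show why a single shared
alignment_pad ends up holding the last arg's value, and why a per-arg
struct avoids that.

```c
#define NARGS 3

/* Hypothetical stand-in for struct locate_and_pad_arg_data.  */
struct arg_loc
{
  int offset;
  int alignment_pad;
};

/* Hypothetical stand-in for locate_and_pad_parm: argument I gets a
   made-up offset of 8 * I and a made-up alignment pad of 4 * I.  */
static void
locate (int i, struct arg_loc *loc)
{
  loc->offset = 8 * i;
  loc->alignment_pad = 4 * i;
}

/* Old shape: one shared variable is overwritten for each arg in the
   first loop, so a later loop sees only the last arg's value no matter
   which arg it is working on.  */
static int
old_pad_seen_by_arg (int which)
{
  struct arg_loc scratch;
  int alignment_pad = 0;
  int i;

  for (i = 0; i < NARGS; i++)
    {
      locate (i, &scratch);
      alignment_pad = scratch.alignment_pad;
    }

  (void) which;			/* WHICH cannot influence the result.  */
  return alignment_pad;
}

/* New shape: each arg carries its own location data, so the later
   loop reads the pad that belongs to that arg.  */
static int
new_pad_seen_by_arg (int which)
{
  struct arg_loc argvec[NARGS];
  int i;

  for (i = 0; i < NARGS; i++)
    locate (i, &argvec[i]);

  return argvec[which].alignment_pad;
}
```

With these made-up numbers, old_pad_seen_by_arg returns the third arg's
pad for every arg, while new_pad_seen_by_arg returns each arg's own pad.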

	* calls.c (struct arg_data): Move offset, slot_offset, size and
	alignment_pad to struct locate_and_pad_arg_data.  Update all refs.
	(initialize_argument_information): Adjust call to locate_and_pad_parm.
	Delete alignment_pad var.  Don't calculate slot_offset here.
	(emit_library_call_value_1): Delete alignment_pad, offset and size
	vars.  Use struct locate_and_pad_arg_data instead.  Adjust refs.
	Adjust call to locate_and_pad_parm.  Don't tweak arg size for
	partial in-regs here.  Formatting fixes.
	* expr.h (struct locate_and_pad_arg_data): New struct.
	(locate_and_pad_parm): Adjust declaration.
	* function.c (assign_parms): Localize vars.  Use "locate" instead of
	other arg location vars.  Don't invoke FUNCTION_ARG or
	FUNCTION_INCOMING_ARG unless pretend_named is different from
	named_arg.  Heed MUST_PASS_IN_STACK and set up "partial" before
	calling locate_and_pad_parm.  Adjust locate_and_pad_parm call.
	Use slot_offset for stack home of reg parms.  Correct test for
	parm passed in memory.  Formatting fixes.
	(locate_and_pad_parm): Add "partial" to params.  Replace offset_ptr,
	arg_size_ptr and alignment_pad with "locate".  Set slot_offset here.
	Correct initial_offset_ptr handling.  Localize vars.  Always pad
	locate->offset even when in_regs.
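
For readers unfamiliar with the offset/slot_offset distinction used
above, the following is a simplified sketch of the relationship.  It
assumes !ARGS_GROW_DOWNWARD, constant sizes only, and a slot that is
always rounded up to the boundary.  struct slot, round_up() and
place_arg() are hypothetical names for illustration and do not appear
in the patch.

```c
enum direction {none, upward, downward};

/* Hypothetical, constant-only stand-in for the location fields.  */
struct slot
{
  long slot_offset;	/* Start of the stack slot.  */
  long offset;		/* Where the argument's bytes begin.  */
  long size;		/* Slot size, rounded up for padding.  */
};

/* Round VALUE up to a multiple of BOUNDARY (both in bytes,
   BOUNDARY > 0).  */
static long
round_up (long value, long boundary)
{
  return (value + boundary - 1) / boundary * boundary;
}

/* Place an argument of ARG_SIZE bytes at or after START.  When padding
   downward the padding goes below the argument, so the argument sits
   at the high end of its slot and OFFSET differs from SLOT_OFFSET.  */
static struct slot
place_arg (long start, long arg_size, long boundary,
	   enum direction where_pad)
{
  struct slot s;

  s.slot_offset = round_up (start, boundary);
  s.size = round_up (arg_size, boundary);
  s.offset = s.slot_offset;
  if (where_pad == downward)
    s.offset += s.size - arg_size;
  return s;
}
```

Under these assumptions a 1-byte arg in an 8-byte slot starting at 0 has
slot_offset 0 and offset 7 when padded downward, and offset 0 when
padded upward.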

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

diff -urp gcc.orig/gcc/calls.c gcc1/gcc/calls.c
--- gcc.orig/gcc/calls.c	2003-04-22 18:55:34.000000000 +0930
+++ gcc1/gcc/calls.c	2003-04-24 11:25:45.000000000 +0930
@@ -98,16 +98,8 @@ struct arg_data
      even though pass_on_stack is zero, just because FUNCTION_ARG says so.
      pass_on_stack identifies arguments that *cannot* go in registers.  */
   int pass_on_stack;
-  /* Offset of this argument from beginning of stack-args.  */
-  struct args_size offset;
-  /* Similar, but offset to the start of the stack slot.  Different from
-     OFFSET if this arg pads downward.  */
-  struct args_size slot_offset;
-  /* Size of this argument on the stack, rounded up for any padding it gets,
-     parts of the argument passed in registers do not count.
-     If REG_PARM_STACK_SPACE is defined, then register parms
-     are counted here as well.  */
-  struct args_size size;
+  /* Some fields packaged up for locate_and_pad_parm.  */
+  struct locate_and_pad_arg_data locate;
   /* Location on the stack at which parameter should be stored.  The store
      has already been done if STACK == VALUE.  */
   rtx stack;
@@ -123,9 +115,6 @@ struct arg_data
      word-sized pseudos we made.  */
   rtx *aligned_regs;
   int n_aligned_regs;
-  /* The amount that the stack pointer needs to be adjusted to
-     force alignment for the next argument.  */
-  struct args_size alignment_pad;
 };
 
 /* A vector of one char per byte of stack space.  A byte if nonzero if
@@ -1120,7 +1109,6 @@ initialize_argument_information (num_act
   /* Count arg position in order args appear.  */
   int argpos;
 
-  struct args_size alignment_pad;
   int i;
   tree p;
 
@@ -1331,39 +1319,14 @@ initialize_argument_information (num_act
 #else
 			     args[i].reg != 0,
 #endif
-			     fndecl, args_size, &args[i].offset,
-			     &args[i].size, &alignment_pad);
-
-#ifndef ARGS_GROW_DOWNWARD
-      args[i].slot_offset = *args_size;
-#endif
-
-      args[i].alignment_pad = alignment_pad;
-
-      /* If a part of the arg was put into registers,
-	 don't include that part in the amount pushed.  */
-      if (reg_parm_stack_space == 0 && ! args[i].pass_on_stack)
-	args[i].size.constant -= ((args[i].partial * UNITS_PER_WORD)
-				  / (PARM_BOUNDARY / BITS_PER_UNIT)
-				  * (PARM_BOUNDARY / BITS_PER_UNIT));
+			     args[i].pass_on_stack ? 0 : args[i].partial,
+			     fndecl, args_size, &args[i].locate);
 
       /* Update ARGS_SIZE, the total stack space for args so far.  */
 
-      args_size->constant += args[i].size.constant;
-      if (args[i].size.var)
-	{
-	  ADD_PARM_SIZE (*args_size, args[i].size.var);
-	}
-
-      /* Since the slot offset points to the bottom of the slot,
-	 we must record it after incrementing if the args grow down.  */
-#ifdef ARGS_GROW_DOWNWARD
-      args[i].slot_offset = *args_size;
-
-      args[i].slot_offset.constant = -args_size->constant;
-      if (args_size->var)
-	SUB_PARM_SIZE (args[i].slot_offset, args_size->var);
-#endif
+      args_size->constant += args[i].locate.size.constant;
+      if (args[i].locate.size.var)
+	ADD_PARM_SIZE (*args_size, args[i].locate.size.var);
 
       /* Increment ARGS_SO_FAR, which has info about which arg-registers
 	 have been used, etc.  */
@@ -1616,8 +1579,8 @@ compute_argument_addresses (args, argblo
 
       for (i = 0; i < num_actuals; i++)
 	{
-	  rtx offset = ARGS_SIZE_RTX (args[i].offset);
-	  rtx slot_offset = ARGS_SIZE_RTX (args[i].slot_offset);
+	  rtx offset = ARGS_SIZE_RTX (args[i].locate.offset);
+	  rtx slot_offset = ARGS_SIZE_RTX (args[i].locate.slot_offset);
 	  rtx addr;
 
 	  /* Skip this parm if it will not be passed on the stack.  */
@@ -2060,12 +2023,12 @@ check_sibcall_argument_overlap (insn, ar
   if (mark_stored_args_map)
     {
 #ifdef ARGS_GROW_DOWNWARD
-      low = -arg->slot_offset.constant - arg->size.constant;
+      low = -arg->locate.slot_offset.constant - arg->locate.size.constant;
 #else
-      low = arg->slot_offset.constant;
+      low = arg->locate.slot_offset.constant;
 #endif
 
-      for (high = low + arg->size.constant; low < high; low++)
+      for (high = low + arg->locate.size.constant; low < high; low++)
 	SET_BIT (stored_args_map, low);
     }
   return insn != NULL_RTX;
@@ -3358,7 +3321,7 @@ expand_call (exp, target, ignore)
 		  emit_move_insn (stack_area, args[i].save_area);
 		else
 		  emit_block_move (stack_area, args[i].save_area,
-				   GEN_INT (args[i].size.constant),
+				   GEN_INT (args[i].locate.size.constant),
 				   BLOCK_OP_CALL_PARM);
 	      }
 
@@ -3507,7 +3470,6 @@ emit_library_call_value_1 (retval, orgfu
   rtx fun;
   int inc;
   int count;
-  struct args_size alignment_pad;
   rtx argblock = 0;
   CUMULATIVE_ARGS args_so_far;
   struct arg
@@ -3516,8 +3478,7 @@ emit_library_call_value_1 (retval, orgfu
     enum machine_mode mode;
     rtx reg;
     int partial;
-    struct args_size offset;
-    struct args_size size;
+    struct locate_and_pad_arg_data locate;
     rtx save_area;
   };
   struct arg *argvec;
@@ -3677,12 +3638,11 @@ emit_library_call_value_1 (retval, orgfu
 #else
 			   argvec[count].reg != 0,
 #endif
-			   NULL_TREE, &args_size, &argvec[count].offset,
-			   &argvec[count].size, &alignment_pad);
+			   0, NULL_TREE, &args_size, &argvec[count].locate);
 
       if (argvec[count].reg == 0 || argvec[count].partial != 0
 	  || reg_parm_stack_space > 0)
-	args_size.constant += argvec[count].size.constant;
+	args_size.constant += argvec[count].locate.size.constant;
 
       FUNCTION_ARG_ADVANCE (args_so_far, Pmode, (tree) 0, 1);
 
@@ -3796,18 +3756,15 @@ emit_library_call_value_1 (retval, orgfu
 #else
 			   argvec[count].reg != 0,
 #endif
-			   NULL_TREE, &args_size, &argvec[count].offset,
-			   &argvec[count].size, &alignment_pad);
+			   argvec[count].partial,
+			   NULL_TREE, &args_size, &argvec[count].locate);
 
-      if (argvec[count].size.var)
+      if (argvec[count].locate.size.var)
 	abort ();
 
-      if (reg_parm_stack_space == 0 && argvec[count].partial)
-	argvec[count].size.constant -= argvec[count].partial * UNITS_PER_WORD;
-
       if (argvec[count].reg == 0 || argvec[count].partial != 0
 	  || reg_parm_stack_space > 0)
-	args_size.constant += argvec[count].size.constant;
+	args_size.constant += argvec[count].locate.size.constant;
 
       FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
     }
@@ -3945,11 +3902,11 @@ emit_library_call_value_1 (retval, orgfu
 #ifdef ARGS_GROW_DOWNWARD
 	      /* stack_slot is negative, but we want to index stack_usage_map
 		 with positive values.  */
-	      upper_bound = -argvec[argnum].offset.constant + 1;
-	      lower_bound = upper_bound - argvec[argnum].size.constant;
+	      upper_bound = -argvec[argnum].locate.offset.constant + 1;
+	      lower_bound = upper_bound - argvec[argnum].locate.size.constant;
 #else
-	      lower_bound = argvec[argnum].offset.constant;
-	      upper_bound = lower_bound + argvec[argnum].size.constant;
+	      lower_bound = argvec[argnum].locate.offset.constant;
+	      upper_bound = lower_bound + argvec[argnum].locate.size.constant;
 #endif
 
 	      i = lower_bound;
@@ -3962,19 +3919,16 @@ emit_library_call_value_1 (retval, orgfu
 
 	      if (i < upper_bound)
 		{
-		  /* We need to make a save area.  See what mode we can make
-		     it.  */
+		  /* We need to make a save area.  */
+		  unsigned int size
+		    = argvec[argnum].locate.size.constant * BITS_PER_UNIT;
 		  enum machine_mode save_mode
-		    = mode_for_size (argvec[argnum].size.constant
-				     * BITS_PER_UNIT,
-				     MODE_INT, 1);
+		    = mode_for_size (size, MODE_INT, 1);
+		  rtx adr
+		    = plus_constant (argblock,
+				     argvec[argnum].locate.offset.constant);
 		  rtx stack_area
-		    = gen_rtx_MEM
-		      (save_mode,
-		       memory_address
-		       (save_mode,
-			plus_constant (argblock,
-				       argvec[argnum].offset.constant)));
+		    = gen_rtx_MEM (save_mode, memory_address (save_mode, adr));
 		  argvec[argnum].save_area = gen_reg_rtx (save_mode);
 
 		  emit_move_insn (argvec[argnum].save_area, stack_area);
@@ -3983,8 +3937,9 @@ emit_library_call_value_1 (retval, orgfu
 
 	  emit_push_insn (val, mode, NULL_TREE, NULL_RTX, PARM_BOUNDARY,
 			  partial, reg, 0, argblock,
-			  GEN_INT (argvec[argnum].offset.constant),
-			  reg_parm_stack_space, ARGS_SIZE_RTX (alignment_pad));
+			  GEN_INT (argvec[argnum].locate.offset.constant),
+			  reg_parm_stack_space,
+			  ARGS_SIZE_RTX (argvec[argnum].locate.alignment_pad));
 
 	  /* Now mark the segment we just used.  */
 	  if (ACCUMULATE_OUTGOING_ARGS)
@@ -4189,12 +4144,10 @@ emit_library_call_value_1 (retval, orgfu
 	if (argvec[count].save_area)
 	  {
 	    enum machine_mode save_mode = GET_MODE (argvec[count].save_area);
-	    rtx stack_area
-	      = gen_rtx_MEM (save_mode,
-			     memory_address
-			     (save_mode,
-			      plus_constant (argblock,
-					     argvec[count].offset.constant)));
+	    rtx adr = plus_constant (argblock,
+				     argvec[count].locate.offset.constant);
+	    rtx stack_area = gen_rtx_MEM (save_mode,
+					  memory_address (save_mode, adr));
 
 	    emit_move_insn (stack_area, argvec[count].save_area);
 	  }
@@ -4321,14 +4274,14 @@ store_one_arg (arg, argblock, flags, var
 	  else
 	    upper_bound = 0;
 
-	  lower_bound = upper_bound - arg->size.constant;
+	  lower_bound = upper_bound - arg->locate.size.constant;
 #else
 	  if (GET_CODE (XEXP (arg->stack_slot, 0)) == PLUS)
 	    lower_bound = INTVAL (XEXP (XEXP (arg->stack_slot, 0), 1));
 	  else
 	    lower_bound = 0;
 
-	  upper_bound = lower_bound + arg->size.constant;
+	  upper_bound = lower_bound + arg->locate.size.constant;
 #endif
 
 	  i = lower_bound;
@@ -4341,13 +4294,11 @@ store_one_arg (arg, argblock, flags, var
 
 	  if (i < upper_bound)
 	    {
-	      /* We need to make a save area.  See what mode we can make it.  */
-	      enum machine_mode save_mode
-		= mode_for_size (arg->size.constant * BITS_PER_UNIT, MODE_INT, 1);
-	      rtx stack_area
-		= gen_rtx_MEM (save_mode,
-			       memory_address (save_mode,
-					       XEXP (arg->stack_slot, 0)));
+	      /* We need to make a save area.  */
+	      unsigned int size = arg->locate.size.constant * BITS_PER_UNIT;
+	      enum machine_mode save_mode = mode_for_size (size, MODE_INT, 1);
+	      rtx adr = memory_address (save_mode, XEXP (arg->stack_slot, 0));
+	      rtx stack_area = gen_rtx_MEM (save_mode, adr);
 
 	      if (save_mode == BLKmode)
 		{
@@ -4475,8 +4426,8 @@ store_one_arg (arg, argblock, flags, var
 	 This can either be done with push or copy insns.  */
       emit_push_insn (arg->value, arg->mode, TREE_TYPE (pval), NULL_RTX, 
 		      PARM_BOUNDARY, partial, reg, used - size, argblock,
-		      ARGS_SIZE_RTX (arg->offset), reg_parm_stack_space,
-		      ARGS_SIZE_RTX (arg->alignment_pad));
+		      ARGS_SIZE_RTX (arg->locate.offset), reg_parm_stack_space,
+		      ARGS_SIZE_RTX (arg->locate.alignment_pad));
 
       /* Unless this is a partially-in-register argument, the argument is now
 	 in the stack.  */
@@ -4498,16 +4449,17 @@ store_one_arg (arg, argblock, flags, var
       /* Round its size up to a multiple
 	 of the allocation unit for arguments.  */
 
-      if (arg->size.var != 0)
+      if (arg->locate.size.var != 0)
 	{
 	  excess = 0;
-	  size_rtx = ARGS_SIZE_RTX (arg->size);
+	  size_rtx = ARGS_SIZE_RTX (arg->locate.size);
 	}
       else
 	{
 	  /* PUSH_ROUNDING has no effect on us, because
 	     emit_push_insn for BLKmode is careful to avoid it.  */
-	  excess = (arg->size.constant - int_size_in_bytes (TREE_TYPE (pval))
+	  excess = (arg->locate.size.constant
+		    - int_size_in_bytes (TREE_TYPE (pval))
 		    + partial * UNITS_PER_WORD);
 	  size_rtx = expand_expr (size_in_bytes (TREE_TYPE (pval)),
 				  NULL_RTX, TYPE_MODE (sizetype), 0);
@@ -4521,7 +4473,7 @@ store_one_arg (arg, argblock, flags, var
 	 PARM_BOUNDARY, but the actual argument isn't.  */
       if (FUNCTION_ARG_PADDING (arg->mode, TREE_TYPE (pval)) == downward)
 	{
-	  if (arg->size.var)
+	  if (arg->locate.size.var)
 	    parm_align = BITS_PER_UNIT;
 	  else if (excess)
 	    {
@@ -4533,7 +4485,7 @@ store_one_arg (arg, argblock, flags, var
       if ((flags & ECF_SIBCALL) && GET_CODE (arg->value) == MEM)
 	{
 	  /* emit_push_insn might not work properly if arg->value and
-	     argblock + arg->offset areas overlap.  */
+	     argblock + arg->locate.offset areas overlap.  */
 	  rtx x = arg->value;
 	  int i = 0;
 
@@ -4547,17 +4499,17 @@ store_one_arg (arg, argblock, flags, var
 		i = INTVAL (XEXP (XEXP (x, 0), 1));
 
 	      /* expand_call should ensure this */
-	      if (arg->offset.var || GET_CODE (size_rtx) != CONST_INT)
+	      if (arg->locate.offset.var || GET_CODE (size_rtx) != CONST_INT)
 		abort ();
 
-	      if (arg->offset.constant > i)
+	      if (arg->locate.offset.constant > i)
 		{
-		  if (arg->offset.constant < i + INTVAL (size_rtx))
+		  if (arg->locate.offset.constant < i + INTVAL (size_rtx))
 		    sibcall_failure = 1;
 		}
-	      else if (arg->offset.constant < i)
+	      else if (arg->locate.offset.constant < i)
 		{
-		  if (i < arg->offset.constant + INTVAL (size_rtx))
+		  if (i < arg->locate.offset.constant + INTVAL (size_rtx))
 		    sibcall_failure = 1;
 		}
 	    }
@@ -4565,8 +4517,8 @@ store_one_arg (arg, argblock, flags, var
 
       emit_push_insn (arg->value, arg->mode, TREE_TYPE (pval), size_rtx,
 		      parm_align, partial, reg, excess, argblock,
-		      ARGS_SIZE_RTX (arg->offset), reg_parm_stack_space,
-		      ARGS_SIZE_RTX (arg->alignment_pad));
+		      ARGS_SIZE_RTX (arg->locate.offset), reg_parm_stack_space,
+		      ARGS_SIZE_RTX (arg->locate.alignment_pad));
 
       /* Unless this is a partially-in-register argument, the argument is now
 	 in the stack.
diff -urp gcc.orig/gcc/expr.h gcc1/gcc/expr.h
--- gcc.orig/gcc/expr.h	2003-04-22 18:55:34.000000000 +0930
+++ gcc1/gcc/expr.h	2003-04-24 11:29:11.000000000 +0930
@@ -63,6 +63,8 @@ enum expand_modifier {EXPAND_NORMAL = 0,
    more information.  */
 #define OK_DEFER_POP (inhibit_defer_pop -= 1)
 \f
+enum direction {none, upward, downward};
+
 #ifdef TREE_CODE /* Don't lose if tree.h not included.  */
 /* Structure to record the size of a sequence of arguments
    as the sum of a tree-expression and a constant.  This structure is
@@ -74,6 +76,24 @@ struct args_size
   HOST_WIDE_INT constant;
   tree var;
 };
+
+/* Package up various arg related fields of struct args for
+   locate_and_pad_parm.  */
+struct locate_and_pad_arg_data
+{
+  /* Size of this argument on the stack, rounded up for any padding it
+     gets.  If REG_PARM_STACK_SPACE is defined, then register parms are
+     counted here, otherwise they aren't.  */
+  struct args_size size;
+  /* Offset of this argument from beginning of stack-args.  */
+  struct args_size offset;
+  /* Offset to the start of the stack slot.  Different from OFFSET
+     if this arg pads downward.  */
+  struct args_size slot_offset;
+  /* The amount that the stack pointer needs to be adjusted to
+     force alignment for the next argument.  */
+  struct args_size alignment_pad;
+};
 #endif
 
 /* Add the value of the tree INC to the `struct args_size' TO.  */
@@ -119,8 +139,6 @@ do {							\
    usually pad upward, but pad short args downward on
    big-endian machines.  */
 
-enum direction {none, upward, downward};  /* Value has this type.  */
-
 #ifndef FUNCTION_ARG_PADDING
 #define FUNCTION_ARG_PADDING(MODE, TYPE)				\
   (! BYTES_BIG_ENDIAN							\
@@ -567,11 +585,9 @@ extern rtx expand_shift PARAMS ((enum tr
 				 rtx, int));
 extern rtx expand_divmod PARAMS ((int, enum tree_code, enum machine_mode, rtx,
 				  rtx, rtx, int));
-extern void locate_and_pad_parm PARAMS ((enum machine_mode, tree, int, tree,
-					 struct args_size *,
-					 struct args_size *,
-					 struct args_size *,
-					 struct args_size *));
+extern void locate_and_pad_parm PARAMS ((enum machine_mode, tree, int, int,
+					 tree, struct args_size *,
+					 struct locate_and_pad_arg_data *));
 extern rtx expand_inline_function PARAMS ((tree, tree, rtx, int, tree, rtx));
 
 /* Return the CODE_LABEL rtx for a LABEL_DECL, creating it if necessary.  */
diff -urp gcc.orig/gcc/function.c gcc1/gcc/function.c
--- gcc.orig/gcc/function.c	2003-04-23 11:29:02.000000000 +0930
+++ gcc1/gcc/function.c	2003-04-24 17:50:45.000000000 +0930
@@ -4338,12 +4338,7 @@ assign_parms (fndecl)
      tree fndecl;
 {
   tree parm;
-  rtx entry_parm = 0;
-  rtx stack_parm = 0;
   CUMULATIVE_ARGS args_so_far;
-  enum machine_mode promoted_mode, passed_mode;
-  enum machine_mode nominal_mode, promoted_nominal_mode;
-  int unsignedp;
   /* Total space needed so far for args on the stack,
      given as a constant and a tree-expression.  */
   struct args_size stack_args_size;
@@ -4357,8 +4352,8 @@ assign_parms (fndecl)
 #ifdef SETUP_INCOMING_VARARGS
   int varargs_setup = 0;
 #endif
+  int reg_parm_stack_space = 0;
   rtx conversion_insns = 0;
-  struct args_size alignment_pad;
 
   /* Nonzero if function takes extra anonymous args.
      This means the last named arg must be on the stack
@@ -4405,6 +4400,14 @@ assign_parms (fndecl)
   max_parm_reg = LAST_VIRTUAL_REGISTER + 1;
   parm_reg_stack_loc = (rtx *) ggc_alloc_cleared (max_parm_reg * sizeof (rtx));
 
+#ifdef REG_PARM_STACK_SPACE
+#ifdef MAYBE_REG_PARM_STACK_SPACE
+  reg_parm_stack_space = MAYBE_REG_PARM_STACK_SPACE;
+#else
+  reg_parm_stack_space = REG_PARM_STACK_SPACE (fndecl);
+#endif
+#endif
+
 #ifdef INIT_CUMULATIVE_INCOMING_ARGS
   INIT_CUMULATIVE_INCOMING_ARGS (args_so_far, fntype, NULL_RTX);
 #else
@@ -4417,14 +4420,19 @@ assign_parms (fndecl)
 
   for (parm = fnargs; parm; parm = TREE_CHAIN (parm))
     {
-      struct args_size stack_offset;
-      struct args_size arg_size;
+      rtx entry_parm;
+      rtx stack_parm;
+      enum machine_mode promoted_mode, passed_mode;
+      enum machine_mode nominal_mode, promoted_nominal_mode;
+      int unsignedp;
+      struct locate_and_pad_arg_data locate;
       int passed_pointer = 0;
       int did_conversion = 0;
       tree passed_type = DECL_ARG_TYPE (parm);
       tree nominal_type = TREE_TYPE (parm);
-      int pretend_named;
       int last_named = 0, named_arg;
+      int in_regs;
+      int partial = 0;
 
       /* Set LAST_NAMED if this is last named arg before last
 	 anonymous args.  */
@@ -4488,7 +4496,7 @@ assign_parms (fndecl)
 	  || TREE_ADDRESSABLE (passed_type)
 #ifdef FUNCTION_ARG_PASS_BY_REFERENCE
 	  || FUNCTION_ARG_PASS_BY_REFERENCE (args_so_far, passed_mode,
-					      passed_type, named_arg)
+					     passed_type, named_arg)
 #endif
 	  )
 	{
@@ -4558,27 +4566,52 @@ assign_parms (fndecl)
 	 it came in a register so that REG_PARM_STACK_SPACE isn't skipped.
 	 In this case, we call FUNCTION_ARG with NAMED set to 1 instead of
 	 0 as it was the previous time.  */
-
-      pretend_named = named_arg || PRETEND_OUTGOING_VARARGS_NAMED;
-      locate_and_pad_parm (promoted_mode, passed_type,
+      in_regs = entry_parm != 0;
 #ifdef STACK_PARMS_IN_REG_PARM_AREA
-			   1,
-#else
+      in_regs = 1;
+#endif
+      if (!in_regs && !named_arg)
+	{
+	  int pretend_named = PRETEND_OUTGOING_VARARGS_NAMED;
+	  if (pretend_named)
+	    {
 #ifdef FUNCTION_INCOMING_ARG
-			   FUNCTION_INCOMING_ARG (args_so_far, promoted_mode,
-						  passed_type,
-						  pretend_named) != 0,
+	      in_regs = FUNCTION_INCOMING_ARG (args_so_far, promoted_mode,
+					       passed_type,
+					       pretend_named) != 0;
 #else
-			   FUNCTION_ARG (args_so_far, promoted_mode,
-					 passed_type,
-					 pretend_named) != 0,
+	      in_regs = FUNCTION_ARG (args_so_far, promoted_mode,
+				      passed_type,
+				      pretend_named) != 0;
 #endif
+	    }
+	}
+
+      /* If this parameter was passed both in registers and in the stack,
+	 use the copy on the stack.  */
+      if (MUST_PASS_IN_STACK (promoted_mode, passed_type))
+	entry_parm = 0;
+
+#ifdef FUNCTION_ARG_PARTIAL_NREGS
+      if (entry_parm)
+	partial = FUNCTION_ARG_PARTIAL_NREGS (args_so_far, promoted_mode,
+					      passed_type, named_arg);
 #endif
-			   fndecl, &stack_args_size, &stack_offset, &arg_size,
-			   &alignment_pad);
+
+      memset (&locate, 0, sizeof (locate));
+      locate_and_pad_parm (promoted_mode, passed_type, in_regs,
+			   entry_parm ? partial : 0, fndecl,
+			   &stack_args_size, &locate);
 
       {
-	rtx offset_rtx = ARGS_SIZE_RTX (stack_offset);
+	rtx offset_rtx;
+
+	/* If we're passing this arg using a reg, make its stack home
+	   the aligned stack slot.  */
+	if (entry_parm)
+	  offset_rtx = ARGS_SIZE_RTX (locate.slot_offset);
+	else
+	  offset_rtx = ARGS_SIZE_RTX (locate.offset);
 
 	if (offset_rtx == const0_rtx)
 	  stack_parm = gen_rtx_MEM (promoted_mode, internal_arg_pointer);
@@ -4595,12 +4628,6 @@ assign_parms (fndecl)
 	  set_reg_attrs_for_parm (entry_parm, stack_parm);
       }
 
-      /* If this parameter was passed both in registers and in the stack,
-	 use the copy on the stack.  */
-      if (MUST_PASS_IN_STACK (promoted_mode, passed_type))
-	entry_parm = 0;
-
-#ifdef FUNCTION_ARG_PARTIAL_NREGS
       /* If this parm was passed part in regs and part in memory,
 	 pretend it arrived entirely in memory
 	 by pushing the register-part onto the stack.
@@ -4609,39 +4636,31 @@ assign_parms (fndecl)
 	 we could put it together in a pseudoreg directly,
 	 but for now that's not worth bothering with.  */
 
-      if (entry_parm)
+      if (partial)
 	{
-	  int nregs = FUNCTION_ARG_PARTIAL_NREGS (args_so_far, promoted_mode,
-						  passed_type, named_arg);
-
-	  if (nregs > 0)
-	    {
-#if defined (REG_PARM_STACK_SPACE) && !defined (MAYBE_REG_PARM_STACK_SPACE)
-	      /* When REG_PARM_STACK_SPACE is nonzero, stack space for
-		 split parameters was allocated by our caller, so we
-		 won't be pushing it in the prolog.  */
-	      if (REG_PARM_STACK_SPACE (fndecl) == 0)
-#endif
-	      current_function_pretend_args_size
-		= (((nregs * UNITS_PER_WORD) + (PARM_BOUNDARY / BITS_PER_UNIT) - 1)
-		   / (PARM_BOUNDARY / BITS_PER_UNIT)
-		   * (PARM_BOUNDARY / BITS_PER_UNIT));
+#ifndef MAYBE_REG_PARM_STACK_SPACE
+	  /* When REG_PARM_STACK_SPACE is nonzero, stack space for
+	     split parameters was allocated by our caller, so we
+	     won't be pushing it in the prolog.  */
+	  if (reg_parm_stack_space == 0)
+#endif
+	  current_function_pretend_args_size
+	    = (((partial * UNITS_PER_WORD) + (PARM_BOUNDARY / BITS_PER_UNIT) - 1)
+	       / (PARM_BOUNDARY / BITS_PER_UNIT)
+	       * (PARM_BOUNDARY / BITS_PER_UNIT));
 
-	      /* Handle calls that pass values in multiple non-contiguous
-		 locations.  The Irix 6 ABI has examples of this.  */
-	      if (GET_CODE (entry_parm) == PARALLEL)
-		emit_group_store (validize_mem (stack_parm), entry_parm,
-				  int_size_in_bytes (TREE_TYPE (parm)));
+	  /* Handle calls that pass values in multiple non-contiguous
+	     locations.  The Irix 6 ABI has examples of this.  */
+	  if (GET_CODE (entry_parm) == PARALLEL)
+	    emit_group_store (validize_mem (stack_parm), entry_parm,
+			      int_size_in_bytes (TREE_TYPE (parm)));
 
-	      else
-		move_block_from_reg (REGNO (entry_parm),
-				     validize_mem (stack_parm), nregs,
-				     int_size_in_bytes (TREE_TYPE (parm)));
+	  else
+	    move_block_from_reg (REGNO (entry_parm), validize_mem (stack_parm),
+				 partial, int_size_in_bytes (TREE_TYPE (parm)));
 
-	      entry_parm = stack_parm;
-	    }
+	  entry_parm = stack_parm;
 	}
-#endif
 
       /* If we didn't decide this parm came in a register,
 	 by default it came on the stack.  */
@@ -4672,9 +4691,9 @@ assign_parms (fndecl)
 #endif
 	  )
 	{
-	  stack_args_size.constant += arg_size.constant;
-	  if (arg_size.var)
-	    ADD_PARM_SIZE (stack_args_size, arg_size.var);
+	  stack_args_size.constant += locate.size.constant;
+	  if (locate.size.var)
+	    ADD_PARM_SIZE (stack_args_size, locate.size.var);
 	}
       else
 	/* No stack slot was pushed for this parm.  */
@@ -4698,7 +4717,7 @@ assign_parms (fndecl)
 
       /* If parm was passed in memory, and we need to convert it on entry,
 	 don't store it back in that same slot.  */
-      if (entry_parm != 0
+      if (entry_parm == stack_parm
 	  && nominal_mode != BLKmode && nominal_mode != passed_mode)
 	stack_parm = 0;
 
@@ -5021,7 +5040,7 @@ assign_parms (fndecl)
 	      && ! did_conversion
 	      && stack_parm != 0
 	      && GET_CODE (stack_parm) == MEM
-	      && stack_offset.var == 0
+	      && locate.offset.var == 0
 	      && reg_mentioned_p (virtual_incoming_args_rtx,
 				  XEXP (stack_parm, 0)))
 	    {
@@ -5107,7 +5126,8 @@ assign_parms (fndecl)
 		{
 		  stack_parm
 		    = assign_stack_local (GET_MODE (entry_parm),
-					  GET_MODE_SIZE (GET_MODE (entry_parm)), 0);
+					  GET_MODE_SIZE (GET_MODE (entry_parm)),
+					  0);
 		  set_mem_attributes (stack_parm, parm, 1);
 		}
 
@@ -5278,8 +5298,11 @@ promoted_input_arg (regno, pmode, punsig
    INITIAL_OFFSET_PTR points to the current offset into the stacked
    arguments.
 
-   The starting offset and size for this parm are returned in *OFFSET_PTR
-   and *ARG_SIZE_PTR, respectively.
+   The starting offset and size for this parm are returned in
+   LOCATE->OFFSET and LOCATE->SIZE, respectively.  When IN_REGS is
+   nonzero, the offset is that of stack slot, which is returned in
+   LOCATE->SLOT_OFFSET.  LOCATE->ALIGNMENT_PAD is the amount of
+   padding required from the initial offset ptr to the stack slot.
 
    IN_REGS is nonzero if the argument will be passed in registers.  It will
    never be set if REG_PARM_STACK_SPACE is not defined.
@@ -5296,45 +5319,39 @@ promoted_input_arg (regno, pmode, punsig
    initial offset is not affected by this rounding, while the size always
    is and the starting offset may be.  */
 
-/*  offset_ptr will be negative for ARGS_GROW_DOWNWARD case;
-    initial_offset_ptr is positive because locate_and_pad_parm's
+/*  LOCATE->OFFSET will be negative for ARGS_GROW_DOWNWARD case;
+    INITIAL_OFFSET_PTR is positive because locate_and_pad_parm's
     callers pass in the total size of args so far as
-    initial_offset_ptr. arg_size_ptr is always positive.  */
+    INITIAL_OFFSET_PTR.  LOCATE->SIZE is always positive.  */
 
 void
-locate_and_pad_parm (passed_mode, type, in_regs, fndecl,
-		     initial_offset_ptr, offset_ptr, arg_size_ptr,
-		     alignment_pad)
+locate_and_pad_parm (passed_mode, type, in_regs, partial, fndecl,
+		     initial_offset_ptr, locate)
      enum machine_mode passed_mode;
      tree type;
-     int in_regs ATTRIBUTE_UNUSED;
+     int in_regs;
+     int partial;
      tree fndecl ATTRIBUTE_UNUSED;
      struct args_size *initial_offset_ptr;
-     struct args_size *offset_ptr;
-     struct args_size *arg_size_ptr;
-     struct args_size *alignment_pad;
-
+     struct locate_and_pad_arg_data *locate;
 {
-  tree sizetree
-    = type ? size_in_bytes (type) : size_int (GET_MODE_SIZE (passed_mode));
-  enum direction where_pad = FUNCTION_ARG_PADDING (passed_mode, type);
-  int boundary = FUNCTION_ARG_BOUNDARY (passed_mode, type);
-#ifdef ARGS_GROW_DOWNWARD
-  tree s2 = sizetree;
-#endif
+  tree sizetree;
+  enum direction where_pad;
+  int boundary;
+  int reg_parm_stack_space = 0;
+  int part_size_in_regs;
 
 #ifdef REG_PARM_STACK_SPACE
+#ifdef MAYBE_REG_PARM_STACK_SPACE
+  reg_parm_stack_space = MAYBE_REG_PARM_STACK_SPACE;
+#else
+  reg_parm_stack_space = REG_PARM_STACK_SPACE (fndecl);
+#endif
+
   /* If we have found a stack parm before we reach the end of the
      area reserved for registers, skip that area.  */
   if (! in_regs)
     {
-      int reg_parm_stack_space = 0;
-
-#ifdef MAYBE_REG_PARM_STACK_SPACE
-      reg_parm_stack_space = MAYBE_REG_PARM_STACK_SPACE;
-#else
-      reg_parm_stack_space = REG_PARM_STACK_SPACE (fndecl);
-#endif
       if (reg_parm_stack_space > 0)
 	{
 	  if (initial_offset_ptr->var)
@@ -5350,54 +5367,56 @@ locate_and_pad_parm (passed_mode, type, 
     }
 #endif /* REG_PARM_STACK_SPACE */
 
-  arg_size_ptr->var = 0;
-  arg_size_ptr->constant = 0;
-  alignment_pad->var = 0;
-  alignment_pad->constant = 0;
+  part_size_in_regs = 0;
+  if (reg_parm_stack_space == 0)
+    part_size_in_regs = ((partial * UNITS_PER_WORD)
+			 / (PARM_BOUNDARY / BITS_PER_UNIT)
+			 * (PARM_BOUNDARY / BITS_PER_UNIT));
+
+  sizetree
+    = type ? size_in_bytes (type) : size_int (GET_MODE_SIZE (passed_mode));
+  where_pad = FUNCTION_ARG_PADDING (passed_mode, type);
+  boundary = FUNCTION_ARG_BOUNDARY (passed_mode, type);
 
 #ifdef ARGS_GROW_DOWNWARD
+  locate->slot_offset.constant = -initial_offset_ptr->constant;
   if (initial_offset_ptr->var)
-    {
-      offset_ptr->constant = 0;
-      offset_ptr->var = size_binop (MINUS_EXPR, ssize_int (0),
-				    initial_offset_ptr->var);
-    }
-  else
-    {
-      offset_ptr->constant = -initial_offset_ptr->constant;
-      offset_ptr->var = 0;
-    }
+    locate->slot_offset.var = size_binop (MINUS_EXPR, ssize_int (0),
+					  initial_offset_ptr->var);
 
-  if (where_pad != none
-      && (!host_integerp (sizetree, 1)
-	  || (tree_low_cst (sizetree, 1) * BITS_PER_UNIT) % PARM_BOUNDARY))
-    s2 = round_up (s2, PARM_BOUNDARY / BITS_PER_UNIT);
-  SUB_PARM_SIZE (*offset_ptr, s2);
+  {
+    tree s2 = sizetree;
+    if (where_pad != none
+	&& (!host_integerp (sizetree, 1)
+	    || (tree_low_cst (sizetree, 1) * BITS_PER_UNIT) % PARM_BOUNDARY))
+      s2 = round_up (s2, PARM_BOUNDARY / BITS_PER_UNIT);
+    SUB_PARM_SIZE (locate->slot_offset, s2);
+  }
+
+  locate->slot_offset.constant += part_size_in_regs;
 
   if (!in_regs
 #ifdef REG_PARM_STACK_SPACE
       || REG_PARM_STACK_SPACE (fndecl) > 0
 #endif
      )
-    pad_to_arg_alignment (offset_ptr, boundary, alignment_pad);
+    pad_to_arg_alignment (&locate->slot_offset, boundary,
+			  &locate->alignment_pad);
 
+  locate->size.constant = (-initial_offset_ptr->constant
+			   - locate->slot_offset.constant);
   if (initial_offset_ptr->var)
-    arg_size_ptr->var = size_binop (MINUS_EXPR,
-				    size_binop (MINUS_EXPR,
-						ssize_int (0),
-						initial_offset_ptr->var),
-				    offset_ptr->var);
-
-  else
-    arg_size_ptr->constant = (-initial_offset_ptr->constant
-			      - offset_ptr->constant);
-
-  /* Pad_below needs the pre-rounded size to know how much to pad below.
-     We only pad parameters which are not in registers as they have their
-     padding done elsewhere.  */
-  if (where_pad == downward
-      && !in_regs)
-    pad_below (offset_ptr, passed_mode, sizetree);
+    locate->size.var = size_binop (MINUS_EXPR,
+				   size_binop (MINUS_EXPR,
+					       ssize_int (0),
+					       initial_offset_ptr->var),
+				   locate->slot_offset.var);
+
+  /* Pad_below needs the pre-rounded size to know how much to pad
+     below.  */
+  locate->offset = locate->slot_offset;
+  if (where_pad == downward)
+    pad_below (&locate->offset, passed_mode, sizetree);
 
 #else /* !ARGS_GROW_DOWNWARD */
   if (!in_regs
@@ -5405,8 +5424,9 @@ locate_and_pad_parm (passed_mode, type, 
       || REG_PARM_STACK_SPACE (fndecl) > 0
 #endif
       )
-    pad_to_arg_alignment (initial_offset_ptr, boundary, alignment_pad);
-  *offset_ptr = *initial_offset_ptr;
+    pad_to_arg_alignment (initial_offset_ptr, boundary,
+			  &locate->alignment_pad);
+  locate->slot_offset = *initial_offset_ptr;
 
 #ifdef PUSH_ROUNDING
   if (passed_mode != BLKmode)
@@ -5415,18 +5435,18 @@ locate_and_pad_parm (passed_mode, type, 
 
   /* Pad_below needs the pre-rounded size to know how much to pad below
      so this must be done before rounding up.  */
-  if (where_pad == downward
-    /* However, BLKmode args passed in regs have their padding done elsewhere.
-       The stack slot must be able to hold the entire register.  */
-      && !(in_regs && passed_mode == BLKmode))
-    pad_below (offset_ptr, passed_mode, sizetree);
+  locate->offset = locate->slot_offset;
+  if (where_pad == downward)
+    pad_below (&locate->offset, passed_mode, sizetree);
 
   if (where_pad != none
       && (!host_integerp (sizetree, 1)
 	  || (tree_low_cst (sizetree, 1) * BITS_PER_UNIT) % PARM_BOUNDARY))
     sizetree = round_up (sizetree, PARM_BOUNDARY / BITS_PER_UNIT);
 
-  ADD_PARM_SIZE (*arg_size_ptr, sizetree);
+  ADD_PARM_SIZE (locate->size, sizetree);
+
+  locate->size.constant -= part_size_in_regs;
 #endif /* ARGS_GROW_DOWNWARD */
 }
 
@@ -5465,7 +5485,8 @@ pad_to_arg_alignment (offset_ptr, bounda
 #endif
 	      (ARGS_SIZE_TREE (*offset_ptr),
 	       boundary / BITS_PER_UNIT);
-	  offset_ptr->constant = 0; /*?*/
+	  /* ARGS_SIZE_TREE includes constant term.  */
+	  offset_ptr->constant = 0;
 	  if (boundary > PARM_BOUNDARY && boundary > STACK_BOUNDARY)
 	    alignment_pad->var = size_binop (MINUS_EXPR, offset_ptr->var,
 					     save_var);

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 1 of 3
  2003-04-24 15:34 function parms in regs, patch 1 of 3 Alan Modra
@ 2003-04-25 22:44 ` Richard Henderson
  2003-04-26  0:33   ` Janis Johnson
  2003-04-27 23:34   ` Alan Modra
  2003-05-02  5:06 ` function parms in regs, patch 1 " Jim Wilson
  1 sibling, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2003-04-25 22:44 UTC (permalink / raw)
  To: gcc-patches

On Fri, Apr 25, 2003 at 01:04:16AM +0930, Alan Modra wrote:
> Patch 1 and 2 can probably be applied in reverse order as they're
> independent, although I haven't tested that.  Regression tested
> powerpc64-linux after each patch cumulatively applied.  I'll also
> bootstrap and regression test i686-linux and powerpc-linux again
> (these were tested on earlier versions of the patch).

I think you need to test this more places.  Particularly at least
one ARGS_GROW_DOWNWARD machine:

> +	      lower_bound = upper_bound - argvec[argnum].lcoate.size.constant;
                                                         ^^^

It looks reasonable in concept, but I worry about silent ABI 
changes.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 1 of 3
  2003-04-25 22:44 ` Richard Henderson
@ 2003-04-26  0:33   ` Janis Johnson
  2003-04-27 23:34   ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Janis Johnson @ 2003-04-26  0:33 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

On Fri, Apr 25, 2003 at 03:43:00PM -0700, Richard Henderson wrote:
> It looks reasonable in concept, but I worry about silent ABI 
> changes.

I'm planning to add some binary compatibility tests for C, as well as
adding to the C++ tests in g++/compat.  These let you compile pieces of
a test with the compiler under test plus a different version of GCC (or,
in theory, an entirely different compiler).  I'll start with tests for
argument passing and function return, including structs, and including
structs with specific attributes.  Suggestions for constructs to include
in these tests would be most welcome.
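
As a rough illustration of the kind of argument-passing check described
above, here is a minimal sketch.  The type and function names are
hypothetical and are not taken from the GCC testsuite.  It is shown as a
single translation unit for brevity; in a real split test the caller and
callee halves would sit in separate files so that each can be built by a
different compiler.

```c
#include <string.h>

/* Hypothetical test type: a small struct with mixed field sizes, the
   kind of argument whose padding a silent ABI change could move.  */
struct small_args
{
  char c;
  short s;
  int i;
};

/* "Callee" half.  In a split test this would be compiled by the
   other compiler.  Returns nonzero if every value arrived intact.  */
int
check_small_args (struct small_args by_value, int marker)
{
  return by_value.c == 'x'
	 && by_value.s == 0x1234
	 && by_value.i == 0x56789abc
	 && marker == 42;
}

/* "Caller" half.  Builds the struct, passes it by value, and
   returns 0 on success, 1 on a mismatch.  */
int
run_small_args_test (void)
{
  struct small_args a;

  memset (&a, 0, sizeof a);
  a.c = 'x';
  a.s = 0x1234;
  a.i = 0x56789abc;
  return check_small_args (a, 42) ? 0 : 1;
}
```

A mismatch only shows up when the two halves disagree about where the
struct lives, which is why building them with different compilers is the
interesting case.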

Janis

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 1 of 3
  2003-04-25 22:44 ` Richard Henderson
  2003-04-26  0:33   ` Janis Johnson
@ 2003-04-27 23:34   ` Alan Modra
  2003-04-30 13:29     ` function parms in regs, patch 3 " Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-04-27 23:34 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

On Fri, Apr 25, 2003 at 03:43:00PM -0700, Richard Henderson wrote:
> On Fri, Apr 25, 2003 at 01:04:16AM +0930, Alan Modra wrote:
> I think you need to test this more places.  Particularly at least
> one ARGS_GROW_DOWNWARD machine:
> 
> > +	      lower_bound = upper_bound - argvec[argnum].lcoate.size.constant;

Oops.  Indeed.  Testing hppa-linux.  My old 80MHz PA box is going to
take a while to do a bootstrap..

> It looks reasonable in concept, but I worry about silent ABI 
> changes.

Yes, it's tedious to prove that a patch like this is safe.  For that
reason I'm going to withdraw the emit_group_load/store patches and
concentrate on proving that the rest is reasonably safe.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-04-27 23:34   ` Alan Modra
@ 2003-04-30 13:29     ` Alan Modra
  2003-05-02  6:05       ` Jim Wilson
  2003-07-10  6:55       ` Jim Wilson
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2003-04-30 13:29 UTC (permalink / raw)
  To: gcc-patches, Richard Henderson

On Mon, Apr 28, 2003 at 09:04:30AM +0930, Alan Modra wrote:
> concentrate on proving that the rest is reasonably safe.

Famous last words.  Some musings on the previous BLOCK_REG_PADDING
patch.  Firstly, it's fairly easy to see that targets that use the
default FUNCTION_ARG_PADDING will have no change in behaviour.  The
extra code and changed tests are to support different padding from the
usual, and in every case we're operating with args strictly less than
UNITS_PER_WORD in size.
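
For reference, the default rule can be modelled roughly as below.  This
is a simplified sketch, not the macro from expr.h: the function name and
its plain-integer parameters are assumptions made for illustration, and
it assumes the argument size is a known constant.

```c
#include <stdbool.h>

enum direction { none, upward, downward };

/* Simplified model of the default padding rule: pad upward, except
   that args shorter than the parm boundary pad downward on
   big-endian machines.  SIZE_BYTES and PARM_BOUNDARY_BYTES stand in
   for the argument size and PARM_BOUNDARY / BITS_PER_UNIT.  */
enum direction
default_arg_padding (bool bytes_big_endian, int size_bytes,
		     int parm_boundary_bytes)
{
  if (!bytes_big_endian)
    return upward;
  return size_bytes < parm_boundary_bytes ? downward : upward;
}
```

A target whose FUNCTION_ARG_PADDING gives the same answers as this model
for args smaller than a word falls into the "no change in behaviour"
group.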

Now for the other cases.  The following files define
FUNCTION_ARG_PADDING:

alpha/alpha.h:	Defined as upward, but BYTES_BIG_ENDIAN == 0, so this
is the same as the default.  alpha/unicosmk.h defines BYTES_BIG_ENDIAN
to 1, but #undefs FUNCTION_ARG_PADDING.

m68k/3b1.h:	The same as the default for BYTES_BIG_ENDIAN and
sizes less than a word.
m68k/3b1g.h:	Ditto.
m68k/crds.h:	Ditto.

m88k/m88k.h:	Pads BLKmode, structs and unions upwards, but
m88k_function_arg puts them on the stack unless size is
UNITS_PER_WORD.  Since the changes I made only affect register parms,
we're OK here too.

mips/mips.h:	MIPS uses the standard FUNCTION_ARG_PADDING, but with
some extra tests on ABI that can cause small args to pad upwards.  It
looks to me like the extra tests are for float args, in which case we
should be OK here too.

pa/pa.h:	PA has its own function_arg_padding, but uses
PARALLELs in function_arg for the cases where padding is different
from the default.  Should be OK, I think.

rs6000/rs6000.h: This is the one I _want_ to change, at least for
powerpc64-linux.  :)

sparc/sparc.h:	The default FUNCTION_ARG_PADDING, but aggregates pad
upwards for TARGET_ARCH64.  I think my patch will change behaviour
for small unions of 1, 2 and 4 bytes in size.

ia64/hpux.h:	The default FUNCTION_ARG_PADDING, but aggregates pad
upwards.  PARALLELs are used for aggregates, so there shouldn't be
any surprises here.

m68hc11/m68hc11.h: Here, it looks like aggregates want to pad upwards,
but no special code is added to make that actually happen in
m68hc11_function_arg.  My patch will change behaviour.


So, given that the previous patch will change argument padding in
two targets, even though the current behaviour is probably a bug,
here's a patch that just uses BLOCK_REG_PADDING for rs6000.

	* expr.h (struct locate_and_pad_arg_data): Add where_pad.
	(emit_group_load, emit_group_store): Adjust declarations.
	* expr.c (emit_group_load): Add "type" param, and use
	BLOCK_REG_PADDING to determine need for a shift.  Optimize non-
	aligned accesses if !SLOW_UNALIGNED_ACCESS.
	(emit_group_store): Likewise.
	(emit_push_insn, expand_assignment, store_expr, expand_expr): Adjust
	emit_group_load and emit_group_store calls.
	* calls.c (store_unaligned_arguments_into_pseudos): Tidy.  Use
	BLOCK_REG_PADDING to determine whether we need endian_correction.
	(load_register_parameters): Localize vars.  Handle shifting of
	small values to the correct end of regs.  Adjust emit_group_load
	call.
	(expand_call, emit_library_call_value_1): Adjust emit_group_load
	and emit_group_store calls.
	* function.c (assign_parms): Set mem alignment for stack slots.
	Adjust emit_group_store call.  Store values at the "wrong" end
	of regs to the stack.  Use BLOCK_REG_PADDING.
	(locate_and_pad_parm): Save where_pad.
	(expand_function_end): Adjust emit_group_load call.
	* stmt.c (expand_value_return): Adjust emit_group_load call.
	* Makefile.in (calls.o): Depend on $(OPTABS_H).

	* config/rs6000/linux64.h (TARGET_BIG_ENDIAN): Redefine as 1.
	(FIXED_R13): Delete.
	(AGGREGATE_PADDING_FIXED): Define.
	(MUST_PASS_IN_STACK): Define.
	(BLOCK_REG_PADDING): Define.
	* config/rs6000/rs6000.h (struct rs6000_args): Remove orig_nargs.
	(PAD_VARARGS_DOWN): Define in terms of FUNCTION_ARG_PADDING.
	* config/rs6000/rs6000.c (init_cumulative_args): Don't set orig_nargs.
	(function_arg_padding): !AGGREGATE_PADDING_FIXED compatibility code.
	Act on AGGREGATES_PAD_UPWARD_ALWAYS.

Regression tested powerpc64-linux.  Bootstrapped hppa-linux,
regression testing still in progress.
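
To make the new shift test in emit_group_load easier to follow, here is
a small standalone model of it.  The function name and the integer
parameters are assumptions for illustration only.  The real code works
on rtx and calls BLOCK_REG_PADDING, and it falls back to testing
BYTES_BIG_ENDIAN alone when that macro is not defined.

```c
#include <stdbool.h>

enum direction { none, upward, downward };

/* Model of the trailing-fragment shift.  BYTELEN is the size of the
   register piece, SSIZE the total size of the block (or -1 if not
   known), BYTEPOS the offset of this piece.  REG_PADDING stands in
   for the value BLOCK_REG_PADDING would return.  Returns the left
   shift in bits, or 0 if no shift is needed.  */
int
fragment_shift_bits (bool bytes_big_endian, enum direction reg_padding,
		     int bytelen, int ssize, int bytepos)
{
  const int bits_per_unit = 8;	/* Assumed BITS_PER_UNIT.  */
  enum direction shifted_case = bytes_big_endian ? upward : downward;

  /* Only fragments that run over the end of the struct matter.  */
  if (ssize < 0 || bytepos + bytelen <= ssize)
    return 0;

  /* The fragment is loaded to the lsb of the reg, so it only needs
     moving when the padding puts it at the other end.  */
  if (reg_padding != shifted_case)
    return 0;

  return (bytelen - (ssize - bytepos)) * bits_per_unit;
}
```

For a 3 byte struct in an 8 byte register, a big-endian target padding
upward gets a shift of (8 - 3) * 8 bits, while the same target padding
downward gets none.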

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

diff -urp gcc2/gcc/expr.h gcc3/gcc/expr.h
--- gcc2/gcc/expr.h	2003-04-24 15:31:56.000000000 +0930
+++ gcc3/gcc/expr.h	2003-04-30 14:37:09.000000000 +0930
@@ -93,6 +93,8 @@ struct locate_and_pad_arg_data
   /* The amount that the stack pointer needs to be adjusted to
      force alignment for the next argument.  */
   struct args_size alignment_pad;
+  /* Which way we should pad this arg.  */
+  enum direction where_pad;
 };
 #endif
 
@@ -416,19 +418,21 @@ extern void move_block_from_reg PARAMS (
 /* Generate a non-consecutive group of registers represented by a PARALLEL.  */
 extern rtx gen_group_rtx PARAMS ((rtx));
 
+#ifdef TREE_CODE
 /* Load a BLKmode value into non-consecutive registers represented by a
    PARALLEL.  */
-extern void emit_group_load PARAMS ((rtx, rtx, int));
+extern void emit_group_load PARAMS ((rtx, rtx, tree, int));
+#endif
 
 /* Move a non-consecutive group of registers represented by a PARALLEL into
    a non-consecutive group of registers represented by a PARALLEL.  */
 extern void emit_group_move PARAMS ((rtx, rtx));
 
+#ifdef TREE_CODE
 /* Store a BLKmode value from non-consecutive registers represented by a
    PARALLEL.  */
-extern void emit_group_store PARAMS ((rtx, rtx, int));
+extern void emit_group_store PARAMS ((rtx, rtx, tree, int));
 
-#ifdef TREE_CODE
 /* Copy BLKmode object from a set of registers.  */
 extern rtx copy_blkmode_from_reg PARAMS ((rtx, rtx, tree));
 #endif
diff -urp gcc2/gcc/expr.c gcc3/gcc/expr.c
--- gcc2/gcc/expr.c	2003-04-24 14:46:04.000000000 +0930
+++ gcc3/gcc/expr.c	2003-04-30 14:33:24.000000000 +0930
@@ -2212,19 +2212,15 @@ gen_group_rtx (orig)
   return gen_rtx_PARALLEL (GET_MODE (orig), gen_rtvec_v (length, tmps));
 }
 
-/* Emit code to move a block SRC to a block DST, where DST is non-consecutive
-   registers represented by a PARALLEL.  SSIZE represents the total size of
-   block SRC in bytes, or -1 if not known.  */
-/* ??? If SSIZE % UNITS_PER_WORD != 0, we make the blatant assumption that
-   the balance will be in what would be the low-order memory addresses, i.e.
-   left justified for big endian, right justified for little endian.  This
-   happens to be true for the targets currently using this support.  If this
-   ever changes, a new target macro along the lines of FUNCTION_ARG_PADDING
-   would be needed.  */
+/* Emit code to move a block ORIG_SRC of type TYPE to a block DST,
+   where DST is non-consecutive registers represented by a PARALLEL.
+   SSIZE represents the total size of block ORIG_SRC in bytes, or -1
+   if not known.  */ 
 
 void
-emit_group_load (dst, orig_src, ssize)
+emit_group_load (dst, orig_src, type, ssize)
      rtx dst, orig_src;
+     tree type ATTRIBUTE_UNUSED;
      int ssize;
 {
   rtx *tmps, src;
@@ -2253,7 +2249,17 @@ emit_group_load (dst, orig_src, ssize)
       /* Handle trailing fragments that run over the size of the struct.  */
       if (ssize >= 0 && bytepos + (HOST_WIDE_INT) bytelen > ssize)
 	{
-	  shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
+	  /* Arrange to shift the fragment to where it belongs.
+	     extract_bit_field loads to the lsb of the reg.  */
+	  if (
+#ifdef BLOCK_REG_PADDING
+	      BLOCK_REG_PADDING (GET_MODE (orig_src), type, i == start)
+	      == (BYTES_BIG_ENDIAN ? upward : downward)
+#else
+	      BYTES_BIG_ENDIAN
+#endif
+	      )
+	    shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
 	  bytelen = ssize - bytepos;
 	  if (bytelen <= 0)
 	    abort ();
@@ -2278,7 +2284,8 @@ emit_group_load (dst, orig_src, ssize)
 
       /* Optimize the access just a bit.  */
       if (GET_CODE (src) == MEM
-	  && MEM_ALIGN (src) >= GET_MODE_ALIGNMENT (mode)
+	  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (src))
+	      || MEM_ALIGN (src) >= GET_MODE_ALIGNMENT (mode))
 	  && bytepos * BITS_PER_UNIT % GET_MODE_ALIGNMENT (mode) == 0
 	  && bytelen == GET_MODE_SIZE (mode))
 	{
@@ -2321,7 +2328,7 @@ emit_group_load (dst, orig_src, ssize)
 				     bytepos * BITS_PER_UNIT, 1, NULL_RTX,
 				     mode, mode, ssize);
 
-      if (BYTES_BIG_ENDIAN && shift)
+      if (shift)
 	expand_binop (mode, ashl_optab, tmps[i], GEN_INT (shift),
 		      tmps[i], 0, OPTAB_WIDEN);
     }
@@ -2353,13 +2360,16 @@ emit_group_move (dst, src)
 		    XEXP (XVECEXP (src, 0, i), 0));
 }
 
-/* Emit code to move a block SRC to a block DST, where SRC is non-consecutive
-   registers represented by a PARALLEL.  SSIZE represents the total size of
-   block DST, or -1 if not known.  */
+/* Emit code to move a block SRC to a block ORIG_DST of type TYPE,
+   where SRC is non-consecutive registers represented by a PARALLEL.
+   SSIZE represents the total size of block ORIG_DST, or -1 if not
+   known.  */
 
 void
-emit_group_store (orig_dst, src, ssize)
-     rtx orig_dst, src;
+emit_group_store (orig_dst, src, type, ssize)
+     rtx orig_dst;
+     rtx src;
+     tree type ATTRIBUTE_UNUSED;
      int ssize;
 {
   rtx *tmps, dst;
@@ -2404,8 +2414,8 @@ emit_group_store (orig_dst, src, ssize)
 	 the temporary.  */
 
       temp = assign_stack_temp (GET_MODE (dst), ssize, 0);
-      emit_group_store (temp, src, ssize);
-      emit_group_load (dst, temp, ssize);
+      emit_group_store (temp, src, type, ssize);
+      emit_group_load (dst, temp, type, ssize);
       return;
     }
   else if (GET_CODE (dst) != MEM && GET_CODE (dst) != CONCAT)
@@ -2426,7 +2436,16 @@ emit_group_store (orig_dst, src, ssize)
       /* Handle trailing fragments that run over the size of the struct.  */
       if (ssize >= 0 && bytepos + (HOST_WIDE_INT) bytelen > ssize)
 	{
-	  if (BYTES_BIG_ENDIAN)
+	  /* store_bit_field always takes its value from the lsb.
+	     Move the fragment to the lsb if it's not already there.  */
+	  if (
+#ifdef BLOCK_REG_PADDING
+	      BLOCK_REG_PADDING (GET_MODE (orig_dst), type, i == start)
+	      == (BYTES_BIG_ENDIAN ? upward : downward)
+#else
+	      BYTES_BIG_ENDIAN
+#endif
+	      )
 	    {
 	      int shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
 	      expand_binop (mode, ashr_optab, tmps[i], GEN_INT (shift),
@@ -2459,7 +2478,8 @@ emit_group_store (orig_dst, src, ssize)
 
       /* Optimize the access just a bit.  */
       if (GET_CODE (dest) == MEM
-	  && MEM_ALIGN (dest) >= GET_MODE_ALIGNMENT (mode)
+	  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (dest))
+	      || MEM_ALIGN (dest) >= GET_MODE_ALIGNMENT (mode))
 	  && bytepos * BITS_PER_UNIT % GET_MODE_ALIGNMENT (mode) == 0
 	  && bytelen == GET_MODE_SIZE (mode))
 	emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
@@ -3991,7 +4011,7 @@ emit_push_insn (x, mode, type, size, ali
       /* Handle calls that pass values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       if (GET_CODE (reg) == PARALLEL)
-	emit_group_load (reg, x, -1);  /* ??? size? */
+	emit_group_load (reg, x, type, -1);
       else
 	move_block_to_reg (REGNO (reg), x, partial, mode);
     }
@@ -4194,7 +4214,8 @@ expand_assignment (to, from, want_value,
       /* Handle calls that return values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       if (GET_CODE (to_rtx) == PARALLEL)
-	emit_group_load (to_rtx, value, int_size_in_bytes (TREE_TYPE (from)));
+	emit_group_load (to_rtx, value, TREE_TYPE (from),
+			 int_size_in_bytes (TREE_TYPE (from)));
       else if (GET_MODE (to_rtx) == BLKmode)
 	emit_block_move (to_rtx, value, expr_size (from), BLOCK_OP_NORMAL);
       else
@@ -4228,7 +4249,8 @@ expand_assignment (to, from, want_value,
       temp = expand_expr (from, 0, GET_MODE (to_rtx), 0);
 
       if (GET_CODE (to_rtx) == PARALLEL)
-	emit_group_load (to_rtx, temp, int_size_in_bytes (TREE_TYPE (from)));
+	emit_group_load (to_rtx, temp, TREE_TYPE (from),
+			 int_size_in_bytes (TREE_TYPE (from)));
       else
 	emit_move_insn (to_rtx, temp);
 
@@ -4641,7 +4663,8 @@ store_expr (exp, target, want_value)
       /* Handle calls that return values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       else if (GET_CODE (target) == PARALLEL)
-	emit_group_load (target, temp, int_size_in_bytes (TREE_TYPE (exp)));
+	emit_group_load (target, temp, TREE_TYPE (exp),
+			 int_size_in_bytes (TREE_TYPE (exp)));
       else if (GET_MODE (temp) == BLKmode)
 	emit_block_move (target, temp, expr_size (exp),
 			 (want_value & 2
@@ -9182,7 +9205,7 @@ expand_expr (exp, target, tmode, modifie
 		    /* Handle calls that pass values in multiple
 		       non-contiguous locations.  The Irix 6 ABI has examples
 		       of this.  */
-		    emit_group_store (memloc, op0,
+		    emit_group_store (memloc, op0, inner_type,
 				      int_size_in_bytes (inner_type));
 		  else
 		    emit_move_insn (memloc, op0);
diff -urp gcc2/gcc/calls.c gcc3/gcc/calls.c
--- gcc2/gcc/calls.c	2003-04-27 19:07:19.000000000 +0930
+++ gcc3/gcc/calls.c	2003-04-30 14:33:24.000000000 +0930
@@ -27,6 +27,7 @@ Software Foundation, 59 Temple Place - S
 #include "tree.h"
 #include "flags.h"
 #include "expr.h"
+#include "optabs.h"
 #include "libfuncs.h"
 #include "function.h"
 #include "regs.h"
@@ -1015,22 +1016,27 @@ store_unaligned_arguments_into_pseudos (
 	    < (unsigned int) MIN (BIGGEST_ALIGNMENT, BITS_PER_WORD)))
       {
 	int bytes = int_size_in_bytes (TREE_TYPE (args[i].tree_value));
-	int big_endian_correction = 0;
-
-	args[i].n_aligned_regs
-	  = args[i].partial ? args[i].partial
-	    : (bytes + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD;
+	int nregs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+	int endian_correction = 0;
 
+	args[i].n_aligned_regs = args[i].partial ? args[i].partial : nregs;
 	args[i].aligned_regs = (rtx *) xmalloc (sizeof (rtx)
 						* args[i].n_aligned_regs);
 
-	/* Structures smaller than a word are aligned to the least
-	   significant byte (to the right).  On a BYTES_BIG_ENDIAN machine,
+	/* Structures smaller than a word are normally aligned to the
+	   least significant byte.  On a BYTES_BIG_ENDIAN machine,
 	   this means we must skip the empty high order bytes when
 	   calculating the bit offset.  */
-	if (BYTES_BIG_ENDIAN
-	    && bytes < UNITS_PER_WORD)
-	  big_endian_correction = (BITS_PER_WORD  - (bytes * BITS_PER_UNIT));
+	if (bytes < UNITS_PER_WORD
+#ifdef BLOCK_REG_PADDING
+	    && (BLOCK_REG_PADDING (args[i].mode,
+				   TREE_TYPE (args[i].tree_value), 1)
+		== downward)
+#else
+	    && BYTES_BIG_ENDIAN
+#endif
+	    )
+	  endian_correction = BITS_PER_WORD - bytes * BITS_PER_UNIT;
 
 	for (j = 0; j < args[i].n_aligned_regs; j++)
 	  {
@@ -1039,6 +1045,8 @@ store_unaligned_arguments_into_pseudos (
 	    int bitsize = MIN (bytes * BITS_PER_UNIT, BITS_PER_WORD);
 
 	    args[i].aligned_regs[j] = reg;
+	    word = extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
+				      word_mode, word_mode, BITS_PER_WORD);
 
 	    /* There is no need to restrict this code to loading items
 	       in TYPE_ALIGN sized hunks.  The bitfield instructions can
@@ -1054,11 +1062,8 @@ store_unaligned_arguments_into_pseudos (
 	    emit_move_insn (reg, const0_rtx);
 
 	    bytes -= bitsize / BITS_PER_UNIT;
-	    store_bit_field (reg, bitsize, big_endian_correction, word_mode,
-			     extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
-						word_mode, word_mode,
-						BITS_PER_WORD),
-			     BITS_PER_WORD);
+	    store_bit_field (reg, bitsize, endian_correction, word_mode,
+			     word, BITS_PER_WORD);
 	  }
       }
 }
@@ -1689,34 +1694,48 @@ load_register_parameters (args, num_actu
     {
       rtx reg = ((flags & ECF_SIBCALL)
 		 ? args[i].tail_call_reg : args[i].reg);
-      int partial = args[i].partial;
-      int nregs;
-
       if (reg)
 	{
+	  int partial = args[i].partial;
+	  int nregs;
+	  int size = 0;
 	  rtx before_arg = get_last_insn ();
 	  /* Set to non-negative if must move a word at a time, even if just
 	     one word (e.g, partial == 1 && mode == DFmode).  Set to -1 if
 	     we just use a normal move insn.  This value can be zero if the
 	     argument is a zero size structure with no fields.  */
-	  nregs = (partial ? partial
-		   : (TYPE_MODE (TREE_TYPE (args[i].tree_value)) == BLKmode
-		      ? ((int_size_in_bytes (TREE_TYPE (args[i].tree_value))
-			  + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD)
-		      : -1));
+	  nregs = -1;
+	  if (partial)
+	    nregs = partial;
+	  else if (TYPE_MODE (TREE_TYPE (args[i].tree_value)) == BLKmode)
+	    {
+	      size = int_size_in_bytes (TREE_TYPE (args[i].tree_value));
+	      nregs = (size + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD;
+	    }
+	  else
+	    size = GET_MODE_SIZE (args[i].mode);
 
 	  /* Handle calls that pass values in multiple non-contiguous
 	     locations.  The Irix 6 ABI has examples of this.  */
 
 	  if (GET_CODE (reg) == PARALLEL)
-	    emit_group_load (reg, args[i].value,
-			     int_size_in_bytes (TREE_TYPE (args[i].tree_value)));
+	    {
+	      tree type = TREE_TYPE (args[i].tree_value);
+	      emit_group_load (reg, args[i].value, type,
+			       int_size_in_bytes (type));
+	    }
 
 	  /* If simple case, just do move.  If normal partial, store_one_arg
 	     has already loaded the register for us.  In all other cases,
 	     load the register(s) from memory.  */
 
-	  else if (nregs == -1)
+	  else if (nregs == -1
+#ifdef BLOCK_REG_PADDING
+		   && !(size < UNITS_PER_WORD
+			&& (args[i].locate.where_pad
+			    == (BYTES_BIG_ENDIAN ? upward : downward)))
+#endif
+		   )
 	    emit_move_insn (reg, args[i].value);
 
 	  /* If we have pre-computed the values to put in the registers in
@@ -1728,9 +1747,44 @@ load_register_parameters (args, num_actu
 			      args[i].aligned_regs[j]);
 
 	  else if (partial == 0 || args[i].pass_on_stack)
-	    move_block_to_reg (REGNO (reg),
-			       validize_mem (args[i].value), nregs,
-			       args[i].mode);
+	    {
+	      rtx mem = validize_mem (args[i].value);
+
+#ifdef BLOCK_REG_PADDING
+	      /* Handle case where we have a value that needs shifting
+		 up to the msb.  eg. a QImode value and we're padding
+		 upward on a BYTES_BIG_ENDIAN machine.  */
+	      if (nregs == -1)
+		{
+		  rtx ri = gen_rtx_REG (word_mode, REGNO (reg));
+		  rtx x;
+		  int shift = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+		  x = expand_binop (word_mode, ashl_optab, mem,
+				    GEN_INT (shift), ri, 1, OPTAB_WIDEN);
+		  if (x != ri)
+		    emit_move_insn (ri, x);
+		}
+
+	      /* Handle a BLKmode that needs shifting.  */
+	      else if (nregs == 1 && size < UNITS_PER_WORD
+		       && args[i].locate.where_pad == downward)
+		{
+		  rtx tem = operand_subword_force (mem, 0, args[i].mode);
+		  rtx ri = gen_rtx_REG (word_mode, REGNO (reg));
+		  rtx x = gen_reg_rtx (word_mode);
+		  int shift = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+		  optab dir = BYTES_BIG_ENDIAN ? lshr_optab : ashl_optab;
+
+		  emit_move_insn (x, tem);
+		  x = expand_binop (word_mode, dir, x, GEN_INT (shift),
+				    ri, 1, OPTAB_WIDEN);
+		  if (x != ri)
+		    emit_move_insn (ri, x);
+		}
+	      else
+#endif
+		move_block_to_reg (REGNO (reg), mem, nregs, args[i].mode);
+	    }
 
 	  /* When a parameter is a block, and perhaps in other cases, it is
 	     possible that it did a load from an argument slot that was
@@ -3225,7 +3279,7 @@ expand_call (exp, target, ignore)
 	    }
 
 	  if (! rtx_equal_p (target, valreg))
-	    emit_group_store (target, valreg,
+	    emit_group_store (target, valreg, TREE_TYPE (exp),
 			      int_size_in_bytes (TREE_TYPE (exp)));
 
 	  /* We can not support sibling calls for this case.  */
@@ -3976,7 +4030,7 @@ emit_library_call_value_1 (retval, orgfu
       /* Handle calls that pass values in multiple non-contiguous
 	 locations.  The PA64 has examples of this for library calls.  */
       if (reg != 0 && GET_CODE (reg) == PARALLEL)
-	emit_group_load (reg, val, GET_MODE_SIZE (GET_MODE (val)));
+	emit_group_load (reg, val, NULL_TREE, GET_MODE_SIZE (GET_MODE (val)));
       else if (reg != 0 && partial == 0)
 	emit_move_insn (reg, val);
 
@@ -4080,7 +4134,7 @@ emit_library_call_value_1 (retval, orgfu
 	  if (GET_CODE (valreg) == PARALLEL)
 	    {
 	      temp = gen_reg_rtx (outmode);
-	      emit_group_store (temp, valreg, outmode);
+	      emit_group_store (temp, valreg, NULL_TREE, outmode);
 	      valreg = temp;
 	    }
 
@@ -4123,7 +4177,7 @@ emit_library_call_value_1 (retval, orgfu
 	{
 	  if (value == 0)
 	    value = gen_reg_rtx (outmode);
-	  emit_group_store (value, valreg, outmode);
+	  emit_group_store (value, valreg, NULL_TREE, outmode);
 	}
       else if (value != 0)
 	emit_move_insn (value, valreg);
diff -urp gcc2/gcc/function.c gcc3/gcc/function.c
--- gcc2/gcc/function.c	2003-04-24 17:52:45.000000000 +0930
+++ gcc3/gcc/function.c	2003-04-30 14:33:27.000000000 +0930
@@ -4623,6 +4623,8 @@ assign_parms (fndecl)
 						  offset_rtx));
 
 	set_mem_attributes (stack_parm, parm, 1);
+	if (entry_parm && MEM_ATTRS (stack_parm)->align < PARM_BOUNDARY)
+	  set_mem_align (stack_parm, PARM_BOUNDARY);
 
 	/* Set also REG_ATTRS if parameter was passed in a register.  */
 	if (entry_parm)
@@ -4654,6 +4656,7 @@ assign_parms (fndecl)
 	     locations.  The Irix 6 ABI has examples of this.  */
 	  if (GET_CODE (entry_parm) == PARALLEL)
 	    emit_group_store (validize_mem (stack_parm), entry_parm,
+			      TREE_TYPE (parm),
 			      int_size_in_bytes (TREE_TYPE (parm)));
 
 	  else
@@ -4757,7 +4760,12 @@ assign_parms (fndecl)
 
 	 Set DECL_RTL to that place.  */
 
-      if (nominal_mode == BLKmode || GET_CODE (entry_parm) == PARALLEL)
+      if (nominal_mode == BLKmode
+#ifdef BLOCK_REG_PADDING
+	  || (locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
+	      && GET_MODE_SIZE (promoted_mode) < UNITS_PER_WORD)
+#endif
+	  || GET_CODE (entry_parm) == PARALLEL)
 	{
 	  /* If a BLKmode arrives in registers, copy it to a stack slot.
 	     Handle calls that pass values in multiple non-contiguous
@@ -4793,7 +4801,7 @@ assign_parms (fndecl)
 	      /* Handle calls that pass values in multiple non-contiguous
 		 locations.  The Irix 6 ABI has examples of this.  */
 	      if (GET_CODE (entry_parm) == PARALLEL)
-		emit_group_store (mem, entry_parm, size);
+		emit_group_store (mem, entry_parm, TREE_TYPE (parm), size);
 
 	      /* If SIZE is that of a mode no bigger than a word, just use
 		 that mode's store operation.  */
@@ -4802,7 +4810,13 @@ assign_parms (fndecl)
 		  enum machine_mode mode
 		    = mode_for_size (size * BITS_PER_UNIT, MODE_INT, 0);
 
-		  if (mode != BLKmode)
+		  if (mode != BLKmode
+#ifdef BLOCK_REG_PADDING
+		      && (size == UNITS_PER_WORD
+			  || (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
+			      != (BYTES_BIG_ENDIAN ? upward : downward)))
+#endif
+		      )
 		    {
 		      rtx reg = gen_rtx_REG (mode, REGNO (entry_parm));
 		      emit_move_insn (change_address (mem, mode, 0), reg);
@@ -4813,7 +4827,13 @@ assign_parms (fndecl)
 		     to memory.  Note that the previous test doesn't
 		     handle all cases (e.g. SIZE == 3).  */
 		  else if (size != UNITS_PER_WORD
-			   && BYTES_BIG_ENDIAN)
+#ifdef BLOCK_REG_PADDING
+			   && (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
+			       == downward)
+#else
+			   && BYTES_BIG_ENDIAN
+#endif
+			   )
 		    {
 		      rtx tem, x;
 		      int by = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
@@ -5411,6 +5431,7 @@ locate_and_pad_parm (passed_mode, type, 
     = type ? size_in_bytes (type) : size_int (GET_MODE_SIZE (passed_mode));
   where_pad = FUNCTION_ARG_PADDING (passed_mode, type);
   boundary = FUNCTION_ARG_BOUNDARY (passed_mode, type);
+  locate->where_pad = where_pad;
 
 #ifdef ARGS_GROW_DOWNWARD
   locate->slot_offset.constant = -initial_offset_ptr->constant;
@@ -7142,6 +7163,7 @@ expand_function_end (filename, line, end
 		emit_group_move (real_decl_rtl, decl_rtl);
 	      else
 		emit_group_load (real_decl_rtl, decl_rtl,
+				 TREE_TYPE (decl_result),
 				 int_size_in_bytes (TREE_TYPE (decl_result)));
 	    }
 	  else
diff -urp gcc2/gcc/stmt.c gcc3/gcc/stmt.c
--- gcc2/gcc/stmt.c	2003-04-22 18:55:34.000000000 +0930
+++ gcc3/gcc/stmt.c	2003-04-24 21:07:18.000000000 +0930
@@ -3008,7 +3008,7 @@ expand_value_return (val)
 	val = convert_modes (mode, old_mode, val, unsignedp);
 #endif
       if (GET_CODE (return_reg) == PARALLEL)
-	emit_group_load (return_reg, val, int_size_in_bytes (type));
+	emit_group_load (return_reg, val, type, int_size_in_bytes (type));
       else
 	emit_move_insn (return_reg, val);
     }
diff -urp gcc2/gcc/Makefile.in gcc3/gcc/Makefile.in
--- gcc2/gcc/Makefile.in	2003-04-24 15:47:39.000000000 +0930
+++ gcc3/gcc/Makefile.in	2003-04-24 21:14:33.000000000 +0930
@@ -1531,7 +1531,7 @@ builtins.o : builtins.c $(CONFIG_H) $(SY
    $(RECOG_H) output.h typeclass.h hard-reg-set.h toplev.h hard-reg-set.h \
    except.h $(TM_P_H) $(PREDICT_H) libfuncs.h real.h langhooks.h
 calls.o : calls.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) flags.h \
-   $(EXPR_H) langhooks.h $(TARGET_H) \
+   $(EXPR_H) $(OPTABS_H) langhooks.h $(TARGET_H) \
    libfuncs.h $(REGS_H) toplev.h output.h function.h $(TIMEVAR_H) $(TM_P_H) cgraph.h except.h
 expmed.o : expmed.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
    flags.h insn-config.h $(EXPR_H) $(OPTABS_H) $(RECOG_H) real.h \
diff -urp gcc2/gcc/config/rs6000/linux64.h gcc3/gcc/config/rs6000/linux64.h
--- gcc2/gcc/config/rs6000/linux64.h	2003-04-22 18:55:34.000000000 +0930
+++ gcc3/gcc/config/rs6000/linux64.h	2003-04-30 14:37:08.000000000 +0930
@@ -49,6 +49,10 @@
 #undef	TARGET_64BIT
 #define	TARGET_64BIT		1
 
+/* And we're always big-endian.  */
+#undef	TARGET_BIG_ENDIAN
+#define TARGET_BIG_ENDIAN 1
+
 /* 64-bit PowerPC Linux always has a TOC.  */
 #undef  TARGET_NO_TOC
 #define TARGET_NO_TOC		0
@@ -143,8 +147,33 @@
 #undef  JUMP_TABLES_IN_TEXT_SECTION
 #define JUMP_TABLES_IN_TEXT_SECTION 1
 
-/* 64-bit PowerPC Linux always has GPR13 fixed.  */
-#define FIXED_R13		1
+/* The linux ppc64 ABI isn't explicit on whether aggregates smaller
+   than a doubleword should be padded upward or downward.  You could
+   reasonably assume that they follow the normal rules for structure
+   layout treating the parameter area as any other block of memory,
+   then map the reg param area to registers.  ie. pad upward.
+   Setting both of the following defines results in this behaviour.
+   Setting just the first one will result in aggregates that fit in a
+   doubleword being padded downward, and others being padded upward.
+   Not a bad idea as this results in struct { int x; } being passed
+   the same way as an int.  */
+#define AGGREGATE_PADDING_FIXED 1
+/* #define AGGREGATES_PAD_UPWARD_ALWAYS 1 */
+
+/* We don't want anything in the reg parm area being passed on the
+   stack.  */
+#define MUST_PASS_IN_STACK(MODE, TYPE)				\
+  ((TYPE) != 0							\
+   && (TREE_CODE (TYPE_SIZE (TYPE)) != INTEGER_CST		\
+       || TREE_ADDRESSABLE (TYPE)))
+
+/* Specify padding for the last element of a block move between
+   registers and memory.  FIRST is non-zero if this is the only
+   element.  */
+#ifndef BLOCK_REG_PADDING
+#define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
+  (!(FIRST) ? upward : FUNCTION_ARG_PADDING (MODE, TYPE))
+#endif
 
 /* __throw will restore its own return address to be the same as the
    return address of the function that the throw is being made to.
diff -urp gcc2/gcc/config/rs6000/rs6000.h gcc3/gcc/config/rs6000/rs6000.h
--- gcc2/gcc/config/rs6000/rs6000.h	2003-04-23 11:29:04.000000000 +0930
+++ gcc3/gcc/config/rs6000/rs6000.h	2003-04-24 21:07:17.000000000 +0930
@@ -1701,7 +1701,6 @@ typedef struct rs6000_args
   int fregno;			/* next available FP register */
   int vregno;			/* next available AltiVec register */
   int nargs_prototype;		/* # args left in the current prototype */
-  int orig_nargs;		/* Original value of nargs_prototype */
   int prototype;		/* Whether a prototype was defined */
   int call_cookie;		/* Do special things for this call */
   int sysv_gregno;		/* next available GP register */
@@ -1832,13 +1831,8 @@ typedef struct rs6000_args
 #define EXPAND_BUILTIN_VA_ARG(valist, type) \
   rs6000_va_arg (valist, type)
 
-/* For AIX, the rule is that structures are passed left-aligned in
-   their stack slot.  However, GCC does not presently do this:
-   structures which are the same size as integer types are passed
-   right-aligned, as if they were in fact integers.  This only
-   matters for structures of size 1 or 2, or 4 when TARGET_64BIT.
-   ABI_V4 does not use std_expand_builtin_va_arg.  */
-#define PAD_VARARGS_DOWN (TYPE_MODE (type) != BLKmode)
+#define PAD_VARARGS_DOWN \
+   (FUNCTION_ARG_PADDING (TYPE_MODE (type), type) == downward)
 
 /* Define this macro to be a nonzero value if the location where a function
    argument is passed depends on whether or not it is a named argument.  */
diff -urp gcc2/gcc/config/rs6000/rs6000.c gcc3/gcc/config/rs6000/rs6000.c
--- gcc2/gcc/config/rs6000/rs6000.c	2003-04-24 14:46:04.000000000 +0930
+++ gcc3/gcc/config/rs6000/rs6000.c	2003-04-24 21:43:35.000000000 +0930
@@ -3132,8 +3132,6 @@ init_cumulative_args (cum, fntype, libna
   else
     cum->nargs_prototype = 0;
 
-  cum->orig_nargs = cum->nargs_prototype;
-
   /* Check for a longcall attribute.  */
   if (fntype
       && lookup_attribute ("longcall", TYPE_ATTRIBUTES (fntype))
@@ -3172,8 +3170,37 @@ function_arg_padding (mode, type)
      enum machine_mode mode;
      tree type;
 {
+#if !AGGREGATE_PADDING_FIXED
+  /* GCC used to pass structures of the same size as integer types as
+     if they were in fact integers, ignoring FUNCTION_ARG_PADDING.
+     ie. Structures of size 1 or 2 (or 4 when TARGET_64BIT) were
+     passed padded downward, except that -mstrict-align further
+     muddied the water in that multi-component structures of 2 and 4
+     bytes in size were passed padded upward.
+
+     The following arranges for best compatibility with previous
+     versions of gcc, but removes the -mstrict-align dependency.  */
+  if (BYTES_BIG_ENDIAN)
+    {
+      HOST_WIDE_INT size = 0;
+
+      if (mode == BLKmode)
+	{
+	  if (type && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
+	    size = int_size_in_bytes (type);
+	}
+      else
+	size = GET_MODE_SIZE (mode);
+
+      if (size == 1 || size == 2 || size == 4)
+	return downward;
+    }
+  return upward;
+#else
+#if AGGREGATES_PAD_UPWARD_ALWAYS
   if (type != 0 && AGGREGATE_TYPE_P (type))
     return upward;
+#endif
 
   /* This is the default definition.  */
   return (! BYTES_BIG_ENDIAN
@@ -3183,6 +3210,7 @@ function_arg_padding (mode, type)
                  && int_size_in_bytes (type) < (PARM_BOUNDARY / BITS_PER_UNIT))
               : GET_MODE_BITSIZE (mode) < PARM_BOUNDARY)
              ? downward : upward));
+#endif
 }
 
 /* If defined, a C expression that gives the alignment boundary, in bits,

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 1 of 3
  2003-04-24 15:34 function parms in regs, patch 1 of 3 Alan Modra
  2003-04-25 22:44 ` Richard Henderson
@ 2003-05-02  5:06 ` Jim Wilson
  2003-05-02  5:20   ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: Jim Wilson @ 2003-05-02  5:06 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

I looked through this patch, and didn't see any problems.  It looks
pretty straightforward.  I am willing to approve it if Richard doesn't
object.

I have the same concerns as Richard though.  The function calling code 
is complex and fragile.  Problems here can result in silent ABI changes 
that are difficult to detect, particularly since we have no established 
testsuite for ABI compatibility.  You should be prepared to revert the 
patch if problems arise.

Jim

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 1 of 3
  2003-05-02  5:06 ` function parms in regs, patch 1 " Jim Wilson
@ 2003-05-02  5:20   ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2003-05-02  5:20 UTC (permalink / raw)
  To: Jim Wilson; +Cc: Alan Modra, gcc-patches

On Thu, May 01, 2003 at 10:07:01PM -0400, Jim Wilson wrote:
> I looked through this patch, and didn't see any problems.  It looks
> pretty straightforward.  I am willing to approve it if Richard doesn't
> object.

I don't object.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-04-30 13:29     ` function parms in regs, patch 3 " Alan Modra
@ 2003-05-02  6:05       ` Jim Wilson
  2003-05-02 12:38         ` Alan Modra
  2003-07-10  6:55       ` Jim Wilson
  1 sibling, 1 reply; 875+ messages in thread
From: Jim Wilson @ 2003-05-02  6:05 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

This is the patch I really wanted to comment on, which is why I am 
willing to review this set of patches.

I believe that we already have so many macros that interact in confusing 
ways that we should avoid adding new ones.  I think a better approach is 
to return more info from FUNCTION_ARG, in a more structured form, so 
that we can get away from the ever expanding set of macros that we have.
The PARALLEL stuff is a step in this direction.  The structure of the 
info is ugly, but that can always be improved.  I'd like to see the 
entire interface rewritten, but it isn't clear if anyone will ever have 
the time for it, and we could only do this in a .0 release.

Your revised patch which only defines BLOCK_REG_PADDING for rs6000 is a 
major improvement.  I think there is too much risk in trying to define a 
default for all targets.

There is a known IA-64 ABI problem that your patch doesn't help with. 
The IA-64 ABI treats homogeneous floating-point aggregates (HFAs) 
specially.  A structure all of whose fields are the same FP type is an 
HFA for instance.  An HFA is passed in FP regs, one field per FP reg, 
until the FP regs are full, and then we pass the rest in integer regs.
The part passed in integer regs is placed in the same place it would 
have gone if the entire structure was passed in integer regs.  Suppose 
we have a structure of 5 floats, and there are 3 FP registers left.  The 
first 3 fields are passed in FP registers.  The fourth one is passed 
left-justified (little-endian) in the second integer register, because 
it is the second half of the second word.  The fifth one is passed 
right-justified (little-endian) in the third integer register, because 
it is the first half of the third word.  Now the problem here is that a 
float field in an integer register can require either left or right 
justification depending on the context.  I don't think your 
BLOCK_REG_PADDING macro can handle this.  It doesn't have enough info. 
FUNCTION_ARG does have enough info.  If the PARALLEL stuff had 3 items 
location/size/alignment instead of the current 2 location/size, then I 
believe we could fix this problem, your powerpc problem, and probably 
also some ABI problems in other ports that we may or may not know about.
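
The placement rule for the integer-register part of an HFA can be
sketched in C.  This is only an illustration of the arithmetic in the
paragraph above, not GCC code: it assumes 8-byte words, 4-byte floats,
little-endian byte order, and a structure that starts at the first
integer slot.  The names hfa_slot and hfa_int_slot are hypothetical.

```c
#define WORD_BYTES 8	/* assumed size of an integer register */
#define FLOAT_BYTES 4	/* assumed size of each HFA field */

/* Hypothetical description of where a field lands.  */
struct hfa_slot
{
  int word;	/* zero-based integer slot index */
  int shift;	/* bit position inside that slot, little-endian */
};

/* Return the slot a float field would occupy if the whole structure
   were passed in integer registers.  FIELD is zero-based.  */
struct hfa_slot
hfa_int_slot (int field)
{
  int offset = field * FLOAT_BYTES;
  struct hfa_slot s;

  s.word = offset / WORD_BYTES;
  s.shift = (offset % WORD_BYTES) * 8;
  return s;
}
```

Under these assumptions the fourth field (index 3) lands in slot 1 at
bit 32, the upper half of the second word, and the fifth field
(index 4) lands in slot 2 at bit 0, the lower half of the third word.
The same field type therefore needs a different justification
depending on its offset.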

Having said that, I really should take a closer look at the patch to see 
what it actually does.  And I need to look at the second patch of the set.

Jim

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-05-02  6:05       ` Jim Wilson
@ 2003-05-02 12:38         ` Alan Modra
  2003-05-02 20:23           ` Jim Wilson
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-05-02 12:38 UTC (permalink / raw)
  To: Jim Wilson; +Cc: gcc-patches

On Thu, May 01, 2003 at 11:06:06PM -0400, Jim Wilson wrote:
> FUNCTION_ARG does have enough info.  If the PARALLEL stuff had 3 items 
> location/size/alignment instead of the current 2 location/size, then I 
> believe we could fix this problem, your powerpc problem, and probably 
> also some ABI problems in other ports that we may or may not know about.

I like the idea.  It would be nice if the called function could use
the regs in a parallel without copying to the stack too, at least for
simple cases.  eg. for

	struct it { char c[3]; };
	int foo (struct it x) { return x.c[0]; }

I get
	.foo:
        	sldi 3,3,40
        	std 3,48(1)
        	lbz 3,48(1)
        	blr

when we could dispense with the store to the stack and just have

	.foo:
		rldicl 3,3,48,56
		blr

> Having said that, I really should take a closer look at the patch to see 
> what it actually does.  And I need to look at the second patch of the set.

The second patch doesn't do much besides move code around and a little
tidying.  The last patch is really about honouring FUNCTION_ARG_PADDING
in all cases.  See the comment added to rs6000.c:function_arg_padding.
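
For reference, the compatibility rule described in that comment (the
!AGGREGATE_PADDING_FIXED branch) can be modelled roughly as below.
This is a simplified stand-alone sketch, not the rs6000.c code: the
enum is a local stand-in for GCC's direction type, and
compat_arg_padding takes the byte size directly instead of a mode and
a tree type, with 0 standing for "size not known".

```c
/* Local stand-in for the padding directions used in the patch.  */
enum direction { none, upward, downward };

/* On a big-endian target, arguments of 1, 2 or 4 bytes are padded
   downward, as older GCC versions did for structures the size of an
   integer type.  Everything else is padded upward.  */
enum direction
compat_arg_padding (int bytes_big_endian, long size)
{
  if (bytes_big_endian && (size == 1 || size == 2 || size == 4))
    return downward;
  return upward;
}
```

A 3-byte struct falls through to upward here, while a 2-byte one is
padded downward on a big-endian target.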

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-05-02 12:38         ` Alan Modra
@ 2003-05-02 20:23           ` Jim Wilson
  2003-05-03  1:22             ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Jim Wilson @ 2003-05-02 20:23 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

On Fri, 2003-05-02 at 08:38, Alan Modra wrote:
> I like the idea.  It would be nice if the called function could use
> the regs in a parallel without copying to the stack too, at least for
> simple cases.

In order to do this, we need some support for decomposing structures.
Currently, the only way to access a structure field is via an offset
from the base address for the structure.  This requires that the
structure be in contiguous memory locations.  In some simple cases, we
can handle structures in contiguous registers, but mostly it means they
have to be in memory.  Thus we must force the object into a stack slot
so we can access fields.

If we could give each structure field its own DECL_RTL, then we could
access them in place without forcing the structure to a stack slot.  Or
maybe we can put the PARALLEL into the structure's DECL_RTL, and modify
the access routines to parse it.  I suspect it is simpler to have
separate DECL_RTLs for each field though.

There is also a simpler related problem.  Some ABIs use big-endian
left-justified/little-endian right-justified for arguments smaller than
a word.  The reason is that a simple word-sized word-aligned store can
then be used to put the argument in memory, and it will end up in the
same place in both the big and little endian cases.  Likewise, a simple
word-sized word-aligned load can be used to load the argument into a
register.  Unfortunately, gcc doesn't understand this convention, and
ends up doing bit-field operations to move the argument between regs and
memory.  This often results in a series of shifts, masks, and byte-sized
loads/stores.  I think it is a little embarrassing that we do this. 
This does require that we have a word sized word aligned area to
load/store from/into, but that will often be the case.  Stack slots are
always word sized and aligned for instance.
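
A small C model of that convention, as a sketch under stated
assumptions: 4-byte words, and helper names (store_word, justify_arg,
same_place_in_both) that are invented for the example.  It only
simulates byte order; it does not touch real registers.

```c
#include <stdint.h>
#include <string.h>

#define WORD_BYTES 4	/* assumed word size for the illustration */

/* Simulate a word-sized store of WORD into SLOT in the given byte
   order.  */
void
store_word (uint8_t slot[WORD_BYTES], uint32_t word, int big_endian)
{
  for (int i = 0; i < WORD_BYTES; i++)
    {
      int shift = big_endian ? 8 * (WORD_BYTES - 1 - i) : 8 * i;
      slot[i] = (uint8_t) (word >> shift);
    }
}

/* Build the register image for a SIZE-byte argument: left-justified
   when big-endian, right-justified when little-endian.  BYTES holds
   the argument in memory order.  */
uint32_t
justify_arg (const uint8_t *bytes, int size, int big_endian)
{
  uint32_t reg = 0;

  for (int i = 0; i < size; i++)
    {
      int shift = big_endian ? 8 * (WORD_BYTES - 1 - i) : 8 * i;
      reg |= (uint32_t) bytes[i] << shift;
    }
  return reg;
}

/* Return 1 if a plain word store puts the argument at the start of
   the slot in both byte orders.  */
int
same_place_in_both (const uint8_t *bytes, int size)
{
  uint8_t be[WORD_BYTES], le[WORD_BYTES];

  store_word (be, justify_arg (bytes, size, 1), 1);
  store_word (le, justify_arg (bytes, size, 0), 0);
  return memcmp (be, bytes, size) == 0 && memcmp (le, bytes, size) == 0;
}
```

For a 2-byte argument 0x11 0x22, the big-endian register image is
0x11220000 and the little-endian one is 0x00002211, yet both stores
leave 0x11 0x22 at the lowest addresses of the slot.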

Jim

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-05-02 20:23           ` Jim Wilson
@ 2003-05-03  1:22             ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2003-05-03  1:22 UTC (permalink / raw)
  To: Jim Wilson; +Cc: gcc-patches

On Fri, May 02, 2003 at 01:24:40PM -0400, Jim Wilson wrote:
> There is also a simpler related problem.  Some ABIs use big-endian
> left-justified/little-endian right-justified for arguments smaller than
> a word.  The reason is that a simple word-sized word-aligned store can
> then be used to put the argument in memory, and it will end up in the
> same place in both the big and little endian cases.  Likewise, a simple
> word-sized word-aligned load can be used to load the argument into a
> register.  Unfortunately, gcc doesn't understand this convention, and
> ends up doing bit-field operations to move the argument between regs and
> memory.  This often results in a series of shifts, masks, and byte-sized
> loads/stores.  I think it is a little embarrassing that we do this. 
> This does require that we have a word sized word aligned area to
> load/store from/into, but that will often be the case.  Stack slots are
> always word sized and aligned for instance.

Yes, I noticed.  Actually, it was this that led me to delve into this
code and start tidying a few things.  I'd solved the powerpc64 ABI
problem originally using PARALLELs, and then found that in certain
cases (< word size struct) we could generate better code by not using
a PARALLEL for incoming args.  Not exactly elegant though, so I started
to poke into why a PARALLEL was necessary in the first place.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* powerpc-unknown-linux-gnu bootstrap fix
@ 2003-05-14 16:25 Matt Kraai
  2003-05-14 17:12 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Matt Kraai @ 2003-05-14 16:25 UTC (permalink / raw)
  To: gcc-patches

Howdy,

Bootstrap on my powerpc-unknown-linux-gnu failed with the
following output:

 stage1/xgcc -Bstage1/ -B/home/kraai/dev/gcc/inst/powerpc-unknown-linux-gnu/bin/ -c   -g -O2 -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Werror -fno-common   -DHAVE_CONFIG_H    -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/config -I../../gcc/gcc/../include ../../gcc/gcc/varasm.c -o varasm.o
 ../../gcc/gcc/varasm.c: In function `asm_emit_uninitialised':
 ../../gcc/gcc/varasm.c:1375: warning: comparison between signed and unsigned
 ../../gcc/gcc/varasm.c:1382: warning: comparison between signed and unsigned
 ../../gcc/gcc/varasm.c: In function `assemble_static_space':
 ../../gcc/gcc/varasm.c:1783: warning: comparison between signed and unsigned
 make[2]: *** [varasm.o] Error 1
 make[2]: Leaving directory `/home/kraai/dev/gcc/bld/gcc'
 make[1]: *** [stage2_build] Error 2
 make[1]: Leaving directory `/home/kraai/dev/gcc/bld/gcc'
 make: *** [bootstrap] Error 2

Applying the appended patch allowed it to proceed.  OK?

-- 
Matt Kraai <kraai@alumni.cmu.edu>
Debian GNU/Linux Peon

       * config/rs6000/sysv4.h (ASM_OUTPUT_ALIGNED_LOCAL): Cast
       g_switch_value to unsigned HOST_WIDE_INT.

*** sysv4.h.~1.126.~	Tue May 13 21:43:03 2003
--- sysv4.h	Wed May 14 07:35:47 2003
***************
*** 666,672 ****
  #define	ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)		\
  do {									\
    if (rs6000_sdata != SDATA_NONE && (SIZE) > 0				\
!       && (SIZE) <= g_switch_value)					\
      {									\
        sbss_section ();							\
        ASM_OUTPUT_ALIGN (FILE, exact_log2 (ALIGN / BITS_PER_UNIT));	\
--- 666,672 ----
  #define	ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)		\
  do {									\
    if (rs6000_sdata != SDATA_NONE && (SIZE) > 0				\
!       && (SIZE) <= (unsigned HOST_WIDE_INT)g_switch_value)		\
      {									\
        sbss_section ();							\
        ASM_OUTPUT_ALIGN (FILE, exact_log2 (ALIGN / BITS_PER_UNIT));	\

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-14 16:25 powerpc-unknown-linux-gnu bootstrap fix Matt Kraai
@ 2003-05-14 17:12 ` David Edelsohn
  2003-05-14 17:23   ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-05-14 17:12 UTC (permalink / raw)
  To: Matt Kraai, Richard Henderson; +Cc: gcc-patches

	It seems like this should be a problem on other targets as well,
such as Alpha which has a similar test in elf.h.

	Richard, would it be better to change g_switch_value to unsigned
int or unsigned HOST_WIDE_INT?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-14 17:12 ` David Edelsohn
@ 2003-05-14 17:23   ` Richard Henderson
  2003-05-14 17:26     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2003-05-14 17:23 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Matt Kraai, gcc-patches

On Wed, May 14, 2003 at 01:03:12PM -0400, David Edelsohn wrote:
> 	Richard, would it be better to change g_switch_value to unsigned
> int or unsigned HOST_WIDE_INT?

Yes.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-14 17:23   ` Richard Henderson
@ 2003-05-14 17:26     ` David Edelsohn
  2003-05-14 17:52       ` Richard Henderson
  2003-05-15 23:22       ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2003-05-14 17:26 UTC (permalink / raw)
  To: Richard Henderson, Matt Kraai, gcc-patches

>>>>> Richard Henderson writes:

Richard> On Wed, May 14, 2003 at 01:03:12PM -0400, David Edelsohn wrote:
>> Richard, would it be better to change g_switch_value to unsigned
>> int or unsigned HOST_WIDE_INT?

Richard> Yes.

	I just noticed that this is fairly ugly.  Half the time
g_switch_value is tested against SIZE (unsigned HOST_WIDE_INT) and the
other half against int_size_in_bytes (signed HOST_WIDE_INT).  It's not going
to be easy to reliably rationalize this either way.  Maybe we should just
punt with a cast in sysv4.h -- alpha.c does that already.  Comments?
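
For readers following the thread, the mismatch being discussed can be
reduced to a few lines.  This is only a sketch: uwide_int_t and
fits_small_data are hypothetical stand-ins (the real HOST_WIDE_INT is
chosen when GCC is configured), and the signed threshold mirrors the
pre-patch "int g_switch_value".  The bootstrap log above suggests the
warning is fatal in the later stages, presumably because they are built
with -Werror.

```c
/* Hypothetical stand-in for unsigned HOST_WIDE_INT.  */
typedef unsigned long long uwide_int_t;

/* Mirrors the pre-patch declaration: a signed threshold.  */
static int threshold = 8;

/* Return 1 when an object of SIZE bytes is small enough, else 0.  */
static int
fits_small_data (uwide_int_t size)
{
  /* Writing "size <= threshold" here compares an unsigned value with a
     signed one, which is what -Wsign-compare reports.  The cast makes
     both operands unsigned, as the sysv4.h patch does.  */
  return size > 0 && size <= (uwide_int_t) threshold;
}
```

The cast is safe only as long as the threshold is never negative, which
is one reason making the variable itself unsigned comes up later in the
thread.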

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-14 17:26     ` David Edelsohn
@ 2003-05-14 17:52       ` Richard Henderson
  2003-05-14 18:58         ` David Edelsohn
  2003-05-15 23:22       ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2003-05-14 17:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Matt Kraai, gcc-patches

On Wed, May 14, 2003 at 01:20:36PM -0400, David Edelsohn wrote:
> 	I just noticed that this is fairly ugly.  Half the time
> g_switch_value is tested against SIZE (unsigned HOST_WIDE_INT) and the
> other half against int_size_in_bytes (signed HOST_WIDE_INT).

Blah.  I guess casts are the least invasive change.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-14 17:52       ` Richard Henderson
@ 2003-05-14 18:58         ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-05-14 18:58 UTC (permalink / raw)
  To: Richard Henderson, Matt Kraai, gcc-patches

>>>>> Richard Henderson writes:

Richard> Blah.  I guess casts are the least invasive change.

	Okay, Matt, just check in your proposed fix with the cast.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-14 17:26     ` David Edelsohn
  2003-05-14 17:52       ` Richard Henderson
@ 2003-05-15 23:22       ` Geoff Keating
  2003-05-18 23:17         ` Matt Kraai
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2003-05-15 23:22 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> >>>>> Richard Henderson writes:
> 
> Richard> On Wed, May 14, 2003 at 01:03:12PM -0400, David Edelsohn wrote:
> >> Richard, would it be better to change g_switch_value to unsigned
> >> int or unsigned HOST_WIDE_INT?
> 
> Richard> Yes.
> 
> 	I just noticed that this is fairly ugly.  Half the time
> g_switch_value is tested against SIZE (unsigned HOST_WIDE_INT) and the
> other half against int_size_in_bytes (signed HOST_WIDE_INT).  It's not going
> to be easy to reliably rationalize this either way.  Maybe we should just
> punt with a cast in sysv4.h -- alpha.c does that already.  Comments?

I think g_switch_value should be unsigned, because the reason
int_size_in_bytes is signed is that it can return -1 meaning "I
dunno", and that's a special case that the port maintainer should take
care to handle specially.
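
A small sketch of the special case described above, under stated
assumptions: wide_int_t, uwide_int_t and in_small_data are hypothetical
names, and the size argument plays the role of the value returned by
int_size_in_bytes, with -1 meaning "unknown".

```c
/* Hypothetical stand-ins for HOST_WIDE_INT and its unsigned variant.  */
typedef long long wide_int_t;
typedef unsigned long long uwide_int_t;

/* Mirrors an unsigned g_switch_value.  */
static uwide_int_t threshold = 8;

/* Return 1 when SIZE is known, nonzero and within the threshold.  */
static int
in_small_data (wide_int_t size)
{
  /* Test the sign before converting: (uwide_int_t) -1 is the largest
     unsigned value, so the "unknown" sentinel must be rejected as a
     signed quantity rather than left to the unsigned comparison.  */
  return size > 0 && (uwide_int_t) size <= threshold;
}
```

With the sign test first, the conversion to unsigned only ever sees
positive sizes, so the cast cannot change the result.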

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-15 23:22       ` Geoff Keating
@ 2003-05-18 23:17         ` Matt Kraai
  2003-05-19  0:16           ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Matt Kraai @ 2003-05-18 23:17 UTC (permalink / raw)
  To: Geoff Keating; +Cc: David Edelsohn, gcc-patches

On Thu, May 15, 2003 at 03:46:58PM -0700, Geoff Keating wrote:
> David Edelsohn <dje@watson.ibm.com> writes:
> 
> > >>>>> Richard Henderson writes:
> > 
> > Richard> On Wed, May 14, 2003 at 01:03:12PM -0400, David Edelsohn wrote:
> > >> Richard, would it be better to change g_switch_value to unsigned
> > >> int or unsigned HOST_WIDE_INT?
> > 
> > Richard> Yes.
> > 
> > 	I just noticed that this is fairly ugly.  Half the time
> > g_switch_value is tested against SIZE (unsigned HOST_WIDE_INT) and the
> > other half against int_size_in_bytes (signed HOST_WIDE_INT).  It's not going
> > to be easy to reliably rationalize this either way.  Maybe we should just
> > punt with a cast in sysv4.h -- alpha.c does that already.  Comments?
> 
> I think g_switch_value should be unsigned, because the reason
> int_size_in_bytes is signed is that it can return -1 meaning "I
> dunno", and that's a special case that the port maintainer should take
> care to handle specially.

The following patch makes g_switch_value an unsigned
HOST_WIDE_INT.
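
One side effect of widening the type is that printf-style calls can no
longer use "%d" for the value, which is why the patch switches to
HOST_WIDE_INT_PRINT_UNSIGNED.  The sketch below shows the same idea
with the standard PRIu64 macro from <inttypes.h> as an analogue; it is
not GCC's macro, and format_g_option is a hypothetical helper.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a "-G" comment line into BUF; returns what snprintf returns.
   The format macro is a string literal, so it is pasted between the
   neighbouring literals at compile time.  */
static int
format_g_option (char *buf, size_t len, uint64_t g_value)
{
  return snprintf (buf, len, "# -G %" PRIu64 "\n", g_value);
}
```

Keeping the conversion in a macro means the format string follows the
type if its width differs between hosts.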

Tested by building cc1 for alpha-unknown-linux-gnu, frv-foo-elf,
and m32r-foo-elf, and by bootstrapping and regression testing on
powerpc-unknown-linux-gnu.

OK to commit?

-- 
Matt Kraai <kraai@alumni.cmu.edu>
Debian GNU/Linux Peon

	* flags.h (g_switch_value): Change to an unsigned
	HOST_WIDE_INT.
	* toplev.c (g_switch_value): Likewise.

	* config/alpha/alpha.c (small_symbolic_operand): Remove
	g_switch_value cast. 
	(alpha_in_small_data_p): Cast size to an unsigned
	HOST_WIDE_INT.

	* config/frv/frv.c (frv_in_small_data_p): Cast size to an
	unsigned HOST_WIDE_INT.
	* config/frv/frv.h (g_switch_value, g_switch_set): Remove.
	(ASM_OUTPUT_ALIGNED_DECL_LOCAL): Declare g_switch_value.

	* config/m32r/m32r.c (m32r_in_small_data_p): Cast size to an
	unsigned HOST_WIDE_INT.
	(m32r_asm_file_start): Use HOST_WIDE_INT_PRINT_UNSIGNED.
	* config/m32r/m32r.h (g_switch_value, g_switch_set): Remove.
	(ASM_OUTPUT_ALIGNED_COMMON): Declare g_switch_value.

	* config/rs6000/rs6000.c (rs6000_file_start): Use
	HOST_WIDE_INT_PRINT_UNSIGNED.
	(small_data_operand): Cast summand to unsigned HOST_WIDE_INT.
	(rs6000_elf_in_small_data_p): Cast size to unsigned
	HOST_WIDE_INT.
	* config/rs6000/sysv4.h (g_switch_value, g_switch_set):
	Remove.
	(SUBTARGET_OVERRIDE_OPTIONS): Declare g_switch_value and
	g_switch_set.
	(ASM_OUTPUT_ALIGNED_LOCAL): Declare g_switch_value and remove
	g_switch_value cast.

Index: gcc/flags.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/flags.h,v
retrieving revision 1.107
diff -c -3 -p -r1.107 flags.h
*** gcc/flags.h	7 May 2003 22:11:33 -0000	1.107
--- gcc/flags.h	18 May 2003 21:19:28 -0000
*************** extern int frame_pointer_needed;
*** 586,592 ****
  extern int flag_trapv;
  
  /* Value of the -G xx switch, and whether it was passed or not.  */
! extern int g_switch_value;
  extern int g_switch_set;
  
  /* Values of the -falign-* flags: how much to align labels in code. 
--- 586,592 ----
  extern int flag_trapv;
  
  /* Value of the -G xx switch, and whether it was passed or not.  */
! extern unsigned HOST_WIDE_INT g_switch_value;
  extern int g_switch_set;
  
  /* Values of the -falign-* flags: how much to align labels in code. 
Index: gcc/toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.757
diff -c -3 -p -r1.757 toplev.c
*** gcc/toplev.c	14 May 2003 07:29:41 -0000	1.757
--- gcc/toplev.c	18 May 2003 21:19:48 -0000
*************** enum graph_dump_types graph_dump_format;
*** 329,335 ****
  char *asm_file_name;
  
  /* Value of the -G xx switch, and whether it was passed or not.  */
! int g_switch_value;
  int g_switch_set;
  
  /* Type(s) of debugging information we are producing (if any).
--- 329,335 ----
  char *asm_file_name;
  
  /* Value of the -G xx switch, and whether it was passed or not.  */
! unsigned HOST_WIDE_INT g_switch_value;
  int g_switch_set;
  
  /* Type(s) of debugging information we are producing (if any).
Index: gcc/config/alpha/alpha.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/alpha/alpha.c,v
retrieving revision 1.306
diff -c -3 -p -r1.306 alpha.c
*** gcc/config/alpha/alpha.c	16 May 2003 18:57:37 -0000	1.306
--- gcc/config/alpha/alpha.c	18 May 2003 21:20:19 -0000
*************** small_symbolic_operand (op, mode)
*** 1166,1172 ****
    /* ??? There's no encode_section_info equivalent for the rtl
       constant pool, so SYMBOL_FLAG_SMALL never gets set.  */
    if (CONSTANT_POOL_ADDRESS_P (op))
!     return GET_MODE_SIZE (get_pool_mode (op)) <= (unsigned) g_switch_value;
  
    return (SYMBOL_REF_LOCAL_P (op)
  	  && SYMBOL_REF_SMALL_P (op)
--- 1166,1172 ----
    /* ??? There's no encode_section_info equivalent for the rtl
       constant pool, so SYMBOL_FLAG_SMALL never gets set.  */
    if (CONSTANT_POOL_ADDRESS_P (op))
!     return GET_MODE_SIZE (get_pool_mode (op)) <= g_switch_value;
  
    return (SYMBOL_REF_LOCAL_P (op)
  	  && SYMBOL_REF_SMALL_P (op)
*************** alpha_in_small_data_p (exp)
*** 1891,1897 ****
  
        /* If this is an incomplete type with size 0, then we can't put it
  	 in sdata because it might be too big when completed.  */
!       if (size > 0 && size <= g_switch_value)
  	return true;
      }
  
--- 1891,1897 ----
  
        /* If this is an incomplete type with size 0, then we can't put it
  	 in sdata because it might be too big when completed.  */
!       if (size > 0 && (unsigned HOST_WIDE_INT) size <= g_switch_value)
  	return true;
      }
  
Index: gcc/config/frv/frv.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/frv/frv.c,v
retrieving revision 1.23
diff -c -3 -p -r1.23 frv.c
*** gcc/config/frv/frv.c	16 May 2003 18:57:40 -0000	1.23
--- gcc/config/frv/frv.c	18 May 2003 21:20:42 -0000
*************** frv_in_small_data_p (decl)
*** 9714,9720 ****
      return false;
  
    size = int_size_in_bytes (TREE_TYPE (decl));
!   if (size > 0 && size <= g_switch_value)
      return true;
  
    /* If we already know which section the decl should be in, see if
--- 9714,9720 ----
      return false;
  
    size = int_size_in_bytes (TREE_TYPE (decl));
!   if (size > 0 && (unsigned HOST_WIDE_INT) size <= g_switch_value)
      return true;
  
    /* If we already know which section the decl should be in, see if
Index: gcc/config/frv/frv.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/frv/frv.h,v
retrieving revision 1.22
diff -c -3 -p -r1.22 frv.h
*** gcc/config/frv/frv.h	12 May 2003 09:51:24 -0000	1.22
--- gcc/config/frv/frv.h	18 May 2003 21:20:59 -0000
*************** extern int target_flags;
*** 554,562 ****
  #define SDATA_DEFAULT_SIZE 8
  #endif
  
- extern int g_switch_value;        /* value of the -G xx switch */
- extern int g_switch_set;          /* whether -G xx was passed.  */
- 
  
  /* Storage Layout */
  
--- 554,559 ----
*************** extern int size_directive_output;
*** 2783,2788 ****
--- 2780,2787 ----
  #undef ASM_OUTPUT_ALIGNED_DECL_LOCAL
  #define ASM_OUTPUT_ALIGNED_DECL_LOCAL(STREAM, DECL, NAME, SIZE, ALIGN)	\
  do {                                                                   	\
+   extern unsigned HOST_WIDE_INT g_switch_value;				\
+ 									\
    if ((SIZE) > 0 && (SIZE) <= g_switch_value)				\
      sbss_section ();                                                 	\
    else                                                                 	\
Index: gcc/config/m32r/m32r.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/m32r/m32r.c,v
retrieving revision 1.64
diff -c -3 -p -r1.64 m32r.c
*** gcc/config/m32r/m32r.c	9 May 2003 06:37:20 -0000	1.64
--- gcc/config/m32r/m32r.c	18 May 2003 21:21:09 -0000
*************** m32r_in_small_data_p (decl)
*** 433,439 ****
  	{
  	  int size = int_size_in_bytes (TREE_TYPE (decl));
  
! 	  if (size > 0 && size <= g_switch_value)
  	    return true;
  	}
      }
--- 433,439 ----
  	{
  	  int size = int_size_in_bytes (TREE_TYPE (decl));
  
! 	  if (size > 0 && (unsigned HOST_WIDE_INT) size <= g_switch_value)
  	    return true;
  	}
      }
*************** m32r_asm_file_start (file)
*** 2208,2214 ****
       FILE * file;
  {
    if (flag_verbose_asm)
!     fprintf (file, "%s M32R/D special options: -G %d\n",
  	     ASM_COMMENT_START, g_switch_value);
  }
  \f
--- 2208,2215 ----
       FILE * file;
  {
    if (flag_verbose_asm)
!     fprintf (file,
! 	     "%s M32R/D special options: -G " HOST_WIDE_INT_PRINT_UNSIGNED "\n",
  	     ASM_COMMENT_START, g_switch_value);
  }
  \f
Index: gcc/config/m32r/m32r.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/m32r/m32r.h,v
retrieving revision 1.82
diff -c -3 -p -r1.82 m32r.h
*** gcc/config/m32r/m32r.h	12 May 2003 09:51:29 -0000	1.82
--- gcc/config/m32r/m32r.h	18 May 2003 21:21:16 -0000
*************** extern enum m32r_model m32r_model;
*** 350,358 ****
  #define SDATA_DEFAULT_SIZE 8
  #endif
  
- extern int g_switch_value;		/* value of the -G xx switch */
- extern int g_switch_set;		/* whether -G xx was passed.  */
- 
  enum m32r_sdata { M32R_SDATA_NONE, M32R_SDATA_SDATA, M32R_SDATA_USE };
  
  extern enum m32r_sdata m32r_sdata;
--- 350,355 ----
*************** extern char m32r_punct_chars[256];
*** 1690,1695 ****
--- 1687,1694 ----
  #define ASM_OUTPUT_ALIGNED_COMMON(FILE, NAME, SIZE, ALIGN)		\
    do									\
      {									\
+       extern unsigned HOST_WIDE_INT g_switch_value;			\
+ 									\
        if (! TARGET_SDATA_NONE						\
  	  && (SIZE) > 0 && (SIZE) <= g_switch_value)			\
  	fprintf ((FILE), "%s", SCOMMON_ASM_OP);				\
Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.478
diff -c -3 -p -r1.478 rs6000.c
*** gcc/config/rs6000/rs6000.c	17 May 2003 16:57:17 -0000	1.478
--- gcc/config/rs6000/rs6000.c	18 May 2003 21:21:54 -0000
*************** rs6000_file_start (file, default_cpu)
*** 950,956 ****
  
        if (rs6000_sdata && g_switch_value)
  	{
! 	  fprintf (file, "%s -G %d", start, g_switch_value);
  	  start = "";
  	}
  #endif
--- 950,957 ----
  
        if (rs6000_sdata && g_switch_value)
  	{
! 	  fprintf (file, "%s -G " HOST_WIDE_INT_PRINT_UNSIGNED, start,
! 		   g_switch_value);
  	  start = "";
  	}
  #endif
*************** small_data_operand (op, mode)
*** 2254,2260 ****
        /* We have to be careful here, because it is the referenced address
          that must be 32k from _SDA_BASE_, not just the symbol.  */
        summand = INTVAL (XEXP (sum, 1));
!       if (summand < 0 || summand > g_switch_value)
         return 0;
  
        sym_ref = XEXP (sum, 0);
--- 2255,2261 ----
        /* We have to be careful here, because it is the referenced address
          that must be 32k from _SDA_BASE_, not just the symbol.  */
        summand = INTVAL (XEXP (sum, 1));
!       if (summand < 0 || (unsigned HOST_WIDE_INT) summand > g_switch_value)
         return 0;
  
        sym_ref = XEXP (sum, 0);
*************** rs6000_elf_in_small_data_p (decl)
*** 13452,13458 ****
        HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (decl));
  
        if (size > 0
! 	  && size <= g_switch_value
  	  /* If it's not public, and we're not going to reference it there,
  	     there's no need to put it in the small data section.  */
  	  && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)))
--- 13453,13459 ----
        HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (decl));
  
        if (size > 0
! 	  && (unsigned HOST_WIDE_INT) size <= g_switch_value
  	  /* If it's not public, and we're not going to reference it there,
  	     there's no need to put it in the small data section.  */
  	  && (rs6000_sdata != SDATA_DATA || TREE_PUBLIC (decl)))
Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.127
diff -c -3 -p -r1.127 sysv4.h
*** gcc/config/rs6000/sysv4.h	15 May 2003 02:35:19 -0000	1.127
--- gcc/config/rs6000/sysv4.h	18 May 2003 21:22:00 -0000
*************** extern const char *rs6000_tls_size_strin
*** 89,99 ****
    { "tls-size=", &rs6000_tls_size_string,					\
     N_("Specify bit size of immediate TLS offsets"), 0 }
  
- /* Max # of bytes for variables to automatically be put into the .sdata
-    or .sdata2 sections.  */
- extern int g_switch_value;		/* Value of the -G xx switch.  */
- extern int g_switch_set;		/* Whether -G xx was passed.  */
- 
  #define SDATA_DEFAULT_SIZE 8
  
  /* Note, V.4 no longer uses a normal TOC, so make -mfull-toc, be just
--- 89,94 ----
*************** extern int g_switch_set;		/* Whether -G 
*** 171,176 ****
--- 166,174 ----
  
  #define SUBTARGET_OVERRIDE_OPTIONS					\
  do {									\
+   extern unsigned HOST_WIDE_INT g_switch_value;				\
+   extern int g_switch_set;						\
+ 									\
    if (!g_switch_set)							\
      g_switch_value = SDATA_DEFAULT_SIZE;				\
  									\
*************** extern int rs6000_pic_labelno;
*** 665,672 ****
  #undef	ASM_OUTPUT_ALIGNED_LOCAL
  #define	ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)		\
  do {									\
    if (rs6000_sdata != SDATA_NONE && (SIZE) > 0				\
!       && (SIZE) <= (unsigned HOST_WIDE_INT)g_switch_value)		\
      {									\
        sbss_section ();							\
        ASM_OUTPUT_ALIGN (FILE, exact_log2 (ALIGN / BITS_PER_UNIT));	\
--- 663,672 ----
  #undef	ASM_OUTPUT_ALIGNED_LOCAL
  #define	ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)		\
  do {									\
+   extern unsigned HOST_WIDE_INT g_switch_value;				\
+ 									\
    if (rs6000_sdata != SDATA_NONE && (SIZE) > 0				\
!       && (SIZE) <= g_switch_value)					\
      {									\
        sbss_section ();							\
        ASM_OUTPUT_ALIGN (FILE, exact_log2 (ALIGN / BITS_PER_UNIT));	\

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-unknown-linux-gnu bootstrap fix
  2003-05-18 23:17         ` Matt Kraai
@ 2003-05-19  0:16           ` Geoff Keating
  0 siblings, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2003-05-19  0:16 UTC (permalink / raw)
  To: kraai; +Cc: dje, gcc-patches

> Date: Sun, 18 May 2003 15:36:36 -0700
> From: Matt Kraai <kraai@alumni.cmu.edu>

> 	* flags.h (g_switch_value): Change to an unsigned
> 	HOST_WIDE_INT.
> 	* toplev.c (g_switch_value): Likewise.
> 
> 	* config/alpha/alpha.c (small_symbolic_operand): Remove
> 	g_switch_value cast. 
> 	(alpha_in_small_data_p): Cast size to an unsigned
> 	HOST_WIDE_INT.
> 
> 	* config/frv/frv.c (frv_in_small_data_p): Cast size to an
> 	unsigned HOST_WIDE_INT.
> 	* config/frv/frv.h (g_switch_value, g_switch_set): Remove.
> 	(ASM_OUTPUT_ALIGNED_DECL_LOCAL): Declare g_switch_value.
> 
> 	* config/m32r/m32r.c (m32r_in_small_data_p): Cast size to an
> 	unsigned HOST_WIDE_INT.
> 	(m32r_asm_file_start): Use HOST_WIDE_INT_PRINT_UNSIGNED.
> 	* config/m32r/m32r.h (g_switch_value, g_switch_set): Remove.
> 	(ASM_OUTPUT_ALIGNED_COMMON): Declare g_switch_value.
> 
> 	* config/rs6000/rs6000.c (rs6000_file_start): Use
> 	HOST_WIDE_INT_PRINT_UNSIGNED.
> 	(small_data_operand): Cast summand to unsigned HOST_WIDE_INT.
> 	(rs6000_elf_in_small_data_p): Cast size to unsigned
> 	HOST_WIDE_INT.
> 	* config/rs6000/sysv4.h (g_switch_value, g_switch_set):
> 	Remove.
> 	(SUBTARGET_OVERRIDE_OPTIONS): Declare g_switch_value and
> 	g_switch_set.
> 	(ASM_OUTPUT_ALIGNED_LOCAL): Declare g_switch_value and remove
> 	g_switch_value cast.

This is OK.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] powerpc64-linux bi-arch support
@ 2003-05-27 11:58 Jakub Jelinek
  2003-05-27 14:57 ` David Edelsohn
                   ` (3 more replies)
  0 siblings, 4 replies; 875+ messages in thread
From: Jakub Jelinek @ 2003-05-27 11:58 UTC (permalink / raw)
  To: David Edelsohn, Alan Modra, Janis Johnson; +Cc: gcc-patches

Hi!

Below is a forward port of the ppc64-linux bi-arch support.
A similar patch got lots of testing on gcc-3_2-rhl8-branch and some on
gcc-3_3-rhl-branch.
GCC can be configured to support just a 32-bit target (configuring for
powerpc-*-linux*), 32-bit and 64-bit targets defaulting to 32-bit
(--target powerpc64-*-linux* --with-cpu=default32 (or some specific 32-bit
CPU)), or 32-bit and 64-bit targets defaulting to 64-bit
(--target powerpc64-*-linux*).

2003-05-27  Jakub Jelinek  <jakub@redhat.com>
	    Alan Modra  <amodra@bigpond.net.au>

	* config/i386/linux.h (NO_PROFILE_COUNTERS): Define to 1.
	* config/i386/freebsd.h (NO_PROFILE_COUNTERS): Likewise.
	* config/i386/netbsd-elf.h (NO_PROFILE_COUNTERS): Likewise.
	* config/xtensa/xtensa.h (NO_PROFILE_COUNTERS): Likewise.
	* config/darwin.h (NO_PROFILE_COUNTERS): Likewise.
	* final.c (NO_PROFILE_COUNTERS): Define to 0 if not defined.
	(profile_function): Allow NO_PROFILE_COUNTERS to be non-constant.
	* config/rs6000/rs6000.c (output_profile_hook): Likewise.

	* configure.in (powerpc*-*, s390*-*): Set tls_as_opt.
	Pass it to $gcc_cv_as.
	* configure: Rebuilt.

	* config/rs6000/rs6000.c (rs6000_abi_name): Remove initializer.
	(print_operand): Allow TARGET_AIX to be non-constant.
	(rs6000_aix_emit_builtin_unwind_init, rs6000_emit_eh_toc_restore):
	Define unconditionally.
	(rs6000_elf_declare_function_name): New function.
	* config/rs6000/rs6000.md (eh_return): Allow TARGET_AIX to be
	non-constant.
	* config/rs6000/linux64.h [!RS6000_BI_ARCH] (TARGET_64BIT): Define
	to 1.
	(DEFAULT_ARCH64_P, RS6000_BI_ARCH_P): Define.
	[IN_LIBGCC2] (TARGET_64BIT): Define based on whether __powerpc64__
	is defined.
	(TARGET_AIX): Define to 1 if TARGET_64BIT.
	(PROCESSOR_DEFAULT): Remove.
	(TARGET_RELOCATABLE, RS6000_ABI_NAME, INVALID_64BIT,
	INVALID_32BIT, SUBSUBTARGET_OVERRIDE_OPTIONS): Define.
	[RS6000_BI_ARCH] (OVERRIDE_OPTIONS, ASM_FILE_START): Define.
	(ASM_DEFAULT_SPEC, ASM_SPEC, LINK_OS_LINUX_SPEC): Define for both
	-m32 and -m64.
	(MULTILIB_DEFAULTS): Define.
	(SUBSUBTARGET_EXTRA_SPECS): Define.
	(ASM_SPEC32, ASM_SPEC64, ASM_SPEC_COMMON): Define.
	(TARGET_TOC): Define only if !RS6000_BI_ARCH.
	(TARGET_NO_TOC): Remove.
	[!RS6000_BI_ARCH] (TARGET_RELOCATABLE, TARGET_EABI,
	TARGET_PROTOTYPE): Define to 0.
	(NO_PROFILE_COUNTERS): Define to TARGET_64BIT.
	(PROFILE_HOOK): Only call output_profile_hook if TARGET_64BIT.
	(ADJUST_FIELD_ALIGN, ROUND_TYPE_ALIGN): Adjust to work properly
	if !TARGET_64BIT.
	(USER_LABEL_PREFIX): Remove.
	(JUMP_TABLES_IN_TEXT_SECTION): Define to TARGET_64BIT.
	(SETUP_FRAME_ADDRESSES): Only call rs6000_aix_emit_builtin_unwind_init
	if TARGET_64BIT.
	(TARGET_OS_CPP_BUILTINS): Handle both -m32 and -m64.
	(LINK_OS_LINUX_SPEC32, LINK_OS_LINUX_SPEC64): Define.
	(STARTFILE_LINUX_SPEC, ENDFILE_LINUX_SPEC): Remove.
	(TOC_SECTION_ASM_OP): Define depending on TARGET_64BIT.
	(MINIMAL_TOC_SECTION_ASM_OP): Likewise.
	(SIZE_TYPE, PTRDIFF_TYPE, WCHAR_TYPE): Define depending on
	TARGET_64BIT.
	(RS6000_CALL_GLUE): Likewise.
	(SAVE_FP_PREFIX, SAVE_FP_SUFFIX, RESTORE_FP_PREFIX,
	RESTORE_FP_SUFFIX): Likewise.
	(ASM_DECLARE_FUNCTION_NAME): Remove.
	(ASM_DECLARE_FUNCTION_SIZE, ASM_OUTPUT_SOURCE_LINE,
	DBX_OUTPUT_BRAC, DBX_OUTPUT_NFUN): Only output dot before function
	name if TARGET_64BIT.
	(ASM_OUTPUT_SPECIAL_POOL_ENTRY_P): Handle both TARGET_64BIT and
	!TARGET_64BIT.
	(ASM_OUTPUT_REG_PUSH, ASM_OUTPUT_REG_POP): Remove undefs.
	(ASM_PREFERRED_EH_DATA_FORMAT): Take TARGET_64BIT into account.
	(DRAFT_V4_STRUCT_RET): Define.
	(SIGNAL_FRAMESIZE): New enum value.
	(MD_FALLBACK_FRAME_STATE_FOR): Define.
	* config/rs6000/default64.h: New file.
	* config/rs6000/sysv4.h (SUBTARGET_SWITCHES): Add -m32 and -m64
	options.
	(SUBTARGET_OVERRIDE_OPTIONS): If rs6000_abi_name is NULL, set it
	to RS6000_ABI_NAME.  Only disallow mixing of -fPIC with -mcall-aixdesc
	if !TARGET_64BIT.
	[!RS6000_BI_ARCH] (SUBSUBTARGET_OVERRIDE_OPTIONS): Define.
	(ASM_DECLARE_FUNCTION_NAME): Use rs6000_elf_declare_function_name
	function.
	(TARGET_OS_SYSV_CPP_BUILTINS): Define.
	(TARGET_OS_CPP_BUILTINS): Use it.
	(CPP_SYSV_SPEC): Remove.
	(CPP_SPEC): Remove cpp_sysv.
	(SUBTARGET_EXTRA_SPECS): Remove cpp_sysv.
	Add SUBSUBTARGET_EXTRA_SPECS.
	(SUBSUBTARGET_EXTRA_SPECS): Define.
	* config/rs6000/biarch64.h: New file.
	* config/rs6000/rs6000-protos.h (rs6000_elf_declare_function_name):
	New prototype.
	* config/rs6000/x-linux64: New file.
	* config/rs6000/t-linux64: Build -m64, -m32 and -m32 -msoft-float
	multilibs.
	* config/rs6000/eabi-ci.asm: Protect with #ifndef __powerpc64__.
	* config/rs6000/eabi-cn.asm: Likewise.
	* config/rs6000/tramp.asm: Likewise.
	* config/rs6000/sol-ci.asm: Likewise.
	* config/rs6000/sol-cn.asm: Likewise.
	* config/rs6000/linux.h (TARGET_64BIT): Define to 0.
	(TARGET_OS_CPP_BUILTINS): Use TARGET_OS_SYSV_CPP_BUILTINS.
	* config/rs6000/ppc-asm.h: Move __powerpc64__ section before
	_CALL_AIXDESC section.
	* config.gcc (powerpc64-*-linux*): Configure a bi-arch compiler,
	defaulting to -m64 unless --with-cpu= is one of the 32-bit CPUs
	or default32.
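
The NO_PROFILE_COUNTERS entries above turn a macro that was only tested
with #ifdef into one whose value is tested with an ordinary if, so that
a bi-arch target can define it as a run-time expression.  A minimal
sketch of that pattern follows; target_64bit and mcount_arg_count are
hypothetical names, not identifiers from the patch.

```c
/* Hypothetical run-time flag standing in for TARGET_64BIT.  */
static int target_64bit = 1;

/* A target header may define the macro as an expression...  */
#define NO_PROFILE_COUNTERS target_64bit

/* ...and generic code supplies 0 when no header defines it.  */
#ifndef NO_PROFILE_COUNTERS
# define NO_PROFILE_COUNTERS 0
#endif

/* Number of arguments the profiling call would take: none when
   counters are disabled, otherwise one (the counter label).  */
static int
mcount_arg_count (void)
{
  if (NO_PROFILE_COUNTERS)
    return 0;
  return 1;
}
```

This also shows why the existing targets change from an empty
definition to 1: an empty macro would leave "if ()" behind, which does
not compile.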

--- gcc/config/darwin.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/darwin.h	2003-05-26 19:17:04.000000000 -0400
@@ -310,7 +310,7 @@ do { text_section ();							\
 
 /* Our profiling scheme doesn't LP labels and counter words.  */
 
-#define NO_PROFILE_COUNTERS
+#define NO_PROFILE_COUNTERS	1
 
 #undef	INIT_SECTION_ASM_OP
 #define INIT_SECTION_ASM_OP
--- gcc/config/i386/linux.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/i386/linux.h	2003-05-26 19:17:04.000000000 -0400
@@ -51,7 +51,7 @@ Boston, MA 02111-1307, USA.  */
    To the best of my knowledge, no Linux libc has required the label
    argument to mcount.  */
 
-#define NO_PROFILE_COUNTERS
+#define NO_PROFILE_COUNTERS	1
 
 #undef MCOUNT_NAME
 #define MCOUNT_NAME "mcount"
--- gcc/config/i386/freebsd.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/i386/freebsd.h	2003-05-26 19:17:04.000000000 -0400
@@ -43,7 +43,7 @@ Boston, MA 02111-1307, USA.  */
   (TARGET_64BIT ? dbx64_register_map[n] : svr4_dbx_register_map[n])
 
 #undef  NO_PROFILE_COUNTERS
-#define NO_PROFILE_COUNTERS
+#define NO_PROFILE_COUNTERS	1
 
 /* Tell final.c that we don't need a label passed to mcount.  */
 
--- gcc/config/i386/netbsd-elf.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/i386/netbsd-elf.h	2003-05-26 19:17:04.000000000 -0400
@@ -76,7 +76,7 @@ Boston, MA 02111-1307, USA.  */
 /* Output assembler code to FILE to call the profiler.  */
 
 #undef NO_PROFILE_COUNTERS
-#define NO_PROFILE_COUNTERS
+#define NO_PROFILE_COUNTERS	1
 
 #undef FUNCTION_PROFILER
 #define FUNCTION_PROFILER(FILE, LABELNO)				\
--- gcc/config/rs6000/rs6000.c.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/rs6000/rs6000.c	2003-05-26 19:17:05.000000000 -0400
@@ -124,7 +124,7 @@ int rs6000_pic_labelno;
 
 #ifdef USING_ELFOS_H
 /* Which abi to adhere to */
-const char *rs6000_abi_name = RS6000_ABI_NAME;
+const char *rs6000_abi_name;
 
 /* Semantics of the small data area */
 enum rs6000_sdata_type rs6000_sdata = SDATA_DATA;
@@ -8660,11 +8660,10 @@ print_operand (file, x, code)
 	      break;
 	    }
 	}
-#if TARGET_AIX
-      RS6000_OUTPUT_BASENAME (file, XSTR (x, 0));
-#else
-      assemble_name (file, XSTR (x, 0));
-#endif
+      if (TARGET_AIX)
+	RS6000_OUTPUT_BASENAME (file, XSTR (x, 0));
+      else
+	assemble_name (file, XSTR (x, 0));
       return;
 
     case 'Z':
@@ -10565,7 +10564,6 @@ create_TOC_reference (symbol) 
 		 gen_rtx_SYMBOL_REF (Pmode, toc_label_name))));
 }
 
-#if TARGET_AIX
 /* __throw will restore its own return address to be the same as the
    return address of the function that the throw is being made to.
    This is unfortunate, because we want to check the original
@@ -10693,7 +10691,6 @@ rs6000_emit_eh_toc_restore (stacksize)
   emit_note (NULL, NOTE_INSN_LOOP_END);
   emit_label (loop_exit);
 }
-#endif /* TARGET_AIX */
 \f
 /* This ties together stack memory (MEM with an alias set of
    rs6000_sr_alias_set) and the change to the stack pointer.  */
@@ -12884,20 +12881,24 @@ output_profile_hook (labelno)
 
   if (DEFAULT_ABI == ABI_AIX)
     {
-#ifdef NO_PROFILE_COUNTERS
-      emit_library_call (init_one_libfunc (RS6000_MCOUNT), 0, VOIDmode, 0);
-#else
-      char buf[30];
-      const char *label_name;
-      rtx fun;
+#ifndef NO_PROFILE_COUNTERS
+# define NO_PROFILE_COUNTERS 0
+#endif
+      if (NO_PROFILE_COUNTERS)  
+	emit_library_call (init_one_libfunc (RS6000_MCOUNT), 0, VOIDmode, 0);
+      else
+	{
+	  char buf[30];
+	  const char *label_name;
+	  rtx fun;
 
-      ASM_GENERATE_INTERNAL_LABEL (buf, "LP", labelno);
-      label_name = (*targetm.strip_name_encoding) (ggc_strdup (buf));
-      fun = gen_rtx_SYMBOL_REF (Pmode, label_name);
+	  ASM_GENERATE_INTERNAL_LABEL (buf, "LP", labelno);
+	  label_name = (*targetm.strip_name_encoding) (ggc_strdup (buf));
+	  fun = gen_rtx_SYMBOL_REF (Pmode, label_name);
 
-      emit_library_call (init_one_libfunc (RS6000_MCOUNT), 0, VOIDmode, 1,
-                         fun, Pmode);
-#endif
+	  emit_library_call (init_one_libfunc (RS6000_MCOUNT), 0, VOIDmode, 1,
+			     fun, Pmode);
+	}
     }
   else if (DEFAULT_ABI == ABI_DARWIN)
     {
@@ -13933,6 +13934,79 @@ rs6000_elf_asm_out_destructor (symbol, p
   else
     assemble_integer (symbol, POINTER_SIZE / BITS_PER_UNIT, POINTER_SIZE, 1);
 }
+
+void
+rs6000_elf_declare_function_name (file, name, decl)
+     FILE *file;
+     const char *name;
+     tree decl;
+{
+  if (TARGET_64BIT)
+    {
+      fputs ("\t.section\t\".opd\",\"aw\"\n\t.align 3\n", file);
+      ASM_OUTPUT_LABEL (file, name);
+      fputs (DOUBLE_INT_ASM_OP, file);
+      putc ('.', file);
+      assemble_name (file, name);
+      fputs (",.TOC.@tocbase,0\n\t.previous\n\t.size\t", file);
+      assemble_name (file, name);
+      fputs (",24\n\t.type\t.", file);
+      assemble_name (file, name);
+      fputs (",@function\n", file);
+      if (TREE_PUBLIC (decl) && ! DECL_WEAK (decl))
+	{
+	  fputs ("\t.globl\t.", file);
+	  assemble_name (file, name);
+	  putc ('\n', file);
+	}
+      ASM_DECLARE_RESULT (file, DECL_RESULT (decl));
+      putc ('.', file);
+      ASM_OUTPUT_LABEL (file, name);
+      return;
+    }
+
+  if (TARGET_RELOCATABLE
+      && (get_pool_size () != 0 || current_function_profile)
+      && uses_TOC())
+    {
+      char buf[256];
+
+      (*targetm.asm_out.internal_label) (file, "LCL", rs6000_pic_labelno);
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCTOC", 1);
+      fprintf (file, "\t.long ");
+      assemble_name (file, buf);
+      putc ('-', file);
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      assemble_name (file, buf);
+      putc ('\n', file);
+    }
+
+  ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "function");
+  ASM_DECLARE_RESULT (file, DECL_RESULT (decl));
+
+  if (DEFAULT_ABI == ABI_AIX)
+    {
+      const char *desc_name, *orig_name;
+
+      orig_name = (*targetm.strip_name_encoding) (name);
+      desc_name = orig_name;
+      while (*desc_name == '.')
+	desc_name++;
+
+      if (TREE_PUBLIC (decl))
+	fprintf (file, "\t.globl %s\n", desc_name);
+
+      fprintf (file, "%s\n", MINIMAL_TOC_SECTION_ASM_OP);
+      fprintf (file, "%s:\n", desc_name);
+      fprintf (file, "\t.long %s\n", orig_name);
+      fputs ("\t.long _GLOBAL_OFFSET_TABLE_\n", file);
+      if (DEFAULT_ABI == ABI_AIX)
+	fputs ("\t.long 0\n", file);
+      fprintf (file, "\t.previous\n");
+    }
+  ASM_OUTPUT_LABEL (file, name);
+}
 #endif
 
 #if TARGET_XCOFF
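
Note on the TARGET_64BIT branch of rs6000_elf_declare_function_name above:
it places a function descriptor in .opd and then labels the code entry
point with a leading dot.  The following is only a sketch of the descriptor
text for a given symbol.  It assumes DOUBLE_INT_ASM_OP expands to
"\t.quad\t" and that ASM_OUTPUT_LABEL prints "name:"; format_opd_entry is a
hypothetical helper that does not exist in the patch, and the .size, .type
and .globl directives are left out for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the .opd descriptor that the 64-bit path
   writes for NAME.  Assumes DOUBLE_INT_ASM_OP is "\t.quad\t".
   Returns the snprintf result (length that was or would be written).  */
int
format_opd_entry (char *buf, size_t len, const char *name)
{
  assert (buf != NULL && name != NULL && strlen (name) > 0);
  return snprintf (buf, len,
                   "\t.section\t\".opd\",\"aw\"\n"
                   "\t.align 3\n"
                   "%s:\n"                          /* descriptor symbol */
                   "\t.quad\t.%s,.TOC.@tocbase,0\n" /* entry, TOC base, env */
                   "\t.previous\n",
                   name, name);
}
```

The dot-prefixed symbol in the first .quad operand is the code entry point
that the function emits last, just before returning.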
--- gcc/config/rs6000/linux64.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/rs6000/linux64.h	2003-05-26 19:17:56.000000000 -0400
@@ -19,43 +19,163 @@
    Free Software Foundation, 59 Temple Place - Suite 330, Boston,
    MA 02111-1307, USA.  */
 
-/* Yes!  We are AIX! Err. Wait. We're Linux!. No, wait, we're a
-  combo of both!*/
-#undef  DEFAULT_ABI
-#define DEFAULT_ABI ABI_AIX
-
-#undef  TARGET_AIX
-#define TARGET_AIX 1
-
-#undef  TARGET_DEFAULT
-#define TARGET_DEFAULT \
-  (MASK_POWERPC | MASK_POWERPC64 | MASK_64BIT | MASK_NEW_MNEMONICS)
-
-#undef  PROCESSOR_DEFAULT
-#define PROCESSOR_DEFAULT PROCESSOR_PPC630
-#undef  PROCESSOR_DEFAULT64
+#ifndef RS6000_BI_ARCH
+
+#undef	DEFAULT_ABI
+#define	DEFAULT_ABI ABI_AIX
+
+#undef	TARGET_64BIT
+#define	TARGET_64BIT 1
+
+#define	DEFAULT_ARCH64_P 1
+#define	RS6000_BI_ARCH_P 0
+
+#else
+
+#define	DEFAULT_ARCH64_P (TARGET_DEFAULT & MASK_64BIT)
+#define	RS6000_BI_ARCH_P 1
+
+#endif
+
+#ifdef IN_LIBGCC2
+#undef TARGET_64BIT
+#ifdef __powerpc64__
+#define TARGET_64BIT 1
+#else
+#define TARGET_64BIT 0
+#endif
+#endif
+
+#undef	TARGET_AIX
+#define	TARGET_AIX TARGET_64BIT
+
+#undef PROCESSOR_DEFAULT64
 #define PROCESSOR_DEFAULT64 PROCESSOR_PPC630
 
-#undef  ASM_DEFAULT_SPEC
-#define ASM_DEFAULT_SPEC "-mppc64"
+#undef	TARGET_RELOCATABLE
+#define	TARGET_RELOCATABLE (!TARGET_64BIT && (target_flags & MASK_RELOCATABLE))
+
+#undef	RS6000_ABI_NAME
+#define	RS6000_ABI_NAME (TARGET_64BIT ? "aixdesc" : "sysv")
 
+#define INVALID_64BIT "-m%s not supported in this configuration"
+#define INVALID_32BIT INVALID_64BIT
+
+#undef	SUBSUBTARGET_OVERRIDE_OPTIONS
+#define	SUBSUBTARGET_OVERRIDE_OPTIONS				\
+  do								\
+    {								\
+      if (TARGET_64BIT)						\
+	{							\
+	  if (DEFAULT_ABI != ABI_AIX)				\
+	    {							\
+	      DEFAULT_ABI = ABI_AIX;				\
+	      error (INVALID_64BIT, "call");			\
+	    }							\
+	  if (TARGET_RELOCATABLE)				\
+	    {							\
+	      target_flags &= ~MASK_RELOCATABLE;		\
+	      error (INVALID_64BIT, "relocatable");		\
+	    }							\
+	  if (TARGET_EABI)					\
+	    {							\
+	      target_flags &= ~MASK_EABI;			\
+	      error (INVALID_64BIT, "eabi");			\
+	    }							\
+	  if (TARGET_PROTOTYPE)					\
+	    {							\
+	      target_flags &= ~MASK_PROTOTYPE;			\
+	      error (INVALID_64BIT, "prototype");		\
+	    }							\
+	}							\
+      else							\
+	{							\
+	  if (!RS6000_BI_ARCH_P)				\
+	    error (INVALID_32BIT, "32");			\
+	}							\
+    }								\
+  while (0)
+
+#ifdef	RS6000_BI_ARCH
+
+#undef	OVERRIDE_OPTIONS
+#define	OVERRIDE_OPTIONS \
+  rs6000_override_options (((TARGET_DEFAULT ^ target_flags) & MASK_64BIT) \
+			   ? (char *) 0 : TARGET_CPU_DEFAULT)
+
+#undef	ASM_FILE_START
+#define	ASM_FILE_START(FILE)						    \
+  do									    \
+    {                                                                       \
+      output_file_directive ((FILE), main_input_filename);		    \
+      rs6000_file_start (FILE, (((TARGET_DEFAULT ^ target_flags)	    \
+				 & MASK_64BIT)				    \
+				? (char *) 0 : TARGET_CPU_DEFAULT));	    \
+    }									    \
+  while (0)
+
+#endif
+
+#undef	ASM_DEFAULT_SPEC
 #undef	ASM_SPEC
-#define	ASM_SPEC "%{.s: %{mregnames} %{mno-regnames}} \
-%{.S: %{mregnames} %{mno-regnames}} \
-%{mlittle} %{mlittle-endian} %{mbig} %{mbig-endian} \
-%{v:-V} %{Qy:} %{!Qn:-Qy} -a64 %(asm_cpu) %{Wa,*:%*}"
+#undef	LINK_OS_LINUX_SPEC
 
-/* This is always a 64 bit compiler.  */
-#undef	TARGET_64BIT
-#define	TARGET_64BIT		1
+#ifndef	RS6000_BI_ARCH
+#define	ASM_DEFAULT_SPEC "-mppc64"
+#define	ASM_SPEC         "%(asm_spec64) %(asm_spec_common)"
+#define	LINK_OS_LINUX_SPEC "%(link_os_linux_spec64)"
+#else
+#if DEFAULT_ARCH64_P
+#define	ASM_DEFAULT_SPEC "-mppc%{!m32:64}"
+#define	ASM_SPEC         "%{m32:%(asm_spec32)}%{!m32:%(asm_spec64)} %(asm_spec_common)"
+#define	LINK_OS_LINUX_SPEC "%{m32:%(link_os_linux_spec32)}%{!m32:%(link_os_linux_spec64)}"
+#else
+#define	ASM_DEFAULT_SPEC "-mppc%{m64:64}"
+#define	ASM_SPEC         "%{!m64:%(asm_spec32)}%{m64:%(asm_spec64)} %(asm_spec_common)"
+#define	LINK_OS_LINUX_SPEC "%{!m64:%(link_os_linux_spec32)}%{m64:%(link_os_linux_spec64)}"
+#endif
+#endif
+
+#define ASM_SPEC32 "-a32 %{n} %{T} %{Ym,*} %{Yd,*} \
+%{mrelocatable} %{mrelocatable-lib} %{fpic:-K PIC} %{fPIC:-K PIC} \
+%{memb} %{!memb: %{msdata: -memb} %{msdata=eabi: -memb}} \
+%{!mlittle: %{!mlittle-endian: %{!mbig: %{!mbig-endian: \
+    %{mcall-freebsd: -mbig} \
+    %{mcall-i960-old: -mlittle} \
+    %{mcall-linux: -mbig} \
+    %{mcall-gnu: -mbig} \
+    %{mcall-netbsd: -mbig} \
+}}}}"
+
+#define ASM_SPEC64 "-a64"
+
+#define ASM_SPEC_COMMON "%(asm_cpu) \
+%{.s: %{mregnames} %{mno-regnames}} %{.S: %{mregnames} %{mno-regnames}} \
+%{v:-V} %{Qy:} %{!Qn:-Qy} %{Wa,*:%*} \
+%{mlittle} %{mlittle-endian} %{mbig} %{mbig-endian}"
+
+#undef	SUBSUBTARGET_EXTRA_SPECS
+#define SUBSUBTARGET_EXTRA_SPECS \
+  { "asm_spec_common",		ASM_SPEC_COMMON },			\
+  { "asm_spec32",		ASM_SPEC32 },				\
+  { "asm_spec64",		ASM_SPEC64 },				\
+  { "link_os_linux_spec32",	LINK_OS_LINUX_SPEC32 },			\
+  { "link_os_linux_spec64",	LINK_OS_LINUX_SPEC64 },
+
+#undef	MULTILIB_DEFAULTS
+#if DEFAULT_ARCH64_P
+#define MULTILIB_DEFAULTS { "m64" }
+#else
+#define MULTILIB_DEFAULTS { "m32" }
+#endif
+
+#ifndef RS6000_BI_ARCH
 
 /* 64-bit PowerPC Linux always has a TOC.  */
-#undef  TARGET_NO_TOC
-#define TARGET_NO_TOC		0
 #undef  TARGET_TOC
 #define	TARGET_TOC		1
 
-/* Some things from sysv4.h we don't do.  */
+/* Some things from sysv4.h we don't do when 64 bit.  */
 #undef	TARGET_RELOCATABLE
 #define	TARGET_RELOCATABLE	0
 #undef	TARGET_EABI
@@ -63,8 +183,9 @@
 #undef	TARGET_PROTOTYPE
 #define	TARGET_PROTOTYPE	0
 
-/* Reuse sysv4 mask bits we made available above.  */
-#define	MASK_PROFILE_KERNEL	0x08000000
+#endif
+
+#define	MASK_PROFILE_KERNEL	0x00080000
 
 /* Non-standard profiling for kernels, which just saves LR then calls
    _mcount without worrying about arg saves.  The idea is to change
@@ -73,88 +194,62 @@
 #define TARGET_PROFILE_KERNEL	(target_flags & MASK_PROFILE_KERNEL)
 
 /* Override sysv4.h.  */
-#undef	SUBTARGET_SWITCHES
-#define SUBTARGET_SWITCHES						\
-  {"bit-align",	-MASK_NO_BITFIELD_TYPE,					\
-    N_("Align to the base type of the bit-field") },			\
-  {"no-bit-align",	 MASK_NO_BITFIELD_TYPE,				\
-    N_("Don't align to the base type of the bit-field") },		\
-  {"strict-align",	 MASK_STRICT_ALIGN,				\
-    N_("Don't assume that unaligned accesses are handled by the system") }, \
-  {"no-strict-align",	-MASK_STRICT_ALIGN,				\
-    N_("Assume that unaligned accesses are handled by the system") },	\
-  {"little-endian",	 MASK_LITTLE_ENDIAN,				\
-    N_("Produce little endian code") },					\
-  {"little",		 MASK_LITTLE_ENDIAN,				\
-    N_("Produce little endian code") },					\
-  {"big-endian",	-MASK_LITTLE_ENDIAN,				\
-    N_("Produce big endian code") },					\
-  {"big",		-MASK_LITTLE_ENDIAN,				\
-    N_("Produce big endian code") },					\
-  {"bit-word",		-MASK_NO_BITFIELD_WORD,				\
-    N_("Allow bit-fields to cross word boundaries") },			\
-  {"no-bit-word",	 MASK_NO_BITFIELD_WORD,				\
-    N_("Do not allow bit-fields to cross word boundaries") },		\
-  {"regnames",		 MASK_REGNAMES,					\
-    N_("Use alternate register names") },				\
-  {"no-regnames",	-MASK_REGNAMES,					\
-    N_("Don't use alternate register names") },				\
+#undef	EXTRA_SUBTARGET_SWITCHES
+#define EXTRA_SUBTARGET_SWITCHES					\
   {"profile-kernel",	 MASK_PROFILE_KERNEL,				\
    N_("Call mcount for profiling before a function prologue") },	\
   {"no-profile-kernel",	-MASK_PROFILE_KERNEL,				\
    N_("Call mcount for profiling after a function prologue") },
 
-#undef	SUBTARGET_OPTIONS
-#define	SUBTARGET_OPTIONS
-
-#undef	SUBTARGET_OVERRIDE_OPTIONS
-#define	SUBTARGET_OVERRIDE_OPTIONS {}
-
 /* We use glibc _mcount for profiling.  */
-#define NO_PROFILE_COUNTERS 1
-#define PROFILE_HOOK(LABEL) output_profile_hook (LABEL)
+#define NO_PROFILE_COUNTERS TARGET_64BIT
+#define PROFILE_HOOK(LABEL) \
+  do { if (TARGET_64BIT) output_profile_hook (LABEL); } while (0)
 
 /* We don't need to generate entries in .fixup.  */
 #undef RELOCATABLE_NEEDS_FIXUP
 
-#define USER_LABEL_PREFIX  ""
-
 /* This now supports a natural alignment mode. */
 /* AIX word-aligns FP doubles but doubleword-aligns 64-bit ints.  */
 #undef  ADJUST_FIELD_ALIGN
 #define ADJUST_FIELD_ALIGN(FIELD, COMPUTED) \
-  (TARGET_ALIGN_NATURAL ? (COMPUTED) : \
-  (TYPE_MODE (TREE_CODE (TREE_TYPE (FIELD)) == ARRAY_TYPE \
-	      ? get_inner_array_type (FIELD) \
-	      : TREE_TYPE (FIELD)) == DFmode \
-   ? MIN ((COMPUTED), 32) : (COMPUTED)))
+  ((TARGET_ALTIVEC && TREE_CODE (TREE_TYPE (FIELD)) == VECTOR_TYPE)	\
+   ? 128								\
+   : (TARGET_64BIT							\
+      && TARGET_ALIGN_NATURAL == 0					\
+      && TYPE_MODE (TREE_CODE (TREE_TYPE (FIELD)) == ARRAY_TYPE		\
+		    ? get_inner_array_type (FIELD)			\
+		    : TREE_TYPE (FIELD)) == DFmode)			\
+   ? MIN ((COMPUTED), 32)						\
+   : (COMPUTED))
 
 /* AIX increases natural record alignment to doubleword if the first
    field is an FP double while the FP fields remain word aligned.  */
 #undef  ROUND_TYPE_ALIGN
-#define ROUND_TYPE_ALIGN(STRUCT, COMPUTED, SPECIFIED)	\
-  ((TREE_CODE (STRUCT) == RECORD_TYPE			\
-    || TREE_CODE (STRUCT) == UNION_TYPE			\
-    || TREE_CODE (STRUCT) == QUAL_UNION_TYPE)		\
-   && TYPE_FIELDS (STRUCT) != 0				\
-   && TARGET_ALIGN_NATURAL == 0                         \
-   && DECL_MODE (TYPE_FIELDS (STRUCT)) == DFmode	\
-   ? MAX (MAX ((COMPUTED), (SPECIFIED)), 64)		\
+#define ROUND_TYPE_ALIGN(STRUCT, COMPUTED, SPECIFIED)		\
+  ((TARGET_ALTIVEC && TREE_CODE (STRUCT) == VECTOR_TYPE)	\
+   ? MAX (MAX ((COMPUTED), (SPECIFIED)), 128)			\
+   : (TARGET_64BIT						\
+      && (TREE_CODE (STRUCT) == RECORD_TYPE			\
+	  || TREE_CODE (STRUCT) == UNION_TYPE			\
+	  || TREE_CODE (STRUCT) == QUAL_UNION_TYPE)		\
+      && TYPE_FIELDS (STRUCT) != 0				\
+      && TARGET_ALIGN_NATURAL == 0				\
+      && DECL_MODE (TYPE_FIELDS (STRUCT)) == DFmode)		\
+   ? MAX (MAX ((COMPUTED), (SPECIFIED)), 64)			\
    : MAX ((COMPUTED), (SPECIFIED)))
 
 /* Indicate that jump tables go in the text section.  */
 #undef  JUMP_TABLES_IN_TEXT_SECTION
-#define JUMP_TABLES_IN_TEXT_SECTION 1
-
-/* 64-bit PowerPC Linux always has GPR13 fixed.  */
-#define FIXED_R13		1
+#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
 
 /* __throw will restore its own return address to be the same as the
    return address of the function that the throw is being made to.
    This is unfortunate, because we want to check the original
    return address to see if we need to restore the TOC.
    So we have to squirrel it away with this.  */
-#define SETUP_FRAME_ADDRESSES() rs6000_aix_emit_builtin_unwind_init ()
+#define SETUP_FRAME_ADDRESSES() \
+  do { if (TARGET_64BIT) rs6000_aix_emit_builtin_unwind_init (); } while (0)
 
 /* Override svr4.h  */
 #undef MD_EXEC_PREFIX
@@ -165,17 +260,28 @@
 #define	CPP_SYSV_SPEC ""
 
 #undef  TARGET_OS_CPP_BUILTINS
-#define TARGET_OS_CPP_BUILTINS()            \
-  do                                        \
-    {                                       \
-      builtin_define ("__PPC__");           \
-      builtin_define ("__PPC64__");         \
-      builtin_define ("__powerpc__");       \
-      builtin_define ("__powerpc64__");     \
-      builtin_define ("__PIC__");           \
-      builtin_assert ("cpu=powerpc64");     \
-      builtin_assert ("machine=powerpc64"); \
-    }                                       \
+#define TARGET_OS_CPP_BUILTINS()            		\
+  do							\
+    {							\
+      if (TARGET_64BIT)					\
+	{						\
+	  builtin_define ("__PPC__");			\
+	  builtin_define ("__PPC64__");			\
+	  builtin_define ("__powerpc__");		\
+	  builtin_define ("__powerpc64__");		\
+	  builtin_define ("__PIC__");			\
+	  builtin_assert ("cpu=powerpc64");		\
+	  builtin_assert ("machine=powerpc64");		\
+	}						\
+      else						\
+	{						\
+	  builtin_define_std ("PPC");			\
+	  builtin_define_std ("powerpc");		\
+	  builtin_assert ("cpu=powerpc");		\
+	  builtin_assert ("machine=powerpc");		\
+	  TARGET_OS_SYSV_CPP_BUILTINS ();		\
+	}						\
+    }							\
   while (0)
 
 #undef  CPP_OS_DEFAULT_SPEC
@@ -205,43 +311,40 @@
 #undef	LINK_OS_DEFAULT_SPEC
 #define LINK_OS_DEFAULT_SPEC "%(link_os_linux)"
 
-#undef  LINK_OS_LINUX_SPEC
-#define LINK_OS_LINUX_SPEC "-m elf64ppc %{!shared: %{!static: \
+#define LINK_OS_LINUX_SPEC32 "-m elf32ppclinux %{!shared: %{!static: \
   %{rdynamic:-export-dynamic} \
-  %{!dynamic-linker:-dynamic-linker /lib64/ld64.so.1}}}"
+  %{!dynamic-linker:-dynamic-linker /lib/ld.so.1}}}"
 
-#ifdef NATIVE_CROSS
-#define STARTFILE_PREFIX_SPEC "/usr/local/lib64/ /lib64/ /usr/lib64/"
-#endif
-
-#undef  STARTFILE_LINUX_SPEC
-#define STARTFILE_LINUX_SPEC "\
-%{!shared: %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} %{!p:crt1.o%s}}} crti.o%s \
-%{static:crtbeginT.o%s} \
-%{!static:%{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}}"
-
-#undef  ENDFILE_LINUX_SPEC
-#define ENDFILE_LINUX_SPEC "\
-%{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s"
+#define LINK_OS_LINUX_SPEC64 "-m elf64ppc %{!shared: %{!static: \
+  %{rdynamic:-export-dynamic} \
+  %{!dynamic-linker:-dynamic-linker /lib64/ld64.so.1}}}"
 
 #undef  TOC_SECTION_ASM_OP
-#define TOC_SECTION_ASM_OP "\t.section\t\".toc\",\"aw\""
+#define TOC_SECTION_ASM_OP \
+  (TARGET_64BIT						\
+   ? "\t.section\t\".toc\",\"aw\""			\
+   : "\t.section\t\".got\",\"aw\"")
 
 #undef  MINIMAL_TOC_SECTION_ASM_OP
-#define MINIMAL_TOC_SECTION_ASM_OP "\t.section\t\".toc1\",\"aw\""
+#define MINIMAL_TOC_SECTION_ASM_OP \
+  (TARGET_64BIT						\
+   ? "\t.section\t\".toc1\",\"aw\""			\
+   : ((TARGET_RELOCATABLE || flag_pic)			\
+      ? "\t.section\t\".got2\",\"aw\""			\
+      : "\t.section\t\".got1\",\"aw\""))
 
 #undef  TARGET_VERSION
 #define TARGET_VERSION fprintf (stderr, " (PowerPC64 GNU/Linux)");
 
 /* Must be at least as big as our pointer type.  */
-#undef  SIZE_TYPE
-#define SIZE_TYPE "long unsigned int"
+#undef	SIZE_TYPE
+#define	SIZE_TYPE (TARGET_64BIT ? "long unsigned int" : "unsigned int")
 
-#undef  PTRDIFF_TYPE
-#define PTRDIFF_TYPE "long int"
-
-#undef  WCHAR_TYPE
-#define WCHAR_TYPE "int"
+#undef	PTRDIFF_TYPE
+#define	PTRDIFF_TYPE (TARGET_64BIT ? "long int" : "int")
+
+#undef	WCHAR_TYPE
+#define	WCHAR_TYPE (TARGET_64BIT ? "int" : "long int")
 #undef  WCHAR_TYPE_SIZE
 #define WCHAR_TYPE_SIZE 32
 
@@ -255,51 +358,25 @@
 
 /* PowerPC no-op instruction.  */
 #undef  RS6000_CALL_GLUE
-#define RS6000_CALL_GLUE "nop"
+#define RS6000_CALL_GLUE (TARGET_64BIT ? "nop" : "cror 31,31,31")
 
 #undef  RS6000_MCOUNT
 #define RS6000_MCOUNT "_mcount"
 
 /* FP save and restore routines.  */
 #undef  SAVE_FP_PREFIX
-#define SAVE_FP_PREFIX "._savef"
+#define SAVE_FP_PREFIX (TARGET_64BIT ? "._savef" : "_savefpr_")
 #undef  SAVE_FP_SUFFIX
-#define SAVE_FP_SUFFIX ""
+#define SAVE_FP_SUFFIX (TARGET_64BIT ? "" : "_l")
 #undef  RESTORE_FP_PREFIX
-#define RESTORE_FP_PREFIX "._restf"
+#define RESTORE_FP_PREFIX (TARGET_64BIT ? "._restf" : "_restfpr_")
 #undef  RESTORE_FP_SUFFIX
-#define RESTORE_FP_SUFFIX ""
+#define RESTORE_FP_SUFFIX (TARGET_64BIT ? "" : "_l")
 
 /* Dwarf2 debugging.  */
 #undef  PREFERRED_DEBUGGING_TYPE
 #define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 
-#undef  ASM_DECLARE_FUNCTION_NAME
-#define ASM_DECLARE_FUNCTION_NAME(FILE, NAME, DECL)			\
-  do									\
-    {									\
-      fputs ("\t.section\t\".opd\",\"aw\"\n\t.align 3\n", (FILE));	\
-      ASM_OUTPUT_LABEL ((FILE), (NAME));				\
-      fputs (DOUBLE_INT_ASM_OP, (FILE));				\
-      putc ('.', (FILE));						\
-      assemble_name ((FILE), (NAME));					\
-      fputs (",.TOC.@tocbase,0\n\t.previous\n\t.size\t", (FILE));	\
-      assemble_name ((FILE), (NAME));					\
-      fputs (",24\n\t.type\t.", (FILE));				\
-      assemble_name ((FILE), (NAME));					\
-      fputs (",@function\n", (FILE));					\
-      if (TREE_PUBLIC (DECL) && ! DECL_WEAK (DECL))			\
-        {								\
-	  fputs ("\t.globl\t.", (FILE));				\
-	  assemble_name ((FILE), (NAME));				\
-	  putc ('\n', (FILE));						\
-        }								\
-      ASM_DECLARE_RESULT ((FILE), DECL_RESULT (DECL));			\
-      putc ('.', (FILE));						\
-      ASM_OUTPUT_LABEL ((FILE), (NAME));				\
-    }									\
-  while (0)
-
 /* This is how to declare the size of a function.  */
 #undef	ASM_DECLARE_FUNCTION_SIZE
 #define	ASM_DECLARE_FUNCTION_SIZE(FILE, FNAME, DECL)			\
@@ -307,9 +384,13 @@
     {									\
       if (!flag_inhibit_size_directive)					\
 	{								\
-	  fputs ("\t.size\t.", (FILE));					\
+	  fputs ("\t.size\t", (FILE));					\
+	  if (TARGET_64BIT)						\
+	    putc ('.', (FILE));						\
 	  assemble_name ((FILE), (FNAME));				\
-	  fputs (",.-.", (FILE));					\
+	  fputs (",.-", (FILE));					\
+	  if (TARGET_64BIT)						\
+	    putc ('.', (FILE));						\
 	  assemble_name ((FILE), (FNAME));				\
 	  putc ('\n', (FILE));						\
 	}								\
@@ -336,10 +417,16 @@
        || (GET_CODE (X) == CONST_INT 					\
 	   && GET_MODE_BITSIZE (MODE) <= GET_MODE_BITSIZE (Pmode))	\
        || (GET_CODE (X) == CONST_DOUBLE					\
-	   && (TARGET_POWERPC64						\
-	       || TARGET_MINIMAL_TOC					\
-	       || (GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT		\
-		   && ! TARGET_NO_FP_IN_TOC)))))
+	   && ((TARGET_64BIT						\
+		&& (TARGET_POWERPC64					\
+		    || TARGET_MINIMAL_TOC				\
+		    || (GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT	\
+			&& ! TARGET_NO_FP_IN_TOC)))			\
+	       || (!TARGET_64BIT					\
+		   && !TARGET_NO_FP_IN_TOC				\
+		   && !TARGET_RELOCATABLE				\
+		   && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT	\
+		   && BITS_PER_WORD == HOST_BITS_PER_INT)))))
 
 /* This is the same as the dbxelf.h version, except that we need to
    use the function code label, not the function descriptor.  */
@@ -352,7 +439,9 @@ do									\
     ASM_GENERATE_INTERNAL_LABEL (temp, "LM", sym_lineno);		\
     fprintf (FILE, "\t.stabn 68,0,%d,", LINE);				\
     assemble_name (FILE, temp);						\
-    fputs ("-.", FILE);							\
+    putc ('-', FILE);							\
+    if (TARGET_64BIT)							\
+      putc ('.', FILE);							\
     assemble_name (FILE,						\
 		   XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0));\
     putc ('\n', FILE);							\
@@ -373,7 +462,8 @@ while (0)
 	flab = IDENTIFIER_POINTER (current_function_func_begin_label);	\
       else								\
 	{								\
-	  putc ('.', FILE);						\
+	  if (TARGET_64BIT)						\
+	    putc ('.', FILE);						\
 	  flab = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);	\
 	}								\
       assemble_name (FILE, flab);					\
@@ -390,19 +480,175 @@ while (0)
     {									\
       fprintf (FILE, "%s\"\",%d,0,0,", ASM_STABS_OP, N_FUN);		\
       assemble_name (FILE, LSCOPE);					\
-      fputs ("-.", FILE);						\
+      putc ('-', FILE);							\
+      if (TARGET_64BIT)							\
+        putc ('.', FILE);						\
       assemble_name (FILE, XSTR (XEXP (DECL_RTL (DECL), 0), 0));	\
       putc ('\n', FILE);						\
     }									\
   while (0)
 
-/* Override sysv4.h as these are ABI_V4 only.  */
-#undef	ASM_OUTPUT_REG_PUSH
-#undef	ASM_OUTPUT_REG_POP
-
 /* Select a format to encode pointers in exception handling data.  CODE
    is 0 for data, 1 for code labels, 2 for function pointers.  GLOBAL is
    true if the symbol may be affected by dynamic relocations.  */
 #undef	ASM_PREFERRED_EH_DATA_FORMAT
 #define	ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
-  (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | DW_EH_PE_udata8)
+  ((TARGET_64BIT || flag_pic || TARGET_RELOCATABLE)			\
+   ? (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel		\
+      | (TARGET_64BIT ? DW_EH_PE_udata8 : DW_EH_PE_sdata4))		\
+   : DW_EH_PE_absptr)
+
+/* For backward compatibility, we must continue to use the AIX
+   structure return convention.  */
+#undef DRAFT_V4_STRUCT_RET
+#define DRAFT_V4_STRUCT_RET (!TARGET_64BIT)
+
+/* Do code reading to identify a signal frame, and set the frame
+   state data appropriately.  See unwind-dw2.c for the structs.  */
+
+#ifdef IN_LIBGCC2
+#include <signal.h>
+#include <sys/ucontext.h>
+
+#ifdef __powerpc64__
+enum { SIGNAL_FRAMESIZE = 128 };
+#else
+enum { SIGNAL_FRAMESIZE = 64 };
+#endif
+#endif
+
+#ifdef __powerpc64__
+
+#define MD_FALLBACK_FRAME_STATE_FOR(CONTEXT, FS, SUCCESS)		\
+  do {									\
+    unsigned char *pc_ = (CONTEXT)->ra;					\
+    struct sigcontext *sc_;						\
+    long new_cfa_;							\
+    int i_;								\
+									\
+    /* addi r1, r1, 128; li r0, 0x0077; sc  (sigreturn) */		\
+    /* addi r1, r1, 128; li r0, 0x00AC; sc  (rt_sigreturn) */		\
+    if (*(unsigned int *) (pc_+0) != 0x38210000 + SIGNAL_FRAMESIZE	\
+	|| *(unsigned int *) (pc_+8) != 0x44000002)			\
+      break;								\
+    if (*(unsigned int *) (pc_+4) == 0x38000077)			\
+      {									\
+	struct sigframe {						\
+	  char gap[SIGNAL_FRAMESIZE];					\
+	  struct sigcontext sigctx;					\
+	} *rt_ = (CONTEXT)->cfa;					\
+	sc_ = &rt_->sigctx;						\
+      }									\
+    else if (*(unsigned int *) (pc_+4) == 0x380000AC)			\
+      {									\
+	struct rt_sigframe {						\
+	  int tramp[6];							\
+	  struct siginfo *pinfo;					\
+	  struct ucontext *puc;						\
+	} *rt_ = (struct rt_sigframe *) pc_;				\
+	sc_ = &rt_->puc->uc_mcontext;					\
+      }									\
+    else								\
+      break;								\
+    									\
+    new_cfa_ = sc_->regs->gpr[STACK_POINTER_REGNUM];			\
+    (FS)->cfa_how = CFA_REG_OFFSET;					\
+    (FS)->cfa_reg = STACK_POINTER_REGNUM;				\
+    (FS)->cfa_offset = new_cfa_ - (long) (CONTEXT)->cfa;		\
+    									\
+    for (i_ = 0; i_ < 32; i_++)						\
+      if (i_ != STACK_POINTER_REGNUM)					\
+	{	    							\
+	  (FS)->regs.reg[i_].how = REG_SAVED_OFFSET;			\
+	  (FS)->regs.reg[i_].loc.offset 				\
+	    = (long)&(sc_->regs->gpr[i_]) - new_cfa_;			\
+	}								\
+									\
+    (FS)->regs.reg[LINK_REGISTER_REGNUM].how = REG_SAVED_OFFSET;	\
+    (FS)->regs.reg[LINK_REGISTER_REGNUM].loc.offset 			\
+      = (long)&(sc_->regs->link) - new_cfa_;				\
+									\
+    /* The unwinder expects the IP to point to the following insn,	\
+       whereas the kernel returns the address of the actual		\
+       faulting insn. We store NIP+4 in an unused register slot to	\
+       get the same result for multiple evaluations of the same signal	\
+       frame.  */							\
+    sc_->regs->gpr[47] = sc_->regs->nip + 4;  				\
+    (FS)->regs.reg[CR0_REGNO].how = REG_SAVED_OFFSET;			\
+    (FS)->regs.reg[CR0_REGNO].loc.offset 				\
+      = (long)&(sc_->regs->gpr[47]) - new_cfa_;				\
+    (FS)->retaddr_column = CR0_REGNO;					\
+    goto SUCCESS;							\
+  } while (0)
+
+#else
+
+#define MD_FALLBACK_FRAME_STATE_FOR(CONTEXT, FS, SUCCESS)		\
+  do {									\
+    unsigned char *pc_ = (CONTEXT)->ra;					\
+    struct sigcontext *sc_;						\
+    long new_cfa_;							\
+    int i_;								\
+									\
+    /* li r0, 0x7777; sc  (sigreturn old)  */				\
+    /* li r0, 0x0077; sc  (sigreturn new)  */				\
+    /* li r0, 0x6666; sc  (rt_sigreturn old)  */			\
+    /* li r0, 0x00AC; sc  (rt_sigreturn new)  */			\
+    if (*(unsigned int *) (pc_+4) != 0x44000002)			\
+      break;								\
+    if (*(unsigned int *) (pc_+0) == 0x38007777				\
+	|| *(unsigned int *) (pc_+0) == 0x38000077)			\
+      {									\
+	struct sigframe {						\
+	  char gap[SIGNAL_FRAMESIZE];					\
+	  struct sigcontext sigctx;					\
+	} *rt_ = (CONTEXT)->cfa;					\
+	sc_ = &rt_->sigctx;						\
+      }									\
+    else if (*(unsigned int *) (pc_+0) == 0x38006666			\
+	     || *(unsigned int *) (pc_+0) == 0x380000AC)		\
+      {									\
+	struct rt_sigframe {						\
+	  char gap[SIGNAL_FRAMESIZE];					\
+	  unsigned long _unused[2];					\
+	  struct siginfo *pinfo;					\
+	  void *puc;							\
+	  struct siginfo info;						\
+	  struct ucontext uc;						\
+	} *rt_ = (CONTEXT)->cfa;					\
+	sc_ = &rt_->uc.uc_mcontext;					\
+      }									\
+    else								\
+      break;								\
+    									\
+    new_cfa_ = sc_->regs->gpr[STACK_POINTER_REGNUM];			\
+    (FS)->cfa_how = CFA_REG_OFFSET;					\
+    (FS)->cfa_reg = STACK_POINTER_REGNUM;				\
+    (FS)->cfa_offset = new_cfa_ - (long) (CONTEXT)->cfa;		\
+    									\
+    for (i_ = 0; i_ < 32; i_++)						\
+      if (i_ != STACK_POINTER_REGNUM)					\
+	{	    							\
+	  (FS)->regs.reg[i_].how = REG_SAVED_OFFSET;			\
+	  (FS)->regs.reg[i_].loc.offset 				\
+	    = (long)&(sc_->regs->gpr[i_]) - new_cfa_;			\
+	}								\
+									\
+    (FS)->regs.reg[LINK_REGISTER_REGNUM].how = REG_SAVED_OFFSET;	\
+    (FS)->regs.reg[LINK_REGISTER_REGNUM].loc.offset 			\
+      = (long)&(sc_->regs->link) - new_cfa_;				\
+									\
+    /* The unwinder expects the IP to point to the following insn,	\
+       whereas the kernel returns the address of the actual		\
+       faulting insn. We store NIP+4 in an unused register slot to	\
+       get the same result for multiple evaluations of the same signal	\
+       frame.  */							\
+    sc_->regs->gpr[47] = sc_->regs->nip + 4;  				\
+    (FS)->regs.reg[CR0_REGNO].how = REG_SAVED_OFFSET;			\
+    (FS)->regs.reg[CR0_REGNO].loc.offset 				\
+      = (long)&(sc_->regs->gpr[47]) - new_cfa_;				\
+    (FS)->retaddr_column = CR0_REGNO;					\
+    goto SUCCESS;							\
+  } while (0)
+
+#endif
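
Note on the 64-bit MD_FALLBACK_FRAME_STATE_FOR above: it recognises a signal
frame by reading three instruction words at the return address and comparing
them with the kernel's trampoline.  A minimal sketch of just that matching
step, using the encodings quoted in the macro's comments; the enum and
function names are hypothetical, and the words are taken from a plain array
rather than from (CONTEXT)->ra.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum tramp_kind { TRAMP_NONE, TRAMP_SIGRETURN, TRAMP_RT_SIGRETURN };

#define INSN_ADDI_R1_R1  0x38210000u  /* addi r1,r1,IMM (IMM in low 16 bits) */
#define INSN_LI_R0       0x38000000u  /* li r0,IMM */
#define INSN_SC          0x44000002u  /* sc */
#define FRAMESIZE_64     128u         /* SIGNAL_FRAMESIZE on powerpc64 */

/* Classify the three words at the return address, mirroring the tests
   made by the 64-bit macro in the patch.  */
enum tramp_kind
classify_ppc64_trampoline (const uint32_t insn[3])
{
  assert (insn != NULL);
  if (insn[0] != INSN_ADDI_R1_R1 + FRAMESIZE_64 || insn[2] != INSN_SC)
    return TRAMP_NONE;
  if (insn[1] == INSN_LI_R0 + 0x0077u)   /* sigreturn */
    return TRAMP_SIGRETURN;
  if (insn[1] == INSN_LI_R0 + 0x00ACu)   /* rt_sigreturn */
    return TRAMP_RT_SIGRETURN;
  return TRAMP_NONE;
}
```

The 32-bit variant differs in that the trampoline has no leading addi, so
the li word is at offset 0 and the sc word at offset 4.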
--- gcc/config/rs6000/rs6000.md.jj	2003-05-26 17:20:12.000000000 -0400
+++ gcc/config/rs6000/rs6000.md	2003-05-26 19:17:05.000000000 -0400
@@ -14581,9 +14581,8 @@
   ""
   "
 {
-#if TARGET_AIX
-  rs6000_emit_eh_toc_restore (EH_RETURN_STACKADJ_RTX);
-#endif
+  if (TARGET_AIX)
+    rs6000_emit_eh_toc_restore (EH_RETURN_STACKADJ_RTX);
   if (TARGET_32BIT)
     emit_insn (gen_eh_set_lr_si (operands[0]));
   else
--- gcc/config/rs6000/default64.h.jj	2003-05-26 19:17:05.000000000 -0400
+++ gcc/config/rs6000/default64.h	2003-05-26 19:17:05.000000000 -0400
@@ -0,0 +1,24 @@
+/* Definitions of target machine for GNU compiler,
+   for 64 bit powerpc linux defaulting to -m64.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+
+This file is part of GNU CC.
+
+GNU CC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU CC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU CC; see the file COPYING.  If not, write to
+the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02111-1307, USA.  */
+
+#undef TARGET_DEFAULT
+#define TARGET_DEFAULT \
+  (MASK_POWERPC | MASK_POWERPC64 | MASK_64BIT | MASK_NEW_MNEMONICS)
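
Note on how this TARGET_DEFAULT is consumed: in the bi-arch OVERRIDE_OPTIONS
and ASM_FILE_START definitions in linux64.h, the configured default CPU is
dropped whenever the word size selected on the command line differs from the
default one, which is tested with (TARGET_DEFAULT ^ target_flags) &
MASK_64BIT.  A minimal sketch of that test follows.  DEMO_MASK_64BIT is a
made-up bit value (the real MASK_64BIT is defined in rs6000.h) and
effective_default_cpu is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up bit for illustration only; not the real MASK_64BIT value.  */
#define DEMO_MASK_64BIT 0x20u

/* Return CONFIGURED_CPU when the selected word size matches the default,
   and NULL when the user switched it, as the patch does by passing
   (char *) 0 instead of TARGET_CPU_DEFAULT.  */
const char *
effective_default_cpu (unsigned target_default, unsigned target_flags,
                       const char *configured_cpu)
{
  assert (configured_cpu != NULL);
  return ((target_default ^ target_flags) & DEMO_MASK_64BIT)
         ? NULL : configured_cpu;
}
```

Only the 64-bit bit takes part in the comparison, so differences in other
flag bits do not discard the default CPU.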
--- gcc/config/rs6000/sysv4.h.jj	2003-05-26 17:20:12.000000000 -0400
+++ gcc/config/rs6000/sysv4.h	2003-05-26 19:17:05.000000000 -0400
@@ -149,6 +149,10 @@ extern const char *rs6000_tls_size_strin
     N_("Set the PPC_EMB bit in the ELF flags header") },		\
   { "windiss",           0, N_("Use the WindISS simulator") },          \
   { "shlib",		 0, N_("no description yet") },			\
+  { "64",		 MASK_64BIT | MASK_POWERPC64 | MASK_POWERPC,	\
+			 N_("Generate 64-bit code") },			\
+  { "32",		 - (MASK_64BIT | MASK_POWERPC64),		\
+			 N_("Generate 32-bit code") },			\
   EXTRA_SUBTARGET_SWITCHES						\
   { "newlib",		 0, N_("no description yet") },
 
@@ -172,6 +176,9 @@ do {									\
   if (!g_switch_set)							\
     g_switch_value = SDATA_DEFAULT_SIZE;				\
 									\
+  if (rs6000_abi_name == NULL)						\
+    rs6000_abi_name = RS6000_ABI_NAME;					\
+									\
   if (!strcmp (rs6000_abi_name, "sysv"))				\
     rs6000_current_abi = ABI_V4;					\
   else if (!strcmp (rs6000_abi_name, "sysv-noeabi"))			\
@@ -274,7 +281,7 @@ do {									\
 	     rs6000_abi_name);						\
     }									\
 									\
-  if (flag_pic > 1 && rs6000_current_abi == ABI_AIX)			\
+  if (!TARGET_64BIT && flag_pic > 1 && rs6000_current_abi == ABI_AIX)	\
     {									\
       flag_pic = 0;							\
       error ("-fPIC and -mcall-%s are incompatible",			\
@@ -293,9 +300,16 @@ do {									\
 									\
   else if (TARGET_RELOCATABLE)						\
     flag_pic = 2;							\
-									\
 } while (0)
 
+#ifndef RS6000_BI_ARCH
+# define SUBSUBTARGET_OVERRIDE_OPTIONS					\
+do {									\
+  if ((TARGET_DEFAULT ^ target_flags) & MASK_64BIT)			\
+    error ("-m%s not supported in this configuration",			\
+	   (target_flags & MASK_64BIT) ? "64" : "32");			\
+} while (0)
+#endif
 
 /* Override rs6000.h definition.  */
 #undef	TARGET_DEFAULT
@@ -590,51 +604,7 @@ extern int rs6000_pic_labelno;
 /* Override elfos.h definition.  */
 #undef	ASM_DECLARE_FUNCTION_NAME
 #define ASM_DECLARE_FUNCTION_NAME(FILE, NAME, DECL)			\
-  do {									\
-    const char *const init_ptr = (TARGET_64BIT) ? ".quad" : ".long";	\
-									\
-    if (TARGET_RELOCATABLE 						\
-	&& (get_pool_size () != 0 || current_function_profile)		\
-	&& uses_TOC())							\
-      {									\
-	char buf[256];							\
-									\
-	(*targetm.asm_out.internal_label) (FILE, "LCL", rs6000_pic_labelno); \
-									\
-	ASM_GENERATE_INTERNAL_LABEL (buf, "LCTOC", 1);			\
-	fprintf (FILE, "\t%s ", init_ptr);				\
-	assemble_name (FILE, buf);					\
-	putc ('-', FILE);						\
-	ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);	\
-	assemble_name (FILE, buf);					\
-	putc ('\n', FILE);						\
-      }									\
-									\
-    ASM_OUTPUT_TYPE_DIRECTIVE (FILE, NAME, "function");			\
-    ASM_DECLARE_RESULT (FILE, DECL_RESULT (DECL));			\
-									\
-    if (DEFAULT_ABI == ABI_AIX)						\
-      {									\
-	const char *desc_name, *orig_name;				\
-									\
-        orig_name = (*targetm.strip_name_encoding) (NAME);		\
-        desc_name = orig_name;						\
-	while (*desc_name == '.')					\
-	  desc_name++;							\
-									\
-	if (TREE_PUBLIC (DECL))						\
-	  fprintf (FILE, "\t.globl %s\n", desc_name);			\
-									\
-	fprintf (FILE, "%s\n", MINIMAL_TOC_SECTION_ASM_OP);		\
-	fprintf (FILE, "%s:\n", desc_name);				\
-	fprintf (FILE, "\t%s %s\n", init_ptr, orig_name);		\
-	fprintf (FILE, "\t%s _GLOBAL_OFFSET_TABLE_\n", init_ptr);	\
-	if (DEFAULT_ABI == ABI_AIX)					\
-	  fprintf (FILE, "\t%s 0\n", init_ptr);				\
-	fprintf (FILE, "\t.previous\n");				\
-      }									\
-    ASM_OUTPUT_LABEL (FILE, NAME);					\
-  } while (0)
+  rs6000_elf_declare_function_name ((FILE), (NAME), (DECL))
 
 /* The USER_LABEL_PREFIX stuff is affected by the -fleading-underscore
    flag.  The LOCAL_LABEL_PREFIX variable is used by dbxelf.h.  */
@@ -789,6 +759,25 @@ extern int fixuplabelno;
 #define	TARGET_VERSION fprintf (stderr, " (PowerPC System V.4)");
 #endif
 \f
+#define TARGET_OS_SYSV_CPP_BUILTINS()	  \
+  do                                      \
+    {                                     \
+      if (flag_pic == 1)		  \
+        {				  \
+	  builtin_define ("__pic__=1");	  \
+	  builtin_define ("__PIC__=1");	  \
+        }				  \
+      else if (flag_pic == 2)		  \
+        {				  \
+	  builtin_define ("__pic__=2");	  \
+	  builtin_define ("__PIC__=2");	  \
+        }				  \
+      if (target_flags_explicit		  \
+	  & MASK_RELOCATABLE)		  \
+	builtin_define ("_RELOCATABLE");  \
+    }                                     \
+  while (0)
+
 #ifndef	TARGET_OS_CPP_BUILTINS
 #define TARGET_OS_CPP_BUILTINS()          \
   do                                      \
@@ -800,6 +789,7 @@ extern int fixuplabelno;
       builtin_assert ("system=svr4");     \
       builtin_assert ("cpu=powerpc");     \
       builtin_assert ("machine=powerpc"); \
+      TARGET_OS_SYSV_CPP_BUILTINS ();	  \
     }                                     \
   while (0)
 #endif
@@ -945,14 +935,9 @@ extern int fixuplabelno;
 
 #define LINK_OS_DEFAULT_SPEC ""
 
-#define CPP_SYSV_SPEC \
-"%{mrelocatable*: -D_RELOCATABLE} \
-%{fpic: -D__PIC__=1 -D__pic__=1} \
-%{!fpic: %{fPIC: -D__PIC__=2 -D__pic__=2}}"
-
 /* Override rs6000.h definition.  */
 #undef	CPP_SPEC
-#define	CPP_SPEC "%{posix: -D_POSIX_SOURCE} %(cpp_sysv) \
+#define	CPP_SPEC "%{posix: -D_POSIX_SOURCE} \
 %{mads         : %(cpp_os_ads)         ; \
   myellowknife : %(cpp_os_yellowknife) ; \
   mmvme        : %(cpp_os_mvme)        ; \
@@ -1215,7 +1200,6 @@ ncrtn.o%s"
 /* Override rs6000.h definition.  */
 #undef	SUBTARGET_EXTRA_SPECS
 #define	SUBTARGET_EXTRA_SPECS						\
-  { "cpp_sysv",			CPP_SYSV_SPEC },			\
   { "crtsavres_default",        CRTSAVRES_DEFAULT_SPEC },              \
   { "lib_ads",			LIB_ADS_SPEC },				\
   { "lib_yellowknife",		LIB_YELLOWKNIFE_SPEC },			\
@@ -1284,7 +1268,10 @@ ncrtn.o%s"
   { "cpp_os_linux",		CPP_OS_LINUX_SPEC },			\
   { "cpp_os_netbsd",		CPP_OS_NETBSD_SPEC },			\
   { "cpp_os_windiss",           CPP_OS_WINDISS_SPEC },                  \
-  { "cpp_os_default",		CPP_OS_DEFAULT_SPEC },
+  { "cpp_os_default",		CPP_OS_DEFAULT_SPEC },			\
+  SUBSUBTARGET_EXTRA_SPECS
+
+#define	SUBSUBTARGET_EXTRA_SPECS
 
 /* Define this macro as a C expression for the initializer of an
    array of string to tell the driver program which options are
--- gcc/config/rs6000/biarch64.h.jj	2003-05-26 19:17:05.000000000 -0400
+++ gcc/config/rs6000/biarch64.h	2003-05-26 19:17:05.000000000 -0400
@@ -0,0 +1,22 @@
+/* Definitions of target machine for GNU compiler, for 32/64 bit powerpc.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+
+This file is part of GNU CC.
+
+GNU CC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU CC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU CC; see the file COPYING.  If not, write to
+the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02111-1307, USA.  */
+
+/* Specify this in a cover file to provide bi-architecture (32/64) support.  */
+#define RS6000_BI_ARCH 1
--- gcc/config/rs6000/rs6000-protos.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/rs6000/rs6000-protos.h	2003-05-26 19:17:05.000000000 -0400
@@ -155,6 +155,8 @@ extern void setup_incoming_varargs PARAM
 extern rtx rs6000_function_value (tree, tree);
 extern struct rtx_def *rs6000_va_arg PARAMS ((tree, tree));
 extern int function_ok_for_sibcall PARAMS ((tree));
+extern void rs6000_elf_declare_function_name
+  PARAMS ((FILE *, const char *, tree));
 #ifdef ARGS_SIZE_RTX
 /* expr.h defines ARGS_SIZE_RTX and `enum direction' */
 extern enum direction function_arg_padding PARAMS ((enum machine_mode, tree));
--- gcc/config/rs6000/x-linux64.jj	2003-05-26 19:17:05.000000000 -0400
+++ gcc/config/rs6000/x-linux64	2003-05-26 19:17:05.000000000 -0400
@@ -0,0 +1,2 @@
+# parts of gcc need more than a 64k TOC.
+X_CFLAGS = -mminimal-toc
--- gcc/config/rs6000/t-linux64.jj	2003-05-26 17:20:12.000000000 -0400
+++ gcc/config/rs6000/t-linux64	2003-05-26 19:17:05.000000000 -0400
@@ -1,19 +1,32 @@
-# Override t-linux.  We don't want -fPIC.
-CRTSTUFF_T_CFLAGS_S =
-TARGET_LIBGCC2_CFLAGS =
+# These functions are needed for soft-float on powerpc64-linux.
+LIB2FUNCS_EXTRA = tramp.S $(srcdir)/config/rs6000/ppc64-fp.c
 
-EXTRA_MULTILIB_PARTS=crtbegin.o crtend.o crtbeginS.o crtendS.o crtbeginT.o \
-			crtsavres.o
+# Modify the shared lib version file
+SHLIB_MKMAP_OPTS = -v dotsyms=1
 
-# These functions are needed for soft-float on powerpc64-linux.
-LIB2FUNCS_EXTRA = $(srcdir)/config/rs6000/ppc64-fp.c
+MULTILIB_OPTIONS        = m64/m32 msoft-float
+MULTILIB_DIRNAMES       = 64 32 nof
+MULTILIB_EXTRA_OPTS     = fPIC mstrict-align
+MULTILIB_EXCEPTIONS     = m64/msoft-float
+MULTILIB_EXCLUSIONS     = m64/!m32/msoft-float
+MULTILIB_OSDIRNAMES	= ../lib64 ../lib nof
+MULTILIB_MATCHES        = $(MULTILIB_MATCHES_FLOAT)
 
-# ld provides these functions as needed.
-crtsavres.S:
-	echo >crtsavres.S
+TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC
 
-$(T)crtsavres.o: crtsavres.S
-	$(GCC_FOR_TARGET) $(GCC_CFLAGS) $(INCLUDES) $(MULTILIB_CFLAGS) -c crtsavres.S -o $(T)crtsavres.o
+# We want fine grained libraries, so use the new code to build the
+# floating point emulation libraries.
+# fp-bit is only to be used by 32-bit multilibs
+FPBIT = fp-bit32.c
+DPBIT = dp-bit32.c
 
-# Modify the shared lib version file
-SHLIB_MKMAP_OPTS = -v dotsyms=1
+dp-bit32.c: $(srcdir)/config/fp-bit.c
+	( echo '#ifndef __powerpc64__'; \
+	  cat $(srcdir)/config/fp-bit.c; \
+	  echo '#endif' ) > dp-bit32.c
+
+fp-bit32.c: $(srcdir)/config/fp-bit.c
+	( echo '#ifndef __powerpc64__'; \
+	  echo '#define FLOAT'; \
+	  cat $(srcdir)/config/fp-bit.c; \
+	  echo '#endif' ) > fp-bit32.c
--- gcc/config/rs6000/eabi-ci.asm.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/rs6000/eabi-ci.asm	2003-05-26 19:17:05.000000000 -0400
@@ -41,6 +41,7 @@ Boston, MA 02111-1307, USA.
 
 #include <ppc-asm.h>
 
+#ifndef __powerpc64__
 	.section ".got","aw"
 	.globl	__GOT_START__
 	.type	__GOT_START__,@object
@@ -122,3 +123,4 @@ FUNC_START(__fini)
 	stwu 1,-16(1)
 	mflr 0
 	stw 0,20(1)
+#endif
--- gcc/config/rs6000/linux.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/rs6000/linux.h	2003-05-26 19:17:05.000000000 -0400
@@ -32,6 +32,7 @@
       builtin_define_std ("powerpc");     \
       builtin_assert ("cpu=powerpc");     \
       builtin_assert ("machine=powerpc"); \
+      TARGET_OS_SYSV_CPP_BUILTINS ();	  \
     }                                     \
   while (0)
 
@@ -78,6 +79,13 @@
 #undef  DRAFT_V4_STRUCT_RET
 #define DRAFT_V4_STRUCT_RET 1
 
+/* We are 32-bit all the time, so optimize a little.  */
+#undef TARGET_64BIT
+#define TARGET_64BIT 0
+ 
+/* We don't need to generate entries in .fixup.  */
+#undef RELOCATABLE_NEEDS_FIXUP
+
 /* Do code reading to identify a signal frame, and set the frame
    state data appropriately.  See unwind-dw2.c for the structs.  */
 
--- gcc/config/rs6000/eabi-cn.asm.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/rs6000/eabi-cn.asm	2003-05-26 19:17:05.000000000 -0400
@@ -39,6 +39,7 @@ Boston, MA 02111-1307, USA.
 	.file	"crtn.s"
 	.ident	"GNU C crtn.s"
 
+#ifndef __powerpc64__
 	.section ".got","aw"
 	.globl	__GOT_END__
 	.type	__GOT_END__,@object
@@ -113,3 +114,4 @@ __EH_FRAME_END__:
 	mtlr 0
 	addi 1,1,16
 	blr
+#endif
--- gcc/config/rs6000/tramp.asm.jj	2003-05-26 17:20:12.000000000 -0400
+++ gcc/config/rs6000/tramp.asm	2003-05-26 19:17:05.000000000 -0400
@@ -39,6 +39,7 @@
 	.section ".text"
 	#include "ppc-asm.h"
 
+#ifndef __powerpc64__
 	.type	trampoline_initial,@object
 	.align	2
 trampoline_initial:
@@ -107,3 +108,4 @@ FUNC_START(__trampoline_setup)
 	bl	JUMP_TARGET(abort)
 FUNC_END(__trampoline_setup)
 
+#endif
--- gcc/config/rs6000/sol-ci.asm.jj	2003-05-26 17:20:12.000000000 -0400
+++ gcc/config/rs6000/sol-ci.asm	2003-05-26 19:17:05.000000000 -0400
@@ -39,6 +39,7 @@
 	.file	"scrti.s"
 	.ident	"GNU C scrti.s"
 
+#ifndef __powerpc64__
 # Start of .text
 	.section ".text"
 	.globl	_ex_text0
@@ -102,3 +103,4 @@ _fini:	stwu	%r1,-16(%r1)
 	.space 4
 	.weak	environ
 	.set	environ,_environ
+#endif
--- gcc/config/rs6000/sol-cn.asm.jj	2003-05-26 17:20:12.000000000 -0400
+++ gcc/config/rs6000/sol-cn.asm	2003-05-26 19:17:05.000000000 -0400
@@ -39,6 +39,7 @@
 	.file	"scrtn.s"
 	.ident	"GNU C scrtn.s"
 
+#ifndef __powerpc64__
 # Default versions of exception handling register/deregister
 	.weak	_ex_register
 	.weak	_ex_deregister
@@ -80,3 +81,4 @@ _ex_range1:
 	mtlr	%r0
 	addi	%r1,%r1,16
 	blr
+#endif
--- gcc/config/rs6000/ppc-asm.h.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/config/rs6000/ppc-asm.h	2003-05-26 19:17:05.000000000 -0400
@@ -95,21 +95,15 @@
  * the real function with one or two leading periods respectively.
  */
 
-#ifdef _RELOCATABLE
-#define DESC_SECTION ".got2"
-#else
-#define DESC_SECTION ".got1"
-#endif
-
-#if defined(_CALL_AIXDESC)
+#if defined (__powerpc64__)
 #define FUNC_NAME(name) GLUE(.,name)
 #define JUMP_TARGET(name) FUNC_NAME(name)
 #define FUNC_START(name) \
-	.section DESC_SECTION,"aw"; \
+	.section ".opd","aw"; \
 name: \
-	.long GLUE(.,name); \
-	.long _GLOBAL_OFFSET_TABLE_; \
-	.long 0; \
+	.quad GLUE(.,name); \
+	.quad .TOC.@tocbase; \
+	.quad 0; \
 	.previous; \
 	.type GLUE(.,name),@function; \
 	.globl name; \
@@ -120,15 +114,22 @@ GLUE(.,name):
 GLUE(.L,name): \
 	.size GLUE(.,name),GLUE(.L,name)-GLUE(.,name)
 
-#elif defined (__powerpc64__)
+#elif defined(_CALL_AIXDESC)
+
+#ifdef _RELOCATABLE
+#define DESC_SECTION ".got2"
+#else
+#define DESC_SECTION ".got1"
+#endif
+
 #define FUNC_NAME(name) GLUE(.,name)
 #define JUMP_TARGET(name) FUNC_NAME(name)
 #define FUNC_START(name) \
-	.section ".opd","aw"; \
+	.section DESC_SECTION,"aw"; \
 name: \
-	.quad GLUE(.,name); \
-	.quad .TOC.@tocbase; \
-	.quad 0; \
+	.long GLUE(.,name); \
+	.long _GLOBAL_OFFSET_TABLE_; \
+	.long 0; \
 	.previous; \
 	.type GLUE(.,name),@function; \
 	.globl name; \
@@ -140,6 +141,7 @@ GLUE(.L,name): \
 	.size GLUE(.,name),GLUE(.L,name)-GLUE(.,name)
 
 #else
+
 #define FUNC_NAME(name) GLUE(__USER_LABEL_PREFIX__,name)
 #if defined __PIC__ || defined __pic__
 #define JUMP_TARGET(name) FUNC_NAME(name@plt)
@@ -155,4 +157,3 @@ FUNC_NAME(name):
 GLUE(.L,name): \
 	.size FUNC_NAME(name),GLUE(.L,name)-FUNC_NAME(name)
 #endif
-
--- gcc/config/xtensa/xtensa.h.jj	2003-05-26 17:20:12.000000000 -0400
+++ gcc/config/xtensa/xtensa.h	2003-05-26 19:17:05.000000000 -0400
@@ -944,7 +944,7 @@ typedef struct xtensa_args {
    _mcount uses a window size of 8 to make sure that it doesn't clobber
    any incoming argument values. */
 
-#define NO_PROFILE_COUNTERS
+#define NO_PROFILE_COUNTERS	1
 
 #define FUNCTION_PROFILER(FILE, LABELNO) \
   do {									\
--- gcc/config.gcc.jj	2003-05-26 17:20:10.000000000 -0400
+++ gcc/config.gcc	2003-05-26 19:17:05.000000000 -0400
@@ -1566,8 +1566,12 @@ powerpc-*-openbsd*)
 	extra_headers=
 	;;
 powerpc64-*-linux*)
-	tm_file="${tm_file} dbxelf.h elfos.h svr4.h freebsd-spec.h rs6000/sysv4.h rs6000/linux64.h"
-	tmake_file="rs6000/t-fprules t-slibgcc-elf-ver t-linux rs6000/t-linux64"
+	tm_file="rs6000/biarch64.h ${tm_file} dbxelf.h elfos.h svr4.h freebsd-spec.h rs6000/sysv4.h"
+	case x$with_cpu in
+	x|xpowerpc64|xdefault64) tm_file="${tm_file} rs6000/default64.h";;
+	esac
+	tm_file="${tm_file} rs6000/linux64.h"
+	tmake_file="rs6000/t-fprules t-slibgcc-elf-ver t-linux rs6000/t-ppccomm rs6000/t-linux64"
 	;;
 powerpc64-*-gnu*)
 	tm_file="${cpu_type}/${cpu_type}.h elfos.h svr4.h freebsd-spec.h gnu.h rs6000/sysv4.h rs6000/linux64.h rs6000/gnu.h"
@@ -1698,14 +1702,14 @@ rs6000-ibm-aix4.[12]* | powerpc-ibm-aix4
 	extra_headers=
 	;;
 rs6000-ibm-aix4.[3456789]* | powerpc-ibm-aix4.[3456789]*)
-	tm_file="${tm_file} rs6000/aix.h rs6000/aix43.h rs6000/xcoff.h"
+	tm_file="rs6000/biarch64.h ${tm_file} rs6000/aix.h rs6000/aix43.h rs6000/xcoff.h"
 	tmake_file=rs6000/t-aix43
 	use_collect2=yes
 	thread_file='aix'
 	extra_headers=
 	;;
 rs6000-ibm-aix5.1.* | powerpc-ibm-aix5.1.*)
-	tm_file="${tm_file} rs6000/aix.h rs6000/aix51.h rs6000/xcoff.h"
+	tm_file="rs6000/biarch64.h ${tm_file} rs6000/aix.h rs6000/aix51.h rs6000/xcoff.h"
 	tmake_file=rs6000/t-aix43
 	use_collect2=yes
 	thread_file='aix'
@@ -2291,7 +2295,7 @@ powerpc*-*-* | rs6000-*-*)
                 tm_file="$tm_file rs6000/altivec-defs.h"
         fi
 	case "x$with_cpu" in
-		x)
+		x | xdefault32 | xdefault64)
 			;;
 
 		xcommon | xpowerpc | xpowerpc64 \
--- gcc/final.c.jj	2003-05-26 17:20:11.000000000 -0400
+++ gcc/final.c	2003-05-26 19:17:05.000000000 -0400
@@ -1426,7 +1426,7 @@ profile_function (file)
      FILE *file ATTRIBUTE_UNUSED;
 {
 #ifndef NO_PROFILE_COUNTERS
-  int align = MIN (BIGGEST_ALIGNMENT, LONG_TYPE_SIZE);
+# define NO_PROFILE_COUNTERS	0
 #endif
 #if defined(ASM_OUTPUT_REG_PUSH)
 #if defined(STRUCT_VALUE_INCOMING_REGNUM) || defined(STRUCT_VALUE_REGNUM)
@@ -1437,12 +1437,14 @@ profile_function (file)
 #endif
 #endif /* ASM_OUTPUT_REG_PUSH */
 
-#ifndef NO_PROFILE_COUNTERS
-  data_section ();
-  ASM_OUTPUT_ALIGN (file, floor_log2 (align / BITS_PER_UNIT));
-  (*targetm.asm_out.internal_label) (file, "LP", current_function_funcdef_no);
-  assemble_integer (const0_rtx, LONG_TYPE_SIZE / BITS_PER_UNIT, align, 1);
-#endif
+  if (! NO_PROFILE_COUNTERS)
+    {
+      int align = MIN (BIGGEST_ALIGNMENT, LONG_TYPE_SIZE);
+      data_section ();
+      ASM_OUTPUT_ALIGN (file, floor_log2 (align / BITS_PER_UNIT));
+      (*targetm.asm_out.internal_label) (file, "LP", current_function_funcdef_no);
+      assemble_integer (const0_rtx, LONG_TYPE_SIZE / BITS_PER_UNIT, align, 1);
+    }
 
   function_section (current_function_decl);
 
--- gcc/configure.in.jj	2003-05-26 17:20:10.000000000 -0400
+++ gcc/configure.in	2003-05-26 19:17:05.000000000 -0400
@@ -2048,6 +2048,7 @@ gcc_cv_as_tls=no
 conftest_s=
 tls_first_major=
 tls_first_minor=
+tls_as_opt=
 case "$target" in
 changequote(,)dnl
   alpha*-*-*)
@@ -2147,6 +2148,7 @@ x3:	.space 4
 	addi 9,9,x2@tprel@l'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt=-a32
 	;;
   powerpc64-*-*)
     conftest_s='
@@ -2180,6 +2182,7 @@ x3:	.space 8
 	nop'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt=-a64
 	;;
   s390-*-*)
     conftest_s='
@@ -2198,6 +2201,7 @@ foo:	.long	25
 	bas	%r14,0(%r1,%r13):tls_ldcall:foo'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt=-m31
 	;;
   s390x-*-*)
     conftest_s='
@@ -2215,6 +2219,7 @@ foo:	.long	25
 	brasl	%r14,__tls_get_offset@PLT:tls_ldcall:foo'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt="-m64 -Aesame"
 	;;
 esac
 if test -z "$tls_first_major"; then
@@ -2225,7 +2230,7 @@ elif test $in_tree_gas = yes ; then
   ])
 elif test x$gcc_cv_as != x; then
   echo "$conftest_s" > conftest.s
-  if $gcc_cv_as --fatal-warnings -o conftest.o conftest.s > /dev/null 2>&1
+  if $gcc_cv_as $tls_as_opt --fatal-warnings -o conftest.o conftest.s > /dev/null 2>&1
   then
     gcc_cv_as_tls=yes
   fi
--- gcc/configure.jj	2003-05-26 17:20:10.000000000 -0400
+++ gcc/configure	2003-05-26 19:17:05.000000000 -0400
@@ -8123,6 +8123,7 @@ gcc_cv_as_tls=no
 conftest_s=
 tls_first_major=
 tls_first_minor=
+tls_as_opt=
 case "$target" in
   alpha*-*-*)
     conftest_s='
@@ -8220,6 +8221,7 @@ x3:	.space 4
 	addi 9,9,x2@tprel@l'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt=-a32
 	;;
   powerpc64-*-*)
     conftest_s='
@@ -8253,6 +8255,7 @@ x3:	.space 8
 	nop'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt=-a64
 	;;
   s390-*-*)
     conftest_s='
@@ -8271,6 +8274,7 @@ foo:	.long	25
 	bas	%r14,0(%r1,%r13):tls_ldcall:foo'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt=-m31
 	;;
   s390x-*-*)
     conftest_s='
@@ -8288,6 +8292,7 @@ foo:	.long	25
 	brasl	%r14,__tls_get_offset@PLT:tls_ldcall:foo'
 	tls_first_major=2
 	tls_first_minor=14
+	tls_as_opt="-m64 -Aesame"
 	;;
 esac
 if test -z "$tls_first_major"; then
@@ -8305,7 +8310,7 @@ fi
 
 elif test x$gcc_cv_as != x; then
   echo "$conftest_s" > conftest.s
-  if $gcc_cv_as --fatal-warnings -o conftest.o conftest.s > /dev/null 2>&1
+  if $gcc_cv_as $tls_as_opt --fatal-warnings -o conftest.o conftest.s > /dev/null 2>&1
   then
     gcc_cv_as_tls=yes
   fi

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 11:58 [PATCH] powerpc64-linux bi-arch support Jakub Jelinek
@ 2003-05-27 14:57 ` David Edelsohn
  2003-05-27 15:06   ` Jakub Jelinek
  2003-06-04 14:48 ` David Edelsohn
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-05-27 14:57 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Alan Modra, Janis Johnson, gcc-patches

	This patch is going to require a lot of careful review.  One issue
I notice is:

 /* PowerPC no-op instruction.  */
 #undef  RS6000_CALL_GLUE
-#define RS6000_CALL_GLUE "nop"
+#define RS6000_CALL_GLUE (TARGET_64BIT ? "nop" : "cror 31,31,31")
 

Why are you changing RS6000_CALL_GLUE based on TARGET_64BIT?  It is a new
mnemonic versus old mnemonic issue.  PPC32 Linux does not use it and PPC64
only uses "nop".

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 14:57 ` David Edelsohn
@ 2003-05-27 15:06   ` Jakub Jelinek
  2003-05-27 15:25     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2003-05-27 15:06 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Janis Johnson, gcc-patches

On Tue, May 27, 2003 at 10:46:53AM -0400, David Edelsohn wrote:
> 	This patch is going to require a lot of careful review.  One issue
> I notice is:
> 
>  /* PowerPC no-op instruction.  */
>  #undef  RS6000_CALL_GLUE
> -#define RS6000_CALL_GLUE "nop"
> +#define RS6000_CALL_GLUE (TARGET_64BIT ? "nop" : "cror 31,31,31")
>  
> 
> Why are you changing RS6000_CALL_GLUE based on TARGET_64BIT?  It is a new
> mnemonic versus old mnemonic issue.  PPC32 Linux does not use it and PPC64
> only uses "nop".

I think -m32 -mcall-aixdesc still uses it.
If it is ok for -m32 -mcall-aixdesc to emit nop instead of cror 31,31,31,
then this hunk can certainly go away.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 15:06   ` Jakub Jelinek
@ 2003-05-27 15:25     ` David Edelsohn
  2003-05-27 15:52       ` Jakub Jelinek
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-05-27 15:25 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Alan Modra, Janis Johnson, gcc-patches

>>>>> Jakub Jelinek writes:

>> Why are you changing RS6000_CALL_GLUE based on TARGET_64BIT?  It is a new
>> mnemonic versus old mnemonic issue.  PPC32 Linux does not use it and PPC64
>> only uses "nop".

Jakub> I think -m32 -mcall-aixdesc still uses it.
Jakub> If it is ok for -m32 -mcall-aixdesc to emit nop instead of cror 31,31,31,
Jakub> then this hunk can certainly go away.

	If that's the reason, then it certainly is not reflected in any
comment and the test should be based on aix-calldesc, not TARGET_64BIT.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 15:25     ` David Edelsohn
@ 2003-05-27 15:52       ` Jakub Jelinek
  2003-05-27 22:44         ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2003-05-27 15:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Janis Johnson, gcc-patches

On Tue, May 27, 2003 at 11:06:00AM -0400, David Edelsohn wrote:
> >>>>> Jakub Jelinek writes:
> 
> >> Why are you changing RS6000_CALL_GLUE based on TARGET_64BIT?  It is a new
> >> mnemonic versus old mnemonic issue.  PPC32 Linux does not use it and PPC64
> >> only uses "nop".
> 
> Jakub> I think -m32 -mcall-aixdesc still uses it.
> Jakub> If it is ok for -m32 -mcall-aixdesc to emit nop instead of cror 31,31,31,
> Jakub> then this hunk can certainly go away.
> 
> 	If that's the reason, then it certainly is not reflected in any
> comment and the test should be based on aix-calldesc, not TARGET_64BIT.

A comment can certainly be added (most of these linux64.h changes followed a
simple rule: if the sysv4.h/linux.h version differs from linux64.h, use the
previous linux64.h definition for TARGET_64BIT and the sysv4.h/linux.h
definition for !TARGET_64BIT).
But I don't see how the test can be based on aixdesc (ie. DEFAULT_ABI == ABI_AIX)
because ABI_AIX is set for both -m64 (which is -m64 -mcall-aixdesc with no
other -mcall-* variants allowed) and -m32 -mcall-aixdesc.
The -m64-only ppc64-linux compiler hardcodes DEFAULT_ABI to ABI_AIX, while with
the bi-arch patch it is a variable initialized to ABI_AIX, with an error if
command-line switches override it in -m64 mode.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 15:52       ` Jakub Jelinek
@ 2003-05-27 22:44         ` David Edelsohn
  2003-05-29 22:50           ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-05-27 22:44 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Alan Modra, Janis Johnson, gcc-patches

	My more basic question is when will RS6000_CALL_GLUE be used for
32-bit PPC mode linux64.h given this rewrite?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 22:44         ` David Edelsohn
@ 2003-05-29 22:50           ` Alan Modra
  2003-05-30  0:52             ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-05-29 22:50 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jakub Jelinek, Janis Johnson, gcc-patches

On Tue, May 27, 2003 at 04:06:43PM -0400, David Edelsohn wrote:
> 	My more basic question is when will RS6000_CALL_GLUE be used for
> 32-bit PPC mode linux64.h given this rewrite?

In exactly the same situations a non-biarch --target=powerpc-linux
compiler will use RS6000_CALL_GLUE.  ie. the patch doesn't change
existing behaviour.  You may have unearthed an existing bug, but
it's not a new one.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-29 22:50           ` Alan Modra
@ 2003-05-30  0:52             ` David Edelsohn
  2003-05-31 20:52               ` Jakub Jelinek
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-05-30  0:52 UTC (permalink / raw)
  To: Jakub Jelinek, Janis Johnson, gcc-patches

>>>>> Alan Modra writes:

Alan> In exactly the same situations a non-biarch --target=powerpc-linux
Alan> compiler will use RS6000_CALL_GLUE.  ie. the patch doesn't change
Alan> existing behaviour.  You may have unearthed an existing bug, but
Alan> it's not a new one.

	I don't think it necessarily is a bug because the macro is not
used except for ABI_AIX.

	I want to avoid adding new definitions or expanding definitions to
cover cases that don't exist because someone later will interpret that as
self-documenting code: "This was done on purpose."  Or later will enable
some 32/64-bit thunk mode that breaks because of this.  Where possible, I
would like to avoid making things selectable when the other case never
occurs.

	Please keep RS6000_CALL_GLUE uniformly defined to "nop".  If this
breaks something, we have a worse problem.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-30  0:52             ` David Edelsohn
@ 2003-05-31 20:52               ` Jakub Jelinek
  2003-06-02 20:24                 ` David Edelsohn
  2003-06-02 22:08                 ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Jakub Jelinek @ 2003-05-31 20:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Janis Johnson, gcc-patches

On Thu, May 29, 2003 at 06:05:51PM -0400, David Edelsohn wrote:
> >>>>> Alan Modra writes:
> 
> Alan> In exactly the same situations a non-biarch --target=powerpc-linux
> Alan> compiler will use RS6000_CALL_GLUE.  ie. the patch doesn't change
> Alan> existing behaviour.  You may have unearthed an existing bug, but
> Alan> it's not a new one.
> 
> 	I don't think it necessarily is a bug because the macro is not
> used except for ABI_AIX.
> 
> 	I want to avoid adding new definitions or expanding definitions to
> cover cases that don't exist because someone later will interpret that as
> self-documenting code: "This was done on purpose."  Or later will enable
> some 32/64-bit thunk mode that breaks because of this.  Where possible, I
> would like to avoid making things selectable when the other case never
> occurs.
> 
> 	Please keep RS6000_CALL_GLUE uniformly defined to "nop".  If this
> breaks something, we have a worse problem.

But then either linux.h would need to define RS6000_CALL_GLUE the same way,
or -mcall-aixdesc should be disallowed for powerpc-*-linux* and
powerpc64-*-linux* targets.
If linux.h inherits the cror definition from sysv4.h and linux64.h defines
it to nop unconditionally, then suddenly --target powerpc-ibm-linux and
--target powerpc64-ibm-linux --with-cpu=default32 behave differently,
which is IMHO not desirable. The only difference between those two should
be that the latter supports -m64 while the former does not.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-31 20:52               ` Jakub Jelinek
@ 2003-06-02 20:24                 ` David Edelsohn
  2003-06-02 22:08                 ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-06-02 20:24 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Janis Johnson, gcc-patches

	This patch does bootstrap on AIX, but I would like to do a more
complete regression test on AIX to make sure it doesn't break something. 

	I also want to stage in major patches one by one, so I would like
to get Aldy's split complex patch approved before this one.  Hopefully
Richard or someone will approve Aldy's patch soon.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-31 20:52               ` Jakub Jelinek
  2003-06-02 20:24                 ` David Edelsohn
@ 2003-06-02 22:08                 ` David Edelsohn
  2003-06-02 22:37                   ` Michael Meissner
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-06-02 22:08 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Janis Johnson, gcc-patches

>>>>> Jakub Jelinek writes:

Jakub> But then either linux.h would need to define RS6000_CALL_GLUE the same way,
Jakub> or -mcall-aixdesc should be disallowed for powerpc-*-linux* and
Jakub> powerpc64-*-linux* targets.
Jakub> If linux.h inherits the cror definition from sysv4.h and linux64.h defines
Jakub> it to nop unconditionally, then suddenly --target powerpc-ibm-linux and
Jakub> --target powerpc64-ibm-linux --with-cpu=default32 behave differently,
Jakub> which is IMHO not desirable. The only difference between those two should
Jakub> be that the latter supports -m64 while the former does not.

	I cannot see how -mcall-aixdesc makes sense for powerpc*-*-linux*

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-06-02 22:08                 ` David Edelsohn
@ 2003-06-02 22:37                   ` Michael Meissner
  0 siblings, 0 replies; 875+ messages in thread
From: Michael Meissner @ 2003-06-02 22:37 UTC (permalink / raw)
  To: gcc-patches

On Mon, Jun 02, 2003 at 06:08:25PM -0400, David Edelsohn wrote:
> >>>>> Jakub Jelinek writes:
> 
> Jakub> But then either linux.h would need to define RS6000_CALL_GLUE the same way,
> Jakub> or -mcall-aixdesc should be disallowed for powerpc-*-linux* and
> Jakub> powerpc64-*-linux* targets.
> Jakub> If linux.h inherits the cror definition from sysv4.h and linux64.h defines
> Jakub> it to nop unconditionally, then suddenly --target powerpc-ibm-linux and
> Jakub> --target powerpc64-ibm-linux --with-cpu=default32 behave differently,
> Jakub> which is IMHO not desirable. The only difference between those two should
> Jakub> be that the latter supports -m64 while the former does not.
> 
> 	I cannot see how -mcall-aixdesc makes sense for powerpc*-*-linux*

It doesn't for Linux.  It is or was needed for some of Cygnus's (now Red Hat)
embedded customers who started with GCC before I added the System V/eabi
calling sequences (roughly 1994-1995 time frame), and they needed to be able to
use the same calling sequence they were used to (basically AIX without the 3
word function pointer if memory serves).  I have no idea if those customers are
still using PowerPC platforms or not, or whether they are still using this
option.  To simplify the number of multilibs that were built, I restricted it
to just big endian, and I imagine a similar restriction to 32-bit would allow
the current users to use it, but encourage any new development to use the
official eabi calling sequence.

-- 
Michael Meissner
email: gnu@the-meissners.org
http://www.the-meissners.org

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 11:58 [PATCH] powerpc64-linux bi-arch support Jakub Jelinek
  2003-05-27 14:57 ` David Edelsohn
@ 2003-06-04 14:48 ` David Edelsohn
  2003-06-04 16:51 ` David Edelsohn
  2003-08-19 20:07 ` David Edelsohn
  3 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-06-04 14:48 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Alan Modra, Janis Johnson, gcc-patches

> Below is a forward port of the ppc64-linux bi-arch support.
> Similar patch got lots of testing on gcc-3_2-rhl8-branch and some on
> gcc-3_3-rhl-branch.

	I have bootstrapped this patch on AIX and it appears to be okay.
Please go ahead and commit it.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 11:58 [PATCH] powerpc64-linux bi-arch support Jakub Jelinek
  2003-05-27 14:57 ` David Edelsohn
  2003-06-04 14:48 ` David Edelsohn
@ 2003-06-04 16:51 ` David Edelsohn
  2003-06-04 16:58   ` Jakub Jelinek
  2003-08-19 20:07 ` David Edelsohn
  3 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-06-04 16:51 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Alan Modra, Janis Johnson, gcc-patches

	FYI, I received feedback from some colleagues who are trying to
use the patch to build a powerpc64-linux toolchain that the patch causes
problems when building a cross-compiler.

	The problem is that the gcc configuration wants
powerpc64-linux-XXX, while the bi-arch binutils with 64-bit support
installed powerpc-linux-XXX without the "64".

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-06-04 16:51 ` David Edelsohn
@ 2003-06-04 16:58   ` Jakub Jelinek
  0 siblings, 0 replies; 875+ messages in thread
From: Jakub Jelinek @ 2003-06-04 16:58 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Janis Johnson, gcc-patches

On Wed, Jun 04, 2003 at 12:51:44PM -0400, David Edelsohn wrote:
> 	FYI, I received feedback from some colleagues who are trying to
> use the patch to build a powerpc64-linux toolchain that the patch causes
> problems when building a cross-compiler.
> 
> 	The problem is that the gcc configuration wants
> powerpc64-linux-XXX, while the bi-arch binutils with 64-bit support
> installed powerpc-linux-XXX without the "64".

Depends on how binutils were configured. They can also be configured
for powerpc64-*-linux* with additional powerpc-*-linux* support.
Alternatively, one can symlink all powerpc-linux-XXX binaries to
powerpc64-linux-XXX if they were configured to support both 32-bit and
64-bit arch.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* powerpc64 crt* tweak
@ 2003-06-07  5:59             ` Alan Modra
  2003-06-07  6:00               ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-06-07  5:59 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

I've been implementing some linker magic for powerpc64-linux so that we
don't need to use -mminimal-toc on apps with large TOCs.  The idea is
to insert stubs to adjust r2 (TOC base) when calling between functions
that need to use a different TOC.  As for plt call stubs, the stub saves
the current r2 value on the stack and we restore r2 when the call
returns.  After the call we need a nop that the linker can replace with
a suitable instruction to do the restore.

	* config/rs6000/linux64.h (CRT_CALL_STATIC_FUNCTION): Define.

OK mainline and 3.3 branch?  Regression tested powerpc-linux bi-arch.

Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.42
diff -u -p -r1.42 linux64.h
--- gcc/config/rs6000/linux64.h	4 Jun 2003 16:44:51 -0000	1.42
+++ gcc/config/rs6000/linux64.h	6 Jun 2003 23:42:50 -0000
@@ -363,6 +363,18 @@
 #undef  RS6000_MCOUNT
 #define RS6000_MCOUNT "_mcount"
 
+#ifdef __powerpc64__
+/* _init and _fini functions are built from bits spread across many
+   object files, each potentially with a different TOC pointer.  For
+   that reason, place a nop after the call so that the linker can
+   restore the TOC pointer if a TOC adjusting call stub is needed.  */
+#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
+  asm (SECTION_OP "\n"					\
+"	bl ." #FUNC "\n"				\
+"	nop\n"						\
+"	.previous");
+#endif
+
 /* FP save and restore routines.  */
 #undef  SAVE_FP_PREFIX
 #define SAVE_FP_PREFIX (TARGET_64BIT ? "._savef" : "_savefpr_")

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc64 crt* tweak
  2003-06-07  5:59             ` powerpc64 crt* tweak Alan Modra
@ 2003-06-07  6:00               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-06-07  6:00 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> * config/rs6000/linux64.h (CRT_CALL_STATIC_FUNCTION): Define.

Alan> OK mainline and 3.3 branch?  Regression tested powerpc-linux bi-arch.

Yep.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
       [not found] ` <20030613020133.GK23826@bubble.sa.bigpond.net.au>
@ 2003-06-13 15:07   ` Alan Modra
  2003-06-13 15:38     ` David Edelsohn
  2003-06-13 20:04     ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2003-06-13 15:07 UTC (permalink / raw)
  To: linas, gcc, gcc-patches

On Fri, Jun 13, 2003 at 11:31:33AM +0930, Alan Modra wrote:
> To expand on this:  Under linux lazy fp save/restore we take an   
> exception the first time (after a context switch) we use a fp
> temporary in user code.  This could significantly increase the cost of
> using a fp reg for moves.  IMO this is reason enough to change
> REG_ALLOC_ORDER for powerpc64.

Anton Blanchard ran some timing tests.  The exception hit costs
around 1000 cycles on a Power4 processor running Linux.  This patch
simply changes REG_ALLOC_ORDER to better reflect the fact that gprs
should be used before fprs.  I'd like something better, such as a
working version of Zack's patch to only use fprs for user fp operations,
but this is better than nothing.
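
As a rough picture of the mechanism, the following toy shows an
alternate order being copied over the global table when 64-bit code is
selected.  It is a sketch under stated assumptions: the eight-register
file, the TOY_ names and toy_override_options are all made up for
illustration, and the toy default simply assumes FPRs come first.

```c
#include <assert.h>
#include <string.h>

/* Toy register file: 0-3 stand for GPRs, 4-7 for FPRs.  The real port
   sizes this array with FIRST_PSEUDO_REGISTER.  */
#define TOY_NUM_REGS 8

/* Assumed default for the toy: FPRs are tried first.  */
int reg_alloc_order[TOY_NUM_REGS] = { 4, 5, 6, 7, 0, 1, 2, 3 };

/* Alternate order: GPRs first, FPRs only as a last resort.  */
#define TOY_ALT_REG_ALLOC_ORDER { 0, 1, 2, 3, 4, 5, 6, 7 }

/* Mirror the shape of the override hook: copy the alternate table over
   the global order when compiling 64-bit code, otherwise leave the
   default untouched.  */
void
toy_override_options (int target_64bit)
{
  if (target_64bit)
    {
      static const int order[TOY_NUM_REGS] = TOY_ALT_REG_ALLOC_ORDER;

      assert (sizeof (order) == sizeof (reg_alloc_order));
      memcpy (reg_alloc_order, order, sizeof (order));
    }
}
```

Note that this only changes preference.  Every register is still listed,
so an FPR can still be picked once the GPRs run out.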

	* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Use
	RS6000_ALT_REG_ALLOC_ORDER.
	* config/rs6000/rs6000.h: Formatting fixes.
	(REG_ALLOC_ORDER): Correct comment.
	(RS6000_ALT_REG_ALLOC_ORDER): Define.

Regression tested powerpc64-linux.  OK mainline?  3.3 branch?

Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.44
diff -u -p -r1.44 linux64.h
--- gcc/config/rs6000/linux64.h	7 Jun 2003 17:11:47 -0000	1.44
+++ gcc/config/rs6000/linux64.h	13 Jun 2003 14:33:30 -0000
@@ -67,6 +67,9 @@
     {								\
       if (TARGET_64BIT)						\
 	{							\
+	  static const int order[FIRST_PSEUDO_REGISTER]		\
+	    = RS6000_ALT_REG_ALLOC_ORDER;			\
+	  memcpy (reg_alloc_order, order, sizeof (order));	\
 	  if (DEFAULT_ABI != ABI_AIX)				\
 	    {							\
 	      DEFAULT_ABI = ABI_AIX;				\
Index: gcc/config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.278
diff -u -p -r1.278 rs6000.h
--- gcc/config/rs6000/rs6000.h	4 Jun 2003 17:50:43 -0000	1.278
+++ gcc/config/rs6000/rs6000.h	13 Jun 2003 14:33:33 -0000
@@ -781,8 +781,7 @@ extern int rs6000_alignment_flags;
    /* AltiVec registers.  */			   \
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-   1, 1						   \
-   , 1, 1                                          \
+   1, 1, 1, 1					   \
 }
 
 /* 1 for registers not available across function calls.
@@ -801,8 +800,7 @@ extern int rs6000_alignment_flags;
    /* AltiVec registers.  */			   \
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-   1, 1						   \
-   , 1, 1                                          \
+   1, 1, 1, 1					   \
 }
 
 /* Like `CALL_USED_REGISTERS' except this macro doesn't require that
@@ -820,8 +818,7 @@ extern int rs6000_alignment_flags;
    /* AltiVec registers.  */			   \
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-   0, 0						   \
-   , 0, 0                                          \
+   0, 0, 0, 0					   \
 }
 
 #define MQ_REGNO     64
@@ -861,8 +858,7 @@ extern int rs6000_alignment_flags;
 	mq		(not saved; best to use it if we can)
 	ctr		(not saved; when we have the choice ctr is better)
 	lr		(saved)
-        cr5, r1, r2, ap, xer, vrsave, vscr (fixed)
-	spe_acc, spefscr (fixed)
+	cr5, r1, r2, ap, xer (fixed, but note r2 exception on some ABIs)
 
 	AltiVec registers:
 	v0 - v1         (not saved or used for anything)
@@ -870,8 +866,10 @@ extern int rs6000_alignment_flags;
 	v2              (not saved; incoming vector arg reg; return value)
 	v19 - v14       (not saved or used for anything)
 	v31 - v20       (saved; order given to save least number)
+
+ 	vrsave, vscr, spe_acc, spefscr (fixed)
 */
-						
+
 #if FIXED_R2 == 1
 #define MAYBE_R2_AVAILABLE
 #define MAYBE_R2_FIXED 2,
@@ -900,8 +898,36 @@ extern int rs6000_alignment_flags;
    79,							\
    96, 95, 94, 93, 92, 91,				\
    108, 107, 106, 105, 104, 103, 102, 101, 100, 99, 98,	\
-   97, 109, 110						\
-   , 111, 112                                              \
+   97,							\
+   109, 110, 111, 112					\
+}
+
+/* Used by powerpc64-linux.  Places fp regs after gp regs, so that
+   DImode moves tend to use a gp reg rather than a fp reg.  Usage of fp
+   regs under Linux' lazy fp save/restore means an exception is taken on
+   first use of a fp reg.  */
+#define RS6000_ALT_REG_ALLOC_ORDER			\
+  {75, 74, 69, 68, 72, 71, 70,				\
+   0, MAYBE_R2_AVAILABLE				\
+   9, 11, 10, 8, 7, 6, 5, 4,				\
+   3,							\
+   31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19,	\
+   18, 17, 16, 15, 14, 13, 12,				\
+   64, 66, 65, 						\
+   73, 1, MAYBE_R2_FIXED 67, 76,			\
+   32, 							\
+   45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34,	\
+   33,							\
+   63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51,	\
+   50, 49, 48, 47, 46, 					\
+   /* AltiVec registers.  */				\
+   77, 78,						\
+   90, 89, 88, 87, 86, 85, 84, 83, 82, 81, 80,		\
+   79,							\
+   96, 95, 94, 93, 92, 91,				\
+   108, 107, 106, 105, 104, 103, 102, 101, 100, 99, 98,	\
+   97,							\
+   109, 110, 111, 112					\
 }
 
 /* True if register is floating-point.  */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 15:07   ` ppc64 floating point usage [was Re: PPC64 Compiler bug !!] Alan Modra
@ 2003-06-13 15:38     ` David Edelsohn
  2003-06-13 20:04     ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-06-13 15:38 UTC (permalink / raw)
  To: linas, gcc-patches

	Thanks for sending this patch.  I will take it under
consideration. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 15:07   ` ppc64 floating point usage [was Re: PPC64 Compiler bug !!] Alan Modra
  2003-06-13 15:38     ` David Edelsohn
@ 2003-06-13 20:04     ` David Edelsohn
  2003-06-13 20:06       ` Jakub Jelinek
                         ` (2 more replies)
  1 sibling, 3 replies; 875+ messages in thread
From: David Edelsohn @ 2003-06-13 20:04 UTC (permalink / raw)
  To: linas, gcc-patches

	Have you tested the change in allocation order patch on floating
point intensive code?

	How does this affect the register allocation for moves of floating
point values?  Does GCC now prefer GPRs?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 20:04     ` David Edelsohn
@ 2003-06-13 20:06       ` Jakub Jelinek
  2003-06-13 20:38         ` David Edelsohn
  2003-06-13 21:08       ` linas
  2003-08-08  7:24       ` Alan Modra
  2 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2003-06-13 20:06 UTC (permalink / raw)
  To: David Edelsohn; +Cc: linas, gcc-patches

On Fri, Jun 13, 2003 at 04:00:39PM -0400, David Edelsohn wrote:
> 	Have you tested the change in allocation order patch on floating
> point intensive code?
> 
> 	How does this affect the register allocation for moves of floating
> point values?  Does GCC now prefer GPRs?

I guess best would be if GCC could dynamically switch the reg alloc orders
based on whether FPRs will be used anyway or whether current function is
pure integer code.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 20:06       ` Jakub Jelinek
@ 2003-06-13 20:38         ` David Edelsohn
  2003-06-13 21:06           ` linas
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-06-13 20:38 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: linas, gcc-patches

>>>>> Jakub Jelinek writes:

Jakub> I guess best would be if GCC could dynamically switch the reg alloc orders
Jakub> based on whether FPRs will be used anyway or whether current function is
Jakub> pure integer code.

	Yes.  The difficulty is determining whether a function is pure
integer code.  Even more, one does not want to simply change the alloc
order but mark FPRs as fixed for pure integer code, which is what
-msoft-float does.
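
To make the distinction concrete, here is a toy allocator that walks an
allocation order and skips fixed registers.  It is a minimal sketch, not
GCC's allocator: toy_pick_reg and the four-register file (0-1 as GPRs,
2-3 as FPRs) are hypothetical.

```c
#include <assert.h>

#define TOY_NUM_REGS 4   /* toy: regs 0,1 are GPRs; regs 2,3 are FPRs */

/* Return the first register in ORDER that is neither fixed nor already
   in use, or -1 if no register is available.  */
int
toy_pick_reg (const int order[TOY_NUM_REGS],
              const int fixed[TOY_NUM_REGS],
              const int in_use[TOY_NUM_REGS])
{
  int i;

  for (i = 0; i < TOY_NUM_REGS; i++)
    {
      int r = order[i];

      assert (r >= 0 && r < TOY_NUM_REGS);
      if (!fixed[r] && !in_use[r])
        return r;
    }
  return -1;
}
```

With a GPR-first order and both GPRs busy, the toy falls back to FPR 2.
With the FPRs also marked fixed it returns -1, which in a real compiler
would mean a spill rather than a touched FPR.  That is the difference
between reordering and fixing.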

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 20:38         ` David Edelsohn
@ 2003-06-13 21:06           ` linas
  2003-06-13 21:17             ` Michael S. Zick
  2003-06-13 22:19             ` Janis Johnson
  0 siblings, 2 replies; 875+ messages in thread
From: linas @ 2003-06-13 21:06 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jakub Jelinek, gcc-patches, gcc

On Fri, Jun 13, 2003 at 04:09:06PM -0400, David Edelsohn wrote:
> >>>>> Jakub Jelinek writes:
> 
> Jakub> I guess best would be if GCC could dynamically switch the reg alloc orders
> Jakub> based on whether FPRs will be used anyway or whether current function is
> Jakub> pure integer code.

Why would this be best?

> 	Yes.  The difficulty is determining whether a function is pure
> integer code.  

I just finished reading the October archives on this issue.  One of
the notes hinted at a flag in some struct that indicated if a function
used float.  Not clear if the flag was hypothetical or real.

Barring further discussion of EH, setjmp/longjmp & variadic funcs,
it seems to me such a flag could be easily computed early on (although
it might slow down compilation a tad to do so).

> Even more, one does not want to simply change the alloc
> order but mark FPRs as fixed for pure integer code, which is what
> -msoft-float does.

Bird in hand vs. two in the bush.  We have Alan Modra's patch now, we
don't have an acceptable -mno-implicit-float patch.

There is concern that Alan's patch will negatively impact performance
of fp code.  Is there a way to unambiguously resolve this issue, or 
at least resolve it to everyone's satisfaction?

How would one do this? Visually inspect generated fp code? Run 
benchmarks? Which benchmarks? 

--linas

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 20:04     ` David Edelsohn
  2003-06-13 20:06       ` Jakub Jelinek
@ 2003-06-13 21:08       ` linas
  2003-08-08  7:24       ` Alan Modra
  2 siblings, 0 replies; 875+ messages in thread
From: linas @ 2003-06-13 21:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Fri, Jun 13, 2003 at 04:00:39PM -0400, David Edelsohn wrote:
> 	Have you tested the change in allocation order patch on floating
> point intensive code?
> 
> 	How does this affect the register allocation for moves of floating
> point values?  Does GCC now prefer GPRs?

I'll see if I can get a copy of Alan's compiler.  I can do some simple 
visual inspection of the fp code that it generates, but I suspect this
won't be sufficient to make the patch acceptable. What would be? Some
perf tests?


--linas
> 
> David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 21:06           ` linas
@ 2003-06-13 21:17             ` Michael S. Zick
  2003-06-14  3:14               ` Michael Meissner
  2003-06-13 22:19             ` Janis Johnson
  1 sibling, 1 reply; 875+ messages in thread
From: Michael S. Zick @ 2003-06-13 21:17 UTC (permalink / raw)
  To: linas, David Edelsohn; +Cc: Jakub Jelinek, gcc-patches, gcc

On Friday 13 June 2003 04:02 pm, linas@austin.ibm.com wrote:
> On Fri, Jun 13, 2003 at 04:09:06PM -0400, David Edelsohn wrote:
>
> Bird in hand vs. two in the bush.  We have Alan Modra's patch now, we
> don't have an acceptable -mno-implicit-float patch.
>
> There is concern that Alan's patch will negatively impact performance
> of fp code.  Is there a way to unambiguously resolve this issue, or
> at least resolve it to everyone's satisfaction?
>
A patch for a related issue just appeared for gas-arm-Linux and
gas-arm-NetBSD...
<http://sources.redhat.com/ml/binutils/2003-06/msg00494.html>
<http://sources.redhat.com/ml/binutils/2003-06/msg00497.html>

Perhaps a similar route could be followed (i.e., -mfpu=none) to
support developers who need to "never, ever, for any reason"
touch the fp registers.
I imagine that option would just select a register/mode description
that does not mention fp registers. Ports might have to add 
descriptor support for this option.

Mike
>
> How would one do this? Visually inspect generated fp code? Run
> benchmarks? Which benchmarks?
>
> --linas

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 21:06           ` linas
  2003-06-13 21:17             ` Michael S. Zick
@ 2003-06-13 22:19             ` Janis Johnson
  1 sibling, 0 replies; 875+ messages in thread
From: Janis Johnson @ 2003-06-13 22:19 UTC (permalink / raw)
  To: linas; +Cc: David Edelsohn, Jakub Jelinek, gcc-patches, gcc

On Fri, Jun 13, 2003 at 04:02:25PM -0500, linas@austin.ibm.com wrote:
> Bird in hand vs. two in the bush.  We have Alan Modra's patch now, we
> don't have an acceptable -mno-implicit-float patch.
> 
> There is concern that Alan's patch will negatively impact performance
> of fp code.  Is there a way to unambiguously resolve this issue, or 
> at least resolve it to everyone's satisfaction?
> 
> How would one do this? Visually inspect generated fp code? Run 
> benchmarks? Which benchmarks? 

Try the floating point tests from SPEC CPU2000.

Janis

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-14  3:14               ` Michael Meissner
@ 2003-06-14  3:14                 ` gp
  2003-06-14  9:07                 ` Alan Modra
  2003-06-14 14:59                 ` Michael S. Zick
  2 siblings, 0 replies; 875+ messages in thread
From: gp @ 2003-06-14  3:14 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, gcc

<snip>

> > 
> > Perhaps a similar route could be followed (I.E: -mfpu=none) to
> > support developers that need the "never, ever, for any reason"
> > touch the fp registers.
> 
> Ummm, you already have that, it is spelled -msoft-float.  I don't understand
> what a separate switch buys you.
>

<snip>

> Note, at least a few years ago, getting USER space code to use floating point
> registers to speed up structure moves (since using the FP registers is the only
> way to move 64 bits at a time in a 32 bit environment) was a big deal.  I am
> worried that if we change the compiler, we may be slowing down these
> programs.
> 

I posted my patch on this issue because I need a solution for a 32-bit
environment.  I can turn off FP reg use completely with -msoft-float, or
I can leave it on, including the default of moving 64-bit ints through
the FP regs.

I want to be able to say: "Don't turn off the FPU (because it might be there, 
and if it isn't our emulator will be), but please use all of the GP regs 
first for integer operations". 

Having it optional would not slow down the programs for whom this feature was 
added, as the default behavior would remain the same.  But the behavior that 
was eliminated is sometimes exactly what I want.

To that end I like Alan Modra's patch, with the exception that I would like 
to be able to use this behavior in a 32-bit environment based on a command
line option.  It might make sense to make it optional for 64-bit platforms, 
too, if it turns out that the patch causes gcc to also prefer GP regs when 
doing FP operations (as was cautioned earlier in the thread).

Regards,
GP

> -- 
> Michael Meissner
> email: gnu@the-meissners.org
> http://www.the-meissners.org
> 

-- 

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 21:17             ` Michael S. Zick
@ 2003-06-14  3:14               ` Michael Meissner
  2003-06-14  3:14                 ` gp
                                   ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Michael Meissner @ 2003-06-14  3:14 UTC (permalink / raw)
  To: gcc-patches, gcc

On Fri, Jun 13, 2003 at 04:07:45PM -0500, Michael S. Zick wrote:
> On Friday 13 June 2003 04:02 pm, linas@austin.ibm.com wrote:
> > On Fri, Jun 13, 2003 at 04:09:06PM -0400, David Edelsohn wrote:
> >
> > Bird in hand vs. two in the bush.  We have Alan Modra's patch now, we
> > don't have an acceptable -mno-implicit-float patch.
> >
> > There is concern that Alan's patch will negatively impact performance
> > of fp code.  Is there a way to unambiguously resolve this issue, or
> > at least resolve it to everyone's satisfaction?
> >
> A patch for a related issue just appeared for gas-arm-Linux and
> gas-arm-NetBSD...
> <http://sources.redhat.com/ml/binutils/2003-06/msg00494.html>
> <http://sources.redhat.com/ml/binutils/2003-06/msg00497.html>
> 
> Perhaps a similar route could be followed (I.E: -mfpu=none) to
> support developers that need the "never, ever, for any reason"
> touch the fp registers.

Ummm, you already have that, it is spelled -msoft-float.  I don't understand
what a separate switch buys you.

> I imagine that option would just select a register/mode description
> that does not mention fp registers. Ports might have to add 
> descriptor support for this option.

Ummm, one of the earlier messages stated this furor came about because a driver
writer did not use the -msoft-float option.  If they are going to neglect using
a switch that has been used by the Linux kernel for at least 4 years for all of
its PowerPC builds (for precisely this reason BTW), I can't really imagine them
using any new switch.

Note, at least a few years ago, getting USER space code to use floating point
registers to speed up structure moves (since using the FP registers is the only
way to move 64 bits at a time in a 32 bit environment) was a big deal.  I am
worried that if we change the compiler, we may be slowing down these
programs.

-- 
Michael Meissner
email: gnu@the-meissners.org
http://www.the-meissners.org

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-14  3:14               ` Michael Meissner
  2003-06-14  3:14                 ` gp
@ 2003-06-14  9:07                 ` Alan Modra
  2003-06-14 14:59                 ` Michael S. Zick
  2 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2003-06-14  9:07 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, gcc

On Fri, Jun 13, 2003 at 06:38:37PM -0400, Michael Meissner wrote:
> 
> Ummm, one of the earlier messages stated this furor came about because a driver
> writer did not use the -msoft-float option.  If they are going to neglect using
> a switch that has been used by the Linux kernel for at least 4 years for all of
> its PowerPC builds (for precisely this reason BTW), I can't really imagine them
> using any new switch.

Exactly.

> Note, at least a few years ago, getting USER space code to use floating point
> registers to speed up structure moves (since using the FP registers is the only
> way to move 64 bits at a time in a 32 bit environment) was a big deal.  I am
> worried that if we change the compiler, we may be slowing down these
> programs.

Which is why my patch just affects 64 bit code, and only Linux.  Linux
32 bit probably wants the same patch because contrary to what you say
above, using a fp reg for structure copies might be very slow.  You can
cause an exception due to the lazy fp save/restore used by Linux.
However, since the 32 bit case is a little more controversial, I
restricted my patch to 64 bit Linux.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-14  3:14               ` Michael Meissner
  2003-06-14  3:14                 ` gp
  2003-06-14  9:07                 ` Alan Modra
@ 2003-06-14 14:59                 ` Michael S. Zick
  2003-06-14 22:52                   ` Michael Meissner
  2 siblings, 1 reply; 875+ messages in thread
From: Michael S. Zick @ 2003-06-14 14:59 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, gcc

On Friday 13 June 2003 05:38 pm, Michael Meissner wrote:
> On Fri, Jun 13, 2003 at 04:07:45PM -0500, Michael S. Zick wrote:
> > On Friday 13 June 2003 04:02 pm, linas@austin.ibm.com wrote:
> > > On Fri, Jun 13, 2003 at 04:09:06PM -0400, David Edelsohn wrote:
> > >
> > > Bird in hand vs. two in the bush.  We have Alan Modra's patch now, we
> > > don't have an acceptable -mno-implicit-float patch.
> > >
> > > There is concern that Alan's patch will negatively impact performance
> > > of fp code.  Is there a way to unambiguously resolve this issue, or
> > > at least resolve it to everyone's satisfaction?
> >
> > A patch for a related issue just appeared for gas-arm-Linux and
> > gas-arm-NetBSD...
> > <http://sources.redhat.com/ml/binutils/2003-06/msg00494.html>
> > <http://sources.redhat.com/ml/binutils/2003-06/msg00497.html>
> >
> > Perhaps a similar route could be followed (I.E: -mfpu=none) to
> > support developers that need the "never, ever, for any reason"
> > touch the fp registers.
>
> Ummm, you already have that, it is spelled -msoft-float.  I don't
> understand what a separate switch buys you.
>
My bad, I misunderstood the issue.
I thought someone pointed out that -msoft-float did not eliminate
all usage of float instructions (hardware OR software).
Mike

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-14 14:59                 ` Michael S. Zick
@ 2003-06-14 22:52                   ` Michael Meissner
  0 siblings, 0 replies; 875+ messages in thread
From: Michael Meissner @ 2003-06-14 22:52 UTC (permalink / raw)
  To: gcc-patches, gcc

On Sat, Jun 14, 2003 at 08:59:25AM -0500, Michael S. Zick wrote:
> On Friday 13 June 2003 05:38 pm, Michael Meissner wrote:
> > On Fri, Jun 13, 2003 at 04:07:45PM -0500, Michael S. Zick wrote:
> > > On Friday 13 June 2003 04:02 pm, linas@austin.ibm.com wrote:
> > > > On Fri, Jun 13, 2003 at 04:09:06PM -0400, David Edelsohn wrote:
> > > >
> > > > Bird in hand vs. two in the bush.  We have Alan Modra's patch now, we
> > > > don't have an acceptable -mno-implicit-float patch.
> > > >
> > > > There is concern that Alan's patch will negatively impact performance
> > > > of fp code.  Is there a way to unambiguously resolve this issue, or
> > > > at least resolve it to everyone's satisfaction?
> > >
> > > A patch for a related issue just appeared for gas-arm-Linux and
> > > gas-arm-NetBSD...
> > > <http://sources.redhat.com/ml/binutils/2003-06/msg00494.html>
> > > <http://sources.redhat.com/ml/binutils/2003-06/msg00497.html>
> > >
> > > Perhaps a similar route could be followed (I.E: -mfpu=none) to
> > > support developers that need the "never, ever, for any reason"
> > > touch the fp registers.
> >
> > Ummm, you already have that, it is spelled -msoft-float.  I don't
> > understand what a separate switch buys you.
> >
> My bad, I misunderstood the issue.
> I thought someone pointed out that -msoft-float did not eliminate
> all usage of float instructions (hardware OR software).

You might have been thinking about the alpha.  IIRC, the alpha's -msoft-float
doesn't eliminate all fp regs because the arch. manual says that fp regs must
be provided even if you don't have fp instructions.  However, for at least the
last 8 years or so, the rs6000/powerpc port -msoft-float option disables all FP
registers from being allocated.

-- 
Michael Meissner
email: gnu@the-meissners.org
http://www.the-meissners.org

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-04-30 13:29     ` function parms in regs, patch 3 " Alan Modra
  2003-05-02  6:05       ` Jim Wilson
@ 2003-07-10  6:55       ` Jim Wilson
  2003-07-14  2:51         ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Jim Wilson @ 2003-07-10  6:55 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

We shouldn't need TREE_CODE checks in expr.h anymore.  The typedef for 
tree was moved to a new file coretypes.h which is included first in 
nearly every file, thus making these checks unnecessary.  It isn't your 
problem that they are there though, I just wanted to mention it because 
I noticed it.

This looks OK to me.  Much safer than your previous patch which defines 
and uses BLOCK_REG_PADDING for all targets.  That one looked scary to me.

You forgot to add documentation for the new BLOCK_REG_PADDING macro.

If feeling ambitious, you could start a list of things that we should do 
for gcc 4, such as define BLOCK_REG_PADDING by default for all targets, 
and put it on the projects page.

You might want to mention the rs6000 bits to an rs6000 maintainer, just 
so they aren't surprised by them.

Jim

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-07-10  6:55       ` Jim Wilson
@ 2003-07-14  2:51         ` Alan Modra
  2003-07-14  3:00           ` David Edelsohn
  2003-07-15 15:08           ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2003-07-14  2:51 UTC (permalink / raw)
  To: Jim Wilson, David Edelsohn; +Cc: gcc-patches

On Wed, Jul 09, 2003 at 11:55:26PM -0700, Jim Wilson wrote:
> We shouldn't need TREE_CODE checks in expr.h anymore.  The typedef for 
> tree was moved to a new file coretypes.h which is included first in 
> nearly every file, thus making these checks unnecessary.  It isn't your 
> problem that they are there though, I just wanted to mention it because 
> I noticed it.

OK, I took most of these out.  We still need #ifdef TREE_CODE around
uses of enum tree_code.

> This looks OK to me.  Much safer than your previous patch which defines 
> and uses BLOCK_REG_PADDING for all targets.  That one looked scary to me.
> 
> You forgot to add documentation for the new BLOCK_REG_PADDING macro.

Done.

> If feeling ambitious, you could start a list of things that we should do 
> for gcc 4, such as define BLOCK_REG_PADDING by default for all targets, 
> and put it on the projects page.

OK, I'll look at this later.

> You might want to mention the rs6000 bits to an rs6000 maintainer, just 
> so they aren't surprised by them.

Here is a revised patch, which I'll commit after David has had a look.
Some changes were necessary for the biarch patches that have gone into
rs6000/linux64.h, and to use C90 syntax.  Bootstrapped powerpc-linux,
regression tested powerpc-linux and powerpc64-linux.  I used Janis'
compat tests against gcc-3.2.3 and found scalar-by-value-3 and
scalar-return-3 failed on powerpc-linux, which I believe is expected due
to fixes in handling of complex types.  On powerpc64-linux, this patch
fixes gcc.dg/compat/struct-by-value-{2,3,4,5,8,11,12} and
gcc.dg/compat/struct-return-{2,3}.

David, the padding options I chose for powerpc64-linux are as we
discussed a (rather long) while ago, but I haven't changed anything
yet for AIX.  From our discussion, for AIX you probably want

#define AGGREGATE_PADDING_FIXED 1
#define AGGREGATES_PAD_UPWARD_ALWAYS 1

This patch does introduce a change on powerpc-linux too.  As indicated
in the function_arg_padding comment regarding -mstrict-align,
powerpc-linux-gcc will no longer pass certain structures differently
depending on -mstrict-align.

	* doc/tm.texi (BLOCK_REG_PADDING): Describe.
	* expr.h (struct locate_and_pad_arg_data): Add where_pad.
	(emit_group_load, emit_group_store): Adjust declarations.
	Remove most occurrences of #ifdef TREE_CODE.
	* expr.c (emit_group_load): Add "type" param, and use
	BLOCK_REG_PADDING to determine need for a shift.  Optimize non-
	aligned accesses if !SLOW_UNALIGNED_ACCESS.
	(emit_group_store): Likewise.
	(emit_push_insn, expand_assignment, store_expr, expand_expr): Adjust
	emit_group_load and emit_group_store calls.
	* calls.c (store_unaligned_arguments_into_pseudos): Tidy.  Use
	BLOCK_REG_PADDING to determine whether we need endian_correction.
	(load_register_parameters): Localize vars.  Handle shifting of
	small values to the correct end of regs.  Adjust emit_group_load
	call.
	(expand_call, emit_library_call_value_1): Adjust emit_group_load
	and emit_group_store calls.
	* function.c (assign_parms): Set mem alignment for stack slots.
	Adjust emit_group_store call.  Store values at the "wrong" end
	of regs to the stack.  Use BLOCK_REG_PADDING.
	(locate_and_pad_parm): Save where_pad.
	(expand_function_end): Adjust emit_group_load call.
	* stmt.c (expand_value_return): Adjust emit_group_load call.
	* Makefile.in (calls.o): Depend on $(OPTABS_H).

	* config/rs6000/linux64.h (TARGET_LITTLE_ENDIAN): Redefine as 0.
	(AGGREGATE_PADDING_FIXED, AGGREGATES_PAD_UPWARD_ALWAYS): Define.
	(MUST_PASS_IN_STACK): Define.
	(BLOCK_REG_PADDING): Define.
	* config/rs6000/rs6000.h (struct rs6000_args): Remove orig_nargs.
	(PAD_VARARGS_DOWN): Define in terms of FUNCTION_ARG_PADDING.
	* config/rs6000/rs6000.c (init_cumulative_args): Don't set orig_nargs.
	(function_arg_padding): !AGGREGATE_PADDING_FIXED compatibility code.
	Act on AGGREGATES_PAD_UPWARD_ALWAYS.
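
As a rough illustration of "shifting of small values to the correct end
of regs", the toy below computes how far a trailing fragment that was
loaded at the least significant end of a register would have to move.
It is a sketch of the idea as I read the ChangeLog, not code lifted from
the patch: toy_fragment_shift_bits and its parameters are hypothetical,
and only the direction enum follows expr.h.

```c
#include <assert.h>

enum direction { none, upward, downward };

/* Return the left shift, in bits, needed to move a FRAG_BYTES fragment
   from the least significant end of a REG_BYTES register to the end
   implied by PAD.  On a big-endian target, padding upward leaves the
   data in the most significant bytes; on a little-endian target,
   padding downward does.  Otherwise the fragment is already in place.  */
unsigned int
toy_fragment_shift_bits (unsigned int reg_bytes, unsigned int frag_bytes,
                         enum direction pad, int bytes_big_endian)
{
  assert (frag_bytes > 0 && frag_bytes <= reg_bytes);
  if (pad == (bytes_big_endian ? upward : downward))
    return (reg_bytes - frag_bytes) * 8;
  return 0;
}
```

For a three byte aggregate in a 64-bit register on a big-endian target,
padding upward gives (8 - 3) * 8 = 40 bits and padding downward gives 0.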

diff -urp gcc-current/gcc/doc/tm.texi gcc-new/gcc/doc/tm.texi
--- gcc-current/gcc/doc/tm.texi	2003-07-11 08:43:37.000000000 +0930
+++ gcc-new/gcc/doc/tm.texi	2003-07-14 09:21:37.000000000 +0930
@@ -3762,6 +3762,17 @@ controlled by @code{PARM_BOUNDARY}.  If 
 arguments are padded down if @code{BYTES_BIG_ENDIAN} is true.
 @end defmac
 
+@defmac BLOCK_REG_PADDING (@var{mode}, @var{type}, @var{first})
+Specify padding for the last element of a block move between registers and
+memory.  @var{first} is nonzero if this is the only element.  Defining this
+macro allows better control of register function parameters on big-endian
+machines, without using @code{PARALLEL} rtl.  In particular,
+@code{MUST_PASS_IN_STACK} need not test padding and mode of types in
+registers, as there is no longer a "wrong" part of a register.  For example,
+a three byte aggregate may be passed in the high part of a register if so
+required.
+@end defmac
+
 @defmac FUNCTION_ARG_BOUNDARY (@var{mode}, @var{type})
 If defined, a C expression that gives the alignment boundary, in bits,
 of an argument with the specified mode and type.  If it is not defined,
diff -urp gcc-current/gcc/expr.h gcc-new/gcc/expr.h
--- gcc-current/gcc/expr.h	2003-06-30 09:51:42.000000000 +0930
+++ gcc-new/gcc/expr.h	2003-07-11 13:17:10.000000000 +0930
@@ -68,7 +68,6 @@ enum expand_modifier {EXPAND_NORMAL = 0,
 \f
 enum direction {none, upward, downward};
 
-#ifdef TREE_CODE /* Don't lose if tree.h not included.  */
 /* Structure to record the size of a sequence of arguments
    as the sum of a tree-expression and a constant.  This structure is
    also used to store offsets from the stack, which might be negative,
@@ -96,8 +95,9 @@ struct locate_and_pad_arg_data
   /* The amount that the stack pointer needs to be adjusted to
      force alignment for the next argument.  */
   struct args_size alignment_pad;
+  /* Which way we should pad this arg.  */
+  enum direction where_pad;
 };
-#endif
 
 /* Add the value of the tree INC to the `struct args_size' TO.  */
 
@@ -427,7 +427,7 @@ extern rtx gen_group_rtx (rtx);
 
 /* Load a BLKmode value into non-consecutive registers represented by a
    PARALLEL.  */
-extern void emit_group_load (rtx, rtx, int);
+extern void emit_group_load (rtx, rtx, tree, int);
 
 /* Move a non-consecutive group of registers represented by a PARALLEL into
    a non-consecutive group of registers represented by a PARALLEL.  */
@@ -435,12 +435,10 @@ extern void emit_group_move (rtx, rtx);
 
 /* Store a BLKmode value from non-consecutive registers represented by a
    PARALLEL.  */
-extern void emit_group_store (rtx, rtx, int);
+extern void emit_group_store (rtx, rtx, tree, int);
 
-#ifdef TREE_CODE
 /* Copy BLKmode object from a set of registers.  */
 extern rtx copy_blkmode_from_reg (rtx, rtx, tree);
-#endif
 
 /* Mark REG as holding a parameter for the next CALL_INSN.  */
 extern void use_reg (rtx *, rtx);
@@ -490,7 +488,6 @@ extern rtx emit_move_insn_1 (rtx, rtx);
    and return an rtx to address the beginning of the block.  */
 extern rtx push_block (rtx, int, int);
 
-#ifdef TREE_CODE
 /* Generate code to push something onto the stack, given its mode and type.  */
 extern void emit_push_insn (rtx, enum machine_mode, tree, rtx, unsigned int,
 			    int, rtx, int, rtx, rtx, int, rtx);
@@ -503,7 +500,6 @@ extern rtx expand_assignment (tree, tree
    If SUGGEST_REG is nonzero, copy the value through a register
    and return that register, if that is possible.  */
 extern rtx store_expr (tree, rtx, int);
-#endif
 
 /* Given an rtx that may include add and multiply operations,
    generate them as insns and return a pseudo-reg containing the value.
@@ -535,7 +531,6 @@ extern void clear_pending_stack_adjust (
 /* Pop any previously-pushed arguments that have not been popped yet.  */
 extern void do_pending_stack_adjust (void);
 
-#ifdef TREE_CODE
 /* Return the tree node and offset if a given argument corresponds to
    a string constant.  */
 extern tree string_constant (tree, tree *);
@@ -549,7 +544,6 @@ extern void jumpif (tree, rtx);
 /* Generate code to evaluate EXP and jump to IF_FALSE_LABEL if
    the result is zero, or IF_TRUE_LABEL if the result is one.  */
 extern void do_jump (tree, rtx, rtx);
-#endif
 
 /* Generate rtl to compare two rtx's, will call emit_cmp_insn.  */
 extern rtx compare_from_rtx (rtx, rtx, enum rtx_code, int, enum machine_mode,
@@ -566,7 +560,6 @@ extern int try_tablejump (tree, tree, tr
 extern unsigned int case_values_threshold (void);
 
 \f
-#ifdef TREE_CODE
 /* rtl.h and tree.h were included.  */
 /* Return an rtx for the size in bytes of the value of an expr.  */
 extern rtx expr_size (tree);
@@ -592,10 +585,13 @@ extern rtx prepare_call_address (rtx, tr
 
 extern rtx expand_call (tree, rtx, int);
 
+#ifdef TREE_CODE
 extern rtx expand_shift (enum tree_code, enum machine_mode, rtx, tree, rtx,
 			 int);
 extern rtx expand_divmod (int, enum tree_code, enum machine_mode, rtx, rtx,
 			  rtx, int);
+#endif
+
 extern void locate_and_pad_parm (enum machine_mode, tree, int, int, tree,
 				 struct args_size *,
 				 struct locate_and_pad_arg_data *);
@@ -608,7 +604,6 @@ extern rtx label_rtx (tree);
    list of its containing function (i.e. it is treated as reachable even
    if how is not obvious).  */
 extern rtx force_label_rtx (tree);
-#endif
 
 /* Indicate how an input argument register was promoted.  */
 extern rtx promoted_input_arg (unsigned int, enum machine_mode *, int *);
@@ -691,7 +686,6 @@ extern rtx widen_memory_access (rtx, enu
    valid address.  */
 extern rtx validize_mem (rtx);
 
-#ifdef TREE_CODE
 /* Given REF, either a MEM or a REG, and T, either the type of X or
    the expression corresponding to REF, set RTX_UNCHANGING_P if
    appropriate.  */
@@ -706,7 +700,6 @@ extern void set_mem_attributes (rtx, tre
    we alter MEM_OFFSET according to T then we should subtract BITPOS
    expecting that it'll be added back in later.  */
 extern void set_mem_attributes_minus_bitpos (rtx, tree, int, HOST_WIDE_INT);
-#endif
 
 /* Assemble the static constant template for function entry trampolines.  */
 extern rtx assemble_trampoline_template (void);
@@ -738,10 +731,8 @@ extern rtx force_reg (enum machine_mode,
 /* Return given rtx, copied into a new temp reg if it was in memory.  */
 extern rtx force_not_mem (rtx);
 
-#ifdef TREE_CODE
 /* Return mode and signedness to use when object is promoted.  */
 extern enum machine_mode promote_mode (tree, enum machine_mode, int *, int);
-#endif
 
 /* Remove some bytes from the stack.  An rtx says how many.  */
 extern void adjust_stack (rtx);
@@ -812,9 +803,7 @@ extern void do_jump_by_parts_equality_rt
 extern void do_jump_by_parts_greater_rtx (enum machine_mode, int, rtx, rtx,
 					  rtx, rtx);
 
-#ifdef TREE_CODE   /* Don't lose if tree.h not included.  */
 extern void mark_seen_cases (tree, unsigned char *, HOST_WIDE_INT, int);
-#endif
 
 extern int vector_mode_valid_p (enum machine_mode);
 
diff -urp gcc-current/gcc/expr.c gcc-new/gcc/expr.c
--- gcc-current/gcc/expr.c	2003-07-10 14:27:29.000000000 +0930
+++ gcc-new/gcc/expr.c	2003-07-11 09:21:51.000000000 +0930
@@ -2240,18 +2240,13 @@ gen_group_rtx (rtx orig)
   return gen_rtx_PARALLEL (GET_MODE (orig), gen_rtvec_v (length, tmps));
 }
 
-/* Emit code to move a block SRC to a block DST, where DST is non-consecutive
-   registers represented by a PARALLEL.  SSIZE represents the total size of
-   block SRC in bytes, or -1 if not known.  */
-/* ??? If SSIZE % UNITS_PER_WORD != 0, we make the blatant assumption that
-   the balance will be in what would be the low-order memory addresses, i.e.
-   left justified for big endian, right justified for little endian.  This
-   happens to be true for the targets currently using this support.  If this
-   ever changes, a new target macro along the lines of FUNCTION_ARG_PADDING
-   would be needed.  */
+/* Emit code to move a block ORIG_SRC of type TYPE to a block DST,
+   where DST is non-consecutive registers represented by a PARALLEL.
+   SSIZE represents the total size of block ORIG_SRC in bytes, or -1
+   if not known.  */ 
 
 void
-emit_group_load (rtx dst, rtx orig_src, int ssize)
+emit_group_load (rtx dst, rtx orig_src, tree type ATTRIBUTE_UNUSED, int ssize)
 {
   rtx *tmps, src;
   int start, i;
@@ -2279,7 +2274,17 @@ emit_group_load (rtx dst, rtx orig_src, 
       /* Handle trailing fragments that run over the size of the struct.  */
       if (ssize >= 0 && bytepos + (HOST_WIDE_INT) bytelen > ssize)
 	{
-	  shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
+	  /* Arrange to shift the fragment to where it belongs.
+	     extract_bit_field loads to the lsb of the reg.  */
+	  if (
+#ifdef BLOCK_REG_PADDING
+	      BLOCK_REG_PADDING (GET_MODE (orig_src), type, i == start)
+	      == (BYTES_BIG_ENDIAN ? upward : downward)
+#else
+	      BYTES_BIG_ENDIAN
+#endif
+	      )
+	    shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
 	  bytelen = ssize - bytepos;
 	  if (bytelen <= 0)
 	    abort ();
@@ -2304,7 +2309,8 @@ emit_group_load (rtx dst, rtx orig_src, 
 
       /* Optimize the access just a bit.  */
       if (GET_CODE (src) == MEM
-	  && MEM_ALIGN (src) >= GET_MODE_ALIGNMENT (mode)
+	  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (src))
+	      || MEM_ALIGN (src) >= GET_MODE_ALIGNMENT (mode))
 	  && bytepos * BITS_PER_UNIT % GET_MODE_ALIGNMENT (mode) == 0
 	  && bytelen == GET_MODE_SIZE (mode))
 	{
@@ -2360,7 +2366,7 @@ emit_group_load (rtx dst, rtx orig_src, 
 				     bytepos * BITS_PER_UNIT, 1, NULL_RTX,
 				     mode, mode, ssize);
 
-      if (BYTES_BIG_ENDIAN && shift)
+      if (shift)
 	expand_binop (mode, ashl_optab, tmps[i], GEN_INT (shift),
 		      tmps[i], 0, OPTAB_WIDEN);
     }
@@ -2391,12 +2397,13 @@ emit_group_move (rtx dst, rtx src)
 		    XEXP (XVECEXP (src, 0, i), 0));
 }
 
-/* Emit code to move a block SRC to a block DST, where SRC is non-consecutive
-   registers represented by a PARALLEL.  SSIZE represents the total size of
-   block DST, or -1 if not known.  */
+/* Emit code to move a block SRC to a block ORIG_DST of type TYPE,
+   where SRC is non-consecutive registers represented by a PARALLEL.
+   SSIZE represents the total size of block ORIG_DST, or -1 if not
+   known.  */
 
 void
-emit_group_store (rtx orig_dst, rtx src, int ssize)
+emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED, int ssize)
 {
   rtx *tmps, dst;
   int start, i;
@@ -2440,8 +2447,8 @@ emit_group_store (rtx orig_dst, rtx src,
 	 the temporary.  */
 
       temp = assign_stack_temp (GET_MODE (dst), ssize, 0);
-      emit_group_store (temp, src, ssize);
-      emit_group_load (dst, temp, ssize);
+      emit_group_store (temp, src, type, ssize);
+      emit_group_load (dst, temp, type, ssize);
       return;
     }
   else if (GET_CODE (dst) != MEM && GET_CODE (dst) != CONCAT)
@@ -2462,7 +2469,16 @@ emit_group_store (rtx orig_dst, rtx src,
       /* Handle trailing fragments that run over the size of the struct.  */
       if (ssize >= 0 && bytepos + (HOST_WIDE_INT) bytelen > ssize)
 	{
-	  if (BYTES_BIG_ENDIAN)
+	  /* store_bit_field always takes its value from the lsb.
+	     Move the fragment to the lsb if it's not already there.  */
+	  if (
+#ifdef BLOCK_REG_PADDING
+	      BLOCK_REG_PADDING (GET_MODE (orig_dst), type, i == start)
+	      == (BYTES_BIG_ENDIAN ? upward : downward)
+#else
+	      BYTES_BIG_ENDIAN
+#endif
+	      )
 	    {
 	      int shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
 	      expand_binop (mode, ashr_optab, tmps[i], GEN_INT (shift),
@@ -2495,7 +2511,8 @@ emit_group_store (rtx orig_dst, rtx src,
 
       /* Optimize the access just a bit.  */
       if (GET_CODE (dest) == MEM
-	  && MEM_ALIGN (dest) >= GET_MODE_ALIGNMENT (mode)
+	  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (dest))
+	      || MEM_ALIGN (dest) >= GET_MODE_ALIGNMENT (mode))
 	  && bytepos * BITS_PER_UNIT % GET_MODE_ALIGNMENT (mode) == 0
 	  && bytelen == GET_MODE_SIZE (mode))
 	emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
@@ -4076,7 +4093,7 @@ emit_push_insn (rtx x, enum machine_mode
       /* Handle calls that pass values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       if (GET_CODE (reg) == PARALLEL)
-	emit_group_load (reg, x, -1);  /* ??? size? */
+	emit_group_load (reg, x, type, -1);
       else
 	move_block_to_reg (REGNO (reg), x, partial, mode);
     }
@@ -4276,7 +4293,8 @@ expand_assignment (tree to, tree from, i
       /* Handle calls that return values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       if (GET_CODE (to_rtx) == PARALLEL)
-	emit_group_load (to_rtx, value, int_size_in_bytes (TREE_TYPE (from)));
+	emit_group_load (to_rtx, value, TREE_TYPE (from),
+			 int_size_in_bytes (TREE_TYPE (from)));
       else if (GET_MODE (to_rtx) == BLKmode)
 	emit_block_move (to_rtx, value, expr_size (from), BLOCK_OP_NORMAL);
       else
@@ -4310,7 +4328,8 @@ expand_assignment (tree to, tree from, i
       temp = expand_expr (from, 0, GET_MODE (to_rtx), 0);
 
       if (GET_CODE (to_rtx) == PARALLEL)
-	emit_group_load (to_rtx, temp, int_size_in_bytes (TREE_TYPE (from)));
+	emit_group_load (to_rtx, temp, TREE_TYPE (from),
+			 int_size_in_bytes (TREE_TYPE (from)));
       else
 	emit_move_insn (to_rtx, temp);
 
@@ -4720,7 +4739,8 @@ store_expr (tree exp, rtx target, int wa
       /* Handle calls that return values in multiple non-contiguous locations.
 	 The Irix 6 ABI has examples of this.  */
       else if (GET_CODE (target) == PARALLEL)
-	emit_group_load (target, temp, int_size_in_bytes (TREE_TYPE (exp)));
+	emit_group_load (target, temp, TREE_TYPE (exp),
+			 int_size_in_bytes (TREE_TYPE (exp)));
       else if (GET_MODE (temp) == BLKmode)
 	emit_block_move (target, temp, expr_size (exp),
 			 (want_value & 2
@@ -9266,7 +9286,7 @@ expand_expr (tree exp, rtx target, enum 
 		    /* Handle calls that pass values in multiple
 		       non-contiguous locations.  The Irix 6 ABI has examples
 		       of this.  */
-		    emit_group_store (memloc, op0,
+		    emit_group_store (memloc, op0, inner_type,
 				      int_size_in_bytes (inner_type));
 		  else
 		    emit_move_insn (memloc, op0);
diff -urp gcc-current/gcc/calls.c gcc-new/gcc/calls.c
--- gcc-current/gcc/calls.c	2003-07-10 14:27:27.000000000 +0930
+++ gcc-new/gcc/calls.c	2003-07-11 08:57:52.000000000 +0930
@@ -27,6 +27,7 @@ Software Foundation, 59 Temple Place - S
 #include "tree.h"
 #include "flags.h"
 #include "expr.h"
+#include "optabs.h"
 #include "libfuncs.h"
 #include "function.h"
 #include "regs.h"
@@ -928,22 +929,27 @@ store_unaligned_arguments_into_pseudos (
 	    < (unsigned int) MIN (BIGGEST_ALIGNMENT, BITS_PER_WORD)))
       {
 	int bytes = int_size_in_bytes (TREE_TYPE (args[i].tree_value));
-	int big_endian_correction = 0;
-
-	args[i].n_aligned_regs
-	  = args[i].partial ? args[i].partial
-	    : (bytes + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD;
+	int nregs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+	int endian_correction = 0;
 
+	args[i].n_aligned_regs = args[i].partial ? args[i].partial : nregs;
 	args[i].aligned_regs = (rtx *) xmalloc (sizeof (rtx)
 						* args[i].n_aligned_regs);
 
-	/* Structures smaller than a word are aligned to the least
-	   significant byte (to the right).  On a BYTES_BIG_ENDIAN machine,
+	/* Structures smaller than a word are normally aligned to the
+	   least significant byte.  On a BYTES_BIG_ENDIAN machine,
 	   this means we must skip the empty high order bytes when
 	   calculating the bit offset.  */
-	if (BYTES_BIG_ENDIAN
-	    && bytes < UNITS_PER_WORD)
-	  big_endian_correction = (BITS_PER_WORD  - (bytes * BITS_PER_UNIT));
+	if (bytes < UNITS_PER_WORD
+#ifdef BLOCK_REG_PADDING
+	    && (BLOCK_REG_PADDING (args[i].mode,
+				   TREE_TYPE (args[i].tree_value), 1)
+		== downward)
+#else
+	    && BYTES_BIG_ENDIAN
+#endif
+	    )
+	  endian_correction = BITS_PER_WORD - bytes * BITS_PER_UNIT;
 
 	for (j = 0; j < args[i].n_aligned_regs; j++)
 	  {
@@ -952,6 +958,8 @@ store_unaligned_arguments_into_pseudos (
 	    int bitsize = MIN (bytes * BITS_PER_UNIT, BITS_PER_WORD);
 
 	    args[i].aligned_regs[j] = reg;
+	    word = extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
+				      word_mode, word_mode, BITS_PER_WORD);
 
 	    /* There is no need to restrict this code to loading items
 	       in TYPE_ALIGN sized hunks.  The bitfield instructions can
@@ -967,11 +975,8 @@ store_unaligned_arguments_into_pseudos (
 	    emit_move_insn (reg, const0_rtx);
 
 	    bytes -= bitsize / BITS_PER_UNIT;
-	    store_bit_field (reg, bitsize, big_endian_correction, word_mode,
-			     extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
-						word_mode, word_mode,
-						BITS_PER_WORD),
-			     BITS_PER_WORD);
+	    store_bit_field (reg, bitsize, endian_correction, word_mode,
+			     word, BITS_PER_WORD);
 	  }
       }
 }
@@ -1574,34 +1579,48 @@ load_register_parameters (struct arg_dat
     {
       rtx reg = ((flags & ECF_SIBCALL)
 		 ? args[i].tail_call_reg : args[i].reg);
-      int partial = args[i].partial;
-      int nregs;
-
       if (reg)
 	{
+	  int partial = args[i].partial;
+	  int nregs;
+	  int size = 0;
 	  rtx before_arg = get_last_insn ();
 	  /* Set to non-negative if must move a word at a time, even if just
 	     one word (e.g, partial == 1 && mode == DFmode).  Set to -1 if
 	     we just use a normal move insn.  This value can be zero if the
 	     argument is a zero size structure with no fields.  */
-	  nregs = (partial ? partial
-		   : (TYPE_MODE (TREE_TYPE (args[i].tree_value)) == BLKmode
-		      ? ((int_size_in_bytes (TREE_TYPE (args[i].tree_value))
-			  + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD)
-		      : -1));
+	  nregs = -1;
+	  if (partial)
+	    nregs = partial;
+	  else if (TYPE_MODE (TREE_TYPE (args[i].tree_value)) == BLKmode)
+	    {
+	      size = int_size_in_bytes (TREE_TYPE (args[i].tree_value));
+	      nregs = (size + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD;
+	    }
+	  else
+	    size = GET_MODE_SIZE (args[i].mode);
 
 	  /* Handle calls that pass values in multiple non-contiguous
 	     locations.  The Irix 6 ABI has examples of this.  */
 
 	  if (GET_CODE (reg) == PARALLEL)
-	    emit_group_load (reg, args[i].value,
-			     int_size_in_bytes (TREE_TYPE (args[i].tree_value)));
+	    {
+	      tree type = TREE_TYPE (args[i].tree_value);
+	      emit_group_load (reg, args[i].value, type,
+			       int_size_in_bytes (type));
+	    }
 
 	  /* If simple case, just do move.  If normal partial, store_one_arg
 	     has already loaded the register for us.  In all other cases,
 	     load the register(s) from memory.  */
 
-	  else if (nregs == -1)
+	  else if (nregs == -1
+#ifdef BLOCK_REG_PADDING
+		   && !(size < UNITS_PER_WORD
+			&& (args[i].locate.where_pad
+			    == (BYTES_BIG_ENDIAN ? upward : downward)))
+#endif
+		   )
 	    emit_move_insn (reg, args[i].value);
 
 	  /* If we have pre-computed the values to put in the registers in
@@ -1613,9 +1632,44 @@ load_register_parameters (struct arg_dat
 			      args[i].aligned_regs[j]);
 
 	  else if (partial == 0 || args[i].pass_on_stack)
-	    move_block_to_reg (REGNO (reg),
-			       validize_mem (args[i].value), nregs,
-			       args[i].mode);
+	    {
+	      rtx mem = validize_mem (args[i].value);
+
+#ifdef BLOCK_REG_PADDING
+	      /* Handle case where we have a value that needs shifting
+		 up to the msb.  eg. a QImode value and we're padding
+		 upward on a BYTES_BIG_ENDIAN machine.  */
+	      if (nregs == -1)
+		{
+		  rtx ri = gen_rtx_REG (word_mode, REGNO (reg));
+		  rtx x;
+		  int shift = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+		  x = expand_binop (word_mode, ashl_optab, mem,
+				    GEN_INT (shift), ri, 1, OPTAB_WIDEN);
+		  if (x != ri)
+		    emit_move_insn (ri, x);
+		}
+
+	      /* Handle a BLKmode that needs shifting.  */
+	      else if (nregs == 1 && size < UNITS_PER_WORD
+		       && args[i].locate.where_pad == downward)
+		{
+		  rtx tem = operand_subword_force (mem, 0, args[i].mode);
+		  rtx ri = gen_rtx_REG (word_mode, REGNO (reg));
+		  rtx x = gen_reg_rtx (word_mode);
+		  int shift = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+		  optab dir = BYTES_BIG_ENDIAN ? lshr_optab : ashl_optab;
+
+		  emit_move_insn (x, tem);
+		  x = expand_binop (word_mode, dir, x, GEN_INT (shift),
+				    ri, 1, OPTAB_WIDEN);
+		  if (x != ri)
+		    emit_move_insn (ri, x);
+		}
+	      else
+#endif
+		move_block_to_reg (REGNO (reg), mem, nregs, args[i].mode);
+	    }
 
 	  /* When a parameter is a block, and perhaps in other cases, it is
 	     possible that it did a load from an argument slot that was
@@ -3138,7 +3192,7 @@ expand_call (tree exp, rtx target, int i
 	    }
 
 	  if (! rtx_equal_p (target, valreg))
-	    emit_group_store (target, valreg,
+	    emit_group_store (target, valreg, TREE_TYPE (exp),
 			      int_size_in_bytes (TREE_TYPE (exp)));
 
 	  /* We can not support sibling calls for this case.  */
@@ -3983,7 +4037,7 @@ emit_library_call_value_1 (int retval, r
       /* Handle calls that pass values in multiple non-contiguous
 	 locations.  The PA64 has examples of this for library calls.  */
       if (reg != 0 && GET_CODE (reg) == PARALLEL)
-	emit_group_load (reg, val, GET_MODE_SIZE (GET_MODE (val)));
+	emit_group_load (reg, val, NULL_TREE, GET_MODE_SIZE (GET_MODE (val)));
       else if (reg != 0 && partial == 0)
 	emit_move_insn (reg, val);
 
@@ -4087,7 +4141,7 @@ emit_library_call_value_1 (int retval, r
 	  if (GET_CODE (valreg) == PARALLEL)
 	    {
 	      temp = gen_reg_rtx (outmode);
-	      emit_group_store (temp, valreg, outmode);
+	      emit_group_store (temp, valreg, NULL_TREE, outmode);
 	      valreg = temp;
 	    }
 
@@ -4130,7 +4184,7 @@ emit_library_call_value_1 (int retval, r
 	{
 	  if (value == 0)
 	    value = gen_reg_rtx (outmode);
-	  emit_group_store (value, valreg, outmode);
+	  emit_group_store (value, valreg, NULL_TREE, outmode);
 	}
       else if (value != 0)
 	emit_move_insn (value, valreg);
diff -urp gcc-current/gcc/function.c gcc-new/gcc/function.c
--- gcc-current/gcc/function.c	2003-07-10 14:27:29.000000000 +0930
+++ gcc-new/gcc/function.c	2003-07-11 08:57:52.000000000 +0930
@@ -4507,6 +4507,8 @@ assign_parms (tree fndecl)
 						  offset_rtx));
 
 	set_mem_attributes (stack_parm, parm, 1);
+	if (entry_parm && MEM_ATTRS (stack_parm)->align < PARM_BOUNDARY)
+	  set_mem_align (stack_parm, PARM_BOUNDARY);
 
 	/* Set also REG_ATTRS if parameter was passed in a register.  */
 	if (entry_parm)
@@ -4538,6 +4540,7 @@ assign_parms (tree fndecl)
 	     locations.  The Irix 6 ABI has examples of this.  */
 	  if (GET_CODE (entry_parm) == PARALLEL)
 	    emit_group_store (validize_mem (stack_parm), entry_parm,
+			      TREE_TYPE (parm),
 			      int_size_in_bytes (TREE_TYPE (parm)));
 
 	  else
@@ -4644,7 +4647,12 @@ assign_parms (tree fndecl)
 
 	 Set DECL_RTL to that place.  */
 
-      if (nominal_mode == BLKmode || GET_CODE (entry_parm) == PARALLEL)
+      if (nominal_mode == BLKmode
+#ifdef BLOCK_REG_PADDING
+	  || (locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
+	      && GET_MODE_SIZE (promoted_mode) < UNITS_PER_WORD)
+#endif
+	  || GET_CODE (entry_parm) == PARALLEL)
 	{
 	  /* If a BLKmode arrives in registers, copy it to a stack slot.
 	     Handle calls that pass values in multiple non-contiguous
@@ -4680,7 +4688,7 @@ assign_parms (tree fndecl)
 	      /* Handle calls that pass values in multiple non-contiguous
 		 locations.  The Irix 6 ABI has examples of this.  */
 	      if (GET_CODE (entry_parm) == PARALLEL)
-		emit_group_store (mem, entry_parm, size);
+		emit_group_store (mem, entry_parm, TREE_TYPE (parm), size);
 
 	      else if (size == 0)
 		;
@@ -4692,7 +4700,13 @@ assign_parms (tree fndecl)
 		  enum machine_mode mode
 		    = mode_for_size (size * BITS_PER_UNIT, MODE_INT, 0);
 
-		  if (mode != BLKmode)
+		  if (mode != BLKmode
+#ifdef BLOCK_REG_PADDING
+		      && (size == UNITS_PER_WORD
+			  || (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
+			      != (BYTES_BIG_ENDIAN ? upward : downward)))
+#endif
+		      )
 		    {
 		      rtx reg = gen_rtx_REG (mode, REGNO (entry_parm));
 		      emit_move_insn (change_address (mem, mode, 0), reg);
@@ -4703,7 +4717,13 @@ assign_parms (tree fndecl)
 		     to memory.  Note that the previous test doesn't
 		     handle all cases (e.g. SIZE == 3).  */
 		  else if (size != UNITS_PER_WORD
-			   && BYTES_BIG_ENDIAN)
+#ifdef BLOCK_REG_PADDING
+			   && (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
+			       == downward)
+#else
+			   && BYTES_BIG_ENDIAN
+#endif
+			   )
 		    {
 		      rtx tem, x;
 		      int by = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
@@ -5352,6 +5372,7 @@ locate_and_pad_parm (enum machine_mode p
     = type ? size_in_bytes (type) : size_int (GET_MODE_SIZE (passed_mode));
   where_pad = FUNCTION_ARG_PADDING (passed_mode, type);
   boundary = FUNCTION_ARG_BOUNDARY (passed_mode, type);
+  locate->where_pad = where_pad;
 
 #ifdef ARGS_GROW_DOWNWARD
   locate->slot_offset.constant = -initial_offset_ptr->constant;
@@ -7021,6 +7042,7 @@ expand_function_end (void)
 		emit_group_move (real_decl_rtl, decl_rtl);
 	      else
 		emit_group_load (real_decl_rtl, decl_rtl,
+				 TREE_TYPE (decl_result),
 				 int_size_in_bytes (TREE_TYPE (decl_result)));
 	    }
 	  else
diff -urp gcc-current/gcc/stmt.c gcc-new/gcc/stmt.c
--- gcc-current/gcc/stmt.c	2003-07-10 14:27:33.000000000 +0930
+++ gcc-new/gcc/stmt.c	2003-07-11 08:57:52.000000000 +0930
@@ -2955,7 +2955,7 @@ expand_value_return (rtx val)
 	val = convert_modes (mode, old_mode, val, unsignedp);
 #endif
       if (GET_CODE (return_reg) == PARALLEL)
-	emit_group_load (return_reg, val, int_size_in_bytes (type));
+	emit_group_load (return_reg, val, type, int_size_in_bytes (type));
       else
 	emit_move_insn (return_reg, val);
     }
diff -urp gcc-current/gcc/Makefile.in gcc-new/gcc/Makefile.in
--- gcc-current/gcc/Makefile.in	2003-07-11 08:43:30.000000000 +0930
+++ gcc-new/gcc/Makefile.in	2003-07-11 08:57:52.000000000 +0930
@@ -1543,7 +1543,7 @@ builtins.o : builtins.c $(CONFIG_H) $(SY
    $(RECOG_H) output.h typeclass.h hard-reg-set.h toplev.h hard-reg-set.h \
    except.h $(TM_P_H) $(PREDICT_H) libfuncs.h real.h langhooks.h
 calls.o : calls.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) flags.h \
-   $(EXPR_H) langhooks.h $(TARGET_H) \
+   $(EXPR_H) $(OPTABS_H) langhooks.h $(TARGET_H) \
    libfuncs.h $(REGS_H) toplev.h output.h function.h $(TIMEVAR_H) $(TM_P_H) cgraph.h except.h
 expmed.o : expmed.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
    flags.h insn-config.h $(EXPR_H) $(OPTABS_H) $(RECOG_H) real.h \
diff -urp gcc-current/gcc/config/rs6000/linux64.h gcc-new/gcc/config/rs6000/linux64.h
--- gcc-current/gcc/config/rs6000/linux64.h	2003-07-10 12:59:19.000000000 +0930
+++ gcc-new/gcc/config/rs6000/linux64.h	2003-07-11 11:29:25.000000000 +0930
@@ -163,6 +163,10 @@
 
 #ifndef RS6000_BI_ARCH
 
+/* 64-bit PowerPC Linux is always big-endian.  */
+#undef	TARGET_LITTLE_ENDIAN
+#define TARGET_LITTLE_ENDIAN	0
+
 /* 64-bit PowerPC Linux always has a TOC.  */
 #undef  TARGET_TOC
 #define	TARGET_TOC		1
@@ -235,6 +239,35 @@
 #undef  JUMP_TABLES_IN_TEXT_SECTION
 #define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
 
+/* The linux ppc64 ABI isn't explicit on whether aggregates smaller
+   than a doubleword should be padded upward or downward.  You could
+   reasonably assume that they follow the normal rules for structure
+   layout treating the parameter area as any other block of memory,
+   then map the reg param area to registers.  ie. pad upward.
+   Setting both of the following defines results in this behaviour.
+   Setting just the first one will result in aggregates that fit in a
+   doubleword being padded downward, and others being padded upward.
+   Not a bad idea as this results in struct { int x; } being passed
+   the same way as an int.  */
+#define AGGREGATE_PADDING_FIXED TARGET_64BIT
+#define AGGREGATES_PAD_UPWARD_ALWAYS 0
+
+/* We don't want anything in the reg parm area being passed on the
+   stack.  */
+#define MUST_PASS_IN_STACK(MODE, TYPE)				\
+  ((TARGET_64BIT						\
+    && (TYPE) != 0						\
+    && (TREE_CODE (TYPE_SIZE (TYPE)) != INTEGER_CST		\
+	|| TREE_ADDRESSABLE (TYPE)))				\
+   || (!TARGET_64BIT						\
+       && default_must_pass_in_stack ((MODE), (TYPE))))
+
+/* Specify padding for the last element of a block move between
+   registers and memory.  FIRST is nonzero if this is the only
+   element.  */
+#define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
+  (!(FIRST) ? upward : FUNCTION_ARG_PADDING (MODE, TYPE))
+
 /* __throw will restore its own return address to be the same as the
    return address of the function that the throw is being made to.
    This is unfortunate, because we want to check the original
diff -urp gcc-current/gcc/config/rs6000/rs6000.c gcc-new/gcc/config/rs6000/rs6000.c
--- gcc-current/gcc/config/rs6000/rs6000.c	2003-07-10 14:27:38.000000000 +0930
+++ gcc-new/gcc/config/rs6000/rs6000.c	2003-07-11 11:29:32.000000000 +0930
@@ -3702,8 +3702,6 @@ init_cumulative_args (cum, fntype, libna
   else
     cum->nargs_prototype = 0;
 
-  cum->orig_nargs = cum->nargs_prototype;
-
   /* Check for a longcall attribute.  */
   if (fntype
       && lookup_attribute ("longcall", TYPE_ATTRIBUTES (fntype))
@@ -3742,8 +3740,47 @@ function_arg_padding (mode, type)
      enum machine_mode mode;
      tree type;
 {
-  if (type != 0 && AGGREGATE_TYPE_P (type))
-    return upward;
+#ifndef AGGREGATE_PADDING_FIXED
+#define AGGREGATE_PADDING_FIXED 0
+#endif
+#ifndef AGGREGATES_PAD_UPWARD_ALWAYS
+#define AGGREGATES_PAD_UPWARD_ALWAYS 0
+#endif
+
+  if (!AGGREGATE_PADDING_FIXED)
+    {
+      /* GCC used to pass structures of the same size as integer types as
+	 if they were in fact integers, ignoring FUNCTION_ARG_PADDING.
+	 ie. Structures of size 1 or 2 (or 4 when TARGET_64BIT) were
+	 passed padded downward, except that -mstrict-align further
+	 muddied the water in that multi-component structures of 2 and 4
+	 bytes in size were passed padded upward.
+
+	 The following arranges for best compatibility with previous
+	 versions of gcc, but removes the -mstrict-align dependency.  */
+      if (BYTES_BIG_ENDIAN)
+	{
+	  HOST_WIDE_INT size = 0;
+
+	  if (mode == BLKmode)
+	    {
+	      if (type && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
+		size = int_size_in_bytes (type);
+	    }
+	  else
+	    size = GET_MODE_SIZE (mode);
+
+	  if (size == 1 || size == 2 || size == 4)
+	    return downward;
+	}
+      return upward;
+    }
+
+  if (AGGREGATES_PAD_UPWARD_ALWAYS)
+    {
+      if (type != 0 && AGGREGATE_TYPE_P (type))
+	return upward;
+    }
 
   /* This is the default definition.  */
   return (! BYTES_BIG_ENDIAN
diff -urp gcc-current/gcc/config/rs6000/rs6000.h gcc-new/gcc/config/rs6000/rs6000.h
--- gcc-current/gcc/config/rs6000/rs6000.h	2003-07-10 14:27:39.000000000 +0930
+++ gcc-new/gcc/config/rs6000/rs6000.h	2003-07-11 08:57:52.000000000 +0930
@@ -1760,7 +1760,6 @@ typedef struct rs6000_args
   int fregno;			/* next available FP register */
   int vregno;			/* next available AltiVec register */
   int nargs_prototype;		/* # args left in the current prototype */
-  int orig_nargs;		/* Original value of nargs_prototype */
   int prototype;		/* Whether a prototype was defined */
   int stdarg;			/* Whether function is a stdarg function.  */
   int call_cookie;		/* Do special things for this call */
@@ -1904,13 +1903,8 @@ typedef struct rs6000_args
 #define EXPAND_BUILTIN_VA_ARG(valist, type) \
   rs6000_va_arg (valist, type)
 
-/* For AIX, the rule is that structures are passed left-aligned in
-   their stack slot.  However, GCC does not presently do this:
-   structures which are the same size as integer types are passed
-   right-aligned, as if they were in fact integers.  This only
-   matters for structures of size 1 or 2, or 4 when TARGET_64BIT.
-   ABI_V4 does not use std_expand_builtin_va_arg.  */
-#define PAD_VARARGS_DOWN (TYPE_MODE (type) != BLKmode)
+#define PAD_VARARGS_DOWN \
+   (FUNCTION_ARG_PADDING (TYPE_MODE (type), type) == downward)
 
 /* Define this macro to be a nonzero value if the location where a function
    argument is passed depends on whether or not it is a named argument.  */
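
The trailing-fragment handling that the patch adds to emit_group_load and
emit_group_store can be modelled outside the compiler.  The following is
only a sketch, not code from the patch: fragment_shift is a hypothetical
helper, BITS_PER_UNIT is assumed to be 8, and the padding direction is
passed in directly rather than obtained from BLOCK_REG_PADDING.

```c
#include <assert.h>

/* Mirrors enum direction from gcc/expr.h.  */
enum direction { none, upward, downward };

#define BITS_PER_UNIT 8		/* Assumption: 8-bit bytes.  */

/* Hypothetical helper.  Return the number of bits by which the last,
   partially filled register of a block must be shifted.  BYTELEN is the
   register size in bytes, SSIZE the total block size, BYTEPOS the offset
   of this register's data within the block.  PAD stands in for the value
   BLOCK_REG_PADDING would return.  */
int
fragment_shift (int bytelen, int ssize, int bytepos,
		enum direction pad, int bytes_big_endian)
{
  int remaining = ssize - bytepos;

  /* Only meaningful for a fragment that runs past the end of the block.  */
  assert (remaining > 0 && remaining < bytelen);

  /* The bit-field routines work at the lsb of the register, so a shift
     is needed only when the padding direction puts the data at the
     other end.  */
  if (pad == (bytes_big_endian ? upward : downward))
    return (bytelen - remaining) * BITS_PER_UNIT;
  return 0;
}
```

For a three byte aggregate in an eight byte register on a big-endian
target padded upward, this gives a shift of 40 bits; padded downward, no
shift is needed.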


-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: function parms in regs, patch 3 of 3
  2003-07-14  2:51         ` Alan Modra
@ 2003-07-14  3:00           ` David Edelsohn
  2003-07-15 15:08           ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-07-14  3:00 UTC (permalink / raw)
  To: Alan Modra; +Cc: Jim Wilson, gcc-patches

	The rs6000 changes are fine with me.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-07-14  2:51         ` Alan Modra
  2003-07-14  3:00           ` David Edelsohn
@ 2003-07-15 15:08           ` David Edelsohn
  2003-07-16  3:49             ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-07-15 15:08 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

>>>>> Alan Modra writes:

Alan> David, the padding options I chose for powerpc64-linux are as we
Alan> discussed a (rather long) while ago, but I haven't changed anything
Alan> yet for AIX.  From our discussion, for AIX you probably want

Alan> #define AGGREGATE_PADDING_FIXED 1
Alan> #define AGGREGATES_PAD_UPWARD_ALWAYS 1

	This change apparently needs adjustments to the ABI_AIX stdarg
support.  That is not handled automatically by these macros?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-07-15 15:08           ` David Edelsohn
@ 2003-07-16  3:49             ` Alan Modra
  2003-07-16 15:08               ` David Edelsohn
  2003-07-16 15:10               ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2003-07-16  3:49 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Tue, Jul 15, 2003 at 11:08:26AM -0400, David Edelsohn wrote:
> >>>>> Alan Modra writes:
> Alan> #define AGGREGATE_PADDING_FIXED 1
> Alan> #define AGGREGATES_PAD_UPWARD_ALWAYS 1
> 
> 	This change apparently needs adjustments to the ABI_AIX stdarg
> support.  That is not handled automatically by these macros?

It should be automatic with PAD_VARARGS_DOWN being defined in terms
of FUNCTION_ARG_PADDING.  Perhaps MUST_PASS_IN_STACK is interfering?
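
A minimal sketch of that dependency, as a toy model rather than GCC's
actual macros.  It assumes a big-endian target with 8-byte parameter
slots; the names arg_padding and pad_varargs_down are hypothetical, and
the "smaller than a slot pads downward" default is a simplification:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the direction values a padding macro can return.  */
enum pad_direction { PAD_UPWARD, PAD_DOWNWARD };

/* Models a FUNCTION_ARG_PADDING-style decision.  When the hypothetical
   flag aggregates_pad_upward_always is set, aggregates always pad
   upward; otherwise anything smaller than an 8-byte slot pads downward
   (assumed big-endian default).  */
enum pad_direction
arg_padding (bool is_aggregate, size_t size, bool aggregates_pad_upward_always)
{
  assert (size > 0);
  if (aggregates_pad_upward_always && is_aggregate)
    return PAD_UPWARD;
  return size < 8 ? PAD_DOWNWARD : PAD_UPWARD;
}

/* The varargs rule is derived from the argument rule, so the two cannot
   disagree.  */
bool
pad_varargs_down (bool is_aggregate, size_t size,
                  bool aggregates_pad_upward_always)
{
  return arg_padding (is_aggregate, size, aggregates_pad_upward_always)
         == PAD_DOWNWARD;
}
```

Because pad_varargs_down only calls arg_padding, changing the aggregate
rule changes the va_arg side in the same way without further edits.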

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-07-16  3:49             ` Alan Modra
@ 2003-07-16 15:08               ` David Edelsohn
  2003-07-16 15:10               ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-07-16 15:08 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:


Alan> On Tue, Jul 15, 2003 at 11:08:26AM -0400, David Edelsohn wrote:
>> >>>>> Alan Modra writes:
Alan> #define AGGREGATE_PADDING_FIXED 1
Alan> #define AGGREGATES_PAD_UPWARD_ALWAYS 1
>> 
>> This change apparently needs adjustments to the ABI_AIX stdarg
>> support.  That is not handled automatically by these macros?

Alan> It should be automatic with PAD_VARARGS_DOWN being defined in terms
Alan> of FUNCTION_ARG_PADDING.  Perhaps MUST_PASS_IN_STACK is interfering?

Alan> -- 
Alan> Alan Modra
Alan> IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: function parms in regs, patch 3 of 3
  2003-07-16  3:49             ` Alan Modra
  2003-07-16 15:08               ` David Edelsohn
@ 2003-07-16 15:10               ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-07-16 15:10 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

>>>>> Alan Modra writes:

Alan> It should be automatic with PAD_VARARGS_DOWN being defined in terms
Alan> of FUNCTION_ARG_PADDING.  Perhaps MUST_PASS_IN_STACK is interfering?

	Okay, just needs more macros defined than how I interpreted your
original comment.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* powerpc64-linux libffi update
@ 2003-08-01 14:57             ` Alan Modra
  2003-08-01 15:08               ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-08-01 14:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Updates libffi for the current powerpc64-linux gcc structure passing
conventions.  Since we now don't pass anything in the parm save area,
as per the ABI, the hack in linux64.S can go too.

libffi/ChangeLog
	* src/powerpc/ffi.c (ffi_prep_args64): Modify for changed gcc
	structure passing.
	(ffi_closure_helper_LINUX64): Likewise.
	* src/powerpc/linux64.S: Remove code writing to parm save area.
	* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Use return
	address in lr from ffi_closure_helper_LINUX64 call to calculate
	table address.  Optimize function tail.

ffitest now reports no errors, and there are no new libjava regressions.
OK to install mainline?
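
The placement rule used by the ffi.c hunks can be sketched on its own
as follows.  This is an illustration only: place_struct_in_slot and
SLOT_SIZE are hypothetical names, not libffi API, and the caller is
assumed to provide enough room for structures larger than one slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SLOT_SIZE = 8 };	/* one doubleword parameter slot */

/* Copy SIZE bytes of a structure into the parameter area starting at
   SLOT.  A structure shorter than one slot is left-padded, i.e. its
   bytes end up at the high-address end of the slot.  Returns the
   address at which the data was stored.  */
char *
place_struct_in_slot (char *slot, const void *src, size_t size)
{
  char *where = slot;

  assert (slot != NULL && src != NULL);
  if (size < SLOT_SIZE)
    where += SLOT_SIZE - size;

  memcpy (where, src, size);
  return where;
}
```

A 3-byte structure therefore lands at offset 5 of its slot, regardless
of whether it is among the first eight arguments.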

Index: libffi/src/powerpc/ffi.c
===================================================================
RCS file: /cvs/gcc/gcc/libffi/src/powerpc/ffi.c,v
retrieving revision 1.4
diff -u -p -r1.4 ffi.c
--- libffi/src/powerpc/ffi.c	18 Apr 2003 12:32:36 -0000	1.4
+++ libffi/src/powerpc/ffi.c	1 Aug 2003 14:34:54 -0000
@@ -288,7 +288,7 @@ enum { ASM_NEEDS_REGISTERS64 = 4 };
    |--------------------------------------------| |
    |   TOC save area			8	| |
    |--------------------------------------------| |	stack	|
-   |   Linker doubleword		8	| |	gorws	|
+   |   Linker doubleword		8	| |	grows	|
    |--------------------------------------------| |	down	V
    |   Compiler doubleword		8	| |
    |--------------------------------------------| |	lower addresses
@@ -384,15 +384,14 @@ void hidden ffi_prep_args64(extended_cif
 	    }
 	  else
 	    {
-	      /* Structures with 1, 2 and 4 byte sizes are passed left-padded
-		 if they are in the first 8 arguments.  */
-	      if (next_arg >= gpr_base
-		  && (*ptr)->size < 8
-		  && ((*ptr)->size & ~((*ptr)->size - 1)) == (*ptr)->size)
-		memcpy((char *) next_arg + 8 - (*ptr)->size,
-		       (char *) *p_argv, (*ptr)->size);
-	      else
-		memcpy((char *) next_arg, (char *) *p_argv, (*ptr)->size);
+	      char *where = (char *) next_arg;
+
+	      /* Structures with size less than eight bytes are passed
+		 left-padded.  */
+	      if ((*ptr)->size < 8)
+		where += 8 - (*ptr)->size;
+
+	      memcpy (where, (char *) *p_argv, (*ptr)->size);
 	      next_arg += words;
 	      if (next_arg == gpr_end)
 		next_arg = rest;
@@ -1027,12 +1026,9 @@ ffi_closure_helper_LINUX64 (ffi_closure*
 #if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
 	case FFI_TYPE_LONGDOUBLE:
 #endif
-	  /* Structures with 1, 2 and 4 byte sizes are passed left-padded
-	     if they are in the first 8 arguments.  */
-	  if (ng < NUM_GPR_ARG_REGISTERS64
-	      && arg_types[i]->size < 8
-	      && ((arg_types[i]->size & ~(arg_types[i]->size - 1))
-		  == arg_types[i]->size))
+	  /* Structures with size less than eight bytes are passed
+	     left-padded.  */
+	  if (arg_types[i]->size < 8)
 	    avalue[i] = (char *) pst + 8 - arg_types[i]->size;
 	  else
 	    avalue[i] = pst;
Index: libffi/src/powerpc/linux64.S
===================================================================
RCS file: /cvs/gcc/gcc/libffi/src/powerpc/linux64.S,v
retrieving revision 1.2
diff -u -p -r1.2 linux64.S
--- libffi/src/powerpc/linux64.S	16 May 2003 22:09:21 -0000	1.2
+++ libffi/src/powerpc/linux64.S	1 Aug 2003 14:34:54 -0000
@@ -95,17 +95,6 @@ ffi_call_LINUX64:
 	lfd	%f12, -32-(10*8)(%r28)
 	lfd	%f13, -32-(9*8)(%r28)
 2:
-	/* FIXME: Shouldn't gcc use %r3-%r10 in this case
-	   and not the parm save area?  */
-	std	%r3, 48+(0*8)(%r1)
-	std	%r4, 48+(1*8)(%r1)
-	std	%r5, 48+(2*8)(%r1)
-	std	%r6, 48+(3*8)(%r1)
-	std	%r7, 48+(4*8)(%r1)
-	std	%r8, 48+(5*8)(%r1)
-	std	%r9, 48+(6*8)(%r1)
-	std	%r10, 48+(7*8)(%r1)
-	/* end of FIXME.  */
 
 	/* Make the call.  */
 	bctrl
Index: libffi/src/powerpc/linux64_closure.S
===================================================================
RCS file: /cvs/gcc/gcc/libffi/src/powerpc/linux64_closure.S,v
retrieving revision 1.2
diff -u -p -r1.2 linux64_closure.S
--- libffi/src/powerpc/linux64_closure.S	16 May 2003 22:09:21 -0000	1.2
+++ libffi/src/powerpc/linux64_closure.S	1 Aug 2003 14:34:54 -0000
@@ -64,6 +64,7 @@ ffi_closure_LINUX64:
 
 	# make the call
 	bl .ffi_closure_helper_LINUX64
+.Lret:
 
 	# now r3 contains the return type
 	# so use it to look up in a table
@@ -71,10 +72,10 @@ ffi_closure_LINUX64:
 
 	# look up the proper starting point in table 
 	# by using return type as offset
-	addi %r5, %r1, 112	# get pointer to results area
-	bl .Lget_ret_type0_addr # get pointer to .Lret_type0 into LR
-	mflr %r4		# move to r4
+	mflr %r4		# move address of .Lret to r4
 	sldi %r3, %r3, 4	# now multiply return type by 16
+	addi %r4, %r4, .Lret_type0 - .Lret
+	ld %r0, 224+16(%r1)
 	add %r3, %r3, %r4	# add contents of table to table address
 	mtctr %r3
 	bctr			# jump to it
@@ -84,94 +85,84 @@ ffi_closure_LINUX64:
 # first.
 	.align 4
 
-	nop
-	nop
-	nop
-.Lget_ret_type0_addr:
-	blrl
-
 .Lret_type0:
 # case FFI_TYPE_VOID
-	b .Lfinish
-	nop
-	nop
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 	nop
 # case FFI_TYPE_INT
-	lwa %r3, 4(%r5)
-	b .Lfinish
-	nop
-	nop
+	lwa %r3, 112+4(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_FLOAT
-	lfs %f1, 4(%r5)
-	b .Lfinish
-	nop
-	nop
+	lfs %f1, 112+4(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_DOUBLE
-	lfd %f1, 0(%r5)
-	b .Lfinish
-	nop
-	nop
+	lfd %f1, 112+0(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_LONGDOUBLE
-	lfd %f1, 0(%r5)
-	b .Lfinish
-	nop
-	nop
+	lfd %f1, 112+0(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_UINT8
-	lbz %r3, 7(%r5)
-	b .Lfinish
-	nop
-	nop
+	lbz %r3, 112+7(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_SINT8
-	lbz %r3, 7(%r5)
+	lbz %r3, 112+7(%r1)
 	extsb %r3,%r3
+	mtlr %r0
 	b .Lfinish
-	nop
 # case FFI_TYPE_UINT16
-	lhz %r3, 6(%r5)
-	b .Lfinish
-	nop
-	nop
+	lhz %r3, 112+6(%r1)
+	mtlr %r0
+.Lfinish:
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_SINT16
-	lha %r3, 6(%r5)
-	b .Lfinish
-	nop
-	nop
+	lha %r3, 112+6(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_UINT32
-	lwz %r3, 4(%r5)
-	b .Lfinish
-	nop
-	nop
+	lwz %r3, 112+4(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_SINT32
-	lwa %r3, 4(%r5)
-	b .Lfinish
-	nop
-	nop
+	lwa %r3, 112+4(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_UINT64
-	ld %r3, 0(%r5)
-	b .Lfinish
-	nop
-	nop
+	ld %r3, 112+0(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_SINT64
-	ld %r3, 0(%r5)
-	b .Lfinish
-	nop
-	nop
+	ld %r3, 112+0(%r1)
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 # case FFI_TYPE_STRUCT
-	b .Lfinish
-	nop
-	nop
+	mtlr %r0
+	addi %r1, %r1, 224
+	blr
 	nop
 # case FFI_TYPE_POINTER
-	ld %r3, 0(%r5)
-	b .Lfinish
-	nop
-	nop
-# esac
-.Lfinish:
-	ld %r0, 224+16(%r1)
+	ld %r3, 112+0(%r1)
 	mtlr %r0
 	addi %r1, %r1, 224
 	blr
+# esac
 .LFE1:
 	.long	0
 	.byte	0,12,0,1,128,0,0,0
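
The table lookup in ffi_closure_LINUX64 relies on every case occupying
exactly four 4-byte instructions, so the return type can be scaled by
16.  A sketch of the address arithmetic in C, with hypothetical names
(ret_case_address, table_offset) and addresses modelled as plain
integers:

```c
#include <assert.h>
#include <stdint.h>

enum { INSN_BYTES = 4, INSNS_PER_CASE = 4 };

/* LRET models the address of .Lret as left in lr by the call,
   TABLE_OFFSET models the link-time constant .Lret_type0 - .Lret, and
   RETURN_TYPE is the value the helper returned in r3.  */
uintptr_t
ret_case_address (uintptr_t lret, uintptr_t table_offset,
                  unsigned return_type)
{
  /* The shift by 4 below is only valid while each case is 16 bytes.  */
  assert (INSN_BYTES * INSNS_PER_CASE == 16);
  return lret + table_offset + ((uintptr_t) return_type << 4);
}
```

This is why the shorter cases in the table are filled out with a nop:
dropping one would shift every later entry off its 16-byte boundary.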

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc64-linux libffi update
  2003-08-01 14:57             ` powerpc64-linux libffi update Alan Modra
@ 2003-08-01 15:08               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-08-01 15:08 UTC (permalink / raw)
  To: gcc-patches

libffi/ChangeLog
	* src/powerpc/ffi.c (ffi_prep_args64): Modify for changed gcc
	structure passing.
	(ffi_closure_helper_LINUX64): Likewise.
	* src/powerpc/linux64.S: Remove code writing to parm save area.
	* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Use return
	address in lr from ffi_closure_helper_LINUX64 call to calculate
	table address.  Optimize function tail.

ffitest now reports no errors, and no new libjava regressions.
OK to install mainline?

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-06-13 20:04     ` David Edelsohn
  2003-06-13 20:06       ` Jakub Jelinek
  2003-06-13 21:08       ` linas
@ 2003-08-08  7:24       ` Alan Modra
  2003-08-08 14:01         ` David Edelsohn
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-08-08  7:24 UTC (permalink / raw)
  To: David Edelsohn; +Cc: linas, gcc-patches, Janis Johnson

[-- Attachment #1: Type: text/plain, Size: 4569 bytes --]

On Fri, Jun 13, 2003 at 04:00:39PM -0400, David Edelsohn wrote:
> 	Have you tested the change in allocation order patch on floating
> point intensive code?

Janis has kindly run specfp tests.  And found a regression.  :-(

On looking at the code differences for the case with a slowdown, I
see 

 fctiwz 0,1
..
 stfd 0,128(1)
..
 ld 0,128(1)
..
 std 0,112(1)
 lwz 5,116(1)

Without the RS6000_ALT_REG_ALLOC_ORDER patch we generate:

 fctiwz 0,1
..
 stfd 0,112(1)
..
 lwz 5,116(1)

We have three of these sequences in the function.  On analyzing -da
dumps, for the slow case I see the fctiwz instruction being allocated a
gpr for its output, while the fast case gets an fpr.  Of course, with a
gpr we then require a reload, and optimization passes which run after
reload don't seem to be clever enough to see the useless moves to and
from memory.
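
For reference, a reduced C fragment of the kind that produces such a
conversion, modelled on the w0/w1 computations in the attached janis.c.
The function name scale_to_fixed is hypothetical, and which instructions
are emitted depends on the target and options, so the mapping to fctiwz
is an assumption here:

```c
#include <assert.h>

/* Convert a weight in [0, 1] to 8.8 fixed point.  The cast truncates
   toward zero, which is the operation a float-to-int conversion
   instruction performs; the integer result is then needed in a GPR.  */
int
scale_to_fixed (float a)
{
  assert (a >= 0.0F && a <= 1.0F);
  return (int) (a * 256.0F);
}
```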

Now it turns out that the register allocator isn't getting all the
information it needs to do a good job, because the fctiwz pattern tells
the allocator to ignore the fact that the output should be in a fpr.
David, you introduced this with

2002-07-03  David Edelsohn  <edelsohn@gnu.org>

	* config/rs6000/rs6000.md (fix_truncdfsi2_internal): Ignore DImode
	in FPR as preference.
	(fctiwz): Same.
	(floatdidf2, fix_truncdfdi2): Same.
	(floatdisf2, floatditf2, fix_trunctfdi2): Same.
	(floatditf2): Same.
	(floatsitf2, fix_trunctfsi2): SImode in GPR.
	(ctrdi): Remove FPR alternative and splitter.

Going back over gcc-patches archives, it appears that this was to fix
a reload problem.  (See [RFC PATCH] Fix middle-end/6963 from June and
July 2002.)  The good news is that the underlying reload problem is
fixed, I think.  I'm testing the following patch, which compiles the
testcase in http://gcc.gnu.org/ml/gcc-patches/2002-06/msg02214.html
without a problem, and things are looking good so far with regression
tests.

	* config/rs6000/rs6000.md (fctiwz): Don't ignore fpr preference.
	(floatdidf2, fix_truncdfdi2, floatdisf2_internal1): Likewise.
	(floatditf2, fix_trunctfdi2): Likewise.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.264
diff -u -p -r1.264 rs6000.md
--- gcc/config/rs6000/rs6000.md	16 Jul 2003 11:52:51 -0000	1.264
+++ gcc/config/rs6000/rs6000.md	8 Aug 2003 06:52:44 -0000
@@ -5182,7 +5182,7 @@
 ; because the first makes it clear that operand 0 is not live
 ; before the instruction.
 (define_insn "fctiwz"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=*f")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=f")
 	(unspec:DI [(fix:SI (match_operand:DF 1 "gpc_reg_operand" "f"))]
 		   UNSPEC_FCTIWZ))]
   "(TARGET_POWER2 || TARGET_POWERPC) && TARGET_HARD_FLOAT && TARGET_FPRS"
@@ -5197,7 +5197,7 @@
 
 (define_insn "floatdidf2"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(float:DF (match_operand:DI 1 "gpc_reg_operand" "*f")))]
+	(float:DF (match_operand:DI 1 "gpc_reg_operand" "f")))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS"
   "fcfid %0,%1"
   [(set_attr "type" "fp")])
@@ -5233,7 +5233,7 @@
   "")
 
 (define_insn "fix_truncdfdi2"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=*f")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=f")
 	(fix:DI (match_operand:DF 1 "gpc_reg_operand" "f")))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS"
   "fctidz %0,%1"
@@ -5260,7 +5260,7 @@
 ;; from double rounding.
 (define_insn_and_split "floatdisf2_internal1"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-        (float:SF (match_operand:DI 1 "gpc_reg_operand" "*f")))
+        (float:SF (match_operand:DI 1 "gpc_reg_operand" "f")))
    (clobber (match_scratch:DF 2 "=f"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS"
   "#"
@@ -8342,7 +8342,7 @@
 
 (define_insn_and_split "floatditf2"
   [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
-        (float:TF (match_operand:DI 1 "gpc_reg_operand" "*f")))
+        (float:TF (match_operand:DI 1 "gpc_reg_operand" "f")))
    (clobber (match_scratch:DF 2 "=f"))]
   "DEFAULT_ABI == ABI_AIX && TARGET_POWERPC64
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
@@ -8369,7 +8369,7 @@
   "")
 
 (define_insn_and_split "fix_trunctfdi2"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=*f")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=f")
         (fix:DI (match_operand:TF 1 "gpc_reg_operand" "f")))
    (clobber (match_scratch:DF 2 "=f"))]
   "DEFAULT_ABI == ABI_AIX && TARGET_POWERPC64

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

[-- Attachment #2: janis.c --]
[-- Type: text/plain, Size: 1816 bytes --]

/* Minimized function from file texture.c in mesa, from SPEC CPU 2000.

   The original function has lots of conditional code, but all invocations
   of the function used the same paths so the IF blocks have been
   removed.  */

typedef unsigned char GLubyte;
typedef int GLint;
typedef unsigned int GLuint;
typedef float GLfloat;
struct gl_texture_object;

/* the original version of this is much larger */
struct gl_texture_image {
  GLuint Width2;
  GLuint Height2;
};

extern void get_1d_texel( const struct gl_texture_object *tObj,
                          const struct gl_texture_image *img, GLint i,
                          GLubyte *red, GLubyte *green, GLubyte *blue,
                          GLubyte *alpha );

void sample_1d_linear( const struct gl_texture_object *tObj,
                              const struct gl_texture_image *img,
                              GLfloat s,
                              GLubyte *red, GLubyte *green,
                              GLubyte *blue, GLubyte *alpha )
{
   GLint width = img->Width2;
   GLint i0, i1;
   GLfloat u;
   GLint i0border, i1border;

   u = s * width;

   i0 = ((GLint) floor(u - 0.5F)) % width;
   i1 = (i0 + 1) & (width-1);
   i0border = i1border = 0;

   i0 &= (width-1);

   {
      GLfloat a = ((GLfloat)(u - 0.5F)-floor((GLfloat)u - 0.5F));

      GLint w0 = (GLint) ((1.0F-a) * 256.0F);
      GLint w1 = (GLint) ( a * 256.0F);

      GLubyte red0, green0, blue0, alpha0;
      GLubyte red1, green1, blue1, alpha1;

      get_1d_texel( tObj, img, i0, &red0, &green0, &blue0, &alpha0 );

      get_1d_texel( tObj, img, i1, &red1, &green1, &blue1, &alpha1 );

      *red = (w0*red0 + w1*red1) >> 8;
      *green = (w0*green0 + w1*green1) >> 8;
      *blue = (w0*blue0 + w1*blue1) >> 8;
      *alpha = (w0*alpha0 + w1*alpha1) >> 8;
   }
}

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-08  7:24       ` Alan Modra
@ 2003-08-08 14:01         ` David Edelsohn
  2003-08-09  1:55           ` ppc64 floating point usage Zack Weinberg
  2003-08-09  2:23           ` ppc64 floating point usage [was Re: PPC64 Compiler bug !!] Eric Christopher
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2003-08-08 14:01 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> Going back over gcc-patches archives, it appears that this was to fix
Alan> a reload problem.  (See [RFC PATCH] Fix middle-end/6963 from June and
Alan> July 2002.)  The good news is that the underlying reload problem is
Alan> fixed, I think.  I'm testing the following patch, which compiles the
Alan> testcase in http://gcc.gnu.org/ml/gcc-patches/2002-06/msg02214.html
Alan> without a problem, and things are looking good so far with regression
Alan> tests.

Alan> * config/rs6000/rs6000.md (fctiwz): Don't ignore fpr preference.
Alan> (floatdidf2, fix_truncdfdi2, floatdisf2_internal1): Likewise.
Alan> (floatditf2, fix_trunctfdi2): Likewise.

	Removing the ignore flag on all of the patterns is incorrect.
This activity continues to fix symptoms instead of problems.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage
  2003-08-08 14:01         ` David Edelsohn
@ 2003-08-09  1:55           ` Zack Weinberg
  2003-08-09  2:42             ` David Edelsohn
  2003-08-09  2:23           ` ppc64 floating point usage [was Re: PPC64 Compiler bug !!] Eric Christopher
  1 sibling, 1 reply; 875+ messages in thread
From: Zack Weinberg @ 2003-08-09  1:55 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

>>>>>> Alan Modra writes:
>
> Alan> Going back over gcc-patches archives, it appears that this was to fix
> Alan> a reload problem.  (See [RFC PATCH] Fix middle-end/6963 from June and
> Alan> July 2002.)  The good news is that the underlying reload problem is
> Alan> fixed, I think.  I'm testing the following patch, which compiles the
> Alan> testcase in http://gcc.gnu.org/ml/gcc-patches/2002-06/msg02214.html
> Alan> without a problem, and things are looking good so far with regression
> Alan> tests.
>
> Alan> * config/rs6000/rs6000.md (fctiwz): Don't ignore fpr preference.
> Alan> (floatdidf2, fix_truncdfdi2, floatdisf2_internal1): Likewise.
> Alan> (floatditf2, fix_trunctfdi2): Likewise.
>
> 	Removing the ignore flag on all of the patterns is incorrect.
> This activity continues to fix symptoms instead of problems.

Would you please explain this in more detail?  What do you think the
problem is, and the correct fix?

zw

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-08 14:01         ` David Edelsohn
  2003-08-09  1:55           ` ppc64 floating point usage Zack Weinberg
@ 2003-08-09  2:23           ` Eric Christopher
  2003-08-09  2:50             ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Eric Christopher @ 2003-08-09  2:23 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David,

> 
> Alan> * config/rs6000/rs6000.md (fctiwz): Don't ignore fpr preference.
> Alan> (floatdidf2, fix_truncdfdi2, floatdisf2_internal1): Likewise.
> Alan> (floatditf2, fix_trunctfdi2): Likewise.
> 
> 	Removing the ignore flag on all of the patterns is incorrect.
> This activity continues to fix symptoms instead of problems.

I always reserve the right to be mistaken, but from my reading of the
ppc instruction set the change looks good to me since those instructions
require a floating point register. Could you perhaps explain more,
please?

-eric

-- 
Eric Christopher <echristo@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage
  2003-08-09  1:55           ` ppc64 floating point usage Zack Weinberg
@ 2003-08-09  2:42             ` David Edelsohn
  2003-08-09  2:56               ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-08-09  2:42 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc-patches

>>>>> Zack Weinberg writes:

Zack> Would you please explain this in more detail?  What do you think the
Zack> problem is, and the correct fix?

	The problem is FPRs allocated in functions which do not perform
any floating point computations.  Making it less likely that FPRs will be
allocated is not an acceptable solution because that is not a guarantee of
correct behavior and simply gives programmers a false sense of confidence.
If it does not produce the expected result 100% of the time, it is useless
for the programmers who truly require that functionality.

	The use of FPRs as the DImode output for float to int conversion
patterns should not contribute to the regclass preference for those
pseudos.  The previous patch was not only for a reload problem.

	I am sorry that a hardware manufacturer used the wrong options to
compile a PPC64 Linux kernel driver and the mistake was not detected
automatically.  However, the patches in this thread are not going to
prevent it from happening again, they just make it less likely -- which
is even worse.  The correct compiler option still is necessary, but now it
sometimes will work without the option, which will cause more confusion
due to the seemingly random failures and superstition about options.

	The correct fix is for the compiler to automatically determine
whether the function performs any floating point computations and then
modify register allocation appropriately, e.g., make FPRs fixed just prior
to register allocation if they should not be used.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-09  2:23           ` ppc64 floating point usage [was Re: PPC64 Compiler bug !!] Eric Christopher
@ 2003-08-09  2:50             ` David Edelsohn
  2003-08-09  3:06               ` Eric Christopher
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-08-09  2:50 UTC (permalink / raw)
  To: Eric Christopher; +Cc: gcc-patches

>>>>> Eric Christopher writes:

Eric> I always reserve the right to be mistaken, but from my reading of the
Eric> ppc instruction set the change looks good to me since those instructions
Eric> require a floating point register. Could you perhaps explain more,
Eric> please?

	The instructions require FPRs and the constraint lists FPRs.
However, the operands with the "*" modifier are DImode and should not
contribute to the register preferencing for the pseudo used elsewhere in
the function.  It is a constraint necessary for reload to get the output
into the pseudo, not for the coloring of the pseudo itself.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage
  2003-08-09  2:42             ` David Edelsohn
@ 2003-08-09  2:56               ` Alan Modra
  2003-08-09  3:15                 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-08-09  2:56 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 596 bytes --]

On Fri, Aug 08, 2003 at 10:42:19PM -0400, David Edelsohn wrote:
> 	The use of FPRs as the DImode output for float to int conversion
> patterns should not contribute to the regclass preference for those
> pseudos.  The previous patch was not only for a reload problem.

Uh, found the testcase that failed until you told the register allocator
to ignore fpr preference.  I'm posting it so that I don't lose it again.

BTW, your patch to ignore fpr preference wouldn't have been a hack
attacking symptoms rather than the real problem, would it?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

[-- Attachment #2: t5.i.gz --]
[-- Type: application/x-gunzip, Size: 20325 bytes --]

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-09  2:50             ` David Edelsohn
@ 2003-08-09  3:06               ` Eric Christopher
  2003-08-09  3:27                 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Eric Christopher @ 2003-08-09  3:06 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

> 
> 	The instructions require FPRs and the constraint lists FPRs.
> However, the operands with the "*" modifier are DImode and should not
> contribute to the register preferencing for the pseudo used elsewhere in
> the function.  It is a constraint necessary for reload to get the output
> into the pseudo, not for the coloring of the pseudo itself.

Right. Understand that, but here you're not preferring any particular
register at all. It's not like you're trying to get it to ignore one
constraint in preference of another, in this case you have to have an fp
register during register allocation to use the instruction at all so why
not keep it there the entire time. I'm just not seeing what you gained
in your previous patch?

I apologize if I'm being particularly dense, but it seems like more than
just me here...

-eric

-- 
Eric Christopher <echristo@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage
  2003-08-09  2:56               ` Alan Modra
@ 2003-08-09  3:15                 ` David Edelsohn
  2003-08-09  3:40                   ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-08-09  3:15 UTC (permalink / raw)
  To: Alan Modra; +Cc: Zack Weinberg, gcc-patches

>>>>> Alan Modra writes:

Alan> BTW, your patch to ignore fpr preference wouldn't have been a hack
Alan> attacking symptoms rather than the real problem, would it?

	The patch to ignore FPR preferences fixed an oversight in the
original patterns.  The register cost model is about tuning and tuning is
based on some metric.  If you want to equate measuring the metric with
symptom, fine, but the problem is maximizing (optimizing) some function.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-09  3:06               ` Eric Christopher
@ 2003-08-09  3:27                 ` David Edelsohn
  2003-08-09 23:14                   ` Eric Christopher
  2003-08-10  5:17                   ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2003-08-09  3:27 UTC (permalink / raw)
  To: Eric Christopher; +Cc: gcc-patches

>>>>> Eric Christopher writes:

Eric> Right. Understand that, but here you're not preferring any particular
Eric> register at all. It's not like you're trying to get it to ignore one
Eric> constraint in preference of another, in this case you have to have an fp
Eric> register during register allocation to use the instruction at all so why
Eric> not keep it there the entire time. I'm just not seeing what you gained
Eric> in your previous patch?

	Consider an example where you want to perform integer
arithmetic operations on the DImode pseudo after converting from float.
Would it be good to allocate that pseudo in an FPR and have reload copy it
in and out of a GPR for each integer operation?  I could create a similar
example for the pseudo used in a floating point context.

	The architecture produces a DImode value in an FPR.  DImode
most frequently is used in GPRs.  The best choice for the allocation of
the pseudo depends on the other uses of the pseudo in the function, not
the requirement of the conversion instruction.
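
The argument can be put as a toy cost model.  This is not GCC's
regclass code: cheaper_class and its parameters are hypothetical, and
the model counts only the other uses of the pseudo, leaving the
conversion instruction out as the "*" modifier is meant to do.

```c
#include <assert.h>

enum reg_class { CLASS_GPR, CLASS_FPR };

/* Pick a home for a DImode pseudo from its other uses.  Each use in
   the wrong class costs one copy through memory, weighted by
   MOVE_COST.  Ties go to GPRs, where DImode values usually live.  */
enum reg_class
cheaper_class (unsigned other_gpr_uses, unsigned other_fpr_uses,
               unsigned move_cost)
{
  unsigned cost_if_gpr, cost_if_fpr;

  assert (move_cost > 0);
  cost_if_gpr = other_fpr_uses * move_cost;
  cost_if_fpr = other_gpr_uses * move_cost;
  return cost_if_fpr < cost_if_gpr ? CLASS_FPR : CLASS_GPR;
}
```

With three integer operations and no floating point uses the model
picks a GPR; with only a floating point store following the conversion
it picks an FPR.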

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage
  2003-08-09  3:15                 ` David Edelsohn
@ 2003-08-09  3:40                   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2003-08-09  3:40 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, gcc-patches

On Fri, Aug 08, 2003 at 11:15:53PM -0400, David Edelsohn wrote:
> 	The patch to ignore FPR preferences fixed an oversight in the
> original patterns.  The register cost model is about tuning and tuning is
> based on some metric.  If you want to equate measuring the metric with
> symptom, fine, but the problem is maximizing (optimizing) some function.

OK, I think I'm beginning to make sense of this.  In the case where the
result of a fp -> int conversion is live for any appreciable time, we'd
much rather it lived in a gpr reg.  It's just a pity that when the
result is written immediately to a stack slot that there is then no
preference for fpr regs, as that prevents me changing the reg alloc
order.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-09  3:27                 ` David Edelsohn
@ 2003-08-09 23:14                   ` Eric Christopher
  2003-08-09 23:27                     ` David Edelsohn
  2003-08-10  5:17                   ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Eric Christopher @ 2003-08-09 23:14 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches


> 	Consider an example where you want to perform integer
> arithmetic operations on the DImode pseudo after converting from float.
> Would it be good to allocate that pseudo in an FPR and have reload copy it
> in and out of a GPR for each integer operation?  I could create a similar
> example for the pseudo used in a floating point context.
> 
> 	The architecture produces a DImode value in an FPR.  DImode
> most frequently is used in GPRs.  The best choice for the allocation of
> the pseudo depends on the other uses of the pseudo in the function, not
> the requirement of the conversion instruction.

Right, I can understand all of that, I worry that when you're choosing
register preferences it'll take an integer register in the unlikely case
that you have all of your floating point registers occupied giving you
an insn that can't match its constraints. I thought * was normally used
when you have multiple alternatives and want to ignore one/multiple of
them for choosing register preferences. I just don't see a valid
constraint as a possible choice for register preference.

If I'm wrong here I apologize for the furor :)

-eric

-- 
Eric Christopher <echristo@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-09 23:14                   ` Eric Christopher
@ 2003-08-09 23:27                     ` David Edelsohn
  2003-08-10  0:15                       ` Eric Christopher
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-08-09 23:27 UTC (permalink / raw)
  To: Eric Christopher; +Cc: gcc-patches

>>>>> Eric Christopher writes:

Eric> Right, I can understand all of that, I worry that when you're choosing
Eric> register preferences it'll take an integer register in the unlikely case
Eric> that you have all of your floating point registers occupied giving you
Eric> an insn that can't match its constraints. I thought * was normally used
Eric> when you have multiple alternatives and want to ignore one/multiple of
Eric> them for choosing register preferences. I just don't see a valid
Eric> constraint as a possible choice for register preference.

Eric> If I'm wrong here I apologize for the furor :)

	Quoting the GCC Internals Manual, Machine Descriptions, Operand
Constraints, Constraint Modifier Characters:

"* 
      Says that the following character should be ignored when choosing
      register preferences. * has no effect on the meaning of the
      constraint as a constraint, and no effect on reloading."

Reload will ensure that the instruction matches its constraints.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-09 23:27                     ` David Edelsohn
@ 2003-08-10  0:15                       ` Eric Christopher
  0 siblings, 0 replies; 875+ messages in thread
From: Eric Christopher @ 2003-08-10  0:15 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches


> 	Quoting the GCC Internals Manual, Machine Descriptions, Operand
> Constraints, Constraint Modifier Characters:
> 
> "* 
>       Says that the following character should be ignored when choosing
>       register preferences. * has no effect on the meaning of the
>       constraint as a constraint, and no effect on reloading."
> 
> Reload will ensure that the instruction matches its constraints.

Right, I'd read that, I'm not big on having reload fix up such things,
but via our conversation on irc I see where you're going :)

-eric

-- 
Eric Christopher <echristo@redhat.com>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-09  3:27                 ` David Edelsohn
  2003-08-09 23:14                   ` Eric Christopher
@ 2003-08-10  5:17                   ` Geoff Keating
  2003-08-10  6:43                     ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2003-08-10  5:17 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> >>>>> Eric Christopher writes:
> 
> Eric> Right. Understand that, but here you're not preferring any particular
> Eric> register at all. It's not like you're trying to get it to ignore one
> Eric> constraint in preference of another, in this case you have to have an fp
> Eric> register during register allocation to use the instruction at all so why
> Eric> not keep it there the entire time. I'm just not seeing what you gained
> Eric> in your previous patch?
> 
> 	Consider an example where you want to perform integer
> arithmetic operations on the DImode pseudo after converting from float.
> Would it be good to allocate that pseudo in an FPR and have reload copy it
> in and out of a GPR for each integer operation?  I could create a similar
> example for the pseudo used in a floating point context.

Yes, it would be good, sometimes.  Consider:

void setx (long long *result, double d, double b, double e)
{
  long long a;
  
  a = d;
  if (b * b != b)
    a = b;
  if (e * e != e)
    a = e;
  
  *result = a + 1;
}

GCC generates code like this (this example at -O1):

_setx:
        fctidz f1,f1
        stfd f1,-32(r1)
        ld r2,-32(r1)
        fmul f0,f2,f2
        fcmpu cr7,f0,f2
        beq- cr7,L2
        fctidz f0,f2
        stfd f0,-32(r1)
        ld r2,-32(r1)
L2:
        fmul f0,f3,f3
        fcmpu cr7,f0,f3
        beq- cr7,L4
        fctidz f0,f3
        stfd f0,-32(r1)
        ld r2,-32(r1)
L4:
        addi r2,r2,1
        std r2,0(r3)
        blr

which is suboptimal no matter what way you look at it.  Without the
'*', GCC generates:

_setx:
        fctidz f1,f1
        fmul f0,f2,f2
        fcmpu cr7,f0,f2
        beq- cr7,L2
        fctidz f1,f2
L2:
        fmul f0,f3,f3
        fcmpu cr7,f0,f3
        beq- cr7,L4
        fctidz f1,f3
L4:
        stfd f1,-32(r1)
        ld r2,-32(r1)
        addi r0,r2,1
        std r0,0(r3)
        blr

which is obviously better.  In fact, this turns out to be much better,
because the output reload on the 'addi' is avoided.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-10  5:17                   ` Geoff Keating
@ 2003-08-10  6:43                     ` Alan Modra
  2003-08-10  7:00                       ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2003-08-10  6:43 UTC (permalink / raw)
  To: Geoff Keating; +Cc: David Edelsohn, gcc-patches

Admittedly this is fairly contrived..

cat >fpcount.c <<EOF
void f (double x, char *p)
{
  long i, c;
  for (i = 0; i < 3; i++)
    asm volatile ("fcfid %0,%1" : "=f" (c) : "f" (x));

  while (--c)
    *p++ = 0;
}
EOF
powerpc64-linux-gcc -O2 -S fpcount.c

fpcount.c:9: error: unable to generate reloads for:
(jump_insn:HI 79 30 97 2 (parallel [
            (set (pc)
                (if_then_else (eq (reg/v:DI 32 f0 [orig:120 c ] [120])
                        (const_int 1 [0x1]))
                    (label_ref 80)
                    (pc)))
            (set (reg/v:DI 32 f0 [orig:120 c ] [120])
                (plus:DI (reg/v:DI 32 f0 [orig:120 c ] [120])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:CC))
            (clobber (scratch:DI))
        ]) 565 {*ctrdi_internal5} (insn_list:REG_DEP_ANTI 30 (nil))
    (expr_list:REG_UNUSED (scratch:CC)
        (expr_list:REG_UNUSED (scratch:DI)
            (expr_list:REG_BR_PROB (const_int 3600 [0xe10])
                (nil)))))
fpcount.c:12: internal compiler error: in find_reloads, at reload.c:3647

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
  2003-08-10  6:43                     ` Alan Modra
@ 2003-08-10  7:00                       ` Geoff Keating
  0 siblings, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2003-08-10  7:00 UTC (permalink / raw)
  To: amodra; +Cc: dje, gcc-patches

> Date: Sun, 10 Aug 2003 16:11:05 +0930
> From: Alan Modra <amodra@bigpond.net.au>

> fpcount.c:9: error: unable to generate reloads for:
> (jump_insn:HI 79 30 97 2 (parallel [
>             (set (pc)
>                 (if_then_else (eq (reg/v:DI 32 f0 [orig:120 c ] [120])
>                         (const_int 1 [0x1]))
>                     (label_ref 80)
>                     (pc)))
>             (set (reg/v:DI 32 f0 [orig:120 c ] [120])
>                 (plus:DI (reg/v:DI 32 f0 [orig:120 c ] [120])
>                     (const_int -1 [0xffffffffffffffff])))
>             (clobber (scratch:CC))
>             (clobber (scratch:DI))
>         ]) 565 {*ctrdi_internal5} (insn_list:REG_DEP_ANTI 30 (nil))
>     (expr_list:REG_UNUSED (scratch:CC)
>         (expr_list:REG_UNUSED (scratch:DI)
>             (expr_list:REG_BR_PROB (const_int 3600 [0xe10])
>                 (nil)))))
> fpcount.c:12: internal compiler error: in find_reloads, at reload.c:3647

That's the long-standing bug where reload can't put reloads on output
operands of branches because it was written before we had
insert_insn_on_edge.  To fix this, you should fix reload, not hack the
port to work around it.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-05-27 11:58 [PATCH] powerpc64-linux bi-arch support Jakub Jelinek
                   ` (2 preceding siblings ...)
  2003-06-04 16:51 ` David Edelsohn
@ 2003-08-19 20:07 ` David Edelsohn
  2003-08-20  0:42   ` Alan Modra
  3 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-08-19 20:07 UTC (permalink / raw)
  To: Jakub Jelinek, Alan Modra; +Cc: gcc-patches

	A colleague of mine has noticed a problem with the ppc64 bi-arch
patch when linux64.h is configured without bi-arch.  linux64.h includes
code to assign to DEFAULT_ABI, but that macro may be a constant:

#ifndef RS6000_BI_ARCH

#undef  DEFAULT_ABI
#define DEFAULT_ABI ABI_AIX

...

#undef  SUBSUBTARGET_OVERRIDE_OPTIONS
#define SUBSUBTARGET_OVERRIDE_OPTIONS                           \
  do                                                            \
    {                                                           \
      if (TARGET_64BIT)                                         \
        {                                                       \
          if (DEFAULT_ABI != ABI_AIX)                           \
            {                                                   \
              DEFAULT_ABI = ABI_AIX;                            \
              error (INVALID_64BIT, "call");                    \
            }

The assignment never will be executed because DEFAULT_ABI already is
ABI_AIX, but GCC correctly complains about an invalid lvalue.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc64-linux bi-arch support
  2003-08-19 20:07 ` David Edelsohn
@ 2003-08-20  0:42   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2003-08-20  0:42 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jakub Jelinek, gcc-patches

On Tue, Aug 19, 2003 at 04:07:39PM -0400, David Edelsohn wrote:
> The assignment never will be executed because DEFAULT_ABI already is
> ABI_AIX, but GCC correctly complains about an invalid lvalue.

Yes, I'd noticed this myself but hadn't applied a fix since mainline
currently can only be configured biarch.  The correct fix is

Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.48
diff -u -p -r1.48 linux64.h
--- gcc/config/rs6000/linux64.h	16 Jul 2003 11:52:51 -0000	1.48
+++ gcc/config/rs6000/linux64.h	20 Aug 2003 00:38:26 -0000
@@ -69,7 +69,7 @@
 	{							\
 	  if (DEFAULT_ABI != ABI_AIX)				\
 	    {							\
-	      DEFAULT_ABI = ABI_AIX;				\
+	      rs6000_current_abi = ABI_AIX;			\
 	      error (INVALID_64BIT, "call");			\
 	    }							\
 	  if (TARGET_RELOCATABLE)				\
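
A minimal sketch of why the original assignment fails to compile in the
non-bi-arch configuration and why assigning to rs6000_current_abi is
safe in both.  The enum values, the BI_ARCH_SKETCH switch, and the
function name force_aix_abi are assumptions made for illustration; only
DEFAULT_ABI, ABI_AIX, and rs6000_current_abi come from the thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the port's ABI enumeration.  */
enum rs6000_abi_sketch { ABI_NONE, ABI_AIX, ABI_V4 };

/* Stand-in for the variable that holds the selected ABI.  */
static enum rs6000_abi_sketch rs6000_current_abi = ABI_V4;

#ifdef BI_ARCH_SKETCH
/* Bi-arch case: the macro names a variable, so it is an lvalue.  */
#define DEFAULT_ABI rs6000_current_abi
#else
/* Non-bi-arch case: the macro is a constant, so the statement
   "DEFAULT_ABI = ABI_AIX;" would be rejected as an invalid lvalue
   even though it sits in code that can never run.  */
#define DEFAULT_ABI ABI_AIX
#endif

/* Mirrors the shape of the patched check: compare through the macro,
   but assign through the variable, which is an lvalue either way.  */
enum rs6000_abi_sketch
force_aix_abi (void)
{
  if (DEFAULT_ABI != ABI_AIX)
    rs6000_current_abi = ABI_AIX;
  return DEFAULT_ABI;
}
```

Building the sketch with and without -DBI_ARCH_SKETCH exercises both
configurations; in each case force_aix_abi() ends up reporting ABI_AIX.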


-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
       [not found] <20030911193937.GD16280@redhat.com>
@ 2003-09-11 20:08 ` David Edelsohn
  2003-09-11 20:09   ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-09-11 20:08 UTC (permalink / raw)
  To: Richard Henderson, Jan Hubicka, gcc-patches

	In the recent decl2.c patch adding var_finalized_p

http://gcc.gnu.org/ml/gcc-patches/2003-09/msg00578.html

The function does not implement the same functionality as the inlined
versions.

+ /* Return true when the variable has been already expanded.  */
+ static bool
+ var_finalized_p (tree var)
+ {
+   if (flag_unit_at_a_time)
+     return TREE_ASM_WRITTEN (var);
+   else
+     return cgraph_varpool_node (var)->finalized;
+ }

!   if (TREE_ASM_WRITTEN (primary_vtbl)
!       || (flag_unit_at_a_time
! 	  && cgraph_varpool_node (primary_vtbl)->finalized))

The old version always tested TREE_ASM_WRITTEN.  The new version only
tests it for flag_unit_at_a_time.
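
The difference can be checked exhaustively with a small model.  This is
only a sketch: struct var_state and the function names are hypothetical
stand-ins, with asm_written modelling TREE_ASM_WRITTEN (var) and
finalized modelling cgraph_varpool_node (var)->finalized.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two bits the predicates consult.  */
struct var_state
{
  bool asm_written;   /* models TREE_ASM_WRITTEN (var) */
  bool finalized;     /* models cgraph_varpool_node (var)->finalized */
};

/* The test as it was written inline before the patch.  */
static bool
old_inline_test (struct var_state v, bool unit_at_a_time)
{
  return v.asm_written || (unit_at_a_time && v.finalized);
}

/* var_finalized_p as quoted above from the patch.  */
static bool
new_function (struct var_state v, bool unit_at_a_time)
{
  if (unit_at_a_time)
    return v.asm_written;
  else
    return v.finalized;
}

/* Count the input combinations (out of 8) on which the two disagree.  */
int
count_disagreements (void)
{
  int n = 0;
  for (int u = 0; u < 2; u++)
    for (int w = 0; w < 2; w++)
      for (int f = 0; f < 2; f++)
        {
          struct var_state v = { w != 0, f != 0 };
          if (old_inline_test (v, u != 0) != new_function (v, u != 0))
            n++;
        }
  return n;
}
```

In this model the two predicates disagree on three of the eight input
combinations, including the case where flag_unit_at_a_time is set and
the variable is finalized but not yet written.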

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:08 ` C++ testsuite failures on AIX David Edelsohn
@ 2003-09-11 20:09   ` Richard Henderson
  2003-09-11 20:17     ` David Edelsohn
  2003-09-11 20:29     ` Jan Hubicka
  0 siblings, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2003-09-11 20:09 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jan Hubicka, gcc-patches

On Thu, Sep 11, 2003 at 04:08:21PM -0400, David Edelsohn wrote:
> The old version always tested TREE_ASM_WRITTEN.  The new version only
> tests it for flag_unit_at_a_time.

Mark told Jan to do it that way.  Does changing it back fix
the problem?


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:09   ` Richard Henderson
@ 2003-09-11 20:17     ` David Edelsohn
  2003-09-11 20:20       ` Richard Henderson
  2003-09-11 20:29     ` Jan Hubicka
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-09-11 20:17 UTC (permalink / raw)
  To: Richard Henderson, Jan Hubicka, Mark Mitchell; +Cc: gcc-patches

>>>>> Richard Henderson writes:

Richard> Mark told Jan to do it that way.  Does changing it back fix
Richard> the problem?

	Early feedback: if I change the function to

  return (TREE_ASM_WRITTEN (var)
          || (flag_unit_at_a_time
              && cgraph_varpool_node (var)->finalized));

the symbol definitions reappear.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:17     ` David Edelsohn
@ 2003-09-11 20:20       ` Richard Henderson
  2003-09-11 20:49         ` Mark Mitchell
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2003-09-11 20:20 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jan Hubicka, Mark Mitchell, gcc-patches

On Thu, Sep 11, 2003 at 04:17:19PM -0400, David Edelsohn wrote:
>   return (TREE_ASM_WRITTEN (var)
>           || (flag_unit_at_a_time
>               && cgraph_varpool_node (var)->finalized));
> 
> the symbol definitions reappear.

Cool.  Can you commit that change then?  We can debate
why the other should have worked later.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:09   ` Richard Henderson
  2003-09-11 20:17     ` David Edelsohn
@ 2003-09-11 20:29     ` Jan Hubicka
  2003-09-11 20:31       ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2003-09-11 20:29 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Jan Hubicka, gcc-patches

> On Thu, Sep 11, 2003 at 04:08:21PM -0400, David Edelsohn wrote:
> > The old version always tested TREE_ASM_WRITTEN.  The new version only
> > tests it for flag_unit_at_a_time.
> 
> Mark told Jan to do it that way.  Does changing it back fix
> the problem?

I see now the conditional is the other way around.  Does swapping it back fix the problem?

Honza
> 
> 
> r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:29     ` Jan Hubicka
@ 2003-09-11 20:31       ` David Edelsohn
  2003-09-11 20:32         ` Jan Hubicka
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-09-11 20:31 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Henderson, gcc-patches

>>>>> Jan Hubicka writes:

Jan> I see now the conditional is the other way around.  Does swapping it back fix the problem?

	Do you want me to try it with the conditional reversed?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:31       ` David Edelsohn
@ 2003-09-11 20:32         ` Jan Hubicka
  2003-09-11 20:48           ` David Edelsohn
  2003-09-11 20:50           ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Jan Hubicka @ 2003-09-11 20:32 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jan Hubicka, Richard Henderson, gcc-patches

> >>>>> Jan Hubicka writes:
> 
> Jan> I see now the conditional is the other way around.  Does swapping it back fix the problem?
> 
> 	Do you want me to try it with the conditional reversed?
Yes, if it works, please commit it that way.
I don't know why it worked for me with the other conditional at all...

Honza
> 
> David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:32         ` Jan Hubicka
@ 2003-09-11 20:48           ` David Edelsohn
  2003-09-11 20:50           ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-09-11 20:48 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Henderson, gcc-patches

>>>>> Jan Hubicka writes:

Jan> Yes, if it works, please commit it that way.
Jan> I don't know why it worked for me with the other conditional at all...

	Swapping around the arms did work, so that is what I have
committed.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:20       ` Richard Henderson
@ 2003-09-11 20:49         ` Mark Mitchell
  2003-09-11 21:33           ` Jan Hubicka
  0 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2003-09-11 20:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, Jan Hubicka, gcc-patches

On Thu, 2003-09-11 at 13:20, Richard Henderson wrote:
> On Thu, Sep 11, 2003 at 04:17:19PM -0400, David Edelsohn wrote:
> >   return (TREE_ASM_WRITTEN (var)
> >           || (flag_unit_at_a_time
> >               && cgraph_varpool_node (var)->finalized));
> > 
> > the symbol definitions reappear.
> 
> Cool.  Can you commit that change then?  We can debate
> why the other should have worked later.

Agreed.

What that suggests is that some variables are bypassing the cgraph
stuff.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:32         ` Jan Hubicka
  2003-09-11 20:48           ` David Edelsohn
@ 2003-09-11 20:50           ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-09-11 20:50 UTC (permalink / raw)
  To: gcc-patches

	Just for the record, appended is the patch I committed.

David

        * decl2.c (var_finalized_p): Swap arms of conditional.

Index: decl2.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/decl2.c,v
retrieving revision 1.675
diff -c -p -r1.675 decl2.c
*** decl2.c	11 Sep 2003 05:55:16 -0000	1.675
--- decl2.c	11 Sep 2003 20:37:15 -0000
*************** static bool
*** 1627,1635 ****
  var_finalized_p (tree var)
  {
    if (flag_unit_at_a_time)
-     return TREE_ASM_WRITTEN (var);
-   else
      return cgraph_varpool_node (var)->finalized;
  }
  
  /* If necessary, write out the vtables for the dynamic class CTYPE.
--- 1627,1635 ----
  var_finalized_p (tree var)
  {
    if (flag_unit_at_a_time)
      return cgraph_varpool_node (var)->finalized;
+   else
+     return TREE_ASM_WRITTEN (var);
  }
  
  /* If necessary, write out the vtables for the dynamic class CTYPE.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: C++ testsuite failures on AIX
  2003-09-11 20:49         ` Mark Mitchell
@ 2003-09-11 21:33           ` Jan Hubicka
  0 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2003-09-11 21:33 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Richard Henderson, David Edelsohn, Jan Hubicka, gcc-patches

> On Thu, 2003-09-11 at 13:20, Richard Henderson wrote:
> > On Thu, Sep 11, 2003 at 04:17:19PM -0400, David Edelsohn wrote:
> > >   return (TREE_ASM_WRITTEN (var)
> > >           || (flag_unit_at_a_time
> > >               && cgraph_varpool_node (var)->finalized));
> > > 
> > > the symbol definitions reappear.
> > 
> > Cool.  Can you commit that change then?  We can debate
> > why the other should have worked later.
> 
> Agreed.
> 
> What that suggests is that some variables are bypassing the cgraph
> stuff.
The problem was that I messed up the conditional.
I should not have made that patch in such a hurry....

Honza
> 
> -- 
> Mark Mitchell
> CodeSourcery, LLC
> mark@codesourcery.com
> 

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
@ 2003-10-24 13:52 Ulrich Weigand
  2003-10-24 17:32 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Ulrich Weigand @ 2003-10-24 13:52 UTC (permalink / raw)
  To: Fariborz Jahanian
  Cc: David Edelsohn, Richard Henderson, ian, davem, gcc-patches


Fariborz Jahanian wrote:

>Yes. Apple's very early implementation assumed word_mode == DImode,
>and emit_group{load/store} just worked by using PARALLEL to map function
>arguments/return values to register pairs.  Problems started arising in
>other parts of the compiler, which use word_mode == DImode to figure out,
>for example, the number of registers used in passing a structure by
>value, with the register size remaining 32-bit.

Could you elaborate what the problem with struct-by-value was?
This appears to work fine for me by returning false from
FUNCTION_ARG_PASS_BY_REFERENCE and then returning a PARALLEL
from FUNCTION_ARG for that struct parameter.



Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand
  Linux for S/390 Design & Development
  IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen
  Phone: +49-7031/16-3727   ---   Email: Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 13:52 [PATCH] - Use of powerpc 64bit instructions in 32bit ABI Ulrich Weigand
@ 2003-10-24 17:32 ` David Edelsohn
  2003-10-24 18:07   ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-10-24 17:32 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Fariborz Jahanian, Richard Henderson, ian, davem, gcc-patches

>>>>> Ulrich Weigand writes:

Ulrich> Could you elaborate what the problem with struct-by-value was?
Ulrich> This appears to work fine for me by returning false from
Ulrich> FUNCTION_ARG_PASS_BY_REFERENCE and then returning a PARALLEL
Ulrich> from FUNCTION_ARG for that struct parameter.

	Notice the use of UNITS_PER_WORD, BITS_PER_WORD, and word_mode in
the following functions that depend on the argument wordsize and ABI:

calls.c:load_register_parameters()

          else if (TYPE_MODE (TREE_TYPE (args[i].tree_value)) == BLKmode)
            {
              size = int_size_in_bytes (TREE_TYPE (args[i].tree_value));
              nregs = (size + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD; <***
            }


calls.c:store_one_arg()

      /* Round its size up to a multiple
         of the allocation unit for arguments.  */

      if (arg->locate.size.var != 0)
        {
          excess = 0;
          size_rtx = ARGS_SIZE_RTX (arg->locate.size);
        }
      else
        {
          /* PUSH_ROUNDING has no effect on us, because
             emit_push_insn for BLKmode is careful to avoid it.  */
          excess = (arg->locate.size.constant
                    - int_size_in_bytes (TREE_TYPE (pval))
                    + partial * UNITS_PER_WORD); <***
          size_rtx = expand_expr (size_in_bytes (TREE_TYPE (pval)),
                                  NULL_RTX, TYPE_MODE (sizetype), 0);
        }


expr.c:emit_push_insn()

  if (mode == BLKmode)
    {
      /* Copy a block into the stack, entirely or partially.  */

      rtx temp;
      int used = partial * UNITS_PER_WORD; <***

...

  else if (partial > 0)
    {
      /* Scalar partly in registers.  */

      int size = GET_MODE_SIZE (mode) / UNITS_PER_WORD; <***
      int i;
      int not_stack;
      /* # words of start of argument
         that we must make space for but need not store.  */
      int offset = partial % (PARM_BOUNDARY / BITS_PER_WORD); <***
...
        if (i >= not_stack + offset)
          emit_push_insn (operand_subword_force (x, i, mode),
                     ***> word_mode, NULL_TREE, NULL_RTX, align, 0, NULL_RTX,
                          0, args_addr,
                          GEN_INT (args_offset + ((i - not_stack + skip)
                                                  * UNITS_PER_WORD)),
                          reg_parm_stack_space, alignment_pad);


function.c:assign_parm()

      if (nominal_mode == BLKmode
#ifdef BLOCK_REG_PADDING
          || (locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
              && GET_MODE_SIZE (promoted_mode) < UNITS_PER_WORD) <***
#endif
          || GET_CODE (entry_parm) == PARALLEL)
        {
          /* If a BLKmode arrives in registers, copy it to a stack slot.
             Handle calls that pass values in multiple non-contiguous
             locations.  The Irix 6 ABI has examples of this.  */
          if (GET_CODE (entry_parm) == REG
              || GET_CODE (entry_parm) == PARALLEL)
            {
              int size = int_size_in_bytes (TREE_TYPE (parm));
              int size_stored = CEIL_ROUND (size, UNITS_PER_WORD); <***
...
              /* If SIZE is that of a mode no bigger than a word, just use
                 that mode's store operation.  */
              else if (size <= UNITS_PER_WORD) <***
                {
                  enum machine_mode mode
                    = mode_for_size (size * BITS_PER_UNIT, MODE_INT, 0);

                  if (mode != BLKmode
#ifdef BLOCK_REG_PADDING
                      && (size == UNITS_PER_WORD <***
                          || (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
                              != (BYTES_BIG_ENDIAN ? upward : downward)))
#endif
                      )
                    {
                      rtx reg = gen_rtx_REG (mode, REGNO (entry_parm));
                      emit_move_insn (change_address (mem, mode, 0), reg);
                    }

                  /* Blocks smaller than a word on a BYTES_BIG_ENDIAN
                     machine must be aligned to the left before storing
                     to memory.  Note that the previous test doesn't
                     handle all cases (e.g. SIZE == 3).  */
                  else if (size != UNITS_PER_WORD <***
#ifdef BLOCK_REG_PADDING
                           && (BLOCK_REG_PADDING (mode, TREE_TYPE (parm), 1)
                               == downward)
#else
                           && BYTES_BIG_ENDIAN
#endif
                           )
                    {
                      rtx tem, x;
                      int by = (UNITS_PER_WORD - size) * BITS_PER_UNIT; <***
                      rtx reg = gen_rtx_REG (word_mode, REGNO (entry_parm));

                      x = expand_binop (word_mode, ashl_optab, reg,
                                        GEN_INT (by), 0, 1, OPTAB_WIDEN);
                      tem = change_address (mem, word_mode, 0);
                      emit_move_insn (tem, x);
                    }
                  else
                    move_block_from_reg (REGNO (entry_parm), mem,
                                         size_stored / UNITS_PER_WORD); <***
                }
              else
                move_block_from_reg (REGNO (entry_parm), mem,
                                     size_stored / UNITS_PER_WORD); <***
            }
          SET_DECL_RTL (parm, stack_parm);
        }

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 17:32 ` David Edelsohn
@ 2003-10-24 18:07   ` Richard Henderson
  2003-10-24 18:34     ` Fariborz Jahanian
  2003-10-24 18:34     ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2003-10-24 18:07 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Ulrich Weigand, Fariborz Jahanian, ian, davem, gcc-patches

On Fri, Oct 24, 2003 at 01:31:41PM -0400, David Edelsohn wrote:
> 	Notice the use of UNITS_PER_WORD, BITS_PER_WORD, and word_mode in
> the following functions that depend on the argument wordsize and ABI:
[...]
> calls.c:load_register_parameters()

If you're returning parallels from function_arg, you won't get here.

> calls.c:store_one_arg()

Or here.

> function.c:assign_parm()

You definitely won't get to all the bits you marked here.

> expr.c:emit_push_insn()

Dunno about here, but if we're really interested in block copy, then
you *do* want to move in 64-bit hunks.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 18:07   ` Richard Henderson
  2003-10-24 18:34     ` Fariborz Jahanian
@ 2003-10-24 18:34     ` David Edelsohn
  2003-10-24 18:57       ` Richard Henderson
                         ` (2 more replies)
  1 sibling, 3 replies; 875+ messages in thread
From: David Edelsohn @ 2003-10-24 18:34 UTC (permalink / raw)
  To: Richard Henderson, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

>>>>> Richard Henderson writes:

Richard> If you're returning parallels from function_arg, you won't get here.

	You are suggesting that function_arg() return PARALLELs for all
arguments in this 32/64 mode, not just "long long" and "double" arguments
that need to be split across two registers in 32-bit mode?  That's a lot
of code to emulate the GCC defaults in 32-bit mode.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 18:07   ` Richard Henderson
@ 2003-10-24 18:34     ` Fariborz Jahanian
  2003-10-24 18:34     ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: Fariborz Jahanian @ 2003-10-24 18:34 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, Ulrich Weigand, ian, davem, gcc-patches


On Friday, October 24, 2003, at 11:07 AM, Richard Henderson wrote:

> On Fri, Oct 24, 2003 at 01:31:41PM -0400, David Edelsohn wrote:
>> 	Notice the use of UNITS_PER_WORD, BITS_PER_WORD, and word_mode in
>> the following functions that depend on the argument wordsize and ABI:
> [...]
>> calls.c:load_register_parameters()
>
> If you're returning parallels from function_arg, you won't get here.

This routine is called for passing args in registers. The above code is
used in the following example (which has no long long, but passes a
structure by value in 32-bit registers using UNITS_PER_WORD).

struct S {
         long i1;
         long i2;
         long i3;
         long i4;
};

void foo(struct S ll);

void add(struct S val)
{
     foo( val );
}
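
To make the register count concrete, here is a small sketch of the
rounding from the load_register_parameters excerpt.  The function names
are hypothetical, and the 16-byte size assumes that long is 4 bytes
under the 32-bit ABI, so that struct S occupies four 32-bit words.

```c
#include <assert.h>

/* Same rounding as the calls.c excerpt: number of words needed to
   hold SIZE_IN_BYTES when a word is UNITS_PER_WORD bytes wide.  */
int
nregs_for_block (int size_in_bytes, int units_per_word)
{
  return (size_in_bytes + (units_per_word - 1)) / units_per_word;
}

/* struct S above, assuming four 4-byte longs (16 bytes in total).  */
void
check_struct_s (void)
{
  /* 4-byte words: four registers, as the 32-bit ABI expects.  */
  assert (nregs_for_block (16, 4) == 4);
  /* 8-byte words: the same formula yields only two registers.  */
  assert (nregs_for_block (16, 8) == 2);
}
```

With UNITS_PER_WORD equal to 8 the formula counts two registers for
struct S, while the 32-bit ABI passes it in four, which is the kind of
mismatch being described here.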

- Fariborz

>
>> calls.c:store_one_arg()
>
> Or here.
>
>> function.c:assign_parm()
>
> You definitely won't get to all the bits you marked here.
>
>> expr.c:emit_push_insn()
>
> Dunno about here, but if we're really interested in block copy, then
> you *do* want to move in 64-bit hunks.
>
>
> r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 18:34     ` David Edelsohn
@ 2003-10-24 18:57       ` Richard Henderson
  2003-11-03 20:55         ` David Edelsohn
  2003-10-24 19:08       ` Geoff Keating
  2003-10-24 19:22       ` Dale Johannesen
  2 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2003-10-24 18:57 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Ulrich Weigand, Fariborz Jahanian, ian, davem, gcc-patches

On Fri, Oct 24, 2003 at 02:27:32PM -0400, David Edelsohn wrote:
> 	You are suggesting that function_arg() return PARALLELs for all
> arguments in this 32/64 mode, not just "long long" and "double" arguments
> that need to be split across two registers in 32-bit mode?

Yes.

> That's a lot of code to emulate the GCC defaults in 32-bit mode.

I didn't think the AIX/Darwin ABI was that complicated.  I know
the SVR4 one is ugly, but...


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 18:34     ` David Edelsohn
  2003-10-24 18:57       ` Richard Henderson
@ 2003-10-24 19:08       ` Geoff Keating
  2003-10-24 19:10         ` David Edelsohn
  2003-10-24 19:22       ` Dale Johannesen
  2 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2003-10-24 19:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> >>>>> Richard Henderson writes:
> 
> Richard> If you're returning parallels from function_arg, you won't get here.
> 
> 	You are suggesting that function_arg() return PARALLELs for all
> arguments in this 32/64 mode, not just "long long" and "double" arguments
> that need to be split across two registers in 32-bit mode?  That's a lot
> of code to emulate the GCC defaults in 32-bit mode.

It should only be necessary to do this for arguments that would
otherwise be passed in GPRs.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 19:08       ` Geoff Keating
@ 2003-10-24 19:10         ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2003-10-24 19:10 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

>>>>> Geoff Keating writes:

Geoff> It should only be necessary to do this for arguments that would
Geoff> otherwise be passed in GPRs.

	That's most arguments: ints, structs, unions, FP without
prototype, stdarg.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 18:34     ` David Edelsohn
  2003-10-24 18:57       ` Richard Henderson
  2003-10-24 19:08       ` Geoff Keating
@ 2003-10-24 19:22       ` Dale Johannesen
  2003-10-24 19:28         ` Dale Johannesen
                           ` (2 more replies)
  2 siblings, 3 replies; 875+ messages in thread
From: Dale Johannesen @ 2003-10-24 19:22 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Dale Johannesen, Richard Henderson, Ulrich Weigand,
	Fariborz Jahanian, ian, davem, gcc-patches

On Friday, October 24, 2003, at 11:27  AM, David Edelsohn wrote:
>>>>>> Richard Henderson writes:
>
> Richard> If you're returning parallels from function_arg, you won't 
> get here.
>
> 	You are suggesting that function_arg() return PARALLELs for all
> arguments in this 32/64 mode, not just "long long" and "double" 
> arguments
> that need to be split across two registers in 32-bit mode?  That's a 
> lot
> of code to emulate the GCC defaults in 32-bit mode.

Can't the Sparc also benefit from the proposed changes in the
machine-dependent area?

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 19:22       ` Dale Johannesen
@ 2003-10-24 19:28         ` Dale Johannesen
  2003-10-24 21:25         ` Richard Henderson
  2003-10-25  3:10         ` David S. Miller
  2 siblings, 0 replies; 875+ messages in thread
From: Dale Johannesen @ 2003-10-24 19:28 UTC (permalink / raw)
  To: Dale Johannesen
  Cc: David Edelsohn, Richard Henderson, Ulrich Weigand,
	Fariborz Jahanian, ian, davem, gcc-patches

On Friday, October 24, 2003, at 12:20  PM, Dale Johannesen wrote:
>
> Can't the Sparc also benefit from the proposed changes in the
> machine-dependent area?

s/dependent/independent/ of course, sorry.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 19:22       ` Dale Johannesen
  2003-10-24 19:28         ` Dale Johannesen
@ 2003-10-24 21:25         ` Richard Henderson
  2003-10-25  3:10         ` David S. Miller
  2 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2003-10-24 21:25 UTC (permalink / raw)
  To: Dale Johannesen
  Cc: David Edelsohn, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

On Fri, Oct 24, 2003 at 12:20:29PM -0700, Dale Johannesen wrote:
> Can't the Sparc also benefit from the proposed changes in the
> machine-dependent area?

I don't know.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 19:22       ` Dale Johannesen
  2003-10-24 19:28         ` Dale Johannesen
  2003-10-24 21:25         ` Richard Henderson
@ 2003-10-25  3:10         ` David S. Miller
  2 siblings, 0 replies; 875+ messages in thread
From: David S. Miller @ 2003-10-25  3:10 UTC (permalink / raw)
  To: Dale Johannesen
  Cc: dje, dalej, rth, Ulrich.Weigand, fjahanian, ian, gcc-patches

On Fri, 24 Oct 2003 12:20:29 -0700
Dale Johannesen <dalej@apple.com> wrote:

> On Friday, October 24, 2003, at 11:27  AM, David Edelsohn wrote:
> >>>>>> Richard Henderson writes:
> >
> > Richard> If you're returning parallels from function_arg, you won't 
> > get here.
> >
> > 	You are suggesting that function_arg() return PARALLELs for all
> > arguments in this 32/64 mode, not just "long long" and "double" 
> > arguments
> > that need to be split across two registers in 32-bit mode?  That's a 
> > lot
> > of code to emulate the GCC defaults in 32-bit mode.
> 
> Can't the Sparc also benefit from the proposed changes in the
> machine-dependent area?

The current sparc code, no.  If and when I rewrite the sparc 64-in-32
code to be more useful, it probably would benefit.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-10-24 18:57       ` Richard Henderson
@ 2003-11-03 20:55         ` David Edelsohn
  2003-12-01 14:27           ` Eric Botcazou
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-11-03 20:55 UTC (permalink / raw)
  To: Richard Henderson, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

	After the recent additional changes to the rs6000 backend to
support -mpowerpc64, the following patch contains the remaining pieces
that are required for this mode to produce correct and efficient code.
This patch bootstraps with no regressions.

	This patch does three things:

1) Fixes a bug in expand_call() for PARALLEL where "target" may be
deleted.

2) Uses GET_MODE_SIZE (GET_MODE (...)) on the elements of the PARALLEL
vector for various offset computations instead of assuming
UNITS_PER_WORD. 

3) Uses and manipulates a parameter in a register if it already is present
instead of choosing a memory copy.
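
For illustration only (this is not part of the patch): a minimal C sketch
of why item 2 matters. The constants and helper names below are
hypothetical stand-ins, assuming a -mpowerpc64 configuration where the
machine word is 8 bytes while the 32-bit ABI passes arguments in 4-byte
pieces; the 8-byte boundary passed to offset_in_slot() in the note below
is likewise an assumption.

```c
#include <stdio.h>

/* Hypothetical stand-ins for target macros; the values are assumptions
   modelling 64-bit registers used under a 32-bit ABI.  */
enum {
  SKETCH_UNITS_PER_WORD = 8,  /* bytes in a machine word */
  SKETCH_ELT_SIZE = 4         /* bytes in one PARALLEL element (SImode) */
};

/* Bytes already passed in registers, assuming word-sized pieces.  */
static int
used_from_word (int partial)
{
  return partial * SKETCH_UNITS_PER_WORD;
}

/* Bytes already passed in registers, using the element's own size.  */
static int
used_from_elt (int partial, int elt_size)
{
  return partial * elt_size;
}

/* Offset of the remaining bytes within a parameter slot.  */
static int
offset_in_slot (int used, int parm_boundary_bytes)
{
  return used % parm_boundary_bytes;
}
```

With three pieces passed in registers, the word-size assumption counts
24 bytes as used while the element-size computation counts 12. With one
piece and an 8-byte boundary, the resulting offsets are 0 and 4.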

David


2003-11-03  Fariborz Jahanian  <fjahanian@apple.com>
            David Edelsohn  <edelsohn@gnu.org>

	* calls.c (expand_call): Allocate new temp in pass1.
	(store_one_arg): If PARALLEL, calculate excess using mode size of
	rtvec elt. 
	* expr.c (emit_push_insn): If PARALLEL, calculate offset using
	mode size of rtvec elt.
	* function.c (assign_parms): Use parm in register, if available.

Index: calls.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/calls.c,v
retrieving revision 1.304
diff -c -p -r1.304 calls.c
*** calls.c	7 Oct 2003 19:48:17 -0000	1.304
--- calls.c	3 Nov 2003 20:45:45 -0000
*************** expand_call (tree exp, rtx target, int i
*** 2101,2106 ****
--- 2101,2107 ----
  #endif
  
    int initial_highest_arg_in_use = highest_outgoing_arg_in_use;
+   rtx temp_target = 0;
    char *initial_stack_usage_map = stack_usage_map;
  
    int old_stack_allocated;
*************** expand_call (tree exp, rtx target, int i
*** 3215,3221 ****
  	 The Irix 6 ABI has examples of this.  */
        else if (GET_CODE (valreg) == PARALLEL)
  	{
! 	  if (target == 0)
  	    {
  	      /* This will only be assigned once, so it can be readonly.  */
  	      tree nt = build_qualified_type (TREE_TYPE (exp),
--- 3216,3226 ----
  	 The Irix 6 ABI has examples of this.  */
        else if (GET_CODE (valreg) == PARALLEL)
  	{
! 	  /* Second condition is added because "target" is freed at the
! 	     end of "pass0" for -O2 when call is made to
! 	     expand_end_target_temps ().  Its "in_use" flag has been set
! 	     to false, so allocate a new temp.  */
! 	  if (target == 0 || (pass == 1 && target == temp_target))
  	    {
  	      /* This will only be assigned once, so it can be readonly.  */
  	      tree nt = build_qualified_type (TREE_TYPE (exp),
*************** expand_call (tree exp, rtx target, int i
*** 3223,3228 ****
--- 3228,3234 ----
  					       | TYPE_QUAL_CONST));
  
  	      target = assign_temp (nt, 0, 1, 1);
+ 	      temp_target = target;
  	      preserve_temp_slots (target);
  	    }
  
*************** store_one_arg (struct arg_data *arg, rtx
*** 4559,4567 ****
  	{
  	  /* PUSH_ROUNDING has no effect on us, because
  	     emit_push_insn for BLKmode is careful to avoid it.  */
! 	  excess = (arg->locate.size.constant
! 		    - int_size_in_bytes (TREE_TYPE (pval))
! 		    + partial * UNITS_PER_WORD);
  	  size_rtx = expand_expr (size_in_bytes (TREE_TYPE (pval)),
  				  NULL_RTX, TYPE_MODE (sizetype), 0);
  	}
--- 4565,4582 ----
  	{
  	  /* PUSH_ROUNDING has no effect on us, because
  	     emit_push_insn for BLKmode is careful to avoid it.  */
! 	  if (reg && GET_CODE (reg) == PARALLEL)
! 	  {
! 	    /* Use the size of the elt to compute excess.  */
! 	    rtx elt = XEXP (XVECEXP (reg, 0, 0), 0);
! 	    excess = (arg->locate.size.constant
! 		      - int_size_in_bytes (TREE_TYPE (pval))
! 		      + partial * GET_MODE_SIZE (GET_MODE (elt)));
! 	  } 
! 	  else
! 	    excess = (arg->locate.size.constant
! 		      - int_size_in_bytes (TREE_TYPE (pval))
! 		      + partial * UNITS_PER_WORD);
  	  size_rtx = expand_expr (size_in_bytes (TREE_TYPE (pval)),
  				  NULL_RTX, TYPE_MODE (sizetype), 0);
  	}
Index: expr.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expr.c,v
retrieving revision 1.598
diff -c -p -r1.598 expr.c
*** expr.c	1 Nov 2003 00:59:53 -0000	1.598
--- expr.c	3 Nov 2003 20:45:48 -0000
*************** emit_push_insn (rtx x, enum machine_mode
*** 3466,3473 ****
  
        rtx temp;
        int used = partial * UNITS_PER_WORD;
!       int offset = used % (PARM_BOUNDARY / BITS_PER_UNIT);
        int skip;
  
        if (size == 0)
  	abort ();
--- 3466,3483 ----
  
        rtx temp;
        int used = partial * UNITS_PER_WORD;
!       int offset;
        int skip;
+ 
+       if (reg && GET_CODE (reg) == PARALLEL)
+ 	{
+ 	  /* Use the size of the elt to compute offset.  */
+ 	  rtx elt = XEXP (XVECEXP (reg, 0, 0), 0);
+ 	  used = partial * GET_MODE_SIZE (GET_MODE (elt));
+ 	  offset = used % (PARM_BOUNDARY / BITS_PER_UNIT);
+ 	}
+       else
+ 	offset = used % (PARM_BOUNDARY / BITS_PER_UNIT);
  
        if (size == 0)
  	abort ();
Index: function.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/function.c,v
retrieving revision 1.465
diff -c -p -r1.465 function.c
*** function.c	1 Nov 2003 02:23:44 -0000	1.465
--- function.c	3 Nov 2003 20:45:50 -0000
*************** assign_parms (tree fndecl)
*** 4703,4708 ****
--- 4703,4717 ----
  
  	 Set DECL_RTL to that place.  */
  
+       if (GET_CODE (entry_parm) == PARALLEL && nominal_mode != BLKmode)
+ 	{
+ 	  /* Objects the size of a register can be combined in registers */
+ 	  rtx parmreg = gen_reg_rtx (nominal_mode);
+ 	  emit_group_store (parmreg, entry_parm, TREE_TYPE (parm),
+ 			    int_size_in_bytes (TREE_TYPE (parm)));
+ 	  SET_DECL_RTL (parm, parmreg);
+ 	}
+ 
        if (nominal_mode == BLKmode
  #ifdef BLOCK_REG_PADDING
  	  || (locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
*************** assign_parms (tree fndecl)
*** 4726,4732 ****
  		 assign_stack_local if space was not allocated in the argument
  		 list.  If it was, this will not work if PARM_BOUNDARY is not
  		 a multiple of BITS_PER_WORD.  It isn't clear how to fix this
! 		 if it becomes a problem.  */
  
  	      if (stack_parm == 0)
  		{
--- 4735,4742 ----
  		 assign_stack_local if space was not allocated in the argument
  		 list.  If it was, this will not work if PARM_BOUNDARY is not
  		 a multiple of BITS_PER_WORD.  It isn't clear how to fix this
! 		 if it becomes a problem.  Exception is when BLKmode arrives
! 		 with arguments not conforming to word_mode.  */
  
  	      if (stack_parm == 0)
  		{
*************** assign_parms (tree fndecl)
*** 4734,4740 ****
  		  PUT_MODE (stack_parm, GET_MODE (entry_parm));
  		  set_mem_attributes (stack_parm, parm, 1);
  		}
! 
  	      else if (PARM_BOUNDARY % BITS_PER_WORD != 0)
  		abort ();
  
--- 4744,4752 ----
  		  PUT_MODE (stack_parm, GET_MODE (entry_parm));
  		  set_mem_attributes (stack_parm, parm, 1);
  		}
! 	      else if (GET_CODE (entry_parm) == PARALLEL 
! 		       && GET_MODE(entry_parm) == BLKmode)
! 		;
  	      else if (PARM_BOUNDARY % BITS_PER_WORD != 0)
  		abort ();
  
*************** assign_parms (tree fndecl)
*** 4797,4803 ****
  		move_block_from_reg (REGNO (entry_parm), mem,
  				     size_stored / UNITS_PER_WORD);
  	    }
! 	  SET_DECL_RTL (parm, stack_parm);
  	}
        else if (! ((! optimize
  		   && ! DECL_REGISTER (parm))
--- 4809,4818 ----
  		move_block_from_reg (REGNO (entry_parm), mem,
  				     size_stored / UNITS_PER_WORD);
  	    }
! 	  /* If parm is already bound to register pair, don't change 
! 	     this binding. */
! 	  if (! DECL_RTL_SET_P (parm))
! 	    SET_DECL_RTL (parm, stack_parm);
  	}
        else if (! ((! optimize
  		   && ! DECL_REGISTER (parm))

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-11-03 20:55         ` David Edelsohn
@ 2003-12-01 14:27           ` Eric Botcazou
  2003-12-01 15:46             ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Eric Botcazou @ 2003-12-01 14:27 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Richard Henderson, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

> *** function.c	1 Nov 2003 02:23:44 -0000	1.465
> --- function.c	3 Nov 2003 20:45:50 -0000
> *************** assign_parms (tree fndecl)
> *** 4703,4708 ****
> --- 4703,4717 ----
>
>   	 Set DECL_RTL to that place.  */
>
> +       if (GET_CODE (entry_parm) == PARALLEL && nominal_mode != BLKmode)
> + 	{
> + 	  /* Objects the size of a register can be combined in registers */
> + 	  rtx parmreg = gen_reg_rtx (nominal_mode);
> + 	  emit_group_store (parmreg, entry_parm, TREE_TYPE (parm),
> + 			    int_size_in_bytes (TREE_TYPE (parm)));
> + 	  SET_DECL_RTL (parm, parmreg);
> + 	}
> +
>         if (nominal_mode == BLKmode
>   #ifdef BLOCK_REG_PADDING
>
>   	  || (locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
>

This hunk introduced a pessimization on SPARC64 (verified at -O2) and 
probably on SPARC too.  Here is an example (extracted from a dg-compat 
testcase which segfaults after the patch, maybe a latent bug elsewhere):

typedef struct { _Complex char a; } Scc1;

void checkScc1 (Scc1 x, _Complex char y)
{
  if (x.a != y)
    abort ();
}


'x' is passed in a register and a stack slot is reserved (64-bit ABI). This 
is specified by the back-end with 'entry_parm' set to

(parallel:CQI [
        (expr_list (reg:DI %i0 [ x ])
            (const_int 0 [0x0]))
    ])

Now the new code generates:

(concat:CQI (reg:QI 107)
    (reg:QI 108))

for 'parmreg' so emit_group_store needs to spill %i0, which generates more 
instructions, and it spills it to the frame instead of spilling it to the 
reserved stack slot, which takes more space in the frame.


I think the code should be guarded by severe sanity checks.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-01 14:27           ` Eric Botcazou
@ 2003-12-01 15:46             ` David Edelsohn
  2003-12-01 16:15               ` Eric Botcazou
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-12-01 15:46 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Richard Henderson, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

>>>>> Eric Botcazou writes:

Eric> I think the code should be guarded by severe sanity checks.

	Any suggestion for the sanity checks you think would be
appropriate?

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-01 15:46             ` David Edelsohn
@ 2003-12-01 16:15               ` Eric Botcazou
  2003-12-01 18:44                 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Eric Botcazou @ 2003-12-01 16:15 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Richard Henderson, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

> 	Any suggestion for the sanity checks you think would be
> appropriate?

I'd say:
(1) check that the size is not greater than a word,
(2) check that the rtx returned by gen_reg_rtx is really a REG,
(3) ideally, it would be nice to have a bit more control over the insns 
emitted for the move, because it appears that emit_group_store can really do 
scary things :-)  But I don't know if this is really doable.

That said, the code may have been added for a specific pattern so the 
suggestions above may of course be totally void.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-01 16:15               ` Eric Botcazou
@ 2003-12-01 18:44                 ` David Edelsohn
  2003-12-01 19:29                   ` Fariborz Jahanian
  2003-12-02  8:00                   ` Eric Botcazou
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2003-12-01 18:44 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Richard Henderson, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

>>>>> Eric Botcazou writes:

Eric> I'd say:
Eric> (1) check that the size is not greater than a word,
Eric> (2) check that the rtx returned by gen_reg_rtx is really a REG,
Eric> (3) ideally, it would be nice to have a bit more control over the insns 
Eric> emitted for the move, because it appears that emit_group_store can really do 
Eric> scary things :-)  But I don't know if this is really doable.

Eric> That said, the code may have been added for a specific pattern so the 
Eric> suggestions above may of course be totally void.

	I think some scenarios exist where (1) might be too restrictive.

	I'll test with adding a check that gen_reg_rtx() returned a REG
because CONCAT is not useful in this scenario.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-01 18:44                 ` David Edelsohn
@ 2003-12-01 19:29                   ` Fariborz Jahanian
  2003-12-01 19:33                     ` David Edelsohn
  2003-12-02  8:00                   ` Eric Botcazou
  1 sibling, 1 reply; 875+ messages in thread
From: Fariborz Jahanian @ 2003-12-01 19:29 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Eric Botcazou, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

This code was added for the pattern generated when a 'long long' is
passed by value in -mpowerpc64 mode (64bit insns with 32bit ABI).
The following diff should fix the SPARC problem.

Index: function.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/function.c,v
retrieving revision 1.469
diff -c -p -r1.469 function.c
*** function.c  20 Nov 2003 22:42:01 -0000      1.469
--- function.c  1 Dec 2003 19:24:14 -0000
*************** assign_parms (tree fndecl)
*** 4704,4710 ****

          Set DECL_RTL to that place.  */

!       if (GET_CODE (entry_parm) == PARALLEL && nominal_mode != BLKmode)
         {
           /* Objects the size of a register can be combined in registers */
           rtx parmreg = gen_reg_rtx (nominal_mode);
--- 4704,4710 ----

          Set DECL_RTL to that place.  */

!       if (GET_CODE (entry_parm) == PARALLEL && nominal_mode == DImode)
         {
           /* Objects the size of a register can be combined in registers */
           rtx parmreg = gen_reg_rtx (nominal_mode);


- Fariborz

On Monday, December 1, 2003, at 10:43 AM, David Edelsohn wrote:

>>>>>> Eric Botcazou writes:
>
> Eric> I'd say:
> Eric> (1) check that the size is not greater than a word,
> Eric> (2) check that the rtx returned by gen_reg_rtx is really a REG,
> Eric> (3) ideally, it would be nice to have a bit more control over 
> the insns
> Eric> emitted for the move, because it appears that emit_group_store 
> can really do
> Eric> scary things :-)  But I don't know if this is really doable.
>
> Eric> That said, the code may have been added for a specific pattern 
> so the
> Eric> suggestions above may of course be totally void.
>
> 	I think some scenarios exist where (1) might be too restrictive.
>
> 	I'll test with adding a check that gen_reg_rtx() returned a REG
> because CONCAT is not useful in this scenario.
>
> Thanks, David
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-01 19:29                   ` Fariborz Jahanian
@ 2003-12-01 19:33                     ` David Edelsohn
  2003-12-02  9:20                       ` Eric Botcazou
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-12-01 19:33 UTC (permalink / raw)
  To: Fariborz Jahanian
  Cc: Eric Botcazou, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

>>>>> Fariborz Jahanian writes:

Fariborz> This code was added for when pattern for 'long long' is
Fariborz> passed-by-value is generated in  -mpowerpc64 mode (64bit insns
Fariborz> with 32bit ABI).
Fariborz> Following diff should fix the SPARC problem.

	Testing on the mode is not appropriate.  This is in a common part
of the compiler and should not be tuned for the configuration of a
specific platform.

	I am testing with the following patch.

Thanks, David

Index: function.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/function.c,v
retrieving revision 1.470
diff -c -p -r1.470 function.c
*** function.c	24 Nov 2003 21:19:33 -0000	1.470
--- function.c	1 Dec 2003 19:29:02 -0000
*************** assign_parms (tree fndecl)
*** 4706,4716 ****
  
        if (GET_CODE (entry_parm) == PARALLEL && nominal_mode != BLKmode)
  	{
! 	  /* Objects the size of a register can be combined in registers */
  	  rtx parmreg = gen_reg_rtx (nominal_mode);
! 	  emit_group_store (parmreg, entry_parm, TREE_TYPE (parm),
! 			    int_size_in_bytes (TREE_TYPE (parm)));
! 	  SET_DECL_RTL (parm, parmreg);
  	}
  
        if (nominal_mode == BLKmode
--- 4706,4720 ----
  
        if (GET_CODE (entry_parm) == PARALLEL && nominal_mode != BLKmode)
  	{
! 	  /* Objects the size of a register can be combined in registers.  */
  	  rtx parmreg = gen_reg_rtx (nominal_mode);
! 
! 	  if (REG_P (parmreg))
! 	    {
! 	      emit_group_store (parmreg, entry_parm, TREE_TYPE (parm),
! 				int_size_in_bytes (TREE_TYPE (parm)));
! 	      SET_DECL_RTL (parm, parmreg);
! 	    }
  	}
  
        if (nominal_mode == BLKmode

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-01 18:44                 ` David Edelsohn
  2003-12-01 19:29                   ` Fariborz Jahanian
@ 2003-12-02  8:00                   ` Eric Botcazou
  1 sibling, 0 replies; 875+ messages in thread
From: Eric Botcazou @ 2003-12-02  8:00 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Richard Henderson, Ulrich Weigand, Fariborz Jahanian, ian, davem,
	gcc-patches

> 	I think some scenarios exist where (1) might be too restrictive.

Ok, but then what does the comment mean exactly?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-01 19:33                     ` David Edelsohn
@ 2003-12-02  9:20                       ` Eric Botcazou
  2003-12-02 16:17                         ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Eric Botcazou @ 2003-12-02  9:20 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Fariborz Jahanian, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

> 	I am testing with the following patch.

It cures the pessimization for my original testcase.  However, I think this 
is not enough for the following slightly tweaked case:

typedef struct { char a; } Scc1;

void checkScc1 (Scc1 x, _Complex char y)
{
  if (x.a != Re(y))
    abort ();
}

We start with:

(parallel:QI [
        (expr_list (reg:DI %i0)
            (const_int 0 [0x0]))
    ])

and the new transformation generates

(reg:QI 107)

so emit_group_store emits a move between the two locations, which requires 9 
more insns in the 01.rtl file than the original code, because this time no 
reg is spilled so a combination of ASHIFT, SUBREG, AND and OR is used.  This 
is fully recovered at -O2, but not at -O1.
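
To make the kind of shift-and-mask sequence described above concrete,
here is a minimal C sketch. It is not the RTL that emit_group_store
actually emits; it assumes, purely for illustration, that the byte of
interest sits at the most significant end of the 64-bit register, and
both helper names are hypothetical.

```c
#include <stdint.h>

/* Pull the byte out of the top of a 64-bit register value.
   The left-justified placement is an assumption of this sketch.  */
static uint8_t
extract_top_byte (uint64_t reg)
{
  return (uint8_t) (reg >> 56);
}

/* Merge a byte into the low end of a destination register value,
   leaving the other 56 bits untouched (AND to clear, OR to insert).  */
static uint64_t
insert_low_byte (uint64_t dest, uint8_t b)
{
  return (dest & ~UINT64_C (0xff)) | b;
}
```

Even this simplified form needs a shift, a truncation, a mask and an OR
to move one byte between two register views.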

So I'd suggest explicitly testing for the pattern you want to catch, using 
XVECEXP and the like (see the transformation just above the hot spot).

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02  9:20                       ` Eric Botcazou
@ 2003-12-02 16:17                         ` David Edelsohn
  2003-12-02 17:28                           ` Eric Botcazou
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-12-02 16:17 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Fariborz Jahanian, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

>>>>> Eric Botcazou writes:

Eric> It cures the pessimization for my original testcase.  However, I think this 
Eric> is not enough for the following slightly tweaked case:

Eric> We start with:

Eric> (parallel:QI [
Eric> (expr_list (reg:DI %i0)
Eric> (const_int 0 [0x0]))
Eric> ])

Eric> and the new transformation generates

Eric> (reg:QI 107)

Eric> so emit_group_store emits a move between the two locations, which requires 9 
Eric> more insns in the 01.rtl file than the original code, because this time no 
Eric> reg is spilled so a combination of ASHIFT, SUBREG, AND and OR is used.  This 
Eric> is fully recovered at -O2, but not at -O1.

	In some sense, this is what the change is trying to produce.  Yes,
it's more instructions, but it generally is much faster for any modern
processor to perform these computations in registers instead of accessing
(cache) memory.  If you only are focussing on instruction counts, you are
overlooking the actual performance.  Nine instructions is a lot, but that
is in the 01.rtl file, not the final output (even at -O1).

	It is expensive for a processor to write out one or more values to
memory and then read back a value with a different width -- often the
processor cannot short-circuit that type of operation.
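
	As a rough illustration of the two strategies (a sketch only,
with hypothetical helper names, not code from the patch): the first
function below rebuilds a 64-bit value from two 32-bit halves with a
shift and an OR, while the second stores the halves to memory and
reloads them as one wider value, which is the store-then-wider-load
pattern described above. The byte-order probe is there only so that
both functions agree on any host.

```c
#include <stdint.h>
#include <string.h>

/* Register path: combine the halves with a shift and an OR.  */
static uint64_t
combine_in_regs (uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) | lo;
}

/* Memory path: two narrow stores followed by one wide load.  */
static uint64_t
combine_via_memory (uint32_t hi, uint32_t lo)
{
  uint32_t slots[2];
  uint64_t wide;
  const uint16_t probe = 1;
  /* Nonzero when the host stores the low-order byte first.  */
  int little = *(const unsigned char *) &probe == 1;

  slots[little ? 1 : 0] = hi;
  slots[little ? 0 : 1] = lo;
  memcpy (&wide, slots, sizeof wide);
  return wide;
}
```

Both functions return the same value; the difference is only in how
the hardware gets there. An optimizing compiler may well turn the
second form into the first.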

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 16:17                         ` David Edelsohn
@ 2003-12-02 17:28                           ` Eric Botcazou
  2003-12-02 17:39                             ` Fariborz Jahanian
                                               ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Eric Botcazou @ 2003-12-02 17:28 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Fariborz Jahanian, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

>       In some sense, this is what the change is trying to produce.  Yes,
> it's more instructions, but it generally is much faster for any modern
> processor to perform these computations in registers instead of accessing
> (cache) memory.

I'm not disputing the 9 insns in registers vs register spills in frame.  What 
I'm disputing is the 9 insns in registers vs <nothing>: for the second 
testcase, the 9 insns are totally useless, they don't save any memory access 
since the argument is already in a single register.  And they increase the 
probability of running into one of the famous bugs of the combiner.

So I'd suggest that you introduce more sanity checks for the transformation: 
can't you bypass it if the PARALLEL contains a single member?  Can't you 
bypass it if the PARALLEL contains too many members (I don't know whether 
this can happen though)?
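
To sketch what such guards could look like, here is a small 
self-contained C model.  It does not use GCC's rtx types: the struct, the 
predicate name and the cap of two members are all hypothetical, chosen 
only to illustrate the checks suggested in this thread.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of an incoming argument.  */
struct parm_model
{
  int n_elts;          /* number of members in the PARALLEL */
  bool is_blkmode;     /* nominal mode is BLKmode */
  bool pseudo_is_reg;  /* the new pseudo would be a plain REG, not a CONCAT */
};

enum { SKETCH_MAX_ELTS = 2 };  /* arbitrary cap chosen for this sketch */

/* Return true if combining the pieces in registers looks worthwhile.  */
static bool
combine_in_registers_p (const struct parm_model *p)
{
  if (p->is_blkmode)
    return false;               /* handled by the BLKmode path instead */
  if (!p->pseudo_is_reg)
    return false;               /* e.g. a complex mode yielding a CONCAT */
  if (p->n_elts < 2)
    return false;               /* already in a single register */
  if (p->n_elts > SKETCH_MAX_ELTS)
    return false;               /* too many pieces to combine cheaply */
  return true;
}
```

Under this model, both of my testcases (a single-member PARALLEL) would 
bypass the transformation, while a two-member PARALLEL would still be 
combined in registers.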

> If you only are focussing on instruction counts, you are overlooking the
> actual performance.  Nine instructions is a lot, but that is in the 01.rtl 
> final, not the final output (even at -O1).

Yes, but the code at -O1 is (slightly) worse because the transformation has 
increased the register pressure.  I'd expect the effect to worsen with the 
complexity of the code, maybe even at -O2.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 17:28                           ` Eric Botcazou
@ 2003-12-02 17:39                             ` Fariborz Jahanian
  2003-12-02 18:50                               ` Eric Botcazou
  2003-12-02 17:41                             ` David Edelsohn
  2003-12-02 18:02                             ` David Edelsohn
  2 siblings, 1 reply; 875+ messages in thread
From: Fariborz Jahanian @ 2003-12-02 17:39 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: David Edelsohn, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches


On Tuesday, December 2, 2003, at 09:29 AM, Eric Botcazou wrote:

>>       In some sense, this is what the change is trying to produce.  
>> Yes,
>> it's more instructions, but it generally is much faster for any modern
>> processor to perform these computations in registers instead of 
>> accessing
>> (cache) memory.
>
> I don't discuss the 9 insns in registers vs register spills in frame.  
> What I
> discuss is the 9 insns in registers vs <nothing>: for the second 
> testcase,
> the 9 insns are totally useless, they don't save any memory access 
> since the
> argument is already in a single register.  And they increase the 
> probability
> to run into one of the famous bugs of the combiner.
>
> So I'd suggest that you introduce more sanity checks for the 
> transformation:
> can't you bypass it if the PARALLEL contains a single member?  Can't 
> you
> bypass it if the PARALLEL contains too many members (I don't know 
> whether
> this can happen though)?
>

I mentioned in a previous email that this pattern is generated for a 
'long long' (DImode). If we cannot
restrict the pattern to this mode, then we can restrict this to a PARALLEL 
with two members only.

- Fariborz

>> If you only are focussing on instruction counts, you are overlooking the
>> actual performance.  Nine instructions is a lot, but that is in the 01.rtl
>> final, not the final output (even at -O1).
>
> Yes, but the code at -O1 is (slightly) worse because the transformation has
> increased the register pressure.  I'd expect the effect to worsen with the
> complexity of the code, maybe even at -O2.
>
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 17:28                           ` Eric Botcazou
  2003-12-02 17:39                             ` Fariborz Jahanian
@ 2003-12-02 17:41                             ` David Edelsohn
  2003-12-02 18:50                               ` Eric Botcazou
  2003-12-02 18:02                             ` David Edelsohn
  2 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-12-02 17:41 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Fariborz Jahanian, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

>>>>> Eric Botcazou writes:

Eric> can't you bypass it if the PARALLEL contains a single member?  Can't you 
Eric> bypass it if the PARALLEL contains too many members (I don't know whether 
Eric> this can happen though)?

	I can try to look at the number of elements of the rtvec.

	Your second suggestion contradicts the purpose of the change
explained in the comment: "Objects the size of a register can be combined
in registers."  This specific change is trying to optimize the case
where a 64-bit or larger object is passed in multiple 32-bit blocks (due
to the ABI), but will operate on the object within the function in a wider
mode.  The compiler should combine the blocks using register operations,
not memory operations.  Without the change, GCC would choose to use the
stack to reconstitute the object, which is inefficient on modern
architectures.  The standard example is "long long", but this also can and
should apply to 128-bit TImode on PPC64, as well as other architectures.
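
	As a rough illustration of the register-only recombination described
above (a standalone sketch, not GCC code; the name combine_halves and the
high-word-first argument order are assumptions made for the example), the
two 32-bit blocks of a "long long" can be joined with a shift and an OR
instead of a store and reload through the stack:

```c
#include <stdint.h>

/* Hypothetical helper: rebuild a 64-bit value from the two 32-bit
   blocks in which a 32-bit ABI passes it.  HI is assumed to hold the
   most significant word; which register holds which half depends on
   the ABI and on endianness.  */
uint64_t
combine_halves (uint32_t hi, uint32_t lo)
{
  /* Widen before shifting so the shift is done in 64 bits.  */
  return ((uint64_t) hi << 32) | (uint64_t) lo;
}
```

With 64-bit instructions available this is a couple of register operations
and involves no memory traffic, which is the effect the change is after.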

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 17:28                           ` Eric Botcazou
  2003-12-02 17:39                             ` Fariborz Jahanian
  2003-12-02 17:41                             ` David Edelsohn
@ 2003-12-02 18:02                             ` David Edelsohn
  2003-12-02 23:34                               ` Eric Botcazou
  2 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2003-12-02 18:02 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Fariborz Jahanian, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

	Does the appended patch work better for you?

	The SPARC ABI, especially for records, is complex and I am trying
to understand how to mitigate the problems you are seeing.  You need to
meet me half way and consider what this change is trying to accomplish on
other architectures.  If we make the test too tight to match one specific
port or architecture, then we no longer have a portable, retargetable
compiler.

David

Index: function.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/function.c,v
retrieving revision 1.471
diff -c -p -r1.471 function.c
*** function.c	1 Dec 2003 13:07:12 -0000	1.471
--- function.c	2 Dec 2003 17:54:53 -0000
*************** assign_parms (tree fndecl)
*** 4704,4716 ****
  
  	 Set DECL_RTL to that place.  */
  
!       if (GET_CODE (entry_parm) == PARALLEL && nominal_mode != BLKmode)
  	{
! 	  /* Objects the size of a register can be combined in registers */
  	  rtx parmreg = gen_reg_rtx (nominal_mode);
! 	  emit_group_store (parmreg, entry_parm, TREE_TYPE (parm),
! 			    int_size_in_bytes (TREE_TYPE (parm)));
! 	  SET_DECL_RTL (parm, parmreg);
  	}
  
        if (nominal_mode == BLKmode
--- 4704,4721 ----
  
  	 Set DECL_RTL to that place.  */
  
!       if (GET_CODE (entry_parm) == PARALLEL && nominal_mode != BLKmode
! 	  && XVECLEN (entry_parm, 0) > 1)
  	{
! 	  /* Objects the size of a register can be combined in registers.  */
  	  rtx parmreg = gen_reg_rtx (nominal_mode);
! 
! 	  if (REG_P (parmreg))
! 	    {
! 	      emit_group_store (parmreg, entry_parm, TREE_TYPE (parm),
! 				int_size_in_bytes (TREE_TYPE (parm)));
! 	      SET_DECL_RTL (parm, parmreg);
! 	    }
  	}
  
        if (nominal_mode == BLKmode

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 17:41                             ` David Edelsohn
@ 2003-12-02 18:50                               ` Eric Botcazou
  0 siblings, 0 replies; 875+ messages in thread
From: Eric Botcazou @ 2003-12-02 18:50 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Fariborz Jahanian, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

> 	I can try to look at the number of elements of the rtvec.

Thanks.

> 	Your second suggestion contradicts the purpose of the change
> explained in the comment: "Objects the size of a register can be combined
> in registers."

Then I'd suggest clarifying the comment, using....

> This specific change is trying to optimize the case where a 64-bit or
> larger object is passed in multiple 32-bit blocks (due to the ABI), but
> will operate on the object within the function in a wider mode.  The
> compiler should combine the blocks using register operations, not memory
> operations.  Without the change, GCC would choose to use the
> stack to reconstitute the object, which is inefficient on modern
> architectures.  The standard example is "long long", but this also can and
> should apply to 128-bit TImode on PPC64, as well as other architectures.

...this clear explanation as the basis.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 17:39                             ` Fariborz Jahanian
@ 2003-12-02 18:50                               ` Eric Botcazou
  0 siblings, 0 replies; 875+ messages in thread
From: Eric Botcazou @ 2003-12-02 18:50 UTC (permalink / raw)
  To: Fariborz Jahanian
  Cc: David Edelsohn, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

> I mentioned in a previous email that this pattern is generated for a
> 'long long' (DImode). If we cannot restrict the pattern to this mode, then 
> we can restrict this to a PARALLEL with two members only.

Restricting it to PARALLELs with at least two members would already be much 
better I think.
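
A toy model of that restriction, for what it's worth (standalone C, not GCC 
internals; struct parm_desc and its fields are made up for the example and 
merely stand in for the PARALLEL test, the BLKmode test and the member 
count):

```c
#include <stdbool.h>

/* Hypothetical stand-in for what assign_parms knows about an
   incoming parameter.  */
struct parm_desc
{
  bool is_parallel;   /* entry_parm is a PARALLEL */
  bool is_blkmode;    /* nominal_mode is BLKmode */
  int n_members;      /* number of members in the PARALLEL */
};

/* Combine in registers only when there is something to combine:
   a single-member PARALLEL is already in one register.  */
bool
combine_in_registers_p (const struct parm_desc *p)
{
  return p->is_parallel && !p->is_blkmode && p->n_members > 1;
}
```

A single-member PARALLEL is rejected by the last test, so nothing extra 
would be emitted for it.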

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 18:02                             ` David Edelsohn
@ 2003-12-02 23:34                               ` Eric Botcazou
  2003-12-02 23:42                                 ` fj
  0 siblings, 1 reply; 875+ messages in thread
From: Eric Botcazou @ 2003-12-02 23:34 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Fariborz Jahanian, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

> Does the appended patch work better for you?

Definitely. I couldn't find any obvious cases for which the code still does 
dumb things. If it suits Fariborz' needs and yours, please commit it (a 
clarified comment would be nice too).

> 	The SPARC ABI, especially for records, is complex and I am trying
> to understand how to mitigate the problems you are seeing.

Yes, the 64-bit ABI is a bit intricate, to say the least. Thanks for your 
patience with it :-)

> You need to meet me half way and consider what this change is trying to
> accomplish on other architectures.

I understand. Simply, until very recently (your previous message), I didn't 
know where you really had started from.

> If we make the test too tight to match one specific port or architecture,
> then we no longer have a portable, retargetable compiler.

Note that the opposite statement is also true: if we make the test not tight 
enough, we risk inadvertently catching unwanted patterns on other 
architectures, as the SPARC64 example clearly demonstrates.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 23:34                               ` Eric Botcazou
@ 2003-12-02 23:42                                 ` fj
  2003-12-05 16:41                                   ` Eric Botcazou
  0 siblings, 1 reply; 875+ messages in thread
From: fj @ 2003-12-02 23:42 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: David Edelsohn, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches


On Tuesday, December 2, 2003, at 03:36 PM, Eric Botcazou wrote:

>> Does the appended patch work better for you?
>
> Definitely. I couldn't find any obvious cases for which the code still does
> dumb things. If it suits Fariborz' needs and yours, please commit it (a
> clarified comment would be nice too).

Yes, it does meet the need for the 'long long' pattern.

- fariborz

>
>> 	The SPARC ABI, especially for records, is complex and I am trying
>> to understand how to mitigate the problems you are seeing.
>
> Yes, the 64-bit ABI is a bit intricate, to say the least. Thanks for your
> patience with it :-)
>
>> You need to meet me half way and consider what this change is trying to
>> accomplish on other architectures.
>
> I understand. Simply, until very recently (your previous message), I didn't
> know where you really had started from.
>
>> If we make the test too tight to match one specific port or architecture,
>> then we no longer have a portable, retargetable compiler.
>
> Note that the opposite statement is also true: if we make the test not tight
> enough, we risk inadvertently catching unwanted patterns on other
> architectures, as the SPARC64 example clearly demonstrates.
>
> -- 
> Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] - Use of powerpc 64bit instructions in 32bit ABI
  2003-12-02 23:42                                 ` fj
@ 2003-12-05 16:41                                   ` Eric Botcazou
  0 siblings, 0 replies; 875+ messages in thread
From: Eric Botcazou @ 2003-12-05 16:41 UTC (permalink / raw)
  To: fj
  Cc: David Edelsohn, Richard Henderson, Ulrich Weigand, ian, davem,
	gcc-patches

> Yes, it does meet the need for the 'long long' pattern.

I just saw that David had committed the patch, and I confirm that it fixed 
the testsuite regressions thanks to which I had spotted the pessimization. 
And it appears that 64-bit HP-PA was affected somehow too.

Thanks to David and you for helping me to solve the problem.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
       [not found]       ` <200401022317.i02NHQBR001191@desire.geoffk.org>
@ 2004-01-06 15:27         ` Alan Modra
  2004-01-06 18:07           ` Geoff Keating
                             ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-01-06 15:27 UTC (permalink / raw)
  To: gcc-patches
  Cc: Geoff Keating, cagney, kettenis, dje, gdb-patches, Ulrich.Weigand

This patch corrects DWARF debug info register numbering for PPC targets.
See http://gcc.gnu.org/ml/gcc/2004-01/msg00025.html for some background.

I've also made a small fix to DWARF_REG_TO_UNWIND_COLUMN which
incorrectly hardcoded an unwinder array index, and removed the confused
FIXME.  See the new comment.
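
To make the column adjustment concrete, here is a small standalone sketch
of the macro as proposed in the patch.  The value 113 for
FIRST_PSEUDO_REGISTER is an assumption taken from the constant that used to
be hardcoded, and the wrapper function unwind_column exists only for the
example:

```c
/* Assumed value for the example; the old macro hardcoded 113 here.  */
#define FIRST_PSEUDO_REGISTER 113

/* The macro as proposed in the patch.  */
#define DWARF_REG_TO_UNWIND_COLUMN(r) \
  ((r) > 1200 ? ((r) - 1200 + FIRST_PSEUDO_REGISTER) : (r))

/* Hypothetical wrapper so the mapping is easy to poke at.  */
unsigned int
unwind_column (unsigned int r)
{
  return DWARF_REG_TO_UNWIND_COLUMN (r);
}
```

Ordinary hard registers map to themselves, while the synthetic SPE numbers
above 1200 are folded down to columns just past the hard registers.  As
written, the comparison is strict, so 1200 itself is passed through
unchanged.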

	* config/rs6000/rs6000.c (rs6000_dbx_register_number): New function.
	* config/rs6000/rs6000-protos.h (rs6000_dbx_register_number): Declare.
	* config/rs6000/rs6000.h (DWARF_FRAME_REGNUM): Define.
	(DWARF_REG_TO_UNWIND_COLUMN): Correct column adjustment and comment.
	* config/rs6000/sysv4.h (DBX_REGISTER_NUMBER): Define.

Bootstrapped powerpc-linux, no regressions.

Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000-protos.h,v
retrieving revision 1.68
diff -u -p -r1.68 rs6000-protos.h
--- gcc/config/rs6000/rs6000-protos.h	9 Dec 2003 01:57:45 -0000	1.68
+++ gcc/config/rs6000/rs6000-protos.h	6 Jan 2004 14:41:35 -0000
@@ -186,6 +186,7 @@ extern int uses_TOC (void);
 extern void rs6000_emit_prologue (void);
 extern void rs6000_emit_load_toc_table (int);
 extern void rs6000_aix_emit_builtin_unwind_init (void);
+extern unsigned int rs6000_dbx_register_number (unsigned int);
 extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.565
diff -u -p -r1.565 rs6000.c
--- gcc/config/rs6000/rs6000.c	31 Dec 2003 00:25:51 -0000	1.565
+++ gcc/config/rs6000/rs6000.c	6 Jan 2004 14:41:41 -0000
@@ -15763,4 +15763,39 @@ rs6000_dwarf_register_span (rtx reg)
 				   gen_rtx_REG (SImode, regno + 1200)));
 }
 
+/* Map internal gcc register numbers to DWARF2 register numbers.  */
+
+unsigned int
+rs6000_dbx_register_number (unsigned int regno)
+{
+  if (regno <= 63 || write_symbols != DWARF2_DEBUG)
+    return regno;
+  if (regno == MQ_REGNO)
+    return 100;
+  if (regno == LINK_REGISTER_REGNUM)
+    return 108;
+  if (regno == COUNT_REGISTER_REGNUM)
+    return 109;
+  if (CR_REGNO_P (regno))
+    return regno - CR0_REGNO + 86;
+  if (regno == XER_REGNO)
+    return 101;
+  if (ALTIVEC_REGNO_P (regno))
+    return regno - FIRST_ALTIVEC_REGNO + 1124;
+  if (regno == VRSAVE_REGNO)
+    return 356;
+  if (regno == VSCR_REGNO)
+    return 67;
+  if (regno == SPE_ACC_REGNO)
+    return 99;
+  if (regno == SPEFSCR_REGNO)
+    return 612;
+  /* SPE high reg number.  We get these values of regno from
+     rs6000_dwarf_register_span.  */
+  if (regno >= 1200 && regno < 1232)
+    return regno;
+
+  abort ();
+}
+
 #include "gt-rs6000.h"
Index: gcc/config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.303
diff -u -p -r1.303 rs6000.h
--- gcc/config/rs6000/rs6000.h	31 Dec 2003 00:25:51 -0000	1.303
+++ gcc/config/rs6000/rs6000.h	6 Jan 2004 15:09:42 -0000
@@ -811,19 +811,27 @@ extern enum rs6000_nop_insertion rs6000_
 /* This must be included for pre gcc 3.0 glibc compatibility.  */
 #define PRE_GCC3_DWARF_FRAME_REGISTERS 77
 
-/* Add 32 dwarf columns for synthetic SPE registers.  The SPE
-   synthetic registers are 113 through 145.  */
+/* Add 32 dwarf columns for synthetic SPE registers.  */
 #define DWARF_FRAME_REGISTERS (FIRST_PSEUDO_REGISTER + 32)
 
-/* The SPE has an additional 32 synthetic registers starting at 1200.
-   We must map them here to sane values in the unwinder to avoid a
-   huge hole in the unwind tables.
+/* The SPE has an additional 32 synthetic registers, with DWARF debug
+   info numbering for these registers starting at 1200.  While eh_frame
+   register numbering need not be the same as the debug info numbering,
+   we choose to number these regs for eh_frame at 1200 too.  This allows
+   future versions of the rs6000 backend to add hard registers and
+   continue to use the gcc hard register numbering for eh_frame.  If the
+   extra SPE registers in eh_frame were numbered starting from the
+   current value of FIRST_PSEUDO_REGISTER, then if FIRST_PSEUDO_REGISTER
+   changed we'd need to introduce a mapping in DWARF_FRAME_REGNUM to
+   avoid invalidating older SPE eh_frame info.
 
-   FIXME: the AltiVec ABI has AltiVec registers being 1124-1155, and
-   the VRSAVE SPR (SPR256) assigned to register 356.  When AltiVec EH
-   is verified to be working, this macro should be changed
-   accordingly.  */
-#define DWARF_REG_TO_UNWIND_COLUMN(r) ((r) > 1200 ? ((r) - 1200 + 113) : (r))
+   We must map them here to avoid huge unwinder tables mostly consisting
+   of unused space.  */ 
+#define DWARF_REG_TO_UNWIND_COLUMN(r) \
+  ((r) > 1200 ? ((r) - 1200 + FIRST_PSEUDO_REGISTER) : (r))
+
+/* Use gcc hard register numbering for eh_frame.  */
+#define DWARF_FRAME_REGNUM(REGNO) (REGNO)
 
 /* 1 for registers that have pervasive standard uses
    and are not available for the register allocator.
Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.143
diff -u -p -r1.143 sysv4.h
--- gcc/config/rs6000/sysv4.h	29 Nov 2003 03:08:12 -0000	1.143
+++ gcc/config/rs6000/sysv4.h	6 Jan 2004 14:41:45 -0000
@@ -742,6 +742,8 @@ extern int fixuplabelno;
 /* Historically we have also supported stabs debugging.  */
 #define DBX_DEBUGGING_INFO 1
 
+#define DBX_REGISTER_NUMBER(REGNO) rs6000_dbx_register_number (REGNO)
+
 #define TARGET_ENCODE_SECTION_INFO  rs6000_elf_encode_section_info
 #define TARGET_IN_SMALL_DATA_P  rs6000_elf_in_small_data_p
 #define TARGET_SECTION_TYPE_FLAGS  rs6000_elf_section_type_flags

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
       [not found]           ` <amodra@bigpond.net.au>
                               ` (4 preceding siblings ...)
  2003-08-01 14:57             ` powerpc64-linux libffi update Alan Modra
@ 2004-01-06 16:02             ` David Edelsohn
  2004-01-08  2:09             ` mainline -mcpu=power4 David Edelsohn
                               ` (55 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-01-06 16:02 UTC (permalink / raw)
  To: gcc-patches, Geoff Keating, cagney, kettenis, gdb-patches,
	Ulrich.Weigand

	* config/rs6000/rs6000.c (rs6000_dbx_register_number): New function.
	* config/rs6000/rs6000-protos.h (rs6000_dbx_register_number): Declare.
	* config/rs6000/rs6000.h (DWARF_FRAME_REGNUM): Define.
	(DWARF_REG_TO_UNWIND_COLUMN): Correct column adjustment and comment.
	* config/rs6000/sysv4.h (DBX_REGISTER_NUMBER): Define.

This looks okay.

	I assume that a matching GDB change will be applied as well.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-06 15:27         ` Incorrect DWARF-2 register numbers on PPC64? Alan Modra
@ 2004-01-06 18:07           ` Geoff Keating
  2004-01-06 18:10             ` David Edelsohn
  2004-01-07 17:43           ` Mark Kettenis
       [not found]           ` <amodra@bigpond.net.au>
  2 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2004-01-06 18:07 UTC (permalink / raw)
  To: amodra; +Cc: gcc-patches, cagney, kettenis, dje, gdb-patches, Ulrich.Weigand

> Date: Wed, 7 Jan 2004 01:57:10 +1030
> From: Alan Modra <amodra@bigpond.net.au>

> This patch corrects DWARF debug info register numbering for PPC targets.
> See http://gcc.gnu.org/ml/gcc/2004-01/msg00025.html for some background.
> 
> I've also made a small fix to DWARF_REG_TO_UNWIND_COLUMN which
> incorrectly hardcoded an unwinder array index, and removed the confused
> FIXME.  See the new comment.
> 
> 	* config/rs6000/rs6000.c (rs6000_dbx_register_number): New function.
> 	* config/rs6000/rs6000-protos.h (rs6000_dbx_register_number): Declare.
> 	* config/rs6000/rs6000.h (DWARF_FRAME_REGNUM): Define.
> 	(DWARF_REG_TO_UNWIND_COLUMN): Correct column adjustment and comment.
> 	* config/rs6000/sysv4.h (DBX_REGISTER_NUMBER): Define.
> 
> Bootstrapped powerpc-linux, no regressions.

Why is DBX_REGISTER_NUMBER in sysv4.h instead of rs6000.h?

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-06 18:07           ` Geoff Keating
@ 2004-01-06 18:10             ` David Edelsohn
  2004-01-06 22:05               ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-01-06 18:10 UTC (permalink / raw)
  To: Geoff Keating
  Cc: amodra, gcc-patches, cagney, kettenis, gdb-patches, Ulrich.Weigand

>>>>> Geoff Keating writes:

Geoff> Why is DBX_REGISTER_NUMBER in sysv4.h instead of rs6000.h?

	Because only Dwarf2 register numbering is being fixed to conform
to the ABI mapping.  The register numbering for Stabs is correct.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-06 18:10             ` David Edelsohn
@ 2004-01-06 22:05               ` Geoff Keating
  2004-01-06 22:08                 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2004-01-06 22:05 UTC (permalink / raw)
  To: David Edelsohn
  Cc: amodra, gcc-patches, cagney, kettenis, gdb-patches, Ulrich.Weigand

David Edelsohn <dje@watson.ibm.com> writes:

> >>>>> Geoff Keating writes:
> 
> Geoff> Why is DBX_REGISTER_NUMBER in sysv4.h instead of rs6000.h?
> 
> 	Because only Dwarf2 register numbering is being fixed to conform
> to the ABI mapping.  The register numbering for Stabs is correct.

Then, doesn't this patch break stabs under ELF?

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-06 22:05               ` Geoff Keating
@ 2004-01-06 22:08                 ` David Edelsohn
  2004-01-06 22:34                   ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-01-06 22:08 UTC (permalink / raw)
  To: Geoff Keating
  Cc: amodra, gcc-patches, cagney, kettenis, gdb-patches, Ulrich.Weigand

>>>>> Geoff Keating writes:

Geoff> Then, doesn't this patch break stabs under ELF?

	The patch only maps the register number for DWARF2_DEBUG:

+  if (regno <= 63 || write_symbols != DWARF2_DEBUG)
+    return regno;

However, there is no reason to waste the time computing a no-op
transformation on targets that do not have a choice of Dwarf2 debugging.
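
	A cut-down, standalone model of that early return may help (this is
a sketch, not the patch itself: the debug format is passed as a parameter
instead of being read from write_symbols, the hard register numbers are
assumed for the example, and only a few registers are modeled):

```c
#include <stdlib.h>

/* Hypothetical stand-in for GCC's write_symbols setting.  */
enum debug_format { DBX_DEBUG, DWARF2_DEBUG };

/* Assumed hard register numbers, for illustration only.  */
#define LINK_REGISTER_REGNUM 65
#define COUNT_REGISTER_REGNUM 66
#define CR0_REGNO 68
#define CR7_REGNO 75
#define XER_REGNO 76

/* Cut-down model of rs6000_dbx_register_number: only GPRs, FPRs, LR,
   CTR, the CR fields and XER are handled.  */
unsigned int
model_dbx_register_number (unsigned int regno, enum debug_format fmt)
{
  /* GPRs and FPRs keep their numbers, and so does everything else
     unless DWARF2 is being written.  */
  if (regno <= 63 || fmt != DWARF2_DEBUG)
    return regno;
  if (regno == LINK_REGISTER_REGNUM)
    return 108;
  if (regno == COUNT_REGISTER_REGNUM)
    return 109;
  if (regno >= CR0_REGNO && regno <= CR7_REGNO)
    return regno - CR0_REGNO + 86;
  if (regno == XER_REGNO)
    return 101;

  /* Registers left out of this sketch.  */
  abort ();
}
```

For any format other than DWARF2_DEBUG the function is the identity, which
is why stabs output is unaffected.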

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-06 22:08                 ` David Edelsohn
@ 2004-01-06 22:34                   ` Geoff Keating
  2004-01-07  0:26                     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2004-01-06 22:34 UTC (permalink / raw)
  To: dje; +Cc: amodra, gcc-patches, cagney, kettenis, gdb-patches, Ulrich.Weigand

> Date: Tue, 06 Jan 2004 17:08:40 -0500
> From: David Edelsohn <dje@watson.ibm.com>

> >>>>> Geoff Keating writes:
> 
> Geoff> Then, doesn't this patch break stabs under ELF?
> 
> 	The patch only maps the register number for DWARF2_DEBUG:
> 
> +  if (regno <= 63 || write_symbols != DWARF2_DEBUG)
> +    return regno;
> 
> However, there is no reason to waste the time computing a no-op
> transformation on targets that do not have a choice of Dwarf2 debugging.

OK.  It'd probably be better to have this at the toplevel and an undef
for AIX, since Darwin will one day have DWARF, but that can be done
later.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-06 22:34                   ` Geoff Keating
@ 2004-01-07  0:26                     ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-01-07  0:26 UTC (permalink / raw)
  To: Geoff Keating
  Cc: dje, gcc-patches, cagney, kettenis, gdb-patches, Ulrich.Weigand

On Tue, Jan 06, 2004 at 02:34:29PM -0800, Geoff Keating wrote:
> > Date: Tue, 06 Jan 2004 17:08:40 -0500
> > From: David Edelsohn <dje@watson.ibm.com>
> 
> > >>>>> Geoff Keating writes:
> > 
> > Geoff> Then, doesn't this patch break stabs under ELF?
> > 
> > 	The patch only maps the register number for DWARF2_DEBUG:
> > 
> > +  if (regno <= 63 || write_symbols != DWARF2_DEBUG)
> > +    return regno;
> > 
> > However, there is no reason to waste the time computing a no-op
> > transformation on targets that do not have a choice of Dwarf2 debugging.
> 
> OK.  It'd probably be better to have this at the toplevel and an undef
> for AIX, since Darwin will one day have DWARF, but that can be done
> later.

I originally put the DBX_REGISTER_NUMBER define in rs6000.h, but quickly
found that was #undef'd by config/svr4.h, which is included later.  If
you want the define in rs6000.h, that can be done by either removing the
#undef in svr4.h, or including rs6000.h after svr4.h.  I've done the
work to verify that you can include rs6000.h later by adding a few
#undefs in rs6000.h, and defining SIZE_TYPE in rs6000/sysv4.h (in fact,
I wrote the patch that way in the first instance), but decided that
resulted in a much more intrusive patch.
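
The include-order problem can be reproduced in a few lines of standalone C.
This is only a sketch: the three stanzas stand in for rs6000.h, a header
included later such as config/svr4.h, and a generic fallback, and both the
"+ 1000" mapping and the existence of an identity fallback are assumptions
made so the two outcomes can be told apart:

```c
/* Stand-in for the definition one would like to keep in rs6000.h.
   The "+ 1000" mapping is made up for the example.  */
#define DBX_REGISTER_NUMBER(REGNO) ((REGNO) + 1000)

/* Stand-in for a header included later, which discards any earlier
   definition.  */
#undef DBX_REGISTER_NUMBER

/* Stand-in for a generic fallback used when nothing is defined.  */
#ifndef DBX_REGISTER_NUMBER
#define DBX_REGISTER_NUMBER(REGNO) (REGNO)
#endif

/* Reports which definition survived the include order above.  */
unsigned int
effective_dbx_register_number (unsigned int regno)
{
  return DBX_REGISTER_NUMBER (regno);
}
```

Whatever is defined before the #undef is lost, so the definition has to
come after it, which is what placing it in rs6000/sysv4.h achieves.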

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-06 15:27         ` Incorrect DWARF-2 register numbers on PPC64? Alan Modra
  2004-01-06 18:07           ` Geoff Keating
@ 2004-01-07 17:43           ` Mark Kettenis
  2004-01-07 22:29             ` Alan Modra
       [not found]           ` <amodra@bigpond.net.au>
  2 siblings, 1 reply; 875+ messages in thread
From: Mark Kettenis @ 2004-01-07 17:43 UTC (permalink / raw)
  To: amodra; +Cc: gcc-patches, geoffk, cagney, dje, gdb-patches, Ulrich.Weigand

   Date: Wed, 7 Jan 2004 01:57:10 +1030
   From: Alan Modra <amodra@bigpond.net.au>

   This patch corrects DWARF debug info register numbering for PPC targets.
   See http://gcc.gnu.org/ml/gcc/2004-01/msg00025.html for some background.

   I've also made a small fix to DWARF_REG_TO_UNWIND_COLUMN which
   incorrectly hardcoded an unwinder array index, and removed the confused
   FIXME.  See the new comment.

	   * config/rs6000/rs6000.c (rs6000_dbx_register_number): New function.
	   * config/rs6000/rs6000-protos.h (rs6000_dbx_register_number): Declare.
	   * config/rs6000/rs6000.h (DWARF_FRAME_REGNUM): Define.
	   (DWARF_REG_TO_UNWIND_COLUMN): Correct column adjustment and comment.
	   * config/rs6000/sysv4.h (DBX_REGISTER_NUMBER): Define.

   Bootstrapped powerpc-linux, no regressions.

   Index: gcc/config/rs6000/rs6000-protos.h
   ===================================================================
   RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000-protos.h,v
   retrieving revision 1.68
   diff -u -p -r1.68 rs6000-protos.h
   --- gcc/config/rs6000/rs6000-protos.h	9 Dec 2003 01:57:45 -0000	1.68
   +++ gcc/config/rs6000/rs6000-protos.h	6 Jan 2004 14:41:35 -0000
   @@ -186,6 +186,7 @@ extern int uses_TOC (void);
    extern void rs6000_emit_prologue (void);
    extern void rs6000_emit_load_toc_table (int);
    extern void rs6000_aix_emit_builtin_unwind_init (void);
   +extern unsigned int rs6000_dbx_register_number (unsigned int);
    extern void rs6000_emit_epilogue (int);
    extern void rs6000_emit_eh_reg_restore (rtx, rtx);
    extern const char * output_isel (rtx *);
   Index: gcc/config/rs6000/rs6000.c
   ===================================================================
   RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
   retrieving revision 1.565
   diff -u -p -r1.565 rs6000.c
   --- gcc/config/rs6000/rs6000.c	31 Dec 2003 00:25:51 -0000	1.565
   +++ gcc/config/rs6000/rs6000.c	6 Jan 2004 14:41:41 -0000
   @@ -15763,4 +15763,39 @@ rs6000_dwarf_register_span (rtx reg)
				      gen_rtx_REG (SImode, regno + 1200)));
    }

   +/* Map internal gcc register numbers to DWARF2 register numbers.  */
   +
   +unsigned int
   +rs6000_dbx_register_number (unsigned int regno)
   +{
   +  if (regno <= 63 || write_symbols != DWARF2_DEBUG)
   +    return regno;
   +  if (regno == MQ_REGNO)
   +    return 100;
   +  if (regno == LINK_REGISTER_REGNUM)
   +    return 108;
   +  if (regno == COUNT_REGISTER_REGNUM)
   +    return 109;
   +  if (CR_REGNO_P (regno))
   +    return regno - CR0_REGNO + 86;
   +  if (regno == XER_REGNO)
   +    return 101;
   +  if (ALTIVEC_REGNO_P (regno))
   +    return regno - FIRST_ALTIVEC_REGNO + 1124;
   +  if (regno == VRSAVE_REGNO)
   +    return 356;
   +  if (regno == VSCR_REGNO)
   +    return 67;
   +  if (regno == SPE_ACC_REGNO)
   +    return 99;
   +  if (regno == SPEFSCR_REGNO)
   +    return 612;
   +  /* SPE high reg number.  We get these values of regno from
   +     rs6000_dwarf_register_span.  */
   +  if (regno >= 1200 && regno < 1232)
   +    return regno;
   +
   +  abort ();
   +}
   +
    #include "gt-rs6000.h"
   Index: gcc/config/rs6000/rs6000.h
   ===================================================================
   RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
   retrieving revision 1.303
   diff -u -p -r1.303 rs6000.h
   --- gcc/config/rs6000/rs6000.h	31 Dec 2003 00:25:51 -0000	1.303
   +++ gcc/config/rs6000/rs6000.h	6 Jan 2004 15:09:42 -0000
   @@ -811,19 +811,27 @@ extern enum rs6000_nop_insertion rs6000_
    /* This must be included for pre gcc 3.0 glibc compatibility.  */
    #define PRE_GCC3_DWARF_FRAME_REGISTERS 77

   -/* Add 32 dwarf columns for synthetic SPE registers.  The SPE
   -   synthetic registers are 113 through 145.  */
   +/* Add 32 dwarf columns for synthetic SPE registers.  */
    #define DWARF_FRAME_REGISTERS (FIRST_PSEUDO_REGISTER + 32)

   -/* The SPE has an additional 32 synthetic registers starting at 1200.
   -   We must map them here to sane values in the unwinder to avoid a
   -   huge hole in the unwind tables.
   +/* The SPE has an additional 32 synthetic registers, with DWARF debug
   +   info numbering for these registers starting at 1200.  While eh_frame
   +   register numbering need not be the same as the debug info numbering,
   +   we choose to number these regs for eh_frame at 1200 too.  This allows
   +   future versions of the rs6000 backend to add hard registers and
   +   continue to use the gcc hard register numbering for eh_frame.  If the
   +   extra SPE registers in eh_frame were numbered starting from the
   +   current value of FIRST_PSEUDO_REGISTER, then if FIRST_PSEUDO_REGISTER
   +   changed we'd need to introduce a mapping in DWARF_FRAME_REGNUM to
   +   avoid invalidating older SPE eh_frame info.

   -   FIXME: the AltiVec ABI has AltiVec registers being 1124-1155, and
   -   the VRSAVE SPR (SPR256) assigned to register 356.  When AltiVec EH
   -   is verified to be working, this macro should be changed
   -   accordingly.  */
   -#define DWARF_REG_TO_UNWIND_COLUMN(r) ((r) > 1200 ? ((r) - 1200 + 113) : (r))
   +   We must map them here to avoid huge unwinder tables mostly consisting
   +   of unused space.  */ 
   +#define DWARF_REG_TO_UNWIND_COLUMN(r) \
   +  ((r) > 1200 ? ((r) - 1200 + FIRST_PSEUDO_REGISTER) : (r))
   +
   +/* Use gcc hard register numbering for eh_frame.  */
   +#define DWARF_FRAME_REGNUM(REGNO) (REGNO)

    /* 1 for registers that have pervasive standard uses
       and are not available for the register allocator.
   Index: gcc/config/rs6000/sysv4.h
   ===================================================================
   RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
   retrieving revision 1.143
   diff -u -p -r1.143 sysv4.h
   --- gcc/config/rs6000/sysv4.h	29 Nov 2003 03:08:12 -0000	1.143
   +++ gcc/config/rs6000/sysv4.h	6 Jan 2004 14:41:45 -0000
   @@ -742,6 +742,8 @@ extern int fixuplabelno;
    /* Historically we have also supported stabs debugging.  */
    #define DBX_DEBUGGING_INFO 1

   +#define DBX_REGISTER_NUMBER(REGNO) rs6000_dbx_register_number (REGNO)
   +
    #define TARGET_ENCODE_SECTION_INFO  rs6000_elf_encode_section_info
    #define TARGET_IN_SMALL_DATA_P  rs6000_elf_in_small_data_p
    #define TARGET_SECTION_TYPE_FLAGS  rs6000_elf_section_type_flags

   -- 
   Alan Modra
   IBM OzLabs - Linux Technology Centre

Alan,

If I read your patch correctly, this fixes normal DWARF 2 debugging
info to use the official System V register numbers, but lets GCC
continue to use its own numbering for the Call Frame Info (CFI) in
both the .eh_frame and .debug_frame sections.  This won't work for GDB
since it assumes that CFI uses the same register number encoding as
all the other DWARF 2 debug information.

Mark

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-07 17:43           ` Mark Kettenis
@ 2004-01-07 22:29             ` Alan Modra
  2004-01-07 23:36               ` Andrew Cagney
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-01-07 22:29 UTC (permalink / raw)
  To: Mark Kettenis
  Cc: gcc-patches, geoffk, cagney, dje, gdb-patches, Ulrich.Weigand

On Wed, Jan 07, 2004 at 06:43:10PM +0100, Mark Kettenis wrote:
> If I read your patch correctly, this fixes normal DWARF 2 debugging
> info to use the official System V register numbers, but lets GCC
> continue to use its own numbering for the Call Frame Info (CFI) in
> both the .eh_frame and .debug_frame sections. 

That's correct.  hppa, hppa64, iq2000 and ns32k all do the same.

mips and cris also define DWARF_FRAME_REGNUM, but squinting at the code
leads me to believe they will actually use the same register numbers.

> This won't work for GDB
> since it assumes that CFI uses the same register number encoding as
> all the other DWARF 2 debug information.

Hmm, I can see that a debugger might reasonably expect .debug_frame
to have the same numbers.  When I wrote the patch, I was concentrating
on .eh_frame rather than .debug_frame, but .debug_frame uses the
.eh_frame numbering.  It's a little perplexing that dwarf2out.c does
this, as it means defining DWARF_FRAME_REGNUM to something other
than DBX_REGISTER_NUMBER is useless.  DWARF_FRAME_REGNUM ought to
just affect .eh_frame.  I'm not keen on trying to untangle dwarf2out.c
though..

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-07 22:29             ` Alan Modra
@ 2004-01-07 23:36               ` Andrew Cagney
  2004-01-08  0:48                 ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Andrew Cagney @ 2004-01-07 23:36 UTC (permalink / raw)
  To: Alan Modra
  Cc: Mark Kettenis, gcc-patches, geoffk, dje, gdb-patches, Ulrich.Weigand

> On Wed, Jan 07, 2004 at 06:43:10PM +0100, Mark Kettenis wrote:
> 
>> If I read your patch correctly, this fixes normal DWARF 2 debugging
>> info to use the official System V register numbers, but lets GCC
>> continue to use its own numbering for the Call Frame Info (CFI) in
>> both the .eh_frame and .debug_frame sections. 
> 
> 
> That's correct.  hppa, hppa64, iq2000 and ns32k all do the same.

Ouch!

> mips and cris also define DWARF_FRAME_REGNUM, but squinting at the code
> leads me to believe they will actually use the same register numbers.

(same as which? gcc or dwarf 2?)

> 
>> This won't work for GDB
>> since it assumes that CFI uses the same register number encoding as
>> all the other DWARF 2 debug information.
> 
> 
> Hmm, I can see that a debugger might reasonably expect .debug_frame
> to have the same numbers.  When I wrote the patch, I was concentrating
> on .eh_frame rather than .debug_frame, but .debug_frame uses the
> .eh_frame numbering.  It's a little perplexing that dwarf2out.c does
> this, as it means defining DWARF_FRAME_REGNUM to something other
> than DBX_REGISTER_NUMBER is useless.  DWARF_FRAME_REGNUM ought to
> just affect .eh_frame.  I'm not keen on trying to untangle dwarf2out.c
> though..

Is it going to be possible to get this untangled before 3.4 is 
branched/released?

Andrew


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-07 23:36               ` Andrew Cagney
@ 2004-01-08  0:48                 ` Alan Modra
  2004-01-08  5:01                   ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-01-08  0:48 UTC (permalink / raw)
  To: Andrew Cagney
  Cc: Mark Kettenis, gcc-patches, geoffk, dje, gdb-patches, Ulrich.Weigand

On Wed, Jan 07, 2004 at 06:35:59PM -0500, Andrew Cagney wrote:
> >On Wed, Jan 07, 2004 at 06:43:10PM +0100, Mark Kettenis wrote:
> >
> >>If I read your patch correctly, this fixes normal DWARF 2 debugging
> >>info to use the official System V register numbers, but lets GCC
> >>continue to use its own numbering for the Call Frame Info (CFI) in
> >>both the .eh_frame and .debug_frame sections. 
> >
> >
> >That's correct.  hppa, hppa64, iq2000 and ns32k all do the same.
> 
> Ouch!
> 
> >mips and cris also define DWARF_FRAME_REGNUM, but squinting at the code
> >leads me to believe they will actually use the same register numbers.
> 
> (same as which? gcc or dwarf 2?)

dwarf2, ie. .debug_info and .debug_frame use the same reg numbers.

> >>This won't work for GDB
> >>since it assumes that CFI uses the same register number encoding as
> >>all the other DWARF 2 debug information.
> >
> >
> >Hmm, I can see that a debugger might reasonably expect .debug_frame
> >to have the same numbers.  When I wrote the patch, I was concentrating
> >on .eh_frame rather than .debug_frame, but .debug_frame uses the
> >.eh_frame numbering.  It's a little perplexing that dwarf2out.c does
> >this, as it means defining DWARF_FRAME_REGNUM to something other
> >than DBX_REGISTER_NUMBER is useless.  DWARF_FRAME_REGNUM ought to
> >just affect .eh_frame.  I'm not keen on trying to untangle dwarf2out.c
> >though..
> 
> Is it going to be possible to get this untangled before 3.4 is 
> branched/released?

Hmm, I see gdb looks at .eh_frame as well as .debug_frame, so my idea
of using gcc hard regs for .eh_frame and the proper dwarf regs for
.debug_frame is probably a non-starter anyway.

The "easy" fix for PPC is to not define DWARF_FRAME_REGNUM so that
.eh_frame and .debug_frame use the reg numbers specified by the ABI,
and to define DWARF_FRAME_REGISTERS as 1232.  We can even map "old"
.eh_frame regs using DWARF_REG_TO_UNWIND_COLUMN, so that older libs can
be understood by the unwinder, at least as long as they don't use
altivec regs.

The only trouble is that this will mean huge unwinder tables.
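
To put a rough number on "huge", here is a small sketch.  It assumes the
unwinder keeps one table slot per column; the helper names are made up for
illustration, and 113 is simply the compact base used by the old
DWARF_REG_TO_UNWIND_COLUMN definition.

```c
#include <assert.h>

/* Numbers taken from this thread: the sparse SPE numbering starts at
   1200, and the old macro folded it down to columns starting at 113.  */
enum { SPE_DWARF_BASE = 1200, COMPACT_BASE = 113 };

/* Same shape as DWARF_REG_TO_UNWIND_COLUMN: numbers above 1200 are
   folded into columns just past the dense range, others pass through.  */
static unsigned
unwind_column (unsigned r)
{
  return r > SPE_DWARF_BASE ? r - SPE_DWARF_BASE + COMPACT_BASE : r;
}

/* Hypothetical helper: slots needed to index a table directly by the
   highest register number in use, with or without the folding.  */
static unsigned
slots_needed (unsigned max_reg, int use_mapping)
{
  return (use_mapping ? unwind_column (max_reg) : max_reg) + 1;
}

/* Usage example: register 1231 is the last one when
   DWARF_FRAME_REGISTERS is 1232.  */
void
example_table_sizes (void)
{
  assert (slots_needed (1231, 0) == 1232);  /* unmapped: mostly empty */
  assert (slots_needed (1231, 1) == 145);   /* folded: 1231-1200+113+1 */
}
```

Under that assumption the unmapped table is roughly eight times larger than
the folded one, and nearly all of the extra slots are never used.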

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mainline -mcpu=power4
       [not found]     ` <200401080107.i0817HT26846@makai.watson.ibm.com>
@ 2004-01-08  2:01       ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-01-08  2:01 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, gcc-patches

On Wed, Jan 07, 2004 at 08:07:17PM -0500, David Edelsohn wrote:
> >>>>> Alan Modra writes:
> 
> Alan> Yes, that's the obvious fix to make -mcpu=power4 work.  I was really
> Alan> asking about why we have TARGET_64BIT in the insn predicate.  It's a
> Alan> rather more general problem than just fixing -mcpu=power4.  For
> Alan> instance, -m64 -mcpu=601 gets the ICE too.
> 
> 	Why should that combination of options work?

I confuse easily. :)  I need some comment above MASK_64BIT that points
out that the 64 bit model isn't just LP64, i.e. that MASK_POWERPC64 is
required for 64 bit regs, for example, to make lr 64 bit.

> 	We should fail earlier and in a more user-friendly manner, but
> PPC601 does not support any form of 64-bit mode.

	* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Ensure
	target_flags has MASK_POWERPC64 when -m64.
	* config/rs6000/rs6000.c (processor_target_table): Add MASK_POWERPC64
	to 620, 630, power3, power4 and rs64a entries.
	* config/rs6000/rs6000.h (MASK_POWERPC64, MASK_64BIT): Expand comments.

bootstrap in progress.

Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.52
diff -u -p -r1.52 linux64.h
--- gcc/config/rs6000/linux64.h	30 Nov 2003 23:43:05 -0000	1.52
+++ gcc/config/rs6000/linux64.h	8 Jan 2004 01:52:39 -0000
@@ -89,6 +89,11 @@
 	      target_flags &= ~MASK_PROTOTYPE;			\
 	      error (INVALID_64BIT, "prototype");		\
 	    }							\
+          if ((target_flags & MASK_POWERPC64) == 0)		\
+	    {							\
+	      target_flags |= MASK_POWERPC64;			\
+	      error ("-m64 requires a PowerPC64 cpu");		\
+	    }							\
 	}							\
       else							\
 	{							\
Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.566
diff -u -p -r1.566 rs6000.c
--- gcc/config/rs6000/rs6000.c	7 Jan 2004 01:21:27 -0000	1.566
+++ gcc/config/rs6000/rs6000.c	8 Jan 2004 01:53:03 -0000
@@ -662,8 +662,10 @@ rs6000_override_options (const char *def
 	 {"603e", PROCESSOR_PPC603, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
 	 {"604", PROCESSOR_PPC604, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
 	 {"604e", PROCESSOR_PPC604e, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
-	 {"620", PROCESSOR_PPC620, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
-	 {"630", PROCESSOR_PPC630, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
+	 {"620", PROCESSOR_PPC620,
+	  POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_POWERPC64},
+	 {"630", PROCESSOR_PPC630,
+	  POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_POWERPC64},
 	 {"740", PROCESSOR_PPC750, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
 	 {"7400", PROCESSOR_PPC7400, POWERPC_7400_MASK},
 	 {"7450", PROCESSOR_PPC7450, POWERPC_7400_MASK},
@@ -684,8 +686,10 @@ rs6000_override_options (const char *def
 	 {"power", PROCESSOR_POWER, MASK_POWER | MASK_MULTIPLE | MASK_STRING},
 	 {"power2", PROCESSOR_POWER,
 	  MASK_POWER | MASK_POWER2 | MASK_MULTIPLE | MASK_STRING},
-	 {"power3", PROCESSOR_PPC630, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
-	 {"power4", PROCESSOR_POWER4, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
+	 {"power3", PROCESSOR_PPC630,
+	  POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_POWERPC64},
+	 {"power4", PROCESSOR_POWER4,
+	  POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_POWERPC64},
 	 {"powerpc", PROCESSOR_POWERPC, POWERPC_BASE_MASK},
 	 {"powerpc64", PROCESSOR_POWERPC64,
 	  POWERPC_BASE_MASK | MASK_POWERPC64},
@@ -695,7 +699,7 @@ rs6000_override_options (const char *def
 	  MASK_POWER | MASK_POWER2 | MASK_MULTIPLE | MASK_STRING},
 	 {"rsc", PROCESSOR_PPC601, MASK_POWER | MASK_MULTIPLE | MASK_STRING},
 	 {"rsc1", PROCESSOR_PPC601, MASK_POWER | MASK_MULTIPLE | MASK_STRING},
-	 {"rs64a", PROCESSOR_RS64A, POWERPC_BASE_MASK},
+	 {"rs64a", PROCESSOR_RS64A, POWERPC_BASE_MASK | MASK_POWERPC64},
       };
 
   const size_t ptt_size = ARRAY_SIZE (processor_target_table);
Index: gcc/config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.304
diff -u -p -r1.304 rs6000.h
--- gcc/config/rs6000/rs6000.h	7 Jan 2004 01:21:28 -0000	1.304
+++ gcc/config/rs6000/rs6000.h	8 Jan 2004 01:53:10 -0000
@@ -137,7 +137,7 @@ extern int target_flags;
 /* Use PowerPC Graphics group optional instructions, e.g. fsel.  */
 #define MASK_PPC_GFXOPT		0x00000010
 
-/* Use PowerPC-64 architecture instructions.  */
+/* Use PowerPC-64 architecture instructions and 64 bit registers.  */
 #define MASK_POWERPC64		0x00000020
 
 /* Use revised mnemonic names defined for PowerPC architecture.  */
@@ -160,7 +160,8 @@ extern int target_flags;
    function, and one less allocable register.  */
 #define MASK_MINIMAL_TOC	0x00000200
 
-/* Nonzero for the 64bit model: longs and pointers are 64 bits.  */
+/* Nonzero for the 64 bit ABIs: longs and pointers are 64 bits, and
+   registers are 64 bits.  Requires MASK_POWERPC64.  */
 #define MASK_64BIT		0x00000400
 
 /* Disable use of FPRs.  */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mainline -mcpu=power4
       [not found]           ` <amodra@bigpond.net.au>
                               ` (5 preceding siblings ...)
  2004-01-06 16:02             ` Incorrect DWARF-2 register numbers on PPC64? David Edelsohn
@ 2004-01-08  2:09             ` David Edelsohn
  2004-01-08 15:14               ` Alan Modra
  2004-02-10 15:07             ` Fix libjava failure on powerpc64-linux David Edelsohn
                               ` (54 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-01-08  2:09 UTC (permalink / raw)
  To: Geoff Keating, gcc-patches

	* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Ensure
	target_flags has MASK_POWERPC64 when -m64.
	* config/rs6000/rs6000.c (processor_target_table): Add MASK_POWERPC64
	to 620, 630, power3, power4 and rs64a entries.
	* config/rs6000/rs6000.h (MASK_POWERPC64, MASK_64BIT): Expand comments.

This patch is okay, assuming it bootstraps with no regressions.

-/* Use PowerPC-64 architecture instructions.  */
+/* Use PowerPC-64 architecture instructions and 64 bit registers.  */
 #define MASK_POWERPC64		0x00000020

If you are going to mention registers, I would recommend extending the
comment further so that others do not assume 64 bit registers implies 64
bit word size.


-/* Nonzero for the 64bit model: longs and pointers are 64 bits.  */
+/* Nonzero for the 64 bit ABIs: longs and pointers are 64 bits, and
+   registers are 64 bits.  Requires MASK_POWERPC64.  */
 #define MASK_64BIT		0x00000400

Similarly, maybe this should mention that the above implies 64 bit words.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-08  0:48                 ` Alan Modra
@ 2004-01-08  5:01                   ` Geoff Keating
  2004-01-09  2:34                     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2004-01-08  5:01 UTC (permalink / raw)
  To: amodra; +Cc: cagney, kettenis, gcc-patches, dje, gdb-patches, Ulrich.Weigand

> Date: Thu, 8 Jan 2004 11:18:49 +1030
> From: Alan Modra <amodra@bigpond.net.au>

> > >>This won't work for GDB
> > >>since it assumes that CFI uses the same register number encoding as
> > >>all the other DWARF 2 debug information.
> > >
> > >
> > >Hmm, I can see that a debugger might reasonably expect .debug_frame
> > >to have the same numbers.  When I wrote the patch, I was concentrating
> > >on .eh_frame rather than .debug_frame, but .debug_frame uses the
> > >.eh_frame numbering.  It's a little perplexing that dwarf2out.c does
> > >this, as it means defining DWARF_FRAME_REGNUM to something other
> > >than DBX_REGISTER_NUMBER is useless.  DWARF_FRAME_REGNUM ought to
> > >just affect .eh_frame.  I'm not keen on trying to untangle dwarf2out.c
> > >though..
> > 
> > Is it going to be possible to get this untangled before 3.4 is 
> > branched/released?
> 
> Hmm, I see gdb looks at .eh_frame as well as .debug_frame, so my idea
> of using gcc hard regs for .eh_frame and the proper dwarf regs for
> .debug_frame is probably a non-starter anyway.
> 
> The "easy" fix for PPC is to not define DWARF_FRAME_REGNUM so that
> .eh_frame and .debug_frame use the reg numbers specified by the ABI,
> and to define DWARF_FRAME_REGISTERS as 1232.  We can even map "old"
> .eh_frame regs using DWARF_REG_TO_UNWIND_COLUMN, so that older libs can
> be understood by the unwinder, at least as long as they don't use
> altivec regs.
> 
> The only trouble is that this will mean huge unwinder tables.

That will also mean that executables built with a new version of GCC won't
run on operating systems with an old libgcc.  For Darwin, because of
historical mistakes involving non-shared libgcc, it will make life
very difficult.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mainline -mcpu=power4
  2004-01-08  2:09             ` mainline -mcpu=power4 David Edelsohn
@ 2004-01-08 15:14               ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-01-08 15:14 UTC (permalink / raw)
  To: gcc-patches

On Wed, Jan 07, 2004 at 09:09:22PM -0500, David Edelsohn wrote:
> This patch is okay, assuming it bootstraps with no regressions.

Applied.  I decided to leave the MASK_POWERPC64 comment as is, and
stole Geoff's words for MASK_64BIT.

-/* Nonzero for the 64bit model: longs and pointers are 64 bits.  */
+/* Nonzero for the 64 bit ABIs: longs and pointers are 64 bits.  The
+   chip is running in "64-bit mode", in which CR0 is set in dot
+   operations based on all 64 bits of the register, bdnz works on 64-bit
+   ctr, lr is 64 bits, and so on.  Requires MASK_POWERPC64.  */
 #define MASK_64BIT		0x00000400
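
The dependency between the two masks can be sketched as a standalone check.
The mask values are the ones in the rs6000.h hunks above; flags_consistent
is a hypothetical helper for illustration, not a function in GCC.

```c
#include <assert.h>

/* Values copied from the rs6000.h hunks quoted in this thread.  */
#define MASK_POWERPC64 0x00000020
#define MASK_64BIT     0x00000400

/* Hypothetical helper: return nonzero when the flag combination makes
   sense.  The 64 bit ABIs need 64 bit registers, so MASK_64BIT without
   MASK_POWERPC64 is rejected.  The reverse is fine: 64 bit registers
   can be used under a 32 bit ABI.  */
static int
flags_consistent (int target_flags)
{
  if ((target_flags & MASK_64BIT) == 0)
    return 1;
  return (target_flags & MASK_POWERPC64) != 0;
}

/* Usage example: -m64 -mcpu=601 corresponds to the second case.  */
void
example_flag_checks (void)
{
  assert (flags_consistent (MASK_64BIT | MASK_POWERPC64));
  assert (!flags_consistent (MASK_64BIT));
  assert (flags_consistent (MASK_POWERPC64));
}
```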

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-08  5:01                   ` Geoff Keating
@ 2004-01-09  2:34                     ` Alan Modra
  2004-01-09  2:49                       ` Alan Modra
                                         ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-01-09  2:34 UTC (permalink / raw)
  To: Geoff Keating
  Cc: cagney, kettenis, gcc-patches, dje, gdb-patches, Ulrich.Weigand

On Wed, Jan 07, 2004 at 09:01:31PM -0800, Geoff Keating wrote:
> That will also mean that executables built with a new version of GCC won't
> run on operating systems with an old libgcc.  For Darwin, because of
> historical mistakes involving non-shared libgcc, it will make life
> very difficult.

I hadn't thought of that.  This means we are stuck with the current
.eh_frame register numbering.  It should be possible to fix .debug_frame
with something like the following totally untested patch.  I'm throwing
it out to the list now for comment.

With a little more work, I can get rid of DWARF_REG_TO_UNWIND_COLUMN
by having the SPE hack generate a number suitable for .eh_frame, and
mapping up to 1200 in DBX_REGISTER_NUMBER and the new
DWARF2_FRAME_REG_OUT.  So the final patch won't increase the number of
macros littering gcc.  :)
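
As a standalone illustration of the idea (not the patch itself): one
collected register number, two output numberings selected by for_eh.  The
TOY_* register numbers and toy_dbx_register_number are stand-ins invented
for this sketch; only the value 64 for the whole CR comes from the patch.

```c
#include <assert.h>

/* Illustrative numbers only; the real ones live in rs6000.h and
   rs6000_dbx_register_number.  ABI_CR (64) is the value used in the
   patch for the whole condition register.  */
enum { TOY_LR = 65, TOY_CR2 = 70, TOY_ABI_LR = 108, ABI_CR = 64 };

/* Stand-in for DBX_REGISTER_NUMBER: gcc hard reg -> ABI DWARF number.  */
static unsigned
toy_dbx_register_number (unsigned regno)
{
  return regno == TOY_LR ? TOY_ABI_LR : regno;
}

/* Same shape as DWARF2_FRAME_REG_OUT: .eh_frame keeps the gcc number,
   .debug_frame gets the ABI number, with the CR2 save reported as a
   save of the whole CR.  */
static unsigned
frame_reg_out (unsigned regno, int for_eh)
{
  if (for_eh)
    return regno;
  if (regno == TOY_CR2)
    return ABI_CR;
  return toy_dbx_register_number (regno);
}

/* Usage example.  */
void
example_frame_reg_out (void)
{
  assert (frame_reg_out (TOY_LR, 1) == TOY_LR);      /* .eh_frame */
  assert (frame_reg_out (TOY_LR, 0) == TOY_ABI_LR);  /* .debug_frame */
  assert (frame_reg_out (TOY_CR2, 0) == ABI_CR);
}
```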

	  * config/rs6000/sysv4.h (DWARF2_FRAME_REG_OUT): Define.
	  * dwarf2out.c (output_cfi): Map regs using DWARF2_FRAME_REG_OUT.

Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.144
diff -u -p -r1.144 sysv4.h
--- gcc/config/rs6000/sysv4.h	7 Jan 2004 01:21:28 -0000	1.144
+++ gcc/config/rs6000/sysv4.h	9 Jan 2004 02:24:39 -0000
@@ -744,6 +744,18 @@ extern int fixuplabelno;
 
 #define DBX_REGISTER_NUMBER(REGNO) rs6000_dbx_register_number (REGNO)
 
+/* Map register numbers held in the call frame info that gcc has
+   collected using DWARF_FRAME_REGNUM to those that should be output in
+   .debug_frame and .eh_frame.  We continue to use gcc hard reg numbers
+   for .eh_frame, but use the numbers mandated by the various ABIs for
+   .debug_frame.  rs6000_emit_prologue has translated any combination of
+   CR2, CR3, CR4 saves to a save of CR2.  The actual code emitted saves
+   the whole of CR, so we map CR2_REGNO to the DWARF reg for CR.  */
+#define DWARF2_FRAME_REG_OUT(REGNO, FOR_EH)	\
+  ((FOR_EH) ? (REGNO)				\
+   : (REGNO) == CR2_REGNO ? 64			\
+   : DBX_REGISTER_NUMBER (REGNO))
+
 #define TARGET_ENCODE_SECTION_INFO  rs6000_elf_encode_section_info
 #define TARGET_IN_SMALL_DATA_P  rs6000_elf_in_small_data_p
 #define TARGET_SECTION_TYPE_FLAGS  rs6000_elf_section_type_flags
Index: gcc/dwarf2out.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/dwarf2out.c,v
retrieving revision 1.471
diff -u -p -r1.471 dwarf2out.c
--- gcc/dwarf2out.c	8 Jan 2004 07:54:11 -0000	1.471
+++ gcc/dwarf2out.c	9 Jan 2004 02:24:32 -0000
@@ -1783,11 +1783,21 @@ dw_cfi_oprnd2_desc (enum dwarf_call_fram
 
 #if defined (DWARF2_DEBUGGING_INFO) || defined (DWARF2_UNWIND_INFO)
 
+/* Map register numbers held in the call frame info that gcc has
+   collected using DWARF_FRAME_REGNUM to those that should be output in
+   .debug_frame and .eh_frame.  */
+#ifndef DWARF2_FRAME_REG_OUT
+#define DWARF2_FRAME_REG_OUT(REGNO, FOR_EH) (REGNO)
+#endif
+
 /* Output a Call Frame Information opcode and its operand(s).  */
 
 static void
 output_cfi (dw_cfi_ref cfi, dw_fde_ref fde, int for_eh)
 {
+  unsigned long op1_reg;
+  op1_reg = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
+
   if (cfi->dw_cfi_opc == DW_CFA_advance_loc)
     dw2_asm_output_data (1, (cfi->dw_cfi_opc
 			     | (cfi->dw_cfi_oprnd1.dw_cfi_offset & 0x3f)),
@@ -1795,17 +1805,15 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
 			 cfi->dw_cfi_oprnd1.dw_cfi_offset);
   else if (cfi->dw_cfi_opc == DW_CFA_offset)
     {
-      dw2_asm_output_data (1, (cfi->dw_cfi_opc
-			       | (cfi->dw_cfi_oprnd1.dw_cfi_reg_num & 0x3f)),
+      dw2_asm_output_data (1, (cfi->dw_cfi_opc | (op1_reg & 0x3f)),
 			   "DW_CFA_offset, column 0x%lx",
-			   cfi->dw_cfi_oprnd1.dw_cfi_reg_num);
+			   op1_reg);
       dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd2.dw_cfi_offset, NULL);
     }
   else if (cfi->dw_cfi_opc == DW_CFA_restore)
-    dw2_asm_output_data (1, (cfi->dw_cfi_opc
-			     | (cfi->dw_cfi_oprnd1.dw_cfi_reg_num & 0x3f)),
+    dw2_asm_output_data (1, (cfi->dw_cfi_opc | (op1_reg & 0x3f)),
 			 "DW_CFA_restore, column 0x%lx",
-			 cfi->dw_cfi_oprnd1.dw_cfi_reg_num);
+			 op1_reg);
   else
     {
       dw2_asm_output_data (1, cfi->dw_cfi_opc,
@@ -1850,15 +1858,13 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
 
 	case DW_CFA_offset_extended:
 	case DW_CFA_def_cfa:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
+	  dw2_asm_output_data_uleb128 (op1_reg, NULL);
 	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd2.dw_cfi_offset, NULL);
 	  break;
 
 	case DW_CFA_offset_extended_sf:
 	case DW_CFA_def_cfa_sf:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
+	  dw2_asm_output_data_uleb128 (op1_reg, NULL);
 	  dw2_asm_output_data_sleb128 (cfi->dw_cfi_oprnd2.dw_cfi_offset, NULL);
 	  break;
 
@@ -1866,15 +1872,14 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
 	case DW_CFA_undefined:
 	case DW_CFA_same_value:
 	case DW_CFA_def_cfa_register:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
+	  dw2_asm_output_data_uleb128 (op1_reg, NULL);
 	  break;
 
 	case DW_CFA_register:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd2.dw_cfi_reg_num,
-				       NULL);
+	  dw2_asm_output_data_uleb128 (op1_reg, NULL);
+	  dw2_asm_output_data_uleb128
+	    (DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd2.dw_cfi_reg_num, for_eh),
+	     NULL);
 	  break;
 
 	case DW_CFA_def_cfa_offset:
@@ -1904,7 +1909,7 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
     }
 }
 
-/* Output the call frame information used to used to record information
+/* Output the call frame information used to record information
    that relates to calculating the frame pointer, and records the
    location of saved registers.  */
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-09  2:34                     ` Alan Modra
@ 2004-01-09  2:49                       ` Alan Modra
  2004-01-09  6:39                       ` Alan Modra
  2004-01-09 15:15                       ` Mark Kettenis
  2 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-01-09  2:49 UTC (permalink / raw)
  To: Geoff Keating, cagney, kettenis, gcc-patches, dje, gdb-patches,
	Ulrich.Weigand

On Fri, Jan 09, 2004 at 01:04:42PM +1030, Alan Modra wrote:
> With a little more work, I can get rid of DWARF_REG_TO_UNWIND_COLUMN

No I can't.  I was forgetting that we can't change eh_frame numbers.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-09  2:34                     ` Alan Modra
  2004-01-09  2:49                       ` Alan Modra
@ 2004-01-09  6:39                       ` Alan Modra
  2004-01-17  6:54                         ` Alan Modra
  2004-01-09 15:15                       ` Mark Kettenis
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-01-09  6:39 UTC (permalink / raw)
  To: Geoff Keating, cagney, kettenis, gcc-patches, dje, gdb-patches,
	Ulrich.Weigand

On Fri, Jan 09, 2004 at 01:04:42PM +1030, Alan Modra wrote:
> This means we are stuck with the current
> .eh_frame register numbering.  It should be possible to fix .debug_frame

This version actually works..  The untested one suffered from aborts in
rs6000_dbx_register_number due to calling it on invalid values.
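
Judging from the two diffs, the difference is where the mapping is applied:
the earlier version mapped operand 1 once at the top of output_cfi, even for
opcodes whose operand 1 is an offset rather than a register.  A toy sketch
of that hazard follows; every name in it is hypothetical, and strict_map
returns -1 where a strict mapper would abort.

```c
#include <assert.h>

enum op_kind { OP_ADVANCE_LOC, OP_OFFSET };  /* hypothetical subset */

/* One slot with several meanings, like a CFI operand.  */
union operand
{
  unsigned long reg_num;
  long offset;
};

struct cfi
{
  enum op_kind kind;
  union operand op1;
};

/* Toy strict mapper: only registers 0..31 are valid here.  */
static long
strict_map (unsigned long regno)
{
  return regno < 32 ? (long) regno : -1;
}

/* Eager: maps operand 1 before looking at the opcode.  */
static int
eager_ok (const struct cfi *c)
{
  return strict_map (c->op1.reg_num) >= 0;
}

/* Lazy: maps only when operand 1 really is a register.  */
static int
lazy_ok (const struct cfi *c)
{
  if (c->kind == OP_OFFSET)
    return strict_map (c->op1.reg_num) >= 0;
  return 1;
}

/* Usage example: an advance_loc whose operand is an offset of 4096.  */
void
example_eager_vs_lazy (void)
{
  struct cfi adv = { OP_ADVANCE_LOC, { .offset = 4096 } };
  assert (!eager_ok (&adv));  /* offset misread as a register number */
  assert (lazy_ok (&adv));
}
```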

	* config/rs6000/sysv4.h (DWARF2_FRAME_REG_OUT): Define.
	* dwarf2out.c (output_cfi): Map regs using DWARF2_FRAME_REG_OUT.
	* doc/tm.texi (DWARF_FRAME_REGNUM, DWARF2_FRAME_REG_OUT): Document.

Regression testing on powerpc-linux and powerpc64-linux still in
progress.  OK for mainline assuming everything passes?

Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.144
diff -u -p -r1.144 sysv4.h
--- gcc/config/rs6000/sysv4.h	7 Jan 2004 01:21:28 -0000	1.144
+++ gcc/config/rs6000/sysv4.h	9 Jan 2004 06:08:50 -0000
@@ -744,6 +744,18 @@ extern int fixuplabelno;
 
 #define DBX_REGISTER_NUMBER(REGNO) rs6000_dbx_register_number (REGNO)
 
+/* Map register numbers held in the call frame info that gcc has
+   collected using DWARF_FRAME_REGNUM to those that should be output in
+   .debug_frame and .eh_frame.  We continue to use gcc hard reg numbers
+   for .eh_frame, but use the numbers mandated by the various ABIs for
+   .debug_frame.  rs6000_emit_prologue has translated any combination of
+   CR2, CR3, CR4 saves to a save of CR2.  The actual code emitted saves
+   the whole of CR, so we map CR2_REGNO to the DWARF reg for CR.  */
+#define DWARF2_FRAME_REG_OUT(REGNO, FOR_EH)	\
+  ((FOR_EH) ? (REGNO)				\
+   : (REGNO) == CR2_REGNO ? 64			\
+   : DBX_REGISTER_NUMBER (REGNO))
+
 #define TARGET_ENCODE_SECTION_INFO  rs6000_elf_encode_section_info
 #define TARGET_IN_SMALL_DATA_P  rs6000_elf_in_small_data_p
 #define TARGET_SECTION_TYPE_FLAGS  rs6000_elf_section_type_flags
Index: gcc/dwarf2out.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/dwarf2out.c,v
retrieving revision 1.471
diff -u -p -r1.471 dwarf2out.c
--- gcc/dwarf2out.c	8 Jan 2004 07:54:11 -0000	1.471
+++ gcc/dwarf2out.c	9 Jan 2004 06:08:43 -0000
@@ -1783,11 +1783,19 @@ dw_cfi_oprnd2_desc (enum dwarf_call_fram
 
 #if defined (DWARF2_DEBUGGING_INFO) || defined (DWARF2_UNWIND_INFO)
 
+/* Map register numbers held in the call frame info that gcc has
+   collected using DWARF_FRAME_REGNUM to those that should be output in
+   .debug_frame and .eh_frame.  */
+#ifndef DWARF2_FRAME_REG_OUT
+#define DWARF2_FRAME_REG_OUT(REGNO, FOR_EH) (REGNO)
+#endif
+
 /* Output a Call Frame Information opcode and its operand(s).  */
 
 static void
 output_cfi (dw_cfi_ref cfi, dw_fde_ref fde, int for_eh)
 {
+  unsigned long r;
   if (cfi->dw_cfi_opc == DW_CFA_advance_loc)
     dw2_asm_output_data (1, (cfi->dw_cfi_opc
 			     | (cfi->dw_cfi_oprnd1.dw_cfi_offset & 0x3f)),
@@ -1795,17 +1803,17 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
 			 cfi->dw_cfi_oprnd1.dw_cfi_offset);
   else if (cfi->dw_cfi_opc == DW_CFA_offset)
     {
-      dw2_asm_output_data (1, (cfi->dw_cfi_opc
-			       | (cfi->dw_cfi_oprnd1.dw_cfi_reg_num & 0x3f)),
-			   "DW_CFA_offset, column 0x%lx",
-			   cfi->dw_cfi_oprnd1.dw_cfi_reg_num);
+      r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
+      dw2_asm_output_data (1, (cfi->dw_cfi_opc | (r & 0x3f)),
+			   "DW_CFA_offset, column 0x%lx", r);
       dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd2.dw_cfi_offset, NULL);
     }
   else if (cfi->dw_cfi_opc == DW_CFA_restore)
-    dw2_asm_output_data (1, (cfi->dw_cfi_opc
-			     | (cfi->dw_cfi_oprnd1.dw_cfi_reg_num & 0x3f)),
-			 "DW_CFA_restore, column 0x%lx",
-			 cfi->dw_cfi_oprnd1.dw_cfi_reg_num);
+    {
+      r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
+      dw2_asm_output_data (1, (cfi->dw_cfi_opc | (r & 0x3f)),
+			   "DW_CFA_restore, column 0x%lx", r);
+    }
   else
     {
       dw2_asm_output_data (1, cfi->dw_cfi_opc,
@@ -1850,15 +1858,15 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
 
 	case DW_CFA_offset_extended:
 	case DW_CFA_def_cfa:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
+	  r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
+	  dw2_asm_output_data_uleb128 (r, NULL);
 	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd2.dw_cfi_offset, NULL);
 	  break;
 
 	case DW_CFA_offset_extended_sf:
 	case DW_CFA_def_cfa_sf:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
+	  r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
+	  dw2_asm_output_data_uleb128 (r, NULL);
 	  dw2_asm_output_data_sleb128 (cfi->dw_cfi_oprnd2.dw_cfi_offset, NULL);
 	  break;
 
@@ -1866,15 +1874,15 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
 	case DW_CFA_undefined:
 	case DW_CFA_same_value:
 	case DW_CFA_def_cfa_register:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
+	  r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
+	  dw2_asm_output_data_uleb128 (r, NULL);
 	  break;
 
 	case DW_CFA_register:
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd1.dw_cfi_reg_num,
-				       NULL);
-	  dw2_asm_output_data_uleb128 (cfi->dw_cfi_oprnd2.dw_cfi_reg_num,
-				       NULL);
+	  r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd1.dw_cfi_reg_num, for_eh);
+	  dw2_asm_output_data_uleb128 (r, NULL);
+	  r = DWARF2_FRAME_REG_OUT (cfi->dw_cfi_oprnd2.dw_cfi_reg_num, for_eh);
+	  dw2_asm_output_data_uleb128 (r, NULL);
 	  break;
 
 	case DW_CFA_def_cfa_offset:
@@ -1904,7 +1912,7 @@ output_cfi (dw_cfi_ref cfi, dw_fde_ref f
     }
 }
 
-/* Output the call frame information used to used to record information
+/* Output the call frame information used to record information
    that relates to calculating the frame pointer, and records the
    location of saved registers.  */
 
Index: gcc/doc/tm.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/tm.texi,v
retrieving revision 1.276
diff -u -p -r1.276 tm.texi
--- gcc/doc/tm.texi	30 Dec 2003 20:27:53 -0000	1.276
+++ gcc/doc/tm.texi	9 Jan 2004 06:29:37 -0000
@@ -3289,6 +3289,26 @@ column number to use instead.
 See the PowerPC's SPE target for an example.
 @end defmac
 
+@defmac DWARF_FRAME_REGNUM (@var{regno})
+
+Define this macro if the target's representation for dwarf registers
+used in .eh_frame or .debug_frame is different from that used in other
+debug info sections.  Given a gcc hard register number, this macro
+should return the .eh_frame register number.  The default is
+@code{DBX_REGISTER_NUMBER (@var{regno})}.
+
+@end defmac
+
+@defmac DWARF2_FRAME_REG_OUT (@var{regno}, @var{for_eh})
+
+Define this macro to map register numbers held in the call frame info
+that gcc has collected using @code{DWARF_FRAME_REGNUM} to those that
+should be output in .debug_frame (@code{@var{for_eh}} is zero) and
+.eh_frame (@code{@var{for_eh}} is non-zero).  The default is to 
+return @code{@var{regno}}.
+
+@end defmac
+
 @node Elimination
 @subsection Eliminating Frame Pointer and Arg Pointer
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-09  2:34                     ` Alan Modra
  2004-01-09  2:49                       ` Alan Modra
  2004-01-09  6:39                       ` Alan Modra
@ 2004-01-09 15:15                       ` Mark Kettenis
  2 siblings, 0 replies; 875+ messages in thread
From: Mark Kettenis @ 2004-01-09 15:15 UTC (permalink / raw)
  To: amodra; +Cc: geoffk, cagney, gcc-patches, dje, gdb-patches, Ulrich.Weigand

   Date: Fri, 9 Jan 2004 13:04:42 +1030
   From: Alan Modra <amodra@bigpond.net.au>

   On Wed, Jan 07, 2004 at 09:01:31PM -0800, Geoff Keating wrote:
   > That will also mean that executables built with a new version of GCC won't
   > run on operating systems with an old libgcc.  For Darwin, because of
   > historical mistakes involving non-shared libgcc, it will make life
   > very difficult.

   I hadn't thought of that.  This means we are stuck with the current
   .eh_frame register numbering.

I was afraid of that :-(.  

   It should be possible to fix .debug_frame with something like the
   following totally untested patch.  I'm throwing it out to the list
   now for comment.

That way we can hope that GCC's .debug_frame will be compatible with
DWARF CFI generated by other compilers, which would be a great win I
think.  GDB doesn't support different register encodings for .eh_frame
and .debug_frame yet, but that should be fixable; we can distinguish
between the two.
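
For illustration only, here is a minimal sketch of the split described
above, written as a plain C function rather than a target macro.  It is
not the patch itself: frame_reg_out, struct reg_remap and remap_table
are hypothetical names, and the register numbers in the table are
placeholders, not the values the rs6000 port uses.

```c
#include <stddef.h>

/* Sketch only: models the idea behind DWARF2_FRAME_REG_OUT.
   All names and numbers here are hypothetical.  */
struct reg_remap
{
  unsigned eh_regno;     /* number as collected for .eh_frame */
  unsigned debug_regno;  /* number to emit in .debug_frame */
};

/* Placeholder table; a real target would list its own registers.  */
static const struct reg_remap remap_table[] = {
  { 65, 108 },
  { 66, 109 },
};

unsigned
frame_reg_out (unsigned regno, int for_eh)
{
  size_t i;

  /* .eh_frame keeps the existing numbering, so an old unwinder
     still understands newly compiled code.  */
  if (for_eh)
    return regno;

  /* .debug_frame may use a different numbering for some registers.  */
  for (i = 0; i < sizeof remap_table / sizeof remap_table[0]; i++)
    if (remap_table[i].eh_regno == regno)
      return remap_table[i].debug_regno;

  /* Registers not listed are emitted unchanged.  */
  return regno;
}
```

The point of keying the decision on for_eh is that only the debug
section changes, which is what keeps binary compatibility with an
existing libgcc.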

Mark

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-09  6:39                       ` Alan Modra
@ 2004-01-17  6:54                         ` Alan Modra
  2004-01-17  8:05                           ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-01-17  6:54 UTC (permalink / raw)
  To: Geoff Keating, gcc-patches

On Fri, Jan 09, 2004 at 05:09:21PM +1030, Alan Modra wrote:
> 	* config/rs6000/sysv4.h (DWARF2_FRAME_REG_OUT): Define.
> 	* dwarf2out.c (output_cfi): Map regs using DWARF2_FRAME_REG_OUT.
> 	* doc/tm.texi (DWARF_FRAME_REGNUM, DWARF2_FRAME_REG_OUT): Document.

Ping.  http://gcc.gnu.org/ml/gcc-patches/2004-01/msg00638.html

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-17  6:54                         ` Alan Modra
@ 2004-01-17  8:05                           ` Geoff Keating
  2004-01-18  6:03                             ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2004-01-17  8:05 UTC (permalink / raw)
  To: amodra; +Cc: gcc-patches

> Date: Sat, 17 Jan 2004 17:24:09 +1030
> From: Alan Modra <amodra@bigpond.net.au>

> On Fri, Jan 09, 2004 at 05:09:21PM +1030, Alan Modra wrote:
> > 	* config/rs6000/sysv4.h (DWARF2_FRAME_REG_OUT): Define.
> > 	* dwarf2out.c (output_cfi): Map regs using DWARF2_FRAME_REG_OUT.
> > 	* doc/tm.texi (DWARF_FRAME_REGNUM, DWARF2_FRAME_REG_OUT): Document.
> 
> Ping.  http://gcc.gnu.org/ml/gcc-patches/2004-01/msg00638.html

Please also:

- Check that binary compatibility holds, that is if you use a
  libgcc.so from before this patch, then you can throw & catch
  exceptions both in and between old & new code.
- Run the GDB testsuite.

(I know these are a bother, but I think they're the only way to be
really sure the patch is safe.)

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Incorrect DWARF-2 register numbers on PPC64?
  2004-01-17  8:05                           ` Geoff Keating
@ 2004-01-18  6:03                             ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-01-18  6:03 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

On Sat, Jan 17, 2004 at 12:05:34AM -0800, Geoff Keating wrote:
> > Date: Sat, 17 Jan 2004 17:24:09 +1030
> > From: Alan Modra <amodra@bigpond.net.au>
> 
> > On Fri, Jan 09, 2004 at 05:09:21PM +1030, Alan Modra wrote:
> > > 	* config/rs6000/sysv4.h (DWARF2_FRAME_REG_OUT): Define.
> > > 	* dwarf2out.c (output_cfi): Map regs using DWARF2_FRAME_REG_OUT.
> > > 	* doc/tm.texi (DWARF_FRAME_REGNUM, DWARF2_FRAME_REG_OUT): Document.
> > 
> > Ping.  http://gcc.gnu.org/ml/gcc-patches/2004-01/msg00638.html
> 
> Please also:
> 
> - Check that binary compatibility holds, that is if you use a
>   libgcc.so from before this patch, then you can throw & catch
>   exceptions both in and between old & new code.

As discussed privately, I checked for binary compatibility by looking
at libstdc++.so .eh_frame.  readelf -wf results for libstdc++ built
during a powerpc-linux bootstrap were identical before and after the
patch.

> - Run the GDB testsuite.

Identical results before and after.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Fix libjava failure on powerpc64-linux
@ 2004-02-10 11:42 Alan Modra
  2004-02-10 12:34 ` Andrew Haley
  2004-02-21 13:45 ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-02-10 11:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Haley, David Edelsohn

I finally found out why powerpc64 libjava tests were failing (See
http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02462.html), and would have
a lot sooner if the sigaction syscall return value had been checked.
The powerpc64 linux kernel only provides a sigaction call for 32 bit
processes, something I wasn't aware of.  64 bit processes are supposed
to use rt_sigaction, so the syscall didn't manage to install a handler.
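
As a rough illustration of why the check matters, the sketch below
installs a handler and reports failure to the caller.  It is an
assumption-laden stand-in, not the libjava code: it goes through the
libc sigaction() wrapper instead of a raw syscall, and
install_handler_checked and noop_handler are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper: install HANDLER for SIGNO and return 0 on
   success, -1 on failure, so a failed install cannot go unnoticed.  */
int
install_handler_checked (int signo, void (*handler) (int))
{
  struct sigaction act;

  memset (&act, 0, sizeof act);
  act.sa_handler = handler;
  sigemptyset (&act.sa_mask);

  /* The return value is the only sign that no handler was installed.  */
  if (sigaction (signo, &act, NULL) != 0)
    return -1;
  return 0;
}

/* Hypothetical no-op handler, used only to exercise the helper.  */
void
noop_handler (int signo)
{
  (void) signo;
}
```

Asking for a handler on SIGKILL is a convenient way to see the failure
path, since the kernel refuses that request; an unchecked call would
carry on as if the handler were in place.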

gcc/ChangeLog
	* config/rs6000/linux64.h (MD_FALLBACK_FRAME_STATE_FOR): Don't
	bump retaddr here.

libjava/ChangeLog
	* include/powerpc-signal.h: Revert 2004-01-21 change.
	(INIT_SEGV, INIT_FPE): Provide powerpc64 versions.  Check return
	from syscall for ppc32 versions.

I used an illegal instruction to bomb on syscall failure rather than
an abort, because abort in libjava doesn't do anything fancy, just
prints "Aborted" and exits.  I like sigill because core dumps from a
sigill give you the reg set at the point of failure, instead of having
to look back up the call stack, which can be tedious when gdb doesn't
happen to work too well on your target.  Maybe I should use _Jv_abort
here?

Regtested powerpc64-linux.  OK mainline and 3.4?

Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.57
diff -u -p -r1.57 linux64.h
--- gcc/config/rs6000/linux64.h	3 Feb 2004 00:40:26 -0000	1.57
+++ gcc/config/rs6000/linux64.h	10 Feb 2004 10:22:36 -0000
@@ -643,15 +649,9 @@ enum { SIGNAL_FRAMESIZE = 64 };
     (FS)->regs.reg[LINK_REGISTER_REGNUM].loc.offset 			\
       = (long)&(sc_->regs->link) - new_cfa_;				\
 									\
-    /* The unwinder expects the IP to point to the following insn,	\
-       whereas the kernel returns the address of the actual		\
-       faulting insn. We store NIP+4 in an unused register slot to	\
-       get the same result for multiple evaluation of the same signal	\
-       frame.  */							\
-    sc_->regs->gpr[47] = sc_->regs->nip + 4;  				\
     (FS)->regs.reg[ARG_POINTER_REGNUM].how = REG_SAVED_OFFSET;		\
     (FS)->regs.reg[ARG_POINTER_REGNUM].loc.offset 			\
-      = (long)&(sc_->regs->gpr[47]) - new_cfa_;				\
+      = (long)&(sc_->regs->nip) - new_cfa_;				\
     (FS)->retaddr_column = ARG_POINTER_REGNUM;				\
     goto SUCCESS;							\
   } while (0)
Index: libjava/include/powerpc-signal.h
===================================================================
RCS file: /cvs/gcc/gcc/libjava/include/powerpc-signal.h,v
retrieving revision 1.3
diff -u -p -w -r1.3 powerpc-signal.h
--- libjava/include/powerpc-signal.h	23 Jan 2004 17:32:16 -0000	1.3
+++ libjava/include/powerpc-signal.h	10 Feb 2004 10:41:45 -0000
@@ -13,8 +13,6 @@ details.  */
 #ifndef JAVA_SIGNAL_H
 # define JAVA_SIGNAL_H 1
 
-# ifndef __powerpc64__
-
 #  include <signal.h>
 #  include <sys/syscall.h>
 
@@ -53,6 +51,7 @@ while (0)
    compatibility hacks in MAKE_THROW_FRAME, as the ucontext layout
    on PPC changed during the 2.5 kernel series.  */
 
+#ifndef __powerpc64__
 struct kernel_old_sigaction {
   void (*k_sa_handler) (int, struct sigcontext *);
   unsigned long k_sa_mask;
@@ -67,7 +66,8 @@ do									\
     kact.k_sa_handler = catch_segv;					\
     kact.k_sa_mask = 0;							\
     kact.k_sa_flags = 0;						\
-    syscall (SYS_sigaction, SIGSEGV, &kact, NULL);			\
+    if (syscall (SYS_sigaction, SIGSEGV, &kact, NULL) != 0)		\
+      __asm__ __volatile__ (".long 0");					\
   }									\
 while (0)  
 
@@ -78,17 +78,42 @@ do									\
     kact.k_sa_handler = catch_fpe;					\
     kact.k_sa_mask = 0;							\
     kact.k_sa_flags = 0;						\
-    syscall (SYS_sigaction, SIGFPE, &kact, NULL);			\
+    if (syscall (SYS_sigaction, SIGFPE, &kact, NULL) != 0)		\
+      __asm__ __volatile__ (".long 0");					\
   }									\
 while (0)
 
-# else
+#else /* powerpc64 */
 
-#  undef HANDLE_SEGV
-#  undef HANDLE_FPE
+struct kernel_sigaction
+{
+  void (*k_sa_handler) (int, struct sigcontext *);
+  unsigned long k_sa_flags;
+  void (*k_sa_restorer)(void);
+  unsigned long k_sa_mask;
+};
+
+#define INIT_SEGV							\
+do									\
+  {									\
+    struct kernel_sigaction kact;					\
+    memset (&kact, 0, sizeof (kact));					\
+    kact.k_sa_handler = catch_segv;					\
+    if (syscall (SYS_rt_sigaction, SIGSEGV, &kact, NULL, 8) != 0)	\
+      __asm__ __volatile__ (".long 0");					\
+  }									\
+while (0)  
 
-#  define INIT_SEGV   do {} while (0)
-#  define INIT_FPE   do {} while (0)
+#define INIT_FPE							\
+do									\
+  {									\
+    struct kernel_sigaction kact;					\
+    memset (&kact, 0, sizeof (kact));					\
+    kact.k_sa_handler = catch_fpe;					\
+    if (syscall (SYS_rt_sigaction, SIGFPE, &kact, NULL, 8) != 0)	\
+      __asm__ __volatile__ (".long 0");					\
+  }									\
+while (0)
 # endif
 
 #endif /* JAVA_SIGNAL_H */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Fix libjava failure on powerpc64-linux
  2004-02-10 11:42 Fix libjava failure on powerpc64-linux Alan Modra
@ 2004-02-10 12:34 ` Andrew Haley
  2004-02-10 13:31   ` Alan Modra
  2004-02-21 13:45   ` Andrew Haley
  2004-02-21 13:45 ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: Andrew Haley @ 2004-02-10 12:34 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

Alan Modra writes:
 > I finally found out why powerpc64 libjava tests were failing (See
 > http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02462.html), and would have
 > a lot sooner if the sigaction syscall return value had been checked.
 > The powerpc64 linux kernel only provides a sigaction call for 32 bit
 > processes, something I wasn't aware of.  64 bit processes are supposed
 > to use rt_sigaction, so the syscall didn't manage to install a handler.
 > 
 > gcc/ChangeLog
 > 	* config/rs6000/linux64.h (MD_FALLBACK_FRAME_STATE_FOR): Don't
 > 	bump retaddr here.
 > 
 > libjava/ChangeLog
 > 	* include/powerpc-signal.h: Revert 2004-01-21 change.
 > 	(INIT_SEGV, INIT_FPE): Provide powerpc64 versions.  Check return
 > 	from syscall for ppc32 versions.
 > 
 > Regtested powerpc64-linux.  OK mainline and 3.4?

Thanks.  What were the libgcj test results?

Can we not use rt_sigaction for both 32- and 64-bit processes?

 > I used an illegal instruction to bomb on syscall failure rather than
 > an abort, because abort in libjava doesn't do anything fancy, just
 > prints "Aborted" and exits.  I like sigill because core dumps from a
 > sigill give you the reg set at the point of failure, instead of having
 > to look back up the call stack, which can be tedious when gdb doesn't
 > happen to work too well on your target.  Maybe I should use _Jv_abort
 > here?

I don't know about this one.

Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 12:34 ` Andrew Haley
@ 2004-02-10 13:31   ` Alan Modra
  2004-02-10 14:12     ` Andrew Haley
  2004-02-21 13:45     ` Alan Modra
  2004-02-21 13:45   ` Andrew Haley
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-02-10 13:31 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-patches, David Edelsohn

On Tue, Feb 10, 2004 at 11:41:34AM +0000, Andrew Haley wrote:
> Alan Modra writes:
>  > I finally found out why powerpc64 libjava tests were failing (See
>  > http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02462.html), and would have
>  > a lot sooner if the sigaction syscall return value had been checked.
>  > The powerpc64 linux kernel only provides a sigaction call for 32 bit
>  > processes, something I wasn't aware of.  64 bit processes are supposed
>  > to use rt_sigaction, so the syscall didn't manage to install a handler.
>  > 
>  > gcc/ChangeLog
>  > 	* config/rs6000/linux64.h (MD_FALLBACK_FRAME_STATE_FOR): Don't
>  > 	bump retaddr here.
>  > 
>  > libjava/ChangeLog
>  > 	* include/powerpc-signal.h: Revert 2004-01-21 change.
>  > 	(INIT_SEGV, INIT_FPE): Provide powerpc64 versions.  Check return
>  > 	from syscall for ppc32 versions.
>  > 
>  > Regtested powerpc64-linux.  OK mainline and 3.4?
> 
> Thanks.  What were the libgcj test results?

All passed except for FAIL: linking simple, which was there before.

The log shows:
simple.java: In class `simple':
simple.java: In method `simple.main(java.lang.String[])':
simple.java:5: internal compiler error: Segmentation fault

While if I compile and run the test by hand:

$ CLASSPATH=.. /home/alan/build/ppc/gcc64-curr/powerpc64-linux/libjava/testsuite/../libtool --tag=GCJ --mode=link /home/alan/build/ppc/gcc64-curr/gcc/gcj -B/home/alan/build/ppc/gcc64-curr/gcc/ --encoding=UTF-8 -B/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/ /src/gcc-current/libjava/testsuite/libjava.jar/simple.jar   -no-install --main=simple -g  -L/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/.libs -lm   -o simple
/home/alan/build/ppc/gcc64-curr/gcc/gcj -B/home/alan/build/ppc/gcc64-curr/gcc/ --encoding=UTF-8 -B/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/ /src/gcc-current/libjava/testsuite/libjava.jar/simple.jar --main=simple -g -o simple  -L/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/.libs -lm
$ LD_LIBRARY_PATH=../.libs:../../../gcc ./simple
hi
$

A bit of a worry, but then, this is mainline..

> Can we not use rt_sigaction for both 32- and 64-bit processes?

Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
rt_sigaction.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 13:31   ` Alan Modra
@ 2004-02-10 14:12     ` Andrew Haley
  2004-02-21 13:45       ` Andrew Haley
  2004-02-21 13:45     ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Andrew Haley @ 2004-02-10 14:12 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

Alan Modra writes:
 > 
 > > Can we not use rt_sigaction for both 32- and 64-bit processes?
 > 
 > Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
 > rt_sigaction.

Patch approved.

Thanks,
Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
       [not found]           ` <amodra@bigpond.net.au>
                               ` (6 preceding siblings ...)
  2004-01-08  2:09             ` mainline -mcpu=power4 David Edelsohn
@ 2004-02-10 15:07             ` David Edelsohn
  2004-02-10 15:09               ` Andrew Haley
  2004-02-21 13:45               ` David Edelsohn
  2004-03-03 21:03             ` Fix PR 14406 (rs6000 abstf2) David Edelsohn
                               ` (53 subsequent siblings)
  61 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-02-10 15:07 UTC (permalink / raw)
  To: Andrew Haley, gcc-patches

>>>>> Alan Modra writes:

>> Can we not use rt_sigaction for both 32- and 64-bit processes?

Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
Alan> rt_sigaction.

	Can we test for Linux kernel version or libc version instead of
ppc32 versus ppc64?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:07             ` Fix libjava failure on powerpc64-linux David Edelsohn
@ 2004-02-10 15:09               ` Andrew Haley
  2004-02-10 15:39                 ` David Edelsohn
  2004-02-21 13:45                 ` Andrew Haley
  2004-02-21 13:45               ` David Edelsohn
  1 sibling, 2 replies; 875+ messages in thread
From: Andrew Haley @ 2004-02-10 15:09 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn writes:
 > >>>>> Alan Modra writes:
 > 
 > >> Can we not use rt_sigaction for both 32- and 64-bit processes?
 > 
 > Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
 > Alan> rt_sigaction.
 > 
 > 	Can we test for Linux kernel version or libc version instead
 > of ppc32 versus ppc64?

What bug would that fix?

Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:09               ` Andrew Haley
@ 2004-02-10 15:39                 ` David Edelsohn
  2004-02-10 15:59                   ` Andrew Haley
  2004-02-21 13:45                   ` David Edelsohn
  2004-02-21 13:45                 ` Andrew Haley
  1 sibling, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-02-10 15:39 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-patches

>>>>> Andrew Haley writes:

Andrew> David Edelsohn writes:
>> >>>>> Alan Modra writes:
>> 
>> >> Can we not use rt_sigaction for both 32- and 64-bit processes?
>> 
Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
Alan> rt_sigaction.
>> 
>> Can we test for Linux kernel version or libc version instead
>> of ppc32 versus ppc64?

Andrew> What bug would that fix?

	Not having different rt_sigaction versus sigaction for ppc64 versus
ppc32. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:39                 ` David Edelsohn
@ 2004-02-10 15:59                   ` Andrew Haley
  2004-02-10 16:14                     ` David Edelsohn
  2004-02-21 13:45                     ` Andrew Haley
  2004-02-21 13:45                   ` David Edelsohn
  1 sibling, 2 replies; 875+ messages in thread
From: Andrew Haley @ 2004-02-10 15:59 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn writes:
 > >>>>> Andrew Haley writes:
 > 
 > Andrew> David Edelsohn writes:
 > >> >>>>> Alan Modra writes:
 > >> 
 > >> >> Can we not use rt_sigaction for both 32- and 64-bit processes?
 > >> 
 > Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
 > Alan> rt_sigaction.
 > >> 
 > >> Can we test for Linux kernel version or libc version instead
 > >> of ppc32 versus ppc64?
 > 
 > Andrew> What bug would that fix?
 > 
 > 	Not having different rt_sigaction versus sigaction for ppc64 versus
 > ppc32. 

I don't understand.  This doesn't, as far as I can see, fix any bugs
and it introduces a dependency on the kernel version and it even
further complicates the libgcj configury.  All for 2.x kernels.

Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:59                   ` Andrew Haley
@ 2004-02-10 16:14                     ` David Edelsohn
  2004-02-21 13:45                       ` David Edelsohn
  2004-02-21 13:45                     ` Andrew Haley
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-02-10 16:14 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-patches

>>>>> Andrew Haley writes:

>> >> >>>>> Alan Modra writes:
>> >> 
>> >> >> Can we not use rt_sigaction for both 32- and 64-bit processes?
>> >> 
Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
Alan> rt_sigaction.
>> >> 
>> >> Can we test for Linux kernel version or libc version instead
>> >> of ppc32 versus ppc64?
>> 
Andrew> What bug would that fix?
>> 
>> Not having different rt_sigaction versus sigaction for ppc64 versus
>> ppc32. 

Andrew> I don't understand.  This doesn't, as far as I can see, fix any bugs
Andrew> and it introduces a dependency on the kernel version and it even
Andrew> further complicates the libgcj configury.  All for 2.x kernels.

	You asked if we could use rt_sigaction for both ppc32 and ppc64.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 16:14                     ` David Edelsohn
@ 2004-02-21 13:45                       ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-02-21 13:45 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-patches

>>>>> Andrew Haley writes:

>> >> >>>>> Alan Modra writes:
>> >> 
>> >> >> Can we not use rt_sigaction for both 32- and 64-bit processes?
>> >> 
Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
Alan> rt_sigaction.
>> >> 
>> >> Can we test for Linux kernel version or libc version instead
>> >> of ppc32 versus ppc64?
>> 
Andrew> What bug would that fix?
>> 
>> Not having different rt_sigaction versus sigaction for ppc64 versus
>> ppc32. 

Andrew> I don't understand.  This doesn't, as far as I can see, fix any bugs
Andrew> and it introduces a dependency on the kernel version and it even
Andrew> further complicates the libgcj configury.  All for 2.x kernels.

	You asked if we could use rt_sigaction for both ppc32 and ppc64.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:09               ` Andrew Haley
  2004-02-10 15:39                 ` David Edelsohn
@ 2004-02-21 13:45                 ` Andrew Haley
  1 sibling, 0 replies; 875+ messages in thread
From: Andrew Haley @ 2004-02-21 13:45 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn writes:
 > >>>>> Alan Modra writes:
 > 
 > >> Can we not use rt_sigaction for both 32- and 64-bit processes?
 > 
 > Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
 > Alan> rt_sigaction.
 > 
 > 	Can we test for Linux kernel version or libc version instead
 > of ppc32 versus ppc64?

What bug would that fix?

Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 14:12     ` Andrew Haley
@ 2004-02-21 13:45       ` Andrew Haley
  0 siblings, 0 replies; 875+ messages in thread
From: Andrew Haley @ 2004-02-21 13:45 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

Alan Modra writes:
 > 
 > > Can we not use rt_sigaction for both 32- and 64-bit processes?
 > 
 > Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
 > rt_sigaction.

Patch approved.

Thanks,
Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 13:31   ` Alan Modra
  2004-02-10 14:12     ` Andrew Haley
@ 2004-02-21 13:45     ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-02-21 13:45 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-patches, David Edelsohn

On Tue, Feb 10, 2004 at 11:41:34AM +0000, Andrew Haley wrote:
> Alan Modra writes:
>  > I finally found out why powerpc64 libjava tests were failing (See
>  > http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02462.html), and would have
>  > a lot sooner if the sigaction syscall return value had been checked.
>  > The powerpc64 linux kernel only provides a sigaction call for 32 bit
>  > processes, something I wasn't aware of.  64 bit processes are supposed
>  > to use rt_sigaction, so the syscall didn't manage to install a handler.
>  > 
>  > gcc/ChangeLog
>  > 	* config/rs6000/linux64.h (MD_FALLBACK_FRAME_STATE_FOR): Don't
>  > 	bump retaddr here.
>  > 
>  > libjava/ChangeLog
>  > 	* include/powerpc-signal.h: Revert 2004-01-21 change.
>  > 	(INIT_SEGV, INIT_FPE): Provide powerpc64 versions.  Check return
>  > 	from syscall for ppc32 versions.
>  > 
>  > Regtested powerpc64-linux.  OK mainline and 3.4?
> 
> Thanks.  What were the libgcj test results?

All passed except for FAIL: linking simple, which was there before.

The log shows:
simple.java: In class `simple':
simple.java: In method `simple.main(java.lang.String[])':
simple.java:5: internal compiler error: Segmentation fault

While if I compile and run the test by hand:

$ CLASSPATH=.. /home/alan/build/ppc/gcc64-curr/powerpc64-linux/libjava/testsuite/../libtool --tag=GCJ --mode=link /home/alan/build/ppc/gcc64-curr/gcc/gcj -B/home/alan/build/ppc/gcc64-curr/gcc/ --encoding=UTF-8 -B/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/ /src/gcc-current/libjava/testsuite/libjava.jar/simple.jar   -no-install --main=simple -g  -L/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/.libs -lm   -o simple
/home/alan/build/ppc/gcc64-curr/gcc/gcj -B/home/alan/build/ppc/gcc64-curr/gcc/ --encoding=UTF-8 -B/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/ /src/gcc-current/libjava/testsuite/libjava.jar/simple.jar --main=simple -g -o simple  -L/home/alan/build/ppc/gcc64-curr/powerpc64-linux/./libjava/.libs -lm
$ LD_LIBRARY_PATH=../.libs:../../../gcc ./simple
hi
$

A bit of a worry, but then, this is mainline..

> Can we not use rt_sigaction for both 32- and 64-bit processes?

Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
rt_sigaction.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Fix libjava failure on powerpc64-linux
  2004-02-10 12:34 ` Andrew Haley
  2004-02-10 13:31   ` Alan Modra
@ 2004-02-21 13:45   ` Andrew Haley
  1 sibling, 0 replies; 875+ messages in thread
From: Andrew Haley @ 2004-02-21 13:45 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

Alan Modra writes:
 > I finally found out why powerpc64 libjava tests were failing (See
 > http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02462.html), and would have
 > a lot sooner if the sigaction syscall return value had been checked.
 > The powerpc64 linux kernel only provides a sigaction call for 32 bit
 > processes, something I wasn't aware of.  64 bit processes are supposed
 > to use rt_sigaction, so the syscall didn't manage to install a handler.
 > 
 > gcc/ChangeLog
 > 	* config/rs6000/linux64.h (MD_FALLBACK_FRAME_STATE_FOR): Don't
 > 	bump retaddr here.
 > 
 > libjava/ChangeLog
 > 	* include/powerpc-signal.h: Revert 2004-01-21 change.
 > 	(INIT_SEGV, INIT_FPE): Provide powerpc64 versions.  Check return
 > 	from syscall for ppc32 versions.
 > 
 > Regtested powerpc64-linux.  OK mainline and 3.4?

Thanks.  What were the libgcj test results?

Can we not use rt_sigaction for both 32- and 64-bit processes?

 > I used an illegal instruction to bomb on syscall failure rather than
 > an abort, because abort in libjava doesn't do anything fancy, just
 > prints "Aborted" and exits.  I like sigill because core dumps from a
 > sigill give you the reg set at the point of failure, instead of having
 > to look back up the call stack, which can be tedious when gdb doesn't
 > happen to work too well on your target.  Maybe I should use _Jv_abort
 > here?

I don't know about this one.

Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:59                   ` Andrew Haley
  2004-02-10 16:14                     ` David Edelsohn
@ 2004-02-21 13:45                     ` Andrew Haley
  1 sibling, 0 replies; 875+ messages in thread
From: Andrew Haley @ 2004-02-21 13:45 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn writes:
 > >>>>> Andrew Haley writes:
 > 
 > Andrew> David Edelsohn writes:
 > >> >>>>> Alan Modra writes:
 > >> 
 > >> >> Can we not use rt_sigaction for both 32- and 64-bit processes?
 > >> 
 > Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
 > Alan> rt_sigaction.
 > >> 
 > >> Can we test for Linux kernel version or libc version instead
 > >> of ppc32 versus ppc64?
 > 
 > Andrew> What bug would that fix?
 > 
 > 	Not having different rt_sigaction versus sigaction for ppc64 versus
 > ppc32. 

I don't understand.  This doesn't, as far as I can see, fix any bugs
and it introduces a dependency on the kernel version and it even
further complicates the libgcj configury.  All for 2.x kernels.

Andrew.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:07             ` Fix libjava failure on powerpc64-linux David Edelsohn
  2004-02-10 15:09               ` Andrew Haley
@ 2004-02-21 13:45               ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-02-21 13:45 UTC (permalink / raw)
  To: Andrew Haley, gcc-patches

>>>>> Alan Modra writes:

>> Can we not use rt_sigaction for both 32- and 64-bit processes?

Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
Alan> rt_sigaction.

	Can we test for Linux kernel version or libc version instead of
ppc32 versus ppc64?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Fix libjava failure on powerpc64-linux
  2004-02-10 11:42 Fix libjava failure on powerpc64-linux Alan Modra
  2004-02-10 12:34 ` Andrew Haley
@ 2004-02-21 13:45 ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-02-21 13:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Haley, David Edelsohn

I finally found out why powerpc64 libjava tests were failing (See
http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02462.html), and would have
a lot sooner if the sigaction syscall return value had been checked.
The powerpc64 linux kernel only provides a sigaction call for 32 bit
processes, something I wasn't aware of.  64 bit processes are supposed
to use rt_sigaction, so the syscall didn't manage to install a handler.

gcc/ChangeLog
	* config/rs6000/linux64.h (MD_FALLBACK_FRAME_STATE_FOR): Don't
	bump retaddr here.

libjava/ChangeLog
	* include/powerpc-signal.h: Revert 2004-01-21 change.
	(INIT_SEGV, INIT_FPE): Provide powerpc64 versions.  Check return
	from syscall for ppc32 versions.

I used an illegal instruction to bomb on syscall failure rather than
an abort, because abort in libjava doesn't do anything fancy, just
prints "Aborted" and exits.  I like sigill because core dumps from a
sigill give you the reg set at the point of failure, instead of having
to look back up the call stack, which can be tedious when gdb doesn't
happen to work too well on your target.  Maybe I should use _Jv_abort
here?

Regtested powerpc64-linux.  OK mainline and 3.4?

Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.57
diff -u -p -r1.57 linux64.h
--- gcc/config/rs6000/linux64.h	3 Feb 2004 00:40:26 -0000	1.57
+++ gcc/config/rs6000/linux64.h	10 Feb 2004 10:22:36 -0000
@@ -643,15 +649,9 @@ enum { SIGNAL_FRAMESIZE = 64 };
     (FS)->regs.reg[LINK_REGISTER_REGNUM].loc.offset 			\
       = (long)&(sc_->regs->link) - new_cfa_;				\
 									\
-    /* The unwinder expects the IP to point to the following insn,	\
-       whereas the kernel returns the address of the actual		\
-       faulting insn. We store NIP+4 in an unused register slot to	\
-       get the same result for multiple evaluation of the same signal	\
-       frame.  */							\
-    sc_->regs->gpr[47] = sc_->regs->nip + 4;  				\
     (FS)->regs.reg[ARG_POINTER_REGNUM].how = REG_SAVED_OFFSET;		\
     (FS)->regs.reg[ARG_POINTER_REGNUM].loc.offset 			\
-      = (long)&(sc_->regs->gpr[47]) - new_cfa_;				\
+      = (long)&(sc_->regs->nip) - new_cfa_;				\
     (FS)->retaddr_column = ARG_POINTER_REGNUM;				\
     goto SUCCESS;							\
   } while (0)
Index: libjava/include/powerpc-signal.h
===================================================================
RCS file: /cvs/gcc/gcc/libjava/include/powerpc-signal.h,v
retrieving revision 1.3
diff -u -p -w -r1.3 powerpc-signal.h
--- libjava/include/powerpc-signal.h	23 Jan 2004 17:32:16 -0000	1.3
+++ libjava/include/powerpc-signal.h	10 Feb 2004 10:41:45 -0000
@@ -13,8 +13,6 @@ details.  */
 #ifndef JAVA_SIGNAL_H
 # define JAVA_SIGNAL_H 1
 
-# ifndef __powerpc64__
-
 #  include <signal.h>
 #  include <sys/syscall.h>
 
@@ -53,6 +51,7 @@ while (0)
    compatibility hacks in MAKE_THROW_FRAME, as the ucontext layout
    on PPC changed during the 2.5 kernel series.  */
 
+#ifndef __powerpc64__
 struct kernel_old_sigaction {
   void (*k_sa_handler) (int, struct sigcontext *);
   unsigned long k_sa_mask;
@@ -67,7 +66,8 @@ do									\
     kact.k_sa_handler = catch_segv;					\
     kact.k_sa_mask = 0;							\
     kact.k_sa_flags = 0;						\
-    syscall (SYS_sigaction, SIGSEGV, &kact, NULL);			\
+    if (syscall (SYS_sigaction, SIGSEGV, &kact, NULL) != 0)		\
+      __asm__ __volatile__ (".long 0");					\
   }									\
 while (0)  
 
@@ -78,17 +78,42 @@ do									\
     kact.k_sa_handler = catch_fpe;					\
     kact.k_sa_mask = 0;							\
     kact.k_sa_flags = 0;						\
-    syscall (SYS_sigaction, SIGFPE, &kact, NULL);			\
+    if (syscall (SYS_sigaction, SIGFPE, &kact, NULL) != 0)		\
+      __asm__ __volatile__ (".long 0");					\
   }									\
 while (0)
 
-# else
+#else /* powerpc64 */
 
-#  undef HANDLE_SEGV
-#  undef HANDLE_FPE
+struct kernel_sigaction
+{
+  void (*k_sa_handler) (int, struct sigcontext *);
+  unsigned long k_sa_flags;
+  void (*k_sa_restorer)(void);
+  unsigned long k_sa_mask;
+};
+
+#define INIT_SEGV							\
+do									\
+  {									\
+    struct kernel_sigaction kact;					\
+    memset (&kact, 0, sizeof (kact));					\
+    kact.k_sa_handler = catch_segv;					\
+    if (syscall (SYS_rt_sigaction, SIGSEGV, &kact, NULL, 8) != 0)	\
+      __asm__ __volatile__ (".long 0");					\
+  }									\
+while (0)  
 
-#  define INIT_SEGV   do {} while (0)
-#  define INIT_FPE   do {} while (0)
+#define INIT_FPE							\
+do									\
+  {									\
+    struct kernel_sigaction kact;					\
+    memset (&kact, 0, sizeof (kact));					\
+    kact.k_sa_handler = catch_fpe;					\
+    if (syscall (SYS_rt_sigaction, SIGFPE, &kact, NULL, 8) != 0)	\
+      __asm__ __volatile__ (".long 0");					\
+  }									\
+while (0)
 # endif
 
 #endif /* JAVA_SIGNAL_H */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Fix libjava failure on powerpc64-linux
  2004-02-10 15:39                 ` David Edelsohn
  2004-02-10 15:59                   ` Andrew Haley
@ 2004-02-21 13:45                   ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-02-21 13:45 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-patches

>>>>> Andrew Haley writes:

Andrew> David Edelsohn writes:
>> >>>>> Alan Modra writes:
>> 
>> >> Can we not use rt_sigaction for both 32- and 64-bit processes?
>> 
Alan> Yes.  Hmm, maybe not.  Early 2.x kernels supported ppc32 but didn't have
Alan> rt_sigaction.
>> 
>> Can we test for Linux kernel version or libc version instead
>> of ppc32 versus ppc64?

Andrew> What bug would that fix?

	Not having different rt_sigaction versus sigaction for ppc64 versus
ppc32. 
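
One way a kernel-version test might look, as a rough sketch only.  The
helper names are hypothetical, and the 2.2 cut-off for rt_sigaction is
an assumption made for this example, not something established in the
thread:

```c
#include <stdio.h>

/* Parse "MAJOR.MINOR" from a kernel release string such as "2.4.20".
   Returns 1 on success, 0 if the string does not start with two
   dot-separated integers.  */
static int parse_kernel_release (const char *release, int *major, int *minor)
{
  return sscanf (release, "%d.%d", major, minor) == 2;
}

/* Hypothetical policy helper: assume rt_sigaction is usable from
   kernel 2.2 onwards.  The cut-off is an assumption of this sketch.  */
static int kernel_has_rt_sigaction (int major, int minor)
{
  return major > 2 || (major == 2 && minor >= 2);
}
```

The release string would typically come from uname(2); the result
could then select between the two INIT_SEGV variants at run time.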

David

* Fix PR 14406 (rs6000 abstf2)
@ 2004-03-03 15:14 Alan Modra
  2004-03-19  8:14 ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-03-03 15:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Replaces the bogus abstf2 pattern with one that works.  Details in the
PR.

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

powerpc64-linux bootstrap and regression test in progress.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.296
diff -c -p -r1.296 rs6000.md
*** gcc/config/rs6000/rs6000.md	27 Feb 2004 02:13:59 -0000	1.296
--- gcc/config/rs6000/rs6000.md	3 Mar 2004 15:04:29 -0000
***************
*** 8375,8409 ****
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_insn "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fabs %L0,%L1\;fabs %0,%1\";
!   else
!     return \"fabs %0,%1\;fabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  
! (define_insn ""
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(neg:TF (abs:TF (match_operand:TF 1 "gpc_reg_operand" "f"))))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fnabs %L0,%L1\;fnabs %0,%1\";
!   else
!     return \"fnabs %0,%1\;fnabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.
--- 8376,8415 ----
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_expand "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   rtx label = gen_label_rtx ();
!   emit_insn (gen_abstf2_internal (operands[0], operands[1], label));
!   emit_label (label);
!   DONE;
! }")
  
! (define_expand "abstf2_internal"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(match_operand:TF 1 "gpc_reg_operand" "f"))
!    (set (match_dup 3) (abs:DF (match_dup 5)))
!    (set (match_dup 4) (compare:CCFP (match_dup 3) (match_dup 5)))
!    (set (pc) (if_then_else (eq (match_dup 4) (const_int 0))
! 			   (label_ref (match_operand 2 "" ""))
! 			   (pc)))
!    (set (match_dup 5) (abs:DF (match_dup 5)))
!    (set (match_dup 6) (neg:DF (match_dup 6)))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
!   const int lo_word = FLOAT_WORDS_BIG_ENDIAN ? GET_MODE_SIZE (DFmode) : 0;
!   operands[3] = gen_reg_rtx (DFmode);
!   operands[4] = gen_reg_rtx (CCFPmode);
!   operands[5] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word);
!   operands[6] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word);
! }")
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Fix PR 14406 (rs6000 abstf2)
       [not found]           ` <amodra@bigpond.net.au>
                               ` (7 preceding siblings ...)
  2004-02-10 15:07             ` Fix libjava failure on powerpc64-linux David Edelsohn
@ 2004-03-03 21:03             ` David Edelsohn
  2004-03-03 21:34               ` Alan Modra
  2004-03-19  8:14               ` David Edelsohn
  2004-03-04  2:47             ` David Edelsohn
                               ` (52 subsequent siblings)
  61 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-03 21:03 UTC (permalink / raw)
  To: gcc-patches

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

Okay, assuming no regressions.  I hope that scheduling and CSE actually
produce something better than the raw pattern.  XLC produces:

        fabs    fp0,fp1
        fcmpu   0,fp0,fp1
        bc      BO_IF,CR0_EQ,__L10
        fneg    fp2,fp2
__L10:
        fmr     fp1,fp0
	blr

David

* Re: Fix PR 14406 (rs6000 abstf2)
  2004-03-03 21:03             ` Fix PR 14406 (rs6000 abstf2) David Edelsohn
@ 2004-03-03 21:34               ` Alan Modra
  2004-03-04  2:44                 ` Alan Modra
  2004-03-19  8:14                 ` Alan Modra
  2004-03-19  8:14               ` David Edelsohn
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-03 21:34 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Wed, Mar 03, 2004 at 04:03:52PM -0500, David Edelsohn wrote:
> 	PR target/14406
> 	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
> 	(abstf2, abstf2_internal): New define_expand.
> 
> Okay, assuming no regressions.  I hope that scheduling and CSE actually
> produce something better than the raw pattern.  XLC produces:
> 
>         fabs    fp0,fp1
>         fcmpu   0,fp0,fp1
>         bc      BO_IF,CR0_EQ,__L10
>         fneg    fp2,fp2
> __L10:
>         fmr     fp1,fp0
> 	blr

The following is -O1 -mlong-double-128 code.

.foo:
        fabs 0,1
        fcmpu 7,0,1
        beqlr- 7
        fmr 1,0
        fneg 2,2
        blr

I was about to say that gcc does better, but the XLC sequence has
alerted me to the fact that I'm not doing the right thing for -0.0

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Fix PR 14406 (rs6000 abstf2)
  2004-03-03 21:34               ` Alan Modra
@ 2004-03-04  2:44                 ` Alan Modra
  2004-03-19  8:14                   ` Alan Modra
  2004-03-19  8:14                 ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-03-04  2:44 UTC (permalink / raw)
  To: David Edelsohn, gcc-patches

On Thu, Mar 04, 2004 at 08:04:39AM +1030, Alan Modra wrote:
> I was about to say that gcc does better, but the XLC sequence has
> alerted me to the fact that I'm not doing the right thing for -0.0

Revised patch.

long double foo (long double x)
{
  return __builtin_fabsl (x);
}

generates

.foo:
        fmr 0,1
        fabs 1,1
        fcmpu 7,0,1
        beqlr- 7
        fneg 2,2
        blr

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.296
diff -c -p -r1.296 rs6000.md
*** gcc/config/rs6000/rs6000.md	27 Feb 2004 02:13:59 -0000	1.296
--- gcc/config/rs6000/rs6000.md	4 Mar 2004 01:40:41 -0000
***************
*** 8375,8409 ****
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_insn "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fabs %L0,%L1\;fabs %0,%1\";
!   else
!     return \"fabs %0,%1\;fabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  
! (define_insn ""
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(neg:TF (abs:TF (match_operand:TF 1 "gpc_reg_operand" "f"))))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fnabs %L0,%L1\;fnabs %0,%1\";
!   else
!     return \"fnabs %0,%1\;fnabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.
--- 8418,8457 ----
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_expand "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   rtx label = gen_label_rtx ();
!   emit_insn (gen_abstf2_internal (operands[0], operands[1], label));
!   emit_label (label);
!   DONE;
! }")
  
! (define_expand "abstf2_internal"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(match_operand:TF 1 "gpc_reg_operand" "f"))
!    (set (match_dup 3) (match_dup 5))
!    (set (match_dup 5) (abs:DF (match_dup 5)))
!    (set (match_dup 4) (compare:CCFP (match_dup 3) (match_dup 5)))
!    (set (pc) (if_then_else (eq (match_dup 4) (const_int 0))
! 			   (label_ref (match_operand 2 "" ""))
! 			   (pc)))
!    (set (match_dup 6) (neg:DF (match_dup 6)))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
!   const int lo_word = FLOAT_WORDS_BIG_ENDIAN ? GET_MODE_SIZE (DFmode) : 0;
!   operands[3] = gen_reg_rtx (DFmode);
!   operands[4] = gen_reg_rtx (CCFPmode);
!   operands[5] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word);
!   operands[6] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word);
! }")
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Fix PR 14406 (rs6000 abstf2)
       [not found]           ` <amodra@bigpond.net.au>
                               ` (8 preceding siblings ...)
  2004-03-03 21:03             ` Fix PR 14406 (rs6000 abstf2) David Edelsohn
@ 2004-03-04  2:47             ` David Edelsohn
  2004-03-19  8:14               ` David Edelsohn
  2004-03-10  6:23             ` Powerpc64 long double support David Edelsohn
                               ` (51 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-03-04  2:47 UTC (permalink / raw)
  To: gcc-patches

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

Okay.

Thanks, David

* Re: Powerpc64 long double support
       [not found] <40491948.2010900@us.ibm.com>
@ 2004-03-06  9:52 ` Alan Modra
  2004-03-19  8:14   ` Alan Modra
  2004-03-06 10:50 ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-03-06  9:52 UTC (permalink / raw)
  To: Steve Munroe
  Cc: gcc-patches, Geoff Keating, Andreas Jaeger, Dwayne McConnell,
	David Edelsohn, Marcus Meissner

On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> Making progress, the code builds but still has a number of make check 
> failures. Any one who has time for code review and suggested 
> improvements would be greatly appreciated.
> 
> One make check failure may be a code gen bug in hammer3_3. The problem 
> is in nexttoward(). For test-idouble; nexttoward (0, -0) and  nexttoward 
> (-0, -0) should return -0 but we are getting 0 instead.
> 
> The offending statement in s_nexttoward.c is line 54:
> 
> 	if((long double) x==y) return y;	/* x=y, return y */
> 
> The code generated is:
> 
>     10000c9c:	fc 00 68 90 	fmr	f0,f13
>     10000ca0:	c8 22 81 50 	lfd	f1,-32432(r2)
>     10000ca4:	ff 80 18 00 	fcmpu	cr7,f0,f3
>     10000ca8:	40 9e 00 08 	bne-	cr7,10000cb0
>     10000cac:	ff 81 20 00 	fcmpu	cr7,f1,f4
>     10000cb0:	40 9e 00 18 	bne-	cr7,10000cc8
>     10000cb4:	fc 23 20 2a 	fadd	f1,f3,f4
>     10000cb8:	38 21 00 90 	addi	r1,r1,144
>     10000cbc:	e8 01 00 10 	ld	r0,16(r1)
>     10000cc0:	7c 08 03 a6 	mtlr	r0
>     10000cc4:	4e 80 00 20 	blr
> 
> It seems that the conversion of y (a long double) to double generates an
> fadd f1,f3,f4, which seems to change the sign.

This is due to another error in real.c:encode_ibm_extended.

	* real.c (encode_ibm_extended): Duplicate high double in low for
	zeros, infinities and nans.  Explain why.
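
A small C model of why the sign is lost, assuming the default
round-to-nearest mode and that the conversion is a plain add of the
two halves, as in the fadd above.  The helper names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Return the IEEE sign bit of D (1 for negative, including -0.0).  */
static int sign_of (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return (int) (u >> 63);
}

/* Model of the long double -> double conversion: add the two
   component doubles.  */
static double dd_to_double (double hi, double lo)
{
  return hi + lo;
}
```

In this model (-0.0, +0.0) converts to +0.0, because the sum of zeros
of opposite sign is +0.0 under round-to-nearest, while (-0.0, -0.0)
converts to -0.0.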

Index: gcc/real.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.c,v
retrieving revision 1.103.2.7
diff -u -p -r1.103.2.7 real.c
--- gcc/real.c	5 Mar 2004 15:06:08 -0000	1.103.2.7
+++ gcc/real.c	6 Mar 2004 09:48:23 -0000
@@ -3298,8 +3298,9 @@ const struct real_format ieee_extended_i
    range as an IEEE double precision value, but effectively 106 bits of
    significand precision.  Infinity and NaN are represented by their IEEE
    double precision value stored in the first number, the second number is
-   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
-   due to precedent.  */
+   ignored.  Zeroes are set in both doubles so that conversion of a
+   long double -0.0 to double by adding the two doubles will result in
+   -0.0.  Infinities and NaNs do the same due to precedent.  */
 
 static void encode_ibm_extended PARAMS ((const struct real_format *fmt,
 					 long *, const REAL_VALUE_TYPE *));
@@ -3337,10 +3338,8 @@ encode_ibm_extended (fmt, buf, r)
     }
   else
     {
-      /* Inf, NaN, 0 are all representable as doubles, so the
-	 least-significant part can be 0.0.  */
-      buf[2] = 0;
-      buf[3] = 0;
+      buf[2] = buf[0];
+      buf[3] = buf[1];
     }
 }
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Powerpc64 long double support
       [not found] <40491948.2010900@us.ibm.com>
  2004-03-06  9:52 ` Powerpc64 long double support Alan Modra
@ 2004-03-06 10:50 ` Alan Modra
  2004-03-06 23:13   ` Geoff Keating
  2004-03-19  8:14   ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-06 10:50 UTC (permalink / raw)
  To: Steve Munroe
  Cc: gcc-patches, Geoff Keating, Andreas Jaeger, Dwayne McConnell,
	David Edelsohn, Marcus Meissner

On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> +/* Powerpc64 uses the AIX long double format.
> +   
> +   Each long double is made up of two IEEE doubles.  The value of the
> +   long double is the sum of the values of the two parts.  The most
> +   significant part is required to be the value of the long double
> +   rounded to the nearest double, as specified by IEEE.  For Inf
> +   values, the least significant part is required to be one of +0.0 or
> +   -0.0.

Do you know why this is required for Inf?  If there is a reason,
then the patch I just posted to fix -0.0 is wrong.  (In any case,
the patch is incomplete, as rs6000.md extenddftf2 also needs looking
at.)

Hmm, I can see that if you represent +Inf by (+Inf + -Inf), you're
in trouble, because converting to double will result in a NaN.
Perhaps there is some sequence of operations that will result in
(+Inf + +Inf) being turned into (+Inf + -Inf)?
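
The Inf cases can be checked with a short C model, again assuming the
conversion to double is a plain add of the two halves.  The helper
names are hypothetical:

```c
#include <math.h>

/* Model of the long double -> double conversion: add the halves.  */
static double dd_to_double (double hi, double lo)
{
  return hi + lo;
}

/* NaN is the only value that compares unequal to itself.  */
static int is_nan (double d)
{
  return d != d;
}
```

Here (+Inf, -Inf) converts to NaN, whereas (+Inf, +Inf) and
(+Inf, 0.0) both convert to +Inf.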

>  No other requirements are made; so, for example, 1.0 may be
> +   represented as (1.0, +0.0) or (1.0, -0.0), and the low part of a
> +   NaN is don't-care.  */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Powerpc64 long double support
  2004-03-06 10:50 ` Alan Modra
@ 2004-03-06 23:13   ` Geoff Keating
  2004-03-07  6:37     ` Alan Modra
  2004-03-19  8:14     ` Geoff Keating
  2004-03-19  8:14   ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: Geoff Keating @ 2004-03-06 23:13 UTC (permalink / raw)
  To: amodra; +Cc: sjmunroe, gcc-patches, aj, dgm69, dje, meissner

> Date: Sat, 6 Mar 2004 21:20:33 +1030
> From: Alan Modra <amodra@bigpond.net.au>
> Cc: gcc-patches@gcc.gnu.org, Geoff Keating <geoffk@geoffk.org>,
>         Andreas Jaeger <aj@suse.de>, Dwayne McConnell <dgm69@us.ibm.com>,
>         David Edelsohn <dje@watson.ibm.com>,
>         Marcus Meissner <meissner@suse.de>
> 
> On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> > +/* Powerpc64 uses the AIX long double format.
> > +   
> > +   Each long double is made up of two IEEE doubles.  The value of the
> > +   long double is the sum of the values of the two parts.  The most
> > +   significant part is required to be the value of the long double
> > +   rounded to the nearest double, as specified by IEEE.  For Inf
> > +   values, the least significant part is required to be one of +0.0 or
> > +   -0.0.
> 
> Do you know why this is required for Inf?  If there is a reason,
> then the patch I just posted to fix -0.0 is wrong.  (In any case,
> the patch is incomplete, as rs6000.md extenddftf2 also needs looking
> at.)
> 
> Hmm, I can see that if you represent +Inf by (+Inf + -Inf), you're
> in trouble, because converting to double will result in a NaN.
> Perhaps there is some sequence of operations that will result in
> (+Inf + +Inf) being turned into (+Inf + -Inf)?

If you represent +Inf by (+Inf, +/-Inf), then the code to convert a
double to a long double becomes significantly more complicated.  Right
now, it's done by just loading +0.0 in the low double.

You have to use the same value consistently, of course, or when you
compare two Inf values for == you might get the wrong answer.
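
A sketch of the consistency point, assuming a component-wise
comparison like the fcmpu pair in the nexttoward disassembly quoted
earlier in this thread.  struct dd and dd_equal are hypothetical
names:

```c
#include <math.h>

/* Model of an IBM long double: the value is hi + lo.  */
struct dd { double hi, lo; };

/* Compare the high halves, then the low halves.  */
static int dd_equal (struct dd a, struct dd b)
{
  return a.hi == b.hi && a.lo == b.lo;
}
```

In this model (+Inf, 0.0) and (+Inf, +Inf) compare unequal even though
both are meant to be +Inf, while (+Inf, +0.0) and (+Inf, -0.0) compare
equal because the two zeros compare equal.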

-- 
- Geoffrey Keating <geoffk@geoffk.org>

* Re: Powerpc64 long double support
  2004-03-06 23:13   ` Geoff Keating
@ 2004-03-07  6:37     ` Alan Modra
  2004-03-07  7:30       ` Richard Henderson
  2004-03-19  8:14       ` Alan Modra
  2004-03-19  8:14     ` Geoff Keating
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-07  6:37 UTC (permalink / raw)
  To: Geoff Keating; +Cc: sjmunroe, gcc-patches, aj, dgm69, dje, meissner

On Sat, Mar 06, 2004 at 03:13:44PM -0800, Geoff Keating wrote:
> > From: Alan Modra <amodra@bigpond.net.au>
> > On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> > > +/* Powerpc64 uses the AIX long double format.
> > > +   
> > > +   Each long double is made up of two IEEE doubles.  The value of the
> > > +   long double is the sum of the values of the two parts.  The most
> > > +   significant part is required to be the value of the long double
> > > +   rounded to the nearest double, as specified by IEEE.  For Inf
> > > +   values, the least significant part is required to be one of +0.0 or
> > > +   -0.0.
> > 
> > Do you know why this is required for Inf?  If there is a reason,
> > then the patch I just posted to fix -0.0 is wrong.  (In any case,
> > the patch is incomplete, as rs6000.md extenddftf2 also needs looking
> > at.)
> > 
> > Hmm, I can see that if you represent +Inf by (+Inf + -Inf), you're
> > in trouble, because converting to double will result in a NaN.
> > Perhaps there is some sequence of operations that will result in
> > (+Inf + +Inf) being turned into (+Inf + -Inf)?
> 
> If you represent +Inf by (+Inf, +/-Inf), then the code to convert a
> double to a long double becomes significantly more complicated.  Right
> now, it's done by just loading +0.0 in the low double.
> 
> You have to use the same value consistently, of course, or when you
> compare two Inf values for == you might get the wrong answer.

OK, I can see that it makes sense to load a zero in extenddftf2, and
like you say, comparisons then explain the need for zero in the low
double of Inf.  Specifically, we need -0.0, so that conversion of
(-0.0 + -0.0) to double works using the existing trunctfdf2.  Is there a
better trick than the following for extenddftf2's body?

	* config/rs6000/rs6000.md (extenddftf2): Use -0.0 in low double.
	* real.c (encode_ibm_extended): Use -0.0 in low double of Inf,
	NaN, and zero.  Update comment.
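
For reference, a short C sketch showing where the 0x80000000 constant
comes from: an IEEE double -0.0 has only its sign bit set, so the
most-significant 32-bit word is 0x80000000 and the other word is zero.
The helper name double_words is hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Split the IEEE double D into its most- and least-significant
   32-bit words, independent of host byte order.  */
static void double_words (double d, uint32_t *msw, uint32_t *lsw)
{
  uint64_t u;

  memcpy (&u, &d, sizeof u);
  *msw = (uint32_t) (u >> 32);
  *lsw = (uint32_t) u;
}
```

Which of buf[2] and buf[3] receives the most-significant word depends
on FLOAT_WORDS_BIG_ENDIAN, as in the real.c hunk of this patch.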

Patch against hammer branch, so expect a reject if applying to mainline.
Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.221.4.12
diff -u -p -r1.221.4.12 rs6000.md
--- gcc/config/rs6000/rs6000.md	6 Feb 2004 07:17:40 -0000	1.221.4.12
+++ gcc/config/rs6000/rs6000.md	7 Mar 2004 04:13:46 -0000
@@ -8170,7 +8175,11 @@
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
 {
-  operands[2] = CONST0_RTX (DFmode);
+  REAL_VALUE_TYPE rv;
+  /* Make a -0.0 */
+  memset (&rv, 0, sizeof (rv));
+  rv.sign = 1;
+  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);
 })
 
 (define_insn_and_split "*extenddftf2_internal"
Index: gcc/real.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.c,v
retrieving revision 1.103.2.7
diff -u -p -r1.103.2.7 real.c
--- gcc/real.c	5 Mar 2004 15:06:08 -0000	1.103.2.7
+++ gcc/real.c	7 Mar 2004 06:26:02 -0000
@@ -3296,10 +3296,9 @@ const struct real_format ieee_extended_i
    numbers whose sum is equal to the extended precision value.  The number
    with greater magnitude is first.  This format has the same magnitude
    range as an IEEE double precision value, but effectively 106 bits of
-   significand precision.  Infinity and NaN are represented by their IEEE
-   double precision value stored in the first number, the second number is
-   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
-   due to precedent.  */
+   significand precision.  Zero, Infinity and NaN are represented by their
+   IEEE double precision value stored in the first number, the second
+   number is -0.0.  */
 
 static void encode_ibm_extended PARAMS ((const struct real_format *fmt,
 					 long *, const REAL_VALUE_TYPE *));
@@ -3338,9 +3337,21 @@ encode_ibm_extended (fmt, buf, r)
   else
     {
       /* Inf, NaN, 0 are all representable as doubles, so the
-	 least-significant part can be 0.0.  */
-      buf[2] = 0;
-      buf[3] = 0;
+	 least-significant part can be zero.  We choose -0.0 because
+	 conversion of IBM extended precision to double is done by
+	 adding the two component doubles.  -0.0 is the only value that
+	 will result in a long double -0.0 correctly converting to a
+	 -0.0 double.  */
+      if (FLOAT_WORDS_BIG_ENDIAN)
+	{
+	  buf[2] = 0x80000000;
+	  buf[3] = 0;
+	}
+      else
+	{
+	  buf[2] = 0;
+	  buf[3] = 0x80000000;
+	}
     }
 }
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Powerpc64 long double support
  2004-03-07  6:37     ` Alan Modra
@ 2004-03-07  7:30       ` Richard Henderson
  2004-03-09  5:05         ` Alan Modra
  2004-03-19  8:14         ` Richard Henderson
  2004-03-19  8:14       ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2004-03-07  7:30 UTC (permalink / raw)
  To: Geoff Keating, sjmunroe, gcc-patches, aj, dgm69, dje, meissner

On Sun, Mar 07, 2004 at 05:07:08PM +1030, Alan Modra wrote:
> -  operands[2] = CONST0_RTX (DFmode);
> +  REAL_VALUE_TYPE rv;
> +  /* Make a -0.0 */
> +  memset (&rv, 0, sizeof (rv));
> +  rv.sign = 1;
> +  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);

How about 

  REAL_VALUE_TYPE rv = REAL_VALUE_NEGATE (dconst0);



r~

* Re: Powerpc64 long double support
  2004-03-07  7:30       ` Richard Henderson
@ 2004-03-09  5:05         ` Alan Modra
  2004-03-09  7:59           ` Richard Henderson
  2004-03-19  8:14           ` Alan Modra
  2004-03-19  8:14         ` Richard Henderson
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-09  5:05 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

On Sat, Mar 06, 2004 at 11:30:23PM -0800, Richard Henderson wrote:
> On Sun, Mar 07, 2004 at 05:07:08PM +1030, Alan Modra wrote:
> > -  operands[2] = CONST0_RTX (DFmode);
> > +  REAL_VALUE_TYPE rv;
> > +  /* Make a -0.0 */
> > +  memset (&rv, 0, sizeof (rv));
> > +  rv.sign = 1;
> > +  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);
> 
> How about 
> 
>   REAL_VALUE_TYPE rv = REAL_VALUE_NEGATE (dconst0);

Well, it's nicer to hide the details of REAL_VALUE_TYPE, but...
REAL_VALUE_NEGATE needs NEGATE_EXPR, i.e. you need to arrange for
tree.h to be included in insn-emit.c.  An easy patch to genemit.c,
but is it a good idea?  Also, real_arithmetic2 is slower than memset.

Hmm, I suppose I could roll my own dconstm0 or even dfmode_m0_rtx.
Doesn't seem worth it though.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: Powerpc64 long double support
  2004-03-09  5:05         ` Alan Modra
@ 2004-03-09  7:59           ` Richard Henderson
  2004-03-09 23:49             ` Alan Modra
  2004-03-19  8:14             ` Richard Henderson
  2004-03-19  8:14           ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: Richard Henderson @ 2004-03-09  7:59 UTC (permalink / raw)
  To: gcc-patches

On Tue, Mar 09, 2004 at 03:35:31PM +1030, Alan Modra wrote:
> On Sat, Mar 06, 2004 at 11:30:23PM -0800, Richard Henderson wrote:
> > On Sun, Mar 07, 2004 at 05:07:08PM +1030, Alan Modra wrote:
> > > -  operands[2] = CONST0_RTX (DFmode);
> > > +  REAL_VALUE_TYPE rv;
> > > +  /* Make a -0.0 */
> > > +  memset (&rv, 0, sizeof (rv));
> > > +  rv.sign = 1;
> > > +  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);
> > 
> > How about 
> > 
> >   REAL_VALUE_TYPE rv = REAL_VALUE_NEGATE (dconst0);
> 
> Well, it's nicer to hide the details of REAL_VALUE_TYPE, but...
> REAL_VALUE_NEGATE needs NEGATE_EXPR, i.e. you need to arrange for
> tree.h to be included in insn-emit.c.  An easy patch to genemit.c,
> but is it a good idea?

I *do* think that's better than frobbing rv.sign yourself.
The less the format of real.h gets exposed the better.
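
The encapsulation argument, sketched with a stand-in type.  Everything
here is hypothetical: struct sketch_real is invented for the example
and is not GCC's REAL_VALUE_TYPE layout.

```c
/* Hypothetical stand-in for an opaque real-number type; the field
   layout is invented for this sketch and is not GCC's.  */
struct sketch_real
{
  unsigned int sign;		/* 1 when negative */
  unsigned long sig;		/* magnitude, 0 for zero */
};

static const struct sketch_real sketch_zero = { 0, 0 };

/* The one place that knows how the sign is stored.  */
static struct sketch_real sketch_negate (struct sketch_real r)
{
  r.sign = !r.sign;
  return r;
}
```

A caller that builds -0.0 as sketch_negate (sketch_zero) keeps working
if the representation changes, whereas a caller that sets the sign
field directly has to be found and updated.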


r~

* Re: Powerpc64 long double support
  2004-03-09  7:59           ` Richard Henderson
@ 2004-03-09 23:49             ` Alan Modra
  2004-03-10  9:42               ` Richard Sandiford
                                 ` (2 more replies)
  2004-03-19  8:14             ` Richard Henderson
  1 sibling, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-09 23:49 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, gcc-patches

On Mon, Mar 08, 2004 at 11:59:45PM -0800, Richard Henderson wrote:
> I *do* think that's better than frobbing rv.sign yourself.
> The less the format of real.h gets exposed the better.

OK.  I decided to avoid including tree.h in insn-emit.c.  Instead,
I'm using a new rs6000 backend function to calculate -0.0.  I believe
it's not necessary to GTY(()) rs6000_dfmode_m0_rtx because it will
point somewhere inside const_double_htab.

	* real.c (encode_ibm_extended): Use -0.0 in low double of Inf,
	NaN, and zero.  Update comment.
	* config/rs6000/rs6000.md (extenddftf2): Use -0.0 in low double.
	* config/rs6000/rs6000.c (rs6000_dfmode_m0): New function.
	* config/rs6000/rs6000-protos.h (rs6000_dfmode_m0): Declare.
	Replace "struct rtx_def *" with "rtx" in decls protected with
	#ifdef RTX_CODE.  Formatting.

Bootstrapped, regression tested powerpc64-linux.

Index: gcc/real.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.c,v
retrieving revision 1.139
diff -u -p -r1.139 real.c
--- gcc/real.c	4 Mar 2004 10:23:20 -0000	1.139
+++ gcc/real.c	9 Mar 2004 23:34:36 -0000
@@ -3216,10 +3215,9 @@ const struct real_format ieee_extended_i
    numbers whose sum is equal to the extended precision value.  The number
    with greater magnitude is first.  This format has the same magnitude
    range as an IEEE double precision value, but effectively 106 bits of
-   significand precision.  Infinity and NaN are represented by their IEEE
-   double precision value stored in the first number, the second number is
-   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
-   due to precedent.  */
+   significand precision.  Zero, Infinity and NaN are represented by their
+   IEEE double precision value stored in the first number, the second
+   number is -0.0.  */
 
 static void encode_ibm_extended (const struct real_format *fmt,
 				 long *, const REAL_VALUE_TYPE *);
@@ -3256,9 +3254,21 @@ encode_ibm_extended (const struct real_f
   else
     {
       /* Inf, NaN, 0 are all representable as doubles, so the
-	 least-significant part can be 0.0.  */
-      buf[2] = 0;
-      buf[3] = 0;
+	 least-significant part can be zero.  We choose -0.0 because
+	 conversion of IBM extended precision to double is done by
+	 adding the two component doubles.  -0.0 is the only value that
+	 will result in a long double -0.0 correctly converting to a
+	 -0.0 double.  */
+      if (FLOAT_WORDS_BIG_ENDIAN)
+	{
+	  buf[2] = 0x80000000;
+	  buf[3] = 0;
+	}
+      else
+	{
+	  buf[2] = 0;
+	  buf[3] = 0x80000000;
+	}
     }
 }
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.604
diff -u -p -r1.604 rs6000.c
--- gcc/config/rs6000/rs6000.c	8 Mar 2004 04:24:27 -0000	1.604
+++ gcc/config/rs6000/rs6000.c	9 Mar 2004 23:34:47 -0000
@@ -2780,6 +2780,22 @@ rs6000_legitimize_address (rtx x, rtx ol
     return NULL_RTX;
 }
 
+/* Construct a -0.0 here for use by extenddftf2.  */
+
+rtx
+rs6000_dfmode_m0 (void)
+{
+  static rtx rs6000_dfmode_m0_rtx;
+
+  if (rs6000_dfmode_m0_rtx == NULL_RTX)
+    {
+      REAL_VALUE_TYPE dconstm0 = REAL_VALUE_NEGATE (dconst0);
+      rs6000_dfmode_m0_rtx = CONST_DOUBLE_FROM_REAL_VALUE (dconstm0, DFmode);
+    }
+
+  return rs6000_dfmode_m0_rtx;
+}
+
 /* Construct the SYMBOL_REF for the tls_get_addr function.  */
 
 static GTY(()) rtx rs6000_tls_symbol;
Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.299
diff -u -p -r1.299 rs6000.md
--- gcc/config/rs6000/rs6000.md	9 Mar 2004 12:10:25 -0000	1.299
+++ gcc/config/rs6000/rs6000.md	9 Mar 2004 23:34:54 -0000
@@ -8235,7 +8235,7 @@
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
 {
-  operands[2] = CONST0_RTX (DFmode);
+  operands[2] = rs6000_dfmode_m0 ();
 })
 
 (define_insn_and_split "*extenddftf2_internal"
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000-protos.h,v
retrieving revision 1.74
diff -u -p -r1.74 rs6000-protos.h
--- gcc/config/rs6000/rs6000-protos.h	6 Feb 2004 06:18:19 -0000	1.74
+++ gcc/config/rs6000/rs6000-protos.h	9 Mar 2004 23:34:40 -0000
@@ -32,8 +32,9 @@ extern void init_cumulative_args (CUMULA
 extern void rs6000_va_start (tree, rtx);
 #endif /* TREE_CODE */
 
-extern struct rtx_def *rs6000_got_register (rtx);
-extern struct rtx_def *find_addr_reg (rtx);
+extern rtx rs6000_got_register (rtx);
+extern rtx rs6000_dfmode_m0 (void);
+extern rtx find_addr_reg (rtx);
 extern int any_operand (rtx, enum machine_mode);
 extern int short_cint_operand (rtx, enum machine_mode);
 extern int u_short_cint_operand (rtx, enum machine_mode);
@@ -120,41 +121,40 @@ extern int rs6000_emit_cmove (rtx, rtx, 
 extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
 extern void output_toc (FILE *, rtx, int, enum machine_mode);
 extern void rs6000_initialize_trampoline (rtx, rtx, rtx);
-extern struct rtx_def *rs6000_longcall_ref (rtx);
+extern rtx rs6000_longcall_ref (rtx);
 extern void rs6000_fatal_bad_address (rtx);
 extern int stmw_operation (rtx, enum machine_mode);
 extern int mfcr_operation (rtx, enum machine_mode);
 extern int mtcrf_operation (rtx, enum machine_mode);
 extern int lmw_operation (rtx, enum machine_mode);
-extern struct rtx_def *create_TOC_reference (rtx);
+extern rtx create_TOC_reference (rtx);
 extern void rs6000_split_multireg_move (rtx, rtx);
 extern void rs6000_emit_move (rtx, rtx, enum machine_mode);
 extern rtx rs6000_legitimize_address (rtx, rtx, enum machine_mode);
 extern rtx rs6000_legitimize_reload_address (rtx, enum machine_mode,
-			    int, int, int, int *);
+					     int, int, int, int *);
 extern int rs6000_legitimate_address (enum machine_mode, rtx, int);
 extern bool rs6000_mode_dependent_address (rtx);
 extern rtx rs6000_return_addr (int, rtx);
 extern void rs6000_output_symbol_ref (FILE*, rtx);
 extern HOST_WIDE_INT rs6000_initial_elimination_offset (int, int);
 
-extern rtx rs6000_machopic_legitimize_pic_address (rtx orig, 
-                            enum machine_mode mode, rtx reg);
+extern rtx rs6000_machopic_legitimize_pic_address (rtx, enum machine_mode,
+						   rtx);
 
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
 extern unsigned int rs6000_special_round_type_align (tree, int, int);
-extern void function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode,
-					  tree, int);
+extern void function_arg_advance (CUMULATIVE_ARGS *,
+				  enum machine_mode, tree, int);
 extern int function_arg_boundary (enum machine_mode, tree);
 extern struct rtx_def *function_arg (CUMULATIVE_ARGS *,
-					     enum machine_mode, tree, int);
+				     enum machine_mode, tree, int);
 extern int function_arg_partial_nregs (CUMULATIVE_ARGS *,
-					       enum machine_mode, tree, int);
+				       enum machine_mode, tree, int);
 extern int function_arg_pass_by_reference (CUMULATIVE_ARGS *,
-						   enum machine_mode,
-						   tree, int);
+					   enum machine_mode, tree, int);
 extern rtx rs6000_function_value (tree, tree);
 extern rtx rs6000_libcall_value (enum machine_mode);
 extern struct rtx_def *rs6000_va_arg (tree, tree);

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
       [not found]           ` <amodra@bigpond.net.au>
                               ` (9 preceding siblings ...)
  2004-03-04  2:47             ` David Edelsohn
@ 2004-03-10  6:23             ` David Edelsohn
  2004-03-10  6:44               ` Alan Modra
  2004-03-19  8:14               ` David Edelsohn
  2004-03-12 20:26             ` Correct powerpc64 long double -0.0 to double conversion David Edelsohn
                               ` (50 subsequent siblings)
  61 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-10  6:23 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

	The PowerPC changes are okay with me.

	Other ports, such as Mips, use the IBM extended format, why not
just add dconstm0 to standard list in real.h instead of creating a special
function for rs6000 port?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10  6:23             ` Powerpc64 long double support David Edelsohn
@ 2004-03-10  6:44               ` Alan Modra
  2004-03-19  8:14                 ` Alan Modra
  2004-03-19  8:14               ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-03-10  6:44 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, gcc-patches

On Wed, Mar 10, 2004 at 01:23:36AM -0500, David Edelsohn wrote:
> 	The PowerPC changes are okay with me.
> 
> 	Other ports, such as Mips, use the IBM extended format, why not
> just add dconstm0 to standard list in real.h instead of creating a special
> function for rs6000 port?

I did consider doing that.  Note that using rs6000_dfmode_m0 costs just
one function call, whereas dconstm0 needs to be converted to double via
const_double_from_real_value, lookup_const_double, htab_find_slot.  I
thought I'd be pushing my luck to ask for a dfmode_m0_rtx in
emit-rtl.c.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-09 23:49             ` Alan Modra
@ 2004-03-10  9:42               ` Richard Sandiford
  2004-03-10 11:01                 ` Alan Modra
                                   ` (2 more replies)
  2004-03-19  8:14               ` Alan Modra
  2004-04-01  0:56               ` Geoff Keating
  2 siblings, 3 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-10  9:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> Index: gcc/real.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/real.c,v
> retrieving revision 1.139
> diff -u -p -r1.139 real.c
> --- gcc/real.c	4 Mar 2004 10:23:20 -0000	1.139
> +++ gcc/real.c	9 Mar 2004 23:34:36 -0000
> @@ -3216,10 +3215,9 @@ const struct real_format ieee_extended_i
>     numbers whose sum is equal to the extended precision value.  The number
>     with greater magnitude is first.  This format has the same magnitude
>     range as an IEEE double precision value, but effectively 106 bits of
> -   significand precision.  Infinity and NaN are represented by their IEEE
> -   double precision value stored in the first number, the second number is
> -   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
> -   due to precedent.  */
> +   significand precision.  Zero, Infinity and NaN are represented by their
> +   IEEE double precision value stored in the first number, the second
> +   number is -0.0.  */
>  
>  static void encode_ibm_extended (const struct real_format *fmt,
>  				 long *, const REAL_VALUE_TYPE *);
> @@ -3256,9 +3254,21 @@ encode_ibm_extended (const struct real_f
>    else
>      {
>        /* Inf, NaN, 0 are all representable as doubles, so the
> -	 least-significant part can be 0.0.  */
> -      buf[2] = 0;
> -      buf[3] = 0;
> +	 least-significant part can be zero.  We choose -0.0 because
> +	 conversion of IBM extended precision to double is done by
> +	 adding the two component doubles.  -0.0 is the only value that
> +	 will result in a long double -0.0 correctly converting to a
> +	 -0.0 double.  */
> +      if (FLOAT_WORDS_BIG_ENDIAN)
> +	{
> +	  buf[2] = 0x80000000;
> +	  buf[3] = 0;
> +	}
> +      else
> +	{
> +	  buf[2] = 0;
> +	  buf[3] = 0x80000000;
> +	}
>      }
>  }
>  

This function seems to be developing in a very ad-hoc way.  Is there really
no spec that says what a canonical "IBM format" number should look like?

The current implementation does seem to be correct for IRIX.  It certainly
uses +0.0 as the low part of NaN, +/-Inf and +/-0.0.  We'd need to split
the definition into two if your patch is needed for powerpc.

FWIW, here's the output of the attached program when compiled with MIPSpro cc.

     NaN : 7ff7ffff ffffffff : 00000000 00000000
    +Inf : 7ff00000 00000000 : 00000000 00000000
       0 : 00000000 00000000 : 00000000 00000000
      -0 : 80000000 00000000 : 00000000 00000000
    -Inf : fff00000 00000000 : 00000000 00000000

Richard


#include <float.h>
#include <stdio.h>
#include <stdlib.h>

void print_it (const char *fmt, long double ll)
{
  union { long double ll; unsigned int i[4]; } u;
  u.ll = ll;
  printf ("%8s : %08x %08x : %08x %08x\n",
	  fmt, u.i[0], u.i[1], u.i[2], u.i[3]);
}

int main ()
{
  print_it ("NaN", 0.0L/0.0L);
  print_it ("+Inf", 1.0L/0.0L);
  print_it ("0", LDBL_MIN / 1e100);
  print_it ("-0", -LDBL_MIN / 1e100);
  print_it ("-Inf", -1.0L/0.0L);
  exit (0);
}

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10  9:42               ` Richard Sandiford
@ 2004-03-10 11:01                 ` Alan Modra
  2004-03-10 11:11                   ` Richard Sandiford
                                     ` (2 more replies)
  2004-03-10 16:18                 ` David Edelsohn
  2004-03-19  8:14                 ` Richard Sandiford
  2 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-10 11:01 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
> when compiled with MIPSpro cc.

Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
mips gcc?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:01                 ` Alan Modra
@ 2004-03-10 11:11                   ` Richard Sandiford
  2004-03-10 18:48                     ` David Edelsohn
  2004-03-19  8:14                     ` Richard Sandiford
  2004-03-10 11:13                   ` Alan Modra
  2004-03-19  8:14                   ` Alan Modra
  2 siblings, 2 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-10 11:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
>> when compiled with MIPSpro cc.
>
> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
> mips gcc?

Seems like it.  The attached program prints the expected:

      -0 : 80000000 00000000

when compiled with either MIPSpro or gcc 3.4.

Richard


#include <float.h>
#include <stdio.h>

void print_it (const char *fmt, double d)
{
  union { double d; unsigned int i[2]; } u;
  u.d = d;
  printf ("%8s : %08x %08x\n", fmt, u.i[0], u.i[1]);
}

long double f () { return -LDBL_MIN / 1e100L; }

int main ()
{
  print_it ("-0", f ());
  return 0;
}

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:01                 ` Alan Modra
  2004-03-10 11:11                   ` Richard Sandiford
@ 2004-03-10 11:13                   ` Alan Modra
  2004-03-10 11:25                     ` Richard Sandiford
  2004-03-19  8:14                     ` Alan Modra
  2004-03-19  8:14                   ` Alan Modra
  2 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-10 11:13 UTC (permalink / raw)
  To: Richard Sandiford, Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 09:31:13PM +1030, Alan Modra wrote:
> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
> > when compiled with MIPSpro cc.
> 
> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
> mips gcc?

The reason for -0.0 in the low double goes like this:

Conversion from long double to double is done by simply adding the
two component doubles.  That means long double -0.0 must be
(-0.0 + -0.0), or you need to add code to handle -0.0 on every
conversion.

Conversion from double to long double is done by using the double
in the high part and making the low part zero.  For consistency
(and correct conversion back to double), the low part must be -0.0
when the double is -0.0.  Again, if you want +0.0, +Inf, -Inf to
have +0.0 as the low double, then you need something more complicated
than just loading a constant value in the low double.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:13                   ` Alan Modra
@ 2004-03-10 11:25                     ` Richard Sandiford
  2004-03-10 11:58                       ` Alan Modra
  2004-03-19  8:14                       ` Richard Sandiford
  2004-03-19  8:14                     ` Alan Modra
  1 sibling, 2 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-10 11:25 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> On Wed, Mar 10, 2004 at 09:31:13PM +1030, Alan Modra wrote:
>> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
>> > when compiled with MIPSpro cc.
>> 
>> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
>> mips gcc?
>
> The reason for -0.0 in the low double goes like this:
>
> Conversion from long double to double is done by simply adding the
> two component doubles.  That means long double -0.0 must be
> (-0.0 + -0.0), or you need to add code to handle -0.0 on every
> conversion.

Not sure: are you saying that's what the spec says you should do, or
that it is just what a particular implementation does?  As per my
previous message, IRIX uses +0.0 for the low double and it still gets
the conversion right.  I assume it must be using something other than
simple addition.

My only concern (in case it wasn't obvious ;) is that you don't
change the behaviour for IRIX.  I'm certainly not trying to say
the change is wrong for powerpc...

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:25                     ` Richard Sandiford
@ 2004-03-10 11:58                       ` Alan Modra
  2004-03-10 12:06                         ` Richard Sandiford
  2004-03-19  8:14                         ` Alan Modra
  2004-03-19  8:14                       ` Richard Sandiford
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-10 11:58 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 11:25:38AM +0000, Richard Sandiford wrote:
> Alan Modra <amodra@bigpond.net.au> writes:
> > On Wed, Mar 10, 2004 at 09:31:13PM +1030, Alan Modra wrote:
> >> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
> >> > when compiled with MIPSpro cc.
> >> 
> >> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
> >> mips gcc?
> >
> > The reason for -0.0 in the low double goes like this:
> >
> > Conversion from long double to double is done by simply adding the
> > two component doubles.  That means long double -0.0 must be
> > (-0.0 + -0.0), or you need to add code to handle -0.0 on every
> > conversion.
> 
> Not sure: are you saying that's what the spec says you should do, or
> that it is just what a particular implementation does?

No, I'm not talking about any spec or other implementation.  I'm just
following through the logical implications of using the simplest
long double -> double -> long double conversion sequences.

>  As per my
> previous message, IRIX uses +0.0 for the low double and it still gets
> the conversion right.  I assume it must be using something other than
> simple addition.

I'm curious as to what it uses.

> My only concern (in case it wasn't obvious ;) is that you don't
> change the behaviour for IRIX.

Easy.  I can add this.

  else if (!fmt->qnan_msb_set)
    {
      /* MIPS slavishly follows proprietary compilers, which use 0.0
	 in the low word.  */
      buf[2] = 0;
      buf[3] = 0;
    }

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:58                       ` Alan Modra
@ 2004-03-10 12:06                         ` Richard Sandiford
  2004-03-10 12:25                           ` Alan Modra
                                             ` (2 more replies)
  2004-03-19  8:14                         ` Alan Modra
  1 sibling, 3 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-10 12:06 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
>> My only concern (in case it wasn't obvious ;) is that you don't
>> change the behaviour for IRIX.
>
> Easy.  I can add this.
>
>   else if (!fmt->qnan_msb_set)
>     {
>       /* MIPS slavishly follows proprietary compilers, which use 0.0
> 	 in the low word.  */
>       buf[2] = 0;
>       buf[3] = 0;
>     }

Sounds good, although the comment's a bit on the vitriolic side. ;)
Surely the long double representation is as much a part of the ABI as
any other data representation?  I.e., it's not that we're doing something
just because another compiler does it.  We're doing it because that's
the platform ABI.

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:06                         ` Richard Sandiford
@ 2004-03-10 12:25                           ` Alan Modra
  2004-03-19  8:14                             ` Alan Modra
  2004-03-10 12:42                           ` Andreas Schwab
  2004-03-19  8:14                           ` Richard Sandiford
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-03-10 12:25 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 12:06:57PM +0000, Richard Sandiford wrote:
> Alan Modra <amodra@bigpond.net.au> writes:
> >> My only concern (in case it wasn't obvious ;) is that you don't
> >> change the behaviour for IRIX.
> >
> > Easy.  I can add this.
> >
> >   else if (!fmt->qnan_msb_set)
> >     {
> >       /* MIPS slavishly follows proprietary compilers, which use 0.0
> > 	 in the low word.  */
> >       buf[2] = 0;
> >       buf[3] = 0;
> >     }
> 
> Sounds good, although the comment's a bit on the vitriolic side. ;)

Heh.  It was meant to sting a little in a friendly way.  Perhaps
/* MIPS uses +0.0 in the low word.  */ would suit better?  :)

> Surely the long double representation is as much a part of the ABI as
> any other data representation?  I.e., it's not that we're doing something
> just because another compiler does it.  We're doing it because that's
> the platform ABI.
> 
> Richard

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:06                         ` Richard Sandiford
  2004-03-10 12:25                           ` Alan Modra
@ 2004-03-10 12:42                           ` Andreas Schwab
  2004-03-10 12:53                             ` Richard Sandiford
                                               ` (2 more replies)
  2004-03-19  8:14                           ` Richard Sandiford
  2 siblings, 3 replies; 875+ messages in thread
From: Andreas Schwab @ 2004-03-10 12:42 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

Richard Sandiford <rsandifo@redhat.com> writes:

> Alan Modra <amodra@bigpond.net.au> writes:
>>> My only concern (in case it wasn't obvious ;) is that you don't
>>> change the behaviour for IRIX.
>>
>> Easy.  I can add this.
>>
>>   else if (!fmt->qnan_msb_set)
>>     {
>>       /* MIPS slavishly follows proprietary compilers, which use 0.0
>> 	 in the low word.  */
>>       buf[2] = 0;
>>       buf[3] = 0;
>>     }
>
> Sounds good, although the comment's a bit on the vitriolic side. ;)
> Surely the long double representation is as much a part of the ABI as
> any other data representation?  I.e., it's not that we're doing something
> just because another compiler does it.  We're doing it because that's
> the platform ABI.

The sign bit might also be a don't-care at this point, in which case both
formats must be supported.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:42                           ` Andreas Schwab
@ 2004-03-10 12:53                             ` Richard Sandiford
  2004-03-19  8:14                               ` Richard Sandiford
  2004-03-10 13:48                             ` Alan Modra
  2004-03-19  8:14                             ` Andreas Schwab
  2 siblings, 1 reply; 875+ messages in thread
From: Richard Sandiford @ 2004-03-10 12:53 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Richard Henderson, David Edelsohn, gcc-patches

Andreas Schwab <schwab@suse.de> writes:
> The sign bit might also be a don't-care at this point, in which case both
> formats must be supported.

It might, that's true, but unless someone can show that it definitely
_is_, I don't think we should make that assumption.  math(3M) just says:

     Long double infinity is represented as the sum of a double
     infinity and a double zero; similarly for NaNs.

which perhaps can be read either way, but which hardly goes out
on a limb to say "zero of either sign".

Also, libgcc will use +0.0, so we might as well be consistent.

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:42                           ` Andreas Schwab
  2004-03-10 12:53                             ` Richard Sandiford
@ 2004-03-10 13:48                             ` Alan Modra
  2004-03-19  8:14                               ` Alan Modra
  2004-03-19  8:14                             ` Andreas Schwab
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-03-10 13:48 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Richard Sandiford, Richard Henderson, Geoff Keating,
	David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 01:42:43PM +0100, Andreas Schwab wrote:
> The sign bit might also be a don't-care at this point, in which case both
> formats must be supported.

I don't believe there is anything in our implementation that cares about
the sign of a low zero word, *except* for long double -> double
conversion of -0.0.

Richard brought up the interesting point privately, that if the high
double of a long double pair is always a correctly rounded double, then
just using that double as the result for a long double -> double
conversion would be correct.  My only counter argument is that our
ABI doesn't specify the long double format as tightly as MIPS does,
only requiring that the larger magnitude double be first, and that
the magnitudes do not overlap.  So our ABI doesn't require correct
rounding of the high double, just that the low double be < 1ULP of the
high double.  However, a correctly rounded high double results in
the best precision results for straight-forward arithmetic routines.

I also think that the MIPS specification requiring correct rounding
(or equivalently the low double <= 0.5 ULP of the high double) is
better.  Our looser spec means that it is possible for someone to
create two long doubles that have exactly the same infinite precision
sum, but different component doubles.  Differing component doubles will
result in the long doubles not comparing equal in gcc's current
implementation, when they have the same value according to our ABI.

So..

If gcc's rs6000/darwin-ldouble.c implementation of the basic arithmetic
operations always rounds the high double correctly, then I'd be quite
happy to work on rewording our ABI, and just taking the high double
for long double -> double conversion as Richard suggested.  You can't
get faster than a conversion that need not do anything.

Then, if we don't need to add the component doubles, we don't need to
use -0.0 in the low double.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10  9:42               ` Richard Sandiford
  2004-03-10 11:01                 ` Alan Modra
@ 2004-03-10 16:18                 ` David Edelsohn
  2004-03-11 15:05                   ` Correct powerpc64 long double -0.0 to double conversion Alan Modra
  2004-03-19  8:14                   ` Powerpc64 long double support David Edelsohn
  2004-03-19  8:14                 ` Richard Sandiford
  2 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-10 16:18 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, gcc-patches

	Using your test program, IBM xlc128 prints:

     NaN : 7ff80000 00000000 : 00000000 00000000
    +Inf : 7ff00000 00000000 : 00000000 00000000
       0 : 00000000 00000000 : 00000000 00000000
      -0 : 80000000 00000000 : 00000000 00000000
    -Inf : fff00000 00000000 : 00000000 00000000

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:11                   ` Richard Sandiford
@ 2004-03-10 18:48                     ` David Edelsohn
  2004-03-10 20:11                       ` Richard Sandiford
  2004-03-19  8:14                       ` David Edelsohn
  2004-03-19  8:14                     ` Richard Sandiford
  1 sibling, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-10 18:48 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, gcc-patches

	I am interested to know exactly what instructions MIPSpro cc
produces for the conversion from long double to double.  For example, the
following program:

double
ld2d (long double f)
{
  return (double)f;
}

Something must be preserving the sign bit given the representation of -0
displayed by the other sample program.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 18:48                     ` David Edelsohn
@ 2004-03-10 20:11                       ` Richard Sandiford
  2004-03-19  8:14                         ` Richard Sandiford
  2004-03-19  8:14                       ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Sandiford @ 2004-03-10 20:11 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:
> 	I am interested to know exactly what instructions MIPSpro cc
> produces for the conversion from long double to double.  For example, the
> following program:
>
> double
> ld2d (long double f)
> {
>   return (double)f;
> }
>
> Something must be preserving the sign bit given the representation of -0
> displayed by the other sample program.

It just calls a library function (__dble_q).  It would be tempting to
disassemble the standard library in order to find out what it does, but
I'm not certain what the licence restrictions are.

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Correct powerpc64 long double -0.0 to double conversion
  2004-03-10 16:18                 ` David Edelsohn
@ 2004-03-11 15:05                   ` Alan Modra
  2004-03-19  8:14                     ` Alan Modra
  2004-03-19  8:14                   ` Powerpc64 long double support David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-03-11 15:05 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Sandiford, Richard Henderson, gcc-patches

On Wed, Mar 10, 2004 at 11:18:38AM -0500, David Edelsohn wrote:
> 	Using your test program, IBM xlc128 prints:
> 
>      NaN : 7ff80000 00000000 : 00000000 00000000
>     +Inf : 7ff00000 00000000 : 00000000 00000000
>        0 : 00000000 00000000 : 00000000 00000000
>       -0 : 80000000 00000000 : 00000000 00000000
>     -Inf : fff00000 00000000 : 00000000 00000000

Let's start again.  We can do without the -0.0 in the low double, and
also simplify long double -> double conversion.

- The PowerPC GCC long double comparison insn, cmptf_internal1, assumes
  there is exactly one representation for any long double value, because
  we compare the component doubles.  This is despite the current
  PowerPC64 Linux ABI (and probably AIX) defining the IBM extended
  precision format in a loose manner that would seem to allow more than
  one representation for certain finite values.
- Our math functions always produce a correctly rounded high double
  (it's hard to see how they could do otherwise, given that the
  underlying hardware does so for double operations if rounding mode is
  set correctly)
- Therefore, the current gcc code only supports long doubles that have
  a correctly rounded high double.
- Therefore, we don't need to add the component doubles when converting
  to double, as Richard Sandiford pointed out.

Also, a correctly rounded high double means that our actual precision
is 107 bits, not 106.  Changing this is orthogonal to the rs6000 back
end change, so I'll leave it to another patch.
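
As a rough illustration of the argument above, here is a minimal sketch in
C, assuming IEEE double arithmetic in round-to-nearest mode without excess
precision.  The struct and function names are hypothetical and are not part
of GCC; the pair simply models a long double as hi + lo.  It shows that for
a pair with a correctly rounded high double the two conversions agree, and
that for the -0 representation printed by xlc128 (high double -0.0, low
double +0.0) the sum loses the sign while the high double keeps it.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical model of an IBM extended-precision long double: the value
   represented is hi + lo.  These names are illustrative only.  */
struct dd
{
  double hi;
  double lo;
};

/* Conversion by adding the components, as the single fadd did.  */
static double
dd_trunc_by_sum (struct dd x)
{
  return x.hi + x.lo;
}

/* Conversion by taking the high double unchanged.  */
static double
dd_trunc_by_high (struct dd x)
{
  return x.hi;
}

/* Return 1 when both conversions agree in value and in sign bit.  */
static int
dd_trunc_agree (struct dd x)
{
  double a = dd_trunc_by_sum (x);
  double b = dd_trunc_by_high (x);
  return a == b && !signbit (a) == !signbit (b);
}

/* Small usage check for the two cases discussed in the thread.  */
static void
dd_example (void)
{
  struct dd one = { 1.0, 0x1p-60 };   /* hi is already the rounded sum */
  struct dd negzero = { -0.0, 0.0 };  /* -0 with a +0.0 low double */

  assert (dd_trunc_agree (one));
  assert (signbit (dd_trunc_by_high (negzero)));   /* sign preserved */
  assert (!signbit (dd_trunc_by_sum (negzero)));   /* -0.0 + 0.0 is +0.0 */
}
```
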

	* config/rs6000/rs6000.md (trunctfdf2): Just use the high double.

Bootstrap and regression test in progress.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.299
diff -u -p -r1.299 rs6000.md
--- gcc/config/rs6000/rs6000.md	9 Mar 2004 12:10:25 -0000	1.299
+++ gcc/config/rs6000/rs6000.md	11 Mar 2004 11:22:13 -0000
@@ -8269,14 +8269,19 @@
   DONE;
 })
 
-(define_insn "trunctfdf2"
+(define_insn_and_split "trunctfdf2"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
 	(float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "f")))]
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
-  "fadd %0,%1,%L1"
-  [(set_attr "type" "fp")
-   (set_attr "length" "4")])
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (match_dup 2))]
+  "
+{
+  const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
+  operands[2] = simplify_gen_subreg (DFmode, operands[1], TFmode, hi_word);
+}")
 
 (define_insn_and_split "trunctfsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=f")


-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Correct powerpc64 long double -0.0 to double conversion
       [not found]           ` <amodra@bigpond.net.au>
                               ` (10 preceding siblings ...)
  2004-03-10  6:23             ` Powerpc64 long double support David Edelsohn
@ 2004-03-12 20:26             ` David Edelsohn
  2004-03-19  8:14               ` David Edelsohn
  2004-04-30 14:55             ` rs6000 stack boundary David Edelsohn
                               ` (49 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-03-12 20:26 UTC (permalink / raw)
  To: Richard Sandiford, Richard Henderson, gcc-patches

	After investigating this further, I would recommend leaving the
second component as +0.0 and continuing to round by summing the two
components.  No patch.

> - Our math functions always produce a correctly rounded high double
>   (it's hard to see how they could do otherwise, given that the
>   underlying hardware does so for double operations if rounding mode is
>   set correctly)

I am not sure what "our math functions" means, but not all efficient long
double algorithms will leave the first component correctly rounded, nor
propagate NaN or Inf for that matter.  If PPC64 Linux requires this corner
case to be correct, it can invoke a more complicated LIBCALL when
-ffast-math is not in effect.
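
To make the concern concrete, a minimal sketch in C follows, assuming IEEE
double arithmetic in round-to-nearest mode without excess precision.  The
names are hypothetical and the sample values are chosen only for
illustration; they are not taken from any real library.  A pair whose low
double exceeds half an ulp of the high double is still a sum of two
doubles, but its high double alone is not the correctly rounded value.

```c
#include <assert.h>

/* Hypothetical model of a long double as the sum hi + lo.  */
struct dd
{
  double hi;
  double lo;
};

/* Return 1 when the high double already equals hi + lo rounded to
   double, i.e. when taking the high double alone is a correct
   conversion to double.  */
static int
dd_high_is_rounded (struct dd x)
{
  return x.hi == x.hi + x.lo;
}

/* Usage check: one pair that satisfies the property and one that
   does not.  */
static void
dd_rounding_example (void)
{
  struct dd normalized = { 1.0, 0x1p-60 };
  /* The low part is 0.75 ulp of the high part, so the sum rounds up
     to the next double after 1.0.  */
  struct dd unnormalized = { 1.0, 0x1.8p-53 };

  assert (dd_high_is_rounded (normalized));
  assert (!dd_high_is_rounded (unnormalized));
  assert (unnormalized.hi + unnormalized.lo == 0x1.0000000000001p0);
}
```
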

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [csl-arm,  HEAD] ARM PATCH - fix QImode addressing on ARMv4
@ 2004-03-13 11:43 Richard Earnshaw
  2004-03-13 13:01 ` Richard Earnshaw
  2004-03-19  8:14 ` Richard Earnshaw
  0 siblings, 2 replies; 875+ messages in thread
From: Richard Earnshaw @ 2004-03-13 11:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard.Earnshaw

[-- Attachment #1: Type: text/plain, Size: 2290 bytes --]

This patch fixes the way that we manage QImode indexes when compiling for
ARM Architecture v4 or later.  In v4 we have an ldrsb instruction that can
sign-extend a byte load (ldrb zero-extends).  Unfortunately the indexing
capabilities of this insn are less flexible than its unsigned counterpart.
In the past we have restricted (mostly) the indexing range of ldrb to that
of its poorer cousin: that generates correct code, but at the expense of
wasting instructions when the indexing exceeds the capabilities of ldrsb.

The patch below addresses all this by introducing a new memory predicate,
arm_extendqisi_mem_op, which can validate an ldrsb address index distinctly
from an ldrb address index.  It does so by calling arm_legitimate_address_p
with a new argument, the 'outer' code, in much the same way as the RTX_COST
macros do.
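
The effect of the extra 'outer' argument on constant offsets can be
sketched in standalone C as below.  This is only a simplified model: the
enum and function names are hypothetical stand-ins for GCC's machine modes
and RTX codes, and the final comparison against the range is assumed to be
a symmetric open interval, which is not shown in the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's machine modes and RTX codes.  */
enum mode { MODE_QI, MODE_HI, MODE_SI };
enum outer_code { OUTER_SET, OUTER_SIGN_EXTEND };

/* Exclusive bound on the magnitude of a constant offset.  On v4, only
   halfword loads and sign-extending byte loads get the narrow range.  */
static long
index_range (enum mode mode, enum outer_code outer, bool arch4)
{
  long range;

  if (arch4)
    {
      if (mode == MODE_HI
	  || (mode == MODE_QI && outer == OUTER_SIGN_EXTEND))
	range = 256;
      else
	range = 4096;
    }
  else
    range = (mode == MODE_HI) ? 4095 : 4096;

  assert (range > 0);
  return range;
}

/* Assumed acceptance test for a constant offset: -range < offset < range.  */
static bool
const_index_ok (enum mode mode, enum outer_code outer, bool arch4,
		long offset)
{
  long range = index_range (mode, outer, arch4);
  return offset < range && offset > -range;
}
```

With this model, a plain byte load at offset 1000 is accepted on v4, while
the same offset under a sign-extend is rejected, which is the distinction
the old code could not make.
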

Measurements on CSiBE code show about 0.1% code size reduction with this
change.

Built and regress tested for arm-unknown-elf on the csl-arm-branch and
HEAD, and fully bootstrapped on armv4-linux-gnu on HEAD.

Committed to csl-arm and HEAD.

2004-03-13  Richard Earnshaw  <rearnsha@arm.com>

	* arm.c (arm_legitimate_address_p): New argument, OUTER.  Pass through
	to arm_legitimate_index_p.  Update all callers with SET as default
	value.
	(arm_legitimate_index_p): New argument, OUTER.  Restrict the index
	range if OUTER is a sign-extend operation on QImode.  Correctly
	reject shift operations on sign-extended QImode addresses.
	(bad_signed_byte_operand): Delete.
	(arm_extendqisi_mem_op): New function.
	* arm.h (EXTRA_CONSTRAINT_ARM): Delete.  Replace with...
	(EXTRA_CONSTRAINT_STR_ARM): ... this.  Handle extended address
	constraints.
	(CONSTRAINT_LEN): New.
	(EXTRA_CONSTRAINT): Delete.  Replace with...
	(EXTRA_CONSTRAINT_STR): ... this.
	(PREDICATE_CODES): Remove bad_signed_byte_operand.
	* arm.md (extendqihi_insn): Use new constraint Uq.  Rework.  Length
	is now always default.
	(define_splits for bad sign-extend loads): Delete.
	(arm_extendqisi, arm_extendqisi_v5): Likewise.
	* arm/vfp.md (arm_movsi_vfp, arm_movdi_vfp, movsf_vfp, movdf_vfp):
	Rework 'U' constraint to 'Uv'.
	* arm-protos.h: Remove bad_signed_byte_operand.  Add
	arm_extendqisi_mem_op.
	* doc/md.texi (ARM constraints): Rename VFP constraint (now Uv).
	Add Uq constraint.



[-- Attachment #2: extendqisi-addr.patch --]
[-- Type: text/plain , Size: 26031 bytes --]

Index: config/arm/arm-protos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm-protos.h,v
retrieving revision 1.60.4.6
diff -p -p -r1.60.4.6 arm-protos.h
*** config/arm/arm-protos.h	30 Jan 2004 16:14:21 -0000	1.60.4.6
--- config/arm/arm-protos.h	11 Mar 2004 07:15:25 -0000
*************** extern int arm_split_constant (RTX_CODE,
*** 50,56 ****
  extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *);
  extern int legitimate_pic_operand_p (rtx);
  extern rtx legitimize_pic_address (rtx, enum machine_mode, rtx);
! extern int arm_legitimate_address_p  (enum machine_mode, rtx, int);
  extern int thumb_legitimate_address_p (enum machine_mode, rtx, int);
  extern int thumb_legitimate_offset_p (enum machine_mode, HOST_WIDE_INT);
  extern rtx arm_legitimize_address (rtx, rtx, enum machine_mode);
--- 50,56 ----
  extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *);
  extern int legitimate_pic_operand_p (rtx);
  extern rtx legitimize_pic_address (rtx, enum machine_mode, rtx);
! extern int arm_legitimate_address_p  (enum machine_mode, rtx, RTX_CODE, int);
  extern int thumb_legitimate_address_p (enum machine_mode, rtx, int);
  extern int thumb_legitimate_offset_p (enum machine_mode, HOST_WIDE_INT);
  extern rtx arm_legitimize_address (rtx, rtx, enum machine_mode);
*************** extern int arm_rhsm_operand (rtx, enum m
*** 70,78 ****
  extern int arm_add_operand (rtx, enum machine_mode);
  extern int arm_addimm_operand (rtx, enum machine_mode);
  extern int arm_not_operand (rtx, enum machine_mode);
  extern int offsettable_memory_operand (rtx, enum machine_mode);
  extern int alignable_memory_operand (rtx, enum machine_mode);
- extern int bad_signed_byte_operand (rtx, enum machine_mode);
  extern int arm_float_rhs_operand (rtx, enum machine_mode);
  extern int arm_float_add_operand (rtx, enum machine_mode);
  extern int power_of_two_operand (rtx, enum machine_mode);
--- 70,78 ----
  extern int arm_add_operand (rtx, enum machine_mode);
  extern int arm_addimm_operand (rtx, enum machine_mode);
  extern int arm_not_operand (rtx, enum machine_mode);
+ extern int arm_extendqisi_mem_op (rtx, enum machine_mode);
  extern int offsettable_memory_operand (rtx, enum machine_mode);
  extern int alignable_memory_operand (rtx, enum machine_mode);
  extern int arm_float_rhs_operand (rtx, enum machine_mode);
  extern int arm_float_add_operand (rtx, enum machine_mode);
  extern int power_of_two_operand (rtx, enum machine_mode);
Index: config/arm/arm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.c,v
retrieving revision 1.303.2.17
diff -p -p -r1.303.2.17 arm.c
*** config/arm/arm.c	3 Mar 2004 16:03:31 -0000	1.303.2.17
--- config/arm/arm.c	11 Mar 2004 07:15:57 -0000
*************** static int arm_gen_constant (enum rtx_co
*** 64,70 ****
  			     rtx, rtx, int, int);
  static unsigned bit_count (unsigned long);
  static int arm_address_register_rtx_p (rtx, int);
! static int arm_legitimate_index_p (enum machine_mode, rtx, int);
  static int thumb_base_register_rtx_p (rtx, enum machine_mode, int);
  inline static int thumb_index_register_rtx_p (rtx, int);
  static int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
--- 64,70 ----
  			     rtx, rtx, int, int);
  static unsigned bit_count (unsigned long);
  static int arm_address_register_rtx_p (rtx, int);
! static int arm_legitimate_index_p (enum machine_mode, rtx, RTX_CODE, int);
  static int thumb_base_register_rtx_p (rtx, enum machine_mode, int);
  inline static int thumb_index_register_rtx_p (rtx, int);
  static int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
*************** legitimize_pic_address (rtx orig, enum m
*** 2708,2714 ****
  	{
  	  /* The base register doesn't really matter, we only want to
  	     test the index for the appropriate mode.  */
! 	  if (!arm_legitimate_index_p (mode, offset, 0))
  	    {
  	      if (!no_new_pseudos)
  		offset = force_reg (Pmode, offset);
--- 2708,2714 ----
  	{
  	  /* The base register doesn't really matter, we only want to
  	     test the index for the appropriate mode.  */
! 	  if (!arm_legitimate_index_p (mode, offset, SET, 0))
  	    {
  	      if (!no_new_pseudos)
  		offset = force_reg (Pmode, offset);
*************** arm_address_register_rtx_p (rtx x, int s
*** 2813,2819 ****
  
  /* Return nonzero if X is a valid ARM state address operand.  */
  int
! arm_legitimate_address_p (enum machine_mode mode, rtx x, int strict_p)
  {
    if (arm_address_register_rtx_p (x, strict_p))
      return 1;
--- 2813,2820 ----
  
  /* Return nonzero if X is a valid ARM state address operand.  */
  int
! arm_legitimate_address_p (enum machine_mode mode, rtx x, RTX_CODE outer,
! 			  int strict_p)
  {
    if (arm_address_register_rtx_p (x, strict_p))
      return 1;
*************** arm_legitimate_address_p (enum machine_m
*** 2826,2832 ****
  	   && arm_address_register_rtx_p (XEXP (x, 0), strict_p)
  	   && GET_CODE (XEXP (x, 1)) == PLUS
  	   && XEXP (XEXP (x, 1), 0) == XEXP (x, 0))
!     return arm_legitimate_index_p (mode, XEXP (XEXP (x, 1), 1), strict_p);
  
    /* After reload constants split into minipools will have addresses
       from a LABEL_REF.  */
--- 2827,2834 ----
  	   && arm_address_register_rtx_p (XEXP (x, 0), strict_p)
  	   && GET_CODE (XEXP (x, 1)) == PLUS
  	   && XEXP (XEXP (x, 1), 0) == XEXP (x, 0))
!     return arm_legitimate_index_p (mode, XEXP (XEXP (x, 1), 1), outer,
! 				   strict_p);
  
    /* After reload constants split into minipools will have addresses
       from a LABEL_REF.  */
*************** arm_legitimate_address_p (enum machine_m
*** 2878,2886 ****
        rtx xop1 = XEXP (x, 1);
  
        return ((arm_address_register_rtx_p (xop0, strict_p)
! 	       && arm_legitimate_index_p (mode, xop1, strict_p))
  	      || (arm_address_register_rtx_p (xop1, strict_p)
! 		  && arm_legitimate_index_p (mode, xop0, strict_p)));
      }
  
  #if 0
--- 2880,2888 ----
        rtx xop1 = XEXP (x, 1);
  
        return ((arm_address_register_rtx_p (xop0, strict_p)
! 	       && arm_legitimate_index_p (mode, xop1, outer, strict_p))
  	      || (arm_address_register_rtx_p (xop1, strict_p)
! 		  && arm_legitimate_index_p (mode, xop0, outer, strict_p)));
      }
  
  #if 0
*************** arm_legitimate_address_p (enum machine_m
*** 2891,2897 ****
        rtx xop1 = XEXP (x, 1);
  
        return (arm_address_register_rtx_p (xop0, strict_p)
! 	      && arm_legitimate_index_p (mode, xop1, strict_p));
      }
  #endif
  
--- 2893,2899 ----
        rtx xop1 = XEXP (x, 1);
  
        return (arm_address_register_rtx_p (xop0, strict_p)
! 	      && arm_legitimate_index_p (mode, xop1, outer, strict_p));
      }
  #endif
  
*************** arm_legitimate_address_p (enum machine_m
*** 2913,2919 ****
  /* Return nonzero if INDEX is valid for an address index operand in
     ARM state.  */
  static int
! arm_legitimate_index_p (enum machine_mode mode, rtx index, int strict_p)
  {
    HOST_WIDE_INT range;
    enum rtx_code code = GET_CODE (index);
--- 2915,2922 ----
  /* Return nonzero if INDEX is valid for an address index operand in
     ARM state.  */
  static int
! arm_legitimate_index_p (enum machine_mode mode, rtx index, RTX_CODE outer,
! 			int strict_p)
  {
    HOST_WIDE_INT range;
    enum rtx_code code = GET_CODE (index);
*************** arm_legitimate_index_p (enum machine_mod
*** 2938,2975 ****
  	    && INTVAL (index) < 256
  	    && INTVAL (index) > -256);
  
!   /* XXX What about ldrsb?  */
!   if (GET_MODE_SIZE (mode) <= 4  && code == MULT
!       && (!arm_arch4 || (mode) != HImode))
!     {
!       rtx xiop0 = XEXP (index, 0);
!       rtx xiop1 = XEXP (index, 1);
! 
!       return ((arm_address_register_rtx_p (xiop0, strict_p)
! 	       && power_of_two_operand (xiop1, SImode))
! 	      || (arm_address_register_rtx_p (xiop1, strict_p)
! 		  && power_of_two_operand (xiop0, SImode)));
      }
  
!   if (GET_MODE_SIZE (mode) <= 4
!       && (code == LSHIFTRT || code == ASHIFTRT
! 	  || code == ASHIFT || code == ROTATERT)
!       && (!arm_arch4 || (mode) != HImode))
!     {
!       rtx op = XEXP (index, 1);
! 
!       return (arm_address_register_rtx_p (XEXP (index, 0), strict_p)
! 	      && GET_CODE (op) == CONST_INT
! 	      && INTVAL (op) > 0
! 	      && INTVAL (op) <= 31);
!     }
! 
!   /* XXX For ARM v4 we may be doing a sign-extend operation during the
!      load, but that has a restricted addressing range and we are unable
!      to tell here whether that is the case.  To be safe we restrict all
!      loads to that range.  */
    if (arm_arch4)
!     range = (mode == HImode || mode == QImode) ? 256 : 4096;
    else
      range = (mode == HImode) ? 4095 : 4096;
  
--- 2941,2982 ----
  	    && INTVAL (index) < 256
  	    && INTVAL (index) > -256);
  
!   if (GET_MODE_SIZE (mode) <= 4
!       && ! (arm_arch4
! 	    && (mode == HImode
! 		|| (mode == QImode && outer == SIGN_EXTEND))))
!     {
!       if (code == MULT)
! 	{
! 	  rtx xiop0 = XEXP (index, 0);
! 	  rtx xiop1 = XEXP (index, 1);
! 
! 	  return ((arm_address_register_rtx_p (xiop0, strict_p)
! 		   && power_of_two_operand (xiop1, SImode))
! 		  || (arm_address_register_rtx_p (xiop1, strict_p)
! 		      && power_of_two_operand (xiop0, SImode)));
! 	}
!       else if (code == LSHIFTRT || code == ASHIFTRT
! 	       || code == ASHIFT || code == ROTATERT)
! 	{
! 	  rtx op = XEXP (index, 1);
! 
! 	  return (arm_address_register_rtx_p (XEXP (index, 0), strict_p)
! 		  && GET_CODE (op) == CONST_INT
! 		  && INTVAL (op) > 0
! 		  && INTVAL (op) <= 31);
! 	}
      }
  
!   /* For ARM v4 we may be doing a sign-extend operation during the
!      load.  */
    if (arm_arch4)
!     {
!       if (mode == HImode || (outer == SIGN_EXTEND && mode == QImode))
! 	range = 256;
!       else
! 	range = 4096;
!     }
    else
      range = (mode == HImode) ? 4095 : 4096;
  
*************** arm_reload_memory_operand (rtx op, enum 
*** 4135,4171 ****
  		  && REGNO (op) >= FIRST_PSEUDO_REGISTER)));
  }
  
- /* Return 1 if OP is a valid memory address, but not valid for a signed byte
-    memory access (architecture V4).
-    MODE is QImode if called when computing constraints, or VOIDmode when
-    emitting patterns.  In this latter case we cannot use memory_operand()
-    because it will fail on badly formed MEMs, which is precisely what we are
-    trying to catch.  */
- int
- bad_signed_byte_operand (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
- {
-   if (GET_CODE (op) != MEM)
-     return 0;
- 
-   op = XEXP (op, 0);
- 
-   /* A sum of anything more complex than reg + reg or reg + const is bad.  */
-   if ((GET_CODE (op) == PLUS || GET_CODE (op) == MINUS)
-       && (!s_register_operand (XEXP (op, 0), VOIDmode)
- 	  || (!s_register_operand (XEXP (op, 1), VOIDmode)
- 	      && GET_CODE (XEXP (op, 1)) != CONST_INT)))
-     return 1;
- 
-   /* Big constants are also bad.  */
-   if (GET_CODE (op) == PLUS && GET_CODE (XEXP (op, 1)) == CONST_INT
-       && (INTVAL (XEXP (op, 1)) > 0xff
- 	  || -INTVAL (XEXP (op, 1)) > 0xff))
-     return 1;
- 
-   /* Everything else is good, or can will automatically be made so.  */
-   return 0;
- }
- 
  /* Return TRUE for valid operands for the rhs of an ARM instruction.  */
  int
  arm_rhs_operand (rtx op, enum machine_mode mode)
--- 4142,4147 ----
*************** cirrus_memory_offset (rtx op)
*** 4356,4361 ****
--- 4332,4346 ----
      }
  
    return 0;
+ }
+ 
+ int
+ arm_extendqisi_mem_op (rtx op, enum machine_mode mode)
+ {
+   if (!memory_operand (op, mode))
+     return 0;
+ 
+   return arm_legitimate_address_p (mode, XEXP (op, 0), SIGN_EXTEND, 0);
  }
  
  /* Return nonzero if OP is a Cirrus or general register.  */
Index: config/arm/arm.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.h,v
retrieving revision 1.210.2.19
diff -p -p -r1.210.2.19 arm.h
*** config/arm/arm.h	5 Mar 2004 17:32:17 -0000	1.210.2.19
--- config/arm/arm.h	11 Mar 2004 07:16:07 -0000
*************** enum reg_class
*** 1486,1508 ****
     accessed without using a load.
     'U' is an address valid for VFP load/store insns.  */
  
! #define EXTRA_CONSTRAINT_ARM(OP, C)					    \
!   ((C) == 'Q' ? GET_CODE (OP) == MEM && GET_CODE (XEXP (OP, 0)) == REG :    \
!    (C) == 'R' ? (GET_CODE (OP) == MEM					    \
! 		 && GET_CODE (XEXP (OP, 0)) == SYMBOL_REF		    \
! 		 && CONSTANT_POOL_ADDRESS_P (XEXP (OP, 0))) :		    \
!    (C) == 'S' ? (optimize > 0 && CONSTANT_ADDRESS_P (OP)) :		    \
!    (C) == 'T' ? cirrus_memory_offset (OP) : 		    		    \
!    (C) == 'U' ? vfp_mem_operand (OP) :					    \
!    0)
  
  #define EXTRA_CONSTRAINT_THUMB(X, C)					\
    ((C) == 'Q' ? (GET_CODE (X) == MEM					\
  		 && GET_CODE (XEXP (X, 0)) == LABEL_REF) : 0)
  
! #define EXTRA_CONSTRAINT(X, C)						\
!   (TARGET_ARM ?								\
!    EXTRA_CONSTRAINT_ARM (X, C) : EXTRA_CONSTRAINT_THUMB (X, C))
  
  #define EXTRA_MEMORY_CONSTRAINT(C, STR) ((C) == 'U')
  
--- 1486,1515 ----
     accessed without using a load.
     'U' is an address valid for VFP load/store insns.  */
  
! #define EXTRA_CONSTRAINT_STR_ARM(OP, C, STR)			\
!   (((C) == 'Q') ? (GET_CODE (OP) == MEM				\
! 		 && GET_CODE (XEXP (OP, 0)) == REG) :		\
!    ((C) == 'R') ? (GET_CODE (OP) == MEM				\
! 		   && GET_CODE (XEXP (OP, 0)) == SYMBOL_REF	\
! 		   && CONSTANT_POOL_ADDRESS_P (XEXP (OP, 0))) :	\
!    ((C) == 'S') ? (optimize > 0 && CONSTANT_ADDRESS_P (OP)) :	\
!    ((C) == 'T') ? cirrus_memory_offset (OP) :			\
!    ((C) == 'U' && (STR)[1] == 'v') ? vfp_mem_operand (OP) :	\
!    ((C) == 'U' && (STR)[1] == 'q')				\
!     ? arm_extendqisi_mem_op (OP, GET_MODE (OP))			\
!       : 0)
! 
! #define CONSTRAINT_LEN(C,STR)				\
!   ((C) == 'U' ? 2 : DEFAULT_CONSTRAINT_LEN (C, STR))
  
  #define EXTRA_CONSTRAINT_THUMB(X, C)					\
    ((C) == 'Q' ? (GET_CODE (X) == MEM					\
  		 && GET_CODE (XEXP (X, 0)) == LABEL_REF) : 0)
  
! #define EXTRA_CONSTRAINT_STR(X, C, STR)		\
!   (TARGET_ARM					\
!    ? EXTRA_CONSTRAINT_STR_ARM (X, C, STR)	\
!    : EXTRA_CONSTRAINT_THUMB (X, C))
  
  #define EXTRA_MEMORY_CONSTRAINT(C, STR) ((C) == 'U')
  
*************** typedef struct
*** 2371,2377 ****
  
  #define ARM_GO_IF_LEGITIMATE_ADDRESS(MODE,X,WIN)		\
    {								\
!     if (arm_legitimate_address_p (MODE, X, REG_STRICT_P))	\
        goto WIN;							\
    }
  
--- 2378,2384 ----
  
  #define ARM_GO_IF_LEGITIMATE_ADDRESS(MODE,X,WIN)		\
    {								\
!     if (arm_legitimate_address_p (MODE, X, SET, REG_STRICT_P))	\
        goto WIN;							\
    }
  
*************** extern int making_const_table;
*** 2862,2868 ****
    {"thumb_cmpneg_operand", {CONST_INT}},				\
    {"thumb_cbrch_target_operand", {SUBREG, REG, MEM}},			\
    {"offsettable_memory_operand", {MEM}},				\
-   {"bad_signed_byte_operand", {MEM}},					\
    {"alignable_memory_operand", {MEM}},					\
    {"shiftable_operator", {PLUS, MINUS, AND, IOR, XOR}},			\
    {"minmax_operator", {SMIN, SMAX, UMIN, UMAX}},			\
--- 2869,2874 ----
Index: config/arm/arm.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.md,v
retrieving revision 1.145.2.14
diff -p -p -r1.145.2.14 arm.md
*** config/arm/arm.md	3 Mar 2004 16:03:32 -0000	1.145.2.14
--- config/arm/arm.md	11 Mar 2004 07:16:25 -0000
***************
*** 3746,3804 ****
    }"
  )
  
- ; Rather than restricting all byte accesses to memory addresses that ldrsb
- ; can handle, we fix up the ones that ldrsb can't grok with a split.
  (define_insn "*extendqihi_insn"
!   [(set (match_operand:HI                 0 "s_register_operand" "=r")
! 	(sign_extend:HI (match_operand:QI 1 "memory_operand"      "m")))]
    "TARGET_ARM && arm_arch4"
!   "*
!   /* If the address is invalid, this will split the instruction into two.  */
!   if (bad_signed_byte_operand (operands[1], VOIDmode))
!     return \"#\";
!   return \"ldr%?sb\\t%0, %1\";
!   "
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
-    (set_attr "length" "8")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
- (define_split
-   [(set (match_operand:HI 0 "s_register_operand" "")
- 	(sign_extend:HI (match_operand:QI 1 "bad_signed_byte_operand" "")))]
-   "TARGET_ARM && arm_arch4 && reload_completed"
-   [(set (match_dup 3) (match_dup 1))
-    (set (match_dup 0) (sign_extend:HI (match_dup 2)))]
-   "
-   {
-     HOST_WIDE_INT offset;
- 
-     operands[3] = gen_rtx_REG (SImode, REGNO (operands[0]));
-     operands[2] = gen_rtx_MEM (QImode, operands[3]);
-     MEM_COPY_ATTRIBUTES (operands[2], operands[1]);
-     operands[1] = XEXP (operands[1], 0);
-     if (GET_CODE (operands[1]) == PLUS
- 	&& GET_CODE (XEXP (operands[1], 1)) == CONST_INT
- 	&& !(const_ok_for_arm (offset = INTVAL (XEXP (operands[1], 1)))
- 	     || const_ok_for_arm (-offset)))
-       {
- 	HOST_WIDE_INT low = (offset > 0
- 			     ? (offset & 0xff) : -((-offset) & 0xff));
- 	XEXP (operands[2], 0) = plus_constant (operands[3], low);
- 	operands[1] = plus_constant (XEXP (operands[1], 0), offset - low);
-       }
-     /* Ensure the sum is in correct canonical form.  */
-     else if (GET_CODE (operands[1]) == PLUS
- 	     && GET_CODE (XEXP (operands[1], 1)) != CONST_INT
- 	     && !s_register_operand (XEXP (operands[1], 1), VOIDmode))
-       operands[1] = gen_rtx_PLUS (GET_MODE (operands[1]),
- 					   XEXP (operands[1], 1),
- 					   XEXP (operands[1], 0));
-   }"
- )
- 
  (define_expand "extendqisi2"
    [(set (match_dup 2)
  	(ashift:SI (match_operand:QI 1 "general_operand" "")
--- 3746,3762 ----
    }"
  )
  
  (define_insn "*extendqihi_insn"
!   [(set (match_operand:HI 0 "s_register_operand" "=r")
! 	(sign_extend:HI (match_operand:QI 1 "memory_operand" "Uq")))]
    "TARGET_ARM && arm_arch4"
!   "ldr%?sb\\t%0, %1"
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
  (define_expand "extendqisi2"
    [(set (match_dup 2)
  	(ashift:SI (match_operand:QI 1 "general_operand" "")
***************
*** 3831,3872 ****
    }"
  )
  
- ; Rather than restricting all byte accesses to memory addresses that ldrsb
- ; can handle, we fix up the ones that ldrsb can't grok with a split.
  (define_insn "*arm_extendqisi"
    [(set (match_operand:SI 0 "s_register_operand" "=r")
! 	(sign_extend:SI (match_operand:QI 1 "memory_operand" "m")))]
    "TARGET_ARM && arm_arch4 && !arm_arch6"
!   "*
!   /* If the address is invalid, this will split the instruction into two.  */
!   if (bad_signed_byte_operand (operands[1], VOIDmode))
!     return \"#\";
!   return \"ldr%?sb\\t%0, %1\";
!   "
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
-    (set_attr "length" "8")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
  (define_insn "*arm_extendqisi_v6"
    [(set (match_operand:SI 0 "s_register_operand" "=r,r")
! 	(sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
    "TARGET_ARM && arm_arch6"
!   "*
!   if (which_alternative == 0)
!     return \"sxtb%?\\t%0, %1\";
! 
!   /* If the address is invalid, this will split the instruction into two.  */
!   if (bad_signed_byte_operand (operands[1], VOIDmode))
!     return \"#\";
! 
!   return \"ldr%?sb\\t%0, %1\";
!   "
    [(set_attr "type" "alu_shift,load_byte")
     (set_attr "predicable" "yes")
-    (set_attr "length" "4,8")
     (set_attr "pool_range" "*,256")
     (set_attr "neg_pool_range" "*,244")]
  )
--- 3789,3814 ----
    }"
  )
  
  (define_insn "*arm_extendqisi"
    [(set (match_operand:SI 0 "s_register_operand" "=r")
! 	(sign_extend:SI (match_operand:QI 1 "memory_operand" "Uq")))]
    "TARGET_ARM && arm_arch4 && !arm_arch6"
!   "ldr%?sb\\t%0, %1"
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
  (define_insn "*arm_extendqisi_v6"
    [(set (match_operand:SI 0 "s_register_operand" "=r,r")
! 	(sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,Uq")))]
    "TARGET_ARM && arm_arch6"
!   "@
!    sxtb%?\\t%0, %1
!    ldr%?sb\\t%0, %1"
    [(set_attr "type" "alu_shift,load_byte")
     (set_attr "predicable" "yes")
     (set_attr "pool_range" "*,256")
     (set_attr "neg_pool_range" "*,244")]
  )
***************
*** 3879,3917 ****
    "sxtab%?\\t%0, %2, %1"
    [(set_attr "type" "alu_shift")
     (set_attr "predicable" "yes")]
- )
- 
- (define_split
-   [(set (match_operand:SI 0 "s_register_operand" "")
- 	(sign_extend:SI (match_operand:QI 1 "bad_signed_byte_operand" "")))]
-   "TARGET_ARM && arm_arch4 && reload_completed"
-   [(set (match_dup 0) (match_dup 1))
-    (set (match_dup 0) (sign_extend:SI (match_dup 2)))]
-   "
-   {
-     HOST_WIDE_INT offset;
- 
-     operands[2] = gen_rtx_MEM (QImode, operands[0]);
-     MEM_COPY_ATTRIBUTES (operands[2], operands[1]);
-     operands[1] = XEXP (operands[1], 0);
-     if (GET_CODE (operands[1]) == PLUS
- 	&& GET_CODE (XEXP (operands[1], 1)) == CONST_INT
- 	&& !(const_ok_for_arm (offset = INTVAL (XEXP (operands[1], 1)))
- 	     || const_ok_for_arm (-offset)))
-       {
- 	HOST_WIDE_INT low = (offset > 0
- 			     ? (offset & 0xff) : -((-offset) & 0xff));
- 	XEXP (operands[2], 0) = plus_constant (operands[0], low);
- 	operands[1] = plus_constant (XEXP (operands[1], 0), offset - low);
-       }
-     /* Ensure the sum is in correct canonical form.  */
-     else if (GET_CODE (operands[1]) == PLUS
- 	     && GET_CODE (XEXP (operands[1], 1)) != CONST_INT
- 	     && !s_register_operand (XEXP (operands[1], 1), VOIDmode))
-       operands[1] = gen_rtx_PLUS (GET_MODE (operands[1]),
- 					   XEXP (operands[1], 1),
- 					   XEXP (operands[1], 0));
-   }"
  )
  
  (define_insn "*thumb_extendqisi2"
--- 3821,3826 ----
Index: config/arm/vfp.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/vfp.md,v
retrieving revision 1.1.2.1
diff -p -p -r1.1.2.1 vfp.md
*** config/arm/vfp.md	30 Jan 2004 16:15:19 -0000	1.1.2.1
--- config/arm/vfp.md	11 Mar 2004 07:16:25 -0000
***************
*** 111,118 ****
  ;; ??? For now do not allow loading constants into vfp regs.  This causes
  ;; problems because small sonstants get converted into adds.
  (define_insn "*arm_movsi_vfp"
!   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,m,!w,r,!w,!w, U")
!       (match_operand:SI 1 "general_operand"	   "rI,K,mi,r,r,!w,!w,Ui,!w"))]
    "TARGET_ARM && TARGET_VFP && TARGET_HARD_FLOAT
     && (   s_register_operand (operands[0], SImode)
         || s_register_operand (operands[1], SImode))"
--- 111,118 ----
  ;; ??? For now do not allow loading constants into vfp regs.  This causes
  ;; problems because small sonstants get converted into adds.
  (define_insn "*arm_movsi_vfp"
!   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,m,!w,r,!w,!w,  Uv")
!       (match_operand:SI 1 "general_operand"	   "rI,K,mi,r,r,!w,!w,Uvi,!w"))]
    "TARGET_ARM && TARGET_VFP && TARGET_HARD_FLOAT
     && (   s_register_operand (operands[0], SImode)
         || s_register_operand (operands[1], SImode))"
***************
*** 136,143 ****
  ;; DImode moves
  
  (define_insn "*arm_movdi_vfp"
!   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r,o<>,w,r,w,w ,U")
! 	(match_operand:DI 1 "di_operand"              "rIK,mi,r ,r,w,w,Ui,w"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    switch (which_alternative)
--- 136,143 ----
  ;; DImode moves
  
  (define_insn "*arm_movdi_vfp"
!   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r,o<>,w,r,w,w  ,Uv")
! 	(match_operand:DI 1 "di_operand"              "rIK,mi,r ,r,w,w,Uvi,w"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    switch (which_alternative)
***************
*** 168,175 ****
  ;; SFmode moves
  
  (define_insn "*movsf_vfp"
!   [(set (match_operand:SF 0 "nonimmediate_operand" "=w,r,w ,U,r ,m,w,r")
! 	(match_operand:SF 1 "general_operand"	   " r,w,UE,w,mE,r,w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
     && (   s_register_operand (operands[0], SFmode)
         || s_register_operand (operands[1], SFmode))"
--- 168,175 ----
  ;; SFmode moves
  
  (define_insn "*movsf_vfp"
!   [(set (match_operand:SF 0 "nonimmediate_operand" "=w,r,w  ,Uv,r ,m,w,r")
! 	(match_operand:SF 1 "general_operand"	   " r,w,UvE,w, mE,r,w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
     && (   s_register_operand (operands[0], SFmode)
         || s_register_operand (operands[1], SFmode))"
***************
*** 192,199 ****
  ;; DFmode moves
  
  (define_insn "*movdf_vfp"
!   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,r,r, m,w ,U,w,r")
! 	(match_operand:DF 1 "soft_df_operand"		   " r,w,mF,r,UF,w,w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    {
--- 192,199 ----
  ;; DFmode moves
  
  (define_insn "*movdf_vfp"
!   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,r,r, m,w  ,Uv,w,r")
! 	(match_operand:DF 1 "soft_df_operand"		   " r,w,mF,r,UvF,w, w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    {
Index: doc/md.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
retrieving revision 1.79.4.4
diff -p -p -r1.79.4.4 md.texi
*** doc/md.texi	3 Mar 2004 16:04:37 -0000	1.79.4.4
--- doc/md.texi	11 Mar 2004 07:17:10 -0000
*************** An item in the constant pool
*** 1379,1386 ****
  A symbol in the text segment of the current file
  @end table
  
! @item U
  A memory reference suitable for VFP load/store insns (reg+constant offset)
  
  @item AVR family---@file{avr.h}
  @table @code
--- 1379,1389 ----
  A symbol in the text segment of the current file
  @end table
  
! @item Uv
  A memory reference suitable for VFP load/store insns (reg+constant offset)
+ 
+ @item Uq
+ A memory reference suitable for the ARMv4 ldrsb instruction.
  
  @item AVR family---@file{avr.h}
  @table @code


* Re: [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4
  2004-03-13 11:43 [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4 Richard Earnshaw
@ 2004-03-13 13:01 ` Richard Earnshaw
  2004-03-13 21:44   ` Daniel Jacobowitz
  2004-03-19  8:14   ` Richard Earnshaw
  2004-03-19  8:14 ` Richard Earnshaw
  1 sibling, 2 replies; 875+ messages in thread
From: Richard Earnshaw @ 2004-03-13 13:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard.Earnshaw, Richard Earnshaw


rearnsha@buzzard.freeserve.co.uk said:
> This patch fixes the way that we manage QImode indexes when compiling
> for ARM Architecture v4 or later.  In v4 we have a ldrsb instruction
> that can sign-extend a byte load (ldrb zero-extends).  Unfortunately
> the indexing capabilities of this insn are less flexible than its
> unsigned counterpart. In the past we have restricted (mostly) the
> indexing range of ldrb to that of its poorer cousin: that generates
> correct code, but at the expense of wasting instructions when the
> indexing exceeds the capabilities of ldrsb.

> The patch below addresses all this by introducing a new memory
> predicate arm_extendqisi_mem_op which can validate a ldrsb address
> index distinctly from an ldrb address index (it does so by calling
> arm_legitimate_address_p with a new argument, the 'outer' code, in much
> the same way as the RTX_COST macros do).


I forgot to mention that the patch changes the memory constraint used for 
VFP memory operands from U to 'Uv'.  The constraint for an ldrsb 
instruction is 'Uq'.
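
For readers unfamiliar with multi-letter constraints, here is a minimal
sketch of the idea, not the GCC code itself: 'U' acts as a prefix, the
second character selects the kind of memory operand, and the constraint
scanner has to step over two characters instead of one.  The names
constraint_len, classify_u_constraint, count_constraints and enum mem_kind
are hypothetical stand-ins for CONSTRAINT_LEN and EXTRA_CONSTRAINT_STR_ARM.

```c
#include <stddef.h>

/* Hypothetical model of CONSTRAINT_LEN: a 'U' followed by another
   character forms a two-character constraint; anything else is one
   character long.  */
static size_t
constraint_len (const char *str)
{
  return (str[0] == 'U' && str[1] != '\0') ? 2 : 1;
}

/* Hypothetical classification of the extended memory constraints.  */
enum mem_kind { MEM_NONE, MEM_VFP, MEM_LDRSB };

static enum mem_kind
classify_u_constraint (const char *str)
{
  if (str[0] != 'U')
    return MEM_NONE;

  switch (str[1])
    {
    case 'v': return MEM_VFP;    /* VFP load/store address.  */
    case 'q': return MEM_LDRSB;  /* ldrsb address.  */
    default:  return MEM_NONE;
    }
}

/* Count the constraints in one alternative, e.g. "UvE" holds two:
   "Uv" and "E".  */
static size_t
count_constraints (const char *alt)
{
  size_t n = 0;

  while (*alt != '\0')
    {
      alt += constraint_len (alt);
      n++;
    }
  return n;
}
```

With this model an alternative such as "UvE" scans as "Uv" followed by
"E", which is why the constraint strings in vfp.md gained a column of
padding.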

R.


* Re: [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4
  2004-03-13 13:01 ` Richard Earnshaw
@ 2004-03-13 21:44   ` Daniel Jacobowitz
  2004-03-14  0:43     ` Richard Earnshaw
  2004-03-19  8:14     ` Daniel Jacobowitz
  2004-03-19  8:14   ` Richard Earnshaw
  1 sibling, 2 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-03-13 21:44 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, Richard.Earnshaw

On Sat, Mar 13, 2004 at 01:01:15PM +0000, Richard Earnshaw wrote:
> 
> rearnsha@buzzard.freeserve.co.uk said:
> > This patch fixes the way that we manage QImode indexes when compiling
> > for ARM Architecture v4 or later.  In v4 we have a ldrsb instruction
> > that can sign-extend a byte load (ldrb zero-extends).  Unfortunately
> > the indexing capabilities of this insn are less flexible than its
> > unsigned counterpart. In the past we have restricted (mostly) the
> > indexing range of ldrb to that of its poorer cousin: that generates
> > correct code, but at the expense of wasting instructions when the
> > indexing exceeds the capabilities of ldrsb.
> 
> > The patch below addresses all this by introducing a new memory
> > predicate arm_extendqisi_mem_op which can validate a ldrsb address
> > index distinctly from an ldrb address index (it does so by calling
> > arm_legitimate_address_p with a new argument, the 'outer' code, in much
> > the same way as the RTX_COST macros do).
> 
> 
> I forgot to mention that the patch changes the memory constraint used for 
> VFP memory operands from U to 'Uv'.  The constraint for an ldrsb 
> instruction is 'Uq'.

In that case, you probably want to update this bit of arm.h:

--- 1486,1515 ----
     accessed without using a load.
     'U' is an address valid for VFP load/store insns.  */


-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4
  2004-03-13 21:44   ` Daniel Jacobowitz
@ 2004-03-14  0:43     ` Richard Earnshaw
  2004-03-19  8:14       ` Richard Earnshaw
  2004-03-19  8:14     ` Daniel Jacobowitz
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Earnshaw @ 2004-03-14  0:43 UTC (permalink / raw)
  To: Richard Earnshaw, gcc-patches, Richard.Earnshaw

[-- Attachment #1: Type: text/plain, Size: 540 bytes --]


> > I forgot to mention that the patch changes the memory constraint used for 
> > VFP memory operands from U to 'Uv'.  The constraint for an ldrsb 
> > instruction is 'Uq'.
> 
> In that case, you probably want to update this bit of arm.h:
> 
> --- 1486,1515 ----
>      accessed without using a load.
>      'U' is an address valid for VFP load/store insns.  */

Argh!  I'd seen that and then forgotten to update it.

Fixed thusly:

2004-03-14  Richard Earnshaw  <rearnsha@arm.com>

	* arm.h (EXTRA_CONSTRAINT_STR_ARM): Update comment.



[-- Attachment #2: extendqisi-addr2.patch --]
[-- Type: text/plain , Size: 1159 bytes --]

Index: config/arm/arm.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.h,v
retrieving revision 1.226
diff -p -p -r1.226 arm.h
*** config/arm/arm.h	13 Mar 2004 11:19:21 -0000	1.226
--- config/arm/arm.h	14 Mar 2004 00:21:23 -0000
*************** enum reg_class
*** 1469,1475 ****
     `S' means any symbol that has the SYMBOL_REF_FLAG set or a CONSTANT_POOL
     address.  This means that the symbol is in the text segment and can be
     accessed without using a load.
!    'U' is an address valid for VFP load/store insns.  */
  
  #define EXTRA_CONSTRAINT_STR_ARM(OP, C, STR)			\
    (((C) == 'Q') ? (GET_CODE (OP) == MEM				\
--- 1469,1477 ----
     `S' means any symbol that has the SYMBOL_REF_FLAG set or a CONSTANT_POOL
     address.  This means that the symbol is in the text segment and can be
     accessed without using a load.
!    'U' Prefixes an extended memory constraint where:
!    'Uv' is an address valid for VFP load/store insns.  
!    'Uq' is an address valid for ldrsb.  */
  
  #define EXTRA_CONSTRAINT_STR_ARM(OP, C, STR)			\
    (((C) == 'Q') ? (GET_CODE (OP) == MEM				\


* Re: Powerpc64 long double support
  2004-03-10  9:42               ` Richard Sandiford
  2004-03-10 11:01                 ` Alan Modra
  2004-03-10 16:18                 ` David Edelsohn
@ 2004-03-19  8:14                 ` Richard Sandiford
  2 siblings, 0 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> Index: gcc/real.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/real.c,v
> retrieving revision 1.139
> diff -u -p -r1.139 real.c
> --- gcc/real.c	4 Mar 2004 10:23:20 -0000	1.139
> +++ gcc/real.c	9 Mar 2004 23:34:36 -0000
> @@ -3216,10 +3215,9 @@ const struct real_format ieee_extended_i
>     numbers whose sum is equal to the extended precision value.  The number
>     with greater magnitude is first.  This format has the same magnitude
>     range as an IEEE double precision value, but effectively 106 bits of
> -   significand precision.  Infinity and NaN are represented by their IEEE
> -   double precision value stored in the first number, the second number is
> -   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
> -   due to precedent.  */
> +   significand precision.  Zero, Infinity and NaN are represented by their
> +   IEEE double precision value stored in the first number, the second
> +   number is -0.0.  */
>  
>  static void encode_ibm_extended (const struct real_format *fmt,
>  				 long *, const REAL_VALUE_TYPE *);
> @@ -3256,9 +3254,21 @@ encode_ibm_extended (const struct real_f
>    else
>      {
>        /* Inf, NaN, 0 are all representable as doubles, so the
> -	 least-significant part can be 0.0.  */
> -      buf[2] = 0;
> -      buf[3] = 0;
> +	 least-significant part can be zero.  We choose -0.0 because
> +	 conversion of IBM extended precision to double is done by
> +	 adding the two component doubles.  -0.0 is the only value that
> +	 will result in a long double -0.0 correctly converting to a
> +	 -0.0 double.  */
> +      if (FLOAT_WORDS_BIG_ENDIAN)
> +	{
> +	  buf[2] = 0x80000000;
> +	  buf[3] = 0;
> +	}
> +      else
> +	{
> +	  buf[2] = 0;
> +	  buf[3] = 0x80000000;
> +	}
>      }
>  }
>  

This function seems to be developing in a very ad-hoc way.  Is there really
no spec that says what a canonical "IBM format" number should look like?

The current implementation does seem to be correct for IRIX.  It certainly
uses +0.0 as the low part of NaN, +/-Inf and +/-0.0.  We'd need to split
the definition into two if your patch is needed for powerpc.

FWIW, here's the output of the attached program when compiled with MIPSpro cc.

     NaN : 7ff7ffff ffffffff : 00000000 00000000
    +Inf : 7ff00000 00000000 : 00000000 00000000
       0 : 00000000 00000000 : 00000000 00000000
      -0 : 80000000 00000000 : 00000000 00000000
    -Inf : fff00000 00000000 : 00000000 00000000

Richard


#include <float.h>
#include <stdio.h>
#include <stdlib.h>

void print_it (const char *fmt, long double ll)
{
  union { long double ll; unsigned int i[4]; } u;
  u.ll = ll;
  printf ("%8s : %08x %08x : %08x %08x\n",
	  fmt, u.i[0], u.i[1], u.i[2], u.i[3]);
}

int main ()
{
  print_it ("NaN", 0.0L/0.0L);
  print_it ("+Inf", 1.0L/0.0L);
  print_it ("0", LDBL_MIN / 1e100);
  print_it ("-0", -LDBL_MIN / 1e100);
  print_it ("-Inf", -1.0L/0.0L);
  exit (0);
}


* Re: Powerpc64 long double support
  2004-03-10 13:48                             ` Alan Modra
@ 2004-03-19  8:14                               ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Richard Sandiford, Richard Henderson, Geoff Keating,
	David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 01:42:43PM +0100, Andreas Schwab wrote:
> The sign bit might also be a don't-care at this point, in which case both
> formats must be supported.

I don't believe there is anything in our implementation that cares about
the sign of a low zero word, *except* for long double -> double
conversion of -0.0.
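
The -0.0 case can be checked with a small model of the conversion.  This is
only a sketch, assuming IEEE 754 doubles in the default round-to-nearest
mode; pair_to_double and is_negative_zero are hypothetical helper names, and
the real conversion is done by the compiler, not by a function like this.

```c
#include <math.h>

/* Model of long double -> double conversion done by adding the two
   component doubles.  The volatile copies keep the compiler from
   folding the addition away at compile time.  */
static double
pair_to_double (double hi, double lo)
{
  volatile double h = hi;
  volatile double l = lo;

  return h + l;
}

/* Nonzero if D is a zero with the sign bit set.  */
static int
is_negative_zero (double d)
{
  return d == 0.0 && signbit (d) != 0;
}
```

Under those assumptions pair_to_double (-0.0, -0.0) is -0.0, while
pair_to_double (-0.0, 0.0) is +0.0, so a +0.0 low word loses the sign of a
long double -0.0.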

Richard brought up the interesting point privately, that if the high
double of a long double pair is always a correctly rounded double, then
just using that double as the result for a long double -> double
conversion would be correct.  My only counter argument is that our
ABI doesn't specify the long double format as tightly as MIPS does,
only requiring that the larger magnitude double be first, and that
the magnitudes do not overlap.  So our ABI doesn't require correct
rounding of the high double, just that the low double be < 1ULP of the
high double.  However, a correctly rounded high double results in
the best precision results for straight-forward arithmetic routines.

I also think that the MIPS specification requiring correct rounding
(or equivalently the low double <= 0.5 ULP of the high double) is
better.  Our looser spec means that it is possible for someone to
create two long doubles that have exactly the same infinite precision
sum, but different component doubles.  Differing component doubles will
result in the long doubles not comparing equal in gcc's current
implementation, when they have the same value according to our ABI.

So..

If gcc's rs6000/darwin-ldouble.c implementation of the basic arithmetic
operations always rounds the high double correctly, then I'd be quite
happy to work on rewording our ABI, and just taking the high double
for long double -> double conversion as Richard suggested.  You can't
get faster than a conversion that need not do anything.

Then, if we don't need to add the component doubles, we don't need to
use -0.0 in the low double.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: Fix PR 14406 (rs6000 abstf2)
  2004-03-04  2:44                 ` Alan Modra
@ 2004-03-19  8:14                   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: David Edelsohn, gcc-patches

On Thu, Mar 04, 2004 at 08:04:39AM +1030, Alan Modra wrote:
> I was about to say that gcc does better, but the XLC sequence has
> alerted me to the fact that I'm not doing the right thing for -0.0

Revised patch.

long double foo (long double x)
{
  return __builtin_fabsl (x);
}

generates

.foo:
        fmr 0,1
        fabs 1,1
        fcmpu 7,0,1
        beqlr- 7
        fneg 2,2
        blr
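
Read as C, the sequence above does roughly the following.  This is a
sketch of the logic only, treating the long double as a pair of doubles;
struct ibm_pair and ibm_pair_abs are hypothetical names, not anything in
the patch.

```c
#include <math.h>

/* Hypothetical model of an IBM extended long double: the component
   with the larger magnitude comes first.  */
struct ibm_pair
{
  double hi;
  double lo;
};

static struct ibm_pair
ibm_pair_abs (struct ibm_pair x)
{
  double old_hi = x.hi;   /* fmr 0,1  */

  x.hi = fabs (x.hi);     /* fabs 1,1  */
  if (old_hi == x.hi)     /* fcmpu 7,0,1 ; beqlr- 7  */
    return x;             /* High part was not negative: done.  */

  x.lo = -x.lo;           /* fneg 2,2  */
  return x;
}
```

The low double is negated only when taking the absolute value changed the
high double, so the pair keeps the same magnitude.  A high part of -0.0
compares equal to +0.0, so in that case the low part is left alone.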

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.296
diff -c -p -r1.296 rs6000.md
*** gcc/config/rs6000/rs6000.md	27 Feb 2004 02:13:59 -0000	1.296
--- gcc/config/rs6000/rs6000.md	4 Mar 2004 01:40:41 -0000
***************
*** 8375,8409 ****
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_insn "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fabs %L0,%L1\;fabs %0,%1\";
!   else
!     return \"fabs %0,%1\;fabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  
! (define_insn ""
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(neg:TF (abs:TF (match_operand:TF 1 "gpc_reg_operand" "f"))))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fnabs %L0,%L1\;fnabs %0,%1\";
!   else
!     return \"fnabs %0,%1\;fnabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.
--- 8418,8457 ----
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_expand "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   rtx label = gen_label_rtx ();
!   emit_insn (gen_abstf2_internal (operands[0], operands[1], label));
!   emit_label (label);
!   DONE;
! }")
  
! (define_expand "abstf2_internal"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(match_operand:TF 1 "gpc_reg_operand" "f"))
!    (set (match_dup 3) (match_dup 5))
!    (set (match_dup 5) (abs:DF (match_dup 5)))
!    (set (match_dup 4) (compare:CCFP (match_dup 3) (match_dup 5)))
!    (set (pc) (if_then_else (eq (match_dup 4) (const_int 0))
! 			   (label_ref (match_operand 2 "" ""))
! 			   (pc)))
!    (set (match_dup 6) (neg:DF (match_dup 6)))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
!   const int lo_word = FLOAT_WORDS_BIG_ENDIAN ? GET_MODE_SIZE (DFmode) : 0;
!   operands[3] = gen_reg_rtx (DFmode);
!   operands[4] = gen_reg_rtx (CCFPmode);
!   operands[5] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word);
!   operands[6] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word);
! }")
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* [csl-arm,  HEAD] ARM PATCH - fix QImode addressing on ARMv4
  2004-03-13 11:43 [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4 Richard Earnshaw
  2004-03-13 13:01 ` Richard Earnshaw
@ 2004-03-19  8:14 ` Richard Earnshaw
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Earnshaw @ 2004-03-19  8:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard.Earnshaw

[-- Attachment #1: Type: text/plain, Size: 2290 bytes --]

This patch fixes the way that we manage QImode indexes when compiling for
ARM Architecture v4 or later.  In v4 we have a ldrsb instruction that can
sign-extend a byte load (ldrb zero-extends).  Unfortunately the indexing
capabilities of this insn are less flexible than its unsigned counterpart.
In the past we have restricted (mostly) the indexing range of ldrb to that
of its poorer cousin: that generates correct code, but at the expense of
wasting instructions when the indexing exceeds the capabilities of ldrsb.

The patch below addresses all this by introducing a new memory predicate
arm_extendqisi_mem_op which can validate a ldrsb address index distinctly
from an ldrb address index (it does so by calling arm_legitimate_address_p
with a new argument, the 'outer' code, in much the same way as the RTX_COST
macros do).
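
As a rough illustration of the effect on constant offsets, the range
selection can be modelled as below.  This is a simplified sketch, not the
code in arm.c: the enums and const_index_ok are hypothetical, and the final
comparison (offset strictly between -range and +range) is an assumption
about how the computed range is applied.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the machine mode and the outer rtx code.  */
enum access_mode { ACC_QI, ACC_HI, ACC_SI };
enum outer_code { OUTER_SET, OUTER_SIGN_EXTEND };

/* Return true if OFFSET is usable as a constant index for a load of
   MODE whose result is consumed by OUTER.  */
static bool
const_index_ok (bool arch4, enum access_mode mode, enum outer_code outer,
		long offset)
{
  long range;

  if (arch4)
    {
      /* Halfword loads and sign-extending byte loads (ldrsb) have the
	 narrow range; a plain ldrb keeps the wide one.  */
      if (mode == ACC_HI
	  || (mode == ACC_QI && outer == OUTER_SIGN_EXTEND))
	range = 256;
      else
	range = 4096;
    }
  else
    range = (mode == ACC_HI) ? 4095 : 4096;

  /* Assumed final check: offset strictly inside (-range, range).  */
  return offset > -range && offset < range;
}
```

In this model a QImode load with offset 4095 is accepted when the outer
code is SET and rejected when it is SIGN_EXTEND, which is the distinction
the old code could not make.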

Measurements on CSiBE code show about 0.1% code size reduction with this
change.

Built and regress tested for arm-unknown-elf on the csl-arm-branch and
HEAD, and fully bootstrapped on armv4-linux-gnu on HEAD.

Committed to csl-arm and HEAD.

2004-03-13  Richard Earnshaw  <rearnsha@arm.com>

	* arm.c (arm_legitimate_address_p): New argument, OUTER.  Pass through
	to arm_legitimate_index_p.  Update all callers with SET as default
	value.
	(arm_legitimate_index_p): New argument, OUTER.  Restrict the index
	range if OUTER is a sign-extend operation on QImode.  Correctly
	reject shift operations on sign-extended QImode addresses.
	(bad_signed_byte_operand): Delete.
	(arm_extendqisi_mem_op): New function.
	* arm.h (EXTRA_CONSTRAINT_ARM): Delete.  Replace with...
	(EXTRA_CONSTRAINT_STR_ARM): ... this.  Handle extended address
	constraints.
	(CONSTRAINT_LEN): New.
	(EXTRA_CONSTRAINT): Delete.  Replace with...
	(EXTRA_CONSTRAINT_STR): ... this.
	(PREDICATE_CODES): Remove bad_signed_byte_operand.
	* arm.md (extendqihi_insn): Use new constraint Uq.  Rework.  Length
	is now always default.
	(define_splits for bad sign-extend loads): Delete.
	(arm_extendqisi, arm_extendqisi_v5): Likewise.
	* arm/vfp.md (arm_movsi_vfp, arm_movdi_vfp, movsf_vfp, movdf_vfp):
	Rework 'U' constraint to 'Uv'.
	* arm-protos.h: Remove bad_signed_byte_operand.  Add
	arm_extendqisi_mem_op.
	* doc/md.texi (ARM constraints): Rename VFP constraint (now Uv).
	Add Uq constraint.



[-- Attachment #2: extendqisi-addr.patch --]
[-- Type: text/plain , Size: 26031 bytes --]

Index: config/arm/arm-protos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm-protos.h,v
retrieving revision 1.60.4.6
diff -p -p -r1.60.4.6 arm-protos.h
*** config/arm/arm-protos.h	30 Jan 2004 16:14:21 -0000	1.60.4.6
--- config/arm/arm-protos.h	11 Mar 2004 07:15:25 -0000
*************** extern int arm_split_constant (RTX_CODE,
*** 50,56 ****
  extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *);
  extern int legitimate_pic_operand_p (rtx);
  extern rtx legitimize_pic_address (rtx, enum machine_mode, rtx);
! extern int arm_legitimate_address_p  (enum machine_mode, rtx, int);
  extern int thumb_legitimate_address_p (enum machine_mode, rtx, int);
  extern int thumb_legitimate_offset_p (enum machine_mode, HOST_WIDE_INT);
  extern rtx arm_legitimize_address (rtx, rtx, enum machine_mode);
--- 50,56 ----
  extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *);
  extern int legitimate_pic_operand_p (rtx);
  extern rtx legitimize_pic_address (rtx, enum machine_mode, rtx);
! extern int arm_legitimate_address_p  (enum machine_mode, rtx, RTX_CODE, int);
  extern int thumb_legitimate_address_p (enum machine_mode, rtx, int);
  extern int thumb_legitimate_offset_p (enum machine_mode, HOST_WIDE_INT);
  extern rtx arm_legitimize_address (rtx, rtx, enum machine_mode);
*************** extern int arm_rhsm_operand (rtx, enum m
*** 70,78 ****
  extern int arm_add_operand (rtx, enum machine_mode);
  extern int arm_addimm_operand (rtx, enum machine_mode);
  extern int arm_not_operand (rtx, enum machine_mode);
  extern int offsettable_memory_operand (rtx, enum machine_mode);
  extern int alignable_memory_operand (rtx, enum machine_mode);
- extern int bad_signed_byte_operand (rtx, enum machine_mode);
  extern int arm_float_rhs_operand (rtx, enum machine_mode);
  extern int arm_float_add_operand (rtx, enum machine_mode);
  extern int power_of_two_operand (rtx, enum machine_mode);
--- 70,78 ----
  extern int arm_add_operand (rtx, enum machine_mode);
  extern int arm_addimm_operand (rtx, enum machine_mode);
  extern int arm_not_operand (rtx, enum machine_mode);
+ extern int arm_extendqisi_mem_op (rtx, enum machine_mode);
  extern int offsettable_memory_operand (rtx, enum machine_mode);
  extern int alignable_memory_operand (rtx, enum machine_mode);
  extern int arm_float_rhs_operand (rtx, enum machine_mode);
  extern int arm_float_add_operand (rtx, enum machine_mode);
  extern int power_of_two_operand (rtx, enum machine_mode);
Index: config/arm/arm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.c,v
retrieving revision 1.303.2.17
diff -p -p -r1.303.2.17 arm.c
*** config/arm/arm.c	3 Mar 2004 16:03:31 -0000	1.303.2.17
--- config/arm/arm.c	11 Mar 2004 07:15:57 -0000
*************** static int arm_gen_constant (enum rtx_co
*** 64,70 ****
  			     rtx, rtx, int, int);
  static unsigned bit_count (unsigned long);
  static int arm_address_register_rtx_p (rtx, int);
! static int arm_legitimate_index_p (enum machine_mode, rtx, int);
  static int thumb_base_register_rtx_p (rtx, enum machine_mode, int);
  inline static int thumb_index_register_rtx_p (rtx, int);
  static int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
--- 64,70 ----
  			     rtx, rtx, int, int);
  static unsigned bit_count (unsigned long);
  static int arm_address_register_rtx_p (rtx, int);
! static int arm_legitimate_index_p (enum machine_mode, rtx, RTX_CODE, int);
  static int thumb_base_register_rtx_p (rtx, enum machine_mode, int);
  inline static int thumb_index_register_rtx_p (rtx, int);
  static int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
*************** legitimize_pic_address (rtx orig, enum m
*** 2708,2714 ****
  	{
  	  /* The base register doesn't really matter, we only want to
  	     test the index for the appropriate mode.  */
! 	  if (!arm_legitimate_index_p (mode, offset, 0))
  	    {
  	      if (!no_new_pseudos)
  		offset = force_reg (Pmode, offset);
--- 2708,2714 ----
  	{
  	  /* The base register doesn't really matter, we only want to
  	     test the index for the appropriate mode.  */
! 	  if (!arm_legitimate_index_p (mode, offset, SET, 0))
  	    {
  	      if (!no_new_pseudos)
  		offset = force_reg (Pmode, offset);
*************** arm_address_register_rtx_p (rtx x, int s
*** 2813,2819 ****
  
  /* Return nonzero if X is a valid ARM state address operand.  */
  int
! arm_legitimate_address_p (enum machine_mode mode, rtx x, int strict_p)
  {
    if (arm_address_register_rtx_p (x, strict_p))
      return 1;
--- 2813,2820 ----
  
  /* Return nonzero if X is a valid ARM state address operand.  */
  int
! arm_legitimate_address_p (enum machine_mode mode, rtx x, RTX_CODE outer,
! 			  int strict_p)
  {
    if (arm_address_register_rtx_p (x, strict_p))
      return 1;
*************** arm_legitimate_address_p (enum machine_m
*** 2826,2832 ****
  	   && arm_address_register_rtx_p (XEXP (x, 0), strict_p)
  	   && GET_CODE (XEXP (x, 1)) == PLUS
  	   && XEXP (XEXP (x, 1), 0) == XEXP (x, 0))
!     return arm_legitimate_index_p (mode, XEXP (XEXP (x, 1), 1), strict_p);
  
    /* After reload constants split into minipools will have addresses
       from a LABEL_REF.  */
--- 2827,2834 ----
  	   && arm_address_register_rtx_p (XEXP (x, 0), strict_p)
  	   && GET_CODE (XEXP (x, 1)) == PLUS
  	   && XEXP (XEXP (x, 1), 0) == XEXP (x, 0))
!     return arm_legitimate_index_p (mode, XEXP (XEXP (x, 1), 1), outer,
! 				   strict_p);
  
    /* After reload constants split into minipools will have addresses
       from a LABEL_REF.  */
*************** arm_legitimate_address_p (enum machine_m
*** 2878,2886 ****
        rtx xop1 = XEXP (x, 1);
  
        return ((arm_address_register_rtx_p (xop0, strict_p)
! 	       && arm_legitimate_index_p (mode, xop1, strict_p))
  	      || (arm_address_register_rtx_p (xop1, strict_p)
! 		  && arm_legitimate_index_p (mode, xop0, strict_p)));
      }
  
  #if 0
--- 2880,2888 ----
        rtx xop1 = XEXP (x, 1);
  
        return ((arm_address_register_rtx_p (xop0, strict_p)
! 	       && arm_legitimate_index_p (mode, xop1, outer, strict_p))
  	      || (arm_address_register_rtx_p (xop1, strict_p)
! 		  && arm_legitimate_index_p (mode, xop0, outer, strict_p)));
      }
  
  #if 0
*************** arm_legitimate_address_p (enum machine_m
*** 2891,2897 ****
        rtx xop1 = XEXP (x, 1);
  
        return (arm_address_register_rtx_p (xop0, strict_p)
! 	      && arm_legitimate_index_p (mode, xop1, strict_p));
      }
  #endif
  
--- 2893,2899 ----
        rtx xop1 = XEXP (x, 1);
  
        return (arm_address_register_rtx_p (xop0, strict_p)
! 	      && arm_legitimate_index_p (mode, xop1, outer, strict_p));
      }
  #endif
  
*************** arm_legitimate_address_p (enum machine_m
*** 2913,2919 ****
  /* Return nonzero if INDEX is valid for an address index operand in
     ARM state.  */
  static int
! arm_legitimate_index_p (enum machine_mode mode, rtx index, int strict_p)
  {
    HOST_WIDE_INT range;
    enum rtx_code code = GET_CODE (index);
--- 2915,2922 ----
  /* Return nonzero if INDEX is valid for an address index operand in
     ARM state.  */
  static int
! arm_legitimate_index_p (enum machine_mode mode, rtx index, RTX_CODE outer,
! 			int strict_p)
  {
    HOST_WIDE_INT range;
    enum rtx_code code = GET_CODE (index);
*************** arm_legitimate_index_p (enum machine_mod
*** 2938,2975 ****
  	    && INTVAL (index) < 256
  	    && INTVAL (index) > -256);
  
!   /* XXX What about ldrsb?  */
!   if (GET_MODE_SIZE (mode) <= 4  && code == MULT
!       && (!arm_arch4 || (mode) != HImode))
!     {
!       rtx xiop0 = XEXP (index, 0);
!       rtx xiop1 = XEXP (index, 1);
! 
!       return ((arm_address_register_rtx_p (xiop0, strict_p)
! 	       && power_of_two_operand (xiop1, SImode))
! 	      || (arm_address_register_rtx_p (xiop1, strict_p)
! 		  && power_of_two_operand (xiop0, SImode)));
      }
  
!   if (GET_MODE_SIZE (mode) <= 4
!       && (code == LSHIFTRT || code == ASHIFTRT
! 	  || code == ASHIFT || code == ROTATERT)
!       && (!arm_arch4 || (mode) != HImode))
!     {
!       rtx op = XEXP (index, 1);
! 
!       return (arm_address_register_rtx_p (XEXP (index, 0), strict_p)
! 	      && GET_CODE (op) == CONST_INT
! 	      && INTVAL (op) > 0
! 	      && INTVAL (op) <= 31);
!     }
! 
!   /* XXX For ARM v4 we may be doing a sign-extend operation during the
!      load, but that has a restricted addressing range and we are unable
!      to tell here whether that is the case.  To be safe we restrict all
!      loads to that range.  */
    if (arm_arch4)
!     range = (mode == HImode || mode == QImode) ? 256 : 4096;
    else
      range = (mode == HImode) ? 4095 : 4096;
  
--- 2941,2982 ----
  	    && INTVAL (index) < 256
  	    && INTVAL (index) > -256);
  
!   if (GET_MODE_SIZE (mode) <= 4
!       && ! (arm_arch4
! 	    && (mode == HImode
! 		|| (mode == QImode && outer == SIGN_EXTEND))))
!     {
!       if (code == MULT)
! 	{
! 	  rtx xiop0 = XEXP (index, 0);
! 	  rtx xiop1 = XEXP (index, 1);
! 
! 	  return ((arm_address_register_rtx_p (xiop0, strict_p)
! 		   && power_of_two_operand (xiop1, SImode))
! 		  || (arm_address_register_rtx_p (xiop1, strict_p)
! 		      && power_of_two_operand (xiop0, SImode)));
! 	}
!       else if (code == LSHIFTRT || code == ASHIFTRT
! 	       || code == ASHIFT || code == ROTATERT)
! 	{
! 	  rtx op = XEXP (index, 1);
! 
! 	  return (arm_address_register_rtx_p (XEXP (index, 0), strict_p)
! 		  && GET_CODE (op) == CONST_INT
! 		  && INTVAL (op) > 0
! 		  && INTVAL (op) <= 31);
! 	}
      }
  
!   /* For ARM v4 we may be doing a sign-extend operation during the
!      load.  */
    if (arm_arch4)
!     {
!       if (mode == HImode || (outer == SIGN_EXTEND && mode == QImode))
! 	range = 256;
!       else
! 	range = 4096;
!     }
    else
      range = (mode == HImode) ? 4095 : 4096;
  
*************** arm_reload_memory_operand (rtx op, enum 
*** 4135,4171 ****
  		  && REGNO (op) >= FIRST_PSEUDO_REGISTER)));
  }
  
- /* Return 1 if OP is a valid memory address, but not valid for a signed byte
-    memory access (architecture V4).
-    MODE is QImode if called when computing constraints, or VOIDmode when
-    emitting patterns.  In this latter case we cannot use memory_operand()
-    because it will fail on badly formed MEMs, which is precisely what we are
-    trying to catch.  */
- int
- bad_signed_byte_operand (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
- {
-   if (GET_CODE (op) != MEM)
-     return 0;
- 
-   op = XEXP (op, 0);
- 
-   /* A sum of anything more complex than reg + reg or reg + const is bad.  */
-   if ((GET_CODE (op) == PLUS || GET_CODE (op) == MINUS)
-       && (!s_register_operand (XEXP (op, 0), VOIDmode)
- 	  || (!s_register_operand (XEXP (op, 1), VOIDmode)
- 	      && GET_CODE (XEXP (op, 1)) != CONST_INT)))
-     return 1;
- 
-   /* Big constants are also bad.  */
-   if (GET_CODE (op) == PLUS && GET_CODE (XEXP (op, 1)) == CONST_INT
-       && (INTVAL (XEXP (op, 1)) > 0xff
- 	  || -INTVAL (XEXP (op, 1)) > 0xff))
-     return 1;
- 
-   /* Everything else is good, or can will automatically be made so.  */
-   return 0;
- }
- 
  /* Return TRUE for valid operands for the rhs of an ARM instruction.  */
  int
  arm_rhs_operand (rtx op, enum machine_mode mode)
--- 4142,4147 ----
*************** cirrus_memory_offset (rtx op)
*** 4356,4361 ****
--- 4332,4346 ----
      }
  
    return 0;
+ }
+ 
+ int
+ arm_extendqisi_mem_op (rtx op, enum machine_mode mode)
+ {
+   if (!memory_operand (op, mode))
+     return 0;
+ 
+   return arm_legitimate_address_p (mode, XEXP (op, 0), SIGN_EXTEND, 0);
  }
  
  /* Return nonzero if OP is a Cirrus or general register.  */
Index: config/arm/arm.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.h,v
retrieving revision 1.210.2.19
diff -p -p -r1.210.2.19 arm.h
*** config/arm/arm.h	5 Mar 2004 17:32:17 -0000	1.210.2.19
--- config/arm/arm.h	11 Mar 2004 07:16:07 -0000
*************** enum reg_class
*** 1486,1508 ****
     accessed without using a load.
     'U' is an address valid for VFP load/store insns.  */
  
! #define EXTRA_CONSTRAINT_ARM(OP, C)					    \
!   ((C) == 'Q' ? GET_CODE (OP) == MEM && GET_CODE (XEXP (OP, 0)) == REG :    \
!    (C) == 'R' ? (GET_CODE (OP) == MEM					    \
! 		 && GET_CODE (XEXP (OP, 0)) == SYMBOL_REF		    \
! 		 && CONSTANT_POOL_ADDRESS_P (XEXP (OP, 0))) :		    \
!    (C) == 'S' ? (optimize > 0 && CONSTANT_ADDRESS_P (OP)) :		    \
!    (C) == 'T' ? cirrus_memory_offset (OP) : 		    		    \
!    (C) == 'U' ? vfp_mem_operand (OP) :					    \
!    0)
  
  #define EXTRA_CONSTRAINT_THUMB(X, C)					\
    ((C) == 'Q' ? (GET_CODE (X) == MEM					\
  		 && GET_CODE (XEXP (X, 0)) == LABEL_REF) : 0)
  
! #define EXTRA_CONSTRAINT(X, C)						\
!   (TARGET_ARM ?								\
!    EXTRA_CONSTRAINT_ARM (X, C) : EXTRA_CONSTRAINT_THUMB (X, C))
  
  #define EXTRA_MEMORY_CONSTRAINT(C, STR) ((C) == 'U')
  
--- 1486,1515 ----
     accessed without using a load.
     'U' is an address valid for VFP load/store insns.  */
  
! #define EXTRA_CONSTRAINT_STR_ARM(OP, C, STR)			\
!   (((C) == 'Q') ? (GET_CODE (OP) == MEM				\
! 		 && GET_CODE (XEXP (OP, 0)) == REG) :		\
!    ((C) == 'R') ? (GET_CODE (OP) == MEM				\
! 		   && GET_CODE (XEXP (OP, 0)) == SYMBOL_REF	\
! 		   && CONSTANT_POOL_ADDRESS_P (XEXP (OP, 0))) :	\
!    ((C) == 'S') ? (optimize > 0 && CONSTANT_ADDRESS_P (OP)) :	\
!    ((C) == 'T') ? cirrus_memory_offset (OP) :			\
!    ((C) == 'U' && (STR)[1] == 'v') ? vfp_mem_operand (OP) :	\
!    ((C) == 'U' && (STR)[1] == 'q')				\
!     ? arm_extendqisi_mem_op (OP, GET_MODE (OP))			\
!       : 0)
! 
! #define CONSTRAINT_LEN(C,STR)				\
!   ((C) == 'U' ? 2 : DEFAULT_CONSTRAINT_LEN (C, STR))
  
  #define EXTRA_CONSTRAINT_THUMB(X, C)					\
    ((C) == 'Q' ? (GET_CODE (X) == MEM					\
  		 && GET_CODE (XEXP (X, 0)) == LABEL_REF) : 0)
  
! #define EXTRA_CONSTRAINT_STR(X, C, STR)		\
!   (TARGET_ARM					\
!    ? EXTRA_CONSTRAINT_STR_ARM (X, C, STR)	\
!    : EXTRA_CONSTRAINT_THUMB (X, C))
  
  #define EXTRA_MEMORY_CONSTRAINT(C, STR) ((C) == 'U')
  
*************** typedef struct
*** 2371,2377 ****
  
  #define ARM_GO_IF_LEGITIMATE_ADDRESS(MODE,X,WIN)		\
    {								\
!     if (arm_legitimate_address_p (MODE, X, REG_STRICT_P))	\
        goto WIN;							\
    }
  
--- 2378,2384 ----
  
  #define ARM_GO_IF_LEGITIMATE_ADDRESS(MODE,X,WIN)		\
    {								\
!     if (arm_legitimate_address_p (MODE, X, SET, REG_STRICT_P))	\
        goto WIN;							\
    }
  
*************** extern int making_const_table;
*** 2862,2868 ****
    {"thumb_cmpneg_operand", {CONST_INT}},				\
    {"thumb_cbrch_target_operand", {SUBREG, REG, MEM}},			\
    {"offsettable_memory_operand", {MEM}},				\
-   {"bad_signed_byte_operand", {MEM}},					\
    {"alignable_memory_operand", {MEM}},					\
    {"shiftable_operator", {PLUS, MINUS, AND, IOR, XOR}},			\
    {"minmax_operator", {SMIN, SMAX, UMIN, UMAX}},			\
--- 2869,2874 ----
Index: config/arm/arm.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.md,v
retrieving revision 1.145.2.14
diff -p -p -r1.145.2.14 arm.md
*** config/arm/arm.md	3 Mar 2004 16:03:32 -0000	1.145.2.14
--- config/arm/arm.md	11 Mar 2004 07:16:25 -0000
***************
*** 3746,3804 ****
    }"
  )
  
- ; Rather than restricting all byte accesses to memory addresses that ldrsb
- ; can handle, we fix up the ones that ldrsb can't grok with a split.
  (define_insn "*extendqihi_insn"
!   [(set (match_operand:HI                 0 "s_register_operand" "=r")
! 	(sign_extend:HI (match_operand:QI 1 "memory_operand"      "m")))]
    "TARGET_ARM && arm_arch4"
!   "*
!   /* If the address is invalid, this will split the instruction into two.  */
!   if (bad_signed_byte_operand (operands[1], VOIDmode))
!     return \"#\";
!   return \"ldr%?sb\\t%0, %1\";
!   "
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
-    (set_attr "length" "8")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
- (define_split
-   [(set (match_operand:HI 0 "s_register_operand" "")
- 	(sign_extend:HI (match_operand:QI 1 "bad_signed_byte_operand" "")))]
-   "TARGET_ARM && arm_arch4 && reload_completed"
-   [(set (match_dup 3) (match_dup 1))
-    (set (match_dup 0) (sign_extend:HI (match_dup 2)))]
-   "
-   {
-     HOST_WIDE_INT offset;
- 
-     operands[3] = gen_rtx_REG (SImode, REGNO (operands[0]));
-     operands[2] = gen_rtx_MEM (QImode, operands[3]);
-     MEM_COPY_ATTRIBUTES (operands[2], operands[1]);
-     operands[1] = XEXP (operands[1], 0);
-     if (GET_CODE (operands[1]) == PLUS
- 	&& GET_CODE (XEXP (operands[1], 1)) == CONST_INT
- 	&& !(const_ok_for_arm (offset = INTVAL (XEXP (operands[1], 1)))
- 	     || const_ok_for_arm (-offset)))
-       {
- 	HOST_WIDE_INT low = (offset > 0
- 			     ? (offset & 0xff) : -((-offset) & 0xff));
- 	XEXP (operands[2], 0) = plus_constant (operands[3], low);
- 	operands[1] = plus_constant (XEXP (operands[1], 0), offset - low);
-       }
-     /* Ensure the sum is in correct canonical form.  */
-     else if (GET_CODE (operands[1]) == PLUS
- 	     && GET_CODE (XEXP (operands[1], 1)) != CONST_INT
- 	     && !s_register_operand (XEXP (operands[1], 1), VOIDmode))
-       operands[1] = gen_rtx_PLUS (GET_MODE (operands[1]),
- 					   XEXP (operands[1], 1),
- 					   XEXP (operands[1], 0));
-   }"
- )
- 
  (define_expand "extendqisi2"
    [(set (match_dup 2)
  	(ashift:SI (match_operand:QI 1 "general_operand" "")
--- 3746,3762 ----
    }"
  )
  
  (define_insn "*extendqihi_insn"
!   [(set (match_operand:HI 0 "s_register_operand" "=r")
! 	(sign_extend:HI (match_operand:QI 1 "memory_operand" "Uq")))]
    "TARGET_ARM && arm_arch4"
!   "ldr%?sb\\t%0, %1"
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
  (define_expand "extendqisi2"
    [(set (match_dup 2)
  	(ashift:SI (match_operand:QI 1 "general_operand" "")
***************
*** 3831,3872 ****
    }"
  )
  
- ; Rather than restricting all byte accesses to memory addresses that ldrsb
- ; can handle, we fix up the ones that ldrsb can't grok with a split.
  (define_insn "*arm_extendqisi"
    [(set (match_operand:SI 0 "s_register_operand" "=r")
! 	(sign_extend:SI (match_operand:QI 1 "memory_operand" "m")))]
    "TARGET_ARM && arm_arch4 && !arm_arch6"
!   "*
!   /* If the address is invalid, this will split the instruction into two.  */
!   if (bad_signed_byte_operand (operands[1], VOIDmode))
!     return \"#\";
!   return \"ldr%?sb\\t%0, %1\";
!   "
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
-    (set_attr "length" "8")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
  (define_insn "*arm_extendqisi_v6"
    [(set (match_operand:SI 0 "s_register_operand" "=r,r")
! 	(sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
    "TARGET_ARM && arm_arch6"
!   "*
!   if (which_alternative == 0)
!     return \"sxtb%?\\t%0, %1\";
! 
!   /* If the address is invalid, this will split the instruction into two.  */
!   if (bad_signed_byte_operand (operands[1], VOIDmode))
!     return \"#\";
! 
!   return \"ldr%?sb\\t%0, %1\";
!   "
    [(set_attr "type" "alu_shift,load_byte")
     (set_attr "predicable" "yes")
-    (set_attr "length" "4,8")
     (set_attr "pool_range" "*,256")
     (set_attr "neg_pool_range" "*,244")]
  )
--- 3789,3814 ----
    }"
  )
  
  (define_insn "*arm_extendqisi"
    [(set (match_operand:SI 0 "s_register_operand" "=r")
! 	(sign_extend:SI (match_operand:QI 1 "memory_operand" "Uq")))]
    "TARGET_ARM && arm_arch4 && !arm_arch6"
!   "ldr%?sb\\t%0, %1"
    [(set_attr "type" "load_byte")
     (set_attr "predicable" "yes")
     (set_attr "pool_range" "256")
     (set_attr "neg_pool_range" "244")]
  )
  
  (define_insn "*arm_extendqisi_v6"
    [(set (match_operand:SI 0 "s_register_operand" "=r,r")
! 	(sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,Uq")))]
    "TARGET_ARM && arm_arch6"
!   "@
!    sxtb%?\\t%0, %1
!    ldr%?sb\\t%0, %1"
    [(set_attr "type" "alu_shift,load_byte")
     (set_attr "predicable" "yes")
     (set_attr "pool_range" "*,256")
     (set_attr "neg_pool_range" "*,244")]
  )
***************
*** 3879,3917 ****
    "sxtab%?\\t%0, %2, %1"
    [(set_attr "type" "alu_shift")
     (set_attr "predicable" "yes")]
- )
- 
- (define_split
-   [(set (match_operand:SI 0 "s_register_operand" "")
- 	(sign_extend:SI (match_operand:QI 1 "bad_signed_byte_operand" "")))]
-   "TARGET_ARM && arm_arch4 && reload_completed"
-   [(set (match_dup 0) (match_dup 1))
-    (set (match_dup 0) (sign_extend:SI (match_dup 2)))]
-   "
-   {
-     HOST_WIDE_INT offset;
- 
-     operands[2] = gen_rtx_MEM (QImode, operands[0]);
-     MEM_COPY_ATTRIBUTES (operands[2], operands[1]);
-     operands[1] = XEXP (operands[1], 0);
-     if (GET_CODE (operands[1]) == PLUS
- 	&& GET_CODE (XEXP (operands[1], 1)) == CONST_INT
- 	&& !(const_ok_for_arm (offset = INTVAL (XEXP (operands[1], 1)))
- 	     || const_ok_for_arm (-offset)))
-       {
- 	HOST_WIDE_INT low = (offset > 0
- 			     ? (offset & 0xff) : -((-offset) & 0xff));
- 	XEXP (operands[2], 0) = plus_constant (operands[0], low);
- 	operands[1] = plus_constant (XEXP (operands[1], 0), offset - low);
-       }
-     /* Ensure the sum is in correct canonical form.  */
-     else if (GET_CODE (operands[1]) == PLUS
- 	     && GET_CODE (XEXP (operands[1], 1)) != CONST_INT
- 	     && !s_register_operand (XEXP (operands[1], 1), VOIDmode))
-       operands[1] = gen_rtx_PLUS (GET_MODE (operands[1]),
- 					   XEXP (operands[1], 1),
- 					   XEXP (operands[1], 0));
-   }"
  )
  
  (define_insn "*thumb_extendqisi2"
--- 3821,3826 ----
Index: config/arm/vfp.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/vfp.md,v
retrieving revision 1.1.2.1
diff -p -p -r1.1.2.1 vfp.md
*** config/arm/vfp.md	30 Jan 2004 16:15:19 -0000	1.1.2.1
--- config/arm/vfp.md	11 Mar 2004 07:16:25 -0000
***************
*** 111,118 ****
  ;; ??? For now do not allow loading constants into vfp regs.  This causes
  ;; problems because small sonstants get converted into adds.
  (define_insn "*arm_movsi_vfp"
!   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,m,!w,r,!w,!w, U")
!       (match_operand:SI 1 "general_operand"	   "rI,K,mi,r,r,!w,!w,Ui,!w"))]
    "TARGET_ARM && TARGET_VFP && TARGET_HARD_FLOAT
     && (   s_register_operand (operands[0], SImode)
         || s_register_operand (operands[1], SImode))"
--- 111,118 ----
  ;; ??? For now do not allow loading constants into vfp regs.  This causes
  ;; problems because small sonstants get converted into adds.
  (define_insn "*arm_movsi_vfp"
!   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,m,!w,r,!w,!w,  Uv")
!       (match_operand:SI 1 "general_operand"	   "rI,K,mi,r,r,!w,!w,Uvi,!w"))]
    "TARGET_ARM && TARGET_VFP && TARGET_HARD_FLOAT
     && (   s_register_operand (operands[0], SImode)
         || s_register_operand (operands[1], SImode))"
***************
*** 136,143 ****
  ;; DImode moves
  
  (define_insn "*arm_movdi_vfp"
!   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r,o<>,w,r,w,w ,U")
! 	(match_operand:DI 1 "di_operand"              "rIK,mi,r ,r,w,w,Ui,w"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    switch (which_alternative)
--- 136,143 ----
  ;; DImode moves
  
  (define_insn "*arm_movdi_vfp"
!   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r,o<>,w,r,w,w  ,Uv")
! 	(match_operand:DI 1 "di_operand"              "rIK,mi,r ,r,w,w,Uvi,w"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    switch (which_alternative)
***************
*** 168,175 ****
  ;; SFmode moves
  
  (define_insn "*movsf_vfp"
!   [(set (match_operand:SF 0 "nonimmediate_operand" "=w,r,w ,U,r ,m,w,r")
! 	(match_operand:SF 1 "general_operand"	   " r,w,UE,w,mE,r,w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
     && (   s_register_operand (operands[0], SFmode)
         || s_register_operand (operands[1], SFmode))"
--- 168,175 ----
  ;; SFmode moves
  
  (define_insn "*movsf_vfp"
!   [(set (match_operand:SF 0 "nonimmediate_operand" "=w,r,w  ,Uv,r ,m,w,r")
! 	(match_operand:SF 1 "general_operand"	   " r,w,UvE,w, mE,r,w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
     && (   s_register_operand (operands[0], SFmode)
         || s_register_operand (operands[1], SFmode))"
***************
*** 192,199 ****
  ;; DFmode moves
  
  (define_insn "*movdf_vfp"
!   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,r,r, m,w ,U,w,r")
! 	(match_operand:DF 1 "soft_df_operand"		   " r,w,mF,r,UF,w,w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    {
--- 192,199 ----
  ;; DFmode moves
  
  (define_insn "*movdf_vfp"
!   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,r,r, m,w  ,Uv,w,r")
! 	(match_operand:DF 1 "soft_df_operand"		   " r,w,mF,r,UvF,w, w,r"))]
    "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
    "*
    {
Index: doc/md.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
retrieving revision 1.79.4.4
diff -p -p -r1.79.4.4 md.texi
*** doc/md.texi	3 Mar 2004 16:04:37 -0000	1.79.4.4
--- doc/md.texi	11 Mar 2004 07:17:10 -0000
*************** An item in the constant pool
*** 1379,1386 ****
  A symbol in the text segment of the current file
  @end table
  
! @item U
  A memory reference suitable for VFP load/store insns (reg+constant offset)
  
  @item AVR family---@file{avr.h}
  @table @code
--- 1379,1389 ----
  A symbol in the text segment of the current file
  @end table
  
! @item Uv
  A memory reference suitable for VFP load/store insns (reg+constant offset)
+ 
+ @item Uq
+ A memory reference suitable for the ARMv4 ldrsb instruction.
  
  @item AVR family---@file{avr.h}
  @table @code

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4
  2004-03-14  0:43     ` Richard Earnshaw
@ 2004-03-19  8:14       ` Richard Earnshaw
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Earnshaw @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Earnshaw, gcc-patches, Richard.Earnshaw

[-- Attachment #1: Type: text/plain, Size: 540 bytes --]


> > I forgot to mention that the patch changes the memory constraint used for 
> > VFP memory operands from U to 'Uv'.  The constraint for an ldrsb 
> > instruction is 'Uq'.
> 
> In that case, you probably want to update this bit of arm.h:
> 
> --- 1486,1515 ----
>      accessed without using a load.
>      'U' is an address valid for VFP load/store insns.  */

Argh!  I'd seen that and then forgotten to update it.

Fixed thusly:

2004-03-14  Richard Earnshaw  <rearnsha@arm.com>

	* arm.h (EXTRA_CONSTRAINT_STR_ARM): Update comment.
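
As an illustration of the 'U' prefix scheme that this comment now documents, here is a minimal standalone sketch. It is not GCC code: constraint_len, classify_constraint and count_constraints are hypothetical stand-ins for CONSTRAINT_LEN and EXTRA_CONSTRAINT_STR_ARM, and the sketch assumes every letter other than 'U' is one character long, which is GCC's default constraint length.

```c
#include <stddef.h>

/* Hypothetical stand-in for CONSTRAINT_LEN: 'U' opens a two-character
   constraint; every other letter is assumed to be one character long.  */
static size_t constraint_len(const char *str)
{
    return (str[0] == 'U' && str[1] != '\0') ? 2 : 1;
}

/* Which memory check a constraint selects in this sketch.  */
enum mem_check { CHECK_NONE, CHECK_VFP, CHECK_LDRSB };

/* Hypothetical stand-in for the 'U' arms of EXTRA_CONSTRAINT_STR_ARM:
   the second character picks the check.  */
static enum mem_check classify_constraint(const char *str)
{
    if (str[0] != 'U')
        return CHECK_NONE;
    switch (str[1]) {
    case 'v': return CHECK_VFP;    /* "Uv": VFP load/store address */
    case 'q': return CHECK_LDRSB;  /* "Uq": ldrsb address */
    default:  return CHECK_NONE;
    }
}

/* Count the constraints in one alternative, stepping by constraint_len.
   For example "UvE" holds two constraints: "Uv" and "E".  */
static size_t count_constraints(const char *str)
{
    size_t n = 0;
    while (*str != '\0') {
        str += constraint_len(str);
        n++;
    }
    return n;
}
```

The point of the length hook is that a scanner walking "UvE" must skip two characters for "Uv", otherwise it would misread 'v' as a separate constraint letter.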



[-- Attachment #2: extendqisi-addr2.patch --]
[-- Type: text/plain , Size: 1159 bytes --]

Index: config/arm/arm.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.h,v
retrieving revision 1.226
diff -p -p -r1.226 arm.h
*** config/arm/arm.h	13 Mar 2004 11:19:21 -0000	1.226
--- config/arm/arm.h	14 Mar 2004 00:21:23 -0000
*************** enum reg_class
*** 1469,1475 ****
     `S' means any symbol that has the SYMBOL_REF_FLAG set or a CONSTANT_POOL
     address.  This means that the symbol is in the text segment and can be
     accessed without using a load.
!    'U' is an address valid for VFP load/store insns.  */
  
  #define EXTRA_CONSTRAINT_STR_ARM(OP, C, STR)			\
    (((C) == 'Q') ? (GET_CODE (OP) == MEM				\
--- 1469,1477 ----
     `S' means any symbol that has the SYMBOL_REF_FLAG set or a CONSTANT_POOL
     address.  This means that the symbol is in the text segment and can be
     accessed without using a load.
!    'U' Prefixes an extended memory constraint where:
!    'Uv' is an address valid for VFP load/store insns.  
!    'Uq' is an address valid for ldrsb.  */
  
  #define EXTRA_CONSTRAINT_STR_ARM(OP, C, STR)			\
    (((C) == 'Q') ? (GET_CODE (OP) == MEM				\

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10  6:44               ` Alan Modra
@ 2004-03-19  8:14                 ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, gcc-patches

On Wed, Mar 10, 2004 at 01:23:36AM -0500, David Edelsohn wrote:
> 	The PowerPC changes are okay with me.
> 
> 	Other ports, such as Mips, use the IBM extended format, why not
> just add dconstm0 to standard list in real.h instead of creating a special
> function for rs6000 port?

I did consider doing that.  Note that using rs6000_dfmode_m0 costs just
one function call, whereas dconstm0 needs to be converted to double via
const_double_from_real_value, lookup_const_double, htab_find_slot.  I
thought I'd be pushing my luck to ask for a dfmode_m0_rtx in
emit-rtl.c.
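
The "just one function call" cost comes from building the constant once and handing back the cached result afterwards. A minimal sketch of that build-on-first-use pattern in plain C follows; build_minus_zero, cached_minus_zero and build_count are hypothetical names used only for illustration, and a pointer to a double stands in for the rtx that the real function returns.

```c
#include <stddef.h>

static int build_count;  /* how many times the value was actually built */

/* Stand-in for the expensive construction step (in GCC, the
   conversion and hash-table lookup mentioned above).  */
static const double *build_minus_zero(void)
{
    static const double minus_zero = -0.0;
    build_count++;
    return &minus_zero;
}

/* The first call builds the value; later calls only return the
   cached pointer, so they cost one function call each.  */
static const double *cached_minus_zero(void)
{
    static const double *cache;  /* NULL until first use */
    if (cache == NULL)
        cache = build_minus_zero();
    return cache;
}
```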

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix PR 14406 (rs6000 abstf2)
  2004-03-03 21:03             ` Fix PR 14406 (rs6000 abstf2) David Edelsohn
  2004-03-03 21:34               ` Alan Modra
@ 2004-03-19  8:14               ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-19  8:14 UTC (permalink / raw)
  To: gcc-patches

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

Okay, assuming no regressions.  I hope that scheduling and CSE actually
produce something better than the raw pattern.  XLC produces:

        fabs    fp0,fp1
        fcmpu   0,fp0,fp1
        bc      BO_IF,CR0_EQ,__L10
        fneg    fp2,fp2
__L10:
        fmr     fp1,fp0
	blr
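
Read as C, the XLC sequence above takes the absolute value of the high double and negates the low double only when that changed the high double. The following is a sketch of that reading, not the rs6000.md pattern itself; struct double_double, dd_make and dd_abs are hypothetical names.

```c
#include <math.h>

/* A long double in the IBM extended format: two doubles whose sum is
   the value, most significant part first.  */
struct double_double {
    double hi;
    double lo;
};

static struct double_double dd_make(double hi, double lo)
{
    struct double_double x;
    x.hi = hi;
    x.lo = lo;
    return x;
}

static struct double_double dd_abs(struct double_double x)
{
    struct double_double r;
    r.hi = fabs(x.hi);                     /* fabs  fp0,fp1 */
    /* fcmpu / bc / fneg: keep the low part if the high part compared
       equal to its absolute value, otherwise flip its sign too.  */
    r.lo = (r.hi == x.hi) ? x.lo : -x.lo;
    return r;
}
```

For example, dd_abs (dd_make (-1.0, -1e-20)) yields the pair (1.0, 1e-20), while a pair whose high part is already non-negative is returned unchanged.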

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:42                           ` Andreas Schwab
  2004-03-10 12:53                             ` Richard Sandiford
  2004-03-10 13:48                             ` Alan Modra
@ 2004-03-19  8:14                             ` Andreas Schwab
  2 siblings, 0 replies; 875+ messages in thread
From: Andreas Schwab @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

Richard Sandiford <rsandifo@redhat.com> writes:

> Alan Modra <amodra@bigpond.net.au> writes:
>>> My only concern (in case it wasn't obvious ;) is that you don't
>>> change the behaviour for IRIX.
>>
>> Easy.  I can add this.
>>
>>   else if (!fmt->qnan_msb_set)
>>     {
>>       /* MIPS slavishly follows proprietary compilers, which use 0.0
>> 	 in the low word.  */
>>       buf[2] = 0;
>>       buf[3] = 0;
>>     }
>
> Sounds good, although the comment's a bit on the vitriolic side. ;)
> Surely the long double representation is as much a part of the ABI as
> any other data representation?  I.e., it's not that were doing something
> just because another compiler does it.  We're doing it because that's
> the platform ABI.

The sign bit might also be a don't-care at this point, in which case both
formats must be supported.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4
  2004-03-13 21:44   ` Daniel Jacobowitz
  2004-03-14  0:43     ` Richard Earnshaw
@ 2004-03-19  8:14     ` Daniel Jacobowitz
  1 sibling, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, Richard.Earnshaw

On Sat, Mar 13, 2004 at 01:01:15PM +0000, Richard Earnshaw wrote:
> 
> rearnsha@buzzard.freeserve.co.uk said:
> > This patch fixes the way that we manage QImode indexes when compiling
> > for ARM Architecture v4 or later.  In v4 we have a ldrsb instruction
> > that can sign-extend a byte load (ldrb zero-extends).  Unfortunately
> > the indexing capabilities of this insn are less flexible than its
> > unsigned counterpart. In the past we have restricted (mostly) the
> > indexing range of ldrb to that of its poorer cousin: that generates
> > correct code, but at the expense of wasting instructions when the
> > indexing exceeds the capabilities of ldrsb.
> 
> > The patch below addresses all this by introducing a new memory
> > predicate arm_extendqisi_mem_op which can validate a ldrsb address
> > index distinctly from an ldrb address index (it does so by calling
> > arm_legitimate_address_p with a new argument, the 'outer' code in much
> > the same way as the RTX_COST macros do).
> 
> 
> I forgot to mention that the patch changes the memory constraint used for 
> VFP memory operands from U to 'Uv'.  The constraint for an ldrsb 
> instruction is 'Uq'.

In that case, you probably want to update this bit of arm.h:

--- 1486,1515 ----
     accessed without using a load.
     'U' is an address valid for VFP load/store insns.  */
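
To make the quoted description concrete, here is a rough sketch of an immediate-offset check that depends on the outer code. The limits are taken from the ARM instruction encodings rather than from the patch: ldrb accepts a 12-bit immediate offset and ldrsb an 8-bit one. arm_byte_offset_ok and enum outer_code are hypothetical names; the real arm_legitimate_address_p also handles register indexes and other address forms that are omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the 'outer' rtx code passed to the
   address check: a plain byte load versus a sign-extending one.  */
enum outer_code { OUTER_SET, OUTER_SIGN_EXTEND };

/* Return true if OFFSET fits the immediate field of the byte load
   implied by OUTER: +/-4095 for ldrb, +/-255 for ldrsb.  */
static bool arm_byte_offset_ok(long offset, enum outer_code outer)
{
    long limit = (outer == OUTER_SIGN_EXTEND) ? 255 : 4095;
    return offset >= -limit && offset <= limit;
}
```

With a single shared limit, an offset such as 256 would have to be rejected for ldrb as well, which is the wasted-instruction case the patch description mentions.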


-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:58                       ` Alan Modra
  2004-03-10 12:06                         ` Richard Sandiford
@ 2004-03-19  8:14                         ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 11:25:38AM +0000, Richard Sandiford wrote:
> Alan Modra <amodra@bigpond.net.au> writes:
> > On Wed, Mar 10, 2004 at 09:31:13PM +1030, Alan Modra wrote:
> >> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
> >> > when compiled with MIPSpro cc.
> >> 
> >> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
> >> mips gcc?
> >
> > The reason for -0.0 in the low double goes like this:
> >
> > Conversion from long double to double is done by simply adding the
> > two component doubles.  That means long double -0.0 must be
> > (-0.0 + -0.0), or you need to add code to handle -0.0 on every
> > conversion.
> 
> Not sure: are you saying that's what the spec says you should do, or
> that it is just what a particular implementation does?

No, I'm not talking about any spec or other implementation.  I'm just
following through the logical implications of using the simplest
long double -> double -> long double conversion sequences.
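
The signed-zero behaviour behind this argument can be checked directly. The sketch below assumes IEEE 754 arithmetic in the default round-to-nearest mode; dd_to_double and is_negative_zero are hypothetical helper names.

```c
#include <math.h>

/* Convert a (hi, lo) pair to double the simple way: add the halves.  */
static double dd_to_double(double hi, double lo)
{
    return hi + lo;
}

/* Return 1 if X is negative zero, 0 otherwise.  */
static int is_negative_zero(double x)
{
    return x == 0.0 && signbit(x) != 0;
}
```

Under those assumptions, dd_to_double (-0.0, -0.0) is -0.0, while dd_to_double (-0.0, 0.0) is +0.0, so a +0.0 low double loses the sign on conversion. For a nonzero high part the choice makes no difference, since adding either zero leaves it unchanged.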

>  As per my
> previous message, IRIX uses +0.0 for the low double and it still gets
> the conversion right.  I assume it must be using something other than
> simple addition.

I'm curious as to what it uses.

> My only concern (in case it wasn't obvious ;) is that you don't
> change the behaviour for IRIX.

Easy.  I can add this.

  else if (!fmt->qnan_msb_set)
    {
      /* MIPS slavishly follows proprietary compilers, which use 0.0
	 in the low word.  */
      buf[2] = 0;
      buf[3] = 0;
    }

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-07  6:37     ` Alan Modra
  2004-03-07  7:30       ` Richard Henderson
@ 2004-03-19  8:14       ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Geoff Keating; +Cc: sjmunroe, gcc-patches, aj, dgm69, dje, meissner

On Sat, Mar 06, 2004 at 03:13:44PM -0800, Geoff Keating wrote:
> > From: Alan Modra <amodra@bigpond.net.au>
> > On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> > > +/* Powerpc64 uses the AIX long double format.
> > > +   
> > > +   Each long double is made up of two IEEE doubles.  The value of the
> > > +   long double is the sum of the values of the two parts.  The most
> > > +   significant part is required to be the value of the long double
> > > +   rounded to the nearest double, as specified by IEEE.  For Inf
> > > +   values, the least significant part is required to be one of +0.0 or
> > > +   -0.0.
> > 
> > Do you know why this is required for Inf?  If there is a reason,
> > then the patch I just posted to fix -0.0 is wrong..  (In any case,
> > the patch is incomplete, as rs6000.md extenddftf2 also needs looking
> > at.)
> > 
> > Hmm, I can see that if you represent +Inf by (+Inf + -Inf), you're
> > in trouble, because converting to double will result in a NaN.
> > Perhaps there is some sequence of operations that will result in
> > (+Inf + +Inf) being turned into (+Inf + -Inf)?
> 
> If you represent +Inf by (+Inf, +/-Inf), then the code to convert a
> double to a long double becomes significantly more complicated.  Right
> now, it's done by just loading +0.0 in the low double.
> 
> You have to use the same value consistently, of course, or when you
> compare two Inf values for == you might get the wrong answer.

OK, I can see that it makes sense to load a zero in extenddftf2, and
like you say, comparisons then explain the need for zero in the low
double of Inf.  Specifically, we need -0.0, so that conversion of
(-0.0 + -0.0) to double works using the existing trunctfdf2.  Is there a
better trick than the following for extenddftf2's body?

	* config/rs6000/rs6000.md (extenddftf2): Use -0.0 in low double.
	* real.c (encode_ibm_extended): Use -0.0 in low double of Inf,
	NaN, and zero.  Update comment.

Patch against hammer branch, so expect a reject if applying to mainline.
Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.221.4.12
diff -u -p -r1.221.4.12 rs6000.md
--- gcc/config/rs6000/rs6000.md	6 Feb 2004 07:17:40 -0000	1.221.4.12
+++ gcc/config/rs6000/rs6000.md	7 Mar 2004 04:13:46 -0000
@@ -8170,7 +8175,11 @@
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
 {
-  operands[2] = CONST0_RTX (DFmode);
+  REAL_VALUE_TYPE rv;
+  /* Make a -0.0 */
+  memset (&rv, 0, sizeof (rv));
+  rv.sign = 1;
+  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);
 })
 
 (define_insn_and_split "*extenddftf2_internal"
Index: gcc/real.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.c,v
retrieving revision 1.103.2.7
diff -u -p -r1.103.2.7 real.c
--- gcc/real.c	5 Mar 2004 15:06:08 -0000	1.103.2.7
+++ gcc/real.c	7 Mar 2004 06:26:02 -0000
@@ -3296,10 +3296,9 @@ const struct real_format ieee_extended_i
    numbers whose sum is equal to the extended precision value.  The number
    with greater magnitude is first.  This format has the same magnitude
    range as an IEEE double precision value, but effectively 106 bits of
-   significand precision.  Infinity and NaN are represented by their IEEE
-   double precision value stored in the first number, the second number is
-   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
-   due to precedent.  */
+   significand precision.  Zero, Infinity and NaN are represented by their
+   IEEE double precision value stored in the first number, the second
+   number is -0.0.  */
 
 static void encode_ibm_extended PARAMS ((const struct real_format *fmt,
 					 long *, const REAL_VALUE_TYPE *));
@@ -3338,9 +3337,21 @@ encode_ibm_extended (fmt, buf, r)
   else
     {
       /* Inf, NaN, 0 are all representable as doubles, so the
-	 least-significant part can be 0.0.  */
-      buf[2] = 0;
-      buf[3] = 0;
+	 least-significant part can be zero.  We choose -0.0 because
+	 conversion of IBM extended precision to double is done by
+	 adding the two component doubles.  -0.0 is the only value that
+	 will result in a long double -0.0 correctly converting to a
+	 -0.0 double.  */
+      if (FLOAT_WORDS_BIG_ENDIAN)
+	{
+	  buf[2] = 0x80000000;
+	  buf[3] = 0;
+	}
+      else
+	{
+	  buf[2] = 0;
+	  buf[3] = 0x80000000;
+	}
     }
 }
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-09 23:49             ` Alan Modra
  2004-03-10  9:42               ` Richard Sandiford
@ 2004-03-19  8:14               ` Alan Modra
  2004-04-01  0:56               ` Geoff Keating
  2 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, gcc-patches

On Mon, Mar 08, 2004 at 11:59:45PM -0800, Richard Henderson wrote:
> I *do* think that's better than frobbing rv.sign yourself.
> The less the format of real.h gets exposed the better.

OK.  I decided to avoid including tree.h in insn-output.c.  Instead,
I'm using a new rs6000 backend function to calculate -0.0.  I believe
it's not necessary to GTY(()) rs6000_dfmode_m0_rtx because it will
point somewhere inside const_double_htab.

	* real.c (encode_ibm_extended): Use -0.0 in low double of Inf,
	NaN, and zero.  Update comment.
	* config/rs6000/rs6000.md (extenddftf2): Use -0.0 in low double.
	* config/rs6000/rs6000.c (rs6000_dfmode_m0): New function.
	* config/rs6000/rs6000-protos.h (rs6000_dfmode_m0): Declare.
	Replace "struct rtx_def *" with "rtx" in decls protected with
	#ifdef RTX_CODE.  Formatting.

Bootstrapped, regression tested powerpc64-linux.
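
For readers wondering where the 0x80000000 in the real.c hunk comes from: it is the 32-bit word of an IEEE double -0.0 that carries the sign bit, and the other word is zero. The sketch below shows this on a host with IEEE 754 binary64 doubles (an assumption); high_word and low_word are hypothetical helpers, and which of buf[2] or buf[3] receives the sign word is what FLOAT_WORDS_BIG_ENDIAN decides in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bit pattern of D into an integer without aliasing issues.  */
static uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Most significant 32 bits: sign, exponent, top of the significand.  */
static uint32_t high_word(double d)
{
    return (uint32_t)(double_bits(d) >> 32);
}

/* Least significant 32 bits of the significand.  */
static uint32_t low_word(double d)
{
    return (uint32_t)(double_bits(d) & 0xffffffffu);
}
```

Here high_word (-0.0) is 0x80000000 and low_word (-0.0) is 0, whereas both words of +0.0 are 0, which is what the old code stored.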

Index: gcc/real.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.c,v
retrieving revision 1.139
diff -u -p -r1.139 real.c
--- gcc/real.c	4 Mar 2004 10:23:20 -0000	1.139
+++ gcc/real.c	9 Mar 2004 23:34:36 -0000
@@ -3216,10 +3215,9 @@ const struct real_format ieee_extended_i
    numbers whose sum is equal to the extended precision value.  The number
    with greater magnitude is first.  This format has the same magnitude
    range as an IEEE double precision value, but effectively 106 bits of
-   significand precision.  Infinity and NaN are represented by their IEEE
-   double precision value stored in the first number, the second number is
-   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
-   due to precedent.  */
+   significand precision.  Zero, Infinity and NaN are represented by their
+   IEEE double precision value stored in the first number, the second
+   number is -0.0.  */
 
 static void encode_ibm_extended (const struct real_format *fmt,
 				 long *, const REAL_VALUE_TYPE *);
@@ -3256,9 +3254,21 @@ encode_ibm_extended (const struct real_f
   else
     {
       /* Inf, NaN, 0 are all representable as doubles, so the
-	 least-significant part can be 0.0.  */
-      buf[2] = 0;
-      buf[3] = 0;
+	 least-significant part can be zero.  We choose -0.0 because
+	 conversion of IBM extended precision to double is done by
+	 adding the two component doubles.  -0.0 is the only value that
+	 will result in a long double -0.0 correctly converting to a
+	 -0.0 double.  */
+      if (FLOAT_WORDS_BIG_ENDIAN)
+	{
+	  buf[2] = 0x80000000;
+	  buf[3] = 0;
+	}
+      else
+	{
+	  buf[2] = 0;
+	  buf[3] = 0x80000000;
+	}
     }
 }
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.604
diff -u -p -r1.604 rs6000.c
--- gcc/config/rs6000/rs6000.c	8 Mar 2004 04:24:27 -0000	1.604
+++ gcc/config/rs6000/rs6000.c	9 Mar 2004 23:34:47 -0000
@@ -2780,6 +2780,22 @@ rs6000_legitimize_address (rtx x, rtx ol
     return NULL_RTX;
 }
 
+/* Construct a -0.0 here for use by extenddftf2.  */
+
+rtx
+rs6000_dfmode_m0 (void)
+{
+  static rtx rs6000_dfmode_m0_rtx;
+
+  if (rs6000_dfmode_m0_rtx == NULL_RTX)
+    {
+      REAL_VALUE_TYPE dconstm0 = REAL_VALUE_NEGATE (dconst0);
+      rs6000_dfmode_m0_rtx = CONST_DOUBLE_FROM_REAL_VALUE (dconstm0, DFmode);
+    }
+
+  return rs6000_dfmode_m0_rtx;
+}
+
 /* Construct the SYMBOL_REF for the tls_get_addr function.  */
 
 static GTY(()) rtx rs6000_tls_symbol;
Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.299
diff -u -p -r1.299 rs6000.md
--- gcc/config/rs6000/rs6000.md	9 Mar 2004 12:10:25 -0000	1.299
+++ gcc/config/rs6000/rs6000.md	9 Mar 2004 23:34:54 -0000
@@ -8235,7 +8235,7 @@
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
 {
-  operands[2] = CONST0_RTX (DFmode);
+  operands[2] = rs6000_dfmode_m0 ();
 })
 
 (define_insn_and_split "*extenddftf2_internal"
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000-protos.h,v
retrieving revision 1.74
diff -u -p -r1.74 rs6000-protos.h
--- gcc/config/rs6000/rs6000-protos.h	6 Feb 2004 06:18:19 -0000	1.74
+++ gcc/config/rs6000/rs6000-protos.h	9 Mar 2004 23:34:40 -0000
@@ -32,8 +32,9 @@ extern void init_cumulative_args (CUMULA
 extern void rs6000_va_start (tree, rtx);
 #endif /* TREE_CODE */
 
-extern struct rtx_def *rs6000_got_register (rtx);
-extern struct rtx_def *find_addr_reg (rtx);
+extern rtx rs6000_got_register (rtx);
+extern rtx rs6000_dfmode_m0 (void);
+extern rtx find_addr_reg (rtx);
 extern int any_operand (rtx, enum machine_mode);
 extern int short_cint_operand (rtx, enum machine_mode);
 extern int u_short_cint_operand (rtx, enum machine_mode);
@@ -120,41 +121,40 @@ extern int rs6000_emit_cmove (rtx, rtx, 
 extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
 extern void output_toc (FILE *, rtx, int, enum machine_mode);
 extern void rs6000_initialize_trampoline (rtx, rtx, rtx);
-extern struct rtx_def *rs6000_longcall_ref (rtx);
+extern rtx rs6000_longcall_ref (rtx);
 extern void rs6000_fatal_bad_address (rtx);
 extern int stmw_operation (rtx, enum machine_mode);
 extern int mfcr_operation (rtx, enum machine_mode);
 extern int mtcrf_operation (rtx, enum machine_mode);
 extern int lmw_operation (rtx, enum machine_mode);
-extern struct rtx_def *create_TOC_reference (rtx);
+extern rtx create_TOC_reference (rtx);
 extern void rs6000_split_multireg_move (rtx, rtx);
 extern void rs6000_emit_move (rtx, rtx, enum machine_mode);
 extern rtx rs6000_legitimize_address (rtx, rtx, enum machine_mode);
 extern rtx rs6000_legitimize_reload_address (rtx, enum machine_mode,
-			    int, int, int, int *);
+					     int, int, int, int *);
 extern int rs6000_legitimate_address (enum machine_mode, rtx, int);
 extern bool rs6000_mode_dependent_address (rtx);
 extern rtx rs6000_return_addr (int, rtx);
 extern void rs6000_output_symbol_ref (FILE*, rtx);
 extern HOST_WIDE_INT rs6000_initial_elimination_offset (int, int);
 
-extern rtx rs6000_machopic_legitimize_pic_address (rtx orig, 
-                            enum machine_mode mode, rtx reg);
+extern rtx rs6000_machopic_legitimize_pic_address (rtx, enum machine_mode,
+						   rtx);
 
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
 extern unsigned int rs6000_special_round_type_align (tree, int, int);
-extern void function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode,
-					  tree, int);
+extern void function_arg_advance (CUMULATIVE_ARGS *,
+				  enum machine_mode, tree, int);
 extern int function_arg_boundary (enum machine_mode, tree);
 extern struct rtx_def *function_arg (CUMULATIVE_ARGS *,
-					     enum machine_mode, tree, int);
+				     enum machine_mode, tree, int);
 extern int function_arg_partial_nregs (CUMULATIVE_ARGS *,
-					       enum machine_mode, tree, int);
+				       enum machine_mode, tree, int);
 extern int function_arg_pass_by_reference (CUMULATIVE_ARGS *,
-						   enum machine_mode,
-						   tree, int);
+					   enum machine_mode, tree, int);
 extern rtx rs6000_function_value (tree, tree);
 extern rtx rs6000_libcall_value (enum machine_mode);
 extern struct rtx_def *rs6000_va_arg (tree, tree);

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Correct powerpc64 long double -0.0 to double conversion
  2004-03-11 15:05                   ` Correct powerpc64 long double -0.0 to double conversion Alan Modra
@ 2004-03-19  8:14                     ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Sandiford, Richard Henderson, gcc-patches

On Wed, Mar 10, 2004 at 11:18:38AM -0500, David Edelsohn wrote:
> 	Using your test program, IBM xlc128 prints:
> 
>      NaN : 7ff80000 00000000 : 00000000 00000000
>     +Inf : 7ff00000 00000000 : 00000000 00000000
>        0 : 00000000 00000000 : 00000000 00000000
>       -0 : 80000000 00000000 : 00000000 00000000
>     -Inf : fff00000 00000000 : 00000000 00000000

Let's start again.  We can do without the -0.0 in the low double, and
also simplify long double -> double conversion.

- The PowerPC GCC long double comparison insn, cmptf_internal1, assumes
  there is exactly one representation for any long double value, because
  we compare the component doubles.  This is despite the current
  PowerPC64 Linux ABI (and probably AIX) defining the IBM extended
  precision format in a loose manner that would seem to allow more than
  one representation for certain finite values.
- Our math functions always produce a correctly rounded high double
  (it's hard to see how they could do otherwise, given that the
  underlying hardware does so for double operations if rounding mode is
  set correctly)
- Therefore, the current gcc code only supports long doubles that have
  a correctly rounded high double.
- Therefore, we don't need to add the component doubles when converting
  to double, as Richard Sandiford pointed out.

Also a correctly rounded high double means that our actual precision
is 107 bits, not 106.  Changing this is orthogonal to the rs6000 back
end change, so I'll leave it to another patch.

	* config/rs6000/rs6000.md (trunctfdf2): Just use the high double.

Bootstrap and regression test in progress.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.299
diff -u -p -r1.299 rs6000.md
--- gcc/config/rs6000/rs6000.md	9 Mar 2004 12:10:25 -0000	1.299
+++ gcc/config/rs6000/rs6000.md	11 Mar 2004 11:22:13 -0000
@@ -8269,14 +8269,19 @@
   DONE;
 })
 
-(define_insn "trunctfdf2"
+(define_insn_and_split "trunctfdf2"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
 	(float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "f")))]
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
-  "fadd %0,%1,%L1"
-  [(set_attr "type" "fp")
-   (set_attr "length" "4")])
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (match_dup 2))]
+  "
+{
+  const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
+  operands[2] = simplify_gen_subreg (DFmode, operands[1], TFmode, hi_word);
+}")
 
 (define_insn_and_split "trunctfsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=f")


-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:53                             ` Richard Sandiford
@ 2004-03-19  8:14                               ` Richard Sandiford
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Richard Henderson, David Edelsohn, gcc-patches

Andreas Schwab <schwab@suse.de> writes:
> The sign bit might also be a don't-care at this point, in which case both
> formats must be supported.

It might, that's true, but unless someone can show that it definitely
_is_, I don't think we should make that assumption.  math(3M) just says:

     Long double infinity is represented as the sum of a double
     infinity and a double zero; similarly for NaNs.

which perhaps can be read either way, but which hardly goes out
on a limb to say "zero of either sign".

Also, libgcc will use +0.0, so we might as well be consistent.

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10  6:23             ` Powerpc64 long double support David Edelsohn
  2004-03-10  6:44               ` Alan Modra
@ 2004-03-19  8:14               ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

	The PowerPC changes are okay with me.

	Other ports, such as MIPS, use the IBM extended format.  Why not
just add dconstm0 to the standard list in real.h instead of creating a
special function for the rs6000 port?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-07  7:30       ` Richard Henderson
  2004-03-09  5:05         ` Alan Modra
@ 2004-03-19  8:14         ` Richard Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Geoff Keating, sjmunroe, gcc-patches, aj, dgm69, dje, meissner

On Sun, Mar 07, 2004 at 05:07:08PM +1030, Alan Modra wrote:
> -  operands[2] = CONST0_RTX (DFmode);
> +  REAL_VALUE_TYPE rv;
> +  /* Make a -0.0 */
> +  memset (&rv, 0, sizeof (rv));
> +  rv.sign = 1;
> +  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);

How about 

  REAL_VALUE_TYPE rv = REAL_VALUE_NEGATE (dconst0);



r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-09  5:05         ` Alan Modra
  2004-03-09  7:59           ` Richard Henderson
@ 2004-03-19  8:14           ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

On Sat, Mar 06, 2004 at 11:30:23PM -0800, Richard Henderson wrote:
> On Sun, Mar 07, 2004 at 05:07:08PM +1030, Alan Modra wrote:
> > -  operands[2] = CONST0_RTX (DFmode);
> > +  REAL_VALUE_TYPE rv;
> > +  /* Make a -0.0 */
> > +  memset (&rv, 0, sizeof (rv));
> > +  rv.sign = 1;
> > +  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);
> 
> How about 
> 
>   REAL_VALUE_TYPE rv = REAL_VALUE_NEGATE (dconst0);

Well, it's nicer to hide the details of REAL_VALUE_TYPE, but..
REAL_VALUE_NEGATE needs NEGATE_EXPR, i.e. you need to arrange for
tree.h to be included in insn-emit.c.  An easy patch to genemit.c,
but is it a good idea?  Also, real_arithmetic2 is slower than memset.

Hmm, I suppose I could roll my own dconstm0 or even dfmode_m0_rtx.
Doesn't seem worth it though.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix PR 14406 (rs6000 abstf2)
  2004-03-04  2:47             ` David Edelsohn
@ 2004-03-19  8:14               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-19  8:14 UTC (permalink / raw)
  To: gcc-patches

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-06 10:50 ` Alan Modra
  2004-03-06 23:13   ` Geoff Keating
@ 2004-03-19  8:14   ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Steve Munroe
  Cc: gcc-patches, Geoff Keating, Andreas Jaeger, Dwayne McConnell,
	David Edelsohn, Marcus Meissner

On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> +/* Powerpc64 uses the AIX long double format.
> +   
> +   Each long double is made up of two IEEE doubles.  The value of the
> +   long double is the sum of the values of the two parts.  The most
> +   significant part is required to be the value of the long double
> +   rounded to the nearest double, as specified by IEEE.  For Inf
> +   values, the least significant part is required to be one of +0.0 or
> +   -0.0.

Do you know why this is required for Inf?  If there is a reason,
then the patch I just posted to fix -0.0 is wrong..  (In any case,
the patch is incomplete, as rs6000.md extenddftf2 also needs looking
at.)

Hmm, I can see that if you represent +Inf by (+Inf + -Inf), you're
in trouble, because converting to double will result in a NaN.
Perhaps there is some sequence of operations that will result in
(+Inf + +Inf) being turned into (+Inf + -Inf)?

>  No other requirements are made; so, for example, 1.0 may be
> +   represented as (1.0, +0.0) or (1.0, -0.0), and the low part of a
> +   NaN is don't-care.  */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-06 23:13   ` Geoff Keating
  2004-03-07  6:37     ` Alan Modra
@ 2004-03-19  8:14     ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2004-03-19  8:14 UTC (permalink / raw)
  To: amodra; +Cc: sjmunroe, gcc-patches, aj, dgm69, dje, meissner

> X-Original-To: geoffk@foam.wonderslug.com
> Date: Sat, 6 Mar 2004 21:20:33 +1030
> From: Alan Modra <amodra@bigpond.net.au>
> Cc: gcc-patches@gcc.gnu.org, Geoff Keating <geoffk@geoffk.org>,
>         Andreas Jaeger <aj@suse.de>, Dwayne McConnell <dgm69@us.ibm.com>,
>         David Edelsohn <dje@watson.ibm.com>,
>         Marcus Meissner <meissner@suse.de>
> Mail-Followup-To: Steve Munroe <sjmunroe@us.ibm.com>,
> 	gcc-patches@gcc.gnu.org, Geoff Keating <geoffk@geoffk.org>,
> 	Andreas Jaeger <aj@suse.de>, Dwayne McConnell <dgm69@us.ibm.com>,
> 	David Edelsohn <dje@watson.ibm.com>,
> 	Marcus Meissner <meissner@suse.de>
> Content-Disposition: inline
> X-OriginalArrivalTime: 06 Mar 2004 10:50:37.0609 (UTC) FILETIME=[DC062990:01C40368]
> 
> On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> > +/* Powerpc64 uses the AIX long double format.
> > +   
> > +   Each long double is made up of two IEEE doubles.  The value of the
> > +   long double is the sum of the values of the two parts.  The most
> > +   significant part is required to be the value of the long double
> > +   rounded to the nearest double, as specified by IEEE.  For Inf
> > +   values, the least significant part is required to be one of +0.0 or
> > +   -0.0.
> 
> Do you know why this is required for Inf?  If there is a reason,
> then the patch I just posted to fix -0.0 is wrong..  (In any case,
> the patch is incomplete, as rs6000.md extenddftf2 also needs looking
> at.)
> 
> Hmm, I can see that if you represent +Inf by (+Inf + -Inf), you're
> in trouble, because converting to double will result in a NaN.
> Perhaps there is some sequence of operations that will result in
> (+Inf + +Inf) being turned into (+Inf + -Inf)?

If you represent +Inf by (+Inf, +/-Inf), then the code to convert a
double to a long double becomes significantly more complicated.  Right
now, it's done by just loading +0.0 in the low double.

You have to use the same value consistently, of course, or when you
compare two Inf values for == you might get the wrong answer.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 20:11                       ` Richard Sandiford
@ 2004-03-19  8:14                         ` Richard Sandiford
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-19  8:14 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:
> 	I am interested to know exactly what instructions MIPSpro cc
> produces for the conversion from long double to double.  For example, the
> following program:
>
> double
> ld2d (long double f)
> {
>   return (double)f;
> }
>
> Something must be preserving the sign bit given the representation of -0
> displayed by the other sample program.

It just calls a library function (__dble_q).  It would be tempting to
disassemble the standard library in order to find out what it does, but
I'm not certain what the licence restrictions are.

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 18:48                     ` David Edelsohn
  2004-03-10 20:11                       ` Richard Sandiford
@ 2004-03-19  8:14                       ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, gcc-patches

	I am interested to know exactly what instructions MIPSpro cc
produces for the conversion from long double to double.  For example, the
following program:

double
ld2d (long double f)
{
  return (double)f;
}

Something must be preserving the sign bit given the representation of -0
displayed by the other sample program.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix PR 14406 (rs6000 abstf2)
  2004-03-03 21:34               ` Alan Modra
  2004-03-04  2:44                 ` Alan Modra
@ 2004-03-19  8:14                 ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Wed, Mar 03, 2004 at 04:03:52PM -0500, David Edelsohn wrote:
> 	PR target/14406
> 	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
> 	(abstf2, abstf2_internal): New define_expand.
> 
> Okay, assuming no regressions.  I hope that scheduling and CSE actually
> produce something better than the raw pattern.  XLC produces:
> 
>         fabs    fp0,fp1
>         fcmpu   0,fp0,fp1
>         bc      BO_IF,CR0_EQ,__L10
>         fneg    fp2,fp2
> __L10:
>         fmr     fp1,fp0
> 	blr

The following is -O1 -mlong-double-128 code.

.foo:
        fabs 0,1
        fcmpu 7,0,1
        beqlr- 7
        fmr 1,0
        fneg 2,2
        blr

I was about to say that gcc does better, but the XLC sequence has
alerted me to the fact that I'm not doing the right thing for -0.0

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-06  9:52 ` Powerpc64 long double support Alan Modra
@ 2004-03-19  8:14   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Steve Munroe
  Cc: gcc-patches, Geoff Keating, Andreas Jaeger, Dwayne McConnell,
	David Edelsohn, Marcus Meissner

On Fri, Mar 05, 2004 at 06:20:24PM -0600, Steve Munroe wrote:
> Making progress, the code builds but still has a number of make check 
> failures. Any one who has time for code review and suggested 
> improvements would be greatly appreciated.
> 
> One make check failure may be a code gen bug in hammer3_3. The problem 
> is in nexttoward(). For test-idouble; nexttoward (0, -0) and  nexttoward 
> (-0, -0) should return -0 but we are getting 0 instead.
> 
> The offending statement in s_nexttoward.c is line 54:
> 
> 	if((long double) x==y) return y;	/* x=y, return y */
> 
> The code generate is:
> 
>     10000c9c:	fc 00 68 90 	fmr	f0,f13
>     10000ca0:	c8 22 81 50 	lfd	f1,-32432(r2)
>     10000ca4:	ff 80 18 00 	fcmpu	cr7,f0,f3
>     10000ca8:	40 9e 00 08 	bne-	cr7,10000cb0
>     10000cac:	ff 81 20 00 	fcmpu	cr7,f1,f4
>     10000cb0:	40 9e 00 18 	bne-	cr7,10000cc8
>     10000cb4:	fc 23 20 2a 	fadd	f1,f3,f4
>     10000cb8:	38 21 00 90 	addi	r1,r1,144
>     10000cbc:	e8 01 00 10 	ld	r0,16(r1)
>     10000cc0:	7c 08 03 a6 	mtlr	r0
>     10000cc4:	4e 80 00 20 	blr
> 
> It seems that the conversion of y (a long double) to double generates a 
> fadd f1,f3,f4 which seems to change the sign.

This is due to another error in real.c:encode_ibm_extended.

	* real.c (encode_ibm_extended): Duplicate high double in low for
	zeros, infinities and nans.  Explain why.

Index: gcc/real.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.c,v
retrieving revision 1.103.2.7
diff -u -p -r1.103.2.7 real.c
--- gcc/real.c	5 Mar 2004 15:06:08 -0000	1.103.2.7
+++ gcc/real.c	6 Mar 2004 09:48:23 -0000
@@ -3298,8 +3298,9 @@ const struct real_format ieee_extended_i
    range as an IEEE double precision value, but effectively 106 bits of
    significand precision.  Infinity and NaN are represented by their IEEE
    double precision value stored in the first number, the second number is
-   ignored.  Zeroes, Infinities, and NaNs are set in both doubles
-   due to precedent.  */
+   ignored.  Zeroes are set in both doubles so that conversion of a
+   long double -0.0 to double by adding the two doubles will result in
+   -0.0.  Infinities and NaNs do the same due to precedent.  */
 
 static void encode_ibm_extended PARAMS ((const struct real_format *fmt,
 					 long *, const REAL_VALUE_TYPE *));
@@ -3337,10 +3338,8 @@ encode_ibm_extended (fmt, buf, r)
     }
   else
     {
-      /* Inf, NaN, 0 are all representable as doubles, so the
-	 least-significant part can be 0.0.  */
-      buf[2] = 0;
-      buf[3] = 0;
+      buf[2] = buf[0];
+      buf[3] = buf[1];
     }
 }
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:11                   ` Richard Sandiford
  2004-03-10 18:48                     ` David Edelsohn
@ 2004-03-19  8:14                     ` Richard Sandiford
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
>> when compiled with MIPSpro cc.
>
> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
> mips gcc?

Seems like it.  The attached program prints the expected:

      -0 : 80000000 00000000

when compiled with either MIPSpro or gcc 3.4.

Richard


#include <float.h>
#include <stdio.h>

void print_it (const char *fmt, double d)
{
  union { double d; unsigned int i[2]; } u;
  u.d = d;
  printf ("%8s : %08x %08x\n", fmt, u.i[0], u.i[1]);
}

long double f () { return -LDBL_MIN / 1e100L; }

int main ()
{
  print_it ("-0", f ());
  return 0;
}

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:25                     ` Richard Sandiford
  2004-03-10 11:58                       ` Alan Modra
@ 2004-03-19  8:14                       ` Richard Sandiford
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> On Wed, Mar 10, 2004 at 09:31:13PM +1030, Alan Modra wrote:
>> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
>> > when compiled with MIPSpro cc.
>> 
>> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
>> mips gcc?
>
> The reason for -0.0 in the low double goes like this:
>
> Conversion from long double to double is done by simply adding the
> two component doubles.  That means long double -0.0 must be
> (-0.0 + -0.0), or you need to add code to handle -0.0 on every
> conversion.

Not sure: are you saying that's what the spec says you should do, or
that it is just what a particular implementation does?  As per my
previous message, IRIX uses +0.0 for the low double and it still gets
the conversion right.  I assume it must be using something other than
simple addition.

My only concern (in case it wasn't obvious ;) is that you don't
change the behaviour for IRIX.  I'm certainly not trying to say
the change is wrong for powerpc...

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Correct powerpc64 long double -0.0 to double conversion
  2004-03-12 20:26             ` Correct powerpc64 long double -0.0 to double conversion David Edelsohn
@ 2004-03-19  8:14               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford, Richard Henderson, gcc-patches

	After investigating this further, I would recommend leaving the
second component as +0.0 and leaving the rounding as summing the two
components.  No patch.

- Our math functions always produce a correctly rounded high double
  (it's hard to see how they could do otherwise, given that the
  underlying hardware does so for double operations if rounding mode is
  set correctly)

I am not sure what "our math functions" means, but not all efficient long
double algorithms will leave the first component correctly rounded, nor
propagate NaN or Inf for that matter.  If PPC64 Linux requires this corner
case to be correct, it can invoke a more complicated LIBCALL when not
-ffast-math.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4
  2004-03-13 13:01 ` Richard Earnshaw
  2004-03-13 21:44   ` Daniel Jacobowitz
@ 2004-03-19  8:14   ` Richard Earnshaw
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Earnshaw @ 2004-03-19  8:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard.Earnshaw, Richard Earnshaw


rearnsha@buzzard.freeserve.co.uk said:
> This patch fixes the way that we manage QImode indexes when compiling
> for ARM Architecture v4 or later.  In v4 we have a ldrsb instruction
> that can sign-extend a byte load (ldrb zero-extends).  Unfortunately
> the indexing capabilities of this insn are less flexible than its
> unsigned counterpart. In the past we have restricted (mostly) the
> indexing range of ldrb to that of its poorer cousin: that generates
> correct code, but at the expense of wasting instructions when the
> indexing exceeds the capabilities of ldrsb.

> The patch below addresses all this by introducing a new memory
> predicate arm_extendqisi_mem_op which can validate a ldrsb address
> index distinctly from an ldrb address index (it does so by calling
> arm_legitimate_address_p with a new argument, the 'outer' code in much
> the same way as the RTX_COST macros do).


I forgot to mention that the patch changes the memory constraint used for 
VFP memory operands from U to 'Uv'.  The constraint for an ldrsb 
instruction is 'Uq'.

R.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Fix PR 14406 (rs6000 abstf2)
  2004-03-03 15:14 Fix PR 14406 (rs6000 abstf2) Alan Modra
@ 2004-03-19  8:14 ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Replaces the bogus abstf2 pattern with one that works.  Details in the
PR.

	PR target/14406
	* config/rs6000/rs6000.md (abstf2, abstf2+1): Delete define_insn.
	(abstf2, abstf2_internal): New define_expand.

powerpc64-linux bootstrap and regression test in progress.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.296
diff -c -p -r1.296 rs6000.md
*** gcc/config/rs6000/rs6000.md	27 Feb 2004 02:13:59 -0000	1.296
--- gcc/config/rs6000/rs6000.md	3 Mar 2004 15:04:29 -0000
***************
*** 8375,8409 ****
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_insn "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fabs %L0,%L1\;fabs %0,%1\";
!   else
!     return \"fabs %0,%1\;fabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  
! (define_insn ""
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(neg:TF (abs:TF (match_operand:TF 1 "gpc_reg_operand" "f"))))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "*
  {
!   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
!     return \"fnabs %L0,%L1\;fnabs %0,%1\";
!   else
!     return \"fnabs %0,%1\;fnabs %L0,%L1\";
! }"
!   [(set_attr "type" "fp")
!    (set_attr "length" "8")])
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.
--- 8376,8415 ----
    [(set_attr "type" "fp")
     (set_attr "length" "8")])
  
! (define_expand "abstf2"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
  	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "f")))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   rtx label = gen_label_rtx ();
!   emit_insn (gen_abstf2_internal (operands[0], operands[1], label));
!   emit_label (label);
!   DONE;
! }")
  
! (define_expand "abstf2_internal"
    [(set (match_operand:TF 0 "gpc_reg_operand" "=f")
! 	(match_operand:TF 1 "gpc_reg_operand" "f"))
!    (set (match_dup 3) (abs:DF (match_dup 5)))
!    (set (match_dup 4) (compare:CCFP (match_dup 3) (match_dup 5)))
!    (set (pc) (if_then_else (eq (match_dup 4) (const_int 0))
! 			   (label_ref (match_operand 2 "" ""))
! 			   (pc)))
!    (set (match_dup 5) (abs:DF (match_dup 5)))
!    (set (match_dup 6) (neg:DF (match_dup 6)))]
    "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
     && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
!   "
  {
!   const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
!   const int lo_word = FLOAT_WORDS_BIG_ENDIAN ? GET_MODE_SIZE (DFmode) : 0;
!   operands[3] = gen_reg_rtx (DFmode);
!   operands[4] = gen_reg_rtx (CCFPmode);
!   operands[5] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word);
!   operands[6] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word);
! }")
  \f
  ;; Next come the multi-word integer load and store and the load and store
  ;; multiple insns.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:25                           ` Alan Modra
@ 2004-03-19  8:14                             ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 12:06:57PM +0000, Richard Sandiford wrote:
> Alan Modra <amodra@bigpond.net.au> writes:
> >> My only concern (in case it wasn't obvious ;) is that you don't
> >> change the behaviour for IRIX.
> >
> > Easy.  I can add this.
> >
> >   else if (!fmt->qnan_msb_set)
> >     {
> >       /* MIPS slavishly follows proprietary compilers, which use 0.0
> > 	 in the low word.  */
> >       buf[2] = 0;
> >       buf[3] = 0;
> >     }
> 
> Sounds good, although the comment's a bit on the vitriolic side. ;)

Heh.  It was meant to sting a little in a friendly way.  Perhaps
/* MIPS uses +0.0 in the low word.  */ would suit better?  :)

> Surely the long double representation is as much a part of the ABI as
> any other data representation?  I.e., it's not that we're doing something
> just because another compiler does it.  We're doing it because that's
> the platform ABI.
> 
> Richard

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:01                 ` Alan Modra
  2004-03-10 11:11                   ` Richard Sandiford
  2004-03-10 11:13                   ` Alan Modra
@ 2004-03-19  8:14                   ` Alan Modra
  2 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
> when compiled with MIPSpro cc.

Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
mips gcc?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 11:13                   ` Alan Modra
  2004-03-10 11:25                     ` Richard Sandiford
@ 2004-03-19  8:14                     ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford, Richard Henderson, David Edelsohn, gcc-patches

On Wed, Mar 10, 2004 at 09:31:13PM +1030, Alan Modra wrote:
> On Wed, Mar 10, 2004 at 09:42:36AM +0000, Richard Sandiford wrote:
> > when compiled with MIPSpro cc.
> 
> Does MIPSpro correctly convert a long double -0.0 to double -0.0?  Does
> mips gcc?

The reason for -0.0 in the low double goes like this:

Conversion from long double to double is done by simply adding the
two component doubles.  That means long double -0.0 must be
(-0.0 + -0.0), or you need to add code to handle -0.0 on every
conversion.

Conversion from double to long double is done by using the double
in the high part and making the low part zero.  For consistency
(and correct conversion back to double), the low part must be -0.0
when the double is -0.0.  Again, if you want +0.0, +Inf, -Inf to
have +0.0 as the low double, then you need something more complicated
than just loading a constant value in the low double.
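
As an illustration only (this is not code from the patch), the argument
above can be checked with a small C model.  It assumes IEEE 754 doubles
in the default round-to-nearest mode; the struct dd and the helper names
are hypothetical, standing in for the two-double long double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a long double made of two doubles.  */
struct dd { double hi; double lo; };

/* Return 1 if the sign bit of D is set (assumes IEEE 754 binary64).  */
static int sign_bit_set (double d)
{
  uint64_t bits;
  assert (sizeof bits == sizeof d);
  memcpy (&bits, &d, sizeof bits);
  return (int) (bits >> 63);
}

/* double -> long double: the value goes in the high part and the low
   part is loaded with the constant -0.0.  */
static struct dd dd_from_double (double d)
{
  struct dd r;
  r.hi = d;
  r.lo = -0.0;
  return r;
}

/* long double -> double: simply add the two component doubles.  */
static double dd_to_double (struct dd x)
{
  return x.hi + x.lo;
}
```

In this model dd_to_double (dd_from_double (-0.0)) keeps its sign, while
a pair { -0.0, +0.0 } adds up to +0.0, which is the case that would need
extra code on every conversion.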

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 12:06                         ` Richard Sandiford
  2004-03-10 12:25                           ` Alan Modra
  2004-03-10 12:42                           ` Andreas Schwab
@ 2004-03-19  8:14                           ` Richard Sandiford
  2 siblings, 0 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
>> My only concern (in case it wasn't obvious ;) is that you don't
>> change the behaviour for IRIX.
>
> Easy.  I can add this.
>
>   else if (!fmt->qnan_msb_set)
>     {
>       /* MIPS slavishly follows proprietary compilers, which use 0.0
> 	 in the low word.  */
>       buf[2] = 0;
>       buf[3] = 0;
>     }

Sounds good, although the comment's a bit on the vitriolic side. ;)
Surely the long double representation is as much a part of the ABI as
any other data representation?  I.e., it's not that we're doing something
just because another compiler does it.  We're doing it because that's
the platform ABI.

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-10 16:18                 ` David Edelsohn
  2004-03-11 15:05                   ` Correct powerpc64 long double -0.0 to double conversion Alan Modra
@ 2004-03-19  8:14                   ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-03-19  8:14 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Henderson, gcc-patches

	Using your test program, IBM xlc128 prints:

     NaN : 7ff80000 00000000 : 00000000 00000000
    +Inf : 7ff00000 00000000 : 00000000 00000000
       0 : 00000000 00000000 : 00000000 00000000
      -0 : 80000000 00000000 : 00000000 00000000
    -Inf : fff00000 00000000 : 00000000 00000000
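
The test program itself is not reproduced in this message.  As a rough
sketch of how one column pair of such a table could be produced, the
hypothetical helper below formats the two 32-bit words of one IEEE 754
double, most significant word first; a long double dump would call it
once for each component double.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format the two 32-bit words of D, most significant first, into BUF
   as "hhhhhhhh llllllll".  Assumes an IEEE 754 binary64 double.  */
static void format_double_words (double d, char *buf, size_t size)
{
  uint64_t bits;
  assert (sizeof bits == sizeof d);
  memcpy (&bits, &d, sizeof bits);
  snprintf (buf, size, "%08lx %08lx",
            (unsigned long) (bits >> 32),
            (unsigned long) (bits & 0xffffffffu));
}
```

Going through a 64-bit integer rather than a word array keeps the output
independent of the host byte order.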

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-09  7:59           ` Richard Henderson
  2004-03-09 23:49             ` Alan Modra
@ 2004-03-19  8:14             ` Richard Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2004-03-19  8:14 UTC (permalink / raw)
  To: gcc-patches

On Tue, Mar 09, 2004 at 03:35:31PM +1030, Alan Modra wrote:
> On Sat, Mar 06, 2004 at 11:30:23PM -0800, Richard Henderson wrote:
> > On Sun, Mar 07, 2004 at 05:07:08PM +1030, Alan Modra wrote:
> > > -  operands[2] = CONST0_RTX (DFmode);
> > > +  REAL_VALUE_TYPE rv;
> > > +  /* Make a -0.0 */
> > > +  memset (&rv, 0, sizeof (rv));
> > > +  rv.sign = 1;
> > > +  operands[2] = CONST_DOUBLE_FROM_REAL_VALUE (rv, DFmode);
> > 
> > How about 
> > 
> >   REAL_VALUE_TYPE rv = REAL_VALUE_NEGATE (dconst0);
> 
> Well, it's nicer to hide the details of REAL_VALUE_TYPE, but..
> REAL_VALUE_NEGATE needs NEGATE_EXPR.  ie. you need to arrange for
> tree.h to be included in insn-emit.c.  An easy patch to genemit.c,
> but is it a good idea?

I *do* think that's better than frobbing rv.sign yourself.
The less the format of real.h gets exposed the better.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [lno] PATCH rs6000.md fix
@ 2004-03-30 14:55 Mostafa Hagog
  2004-03-30 15:36 ` David Edelsohn
  2004-04-01 12:43 ` Dorit Naishlos
  0 siblings, 2 replies; 875+ messages in thread
From: Mostafa Hagog @ 2004-03-30 14:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: dje, geoffk

The following patch fixes an inconsistency in the doloop-related patterns
of rs6000.md.  Dorit Naishlos pointed out to me that the gap benchmark
fails on the lno branch when gcse-after-reload is specified.  It turned
out that this is due to the same problem that appeared in the s390 port
(http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01803.html).

Bootstraps on powerpc-apple-darwin7.2.0 with no new regressions.  OK to
commit to lno?  IMO this should also go to mainline.

2004-03-28 Mostafa Hagog  <mustafa@il.ibm.com>

      * config/rs6000/rs6000.md ("*ctrsi_internal1", "*ctrsi_internal2",
      "*ctrdi_internal1", "*ctrdi_internal2", "*ctrsi_internal3",
      "*ctrsi_internal4", "*ctrdi_internal3", "*ctrdi_internal4",
      "*ctrsi_internal5", "*ctrsi_internal6", "*ctrdi_internal5",
      "*ctrdi_internal6"): Replace register_operand with
      nonimmediate_operand.


Index: rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.190.2.35.2.3
diff -c -p -r1.190.2.35.2.3 rs6000.md
*** rs6000.md     21 Mar 2004 03:21:23 -0000    1.190.2.35.2.3
--- rs6000.md     28 Mar 2004 09:20:59 -0000
***************
*** 13900,13906 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13900,13906 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13924,13930 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13924,13930 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13948,13954 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13948,13954 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13972,13978 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13972,13978 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13998,14004 ****
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13998,14004 ----
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14022,14028 ****
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14022,14028 ----
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14046,14052 ****
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14046,14052 ----
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14070,14076 ****
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14070,14076 ----
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14096,14102 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14096,14102 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14120,14126 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14120,14126 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14144,14150 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14144,14150 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14168,14174 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14168,14174 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))



^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [lno] PATCH rs6000.md fix
  2004-03-30 14:55 [lno] PATCH rs6000.md fix Mostafa Hagog
@ 2004-03-30 15:36 ` David Edelsohn
  2004-03-31  4:35   ` Alan Modra
  2004-04-01 12:43 ` Dorit Naishlos
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-03-30 15:36 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc-patches, geoffk

>>>>> Mostafa Hagog writes:

Mostafa> Bootstraps on powerpc-apple-darwin7.2.0, no new regressions. Ok to commit
Mostafa> to lno?
Mostafa> IMO this should go also to mainline.

Mostafa> 2004-03-28 Mostafa Hagog  <mustafa@il.ibm.com>

Mostafa> * config/rs6000/rs6000.md ("*ctrsi_internal1", "*ctrsi_internal2",
Mostafa> "*ctrdi_internal1", "*ctrdi_internal2", "*ctrsi_internal3",
Mostafa> "*ctrsi_internal4", "*ctrdi_internal3", "*ctrdi_internal4",
Mostafa> "*ctrsi_internal5", "*ctrsi_internal6", "*ctrdi_internal5",
Mostafa> "*ctrdi_internal6"): Replace register_operand with
Mostafa> nonimmediate_operand

	Yes.  I looked into this, but I preferred to see a testcase that
failed.  I will commit this to mainline.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [lno] PATCH rs6000.md fix
  2004-03-30 15:36 ` David Edelsohn
@ 2004-03-31  4:35   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-03-31  4:35 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, geoffk

On Tue, Mar 30, 2004 at 10:36:46AM -0500, David Edelsohn wrote:
> >>>>> Mostafa Hagog writes:
> Mostafa> * config/rs6000/rs6000.md ("*ctrsi_internal1", "*ctrsi_internal2",
> Mostafa> "*ctrdi_internal1", "*ctrdi_internal2", "*ctrsi_internal3",
> Mostafa> "*ctrsi_internal4", "*ctrdi_internal3", "*ctrdi_internal4",
> Mostafa> "*ctrsi_internal5", "*ctrsi_internal6", "*ctrdi_internal5",
> Mostafa> "*ctrdi_internal6"): Replace register_operand with
> Mostafa> nonimmediate_operand
> 
> 	Yes.  I looked into this, but I preferred to see a testcase that
> failed.  I will commit this to mainline.

Seeing the ctr{si,di}_* patterns mentioned reminded me of a patch I
submitted a long time ago.  I reckon the GE patterns, ie.
ctrsi_internal3, ctrsi_internal4, ctrdi_internal3 and ctrdi_internal4
are all invalid.  Consider that bdnz does:
  decrement ctr by one.
  branch if resulting ctr != 0

This operation is accurately reflected in the rtl for the NE and EQ
patterns, eg.

(define_insn "*ctrsi_internal1"
  [(set (pc)
	(if_then_else (ne (match_operand:SI 1 "register_operand" "c,*r,*r,*r")
			  (const_int 1))
		      (label_ref (match_operand 0 "" ""))
		      (pc)))
   (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
	(plus:SI (match_dup 1)
		 (const_int -1)))
[snip]

However the GE patterns use

(define_insn "*ctrsi_internal3"
  [(set (pc)
	(if_then_else (ge (match_operand:SI 1 "register_operand" "c,*r,*r,*r")
			  (const_int 0))
		      (label_ref (match_operand 0 "" ""))
		      (pc)))
[snip]

Yet both patterns emit the same bdnz instructions!  reg >= 0 is not even
closely equivalent to reg != 1.  The correct condition is reg >= 2, but
then you need to guarantee that the initial value of reg is greater or
equal to 2 rather than the currently checked initial value >= 0 via a
REG_NONNEG note.
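
To make the mismatch concrete, here is a small C model, offered only as
a sketch.  bdnz_taken is a hypothetical stand-in for the instruction as
described above, and the two predicates restate the branch conditions
written in the NE and GE patterns.  None of these names exist in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Model of bdnz as described above: decrement the counter, then
   branch if the result is nonzero.  Returns 1 when the branch is
   taken.  The caller must not pass a counter equal to INT32_MIN.  */
static int bdnz_taken (int32_t *ctr)
{
  assert (ctr != 0);
  *ctr -= 1;
  return *ctr != 0;
}

/* Branch condition written in the NE patterns, tested on the value
   before the decrement.  */
static int ne_condition (int32_t reg) { return reg != 1; }

/* Branch condition written in the GE patterns.  */
static int ge_condition (int32_t reg) { return reg >= 0; }

/* Return 1 if COND predicts the same branch decision as the bdnz
   model for the starting counter value REG.  */
static int agrees_with_bdnz (int (*cond) (int32_t), int32_t reg)
{
  int32_t ctr = reg;
  return cond (reg) == bdnz_taken (&ctr);
}
```

agrees_with_bdnz (ne_condition, r) holds for any starting value r, while
agrees_with_bdnz (ge_condition, 1) fails: the GE condition says "branch"
although the counter has just reached zero.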

May I commit a patch to mainline that removes the above mentioned
patterns?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-03-09 23:49             ` Alan Modra
  2004-03-10  9:42               ` Richard Sandiford
  2004-03-19  8:14               ` Alan Modra
@ 2004-04-01  0:56               ` Geoff Keating
  2004-04-06 14:21                 ` Geoff Keating
  2004-04-09 20:06                 ` Geoff Keating
  2 siblings, 2 replies; 875+ messages in thread
From: Geoff Keating @ 2004-04-01  0:56 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:

> On Mon, Mar 08, 2004 at 11:59:45PM -0800, Richard Henderson wrote:
> > I *do* think that's better than frobbing rv.sign yourself.
> > The less the format of real.h gets exposed the better.
> 
> OK.  I decided to avoid including tree.h in insn-output.c.  Instead,
> I'm using a new rs6000 backend function to calculate -0.0.  I believe
> it's not necessary to GTY(()) rs6000_dfmode_m0_rtx because it will
> point somewhere inside const_double_htab.

No, you have to GTY it because it'll go away when a PCH file is loaded
otherwise.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [lno] PATCH rs6000.md fix
  2004-03-30 14:55 [lno] PATCH rs6000.md fix Mostafa Hagog
  2004-03-30 15:36 ` David Edelsohn
@ 2004-04-01 12:43 ` Dorit Naishlos
  2004-04-06 14:21   ` Dorit Naishlos
  2004-04-09 20:06   ` Dorit Naishlos
  1 sibling, 2 replies; 875+ messages in thread
From: Dorit Naishlos @ 2004-04-01 12:43 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: dje, gcc-patches


I committed it to lno.

thanks,
dorit



                                                                                                                                     
^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [lno] PATCH rs6000.md fix
  2004-04-01 12:43 ` Dorit Naishlos
@ 2004-04-06 14:21   ` Dorit Naishlos
  2004-04-09 20:06   ` Dorit Naishlos
  1 sibling, 0 replies; 875+ messages in thread
From: Dorit Naishlos @ 2004-04-06 14:21 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: dje, gcc-patches


I committed it to lno.

thanks,
dorit



                                                                                                                                     
Mostafa Hagog/Haifa/IBM@IBMIL
Sent by: gcc-patches-owner@gcc.gnu.org
30/03/2004 17:53

To:       gcc-patches@gcc.gnu.org
cc:       dje@watson.ibm.com, geoffk@desire.geoffk.org
Subject:  [lno] PATCH rs6000.md fix
                                                                                                                                     




The following patch fixes an inconsistency in the doloop-related patterns
of rs6000.md.  Dorit Naishlos pointed out to me that the gap benchmark
fails on the lno branch when gcse-after-reload is specified.  It turned
out that this is due to the same problem that appeared in the s390 port
(http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01803.html).

Bootstraps on powerpc-apple-darwin7.2.0 with no new regressions.  OK to
commit to lno?  IMO this should also go to mainline.

2004-03-28 Mostafa Hagog  <mustafa@il.ibm.com>

      * config/rs6000/rs6000.md ("*ctrsi_internal1", "*ctrsi_internal2",
      "*ctrdi_internal1", "*ctrdi_internal2", "*ctrsi_internal3",
      "*ctrsi_internal4", "*ctrdi_internal3", "*ctrdi_internal4",
      "*ctrsi_internal5", "*ctrsi_internal6", "*ctrdi_internal5",
      "*ctrdi_internal6"): Replace register_operand with
      nonimmediate_operand.


Index: rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.190.2.35.2.3
diff -c -p -r1.190.2.35.2.3 rs6000.md
*** rs6000.md     21 Mar 2004 03:21:23 -0000    1.190.2.35.2.3
--- rs6000.md     28 Mar 2004 09:20:59 -0000
***************
*** 13900,13906 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13900,13906 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13924,13930 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13924,13930 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13948,13954 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13948,13954 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13972,13978 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13972,13978 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 13998,14004 ****
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 13998,14004 ----
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14022,14028 ****
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14022,14028 ----
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14046,14052 ****
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14046,14052 ----
                    (const_int 0))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14070,14076 ****
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14070,14076 ----
                    (const_int 0))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14096,14102 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14096,14102 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14120,14126 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "register_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14120,14126 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:SI 2 "nonimmediate_operand" "=1,*r,m,*q*c*l")
      (plus:SI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14144,14150 ****
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14144,14150 ----
                    (const_int 1))
                  (label_ref (match_operand 0 "" ""))
                  (pc)))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
***************
*** 14168,14174 ****
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "register_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
--- 14168,14174 ----
                    (const_int 1))
                  (pc)
                  (label_ref (match_operand 0 "" ""))))
!    (set (match_operand:DI 2 "nonimmediate_operand" "=1,*r,m,*c*l")
      (plus:DI (match_dup 1)
             (const_int -1)))
     (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))






^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-04-01  0:56               ` Geoff Keating
@ 2004-04-06 14:21                 ` Geoff Keating
  2004-04-09 20:06                 ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2004-04-06 14:21 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:

> On Mon, Mar 08, 2004 at 11:59:45PM -0800, Richard Henderson wrote:
> > I *do* think that's better than frobbing rv.sign yourself.
> > The less the format of real.h gets exposed the better.
> 
> OK.  I decided to avoid including tree.h in insn-output.c.  Instead,
> I'm using a new rs6000 backend function to calculate -0.0.  I believe
> it's not necessary to GTY(()) rs6000_dfmode_m0_rtx because it will
> point somewhere inside const_double_htab.

No, you have to GTY it because it'll go away when a PCH file is loaded
otherwise.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Powerpc64 long double support
  2004-04-01  0:56               ` Geoff Keating
  2004-04-06 14:21                 ` Geoff Keating
@ 2004-04-09 20:06                 ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2004-04-09 20:06 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:

> On Mon, Mar 08, 2004 at 11:59:45PM -0800, Richard Henderson wrote:
> > I *do* think that's better than frobbing rv.sign yourself.
> > The less the format of real.h gets exposed the better.
> 
> OK.  I decided to avoid including tree.h in insn-output.c.  Instead,
> I'm using a new rs6000 backend function to calculate -0.0.  I believe
> it's not necessary to GTY(()) rs6000_dfmode_m0_rtx because it will
> point somewhere inside const_double_htab.

No, you have to GTY it because it'll go away when a PCH file is loaded
otherwise.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [lno] PATCH rs6000.md fix
  2004-04-01 12:43 ` Dorit Naishlos
  2004-04-06 14:21   ` Dorit Naishlos
@ 2004-04-09 20:06   ` Dorit Naishlos
  1 sibling, 0 replies; 875+ messages in thread
From: Dorit Naishlos @ 2004-04-09 20:06 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: dje, gcc-patches


I committed it to lno.

thanks,
dorit



                                                                                                                                     






^ permalink raw reply	[flat|nested] 875+ messages in thread

* rs6000 stack boundary
@ 2004-04-30 14:46 Alan Modra
  2004-04-30 22:26 ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-04-30 14:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Fixes some problems with STACK_BOUNDARY.  Altivec needs 16 byte
alignment whether or no -mabi=altivec is given, and PowerPC64 Linux
always has a 16 byte aligned stack.  On PowerPC64 Linux, the sysv4.h
definition unfortunately overrode the rs6000.h one.

	* config/rs6000/rs6000.h (STACK_BOUNDARY): Use 128 bit for either
	TARGET_ALTIVEC or TARGET_ALTIVEC_ABI.
	* config/rs6000/sysv4.h (ABI_STACK_BOUNDARY): Likewise.
	(STACK_BOUNDARY): Delete.

Regression tested powerpc64-linux.  OK for mainline?

Index: gcc/config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.320
diff -u -p -r1.320 rs6000.h
--- gcc/config/rs6000/rs6000.h	24 Apr 2004 06:37:19 -0000	1.320
+++ gcc/config/rs6000/rs6000.h	30 Apr 2004 14:29:35 -0000
@@ -718,7 +718,8 @@ extern const char *rs6000_warn_altivec_l
 #define PARM_BOUNDARY (TARGET_32BIT ? 32 : 64)
 
 /* Boundary (in *bits*) on which stack pointer should be aligned.  */
-#define STACK_BOUNDARY ((TARGET_32BIT && !TARGET_ALTIVEC_ABI) ? 64 : 128)
+#define STACK_BOUNDARY \
+  ((TARGET_32BIT && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI) ? 64 : 128)
 
 /* Allocation boundary (in *bits*) for the code of a function.  */
 #define FUNCTION_BOUNDARY 32
Index: gcc/config/rs6000/sysv4.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/sysv4.h,v
retrieving revision 1.148
diff -u -p -r1.148 sysv4.h
--- gcc/config/rs6000/sysv4.h	11 Apr 2004 06:21:05 -0000	1.148
+++ gcc/config/rs6000/sysv4.h	30 Apr 2004 14:29:36 -0000
@@ -385,12 +385,6 @@ do {									\
 #undef	STRICT_ALIGNMENT
 #define	STRICT_ALIGNMENT (TARGET_STRICT_ALIGN)
 
-/* Alignment in bits of the stack boundary.  Note, in order to allow building
-   one set of libraries with -mno-eabi instead of eabi libraries and non-eabi
-   versions, just use 64 as the stack boundary.  */
-#undef	STACK_BOUNDARY
-#define	STACK_BOUNDARY	(TARGET_ALTIVEC_ABI ? 128 : 64)
-
 /* Define this macro if you wish to preserve a certain alignment for
    the stack pointer, greater than what the hardware enforces.  The
    definition is a C expression for the desired alignment (measured
@@ -407,7 +401,8 @@ do {									\
 #define PREFERRED_STACK_BOUNDARY 128
 
 /* Real stack boundary as mandated by the appropriate ABI.  */
-#define ABI_STACK_BOUNDARY ((TARGET_EABI && !TARGET_ALTIVEC_ABI) ? 64 : 128)
+#define ABI_STACK_BOUNDARY \
+  ((TARGET_EABI && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI) ? 64 : 128)
 
 /* An expression for the alignment of a structure field FIELD if the
    alignment computed in the usual way is COMPUTED.  */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: rs6000 stack boundary
       [not found]           ` <amodra@bigpond.net.au>
                               ` (11 preceding siblings ...)
  2004-03-12 20:26             ` Correct powerpc64 long double -0.0 to double conversion David Edelsohn
@ 2004-04-30 14:55             ` David Edelsohn
  2004-05-09 14:19             ` Fixes for powerpc-linux param passing David Edelsohn
                               ` (48 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-04-30 14:55 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> Fixes some problems with STACK_BOUNDARY.  Altivec needs 16 byte
Alan> alignment whether or no -mabi=altivec is given, and PowerPC64 Linux
Alan> always has a 16 byte aligned stack.  On PowerPC64 Linux, the sysv4.h
Alan> definition unfortunately overrode the rs6000.h one.

Alan> * config/rs6000/rs6000.h (STACK_BOUNDARY): Use 128 bit for either
Alan> TARGET_ALTIVEC or TARGET_ALTIVEC_ABI.
Alan> * config/rs6000/sysv4.h (ABI_STACK_BOUNDARY): Likewise.
Alan> (STACK_BOUNDARY): Delete.

Alan> Regression tested powerpc64-linux.  OK for mainline?

	Okay.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: rs6000 stack boundary
  2004-04-30 14:46 rs6000 stack boundary Alan Modra
@ 2004-04-30 22:26 ` Geoff Keating
  2004-04-30 23:06   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2004-04-30 22:26 UTC (permalink / raw)
  To: Alan Modra; +Cc: David Edelsohn, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:

> Fixes some problems with STACK_BOUNDARY.  Altivec needs 16 byte
> alignment whether or no -mabi=altivec is given, and PowerPC64 Linux
> always has a 16 byte aligned stack.  On PowerPC64 Linux, the sysv4.h
> definition unfortunately overrode the rs6000.h one.
> 
> 	* config/rs6000/rs6000.h (STACK_BOUNDARY): Use 128 bit for either
> 	TARGET_ALTIVEC or TARGET_ALTIVEC_ABI.
> 	* config/rs6000/sysv4.h (ABI_STACK_BOUNDARY): Likewise.
> 	(STACK_BOUNDARY): Delete.

This isn't right; it's not true that all ELF targets have a 128-bit
stack boundary.  In particular, EABI targets don't.  Remember,
TARGET_ALTIVEC does not change ABI.  The change would be OK if it
didn't have the TARGET_ALTIVEC part.  (Note that the sysv4.h
definition of STACK_BOUNDARY was correct.)

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: rs6000 stack boundary
  2004-04-30 22:26 ` Geoff Keating
@ 2004-04-30 23:06   ` David Edelsohn
       [not found]     ` <200404302355.i3UNtw61022391@desire.geoffk.org>
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-04-30 23:06 UTC (permalink / raw)
  To: Geoff Keating; +Cc: Alan Modra, gcc-patches

>>>>> Geoff Keating writes:

Geoff> This isn't right; it's not true that all ELF targets have a 128-bit
Geoff> stack boundary.  In particular, EABI targets don't.  Remember,
Geoff> TARGET_ALTIVEC does not change ABI.  The change would be OK if it
Geoff> didn't have the TARGET_ALTIVEC part.  (Note that the sysv4.h
Geoff> definition of STACK_BOUNDARY was correct.)

	TARGET_ALTIVEC cannot work without some form of stricter
alignment.  It is fine to separate the ABI in theory, but the alignment
requirements and truncated displacements of Altivec instructions force
stricter alignment for Altivec variables.

	Are STARTING_FRAME_OFFSET and STACK_DYNAMIC_OFFSET sufficient?
Remember that Darwin has enforced a stricter alignment for all
applications and something similar is needed for SVR4 when Altivec is
enabled. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: rs6000 stack boundary
       [not found]     ` <200404302355.i3UNtw61022391@desire.geoffk.org>
@ 2004-05-01  0:40       ` David Edelsohn
  2004-05-01  2:08       ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-05-01  0:40 UTC (permalink / raw)
  To: Geoff Keating; +Cc: amodra, gcc-patches

>>>>> Geoff Keating writes:

Geoff> I agree, but the correct behaviour if the alignment can't be achieved is
Geoff> to warn about the user's incompatible choice of options, not silently
Geoff> generate code that assumes an alignment greater than actually exists.

	TARGET_ALTIVEC_ABI implies more than just correct alignment, it
affects argument passing and vrsave as well.  I think this issue may be
confusion about what TARGET_ALTIVEC actually implies.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: rs6000 stack boundary
       [not found]     ` <200404302355.i3UNtw61022391@desire.geoffk.org>
  2004-05-01  0:40       ` David Edelsohn
@ 2004-05-01  2:08       ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-05-01  2:08 UTC (permalink / raw)
  To: Geoff Keating; +Cc: dje, gcc-patches

Geoff,
  If I understand your objection, you're concerned that using -maltivec
with a target that happens to compile libraries with -meabi will no
longer work.  That's true.

However, such targets won't work with -mabi=altivec either.  Except
for t-spe, I see no multilib selection for mabi=altivec.  This, along
with the fact that the altivec support in gcc currently doesn't
work without 16 byte alignment, means that -maltivec and -mabi=altivec
on such targets are not usable anyway.

What I'd like to see is a -maltivec option that works for linux.  It
doesn't currently unless -mabi=altivec is also used, and even that
combination is buggy.  Please don't erect roadblocks.  I can't fix
all the problems in one go.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-08 15:54 Fixes for powerpc-linux param passing Alan Modra
@ 2004-05-08 15:54 ` Aldy Hernandez
  2004-05-08 22:43   ` Geoff Keating
  2004-05-09 15:16   ` Alan Modra
  2004-05-08 16:41 ` Andrew Pinski
  2004-05-15 15:00 ` Alan Modra
  2 siblings, 2 replies; 875+ messages in thread
From: Aldy Hernandez @ 2004-05-08 15:54 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn, Hartmut Penner, Janis Johnson; +Cc: geoffk

> 5) Passing vectors when -maltivec and -mabi=no-altivec was broken for
>    reasons detailed in http://gcc.gnu.org/ml/gcc/2004-04/msg01316.html
>    Vectors need to be aligned.  Also, -maltivec shouldn't change the
>    function passing mechanism, which means both
>    "-mabi=no-altivec -mno-altivec" and "-mabi=no-altivec -maltivec" need
>    to change.  In this patch, I've chosen to pass altivec vectors by
>    reference.  If passing by value, alignment constraints allow only one
>    vector to be passed in regs, in r5,r6,r7,r8.
> 
> My fix for (5) does mean an ABI change, unfortunately.  I'm not sure how

I don't have a problem with changing the ABI for -mabi=no-altivec.
There is no library code that is currently depending on it, and it
isn't documented in any of the altivec or ppc-linux/sysv/whatever
documents.  I say we come up with a sane solution, break it once and
for all, and try to document this in the ABI standard for PPC32/64.

Geoff, David, do you agree?

Aldy

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Fixes for powerpc-linux param passing
@ 2004-05-08 15:54 Alan Modra
  2004-05-08 15:54 ` Aldy Hernandez
                   ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-05-08 15:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn, Aldy Hernandez, Hartmut Penner, Janis Johnson

This patch cures a number of powerpc-linux parameter passing problems.

1) rs6000_va_arg disagreed with function_arg on where floating point
   types live, apart from SFmode, DFmode and TFmode.  function_arg
   passes other floating point types, including all complex floats, in
   gprs, whereas rs6000_va_arg expected them in fp regs.

2) rs6000_va_arg disagreed with function_arg on arg alignment.
   function_arg/function_arg_advance aligned to an even word for DFmode
   and when exactly 2 gprs were needed for args.  rs6000_va_arg aligned
   when more than one word was needed for args, a mismatch on something
   like complex long long.

3) function_arg_boundary disagreed with function_arg_advance on
   alignment for two word args that didn't happen to be either DImode,
   DFmode or a SPE vector.  This doesn't cause a problem until you
   exhaust gprs for arg passing, at which point the generic function
   calling code lays out the stack according to function_arg_boundary,
   but the rs6000 backend assumes larger alignment.  Only causes a
   problem with variable argument functions with a large number of
   fixed arguments.  Presumably such functions are rare.

4) No accounting for stack space used by AltiVec args when -mabi=altivec
   and we run out of AltiVec registers.  Again, only causes a problem
   with variable argument functions with a large number of fixed vector
   arguments.

5) Passing vectors when -maltivec and -mabi=no-altivec was broken for
   reasons detailed in http://gcc.gnu.org/ml/gcc/2004-04/msg01316.html
   Vectors need to be aligned.  Also, -maltivec shouldn't change the
   function passing mechanism, which means both
   "-mabi=no-altivec -mno-altivec" and "-mabi=no-altivec -maltivec" need
   to change.  In this patch, I've chosen to pass altivec vectors by
   reference.  If passing by value, alignment constraints allow only one
   vector to be passed in regs, in r5,r6,r7,r8.
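
A user-level reproducer for the kind of mismatch described in (1) and (2)
can be sketched as below.  This is only an illustration, not one of the
testsuite files named later in this message; the function names sum_complex
and sum_ll are made up here.  Both functions rely on the callee's va_arg
agreeing with the caller about where each variable argument lives.

```c
#include <assert.h>
#include <stdarg.h>
#include <complex.h>

/* Sum COUNT float _Complex values taken from the variable arguments.
   float _Complex is not changed by the default argument promotions,
   so va_arg must fetch it from wherever the caller placed it.  */
static float _Complex
sum_complex (int count, ...)
{
  va_list ap;
  float _Complex total = 0;
  int i;

  assert (count >= 0);
  va_start (ap, count);
  for (i = 0; i < count; i++)
    total += va_arg (ap, float _Complex);
  va_end (ap);
  return total;
}

/* Sum COUNT long long values.  On a 32-bit target each one is a
   two word item, so the caller and va_arg must agree on how such
   items are aligned.  */
static long long
sum_ll (int count, ...)
{
  va_list ap;
  long long total = 0;
  int i;

  assert (count >= 0);
  va_start (ap, count);
  for (i = 0; i < count; i++)
    total += va_arg (ap, long long);
  va_end (ap);
  return total;
}
```

With a consistent ABI, sum_complex (2, 1.0f + 2.0f * I, 3.0f + 4.0f * I)
compares equal to 4.0f + 6.0f * I regardless of optimisation level.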

My fix for (5) does mean an ABI change, unfortunately.  I'm not sure how
to go about implementing altivec loads and stores to unaligned
locations.  The hardware can do it using multiple instructions and a
few temp vector regs.  I did try adding the following to rs6000_emit_move
to see whether the idea is feasible:

@@ -3573,7 +3573,25 @@ rs6000_emit_move (rtx dest, rtx source, 
       return;
     }
 
+  if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode))
+    {
+      /* Loads and stores using AltiVec instructions must be 16 byte
+	 aligned, at least when transferring full vectors.  */
+      if (GET_CODE (operands[0]) == MEM
+	  && MEM_ALIGN (operands[0]) < 128)
+	{
+	  debug_rtx (operands[0]);
+	  abort ();
+	}
+      if (GET_CODE (operands[1]) == MEM
+	  && MEM_ALIGN (operands[1]) < 128)
+	{
+	  debug_rtx (operands[1]);
+	  abort ();
+	}
+    }
+
   if (!no_new_pseudos)
     {
       if (GET_CODE (operands[1]) == MEM && optimize > 0

This resulted in
(mem:V4SI (reg:SI 126) [0 S16 A8])
.../gcc.dg/altivec-varargs-1.c: In function `foo':
.../gcc.dg/altivec-varargs-1.c:25: internal compiler error: in rs6000_emit_move
We're looking at the result of a MEM returned from expand_builtin_va_arg,
so that's the first thing that would need fixing..
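
For readers unfamiliar with the dump: in "[0 S16 A8]" the S field is the
size in bytes and the A field is the known alignment in bits, so this MEM
is only known to be byte aligned and trips the "MEM_ALIGN < 128" test.
The same condition, expressed as a run-time check on an address rather
than as the compiler's static knowledge, might look like the sketch below.
The helper name is_aligned_bits is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if address P is aligned to BITS bits.  BITS must be
   a multiple of 8, matching the units MEM_ALIGN uses.  */
static int
is_aligned_bits (const void *p, unsigned bits)
{
  assert (bits >= 8 && bits % 8 == 0);
  return (uintptr_t) p % (bits / 8) == 0;
}
```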


Anyhow, the following was tested powerpc-linux and powerpc64-linux, on
both altivec and non-altivec hardware, no regressions.  Fixes the
following failures:

FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O0 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O1 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O2 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -Os 
FAIL: gcc.dg/compat/scalar-by-value-3 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: gcc.dg/compat/scalar-return-3 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: gcc.dg/complex-1.c execution test
FAIL: g++.dg/ext/altivec-3.C execution test

	* config/rs6000/rs6000.c (function_arg_boundary): Align for ABI_V4
	when size is 8 bytes.
	(function_arg_advance): Account for stack space used by AltiVec
	args when -mabi=altivec.  Simplify alignment calculations.  For 
	ABI_V4, pass AltiVec vectors by reference when -mabi=no-altivec.
	(function_arg): Similarly.
	(function_arg_pass_by_reference): True for ABI_V4 AltiVec when
	not AltiVec ABI.
	(rs6000_va_arg): Correct fp arg test.  Adjust for AltiVec change.
	Correct alignment, and align before testing reg count.  Remove
	TREE_THIS_VOLATILE from reg.  Don't emit unused labels.
	(rs6000_complex_function_value): Check TARGET_HARD_FLOAT and
	TARGET_FPRS here..
	(rs6000_function_value): .. not here before call.

Some notes on the patch:
- Aldy added TREE_THIS_VOLATILE in
  http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00409.html without any
  explanation.  I think it is no longer needed.  Not surprisingly, it
  pessimises the code.
- Aligning reg before comparing in rs6000_va_arg isn't a bug fix.  I
  just think it's neater that way as you then don't need the code that
  later sets reg to 8.
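
The "Simplify alignment calculations" part is meant to be behaviour
preserving.  The sketch below restates the old and new expressions from
the patch as standalone functions so they can be compared.  The function
names are invented for this illustration, and unsigned arithmetic is used
so that the subtraction is well defined in portable C.

```c
#include <assert.h>

/* Padding words before a 32-bit AltiVec arg: old and new forms.  */
static unsigned
vec_pad_old (unsigned words)
{
  return (6 - (words & 3)) & 3;
}

static unsigned
vec_pad_new (unsigned words)
{
  return (2 - words) & 3;
}

/* First register of the pair used for a two word V.4 arg.  Pairs
   start at the odd registers r3, r5, r7 and r9.  */
static unsigned
pair_reg_old (unsigned gregno)
{
  if ((gregno & 1) == 0)
    gregno += 1;
  return gregno;
}

static unsigned
pair_reg_new (unsigned gregno)
{
  gregno += (1u - gregno) & 1u;
  return gregno;
}

/* Round ADDR up to a multiple of ALIGN, a power of two, as the new
   rs6000_va_arg overflow area code does with align - 1 and -align.  */
static unsigned long
round_up (unsigned long addr, unsigned long align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & (0UL - align);
}
```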

OK to install mainline?  Failing that, OK to install without the changes
to pass AltiVec by reference?


--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-05-06 15:10:07.000000000 +0930
+++ gcc-mainline/gcc/config/rs6000/rs6000.c	2004-05-08 22:20:52.000000000 +0930
@@ -4065,10 +4083,10 @@ function_arg_padding (enum machine_mode 
 int
 function_arg_boundary (enum machine_mode mode, tree type ATTRIBUTE_UNUSED)
 {
-  if (DEFAULT_ABI == ABI_V4 && (mode == DImode || mode == DFmode))
+  if (DEFAULT_ABI == ABI_V4 && GET_MODE_SIZE (mode) == 8)
+    return 64;
+  else if (SPE_VECTOR_MODE (mode))
     return 64;
-   else if (SPE_VECTOR_MODE (mode))
-     return 64;
   else if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
     return 128;
   else
@@ -4105,6 +4123,8 @@ function_arg_advance (CUMULATIVE_ARGS *c
 
   if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
     {
+      bool stack = false;
+
       if (USE_ALTIVEC_FOR_ARG_P (cum, mode, type, named))
         {
 	  cum->vregno++;
@@ -4112,12 +4132,18 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	    error ("Cannot pass argument in vector register because"
 		   " altivec instructions are disabled, use -maltivec"
 		   " to enable them.");
+
+	  /* PowerPC64 Linux and AIX allocate GPRs for a vector argument
+	     even if it is going to be passed in a vector register.  
+	     Darwin does the same for variable-argument functions.  */
+	  if ((DEFAULT_ABI == ABI_AIX && TARGET_64BIT)
+	      || (cum->stdarg && DEFAULT_ABI != ABI_V4))
+	    stack = true;
 	}
-      /* PowerPC64 Linux and AIX allocates GPRs for a vector argument
-	 even if it is going to be passed in a vector register.  
-	 Darwin does the same for variable-argument functions.  */
-      if ((DEFAULT_ABI == ABI_AIX && TARGET_64BIT)
-		   || (cum->stdarg && DEFAULT_ABI != ABI_V4))
+      else
+	stack = true;
+
+      if (stack)
         {
 	  int align;
 	  
@@ -4129,7 +4155,7 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	     aligned.  Space for GPRs is reserved even if the argument
 	     will be passed in memory.  */
 	  if (TARGET_32BIT)
-	    align = ((6 - (cum->words & 3)) & 3);
+	    align = (2 - cum->words) & 3;
 	  else
 	    align = cum->words & 1;
 	  cum->words += align + rs6000_arg_size (mode, type);
@@ -4167,22 +4193,27 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	  int n_words;
 	  int gregno = cum->sysv_gregno;
 
-	  /* Aggregates and IEEE quad get passed by reference.  */
+	  /* Aggregates, IEEE quad, and AltiVec vectors get passed by
+	     reference.  */
 	  if ((type && AGGREGATE_TYPE_P (type))
-	      || mode == TFmode)
+	      || mode == TFmode
+	      || ALTIVEC_VECTOR_MODE (mode))
 	    n_words = 1;
 	  else 
 	    n_words = rs6000_arg_size (mode, type);
 
-	  /* Long long and SPE vectors are put in odd registers.  */
-	  if (n_words == 2 && (gregno & 1) == 0)
-	    gregno += 1;
+	  /* Long long and SPE vectors are put in (r3,r4), (r5,r6),
+	     (r7,r8) or (r9,r10).  As does any other 2 word item such
+	     as complex int due to a historical mistake.  */
+	  if (n_words == 2)
+	    gregno += (1 - gregno) & 1;
 
-	  /* Long long and SPE vectors are not split between registers
-	     and stack.  */
+	  /* Multi-reg args are not split between registers and stack.  */
 	  if (gregno + n_words - 1 > GP_ARG_MAX_REG)
 	    {
-	      /* Long long is aligned on the stack.  */
+	      /* Long long and SPE vectors are aligned on the stack.
+		 So are other 2 word items such as complex int due to
+		 a historical mistake.  */
 	      if (n_words == 2)
 		cum->words += cum->words & 1;
 	      cum->words += n_words;
@@ -4465,7 +4496,7 @@ function_arg (CUMULATIVE_ARGS *cum, enum
 	     they just have to start on an even word, since the parameter
 	     save area is 16-byte aligned.  */
 	  if (TARGET_32BIT)
-	    align = ((6 - (cum->words & 3)) & 3);
+	    align = (2 - cum->words) & 3;
 	  else
 	    align = cum->words & 1;
 	  align_words = cum->words + align;
@@ -4502,18 +4533,22 @@ function_arg (CUMULATIVE_ARGS *cum, enum
 	  int n_words;
 	  int gregno = cum->sysv_gregno;
 
-	  /* Aggregates and IEEE quad get passed by reference.  */
+	  /* Aggregates, IEEE quad, and AltiVec vectors get passed by
+	     reference.  */
 	  if ((type && AGGREGATE_TYPE_P (type))
-	      || mode == TFmode)
+	      || mode == TFmode
+	      || ALTIVEC_VECTOR_MODE (mode))
 	    n_words = 1;
 	  else 
 	    n_words = rs6000_arg_size (mode, type);
 
-	  /* Long long and SPE vectors are put in odd registers.  */
-	  if (n_words == 2 && (gregno & 1) == 0)
-	    gregno += 1;
+	  /* Long long and SPE vectors are put in (r3,r4), (r5,r6),
+	     (r7,r8) or (r9,r10).  As does any other 2 word item such
+	     as complex int due to a historical mistake.  */
+	  if (n_words == 2)
+	    gregno += (1 - gregno) & 1;
 
-	  /* Long long does not split between registers and stack.  */
+	  /* Multi-reg args are not split between registers and stack.  */
 	  if (gregno + n_words - 1 <= GP_ARG_MAX_REG)
 	    return gen_rtx_REG (mode, gregno);
 	  else
@@ -4659,7 +4694,8 @@ function_arg_pass_by_reference (CUMULATI
 {
   if (DEFAULT_ABI == ABI_V4
       && ((type && AGGREGATE_TYPE_P (type))
-	  || mode == TFmode))
+	  || mode == TFmode
+	  || (!TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))))
     {
       if (TARGET_DEBUG_ARG)
 	fprintf (stderr, "function_arg_pass_by_reference: aggregate\n");
@@ -4913,6 +4948,7 @@ rs6000_va_arg (tree valist, tree type)
   tree gpr, fpr, ovf, sav, reg, t, u;
   int indirect_p, size, rsize, n_reg, sav_ofs, sav_scale;
   rtx lab_false, lab_over, addr_rtx, r;
+  int align;
 
   if (DEFAULT_ABI != ABI_V4)
     {
@@ -4986,10 +5022,14 @@ rs6000_va_arg (tree valist, tree type)
 
   size = int_size_in_bytes (type);
   rsize = (size + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+  align = 1;
 
-  if (AGGREGATE_TYPE_P (type) || TYPE_MODE (type) == TFmode)
+  if (AGGREGATE_TYPE_P (type)
+      || TYPE_MODE (type) == TFmode
+      || (!TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (TYPE_MODE (type))))
     {
-      /* Aggregates and long doubles are passed by reference.  */
+      /* Aggregates, long doubles, and AltiVec vectors are passed by
+	 reference.  */
       indirect_p = 1;
       reg = gpr;
       n_reg = 1;
@@ -4998,7 +5038,8 @@ rs6000_va_arg (tree valist, tree type)
       size = UNITS_PER_WORD;
       rsize = 1;
     }
-  else if (FLOAT_TYPE_P (type) && TARGET_HARD_FLOAT && TARGET_FPRS)
+  else if (TARGET_HARD_FLOAT && TARGET_FPRS
+	   && (TYPE_MODE (type) == SFmode || TYPE_MODE (type) == DFmode))
     {
       /* FP args go in FP registers, if present.  */
       indirect_p = 0;
@@ -5006,6 +5047,8 @@ rs6000_va_arg (tree valist, tree type)
       n_reg = 1;
       sav_ofs = 8*4;
       sav_scale = 8;
+      if (TYPE_MODE (type) == DFmode)
+	align = 8;
     }
   else
     {
@@ -5015,38 +5058,43 @@ rs6000_va_arg (tree valist, tree type)
       n_reg = rsize;
       sav_ofs = 0;
       sav_scale = 4;
+      if (n_reg == 2)
+	align = 8;
     }
 
   /* Pull the value out of the saved registers....  */
 
-  lab_false = gen_label_rtx ();
-  lab_over = gen_label_rtx ();
+  lab_over = NULL_RTX;
   addr_rtx = gen_reg_rtx (Pmode);
 
-  /*  AltiVec vectors never go in registers.  */
-  if (!TARGET_ALTIVEC || TREE_CODE (type) != VECTOR_TYPE)
+  /*  AltiVec vectors never go in registers when -mabi=altivec.  */
+  if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (TYPE_MODE (type)))
+    align = 16;
+  else
     {
-      TREE_THIS_VOLATILE (reg) = 1;
-      emit_cmp_and_jump_insns
-	(expand_expr (reg, NULL_RTX, QImode, EXPAND_NORMAL),
-	 GEN_INT (8 - n_reg + 1), GE, const1_rtx, QImode, 1,
-	 lab_false);
-
-      /* Long long is aligned in the registers.  */
-      if (n_reg > 1)
+      lab_false = gen_label_rtx ();
+      lab_over = gen_label_rtx ();
+
+      /* Long long and SPE vectors are aligned in the registers.
+	 As are any other 2 gpr item such as complex int due to a
+	 historical mistake.  */
+      u = reg;
+      if (n_reg == 2)
 	{
 	  u = build (BIT_AND_EXPR, TREE_TYPE (reg), reg,
 		     build_int_2 (n_reg - 1, 0));
-	  u = build (PLUS_EXPR, TREE_TYPE (reg), reg, u);
-	  u = build (MODIFY_EXPR, TREE_TYPE (reg), reg, u);
+	  u = build (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg, u);
 	  TREE_SIDE_EFFECTS (u) = 1;
-	  expand_expr (u, const0_rtx, VOIDmode, EXPAND_NORMAL);
 	}
 
+      emit_cmp_and_jump_insns
+	(expand_expr (u, NULL_RTX, QImode, EXPAND_NORMAL),
+	 GEN_INT (8 - n_reg + 1), GE, const1_rtx, QImode, 1,
+	 lab_false);
+
+      t = sav;
       if (sav_ofs)
 	t = build (PLUS_EXPR, ptr_type_node, sav, build_int_2 (sav_ofs, 0));
-      else
-	t = sav;
 
       u = build (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg,
 		 build_int_2 (n_reg, 0));
@@ -5067,40 +5115,18 @@ rs6000_va_arg (tree valist, tree type)
 
       emit_jump_insn (gen_jump (lab_over));
       emit_barrier ();
-    }
 
-  emit_label (lab_false);
+      emit_label (lab_false);
+    }
 
   /* ... otherwise out of the overflow area.  */
 
-  /* Make sure we don't find reg 7 for the next int arg.
-
-     All AltiVec vectors go in the overflow area.  So in the AltiVec
-     case we need to get the vectors from the overflow area, but
-     remember where the GPRs and FPRs are.  */
-  if (n_reg > 1 && (TREE_CODE (type) != VECTOR_TYPE
-		    || !TARGET_ALTIVEC))
-    {
-      t = build (MODIFY_EXPR, TREE_TYPE (reg), reg, build_int_2 (8, 0));
-      TREE_SIDE_EFFECTS (t) = 1;
-      expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);
-    }
-
   /* Care for on-stack alignment if needed.  */
-  if (rsize <= 1)
-    t = ovf;
-  else
+  t = ovf;
+  if (align != 1)
     {
-      int align;
-
-      /* AltiVec vectors are 16 byte aligned.  */
-      if (TARGET_ALTIVEC && TREE_CODE (type) == VECTOR_TYPE)
-	align = 15;
-      else
-	align = 7;
-
-      t = build (PLUS_EXPR, TREE_TYPE (ovf), ovf, build_int_2 (align, 0));
-      t = build (BIT_AND_EXPR, TREE_TYPE (t), t, build_int_2 (-align-1, -1));
+      t = build (PLUS_EXPR, TREE_TYPE (t), t, build_int_2 (align - 1, 0));
+      t = build (BIT_AND_EXPR, TREE_TYPE (t), t, build_int_2 (-align, -1));
     }
   t = save_expr (t);
 
@@ -5113,7 +5139,8 @@ rs6000_va_arg (tree valist, tree type)
   TREE_SIDE_EFFECTS (t) = 1;
   expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);
 
-  emit_label (lab_over);
+  if (lab_over)
+    emit_label (lab_over);
 
   if (indirect_p)
     {
@@ -16199,7 +16226,7 @@ rs6000_complex_function_value (enum mach
   enum machine_mode inner = GET_MODE_INNER (mode);
   unsigned int inner_bytes = GET_MODE_SIZE (inner);
 
-  if (FLOAT_MODE_P (mode))
+  if (FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     regno = FP_ARG_RETURN;
   else
     {
@@ -16257,10 +16284,9 @@ rs6000_function_value (tree valtype, tre
   else
     mode = TYPE_MODE (valtype);
 
-  if (TREE_CODE (valtype) == REAL_TYPE && TARGET_HARD_FLOAT && TARGET_FPRS)
+  if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS)
     regno = FP_ARG_RETURN;
   else if (TREE_CODE (valtype) == COMPLEX_TYPE
-	   && TARGET_HARD_FLOAT
 	   && targetm.calls.split_complex_arg)
     return rs6000_complex_function_value (mode);
   else if (TREE_CODE (valtype) == VECTOR_TYPE

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-08 15:54 Fixes for powerpc-linux param passing Alan Modra
  2004-05-08 15:54 ` Aldy Hernandez
@ 2004-05-08 16:41 ` Andrew Pinski
  2004-05-08 22:20   ` Aldy Hernandez
  2004-05-15 15:00 ` Alan Modra
  2 siblings, 1 reply; 875+ messages in thread
From: Andrew Pinski @ 2004-05-08 16:41 UTC (permalink / raw)
  To: Alan Modra
  Cc: Aldy Hernandez, David Edelsohn, gcc-patches, Janis Johnson,
	Hartmut Penner, Andrew Pinski


On May 8, 2004, at 09:36, Alan Modra wrote:

>
> Some notes on the patch:
> - Aldy added TREE_THIS_VOLATILE in
>   http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00409.html without any
>   explanation.  I think it is no longer needed.  Not surprisingly, it
>   pessimises the code.

Also note TREE_THIS_VOLATILE causes the tree-ssa to fail for vectors
as the last variable before the variable arguments.  So this should fix
those regressions.  I had forgotten to look into why the tests were
failing.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-08 16:41 ` Andrew Pinski
@ 2004-05-08 22:20   ` Aldy Hernandez
  0 siblings, 0 replies; 875+ messages in thread
From: Aldy Hernandez @ 2004-05-08 22:20 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Alan Modra, David Edelsohn, gcc-patches, Janis Johnson, Hartmut Penner

On Sat, May 08, 2004 at 12:09:13PM -0400, Andrew Pinski wrote:
> 
> On May 8, 2004, at 09:36, Alan Modra wrote:
> 
> >
> >Some notes on the patch:
> >- Aldy added TREE_THIS_VOLATILE in
> >  http://gcc.gnu.org/ml/gcc-patches/2002-03/msg00409.html without any
> >  explanation.  I think it is no longer needed.  Not surprisingly, it
> >  pessimises the code.
> 
> Also note TREE_THIS_VOLATILE causes the tree-ssa to fail for vectors
> as the last variable before the variable arguments.  So this should fix
> those regressions.  I had forgotten to look into why the tests were
> failing.

Nobody remembers why this went in in the first place, possibly to
cover some bug in the front or middle end.  I agree we should take it
out.

Aldy

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-08 15:54 ` Aldy Hernandez
@ 2004-05-08 22:43   ` Geoff Keating
  2004-05-09 15:16   ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2004-05-08 22:43 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches

Aldy Hernandez <aldyh@redhat.com> writes:

> > 5) Passing vectors when -maltivec and -mabi=no-altivec was broken for
> >    reasons detailed in http://gcc.gnu.org/ml/gcc/2004-04/msg01316.html
> >    Vectors need to be aligned.  Also, -maltivec shouldn't change the
> >    function passing mechanism, which means both
> >    "-mabi=no-altivec -mno-altivec" and "-mabi=no-altivec -maltivec" need
> >    to change.  In this patch, I've chosen to pass altivec vectors by
> >    reference.  If passing by value, alignment constraints allow only one
> >    vector to be passed in regs, in r5,r6,r7,r8.
> > 
> > My fix for (5) does mean an ABI change, unfortunately.  I'm not sure how
> 
> I don't have a problem with changing the ABI for -mabi=no-altivec.
> There is no library code that is currently depending on it, and it
> isn't documented in any of the altivec or ppc-linux/sysv/whatever
> documents.  I say we come up with a sane solution, break it once and
> for all, and try to document this in the ABI standard for PPC32/64.
> 
> Geoff, David, do you agree?

This seems reasonable to me.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
       [not found]           ` <amodra@bigpond.net.au>
                               ` (12 preceding siblings ...)
  2004-04-30 14:55             ` rs6000 stack boundary David Edelsohn
@ 2004-05-09 14:19             ` David Edelsohn
  2004-05-09 14:22               ` Aldy Hernandez
  2004-05-18  4:23             ` [lno] PATCH rs6000.md fix David Edelsohn
                               ` (47 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-05-09 14:19 UTC (permalink / raw)
  To: gcc-patches, Aldy Hernandez, Hartmut Penner, Janis Johnson

	The patch is okay with me, mainly because I do not think we have
any other choice for correct operation.  Neither Geoff nor I have any
objection, so I would say go ahead.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-09 15:16   ` Alan Modra
@ 2004-05-09 14:21     ` Aldy Hernandez
  2004-05-09 14:22     ` Geoff Keating
  2004-05-11  6:55     ` Alan Modra
  2 siblings, 0 replies; 875+ messages in thread
From: Aldy Hernandez @ 2004-05-09 14:21 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn, Hartmut Penner, Janis Johnson, geoffk

> > I don't have a problem with changing the ABI for -mabi=no-altivec.
> > There is no library code that is currently depending on it, and it
> > isn't documented in any of the altivec or ppc-linux/sysv/whatever
> > documents.  I say we come up with a sane solution, break it once and
> > for all, and try to document this in the ABI standard for PPC32/64.
> 
> Note that this patch just fixes the ABI_V4 case.  We have exactly the
> same problem on every other rs6000 ABI in that -mabi=no-altivec doesn't
> align vector params.  I'm happy to fix the other ABIs too if everyone
> agrees.  Pass vectors by value in gprs for 64-bit, and by reference
> for 32-bit?  Does that sound reasonable?

Sounds very reasonable.  Please put big comments around such parts,
and do try to push this out to the PPC-Linux ABI committee (if there
is one).

Thanks Alan.

Aldy

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-09 15:16   ` Alan Modra
  2004-05-09 14:21     ` Aldy Hernandez
@ 2004-05-09 14:22     ` Geoff Keating
  2004-05-11  6:55     ` Alan Modra
  2 siblings, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2004-05-09 14:22 UTC (permalink / raw)
  To: amodra; +Cc: aldyh, gcc-patches, dje, HPENNER, janis187

> Date: Sun, 9 May 2004 08:49:50 +0930
> From: Alan Modra <amodra@bigpond.net.au>

> On powerpc-linux, there are some other ABI changes just crying out to be
> done too.  For example, we currently pass _Complex double in gprs.
> Perhaps this change needs to be accompanied with a means to generate
> code compatible with the current broken ABI.

That, you can't change so easily, since libc has functions that accept
complex values.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-09 14:19             ` Fixes for powerpc-linux param passing David Edelsohn
@ 2004-05-09 14:22               ` Aldy Hernandez
  0 siblings, 0 replies; 875+ messages in thread
From: Aldy Hernandez @ 2004-05-09 14:22 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Hartmut Penner, Janis Johnson

On Sat, May 08, 2004 at 06:43:29PM -0400, David Edelsohn wrote:
> 	The patch is okay with me, mainly because I do not think we have
> any other choice for correct operation.  Neither Geoff nor I have any
> objection, so I would say go ahead.

Ditto.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-08 15:54 ` Aldy Hernandez
  2004-05-08 22:43   ` Geoff Keating
@ 2004-05-09 15:16   ` Alan Modra
  2004-05-09 14:21     ` Aldy Hernandez
                       ` (2 more replies)
  1 sibling, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-05-09 15:16 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: gcc-patches, David Edelsohn, Hartmut Penner, Janis Johnson, geoffk

On Sat, May 08, 2004 at 09:59:15AM -0400, Aldy Hernandez wrote:
> > My fix for (5) does mean an ABI change, unfortunately.  I'm not sure how
> 
> I don't have a problem with changing the ABI for -mabi=no-altivec.
> There is no library code that is currently depending on it, and it
> isn't documented in any of the altivec or ppc-linux/sysv/whatever
> documents.  I say we come up with a sane solution, break it once and
> for all, and try to document this in the ABI standard for PPC32/64.

Note that this patch just fixes the ABI_V4 case.  We have exactly the
same problem on every other rs6000 ABI in that -mabi=no-altivec doesn't
align vector params.  I'm happy to fix the other ABIs too if everyone
agrees.  Pass vectors by value in gprs for 64-bit, and by reference
for 32-bit?  Does that sound reasonable?
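
As a rough model of that proposal (a sketch only, not code from rs6000.c;
the enum and function names are hypothetical, and the case of running out
of vector registers under -mabi=altivec is ignored):

```c
#include <assert.h>
#include <stdbool.h>

enum vec_passing
{
  VEC_IN_VREGS,          /* -mabi=altivec: use vector registers.  */
  VEC_BY_VALUE_IN_GPRS,  /* 64-bit, -mabi=no-altivec.  */
  VEC_BY_REFERENCE       /* 32-bit, -mabi=no-altivec.  */
};

/* Classify how an AltiVec vector argument would be passed under the
   proposal.  WORD_BITS is 32 or 64.  */
static enum vec_passing
classify_vector_arg (int word_bits, bool altivec_abi)
{
  assert (word_bits == 32 || word_bits == 64);
  if (altivec_abi)
    return VEC_IN_VREGS;
  return word_bits == 32 ? VEC_BY_REFERENCE : VEC_BY_VALUE_IN_GPRS;
}
```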

On powerpc-linux, there are some other ABI changes just crying out to be
done too.  For example, we currently pass _Complex double in gprs.
Perhaps this change needs to be accompanied with a means to generate
code compatible with the current broken ABI.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-11  6:55     ` Alan Modra
@ 2004-05-10 14:01       ` Aldy Hernandez
  0 siblings, 0 replies; 875+ messages in thread
From: Aldy Hernandez @ 2004-05-10 14:01 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn, Hartmut Penner, Janis Johnson, geoffk

> 	* config/rs6000/rs6000.c (function_arg_boundary): Always align
> 	AltiVec vectors.
> 	(function_arg_advance): Pass TARGET_32BIT -mabi=no-altivec AltiVec
> 	vectors by reference.  Align the same for TARGET_64BIT to a 16
> 	byte boundary.  Remove useless code.  Add function comment.
> 	(function_arg): Similarly.  Move gpr rs6000_mixed_function_arg
> 	call to where it belongs.
> 	(function_arg_partial_nregs): Return true for all TARGET_32BIT
> 	-mabi=no-altivec AltiVec vectors.  Fix debug output.
> 	(rs6000_va_arg): Adjust for AltiVec change.
> 
> OK to install?

Yes.  Thank you.

Aldy

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fixes for powerpc-linux param passing
  2004-05-09 15:16   ` Alan Modra
  2004-05-09 14:21     ` Aldy Hernandez
  2004-05-09 14:22     ` Geoff Keating
@ 2004-05-11  6:55     ` Alan Modra
  2004-05-10 14:01       ` Aldy Hernandez
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-05-11  6:55 UTC (permalink / raw)
  To: Aldy Hernandez, gcc-patches, David Edelsohn, Hartmut Penner,
	Janis Johnson, geoffk

On Sun, May 09, 2004 at 08:49:50AM +0930, Alan Modra wrote:
> Note that this patch just fixes the ABI_V4 case.  We have exactly the
> same problem on every other rs6000 ABI in that -mabi=no-altivec doesn't
> align vector params.  I'm happy to fix the other ABIs too if everyone
> agrees.  Pass vectors by value in gprs for 64-bit, and by reference
> for 32-bit?  Does that sound reasonable?

This changes the remaining -mabi=no-altivec vector passing to 16 byte
alignment.  Also removes some useless code as explained by the comments
added to function_arg_advance and function_arg.  Regression tested
powerpc-linux and powerpc64-linux.
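
The padding calculation this patch uses for the non-V.4 ABIs can be
sketched on its own as follows.  The function name arg_pad_words is made
up for the illustration, and the boundary values in the usage note assume
PARM_BOUNDARY is 32 bits for 32-bit and 64 bits for 64-bit.  As the
comment added to function_arg_advance says, the trick only works while
the boundary is one or two parameter words.

```c
#include <assert.h>

/* Words of padding needed before an argument, given WORDS already
   used in the parameter save area.  BOUNDARY and PARM_BOUNDARY are
   in bits.  */
static unsigned
arg_pad_words (unsigned words, unsigned boundary, unsigned parm_boundary)
{
  unsigned ratio = boundary / parm_boundary;

  /* Larger ratios would need the offset of the parameter save area
     to be taken into account.  */
  assert (ratio == 1 || ratio == 2);
  return words & (ratio - 1);
}
```

For example, a 128-bit aligned vector on 64-bit after five words gives
arg_pad_words (5, 128, 64) == 1, while after six words no padding is
needed.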

	* config/rs6000/rs6000.c (function_arg_boundary): Always align
	AltiVec vectors.
	(function_arg_advance): Pass TARGET_32BIT -mabi=no-altivec AltiVec
	vectors by reference.  Align the same for TARGET_64BIT to a 16
	byte boundary.  Remove useless code.  Add function comment.
	(function_arg): Similarly.  Move gpr rs6000_mixed_function_arg
	call to where it belongs.
	(function_arg_partial_nregs): Return true for all TARGET_32BIT
	-mabi=no-altivec AltiVec vectors.  Fix debug output.
	(rs6000_va_arg): Adjust for AltiVec change.

OK to install?

--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-05-10 09:06:46.000000000 +0930
+++ gcc-mainline/gcc/config/rs6000/rs6000.c	2004-05-10 18:24:52.000000000 +0930
@@ -4192,7 +4192,7 @@ function_arg_boundary (enum machine_mode
     return 64;
   else if (SPE_VECTOR_MODE (mode))
     return 64;
-  else if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
+  else if (ALTIVEC_VECTOR_MODE (mode))
     return 128;
   else
     return PARM_BOUNDARY;
@@ -4218,7 +4218,11 @@ rs6000_arg_size (enum machine_mode mode,
 \f
 /* Update the data in CUM to advance over an argument
    of mode MODE and data type TYPE.
-   (TYPE is null for libcalls where that information may not be available.)  */
+   (TYPE is null for libcalls where that information may not be available.)
+
+   Note that for args passed by reference, function_arg will be called
+   with MODE and TYPE set to that of the pointer to the arg, not the arg
+   itself.  */
 
 void
 function_arg_advance (CUMULATIVE_ARGS *cum, enum machine_mode mode, 
@@ -4295,18 +4299,9 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	}
       else
 	{
-	  int n_words;
+	  int n_words = rs6000_arg_size (mode, type);
 	  int gregno = cum->sysv_gregno;
 
-	  /* Aggregates, IEEE quad, and AltiVec vectors get passed by
-	     reference.  */
-	  if ((type && AGGREGATE_TYPE_P (type))
-	      || mode == TFmode
-	      || ALTIVEC_VECTOR_MODE (mode))
-	    n_words = 1;
-	  else 
-	    n_words = rs6000_arg_size (mode, type);
-
 	  /* Long long and SPE vectors are put in (r3,r4), (r5,r6),
 	     (r7,r8) or (r9,r10).  As does any other 2 word item such
 	     as complex int due to a historical mistake.  */
@@ -4342,10 +4337,16 @@ function_arg_advance (CUMULATIVE_ARGS *c
     }
   else
     {
-      int align = (TARGET_32BIT && (cum->words & 1) != 0
-		   && function_arg_boundary (mode, type) == 64) ? 1 : 0;
+      int n_words = rs6000_arg_size (mode, type);
+      int align = function_arg_boundary (mode, type) / PARM_BOUNDARY - 1;
 
-      cum->words += align + rs6000_arg_size (mode, type);
+      /* The simple alignment calculation here works because
+	 function_arg_boundary / PARM_BOUNDARY will only be 1 or 2.
+	 If we ever want to handle alignments larger than 8 bytes for
+	 32-bit or 16 bytes for 64-bit, then we'll need to take into
+	 account the offset to the start of the parm save area.  */
+      align &= cum->words;
+      cum->words += align + n_words;
 
       if (GET_MODE_CLASS (mode) == MODE_FLOAT
 	  && TARGET_HARD_FLOAT && TARGET_FPRS)
@@ -4544,7 +4545,11 @@ rs6000_mixed_function_arg (CUMULATIVE_AR
    both an FP and integer register (or possibly FP reg and stack).  Library
    functions (when CALL_LIBCALL is set) always have the proper types for args,
    so we can pass the FP value just in one register.  emit_library_function
-   doesn't support PARALLEL anyway.  */
+   doesn't support PARALLEL anyway.
+
+   Note that for args passed by reference, function_arg will be called
+   with MODE and TYPE set to that of the pointer to the arg, not the arg
+   itself.  */
 
 struct rtx_def *
 function_arg (CUMULATIVE_ARGS *cum, enum machine_mode mode, 
@@ -4658,18 +4663,9 @@ function_arg (CUMULATIVE_ARGS *cum, enum
 	}
       else
 	{
-	  int n_words;
+	  int n_words = rs6000_arg_size (mode, type);
 	  int gregno = cum->sysv_gregno;
 
-	  /* Aggregates, IEEE quad, and AltiVec vectors get passed by
-	     reference.  */
-	  if ((type && AGGREGATE_TYPE_P (type))
-	      || mode == TFmode
-	      || ALTIVEC_VECTOR_MODE (mode))
-	    n_words = 1;
-	  else 
-	    n_words = rs6000_arg_size (mode, type);
-
 	  /* Long long and SPE vectors are put in (r3,r4), (r5,r6),
 	     (r7,r8) or (r9,r10).  As does any other 2 word item such
 	     as complex int due to a historical mistake.  */
@@ -4685,16 +4681,8 @@ function_arg (CUMULATIVE_ARGS *cum, enum
     }
   else
     {
-      int align = (TARGET_32BIT && (cum->words & 1) != 0
-	           && function_arg_boundary (mode, type) == 64) ? 1 : 0;
-      int align_words = cum->words + align;
-
-      if (type && TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
-        return NULL_RTX;
-
-      if (TARGET_32BIT && TARGET_POWERPC64
-	  && (mode == DImode || mode == BLKmode))
-	return rs6000_mixed_function_arg (cum, mode, type, align_words);
+      int align = function_arg_boundary (mode, type) / PARM_BOUNDARY - 1;
+      int align_words = cum->words + (cum->words & align);
 
       if (USE_FP_FOR_ARG_P (cum, mode, type))
 	{
@@ -4763,7 +4751,13 @@ function_arg (CUMULATIVE_ARGS *cum, enum
 	  return gen_rtx_PARALLEL (mode, gen_rtvec_v (n, r));
 	}
       else if (align_words < GP_ARG_NUM_REG)
-	return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
+	{
+	  if (TARGET_32BIT && TARGET_POWERPC64
+	      && (mode == DImode || mode == BLKmode))
+	    return rs6000_mixed_function_arg (cum, mode, type, align_words);
+
+	  return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
+	}
       else
 	return NULL_RTX;
     }
@@ -4810,7 +4804,10 @@ function_arg_partial_nregs (CUMULATIVE_A
    the argument itself.  The pointer is passed in whatever way is
    appropriate for passing a pointer to that type.
 
-   Under V.4, structures and unions are passed by reference.
+   Under V.4, aggregates and long double are passed by reference.
+
+   As an extension to all 32-bit ABIs, AltiVec vectors are passed by
+   reference unless the AltiVec vector extension ABI is in force.
 
    As an extension to all ABIs, variable sized types are passed by
    reference.  */
@@ -4820,17 +4817,18 @@ function_arg_pass_by_reference (CUMULATI
 				enum machine_mode mode ATTRIBUTE_UNUSED, 
 				tree type, int named ATTRIBUTE_UNUSED)
 {
-  if (DEFAULT_ABI == ABI_V4
-      && ((type && AGGREGATE_TYPE_P (type))
-	  || mode == TFmode
-	  || (!TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))))
+  if ((DEFAULT_ABI == ABI_V4
+       && ((type && AGGREGATE_TYPE_P (type))
+	   || mode == TFmode))
+      || (TARGET_32BIT && !TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
+      || (type && int_size_in_bytes (type) < 0))
     {
       if (TARGET_DEBUG_ARG)
-	fprintf (stderr, "function_arg_pass_by_reference: aggregate\n");
+	fprintf (stderr, "function_arg_pass_by_reference\n");
 
       return 1;
     }
-  return type && int_size_in_bytes (type) < 0;
+  return 0;
 }
 
 static void
@@ -5081,8 +5078,12 @@ rs6000_va_arg (tree valist, tree type)
 
   if (DEFAULT_ABI != ABI_V4)
     {
-      /* Variable sized types are passed by reference.  */
-      if (int_size_in_bytes (type) < 0)
+      /* Variable sized types are passed by reference, as are AltiVec
+	 vectors when 32-bit and not using the AltiVec ABI extension.  */
+      if (int_size_in_bytes (type) < 0
+	  || (TARGET_32BIT
+	      && !TARGET_ALTIVEC_ABI
+	      && ALTIVEC_VECTOR_MODE (TYPE_MODE (type))))
 	{
 	  u = build_pointer_type (type);
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: Fixes for powerpc-linux param passing
  2004-05-08 15:54 Fixes for powerpc-linux param passing Alan Modra
  2004-05-08 15:54 ` Aldy Hernandez
  2004-05-08 16:41 ` Andrew Pinski
@ 2004-05-15 15:00 ` Alan Modra
  2 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-05-15 15:00 UTC (permalink / raw)
  To: gcc-patches

Applying as obvious.  UNITS_PER_WORD is wrong when -m32 -mpowerpc64.
The code that I'm reinstating here is to handle types such as
_Complex long long, which aren't aligned.

	* config/rs6000/rs6000.c (rs6000_va_arg <ABI_V4>): Don't use
	UNITS_PER_WORD to calculate gpr size.  Re-instate code to set reg
	count to 8 to handle n_reg > 2.

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.642
diff -u -p -r1.642 rs6000.c
--- gcc/config/rs6000/rs6000.c	13 May 2004 06:40:04 -0000	1.642
+++ gcc/config/rs6000/rs6000.c	15 May 2004 04:33:31 -0000
@@ -5144,7 +5144,7 @@ rs6000_va_arg (tree valist, tree type)
   sav = build (COMPONENT_REF, TREE_TYPE (f_sav), valist, f_sav);
 
   size = int_size_in_bytes (type);
-  rsize = (size + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+  rsize = (size + 3) / 4;
   align = 1;
 
   if (AGGREGATE_TYPE_P (type)
@@ -5158,7 +5158,7 @@ rs6000_va_arg (tree valist, tree type)
       n_reg = 1;
       sav_ofs = 0;
       sav_scale = 4;
-      size = UNITS_PER_WORD;
+      size = 4;
       rsize = 1;
     }
   else if (TARGET_HARD_FLOAT && TARGET_FPRS
@@ -5240,6 +5240,14 @@ rs6000_va_arg (tree valist, tree type)
       emit_barrier ();
 
       emit_label (lab_false);
+      if (n_reg > 2)
+	{
+	  /* Ensure that we don't find any more args in regs.
+	     Alignment has taken care of the n_reg == 2 case.  */
+	  t = build (MODIFY_EXPR, TREE_TYPE (reg), reg, build_int_2 (8, 0));
+	  TREE_SIDE_EFFECTS (t) = 1;
+	  expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);
+	}
     }
 
   /* ... otherwise out of the overflow area.  */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: [lno] PATCH rs6000.md fix
       [not found]           ` <amodra@bigpond.net.au>
                               ` (13 preceding siblings ...)
  2004-05-09 14:19             ` Fixes for powerpc-linux param passing David Edelsohn
@ 2004-05-18  4:23             ` David Edelsohn
  2004-05-26 18:52             ` rs6000 mainline patch for pr 14478 David Edelsohn
                               ` (46 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-05-18  4:23 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> reg >= 0 is not even
Alan> closely equivalent to reg != 1.  The correct condition is reg >= 2, but
Alan> then you need to guarantee that the initial value of reg is greater or
Alan> equal to 2 rather than the currently checked initial value >= 0 via a
Alan> REG_NONNEG note.

Alan> May I commit a patch to mainline that removes the above mentioned
Alan> patterns?

	Yes, ctr[ds]i_internal[34] appear to be unusable.

Thanks, David


* rs6000 mainline patch for pr 14478
@ 2004-05-26 17:31 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-05-26 17:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Insurance that const zero never makes it to these patterns.
http://gcc.gnu.org/ml/gcc-patches/2004-03/msg00696.html
Bootstrapped, etc. powerpc64-linux.  OK mainline?  I'd like to apply
either this patch or the one referenced above to the 3.4 branch too.

	PR target/14478
	* config/rs6000/rs6000.c (reg_or_neg_short_operand): Don't allow zero.

diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-mainline/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-05-21 11:00:09.000000000 +0930
+++ gcc-mainline/gcc/config/rs6000/rs6000.c	2004-05-26 10:52:34.000000000 +0930
@@ -1406,13 +1406,16 @@ reg_or_short_operand (rtx op, enum machi
 }
 
 /* Similar, except check if the negation of the constant would be
-   valid for a D-field.  */
+   valid for a D-field.  Don't allow a constant zero, since all the
+   patterns that call this predicate use "addic r1,r2,-constant" on
+   a constant value to set a carry when r2 is greater or equal to
+   "constant".  That doesn't work for zero.  */
 
 int
 reg_or_neg_short_operand (rtx op, enum machine_mode mode)
 {
   if (GET_CODE (op) == CONST_INT)
-    return CONST_OK_FOR_LETTER_P (INTVAL (op), 'P');
+    return CONST_OK_FOR_LETTER_P (INTVAL (op), 'P') && INTVAL (op) != 0;
 
   return gpc_reg_operand (op, mode);
 }

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: rs6000 mainline patch for pr 14478
       [not found]           ` <amodra@bigpond.net.au>
                               ` (14 preceding siblings ...)
  2004-05-18  4:23             ` [lno] PATCH rs6000.md fix David Edelsohn
@ 2004-05-26 18:52             ` David Edelsohn
  2004-05-26 20:17               ` Alan Modra
  2004-05-26 21:18             ` David Edelsohn
                               ` (45 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-05-26 18:52 UTC (permalink / raw)
  To: gcc-patches

> Insurance that const zero never makes it to these patterns.
> http://gcc.gnu.org/ml/gcc-patches/2004-03/msg00696.html
> Bootstrapped, etc. powerpc64-linux.  OK mainline?  I'd like to apply
> either this patch or the one referenced above to the 3.4 branch too.
> 
> 	PR target/14478
> 	* config/rs6000/rs6000.c (reg_or_neg_short_operand): Don't allow zero.

This is okay for mainline and 3.4 branch.

	You also could change CONST_OK_FOR_LETTER_P macro so that it
rejects zero for 'P'.

	The patterns also do not operate as intended if the register is
zero, correct?

David


* Re: rs6000 mainline patch for pr 14478
  2004-05-26 18:52             ` rs6000 mainline patch for pr 14478 David Edelsohn
@ 2004-05-26 20:17               ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-05-26 20:17 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Wed, May 26, 2004 at 10:36:18AM -0400, David Edelsohn wrote:
> 	The patterns also do not operate as intended if the register is
> zero, correct?

They work fine for a zero in a register, since "subfc" is used.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: rs6000 mainline patch for pr 14478
       [not found]           ` <amodra@bigpond.net.au>
                               ` (15 preceding siblings ...)
  2004-05-26 18:52             ` rs6000 mainline patch for pr 14478 David Edelsohn
@ 2004-05-26 21:18             ` David Edelsohn
  2004-07-26 22:22             ` [PATCH, committed] SFmode arg padding and va_arg cleanup David Edelsohn
                               ` (44 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-05-26 21:18 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> They work fine for a zero in a register, since "subfc" is used.

	Okay.  subfc uses NOT (one's complement) instead of negation.

David


* [PATCH, committed] SFmode arg padding and va_arg cleanup
@ 2004-07-14 21:21 David Edelsohn
  2004-07-26 20:30 ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-07-14 21:21 UTC (permalink / raw)
  To: gcc-patches

	SFmode should not be padded when passed on the stack in 64-bit mode.

	Update rs6000_gimplify_va_arg to use size_int().

David


	* config/rs6000/rs6000.c (function_arg_padding): Do not pad SFmode
	for TARGET_64BIT.
	(rs6000_gimplify_va_arg): Use size_int instead of build_int_2.

Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.665
diff -c -p -u -r1.665 rs6000.c
--- rs6000.c	13 Jul 2004 07:45:04 -0000	1.665
+++ rs6000.c	14 Jul 2004 13:59:47 -0000
@@ -4593,6 +4593,10 @@ function_arg_padding (enum machine_mode 
 	return upward;
     }
 
+  /* SFmode parameters are not padded.  */
+  if (TARGET_64BIT && mode == SFmode)
+    return none;
+
   /* Fall back to the default.  */
   return DEFAULT_FUNCTION_ARG_PADDING (mode, type);
 }
@@ -5459,7 +5463,7 @@ rs6000_gimplify_va_arg (tree valist, tre
 	{
 	  /* Args grow upward.  */
 	  t = build2 (POSTINCREMENT_EXPR, TREE_TYPE (valist), valist,
-		      build_int_2 (POINTER_SIZE / BITS_PER_UNIT, 0));
+		      size_int (POINTER_SIZE / BITS_PER_UNIT));
 	  t = build1 (NOP_EXPR, build_pointer_type (ptrtype), t);
 	  t = build_fold_indirect_ref (t);
 	  return build_fold_indirect_ref (t);
@@ -5568,12 +5572,11 @@ rs6000_gimplify_va_arg (tree valist, tre
       if (n_reg == 2)
 	{
 	  u = build2 (BIT_AND_EXPR, TREE_TYPE (reg), reg,
-		     build_int_2 (n_reg - 1, 0));
+		     size_int (n_reg - 1));
 	  u = build2 (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg, u);
 	}
 
-      t = build_int_2 (8 - n_reg + 1, 0);
-      TREE_TYPE (t) = TREE_TYPE (reg);
+      t = fold_convert (TREE_TYPE (reg), size_int (8 - n_reg + 1));
       t = build2 (GE_EXPR, boolean_type_node, u, t);
       u = build1 (GOTO_EXPR, void_type_node, lab_false);
       t = build3 (COND_EXPR, void_type_node, t, u, NULL_TREE);
@@ -5581,12 +5584,11 @@ rs6000_gimplify_va_arg (tree valist, tre
 
       t = sav;
       if (sav_ofs)
-	t = build2 (PLUS_EXPR, ptr_type_node, sav, build_int_2 (sav_ofs, 0));
+	t = build2 (PLUS_EXPR, ptr_type_node, sav, size_int (sav_ofs));
 
-      u = build2 (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg,
-		 build_int_2 (n_reg, 0));
+      u = build2 (POSTINCREMENT_EXPR, TREE_TYPE (reg), reg, size_int (n_reg));
       u = build1 (CONVERT_EXPR, integer_type_node, u);
-      u = build2 (MULT_EXPR, integer_type_node, u, build_int_2 (sav_scale, 0));
+      u = build2 (MULT_EXPR, integer_type_node, u, size_int (sav_scale));
       t = build2 (PLUS_EXPR, ptr_type_node, t, u);
 
       t = build2 (MODIFY_EXPR, void_type_node, addr, t);
@@ -5602,7 +5604,7 @@ rs6000_gimplify_va_arg (tree valist, tre
 	{
 	  /* Ensure that we don't find any more args in regs.
 	     Alignment has taken care of the n_reg == 2 case.  */
-	  t = build (MODIFY_EXPR, TREE_TYPE (reg), reg, build_int_2 (8, 0));
+	  t = build (MODIFY_EXPR, TREE_TYPE (reg), reg, size_int (8));
 	  gimplify_and_add (t, pre_p);
 	}
     }
@@ -5613,7 +5615,7 @@ rs6000_gimplify_va_arg (tree valist, tre
   t = ovf;
   if (align != 1)
     {
-      t = build2 (PLUS_EXPR, TREE_TYPE (t), t, build_int_2 (align - 1, 0));
+      t = build2 (PLUS_EXPR, TREE_TYPE (t), t, size_int (align - 1));
       t = build2 (BIT_AND_EXPR, TREE_TYPE (t), t, build_int_2 (-align, -1));
     }
   gimplify_expr (&t, pre_p, NULL, is_gimple_val, fb_rvalue);
@@ -5621,7 +5623,7 @@ rs6000_gimplify_va_arg (tree valist, tre
   u = build2 (MODIFY_EXPR, void_type_node, addr, t);
   gimplify_and_add (u, pre_p);
 
-  t = build2 (PLUS_EXPR, TREE_TYPE (t), t, build_int_2 (size, 0));
+  t = build2 (PLUS_EXPR, TREE_TYPE (t), t, size_int (size));
   t = build2 (MODIFY_EXPR, TREE_TYPE (ovf), ovf, t);
   gimplify_and_add (t, pre_p);
 


* Re: [PATCH, committed] SFmode arg padding and va_arg cleanup
  2004-07-14 21:21 [PATCH, committed] SFmode arg padding and va_arg cleanup David Edelsohn
@ 2004-07-26 20:30 ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-07-26 20:30 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Wed, Jul 14, 2004 at 11:01:09AM -0400, David Edelsohn wrote:
> --- rs6000.c	13 Jul 2004 07:45:04 -0000	1.665
> +++ rs6000.c	14 Jul 2004 13:59:47 -0000
> @@ -4593,6 +4593,10 @@ function_arg_padding (enum machine_mode 
>  	return upward;
>      }
>  
> +  /* SFmode parameters are not padded.  */
> +  if (TARGET_64BIT && mode == SFmode)
> +    return none;
> +
>    /* Fall back to the default.  */
>    return DEFAULT_FUNCTION_ARG_PADDING (mode, type);
>  }

Hmm, shouldn't this be

  /* SFmode parameters are padded upwards.  */
  if (mode == SFmode)
    return upward;

We really do have padding on a ppc64 SFmode arg, so returning "none" is
confusing, and I suspect will result in locate_and_pad_parm calculating
a wrong "sizetree".

It's also not necessary to test for TARGET_64BIT since the return value
of function_arg_padding only matters when in fact there is padding.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: [PATCH, committed] SFmode arg padding and va_arg cleanup
       [not found]           ` <amodra@bigpond.net.au>
                               ` (16 preceding siblings ...)
  2004-05-26 21:18             ` David Edelsohn
@ 2004-07-26 22:22             ` David Edelsohn
  2004-07-28 11:26               ` Alan Modra
  2004-07-28 12:17             ` David Edelsohn
                               ` (43 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-07-26 22:22 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> Hmm, shouldn't this be

Alan> /* SFmode parameters are padded upwards.  */
Alan> if (mode == SFmode)
Alan> return upward;

Alan> We really do have padding on a ppc64 SFmode arg, so returning "none" is
Alan> confusing, and I suspect will result in locate_and_pad_parm calculating
Alan> a wrong "sizetree".

Alan> It's also not necessary to test for TARGET_64BIT since the return value
Alan> of function_arg_padding only matters when in fact there is padding.

	The PPC64 ABI says:

* Each argument is mapped to as many doublewords of the parameter save
  area as are required to hold its value.
  - Single precision floating point values are mapped to the first word
    in a single doubleword.
  - Double precision floating point values are mapped to a single
    doubleword.

but the 64-bit AIX ABI says:

  - Single-precision floating values are mapped to a single word.

  - Double-precision floating values are mapped to two consecutive words.

I guess that one can interpret the AIX ABI as meaning "a single word in a
doubleword."  We should make sure that "first word" means the right thing.

	Also, there are two separate complex ABI problems -- one in 32-bit
mode and one in 64-bit mode.  I do not know if they are related.

David


* Re: [PATCH, committed] SFmode arg padding and va_arg cleanup
  2004-07-26 22:22             ` [PATCH, committed] SFmode arg padding and va_arg cleanup David Edelsohn
@ 2004-07-28 11:26               ` Alan Modra
  2004-07-28 12:11                 ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-07-28 11:26 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

Applied mainline after testing powerpc64-linux.

	* config/rs6000/rs6000.c (function_arg_padding): Pad SFmode upwards.

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.671
retrieving revision 1.672
diff -u -p -r1.671 -r1.672
--- gcc/config/rs6000/rs6000.c	23 Jul 2004 04:35:17 -0000	1.671
+++ gcc/config/rs6000/rs6000.c	28 Jul 2004 00:56:47 -0000	1.672
@@ -4594,9 +4594,9 @@ function_arg_padding (enum machine_mode 
 	return upward;
     }
 
-  /* SFmode parameters are not padded.  */
-  if (TARGET_64BIT && mode == SFmode)
-    return none;
+  /* SFmode parameters are padded upwards.  */
+  if (mode == SFmode)
+    return upward;
 
   /* Fall back to the default.  */
   return DEFAULT_FUNCTION_ARG_PADDING (mode, type);

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: [PATCH, committed] SFmode arg padding and va_arg cleanup
  2004-07-28 11:26               ` Alan Modra
@ 2004-07-28 12:11                 ` Alan Modra
  2004-08-06  7:58                   ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-07-28 12:11 UTC (permalink / raw)
  To: David Edelsohn, gcc-patches

On Wed, Jul 28, 2004 at 11:55:47AM +0930, Alan Modra wrote:
> Applied mainline after testing powerpc64-linux.
> 
> 	* config/rs6000/rs6000.c (function_arg_padding): Pad SFmode upwards.

I'd like to apply this to the 3.4 branch too.  OK?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: [PATCH, committed] SFmode arg padding and va_arg cleanup
       [not found]           ` <amodra@bigpond.net.au>
                               ` (17 preceding siblings ...)
  2004-07-26 22:22             ` [PATCH, committed] SFmode arg padding and va_arg cleanup David Edelsohn
@ 2004-07-28 12:17             ` David Edelsohn
  2004-08-06 14:53             ` David Edelsohn
                               ` (42 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-07-28 12:17 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> I'd like to apply this to the 3.4 branch too.  OK?

	Yes, please.

Thanks, David


* Re: [PATCH, committed] SFmode arg padding and va_arg cleanup
  2004-07-28 12:11                 ` Alan Modra
@ 2004-08-06  7:58                   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-06  7:58 UTC (permalink / raw)
  To: David Edelsohn, gcc-patches

On Wed, Jul 28, 2004 at 12:07:02PM +0930, Alan Modra wrote:
> On Wed, Jul 28, 2004 at 11:55:47AM +0930, Alan Modra wrote:
> > Applied mainline after testing powerpc64-linux.
> > 
> > 	* config/rs6000/rs6000.c (function_arg_padding): Pad SFmode upwards.
> 
> I'd like to apply this to the 3.4 branch too.  OK?

I don't know what happened during my testing, but clearly I didn't test
the source I actually committed.  Regressions all over the place.  :-(
Sorry, I think this patch needs reverting.

As I said in http://gcc.gnu.org/ml/gcc-patches/2004-07/msg02370.html,
there's a problem with returning "none" from function_arg_padding too,
with the size of float args being calculated incorrectly.  The following
testcase shows va_arg looking in the wrong place for an arg that follows
a float, if we return "none".  It also happens to show the current
breakage when returning "upward".

int foo (float f1, ...)
{
  __builtin_va_list ap;

  if (f1 != 1.0f)
    return 1;

  __builtin_va_start (ap, f1);
  if (__builtin_va_arg (ap, int) != 2)
    return 2;

  __builtin_va_end (ap);
  return 0;
}

int main (void)
{
  return foo (1.0f, 2);
}

There is also a problem in that returning "none" doesn't actually do the
right thing according to the current ABI:  SFmode, when passed in gprs
as well as fprs (eg. non-prototyped function) is passed in the least
significant end of the gpr, which corresponds to the second word of a
doubleword in the parameter save area.

So, I'm inclined to think that both patches should be reverted, and we
modify the ppc64 ABI to state that floats are passed in the second word
of a doubleword.  ie. in the word with the higher address.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: [PATCH, committed] SFmode arg padding and va_arg cleanup
       [not found]           ` <amodra@bigpond.net.au>
                               ` (18 preceding siblings ...)
  2004-07-28 12:17             ` David Edelsohn
@ 2004-08-06 14:53             ` David Edelsohn
  2004-08-11 14:47             ` powerpc64 fixes missing from 3.4 branch David Edelsohn
                               ` (41 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-08-06 14:53 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> So, I'm inclined to think that both patches should be reverted, and we
Alan> modify the ppc64 ABI to state that floats are passed in the second word
Alan> of a doubleword.  ie. in the word with the higher address.

	Go ahead and revert the patch from GCC 3.4.

	We need to consider this more carefully if GCC never is going to
be able to conform to that part of the ABI.

David


* powerpc64 linux dot symbols
@ 2004-08-10 11:55 Alan Modra
  2004-08-10 12:37 ` Joseph S. Myers
                   ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-10 11:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: dje, mendell, sjmunroe, gilliam, yaari, klausner

This is the gcc part of my patch to binutils+gcc to remove use of "dot"
symbols on PowerPC64 Linux function entry points.  The main reason for
making this change is to remove a user visible PowerPC64 ELF wart:
Functions with names like "data" and "dynamic" won't compile, since
the associated dot symbols, ".data" and ".dynamic" clash with ELF
section names.  A secondary reason is that reducing the number of global
symbols speeds application load time, because ld.so's symbol table is
somewhat smaller.


So, what's different?

Old

function call:		bl .foo
descriptor sym:		foo, type=object, size=24
entry sym:		.foo, type=function, size=<function code size>
descriptor:		.quad .foo, .TOC.@tocbase, 0

New

function call:		bl foo
descriptor sym:		foo, type=function, size=<function code size>
entry sym:		.LEfoo local sym (won't appear in object)
descriptor:		.quad .LEfoo, .TOC.@tocbase, 0

Note that old descriptor syms have been given type "function" by the
linker in executables and dynamic libraries for quite some time.
(see http://sources.redhat.com/ml/binutils/2004-03/msg00550.html)
So for tools like FDPR that process binaries, the net change in
executables is that dot symbols are missing, and function size is
attached to a different symbol.

Of course, this change has implications in other parts of the
toolchain.  For instance, I need to make some changes to objdump,
as .text tends to be one large block of code without any symbols.
gdb likely needs some work too, as it will need to look up function
descriptors to find code addresses.  glibc is probably OK, although I
haven't yet compiled it with the new compiler.  One very fortunate
property of old dynamic libraries is that they don't make references
to external dot symbols, so there shouldn't be any change needed in
ld.so.


Testing of the following patch against mainline hasn't completed yet,
but a virtually identical patch against 3.4 has survived bootstrap and
regression testing on powerpc64-linux, with both old and new linkers.

	* config/rs6000/linux64.h (DOT_SYMBOLS): Define.
	(CRT_CALL_STATIC_FUNCTION): Define !DOT_SYMBOLS version.
	(ASM_DECLARE_FUNCTION_SIZE): Modify for !DOT_SYMBOLS.
	(ASM_OUTPUT_SOURCE_LINE, DBX_OUTPUT_BRAC, DBX_OUTPUT_NFUN): Likewise.
	* config/rs6000/rs6000-protos.h (rs6000_output_function_entry): Decl.
	* config/rs6000/rs6000.c (rs6000_output_function_entry): New function,
	modified for !DOT_SYMBOLS..
	(print_operand <case 'z'>): ..extracted from here.
	(rs6000_assemble_visibility): Modify for !DOT_SYMBOLS.
	(rs6000_output_function_epilogue): Likewise.
	(rs6000_elf_declare_function_name): Likewise.
	* config/rs6000/rs6000.h (DOT_SYMBOLS): Define.
	(ASM_WEAKEN_DECL, ASM_OUTPUT_DEF_FROM_DECLS): Modify for !DOT_SYMBOLS.
	* configure.ac (HAVE_LD_NO_DOT_SYMS): Add new AC_DEFINE.
	* configure: Regenerate.
	* config.in: Regenerate.

Index: gcc/configure.ac
===================================================================
RCS file: /cvs/gcc/gcc/gcc/configure.ac,v
retrieving revision 2.50
diff -u -p -r2.50 configure.ac
--- gcc/configure.ac	23 Jul 2004 06:59:34 -0000	2.50
+++ gcc/configure.ac	9 Aug 2004 23:52:02 -0000
@@ -2857,6 +2857,47 @@ if test x"$gcc_cv_ld_as_needed" = xyes; 
 [Define if your linker supports --as-needed and --no-as-needed options.])
 fi
 
+case "$target" in
+  powerpc64*-*-linux*)
+    AC_CACHE_CHECK(linker support for omitting dot symbols,
+    gcc_cv_ld_no_dot_syms,
+    [gcc_cv_ld_no_dot_syms=no
+    if test $in_tree_ld = yes ; then
+      if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 16 -o "$gcc_cv_gld_major_version" -gt 2; then
+        gcc_cv_ld_no_dot_syms=yes
+      fi
+    elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
+      cat > conftest1.s <<EOF
+	.text
+	bl foo
+EOF
+      cat > conftest2.s <<EOF
+	.section ".opd","aw"
+	.align 3
+	.globl foo
+	.type foo,@function
+foo:
+	.quad .LEfoo,.TOC.@tocbase,0
+	.text
+.LEfoo:
+	blr
+	.size foo,.-.LEfoo
+EOF
+      if $gcc_cv_as -o conftest1.o conftest1.s > /dev/null 2>&1 \
+         && $gcc_cv_as -o conftest2.o conftest2.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -o conftest conftest1.o conftest2.o > /dev/null 2>&1; then
+        gcc_cv_ld_no_dot_syms=yes
+      fi
+      rm -f conftest conftest1.o conftest2.o conftest1.s conftest2.s
+    fi
+    ])
+    if test x"$gcc_cv_ld_no_dot_syms" = xyes; then
+      AC_DEFINE(HAVE_LD_NO_DOT_SYMS, 1,
+    [Define if your PowerPC64 linker only needs function descriptor syms.])
+    fi
+    ;;
+esac
+
 if test x$with_sysroot = x && test x$host = x$target \
    && test "$prefix" != "/usr" && test "x$prefix" != "x$local_prefix" ; then
   AC_DEFINE_UNQUOTED(PREFIX_INCLUDE_DIR, "$prefix/include",
Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.62
diff -u -p -r1.62 linux64.h
--- gcc/config/rs6000/linux64.h	13 Jul 2004 07:45:04 -0000	1.62
+++ gcc/config/rs6000/linux64.h	9 Aug 2004 23:52:06 -0000
@@ -50,6 +50,12 @@
 #undef	TARGET_AIX
 #define	TARGET_AIX TARGET_64BIT
 
+#ifdef HAVE_LD_NO_DOT_SYMS
+/* New ABI uses a local sym for the function entry point.  */
+#undef DOT_SYMBOLS
+#define DOT_SYMBOLS 0
+#endif
+
 #undef PROCESSOR_DEFAULT64
 #define PROCESSOR_DEFAULT64 PROCESSOR_PPC630
 
@@ -386,11 +398,19 @@
    object files, each potentially with a different TOC pointer.  For
    that reason, place a nop after the call so that the linker can
    restore the TOC pointer if a TOC adjusting call stub is needed.  */
+#if DOT_SYMBOLS
 #define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
   asm (SECTION_OP "\n"					\
 "	bl ." #FUNC "\n"				\
 "	nop\n"						\
 "	.previous");
+#else
+#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
+  asm (SECTION_OP "\n"					\
+"	bl " #FUNC "\n"					\
+"	nop\n"						\
+"	.previous");
+#endif
 #endif
 
 /* FP save and restore routines.  */
@@ -415,13 +435,11 @@
       if (!flag_inhibit_size_directive)					\
 	{								\
 	  fputs ("\t.size\t", (FILE));					\
-	  if (TARGET_64BIT)						\
+	  if (TARGET_64BIT && DOT_SYMBOLS)				\
 	    putc ('.', (FILE));						\
 	  assemble_name ((FILE), (FNAME));				\
 	  fputs (",.-", (FILE));					\
-	  if (TARGET_64BIT)						\
-	    putc ('.', (FILE));						\
-	  assemble_name ((FILE), (FNAME));				\
+	  rs6000_output_function_entry (FILE, FNAME);			\
 	  putc ('\n', (FILE));						\
 	}								\
     }									\
@@ -465,14 +483,13 @@
 do									\
   {									\
     char temp[256];							\
+    const char *s;							\
     ASM_GENERATE_INTERNAL_LABEL (temp, "LM", COUNTER);			\
     fprintf (FILE, "\t.stabn 68,0,%d,", LINE);				\
     assemble_name (FILE, temp);						\
     putc ('-', FILE);							\
-    if (TARGET_64BIT)							\
-      putc ('.', FILE);							\
-    assemble_name (FILE,						\
-		   XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0));\
+    s = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);		\
+    rs6000_output_function_entry (FILE, s);				\
     putc ('\n', FILE);							\
     (*targetm.asm_out.internal_label) (FILE, "LM", COUNTER);		\
   }									\
@@ -482,19 +499,20 @@ while (0)
 #define DBX_OUTPUT_BRAC(FILE, NAME, BRAC) \
   do									\
     {									\
-      const char *flab;							\
+      const char *s;							\
       fprintf (FILE, "%s%d,0,0,", ASM_STABN_OP, BRAC);			\
       assemble_name (FILE, NAME);					\
       putc ('-', FILE);							\
       if (current_function_func_begin_label != NULL_TREE)		\
-	flab = IDENTIFIER_POINTER (current_function_func_begin_label);	\
+	{								\
+	  s = IDENTIFIER_POINTER (current_function_func_begin_label);	\
+	  assemble_name (FILE, s);					\
+	}								\
       else								\
 	{								\
-	  if (TARGET_64BIT)						\
-	    putc ('.', FILE);						\
-	  flab = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);	\
+	  s = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);	\
+	  rs6000_output_function_entry (FILE, s);			\
 	}								\
-      assemble_name (FILE, flab);					\
       putc ('\n', FILE);						\
     }									\
   while (0)
@@ -506,12 +524,12 @@ while (0)
 #define	DBX_OUTPUT_NFUN(FILE, LSCOPE, DECL)				\
   do									\
     {									\
+      const char *s;							\
       fprintf (FILE, "%s\"\",%d,0,0,", ASM_STABS_OP, N_FUN);		\
       assemble_name (FILE, LSCOPE);					\
       putc ('-', FILE);							\
-      if (TARGET_64BIT)							\
-        putc ('.', FILE);						\
-      assemble_name (FILE, XSTR (XEXP (DECL_RTL (DECL), 0), 0));	\
+      s = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);		\
+      rs6000_output_function_entry (FILE, s);				\
       putc ('\n', FILE);						\
     }									\
   while (0)
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000-protos.h,v
retrieving revision 1.85
diff -u -p -r1.85 rs6000-protos.h
--- gcc/config/rs6000/rs6000-protos.h	28 Jul 2004 12:13:13 -0000	1.85
+++ gcc/config/rs6000/rs6000-protos.h	9 Aug 2004 23:52:06 -0000
@@ -110,6 +110,7 @@ extern enum reg_class secondary_reload_c
 extern int ccr_bit (rtx, int);
 extern int extract_MB (rtx);
 extern int extract_ME (rtx);
+extern void rs6000_output_function_entry (FILE *, const char *);
 extern void print_operand (FILE *, rtx, int);
 extern void print_operand_address (FILE *, rtx);
 extern enum rtx_code rs6000_reverse_condition (enum machine_mode,
Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.675
diff -u -p -r1.675 rs6000.c
--- gcc/config/rs6000/rs6000.c	2 Aug 2004 01:46:39 -0000	1.675
+++ gcc/config/rs6000/rs6000.c	9 Aug 2004 23:52:15 -0000
@@ -9543,6 +9539,36 @@ rs6000_get_some_local_dynamic_name_1 (rt
   return 0;
 }
 
+/* Write out a function code label.  */
+
+void
+rs6000_output_function_entry (FILE *file, const char *fname)
+{
+  if (fname[0] != '.')
+    {
+      switch (DEFAULT_ABI)
+	{
+	default:
+	  abort ();
+
+	case ABI_AIX:
+	  if (DOT_SYMBOLS)
+	    putc ('.', file);
+	  else
+	    ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LE");
+	  break;
+
+	case ABI_V4:
+	case ABI_DARWIN:
+	  break;
+	}
+    }
+  if (TARGET_AIX)
+    RS6000_OUTPUT_BASENAME (file, fname);
+  else
+    assemble_name (file, fname);
+}
+
 /* Print an operand.  Recognize special options, documented below.  */
 
 #if TARGET_ELF
@@ -10075,23 +10101,7 @@ print_operand (FILE *file, rtx x, int co
       if (SYMBOL_REF_DECL (x))
         mark_decl_referenced (SYMBOL_REF_DECL (x));
 
-      if (XSTR (x, 0)[0] != '.')
-	{
-	  switch (DEFAULT_ABI)
-	    {
-	    default:
-	      abort ();
-
-	    case ABI_AIX:
-	      putc ('.', file);
-	      break;
-
-	    case ABI_V4:
-	    case ABI_DARWIN:
-	      break;
-	    }
-	}
-      /* For macho, we need to check it see if we need a stub.  */
+      /* For macho, check to see if we need a stub.  */
       if (TARGET_MACHO)
 	{
 	  const char *name = XSTR (x, 0);
@@ -10102,10 +10112,10 @@ print_operand (FILE *file, rtx x, int co
 #endif
 	  assemble_name (file, name);
 	}
-     else if (TARGET_AIX)
-	RS6000_OUTPUT_BASENAME (file, XSTR (x, 0));
-      else
+      else if (!DOT_SYMBOLS)
 	assemble_name (file, XSTR (x, 0));
+      else
+	rs6000_output_function_entry (file, XSTR (x, 0));
       return;
 
     case 'Z':
@@ -10361,7 +10371,9 @@ rs6000_assemble_visibility (tree decl, i
 {
   /* Functions need to have their entry point symbol visibility set as
      well as their descriptor symbol visibility.  */
-  if (DEFAULT_ABI == ABI_AIX && TREE_CODE (decl) == FUNCTION_DECL)
+  if (DEFAULT_ABI == ABI_AIX
+      && DOT_SYMBOLS
+      && TREE_CODE (decl) == FUNCTION_DECL)
     {
       static const char * const visibility_types[] = {
         NULL, "internal", "hidden", "protected"
@@ -13686,17 +13698,12 @@ rs6000_output_function_epilogue (FILE *f
       /* Offset from start of code to tb table.  */
       fputs ("\t.long ", file);
       ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LT");
-#if TARGET_AIX
-      RS6000_OUTPUT_BASENAME (file, fname);
-#else
-      assemble_name (file, fname);
-#endif
-      fputs ("-.", file);
-#if TARGET_AIX
-      RS6000_OUTPUT_BASENAME (file, fname);
-#else
-      assemble_name (file, fname);
-#endif
+      if (TARGET_AIX)
+	RS6000_OUTPUT_BASENAME (file, fname);
+      else
+	assemble_name (file, fname);
+      putc ('-', file);
+      rs6000_output_function_entry (file, fname);
       putc ('\n', file);
 
       /* Interrupt handler mask.  */
@@ -16262,22 +16269,27 @@ rs6000_elf_declare_function_name (FILE *
       fputs ("\t.section\t\".opd\",\"aw\"\n\t.align 3\n", file);
       ASM_OUTPUT_LABEL (file, name);
       fputs (DOUBLE_INT_ASM_OP, file);
-      putc ('.', file);
-      assemble_name (file, name);
-      fputs (",.TOC.@tocbase,0\n\t.previous\n\t.size\t", file);
-      assemble_name (file, name);
-      fputs (",24\n\t.type\t.", file);
-      assemble_name (file, name);
-      fputs (",@function\n", file);
-      if (TREE_PUBLIC (decl) && ! DECL_WEAK (decl))
+      rs6000_output_function_entry (file, name);
+      fputs (",.TOC.@tocbase,0\n\t.previous\n", file);
+      if (DOT_SYMBOLS)
 	{
-	  fputs ("\t.globl\t.", file);
+	  fputs ("\t.size\t", file);
 	  assemble_name (file, name);
-	  putc ('\n', file);
+	  fputs (",24\n\t.type\t.", file);
+	  assemble_name (file, name);
+	  fputs (",@function\n", file);
+	  if (TREE_PUBLIC (decl) && ! DECL_WEAK (decl))
+	    {
+	      fputs ("\t.globl\t.", file);
+	      assemble_name (file, name);
+	      putc ('\n', file);
+	    }
 	}
+      else
+	ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "function");
       ASM_DECLARE_RESULT (file, DECL_RESULT (decl));
-      putc ('.', file);
-      ASM_OUTPUT_LABEL (file, name);
+      rs6000_output_function_entry (file, name);
+      fputs (":\n", file);
       return;
     }
 
Index: gcc/config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.329
diff -u -p -r1.329 rs6000.h
--- gcc/config/rs6000/rs6000.h	16 Jul 2004 23:25:47 -0000	1.329
+++ gcc/config/rs6000/rs6000.h	9 Aug 2004 23:52:17 -0000
@@ -40,6 +40,10 @@
 #define TARGET_AIX 0
 #endif
 
+/* Control whether function entry points use a "dot" symbol when
+   ABI_AIX.  */
+#define DOT_SYMBOLS 1
+
 /* Default string to use for cpu if not specified.  */
 #ifndef TARGET_CPU_DEFAULT
 #define TARGET_CPU_DEFAULT ((char *)0)
@@ -2233,9 +2237,9 @@ extern int toc_initialized;
   do									\
     {									\
       fputs ("\t.weak\t", (FILE));					\
-      RS6000_OUTPUT_BASENAME ((FILE), (NAME)); 			\
+      RS6000_OUTPUT_BASENAME ((FILE), (NAME)); 				\
       if ((DECL) && TREE_CODE (DECL) == FUNCTION_DECL			\
-	  && DEFAULT_ABI == ABI_AIX)					\
+	  && DEFAULT_ABI == ABI_AIX && DOT_SYMBOLS)			\
 	{								\
 	  if (TARGET_XCOFF)						\
 	    fputs ("[DS]", (FILE));					\
@@ -2247,7 +2251,7 @@ extern int toc_initialized;
 	{								\
 	  ASM_OUTPUT_DEF ((FILE), (NAME), (VAL));			\
 	  if ((DECL) && TREE_CODE (DECL) == FUNCTION_DECL		\
-	      && DEFAULT_ABI == ABI_AIX)				\
+	      && DEFAULT_ABI == ABI_AIX && DOT_SYMBOLS)			\
 	    {								\
 	      fputs ("\t.set\t.", (FILE));				\
 	      RS6000_OUTPUT_BASENAME ((FILE), (NAME));			\
@@ -2268,7 +2272,7 @@ extern int toc_initialized;
       const char *alias = XSTR (XEXP (DECL_RTL (DECL), 0), 0);		\
       const char *name = IDENTIFIER_POINTER (TARGET);			\
       if (TREE_CODE (DECL) == FUNCTION_DECL				\
-	  && DEFAULT_ABI == ABI_AIX)					\
+	  && DEFAULT_ABI == ABI_AIX && DOT_SYMBOLS)			\
 	{								\
 	  if (TREE_PUBLIC (DECL))					\
 	    {								\

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: powerpc64 linux dot symbols
  2004-08-10 11:55 powerpc64 linux dot symbols Alan Modra
@ 2004-08-10 12:37 ` Joseph S. Myers
  2004-08-10 13:34   ` Alan Modra
  2004-08-13 13:04 ` Jakub Jelinek
  2004-08-16 11:49 ` Jakub Jelinek
  2 siblings, 1 reply; 875+ messages in thread
From: Joseph S. Myers @ 2004-08-10 12:37 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, dje, mendell, sjmunroe, gilliam, yaari, klausner

On Tue, 10 Aug 2004, Alan Modra wrote:

> This is the gcc part of my patch to binutils+gcc to remove use of "dot"
> symbols on PowerPC64 Linux function entry points.  The main reason for
> making this change is to remove a user visible PowerPC64 ELF wart:
> Functions with names like "data" and "dynamic" won't compile, since
> the associated dot symbols, ".data" and ".dynamic" clash with ELF
> section names.  A secondary reason is that reducing the number of global

This (names in user namespace that have caused problems; not necessarily 
just "data" and "dynamic", if there are or have been other such names on 
this or other platforms) seems like something that should have a testcase 
added.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)


* Re: powerpc64 linux dot symbols
  2004-08-10 12:37 ` Joseph S. Myers
@ 2004-08-10 13:34   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-10 13:34 UTC (permalink / raw)
  To: Joseph S. Myers
  Cc: gcc-patches, dje, mendell, sjmunroe, gilliam, yaari, klausner

On Tue, Aug 10, 2004 at 10:53:30AM +0000, Joseph S. Myers wrote:
> On Tue, 10 Aug 2004, Alan Modra wrote:
> 
> > This is the gcc part of my patch to binutils+gcc to remove use of "dot"
> > symbols on PowerPC64 Linux function entry points.  The main reason for
> > making this change is to remove a user visible PowerPC64 ELF wart:
> > Functions with names like "data" and "dynamic" won't compile, since
> > the associated dot symbols, ".data" and ".dynamic" clash with ELF
> > section names.  A secondary reason is that reducing the number of global
> 
> This (names in user namespace that have caused problems; not necessarily 
> just "data" and "dynamic", if there are or have been other such names on 
> this or other platforms) seems like something that should have a testcase 
> added.

The list of names is enormous.  It's not just commonly used ELF section
names (minus the dot), but also commonly used local labels.  So for
example, with the current ppc64 compiler you'll run into trouble with
functions called "L0", "L1", ..., "LC0", "LCTOC0", "LCFI0", etc.  (grep
the GCC source for ASM_GENERATE_INTERNAL_LABEL!)

We do already have one test, gcc.dg/debug/20020327-1.c, currently
disabled for powerpc64.  Which reminds me:  I probably should have the
preprocessor define something so testsuites can cope with differently
generated code.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* powerpc64 fixes missing from 3.4 branch
@ 2004-08-11 11:13 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-11 11:13 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

Hi David,
  http://gcc.gnu.org/ml/gcc-patches/2004-06/msg02298.html hasn't been
applied to the gcc-3.4 branch, probably because it was okayed too close
to the 3.4.1 release.  OK to apply?  Bootstrapped etc. powerpc64-linux.
See http://gcc.gnu.org/ml/gcc-patches/2004-05/msg00665.html for blurb on
what it does.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: powerpc64 fixes missing from 3.4 branch
       [not found]           ` <amodra@bigpond.net.au>
                               ` (19 preceding siblings ...)
  2004-08-06 14:53             ` David Edelsohn
@ 2004-08-11 14:47             ` David Edelsohn
  2004-08-20 22:35             ` powerpc64 linux dot symbols David Edelsohn
                               ` (40 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-08-11 14:47 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> http://gcc.gnu.org/ml/gcc-patches/2004-06/msg02298.html hasn't been
Alan> applied to the gcc-3.4 branch, probably because it was okayed too close
Alan> to the 3.4.1 release.  OK to apply?  Bootstrapped etc. powerpc64-linux.
Alan> See http://gcc.gnu.org/ml/gcc-patches/2004-05/msg00665.html for blurb on
Alan> what it does.

	Yes, please apply to 3.4 branch.

Thanks, David


* Re: powerpc64 linux dot symbols
  2004-08-10 11:55 powerpc64 linux dot symbols Alan Modra
  2004-08-10 12:37 ` Joseph S. Myers
@ 2004-08-13 13:04 ` Jakub Jelinek
  2004-08-13 13:58   ` Alan Modra
  2004-08-16 11:49 ` Jakub Jelinek
  2 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2004-08-13 13:04 UTC (permalink / raw)
  To: gcc-patches, dje, mendell, sjmunroe, gilliam, yaari, klausner

On Tue, Aug 10, 2004 at 08:01:41PM +0930, Alan Modra wrote:
> --- gcc/configure.ac	23 Jul 2004 06:59:34 -0000	2.50
> +++ gcc/configure.ac	9 Aug 2004 23:52:02 -0000
> @@ -2857,6 +2857,47 @@ if test x"$gcc_cv_ld_as_needed" = xyes; 
>  [Define if your linker supports --as-needed and --no-as-needed options.])
>  fi
>  
> +case "$target" in
> +  powerpc64*-*-linux*)
> +    AC_CACHE_CHECK(linker support for omitting dot symbols,
> +    gcc_cv_ld_no_dot_syms,
> +    [gcc_cv_ld_no_dot_syms=no
> +    if test $in_tree_ld = yes ; then
> +      if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 16 -o "$gcc_cv_gld_major_version" -gt 2; then
> +        gcc_cv_ld_no_dot_syms=yes
> +      fi
> +    elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
> +      cat > conftest1.s <<EOF
> +	.text
> +	bl foo
> +EOF
> +      cat > conftest2.s <<EOF
> +	.section ".opd","aw"
> +	.align 3
> +	.globl foo
> +	.type foo,@function
> +foo:
> +	.quad .LEfoo,.TOC.@tocbase,0
> +	.text
> +.LEfoo:
> +	blr
> +	.size foo,.-.LEfoo
> +EOF
> +      if $gcc_cv_as -o conftest1.o conftest1.s > /dev/null 2>&1 \
> +         && $gcc_cv_as -o conftest2.o conftest2.s > /dev/null 2>&1 \
> +         && $gcc_cv_ld -o conftest conftest1.o conftest2.o > /dev/null 2>&1; then
> +        gcc_cv_ld_no_dot_syms=yes
> +      fi

The configure test is bad, as it signals gcc_cv_ld_no_dot_syms yes even
if the linker doesn't support omitting the dot symbols.
With a current linker it will branch to .LEfoo; with an old linker it will
happily jump into the .opd section.
What you perhaps could do is if it successfully links:
&& $gcc_cv_ld -melf64ppc -o conftest conftest1.o conftest2.o > /dev/null 2>&1
&& $gcc_cv_objdump -d -j .text conftest | grep '48 00 00 05' > /dev/null; then

(.LEfoo will immediately follow bl foo, so if binutils includes your
patchset this will be a branch to the next instruction, while without it
it will branch somewhere else).

Also, $gcc_cv_as should probably use -a64.

	Jakub


* Re: powerpc64 linux dot symbols
  2004-08-13 13:04 ` Jakub Jelinek
@ 2004-08-13 13:58   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-13 13:58 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, dje, mendell, sjmunroe, gilliam, yaari, klausner

On Fri, Aug 13, 2004 at 08:38:47AM -0400, Jakub Jelinek wrote:
> On Tue, Aug 10, 2004 at 08:01:41PM +0930, Alan Modra wrote:
> > --- gcc/configure.ac	23 Jul 2004 06:59:34 -0000	2.50
> > +++ gcc/configure.ac	9 Aug 2004 23:52:02 -0000
> > @@ -2857,6 +2857,47 @@ if test x"$gcc_cv_ld_as_needed" = xyes; 
> >  [Define if your linker supports --as-needed and --no-as-needed options.])
> >  fi
> >  
> > +case "$target" in
> > +  powerpc64*-*-linux*)
> > +    AC_CACHE_CHECK(linker support for omitting dot symbols,
> > +    gcc_cv_ld_no_dot_syms,
> > +    [gcc_cv_ld_no_dot_syms=no
> > +    if test $in_tree_ld = yes ; then
> > +      if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 16 -o "$gcc_cv_gld_major_version" -gt 2; then
> > +        gcc_cv_ld_no_dot_syms=yes
> > +      fi
> > +    elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
> > +      cat > conftest1.s <<EOF
> > +	.text
> > +	bl foo
> > +EOF
> > +      cat > conftest2.s <<EOF
> > +	.section ".opd","aw"
> > +	.align 3
> > +	.globl foo
> > +	.type foo,@function
> > +foo:
> > +	.quad .LEfoo,.TOC.@tocbase,0
> > +	.text
> > +.LEfoo:
> > +	blr
> > +	.size foo,.-.LEfoo
> > +EOF
> > +      if $gcc_cv_as -o conftest1.o conftest1.s > /dev/null 2>&1 \
> > +         && $gcc_cv_as -o conftest2.o conftest2.s > /dev/null 2>&1 \
> > +         && $gcc_cv_ld -o conftest conftest1.o conftest2.o > /dev/null 2>&1; then
> > +        gcc_cv_ld_no_dot_syms=yes
> > +      fi
> 
> The configure test is bad, as it signals gcc_cv_ld_no_dot_syms yes even
> if the linker doesn't support omitting the dot symbols.
> With a current linker it will branch to .LEfoo; with an old linker it will
> happily jump into the .opd section.

Duh.  Use "bl .foo" instead.

> What you perhaps could do is if it successfully links:
> && $gcc_cv_ld -melf64ppc -o conftest conftest1.o conftest2.o > /dev/null 2>&1
> && $gcc_cv_objdump -d -j .text conftest | grep '48 00 00 05' > /dev/null; then
> 
> (.LEfoo will immediately follow bl foo, so if binutils includes your
> patchset this will be a branch to the next instruction, while without it
> it will branch somewhere else).
> 
> Also, $gcc_cv_as should probably use -a64.

Yes, good idea, and ld should use -melf64ppc.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: powerpc64 linux dot symbols
  2004-08-10 11:55 powerpc64 linux dot symbols Alan Modra
  2004-08-10 12:37 ` Joseph S. Myers
  2004-08-13 13:04 ` Jakub Jelinek
@ 2004-08-16 11:49 ` Jakub Jelinek
  2004-08-17  0:43   ` Alan Modra
  2 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2004-08-16 11:49 UTC (permalink / raw)
  To: gcc-patches, dje, mendell, sjmunroe, gilliam, yaari, klausner

On Tue, Aug 10, 2004 at 08:01:41PM +0930, Alan Modra wrote:
> New
> 
> function call:		bl foo
> descriptor sym:		foo, type=function, size=<function code size>
> entry sym:		.LEfoo local sym (won't appear in object)
> descriptor:		.quad .LEfoo, .TOC.@tocbase, 0

IMHO we shouldn't be using .LEfoo, because GCC uses e.g. .LEHB*,
.LEHE*, .LECIE*, .LEFDE*, .LELTP* etc. labels.
This would mean that we are in trouble with functions like
CIE10 (), B26 () etc.
We could use e.g. .LYfoo, .LE.foo, etc.

> +	    ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LE");

ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LY");
or
ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LE.");

	Jakub


* Re: powerpc64 linux dot symbols
  2004-08-16 11:49 ` Jakub Jelinek
@ 2004-08-17  0:43   ` Alan Modra
  2004-08-20  5:54     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-08-17  0:43 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, dje, mendell, sjmunroe, gilliam, yaari, klausner

On Mon, Aug 16, 2004 at 07:46:46AM -0400, Jakub Jelinek wrote:
> On Tue, Aug 10, 2004 at 08:01:41PM +0930, Alan Modra wrote:
> > New
> > 
> > function call:		bl foo
> > descriptor sym:		foo, type=function, size=<function code size>
> > entry sym:		.LEfoo local sym (won't appear in object)
> > descriptor:		.quad .LEfoo, .TOC.@tocbase, 0
> 
> IMHO we shouldn't be using .LEfoo, because GCC uses e.g. .LEHB*,
> .LEHE*, .LECIE*, .LEFDE*, .LELTP* etc. labels.
> This would mean that we are in trouble with functions like
> CIE10 (), B26 () etc.
> We could use e.g. .LYfoo, .LE.foo, etc.

Yes, you're right.  Thanks for the correction.  I'll incorporate this
change into my tree.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: powerpc64 linux dot symbols
  2004-08-17  0:43   ` Alan Modra
@ 2004-08-20  5:54     ` Alan Modra
  2004-08-21 11:58       ` Jakub Jelinek
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-08-20  5:54 UTC (permalink / raw)
  To: gcc-patches, dje

Here is the current patch.  This incorporates Jakub's fixes, and makes
dot-symbol output selectable via -mcall-aixdesc and -mcall-linux.

	* config/rs6000/linux64.h (DOT_SYMBOLS): Define.
	(CRT_CALL_STATIC_FUNCTION): Define !DOT_SYMBOLS version.
	(ASM_DECLARE_FUNCTION_SIZE): Modify for !DOT_SYMBOLS.
	(ASM_OUTPUT_SOURCE_LINE, DBX_OUTPUT_BRAC, DBX_OUTPUT_NFUN): Likewise.
	(RS6000_ABI_NAME): Define as "linux".
	(SUBSUBTARGET_OVERRIDE_OPTIONS): Set dot_symbols.
	* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Select
	ABI_AIX when rs6000_abi_name is "linux" and TARGET_64BIT.
	* config/rs6000/rs6000-protos.h (rs6000_output_function_entry): Decl.
	* config/rs6000/rs6000.c (dot_symbols): New global var.
	(rs6000_output_function_entry): New function, modified for
	!DOT_SYMBOLS..
	(print_operand <case 'z'>): ..extracted from here.
	(rs6000_assemble_visibility): Modify for !DOT_SYMBOLS.
	(rs6000_output_function_epilogue): Likewise.
	(rs6000_elf_declare_function_name): Likewise.
	* config/rs6000/rs6000.h (DOT_SYMBOLS): Define.
	(ASM_WEAKEN_DECL, ASM_OUTPUT_DEF_FROM_DECLS): Modify for !DOT_SYMBOLS.
	* configure.ac (HAVE_LD_NO_DOT_SYMS): Add new AC_DEFINE.
	* configure: Regenerate.
	* config.in: Regenerate.

diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/linux64.h gcc-dot/gcc/config/rs6000/linux64.h
--- gcc-virgin/gcc/config/rs6000/linux64.h	2004-07-26 16:31:41.000000000 +0930
+++ gcc-dot/gcc/config/rs6000/linux64.h	2004-08-20 13:40:39.926927090 +0930
@@ -50,6 +50,13 @@
 #undef	TARGET_AIX
 #define	TARGET_AIX TARGET_64BIT
 
+#ifdef HAVE_LD_NO_DOT_SYMS
+/* New ABI uses a local sym for the function entry point.  */
+extern int dot_symbols;
+#undef DOT_SYMBOLS
+#define DOT_SYMBOLS dot_symbols
+#endif
+
 #undef PROCESSOR_DEFAULT64
 #define PROCESSOR_DEFAULT64 PROCESSOR_PPC630
 
@@ -57,7 +64,7 @@
 #define	TARGET_RELOCATABLE (!TARGET_64BIT && (target_flags & MASK_RELOCATABLE))
 
 #undef	RS6000_ABI_NAME
-#define	RS6000_ABI_NAME (TARGET_64BIT ? "aixdesc" : "sysv")
+#define	RS6000_ABI_NAME "linux"
 
 #define INVALID_64BIT "-m%s not supported in this configuration"
 #define INVALID_32BIT INVALID_64BIT
@@ -75,6 +82,7 @@
 	      rs6000_current_abi = ABI_AIX;			\
 	      error (INVALID_64BIT, "call");			\
 	    }							\
+	  dot_symbols = !strcmp (rs6000_abi_name, "aixdesc");	\
 	  if (target_flags & MASK_RELOCATABLE)			\
 	    {							\
 	      target_flags &= ~MASK_RELOCATABLE;		\
@@ -392,11 +400,19 @@
    object files, each potentially with a different TOC pointer.  For
    that reason, place a nop after the call so that the linker can
    restore the TOC pointer if a TOC adjusting call stub is needed.  */
+#if DOT_SYMBOLS
 #define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
   asm (SECTION_OP "\n"					\
 "	bl ." #FUNC "\n"				\
 "	nop\n"						\
 "	.previous");
+#else
+#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
+  asm (SECTION_OP "\n"					\
+"	bl " #FUNC "\n"					\
+"	nop\n"						\
+"	.previous");
+#endif
 #endif
 
 /* FP save and restore routines.  */
@@ -421,13 +437,11 @@
       if (!flag_inhibit_size_directive)					\
 	{								\
 	  fputs ("\t.size\t", (FILE));					\
-	  if (TARGET_64BIT)						\
+	  if (TARGET_64BIT && DOT_SYMBOLS)				\
 	    putc ('.', (FILE));						\
 	  assemble_name ((FILE), (FNAME));				\
 	  fputs (",.-", (FILE));					\
-	  if (TARGET_64BIT)						\
-	    putc ('.', (FILE));						\
-	  assemble_name ((FILE), (FNAME));				\
+	  rs6000_output_function_entry (FILE, FNAME);			\
 	  putc ('\n', (FILE));						\
 	}								\
     }									\
@@ -471,14 +485,13 @@
 do									\
   {									\
     char temp[256];							\
+    const char *s;							\
     ASM_GENERATE_INTERNAL_LABEL (temp, "LM", COUNTER);			\
     fprintf (FILE, "\t.stabn 68,0,%d,", LINE);				\
     assemble_name (FILE, temp);						\
     putc ('-', FILE);							\
-    if (TARGET_64BIT)							\
-      putc ('.', FILE);							\
-    assemble_name (FILE,						\
-		   XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0));\
+    s = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);		\
+    rs6000_output_function_entry (FILE, s);				\
     putc ('\n', FILE);							\
     (*targetm.asm_out.internal_label) (FILE, "LM", COUNTER);		\
   }									\
@@ -488,19 +501,20 @@ while (0)
 #define DBX_OUTPUT_BRAC(FILE, NAME, BRAC) \
   do									\
     {									\
-      const char *flab;							\
+      const char *s;							\
       fprintf (FILE, "%s%d,0,0,", ASM_STABN_OP, BRAC);			\
       assemble_name (FILE, NAME);					\
       putc ('-', FILE);							\
       if (current_function_func_begin_label != NULL_TREE)		\
-	flab = IDENTIFIER_POINTER (current_function_func_begin_label);	\
+	{								\
+	  s = IDENTIFIER_POINTER (current_function_func_begin_label);	\
+	  assemble_name (FILE, s);					\
+	}								\
       else								\
 	{								\
-	  if (TARGET_64BIT)						\
-	    putc ('.', FILE);						\
-	  flab = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);	\
+	  s = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);	\
+	  rs6000_output_function_entry (FILE, s);			\
 	}								\
-      assemble_name (FILE, flab);					\
       putc ('\n', FILE);						\
     }									\
   while (0)
@@ -512,12 +526,12 @@ while (0)
 #define	DBX_OUTPUT_NFUN(FILE, LSCOPE, DECL)				\
   do									\
     {									\
+      const char *s;							\
       fprintf (FILE, "%s\"\",%d,0,0,", ASM_STABS_OP, N_FUN);		\
       assemble_name (FILE, LSCOPE);					\
       putc ('-', FILE);							\
-      if (TARGET_64BIT)							\
-        putc ('.', FILE);						\
-      assemble_name (FILE, XSTR (XEXP (DECL_RTL (DECL), 0), 0));	\
+      s = XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0);		\
+      rs6000_output_function_entry (FILE, s);				\
       putc ('\n', FILE);						\
     }									\
   while (0)
diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000-protos.h gcc-dot/gcc/config/rs6000/rs6000-protos.h
--- gcc-virgin/gcc/config/rs6000/rs6000-protos.h	2004-08-19 10:06:36.000000000 +0930
+++ gcc-dot/gcc/config/rs6000/rs6000-protos.h	2004-08-20 12:04:51.358068189 +0930
@@ -113,6 +113,7 @@ extern enum reg_class secondary_reload_c
 extern int ccr_bit (rtx, int);
 extern int extract_MB (rtx);
 extern int extract_ME (rtx);
+extern void rs6000_output_function_entry (FILE *, const char *);
 extern void print_operand (FILE *, rtx, int);
 extern void print_operand_address (FILE *, rtx);
 extern enum rtx_code rs6000_reverse_condition (enum machine_mode,
diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-dot/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-08-20 11:19:55.187713346 +0930
+++ gcc-dot/gcc/config/rs6000/rs6000.c	2004-08-20 13:40:14.825913549 +0930
@@ -213,6 +213,9 @@ enum rs6000_abi rs6000_current_abi;
 /* ABI string from -mabi= option.  */
 const char *rs6000_abi_string;
 
+/* Whether to use variant of AIX ABI for PowerPC64 Linux.  */
+int dot_symbols;
+
 /* Debug flags */
 const char *rs6000_debug_name;
 int rs6000_debug_stack;		/* debug stack applications */
@@ -9791,6 +9794,36 @@ rs6000_get_some_local_dynamic_name_1 (rt
   return 0;
 }
 
+/* Write out a function code label.  */
+
+void
+rs6000_output_function_entry (FILE *file, const char *fname)
+{
+  if (fname[0] != '.')
+    {
+      switch (DEFAULT_ABI)
+	{
+	default:
+	  abort ();
+
+	case ABI_AIX:
+	  if (DOT_SYMBOLS)
+	    putc ('.', file);
+	  else
+	    ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "L.");
+	  break;
+
+	case ABI_V4:
+	case ABI_DARWIN:
+	  break;
+	}
+    }
+  if (TARGET_AIX)
+    RS6000_OUTPUT_BASENAME (file, fname);
+  else
+    assemble_name (file, fname);
+}
+
 /* Print an operand.  Recognize special options, documented below.  */
 
 #if TARGET_ELF
@@ -10323,23 +10356,7 @@ print_operand (FILE *file, rtx x, int co
       if (SYMBOL_REF_DECL (x))
         mark_decl_referenced (SYMBOL_REF_DECL (x));
 
-      if (XSTR (x, 0)[0] != '.')
-	{
-	  switch (DEFAULT_ABI)
-	    {
-	    default:
-	      abort ();
-
-	    case ABI_AIX:
-	      putc ('.', file);
-	      break;
-
-	    case ABI_V4:
-	    case ABI_DARWIN:
-	      break;
-	    }
-	}
-      /* For macho, we need to check it see if we need a stub.  */
+      /* For macho, check to see if we need a stub.  */
       if (TARGET_MACHO)
 	{
 	  const char *name = XSTR (x, 0);
@@ -10350,10 +10367,10 @@ print_operand (FILE *file, rtx x, int co
 #endif
 	  assemble_name (file, name);
 	}
-     else if (TARGET_AIX)
-	RS6000_OUTPUT_BASENAME (file, XSTR (x, 0));
-      else
+      else if (!DOT_SYMBOLS)
 	assemble_name (file, XSTR (x, 0));
+      else
+	rs6000_output_function_entry (file, XSTR (x, 0));
       return;
 
     case 'Z':
@@ -10609,7 +10626,9 @@ rs6000_assemble_visibility (tree decl, i
 {
   /* Functions need to have their entry point symbol visibility set as
      well as their descriptor symbol visibility.  */
-  if (DEFAULT_ABI == ABI_AIX && TREE_CODE (decl) == FUNCTION_DECL)
+  if (DEFAULT_ABI == ABI_AIX
+      && DOT_SYMBOLS
+      && TREE_CODE (decl) == FUNCTION_DECL)
     {
       static const char * const visibility_types[] = {
         NULL, "internal", "hidden", "protected"
@@ -14219,17 +14238,12 @@ rs6000_output_function_epilogue (FILE *f
       /* Offset from start of code to tb table.  */
       fputs ("\t.long ", file);
       ASM_OUTPUT_INTERNAL_LABEL_PREFIX (file, "LT");
-#if TARGET_AIX
-      RS6000_OUTPUT_BASENAME (file, fname);
-#else
-      assemble_name (file, fname);
-#endif
-      fputs ("-.", file);
-#if TARGET_AIX
-      RS6000_OUTPUT_BASENAME (file, fname);
-#else
-      assemble_name (file, fname);
-#endif
+      if (TARGET_AIX)
+	RS6000_OUTPUT_BASENAME (file, fname);
+      else
+	assemble_name (file, fname);
+      putc ('-', file);
+      rs6000_output_function_entry (file, fname);
       putc ('\n', file);
 
       /* Interrupt handler mask.  */
@@ -16802,22 +16816,27 @@ rs6000_elf_declare_function_name (FILE *
       fputs ("\t.section\t\".opd\",\"aw\"\n\t.align 3\n", file);
       ASM_OUTPUT_LABEL (file, name);
       fputs (DOUBLE_INT_ASM_OP, file);
-      putc ('.', file);
-      assemble_name (file, name);
-      fputs (",.TOC.@tocbase,0\n\t.previous\n\t.size\t", file);
-      assemble_name (file, name);
-      fputs (",24\n\t.type\t.", file);
-      assemble_name (file, name);
-      fputs (",@function\n", file);
-      if (TREE_PUBLIC (decl) && ! DECL_WEAK (decl))
+      rs6000_output_function_entry (file, name);
+      fputs (",.TOC.@tocbase,0\n\t.previous\n", file);
+      if (DOT_SYMBOLS)
 	{
-	  fputs ("\t.globl\t.", file);
+	  fputs ("\t.size\t", file);
 	  assemble_name (file, name);
-	  putc ('\n', file);
+	  fputs (",24\n\t.type\t.", file);
+	  assemble_name (file, name);
+	  fputs (",@function\n", file);
+	  if (TREE_PUBLIC (decl) && ! DECL_WEAK (decl))
+	    {
+	      fputs ("\t.globl\t.", file);
+	      assemble_name (file, name);
+	      putc ('\n', file);
+	    }
 	}
+      else
+	ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "function");
       ASM_DECLARE_RESULT (file, DECL_RESULT (decl));
-      putc ('.', file);
-      ASM_OUTPUT_LABEL (file, name);
+      rs6000_output_function_entry (file, name);
+      fputs (":\n", file);
       return;
     }
 
diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.h gcc-dot/gcc/config/rs6000/rs6000.h
--- gcc-virgin/gcc/config/rs6000/rs6000.h	2004-08-19 10:06:39.000000000 +0930
+++ gcc-dot/gcc/config/rs6000/rs6000.h	2004-08-20 12:04:52.645863795 +0930
@@ -40,6 +40,10 @@
 #define TARGET_AIX 0
 #endif
 
+/* Control whether function entry points use a "dot" symbol when
+   ABI_AIX.  */
+#define DOT_SYMBOLS 1
+
 /* Default string to use for cpu if not specified.  */
 #ifndef TARGET_CPU_DEFAULT
 #define TARGET_CPU_DEFAULT ((char *)0)
@@ -2246,9 +2250,9 @@ extern int toc_initialized;
   do									\
     {									\
       fputs ("\t.weak\t", (FILE));					\
-      RS6000_OUTPUT_BASENAME ((FILE), (NAME)); 			\
+      RS6000_OUTPUT_BASENAME ((FILE), (NAME)); 				\
       if ((DECL) && TREE_CODE (DECL) == FUNCTION_DECL			\
-	  && DEFAULT_ABI == ABI_AIX)					\
+	  && DEFAULT_ABI == ABI_AIX && DOT_SYMBOLS)			\
 	{								\
 	  if (TARGET_XCOFF)						\
 	    fputs ("[DS]", (FILE));					\
@@ -2260,7 +2264,7 @@ extern int toc_initialized;
 	{								\
 	  ASM_OUTPUT_DEF ((FILE), (NAME), (VAL));			\
 	  if ((DECL) && TREE_CODE (DECL) == FUNCTION_DECL		\
-	      && DEFAULT_ABI == ABI_AIX)				\
+	      && DEFAULT_ABI == ABI_AIX && DOT_SYMBOLS)			\
 	    {								\
 	      fputs ("\t.set\t.", (FILE));				\
 	      RS6000_OUTPUT_BASENAME ((FILE), (NAME));			\
@@ -2281,7 +2285,7 @@ extern int toc_initialized;
       const char *alias = XSTR (XEXP (DECL_RTL (DECL), 0), 0);		\
       const char *name = IDENTIFIER_POINTER (TARGET);			\
       if (TREE_CODE (DECL) == FUNCTION_DECL				\
-	  && DEFAULT_ABI == ABI_AIX)					\
+	  && DEFAULT_ABI == ABI_AIX && DOT_SYMBOLS)			\
 	{								\
 	  if (TREE_PUBLIC (DECL))					\
 	    {								\
diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/sysv4.h gcc-dot/gcc/config/rs6000/sysv4.h
--- gcc-virgin/gcc/config/rs6000/sysv4.h	2004-05-07 11:27:14.000000000 +0930
+++ gcc-dot/gcc/config/rs6000/sysv4.h	2004-08-20 13:40:42.091583305 +0930
@@ -194,7 +194,12 @@ do {									\
   else if (!strcmp (rs6000_abi_name, "freebsd"))			\
     rs6000_current_abi = ABI_V4;					\
   else if (!strcmp (rs6000_abi_name, "linux"))				\
-    rs6000_current_abi = ABI_V4;					\
+    {									\
+      if (TARGET_64BIT)							\
+	rs6000_current_abi = ABI_AIX;					\
+      else								\
+	rs6000_current_abi = ABI_V4;					\
+    }									\
   else if (!strcmp (rs6000_abi_name, "gnu"))				\
     rs6000_current_abi = ABI_V4;					\
   else if (!strcmp (rs6000_abi_name, "netbsd"))				\
diff -urp -xCVS -x'*~' gcc-virgin/gcc/configure.ac gcc-dot/gcc/configure.ac
--- gcc-virgin/gcc/configure.ac	2004-08-18 12:52:46.000000000 +0930
+++ gcc-dot/gcc/configure.ac	2004-08-20 12:04:28.534690610 +0930
@@ -2845,6 +2845,47 @@ if test x"$gcc_cv_ld_as_needed" = xyes; 
 [Define if your linker supports --as-needed and --no-as-needed options.])
 fi
 
+case "$target" in
+  powerpc64*-*-linux*)
+    AC_CACHE_CHECK(linker support for omitting dot symbols,
+    gcc_cv_ld_no_dot_syms,
+    [gcc_cv_ld_no_dot_syms=no
+    if test $in_tree_ld = yes ; then
+      if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 16 -o "$gcc_cv_gld_major_version" -gt 2; then
+        gcc_cv_ld_no_dot_syms=yes
+      fi
+    elif test x$gcc_cv_as != x -a x$gcc_cv_ld != x ; then
+      cat > conftest1.s <<EOF
+	.text
+	bl .foo
+EOF
+      cat > conftest2.s <<EOF
+	.section ".opd","aw"
+	.align 3
+	.globl foo
+	.type foo,@function
+foo:
+	.quad .LEfoo,.TOC.@tocbase,0
+	.text
+.LEfoo:
+	blr
+	.size foo,.-.LEfoo
+EOF
+      if $gcc_cv_as -a64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
+         && $gcc_cv_as -a64 -o conftest2.o conftest2.s > /dev/null 2>&1 \
+         && $gcc_cv_ld -melf64ppc -o conftest conftest1.o conftest2.o > /dev/null 2>&1; then
+        gcc_cv_ld_no_dot_syms=yes
+      fi
+      rm -f conftest conftest1.o conftest2.o conftest1.s conftest2.s
+    fi
+    ])
+    if test x"$gcc_cv_ld_no_dot_syms" = xyes; then
+      AC_DEFINE(HAVE_LD_NO_DOT_SYMS, 1,
+    [Define if your PowerPC64 linker only needs function descriptor syms.])
+    fi
+    ;;
+esac
+
 if test x$with_sysroot = x && test x$host = x$target \
    && test "$prefix" != "/usr" && test "x$prefix" != "x$local_prefix" ; then
   AC_DEFINE_UNQUOTED(PREFIX_INCLUDE_DIR, "$prefix/include",

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc64 linux dot symbols
       [not found]           ` <amodra@bigpond.net.au>
                               ` (20 preceding siblings ...)
  2004-08-11 14:47             ` powerpc64 fixes missing from 3.4 branch David Edelsohn
@ 2004-08-20 22:35             ` David Edelsohn
  2004-08-25 23:56             ` RS6000 fix pr16480 David Edelsohn
                               ` (39 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-08-20 22:35 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/linux64.h (DOT_SYMBOLS): Define.
	(CRT_CALL_STATIC_FUNCTION): Define !DOT_SYMBOLS version.
	(ASM_DECLARE_FUNCTION_SIZE): Modify for !DOT_SYMBOLS.
	(ASM_OUTPUT_SOURCE_LINE, DBX_OUTPUT_BRAC, DBX_OUTPUT_NFUN): Likewise.
	(RS6000_ABI_NAME): Define as "linux".
	(SUBSUBTARGET_OVERRIDE_OPTIONS): Set dot_symbols.
	* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Select
	ABI_AIX when rs6000_abi_name is "linux" and TARGET_64BIT.
	* config/rs6000/rs6000-protos.h (rs6000_output_function_entry): Decl.
	* config/rs6000/rs6000.c (dot_symbols): New global var.
	(rs6000_output_function_entry): New function, modified for
	!DOT_SYMBOLS..
	(print_operand <case 'z'>): ..extracted from here.
	(rs6000_assemble_visibility): Modify for !DOT_SYMBOLS.
	(rs6000_output_function_epilogue): Likewise.
	(rs6000_elf_declare_function_name): Likewise.
	* config/rs6000/rs6000.h (DOT_SYMBOLS): Define.
	(ASM_WEAKEN_DECL, ASM_OUTPUT_DEF_FROM_DECLS): Modify for !DOT_SYMBOLS.
	* configure.ac (HAVE_LD_NO_DOT_SYMS): Add new AC_DEFINE.
	* configure: Regenerate.
	* config.in: Regenerate.

This is okay.

	Did you consider defining DOT_SYMBOLS to 0 in rs6000.h and
re-defining DOT_SYMBOLS to 1 in aix.h?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc64 linux dot symbols
  2004-08-20  5:54     ` Alan Modra
@ 2004-08-21 11:58       ` Jakub Jelinek
  2004-08-21 15:39         ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2004-08-21 11:58 UTC (permalink / raw)
  To: gcc-patches, dje

On Fri, Aug 20, 2004 at 02:33:00PM +0930, Alan Modra wrote:
> +#ifdef HAVE_LD_NO_DOT_SYMS
> +/* New ABI uses a local sym for the function entry point.  */
> +extern int dot_symbols;
> +#undef DOT_SYMBOLS
> +#define DOT_SYMBOLS dot_symbols
> +#endif
> +

> +#if DOT_SYMBOLS
>  #define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
>    asm (SECTION_OP "\n"					\
>  "	bl ." #FUNC "\n"				\
>  "	nop\n"						\
>  "	.previous");
> +#else
> +#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
> +  asm (SECTION_OP "\n"					\
> +"	bl " #FUNC "\n"					\
> +"	nop\n"						\
> +"	.previous");
> +#endif

This is not going to work right: after the change that makes DOT_SYMBOLS
selectable at run time, DOT_SYMBOLS is no longer a constant suitable
for #if.
Maybe #ifndef HAVE_LD_NO_DOT_SYMS instead of #if DOT_SYMBOLS, or
predefining some macro in the compiler depending on whether dot_symbols
is false or true and using it here, could do the job.
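
A minimal sketch of the first alternative, where the choice is keyed on
the configure result and so is made entirely at preprocessing time.
CALL_INSN_TEXT and call_insn_for_frob are hypothetical names used only
for illustration, and HAVE_LD_NO_DOT_SYMS is defined by hand here to
stand in for what configure would put in config.h:

```c
/* Assumption: pretend configure found a linker that does not need dot
   symbols.  Remove this line to model an older linker.  */
#define HAVE_LD_NO_DOT_SYMS 1

/* Build the branch instruction text at preprocessing time.  The test is
   on the configure macro, not on the run-time DOT_SYMBOLS value.  */
#ifndef HAVE_LD_NO_DOT_SYMS
#define CALL_INSN_TEXT(FUNC) "\tbl ." #FUNC "\n"
#else
#define CALL_INSN_TEXT(FUNC) "\tbl " #FUNC "\n"
#endif

/* Hypothetical helper: returns the text that would be emitted for a
   call to a function named frob.  */
const char *
call_insn_for_frob (void)
{
  return CALL_INSN_TEXT (frob);
}
```

With the define in place the helper yields a call to the plain symbol;
without it, the call goes to the dot symbol.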

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc64 linux dot symbols
  2004-08-21 11:58       ` Jakub Jelinek
@ 2004-08-21 15:39         ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-21 15:39 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, dje

On Sat, Aug 21, 2004 at 03:55:25AM -0400, Jakub Jelinek wrote:
> This is not going to work right, after the change to make DOT_SYMBOLS
> runtime selectable DOT_SYMBOLS is no longer a constant suitable
> for #if.
> Maybe #ifndef HAVE_LD_NO_DOT_SYMS instead of #if DOT_SYMBOLS
> or predefining some macro depending on whether dot_symbols is false or true
> in the compiler and using it here could do the job.

Actually, #ifndef HAVE_LD_NO_DOT_SYMS behaves exactly the same as
#if DOT_SYMBOLS here.  The following 

#define DOT_SYMBOLS dot_symbols

#if DOT_SYMBOLS
use_dot_symbol_version
#else
do_not_use_dot_symbol_version
#endif

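will result in do_not_use_dot_symbol_version, because an identifier that
is still left in a #if expression after macro expansion (here
dot_symbols) evaluates to 0.  Admittedly, relying on this particular C
pre-processor behaviour doesn't make the code very clear.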
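
The behaviour can be checked with a small stand-alone sketch.  The
variable below only stands in for the real dot_symbols flag, and
SELECTED_VERSION and selected_version are hypothetical names for
illustration:

```c
/* Stand-in for the run-time flag from the patch.  Its value is
   invisible to the preprocessor, even though it is set to 1 here.  */
int dot_symbols = 1;

#define DOT_SYMBOLS dot_symbols

/* In a #if expression, any identifier left over after macro expansion
   is replaced by 0.  DOT_SYMBOLS expands to dot_symbols, which is not
   a macro, so the condition is false and the #else branch is kept.  */
#if DOT_SYMBOLS
#define SELECTED_VERSION "use_dot_symbol_version"
#else
#define SELECTED_VERSION "do_not_use_dot_symbol_version"
#endif

/* Report which branch the preprocessor selected.  */
const char *
selected_version (void)
{
  return SELECTED_VERSION;
}
```

The #else branch is selected regardless of the value stored in the
variable, which is why the two spellings of the test end up equivalent
when HAVE_LD_NO_DOT_SYMS is defined.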

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* RS6000 fix pr16480
@ 2004-08-25 14:28 Alan Modra
  2004-08-25 15:45 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-08-25 14:28 UTC (permalink / raw)
  To: gcc-patches

This fixes an abort on a memory address that rs6000_split_multireg_move
wasn't expecting, a (mem/i:DI (symbol_ref ..)), which is allowed by
legitimate_small_data_p.  In looking over other addresses allowed by
rs6000_legitimate_address, I think we also need to look inside a LO_SUM
for a clash with the destination reg.

	PR target/16480
	* config/rs6000/rs6000.c (rs6000_split_multireg_move): Don't abort
	on "(mem (symbol_ref ..))" rtl.  Look at LO_SUM base regs as well
	as PLUS base regs.

Bootstrap and regression test on powerpc-linux in progress.  To fix this
properly on gcc-3.4, I think we also should back-port the
offsettable_memref_p change to rs6000_split_multireg_move.

diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-08-25 13:08:14.909251153 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.c	2004-08-25 21:13:33.405845840 +0930
@@ -11352,18 +11353,14 @@ rs6000_split_multireg_move (rtx dst, rtx
 	      src = newsrc;
 	    }
 
-	  /* We have now address involving an base register only.
-	     If we use one of the registers to address memory,
-	     we have change that register last.  */
-
-	  breg = (GET_CODE (XEXP (src, 0)) == PLUS
-		  ? XEXP (XEXP (src, 0), 0)
-		  : XEXP (src, 0));
-
-	  if (!REG_P (breg))
-	      abort();
-
-	  if (REGNO (breg) >= REGNO (dst)
+	  breg = XEXP (src, 0);
+	  if (GET_CODE (breg) == PLUS || GET_CODE (breg) == LO_SUM)
+	    breg = XEXP (breg, 0);
+
+	  /* If the base register we are using to address memory is
+	     also a destination reg, then change that register last.  */
+	  if (REG_P (breg)
+	      && REGNO (breg) >= REGNO (dst)
 	      && REGNO (breg) < REGNO (dst) + nregs)
 	    j = REGNO (breg) - REGNO (dst);
         }

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: RS6000 fix pr16480
  2004-08-25 14:28 RS6000 fix pr16480 Alan Modra
@ 2004-08-25 15:45 ` David Edelsohn
  2004-08-25 23:09   ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-08-25 15:45 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

	Is removing the REG_P test and abort really safe?  We've gone from

	if (!REG_P (breg))
	  abort();
	if (REGNO (breg) >= REGNO (dst)
	...

to

	if (REG_P (breg)
	    && REGNO (breg) >= REGNO (dst)
	...

Either the REG_P test now is redundant or we silently will generate
incorrect code if we happen not to have a REG.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: RS6000 fix pr16480
  2004-08-25 15:45 ` David Edelsohn
@ 2004-08-25 23:09   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-25 23:09 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Wed, Aug 25, 2004 at 11:27:47AM -0400, David Edelsohn wrote:
> 	Is removing the REG_P test and abort really safe?  We've gone from
> 
> 	if (!REG_P (breg))
> 	  abort();
> 	if (REGNO (breg) >= REGNO (dst)
> 	...
> 
> to
> 
> 	if (REG_P (breg)
> 	    && REGNO (breg) >= REGNO (dst)
> 	...
> 
> Either the REG_P test now is redundant or we silently will generate
> incorrect code if we happen not to have a REG.

I should have made it clear in my patch submission:  I didn't just
simply remove the abort to work around a problem.  Instead, I looked at
all the types of address that rs6000_legitimate_address allows.  The
REG_P test is necessary, and I think the rs6000_split_multireg_move code
now will handle all the addresses that it needs to.  (*)

These are the addresses that rs6000_legitimate_address allows:
a) (reg)
b) (pre_{inc,dec} (reg))
c) (symbol_ref)
d) (const (plus (symbol_ref) (const_int)))
e) (plus (reg) (..))
f) (lo_sum (reg) (..))

(a) via legitimate_indirect_address_p, (c) and (d) via
legitimate_small_data_p, (e) via legitimate_constant_pool_address_p,
stack offsets, rs6000_legitimate_offset_address_p and
legitimate_indexed_address_p, (f) via legitimate_lo_sum_address_p.

*) With the possible exception of indexed addressing modes.  We test for
the base being used also as a destination reg, but what of the index
reg?  What prevents an index from also being used as a destination?
Oh, I see.  The other parts of the expression where
legitimate_indexed_address_p is called in rs6000_legitimate_address,
exclude modes that might need multiple regs.
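
As a rough sketch of the classification above (a toy model, not the
actual rtl code): the types and names below are hypothetical, and
pre_{inc,dec} addresses are left out on the assumption that they have
already been rewritten to a plain (reg) address before this point.

```c
/* Toy address forms, mirroring the list above.  */
enum toy_addr_kind
{
  TOY_REG,        /* (reg) */
  TOY_SYMBOL_REF, /* (symbol_ref) */
  TOY_CONST,      /* (const (plus (symbol_ref) (const_int))) */
  TOY_PLUS,       /* (plus (reg) (..)) */
  TOY_LO_SUM      /* (lo_sum (reg) (..)) */
};

struct toy_addr
{
  enum toy_addr_kind kind;
  int base_regno;   /* only meaningful for REG, PLUS and LO_SUM */
};

/* Convenience constructor for the toy address.  */
struct toy_addr
toy_addr_make (enum toy_addr_kind kind, int base_regno)
{
  struct toy_addr a;
  a.kind = kind;
  a.base_regno = base_regno;
  return a;
}

/* Return the index, within the destination register group, of the
   register that is also the address base, or -1 if there is no base
   register or it does not overlap the destination.  */
int
toy_base_overlap_index (struct toy_addr addr, int dst_regno, int nregs)
{
  switch (addr.kind)
    {
    case TOY_REG:
    case TOY_PLUS:
    case TOY_LO_SUM:
      if (addr.base_regno >= dst_regno
          && addr.base_regno < dst_regno + nregs)
        return addr.base_regno - dst_regno;
      return -1;

    case TOY_SYMBOL_REF:
    case TOY_CONST:
    default:
      /* No register in the address, so nothing can clash.  */
      return -1;
    }
}
```

Here the symbol_ref and const forms simply fall through to -1, which
corresponds to the REG_P test failing rather than to an abort.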

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: RS6000 fix pr16480
       [not found]           ` <amodra@bigpond.net.au>
                               ` (21 preceding siblings ...)
  2004-08-20 22:35             ` powerpc64 linux dot symbols David Edelsohn
@ 2004-08-25 23:56             ` David Edelsohn
  2004-08-26  1:21               ` Alan Modra
  2004-08-26  1:30             ` David Edelsohn
                               ` (38 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-08-25 23:56 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> I should have made it clear in my patch submission:  I didn't just
Alan> simply remove the abort to work around a problem.  Instead, I looked at
Alan> all the types of address that rs6000_legitimate_address allows.  The
Alan> REG_P test is necessary, and I think the rs6000_split_multireg_move code
Alan> now will handle all the addresses that it needs to.  (*)

	I think we're not communicating.  I was not suggesting that the
removal of abort was gratuitous.

	My question is whether it truly is okay to follow the codepath
that you changed and leave j=-1.  Other paths leave j=-1, but the path you
changed used to assert that it had discovered a REG and would set j or it
had not found a REG and would abort.

	As far as I can understand, this would mean that SRC is

(mem (symbol_ref))  or  (mem (const (plus (symbol_ref) (const_int))))

and DST is a GPR.

	So your patch changes the behavior for LO_SUM, SYMBOL_REF, and
CONST, which previously would have aborted.  Now it sets j for the former
and leaves j=-1 for the latter two.

	I was not sure from the patch if that is what you intended.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: RS6000 fix pr16480
  2004-08-25 23:56             ` RS6000 fix pr16480 David Edelsohn
@ 2004-08-26  1:21               ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-08-26  1:21 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Wed, Aug 25, 2004 at 07:44:55PM -0400, David Edelsohn wrote:
> 	So your patch changes the behavior for LO_SUM, SYMBOL_REF, and
> CONST, which previously would have aborted.

Yes.

>  Now it sets j for the former
> and leaves j=-1 for the latter two.

The whole point of setting j to something other than -1 is to handle the
case where we have a base register that is also a destination register.
That can only happen if we find a REG somewhere in the mem address.  My
previous email enumerates the types of address rtl that we might see
here.
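
To illustrate why the base register has to be written last, here is a
toy simulation (not compiler code).  The register file, memory and
function names are hypothetical, and the loop is only assumed to
resemble the subword ordering used when splitting the move:

```c
enum { TOY_NUM_REGS = 8, TOY_MEM_WORDS = 16 };

/* Toy machine state: a register file and word-addressed memory.  */
static long toy_regs[TOY_NUM_REGS];
static long toy_mem[TOY_MEM_WORDS];

/* Load NREGS consecutive words from the address held in register BREG
   into registers DST .. DST+NREGS-1.  J is -1 when the base register
   is not among the destinations; otherwise it is BREG - DST.  */
static void
toy_load_multi (int dst, int nregs, int breg, int j)
{
  int i;

  for (i = 0; i < nregs; i++)
    {
      /* Advance to the next subword, wrapping around, so that the
         subword at the initial index J (the base) is handled last.  */
      ++j;
      if (j == nregs)
        j = 0;
      toy_regs[dst + j] = toy_mem[toy_regs[breg] + j];
    }
}

/* Use r4 as both the base and the first destination of a two-word
   load.  Returns 1 if both destination registers end up correct.  */
int
toy_demo (int j)
{
  int i;

  for (i = 0; i < TOY_NUM_REGS; i++)
    toy_regs[i] = 0;
  for (i = 0; i < TOY_MEM_WORDS; i++)
    toy_mem[i] = 0;

  toy_regs[4] = 2;   /* base address */
  toy_mem[2] = 7;    /* first word */
  toy_mem[3] = 9;    /* second word */

  toy_load_multi (4, 2, 4, j);
  return toy_regs[4] == 7 && toy_regs[5] == 9;
}
```

Calling toy_demo (0) loads r5 first and r4 last, so both words arrive
intact.  Calling toy_demo (-1) overwrites the base in the first step and
then reads the second word from the wrong address.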

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: RS6000 fix pr16480
       [not found]           ` <amodra@bigpond.net.au>
                               ` (22 preceding siblings ...)
  2004-08-25 23:56             ` RS6000 fix pr16480 David Edelsohn
@ 2004-08-26  1:30             ` David Edelsohn
  2004-11-10  4:48             ` fix pr 16480 on gcc-3.4 David Edelsohn
                               ` (37 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-08-26  1:30 UTC (permalink / raw)
  To: gcc-patches

	The patch is okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Release RTL bodies after compilation (sometimes)
@ 2004-09-02 18:53 Jan Hubicka
  2004-09-02 18:56 ` Jakub Jelinek
                   ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-02 18:53 UTC (permalink / raw)
  To: gcc-patches, rth



Hi,

This patch saves about 3MB of garbage on an -O0 compilation of combine.c
by explicitly freeing some annotations and CFG edges.  Additionally, it
makes RTL function bodies get released again after compilation in the
majority of cases.  This reduces peak memory usage from 9MB to 5MB, and
on Gerald's testcase I go from 21 garbage collection runs to 18.  This
can still be improved significantly by follow-up patches that I broke
out because they are slightly more controversial.

On the gcc modules compilation test I didn't measure any actual
performance difference, but for Gerald's testcase it is about 6%.

Bootstrapped/regtested on i686-pc-gnu-linux, OK?

Honza
2004-09-01  Jan Hubicka  <jh@suse.cz>
	* cfg.c (fre_edge): Use ggc_free.
	(expunge_block): Use ggc_free.
	* cfglayout.c (cfg_layout_initialize): Free RBI info.
	* tree-ssa-dce.c (remove_dead_stmt): Free STMT annotation.
	* tree-ssa.c (delete_tree_ssa): Free annotations.
Index: gcc/cfg.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cfg.c,v
retrieving revision 1.63
diff -c -3 -p -r1.63 cfg.c
*** gcc/cfg.c	25 Aug 2004 07:25:06 -0000	1.63
--- gcc/cfg.c	2 Sep 2004 18:25:04 -0000
*************** static void
*** 134,140 ****
  free_edge (edge e ATTRIBUTE_UNUSED)
  {
    n_edges--;
!   /* ggc_free (e);  */
  }
  
  /* Free the memory associated with the edge structures.  */
--- 134,140 ----
  free_edge (edge e ATTRIBUTE_UNUSED)
  {
    n_edges--;
!   ggc_free (e);
  }
  
  /* Free the memory associated with the edge structures.  */
*************** expunge_block (basic_block b)
*** 269,275 ****
    unlink_block (b);
    BASIC_BLOCK (b->index) = NULL;
    n_basic_blocks--;
!   /* ggc_free (b); */
  }
  \f
  /* Create an edge connecting SRC and DEST with flags FLAGS.  Return newly
--- 269,275 ----
    unlink_block (b);
    BASIC_BLOCK (b->index) = NULL;
    n_basic_blocks--;
!   ggc_free (b);
  }
  \f
  /* Create an edge connecting SRC and DEST with flags FLAGS.  Return newly
Index: gcc/passes.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/passes.c,v
retrieving revision 2.41
diff -c -3 -p -r2.41 passes.c
*** gcc/passes.c	1 Sep 2004 20:58:52 -0000	2.41
--- gcc/passes.c	2 Sep 2004 18:25:04 -0000
*************** rest_of_clean_state (void)
*** 1685,1690 ****
--- 1685,1691 ----
  
    /* We're done with this function.  Free up memory if we can.  */
    free_after_parsing (cfun);
+   free_after_compilation (cfun);
  }
  \f
  
Index: gcc/tree-ssa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa.c,v
retrieving revision 2.30
diff -c -3 -p -r2.30 tree-ssa.c
*** gcc/tree-ssa.c	1 Sep 2004 22:06:20 -0000	2.30
--- gcc/tree-ssa.c	2 Sep 2004 18:25:05 -0000
*************** delete_tree_ssa (void)
*** 641,653 ****
    /* Remove annotations from every tree in the function.  */
    FOR_EACH_BB (bb)
      for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
!       bsi_stmt (bsi)->common.ann = NULL;
  
    /* Remove annotations from every referenced variable.  */
    if (referenced_vars)
      {
        for (i = 0; i < num_referenced_vars; i++)
! 	referenced_var (i)->common.ann = NULL;
        referenced_vars = NULL;
      }
  
--- 641,662 ----
    /* Remove annotations from every tree in the function.  */
    FOR_EACH_BB (bb)
      for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
!       {
! 	tree stmt = bsi_stmt (bsi);
!         release_defs (stmt);
! 	ggc_free (stmt->common.ann);
! 	stmt->common.ann = NULL;
!       }
  
    /* Remove annotations from every referenced variable.  */
    if (referenced_vars)
      {
        for (i = 0; i < num_referenced_vars; i++)
! 	{
! 	  tree var = referenced_var (i);
! 	  ggc_free (var->common.ann);
! 	  var->common.ann = NULL;
! 	}
        referenced_vars = NULL;
      }
  
Index: gcc/tree-ssanames.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssanames.c,v
retrieving revision 2.8
diff -c -3 -p -r2.8 tree-ssanames.c
*** gcc/tree-ssanames.c	25 Aug 2004 21:21:19 -0000	2.8
--- gcc/tree-ssanames.c	2 Sep 2004 18:25:05 -0000
*************** release_defs (tree stmt)
*** 300,306 ****
    ssa_op_iter iter;
  
    FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_ALL_DEFS)
!     release_ssa_name (def);
  }
  
  
--- 300,307 ----
    ssa_op_iter iter;
  
    FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_ALL_DEFS)
!     if (TREE_CODE (def) == SSA_NAME)
!       release_ssa_name (def);
  }
  
  

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-02 18:53 Release RTL bodies after compilation (sometimes) Jan Hubicka
@ 2004-09-02 18:56 ` Jakub Jelinek
  2004-09-02 18:58   ` Jan Hubicka
  2004-09-03  6:58 ` Richard Henderson
  2004-09-03 19:08 ` Geoffrey Keating
  2 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2004-09-02 18:56 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rth

On Thu, Sep 02, 2004 at 08:48:49PM +0200, Jan Hubicka wrote:
> 2004-09-01  Jan Hubicka  <jh@suse.cz>
> 	* cfg.c (fre_edge): Use ggc_free.
> 	(expunge_block): Use ggc_free.
> 	* cfglayout.c (cfg_layout_initialize): Free RBI info.
> 	* tree-ssa-dce.c (remove_dead_stmt): Free STMT annotation.
> 	* tree-ssa.c (delete_tree_ssa): Free annotations.

The changelog doesn't match the patch.  Changed is:
cfg.c (free_edge, expunge_block)
passes.c (rest_of_clean_state)
tree-ssa.c (delete_tree_ssa)
tree-ssanames.c (release_defs)
instead.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-02 18:56 ` Jakub Jelinek
@ 2004-09-02 18:58   ` Jan Hubicka
  2004-09-03  5:43     ` Mark Mitchell
  0 siblings, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2004-09-02 18:58 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jan Hubicka, gcc-patches, rth

> On Thu, Sep 02, 2004 at 08:48:49PM +0200, Jan Hubicka wrote:
> > 2004-09-01  Jan Hubicka  <jh@suse.cz>
> > 	* cfg.c (fre_edge): Use ggc_free.
> > 	(expunge_block): Use ggc_free.
> > 	* cfglayout.c (cfg_layout_initialize): Free RBI info.
> > 	* tree-ssa-dce.c (remove_dead_stmt): Free STMT annotation.
> > 	* tree-ssa.c (delete_tree_ssa): Free annotations.
> 
> The changelog doesn't match the patch.  Changed is:
> cfg.c (free_edge, expunge_block)
> passes.c (rest_of_clean_state)
> tree-ssa.c (delete_tree_ssa)
> tree-ssanames.c (release_defs)
> instead.

Oops, I broke the patch out twice from a larger set of changes and
forgot to check the changelog.  Here is the fixed one.

	* cfg.c (free_edge): Use ggc_free.
	(expunge_block): Use ggc_free.
	* passes.c (rest_of_clean_state): Free after compilation.
	* tree-ssa.c (delete_tree_ssa): Free annotations; call release_defs.
	* tree-ssanames.c (release_defs): Ignore non-SSA_NAME arguments.

Thanks,
Honza
> 
> 	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-02 18:58   ` Jan Hubicka
@ 2004-09-03  5:43     ` Mark Mitchell
  2004-09-03 22:41       ` Jan Hubicka
  2004-09-14 20:00       ` Diego Novillo
  0 siblings, 2 replies; 875+ messages in thread
From: Mark Mitchell @ 2004-09-03  5:43 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jakub Jelinek, gcc-patches, rth

Jan Hubicka wrote:

>>On Thu, Sep 02, 2004 at 08:48:49PM +0200, Jan Hubicka wrote:
>>    
>>
>>>2004-09-01  Jan Hubicka  <jh@suse.cz>
>>>	* cfg.c (fre_edge): Use ggc_free.
>>>	(expunge_block): Use ggc_free.
>>>	* cfglayout.c (cfg_layout_initialize): Free RBI info.
>>>	* tree-ssa-dce.c (remove_dead_stmt): Free STMT annotation.
>>>	* tree-ssa.c (delete_tree_ssa): Free annotations.
>>>      
>>>
>>The changelog doesn't match the patch.  Changed is:
>>cfg.c (free_edge, expunge_block)
>>passes.c (rest_of_clean_state)
>>tree-ssa.c (delete_tree_ssa)
>>tree-ssanames.c (release_defs)
>>instead.
>>    
>>
>
>Oops,  I've broke out the patch twice from larger set of changes and
>forgot to check the changelog.  Here is fixed one.
>
>	* cfg.c (fre_edge): Use ggc_free.
>	(expunge_block): Use ggc_free.
>	* passes.c (rest_of_clean_state): Free after compilation.
>	* tree-ssa.c (delete_tree_ssa): Free annotations; call release_defs
>	* tree-ssanames.c (release_defs): Ignore non-SSA_NAME arguments.
>
>Thanks,
>Honza
>  
>
OK.

-- 
Mark Mitchell
CodeSourcery, LLC
(916) 791-8304
mark@codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-02 18:53 Release RTL bodies after compilation (sometimes) Jan Hubicka
  2004-09-02 18:56 ` Jakub Jelinek
@ 2004-09-03  6:58 ` Richard Henderson
  2004-09-03  8:03   ` Jan Hubicka
  2004-09-03 19:08 ` Geoffrey Keating
  2 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2004-09-03  6:58 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Thu, Sep 02, 2004 at 08:48:49PM +0200, Jan Hubicka wrote:
> Bootstraped/regtested on i686-pc-gnu-linux, OK?

With gcac?  Why were those ggc_free calls commented out?


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-03  6:58 ` Richard Henderson
@ 2004-09-03  8:03   ` Jan Hubicka
  0 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-03  8:03 UTC (permalink / raw)
  To: Richard Henderson, Jan Hubicka, gcc-patches

> On Thu, Sep 02, 2004 at 08:48:49PM +0200, Jan Hubicka wrote:
> > Bootstraped/regtested on i686-pc-gnu-linux, OK?
> 
> With gcac?
No, I am going to re-test it with gcac.
> Why were those ggc_free calls commented out?
Because I added them before ggc_free was implemented, as a reminder that
they should be uncommented once it was ready.

Honza
> 
> 
> r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-02 18:53 Release RTL bodies after compilation (sometimes) Jan Hubicka
  2004-09-02 18:56 ` Jakub Jelinek
  2004-09-03  6:58 ` Richard Henderson
@ 2004-09-03 19:08 ` Geoffrey Keating
  2004-09-03 19:50   ` Jan Hubicka
  2 siblings, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2004-09-03 19:08 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

Jan Hubicka <jh@suse.cz> writes:

> Hi,
> 
> This patch saves about 3MB of garbage on an -O0 compilation of combine.c
> by explicitly freeing some annotations and CFG edges.  Additionally, it
> makes RTL function bodies get released again after compilation in the
> majority of cases.  This reduces peak memory usage from 9MB to 5MB, and
> on Gerald's testcase I go from 21 garbage collection runs to 18.  This
> can still be improved significantly by follow-up patches that I broke
> out because they are slightly more controversial.
> 
> On the gcc modules compilation test I didn't measure any actual
> performance difference, but for Gerald's testcase it is about 6%.
> 
> Bootstrapped/regtested on i686-pc-gnu-linux, OK?

Did you do your performance measurements with --disable-checking?  If not,
please do so.  On my system, to have 18 GC runs on a single file means
that it has to allocate a minimum of 350MB, and usually much more.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-03 19:08 ` Geoffrey Keating
@ 2004-09-03 19:50   ` Jan Hubicka
  0 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-03 19:50 UTC (permalink / raw)
  To: Geoffrey Keating; +Cc: Jan Hubicka, gcc-patches

> Jan Hubicka <jh@suse.cz> writes:
> 
> > Hi,
> > 
> > This patch saves about 3MB of garbage on an -O0 compilation of combine.c
> > by explicitly freeing some annotations and CFG edges.  Additionally, it
> > makes RTL function bodies get released again after compilation in the
> > majority of cases.  This reduces peak memory usage from 9MB to 5MB, and
> > on Gerald's testcase I go from 21 garbage collection runs to 18.  This
> > can still be improved significantly by follow-up patches that I broke
> > out because they are slightly more controversial.
> > 
> > On the gcc modules compilation test I didn't measure any actual
> > performance difference, but for Gerald's testcase it is about 6%.
> > 
> > Bootstrapped/regtested on i686-pc-gnu-linux, OK?
> 
> Did you do your performance measurements with --disable-checking?  If not,
> please do so.  On my system, to have 18 GC runs on a single file means

Yes, it is with checking disabled.

> that it has to allocate a minimum of 350MB, and usually much more.

Gerald's testcase allocates 924MB of garbage at the moment.

Honza

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-03  5:43     ` Mark Mitchell
@ 2004-09-03 22:41       ` Jan Hubicka
  2004-09-14 20:00       ` Diego Novillo
  1 sibling, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-03 22:41 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Jan Hubicka, Jakub Jelinek, gcc-patches, rth

> > 
> >
> OK.

Thanks,
I've additionally bootstrapped the patch with gcac per Richard's request
and I am going to commit it now.

Honza
> 
> -- 
> Mark Mitchell
> CodeSourcery, LLC
> (916) 791-8304
> mark@codesourcery.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-03  5:43     ` Mark Mitchell
  2004-09-03 22:41       ` Jan Hubicka
@ 2004-09-14 20:00       ` Diego Novillo
  2004-09-14 20:15         ` Jan Hubicka
  2004-09-14 20:18         ` Jan Hubicka
  1 sibling, 2 replies; 875+ messages in thread
From: Diego Novillo @ 2004-09-14 20:00 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jakub Jelinek, gcc-patches, Richard Henderson, Mark Mitchell

On Fri, 2004-09-03 at 01:39, Mark Mitchell wrote:

> >Oops,  I've broke out the patch twice from larger set of changes and
> >forgot to check the changelog.  Here is fixed one.
> >
> >	* cfg.c (fre_edge): Use ggc_free.
> >	(expunge_block): Use ggc_free.
>
Jan,

This is causing the bootstrap failure on x86_64 and others reported in
other/17437.  I think that it may be safer to revert this patch for now.

The problem is that out of DU chains we end up wanting to traverse into
a basic block that has been explicitly ggc_free'd.

Could you please revert this and investigate who may be holding on to DU
information longer than the lifetime of the block?

Thanks.  Diego.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 20:00       ` Diego Novillo
@ 2004-09-14 20:15         ` Jan Hubicka
  2004-09-14 20:35           ` Richard Henderson
  2004-09-14 21:07           ` Jeffrey A Law
  2004-09-14 20:18         ` Jan Hubicka
  1 sibling, 2 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-14 20:15 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Jan Hubicka, Jakub Jelinek, gcc-patches, Richard Henderson,
	Mark Mitchell

> 
> On Fri, 2004-09-03 at 01:39, Mark Mitchell wrote:
> 
> > >Oops,  I've broke out the patch twice from larger set of changes and
> > >forgot to check the changelog.  Here is fixed one.
> > >
> > >	* cfg.c (fre_edge): Use ggc_free.
> > >	(expunge_block): Use ggc_free.
> >
> Jan,
> 
> This is causing the bootstrap failure on x86_64 and others reported in
> other/17437.  I think that it may be safer to revert this patch for now.

Interesting: the bootstrap always breaks on the wrong platform ;).  I've
bootstrapped this patch with gcac successfully, but I did that on i386.
> 
> The problem is that out of DU chains we end up wanting to traverse into
> a basic block that has been explicitly ggc_free'd.
> 
> Could you please revert this and investigate who may be holding on to DU
> information longer than the life time of the block?

Sure.  I see this is another problem with leaking SSA nodes.  There are
actually a few problems here.  In the first place, the released SSA
nodes are not cleared out, so they still point to dead statements.  If
one makes them be cleared, one gets segfaults, as a few loops still walk
the dead ssanames.  I've made a patch for this part, with the zeroing
patch as a follow-up.  I would bet I sent it out, but I can't find it in
the archives, so I am attaching it.
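
A toy sketch of the hazard and of the kind of guard the walking loops
need.  The struct and function names are hypothetical stand-ins and do
not reflect the real SSA name representation:

```c
#include <stddef.h>

/* Toy stand-in for an SSA name.  */
struct toy_ssa_name
{
  int in_free_list;       /* nonzero once the name has been released */
  const char *def_stmt;   /* defining statement, NULL after zeroing */
};

/* Release NAME: put it on the free list and clear the pointer to its
   defining statement so nothing can follow it to dead data.  */
void
toy_release_ssa_name (struct toy_ssa_name *name)
{
  name->in_free_list = 1;
  name->def_stmt = NULL;
}

/* Count names with a non-empty definition.  Released names are skipped
   before def_stmt is dereferenced; without that check the walk would
   dereference a null pointer once released names are zeroed.  */
size_t
toy_count_live_defs (const struct toy_ssa_name *names, size_t n)
{
  size_t i, live = 0;

  for (i = 0; i < n; i++)
    {
      if (names[i].in_free_list)
        continue;
      if (names[i].def_stmt[0] != '\0')
        live++;
    }
  return live;
}

/* Build three names, release the middle one, and walk the table.  */
size_t
toy_ssa_demo (void)
{
  struct toy_ssa_name names[3] = {
    { 0, "a_1 = b + c" },
    { 0, "d_2 = a_1 * 2" },
    { 0, "e_3 = d_2" }
  };

  toy_release_ssa_name (&names[1]);
  return toy_count_live_defs (names, 3);
}
```

The demo walks a table in which one name has been released and counts
only the two that remain live.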

Thank you for working this out and I apologize for the breakage!

Hi,
the patch also has an interesting side effect on i686-pc-gnu-linux bootstrap & regtest.
Tests that now work, but didn't before:

gfortran.fortran-torture/execute/elemental.f90 execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
gfortran.fortran-torture/execute/elemental.f90 execution,  -O3 -fomit-frame-pointer -funroll-loops

I haven't tracked down why this happens yet, but I would
attribute it to the tree-ssa-alias.c change.

Bootstrapped/regtested i686-pc-gnu-linux with and w/o checking and
ppc-pc-gnu-linux with checking, and no slowdowns measured (there is a slight
speedup on the non-checking i686 build; the non-checking PPC build died for
unrelated reasons I am trying to look into now).

Memory savings of this patch per se are small (400Kb for combine.c), but I need
it for some further work: I want to clear out pointers in released nodes (thus
I need them not to be walked by loops; after all, it seems quite a crazy design
to keep dead entries in the ssa_names array), and this saves about 30% of peak
memory usage for rt-3.4.ii (a huge template testcase I grabbed from somewhere in
bugzilla), as most of the statements simply get eliminated by the first DCE, so
we can recycle them before burning a lot of memory elsewhere.

I also want to pursue the goal of releasing all SSA names.  For this I need a
few further changes that are more dubious, so I decided not to send them in the
first round.

Honza
2004-09-10  Jan Hubicka  <jh@suse.cz>
	* tree-cfg.c (remove_bb): Release SSA defs.
	* tree-ssa-alias.c (init_alias_info): Do not walk pointers in freelist.
	* tree-ssa-loop-ivopts.c (remove_statement): Release SSA defs.
	* tree-ssa.c (verify_flow_sensitive_alias_info): Do not walk dead nodes.
	* tree-tailcall.c (eliminate_tail_call): Release SSA name.

Index: tree-cfg.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-cfg.c,v
retrieving revision 2.47
diff -c -3 -p -r2.47 tree-cfg.c
*** tree-cfg.c	6 Sep 2004 10:07:56 -0000	2.47
--- tree-cfg.c	7 Sep 2004 13:26:46 -0000
*************** remove_bb (basic_block bb)
*** 1818,1823 ****
--- 1818,1824 ----
    for (i = bsi_start (bb); !bsi_end_p (i); bsi_remove (&i))
      {
        tree stmt = bsi_stmt (i);
+       release_defs (stmt);
  
        set_bb_for_stmt (stmt, NULL);
  
Index: tree-ssa-alias.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-alias.c,v
retrieving revision 2.28
diff -c -3 -p -r2.28 tree-ssa-alias.c
*** tree-ssa-alias.c	6 Sep 2004 10:08:05 -0000	2.28
--- tree-ssa-alias.c	7 Sep 2004 13:26:46 -0000
*************** init_alias_info (void)
*** 417,423 ****
  	{
  	  tree name = ssa_name (i);
  
! 	  if (!POINTER_TYPE_P (TREE_TYPE (name)))
  	    continue;
  
  	  if (SSA_NAME_PTR_INFO (name))
--- 417,423 ----
  	{
  	  tree name = ssa_name (i);
  
! 	  if (SSA_NAME_IN_FREE_LIST (name) || !POINTER_TYPE_P (TREE_TYPE (name)))
  	    continue;
  
  	  if (SSA_NAME_PTR_INFO (name))
Index: tree-ssa-loop-ivopts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
retrieving revision 2.3
diff -c -3 -p -r2.3 tree-ssa-loop-ivopts.c
*** tree-ssa-loop-ivopts.c	6 Sep 2004 18:38:27 -0000	2.3
--- tree-ssa-loop-ivopts.c	7 Sep 2004 13:26:47 -0000
*************** remove_statement (tree stmt, bool includ
*** 3777,3782 ****
--- 3777,3784 ----
        block_stmt_iterator bsi = stmt_for_bsi (stmt);
  
        bsi_remove (&bsi);
+       if (including_defined_name)
+         release_defs (stmt);
      }
  }
  
Index: tree-ssa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa.c,v
retrieving revision 2.32
diff -c -3 -p -r2.32 tree-ssa.c
*** tree-ssa.c	6 Sep 2004 10:08:11 -0000	2.32
--- tree-ssa.c	7 Sep 2004 13:26:47 -0000
*************** verify_flow_sensitive_alias_info (void)
*** 408,413 ****
--- 408,415 ----
        struct ptr_info_def *pi;
  
        ptr = ssa_name (i);
+       if (!TREE_VISITED (ptr))
+ 	continue;
        ann = var_ann (SSA_NAME_VAR (ptr));
        pi = SSA_NAME_PTR_INFO (ptr);
  
Index: tree-tailcall.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-tailcall.c,v
retrieving revision 2.23
diff -c -3 -p -r2.23 tree-tailcall.c
*** tree-tailcall.c	6 Sep 2004 10:08:13 -0000	2.23
--- tree-tailcall.c	7 Sep 2004 13:26:47 -0000
*************** eliminate_tail_call (struct tailcall *t)
*** 681,692 ****
    bsi_next (&bsi);
    while (!bsi_end_p (bsi))
      {
        /* Do not remove the return statement, so that redirect_edge_and_branch
  	 sees how the block ends.  */
!       if (TREE_CODE (bsi_stmt (bsi)) == RETURN_EXPR)
  	break;
  
        bsi_remove (&bsi);
      }
  
    /* Replace the call by a jump to the start of function.  */
--- 681,694 ----
    bsi_next (&bsi);
    while (!bsi_end_p (bsi))
      {
+       tree t = bsi_stmt (bsi);
        /* Do not remove the return statement, so that redirect_edge_and_branch
  	 sees how the block ends.  */
!       if (TREE_CODE (t) == RETURN_EXPR)
  	break;
  
        bsi_remove (&bsi);
+       release_defs (t);
      }
  
    /* Replace the call by a jump to the start of function.  */
*************** eliminate_tail_call (struct tailcall *t)
*** 772,777 ****
--- 774,780 ----
      }
  
    bsi_remove (&t->call_bsi);
+   release_defs (call);
  }
  
  /* Optimizes the tailcall described by T.  If OPT_TAILCALLS is true, also

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 20:00       ` Diego Novillo
  2004-09-14 20:15         ` Jan Hubicka
@ 2004-09-14 20:18         ` Jan Hubicka
  1 sibling, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-14 20:18 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Jan Hubicka, Jakub Jelinek, gcc-patches, Richard Henderson,
	Mark Mitchell

> 
> On Fri, 2004-09-03 at 01:39, Mark Mitchell wrote:
> 
> > >Oops,  I've broke out the patch twice from larger set of changes and
> > >forgot to check the changelog.  Here is fixed one.
> > >
> > >	* cfg.c (fre_edge): Use ggc_free.
> > >	(expunge_block): Use ggc_free.
> >

This is the reverting patch I committed.  It appears to be safe to revert
only the expunge_block change, and I am running a gcac x86-64 bootstrap
now.

2004-09-14  Jan Hubicka  <jh@suse.cz>
	* cfg.c (expunge_block): Revert previous change adding ggc_free call.
Index: cfg.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cfg.c,v
retrieving revision 1.65
diff -c -3 -p -r1.65 cfg.c
*** cfg.c	7 Sep 2004 15:46:46 -0000	1.65
--- cfg.c	14 Sep 2004 20:06:48 -0000
*************** expunge_block (basic_block b)
*** 266,272 ****
    unlink_block (b);
    BASIC_BLOCK (b->index) = NULL;
    n_basic_blocks--;
!   ggc_free (b);
  }
  \f
  /* Create an edge connecting SRC and DEST with flags FLAGS.  Return newly
--- 266,276 ----
    unlink_block (b);
    BASIC_BLOCK (b->index) = NULL;
    n_basic_blocks--;
!   /* We should be able to ggc_free here, but we are not.
!      The dead SSA_NAMES are left pointing to dead statements that are pointing
!      to dead basic blocks making garbage collector to die.
!      We should be able to release all dead SSA_NAMES and at the same time we should
!      clear out BB pointer of dead statements consistently.  */
  }
  \f
  /* Create an edge connecting SRC and DEST with flags FLAGS.  Return newly

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 20:15         ` Jan Hubicka
@ 2004-09-14 20:35           ` Richard Henderson
  2004-09-14 20:51             ` Jan Hubicka
  2004-09-14 21:07           ` Jeffrey A Law
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2004-09-14 20:35 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Diego Novillo, Jakub Jelinek, gcc-patches, Mark Mitchell

On Tue, Sep 14, 2004 at 10:00:41PM +0200, Jan Hubicka wrote:
> - 	  if (!POINTER_TYPE_P (TREE_TYPE (name)))
> + 	  if (SSA_NAME_IN_FREE_LIST (name) || !POINTER_TYPE_P (TREE_TYPE (name)))

If you're having to add this kind of check, it means that
we've a serious conceptual problem elsewhere.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 20:35           ` Richard Henderson
@ 2004-09-14 20:51             ` Jan Hubicka
  2004-09-14 21:07               ` Jeffrey A Law
  0 siblings, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2004-09-14 20:51 UTC (permalink / raw)
  To: Richard Henderson, Jan Hubicka, Diego Novillo, Jakub Jelinek,
	gcc-patches, Mark Mitchell

> On Tue, Sep 14, 2004 at 10:00:41PM +0200, Jan Hubicka wrote:
> > - 	  if (!POINTER_TYPE_P (TREE_TYPE (name)))
> > + 	  if (SSA_NAME_IN_FREE_LIST (name) || !POINTER_TYPE_P (TREE_TYPE (name)))
> 
> If you're having to add this kind of check, it means that
> we've a serious conceptual problem elsewhere.

I think the conceptual problem is keeping the ssa_names array pointing to
nodes in the freelist.  I sent a separate message about setting them to
NULL and re-initializing when a node is recycled, but didn't get a reply.

If this sounds saner, I will do this.
Anyway, the function is definitely walking and analyzing dead statements,
at least for SSA_NAMEs we didn't explicitly release.
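
To make the proposal concrete, here is a minimal sketch of a name table
whose slot is set to NULL on release and re-initialized when the node is
recycled.  It is a toy model only: toy_name, name_table, toy_make_name,
toy_release_name and toy_sum_live are hypothetical names invented for
illustration, not GCC's actual SSA_NAME code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SSA_NAME nodes and the ssa_names table.  */
#define MAX_NAMES 8

struct toy_name
{
  unsigned version;            /* Index of this name in the table.  */
  int payload;                 /* Stands in for the defining statement.  */
  struct toy_name *next_free;  /* Link used only while on the free list.  */
};

static struct toy_name storage[MAX_NAMES];
static struct toy_name *name_table[MAX_NAMES];  /* NULL means released.  */
static struct toy_name *free_list;
static unsigned num_names;

/* Create a name, recycling a released node when one is available.  */
static struct toy_name *
toy_make_name (int payload)
{
  struct toy_name *n;

  if (free_list)
    {
      n = free_list;
      free_list = n->next_free;
    }
  else
    {
      assert (num_names < MAX_NAMES);
      n = &storage[num_names];
      n->version = num_names++;
    }
  n->payload = payload;
  n->next_free = NULL;
  name_table[n->version] = n;  /* Re-initialize the slot on recycle.  */
  return n;
}

/* Release a name: clear its table slot so walkers never see it.  */
static void
toy_release_name (struct toy_name *n)
{
  name_table[n->version] = NULL;
  n->payload = 0;
  n->next_free = free_list;
  free_list = n;
}

/* Walk the table; a NULL check is all that is needed to skip dead names.  */
static int
toy_sum_live (void)
{
  int sum = 0;
  unsigned i;

  for (i = 0; i < num_names; i++)
    if (name_table[i])
      sum += name_table[i]->payload;
  return sum;
}
```

With this layout a walker cannot reach a released node through the table
at all, so no per-loop "is this name on the free list" test is required.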

Honza
> 
> 
> r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 20:15         ` Jan Hubicka
  2004-09-14 20:35           ` Richard Henderson
@ 2004-09-14 21:07           ` Jeffrey A Law
  2004-09-14 21:19             ` Jan Hubicka
  2004-09-15 11:59             ` Jan Hubicka
  1 sibling, 2 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-14 21:07 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Diego Novillo, Jakub Jelinek, gcc-patches, Richard Henderson,
	Mark Mitchell

On Tue, 2004-09-14 at 14:00, Jan Hubicka wrote:
> 2004-09-10  Jan Hubicka  <jh@suse.cz>
> 	* tree-cfg.c (remove_bb): Release SSA defs.
> 	* tree-ssa-alias.c (init_alias_info): Do not walk pointers in freelist.
> 	* tree-ssa-loop-ivopts (remove_statements): Release SSA defs.
> 	* tree-ssa.c (verify_flow_sensitive_alias_info): Do not walk dead nodes.
> 	* tree-tailcall.c (eliminate_tail_call): Release SSA name.

> 
> Index: tree-cfg.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-cfg.c,v
> retrieving revision 2.47
> diff -c -3 -p -r2.47 tree-cfg.c
> *** tree-cfg.c	6 Sep 2004 10:07:56 -0000	2.47
> --- tree-cfg.c	7 Sep 2004 13:26:46 -0000
> *************** remove_bb (basic_block bb)
> *** 1818,1823 ****
> --- 1818,1824 ----
>     for (i = bsi_start (bb); !bsi_end_p (i); bsi_remove (&i))
>       {
>         tree stmt = bsi_stmt (i);
> +       release_defs (stmt);
>   
>         set_bb_for_stmt (stmt, NULL);
This is probably OK.

> Index: tree-ssa-alias.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-ssa-alias.c,v
> retrieving revision 2.28
> diff -c -3 -p -r2.28 tree-ssa-alias.c
> *** tree-ssa-alias.c	6 Sep 2004 10:08:05 -0000	2.28
> --- tree-ssa-alias.c	7 Sep 2004 13:26:46 -0000
> *************** init_alias_info (void)
> *** 417,423 ****
>   	{
>   	  tree name = ssa_name (i);
>   
> ! 	  if (!POINTER_TYPE_P (TREE_TYPE (name)))
>   	    continue;
>   
>   	  if (SSA_NAME_PTR_INFO (name))
> --- 417,423 ----
>   	{
>   	  tree name = ssa_name (i);
>   
> ! 	  if (SSA_NAME_IN_FREE_LIST (name) || !POINTER_TYPE_P (TREE_TYPE (name)))
>   	    continue;
>   
>   	  if (SSA_NAME_PTR_INFO (name))
If there's any reason to clear released entries in the formal
SSA_NAME table, this is it.  The fact that right now any code that
wants to walk over the table and perform actions on the names found
therein has to check if the returned name was released or not
is rather prone to hidden errors.


> Index: tree-ssa-loop-ivopts.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
> retrieving revision 2.3
> diff -c -3 -p -r2.3 tree-ssa-loop-ivopts.c
> *** tree-ssa-loop-ivopts.c	6 Sep 2004 18:38:27 -0000	2.3
> --- tree-ssa-loop-ivopts.c	7 Sep 2004 13:26:47 -0000
> *************** remove_statement (tree stmt, bool includ
> *** 3777,3782 ****
> --- 3777,3784 ----
>         block_stmt_iterator bsi = stmt_for_bsi (stmt);
>   
>         bsi_remove (&bsi);
> +       if (including_defined_name)
> +         release_defs (stmt);
>       }
>   }
Don't know enough about the IV code to comment.

>   
> Index: tree-ssa.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-ssa.c,v
> retrieving revision 2.32
> diff -c -3 -p -r2.32 tree-ssa.c
> *** tree-ssa.c	6 Sep 2004 10:08:11 -0000	2.32
> --- tree-ssa.c	7 Sep 2004 13:26:47 -0000
> *************** verify_flow_sensitive_alias_info (void)
> *** 408,413 ****
> --- 408,415 ----
>         struct ptr_info_def *pi;
>   
>         ptr = ssa_name (i);
> +       if (!TREE_VISITED (ptr))
> + 	continue;
>         ann = var_ann (SSA_NAME_VAR (ptr));
>         pi = SSA_NAME_PTR_INFO (ptr);
>   
Err, the code already does this about 2 lines after your change.  Is
there some reason you couldn't just move the existing test (which
is more comprehensive than yours) to the earlier location?

> Index: tree-tailcall.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-tailcall.c,v
> retrieving revision 2.23
> diff -c -3 -p -r2.23 tree-tailcall.c
> *** tree-tailcall.c	6 Sep 2004 10:08:13 -0000	2.23
> --- tree-tailcall.c	7 Sep 2004 13:26:47 -0000
> *************** eliminate_tail_call (struct tailcall *t)
> *** 681,692 ****
>     bsi_next (&bsi);
>     while (!bsi_end_p (bsi))
>       {
>         /* Do not remove the return statement, so that redirect_edge_and_branch
>   	 sees how the block ends.  */
> !       if (TREE_CODE (bsi_stmt (bsi)) == RETURN_EXPR)
>   	break;
>   
>         bsi_remove (&bsi);
>       }
>   
>     /* Replace the call by a jump to the start of function.  */
> --- 681,694 ----
>     bsi_next (&bsi);
>     while (!bsi_end_p (bsi))
>       {
> +       tree t = bsi_stmt (bsi);
>         /* Do not remove the return statement, so that redirect_edge_and_branch
>   	 sees how the block ends.  */
> !       if (TREE_CODE (t) == RETURN_EXPR)
>   	break;
>   
>         bsi_remove (&bsi);
> +       release_defs (t);
>       }
Is RETURN_EXPR still allowed to create definitions?

>   
>     /* Replace the call by a jump to the start of function.  */
> *************** eliminate_tail_call (struct tailcall *t)
> *** 772,777 ****
> --- 774,780 ----
>       }
>   
>     bsi_remove (&t->call_bsi);
> +   release_defs (call);
>   }
>   
>   /* Optimizes the tailcall described by T.  If OPT_TAILCALLS is true, also
Probably OK.


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 20:51             ` Jan Hubicka
@ 2004-09-14 21:07               ` Jeffrey A Law
  2004-09-14 21:12                 ` Jakub Jelinek
                                   ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-14 21:07 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Henderson, Diego Novillo, Jakub Jelinek, gcc-patches,
	Mark Mitchell

On Tue, 2004-09-14 at 14:19, Jan Hubicka wrote:
> > On Tue, Sep 14, 2004 at 10:00:41PM +0200, Jan Hubicka wrote:
> > > - 	  if (!POINTER_TYPE_P (TREE_TYPE (name)))
> > > + 	  if (SSA_NAME_IN_FREE_LIST (name) || !POINTER_TYPE_P (TREE_TYPE (name)))
> > 
> > If you're having to add this kind of check, it means that
> > we've a serious conceptual problem elsewhere.
> 
> I think the conceptual problem is to keep ssa_names array pointing to
> nodes in the freelist.  I sent separate message about setting them to
> NULL and re-initializing when node is recycled, but didn't get a reply.
Sorry.  I don't follow what you're trying to say.

Fundamentally, if we require SSA_NAMEs never leak, then we've screwed
something up so badly, that we might as well quit now.  I don't mind
finding leaks and fixing them, hell, I probably wouldn't object
too strongly to running a pass over the IL once or twice to mark
used SSA_NAMEs and release any which weren't marked.
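
As a rough illustration of such a mark-and-release pass, the sketch below
walks a toy IL, marks every name a remaining statement mentions, and
releases the rest.  Everything in it (toy_stmt, name_released,
toy_release_unused_names) is a hypothetical model made up for this
example, not code from the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_NAMES 4

/* Hypothetical toy IL: each statement defines one name and uses up to
   two.  A negative index means "no operand".  */
struct toy_stmt
{
  int def;
  int use[2];
};

static bool name_released[N_NAMES];

/* Mark every name mentioned by the statements still in the IL, then
   release the rest.  Returns how many names were newly released.  */
static unsigned
toy_release_unused_names (const struct toy_stmt *stmts, size_t n_stmts)
{
  bool used[N_NAMES] = { false };
  unsigned released = 0;
  size_t i;
  int j;

  for (i = 0; i < n_stmts; i++)
    {
      if (stmts[i].def >= 0)
        used[stmts[i].def] = true;
      for (j = 0; j < 2; j++)
        if (stmts[i].use[j] >= 0)
          used[stmts[i].use[j]] = true;
    }

  for (j = 0; j < N_NAMES; j++)
    if (!used[j] && !name_released[j])
      {
        name_released[j] = true;
        released++;
      }
  return released;
}
```

The pass finds leaked names by reachability from the IL, so it does not
depend on every removal site remembering to release its definitions.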

The problem is your change to explicitly call ggc_free bypasses the
entire GC mechanisms we've built.  As I've stated before, if you
can't be absolutely sure that there are no pointers left into your
object, then the object is _NOT_ a candidate for ggc_free.  I can't
emphasize this enough.

Or to put it another way, the only objects we should ever explicitly
ggc_free would be those which, if they show up as live after a certain
point would indicate a fundamental bug.  basic blocks, edges and the
like do not IMHO fit that description, except maybe between compiling
functions.

I'm regretting ever pushing for including ggc_free; the more it's
used the more I think we made a fundamental mistake.

Jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:07               ` Jeffrey A Law
@ 2004-09-14 21:12                 ` Jakub Jelinek
  2004-09-14 22:33                   ` Daniel Jacobowitz
                                     ` (2 more replies)
  2004-09-14 21:26                 ` Jan Hubicka
  2004-09-14 21:40                 ` Diego Novillo
  2 siblings, 3 replies; 875+ messages in thread
From: Jakub Jelinek @ 2004-09-14 21:12 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Jan Hubicka, Richard Henderson, Diego Novillo, gcc-patches,
	Mark Mitchell

On Tue, Sep 14, 2004 at 02:51:39PM -0600, Jeffrey A Law wrote:
> The problem is your change to explicitly call ggc_free bypasses the
> entire GC mechanisms we've built.  As I've stated before, if you
> can't be absolutely sure that there are no pointers left into your
> object, then the object is _NOT_ a candidate for ggc_free.  I can't
> emphasize this enough.

Shouldn't we have a checking mode which verifies this then?
I.e. ggc_free in that mode would just set a flag and if during GC
collection an object marked that way is reachable, we would abort ().
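
A minimal sketch of that idea, assuming a toy object graph: the checking
"free" only sets a flag, and the mark phase reports any flagged object it
can still reach.  The names toy_obj, toy_checked_free and toy_mark are
hypothetical and do not correspond to the real ggc interfaces; the sketch
returns a count where a real checker would abort ().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy heap object; every name here is hypothetical, not the ggc API.  */
struct toy_obj
{
  bool freed;              /* Set by toy_checked_free instead of freeing.  */
  bool marked;             /* Set when the mark phase reaches the object.  */
  struct toy_obj *ref[2];  /* Outgoing pointers, NULL when unused.  */
};

/* Checking-mode "free": only record that the caller claims O is dead.  */
static void
toy_checked_free (struct toy_obj *o)
{
  o->freed = true;
}

/* Mark everything reachable from O.  Return the number of reachable
   objects that were explicitly freed; a real checker would abort ()
   on the first one instead of counting.  */
static unsigned
toy_mark (struct toy_obj *o)
{
  unsigned bad, i;

  if (o == NULL || o->marked)
    return 0;
  o->marked = true;
  bad = o->freed ? 1 : 0;
  for (i = 0; i < 2; i++)
    bad += toy_mark (o->ref[i]);
  return bad;
}
```

For instance, if a root points to a statement that still points to a
basic block passed to toy_checked_free, toy_mark on the root returns 1,
which is the situation described for the expunge_block change.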

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:07           ` Jeffrey A Law
@ 2004-09-14 21:19             ` Jan Hubicka
  2004-09-15 11:59             ` Jan Hubicka
  1 sibling, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-14 21:19 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Jan Hubicka, Diego Novillo, Jakub Jelinek, gcc-patches,
	Richard Henderson, Mark Mitchell

> On Tue, 2004-09-14 at 14:00, Jan Hubicka wrote:
> > 2004-09-10  Jan Hubicka  <jh@suse.cz>
> > 	* tree-cfg.c (remove_bb): Release SSA defs.
> > 	* tree-ssa-alias.c (init_alias_info): Do not walk pointers in freelist.
> > 	* tree-ssa-loop-ivopts (remove_statements): Release SSA defs.
> > 	* tree-ssa.c (verify_flow_sensitive_alias_info): Do not walk dead nodes.
> > 	* tree-tailcall.c (eliminate_tail_call): Release SSA name.
> 
> > 
> > Index: tree-cfg.c
> > ===================================================================
> > RCS file: /cvs/gcc/gcc/gcc/tree-cfg.c,v
> > retrieving revision 2.47
> > diff -c -3 -p -r2.47 tree-cfg.c
> > *** tree-cfg.c	6 Sep 2004 10:07:56 -0000	2.47
> > --- tree-cfg.c	7 Sep 2004 13:26:46 -0000
> > *************** remove_bb (basic_block bb)
> > *** 1818,1823 ****
> > --- 1818,1824 ----
> >     for (i = bsi_start (bb); !bsi_end_p (i); bsi_remove (&i))
> >       {
> >         tree stmt = bsi_stmt (i);
> > +       release_defs (stmt);
> >   
> >         set_bb_for_stmt (stmt, NULL);
> This is probably OK.

> If there's any reason to clear released entries in the formal
> SSA_NAME table, this is it.  The fact that right now any code that
> wants to walk over the table and perform actions on the names found
> therein has to check if the returned name was released or not
> is rather prone to hidden errors.

OK, I was asking about this in a separate mail previously.  I have a patch
to clear out the entries somewhere in my archives.  I will resurrect it
and send it shortly.
> 
> 
> > Index: tree-ssa-loop-ivopts.c
> > ===================================================================
> > RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
> > retrieving revision 2.3
> > diff -c -3 -p -r2.3 tree-ssa-loop-ivopts.c
> > *** tree-ssa-loop-ivopts.c	6 Sep 2004 18:38:27 -0000	2.3
> > --- tree-ssa-loop-ivopts.c	7 Sep 2004 13:26:47 -0000
> > *************** remove_statement (tree stmt, bool includ
> > *** 3777,3782 ****
> > --- 3777,3784 ----
> >         block_stmt_iterator bsi = stmt_for_bsi (stmt);
> >   
> >         bsi_remove (&bsi);
> > +       if (including_defined_name)
> > +         release_defs (stmt);
> >       }
> >   }
> Don't know enough about the IV code to comment.

OK, I can ask Zdenek but it looks quite obvious from the comment:
/* Removes statement STMT (real or a phi node).  If INCLUDING_DEFINED_NAME
   is true, remove also the ssa name defined by the statement.  */
and the code handling PHI statements:
if (TREE_CODE (stmt) == PHI_NODE)
  {
    if (!including_defined_name)
      {
	/* Prevent the ssa name defined by the statement from being removed.  */
	SET_PHI_RESULT (stmt, NULL);
      }
    remove_phi_node (stmt, NULL_TREE, bb_for_stmt (stmt));
  }
> > Index: tree-ssa.c
> > ===================================================================
> > RCS file: /cvs/gcc/gcc/gcc/tree-ssa.c,v
> > retrieving revision 2.32
> > diff -c -3 -p -r2.32 tree-ssa.c
> > *** tree-ssa.c	6 Sep 2004 10:08:11 -0000	2.32
> > --- tree-ssa.c	7 Sep 2004 13:26:47 -0000
> > *************** verify_flow_sensitive_alias_info (void)
> > *** 408,413 ****
> > --- 408,415 ----
> >         struct ptr_info_def *pi;
> >   
> >         ptr = ssa_name (i);
> > +       if (!TREE_VISITED (ptr))
> > + 	continue;
> >         ann = var_ann (SSA_NAME_VAR (ptr));
> >         pi = SSA_NAME_PTR_INFO (ptr);
> >   
> Err, the code already does this about 2 lines after your change.  Is
> there some reason you couldn't just move the existing test (which
> is more comprehensive than yours) to the earlier location?

I didn't notice it.  We died dereferencing the SSA_NAME_VAR pointer, which I
had cleared out.  However, with the clearing patch we don't need this change.
> 
> > Index: tree-tailcall.c
> > ===================================================================
> > RCS file: /cvs/gcc/gcc/gcc/tree-tailcall.c,v
> > retrieving revision 2.23
> > diff -c -3 -p -r2.23 tree-tailcall.c
> > *** tree-tailcall.c	6 Sep 2004 10:08:13 -0000	2.23
> > --- tree-tailcall.c	7 Sep 2004 13:26:47 -0000
> > *************** eliminate_tail_call (struct tailcall *t)
> > *** 681,692 ****
> >     bsi_next (&bsi);
> >     while (!bsi_end_p (bsi))
> >       {
> >         /* Do not remove the return statement, so that redirect_edge_and_branch
> >   	 sees how the block ends.  */
> > !       if (TREE_CODE (bsi_stmt (bsi)) == RETURN_EXPR)
> >   	break;
> >   
> >         bsi_remove (&bsi);
> >       }
> >   
> >     /* Replace the call by a jump to the start of function.  */
> > --- 681,694 ----
> >     bsi_next (&bsi);
> >     while (!bsi_end_p (bsi))
> >       {
> > +       tree t = bsi_stmt (bsi);
> >         /* Do not remove the return statement, so that redirect_edge_and_branch
> >   	 sees how the block ends.  */
> > !       if (TREE_CODE (t) == RETURN_EXPR)
> >   	break;
> >   
> >         bsi_remove (&bsi);
> > +       release_defs (t);
> >       }
> Is RETURN_EXPR still allowed to create definitions?

This is not removing the return_expr, but statements up to return_expr.
> 
> >   
> >     /* Replace the call by a jump to the start of function.  */
> > *************** eliminate_tail_call (struct tailcall *t)
> > *** 772,777 ****
> > --- 774,780 ----
> >       }
> >   
> >     bsi_remove (&t->call_bsi);
> > +   release_defs (call);
> >   }
> >   
> >   /* Optimizes the tailcall described by T.  If OPT_TAILCALLS is true, also
> Probably OK.

Thanks!
I will re-test and apply the approved bits and prepare SSA_NAME clearing patch.

Honza
> 
> 

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:07               ` Jeffrey A Law
  2004-09-14 21:12                 ` Jakub Jelinek
@ 2004-09-14 21:26                 ` Jan Hubicka
  2004-09-14 21:40                 ` Diego Novillo
  2 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-14 21:26 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Jan Hubicka, Richard Henderson, Diego Novillo, Jakub Jelinek,
	gcc-patches, Mark Mitchell

> On Tue, 2004-09-14 at 14:19, Jan Hubicka wrote:
> > > On Tue, Sep 14, 2004 at 10:00:41PM +0200, Jan Hubicka wrote:
> > > > - 	  if (!POINTER_TYPE_P (TREE_TYPE (name)))
> > > > + 	  if (SSA_NAME_IN_FREE_LIST (name) || !POINTER_TYPE_P (TREE_TYPE (name)))
> > > 
> > > If you're having to add this kind of check, it means that
> > > we've a serious conceptual problem elsewhere.
> > 
> > I think the conceptual problem is to keep ssa_names array pointing to
> > nodes in the freelist.  I sent separate message about setting them to
> > NULL and re-initializing when node is recycled, but didn't get a reply.
> Sorry.  I don't follow what you're trying to say.
> 
> Fundamentally, if we require SSA_NAMEs never leak, then we've screwed
> something up so badly, that we might as well quit now.  I don't mind
> finding leaks and fixing them, hell, I probably wouldn't object
> too strongly to running a pass over the IL once or twice to mark
> used SSA_NAMEs and release any which weren't marked.
> 
> The problem is your change to explicitly call ggc_free bypasses the
> entire GC mechanisms we've built.  As I've stated before, if you
> can't be absolutely sure that there are no pointers left into your
> object, then the object is _NOT_ a candidate for ggc_free.  I can't
> emphasize this enough.
> 
> Or to put it another way, the only objects we should ever explicitly
> ggc_free would be those which, if they show up as live after a certain
> point would indicate a fundamental bug.  basic blocks, edges and the
> like do not IMHO fit that description, except maybe between compiling
> functions.
> 
> I'm regretting ever pushing for including ggc_free; the more it's
> used the more I think we made a fundamental mistake.

This problem was actually noticed without the ggc_free.  I was
concerned that the loop is actually processing dead variables, but I see
it just clears out the alias information, which should be safe in this
case.

Basically, every loop walking ssa_names must be ready to see some dead
names in the list, which makes things a little bit more slippery than
they need to be.

Honza
> 
> Jeff
> 
> 

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:07               ` Jeffrey A Law
  2004-09-14 21:12                 ` Jakub Jelinek
  2004-09-14 21:26                 ` Jan Hubicka
@ 2004-09-14 21:40                 ` Diego Novillo
  2004-09-14 22:08                   ` Jeffrey A Law
  2 siblings, 1 reply; 875+ messages in thread
From: Diego Novillo @ 2004-09-14 21:40 UTC (permalink / raw)
  To: Jeff Law
  Cc: Jan Hubicka, Richard Henderson, Jakub Jelinek, gcc-patches,
	Mark Mitchell

On Tue, 2004-09-14 at 16:51, Jeffrey A Law wrote:

> The problem is your change to explicitly call ggc_free bypasses the
> entire GC mechanisms we've built.  As I've stated before, if you
> can't be absolutely sure that there are no pointers left into your
> object, then the object is _NOT_ a candidate for ggc_free.  I can't
> emphasize this enough.
> 
Heartily agree.  Sprinkling ggc_free is a dead give away that we are
papering over some basic memory management design problem.

Either we shouldn't be using GC here, or the collection mechanism is not
good enough.  All this random explicit deallocation is a perfect recipe
for disaster and extremely hard to track bugs.


Diego.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:40                 ` Diego Novillo
@ 2004-09-14 22:08                   ` Jeffrey A Law
  2004-09-14 22:25                     ` Jan Hubicka
                                       ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-14 22:08 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Jan Hubicka, Richard Henderson, Jakub Jelinek, gcc-patches,
	Mark Mitchell

On Tue, 2004-09-14 at 15:19, Diego Novillo wrote:
> On Tue, 2004-09-14 at 16:51, Jeffrey A Law wrote:
> 
> > The problem is your change to explicitly call ggc_free bypasses the
> > entire GC mechanisms we've built.  As I've stated before, if you
> > can't be absolutely sure that there are no pointers left into your
> > object, then the object is _NOT_ a candidate for ggc_free.  I can't
> > emphasize this enough.
> > 
> Heartily agree.  Sprinkling ggc_free is a dead give away that we are
> papering over some basic memory management design problem.
> 
> Either we shouldn't be using GC here, or the collection mechanism is not
> good enough.  All this random explicit deallocation is a perfect recipe
> for disaster and extremely hard to track bugs.
You know, if calling ggc_free "fixes" a memory leak, then well, the
ggc_free is wrong, no other way to look at it.

ggc_free's only purpose should be to return objects to the GC pool
faster (ie, before the next collection point).  Any other use is
simply wrong.

jeff


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 22:08                   ` Jeffrey A Law
@ 2004-09-14 22:25                     ` Jan Hubicka
  2004-09-14 23:55                     ` Michael Matz
  2004-09-15  4:38                     ` David Edelsohn
  2 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-14 22:25 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Diego Novillo, Jan Hubicka, Richard Henderson, Jakub Jelinek,
	gcc-patches, Mark Mitchell

> On Tue, 2004-09-14 at 15:19, Diego Novillo wrote:
> > On Tue, 2004-09-14 at 16:51, Jeffrey A Law wrote:
> > 
> > > The problem is your change to explicitly call ggc_free bypasses the
> > > entire GC mechanisms we've built.  As I've stated before, if you
> > > can't be absolutely sure that there are no pointers left into your
> > > object, then the object is _NOT_ a candidate for ggc_free.  I can't
> > > emphasize this enough.
> > > 
> > Heartily agree.  Sprinkling ggc_free is a dead give away that we are
> > papering over some basic memory management design problem.
> > 
> > Either we shouldn't be using GC here, or the collection mechanism is not
> > good enough.  All this random explicit deallocation is a perfect recipe
> > for disaster and extremely hard to track bugs.
> You know, if calling ggc_free "fixes" a memory leak, then well, the
> ggc_free is wrong, no other way to look at it.
> 
> ggc_free's only purpose should be to return objects to the GC pool
> faster (ie, before the next collection point).  Any other use is
> simply wrong.

Of course, this is what I am trying to use ggc_free for.

We already do have checking, so the garbage collector crashes badly the
first time it sees a pointer to an explicitly freed object.  The bugs thus
do not appear to be that difficult to notice with gcac, and we can't use
ggc_free for objects that still have pointers to them.

Sorry for the breakage in this case: it bootstrapped on x86-64, and until
quite recently we used explicit malloc/free pairs on the CFG, so it seemed
safe to me.
Still, it actually seems to be a bug to keep pointers to dead basic blocks
in the insn chain.

Honza
> 
> jeff
> 

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:12                 ` Jakub Jelinek
@ 2004-09-14 22:33                   ` Daniel Jacobowitz
  2004-09-14 22:53                   ` Richard Henderson
  2004-09-16  8:35                   ` Jeffrey A Law
  2 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-09-14 22:33 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jeffrey A Law, Jan Hubicka, Richard Henderson, Diego Novillo,
	gcc-patches, Mark Mitchell

On Tue, Sep 14, 2004 at 05:07:02PM -0400, Jakub Jelinek wrote:
> On Tue, Sep 14, 2004 at 02:51:39PM -0600, Jeffrey A Law wrote:
> > The problem is your change to explicitly call ggc_free bypasses the
> > entire GC mechanisms we've built.  As I've stated before, if you
> > can't be absolutely sure that there are no pointers left into your
> > object, then the object is _NOT_ a candidate for ggc_free.  I can't
> > emphasize this enough.
> 
> Shouldn't we have a checking mode which verifies this then?
> I.e. ggc_free in that mode would just set a flag and if during GC
> collection an object marked that way is reachable, we would abort ().

What a good idea.  Richard implemented it in January, when he
implemented ggc_free :-)

-- 
Daniel Jacobowitz


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:12                 ` Jakub Jelinek
  2004-09-14 22:33                   ` Daniel Jacobowitz
@ 2004-09-14 22:53                   ` Richard Henderson
  2004-09-16  8:35                   ` Jeffrey A Law
  2 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2004-09-14 22:53 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jeffrey A Law, Jan Hubicka, Diego Novillo, gcc-patches, Mark Mitchell

On Tue, Sep 14, 2004 at 05:07:02PM -0400, Jakub Jelinek wrote:
> I.e. ggc_free in that mode would just set a flag and if during GC
> collection an object marked that way is reachable, we would abort ().

This was suggested originally, yes.  No one's found the time.
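
The check Jakub describes can be modelled in a few lines of C.  This is
only a toy sketch, not code from GCC's collector: the struct layout and
the names toy_obj, toy_free and toy_mark are all hypothetical, and the
marker counts violations where a real checker would abort ().

```c
#include <assert.h>
#include <stddef.h>

/* Toy object: two outgoing pointers plus the bookkeeping bits a
   checking collector would need.  All names here are hypothetical.  */
struct toy_obj
{
  struct toy_obj *child[2];
  unsigned marked : 1;
  unsigned freed : 1;   /* Set by toy_free, tested while marking.  */
};

/* Checking-mode "free": do not recycle the memory, only record the
   caller's claim that nothing points to OBJ any more.  */
static void
toy_free (struct toy_obj *obj)
{
  assert (obj != NULL);
  obj->freed = 1;
}

/* Mark everything reachable from OBJ.  Return how many explicitly
   freed objects were reached; a real checker would abort () on the
   first one instead of counting.  */
static int
toy_mark (struct toy_obj *obj)
{
  int bad = 0;
  size_t i;

  if (obj == NULL || obj->marked)
    return 0;
  obj->marked = 1;
  if (obj->freed)
    bad++;
  for (i = 0; i < 2; i++)
    bad += toy_mark (obj->child[i]);
  return bad;
}
```

A nonzero result from toy_mark means some live pointer still leads to an
object that was declared dead, which is exactly the situation the
checking mode is meant to catch.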


r~


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 22:08                   ` Jeffrey A Law
  2004-09-14 22:25                     ` Jan Hubicka
@ 2004-09-14 23:55                     ` Michael Matz
  2004-09-15  0:25                       ` Jan Hubicka
  2004-09-15  4:38                     ` David Edelsohn
  2 siblings, 1 reply; 875+ messages in thread
From: Michael Matz @ 2004-09-14 23:55 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Diego Novillo, Jan Hubicka, Richard Henderson, Jakub Jelinek,
	gcc-patches, Mark Mitchell

Hi,

On Tue, 14 Sep 2004, Jeffrey A Law wrote:

> You know, if calling ggc_free "fixes" a memory leak, then well, the
> ggc_free is wrong, no other way to look at it.
> 
> ggc_free's only purpose should be to return objects to the GC pool
> faster (ie, before the next collection point).  Any other use is
> simply wrong.

Hmm, you are stating the obvious, and it is what Honza actually wants.  It 
just uncovered a latent problem elsewhere, namely that "inactive" entries 
were still referenced, which then got noticed when a ggc_free was done on 
them.

Diego mentioned that sprinkling ggc_free is a recipe for disaster.  I 
disagree partly.  If something in ggc is conceptually dead after some 
transformation (like deleting a statement or similar), then there really 
should be no other references to it anymore.  Garbage collected memory 
actually hides such problems, which could result in wrong answers instead 
of crashes.  Adding ggc_free just exposes those problems by crashing or 
similar, but it is not a problem in itself.

Of course ggc_free should only be applied to things with a definitive life 
time, not "sprinkled around".  But I don't see anyone doing the latter.

Ciao,
Michael.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 23:55                     ` Michael Matz
@ 2004-09-15  0:25                       ` Jan Hubicka
  0 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-15  0:25 UTC (permalink / raw)
  To: Michael Matz
  Cc: Jeffrey A Law, Diego Novillo, Jan Hubicka, Richard Henderson,
	Jakub Jelinek, gcc-patches, Mark Mitchell

> Hi,
> 
> On Tue, 14 Sep 2004, Jeffrey A Law wrote:
> 
> > You know, if calling ggc_free "fixes" a memory leak, then well, the
> > ggc_free is wrong, no other way to look at it.
> > 
> > ggc_free's only purpose should be to return objects to the GC pool
> > faster (ie, before the next collection point).  Any other use is
> > simply wrong.
> 
> Hmm, you are stating the obvious, and it is what Honza actually wants.  It 
> just uncovered a latent problem elsewhere, namely that "inactive" entries 
> were still referenced, which then got noticed when a ggc_free was done on 
> them.
> 
> Diego mentioned that sprinkling ggc_free is a recipe for disaster.  I 
> disagree partly.  If something in ggc is conceptually dead after some 
> transformation (like deleting a statement or similar), then there really 
> should be no other references to it anymore.  Garbage collected memory 
> actually hides such problems, which could result in wrong answers instead 
> of crashes.  Adding ggc_free just exposes those problems by crashing or 
> similar, but it is not a problem in itself.

Yes, actually I think of ggc_free as a useful tool to verify that
something that should be conceptually dead actually is dead, with the
extra bonus of allowing faster memory reuse.  We have clear proof that
not caring about these issues is the way to disaster.  Just a week ago we
were keeping pointers to both gimple and RTL bodies of already compiled
functions, increasing our memory consumption for large compilation units
several times (400MB versus 36MB on Gerald's testcase, and of course it
is possible to go much further down in the consumption.  Because of the
same bug, but relating to some RTL bodies only, 3.4 used to consume 120MB).

Honza
> 
> Of course ggc_free should only be applied to things with a definitive life 
> time, not "sprinkled around".  But I don't see anyone doing the latter.
> 
> 
> Ciao,
> Michael.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 22:08                   ` Jeffrey A Law
  2004-09-14 22:25                     ` Jan Hubicka
  2004-09-14 23:55                     ` Michael Matz
@ 2004-09-15  4:38                     ` David Edelsohn
  2004-09-15 12:25                       ` Jan Hubicka
  2004-09-15 19:50                       ` Mike Stump
  2 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-09-15  4:38 UTC (permalink / raw)
  To: Jeffrey Law, Jan Hubicka; +Cc: gcc-patches

> if calling ggc_free "fixes" a memory leak, then well, the
> ggc_free is wrong, no other way to look at it.

> ggc_free's only purpose should be to return objects to the GC pool
> faster (ie, before the next collection point).  Any other use is
> simply wrong.

	I completely agree.

	If GCC is not freeing memory or not freeing memory soon enough,
then we should track down why GC thinks there still is a reference and fix
the dangling reference or improve the GC machinery to GC more often.

	ggc_free() is wrong and I strongly disagree with using it as a
fundamental concept in the design and implementation of GCC.

David


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:07           ` Jeffrey A Law
  2004-09-14 21:19             ` Jan Hubicka
@ 2004-09-15 11:59             ` Jan Hubicka
  2004-09-15 13:07               ` Diego Novillo
  1 sibling, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2004-09-15 11:59 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Jan Hubicka, Diego Novillo, Jakub Jelinek, gcc-patches,
	Richard Henderson, Mark Mitchell

> If there's any reason to clear released entries in the formal
> SSA_NAME table, this is it.  The fact that right now any code that
> wants to walk over the table and perform actions on the names found
> therein has to check if the returned name was released or not
> is rather prone to hidden errors.

Hi,
Bootstrapped/regtested on i686-pc-linux-gnu, OK?

2004-09-15  Jan Hubicka  <jh@suse.cz>
	* tree-into-ssa.c (rewrite_ssa_into_ssa):  Expect ssa_name to return
	NULL.
	* tree-ssa-alias.c (init_alias_info): Likewise.
	* tree-ssa.c (verify_flow_sensitive_alias_info): Likewise.
	(verify_ssa): Likewise.
	* tree-ssanames.c (make_ssa_name): Clear out ssa_names arrays.
Index: tree-into-ssa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-into-ssa.c,v
retrieving revision 2.19
diff -c -3 -p -r2.19 tree-into-ssa.c
*** tree-into-ssa.c	9 Sep 2004 20:53:36 -0000	2.19
--- tree-into-ssa.c	14 Sep 2004 23:42:34 -0000
*************** rewrite_ssa_into_ssa (void)
*** 1622,1628 ****
    sbitmap_free (mark_def_sites_global_data.kills);
  
    for (i = 1; i < num_ssa_names; i++)
!     set_current_def (ssa_name (i), NULL_TREE);
  
    /* Insert PHI nodes at dominance frontiers of definition blocks.  */
    insert_phi_nodes (dfs, to_rename);
--- 1622,1629 ----
    sbitmap_free (mark_def_sites_global_data.kills);
  
    for (i = 1; i < num_ssa_names; i++)
!     if (ssa_name (i))
!       set_current_def (ssa_name (i), NULL_TREE);
  
    /* Insert PHI nodes at dominance frontiers of definition blocks.  */
    insert_phi_nodes (dfs, to_rename);
*************** rewrite_ssa_into_ssa (void)
*** 1679,1685 ****
    for (i = 1; i < num_ssa_names; i++)
      {
        name = ssa_name (i);
!       if (!SSA_NAME_AUX (name))
  	continue;
  
        free (SSA_NAME_AUX (name));
--- 1680,1686 ----
    for (i = 1; i < num_ssa_names; i++)
      {
        name = ssa_name (i);
!       if (!name || !SSA_NAME_AUX (name))
  	continue;
  
        free (SSA_NAME_AUX (name));
Index: tree-ssa-alias.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-alias.c,v
retrieving revision 2.33
diff -c -3 -p -r2.33 tree-ssa-alias.c
*** tree-ssa-alias.c	11 Sep 2004 18:57:03 -0000	2.33
--- tree-ssa-alias.c	14 Sep 2004 23:42:34 -0000
*************** init_alias_info (void)
*** 414,420 ****
  	{
  	  tree name = ssa_name (i);
  
! 	  if (!POINTER_TYPE_P (TREE_TYPE (name)))
  	    continue;
  
  	  if (SSA_NAME_PTR_INFO (name))
--- 414,420 ----
  	{
  	  tree name = ssa_name (i);
  
! 	  if (!name || !POINTER_TYPE_P (TREE_TYPE (name)))
  	    continue;
  
  	  if (SSA_NAME_PTR_INFO (name))
Index: tree-ssa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa.c,v
retrieving revision 2.36
diff -c -3 -p -r2.36 tree-ssa.c
*** tree-ssa.c	14 Sep 2004 07:20:05 -0000	2.36
--- tree-ssa.c	14 Sep 2004 23:42:34 -0000
*************** verify_flow_sensitive_alias_info (void)
*** 407,412 ****
--- 407,414 ----
        struct ptr_info_def *pi;
  
        ptr = ssa_name (i);
+       if (!ptr)
+ 	continue;
        ann = var_ann (SSA_NAME_VAR (ptr));
        pi = SSA_NAME_PTR_INFO (ptr);
  
*************** verify_flow_sensitive_alias_info (void)
*** 454,478 ****
  	  size_t j;
  
  	  for (j = i + 1; j < num_ssa_names; j++)
! 	    {
! 	      tree ptr2 = ssa_name (j);
! 	      struct ptr_info_def *pi2 = SSA_NAME_PTR_INFO (ptr2);
  
! 	      if (!TREE_VISITED (ptr2) || !POINTER_TYPE_P (TREE_TYPE (ptr2)))
! 		continue;
  
! 	      if (pi2
! 		  && pi2->name_mem_tag
! 		  && pi2->pt_vars
! 		  && bitmap_first_set_bit (pi2->pt_vars) >= 0
! 		  && pi->name_mem_tag != pi2->name_mem_tag
! 		  && bitmap_equal_p (pi->pt_vars, pi2->pt_vars))
! 		{
! 		  error ("Two pointers with different name tags and identical points-to sets");
! 		  debug_variable (ptr2);
! 		  goto err;
! 		}
! 	    }
  	}
      }
  
--- 456,481 ----
  	  size_t j;
  
  	  for (j = i + 1; j < num_ssa_names; j++)
! 	    if (ssa_name (j))
! 	      {
! 		tree ptr2 = ssa_name (j);
! 		struct ptr_info_def *pi2 = SSA_NAME_PTR_INFO (ptr2);
  
! 		if (!TREE_VISITED (ptr2) || !POINTER_TYPE_P (TREE_TYPE (ptr2)))
! 		  continue;
  
! 		if (pi2
! 		    && pi2->name_mem_tag
! 		    && pi2->pt_vars
! 		    && bitmap_first_set_bit (pi2->pt_vars) >= 0
! 		    && pi->name_mem_tag != pi2->name_mem_tag
! 		    && bitmap_equal_p (pi->pt_vars, pi2->pt_vars))
! 		  {
! 		    error ("Two pointers with different name tags and identical points-to sets");
! 		    debug_variable (ptr2);
! 		    goto err;
! 		  }
! 	      }
  	}
      }
  
*************** verify_ssa (void)
*** 511,517 ****
  
    /* Keep track of SSA names present in the IL.  */
    for (i = 1; i < num_ssa_names; i++)
!     TREE_VISITED (ssa_name (i)) = 0;
  
    calculate_dominance_info (CDI_DOMINATORS);
  
--- 514,521 ----
  
    /* Keep track of SSA names present in the IL.  */
    for (i = 1; i < num_ssa_names; i++)
!     if (ssa_name (i))
!       TREE_VISITED (ssa_name (i)) = 0;
  
    calculate_dominance_info (CDI_DOMINATORS);
  
Index: tree-ssanames.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssanames.c,v
retrieving revision 2.12
diff -c -3 -p -r2.12 tree-ssanames.c
*** tree-ssanames.c	9 Sep 2004 07:54:12 -0000	2.12
--- tree-ssanames.c	14 Sep 2004 23:42:34 -0000
*************** make_ssa_name (tree var, tree stmt)
*** 207,212 ****
--- 207,214 ----
        memset (t, 0, tree_size (t));
        TREE_SET_CODE (t, SSA_NAME);
        SSA_NAME_VERSION (t) = save_version;
+       gcc_assert (ssa_name (save_version) == NULL);
+       VARRAY_TREE (ssa_names, save_version) = t;
      }
    else
      {
*************** release_ssa_name (tree var)
*** 262,267 ****
--- 264,270 ----
       defining statement.  */
    if (! SSA_NAME_IN_FREE_LIST (var))
      {
+       VARRAY_TREE (ssa_names, SSA_NAME_VERSION (var)) = NULL;
        SSA_NAME_IN_FREE_LIST (var) = 1;
        TREE_CHAIN (var) = free_ssanames;
        free_ssanames = var;
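
The invariant the patch establishes (a released name leaves a NULL hole
in the table, and every walker must skip holes) can be shown in a small
standalone model.  This is a sketch under stated assumptions: the toy_*
names, the fixed-size table and the static storage are hypothetical and
stand in for the VARRAY and GC allocation used by tree-ssanames.c.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NAMES 8

struct toy_name
{
  unsigned version;
  int in_free_list;
  struct toy_name *chain;   /* Free-list link.  */
};

static struct toy_name toy_storage[TOY_MAX_NAMES];
static struct toy_name *toy_names[TOY_MAX_NAMES];   /* Slot 0 is unused.  */
static unsigned toy_num_names = 1;
static struct toy_name *toy_free_names;

/* Create a name, reusing a released one when possible.  */
static struct toy_name *
toy_make_name (void)
{
  struct toy_name *t = toy_free_names;
  unsigned version;

  if (t)
    {
      /* Reuse a released name and put it back into its old slot,
         which must still be empty.  */
      toy_free_names = t->chain;
      version = t->version;
      assert (toy_names[version] == NULL);
    }
  else
    {
      assert (toy_num_names < TOY_MAX_NAMES);
      version = toy_num_names++;
      t = &toy_storage[version];
    }
  t->version = version;
  t->in_free_list = 0;
  t->chain = NULL;
  toy_names[version] = t;
  return t;
}

/* Release a name: clear its table slot and push it on the free list.  */
static void
toy_release_name (struct toy_name *t)
{
  if (!t->in_free_list)
    {
      toy_names[t->version] = NULL;   /* Walkers now see a hole.  */
      t->in_free_list = 1;
      t->chain = toy_free_names;
      toy_free_names = t;
    }
}

/* Walk the table the way the patched loops do, skipping holes.  */
static unsigned
toy_count_live (void)
{
  unsigned i, n = 0;

  for (i = 1; i < toy_num_names; i++)
    if (toy_names[i])
      n++;
  return n;
}
```

With this shape a walker can no longer pick up a released name by
accident; it either sees a live name or a NULL that it has to test for.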


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15  4:38                     ` David Edelsohn
@ 2004-09-15 12:25                       ` Jan Hubicka
  2004-09-15 12:30                         ` Jan Hubicka
  2004-09-15 19:50                       ` Mike Stump
  1 sibling, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2004-09-15 12:25 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jeffrey Law, Jan Hubicka, gcc-patches

> > if calling ggc_free "fixes" a memory leak, then well, the
> > ggc_free is wrong, no other way to look at it.
> 
> > ggc_free's only purpose should be to return objects to the GC pool
> > faster (ie, before the next collection point).  Any other use is
> > simply wrong.
> 
> 	I completely agree.
> 
> 	If GCC is not freeing memory or not freeing memory soon enough,
> then we should track down why GC thinks there still is a reference and fix
> the dangling reference or improve the GC machinery to GC more often.

There is still a lot of confusion.  One cannot use ggc_free when
there is a dangling reference to the object, or you will get an abort in
the next ggc_collect run with checking enabled, when GGC follows the
pointer to the freed object.

The only things it is useful for are reducing the frequency of
ggc_collect invocations and improving data locality.  (It is also useful
for answering the question "who is still pointing to my data?")

Honza
> 
> 	ggc_free() is wrong and I strongly disagree with using it as a
> fundamental concept in the design and implementation of GCC.
> 
> David


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 12:25                       ` Jan Hubicka
@ 2004-09-15 12:30                         ` Jan Hubicka
  2004-09-15 13:05                           ` Diego Novillo
  0 siblings, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2004-09-15 12:30 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: David Edelsohn, Jeffrey Law, Jan Hubicka, gcc-patches

> > > if calling ggc_free "fixes" a memory leak, then well, the
> > > ggc_free is wrong, no other way to look at it.
> > 
> > > ggc_free's only purpose should be to return objects to the GC pool
> > > faster (ie, before the next collection point).  Any other use is
> > > simply wrong.
> > 
> > 	I completely agree.
> > 
> > 	If GCC is not freeing memory or not freeing memory soon enough,
> > then we should track down why GC thinks there still is a reference and fix
> > the dangling reference or improve the GC machinery to GC more often.
> 
> There is still a lot of confusion.  One cannot use ggc_free when
> there is a dangling reference to the object, or you will get an abort in
> the next ggc_collect run with checking enabled, when GGC follows the
> pointer to the freed object.
> 
> The only things it is useful for are reducing the frequency of
> ggc_collect invocations and improving data locality.  (It is also useful
> for answering the question "who is still pointing to my data?")

What about renaming ggc_free to ggc_dead to make it obvious that one can
not "free" live data?

Honza
> 
> Honza
> > 
> > 	ggc_free() is wrong and I strongly disagree with using it as a
> > fundamental concept in the design and implementation of GCC.
> > 
> > David


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 12:30                         ` Jan Hubicka
@ 2004-09-15 13:05                           ` Diego Novillo
  2004-09-15 13:13                             ` Daniel Berlin
                                               ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Diego Novillo @ 2004-09-15 13:05 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: David Edelsohn, Jeff Law, Jan Hubicka, gcc-patches

On Wed, 2004-09-15 at 07:32, Jan Hubicka wrote:

> What about renaming ggc_free to ggc_dead to make it obvious that one can
> not "free" live data?
> 
It's not a matter of naming.  When you are expunging a basic block, you
really consider it dead, but as we have proven, you don't really know if
it's reachable from somewhere else.

Again, if we find ourselves having to help GC by expressly telling it
what is dead and what isn't, then our GC system is broken and ggc_free
is _NOT_ the way to fix it.

If we are holding onto too much unnecessary data in our algorithms, then
the solution _ought_ to involve breaking the chains to the dead data so
that we can collect all that garbage.  And the way of breaking those
chains should simply be writing NULL to your pointers.
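
As an illustration of the point about breaking chains, here is a toy
mark-and-count collection over a four-node heap.  It is a sketch only:
toy_node, toy_heap and toy_collect are hypothetical names and have
nothing to do with GCC's actual collector.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HEAP_SIZE 4

struct toy_node
{
  struct toy_node *next;
  int marked;
};

static struct toy_node toy_heap[TOY_HEAP_SIZE];

/* Mark every node reachable from ROOT along the next chain.  */
static void
toy_mark_from (struct toy_node *root)
{
  for (; root && !root->marked; root = root->next)
    root->marked = 1;
}

/* One collection: mark from ROOT, then count the unmarked nodes and
   clear all marks.  Returns how many heap nodes were unreachable.  */
static int
toy_collect (struct toy_node *root)
{
  int garbage = 0;
  size_t i;

  assert (root == NULL
          || (root >= toy_heap && root < toy_heap + TOY_HEAP_SIZE));
  toy_mark_from (root);
  for (i = 0; i < TOY_HEAP_SIZE; i++)
    {
      if (!toy_heap[i].marked)
        garbage++;
      toy_heap[i].marked = 0;
    }
  return garbage;
}
```

While the chain 0 -> 1 -> 2 -> 3 is intact nothing is collectable; once
the next pointer of node 1 is set to NULL, nodes 2 and 3 become garbage
without any explicit free.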

If you want to use ggc_free() in your local tree to find out which
passes are holding on to garbage for too long, that is fine with me. 
But _please_ do not use that crutch to compensate for GC's shortcomings.

Together with that, we need to distinguish data structures whose usage
pattern is not really suitable for GC.  Again, the fact that you are
adding ggc_free() here and there may be a symptom of a memory allocation
mismatch.

Diego.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 11:59             ` Jan Hubicka
@ 2004-09-15 13:07               ` Diego Novillo
  0 siblings, 0 replies; 875+ messages in thread
From: Diego Novillo @ 2004-09-15 13:07 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Jeffrey A Law, Jakub Jelinek, gcc-patches, Richard Henderson,
	Mark Mitchell

On Wed, 2004-09-15 at 06:48, Jan Hubicka wrote:

> 	* tree-into-ssa.c (rewrite_ssa_into_ssa):  Expect ssa_name to return
> 	NULL.
> 	* tree-ssa-alias.c (init_alias_info): Likewise.
> 	* tree-ssa.c (verify_flow_sensitive_alias_info): Likewise.
> 	(verify_ssa): Likewise.
> 	* tree-ssanames.c (make_ssa_name): Clear out ssa_names arrays.
>
This is OK.


Thanks.  Diego.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 13:05                           ` Diego Novillo
  2004-09-15 13:13                             ` Daniel Berlin
@ 2004-09-15 13:13                             ` Daniel Jacobowitz
  2004-09-15 13:32                               ` Jan Hubicka
                                                 ` (2 more replies)
  2004-09-15 16:53                             ` Release RTL bodies after compilation (sometimes) Jeffrey A Law
  2 siblings, 3 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-09-15 13:13 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Jan Hubicka, David Edelsohn, Jeff Law, Jan Hubicka, gcc-patches

On Wed, Sep 15, 2004 at 08:30:43AM -0400, Diego Novillo wrote:
> On Wed, 2004-09-15 at 07:32, Jan Hubicka wrote:
> 
> > What about renaming ggc_free to ggc_dead to make it obvious that one can
> > not "free" live data?
> > 
> It's not a matter of naming.  When you are expunging a basic block, you
> really consider it dead, but as we have proven, you don't really know if
> it's reachable from somewhere else.

Then this isn't a suitable candidate for ggc_free, obviously.  Unless,
of course, you want to _define_ that it is dead when expunging it.  In
that case, ggc_free acts as a form of assertion.

> Again, if we find ourselves having to help GC by expressly telling it
> what is dead and what isn't, then our GC system is broken and ggc_free
> is _NOT_ the way to fix it.
> 
> If we are holding onto too much unnecessary data in our algorithms, then
> the solution _ought_ to involve breaking the chains to the dead data so
> that we can collect all that garbage.  And the way of breaking those
> chains should simply be writing NULL to your pointers.
> 
> If you want to use ggc_free() in your local tree to find out which
> passes are holding on to garbage for too long, that is fine with me. 
> But _please_ do not use that crutch to compensate for GC's shortcomings.
> 
> Together with that, we need to distinguish data structures whose usage
> pattern is not really suitable for GC.  Again, the fact that you are
> adding ggc_free() here and there may be a symptom of a memory allocation
> mismatch.

But this part, that you and Jeff keep reiterating, doesn't make sense.
Even with an improved garbage collector well beyond what we have today,
the fact is that collection is expensive, and the locality penalty for
garbage is expensive.  I've done measurements which suggest that, if
"optimal" memory allocation were possible, it would be 10-15% faster.
Reducing the number of collections required for a file has similar
benefit.

If you "solve" this by moving the object out of GC, you lose the
powerful checking ability of ggc_free.  With GC checking, it poisons
the object that you're claiming is dead.  With GCAC checking, it
verifies at the next collection that it really isn't reachable.

Some degree of this can be fixed by fixing "GC's shortcomings" - for
instance, putting optimizer data specific to a function into a GC zone
and discarding the zone (optionally checking that it isn't reachable!)
at the end of the function.  And Jeff's right to remind us that
ggc_free still has overhead - less in the collector I'm working on, but
it's still there.  So looping over small allocations to destroy them is
probably not a good idea.  But within processing a function, for large
allocations, what's the problem with using ggc_free?
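
The zone idea mentioned above (allocate per-function data from one
region and discard the whole region at the end of the function) might
look roughly like the following bump allocator.  This is a minimal
sketch, assuming a single fixed-size region; toy_zone and its functions
are hypothetical names, and the optional reachability check is not
modelled.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ZONE_BYTES 1024

/* Used only to pick a conservative alignment for allocations.  */
typedef union { long double ld; long long ll; void *p; } toy_max_align;

struct toy_zone
{
  size_t used;   /* Bytes handed out so far.  */
  union
  {
    toy_max_align align;
    unsigned char bytes[TOY_ZONE_BYTES];
  } mem;
};

/* Allocate SIZE bytes from ZONE, or return NULL if they do not fit.  */
static void *
toy_zone_alloc (struct toy_zone *zone, size_t size)
{
  const size_t align = sizeof (toy_max_align);
  void *p;

  assert (zone != NULL);
  if (size == 0 || size > TOY_ZONE_BYTES)
    return NULL;
  size = (size + align - 1) / align * align;   /* Round up.  */
  if (size > TOY_ZONE_BYTES - zone->used)
    return NULL;
  p = zone->mem.bytes + zone->used;
  zone->used += size;
  return p;
}

/* Discard everything in ZONE at once; return the bytes released.  */
static size_t
toy_zone_discard (struct toy_zone *zone)
{
  size_t released = zone->used;

  zone->used = 0;
  return released;
}
```

Discarding is a single store no matter how many objects were allocated,
which is the attraction compared with freeing small objects one by one.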

-- 
Daniel Jacobowitz


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 13:05                           ` Diego Novillo
@ 2004-09-15 13:13                             ` Daniel Berlin
  2004-09-15 13:13                             ` Daniel Jacobowitz
  2004-09-15 16:53                             ` Release RTL bodies after compilation (sometimes) Jeffrey A Law
  2 siblings, 0 replies; 875+ messages in thread
From: Daniel Berlin @ 2004-09-15 13:13 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Jan Hubicka, David Edelsohn, Jeff Law, Jan Hubicka, gcc-patches



On Wed, 15 Sep 2004, Diego Novillo wrote:

> On Wed, 2004-09-15 at 07:32, Jan Hubicka wrote:
>
> pattern is not really suitable for GC.  Again, the fact that you are
> adding ggc_free() here and there may be a symptom of a memory allocation
> mismatch.

We already knew/know that for things like basic blocks and edges, there 
was an allocation mismatch.
If you look at the history, they were either obstack or xmalloc'd (I 
forget which), and I changed it to be pool allocated.

However, in order to make the bb annotations reachable to the GC, we 
switched bb's back to gc allocation.
Personally, I still think the bb's/edge's should be pool allocated.
It's the obvious match for them.
However, the GC can't walk un-gc'd memory at all, so we can't do that.
It's GC limitations getting in the way of proper memory management here.
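
For readers unfamiliar with the term, pool allocation of fixed-size
objects such as basic blocks works roughly as sketched below.  This is a
toy model, not GCC's alloc-pool code: toy_block, toy_pool and the slot
count are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_SLOTS 4

/* Stand-in for a fixed-size object such as a basic block.  */
struct toy_block
{
  int index;
  struct toy_block *next_free;   /* Link used only while on the free list.  */
};

struct toy_pool
{
  struct toy_block slots[TOY_POOL_SLOTS];
  struct toy_block *free_list;   /* Blocks returned with toy_pool_free.  */
  size_t next_unused;            /* First slot never handed out.  */
};

/* Take a block from POOL, preferring recently freed ones.
   Returns NULL when the pool is exhausted.  */
static struct toy_block *
toy_pool_alloc (struct toy_pool *pool)
{
  struct toy_block *b = pool->free_list;

  if (b)
    pool->free_list = b->next_free;
  else if (pool->next_unused < TOY_POOL_SLOTS)
    b = &pool->slots[pool->next_unused++];
  else
    return NULL;
  b->index = 0;
  b->next_free = NULL;
  return b;
}

/* Return B to POOL so that the next allocation can reuse it.  */
static void
toy_pool_free (struct toy_pool *pool, struct toy_block *b)
{
  assert (b != NULL);
  b->next_free = pool->free_list;
  pool->free_list = b;
}
```

Both operations are constant time and freed blocks are reused
immediately, but nothing here lets a collector see pointers stored
inside the blocks, which is the limitation described above.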
--Dan


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 13:13                             ` Daniel Jacobowitz
@ 2004-09-15 13:32                               ` Jan Hubicka
  2004-09-15 15:59                               ` Diego Novillo
       [not found]                               ` <drow@false.org>
  2 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-15 13:32 UTC (permalink / raw)
  To: Diego Novillo, Jan Hubicka, David Edelsohn, Jeff Law,
	Jan Hubicka, gcc-patches

> On Wed, Sep 15, 2004 at 08:30:43AM -0400, Diego Novillo wrote:
> > On Wed, 2004-09-15 at 07:32, Jan Hubicka wrote:
> > 
> > > What about renaming ggc_free to ggc_dead to make it obvious that one can
> > > not "free" live data?
> > > 
> > It's not a matter of naming.  When you are expunging a basic block, you
> > really consider it dead, but as we have proven, you don't really know if
> > it's reachable from somewhere else.
> 
> Then this isn't a suitable candidate for ggc_free, obviously.  Unless,
> of course, you want to _define_ that it is dead when expunging it.  In
> that case, ggc_free acts as a form of assertion.

And this was my intent, of course (i.e. I thought that we were careful
enough not to leak pointers to dead blocks, as we simply clear them out
when removing an instruction from the chain).  This, however, turned out
to be wrong (for quite weird reasons), so we reverted this particular
change, and I do not intend to put it back unless something changes
significantly.

Honza


* Re: Release RTL bodies after compilation (sometimes)
       [not found]                               ` <drow@false.org>
@ 2004-09-15 15:46                                 ` David Edelsohn
  2004-09-15 20:17                                   ` Geoffrey Keating
  2005-04-01 21:58                                 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
                                                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-09-15 15:46 UTC (permalink / raw)
  To: Diego Novillo, Jan Hubicka, Jeff Law, Jan Hubicka, gcc-patches

>>>>> Daniel Jacobowitz writes:

Daniel> But this part, that you and Jeff keep reiterating, doesn't make sense.
Daniel> Even with an improved garbage collector well beyond what we have today,
Daniel> the fact is that collection is expensive, and the locality penalty for
Daniel> garbage is expensive.  I've done measurements which suggest that, if
Daniel> "optimal" memory allocation were possible, it would be 10-15% faster.
Daniel> Reducing the number of collections required for a file has similar
Daniel> benefit.

Daniel> Some degree of this can be fixed by fixing "GC's shortcomings" - for
Daniel> instance, putting optimizer data specific to a function into a GC zone
Daniel> and discarding the zone (optionally checking that it isn't reachable!)
Daniel> at the end of the function.  And Jeff's right to remind us that
Daniel> ggc_free still has overhead - less in the collector I'm working on, but
Daniel> it's still there.  So looping over small allocations to destroy them is
Daniel> probably not a good idea.  But within processing a function, for large
Daniel> allocations, what's the problem with using ggc_free?

	If the data is dead, make sure that there is no reference and let
GC clean it up.

	Also, within reason, we should not become obsessed with reducing
memory usage.  Having GCC 4.0 use the same amount or less memory than GCC
3.4 isn't necessarily good.  The issue is locality and how we use the
memory.  If it is useful to cache the information contained in the
additional memory and the information has good locality properties and we
access the information judiciously, it may be good to have it hang around
instead of recomputing it.

	Using lots of memory in an inefficient manner slows down the
compiler.  We should not turn the compile-time performance into a fixation
on reducing memory usage as a goal in isolation.

David


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 13:13                             ` Daniel Jacobowitz
  2004-09-15 13:32                               ` Jan Hubicka
@ 2004-09-15 15:59                               ` Diego Novillo
  2004-09-15 16:12                                 ` Daniel Jacobowitz
       [not found]                               ` <drow@false.org>
  2 siblings, 1 reply; 875+ messages in thread
From: Diego Novillo @ 2004-09-15 15:59 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Jan Hubicka, David Edelsohn, Jeff Law, Jan Hubicka, gcc-patches

On Wed, 2004-09-15 at 08:53, Daniel Jacobowitz wrote:

> But this part, that you and Jeff keep reiterating, doesn't make sense.
> Even with an improved garbage collector well beyond what we have today,
> the fact is that collection is expensive, and the locality penalty for
> garbage is expensive.
>
That is a symptom that GC may not be suited to our allocation profile. 
Or that we have a bad collector.  Or any number of other mismatches.

> what's the problem with using ggc_free?
>
If you are going to use a garbage collector, then use a garbage
collector.  If you find yourself having to baby sit its decisions, then
you have a fundamental design problem.

Garbage collected memory is by definition "allocate and forget" not
"allocate and I will tell you what to release".  If you do that, you are
subverting the very principle of garbage collected memory.  At that
point, you have to switch to a different memory allocation strategy,
improve your collector, use different pools, whatever.


Diego.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 15:59                               ` Diego Novillo
@ 2004-09-15 16:12                                 ` Daniel Jacobowitz
  2004-09-15 17:07                                   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-09-15 16:12 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Jan Hubicka, David Edelsohn, Jeff Law, Jan Hubicka, gcc-patches

On Wed, Sep 15, 2004 at 11:10:14AM -0400, Diego Novillo wrote:
> On Wed, 2004-09-15 at 08:53, Daniel Jacobowitz wrote:
> 
> > But this part, that you and Jeff keep reiterating, doesn't make sense.
> > Even with an improved garbage collector well beyond what we have today,
> > the fact is that collection is expensive, and the locality penalty for
> > garbage is expensive.
> >
> That is a symptom that GC may not be suited to our allocation profile. 
> Or that we have a bad collector.  Or any number of other mismatches.
> 
> > what's the problem with using ggc_free?
> >
> If you are going to use a garbage collector, then use a garbage
> collector.  If you find yourself having to baby sit its decisions, then
> you have a fundamental design problem.
> 
> Garbage collected memory is by definition "allocate and forget" not
> "allocate and I will tell you what to release".  If you do that, you are
> subverting the very principle of garbage collected memory.  At that
> point, you have to switch to a different memory allocation strategy,
> improve your collector, use different pools, whatever.

I hope my point is clear by now, but I'll try to summarize, since I
have the feeling I'm not getting through...

I don't think your "definition" is correct, or that marking objects as
explicitly free conflicts with the design of using a garbage collector.

Why should your memory management subsystem have to know about
everything that has a fixed lifetime?  I think that a good way to use
garbage collection, if you're willing to define its interfaces
this way, is:

  - If I know I am done with X, and no one else should be holding on to
    X, tell the GC subsystem so.  Let it decide whether to drop X back
    in the allocation pool, or hang on to X and check my assertion, or
    look at X and laugh itself silly.

  - If I think I am done with X, but I'm not statically sure, just let
    it go instead of inventing mechanisms to track it more precisely.
    The GC maid will sweep the floors for me later anyway.

No matter how much we improve the garbage collector, I think that it
will continue to be useful to have these hints.  Maybe with a
generational collector (for example) collection would be enough faster
that the right thing to do would be to ignore the hints.  Maybe it
wouldn't.  And the assertions will remain valuable.

-- 
Daniel Jacobowitz

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 13:05                           ` Diego Novillo
  2004-09-15 13:13                             ` Daniel Berlin
  2004-09-15 13:13                             ` Daniel Jacobowitz
@ 2004-09-15 16:53                             ` Jeffrey A Law
  2004-09-15 17:20                               ` Jan Hubicka
  2 siblings, 1 reply; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-15 16:53 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jan Hubicka, David Edelsohn, Jan Hubicka, gcc-patches

On Wed, 2004-09-15 at 06:30, Diego Novillo wrote:
> On Wed, 2004-09-15 at 07:32, Jan Hubicka wrote:
> 
> > What about renaming ggc_free to ggc_dead to make it obvious that one can
> > not "free" live data?
> > 
> It's not a matter of naming.  When you are expunging a basic block, you
> really consider it dead, but as we have proven, you don't really know if
> it's reachable from somewhere else.
Precisely.

And people can talk about whatever checks have been implemented to
ensure that "bad things don't happen", but as this case clearly shows
those checks are not sufficient.


> If we are holding onto too much unnecessary data in our algorithms, then
> the solution _ought_ to involve breaking the chains to the dead data so
> that we can collect all that garbage.  And the way of breaking those
> chains should simply be writing NULL to your pointers.
Right.  Now a good way to help find those cases would be a huge
step forward.  Right now I find them by noting the object which ought
to be dead, then setting conditional breakpoints in the marking
routines.  Needless to say that's slow and error prone and
doesn't scale well.

Jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 16:12                                 ` Daniel Jacobowitz
@ 2004-09-15 17:07                                   ` David Edelsohn
  2004-09-15 18:48                                     ` Daniel Jacobowitz
  2004-09-15 20:35                                     ` Michael Matz
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-09-15 17:07 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Diego Novillo, Jan Hubicka, Jeff Law, Jan Hubicka, gcc-patches

>>>>> Daniel Jacobowitz writes:

Daniel> I don't think your "definition" is correct, or that marking objects as
Daniel> explicitly free conflicts with the design of using a garbage collector.

	Diego's definition is correct.  If the memory is not getting
collected at the next GC phase, figure out why the GC still sees a
reference.

	If you want to tell the GC to collect the memory, then explicitly
NULL what you believe are the references.  Add assertions that there
should be no references to objects of type X at some appropriate barrier
and fix any dangling references if objects of type X still exist.  What
you are saying should be done with explicit ggc_free(), I (and I believe
Jeff and Diego) are saying should be done by removing any references.  If
you want to explicitly GC right after the reference is removed, fine.
Just don't explicitly tell the garbage collector to free a particular
object.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 16:53                             ` Release RTL bodies after compilation (sometimes) Jeffrey A Law
@ 2004-09-15 17:20                               ` Jan Hubicka
  2004-09-16  9:03                                 ` Jeffrey A Law
  0 siblings, 1 reply; 875+ messages in thread
From: Jan Hubicka @ 2004-09-15 17:20 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Diego Novillo, Jan Hubicka, David Edelsohn, Jan Hubicka, gcc-patches

> > If we are holding onto too much unnecessary data in our algorithms, then
> > the solution _ought_ to involve breaking the chains to the dead data so
> > that we can collect all that garbage.  And the way of breaking those
> > chains should simply be writing NULL to your pointers.
> Right.  Now a good way to help find those cases would be a huge
> step forward.  Right now I find them by noting the object which ought
> to be dead, then setting conditional breakpoints in the marking
> routines.  Needless to say that's slow and error prone and
> doesn't scale well.

What I do is add ggc_free on the object, wait for the next ggc_collect to
crash, and then look at the backtrace, which shows me who forgot about that
case.
I don't see much of a way to make this easier, except perhaps adding a
ggc_free/ggc_collect pair.

In my tree I also have ggc_dead_at_end_of_compilation, which ensures
that the pointers to the object will die eventually.  This is useful,
for instance, for ensuring that function bodies get reclaimed
proactively.

Leaving varrays aside, I would not object to making ggc_free not
release the block for re-use and simply verify that it is not
re-used when enable-checking is on, and be a no-op when it is off, if that
will make us feel safer about garbage collecting.
That would increase our memory usage by 5MB on combine.c (10%); 5% is
create_stmt_ann, which should be dealt with more effectively (and more
dangerously) by the zone collector anyway.

Concerning varrays, I still believe that most of these should go away
completely, or at least move out of the GGC pool, as GGC is not a good
data structure for this.

So we can make ggc_free be used by varrays only and have ggc_dead
do what ggc_free is doing now except for the freeing bit.

Honza
> 
> Jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 17:07                                   ` David Edelsohn
@ 2004-09-15 18:48                                     ` Daniel Jacobowitz
  2004-09-15 20:35                                     ` Michael Matz
  1 sibling, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-09-15 18:48 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Diego Novillo, Jan Hubicka, Jeff Law, Jan Hubicka, gcc-patches

On Wed, Sep 15, 2004 at 11:36:45AM -0400, David Edelsohn wrote:
> >>>>> Daniel Jacobowitz writes:
> 
> Daniel> I don't think your "definition" is correct, or that marking objects as
> Daniel> explicitly free conflicts with the design of using a garbage collector.
> 
> 	Diego's definition is correct.  If the memory is not getting
> collected at the next GC phase, figure out why the GC still sees a
> reference.
> 
> 	If you want to tell the GC to collect the memory, then explicitly
> NULL what you believe are the references.  Add assertions that there
> should be no references to objects of type X at some appropriate barrier
> and fix any dangling references if objects of type X still exist.  What
> you are saying should be done with explicit ggc_free(), I (and I believe
> Jeff and Diego) are saying should be done by removing any references.  If
> you want to explicitly GC right after the reference is removed, fine.
> Just don't explicitly tell the garbage collector to free a particular
> object.

[Continued this discussion offline.]

I think that what David is suggesting, whether it's more ideally
correct or not, is infeasible today.  We have a certain obligation to
do what we can in the short term since no one has expressed interest in
massive overhauls.

But it's clear that everyone else with anything to say in this thread
disagrees with my position on ggc_free, so I will shut up now.

-- 
Daniel Jacobowitz

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15  4:38                     ` David Edelsohn
  2004-09-15 12:25                       ` Jan Hubicka
@ 2004-09-15 19:50                       ` Mike Stump
  2004-09-15 19:58                         ` David Edelsohn
  2004-09-16  4:29                         ` Jeffrey A Law
  1 sibling, 2 replies; 875+ messages in thread
From: Mike Stump @ 2004-09-15 19:50 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jan Hubicka, gcc-patches, Jeffrey Law

On Sep 14, 2004, at 7:47 PM, David Edelsohn wrote:
> ggc_free() is wrong and I strongly disagree with using it as a
> fundamental concept in the design and implementation of GCC.

It is amazing to me that people hate it so much.  To me, it provides 
flexibility.  The flexibility to make a compiler that reduces memory 
pressure on a system.  The flexibility to reduce cache pressure on a 
system.  The flexibility to make a faster compiler.

If performance isn't an issue, then we should remove ggc_free.  If it 
is, we should keep it, at least until such time as another technology 
is proven to replace the need for it.

We should not allow people that don't care about performance to stop 
those that do, from achieving it.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 19:50                       ` Mike Stump
@ 2004-09-15 19:58                         ` David Edelsohn
  2004-09-15 20:10                           ` Michael Matz
  2004-09-16  3:43                           ` Jeffrey A Law
  2004-09-16  4:29                         ` Jeffrey A Law
  1 sibling, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-09-15 19:58 UTC (permalink / raw)
  To: Mike Stump; +Cc: gcc-patches

>>>>> Mike Stump writes:

Mike> If performance isn't an issue, then we should remove ggc_free.  If it 
Mike> is, we should keep it, at least until such time as another technology 
Mike> is proven to replace the need for it.

Mike> We should not allow people that don't care about performance to stop 
Mike> those that do, from achieving it.

	Mike, that is fallacious reasoning.  The discussion is about
ggc_free, not performance.  ggc_free does not automatically equate with
performance.  Linking the two issues and then arguing that people who
disagree with ggc_free are not interested in performance and are
inhibiting developers trying to improve performance is inappropriate and
unwarranted.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 19:58                         ` David Edelsohn
@ 2004-09-15 20:10                           ` Michael Matz
  2004-09-15 20:51                             ` David Edelsohn
  2004-09-16  3:43                           ` Jeffrey A Law
  1 sibling, 1 reply; 875+ messages in thread
From: Michael Matz @ 2004-09-15 20:10 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Mike Stump, gcc-patches

Hi,

On Wed, 15 Sep 2004, David Edelsohn wrote:

> Mike> If performance isn't an issue, then we should remove ggc_free.  If it 
> Mike> is, we should keep it, at least until such time as another technology 
> Mike> is proven to replace the need for it.
> 
> Mike> We should not allow people that don't care about performance to stop 
> Mike> those that do, from achieving it.
> 
> 	Mike, that is fallacious reasoning.  The discussion is about
> ggc_free, not performance.  ggc_free does not automatically equate with
> performance.

But it helps _finding_ leaked memory, so it does influence performance
very much.  If you don't believe me look up the old mails from Honza where
he used ggc_free to reduce memory overhead by like 2000%.  It is the kind
of assertion you already mentioned, if it's made not to free the memory,
but instead mark the object as dead in checking mode.

And it can also be used by the memory subsystem to do whatever decisions 
it wants to make based on the hints given by the user.  

More information for the memory subsystem is better than less information.  
This is obviously true.  One such bit of information can be provided by 
ggc_free (or whatever changed version of it, in order not to crash in 
the user's face as soon as the assertion the developer made does not hold).

> Linking the two issues and then arguing that people who disagree with
> ggc_free are not interested in performance and are inhibiting developers
> trying to improve performance is inappropriate and unwarranted.

Well, then the people disagreeing with the very principle of ggc_free (and 
it seems also some variants of it) should come up with a plan and patch to 
actually reduce memory in the same way as Honza.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 15:46                                 ` David Edelsohn
@ 2004-09-15 20:17                                   ` Geoffrey Keating
  2004-09-16  4:22                                     ` Jeffrey A Law
  0 siblings, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2004-09-15 20:17 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> 	Using lots of memory in an inefficient manner slows down the
> compiler.  We should not turn the compile-time performance into a fixation
> on reducing memory usage as a goal in isolation.

This is an important point.  I don't believe that freeing memory gives
significant compile speed benefits except in unusual cases, and in
general it slows the compile down because of the overhead of doing the
freeing.

When we speak of 'reducing memory usage' here at Apple, we do not mean
'allocating as much memory as before and then freeing it'.  We mean
allocating less memory than before, and preferably doing this by
allocating fewer objects rather than the same number of smaller ones.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 17:07                                   ` David Edelsohn
  2004-09-15 18:48                                     ` Daniel Jacobowitz
@ 2004-09-15 20:35                                     ` Michael Matz
  2004-09-16  7:38                                       ` Jeffrey A Law
  2004-09-18  7:36                                       ` Geoffrey Keating
  1 sibling, 2 replies; 875+ messages in thread
From: Michael Matz @ 2004-09-15 20:35 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Daniel Jacobowitz, Diego Novillo, Jan Hubicka, Jeff Law,
	Jan Hubicka, gcc-patches

Hi,

On Wed, 15 Sep 2004, David Edelsohn wrote:

> >>>>> Daniel Jacobowitz writes:
> 
> Daniel> I don't think your "definition" is correct, or that marking objects as
> Daniel> explicitly free conflicts with the design of using a garbage collector.
> 
> 	Diego's definition is correct.  If the memory is not getting
> collected at the next GC phase, figure out why the GC still sees a
> reference.

Oh, it will be collected (if it really is free).  That's not the point.  
The point is, that currently collection is slower the more things it has 
to walk.  And another point is that all these small later-to-be-collected
items can add up to some extreme amounts.  We could collect more often.  
That's slow.  Or we could build some reuse machinery on top of the GC 
memory.  Complicated, and another layer of mem management (but sometimes 
the right thing).  Or, if we _know_ that this or that object is dead, tell 
the memory subsystem so, and let it make something good with this 
information.  For instance freeing the object right now or later or never.

I.e. exactly what Daniel suggested.

In this regard ggc_free is merely an optimization of the memory usage
pattern in gcc, and an internal checking assertion, but not something
required for correctness.  It can lead to problems in only exactly those
cases where also a non-ggc_free world would have problems (modulo if used 
wrongly, but thankfully this can be seen easily).

The only difference is, that with ggc_free it ICEes, without ggc_free it
produces wrong code, due to the unexpected references to conceptually dead
data, which will not be noticed as dead, because of those references.  In
fact if not too much memory is wasted by this pseudo-dead data no one will
notice at all, and we all have a hard time debugging some funny heisenbug
wrong code problem.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 20:10                           ` Michael Matz
@ 2004-09-15 20:51                             ` David Edelsohn
  2004-09-15 21:02                               ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-09-15 20:51 UTC (permalink / raw)
  To: Michael Matz; +Cc: Mike Stump, gcc-patches

>>>>> Michael Matz writes:

Michael> But it helps _finding_ leaked memory, so it does influence performance
Michael> very much.  If you don't believe me look up the old mails from Honza where
Michael> he used ggc_free to reduce memory overhead by like 2000%.  It is the kind
Michael> of assertion you already mentioned, if it's made not to free the memory,
Michael> but instead mark the object as dead in checking mode.

	Verifying that an object is freed by a specific point is good.
That is different than asserting to the garbage collection system that
something can be and should be freed.  As we have seen, even with the
checks implemented by ggc_free(), we still can have dangling references
that break the compiler, instead of a well-defined ICE from the GC
subsystem.  Hopefully the zone-collector will allow more accurate checking
and better garbage collection performance.  I believe that I am accurately
stating that if you read Jeff and Diego's comments, they support more
powerful verification of GCC's GC that would allow more aggressive garbage
collection.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 20:51                             ` David Edelsohn
@ 2004-09-15 21:02                               ` Daniel Jacobowitz
  2004-09-16  4:58                                 ` Jeffrey A Law
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-09-15 21:02 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Michael Matz, Mike Stump, gcc-patches

On Wed, Sep 15, 2004 at 04:16:11PM -0400, David Edelsohn wrote:
> As we have seen, even with the
> checks implemented by ggc_free(), we still can have dangling references
> that break the compiler, instead of a well-defined ICE from the GC
> subsystem.

Incorrect.  The existing checks are only run at gcac and as far as I
can tell, no one ran a gcac bootstrap on a target that showed this
problem.

I discussed with Honza ways to move these checks to the non-gcac case.

-- 
Daniel Jacobowitz

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 19:58                         ` David Edelsohn
  2004-09-15 20:10                           ` Michael Matz
@ 2004-09-16  3:43                           ` Jeffrey A Law
  1 sibling, 0 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-16  3:43 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Mike Stump, gcc-patches

On Wed, 2004-09-15 at 13:03, David Edelsohn wrote:
> >>>>> Mike Stump writes:
> 
> Mike> If performance isn't an issue, then we should remove ggc_free.  If it 
> Mike> is, we should keep it, at least until such time as another technology 
> Mike> is proven to replace the need for it.
> 
> Mike> We should not allow people that don't care about performance to stop 
> Mike> those that do, from achieving it.
> 
> 	Mike, that is fallacious reasoning.
Agreed.


>   The discussion is about
> ggc_free, not performance.  ggc_free does not automatically equate with
> performance.  Linking the two issues and then arguing that people who
> disagree with ggc_free are not interested in performance and are
> inhibiting developers trying to improve performance is inappropriate and
> unwarranted.
Not only is it unwarranted, the idea that those of us who think
ggc_free is generally bad don't care about performance is totally
wrong.  Speaking strictly for myself, performance issues are 
my top GCC priority right now.

jeff


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 20:17                                   ` Geoffrey Keating
@ 2004-09-16  4:22                                     ` Jeffrey A Law
  0 siblings, 0 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-16  4:22 UTC (permalink / raw)
  To: Geoffrey Keating; +Cc: David Edelsohn, gcc-patches

On Wed, 2004-09-15 at 13:50, Geoffrey Keating wrote:
> David Edelsohn <dje@watson.ibm.com> writes:
> 
> > 	Using lots of memory in an inefficient manner slows down the
> > compiler.  We should not turn the compile-time performance into a fixation
> > on reducing memory usage as a goal in isolation.
> 
> This is an important point.  I don't believe that freeing memory gives
> significant compile speed benefits except in unusual cases, and in
> general it slows the compile down because of the overhead of doing the
> freeing.
Agreed.

> When we speak of 'reducing memory usage' here at Apple, we do not mean
> 'allocating as much memory as before and then freeing it'.  We mean
> allocating less memory than before, and preferably doing this by
> allocating fewer objects rather than the same number of smaller ones.
Precisely.  Losing the block-local varrays is a great example; so
far it's consistently been better to go ahead and allocate a global
varray with markers to note where entries for the current block
end than anything I've seen with lazily allocating and recycling
those block local varrays.

I just finished the first cut at removing the block_defs local
varray in favor of a global one in tree-ssa-dom and tree-into-ssa.
To give you an idea we were creating 31594 block local varrays
for block defs (and that's allocating them lazily!).  Those
31594 block local varrays sucked up roughly 14M of memory.

With a global varray with block markers we generate 2132
varrays which suck up roughly 6M of memory.

Interestingly enough, this also allows us to remove the toplevel
block local data structure in tree-into-ssa.c.  Another 2000
varrays gone (unfortunately that savings is much smaller, 200k).

[ All numbers are with Gerald's testcase. ]

Anyway, I just wanted to give a concrete example to illustrate
Geoff's point.
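
The "global varray with block markers" idea can be sketched roughly as
follows.  This is only an illustration under stated assumptions: a
fixed-size int stack stands in for a varray, and every name here
(block_defs, enter_block, record_def, leave_block, BLOCK_MARKER) is
hypothetical rather than taken from tree-ssa-dom or tree-into-ssa.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 1024
#define BLOCK_MARKER (-1)   /* sentinel; real entries must be >= 0 */

/* One stack shared by all blocks, instead of one array per block.  */
static int block_defs[STACK_MAX];
static size_t block_defs_len;

static void
push_entry (int v)
{
  assert (block_defs_len < STACK_MAX);
  block_defs[block_defs_len++] = v;
}

/* Entering a block: note where this block's entries begin.  */
static void
enter_block (void)
{
  push_entry (BLOCK_MARKER);
}

/* Record a definition made in the current block.  */
static void
record_def (int def)
{
  assert (def >= 0);
  push_entry (def);
}

/* Leaving a block: pop entries back to (and including) the marker.
   Returns how many definitions the block had recorded.  */
static size_t
leave_block (void)
{
  size_t n = 0;
  while (block_defs_len > 0
         && block_defs[--block_defs_len] != BLOCK_MARKER)
    n++;
  return n;
}
```

Nested blocks simply push another marker, so no per-block array is ever
allocated or recycled; the cost is one extra slot per block.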

jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 19:50                       ` Mike Stump
  2004-09-15 19:58                         ` David Edelsohn
@ 2004-09-16  4:29                         ` Jeffrey A Law
  1 sibling, 0 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-16  4:29 UTC (permalink / raw)
  To: Mike Stump; +Cc: David Edelsohn, Jan Hubicka, gcc-patches

On Wed, 2004-09-15 at 12:54, Mike Stump wrote:
> On Sep 14, 2004, at 7:47 PM, David Edelsohn wrote:
> > ggc_free() is wrong and I strongly disagree with using it as a
> > fundamental concept in the design and implementation of GCC.
> 
> It is amazing to me that people hate it so much.  To me, it provides 
> flexibility.  The flexibility to make a compiler that reduces memory 
> pressure on a system.  The flexibility to reduce cache pressure on a 
> system.  The flexibility to make a faster compiler.
It's a bloody band-aid.  In the end I think our time is going to
be better spent looking at alternate ways to represent the data we
need.

ggc_free's only purpose in life was supposed to be to give an object
back to the GC system so that it could be re-used before the next
GC cycle.  The number of cases where it's appropriate given our 
rat's nest of pointers is small.  Again, as someone who pushed 
reasonably hard to get ggc_free into GCC a while back, I've come
to realize that it was a strategic mistake and takes us down a
path which leads to a compiler which is significantly more difficult
to maintain.



> If performance isn't an issue, then we should remove ggc_free.  If it 
> is, we should keep it, at least until such time as another technology 
> is proven to replace the need for it.
Using ggc_free is IMHO ultimately asking for a semi-random segfault
because of the difficulty involved in ensuring that no references
to the ggc_free'd data remain.

Jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 21:02                               ` Daniel Jacobowitz
@ 2004-09-16  4:58                                 ` Jeffrey A Law
  0 siblings, 0 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-16  4:58 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: David Edelsohn, Michael Matz, Mike Stump, gcc-patches

On Wed, 2004-09-15 at 14:19, Daniel Jacobowitz wrote:
> On Wed, Sep 15, 2004 at 04:16:11PM -0400, David Edelsohn wrote:
> > As we have seen, even with the
> > checks implemented by ggc_free(), we still can have dangling references
> > that break the compiler, instead of a well-defined ICE from the GC
> > subsystem.
> 
> Incorrect.  The existing checks are only run at gcac and as far as I
> can tell, no one ran a gcac bootstrap on a target that showed this
> problem.
And I suspect that in general people are not going to run with GCAC
simply because of its overhead.


> I discussed with Honza ways to move these checks to the non-gcac case.
And the way this ought to work is we should assert that certain
objects are not supposed to be reachable and have the checking code
bark if they are reachable.  Those objects should _NOT_ be made
available to be re-used until they are collected by the normal
mechanisms.

This would provide us with what we really need -- a sane way to 
identify objects which should have a well defined lifetime, but
for whatever reason do not.  And by triggering a checking failure
it becomes relatively easy to walk back up the frames in GDB to
see the chain of data leading to the object that is supposed to
not be reachable.

jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 20:35                                     ` Michael Matz
@ 2004-09-16  7:38                                       ` Jeffrey A Law
  2004-09-18  7:47                                         ` Geoffrey Keating
  2004-09-18  7:36                                       ` Geoffrey Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-16  7:38 UTC (permalink / raw)
  To: Michael Matz
  Cc: David Edelsohn, Daniel Jacobowitz, Diego Novillo, Jan Hubicka,
	Jan Hubicka, gcc-patches

On Wed, 2004-09-15 at 14:08, Michael Matz wrote:
>  The point is, that currently collection is slower the more things it has 
> to walk.
Reducing this was never the primary goal of ggc_free.  The primary 
goal of ggc_free was to return memory to the GC system faster so
that it could be re-used before the next collection point, thus
potentially reducing the peak memory usage.

The fact that we have fewer things to mark was a secondary effect.
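
To make the peak-memory effect concrete, here is a toy sketch, assuming
a small fixed-size pool rather than GCC's real page or zone allocator;
pool_alloc, pool_free and struct pool are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 8

/* Toy fixed-size pool; not GCC's page or zone allocator.  */
struct pool {
  int in_use[POOL_SLOTS];
  size_t live;       /* slots currently handed out */
  size_t peak;       /* high-water mark of live */
};

/* Hand out the lowest free slot, or -1 if the pool is exhausted.  */
static int
pool_alloc (struct pool *p)
{
  int i;
  for (i = 0; i < POOL_SLOTS; i++)
    if (!p->in_use[i])
      {
        p->in_use[i] = 1;
        if (++p->live > p->peak)
          p->peak = p->live;
        return i;
      }
  return -1;
}

/* Explicit early release, in the spirit of ggc_free: the slot is
   reusable immediately.  Nothing checks for remaining references,
   which is exactly the hazard discussed in this thread.  */
static void
pool_free (struct pool *p, int slot)
{
  assert (slot >= 0 && slot < POOL_SLOTS && p->in_use[slot]);
  p->in_use[slot] = 0;
  p->live--;
}
```

Allocating and releasing one slot at a time keeps the high-water mark
at one, whereas holding every slot until a later sweep lets it grow
with the number of allocations.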

And as I've said before, ggc_free is just asking for problems given
our pointer infested code.  Rather than focusing on ggc_free, I'd
much rather see us focus on:

  1. Being smarter about our allocations to begin with.

  2. Safe means of identifying objects that should not be
     reachable, but which are reachable.  ggc_free as it stands
     today does not meet that need.



>   And another point is that all these small later-to-be-collected
> items can add up to some extreme amounts. 
Certainly.  Again, that's why we introduced ggc_free to begin with.
Unfortunately, in hindsight, we should really have been looking more
at why we were allocating so many objects rather than looking for
ways to recycle them faster.



>  We could collect more often.  
> That's slow.  Or we could build some reuse machinery on top of the GC 
> memory.  Complicated, and another layer of mem management (but sometimes 
> the right thing).  Or, if we _know_ that this or that object is dead, tell 
> the memory subsystem so, and let it make something good with this 
> information.  For instance freeing the object right now or later or never.
> 
> I.e. exactly what Daniel suggested
But given the nature of GCC, knowing an object is dead can be bloody
hard, real bloody hard.

Jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-14 21:12                 ` Jakub Jelinek
  2004-09-14 22:33                   ` Daniel Jacobowitz
  2004-09-14 22:53                   ` Richard Henderson
@ 2004-09-16  8:35                   ` Jeffrey A Law
  2 siblings, 0 replies; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-16  8:35 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jan Hubicka, Richard Henderson, Diego Novillo, gcc-patches,
	Mark Mitchell

On Tue, 2004-09-14 at 15:07, Jakub Jelinek wrote:
> On Tue, Sep 14, 2004 at 02:51:39PM -0600, Jeffrey A Law wrote:
> > The problem is your change to explicitly call ggc_free bypasses the
> > entire GC mechanisms we've built.  As I've stated before, if you
> > can't be absolutely sure that there are no pointers left into your
> > object, then the object is _NOT_ a candidate for ggc_free.  I can't
> > emphasize this enough.
> 
> Shouldn't we have a checking mode which verifies this then?
> I.e. ggc_free in that mode would just set a flag and if during GC
> collection an object marked that way is reachable, we would abort ().
I think this would make a lot of sense.  I don't think I'd call it
ggc_free, but it's precisely the kind of thing I think we want to 
track down objects which are supposed to be dead, but which continue
to be live for some reason.
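
A minimal sketch of such a checking mode, assuming a toy recursive mark
phase rather than GCC's actual ggc implementation.  The names struct
obj, toy_assert_dead and toy_mark are hypothetical, and the sketch
counts violations where a real checker would presumably abort ().

```c
#include <assert.h>
#include <stddef.h>

/* Toy object graph node; not GCC's real GC data structures.  */
#define MAX_KIDS 2

struct obj {
  struct obj *kids[MAX_KIDS]; /* outgoing pointers */
  int marked;                 /* set by the mark phase */
  int expected_dead;          /* set by toy_assert_dead () */
};

/* Hypothetical stand-in for a checking-mode "ggc_free": record the
   claim that O is unreachable, but do not release or reuse it.  */
static void
toy_assert_dead (struct obj *o)
{
  o->expected_dead = 1;
}

/* Mark everything reachable from O.  Returns the number of reachable
   objects that were claimed dead; a real checker would abort () here
   so the backtrace shows the chain of references.  */
static int
toy_mark (struct obj *o)
{
  int violations = 0;
  size_t i;

  if (o == NULL || o->marked)
    return 0;
  o->marked = 1;
  if (o->expected_dead)
    violations++;
  for (i = 0; i < MAX_KIDS; i++)
    violations += toy_mark (o->kids[i]);
  return violations;
}
```

Because the flagged object is never handed back for reuse, a stale
reference shows up as a reported violation during marking instead of
as a later use of recycled memory.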

Jeff

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 17:20                               ` Jan Hubicka
@ 2004-09-16  9:03                                 ` Jeffrey A Law
  2004-09-16 13:13                                   ` Jan Hubicka
  0 siblings, 1 reply; 875+ messages in thread
From: Jeffrey A Law @ 2004-09-16  9:03 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Diego Novillo, Jan Hubicka, David Edelsohn, gcc-patches

On Wed, 2004-09-15 at 09:46, Jan Hubicka wrote:
> > > If we are holding onto too much unnecessary data in our algorithms, then
> > > the solution _ought_ to involve breaking the chains to the dead data so
> > > that we can collect all that garbage.  And the way of breaking those
> > > chains should simply be writing NULL to your pointers.
> > Right.  Now a good way to help find those cases would be a huge
> > step forward.  Right now I find them by noting the object which ought
> > to be dead, then setting conditional breakpoints in the marking
> > routines.  Needless to say that's slow and error prone and
> > doesn't scale well.
> 
> What I do is to add ggc_free on the object, wait for next ggc_collect to
> crash and then look into the backtrace that shows me who forgot about that
> case.
> I don't see much way to make this easier, except for perhaps adding
> ggc_free/ggc_collect pair
Well, rather than actually freeing the object, you mark it as
something that ought to be unreachable, but don't actually free it.

The difference is subtle, but critical.

> Concerning varrays,  I still believe that most of these should go
> completely or at least out of the GGC pool as GGC is not good
> datastructure for this.
What I don't want is for there to be two or more different kinds
of varrays.  ie, I don't want some that are GC'd, some that are
obstack'd, others which are xmalloc'd, etc.


Jeff


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-16  9:03                                 ` Jeffrey A Law
@ 2004-09-16 13:13                                   ` Jan Hubicka
  0 siblings, 0 replies; 875+ messages in thread
From: Jan Hubicka @ 2004-09-16 13:13 UTC (permalink / raw)
  To: Jeffrey A Law
  Cc: Jan Hubicka, Diego Novillo, Jan Hubicka, David Edelsohn, gcc-patches

> On Wed, 2004-09-15 at 09:46, Jan Hubicka wrote:
> > > > If we are holding onto too much unnecessary data in our algorithms, then
> > > > the solution _ought_ to involve breaking the chains to the dead data so
> > > > that we can collect all that garbage.  And the way of breaking those
> > > > chains should simply be writing NULL to your pointers.
> > > Right.  Now a good way to help find those cases would be a huge
> > > step forward.  Right now I find them by noting the object which ought
> > > to be dead, then setting conditional breakpoints in the marking
> > > routines.  Needless to say that's slow and error prone and
> > > doesn't scale well.
> > 
> > What I do is to add ggc_free on the object, wait for next ggc_collect to
> > crash and then look into the backtrace that shows me who forgot about that
> > case.
> > I don't see much way to make this easier, except for perhaps adding
> > ggc_free/ggc_collect pair
> Well, rather than actually freeing the object, you mark it as
> something that ought to be unreachable, but don't actually free it.
> 
> The difference is subtle, but critical.

This is precisely what ggc_free does with --enable-checking=gcac
(together with zeroing out the released chunk so we are punished for
accessing released memory).
Since nowadays the gcac bootstraps are far from practical (over one
day on my fastest machine), I would be happy to move this code to the
default --enable-checking too.  We might also have two kinds of
construct (ggc_free and ggc_dead), both behaving the same with checking
enabled and the latter compiling to a no-op with checking disabled, if
that makes things better....

Honza


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-15 20:35                                     ` Michael Matz
  2004-09-16  7:38                                       ` Jeffrey A Law
@ 2004-09-18  7:36                                       ` Geoffrey Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoffrey Keating @ 2004-09-18  7:36 UTC (permalink / raw)
  To: Michael Matz
  Cc: Daniel Jacobowitz, Diego Novillo, Jan Hubicka, Jeff Law,
	Jan Hubicka, gcc-patches

Michael Matz <matz@suse.de> writes:

> Hi,
> 
> On Wed, 15 Sep 2004, David Edelsohn wrote:
> 
> > >>>>> Daniel Jacobowitz writes:
> > 
> > Daniel> I don't think your "definition" is correct, or that marking objects as
> > Daniel> explicitly free conflicts with the design of using a garbage collector.
> > 
> > 	Diego's definition is correct.  If the memory is not getting
> > collected at the next GC phase, figure out why the GC still sees a
> > reference.
> 
> Oh, it will be collected (if it really is free).  That's not the point.  
> The point is, that currently collection is slower the more things it has 
> to walk.

Things that are no longer referenced (and will be freed) will not be walked.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-16  7:38                                       ` Jeffrey A Law
@ 2004-09-18  7:47                                         ` Geoffrey Keating
  2004-09-18 13:58                                           ` Daniel Berlin
  0 siblings, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2004-09-18  7:47 UTC (permalink / raw)
  To: law
  Cc: David Edelsohn, Daniel Jacobowitz, Diego Novillo, Jan Hubicka,
	Jan Hubicka, gcc-patches

Jeffrey A Law <law@redhat.com> writes:

> On Wed, 2004-09-15 at 14:08, Michael Matz wrote:
> >  The point is, that currently collection is slower the more things it has 
> > to walk.
> Reducing this was never the primary goal of ggc_free.  The primary 
> goal of ggc_free was to return memory to the GC system faster so
> that it could be re-used before the next collection point, thus
> potentially reducing the peak memory usage.
> 
> The fact that we have fewer things to mark was a secondary effect.

Actually, it's not an effect at all.  You can only safely use ggc_free
on an object which is not referenced.  Objects which are not
referenced are not marked.  Therefore, ggc_free does not change the
number of objects marked.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-18  7:47                                         ` Geoffrey Keating
@ 2004-09-18 13:58                                           ` Daniel Berlin
  2004-09-18 14:52                                             ` Daniel Berlin
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Berlin @ 2004-09-18 13:58 UTC (permalink / raw)
  To: Geoffrey Keating
  Cc: law, David Edelsohn, Daniel Jacobowitz, Diego Novillo,
	Jan Hubicka, Jan Hubicka, gcc-patches

> Actually, it's not an effect at all.  You can only safely use ggc_free
> on an object which is not referenced.  Objects which are not
> referenced are not marked.

This isn't necessarily true.

You may forget to null out the pointer, accidentally keeping it, and what 
it points to, live (since unless it's marked deletable, the gc doesn't 
know it can null it for you).

>Therefore, ggc_free does not change the
> number of objects marked.

Therefore, this isn't true either.
It would be true if the pointer was reset to null before collection, but 
if not, then ggc_free can change the number of objects marked.

>


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-18 13:58                                           ` Daniel Berlin
@ 2004-09-18 14:52                                             ` Daniel Berlin
  2004-09-18 15:33                                               ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Berlin @ 2004-09-18 14:52 UTC (permalink / raw)
  To: Geoffrey Keating
  Cc: law, David Edelsohn, Daniel Jacobowitz, Diego Novillo,
	Jan Hubicka, Jan Hubicka, gcc-patches



On Sat, 18 Sep 2004, Daniel Berlin wrote:

>
>
>> Actually, it's not an effect at all.  You can only safely use ggc_free
>> on an object which is not referenced.  Objects which are not
>> referenced are not marked.
>
> This isn't necessarily true.
>
> You may forget to null out the pointer, accidentally keeping it, and what it 
> points to, live (since unless it's marked deletable, the gc doesn't know it 
> can null it for you).

Just to be clear here, I'm talking about things that are marked 
GTY (i.e., roots), and in reality, are unreferenced, but you forgot to null 
the pointer, so the GC doesn't know that.
If you ggc_free it, you've caused the same effect as properly nulling it.
Obviously, ggc_free can't change the number of marked objects except 
in this way.


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-18 14:52                                             ` Daniel Berlin
@ 2004-09-18 15:33                                               ` Daniel Jacobowitz
  2004-09-18 18:47                                                 ` Daniel Berlin
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-09-18 15:33 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Geoffrey Keating, law, David Edelsohn, Diego Novillo,
	Jan Hubicka, Jan Hubicka, gcc-patches

On Sat, Sep 18, 2004 at 09:15:40AM -0400, Daniel Berlin wrote:
> 
> 
> On Sat, 18 Sep 2004, Daniel Berlin wrote:
> 
> >
> >
> >>Actually, it's not an effect at all.  You can only safely use ggc_free
> >>on an object which is not referenced.  Objects which are not
> >>referenced are not marked.
> >
> >This isn't necessarily true.
> >
> >You may forget to null out the pointer, accidentally keeping it, and what it 
> >points to, live (since unless it's marked deletable, the gc doesn't know 
> >it can null it for you).
> 
> Just to be clear here, I'm talking about things that are marked 
> GTY (i.e., roots), and in reality, are unreferenced, but you forgot to null 
> the pointer, so the GC doesn't know that.
> If you ggc_free it, you've caused the same effect as properly nulling it.
> Obviously, ggc_free can't change the number of marked objects except 
> in this way.

Huh?  If you ggc_free something referenced, you'll crash during
marking.  Geoff's assertion is correct.

-- 
Daniel Jacobowitz


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-18 15:33                                               ` Daniel Jacobowitz
@ 2004-09-18 18:47                                                 ` Daniel Berlin
  2004-09-18 19:47                                                   ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Berlin @ 2004-09-18 18:47 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Geoffrey Keating, law, David Edelsohn, Diego Novillo,
	Jan Hubicka, Jan Hubicka, gcc-patches

On Sat, 2004-09-18 at 10:03 -0400, Daniel Jacobowitz wrote:
> On Sat, Sep 18, 2004 at 09:15:40AM -0400, Daniel Berlin wrote:
> > 
> > 
> > On Sat, 18 Sep 2004, Daniel Berlin wrote:
> > 
> > >
> > >
> > >>Actually, it's not an effect at all.  You can only safely use ggc_free
> > >>on an object which is not referenced.  Objects which are not
> > >>referenced are not marked.
> > >
> > >This isn't necessarily true.
> > >
> > >You may forget to null out the pointer, accidentally keeping it, and what it 
> > >points to, live (since unless it's marked deletable, the gc doesn't know 
> > >it can null it for you).
> > 
> > Just to be clear here, I'm talking about things that are marked 
> > GTY (i.e., roots), and in reality, are unreferenced, but you forgot to null 
> > the pointer, so the GC doesn't know that.
> > If you ggc_free it, you've caused the same effect as properly nulling it.
> > Obviously, ggc_free can't change the number of marked objects except 
> > in this way.
> 
> Huh?  If you ggc_free something referenced, you'll crash during
> marking.  Geoff's assertion is correct.

So you are saying the following would crash during marking?

static GTY(()) varray_type test;


<fill up test>

<no varray_clear>

ggc_free (test);


Whereas this wouldn't?

static GTY(()) varray_type test;


<fill up test>

<no varray_clear>

test = NULL;




* Re: Release RTL bodies after compilation (sometimes)
  2004-09-18 18:47                                                 ` Daniel Berlin
@ 2004-09-18 19:47                                                   ` Daniel Jacobowitz
  2004-09-18 20:20                                                     ` Daniel Berlin
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2004-09-18 19:47 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Geoffrey Keating, law, David Edelsohn, Diego Novillo,
	Jan Hubicka, Jan Hubicka, gcc-patches

On Sat, Sep 18, 2004 at 02:14:46PM -0400, Daniel Berlin wrote:
> So you are saying the following would crash during marking?
> 
> static GTY(()) varray_type test;
> 
> 
> <fill up test>
> 
> <no varray_clear>
> 
> ggc_free (test);
> 
> 
> Whereas this wouldn't?
> 
> static GTY(()) varray_type test;
> 
> 
> <fill up test>
> 
> <no varray_clear>
> 
> test = NULL;

Correct.  In the first case the collector will still walk test from the
roots, and test will be pointing at freed memory, or at GC'd memory of an
unexpected type.

-- 
Daniel Jacobowitz
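
[Editorial illustration, not part of the original message.]  The two
cases above can be sketched with a toy root and a toy collector.  This
is only a model under stated assumptions: toy_cell, toy_root, toy_free
and toy_collect are hypothetical names, and a "freed" flag stands in for
memory that has really been handed back, so that the sketch itself never
touches released storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_cell
{
  bool freed;    /* stands in for "memory returned to the allocator" */
  bool marked;   /* set by the toy mark phase */
};

/* Stand-in for a GTY(()) root: the collector always starts here.  */
struct toy_cell *toy_root;

/* Explicit release.  Note that nobody clears toy_root.  */
void
toy_free (struct toy_cell *c)
{
  c->freed = true;
}

enum toy_result { TOY_OK, TOY_WALKED_FREED_MEMORY };

/* Toy mark phase: follow the root if it is non-NULL.  */
enum toy_result
toy_collect (void)
{
  if (toy_root == NULL)
    return TOY_OK;                     /* case 2: test = NULL */
  if (toy_root->freed)
    return TOY_WALKED_FREED_MEMORY;    /* case 1: freed, root not cleared */
  toy_root->marked = true;
  return TOY_OK;
}
```

In the real collector the first case is not reported this politely: the
marker simply dereferences whatever the root points at.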


* Re: Release RTL bodies after compilation (sometimes)
  2004-09-18 19:47                                                   ` Daniel Jacobowitz
@ 2004-09-18 20:20                                                     ` Daniel Berlin
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Berlin @ 2004-09-18 20:20 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Geoffrey Keating, law, David Edelsohn, Diego Novillo,
	Jan Hubicka, Jan Hubicka, gcc-patches

On Sat, 2004-09-18 at 14:29 -0400, Daniel Jacobowitz wrote:
> On Sat, Sep 18, 2004 at 02:14:46PM -0400, Daniel Berlin wrote:
> > So you are saying the following would crash during marking?
> > 
> > static GTY(()) varray_type test;
> > 
> > 
> > <fill up test>
> > 
> > <no varray_clear>
> > 
> > ggc_free (test);
> > 
> > 
> > Whereas this wouldn't?
> > 
> > static GTY(()) varray_type test;
> > 
> > 
> > <fill up test>
> > 
> > <no varray_clear>
> > 
> > test = NULL;
> 
> Correct.  In the first case the collector will still walk test from the
> roots, and test will be pointing at freed memory, or at GC'd memory of an
> unexpected type.
Ah.
Though we would only catch it with gcac on or something.

Geoff's right then, of course.
:)

-- 


* [RFC] PowerPC sCC patterns
@ 2004-11-08 23:07 David Edelsohn
  2004-11-24 11:40 ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-11-08 23:07 UTC (permalink / raw)
  To: gcc-patches

	This patch creates splitters for sCC patterns whose final step
does not use the carry bit, namely GTU and LTU.  This gives the scheduler
a little more information and a little more freedom.
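
[Editorial illustration, not part of the original message.]  To see why
the final step of these patterns does not need the carry bit, here is a
small C emulation of the 32-bit LTU sequence
"subfc %0,%2,%1; subfe %0,%0,%0; neg %0,%0".  The emu_* helpers are
hypothetical names that model only the parts of the instructions used
here (subfe's own carry output is ignored).

```c
#include <assert.h>
#include <stdint.h>

/* Models "subfc rt,ra,rb": rt = ~ra + rb + 1, i.e. rb - ra.  The carry
   bit *CA is set when the subtraction does not borrow (rb >= ra,
   unsigned).  */
uint32_t
emu_subfc (uint32_t ra, uint32_t rb, unsigned *ca)
{
  uint64_t sum = (uint64_t) (uint32_t) ~ra + rb + 1;

  *ca = (unsigned) (sum >> 32);
  return (uint32_t) sum;
}

/* Models "subfe rt,ra,rb": rt = ~ra + rb + CA.  */
uint32_t
emu_subfe (uint32_t ra, uint32_t rb, unsigned ca)
{
  return (uint32_t) ((uint32_t) ~ra + rb + ca);
}

/* (neg (ltu a b)): the two carry-dependent instructions.
   Result is 0xffffffff when a < b (unsigned), otherwise 0.  */
uint32_t
emu_neg_ltu (uint32_t a, uint32_t b)
{
  unsigned ca;
  uint32_t t = emu_subfc (b, a, &ca);   /* t = a - b, CA = (a >= b) */

  return emu_subfe (t, t, ca);          /* ~t + t + CA = CA - 1 */
}

/* (ltu a b): the split form; the trailing neg is carry-free.  */
uint32_t
emu_ltu (uint32_t a, uint32_t b)
{
  return (uint32_t) (0u - emu_neg_ltu (a, b));
}

/* (plus (ltu a b) c): the split form ends in a plain subtract.  */
uint32_t
emu_plus_ltu (uint32_t a, uint32_t b, uint32_t c)
{
  return (uint32_t) (c - emu_neg_ltu (a, b));
}
```

Only emu_neg_ltu depends on carry, which is why the last instruction can
be exposed to the scheduler as an ordinary neg or subtract.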

	While testing, I discovered that combine did not choose the best
patterns because rtx_costs did not model those instructions, so this patch
improves the cost model as well.

	If anyone notices any mistakes, let me know.

David


	* config/rs6000/rs6000.c (rs6000_rtx_costs): Add EQ, GTU, and LTU.
	* config/rs6000/rs6000.md (sCC): Split GTU and LTU patterns.

Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.737
diff -c -p -r1.737 rs6000.c
*** rs6000.c	8 Nov 2004 04:42:35 -0000	1.737
--- rs6000.c	8 Nov 2004 22:54:52 -0000
*************** rs6000_rtx_costs (rtx x, int code, int o
*** 17984,17992 ****
  	  *total = rs6000_cost->fp;
  	  return false;
  	}
- 
        break;
  
      default:
        break;
      }
--- 17984,18015 ----
  	  *total = rs6000_cost->fp;
  	  return false;
  	}
        break;
  
+     case EQ:
+     case GTU:
+     case LTU:
+       if (mode == Pmode)
+ 	{
+ 	  switch (outer_code)
+ 	    {
+ 	    case PLUS:
+ 	    case NEG:
+ 	      /* PLUS or NEG already counted so only add one more.  */
+ 	      *total = COSTS_N_INSNS (1);
+ 	      break;
+ 	    case SET:
+ 	      *total = COSTS_N_INSNS (3);
+ 	      break;
+ 	    case COMPARE:
+ 	      *total = 0;
+ 	      return true;
+ 	    default:
+ 	      break;
+ 	    }
+ 	  return false;
+ 	}
+ 
      default:
        break;
      }
Index: rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.329
diff -c -p -r1.329 rs6000.md
*** rs6000.md	8 Nov 2004 04:42:36 -0000	1.329
--- rs6000.md	8 Nov 2004 22:54:53 -0000
***************
*** 12460,12474 ****
    "doz%I2 %0,%1,%2\;nabs %0,%0\;{srai|srawi} %0,%0,31"
    [(set_attr "length" "12")])
  
! (define_insn ""
    [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
  	(ltu:SI (match_operand:SI 1 "gpc_reg_operand" "r,r")
  		(match_operand:SI 2 "reg_or_neg_short_operand" "r,P")))]
    "TARGET_32BIT"
!   "@
!    {sf|subfc} %0,%2,%1\;{sfe|subfe} %0,%0,%0\;neg %0,%0
!    {ai|addic} %0,%1,%n2\;{sfe|subfe} %0,%0,%0\;neg %0,%0"
!   [(set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:CC 3 "cc_reg_operand" "=x,x,?y,?y")
--- 12460,12486 ----
    "doz%I2 %0,%1,%2\;nabs %0,%0\;{srai|srawi} %0,%0,31"
    [(set_attr "length" "12")])
  
! (define_insn_and_split ""
    [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
  	(ltu:SI (match_operand:SI 1 "gpc_reg_operand" "r,r")
  		(match_operand:SI 2 "reg_or_neg_short_operand" "r,P")))]
    "TARGET_32BIT"
!   "#"
!   "TARGET_32BIT"
!   [(set (match_dup 0) (neg:SI (ltu:SI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (neg:SI (match_dup 0)))]
!   "")
! 
! (define_insn_and_split ""
!   [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r")
! 	(ltu:DI (match_operand:DI 1 "gpc_reg_operand" "r,r")
! 		(match_operand:DI 2 "reg_or_neg_short_operand" "r,P")))]
!   "TARGET_64BIT"
!   "#"
!   "TARGET_64BIT"
!   [(set (match_dup 0) (neg:DI (ltu:DI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (neg:DI (match_dup 0)))]
!   "")
  
  (define_insn ""
    [(set (match_operand:CC 3 "cc_reg_operand" "=x,x,?y,?y")
***************
*** 12503,12520 ****
  		    (const_int 0)))]
    "")
  
! (define_insn ""
!   [(set (match_operand:SI 0 "gpc_reg_operand" "=&r,&r,&r,&r")
! 	(plus:SI (ltu:SI (match_operand:SI 1 "gpc_reg_operand" "r,r,r,r")
! 			 (match_operand:SI 2 "reg_or_neg_short_operand" "r,r,P,P"))
! 		 (match_operand:SI 3 "reg_or_short_operand" "r,I,r,I")))]
    "TARGET_32BIT"
!   "@
!   {sf|subfc} %0,%2,%1\;{sfe|subfe} %0,%0,%0\;{sf|subf} %0,%0,%3
!   {sf|subfc} %0,%2,%1\;{sfe|subfe} %0,%0,%0\;{sfi|subfic} %0,%0,%3
!   {ai|addic} %0,%1,%n2\;{sfe|subfe} %0,%0,%0\;{sf|subf} %0,%0,%3
!   {ai|addic} %0,%1,%n2\;{sfe|subfe} %0,%0,%0\;{sfi|subfic} %0,%0,%3"
!  [(set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y")
--- 12515,12543 ----
  		    (const_int 0)))]
    "")
  
! (define_insn_and_split ""
!   [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
! 	(plus:SI (ltu:SI (match_operand:SI 1 "gpc_reg_operand" "r,r")
! 			 (match_operand:SI 2 "reg_or_neg_short_operand" "r,P"))
! 		 (match_operand:SI 3 "reg_or_short_operand" "rI,rI")))]
    "TARGET_32BIT"
!   "#"
!   "TARGET_32BIT"
!   [(set (match_dup 0) (neg:SI (ltu:SI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (minus:SI (match_dup 3) (match_dup 0)))]
!   "")
! 
! (define_insn_and_split ""
!   [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r")
! 	(plus:DI (ltu:DI (match_operand:DI 1 "gpc_reg_operand" "r,r")
! 			 (match_operand:DI 2 "reg_or_neg_short_operand" "r,P"))
! 		 (match_operand:DI 3 "reg_or_short_operand" "rI,rI")))]
!   "TARGET_64BIT"
!   "#"
!   "TARGET_64BIT"
!   [(set (match_dup 0) (neg:DI (ltu:DI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (minus:DI (match_dup 3) (match_dup 0)))]
!   "")
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y")
***************
*** 12596,12601 ****
--- 12619,12634 ----
    [(set_attr "length" "8")])
  
  (define_insn ""
+   [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r")
+ 	(neg:DI (ltu:DI (match_operand:DI 1 "gpc_reg_operand" "r,r")
+ 			(match_operand:DI 2 "reg_or_neg_short_operand" "r,P"))))]
+   "TARGET_64BIT"
+   "@
+    {sf|subfc} %0,%2,%1\;{sfe|subfe} %0,%0,%0
+    {ai|addic} %0,%1,%n2\;{sfe|subfe} %0,%0,%0"
+   [(set_attr "length" "8")])
+ 
+ (define_insn ""
    [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
  	(ge:SI (match_operand:SI 1 "gpc_reg_operand" "r")
  	       (match_operand:SI 2 "reg_or_short_operand" "rI")))
***************
*** 13343,13363 ****
    "doz %0,%2,%1\;nabs %0,%0\;{srai|srawi} %0,%0,31"
    [(set_attr "length" "12")])
  
! (define_insn ""
    [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
! 	(gtu:SI (match_operand:SI 1 "gpc_reg_operand" "r")
! 		(match_operand:SI 2 "reg_or_short_operand" "rI")))]
    "TARGET_32BIT"
!   "{sf%I2|subf%I2c} %0,%1,%2\;{sfe|subfe} %0,%0,%0\;neg %0,%0"
!   [(set_attr "length" "12")])
  
! (define_insn ""
    [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
! 	(gtu:DI (match_operand:DI 1 "gpc_reg_operand" "r")
! 		(match_operand:DI 2 "reg_or_short_operand" "rI")))]
    "TARGET_64BIT"
!   "subf%I2c %0,%1,%2\;subfe %0,%0,%0\;neg %0,%0"
!   [(set_attr "length" "12")])
  
  (define_insn ""
    [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
--- 13376,13402 ----
    "doz %0,%2,%1\;nabs %0,%0\;{srai|srawi} %0,%0,31"
    [(set_attr "length" "12")])
  
! (define_insn_and_split ""
    [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
!         (gtu:SI (match_operand:SI 1 "gpc_reg_operand" "r")
!                 (match_operand:SI 2 "reg_or_short_operand" "rI")))]
    "TARGET_32BIT"
!   "#"
!   "TARGET_32BIT"
!   [(set (match_dup 0) (neg:SI (gtu:SI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (neg:SI (match_dup 0)))]
!   "")
  
! (define_insn_and_split ""
    [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
!         (gtu:DI (match_operand:DI 1 "gpc_reg_operand" "r")
!                 (match_operand:DI 2 "reg_or_short_operand" "rI")))]
    "TARGET_64BIT"
!   "#"
!   "TARGET_64BIT"
!   [(set (match_dup 0) (neg:DI (gtu:DI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (neg:DI (match_dup 0)))]
!   "")
  
  (define_insn ""
    [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
***************
*** 13421,13449 ****
  		    (const_int 0)))]
    "")
  
! (define_insn ""
!   [(set (match_operand:SI 0 "gpc_reg_operand" "=&r,&r,&r")
! 	(plus:SI (gtu:SI (match_operand:SI 1 "gpc_reg_operand" "r,r,r")
! 			 (match_operand:SI 2 "reg_or_short_operand" "I,rI,rI"))
! 		 (match_operand:SI 3 "reg_or_short_operand" "r,r,I")))]
    "TARGET_32BIT"
!   "@
!    {ai|addic} %0,%1,%k2\;{aze|addze} %0,%3
!    {sf%I2|subf%I2c} %0,%1,%2\;{sfe|subfe} %0,%0,%0\;{sf|subf} %0,%0,%3
!    {sf%I2|subf%I2c} %0,%1,%2\;{sfe|subfe} %0,%0,%0\;{sfi|subfic} %0,%0,%3"
!   [(set_attr "length" "8,12,12")])
  
! (define_insn ""
!   [(set (match_operand:DI 0 "gpc_reg_operand" "=&r,&r,&r")
! 	(plus:DI (gtu:DI (match_operand:DI 1 "gpc_reg_operand" "r,r,r")
! 			 (match_operand:DI 2 "reg_or_short_operand" "I,rI,rI"))
! 		 (match_operand:DI 3 "reg_or_short_operand" "r,r,I")))]
    "TARGET_64BIT"
!   "@
!    addic %0,%1,%k2\;addze %0,%3
!    subf%I2c %0,%1,%2\;subfe %0,%0,%0\;subf %0,%0,%3
!    subf%I2c %0,%1,%2\;subfe %0,%0,%0\;subfic %0,%0,%3"
!   [(set_attr "length" "8,12,12")])
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y")
--- 13460,13488 ----
  		    (const_int 0)))]
    "")
  
! (define_insn_and_split ""
!   [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
!         (plus:SI (gtu:SI (match_operand:SI 1 "gpc_reg_operand" "r")
!                          (match_operand:SI 2 "reg_or_short_operand" "rI"))
!                  (match_operand:SI 3 "reg_or_short_operand" "rI")))]
    "TARGET_32BIT"
!   "#"
!   "TARGET_32BIT"
!   [(set (match_dup 0) (neg:SI (gtu:SI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (minus:SI (match_dup 3) (match_dup 0)))]
!   "")
  
! (define_insn_and_split ""
!   [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
!         (plus:DI (gtu:DI (match_operand:DI 1 "gpc_reg_operand" "r")
!                          (match_operand:DI 2 "reg_or_short_operand" "rI"))
!                  (match_operand:DI 3 "reg_or_short_operand" "rI")))]
    "TARGET_64BIT"
!   "#"
!   "TARGET_64BIT"
!   [(set (match_dup 0) (neg:DI (gtu:DI (match_dup 1) (match_dup 2))))
!    (set (match_dup 0) (minus:DI (match_dup 3) (match_dup 0)))]
!   "")
  
  (define_insn ""
    [(set (match_operand:CC 0 "cc_reg_operand" "=x,x,?y,?y")


* fix pr 16480 on gcc-3.4
@ 2004-11-10  2:36 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-11-10  2:36 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This is a backport of my PR 16480 fix, along with one of Geoff's
obviously correct fixes, for gcc-3.4.  Regression tested powerpc-linux.
OK to apply?

	PR target/16480
	2004-08-26  Alan Modra  <amodra@bigpond.net.au>
	* config/rs6000/rs6000.c (rs6000_split_multireg_move): Don't abort
	on "(mem (symbol_ref ..))" rtl.  Look at LO_SUM base regs as well
	as PLUS base regs.
	2004-08-01  Geoffrey Keating  <geoffk@apple.com>
	* config/rs6000/rs6000.c (rs6000_split_multireg_move): Just abort
	if trying to *store* to a non-offsettable address.
	2004-07-30  Geoffrey Keating  <geoffk@apple.com>
	* config/rs6000/rs6000.c (rs6000_split_multireg_move): Cope with
	non-offsettable addresses being moved into multiple GPRs.

diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/rs6000.c gcc-3.4-pr16480/gcc/config/rs6000/rs6000.c
--- gcc-3.4-virgin/gcc/config/rs6000/rs6000.c	2004-10-23 19:26:49.000000000 +0930
+++ gcc-3.4-pr16480/gcc/config/rs6000/rs6000.c	2004-11-10 11:43:22.302029284 +1030
@@ -10526,22 +10526,27 @@ rs6000_split_multireg_move (rtx dst, rtx
 			 : gen_adddi3 (breg, breg, delta_rtx));
 	      src = gen_rtx_MEM (mode, breg);
 	    }
+	  else if (! offsettable_memref_p (src))
+	    {
+	      rtx newsrc, basereg;
+	      basereg = gen_rtx_REG (Pmode, reg);
+	      emit_insn (gen_rtx_SET (VOIDmode, basereg, XEXP (src, 0)));
+	      newsrc = gen_rtx_MEM (GET_MODE (src), basereg);
+	      MEM_COPY_ATTRIBUTES (newsrc, src);
+	      src = newsrc;
+	    }
 
-	  /* We have now address involving an base register only.
-	     If we use one of the registers to address memory, 
-	     we have change that register last.  */
-
-	  breg = (GET_CODE (XEXP (src, 0)) == PLUS
-		  ? XEXP (XEXP (src, 0), 0)
-		  : XEXP (src, 0));
-
-	  if (!REG_P (breg))
-	      abort();
-
-	  if (REGNO (breg) >= REGNO (dst) 
+	  breg = XEXP (src, 0);
+	  if (GET_CODE (breg) == PLUS || GET_CODE (breg) == LO_SUM)
+	    breg = XEXP (breg, 0);
+
+	  /* If the base register we are using to address memory is
+	     also a destination reg, then change that register last.  */
+	  if (REG_P (breg)
+	      && REGNO (breg) >= REGNO (dst)
 	      && REGNO (breg) < REGNO (dst) + nregs)
 	    j = REGNO (breg) - REGNO (dst);
-        }
+	}
 
       if (GET_CODE (dst) == MEM && INT_REGNO_P (reg))
 	{
@@ -10573,6 +10578,8 @@ rs6000_split_multireg_move (rtx dst, rtx
 			   : gen_adddi3 (breg, breg, delta_rtx));
 	      dst = gen_rtx_MEM (mode, breg);
 	    }
+	  else if (! offsettable_memref_p (dst))
+	    abort ();
 	}
 
       for (i = 0; i < nregs; i++)

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre
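
[Editorial illustration, not part of the original message.]  The comment
in the first hunk says that a base register which is also one of the
destination registers must be changed last.  A minimal sketch of that
ordering, assuming the copy loop advances a wrapping index starting just
after the overlapping register; toy_load_order is a hypothetical helper
and not code from rs6000.c.

```c
#include <assert.h>

/* Fill ORDER[0..NREGS-1] with the sequence in which the destination
   registers are written.  BASE_IDX is the position, within the
   destination group, of the register that also holds the address, or
   -1 if the base register is not part of the destination.  */
void
toy_load_order (int nregs, int base_idx, int *order)
{
  int j = base_idx;
  int i;

  for (i = 0; i < nregs; i++)
    {
      /* Step to the next subword, wrapping around the group.  */
      if (++j == nregs)
        j = 0;
      order[i] = j;
    }
}
```

With four registers and base_idx == 1 the order comes out as 2, 3, 0, 1,
so the address is still intact for every load except the last one.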


* Re: fix pr 16480 on gcc-3.4
       [not found]           ` <amodra@bigpond.net.au>
                               ` (23 preceding siblings ...)
  2004-08-26  1:30             ` David Edelsohn
@ 2004-11-10  4:48             ` David Edelsohn
  2004-11-24 18:27             ` [RFC] PowerPC sCC patterns David Edelsohn
                               ` (36 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-11-10  4:48 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> This is a backport of my PR 16480 fix, along with one of Geoff's
Alan> obviously correct fixes, for gcc-3.4.  Regression tested powerpc-linux.
Alan> OK to apply?

	Yes, if the branch is open for bug fixes, if not already.

David


* Re: [RFC] PowerPC sCC patterns
  2004-11-08 23:07 [RFC] PowerPC sCC patterns David Edelsohn
@ 2004-11-24 11:40 ` Alan Modra
  2004-11-24 21:39   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-11-24 11:40 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Mon, Nov 08, 2004 at 06:04:08PM -0500, David Edelsohn wrote:
> 	While testing, I discovered that combine did not choose the best
> patterns because rtx_costs did not model those instructions, so this patch
> improves the cost model as well.

David,
  See my analysis in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16800

Testing the mode for EQ made the SImode insn look cheaper than the DImode
one.
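
[Editorial illustration, not part of the original message.]  A toy model
of the EQ/GTU/LTU case from the earlier patch shows the effect: with the
mode guard in place, a comparison whose mode differs from Pmode keeps
whatever default the caller supplied.  Everything here is hypothetical
(the toy_* names, the cost scale, and the one-insn default used in the
usage note); it is a sketch of the control flow only.

```c
#include <assert.h>
#include <stdbool.h>

/* Cost scale for the sketch; an assumption, not taken from the patch.  */
#define TOY_COSTS_N_INSNS(n) ((n) * 4)

enum toy_mode { TOY_SImode, TOY_DImode };
enum toy_outer { TOY_SET, TOY_PLUS, TOY_NEG, TOY_COMPARE, TOY_OTHER };

/* Cost of an EQ/GTU/LTU expression.  *TOTAL arrives holding the
   caller's default.  Returns true when the cost is final.  */
bool
toy_scc_cost (enum toy_mode mode, enum toy_mode pmode,
              enum toy_outer outer, bool guard_on_mode, int *total)
{
  if (guard_on_mode && mode != pmode)
    return false;                 /* default cost left untouched */

  switch (outer)
    {
    case TOY_PLUS:
    case TOY_NEG:
      /* PLUS or NEG already counted, so only add one more.  */
      *total = TOY_COSTS_N_INSNS (1);
      break;
    case TOY_SET:
      *total = TOY_COSTS_N_INSNS (3);
      break;
    case TOY_COMPARE:
      *total = 0;
      return true;
    default:
      break;
    }
  return false;
}
```

Called with pmode == TOY_DImode and a default of one insn, a guarded
SImode SET stays at the default while the DImode SET is costed at three
insns; with guard_on_mode false both come out the same.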

	PR target/16800
	* config/rs6000/rs6000.c (rs6000_rtx_costs): Zero cost for const0
	in COMPARISON_P rtx.  Don't test mode for EQ,LTU,GTU.

Bootstrap and regression tests in progress.

diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-11-23 10:01:27.000000000 +1030
+++ gcc-current/gcc/config/rs6000/rs6000.c	2004-11-24 21:06:39.979557861 +1030
@@ -18027,7 +18042,10 @@ rs6000_rtx_costs (rtx x, int code, int o
 	      && CONST_OK_FOR_LETTER_P (INTVAL (x), 'I'))
 	  || (outer_code == COMPARE
 	      && (CONST_OK_FOR_LETTER_P (INTVAL (x), 'I')
-		  || CONST_OK_FOR_LETTER_P (INTVAL (x), 'K'))))
+		  || CONST_OK_FOR_LETTER_P (INTVAL (x), 'K')))
+	  || (((GET_RTX_CLASS (outer_code) & RTX_COMPARE_MASK)
+	       == RTX_COMPARE_RESULT)
+	      && x == const0_rtx))
 	{
 	  *total = 0;
 	  return true;
@@ -18305,26 +18323,23 @@ rs6000_rtx_costs (rtx x, int code, int o
     case EQ:
     case GTU:
     case LTU:
-      if (mode == Pmode)
+      switch (outer_code)
 	{
-	  switch (outer_code)
-	    {
-	    case PLUS:
-	    case NEG:
-	      /* PLUS or NEG already counted so only add one more.  */
-	      *total = COSTS_N_INSNS (1);
-	      break;
-	    case SET:
-	      *total = COSTS_N_INSNS (3);
-	      break;
-	    case COMPARE:
-	      *total = 0;
-	      return true;
-	    default:
-	      break;
-	    }
-	  return false;
+	case PLUS:
+	case NEG:
+	  /* PLUS or NEG already counted so only add one more.  */
+	  *total = COSTS_N_INSNS (1);
+	  break;
+	case SET:
+	  *total = COSTS_N_INSNS (3);
+	  break;
+	case COMPARE:
+	  *total = 0;
+	  return true;
+	default:
+	  break;
 	}
+      return false;
 
     default:
       break;

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: [RFC] PowerPC sCC patterns
       [not found]           ` <amodra@bigpond.net.au>
                               ` (24 preceding siblings ...)
  2004-11-10  4:48             ` fix pr 16480 on gcc-3.4 David Edelsohn
@ 2004-11-24 18:27             ` David Edelsohn
  2004-11-26  4:45             ` [PATCH] Fix PR16356 David Edelsohn
                               ` (35 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-11-24 18:27 UTC (permalink / raw)
  To: gcc-patches

	The mode test is because the SCC patterns that do not use CRs
often use carry, which only exists when mode matches Pmode.  The EQ, LTU,
GTU test exists because those are the non-CR patterns supported by
PowerPC (POWER architecture was intentionally ignored to avoid complexity).

	The revised cost model needs to preserve the existing costs for
arbitrary CONST_INT and add cases for any SCC comparison with const0_rtx.

David


* Re: [RFC] PowerPC sCC patterns
  2004-11-24 11:40 ` Alan Modra
@ 2004-11-24 21:39   ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-11-24 21:39 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

	Here's a revised patch with which I am experimenting.

David

Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.752
diff -c -p -r1.752 rs6000.c
*** rs6000.c	24 Nov 2004 16:42:41 -0000	1.752
--- rs6000.c	24 Nov 2004 20:34:08 -0000
*************** rs6000_rtx_costs (rtx x, int code, int o
*** 18332,18357 ****
      case EQ:
      case GTU:
      case LTU:
!       if (mode == Pmode)
  	{
! 	  switch (outer_code)
  	    {
! 	    case PLUS:
! 	    case NEG:
! 	      /* PLUS or NEG already counted so only add one more.  */
! 	      *total = COSTS_N_INSNS (1);
! 	      break;
! 	    case SET:
! 	      *total = COSTS_N_INSNS (3);
! 	      break;
! 	    case COMPARE:
! 	      *total = 0;
  	      return true;
- 	    default:
- 	      break;
  	    }
  	  return false;
  	}
  
      default:
        break;
--- 18332,18369 ----
      case EQ:
      case GTU:
      case LTU:
!       /* Carry bit requires mode == Pmode.
! 	 NEG or PLUS already counted so only add one.  */
!       if (mode == Pmode
! 	  && (outer_code == NEG || outer_code == PLUS))
  	{
! 	  *total = COSTS_N_INSNS (1);
! 	  return false;
! 	}
!       if (outer_code == SET)
! 	{
! 	  if (XEXP (x, 1) == const0_rtx)
  	    {
! 	      *total = COSTS_N_INSNS (2);
  	      return true;
  	    }
+ 	  else if (mode == Pmode)
+ 	    *total = COSTS_N_INSNS (3);
+ 
  	  return false;
  	}
+       /* FALLTHRU */
+ 
+     case GT:
+     case LT:
+       if (outer_code == SET && (XEXP (x, 1) == const0_rtx))
+ 	{
+ 	  *total = COSTS_N_INSNS (2);
+ 	  return true;
+ 	}
+       if (outer_code == COMPARE)
+ 	*total = 0;
+       break;
  
      default:
        break;

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] Fix PR16356
@ 2004-11-26  2:32 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-11-26  2:32 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This fixes PR 16356, a regression caused by floatdisf2_internal2
manipulating its input register rather than copying to a new output reg.
While I was poking at this problem, I noticed we can get rid of one of
the branches in this expander by using the following trick:

  if (x & 2047)
    {
      x &= ~2047;
      x |= 2048;
    }

can be implemented as

  tmp = x & 2047;
  tmp += 2047;
  x |= tmp;
  x &= ~2047;

Removing the branch allows floatdisf2_internal2 to be used when
TARGET_POWERPC64, because the old branch condition was the only reason
that this insn needed to be TARGET_64BIT.
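
As a minimal sketch (not part of the patch), the equivalence of the two
forms above can be checked in plain C.  The function names and the use of
uint64_t are assumptions made for illustration; the real pattern operates
on a DImode register.

```c
#include <stdint.h>

/* Branching form: if any of the low 11 bits are set, clear them and
   set bit 11 as a sticky bit.  */
uint64_t
round_sticky_branch (uint64_t x)
{
  if (x & 2047)
    {
      x &= ~(uint64_t) 2047;
      x |= 2048;
    }
  return x;
}

/* Branch-free form: adding 2047 to the low 11 bits carries into bit 11
   exactly when those bits are nonzero.  */
uint64_t
round_sticky_branchless (uint64_t x)
{
  uint64_t tmp = x & 2047;
  tmp += 2047;
  x |= tmp;
  x &= ~(uint64_t) 2047;
  return x;
}

/* Return 1 if both forms agree on every value in [lo, hi).  */
int
forms_agree (uint64_t lo, uint64_t hi)
{
  for (uint64_t x = lo; x < hi; x++)
    if (round_sticky_branch (x) != round_sticky_branchless (x))
      return 0;
  return 1;
}
```

When the low bits are zero, tmp is 2047, so the OR followed by the AND
leaves x unchanged.  Otherwise tmp lies in 2048..4094, so bit 11 is always
set and nothing above bit 11 is touched.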

	PR rtl-optimization/16356
	* config/rs6000/rs6000.md (floatdisf2_internal2): Rewrite with
	separate output register and one less jump.  Enable for powerpc64.
	(floatdisf2): Adjust for above.

Bootstrapped and regression tested powerpc64-linux.  Also tested that
a few G random long values convert to exactly the same floats as older
compilers give.  OK to apply?  3.4 too?

--- gcc-virgin/gcc/config/rs6000/rs6000.md	2004-11-15 17:05:37.000000000 +1030
+++ gcc-current/gcc/config/rs6000/rs6000.md	2004-11-26 08:48:09.671759913 +1030
@@ -5402,16 +5402,18 @@
 (define_expand "floatdisf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "")
         (float:SF (match_operand:DI 1 "gpc_reg_operand" "")))]
-  "TARGET_64BIT && TARGET_HARD_FLOAT && TARGET_FPRS"
+  "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS"
   "
 {
+  rtx val = operands[1];
   if (!flag_unsafe_math_optimizations)
     {
       rtx label = gen_label_rtx ();
-      emit_insn (gen_floatdisf2_internal2 (operands[1], label));
+      val = gen_reg_rtx (DImode);
+      emit_insn (gen_floatdisf2_internal2 (val, operands[1], label));
       emit_label (label);
     }
-  emit_insn (gen_floatdisf2_internal1 (operands[0], operands[1]));
+  emit_insn (gen_floatdisf2_internal1 (operands[0], val));
   DONE;
 }")
 
@@ -5436,30 +5438,31 @@
 ;; by a bit that won't be lost at that stage, but is below the SFmode
 ;; rounding position.
 (define_expand "floatdisf2_internal2"
-  [(parallel [(set (match_dup 4)
-		   (compare:CC (and:DI (match_operand:DI 0 "" "")
-				       (const_int 2047))
-			       (const_int 0)))
-	      (set (match_dup 2) (and:DI (match_dup 0) (const_int 2047)))
-	      (clobber (match_scratch:CC 7 ""))])
-   (set (match_dup 3) (ashiftrt:DI (match_dup 0) (const_int 53)))
-   (set (match_dup 3) (plus:DI (match_dup 3) (const_int 1)))
-   (set (pc) (if_then_else (eq (match_dup 4) (const_int 0))
-			   (label_ref (match_operand:DI 1 "" ""))
-			   (pc)))
-   (set (match_dup 5) (compare:CCUNS (match_dup 3) (const_int 2)))
-   (set (pc) (if_then_else (ltu (match_dup 5) (const_int 0))
-			   (label_ref (match_dup 1))
+  [(set (match_dup 3) (ashiftrt:DI (match_operand:DI 1 "" "")
+				   (const_int 53)))
+   (parallel [(set (match_operand:DI 0 "" "") (and:DI (match_dup 1)
+						      (const_int 2047)))
+	      (clobber (scratch:CC))])
+   (set (match_dup 3) (plus:DI (match_dup 3)
+			       (const_int 1)))
+   (set (match_dup 0) (plus:DI (match_dup 0)
+			       (const_int 2047)))
+   (set (match_dup 4) (compare:CCUNS (match_dup 3)
+				     (const_int 3)))
+   (set (match_dup 0) (ior:DI (match_dup 0)
+			      (match_dup 1)))
+   (parallel [(set (match_dup 0) (and:DI (match_dup 0)
+					 (const_int -2048)))
+	      (clobber (scratch:CC))])
+   (set (pc) (if_then_else (geu (match_dup 4) (const_int 0))
+			   (label_ref (match_operand:DI 2 "" ""))
 			   (pc)))
-   (set (match_dup 0) (xor:DI (match_dup 0) (match_dup 2)))
-   (set (match_dup 0) (ior:DI (match_dup 0) (const_int 2048)))]
-  "TARGET_64BIT && TARGET_HARD_FLOAT && TARGET_FPRS"
+   (set (match_dup 0) (match_dup 1))]
+  "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS"
   "
 {
-  operands[2] = gen_reg_rtx (DImode);
   operands[3] = gen_reg_rtx (DImode);
-  operands[4] = gen_reg_rtx (CCmode);
-  operands[5] = gen_reg_rtx (CCUNSmode);
+  operands[4] = gen_reg_rtx (CCUNSmode);
 }")
 \f
 ;; Define the DImode operations that can be done in a small number

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix PR16356
       [not found]           ` <amodra@bigpond.net.au>
                               ` (25 preceding siblings ...)
  2004-11-24 18:27             ` [RFC] PowerPC sCC patterns David Edelsohn
@ 2004-11-26  4:45             ` David Edelsohn
  2004-11-27  0:00             ` [RS6000] Fix PR12817 David Edelsohn
                               ` (34 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-11-26  4:45 UTC (permalink / raw)
  To: gcc-patches

	PR rtl-optimization/16356
	* config/rs6000/rs6000.md (floatdisf2_internal2): Rewrite with
	separate output register and one less jump.  Enable for powerpc64.
	(floatdisf2): Adjust for above.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [RS6000] Fix PR12817
@ 2004-11-26 10:32 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-11-26 10:32 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

The comment below says it all.  I've added a testcase to the pr.

	PR target/12817
	* config/rs6000/rs6000.c (rs6000_emit_prologue): Use r11 for vrsave.

Bootstrap and regression test powerpc-linux in progress.  This ought to
count as obvious, but I'll ask.  OK to install?  gcc-3.4 and gcc-3.3
too?

diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-11-26 14:48:20.163051172 +1030
+++ gcc-current/gcc/config/rs6000/rs6000.c	2004-11-26 19:45:39.482476440 +1030
@@ -14026,8 +14042,9 @@ rs6000_emit_prologue (void)
       rtx reg, mem, vrsave;
       int offset;
 
-      /* Get VRSAVE onto a GPR.  */
-      reg = gen_rtx_REG (SImode, 12);
+      /* Get VRSAVE onto a GPR.  Note that ABI_V4 might be using r12
+	 as frame_reg_rtx.  */
+      reg = gen_rtx_REG (SImode, 11);
       vrsave = gen_rtx_REG (SImode, VRSAVE_REGNO);
       if (TARGET_MACHO)
 	emit_insn (gen_get_vrsave_internal (reg));

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
       [not found]           ` <amodra@bigpond.net.au>
                               ` (26 preceding siblings ...)
  2004-11-26  4:45             ` [PATCH] Fix PR16356 David Edelsohn
@ 2004-11-27  0:00             ` David Edelsohn
  2004-11-27  0:18               ` Alan Modra
                                 ` (2 more replies)
  2004-11-27 22:03             ` David Edelsohn
                               ` (33 subsequent siblings)
  61 siblings, 3 replies; 875+ messages in thread
From: David Edelsohn @ 2004-11-27  0:00 UTC (permalink / raw)
  To: gcc-patches, Dale Johannesen, Stan Shebs, Geoff Keating

>>>>> Alan Modra writes:

	PR target/12817
	* config/rs6000/rs6000.c (rs6000_emit_prologue): Use r11 for vrsave.

Alan> Bootstrap and regression test powerpc-linux in progress.  This ought to
Alan> count as obvious, but I'll ask.  OK to install?  gcc-3.4 and gcc-3.3
Alan> too?

	This looks okay, but rs6000_emit_prologue is becoming quite
incomprehensible now.  The change affects Apple, so I'd like to give them
a chance to comment before you drop it in.  If there's no objection by the
end of the month, go ahead.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27  0:00             ` [RS6000] Fix PR12817 David Edelsohn
@ 2004-11-27  0:18               ` Alan Modra
  2004-11-27  4:55                 ` Geoffrey Keating
                                   ` (2 more replies)
  2004-11-27  4:15               ` Geoffrey Keating
  2004-11-27 22:30               ` Mike Stump
  2 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2004-11-27  0:18 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Dale Johannesen, Stan Shebs, Geoff Keating

On Fri, Nov 26, 2004 at 06:31:58PM -0500, David Edelsohn wrote:
> >>>>> Alan Modra writes:
> 
> 	PR target/12817
> 	* config/rs6000/rs6000.c (rs6000_emit_prologue): Use r11 for vrsave.
> 
> Alan> Bootstrap and regression test powerpc-linux in progress.  This ought to
> Alan> count as obvious, but I'll ask.  OK to install?  gcc-3.4 and gcc-3.3
> Alan> too?
> 
> 	This looks okay, but rs6000_emit_prologue is becoming quite
> incomprehensible now.

Yes, I think we would do well to split the function into separate ones
for each major ABI.

>  The change affects Apple, so I'd like to give them
> a chance to comment before you drop it in.  If there's no objection by the
> end of the month, go ahead.

The main question of course, is whether r11 is free to use at that point
for all ABIs.  I reckon it is, and I've looked carefully.

BTW, it's rather weird that ABI_V4 uses r12 in the prologue as a frame
pointer, and r11 in the epilogue.  For one, it makes it impossible to
use ABI compliant _save* functions since the SYSV ABI says to use r11.
A further complication is that the PowerPC64 ABI says the register save
and restore functions use r12.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27  0:00             ` [RS6000] Fix PR12817 David Edelsohn
  2004-11-27  0:18               ` Alan Modra
@ 2004-11-27  4:15               ` Geoffrey Keating
  2004-11-27 22:30               ` Mike Stump
  2 siblings, 0 replies; 875+ messages in thread
From: Geoffrey Keating @ 2004-11-27  4:15 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Dale Johannesen, Stan Shebs

[-- Attachment #1: Type: text/plain, Size: 1165 bytes --]

On 26/11/2004, at 3:31 PM, David Edelsohn wrote:

>>>>>> Alan Modra writes:
>
> 	PR target/12817
> 	* config/rs6000/rs6000.c (rs6000_emit_prologue): Use r11 for vrsave.
>
> Alan> Bootstrap and regression test powerpc-linux in progress.  This 
> ought to
> Alan> count as obvious, but I'll ask.  OK to install?  gcc-3.4 and 
> gcc-3.3
> Alan> too?
>
> 	This looks okay, but rs6000_emit_prologue is becoming quite
> incomprehensible now.  The change affects Apple, so I'd like to give 
> them
> a chance to comment before you drop it in.  If there's no objection by 
> the
> end of the month, go ahead.

My suggestion is that rs6000_emit_prologue needs a real register 
allocator (although not a very complex one).  My suggestion would be to 
write a pair of routines, one that takes a register which is now 
available for allocation, and one that allocates a register, and use 
those: code for each ABI would initially make registers available for 
allocation, and more would become available as the routine progresses; 
the routine would need (sometimes) to be able to handle a "sorry, 
nothing available" response and delay that particular operation until 
later.
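
A rough sketch of what such a pair of routines might look like, assuming a
simple bitmask of free GPRs.  The names prologue_reg_release and
prologue_reg_alloc and the -1 "nothing available" return are hypothetical;
nothing like this exists in rs6000.c.

```c
/* Bitmask of GPRs currently free for prologue scratch use
   (bit N set means rN is available).  Hypothetical state.  */
static unsigned int free_gprs;

/* Make register REGNO available for allocation.  */
void
prologue_reg_release (int regno)
{
  free_gprs |= 1u << regno;
}

/* Allocate the lowest-numbered free register and return its number,
   or -1 if nothing is available; the caller would then have to delay
   the operation until more registers have been released.  */
int
prologue_reg_alloc (void)
{
  for (int regno = 0; regno < 32; regno++)
    if (free_gprs & (1u << regno))
      {
	free_gprs &= ~(1u << regno);
	return regno;
      }
  return -1;
}
```

Each ABI would start by releasing the registers it knows to be free, and
release more as saves complete.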

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27  0:18               ` Alan Modra
@ 2004-11-27  4:55                 ` Geoffrey Keating
  2004-11-27  8:41                   ` Alan Modra
  2004-11-27  7:34                 ` Alan Modra
  2004-11-27 19:56                 ` Dale Johannesen
  2 siblings, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2004-11-27  4:55 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, Dale Johannesen, Stan Shebs, David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 681 bytes --]


On 26/11/2004, at 4:00 PM, Alan Modra wrote:

> On Fri, Nov 26, 2004 at 06:31:58PM -0500, David Edelsohn wrote:
>>>>>>> Alan Modra writes:
>>
>> 	PR target/12817
>> 	* config/rs6000/rs6000.c (rs6000_emit_prologue): Use r11 for vrsave.
>>
>> Alan> Bootstrap and regression test powerpc-linux in progress.  This 
>> ought to
>> Alan> count as obvious, but I'll ask.  OK to install?  gcc-3.4 and 
>> gcc-3.3
>> Alan> too?
>>
>> 	This looks okay, but rs6000_emit_prologue is becoming quite
>> incomprehensible now.
>
> Yes, I think we would do well to split the function into separate ones
> for each major ABI.

I think that would just quadruple the amount of incomprehensible code.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27  0:18               ` Alan Modra
  2004-11-27  4:55                 ` Geoffrey Keating
@ 2004-11-27  7:34                 ` Alan Modra
  2004-11-27 19:56                 ` Dale Johannesen
  2 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-11-27  7:34 UTC (permalink / raw)
  To: David Edelsohn, gcc-patches, Dale Johannesen, Stan Shebs, Geoff Keating

On Sat, Nov 27, 2004 at 10:30:00AM +1030, Alan Modra wrote:
> The main question of course, is whether r11 is free to use at that point
> for all ABIs.  I reckon it is, and I've looked carefully.

I forgot something. :-(  r11 is used as the static chain, so this would
break nested functions.  r0 is free though, so I've updated the patch
accordingly.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27  4:55                 ` Geoffrey Keating
@ 2004-11-27  8:41                   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-11-27  8:41 UTC (permalink / raw)
  To: Geoffrey Keating; +Cc: gcc-patches, Dale Johannesen, Stan Shebs, David Edelsohn

On Fri, Nov 26, 2004 at 07:40:58PM -0800, Geoffrey Keating wrote:
> I think that would just quadruple the amount of incomprehensible code.

Perhaps so.  Hmm, talking about quadrupling code, the save_world_p
stuff is useless bloat for anything other than Darwin.  Here is a fix.

	* config/rs6000/rs6000.h (WORLD_SAVE_P): Define.
	* config/rs6000/darwin.h (WORLD_SAVE_P): Define.
	* config/rs6000/rs6000.c (compute_save_world_info): Use WORLD_SAVE_P
	to allow non-darwin ABIs to optimize away code.
	(rs6000_emit_prologue, rs6000_emit_epilogue): Likewise.

   text    data     bss     dec     hex filename
 184620   29152    5032  218804   356b4 rs6000.o.orig
 180948   29096    5032  215076   34824 rs6000.o

powerpc-linux bootstrap in progress.  OK for mainline?

diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/darwin.h gcc-current/gcc/config/rs6000/darwin.h
--- gcc-virgin/gcc/config/rs6000/darwin.h	2004-11-26 14:48:19.770112722 +1030
+++ gcc-current/gcc/config/rs6000/darwin.h	2004-11-27 15:21:48.944295052 +1030
@@ -208,6 +208,10 @@ do {									\
 #undef	FP_SAVE_INLINE
 #define FP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 64)
 
+/* Darwin uses a function call if everything needs to be saved/restored.  */
+#undef WORLD_SAVE_P
+#define WORLD_SAVE_P(INFO) ((INFO)->world_save_p)
+
 /* The assembler wants the alternate register names, but without
    leading percent sign.  */
 #undef REGISTER_NAMES
diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.h gcc-current/gcc/config/rs6000/rs6000.h
--- gcc-virgin/gcc/config/rs6000/rs6000.h	2004-11-19 08:39:47.000000000 +1030
+++ gcc-current/gcc/config/rs6000/rs6000.h	2004-11-27 15:06:06.443374912 +1030
@@ -1627,6 +1622,10 @@ extern enum rs6000_abi rs6000_current_ab
 #define CALL_LONG		0x00000008	/* always call indirect */
 #define CALL_LIBCALL		0x00000010	/* libcall */
 
+/* We don't have prologue and epilogue functions to save/restore
+   everything for most ABIs.  */
+#define WORLD_SAVE_P(INFO) 0
+
 /* 1 if N is a possible register number for a function value
    as seen by the caller.
 
diff -urp -xCVS -x'*~' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-11-27 11:30:01.513696539 +1030
+++ gcc-current/gcc/config/rs6000/rs6000.c	2004-11-27 15:21:41.273499406 +1030
@@ -12428,17 +12443,19 @@ compute_vrsave_mask (void)
 static void
 compute_save_world_info(rs6000_stack_t *info_ptr)
 {
-  info_ptr->world_save_p =
-    (DEFAULT_ABI == ABI_DARWIN)
-    && ! (current_function_calls_setjmp && flag_exceptions)
-    && info_ptr->first_fp_reg_save == FIRST_SAVED_FP_REGNO
-    && info_ptr->first_gp_reg_save == FIRST_SAVED_GP_REGNO
-    && info_ptr->first_altivec_reg_save == FIRST_SAVED_ALTIVEC_REGNO
-    && info_ptr->cr_save_p;
+  info_ptr->world_save_p = 1;
+  info_ptr->world_save_p
+    = (WORLD_SAVE_P (info_ptr)
+       && DEFAULT_ABI == ABI_DARWIN
+       && ! (current_function_calls_setjmp && flag_exceptions)
+       && info_ptr->first_fp_reg_save == FIRST_SAVED_FP_REGNO
+       && info_ptr->first_gp_reg_save == FIRST_SAVED_GP_REGNO
+       && info_ptr->first_altivec_reg_save == FIRST_SAVED_ALTIVEC_REGNO
+       && info_ptr->cr_save_p);
 
   /* This will not work in conjunction with sibcalls.  Make sure there
      are none.  (This check is expensive, but seldom executed.) */
-  if ( info_ptr->world_save_p )
+  if (WORLD_SAVE_P (info_ptr))
     {
       rtx insn;
       for ( insn = get_last_insn_anywhere (); insn; insn = PREV_INSN (insn))
@@ -12450,7 +12467,7 @@ compute_save_world_info(rs6000_stack_t *
 	  }
     }
 
-  if (info_ptr->world_save_p)
+  if (WORLD_SAVE_P (info_ptr))
     {
       /* Even if we're not touching VRsave, make sure there's room on the
 	 stack for it, if it looks like we're calling SAVE_WORLD, which
@@ -13862,7 +13879,7 @@ rs6000_emit_prologue (void)
     }
 
   /* Handle world saves specially here.  */
-  if (info->world_save_p)
+  if (WORLD_SAVE_P (info))
     {
       int i, j, sz;
       rtx treg;
@@ -13982,7 +13999,7 @@ rs6000_emit_prologue (void)
     }
 
   /* Save AltiVec registers if needed.  */
-  if (! info->world_save_p && TARGET_ALTIVEC_ABI && info->altivec_size != 0)
+  if (!WORLD_SAVE_P (info) && TARGET_ALTIVEC_ABI && info->altivec_size != 0)
     {
       int i;
 
@@ -14023,7 +14040,7 @@ rs6000_emit_prologue (void)
      epilogue.  */
 
   if (TARGET_ALTIVEC && TARGET_ALTIVEC_VRSAVE
-      && ! info->world_save_p && info->vrsave_mask != 0)
+      && !WORLD_SAVE_P (info) && info->vrsave_mask != 0)
     {
       rtx reg, mem, vrsave;
       int offset;
@@ -14051,7 +14070,7 @@ rs6000_emit_prologue (void)
     }
 
   /* If we use the link register, get it into r0.  */
-  if (! info->world_save_p && info->lr_save_p)
+  if (!WORLD_SAVE_P (info) && info->lr_save_p)
     {
       insn = emit_move_insn (gen_rtx_REG (Pmode, 0),
 			     gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM));
@@ -14059,7 +14078,7 @@ rs6000_emit_prologue (void)
     }
 
   /* If we need to save CR, put it into r12.  */
-  if (! info->world_save_p && info->cr_save_p && frame_reg_rtx != frame_ptr_rtx)
+  if (!WORLD_SAVE_P (info) && info->cr_save_p && frame_reg_rtx != frame_ptr_rtx)
     {
       rtx set;
 
@@ -14081,7 +14100,7 @@ rs6000_emit_prologue (void)
 
   /* Do any required saving of fpr's.  If only one or two to save, do
      it ourselves.  Otherwise, call function.  */
-  if (! info->world_save_p && saving_FPRs_inline)
+  if (!WORLD_SAVE_P (info) && saving_FPRs_inline)
     {
       int i;
       for (i = 0; i < 64 - info->first_fp_reg_save; i++)
@@ -14092,7 +14111,7 @@ rs6000_emit_prologue (void)
 			   info->fp_save_offset + sp_offset + 8 * i,
 			   info->total_size);
     }
-  else if (! info->world_save_p && info->first_fp_reg_save != 64)
+  else if (!WORLD_SAVE_P (info) && info->first_fp_reg_save != 64)
     {
       int i;
       char rname[30];
@@ -14128,7 +14147,7 @@ rs6000_emit_prologue (void)
 
   /* Save GPRs.  This is done as a PARALLEL if we are using
      the store-multiple instructions.  */
-  if (! info->world_save_p && using_store_multiple)
+  if (!WORLD_SAVE_P (info) && using_store_multiple)
     {
       rtvec p;
       int i;
@@ -14150,7 +14169,7 @@ rs6000_emit_prologue (void)
       rs6000_frame_related (insn, frame_ptr_rtx, info->total_size,
 			    NULL_RTX, NULL_RTX);
     }
-  else if (! info->world_save_p)
+  else if (!WORLD_SAVE_P (info))
     {
       int i;
       for (i = 0; i < 32 - info->first_gp_reg_save; i++)
@@ -14209,7 +14228,7 @@ rs6000_emit_prologue (void)
 
   /* ??? There's no need to emit actual instructions here, but it's the
      easiest way to get the frame unwind information emitted.  */
-  if (! info->world_save_p && current_function_calls_eh_return)
+  if (!WORLD_SAVE_P (info) && current_function_calls_eh_return)
     {
       unsigned int i, regno;
 
@@ -14244,7 +14263,7 @@ rs6000_emit_prologue (void)
     }
 
   /* Save lr if we used it.  */
-  if (! info->world_save_p && info->lr_save_p)
+  if (!WORLD_SAVE_P (info) && info->lr_save_p)
     {
       rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx,
 			       GEN_INT (info->lr_save_offset + sp_offset));
@@ -14259,7 +14278,7 @@ rs6000_emit_prologue (void)
     }
 
   /* Save CR if we use any that must be preserved.  */
-  if (! info->world_save_p && info->cr_save_p)
+  if (!WORLD_SAVE_P (info) && info->cr_save_p)
     {
       rtx addr = gen_rtx_PLUS (Pmode, frame_reg_rtx,
 			       GEN_INT (info->cr_save_offset + sp_offset));
@@ -14292,7 +14311,7 @@ rs6000_emit_prologue (void)
 
   /* Update stack and set back pointer unless this is V.4,
      for which it was done previously.  */
-  if (! info->world_save_p && info->push_p
+  if (!WORLD_SAVE_P (info) && info->push_p
       && !(DEFAULT_ABI == ABI_V4 || current_function_calls_eh_return))
     rs6000_emit_allocate_stack (info->total_size, FALSE);
 
@@ -14461,7 +14480,7 @@ rs6000_emit_epilogue (int sibcall)
 			 || rs6000_cpu == PROCESSOR_PPC750
 			 || optimize_size);
 
-  if (info->world_save_p)
+  if (WORLD_SAVE_P (info))
     {
       int i, j;
       char rname[30];

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27  0:18               ` Alan Modra
  2004-11-27  4:55                 ` Geoffrey Keating
  2004-11-27  7:34                 ` Alan Modra
@ 2004-11-27 19:56                 ` Dale Johannesen
  2 siblings, 0 replies; 875+ messages in thread
From: Dale Johannesen @ 2004-11-27 19:56 UTC (permalink / raw)
  To: Alan Modra
  Cc: David Edelsohn, gcc-patches, Dale Johannesen, Geoff Keating, Stan Shebs

On Nov 26, 2004, at 4:00 PM, Alan Modra wrote:
> On Fri, Nov 26, 2004 at 06:31:58PM -0500, David Edelsohn wrote:
>>>>>>> Alan Modra writes:
>>
>> 	PR target/12817
>> 	* config/rs6000/rs6000.c (rs6000_emit_prologue): Use r11 for vrsave.
>>
>> Alan> Bootstrap and regression test powerpc-linux in progress.  This 
>> ought to
>> Alan> count as obvious, but I'll ask.  OK to install?  gcc-3.4 and 
>> gcc-3.3
>> Alan> too?
>>
>> 	This looks okay, but rs6000_emit_prologue is becoming quite
>> incomprehensible now.
>
> Yes, I think we would do well to split the function into separate ones
> for each major ABI.

I tried this a year or so ago and gave up.  It doesn't actually help 
with the
comprehensibility all that much, and introduces a lot of duplicated code
as well.  But maybe you can do better.

If you do it you should do rs6000_emit_epilogue as well.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
       [not found]           ` <amodra@bigpond.net.au>
                               ` (27 preceding siblings ...)
  2004-11-27  0:00             ` [RS6000] Fix PR12817 David Edelsohn
@ 2004-11-27 22:03             ` David Edelsohn
  2004-11-27 22:23               ` Mike Stump
                                 ` (2 more replies)
  2004-12-03 16:02             ` mklibgcc fallout David Edelsohn
                               ` (32 subsequent siblings)
  61 siblings, 3 replies; 875+ messages in thread
From: David Edelsohn @ 2004-11-27 22:03 UTC (permalink / raw)
  To: Geoffrey Keating, gcc-patches, Dale Johannesen, Stan Shebs

	* config/rs6000/rs6000.h (WORLD_SAVE_P): Define.
	* config/rs6000/darwin.h (WORLD_SAVE_P): Define.
	* config/rs6000/rs6000.c (compute_save_world_info): Use WORLD_SAVE_P
	to allow non-darwin ABIs to optimize away code.
	(rs6000_emit_prologue, rs6000_emit_epilogue): Likewise.

This looks good to me.  Hopefully Dale or Stan or Geoff will approve the
darwin.h part quickly.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27 22:03             ` David Edelsohn
@ 2004-11-27 22:23               ` Mike Stump
  2004-11-27 22:48               ` Dale Johannesen
  2004-11-29  2:38               ` Geoffrey Keating
  2 siblings, 0 replies; 875+ messages in thread
From: Mike Stump @ 2004-11-27 22:23 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoffrey Keating, gcc-patches, Dale Johannesen, Stan Shebs

On Saturday, November 27, 2004, at 01:37  PM, David Edelsohn wrote:
> 	* config/rs6000/rs6000.h (WORLD_SAVE_P): Define.
> 	* config/rs6000/darwin.h (WORLD_SAVE_P): Define.
> 	* config/rs6000/rs6000.c (compute_save_world_info): Use WORLD_SAVE_P
> 	to allow non-darwin ABIs to optimize away code.
> 	(rs6000_emit_prologue, rs6000_emit_epilogue): Likewise.
>
> This looks good to me.  Hopefully Dale or Stan or Geoff will approve 
> the
> darwin.h part quickly.

Looks good to me, also, the darwin.h part looks trivial enough.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27  0:00             ` [RS6000] Fix PR12817 David Edelsohn
  2004-11-27  0:18               ` Alan Modra
  2004-11-27  4:15               ` Geoffrey Keating
@ 2004-11-27 22:30               ` Mike Stump
  2 siblings, 0 replies; 875+ messages in thread
From: Mike Stump @ 2004-11-27 22:30 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Dale Johannesen, Stan Shebs, Geoff Keating

On Friday, November 26, 2004, at 03:31  PM, David Edelsohn wrote:
> 	This looks okay, but rs6000_emit_prologue is becoming quite
> incomprehensible now.  The change affects Apple, so I'd like to give 
> them
> a chance to comment before you drop it in.

My comment, there is so much cruft we have in our tree (see 
apple-ppc-branch, if you care) in this area, that any edit hurts, so 
we've lost already.  Notice how the live registers randomly change 
about, random insertions of code, etc...

I don't think anyone else should have to pay for this, cept those that 
don't want to fix it (us), so, to me, I think the proposed change is 
fine.

Would be nice to generate rtl for the prologue and let the register 
allocator pick registers, that would clean up a bit of the code.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27 22:03             ` David Edelsohn
  2004-11-27 22:23               ` Mike Stump
@ 2004-11-27 22:48               ` Dale Johannesen
  2004-11-27 22:52                 ` Alan Modra
  2004-11-29  2:38               ` Geoffrey Keating
  2 siblings, 1 reply; 875+ messages in thread
From: Dale Johannesen @ 2004-11-27 22:48 UTC (permalink / raw)
  To: Alan Modra, David Edelsohn
  Cc: gcc-patches, Dale Johannesen, Geoffrey Keating, Stan Shebs


On Nov 27, 2004, at 1:37 PM, David Edelsohn wrote:

> 	* config/rs6000/rs6000.h (WORLD_SAVE_P): Define.
> 	* config/rs6000/darwin.h (WORLD_SAVE_P): Define.
> 	* config/rs6000/rs6000.c (compute_save_world_info): Use WORLD_SAVE_P
> 	to allow non-darwin ABIs to optimize away code.
> 	(rs6000_emit_prologue, rs6000_emit_epilogue): Likewise.
>
> This looks good to me.  Hopefully Dale or Stan or Geoff will approve 
> the
> darwin.h part quickly.

+  info_ptr->world_save_p = 1;
+  info_ptr->world_save_p
+    = (WORLD_SAVE_P (info_ptr)
+       && DEFAULT_ABI == ABI_DARWIN
+       && ! (current_function_calls_setjmp && flag_exceptions)
+       && info_ptr->first_fp_reg_save == FIRST_SAVED_FP_REGNO
+       && info_ptr->first_gp_reg_save == FIRST_SAVED_GP_REGNO
+       && info_ptr->first_altivec_reg_save == FIRST_SAVED_ALTIVEC_REGNO
+       && info_ptr->cr_save_p);

The first line is redundant.
Looks OK otherwise; the darwin.h part is fine.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27 22:48               ` Dale Johannesen
@ 2004-11-27 22:52                 ` Alan Modra
  2004-11-28  0:20                   ` Dale Johannesen
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-11-27 22:52 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: David Edelsohn, gcc-patches, Geoffrey Keating, Stan Shebs

On Sat, Nov 27, 2004 at 02:21:30PM -0800, Dale Johannesen wrote:
> +  info_ptr->world_save_p = 1;
> +  info_ptr->world_save_p
> +    = (WORLD_SAVE_P (info_ptr)
> +       && DEFAULT_ABI == ABI_DARWIN
> +       && ! (current_function_calls_setjmp && flag_exceptions)
> +       && info_ptr->first_fp_reg_save == FIRST_SAVED_FP_REGNO
> +       && info_ptr->first_gp_reg_save == FIRST_SAVED_GP_REGNO
> +       && info_ptr->first_altivec_reg_save == FIRST_SAVED_ALTIVEC_REGNO
> +       && info_ptr->cr_save_p);
> 
> The first line is redundant.

Note the WORLD_SAVE_P test on the next line.  It's part of the trick to
have this function reduce down to

  info_ptr->world_save_p = 0;

on anything but Darwin.
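
A stripped-down sketch of the trick, under stated assumptions: struct
stack_info stands in for rs6000_stack_t, SKETCH_DARWIN stands in for the
darwin.h override, and world_save_for is a hypothetical helper added only
to make the result observable.

```c
/* Hypothetical stand-in for rs6000_stack_t; only the fields used here.  */
struct stack_info
{
  int world_save_p;
  int cr_save_p;
};

/* SKETCH_DARWIN is an assumed switch playing the role of darwin.h,
   which redefines the macro to read the field.  */
#ifdef SKETCH_DARWIN
#define WORLD_SAVE_P(INFO) ((INFO)->world_save_p)
#else
#define WORLD_SAVE_P(INFO) 0
#endif

void
compute_world_save (struct stack_info *info)
{
  /* Set to 1 first so that the macro test below can pass on Darwin,
     where the macro reads this field.  */
  info->world_save_p = 1;
  /* Elsewhere the macro is the constant 0, so the whole condition
     folds to 0 at compile time.  */
  info->world_save_p = (WORLD_SAVE_P (info) && info->cr_save_p);
}

/* Helper for experimenting: returns the computed flag.  */
int
world_save_for (int cr_save_p)
{
  struct stack_info info = { 0, cr_save_p };
  compute_world_save (&info);
  return info.world_save_p;
}
```

Built without SKETCH_DARWIN, world_save_for returns 0 for any input;
built with it, the result follows cr_save_p.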

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27 22:52                 ` Alan Modra
@ 2004-11-28  0:20                   ` Dale Johannesen
  0 siblings, 0 replies; 875+ messages in thread
From: Dale Johannesen @ 2004-11-28  0:20 UTC (permalink / raw)
  To: Alan Modra
  Cc: Geoffrey Keating, David Edelsohn, gcc-patches, Dale Johannesen,
	Stan Shebs

On Nov 27, 2004, at 2:30 PM, Alan Modra wrote:
> On Sat, Nov 27, 2004 at 02:21:30PM -0800, Dale Johannesen wrote:
>> +  info_ptr->world_save_p = 1;
>> +  info_ptr->world_save_p
>> +    = (WORLD_SAVE_P (info_ptr)
>> +       && DEFAULT_ABI == ABI_DARWIN
>> +       && ! (current_function_calls_setjmp && flag_exceptions)
>> +       && info_ptr->first_fp_reg_save == FIRST_SAVED_FP_REGNO
>> +       && info_ptr->first_gp_reg_save == FIRST_SAVED_GP_REGNO
>> +       && info_ptr->first_altivec_reg_save == FIRST_SAVED_ALTIVEC_REGNO
>> +       && info_ptr->cr_save_p);
>>
>> The first line is redundant.
>
> Note the WORLD_SAVE_P test on the next line.  It's part of the trick to
> have this function reduce down to
>
>   info_ptr->world_save_p = 0;
>
> on anything but Darwin.

That should happen whether there's a previous assignment to 1 or not, 
surely?
Oh, I see, it hadn't been set previously on Darwin.  OK then.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR12817
  2004-11-27 22:03             ` David Edelsohn
  2004-11-27 22:23               ` Mike Stump
  2004-11-27 22:48               ` Dale Johannesen
@ 2004-11-29  2:38               ` Geoffrey Keating
  2 siblings, 0 replies; 875+ messages in thread
From: Geoffrey Keating @ 2004-11-29  2:38 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Dale Johannesen, Stan Shebs

[-- Attachment #1: Type: text/plain, Size: 585 bytes --]


On 27/11/2004, at 1:37 PM, David Edelsohn wrote:

> 	* config/rs6000/rs6000.h (WORLD_SAVE_P): Define.
> 	* config/rs6000/darwin.h (WORLD_SAVE_P): Define.
> 	* config/rs6000/rs6000.c (compute_save_world_info): Use WORLD_SAVE_P
> 	to allow non-darwin ABIs to optimize away code.
> 	(rs6000_emit_prologue, rs6000_emit_epilogue): Likewise.
>
> This looks good to me.  Hopefully Dale or Stan or Geoff will approve 
> the
> darwin.h part quickly.

There's no hurry; this is clearly not a regression fix, so the mainline 
is closed to this patch.

However, the patch looks OK to me for 4.1.

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 2408 bytes --]

^ permalink raw reply	[flat|nested] 875+ messages in thread

* mklibgcc fallout
@ 2004-12-02 12:13 Alan Modra
  2004-12-02 17:12 ` Zack Weinberg
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-12-02 12:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: zack

This fixes one problem with Zack's recent mklibgcc changes.  On a native
powerpc64-linux build, I was seeing

checking for sin in -lm... configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
make[1]: *** [configure-target-libstdc++-v3] Error 1

when building multilibs.  This turned out to be due to a bad symlink in
gcc/32,
  libgcc_s_32.so -> 32/libgcc_s.so.1
which should be
  libgcc_s_32.so -> libgcc_s.so.1.

	* mklibgcc.in: Trim directory when substituting shlib_base_name.

See config/t-slibgcc-elf-ver for where shlib_base_name is used.
Tested powerpc64-linux.

--- gcc-virgin/gcc/mklibgcc.in	2004-12-01 08:25:51.000000000 +1030
+++ gcc-current/gcc/mklibgcc.in	2004-12-02 21:41:44.682588912 +1030
@@ -755,13 +755,14 @@
 
   # Shared libraries.
   if [ "$libgcc_s_so" ]; then
+    shlib_base_name=`echo $libgcc_s_so_base | sed 's,.*/,,'`
     echo ""
     echo "$libgcc_s_so: stmp-dirs $libunwind_so"
     echo "	$SHLIB_LINK" \
 	 | sed -e "s%@multilib_flags@%$flags%g" \
 	       -e "s%@multilib_dir@%$dir%g" \
 	       -e "s%@shlib_objs@%\$(filter-out $libgcc_s_so_extra,\$(objects))%g" \
-	       -e "s%@shlib_base_name@%$libgcc_s_so_base%g" \
+	       -e "s%@shlib_base_name@%$shlib_base_name%g" \
 	       -e "s%@shlib_so_name@%$libgcc_s_soname%g" \
 	       -e "s%@shlib_map_file@%$mapfile%g" \
 	       -e "s%@shlib_dir@%$shlib_dir%g" \
@@ -770,13 +771,14 @@
   fi
 
   if [ "$libunwind_so" ]; then
+    shlib_base_name=`echo $libunwind_so_base | sed 's,.*/,,'`
     echo ""
     echo "$libunwind_so: stmp-dirs"
     echo "	$SHLIBUNWIND_LINK" \
 	   | sed -e "s%@multilib_flags@%$flags%g" \
 		 -e "s%@multilib_dir@%$dir%g" \
 		 -e "s%@shlib_objs@%\$(filter-out $libunwind_so_extra,\$(objects))%g" \
-		 -e "s%@shlib_base_name@%$libunwind_so_base%g" \
+		 -e "s%@shlib_base_name@%$shlib_base_name%g" \
 		 -e "s%@shlib_so_name@%$libunwind_soname%g" \
 		 -e "s%@shlib_dir@%$shlib_dir%g" \
 		 -e "s%@shlib_slibdir_qual@%$shlib_dir_qual%g"
-- 
Alan Modra
IBM OzLabs - Linux Technology Centre
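
A minimal sketch of the directory trimming the patch above relies on,
assuming libgcc_s_so_base holds values of the form "$dir/libgcc_s".  The
helper name trim_dir and the sample inputs are made up for illustration;
only the sed expression is taken from the patch.

```shell
# trim_dir is a hypothetical helper; mklibgcc.in does this inline.
trim_dir() {
  # ".*/" is greedy, so everything up to and including the LAST slash
  # is deleted.  A value without a slash passes through unchanged.
  echo "$1" | sed 's,.*/,,'
}

trim_dir 32/libgcc_s            # prints libgcc_s
trim_dir 32/nof/libgcc_s_nof    # prints libgcc_s_nof
trim_dir libgcc_s               # prints libgcc_s
```

Because the result no longer carries a directory, a link created from it
inside the multilib directory points at a file in that same directory.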

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 12:13 mklibgcc fallout Alan Modra
@ 2004-12-02 17:12 ` Zack Weinberg
  2004-12-02 18:21   ` Joseph S. Myers
  0 siblings, 1 reply; 875+ messages in thread
From: Zack Weinberg @ 2004-12-02 17:12 UTC (permalink / raw)
  To: gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:

> This fixes one problem with Zack's recent mklibgcc changes.  On a native
> powerpc64-linux build, I was seeing
>
> checking for sin in -lm... configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
> make[1]: *** [configure-target-libstdc++-v3] Error 1
>
> when building multilibs.  This turned out to be due to a bad symlink in
> gcc/32,
>   libgcc_s_32.so -> 32/libgcc_s.so.1
> which should be
>   libgcc_s_32.so -> libgcc_s.so.1.
>
> 	* mklibgcc.in: Trim directory when substituting shlib_base_name.

I'm not sure this is right.  David Edelsohn was saying yesterday on
IRC that all the shared libgcc variants are supposed to wind up in the
toplevel, in which case the symlink *should* be 32/libgcc_s.so.1, it
should just be in the parent directory.

zw
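
The question here turns on how a relative symlink target is resolved: it
is taken relative to the directory that holds the link, not the current
directory.  A rough sketch in a throwaway directory, with an empty file
standing in for the real library.  The link name libgcc_s_32_fixed.so and
the helper link_state are hypothetical, used only to keep the three cases
side by side.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/32"
: > "$tmp/32/libgcc_s.so.1"     # empty stand-in for the shared library

# Link inside 32/ whose target repeats the directory.  It resolves to
# 32/32/libgcc_s.so.1, which does not exist.
ln -s 32/libgcc_s.so.1 "$tmp/32/libgcc_s_32.so"

# Same target, but the link sits in the parent directory.
ln -s 32/libgcc_s.so.1 "$tmp/libgcc_s_32.so"

# Link inside 32/ with the directory trimmed from the target.
ln -s libgcc_s.so.1 "$tmp/32/libgcc_s_32_fixed.so"

# "test -e" follows symlinks, so it fails on a dangling link.
link_state() { if [ -e "$1" ]; then echo ok; else echo dangling; fi; }

in_subdir_with_dir=$(link_state "$tmp/32/libgcc_s_32.so")
in_parent_with_dir=$(link_state "$tmp/libgcc_s_32.so")
in_subdir_trimmed=$(link_state "$tmp/32/libgcc_s_32_fixed.so")

echo "32/ link, target 32/libgcc_s.so.1:     $in_subdir_with_dir"
echo "parent link, target 32/libgcc_s.so.1:  $in_parent_with_dir"
echo "32/ link, target libgcc_s.so.1:        $in_subdir_trimmed"

rm -rf "$tmp"
```

Either of the last two arrangements gives a working link; which one is
wanted depends on where the libraries are supposed to live.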

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 17:12 ` Zack Weinberg
@ 2004-12-02 18:21   ` Joseph S. Myers
  2004-12-02 18:47     ` Zack Weinberg
  0 siblings, 1 reply; 875+ messages in thread
From: Joseph S. Myers @ 2004-12-02 18:21 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc-patches

On Thu, 2 Dec 2004, Zack Weinberg wrote:

> > when building multilibs.  This turned out to be due to a bad symlink in
> > gcc/32,
> >   libgcc_s_32.so -> 32/libgcc_s.so.1
> > which should be
> >   libgcc_s_32.so -> libgcc_s.so.1.
> >
> > 	* mklibgcc.in: Trim directory when substituting shlib_base_name.
> 
> I'm not sure this is right.  David Edelsohn was saying yesterday on
> IRC that all the shared libgcc variants are supposed to wind up in the
> toplevel, in which case the symlink *should* be 32/libgcc_s.so.1, it
> should just be in the parent directory.

My understanding from Mark 
<http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01170.html> was that the 
link should go in the same directory as the linked-to file (although the 
driver patch to pass a -L path only for the correct multilib variant, so 
links at toplevel wouldn't be seen at all, has been postponed until 4.1).

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 18:21   ` Joseph S. Myers
@ 2004-12-02 18:47     ` Zack Weinberg
  2004-12-02 19:02       ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Zack Weinberg @ 2004-12-02 18:47 UTC (permalink / raw)
  To: Joseph S. Myers, Alan Modra, David Edelsohn; +Cc: gcc-patches

"Joseph S. Myers" <joseph@codesourcery.com> writes:

> On Thu, 2 Dec 2004, Zack Weinberg wrote:
>
>> > when building multilibs.  This turned out to be due to a bad symlink in
>> > gcc/32,
>> >   libgcc_s_32.so -> 32/libgcc_s.so.1
>> > which should be
>> >   libgcc_s_32.so -> libgcc_s.so.1.
>> >
>> > 	* mklibgcc.in: Trim directory when substituting shlib_base_name.
>> 
>> I'm not sure this is right.  David Edelsohn was saying yesterday on
>> IRC that all the shared libgcc variants are supposed to wind up in the
>> toplevel, in which case the symlink *should* be 32/libgcc_s.so.1, it
>> should just be in the parent directory.
>
> My understanding from Mark 
> <http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01170.html> was that the 
> link should go in the same directory as the linked-to file (although the 
> driver patch to pass a -L path only for the correct multilib variant, so 
> links at toplevel wouldn't be seen at all, has been postponed until 4.1).

Hm, so is it the install rule that's incorrect, then?  Quoting from
IRC

<Rhyolite> The AIX "make install" is showing failures because
           libgcc_s_* does not exist in the directory
<Rhyolite> because libgcc.mk is placing it in the subdirectories
<Rhyolite> libgcc.mk install does
<Rhyolite> ../install-sh -c -m 644 libgcc_s_pthread.a $(DESTDIR)$(slibdir)/
<Rhyolite> which is expecting the file in the main build directory,
           where it used to be placed

zw

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 18:47     ` Zack Weinberg
@ 2004-12-02 19:02       ` David Edelsohn
  2004-12-02 21:05         ` Richard Henderson
  2004-12-02 21:37         ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2004-12-02 19:02 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Joseph S. Myers, Alan Modra, gcc-patches

>>>>> Zack Weinberg writes:

Zack> Hm, so is it the install rule that's incorrect, then?  Quoting from
Zack> IRC

Zack> <Rhyolite> The AIX "make install" is showing failures because
Zack> libgcc_s_* does not exist in the directory
Zack> <Rhyolite> because libgcc.mk is placing it in the subdirectories
Zack> <Rhyolite> libgcc.mk install does
Zack> <Rhyolite> ../install-sh -c -m 644 libgcc_s_pthread.a $(DESTDIR)$(slibdir)/
Zack> <Rhyolite> which is expecting the file in the main build directory,
Zack> where it used to be placed

	The make install rule is correct, or at least was correct.  All of
the shared libraries have unique names (libgcc_s.so, libgcc_s_pthread.so,
libgcc_s_ppc64.so, etc.) unlike the libgcc.a static libraries.  I prefer
the original behavior of building and installing all of them in the same
library directory.

	I believe that Richard Henderson implemented the original
semantics.  He probably should be consulted on any change.

	I personally think that someone should present an extremely
convincing case for changing the semantics before accepting that change.
Users now expect to specify one shared library directory for all GCC
shared libraries associated with a release, including runtime dlopen.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 19:02       ` David Edelsohn
@ 2004-12-02 21:05         ` Richard Henderson
  2004-12-02 21:37         ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2004-12-02 21:05 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, Joseph S. Myers, Alan Modra, gcc-patches

On Thu, Dec 02, 2004 at 02:02:28PM -0500, David Edelsohn wrote:
> Users now expect to specify one shared library directory for all GCC
> shared libraries associated with a release, including runtime dlopen.

Indeed, and the naming scheme was specifically chosen so that
this would work.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 19:02       ` David Edelsohn
  2004-12-02 21:05         ` Richard Henderson
@ 2004-12-02 21:37         ` Alan Modra
  2004-12-02 22:10           ` Zack Weinberg
  2004-12-02 23:13           ` Andreas Schwab
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-12-02 21:37 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, Joseph S. Myers, gcc-patches

I noticed another problem yesterday with the current mklibgcc, but
didn't mention it:  Dependencies for the shared libs are wrong on
powerpc-linux.

That's because we have two sets of directory names, one given by
MULTILIB_DIRNAMES, and the other by MULTILIB_OSDIRNAMES.  For some
reason, shlib_dir isn't set unless MULTILIB_OSDIRNAMES is used, and
powerpc-linux doesn't use MULTILIB_OSDIRNAMES.  We end up with
libgcc_s_nof.so.1 being built in gcc/ from objects in gcc/libgcc/nof/,
with dependencies like

nof/libgcc_s_nof.so: libgcc/nof/_muldi3_s.o

So it looks like Zack wants to put both the .so.1 and .so link in the
MULTILIB_DIRNAMES directory.  I think that's the right move too, but
doing so will require changes to t-slibgcc-elf-ver and similar config
files.  I know gcc will find the .so files there.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 21:37         ` Alan Modra
@ 2004-12-02 22:10           ` Zack Weinberg
  2004-12-02 22:27             ` David Edelsohn
  2004-12-03  9:09             ` Alan Modra
  2004-12-02 23:13           ` Andreas Schwab
  1 sibling, 2 replies; 875+ messages in thread
From: Zack Weinberg @ 2004-12-02 22:10 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Joseph S. Myers, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:

> I noticed another problem yesterday with the current mklibgcc, but
> didn't mention it:  Dependencies for the shared libs are wrong on
> powerpc-linux.
>
> That's because we have two sets of directory names, one given by
> MULTILIB_DIRNAMES, and the other by MULTILIB_OSDIRNAMES.  For some
> reason, shlib_dir isn't set unless MULTILIB_OSDIRNAMES is used, and
> powerpc-linux doesn't use MULTILIB_OSDIRNAMES.  We end up with
> libgcc_s_nof.so.1 being built in gcc/ from objects in gcc/libgcc/nof/,
> with dependencies like
>
> nof/libgcc_s_nof.so: libgcc/nof/_muldi3_s.o
>
> So it looks like Zack wants to put both the .so.1 and .so link in the
> MULTILIB_DIRNAMES directory.  I think that's the right move too, but
> doing so will require changes to t-slibgcc-elf-ver and similar config
> files.  I know gcc will find the .so files there.

I don't want anything.  I do not know what is right.  I am an empty
vessel.

Y'all sort out what should be going on and I'll make it happen, if
y'all don't beat me to it.

zw

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 22:10           ` Zack Weinberg
@ 2004-12-02 22:27             ` David Edelsohn
  2004-12-02 23:08               ` Zack Weinberg
  2004-12-03  9:09             ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-12-02 22:27 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Joseph S. Myers, gcc-patches

>>>>> Zack Weinberg writes:

Zack> Y'all sort out what should be going on and I'll make it happen, if
Zack> y'all don't beat me to it.

	The original semantics should be restored: all libgcc_s_* shared
objects built in the main gcc build directory, not multilib
subdirectories, and all libgcc_s_* shared objects installed in the lib
directory, not multilib subdirectories.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 22:27             ` David Edelsohn
@ 2004-12-02 23:08               ` Zack Weinberg
  2004-12-02 23:26                 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Zack Weinberg @ 2004-12-02 23:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Joseph S. Myers, gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

>>>>>> Zack Weinberg writes:
>
> Zack> Y'all sort out what should be going on and I'll make it happen, if
> Zack> y'all don't beat me to it.
>
> 	The original semantics should be restored: all libgcc_s_* shared
> objects built in the main gcc build directory, not multilib
> subdirectories, and all libgcc_s_* shared objects installed in the lib
> directory, not multilib subdirectories.

It's not that simple.

A powerpc64-linux build tree from October contains the following files
named libgcc_s*so* which are of principal relevance

libgcc_s.so.1
32/libgcc_s.so.1
32/nof/libgcc_s_nof.so.1
libgcc_s.so -> libgcc_s.so.1
libgcc_s_32.so -> 32/libgcc_s.so.1
libgcc_s_32_nof.so -> 32/nof/libgcc_s_nof.so.1

You can see that this doesn't correspond exactly to what either you or
Alan has been saying.  The actual shared libraries are in the multilib
directories and all have the same basename, the files with suffixed
names are all together in the toplevel but they're symlinks.

There are also several other files which are a consequence of the
bootstrap process:

libgcc_s.so.1.stage1
libgcc_s.so.1.stage2
32/libgcc_s.so.1.stage2
32/nof/libgcc_s_nof.so.1.stage2
stage1/32/libgcc_s.so.1
stage1/32/nof/libgcc_s_nof.so.1
stage1/libgcc_s.so
stage2/32/libgcc_s.so.1
stage2/32/nof/libgcc_s_nof.so.1
stage2/libgcc_s.so

I can say with some confidence that the .stageN suffixed variants
should not exist, but not much else.  I don't have any confidence that
any of these layouts is right: not what you say, not what Alan says,
not the way it was in October.

zw

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 21:37         ` Alan Modra
  2004-12-02 22:10           ` Zack Weinberg
@ 2004-12-02 23:13           ` Andreas Schwab
  1 sibling, 0 replies; 875+ messages in thread
From: Andreas Schwab @ 2004-12-02 23:13 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, Joseph S. Myers, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:

> I noticed another problem yesterday with the current mklibgcc, but
> didn't mention it:  Dependencies for the shared libs are wrong on
> powerpc-linux.
>
> That's because we have two sets of directory names, one given by
> MULTILIB_DIRNAMES, and the other by MULTILIB_OSDIRNAMES.  For some
> reason, shlib_dir isn't set unless MULTILIB_OSDIRNAMES is used, and
> powerpc-linux doesn't use MULTILIB_OSDIRNAMES.  We end up with
> libgcc_s_nof.so.1 being built in gcc/ from objects in gcc/libgcc/nof/,
> with dependencies like
>
> nof/libgcc_s_nof.so: libgcc/nof/_muldi3_s.o

This has always been wrong.  I've recently filed a bug for that (PR18524).

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 23:08               ` Zack Weinberg
@ 2004-12-02 23:26                 ` David Edelsohn
  2004-12-04  2:41                   ` Zack Weinberg
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-12-02 23:26 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Joseph S. Myers, gcc-patches

>>>>> Zack Weinberg writes:

Zack> It's not that simple.

Zack> A powerpc64-linux build tree from October contains the following files
Zack> named libgcc_s*so* which are of principal relevance

Zack> libgcc_s.so.1
Zack> 32/libgcc_s.so.1
Zack> 32/nof/libgcc_s_nof.so.1
Zack> libgcc_s.so -> libgcc_s.so.1
Zack> libgcc_s_32.so -> 32/libgcc_s.so.1
Zack> libgcc_s_32_nof.so -> 32/nof/libgcc_s_nof.so.1

	A build of powerpc-ibm-aix5.1.0.0 from 20040819 creates (ignoring
that AIX shared libraries have a ".a" suffix):

gcc/libgcc_s.a
gcc/libgcc_s_power.a
gcc/libgcc_s_powerpc.a
gcc/libgcc_s_ppc64.a
gcc/libgcc_s_pthread.a
gcc/libgcc_s_pthread_power.a
gcc/libgcc_s_pthread_powerpc.a
gcc/libgcc_s_pthread_ppc64.a

gcc/stage2/libgcc_s.a
gcc/stage2/libgcc_s_power.a
gcc/stage2/libgcc_s_powerpc.a
gcc/stage2/libgcc_s_ppc64.a
gcc/stage2/libgcc_s_pthread.a
gcc/stage2/libgcc_s_pthread_power.a
gcc/stage2/libgcc_s_pthread_powerpc.a
gcc/stage2/libgcc_s_pthread_ppc64.a

	It appears that powerpc64-linux has been buggy all along.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 22:10           ` Zack Weinberg
  2004-12-02 22:27             ` David Edelsohn
@ 2004-12-03  9:09             ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-12-03  9:09 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: David Edelsohn, Joseph S. Myers, gcc-patches

On Thu, Dec 02, 2004 at 02:10:51PM -0800, Zack Weinberg wrote:
> I don't want anything.  I do not know what is right.  I am an empty
> vessel.

:-)  I don't think it matters at all where the shared libs are built.
The same place as crt files seems as good as anywhere to me.  After all,
a newly built multilib compiler needs to get at the crt files, and same
search path is used for startup and libraries.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
       [not found]           ` <amodra@bigpond.net.au>
                               ` (28 preceding siblings ...)
  2004-11-27 22:03             ` David Edelsohn
@ 2004-12-03 16:02             ` David Edelsohn
  2004-12-03 21:55               ` Mark Mitchell
  2004-12-23 17:46             ` [PATCH] PR target/19137: ICE with load of TImode constant David Edelsohn
                               ` (31 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-12-03 16:02 UTC (permalink / raw)
  To: Zack Weinberg, Joseph S. Myers, gcc-patches

>>>>> Alan Modra writes:

Alan> :-)  I don't think it matters at all where the shared libs are built.
Alan> The same place as crt files seems as good as anywhere to me.  After all,
Alan> a newly built multilib compiler needs to get at the crt files, and same
Alan> search path is used for startup and libraries.

	The issue is not the search path for link editing, but the search
path at runtime.  If one wants to test an application created using an
uninstalled GCC located in a build directory, it is more convenient to
point the library path at the single build directory instead of multiple
multilib subdirectories.

David
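
To make the runtime convenience concrete, here is a sketch of the library
path needed in each layout.  The build directory path and the multilib
list (., 32, 32/nof) are placeholders, and LD_LIBRARY_PATH is assumed as
the runtime search variable, as on ELF/Linux systems.

```shell
# Hypothetical location of an uninstalled build tree.
builddir=/path/to/build/gcc

# All libgcc_s_* in one directory: a single entry serves every multilib.
single_path=$builddir

# Libraries spread over multilib subdirectories: one entry per multilib.
multi_path=
for ml in . 32 32/nof; do
  case $ml in
    .) d=$builddir ;;
    *) d=$builddir/$ml ;;
  esac
  # Append with a colon separator, except for the first entry.
  multi_path=${multi_path:+$multi_path:}$d
done

echo "single: $single_path"
echo "multi:  $multi_path"
# Usage would look like:  LD_LIBRARY_PATH=$single_path ./a.out
```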

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-03 16:02             ` mklibgcc fallout David Edelsohn
@ 2004-12-03 21:55               ` Mark Mitchell
  2004-12-03 22:18                 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2004-12-03 21:55 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, Joseph S. Myers, gcc-patches

David Edelsohn wrote:
>>>>>>Alan Modra writes:
> 
> 
> Alan> :-)  I don't think it matters at all where the shared libs are built.
> Alan> The same place as crt files seems as good as anywhere to me.  After all,
> Alan> a newly built multilib compiler needs to get at the crt files, and same
> Alan> search path is used for startup and libraries.
> 
> 	The issue is not the search path for link editing, but the search
> path at runtime.  If one wants to test an application created using an
> uninstalled GCC located in a build directory, it is more convenient to
> point the library path at the single build directory instead of multiple
> multilib subdirectories.

I know that you are talking about the build directory, so this may not
be relevant, but let me explain Sun's comment in a little more detail.
They noticed that a -m64 compilation added 32-bit library directories to
the search path.  That was disturbing to them, in that it meant that
the linker was searching more directories than it needed to.  I think
that's a valid concern, in that, if nothing else, it means more
filesystem access than necessary.  So, after installation, having libgcc
honor the usual multilib discipline makes sense.

To me, it makes sense for the build situation to be similar, but maybe 
the way to fix that is the long-discussed "staged install" thing where 
we try to put everything in a mirror of the installation tree nested 
within the build directory.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-03 21:55               ` Mark Mitchell
@ 2004-12-03 22:18                 ` David Edelsohn
  2004-12-03 22:42                   ` Mark Mitchell
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-12-03 22:18 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Zack Weinberg, Joseph S. Myers, gcc-patches

	Is Sun concerned about the directory search path for libraries at
link-edit time or the directory search path for shared libraries at
runtime?

	Richard Henderson's design for libgcc_s shared libraries places
them all in the same directory, both during the build and at installation.
Directories for libgcc_s are not added or changed based on multilib
options.  GCC searches a single directory for shared libgcc_s at link-edit
time and the resulting application searches a single directory for
libgcc_s at runtime.  The shared library name contains the multilib name.

	Shared libraries for other languages, such as libstdc++ and
libgfortran are in multilib subdirectories, but that is a separate issue.
The topic is the location of libgcc_s_* built by mklibgcc.

	I would appreciate if mklibgcc could be corrected so that it
returns to the original behavior.  I do not believe that libgcc_s exhibits
the directory search problem raised by Sun.  If we need to change the GCC
directory search behavior, I would like to discuss that separately from
fixing the current and continuing breakage.

David
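
A sketch of the naming idea, where the multilib name is folded into the
file name so that all variants can share one directory.  The mapping from
multilib directory to suffix (slashes replaced by underscores) and the
".so" extension are assumptions inferred from names seen in this thread,
such as libgcc_s_32.so and libgcc_s_32_nof.so; shlib_name is a
hypothetical helper, not code from mklibgcc.in.

```shell
shlib_name() {
  dir=$1
  if [ "$dir" = . ]; then
    # The default multilib gets the plain name.
    echo libgcc_s.so
  else
    # Assumed mapping: 32/nof becomes 32_nof.
    suffix=$(echo "$dir" | sed 's,/,_,g')
    echo "libgcc_s_${suffix}.so"
  fi
}

for dir in . 32 32/nof; do
  echo "$dir -> $(shlib_name "$dir")"
done
```

Since every multilib yields a distinct file name, no two variants collide
when they are placed in the same directory.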

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-03 22:18                 ` David Edelsohn
@ 2004-12-03 22:42                   ` Mark Mitchell
  2004-12-04  3:06                     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2004-12-03 22:42 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zack Weinberg, Joseph S. Myers, gcc-patches

David Edelsohn wrote:
> 	Is Sun concerned about the directory search path for libraries at
> link-edit time or the directory search path for shared libraries at
> runtime?

The former.

> 	Richard Henderson's design for libgcc_s shared libraries places
> them all in the same directory, both during the build and at installation.

It's a little more complex than that; there seem to be symlinks from (or 
to?) the multilib directories.  That was certainly the behavior that we 
were seeing on Solaris.  I don't see any particular virtue in not doing 
the usual multilib thing for libgcc, but I agree with your comment below 
that we shouldn't be changing the design now -- just fixing problems.

> 	I would appreciate if mklibgcc could be corrected so that it
> returns to the original behavior.  I do not believe that libgcc_s exhibits
> the directory search problem raised by Sun.  If we need to change the GCC
> directory search behavior, I would like to discuss that separately from
> fixing the current and continuing breakage.

I agree that we need to fix the breakage, and that we should do that 
before making design changes.

I'm not 100% clear if the compiler is presently broken after
installation, or if it's just the case that the layout before
installation is not as you desire.  The former is of course a more 
severe situation than the latter.

This situation is frustrating; Andrew checked in a patch that broke
Darwin, and Zack spent a couple of weeks of time fixing that, and now 
there's fallout elsewhere.  I'm not suggesting that it's not Zack's 
responsibility to fix the problems, but it would also be nice if Andrew 
and/or folks at Apple could step in to help out.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-02 23:26                 ` David Edelsohn
@ 2004-12-04  2:41                   ` Zack Weinberg
  2004-12-04  2:48                     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Zack Weinberg @ 2004-12-04  2:41 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Joseph S. Myers, gcc-patches

David Edelsohn wrote:
> 	A build of powerpc-ibm-aix5.1.0.0 from 20040819 creates (ignoring
> that AIX shared libraries have a ".a" suffix):
> 
> gcc/libgcc_s.a
> gcc/libgcc_s_power.a
> gcc/libgcc_s_powerpc.a
> gcc/libgcc_s_ppc64.a
> gcc/libgcc_s_pthread.a
> gcc/libgcc_s_pthread_power.a
> gcc/libgcc_s_pthread_powerpc.a
> gcc/libgcc_s_pthread_ppc64.a
> 
> gcc/stage2/libgcc_s.a
> gcc/stage2/libgcc_s_power.a
> gcc/stage2/libgcc_s_powerpc.a
> gcc/stage2/libgcc_s_ppc64.a
> gcc/stage2/libgcc_s_pthread.a
> gcc/stage2/libgcc_s_pthread_power.a
> gcc/stage2/libgcc_s_pthread_powerpc.a
> gcc/stage2/libgcc_s_pthread_ppc64.a

Are any of those symbolic links, and if so, what are they symbolic
links to?

zw
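
One way to answer this kind of question for a whole tree is to list every
libgcc_s* file and show link targets where they exist.  This is only a
sketch: list_shlibs is a hypothetical helper, and the demo tree is built
from empty stand-in files rather than a real build.

```shell
# Print each libgcc_s* path under $1; for symlinks, also print the target.
list_shlibs() {
  find "$1" -name 'libgcc_s*' | sort | while read -r f; do
    if [ -L "$f" ]; then
      echo "$f -> $(readlink "$f")"
    else
      echo "$f"
    fi
  done
}

# Demo on a throwaway tree with one real file and one link to it.
demo=$(mktemp -d)
mkdir "$demo/32"
: > "$demo/32/libgcc_s.so.1"
ln -s 32/libgcc_s.so.1 "$demo/libgcc_s_32.so"

listing=$(list_shlibs "$demo")
echo "$listing"

rm -rf "$demo"
```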

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-04  2:41                   ` Zack Weinberg
@ 2004-12-04  2:48                     ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-12-04  2:48 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Joseph S. Myers, gcc-patches

>>>>> Zack Weinberg writes:

Zack> Are any of those symbolic links, and if so, what are they symbolic
Zack> links to?

	None of the files are symbolic links.  All are the libraries
themselves.

$ ls -l libgcc_s*
-rw-r--r--   1 dje      usr          567550 Aug 19 12:44 libgcc_s.a
-rw-r--r--   1 dje      usr          572286 Aug 22 16:31 libgcc_s_power.a
-rw-r--r--   1 dje      usr          557531 Aug 22 16:31 libgcc_s_powerpc.a
-rw-r--r--   1 dje      usr          599364 Aug 19 21:51 libgcc_s_ppc64.a
-rw-r--r--   1 dje      usr          591112 Aug 22 16:31 libgcc_s_pthread.a
-rw-r--r--   1 dje      usr          595478 Aug 22 16:31 libgcc_s_pthread_power.a
-rw-r--r--   1 dje      usr          580731 Aug 22 16:31 libgcc_s_pthread_powerpc.a
-rw-r--r--   1 dje      usr          623302 Aug 19 21:51 libgcc_s_pthread_ppc64.a

	The files in stage2 are the versions built in stage2.  The files
in stage1 are the versions built in stage1.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-03 22:42                   ` Mark Mitchell
@ 2004-12-04  3:06                     ` Alan Modra
  2004-12-04  3:40                       ` Zack Weinberg
  2004-12-08 12:09                       ` Richard Sandiford
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2004-12-04  3:06 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: David Edelsohn, Zack Weinberg, Joseph S. Myers, gcc-patches

On Fri, Dec 03, 2004 at 02:42:25PM -0800, Mark Mitchell wrote:
> I'm not 100% clear if the compiler is presently broken after
> installation, or if it's just the case that the layout before
> installation is not as you desire.

It is currently broken for *bootstrap* on multilib targets.  The
following fixes all the problems I know about, and builds
libgcc_s*.so.1 and libunwind*.so.7 in gcc/.  The patch doesn't change
any install locations.  So far, tested by "make quickstrap" and visually
scanning rules in libgcc.mk for both powerpc-linux and powerpc64-linux.
Bootstrap in progress.

	* mklibgcc.in: Build shared libgcc and shared libunwind in gcc/.
	Don't subst shlib_dir for SHLIB_LINK, SHLIBUNWIND_LINK,
	SHLIB_INSTALL, and SHLIBUNWIND_INSTALL.
	* config/i386/t-nwld (SHLIB_NAME): Use shlib_base_name in place of
	shlib_dir and shlib_so_name.
	* config/mips/t-slibgcc-irix (SHLIB_NAME): Likewise.
	* config/t-libunwind-elf (SHLIB_NAME): Likewise.
	* config/t-slibgcc-darwin (SHLIB_NAME): Likewise.
	* config/t-slibgcc-sld (SHLIB_NAME): Likewise.
	(SHLIB_LINK): Don't use shlib_dir when creating symlink.

Further cleanup of mklibgcc.in looks possible.  One obvious thing is
that shlib_dir isn't used in the "for ml in $MULTILIBS" loop.

diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/mklibgcc.in gcc-current/gcc/mklibgcc.in
--- gcc-virgin/gcc/mklibgcc.in	2004-12-04 09:27:19.813477064 +1030
+++ gcc-current/gcc/mklibgcc.in	2004-12-04 12:40:13.663844692 +1030
@@ -173,21 +173,21 @@ for ml in $MULTILIBS; do
     if [ -z "$SHLIB_MULTILIB" ]; then
       if [ "$dir" = . ]; then
 	libgcc_eh_a=$dir/libgcc_eh.a
-	libgcc_s_so_base=$dir/libgcc_s
+	libgcc_s_so_base=libgcc_s
 	libgcc_s_so=${libgcc_s_so_base}${SHLIB_EXT}
 	libgcc_s_soname=libgcc_s
 	if [ "$LIBUNWIND" ]; then
-	  libunwind_so_base=$dir/libunwind
+	  libunwind_so_base=libunwind
 	  libunwind_so=${libunwind_so_base}${SHLIB_EXT}
 	  libunwind_soname=libunwind
 	fi
       else
 	libgcc_eh_a=$dir/libgcc_eh.a
-	libgcc_s_so_base=$dir/libgcc_s_${suffix}
+	libgcc_s_so_base=libgcc_s_${suffix}
 	libgcc_s_so=${libgcc_s_so_base}${SHLIB_EXT}
 	libgcc_s_soname=libgcc_s_${suffix}
 	if [ "$LIBUNWIND" ]; then
-	  libunwind_so_base=$dir/libunwind_${suffix}
+	  libunwind_so_base=libunwind_${suffix}
 	  libunwind_so=${libunwind_so_base}${SHLIB_EXT}
 	fi
       fi
@@ -215,11 +215,11 @@ for ml in $MULTILIBS; do
 
     elif [ "$SHLIB_MULTILIB" = "$dir" ]; then
       libgcc_eh_a=$dir/libgcc_eh.a
-      libgcc_s_so_base=$dir/libgcc_s
+      libgcc_s_so_base=libgcc_s
       libgcc_s_so=${libgcc_s_so_base}${SHLIB_EXT}
       libgcc_s_soname=libgcc_s
       if [ "$LIBUNWIND" ]; then
-	libunwind_so_base=$dir/libunwind
+	libunwind_so_base=libunwind
 	libunwind_so=${libunwind_so_base}${SHLIB_EXT}
 	libunwind_soname=libunwind
       fi
@@ -774,7 +774,6 @@ EOF
 	       -e "s%@shlib_base_name@%$libgcc_s_so_base%g" \
 	       -e "s%@shlib_so_name@%$libgcc_s_soname%g" \
 	       -e "s%@shlib_map_file@%$mapfile%g" \
-	       -e "s%@shlib_dir@%$shlib_dir%g" \
 	       -e "s%@shlib_slibdir_qual@%$shlib_dir_qual%g"
     echo "all: $libgcc_s_so"
   fi
@@ -788,7 +787,6 @@ EOF
 		 -e "s%@shlib_objs@%\$(objects)%g" \
 		 -e "s%@shlib_base_name@%$libunwind_so_base%g" \
 		 -e "s%@shlib_so_name@%$libunwind_soname%g" \
-		 -e "s%@shlib_dir@%$shlib_dir%g" \
 		 -e "s%@shlib_slibdir_qual@%$shlib_dir_qual%g"
     echo "all: $libunwind_so"
   fi
@@ -870,13 +868,11 @@ for ml in $MULTILIBS; do
       echo "	$SHLIB_INSTALL" \
 	| sed -e "s%@shlib_base_name@%$shlib_base_name%g" \
 	      -e "s%@shlib_so_name@%$shlib_so_name%g" \
-	      -e "s%@shlib_dir@%$shlib_dir%g" \
 	      -e "s%@shlib_slibdir_qual@%$shlib_slibdir_qual%g"
       if [ "$LIBUNWIND" ]; then
 	echo "	$SHLIBUNWIND_INSTALL" \
 	   | sed -e "s%@shlib_base_name@%$shlibunwind_base_name%g" \
 		 -e "s%@shlib_so_name@%$shlibunwind_so_name%g" \
-		 -e "s%@shlib_dir@%$shlib_dir%g" \
 		 -e "s%@shlib_slibdir_qual@%$shlib_slibdir_qual%g"
 	libunwinddir='$(DESTDIR)$(slibdir)$(shlib_slibdir_qual)/$(shlib_dir)'
 	echo '	$(INSTALL_DATA)' ${dir}/libunwind.a ${libunwinddir}/
@@ -887,13 +883,11 @@ for ml in $MULTILIBS; do
       echo "	$SHLIB_INSTALL" \
 	| sed -e "s%@shlib_base_name@%$shlib_base_name%g" \
 	      -e "s%@shlib_so_name@%$shlib_base_name%g" \
-	      -e "s%@shlib_dir@%%g" \
 	      -e "s%@shlib_slibdir_qual@%%g"
       if [ "$LIBUNWIND" ]; then
 	echo "	$SHLIBUNWIND_INSTALL" \
 	   | sed -e "s%@shlib_base_name@%$shlibunwind_base_name%g" \
 		 -e "s%@shlib_so_name@%$shlibunwind_base_name%g" \
-		 -e "s%@shlib_dir@%%g" \
 		 -e "s%@shlib_slibdir_qual@%%g"
 	libunwinddir='$(DESTDIR)$(slibdir)'
 	echo '	$(INSTALL_DATA)' ${dir}/libunwind.a ${libunwinddir}/
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/i386/t-nwld gcc-current/gcc/config/i386/t-nwld
--- gcc-virgin/gcc/config/i386/t-nwld	2004-12-04 11:59:23.862201945 +1030
+++ gcc-current/gcc/config/i386/t-nwld	2004-12-04 11:59:02.319595246 +1030
@@ -30,7 +30,7 @@ s-crt0: $(srcdir)/unwind-dw2-fde.h
 
 SHLIB_EXT = .nlm
 SHLIB_SONAME = @shlib_so_name@.nlm
-SHLIB_NAME = @shlib_dir@@shlib_so_name@.nlm
+SHLIB_NAME = @shlib_base_name@.nlm
 SHLIB_SLIBDIR_QUAL = @shlib_slibdir_qual@
 SHLIB_DEF = $(srcdir)/config/i386/netware-libgcc.def
 SHLIB_MAP = $(srcdir)/config/i386/netware-libgcc.exp
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/mips/t-slibgcc-irix gcc-current/gcc/config/mips/t-slibgcc-irix
--- gcc-virgin/gcc/config/mips/t-slibgcc-irix	2004-12-04 11:59:23.863201788 +1030
+++ gcc-current/gcc/config/mips/t-slibgcc-irix	2004-12-04 11:59:02.319595246 +1030
@@ -4,7 +4,7 @@ SHLIB_EXT = .so
 SHLIB_SOLINK = @shlib_base_name@.so
 SHLIB_SOVERSION = 1
 SHLIB_SONAME = @shlib_so_name@.so.$(SHLIB_SOVERSION)
-SHLIB_NAME = @shlib_dir@@shlib_so_name@.so.$(SHLIB_SOVERSION)
+SHLIB_NAME = @shlib_base_name@.so.$(SHLIB_SOVERSION)
 SHLIB_MAP = @shlib_map_file@
 SHLIB_OBJS = @shlib_objs@
 SHLIB_SLIBDIR_QUAL = @shlib_slibdir_qual@
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/t-libunwind-elf gcc-current/gcc/config/t-libunwind-elf
--- gcc-virgin/gcc/config/t-libunwind-elf	2004-12-04 11:59:23.876199740 +1030
+++ gcc-current/gcc/config/t-libunwind-elf	2004-12-04 11:59:02.320595088 +1030
@@ -6,7 +6,7 @@ LIBUNWINDDEP = unwind.inc unwind-dw2-fde
 
 SHLIBUNWIND_SOVERSION = 7
 SHLIBUNWIND_SONAME = @shlib_so_name@.so.$(SHLIBUNWIND_SOVERSION)
-SHLIBUNWIND_NAME = @shlib_dir@@shlib_so_name@.so.$(SHLIBUNWIND_SOVERSION)
+SHLIBUNWIND_NAME = @shlib_base_name@.so.$(SHLIBUNWIND_SOVERSION)
 
 SHLIBUNWIND_LINK = $(GCC_FOR_TARGET) $(LIBGCC2_CFLAGS) -shared \
 	-nodefaultlibs -Wl,-h,$(SHLIBUNWIND_SONAME) \
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/t-slibgcc-darwin gcc-current/gcc/config/t-slibgcc-darwin
--- gcc-virgin/gcc/config/t-slibgcc-darwin	2004-12-04 11:59:23.877199583 +1030
+++ gcc-current/gcc/config/t-slibgcc-darwin	2004-12-04 11:59:02.321594931 +1030
@@ -5,7 +5,7 @@ SHLIB_VERSTRING = -compatibility_version
 SHLIB_EXT = .dylib
 SHLIB_SOLINK = @shlib_base_name@.dylib
 SHLIB_SONAME = @shlib_so_name@.$(SHLIB_MINOR).$(SHLIB_REVISION).dylib
-SHLIB_NAME = @shlib_dir@@shlib_so_name@.$(SHLIB_MINOR).$(SHLIB_REVISION).dylib
+SHLIB_NAME = @shlib_base_name@.$(SHLIB_MINOR).$(SHLIB_REVISION).dylib
 SHLIB_MAP = @shlib_map_file@
 SHLIB_OBJS = @shlib_objs@
 SHLIB_SLIBDIR_QUAL = @shlib_slibdir_qual@
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/t-slibgcc-sld gcc-current/gcc/config/t-slibgcc-sld
--- gcc-virgin/gcc/config/t-slibgcc-sld	2004-12-04 11:59:23.878199425 +1030
+++ gcc-current/gcc/config/t-slibgcc-sld	2004-12-04 11:59:02.322594773 +1030
@@ -3,7 +3,7 @@
 SHLIB_EXT = .so
 SHLIB_SOLINK = @shlib_base_name@.so
 SHLIB_SONAME = @shlib_so_name@.so.1
-SHLIB_NAME = @shlib_dir@@shlib_so_name@.so.1
+SHLIB_NAME = @shlib_base_name@.so.1
 SHLIB_MAP = @shlib_map_file@
 SHLIB_OBJS = @shlib_objs@
 SHLIB_SLIBDIR_QUAL = @shlib_slibdir_qual@
@@ -12,12 +12,12 @@ SHLIB_LINK = $(GCC_FOR_TARGET) $(LIBGCC2
 	-Wl,-h,$(SHLIB_SONAME) -Wl,-z,text -Wl,-z,defs \
 	-Wl,-M,$(SHLIB_MAP) -o $(SHLIB_NAME).tmp \
 	@multilib_flags@ $(SHLIB_OBJS) -lc && \
-	rm -f @shlib_dir@$(SHLIB_SOLINK) && \
+	rm -f $(SHLIB_SOLINK) && \
 	if [ -f $(SHLIB_NAME) ]; then \
 	  mv -f $(SHLIB_NAME) $(SHLIB_NAME).`basename $(STAGE_PREFIX)`; \
 	else true; fi && \
 	mv $(SHLIB_NAME).tmp $(SHLIB_NAME) && \
-	$(LN_S) $(SHLIB_SONAME) @shlib_dir@$(SHLIB_SOLINK)
+	$(LN_S) $(SHLIB_SONAME) $(SHLIB_SOLINK)
 # $(slibdir) double quoted to protect it from expansion while building
 # libgcc.mk.  We want this delayed until actual install time.
 SHLIB_INSTALL = \

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-04  3:06                     ` Alan Modra
@ 2004-12-04  3:40                       ` Zack Weinberg
  2004-12-08 12:09                       ` Richard Sandiford
  1 sibling, 0 replies; 875+ messages in thread
From: Zack Weinberg @ 2004-12-04  3:40 UTC (permalink / raw)
  To: Mark Mitchell, David Edelsohn, Zack Weinberg, Joseph S. Myers,
	gcc-patches

Alan Modra wrote:
> 	* mklibgcc.in: Build shared libgcc and shared libunwind in gcc/.
> 	Don't subst shlib_dir for SHLIB_LINK, SHLIBUNWIND_LINK,
> 	SHLIB_INSTALL, and SHLIBUNWIND_INSTALL.
> 	* config/i386/t-nwld (SHLIB_NAME): Use shlib_base_name in place of
> 	shlib_dir and shlib_so_name.
> 	* config/mips/t-slibgcc-irix (SHLIB_NAME): Likewise.
> 	* config/t-libunwind-elf (SHLIB_NAME): Likewise.
> 	* config/t-slibgcc-darwin (SHLIB_NAME): Likewise.
> 	* config/t-slibgcc-sld (SHLIB_NAME): Likewise.
> 	(SHLIB_LINK): Don't use shlib_dir when creating symlink.

This looks good to me.

> Further cleanup of mklibgcc.in looks possible.  One obvious thing is
> that shlib_dir isn't used in the "for ml in $MULTILIBS" loop.

Not even to set *_qual?

zw

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-04  3:06                     ` Alan Modra
  2004-12-04  3:40                       ` Zack Weinberg
@ 2004-12-08 12:09                       ` Richard Sandiford
  2004-12-08 13:54                         ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Sandiford @ 2004-12-08 12:09 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: David Edelsohn, Zack Weinberg, Joseph S. Myers, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> 	* mklibgcc.in: Build shared libgcc and shared libunwind in gcc/.
> 	Don't subst shlib_dir for SHLIB_LINK, SHLIBUNWIND_LINK,
> 	SHLIB_INSTALL, and SHLIBUNWIND_INSTALL.
> 	* config/i386/t-nwld (SHLIB_NAME): Use shlib_base_name in place of
> 	shlib_dir and shlib_so_name.
> 	* config/mips/t-slibgcc-irix (SHLIB_NAME): Likewise.
> 	* config/t-libunwind-elf (SHLIB_NAME): Likewise.
> 	* config/t-slibgcc-darwin (SHLIB_NAME): Likewise.
> 	* config/t-slibgcc-sld (SHLIB_NAME): Likewise.
> 	(SHLIB_LINK): Don't use shlib_dir when creating symlink.

This breaks build-dir testing for me on IRIX because (a) we set the
DT_SONAME of every libgcc_s*.so.1 multilib to libgcc_s.so.1 and (b) there's
now only one libgcc_s.so.1 in the build directory (the one associated with
the default multilib).  Tests for the non-default multilibs fail to run.

Is this a known problem?  Although IRIX uses its own t-slibgcc-* file,
I couldn't see anything off-hand that would make the IRIX case different
from powerpc64-linux-gnu.

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-08 12:09                       ` Richard Sandiford
@ 2004-12-08 13:54                         ` Alan Modra
  2004-12-08 14:08                           ` Richard Sandiford
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2004-12-08 13:54 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Mark Mitchell, David Edelsohn, Zack Weinberg, Joseph S. Myers,
	gcc-patches

On Wed, Dec 08, 2004 at 12:08:21PM +0000, Richard Sandiford wrote:
> This breaks build-dir testing for me on IRIX because (a) we set the
> DT_SONAME of every libgcc_s*.so.1 multilib to libgcc_s.so.1

I do get all the libs built on powerpc64-linux, but sonames are a little
odd.  Oh, they are suited to the final installation location, which
means build dir testing won't work.

in gcc/			soname			install
libgcc_s.so.1		libgcc_s.so.1		../lib64/libgcc_s.so.1
libgcc_s_32.so.1	libgcc_s.so.1		../lib/libgcc_s.so.1
libgcc_s_32_nof.so.1	libgcc_s_nof.so.1	../lib/libgcc_s_nof.so.1

This does make some sense if you notice that this results in the 32-bit
libs having the same name and installation location as libs built for
a 32-bit powerpc-linux gcc.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: mklibgcc fallout
  2004-12-08 13:54                         ` Alan Modra
@ 2004-12-08 14:08                           ` Richard Sandiford
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Sandiford @ 2004-12-08 14:08 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: David Edelsohn, Zack Weinberg, Joseph S. Myers, gcc-patches

Alan Modra <amodra@bigpond.net.au> writes:
> On Wed, Dec 08, 2004 at 12:08:21PM +0000, Richard Sandiford wrote:
>> This breaks build-dir testing for me on IRIX because (a) we set the
>> DT_SONAME of every libgcc_s*.so.1 multilib to libgcc_s.so.1
>
> I do get all the libs built on powerpc64-linux, but sonames are a little
> odd.  Oh, they are suited to the final installation location, which
> means build dir testing won't work.
>
> in gcc/			soname			install
> libgcc_s.so.1		libgcc_s.so.1		../lib64/libgcc_s.so.1
> libgcc_s_32.so.1	libgcc_s.so.1		../lib/libgcc_s.so.1
> libgcc_s_32_nof.so.1	libgcc_s_nof.so.1	../lib/libgcc_s_nof.so.1
>
> This does make some sense if you notice that this results in the 32-bit
> libs having the same name and installation location as libs built for
> a 32-bit powerpc-linux gcc.

Thanks for the info.  Just to add publicly what I've already said privately:

The o32 and n64 multilibs have an soname of libgcc_s.so.1.  That's been
true for a while (since at least 3.4).  It works fine for the installed
libraries because the install tree looks like this:

    libgcc_s.so -> libgcc_s.so.1
    libgcc_s.so.1
    mabi=64/libgcc_s.so.1
    mabi=64/libgcc_s_mabi-64.so -> libgcc_s.so.1
    mabi=32/libgcc_s.so.1
    mabi=32/libgcc_s_mabi-32.so -> libgcc_s.so.1

and it worked with the old hierarchical build directory layout too.
The problem is that it doesn't work with the new flat layout because
every multilib tries to use the same libgcc_s.so.1 (the n32 one).

(Note that in 3.4, the names used to be "...mabi=32.so" and
"...mabi=64.so" rather than "...mabi-32.so" and "mabi-64.so".
That change shouldn't really matter, I guess, since only the
soname makes it into the final output.)

Or, using the same format as Alan did above:

in gcc/			soname			install
libgcc_s.so.1           libgcc_s.so.1           libgcc_s.so.1
libgcc_s_mabi-32.so.1   libgcc_s.so.1           mabi=32/libgcc_s.so.1
libgcc_s_mabi-64.so.1   libgcc_s.so.1           mabi=64/libgcc_s.so.1
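
As a rough illustration of why the flat layout breaks, here is a minimal
C sketch that counts how many libraries in a table share a soname with an
earlier entry.  The two tables are copied from the listings in this thread.
The struct and the function name count_soname_clashes are invented for the
example and are not part of mklibgcc or any GCC file.

```c
#include <stddef.h>
#include <string.h>

/* One shared library as it appears in the flat gcc/ build directory.  */
struct shlib
{
  const char *build_name;	/* file name in gcc/ */
  const char *soname;		/* DT_SONAME recorded in the library */
};

/* IRIX multilibs, from the table above.  */
static const struct shlib irix_libs[] = {
  { "libgcc_s.so.1",         "libgcc_s.so.1" },
  { "libgcc_s_mabi-32.so.1", "libgcc_s.so.1" },
  { "libgcc_s_mabi-64.so.1", "libgcc_s.so.1" },
};

/* powerpc64-linux multilibs, from Alan's table.  */
static const struct shlib ppc64_libs[] = {
  { "libgcc_s.so.1",        "libgcc_s.so.1" },
  { "libgcc_s_32.so.1",     "libgcc_s.so.1" },
  { "libgcc_s_32_nof.so.1", "libgcc_s_nof.so.1" },
};

/* Count entries whose soname duplicates that of an earlier entry.
   Each such entry cannot be told apart by soname once all the
   libraries live in a single directory.  */
static inline size_t
count_soname_clashes (const struct shlib *libs, size_t n)
{
  size_t clashes = 0;
  for (size_t i = 1; i < n; i++)
    for (size_t j = 0; j < i; j++)
      if (strcmp (libs[i].soname, libs[j].soname) == 0)
	{
	  clashes++;
	  break;
	}
  return clashes;
}
```

With the hierarchical layout each multilib had its own directory, so the
duplicated sonames never met; in one directory they do.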

Richard

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] PR target/19137: ICE with load of TImode constant
@ 2004-12-23 14:57 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-12-23 14:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This fixes an ICE on powerpc-linux, found running the gcc testsuite.
The problem is that rs6000.md has no TImode move pattern handling
constants (*), and the existing splitter doesn't do anything until after
reload.  This ICE happens in regclass.

*) The movti_string pattern that might seem to be available for
powerpc-linux has a too restrictive operand predicate.
reg_or_mem_operand won't match a constant.  An alternative fix is to
relax this predicate to "input_operand".  That unfortunately results in
reload forcing the zero to memory, so generated code is poor.  OK, you
could add an "n" alternative..  See below for a patch to do that.

Here's my first attempt at a fix.

	PR target/19137
	* config/rs6000/rs6000.c (rs6000_emit_move): Split TImode constant
	loads.

diff -u6rp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2004-12-23 17:46:27.419162669 +1030
+++ gcc-current/gcc/config/rs6000/rs6000.c	2004-12-23 20:05:46.989764716 +1030
@@ -4334,16 +4334,17 @@ rs6000_emit_move (rtx dest, rtx source, 
   if (reload_in_progress && mode == Pmode
       && (! general_operand (operands[1], mode)
 	  || ! nonimmediate_operand (operands[0], mode)))
     goto emit_set;
 
   /* 128-bit constant floating-point values on Darwin should really be
-     loaded as two parts.  */
-  if ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
-      && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128
-      && mode == TFmode && GET_CODE (operands[1]) == CONST_DOUBLE)
+     loaded as two parts.  Similarly, split TImode constants.  */
+  if (((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
+       && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128
+       && mode == TFmode && GET_CODE (operands[1]) == CONST_DOUBLE)
+      || (mode == TImode && GET_CODE (operands[1]) == CONST_INT))
     {
       /* DImode is used, not DFmode, because simplify_gen_subreg doesn't
 	 know how to get a DFmode SUBREG of a TFmode.  */
       rs6000_emit_move (simplify_gen_subreg (DImode, operands[0], mode, 0),
 			simplify_gen_subreg (DImode, operands[1], mode, 0),
 			DImode);


Alternative fix.  I'm inclined to think this one is better because it
delays splitting, which means less rtl to process in earlier passes.

	PR target/19137
	* config/rs6000/rs6000.md (movti_power, movti_string): Relax
	operand[1] predicate to input_operand, and add r<-n alternative.

powerpc-linux bootstrap and regression test in progress.  OK mainline?

diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.md gcc-current/gcc/config/rs6000/rs6000.md
--- gcc-virgin/gcc/config/rs6000/rs6000.md	2004-12-18 20:17:55.000000000 +1030
+++ gcc-current/gcc/config/rs6000/rs6000.md	2004-12-24 00:53:07.155150191 +1030
@@ -8692,8 +8695,8 @@
 ;; giving the SCRATCH mq.
 
 (define_insn "*movti_power"
-  [(set (match_operand:TI 0 "reg_or_mem_operand" "=Q,m,????r,????r,????r")
-	(match_operand:TI 1 "reg_or_mem_operand" "r,r,r,Q,m"))
+  [(set (match_operand:TI 0 "reg_or_mem_operand" "=Q,m,????r,????r,????r,r")
+	(match_operand:TI 1 "input_operand" "r,r,r,Q,m,n"))
    (clobber (match_scratch:SI 2 "=q,q#X,X,X,X"))]
   "TARGET_POWER && ! TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], TImode) || gpc_reg_operand (operands[1], TImode))"
@@ -8718,14 +8721,15 @@
 	return \"{lsi|lswi} %0,%P1,16\";
       /* ... fall through ...  */
     case 4:
+    case 5:
       return \"#\";
     }
 }"
-  [(set_attr "type" "store,store,*,load,load")])
+  [(set_attr "type" "store,store,*,load,load,*")])
 
 (define_insn "*movti_string"
-  [(set (match_operand:TI 0 "reg_or_mem_operand" "=Q,o<>,????r,????r,????r")
-	(match_operand:TI 1 "reg_or_mem_operand" "r,r,r,Q,m"))]
+  [(set (match_operand:TI 0 "reg_or_mem_operand" "=Q,o<>,????r,????r,????r,r")
+	(match_operand:TI 1 "input_operand" "r,r,r,Q,m,n"))]
   "! TARGET_POWER && ! TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], TImode) || gpc_reg_operand (operands[1], TImode))"
   "*
@@ -8748,10 +8752,11 @@
 	return \"{lsi|lswi} %0,%P1,16\";
       /* ... fall through ...  */
     case 4:
+    case 5:
       return \"#\";
     }
 }"
-  [(set_attr "type" "store,store,*,load,load")])
+  [(set_attr "type" "store,store,*,load,load,*")])
 
 (define_insn "*movti_ppc64"
   [(set (match_operand:TI 0 "nonimmediate_operand" "=r,o<>,r")

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PR target/19137: ICE with load of TImode constant
       [not found]           ` <amodra@bigpond.net.au>
                               ` (29 preceding siblings ...)
  2004-12-03 16:02             ` mklibgcc fallout David Edelsohn
@ 2004-12-23 17:46             ` David Edelsohn
       [not found]               ` <20041224002336.GB2765@bubble.modra.org>
  2004-12-24 15:55             ` [PATCH] Fix target/19147: invalid rlwinm patterns David Edelsohn
                               ` (30 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2004-12-23 17:46 UTC (permalink / raw)
  To: gcc-patches

	I agree that splitting TImode constants does not seem like the
correct solution.  It is not a float that cannot change mode.

	Adding "n" constraint to movti patterns seems like the right
solution.  Was the second patch offered for approval?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] Fix target/19147: invalid rlwinm patterns
@ 2004-12-24 13:47 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2004-12-24 13:47 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

gcc.c-torture/execute/930718-1.c has been failing on powerpc64-linux
since somewhere in the latter half of September, and I hadn't got around
to investigating the reason.  It turns out that this failure is due to a
problem with andsi3_internal7 and andsi3_internal8, two patterns dealing
with wrapping rlwinm masks (ie. mb > me) I added to rs6000.md on
2002-07-24.  Fariborz Jahanian made a fix for one failure mode of
andsi3_internal8 on 2004-10-26, and at the time I commented that LT and
GT condition bits might be wrong before his patch.

Well, they're still wrong.  To get LT right, you must have a mask that
wraps over the high bits, but we go out of our way to ensure that the
mask doesn't wrap (to make EQ right).  A wrapping mask won't do for EQ
because "rlwinm." works by duplicating the 32-bit input to both high and
low word of a 64-bit value, rotating by 0 to 31 bits, then masking with
a pattern consisting of 1's between two given bits in the low word and
0's elsewhere.  The trickiness is that if the start bit of the mask is
less significant than the end bit, then the mask wraps around.  If the
mask wraps then the high word of the mask will be 1's, which will result
in EQ false if any bit of the input is set.  On the other hand, without
a wrapping mask LT will always be false regardless of input.
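
To make the above concrete, here is a small C model of the behaviour just
described.  It is only a sketch: the helper names (ppc_mask64, rlwinm64,
cr0_of) are invented for illustration, and it assumes the condition bits
are derived from the full 64-bit result, as described above.

```c
#include <assert.h>
#include <stdint.h>

enum cr0 { CR_LT, CR_GT, CR_EQ };

/* 1's from bit X through bit Y, numbering bit 0 as the most significant
   bit of the 64-bit value.  If X > Y the mask wraps around.  */
static inline uint64_t
ppc_mask64 (unsigned x, unsigned y)
{
  uint64_t from_x = ~UINT64_C (0) >> x;		/* bits x..63 */
  uint64_t to_y = ~UINT64_C (0) << (63 - y);	/* bits 0..y */
  return x <= y ? (from_x & to_y) : (from_x | to_y);
}

/* Model of rlwinm on a 64-bit machine: rotate the 32-bit input,
   duplicate it into both words, then mask bits mb+32..me+32.  */
static inline uint64_t
rlwinm64 (uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
  assert (sh < 32 && mb < 32 && me < 32);
  uint32_t rot = sh ? (rs << sh) | (rs >> (32 - sh)) : rs;
  uint64_t dup = ((uint64_t) rot << 32) | rot;
  return dup & ppc_mask64 (mb + 32, me + 32);
}

/* Condition bits as set by a signed 64-bit compare with zero.  */
static inline enum cr0
cr0_of (uint64_t result)
{
  int64_t s = (int64_t) result;
  return s < 0 ? CR_LT : s > 0 ? CR_GT : CR_EQ;
}
```

With mb = 31, me = 0 (the wrapping SImode mask 0x80000001) and input 2, the
32-bit AND is zero, but the model leaves bits set in the high word, so EQ
is false.  With the non-wrapping mask mb = 0, me = 31 and input 0xffffffff,
the high word is cleared and the result is positive, so LT is never set.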

One solution to this dilemma is to rotate the input so that we can use
a non-wrapping mask to clear bits, then rotate back and use a wrapping
mask of all 1's to generate the condition bits.  That gets EQ, LT and
GT correct.  Unfortunately, if the output is wanted we've just
duplicated the low word to the high word, and I think GCC "knows" that
an andsi operation doesn't set bits willy nilly in the high word.  (See
rtlanal.c:nonzero_bits).

So..

	PR target/19147
	* config/rs6000/rs6000.md (andsi3_internal7, andsi3_internal8): Delete.

OK mainline, 3.4 and 3.3?

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.337
diff -u -p -r1.337 rs6000.md
--- gcc/config/rs6000/rs6000.md	11 Dec 2004 17:37:25 -0000	1.337
+++ gcc/config/rs6000/rs6000.md	24 Dec 2004 13:42:00 -0000
@@ -2411,60 +2411,6 @@
 }"
   [(set_attr "length" "8")])

-(define_insn_and_split "*andsi3_internal7"
-  [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
-	(compare:CC (and:SI (match_operand:SI 0 "gpc_reg_operand" "r,r")
-			    (match_operand:SI 1 "mask_operand_wrap" "i,i"))
-		    (const_int 0)))
-   (clobber (match_scratch:SI 3 "=r,r"))]
-  "TARGET_POWERPC64"
-  "#"
-  "TARGET_POWERPC64"
-  [(parallel [(set (match_dup 2)
-		   (compare:CC (and:SI (rotate:SI (match_dup 0) (match_dup 4))
-				       (match_dup 5))
-			       (const_int 0)))
-	      (clobber (match_dup 3))])]
-  "
-{
-  int mb = extract_MB (operands[1]);
-  int me = extract_ME (operands[1]);
-  operands[4] = GEN_INT (me + 1);
-  operands[5] = GEN_INT (~((HOST_WIDE_INT) -1 << (33 + me - mb)));
-}"
-  [(set_attr "type" "delayed_compare,compare")
-   (set_attr "length" "4,8")])
-
-(define_insn_and_split "*andsi3_internal8"
-  [(set (match_operand:CC 3 "cc_reg_operand" "=x,??y")
-	(compare:CC (and:SI (match_operand:SI 1 "gpc_reg_operand" "r,r")
-			    (match_operand:SI 2 "mask_operand_wrap" "i,i"))
-		    (const_int 0)))
-   (set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(and:SI (match_dup 1)
-		(match_dup 2)))]
-  "TARGET_POWERPC64"
-  "#"
-  "TARGET_POWERPC64"
-  [(set (match_dup 0)
-		   (and:SI (rotate:SI (match_dup 1) (match_dup 4))
-			   (match_dup 5)))
-   (parallel [(set (match_dup 3)
-	           (compare:CC (rotate:SI (match_dup 0) (match_dup 6))
-			       (const_int 0)))
-              (set (match_dup 0)
-	           (rotate:SI (match_dup 0) (match_dup 6)))])]
-  "
-{
-  int mb = extract_MB (operands[2]);
-  int me = extract_ME (operands[2]);
-  operands[4] = GEN_INT (me + 1);
-  operands[6] = GEN_INT (32 - (me + 1));
-  operands[5] = GEN_INT (~((HOST_WIDE_INT) -1 << (33 + me - mb)));
-}"
-  [(set_attr "type" "delayed_compare,compare")
-   (set_attr "length" "8,12")])
-
 (define_expand "iorsi3"
   [(set (match_operand:SI 0 "gpc_reg_operand" "")
 	(ior:SI (match_operand:SI 1 "gpc_reg_operand" "")

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix target/19147: invalid rlwinm patterns
       [not found]           ` <amodra@bigpond.net.au>
                               ` (30 preceding siblings ...)
  2004-12-23 17:46             ` [PATCH] PR target/19137: ICE with load of TImode constant David Edelsohn
@ 2004-12-24 15:55             ` David Edelsohn
  2005-01-12  5:23             ` [PATCH] PR target/19389 Odd gpr mem load unrecognizable insn David Edelsohn
                               ` (29 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-12-24 15:55 UTC (permalink / raw)
  To: gcc-patches

	PR target/19147
	* config/rs6000/rs6000.md (andsi3_internal7, andsi3_internal8): Delete.

OK mainline, 3.4 and 3.3?

Yes, okay everywhere that branches are open for patches.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PR target/19137: ICE with load of TImode constant
       [not found]               ` <20041224002336.GB2765@bubble.modra.org>
@ 2004-12-24 19:30                 ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2004-12-24 19:30 UTC (permalink / raw)
  To: gcc-patches

	PR target/19137
	* config/rs6000/rs6000.md (movti_power, movti_string): Relax
	operand[1] predicate to input_operand, and add r<-n alternative.

This patch is okay.

	Could you also move the movti splitter for const_double to the rest of
the movti patterns, before the rs6000_split_multireg_move splitter?  It
appears to have been orphaned in the DImode section, which is confusing.

	Having movti_ppc64 inconsistent seems a bit strange to me, but it
is okay for now.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] PR target/19389 Odd gpr mem load unrecognizable insn
@ 2005-01-12  5:02 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-01-12  5:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch constrains addresses accepted by movtf so that gpr loads and
stores to memory must satisfy word_offset_memref_operand, similarly to
the way movdf constrains gpr loads and stores.  This allows David's
recent rs6000_legitimize_reload_address fix to trigger for TFmode.

	PR target/19389
	* config/rs6000/rs6000.md (movtf_internal): Replace r->o and m->r
	with r->Y and Y->r.

Bootstrap etc. in progress.  OK for mainline?

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.339
diff -u -p -r1.339 rs6000.md
--- gcc/config/rs6000/rs6000.md	25 Dec 2004 12:41:02 -0000	1.339
+++ gcc/config/rs6000/rs6000.md	12 Jan 2005 04:50:12 -0000
@@ -8234,10 +8234,10 @@
 
 ; It's important to list the o->f and f->o moves before f->f because
 ; otherwise reload, given m->f, will try to pick f->f and reload it,
-; which doesn't make progress.  Likewise r->o<> must be before r->r.
+; which doesn't make progress.  Likewise r->Y must be before r->r.
 (define_insn_and_split "*movtf_internal"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "=o,f,f,r,o<>,r")
-	(match_operand:TF 1 "input_operand"         "f,o,f,mGHF,r,r"))]
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=o,f,f,r,Y,r")
+	(match_operand:TF 1 "input_operand"         "f,o,f,YGHF,r,r"))]
   "(DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_DARWIN)
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128
    && (gpc_reg_operand (operands[0], TFmode)

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PR target/19389 Odd gpr mem load unrecognizable insn
       [not found]           ` <amodra@bigpond.net.au>
                               ` (31 preceding siblings ...)
  2004-12-24 15:55             ` [PATCH] Fix target/19147: invalid rlwinm patterns David Edelsohn
@ 2005-01-12  5:23             ` David Edelsohn
  2005-01-29 17:21             ` [PATCH] powerpc dwarf2 unwinder fallback David Edelsohn
                               ` (28 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-01-12  5:23 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

	PR target/19389
	* config/rs6000/rs6000.md (movtf_internal): Replace r->o and m->r
	with r->Y and Y->r.

Alan> Bootstrap etc. in progress.  OK for mainline?

	Hmm.  Okay, I guess.  We could split movtf_internal into 32bit and
64bit versions.  The GPR restriction doesn't apply to 32bit loads and
stores, but if TFmode is in a GPR, performance already is shot.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] powerpc dwarf2 unwinder fallback
@ 2005-01-29  7:47 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-01-29  7:47 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch corrects some problems with the unwind fallback code on
powerpc-linux and powerpc64-linux.

1) Floating point and vector registers were not being restored.  OK, so
who uses fprs and vmx inside signal handlers?  It can't be very common
to have explicit use, but gcc generates code that uses fp and vector
regs to move blocks of memory around.

2) The 64-bit rt_sigreturn code used to access saved registers by an
offset from pc.  This works when the trampoline is on the stack, but new
linux kernels will use a vdso with the trampoline no longer on the
stack.  I think the only reason that an offset from pc was used was to
support linux-2.4.19 and linux-2.4.20, the first two released linux
kernels that supported ppc64.  These kernels used a different stack
layout than later kernels.  Since people might still be running these
old kernels, I've kept code to support them, and added extra code to
support the vdso.

3) I've removed gcc_sigcontext because it only bore a passing resemblance
to the real structure, and merged it with gcc_ucontext.  get_sigcontext
becomes get_regs too, because we are really only interested in regs.
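
For readers unfamiliar with the trampoline test that get_regs relies on,
here is a standalone C sketch of the instruction-word check.  The three
encodings are the ones compared against in the patch below.  The function
and enum names are invented for this example, and the frame size is passed
as a parameter rather than taken from SIGNAL_FRAMESIZE.

```c
#include <stdint.h>

enum tramp_kind { TRAMP_NONE, TRAMP_SIGRETURN, TRAMP_RT_SIGRETURN };

/* Classify three consecutive instruction words found at the return
   address.  The expected sequence is
     addi r1, r1, FRAME_SIZE; li r0, <syscall number>; sc
   where the syscall number is 0x77 for sigreturn and 0xAC for
   rt_sigreturn.  */
static inline enum tramp_kind
classify_trampoline (uint32_t w0, uint32_t w1, uint32_t w2,
		     uint32_t frame_size)
{
  if (w0 != 0x38210000u + frame_size	/* addi r1, r1, frame_size */
      || w2 != 0x44000002u)		/* sc */
    return TRAMP_NONE;
  if (w1 == 0x38000077u)		/* li r0, 0x0077 */
    return TRAMP_SIGRETURN;
  if (w1 == 0x380000ACu)		/* li r0, 0x00AC */
    return TRAMP_RT_SIGRETURN;
  return TRAMP_NONE;
}
```

The check only identifies the kind of trampoline.  Where the saved
registers live still depends on whether the trampoline sits on the stack
or in the vdso, as described in 2) above.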

Tested with powerpc-linux and powerpc64-linux bootstrap and regression
check.  Also glibc "make check" both with and without vdso (but using
the equivalent code with a gcc-3.4 branch compiler to avoid glibc build
problems with gcc-4.0).  OK to install?

	* config/rs6000/linux-unwind.h (struct gcc_vregs): New.
	(struct gcc_regs): Rename from gcc_pt_regs.  Add more regs.
	(struct gcc_sigcontext): Delete.  Merge contents to..
	(struct gcc_ucontext): ..here.
	(get_sigcontext): Delete.
	(get_regs): New function, like get_sigcontext but return regs ptr.
	64-bit version finds regs from r1 to support vdso.
	(ppc_linux_aux_vector): New function.
	(ppc_fallback_frame_state): Modify for get_regs.  Restore fprs
	and vector regs.

Index: gcc/config/rs6000/linux-unwind.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux-unwind.h,v
retrieving revision 1.2
diff -u -p -r1.2 linux-unwind.h
--- gcc/config/rs6000/linux-unwind.h	15 Sep 2004 11:43:31 -0000	1.2
+++ gcc/config/rs6000/linux-unwind.h	29 Jan 2005 07:17:56 -0000
@@ -24,7 +24,22 @@
    these structs elsewhere;  Many fields are missing, particularly
    from the end of the structures.  */
 
-struct gcc_pt_regs
+struct gcc_vregs
+{
+  __attribute__ ((vector_size (16))) int vr[32];
+#ifdef __powerpc64__
+  unsigned int pad1[3];
+  unsigned int vscr;
+  unsigned int vsave;
+  unsigned int pad2[3];
+#else
+  unsigned int vsave;
+  unsigned int pad[2];
+  unsigned int vscr;
+#endif
+};
+
+struct gcc_regs
 {
   unsigned long gpr[32];
   unsigned long nip;
@@ -34,22 +49,32 @@ struct gcc_pt_regs
   unsigned long link;
   unsigned long xer;
   unsigned long ccr;
-};
-
-struct gcc_sigcontext
-{
-  unsigned long	pad[7];
-  struct gcc_pt_regs *regs;
+  unsigned long softe;
+  unsigned long trap;
+  unsigned long dar;
+  unsigned long dsisr;
+  unsigned long result;
+  unsigned long pad1[4];
+  double fpr[32];
+  unsigned int pad2;
+  unsigned int fpscr;
+#ifdef __powerpc64__
+  struct gcc_vregs *vp;
+#else
+  unsigned int pad3[2];
+#endif
+  struct gcc_vregs vregs;
 };
 
 struct gcc_ucontext
 {
 #ifdef __powerpc64__
-  unsigned long pad[21];
+  unsigned long pad[28];
 #else
-  unsigned long pad[5];
+  unsigned long pad[12];
 #endif
-  struct gcc_sigcontext uc_mcontext;
+  struct gcc_regs *regs;
+  struct gcc_regs rsave;
 };
 
 #ifdef __powerpc64__
@@ -77,34 +102,54 @@ frob_update_context (struct _Unwind_Cont
 }
 
 /* If PC is at a sigreturn trampoline, return a pointer to the
-   sigcontext.  Otherwise return NULL.  */
+   regs.  Otherwise return NULL.  */
 
-static struct gcc_sigcontext *
-get_sigcontext (struct _Unwind_Context *context)
+static struct gcc_regs *
+get_regs (struct _Unwind_Context *context)
 {
   const unsigned char *pc = context->ra;
 
   /* addi r1, r1, 128; li r0, 0x0077; sc  (sigreturn) */
   /* addi r1, r1, 128; li r0, 0x00AC; sc  (rt_sigreturn) */
-  if (*(unsigned int *) (pc+0) != 0x38210000 + SIGNAL_FRAMESIZE
-      || *(unsigned int *) (pc+8) != 0x44000002)
+  if (*(unsigned int *) (pc + 0) != 0x38210000 + SIGNAL_FRAMESIZE
+      || *(unsigned int *) (pc + 8) != 0x44000002)
     return NULL;
-  if (*(unsigned int *) (pc+4) == 0x38000077)
+  if (*(unsigned int *) (pc + 4) == 0x38000077)
     {
       struct sigframe {
 	char gap[SIGNAL_FRAMESIZE];
-	struct gcc_sigcontext sigctx;
-      } *rt_ = context->cfa;
-      return &rt_->sigctx;
+	unsigned long pad[7];
+	struct gcc_regs *regs;
+      } *frame = (struct sigframe *) context->cfa;
+      return frame->regs;
     }
-  else if (*(unsigned int *) (pc+4) == 0x380000AC)
+  else if (*(unsigned int *) (pc + 4) == 0x380000AC)
     {
-      struct rt_sigframe {
+      /* This works for 2.4 kernels, but not for 2.6 kernels with vdso
+	 because pc isn't pointing into the stack.  Can be removed when
+	 no one is running 2.4.19 or 2.4.20, the first two ppc64
+	 kernels released.  */
+      struct rt_sigframe_24 {
 	int tramp[6];
 	void *pinfo;
 	struct gcc_ucontext *puc;
-      } *rt_ = (struct rt_sigframe *) pc;
-      return &rt_->puc->uc_mcontext;
+      } *frame24 = (struct rt_sigframe_24 *) pc;
+
+      /* Test for magic value in *puc of vdso.  */
+      if ((long) frame24->puc != -21 * 8)
+	return frame24->puc->regs;
+      else
+	{
+	  struct rt_sigframe {
+	    char gap[SIGNAL_FRAMESIZE];
+	    struct gcc_ucontext uc;
+	    unsigned long pad[2];
+	    int tramp[6];
+	    void *pinfo;
+	    struct gcc_ucontext *puc;
+	  } *frame = (struct rt_sigframe *) context->cfa;
+	  return frame->uc.regs;
+	}
     }
   return NULL;
 }
@@ -113,8 +158,8 @@ get_sigcontext (struct _Unwind_Context *
 
 enum { SIGNAL_FRAMESIZE = 64 };
 
-static struct gcc_sigcontext *
-get_sigcontext (struct _Unwind_Context *context)
+static struct gcc_regs *
+get_regs (struct _Unwind_Context *context)
 {
   const unsigned char *pc = context->ra;
 
@@ -122,31 +167,64 @@ get_sigcontext (struct _Unwind_Context *
   /* li r0, 0x0077; sc  (sigreturn new)  */
   /* li r0, 0x6666; sc  (rt_sigreturn old)  */
   /* li r0, 0x00AC; sc  (rt_sigreturn new)  */
-  if (*(unsigned int *) (pc+4) != 0x44000002)
+  if (*(unsigned int *) (pc + 4) != 0x44000002)
     return NULL;
-  if (*(unsigned int *) (pc+0) == 0x38007777
-      || *(unsigned int *) (pc+0) == 0x38000077)
+  if (*(unsigned int *) (pc + 0) == 0x38007777
+      || *(unsigned int *) (pc + 0) == 0x38000077)
     {
       struct sigframe {
 	char gap[SIGNAL_FRAMESIZE];
-	struct gcc_sigcontext sigctx;
-      } *rt_ = context->cfa;
-      return &rt_->sigctx;
+	unsigned long pad[7];
+	struct gcc_regs *regs;
+      } *frame = (struct sigframe *) context->cfa;
+      return frame->regs;
     }
-  else if (*(unsigned int *) (pc+0) == 0x38006666
-	   || *(unsigned int *) (pc+0) == 0x380000AC)
+  else if (*(unsigned int *) (pc + 0) == 0x38006666
+	   || *(unsigned int *) (pc + 0) == 0x380000AC)
     {
       struct rt_sigframe {
 	char gap[SIGNAL_FRAMESIZE + 16];
 	char siginfo[128];
 	struct gcc_ucontext uc;
-      } *rt_ = context->cfa;
-      return &rt_->uc.uc_mcontext;
+      } *frame = (struct rt_sigframe *) context->cfa;
+      return frame->uc.regs;
     }
   return NULL;
 }
 #endif
 
+/* Find an entry in the process auxiliary vector.  The canonical way to
+   test for VMX is to look at AT_HWCAP.  */
+
+static long
+ppc_linux_aux_vector (long which)
+{
+  /* __libc_stack_end holds the original stack passed to a process.  */
+  extern long *__libc_stack_end;
+  long argc;
+  char **argv;
+  char **envp;
+  struct auxv
+  {
+    long a_type;
+    long a_val;
+  } *auxp;
+
+  /* The Linux kernel puts argc first on the stack.  */
+  argc = __libc_stack_end[0];
+  /* Followed by argv, NULL terminated.  */
+  argv = (char **) __libc_stack_end + 1;
+  /* Followed by environment string pointers, NULL terminated. */
+  envp = argv + argc + 1;
+  while (*envp++)
+    continue;
+  /* Followed by the aux vector, zero terminated.  */
+  for (auxp = (struct auxv *) envp; auxp->a_type != 0; ++auxp)
+    if (auxp->a_type == which)
+      return auxp->a_val;
+  return 0;
+}
+
 /* Do code reading to identify a signal frame, and set the frame
    state data appropriately.  See unwind-dw2.c for the structs.  */
 
@@ -156,14 +234,15 @@ static _Unwind_Reason_Code
 ppc_fallback_frame_state (struct _Unwind_Context *context,
 			  _Unwind_FrameState *fs)
 {
-  struct gcc_sigcontext *sc = get_sigcontext (context);
+  static long hwcap = 0;
+  struct gcc_regs *regs = get_regs (context);
   long new_cfa;
   int i;
 
-  if (sc == NULL)
+  if (regs == NULL)
     return _URC_END_OF_STACK;
 
-  new_cfa = sc->regs->gpr[STACK_POINTER_REGNUM];
+  new_cfa = regs->gpr[STACK_POINTER_REGNUM];
   fs->cfa_how = CFA_REG_OFFSET;
   fs->cfa_reg = STACK_POINTER_REGNUM;
   fs->cfa_offset = new_cfa - (long) context->cfa;
@@ -172,21 +251,65 @@ ppc_fallback_frame_state (struct _Unwind
     if (i != STACK_POINTER_REGNUM)
       {
 	fs->regs.reg[i].how = REG_SAVED_OFFSET;
-	fs->regs.reg[i].loc.offset
-	  = (long)&(sc->regs->gpr[i]) - new_cfa;
+	fs->regs.reg[i].loc.offset = (long) &regs->gpr[i] - new_cfa;
       }
 
   fs->regs.reg[CR2_REGNO].how = REG_SAVED_OFFSET;
-  fs->regs.reg[CR2_REGNO].loc.offset
-    = (long)&(sc->regs->ccr) - new_cfa;
+  fs->regs.reg[CR2_REGNO].loc.offset = (long) &regs->ccr - new_cfa;
 
   fs->regs.reg[LINK_REGISTER_REGNUM].how = REG_SAVED_OFFSET;
-  fs->regs.reg[LINK_REGISTER_REGNUM].loc.offset
-    = (long)&(sc->regs->link) - new_cfa;
+  fs->regs.reg[LINK_REGISTER_REGNUM].loc.offset = (long) &regs->link - new_cfa;
 
   fs->regs.reg[ARG_POINTER_REGNUM].how = REG_SAVED_OFFSET;
-  fs->regs.reg[ARG_POINTER_REGNUM].loc.offset
-    = (long)&(sc->regs->nip) - new_cfa;
+  fs->regs.reg[ARG_POINTER_REGNUM].loc.offset = (long) &regs->nip - new_cfa;
   fs->retaddr_column = ARG_POINTER_REGNUM;
+
+  if (hwcap == 0)
+    {
+      hwcap = ppc_linux_aux_vector (16);
+      /* These will already be set if we found AT_HWCAP.  A non-zero
+	 value stops us looking again if for some reason we couldn't
+	 find AT_HWCAP.  */
+#ifdef __powerpc64__
+      hwcap |= 0xc0000000;
+#else
+      hwcap |= 0x80000000;
+#endif
+    }
+
+  /* If we have a FPU...  */
+  if (hwcap & 0x08000000)
+    for (i = 0; i < 32; i++)
+      {
+	fs->regs.reg[i + 32].how = REG_SAVED_OFFSET;
+	fs->regs.reg[i + 32].loc.offset = (long) &regs->fpr[i] - new_cfa;
+      }
+
+  /* If we have a VMX unit...  */
+  if (hwcap & 0x10000000)
+    {
+      struct gcc_vregs *vregs;
+#ifdef __powerpc64__
+      vregs = regs->vp;
+#else
+      vregs = &regs->vregs;
+#endif
+      if (regs->msr & (1 << 25))
+	{
+	  for (i = 0; i < 32; i++)
+	    {
+	      fs->regs.reg[i + FIRST_ALTIVEC_REGNO].how = REG_SAVED_OFFSET;
+	      fs->regs.reg[i + FIRST_ALTIVEC_REGNO].loc.offset
+		= (long) &vregs->vr[i] - new_cfa;
+	    }
+
+	  fs->regs.reg[VSCR_REGNO].how = REG_SAVED_OFFSET;
+	  fs->regs.reg[VSCR_REGNO].loc.offset = (long) &vregs->vscr - new_cfa;
+	}
+
+      fs->regs.reg[VRSAVE_REGNO].how = REG_SAVED_OFFSET;
+      fs->regs.reg[VRSAVE_REGNO].loc.offset = (long) &vregs->vsave - new_cfa;
+    }
+
   return _URC_NO_REASON;
 }
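
For anyone following the lookup in ppc_linux_aux_vector, here is a minimal
stand-alone sketch of the same walk over a simulated stack image.  The names
find_auxv and struct auxv_entry are made up for illustration and are not part
of the patch; the sketch assumes every stack slot is one long wide.

```c
#include <stddef.h>

/* Hypothetical type mirroring the (a_type, a_val) pairs of the aux vector.  */
struct auxv_entry
{
  long a_type;
  long a_val;
};

/* Walk a stack image laid out as: argc, argv[0..argc-1], NULL,
   envp entries, NULL, then aux pairs terminated by a_type == 0.
   Returns the value for WHICH, or 0 if no such entry exists.  */
static long
find_auxv (const long *stack, long which)
{
  long argc = stack[0];
  /* Skip argc, the argv pointers and the NULL that ends argv.  */
  const long *p = stack + 1 + argc + 1;
  const struct auxv_entry *auxp;

  /* Skip the environment pointers and their terminating NULL.  */
  while (*p++ != 0)
    continue;

  for (auxp = (const struct auxv_entry *) p; auxp->a_type != 0; ++auxp)
    if (auxp->a_type == which)
      return auxp->a_val;
  return 0;
}
```

Given an image with argc = 2, one environment entry and a type 16 (AT_HWCAP)
pair, find_auxv (stack, 16) returns the hwcap word, which can then be tested
against 0x08000000 (FPU) and 0x10000000 (VMX) as the patch does.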

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] powerpc dwarf2 unwinder fallback
       [not found]           ` <amodra@bigpond.net.au>
                               ` (32 preceding siblings ...)
  2005-01-12  5:23             ` [PATCH] PR target/19389 Odd gpr mem load unrecognizable insn David Edelsohn
@ 2005-01-29 17:21             ` David Edelsohn
  2005-02-15 19:41             ` [RFC] PowerPC 128 bit long double compatibility (PR target/19019) David Edelsohn
                               ` (27 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-01-29 17:21 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/linux-unwind.h (struct gcc_vregs): New.
	(struct gcc_regs): Rename from gcc_pt_regs.  Add more regs.
	(struct gcc_sigcontext): Delete.  Merge contents to..
	(struct gcc_ucontext): ..here.
	(get_sigcontext): Delete.
	(get_regs): New function, like get_sigcontext but return regs ptr.
	64-bit version finds regs from r1 to support vdso.
	(ppc_linux_aux_vector): New function.
	(ppc_fallback_frame_state): Modify for get_regs.  Restore fprs
	and vector regs.

Okay.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
@ 2005-02-15  2:20 David Edelsohn
  2005-02-15  2:22 ` Geoffrey Keating
  2005-02-15  2:53 ` Richard Henderson
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-02-15  2:20 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc-patches

	The GCC and IBM XLC semantics of the 128-bit long double differ
and do not interoperate completely.  This patch renames the support
functions to unique names so that the compilers do not try to use each
other's support functions.

	This will need to be pushed back into GCC 3.4 and Linux distros
using GCC 3.4 as well.
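
	As a reminder of what these support functions compute: each long
double is a pair of doubles whose sum is the value.  The following is only a
sketch of a double-double add built on the textbook two-sum step.  It is not
the algorithm in darwin-ldouble.c, the names dd, two_sum and dd_add are
hypothetical, and it assumes strict IEEE double arithmetic (no -ffast-math,
no excess precision).

```c
/* Hypothetical pair type: value = hi + lo.  */
typedef struct
{
  double hi;
  double lo;
} dd;

/* Error-free addition: returns s = fl(a + b) in hi and the exact
   rounding error in lo, so that hi + lo == a + b exactly.  */
static dd
two_sum (double a, double b)
{
  double s = a + b;
  double bb = s - a;
  double err = (a - (s - bb)) + (b - bb);
  return (dd) { s, err };
}

/* Add two double-double values and renormalize the result.  */
static dd
dd_add (dd x, dd y)
{
  dd s = two_sum (x.hi, y.hi);
  double lo = s.lo + x.lo + y.lo;
  return two_sum (s.hi, lo);
}
```

A call such as dd_add (x, y) plays the role of the four-double libcall
__gcc_qadd (a, aa, c, cc), with the pair packed into a struct for readability.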

David

	* config/rs6000/darwin-ldouble.c (_xlqadd): Rename to __gcc_qadd.
	(_xlqsub): Rename to __gcc_qsub.
	(_xlqmul): Rename to __gcc_qmul.
	(_xlqdiv): Rename to __gcc_qdiv.
	* config/rs6000/libgcc-ppc64.ver: Rename symbols.
	* config/rs6000/rs6000.c (rs6000_init_libfuncs): Rename symbols.
	* config/rs6000/t-aix43 (LIB2FUNCS_EXTRA): New.
	* config/rs6000/t-aix52 (LIB2FUNCS_EXTRA): New.
	* config/rs6000/t-newas (LIB2FUNCS_EXTRA): New.

Index: darwin-ldouble.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin-ldouble.c,v
retrieving revision 1.8
diff -c -p -r1.8 darwin-ldouble.c
*** darwin-ldouble.c	31 Jul 2004 01:40:16 -0000	1.8
--- darwin-ldouble.c	14 Feb 2005 22:08:49 -0000
*************** Software Foundation, 59 Temple Place - S
*** 30,41 ****
  /* Implementations of floating-point long double basic arithmetic
     functions called by the IBM C compiler when generating code for
     PowerPC platforms.  In particular, the following functions are
!    implemented: _xlqadd, _xlqsub, _xlqmul, and _xlqdiv.  Double-double
!    algorithms are based on the paper "Doubled-Precision IEEE Standard
!    754 Floating-Point Arithmetic" by W. Kahan, February 26, 1987.  An
!    alternative published reference is "Software for Doubled-Precision
!    Floating-Point Computations", by Seppo Linnainmaa, ACM TOMS vol 7
!    no 3, September 1981, pages 272-283.  */
  
  /* Each long double is made up of two IEEE doubles.  The value of the
     long double is the sum of the values of the two parts.  The most
--- 30,41 ----
  /* Implementations of floating-point long double basic arithmetic
     functions called by the IBM C compiler when generating code for
     PowerPC platforms.  In particular, the following functions are
!    implemented: __gcc_qadd, __gcc_qsub, __gcc_qmul, and __gcc_qdiv.
!    Double-double algorithms are based on the paper "Doubled-Precision
!    IEEE Standard 754 Floating-Point Arithmetic" by W. Kahan, February 26,
!    1987.  An alternative published reference is "Software for
!    Doubled-Precision Floating-Point Computations", by Seppo Linnainmaa,
!    ACM TOMS vol 7 no 3, September 1981, pages 272-283.  */
  
  /* Each long double is made up of two IEEE doubles.  The value of the
     long double is the sum of the values of the two parts.  The most
*************** Software Foundation, 59 Temple Place - S
*** 48,54 ****
  
     This code currently assumes big-endian.  */
  
! #if !_SOFT_FLOAT && (defined (__MACH__) || defined (__powerpc64__))
  
  #define fabs(x) __builtin_fabs(x)
  #define isless(x, y) __builtin_isless (x, y)
--- 48,54 ----
  
     This code currently assumes big-endian.  */
  
! #if !_SOFT_FLOAT && (defined (__MACH__) || defined (__powerpc64__) || defined (_AIX))
  
  #define fabs(x) __builtin_fabs(x)
  #define isless(x, y) __builtin_isless (x, y)
*************** Software Foundation, 59 Temple Place - S
*** 62,71 ****
     but GCC currently generates poor code when a union is used to turn
     a long double into a pair of doubles.  */
  
! extern long double _xlqadd (double, double, double, double);
! extern long double _xlqsub (double, double, double, double);
! extern long double _xlqmul (double, double, double, double);
! extern long double _xlqdiv (double, double, double, double);
  
  typedef union
  {
--- 62,71 ----
     but GCC currently generates poor code when a union is used to turn
     a long double into a pair of doubles.  */
  
! extern long double __gcc_qadd (double, double, double, double);
! extern long double __gcc_qsub (double, double, double, double);
! extern long double __gcc_qmul (double, double, double, double);
! extern long double __gcc_qdiv (double, double, double, double);
  
  typedef union
  {
*************** typedef union
*** 75,81 ****
  
  /* Add two 'long double' values and return the result.	*/
  long double
! _xlqadd (double a, double aa, double c, double cc)
  {
    longDblUnion x;
    double z, q, zz, xh;
--- 75,81 ----
  
  /* Add two 'long double' values and return the result.	*/
  long double
! __gcc_qadd (double a, double aa, double c, double cc)
  {
    longDblUnion x;
    double z, q, zz, xh;
*************** _xlqadd (double a, double aa, double c, 
*** 110,122 ****
  }
  
  long double
! _xlqsub (double a, double b, double c, double d)
  {
!   return _xlqadd (a, b, -c, -d);
  }
  
  long double
! _xlqmul (double a, double b, double c, double d)
  {
    longDblUnion z;
    double t, tau, u, v, w;
--- 110,122 ----
  }
  
  long double
! __gcc_qsub (double a, double b, double c, double d)
  {
!   return __gcc_qadd (a, b, -c, -d);
  }
  
  long double
! __gcc_qmul (double a, double b, double c, double d)
  {
    longDblUnion z;
    double t, tau, u, v, w;
*************** _xlqmul (double a, double b, double c, d
*** 145,151 ****
  }
  
  long double
! _xlqdiv (double a, double b, double c, double d)
  {
    longDblUnion z;
    double s, sigma, t, tau, u, v, w;
--- 145,151 ----
  }
  
  long double
! __gcc_qdiv (double a, double b, double c, double d)
  {
    longDblUnion z;
    double s, sigma, t, tau, u, v, w;
Index: libgcc-ppc64.ver
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/libgcc-ppc64.ver,v
retrieving revision 1.2
diff -c -p -r1.2 libgcc-ppc64.ver
*** libgcc-ppc64.ver	7 Feb 2004 03:06:46 -0000	1.2
--- libgcc-ppc64.ver	14 Feb 2005 22:08:49 -0000
***************
*** 1,7 ****
  GCC_3.4 {
    # long double support
!   _xlqadd
!   _xlqsub
!   _xlqmul
!   _xlqdiv
  }
--- 1,7 ----
  GCC_3.4 {
    # long double support
!   __gcc_qadd
!   __gcc_qsub
!   __gcc_qmul
!   __gcc_qdiv
  }
Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.785
diff -c -p -r1.785 rs6000.c
*** rs6000.c	13 Feb 2005 21:31:25 -0000	1.785
--- rs6000.c	14 Feb 2005 22:08:49 -0000
*************** rs6000_init_libfuncs (void)
*** 8871,8880 ****
  	}
  
        /* Standard AIX/Darwin/64-bit SVR4 quad floating point routines.  */
!       set_optab_libfunc (add_optab, TFmode, "_xlqadd");
!       set_optab_libfunc (sub_optab, TFmode, "_xlqsub");
!       set_optab_libfunc (smul_optab, TFmode, "_xlqmul");
!       set_optab_libfunc (sdiv_optab, TFmode, "_xlqdiv");
      }
    else
      {
--- 8871,8880 ----
  	}
  
        /* Standard AIX/Darwin/64-bit SVR4 quad floating point routines.  */
!       set_optab_libfunc (add_optab, TFmode, "__gcc_qadd");
!       set_optab_libfunc (sub_optab, TFmode, "__gcc_qsub");
!       set_optab_libfunc (smul_optab, TFmode, "__gcc_qmul");
!       set_optab_libfunc (sdiv_optab, TFmode, "__gcc_qdiv");
      }
    else
      {
Index: t-aix43
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/t-aix43,v
retrieving revision 1.22
diff -c -p -r1.22 t-aix43
*** t-aix43	10 Jan 2005 15:10:05 -0000	1.22
--- t-aix43	14 Feb 2005 22:08:49 -0000
*************** SHLIB_MKMAP = $(srcdir)/mkmap-flat.awk
*** 61,66 ****
--- 61,69 ----
  SHLIB_MAPFILES = $(srcdir)/libgcc-std.ver
  SHLIB_NM_FLAGS = -Bpg -X32_64
  
+ # GCC 128-bit long double support routines.
+ LIB2FUNCS_EXTRA = $(srcdir)/config/rs6000/darwin-ldouble.c
+ 
  # Either 32-bit and 64-bit objects in archives.
  AR_FLAGS_FOR_TARGET = -X32_64
  
Index: t-aix52
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/t-aix52,v
retrieving revision 1.5
diff -c -p -r1.5 t-aix52
*** t-aix52	7 Dec 2004 18:44:31 -0000	1.5
--- t-aix52	14 Feb 2005 22:08:49 -0000
*************** SHLIB_MKMAP = $(srcdir)/mkmap-flat.awk
*** 42,47 ****
--- 42,50 ----
  SHLIB_MAPFILES = $(srcdir)/libgcc-std.ver
  SHLIB_NM_FLAGS = -Bpg -X32_64
  
+ # GCC 128-bit long double support routines.
+ LIB2FUNCS_EXTRA = $(srcdir)/config/rs6000/darwin-ldouble.c
+ 
  # Either 32-bit and 64-bit objects in archives.
  AR_FLAGS_FOR_TARGET = -X32_64
  
Index: t-newas
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/t-newas,v
retrieving revision 1.6
diff -c -p -r1.6 t-newas
*** t-newas	18 Dec 2002 22:45:35 -0000	1.6
--- t-newas	14 Feb 2005 22:08:49 -0000
*************** MULTILIB_MATCHES	= $(MULTILIB_MATCHES_FL
*** 27,32 ****
--- 27,35 ----
  			  mcpu?powerpc=mpowerpc-gpopt \
  			  mcpu?powerpc=mpowerpc-gfxopt
  
+ # GCC 128-bit long double support routines.
+ LIB2FUNCS_EXTRA = $(srcdir)/config/rs6000/darwin-ldouble.c
+ 
  # Aix 3.2.x needs milli.exp for -mcpu=common
  EXTRA_PARTS = milli.exp
  milli.exp: $(srcdir)/config/rs6000/milli.exp

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15  2:20 [RFC] PowerPC 128 bit long double compatibility (PR target/19019) David Edelsohn
@ 2005-02-15  2:22 ` Geoffrey Keating
  2005-02-15  2:29   ` David Edelsohn
  2005-02-15  2:53 ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2005-02-15  2:22 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches



On 14/02/2005, at 2:42 PM, David Edelsohn wrote:

> *** rs6000.c	13 Feb 2005 21:31:25 -0000	1.785
> --- rs6000.c	14 Feb 2005 22:08:49 -0000
> *************** rs6000_init_libfuncs (void)
> *** 8871,8880 ****
>   	}
>
>         /* Standard AIX/Darwin/64-bit SVR4 quad floating point routines.  */
> !       set_optab_libfunc (add_optab, TFmode, "_xlqadd");
> !       set_optab_libfunc (sub_optab, TFmode, "_xlqsub");
> !       set_optab_libfunc (smul_optab, TFmode, "_xlqmul");
> !       set_optab_libfunc (sdiv_optab, TFmode, "_xlqdiv");
>       }
>     else
>       {
> --- 8871,8880 ----
>   	}
>
>         /* Standard AIX/Darwin/64-bit SVR4 quad floating point routines.  */
> !       set_optab_libfunc (add_optab, TFmode, "__gcc_qadd");
> !       set_optab_libfunc (sub_optab, TFmode, "__gcc_qsub");
> !       set_optab_libfunc (smul_optab, TFmode, "__gcc_qmul");
> !       set_optab_libfunc (sdiv_optab, TFmode, "__gcc_qdiv");
>       }
>     else
>       {

Shouldn't this part be conditional on some flag, xlc-compat or 
something?  (The rest looks fine.)
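
A minimal sketch of what I mean, assuming a hypothetical xl_compat flag (no
such option exists in this patch) and a made-up helper quad_libfunc_name that
picks the libcall name for each operation:

```c
/* Hypothetical operation codes for the four TFmode libcalls.  */
enum quad_op { QUAD_ADD, QUAD_SUB, QUAD_MUL, QUAD_DIV };

/* Return the support-routine name for OP.  XL_COMPAT is an assumed
   flag: non-zero selects the XLC names, zero the new GCC names.  */
static const char *
quad_libfunc_name (enum quad_op op, int xl_compat)
{
  static const char *const gcc_names[] =
    { "__gcc_qadd", "__gcc_qsub", "__gcc_qmul", "__gcc_qdiv" };
  static const char *const xlc_names[] =
    { "_xlqadd", "_xlqsub", "_xlqmul", "_xlqdiv" };

  return xl_compat ? xlc_names[op] : gcc_names[op];
}
```

rs6000_init_libfuncs would then pass the returned string to
set_optab_libfunc instead of a fixed literal.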



^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15  2:22 ` Geoffrey Keating
@ 2005-02-15  2:29   ` David Edelsohn
  2005-02-15 11:20     ` Geoffrey Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-02-15  2:29 UTC (permalink / raw)
  To: Geoffrey Keating; +Cc: gcc-patches

>>>>> Geoffrey Keating writes:

>> /* Standard AIX/Darwin/64-bit SVR4 quad floating point routines.  */
>> !       set_optab_libfunc (add_optab, TFmode, "__gcc_qadd");
>> !       set_optab_libfunc (sub_optab, TFmode, "__gcc_qsub");
>> !       set_optab_libfunc (smul_optab, TFmode, "__gcc_qmul");
>> !       set_optab_libfunc (sdiv_optab, TFmode, "__gcc_qdiv");
>> }
>> else
>> {

Geoff> Shouldn't this part be conditional on some flag, xlc-compat or 
Geoff> something?  (The rest looks fine.)

	The _xlq routines are provided only by AIX and XLC, and by GCC on
platforms other than AIX.  Non-AIX platforms do not normally link with the XLC
support library.  The GCC semantics are stricter, so I think that we only
lose performance by always using the GCC functions.  I do not see that we
generate any usable code if we make the symbols conditional on xl-compat.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15  2:20 [RFC] PowerPC 128 bit long double compatibility (PR target/19019) David Edelsohn
  2005-02-15  2:22 ` Geoffrey Keating
@ 2005-02-15  2:53 ` Richard Henderson
  2005-02-15  3:19   ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2005-02-15  2:53 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, gcc-patches

On Mon, Feb 14, 2005 at 05:42:32PM -0500, David Edelsohn wrote:
>   GCC_3.4 {
>     # long double support
> !   _xlqadd
> !   _xlqsub
> !   _xlqmul
> !   _xlqdiv
>   }
> --- 1,7 ----
>   GCC_3.4 {
>     # long double support
> !   __gcc_qadd
> !   __gcc_qsub
> !   __gcc_qmul
> !   __gcc_qdiv

You can't make this change without breaking binary compatibility on ELF.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15  2:53 ` Richard Henderson
@ 2005-02-15  3:19   ` David Edelsohn
  2005-02-15 15:54     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-02-15  3:19 UTC (permalink / raw)
  To: Richard Henderson, Geoff Keating, gcc-patches

>>>>> Richard Henderson writes:

>> GCC_3.4 {
>> # long double support
>> !   __gcc_qadd
>> !   __gcc_qsub
>> !   __gcc_qmul
>> !   __gcc_qdiv

Richard> You can't make this change without breaking binary compatibility
Richard> on ELF. 

	Yes, we know.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15  2:29   ` David Edelsohn
@ 2005-02-15 11:20     ` Geoffrey Keating
  0 siblings, 0 replies; 875+ messages in thread
From: Geoffrey Keating @ 2005-02-15 11:20 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> >>>>> Geoffrey Keating writes:
> 
> >> /* Standard AIX/Darwin/64-bit SVR4 quad floating point routines.  */
> >> !       set_optab_libfunc (add_optab, TFmode, "__gcc_qadd");
> >> !       set_optab_libfunc (sub_optab, TFmode, "__gcc_qsub");
> >> !       set_optab_libfunc (smul_optab, TFmode, "__gcc_qmul");
> >> !       set_optab_libfunc (sdiv_optab, TFmode, "__gcc_qdiv");
> >> }
> >> else
> >> {
> 
> Geoff> Shouldn't this part be conditional on some flag, xlc-compat or 
> Geoff> something?  (The rest looks fine.)
> 
> 	The _xlq routines are provided only by AIX and XLC, and by GCC on
> platforms other than AIX.  Non-AIX platforms do not normally link with the XLC
> support library.  The GCC semantics are stricter, so I think that we only
> lose performance by always using the GCC functions.  I do not see that we
> generate any usable code if we make the symbols conditional on xl-compat.

Well, you'd generate code that could link with xlc...

But I'm fine with the original patch if you think it's right.

Although the semantics are stricter, I'm not sure that you can take an
xlc-derived value and feed it into GCC's support routines and have it
work; when I was writing the routines I didn't try to analyse invalid
inputs to see what they did.
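
To make "invalid inputs" concrete, here is a sketch of the kind of check I
have in mind.  It assumes the invariant is that the high double equals the
whole value rounded to double, handles finite inputs only, and the name
dd_is_canonical is made up for illustration:

```c
/* Return non-zero if (HI, LO) looks like a normalized double-double,
   i.e. adding LO to HI does not change HI after rounding to double.
   Finite inputs only; NaN and infinity are not considered here.  */
static int
dd_is_canonical (double hi, double lo)
{
  /* volatile forces the sum to be rounded to double before comparing.  */
  volatile double sum = hi + lo;
  return sum == hi;
}
```

A pair such as (1.0, 1e-20) passes, while (1.0, 1.0) does not; a pair of the
second kind is what I would not expect the routines to handle sensibly.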

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15  3:19   ` David Edelsohn
@ 2005-02-15 15:54     ` Alan Modra
  2005-02-15 16:11       ` Richard Henderson
                         ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2005-02-15 15:54 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Richard Henderson, Geoff Keating, gcc-patches

On Mon, Feb 14, 2005 at 06:56:00PM -0500, David Edelsohn wrote:
> Richard> You can't make this change without breaking binary compatibility
> Richard> on ELF. 
> 
> 	Yes, we know.

This patch on top of yours provides versioned symbols for backwards
compatibility.  I've put the new syms as GCC_3.4.4 rather than GCC_4.0
as I'm assuming you'll port your patch to gcc-3.4 too.

Note!  The Darwin support hasn't been checked.  I'm not certain whether
the Darwin assembler even supports .symver.


diff -urp gcc-dje/gcc/config/rs6000/darwin-ldouble.c gcc-alan/gcc/config/rs6000/darwin-ldouble.c
--- gcc-dje/gcc/config/rs6000/darwin-ldouble.c	2005-02-15 19:51:11.872609608 +1030
+++ gcc-alan/gcc/config/rs6000/darwin-ldouble.c	2005-02-15 20:02:10.975688136 +1030
@@ -67,6 +67,29 @@ extern long double __gcc_qsub (double, d
 extern long double __gcc_qmul (double, double, double, double);
 extern long double __gcc_qdiv (double, double, double, double);
 
+#ifdef __ELF__
+/* Provide definitions of the old symbol names to satisfy apps and
+   shared libs built against an older libgcc.  To access the _xlq
+   symbols an explicit version reference is needed, so these won't
+   satisfy an unadorned reference like _xlqadd.  If dot symbols are
+   not needed, the assembler will remove the aliases from the symbol
+   table.  */
+__asm__ (".symver __gcc_qadd,_xlqadd@GCC_3.4\n\t"
+	 ".symver __gcc_qsub,_xlqsub@GCC_3.4\n\t"
+	 ".symver __gcc_qmul,_xlqmul@GCC_3.4\n\t"
+	 ".symver __gcc_qdiv,_xlqdiv@GCC_3.4\n\t"
+	 ".symver .__gcc_qadd,._xlqadd@GCC_3.4\n\t"
+	 ".symver .__gcc_qsub,._xlqsub@GCC_3.4\n\t"
+	 ".symver .__gcc_qmul,._xlqmul@GCC_3.4\n\t"
+	 ".symver .__gcc_qdiv,._xlqdiv@GCC_3.4");
+#endif
+#ifdef __MACH__
+__asm__ (".symver ___gcc_qadd,__xlqadd@GCC_3.4\n\t"
+	 ".symver ___gcc_qsub,__xlqsub@GCC_3.4\n\t"
+	 ".symver ___gcc_qmul,__xlqmul@GCC_3.4\n\t"
+	 ".symver ___gcc_qdiv,__xlqdiv@GCC_3.4");
+#endif
+
 typedef union
 {
   long double ldval;
diff -urp gcc-dje/gcc/config/rs6000/libgcc-ppc64.ver gcc-alan/gcc/config/rs6000/libgcc-ppc64.ver
--- gcc-dje/gcc/config/rs6000/libgcc-ppc64.ver	2005-02-15 19:51:11.873609450 +1030
+++ gcc-alan/gcc/config/rs6000/libgcc-ppc64.ver	2005-02-15 19:08:13.131266227 +1030
@@ -1,4 +1,4 @@
-GCC_3.4 {
+GCC_3.4.4 {
   # long double support
   __gcc_qadd
   __gcc_qsub

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15 15:54     ` Alan Modra
@ 2005-02-15 16:11       ` Richard Henderson
  2005-02-15 16:31         ` Alan Modra
  2005-02-16  0:20       ` Geoffrey Keating
  2005-02-23 17:43       ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Jakub Jelinek
  2 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2005-02-15 16:11 UTC (permalink / raw)
  To: David Edelsohn, Geoff Keating, gcc-patches

On Tue, Feb 15, 2005 at 08:09:02PM +1030, Alan Modra wrote:
> -GCC_3.4 {
> +GCC_3.4.4 {

Don't you still need the GCC_3.4 symbol and the _xlq names
in the version map, even with the symver?


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15 16:11       ` Richard Henderson
@ 2005-02-15 16:31         ` Alan Modra
  2005-02-15 17:17           ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-02-15 16:31 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Geoff Keating, gcc-patches

On Tue, Feb 15, 2005 at 01:42:29AM -0800, Richard Henderson wrote:
> On Tue, Feb 15, 2005 at 08:09:02PM +1030, Alan Modra wrote:
> > -GCC_3.4 {
> > +GCC_3.4.4 {
> 
> Don't you still need the GCC_3.4 symbol and the _xlq names
> in the version map, even with the symver?

No, I don't think so.  At least, the linker doesn't need the syms to be
mentioned in a version script as well as .symver.  Perhaps there's
something I'm missing about the gcc build process..

libgcc_s.so and libgcc.a look good to me.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15 16:31         ` Alan Modra
@ 2005-02-15 17:17           ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-02-15 17:17 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Geoff Keating, gcc-patches

On Tue, Feb 15, 2005 at 09:02:18PM +1030, Alan Modra wrote:
> On Tue, Feb 15, 2005 at 01:42:29AM -0800, Richard Henderson wrote:
> > On Tue, Feb 15, 2005 at 08:09:02PM +1030, Alan Modra wrote:
> > > -GCC_3.4 {
> > > +GCC_3.4.4 {
> > 
> > Don't you still need the GCC_3.4 symbol and the _xlq names
> > in the version map, even with the symver?
> 
> No, I don't think so.  At least, the linker doesn't need the syms to be
> mentioned in a version script as well as .symver.  Perhaps there's
> something I'm missing about the gcc build process..
> 
> libgcc_s.so and libgcc.a look good to me.

Hmm, I suppose I'm relying on a GCC_3.4 version being defined in
libgcc-std.ver.  If that ever changed, we'd need

GCC_3.4 { }

in libgcc-ppc64.ver.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
       [not found]           ` <amodra@bigpond.net.au>
                               ` (33 preceding siblings ...)
  2005-01-29 17:21             ` [PATCH] powerpc dwarf2 unwinder fallback David Edelsohn
@ 2005-02-15 19:41             ` David Edelsohn
  2005-02-16  1:52               ` Geoffrey Keating
  2005-03-02 16:17             ` Implicit altivec vs. linux kernel build David Edelsohn
                               ` (26 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-02-15 19:41 UTC (permalink / raw)
  To: Richard Henderson, Geoff Keating, gcc-patches

Geoff,

	Should GCC perform any symbol versioning for Darwin?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15 15:54     ` Alan Modra
  2005-02-15 16:11       ` Richard Henderson
@ 2005-02-16  0:20       ` Geoffrey Keating
  2005-02-23 17:43       ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Jakub Jelinek
  2 siblings, 0 replies; 875+ messages in thread
From: Geoffrey Keating @ 2005-02-16  0:20 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, Richard Henderson, David Edelsohn



On 15/02/2005, at 1:39 AM, Alan Modra wrote:

> On Mon, Feb 14, 2005 at 06:56:00PM -0500, David Edelsohn wrote:
>> Richard> You can't make this change without breaking binary compatibility
>> Richard> on ELF.
>>
>> 	Yes, we know.
>
> This patch on top of yours provides versioned symbols for backwards
> compatibility.  I've put the new syms as GCC_3.4.4 rather than GCC_4.0
> as I'm assuming you'll port your patch to gcc-3.4 too.
>
> Note!  The Darwin support hasn't been checked.  I'm not certain whether
> the Darwin assembler even supports .symver.

It doesn't; there's no way to do this on Darwin, so we'll just have to
break compatibility.  Fortunately, released versions of Darwin don't
provide a shared libgcc yet.

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 2408 bytes --]

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFC] PowerPC 128 bit long double compatibility (PR target/19019)
  2005-02-15 19:41             ` [RFC] PowerPC 128 bit long double compatibility (PR target/19019) David Edelsohn
@ 2005-02-16  1:52               ` Geoffrey Keating
  0 siblings, 0 replies; 875+ messages in thread
From: Geoffrey Keating @ 2005-02-16  1:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 170 bytes --]


On 15/02/2005, at 7:24 AM, David Edelsohn wrote:

> Geoff,
>
> 	Should GCC perform any symbol versioning for Darwin?

Darwin does not support symbol versioning, so no.


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 2408 bytes --]

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [3.4 PATCH] Fix powerpc*-*-linux* bootstrap (PR target/19019)
  2005-02-15 15:54     ` Alan Modra
  2005-02-15 16:11       ` Richard Henderson
  2005-02-16  0:20       ` Geoffrey Keating
@ 2005-02-23 17:43       ` Jakub Jelinek
  2005-02-23 19:28         ` David Edelsohn
  2 siblings, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2005-02-23 17:43 UTC (permalink / raw)
  To: David Edelsohn, Richard Henderson, Geoff Keating, gcc-patches

On Tue, Feb 15, 2005 at 08:09:02PM +1030, Alan Modra wrote:
> On Mon, Feb 14, 2005 at 06:56:00PM -0500, David Edelsohn wrote:
> > Richard> You can't make this change without breaking binary compatibility
> > Richard> on ELF. 
> > 
> > 	Yes, we know.
> 
> This patch on top of yours provides versioned symbols for backwards
> compatibility.  I've put the new syms as GCC_3.4.4 rather than GCC_4.0
> as I'm assuming you'll port your patch to gcc-3.4 too.

This part of the patch breaks the ppc64-linux build on gcc-3_4-branch ATM.
The problem is that on the 3.4 branch, a single object is used for both
libgcc.a and libgcc_s.so, and for libgcc.a it is then ld -r'ed with
autogenerated assembly containing .hidden directives to hide the symbols.
But this autogeneration doesn't cope with @ in symbol names (it just emits
.hidden _xlqadd@GCC_3.4 etc.).
The fundamental problem though is that the .symver directive should never
appear in libgcc.a's objects, only in libgcc_s.so's objects.
For the trunk this shouldn't be hard to fix (will do later on today),
because the libgcc.a and libgcc_s.so object files are compiled separately.
But for 3.4 I think we need something like the patch below, which worked for
me so far to get past stage2 and both libgcc.a and libgcc_s.so looked ok.
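
A rough way to check that no versioned symbols ended up in the static
archive is to scan nm output for names containing @.  This is only a
sketch: versioned_syms and sample_nm are made-up helper names, and the
sample listing is invented for illustration (only _xlqadd@GCC_3.4 and
__gcc_qadd are names taken from this thread).

```sh
#!/bin/sh
# Hypothetical helper: read nm-style output ("address type name") on stdin
# and print every defined symbol whose name carries an @version suffix.
versioned_syms() {
  awk 'NF == 3 && $3 ~ /@/ { print $3 }'
}

# Invented sample listing; a real check would pipe in `nm libgcc.a` instead.
sample_nm() {
  printf '%s\n' \
    '0000000000000000 T __gcc_qadd' \
    '0000000000000000 T _xlqadd@GCC_3.4' \
    '                 U __gcc_qsub'
}

# Lists the versioned symbol only; an empty result is what libgcc.a should give.
sample_nm | versioned_syms
```

Against a real build one would presumably run the target nm on libgcc.a
and expect no output at all.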

Ok to commit if it bootstraps/regtests?

Not sure about AIX; maybe it needs the same changes as in t-linux64.

2005-02-23  Jakub Jelinek  <jakub@redhat.com>

	PR target/19019
	* Makefile.in (LIB2FUNCS_SHARED_EXTRA, LIB2ADD_SH): New.
	(libgcc.mk): Depend on $(LIB2ADD_SH), pass LIB2ADD_SH to mklibgcc.
	(LIBGCC_DEPS): Add $(LIB2ADD_SH).
	* mklibgcc.in: Handle LIB2ADD_SH.
	* config/rs6000/t-linux64 (LIB2FUNCS_EXTRA): Remove darwin-ldouble.c.
	(LIB2FUNCS_STATIC_EXTRA, LIB2FUNCS_SHARED_EXTRA): Set.
	* config/rs6000/darwin-ldouble.c: Protect .symver asm also with
	defined IN_LIBGCC2_S.
	* config/rs6000/darwin-ldouble-shared.c: New file.

--- gcc/Makefile.in.jj	2005-01-19 12:15:42.000000000 +0100
+++ gcc/Makefile.in	2005-02-23 10:03:22.811303371 +0100
@@ -553,6 +553,10 @@ LIB2FUNCS_EXTRA =
 # Assembler files should have names ending in `.asm'.
 LIB2FUNCS_STATIC_EXTRA =
 
+# List of extra C and assembler files to add to shared libgcc2.
+# Assembler files should have names ending in `.asm'.
+LIB2FUNCS_SHARED_EXTRA =
+
 # Program to convert libraries.
 LIBCONVERT =
 
@@ -1144,14 +1148,17 @@ xlimits.h: glimits.h limitx.h limity.h
 
 LIB2ADD = $(LIB2FUNCS_EXTRA)
 LIB2ADD_ST = $(LIB2FUNCS_STATIC_EXTRA)
+LIB2ADD_SH = $(LIB2FUNCS_SHARED_EXTRA)
 
-libgcc.mk: config.status Makefile mklibgcc $(LIB2ADD) $(LIB2ADD_ST) xgcc$(exeext) specs
+libgcc.mk: config.status Makefile mklibgcc $(LIB2ADD) $(LIB2ADD_ST) $(LIB2ADD_SH) \
+  xgcc$(exeext) specs
 	objext='$(objext)' \
 	LIB1ASMFUNCS='$(LIB1ASMFUNCS)' \
 	LIB2FUNCS_ST='$(LIB2FUNCS_ST)' \
 	LIBGCOV='$(LIBGCOV)' \
 	LIB2ADD='$(LIB2ADD)' \
 	LIB2ADD_ST='$(LIB2ADD_ST)' \
+	LIB2ADD_SH='$(LIB2ADD_SH)' \
 	LIB2ADDEH='$(LIB2ADDEH)' \
 	LIB2ADDEHSTATIC='$(LIB2ADDEHSTATIC)' \
 	LIB2ADDEHSHARED='$(LIB2ADDEHSHARED)' \
@@ -1187,8 +1194,8 @@ LIBGCC_DEPS = $(GCC_PASSES) $(LANGUAGES)
 	libgcc.mk $(srcdir)/libgcc2.c $(srcdir)/libgcov.c $(TCONFIG_H) \
 	$(MACHMODE_H) longlong.h gbl-ctors.h config.status stmp-int-hdrs \
 	tsystem.h $(FPBIT) $(DPBIT) $(TPBIT) $(LIB2ADD) \
-	$(LIB2ADD_ST) $(LIB2ADDEH) $(LIB2ADDEHDEP) $(EXTRA_PARTS) \
-	$(srcdir)/config/$(LIB1ASMSRC) \
+	$(LIB2ADD_ST) $(LIB2ADD_SH) $(LIB2ADDEH) $(LIB2ADDEHDEP) \
+	$(EXTRA_PARTS) $(srcdir)/config/$(LIB1ASMSRC) \
 	$(srcdir)/gcov-io.h $(srcdir)/gcov-io.c gcov-iov.h
 
 libgcov.a: libgcc.a; @true
--- gcc/mklibgcc.in.jj	2004-10-23 21:18:15.000000000 +0200
+++ gcc/mklibgcc.in	2005-02-23 10:00:46.010164416 +0100
@@ -13,6 +13,7 @@
 # LIBGCOV
 # LIB2ADD
 # LIB2ADD_ST 
+# LIB2ADD_SH
 # LIB2ADDEH
 # LIB2ADDEHSTATIC
 # LIB2ADDEHSHARED
@@ -279,6 +280,26 @@ for file in $LIB2ADD_ST; do
   libgcc2_st_objs="$libgcc2_st_objs ${oname}${objext}"
 done
 
+if [ "$SHLIB_LINK" ]; then
+  for file in $LIB2ADD_SH; do
+    name=`echo $file | sed -e 's/[.][cSo]$//' -e 's/[.]asm$//' -e 's/[.]txt$//'`
+    oname=`echo $name | sed -e 's,.*/,,'`
+
+    for ml in $MULTILIBS; do
+      dir=`echo ${ml} | sed -e 's/;.*$//' -e 's/=/$(EQ)/g'`
+      flags=`echo ${ml} | sed -e 's/^[^;]*;//' -e 's/@/ -/g'`;
+      out="libgcc/${dir}/${oname}${objext}"
+      if [ ${name}.asm = ${file} ]; then
+	flags="$flags -xassembler-with-cpp"
+      fi
+
+      echo $out: stmp-dirs $file
+      echo "	$gcc_compile" $flags -c $file -o $out
+    done
+    libgcc2_sh_objs="$libgcc2_sh_objs ${oname}${objext}"
+  done
+fi
+
 if [ "$LIBUNWIND" ]; then
   libunwind_static_objs=""
   libunwind_shared_objs=""
@@ -346,6 +367,9 @@ for ml in $MULTILIBS; do
     libgcc_eh_shared_objs="$libgcc_eh_shared_objs libgcc/${dir}/$o"
   done
   libgcc_sh_objs="$libgcc_objs $libgcc_eh_shared_objs"
+  for o in $libgcc2_sh_objs; do
+    libgcc_sh_objs="$libgcc_sh_objs libgcc/${dir}/$o"
+  done
   shlib_deps="$libgcc_sh_objs"
 
   libgcc_st_objs=""
--- gcc/config/rs6000/t-linux64.jj	2004-03-17 16:16:20.000000000 +0100
+++ gcc/config/rs6000/t-linux64	2005-02-23 10:12:09.127809856 +0100
@@ -1,8 +1,9 @@
 
 #rs6000/t-linux64
 
-LIB2FUNCS_EXTRA = tramp.S $(srcdir)/config/rs6000/ppc64-fp.c \
-	$(srcdir)/config/rs6000/darwin-ldouble.c
+LIB2FUNCS_EXTRA = tramp.S $(srcdir)/config/rs6000/ppc64-fp.c
+LIB2FUNCS_STATIC_EXTRA = eabi.S $(srcdir)/config/rs6000/darwin-ldouble.c
+LIB2FUNCS_SHARED_EXTRA = $(srcdir)/config/rs6000/darwin-ldouble-shared.c
 
 TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC -specs=bispecs
 
--- gcc/config/rs6000/darwin-ldouble.c.jj	2005-02-23 09:49:58.000000000 +0100
+++ gcc/config/rs6000/darwin-ldouble.c	2005-02-23 10:14:09.321463511 +0100
@@ -63,7 +63,7 @@ extern long double __gcc_qsub (double, d
 extern long double __gcc_qmul (double, double, double, double);
 extern long double __gcc_qdiv (double, double, double, double);
 
-#ifdef __ELF__
+#if defined __ELF__ && defined IN_LIBGCC2_S
 /* Provide definitions of the old symbol names to statisfy apps and
    shared libs built against an older libgcc.  To access the _xlq
    symbols an explicit version reference is needed, so these won't
--- gcc/config/rs6000/darwin-ldouble-shared.c.jj	2005-02-23 10:13:19.712274080 +0100
+++ gcc/config/rs6000/darwin-ldouble-shared.c	2005-02-23 10:13:57.428575681 +0100
@@ -0,0 +1,2 @@
+#define IN_LIBGCC2_S 1
+#include "darwin-ldouble.c"


	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [3.4 PATCH] Fix powerpc*-*-linux* bootstrap (PR target/19019)
  2005-02-23 17:43       ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Jakub Jelinek
@ 2005-02-23 19:28         ` David Edelsohn
  2005-02-23 19:33           ` [PATCH] Fix powerpc*-*-linux* libgcc.a " Jakub Jelinek
  2005-02-24 14:08           ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-02-23 19:28 UTC (permalink / raw)
  To: Jakub Jelinek, Alan Modra; +Cc: Richard Henderson, Geoff Keating, gcc-patches

	This patch looks reasonable to me.

	Alan, any comment?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] Fix powerpc*-*-linux* libgcc.a (PR target/19019)
  2005-02-23 19:28         ` David Edelsohn
@ 2005-02-23 19:33           ` Jakub Jelinek
  2005-02-24 14:08           ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Jakub Jelinek @ 2005-02-23 19:33 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Alan Modra, Richard Henderson, Geoff Keating, gcc-patches

On Wed, Feb 23, 2005 at 11:31:17AM -0500, David Edelsohn wrote:
> 	This patch looks reasonable to me.

Here is what I'm testing ATM for mainline:

2005-02-23  Jakub Jelinek  <jakub@redhat.com>

	PR target/19019
	* mklibgcc.in: Pass -DSHARED when compiling all *_s${objext} objects.
	* config/rs6000/darwin-ldouble.c: Only use the .symver directives
	if SHARED is defined.

--- gcc/mklibgcc.in.jj	2005-02-23 13:40:49.000000000 +0100
+++ gcc/mklibgcc.in	2005-02-23 14:16:11.652743072 +0100
@@ -219,7 +219,7 @@ for ml in $MULTILIBS; do
 
       echo ${outS}: stmp-dirs '$(srcdir)/config/$(LIB1ASMSRC)'
       echo "	$gcc_compile" $flags -DL$name -xassembler-with-cpp \
-	  -c '$(srcdir)/config/$(LIB1ASMSRC)' -o $outS
+	  -c '$(srcdir)/config/$(LIB1ASMSRC)' -DSHARED -o $outS
 
       echo ${out}: stmp-dirs '$(srcdir)/config/$(LIB1ASMSRC)' ${outV}
       echo "	$gcc_compile" $flags -DL$name -xassembler-with-cpp \
@@ -251,7 +251,8 @@ for ml in $MULTILIBS; do
       outS="libgcc/${dir}/${name}_s${objext}"
 
       echo $outS: $libgcc2_c_dep
-      echo "	$gcc_compile" $flags -DL$name -c '$(srcdir)/libgcc2.c' -o $outS
+      echo "	$gcc_compile" $flags -DL$name -c '$(srcdir)/libgcc2.c' \
+	-DSHARED -o $outS
 
       echo $out: $libgcc2_c_dep
       echo "	$gcc_compile" $flags -DL$name '$(vis_hide)' \
@@ -285,7 +286,7 @@ for ml in $MULTILIBS; do
       outS="libgcc/${dir}/${name}_s${objext}"
 
       echo $outS: $libgcc2_c_dep
-      echo "	$gcc_compile" $flags -DL$name \
+      echo "	$gcc_compile" $flags -DL$name -DSHARED \
         -fexceptions -fnon-call-exceptions -c '$(srcdir)/libgcc2.c' -o $outS
 
       echo $out: $libgcc2_c_dep
@@ -318,7 +319,7 @@ for ml in $MULTILIBS; do
 
 	echo $outS: $FPBIT $fpbit_c_dep
 	echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
-	  -c $FPBIT -o $outS
+	  -DSHARED -c $FPBIT -o $outS
 
         echo $out: $FPBIT $fpbit_c_dep
         echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
@@ -348,7 +349,7 @@ for ml in $MULTILIBS; do
 
 	echo $outS: $DPBIT $fpbit_c_dep
 	echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
-	  -c $DPBIT -o $outS
+	  -DSHARED -c $DPBIT -o $outS
 
         echo $out: $DPBIT $fpbit_c_dep
         echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
@@ -378,7 +379,7 @@ for ml in $MULTILIBS; do
 
 	echo $outS: $TPBIT $fpbit_c_dep
 	echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
-	  -c $TPBIT -o $outS
+	  -DSHARED -c $TPBIT -o $outS
 
         echo $out: $TPBIT $fpbit_c_dep
         echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
@@ -411,7 +412,7 @@ for ml in $MULTILIBS; do
       case $file in
 	*.c)
 	  echo $outS: stmp-dirs $file $libgcc_dep
-	  echo "	$gcc_compile" $flags -c $file -o $outS
+	  echo "	$gcc_compile" $flags -c $file -DSHARED -o $outS
 
 	  echo $out: stmp-dirs $file $libgcc_dep
 	  echo "	$gcc_compile" $flags '$(vis_hide)' -c $file -o $out
@@ -422,7 +423,7 @@ for ml in $MULTILIBS; do
 
 	  echo $outS: stmp-dirs $file $libgcc_dep
 	  echo "	$gcc_compile" $flags -xassembler-with-cpp \
-	         -c $file -o $outS
+	         -DSHARED -c $file -o $outS
 
 	  echo $out: stmp-dirs $file $libgcc_dep $outV
 	  echo "	$gcc_compile" $flags -xassembler-with-cpp \
@@ -533,13 +534,13 @@ for ml in $MULTILIBS; do
 
       name=`echo $file | sed -e 's/[.]c$//'`
       oname=`echo $name | sed -e 's,.*/,,'`
-      out="libgcc/${dir}/${oname}_s${objext}"
+      outS="libgcc/${dir}/${oname}_s${objext}"
 
-      echo $out: stmp-dirs $file $LIB2ADDEHDEP $libgcc_dep
-      echo "	$gcc_compile" $flags -fexceptions -c $file -o $out
-      echo $libgcc_s_so: $out
+      echo $outS: stmp-dirs $file $LIB2ADDEHDEP $libgcc_dep
+      echo "	$gcc_compile" $flags -fexceptions -DSHARED -c $file -o $outS
+      echo $libgcc_s_so: $outS
       if [ "$SHLIB_MKMAP" ]; then
-	echo libgcc/${dir}/libgcc.map: $out
+	echo libgcc/${dir}/libgcc.map: $outS
       fi
     done
 
--- gcc/config/rs6000/darwin-ldouble.c.jj	2005-02-16 00:16:46.000000000 +0100
+++ gcc/config/rs6000/darwin-ldouble.c	2005-02-23 14:20:14.675546974 +0100
@@ -67,7 +67,7 @@ extern long double __gcc_qsub (double, d
 extern long double __gcc_qmul (double, double, double, double);
 extern long double __gcc_qdiv (double, double, double, double);
 
-#ifdef __ELF__
+#if defined __ELF__ && defined SHARED
 /* Provide definitions of the old symbol names to statisfy apps and
    shared libs built against an older libgcc.  To access the _xlq
    symbols an explicit version reference is needed, so these won't


	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [3.4 PATCH] Fix powerpc*-*-linux* bootstrap (PR target/19019)
  2005-02-23 19:28         ` David Edelsohn
  2005-02-23 19:33           ` [PATCH] Fix powerpc*-*-linux* libgcc.a " Jakub Jelinek
@ 2005-02-24 14:08           ` Alan Modra
  2005-02-24 14:51             ` Jakub Jelinek
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-02-24 14:08 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Jakub Jelinek, Richard Henderson, Geoff Keating, gcc-patches

On Wed, Feb 23, 2005 at 11:31:17AM -0500, David Edelsohn wrote:
> 	This patch looks reasonable to me.
> 
> 	Alan, any comment?

Looks OK to me too.  I was developing a competing patch when I saw
Jakub's post last night, but was stuck in shell quoting hell.  I think
my patch is more elegant (I'm biased, of course!), but Jakub's patch is
simpler in the sense that he doesn't try to use much shell magic.  It's
probably safer to go with Jakub's patch for 3.4, and of course this
patch isn't needed on mainline.  Anyway, just in case some problem
arises with Jakub's patch, here's mine.

The idea here is to not mark the versioned symbols hidden, and to add
	strip -w -N '*@GCC*' $@
to the darwin-ldouble.oS rule in libgcc.mk, which will remove the
versioned symbols.  (We know strip is the GNU one, because it is
powerpc64-linux-strip.)
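
The first half of that idea (leave versioned symbols out of the .hidden
list) can be sketched on its own.  The awk program below mirrors the one
in mklibgcc.in with the extra @GCC test added; hidden_directives is a
made-up name and the two nm lines are invented sample input.  The strip
step is not shown.

```sh
#!/bin/sh
# Sketch: turn nm-style output into .hidden directives, skipping undefined
# and typeless symbols (types U and N) and anything with an @GCC version.
hidden_directives() {
  awk 'NF == 3 && $2 !~ /^[UN]$/ && $3 !~ /@GCC/ { print "\t.hidden", $3 }'
}

# Invented sample: one plain symbol and one versioned symbol.
printf '%s\n' \
  '0000000000000000 T __gcc_qadd' \
  '0000000000000000 T _xlqadd@GCC_3.4' |
  hidden_directives
```

Only __gcc_qadd gets a .hidden line here; the versioned name is passed
over and would be removed later by the strip rule.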

	* Makefile.in (libgcc.mk): Pass STATIC_LIBGCC_OBJECT.
	* mklibgcc.in (STATIC_LIBGCC_OBJECT): Define and use.
	* config/rs6000/t-linux64 (STATIC_LIBGCC_OBJECT): Define.

Tested on native powerpc64-linux and i686-linux builds and an
i686-linux -> powerpc64-linux cross.

diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/Makefile.in gcc-3.4/gcc/Makefile.in
--- gcc-3.4-virgin/gcc/Makefile.in	2005-01-21 22:04:53.000000000 +1030
+++ gcc-3.4/gcc/Makefile.in	2005-02-23 17:44:30.000000000 +1030
@@ -1178,6 +1178,7 @@ libgcc.mk: config.status Makefile mklibg
 	SHLIB_MAPFILES='$(SHLIB_MAPFILES)' \
 	SHLIB_NM_FLAGS='$(SHLIB_NM_FLAGS)' \
 	MULTILIB_OSDIRNAMES='$(MULTILIB_OSDIRNAMES)' \
+	STATIC_LIBGCC_OBJECT='$(STATIC_LIBGCC_OBJECT)' \
 	mkinstalldirs='$(SHELL) $(srcdir)/mkinstalldirs' \
 	  $(SHELL) mklibgcc > tmp-libgcc.mk
 	mv tmp-libgcc.mk libgcc.mk
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/t-linux64 gcc-3.4/gcc/config/rs6000/t-linux64
--- gcc-3.4-virgin/gcc/config/rs6000/t-linux64	2004-03-21 18:36:48.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/t-linux64	2005-02-24 09:23:12.246515629 +1030
@@ -4,6 +4,8 @@
 LIB2FUNCS_EXTRA = tramp.S $(srcdir)/config/rs6000/ppc64-fp.c \
 	$(srcdir)/config/rs6000/darwin-ldouble.c
 
+STATIC_LIBGCC_OBJECT = echo '\''	( '$(NM_FOR_TARGET)' '\''${SHLIB_NM_FLAGS} $${o}'\'' | $(AWK) '\'\\\'\''NF == 3 && $$$$2 !~ /^[UN]$$$$/ && $$$$3 !~ /@GCC/ { print "\t.hidden", $$$$3 }'\'\\\'\''; cat libgcc/'\''$${dir}'\''/stacknote.s ) | $$(GCC_FOR_TARGET) $$(LIBGCC2_CFLAGS) '\''$${flags}'\'' -r -nostdinc -nostdlib -o $$@ '\''$${o}'\'' -xassembler -'\''; test $${o} != libgcc/$${dir}/darwin-ldouble.o || ( echo -n '\''	'\''; if [ -f ./strip ] ; then echo -n ./strip ; elif [ -f $(objdir)/../binutils/strip ] ; then echo -n $(objdir)/../binutils/strip ; else if [ "$(host)" = "$(target)" ] ; then echo -n strip; else echo -n strip | sed -e '\''$(program_transform_name)'\'' ; fi; fi; echo '\'' -w -N '\'\\\''*@GCC*'\\\'\'' $$@'\'' )
+
 TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC -specs=bispecs
 
 SHLIB_MAPFILES += $(srcdir)/config/rs6000/libgcc-ppc64.ver
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/mklibgcc.in gcc-3.4/gcc/mklibgcc.in
--- gcc-3.4-virgin/gcc/mklibgcc.in	2004-10-20 20:50:00.000000000 +0930
+++ gcc-3.4/gcc/mklibgcc.in	2005-02-23 22:41:26.761371598 +1030
@@ -40,6 +40,7 @@
 # SHLIB_NM_FLAGS
 # SHLIB_INSTALL
 # MULTILIB_OSDIRNAMES
+# STATIC_LIBGCC_OBJECT
 
 # Make needs VPATH to be literal.
 echo 'srcdir = @srcdir@'
@@ -323,6 +324,10 @@ for name in $LIBGCOV; do
   libgcov_objs="$libgcov_objs ${name}${objext}"
 done
 
+# non-GNU nm emits three fields even for undefined and typeless symbols,
+# so explicitly omit them
+test -n "$STATIC_LIBGCC_OBJECT" || STATIC_LIBGCC_OBJECT='echo '\''	( $(NM_FOR_TARGET) '\''${SHLIB_NM_FLAGS} ${o}'\'' | $(AWK) '\'\\\'\''NF == 3 && $$2 !~ /^[UN]$$/ { print "\t.hidden", $$3 }'\'\\\'\''; cat libgcc/'\''${dir}'\''/stacknote.s ) | $(GCC_FOR_TARGET) $(LIBGCC2_CFLAGS) '\''${flags}'\'' -r -nostdinc -nostdlib -o $@ '\''${o}'\'' -xassembler -'\'
+
 # SHLIB_MKMAP
 # SHLIB_MKMAP_OPTS
 # SHLIB_MAPFILES
@@ -409,9 +414,7 @@ EOF
       # .oS objects will have all non-local symbol definitions .hidden
       oS=`echo ${o} | sed s~${objext}'$~.oS~g'`
       echo "${oS}: stmp-dirs libgcc/${dir}/stacknote.s ${o}"
-      # non-GNU nm emits three fields even for undefined and typeless symbols,
-      # so explicitly omit them
-      echo '	( $(NM_FOR_TARGET) '${SHLIB_NM_FLAGS} ${o}' | $(AWK) '\''NF == 3 && $$2 !~ /^[UN]$$/ { print "\t.hidden", $$3 }'\''; cat libgcc/${dir}/stacknote.s ) | $(GCC_FOR_TARGET) $(LIBGCC2_CFLAGS) '${flags}' -r -nostdinc -nostdlib -o $@ '${o}' -xassembler -'
+      eval "$STATIC_LIBGCC_OBJECT"
       libgcc_a_objs="${libgcc_a_objs} ${oS}"
     done
   fi

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [3.4 PATCH] Fix powerpc*-*-linux* bootstrap (PR target/19019)
  2005-02-24 14:08           ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Alan Modra
@ 2005-02-24 14:51             ` Jakub Jelinek
  2005-02-24 14:59               ` Richard Henderson
  2005-02-24 15:09               ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: Jakub Jelinek @ 2005-02-24 14:51 UTC (permalink / raw)
  To: David Edelsohn, Richard Henderson, Geoff Keating, gcc-patches

On Thu, Feb 24, 2005 at 10:01:56AM +1030, Alan Modra wrote:
> Looks OK to me too.  I was developing a competing patch when I saw
> Jakub's post last night, but was stuck in shell quoting hell.  I think
> my patch is more elegant (I'm biased, of course!), but Jakub's patch is
> simpler in the sense that he doesn't try to use much shell magic.  It's
> probably safer to go with Jakub's patch for 3.4, and of course this
> patch isn't needed on mainline.  Anyway, just in case some problem
> arises with Jakub's patch, here's mine.

Well, something is needed on mainline too, see
http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01418.html
Although GCC builds there, it is a bad idea to have versioned but hidden
symbols put into other binaries/shared libraries.
But IMHO passing -DSHARED to all *_s.o objects is desirable, if for
nothing else than for consistency (it is already used e.g. for the ia64
.symver stuff).  And once the mklibgcc.in change is in, the ppc64 fix is a
one-liner.

FYI, I have bootstrapped/regtested both the 4.0 and 3.4 patch on 7 linux
arches (including ppc and ppc64) each with no regressions.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [3.4 PATCH] Fix powerpc*-*-linux* bootstrap (PR target/19019)
  2005-02-24 14:51             ` Jakub Jelinek
@ 2005-02-24 14:59               ` Richard Henderson
  2005-02-24 15:09               ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2005-02-24 14:59 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: David Edelsohn, Geoff Keating, gcc-patches

On Wed, Feb 23, 2005 at 07:09:13PM -0500, Jakub Jelinek wrote:
> FYI, I have bootstrapped/regtested both the 4.0 and 3.4 patch on 7 linux
> arches (inclusing ppc and ppc64) each with no regressions.

Ok to both.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [3.4 PATCH] Fix powerpc*-*-linux* bootstrap (PR target/19019)
  2005-02-24 14:51             ` Jakub Jelinek
  2005-02-24 14:59               ` Richard Henderson
@ 2005-02-24 15:09               ` Alan Modra
  2005-02-24 15:17                 ` Jakub Jelinek
  2005-02-25  2:12                 ` [PATCH] Fix powerpc*-*-linux* libgcc.a (PR target/19019, take 2) Jakub Jelinek
  1 sibling, 2 replies; 875+ messages in thread
From: Alan Modra @ 2005-02-24 15:09 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: David Edelsohn, Richard Henderson, Geoff Keating, gcc-patches

On Wed, Feb 23, 2005 at 07:09:13PM -0500, Jakub Jelinek wrote:
> Well, on mainline is something needed too, see
> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01418.html

Saw it.  Looks good to me.  One refinement would be to push the
-DSHARED into a new gcc_compile_s var.  That would lend itself to
compiling with different options for libgcc.a vs libgcc_s.so
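
A minimal sketch of that refinement, assuming the variable simply extends
the existing gcc_compile string.  The suggestion above calls the variable
gcc_compile_s; the sketch writes it as gcc_s_compile.  emit_rules and its
file names are hypothetical and only there to show the two rule flavours
side by side.

```sh
#!/bin/sh
# Sketch: keep the shared-only flags in one variable instead of repeating
# -DSHARED on every rule that builds an *_s object.
gcc_compile='$(GCC_FOR_TARGET) $(LIBGCC2_CFLAGS) $(INCLUDES)'
gcc_s_compile="$gcc_compile -DSHARED"

# Hypothetical helper: print make rules for the static and the shared
# object built from one source file.
emit_rules() {
  src=$1 out=$2 outS=$3
  printf '%s: %s\n\t%s -c %s -o %s\n' "$out" "$src" "$gcc_compile" "$src" "$out"
  printf '%s: %s\n\t%s -c %s -o %s\n' "$outS" "$src" "$gcc_s_compile" "$src" "$outS"
}

emit_rules foo.c libgcc/foo.o libgcc/foo_s.o
```

Any further shared-only option would then be added in one place, which is
the point of the suggestion.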

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [3.4 PATCH] Fix powerpc*-*-linux* bootstrap (PR target/19019)
  2005-02-24 15:09               ` Alan Modra
@ 2005-02-24 15:17                 ` Jakub Jelinek
  2005-02-25  2:12                 ` [PATCH] Fix powerpc*-*-linux* libgcc.a (PR target/19019, take 2) Jakub Jelinek
  1 sibling, 0 replies; 875+ messages in thread
From: Jakub Jelinek @ 2005-02-24 15:17 UTC (permalink / raw)
  To: David Edelsohn, Richard Henderson, Geoff Keating, gcc-patches

On Thu, Feb 24, 2005 at 11:19:18AM +1030, Alan Modra wrote:
> On Wed, Feb 23, 2005 at 07:09:13PM -0500, Jakub Jelinek wrote:
> > Well, on mainline is something needed too, see
> > http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01418.html
> 
> Saw it.  Looks good to me.  One refinement would be to push the
> -DSHARED into a new gcc_compile_s var.  That would lend itself to
> compiling with different options for libgcc.a vs libgcc_s.so

I like this, will test tomorrow.

	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] Fix powerpc*-*-linux* libgcc.a (PR target/19019, take 2)
  2005-02-24 15:09               ` Alan Modra
  2005-02-24 15:17                 ` Jakub Jelinek
@ 2005-02-25  2:12                 ` Jakub Jelinek
  2005-02-25  2:18                   ` Richard Henderson
  1 sibling, 1 reply; 875+ messages in thread
From: Jakub Jelinek @ 2005-02-25  2:12 UTC (permalink / raw)
  To: David Edelsohn, Richard Henderson, Geoff Keating, gcc-patches

On Thu, Feb 24, 2005 at 11:19:18AM +1030, Alan Modra wrote:
> On Wed, Feb 23, 2005 at 07:09:13PM -0500, Jakub Jelinek wrote:
> > Well, on mainline is something needed too, see
> > http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01418.html
> 
> Saw it.  Looks good to me.  One refinement would be to push the
> -DSHARED into a new gcc_compile_s var.  That would lend itself to
> compiling with different options for libgcc.a vs libgcc_s.so

Here is a new trunk patch that implements Alan's suggestion.
Bootstrapped/regtested on 7 linux arches.
Ok to commit this instead?

2005-02-24  Jakub Jelinek  <jakub@redhat.com>

	PR target/19019
	* mklibgcc.in: Pass -DSHARED when compiling all *_s${objext} objects.
	* config/rs6000/darwin-ldouble.c: Only use the .symver directives
	if SHARED is defined.

--- gcc/mklibgcc.in.jj	2005-02-23 13:40:49.000000000 +0100
+++ gcc/mklibgcc.in	2005-02-24 10:39:22.063448154 +0100
@@ -1,6 +1,6 @@
 #!/bin/sh
 # Construct makefile for libgcc.
-#   Copyright (C) 2000, 2002, 2003 Free Software Foundation, Inc.
+#   Copyright (C) 2000, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
 #
 # This file is part of GCC.
 
@@ -72,6 +72,7 @@ fi
 # Build lines.
 
 gcc_compile='$(GCC_FOR_TARGET) $(LIBGCC2_CFLAGS) $(INCLUDES)'
+gcc_s_compile="$gcc_compile -DSHARED"
 make_compile='$(MAKE) GCC_FOR_TARGET="$(GCC_FOR_TARGET)" \
 	  AR_FOR_TARGET="$(AR_FOR_TARGET)" \
 	  AR_CREATE_FOR_TARGET="$(AR_CREATE_FOR_TARGET)" \
@@ -218,7 +219,7 @@ for ml in $MULTILIBS; do
       outV="libgcc/${dir}/${name}.vis"
 
       echo ${outS}: stmp-dirs '$(srcdir)/config/$(LIB1ASMSRC)'
-      echo "	$gcc_compile" $flags -DL$name -xassembler-with-cpp \
+      echo "	$gcc_s_compile" $flags -DL$name -xassembler-with-cpp \
 	  -c '$(srcdir)/config/$(LIB1ASMSRC)' -o $outS
 
       echo ${out}: stmp-dirs '$(srcdir)/config/$(LIB1ASMSRC)' ${outV}
@@ -251,7 +252,8 @@ for ml in $MULTILIBS; do
       outS="libgcc/${dir}/${name}_s${objext}"
 
       echo $outS: $libgcc2_c_dep
-      echo "	$gcc_compile" $flags -DL$name -c '$(srcdir)/libgcc2.c' -o $outS
+      echo "	$gcc_s_compile" $flags -DL$name -c '$(srcdir)/libgcc2.c' \
+	-o $outS
 
       echo $out: $libgcc2_c_dep
       echo "	$gcc_compile" $flags -DL$name '$(vis_hide)' \
@@ -285,7 +287,7 @@ for ml in $MULTILIBS; do
       outS="libgcc/${dir}/${name}_s${objext}"
 
       echo $outS: $libgcc2_c_dep
-      echo "	$gcc_compile" $flags -DL$name \
+      echo "	$gcc_s_compile" $flags -DL$name \
         -fexceptions -fnon-call-exceptions -c '$(srcdir)/libgcc2.c' -o $outS
 
       echo $out: $libgcc2_c_dep
@@ -317,7 +319,7 @@ for ml in $MULTILIBS; do
 	outS="libgcc/${dir}/${name}_s${objext}"
 
 	echo $outS: $FPBIT $fpbit_c_dep
-	echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
+	echo "	$gcc_s_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
 	  -c $FPBIT -o $outS
 
         echo $out: $FPBIT $fpbit_c_dep
@@ -347,7 +349,7 @@ for ml in $MULTILIBS; do
 	outS="libgcc/${dir}/${name}_s${objext}"
 
 	echo $outS: $DPBIT $fpbit_c_dep
-	echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
+	echo "	$gcc_s_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
 	  -c $DPBIT -o $outS
 
         echo $out: $DPBIT $fpbit_c_dep
@@ -377,7 +379,7 @@ for ml in $MULTILIBS; do
 	outS="libgcc/${dir}/${name}_s${objext}"
 
 	echo $outS: $TPBIT $fpbit_c_dep
-	echo "	$gcc_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
+	echo "	$gcc_s_compile" -DFINE_GRAINED_LIBRARIES $flags -DL$name \
 	  -c $TPBIT -o $outS
 
         echo $out: $TPBIT $fpbit_c_dep
@@ -411,7 +413,7 @@ for ml in $MULTILIBS; do
       case $file in
 	*.c)
 	  echo $outS: stmp-dirs $file $libgcc_dep
-	  echo "	$gcc_compile" $flags -c $file -o $outS
+	  echo "	$gcc_s_compile" $flags -c $file -o $outS
 
 	  echo $out: stmp-dirs $file $libgcc_dep
 	  echo "	$gcc_compile" $flags '$(vis_hide)' -c $file -o $out
@@ -421,7 +423,7 @@ for ml in $MULTILIBS; do
 	  outV="libgcc/${dir}/${oname}.vis"
 
 	  echo $outS: stmp-dirs $file $libgcc_dep
-	  echo "	$gcc_compile" $flags -xassembler-with-cpp \
+	  echo "	$gcc_s_compile" $flags -xassembler-with-cpp \
 	         -c $file -o $outS
 
 	  echo $out: stmp-dirs $file $libgcc_dep $outV
@@ -533,13 +535,13 @@ for ml in $MULTILIBS; do
 
       name=`echo $file | sed -e 's/[.]c$//'`
       oname=`echo $name | sed -e 's,.*/,,'`
-      out="libgcc/${dir}/${oname}_s${objext}"
+      outS="libgcc/${dir}/${oname}_s${objext}"
 
-      echo $out: stmp-dirs $file $LIB2ADDEHDEP $libgcc_dep
-      echo "	$gcc_compile" $flags -fexceptions -c $file -o $out
-      echo $libgcc_s_so: $out
+      echo $outS: stmp-dirs $file $LIB2ADDEHDEP $libgcc_dep
+      echo "	$gcc_s_compile" $flags -fexceptions -c $file -o $outS
+      echo $libgcc_s_so: $outS
       if [ "$SHLIB_MKMAP" ]; then
-	echo libgcc/${dir}/libgcc.map: $out
+	echo libgcc/${dir}/libgcc.map: $outS
       fi
     done
 
@@ -592,7 +594,7 @@ for ml in $MULTILIBS; do
 	echo "	$gcc_compile $flags -fexceptions \$(vis_hide) -c $file -o $out"
 
 	echo $outS: stmp-dirs $file $LIBUNWINDDEP
-	echo "	$gcc_compile $flags -fexceptions -DSHARED -c $file -o $outS"
+	echo "	$gcc_s_compile $flags -fexceptions -c $file -o $outS"
 
 	echo $libunwind_a: $out
 	echo $libunwind_so: $outS
--- gcc/config/rs6000/darwin-ldouble.c.jj	2005-02-16 00:16:46.000000000 +0100
+++ gcc/config/rs6000/darwin-ldouble.c	2005-02-24 10:34:42.783160135 +0100
@@ -67,7 +67,7 @@ extern long double __gcc_qsub (double, d
 extern long double __gcc_qmul (double, double, double, double);
 extern long double __gcc_qdiv (double, double, double, double);
 
-#ifdef __ELF__
+#if defined __ELF__ && defined SHARED
 /* Provide definitions of the old symbol names to statisfy apps and
    shared libs built against an older libgcc.  To access the _xlq
    symbols an explicit version reference is needed, so these won't


	Jakub

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix powerpc*-*-linux* libgcc.a (PR target/19019, take 2)
  2005-02-25  2:12                 ` [PATCH] Fix powerpc*-*-linux* libgcc.a (PR target/19019, take 2) Jakub Jelinek
@ 2005-02-25  2:18                   ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2005-02-25  2:18 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: David Edelsohn, Geoff Keating, gcc-patches

On Thu, Feb 24, 2005 at 04:28:37PM -0500, Jakub Jelinek wrote:
> 	* mklibgcc.in: Pass -DSHARED when compiling all *_s${objext} objects.
> 	* config/rs6000/darwin-ldouble.c: Only use the .symver directives
> 	if SHARED is defined.

Ok.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Implicit altivec vs. linux kernel build
       [not found]   ` <653d1ad4308fa0e72f08252032f6c753@physics.uc.edu>
@ 2005-02-28 10:52     ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-02-28 10:52 UTC (permalink / raw)
  To: Andrew Pinski, gcc-patches
  Cc: David Edelsohn, Paul Mackerras, Geoff Keating, Benjamin Herrenschmidt

On Sun, Feb 27, 2005 at 06:40:31PM -0500, Andrew Pinski wrote:
> As I and Ben found out that does not work, altivec is still turned on
> and there is no way to turn it off. Alan has a patch which he is going
> to post (again) on my request to fix the problem with "-mcpu=power4 
> -maltivec",
> I don't know if -mno-altivec will work though.

See
http://gcc.gnu.org/ml/gcc/2005-02/msg01143.html
http://gcc.gnu.org/ml/gcc/2005-02/msg01142.html
http://gcc.gnu.org/ml/gcc-patches/2004-03/msg00688.html

	* config/rs6000/rs6000.c (rs6000_override_options): Don't allow
	-mcpu to override any other explicitly given flags.

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.788
diff -u -p -r1.788 rs6000.c
--- gcc/config/rs6000/rs6000.c	25 Feb 2005 01:16:06 -0000	1.788
+++ gcc/config/rs6000/rs6000.c	27 Feb 2005 23:49:07 -0000
@@ -1199,8 +1199,7 @@ rs6000_override_options (const char *def
 #endif
 
   /* Don't override these by the processor default if given explicitly.  */
-  set_masks &= ~(target_flags_explicit
-		 & (MASK_MULTIPLE | MASK_STRING | MASK_SOFT_FLOAT));
+  set_masks &= ~target_flags_explicit;
 
   /* Identify the processor type.  */
   rs6000_select[0].string = default_cpu;

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Implicit altivec vs. linux kernel build
       [not found]           ` <amodra@bigpond.net.au>
                               ` (34 preceding siblings ...)
  2005-02-15 19:41             ` [RFC] PowerPC 128 bit long double compatibility (PR target/19019) David Edelsohn
@ 2005-03-02 16:17             ` David Edelsohn
  2005-03-17  0:48             ` powerpc-linux unwinder fix for 3.4 David Edelsohn
                               ` (25 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-03-02 16:17 UTC (permalink / raw)
  To: Andrew Pinski, gcc-patches, Paul Mackerras, Geoff Keating,
	Benjamin Herrenschmidt

>>>>> Alan Modra writes:

	PR target/20277
	* config/rs6000/rs6000.c (rs6000_override_options): Don't allow
	-mcpu to override any other explicitly given flags.

	Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] PowerPC function arg alignment tidy
@ 2005-03-14 12:58 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-03-14 12:58 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This tidies the alignment calculations for function arg passing.  One
function that calculates alignment padding correctly for all possible
alignment values is better than three different calculations, each
making use of local knowledge about the restricted set of alignments
to optimize the code a little.

It takes a little thought to see that this patch doesn't change
anything, especially for rs6000_arg_partial_bytes.  You need to know
that function_arg_boundary()/PARM_BOUNDARY-1 can only be 0, 1 or 3 for
32-bit, and 0 or 1 for 64-bit.  Then it's not hard to see that changing
from
  ((TARGET_32BIT ? 2 : 0) - x) & align
to
  -((ABI_V4 ? 2 : 6) + x) & align
doesn't alter the padding.
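
The equivalence claimed above can be checked by brute force.  What follows
is only a sketch for illustration, not part of the patch: old_pad, new_pad,
count_mismatches and check_padding_equivalence are made-up names, the mask
sets {0, 1, 3} and {0, 1} come from the paragraph above, and the sketch
assumes that ABI_V4 is never in effect for 64-bit.

```c
#include <assert.h>

/* Padding from the old rs6000_arg_partial_bytes expression.  */
static unsigned int
old_pad (int target_32bit, unsigned int x, unsigned int align)
{
  return ((target_32bit ? 2u : 0u) - x) & align;
}

/* Padding from the new rs6000_parm_start expression.  */
static unsigned int
new_pad (int abi_v4, unsigned int x, unsigned int align)
{
  return -((abi_v4 ? 2u : 6u) + x) & align;
}

/* Count the (abi, mask, x) combinations where the two disagree.  */
static unsigned int
count_mismatches (void)
{
  static const unsigned int masks32[] = { 0, 1, 3 };
  static const unsigned int masks64[] = { 0, 1 };
  unsigned int bad = 0, x, i;
  int abi_v4;

  for (x = 0; x < 64; x++)
    {
      /* 32-bit: try both parm offsets, 2 (V4) and 6 (others).  */
      for (abi_v4 = 0; abi_v4 <= 1; abi_v4++)
        for (i = 0; i < 3; i++)
          bad += old_pad (1, x, masks32[i]) != new_pad (abi_v4, x, masks32[i]);

      /* 64-bit: assumed never ABI_V4, so the parm offset is 6.  */
      for (i = 0; i < 2; i++)
        bad += old_pad (0, x, masks64[i]) != new_pad (0, x, masks64[i]);
    }
  return bad;
}

/* Abort if any combination disagrees.  */
static void
check_padding_equivalence (void)
{
  assert (count_mismatches () == 0);
}
```

The reason no mismatch is expected: under a mask of at most 3 only the
value modulo 4 matters, and 2, -2 and -6 are all congruent modulo 4; under
a mask of at most 1 only parity matters, and 0 and 6 are both even.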

	* config/rs6000/rs6000.c (rs6000_parm_start): New function.
	(function_arg_advance): Use rs6000_parm_start.
	(function_arg, rs6000_arg_partial_bytes): Likewise.

Bootstrapped and regression tested powerpc-linux.  OK for mainline?

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.794
diff -u -p -r1.794 rs6000.c
--- gcc/config/rs6000/rs6000.c	14 Mar 2005 07:24:25 -0000	1.794
+++ gcc/config/rs6000/rs6000.c	14 Mar 2005 09:50:53 -0000
@@ -4060,6 +4060,20 @@ function_arg_boundary (enum machine_mode
     return PARM_BOUNDARY;
 }
 
+/* For a function parm of MODE and TYPE, return the starting word in
+   the parameter area.  NWORDS of the parameter area are already used.  */
+
+static unsigned int
+rs6000_parm_start (enum machine_mode mode, tree type, unsigned int nwords)
+{
+  unsigned int align;
+  unsigned int parm_offset;
+
+  align = function_arg_boundary (mode, type) / PARM_BOUNDARY - 1;
+  parm_offset = DEFAULT_ABI == ABI_V4 ? 2 : 6;
+  return nwords + (-(parm_offset + nwords) & align);
+}
+
 /* Compute the size (in words) of a function argument.  */
 
 static unsigned long
@@ -4314,15 +4328,10 @@ function_arg_advance (CUMULATIVE_ARGS *c
   else
     {
       int n_words = rs6000_arg_size (mode, type);
-      int align = function_arg_boundary (mode, type) / PARM_BOUNDARY - 1;
+      int start_words = cum->words;
+      int align_words = rs6000_parm_start (mode, type, start_words);
 
-      /* The simple alignment calculation here works because
-	 function_arg_boundary / PARM_BOUNDARY will only be 1 or 2.
-	 If we ever want to handle alignments larger than 8 bytes for
-	 32-bit or 16 bytes for 64-bit, then we'll need to take into
-	 account the offset to the start of the parm save area.  */
-      align &= cum->words;
-      cum->words += align + n_words;
+      cum->words = align_words + n_words;
 
       if (GET_MODE_CLASS (mode) == MODE_FLOAT
 	  && TARGET_HARD_FLOAT && TARGET_FPRS)
@@ -4335,7 +4344,7 @@ function_arg_advance (CUMULATIVE_ARGS *c
 	  fprintf (stderr, "nargs = %4d, proto = %d, mode = %4s, ",
 		   cum->nargs_prototype, cum->prototype, GET_MODE_NAME (mode));
 	  fprintf (stderr, "named = %d, align = %d, depth = %d\n",
-		   named, align, depth);
+		   named, align_words - start_words, depth);
 	}
     }
 }
@@ -4826,8 +4835,7 @@ function_arg (CUMULATIVE_ARGS *cum, enum
     }
   else
     {
-      int align = function_arg_boundary (mode, type) / PARM_BOUNDARY - 1;
-      int align_words = cum->words + (cum->words & align);
+      int align_words = rs6000_parm_start (mode, type, cum->words);
 
       if (USE_FP_FOR_ARG_P (cum, mode, type))
 	{
@@ -4940,8 +4948,6 @@ rs6000_arg_partial_bytes (CUMULATIVE_ARG
 			  tree type, bool named)
 {
   int ret = 0;
-  int align;
-  int parm_offset;
   int align_words;
 
   if (DEFAULT_ABI == ABI_V4)
@@ -4957,9 +4963,7 @@ rs6000_arg_partial_bytes (CUMULATIVE_ARG
       && int_size_in_bytes (type) > 0)
     return 0;
 
-  align = function_arg_boundary (mode, type) / PARM_BOUNDARY - 1;
-  parm_offset = TARGET_32BIT ? 2 : 0;
-  align_words = cum->words + ((parm_offset - cum->words) & align);
+  align_words = rs6000_parm_start (mode, type, cum->words);
 
   if (USE_FP_FOR_ARG_P (cum, mode, type)
       /* If we are passing this arg in gprs as well, then this function

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* powerpc-linux unwinder fix for 3.4
@ 2005-03-17  0:27 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-03-17  0:27 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

Hi David,
  In http://gcc.gnu.org/ml/gcc-patches/2005-01/msg02170.html, I fixed
a number of problems with the dwarf2 unwinder and mentioned that I'd
tested code on the 3.4 branch, but didn't ask permission to apply
there.  Now people want to use vdso enabled kernels with 3.4 libgcc,
so I'd like to apply the changes to the branch too.  OK?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc-linux unwinder fix for 3.4
       [not found]           ` <amodra@bigpond.net.au>
                               ` (35 preceding siblings ...)
  2005-03-02 16:17             ` Implicit altivec vs. linux kernel build David Edelsohn
@ 2005-03-17  0:48             ` David Edelsohn
  2005-03-20 23:01             ` [PATCH] PowerPC function arg alignment tidy David Edelsohn
                               ` (24 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-03-17  0:48 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> In http://gcc.gnu.org/ml/gcc-patches/2005-01/msg02170.html, I fixed
Alan> a number of problems with the dwarf2 unwinder and mentioned that I'd
Alan> tested code on the 3.4 branch, but didn't ask permission to apply
Alan> there.  Now people want to use vdso enabled kernels with 3.4 libgcc,
Alan> so I'd like to apply the changes to the branch too.  OK?

	Okay.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] PowerPC function arg alignment tidy
       [not found]           ` <amodra@bigpond.net.au>
                               ` (36 preceding siblings ...)
  2005-03-17  0:48             ` powerpc-linux unwinder fix for 3.4 David Edelsohn
@ 2005-03-20 23:01             ` David Edelsohn
  2005-03-31  0:16             ` [RS6000] Fix PR20611, duplicate label for inlined function referencing TLS David Edelsohn
                               ` (23 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-03-20 23:01 UTC (permalink / raw)
  To: gcc-patches

> 	* config/rs6000/rs6000.c (rs6000_parm_start): New function.
> 	(function_arg_advance): Use rs6000_parm_start.
> 	(function_arg, rs6000_arg_partial_bytes): Likewise.
> 
> Bootstrapped and regression tested powerpc-linux.  OK for mainline?

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [RS6000] Fix PR20611, duplicate label for inlined function referencing TLS
@ 2005-03-30  3:38 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-03-30  3:38 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch avoids the duplicate label by simply not emitting a label for
load_toc_v4_PIC_1b.  I don't believe there is any need for the label on
this insn.  Removing the (use (unspec ..)) on the pattern doesn't affect
uses_TOC() because this particular unspec doesn't match anyway (nor
should it, the toc use here doesn't require the word emitted when
uses_TOC() is true).  Also, I think that changing the (set (reg) (symref))
to (set (reg) (unspec)) won't affect any of the places in rs6000.c that
recognize symbol_refs.  The symbol_ref in the set was previously a code
label, not one of the interesting symbol_refs to a variable or somesuch.

	PR target/20611
	* config/rs6000/rs6000.md (load_toc_v4_PIC_1b): Remove inline
	label operand.  Remove (use (unspec..)).  Don't emit a label on
	the offset word.
	* config/rs6000/rs6000.c (rs6000_legitimize_tls_address): Don't
	generate inline label for load_toc_v4_PIC_1b.
	(rs6000_emit_load_toc_table): Likewise.

Bootstrap and regression test powerpc-linux in progress.  I'd like to
apply this mainline, 4.0, and 3.4.  OK?

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.798
diff -u -p -r1.798 rs6000.c
--- gcc/config/rs6000/rs6000.c	29 Mar 2005 16:10:06 -0000	1.798
+++ gcc/config/rs6000/rs6000.c	30 Mar 2005 01:54:10 -0000
@@ -2809,21 +2809,16 @@ rs6000_legitimize_tls_address (rtx addr,
 		rs6000_emit_move (got, gsym, Pmode);
 	      else
 		{
-		  char buf[30];
-		  static int tls_got_labelno = 0;
-		  rtx tempLR, lab, tmp3, mem;
+		  rtx tempLR, tmp3, mem;
 		  rtx first, last;
 
-		  ASM_GENERATE_INTERNAL_LABEL (buf, "LTLS", tls_got_labelno++);
-		  lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
 		  tempLR = gen_reg_rtx (Pmode);
 		  tmp1 = gen_reg_rtx (Pmode);
 		  tmp2 = gen_reg_rtx (Pmode);
 		  tmp3 = gen_reg_rtx (Pmode);
 		  mem = gen_const_mem (Pmode, tmp1);
 
-		  first = emit_insn (gen_load_toc_v4_PIC_1b (tempLR, lab,
-							     gsym));
+		  first = emit_insn (gen_load_toc_v4_PIC_1b (tempLR, gsym));
 		  emit_move_insn (tmp1, tempLR);
 		  emit_move_insn (tmp2, mem);
 		  emit_insn (gen_addsi3 (tmp3, tmp1, tmp2));
@@ -12023,11 +12018,10 @@ rs6000_emit_load_toc_table (int fromprol
       rtx temp0 = (fromprolog
 		   ? gen_rtx_REG (Pmode, 0)
 		   : gen_reg_rtx (Pmode));
-      rtx symF;
 
       if (fromprolog)
 	{
-	  rtx symL;
+	  rtx symF, symL;
 
 	  ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
 	  symF = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
@@ -12045,14 +12039,9 @@ rs6000_emit_load_toc_table (int fromprol
       else
 	{
 	  rtx tocsym;
-	  static int reload_toc_labelno = 0;
 
 	  tocsym = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
-
-	  ASM_GENERATE_INTERNAL_LABEL (buf, "LCG", reload_toc_labelno++);
-	  symF = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
-
-	  emit_insn (gen_load_toc_v4_PIC_1b (tempLR, symF, tocsym));
+	  emit_insn (gen_load_toc_v4_PIC_1b (tempLR, tocsym));
 	  emit_move_insn (dest, tempLR);
 	  emit_move_insn (temp0, gen_rtx_MEM (Pmode, dest));
 	}
Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.357
diff -u -p -r1.357 rs6000.md
--- gcc/config/rs6000/rs6000.md	26 Mar 2005 17:35:41 -0000	1.357
+++ gcc/config/rs6000/rs6000.md	30 Mar 2005 01:54:12 -0000
@@ -10146,11 +10146,10 @@
 
 (define_insn "load_toc_v4_PIC_1b"
   [(set (match_operand:SI 0 "register_operand" "=l")
-	(match_operand:SI 1 "immediate_operand" "s"))
-   (use (unspec [(match_dup 1) (match_operand 2 "immediate_operand" "s")]
+	(unspec [(match_operand 1 "immediate_operand" "s")]
 		UNSPEC_TOCPTR))]
   "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
-  "bcl 20,31,%1+4\\n%1:\\n\\t.long %2-%1"
+  "bcl 20,31,$+8\\n\\t.long %1-$"
   [(set_attr "type" "branch")
    (set_attr "length" "8")])
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Patch ping! (4.1 projects, stage 1.2)  Hot/cold partitioning fixes
@ 2005-03-30 20:02 Caroline Tice
  2005-03-30 23:14 ` Geoffrey Keating
  2005-04-01 13:55 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Caroline Tice @ 2005-03-30 20:02 UTC (permalink / raw)
  To: gcc-patches@gcc.gnu.org Patches; +Cc: Caroline Tice


Fix current problems with hot/cold partitioning optimization
http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02230.html


Since stage 1.2 projects are supposed to be committed by this Friday,
I would appreciate someone taking a look at this soon.  Thanks!

-- Caroline Tice
ctice@apple.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Patch ping! (4.1 projects, stage 1.2)  Hot/cold partitioning fixes
  2005-03-30 20:02 Patch ping! (4.1 projects, stage 1.2) Hot/cold partitioning fixes Caroline Tice
@ 2005-03-30 23:14 ` Geoffrey Keating
       [not found]   ` <5c0e321f26901d84492dfe29fa755d7e@apple.com>
  2005-04-01 13:55 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2005-03-30 23:14 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc-patches

Caroline Tice <ctice@apple.com> writes:

> Fix current problems with hot/cold partitioning optimization
> http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02230.html
> 
> 
> Since stage 1.2 projects are supposed to be committed by this Friday,
> I would appreciate someone taking a look at this soon.  Thanks!

You said you were going to send a revised patch in

<http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02200.html>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Fix PR20611, duplicate label for inlined function referencing TLS
       [not found]           ` <amodra@bigpond.net.au>
                               ` (37 preceding siblings ...)
  2005-03-20 23:01             ` [PATCH] PowerPC function arg alignment tidy David Edelsohn
@ 2005-03-31  0:16             ` David Edelsohn
  2005-05-31 14:32             ` powerpc new PLT and GOT David Edelsohn
                               ` (22 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-03-31  0:16 UTC (permalink / raw)
  To: gcc-patches

	PR target/20611
	* config/rs6000/rs6000.md (load_toc_v4_PIC_1b): Remove inline
	label operand.  Remove (use (unspec..)).  Don't emit a label on
	the offset word.
	* config/rs6000/rs6000.c (rs6000_legitimize_tls_address): Don't
	generate inline label for load_toc_v4_PIC_1b.
	(rs6000_emit_load_toc_table): Likewise.

The unspec and the operand probably should have SImode, but otherwise it
looks okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Patch ping! (4.1 projects, stage 1.2)  Hot/cold partitioning fixes
       [not found]       ` <87ll84llvz.fsf@codesourcery.com>
@ 2005-03-31  0:19         ` Caroline Tice
  2005-03-31  0:29           ` Zack Weinberg
  0 siblings, 1 reply; 875+ messages in thread
From: Caroline Tice @ 2005-03-31  0:19 UTC (permalink / raw)
  To: gcc-patches@gcc.gnu.org Patches, Zack Weinberg
  Cc: Geoff Keating, Richard Henderson


On Mar 30, 2005, at 4:06 PM, Zack Weinberg wrote:

> Geoff Keating <geoffk@geoffk.org> writes:
>
>> On 30/03/2005, at 3:15 PM, Caroline Tice wrote:
>>
>>> Yes, and the link in my ping below (from a week ago) contains the
>>> revised patch I sent out.
>>
>> Yes, you're right.  Richard, Zack, do you have any outstanding issues
>> with this patch?
>
> I'm still concerned about the RTL optimizers destroying the
> information that insert_section_boundary_note uses.  Other than that,
> I'm fine with the revised patch.
>
> zw

Yes, that still might be a problem, but if so it is also a problem with
the implementation that is currently checked into mainline.  At least the
patch fixes the worst problems with the optimization. So...just to make
absolutely sure I'm not misunderstanding, does the above message from Zack
constitute approval to commit the patch?

-- Caroline
ctice@apple.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Patch ping! (4.1 projects, stage 1.2)  Hot/cold partitioning fixes
  2005-03-31  0:19         ` Caroline Tice
@ 2005-03-31  0:29           ` Zack Weinberg
  2005-03-31  0:43             ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Zack Weinberg @ 2005-03-31  0:29 UTC (permalink / raw)
  To: Caroline Tice
  Cc: gcc-patches@gcc.gnu.org Patches, Geoff Keating, Richard Henderson

Caroline Tice <ctice@apple.com> writes:

>> I'm still concerned about the RTL optimizers destroying the
>> information that insert_section_boundary_note uses.  Other than
>> that, I'm fine with the revised patch.
>
> Yes, that still might be a problem, but if so it is also a problem
> with the implementation that is currently checked into mainline. 

This is true.  I am therefore not objecting to the patch.

> At least the patch fixes the worst problems with the
> optimization. So...just to make absolutely sure I'm not
> misunderstanding, does the above message from Zack constitute
> approval to commit the patch?

No, not yet; let's see if Richard has any further concerns first.

zw

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Patch ping! (4.1 projects, stage 1.2)  Hot/cold partitioning fixes
  2005-03-31  0:29           ` Zack Weinberg
@ 2005-03-31  0:43             ` Richard Henderson
  0 siblings, 0 replies; 875+ messages in thread
From: Richard Henderson @ 2005-03-31  0:43 UTC (permalink / raw)
  To: Zack Weinberg
  Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches, Geoff Keating

On Wed, Mar 30, 2005 at 04:19:42PM -0800, Zack Weinberg wrote:
> No, not yet; let's see if Richard has any further concerns first.

Nope.


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-03-30 20:02 Patch ping! (4.1 projects, stage 1.2) Hot/cold partitioning fixes Caroline Tice
  2005-03-30 23:14 ` Geoffrey Keating
@ 2005-04-01 13:55 ` David Edelsohn
  2005-04-01 16:44   ` David Edelsohn
  2005-04-01 17:04   ` Joseph S. Myers
  1 sibling, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 13:55 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc-patches

	This patch has broken bootstrap on AIX.  I thought that this
functionality was supposed to be disabled by default.

	I now see incorrect assembly on AIX with labels like:

_Unwind_GetTextRelBase[DS].hot_section:

Disappointed,
David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 13:55 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
@ 2005-04-01 16:44   ` David Edelsohn
  2005-04-01 17:13     ` Caroline Tice
  2005-04-01 18:48     ` Caroline Tice
  2005-04-01 17:04   ` Joseph S. Myers
  1 sibling, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 16:44 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc-patches

	The hot/cold partitioning patch is emitting hot and cold section
labels unconditionally, regardless of whether reorder_blocks_and_partition
is enabled.  The appended patch addresses that problem.  I am attempting
another bootstrap now.

	Also, the implementation assumes that it can manipulate section
names by appending strings, which is fundamentally not portable.  This
functionality was not implemented portably for inclusion in GCC.

David


	* varasm.c (assemble_start_function): Only emit hot_section_label
	if reorder_blocks_and_partition is enabled.
	(assemble_end_function): Only emit cold_section_end_label and
	hot_section_end_label if reorder_blocks_and_partition is enabled.

Index: varasm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/varasm.c,v
retrieving revision 1.492
diff -c -p -r1.492 varasm.c
*** varasm.c	1 Apr 2005 03:42:45 -0000	1.492
--- varasm.c	1 Apr 2005 16:37:59 -0000
*************** assemble_start_function (tree decl, cons
*** 1303,1309 ****
    /* Switch to the correct text section for the start of the function.  */
  
    function_section (decl);
!   if (!hot_label_written)
      ASM_OUTPUT_LABEL (asm_out_file, hot_section_label);
  
    /* Tell assembler to move to target machine's alignment for functions.  */
--- 1303,1309 ----
    /* Switch to the correct text section for the start of the function.  */
  
    function_section (decl);
!   if (flag_reorder_blocks_and_partition && !hot_label_written)
      ASM_OUTPUT_LABEL (asm_out_file, hot_section_label);
  
    /* Tell assembler to move to target machine's alignment for functions.  */
*************** assemble_end_function (tree decl, const 
*** 1379,1387 ****
       debug info.)  */
    save_text_section = in_section;
    unlikely_text_section ();
!   ASM_OUTPUT_LABEL (asm_out_file, cold_section_end_label);
    text_section ();
!   ASM_OUTPUT_LABEL (asm_out_file, hot_section_end_label);
    if (save_text_section == in_unlikely_executed_text)
      unlikely_text_section ();
  }
--- 1379,1389 ----
       debug info.)  */
    save_text_section = in_section;
    unlikely_text_section ();
!   if (flag_reorder_blocks_and_partition)
!     ASM_OUTPUT_LABEL (asm_out_file, cold_section_end_label);
    text_section ();
!   if (flag_reorder_blocks_and_partition)
!     ASM_OUTPUT_LABEL (asm_out_file, hot_section_end_label);
    if (save_text_section == in_unlikely_executed_text)
      unlikely_text_section ();
  }

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 13:55 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
  2005-04-01 16:44   ` David Edelsohn
@ 2005-04-01 17:04   ` Joseph S. Myers
  2005-04-01 17:21     ` Caroline Tice
  1 sibling, 1 reply; 875+ messages in thread
From: Joseph S. Myers @ 2005-04-01 17:04 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Caroline Tice, gcc-patches

On Fri, 1 Apr 2005, David Edelsohn wrote:

> 	This patch has broken bootstrap on AIX.  I thought that this
> functionality was supposed to be disabled by default.

I also suspect it of breaking bootstrap on hppa2.0w-hpux, bug 20719, as 
being the most recent big patch to varasm.c.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 16:44   ` David Edelsohn
@ 2005-04-01 17:13     ` Caroline Tice
  2005-04-01 18:24       ` David Edelsohn
  2005-04-01 18:48     ` Caroline Tice
  1 sibling, 1 reply; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 17:13 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches


On Apr 1, 2005, at 8:43 AM, David Edelsohn wrote:

> 	The hot/cold partitioning patch is emitting hot and cold section
> labels unconditionally, regardless of whether
> reorder_blocks_and_partition is enabled.  The appended patch addresses
> that problem.  I am attempting another bootstrap now.
>

I apologize for this.  I was under the impression, from some of the
messages I had received before, that emitting extra labels would not ever
be a problem, so it seemed simplest to me not to put the checks in.

> 	Also, the implementation assumes that it can manipulate section
> names by appending strings, which is fundamentally not portable.  This
> functionality was not implemented portably for inclusion in GCC.
>

I was unaware that this was not portable.  If you could explain exactly
what the problem is and/or suggest a better way to do this, I will fix
this problem.  Again, I apologize for this problem.

-- Caroline Tice
ctice@apple.com


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 17:04   ` Joseph S. Myers
@ 2005-04-01 17:21     ` Caroline Tice
  0 siblings, 0 replies; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 17:21 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches

On Apr 1, 2005, at 9:04 AM, Joseph S. Myers wrote:

> On Fri, 1 Apr 2005, David Edelsohn wrote:
>
>> 	This patch has broken bootstrap on AIX.  I thought that this
>> functionality was supposed to be disabled by default.
>
> I also suspect it of breaking bootstrap on hppa2.0w-hpux, bug 20719, as
> being the most recent big patch to varasm.c.
>

I am looking into fixing this as well.

-- Caroline Tice
ctice@apple.com


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 17:13     ` Caroline Tice
@ 2005-04-01 18:24       ` David Edelsohn
  2005-04-01 19:20         ` Caroline Tice
  2005-04-01 22:14         ` Geoffrey Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 18:24 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc-patches@gcc.gnu.org Patches

	Emitting the extra labels probably would be okay, if the labels
had the correct syntax.  You did not consider the possible failure modes
and did not ask the right questions.  The fundamental problem is not the
extra labels but the incorrectly formatted labels.

	The recent change emits sections and labels on AIX like:

        .csect ..text.unlikely[PR],2

and

_Unwind_GetTextRelBase[DS].end.cold:


This is imposing specific formats for names that may not be correct for
all assembly and object file formats.  For instance, one cannot assume
that appending the string ".hot_section" to the function name is a valid
label.  The parts of the AIX port of GCC that generate names and output
names work in cooperation.  Your patch creates new names outside of the
standard interfaces, which causes failures; appending arbitrary strings is
not safe.

hot_section_label = reconcat (hot_section_label, fnname, ".hot_section", NULL);

is not safe.
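
	To illustrate, here is a small sketch.  It is not GCC code:
naive_label is a made-up stand-in for the reconcat call above, and
text_after_qualifier is a made-up check.  It only shows that blind
concatenation leaves the bracketed qualifier (here "[DS]") stranded in the
middle of the name, which is the shape of the bad label quoted earlier.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the reconcat call: append SUFFIX to FNNAME
   with no knowledge of the target's naming rules.  Returns 0 on success,
   -1 if BUF is too small.  */
static int
naive_label (char *buf, size_t size, const char *fnname, const char *suffix)
{
  int n;

  assert (buf != NULL && size > 0);
  n = snprintf (buf, size, "%s%s", fnname, suffix);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Return nonzero if text follows the closing ']' of a bracketed
   qualifier such as "[DS]".  */
static int
text_after_qualifier (const char *label)
{
  const char *close = strrchr (label, ']');

  return close != NULL && close[1] != '\0';
}
```

Calling naive_label with "_Unwind_GetTextRelBase[DS]" and ".hot_section"
reproduces the label from the failing assembly, and text_after_qualifier
flags it.  A portable implementation would presumably have to go through
whatever interface the target already uses to build and output names,
rather than patch the string afterwards.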

	Also, I just noticed that the previous changes to choose sections
in varasm.c appear to have broken the select_section mechanism.  The
arguments to the target hook are supposed to be decl, reloc, and alignment,
e.g., in variable_section():

    targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));

but it sometimes is called with the arguments decl, unlikely, and
alignment, e.g., in function_section:

  targetm.asm_out.select_section (decl, unlikely, DECL_ALIGN (decl));
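
	The hazard can be sketched with a mock.  Everything below is
hypothetical (mock_section, mock_select_section, place_variable and
place_function are invented for illustration, and the mock drops the decl
argument); it is not how any real target hook is written.  The point is
only that a hook which interprets its argument as reloc cannot tell that
one caller meant "unlikely" instead.

```c
#include <assert.h>

enum mock_section { MOCK_READONLY, MOCK_DATA_REL };

/* Mock hook: its first parameter is documented as RELOC.  */
static enum mock_section
mock_select_section (int reloc, unsigned int align)
{
  (void) align;   /* unused in this sketch */
  return reloc ? MOCK_DATA_REL : MOCK_READONLY;
}

/* Caller that follows the documented contract.  */
static enum mock_section
place_variable (int reloc, unsigned int align)
{
  return mock_select_section (reloc, align);
}

/* Caller that passes an "unlikely executed" flag in the RELOC slot.  */
static enum mock_section
place_function (int unlikely, unsigned int align)
{
  return mock_select_section (unlikely, align);
}

/* The hook sees the same value either way and answers identically.  */
static void
check_hook_cannot_distinguish (void)
{
  assert (place_function (1, 32) == place_variable (1, 32));
}
```

In this mock, a cold function is treated exactly like an object that needs
relocations, purely because the two flags share one parameter slot.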

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 16:44   ` David Edelsohn
  2005-04-01 17:13     ` Caroline Tice
@ 2005-04-01 18:48     ` Caroline Tice
  1 sibling, 0 replies; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 18:48 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches


On Apr 1, 2005, at 8:43 AM, David Edelsohn wrote:

> 	The hot/cold partitioning patch is emitting hot and cold section
> labels unconditionally, regardless of whether
> reorder_blocks_and_partition is enabled.  The appended patch addresses
> that problem.  I am attempting another bootstrap now.
>
> 	Also, the implementation assumes that it can manipulate section
> names by appending strings, which is fundamentally not portable.  This
> functionality was not implemented portably for inclusion in GCC.
>
> David
>
>
> 	* varasm.c (assemble_start_function): Only emit hot_section_label
> 	if reorder_blocks_and_partition is enabled.
> 	(assemble_end_function): Only emit cold_section_end_label and
> 	hot_section_end_label if reorder_blocks_and_partition is enabled.
>
> Index: varasm.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/varasm.c,v
> retrieving revision 1.492
> diff -c -p -r1.492 varasm.c
> *** varasm.c	1 Apr 2005 03:42:45 -0000	1.492
> --- varasm.c	1 Apr 2005 16:37:59 -0000
> *************** assemble_start_function (tree decl, cons
> *** 1303,1309 ****
>     /* Switch to the correct text section for the start of the 
> function.  */
>
>     function_section (decl);
> !   if (!hot_label_written)
>       ASM_OUTPUT_LABEL (asm_out_file, hot_section_label);
>
>     /* Tell assembler to move to target machine's alignment for 
> functions.  */
> --- 1303,1309 ----
>     /* Switch to the correct text section for the start of the 
> function.  */
>
>     function_section (decl);
> !   if (flag_reorder_blocks_and_partition && !hot_label_written)
>       ASM_OUTPUT_LABEL (asm_out_file, hot_section_label);
>
>     /* Tell assembler to move to target machine's alignment for 
> functions.  */
> *************** assemble_end_function (tree decl, const
> *** 1379,1387 ****
>        debug info.)  */
>     save_text_section = in_section;
>     unlikely_text_section ();
> !   ASM_OUTPUT_LABEL (asm_out_file, cold_section_end_label);
>     text_section ();
> !   ASM_OUTPUT_LABEL (asm_out_file, hot_section_end_label);
>     if (save_text_section == in_unlikely_executed_text)
>       unlikely_text_section ();
>   }
> --- 1379,1389 ----
>        debug info.)  */
>     save_text_section = in_section;
>     unlikely_text_section ();
> !   if (flag_reorder_blocks_and_partition)
> !     ASM_OUTPUT_LABEL (asm_out_file, cold_section_end_label);
>     text_section ();
> !   if (flag_reorder_blocks_and_partition)
> !     ASM_OUTPUT_LABEL (asm_out_file, hot_section_end_label);
>     if (save_text_section == in_unlikely_executed_text)
>       unlikely_text_section ();
>   }

It would make more sense to put all the code that emits labels and
switches back and forth between sections in assemble_end_function into
a single conditional block, as shown below:

*************** assemble_end_function (tree decl, const
*** 1377,1389 ****
       }
     /* Output labels for end of hot/cold text sections (to be used by
        debug info.)  */
!   save_text_section = in_section;
!   unlikely_text_section ();
!   ASM_OUTPUT_LABEL (asm_out_file, cold_section_end_label);
!   text_section ();
!   ASM_OUTPUT_LABEL (asm_out_file, hot_section_end_label);
!   if (save_text_section == in_unlikely_executed_text)
!     unlikely_text_section ();
   }


   /* Assemble code to leave SIZE bytes of zeros.  */
--- 1379,1394 ----
       }
     /* Output labels for end of hot/cold text sections (to be used by
        debug info.)  */
!   if (flag_reorder_blocks_and_partition)
!     {
!       save_text_section = in_section;
!       unlikely_text_section ();
!       ASM_OUTPUT_LABEL (asm_out_file, cold_section_end_label);
!       text_section ();
!       ASM_OUTPUT_LABEL (asm_out_file, hot_section_end_label);
!       if (save_text_section == in_unlikely_executed_text)
!         unlikely_text_section ();
!     }
   }




^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 18:24       ` David Edelsohn
@ 2005-04-01 19:20         ` Caroline Tice
  2005-04-01 19:25           ` David Edelsohn
                             ` (2 more replies)
  2005-04-01 22:14         ` Geoffrey Keating
  1 sibling, 3 replies; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 19:20 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches@gcc.gnu.org Patches


On Apr 1, 2005, at 10:24 AM, David Edelsohn wrote:

> 	Emitting the extra labels probably would be okay, if the labels
> had the correct syntax.  You did not consider the possible failure
> modes and did not ask the right questions.  The fundamental problem
> is not the extra labels but the incorrectly formatted labels.
>
> 	The recent change emits sections and labels on AIX like:
>
>         .csect ..text.unlikely[PR],2
>
> and
>
> _Unwind_GetTextRelBase[DS].end.cold:
>
>
> This is imposing specific formats for names that may not be correct for
> all assembly and object file formats.  For instance, one cannot assume
> that appending the string ".hot_section" to the function name is a
> valid label.  The parts of the AIX port of GCC that generate names and
> output names work in cooperation.  Your patch creates new names outside
> of the standard interfaces, which causes failures; appending arbitrary
> strings is not safe.
>
> hot_section_label = reconcat (hot_section_label, fnname,
> 			      ".hot_section", NULL);
>
> is not safe.
>

I was unaware that these formats would not work for all architectures.
I have just been scanning through the GCC source to see if there are
any target hooks for creating section names/labels, and I don't see
any.  Therefore I am not sure at this point as to the best way to
proceed.  Any constructive advice would be appreciated.


> 	Also, I just noticed that the previous changes to choose sections
> in varasm.c appear to have broken the select_section mechanism.  The
> arguments to the target hook are supposed to be decl, reloc, and
> alignment, e.g., in variable_section():
>
>     targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
>
> but it sometimes is called with the arguments decl, unlikely, and
> alignment, e.g., in function_section:
>
>   targetm.asm_out.select_section (decl, unlikely, DECL_ALIGN (decl));
>

I was not the origin of this change (the change in function_section was
written by someone else, I *think* it was Geoff Keating but I'm not
sure, sometime late last fall).  It seems unlikely to me that this is
currently a major problem, since the code has been like this 4-5 months
with no one complaining.  However if you think it is important, I am
willing to look at fixing this.  Let me know if you really want me to
do this.

-- Caroline Tice
ctice@apple.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 19:20         ` Caroline Tice
@ 2005-04-01 19:25           ` David Edelsohn
  2005-04-01 19:45           ` Mark Mitchell
  2005-04-01 19:50           ` Daniel Jacobowitz
  2 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 19:25 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc-patches@gcc.gnu.org Patches

>>>>> Caroline Tice writes:

Caroline> I was unaware that these formats would not work for all
Caroline> architectures.  I have just been scanning through the GCC
Caroline> source to see if there are any target hooks for creating
Caroline> section names/labels, and I don't see any.  Therefore I am
Caroline> not sure at this point as to the best way to proceed.  Any
Caroline> constructive advice would be appreciated.

	A good start would be to apply targetm.strip_name_encoding to the
string before appending, e.g.,

hot_section_label = reconcat (hot_section_label,
			      targetm.strip_name_encoding (fnname),
			      ".hot_section", NULL);

etc.
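
A minimal, self-contained sketch of the idea above: strip the
target-specific encoding from the function name first, and only then
append the suffix.  The helper names are hypothetical and the stripping
rule (skip a leading '*', drop a trailing four-character suffix such as
"[DS]") is an assumption loosely modeled on an XCOFF-style encoding; it
is not the real targetm.strip_name_encoding hook, and plain malloc
stands in for GCC's own allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for targetm.strip_name_encoding on an
   XCOFF-like target: skip a leading '*' and drop a trailing
   four-character storage-class suffix such as "[DS]".  The result
   is freshly allocated and owned by the caller.  */
static char *
sketch_strip_name_encoding (const char *name)
{
  size_t len;
  char *copy;

  assert (name != NULL);
  if (*name == '*')
    name++;
  len = strlen (name);
  if (len >= 4 && name[len - 4] == '[' && name[len - 1] == ']')
    len -= 4;

  copy = malloc (len + 1);
  assert (copy != NULL);
  memcpy (copy, name, len);
  copy[len] = '\0';
  return copy;
}

/* Build "<stripped name><suffix>" in fresh storage, so the suffix is
   appended to the plain symbol name rather than to the encoded one.  */
static char *
sketch_make_section_label (const char *fnname, const char *suffix)
{
  char *base = sketch_strip_name_encoding (fnname);
  char *label = malloc (strlen (base) + strlen (suffix) + 1);

  assert (label != NULL);
  strcpy (label, base);
  strcat (label, suffix);
  free (base);
  return label;
}
```

Under these assumptions, "_Unwind_GetTextRelBase[DS]" with the suffix
".hot_section" yields "_Unwind_GetTextRelBase.hot_section" instead of a
name with the "[DS]" marker stuck in the middle.  Whether the result is
a valid label on every target is a separate question that this sketch
does not settle.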

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 19:20         ` Caroline Tice
  2005-04-01 19:25           ` David Edelsohn
@ 2005-04-01 19:45           ` Mark Mitchell
  2005-04-01 20:02             ` Caroline Tice
  2005-04-01 21:34             ` Caroline Tice
  2005-04-01 19:50           ` Daniel Jacobowitz
  2 siblings, 2 replies; 875+ messages in thread
From: Mark Mitchell @ 2005-04-01 19:45 UTC (permalink / raw)
  To: Caroline Tice; +Cc: David Edelsohn, gcc-patches@gcc.gnu.org Patches

Caroline Tice wrote:

>> but it sometimes is called with the arguments decl, unlikely, and
>> alignment, e.g., in function_section:
>>
>>   targetm.asm_out.select_section (decl, unlikely, DECL_ALIGN (decl));
>>
> 
> I was not the origin of this change (the change in function_section was
> written by someone else, I *think* it was Geoff Keating but I'm not
> sure, sometime late last fall).  It seems unlikely to me that this is
> currently a major problem, since the code has been like this 4-5 months
> with no one complaining.  However if you think it is important, I am
> willing to look at fixing this.  Let me know if you really want me to
> do this.

I would like you to look into this.  If this change is on the 4.0 
release branch, please look there as well.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 19:20         ` Caroline Tice
  2005-04-01 19:25           ` David Edelsohn
  2005-04-01 19:45           ` Mark Mitchell
@ 2005-04-01 19:50           ` Daniel Jacobowitz
  2005-04-01 20:00             ` Caroline Tice
  2 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-04-01 19:50 UTC (permalink / raw)
  To: Caroline Tice; +Cc: David Edelsohn, gcc-patches@gcc.gnu.org Patches

On Fri, Apr 01, 2005 at 11:20:27AM -0800, Caroline Tice wrote:
> I was unaware that these formats would not work for all architectures.
> I have just been scanning through the GCC source to see if there are
> any target hooks for creating section names/labels, and I don't see
> any.  Therefore I am not sure at this point as to the best way to
> proceed.  Any constructive advice would be appreciated.

Do you really need labels with specific names?  i.e. couldn't you use
ASM_GENERATE_INTERNAL_LABEL?


-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 19:50           ` Daniel Jacobowitz
@ 2005-04-01 20:00             ` Caroline Tice
  2005-04-01 20:07               ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 20:00 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches

I really need labels with predictable names, as these labels are used in
the debugging information to determine the size of the text sections.

-- Caroline
ctice@apple.com

On Apr 1, 2005, at 11:50 AM, Daniel Jacobowitz wrote:

> On Fri, Apr 01, 2005 at 11:20:27AM -0800, Caroline Tice wrote:
>> I was unaware that these formats would not work for all architectures.
>> I have just been scanning through the GCC source to see if there are
>> any target hooks for creating section names/labels, and I don't see
>> any.  Therefore I am not sure at this point as to the best way to
>> proceed.  Any constructive advice would be appreciated.
>
> Do you really need labels with specific names?  i.e. couldn't you use
> ASM_GENERATE_INTERNAL_LABEL?
>
>
> -- 
> Daniel Jacobowitz
> CodeSourcery, LLC
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 19:45           ` Mark Mitchell
@ 2005-04-01 20:02             ` Caroline Tice
  2005-04-01 21:34             ` Caroline Tice
  1 sibling, 0 replies; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 20:02 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches

Okay, will do!

-- Caroline

On Apr 1, 2005, at 11:45 AM, Mark Mitchell wrote:

> Caroline Tice wrote:
>
>>> but it sometimes is called with the arguments decl, unlikely, and
>>> alignment, e.g., in function_section:
>>>
>>>   targetm.asm_out.select_section (decl, unlikely, DECL_ALIGN (decl));
>>>
>> I was not the origin of this change (the change in function_section was
>> written by someone else, I *think* it was Geoff Keating but I'm not
>> sure, sometime late last fall).  It seems unlikely to me that this is
>> currently a major problem, since the code has been like this 4-5 months
>> with no one complaining.  However if you think it is important, I am
>> willing to look at fixing this.  Let me know if you really want me to
>> do this.
>
> I would like you to look into this.  If this change is on the 4.0 
> release branch, please look there as well.
>
> -- 
> Mark Mitchell
> CodeSourcery, LLC
> mark@codesourcery.com
> (916) 791-8304
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 20:00             ` Caroline Tice
@ 2005-04-01 20:07               ` Daniel Jacobowitz
  2005-04-01 20:09                 ` Caroline Tice
  2005-04-01 22:08                 ` Geoffrey Keating
  0 siblings, 2 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-04-01 20:07 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc-patches@gcc.gnu.org Patches

On Fri, Apr 01, 2005 at 12:00:08PM -0800, Caroline Tice wrote:
> I really need labels with predictable names, as these labels are used in
> the debugging information to determine the size of the text sections.

And you can't somehow associate the labels with the function?  For
instance, record them in "struct function", or as a property of the
FUNCTION_DECL.  It looks like current_function isn't usable during
dwarf2 output, but you'll have the decl; you could create a hash table
mapping the FUNCTION_DECL to the appropriate label.

[There are a couple references to cfun in dwarf2out.c, which would be
easier, but I'm not sure if they're correct.]

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 20:07               ` Daniel Jacobowitz
@ 2005-04-01 20:09                 ` Caroline Tice
  2005-04-01 21:19                   ` Mark Mitchell
  2005-04-01 22:08                 ` Geoffrey Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 20:09 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches

Actually I may be able to do this after all; I will look into it.

-- Caroline

On Apr 1, 2005, at 12:07 PM, Daniel Jacobowitz wrote:

> On Fri, Apr 01, 2005 at 12:00:08PM -0800, Caroline Tice wrote:
>> I really need labels with predictable names, as these labels are used
>> in the debugging information to determine the size of the text
>> sections.
>
> And you can't somehow associate the labels with the function?  For
> instance, record them in "struct function", or as a property of the
> FUNCTION_DECL.  It looks like current_function isn't usable during
> dwarf2 output, but you'll have the decl; you could create a hash table
> mapping the FUNCTION_DECL to the appropriate label.
>
> [There are a couple references to cfun in dwarf2out.c, which would be
> easier, but I'm not sure if they're correct.]
>
> -- 
> Daniel Jacobowitz
> CodeSourcery, LLC
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 20:09                 ` Caroline Tice
@ 2005-04-01 21:19                   ` Mark Mitchell
  0 siblings, 0 replies; 875+ messages in thread
From: Mark Mitchell @ 2005-04-01 21:19 UTC (permalink / raw)
  To: Caroline Tice; +Cc: Daniel Jacobowitz, gcc-patches@gcc.gnu.org Patches

Caroline Tice wrote:
> Actually I may be able to do this after all; I will look into it.

I think Daniel's suggestion makes sense; it should indeed be possible to
store the labels associated with the function somewhere.  I would
recommend against the FUNCTION_DECL itself, as that will be more
data-structure bloat, but an on-the-side hash table, or "struct
function" would indeed make sense to me.
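
A rough sketch of what keeping the labels in a per-function record
could look like.  Everything here is hypothetical: the struct is a
trimmed-down stand-in for "struct function", the generator only
imitates the spirit of ASM_GENERATE_INTERNAL_LABEL, and the ".L"
spelling and the "HOTE"/"COLDE" prefixes are assumptions chosen for
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_LABEL_MAX 32

/* Hypothetical, trimmed-down stand-in for GCC's "struct function":
   each function carries its own end-of-section labels instead of
   sharing file-scope globals.  */
struct sketch_function
{
  char hot_section_end_label[SKETCH_LABEL_MAX];
  char cold_section_end_label[SKETCH_LABEL_MAX];
};

/* Hypothetical stand-in for ASM_GENERATE_INTERNAL_LABEL: format a
   local label from a prefix and a number.  Real targets choose their
   own spelling; ".L" is only an assumption here.  */
static void
sketch_generate_internal_label (char *buf, size_t size,
				const char *prefix, unsigned long num)
{
  int n = snprintf (buf, size, ".L%s%lu", prefix, num);
  assert (n > 0 && (size_t) n < size);
}

/* Number handed to the next function that asks for labels.  */
static unsigned long sketch_label_counter;

/* Give FN a fresh pair of labels; both share one number so they can
   be matched up later, e.g. when debug info is written.  */
static void
sketch_init_partition_labels (struct sketch_function *fn)
{
  assert (fn != NULL);
  sketch_generate_internal_label (fn->hot_section_end_label,
				  sizeof fn->hot_section_end_label,
				  "HOTE", sketch_label_counter);
  sketch_generate_internal_label (fn->cold_section_end_label,
				  sizeof fn->cold_section_end_label,
				  "COLDE", sketch_label_counter);
  sketch_label_counter++;
}
```

The point of the design is that the debug-info writer would look the
labels up through the function record rather than reconstructing them
from the function name, so the label text never has to be derived from
a user-visible symbol.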

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 19:45           ` Mark Mitchell
  2005-04-01 20:02             ` Caroline Tice
@ 2005-04-01 21:34             ` Caroline Tice
  2005-04-01 21:44               ` David Edelsohn
  2005-04-01 21:46               ` Mark Mitchell
  1 sibling, 2 replies; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 21:34 UTC (permalink / raw)
  To: Mark Mitchell
  Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches, David Edelsohn

On Apr 1, 2005, at 11:45 AM, Mark Mitchell wrote:

> Caroline Tice wrote:
>
>>> but it sometimes is called with the arguments decl, unlikely, and
>>> alignment, e.g., in function_section:
>>>
>>>   targetm.asm_out.select_section (decl, unlikely, DECL_ALIGN (decl));
>>>
>> I was not the origin of this change (the change in function_section was
>> written by someone else, I *think* it was Geoff Keating but I'm not
>> sure, sometime late last fall).  It seems unlikely to me that this is
>> currently a major problem, since the code has been like this 4-5 months
>> with no one complaining.  However if you think it is important, I am
>> willing to look at fixing this.  Let me know if you really want me to
>> do this.
>
> I would like you to look into this.  If this change is on the 4.0 
> release branch, please look there as well.
>
>

According to the manual, when TARGET_ASM_SELECT_SECTION is used for
function decl's, the RELOC parameter is supposed to be 0 if the
function is likely to be executed, and 1 if the function is unlikely
to be executed.  Therefore it looks like the call above in
function_section is passing the correct values.  But I will
change the name of the argument from 'unlikely' to 'reloc', and the type
from 'bool' to 'int'.  :-)

Since, in terms of functionality, it is correct on the 4.0 release
branch, do you want me to make this change there as well or not?

-- Caroline

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 21:34             ` Caroline Tice
@ 2005-04-01 21:44               ` David Edelsohn
  2005-04-01 21:55                 ` Daniel Jacobowitz
  2005-04-01 22:02                 ` Caroline Tice
  2005-04-01 21:46               ` Mark Mitchell
  1 sibling, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 21:44 UTC (permalink / raw)
  To: Caroline Tice; +Cc: Mark Mitchell, gcc-patches@gcc.gnu.org Patches

>>>>> Caroline Tice writes:

Caroline> According to the manual, when TARGET_ASM_SELECT_SECTION is used for
Caroline> function decl's, the RELOC parameter is supposed to be 0 if the
Caroline> function is likely to be executed, and 1 if the function is unlikely
Caroline> to be executed.  Therefore it looks like the call above in
Caroline> function_section is passing the correct values.  But I will
Caroline> change the name of the argument from 'unlikely' to 'reloc', and the type
Caroline> from 'bool' to 'int'.  :-)

Caroline> Since, in terms of functionality, it is correct on the 4.0
Caroline> release branch, do you want me to make this change there as
Caroline> well or not?

	What documentation are you reading?  Are you looking at an
Apple-local copy?  Mainline gcc/doc/tm.texi and the online GCC manual
state:

@deftypefn {Target Hook} void TARGET_ASM_SELECT_SECTION (tree @var{exp},
int @var{reloc}, unsigned HOST_WIDE_INT @var{align}) Switches to the
appropriate section for output of @var{exp}.  You can assume that
@var{exp} is either a @code{VAR_DECL} node or a constant of some sort.
@var{reloc} indicates whether the initial value of @var{exp} requires
link-time relocations.  Bit 0 is set when variable contains local
relocations only, while bit 1 is set for global relocations.

Also, none of the implementations of the target hook on mainline were
modified to use the parameter to select likely or unlikely.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 21:34             ` Caroline Tice
  2005-04-01 21:44               ` David Edelsohn
@ 2005-04-01 21:46               ` Mark Mitchell
  2005-04-01 22:06                 ` Caroline Tice
  1 sibling, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2005-04-01 21:46 UTC (permalink / raw)
  To: Caroline Tice; +Cc: gcc-patches@gcc.gnu.org Patches, David Edelsohn

Caroline Tice wrote:

> According to the manual, when TARGET_ASM_SELECT_SECTION is used for
> function decl's, the RELOC parameter is supposed to be 0 if the
> function is likely to be executed, and 1 if the function is unlikely
> to be executed.  Therefore it looks like the call above in
> function_section is passing the correct values.  But I will
> change the name of the argument from 'unlikely' to 'reloc', and the type
> from 'bool' to 'int'.  :-)

How confusing!

In target.h, it says:

   /* Given a decl, a section name, and whether the decl initializer
      has relocs, choose attributes for the section.  */

and in my copy of the manual it says:

@deftypefn {Target Hook} void TARGET_ASM_SELECT_SECTION (tree @var{exp}, 
int @var{reloc}, unsigned HOST_WIDE_INT @var{align})
Switches to the appropriate section for output of @var{exp}.  You can
assume that @var{exp} is either a @code{VAR_DECL} node or a constant of
some sort.  @var{reloc} indicates whether the initial value of @var{exp}
requires link-time relocations.

But then I see that USE_SELECT_SECTION_FOR_FUNCTIONS says that:

In the case of a @code{FUNCTION_DECL}, @var{reloc} will be zero if the
function has been determined to be likely to be called, and nonzero if
it is unlikely to be called.

So, if your calls to select section are appropriately guarded with
USE_SELECT_SECTION_FOR_FUNCTIONS, then it does indeed sound like your
code is correct, and there's no need to make changes to the 4.0 branch.
I sure wish someone would tidy up target.h to reflect that (a)
FUNCTION_DECLs can be passed to this function, and (b) reloc has a
different meaning in this case.  Or, make an entirely separate hook.
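
To make the double meaning concrete, here is a small sketch of a
select-section routine that treats its second argument differently for
functions and for variables.  The enums and the function are
hypothetical stand-ins, not the real target hook or GCC's section
machinery, and the variable case is simplified to a single zero/nonzero
test.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kinds of decl and the sections a
   target might choose between.  */
enum sketch_decl_kind { SKETCH_VAR_DECL, SKETCH_FUNCTION_DECL };

enum sketch_section
{
  SKETCH_TEXT,
  SKETCH_TEXT_UNLIKELY,
  SKETCH_DATA,
  SKETCH_DATA_REL
};

/* Sketch of a select_section-style routine.  RELOC is overloaded,
   following the USE_SELECT_SECTION_FOR_FUNCTIONS wording quoted
   above.  */
static enum sketch_section
sketch_select_section (enum sketch_decl_kind kind, int reloc)
{
  if (kind == SKETCH_FUNCTION_DECL)
    /* For a function, RELOC is zero if the function is likely to be
       called and nonzero if it is unlikely to be called.  */
    return reloc ? SKETCH_TEXT_UNLIKELY : SKETCH_TEXT;

  /* For a variable or constant, RELOC says whether the initial value
     needs link-time relocations.  */
  return reloc ? SKETCH_DATA_REL : SKETCH_DATA;
}
```

An implementation that only has the second branch would quietly treat
an "unlikely" function as data needing relocations, which is why a
target that never expects FUNCTION_DECLs should not be handed them.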

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 21:44               ` David Edelsohn
@ 2005-04-01 21:55                 ` Daniel Jacobowitz
  2005-04-01 22:02                 ` Caroline Tice
  1 sibling, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-04-01 21:55 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Caroline Tice, Mark Mitchell, gcc-patches@gcc.gnu.org Patches

On Fri, Apr 01, 2005 at 04:44:29PM -0500, David Edelsohn wrote:
> >>>>> Caroline Tice writes:
> 
> Caroline> According to the manual, when TARGET_ASM_SELECT_SECTION is used for
> Caroline> function decl's, the RELOC parameter is supposed to be 0 if the
> Caroline> function is likely to be executed, and 1 if the function is unlikely
> Caroline> to be executed.  Therefore it looks like the call above in
> Caroline> function_section is passing the correct values.  But I will
> Caroline> change the name of the argument from 'unlikely' to 'reloc', and the type
> Caroline> from 'bool' to 'int'.  :-)
> 
> Caroline> Since, in terms of functionality, it is correct on the 4.0
> Caroline> release branch, do you want me to make this change there as
> Caroline> well or not?
> 
> 	What documentation are you reading?  Are you looking at an
> Apple-local copy?  Mainline gcc/doc/tm.texi and the online GCC manual
> states:
> 
> @deftypefn {Target Hook} void TARGET_ASM_SELECT_SECTION (tree @var{exp},
> int @var{reloc}, unsigned HOST_WIDE_INT @var{align}) Switches to the
> appropriate section for output of @var{exp}.  You can assume that
> @var{exp} is either a @code{VAR_DECL} node or a constant of some sort.
> @var{reloc} indicates whether the initial value of @var{exp} requires
> link-time relocations.  Bit 0 is set when variable contains local
> relocations only, while bit 1 is set for global relocations.
> 
> 
> Also, none of the implementations of the target hook on mainline were
> modified to use the parameter to select likely or unlikely.

Try the following paragraph?

See also @var{USE_SELECT_SECTION_FOR_FUNCTIONS}.

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
       [not found]                               ` <drow@false.org>
  2004-09-15 15:46                                 ` David Edelsohn
@ 2005-04-01 21:58                                 ` David Edelsohn
  2005-06-27 17:18                                 ` [PATCH, committed] PPC405 atomic support (PR target/21760) David Edelsohn
                                                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 21:58 UTC (permalink / raw)
  To: Caroline Tice, Mark Mitchell, gcc-patches@gcc.gnu.org Patches

>>>>> Daniel Jacobowitz writes:

Daniel> Try the following paragraph?

Daniel> See also @var{USE_SELECT_SECTION_FOR_FUNCTIONS}.

	Sigh.  How confusing.

	However, neither the default definition nor the target overrides
handle the alternate use of the second argument.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 21:44               ` David Edelsohn
  2005-04-01 21:55                 ` Daniel Jacobowitz
@ 2005-04-01 22:02                 ` Caroline Tice
  1 sibling, 0 replies; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 22:02 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches@gcc.gnu.org Patches


On Apr 1, 2005, at 1:44 PM, David Edelsohn wrote:

>>>>>> Caroline Tice writes:
>
> Caroline> According to the manual, when TARGET_ASM_SELECT_SECTION is used for
> Caroline> function decl's, the RELOC parameter is supposed to be 0 if the
> Caroline> function is likely to be executed, and 1 if the function is unlikely
> Caroline> to be executed.  Therefore it looks like the call above in
> Caroline> function_section is passing the correct values.  But I will
> Caroline> change the name of the argument from 'unlikely' to 'reloc', and the type
> Caroline> from 'bool' to 'int'.  :-)
>
> Caroline> Since, in terms of functionality, it is correct on the 4.0 release
> Caroline> branch, do you want me to make this change there as well or not?
>
> 	What documentation are you reading?  Are you looking at an
> Apple-local copy?  Mainline gcc/doc/tm.texi and the online GCC manual
> states:
>
> @deftypefn {Target Hook} void TARGET_ASM_SELECT_SECTION (tree @var{exp},
> int @var{reloc}, unsigned HOST_WIDE_INT @var{align}) Switches to the
> appropriate section for output of @var{exp}.  You can assume that
> @var{exp} is either a @code{VAR_DECL} node or a constant of some sort.
> @var{reloc} indicates whether the initial value of @var{exp} requires
> link-time relocations.  Bit 0 is set when variable contains local
> relocations only, while bit 1 is set for global relocations.
>
>

If you look a few lines further, at the end of that section it says "See
also USE_SELECT_SECTION_FOR_FUNCTIONS", where it says:

  USE_SELECT_SECTION_FOR_FUNCTIONS
      Define this macro if you wish TARGET_ASM_SELECT_SECTION to be
      called for `FUNCTION_DECL's as well as for variables and constants.

      In the case of a `FUNCTION_DECL', RELOC will be zero if the
      function has been determined to be likely to be called, and
      nonzero if it is unlikely to be called.


> Also, none of the implementations of the target hook on mainline were
> modified to use the parameter to select likely or unlikely.
>

In the function machopic_select_section, in config/darwin.c, it most
definitely does use this parameter to select the likely versus unlikely
sections (and no, I didn't write that part either).

-- Caroline Tice
ctice@apple.com

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 21:46               ` Mark Mitchell
@ 2005-04-01 22:06                 ` Caroline Tice
  0 siblings, 0 replies; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 22:06 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches


Yes, the calls in function_section and current_function_section
are guarded with that check.

-- Caroline
ctice@apple.com

On Apr 1, 2005, at 1:45 PM, Mark Mitchell wrote:
> So, if your calls to select section are appropriately guarded with
> USE_SELECT_SECTION_FOR_FUNCTIONS, then it does indeed sound like your
> code is correct, and there's no need to make changes to the 4.0
> branch.  I sure wish someone would tidy up target.h to reflect that
> (a) FUNCTION_DECLs can be passed to this function, and (b) reloc has
> a different meaning in this case.  Or, make an entirely separate
> hook.
>
> -- 
> Mark Mitchell
> CodeSourcery, LLC
> mark@codesourcery.com
> (916) 791-8304
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 20:07               ` Daniel Jacobowitz
  2005-04-01 20:09                 ` Caroline Tice
@ 2005-04-01 22:08                 ` Geoffrey Keating
  2005-04-01 22:12                   ` Caroline Tice
  1 sibling, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2005-04-01 22:08 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gcc-patches@gcc.gnu.org Patches

Daniel Jacobowitz <drow@false.org> writes:

> On Fri, Apr 01, 2005 at 12:00:08PM -0800, Caroline Tice wrote:
> > I really need labels with predictable names, as these labels are used in
> > the debugging information to determine the size of the text sections.
> 
> And you can't somehow associate the labels with the function?  For
> instance, record them in "struct function", or as a property of the
> FUNCTION_DECL.  It looks like current_function isn't usable during
> dwarf2 output, but you'll have the decl; you could create a hash table
> mapping the FUNCTION_DECL to the appropriate label.

They should go in 'struct function'.  There's a pointer to the function
structure from the DECL.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:08                 ` Geoffrey Keating
@ 2005-04-01 22:12                   ` Caroline Tice
  2005-04-01 22:21                     ` Geoffrey Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 22:12 UTC (permalink / raw)
  To: gcc-patches@gcc.gnu.org Patches
  Cc: Geoffrey Keating, Caroline Tice, Daniel Jacobowitz

I was thinking that since I already have global variables that I was
putting the old strings into, I would just use those global variables,
putting the result of calling ASM_GENERATE_INTERNAL_LABEL into them.
Is this a bad idea?  Should I attach them to the function structure
instead?

-- Caroline

On Apr 1, 2005, at 2:07 PM, Geoffrey Keating wrote:

> Daniel Jacobowitz <drow@false.org> writes:
>
>> On Fri, Apr 01, 2005 at 12:00:08PM -0800, Caroline Tice wrote:
>>> I really need labels with predictable names, as these labels are
>>> used in the debugging information to determine the size of the text
>>> sections.
>>
>> And you can't somehow associate the labels with the function?  For
>> instance, record them in "struct function", or as a property of the
>> FUNCTION_DECL.  It looks like current_function isn't usable during
>> dwarf2 output, but you'll have the decl; you could create a hash table
>> mapping the FUNCTION_DECL to the appropriate label.
>
> They should go in 'struct function'.  There's a pointer to the function
> structure from the DECL.
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 18:24       ` David Edelsohn
  2005-04-01 19:20         ` Caroline Tice
@ 2005-04-01 22:14         ` Geoffrey Keating
  2005-04-01 22:17           ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2005-04-01 22:14 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches@gcc.gnu.org Patches

David Edelsohn <dje@watson.ibm.com> writes:

> 	Also, I just noticed that the previous changes to choose sections
> in varasm.c appear to have broken the select_section mechanism.  The
> arguments to the target hook are supposed to be decl, reloc, and alignment,
> e.g., in variable_section():
> 
>     targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
> 
> but it sometimes is called with the arguments decl, unlikely, and
> alignment, e.g., in function_section:
> 
>   targetm.asm_out.select_section (decl, unlikely, DECL_ALIGN (decl));

What do you think is broken?

See the documentation for the select_section target hook.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:14         ` Geoffrey Keating
@ 2005-04-01 22:17           ` David Edelsohn
  2005-04-01 22:24             ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 22:17 UTC (permalink / raw)
  To: Geoffrey Keating; +Cc: gcc-patches@gcc.gnu.org Patches

>>>>> Geoffrey Keating writes:

Geoff> What do you think is broken?

	While the select_section target hook appears to be defined
correctly for Darwin, none of the other definitions were updated,
including varasm.c:default_select_section().

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:12                   ` Caroline Tice
@ 2005-04-01 22:21                     ` Geoffrey Keating
  2005-04-01 22:31                       ` Caroline Tice
  0 siblings, 1 reply; 875+ messages in thread
From: Geoffrey Keating @ 2005-04-01 22:21 UTC (permalink / raw)
  To: Caroline Tice; +Cc: Daniel Jacobowitz, gcc-patches

Caroline Tice <ctice@apple.com> writes:

> I was thinking that since I already have global variables that I was
> putting the old strings into, I would just use those global
> variables putting the result of calling ASM_GENERATE_INTERNAL_LABEL
> into them.  Is this a bad idea?

Where are these global variables used?

Yes, in general having random routines depend on the contents of
random global variables is a bad idea.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:17           ` David Edelsohn
@ 2005-04-01 22:24             ` Daniel Jacobowitz
  2005-04-01 22:31               ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-04-01 22:24 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoffrey Keating, gcc-patches@gcc.gnu.org Patches

On Fri, Apr 01, 2005 at 05:16:24PM -0500, David Edelsohn wrote:
> >>>>> Geoffrey Keating writes:
> 
> Geoff> What do you think is broken?
> 
> 	While the select_section target hook appears to be defined
> correctly for Darwin, none of the other definitions were updated,
> including varasm.c:default_select_section().

But none of them should be called in this case, either, because they
don't define USE_SELECT_SECTION_FOR_FUNCTIONS.

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:24             ` Daniel Jacobowitz
@ 2005-04-01 22:31               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-04-01 22:31 UTC (permalink / raw)
  To: Geoffrey Keating, Daniel Jacobowitz, gcc-patches@gcc.gnu.org Patches

>>>>> Daniel Jacobowitz writes:

> But none of them should be called in this case, either, because they
> don't define USE_SELECT_SECTION_FOR_FUNCTIONS.

	Yes, sorry.  Thanks for explaining this contorted feature.  The
overloading of the interface is a poor design.
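
	To make the overloading concrete, here is a minimal sketch in C.
It is not GCC code: toy_decl, toy_section and toy_select_section are
hypothetical stand-ins, and the section choices are assumptions made
only for illustration.  The point is that one middle argument carries
two unrelated meanings depending on what kind of decl is passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a declaration; not GCC's tree type.  */
struct toy_decl
{
  bool is_function;
};

/* Hypothetical section identifiers, for illustration only.  */
enum toy_section
{
  TOY_TEXT,
  TOY_TEXT_UNLIKELY,
  TOY_DATA,
  TOY_DATA_REL
};

/* One hook, two meanings for the middle argument: for variables it is
   a "needs relocation" flag, for functions it is an "unlikely to be
   executed" flag.  The caller has to know which one applies.  */
static enum toy_section
toy_select_section (const struct toy_decl *decl, int reloc_or_unlikely,
                    unsigned int align)
{
  assert (decl != NULL);
  (void) align;  /* Alignment is ignored in this sketch.  */

  if (decl->is_function)
    return reloc_or_unlikely ? TOY_TEXT_UNLIKELY : TOY_TEXT;
  return reloc_or_unlikely ? TOY_DATA_REL : TOY_DATA;
}
```

	Passing the same value 1 for a function and for a variable
selects unrelated sections, which is why a reader of a call site
cannot tell what the argument means without knowing the decl kind.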

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:21                     ` Geoffrey Keating
@ 2005-04-01 22:31                       ` Caroline Tice
  2005-04-01 22:57                         ` Caroline Tice
  0 siblings, 1 reply; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 22:31 UTC (permalink / raw)
  To: Geoffrey Keating
  Cc: Caroline Tice, gcc-patches@gcc.gnu.org Patches, Daniel Jacobowitz



The variables are:

unlikely_section_label
hot_section_label
cold_section_end_label
hot_section_end_label

They are used in assemble_start_function and assemble_end_function,
where they are written out.  They are also used in dbxout.c and
dwarf2out.c, where they are used to calculate the sizes of the text
sections for the debugging information.

-- Caroline
ctice@apple.com

On Apr 1, 2005, at 2:21 PM, Geoffrey Keating wrote:

> Caroline Tice <ctice@apple.com> writes:
>
>> I was thinking that since I already have global variables that I was
>> putting the old strings into, I would just use those global
>> variables putting the result of calling ASM_GENERATE_INTERNAL_LABEL
>> into them.  Is this a bad idea?
>
> Where are these global variables used?
>
> Yes, in general having random routines depend on the contents of
> random global variables is a bad idea.
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:31                       ` Caroline Tice
@ 2005-04-01 22:57                         ` Caroline Tice
  2005-04-02  2:22                           ` Mark Mitchell
  0 siblings, 1 reply; 875+ messages in thread
From: Caroline Tice @ 2005-04-01 22:57 UTC (permalink / raw)
  To: Caroline Tice
  Cc: gcc-patches@gcc.gnu.org Patches, Geoffrey Keating, Daniel Jacobowitz

Before I go and attach these labels to the function structure, I
just wanted to double check on something.  The last time
I tried adding fields to a structure, there turned out to
be a problem because people did not like the size of
the structure to be increased (admittedly there are *far*
more basic_block structs and edge structs than function
structs).  So I just want to make sure this is not going to
be a problem, before I add these labels to the function
structure.

Any objections to my adding fields to the function structure?
Anyone?

-- Caroline Tice
ctice@apple.com

On Apr 1, 2005, at 2:30 PM, Caroline Tice wrote:

>
>
> The variables are:
>
> unlikely_section_label
> hot_section_label
> cold_section_end_label
> hot_section_end_label
>
> They are used in assemble_start_function and assemble_end_function,
> where they are written out.  They are also used in dbxout.c and
> dwarf2out.c, where they are used to calculate the sizes of the text
> sections for the debugging information.
>
> -- Caroline
> ctice@apple.com
>
> On Apr 1, 2005, at 2:21 PM, Geoffrey Keating wrote:
>
>> Caroline Tice <ctice@apple.com> writes:
>>
>>> I was thinking that since I already have global variables that I was
>>> putting the old strings into, I would just use those global
>>> variables putting the result of calling ASM_GENERATE_INTERNAL_LABEL
>>> into them.  Is this a bad idea?
>>
>> Where are these global variables used?
>>
>> Yes, in general having random routines depend on the contents of
>> random global variables is a bad idea.
>>
>

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-01 22:57                         ` Caroline Tice
@ 2005-04-02  2:22                           ` Mark Mitchell
  2005-04-02  2:51                             ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2005-04-02  2:22 UTC (permalink / raw)
  To: Caroline Tice
  Cc: gcc-patches@gcc.gnu.org Patches, Geoffrey Keating, Daniel Jacobowitz

Caroline Tice wrote:
> Before I go and attach these labels to the function structure, I
> just wanted to double check on something.  The last time
> I tried adding fields to a structure, there turned out to
> be a problem because people did not like the size of
> the structure to be increased (admittedly there are *far*
> more basic_block structs and edge structs than function
> structs).  So I just want to make sure this is not going to
> be a problem, before I add these labels to the function
> structure.
> 
> Any objections to my adding fields to the function structure?
> Anyone?

Do you ever need them after the function has been optimized and emitted? 
  If not, I think they can just stay in your current global variables; 
that's the way that information needed only in a single function has 
been handled.  If they need to live beyond that point, then they should 
go in "struct function".

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: AIX bootstrap failure (was Re: Hot/cold partitioning fixes)
  2005-04-02  2:22                           ` Mark Mitchell
@ 2005-04-02  2:51                             ` Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-04-02  2:51 UTC (permalink / raw)
  To: gcc-patches@gcc.gnu.org Patches

On Fri, Apr 01, 2005 at 06:21:52PM -0800, Mark Mitchell wrote:
> Caroline Tice wrote:
> >Before I go and attach these labels to the function structure, I
> >just wanted to double check on something.  The last time
> >I tried adding fields to a structure, there turned out to
> >be a problem because people did not like the size of
> >the structure to be increased (admittedly there are *far*
> >more basic_block structs and edge structs than function
> >structs).  So I just want to make sure this is not going to
> >be a problem, before I add these labels to the function
> >structure.
> >
> >Any objections to my adding fields to the function structure?
> >Anyone?

Note that I was suggesting adding pointers, either to the strings or to
the CODE_LABEL rtxen - not adding 256-byte char buffers.  Seems a
little excessive!
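
A rough sketch of the size difference, assuming four labels and the
256-byte buffers mentioned above.  The struct names are hypothetical
and neither one is GCC's real "struct function"; only the four field
names come from this thread.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LABEL_BUF_SIZE 256

/* Embedding the label text directly: every function pays for four
   full buffers whether or not it is partitioned.  */
struct toy_labels_by_buffer
{
  char unlikely_section_label[TOY_LABEL_BUF_SIZE];
  char hot_section_label[TOY_LABEL_BUF_SIZE];
  char cold_section_end_label[TOY_LABEL_BUF_SIZE];
  char hot_section_end_label[TOY_LABEL_BUF_SIZE];
};

/* Storing pointers to strings allocated elsewhere: four pointers per
   function, and NULL can mean "no such label".  */
struct toy_labels_by_pointer
{
  const char *unlikely_section_label;
  const char *hot_section_label;
  const char *cold_section_end_label;
  const char *hot_section_end_label;
};

/* Bytes saved per function by the pointer layout.  */
static size_t
toy_bytes_saved (void)
{
  assert (sizeof (struct toy_labels_by_pointer)
          < sizeof (struct toy_labels_by_buffer));
  return sizeof (struct toy_labels_by_buffer)
         - sizeof (struct toy_labels_by_pointer);
}
```

The exact saving depends on the pointer size of the host, so the
sketch computes it rather than hard-coding a number.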

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* powerpc new PLT and GOT
@ 2005-05-12 16:05 Alan Modra
  2005-05-12 16:09 ` Andrew Pinski
                   ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2005-05-12 16:05 UTC (permalink / raw)
  To: gcc-patches

This is where I'm at with gcc support for the new powerpc-linux PLT/GOT
layout (see http://sources.redhat.com/ml/binutils/2005-05/msg00391.html).
I'm not asking for commit approval yet; that ought to wait until I've
thrown together glibc support as well so this can all be tested properly,
but what I have here seems to do the right thing.  So I'm looking for
comments like "That's the Wrong Way.  You ought to ..."

Some things I know need attention:
a) Should the new -fpic PLT/GOT code support be enabled by default?  The
   linker will continue to generate the old GOT/PLT layout until a new
   glibc is available, a consequence of a "bl got-4" used in the current
   crti.o.  This is fortunate, and means we don't need to do a configure
   test on glibc to figure whether the new PLT/GOT code is safe to use.
   However, the new GOT pointer load sequence is larger (but might be
   quicker), and new PLT calls always need the GOT pointer, so code
   increases a little in size.
b) -fPIC should use the new GOT pointer load sequence too, as it is
   faster and smaller than the current -fPIC sequence.
c) The rtl generated by load_toc_v4_PIC_3c matches elf_low.  We get the
   right assembly though.

	* configure.ac (HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config/rs6000/sysv4.opt (mdata-plt, mbss-plt): Add options.
	* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Error if
	-mdata-plt given without assembler support.
	* config/rs6000/rs6000.h (TARGET_DATA_PLT): Undef if not HAVE_AS_REL16.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_DATA_PLT got register load sequence.
	* config/rs6000/rs6000.md (load_toc_v4_PIC_1): Enable for
	TARGET_DATA_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for TARGET_DATA_PLT.
	* config.in: Regenerate.
	* configure: Regenerate.

diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/configure.ac gcc-current/gcc/configure.ac
--- gcc-virgin/gcc/configure.ac	2005-05-09 20:03:03.000000000 +0930
+++ gcc-current/gcc/configure.ac	2005-05-12 18:31:15.000000000 +0930
@@ -2828,6 +2828,24 @@ foo:	nop
       [AC_DEFINE(HAVE_AS_POPCNTB, 1,
 	  [Define if your assembler supports popcntb field.])])
 
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0],,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.opt gcc-current/gcc/config/rs6000/sysv4.opt
--- gcc-virgin/gcc/config/rs6000/sysv4.opt	2005-05-07 17:51:46.000000000 +0930
+++ gcc-current/gcc/config/rs6000/sysv4.opt	2005-05-12 20:27:22.000000000 +0930
@@ -140,3 +140,11 @@ Generate 32-bit code
 mnewlib
 Target RejectNegative
 no description yet
+
+mdata-plt
+Target Report RejectNegative Mask(DATA_PLT)
+Generate code to use a non-exec PLT and GOT
+
+mbss-plt
+Target Report RejectNegative InverseMask(DATA_PLT)
+Generate code for old exec BSS PLT
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.h gcc-current/gcc/config/rs6000/sysv4.h
--- gcc-virgin/gcc/config/rs6000/sysv4.h	2005-05-06 23:34:43.000000000 +0930
+++ gcc-current/gcc/config/rs6000/sysv4.h	2005-05-12 19:41:09.000000000 +0930
@@ -205,6 +205,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_DATA_PLT != ((target_flags & MASK_DATA_PLT) != 0))		\
+    {									\
+      error ("-mdata-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2005-05-09 20:03:12.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.c	2005-05-12 18:45:53.000000000 +0930
@@ -12547,15 +12596,46 @@ rs6000_emit_load_toc_table (int fromprol
 
   if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
-      if (fromprolog)
-	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
-      if (fromprolog)
-	rs6000_maybe_dead (insn);
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      if (TARGET_DATA_PLT)
+	{
+	  char buf[30];
+	  rtx lab, tmp1, tmp2, got;
+
+	  ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+	  lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+	  got = rs6000_got_sym ();
+	  tmp1 = tmp2 = dest;
+	  if (!fromprolog)
+	    {
+	      tmp1 = gen_reg_rtx (Pmode);
+	      tmp2 = gen_reg_rtx (Pmode);
+	    }
+	  insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
+	  if (fromprolog)
+	    rs6000_maybe_dead (insn);
+	  insn = emit_move_insn (tmp1, tempLR);
+	  if (fromprolog)
+	    rs6000_maybe_dead (insn);
+	  insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+	  if (fromprolog)
+	    rs6000_maybe_dead (insn);
+	  insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+	  if (fromprolog)
+	    rs6000_maybe_dead (insn);
+	}
+      else
+	{
+	  insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
+	  if (fromprolog)
+	    rs6000_maybe_dead (insn);
+	  insn = emit_move_insn (dest, tempLR);
+	  if (fromprolog)
+	    rs6000_maybe_dead (insn);
+	}
     }
   else if (TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2)
     {
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.h gcc-current/gcc/config/rs6000/rs6000.h
--- gcc-virgin/gcc/config/rs6000/rs6000.h	2005-05-09 20:03:12.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.h	2005-05-12 18:45:55.000000000 +0930
@@ -144,6 +144,11 @@
 #define TARGET_POPCNTB 0
 #endif
 
+#ifndef HAVE_AS_REL16
+#undef  TARGET_DATA_PLT
+#define TARGET_DATA_PLT 0
+#endif
+
 #define TARGET_32BIT		(! TARGET_64BIT)
 
 /* Emit a dtp-relative reference to a TLS variable.  */
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.md gcc-current/gcc/config/rs6000/rs6000.md
--- gcc-virgin/gcc/config/rs6000/rs6000.md	2005-05-12 12:45:31.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.md	2005-05-12 18:39:45.000000000 +0930
@@ -9812,7 +9810,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_DATA_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -9835,6 +9834,22 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "DEFAULT_ABI == ABI_V4 && flag_pic == 1 && TARGET_DATA_PLT"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "DEFAULT_ABI == ABI_V4 && flag_pic == 1 && TARGET_DATA_PLT"
+  "{cal|addi} %0,%1,%2-%3@l")
 
 ;; If the TOC is shared over a translation unit, as happens with all
 ;; the kinds of PIC that we support, we need to restore the TOC
@@ -9985,6 +10000,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10036,6 +10070,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-12 16:05 powerpc new PLT and GOT Alan Modra
@ 2005-05-12 16:09 ` Andrew Pinski
  2005-05-12 20:00   ` Mike Stump
  2005-05-12 16:53 ` Matt Thomas
  2005-05-19 13:01 ` Alan Modra
  2 siblings, 1 reply; 875+ messages in thread
From: Andrew Pinski @ 2005-05-12 16:09 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches


On May 12, 2005, at 12:05 PM, Alan Modra wrote:

> +      *-*-darwin*)
> +	conftest_s='	.text
> +LCF0:
> +	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;

Does darwin even need this test?

-- Pinski

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-12 16:05 powerpc new PLT and GOT Alan Modra
  2005-05-12 16:09 ` Andrew Pinski
@ 2005-05-12 16:53 ` Matt Thomas
  2005-05-13  0:00   ` Alan Modra
  2005-05-19 13:01 ` Alan Modra
  2 siblings, 1 reply; 875+ messages in thread
From: Matt Thomas @ 2005-05-12 16:53 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

Alan Modra wrote:
> This is where I'm at with gcc support for the new powerpc-linux PLT/GOT
> layout (see http://sources.redhat.com/ml/binutils/2005-05/msg00391.html).
> I'm not asking for commit approval yet; that ought to wait until I've
> thrown together glibc support as well so this can all be tested properly,
> but what I have here seems to do the right thing.  So I'm looking for
> comments like "That's the Wrong Way.  You ought to ..."

Not everyone who uses ABI_V4 uses glibc.  Also, one might try to use
the new compiler/binutils on an older system whose dynamic loader doesn't
support the new PLT/GOT mechanism.

> Some things I know need attention:
> a) Should the new -fpic PLT/GOT code support be enabled by default?  The
>    linker will continue to generate the old GOT/PLT layout until a new
>    glibc is available, a consequence of a "bl got-4" used in the current
>    crti.o.  This is fortunate, and means we don't need to do a configure
>    test on glibc to figure whether the new PLT/GOT code is safe to use.
>    However, the new GOT pointer load sequence is larger (but might be
>    quicker), and new PLT calls always need the GOT pointer, so code
>    increases a little in size.

As I said before, the issue isn't the linker but the dynamic loader.
Unless you do a configure-time test to see whether the dynamic loader
does the right thing, I don't see how you can enable it by default.

-- 
Matt Thomas                     email: matt@3am-software.com
3am Software Foundry              www: http://3am-software.com/bio/matt/
Cupertino, CA              disclaimer: I avow all knowledge of this message.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-12 16:09 ` Andrew Pinski
@ 2005-05-12 20:00   ` Mike Stump
  2005-05-13  0:10     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Mike Stump @ 2005-05-12 20:00 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Alan Modra, gcc-patches

On May 12, 2005, at 9:09 AM, Andrew Pinski wrote:
> On May 12, 2005, at 12:05 PM, Alan Modra wrote:
>> +      *-*-darwin*)
>> +    conftest_s='    .text
>> +LCF0:
>> +    addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
>>
>
> Does darwin even need this test?

GLOBAL_OFFSET_TABLE, what's that?  :-)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-12 16:53 ` Matt Thomas
@ 2005-05-13  0:00   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-05-13  0:00 UTC (permalink / raw)
  To: Matt Thomas; +Cc: gcc-patches

On Thu, May 12, 2005 at 09:54:38AM -0700, Matt Thomas wrote:
> Not everyone who uses ABI_V4 uses glibc.  Also, one might try to use
> the new compiler/binutils on an older system whose dynamic loader doesn't
> support the new PLT/GOT mechanism.

Yes, but see below.

> > Some things I know need attention:
> > a) Should the new -fpic PLT/GOT code support be enabled by default?  The
> >    linker will continue to generate the old GOT/PLT layout until a new
> >    glibc is available, a consequence of a "bl got-4" used in the current
> >    crti.o.  This is fortunate, and means we don't need to do a configure
> >    test on glibc to figure whether the new PLT/GOT code is safe to use.
> >    However, the new GOT pointer load sequence is larger (but might be
> >    quicker), and new PLT calls always need the GOT pointer, so code
> >    increases a little in size.
> 
> As I said before, the issue isn't the linker but the dynamic loader.
> Unless you do a configure-time test to see whether the dynamic loader
> does the right thing, I don't see how you can enable it by default.

We're talking about two different things here, I think.  I'm
specifically referring to the compiler by default generating relocatable
object files suitable for the new GOT/PLT layout.  From these object
files GNU ld can generate a binary that uses either the old PLT scheme
or the new one.  If the old scheme is selected, because ld detects
"bl got-4" instructions, or because --bss-plt is passed to ld, then the
resultant binary will work with an old dynamic linker.  The new
R_PPC_REL16* relocs will only be found in relocatable object files; none
will be found in any fully linked object.
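
For anyone unfamiliar with the @ha/@l split used by the addis/addi
pair in the new GOT pointer load sequence, here is a minimal sketch of
the arithmetic in C.  The function names are made up for illustration
and this models only the 32-bit address computation, not what the
assembler or linker actually does internally.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of OFF, sign-extended the way addi treats its
   immediate (the @l part).  */
static int32_t
toy_part_lo (int32_t off)
{
  return ((off & 0xffff) ^ 0x8000) - 0x8000;
}

/* High-adjusted upper 16 bits of OFF (the @ha part).  Adding 0x8000
   first compensates for the sign extension of the low half.  */
static uint32_t
toy_part_ha (int32_t off)
{
  return (((uint32_t) off + 0x8000u) >> 16) & 0xffffu;
}

/* Model of "addis tmp,base,off@ha ; addi dest,tmp,off@l" using
   32-bit wraparound arithmetic.  */
static uint32_t
toy_addis_addi (uint32_t base, int32_t off)
{
  uint32_t tmp = base + (toy_part_ha (off) << 16);
  return tmp + (uint32_t) toy_part_lo (off);
}

/* The two halves must always recombine to base + off.  */
static void
toy_check_round_trip (uint32_t base, int32_t off)
{
  assert (toy_addis_addi (base, off) == base + (uint32_t) off);
}
```

The case worth checking by hand is an offset whose low half has bit 15
set, e.g. 0x12348000: the low part is -0x8000, so the high part has to
be 0x1235 rather than 0x1234 for the sum to come out right.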

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-12 20:00   ` Mike Stump
@ 2005-05-13  0:10     ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-05-13  0:10 UTC (permalink / raw)
  To: Mike Stump; +Cc: Andrew Pinski, gcc-patches

On Thu, May 12, 2005 at 01:00:11PM -0700, Mike Stump wrote:
> On May 12, 2005, at 9:09 AM, Andrew Pinski wrote:
> >On May 12, 2005, at 12:05 PM, Alan Modra wrote:
> >>+      *-*-darwin*)
> >>+    conftest_s='    .text
> >>+LCF0:
> >>+    addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
> >>
> >
> >Does darwin even need this test?

No, probably not.  Then again, you might want to make use of REL16
relocs somewhere.

> GLOBAL_OFFSET_TABLE, what's that?  :-)

Just some undefined symbol.  Who knows?  :)

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-12 16:05 powerpc new PLT and GOT Alan Modra
  2005-05-12 16:09 ` Andrew Pinski
  2005-05-12 16:53 ` Matt Thomas
@ 2005-05-19 13:01 ` Alan Modra
  2005-05-25 14:26   ` Alan Modra
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-05-19 13:01 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 360 bytes --]

This should be close to the final version.  -fPIC -mdata-plt now works.
See http://sources.redhat.com/ml/binutils/2005-05/msg00592.html for the
gory details.  I'm still testing this, but posted it here because a
number of people have asked about the patch.  mainline, gcc-4.0, and
gcc-3.4 patches attached.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

[-- Attachment #2: gcc.diff --]
[-- Type: text/plain, Size: 16960 bytes --]

	* configure.ac (HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config/rs6000/sysv4.opt (mdata-plt, mbss-plt): Add options.
	* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Error if
	-mdata-plt given without assembler support.
	* config/rs6000/rs6000.h (TARGET_DATA_PLT): Undef if not HAVE_AS_REL16.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_DATA_PLT got register load sequence.
	* config/rs6000/rs6000.md (load_toc_v4_PIC_1): Enable for
	TARGET_DATA_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for TARGET_DATA_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_DATA_PLT
	and -fPIC.
	* config/rs6000/t-rs6000 (DATA_PLT): New shell variable.
	* config/rs6000/t-linux64 (TARGET_LIBGCC2_CFLAGS): Add $DATA_PLT.
	(MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-netbsd (MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-ppcos (MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-ppccomm (CRTSTUFF_T_CFLAGS_S): Likewise.
	* config/rs6000/t-lynx (CRTSTUFF_T_CFLAGS_S): Likewise.

diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/configure.ac gcc-current/gcc/configure.ac
--- gcc-virgin/gcc/configure.ac	2005-05-19 19:10:56.000000000 +0930
+++ gcc-current/gcc/configure.ac	2005-05-19 19:13:04.000000000 +0930
@@ -2828,6 +2828,24 @@ foo:	nop
       [AC_DEFINE(HAVE_AS_POPCNTB, 1,
 	  [Define if your assembler supports popcntb field.])])
 
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0], -a32,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.opt gcc-current/gcc/config/rs6000/sysv4.opt
--- gcc-virgin/gcc/config/rs6000/sysv4.opt	2005-05-19 19:11:10.000000000 +0930
+++ gcc-current/gcc/config/rs6000/sysv4.opt	2005-05-19 19:13:21.000000000 +0930
@@ -139,3 +139,11 @@ Generate 32-bit code
 mnewlib
 Target RejectNegative
 no description yet
+
+mdata-plt
+Target Report RejectNegative Mask(DATA_PLT)
+Generate code to use a non-exec PLT and GOT
+
+mbss-plt
+Target Report RejectNegative InverseMask(DATA_PLT)
+Generate code for old exec BSS PLT
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.h gcc-current/gcc/config/rs6000/sysv4.h
--- gcc-virgin/gcc/config/rs6000/sysv4.h	2005-05-06 23:34:43.000000000 +0930
+++ gcc-current/gcc/config/rs6000/sysv4.h	2005-05-18 23:42:08.000000000 +0930
@@ -205,6 +205,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_DATA_PLT != ((target_flags & MASK_DATA_PLT) != 0))		\
+    {									\
+      error ("-mdata-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.h gcc-current/gcc/config/rs6000/rs6000.h
--- gcc-virgin/gcc/config/rs6000/rs6000.h	2005-05-09 20:03:12.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.h	2005-05-12 18:45:55.000000000 +0930
@@ -144,6 +144,11 @@
 #define TARGET_POPCNTB 0
 #endif
 
+#ifndef HAVE_AS_REL16
+#undef  TARGET_DATA_PLT
+#define TARGET_DATA_PLT 0
+#endif
+
 #define TARGET_32BIT		(! TARGET_64BIT)
 
 /* Emit a dtp-relative reference to a TLS variable.  */
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2005-05-19 19:11:10.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.c	2005-05-19 19:13:18.000000000 +0930
@@ -12572,15 +12621,49 @@ rs6000_emit_load_toc_table (int fromprol
   rtx dest, insn;
   dest = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+  if (TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
+      char buf[30];
+      rtx lab, tmp1, tmp2, got, tempLR;
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+      if (flag_pic == 2)
+	got = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
+      else
+	got = rs6000_got_sym ();
+      tmp1 = tmp2 = dest;
+      if (!fromprolog)
+	{
+	  tmp1 = gen_reg_rtx (Pmode);
+	  tmp2 = gen_reg_rtx (Pmode);
+	}
+      tempLR = (fromprolog
+		? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		: gen_reg_rtx (Pmode));
+      insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (tmp1, tempLR);
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+    }
+  else if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+    {
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
       if (fromprolog)
 	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
+      insn = emit_move_insn (dest, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
     }
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.md gcc-current/gcc/config/rs6000/rs6000.md
--- gcc-virgin/gcc/config/rs6000/rs6000.md	2005-05-19 19:11:10.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.md	2005-05-19 21:32:25.000000000 +0930
@@ -7360,26 +7360,6 @@
 \f
 ;; Now define ways of moving data around.
 
-;; Elf specific ways of loading addresses for non-PIC code.
-;; The output of this could be r0, but we make a very strong
-;; preference for a base register because it will usually
-;; be needed there.
-(define_insn "elf_high"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
-	(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
-  "{liu|lis} %0,%1@ha")
-
-(define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
-		   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
-   "@
-    {cal|la} %0,%2@l(%1)
-    {ai|addic} %0,%1,%K2")
-
-
 ;; Set up a register with a value from the GOT table
 
 (define_expand "movsi_got"
@@ -9810,7 +9788,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_DATA_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -9833,6 +9812,22 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cal|addi} %0,%1,%2-%3@l")
 
 ;; If the TOC is shared over a translation unit, as happens with all
 ;; the kinds of PIC that we support, we need to restore the TOC
@@ -9867,6 +9862,25 @@
     rs6000_emit_load_toc_table (FALSE);
   DONE;
 }")
+
+;; Elf specific ways of loading addresses for non-PIC code.
+;; The output of this could be r0, but we make a very strong
+;; preference for a base register because it will usually
+;; be needed there.
+(define_insn "elf_high"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
+	(high:SI (match_operand 1 "" "")))]
+  "TARGET_ELF && ! TARGET_64BIT"
+  "{liu|lis} %0,%1@ha")
+
+(define_insn "elf_low"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+		   (match_operand 2 "" "")))]
+   "TARGET_ELF && ! TARGET_64BIT"
+   "@
+    {cal|la} %0,%2@l(%1)
+    {ai|addic} %0,%1,%K2")
 \f
 ;; A function pointer under AIX is a pointer to a data area whose first word
 ;; contains the actual address of the function, whose second word contains a
@@ -9983,6 +9997,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10034,6 +10067,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))
@@ -10307,7 +10362,18 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 0, 2);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z0@plt" : "bl %z0";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	/* The magic 32768 offset here and in the other sysv call insns
+	   corresponds to the offset of r30 in .got2, as given by LCTOC1.
+	   See sysv4.h:toc_section.  */
+	return "bl %z0+32768@plt";
+      else
+	return "bl %z0@plt";
+    }
+  else
+    return "bl %z0";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10352,7 +10418,15 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 1, 3);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z1@plt" : "bl %z1";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return "bl %z1+32768@plt";
+      else
+	return "bl %z1@plt";
+    }
+  else
+    return "bl %z1";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10567,7 +10641,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z0@plt\" : \"b %z0\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z0+32768@plt\";
+      else
+	return \"b %z0@plt\";
+    }
+  else
+    return \"b %z0\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
@@ -10613,7 +10695,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z1@plt\" : \"b %z1\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z1+32768@plt\";
+      else
+	return \"b %z1@plt\";
+    }
+  else
+    return \"b %z1\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/t-rs6000 gcc-current/gcc/config/rs6000/t-rs6000
--- gcc-virgin/gcc/config/rs6000/t-rs6000	2004-05-26 10:55:14.000000000 +0930
+++ gcc-current/gcc/config/rs6000/t-rs6000	2005-05-18 14:29:13.000000000 +0930
@@ -18,3 +18,6 @@ rs6000-c.o: $(srcdir)/config/rs6000/rs60
 
 # The rs6000 backend doesn't cause warnings in these files.
 insn-conditions.o-warn =
+
+# Whether to use -mdata-plt in other t-files.
+DATA_PLT := $(shell sed -n -e 's/\#define HAVE_AS_REL16 1/mdata-plt/p' auto-host.h)
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/t-linux64 gcc-current/gcc/config/rs6000/t-linux64
--- gcc-virgin/gcc/config/rs6000/t-linux64	2004-04-15 18:55:03.000000000 +0930
+++ gcc-current/gcc/config/rs6000/t-linux64	2005-05-19 20:00:06.000000000 +0930
@@ -4,13 +4,13 @@
 LIB2FUNCS_EXTRA = tramp.S $(srcdir)/config/rs6000/ppc64-fp.c \
 	$(srcdir)/config/rs6000/darwin-ldouble.c
 
-TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC -specs=bispecs
+TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC $(DATA_PLT:m%=-m%) -specs=bispecs
 
 SHLIB_MAPFILES += $(srcdir)/config/rs6000/libgcc-ppc64.ver
 
 MULTILIB_OPTIONS        = m64/m32 msoft-float
 MULTILIB_DIRNAMES       = 64 32 nof
-MULTILIB_EXTRA_OPTS     = fPIC mstrict-align
+MULTILIB_EXTRA_OPTS     = fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS     = m64/msoft-float
 MULTILIB_EXCLUSIONS     = m64/!m32/msoft-float
 MULTILIB_OSDIRNAMES	= ../lib64 ../lib nof
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/t-lynx gcc-current/gcc/config/rs6000/t-lynx
--- gcc-virgin/gcc/config/rs6000/t-lynx	2004-08-05 14:25:37.000000000 +0930
+++ gcc-current/gcc/config/rs6000/t-lynx	2005-05-19 20:00:05.000000000 +0930
@@ -29,9 +29,9 @@ EXTRA_MULTILIB_PARTS = crtbegin.o crtend
 # If .sdata is enabled __CTOR_{LIST,END}__ go into .sdata instead of
 # .ctors.
 CRTSTUFF_T_CFLAGS = -mno-sdata 
- 
+
 # Compile crtbeginS.o and crtendS.o with pic. 
-CRTSTUFF_T_CFLAGS_S = -fPIC -mno-sdata 
+CRTSTUFF_T_CFLAGS_S = -fPIC $(DATA_PLT:m%=-m%) -mno-sdata 
 
 Local Variables:
 mode: makefile
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/t-netbsd gcc-current/gcc/config/rs6000/t-netbsd
--- gcc-virgin/gcc/config/rs6000/t-netbsd	2002-11-26 10:35:07.000000000 +1030
+++ gcc-current/gcc/config/rs6000/t-netbsd	2005-05-19 20:00:05.000000000 +0930
@@ -26,7 +26,7 @@ MULTILIB_MATCHES_FLOAT	= msoft-float=mcp
 
 MULTILIB_OPTIONS	= msoft-float
 MULTILIB_DIRNAMES	= soft-float
-MULTILIB_EXTRA_OPTS	= fPIC mstrict-align
+MULTILIB_EXTRA_OPTS	= fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS	=
 
 MULTILIB_MATCHES	= ${MULTILIB_MATCHES_FLOAT}
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/t-ppccomm gcc-current/gcc/config/rs6000/t-ppccomm
--- gcc-virgin/gcc/config/rs6000/t-ppccomm	2002-12-20 22:53:07.000000000 +1030
+++ gcc-current/gcc/config/rs6000/t-ppccomm	2005-05-19 20:00:04.000000000 +0930
@@ -60,4 +60,4 @@ $(T)crtsavres$(objext): crtsavres.S
 CRTSTUFF_T_CFLAGS = -msdata=none
 # Make sure crt*.o are built with -fPIC even if configured with 
 # --enable-shared --disable-multilib
-CRTSTUFF_T_CFLAGS_S = -fPIC -msdata=none
+CRTSTUFF_T_CFLAGS_S = -fPIC $(DATA_PLT:m%=-m%) -msdata=none
diff -urp -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/t-ppcos gcc-current/gcc/config/rs6000/t-ppcos
--- gcc-virgin/gcc/config/rs6000/t-ppcos	2001-11-25 00:32:46.000000000 +1030
+++ gcc-current/gcc/config/rs6000/t-ppcos	2005-05-19 20:00:04.000000000 +0930
@@ -2,7 +2,7 @@
 
 MULTILIB_OPTIONS	= msoft-float
 MULTILIB_DIRNAMES	= nof
-MULTILIB_EXTRA_OPTS	= fPIC mstrict-align
+MULTILIB_EXTRA_OPTS	= fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS	= 
 
 MULTILIB_MATCHES	= ${MULTILIB_MATCHES_FLOAT}

[-- Attachment #3: gcc4.diff --]
[-- Type: text/plain, Size: 17753 bytes --]

	* configure.ac (HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config/rs6000/sysv4.h (MASK_DATA_PLT): Define.
	(SUBTARGET_SWITCHES): Add "data-plt" and "bss-plt".  Move "newlib".
	(SUBTARGET_OVERRIDE_OPTIONS): Error if -mdata-plt given without
	assembler support.
	* config/rs6000/rs6000.h: Update target_flags free bits comment.
	(TARGET_DATA_PLT): Define.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_DATA_PLT got register load sequence.
	* config/rs6000/rs6000.md (elf_high, elf_low): Move after toc load
	insns.
	(load_toc_v4_PIC_1): Enable for TARGET_DATA_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for TARGET_DATA_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_DATA_PLT
	and -fPIC.
	* config/rs6000/t-rs6000 (DATA_PLT): New shell variable.
	* config/rs6000/t-linux64 (TARGET_LIBGCC2_CFLAGS): Add $DATA_PLT.
	(MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-netbsd (MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-ppcos (MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-ppccomm (CRTSTUFF_T_CFLAGS_S): Likewise.
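
For readers following the new load_toc_v4_PIC_3b and load_toc_v4_PIC_3c
insns: after "bcl 20,31,label" leaves the label address in LR, the pair
"addis %0,%1,got-label@ha" / "addi %0,%1,got-label@l" adds a full 32-bit
offset in two 16-bit halves.  The sketch below models only that arithmetic
in plain C.  The helper names (ha16, lo16, sext16, addis_addi) are made up
for illustration and are not part of the patch or of GCC; the rounding rule
for @ha is the usual PowerPC ELF one and is stated here as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the "high adjusted" half that sym@ha denotes.
   The 0x8000 bias compensates for addi sign-extending the low half.  */
static uint16_t ha16 (uint32_t x)
{
  return (uint16_t) ((x + 0x8000u) >> 16);
}

/* Hypothetical helper: the low half that sym@l denotes.  */
static uint16_t lo16 (uint32_t x)
{
  return (uint16_t) (x & 0xffffu);
}

/* Sign-extend a 16-bit immediate to 32 bits, as addi does.  */
static uint32_t sext16 (uint16_t v)
{
  return (v & 0x8000u) ? ((uint32_t) v | 0xffff0000u) : (uint32_t) v;
}

/* Model of "addis tmp,base,off@ha" followed by "addi dest,tmp,off@l"
   on a 32-bit target; all arithmetic wraps modulo 2^32.  */
static uint32_t addis_addi (uint32_t base, uint32_t off)
{
  uint32_t tmp = base + ((uint32_t) ha16 (off) << 16);
  return tmp + sext16 (lo16 (off));
}
```

With base standing for the address of the LCF label and off for the
difference between the GOT symbol and that label, addis_addi (base, off)
gives base + off for any 32-bit offset, which is why two insns suffice.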

diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/configure.ac gcc-4.0/gcc/configure.ac
--- gcc-4.0-virgin/gcc/configure.ac	2005-05-11 18:20:56.000000000 +0930
+++ gcc-4.0/gcc/configure.ac	2005-05-19 21:02:05.000000000 +0930
@@ -2762,6 +2762,25 @@ foo:	nop
       [$conftest_s],,
       [AC_DEFINE(HAVE_AS_MFCRF, 1,
 	  [Define if your assembler supports mfcr field.])])
+
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0], -a32,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/sysv4.h gcc-4.0/gcc/config/rs6000/sysv4.h
--- gcc-4.0-virgin/gcc/config/rs6000/sysv4.h	2005-02-16 09:06:57.000000000 +1030
+++ gcc-4.0/gcc/config/rs6000/sysv4.h	2005-05-19 21:02:05.000000000 +0930
@@ -55,6 +55,7 @@ extern enum rs6000_sdata_type rs6000_sda
 #define	MASK_REGNAMES		0x02000000	/* Use alternate register names.  */
 #define	MASK_PROTOTYPE		0x01000000	/* Only prototyped fcns pass variable args.  */
 #define MASK_NO_BITFIELD_WORD	0x00800000	/* Bitfields cannot cross word boundaries */
+#define MASK_DATA_PLT		0x00400000	/* Use non-exec PLT/GOT.  */
 
 #define	TARGET_NO_BITFIELD_TYPE	(target_flags & MASK_NO_BITFIELD_TYPE)
 #define	TARGET_STRICT_ALIGN	(target_flags & MASK_STRICT_ALIGN)
@@ -149,12 +150,16 @@ extern const char *rs6000_tls_size_strin
     N_("Set the PPC_EMB bit in the ELF flags header") },		\
   { "windiss",		 0, N_("Use the WindISS simulator") },		\
   { "shlib",		 0, N_("no description yet") },			\
+  { "newlib",		 0, N_("no description yet") },			\
   { "64",		 MASK_64BIT | MASK_POWERPC64 | MASK_POWERPC,	\
 			 N_("Generate 64-bit code") },			\
   { "32",		 - (MASK_64BIT | MASK_POWERPC64),		\
 			 N_("Generate 32-bit code") },			\
-  EXTRA_SUBTARGET_SWITCHES						\
-  { "newlib",		 0, N_("no description yet") },
+  { "data-plt",		 MASK_DATA_PLT,					\
+			 N_("Generate code for non-exec PLT and GOT") },\
+  { "bss-plt",		 -MASK_DATA_PLT,				\
+			 N_("Generate code for exec BSS PLT") },	\
+  EXTRA_SUBTARGET_SWITCHES
 
 /* This is meant to be redefined in the host dependent files.  */
 #define EXTRA_SUBTARGET_SWITCHES
@@ -299,6 +304,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_DATA_PLT != (target_flags & MASK_DATA_PLT))		\
+    {									\
+      error ("-mdata-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/rs6000.h gcc-4.0/gcc/config/rs6000/rs6000.h
--- gcc-4.0-virgin/gcc/config/rs6000/rs6000.h	2005-03-03 08:34:37.000000000 +1030
+++ gcc-4.0/gcc/config/rs6000/rs6000.h	2005-05-19 21:02:05.000000000 +0930
@@ -201,8 +201,8 @@ extern int target_flags;
 /* Use single field mfcr instruction.  */
 #define MASK_MFCRF		0x00080000
 
-/* The only remaining free bits are 0x00600000.  linux64.h uses
-   0x00100000, and sysv4.h uses 0x00800000 -> 0x40000000.
+/* The only remaining free bit is 0x00200000.  linux64.h uses
+   0x00100000, and sysv4.h uses 0x00400000 -> 0x40000000.
    0x80000000 is not available because target_flags is signed.  */
 
 #define TARGET_POWER		(target_flags & MASK_POWER)
@@ -234,6 +234,11 @@ extern int target_flags;
 #define TARGET_MFCRF 0
 #endif
 
+#ifdef HAVE_AS_REL16
+#define TARGET_DATA_PLT		(target_flags & MASK_DATA_PLT)
+#else
+#define TARGET_DATA_PLT 0
+#endif
 
 #define TARGET_32BIT		(! TARGET_64BIT)
 #define TARGET_HARD_FLOAT	(! TARGET_SOFT_FLOAT)
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/rs6000.c gcc-4.0/gcc/config/rs6000/rs6000.c
--- gcc-4.0-virgin/gcc/config/rs6000/rs6000.c	2005-05-11 18:23:48.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/rs6000.c	2005-05-19 21:02:05.000000000 +0930
@@ -13466,15 +13520,49 @@ rs6000_emit_load_toc_table (int fromprol
   rtx dest, insn;
   dest = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+  if (TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
+      char buf[30];
+      rtx lab, tmp1, tmp2, got, tempLR;
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+      if (flag_pic == 2)
+	got = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
+      else
+	got = rs6000_got_sym ();
+      tmp1 = tmp2 = dest;
+      if (!fromprolog)
+	{
+	  tmp1 = gen_reg_rtx (Pmode);
+	  tmp2 = gen_reg_rtx (Pmode);
+	}
+      tempLR = (fromprolog
+		? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		: gen_reg_rtx (Pmode));
+      insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (tmp1, tempLR);
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+    }
+  else if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+    {
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
       if (fromprolog)
 	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
+      insn = emit_move_insn (dest, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
     }
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/rs6000.md gcc-4.0/gcc/config/rs6000/rs6000.md
--- gcc-4.0-virgin/gcc/config/rs6000/rs6000.md	2005-03-31 21:02:13.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/rs6000.md	2005-05-19 21:15:01.000000000 +0930
@@ -7653,26 +7653,6 @@
 \f
 ;; Now define ways of moving data around.
 
-;; Elf specific ways of loading addresses for non-PIC code.
-;; The output of this could be r0, but we make a very strong
-;; preference for a base register because it will usually
-;; be needed there.
-(define_insn "elf_high"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
-	(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
-  "{liu|lis} %0,%1@ha")
-
-(define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
-		   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
-   "@
-    {cal|la} %0,%2@l(%1)
-    {ai|addic} %0,%1,%K2")
-
-
 ;; Set up a register with a value from the GOT table
 
 (define_expand "movsi_got"
@@ -10133,7 +10111,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_DATA_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -10156,6 +10135,23 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cal|addi} %0,%1,%2-%3@l")
+
 
 ;; If the TOC is shared over a translation unit, as happens with all
 ;; the kinds of PIC that we support, we need to restore the TOC
@@ -10190,6 +10186,26 @@
     rs6000_emit_load_toc_table (FALSE);
   DONE;
 }")
+
+;; Elf specific ways of loading addresses for non-PIC code.
+;; The output of this could be r0, but we make a very strong
+;; preference for a base register because it will usually
+;; be needed there.
+(define_insn "elf_high"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
+	(high:SI (match_operand 1 "" "")))]
+  "TARGET_ELF && ! TARGET_64BIT"
+  "{liu|lis} %0,%1@ha")
+
+(define_insn "elf_low"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+		   (match_operand 2 "" "")))]
+   "TARGET_ELF && ! TARGET_64BIT"
+   "@
+    {cal|la} %0,%2@l(%1)
+    {ai|addic} %0,%1,%K2")
+
 \f
 ;; A function pointer under AIX is a pointer to a data area whose first word
 ;; contains the actual address of the function, whose second word contains a
@@ -10306,6 +10322,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10354,6 +10389,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))
@@ -10624,7 +10681,18 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 0, 2);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z0@plt" : "bl %z0";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	/* The magic 32768 offset here and in the other sysv call insns
+	   corresponds to the offset of r30 in .got2, as given by LCTOC1.
+	   See sysv4.h:toc_section.  */
+	return "bl %z0+32768@plt";
+      else
+	return "bl %z0@plt";
+    }
+  else
+    return "bl %z0";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10669,7 +10737,15 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 1, 3);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z1@plt" : "bl %z1";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return "bl %z1+32768@plt";
+      else
+	return "bl %z1@plt";
+    }
+  else
+    return "bl %z1";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10884,7 +10960,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z0@plt\" : \"b %z0\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z0+32768@plt\";
+      else
+	return \"b %z0@plt\";
+    }
+  else
+    return \"b %z0\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
@@ -10930,7 +11014,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z1@plt\" : \"b %z1\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z1+32768@plt\";
+      else
+	return \"b %z1@plt\";
+    }
+  else
+    return \"b %z1\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/t-rs6000 gcc-4.0/gcc/config/rs6000/t-rs6000
--- gcc-4.0-virgin/gcc/config/rs6000/t-rs6000	2004-05-26 10:55:14.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/t-rs6000	2005-05-19 21:02:05.000000000 +0930
@@ -18,3 +18,6 @@ rs6000-c.o: $(srcdir)/config/rs6000/rs60
 
 # The rs6000 backend doesn't cause warnings in these files.
 insn-conditions.o-warn =
+
+# Whether to use -mdata-plt in other t-files.
+DATA_PLT := $(shell sed -n -e 's/\#define HAVE_AS_REL16 1/mdata-plt/p' auto-host.h)
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/t-linux64 gcc-4.0/gcc/config/rs6000/t-linux64
--- gcc-4.0-virgin/gcc/config/rs6000/t-linux64	2004-04-15 18:55:03.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/t-linux64	2005-05-19 21:06:29.000000000 +0930
@@ -4,13 +4,13 @@
 LIB2FUNCS_EXTRA = tramp.S $(srcdir)/config/rs6000/ppc64-fp.c \
 	$(srcdir)/config/rs6000/darwin-ldouble.c
 
-TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC -specs=bispecs
+TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC $(DATA_PLT:m%=-m%) -specs=bispecs
 
 SHLIB_MAPFILES += $(srcdir)/config/rs6000/libgcc-ppc64.ver
 
 MULTILIB_OPTIONS        = m64/m32 msoft-float
 MULTILIB_DIRNAMES       = 64 32 nof
-MULTILIB_EXTRA_OPTS     = fPIC mstrict-align
+MULTILIB_EXTRA_OPTS     = fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS     = m64/msoft-float
 MULTILIB_EXCLUSIONS     = m64/!m32/msoft-float
 MULTILIB_OSDIRNAMES	= ../lib64 ../lib nof
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/t-netbsd gcc-4.0/gcc/config/rs6000/t-netbsd
--- gcc-4.0-virgin/gcc/config/rs6000/t-netbsd	2002-11-26 10:35:07.000000000 +1030
+++ gcc-4.0/gcc/config/rs6000/t-netbsd	2005-05-19 21:02:05.000000000 +0930
@@ -26,7 +26,7 @@ MULTILIB_MATCHES_FLOAT	= msoft-float=mcp
 
 MULTILIB_OPTIONS	= msoft-float
 MULTILIB_DIRNAMES	= soft-float
-MULTILIB_EXTRA_OPTS	= fPIC mstrict-align
+MULTILIB_EXTRA_OPTS	= fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS	=
 
 MULTILIB_MATCHES	= ${MULTILIB_MATCHES_FLOAT}
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/t-ppccomm gcc-4.0/gcc/config/rs6000/t-ppccomm
--- gcc-4.0-virgin/gcc/config/rs6000/t-ppccomm	2002-12-20 22:53:07.000000000 +1030
+++ gcc-4.0/gcc/config/rs6000/t-ppccomm	2005-05-19 21:02:05.000000000 +0930
@@ -60,4 +60,4 @@ $(T)crtsavres$(objext): crtsavres.S
 CRTSTUFF_T_CFLAGS = -msdata=none
 # Make sure crt*.o are built with -fPIC even if configured with 
 # --enable-shared --disable-multilib
-CRTSTUFF_T_CFLAGS_S = -fPIC -msdata=none
+CRTSTUFF_T_CFLAGS_S = -fPIC $(DATA_PLT:m%=-m%) -msdata=none
diff -urp -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/t-ppcos gcc-4.0/gcc/config/rs6000/t-ppcos
--- gcc-4.0-virgin/gcc/config/rs6000/t-ppcos	2001-11-25 00:32:46.000000000 +1030
+++ gcc-4.0/gcc/config/rs6000/t-ppcos	2005-05-19 21:02:05.000000000 +0930
@@ -2,7 +2,7 @@
 
 MULTILIB_OPTIONS	= msoft-float
 MULTILIB_DIRNAMES	= nof
-MULTILIB_EXTRA_OPTS	= fPIC mstrict-align
+MULTILIB_EXTRA_OPTS	= fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS	= 
 
 MULTILIB_MATCHES	= ${MULTILIB_MATCHES_FLOAT}

[-- Attachment #4: gcc34.diff --]
[-- Type: text/plain, Size: 17905 bytes --]

	* configure.ac (HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config/rs6000/sysv4.h (MASK_DATA_PLT): Define.
	(SUBTARGET_SWITCHES): Add "data-plt" and "bss-plt".  Move "newlib".
	(SUBTARGET_OVERRIDE_OPTIONS): Error if -mdata-plt given without
	assembler support.
	* config/rs6000/rs6000.h: Update target_flags free bits comment.
	(TARGET_DATA_PLT): Define.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_DATA_PLT got register load sequence.
	* config/rs6000/rs6000.md (elf_high, elf_low): Move after toc load
	insns.
	(load_toc_v4_PIC_1): Enable for TARGET_DATA_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for TARGET_DATA_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_DATA_PLT
	and -fPIC.
	* config/rs6000/t-rs6000 (DATA_PLT): New shell variable.
	* config/rs6000/t-linux64 (TARGET_LIBGCC2_CFLAGS): Add $DATA_PLT.
	(MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-netbsd (MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-ppcos (MULTILIB_EXTRA_OPTS): Likewise.
	* config/rs6000/t-ppccomm (CRTSTUFF_T_CFLAGS_S): Likewise.
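
The interaction between the new "data-plt" / "bss-plt" switches and the
SUBTARGET_OVERRIDE_OPTIONS check can be sketched in plain C.  This is a
model under stated assumptions, not code from the patch: apply_switch and
data_plt_unsupported are hypothetical helpers, have_as_rel16 stands in for
the configure-time HAVE_AS_REL16 macro, and the "positive value sets bits,
negative value clears them" rule is assumed to be how the switch table is
applied.  Only MASK_DATA_PLT and its value come from the patch.

```c
#include <assert.h>

#define MASK_DATA_PLT 0x00400000	/* Value taken from the sysv4.h hunk.  */

/* Hypothetical helper: apply one switch-table entry to target_flags.
   A positive value sets the bits, a negative value clears them.  */
static int apply_switch (int target_flags, int value)
{
  if (value >= 0)
    return target_flags | value;
  return target_flags & ~(-value);
}

/* Hypothetical helper: model of the new override check.  Returns nonzero
   when "-mdata-plt not supported by your assembler" would be reported.
   Without assembler support TARGET_DATA_PLT is the constant 0, so it
   differs from the flag bit exactly when the user asked for -mdata-plt.  */
static int data_plt_unsupported (int target_flags, int have_as_rel16)
{
  int target_data_plt = have_as_rel16 ? (target_flags & MASK_DATA_PLT) : 0;
  return target_data_plt != (target_flags & MASK_DATA_PLT);
}
```

In this model -mdata-plt corresponds to apply_switch (flags, MASK_DATA_PLT)
and -mbss-plt to apply_switch (flags, -MASK_DATA_PLT); the error fires only
for the first of these when the assembler lacks the rel16 relocs.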

diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/configure.ac gcc-3.4/gcc/configure.ac
--- gcc-3.4-virgin/gcc/configure.ac	2005-01-13 16:01:15.000000000 +1030
+++ gcc-3.4/gcc/configure.ac	2005-05-18 14:17:30.000000000 +0930
@@ -2465,6 +2467,25 @@ changequote([,])dnl
       [$conftest_s],,
       [AC_DEFINE(HAVE_AS_MFCRF, 1,
 	  [Define if your assembler supports mfcr field.])])
+
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0], -a32,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/sysv4.h gcc-3.4/gcc/config/rs6000/sysv4.h
--- gcc-3.4-virgin/gcc/config/rs6000/sysv4.h	2005-02-14 20:08:09.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/sysv4.h	2005-05-18 23:41:45.000000000 +0930
@@ -55,6 +55,7 @@ extern enum rs6000_sdata_type rs6000_sda
 #define	MASK_REGNAMES		0x02000000	/* Use alternate register names.  */
 #define	MASK_PROTOTYPE		0x01000000	/* Only prototyped fcns pass variable args.  */
 #define MASK_NO_BITFIELD_WORD	0x00800000	/* Bitfields cannot cross word boundaries */
+#define MASK_DATA_PLT		0x00400000	/* Use non-exec PLT/GOT.  */
 
 #define	TARGET_NO_BITFIELD_TYPE	(target_flags & MASK_NO_BITFIELD_TYPE)
 #define	TARGET_STRICT_ALIGN	(target_flags & MASK_STRICT_ALIGN)
@@ -149,12 +150,16 @@ extern const char *rs6000_tls_size_strin
     N_("Set the PPC_EMB bit in the ELF flags header") },		\
   { "windiss",           0, N_("Use the WindISS simulator") },          \
   { "shlib",		 0, N_("no description yet") },			\
+  { "newlib",		 0, N_("no description yet") },			\
   { "64",		 MASK_64BIT | MASK_POWERPC64 | MASK_POWERPC,	\
 			 N_("Generate 64-bit code") },			\
   { "32",		 - (MASK_64BIT | MASK_POWERPC64),		\
 			 N_("Generate 32-bit code") },			\
-  EXTRA_SUBTARGET_SWITCHES						\
-  { "newlib",		 0, N_("no description yet") },
+  { "data-plt",		 MASK_DATA_PLT,					\
+			 N_("Generate code for non-exec PLT and GOT") },\
+  { "bss-plt",		 -MASK_DATA_PLT,				\
+			 N_("Generate code for exec BSS PLT") },	\
+  EXTRA_SUBTARGET_SWITCHES
 
 /* This is meant to be redefined in the host dependent files.  */
 #define EXTRA_SUBTARGET_SWITCHES
@@ -294,6 +304,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_DATA_PLT != (target_flags & MASK_DATA_PLT))		\
+    {									\
+      error ("-mdata-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/rs6000.h gcc-3.4/gcc/config/rs6000/rs6000.h
--- gcc-3.4-virgin/gcc/config/rs6000/rs6000.h	2005-02-22 15:19:24.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/rs6000.h	2005-05-17 16:53:52.000000000 +0930
@@ -197,8 +201,8 @@ extern int target_flags;
 /* Use single field mfcr instruction.  */
 #define MASK_MFCRF		0x00080000
 
-/* The only remaining free bits are 0x00600000.  linux64.h uses
-   0x00100000, and sysv4.h uses 0x00800000 -> 0x40000000.
+/* The only remaining free bit is 0x00200000.  linux64.h uses
+   0x00100000, and sysv4.h uses 0x00400000 -> 0x40000000.
    0x80000000 is not available because target_flags is signed.  */
 
 #define TARGET_POWER		(target_flags & MASK_POWER)
@@ -230,6 +234,11 @@ extern int target_flags;
 #define TARGET_MFCRF 0
 #endif
 
+#ifdef HAVE_AS_REL16
+#define TARGET_DATA_PLT		(target_flags & MASK_DATA_PLT)
+#else
+#define TARGET_DATA_PLT 0
+#endif
 
 #define TARGET_32BIT		(! TARGET_64BIT)
 #define TARGET_HARD_FLOAT	(! TARGET_SOFT_FLOAT)
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/rs6000.c gcc-3.4/gcc/config/rs6000/rs6000.c
--- gcc-3.4-virgin/gcc/config/rs6000/rs6000.c	2005-04-29 09:52:17.000000000 +0930
+++ gcc-3.4/gcc/config/rs6000/rs6000.c	2005-05-18 22:59:46.000000000 +0930
@@ -11452,15 +11502,49 @@ rs6000_emit_load_toc_table (int fromprol
   rtx dest, insn;
   dest = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+  if (TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
+      char buf[30];
+      rtx lab, tmp1, tmp2, got, tempLR;
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+      if (flag_pic == 2)
+	got = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
+      else
+	got = rs6000_got_sym ();
+      tmp1 = tmp2 = dest;
+      if (!fromprolog)
+	{
+	  tmp1 = gen_reg_rtx (Pmode);
+	  tmp2 = gen_reg_rtx (Pmode);
+	}
+      tempLR = (fromprolog
+		? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		: gen_reg_rtx (Pmode));
+      insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
       if (fromprolog)
 	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
+      insn = emit_move_insn (tmp1, tempLR);
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+    }
+  else if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+    {
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (dest, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
     }
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/rs6000.md gcc-3.4/gcc/config/rs6000/rs6000.md
--- gcc-3.4-virgin/gcc/config/rs6000/rs6000.md	2005-03-31 21:07:06.000000000 +0930
+++ gcc-3.4/gcc/config/rs6000/rs6000.md	2005-05-19 20:14:57.000000000 +0930
@@ -7454,25 +7454,6 @@
 \f
 ;; Now define ways of moving data around.
 
-;; Elf specific ways of loading addresses for non-PIC code.
-;; The output of this could be r0, but we make a very strong
-;; preference for a base register because it will usually
-;; be needed there.
-(define_insn "elf_high"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
-	(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
-  "{liu|lis} %0,%1@ha")
-
-(define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
-		   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
-   "@
-    {cal|la} %0,%2@l(%1)
-    {ai|addic} %0,%1,%K2")
-
 ;; Mach-O PIC trickery.
 (define_insn "macho_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
@@ -10044,7 +10026,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_DATA_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -10067,6 +10050,23 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cal|addi} %0,%1,%2-%3@l")
+
 (define_insn "load_macho_picbase"
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(unspec:SI [(match_operand:SI 1 "immediate_operand" "s")]
@@ -10119,6 +10119,26 @@
     rs6000_emit_load_toc_table (FALSE);
   DONE;
 }")
+
+;; Elf specific ways of loading addresses for non-PIC code.
+;; The output of this could be r0, but we make a very strong
+;; preference for a base register because it will usually
+;; be needed there.
+(define_insn "elf_high"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
+	(high:SI (match_operand 1 "" "")))]
+  "TARGET_ELF && ! TARGET_64BIT"
+  "{liu|lis} %0,%1@ha")
+
+(define_insn "elf_low"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+		   (match_operand 2 "" "")))]
+   "TARGET_ELF && ! TARGET_64BIT"
+   "@
+    {cal|la} %0,%2@l(%1)
+    {ai|addic} %0,%1,%K2")
+
 \f
 ;; A function pointer under AIX is a pointer to a data area whose first word
 ;; contains the actual address of the function, whose second word contains a
@@ -10235,6 +10255,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10283,6 +10322,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))
@@ -10553,7 +10613,18 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 0, 2);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z0@plt" : "bl %z0";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	/* The magic 32768 offset here and in the other sysv call insns
+	   corresponds to the offset of r30 in .got2, as given by LCTOC1.
+	   See sysv4.h:toc_section.  */
+	return "bl %z0+32768@plt";
+      else
+	return "bl %z0@plt";
+    }
+  else
+    return "bl %z0";
 #endif     
 }
   [(set_attr "type" "branch,branch")
@@ -10598,7 +10669,15 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 1, 3);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z1@plt" : "bl %z1";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return "bl %z1+32768@plt";
+      else
+	return "bl %z1@plt";
+    }
+  else
+    return "bl %z1";
 #endif     
 }
   [(set_attr "type" "branch,branch")
@@ -10813,7 +10891,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z0@plt\" : \"b %z0\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z0+32768@plt\";
+      else
+	return \"b %z0@plt\";
+    }
+  else
+    return \"b %z0\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
@@ -10859,7 +10944,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z1@plt\" : \"b %z1\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z1+32768@plt\";
+      else
+	return \"b %z1@plt\";
+    }
+  else
+    return \"b %z1\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/t-rs6000 gcc-3.4/gcc/config/rs6000/t-rs6000
--- gcc-3.4-virgin/gcc/config/rs6000/t-rs6000	2003-06-16 10:39:14.000000000 +0930
+++ gcc-3.4/gcc/config/rs6000/t-rs6000	2005-05-18 14:27:25.000000000 +0930
@@ -18,3 +18,6 @@ rs6000-c.o: $(srcdir)/config/rs6000/rs60
 
 # The rs6000 backend doesn't cause warnings in these files.
 insn-conditions.o-warn =
+
+# Whether to use -mdata-plt in other t-files.
+DATA_PLT := $(shell sed -n -e 's/\#define HAVE_AS_REL16 1/mdata-plt/p' auto-host.h)
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/t-linux64 gcc-3.4/gcc/config/rs6000/t-linux64
--- gcc-3.4-virgin/gcc/config/rs6000/t-linux64	2005-03-02 09:09:26.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/t-linux64	2005-05-19 20:16:53.000000000 +0930
@@ -5,13 +5,13 @@ LIB2FUNCS_EXTRA = tramp.S $(srcdir)/conf
 LIB2FUNCS_STATIC_EXTRA = eabi.S $(srcdir)/config/rs6000/darwin-ldouble.c
 LIB2FUNCS_SHARED_EXTRA = $(srcdir)/config/rs6000/darwin-ldouble-shared.c
 
-TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC -specs=bispecs
+TARGET_LIBGCC2_CFLAGS = -mno-minimal-toc -fPIC $(DATA_PLT:m%=-m%) -specs=bispecs
 
 SHLIB_MAPFILES += $(srcdir)/config/rs6000/libgcc-ppc64.ver
 
 MULTILIB_OPTIONS        = m64/m32 msoft-float
 MULTILIB_DIRNAMES       = 64 32 nof
-MULTILIB_EXTRA_OPTS     = fPIC mstrict-align
+MULTILIB_EXTRA_OPTS     = fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS     = m64/msoft-float
 MULTILIB_EXCLUSIONS     = m64/!m32/msoft-float
 MULTILIB_OSDIRNAMES	= ../lib64 ../lib nof
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/t-netbsd gcc-3.4/gcc/config/rs6000/t-netbsd
--- gcc-3.4-virgin/gcc/config/rs6000/t-netbsd	2002-11-26 10:35:07.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/t-netbsd	2005-05-19 20:16:53.000000000 +0930
@@ -26,7 +26,7 @@ MULTILIB_MATCHES_FLOAT	= msoft-float=mcp
 
 MULTILIB_OPTIONS	= msoft-float
 MULTILIB_DIRNAMES	= soft-float
-MULTILIB_EXTRA_OPTS	= fPIC mstrict-align
+MULTILIB_EXTRA_OPTS	= fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS	=
 
 MULTILIB_MATCHES	= ${MULTILIB_MATCHES_FLOAT}
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/t-ppccomm gcc-3.4/gcc/config/rs6000/t-ppccomm
--- gcc-3.4-virgin/gcc/config/rs6000/t-ppccomm	2002-12-20 22:53:07.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/t-ppccomm	2005-05-19 20:16:53.000000000 +0930
@@ -60,4 +60,4 @@ $(T)crtsavres$(objext): crtsavres.S
 CRTSTUFF_T_CFLAGS = -msdata=none
 # Make sure crt*.o are built with -fPIC even if configured with 
 # --enable-shared --disable-multilib
-CRTSTUFF_T_CFLAGS_S = -fPIC -msdata=none
+CRTSTUFF_T_CFLAGS_S = -fPIC $(DATA_PLT:m%=-m%) -msdata=none
diff -urp -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/t-ppcos gcc-3.4/gcc/config/rs6000/t-ppcos
--- gcc-3.4-virgin/gcc/config/rs6000/t-ppcos	2001-11-25 00:32:46.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/t-ppcos	2005-05-19 20:16:53.000000000 +0930
@@ -2,7 +2,7 @@
 
 MULTILIB_OPTIONS	= msoft-float
 MULTILIB_DIRNAMES	= nof
-MULTILIB_EXTRA_OPTS	= fPIC mstrict-align
+MULTILIB_EXTRA_OPTS	= fPIC $(DATA_PLT) mstrict-align
 MULTILIB_EXCEPTIONS	= 
 
 MULTILIB_MATCHES	= ${MULTILIB_MATCHES_FLOAT}

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-19 13:01 ` Alan Modra
@ 2005-05-25 14:26   ` Alan Modra
  2005-05-25 15:17     ` Paolo Carlini
                       ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Alan Modra @ 2005-05-25 14:26 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2140 bytes --]

On Thu, May 19, 2005 at 10:30:46PM +0930, Alan Modra wrote:
> This should be close to the final version.

That sort of comment is really tempting fate.  :-)  The libffi and nested
function trampoline assembly needed fixing.  This version also adds a means
of making the compiler default to the new plt layout (which is also the
easiest way to ensure libraries are all built with the new plt).

This patch survives bootstrap and regression testing all languages bar
ada.  I'll admit that most of the testing has been done against the
gcc-4.0 branch, with mainline bootstrap only against an older glibc.
The gcc-4.0 patch has been tested using both old and new versions of
glibc, with good results on both the gcc and glibc testsuites.  One libstdc++
test, abi-check, fails with a new glibc.  Symbol differences are all
like
-FUNC:_ZNSt10moneypunctIcLb0EE24_M_initialize_moneypunctEP15__locale_structPKc@@GLIBCXX_3.4
+FUNC:_ZNSt10moneypunctIcLb0EE24_M_initialize_moneypunctEPiPKc@@GLIBCXX_3.4
or
std::moneypunct<char, false>::_M_initialize_moneypunct(__locale_struct*, char const*)
std::moneypunct<char, false>::_M_initialize_moneypunct(int*, char const*)
I think this is due to a libstdc++ configure test failing, probably
because I didn't get all the rpath magic right when trying to use the
new glibc.  That is, this was a failure of my test setup rather than a real
problem.

Some notes on the patch:

I chose to use --enable-dataplt as a configure option to make the
compiler default to -mdata-plt.  This needs documenting, as do the new
ppc -mdata-plt and -mbss-plt options.  I'll do that before committing,
assuming --enable-dataplt is a reasonable choice.  Another possibility
is --with-abi=data-plt or even a new powerpc config target (but I think
we already have way too many ppc targets).
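
A minimal sketch of that defaulting, assuming the CC1_SPEC fragment
"%{!mbss-plt: %{!mdata-plt: %(cc1_data_plt_default)}}" and the dataplt.h
definition from the attached patch.  This is a Python model rather than GCC
code; the function name and the enable_dataplt parameter are illustrative
only.

```python
def cc1_plt_args(user_args, enable_dataplt):
    """Model which PLT option reaches cc1 for a given driver command line.

    enable_dataplt stands in for configuring with --enable-dataplt, which
    makes dataplt.h define CC1_DATA_PLT_DEFAULT_SPEC as "-mdata-plt".
    Without it, sysv4.h falls back to an empty spec.
    """
    default_spec = "-mdata-plt" if enable_dataplt else ""
    args = list(user_args)
    # %{!mbss-plt: %{!mdata-plt: ...}} fires only when neither option
    # was given explicitly on the command line.
    explicit = "-mbss-plt" in args or "-mdata-plt" in args
    if not explicit and default_spec:
        args.append(default_spec)
    return args


if __name__ == "__main__":
    print(cc1_plt_args(["-fPIC"], enable_dataplt=True))
    print(cc1_plt_args(["-fPIC", "-mbss-plt"], enable_dataplt=True))
    print(cc1_plt_args(["-fPIC"], enable_dataplt=False))
```

An explicit -mbss-plt or -mdata-plt suppresses the configured default, so a
compiler built with --enable-dataplt can still be asked for the old layout.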

Moving the elf_high and elf_low patterns seemed the easiest way to allow 
load_toc_v4_PIC_3c to generate "addi" rather than the rtl matching
elf_low and generating "cal", not that this really matters.  Note that
the comment above elf_high and elf_low is incorrect.  They aren't just
used for non-PIC.
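
For reference, the "addis ...@ha" / "addi ...@l" pair emitted by
load_toc_v4_PIC_3b and load_toc_v4_PIC_3c relies on the usual high-adjusted
split of a 32-bit offset.  The following is a small Python model of that
arithmetic, not code from the patch; the helper names are made up for
illustration.

```python
MASK32 = 0xFFFFFFFF


def sext16(value):
    """Interpret the low 16 bits of value as a signed immediate."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def ha16(offset):
    """@ha: high 16 bits, adjusted by +0x8000 so the signed low half adds back."""
    return ((offset + 0x8000) >> 16) & 0xFFFF


def lo16(offset):
    """@l: low 16 bits, seen by addi as a signed immediate."""
    return sext16(offset)


def addis_addi(base, offset):
    """Model 'addis t,base,offset@ha' then 'addi d,t,offset@l' in 32 bits."""
    t = (base + (sext16(ha16(offset)) << 16)) & MASK32  # addis shifts by 16
    return (t + lo16(offset)) & MASK32                  # addi adds the low part


if __name__ == "__main__":
    base = 0x10000000
    for offset in (0, 0x7FFF, 0x8000, 0x12348000, (-4) & MASK32):
        assert addis_addi(base, offset) == (base + offset) & MASK32
    print("addis/addi reconstructs base + offset for all samples")
```

The +0x8000 adjustment matters whenever bit 15 of the offset is set: the low
half is then negative as seen by addi, and the high half has to be one larger
to compensate.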

OK to apply mainline?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

[-- Attachment #2: gcc.diff --]
[-- Type: text/plain, Size: 20071 bytes --]

gcc/
	* configure.ac (HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config.gcc (powerpc64-*-linux*, powerpc-*-linux*): Add
	rs6000/dataplt.h to tm_file when enable_dataplt.
	* config/rs6000/dataplt.h: New file.
	* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Error if
	-mdata-plt given without assembler support.
	(CC1_DATA_PLT_DEFAULT_SPEC): Define.
	(CC1_SPEC): Delete duplicate mno-sdata.  Invoke cc1_data_plt_default.
	(SUBTARGET_EXTRA_SPECS): Add cc1_data_plt_default.
	* config/rs6000/sysv4.opt (mdata-plt, mbss-plt): Add options.
	* config/rs6000/rs6000.h (TARGET_DATA_PLT): Zero if not HAVE_AS_REL16.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_DATA_PLT got register load sequence.
	(rs6000_emit_prologue): Call rs6000_emit_load_toc_table when
	TARGET_DATA_PLT.
	(rs6000_elf_declare_function_name): Don't emit toc address offset
	word when TARGET_DATA_PLT.
	* config/rs6000/rs6000.md (elf_high, elf_low): Move past load_toc_*.
	(load_toc_v4_PIC_1): Enable for TARGET_DATA_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for TARGET_DATA_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_DATA_PLT
	and -fPIC.
	* config/rs6000/tramp.asm (trampoline_initial): Use "bcl 20,31".
	(__trampoline_setup): Likewise.  Init r30 before plt call.

libffi/
	* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Don't use JUMPTARGET
	to call ffi_closure_helper_SYSV.  Append @local instead.
	* src/powerpc/sysv.S (ffi_call_SYSV): Likewise for ffi_prep_args_SYSV.
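
As a reading aid for the call_nonlocal_sysv and sibcall_nonlocal_sysv entries
above, here is a hedged Python model of how the output template is chosen.
It is based on the gcc-3.4 version of the patch posted earlier in the thread;
the function and parameter names are illustrative and do not exist in GCC.

```python
def sysv_call_template(abi_v4, flag_pic, target_data_plt,
                       operand="%z0", insn="bl"):
    """Return the assembler template a sysv call pattern would emit.

    flag_pic is 0 (no PIC), 1 (-fpic) or 2 (-fPIC), as in GCC.
    """
    if abi_v4 and flag_pic:
        if target_data_plt and flag_pic == 2:
            # Per the patch comment, 32768 corresponds to the offset of r30
            # in .got2, as given by LCTOC1.
            return f"{insn} {operand}+32768@plt"
        return f"{insn} {operand}@plt"
    return f"{insn} {operand}"


if __name__ == "__main__":
    print(sysv_call_template(True, 2, True))             # -fPIC -mdata-plt
    print(sysv_call_template(True, 1, True))             # -fpic -mdata-plt
    print(sysv_call_template(True, 2, True, "%z1", "b")) # sibcall_value
```

Only the combination of -fPIC and TARGET_DATA_PLT gets the extra addend;
-fpic and the old BSS PLT keep the plain "@plt" form.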

diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/configure.ac gcc-current/gcc/configure.ac
--- gcc-virgin/gcc/configure.ac	2005-05-19 19:10:56.000000000 +0930
+++ gcc-current/gcc/configure.ac	2005-05-19 19:13:04.000000000 +0930
@@ -2828,6 +2828,24 @@ foo:	nop
       [AC_DEFINE(HAVE_AS_POPCNTB, 1,
 	  [Define if your assembler supports popcntb field.])])
 
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0], -a32,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config.gcc gcc-current/gcc/config.gcc
--- gcc-virgin/gcc/config.gcc	2005-05-19 19:10:56.000000000 +0930
+++ gcc-current/gcc/config.gcc	2005-05-25 11:33:46.000000000 +0930
@@ -1581,6 +1581,9 @@ powerpc64-*-linux*)
 	test x$with_cpu != x || cpu_is_64bit=yes
 	test x$cpu_is_64bit != xyes || tm_file="${tm_file} rs6000/default64.h"
 	tm_file="rs6000/biarch64.h ${tm_file} rs6000/linux64.h"
+	if test x${enable_dataplt} = xyes; then
+		tm_file="rs6000/dataplt.h ${tm_file}"
+	fi
 	extra_options="${extra_options} rs6000/sysv4.opt rs6000/linux64.opt"
 	tmake_file="rs6000/t-fprules ${tmake_file} rs6000/t-ppccomm rs6000/t-linux64"
 	;;
@@ -1690,6 +1693,9 @@ powerpc-*-linux*)
 		tm_file="${tm_file} rs6000/linux.h"
 		;;
 	esac
+	if test x${enable_dataplt} = xyes; then
+		tm_file="rs6000/dataplt.h ${tm_file}"
+	fi
 	;;
 powerpc-*-gnu-gnualtivec*)
 	tm_file="${cpu_type}/${cpu_type}.h elfos.h svr4.h freebsd-spec.h gnu.h rs6000/sysv4.h rs6000/linux.h rs6000/linuxaltivec.h rs6000/gnu.h"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/dataplt.h gcc-current/gcc/config/rs6000/dataplt.h
--- gcc-virgin/gcc/config/rs6000/dataplt.h	1970-01-01 09:30:00.000000000 +0930
+++ gcc-current/gcc/config/rs6000/dataplt.h	2005-05-25 11:29:57.000000000 +0930
@@ -0,0 +1,21 @@
+/* Default to -mdata-plt.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to
+the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02111-1307, USA.  */
+
+#define CC1_DATA_PLT_DEFAULT_SPEC "-mdata-plt"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.h gcc-current/gcc/config/rs6000/sysv4.h
--- gcc-virgin/gcc/config/rs6000/sysv4.h	2005-05-06 23:34:43.000000000 +0930
+++ gcc-current/gcc/config/rs6000/sysv4.h	2005-05-25 11:34:13.000000000 +0930
@@ -205,6 +205,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_DATA_PLT != ((target_flags & MASK_DATA_PLT) != 0))		\
+    {									\
+      error ("-mdata-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
@@ -750,6 +755,10 @@ extern int fixuplabelno;
 
 #define	CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
 
+#ifndef CC1_DATA_PLT_DEFAULT_SPEC
+#define CC1_DATA_PLT_DEFAULT_SPEC ""
+#endif
+
 /* Pass -G xxx to the compiler and set correct endian mode.  */
 #define	CC1_SPEC "%{G*} \
 %{mlittle|mlittle-endian: %(cc1_endian_little);           \
@@ -762,7 +771,6 @@ extern int fixuplabelno;
   mcall-gnu             : -mbig %(cc1_endian_big);        \
   mcall-i960-old        : -mlittle %(cc1_endian_little);  \
                         : %(cc1_endian_default)}          \
-%{mno-sdata: -msdata=none } \
 %{meabi: %{!mcall-*: -mcall-sysv }} \
 %{!meabi: %{!mno-eabi: \
     %{mrelocatable: -meabi } \
@@ -774,6 +782,7 @@ extern int fixuplabelno;
     %{mcall-openbsd: -mno-eabi }}} \
 %{msdata: -msdata=default} \
 %{mno-sdata: -msdata=none} \
+%{!mbss-plt: %{!mdata-plt: %(cc1_data_plt_default)}} \
 %{profile: -p}"
 
 /* Don't put -Y P,<path> for cross compilers.  */
@@ -1214,6 +1223,7 @@ ncrtn.o%s"
   { "cc1_endian_big",		CC1_ENDIAN_BIG_SPEC },			\
   { "cc1_endian_little",	CC1_ENDIAN_LITTLE_SPEC },		\
   { "cc1_endian_default",	CC1_ENDIAN_DEFAULT_SPEC },		\
+  { "cc1_data_plt_default",	CC1_DATA_PLT_DEFAULT_SPEC },		\
   { "cpp_os_ads",		CPP_OS_ADS_SPEC },			\
   { "cpp_os_yellowknife",	CPP_OS_YELLOWKNIFE_SPEC },		\
   { "cpp_os_mvme",		CPP_OS_MVME_SPEC },			\
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.opt gcc-current/gcc/config/rs6000/sysv4.opt
--- gcc-virgin/gcc/config/rs6000/sysv4.opt	2005-05-19 19:11:10.000000000 +0930
+++ gcc-current/gcc/config/rs6000/sysv4.opt	2005-05-19 19:13:21.000000000 +0930
@@ -139,3 +139,11 @@ Generate 32-bit code
 mnewlib
 Target RejectNegative
 no description yet
+
+mdata-plt
+Target Report RejectNegative Mask(DATA_PLT)
+Generate code to use a non-exec PLT and GOT
+
+mbss-plt
+Target Report RejectNegative InverseMask(DATA_PLT)
+Generate code for old exec BSS PLT
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.h gcc-current/gcc/config/rs6000/rs6000.h
--- gcc-virgin/gcc/config/rs6000/rs6000.h	2005-05-09 20:03:12.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.h	2005-05-12 18:45:55.000000000 +0930
@@ -144,6 +144,11 @@
 #define TARGET_POPCNTB 0
 #endif
 
+#ifndef HAVE_AS_REL16
+#undef  TARGET_DATA_PLT
+#define TARGET_DATA_PLT 0
+#endif
+
 #define TARGET_32BIT		(! TARGET_64BIT)
 
 /* Emit a dtp-relative reference to a TLS variable.  */
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2005-05-19 19:11:10.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.c	2005-05-23 16:04:05.000000000 +0930
@@ -12572,15 +12621,49 @@ rs6000_emit_load_toc_table (int fromprol
   rtx dest, insn;
   dest = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+  if (TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
+      char buf[30];
+      rtx lab, tmp1, tmp2, got, tempLR;
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+      if (flag_pic == 2)
+	got = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
+      else
+	got = rs6000_got_sym ();
+      tmp1 = tmp2 = dest;
+      if (!fromprolog)
+	{
+	  tmp1 = gen_reg_rtx (Pmode);
+	  tmp2 = gen_reg_rtx (Pmode);
+	}
+      tempLR = (fromprolog
+		? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		: gen_reg_rtx (Pmode));
+      insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (tmp1, tempLR);
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+    }
+  else if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+    {
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
       if (fromprolog)
 	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
+      insn = emit_move_insn (dest, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
     }
@@ -13674,7 +13757,8 @@ rs6000_emit_prologue (void)
 
   /* If we are using RS6000_PIC_OFFSET_TABLE_REGNUM, we need to set it up.  */
   if ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
-      || (DEFAULT_ABI == ABI_V4 && flag_pic == 1
+      || (DEFAULT_ABI == ABI_V4
+	  && (flag_pic == 1 || (flag_pic && TARGET_DATA_PLT))
 	  && regs_ever_live[RS6000_PIC_OFFSET_TABLE_REGNUM]))
     {
       /* If emit_load_toc_table will use the link register, we need to save
@@ -17204,6 +17288,7 @@ rs6000_elf_declare_function_name (FILE *
     }
 
   if (TARGET_RELOCATABLE
+      && !TARGET_DATA_PLT
       && (get_pool_size () != 0 || current_function_profile)
       && uses_TOC ())
     {
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.md gcc-current/gcc/config/rs6000/rs6000.md
--- gcc-virgin/gcc/config/rs6000/rs6000.md	2005-05-19 19:11:10.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.md	2005-05-19 21:32:25.000000000 +0930
@@ -7360,26 +7360,6 @@
 \f
 ;; Now define ways of moving data around.
 
-;; Elf specific ways of loading addresses for non-PIC code.
-;; The output of this could be r0, but we make a very strong
-;; preference for a base register because it will usually
-;; be needed there.
-(define_insn "elf_high"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
-	(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
-  "{liu|lis} %0,%1@ha")
-
-(define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
-		   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
-   "@
-    {cal|la} %0,%2@l(%1)
-    {ai|addic} %0,%1,%K2")
-
-
 ;; Set up a register with a value from the GOT table
 
 (define_expand "movsi_got"
@@ -9810,7 +9788,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_DATA_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -9833,6 +9812,22 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cal|addi} %0,%1,%2-%3@l")
 
 ;; If the TOC is shared over a translation unit, as happens with all
 ;; the kinds of PIC that we support, we need to restore the TOC
@@ -9867,6 +9862,25 @@
     rs6000_emit_load_toc_table (FALSE);
   DONE;
 }")
+
+;; Elf specific ways of loading addresses for non-PIC code.
+;; The output of this could be r0, but we make a very strong
+;; preference for a base register because it will usually
+;; be needed there.
+(define_insn "elf_high"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
+	(high:SI (match_operand 1 "" "")))]
+  "TARGET_ELF && ! TARGET_64BIT"
+  "{liu|lis} %0,%1@ha")
+
+(define_insn "elf_low"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+		   (match_operand 2 "" "")))]
+   "TARGET_ELF && ! TARGET_64BIT"
+   "@
+    {cal|la} %0,%2@l(%1)
+    {ai|addic} %0,%1,%K2")
 \f
 ;; A function pointer under AIX is a pointer to a data area whose first word
 ;; contains the actual address of the function, whose second word contains a
@@ -9983,6 +9997,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10034,6 +10067,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))
@@ -10307,7 +10362,18 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 0, 2);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z0@plt" : "bl %z0";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	/* The magic 32768 offset here and in the other sysv call insns
+	   corresponds to the offset of r30 in .got2, as given by LCTOC1.
+	   See sysv4.h:toc_section.  */
+	return "bl %z0+32768@plt";
+      else
+	return "bl %z0@plt";
+    }
+  else
+    return "bl %z0";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10352,7 +10418,15 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 1, 3);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z1@plt" : "bl %z1";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return "bl %z1+32768@plt";
+      else
+	return "bl %z1@plt";
+    }
+  else
+    return "bl %z1";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10567,7 +10641,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z0@plt\" : \"b %z0\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z0+32768@plt\";
+      else
+	return \"b %z0@plt\";
+    }
+  else
+    return \"b %z0\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
@@ -10613,7 +10695,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z1@plt\" : \"b %z1\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z1+32768@plt\";
+      else
+	return \"b %z1@plt\";
+    }
+  else
+    return \"b %z1\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/tramp.asm gcc-current/gcc/config/rs6000/tramp.asm
--- gcc-virgin/gcc/config/rs6000/tramp.asm	2003-06-06 14:41:22.000000000 +0930
+++ gcc-current/gcc/config/rs6000/tramp.asm	2005-05-24 10:52:43.000000000 +0930
@@ -44,7 +44,7 @@
 	.align	2
 trampoline_initial:
 	mflr	r0
-	bl	1f
+	bcl	20,31,1f
 .Lfunc = .-trampoline_initial
 	.long	0			/* will be replaced with function address */
 .Lchain = .-trampoline_initial
@@ -67,7 +67,7 @@ trampoline_size = .-trampoline_initial
 
 FUNC_START(__trampoline_setup)
 	mflr	r0		/* save return address */
-        bl	.LCF0		/* load up __trampoline_initial into r7 */
+        bcl	20,31,.LCF0	/* load up __trampoline_initial into r7 */
 .LCF0:
         mflr	r11
         addi	r7,r11,trampoline_initial-4-.LCF0 /* trampoline address -4 */
@@ -105,6 +105,12 @@ FUNC_START(__trampoline_setup)
 	blr
 
 .Labort:
+#if defined SHARED && defined HAVE_AS_REL16
+	bcl	20,31,1f
+1:	mflr	r30
+	addis	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@ha
+	addi	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@l
+#endif
 	bl	JUMP_TARGET(abort)
 FUNC_END(__trampoline_setup)
 
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/libffi/src/powerpc/ppc_closure.S gcc-current/libffi/src/powerpc/ppc_closure.S
--- gcc-virgin/libffi/src/powerpc/ppc_closure.S	2004-09-03 22:42:23.000000000 +0930
+++ gcc-current/libffi/src/powerpc/ppc_closure.S	2005-05-24 10:53:31.000000000 +0930
@@ -57,7 +57,7 @@ ENTRY(ffi_closure_SYSV)
 	addi %r7,%r1,152
 
 	# make the call
-	bl JUMPTARGET(ffi_closure_helper_SYSV)
+	bl ffi_closure_helper_SYSV@local
 
 	# now r3 contains the return type
 	# so use it to look up in a table
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/libffi/src/powerpc/sysv.S gcc-current/libffi/src/powerpc/sysv.S
--- gcc-virgin/libffi/src/powerpc/sysv.S	2004-09-03 22:42:23.000000000 +0930
+++ gcc-current/libffi/src/powerpc/sysv.S	2005-05-24 10:53:44.000000000 +0930
@@ -60,7 +60,7 @@ ENTRY(ffi_call_SYSV)
 
 	/* Call ffi_prep_args_SYSV.  */
 	mr	%r4,%r1
-	bl	JUMPTARGET(ffi_prep_args_SYSV)
+	bl	ffi_prep_args_SYSV@local
 
 	/* Now do the call.  */
 	/* Set up cr1 with bits 4-7 of the flags.  */
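
A note on the addis/addi pair added to tramp.asm above: the @ha and @l
operators split a 32-bit offset so that the sign extension performed by
addi is compensated for in the high half.  The C below is only a sketch
of that arithmetic.  The names ppc_ha, ppc_lo, ppc_addis_addi and
ppc_check_split are made up for illustration and are not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* High-adjusted half (@ha): the 16 bits an addis has to add so that a
   following addi, which sign-extends its immediate, lands on the full
   32-bit value.  */
static uint16_t ppc_ha(uint32_t value)
{
    return (uint16_t)((value + 0x8000u) >> 16);
}

/* Low half (@l), as it would be encoded in the addi immediate field.  */
static uint16_t ppc_lo(uint32_t value)
{
    return (uint16_t)(value & 0xffffu);
}

/* What "addis rX,base,ha ; addi rX,rX,lo" computes at run time.  */
static uint32_t ppc_addis_addi(uint32_t base, uint16_t ha, uint16_t lo)
{
    uint32_t hi_part = (uint32_t)ha << 16;
    /* Sign-extend the 16-bit immediate without relying on
       implementation-defined conversions.  */
    uint32_t lo_part = ((uint32_t)lo ^ 0x8000u) - 0x8000u;
    return base + hi_part + lo_part;
}

/* Assert that splitting an offset and re-adding it is lossless.  */
static void ppc_check_split(uint32_t base, uint32_t offset)
{
    assert(ppc_addis_addi(base, ppc_ha(offset), ppc_lo(offset))
           == base + offset);
}
```

For example, an offset of 0x18000 splits into ha = 2 and lo = 0x8000;
the addi then subtracts 0x8000 from 0x20000, giving 0x18000 again.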

[-- Attachment #3: gcc4.diff --]
[-- Type: text/plain, Size: 21441 bytes --]

gcc/
	* configure.ac (HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config.gcc (powerpc64-*-linux*, powerpc-*-linux*): Add
	rs6000/dataplt.h to tm_file when enable_dataplt.
	* config/rs6000/dataplt.h: New file.
	* config/rs6000/sysv4.h (MASK_DATA_PLT): Define.
	(SUBTARGET_SWITCHES): Add "data-plt" and "bss-plt".  Move "newlib".
	(SUBTARGET_OVERRIDE_OPTIONS): Error if -mdata-plt given without
	assembler support.
	(CC1_DATA_PLT_DEFAULT_SPEC): Define.
	(CC1_SPEC): Delete duplicate mno-sdata.  Invoke cc1_data_plt_default.
	(SUBTARGET_EXTRA_SPECS): Add cc1_data_plt_default.
	* config/rs6000/rs6000.h: Update target_flags free bits comment.
	(TARGET_DATA_PLT): Define.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_DATA_PLT got register load sequence.
	(rs6000_emit_prologue): Call rs6000_emit_load_toc_table when
	TARGET_DATA_PLT.
	(rs6000_elf_declare_function_name): Don't emit toc address offset
	word when TARGET_DATA_PLT.
	* config/rs6000/rs6000.md (elf_high, elf_low): Move after load_toc_*.
	(load_toc_v4_PIC_1): Enable for TARGET_DATA_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for TARGET_DATA_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_DATA_PLT
	and -fPIC.
	* config/rs6000/tramp.asm (trampoline_initial): Use "bcl 20,31".
	(__trampoline_setup): Likewise.  Init r30 before plt call.

libffi/
	* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Don't use JUMPTARGET
	to call ffi_closure_helper_SYSV.  Append @local instead.
	* src/powerpc/sysv.S (ffi_call_SYSV): Likewise for ffi_prep_args_SYSV.
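
The choice of output template in the four sysv call insns can be
pictured as a small selection function.  This is a sketch only:
sysv_call_template and its int parameters are hypothetical stand-ins
for DEFAULT_ABI == ABI_V4, flag_pic and TARGET_DATA_PLT, and the
strings are the call_nonlocal_sysv templates from the rs6000.md hunk.

```c
#include <assert.h>
#include <string.h>

/* Pick the assembler template for a non-local SYSV call.
   abi_is_v4       stands in for DEFAULT_ABI == ABI_V4
   flag_pic        0 = no PIC, 1 = -fpic, 2 = -fPIC
   target_data_plt stands in for TARGET_DATA_PLT  */
static const char *sysv_call_template(int abi_is_v4, int flag_pic,
                                      int target_data_plt)
{
    if (abi_is_v4 && flag_pic) {
        /* Only -fPIC with the data PLT carries the 32768 addend.  */
        if (target_data_plt && flag_pic == 2)
            return "bl %z0+32768@plt";
        return "bl %z0@plt";
    }
    return "bl %z0";
}

/* Nonzero if the template carries the 32768 addend.  */
static int template_has_bias(const char *tmpl)
{
    return strstr(tmpl, "+32768") != NULL;
}
```

With these stand-ins, -fpic plus TARGET_DATA_PLT still yields the plain
"@plt" form; only -fPIC picks up the addend.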

diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/configure.ac gcc-4.0/gcc/configure.ac
--- gcc-4.0-virgin/gcc/configure.ac	2005-05-11 18:20:56.000000000 +0930
+++ gcc-4.0/gcc/configure.ac	2005-05-19 21:02:05.000000000 +0930
@@ -2762,6 +2762,25 @@ foo:	nop
       [$conftest_s],,
       [AC_DEFINE(HAVE_AS_MFCRF, 1,
 	  [Define if your assembler supports mfcr field.])])
+
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0], -a32,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config.gcc gcc-4.0/gcc/config.gcc
--- gcc-4.0-virgin/gcc/config.gcc	2005-05-06 23:50:09.000000000 +0930
+++ gcc-4.0/gcc/config.gcc	2005-05-24 01:02:21.000000000 +0930
@@ -1559,6 +1559,9 @@ powerpc64-*-linux*)
 	test x$with_cpu != x || cpu_is_64bit=yes
 	test x$cpu_is_64bit != xyes || tm_file="${tm_file} rs6000/default64.h"
 	tm_file="rs6000/biarch64.h ${tm_file} rs6000/linux64.h"
+	if test x${enable_dataplt} = xyes; then
+		tm_file="rs6000/dataplt.h ${tm_file}"
+	fi
 	tmake_file="rs6000/t-fprules ${tmake_file} rs6000/t-ppccomm rs6000/t-linux64"
 	;;
 powerpc64-*-gnu*)
@@ -1651,6 +1654,9 @@ powerpc-*-linux*)
 		tm_file="${tm_file} rs6000/linux.h"
 		;;
 	esac
+	if test x${enable_dataplt} = xyes; then
+		tm_file="rs6000/dataplt.h ${tm_file}"
+	fi
 	;;
 powerpc-*-gnu-gnualtivec*)
 	tm_file="${cpu_type}/${cpu_type}.h elfos.h svr4.h freebsd-spec.h gnu.h rs6000/sysv4.h rs6000/linux.h rs6000/linuxaltivec.h rs6000/gnu.h"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/dataplt.h gcc-4.0/gcc/config/rs6000/dataplt.h
--- gcc-4.0-virgin/gcc/config/rs6000/dataplt.h	1970-01-01 09:30:00.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/dataplt.h	2005-05-24 00:54:04.000000000 +0930
@@ -0,0 +1,21 @@
+/* Default to -mdata-plt.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to
+the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02111-1307, USA.  */
+
+#define CC1_DATA_PLT_DEFAULT_SPEC "-mdata-plt"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/sysv4.h gcc-4.0/gcc/config/rs6000/sysv4.h
--- gcc-4.0-virgin/gcc/config/rs6000/sysv4.h	2005-02-16 09:06:57.000000000 +1030
+++ gcc-4.0/gcc/config/rs6000/sysv4.h	2005-05-24 01:09:59.000000000 +0930
@@ -55,6 +55,7 @@ extern enum rs6000_sdata_type rs6000_sda
 #define	MASK_REGNAMES		0x02000000	/* Use alternate register names.  */
 #define	MASK_PROTOTYPE		0x01000000	/* Only prototyped fcns pass variable args.  */
 #define MASK_NO_BITFIELD_WORD	0x00800000	/* Bitfields cannot cross word boundaries */
+#define MASK_DATA_PLT		0x00400000	/* Use non-exec PLT/GOT.  */
 
 #define	TARGET_NO_BITFIELD_TYPE	(target_flags & MASK_NO_BITFIELD_TYPE)
 #define	TARGET_STRICT_ALIGN	(target_flags & MASK_STRICT_ALIGN)
@@ -149,12 +150,16 @@ extern const char *rs6000_tls_size_strin
     N_("Set the PPC_EMB bit in the ELF flags header") },		\
   { "windiss",		 0, N_("Use the WindISS simulator") },		\
   { "shlib",		 0, N_("no description yet") },			\
+  { "newlib",		 0, N_("no description yet") },			\
   { "64",		 MASK_64BIT | MASK_POWERPC64 | MASK_POWERPC,	\
 			 N_("Generate 64-bit code") },			\
   { "32",		 - (MASK_64BIT | MASK_POWERPC64),		\
 			 N_("Generate 32-bit code") },			\
-  EXTRA_SUBTARGET_SWITCHES						\
-  { "newlib",		 0, N_("no description yet") },
+  { "data-plt",		 MASK_DATA_PLT,					\
+			 N_("Generate code for non-exec PLT and GOT") },\
+  { "bss-plt",		 -MASK_DATA_PLT,				\
+			 N_("Generate code for exec BSS PLT") },	\
+  EXTRA_SUBTARGET_SWITCHES
 
 /* This is meant to be redefined in the host dependent files.  */
 #define EXTRA_SUBTARGET_SWITCHES
@@ -299,6 +304,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_DATA_PLT != (target_flags & MASK_DATA_PLT))		\
+    {									\
+      error ("-mdata-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
@@ -844,6 +854,10 @@ extern int fixuplabelno;
 
 #define	CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
 
+#ifndef CC1_DATA_PLT_DEFAULT_SPEC
+#define CC1_DATA_PLT_DEFAULT_SPEC ""
+#endif
+
 /* Pass -G xxx to the compiler and set correct endian mode.  */
 #define	CC1_SPEC "%{G*} \
 %{mlittle|mlittle-endian: %(cc1_endian_little);           \
@@ -856,7 +870,6 @@ extern int fixuplabelno;
   mcall-gnu             : -mbig %(cc1_endian_big);        \
   mcall-i960-old        : -mlittle %(cc1_endian_little);  \
                         : %(cc1_endian_default)}          \
-%{mno-sdata: -msdata=none } \
 %{meabi: %{!mcall-*: -mcall-sysv }} \
 %{!meabi: %{!mno-eabi: \
     %{mrelocatable: -meabi } \
@@ -868,6 +881,7 @@ extern int fixuplabelno;
     %{mcall-openbsd: -mno-eabi }}} \
 %{msdata: -msdata=default} \
 %{mno-sdata: -msdata=none} \
+%{!mbss-plt: %{!mdata-plt: %(cc1_data_plt_default)}} \
 %{profile: -p}"
 
 /* Don't put -Y P,<path> for cross compilers.  */
@@ -1308,6 +1322,7 @@ ncrtn.o%s"
   { "cc1_endian_big",		CC1_ENDIAN_BIG_SPEC },			\
   { "cc1_endian_little",	CC1_ENDIAN_LITTLE_SPEC },		\
   { "cc1_endian_default",	CC1_ENDIAN_DEFAULT_SPEC },		\
+  { "cc1_data_plt_default",	CC1_DATA_PLT_DEFAULT_SPEC },		\
   { "cpp_os_ads",		CPP_OS_ADS_SPEC },			\
   { "cpp_os_yellowknife",	CPP_OS_YELLOWKNIFE_SPEC },		\
   { "cpp_os_mvme",		CPP_OS_MVME_SPEC },			\
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/rs6000.h gcc-4.0/gcc/config/rs6000/rs6000.h
--- gcc-4.0-virgin/gcc/config/rs6000/rs6000.h	2005-03-03 08:34:37.000000000 +1030
+++ gcc-4.0/gcc/config/rs6000/rs6000.h	2005-05-19 21:02:05.000000000 +0930
@@ -201,8 +201,8 @@ extern int target_flags;
 /* Use single field mfcr instruction.  */
 #define MASK_MFCRF		0x00080000
 
-/* The only remaining free bits are 0x00600000.  linux64.h uses
-   0x00100000, and sysv4.h uses 0x00800000 -> 0x40000000.
+/* The only remaining free bit is 0x00200000.  linux64.h uses
+   0x00100000, and sysv4.h uses 0x00400000 -> 0x40000000.
    0x80000000 is not available because target_flags is signed.  */
 
 #define TARGET_POWER		(target_flags & MASK_POWER)
@@ -234,6 +234,11 @@ extern int target_flags;
 #define TARGET_MFCRF 0
 #endif
 
+#ifdef HAVE_AS_REL16
+#define TARGET_DATA_PLT		(target_flags & MASK_DATA_PLT)
+#else
+#define TARGET_DATA_PLT 0
+#endif
 
 #define TARGET_32BIT		(! TARGET_64BIT)
 #define TARGET_HARD_FLOAT	(! TARGET_SOFT_FLOAT)
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/rs6000.c gcc-4.0/gcc/config/rs6000/rs6000.c
--- gcc-4.0-virgin/gcc/config/rs6000/rs6000.c	2005-05-11 18:23:48.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/rs6000.c	2005-05-23 16:04:06.000000000 +0930
@@ -13466,15 +13520,49 @@ rs6000_emit_load_toc_table (int fromprol
   rtx dest, insn;
   dest = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+  if (TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
+      char buf[30];
+      rtx lab, tmp1, tmp2, got, tempLR;
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+      if (flag_pic == 2)
+	got = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
+      else
+	got = rs6000_got_sym ();
+      tmp1 = tmp2 = dest;
+      if (!fromprolog)
+	{
+	  tmp1 = gen_reg_rtx (Pmode);
+	  tmp2 = gen_reg_rtx (Pmode);
+	}
+      tempLR = (fromprolog
+		? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		: gen_reg_rtx (Pmode));
+      insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (tmp1, tempLR);
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+    }
+  else if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+    {
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
       if (fromprolog)
 	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
+      insn = emit_move_insn (dest, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
     }
@@ -14565,7 +14653,8 @@ rs6000_emit_prologue (void)
 
   /* If we are using RS6000_PIC_OFFSET_TABLE_REGNUM, we need to set it up.  */
   if ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
-      || (DEFAULT_ABI == ABI_V4 && flag_pic == 1
+      || (DEFAULT_ABI == ABI_V4
+	  && (flag_pic == 1 || (flag_pic && TARGET_DATA_PLT))
 	  && regs_ever_live[RS6000_PIC_OFFSET_TABLE_REGNUM]))
     {
       /* If emit_load_toc_table will use the link register, we need to save
@@ -18082,6 +18171,7 @@ rs6000_elf_declare_function_name (FILE *
     }
 
   if (TARGET_RELOCATABLE
+      && !TARGET_DATA_PLT
       && (get_pool_size () != 0 || current_function_profile)
       && uses_TOC ())
     {
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/rs6000.md gcc-4.0/gcc/config/rs6000/rs6000.md
--- gcc-4.0-virgin/gcc/config/rs6000/rs6000.md	2005-03-31 21:02:13.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/rs6000.md	2005-05-19 21:15:01.000000000 +0930
@@ -7653,26 +7653,6 @@
 \f
 ;; Now define ways of moving data around.
 
-;; Elf specific ways of loading addresses for non-PIC code.
-;; The output of this could be r0, but we make a very strong
-;; preference for a base register because it will usually
-;; be needed there.
-(define_insn "elf_high"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
-	(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
-  "{liu|lis} %0,%1@ha")
-
-(define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
-		   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
-   "@
-    {cal|la} %0,%2@l(%1)
-    {ai|addic} %0,%1,%K2")
-
-
 ;; Set up a register with a value from the GOT table
 
 (define_expand "movsi_got"
@@ -10133,7 +10111,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_DATA_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -10156,6 +10135,23 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cal|addi} %0,%1,%2-%3@l")
+
 
 ;; If the TOC is shared over a translation unit, as happens with all
 ;; the kinds of PIC that we support, we need to restore the TOC
@@ -10190,6 +10186,26 @@
     rs6000_emit_load_toc_table (FALSE);
   DONE;
 }")
+
+;; Elf specific ways of loading addresses for non-PIC code.
+;; The output of this could be r0, but we make a very strong
+;; preference for a base register because it will usually
+;; be needed there.
+(define_insn "elf_high"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
+	(high:SI (match_operand 1 "" "")))]
+  "TARGET_ELF && ! TARGET_64BIT"
+  "{liu|lis} %0,%1@ha")
+
+(define_insn "elf_low"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+		   (match_operand 2 "" "")))]
+   "TARGET_ELF && ! TARGET_64BIT"
+   "@
+    {cal|la} %0,%2@l(%1)
+    {ai|addic} %0,%1,%K2")
+
 \f
 ;; A function pointer under AIX is a pointer to a data area whose first word
 ;; contains the actual address of the function, whose second word contains a
@@ -10306,6 +10322,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10354,6 +10389,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))
@@ -10624,7 +10681,18 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 0, 2);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z0@plt" : "bl %z0";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	/* The magic 32768 offset here and in the other sysv call insns
+	   corresponds to the offset of r30 in .got2, as given by LCTOC1.
+	   See sysv4.h:toc_section.  */
+	return "bl %z0+32768@plt";
+      else
+	return "bl %z0@plt";
+    }
+  else
+    return "bl %z0";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10669,7 +10737,15 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 1, 3);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z1@plt" : "bl %z1";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return "bl %z1+32768@plt";
+      else
+	return "bl %z1@plt";
+    }
+  else
+    return "bl %z1";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10884,7 +10960,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z0@plt\" : \"b %z0\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z0+32768@plt\";
+      else
+	return \"b %z0@plt\";
+    }
+  else
+    return \"b %z0\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
@@ -10930,7 +11014,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z1@plt\" : \"b %z1\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z1+32768@plt\";
+      else
+	return \"b %z1@plt\";
+    }
+  else
+    return \"b %z1\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/gcc/config/rs6000/tramp.asm gcc-4.0/gcc/config/rs6000/tramp.asm
--- gcc-4.0-virgin/gcc/config/rs6000/tramp.asm	2003-06-06 14:41:22.000000000 +0930
+++ gcc-4.0/gcc/config/rs6000/tramp.asm	2005-05-24 10:52:09.000000000 +0930
@@ -44,7 +44,7 @@
 	.align	2
 trampoline_initial:
 	mflr	r0
-	bl	1f
+	bcl	20,31,1f
 .Lfunc = .-trampoline_initial
 	.long	0			/* will be replaced with function address */
 .Lchain = .-trampoline_initial
@@ -67,7 +67,7 @@ trampoline_size = .-trampoline_initial
 
 FUNC_START(__trampoline_setup)
 	mflr	r0		/* save return address */
-        bl	.LCF0		/* load up __trampoline_initial into r7 */
+        bcl	20,31,.LCF0	/* load up __trampoline_initial into r7 */
 .LCF0:
         mflr	r11
         addi	r7,r11,trampoline_initial-4-.LCF0 /* trampoline address -4 */
@@ -105,6 +105,12 @@ FUNC_START(__trampoline_setup)
 	blr
 
 .Labort:
+#if defined SHARED && defined HAVE_AS_REL16
+	bcl	20,31,1f
+1:	mflr	r30
+	addis	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@ha
+	addi	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@l
+#endif
 	bl	JUMP_TARGET(abort)
 FUNC_END(__trampoline_setup)
 
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/libffi/src/powerpc/ppc_closure.S gcc-4.0/libffi/src/powerpc/ppc_closure.S
--- gcc-4.0-virgin/libffi/src/powerpc/ppc_closure.S	2004-09-03 22:42:23.000000000 +0930
+++ gcc-4.0/libffi/src/powerpc/ppc_closure.S	2005-05-24 10:25:49.000000000 +0930
@@ -57,7 +57,7 @@ ENTRY(ffi_closure_SYSV)
 	addi %r7,%r1,152
 
 	# make the call
-	bl JUMPTARGET(ffi_closure_helper_SYSV)
+	bl ffi_closure_helper_SYSV@local
 
 	# now r3 contains the return type
 	# so use it to look up in a table
diff -urpN -xCVS -x'*~' -x'.#*' gcc-4.0-virgin/libffi/src/powerpc/sysv.S gcc-4.0/libffi/src/powerpc/sysv.S
--- gcc-4.0-virgin/libffi/src/powerpc/sysv.S	2004-09-03 22:42:23.000000000 +0930
+++ gcc-4.0/libffi/src/powerpc/sysv.S	2005-05-24 10:25:47.000000000 +0930
@@ -60,7 +60,7 @@ ENTRY(ffi_call_SYSV)
 
 	/* Call ffi_prep_args_SYSV.  */
 	mr	%r4,%r1
-	bl	JUMPTARGET(ffi_prep_args_SYSV)
+	bl	ffi_prep_args_SYSV@local
 
 	/* Now do the call.  */
 	/* Set up cr1 with bits 4-7 of the flags.  */
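
The interaction between HAVE_AS_REL16, TARGET_DATA_PLT and the new
check in SUBTARGET_OVERRIDE_OPTIONS may be easier to follow as plain C.
The sketch below assumes the MASK_DATA_PLT value from the sysv4.h hunk.
The function names and the have_as_rel16 parameter (standing in for the
configure-time HAVE_AS_REL16 define) are illustrative only.

```c
#include <assert.h>

#define MASK_DATA_PLT 0x00400000 /* value taken from the sysv4.h hunk */

/* Mirrors TARGET_DATA_PLT: the flag bit only counts when the assembler
   supports the REL16 relocs; otherwise the macro is the constant 0.  */
static int target_data_plt(int target_flags, int have_as_rel16)
{
    return have_as_rel16 ? (target_flags & MASK_DATA_PLT) : 0;
}

/* Mirrors the SUBTARGET_OVERRIDE_OPTIONS test: nonzero means
   -mdata-plt was requested but cannot be honoured, which is where the
   patch calls error ().  */
static int data_plt_unsupported(int target_flags, int have_as_rel16)
{
    return target_data_plt(target_flags, have_as_rel16)
           != (target_flags & MASK_DATA_PLT);
}
```

The two sides can differ only when the bit is set and the assembler
lacks support, so no diagnostic is issued for -mbss-plt or for the
default.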

[-- Attachment #4: gcc34.diff --]
[-- Type: text/plain, Size: 22217 bytes --]

gcc/
	* configure.ac (HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config.gcc (powerpc64-*-linux*, powerpc-*-linux*): Add
	rs6000/dataplt.h to tm_file when enable_dataplt.
	* config/rs6000/dataplt.h: New file.
	* config/rs6000/sysv4.h (MASK_DATA_PLT): Define.
	(SUBTARGET_SWITCHES): Add "data-plt" and "bss-plt".  Move "newlib".
	(SUBTARGET_OVERRIDE_OPTIONS): Error if -mdata-plt given without
	assembler support.
	(CC1_DATA_PLT_DEFAULT_SPEC): Define.
	(CC1_SPEC): Delete duplicate mno-sdata.  Invoke cc1_data_plt_default.
	(SUBTARGET_EXTRA_SPECS): Add cc1_data_plt_default.
	* config/rs6000/rs6000.h: Update target_flags free bits comment.
	(TARGET_DATA_PLT): Define.
	* config/rs6000/rs6000.c (rs6000_file_start): Call toc_section.
	(rs6000_xcoff_file_start): Don't call toc_section here.
	(rs6000_emit_load_toc_table): Handle TARGET_DATA_PLT got register
	load sequence.
	(rs6000_emit_prologue): Call rs6000_emit_load_toc_table when
	TARGET_DATA_PLT.
	(rs6000_elf_declare_function_name): Don't emit toc address offset
	word when TARGET_DATA_PLT.
	* config/rs6000/rs6000.md (elf_high, elf_low): Move after load_toc_*.
	(load_toc_v4_PIC_1): Enable for TARGET_DATA_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for TARGET_DATA_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_DATA_PLT
	and -fPIC.
	* config/rs6000/tramp.asm (trampoline_initial): Use "bcl 20,31".
	(__trampoline_setup): Likewise.  Init r30 before plt call.

libffi/
	* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Don't use JUMPTARGET
	to call ffi_closure_helper_SYSV.  Append @local instead.
	* src/powerpc/sysv.S (ffi_call_SYSV): Likewise for ffi_prep_args_SYSV.
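
For the CC1_SPEC entry above, the spec fragment
%{!mbss-plt: %{!mdata-plt: %(cc1_data_plt_default)}} amounts to "use the
configured default only when the user gave neither option".  A rough C
rendering, with hypothetical names (cc1_plt_default, its parameters and
defaults_to_data_plt do not exist in GCC):

```c
#include <assert.h>
#include <string.h>

/* Return the extra option the driver would hand to cc1.
   configured_default is "-mdata-plt" when rs6000/dataplt.h is in
   tm_file, and "" otherwise.  */
static const char *cc1_plt_default(int has_mbss_plt, int has_mdata_plt,
                                   const char *configured_default)
{
    if (!has_mbss_plt && !has_mdata_plt)
        return configured_default;
    /* An explicit option on the command line wins; add nothing.  */
    return "";
}

/* Nonzero if the driver ends up adding -mdata-plt by itself.  */
static int defaults_to_data_plt(int has_mbss_plt, int has_mdata_plt,
                                const char *configured_default)
{
    return strcmp(cc1_plt_default(has_mbss_plt, has_mdata_plt,
                                  configured_default),
                  "-mdata-plt") == 0;
}
```

So a compiler configured with enable_dataplt still honours an explicit
-mbss-plt, and one configured without it adds nothing.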

diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/configure.ac gcc-3.4/gcc/configure.ac
--- gcc-3.4-virgin/gcc/configure.ac	2005-01-13 16:01:15.000000000 +1030
+++ gcc-3.4/gcc/configure.ac	2005-05-18 14:17:30.000000000 +0930
@@ -2465,6 +2467,25 @@ changequote([,])dnl
       [$conftest_s],,
       [AC_DEFINE(HAVE_AS_MFCRF, 1,
 	  [Define if your assembler supports mfcr field.])])
+
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0], -a32,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config.gcc gcc-3.4/gcc/config.gcc
--- gcc-3.4-virgin/gcc/config.gcc	2005-04-29 08:41:28.000000000 +0930
+++ gcc-3.4/gcc/config.gcc	2005-05-24 01:24:36.000000000 +0930
@@ -1698,6 +1698,9 @@ powerpc64-*-linux*)
 	test x$with_cpu != x || cpu_is_64bit=yes
 	test x$cpu_is_64bit != xyes || tm_file="${tm_file} rs6000/default64.h"
 	tm_file="rs6000/biarch64.h ${tm_file} rs6000/linux64.h"
+	if test x${enable_dataplt} = xyes; then
+		tm_file="rs6000/dataplt.h ${tm_file}"
+	fi
 	tmake_file="rs6000/t-fprules t-slibgcc-elf-ver t-linux rs6000/t-ppccomm rs6000/t-linux64"
 	;;
 powerpc64-*-gnu*)
@@ -1785,6 +1788,9 @@ powerpc-*-linux*)
 		tm_file="${tm_file} rs6000/linux.h"
 		;;
 	esac
+	if test x${enable_dataplt} = xyes; then
+		tm_file="rs6000/dataplt.h ${tm_file}"
+	fi
 	;;
 powerpc-*-gnu-gnualtivec*)
 	tm_file="${cpu_type}/${cpu_type}.h elfos.h svr4.h freebsd-spec.h gnu.h rs6000/sysv4.h rs6000/linux.h rs6000/linuxaltivec.h rs6000/gnu.h"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/dataplt.h gcc-3.4/gcc/config/rs6000/dataplt.h
--- gcc-3.4-virgin/gcc/config/rs6000/dataplt.h	1970-01-01 09:30:00.000000000 +0930
+++ gcc-3.4/gcc/config/rs6000/dataplt.h	2005-05-24 01:23:42.000000000 +0930
@@ -0,0 +1,21 @@
+/* Default to -mdata-plt.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to
+the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02111-1307, USA.  */
+
+#define CC1_DATA_PLT_DEFAULT_SPEC "-mdata-plt"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/sysv4.h gcc-3.4/gcc/config/rs6000/sysv4.h
--- gcc-3.4-virgin/gcc/config/rs6000/sysv4.h	2005-02-14 20:08:09.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/sysv4.h	2005-05-24 01:23:25.000000000 +0930
@@ -55,6 +55,7 @@ extern enum rs6000_sdata_type rs6000_sda
 #define	MASK_REGNAMES		0x02000000	/* Use alternate register names.  */
 #define	MASK_PROTOTYPE		0x01000000	/* Only prototyped fcns pass variable args.  */
 #define MASK_NO_BITFIELD_WORD	0x00800000	/* Bitfields cannot cross word boundaries */
+#define MASK_DATA_PLT		0x00400000	/* Use non-exec PLT/GOT.  */
 
 #define	TARGET_NO_BITFIELD_TYPE	(target_flags & MASK_NO_BITFIELD_TYPE)
 #define	TARGET_STRICT_ALIGN	(target_flags & MASK_STRICT_ALIGN)
@@ -149,12 +150,16 @@ extern const char *rs6000_tls_size_strin
     N_("Set the PPC_EMB bit in the ELF flags header") },		\
   { "windiss",           0, N_("Use the WindISS simulator") },          \
   { "shlib",		 0, N_("no description yet") },			\
+  { "newlib",		 0, N_("no description yet") },			\
   { "64",		 MASK_64BIT | MASK_POWERPC64 | MASK_POWERPC,	\
 			 N_("Generate 64-bit code") },			\
   { "32",		 - (MASK_64BIT | MASK_POWERPC64),		\
 			 N_("Generate 32-bit code") },			\
-  EXTRA_SUBTARGET_SWITCHES						\
-  { "newlib",		 0, N_("no description yet") },
+  { "data-plt",		 MASK_DATA_PLT,					\
+			 N_("Generate code for non-exec PLT and GOT") },\
+  { "bss-plt",		 -MASK_DATA_PLT,				\
+			 N_("Generate code for exec BSS PLT") },	\
+  EXTRA_SUBTARGET_SWITCHES
 
 /* This is meant to be redefined in the host dependent files.  */
 #define EXTRA_SUBTARGET_SWITCHES
@@ -294,6 +304,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_DATA_PLT != (target_flags & MASK_DATA_PLT))		\
+    {									\
+      error ("-mdata-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
@@ -835,6 +850,10 @@ extern int fixuplabelno;
 
 #define	CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
 
+#ifndef CC1_DATA_PLT_DEFAULT_SPEC
+#define CC1_DATA_PLT_DEFAULT_SPEC ""
+#endif
+
 /* Pass -G xxx to the compiler and set correct endian mode.  */
 #define	CC1_SPEC "%{G*} \
 %{mlittle|mlittle-endian: %(cc1_endian_little);           \
@@ -847,7 +866,6 @@ extern int fixuplabelno;
   mcall-gnu             : -mbig %(cc1_endian_big);        \
   mcall-i960-old        : -mlittle %(cc1_endian_little);  \
                         : %(cc1_endian_default)}          \
-%{mno-sdata: -msdata=none } \
 %{meabi: %{!mcall-*: -mcall-sysv }} \
 %{!meabi: %{!mno-eabi: \
     %{mrelocatable: -meabi } \
@@ -859,6 +877,7 @@ extern int fixuplabelno;
     %{mcall-openbsd: -mno-eabi }}} \
 %{msdata: -msdata=default} \
 %{mno-sdata: -msdata=none} \
+%{!mbss-plt: %{!mdata-plt: %(cc1_data_plt_default)}} \
 %{profile: -p}"
 
 /* Don't put -Y P,<path> for cross compilers.  */
@@ -1299,6 +1318,7 @@ ncrtn.o%s"
   { "cc1_endian_big",		CC1_ENDIAN_BIG_SPEC },			\
   { "cc1_endian_little",	CC1_ENDIAN_LITTLE_SPEC },		\
   { "cc1_endian_default",	CC1_ENDIAN_DEFAULT_SPEC },		\
+  { "cc1_data_plt_default",	CC1_DATA_PLT_DEFAULT_SPEC },		\
   { "cpp_os_ads",		CPP_OS_ADS_SPEC },			\
   { "cpp_os_yellowknife",	CPP_OS_YELLOWKNIFE_SPEC },		\
   { "cpp_os_mvme",		CPP_OS_MVME_SPEC },			\
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/rs6000.h gcc-3.4/gcc/config/rs6000/rs6000.h
--- gcc-3.4-virgin/gcc/config/rs6000/rs6000.h	2005-02-22 15:19:24.000000000 +1030
+++ gcc-3.4/gcc/config/rs6000/rs6000.h	2005-05-17 16:53:52.000000000 +0930
@@ -197,8 +201,8 @@ extern int target_flags;
 /* Use single field mfcr instruction.  */
 #define MASK_MFCRF		0x00080000
 
-/* The only remaining free bits are 0x00600000.  linux64.h uses
-   0x00100000, and sysv4.h uses 0x00800000 -> 0x40000000.
+/* The only remaining free bit is 0x00200000.  linux64.h uses
+   0x00100000, and sysv4.h uses 0x00400000 -> 0x40000000.
    0x80000000 is not available because target_flags is signed.  */
 
 #define TARGET_POWER		(target_flags & MASK_POWER)
@@ -230,6 +234,11 @@ extern int target_flags;
 #define TARGET_MFCRF 0
 #endif
 
+#ifdef HAVE_AS_REL16
+#define TARGET_DATA_PLT		(target_flags & MASK_DATA_PLT)
+#else
+#define TARGET_DATA_PLT 0
+#endif
 
 #define TARGET_32BIT		(! TARGET_64BIT)
 #define TARGET_HARD_FLOAT	(! TARGET_SOFT_FLOAT)
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/rs6000.c gcc-3.4/gcc/config/rs6000/rs6000.c
--- gcc-3.4-virgin/gcc/config/rs6000/rs6000.c	2005-04-29 09:52:17.000000000 +0930
+++ gcc-3.4/gcc/config/rs6000/rs6000.c	2005-05-23 19:39:59.000000000 +0930
@@ -1177,6 +1181,12 @@ rs6000_file_start (void)
       if (*start == '\0')
 	putc ('\n', file);
     }
+
+  if (DEFAULT_ABI == ABI_AIX || (TARGET_ELF && flag_pic == 2))
+    {
+      toc_section ();
+      text_section ();
+    }
 }
 \f
 /* Return nonzero if this function is known to have a null epilogue.  */
@@ -11452,15 +11508,49 @@ rs6000_emit_load_toc_table (int fromprol
   rtx dest, insn;
   dest = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+  if (TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
+      char buf[30];
+      rtx lab, tmp1, tmp2, got, tempLR;
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+      if (flag_pic == 2)
+	got = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
+      else
+	got = rs6000_got_sym ();
+      tmp1 = tmp2 = dest;
+      if (!fromprolog)
+	{
+	  tmp1 = gen_reg_rtx (Pmode);
+	  tmp2 = gen_reg_rtx (Pmode);
+	}
+      tempLR = (fromprolog
+		? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		: gen_reg_rtx (Pmode));
+      insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (tmp1, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
+      insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+    }
+  else if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+    {
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (dest, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
     }
@@ -12391,7 +12481,8 @@ rs6000_emit_prologue (void)
 
   /* If we are using RS6000_PIC_OFFSET_TABLE_REGNUM, we need to set it up.  */
   if ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
-      || (DEFAULT_ABI == ABI_V4 && flag_pic == 1
+      || (DEFAULT_ABI == ABI_V4
+	  && (flag_pic == 1 || (flag_pic && TARGET_DATA_PLT))
 	  && regs_ever_live[RS6000_PIC_OFFSET_TABLE_REGNUM]))
   {
     /* If emit_load_toc_table will use the link register, we need to save
@@ -15748,6 +15814,7 @@ rs6000_elf_declare_function_name (FILE *
     }
 
   if (TARGET_RELOCATABLE
+      && !TARGET_DATA_PLT
       && (get_pool_size () != 0 || current_function_profile)
       && uses_TOC ())
     {
@@ -15940,7 +16032,6 @@ rs6000_xcoff_file_start (void)
   fputs ("\t.file\t", asm_out_file);
   output_quoted_string (asm_out_file, main_input_filename);
   fputc ('\n', asm_out_file);
-  toc_section ();
   if (write_symbols != NO_DEBUG)
     private_data_section ();
   text_section ();
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/rs6000.md gcc-3.4/gcc/config/rs6000/rs6000.md
--- gcc-3.4-virgin/gcc/config/rs6000/rs6000.md	2005-03-31 21:07:06.000000000 +0930
+++ gcc-3.4/gcc/config/rs6000/rs6000.md	2005-05-19 20:14:57.000000000 +0930
@@ -7454,25 +7454,6 @@
 \f
 ;; Now define ways of moving data around.
 
-;; Elf specific ways of loading addresses for non-PIC code.
-;; The output of this could be r0, but we make a very strong
-;; preference for a base register because it will usually
-;; be needed there.
-(define_insn "elf_high"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
-	(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
-  "{liu|lis} %0,%1@ha")
-
-(define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
-		   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
-   "@
-    {cal|la} %0,%2@l(%1)
-    {ai|addic} %0,%1,%K2")
-
 ;; Mach-O PIC trickery.
 (define_insn "macho_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
@@ -10044,7 +10026,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_DATA_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -10067,6 +10050,23 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "TARGET_ELF && TARGET_DATA_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cal|addi} %0,%1,%2-%3@l")
+
 (define_insn "load_macho_picbase"
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(unspec:SI [(match_operand:SI 1 "immediate_operand" "s")]
@@ -10119,6 +10119,26 @@
     rs6000_emit_load_toc_table (FALSE);
   DONE;
 }")
+
+;; Elf specific ways of loading addresses for non-PIC code.
+;; The output of this could be r0, but we make a very strong
+;; preference for a base register because it will usually
+;; be needed there.
+(define_insn "elf_high"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
+	(high:SI (match_operand 1 "" "")))]
+  "TARGET_ELF && ! TARGET_64BIT"
+  "{liu|lis} %0,%1@ha")
+
+(define_insn "elf_low"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+		   (match_operand 2 "" "")))]
+   "TARGET_ELF && ! TARGET_64BIT"
+   "@
+    {cal|la} %0,%2@l(%1)
+    {ai|addic} %0,%1,%K2")
+
 \f
 ;; A function pointer under AIX is a pointer to a data area whose first word
 ;; contains the actual address of the function, whose second word contains a
@@ -10235,6 +10255,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10283,6 +10322,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_DATA_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))
@@ -10553,7 +10613,18 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 0, 2);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z0@plt" : "bl %z0";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	/* The magic 32768 offset here and in the other sysv call insns
+	   corresponds to the offset of r30 in .got2, as given by LCTOC1.
+	   See sysv4.h:toc_section.  */
+	return "bl %z0+32768@plt";
+      else
+	return "bl %z0@plt";
+    }
+  else
+    return "bl %z0";
 #endif     
 }
   [(set_attr "type" "branch,branch")
@@ -10598,7 +10669,15 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 1, 3);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z1@plt" : "bl %z1";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return "bl %z1+32768@plt";
+      else
+	return "bl %z1@plt";
+    }
+  else
+    return "bl %z1";
 #endif     
 }
   [(set_attr "type" "branch,branch")
@@ -10813,7 +10891,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z0@plt\" : \"b %z0\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z0+32768@plt\";
+      else
+	return \"b %z0@plt\";
+    }
+  else
+    return \"b %z0\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
@@ -10859,7 +10944,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z1@plt\" : \"b %z1\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_DATA_PLT && flag_pic == 2)
+	return \"b %z1+32768@plt\";
+      else
+	return \"b %z1@plt\";
+    }
+  else
+    return \"b %z1\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/gcc/config/rs6000/tramp.asm gcc-3.4/gcc/config/rs6000/tramp.asm
--- gcc-3.4-virgin/gcc/config/rs6000/tramp.asm	2003-06-06 14:41:22.000000000 +0930
+++ gcc-3.4/gcc/config/rs6000/tramp.asm	2005-05-24 10:52:11.000000000 +0930
@@ -44,7 +44,7 @@
 	.align	2
 trampoline_initial:
 	mflr	r0
-	bl	1f
+	bcl	20,31,1f
 .Lfunc = .-trampoline_initial
 	.long	0			/* will be replaced with function address */
 .Lchain = .-trampoline_initial
@@ -67,7 +67,7 @@ trampoline_size = .-trampoline_initial
 
 FUNC_START(__trampoline_setup)
 	mflr	r0		/* save return address */
-        bl	.LCF0		/* load up __trampoline_initial into r7 */
+        bcl	20,31,.LCF0	/* load up __trampoline_initial into r7 */
 .LCF0:
         mflr	r11
         addi	r7,r11,trampoline_initial-4-.LCF0 /* trampoline address -4 */
@@ -105,6 +105,12 @@ FUNC_START(__trampoline_setup)
 	blr
 
 .Labort:
+#if defined SHARED && defined HAVE_AS_REL16
+	bcl	20,31,1f
+1:	mflr	r30
+	addis	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@ha
+	addi	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@l
+#endif
 	bl	JUMP_TARGET(abort)
 FUNC_END(__trampoline_setup)
 
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/libffi/src/powerpc/ppc_closure.S gcc-3.4/libffi/src/powerpc/ppc_closure.S
--- gcc-3.4-virgin/libffi/src/powerpc/ppc_closure.S	2003-10-23 14:08:15.000000000 +0930
+++ gcc-3.4/libffi/src/powerpc/ppc_closure.S	2005-05-24 10:54:09.000000000 +0930
@@ -57,7 +57,7 @@ ENTRY(ffi_closure_SYSV)
 	addi %r7,%r1,152
 	
         # make the call
-	bl JUMPTARGET(ffi_closure_helper_SYSV)
+	bl ffi_closure_helper_SYSV@local
 
 	# now r3 contains the return type
 	# so use it to look up in a table
diff -urpN -xCVS -x'*~' -x'.#*' gcc-3.4-virgin/libffi/src/powerpc/sysv.S gcc-3.4/libffi/src/powerpc/sysv.S
--- gcc-3.4-virgin/libffi/src/powerpc/sysv.S	2003-10-23 14:08:15.000000000 +0930
+++ gcc-3.4/libffi/src/powerpc/sysv.S	2005-05-24 10:54:02.000000000 +0930
@@ -62,7 +62,7 @@ ENTRY(ffi_call_SYSV)
 
 	/* Call ffi_prep_args_SYSV.  */
 	mr	%r4,%r1
-	bl	JUMPTARGET(ffi_prep_args_SYSV)
+	bl	ffi_prep_args_SYSV@local
 
 	/* Now do the call.  */
 	/* Set up cr1 with bits 4-7 of the flags.  */

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-25 14:26   ` Alan Modra
@ 2005-05-25 15:17     ` Paolo Carlini
  2005-05-30 15:49     ` Alan Modra
  2005-05-31 10:53     ` Alan Modra
  2 siblings, 0 replies; 875+ messages in thread
From: Paolo Carlini @ 2005-05-25 15:17 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches

Alan Modra wrote:

>...                                                      One libstdc++
>test, abi-check, fails with a new glibc.  Symbol differences are all
>like
>-FUNC:_ZNSt10moneypunctIcLb0EE24_M_initialize_moneypunctEP15__locale_structPKc@@GLIBCXX_3.4
>+FUNC:_ZNSt10moneypunctIcLb0EE24_M_initialize_moneypunctEPiPKc@@GLIBCXX_3.4
>or
>std::moneypunct<char, false>::_M_initialize_moneypunct(__locale_struct*, char const*)
>std::moneypunct<char, false>::_M_initialize_moneypunct(int*, char const*)
>I think this is due to a libstdc++ configure test failing, probably
>because I didn't get all the rpath magic right when trying to use the
>new glibc.  ie. this was a failure of my test setup rather than a real
>problem.
>  
>
Indeed, that change of signature means that the GENERIC locale model
was selected instead of the GNU locale model, which is not the default
setup with glibc.
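
A rough way to spot this from the abi-check output is to look at how the
first parameter of _M_initialize_moneypunct is mangled.  The helper below
is only a sketch: the function name locale_model_of is made up for
illustration, and it assumes the two symbol shapes quoted above are the
only ones of interest.

```shell
#!/bin/sh
# Heuristic sketch (hypothetical helper, not part of libstdc++'s testsuite):
# guess the locale model from one mangled _M_initialize_moneypunct symbol.
locale_model_of() {
  case $1 in
    # First parameter is __locale_struct*: the GNU locale model.
    *_M_initialize_moneypunctEP15__locale_struct*) echo gnu ;;
    # First parameter is int*: the generic locale model.
    *_M_initialize_moneypunctEPi*)                 echo generic ;;
    *)                                             echo unknown ;;
  esac
}
```

Feeding it the "+FUNC:" symbol from the diff above should report the
generic model, and the "-FUNC:" symbol the GNU one.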

Paolo.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: powerpc new PLT and GOT
  2005-05-25 14:26   ` Alan Modra
  2005-05-25 15:17     ` Paolo Carlini
@ 2005-05-30 15:49     ` Alan Modra
  2005-05-31  0:35       ` Alan Modra
  2005-05-31 10:53     ` Alan Modra
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-05-30 15:49 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 923 bytes --]

On Wed, May 25, 2005 at 11:55:09PM +0930, Alan Modra wrote:
[about libstdc++ abi-check fail]
> I think this is due to a libstdc++ configure test failing, probably
> because I didn't get all the rpath magic right when trying to use the
> new glibc.  ie. this was a failure of my test setup rather than a real
> problem.

Now fixed.  No regressions.  I've attached a script I use to build a
toolchain that needs a new glibc to properly test.  Since it's
non-trivial to set up a libc that co-exists with, rather than replaces,
the system libc, someone might find this useful.
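
The core trick in the recipe is a pair of sed edits on GCC's link specs.
A minimal sketch of their effect follows; the spec string is a made-up
single-line sample rather than text copied from sysv4.h, and
/opt/toolchain is a placeholder install prefix.

```shell
#!/bin/sh
# Sketch of the two sed edits the recipe applies to the link specs.
# $1 is the install prefix; the spec text is read from stdin.
rewrite_link_spec() {
  dest=$1
  # Swap the nesting so the outermost condition is !static.
  s1='s@%{!shared: %{!static:@%{!static: %{!shared:@'
  # Point -dynamic-linker at the new prefix and append -rpath just
  # before the final closing brace (the outer one, now !static).
  s2='s@\(-dynamic-linker \)\(/lib[^/]*\)\(.*\)}"@\1'"${dest}"'\2\3 --enable-new-dtags -rpath '"${dest}"'\2}"@'
  sed -e "$s1" -e "$s2"
}

# Hypothetical sample spec line, for illustration only.
sample='"%{!shared: %{!static: %{!dynamic-linker:-dynamic-linker /lib/ld.so.1}}}"'
printf '%s\n' "$sample" | rewrite_link_spec /opt/toolchain
```

For that sample the result should be
"%{!static: %{!shared: %{!dynamic-linker:-dynamic-linker /opt/toolchain/lib/ld.so.1}} --enable-new-dtags -rpath /opt/toolchain/lib}"
so the rpath ends up guarded only by !static and is therefore applied to
shared libraries as well as executables.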

> I chose to use --enable-dataplt as a configure option to make the
> compiler default to -mdata-plt.

rth suggested I use something different, because he wants something
similar for Alpha where the "data" part doesn't make any sense.  So
these will change to --enable-secureplt and -msecure-plt.
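
To make the intended behaviour concrete, here is a small sketch of how
the configure-time default and an explicit option would combine.  It is
modelled on the %{!mbss-plt: %{!mdata-plt: %(cc1_data_plt_default)}} spec
from the posted patch with the new names substituted; the function names
are invented for illustration and the renamed patch may well differ.

```shell
#!/bin/sh
# Hypothetical model of the driver logic, not code from the patch.

# plt_default ENABLE_SECUREPLT
# Prints the configured default flag, or nothing when not enabled.
plt_default() {
  if test "x$1" = xyes; then
    echo "-msecure-plt"
  fi
}

# effective_plt_flag ENABLE_SECUREPLT [driver args...]
# Prints the flag the spec would add on top of the user's arguments.
effective_plt_flag() {
  enable=$1; shift
  for arg in "$@"; do
    case $arg in
      # An explicit choice on the command line suppresses the default.
      -mbss-plt|-msecure-plt) return 0 ;;
    esac
  done
  plt_default "$enable"
}
```

With this model, "effective_plt_flag yes -O2" prints -msecure-plt, while
"effective_plt_flag yes -mbss-plt" and "effective_plt_flag no -O2" print
nothing.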

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

[-- Attachment #2: recipe --]
[-- Type: text/plain, Size: 19544 bytes --]

#! /bin/bash
# Build a native toolchain, binutils+gcc+glibc+gdb, that coexists rather
# than replacing the system tools and libraries.  They will be built in
# build/* relative to the current working dir.

# Where the toolchain should be installed
dest=/home/alan/toolchain

# Where to install binutils and gcc that use the host glibc
# If this isn't set it is assumed that the host tools are sufficiently recent
# to support -mdata-plt.
host_dest=

# Where various sources may be found.  These are all copied to src/*
# relative to the current working dir.
kernel_src=/src/ppc-2.6.11.10
gcc_src=/src/gcc-4.0
glibc_src=/src/libc-current
binutils_src=/src/binutils-current
gdb_src=/src/gdb-current

# If this is set, then binutils and gdb will be extracted from the cygnus
# source tree.
cygnus_src=

# If combined_tree is set to "yes", then a combined binutils/gcc tree will be
# built.  In general this only works for mainline CVS source, because the top
# level directory make and configure files of each need to be virtually
# identical.
combined_tree=

# Stop on errors.
set -e

# Find as and ld for gcc
if test -n "${host_dest}"; then
  # We'll be building new ones.
  gcc_as="${host_dest}"/bin/as
  gcc_ld="${host_dest}"/bin/ld
else
  # Find as and ld from the user's PATH
  save_IFS="$IFS"
  IFS=:
  for d in $PATH; do
    if test -x "$d/as"; then
      gcc_as="$d/as"
      break
    fi
  done
  for d in $PATH; do
    if test -x "$d/ld"; then
      gcc_ld="$d/ld"
      break
    fi
  done
  IFS="$save_IFS"
  if test -z "$gcc_as"; then
    echo "Cannot find as!"
    exit 1
  fi
  if test -z "$gcc_ld"; then
    echo "Cannot find ld!"
    exit 1
  fi
  # Sanity check the ld we found
  if ! "$gcc_ld" --help 2>&1 | grep -q bss-plt; then
    echo "$gcc_ld is too old!"
    exit 1
  fi
  # Also check host gcc
  if ! gcc -v --help 2>&1 | grep -q data-plt; then
    echo "host gcc is too old!"
    exit 1
  fi
fi

# Set up local copies of sources.

if test -n "${kernel_src}"; then
  mkdir -p src/kernel
  rsync -rptgoC --delete "${kernel_src}"/ src/kernel/
  kernel_src=`cd src/kernel; pwd`
fi

if test -n "${cygnus_src}"; then
  mkdir -p src/binutils
  # Pull binutils source out of the whole cygnus repo.
  # Exclude top level dirs that we don't need to build.
  rsync -rptgoLC --delete \
    --exclude blt \
    --exclude compile \
    --exclude contrib \
    --exclude dejagnu \
    --exclude depcomp \
    --exclude expect \
    --exclude gdb \
    --exclude itcl \
    --exclude iwidgets \
    --exclude libgloss \
    --exclude libgui \
    --exclude mmalloc \
    --exclude newlib \
    --exclude rda \
    --exclude readline \
    --exclude sid \
    --exclude sim \
    --exclude tcl \
    --exclude tk \
    --exclude utils \
    --exclude winsup \
    "${cygnus_src}"/ src/binutils
  binutils_src=`cd src/binutils; pwd`

  mkdir -p src/gdb
  # Pull gdb source out of the whole cygnus repo.  Exclude some top level
  # dirs, and some under gdb;  gdbtk is old stuff we don't want.
  rsync -rptgoLC --delete \
    --exclude /binutils \
    --exclude /blt \
    --exclude /cgen \
    --exclude /compile \
    --exclude /contrib \
    --exclude /dejagnu \
    --exclude /depcomp \
    --exclude /expect \
    --exclude /gas \
    --exclude /gdb/gdbtk \
    --exclude /gdb/testsuite/gdb.gdbtk \
    --exclude /gprof \
    --exclude /intl \
    --exclude /itcl \
    --exclude /iwidgets \
    --exclude /ld \
    --exclude /libgloss \
    --exclude /libgui \
    --exclude /mmalloc \
    --exclude /newlib \
    --exclude /rda \
    --exclude /sid \
    --exclude /tcl \
    --exclude /tk \
    --exclude /utils \
    --exclude /winsup \
    "${cygnus_src}"/ src/gdb
  gdb_src=`cd src/gdb; pwd`

else

  if test -n "${binutils_src}"; then
    mkdir -p src/binutils
    rsync -rptgoC --delete "${binutils_src}"/ src/binutils/
    binutils_src=`cd src/binutils; pwd`
  fi

  if test -n "${gdb_src}"; then
    mkdir -p src/gdb
    rsync -rptgoC --delete "${gdb_src}"/ src/gdb/
    gdb_src=`cd src/gdb; pwd`
  fi

fi

if test -n "${gcc_src}"; then
  mkdir -p src/gcc
  rsync -rptgoC --delete "${gcc_src}"/ src/gcc/
  gcc_src=`cd src/gcc; pwd`
fi

if test -n "${glibc_src}"; then
  mkdir -p src/glibc
  rsync -rptgoC --delete "${glibc_src}"/ src/glibc/
  glibc_src=`cd src/glibc; pwd`
fi

if test ! -d "${kernel_src}"; then
  echo "No kernel source!"
  exit 1
fi
if test ! -d "${binutils_src}"; then
  echo "No binutils source!"
  exit 1
fi
if test ! -d "${gcc_src}"; then
  echo "No gcc source!"
  exit 1
fi
if test ! -d "${glibc_src}"; then
  echo "No glibc source!"
  exit 1
fi

unset combined_src
if test x${combined_tree} = xyes; then
  # Use a combined source tree
  rm -rf src/combined
  mkdir -p src/combined
  cp -al "${binutils_src}"/* src/combined
  cp -alf "${gcc_src}"/* src/combined
  combined_src=`cd src/combined; pwd`
fi

if test -n "${host_dest}"; then
  # host binutils+gcc build
  # We do this because glibc requires at least gcc-3.4, and the system
  # compiler may be older.  We also need host tools that support -mdata-plt.
  # NOTE!  This build must not use gcc sources hacked to provide -rpath

  if test -z "${combined_src}"; then
    rm -rf build/host_bin
    mkdir -p build/host_bin
    cd build/host_bin

    CFLAGS="-g -O" CXXFLAGS="-g -O" "${binutils_src}"/configure \
      --build=powerpc-linux \
      --host=powerpc-linux \
      --target=powerpc-linux \
      --enable-targets=powerpc64-linux \
      --prefix="${host_dest}" \
      --disable-nls \
      >& _configure
    make >& _make
    make install >& _install
    cd ../..
  fi

  rm -rf build/host_gcc
  mkdir -p build/host_gcc
  cd build/host_gcc

  CFLAGS="-g -O" CXXFLAGS="-g -O" "${combined_src:-${gcc_src}}"/configure \
    --build=powerpc-linux \
    --host=powerpc-linux \
    --target=powerpc-linux \
    --enable-targets=powerpc64-linux \
    --prefix="${host_dest}" \
    --enable-__cxa_atexit \
    --enable-languages=c \
    --enable-shared \
    --disable-nls \
    >& _configure
  make bootstrap >& _make
  make install >& _install
  cd ../..

  # Sanity check the ld we just built
  if ! "$gcc_ld" --help 2>&1 | grep -q bss-plt; then
    echo "$gcc_ld is too old!"
    exit 1
  fi

  PATH="${host_dest}/bin:$PATH"
fi

# Hack the gcc source so that binaries generated will be able to use
# the new glibc.  This isn't necessary (or desirable) when building
# a cross-compiler because with a cross-compiler you would be running
# the binaries on another system, which presumably would have glibc
# installed in the usual place.  It is also possible to hack the gcc
# specs file after gcc is built instead of editing the source, but that
# presents a problem when trying to bootstrap gcc.  A gcc bootstrap is
# a multi-stage process, and you'd need to hack specs after each stage.
# It is not possible to simply set LD_LIBRARY_PATH in the environment
# because existing dynamic apps would then try to use the old ld.so
# with the new libc.so.  This fails miserably.

# Add -rpath to where new shared libs will be installed, and modify
# --dynamic-linker to point there too.  Use new dtags so that we get
# DT_RUNPATH in binaries rather than DT_RPATH.  DT_RPATH can't be overridden
# with LD_LIBRARY_PATH.
s1='s@%{!shared: %{!static:@%{!static: %{!shared:@'
s2='s@\(-dynamic-linker \)\(/lib[^/]*\)\(.*\)}"@\1'"${dest}"'\2\3 --enable-new-dtags -rpath '"${dest}"'\2}"@'
for f in "${combined_src:-${gcc_src}}"/gcc/config/rs6000/{linux64.h,sysv4.h}
do
  sed -e "$s1" -e "$s2" < "$f" > tmp.$$ \
    && { cmp -s tmp.$$ "$f" || mv -f tmp.$$ "$f"; } || exit 1
  rm -f tmp.$$
done

# Hack the glibc source so that the -rpath option we add on all shared libs
# and executables won't bomb ld.so, which checks that -rpath is not given.
s1='s@assert (\(info\[DT_R.*PATH\]\) == NULL)@\1 = NULL@'
for f in "${glibc_src}"/elf/dynamic-link.h
do
  sed -e "$s1" < "$f" > tmp.$$ \
    && { cmp -s tmp.$$ "$f" || mv -f tmp.$$ "$f"; } || exit 1
  rm -f tmp.$$
done

# kernel headers.

( cd "${kernel_src}"
  yes "" | make ARCH=ppc64 distclean oldconfig \
                include/linux/autoconf.h include/linux/version.h >& _make
)

rm -rf "${dest}"/include/{linux,asm,asm-generic,asm-ppc,asm-ppc64}
mkdir -p "${dest}"/include/asm
cp -a "${kernel_src}"/include/linux "${dest}"/include/linux
cp -a "${kernel_src}"/include/asm-generic "${dest}"/include/asm-generic
cp -a "${kernel_src}"/include/asm-ppc "${dest}"/include/asm-ppc
cp -a "${kernel_src}"/include/asm-ppc64 "${dest}"/include/asm-ppc64
( cd "${dest}"/include/asm-ppc64
  header_list="$(/bin/ls *.h)"
  cd ../asm
  for header in ${header_list}; do
    rm -f ${header}
    macro=$(echo ${header} |
      sed 'y/.abcdefghijklmnopqrstuvwxyz/_ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

    cat >> ${header} <<EOF
#ifndef __ASM_STUB_${macro}__
# define __ASM_STUB_${macro}__
# if defined __powerpc64__
#  include <asm-ppc64/${header}>
# elif defined __powerpc__
#  include <asm-ppc/${header}>
# endif
#endif
EOF
  done
)

# glibc headers

rm -rf build/glibc
mkdir -p build/glibc
cd build/glibc

"${glibc_src}"/configure \
  --build=powerpc-linux \
  --host=powerpc64-linux \
  --prefix="${dest}" \
  --with-headers="${dest}"/include \
  --disable-sanity-checks \
  --without-cvs \
  >& _configure

make sysdeps/gnu/errlist.c >& _make
mkdir -p stdio-common
touch stdio-common/errlist-compat.c
make install-headers >& _install

# Some headers aren't installed by install-headers, so do them by hand.
mkdir -p "${dest}"/include/gnu

if test ! -f "${dest}"/include/gnu/stubs.h; then
  touch "${dest}"/include/gnu/stubs.h
fi
if test ! -f "${dest}"/include/gnu/stubs-32.h; then
  touch "${dest}"/include/gnu/stubs-32.h
fi
if test ! -f "${dest}"/include/gnu/stubs-64.h; then
  touch "${dest}"/include/gnu/stubs-64.h
fi
if test ! -f "${dest}"/include/features.h; then
  cp "${glibc_src}"/include/features.h "${dest}"/include/features.h
fi
if test ! -f "${dest}"/include/bits/stdio_lim.h; then
  cp bits/stdio_lim.h "${dest}"/include/bits/stdio_lim.h
fi
cd ../..

# first binutils+gcc build
# This can't be a "make bootstrap" because we don't have glibc installed yet.

if test -z "${combined_src}"; then
  rm -rf build/bin
  mkdir -p build/bin
  cd build/bin

  CFLAGS="-g -O" CXXFLAGS="-g -O" "${binutils_src}"/configure \
    --build=powerpc-linux \
    --host=powerpc-linux \
    --target=powerpc-linux \
    --enable-targets=powerpc64-linux \
    --prefix="${dest}" \
    --disable-nls \
    >& _configure
  make >& _make
  make install >& _install
  cd ../..
fi

rm -rf build/gcc
mkdir -p build/gcc
cd build/gcc

AS="$gcc_as" LD="$gcc_ld" \
"${combined_src:-${gcc_src}}"/configure \
  --build=powerpc-linux \
  --host=powerpc-linux \
  --target=powerpc-linux \
  --enable-targets=powerpc64-linux \
  --prefix="${dest}" \
  --enable-dataplt \
  --disable-shared \
  --disable-threads \
  --disable-libmudflap \
  --enable-languages=c \
  >& _configure
make >& _make
make install >& _install
cd ../..

# Install powerpc-linux-* and powerpc64-linux-* aliases for our new tools
# This is in case these already exist somewhere on the system.  Many
# configure scripts prefer to use eg. "powerpc-linux-as" over plain "as".
#
( cd "${dest}"/bin
  for z in addr2line ar as c++filt ld nm objcopy objdump ranlib size strings strip
  do
    test -x $z && test ! -e powerpc-linux-$z && ln -sfn $z powerpc-linux-$z
  done
  for z in addr2line ar c++filt nm objcopy objdump ranlib size strings strip
  do
    test -x $z && test ! -e powerpc64-linux-$z && ln -sfn $z powerpc64-linux-$z
  done
  if test ! -e powerpc64-linux-as; then
    cat > powerpc64-linux-as <<\EOF
#!/bin/sh
exec powerpc-linux-as -a64 "$@"
EOF
    chmod a+x powerpc64-linux-as
  fi
  if test ! -e powerpc64-linux-ld; then
    cat > powerpc64-linux-ld <<\EOF
#!/bin/sh
exec powerpc-linux-ld -melf64ppc "$@"
EOF
    chmod a+x powerpc64-linux-ld
  fi
  if test ! -e powerpc64-linux-gcc; then
    cat > powerpc64-linux-gcc <<\EOF
#! /bin/sh
case "$@" in
*-m32*) powerpc-linux-gcc "$@";;
*) powerpc-linux-gcc -m64 "$@";;
esac
EOF
    chmod a+x powerpc64-linux-gcc
  fi
  if test -x powerpc-linux-g++; then
    if test ! -e powerpc64-linux-g++; then
      cat > powerpc64-linux-g++ <<\EOF
#! /bin/sh
case "$@" in
*-m32*) powerpc-linux-g++ "$@";;
*) powerpc-linux-g++ -m64 "$@";;
esac
EOF
      chmod a+x powerpc64-linux-g++
    fi
    ln -f powerpc64-linux-g++ powerpc64-linux-c++
  fi
)

# first ppc32 glibc build
# We can't use the newly built gcc to compile glibc because it will set the
# dynamic linker to be ${dest}/lib/ld.so.1, which isn't installed until the
# glibc build finishes.  So trying to run anything compiled with the new gcc
# will fail, in particular, glibc configure tests.  I suppose you might be
# able to supply glibc configure with lots of libc_cv_* variables to
# avoid this, but then you'd forever be changing this script to keep up with
# new glibc configure tests.
# Note that dynamically linked programs built here with the old host gcc are
# subtly broken too.  The glibc build sets their dynamic linker to
# ${dest}/lib/ld.so.1 but doesn't provide an rpath, which means you'll get the
# new ld.so trying to use the system libc.so, which doesn't work.  ld.so and
# libc.so share data structures, so they are tightly coupled.  To run the new
# programs, you need to set LD_LIBRARY_PATH for them, or better (so as to not
# affect forked commands that might need the system libs), run ld.so.1
# explicitly, passing --library-path as is done for localedef below.
# This is one of the reasons why you need to build glibc twice.
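#
# Illustrative sketch only, not part of the original script: the explicit
# ld.so.1 invocation described above could be wrapped in a small helper.
# The name run_with_new_libc is hypothetical and nothing below calls it;
# it assumes ${dest} is set as in the rest of this script.
run_with_new_libc () {
  # Run "$@" under the newly installed dynamic linker, searching only the
  # new library directory, without exporting LD_LIBRARY_PATH to children.
  "${dest}"/lib/ld.so.1 --library-path "${dest}"/lib "$@"
}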

rm -rf build/glibc32
mkdir -p build/glibc32
cd build/glibc32

CC="gcc -mdata-plt" \
"${glibc_src}"/configure \
  --build=powerpc-linux \
  --host=powerpc-linux \
  --prefix="${dest}" \
  --with-headers="${dest}"/include \
  --enable-add-ons=nptl \
  --with-tls \
  --without-cvs \
  >& _configure
make >& _make
make install >& _install
cd ../..

if test ! -f "${dest}"/etc/ld.so.conf; then
  cat > "${dest}"/etc/ld.so.conf <<EOF
${dest}/powerpc-linux/lib
${dest}/powerpc-linux/lib64
${dest}/lib
${dest}/lib64
/usr/local/powerpc-linux/lib
/usr/local/powerpc64-linux/lib
/usr/local/lib
/usr/local/lib64
/lib
/lib64
/usr/lib
/usr/lib64
EOF
fi
"${dest}"/sbin/ldconfig

# Build locale files.
# Needed for libstdc++ configure in the full GCC build below
mkdir -p "${dest}"/lib/locale
sed -n -e '/SUPPORTED-LOCALES/,$ s@\([^/]*\)/\([^ \\]*\).*@\1 \2@p' \
 < "${glibc_src}"/localedata/SUPPORTED | sort | \
while read l c x; do
  echo localedef -i ${l%\.*} -f $c $l
  "${dest}"/lib/ld.so.1 --library-path "${dest}"/lib \
    "${dest}"/bin/localedef -i ${l%\.*} -f $c $l || true
done

# Now set PATH to include the newly built toolchain.
PATH="${dest}/bin:$PATH"

# Hack around a glibc build problem.  gcc has been built static-only,
# so libgcc_eh.a isn't built and all the libgcc_eh functions are in
# libgcc.a.  Yet glibc wants to link with libgcc_eh.a.
for z in `find "${dest}"/lib -name libgcc.a`; do
  test -e ${z%\.a}_eh.a || ln -s libgcc.a ${z%\.a}_eh.a
done

# first ppc64 glibc build

rm -rf build/glibc64
mkdir -p build/glibc64
cd build/glibc64

# We want shared libs in ${dest}/lib64, and we don't want to trample on
# any 32-bit binaries.
echo slibdir="${dest}"/lib64 > configparms
echo bindir="${dest}"/bin64 >> configparms
echo sbindir="${dest}"/sbin64 >> configparms
echo rootsbindir="${dest}"/sbin64 >> configparms

# We know the compiler we are using supports everything glibc needs,
# but glibc can't run all its configure tests because a 64-bit glibc isn't
# yet installed.  So tell configure some answers.
libc_cv_forced_unwind=yes \
libc_cv_c_cleanup=yes \
"${glibc_src}"/configure \
    --build=powerpc-linux \
    --host=powerpc64-linux \
    --prefix="${dest}" \
    --libdir="${dest}"/lib64 \
    --libexecdir="${dest}"/libexec64 \
    --with-headers="${dest}"/include \
    --enable-add-ons=nptl \
    --with-tls \
    --without-cvs \
    >& _configure
make >& _make
make install >& _install
cd ../..

# Build locale files.
mkdir -p "${dest}"/lib64/locale
sed -n -e '/SUPPORTED-LOCALES/,$ s@\([^/]*\)/\([^ \\]*\).*@\1 \2@p' \
 < "${glibc_src}"/localedata/SUPPORTED | sort | \
while read l c x; do
  echo localedef -i ${l%\.*} -f $c $l
  "${dest}"/bin64/localedef -i ${l%\.*} -f $c $l || true
done

# second binutils+gcc build
# This time we build a full compiler.  Pass AS and LD to configure,
# because configure isn't clever enough to find the right as and ld.

if test -z "${combined_src}"; then
  rm -rf build/bin_1
  mv build/bin build/bin_1
  mkdir -p build/bin
  cd build/bin
  CFLAGS="-g -O" CXXFLAGS="-g -O" "${binutils_src}"/configure \
    --build=powerpc-linux \
    --host=powerpc-linux \
    --target=powerpc-linux \
    --enable-targets=powerpc64-linux \
    --prefix="${dest}" \
    --disable-nls \
    >& _configure
  make >& _make
  make install >& _install
  cd ../..
fi

rm -rf build/gcc_1
mv build/gcc build/gcc_1
mkdir -p build/gcc
cd build/gcc

AS="${dest}"/bin/as \
LD="${dest}"/bin/ld \
"${combined_src:-${gcc_src}}"/configure \
    --build=powerpc-linux \
    --host=powerpc-linux \
    --target=powerpc-linux \
    --enable-targets=powerpc64-linux \
    --prefix="${dest}" \
    --enable-dataplt \
    --enable-__cxa_atexit \
    --enable-shared \
    --enable-languages=all \
    >& _configure
make STAGE1_CFLAGS="-g -O" bootstrap >& _make
make install >& _install
cd ../..

# second ppc32 glibc build

rm -rf build/glibc32_1
mv build/glibc32 build/glibc32_1
mkdir -p build/glibc32
cd build/glibc32

"${glibc_src}"/configure \
    --build=powerpc-linux \
    --host=powerpc-linux \
    --prefix="${dest}" \
    --with-headers="${dest}"/include \
    --enable-add-ons=nptl \
    --with-tls \
    --without-cvs \
    >& _configure
make >& _make
make install >& _install
cd ../..

# second ppc64 glibc build
# This is hardly necessary, since we built ppc64 glibc using the new gcc in
# the first build.  However, we fudge glibc configure params here to say we
# aren't really cross-compiling, which lets us build a few more parts of
# glibc.

rm -rf build/glibc64_1
mv build/glibc64 build/glibc64_1
mkdir -p build/glibc64
cd build/glibc64

# We want shared libs in ${dest}/lib64, and we don't want to trample on
# any 32-bit binaries.  Also say that we can run 64-bit binaries (assuming
# ppc64 hardware!).
echo slibdir="${dest}"/lib64 > configparms
echo bindir="${dest}"/bin64 >> configparms
echo sbindir="${dest}"/sbin64 >> configparms
echo rootsbindir="${dest}"/sbin64 >> configparms
echo cross-compiling=no >> configparms

"${glibc_src}"/configure \
    --build=powerpc-linux \
    --host=powerpc64-linux \
    --prefix="${dest}" \
    --libdir="${dest}"/lib64 \
    --libexecdir="${dest}"/libexec64 \
    --with-headers="${dest}"/include \
    --enable-add-ons=nptl \
    --with-tls \
    --without-cvs \
    >& _configure
make >& _make
make install >& _install
cd ../..

# ppc gdb build

if test -n "${gdb_src}"; then
  rm -rf build/gdb32
  mkdir -p build/gdb32
  cd build/gdb32

  "${gdb_src}"/configure \
    --host=powerpc-linux \
    --prefix="${dest}" \
    --program-suffix=32 \
    --disable-tcl \
    --disable-tui \
    >& _configure
  make >& _make
  make install >& _install
  cd ../..
fi

# ppc64 gdb build

if test -n "${gdb_src}"; then
  rm -rf build/gdb64
  mkdir -p build/gdb64
  cd build/gdb64

  "${gdb_src}"/configure \
    --host=powerpc64-linux \
    --prefix="${dest}" \
    --libdir="${dest}"/lib64 \
    --program-suffix=64 \
    --disable-tcl \
    --disable-tui \
    >& _configure

  make >& _make
  make install >& _install
  cd ../..
fi


* Re: powerpc new PLT and GOT
  2005-05-30 15:49     ` Alan Modra
@ 2005-05-31  0:35       ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-05-31  0:35 UTC (permalink / raw)
  To: gcc-patches

On Mon, May 30, 2005 at 09:41:13AM +0930, Alan Modra wrote:
> I've attached a script I use to build a
> toolchain that needs a new glibc to properly test.

Um, which had a trivial bug when building host tools.  gcc_as and gcc_ld
should be "${host_dest}"/bin/as and "${host_dest}"/bin/ld.  They were
missing the /bin.
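
For anyone patching a local copy of the script, the fix amounts to something like the sketch below. `host_dest`, `gcc_as` and `gcc_ld` are the script's own variables; the fallback value given to `host_dest` and the executable check are assumptions added purely for illustration and are not in the script as posted.

```shell
# Assumed fallback for illustration only; the script presumably sets
# host_dest itself before this point.
host_dest="${host_dest:-/usr/local}"

# Corrected paths: the host assembler and linker live under bin/.
gcc_as="${host_dest}/bin/as"
gcc_ld="${host_dest}/bin/ld"

# Optional sanity check (an addition, not in the posted script): warn if
# either tool is missing, since configure would otherwise be handed a
# bad AS or LD.
for tool in "$gcc_as" "$gcc_ld"; do
  if test ! -x "$tool"; then
    echo "warning: $tool is missing or not executable" >&2
  fi
done
```

These are the values passed as AS and LD to the first gcc configure run.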

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: powerpc new PLT and GOT
  2005-05-25 14:26   ` Alan Modra
  2005-05-25 15:17     ` Paolo Carlini
  2005-05-30 15:49     ` Alan Modra
@ 2005-05-31 10:53     ` Alan Modra
  2005-05-31 13:42       ` Joseph S. Myers
  2 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-05-31 10:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Updated to include documentation and requested changes.  Bootstrapped
and tested powerpc-linux.  I also built powerpc-darwin and
powerpc-ibm-aix5.1 C compilers (up to libgcc point, no assembler
installed) to verify that the new powerpc-linux options would not
somehow break bootstrap on other rs6000 targets.

gcc/
	* configure.ac: Add --enable-secureplt.
	(HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config.gcc (powerpc64-*-linux*, powerpc-*-linux*): Add
	rs6000/secureplt.h to tm_file when enable_secureplt.
	* doc/invoke.texi (msecure-plt, mbss-plt): Document.
	* config/rs6000/secureplt.h: New file.
	* config/rs6000/sysv4.h (TARGET_SECURE_PLT): Define.
	(SUBTARGET_OVERRIDE_OPTIONS): Error if -msecure-plt given without
	assembler support.
	(CC1_SECURE_PLT_DEFAULT_SPEC): Define.
	(CC1_SPEC): Delete duplicate mno-sdata.  Invoke cc1_secure_plt_default.
	(SUBTARGET_EXTRA_SPECS): Add cc1_secure_plt_default.
	* config/rs6000/sysv4.opt (msecure-plt, mbss-plt): Add options.
	* config/rs6000/rs6000.h (TARGET_SECURE_PLT): Define.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_SECURE_PLT got register load sequence.
	(rs6000_emit_prologue): Call rs6000_emit_load_toc_table when
	TARGET_SECURE_PLT.
	(rs6000_elf_declare_function_name): Don't emit toc address offset
	word when TARGET_SECURE_PLT.
	* config/rs6000/rs6000.md (elf_high, elf_low): Move past load_toc_*.
	(load_toc_v4_PIC_1): Enable for TARGET_SECURE_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for sysv pic and
	TARGET_SECURE_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_SECURE_PLT
	and -fPIC.
	* config/rs6000/tramp.asm (trampoline_initial): Use "bcl 20,31".
	(__trampoline_setup): Likewise.  Init r30 before plt call.

libffi/
	* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Don't use JUMPTARGET
	to call ffi_closure_helper_SYSV.  Append @local instead.
	* src/powerpc/sysv.S (ffi_call_SYSV): Likewise for ffi_prep_args_SYSV.

diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/configure.ac gcc-plt/gcc/configure.ac
--- gcc-virgin/gcc/configure.ac	2005-05-25 11:50:01.000000000 +0930
+++ gcc-plt/gcc/configure.ac	2005-05-31 16:50:19.000000000 +0930
@@ -1441,6 +1441,10 @@ case "$LIBINTL" in *$LIBICONV*)
 	LIBICONV= ;;
 esac
 
+AC_ARG_ENABLE(secureplt,
+[  --enable-secureplt      enable -msecure-plt by default for PowerPC],
+[], [])
+
 # Windows32 Registry support for specifying GCC installation paths.
 AC_ARG_ENABLE(win32-registry,
 [  --disable-win32-registry
@@ -2822,6 +2826,24 @@ foo:	nop
       [AC_DEFINE(HAVE_AS_POPCNTB, 1,
 	  [Define if your assembler supports popcntb field.])])
 
+    case $target in
+      *-*-aix*) conftest_s='	.csect .text[[PR]]
+LCF..0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-LCF..0@ha';;
+      *-*-darwin*)
+	conftest_s='	.text
+LCF0:
+	addis r11,r30,_GLOBAL_OFFSET_TABLE_-LCF0@ha';;
+      *) conftest_s='	.text
+.LCF0:
+	addis 11,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha';;
+    esac
+
+    gcc_GAS_CHECK_FEATURE([rel16 relocs],
+      gcc_cv_as_powerpc_rel16, [2,17,0], -a32,
+      [$conftest_s],,
+      [AC_DEFINE(HAVE_AS_REL16, 1,
+	  [Define if your assembler supports R_PPC_REL16 relocs.])])
     ;;
 
   mips*-*-*)
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config.gcc gcc-plt/gcc/config.gcc
--- gcc-virgin/gcc/config.gcc	2005-05-19 19:10:56.000000000 +0930
+++ gcc-plt/gcc/config.gcc	2005-05-31 12:58:32.000000000 +0930
@@ -1581,6 +1581,9 @@ powerpc64-*-linux*)
 	test x$with_cpu != x || cpu_is_64bit=yes
 	test x$cpu_is_64bit != xyes || tm_file="${tm_file} rs6000/default64.h"
 	tm_file="rs6000/biarch64.h ${tm_file} rs6000/linux64.h"
+	if test x${enable_secureplt} = xyes; then
+		tm_file="rs6000/secureplt.h ${tm_file}"
+	fi
 	extra_options="${extra_options} rs6000/sysv4.opt rs6000/linux64.opt"
 	tmake_file="rs6000/t-fprules ${tmake_file} rs6000/t-ppccomm rs6000/t-linux64"
 	;;
@@ -1690,6 +1693,9 @@ powerpc-*-linux*)
 		tm_file="${tm_file} rs6000/linux.h"
 		;;
 	esac
+	if test x${enable_secureplt} = xyes; then
+		tm_file="rs6000/secureplt.h ${tm_file}"
+	fi
 	;;
 powerpc-*-gnu-gnualtivec*)
 	tm_file="${cpu_type}/${cpu_type}.h elfos.h svr4.h freebsd-spec.h gnu.h rs6000/sysv4.h rs6000/linux.h rs6000/linuxaltivec.h rs6000/gnu.h"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/doc/invoke.texi gcc-plt/gcc/doc/invoke.texi
--- gcc-virgin/gcc/doc/invoke.texi	2005-05-31 11:06:18.000000000 +0930
+++ gcc-plt/gcc/doc/invoke.texi	2005-05-31 16:51:10.000000000 +0930
@@ -636,7 +636,7 @@ See RS/6000 and PowerPC Options.
 -minsert-sched-nops=@var{scheme} @gol
 -mcall-sysv  -mcall-netbsd @gol
 -maix-struct-return  -msvr4-struct-return @gol
--mabi=@var{abi-type} @gol
+-mabi=@var{abi-type} -msecure-plt -mbss-plt @gol
 -misel -mno-isel @gol
 -misel=yes  -misel=no @gol
 -mspe -mno-spe @gol
@@ -10733,6 +10733,18 @@ ABI@.
 @opindex mabi=no-spe
 Disable Booke SPE ABI extensions for the current ABI@.
 
+@item -msecure-plt
+@opindex msecure-plt
+Generate code that allows ld and ld.so to build executables and shared
+libraries with non-exec .plt and .got sections.  This is a PowerPC
+32-bit SYSV ABI option.
+
+@item -mbss-plt
+@opindex mbss-plt
+Generate code that uses a BSS .plt section that ld.so fills in, and
+requires .plt and .got sections that are both writable and executable.
+This is a PowerPC 32-bit SYSV ABI option.
+
 @item -misel
 @itemx -mno-isel
 @opindex misel
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/secureplt.h gcc-plt/gcc/config/rs6000/secureplt.h
--- gcc-virgin/gcc/config/rs6000/secureplt.h	1970-01-01 09:30:00.000000000 +0930
+++ gcc-plt/gcc/config/rs6000/secureplt.h	2005-05-31 12:58:32.000000000 +0930
@@ -0,0 +1,21 @@
+/* Default to -msecure-plt.
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to
+the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02111-1307, USA.  */
+
+#define CC1_SECURE_PLT_DEFAULT_SPEC "-msecure-plt"
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.h gcc-plt/gcc/config/rs6000/sysv4.h
--- gcc-virgin/gcc/config/rs6000/sysv4.h	2005-05-06 23:34:43.000000000 +0930
+++ gcc-plt/gcc/config/rs6000/sysv4.h	2005-05-31 12:58:32.000000000 +0930
@@ -59,6 +59,11 @@ extern enum rs6000_sdata_type rs6000_sda
 #define	TARGET_NO_TOC		(! TARGET_TOC)
 #define	TARGET_NO_EABI		(! TARGET_EABI)
 
+#ifdef HAVE_AS_REL16
+#undef TARGET_SECURE_PLT
+#define TARGET_SECURE_PLT	secure_plt
+#endif
+
 extern const char *rs6000_abi_name;
 extern const char *rs6000_sdata_name;
 extern const char *rs6000_tls_size_string; /* For -mtls-size= */
@@ -205,6 +210,11 @@ do {									\
       error ("-mcall-aixdesc must be big endian");			\
     }									\
 									\
+  if (TARGET_SECURE_PLT != secure_plt)					\
+    {									\
+      error ("-msecure-plt not supported by your assembler");		\
+    }									\
+									\
   /* Treat -fPIC the same as -mrelocatable.  */				\
   if (flag_pic > 1 && DEFAULT_ABI != ABI_AIX)				\
     target_flags |= MASK_RELOCATABLE | MASK_MINIMAL_TOC | MASK_NO_FP_IN_TOC; \
@@ -750,6 +760,10 @@ extern int fixuplabelno;
 
 #define	CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
 
+#ifndef CC1_SECURE_PLT_DEFAULT_SPEC
+#define CC1_SECURE_PLT_DEFAULT_SPEC ""
+#endif
+
 /* Pass -G xxx to the compiler and set correct endian mode.  */
 #define	CC1_SPEC "%{G*} \
 %{mlittle|mlittle-endian: %(cc1_endian_little);           \
@@ -762,7 +776,6 @@ extern int fixuplabelno;
   mcall-gnu             : -mbig %(cc1_endian_big);        \
   mcall-i960-old        : -mlittle %(cc1_endian_little);  \
                         : %(cc1_endian_default)}          \
-%{mno-sdata: -msdata=none } \
 %{meabi: %{!mcall-*: -mcall-sysv }} \
 %{!meabi: %{!mno-eabi: \
     %{mrelocatable: -meabi } \
@@ -774,6 +787,7 @@ extern int fixuplabelno;
     %{mcall-openbsd: -mno-eabi }}} \
 %{msdata: -msdata=default} \
 %{mno-sdata: -msdata=none} \
+%{!mbss-plt: %{!msecure-plt: %(cc1_secure_plt_default)}} \
 %{profile: -p}"
 
 /* Don't put -Y P,<path> for cross compilers.  */
@@ -1214,6 +1228,7 @@ ncrtn.o%s"
   { "cc1_endian_big",		CC1_ENDIAN_BIG_SPEC },			\
   { "cc1_endian_little",	CC1_ENDIAN_LITTLE_SPEC },		\
   { "cc1_endian_default",	CC1_ENDIAN_DEFAULT_SPEC },		\
+  { "cc1_secure_plt_default",	CC1_SECURE_PLT_DEFAULT_SPEC },		\
   { "cpp_os_ads",		CPP_OS_ADS_SPEC },			\
   { "cpp_os_yellowknife",	CPP_OS_YELLOWKNIFE_SPEC },		\
   { "cpp_os_mvme",		CPP_OS_MVME_SPEC },			\
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.opt gcc-plt/gcc/config/rs6000/sysv4.opt
--- gcc-virgin/gcc/config/rs6000/sysv4.opt	2005-05-19 19:11:10.000000000 +0930
+++ gcc-plt/gcc/config/rs6000/sysv4.opt	2005-05-31 12:58:32.000000000 +0930
@@ -139,3 +139,11 @@ Generate 32-bit code
 mnewlib
 Target RejectNegative
 no description yet
+
+msecure-plt
+Target Report RejectNegative Var(secure_plt, 1)
+Generate code to use a non-exec PLT and GOT
+
+mbss-plt
+Target Report RejectNegative Var(secure_plt, 0)
+Generate code for old exec BSS PLT
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.h gcc-plt/gcc/config/rs6000/rs6000.h
--- gcc-virgin/gcc/config/rs6000/rs6000.h	2005-05-27 18:16:46.000000000 +0930
+++ gcc-plt/gcc/config/rs6000/rs6000.h	2005-05-31 12:58:32.000000000 +0930
@@ -144,6 +144,10 @@
 #define TARGET_POPCNTB 0
 #endif
 
+#ifndef TARGET_SECURE_PLT
+#define TARGET_SECURE_PLT 0
+#endif
+
 #define TARGET_32BIT		(! TARGET_64BIT)
 
 /* Emit a dtp-relative reference to a TLS variable.  */
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-plt/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2005-05-27 18:16:46.000000000 +0930
+++ gcc-plt/gcc/config/rs6000/rs6000.c	2005-05-31 12:58:32.000000000 +0930
@@ -12572,15 +12572,49 @@ rs6000_emit_load_toc_table (int fromprol
   rtx dest, insn;
   dest = gen_rtx_REG (Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
 
-  if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+  if (TARGET_ELF && TARGET_SECURE_PLT && DEFAULT_ABI != ABI_AIX && flag_pic)
     {
-      rtx temp = (fromprolog
-		  ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
-		  : gen_reg_rtx (Pmode));
-      insn = emit_insn (gen_load_toc_v4_pic_si (temp));
+      char buf[30];
+      rtx lab, tmp1, tmp2, got, tempLR;
+
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
+      lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+      if (flag_pic == 2)
+	got = gen_rtx_SYMBOL_REF (Pmode, toc_label_name);
+      else
+	got = rs6000_got_sym ();
+      tmp1 = tmp2 = dest;
+      if (!fromprolog)
+	{
+	  tmp1 = gen_reg_rtx (Pmode);
+	  tmp2 = gen_reg_rtx (Pmode);
+	}
+      tempLR = (fromprolog
+		? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		: gen_reg_rtx (Pmode));
+      insn = emit_insn (gen_load_toc_v4_PIC_1 (tempLR, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_move_insn (tmp1, tempLR);
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3b (tmp2, tmp1, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+      insn = emit_insn (gen_load_toc_v4_PIC_3c (dest, tmp2, got, lab));
+      if (fromprolog)
+	rs6000_maybe_dead (insn);
+    }
+  else if (TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 1)
+    {
+      rtx tempLR = (fromprolog
+		    ? gen_rtx_REG (Pmode, LINK_REGISTER_REGNUM)
+		    : gen_reg_rtx (Pmode));
+
+      insn = emit_insn (gen_load_toc_v4_pic_si (tempLR));
       if (fromprolog)
 	rs6000_maybe_dead (insn);
-      insn = emit_move_insn (dest, temp);
+      insn = emit_move_insn (dest, tempLR);
       if (fromprolog)
 	rs6000_maybe_dead (insn);
     }
@@ -13674,7 +13708,8 @@ rs6000_emit_prologue (void)
 
   /* If we are using RS6000_PIC_OFFSET_TABLE_REGNUM, we need to set it up.  */
   if ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
-      || (DEFAULT_ABI == ABI_V4 && flag_pic == 1
+      || (DEFAULT_ABI == ABI_V4
+	  && (flag_pic == 1 || (flag_pic && TARGET_SECURE_PLT))
 	  && regs_ever_live[RS6000_PIC_OFFSET_TABLE_REGNUM]))
     {
       /* If emit_load_toc_table will use the link register, we need to save
@@ -17204,6 +17239,7 @@ rs6000_elf_declare_function_name (FILE *
     }
 
   if (TARGET_RELOCATABLE
+      && !TARGET_SECURE_PLT
       && (get_pool_size () != 0 || current_function_profile)
       && uses_TOC ())
     {
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.md gcc-plt/gcc/config/rs6000/rs6000.md
--- gcc-virgin/gcc/config/rs6000/rs6000.md	2005-05-31 11:06:17.000000000 +0930
+++ gcc-plt/gcc/config/rs6000/rs6000.md	2005-05-31 12:58:32.000000000 +0930
@@ -7360,26 +7360,6 @@
 \f
 ;; Now define ways of moving data around.
 
-;; Elf specific ways of loading addresses for non-PIC code.
-;; The output of this could be r0, but we make a very strong
-;; preference for a base register because it will usually
-;; be needed there.
-(define_insn "elf_high"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
-	(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
-  "{liu|lis} %0,%1@ha")
-
-(define_insn "elf_low"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
-		   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
-   "@
-    {cal|la} %0,%2@l(%1)
-    {ai|addic} %0,%1,%K2")
-
-
 ;; Set up a register with a value from the GOT table
 
 (define_expand "movsi_got"
@@ -9810,7 +9790,8 @@
   [(set (match_operand:SI 0 "register_operand" "=l")
 	(match_operand:SI 1 "immediate_operand" "s"))
    (use (unspec [(match_dup 1)] UNSPEC_TOC))]
-  "TARGET_ELF && DEFAULT_ABI != ABI_AIX && flag_pic == 2"
+  "TARGET_ELF && DEFAULT_ABI != ABI_AIX
+   && (flag_pic == 2 || (flag_pic && TARGET_SECURE_PLT))"
   "bcl 20,31,%1\\n%1:"
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
@@ -9833,6 +9814,22 @@
   "{l|lwz} %0,%2-%3(%1)"
   [(set_attr "type" "load")])
 
+(define_insn "load_toc_v4_PIC_3b"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
+	(plus:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		 (high:SI
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s")))))]
+  "TARGET_ELF && TARGET_SECURE_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cau|addis} %0,%1,%2-%3@ha")
+
+(define_insn "load_toc_v4_PIC_3c"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
+		   (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s")
+			     (match_operand:SI 3 "symbol_ref_operand" "s"))))]
+  "TARGET_ELF && TARGET_SECURE_PLT && DEFAULT_ABI != ABI_AIX && flag_pic"
+  "{cal|addi} %0,%1,%2-%3@l")
 
 ;; If the TOC is shared over a translation unit, as happens with all
 ;; the kinds of PIC that we support, we need to restore the TOC
@@ -9867,6 +9864,25 @@
     rs6000_emit_load_toc_table (FALSE);
   DONE;
 }")
+
+;; Elf specific ways of loading addresses for non-PIC code.
+;; The output of this could be r0, but we make a very strong
+;; preference for a base register because it will usually
+;; be needed there.
+(define_insn "elf_high"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
+	(high:SI (match_operand 1 "" "")))]
+  "TARGET_ELF && ! TARGET_64BIT"
+  "{liu|lis} %0,%1@ha")
+
+(define_insn "elf_low"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b,!*r")
+		   (match_operand 2 "" "")))]
+   "TARGET_ELF && ! TARGET_64BIT"
+   "@
+    {cal|la} %0,%2@l(%1)
+    {ai|addic} %0,%1,%K2")
 \f
 ;; A function pointer under AIX is a pointer to a data area whose first word
 ;; contains the actual address of the function, whose second word contains a
@@ -9983,6 +9999,25 @@
 
   operands[0] = XEXP (operands[0], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
+      && flag_pic
+      && GET_CODE (operands[0]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[0]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_CALL (VOIDmode,
+				     gen_rtx_MEM (SImode, operands[0]),
+				     operands[1]),
+		       gen_rtx_USE (VOIDmode, operands[2]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[0]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[0]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[2]) & CALL_LONG) != 0))
@@ -10034,6 +10069,28 @@
 
   operands[1] = XEXP (operands[1], 0);
 
+  if (DEFAULT_ABI == ABI_V4 && TARGET_SECURE_PLT
+      && flag_pic
+      && GET_CODE (operands[1]) == SYMBOL_REF
+      && !SYMBOL_REF_LOCAL_P (operands[1]))
+    {
+      rtx call;
+      rtvec tmp;
+
+      tmp = gen_rtvec (3,
+		       gen_rtx_SET (VOIDmode,
+				    operands[0],
+				    gen_rtx_CALL (VOIDmode,
+						  gen_rtx_MEM (SImode,
+							       operands[1]),
+						  operands[2])),
+		       gen_rtx_USE (VOIDmode, operands[3]),
+		       gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (SImode)));
+      call = emit_call_insn (gen_rtx_PARALLEL (VOIDmode, tmp));
+      use_reg (&CALL_INSN_FUNCTION_USAGE (call), pic_offset_table_rtx);
+      DONE;
+    }
+
   if (GET_CODE (operands[1]) != SYMBOL_REF
       || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (operands[1]))
       || (DEFAULT_ABI != ABI_DARWIN && (INTVAL (operands[3]) & CALL_LONG) != 0))
@@ -10307,7 +10364,18 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 0, 2);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z0@plt" : "bl %z0";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_SECURE_PLT && flag_pic == 2)
+	/* The magic 32768 offset here and in the other sysv call insns
+	   corresponds to the offset of r30 in .got2, as given by LCTOC1.
+	   See sysv4.h:toc_section.  */
+	return "bl %z0+32768@plt";
+      else
+	return "bl %z0@plt";
+    }
+  else
+    return "bl %z0";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10352,7 +10420,15 @@
 #if TARGET_MACHO
   return output_call(insn, operands, 1, 3);
 #else
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? "bl %z1@plt" : "bl %z1";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_SECURE_PLT && flag_pic == 2)
+	return "bl %z1+32768@plt";
+      else
+	return "bl %z1@plt";
+    }
+  else
+    return "bl %z1";
 #endif
 }
   [(set_attr "type" "branch,branch")
@@ -10567,7 +10643,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z0@plt\" : \"b %z0\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_SECURE_PLT && flag_pic == 2)
+	return \"b %z0+32768@plt\";
+      else
+	return \"b %z0@plt\";
+    }
+  else
+    return \"b %z0\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
@@ -10613,7 +10697,15 @@
   else if (INTVAL (operands[2]) & CALL_V4_CLEAR_FP_ARGS)
     output_asm_insn (\"creqv 6,6,6\", operands);
 
-  return (DEFAULT_ABI == ABI_V4 && flag_pic) ? \"b %z1@plt\" : \"b %z1\";
+  if (DEFAULT_ABI == ABI_V4 && flag_pic)
+    {
+      if (TARGET_SECURE_PLT && flag_pic == 2)
+	return \"b %z1+32768@plt\";
+      else
+	return \"b %z1@plt\";
+    }
+  else
+    return \"b %z1\";
 }"
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/tramp.asm gcc-plt/gcc/config/rs6000/tramp.asm
--- gcc-virgin/gcc/config/rs6000/tramp.asm	2003-06-06 14:41:22.000000000 +0930
+++ gcc-plt/gcc/config/rs6000/tramp.asm	2005-05-31 12:58:32.000000000 +0930
@@ -44,7 +44,7 @@
 	.align	2
 trampoline_initial:
 	mflr	r0
-	bl	1f
+	bcl	20,31,1f
 .Lfunc = .-trampoline_initial
 	.long	0			/* will be replaced with function address */
 .Lchain = .-trampoline_initial
@@ -67,7 +67,7 @@ trampoline_size = .-trampoline_initial
 
 FUNC_START(__trampoline_setup)
 	mflr	r0		/* save return address */
-        bl	.LCF0		/* load up __trampoline_initial into r7 */
+        bcl	20,31,.LCF0	/* load up __trampoline_initial into r7 */
 .LCF0:
         mflr	r11
         addi	r7,r11,trampoline_initial-4-.LCF0 /* trampoline address -4 */
@@ -105,6 +105,12 @@ FUNC_START(__trampoline_setup)
 	blr
 
 .Labort:
+#if defined SHARED && defined HAVE_AS_REL16
+	bcl	20,31,1f
+1:	mflr	r30
+	addis	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@ha
+	addi	r30,r30,_GLOBAL_OFFSET_TABLE_-1b@l
+#endif
 	bl	JUMP_TARGET(abort)
 FUNC_END(__trampoline_setup)
 
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/libffi/src/powerpc/ppc_closure.S gcc-plt/libffi/src/powerpc/ppc_closure.S
--- gcc-virgin/libffi/src/powerpc/ppc_closure.S	2004-09-03 22:42:23.000000000 +0930
+++ gcc-plt/libffi/src/powerpc/ppc_closure.S	2005-05-31 12:58:32.000000000 +0930
@@ -57,7 +57,7 @@ ENTRY(ffi_closure_SYSV)
 	addi %r7,%r1,152
 
 	# make the call
-	bl JUMPTARGET(ffi_closure_helper_SYSV)
+	bl ffi_closure_helper_SYSV@local
 
 	# now r3 contains the return type
 	# so use it to look up in a table
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/libffi/src/powerpc/sysv.S gcc-plt/libffi/src/powerpc/sysv.S
--- gcc-virgin/libffi/src/powerpc/sysv.S	2004-09-03 22:42:23.000000000 +0930
+++ gcc-plt/libffi/src/powerpc/sysv.S	2005-05-31 12:58:32.000000000 +0930
@@ -60,7 +60,7 @@ ENTRY(ffi_call_SYSV)
 
 	/* Call ffi_prep_args_SYSV.  */
 	mr	%r4,%r1
-	bl	JUMPTARGET(ffi_prep_args_SYSV)
+	bl	ffi_prep_args_SYSV@local
 
 	/* Now do the call.  */
 	/* Set up cr1 with bits 4-7 of the flags.  */

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: powerpc new PLT and GOT
  2005-05-31 10:53     ` Alan Modra
@ 2005-05-31 13:42       ` Joseph S. Myers
  2005-05-31 15:03         ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Joseph S. Myers @ 2005-05-31 13:42 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

On Tue, 31 May 2005, Alan Modra wrote:

> 	* configure.ac: Add --enable-secureplt.

I don't see documentation in install.texi for this new configure option.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

* Re: powerpc new PLT and GOT
       [not found]           ` <amodra@bigpond.net.au>
                               ` (38 preceding siblings ...)
  2005-03-31  0:16             ` [RS6000] Fix PR20611, duplicate label for inlined function referencing TLS David Edelsohn
@ 2005-05-31 14:32             ` David Edelsohn
  2005-06-10  1:13             ` [PATCH] PowerPC SVR4 _mcount calls David Edelsohn
                               ` (21 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-05-31 14:32 UTC (permalink / raw)
  To: gcc-patches

gcc/
	* configure.ac: Add --enable-secureplt.
	(HAVE_AS_REL16): Test for R_PPC_REL16 relocs.
	* config.in: Regenerate.
	* configure: Regenerate.
	* config.gcc (powerpc64-*-linux*, powerpc-*-linux*): Add
	rs6000/secureplt.h to tm_file when enable_secureplt.
	* doc/invoke.texi (msecure-plt, mbss-plt): Document.
	* config/rs6000/secureplt.h: New file.
	* config/rs6000/sysv4.h (TARGET_SECURE_PLT): Define.
	(SUBTARGET_OVERRIDE_OPTIONS): Error if -msecure-plt given without
	assembler support.
	(CC1_SECURE_PLT_DEFAULT_SPEC): Define.
	(CC1_SPEC): Delete duplicate mno-sdata.  Invoke cc1_secure_plt_default.
	(SUBTARGET_EXTRA_SPECS): Add cc1_secure_plt_default.
	* config/rs6000/sysv4.opt (msecure-plt, mbss-plt): Add options.
	* config/rs6000/rs6000.h (TARGET_SECURE_PLT): Define.
	* config/rs6000/rs6000.c (rs6000_emit_load_toc_table): Handle
	TARGET_SECURE_PLT got register load sequence.
	(rs6000_emit_prologue): Call rs6000_emit_load_toc_table when
	TARGET_SECURE_PLT.
	(rs6000_elf_declare_function_name): Don't emit toc address offset
	word when TARGET_SECURE_PLT.
	* config/rs6000/rs6000.md (elf_high, elf_low): Move past load_toc_*.
	(load_toc_v4_PIC_1): Enable for TARGET_SECURE_PLT.
	(load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): New insns.
	(call, call_value): Mark pic_offset_table_rtx used for sysv pic and
	TARGET_SECURE_PLT.
	(call_nonlocal_sysv, call_value_nonlocal_sysv, sibcall_nonlocal_sysv,
	sibcall_value_nonlocal_sysv): Add 32768 offset when TARGET_SECURE_PLT
	and -fPIC.
	* config/rs6000/tramp.asm (trampoline_initial): Use "bcl 20,31".
	(__trampoline_setup): Likewise.  Init r30 before plt call.

libffi/
	* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Don't use JUMPTARGET
	to call ffi_closure_helper_SYSV.  Append @local instead.
	* src/powerpc/sysv.S (ffi_call_SYSV): Likewise for ffi_prep_args_SYSV.
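
The r30 setup added to tramp.asm uses an `addis ...@ha` / `addi ...@l` pair to add a 32-bit offset in two 16-bit pieces.  As a minimal sketch of that arithmetic (the function names `ha16`, `lo16` and `addis_addi` are made up for illustration and are not part of the patch): the low half is sign-extended by `addi`, so the high half is pre-adjusted by 0x8000 to compensate.

```c
#include <assert.h>
#include <stdint.h>

/* High-adjusted half, as selected by @ha: rounds up when the low
   half will be treated as negative by addi.  */
static uint16_t ha16(uint32_t offset)
{
    return (uint16_t)((offset + 0x8000u) >> 16);
}

/* Low half, as selected by @l, interpreted as a signed 16-bit value.  */
static int32_t lo16(uint32_t offset)
{
    int32_t lo = (int32_t)(offset & 0xffffu);
    return lo >= 0x8000 ? lo - 0x10000 : lo;
}

/* Model of "addis rD,rA,off@ha ; addi rD,rD,off@l" on 32-bit registers.  */
static uint32_t addis_addi(uint32_t base, uint32_t offset)
{
    uint32_t r = base + ((uint32_t)ha16(offset) << 16);  /* addis */
    r += (uint32_t)lo16(offset);                         /* addi, sign-extended */
    assert(r == base + offset);                          /* halves recombine exactly */
    return r;
}
```

For example, an offset of 0x12348000 splits into a high half of 0x1235 and a low half of -0x8000, which sum back to the original offset.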


Okay.

David

* Re: powerpc new PLT and GOT
  2005-05-31 13:42       ` Joseph S. Myers
@ 2005-05-31 15:03         ` Alan Modra
  2005-05-31 17:11           ` Joseph S. Myers
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-05-31 15:03 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: gcc-patches, David Edelsohn

On Tue, May 31, 2005 at 12:03:51PM +0000, Joseph S. Myers wrote:
> On Tue, 31 May 2005, Alan Modra wrote:
> 
> > 	* configure.ac: Add --enable-secureplt.
> 
> I don't see documentation in install.texi for this new configure option.

Does this pass muster?  I'm a little unsure as to the best place to put
this.

	* doc/install.texi: Document --enable-targets and --enable-secureplt.

Index: gcc/doc/install.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/install.texi,v
retrieving revision 1.351
diff -u -p -r1.351 install.texi
--- gcc/doc/install.texi	1 May 2005 13:39:37 -0000	1.351
+++ gcc/doc/install.texi	31 May 2005 14:51:38 -0000
@@ -1072,6 +1072,27 @@ do a @samp{make -C gcc gnatlib_and_tools
 Specify that the compiler should
 use DWARF 2 debugging information as the default.
 
+@item --enable-targets=all
+@itemx --enable-targets=@var{target_list}
+Some GCC targets, eg. powerpc64-linux, build bi-arch compilers.  These
+are compilers that are able to generate either 64-bit or 32-bit code.
+Typically, the corresponding 32-bit target, eg. powerpc-linux for
+powerpc64-linux, only generates 32-bit code.  This option enables the
+32-bit target to be a bi-arch compiler, which is useful when you want
+a bi-arch compiler that defaults to 32-bit, and you are building a
+bi-arch or multi-arch binutils in a combined tree.  Currently, this
+option only affects powerpc-linux.
+
+@item --enable-secureplt
+This option enables -msecure-plt by default for powerpc-linux.
+@ifnothtml
+@xref{RS/6000 and PowerPC Options,, RS/6000 and PowerPC Options, gcc,
+Using and Porting the GNU Compiler Collection (GCC)},
+@end ifnothtml
+@ifhtml
+See ``RS/6000 and PowerPC Options'' in the main manual
+@end ifhtml
+
 @item --enable-win32-registry
 @itemx --enable-win32-registry=@var{key}
 @itemx --disable-win32-registry

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: powerpc new PLT and GOT
  2005-05-31 15:03         ` Alan Modra
@ 2005-05-31 17:11           ` Joseph S. Myers
  2005-06-01  0:06             ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Joseph S. Myers @ 2005-05-31 17:11 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

On Wed, 1 Jun 2005, Alan Modra wrote:

> +@item --enable-targets=all
> +@itemx --enable-targets=@var{target_list}
> +Some GCC targets, eg. powerpc64-linux, build bi-arch compilers.  These

Should be "e.g.@:" (and again below).

> +This option enables -msecure-plt by default for powerpc-linux.

Should say @option{-msecure-plt}.

> +@ifnothtml
> +@xref{RS/6000 and PowerPC Options,, RS/6000 and PowerPC Options, gcc,
> +Using and Porting the GNU Compiler Collection (GCC)},

The "and Porting" hasn't been there for some time.

Looks OK with those changes, presuming it passes "make info", "make dvi" 
and the install.texi2html script (because of the way the HTML conversion 
with that script works, the install manual is rather fragile, hence it 
being desirable to test changes all three ways).

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

* Re: powerpc new PLT and GOT
  2005-05-31 17:11           ` Joseph S. Myers
@ 2005-06-01  0:06             ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-06-01  0:06 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: gcc-patches, David Edelsohn

On Tue, May 31, 2005 at 05:09:49PM +0000, Joseph S. Myers wrote:
> Looks OK with those changes, presuming it passes "make info", "make dvi" 
> and the install.texi2html script (because of the way the HTML conversion 
> with that script works, the install manual is rather fragile, hence it 
> being desirable to test changes all three ways).

Thanks.  This is what I'm about to install.

	* doc/install.texi: Document --enable-targets and --enable-secureplt.
	Correct xrefs to "Using the GNU Compiler Collection (GCC)".
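
One way to see which half of a bi-arch compiler produced a given object is to look at the pointer width it compiled for.  The following is only a sketch; `pointer_bits` and `abi_name` are names invented here, and it assumes the usual `-m32` / `-m64` switches are what select the mode on the target in question.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Width in bits of a data pointer for the mode this file was compiled in.  */
static unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Human-readable name for the mode; anything other than 64 is
   reported as 32-bit in this simplified sketch.  */
static const char *abi_name(void)
{
    return pointer_bits() == 64 ? "64-bit" : "32-bit";
}
```

Building the same file twice with a bi-arch compiler, once per mode, and printing `abi_name()` should show the two answers.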

Index: gcc/doc/install.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/install.texi,v
retrieving revision 1.351
diff -u -p -r1.351 install.texi
--- gcc/doc/install.texi	1 May 2005 13:39:37 -0000	1.351
+++ gcc/doc/install.texi	31 May 2005 23:53:56 -0000
@@ -1072,6 +1072,27 @@ do a @samp{make -C gcc gnatlib_and_tools
 Specify that the compiler should
 use DWARF 2 debugging information as the default.
 
+@item --enable-targets=all
+@itemx --enable-targets=@var{target_list}
+Some GCC targets, e.g.@: powerpc64-linux, build bi-arch compilers.
+These are compilers that are able to generate either 64-bit or 32-bit
+code.  Typically, the corresponding 32-bit target, e.g.@:
+powerpc-linux for powerpc64-linux, only generates 32-bit code.  This
+option enables the 32-bit target to be a bi-arch compiler, which is
+useful when you want a bi-arch compiler that defaults to 32-bit, and
+you are building a bi-arch or multi-arch binutils in a combined tree.
+Currently, this option only affects powerpc-linux.
+
+@item --enable-secureplt
+This option enables @option{-msecure-plt} by default for powerpc-linux.
+@ifnothtml
+@xref{RS/6000 and PowerPC Options,, RS/6000 and PowerPC Options, gcc,
+Using the GNU Compiler Collection (GCC)},
+@end ifnothtml
+@ifhtml
+See ``RS/6000 and PowerPC Options'' in the main manual
+@end ifhtml
+
 @item --enable-win32-registry
 @itemx --enable-win32-registry=@var{key}
 @itemx --disable-win32-registry
@@ -2465,7 +2486,7 @@ ARM-family processors.  These targets su
 ATMEL AVR-family micro controllers.  These are used in embedded
 applications.  There are no standard Unix configurations.
 @ifnothtml
-@xref{AVR Options,, AVR Options, gcc, Using and Porting the GNU Compiler
+@xref{AVR Options,, AVR Options, gcc, Using the GNU Compiler
 Collection (GCC)},
 @end ifnothtml
 @ifhtml
@@ -2503,8 +2524,8 @@ indicates that you should upgrade to a n
 
 The Blackfin processor, an Analog Devices DSP.
 @ifnothtml
-@xref{Blackfin Options,, Blackfin Options, gcc, Using and Porting the GNU
-Compiler Collection (GCC)},
+@xref{Blackfin Options,, Blackfin Options, gcc, Using the GNU Compiler
+Collection (GCC)},
 @end ifnothtml
 @ifhtml
 See ``Blackfin Options'' in the main manual
@@ -2522,8 +2543,8 @@ Texas Instruments TMS320C3x and TMS320C4
 Processors.  These are used in embedded applications.  There are no
 standard Unix configurations.
 @ifnothtml
-@xref{TMS320C3x/C4x Options,, TMS320C3x/C4x Options, gcc, Using and
-Porting the GNU Compiler Collection (GCC)},
+@xref{TMS320C3x/C4x Options,, TMS320C3x/C4x Options, gcc, Using the
+GNU Compiler Collection (GCC)},
 @end ifnothtml
 @ifhtml
 See ``TMS320C3x/C4x Options'' in the main manual
@@ -2552,7 +2573,7 @@ CRIS is the CPU architecture in Axis Com
 series.  These are used in embedded applications.
 
 @ifnothtml
-@xref{CRIS Options,, CRIS Options, gcc, Using and Porting the GNU Compiler
+@xref{CRIS Options,, CRIS Options, gcc, Using the GNU Compiler
 Collection (GCC)},
 @end ifnothtml
 @ifhtml

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* [PATCH] PowerPC SVR4 _mcount calls
@ 2005-06-10  1:09 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-06-10  1:09 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn, Steve Munroe

Linux has used glibc's _mcount forever, so there is no need for gcc to
allocate count vars for profiling.  I've also added a TARGET_SECURE_PLT
sequence to output_function_profiler in the event that some non-linux
svr4 ppc target wants to use it.
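
The order in which the ABI_V4 case of output_function_profiler picks a code sequence after this patch can be summarised as below.  This is a stand-alone model, not code from rs6000.c; the enum and the function `choose_profile_seq` are hypothetical names, and the three inputs stand in for NO_PROFILE_COUNTERS, TARGET_SECURE_PLT and flag_pic.

```c
#include <assert.h>
#include <stdbool.h>

enum profile_seq {
    SEQ_NO_COUNTERS,     /* only save LR; no counter label address is loaded */
    SEQ_SECURE_PLT_PIC,  /* "bcl 20,31,1f" then addis/la relative to 1b */
    SEQ_SMALL_PIC,       /* flag_pic == 1: load via _GLOBAL_OFFSET_TABLE_ */
    SEQ_BIG_PIC,         /* flag_pic > 1: "bcl 20,31,1f" plus an inline .long */
    SEQ_NON_PIC          /* absolute lis/la of the counter label */
};

/* Mirrors the if / else-if chain in the patch, first match wins.  */
static enum profile_seq choose_profile_seq(bool no_profile_counters,
                                           bool secure_plt, int flag_pic)
{
    if (no_profile_counters)
        return SEQ_NO_COUNTERS;
    if (secure_plt && flag_pic)
        return SEQ_SECURE_PLT_PIC;
    if (flag_pic == 1)
        return SEQ_SMALL_PIC;
    if (flag_pic > 1)
        return SEQ_BIG_PIC;
    return SEQ_NON_PIC;
}
```

With NO_PROFILE_COUNTERS defined as 1 for Linux, the first branch is always taken there, which is why the secure-PLT sequence only matters for other svr4 targets.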

	* config/rs6000/linux.h (NO_PROFILE_COUNTERS): Define.
	* config/rs6000/linux64.h (NO_PROFILE_COUNTERS): Define as 1.
	* config/rs6000/rs6000.c (output_function_profiler): Obey
	NO_PROFILE_COUNTERS.  Handle TARGET_SECURE_PLT.  Use "bcl 20,31"
	for -fPIC.  Delete save_lr and substitute its value into strings.

Tested powerpc-linux.  OK to install mainline?

diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/linux.h gcc-current/gcc/config/rs6000/linux.h
--- gcc-virgin/gcc/config/rs6000/linux.h	2004-12-02 11:43:18.000000000 +1030
+++ gcc-current/gcc/config/rs6000/linux.h	2005-06-09 17:41:39.000000000 +0930
@@ -28,6 +28,9 @@
    process.  */
 #define OS_MISSING_POWERPC64 1
 
+/* We use glibc _mcount for profiling.  */
+#define NO_PROFILE_COUNTERS 1
+
 /* glibc has float and long double forms of math functions.  */
 #undef  TARGET_C99_FUNCTIONS
 #define TARGET_C99_FUNCTIONS 1
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/linux64.h gcc-current/gcc/config/rs6000/linux64.h
--- gcc-virgin/gcc/config/rs6000/linux64.h	2005-05-06 23:34:42.000000000 +0930
+++ gcc-current/gcc/config/rs6000/linux64.h	2005-06-09 17:41:38.000000000 +0930
@@ -207,7 +210,7 @@ extern int dot_symbols;
 #endif
 
 /* We use glibc _mcount for profiling.  */
-#define NO_PROFILE_COUNTERS TARGET_64BIT
+#define NO_PROFILE_COUNTERS 1
 #define PROFILE_HOOK(LABEL) \
   do { if (TARGET_64BIT) output_profile_hook (LABEL); } while (0)
 
diff -urpN -xCVS -x'*~' -x'.#*' gcc-virgin/gcc/config/rs6000/rs6000.c gcc-current/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c	2005-06-09 19:00:01.000000000 +0930
+++ gcc-current/gcc/config/rs6000/rs6000.c	2005-06-09 19:48:16.000000000 +0930
@@ -15428,7 +15477,6 @@ void
 output_function_profiler (FILE *file, int labelno)
 {
   char buf[100];
-  int save_lr = 8;
 
   switch (DEFAULT_ABI)
     {
@@ -15436,7 +15484,6 @@ output_function_profiler (FILE *file, in
       gcc_unreachable ();
 
     case ABI_V4:
-      save_lr = 4;
       if (!TARGET_32BIT)
 	{
 	  warning (0, "no profiling of 64-bit code for this ABI");
@@ -15444,11 +15491,28 @@ output_function_profiler (FILE *file, in
 	}
       ASM_GENERATE_INTERNAL_LABEL (buf, "LP", labelno);
       fprintf (file, "\tmflr %s\n", reg_names[0]);
-      if (flag_pic == 1)
+      if (NO_PROFILE_COUNTERS)
+	{
+	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
+		       reg_names[0], reg_names[1]);
+	}
+      else if (TARGET_SECURE_PLT && flag_pic)
+	{
+	  asm_fprintf (file, "\tbcl 20,31,1f\n1:\n\t{st|stw} %s,4(%s)\n",
+		       reg_names[0], reg_names[1]);
+	  asm_fprintf (file, "\tmflr %s\n", reg_names[12]);
+	  asm_fprintf (file, "\t{cau|addis} %s,%s,",
+		       reg_names[12], reg_names[12]);
+	  assemble_name (file, buf);
+	  asm_fprintf (file, "-1b@ha\n\t{cal|la} %s,", reg_names[0]);
+	  assemble_name (file, buf);
+	  asm_fprintf (file, "-1b@l(%s)\n", reg_names[12]);
+	}
+      else if (flag_pic == 1)
 	{
 	  fputs ("\tbl _GLOBAL_OFFSET_TABLE_@local-4\n", file);
-	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
-		       reg_names[0], save_lr, reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
+		       reg_names[0], reg_names[1]);
 	  asm_fprintf (file, "\tmflr %s\n", reg_names[12]);
 	  asm_fprintf (file, "\t{l|lwz} %s,", reg_names[0]);
 	  assemble_name (file, buf);
@@ -15456,10 +15520,10 @@ output_function_profiler (FILE *file, in
 	}
       else if (flag_pic > 1)
 	{
-	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
-		       reg_names[0], save_lr, reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
+		       reg_names[0], reg_names[1]);
 	  /* Now, we need to get the address of the label.  */
-	  fputs ("\tbl 1f\n\t.long ", file);
+	  fputs ("\tbcl 20,31,1f\n\t.long ", file);
 	  assemble_name (file, buf);
 	  fputs ("-.\n1:", file);
 	  asm_fprintf (file, "\tmflr %s\n", reg_names[11]);
@@ -15473,8 +15537,8 @@ output_function_profiler (FILE *file, in
 	  asm_fprintf (file, "\t{liu|lis} %s,", reg_names[12]);
 	  assemble_name (file, buf);
 	  fputs ("@ha\n", file);
-	  asm_fprintf (file, "\t{st|stw} %s,%d(%s)\n",
-		       reg_names[0], save_lr, reg_names[1]);
+	  asm_fprintf (file, "\t{st|stw} %s,4(%s)\n",
+		       reg_names[0], reg_names[1]);
 	  asm_fprintf (file, "\t{cal|la} %s,", reg_names[0]);
 	  assemble_name (file, buf);
 	  asm_fprintf (file, "@l(%s)\n", reg_names[12]);

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

* Re: [PATCH] PowerPC SVR4 _mcount calls
       [not found]           ` <amodra@bigpond.net.au>
                               ` (39 preceding siblings ...)
  2005-05-31 14:32             ` powerpc new PLT and GOT David Edelsohn
@ 2005-06-10  1:13             ` David Edelsohn
  2005-09-12  3:15             ` [RS6000] Nop-insertion fix David Edelsohn
                               ` (20 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-06-10  1:13 UTC (permalink / raw)
  To: gcc-patches, Steve Munroe

	* config/rs6000/linux.h (NO_PROFILE_COUNTERS): Define.
	* config/rs6000/linux64.h (NO_PROFILE_COUNTERS): Define as 1.
	* config/rs6000/rs6000.c (output_function_profiler): Obey
	NO_PROFILE_COUNTERS.  Handle TARGET_SECURE_PLT.  Use "bcl 20,31"
	for -fPIC.  Delete save_lr and substitute its value into strings.

Tested powerpc-linux.  OK to install mainline?

Okay.

Thanks, David

* [PATCH, committed] PPC405 atomic support (PR target/21760)
@ 2005-06-23 13:33 David Edelsohn
  2005-06-23 13:51 ` Andreas Schwab
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-06-23 13:33 UTC (permalink / raw)
  To: gcc-patches

	This patch adds support for the PPC405CR atomic operation erratum.
The atomic operation correction needs to be applied to all GCC runtime
libraries, but a PPC405 multilib is not created.  Therefore this feature
only is enabled when GCC is configured using --with-cpu=405.
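
The two-level gating described above (configured default CPU first, then the CPU actually selected for the compilation) can be modelled roughly as follows.  This is a run-time sketch for illustration only: the patch itself makes the first test in the preprocessor, and `erratum77_active` and the reduced `enum processor` are names invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Reduced stand-in for the rs6000 processor enumeration.  */
enum processor { PROCESSOR_PPC405, PROCESSOR_OTHER };

/* The workaround is considered only when the compiler was configured
   with a default CPU of "405", and then applies only while code is
   being generated for the PPC405.  */
static bool erratum77_active(const char *configured_cpu,
                             enum processor selected)
{
    if (configured_cpu == NULL || strcmp(configured_cpu, "405") != 0)
        return false;   /* not a --with-cpu=405 build: always off */
    return selected == PROCESSOR_PPC405;
}
```

Under this model, a compiler built without --with-cpu=405 never applies the workaround, even when asked to generate code for the 405.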

Bootstrapped and regression tested on powerpc-ibm-aix5.2.0.0

David


	PR target/21760
	* config/rs6000/rs6000.h (PPC405_ERRATUM77): New.
	* config/rs6000/rs6000.md: Move atomic instructions to ...
	* config/rs6000/sync.md: Here.
	Change sync_compare_and_swap<mode> to define_expand.  All stwcx
	patterns test PPC405_ERRATUM77.
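
For readers less familiar with the sync_* pattern names: to the best of my understanding they back GCC's `__sync_*` builtins, with sync_compare_and_swap behind `__sync_val_compare_and_swap`, sync_old_add behind `__sync_fetch_and_add` and sync_new_add behind `__sync_add_and_fetch`.  A small sketch of the user-level view follows; `atomic_max` and `old_new_demo` are hypothetical helpers written for this note, not part of the patch.

```c
#include <assert.h>

/* Atomically raise *p to v if v is larger.  Returns the value that
   was in *p before the update (or the current value if no update
   was needed).  */
static int atomic_max(int *p, int v)
{
    int old = *p;
    while (old < v) {
        /* Returns the value actually found in *p; the store happened
           only if that value equals 'old'.  */
        int seen = __sync_val_compare_and_swap(p, old, v);
        if (seen == old)
            break;      /* swap succeeded */
        old = seen;     /* someone else changed *p: retry with fresh value */
    }
    return old;
}

/* "old" variants return the value before the operation,
   "new" variants the value after it.  */
static void old_new_demo(void)
{
    int x = 5;
    assert(__sync_fetch_and_add(&x, 3) == 5);   /* x is now 8 */
    assert(__sync_add_and_fetch(&x, 3) == 11);  /* x is now 11 */
}
```

On PowerPC the retry loop inside each builtin is what the lwarx / stwcx. / bne- sequences in the patterns implement.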

Index: rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.370
diff -c -p -r1.370 rs6000.h
*** rs6000.h	12 Jun 2005 03:43:11 -0000	1.370
--- rs6000.h	21 Jun 2005 14:50:49 -0000
***************
*** 49,54 ****
--- 49,62 ----
  #define TARGET_CPU_DEFAULT ((char *)0)
  #endif
  
+ /* If configured for PPC405, support PPC405CR Erratum77.  */
+ #define PPC405_CPU_DEFAULT ("405")
+ #if #TARGET_CPU_DEFAULT == #PPC405_CPU_DEFAULT
+ #define PPC405_ERRATUM77 (rs6000_cpu == PROCESSOR_PPC405)
+ #else
+ #define PPC405_ERRATUM77 0
+ #endif
+ 
  /* Common ASM definitions used by ASM_SPEC among the various targets
     for handling -mcpu=xxx switches.  */
  #define ASM_CPU_SPEC \
*** /dev/null	Tue Jun 21 10:51:03 2005
--- sync.md	Mon Jun 20 11:08:00 2005
***************
*** 0 ****
--- 1,489 ----
+ ;; Machine description for PowerPC synchronization instructions.
+ ;; Copyright (C) 2005 Free Software Foundation, Inc.
+ ;; Contributed by Geoffrey Keating.
+ 
+ ;; This file is part of GCC.
+ 
+ ;; GCC is free software; you can redistribute it and/or modify it
+ ;; under the terms of the GNU General Public License as published
+ ;; by the Free Software Foundation; either version 2, or (at your
+ ;; option) any later version.
+ 
+ ;; GCC is distributed in the hope that it will be useful, but WITHOUT
+ ;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ ;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+ ;; License for more details.
+ 
+ ;; You should have received a copy of the GNU General Public License
+ ;; along with GCC; see the file COPYING.  If not, write to the
+ ;; Free Software Foundation, 59 Temple Place - Suite 330, Boston,
+ ;; MA 02111-1307, USA.
+ 
+ (define_mode_attr larx [(SI "lwarx") (DI "ldarx")])
+ (define_mode_attr stcx [(SI "stwcx.") (DI "stdcx.")])
+ 
+ (define_insn "memory_barrier"
+   [(set (mem:BLK (match_scratch 0 "X"))
+ 	(unspec:BLK [(mem:BLK (match_scratch 1 "X"))] UNSPEC_SYNC))]
+   ""
+   "{ics|sync}")
+ 
+ (define_expand "sync_compare_and_swap<mode>"
+   [(parallel [(set (match_operand:GPR 1 "memory_operand" "")
+ 		   (unspec:GPR [(match_dup 1)
+ 				(match_operand:GPR 2 "reg_or_short_operand" "")
+ 				(match_operand:GPR 3 "gpc_reg_operand" "")]
+ 			       UNSPEC_SYNC_SWAP))
+ 	      (set (match_operand:GPR 0 "gpc_reg_operand" "") (match_dup 1))
+ 	      (set (mem:BLK (match_scratch 5 ""))
+ 		   (unspec:BLK [(mem:BLK (match_scratch 6 ""))] UNSPEC_SYNC))
+ 	      (clobber (match_scratch:CC 4 ""))])]
+   "TARGET_POWERPC")
+ 
+ (define_insn "sync_compare_and_swap<mode>_internal"
+   [(set (match_operand:GPR 1 "memory_operand" "+Z")
+ 	(unspec:GPR [(match_dup 1)
+ 		     (match_operand:GPR 2 "reg_or_short_operand" "rI")
+ 		     (match_operand:GPR 3 "gpc_reg_operand" "r")]
+ 		    UNSPEC_SYNC_SWAP))
+    (set (match_operand:GPR 0 "gpc_reg_operand" "=&r") (match_dup 1))
+    (set (mem:BLK (match_scratch 5 "X"))
+ 	(unspec:BLK [(mem:BLK (match_scratch 6 "X"))] UNSPEC_SYNC))
+    (clobber (match_scratch:CC 4 "=&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "sync\n\t<larx> %0,%y1\n\tcmp<wd>%I2 %0,%2\n\tbne- $+12\n\t<stcx> %3,%y1\n\tbne- $-16\n\tisync"
+   [(set_attr "length" "28")])
+ 
+ (define_insn "sync_compare_and_swap<mode>_ppc405"
+   [(set (match_operand:GPR 1 "memory_operand" "+Z")
+ 	(unspec:GPR [(match_dup 1)
+ 		     (match_operand:GPR 2 "reg_or_short_operand" "rI")
+ 		     (match_operand:GPR 3 "gpc_reg_operand" "r")]
+ 		    UNSPEC_SYNC_SWAP))
+    (set (match_operand:GPR 0 "gpc_reg_operand" "=&r") (match_dup 1))
+    (set (mem:BLK (match_scratch 5 "X"))
+ 	(unspec:BLK [(mem:BLK (match_scratch 6 "X"))] UNSPEC_SYNC))
+    (clobber (match_scratch:CC 4 "=&x"))]
+   "TARGET_POWERPC && PPC405_ERRATUM77"
+   "sync\n\t<larx> %0,%y1\n\tcmp<wd>%I2 %0,%2\n\tbne- $+12\n\tsync\n\t<stcx> %3,%y1\n\tbne- $-16\n\tisync"
+   [(set_attr "length" "32")])
+ 
+ (define_expand "sync_add<mode>"
+   [(use (match_operand:INT1 0 "memory_operand" ""))
+    (use (match_operand:INT1 1 "add_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (PLUS, <MODE>mode, operands[0], operands[1], 
+ 		    NULL_RTX, NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_sub<mode>"
+   [(use (match_operand:GPR 0 "memory_operand" ""))
+    (use (match_operand:GPR 1 "gpc_reg_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (MINUS, <MODE>mode, operands[0], operands[1], 
+ 		    NULL_RTX, NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_ior<mode>"
+   [(use (match_operand:INT1 0 "memory_operand" ""))
+    (use (match_operand:INT1 1 "logical_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (IOR, <MODE>mode, operands[0], operands[1], 
+ 		    NULL_RTX, NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_and<mode>"
+   [(use (match_operand:INT1 0 "memory_operand" ""))
+    (use (match_operand:INT1 1 "and_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (AND, <MODE>mode, operands[0], operands[1], 
+ 		    NULL_RTX, NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_xor<mode>"
+   [(use (match_operand:INT1 0 "memory_operand" ""))
+    (use (match_operand:INT1 1 "logical_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (XOR, <MODE>mode, operands[0], operands[1], 
+ 		    NULL_RTX, NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_nand<mode>"
+   [(use (match_operand:INT1 0 "memory_operand" ""))
+    (use (match_operand:INT1 1 "gpc_reg_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (AND, <MODE>mode, 
+ 		    gen_rtx_NOT (<MODE>mode, operands[0]),
+ 		    operands[1],
+ 		    NULL_RTX, NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_old_add<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "add_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (PLUS, <MODE>mode, operands[1], operands[2], 
+ 		    operands[0], NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_old_sub<mode>"
+   [(use (match_operand:GPR 0 "gpc_reg_operand" ""))
+    (use (match_operand:GPR 1 "memory_operand" ""))
+    (use (match_operand:GPR 2 "gpc_reg_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (MINUS, <MODE>mode, operands[1], operands[2], 
+ 		    operands[0], NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_old_ior<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "logical_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (IOR, <MODE>mode, operands[1], operands[2], 
+ 		    operands[0], NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_old_and<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "and_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (AND, <MODE>mode, operands[1], operands[2], 
+ 		    operands[0], NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_old_xor<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "logical_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (XOR, <MODE>mode, operands[1], operands[2], 
+ 		    operands[0], NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_old_nand<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "gpc_reg_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (AND, <MODE>mode, 
+ 		    gen_rtx_NOT (<MODE>mode, operands[1]),
+ 		    operands[2],
+ 		    operands[0], NULL_RTX, true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_new_add<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "add_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (PLUS, <MODE>mode, operands[1], operands[2], 
+ 		    NULL_RTX, operands[0], true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_new_sub<mode>"
+   [(use (match_operand:GPR 0 "gpc_reg_operand" ""))
+    (use (match_operand:GPR 1 "memory_operand" ""))
+    (use (match_operand:GPR 2 "gpc_reg_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (MINUS, <MODE>mode, operands[1], operands[2], 
+ 		    NULL_RTX, operands[0], true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_new_ior<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "logical_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (IOR, <MODE>mode, operands[1], operands[2], 
+ 		    NULL_RTX, operands[0], true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_new_and<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "and_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (AND, <MODE>mode, operands[1], operands[2], 
+ 		    NULL_RTX, operands[0], true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_new_xor<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "logical_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (XOR, <MODE>mode, operands[1], operands[2], 
+ 		    NULL_RTX, operands[0], true);
+   DONE;
+ }")
+ 
+ (define_expand "sync_new_nand<mode>"
+   [(use (match_operand:INT1 0 "gpc_reg_operand" ""))
+    (use (match_operand:INT1 1 "memory_operand" ""))
+    (use (match_operand:INT1 2 "gpc_reg_operand" ""))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "
+ {
+   rs6000_emit_sync (AND, <MODE>mode, 
+ 		    gen_rtx_NOT (<MODE>mode, operands[1]),
+ 		    operands[2],
+ 		    NULL_RTX, operands[0], true);
+   DONE;
+ }")
+ 
+ ; the sync_*_internal patterns all have these operands:
+ ; 0 - memory location
+ ; 1 - operand
+ ; 2 - value in memory after operation
+ ; 3 - value in memory immediately before operation
+ 
+ (define_insn "*sync_add<mode>_internal"
+   [(set (match_operand:GPR 2 "gpc_reg_operand" "=&r,&r")
+ 	(plus:GPR (match_operand:GPR 0 "memory_operand" "+Z,Z")
+ 		 (match_operand:GPR 1 "add_operand" "rI,L")))
+    (set (match_operand:GPR 3 "gpc_reg_operand" "=&b,&b") (match_dup 0))
+    (set (match_dup 0) 
+ 	(unspec:GPR [(plus:GPR (match_dup 0) (match_dup 1))]
+ 		   UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 4 "=&x,&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "@
+    <larx> %3,%y0\n\tadd%I1 %2,%3,%1\n\t<stcx> %2,%y0\n\tbne- $-12
+    <larx> %3,%y0\n\taddis %2,%3,%v1\n\t<stcx> %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16,16")])
+ 
+ (define_insn "*sync_addshort_internal"
+   [(set (match_operand:SI 2 "gpc_reg_operand" "=&r")
+ 	(ior:SI (and:SI (plus:SI (match_operand:SI 0 "memory_operand" "+Z")
+ 				 (match_operand:SI 1 "add_operand" "rI"))
+ 			(match_operand:SI 4 "gpc_reg_operand" "r"))
+ 		(and:SI (not:SI (match_dup 4)) (match_dup 0))))
+    (set (match_operand:SI 3 "gpc_reg_operand" "=&b") (match_dup 0))
+    (set (match_dup 0) 
+ 	(unspec:SI [(ior:SI (and:SI (plus:SI (match_dup 0) (match_dup 1))
+ 				    (match_dup 4))
+ 			    (and:SI (not:SI (match_dup 4)) (match_dup 0)))]
+ 		   UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 5 "=&x"))
+    (clobber (match_scratch:SI 6 "=&r"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "lwarx %3,%y0\n\tadd%I1 %2,%3,%1\n\tandc %6,%3,%4\n\tand %2,%2,%4\n\tor %2,%2,%6\n\tstwcx. %2,%y0\n\tbne- $-24"
+   [(set_attr "length" "28")])
+ 
+ (define_insn "*sync_sub<mode>_internal"
+   [(set (match_operand:GPR 2 "gpc_reg_operand" "=&r")
+ 	(minus:GPR (match_operand:GPR 0 "memory_operand" "+Z")
+ 		  (match_operand:GPR 1 "gpc_reg_operand" "r")))
+    (set (match_operand:GPR 3 "gpc_reg_operand" "=&b") (match_dup 0))
+    (set (match_dup 0) 
+ 	(unspec:GPR [(minus:GPR (match_dup 0) (match_dup 1))]
+ 		   UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 4 "=&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "<larx> %3,%y0\n\tsubf %2,%1,%3\n\t<stcx> %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16")])
+ 
+ (define_insn "*sync_andsi_internal"
+   [(set (match_operand:SI 2 "gpc_reg_operand" "=&r,&r,&r,&r")
+ 	(and:SI (match_operand:SI 0 "memory_operand" "+Z,Z,Z,Z")
+ 		(match_operand:SI 1 "and_operand" "r,T,K,L")))
+    (set (match_operand:SI 3 "gpc_reg_operand" "=&b,&b,&b,&b") (match_dup 0))
+    (set (match_dup 0) 
+ 	(unspec:SI [(and:SI (match_dup 0) (match_dup 1))]
+ 		   UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 4 "=&x,&x,&x,&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "@
+    lwarx %3,%y0\n\tand %2,%3,%1\n\tstwcx. %2,%y0\n\tbne- $-12
+    lwarx %3,%y0\n\trlwinm %2,%3,0,%m1,%M1\n\tstwcx. %2,%y0\n\tbne- $-12
+    lwarx %3,%y0\n\tandi. %2,%3,%b1\n\tstwcx. %2,%y0\n\tbne- $-12
+    lwarx %3,%y0\n\tandis. %2,%3,%u1\n\tstwcx. %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16,16,16,16")])
+ 
+ (define_insn "*sync_anddi_internal"
+   [(set (match_operand:DI 2 "gpc_reg_operand" "=&r,&r,&r,&r,&r")
+ 	(and:DI (match_operand:DI 0 "memory_operand" "+Z,Z,Z,Z,Z")
+ 		(match_operand:DI 1 "and_operand" "r,S,T,K,J")))
+    (set (match_operand:DI 3 "gpc_reg_operand" "=&b,&b,&b,&b,&b") (match_dup 0))
+    (set (match_dup 0) 
+ 	(unspec:DI [(and:DI (match_dup 0) (match_dup 1))]
+ 		   UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 4 "=&x,&x,&x,&x,&x"))]
+   "TARGET_POWERPC64"
+   "@
+    ldarx %3,%y0\n\tand %2,%3,%1\n\tstdcx. %2,%y0\n\tbne- $-12
+    ldarx %3,%y0\n\trldic%B1 %2,%3,0,%S1\n\tstdcx. %2,%y0\n\tbne- $-12
+    ldarx %3,%y0\n\trlwinm %2,%3,0,%m1,%M1\n\tstdcx. %2,%y0\n\tbne- $-12
+    ldarx %3,%y0\n\tandi. %2,%3,%b1\n\tstdcx. %2,%y0\n\tbne- $-12
+    ldarx %3,%y0\n\tandis. %2,%3,%b1\n\tstdcx. %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16,16,16,16,16")])
+ 
+ (define_insn "*sync_boolsi_internal"
+   [(set (match_operand:SI 2 "gpc_reg_operand" "=&r,&r,&r")
+ 	(match_operator:SI 4 "boolean_or_operator"
+ 	 [(match_operand:SI 0 "memory_operand" "+Z,Z,Z")
+ 	  (match_operand:SI 1 "logical_operand" "r,K,L")]))
+    (set (match_operand:SI 3 "gpc_reg_operand" "=&b,&b,&b") (match_dup 0))
+    (set (match_dup 0) (unspec:SI [(match_dup 4)] UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 5 "=&x,&x,&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "@
+    lwarx %3,%y0\n\t%q4 %2,%3,%1\n\tstwcx. %2,%y0\n\tbne- $-12
+    lwarx %3,%y0\n\t%q4i %2,%3,%b1\n\tstwcx. %2,%y0\n\tbne- $-12
+    lwarx %3,%y0\n\t%q4is %2,%3,%u1\n\tstwcx. %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16,16,16")])
+ 
+ (define_insn "*sync_booldi_internal"
+   [(set (match_operand:DI 2 "gpc_reg_operand" "=&r,&r,&r")
+ 	(match_operator:DI 4 "boolean_or_operator"
+ 	 [(match_operand:DI 0 "memory_operand" "+Z,Z,Z")
+ 	  (match_operand:DI 1 "logical_operand" "r,K,JF")]))
+    (set (match_operand:DI 3 "gpc_reg_operand" "=&b,&b,&b") (match_dup 0))
+    (set (match_dup 0) (unspec:DI [(match_dup 4)] UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 5 "=&x,&x,&x"))]
+   "TARGET_POWERPC64"
+   "@
+    ldarx %3,%y0\n\t%q4 %2,%3,%1\n\tstdcx. %2,%y0\n\tbne- $-12
+    ldarx %3,%y0\n\t%q4i %2,%3,%b1\n\tstdcx. %2,%y0\n\tbne- $-12
+    ldarx %3,%y0\n\t%q4is %2,%3,%u1\n\tstdcx. %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16,16,16")])
+ 
+ (define_insn "*sync_boolc<mode>_internal"
+   [(set (match_operand:GPR 2 "gpc_reg_operand" "=&r")
+ 	(match_operator:GPR 4 "boolean_operator"
+ 	 [(not:GPR (match_operand:GPR 0 "memory_operand" "+Z"))
+ 	  (match_operand:GPR 1 "gpc_reg_operand" "r")]))
+    (set (match_operand:GPR 3 "gpc_reg_operand" "=&b") (match_dup 0))
+    (set (match_dup 0) (unspec:GPR [(match_dup 4)] UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 5 "=&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "<larx> %3,%y0\n\t%q4 %2,%1,%3\n\t<stcx> %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16")])
+ 
+ ; This pattern could also take immediate values of operand 1,
+ ; since the non-NOT version of the operator is used; but this is not
+ ; very useful, since in practice operand 1 is a full 32-bit value.
+ ; Likewise, operand 5 is in practice either <= 2^16 or it is a register.
+ (define_insn "*sync_boolcshort_internal"
+   [(set (match_operand:SI 2 "gpc_reg_operand" "=&r")
+ 	(match_operator:SI 4 "boolean_operator"
+ 	 [(xor:SI (match_operand:SI 0 "memory_operand" "+Z")
+ 		  (match_operand:SI 5 "logical_operand" "rK"))
+ 	  (match_operand:SI 1 "gpc_reg_operand" "r")]))
+    (set (match_operand:SI 3 "gpc_reg_operand" "=&b") (match_dup 0))
+    (set (match_dup 0) (unspec:SI [(match_dup 4)] UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 6 "=&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "lwarx %3,%y0\n\txor%I2 %2,%3,%5\n\t%q4 %2,%2,%1\n\tstwcx. %2,%y0\n\tbne- $-16"
+   [(set_attr "length" "20")])
+ 
+ (define_insn "*sync_boolc<mode>_internal2"
+   [(set (match_operand:GPR 2 "gpc_reg_operand" "=&r")
+ 	(match_operator:GPR 4 "boolean_operator"
+ 	 [(not:GPR (match_operand:GPR 1 "gpc_reg_operand" "r"))
+ 	  (match_operand:GPR 0 "memory_operand" "+Z")]))
+    (set (match_operand:GPR 3 "gpc_reg_operand" "=&b") (match_dup 0))
+    (set (match_dup 0) (unspec:GPR [(match_dup 4)] UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 5 "=&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "<larx> %3,%y0\n\t%q4 %2,%3,%1\n\t<stcx> %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16")])
+ 
+ (define_insn "*sync_boolcc<mode>_internal"
+   [(set (match_operand:GPR 2 "gpc_reg_operand" "=&r")
+ 	(match_operator:GPR 4 "boolean_operator"
+ 	 [(not:GPR (match_operand:GPR 0 "memory_operand" "+Z"))
+ 	  (not:GPR (match_operand:GPR 1 "gpc_reg_operand" "r"))]))
+    (set (match_operand:GPR 3 "gpc_reg_operand" "=&b") (match_dup 0))
+    (set (match_dup 0) (unspec:GPR [(match_dup 4)] UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 5 "=&x"))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "<larx> %3,%y0\n\t%q4 %2,%1,%3\n\t<stcx> %2,%y0\n\tbne- $-12"
+   [(set_attr "length" "16")])
+ 
+ (define_insn "isync"
+   [(set (mem:BLK (match_scratch 0 "X"))
+         (unspec:BLK [(mem:BLK (match_scratch 1 "X"))] UNSPEC_ISYNC))]
+   "TARGET_POWERPC"
+   "isync")
+ 
+ (define_insn "sync_lock_test_and_set<mode>"
+   [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r")
+ 	(match_operand:GPR 1 "memory_operand" "+Z"))
+    (set (match_dup 1) (unspec:GPR [(match_operand:GPR 2 "gpc_reg_operand" "r")] 
+ 				 UNSPEC_SYNC_OP))
+    (clobber (match_scratch:CC 3 "=&x"))
+    (set (mem:BLK (match_scratch 4 "X"))
+         (unspec:BLK [(mem:BLK (match_scratch 5 "X"))] UNSPEC_ISYNC))]
+   "TARGET_POWERPC && !PPC405_ERRATUM77"
+   "<larx> %0,%y1\n\t<stcx> %2,%y1\n\tbne- $-8\n\tisync"
+   [(set_attr "length" "16")])
+ 
+ (define_expand "sync_lock_release<mode>"
+   [(set (match_operand:INT 0 "memory_operand")
+ 	(match_operand:INT 1 "any_operand"))]
+   ""
+   "
+ {
+   emit_insn (gen_lwsync ());
+   emit_move_insn (operands[0], operands[1]);
+   DONE;
+ }")
+ 
+ ; Some AIX assemblers don't accept lwsync, so we use a .long.
+ (define_insn "lwsync"
+   [(set (mem:BLK (match_scratch 0 "X"))
+         (unspec:BLK [(mem:BLK (match_scratch 1 "X"))] UNSPEC_LWSYNC))]
+   ""
+   ".long 0x7c2004ac")
+ 
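
For readers less familiar with load-and-reserve/store-conditional sequences, the retry behaviour of the `*sync_and*_internal` patterns above can be approximated in portable C11. This is only a sketch: `fetch_and_u32` is a hypothetical helper, not part of the patch, and a compare-exchange loop stands in for the lwarx/stwcx. pair, which it only resembles.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: atomically AND MASK into
   *ADDR and return the value that was stored there beforehand.  */
static uint32_t
fetch_and_u32 (_Atomic uint32_t *addr, uint32_t mask)
{
  uint32_t old = atomic_load_explicit (addr, memory_order_relaxed);

  /* On failure the compare-exchange refreshes OLD with the current
     contents, so the loop retries with a fresh value, much as
     "bne- $-12" branches back to the load-and-reserve.  */
  while (!atomic_compare_exchange_weak_explicit (addr, &old, old & mask,
                                                 memory_order_relaxed,
                                                 memory_order_relaxed))
    ;
  return old;
}
```

In the patterns themselves operand 3 receives the old value and operand 2 the new one; the sketch returns only the old value.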

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-23 13:33 [PATCH, committed] PPC405 atomic support (PR target/21760) David Edelsohn
@ 2005-06-23 13:51 ` Andreas Schwab
  2005-06-23 19:32   ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Andreas Schwab @ 2005-06-23 13:51 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> + /* If configured for PPC405, support PPC405CR Erratum77.  */
> + #define PPC405_CPU_DEFAULT ("405")
> + #if #TARGET_CPU_DEFAULT == #PPC405_CPU_DEFAULT

What is this expression supposed to do?  It doesn't match anything defined
by C89.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-23 13:51 ` Andreas Schwab
@ 2005-06-23 19:32   ` David Edelsohn
  2005-06-24  2:07     ` Geoffrey Keating
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-06-23 19:32 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: gcc-patches

>>>>> Andreas Schwab writes:

>> + /* If configured for PPC405, support PPC405CR Erratum77.  */
>> + #define PPC405_CPU_DEFAULT ("405")
>> + #if #TARGET_CPU_DEFAULT == #PPC405_CPU_DEFAULT

Andreas> What is this expression supposed to do?  It doesn't match
Andreas> anything defined by C89.

	The expression is comparing the stringified value of the macros.
When --with-cpu is not used, TARGET_CPU_DEFAULT is defined as

#define TARGET_CPU_DEFAULT ((char *)0)

and CPP produces an error comparing something with that value.

("405") is the value to which TARGET_CPU_DEFAULT is defined in tm.h when
--with-cpu=405 is used.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-23 19:32   ` David Edelsohn
@ 2005-06-24  2:07     ` Geoffrey Keating
  2005-06-24  2:35       ` Andrew Pinski
                         ` (2 more replies)
  0 siblings, 3 replies; 875+ messages in thread
From: Geoffrey Keating @ 2005-06-24  2:07 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn <dje@watson.ibm.com> writes:

> >>>>> Andreas Schwab writes:
> 
> >> + /* If configured for PPC405, support PPC405CR Erratum77.  */
> >> + #define PPC405_CPU_DEFAULT ("405")
> >> + #if #TARGET_CPU_DEFAULT == #PPC405_CPU_DEFAULT
> 
> Andreas> What is this expression supposed to do?  It doesn't match
> Andreas> anything defined by C89.
> 
> 	The expression is comparing the stringified value of the macros.

I'm pretty sure that doesn't work, and in fact should be an error;
even C99 says that "The expression that controls conditional inclusion
shall be an integer constant expression" (and I think the "shall" in
this location requires a diagnostic).  I filed

<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22168>

to track it.

I think you might be able to get away with

#define ppc405_is_default (strcmp(TARGET_CPU_DEFAULT, "405") == 0)

with no cost at runtime, since gcc will fold the strcmp.
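
A minimal sketch of that idea follows, assuming TARGET_CPU_DEFAULT is either a parenthesized string literal or ((char *) 0), as described earlier in the thread. `cpu_default_is` is a hypothetical helper added so that the null-pointer case does not reach strcmp; the local definition of TARGET_CPU_DEFAULT exists only so the fragment stands alone.

```c
#include <string.h>

/* Assumption for this sketch only: in a real build tm.h supplies
   TARGET_CPU_DEFAULT; define it here so the example is self-contained.  */
#ifndef TARGET_CPU_DEFAULT
#define TARGET_CPU_DEFAULT ("405")
#endif

/* Hypothetical helper: compare the configured default with NAME,
   treating a null default (no --with-cpu) as "no match".  */
static int
cpu_default_is (const char *dflt, const char *name)
{
  return dflt != NULL && strcmp (dflt, name) == 0;
}

#define ppc405_is_default cpu_default_is (TARGET_CPU_DEFAULT, "405")
```

Unlike the original #if, this is an ordinary C expression, so it can appear in a condition such as PPC405_ERRATUM77 but not in a preprocessor directive.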

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-24  2:07     ` Geoffrey Keating
@ 2005-06-24  2:35       ` Andrew Pinski
  2005-06-24  3:30         ` Daniel Jacobowitz
  2005-06-24 18:41       ` Zack Weinberg
       [not found]       ` <geoffk@geoffk.org>
  2 siblings, 1 reply; 875+ messages in thread
From: Andrew Pinski @ 2005-06-24  2:35 UTC (permalink / raw)
  To: Geoffrey Keating; +Cc: gcc-patches, David Edelsohn

On Jun 23, 2005, at 10:03 PM, Geoffrey Keating wrote:

> David Edelsohn <dje@watson.ibm.com> writes:
>
> I think you might be able to get away with
>
> #define ppc405_is_default (strcmp(TARGET_CPU_DEFAULT, "405") == 0)

I suggested a different way to fix this to David via IRC while he was
writing this code: change config.gcc to add a define when --with-cpu=405
is supplied, which in turn enables the check.  David rejected it, saying
that it was going to be too hard to maintain.

-- Pinski

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-24  2:35       ` Andrew Pinski
@ 2005-06-24  3:30         ` Daniel Jacobowitz
  2005-06-24 14:16           ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-06-24  3:30 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Geoffrey Keating, gcc-patches, David Edelsohn

On Thu, Jun 23, 2005 at 10:35:30PM -0400, Andrew Pinski wrote:
> 
> On Jun 23, 2005, at 10:03 PM, Geoffrey Keating wrote:
> 
> >David Edelsohn <dje@watson.ibm.com> writes:
> >
> >I think you might be able to get away with
> >
> >#define ppc405_is_default (strcmp(TARGET_CPU_DEFAULT, "405") == 0)
> 
> I suggested a different way to fix this to David via IRC while he was
> writing this code: change config.gcc to add a define when --with-cpu=405
> is supplied, which in turn enables the check.  David rejected it, saying
> that it was going to be too hard to maintain.

Why does this need to be a preprocessor check - why not do this based
on -mcpu=405 instead of --with-cpu=405?

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-24  3:30         ` Daniel Jacobowitz
@ 2005-06-24 14:16           ` David Edelsohn
  2005-06-24 19:10             ` Daniel Jacobowitz
  2005-06-27  6:06             ` Richard Henderson
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-06-24 14:16 UTC (permalink / raw)
  To: Andrew Pinski, drow, Geoffrey Keating, gcc-patches

>>>>> Daniel Jacobowitz writes:

Dan> Why does this need to be a preprocessor check - why not do this based
Dan> on -mcpu=405 instead of --with-cpu=405?

	The atomic instruction fix needs to be included in all GCC runtime
libraries. No PPC405 multilib will be added.  Therefore, GCC needs to
be configured and built for PPC405 to address this erratum.  If one does
not configure for PPC405, there is no reason to include the fix.

	This also allows users to configure --with-cpu=405 for PPC405CR
and use -mcpu=405 for all other PPC405 processors which are not affected
by the erratum. 

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-24  2:07     ` Geoffrey Keating
  2005-06-24  2:35       ` Andrew Pinski
@ 2005-06-24 18:41       ` Zack Weinberg
       [not found]       ` <geoffk@geoffk.org>
  2 siblings, 0 replies; 875+ messages in thread
From: Zack Weinberg @ 2005-06-24 18:41 UTC (permalink / raw)
  To: Geoffrey Keating; +Cc: David Edelsohn, gcc-patches

Geoffrey Keating wrote:
> David Edelsohn <dje@watson.ibm.com> writes:
> 
> 
>>>>>>>Andreas Schwab writes:
>>
>>>>+ /* If configured for PPC405, support PPC405CR Erratum77.  */
>>>>+ #define PPC405_CPU_DEFAULT ("405")
>>>>+ #if #TARGET_CPU_DEFAULT == #PPC405_CPU_DEFAULT
>>
>>Andreas> What is this expression supposed to do?  It doesn't match
>>Andreas> anything defined by C89.
>>
>>	The expression is comparing the stringified value of the macros.
> 
> 
> I'm pretty sure that doesn't work, and in fact should be an error

It doesn't.  However, that is being masked, because # is not the
stringification operator in #if; only in #define.  If David had written

#define PPC405_CPU_DEFAULT ("405")
#define TARGET_CPU_DEFAULT ((char *) 0)
#define S(x) S_(x)
#define S_(x) #x
#if S(PPC405_CPU_DEFAULT) == S(TARGET_CPU_DEFAULT)
// yada yada
#endif

then he would have seen

$ gcc -fsyntax-only test.c
test.c:5:25: token ""(\"405\")"" is not valid in preprocessor expressions

# in an #if expression triggers one of our least useful extensions,
preprocessor assertions.  Both halves of the == will evaluate to zero, no
matter what the macros are defined to.  Sadly, we don't diagnose the use of an
extension here (patches welcome).

zw
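
To see what the two-level S()/S_() idiom above actually produces, the same macros can be used outside #if, where string comparison is legal. This is an illustrative sketch only; `stringified_defaults_match` is a hypothetical name, and the two *_CPU_DEFAULT definitions are copied from the test case above.

```c
#include <string.h>

#define PPC405_CPU_DEFAULT ("405")
#define TARGET_CPU_DEFAULT ((char *) 0)

/* Two levels are needed: S expands its argument first, then S_
   stringifies the expanded tokens.  */
#define S(x) S_(x)
#define S_(x) #x

/* Hypothetical helper: nonzero when both macros expand to the same
   token sequence.  The comparison happens at run time, not in cpp.  */
static int
stringified_defaults_match (void)
{
  return strcmp (S (PPC405_CPU_DEFAULT), S (TARGET_CPU_DEFAULT)) == 0;
}
```

With the definitions shown, the two strings are "(\"405\")" and "((char *) 0)", so the function returns 0.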

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-24 14:16           ` David Edelsohn
@ 2005-06-24 19:10             ` Daniel Jacobowitz
  2005-06-27  6:06             ` Richard Henderson
  1 sibling, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-06-24 19:10 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Pinski, Geoffrey Keating, gcc-patches

On Fri, Jun 24, 2005 at 10:16:13AM -0400, David Edelsohn wrote:
> >>>>> Daniel Jacobowitz writes:
> 
> Dan> Why does this need to be a preprocessor check - why not do this based
> Dan> on -mcpu=405 instead of --with-cpu=405?
> 
> 	The atomic instruction fix needs to be included in all GCC runtime
> libraries. No PPC405 multilib will be added.  Therefore, GCC needs to
> be configured and built for PPC405 to address this erratum.  If one does
> not configure for PPC405, there is no reason to include the fix.
> 
> 	This also allows users to configure --with-cpu=405 for PPC405CR
> and use -mcpu=405 for all other PPC405 processors which are not affected
> by the erratum. 

There are no other cases in which --with-cpu is different from -mcpu;
it was a conscious design decision for the revised --with-ARG mechanism
to exactly match the behavior of hardcoding options to the compiler. 
I'm pretty unhappy with a divergence.  Why not create -mcpu=405cr, tie
the erratum fix to that option, and document --with-cpu=405cr?

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
       [not found]       ` <geoffk@geoffk.org>
@ 2005-06-25 21:46         ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-06-25 21:46 UTC (permalink / raw)
  To: gcc-patches

	I will update the method for using the erratum fix to only trigger
when configured for 405cr.

	Also, I did test the patch with GCC configured --with-cpu=405 and
without, and the macro was set correctly in both cases.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-24 14:16           ` David Edelsohn
  2005-06-24 19:10             ` Daniel Jacobowitz
@ 2005-06-27  6:06             ` Richard Henderson
  2005-06-27 15:17               ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2005-06-27  6:06 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Andrew Pinski, drow, Geoffrey Keating, gcc-patches

On Fri, Jun 24, 2005 at 10:16:13AM -0400, David Edelsohn wrote:
> 	The atomic instruction fix needs to be included in all GCC runtime
> libraries. No PPC405 multilib will be added.  Therefore, GCC needs to
> be configured and built for PPC405 to address this erratum.  If one does
> not configure for PPC405, there is no reason to include the fix.

I disagree.

It does mean that a libstdc++ not built with --with-cpu=405 will
not operate properly, but there will most definitely be users of
the builtin that are not libstdc++.

> 	This also allows users to configure --with-cpu=405 for PPC405CR
> and use -mcpu=405 for all other PPC405 processors which are not affected
> by the erratum. 

This is simply confusing.  See the mips port for what I'd consider
recommended behaviour wrt fixing chip bugs.

Finally, you can avoid quite a lot of code duplication if you switch
PPC to a post-reload expansion form of ll/sc like I use on Alpha. 
The 405cr fix becomes a single extra emit_insn in the right place.

r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-27  6:06             ` Richard Henderson
@ 2005-06-27 15:17               ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-06-27 15:17 UTC (permalink / raw)
  To: Richard Henderson, Andrew Pinski, drow, Geoffrey Keating, gcc-patches

	My main reason for working on PPC405CR is because the processor
erratum previously was supported and the recent changes created a
regression.  If others will not fix the problem, it falls to me and I am
trying to work around the problem and move on.

	PowerPC atomic support can be improved on a longer timeframe.  I
have other projects I want to complete.

	PPC405CR options I have dismissed include:

1) multilib.
2) first class processor using target_flags.
3) separate option like rs6000_isel.

	PPC405CR erratum support is not worth the level of attention this
discussion is generating.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
       [not found]                               ` <drow@false.org>
  2004-09-15 15:46                                 ` David Edelsohn
  2005-04-01 21:58                                 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
@ 2005-06-27 17:18                                 ` David Edelsohn
  2005-06-28  7:54                                   ` Gunther Nikl
  2005-11-09 19:39                                 ` [PATCH] volatile global register variable David Edelsohn
                                                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-06-27 17:18 UTC (permalink / raw)
  To: gcc-patches

	I committed the following to remove the use of undefined CPP
behavior. 

David

	* config/rs6000/rs6000.c (rs6000_file_start): Note PPC405 erratum
	in verbose_asm output.
	* config/rs6000/rs6000.h (PPC405_ERRATUM77): Bracket with
	CONFIG_PPC405CR.
	* config.gcc (powerpc with_which): Define CONFIG_PPC405CR for
	405cr.

Index: rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.843
diff -c -p -r1.843 rs6000.c
*** rs6000.c	25 Jun 2005 01:22:07 -0000	1.843
--- rs6000.c	27 Jun 2005 17:10:42 -0000
*************** rs6000_file_start (void)
*** 1850,1855 ****
--- 1850,1863 ----
  	    }
  	}
  
+ #ifdef CONFIG_PPC405CR
+       if (rs6000_cpu == PROCESSOR_PPC405)
+ 	{
+ 	  fprintf (file, "%s PPC405CR_ERRATUM77", start);
+ 	  start = "";
+ 	}
+ #endif
+ 
  #ifdef USING_ELFOS_H
        switch (rs6000_sdata)
  	{
Index: rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.374
diff -c -p -r1.374 rs6000.h
*** rs6000.h	26 Jun 2005 11:42:11 -0000	1.374
--- rs6000.h	27 Jun 2005 17:10:42 -0000
***************
*** 50,57 ****
  #endif
  
  /* If configured for PPC405, support PPC405CR Erratum77.  */
! #define PPC405_CPU_DEFAULT ("405")
! #if #TARGET_CPU_DEFAULT == #PPC405_CPU_DEFAULT
  #define PPC405_ERRATUM77 (rs6000_cpu == PROCESSOR_PPC405)
  #else
  #define PPC405_ERRATUM77 0
--- 50,56 ----
  #endif
  
  /* If configured for PPC405, support PPC405CR Erratum77.  */
! #ifdef CONFIG_PPC405CR
  #define PPC405_ERRATUM77 (rs6000_cpu == PROCESSOR_PPC405)
  #else
  #define PPC405_ERRATUM77 0
Index: config.gcc
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config.gcc,v
retrieving revision 1.550
diff -c -p -r1.550 config.gcc
*** config.gcc	25 Jun 2005 01:59:35 -0000	1.550
--- config.gcc	27 Jun 2005 17:10:54 -0000
*************** case "${target}" in
*** 2637,2642 ****
--- 2637,2646 ----
  				with_which="with_$which"
  				eval $with_which=
  				;;
+ 			405cr)
+ 				tm_defines="${tm_defines} CONFIG_PPC405CR"
+ 				eval "with_$which=405"
+ 				;;
  			"" | common \
  			| power | power[2345] | powerpc | powerpc64 \
  			| rios | rios1 | rios2 | rsc | rsc1 | rs64a \

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PPC405 atomic support (PR target/21760)
  2005-06-27 17:18                                 ` [PATCH, committed] PPC405 atomic support (PR target/21760) David Edelsohn
@ 2005-06-28  7:54                                   ` Gunther Nikl
  0 siblings, 0 replies; 875+ messages in thread
From: Gunther Nikl @ 2005-06-28  7:54 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Mon, Jun 27, 2005 at 01:18:15PM -0400, David Edelsohn wrote:
> *** rs6000.c	25 Jun 2005 01:22:07 -0000	1.843
> --- rs6000.c	27 Jun 2005 17:10:42 -0000
> *************** rs6000_file_start (void)
> *** 1850,1855 ****
> --- 1850,1863 ----
>   	    }
>   	}
>   
> + #ifdef CONFIG_PPC405CR
> +       if (rs6000_cpu == PROCESSOR_PPC405)
> + 	{
> + 	  fprintf (file, "%s PPC405CR_ERRATUM77", start);
> + 	  start = "";
> + 	}
> + #endif

  Wouldn't it be better to use "if (PPC405_ERRATUM77)", since rs6000.h
  defines PPC405_ERRATUM77 to zero if CONFIG_PPC405CR wasn't defined
  through config.gcc?

  Gunther

>   /* If configured for PPC405, support PPC405CR Erratum77.  */
> ! #ifdef CONFIG_PPC405CR
>   #define PPC405_ERRATUM77 (rs6000_cpu == PROCESSOR_PPC405)
>   #else
>   #define PPC405_ERRATUM77 0
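
The suggestion above could look roughly like the following stand-alone sketch. The enum, the `rs6000_cpu` variable and `note_erratum` are simplified stand-ins invented for illustration, not the real GCC declarations; only the PPC405_ERRATUM77 definition is taken from the quoted patch.

```c
#include <stdio.h>

/* Simplified stand-ins for GCC internals (assumptions of this sketch).  */
enum processor_type { PROCESSOR_PPC403, PROCESSOR_PPC405 };
enum processor_type rs6000_cpu = PROCESSOR_PPC405;

/* As in the patched rs6000.h: constant zero unless configured for 405cr.  */
#ifdef CONFIG_PPC405CR
#define PPC405_ERRATUM77 (rs6000_cpu == PROCESSOR_PPC405)
#else
#define PPC405_ERRATUM77 0
#endif

/* Hypothetical helper: emit the verbose-asm note without an #ifdef at
   the point of use.  Returns 1 if the note was written.  */
static int
note_erratum (FILE *file, const char *start)
{
  if (PPC405_ERRATUM77)
    {
      fprintf (file, "%s PPC405CR_ERRATUM77", start);
      return 1;
    }
  return 0;
}
```

When CONFIG_PPC405CR is not defined the condition is the constant 0, so the compiler can discard the branch and the call site needs no preprocessor conditional.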

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [RS6000] Nop-insertion fix
@ 2005-09-12  2:03 Alan Modra
  2005-09-12  7:23 ` Richard Henderson
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-09-12  2:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This cleans up get_next_active_insn, using similar tests to those in
emit-rtl.c:next_active_insn to decide which insns are really active.  In
addition it excludes stack_tie, a fake insn that introduces dependency
between the stack pointer and stack memory, used in some function
prologues and epilogues.  This fake insn is seen by the insn grouping
pass as if it were real, resulting in nops being inserted to avoid
non-existent mem access stalls.
See also http://gcc.gnu.org/ml/gcc/2005-09/msg00272.html

	* config/rs6000/rs6000.c (get_next_active_insn): Rewrite using
	CALL_P, JUMP_P and NONJUMP_INSN_P, so that barriers and labels
	are omitted.  Exclude stack_tie insn too.

Bootstrapped and regression tested powerpc-linux.  OK for mainline and
4.0?
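
As a rough model of the rewritten loop, here is a self-contained sketch over a toy insn chain. Every `toy_` name is hypothetical and stands in for the corresponding rtx machinery; the real function works on rtx and uses CALL_P, JUMP_P and NONJUMP_INSN_P.

```c
#include <stddef.h>

/* Toy stand-ins for GCC's insn chain; all names here are hypothetical.  */
enum toy_kind
{
  TOY_NOTE, TOY_BARRIER, TOY_LABEL, TOY_USE, TOY_CLOBBER,
  TOY_STACK_TIE, TOY_CALL, TOY_JUMP, TOY_PLAIN
};

struct toy_insn
{
  enum toy_kind kind;
  struct toy_insn *next;
};

/* Calls, jumps and ordinary insns count as active; notes, barriers,
   labels, USE, CLOBBER and the fake stack_tie do not.  */
static int
toy_is_active (const struct toy_insn *insn)
{
  switch (insn->kind)
    {
    case TOY_CALL:
    case TOY_JUMP:
    case TOY_PLAIN:
      return 1;
    default:
      return 0;
    }
}

/* Return the first active insn after INSN and before TAIL, or NULL.  */
static struct toy_insn *
toy_next_active_insn (struct toy_insn *insn, struct toy_insn *tail)
{
  if (insn == NULL || insn == tail)
    return NULL;

  for (;;)
    {
      insn = insn->next;
      if (insn == NULL || insn == tail)
        return NULL;
      if (toy_is_active (insn))
        return insn;
    }
}
```

Because stack_tie is skipped here, a grouping pass that walks the chain through this function would not see it as a memory access.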

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.866
diff -c -p -r1.866 rs6000.c
*** gcc/config/rs6000/rs6000.c	6 Sep 2005 02:04:59 -0000	1.866
--- gcc/config/rs6000/rs6000.c	12 Sep 2005 00:05:22 -0000
*************** rs6000_is_costly_dependence (rtx insn, r
*** 16622,16647 ****
  static rtx
  get_next_active_insn (rtx insn, rtx tail)
  {
!   rtx next_insn;
! 
!   if (!insn || insn == tail)
      return NULL_RTX;
  
!   next_insn = NEXT_INSN (insn);
! 
!   while (next_insn
!   	 && next_insn != tail
! 	 && (GET_CODE (next_insn) == NOTE
! 	     || GET_CODE (PATTERN (next_insn)) == USE
! 	     || GET_CODE (PATTERN (next_insn)) == CLOBBER))
      {
!       next_insn = NEXT_INSN (next_insn);
!     }
! 
!   if (!next_insn || next_insn == tail)
!     return NULL_RTX;
  
!   return next_insn;
  }
  
  /* Return whether the presence of INSN causes a dispatch group termination
--- 16636,16661 ----
  static rtx
  get_next_active_insn (rtx insn, rtx tail)
  {
!   if (insn == NULL_RTX || insn == tail)
      return NULL_RTX;
  
!   while (1)
      {
!       insn = NEXT_INSN (insn);
!       if (insn == NULL_RTX || insn == tail)
! 	return NULL_RTX;
  
!       if (CALL_P (insn)
! 	  || JUMP_P (insn)
! 	  || (NONJUMP_INSN_P (insn)
! 	      && GET_CODE (PATTERN (insn)) != USE
! 	      && GET_CODE (PATTERN (insn)) != CLOBBER
! 	      && !(GET_CODE (PATTERN (insn)) == SET
! 		   && GET_CODE (XEXP (PATTERN (insn), 1)) == UNSPEC
! 		   && XINT (XEXP (PATTERN (insn), 1), 1) == UNSPEC_TIE)))
! 	break;
!     }
!   return insn;
  }
  
  /* Return whether the presence of INSN causes a dispatch group termination

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Nop-insertion fix
       [not found]           ` <amodra@bigpond.net.au>
                               ` (40 preceding siblings ...)
  2005-06-10  1:13             ` [PATCH] PowerPC SVR4 _mcount calls David Edelsohn
@ 2005-09-12  3:15             ` David Edelsohn
  2005-09-12  3:59             ` [PowerPC] Fix PR23774 stack backchain broken David Edelsohn
                               ` (19 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-09-12  3:15 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/rs6000.c (get_next_active_insn): Rewrite using
	CALL_P, JUMP_P and NONJUMP_INSN_P, so that barriers and labels
	are omitted.  Exclude stack_tie insn too.


Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] Fix PR23774 stack backchain broken
@ 2005-09-12  3:41 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-09-12  3:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Bootstrapped and regression tested powerpc-linux and powerpc64-linux.
Relies on http://gcc.gnu.org/ml/gcc-patches/2005-09/msg00689.html being
applied.  OK for mainline and 4.0?

	PR target/23774
	* config/rs6000/rs6000.md (restore_stack_block): Write the backchain
	word before changing the stack pointer.  Use gen_frame_mem for MEMs.
	Use UNSPEC_TIE to prevent insn scheduling reordering the insns.
	(restore_stack_nonlocal): Likewise.
	(save_stack_nonlocal): Use template to emit insns, and gen_frame_mem.

--- rs6000.md	2005-09-01 20:15:50.000000000 +0930
+++ rs6000.md.tie-23774	2005-09-12 12:33:28.000000000 +0930
@@ -9348,50 +9348,53 @@
   "DONE;")
 
 (define_expand "restore_stack_block"
-  [(use (match_operand 0 "register_operand" ""))
-   (set (match_dup 2) (match_dup 3))
-   (set (match_dup 0) (match_operand 1 "register_operand" ""))
-   (set (match_dup 3) (match_dup 2))]
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 2))
+   (set (match_dup 5) (unspec:BLK [(match_dup 5)] UNSPEC_TIE))
+   (set (match_operand 0 "register_operand" "")
+	(match_operand 1 "register_operand" ""))]
   ""
   "
 {
   operands[2] = gen_reg_rtx (Pmode);
-  operands[3] = gen_rtx_MEM (Pmode, operands[0]);
+  operands[3] = gen_frame_mem (Pmode, operands[0]);
+  operands[4] = gen_frame_mem (Pmode, operands[1]);
+  operands[5] = gen_frame_mem (BLKmode, operands[0]);
 }")
 
 (define_expand "save_stack_nonlocal"
-  [(match_operand 0 "memory_operand" "")
-   (match_operand 1 "register_operand" "")]
+  [(set (match_dup 3) (match_dup 4))
+   (set (match_operand 0 "memory_operand" "") (match_dup 3))
+   (set (match_dup 2) (match_operand 1 "register_operand" ""))]
   ""
   "
 {
-  rtx temp = gen_reg_rtx (Pmode);
   int units_per_word = (TARGET_32BIT) ? 4 : 8;
 
   /* Copy the backchain to the first word, sp to the second.  */
-  emit_move_insn (temp, gen_rtx_MEM (Pmode, operands[1]));
-  emit_move_insn (adjust_address_nv (operands[0], Pmode, 0), temp);
-  emit_move_insn (adjust_address_nv (operands[0], Pmode, units_per_word),
-		  operands[1]);
-  DONE;
+  operands[0] = adjust_address_nv (operands[0], Pmode, 0);
+  operands[2] = adjust_address_nv (operands[0], Pmode, units_per_word);
+  operands[3] = gen_reg_rtx (Pmode);
+  operands[4] = gen_frame_mem (Pmode, operands[1]);
 }")
 
 (define_expand "restore_stack_nonlocal"
-  [(match_operand 0 "register_operand" "")
-   (match_operand 1 "memory_operand" "")]
+  [(set (match_dup 2) (match_operand 1 "memory_operand" ""))
+   (set (match_dup 3) (match_dup 4))
+   (set (match_dup 5) (match_dup 2))
+   (set (match_dup 6) (unspec:BLK [(match_dup 6)] UNSPEC_TIE))
+   (set (match_operand 0 "register_operand" "") (match_dup 3))]
   ""
   "
 {
-  rtx temp = gen_reg_rtx (Pmode);
   int units_per_word = (TARGET_32BIT) ? 4 : 8;
 
-  /* Restore the backchain from the first word, sp from the second.  */
-  emit_move_insn (temp,
-		  adjust_address_nv (operands[1], Pmode, 0));
-  emit_move_insn (operands[0],
-		  adjust_address_nv (operands[1], Pmode, units_per_word));
-  emit_move_insn (gen_rtx_MEM (Pmode, operands[0]), temp);
-  DONE;
+  operands[2] = gen_reg_rtx (Pmode);
+  operands[3] = gen_reg_rtx (Pmode);
+  operands[1] = adjust_address_nv (operands[1], Pmode, 0);
+  operands[4] = adjust_address_nv (operands[1], Pmode, units_per_word);
+  operands[5] = gen_frame_mem (Pmode, operands[3]);
+  operands[6] = gen_frame_mem (BLKmode, operands[0]);
 }")
 \f
 ;; TOC register handling.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix PR23774 stack backchain broken
       [not found]           ` <amodra@bigpond.net.au>
                               ` (41 preceding siblings ...)
  2005-09-12  3:15             ` [RS6000] Nop-insertion fix David Edelsohn
@ 2005-09-12  3:59             ` David Edelsohn
  2005-09-13  0:44             ` David Edelsohn
                               ` (18 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-09-12  3:59 UTC (permalink / raw)
  To: gcc-patches

	The comment in restore_stack_nonlocal should be preserved and an
equivalent comment should be added to restore_stack_block to explain what
operations it is performing.

	Meanwhile, this review will take some study.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Nop-insertion fix
  2005-09-12  2:03 [RS6000] Nop-insertion fix Alan Modra
@ 2005-09-12  7:23 ` Richard Henderson
  2005-09-12 14:16   ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Richard Henderson @ 2005-09-12  7:23 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn

On Mon, Sep 12, 2005 at 11:33:01AM +0930, Alan Modra wrote:
> ! 	  || (NONJUMP_INSN_P (insn)
> ! 	      && GET_CODE (PATTERN (insn)) != USE
> ! 	      && GET_CODE (PATTERN (insn)) != CLOBBER
> ! 	      && !(GET_CODE (PATTERN (insn)) == SET
> ! 		   && GET_CODE (XEXP (PATTERN (insn), 1)) == UNSPEC
> ! 		   && XINT (XEXP (PATTERN (insn), 1), 1) == UNSPEC_TIE)))

INSN_CODE (insn) == CODE_FOR_stack_tie ?


r~

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Nop-insertion fix
  2005-09-12  7:23 ` Richard Henderson
@ 2005-09-12 14:16   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-09-12 14:16 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches, David Edelsohn

On Mon, Sep 12, 2005 at 12:22:00AM -0700, Richard Henderson wrote:
> On Mon, Sep 12, 2005 at 11:33:01AM +0930, Alan Modra wrote:
> > ! 	  || (NONJUMP_INSN_P (insn)
> > ! 	      && GET_CODE (PATTERN (insn)) != USE
> > ! 	      && GET_CODE (PATTERN (insn)) != CLOBBER
> > ! 	      && !(GET_CODE (PATTERN (insn)) == SET
> > ! 		   && GET_CODE (XEXP (PATTERN (insn), 1)) == UNSPEC
> > ! 		   && XINT (XEXP (PATTERN (insn), 1), 1) == UNSPEC_TIE)))
> 
> INSN_CODE (insn) == CODE_FOR_stack_tie ?

Thanks.  Applying the following as obvious.

	* config/rs6000/rs6000.c (get_next_active_insn): Simplify test for
	stack_tie.

Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.868
retrieving revision 1.869
diff -u -p -r1.868 -r1.869
--- gcc/config/rs6000/rs6000.c	12 Sep 2005 03:51:13 -0000	1.868
+++ gcc/config/rs6000/rs6000.c	12 Sep 2005 14:14:21 -0000	1.869
@@ -16646,9 +16646,7 @@ get_next_active_insn (rtx insn, rtx tail
 	  || (NONJUMP_INSN_P (insn)
 	      && GET_CODE (PATTERN (insn)) != USE
 	      && GET_CODE (PATTERN (insn)) != CLOBBER
-	      && !(GET_CODE (PATTERN (insn)) == SET
-		   && GET_CODE (XEXP (PATTERN (insn), 1)) == UNSPEC
-		   && XINT (XEXP (PATTERN (insn), 1), 1) == UNSPEC_TIE)))
+	      && INSN_CODE (insn) != CODE_FOR_stack_tie))
 	break;
     }
   return insn;

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix PR23774 stack backchain broken
       [not found]           ` <amodra@bigpond.net.au>
                               ` (42 preceding siblings ...)
  2005-09-12  3:59             ` [PowerPC] Fix PR23774 stack backchain broken David Edelsohn
@ 2005-09-13  0:44             ` David Edelsohn
  2005-09-13  4:19               ` Alan Modra
  2005-10-21  2:01             ` [PowerPC] -msdata=data needless use of .sbss section David Edelsohn
                               ` (17 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-09-13  0:44 UTC (permalink / raw)
  To: gcc-patches

	PR target/23774
	* config/rs6000/rs6000.md (restore_stack_block): Write the backchain
	word before changing the stack pointer.  Use gen_frame_mem for MEMs.
	Use UNSPEC_TIE to prevent insn scheduling reordering the insns.
	(restore_stack_nonlocal): Likewise.
	(save_stack_nonlocal): Use template to emit insns, and gen_frame_mem.

Okay, with more comments.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix PR23774 stack backchain broken
  2005-09-13  0:44             ` David Edelsohn
@ 2005-09-13  4:19               ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-09-13  4:19 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Mon, Sep 12, 2005 at 08:44:06PM -0400, David Edelsohn wrote:
> 	PR target/23774
> 	* config/rs6000/rs6000.md (restore_stack_block): Write the backchain
> 	word before changing the stack pointer.  Use gen_frame_mem for MEMs.
> 	Use UNSPEC_TIE to prevent insn scheduling reordering the insns.
> 	(restore_stack_nonlocal): Likewise.
> 	(save_stack_nonlocal): Use template to emit insns, and gen_frame_mem.
> 
> Okay, with more comments.

Applied with this comment added for restore_stack_block, and the
restore_stack_nonlocal comment left in.

;; Adjust stack pointer (op0) to a new value (op1).
;; First copy old stack backchain to new location, and ensure that the
;; scheduler won't reorder the sp assignment before the backchain write.


I accidentally applied the 4.1 version of the patch to 4.0, so have
applied the following to fix it.

	Fix 4.1 version applied in error.
	* config/rs6000/rs6000.md (restore_stack_block): Don't use 
	gen_frame_mem.
	(save_stack_nonlocal, restore_stack_nonlocal): Likewise.

Index: gcc/config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.350.2.4
retrieving revision 1.350.2.5
diff -u -p -r1.350.2.4 -r1.350.2.5
--- gcc/config/rs6000/rs6000.md	13 Sep 2005 03:01:01 -0000	1.350.2.4
+++ gcc/config/rs6000/rs6000.md	13 Sep 2005 03:59:44 -0000	1.350.2.5
@@ -10048,9 +10048,9 @@
   "
 {
   operands[2] = gen_reg_rtx (Pmode);
-  operands[3] = gen_frame_mem (Pmode, operands[0]);
-  operands[4] = gen_frame_mem (Pmode, operands[1]);
-  operands[5] = gen_frame_mem (BLKmode, operands[0]);
+  operands[3] = gen_rtx_MEM (Pmode, operands[0]);
+  operands[4] = gen_rtx_MEM (Pmode, operands[1]);
+  operands[5] = gen_rtx_MEM (BLKmode, operands[0]);
 }")
 
 (define_expand "save_stack_nonlocal"
@@ -10066,7 +10066,7 @@
   operands[0] = adjust_address_nv (operands[0], Pmode, 0);
   operands[2] = adjust_address_nv (operands[0], Pmode, units_per_word);
   operands[3] = gen_reg_rtx (Pmode);
-  operands[4] = gen_frame_mem (Pmode, operands[1]);
+  operands[4] = gen_rtx_MEM (Pmode, operands[1]);
 }")
 
 (define_expand "restore_stack_nonlocal"
@@ -10085,8 +10085,8 @@
   operands[3] = gen_reg_rtx (Pmode);
   operands[1] = adjust_address_nv (operands[1], Pmode, 0);
   operands[4] = adjust_address_nv (operands[1], Pmode, units_per_word);
-  operands[5] = gen_frame_mem (Pmode, operands[3]);
-  operands[6] = gen_frame_mem (BLKmode, operands[0]);
+  operands[5] = gen_rtx_MEM (Pmode, operands[3]);
+  operands[6] = gen_rtx_MEM (BLKmode, operands[0]);
 }")
 \f
 ;; TOC register handling.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] -msdata=data needless use of .sbss section
@ 2005-10-20 11:00 Alan Modra
  2005-10-20 16:28 ` David Edelsohn
  2005-11-27 23:21 ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: Alan Modra @ 2005-10-20 11:00 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch fixes an inconsistency between rs6000_elf_in_small_data_p and
ASM_OUTPUT_ALIGNED_LOCAL in the treatment of local items under
-msdata=data.  As the comment in rs6000_elf_in_small_data_p says:
	  /* If it's not public, and we're not going to reference it there,
	     there's no need to put it in the small data section.  */

So there isn't much point in having ASM_OUTPUT_ALIGNED_LOCAL place
static variables in .sbss when -msdata=data.  Bootstrapped and
regression tested powerpc-linux.  OK for 4.2?
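
For readers following along, the section-selection condition in the patched
macro can be restated as a standalone predicate.  This is only a sketch under
stated assumptions: goes_in_sbss, the enum and the SDATA_OTHER value are
hypothetical names made up for illustration (only SDATA_NONE and SDATA_DATA
appear in the patch), and a NULL DECL is modelled as is_public == false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the rs6000 small-data modes.  Only the
   names SDATA_NONE and SDATA_DATA come from the patch; SDATA_OTHER
   represents any remaining -msdata setting.  */
enum sdata_mode { SDATA_NONE, SDATA_DATA, SDATA_OTHER };

/* Return true if an uninitialized item would be emitted in .sbss.
   Mirrors the patched condition: small data must be enabled, under
   SDATA_DATA only public items qualify, and SIZE must lie in the
   range (0, g_switch_value].  */
bool
goes_in_sbss (enum sdata_mode mode, bool is_public,
              long size, long g_switch_value)
{
  assert (g_switch_value >= 0);
  return mode != SDATA_NONE
         && (mode != SDATA_DATA || is_public)
         && size > 0
         && size <= g_switch_value;
}
```

With -msdata=data, a 4-byte static and a threshold of 8, the predicate is
false, so the item would fall through to the non-sbss branch of the macro;
the same item with external linkage still qualifies.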

	* config/rs6000/sysv4.h (ASM_OUTPUT_ALIGNED_LOCAL): Rename to..
	(ASM_OUTPUT_ALIGNED_DECL_LOCAL): ..this, adding extra parm.  Don't
	output locals to sbss when SDATA_DATA.
	(ASM_OUTPUT_ALIGNED_BSS): Adjust for above.
	* doc/invoke.texi (powerpc msdata-data): Static data doesn't go in
	small data sections.

diff -urp -xCVS -x'*~' -xTAGS -x'*.info' -x'.#*' gcc-virgin/gcc/config/rs6000/sysv4.h gcc-current/gcc/config/rs6000/sysv4.h
--- gcc-virgin/gcc/config/rs6000/sysv4.h	2005-08-08 09:17:55.000000000 +0930
+++ gcc-current/gcc/config/rs6000/sysv4.h	2005-10-20 11:14:05.000000000 +0930
@@ -550,12 +550,12 @@ extern int rs6000_pic_labelno;
 
 #define	LCOMM_ASM_OP	"\t.lcomm\t"
 
-/* Override elfos.h definition.  */
-#undef	ASM_OUTPUT_ALIGNED_LOCAL
-#define	ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)		\
+/* Describe how to emit uninitialized local items.  */
+#define	ASM_OUTPUT_ALIGNED_DECL_LOCAL(FILE, DECL, NAME, SIZE, ALIGN)	\
 do {									\
-  if (rs6000_sdata != SDATA_NONE && (SIZE) > 0				\
-      && (SIZE) <= g_switch_value)					\
+  if (rs6000_sdata != SDATA_NONE					\
+      && (rs6000_sdata != SDATA_DATA || ((DECL) && TREE_PUBLIC (DECL)))	\
+      && (SIZE) > 0 && (SIZE) <= g_switch_value)			\
     {									\
       sbss_section ();							\
       ASM_OUTPUT_ALIGN (FILE, exact_log2 (ALIGN / BITS_PER_UNIT));	\
@@ -577,7 +577,7 @@ do {									\
 /* Describe how to emit uninitialized external linkage items.  */
 #define	ASM_OUTPUT_ALIGNED_BSS(FILE, DECL, NAME, SIZE, ALIGN)		\
 do {									\
-  ASM_OUTPUT_ALIGNED_LOCAL (FILE, NAME, SIZE, ALIGN);			\
+  ASM_OUTPUT_ALIGNED_DECL_LOCAL (FILE, DECL, NAME, SIZE, ALIGN);	\
 } while (0)
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
diff -urp -xCVS -x'*~' -xTAGS -x'*.info' -x'.#*' gcc-virgin/gcc/doc/invoke.texi gcc-current/gcc/doc/invoke.texi
--- gcc-virgin/gcc/doc/invoke.texi	2005-10-20 10:14:26.000000000 +0930
+++ gcc-current/gcc/doc/invoke.texi	2005-10-20 18:07:50.000000000 +0930
@@ -11419,9 +11419,9 @@ same as @option{-msdata=sysv}.
 
 @item -msdata-data
 @opindex msdata-data
-On System V.4 and embedded PowerPC systems, put small global and static
-data in the @samp{.sdata} section.  Put small uninitialized global and
-static data in the @samp{.sbss} section.  Do not use register @code{r13}
+On System V.4 and embedded PowerPC systems, put small global
+data in the @samp{.sdata} section.  Put small uninitialized global
+data in the @samp{.sbss} section.  Do not use register @code{r13}
 to address small data however.  This is the default behavior unless
 other @option{-msdata} options are used.
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] -msdata=data needless use of .sbss section
  2005-10-20 11:00 [PowerPC] -msdata=data needless use of .sbss section Alan Modra
@ 2005-10-20 16:28 ` David Edelsohn
  2005-10-20 23:04   ` Alan Modra
  2005-11-27 23:21 ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-10-20 16:28 UTC (permalink / raw)
  To: gcc-patches, Alan Modra

Alan,

	Why not call rs6000_elf_in_small_data_p() in the macro?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] -msdata=data needless use of .sbss section
  2005-10-20 16:28 ` David Edelsohn
@ 2005-10-20 23:04   ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-10-20 23:04 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Thu, Oct 20, 2005 at 12:28:38PM -0400, David Edelsohn wrote:
> 	Why not call rs6000_elf_in_small_data_p() in the macro?

I looked at doing that, and couldn't convince myself that the size
calculated by rs6000_elf_in_small_data_p would always be the same
as that calculated by assemble_variable.  The former uses
   int_size_in_bytes (TREE_TYPE (decl))
the latter,
   tree_low_cst (DECL_SIZE_UNIT (decl), 1)

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] -msdata=data needless use of .sbss section
       [not found]           ` <amodra@bigpond.net.au>
                               ` (43 preceding siblings ...)
  2005-09-13  0:44             ` David Edelsohn
@ 2005-10-21  2:01             ` David Edelsohn
  2005-11-02 23:37             ` [PowerPC64] gcc.c-torture/compile/pr20928.c failure David Edelsohn
                               ` (16 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-10-21  2:01 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> I looked at doing that, and couldn't convince myself that the size
Alan> calculated by rs6000_elf_in_small_data_p would always be the same
Alan> as that calculated by assemble_variable.  The former uses
Alan> int_size_in_bytes (TREE_TYPE (decl))
Alan> the latter,
Alan> tree_low_cst (DECL_SIZE_UNIT (decl), 1)

	For non-variable size types, int_size_in_bytes returns

  t = TYPE_SIZE_UNIT (type);
  return TREE_INT_CST_LOW (t);

tree_low_cst returns

  return TREE_INT_CST_LOW (t);

If DECL_SIZE_UNIT and TYPE_SIZE_UNIT don't match, I think we have more
problems.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC64] gcc.c-torture/compile/pr20928.c failure
@ 2005-11-02 23:22 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-11-02 23:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

gcc.c-torture/compile/pr20928.c fails on PowerPC64 with
Assembler messages:
Error: junk at end of line, first unrecognized character is `-'
The faulty assembler looks like
 .tc bar.N-2147483648[TC],bar+2147483648

This exposes two bugs in the rs6000 backend,
  a) assuming the offset fits in an int, and
  b) assuming that if offset < 0, then -offset > 0, which doesn't hold
     for the minimum integer.
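
Bug (b) can be shown outside the compiler.  With a 32-bit int holding the
minimum value, - offset cannot be represented, and in practice it typically
comes back unchanged, which is how the "-" ends up inside the .N suffix.  The
sketch below is not the rs6000 code: format_toc_suffix is a hypothetical
helper, and long long stands in for HOST_WIDE_INT.  It takes the magnitude in
unsigned arithmetic so that even the minimum value is handled.

```c
#include <assert.h>
#include <stdio.h>

/* Write the ".N<magnitude>" or ".P<magnitude>" suffix for OFFSET into
   BUF, or an empty string when OFFSET is zero.  Returns the value
   returned by snprintf, or 0 for a zero offset.  long long is an
   assumption standing in for HOST_WIDE_INT.  */
int
format_toc_suffix (char *buf, size_t n, long long offset)
{
  assert (buf != NULL && n > 0);
  if (offset < 0)
    {
      /* Negate in unsigned arithmetic: -offset is not representable
         for the minimum value, but 0 - (unsigned) offset is well
         defined and yields the magnitude.  */
      unsigned long long mag = 0ULL - (unsigned long long) offset;
      return snprintf (buf, n, ".N%llu", mag);
    }
  if (offset > 0)
    return snprintf (buf, n, ".P%llu", (unsigned long long) offset);
  buf[0] = '\0';
  return 0;
}
```

For the offset in the failing testcase, -2147483648, this produces
".N2147483648" with no sign character inside the symbol name.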

	* config/rs6000/rs6000.c (output_toc): Don't assume "offset" fits
	in an int, use HOST_WIDE_INT and associated print macros.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 106403)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -15588,7 +15588,7 @@
   const char *name = buf;
   const char *real_name;
   rtx base = x;
-  int offset = 0;
+  HOST_WIDE_INT offset = 0;
 
   gcc_assert (!TARGET_NO_TOC);
 
@@ -15855,9 +15855,9 @@
       fprintf (file, "\t.tc %s", real_name);
 
       if (offset < 0)
-	fprintf (file, ".N%d", - offset);
+	fprintf (file, ".N" HOST_WIDE_INT_PRINT_UNSIGNED, - offset);
       else if (offset)
-	fprintf (file, ".P%d", offset);
+	fprintf (file, ".P" HOST_WIDE_INT_PRINT_UNSIGNED, offset);
 
       fputs ("[TC],", file);
     }
@@ -15872,9 +15872,9 @@
     {
       RS6000_OUTPUT_BASENAME (file, name);
       if (offset < 0)
-	fprintf (file, "%d", offset);
+	fprintf (file, HOST_WIDE_INT_PRINT_DEC, offset);
       else if (offset > 0)
-	fprintf (file, "+%d", offset);
+	fprintf (file, "+" HOST_WIDE_INT_PRINT_DEC, offset);
     }
   else
     output_addr_const (file, x);

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC64] gcc.c-torture/compile/pr20928.c failure
       [not found]           ` <amodra@bigpond.net.au>
                               ` (44 preceding siblings ...)
  2005-10-21  2:01             ` [PowerPC] -msdata=data needless use of .sbss section David Edelsohn
@ 2005-11-02 23:37             ` David Edelsohn
  2005-11-08  2:59             ` [PowerPC] Fix PR23704, -m64 overrides prior -mno-powerpc-gfxopt David Edelsohn
                               ` (15 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-11-02 23:37 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

	* config/rs6000/rs6000.c (output_toc): Don't assume "offset" fits
	in an int, use HOST_WIDE_INT and associated print macros.

Okay, when mainline re-opens for non-regression fixes.

Also, don't explain the patch in the ChangeLog, just leave the HWI and
print macros parts.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] Fix PR23704, -m64 overrides prior -mno-powerpc-gfxopt
@ 2005-11-08  2:55 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-11-08  2:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This fixes a testsuite failure rth found.  More commentary in the PR.
Bootstrapped etc. powerpc64-linux.  OK to apply?

	PR target/23704
	* config/rs6000/rs6000.c (rs6000_handle_option <OPT_m64>): Don't
	override prior explicit -mno-powerpc-gfxopt.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 106630)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1624,9 +1624,9 @@
 #else
     case OPT_m64:
 #endif
-      target_flags |= MASK_POWERPC64 | MASK_POWERPC | MASK_PPC_GFXOPT;
-      target_flags_explicit |= MASK_POWERPC64 | MASK_POWERPC
-	| MASK_PPC_GFXOPT;
+      target_flags |= MASK_POWERPC64 | MASK_POWERPC;
+      target_flags |= ~target_flags_explicit & MASK_PPC_GFXOPT;
+      target_flags_explicit |= MASK_POWERPC64 | MASK_POWERPC;
       break;
 
 #ifdef TARGET_USES_AIX64_OPT

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix PR23704, -m64 overrides prior -mno-powerpc-gfxopt
       [not found]           ` <amodra@bigpond.net.au>
                               ` (45 preceding siblings ...)
  2005-11-02 23:37             ` [PowerPC64] gcc.c-torture/compile/pr20928.c failure David Edelsohn
@ 2005-11-08  2:59             ` David Edelsohn
  2005-11-25  5:15             ` [RS6000] Add some more functions to ppc64-fp.c David Edelsohn
                               ` (14 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-11-08  2:59 UTC (permalink / raw)
  To: gcc-patches

	PR target/23704
	* config/rs6000/rs6000.c (rs6000_handle_option <OPT_m64>): Don't
	override prior explicit -mno-powerpc-gfxopt.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] volatile global register variable
@ 2005-11-09 16:28 David Edelsohn
  2005-11-09 18:13 ` Joseph S. Myers
  2005-11-09 18:41 ` Daniel Jacobowitz
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2005-11-09 16:28 UTC (permalink / raw)
  To: gcc-patches; +Cc: Mark Mitchell, bergner

	GCC currently generates a warning when one declares a global
register variable as volatile.  The warning cannot be disabled and is
rather uninformative: "don't work as you wish".

	With the new Tree-SSA infrastructure, there now are times when one
needs to declare a variable associated with a global register as volatile.
This patch changes the warning to be protected by an option and disabled
by default.  The warning could be enabled with -Wall, if that is desired.

Okay for mainline?

Thanks, David


	PR 24644
	* common.opt (Wvolatile-register-var): New.
	* varasm.c (make_decl_rtl): Only emit warning when option
	specified.

Index: common.opt
===================================================================
*** common.opt	(revision 106675)
--- common.opt	(working copy)
*************** Wunused-variable
*** 173,178 ****
--- 173,182 ----
  Common Var(warn_unused_variable)
  Warn when a variable is unused
  
+ Wvolatile-register-var
+ Common
+ Warn when a register variable is declared volatile
+ 
  aux-info
  Common Separate
  -aux-info <file>	Emit declaration information into <file>
Index: varasm.c
===================================================================
*** varasm.c	(revision 106675)
--- varasm.c	(working copy)
*************** make_decl_rtl (tree decl)
*** 955,961 ****
  	      error ("global register variable has initial value");
  	    }
  	  if (TREE_THIS_VOLATILE (decl))
! 	    warning (0, "volatile register variables don%'t "
  		     "work as you might wish");
  
  	  /* If the user specified one of the eliminables registers here,
--- 955,962 ----
  	      error ("global register variable has initial value");
  	    }
  	  if (TREE_THIS_VOLATILE (decl))
! 	    warning (OPT_Wvolatile_register_var,
! 		     "volatile register variables don%'t "
  		     "work as you might wish");
  
  	  /* If the user specified one of the eliminables registers here,

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
  2005-11-09 16:28 [PATCH] volatile global register variable David Edelsohn
@ 2005-11-09 18:13 ` Joseph S. Myers
  2005-11-09 19:33   ` Ian Lance Taylor
  2005-11-09 18:41 ` Daniel Jacobowitz
  1 sibling, 1 reply; 875+ messages in thread
From: Joseph S. Myers @ 2005-11-09 18:13 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Mark Mitchell, bergner

On Wed, 9 Nov 2005, David Edelsohn wrote:

> 	GCC currently generates a warning when one declares a global
> register variable as volatile.  The warning cannot be disabled and is
> rather uninformative: "don't work as you wish".
> 
> 	With the new Tree-SSA infrastructure, there now are times when one
> needs to declare a variable associated with a global register as volatile.
> This patch changes the warning to be protected by an option and disabled
> by default.  The warning could be enabled with -Wall, if that is desired.

Documentation of the option is needed in invoke.texi.  I think it should 
explain something of what "don't work as you wish" means and why you might 
or might not want the warning.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
  2005-11-09 16:28 [PATCH] volatile global register variable David Edelsohn
  2005-11-09 18:13 ` Joseph S. Myers
@ 2005-11-09 18:41 ` Daniel Jacobowitz
  1 sibling, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2005-11-09 18:41 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Mark Mitchell, bergner

On Wed, Nov 09, 2005 at 11:28:29AM -0500, David Edelsohn wrote:
> 	GCC currently generates a warning when one declares a global
> register variable as volatile.  The warning cannot be disabled and is
> rather uninformative: "don't work as you wish".
> 
> 	With the new Tree-SSA infrastructure, there now are times when one
> needs to declare a variable associated with a global register as volatile.
> This patch changes the warning to be protected by an option and disabled
> by default.  The warning could be enabled with -Wall, if that is desired.

Could you explain what these times are, for the curious?

I know I've wanted to turn off this warning, because I have no
easy qualifier-stripping __typeof__; atomic operations often are
written to use __typeof__(*mem) in a register variable.

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
  2005-11-09 18:13 ` Joseph S. Myers
@ 2005-11-09 19:33   ` Ian Lance Taylor
  0 siblings, 0 replies; 875+ messages in thread
From: Ian Lance Taylor @ 2005-11-09 19:33 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: David Edelsohn, gcc-patches, Mark Mitchell, bergner

"Joseph S. Myers" <joseph@codesourcery.com> writes:

> On Wed, 9 Nov 2005, David Edelsohn wrote:
> 
> > 	GCC currently generates a warning when one declares a global
> > register variable as volatile.  The warning cannot be disabled and is
> > rather uninformative: "don't work as you wish".
> > 
> > 	With the new Tree-SSA infrastructure, there now are times when one
> > needs to declare a variable associated with a global register as volatile.
> > This patch changes the warning to be protected by an option and disabled
> > by default.  The warning could be enabled with -Wall, if that is desired.
> 
> Documentation of the option is needed in invoke.texi.  I think it should 
> explain something of what "don't work as you wish" means and why you might 
> or might not want the warning.

They "don't work as you might wish" because at the RTL level a global
register variable is stored as a REG, and there is no way to mark a
REG as volatile.  That is, gcc will not in general avoid applying RTL
optimizations to a global register variable.  At the tree level a
volatile global register variable will be handled like any other
volatile variable.

As it happens, gcc does specifically avoid optimizing global register
variables in specific cases (whether or not the variable is declared
volatile).  For example, gcc won't CSE values in global register
variables.  But, as an example of optimization which should be invalid
for a volatile variable, the combine pass will combine instructions
which use global registers, although it won't eliminate assignments to
global registers.

I think this could use an overhaul.  I think we could record when
global register variables are volatile, and I think we could treat
them as volatile.  This would mean looking for code which checks
MEM_VOLATILE_P or UNSPEC_VOLATILE, ideally abstracting those checks
into a few functions, and adding checks for volatile global
registers.

I think that would be a good 4.2 project.

Ian

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
       [not found]                               ` <drow@false.org>
                                                   ` (2 preceding siblings ...)
  2005-06-27 17:18                                 ` [PATCH, committed] PPC405 atomic support (PR target/21760) David Edelsohn
@ 2005-11-09 19:39                                 ` David Edelsohn
  2005-11-10 21:18                                   ` Mark Mitchell
  2006-08-09 14:59                                 ` [PATCH] Fix comments in PPC e500 code David Edelsohn
                                                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-11-09 19:39 UTC (permalink / raw)
  To: gcc-patches, Mark Mitchell, bergner

>>>>> Daniel Jacobowitz writes:

Dan> Could you explain what these times are, for the curious?

	The issue is how much one expects GCC to try to optimize code
using global register variables and how much one wants GCC to try to
prevent it.  At some level this is going to become a DWIM scenario if GCC
does not allow the programmer to inform the compiler where it is and is
not safe using attributes like volatile.

	Some constituency is going to be unhappy if GCC inhibits all
optimizations on global register variables.  Another constituency is going
to be upset if GCC applies an aggressive optimization and there is no way
to disable the optimization without generating warnings that cannot be
silenced.

	GCC disables some Tree-SSA and RTL optimizations on VAR_DECLs
marked as DECL_HARD_REGISTER, but not all.  In Tree-SSA, is_gimple_reg()
returns false for variables marked DECL_HARD_REGISTER, but some GCC
optimization checks allow SSA copy propagation to be applied to global
register variables.  GCC may not try to apply that transformation now, but
it might in the future.  Trying to catch every DECL_HARD_REGISTER case in
current and future optimization passes seems like design for failure.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
  2005-11-09 19:39                                 ` [PATCH] volatile global register variable David Edelsohn
@ 2005-11-10 21:18                                   ` Mark Mitchell
  2005-11-11  3:10                                     ` David Edelsohn
  2005-11-11  3:37                                     ` David Edelsohn
  0 siblings, 2 replies; 875+ messages in thread
From: Mark Mitchell @ 2005-11-10 21:18 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, bergner

David Edelsohn wrote:
>>>>>>Daniel Jacobowitz writes:
> 
> 
> Dan> Could you explain what these times are, for the curious?
> 
> 	The issue is how much one expects GCC to try to optimize code
> using global register variables and how much one wants GCC to try to
> prevent it.  At some level this is going to become a DWIM scenario if GCC
> does not allow the programmer to inform the compiler where it is and is
> not safe using attributes like volatile.

I think David's patch (with documentation, as Joseph suggested) is a
good idea.  We want to move towards finer control of warnings, and,
apparently, this one is annoying some kernel developers.  So, the patch
is OK, once the documentation is added.

However, I'd also like to see the warning message improved.  I don't
think it's necessary (the patch is still an improvement without changing
the message), but something like:

warning: optimization may eliminate reads and/or writes to global
register variables

would seem a lot more helpful than "don't work as you might wish" which
is cryptic in the extreme.

I'm not sure how to solve the broader problems that David and Ian have
raised.  David points out that being sure never to optimize volatile
register variables may be hard; Ian points out that we already have to
do that for volatile memory, so maybe it's not too hard.  I don't know
how hard it might be, but it does seem like abstractly the right thing;
the rules for a volatile register variable ought to be the same as the
rules for any other volatile object.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
  2005-11-10 21:18                                   ` Mark Mitchell
@ 2005-11-11  3:10                                     ` David Edelsohn
  2005-11-11  5:50                                       ` Mark Mitchell
  2005-11-11  3:37                                     ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2005-11-11  3:10 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc-patches

>>>>> Mark Mitchell writes:

Mark> I'm not sure how to solve the broader problems that David and Ian have
Mark> raised.  David points out that being sure never to optimize volatile
Mark> register variables may be hard; Ian points out that we already have to
Mark> do that for volatile memory, so maybe it's not too hard.  I don't know
Mark> how hard it might be, but it does seem like abstractly the right thing;
Mark> the rules for a volatile register variable ought to be the same as the
Mark> rules for any other volatile object.

	My concern is not about volatile, but about global register
variables having overloaded and ambiguous semantics.  I specifically think
we should avoid global register variables implying volatile.  Some
constituency will not be happy -- either wanting more aggressive
optimization or being unhappy about an optimization transformation.  If a
programmer really wants a volatile global register variable, he or she
should use the volatile modifier.

	We can ensure that volatile is used consistently throughout the
compiler.  I want to avoid GCC testing "volatile or global register"
because someone is going to forget to add the latter and introduce a
latent bug.  That likely will lead to long arguments about what the
semantics should be.  I want to stop that thread before it starts.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
  2005-11-10 21:18                                   ` Mark Mitchell
  2005-11-11  3:10                                     ` David Edelsohn
@ 2005-11-11  3:37                                     ` David Edelsohn
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-11-11  3:37 UTC (permalink / raw)
  To: Mark Mitchell, Joseph S. Myers; +Cc: gcc-patches

>>>>> Mark Mitchell writes:

Mark> So, the patch is OK, once the documentation is added.

	Appended is the revised patch that I believe addresses the
comments.  I will check it in on Friday, unless I hear otherwise.

Thanks, David

	PR 24644
	* common.opt (Wvolatile-register-var): New.
	* varasm.c (make_decl_rtl): Only emit warning when option
	specified.
	* doc/invoke.texi (Wvolatile-register-var): Document new option.

Index: common.opt
===================================================================
*** common.opt	(revision 106727)
--- common.opt	(working copy)
*************** Wunused-variable
*** 173,178 ****
--- 173,182 ----
  Common Var(warn_unused_variable)
  Warn when a variable is unused
  
+ Wvolatile-register-var
+ Common
+ Warn when a register variable is declared volatile
+ 
  aux-info
  Common Separate
  -aux-info <file>	Emit declaration information into <file>
Index: varasm.c
===================================================================
*** varasm.c	(revision 106727)
--- varasm.c	(working copy)
*************** make_decl_rtl (tree decl)
*** 955,962 ****
  	      error ("global register variable has initial value");
  	    }
  	  if (TREE_THIS_VOLATILE (decl))
! 	    warning (0, "volatile register variables don%'t "
! 		     "work as you might wish");
  
  	  /* If the user specified one of the eliminables registers here,
  	     e.g., FRAME_POINTER_REGNUM, we don't want to get this variable
--- 955,963 ----
  	      error ("global register variable has initial value");
  	    }
  	  if (TREE_THIS_VOLATILE (decl))
! 	    warning (OPT_Wvolatile_register_var,
! 		     "optimization may eliminate reads and/or "
! 		     "writes to register variables");
  
  	  /* If the user specified one of the eliminables registers here,
  	     e.g., FRAME_POINTER_REGNUM, we don't want to get this variable
Index: doc/invoke.texi
===================================================================
*** doc/invoke.texi	(revision 106727)
--- doc/invoke.texi	(working copy)
*************** Objective-C and Objective-C++ Dialects}.
*** 245,251 ****
  -Wunknown-pragmas  -Wno-pragmas -Wunreachable-code @gol
  -Wunused  -Wunused-function  -Wunused-label  -Wunused-parameter @gol
  -Wunused-value  -Wunused-variable  -Wvariadic-macros @gol
! -Wwrite-strings}
  
  @item C-only Warning Options
  @gccoptlist{-Wbad-function-cast  -Wmissing-declarations @gol
--- 245,251 ----
  -Wunknown-pragmas  -Wno-pragmas -Wunreachable-code @gol
  -Wunused  -Wunused-function  -Wunused-label  -Wunused-parameter @gol
  -Wunused-value  -Wunused-variable  -Wvariadic-macros @gol
! -Wvolatile-register-var  -Wwrite-strings}
  
  @item C-only Warning Options
  @gccoptlist{-Wbad-function-cast  -Wmissing-declarations @gol
*************** only when @option{-pedantic} flag is use
*** 3369,3374 ****
--- 3369,3381 ----
  Warn if variadic macros are used in pedantic ISO C90 mode, or the GNU
  alternate syntax when in pedantic ISO C99 mode.  This is default.
  To inhibit the warning messages, use @option{-Wno-variadic-macros}.
+ 
+ @item -Wvolatile-register-var
+ @opindex Wvolatile-register-var
+ @opindex Wno-volatile-register-var
+ Warn if a register variable is declared volatile.  The volatile
+ modifier does not inhibit all optimizations that may eliminate reads
+ and/or writes to register variables.
  
  @item -Wdisabled-optimization
  @opindex Wdisabled-optimization

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] volatile global register variable
  2005-11-11  3:10                                     ` David Edelsohn
@ 2005-11-11  5:50                                       ` Mark Mitchell
  0 siblings, 0 replies; 875+ messages in thread
From: Mark Mitchell @ 2005-11-11  5:50 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

David Edelsohn wrote:

> 	My concern is not about volatile, but about global register
> variables having overloaded and ambiguous semantics.  I specifically think
> we should avoid global register variables implying volatile.

I'm sorry to have misunderstood originally.

I definitely agree that global register variables should be volatile
only if explicitly declared as such.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [RS6000] Add some more functions to ppc64-fp.c
@ 2005-11-25  4:49 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-11-25  4:49 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This provides the extra ppc64 DImode functions corresponding to the
libgcc2 additions Joseph made in
http://gcc.gnu.org/ml/gcc-patches/2005-11/msg01538.html

	* config/rs6000/ppc64-fp.c (__floatunditf): New function.
	(__floatundidf, __floatundisf): Likewise.

Bootstrapped and regression tested powerpc64-linux.  OK to apply?
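
For illustration only, and not part of the patch: the new unsigned
conversions split the 64-bit input into two 32-bit halves, convert each
half exactly, and recombine.  __floatundisf additionally folds the low
bits that a double cannot hold into a single "sticky" bit, so that the
final double-to-float step is the only inexact rounding.  A portable C
sketch of both ideas follows.  The names u64_to_double and u64_to_float
are hypothetical, uint64_t/uint32_t stand in for UDItype/USItype, and
IEEE double (53-bit) and float (24-bit) formats are assumed.

```c
#include <stdint.h>

/* Convert an unsigned 64-bit value to double via two 32-bit halves.
   Each half converts exactly and the scaling by 2^32 is exact, so the
   only rounding happens in the final addition.  */
static double
u64_to_double (uint64_t u)
{
  double d = (double) (uint32_t) (u >> 32);

  d *= 4294967296.0;		/* 2^32 */
  d += (double) (uint32_t) (u & 0xffffffffu);
  return d;
}

/* Convert to float through double without rounding twice.  If U needs
   more than 53 bits, replace the low 11 bits by one sticky bit: the
   adjusted value fits a double exactly and still rounds to the same
   float as the original.  */
static float
u64_to_float (uint64_t u)
{
  const uint64_t low_mask = ((uint64_t) 1 << (64 - 53)) - 1;

  if (u >= ((uint64_t) 1 << 53) && (u & low_mask) != 0)
    {
      u &= ~low_mask;
      u |= (uint64_t) 1 << (64 - 53);
    }
  return (float) u64_to_double (u);
}
```

Without the sticky bit, an input such as 2^60 + 2^36 + 1 would first
round to 2^60 + 2^36 as a double, which sits exactly halfway between
two floats, and the second rounding would then go the wrong way.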

Index: gcc/config/rs6000/ppc64-fp.c
===================================================================
--- gcc/config/rs6000/ppc64-fp.c	(revision 107416)
+++ gcc/config/rs6000/ppc64-fp.c	(working copy)
@@ -39,8 +39,11 @@ extern DItype __fixsfdi (SFtype);
 extern USItype __fixunsdfsi (DFtype);
 extern USItype __fixunssfsi (SFtype);
 extern TFtype __floatditf (DItype);
+extern TFtype __floatunditf (UDItype);
 extern DFtype __floatdidf (DItype);
+extern DFtype __floatundidf (UDItype);
 extern SFtype __floatdisf (DItype);
+extern SFtype __floatundisf (UDItype);
 extern DItype __fixunstfdi (TFtype);
 
 static DItype local_fixunssfdi (SFtype);
@@ -100,6 +103,18 @@ __floatditf (DItype u)
   return (TFtype) dh + (TFtype) dl;
 }
 
+TFtype
+__floatunditf (UDItype u)
+{
+  DFtype dh, dl;
+
+  dh = (USItype) (u >> (sizeof (SItype) * 8));
+  dh *= 2.0 * (((UDItype) 1) << ((sizeof (SItype) * 8) - 1));
+  dl = (USItype) (u & ((((UDItype) 1) << (sizeof (SItype) * 8)) - 1));
+
+  return (TFtype) dh + (TFtype) dl;
+}
+
 DFtype
 __floatdidf (DItype u)
 {
@@ -112,6 +127,18 @@ __floatdidf (DItype u)
   return d;
 }
 
+DFtype
+__floatundidf (UDItype u)
+{
+  DFtype d;
+
+  d = (USItype) (u >> (sizeof (SItype) * 8));
+  d *= 2.0 * (((UDItype) 1) << ((sizeof (SItype) * 8) - 1));
+  d += (USItype) (u & ((((UDItype) 1) << (sizeof (SItype) * 8)) - 1));
+
+  return d;
+}
+
 SFtype
 __floatdisf (DItype u)
 {
@@ -137,6 +164,30 @@ __floatdisf (DItype u)
   return (SFtype) f;
 }
 
+SFtype
+__floatundisf (UDItype u)
+{
+  DFtype f;
+
+  if (53 < (sizeof (DItype) * 8)
+      && 53 > ((sizeof (DItype) * 8) - 53 + 24))
+    {
+      if (u >= ((UDItype) 1 << 53))
+        {
+          if ((UDItype) u & (((UDItype) 1 << ((sizeof (DItype) * 8) - 53)) - 1))
+            {
+              u &= ~ (((UDItype) 1 << ((sizeof (DItype) * 8) - 53)) - 1);
+              u |= ((UDItype) 1 << ((sizeof (DItype) * 8) - 53));
+            }
+        }
+    }
+  f = (USItype) (u >> (sizeof (SItype) * 8));
+  f *= 2.0 * (((UDItype) 1) << ((sizeof (SItype) * 8) - 1));
+  f += (USItype) (u & ((((UDItype) 1) << (sizeof (SItype) * 8)) - 1));
+
+  return (SFtype) f;
+}
+
 DItype
 __fixunstfdi (TFtype a)
 {

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RS6000] Add some more functions to ppc64-fp.c
       [not found]           ` <amodra@bigpond.net.au>
                               ` (46 preceding siblings ...)
  2005-11-08  2:59             ` [PowerPC] Fix PR23704, -m64 overrides prior -mno-powerpc-gfxopt David Edelsohn
@ 2005-11-25  5:15             ` David Edelsohn
  2005-11-27 23:25             ` [PowerPC] -msdata=data needless use of .sbss section David Edelsohn
                               ` (13 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-11-25  5:15 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/ppc64-fp.c (__floatunditf): New function.
	(__floatundidf, __floatundisf): Likewise.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] -msdata=data needless use of .sbss section
  2005-10-20 11:00 [PowerPC] -msdata=data needless use of .sbss section Alan Modra
  2005-10-20 16:28 ` David Edelsohn
@ 2005-11-27 23:21 ` Alan Modra
  1 sibling, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-11-27 23:21 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn

On Thu, Oct 20, 2005 at 08:30:47PM +0930, Alan Modra wrote:
> This patch fixes an inconsistency between rs6000_elf_in_small_data_p and
> ASM_OUTPUT_ALIGNED_LOCAL in the treatment of local items under
> -msdata=data.  As the comment in rs6000_elf_in_small_data_p says:
> 	  /* If it's not public, and we're not going to reference it there,
> 	     there's no need to put it in the small data section.  */
> 
> So there isn't much point in having ASM_OUTPUT_ALIGNED_LOCAL place
> static variables in .sbss when -msdata=data.

Revised version using rs6000_elf_in_small_data_p.  Bootstrapped and
regression tested powerpc-linux.  OK for 4.2?
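
For illustration only, and not part of the patch: the behavioural
difference can be modelled in a few lines of C.  The "old" test below
mirrors the condition removed from ASM_OUTPUT_ALIGNED_LOCAL in the
sysv4.h hunk.  The "new" test is an assumption about what
rs6000_elf_in_small_data_p decides, inferred from the comment quoted
above; it ignores any other cases the real function handles.  The
struct, the function names and the enum are hypothetical stand-ins.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the -msdata= setting.  */
enum sdata_mode { SDATA_NONE, SDATA_DATA, SDATA_SYSV, SDATA_EABI };

/* Hypothetical stand-in for the properties of a decl that matter.  */
struct item
{
  unsigned long size;		/* size in bytes */
  bool is_public;		/* external linkage?  */
};

/* Old macro condition: any small object goes to .sbss.  */
static bool
old_local_in_sbss (enum sdata_mode mode, struct item it, unsigned long g)
{
  return mode != SDATA_NONE && it.size > 0 && it.size <= g;
}

/* Assumed new condition: under -msdata=data only public objects are
   treated as small data, since locals are not referenced via r13.  */
static bool
new_in_small_data (enum sdata_mode mode, struct item it, unsigned long g)
{
  return (mode != SDATA_NONE
	  && it.size > 0 && it.size <= g
	  && (mode != SDATA_DATA || it.is_public));
}
```

With a limit of 8 bytes, a 4-byte static under -msdata=data satisfies
the old condition but not the new one, which is the inconsistency the
patch removes.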

	* doc/invoke.texi (powerpc msdata-data): Static data doesn't go in
	small data sections.
	* config/rs6000/rs6000.c (rs6000_elf_in_small_data_p): Make global.
	* config/rs6000/rs6000-protos.h: (rs6000_elf_in_small_data_p): Declare.
	* config/rs6000/sysv4.h (ASM_OUTPUT_ALIGNED_LOCAL): Rename to..
	(ASM_OUTPUT_ALIGNED_DECL_LOCAL): ..this, adding extra parm.  Don't
	output locals to sbss if !rs6000_elf_in_small_data_p.
	(ASM_OUTPUT_ALIGNED_BSS): Adjust for above.

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 107562)
+++ gcc/doc/invoke.texi	(working copy)
@@ -11507,9 +11507,9 @@ same as @option{-msdata=sysv}.
 
 @item -msdata-data
 @opindex msdata-data
-On System V.4 and embedded PowerPC systems, put small global and static
-data in the @samp{.sdata} section.  Put small uninitialized global and
-static data in the @samp{.sbss} section.  Do not use register @code{r13}
+On System V.4 and embedded PowerPC systems, put small global
+data in the @samp{.sdata} section.  Put small uninitialized global
+data in the @samp{.sbss} section.  Do not use register @code{r13}
 to address small data however.  This is the default behavior unless
 other @option{-msdata} options are used.
 
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 107562)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -615,7 +615,6 @@ static void rs6000_elf_select_rtx_sectio
 					   unsigned HOST_WIDE_INT);
 static void rs6000_elf_encode_section_info (tree, rtx, int)
      ATTRIBUTE_UNUSED;
-static bool rs6000_elf_in_small_data_p (tree);
 #endif
 #if TARGET_XCOFF
 static void rs6000_xcoff_asm_globalize_label (FILE *, const char *);
@@ -17427,7 +17450,7 @@ rs6000_elf_encode_section_info (tree dec
     }
 }
 
-static bool
+bool
 rs6000_elf_in_small_data_p (tree decl)
 {
   if (rs6000_sdata == SDATA_NONE)
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 107562)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -123,6 +122,7 @@ extern rtx rs6000_libcall_value (enum ma
 extern rtx rs6000_va_arg (tree, tree);
 extern int function_ok_for_sibcall (tree);
 extern void rs6000_elf_declare_function_name (FILE *, const char *, tree);
+extern bool rs6000_elf_in_small_data_p (tree);
 #ifdef ARGS_SIZE_RTX
 /* expr.h defines ARGS_SIZE_RTX and `enum direction' */
 extern enum direction function_arg_padding (enum machine_mode, tree);
Index: gcc/config/rs6000/sysv4.h
===================================================================
--- gcc/config/rs6000/sysv4.h	(revision 107562)
+++ gcc/config/rs6000/sysv4.h	(working copy)
@@ -550,12 +550,10 @@ extern int rs6000_pic_labelno;
 
 #define	LCOMM_ASM_OP	"\t.lcomm\t"
 
-/* Override elfos.h definition.  */
-#undef	ASM_OUTPUT_ALIGNED_LOCAL
-#define	ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)		\
+/* Describe how to emit uninitialized local items.  */
+#define	ASM_OUTPUT_ALIGNED_DECL_LOCAL(FILE, DECL, NAME, SIZE, ALIGN)	\
 do {									\
-  if (rs6000_sdata != SDATA_NONE && (SIZE) > 0				\
-      && (SIZE) <= g_switch_value)					\
+  if ((DECL) && rs6000_elf_in_small_data_p (DECL))			\
     {									\
       sbss_section ();							\
       ASM_OUTPUT_ALIGN (FILE, exact_log2 (ALIGN / BITS_PER_UNIT));	\
@@ -577,7 +575,7 @@ do {									\
 /* Describe how to emit uninitialized external linkage items.  */
 #define	ASM_OUTPUT_ALIGNED_BSS(FILE, DECL, NAME, SIZE, ALIGN)		\
 do {									\
-  ASM_OUTPUT_ALIGNED_LOCAL (FILE, NAME, SIZE, ALIGN);			\
+  ASM_OUTPUT_ALIGNED_DECL_LOCAL (FILE, DECL, NAME, SIZE, ALIGN);	\
 } while (0)
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] -msdata=data needless use of .sbss section
       [not found]           ` <amodra@bigpond.net.au>
                               ` (47 preceding siblings ...)
  2005-11-25  5:15             ` [RS6000] Add some more functions to ppc64-fp.c David Edelsohn
@ 2005-11-27 23:25             ` David Edelsohn
  2005-11-28  2:51             ` [PowerPC] PR24997: ICE with -ftree-vectorize David Edelsohn
                               ` (12 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-11-27 23:25 UTC (permalink / raw)
  To: gcc-patches

	* doc/invoke.texi (powerpc msdata-data): Static data doesn't go in
	small data sections.
	* config/rs6000/rs6000.c (rs6000_elf_in_small_data_p): Make global.
	* config/rs6000/rs6000-protos.h: (rs6000_elf_in_small_data_p): Declare.
	* config/rs6000/sysv4.h (ASM_OUTPUT_ALIGNED_LOCAL): Rename to..
	(ASM_OUTPUT_ALIGNED_DECL_LOCAL): ..this, adding extra parm.  Don't
	output locals to sbss if !rs6000_elf_in_small_data_p.
	(ASM_OUTPUT_ALIGNED_BSS): Adjust for above.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] PR24997: ICE with -ftree-vectorize
@ 2005-11-28  2:26 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-11-28  2:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch fixes an ICE and also makes better use of indexed addressing
modes.  See http://gcc.gnu.org/ml/gcc/2005-11/msg01218.html.  While
making the necessary change to indexed_or_indirect_operand, I noticed
indexed_or_indirect_address and saw that it too should accept reloaded
indexed addresses, so I factored that code out.  Also, testing
address_operand after we have discarded address patterns we won't match
should speed the compiler a little.  address_operand calls
rs6000_legitimate_address which is reasonably complex.

Notes: 1) Using define_special_predicate for indexed_or_indirect_address
avoids relying on gen-preds noticing that it is special from the
address_operand call.  I originally put the address_operand call inside
the match_test, and managed to get an ICE due to gen-preds adding mode
tests that we don't want.  Calling address_operand via rtl lets
gen-preds see the call, so define_special_predicate isn't necessary but
I figure it is safer to mark it so.
2) Removing memory_operand from indexed_or_indirect_operand doesn't lose
us anything.  We get the same tests via address_operand.

Bootstrapped and regression tested powerpc64-linux.  OK for 4.2 and 4.1?
Can this go on 4.0 too?  I believe that the same problem exists in 4.0,
but just isn't triggered with the pr24997 testcase.

	PR target/24997
	* config/rs6000/rs6000.c (legitimate_indexed_address_p): Allow pattern
	generated by reload.
	* config/rs6000/predicates.md (indexed_or_indirect_operand): Use
	indexed_or_indirect_address.
	(indexed_or_indirect_address): Don't test for base reg.  Call
	address_operand last.  Make explicit that this is a special predicate.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 107562)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -2731,13 +2730,23 @@ legitimate_indexed_address_p (rtx x, int
   op0 = XEXP (x, 0);
   op1 = XEXP (x, 1);
 
-  if (!REG_P (op0) || !REG_P (op1))
-    return false;
-
-  return ((INT_REG_OK_FOR_BASE_P (op0, strict)
-	   && INT_REG_OK_FOR_INDEX_P (op1, strict))
-	  || (INT_REG_OK_FOR_BASE_P (op1, strict)
-	      && INT_REG_OK_FOR_INDEX_P (op0, strict)));
+  if (REG_P (op0) && REG_P (op1))
+    return ((INT_REG_OK_FOR_BASE_P (op0, strict)
+	     && INT_REG_OK_FOR_INDEX_P (op1, strict))
+	    || (INT_REG_OK_FOR_BASE_P (op1, strict)
+		&& INT_REG_OK_FOR_INDEX_P (op0, strict)));
+
+  /* Recognize the rtl generated by reload which we know will later be
+     replaced by a base reg.  We rely on nothing but reload generating
+     this particular pattern, a reasonable assumption because it is not
+     canonical.  */
+  else if (reload_in_progress
+	   && GET_CODE (op0) == PLUS
+	   && REG_P (XEXP (op0, 0))
+	   && GET_CODE (XEXP (op0, 1)) == CONST_INT
+	   && REG_P (op1))
+    return INT_REG_OK_FOR_INDEX_P (op1, strict);
+  return false;
 }
 
 inline bool
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 107562)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -353,25 +353,6 @@
 					   || reload_in_progress,
 					   mode, XEXP (op, 0))")))
 
-;; Return 1 if the operand is an indexed or indirect memory operand.
-(define_predicate "indexed_or_indirect_operand"
-  (match_operand 0 "memory_operand")
-{
-  rtx tmp = XEXP (op, 0);
-
-  if (TARGET_ALTIVEC
-      && ALTIVEC_VECTOR_MODE (mode)
-      && GET_CODE (tmp) == AND
-      && GET_CODE (XEXP (tmp, 1)) == CONST_INT
-      && INTVAL (XEXP (tmp, 1)) == -16)
-    tmp = XEXP (tmp, 0);
-
-    return REG_P (tmp)
-		  || (GET_CODE (tmp) == PLUS
-		      && REG_P (XEXP (tmp, 0)) 
-		      && REG_P (XEXP (tmp, 1)));
-})
-
 ;; Return 1 if the operand is a memory operand with an address divisible by 4
 (define_predicate "word_offset_memref_operand"
   (and (match_operand 0 "memory_operand")
@@ -380,13 +361,28 @@
 		    || GET_CODE (XEXP (XEXP (op, 0), 1)) != CONST_INT
 		    || INTVAL (XEXP (XEXP (op, 0), 1)) % 4 == 0")))
 
+;; Return 1 if the operand is an indexed or indirect memory operand.
+(define_predicate "indexed_or_indirect_operand"
+  (match_code "mem")
+{
+  op = XEXP (op, 0);
+  if (TARGET_ALTIVEC
+      && ALTIVEC_VECTOR_MODE (mode)
+      && GET_CODE (op) == AND
+      && GET_CODE (XEXP (op, 1)) == CONST_INT
+      && INTVAL (XEXP (op, 1)) == -16)
+    op = XEXP (op, 0);
+
+  return indexed_or_indirect_address (op, mode);
+})
+
 ;; Return 1 if the operand is an indexed or indirect address.
-(define_predicate "indexed_or_indirect_address"
-  (and (match_operand 0 "address_operand")
-       (match_test "REG_P (op)
+(define_special_predicate "indexed_or_indirect_address"
+  (and (match_test "REG_P (op)
 		    || (GET_CODE (op) == PLUS
-			&& REG_P (XEXP (op, 0)) 
-			&& REG_P (XEXP (op, 1)))")))
+			/* Omit testing REG_P (XEXP (op, 0)).  */
+			&& REG_P (XEXP (op, 1)))")
+       (match_operand 0 "address_operand")))
 
 ;; Used for the destination of the fix_truncdfsi2 expander.
 ;; If stfiwx will be used, the result goes to memory; otherwise,

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] PR24997: ICE with -ftree-vectorize
       [not found]           ` <amodra@bigpond.net.au>
                               ` (48 preceding siblings ...)
  2005-11-27 23:25             ` [PowerPC] -msdata=data needless use of .sbss section David Edelsohn
@ 2005-11-28  2:51             ` David Edelsohn
  2005-12-07 13:35             ` [PowerPC] Fix pr25212, indexed address predicates David Edelsohn
                               ` (11 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-11-28  2:51 UTC (permalink / raw)
  To: gcc-patches

	PR target/24997
	* config/rs6000/rs6000.c (legitimate_indexed_address_p): Allow pattern
	generated by reload.
	* config/rs6000/predicates.md (indexed_or_indirect_operand): Use
	indexed_or_indirect_address.
	(indexed_or_indirect_address): Don't test for base reg.  Call
	address_operand last.  Make explicit that this is a special predicate.

Okay.

Can you add it to mainline first and let's see what happens for a few days
before adding to 4.1 and possibly 4.0?

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] Fix pr25212, indexed address predicates
@ 2005-12-07  5:09 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-12-07  5:09 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

pr25212 is an ICE caused by attempting to reload an altivec indexed mem
address as an indirect address.  The resulting reload expression,
(set (reg) (and (plus (reg) (reg)) (const_int))), is too complex for any
rs6000 insn, and reload itself doesn't try to simplify it.

The root cause of this problem is that reload is tossing out a perfectly
good indexed address as invalid, because the rs6000 predicates are too
restrictive during reload:  legitimate_indexed_address_p only allows
valid base/index hard regs, or pseudos (and the PLUS I added for
pr24997).  In particular, they disallow invalid hard regs such as lr,
yet reload will arrange for an invalid reg to be reloaded into a valid
one.  All MEMs are processed by find_reloads_address early in
find_reloads.

So, as Ian Lance Taylor suggested, this patch simply relaxes the
predicate to accept more indexed addresses when !strict, trusting that
reload will ensure they match in the strict sense after reloading.  No
doubt other powerpc predicates could also be relaxed similarly, but I'll
leave that to later sometime when I feel like wrestling with reload
again.

As mentioned previously this also fixes INT_REG_OK_FOR_BASE_P, which can
return true for a pseudo reg that has reg_renumber[REGNO] == 0.
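
For illustration only, and not part of the patch: a toy model of the
INT_REG_OK_FOR_BASE_P problem.  The register numbering (integer hard
regs 0..31, pseudos from 64 up), the renumber table passed as a
parameter, and all function names are hypothetical, and the argument
and frame pointer cases are left out.  The helpers regno_ok_for_index
and regno_ok_for_base are assumed to behave like REGNO_OK_FOR_INDEX_P
and REGNO_OK_FOR_BASE_P, that is, to look through reg_renumber for
pseudos.

```c
#include <stdbool.h>

#define FIRST_PSEUDO 64		/* hypothetical first pseudo number */

/* Hard register behind regno N: N itself, or its renumber entry.  */
static int
hard_regno (const int *renumber, int n)
{
  return n < FIRST_PSEUDO ? n : renumber[n];
}

static bool
regno_ok_for_index (const int *renumber, int n)
{
  int h = hard_regno (renumber, n);
  return h >= 0 && h <= 31;
}

/* r0 reads as zero when used as a base, so it is excluded.  */
static bool
regno_ok_for_base (const int *renumber, int n)
{
  int h = hard_regno (renumber, n);
  return h > 0 && h <= 31;
}

/* Old macro: tests the pseudo's own number against 0, not the hard
   register it was assigned.  */
static bool
old_base_ok (const int *renumber, int n, bool strict)
{
  return (n > 0
	  && ((!strict && (n <= 31 || n >= FIRST_PSEUDO))
	      || (strict && regno_ok_for_index (renumber, n))));
}

/* New macro: any pseudo when non-strict, else a valid base reg.  */
static bool
new_base_ok (const int *renumber, int n, bool strict)
{
  return (!strict && n >= FIRST_PSEUDO) || regno_ok_for_base (renumber, n);
}
```

For a pseudo whose renumber entry is 0, the old test accepts it as a
strict base register while the new test rejects it.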

This has passed a "make quickstrap" followed by "make check" in gcc/,
ie. running gcc, g++, gfortran, objc testsuites without regressions on
both powerpc-linux and powerpc64-linux.  I'm running full bootstraps as
well, but these haven't completed.  OK for mainline and 4.1?

:ADDPATCH target:

	PR target/25212
	* config/rs6000/rs6000.c (legitimate_indexed_address_p): Relax
	tests further when !strict && reload_in_progress.
	(print_operand): Check that both operands of indexed address are regs.
	(print_operand_address): Likewise.
	* config/rs6000/rs6000.h (INT_REG_OK_FOR_INDEX_P): Simplify.
	(INT_REG_OK_FOR_BASE_P): Correct.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 108055)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -2726,23 +2726,19 @@ legitimate_indexed_address_p (rtx x, int
   op0 = XEXP (x, 0);
   op1 = XEXP (x, 1);
 
-  if (REG_P (op0) && REG_P (op1))
-    return ((INT_REG_OK_FOR_BASE_P (op0, strict)
-	     && INT_REG_OK_FOR_INDEX_P (op1, strict))
-	    || (INT_REG_OK_FOR_BASE_P (op1, strict)
-		&& INT_REG_OK_FOR_INDEX_P (op0, strict)));
-
   /* Recognize the rtl generated by reload which we know will later be
-     replaced by a base reg.  We rely on nothing but reload generating
-     this particular pattern, a reasonable assumption because it is not
-     canonical.  */
-  else if (reload_in_progress
-	   && GET_CODE (op0) == PLUS
-	   && REG_P (XEXP (op0, 0))
-	   && GET_CODE (XEXP (op0, 1)) == CONST_INT
-	   && REG_P (op1))
-    return INT_REG_OK_FOR_INDEX_P (op1, strict);
-  return false;
+     replaced with proper base and index regs.  */
+  if (!strict
+      && reload_in_progress
+      && (REG_P (op0) || GET_CODE (op0) == PLUS)
+      && REG_P (op1))
+    return true;
+
+  return (REG_P (op0) && REG_P (op1)
+	  && ((INT_REG_OK_FOR_BASE_P (op0, strict)
+	       && INT_REG_OK_FOR_INDEX_P (op1, strict))
+	      || (INT_REG_OK_FOR_BASE_P (op1, strict)
+		  && INT_REG_OK_FOR_INDEX_P (op0, strict))));
 }
 
 inline bool
@@ -10667,7 +10677,8 @@ print_operand (FILE *file, rtx x, int co
 	else
 	  {
 	    gcc_assert (GET_CODE (tmp) == PLUS
-			&& GET_CODE (XEXP (tmp, 1)) == REG);
+			&& REG_P (XEXP (tmp, 0))
+			&& REG_P (XEXP (tmp, 1)));
 
 	    if (REGNO (XEXP (tmp, 0)) == 0)
 	      fprintf (file, "%s,%s", reg_names[ REGNO (XEXP (tmp, 1)) ],
@@ -10727,6 +10738,7 @@ print_operand_address (FILE *file, rtx x
     }
   else if (GET_CODE (x) == PLUS && GET_CODE (XEXP (x, 1)) == REG)
     {
+      gcc_assert (REG_P (XEXP (x, 0)));
       if (REGNO (XEXP (x, 0)) == 0)
 	fprintf (file, "%s,%s", reg_names[ REGNO (XEXP (x, 1)) ],
 		 reg_names[ REGNO (XEXP (x, 0)) ]);
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 108055)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1723,17 +1751,14 @@ typedef struct rs6000_args
 /* Nonzero if X is a hard reg that can be used as an index
    or if it is a pseudo reg in the non-strict case.  */
 #define INT_REG_OK_FOR_INDEX_P(X, STRICT)			\
-  ((! (STRICT)							\
-    && (REGNO (X) <= 31						\
-	|| REGNO (X) == ARG_POINTER_REGNUM			\
-	|| REGNO (X) == FRAME_POINTER_REGNUM			\
-	|| REGNO (X) >= FIRST_PSEUDO_REGISTER))			\
-   || ((STRICT) && REGNO_OK_FOR_INDEX_P (REGNO (X))))
+  ((!(STRICT) && REGNO (X) >= FIRST_PSEUDO_REGISTER)		\
+   || REGNO_OK_FOR_INDEX_P (REGNO (X)))
 
 /* Nonzero if X is a hard reg that can be used as a base reg
    or if it is a pseudo reg in the non-strict case.  */
 #define INT_REG_OK_FOR_BASE_P(X, STRICT)			\
-  (REGNO (X) > 0 && INT_REG_OK_FOR_INDEX_P (X, (STRICT)))
+  ((!(STRICT) && REGNO (X) >= FIRST_PSEUDO_REGISTER)		\
+   || REGNO_OK_FOR_BASE_P (REGNO (X)))
 
 #define REG_OK_FOR_INDEX_P(X) INT_REG_OK_FOR_INDEX_P (X, REG_OK_STRICT_FLAG)
 #define REG_OK_FOR_BASE_P(X)  INT_REG_OK_FOR_BASE_P (X, REG_OK_STRICT_FLAG)

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix pr25212, indexed address predicates
       [not found]           ` <amodra@bigpond.net.au>
                               ` (49 preceding siblings ...)
  2005-11-28  2:51             ` [PowerPC] PR24997: ICE with -ftree-vectorize David Edelsohn
@ 2005-12-07 13:35             ` David Edelsohn
  2005-12-10 18:14             ` [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs David Edelsohn
                               ` (10 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-12-07 13:35 UTC (permalink / raw)
  To: gcc-patches

	PR target/25212
	* config/rs6000/rs6000.c (legitimate_indexed_address_p): Relax
	tests further when !strict && reload_in_progress.
	(print_operand): Check that both operands of indexed address are regs.
	(print_operand_address): Likewise.
	* config/rs6000/rs6000.h (INT_REG_OK_FOR_INDEX_P): Simplify.
	(INT_REG_OK_FOR_BASE_P): Correct.

Okay if bootstrap and regression passes.  Please wait a few days to make
sure there are no problems on mainline before backporting to 4.1.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs.
@ 2005-12-09  1:59 Alan Modra
  2005-12-09 16:15 ` Andrew Pinski
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-12-09  1:59 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

I noticed this when looking at PR25299.  Since powerpc64-linux now
defaults to natural alignment, we ought to compile target libs like
that.  Not that it really matters, because I don't think we have any
structs that start with a double.  Still..

Bootstrap in progress.  OK mainline?

	* config/rs6000/linux64.h (TARGET_ALIGN_NATURAL): Define.

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 108256)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -236,6 +239,12 @@ extern int dot_symbols;
    ? rs6000_special_round_type_align (STRUCT, COMPUTED, SPECIFIED)	\
    : MAX ((COMPUTED), (SPECIFIED)))
 
+/* Use the default for compiling target libs.  */
+#ifdef IN_TARGET_LIBS
+#undef TARGET_ALIGN_NATURAL
+#define TARGET_ALIGN_NATURAL 1
+#endif
+
 /* Indicate that jump tables go in the text section.  */
 #undef  JUMP_TABLES_IN_TEXT_SECTION
 #define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs.
  2005-12-09  1:59 [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs Alan Modra
@ 2005-12-09 16:15 ` Andrew Pinski
       [not found]   ` <20051212022850.GI1563@bubble.grove.modra.org>
  0 siblings, 1 reply; 875+ messages in thread
From: Andrew Pinski @ 2005-12-09 16:15 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

> 
> I noticed this when looking at PR25299.  Since powerpc64-linux now
> defaults to natural alignment, we ought to compile target libs like
> that.  Not that it really matters, because I don't think we have any
> structs that start with a double.  Still..
> 
> Bootstrap in progress.  OK mainline?

Actually it does matter but only to libobjc which I have been trying
recently to fix.  This really should go on all opened branches.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs.
       [not found]           ` <amodra@bigpond.net.au>
                               ` (50 preceding siblings ...)
  2005-12-07 13:35             ` [PowerPC] Fix pr25212, indexed address predicates David Edelsohn
@ 2005-12-10 18:14             ` David Edelsohn
  2005-12-15  7:04             ` [PowerPC] Fix 25406, rs6000_special_round_type_align David Edelsohn
                               ` (9 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-12-10 18:14 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/linux64.h (TARGET_ALIGN_NATURAL): Define.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs.
       [not found]     ` <26af44d2c30b82a28cfc4fc6e5df2e0d@physics.uc.edu>
@ 2005-12-12  3:34       ` Alan Modra
  2005-12-12  6:35         ` Mark Mitchell
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-12-12  3:34 UTC (permalink / raw)
  To: Andrew Pinski, gcc-patches, David Edelsohn, Mark Mitchell,
	Gabriel Dos Reis

On Sun, Dec 11, 2005 at 09:36:30PM -0500, Andrew Pinski wrote:
> libobjc uses the GCC's headers for relayout a struct to see what the
> size (and alignment) is.  This is what I am changing to not to use
> GCC's headers but have libobjc includes the ABI definition itself.
> 
> Try the following objc program:
> 
> #include <stdlib.h>
> #include <objc/encoding.h>
> 
> struct a
> {
>   int t;
>   double tt;
> };
> 
> int main (void)
> {
>   if (objc_sizeof_type (@encode (struct a)) != sizeof(struct a))
>     abort();
> }
> 
> This should not abort.  I am testing an automated testsuite which
> will show the problem any where.

Ick, I'd forgotten that objc_sizeof_type (@encode ...) means a horrible
dependency on the options used to compile libobjc.  Not having
TARGET_ALIGN_NATURAL set to the proper default when compiling the
library can cause a real problem on objc programs just using default
compiler options.
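
For illustration only: a simplified layout model shows why the two
alignment settings disagree on Andrew's struct.  The function name
struct_a_size is hypothetical, and a 4-byte int and 8-byte double are
assumed; the only input is the alignment given to a double that is not
the first member.

```c
#include <stddef.h>

/* Round N up to a multiple of A.  */
static size_t
round_up (size_t n, size_t a)
{
  return (n + a - 1) / a * a;
}

/* Size of struct { int t; double tt; } when a non-leading double is
   aligned to DOUBLE_ALIGN bytes.  */
static size_t
struct_a_size (size_t double_align)
{
  const size_t int_size = 4, double_size = 8;
  size_t struct_align = double_align > int_size ? double_align : int_size;
  size_t tt_offset = round_up (int_size, double_align);

  return round_up (tt_offset + double_size, struct_align);
}
```

With 8-byte (natural) alignment the model gives 16 bytes, and with
4-byte alignment it gives 12, so a libobjc built with one setting
miscomputes the size for a program built with the other.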

Of course the fix you Andrew is working on will be much better, but in
the meantime I'd like to apply
http://gcc.gnu.org/ml/gcc-patches/2005-12/msg00648.html to 3.4, 4.0 and
4.1.  This cures a regression from gcc-3.3 where the default alignment
on powerpc64 was still -malign-power.  OK?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs.
  2005-12-12  3:34       ` Alan Modra
@ 2005-12-12  6:35         ` Mark Mitchell
  2005-12-13  1:02           ` Mike Stump
  0 siblings, 1 reply; 875+ messages in thread
From: Mark Mitchell @ 2005-12-12  6:35 UTC (permalink / raw)
  To: Alan Modra; +Cc: Andrew Pinski, gcc-patches, David Edelsohn, Gabriel Dos Reis

Alan Modra wrote:

> Of course the fix you Andrew is working on will be much better, but in
> the meantime I'd like to apply
> http://gcc.gnu.org/ml/gcc-patches/2005-12/msg00648.html to 3.4, 4.0 and
> 4.1.  This cures a regression from gcc-3.3 where the default alignment
> on powerpc64 was still -malign-power.  OK?

For 4.0/4.1, I'm going to defer to the Objective-C maintainers; I just
don't have enough clue about this issue.

Thanks,

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs.
  2005-12-12  6:35         ` Mark Mitchell
@ 2005-12-13  1:02           ` Mike Stump
  2005-12-13  2:08             ` Gabriel Dos Reis
  0 siblings, 1 reply; 875+ messages in thread
From: Mike Stump @ 2005-12-13  1:02 UTC (permalink / raw)
  To: Mark Mitchell
  Cc: Alan Modra, Andrew Pinski, gcc-patches, David Edelsohn, Gabriel Dos Reis

On Dec 11, 2005, at 10:34 PM, Mark Mitchell wrote:
>> Of course the fix you Andrew is working on will be much better,  
>> but in
>> the meantime I'd like to apply
>> http://gcc.gnu.org/ml/gcc-patches/2005-12/msg00648.html to 3.4,  
>> 4.0 and
>> 4.1.  This cures a regression from gcc-3.3 where the default  
>> alignment
>> on powerpc64 was still -malign-power.  OK?
>
> For 4.0/4.1, I'm going to defer to the Objective-C maintainers; I just
> don't have enough clue about this issue.

I'm Ok with it.  I think it is reasonable.

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs.
  2005-12-13  1:02           ` Mike Stump
@ 2005-12-13  2:08             ` Gabriel Dos Reis
  0 siblings, 0 replies; 875+ messages in thread
From: Gabriel Dos Reis @ 2005-12-13  2:08 UTC (permalink / raw)
  To: Mike Stump
  Cc: Mark Mitchell, Alan Modra, Andrew Pinski, gcc-patches, David Edelsohn

Mike Stump <mrs@apple.com> writes:

| On Dec 11, 2005, at 10:34 PM, Mark Mitchell wrote:
| >> Of course the fix Andrew is working on will be much better,
| >> but in
| >> the meantime I'd like to apply
| >> http://gcc.gnu.org/ml/gcc-patches/2005-12/msg00648.html to 3.4,
| >> 4.0 and
| >> 4.1.  This cures a regression from gcc-3.3 where the default
| >> alignment
| >> on powerpc64 was still -malign-power.  OK?
| >
| > For 4.0/4.1, I'm going to defer to the Objective-C maintainers; I just
| > don't have enough clue about this issue.
| 
| I'm Ok with it.  I think it is reasonable.

then it is fine for 3.4.x too.

-- Gaby

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] Fix 25406, rs6000_special_round_type_align
@ 2005-12-14 14:56 Alan Modra
  2005-12-14 23:50 ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2005-12-14 14:56 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn, pinskia

This is obvious really, but I'm doing more than just fixing the bug so
I'll ask permission to commit.  Every place ROUND_TYPE_ALIGN is invoked,
the alignment values are unsigned int, so it makes sense for
rs6000_special_round_type_align to have unsigned int args.
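
As a rough illustration of the rounding rule involved, here is a minimal
standalone sketch.  It is a toy model, not GCC code: toy_type, toy_kind and
toy_round_type_align are hypothetical names standing in for tree nodes and
for rs6000_special_round_type_align, and TOY_ERROR plays the role of
error_mark_node.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC tree nodes; every name here is hypothetical.  */
enum toy_kind { TOY_INT, TOY_DOUBLE, TOY_ARRAY, TOY_ERROR };

struct toy_type
{
  enum toy_kind kind;
  const struct toy_type *elem;	/* Element type when kind == TOY_ARRAY.  */
};

#define TOY_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Start from MAX (computed, specified) and raise the result to 64 bits
   only when the first field, after stripping array layers, is a double.
   A missing field or an erroneous type leaves the alignment unchanged.
   All alignment values are unsigned, matching the callers.  */
unsigned int
toy_round_type_align (const struct toy_type *first_field,
		      unsigned int computed, unsigned int specified)
{
  unsigned int align = TOY_MAX (computed, specified);

  if (first_field != NULL)
    {
      const struct toy_type *type = first_field;

      /* Look through nested arrays to the innermost element type.  */
      while (type->kind == TOY_ARRAY)
	{
	  assert (type->elem != NULL);
	  type = type->elem;
	}

      if (type->kind == TOY_DOUBLE)
	align = TOY_MAX (align, 64);
    }

  return align;
}
```

With a double (or an array of doubles) as first field and computed and
specified alignments of 32, the sketch yields 64; with an int or an
erroneous first field it yields plain MAX (computed, specified).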

Tested by running make "check-gcc" on powerpc64-linux with site.exp
supplying GCC_UNDER_TEST to use -malign-power, and with TYPE_MODE
in the patch below (and in ADJUST_FIELD_ALIGN) replaced with
MY_TYPE_MODE.

#define MY_TYPE_MODE(X) (!TYPE_P (X) ? abort (), SImode : TYPE_MODE (X))

ie. Effectively turning on rtl checking just here.  A full
powerpc64-linux bootstrap takes forever with --enable-checking=all.

	* config/rs6000/rs6000.c (rs6000_special_round_type_align): Handle
	error_mark_node.  Make alignment params unsigned.
	* config/rs6000/rs6000-protos.h
	(rs6000_special_round_type_align): Update prototype.
	(rs6000_machopic_legitimize_pic_address): Remove arg names.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 108499)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -2505,21 +2505,27 @@ invalid_e500_subreg (rtx op, enum machin
    field is an FP double while the FP fields remain word aligned.  */
 
 unsigned int
-rs6000_special_round_type_align (tree type, int computed, int specified)
+rs6000_special_round_type_align (tree type, unsigned int computed,
+				 unsigned int specified)
 {
+  unsigned int align = MAX (computed, specified);
   tree field = TYPE_FIELDS (type);
 
   /* Skip all non field decls */
   while (field != NULL && TREE_CODE (field) != FIELD_DECL)
     field = TREE_CHAIN (field);
 
-  if (field == NULL || field == type
-      || TYPE_MODE (TREE_CODE (TREE_TYPE (field)) == ARRAY_TYPE
-		    ? get_inner_array_type (field)
-		    : TREE_TYPE (field)) != DFmode)
-    return MAX (computed, specified);
+  if (field != NULL && field != type)
+    {
+      type = TREE_TYPE (field);
+      while (TREE_CODE (type) == ARRAY_TYPE)
+	type = TREE_TYPE (type);
+
+      if (type != error_mark_node && TYPE_MODE (type) == DFmode)
+	align = MAX (align, 64);
+    }
 
-  return MAX (MAX (computed, specified), 64);
+  return align;
 }
 
 /* Return 1 for an operand in small memory on V.4/eabi.  */
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 108499)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -105,14 +105,13 @@ extern rtx rs6000_return_addr (int, rtx)
 extern void rs6000_output_symbol_ref (FILE*, rtx);
 extern HOST_WIDE_INT rs6000_initial_elimination_offset (int, int);
 
-extern rtx rs6000_machopic_legitimize_pic_address (rtx orig,
-						   enum machine_mode mode,
-						   rtx reg);
-
+extern rtx rs6000_machopic_legitimize_pic_address (rtx, enum machine_mode,
+						   rtx);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
-extern unsigned int rs6000_special_round_type_align (tree, int, int);
+extern unsigned int rs6000_special_round_type_align (tree, unsigned int,
+						     unsigned int);
 extern void function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode,
 				  tree, int, int);
 extern int function_arg_boundary (enum machine_mode, tree);

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix 25406, rs6000_special_round_type_align
  2005-12-14 14:56 [PowerPC] Fix 25406, rs6000_special_round_type_align Alan Modra
@ 2005-12-14 23:50 ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-12-14 23:50 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn, pinskia

On Thu, Dec 15, 2005 at 01:26:16AM +1030, Alan Modra wrote:
> #define MY_TYPE_MODE(X) (!TYPE_P (X) ? abort (), SImode : TYPE_MODE (X))
> 
> ie. Effectively turning on rtl checking just here.  A full
> powerpc64-linux bootstrap takes forever with --enable-checking=all.

Duh.  This isn't rtl checking, it is tree checking.  Which is turned on
by default anyway on the trunk.  So I didn't need to muck around like
this.  Just doing a -malign-power testsuite run is sufficient.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix 25406, rs6000_special_round_type_align
       [not found]           ` <amodra@bigpond.net.au>
                               ` (51 preceding siblings ...)
  2005-12-10 18:14             ` [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs David Edelsohn
@ 2005-12-15  7:04             ` David Edelsohn
  2005-12-28 15:07             ` [PowerPC] Fix PR25572, -mminimal-toc trashes r30 David Edelsohn
                               ` (8 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-12-15  7:04 UTC (permalink / raw)
  To: gcc-patches, pinskia

	* config/rs6000/rs6000.c (rs6000_special_round_type_align): Handle
	error_mark_node.  Make alignment params unsigned.
	* config/rs6000/rs6000-protos.h
	(rs6000_special_round_type_align): Update prototype.
	(rs6000_machopic_legitimize_pic_address): Remove arg names.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] Fix PR25572, -mminimal-toc trashes r30
@ 2005-12-28 10:31 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2005-12-28 10:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch updates regs_ever_live so that the prologue/epilogue code
knows about use of the TOC register created during reload.  Bootstrapped
and regression tested powerpc64-linux.  OK mainline?  OK for 4.1, 4.0
and 3.4 too, after I bootstrap there?

	PR target/25572
	* config/rs6000/rs6000.c (create_TOC_reference): Set regs_ever_live.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 109087)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -13499,6 +13499,8 @@ uses_TOC (void)
 rtx
 create_TOC_reference (rtx symbol)
 {
+  if (no_new_pseudos)
+    regs_ever_live[TOC_REGISTER] = 1;
   return gen_rtx_PLUS (Pmode,
 	   gen_rtx_REG (Pmode, TOC_REGISTER),
 	     gen_rtx_CONST (Pmode,

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Fix PR25572, -mminimal-toc trashes r30
       [not found]           ` <amodra@bigpond.net.au>
                               ` (52 preceding siblings ...)
  2005-12-15  7:04             ` [PowerPC] Fix 25406, rs6000_special_round_type_align David Edelsohn
@ 2005-12-28 15:07             ` David Edelsohn
  2006-02-24  3:03             ` [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full David Edelsohn
                               ` (7 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2005-12-28 15:07 UTC (permalink / raw)
  To: gcc-patches

	PR target/25572
	* config/rs6000/rs6000.c (create_TOC_reference): Set regs_ever_live.

Okay everywhere.

	This and the other uses of regs_ever_live[] should be solved in a
better way after the dataflow branch is merged.  Ask DannyB for the details.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full
@ 2006-02-24  2:22 Alan Modra
  2006-02-24  3:57 ` Mark Mitchell
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2006-02-24  2:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

Fixes a regression from gcc-3.3 introduced with r61058.  Bootstrapped
and regression tested powerpc64-linux.  OK mainline and active branches?

	PR target/26453
	* config/rs6000/rs6000.c (rs6000_output_function_epilogue): Don't
	output traceback table for thunks.  Localise rs6000_stack_info call.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 111347)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -15153,8 +15153,6 @@ static void
 rs6000_output_function_epilogue (FILE *file,
 				 HOST_WIDE_INT size ATTRIBUTE_UNUSED)
 {
-  rs6000_stack_t *info = rs6000_stack_info ();
-
   if (! HAVE_epilogue)
     {
       rtx insn = get_last_insn ();
@@ -15225,13 +15223,14 @@ rs6000_output_function_epilogue (FILE *f
      System V.4 Powerpc's (and the embedded ABI derived from it) use a
      different traceback table.  */
   if (DEFAULT_ABI == ABI_AIX && ! flag_inhibit_size_directive
-      && rs6000_traceback != traceback_none)
+      && rs6000_traceback != traceback_none && !current_function_is_thunk)
     {
       const char *fname = NULL;
       const char *language_string = lang_hooks.name;
       int fixed_parms = 0, float_parms = 0, parm_info = 0;
       int i;
       int optional_tbtab;
+      rs6000_stack_t *info = rs6000_stack_info ();
 
       if (rs6000_traceback == traceback_full)
 	optional_tbtab = 1;

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full
       [not found]           ` <amodra@bigpond.net.au>
                               ` (53 preceding siblings ...)
  2005-12-28 15:07             ` [PowerPC] Fix PR25572, -mminimal-toc trashes r30 David Edelsohn
@ 2006-02-24  3:03             ` David Edelsohn
  2006-03-30 21:46             ` [PowerPC] linuxspe vs. ibm long double David Edelsohn
                               ` (6 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-02-24  3:03 UTC (permalink / raw)
  To: gcc-patches

	PR target/26453
	* config/rs6000/rs6000.c (rs6000_output_function_epilogue): Don't
	output traceback table for thunks.  Localise rs6000_stack_info call.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full
  2006-02-24  2:22 [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full Alan Modra
@ 2006-02-24  3:57 ` Mark Mitchell
  0 siblings, 0 replies; 875+ messages in thread
From: Mark Mitchell @ 2006-02-24  3:57 UTC (permalink / raw)
  To: gcc-patches, David Edelsohn

Alan Modra wrote:
> Fixes a regression from gcc-3.3 introduced with r61058.  Bootstrapped
> and regression tested powerpc64-linux.  OK mainline and active branches?

I'm sorry; you've (just) missed the window for GCC 4.1.0.  This patch is
certainly OK for 4.1.0 after the 4.1.0 release is out.  Since I'm going
to try to get a 4.0.3 release out approximately in synch with 4.1.0 (see
previous mail), I'd like you to hold this patch until after 4.0.3 as
well, but it's OK for the 4.0 branch after 4.0.3.

Thanks,

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] linuxspe vs. ibm long double.
@ 2006-03-30  4:42 Alan Modra
  2006-03-30 12:51 ` Joseph S. Myers
  0 siblings, 1 reply; 875+ messages in thread
From: Alan Modra @ 2006-03-30  4:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch fixes a problem I noticed when fiddling with PR26459, namely,
that -mabi=ieeelongdouble or -mabi=ibmlongdouble stops the default
setting of rs6000_spe_abi.  Accomplished by only setting
rs6000_explicit_options.abi when -mabi=spe/no-spe/altivec is given,
and using a separate field in rs6000_explicit_options for
-mabi=ibmlongdouble/ieeelongdouble.  I figure that is reasonable to
separate the ieee/ibm long double selection from other abi selection.
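
The bookkeeping described above can be sketched in isolation as follows.
This is a simplified model under stated assumptions, not the rs6000 code:
toy_target, toy_handle_mabi and toy_apply_linuxspe_defaults are hypothetical
names, and only the -mabi= values mentioned in this mail are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of the explicit-option flags.  */
struct toy_explicit_options
{
  bool abi;	/* -mabi=spe, -mabi=no-spe or -mabi=altivec was given.  */
  bool ieee;	/* -mabi=ibmlongdouble or -mabi=ieeelongdouble was given.  */
};

struct toy_target
{
  struct toy_explicit_options explicit_opts;
  int spe_abi;
  int altivec_abi;
  int ieeequad;
};

/* Record one -mabi=ARG option.  Long double selection is tracked in its
   own flag so that it no longer counts as an explicit ABI choice.  */
void
toy_handle_mabi (struct toy_target *t, const char *arg)
{
  assert (t != NULL && arg != NULL);

  if (!strcmp (arg, "altivec"))
    {
      t->explicit_opts.abi = true;
      t->altivec_abi = 1;
      t->spe_abi = 0;
    }
  else if (!strcmp (arg, "no-altivec"))
    t->altivec_abi = 0;		/* Leaves the SPE default selectable.  */
  else if (!strcmp (arg, "spe"))
    {
      t->explicit_opts.abi = true;
      t->spe_abi = 1;
      t->altivec_abi = 0;
    }
  else if (!strcmp (arg, "no-spe"))
    {
      t->explicit_opts.abi = true;
      t->spe_abi = 0;
    }
  else if (!strcmp (arg, "ibmlongdouble"))
    {
      t->explicit_opts.ieee = true;
      t->ieeequad = 0;
    }
  else if (!strcmp (arg, "ieeelongdouble"))
    {
      t->explicit_opts.ieee = true;
      t->ieeequad = 1;
    }
}

/* Model of the linuxspe default: enable the SPE ABI unless the user
   made an explicit SPE/AltiVec ABI choice.  */
void
toy_apply_linuxspe_defaults (struct toy_target *t)
{
  assert (t != NULL);
  if (!t->explicit_opts.abi)
    t->spe_abi = 1;
}
```

In this model, passing "ibmlongdouble" and then applying the defaults still
leaves spe_abi set to 1, whereas passing "no-spe" keeps it at 0.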

I also select IEEE long double for linux spe by default, because the
E500 ABI doc quite clearly states that while other long double formats
may be used by the compiler that they must not be the default.  Whether
compatibility with the ABI matters more than compatibility with the new
linux default is arguable, but I'm inclined to favour E500 ABI
compatibility over linux compatibility in this case.

Tested with a powerpc-linux -> powerpc-linuxspe cross build.  OK for
mainline?  4.1 too?

	* config/rs6000/rs6000.c (rs6000_explicit_options): Add ieee.
	(rs6000_override_options): Use it.
	(rs6000_handle_option): Set it.  Set rs6000_explicit_options.abi
	only for -mabi=spe/no-spe and -mabi=altivec.
	* config/rs6000/linuxspe.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Set
	rs6000_ieeequad when spe abi.

Index: gcc/config/rs6000/linuxspe.h
===================================================================
--- gcc/config/rs6000/linuxspe.h	(revision 112399)
+++ gcc/config/rs6000/linuxspe.h	(working copy)
@@ -49,6 +49,8 @@
     rs6000_cpu = PROCESSOR_PPC8540; \
   if (!rs6000_explicit_options.abi) \
     rs6000_spe_abi = 1; \
+  if (rs6000_spe_abi && !rs6000_explicit_options.ieee) \
+    rs6000_ieeequad = 1; \
   if (!rs6000_explicit_options.float_gprs) \
     rs6000_float_gprs = 1; \
   /* See note below.  */ \
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 112399)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -249,11 +249,12 @@ int rs6000_alignment_flags;
 struct {
   bool aix_struct_ret;		/* True if -maix-struct-ret was used.  */
   bool alignment;		/* True if -malign- was used.  */
-  bool abi;			/* True if -mabi= was used.  */
+  bool abi;			/* True if -mabi=spe/nospe was used.  */
   bool spe;			/* True if -mspe= was used.  */
   bool float_gprs;		/* True if -mfloat-gprs= was used.  */
   bool isel;			/* True if -misel was used. */
   bool long_double;	        /* True if -mlong-double- was used.  */
+  bool ieee;			/* True if -mabi=ieee/ibmlongdouble used.  */
 } rs6000_explicit_options;
 
 struct builtin_description
@@ -1292,7 +1293,7 @@ rs6000_override_options (const char *def
     rs6000_long_double_type_size = RS6000_DEFAULT_LONG_DOUBLE_SIZE;
 
 #ifndef POWERPC_LINUX
-  if (!rs6000_explicit_options.abi)
+  if (!rs6000_explicit_options.ieee)
     rs6000_ieeequad = 1;
 #endif
 
@@ -1750,23 +1751,31 @@ rs6000_handle_option (size_t code, const
 #endif
 
     case OPT_mabi_:
-      rs6000_explicit_options.abi = true;
       if (!strcmp (arg, "altivec"))
 	{
+	  rs6000_explicit_options.abi = true;
 	  rs6000_altivec_abi = 1;
 	  rs6000_spe_abi = 0;
 	}
       else if (! strcmp (arg, "no-altivec"))
-	rs6000_altivec_abi = 0;
+	{
+	  /* ??? Don't set rs6000_explicit_options.abi here, to allow
+	     the default for rs6000_spe_abi to be chosen later.  */
+	  rs6000_altivec_abi = 0;
+	}
       else if (! strcmp (arg, "spe"))
 	{
+	  rs6000_explicit_options.abi = true;
 	  rs6000_spe_abi = 1;
 	  rs6000_altivec_abi = 0;
 	  if (!TARGET_SPE_ABI)
 	    error ("not configured for ABI: '%s'", arg);
 	}
       else if (! strcmp (arg, "no-spe"))
-	rs6000_spe_abi = 0;
+	{
+	  rs6000_explicit_options.abi = true;
+	  rs6000_spe_abi = 0;
+	}
 
       /* These are here for testing during development only, do not
 	 document in the manual please.  */
@@ -1783,11 +1792,13 @@ rs6000_handle_option (size_t code, const
 
       else if (! strcmp (arg, "ibmlongdouble"))
 	{
+	  rs6000_explicit_options.ieee = true;
 	  rs6000_ieeequad = 0;
 	  warning (0, "Using IBM extended precision long double");
 	}
       else if (! strcmp (arg, "ieeelongdouble"))
 	{
+	  rs6000_explicit_options.ieee = true;
 	  rs6000_ieeequad = 1;
 	  warning (0, "Using IEEE extended precision long double");
 	}

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] linuxspe vs. ibm long double.
  2006-03-30  4:42 [PowerPC] linuxspe vs. ibm long double Alan Modra
@ 2006-03-30 12:51 ` Joseph S. Myers
  0 siblings, 0 replies; 875+ messages in thread
From: Joseph S. Myers @ 2006-03-30 12:51 UTC (permalink / raw)
  To: Alan Modra; +Cc: gcc-patches, David Edelsohn

On Thu, 30 Mar 2006, Alan Modra wrote:

> I also select IEEE long double for linux spe by default, because the
> E500 ABI doc quite clearly states that while other long double formats
> may be used by the compiler that they must not be the default.  Whether
> compatibility with the ABI matters more than compatibility with the new
> linux default is arguable, but I'm inclined to favour E500 ABI
> compatibility over linux compatibility in this case.

Note that probably no existing glibc version will work with current GCC 
for IEEE long double on PowerPC, because GCC relies on the ABI-defined 
_q_* functions for IEEE long double operations, but glibc defines _q_uitoq 
instead of the standard name _q_utoq (and only defines these functions at 
all if built with 128-bit long double).

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] PR26459 again
@ 2006-03-30 14:20 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2006-03-30 14:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

I believe this is a better fix for PR26459 than the one I installed,
which really didn't fix the root cause of the problem.  Apologies for
not analysing the problem sufficiently.  The problem with the existing
rs6000 CANNOT_CHANGE_MODE_CLASS is that when !TARGET_IEEEQUAD, the
DF->DI and DI->DF insns that are supposed to be caught by the
TARGET_E500_DOUBLE lines never get that far.  Instead, they match the
first test and return a zero.  Geoff added the first few lines of
CANNOT_CHANGE_MODE_CLASS with a changelog of
    PR target/11848
    * rs6000.h (CANNOT_CHANGE_MODE_CLASS): Allow change of mode
    in floating-point registers between TFmode and DImode.
so it seems to me that the following change will still do as Geoff
intended without the side effect of breaking linuxspe.
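
To make the difference concrete, the old and new tests can be modelled as
plain C functions.  This is only a sketch under assumptions: toy_query and
both function names are hypothetical, the macro inputs are flattened into
booleans, and the TARGET_SPE vector-mode clause is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of the macro's inputs.  */
struct toy_query
{
  bool ieeequad;		/* TARGET_IEEEQUAD  */
  bool e500_double;		/* TARGET_E500_DOUBLE  */
  unsigned int from_size;	/* GET_MODE_SIZE (FROM)  */
  unsigned int to_size;		/* GET_MODE_SIZE (TO)  */
  bool from_is_df, to_is_df;	/* FROM / TO == DFmode  */
  bool from_is_di, to_is_di;	/* FROM / TO == DImode  */
  bool hits_float_regs;		/* CLASS intersects FLOAT_REGS  */
  bool hits_general_regs;	/* CLASS intersects GENERAL_REGS  */
};

/* Old shape: the first test swallows every change between modes of
   8 bytes or more when !TARGET_IEEEQUAD, including DF<->DI.  */
bool
toy_old_cannot_change (const struct toy_query *q)
{
  assert (q != NULL);
  if (!q->ieeequad && q->from_size >= 8 && q->to_size >= 8)
    return false;
  if (q->from_size != q->to_size)
    return q->hits_float_regs;
  if (q->e500_double && (q->to_is_df + q->from_is_df) == 1)
    return q->hits_general_regs;
  if (q->e500_double && (q->to_is_di + q->from_is_di) == 1)
    return q->hits_general_regs;
  return false;
}

/* New shape: the ">= 8 bytes" exemption only applies to size-changing
   conversions, so same-size DF<->DI reaches the E500 double test.  */
bool
toy_new_cannot_change (const struct toy_query *q)
{
  assert (q != NULL);
  if (q->from_size != q->to_size)
    return ((q->from_size < 8 || q->to_size < 8 || q->ieeequad)
	    && q->hits_float_regs);
  return (q->e500_double
	  && ((q->to_is_df + q->from_is_df) == 1
	      || (q->to_is_di + q->from_is_di) == 1)
	  && q->hits_general_regs);
}
```

For a DF->DI change in general registers with E500 double enabled and
!TARGET_IEEEQUAD, the old model permits the change while the new one rejects
it; a 16-byte to 8-byte change in float registers stays permitted in both.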

Bootstrapped and regression tested on 4.1 powerpc-linux and
powerpc64-linux.  Built powerpc-linux -> powerpc-linuxspe cross to test
pr26459, and i686-linux -> powerpc-darwin cross to test pr11848
testcase.  OK mainline and 4.1?  My previous patch won't do any harm
if it stays applied.

	PR target/26459
	* config/rs6000/rs6000.h (CANNOT_CHANGE_MODE_CLASS): Limit 2003-12-08
	change to FLOAT_REGS.

Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 112399)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1213,22 +1241,19 @@ enum reg_class
   ? 1                                                                   \
   : ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD))
 
+/* Return nonzero if for CLASS a mode change from FROM to TO is invalid.  */
 
-/* Return a class of registers that cannot change FROM mode to TO mode.  */
-
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)			  \
-  (!TARGET_IEEEQUAD							  \
-   && GET_MODE_SIZE (FROM) >= 8 && GET_MODE_SIZE (TO) >= 8		  \
-   ? 0									  \
-   : GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)				  \
-   ? reg_classes_intersect_p (FLOAT_REGS, CLASS)			  \
-   : (TARGET_E500_DOUBLE && (((TO) == DFmode) + ((FROM) == DFmode)) == 1) \
-   ? reg_classes_intersect_p (GENERAL_REGS, CLASS)			  \
-   : (TARGET_E500_DOUBLE && (((TO) == DImode) + ((FROM) == DImode)) == 1) \
-   ? reg_classes_intersect_p (GENERAL_REGS, CLASS)			  \
-   : (TARGET_SPE && (SPE_VECTOR_MODE (FROM) + SPE_VECTOR_MODE (TO)) == 1) \
-   ? reg_classes_intersect_p (GENERAL_REGS, CLASS)			  \
-   : 0)
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)			\
+  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)				\
+   ? ((GET_MODE_SIZE (FROM) < 8 || GET_MODE_SIZE (TO) < 8		\
+       || TARGET_IEEEQUAD)						\
+      && reg_classes_intersect_p (FLOAT_REGS, CLASS))			\
+   : (((TARGET_E500_DOUBLE						\
+	&& ((((TO) == DFmode) + ((FROM) == DFmode)) == 1		\
+	    || (((TO) == DImode) + ((FROM) == DImode)) == 1))		\
+       || (TARGET_SPE							\
+	   && (SPE_VECTOR_MODE (FROM) + SPE_VECTOR_MODE (TO)) == 1))	\
+      && reg_classes_intersect_p (GENERAL_REGS, CLASS)))
 
 /* Stack layout; function entry, exit and calling.  */
 

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] linuxspe vs. ibm long double.
       [not found]           ` <amodra@bigpond.net.au>
                               ` (54 preceding siblings ...)
  2006-02-24  3:03             ` [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full David Edelsohn
@ 2006-03-30 21:46             ` David Edelsohn
  2006-03-30 23:42               ` Alan Modra
  2006-03-30 21:50             ` [PowerPC] PR26459 again David Edelsohn
                               ` (5 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-03-30 21:46 UTC (permalink / raw)
  To: gcc-patches

> I also select IEEE long double for linux spe by default, because the
> E500 ABI doc quite clearly states that while other long double formats
> may be used by the compiler that they must not be the default.  Whether
> compatibility with the ABI matters more than compatibility with the new
> linux default is arguable, but I'm inclined to favour E500 ABI
> compatibility over linux compatibility in this case.

	I agree with Joseph.  It does not make sense to implement a
default that fundamentally cannot work, regardless of the ABI.  The e500
ABI can do whatever it wants for environments that they control, but not
for PPC Linux.  If we enable long-double-128 by default for all PPC Linux
configurations, e500 suddenly becomes broken again for no good reason.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] PR26459 again
       [not found]           ` <amodra@bigpond.net.au>
                               ` (55 preceding siblings ...)
  2006-03-30 21:46             ` [PowerPC] linuxspe vs. ibm long double David Edelsohn
@ 2006-03-30 21:50             ` David Edelsohn
  2006-03-30 23:12               ` Alan Modra
  2006-03-30 23:50             ` [PowerPC] linuxspe vs. ibm long double David Edelsohn
                               ` (4 subsequent siblings)
  61 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-03-30 21:50 UTC (permalink / raw)
  To: gcc-patches

	What bothers me about this change is that it converts a test that
does not need to call reg_classes_intersect_p() into one that does.  This
seems to be slowing down the common case for no good reason.  The first
test does not just cover floating point modes but integer modes as well.
That shortcut seems to have disappeared in your change.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] PR26459 again
  2006-03-30 21:50             ` [PowerPC] PR26459 again David Edelsohn
@ 2006-03-30 23:12               ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2006-03-30 23:12 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Thu, Mar 30, 2006 at 04:50:32PM -0500, David Edelsohn wrote:
> 	What bothers me about this change is that it converts a test that
> does not need to call reg_classes_intersect_p() into one that does.  This
> seems to be slowing down the common case for no good reason.  The first
> test does not just cover floating point modes but integer modes as well.
> That shortcut seems to have disappeared in your change.

Well, we do need an extra call for TARGET_E500_DOUBLE and TARGET_SPE
of course.  After all, removing the "shortcut" in that case was the
whole point of the change!  I assume you're not talking about that.

However, I can't see any other case where reg_classes_intersect_p is
called now where it previously wasn't.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] linuxspe vs. ibm long double.
  2006-03-30 21:46             ` [PowerPC] linuxspe vs. ibm long double David Edelsohn
@ 2006-03-30 23:42               ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2006-03-30 23:42 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Thu, Mar 30, 2006 at 04:45:55PM -0500, David Edelsohn wrote:
> > I also select IEEE long double for linux spe by default, because the
> > E500 ABI doc quite clearly states that while other long double formats
> > may be used by the compiler that they must not be the default.  Whether
> > compatibility with the ABI matters more than compatibility with the new
> > linux default is arguable, but I'm inclined to favour E500 ABI
> > compatibility over linux compatibility in this case.
> 
> 	I agree with Joseph.

Well I don't have a strong opinion either way.  Is the patch OK to
install without the IEEE long double default?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] linuxspe vs. ibm long double.
       [not found]           ` <amodra@bigpond.net.au>
                               ` (56 preceding siblings ...)
  2006-03-30 21:50             ` [PowerPC] PR26459 again David Edelsohn
@ 2006-03-30 23:50             ` David Edelsohn
  2006-03-31  0:33             ` [PowerPC] PR26459 again David Edelsohn
                               ` (3 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-03-30 23:50 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> Well I don't have a strong opinion either way.  Is the patch OK to
Alan> install without the IEEE long double default?

	Yes, okay without the default.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] PR26459 again
       [not found]           ` <amodra@bigpond.net.au>
                               ` (57 preceding siblings ...)
  2006-03-30 23:50             ` [PowerPC] linuxspe vs. ibm long double David Edelsohn
@ 2006-03-31  0:33             ` David Edelsohn
  2006-04-05  2:26             ` [RFT/RFA] Fix AIX fallout from PR/19653 patch David Edelsohn
                               ` (2 subsequent siblings)
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-03-31  0:33 UTC (permalink / raw)
  To: gcc-patches

>>>>> Alan Modra writes:

Alan> However, I can't see any other case where reg_classes_intersect_p is
Alan> called now where it previously wasn't.

	Oh, I see, you're expecting the tests to short-circuit the
evaluation before the reg_classes_intersect_p() calls.
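
	The short-circuit behaviour can be checked with a small standalone
sketch.  The names are hypothetical: toy_classes_intersect stands in for
reg_classes_intersect_p() and counts its calls, and toy_size_change_invalid
mirrors only the size-changing arm of the new macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of times the stand-in for reg_classes_intersect_p ran.  */
static unsigned int toy_intersect_calls;

/* Hypothetical stand-in for the class-intersection test.  */
static bool
toy_classes_intersect (bool answer)
{
  toy_intersect_calls++;
  return answer;
}

/* Same shape as the size-changing arm of the new macro: the cheap mode
   tests sit on the left of &&, so the call on the right is evaluated
   only when they succeed.  */
bool
toy_size_change_invalid (unsigned int from_size, unsigned int to_size,
			 bool ieeequad, bool class_has_float_regs)
{
  assert (from_size != to_size);
  return ((from_size < 8 || to_size < 8 || ieeequad)
	  && toy_classes_intersect (class_has_float_regs));
}
```

	A 16-byte to 8-byte change without IEEE quad returns false and
leaves toy_intersect_calls untouched; a 4-byte to 8-byte change makes
exactly one call.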

	Okay, the patch is okay on mainline and 4.1.

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [RFT/RFA] Fix AIX fallout from PR/19653 patch
@ 2006-04-04 16:21 Paolo Bonzini
  2006-04-04 17:52 ` David Edelsohn
  2006-04-05  2:20 ` Alan Modra
  0 siblings, 2 replies; 875+ messages in thread
From: Paolo Bonzini @ 2006-04-04 16:21 UTC (permalink / raw)
  To: GCC Patches; +Cc: David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 1349 bytes --]

This patch fixes numerous failures that the PR/19653 patch introduced on 
AIX.  The failures could in principle be visible on other rs6000 
subtargets that have TOC, but looking at posted testresults these seem 
unaffected.

The problem is that constant_pool_expr_p returns true if a SYMBOL_REF is 
present in its argument; on the other hand, its users all call 
get_pool_constant on the same argument as if it was a SYMBOL_REF:

   ...
   else if (TARGET_TOC
           && constant_pool_expr_p (x)
           && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), 
Pmode))
   ...

This looks like a latent bug to me, but I have not yet investigated why, 
after the PR/19653 patch, we are requesting legitimization of a (const 
(plus (symbol_ref LC..0) (const_int 4))).  I think it is Dale's regclass 
changes, but I'm not sure.

Since constant_pool_expr_p calls ASM_OUTPUT_SPECIAL_POOL_ENTRY_P, but 
always with Pmode mode, my solution is to add an argument to 
constant_pool_expr_p so that it calls ASM_OUTPUT_SPECIAL_POOL_ENTRY_P 
with the correct mode.  This fixes the failing testcases I tried it on, 
but I am not familiar with these parts of the back-ends and I am not 
sure it is correct.  `svn annotate' has not been helpful in deciphering 
the history of these bits.
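
The mismatch between the predicate and its callers can be shown with a toy
expression tree.  This is a hedged sketch, not the rs6000 code: toy_rtx and
all function names are hypothetical, and the special-pool-entry and TOC
checks are reduced to a single in_pool flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; hypothetical stand-ins for RTL codes.  */
enum toy_code { TOY_SYMBOL_REF, TOY_CONST_INT, TOY_PLUS, TOY_CONST };

struct toy_rtx
{
  enum toy_code code;
  const struct toy_rtx *op0, *op1;	/* Operands, NULL when unused.  */
  bool in_pool;				/* Only for TOY_SYMBOL_REF.  */
};

/* Walk X and accept it when it is built only from pool symbols, integer
   offsets, PLUS and CONST.  Sets *HAVE_SYM when a pool symbol is seen.  */
static bool
toy_pool_expr_1 (const struct toy_rtx *x, bool *have_sym)
{
  assert (x != NULL);
  switch (x->code)
    {
    case TOY_SYMBOL_REF:
      if (!x->in_pool)
	return false;
      *have_sym = true;
      return true;
    case TOY_CONST_INT:
      return true;
    case TOY_PLUS:
      return (toy_pool_expr_1 (x->op0, have_sym)
	      && toy_pool_expr_1 (x->op1, have_sym));
    case TOY_CONST:
      return toy_pool_expr_1 (x->op0, have_sym);
    }
  return false;
}

/* True when X contains a pool symbol somewhere inside it.  */
bool
toy_pool_expr_p (const struct toy_rtx *x)
{
  bool have_sym = false;
  return toy_pool_expr_1 (x, &have_sym) && have_sym;
}

/* True only when X itself is a symbol, which is what a caller needs
   before treating X as a pool symbol.  */
bool
toy_is_bare_symbol (const struct toy_rtx *x)
{
  assert (x != NULL);
  return x->code == TOY_SYMBOL_REF;
}
```

For a tree shaped like (const (plus (symbol_ref) (const_int 4))),
toy_pool_expr_p accepts the expression while toy_is_bare_symbol rejects it,
which is the situation a caller runs into when it passes the whole
expression on as if it were a SYMBOL_REF.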

David, can you bootstrap this?  Ok for mainline if it works?

Paolo

[-- Attachment #2: fix-toc-bug.patch --]
[-- Type: text/plain, Size: 6225 bytes --]

This patch fixes numerous failures that the PR/19653 patch introduced on
AIX.  The failures could in principle be visible on other rs6000 subtargets
that have TOC, but looking at posted testresults these seem unaffected.

The problem is that constant_pool_expr_p returns true if a SYMBOL_REF
is present in its argument; on the other hand, its users all call
get_pool_constant on the same argument as if it was a SYMBOL_REF:

   ...
   else if (TARGET_TOC
           && constant_pool_expr_p (x)
           && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), Pmode))
   ...

This looks like a latent bug to me, but it may as well be a problem with
my reload changes (or Dale's regclass change -- could be as likely).

Since constant_pool_expr_p calls ASM_OUTPUT_SPECIAL_POOL_ENTRY_P, but always
with Pmode mode, my solution is to add an argument to constant_pool_expr_p
so that it calls ASM_OUTPUT_SPECIAL_POOL_ENTRY_P with the correct mode.
This fixes the failing testcases I tried it on, but I am not familiar with
these parts of the back-ends and I am not sure it is correct.  `svn
annotate' has not been helpful in deciphering the history of these bits.

David, can you bootstrap this?  Ok for mainline if it works?

Paolo

2006-04-04  Paolo Bonzini  <bonzini@gnu.org>

	* rs6000.c (constant_pool_expr_1): Add MODE parameter defaulting
	to the pool constant's mode.
	(constant_pool_expr_p): Add a MODE parameter, pass it.
	(toc_relative_expr_p): Adjust call.
	(legitimate_constant_pool_address_p, rs6000_legitimize_reload_address,
	rs6000_emit_move): Merge invocations of ASM_OUTPUT_SPECIAL_POOL_ENTRY_P
	with preceding calls to constant_pool_expr_p.

Index: rs6000.c
===================================================================
--- rs6000.c	(revision 112658)
+++ rs6000.c	(working copy)
@@ -588,8 +588,8 @@ static void rs6000_emit_allocate_stack (
 static unsigned rs6000_hash_constant (rtx);
 static unsigned toc_hash_function (const void *);
 static int toc_hash_eq (const void *, const void *);
-static int constant_pool_expr_1 (rtx, int *, int *);
-static bool constant_pool_expr_p (rtx);
+static int constant_pool_expr_1 (rtx, enum machine_mode, int *, int *);
+static bool constant_pool_expr_p (rtx, enum machine_mode);
 static bool legitimate_small_data_p (enum machine_mode, rtx);
 static bool legitimate_indexed_address_p (rtx, int);
 static bool legitimate_lo_sum_address_p (enum machine_mode, rtx, int);
@@ -2642,7 +2642,8 @@ gpr_or_gpr_p (rtx op0, rtx op1)
 /* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address.  */
 
 static int
-constant_pool_expr_1 (rtx op, int *have_sym, int *have_toc)
+constant_pool_expr_1 (rtx op, enum machine_mode mode,
+		      int *have_sym, int *have_toc)
 {
   switch (GET_CODE (op))
     {
@@ -2651,7 +2652,9 @@ constant_pool_expr_1 (rtx op, int *have_
 	return 0;
       else if (CONSTANT_POOL_ADDRESS_P (op))
 	{
-	  if (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (op), Pmode))
+	  if (mode == VOIDmode)
+	    mode = get_pool_mode (op);
+	  if (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (op), mode))
 	    {
 	      *have_sym = 1;
 	      return 1;
@@ -2668,10 +2671,10 @@ constant_pool_expr_1 (rtx op, int *have_
 	return 0;
     case PLUS:
     case MINUS:
-      return (constant_pool_expr_1 (XEXP (op, 0), have_sym, have_toc)
-	      && constant_pool_expr_1 (XEXP (op, 1), have_sym, have_toc));
+      return (constant_pool_expr_1 (XEXP (op, 0), mode, have_sym, have_toc)
+	      && constant_pool_expr_1 (XEXP (op, 1), mode, have_sym, have_toc));
     case CONST:
-      return constant_pool_expr_1 (XEXP (op, 0), have_sym, have_toc);
+      return constant_pool_expr_1 (XEXP (op, 0), mode, have_sym, have_toc);
     case CONST_INT:
       return 1;
     default:
@@ -2680,11 +2683,11 @@ constant_pool_expr_1 (rtx op, int *have_
 }
 
 static bool
-constant_pool_expr_p (rtx op)
+constant_pool_expr_p (rtx op, enum machine_mode mode)
 {
   int have_sym = 0;
   int have_toc = 0;
-  return constant_pool_expr_1 (op, &have_sym, &have_toc) && have_sym;
+  return constant_pool_expr_1 (op, mode, &have_sym, &have_toc) && have_sym;
 }
 
 bool
@@ -2692,7 +2695,7 @@ toc_relative_expr_p (rtx op)
 {
   int have_sym = 0;
   int have_toc = 0;
-  return constant_pool_expr_1 (op, &have_sym, &have_toc) && have_toc;
+  return constant_pool_expr_1 (op, Pmode, &have_sym, &have_toc) && have_toc;
 }
 
 bool
@@ -2702,7 +2705,7 @@ legitimate_constant_pool_address_p (rtx 
 	  && GET_CODE (x) == PLUS
 	  && GET_CODE (XEXP (x, 0)) == REG
 	  && (TARGET_MINIMAL_TOC || REGNO (XEXP (x, 0)) == TOC_REGISTER)
-	  && constant_pool_expr_p (XEXP (x, 1)));
+	  && constant_pool_expr_p (XEXP (x, 1), Pmode));
 }
 
 static bool
@@ -3005,8 +3008,7 @@ rs6000_legitimize_address (rtx x, rtx ol
       return gen_rtx_LO_SUM (Pmode, reg, x);
     }
   else if (TARGET_TOC
-	   && constant_pool_expr_p (x)
-	   && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), Pmode))
+	   && constant_pool_expr_p (x, Pmode))
     {
       return create_TOC_reference (x);
     }
@@ -3440,8 +3442,7 @@ rs6000_legitimize_reload_address (rtx x,
     }
 
   if (TARGET_TOC
-      && constant_pool_expr_p (x)
-      && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
+      && constant_pool_expr_p (x, mode))
     {
       (x) = create_TOC_reference (x);
       *win = 1;
@@ -4112,9 +4113,7 @@ rs6000_emit_move (rtx dest, rtx source, 
 	 reference to it.  */
       if (TARGET_TOC
 	  && GET_CODE (operands[1]) == SYMBOL_REF
-	  && constant_pool_expr_p (operands[1])
-	  && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (operands[1]),
-					      get_pool_mode (operands[1])))
+	  && constant_pool_expr_p (operands[1], VOIDmode))
 	{
 	  operands[1] = create_TOC_reference (operands[1]);
 	}
@@ -4179,10 +4178,7 @@ rs6000_emit_move (rtx dest, rtx source, 
 	  operands[1] = force_const_mem (mode, operands[1]);
 
 	  if (TARGET_TOC
-	      && constant_pool_expr_p (XEXP (operands[1], 0))
-	      && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (
-			get_pool_constant (XEXP (operands[1], 0)),
-			get_pool_mode (XEXP (operands[1], 0))))
+	      && constant_pool_expr_p (XEXP (operands[1], 0), VOIDmode))
 	    {
 	      operands[1]
 		= gen_const_mem (mode,

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-04 16:21 [RFT/RFA] Fix AIX fallout from PR/19653 patch Paolo Bonzini
@ 2006-04-04 17:52 ` David Edelsohn
  2006-04-05 17:10   ` Geoff Keating
  2006-04-05  2:20 ` Alan Modra
  1 sibling, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-04-04 17:52 UTC (permalink / raw)
  To: Paolo Bonzini, Geoff Keating; +Cc: GCC Patches

	I will test your patch.  This machinery originally was designed
and implemented by Geoff, so I would appreciate his comments on your
analysis of a latent bug.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-04 16:21 [RFT/RFA] Fix AIX fallout from PR/19653 patch Paolo Bonzini
  2006-04-04 17:52 ` David Edelsohn
@ 2006-04-05  2:20 ` Alan Modra
  2006-04-05  7:12   ` Paolo Bonzini
  1 sibling, 1 reply; 875+ messages in thread
From: Alan Modra @ 2006-04-05  2:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: GCC Patches, David Edelsohn

On Tue, Apr 04, 2006 at 06:21:06PM +0200, Paolo Bonzini wrote:
> This looks like a latent bug to me, but I have not yet investigated why,
> after the PR/19653 patch, we are requesting legitimization of a (const 
> (plus (symbol_ref LC..0) (const_int 4))).  I think it is Dale's regclass 
> changes, but I'm not sure.

This part of the pr19653 reload.c patch looks wrong to me.
@@ -1885,7 +1894,11 @@
 
   /* Narrow down the reg class, the same way push_reload will;
      otherwise we might find a dummy now, but push_reload won't.  */
-  class = PREFERRED_RELOAD_CLASS (in, class);
+  {
+    enum reg_class preferred_class = PREFERRED_RELOAD_CLASS (in, class);
+    if (class != NO_REGS)
+      class = preferred_class;
+  }

"if (preferred_class != NO_REGS)" ?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
       [not found]           ` <amodra@bigpond.net.au>
                               ` (58 preceding siblings ...)
  2006-03-31  0:33             ` [PowerPC] PR26459 again David Edelsohn
@ 2006-04-05  2:26             ` David Edelsohn
  2006-04-12  0:53             ` [PowerPC] Avoid ICE on DFmode subreg David Edelsohn
  2006-07-07 13:08             ` [PATCH, committed] PR 28150 and PR 28170 David Edelsohn
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-04-05  2:26 UTC (permalink / raw)
  To: Paolo Bonzini, GCC Patches

>>>>> Alan Modra writes:

Alan> "if (preferred_class != NO_REGS)" ?

	Yes.  That code is supposed to match the logic of push_reload() and
in its current form it no longer does.
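
To make the difference between the two conditions concrete, here is a
minimal sketch.  It does not use GCC's real macros or types: the enum is
invented, `toy_preferred_reload_class' is a hypothetical stand-in for
PREFERRED_RELOAD_CLASS, and the idea that the incoming class should be kept
when the preferred class is NO_REGS is taken from the suggestion above, not
checked against reload.c.

```c
#include <assert.h>

/* Invented, much-reduced set of register classes.  */
enum toy_reg_class { TOY_NO_REGS, TOY_FLOAT_REGS, TOY_GENERAL_REGS };

/* Hypothetical stand-in for PREFERRED_RELOAD_CLASS: this toy target
   answers NO_REGS ("use memory") for FLOAT_REGS and otherwise returns
   the class unchanged.  */
enum toy_reg_class
toy_preferred_reload_class (enum toy_reg_class rclass)
{
  return rclass == TOY_FLOAT_REGS ? TOY_NO_REGS : rclass;
}

/* Condition as quoted in the hunk: it tests the incoming class, so a
   NO_REGS preference overwrites a usable class.  */
enum toy_reg_class
narrow_testing_class (enum toy_reg_class rclass)
{
  enum toy_reg_class preferred_class = toy_preferred_reload_class (rclass);
  if (rclass != TOY_NO_REGS)
    rclass = preferred_class;
  return rclass;
}

/* Condition as suggested: it tests the preferred class, so a NO_REGS
   preference leaves the incoming class alone.  */
enum toy_reg_class
narrow_testing_preferred (enum toy_reg_class rclass)
{
  enum toy_reg_class preferred_class = toy_preferred_reload_class (rclass);
  if (preferred_class != TOY_NO_REGS)
    rclass = preferred_class;
  return rclass;
}

/* Small self-check showing where the two variants diverge.  */
void
example_narrowing (void)
{
  assert (narrow_testing_class (TOY_FLOAT_REGS) == TOY_NO_REGS);
  assert (narrow_testing_preferred (TOY_FLOAT_REGS) == TOY_FLOAT_REGS);
  assert (narrow_testing_class (TOY_GENERAL_REGS)
          == narrow_testing_preferred (TOY_GENERAL_REGS));
}
```

The two variants only differ when the preference is NO_REGS, which fits the
observation that the mistake went unnoticed on targets that never return it.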

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-05  2:20 ` Alan Modra
@ 2006-04-05  7:12   ` Paolo Bonzini
  2006-04-05 14:48     ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: Paolo Bonzini @ 2006-04-05  7:12 UTC (permalink / raw)
  To: Paolo Bonzini, GCC Patches, David Edelsohn, Alan Modra


> This part of the pr19653 reload.c patch looks wrong to me.
> @@ -1885,7 +1894,11 @@
>  
>    /* Narrow down the reg class, the same way push_reload will;
>       otherwise we might find a dummy now, but push_reload won't.  */
> -  class = PREFERRED_RELOAD_CLASS (in, class);
> +  {
> +    enum reg_class preferred_class = PREFERRED_RELOAD_CLASS (in, class);
> +    if (class != NO_REGS)
> +      class = preferred_class;
> +  }
>
> "if (preferred_class != NO_REGS)" ?
I agree and will bootstrap/regtest the obvious fix on 
i686-pc-linux-gnu.  However, fixing this does not bring AIX into better shape.

Paolo

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-05  7:12   ` Paolo Bonzini
@ 2006-04-05 14:48     ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2006-04-05 14:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: GCC Patches

On Wed, Apr 05, 2006 at 09:12:06AM +0200, Paolo Bonzini wrote:
> 
> >This part of the pr19653 reload.c patch looks wrong to me.
> >@@ -1885,7 +1894,11 @@
> > 
> >   /* Narrow down the reg class, the same way push_reload will;
> >      otherwise we might find a dummy now, but push_reload won't.  */
> >-  class = PREFERRED_RELOAD_CLASS (in, class);
> >+  {
> >+    enum reg_class preferred_class = PREFERRED_RELOAD_CLASS (in, class);
> >+    if (class != NO_REGS)
> >+      class = preferred_class;
> >+  }
> >
> >"if (preferred_class != NO_REGS)" ?
> I agree and will bootstrap/regtest the obvious fix on 
> i686-pc-linux-gnu.  However, fixing this does not bring AIX into better shape.

Committed after bootstrap and regression test on powerpc-linux and
x86_64-linux.  Roger Sayle also agreed on irc that the fix was
more-or-less obvious.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-04 17:52 ` David Edelsohn
@ 2006-04-05 17:10   ` Geoff Keating
  2006-04-06  3:38     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Geoff Keating @ 2006-04-05 17:10 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Paolo Bonzini, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1370 bytes --]


On 04/04/2006, at 10:52 AM, David Edelsohn wrote:

> 	I will test your patch.  This machinery originally was designed
> and implemented by Geoff, so I would appreciate his comments on your
> analysis of a latent bug.

Actually, the constant_pool_expr_p routine was implemented by Clinton  
Popetz.

His comment, in <http://gcc.gnu.org/ml/gcc-patches/2000-02/msg00223.html>,
about these routines I think, was:

> (3) Adds some code to differentiate expressions that reference the
> constant_pool as opposed to those that reference the constant_pool via
> the TOC, which is needed because CSE and reload can try to turn strange
> things into addresses, and input_operand and the force_const_mem code in
> movsi were accepting too much.

I don't see how the proposed patch fixes the problem described.  My  
guess is that it simply causes constant_pool_expr_p to return false  
in the particular cases that are being encountered, suppressing but  
not fixing the problem.

I suspect that each use of constant_pool_expr_p immediately followed  
by ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (op), ...),  
that is all but one, should actually be

   GET_CODE (op) == SYMBOL_REF
   && CONSTANT_POOL_ADDRESS_P (op)
   && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (op),
                                       get_pool_mode (op))

There are three places that should be changed, I think.


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 2410 bytes --]

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-05 17:10   ` Geoff Keating
@ 2006-04-06  3:38     ` David Edelsohn
  2006-04-06  4:57       ` Roger Sayle
  2006-04-06 18:46       ` Dale Johannesen
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2006-04-06  3:38 UTC (permalink / raw)
  To: Geoff Keating, Paolo Bonzini, Roger Sayle; +Cc: GCC Patches

	Paolo's patch, this suggestion, and limiting the SYMBOL_REF test
to just legitimize_reload_address all work.  However, GCC now produces
worse code with many more memory accesses for FP constants on PowerPC.

	Was the performance impact of the original patch ever tested on
anything other than x86?  Reload changes can have significant performance
impact on all GCC architectures and this patch should have been tested
much more extensively.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06  3:38     ` David Edelsohn
@ 2006-04-06  4:57       ` Roger Sayle
  2006-04-06  7:31         ` Paolo Bonzini
  2006-04-06 18:46       ` Dale Johannesen
  1 sibling, 1 reply; 875+ messages in thread
From: Roger Sayle @ 2006-04-06  4:57 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, Paolo Bonzini, GCC Patches

On Wed, 5 Apr 2006, David Edelsohn wrote:
> 	Paolo's patch, this suggestion, and limiting the SYMBOL_REF test
> to just legitimize_reload_address all work.  However, GCC now produces
> worse code with many more memory accesses for FP constants on PowerPC.

I think it's important to find out exactly why code generation on PPC
changed at all.  The meat of Paolo's patch was to extend the functionality
of reload to allow backends to return NO_REGS as the preferred reload
class in some circumstances.  This change therefore should only have
affected x86 which should be the only target using this functionality.

As commented on by Paolo in a previous e-mail, this could be caused by
some functionality written by Dale included in the original patch.

Paolo could you identify the target independent reload changes in your
PR 19653 change (Dale's bits) and bootstrap/regression test a reversion
of them?  Meanwhile someone should figure out exactly how/why this is
adversely affecting AIX.  That we're now fixing latent TOC bugs uncovered
by this change is interesting, but not helpful for rationalizing why
they've become exposed.

My understanding of the original patch, and presumably that of the
other reviewers that looked at it before approval, was that there
should have been no impact on non-x86 platforms?, and that the
testing and SPEC benchmarking reported were therefore sufficient.

Hopefully, once the problematic hunk is identified it'll provide more
insights and options.  For example, if reverting just this piece
resolves the AIX performance issues, there'll be no need to revert
the entire change.

Likewise if David can provide some concrete examples of code that
has regressed I'm sure that'll also help the process.

Roger
--

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06  4:57       ` Roger Sayle
@ 2006-04-06  7:31         ` Paolo Bonzini
  2006-04-06 14:31           ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Paolo Bonzini @ 2006-04-06  7:31 UTC (permalink / raw)
  To: Roger Sayle, Geoff Keating, David Edelsohn, GCC Patches

> Paolo could you identify the target independent reload changes in your
> PR 19653 change (Dale's bits) and bootstrap/regression test a reversion
> of them?  Meanwhile someone should figure out exactly how/why this is
> adversely affecting AIX.  That we're now fixing latent TOC bugs uncovered
> by this change is interesting, but not helpful for rationalizing why
> they've become exposed.
>   
Agreed.
> My understanding of the original patch, and presumably that of the
> other reviewers that looked at it before approval, was that there
> should have been no impact on non-x86 platforms?, and that the
> testing and SPEC benchmarking reported were therefore sufficient.
>   
The reload patch should not.  Dale's patch could, because it affects 
register allocation choices in general.

In fact, reverting the regclass.c hunk of the patch fixes the AIX 
failure for at least gcc.c-torture/compile/960829-1.c.  In this 
testcase, we have an unprototyped function for which we must put the 
parameters in both the integer and FP registers.  The costs of registers 
120 and 151 are as follows:

   Register 120 costs: BASE_REGS:16000 GENERAL_REGS:16000
      SPEC_OR_GEN_REGS:22000 NON_FLOAT_REGS:22000 MEM:12000
   Register 151 costs: BASE_REGS:12000 GENERAL_REGS:12000
      SPEC_OR_GEN_REGS:14000 NON_FLOAT_REGS:14000 MEM:8000

so after Dale's patch we decide to not allocate them to a hard 
register.  Register 120 is the DFmode 0.0 constant, which is loaded 
anyway from memory.  Register 151 is a DImode subreg of register 120, 
and is only used to make a SImode subreg out of it, so it makes sense 
there as well.
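
As a rough illustration of why both pseudos end up in memory, the sketch
below compares the MEM cost with each class cost from the dump.  The rule
"memory wins when it is strictly cheaper than every listed class" is an
assumption made for this example, and the struct and function names are
hypothetical; it is not a description of what regclass.c actually computes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One (class name, cost) pair as printed in the cost dump.  */
struct class_cost
{
  const char *name;
  int cost;
};

/* Costs copied from the dump quoted in this message.  */
const struct class_cost reg120_costs[] = {
  { "BASE_REGS", 16000 }, { "GENERAL_REGS", 16000 },
  { "SPEC_OR_GEN_REGS", 22000 }, { "NON_FLOAT_REGS", 22000 },
};
const int reg120_mem_cost = 12000;

const struct class_cost reg151_costs[] = {
  { "BASE_REGS", 12000 }, { "GENERAL_REGS", 12000 },
  { "SPEC_OR_GEN_REGS", 14000 }, { "NON_FLOAT_REGS", 14000 },
};
const int reg151_mem_cost = 8000;

#define N_COSTS(a) (sizeof (a) / sizeof ((a)[0]))

/* Assumed decision rule: memory is chosen only if it is strictly
   cheaper than every register class in the table.  */
bool
memory_is_cheapest (const struct class_cost *costs, size_t n, int mem_cost)
{
  for (size_t i = 0; i < n; i++)
    if (costs[i].cost <= mem_cost)
      return false;
  return true;
}

/* Both pseudos from the dump prefer memory under this rule.  */
void
example_costs (void)
{
  assert (memory_is_cheapest (reg120_costs, N_COSTS (reg120_costs),
                              reg120_mem_cost));
  assert (memory_is_cheapest (reg151_costs, N_COSTS (reg151_costs),
                              reg151_mem_cost));
}
```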

The problem is that later on we have a (subreg:SI (reg:DF 120) 4) in the 
insn stream, and reload takes the value of reg_equiv_memory_loc[120], 
which is

  (mem/u/c/i:DF (symbol_ref/u:SI ("*LC..0") [flags 0x2]) [2 S8 A64])

and resolves the subreg by turning it into:

  (mem/u/c/i:DF (const:SI (plus:SI (symbol_ref/u:SI ("*LC..0") [flags 0x2])
              (const_int 4 [0x4]))) [2 S8 A64])

(lines 5950-5971 of reload.c)

This however is not a legitimate address, and things go downhill from
here.  rs6000_legitimize_reload_address is called on the CONST, and 
constant_pool_expr_p recognizes that this operand is including a 
SYMBOL_REF.  Then however rs6000_legitimize_reload_address cannot cope 
with it because it expects to be called only with constant pool 
SYMBOL_REFs, not with CONSTs like this one.
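
The mismatch can be sketched with a toy expression tree.  Everything below
is invented for illustration: `struct toy_rtx' is not GCC's rtx, the
`pool_symbol' flag stands in for the CONSTANT_POOL_ADDRESS_P and
ASM_OUTPUT_SPECIAL_POOL_ENTRY_P tests, and the TOC-label case of
constant_pool_expr_1 is left out.  The point is only that a recursive
predicate accepts the whole CONST while the caller goes on to treat it as a
bare SYMBOL_REF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_code
{ TOY_SYMBOL_REF, TOY_CONST_INT, TOY_PLUS, TOY_MINUS, TOY_CONST };

struct toy_rtx
{
  enum toy_code code;
  bool pool_symbol;                /* TOY_SYMBOL_REF: TOC-eligible pool entry */
  long value;                      /* TOY_CONST_INT: the integer */
  const struct toy_rtx *op0, *op1; /* operands, NULL when unused */
};

/* Simplified walk in the spirit of constant_pool_expr_1.  */
static bool
toy_pool_expr_1 (const struct toy_rtx *op, bool *have_sym)
{
  switch (op->code)
    {
    case TOY_SYMBOL_REF:
      if (op->pool_symbol)
        {
          *have_sym = true;
          return true;
        }
      return false;
    case TOY_PLUS:
    case TOY_MINUS:
      return (toy_pool_expr_1 (op->op0, have_sym)
              && toy_pool_expr_1 (op->op1, have_sym));
    case TOY_CONST:
      return toy_pool_expr_1 (op->op0, have_sym);
    case TOY_CONST_INT:
      return true;
    }
  return false;
}

/* True if OP contains a pool symbol anywhere inside it.  */
bool
toy_pool_expr_p (const struct toy_rtx *op)
{
  bool have_sym = false;
  return toy_pool_expr_1 (op, &have_sym) && have_sym;
}

/* Toy version of (const (plus (symbol_ref LC..0) (const_int 4))).  */
const struct toy_rtx toy_sym = { TOY_SYMBOL_REF, true, 0, NULL, NULL };
const struct toy_rtx toy_four = { TOY_CONST_INT, false, 4, NULL, NULL };
const struct toy_rtx toy_plus = { TOY_PLUS, false, 0, &toy_sym, &toy_four };
const struct toy_rtx toy_const = { TOY_CONST, false, 0, &toy_plus, NULL };

/* The predicate says yes, yet the expression is not a SYMBOL_REF.  */
void
example_mismatch (void)
{
  assert (toy_pool_expr_p (&toy_const));
  assert (toy_const.code != TOY_SYMBOL_REF);
}
```

A caller that follows a successful predicate with a symbol-only accessor is
therefore relying on something the predicate never promised.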

With my patch, the address is legitimized to

 (plus:SI (reg:SI 2 2)
        (const:SI (minus:SI (const:SI (plus:SI (symbol_ref/u:SI ("*LC..0") [flags 0x2])
                        (const_int 4 [0x4])))
                (symbol_ref:SI ("*LCTOC..1"))))) [2 S8 A64])

which looks weird, but is a valid strict memory address.  Of course, 
correct code is generated for it:

        lwz 0,LC..0+4(2)

that is, the offset is not lost.  And FWIW, on the testcase we even
produce slightly smaller code (a couple fewer instructions).

Now, here comes some spelunking of rs6000_legitimize_reload_address.  
While it was moved to a function by rth in 2001, its meat dates back to 
newppc-branch and to a patch by Geoff Keating from July 2000.

Clinton Popetz's patch at 
http://gcc.gnu.org/ml/gcc-patches/2000-02/msg00223.html introduced 
CONSTANT_POOL_EXPR_P and used it this way:

*************** typedef struct rs6000_args
*** 1943,1946 ****
--- 1941,1949 ----
        goto WIN;								\
      }									\
+   else if (TARGET_TOC && CONSTANT_POOL_EXPR_P (X))			\
+     {									\
+       (X) = create_TOC_reference(X);					\
+       goto WIN;								\
+     }									\
  }

Later on, however, in revision 31940 (still on the newppc-branch), Geoff 
added the checks to "only use create_TOC_reference on symbols in the 
constant pool that really are TOC references": weird, because 
CONSTANT_POOL_EXPR_P was doing exactly that via 
ASM_OUTPUT_SPECIAL_POOL_ENTRY_P.  Note that at that point, the latter 
macro did not even have a MODE argument, so the two calls (in 
constant_pool_expr_1 and in LEGITIMIZE_RELOAD_ADDRESS) were exact 
duplicates.  The patch is at 
http://gcc.gnu.org/ml/gcc-patches/2000-02/msg00345.html but the e-mail 
is not helpful at all since the subject is "random PPC changes", and the 
change to LEGITIMIZE_RELOAD_ADDRESS is not even mentioned in the 
ChangeLog...

I cannot make Geoff's proposed change work, as it aborts with an 
unrecognizable insn with a cross to powerpc-ibm-aix5.2.0, so I cannot 
test any alternative approach.  And while I was convinced already before 
that *there is* a latent bug in TOC handling, now I'm also convinced 
that my patch is correct: in particular that the 
ASM_OUTPUT_SPECIAL_POOL_ENTRY_P added in revision 31940 is useless, and 
that my patch is the correct way to implement 
http://gcc.gnu.org/ml/gcc-patches/2000-07/msg00764.html (which among 
other things added the MODE argument to ASM_OUTPUT_SPECIAL_POOL_ENTRY_P).

Paolo

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06  7:31         ` Paolo Bonzini
@ 2006-04-06 14:31           ` David Edelsohn
  2006-04-06 14:42             ` Paolo Bonzini
  2006-04-07 20:49             ` Geoff Keating
  0 siblings, 2 replies; 875+ messages in thread
From: David Edelsohn @ 2006-04-06 14:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Roger Sayle, Geoff Keating, GCC Patches

	Geoff's proposed patch works for me.  Even a more limited patch
works: 

*** rs6000.c    (revision 112731)
--- rs6000.c    (working copy)
*************** rs6000_legitimize_reload_address (rtx x,
*** 3446,3455 ****
      }
  
    if (TARGET_TOC
        && constant_pool_expr_p (x)
        && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
      {
        (x) = create_TOC_reference (x);
        *win = 1;
        return x;
      }
--- 3446,3456 ----
      }
  
    if (TARGET_TOC
+       && GET_CODE (x) == SYMBOL_REF
        && constant_pool_expr_p (x)
        && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
      {
        (x) = create_TOC_reference (x);
        *win = 1;
        return x;
      }

However, the code generated for the testcase now is worse than before
the regclass change: an extra load and an extra TOC entry.

GCC 4.0.3:

LC..0:
        .tc FD_0_0[TC],0x0,0x0
.f:
        mflr 0
        stw 0,8(1)
        stwu 1,-72(1)
        lwz 11,LC..0(2)
        lwz 12,LC..0+4(2)
        stw 12,56(1)
        li 10,0
        stw 11,64(1)
        stw 12,68(1)
        lfd 4,64(1)
	li 3,0

trunk:

LC..0:
        .tc FD_0_0[TC],0x0,0x0
LC..1:
        .tc LC..0.P4[TC],LC..0+4
.f:
        mflr 0
        stw 0,8(1)
        stwu 1,-80(1)
        lwz 9,LC..1(2)
        lwz 9,0(9)
        stw 9,56(1)
        li 10,0
        lfd 4,LC..0(2)
        li 3,0
        lwz 4,LC..0(2)
        lwz 5,LC..0+4(2)


The smallest and simplest testcase is an unprototyped function, but the
real effect of this change is in stdarg functions, which also fail.

	I do not doubt that the rs6000 port can be modified to recover the
original performance with the regclass change, however it's not a task
that I enjoy inheriting.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06 14:31           ` David Edelsohn
@ 2006-04-06 14:42             ` Paolo Bonzini
  2006-04-06 22:44               ` David Edelsohn
  2006-04-07 20:49             ` Geoff Keating
  1 sibling, 1 reply; 875+ messages in thread
From: Paolo Bonzini @ 2006-04-06 14:42 UTC (permalink / raw)
  To: David Edelsohn, Geoff Keating, Roger Sayle, GCC Patches


> However, the code generated for the testcase now is worse than before
> the regclass change: an extra load and an extra TOC entry.
>   
What I have with my patch is also a single TOC entry, and this assembly

        lwz 4,LC..0(2)
        lwz 5,LC..0+4(2)
        li 3,0
        stw 0,8(1)
        stwu 1,-80(1)
        stw 4,64(1)
        stw 5,68(1)
        lfd 1,64(1)

which is comparable to what you got from 4.0.3:
>         stw 0,8(1)
>         stwu 1,-72(1)
>         lwz 11,LC..0(2)
>         lwz 12,LC..0+4(2)
>         stw 12,56(1)
>         li 10,0
>         stw 11,64(1)
>         stw 12,68(1)
>         lfd 4,64(1)
> 	li 3,0
>   
Paolo

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06  3:38     ` David Edelsohn
  2006-04-06  4:57       ` Roger Sayle
@ 2006-04-06 18:46       ` Dale Johannesen
  2006-04-06 19:17         ` David Edelsohn
  1 sibling, 1 reply; 875+ messages in thread
From: Dale Johannesen @ 2006-04-06 18:46 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Dale Johannesen, Geoff Keating, Paolo Bonzini, Roger Sayle, GCC Patches


On Apr 5, 2006, at 8:38 PM, David Edelsohn wrote:

> 	Paolo's patch, this suggestion, and limiting the SYMBOL_REF test
> to just legitimize_reload_address all work.  However, GCC now produces
> worse code with many more memory accesses for FP constants on PowerPC.
>
> 	Was the performance impact of the original patch ever tested on
> anything other than x86?  Reload changes can have significant performance
> impact on all GCC architectures and this patch should have been tested
> much more extensively.

This has been in Apple's branch for months, and is neutral or  
beneficial on Darwin PPC (SPEC and a lot of in-house code).
We aren't seeing the same kind of problems you are on AIX.  I'll see  
if I can get an AIX cross compiler built (don't have an AIX machine  
though)....



^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06 18:46       ` Dale Johannesen
@ 2006-04-06 19:17         ` David Edelsohn
  0 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-04-06 19:17 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: Geoff Keating, Paolo Bonzini, Roger Sayle, GCC Patches

>>>>> Dale Johannesen writes:

Dale> This has been in Apple's branch for months, and is neutral or  
Dale> beneficial on Darwin PPC (SPEC and a lot of in-house code).
Dale> We aren't seeing the same kind of problems you are on AIX.  I'll see  
Dale> if I can get an AIX cross compiler built (don't have an AIX machine  
Dale> though)....

	I don't expect the change to affect Darwin or PPC Linux and I'm
not sure how much the effects are tested by SPEC.  I see a few drops in
PPC64 Linux performance, but I am waiting to see if that is just noise.
Again, the biggest impact appears to be unprototyped functions and stdarg.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06 14:42             ` Paolo Bonzini
@ 2006-04-06 22:44               ` David Edelsohn
  2006-04-07  7:39                 ` Paolo Bonzini
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-04-06 22:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Geoff Keating, Roger Sayle, GCC Patches

	The fundamental cause of this whole problem seems to be the initial
RTL generation of:

(insn 38 37 39 3 (set (mem:SI (plus:SI (reg/f:SI 117 virtual-outgoing-args)
                (const_int 32 [0x20])) [0 S4 A32])
        (subreg:SI (reg:DF 141) 4)) -1 (nil)
    (nil))

despite CANNOT_CHANGE_MODE_CLASS returning true for DFmode.  It's probably
most useful to try to generate better initial RTL, avoiding reload trying
to do something unnatural.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06 22:44               ` David Edelsohn
@ 2006-04-07  7:39                 ` Paolo Bonzini
  2006-04-07 14:01                   ` David Edelsohn
       [not found]                   ` <E5778C9D-8FC1-422A-82ED-ABEF7AFA7685@geoffk.org>
  0 siblings, 2 replies; 875+ messages in thread
From: Paolo Bonzini @ 2006-04-07  7:39 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Geoff Keating, Roger Sayle, GCC Patches

David Edelsohn wrote:
> 	The fundamental cause of all this problem seems to be the initial
> RTL generation of:
> 
> (insn 38 37 39 3 (set (mem:SI (plus:SI (reg/f:SI 117 virtual-outgoing-args)
>                 (const_int 32 [0x20])) [0 S4 A32])
>         (subreg:SI (reg:DF 141) 4)) -1 (nil)
>     (nil))
> 
> despite CANNOT_CHANGE_MODE_CLASS returning true for DFmode.  It's probably
> most useful to try to generate better initial RTL, avoiding reload trying
> to do something unnatural.

This is true (indeed, if we compare 4.2 pre-Dale-patch, and 
post-Dale-patch with my fix, the produced code is slightly better but 
still I would not call it decent).  However, I would also like to hear 
from you or Geoff about my analysis, and especially to know if the 
suspicious pieces I pointed out in his TOC patches are really 
oversights, or just false problems.

Paolo

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-07  7:39                 ` Paolo Bonzini
@ 2006-04-07 14:01                   ` David Edelsohn
       [not found]                   ` <E5778C9D-8FC1-422A-82ED-ABEF7AFA7685@geoffk.org>
  1 sibling, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-04-07 14:01 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Geoff Keating, Roger Sayle, GCC Patches

	BTW, the horrible RTL is generated by expr.c:emit_push_insn() when
partial is non-zero:

          emit_push_insn (operand_subword_force (x, i, mode),
                          word_mode, NULL_TREE, NULL_RTX, align, 0, NULL_RTX,
                          0, args_addr,
                          GEN_INT (args_offset + ((i - not_stack + skip)
                                                  * UNITS_PER_WORD)),
                          reg_parm_stack_space, alignment_pad);

which eventually calls simplify-rtx.c:simplify_gen_subreg():

  if (validate_subreg (outermode, innermode, op, byte))
    return gen_rtx_SUBREG (outermode, op, byte);

where emit-rtl.c:validate_subreg() has the wonderful comment:

  /* ??? This should not be here.  Temporarily continue to allow word_mode
     subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
     Generally, backends are doing something sketchy but it'll take time to
     fix them all.  */
  if (omode == word_mode)
    ;

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-06 14:31           ` David Edelsohn
  2006-04-06 14:42             ` Paolo Bonzini
@ 2006-04-07 20:49             ` Geoff Keating
  1 sibling, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2006-04-07 20:49 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Paolo Bonzini, Roger Sayle, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]


On 06/04/2006, at 7:31 AM, David Edelsohn wrote:

> 	Geoff's proposed patch works for me.  Even a more limited patch
> works:
>
> *** rs6000.c    (revision 112731)
> --- rs6000.c    (working copy)
> *************** rs6000_legitimize_reload_address (rtx x,
> *** 3446,3455 ****
>       }
>
>     if (TARGET_TOC
>         && constant_pool_expr_p (x)
>         && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
>       {
>         (x) = create_TOC_reference (x);
>         *win = 1;
>         return x;
>       }
> --- 3446,3456 ----
>       }
>
>     if (TARGET_TOC
> +       && GET_CODE (x) == SYMBOL_REF
>         && constant_pool_expr_p (x)
>         && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
>       {
>         (x) = create_TOC_reference (x);
>         *win = 1;
>         return x;
>       }
>
> However, the code generated for the testcase now is worse than before
> the regclass change: an extra load and an extra TOC entry.
>
> GCC 4.0.3:
>
> LC..0:
>         .tc FD_0_0[TC],0x0,0x0
> .f:
>         mflr 0
>         stw 0,8(1)
>         stwu 1,-72(1)
>         lwz 11,LC..0(2)
>         lwz 12,LC..0+4(2)
>         stw 12,56(1)
>         li 10,0
>         stw 11,64(1)
>         stw 12,68(1)
>         lfd 4,64(1)
> 	li 3,0
>
> trunk:
>
> LC..0:
>         .tc FD_0_0[TC],0x0,0x0
> LC..1:
>         .tc LC..0.P4[TC],LC..0+4
> .f:
>         mflr 0
>         stw 0,8(1)
>         stwu 1,-80(1)
>         lwz 9,LC..1(2)
>         lwz 9,0(9)
>         stw 9,56(1)
>         li 10,0
>         lfd 4,LC..0(2)
>         li 3,0
>         lwz 4,LC..0(2)
>         lwz 5,LC..0+4(2)
>
>
> The smallest and simplest testcase is an unprototyped function, but the
> real effect of this change is in stdarg functions, which also fail.
>
> 	I do not doubt that the rs6000 port can be modified to recover the
> original performance with the regclass change, however it's not a task
> that I enjoy inheriting.

I believe the modification you want is to allow expressions of the
form (const (plus (symbol_ref) (const_int))) as well as plain
SYMBOL_REFs.  But if you do that, you'll also want to make sure that
get_pool_constant is called with the embedded SYMBOL_REF, not with the
whole expression.
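
A rough sketch of that check, on a toy tree type rather than GCC's real
rtx.  Every name below (sketch_rtx, sketch_embedded_symbol,
sketch_toc_candidate_p, the in_constant_pool flag) is made up for
illustration and is not part of rs6000.c.  The point is only that the
pool lookup is keyed on the embedded symbol, never on the wrapping
expression:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for GCC's rtx; every name here is hypothetical.  */
enum sketch_code
{
  SKETCH_SYMBOL_REF,
  SKETCH_CONST_INT,
  SKETCH_PLUS,
  SKETCH_CONST,
  SKETCH_REG
};

struct sketch_rtx
{
  enum sketch_code code;
  const struct sketch_rtx *op0;  /* First operand, or NULL.  */
  const struct sketch_rtx *op1;  /* Second operand, or NULL.  */
  bool in_constant_pool;         /* Meaningful for SYMBOL_REFs only.  */
};

/* Return the SYMBOL_REF inside X if X is a plain SYMBOL_REF or has the
   shape (const (plus (symbol_ref) (const_int))); otherwise NULL.  */
const struct sketch_rtx *
sketch_embedded_symbol (const struct sketch_rtx *x)
{
  if (x == NULL)
    return NULL;
  if (x->code == SKETCH_SYMBOL_REF)
    return x;
  if (x->code == SKETCH_CONST
      && x->op0 != NULL
      && x->op0->code == SKETCH_PLUS
      && x->op0->op0 != NULL
      && x->op0->op0->code == SKETCH_SYMBOL_REF
      && x->op0->op1 != NULL
      && x->op0->op1->code == SKETCH_CONST_INT)
    return x->op0->op0;
  return NULL;
}

/* True if X could become a TOC reference in this toy model.  The pool
   query uses the embedded symbol, not X itself.  */
bool
sketch_toc_candidate_p (const struct sketch_rtx *x)
{
  const struct sketch_rtx *sym = sketch_embedded_symbol (x);
  return sym != NULL && sym->in_constant_pool;
}
```

In the real port the in_constant_pool test would presumably be
CONSTANT_POOL_ADDRESS_P plus the ASM_OUTPUT_SPECIAL_POOL_ENTRY_P check
on get_pool_constant of the extracted symbol.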
  


^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
       [not found]                       ` <1880BA7B-88BD-41EE-B9E7-33045526437F@geoffk.org>
@ 2006-04-10 16:33                         ` Paolo Bonzini
  2006-04-10 16:40                           ` Geoff Keating
  0 siblings, 1 reply; 875+ messages in thread
From: Paolo Bonzini @ 2006-04-10 16:33 UTC (permalink / raw)
  To: Geoff Keating; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 435 bytes --]


> As I said, I don't believe the chunk of code that calls 
> create_TOC_reference should be using constant_pool_expr_1 at all.  It 
> should be checking explicitly for the patterns for which relocs can be 
> created, either a SYMBOL_REF or a SYMBOL_REF plus a constant.
Compared to what constant_pool_expr_p lets through, this only rules out 
toc-relative expressions.  Do you think the attached (untested) patch 
makes sense?
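
To make the intent concrete, here is a stripped-down model of the
have_sym/have_toc walk with the extra flag.  It is only a sketch: the
mini_* names are invented, the symbol kind is precomputed instead of
being derived from CONSTANT_POOL_ADDRESS_P and the TOC label name, and
the MODE handling is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy expression codes; not GCC's rtx codes.  */
enum mini_code
{
  MINI_POOL_SYMBOL,   /* SYMBOL_REF that names a constant pool entry.  */
  MINI_TOC_LABEL,     /* SYMBOL_REF that names the TOC base label.  */
  MINI_OTHER_SYMBOL,  /* Any other SYMBOL_REF.  */
  MINI_CONST_INT,
  MINI_PLUS,
  MINI_MINUS,
  MINI_CONST
};

struct mini_rtx
{
  enum mini_code code;
  const struct mini_rtx *op0;
  const struct mini_rtx *op1;
};

/* Walk OP, recording whether a pool symbol and/or the TOC label occur.
   Return nonzero if OP contains only acceptable pieces.  */
int
mini_expr_1 (const struct mini_rtx *op, bool *have_sym, bool *have_toc)
{
  switch (op->code)
    {
    case MINI_POOL_SYMBOL:
      *have_sym = true;
      return 1;
    case MINI_TOC_LABEL:
      *have_toc = true;
      return 1;
    case MINI_PLUS:
    case MINI_MINUS:
      return (mini_expr_1 (op->op0, have_sym, have_toc)
              && mini_expr_1 (op->op1, have_sym, have_toc));
    case MINI_CONST:
      return mini_expr_1 (op->op0, have_sym, have_toc);
    case MINI_CONST_INT:
      return 1;
    default:
      return 0;
    }
}

/* With ALLOW_TOC_REL false, anything that mentions the TOC label is
   rejected; everything else is accepted exactly as before.  */
bool
mini_expr_p (const struct mini_rtx *op, bool allow_toc_rel)
{
  bool have_sym = false;
  bool have_toc = false;
  return mini_expr_1 (op, &have_sym, &have_toc)
         && have_sym && (allow_toc_rel || !have_toc);
}
```

In this model a plain pool symbol, or a pool symbol plus an offset, is
accepted with either flag value, while (minus pool_symbol toc_label) is
accepted only when allow_toc_rel is true.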

Paolo

[-- Attachment #2: fix-toc-bug.patch --]
[-- Type: text/plain, Size: 5215 bytes --]

2006-04-04  Paolo Bonzini  <bonzini@gnu.org>

	* rs6000.c (constant_pool_expr_1): Add MODE parameter defaulting
	to the pool constant's mode.
	(constant_pool_expr_p): Add a MODE parameter, pass it.  Add
	ALLOW_TOC_REL parameter.
	(toc_relative_expr_p): Adjust call.
	(legitimate_constant_pool_address_p, rs6000_legitimize_reload_address,
	rs6000_emit_move): Merge invocations of ASM_OUTPUT_SPECIAL_POOL_ENTRY_P
	with preceding calls to constant_pool_expr_p.

Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 112658)
+++ config/rs6000/rs6000.c	(working copy)
@@ -588,8 +588,8 @@ static void rs6000_emit_allocate_stack (
 static unsigned rs6000_hash_constant (rtx);
 static unsigned toc_hash_function (const void *);
 static int toc_hash_eq (const void *, const void *);
-static int constant_pool_expr_1 (rtx, int *, int *);
-static bool constant_pool_expr_p (rtx);
+static int constant_pool_expr_1 (rtx, enum machine_mode, int *, int *);
+static bool constant_pool_expr_p (rtx, enum machine_mode, bool);
 static bool legitimate_small_data_p (enum machine_mode, rtx);
 static bool legitimate_indexed_address_p (rtx, int);
 static bool legitimate_lo_sum_address_p (enum machine_mode, rtx, int);
@@ -2642,7 +2642,8 @@ gpr_or_gpr_p (rtx op0, rtx op1)
 /* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address.  */
 
 static int
-constant_pool_expr_1 (rtx op, int *have_sym, int *have_toc)
+constant_pool_expr_1 (rtx op, enum machine_mode mode,
+		      int *have_sym, int *have_toc)
 {
   switch (GET_CODE (op))
     {
@@ -2651,7 +2652,9 @@ constant_pool_expr_1 (rtx op, int *have_
 	return 0;
       else if (CONSTANT_POOL_ADDRESS_P (op))
 	{
-	  if (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (op), Pmode))
+	  if (mode == VOIDmode)
+	    mode = get_pool_mode (op);
+	  if (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (op), mode))
 	    {
 	      *have_sym = 1;
 	      return 1;
@@ -2668,10 +2671,10 @@ constant_pool_expr_1 (rtx op, int *have_
 	return 0;
     case PLUS:
     case MINUS:
-      return (constant_pool_expr_1 (XEXP (op, 0), have_sym, have_toc)
-	      && constant_pool_expr_1 (XEXP (op, 1), have_sym, have_toc));
+      return (constant_pool_expr_1 (XEXP (op, 0), mode, have_sym, have_toc)
+	      && constant_pool_expr_1 (XEXP (op, 1), mode, have_sym, have_toc));
     case CONST:
-      return constant_pool_expr_1 (XEXP (op, 0), have_sym, have_toc);
+      return constant_pool_expr_1 (XEXP (op, 0), mode, have_sym, have_toc);
     case CONST_INT:
       return 1;
     default:
@@ -2680,11 +2683,12 @@ constant_pool_expr_1 (rtx op, int *have_
 }
 
 static bool
-constant_pool_expr_p (rtx op)
+constant_pool_expr_p (rtx op, enum machine_mode mode, bool allow_toc_rel)
 {
   int have_sym = 0;
   int have_toc = 0;
-  return constant_pool_expr_1 (op, &have_sym, &have_toc) && have_sym;
+  return constant_pool_expr_1 (op, mode, &have_sym, &have_toc)
+	 && have_sym && (allow_toc_rel || !have_toc);
 }
 
 bool
@@ -2692,7 +2696,7 @@ toc_relative_expr_p (rtx op)
 {
   int have_sym = 0;
   int have_toc = 0;
-  return constant_pool_expr_1 (op, &have_sym, &have_toc) && have_toc;
+  return constant_pool_expr_1 (op, Pmode, &have_sym, &have_toc) && have_toc;
 }
 
 bool
@@ -2702,7 +2706,7 @@ legitimate_constant_pool_address_p (rtx 
 	  && GET_CODE (x) == PLUS
 	  && GET_CODE (XEXP (x, 0)) == REG
 	  && (TARGET_MINIMAL_TOC || REGNO (XEXP (x, 0)) == TOC_REGISTER)
-	  && constant_pool_expr_p (XEXP (x, 1)));
+	  && constant_pool_expr_p (XEXP (x, 1), Pmode, true));
 }
 
 static bool
@@ -3004,9 +3008,7 @@ rs6000_legitimize_address (rtx x, rtx ol
       emit_insn (gen_macho_high (reg, x));
       return gen_rtx_LO_SUM (Pmode, reg, x);
     }
-  else if (TARGET_TOC
-	   && constant_pool_expr_p (x)
-	   && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), Pmode))
+  else if (TARGET_TOC && constant_pool_expr_p (x, Pmode, false))
     {
       return create_TOC_reference (x);
     }
@@ -3440,8 +3442,7 @@ rs6000_legitimize_reload_address (rtx x,
     }
 
   if (TARGET_TOC
-      && constant_pool_expr_p (x)
-      && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
+      && constant_pool_expr_p (x, mode, false))
     {
       (x) = create_TOC_reference (x);
       *win = 1;
@@ -4112,9 +4113,7 @@ rs6000_emit_move (rtx dest, rtx source, 
 	 reference to it.  */
       if (TARGET_TOC
 	  && GET_CODE (operands[1]) == SYMBOL_REF
-	  && constant_pool_expr_p (operands[1])
-	  && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (operands[1]),
-					      get_pool_mode (operands[1])))
+	  && constant_pool_expr_p (operands[1], VOIDmode, false))
 	{
 	  operands[1] = create_TOC_reference (operands[1]);
 	}
@@ -4179,10 +4178,7 @@ rs6000_emit_move (rtx dest, rtx source, 
 	  operands[1] = force_const_mem (mode, operands[1]);
 
 	  if (TARGET_TOC
-	      && constant_pool_expr_p (XEXP (operands[1], 0))
-	      && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (
-			get_pool_constant (XEXP (operands[1], 0)),
-			get_pool_mode (XEXP (operands[1], 0))))
+	      && constant_pool_expr_p (XEXP (operands[1], 0), VOIDmode, true))
 	    {
 	      operands[1]
 		= gen_const_mem (mode,

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [RFT/RFA] Fix AIX fallout from PR/19653 patch
  2006-04-10 16:33                         ` Paolo Bonzini
@ 2006-04-10 16:40                           ` Geoff Keating
  0 siblings, 0 replies; 875+ messages in thread
From: Geoff Keating @ 2006-04-10 16:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 6211 bytes --]


On 10/04/2006, at 9:33 AM, Paolo Bonzini wrote:

>
>> As I said, I don't believe the chunk of code that calls  
>> create_TOC_reference should be using constant_pool_expr_1 at all.   
>> It should be checking explicitly for the patterns for which relocs  
>> can be created, either a SYMBOL_REF or a SYMBOL_REF plus a constant.
> Compared to what constant_pool_expr_p lets through, this only rules  
> out toc-relative expressions.

I think someone should make a list of what kinds of expressions can  
be made that involve TOC symbols and constant_pool_expr_p should  
allow only those.

>   Do you think the attached (untested) patch makes sense?

I don't know.

> Paolo



^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PowerPC] Avoid ICE on DFmode subreg
@ 2006-04-12  0:50 Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2006-04-12  0:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This change should fix the AIX testcase failures caused by weird
subregs, by simply not generating them in the first place.

This testcase (-m32 -O2 -mcall-aixdesc -mxl-compat on linux)
  extern int g();
  void f (void) { g (0, 0.0, 0.0, 0.0, 0.0); }
ICEs since Dale's reg alloc changes when trying to handle the last
double arg stored partially in memory and partially in a gpr as well as
in a fpr.
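
For reference, the arithmetic behind "partially in memory and partially
in a gpr" can be sketched as below.  This is an illustration only: the
names are invented, and it assumes 32-bit words, eight GPR argument
words (r3..r10), and doubles that take two consecutive words with no
extra alignment.

```c
/* Assumed number of argument words that fit in GPRs (r3..r10).  */
#define ARGSPLIT_GP_ARG_NUM_REG 8

/* How an argument's words divide between GPRs and the stack.  */
struct argsplit
{
  int words_in_gprs;
  int words_in_memory;
};

/* ALIGN_WORDS is the index of the first parameter word the argument
   occupies; N_WORDS is its size in words.  */
struct argsplit
argsplit_compute (int align_words, int n_words)
{
  struct argsplit s;
  int room = ARGSPLIT_GP_ARG_NUM_REG - align_words;

  if (room < 0)
    room = 0;
  s.words_in_gprs = n_words < room ? n_words : room;
  s.words_in_memory = n_words - s.words_in_gprs;
  return s;
}
```

Under those assumptions the int in the testcase takes word 0 and the
four doubles start at words 1, 3, 5 and 7, so the last one gives
argsplit_compute (7, 2): one word in the last GPR and one word on the
stack, which is the align_words + n_words > GP_ARG_NUM_REG case.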

Bootstrapped and regression tested powerpc-linux and powerpc64-linux.
OK mainline?

	* config/rs6000/rs6000.c (rs6000_mixed_function_arg): Update
	magic NULL_RTX comment.
	(function_arg): Store entire fp arg to mem if any part should go
	on stack.
	(rs6000_arg_partial_bytes): Adjust for above change.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 112849)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5058,17 +5072,13 @@ rs6000_mixed_function_arg (enum machine_
   if (align_words + n_units > GP_ARG_NUM_REG)
     /* Not all of the arg fits in gprs.  Say that it goes in memory too,
        using a magic NULL_RTX component.
-       FIXME: This is not strictly correct.  Only some of the arg
-       belongs in memory, not all of it.  However, there isn't any way
-       to do this currently, apart from building rtx descriptions for
-       the pieces of memory we want stored.  Due to bugs in the generic
-       code we can't use the normal function_arg_partial_nregs scheme
-       with the PARALLEL arg description we emit here.
-       In any case, the code to store the whole arg to memory is often
-       more efficient than code to store pieces, and we know that space
-       is available in the right place for the whole arg.  */
-    /* FIXME: This should be fixed since the conversion to
-       TARGET_ARG_PARTIAL_BYTES.  */
+       This is not strictly correct.  Only some of the arg belongs in
+       memory, not all of it.  However, the normal scheme using
+       function_arg_partial_nregs can result in unusual subregs, eg.
+       (subreg:SI (reg:DF) 4), which are not handled well.  The code to
+       store the whole arg to memory is often more efficient than code
+       to store pieces, and we know that space is available in the right
+       place for the whole arg.  */
     rvec[k++] = gen_rtx_EXPR_LIST (VOIDmode, NULL_RTX, const0_rtx);
 
   i = 0;
@@ -5310,9 +5320,8 @@ function_arg (CUMULATIVE_ARGS *cum, enum
 			 include the portion actually in registers here.  */
 		      enum machine_mode rmode = TARGET_32BIT ? SImode : DImode;
 		      rtx off;
-		      int i=0;
-		      if (align_words + n_words > GP_ARG_NUM_REG
-			  && (TARGET_32BIT && TARGET_POWERPC64))
+		      int i = 0;
+		      if (align_words + n_words > GP_ARG_NUM_REG)
 			/* Not all of the arg fits in gprs.  Say that it
 			   goes in memory too, using a magic NULL_RTX
 			   component.  Also see comment in
@@ -5391,18 +5400,20 @@ rs6000_arg_partial_bytes (CUMULATIVE_ARG
 
   align_words = rs6000_parm_start (mode, type, cum->words);
 
-  if (USE_FP_FOR_ARG_P (cum, mode, type)
+  if (USE_FP_FOR_ARG_P (cum, mode, type))
+    {
       /* If we are passing this arg in the fixed parameter save area
 	 (gprs or memory) as well as fprs, then this function should
-	 return the number of bytes passed in the parameter save area
-	 rather than bytes passed in fprs.  */
-      && !(type
-	   && (cum->nargs_prototype <= 0
-	       || (DEFAULT_ABI == ABI_AIX
-		   && TARGET_XL_COMPAT
-		   && align_words >= GP_ARG_NUM_REG))))
-    {
-      if (cum->fregno + ((GET_MODE_SIZE (mode) + 7) >> 3) > FP_ARG_MAX_REG + 1)
+	 return the number of partial bytes passed in the parameter
+	 save area rather than partial bytes passed in fprs.  */
+      if (type
+	  && (cum->nargs_prototype <= 0
+	      || (DEFAULT_ABI == ABI_AIX
+		  && TARGET_XL_COMPAT
+		  && align_words >= GP_ARG_NUM_REG)))
+	return 0;
+      else if (cum->fregno + ((GET_MODE_SIZE (mode) + 7) >> 3)
+	       > FP_ARG_MAX_REG + 1)
 	ret = (FP_ARG_MAX_REG + 1 - cum->fregno) * 8;
       else if (cum->nargs_prototype >= 0)
 	return 0;

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PowerPC] Avoid ICE on DFmode subreg
       [not found]           ` <amodra@bigpond.net.au>
                               ` (59 preceding siblings ...)
  2006-04-05  2:26             ` [RFT/RFA] Fix AIX fallout from PR/19653 patch David Edelsohn
@ 2006-04-12  0:53             ` David Edelsohn
  2006-07-07 13:08             ` [PATCH, committed] PR 28150 and PR 28170 David Edelsohn
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-04-12  0:53 UTC (permalink / raw)
  To: gcc-patches

	* config/rs6000/rs6000.c (rs6000_mixed_function_arg): Update
	magic NULL_RTX comment.
	(function_arg): Store entire fp arg to mem if any part should go
	on stack.
	(rs6000_arg_partial_bytes): Adjust for above change.

Great!

Thanks, David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH, committed] PR 28150 and PR 28170
@ 2006-07-06 19:53 David Edelsohn
  2006-07-07  3:07 ` Alan Modra
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-07-06 19:53 UTC (permalink / raw)
  To: gcc-patches

	These patches fix PR 28150 (TFmode PRE_INC) and PR 28170 (insv
with shift).  The insv function cleanups were incorporated from Alan
Modra's patch.

Bootstrapped and regression tested on powerpc-ibm-aix5.2.0.0 and
powerpc-linux. 

David

	PR target/28150
	* config/rs6000/rs6000.c (rs6000_legitimate_address): Do not allow
	PRE_{INC,DEC} of TFmode.

	PR target/28170
	* config/rs6000/rs6000.c (insvdi_rshift_rlwimi_p): Correct shiftop
	bounds. Simplify.

Index: rs6000.c
===================================================================
*** rs6000.c	(revision 115219)
--- rs6000.c	(working copy)
*************** rs6000_legitimate_address (enum machine_
*** 3522,3527 ****
--- 3522,3528 ----
    if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC)
        && !ALTIVEC_VECTOR_MODE (mode)
        && !SPE_VECTOR_MODE (mode)
+       && mode != TFmode
        /* Restrict addressing for DI because of our SUBREG hackery.  */
        && !(TARGET_E500_DOUBLE && (mode == DFmode || mode == DImode))
        && TARGET_UPDATE
*************** effects of instruction do not correspond
*** 9799,9810 ****
  int
  insvdi_rshift_rlwimi_p (rtx sizeop, rtx startop, rtx shiftop)
  {
!   if (INTVAL (startop) < 64
!       && INTVAL (startop) > 32
!       && (INTVAL (sizeop) + INTVAL (startop) < 64)
!       && (INTVAL (sizeop) + INTVAL (startop) > 33)
!       && (INTVAL (sizeop) + INTVAL (startop) + INTVAL (shiftop) < 96)
!       && (INTVAL (sizeop) + INTVAL (startop) + INTVAL (shiftop) >= 64)
        && (64 - (INTVAL (shiftop) & 63)) >= INTVAL (sizeop))
      return 1;
  
--- 9800,9811 ----
  int
  insvdi_rshift_rlwimi_p (rtx sizeop, rtx startop, rtx shiftop)
  {
!   if (INTVAL (startop) > 32
!       && INTVAL (startop) < 64
!       && INTVAL (sizeop) > 1
!       && INTVAL (sizeop) + INTVAL (startop) < 64
!       && INTVAL (shiftop) > 0
!       && INTVAL (sizeop) + INTVAL (shiftop) < 32
        && (64 - (INTVAL (shiftop) & 63)) >= INTVAL (sizeop))
      return 1;
  

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PR 28150 and PR 28170
  2006-07-06 19:53 [PATCH, committed] PR 28150 and PR 28170 David Edelsohn
@ 2006-07-07  3:07 ` Alan Modra
  0 siblings, 0 replies; 875+ messages in thread
From: Alan Modra @ 2006-07-07  3:07 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches

On Thu, Jul 06, 2006 at 03:02:50PM -0400, David Edelsohn wrote:
>   int
>   insvdi_rshift_rlwimi_p (rtx sizeop, rtx startop, rtx shiftop)
>   {
> !   if (INTVAL (startop) > 32
> !       && INTVAL (startop) < 64
> !       && INTVAL (sizeop) > 1

Why disallow single-bit fields?

> !       && INTVAL (sizeop) + INTVAL (startop) < 64
> !       && INTVAL (shiftop) > 0

This could be >= 0 as far as the insn capabilities are concerned.  I
suppose you might never see a shift count of zero due to rtl
simplification.

> !       && INTVAL (sizeop) + INTVAL (shiftop) < 32
>         && (64 - (INTVAL (shiftop) & 63)) >= INTVAL (sizeop))

This last condition is already covered by the previous two lines.  It
rearranges as
         && 64 >= INTVAL (sizeop) + (INTVAL (shiftop) & 63)

>       return 1;
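
A quick brute-force way to convince oneself of the redundancy, written
as a standalone sketch (the function names are mine, not from
rs6000.c): it counts (size, shift) pairs that satisfy the two preceding
clauses but fail the final one.

```c
#include <stdbool.h>

/* The two clauses that precede the final one in the patch.  */
bool
two_prior_clauses (int size, int shift)
{
  return shift > 0 && size + shift < 32;
}

/* The final clause, as written in the patch.  */
bool
final_clause (int size, int shift)
{
  return (64 - (shift & 63)) >= size;
}

/* Count pairs in [-LIMIT, LIMIT] x [-LIMIT, LIMIT] that pass the two
   prior clauses yet fail the final clause.  */
int
count_counterexamples (int limit)
{
  int bad = 0;

  for (int size = -limit; size <= limit; size++)
    for (int shift = -limit; shift <= limit; shift++)
      if (two_prior_clauses (size, shift) && !final_clause (size, shift))
        bad++;
  return bad;
}
```

count_counterexamples (1000) returns 0.  For positive sizes the two
clauses force shift into 1..30, so shift & 63 is just shift and
size + shift is below 64; for sizes of zero or less the left-hand side
of the final clause is always at least 1.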

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH, committed] PR 28150 and PR 28170
       [not found]           ` <amodra@bigpond.net.au>
                               ` (60 preceding siblings ...)
  2006-04-12  0:53             ` [PowerPC] Avoid ICE on DFmode subreg David Edelsohn
@ 2006-07-07 13:08             ` David Edelsohn
  61 siblings, 0 replies; 875+ messages in thread
From: David Edelsohn @ 2006-07-07 13:08 UTC (permalink / raw)
  To: gcc-patches

	GCC is in regression-only mode.  Other than propagating 32 through
the tests, I did not want to change the behavior of the tests.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [PATCH] Fix comments in PPC e500 code
@ 2006-08-09 12:49 Eric Botcazou
  2006-08-09 14:02 ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Eric Botcazou @ 2006-08-09 12:49 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 823 bytes --]

Hi,

We were just bitten on the 3.4 branch by the GT/EQ bit ping-pong game in the 
implementation of the FP compare instructions on the PPC e500.  The final fix
  http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00704.html
is only present on 3.4-e500-branch but not on the official 3.4 branch.

A perusal of the final code shows leftovers from earlier stages in
some comments.  To avoid any more confusion, we think it is important to
have the comments match the code exactly.  Hence the attached patch.

Tested by building a cross to powerpc-eabispe.  OK for mainline?


2006-08-09  Eric Botcazou  <ebotcazou@adacore.com>

	* config/rs6000/rs6000.c (print_operand) <D>: Fix comment and adjust.
	(rs6000_generate_compare): Tweak comments.
	* config/rs6000/rs6000.md (UNSPEC_MV_CR_GT): Fix comment.


-- 
Eric Botcazou

[-- Attachment #2: f803-024-2_fsf.diff --]
[-- Type: text/x-diff, Size: 1995 bytes --]

Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 115944)
+++ config/rs6000/rs6000.c	(working copy)
@@ -10283,13 +10283,14 @@ print_operand (FILE *file, rtx x, int co
       return;
 
     case 'D':
-      /* Like 'J' but get to the EQ bit.  */
+      /* Like 'J' but get to the GT bit only.  */
       gcc_assert (GET_CODE (x) == REG);
 
-      /* Bit 1 is EQ bit.  */
-      i = 4 * (REGNO (x) - CR0_REGNO) + 2;
+      /* Bit 1 is GT bit.  */
+      i = 4 * (REGNO (x) - CR0_REGNO) + 1;
 
-      fprintf (file, "%d", i);
+      /* Add one for shift count in rlinm for scc.  */
+      fprintf (file, "%d", i+1);
       return;
 
     case 'E':
@@ -11086,7 +11087,7 @@ rs6000_generate_compare (enum rtx_code c
   /* First, the compare.  */
   compare_result = gen_reg_rtx (comp_mode);
 
-  /* SPE FP compare instructions on the GPRs.  Yuck!  */
+  /* E500 FP compare instructions on the GPRs.  Yuck!  */
   if ((TARGET_E500 && !TARGET_FPRS && TARGET_HARD_FLOAT)
       && rs6000_compare_fp_p)
     {
@@ -11096,8 +11097,8 @@ rs6000_generate_compare (enum rtx_code c
       if (op_mode == VOIDmode)
 	op_mode = GET_MODE (rs6000_compare_op1);
 
-      /* Note: The E500 comparison instructions set the GT bit (x +
-	 1), on success.  This explains the mess.  */
+      /* The E500 FP compare instructions toggle the GT bit (CR bit 1) only.
+	 This explains the following mess.  */
 
       switch (code)
 	{
Index: config/rs6000/rs6000.md
===================================================================
--- config/rs6000/rs6000.md	(revision 115944)
+++ config/rs6000/rs6000.md	(working copy)
@@ -55,7 +55,7 @@ (define_constants
    (UNSPEC_TLSGOTTPREL		28)
    (UNSPEC_TLSTLS		29)
    (UNSPEC_FIX_TRUNC_TF		30)	; fadd, rounding towards zero
-   (UNSPEC_MV_CR_GT		31)	; move_from_CR_eq_bit
+   (UNSPEC_MV_CR_GT		31)	; move_from_CR_gt_bit
    (UNSPEC_STFIWX		32)
    (UNSPEC_POPCNTB		33)
    (UNSPEC_FRES			34)

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix comments in PPC e500 code
  2006-08-09 12:49 [PATCH] Fix comments in PPC e500 code Eric Botcazou
@ 2006-08-09 14:02 ` David Edelsohn
  2006-08-09 14:15   ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-08-09 14:02 UTC (permalink / raw)
  To: Eric Botcazou, Edmar Wienskoski; +Cc: gcc-patches

	* config/rs6000/rs6000.c (print_operand) <D>: Fix comment and adjust.
	(rs6000_generate_compare): Tweak comments.
	* config/rs6000/rs6000.md (UNSPEC_MV_CR_GT): Fix comment.

	Do you have access to an e500 to actually test this code?  If GCC
has been setting the wrong bit, how has this been working all along, or
does no one use this FP feature?  I do not understand how people who are
building and testing e500 from mainline are not seeing this problem.  I
would have expected an open PR.

	This patch is okay with me if Freescale concurs.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix comments in PPC e500 code
  2006-08-09 14:02 ` David Edelsohn
@ 2006-08-09 14:15   ` Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2006-08-09 14:15 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Eric Botcazou, Edmar Wienskoski, gcc-patches

On Wed, Aug 09, 2006 at 09:45:42AM -0400, David Edelsohn wrote:
> 	* config/rs6000/rs6000.c (print_operand) <D>: Fix comment and adjust.
> 	(rs6000_generate_compare): Tweak comments.
> 	* config/rs6000/rs6000.md (UNSPEC_MV_CR_GT): Fix comment.
> 
> 	Do you have access to an e500 to actually test this code?  If GCC
> has been setting the wrong bit, how has this been working all along, or
> does no one use this FP feature?  I do not understand how people who are
> building and testing e500 from mainline are not seeing this problem.  I
> would have expected an open PR.
> 
> 	This patch is okay with me if Freescale concurs.

I had to read the patch twice to work it out, but Eric didn't actually
change the code; he changed an i = .... + 2 followed by a use of i, to
an i = .... + 1 followed by a use of i+1.
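
Spelled out as a tiny sketch (the function names are invented, and the
value 68 for CR0_REGNO is assumed here purely for illustration), the
old and new forms print the same number for every CR field:

```c
/* Assumed register number of CR0, for illustration only.  */
#define DEMO_CR0_REGNO 68

/* Old form: compute the value directly with + 2 and print i.  */
int
demo_old_d_operand (int regno)
{
  int i = 4 * (regno - DEMO_CR0_REGNO) + 2;
  return i;
}

/* New form: index of the GT bit (+ 1), then print i + 1.  */
int
demo_new_d_operand (int regno)
{
  int i = 4 * (regno - DEMO_CR0_REGNO) + 1;
  return i + 1;
}
```

Both functions return 4 * field + 2 for CR field number `field`; only
the way the intermediate value is described differs.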

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix comments in PPC e500 code
       [not found]                               ` <drow@false.org>
                                                   ` (3 preceding siblings ...)
  2005-11-09 19:39                                 ` [PATCH] volatile global register variable David Edelsohn
@ 2006-08-09 14:59                                 ` David Edelsohn
  2006-08-09 16:21                                   ` Eric Botcazou
  2007-07-31 12:37                                 ` [patch] set -mabi=altivec with -ftree-vectorize David Edelsohn
                                                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-08-09 14:59 UTC (permalink / raw)
  To: Eric Botcazou, Edmar Wienskoski, gcc-patches

>>>>> Daniel Jacobowitz writes:

Daniel> I had to read the patch twice to work it out, but Eric didn't actually
Daniel> change the code; he changed an i = .... + 2 followed by a use of i, to
Daniel> an i = .... + 1 followed by a use of i+1.

	Hmm, well I inferred that more had changed from the term "adjust"
in the ChangeLog entry, especially with no clarification in the email
posting.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix comments in PPC e500 code
  2006-08-09 14:59                                 ` [PATCH] Fix comments in PPC e500 code David Edelsohn
@ 2006-08-09 16:21                                   ` Eric Botcazou
  2006-08-09 16:22                                     ` David Edelsohn
  0 siblings, 1 reply; 875+ messages in thread
From: Eric Botcazou @ 2006-08-09 16:21 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Edmar Wienskoski, gcc-patches

> 	Hmm, well I inferred that more had changed from the term "adjust"
> in the ChangeLog entry, especially with no clarification in the email
> posting.

Sorry, I thought the subject would say it all.  Yes, the patch is strictly a
no-op for the compiler, but not for the compiler writer.  I banged my head for
a couple of hours yesterday on this GT/EQ bit game.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix comments in PPC e500 code
  2006-08-09 16:21                                   ` Eric Botcazou
@ 2006-08-09 16:22                                     ` David Edelsohn
  2006-08-09 16:53                                       ` Eric Botcazou
  0 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2006-08-09 16:22 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Edmar Wienskoski, gcc-patches

>>>>> Eric Botcazou writes:

Eric> Sorry, I thought the subject would say it all.  Yes, the patch is strictly a
Eric> no-op for the compiler, but not for the compiler writer.  I banged my head for
Eric> a couple of hours yesterday on this GT/EQ bit game.

	This patch is okay.  One could have explained the +2 in the
original comment, but if this is easier for e500 developers to understand,
that's okay with me.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [PATCH] Fix comments in PPC e500 code
  2006-08-09 16:22                                     ` David Edelsohn
@ 2006-08-09 16:53                                       ` Eric Botcazou
  0 siblings, 0 replies; 875+ messages in thread
From: Eric Botcazou @ 2006-08-09 16:53 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Edmar Wienskoski, gcc-patches

> 	This patch is okay.  One could have explained the +2 in the
> original comment, but if this is easier for e500 developers to understand,
> that's okay with me.

Thanks.  The +1+1 idiom is copied from the J case.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 875+ messages in thread

* [patch] set -mabi=altivec with -ftree-vectorize
@ 2007-07-31  8:20 Zdenek Dvorak
  2007-07-31 12:01 ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: Zdenek Dvorak @ 2007-07-31  8:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: dje

Hello,

using -O2 -maltivec -ftree-vectorize on ppc-linux leads to
miscompilations, as -mabi=altivec is not enabled there.
It would be somewhat annoying and error-prone to require a normal
user to know this and add the extra option.  As David Edelsohn
suggested, this patch makes us enable -mabi=altivec automatically
with -ftree-vectorize.

Bootstrapped & regtested on ppc-linux.

Zdenek

	* config/rs6000/rs6000.c (rs6000_override_options): Enable
	-mabi=altivec with -ftree-vectorize.
	* gcc.dg/pr32582-1.c: New test.

Index: testsuite/gcc.dg/pr32582-1.c
===================================================================
*** testsuite/gcc.dg/pr32582-1.c	(revision 0)
--- testsuite/gcc.dg/pr32582-1.c	(revision 0)
***************
*** 0 ****
--- 1,32 ----
+ /* { dg-do run { target { powerpc*-*-* && powerpc_altivec_ok } } } */
+ /* { dg-options "-O2 -ftree-vectorize -maltivec" } */
+ 
+ #include <stdlib.h>
+ #include <string.h>
+ 
+ char a[64];
+ 
+ void set (void)
+ {
+   int i;
+ 
+   for (i = 0; i < 64; i++)
+     a[i] = 'x';
+ }
+ 
+ void check (void)
+ {
+   int i;
+ 
+   for (i = 0; i < 64; i++)
+     if (a[i] != 0)
+       abort ();
+ }
+ 
+ int main (void)
+ {
+   set ();
+   memset (a, 0, sizeof a);
+   check ();
+   return 0;
+ }
Index: config/rs6000/rs6000.c
===================================================================
*** config/rs6000/rs6000.c	(revision 127065)
--- config/rs6000/rs6000.c	(working copy)
*************** rs6000_override_options (const char *def
*** 1537,1544 ****
      rs6000_ieeequad = 1;
  #endif
  
!   /* Set Altivec ABI as default for powerpc64 linux.  */
!   if (TARGET_ELF && TARGET_64BIT)
      {
        rs6000_altivec_abi = 1;
        TARGET_ALTIVEC_VRSAVE = 1;
--- 1537,1547 ----
      rs6000_ieeequad = 1;
  #endif
  
!   /* Set Altivec ABI as default for powerpc64 linux.  Also, if
!      autovectorization is run, enable Altivec ABI in order to
!      prevent misscompilations.  */
!   if ((TARGET_ELF && TARGET_64BIT)
!       || (TARGET_ALTIVEC && flag_tree_vectorize))
      {
        rs6000_altivec_abi = 1;
        TARGET_ALTIVEC_VRSAVE = 1;

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [patch] set -mabi=altivec with -ftree-vectorize
  2007-07-31  8:20 [patch] set -mabi=altivec with -ftree-vectorize Zdenek Dvorak
@ 2007-07-31 12:01 ` Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-07-31 12:01 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: gcc-patches, dje

On Tue, Jul 31, 2007 at 09:14:57AM +0200, Zdenek Dvorak wrote:
> Hello,
> 
> using -O2 -maltivec -ftree-vectorize on ppc-linux leads to
> miscompilations, as -mabi=altivec is not enabled there.
> It would be somewhat annoying and error-prone to require a normal
> user to know this and add the extra option.  As David Edelsohn
> suggested, this patch makes us enable -mabi=altivec automatically
> with -ftree-vectorize.

Is that really a good idea?  It makes sense that -mabi= is an
ABI-changing option, but I'd be surprised if adding -ftree-vectorize
changed the ABI.

BTW, we recently made the linker warn about mismatches between soft
and hard float PowerPC code.  I have a patch to warn about mismatches
between code with -mabi=altivec and code without it.  I just haven't
posted it yet because PR 32919 breaks my test scripts.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [patch] set -mabi=altivec with -ftree-vectorize
       [not found]                               ` <drow@false.org>
                                                   ` (4 preceding siblings ...)
  2006-08-09 14:59                                 ` [PATCH] Fix comments in PPC e500 code David Edelsohn
@ 2007-07-31 12:37                                 ` David Edelsohn
  2007-07-31 14:14                                   ` Daniel Jacobowitz
  2007-08-15 22:22                                 ` Mark PowerPC vector ABI in binary attributes David Edelsohn
  2007-09-04 13:05                                 ` Fix a reload ICE on e500 David Edelsohn
  7 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2007-07-31 12:37 UTC (permalink / raw)
  To: Zdenek Dvorak, gcc-patches

>>>>> Daniel Jacobowitz writes:

Daniel> Is that really a good idea?  It makes sense that -mabi= is an
Daniel> ABI-changing option, but I'd be surprised if adding -ftree-vectorize
Daniel> changed the ABI.

Daniel,

	Without automatically enabling -mabi=altivec, GCC silently
generates wrong code.

	The only other option I see is to error that -mabi=altivec needs
to be enabled with -ftree-vectorize -maltivec for the targets that do not
enable it automatically.  Generating wrong code needs more than a warning.
Do you think that is a better solution?

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [patch] set -mabi=altivec with -ftree-vectorize
  2007-07-31 12:37                                 ` [patch] set -mabi=altivec with -ftree-vectorize David Edelsohn
@ 2007-07-31 14:14                                   ` Daniel Jacobowitz
  2007-08-07 15:36                                     ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-07-31 14:14 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Zdenek Dvorak, gcc-patches

On Tue, Jul 31, 2007 at 08:02:12AM -0400, David Edelsohn wrote:
> >>>>> Daniel Jacobowitz writes:
> 
> Daniel> Is that really a good idea?  It makes sense that -mabi= is an
> Daniel> ABI-changing option, but I'd be surprised if adding -ftree-vectorize
> Daniel> changed the ABI.
> 
> Daniel,
> 
> 	Without automatically enabling -mabi=altivec, GCC silently
> generates wrong code.
> 
> 	The only other option I see is to error that -mabi=altivec needs
> to be enabled with -ftree-vectorize -maltivec for the targets that do not
> enable it automatically.  Generating wrong code needs more than a warning.
> Do you think that is a better solution?

I didn't realize from the patch that wrong code was involved.  I
certainly agree that silently generating wrong code is not an option
:-)  You'll already know most of this message, but let me outline it
so that everyone's looking from the same side.

Here's what goes wrong in the test case from that patch (- is
-maltivec, + is -maltivec -mabi=altivec):

 main:
        stwu 1,-16(1)
        mflr 0
-       vxor 0,0,0
        stw 0,20(1)
        bl set
        lis 9,a@ha
        la 9,a@l(9)
+       vxor 0,0,0
        addi 11,9,48
        stvx 0,0,9
        stvx 0,0,11

-ftree-vectorize causes "set" to use AltiVec registers, and clobber
v0.  Then the vector code in main (an inlined memset) uses the
clobbered v0.  -ftree-vectorize will make this problem much more
likely, but there's no dependency on it; if you wrote the set and
memset using AltiVec builtin functions, the same thing could go
wrong.  The problem is using just -maltivec.
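The ordering problem in the listing above can be mimicked in plain C.
This is only a toy model under stated assumptions, not AltiVec code:
v0 is an ordinary global array standing in for the vector register, and
store_v0, set_sim and run_sim are names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: v0 stands in for AltiVec register v0.  */
static unsigned char v0[16];
static unsigned char a[64];

/* Models "stvx 0,0,rN": store v0 to a 16-byte-aligned slot of a[].  */
static void store_v0 (size_t offset)
{
  assert (offset % 16 == 0 && offset + 16 <= sizeof a);
  memcpy (a + offset, v0, sizeof v0);
}

/* Vectorized set(): splats 'x' into v0 and stores it, clobbering v0.  */
static void set_sim (void)
{
  memset (v0, 'x', sizeof v0);
  for (size_t i = 0; i < sizeof a; i += 16)
    store_v0 (i);
}

/* main() from the test case.  zero_before_call selects the "-" ordering,
   where vxor is scheduled ahead of "bl set".  Returns true if check()
   would pass.  */
static bool run_sim (bool zero_before_call)
{
  if (zero_before_call)
    memset (v0, 0, sizeof v0);	/* vxor 0,0,0 */
  set_sim ();			/* bl set */
  if (!zero_before_call)
    memset (v0, 0, sizeof v0);	/* vxor 0,0,0 */

  for (size_t i = 0; i < sizeof a; i += 16)
    store_v0 (i);		/* inlined memset */

  for (size_t i = 0; i < sizeof a; i++)
    if (a[i] != 0)
      return false;
  return true;
}
```

In this model run_sim (true) corresponds to the "-" listing and
run_sim (false) to the "+" listing; only the latter leaves a[] zeroed.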

With -mabi=altivec, v0 through v19 are call-used (call-clobbered,
caller-saved).  Without -mabi=altivec, no AltiVec registers are saved
on the stack (ref: first_altivec_reg_to_save).  Saving and restoring
them would be a bit inefficient, due to the inadequate stack
alignment.  But they're not marked call-used either, so GCC assumes it
can use them.

In my opinion, if we are not going to use the AltiVec ABI, we should
either not use the AltiVec registers or else save and restore them
all.  Would it work to mark them all call-used for -maltivec, and
then save them to the stack by allocating eight bytes extra and
masking the stack pointer for sufficient alignment?  It wouldn't even
be that inefficient.  Of course it would be slower than what we have
today where we don't even bother to save them.  But I don't see how
it's legitimate to use the current -maltivec in anything but a leaf
function, which is undocumented and pretty bad.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [patch] set -mabi=altivec with -ftree-vectorize
  2007-07-31 14:14                                   ` Daniel Jacobowitz
@ 2007-08-07 15:36                                     ` Daniel Jacobowitz
  2007-08-07 15:58                                       ` Andrew Pinski
  2007-08-08  2:21                                       ` Daniel Jacobowitz
  0 siblings, 2 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-08-07 15:36 UTC (permalink / raw)
  To: David Edelsohn, Zdenek Dvorak, gcc-patches

On Tue, Jul 31, 2007 at 09:20:42AM -0400, Daniel Jacobowitz wrote:
> With -mabi=altivec, v0 through v19 are call-used (call-clobbered,
> caller-saved).  Without -mabi=altivec, no AltiVec registers are saved
> on the stack (ref: first_altivec_reg_to_save).  Saving and restoring
> them would be a bit inefficient, due to the inadequate stack
> alignment.  But they're not marked call-used either, so GCC assumes it
> can use them.
> 
> In my opinion, if we are not going to use the AltiVec ABI, we should
> either not use the AltiVec registers or else save and restore them
> all.  Would it work to mark them all call-used for -maltivec, and
> then save them to the stack by allocating eight bytes extra and
> masking the stack pointer for sufficient alignment?  It wouldn't even
> be that inefficient.  Of course it would be slower than what we have
> today where we don't even bother to save them.  But I don't see how
> it's legitimate to use the current -maltivec in anything but a leaf
> function, which is undocumented and pretty bad.

I have been working on patches which add a .gnu_attribute entry
describing the vector ABI.  This is primarily for GDB's benefit, since
a number of tests in its testsuite fail on -mabi=no-altivec if AltiVec
hardware is detected.  But at the same time, I added support to the
linker to warn if you mix -mabi=no-altivec and -mabi=altivec code.
That caused a bunch of testcases in the GCC testsuite to fail.

Those testcases compile, link, and run code with -mabi=altivec.  This
is an ABI-changing option, so assuming it is compatible with the
system libraries is a little strange.  I tried just removing the
option, and hit the same bug that Zdenek and David are talking about
in this thread.  When one function using AltiVec calls another using
AltiVec, the second is likely to clobber registers that the first
expects to stay valid.

This patch improves the current state of things a bit.  In the
-maltivec -mabi=no-altivec case, we used to mark all AltiVec registers
as call-saved but not save them.  Now we mark them all call-used.
Obviously the code isn't great in non-leaf functions, but it should be
more correct.  And this version has no ABI implications.  I included
the new .gnu_attribute; this won't immediately add the warning,
but I'll post the binutils patch for that in a moment.
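As a rough illustration of the register classification described above,
here is a standalone sketch.  It is not the rs6000 code itself: the
array stands in for the AltiVec slice of call_used_regs, and
init_altivec_call_used and count_call_used are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

enum { N_ALTIVEC_REGS = 32, N_VOLATILE_ALTIVEC_REGS = 20 };

/* Fill call_used[] for v0..v31 the way this message describes.  */
static void init_altivec_call_used (bool call_used[N_ALTIVEC_REGS],
				    bool altivec_abi)
{
  assert (call_used);

  /* Static table: v0-v19 call-used, v20-v31 call-saved.  */
  for (int i = 0; i < N_ALTIVEC_REGS; i++)
    call_used[i] = i < N_VOLATILE_ALTIVEC_REGS;

  /* Without the AltiVec ABI nobody saves v20-v31 in the prologue,
     so treat them as clobbered by calls as well.  */
  if (!altivec_abi)
    for (int i = N_VOLATILE_ALTIVEC_REGS; i < N_ALTIVEC_REGS; i++)
      call_used[i] = true;
}

/* Number of AltiVec registers a caller must assume are clobbered.  */
static int count_call_used (const bool call_used[N_ALTIVEC_REGS])
{
  int n = 0;
  for (int i = 0; i < N_ALTIVEC_REGS; i++)
    n += call_used[i];
  return n;
}
```

With altivec_abi set, 20 registers come out call-used; without it, all
32 do, which is why non-leaf code gets worse but stays correct.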

There's a potential improvement: define the non-AltiVec variants of
various ABIs to treat the upper registers as call-saved just like
the AltiVec variants do, and save and restore them in the prologue.
The unwind information for this is a little tricky to write and I
don't know how realistic tweaking those ABIs is in practice.

There's also a potential problem: see my message on gcc@ about the
stack boundary.  We assume a 16-byte stack boundary when using
-maltivec and indeed much of the code generated here is going to
misbehave if the stack is only 8-byte aligned.  Both local variables
and stack slots will be offset by eight bytes since lvx / stvx ignore
the low bits of the address.
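To make the eight-byte offset concrete, here is a small sketch of the
address arithmetic.  lvx and stvx drop the low four bits of the
effective address; the function names and the sample stack pointer
values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address actually used by lvx/stvx: the low four bits of
   the computed address are ignored.  */
static uintptr_t vmx_effective_address (uintptr_t ea)
{
  return ea & ~(uintptr_t) 15;
}

/* How many bytes below the intended slot a vector access lands, for a
   slot laid out on the assumption that sp is 16-byte aligned.  */
static unsigned vmx_misplacement (uintptr_t sp, unsigned slot_offset)
{
  assert (slot_offset % 16 == 0);
  uintptr_t wanted = sp + slot_offset;
  return (unsigned) (wanted - vmx_effective_address (wanted));
}
```

For a stack pointer such as 0x1000 the misplacement is 0; for an
8-byte-aligned one such as 0x1008 every vector slot is hit 8 bytes low.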

I have tested this patch on the powerpc.exp and vmx.exp test cases,
on -m32, -m32 -msoft-float, and -m64.  Full test runs on those targets
are in progress now.  Is it OK, assuming they pass?

-- 
Daniel Jacobowitz
CodeSourcery

2007-08-07  Daniel Jacobowitz  <dan@codesourcery.com>

	* gcc.target/powerpc/altivec-consts.c: Remove -mabi=altivec.
	* gcc.target/powerpc/altivec-varargs-1.c: Likewise.
	* gcc.dg/vmx/vmx.exp: Likewise.

	* config/rs6000/rs6000.c (rs6000_file_start): Output a .gnu_attribute
	directive for the current vector ABI.
	(rs6000_conditional_register_usage): Mark call-saved AltiVec registers
	call-used if ! TARGET_ALTIVEC_ABI.
	* config/rs6000/rs6000.h (CALL_USED_REGISTERS): Mark the first 20
	AltiVec registers call-used.
	(CALL_REALLY_USED_REGISTERS): Likewise.

Index: gcc/testsuite/gcc.target/powerpc/altivec-consts.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/altivec-consts.c	(revision 127159)
+++ gcc/testsuite/gcc.target/powerpc/altivec-consts.c	(working copy)
@@ -1,6 +1,6 @@
 /* { dg-do run { target powerpc*-*-* } } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
-/* { dg-options "-maltivec -mabi=altivec -O2" } */
+/* { dg-options "-maltivec -O2" } */
 
 /* Check that "easy" AltiVec constants are correctly synthesized.  */
 
Index: gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c	(revision 127159)
+++ gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c	(working copy)
@@ -1,6 +1,6 @@
 /* { dg-do run { target powerpc*-*-* } } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
-/* { dg-options "-maltivec -mabi=altivec -fno-inline" } */
+/* { dg-options "-maltivec -fno-inline" } */
 
 #include <stdarg.h>
 #include <signal.h>
Index: gcc/testsuite/gcc.dg/vmx/vmx.exp
===================================================================
--- gcc/testsuite/gcc.dg/vmx/vmx.exp	(revision 127159)
+++ gcc/testsuite/gcc.dg/vmx/vmx.exp	(working copy)
@@ -31,7 +31,7 @@ if {![istarget powerpc*-*-*]
 # nothing but extensions.
 global DEFAULT_VMXCFLAGS
 if ![info exists DEFAULT_VMXCFLAGS] then {
-    set DEFAULT_VMXCFLAGS "-maltivec -mabi=altivec -std=gnu99"
+    set DEFAULT_VMXCFLAGS "-maltivec -std=gnu99"
 }
 
 # If the target system supports AltiVec instructions, the default action
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 127159)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -2316,8 +2316,14 @@ rs6000_file_start (void)
 
 #ifdef HAVE_AS_GNU_ATTRIBUTE
   if (TARGET_32BIT && DEFAULT_ABI == ABI_V4)
-    fprintf (file, "\t.gnu_attribute 4, %d\n",
-	     (TARGET_HARD_FLOAT && TARGET_FPRS) ? 1 : 2);
+    {
+      fprintf (file, "\t.gnu_attribute 4, %d\n",
+	       (TARGET_HARD_FLOAT && TARGET_FPRS) ? 1 : 2);
+      fprintf (file, "\t.gnu_attribute 8, %d\n",
+	       (TARGET_ALTIVEC_ABI ? 2
+		: TARGET_SPE_ABI ? 3
+		: 1));
+    }
 #endif
 
   if (DEFAULT_ABI == ABI_AIX || (TARGET_ELF && flag_pic == 2))
@@ -4141,8 +4147,13 @@ rs6000_conditional_register_usage (void)
       call_really_used_regs[VRSAVE_REGNO] = 1;
     }
 
-  if (TARGET_ALTIVEC_ABI)
-    for (i = FIRST_ALTIVEC_REGNO; i < FIRST_ALTIVEC_REGNO + 20; ++i)
+  /* If we are not using the AltiVec ABI, pretend that the normally
+     call-saved registers are also call-used.  We could use them
+     normally if we saved and restored them in the prologue; that
+     would require using the alignment padding around the register
+     save area, and some care with unwinding information.  */
+  if (! TARGET_ALTIVEC_ABI)
+    for (i = FIRST_ALTIVEC_REGNO + 20; i <= LAST_ALTIVEC_REGNO; ++i)
       call_used_regs[i] = call_really_used_regs[i] = 1;
 }
 \f
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 127159)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -698,8 +698,8 @@ extern enum rs6000_nop_insertion rs6000_
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
    1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1,	   \
    /* AltiVec registers.  */			   \
-   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
+   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
+   1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
    1, 1						   \
    , 1, 1, 1                                       \
 }
@@ -717,8 +717,8 @@ extern enum rs6000_nop_insertion rs6000_
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
    1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1,	   \
    /* AltiVec registers.  */			   \
-   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
+   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
+   1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
    0, 0						   \
    , 0, 0, 0                                       \
 }

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [patch] set -mabi=altivec with -ftree-vectorize
  2007-08-07 15:36                                     ` Daniel Jacobowitz
@ 2007-08-07 15:58                                       ` Andrew Pinski
  2007-08-07 17:00                                         ` Daniel Jacobowitz
  2007-08-08  2:21                                       ` Daniel Jacobowitz
  1 sibling, 1 reply; 875+ messages in thread
From: Andrew Pinski @ 2007-08-07 15:58 UTC (permalink / raw)
  To: David Edelsohn, Zdenek Dvorak, gcc-patches

On 8/7/07, Daniel Jacobowitz <drow@false.org> wrote:
> I have been working on patches which add a .gnu_attribute entry
> describing the vector ABI.  This is primarily for GDB's benefit, since
> a number of tests in its testsuite fail on -mabi=no-altivec if AltiVec
> hardware is detected.  But at the same time, I added support to the
> linker to warn if you mix -mabi=no-altivec and -mabi=altivec code.
> That caused a bunch of testcases in the GCC testsuite to fail.

The ABI is only different if you actually use altivec registers or
VMX sized vectors (except for vector long long which does not exist).
So I think the warning is wrong and will get people confused even
more.   In most cases you only want to compile some parts of the
program with -maltivec -mabi=altivec and then only call those parts
when VMX exists.  This is very very common in generic programs that
people compile for PowerPC and allow to run on multiple pieces of
hardware.  Remember not all users of GCC can provide fully different
versions of their program in an efficient manner.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [patch] set -mabi=altivec with -ftree-vectorize
  2007-08-07 15:58                                       ` Andrew Pinski
@ 2007-08-07 17:00                                         ` Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-08-07 17:00 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: David Edelsohn, Zdenek Dvorak, gcc-patches

On Tue, Aug 07, 2007 at 11:58:11AM -0400, Andrew Pinski wrote:
> The ABI is only different if you actually use altivec registers or
> VMX sized vectors (except for vector long long which does not
> exist).

That's almost right, but not quite.  The required stack alignment
difference is an ABI issue.

> So I think the warning is wrong and will get people confused even
> more.   In most cases you only want to compile some parts of the
> program with -maltivec -mabi=altivec and then only call those parts
> when VMX exists.  This is very very common in generic programs that
> people compile for PowerPC and allow to run on multiple pieces of
> hardware.  Remember not all users of GCC can provide fully different
> versions of their program in an efficient manner.

Then they should either use just -maltivec without -mabi=altivec, or
someone should implement support for "don't care" tagging.  As things
stand now code can go horribly wrong if the stack does not happen to
be sufficiently aligned.

I wouldn't object to a linker option to suppress the warnings; that's
been suggested before and we may need it on ARM for the -fshort-wchar
tests.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: [patch] set -mabi=altivec with -ftree-vectorize
  2007-08-07 15:36                                     ` Daniel Jacobowitz
  2007-08-07 15:58                                       ` Andrew Pinski
@ 2007-08-08  2:21                                       ` Daniel Jacobowitz
  1 sibling, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-08-08  2:21 UTC (permalink / raw)
  To: David Edelsohn, Zdenek Dvorak, gcc-patches

On Tue, Aug 07, 2007 at 11:35:53AM -0400, Daniel Jacobowitz wrote:
> I have been working on patches which add a .gnu_attribute entry
> describing the vector ABI.  This is primarily for GDB's benefit, since
> a number of tests in its testsuite fail on -mabi=no-altivec if AltiVec
> hardware is detected.  But at the same time, I added support to the
> linker to warn if you mix -mabi=no-altivec and -mabi=altivec code.
> That caused a bunch of testcases in the GCC testsuite to fail.

It sounds like the warning, and the -maltivec changes, are both going
to require a couple of iterations.  The vector ABI attribute is
separate from using it for a linker warning, so I'd like to deal with
that one first.  Is just this bit OK?

> @@ -2316,8 +2316,14 @@ rs6000_file_start (void)
>  
>  #ifdef HAVE_AS_GNU_ATTRIBUTE
>    if (TARGET_32BIT && DEFAULT_ABI == ABI_V4)
> -    fprintf (file, "\t.gnu_attribute 4, %d\n",
> -	     (TARGET_HARD_FLOAT && TARGET_FPRS) ? 1 : 2);
> +    {
> +      fprintf (file, "\t.gnu_attribute 4, %d\n",
> +	       (TARGET_HARD_FLOAT && TARGET_FPRS) ? 1 : 2);
> +      fprintf (file, "\t.gnu_attribute 8, %d\n",
> +	       (TARGET_ALTIVEC_ABI ? 2
> +		: TARGET_SPE_ABI ? 3
> +		: 1));
> +    }
>  #endif
>  
>    if (DEFAULT_ABI == ABI_AIX || (TARGET_ELF && flag_pic == 2))

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Mark PowerPC vector ABI in binary attributes
@ 2007-08-13 13:31 Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-08-13 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: Geoff Keating, David Edelsohn

I'm reposting this part separately from the -maltivec fixes, which are
more complicated.  This patch adds a marker to object files indicating
whether they were compiled with the generic vector ABI, AltiVec vector
ABI, or SPE vector ABI.  There's room for expansion should another
vector ABI be necessary later.
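For readers who want the tag values at a glance, the selection in the
patch below boils down to the following standalone sketch.  The enum
and both function names are hypothetical; only the values 1, 2 and 3
and the directive text come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Values emitted for ".gnu_attribute 8" by the patch.  */
enum vector_abi { VEC_ABI_GENERIC = 1, VEC_ABI_ALTIVEC = 2, VEC_ABI_SPE = 3 };

/* Same precedence as the patch: AltiVec ABI first, then SPE ABI,
   otherwise the generic vector ABI.  */
static enum vector_abi vector_abi_value (bool altivec_abi, bool spe_abi)
{
  return altivec_abi ? VEC_ABI_ALTIVEC
	 : spe_abi ? VEC_ABI_SPE
	 : VEC_ABI_GENERIC;
}

/* Format the assembler directive into buf; returns snprintf's result.  */
static int format_vector_abi_directive (char *buf, size_t n,
					bool altivec_abi, bool spe_abi)
{
  assert (buf && n > 0);
  return snprintf (buf, n, "\t.gnu_attribute 8, %d\n",
		   (int) vector_abi_value (altivec_abi, spe_abi));
}
```

An object built with the AltiVec ABI would therefore carry
".gnu_attribute 8, 2" in its assembly output.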

This patch will not have any other user-visible effect.  I will add
support to binutils to recognize and merge the attribute so that
linked executables are correctly marked, but not add a mismatch
warning yet; I think one is appropriate, but Andrew Pinski raised some
concerns and that discussion wasn't resolved.  Once I have binutils
support for merging the attribute, I will add GDB support for it; then
the AltiVec tests will start passing on powerpc-linux configurations.
We can decide what to do about warnings later.

Is this patch OK?

-- 
Daniel Jacobowitz
CodeSourcery

2007-08-07  Daniel Jacobowitz  <dan@codesourcery.com>

	* config/rs6000/rs6000.c (rs6000_file_start): Output a .gnu_attribute
	directive for the current vector ABI.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 127159)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -2316,8 +2316,14 @@ rs6000_file_start (void)

 #ifdef HAVE_AS_GNU_ATTRIBUTE
   if (TARGET_32BIT && DEFAULT_ABI == ABI_V4)
-    fprintf (file, "\t.gnu_attribute 4, %d\n",
-	     (TARGET_HARD_FLOAT && TARGET_FPRS) ? 1 : 2);
+    {
+      fprintf (file, "\t.gnu_attribute 4, %d\n",
+	       (TARGET_HARD_FLOAT && TARGET_FPRS) ? 1 : 2);
+      fprintf (file, "\t.gnu_attribute 8, %d\n",
+	       (TARGET_ALTIVEC_ABI ? 2
+		: TARGET_SPE_ABI ? 3
+		: 1));
+    }
 #endif

   if (DEFAULT_ABI == ABI_AIX || (TARGET_ELF && flag_pic == 2))

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Fix a reload ICE on e500
@ 2007-08-14 20:00 Daniel Jacobowitz
  2007-08-20  7:38 ` Andrew Pinski
  0 siblings, 1 reply; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-08-14 20:00 UTC (permalink / raw)
  To: gcc-patches

This patch fixes 20001012-1.c for powerpc-eabi with -mcpu=8548
-mfloat-gprs=double, on the 4.2 branch.  The problem is latent on
HEAD.

We start with this instruction:

(insn:HI 12 11 20 2 (set (subreg:DF (reg:DI 122) 0)
        (reg:DF 121)) 1103 {*frob_di_df_2} (nil)
    (expr_list:REG_DEAD (reg:DF 121)
        (nil)))

There's a register/register alternative for this instruction.
However, in 4.2 it can't be used because combine tried to create a
DImode subreg of (reg:DF 121); this leads to invalid_mode_change_p
refusing to let register 121 get assigned to a DFmode register.
It ends up in memory.

Rather than spilling it to the stack, reload uses the fact that this
register is known to be 1.0 to load that value from a constant pool.
This is a GOT-relative operation on PowerPC -fpic.  And the function
is a tiny leaf function which has not otherwise needed the GOT
pointer.  So emitting the reload makes the GOT pointer become suddenly
live, which changes the elimination offsets and eventually leads to
an ICE.

If I manually prod record_subregs_of_mode on mainline, the same ICE
occurs.  We now determine invalid mode changes as a dataflow problem,
so unsuccessful subregs from combine no longer affect them.

Most of this patch is fairly obvious once you know where to look.  The
same problem is already solved for non-SPE but the SPE case ignored
the calculation.  While I was looking at the condition, though, I
added the current_function_uses_const_pool.  I believe that the only
way that the GOT pointer can become live in the midst of reload is
this case, and current_function_uses_const_pool is how the MIPS
and i386 backends check for the same problem.
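The size calculation after the patch can be summarized by this
standalone sketch.  It is not the rs6000_stack_info code: the helper
names are invented, the three-way PIC/TOC test is collapsed into one
flag, and the register number is passed in as a parameter (callers
would pass RS6000_PIC_OFFSET_TABLE_REGNUM, which I am assuming is r30).

```c
#include <assert.h>
#include <stdbool.h>

/* First GPR whose save slot must be allocated.  pic_reg_possible
   stands for the TOC/PIC condition in the patch.  */
static int first_gp_to_save (int first_gp_reg_save, bool pic_reg_possible,
			     bool uses_const_pool, int pic_regnum)
{
  if (pic_reg_possible && uses_const_pool
      && first_gp_reg_save > pic_regnum)
    return pic_regnum;
  return first_gp_reg_save;
}

/* Bytes reserved for GPRs first_gp..31; reg_size is 4 or 8, and the
   SPE 64-bit save area uses 8 regardless of TARGET_32BIT.  */
static int gp_save_size (int reg_size, int first_gp)
{
  assert (first_gp >= 0 && first_gp <= 32);
  return reg_size * (32 - first_gp);
}
```

The point of the patch is that the same first_gp feeds both gp_size and
spe_gp_size, so the SPE area no longer comes up short when reload
makes the GOT pointer live.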

No regressions on powerpc-eabispe -mfloat-gprs=double for trunk.  Is
this OK to commit?  How about for the 4.2 branch?

-- 
Daniel Jacobowitz
CodeSourcery

2007-08-03  Daniel Jacobowitz  <dan@codesourcery.com>

	* config/rs6000/rs6000.c (rs6000_stack_info): Allocate space for the
	GOT pointer only if there is a constant pool.  Use the allocated space
	for SPE also.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 178068)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -13091,6 +13091,7 @@ rs6000_stack_info (void)
   int reg_size = TARGET_32BIT ? 4 : 8;
   int ehrd_size;
   int save_align;
+  int first_gp;
   HOST_WIDE_INT non_fixed_size;

   memset (&info, 0, sizeof (info));
@@ -13110,14 +13111,19 @@ rs6000_stack_info (void)
   /* Calculate which registers need to be saved & save area size.  */
   info_ptr->first_gp_reg_save = first_reg_to_save ();
   /* Assume that we will have to save RS6000_PIC_OFFSET_TABLE_REGNUM,
-     even if it currently looks like we won't.  */
+     even if it currently looks like we won't.  Reload may need it to
+     get at a constant; if so, it will have already created a constant
+     pool entry for it.  */
   if (((TARGET_TOC && TARGET_MINIMAL_TOC)
        || (flag_pic == 1 && DEFAULT_ABI == ABI_V4)
        || (flag_pic && DEFAULT_ABI == ABI_DARWIN))
+      && current_function_uses_const_pool
       && info_ptr->first_gp_reg_save > RS6000_PIC_OFFSET_TABLE_REGNUM)
-    info_ptr->gp_size = reg_size * (32 - RS6000_PIC_OFFSET_TABLE_REGNUM);
+    first_gp = RS6000_PIC_OFFSET_TABLE_REGNUM;
   else
-    info_ptr->gp_size = reg_size * (32 - info_ptr->first_gp_reg_save);
+    first_gp = info_ptr->first_gp_reg_save;
+
+  info_ptr->gp_size = reg_size * (32 - first_gp);

   /* For the SPE, we have an additional upper 32-bits on each GPR.
      Ideally we should save the entire 64-bits only when the upper
@@ -13205,7 +13211,7 @@ rs6000_stack_info (void)
 	    + info_ptr->parm_size);

   if (TARGET_SPE_ABI && info_ptr->spe_64bit_regs_used != 0)
-    info_ptr->spe_gp_size = 8 * (32 - info_ptr->first_gp_reg_save);
+    info_ptr->spe_gp_size = 8 * (32 - first_gp);
   else
     info_ptr->spe_gp_size = 0;

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Mark PowerPC vector ABI in binary attributes
       [not found]                               ` <drow@false.org>
                                                   ` (5 preceding siblings ...)
  2007-07-31 12:37                                 ` [patch] set -mabi=altivec with -ftree-vectorize David Edelsohn
@ 2007-08-15 22:22                                 ` David Edelsohn
  2007-08-15 22:29                                   ` Daniel Jacobowitz
  2007-09-04 13:05                                 ` Fix a reload ICE on e500 David Edelsohn
  7 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2007-08-15 22:22 UTC (permalink / raw)
  To: gcc-patches, Geoff Keating

>>>>> Daniel Jacobowitz writes:

Daniel> * config/rs6000/rs6000.c (rs6000_file_start): Output a .gnu_attribute
Daniel> directive for the current vector ABI.

	Okay.  It would be nice to have a registry for these bits.

David

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Mark PowerPC vector ABI in binary attributes
  2007-08-15 22:22                                 ` Mark PowerPC vector ABI in binary attributes David Edelsohn
@ 2007-08-15 22:29                                   ` Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-08-15 22:29 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Geoff Keating

On Wed, Aug 15, 2007 at 06:22:13PM -0400, David Edelsohn wrote:
> >>>>> Daniel Jacobowitz writes:
> 
> Daniel> * config/rs6000/rs6000.c (rs6000_file_start): Output a .gnu_attribute
> Daniel> directive for the current vector ABI.
> 
> 	Okay.  It would be nice to have a registry for these bits.

Agreed.  As we discussed, I'm going to try to add this to the GNU
Binutils manual; it seems like the most appropriate place.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix a reload ICE on e500
  2007-08-14 20:00 Fix a reload ICE on e500 Daniel Jacobowitz
@ 2007-08-20  7:38 ` Andrew Pinski
  2007-09-04 12:55   ` Daniel Jacobowitz
  0 siblings, 1 reply; 875+ messages in thread
From: Andrew Pinski @ 2007-08-20  7:38 UTC (permalink / raw)
  To: gcc-patches; +Cc: drow

On 8/14/07, Daniel Jacobowitz <drow@false.org> wrote:
> No regressions on powerpc-eabispe -mfloat-gprs=double for trunk.  Is
> this OK to commit?  How about for the 4.2 branch?

It seems like you should also bootstrap on a non SPE target as you
touched generic code in the rs6000 back-end and the last time someone
touched this code, it caused a couple of regressions.  I can do this
if you want, but it won't be until Wednesday as I don't have access to
a powerpc machine until then.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix a reload ICE on e500
  2007-08-20  7:38 ` Andrew Pinski
@ 2007-09-04 12:55   ` Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-09-04 12:55 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches, David Edelsohn

On Mon, Aug 20, 2007 at 12:03:17AM -0700, Andrew Pinski wrote:
> On 8/14/07, Daniel Jacobowitz <drow@false.org> wrote:
> > No regressions on powerpc-eabispe -mfloat-gprs=double for trunk.  Is
> > this OK to commit?  How about for the 4.2 branch?
> 
> It seems like you should also bootstrap on a non SPE target as you
> touched generic code in the rs6000 back-end and the last time someone
> touched this code, it caused a couple of regressions.

Good idea.  I also bootstrapped the patch on powerpc64-linux-gnu.

David, we talked about this on IRC but I can't remember what the
verdict was.  Is there some other testing that I need to do before
it can go in?

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 875+ messages in thread

* Re: Fix a reload ICE on e500
       [not found]                               ` <drow@false.org>
                                                   ` (6 preceding siblings ...)
  2007-08-15 22:22                                 ` Mark PowerPC vector ABI in binary attributes David Edelsohn
@ 2007-09-04 13:05                                 ` David Edelsohn
  2007-09-04 13:46                                   ` Daniel Jacobowitz
  7 siblings, 1 reply; 875+ messages in thread
From: David Edelsohn @ 2007-09-04 13:05 UTC (permalink / raw)
  To: Andrew Pinski, gcc-patches

>>>>> Daniel Jacobowitz writes:

Dan> Good idea.  I also bootstrapped the patch on powerpc64-linux-gnu.

Dan> David, we talked about this on IRC but I can't remember what the
Dan> verdict was.  Is there some other testing that I need to do before
Dan> it can go in?

	The patch is fine if that testing succeeded.

Thanks, David


* Re: Fix a reload ICE on e500
  2007-09-04 13:05                                 ` Fix a reload ICE on e500 David Edelsohn
@ 2007-09-04 13:46                                   ` Daniel Jacobowitz
  0 siblings, 0 replies; 875+ messages in thread
From: Daniel Jacobowitz @ 2007-09-04 13:46 UTC (permalink / raw)
  To: gcc-patches

On Tue, Sep 04, 2007 at 09:05:31AM -0400, David Edelsohn wrote:
> >>>>> Daniel Jacobowitz writes:
> 
> Dan> Good idea.  I also bootstrapped the patch on powerpc64-linux-gnu.
> 
> Dan> David, we talked about this on IRC but I can't remember what the
> Dan> verdict was.  Is there some other testing that I need to do before
> Dan> it can go in?
> 
> 	The patch is fine if that testing succeeded.

Thanks.  There were no regressions, so I've checked it in.

(I did need a version of Segher's SECTION_WRITE patch to bootstrap.)

-- 
Daniel Jacobowitz
CodeSourcery


end of thread, other threads:[~2007-09-04 13:46 UTC | newest]

Thread overview: 875+ messages
2005-03-30 20:02 Patch ping! (4.1 projects, stage 1.2) Hot/cold partitioning fixes Caroline Tice
2005-03-30 23:14 ` Geoffrey Keating
     [not found]   ` <5c0e321f26901d84492dfe29fa755d7e@apple.com>
     [not found]     ` <776b1297c1d3595d7b8c9d3699a40431@geoffk.org>
     [not found]       ` <87ll84llvz.fsf@codesourcery.com>
2005-03-31  0:19         ` Caroline Tice
2005-03-31  0:29           ` Zack Weinberg
2005-03-31  0:43             ` Richard Henderson
2005-04-01 13:55 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
2005-04-01 16:44   ` David Edelsohn
2005-04-01 17:13     ` Caroline Tice
2005-04-01 18:24       ` David Edelsohn
2005-04-01 19:20         ` Caroline Tice
2005-04-01 19:25           ` David Edelsohn
2005-04-01 19:45           ` Mark Mitchell
2005-04-01 20:02             ` Caroline Tice
2005-04-01 21:34             ` Caroline Tice
2005-04-01 21:44               ` David Edelsohn
2005-04-01 21:55                 ` Daniel Jacobowitz
2005-04-01 22:02                 ` Caroline Tice
2005-04-01 21:46               ` Mark Mitchell
2005-04-01 22:06                 ` Caroline Tice
2005-04-01 19:50           ` Daniel Jacobowitz
2005-04-01 20:00             ` Caroline Tice
2005-04-01 20:07               ` Daniel Jacobowitz
2005-04-01 20:09                 ` Caroline Tice
2005-04-01 21:19                   ` Mark Mitchell
2005-04-01 22:08                 ` Geoffrey Keating
2005-04-01 22:12                   ` Caroline Tice
2005-04-01 22:21                     ` Geoffrey Keating
2005-04-01 22:31                       ` Caroline Tice
2005-04-01 22:57                         ` Caroline Tice
2005-04-02  2:22                           ` Mark Mitchell
2005-04-02  2:51                             ` Daniel Jacobowitz
2005-04-01 22:14         ` Geoffrey Keating
2005-04-01 22:17           ` David Edelsohn
2005-04-01 22:24             ` Daniel Jacobowitz
2005-04-01 22:31               ` David Edelsohn
2005-04-01 18:48     ` Caroline Tice
2005-04-01 17:04   ` Joseph S. Myers
2005-04-01 17:21     ` Caroline Tice
  -- strict thread matches above, loose matches on Subject: below --
2007-08-14 20:00 Fix a reload ICE on e500 Daniel Jacobowitz
2007-08-20  7:38 ` Andrew Pinski
2007-09-04 12:55   ` Daniel Jacobowitz
2007-08-13 13:31 Mark PowerPC vector ABI in binary attributes Daniel Jacobowitz
2007-07-31  8:20 [patch] set -mabi=altivec with -ftree-vectorize Zdenek Dvorak
2007-07-31 12:01 ` Daniel Jacobowitz
2006-08-09 12:49 [PATCH] Fix comments in PPC e500 code Eric Botcazou
2006-08-09 14:02 ` David Edelsohn
2006-08-09 14:15   ` Daniel Jacobowitz
2006-07-06 19:53 [PATCH, committed] PR 28150 and PR 28170 David Edelsohn
2006-07-07  3:07 ` Alan Modra
2006-04-12  0:50 [PowerPC] Avoid ICE on DFmode subreg Alan Modra
2006-04-04 16:21 [RFT/RFA] Fix AIX fallout from PR/19653 patch Paolo Bonzini
2006-04-04 17:52 ` David Edelsohn
2006-04-05 17:10   ` Geoff Keating
2006-04-06  3:38     ` David Edelsohn
2006-04-06  4:57       ` Roger Sayle
2006-04-06  7:31         ` Paolo Bonzini
2006-04-06 14:31           ` David Edelsohn
2006-04-06 14:42             ` Paolo Bonzini
2006-04-06 22:44               ` David Edelsohn
2006-04-07  7:39                 ` Paolo Bonzini
2006-04-07 14:01                   ` David Edelsohn
     [not found]                   ` <E5778C9D-8FC1-422A-82ED-ABEF7AFA7685@geoffk.org>
     [not found]                     ` <443A1765.2030609@lu.unisi.ch>
     [not found]                       ` <1880BA7B-88BD-41EE-B9E7-33045526437F@geoffk.org>
2006-04-10 16:33                         ` Paolo Bonzini
2006-04-10 16:40                           ` Geoff Keating
2006-04-07 20:49             ` Geoff Keating
2006-04-06 18:46       ` Dale Johannesen
2006-04-06 19:17         ` David Edelsohn
2006-04-05  2:20 ` Alan Modra
2006-04-05  7:12   ` Paolo Bonzini
2006-04-05 14:48     ` Alan Modra
2006-03-30 14:20 [PowerPC] PR26459 again Alan Modra
2006-03-30  4:42 [PowerPC] linuxspe vs. ibm long double Alan Modra
2006-03-30 12:51 ` Joseph S. Myers
2006-02-24  2:22 [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full Alan Modra
2006-02-24  3:57 ` Mark Mitchell
2005-12-28 10:31 [PowerPC] Fix PR25572, -mminimal-toc trashes r30 Alan Modra
2005-12-14 14:56 [PowerPC] Fix 25406, rs6000_special_round_type_align Alan Modra
2005-12-14 23:50 ` Alan Modra
2005-12-09  1:59 [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs Alan Modra
2005-12-09 16:15 ` Andrew Pinski
     [not found]   ` <20051212022850.GI1563@bubble.grove.modra.org>
     [not found]     ` <26af44d2c30b82a28cfc4fc6e5df2e0d@physics.uc.edu>
2005-12-12  3:34       ` Alan Modra
2005-12-12  6:35         ` Mark Mitchell
2005-12-13  1:02           ` Mike Stump
2005-12-13  2:08             ` Gabriel Dos Reis
2005-12-07  5:09 [PowerPC] Fix pr25212, indexed address predicates Alan Modra
2005-11-28  2:26 [PowerPC] PR24997: ICE with -ftree-vectorize Alan Modra
2005-11-25  4:49 [RS6000] Add some more functions to ppc64-fp.c Alan Modra
2005-11-09 16:28 [PATCH] volatile global register variable David Edelsohn
2005-11-09 18:13 ` Joseph S. Myers
2005-11-09 19:33   ` Ian Lance Taylor
2005-11-09 18:41 ` Daniel Jacobowitz
2005-11-08  2:55 [PowerPC] Fix PR23704, -m64 overrides prior -mno-powerpc-gfxopt Alan Modra
2005-11-02 23:22 [PowerPC64] gcc.c-torture/compile/pr20928.c failure Alan Modra
2005-10-20 11:00 [PowerPC] -msdata=data needless use of .sbss section Alan Modra
2005-10-20 16:28 ` David Edelsohn
2005-10-20 23:04   ` Alan Modra
2005-11-27 23:21 ` Alan Modra
2005-09-12  3:41 [PowerPC] Fix PR23774 stack backchain broken Alan Modra
2005-09-12  2:03 [RS6000] Nop-insertion fix Alan Modra
2005-09-12  7:23 ` Richard Henderson
2005-09-12 14:16   ` Alan Modra
2005-06-23 13:33 [PATCH, committed] PPC405 atomic support (PR target/21760) David Edelsohn
2005-06-23 13:51 ` Andreas Schwab
2005-06-23 19:32   ` David Edelsohn
2005-06-24  2:07     ` Geoffrey Keating
2005-06-24  2:35       ` Andrew Pinski
2005-06-24  3:30         ` Daniel Jacobowitz
2005-06-24 14:16           ` David Edelsohn
2005-06-24 19:10             ` Daniel Jacobowitz
2005-06-27  6:06             ` Richard Henderson
2005-06-27 15:17               ` David Edelsohn
2005-06-24 18:41       ` Zack Weinberg
     [not found]       ` <geoffk@geoffk.org>
2005-06-25 21:46         ` David Edelsohn
2005-06-10  1:09 [PATCH] PowerPC SVR4 _mcount calls Alan Modra
2005-05-12 16:05 powerpc new PLT and GOT Alan Modra
2005-05-12 16:09 ` Andrew Pinski
2005-05-12 20:00   ` Mike Stump
2005-05-13  0:10     ` Alan Modra
2005-05-12 16:53 ` Matt Thomas
2005-05-13  0:00   ` Alan Modra
2005-05-19 13:01 ` Alan Modra
2005-05-25 14:26   ` Alan Modra
2005-05-25 15:17     ` Paolo Carlini
2005-05-30 15:49     ` Alan Modra
2005-05-31  0:35       ` Alan Modra
2005-05-31 10:53     ` Alan Modra
2005-05-31 13:42       ` Joseph S. Myers
2005-05-31 15:03         ` Alan Modra
2005-05-31 17:11           ` Joseph S. Myers
2005-06-01  0:06             ` Alan Modra
2005-03-30  3:38 [RS6000] Fix PR20611, duplicate label for inlined function referencing TLS Alan Modra
2005-03-17  0:27 powerpc-linux unwinder fix for 3.4 Alan Modra
2005-03-14 12:58 [PATCH] PowerPC function arg alignment tidy Alan Modra
     [not found] <1109545242.14992.234.camel@gaston>
     [not found] ` <200502272335.j1RNZDD27844@makai.watson.ibm.com>
     [not found]   ` <653d1ad4308fa0e72f08252032f6c753@physics.uc.edu>
2005-02-28 10:52     ` Implicit altivec vs. linux kernel build Alan Modra
2005-02-15  2:20 [RFC] PowerPC 128 bit long double compatibility (PR target/19019) David Edelsohn
2005-02-15  2:22 ` Geoffrey Keating
2005-02-15  2:29   ` David Edelsohn
2005-02-15 11:20     ` Geoffrey Keating
2005-02-15  2:53 ` Richard Henderson
2005-02-15  3:19   ` David Edelsohn
2005-02-15 15:54     ` Alan Modra
2005-02-15 16:11       ` Richard Henderson
2005-02-15 16:31         ` Alan Modra
2005-02-15 17:17           ` Alan Modra
2005-02-16  0:20       ` Geoffrey Keating
2005-02-23 17:43       ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Jakub Jelinek
2005-02-23 19:28         ` David Edelsohn
2005-02-23 19:33           ` [PATCH] Fix powerpc*-*-linux* libgcc.a " Jakub Jelinek
2005-02-24 14:08           ` [3.4 PATCH] Fix powerpc*-*-linux* bootstrap " Alan Modra
2005-02-24 14:51             ` Jakub Jelinek
2005-02-24 14:59               ` Richard Henderson
2005-02-24 15:09               ` Alan Modra
2005-02-24 15:17                 ` Jakub Jelinek
2005-02-25  2:12                 ` [PATCH] Fix powerpc*-*-linux* libgcc.a (PR target/19019, take 2) Jakub Jelinek
2005-02-25  2:18                   ` Richard Henderson
2005-01-29  7:47 [PATCH] powerpc dwarf2 unwinder fallback Alan Modra
2005-01-12  5:02 [PATCH] PR target/19389 Odd gpr mem load unrecognizable insn Alan Modra
2004-12-24 13:47 [PATCH] Fix target/19147: invalid rlwinm patterns Alan Modra
2004-12-23 14:57 [PATCH] PR target/19137: ICE with load of TImode constant Alan Modra
2004-12-02 12:13 mklibgcc fallout Alan Modra
2004-12-02 17:12 ` Zack Weinberg
2004-12-02 18:21   ` Joseph S. Myers
2004-12-02 18:47     ` Zack Weinberg
2004-12-02 19:02       ` David Edelsohn
2004-12-02 21:05         ` Richard Henderson
2004-12-02 21:37         ` Alan Modra
2004-12-02 22:10           ` Zack Weinberg
2004-12-02 22:27             ` David Edelsohn
2004-12-02 23:08               ` Zack Weinberg
2004-12-02 23:26                 ` David Edelsohn
2004-12-04  2:41                   ` Zack Weinberg
2004-12-04  2:48                     ` David Edelsohn
2004-12-03  9:09             ` Alan Modra
2004-12-02 23:13           ` Andreas Schwab
2004-11-26 10:32 [RS6000] Fix PR12817 Alan Modra
2004-11-26  2:32 [PATCH] Fix PR16356 Alan Modra
2004-11-10  2:36 fix pr 16480 on gcc-3.4 Alan Modra
2004-11-08 23:07 [RFC] PowerPC sCC patterns David Edelsohn
2004-11-24 11:40 ` Alan Modra
2004-11-24 21:39   ` David Edelsohn
2004-09-02 18:53 Release RTL bodies after compilation (sometimes) Jan Hubicka
2004-09-02 18:56 ` Jakub Jelinek
2004-09-02 18:58   ` Jan Hubicka
2004-09-03  5:43     ` Mark Mitchell
2004-09-03 22:41       ` Jan Hubicka
2004-09-14 20:00       ` Diego Novillo
2004-09-14 20:15         ` Jan Hubicka
2004-09-14 20:35           ` Richard Henderson
2004-09-14 20:51             ` Jan Hubicka
2004-09-14 21:07               ` Jeffrey A Law
2004-09-14 21:12                 ` Jakub Jelinek
2004-09-14 22:33                   ` Daniel Jacobowitz
2004-09-14 22:53                   ` Richard Henderson
2004-09-16  8:35                   ` Jeffrey A Law
2004-09-14 21:26                 ` Jan Hubicka
2004-09-14 21:40                 ` Diego Novillo
2004-09-14 22:08                   ` Jeffrey A Law
2004-09-14 22:25                     ` Jan Hubicka
2004-09-14 23:55                     ` Michael Matz
2004-09-15  0:25                       ` Jan Hubicka
2004-09-15  4:38                     ` David Edelsohn
2004-09-15 12:25                       ` Jan Hubicka
2004-09-15 12:30                         ` Jan Hubicka
2004-09-15 13:05                           ` Diego Novillo
2004-09-15 13:13                             ` Daniel Berlin
2004-09-15 13:13                             ` Daniel Jacobowitz
2004-09-15 13:32                               ` Jan Hubicka
2004-09-15 15:59                               ` Diego Novillo
2004-09-15 16:12                                 ` Daniel Jacobowitz
2004-09-15 17:07                                   ` David Edelsohn
2004-09-15 18:48                                     ` Daniel Jacobowitz
2004-09-15 20:35                                     ` Michael Matz
2004-09-16  7:38                                       ` Jeffrey A Law
2004-09-18  7:47                                         ` Geoffrey Keating
2004-09-18 13:58                                           ` Daniel Berlin
2004-09-18 14:52                                             ` Daniel Berlin
2004-09-18 15:33                                               ` Daniel Jacobowitz
2004-09-18 18:47                                                 ` Daniel Berlin
2004-09-18 19:47                                                   ` Daniel Jacobowitz
2004-09-18 20:20                                                     ` Daniel Berlin
2004-09-18  7:36                                       ` Geoffrey Keating
     [not found]                               ` <drow@false.org>
2004-09-15 15:46                                 ` David Edelsohn
2004-09-15 20:17                                   ` Geoffrey Keating
2004-09-16  4:22                                     ` Jeffrey A Law
2005-04-01 21:58                                 ` AIX bootstrap failure (was Re: Hot/cold partitioning fixes) David Edelsohn
2005-06-27 17:18                                 ` [PATCH, committed] PPC405 atomic support (PR target/21760) David Edelsohn
2005-06-28  7:54                                   ` Gunther Nikl
2005-11-09 19:39                                 ` [PATCH] volatile global register variable David Edelsohn
2005-11-10 21:18                                   ` Mark Mitchell
2005-11-11  3:10                                     ` David Edelsohn
2005-11-11  5:50                                       ` Mark Mitchell
2005-11-11  3:37                                     ` David Edelsohn
2006-08-09 14:59                                 ` [PATCH] Fix comments in PPC e500 code David Edelsohn
2006-08-09 16:21                                   ` Eric Botcazou
2006-08-09 16:22                                     ` David Edelsohn
2006-08-09 16:53                                       ` Eric Botcazou
2007-07-31 12:37                                 ` [patch] set -mabi=altivec with -ftree-vectorize David Edelsohn
2007-07-31 14:14                                   ` Daniel Jacobowitz
2007-08-07 15:36                                     ` Daniel Jacobowitz
2007-08-07 15:58                                       ` Andrew Pinski
2007-08-07 17:00                                         ` Daniel Jacobowitz
2007-08-08  2:21                                       ` Daniel Jacobowitz
2007-08-15 22:22                                 ` Mark PowerPC vector ABI in binary attributes David Edelsohn
2007-08-15 22:29                                   ` Daniel Jacobowitz
2007-09-04 13:05                                 ` Fix a reload ICE on e500 David Edelsohn
2007-09-04 13:46                                   ` Daniel Jacobowitz
2004-09-15 16:53                             ` Release RTL bodies after compilation (sometimes) Jeffrey A Law
2004-09-15 17:20                               ` Jan Hubicka
2004-09-16  9:03                                 ` Jeffrey A Law
2004-09-16 13:13                                   ` Jan Hubicka
2004-09-15 19:50                       ` Mike Stump
2004-09-15 19:58                         ` David Edelsohn
2004-09-15 20:10                           ` Michael Matz
2004-09-15 20:51                             ` David Edelsohn
2004-09-15 21:02                               ` Daniel Jacobowitz
2004-09-16  4:58                                 ` Jeffrey A Law
2004-09-16  3:43                           ` Jeffrey A Law
2004-09-16  4:29                         ` Jeffrey A Law
2004-09-14 21:07           ` Jeffrey A Law
2004-09-14 21:19             ` Jan Hubicka
2004-09-15 11:59             ` Jan Hubicka
2004-09-15 13:07               ` Diego Novillo
2004-09-14 20:18         ` Jan Hubicka
2004-09-03  6:58 ` Richard Henderson
2004-09-03  8:03   ` Jan Hubicka
2004-09-03 19:08 ` Geoffrey Keating
2004-09-03 19:50   ` Jan Hubicka
2004-08-25 14:28 RS6000 fix pr16480 Alan Modra
2004-08-25 15:45 ` David Edelsohn
2004-08-25 23:09   ` Alan Modra
2004-08-11 11:13 powerpc64 fixes missing from 3.4 branch Alan Modra
2004-08-10 11:55 powerpc64 linux dot symbols Alan Modra
2004-08-10 12:37 ` Joseph S. Myers
2004-08-10 13:34   ` Alan Modra
2004-08-13 13:04 ` Jakub Jelinek
2004-08-13 13:58   ` Alan Modra
2004-08-16 11:49 ` Jakub Jelinek
2004-08-17  0:43   ` Alan Modra
2004-08-20  5:54     ` Alan Modra
2004-08-21 11:58       ` Jakub Jelinek
2004-08-21 15:39         ` Alan Modra
2004-07-14 21:21 [PATCH, committed] SFmode arg padding and va_arg cleanup David Edelsohn
2004-07-26 20:30 ` Alan Modra
2004-05-26 17:31 rs6000 mainline patch for pr 14478 Alan Modra
2004-05-08 15:54 Fixes for powerpc-linux param passing Alan Modra
2004-05-08 15:54 ` Aldy Hernandez
2004-05-08 22:43   ` Geoff Keating
2004-05-09 15:16   ` Alan Modra
2004-05-09 14:21     ` Aldy Hernandez
2004-05-09 14:22     ` Geoff Keating
2004-05-11  6:55     ` Alan Modra
2004-05-10 14:01       ` Aldy Hernandez
2004-05-08 16:41 ` Andrew Pinski
2004-05-08 22:20   ` Aldy Hernandez
2004-05-15 15:00 ` Alan Modra
2004-04-30 14:46 rs6000 stack boundary Alan Modra
2004-04-30 22:26 ` Geoff Keating
2004-04-30 23:06   ` David Edelsohn
     [not found]     ` <200404302355.i3UNtw61022391@desire.geoffk.org>
2004-05-01  0:40       ` David Edelsohn
2004-05-01  2:08       ` Alan Modra
2004-03-30 14:55 [lno] PATCH rs6000.md fix Mostafa Hagog
2004-03-30 15:36 ` David Edelsohn
2004-03-31  4:35   ` Alan Modra
2004-04-01 12:43 ` Dorit Naishlos
2004-04-06 14:21   ` Dorit Naishlos
2004-04-09 20:06   ` Dorit Naishlos
2004-03-13 11:43 [csl-arm, HEAD] ARM PATCH - fix QImode addressing on ARMv4 Richard Earnshaw
2004-03-13 13:01 ` Richard Earnshaw
2004-03-13 21:44   ` Daniel Jacobowitz
2004-03-14  0:43     ` Richard Earnshaw
2004-03-19  8:14       ` Richard Earnshaw
2004-03-19  8:14     ` Daniel Jacobowitz
2004-03-19  8:14   ` Richard Earnshaw
2004-03-19  8:14 ` Richard Earnshaw
     [not found] <40491948.2010900@us.ibm.com>
2004-03-06  9:52 ` Powerpc64 long double support Alan Modra
2004-03-19  8:14   ` Alan Modra
2004-03-06 10:50 ` Alan Modra
2004-03-06 23:13   ` Geoff Keating
2004-03-07  6:37     ` Alan Modra
2004-03-07  7:30       ` Richard Henderson
2004-03-09  5:05         ` Alan Modra
2004-03-09  7:59           ` Richard Henderson
2004-03-09 23:49             ` Alan Modra
2004-03-10  9:42               ` Richard Sandiford
2004-03-10 11:01                 ` Alan Modra
2004-03-10 11:11                   ` Richard Sandiford
2004-03-10 18:48                     ` David Edelsohn
2004-03-10 20:11                       ` Richard Sandiford
2004-03-19  8:14                         ` Richard Sandiford
2004-03-19  8:14                       ` David Edelsohn
2004-03-19  8:14                     ` Richard Sandiford
2004-03-10 11:13                   ` Alan Modra
2004-03-10 11:25                     ` Richard Sandiford
2004-03-10 11:58                       ` Alan Modra
2004-03-10 12:06                         ` Richard Sandiford
2004-03-10 12:25                           ` Alan Modra
2004-03-19  8:14                             ` Alan Modra
2004-03-10 12:42                           ` Andreas Schwab
2004-03-10 12:53                             ` Richard Sandiford
2004-03-19  8:14                               ` Richard Sandiford
2004-03-10 13:48                             ` Alan Modra
2004-03-19  8:14                               ` Alan Modra
2004-03-19  8:14                             ` Andreas Schwab
2004-03-19  8:14                           ` Richard Sandiford
2004-03-19  8:14                         ` Alan Modra
2004-03-19  8:14                       ` Richard Sandiford
2004-03-19  8:14                     ` Alan Modra
2004-03-19  8:14                   ` Alan Modra
2004-03-10 16:18                 ` David Edelsohn
2004-03-11 15:05                   ` Correct powerpc64 long double -0.0 to double conversion Alan Modra
2004-03-19  8:14                     ` Alan Modra
2004-03-19  8:14                   ` Powerpc64 long double support David Edelsohn
2004-03-19  8:14                 ` Richard Sandiford
2004-03-19  8:14               ` Alan Modra
2004-04-01  0:56               ` Geoff Keating
2004-04-06 14:21                 ` Geoff Keating
2004-04-09 20:06                 ` Geoff Keating
2004-03-19  8:14             ` Richard Henderson
2004-03-19  8:14           ` Alan Modra
2004-03-19  8:14         ` Richard Henderson
2004-03-19  8:14       ` Alan Modra
2004-03-19  8:14     ` Geoff Keating
2004-03-19  8:14   ` Alan Modra
2004-03-03 15:14 Fix PR 14406 (rs6000 abstf2) Alan Modra
2004-03-19  8:14 ` Alan Modra
2004-02-10 11:42 Fix libjava failure on powerpc64-linux Alan Modra
2004-02-10 12:34 ` Andrew Haley
2004-02-10 13:31   ` Alan Modra
2004-02-10 14:12     ` Andrew Haley
2004-02-21 13:45       ` Andrew Haley
2004-02-21 13:45     ` Alan Modra
2004-02-21 13:45   ` Andrew Haley
2004-02-21 13:45 ` Alan Modra
     [not found] <20040108000828.GQ2533@bubble.modra.org>
     [not found] ` <jmllojuvbk.fsf@desire.geoffk.org>
     [not found]   ` <20040108005715.GS2533@bubble.modra.org>
     [not found]     ` <200401080107.i0817HT26846@makai.watson.ibm.com>
2004-01-08  2:01       ` mainline -mcpu=power4 Alan Modra
     [not found] <OFEA5CA921.302AEEB5-ON41256E00.005FB141@de.ibm.com>
     [not found] ` <200312182258.hBIMwgT25422@makai.watson.ibm.com>
     [not found]   ` <200312201527.hBKFRHgI000712@elgar.kettenis.dyndns.org>
     [not found]     ` <3FF5A069.1040306@gnu.org>
     [not found]       ` <200401022317.i02NHQBR001191@desire.geoffk.org>
2004-01-06 15:27         ` Incorrect DWARF-2 register numbers on PPC64? Alan Modra
2004-01-06 18:07           ` Geoff Keating
2004-01-06 18:10             ` David Edelsohn
2004-01-06 22:05               ` Geoff Keating
2004-01-06 22:08                 ` David Edelsohn
2004-01-06 22:34                   ` Geoff Keating
2004-01-07  0:26                     ` Alan Modra
2004-01-07 17:43           ` Mark Kettenis
2004-01-07 22:29             ` Alan Modra
2004-01-07 23:36               ` Andrew Cagney
2004-01-08  0:48                 ` Alan Modra
2004-01-08  5:01                   ` Geoff Keating
2004-01-09  2:34                     ` Alan Modra
2004-01-09  2:49                       ` Alan Modra
2004-01-09  6:39                       ` Alan Modra
2004-01-17  6:54                         ` Alan Modra
2004-01-17  8:05                           ` Geoff Keating
2004-01-18  6:03                             ` Alan Modra
2004-01-09 15:15                       ` Mark Kettenis
     [not found]           ` <amodra@bigpond.net.au>
2002-05-23  7:02             ` rs6000.c:output_toc Alan Modra
2002-05-23  9:21               ` rs6000.c:output_toc David Edelsohn
     [not found]             ` <200209140219.WAA25730@makai.watson.ibm.com>
2002-09-13 19:59               ` SYMBOL_REF_FLAG Alan Modra
2002-09-13 22:17                 ` SYMBOL_REF_FLAG Richard Henderson
2002-09-14  0:09                   ` SYMBOL_REF_FLAG David Edelsohn
2002-09-14  0:01                 ` SYMBOL_REF_FLAG David Edelsohn
2002-10-03 19:25             ` powerpc fix for gcc.dg/asm-names.c failure Alan Modra
2002-10-03 19:32               ` David Edelsohn
2003-06-07  5:59             ` powerpc64 crt* tweak Alan Modra
2003-06-07  6:00               ` David Edelsohn
2003-08-01 14:57             ` powerpc64-linux libffi update Alan Modra
2003-08-01 15:08               ` David Edelsohn
2004-01-06 16:02             ` Incorrect DWARF-2 register numbers on PPC64? David Edelsohn
2004-01-08  2:09             ` mainline -mcpu=power4 David Edelsohn
2004-01-08 15:14               ` Alan Modra
2004-02-10 15:07             ` Fix libjava failure on powerpc64-linux David Edelsohn
2004-02-10 15:09               ` Andrew Haley
2004-02-10 15:39                 ` David Edelsohn
2004-02-10 15:59                   ` Andrew Haley
2004-02-10 16:14                     ` David Edelsohn
2004-02-21 13:45                       ` David Edelsohn
2004-02-21 13:45                     ` Andrew Haley
2004-02-21 13:45                   ` David Edelsohn
2004-02-21 13:45                 ` Andrew Haley
2004-02-21 13:45               ` David Edelsohn
2004-03-03 21:03             ` Fix PR 14406 (rs6000 abstf2) David Edelsohn
2004-03-03 21:34               ` Alan Modra
2004-03-04  2:44                 ` Alan Modra
2004-03-19  8:14                   ` Alan Modra
2004-03-19  8:14                 ` Alan Modra
2004-03-19  8:14               ` David Edelsohn
2004-03-04  2:47             ` David Edelsohn
2004-03-19  8:14               ` David Edelsohn
2004-03-10  6:23             ` Powerpc64 long double support David Edelsohn
2004-03-10  6:44               ` Alan Modra
2004-03-19  8:14                 ` Alan Modra
2004-03-19  8:14               ` David Edelsohn
2004-03-12 20:26             ` Correct powerpc64 long double -0.0 to double conversion David Edelsohn
2004-03-19  8:14               ` David Edelsohn
2004-04-30 14:55             ` rs6000 stack boundary David Edelsohn
2004-05-09 14:19             ` Fixes for powerpc-linux param passing David Edelsohn
2004-05-09 14:22               ` Aldy Hernandez
2004-05-18  4:23             ` [lno] PATCH rs6000.md fix David Edelsohn
2004-05-26 18:52             ` rs6000 mainline patch for pr 14478 David Edelsohn
2004-05-26 20:17               ` Alan Modra
2004-05-26 21:18             ` David Edelsohn
2004-07-26 22:22             ` [PATCH, committed] SFmode arg padding and va_arg cleanup David Edelsohn
2004-07-28 11:26               ` Alan Modra
2004-07-28 12:11                 ` Alan Modra
2004-08-06  7:58                   ` Alan Modra
2004-07-28 12:17             ` David Edelsohn
2004-08-06 14:53             ` David Edelsohn
2004-08-11 14:47             ` powerpc64 fixes missing from 3.4 branch David Edelsohn
2004-08-20 22:35             ` powerpc64 linux dot symbols David Edelsohn
2004-08-25 23:56             ` RS6000 fix pr16480 David Edelsohn
2004-08-26  1:21               ` Alan Modra
2004-08-26  1:30             ` David Edelsohn
2004-11-10  4:48             ` fix pr 16480 on gcc-3.4 David Edelsohn
2004-11-24 18:27             ` [RFC] PowerPC sCC patterns David Edelsohn
2004-11-26  4:45             ` [PATCH] Fix PR16356 David Edelsohn
2004-11-27  0:00             ` [RS6000] Fix PR12817 David Edelsohn
2004-11-27  0:18               ` Alan Modra
2004-11-27  4:55                 ` Geoffrey Keating
2004-11-27  8:41                   ` Alan Modra
2004-11-27  7:34                 ` Alan Modra
2004-11-27 19:56                 ` Dale Johannesen
2004-11-27  4:15               ` Geoffrey Keating
2004-11-27 22:30               ` Mike Stump
2004-11-27 22:03             ` David Edelsohn
2004-11-27 22:23               ` Mike Stump
2004-11-27 22:48               ` Dale Johannesen
2004-11-27 22:52                 ` Alan Modra
2004-11-28  0:20                   ` Dale Johannesen
2004-11-29  2:38               ` Geoffrey Keating
2004-12-03 16:02             ` mklibgcc fallout David Edelsohn
2004-12-03 21:55               ` Mark Mitchell
2004-12-03 22:18                 ` David Edelsohn
2004-12-03 22:42                   ` Mark Mitchell
2004-12-04  3:06                     ` Alan Modra
2004-12-04  3:40                       ` Zack Weinberg
2004-12-08 12:09                       ` Richard Sandiford
2004-12-08 13:54                         ` Alan Modra
2004-12-08 14:08                           ` Richard Sandiford
2004-12-23 17:46             ` [PATCH] PR target/19137: ICE with load of TImode constant David Edelsohn
     [not found]               ` <20041224002336.GB2765@bubble.modra.org>
2004-12-24 19:30                 ` David Edelsohn
2004-12-24 15:55             ` [PATCH] Fix target/19147: invalid rlwinm patterns David Edelsohn
2005-01-12  5:23             ` [PATCH] PR target/19389 Odd gpr mem load unrecognizable insn David Edelsohn
2005-01-29 17:21             ` [PATCH] powerpc dwarf2 unwinder fallback David Edelsohn
2005-02-15 19:41             ` [RFC] PowerPC 128 bit long double compatibility (PR target/19019) David Edelsohn
2005-02-16  1:52               ` Geoffrey Keating
2005-03-02 16:17             ` Implicit altivec vs. linux kernel build David Edelsohn
2005-03-17  0:48             ` powerpc-linux unwinder fix for 3.4 David Edelsohn
2005-03-20 23:01             ` [PATCH] PowerPC function arg alignment tidy David Edelsohn
2005-03-31  0:16             ` [RS6000] Fix PR20611, duplicate label for inlined function referencing TLS David Edelsohn
2005-05-31 14:32             ` powerpc new PLT and GOT David Edelsohn
2005-06-10  1:13             ` [PATCH] PowerPC SVR4 _mcount calls David Edelsohn
2005-09-12  3:15             ` [RS6000] Nop-insertion fix David Edelsohn
2005-09-12  3:59             ` [PowerPC] Fix PR23774 stack backchain broken David Edelsohn
2005-09-13  0:44             ` David Edelsohn
2005-09-13  4:19               ` Alan Modra
2005-10-21  2:01             ` [PowerPC] -msdata=data needless use of .sbss section David Edelsohn
2005-11-02 23:37             ` [PowerPC64] gcc.c-torture/compile/pr20928.c failure David Edelsohn
2005-11-08  2:59             ` [PowerPC] Fix PR23704, -m64 overrides prior -mno-powerpc-gfxopt David Edelsohn
2005-11-25  5:15             ` [RS6000] Add some more functions to ppc64-fp.c David Edelsohn
2005-11-27 23:25             ` [PowerPC] -msdata=data needless use of .sbss section David Edelsohn
2005-11-28  2:51             ` [PowerPC] PR24997: ICE with -ftree-vectorize David Edelsohn
2005-12-07 13:35             ` [PowerPC] Fix pr25212, indexed address predicates David Edelsohn
2005-12-10 18:14             ` [PowerPC] Default TARGET_ALIGN_NATURAL properly for target libs David Edelsohn
2005-12-15  7:04             ` [PowerPC] Fix 25406, rs6000_special_round_type_align David Edelsohn
2005-12-28 15:07             ` [PowerPC] Fix PR25572, -mminimal-toc trashes r30 David Edelsohn
2006-02-24  3:03             ` [PowerPC64] Fix 26453, segfault -m64 -mtraceback=full David Edelsohn
2006-03-30 21:46             ` [PowerPC] linuxspe vs. ibm long double David Edelsohn
2006-03-30 23:42               ` Alan Modra
2006-03-30 21:50             ` [PowerPC] PR26459 again David Edelsohn
2006-03-30 23:12               ` Alan Modra
2006-03-30 23:50             ` [PowerPC] linuxspe vs. ibm long double David Edelsohn
2006-03-31  0:33             ` [PowerPC] PR26459 again David Edelsohn
2006-04-05  2:26             ` [RFT/RFA] Fix AIX fallout from PR/19653 patch David Edelsohn
2006-04-12  0:53             ` [PowerPC] Avoid ICE on DFmode subreg David Edelsohn
2006-07-07 13:08             ` [PATCH, committed] PR 28150 and PR 28170 David Edelsohn
2003-10-24 13:52 [PATCH] - Use of powerpc 64bit instructions in 32bit ABI Ulrich Weigand
2003-10-24 17:32 ` David Edelsohn
2003-10-24 18:07   ` Richard Henderson
2003-10-24 18:34     ` Fariborz Jahanian
2003-10-24 18:34     ` David Edelsohn
2003-10-24 18:57       ` Richard Henderson
2003-11-03 20:55         ` David Edelsohn
2003-12-01 14:27           ` Eric Botcazou
2003-12-01 15:46             ` David Edelsohn
2003-12-01 16:15               ` Eric Botcazou
2003-12-01 18:44                 ` David Edelsohn
2003-12-01 19:29                   ` Fariborz Jahanian
2003-12-01 19:33                     ` David Edelsohn
2003-12-02  9:20                       ` Eric Botcazou
2003-12-02 16:17                         ` David Edelsohn
2003-12-02 17:28                           ` Eric Botcazou
2003-12-02 17:39                             ` Fariborz Jahanian
2003-12-02 18:50                               ` Eric Botcazou
2003-12-02 17:41                             ` David Edelsohn
2003-12-02 18:50                               ` Eric Botcazou
2003-12-02 18:02                             ` David Edelsohn
2003-12-02 23:34                               ` Eric Botcazou
2003-12-02 23:42                                 ` fj
2003-12-05 16:41                                   ` Eric Botcazou
2003-12-02  8:00                   ` Eric Botcazou
2003-10-24 19:08       ` Geoff Keating
2003-10-24 19:10         ` David Edelsohn
2003-10-24 19:22       ` Dale Johannesen
2003-10-24 19:28         ` Dale Johannesen
2003-10-24 21:25         ` Richard Henderson
2003-10-25  3:10         ` David S. Miller
     [not found] <20030911193937.GD16280@redhat.com>
2003-09-11 20:08 ` C++ testsuite failures on AIX David Edelsohn
2003-09-11 20:09   ` Richard Henderson
2003-09-11 20:17     ` David Edelsohn
2003-09-11 20:20       ` Richard Henderson
2003-09-11 20:49         ` Mark Mitchell
2003-09-11 21:33           ` Jan Hubicka
2003-09-11 20:29     ` Jan Hubicka
2003-09-11 20:31       ` David Edelsohn
2003-09-11 20:32         ` Jan Hubicka
2003-09-11 20:48           ` David Edelsohn
2003-09-11 20:50           ` David Edelsohn
     [not found] <20030612184851.E41510@forte.austin.ibm.com>
     [not found] ` <20030613020133.GK23826@bubble.sa.bigpond.net.au>
2003-06-13 15:07   ` ppc64 floating point usage [was Re: PPC64 Compiler bug !!] Alan Modra
2003-06-13 15:38     ` David Edelsohn
2003-06-13 20:04     ` David Edelsohn
2003-06-13 20:06       ` Jakub Jelinek
2003-06-13 20:38         ` David Edelsohn
2003-06-13 21:06           ` linas
2003-06-13 21:17             ` Michael S. Zick
2003-06-14  3:14               ` Michael Meissner
2003-06-14  3:14                 ` gp
2003-06-14  9:07                 ` Alan Modra
2003-06-14 14:59                 ` Michael S. Zick
2003-06-14 22:52                   ` Michael Meissner
2003-06-13 22:19             ` Janis Johnson
2003-06-13 21:08       ` linas
2003-08-08  7:24       ` Alan Modra
2003-08-08 14:01         ` David Edelsohn
2003-08-09  1:55           ` ppc64 floating point usage Zack Weinberg
2003-08-09  2:42             ` David Edelsohn
2003-08-09  2:56               ` Alan Modra
2003-08-09  3:15                 ` David Edelsohn
2003-08-09  3:40                   ` Alan Modra
2003-08-09  2:23           ` ppc64 floating point usage [was Re: PPC64 Compiler bug !!] Eric Christopher
2003-08-09  2:50             ` David Edelsohn
2003-08-09  3:06               ` Eric Christopher
2003-08-09  3:27                 ` David Edelsohn
2003-08-09 23:14                   ` Eric Christopher
2003-08-09 23:27                     ` David Edelsohn
2003-08-10  0:15                       ` Eric Christopher
2003-08-10  5:17                   ` Geoff Keating
2003-08-10  6:43                     ` Alan Modra
2003-08-10  7:00                       ` Geoff Keating
2003-05-27 11:58 [PATCH] powerpc64-linux bi-arch support Jakub Jelinek
2003-05-27 14:57 ` David Edelsohn
2003-05-27 15:06   ` Jakub Jelinek
2003-05-27 15:25     ` David Edelsohn
2003-05-27 15:52       ` Jakub Jelinek
2003-05-27 22:44         ` David Edelsohn
2003-05-29 22:50           ` Alan Modra
2003-05-30  0:52             ` David Edelsohn
2003-05-31 20:52               ` Jakub Jelinek
2003-06-02 20:24                 ` David Edelsohn
2003-06-02 22:08                 ` David Edelsohn
2003-06-02 22:37                   ` Michael Meissner
2003-06-04 14:48 ` David Edelsohn
2003-06-04 16:51 ` David Edelsohn
2003-06-04 16:58   ` Jakub Jelinek
2003-08-19 20:07 ` David Edelsohn
2003-08-20  0:42   ` Alan Modra
2003-05-14 16:25 powerpc-unknown-linux-gnu bootstrap fix Matt Kraai
2003-05-14 17:12 ` David Edelsohn
2003-05-14 17:23   ` Richard Henderson
2003-05-14 17:26     ` David Edelsohn
2003-05-14 17:52       ` Richard Henderson
2003-05-14 18:58         ` David Edelsohn
2003-05-15 23:22       ` Geoff Keating
2003-05-18 23:17         ` Matt Kraai
2003-05-19  0:16           ` Geoff Keating
2003-04-24 15:34 function parms in regs, patch 1 of 3 Alan Modra
2003-04-25 22:44 ` Richard Henderson
2003-04-26  0:33   ` Janis Johnson
2003-04-27 23:34   ` Alan Modra
2003-04-30 13:29     ` function parms in regs, patch 3 " Alan Modra
2003-05-02  6:05       ` Jim Wilson
2003-05-02 12:38         ` Alan Modra
2003-05-02 20:23           ` Jim Wilson
2003-05-03  1:22             ` Alan Modra
2003-07-10  6:55       ` Jim Wilson
2003-07-14  2:51         ` Alan Modra
2003-07-14  3:00           ` David Edelsohn
2003-07-15 15:08           ` David Edelsohn
2003-07-16  3:49             ` Alan Modra
2003-07-16 15:08               ` David Edelsohn
2003-07-16 15:10               ` David Edelsohn
2003-05-02  5:06 ` function parms in regs, patch 1 " Jim Wilson
2003-05-02  5:20   ` Richard Henderson
2003-04-24 15:34 function parms in regs, patch 3 " Alan Modra
2003-04-09 19:34 [PATCH] fold-const.c use of BRANCH_COST David Edelsohn
2003-04-11  4:11 ` Richard Henderson
2003-04-11  4:23   ` Andrew Pinski
2003-04-11 17:47     ` Geoff Keating
2003-04-12  2:48       ` Segher Boessenkool
2003-04-11  4:24   ` David Edelsohn
2003-04-11  4:43     ` Richard Henderson
2003-04-11  4:58       ` David Edelsohn
2003-04-11  5:11         ` Richard Henderson
2003-04-11 14:41           ` David Edelsohn
2003-04-11 14:47             ` Jan Hubicka
2003-04-11 15:51               ` David Edelsohn
2003-04-11 16:57                 ` Jan Hubicka
2003-04-11 16:58                   ` David Edelsohn
2003-04-21 17:24                   ` David Edelsohn
2003-04-21 18:00                     ` Richard Henderson
2003-04-22 15:02                       ` David Edelsohn
2003-04-11 17:49             ` Geoff Keating
2003-04-11 17:08         ` Dale Johannesen
2003-04-11 17:54           ` David Edelsohn
2002-08-21 11:09 [RFC] PowerPC select_section / unique_section David Edelsohn
2002-08-21 11:21 ` Franz Sirl
2002-08-21 11:29   ` David Edelsohn
2002-08-21 12:01     ` Franz Sirl
2002-08-21 12:15       ` David Edelsohn
2002-08-29 18:01         ` Richard Henderson
2002-08-21 19:02     ` Alan Modra
2002-08-21 19:19       ` David Edelsohn
2002-08-21 19:25         ` Alan Modra
2002-08-21 21:28           ` David Edelsohn
2002-08-29 18:19             ` Richard Henderson
2002-08-29 18:03       ` Richard Henderson
2002-08-30  7:15         ` Alan Modra
2002-08-30  8:27           ` Alan Modra
2002-08-30  8:42           ` David Edelsohn
2002-08-30 11:17             ` Franz Sirl
2002-08-30 11:26               ` Franz Sirl
2002-08-30 11:29               ` David Edelsohn
2002-08-30 17:32             ` Alan Modra
2002-08-30 18:17               ` Richard Henderson
2002-08-30 18:48                 ` Geoff Keating
2002-08-30 19:40                   ` -finline-functions vs -fpic Richard Henderson
2002-08-30 20:57                     ` Richard Henderson
2002-09-02 15:41                 ` [RFC] PowerPC select_section / unique_section David Edelsohn
2002-09-02 16:32                   ` Alan Modra
2002-09-02 16:51                     ` David Edelsohn
2002-09-02 17:13                       ` Alan Modra
2002-09-02 17:57                         ` David Edelsohn
2002-09-02 18:27                           ` Alan Modra
2002-09-02 18:49                             ` David Edelsohn
2002-09-02 19:41                               ` Alan Modra
2002-09-02 19:59                                 ` David Edelsohn
2002-09-02 20:17                               ` Richard Henderson
2002-09-02 20:11                             ` Jeff Sturm
2002-09-02 20:19                               ` David Edelsohn
2002-09-03  0:16                                 ` Richard Henderson
2002-09-03  8:22                                   ` David Edelsohn
2002-09-03  9:04                                     ` Richard Henderson
2002-09-03 10:40                                       ` David Edelsohn
2002-09-03 13:44                                         ` Richard Henderson
2002-09-03  9:29                             ` Mark Mitchell
2002-09-03  8:41                           ` Geoff Keating
2002-09-03  9:50                             ` David Edelsohn
2002-08-21 18:54 ` Alan Modra
2002-08-21 18:59   ` David Edelsohn
2002-08-01 18:39 power4 branch hints Alan Modra
2002-08-01 18:47 ` David Edelsohn
2002-08-01 19:50   ` Alan Modra
2002-08-01 20:25     ` David Edelsohn
2002-08-02 13:25   ` Geoff Keating
     [not found] <20020625081846.10430.qmail@sources.redhat.com>
2002-07-15  2:43 ` other/7114: ICE building strcoll.op from glibc-2.2.5 Alan Modra
2002-07-15  5:22   ` Alan Modra
2002-07-15 12:51   ` Geoff Keating
2002-07-15 16:54     ` Alan Modra
2002-07-15 18:38       ` Alan Modra
2002-07-15 22:08         ` Richard Henderson
2002-07-16  0:03           ` Alan Modra
2002-07-16 11:23         ` Geoff Keating
2002-07-16 18:51           ` Alan Modra
2002-07-16 22:07             ` Alan Modra
2002-07-17  0:58             ` Geoff Keating
2002-07-17  2:04               ` Alan Modra
2002-07-17 10:42                 ` David Edelsohn
2002-07-17 12:10                 ` Geoff Keating
2002-07-17  8:45               ` David Edelsohn
2002-07-17 12:26                 ` Geoff Keating
2002-07-17 14:05                   ` David Edelsohn
2002-07-17 19:20                   ` Alan Modra
2002-07-17 19:45                     ` David Edelsohn
2002-07-17 20:50                     ` David Edelsohn
2002-07-17 20:52                       ` Alan Modra
2002-07-16 10:46       ` Geoff Keating
     [not found] <20020712071414.GR30362@bubble.sa.bigpond.net.au>
2002-07-13  4:58 ` target/7282: powerpc64 SImode in FPR Alan Modra
2002-07-13  7:25   ` David Edelsohn
2002-07-13 23:36     ` Alan Modra
2002-07-14  7:59       ` David Edelsohn
2002-07-02 21:18 convert 32-bit PowerPC GNU/Linux to TARGET_OS_CPP_BUILTINS Matt Kraai
2002-07-03  8:10 ` David Edelsohn
2002-07-03  9:15   ` Matt Kraai
2002-07-03  9:25     ` Stan Shebs
2002-07-03  9:32     ` David Edelsohn
2002-07-03  9:36     ` Jason R Thorpe
2002-07-03 10:29       ` Matt Kraai
2002-07-03 23:50     ` Alan Modra
2002-07-04  9:22       ` David Edelsohn
2002-07-08 18:27         ` Matt Kraai
2002-07-08 19:05           ` Geoff Keating
2002-07-08 19:16           ` David Edelsohn
2002-07-09  0:37             ` Matt Kraai
2002-06-09  8:10 PowerPC cleanup and Power4 David Edelsohn
2002-06-09  8:24 ` Neil Booth
2002-06-09  8:31   ` David Edelsohn
2002-06-09 10:05 ` Geoff Keating
2002-06-09 10:33   ` David Edelsohn
2002-06-09 12:08     ` Geoff Keating
2002-06-09 12:15       ` David Edelsohn
2002-05-21 18:54 thread-local storage: c front end and generic backend patch Richard Henderson
2002-05-22  4:25 ` Joseph S. Myers
2002-05-22 13:53 ` Mark Mitchell
2002-05-22 14:22   ` Richard Henderson
2002-05-22 14:44     ` Gabriel Dos Reis
2002-05-22 14:55       ` Joseph S. Myers
2002-05-22 14:52     ` Mark Mitchell
2002-05-22 15:01       ` Richard Henderson
2002-05-22 15:13         ` Jakub Jelinek
2002-05-22 15:36           ` Richard Henderson
2002-05-22 15:42             ` Mark Mitchell
2002-05-22 15:56               ` Richard Henderson
2002-05-29 23:48           ` Fergus Henderson
2002-05-22 15:39         ` Mark Mitchell
2002-05-22 16:30           ` Richard Henderson
2002-05-22 16:46     ` Alexandre Oliva
2002-05-22 16:53       ` Richard Henderson
2002-07-11  9:00 ` David Edelsohn
2002-07-11 11:02   ` Richard Henderson
2002-07-26 11:08     ` [PATCH] " David Edelsohn
2002-07-27 15:40       ` Richard Henderson
2002-07-27 16:18         ` David Edelsohn
2002-07-29 11:02           ` Richard Henderson
2002-07-29 11:36             ` David Edelsohn
2002-07-29 15:30               ` Richard Henderson
2002-07-29 22:10                 ` David Edelsohn
2002-07-30  9:41                   ` Richard Henderson
2002-03-08 14:25 biggest alignment for sysv4.h altivec Aldy Hernandez
2002-03-08 14:49 ` Geoff Keating
2002-03-08 14:52   ` Aldy Hernandez
2002-03-08 15:16     ` Geoff Keating
2002-03-08 15:26       ` Aldy Hernandez
2002-03-08 15:48         ` Geoff Keating
2002-03-08 15:53           ` Aldy Hernandez
2002-03-08 17:34             ` Richard Henderson
2002-03-08 18:52       ` Richard Henderson
2002-03-08 20:09       ` David Edelsohn
2002-03-09  2:11         ` Geoff Keating
2002-03-09 15:09           ` David Edelsohn
2002-03-09 16:17             ` Geoff Keating
2002-03-06  6:54 f build dies with: undefined reference to `lookup_name' Andrew Cagney
2002-03-06  8:29 ` David Edelsohn
2002-03-06  8:53   ` Andrew Cagney
2002-03-06 10:18     ` David Edelsohn
2002-03-06 10:59       ` Richard Henderson
2002-03-06 11:27         ` David Edelsohn
2002-03-06 12:41           ` Richard Henderson
2002-03-06 14:18             ` David Edelsohn
2002-03-06 14:22               ` Richard Henderson
2002-03-06 15:06                 ` David Edelsohn
2002-03-06 15:07                   ` Richard Henderson
2002-03-06 15:09                     ` David Edelsohn
2002-03-06 15:13                       ` Richard Henderson
2002-03-06 15:18                 ` Alan Modra
2002-03-06 19:01                   ` Alan Modra
2002-03-10 14:27                     ` Andrew Cagney
2002-03-10 14:34                       ` David Edelsohn
2002-03-10 16:00                         ` Richard Henderson
2002-03-06 15:40         ` David Edelsohn
2002-03-14 11:34         ` David Edelsohn
2002-03-14 12:02           ` Neil Booth
2002-03-14 13:47           ` Geoff Keating
2002-03-14 14:07             ` David Edelsohn
2002-03-14 15:02               ` Geoff Keating
2002-03-14 15:24             ` David Edelsohn
2002-03-14 16:57               ` Alan Modra
2002-03-14 18:05                 ` Geoff Keating
2002-03-14 18:35                   ` David Edelsohn
2002-03-14 20:07                     ` Geoff Keating
2002-03-14 21:10                       ` Richard Henderson
2002-03-14 23:03                         ` Richard Henderson
2002-03-15  8:20                           ` David Edelsohn
2002-03-15 17:01                             ` Richard Henderson
2002-03-19 16:40                               ` Alan Modra
2002-03-19 17:02                                 ` Richard Henderson
2002-03-14 16:00         ` David Edelsohn
2002-03-06 11:43       ` Stan Shebs
2002-02-03 19:14 [PATCH] PowerPC fsel PR5217 David Edelsohn
2002-02-03 21:14 ` Geoff Keating
2002-02-03 21:40   ` David Edelsohn
2002-02-03 22:08     ` Geoff Keating
2002-02-03 22:45       ` David Edelsohn
2002-02-03 23:14         ` Geoff Keating
2002-02-04  9:26           ` David Edelsohn
2002-02-04 10:24             ` Geoff Keating
2002-02-04 10:40               ` David Edelsohn
2002-02-04 11:18                 ` Dale Johannesen
2002-02-04 11:44                 ` Geoff Keating
2002-02-05 11:20                   ` David Edelsohn
2002-02-05 12:47                     ` David Edelsohn
2002-02-05 14:17                       ` Mark Mitchell
2002-01-28  1:05 ppc call_value* fixes (plus minor apple gripe) Aldy Hernandez
2002-01-28  7:26 ` Stan Shebs
2002-01-29 14:30   ` Aldy Hernandez
2002-01-28  7:43 ` Geoff Keating
2002-01-28  7:49 ` David Edelsohn
2002-01-28 11:12 ` Richard Henderson
2002-01-28 12:52   ` David Edelsohn
2002-01-28 22:44   ` Aldy Hernandez
2002-01-29 10:51     ` David Edelsohn
2001-12-29  7:03 PATCH, rs6000 (alpha?) long const Tom Rix
2001-12-29 11:40 ` Richard Henderson
2001-12-29 12:40   ` Tom Rix
2001-12-29 17:00     ` Richard Henderson
2001-12-29 18:37       ` Tom Rix
2001-12-29 20:45         ` Richard Henderson
2001-12-29 21:24           ` PATCH, rs6000 (alpha?) long const --verbose Tom Rix
2001-12-29 23:01             ` Richard Henderson
2001-12-29 23:02             ` Richard Henderson
2002-01-01 12:21               ` PATCH, rs6000 (alpha?) long const take 2 Tom Rix
2002-01-01 13:46                 ` Richard Henderson
2002-01-02 13:20                   ` Geoff Keating
2002-01-02 13:22                     ` Richard Henderson
2002-01-02 20:44                     ` David Edelsohn
2002-01-03  0:52                       ` Richard Henderson
2002-01-03  8:06                         ` David Edelsohn
2002-01-04 12:04                           ` Geoff Keating
2002-01-04 14:31                             ` David Edelsohn
2002-01-10 14:00                               ` PATCH, rs6000 long const take 3 Tom Rix
2002-01-10 14:08                                 ` Richard Henderson
2002-01-10 14:20                                 ` David Edelsohn
2001-11-13 15:03 [PATCH] adds powerpc-*-freebsd? to mainline David Edelsohn
2001-11-13 15:03 ` Geoff Keating
2001-11-13 15:03   ` David Edelsohn
2001-11-13 15:03     ` Geoff Keating
2001-11-13 15:03       ` David Edelsohn
2001-11-13 15:03         ` David O'Brien
2001-11-13 15:03         ` Richard Henderson
2001-11-13 15:03     ` David O'Brien
