public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc
@ 2004-07-13 21:54 dann at godzilla dot ics dot uci dot edu
  2004-07-13 22:31 ` [Bug target/16532] " pinskia at gcc dot gnu dot org
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2004-07-13 21:54 UTC (permalink / raw)
  To: gcc-bugs

On sparc-sun-solaris2.7 using 
gcc-3.4.0 -O2 -mcpu=ultrasparc -m32

This function:

extern unsigned char first_one[65536];

int FirstOne(unsigned long long arg1)
{
  if (arg1 >> 48)
    return (first_one[arg1 >> 48]);
  if ((arg1 >> 32) & 65535)
    return (first_one[(arg1 >> 32) & 65535] + 16);
  if ((arg1 >> 16) & 65535)
    return (first_one[(arg1 >> 16) & 65535] + 32);
  return (first_one[arg1 & 65535] + 48);
}

is compiled to: 
FirstOne:
        !#PROLOGUE# 0
        save    %sp, -112, %sp
        !#PROLOGUE# 1
        sethi   %hi(64512), %o3
        mov     0, %o2
        sllx    %i0, 32, %g1
        srl     %i1, 0, %i1
        or      %i1, %g1, %g1
        srlx    %g1, 32, %g1
        srlx    %g1, 32, %i4
        mov     %g1, %i5
        or      %o3, 1023, %o3
        and     %i4, %o2, %o4
        sllx    %i0, 32, %g1
        srl     %i1, 0, %i1
        or      %i1, %g1, %g1
        srlx    %g1, 48, %g1
        srlx    %g1, 32, %i2
        mov     %g1, %i3
        orcc    %i2, %i3, %g0
        be,pt   %icc, .LL2
        and     %i5, %o3, %o5
        sethi   %hi(first_one), %g1
        or      %g1, %lo(first_one), %g1
        ba,pt   %xcc, .LL1
        ldub    [%g1+%i3], %i0
.LL2:
        sethi   %hi(64512), %g1
        orcc    %o4, %o5, %g0
        or      %g1, 1023, %g1
        be,pt   %icc, .LL3
        and     %i5, %g1, %i5
        sethi   %hi(first_one), %g1
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%i5], %i5
        ba,pt   %xcc, .LL1
        add     %i5, 16, %i0
.LL3:
        sethi   %hi(64512), %g1
        or      %g1, 1023, %g1
        and     %i1, %g1, %i2
        sllx    %i0, 32, %g1
        srl     %i1, 0, %i1
        or      %i1, %g1, %g1
        srlx    %g1, 16, %g1
        srlx    %g1, 32, %i4
        mov     %g1, %i5
        sethi   %hi(64512), %g1
        and     %i4, %o2, %i4
        or      %g1, 1023, %g1
        and     %i5, %g1, %i3
        and     %i5, %o3, %i5
        orcc    %i4, %i5, %g0
        be,pt   %icc, .LL4
        sethi   %hi(first_one), %g1
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%i3], %i5
        ba,pt   %xcc, .LL1
        add     %i5, 32, %i0
.LL4:
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%i2], %i5
        add     %i5, 48, %i0
.LL1:
        return  %i7+8
        nop


It is probably better to just duplicate the epilogue instead of using 
"ba,pt   %xcc, .LL1"; this is what Forte-7 does (and gcc on x86). 
I don't have mainline gcc built on sparc, so I can't check whether this is still the case.

The function is from crafty. A different version of this function appears in
186.crafty from SPEC2000, and the same problem can be seen in that code. (The
SPEC2K code, which is functionally equivalent, is optimized much better than the
function above.)
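
For anyone who wants to run the testcase rather than only compile it, here is a
minimal sketch of a harness. The report only declares first_one, so the table
contents are an assumption: the sketch fills first_one[i] with the number of
leading zero bits of the 16-bit value i (16 for i == 0), which is consistent
with the +16/+32/+48 offsets in FirstOne. init_first_one is a hypothetical
helper name.

```c
#include <assert.h>

/* Assumption: first_one[i] is the count of leading zero bits in the
   16-bit value i, with first_one[0] == 16.  The original report only
   declares this table as extern. */
unsigned char first_one[65536];

/* Hypothetical helper that fills the table under the assumption above. */
void init_first_one(void)
{
  for (unsigned i = 0; i < 65536; i++) {
    unsigned char n = 0;
    /* Walk from bit 15 down until a set bit is found. */
    for (unsigned mask = 0x8000u; mask != 0 && !(i & mask); mask >>= 1)
      n++;
    first_one[i] = n;
  }
}

/* The function from the report, unchanged. */
int FirstOne(unsigned long long arg1)
{
  if (arg1 >> 48)
    return (first_one[arg1 >> 48]);
  if ((arg1 >> 32) & 65535)
    return (first_one[(arg1 >> 32) & 65535] + 16);
  if ((arg1 >> 16) & 65535)
    return (first_one[(arg1 >> 16) & 65535] + 32);
  return (first_one[arg1 & 65535] + 48);
}
```

Call init_first_one() once before the first call to FirstOne. Under the stated
assumption each of the four return paths yields the leading-zero count of the
full 64-bit argument.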

-- 
           Summary: duplicate function epilogue on sparc
           Product: gcc
           Version: 3.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dann at godzilla dot ics dot uci dot edu
                CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] duplicate function epilogue on sparc
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
@ 2004-07-13 22:31 ` pinskia at gcc dot gnu dot org
  2004-07-14  7:29 ` [Bug target/16532] Inefficient jump to epilogue ebotcazou at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-07-13 22:31 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-07-13 22:31 -------
I believe this is fixed on mainline by the rewrite of the epilogue/prologue code of the sparc
back-end.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |target
 GCC target triplet|                            |sparc-sun-solaris2.7
           Keywords|                            |missed-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
  2004-07-13 22:31 ` [Bug target/16532] " pinskia at gcc dot gnu dot org
@ 2004-07-14  7:29 ` ebotcazou at gcc dot gnu dot org
  2004-07-14  7:43 ` ebotcazou at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-07-14  7:29 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-07-14 07:29 -------
Thanks for the report, this is very helpful.  I'm tweaking the summary because
it is slightly confusing.


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
  GCC build triplet|                            |sparc-*-*
   GCC host triplet|                            |sparc-*-*
 GCC target triplet|sparc-sun-solaris2.7        |sparc-*-*
   Last reconfirmed|0000-00-00 00:00:00         |2004-07-14 07:29:09
               date|                            |
            Summary|duplicate function epilogue |Inefficient jump to epilogue
                   |on sparc                    |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
  2004-07-13 22:31 ` [Bug target/16532] " pinskia at gcc dot gnu dot org
  2004-07-14  7:29 ` [Bug target/16532] Inefficient jump to epilogue ebotcazou at gcc dot gnu dot org
@ 2004-07-14  7:43 ` ebotcazou at gcc dot gnu dot org
  2004-07-14 16:19 ` dann at godzilla dot ics dot uci dot edu
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-07-14  7:43 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-07-14 07:43 -------
> It is probably better to just duplicate the epilogue instead of using 
> "ba,pt   %xcc, .LL1", this is what Forte-7 does (and gcc on x86).

Definitely.

> I don't have mainline gcc built on sparc to see if this is still the case.

We're on the right track on mainline but not yet there.  We now can emit the
epilogue anywhere in the function, so mainline compiles the function into:

FirstOne:
        save    %sp, -112, %sp
        sethi   %hi(64512), %o5
        mov     0, %o4
        sllx    %i0, 32, %g1
        srl     %i1, 0, %i1
        or      %i1, %g1, %g1
        srlx    %g1, 32, %g1
        srlx    %g1, 32, %i4
        mov     %g1, %i5
        or      %o5, 1023, %o5
        and     %i4, %o4, %i4
        sllx    %i0, 32, %g1
        srl     %i1, 0, %i1
        or      %i1, %g1, %g1
        srlx    %g1, 48, %g1
        srlx    %g1, 32, %i2
        mov     %g1, %i3
        orcc    %i2, %i3, %g0
        be,pt   %icc, .LL2
         and    %i5, %o5, %i5
        sethi   %hi(first_one), %g1
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%i3], %i0
.LL10:
        return  %i7+8
         nop
.LL2:
        orcc    %i4, %i5, %g0
        be,pt   %icc, .LL5
         sethi  %hi(64512), %g1
        sethi   %hi(first_one), %g1
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%i5], %i5
        ba,pt   %xcc, .LL10
         add    %i5, 16, %i0
.LL5:
        or      %g1, 1023, %g1
        and     %i1, %g1, %i3
        sllx    %i0, 32, %g1
        srl     %i1, 0, %i1
        or      %i1, %g1, %g1
        srlx    %g1, 16, %g1
        srlx    %g1, 32, %i4
        mov     %g1, %i5
        and     %i4, %o4, %i0
        and     %i5, %o5, %i1
        orcc    %i0, %i1, %g0
        be,pt   %icc, .LL7
         sethi  %hi(first_one), %g1
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%i1], %i5
        ba,pt   %xcc, .LL10
         add    %i5, 32, %i0
.LL7:
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%i3], %i5
        ba,pt   %xcc, .LL10
         add    %i5, 48, %i0
        .size   FirstOne, .-FirstOne
        .ident  "GCC: (GNU) 3.5.0 20040710 (experimental)"


The next planned step is to allow multiple epilogues (at least when they are
trivial).  This is currently blocked by a small code quality regression
introduced by the previous change in some cases.

> The function is from crafty, a different version of this function appears in
> 186.crafty from SPEC2000, the same problem can be seen in that code. (The code
> in SPEC2K, which is functionally equivalent is optimized much better than the
> function above).

Do you run SPEC2K regularly on SPARC machines?


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |ebotcazou at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (2 preceding siblings ...)
  2004-07-14  7:43 ` ebotcazou at gcc dot gnu dot org
@ 2004-07-14 16:19 ` dann at godzilla dot ics dot uci dot edu
  2004-08-18 18:30 ` dann at godzilla dot ics dot uci dot edu
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2004-07-14 16:19 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dann at godzilla dot ics dot uci dot edu  2004-07-14 16:19 -------
(In reply to comment #3)
> Do you run SPEC2K regularly on SPARC machines?

Unfortunately I don't have access to fast SPARCs, so I don't.
BTW PR16541 shows that much nicer code can be generated for the above function.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (3 preceding siblings ...)
  2004-07-14 16:19 ` dann at godzilla dot ics dot uci dot edu
@ 2004-08-18 18:30 ` dann at godzilla dot ics dot uci dot edu
  2004-08-19 10:13 ` ebotcazou at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2004-08-18 18:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dann at godzilla dot ics dot uci dot edu  2004-08-18 18:30 -------
A simplified version of the testcase shows an issue with the ultrasparc code
generation: 

extern unsigned char first_one[65536];

int FirstOne(unsigned long long arg1)
{
  if (arg1 >> 48)
    return (first_one[arg1 >> 48]);
  return 0;
}

gcc -O2 -mcpu=ultrasparc generates a "save" in the prologue. 
A "save" is not generated when compiling for v8. 
There should be no need for a "save": this is a leaf function, and the number of
registers used is very small. 



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (4 preceding siblings ...)
  2004-08-18 18:30 ` dann at godzilla dot ics dot uci dot edu
@ 2004-08-19 10:13 ` ebotcazou at gcc dot gnu dot org
  2004-08-19 16:58 ` dann at godzilla dot ics dot uci dot edu
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-08-19 10:13 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-08-19 10:13 -------
> gcc -O2 -mcpu=ultrasparc generates a "save" in the prologue, 
> A "save" is not generated when compiling for v8. 
> There should be no need for a "save", this is a leaf function, and the number
> of registers used is very small. 

That's because -mcpu=ultrasparc enables -mv8plus, which means that the global
and out registers can be used as full 64-bit registers.  For example in the .s file:

srlx    %o5, 48, %o5

However, this optimization is partially defeated by the 32-bit calling
convention in the testcase, so I'm not sure it is really worth while there.
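
As a rough illustration of why the 32-bit calling convention gets in the way:
the 64-bit argument arrives split across two 32-bit registers, so the v8plus
code first has to glue the halves back together before it can use a 64-bit
shift. The C below is only a sketch of that step, not compiler output, and
combine_halves and check_combine_halves are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuilds the 64-bit argument from the two 32-bit
   words it is passed in under the 32-bit calling convention
   (hi is the upper word, lo is the lower word). */
uint64_t combine_halves(uint32_t hi, uint32_t lo)
{
  uint64_t high = (uint64_t)hi << 32;  /* like sllx: move hi to bits 63..32 */
  uint64_t low = (uint64_t)lo;         /* like srl by 0: keep bits 31..0 only */
  return high | low;                   /* like or: merge the two halves */
}

/* Small self-check of the sketch. */
void check_combine_halves(void)
{
  assert(combine_halves(0x12345678u, 0x9abcdef0u) == 0x123456789abcdef0ULL);
  assert(combine_halves(0u, 0xffffffffu) == 0xffffffffULL);
}
```

These three operations are pure overhead compared with a 64-bit calling
convention, where the argument would already sit in a single register.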


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (5 preceding siblings ...)
  2004-08-19 10:13 ` ebotcazou at gcc dot gnu dot org
@ 2004-08-19 16:58 ` dann at godzilla dot ics dot uci dot edu
  2004-09-27 20:35 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2004-08-19 16:58 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dann at godzilla dot ics dot uci dot edu  2004-08-19 16:58 -------
(In reply to comment #6)

> However, this optimization is partially defeated by the 32-bit calling
> convention in the testcase, so I'm not sure it is really worth while there.

I forgot to say that Sun's compiler does not generate a "save".

BTW, any idea why the generated v8 code: 

        srl     %o0, 16, %o5
        mov     0, %o4
        orcc    %o4, %o5, %g0
        [snip]

is not simplified to:

        srl     %o0, 16, %o5
        orcc    %o5, 0, %g0

Also, the v9 code does not seem to realize that it can do a 16-bit shift of a 
32-bit register instead of a 48-bit shift of a 64-bit register. This might help if
register pressure is high (which is not the case here).
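
The equivalence behind that last remark can be sketched in C. This is only an
illustration with hypothetical names (top16_wide, top16_narrow, check_top16);
hi and lo stand for the upper and lower 32-bit words of arg1.

```c
#include <assert.h>
#include <stdint.h>

/* 48-bit shift of the full 64-bit value, as the v9 code does it. */
uint32_t top16_wide(uint32_t hi, uint32_t lo)
{
  uint64_t arg1 = ((uint64_t)hi << 32) | lo;
  return (uint32_t)(arg1 >> 48);
}

/* 16-bit shift of the upper 32-bit word only; lo is never needed. */
uint32_t top16_narrow(uint32_t hi)
{
  return hi >> 16;
}

/* Small self-check: both forms agree on a few sample values. */
void check_top16(void)
{
  assert(top16_wide(0xabcd1234u, 0xffffffffu) == top16_narrow(0xabcd1234u));
  assert(top16_wide(0x0000ffffu, 0x12345678u) == top16_narrow(0x0000ffffu));
  assert(top16_wide(0xffff0000u, 0u) == 0xffffu);
}
```

Since bits 63..48 of arg1 live entirely in the upper word, the narrow form
does not need the lower word or the recombined 64-bit value at all.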




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (6 preceding siblings ...)
  2004-08-19 16:58 ` dann at godzilla dot ics dot uci dot edu
@ 2004-09-27 20:35 ` pinskia at gcc dot gnu dot org
  2004-09-28  6:26 ` cvs-commit at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-09-27 20:35 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-09-27 20:35 -------
Patch here: <http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02789.html>.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (7 preceding siblings ...)
  2004-09-27 20:35 ` pinskia at gcc dot gnu dot org
@ 2004-09-28  6:26 ` cvs-commit at gcc dot gnu dot org
  2004-09-28  6:34 ` ebotcazou at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2004-09-28  6:26 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From cvs-commit at gcc dot gnu dot org  2004-09-28 06:26 -------
Subject: Bug 16532

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	ebotcazou@gcc.gnu.org	2004-09-28 06:26:08

Modified files:
	gcc            : ChangeLog 
	gcc/config/sparc: sparc-protos.h sparc.c sparc.h sparc.md 

Log message:
	PR target/16532
	* config/sparc/sparc.c (struct machine_function): New field
	'leaf_function_p' and 'prologue_data_valid_p'.
	(sparc_leaf_function_p, sparc_prologue_data_valid_p): New macro
	to conveniently access the above fields.
	(TARGET_LATE_RTL_PROLOGUE_EPILOGUE): Delete.
	(eligible_for_return_delay): Use 'sparc_leaf_function_p' instead
	of the generic flavor 'current_function_uses_only_leaf_regs'.
	(eligible_for_sibcall_delay): Likewise.
	(sparc_expand_prologue): Compute 'sparc_leaf_function_p' and set
	'sparc_prologue_data_valid_p'.  Use 'sparc_leaf_function_p'.
	(sparc_asm_function_prologue): Add sanity check for the assumption
	made in 'sparc_expand_prologue'.  Use 'sparc_leaf_function_p'.
	(sparc_can_use_return_insn_p): New function.
	(sparc_expand_epilogue): Use 'sparc_leaf_function_p'.
	(output_restore): Likewise.
	(output_sibcall): Likewise.
	(sparc_output_mi_thunk): Likewise.
	* config/sparc/sparc-protos.h (sparc_can_use_return_insn_p): Declare.
	* config/sparc/sparc.md (return): New expander.
	
	* config/sparc/sparc.h (INITIAL_ELIMINATION_OFFSET): Minor tweak.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.5651&r2=2.5652
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/sparc/sparc-protos.h.diff?cvsroot=gcc&r1=1.48&r2=1.49
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/sparc/sparc.c.diff?cvsroot=gcc&r1=1.335&r2=1.336
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/sparc/sparc.h.diff?cvsroot=gcc&r1=1.267&r2=1.268
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/sparc/sparc.md.diff?cvsroot=gcc&r1=1.215&r2=1.216



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (8 preceding siblings ...)
  2004-09-28  6:26 ` cvs-commit at gcc dot gnu dot org
@ 2004-09-28  6:34 ` ebotcazou at gcc dot gnu dot org
  2004-09-28  6:57 ` ebotcazou at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-09-28  6:34 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-09-28 06:34 -------
Here's the code generated by mainline as of today at -O2 -mcpu=ultrasparc:

FirstOne:
	save	%sp, -112, %sp
	sethi	%hi(64512), %o5
	mov	0, %o4
	sllx	%i0, 32, %g1
	srl	%i1, 0, %i1
	or	%i1, %g1, %g1
	srlx	%g1, 32, %g1
	srlx	%g1, 32, %i4
	mov	%g1, %i5
	or	%o5, 1023, %o5
	and	%i4, %o4, %i4
	sllx	%i0, 32, %g1
	srl	%i1, 0, %i1
	or	%i1, %g1, %g1
	srlx	%g1, 48, %g1
	srlx	%g1, 32, %i2
	mov	%g1, %i3
	orcc	%i2, %i3, %g0
	be,pt	%icc, .LL2
	 and	%i5, %o5, %i5
	sethi	%hi(first_one), %g1
	or	%g1, %lo(first_one), %g1
	return	%i7+8
	 ldub	[%g1+%o3], %o0
.LL2:
	orcc	%i4, %i5, %g0
	be,pt	%icc, .LL5
	 sethi	%hi(64512), %g1
	sethi	%hi(first_one), %g1
	or	%g1, %lo(first_one), %g1
	ldub	[%g1+%i5], %i5
	return	%i7+8
	 add	%o5, 16, %o0
.LL5:
	or	%g1, 1023, %g1
	and	%i1, %g1, %i3
	sllx	%i0, 32, %g1
	srl	%i1, 0, %i1
	or	%i1, %g1, %g1
	srlx	%g1, 16, %g1
	srlx	%g1, 32, %i4
	mov	%g1, %i5
	and	%i4, %o4, %i0
	and	%i5, %o5, %i1
	orcc	%i0, %i1, %g0
	be,pt	%icc, .LL7
	 sethi	%hi(first_one), %g1
	or	%g1, %lo(first_one), %g1
	ldub	[%g1+%i1], %i5
	return	%i7+8
	 add	%o5, 32, %o0
.LL7:
	or	%g1, %lo(first_one), %g1
	ldub	[%g1+%i3], %i5
	return	%i7+8
	 add	%o5, 48, %o0


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (9 preceding siblings ...)
  2004-09-28  6:34 ` ebotcazou at gcc dot gnu dot org
@ 2004-09-28  6:57 ` ebotcazou at gcc dot gnu dot org
  2004-09-28  7:00 ` ebotcazou at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-09-28  6:57 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-09-28 06:57 -------
It appears that we also made some progress for the second testcase.  We now
generate at -O2 -mcpu=ultrasparc:

FirstOne:
        add     %sp, -120, %sp
        std     %o0, [%sp+96]
        sllx    %o0, 32, %o0
        srl     %o1, 0, %o1
        or      %o1, %o0, %o0
        srlx    %o0, 48, %o1
        srlx    %o1, 32, %o0
        orcc    %o0, %o1, %g0
        be,pt   %icc, .LL2
         nop
        sethi   %hi(first_one), %g1
        sub     %sp, -120, %sp
        or      %g1, %lo(first_one), %g1
        jmp     %o7+8
         ldub   [%g1+%o1], %o0

so the 'save' instruction is not emitted anymore.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (10 preceding siblings ...)
  2004-09-28  6:57 ` ebotcazou at gcc dot gnu dot org
@ 2004-09-28  7:00 ` ebotcazou at gcc dot gnu dot org
  2004-09-28  7:14 ` ebotcazou at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-09-28  7:00 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-09-28 07:00 -------
Oops... I forgot the last chunk:

FirstOne:
        add     %sp, -120, %sp
        std     %o0, [%sp+96]
        sllx    %o0, 32, %o0
        srl     %o1, 0, %o1
        or      %o1, %o0, %o0
        srlx    %o0, 48, %o1
        srlx    %o1, 32, %o0
        orcc    %o0, %o1, %g0
        be,pt   %icc, .LL2
         nop
        sethi   %hi(first_one), %g1
        sub     %sp, -120, %sp
        or      %g1, %lo(first_one), %g1
        jmp     %o7+8
         ldub   [%g1+%o1], %o0
.LL2:
        ld      [%sp+96], %o0
        jmp     %o7+8
         sub    %sp, -120, %sp


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (11 preceding siblings ...)
  2004-09-28  7:00 ` ebotcazou at gcc dot gnu dot org
@ 2004-09-28  7:14 ` ebotcazou at gcc dot gnu dot org
  2004-09-28  7:17 ` ebotcazou at gcc dot gnu dot org
  2004-09-28  7:18 ` ebotcazou at gcc dot gnu dot org
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-09-28  7:14 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-09-28 07:14 -------
The remaining oddities, like

        srl     %o0, 16, %o5
        mov     0, %o4
        orcc    %o4, %o5, %g0

in the V8 code or

        srlx    %o0, 48, %o1
        srlx    %o1, 32, %o0
        orcc    %o0, %o1, %g0

in the V9 code are related to the suboptimal model used for 'long long'
arithmetic on SPARC 32-bit.  The code is better on SPARC 64-bit:

FirstOne:
        add     %sp, -208, %sp
        stx     %o0, [%sp+2231]
        srlx    %o0, 48, %o0
        brz,pt  %o0, .LL2
         sethi  %lm(first_one), %g4
        sethi   %hh(first_one), %g1
        or      %g1, %hm(first_one), %g1
        sub     %sp, -208, %sp
        sllx    %g1, 32, %g1
        add     %g1, %g4, %g1
        or      %g1, %lo(first_one), %g1
        ldub    [%g1+%o0], %o0
        jmp     %o7+8
         sra    %o0, 0, %o0
.LL2:
        lduw    [%sp+2231], %o0
        sub     %sp, -208, %sp
        jmp     %o7+8
         sra    %o0, 0, %o0
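
To make the 32-bit oddities a bit more concrete, here is a hedged C model of
the double-word representation. It is a sketch of the idea only, not of GCC's
actual internals; struct dword, dword_shr48 and dword_is_nonzero are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 'long long' held as two separate 32-bit words. */
struct dword {
  uint32_t hi;
  uint32_t lo;
};

/* Logical right shift by 48: the upper word of the result is always 0. */
struct dword dword_shr48(struct dword x)
{
  struct dword r;
  r.hi = 0;           /* corresponds to "mov 0, %o4" in the V8 code */
  r.lo = x.hi >> 16;  /* corresponds to "srl %o0, 16, %o5" */
  return r;
}

/* Zero test on the pair: OR both words together, as orcc does. */
int dword_is_nonzero(struct dword x)
{
  return (x.hi | x.lo) != 0;
}
```

Because the upper word of the shifted value is known to be zero, ORing it in
is redundant; the V8 sequence quoted above still performs that OR, which is
the simplification asked about in comment #7.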


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (12 preceding siblings ...)
  2004-09-28  7:14 ` ebotcazou at gcc dot gnu dot org
@ 2004-09-28  7:17 ` ebotcazou at gcc dot gnu dot org
  2004-09-28  7:18 ` ebotcazou at gcc dot gnu dot org
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-09-28  7:17 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ebotcazou at gcc dot gnu dot org  2004-09-28 07:17 -------
Patch applied (see comment #8).


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532



* [Bug target/16532] Inefficient jump to epilogue
  2004-07-13 21:54 [Bug rtl-optimization/16532] New: duplicate function epilogue on sparc dann at godzilla dot ics dot uci dot edu
                   ` (13 preceding siblings ...)
  2004-09-28  7:17 ` ebotcazou at gcc dot gnu dot org
@ 2004-09-28  7:18 ` ebotcazou at gcc dot gnu dot org
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2004-09-28  7:18 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16532


