public inbox for gcc-bugs@sourceware.org
* [Bug target/58623] New: lack of ldp/stp optimization
@ 2013-10-04 19:19 b.grayson at samsung dot com
  2013-10-28  0:28 ` [Bug target/58623] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: b.grayson at samsung dot com @ 2013-10-04 19:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

            Bug ID: 58623
           Summary: lack of ldp/stp optimization
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: b.grayson at samsung dot com
            Target: AArch64
             Build: 20130602

The following C code:

long long a, b;
int c, d;

int foo() { return a+b; }
int bar() { return c+d; }

generates this assembly code under -O3 -fsection-anchors -fno-common:

foo:
        adrp    x1, .LANCHOR0
        add     x1, x1, :lo12:.LANCHOR0
        ldr     x2, [x1]
        ldr     x0, [x1,8]
        add     w0, w2, w0
        ret
bar:
        adrp    x1, .LANCHOR0
        add     x1, x1, :lo12:.LANCHOR0
        ldr     w2, [x1,16]
        ldr     w0, [x1,20]
        add     w0, w2, w0
        ret

Note that in foo(), the ldr x2 and ldr x0 (64-bit loads from adjacent
addresses) could have been merged into a single ldp.  Similarly, in bar(), the
ldr w2 and ldr w0 (32-bit loads) could have been merged into a single ldp.

The same optimization applies to stores as well.
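
For illustration, here is a hypothetical store counterpart of the testcase
above.  The function names set_ab and set_cd are made up for this sketch and
are not part of the original report.  Each function writes two adjacent
globals of the same size, so in principle each pair of str instructions is a
candidate for a single stp, assuming the same flags and the same .LANCHOR0
layout as in the load case:

```c
long long a, b;
int c, d;

/* Two adjacent 64-bit stores: a candidate for one 64-bit stp. */
void set_ab(long long x, long long y) { a = x; b = y; }

/* Two adjacent 32-bit stores: a candidate for one 32-bit stp. */
void set_cd(int x, int y) { c = x; d = y; }
```

Whether a given compiler version actually emits stp here is not claimed; the
sketch only shows the source pattern in question.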

I am not sure if this would be handled by the proposed (but apparently not
accepted) patch from March 2013:

http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01051.html



* [Bug target/58623] lack of ldp/stp optimization
  2013-10-04 19:19 [Bug target/58623] New: lack of ldp/stp optimization b.grayson at samsung dot com
@ 2013-10-28  0:28 ` pinskia at gcc dot gnu.org
  2014-07-24 11:05 ` ramana at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-10-28  0:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-10-28
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.



* [Bug target/58623] lack of ldp/stp optimization
  2013-10-04 19:19 [Bug target/58623] New: lack of ldp/stp optimization b.grayson at samsung dot com
  2013-10-28  0:28 ` [Bug target/58623] " pinskia at gcc dot gnu.org
@ 2014-07-24 11:05 ` ramana at gcc dot gnu.org
  2014-11-19  2:00 ` amker.cheng at gmail dot com
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: ramana at gcc dot gnu.org @ 2014-07-24 11:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|AArch64                     |aarch64-*
             Status|NEW                         |ASSIGNED
                 CC|                            |ramana at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org      |amker at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
Bin Cheng has been working on this, specifically on a new pass to deal with
it.



* [Bug target/58623] lack of ldp/stp optimization
  2013-10-04 19:19 [Bug target/58623] New: lack of ldp/stp optimization b.grayson at samsung dot com
  2013-10-28  0:28 ` [Bug target/58623] " pinskia at gcc dot gnu.org
  2014-07-24 11:05 ` ramana at gcc dot gnu.org
@ 2014-11-19  2:00 ` amker.cheng at gmail dot com
  2014-12-11 12:10 ` ramana at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: amker.cheng at gmail dot com @ 2014-11-19  2:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

bin.cheng <amker.cheng at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker.cheng at gmail dot com

--- Comment #3 from bin.cheng <amker.cheng at gmail dot com> ---
Patch sent at https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02209.html
On the latest trunk, the patch generates the assembly below for the example:

    .cpu generic+fp+simd
    .file    "pr58623.c"
    .text
    .align    2
    .global    foo
    .type    foo, %function
foo:
    adrp    x0, .LANCHOR0
    add    x2, x0, :lo12:.LANCHOR0
    ldr    x1, [x0, #:lo12:.LANCHOR0]
    ldr    x0, [x2, 8]
    add    w0, w1, w0
    ret
    .size    foo, .-foo
    .align    2
    .global    bar
    .type    bar, %function
bar:
    adrp    x1, .LANCHOR0
    add    x1, x1, :lo12:.LANCHOR0
    ldp    w2, w0, [x1, 16]
    add    w0, w2, w0
    ret
    .size    bar, .-bar
    .global    d
    .global    c
    .global    b
    .global    a
    .bss
    .align    3
.LANCHOR0 = . + 0
    .type    a, %object
    .size    a, 8
a:
    .zero    8
    .type    b, %object
    .size    b, 8
b:
    .zero    8
    .type    c, %object
    .size    c, 4
c:
    .zero    4
    .type    d, %object
    .size    d, 4
d:
    .zero    4
    .ident    "GCC: (GNU) 5.0.0 20141118 (experimental)"

The ldp opportunity in bar is captured, but not the one in foo.  Apparently, the
fwprop pass propagates the address expression into the first memory reference,
so the two loads in foo no longer share a base register and the pair
opportunity is lost.  This has been a known issue for a long time.
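
To make the failure in foo concrete, below is a deliberately simplified model
of the pairing condition.  It is a sketch, not the logic of the actual patch:
the struct mem_ref and the function can_pair are hypothetical names.  It only
assumes that two loads can share one ldp when they use the same base register,
have the same size, are adjacent in ascending order, and the first offset fits
the signed 7-bit scaled immediate of ldp.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a load: base register name, constant byte
   offset from that base, and access size in bytes (4 or 8). */
struct mem_ref {
    const char *base;
    long offset;
    int size;
};

/* Returns true if, in this simplified model, the two loads could be
   merged into one ldp. */
static bool can_pair(struct mem_ref first, struct mem_ref second)
{
    if (strcmp(first.base, second.base) != 0)
        return false;               /* different base registers */
    if (first.size != second.size)
        return false;               /* mixed access sizes */
    if (second.offset != first.offset + first.size)
        return false;               /* not adjacent, ascending */
    if (first.offset % first.size != 0)
        return false;               /* offset must be a multiple of size */
    long scaled = first.offset / first.size;
    return scaled >= -64 && scaled <= 63;   /* signed 7-bit immediate */
}
```

With this model, the loads in bar ([x1, 16] and [x1, 20], 4 bytes each) pair,
while the loads in foo do not, because after fwprop the first uses x0 as its
base and the second uses x2.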



* [Bug target/58623] lack of ldp/stp optimization
  2013-10-04 19:19 [Bug target/58623] New: lack of ldp/stp optimization b.grayson at samsung dot com
                   ` (2 preceding siblings ...)
  2014-11-19  2:00 ` amker.cheng at gmail dot com
@ 2014-12-11 12:10 ` ramana at gcc dot gnu.org
  2014-12-12  3:32 ` amker at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: ramana at gcc dot gnu.org @ 2014-12-11 12:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

--- Comment #4 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to bin.cheng from comment #3)
> ldp opportunity in bar is captured, but not the one in foo.  Apparently,
> fwprop pass propagates the expression into memory reference, corrupting the
> pair opportunity.  This is another known issue for long time.

So I think we should take the fwprop issue as a separate bug and close this out
for 5.0 as the main work required to generate ldp / stp in the compiler is now
done.


* [Bug target/58623] lack of ldp/stp optimization
  2013-10-04 19:19 [Bug target/58623] New: lack of ldp/stp optimization b.grayson at samsung dot com
                   ` (3 preceding siblings ...)
  2014-12-11 12:10 ` ramana at gcc dot gnu.org
@ 2014-12-12  3:32 ` amker at gcc dot gnu.org
  2014-12-15 21:35 ` e.menezes at samsung dot com
  2014-12-18  3:02 ` amker at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: amker at gcc dot gnu.org @ 2014-12-12  3:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from amker at gcc dot gnu.org ---
The ChangeLog entry omitted this PR number.  For the record, the patch was
committed at:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218430

Closed as suggested.



* [Bug target/58623] lack of ldp/stp optimization
  2013-10-04 19:19 [Bug target/58623] New: lack of ldp/stp optimization b.grayson at samsung dot com
                   ` (4 preceding siblings ...)
  2014-12-12  3:32 ` amker at gcc dot gnu.org
@ 2014-12-15 21:35 ` e.menezes at samsung dot com
  2014-12-18  3:02 ` amker at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: e.menezes at samsung dot com @ 2014-12-15 21:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

--- Comment #6 from Evandro <e.menezes at samsung dot com> ---
What's the PR of the fwprop issue?

Thank you.



* [Bug target/58623] lack of ldp/stp optimization
  2013-10-04 19:19 [Bug target/58623] New: lack of ldp/stp optimization b.grayson at samsung dot com
                   ` (5 preceding siblings ...)
  2014-12-15 21:35 ` e.menezes at samsung dot com
@ 2014-12-18  3:02 ` amker at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: amker at gcc dot gnu.org @ 2014-12-18  3:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58623

--- Comment #7 from amker at gcc dot gnu.org ---
Hi Evandro,
There is no specific PR for this issue.  But as we know, fwprop often corrupts
optimizations on address expressions.  Take the example below:

   add rb, r1, r2
   ldr rx, [rb]
   add rb, rb, #4

fwprop transforms it into:

   add rb, r1, r2
   ldr rx, [r1, r2]
   add rb, rb, #4

This destroys the post-increment opportunity.  Though it takes a different
form, it is actually the same issue as in the ldp/stp case.

I think https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44883 describes the
problem in some manner, and there might be other PRs about it too.
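
As a rough illustration, C source of the following shape computes an address,
loads through it, and then advances it, which is conceptually the
add/ldr/add sequence above.  The function name load_and_advance is
hypothetical, and the mapping to instructions in the comments is only
conceptual (a real compiler would scale the index, and the exact output
depends on target and version):

```c
#include <stddef.h>

/* Hypothetical example: load the element at base[idx] and hand back
   a pointer to the following element. */
int load_and_advance(const int *base, ptrdiff_t idx, const int **next)
{
    const int *p = base + idx;  /* conceptually: add rb, r1, r2 */
    int value = *p;             /* conceptually: ldr rx, [rb]   */
    *next = p + 1;              /* conceptually: add rb, rb, #4 */
    return value;
}
```

If the load keeps using rb as its base, the load and the increment can be
combined into one post-increment load; once the load is rewritten to use
[r1, r2], that combination is no longer available.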


