public inbox for gcc-patches@gcc.gnu.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>,
	gcc-patches@gcc.gnu.org, ubizjak@gmail.com
Subject: Re: PATCH: PR target/40838: gcc shouldn't assume that the stack is aligned
Date: Mon, 24 Aug 2009 17:39:00 -0000
Message-ID: <6dc9ffc80908240900l73d3c97fo2c31fbd0142e75d2@mail.gmail.com>
In-Reply-To: <6dc9ffc80908071530x7d4a3965u8021df66a142a0bf@mail.gmail.com>

On Fri, Aug 7, 2009 at 3:30 PM, H.J. Lu<hjl.tools@gmail.com> wrote:
> On Fri, Aug 7, 2009 at 5:53 AM, H.J. Lu<hjl.tools@gmail.com> wrote:
>> On Fri, Aug 7, 2009 at 12:13 AM, Jakub Jelinek<jakub@redhat.com> wrote:
>>> On Fri, Aug 07, 2009 at 02:54:46AM +0200, Mikulas Patocka wrote:
>>>> > > In 32bit, the incoming stack may not be 16 byte aligned.  This patch
>>>> > > assumes the incoming stack is 4 byte aligned and realigns stack if any
>>>> > > SSE variable is put on stack. Any comments?
>>>> >
>>>> > IMHO this is wrong, I could live with a non-default option for those who
>>>> > don't care about performance and think a SCO document from 1996 has any
>>>> > relevance to Linux these days.  In reality a Linux ABI for years assumes
>>>> > 16 byte stack alignment for 32-bit code.
>>>>
>>>> Tell me which Linux distribution did you run with 16-byte stack alignment
>>>> checking (as proposed in bug 40838) and what was the result?
>>>>
>>>> For me, the result was that 75% of binaries in /bin in Debian Lenny do not
>>>> align the stack on 16-byte boundary.
>>>
>>> Besides the obstack glibc bug which has been fixed since then you haven't
>>> reported anything particular.  It is true that parts of i?86 glibc are
>>> compiled with -mpreferred-stack-boundary=2, but only parts that don't call
>>> callbacks.  Async signals AFAIK will align the stack properly.
>>>
>>> I simply don't trust your 75% claim, lots of stuff would break if things
>>> weren't aligned properly.
>>>
>>
>> From gcc 3.4:
>>
>>  /* Validate -mpreferred-stack-boundary= value, or provide default.
>>     The default of 128 bits is for Pentium III's SSE __m128, but we
>>     don't want additional code to keep the stack aligned when
>>     optimizing for code size.  */
>>  ix86_preferred_stack_boundary = (optimize_size
>>                                   ? TARGET_64BIT ? 128 : 32
>>                                   : 128);
>>
>> If you compile code with -Os, you will get 4 byte stack alignment.
>> To step back: we changed the stack alignment from 4 bytes to 16 bytes
>> for SSE because we couldn't realign the stack at the time. Now we can
>> realign the stack very efficiently. I think we should do it for SSE
>> to support the existing Linux binaries which have 4 byte stack
>> alignment. If it helps, I can compare -m32 -O3 -msse2 -mfpmath=sse
>> results with SPEC CPU 2006, before and after my patch.
>>
>
> Here are the differences with -m32 -O3 -msse2 -mfpmath=sse -ffast-math
> -funroll-loops, before and after my patch:
>
> 400.perlbench                    -0.384615%
> 401.bzip2                        0%
> 403.gcc                          -0.362319%
> 429.mcf                          -0.813008%
> 445.gobmk                        0.921659%
> 456.hmmer                        0.549451%
> 458.sjeng                        -0.438596%
> 462.libquantum                   0%
> 464.h264ref                      0%
> 471.omnetpp                      -0.478469%
> 473.astar                        -0.645161%
> 483.xalancbmk                    -0.727273%
> SPECint(R)_base2006              -0.411523%
> 410.bwaves                       -0.406504%
> 416.gamess                       0%
> 433.milc                         -1.36986%
> 434.zeusmp                       -0.44843%
> 435.gromacs                      0%
> 436.cactusADM                    0%
> 437.leslie3d                     -0.888889%
> 444.namd                         1.20482%
> 447.dealII                       -0.350877%
> 450.soplex                       -0.31746%
> 453.povray                       0.458716%
> 454.calculix                     0%
> 459.GemsFDTD                     0%
> 465.tonto                        0%
> 470.lbm                          0%
> 481.wrf                          0.480769%
> 482.sphinx3                      0.940439%
> SPECfp(R)_base2006               0%
>
> I think we should align stack if SSE variables are put on stack.
>

The Darwin ia32 psABI specifies 16-byte stack alignment, and the
Darwin port enforces it with

#define PREFERRED_STACK_BOUNDARY                        \
  MAX (STACK_BOUNDARY, ix86_preferred_stack_boundary)
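
As a rough illustration of what this macro does (a sketch, not the GCC
source): boundaries are expressed in bits, and MAX clamps whatever
-mpreferred-stack-boundary requested up to STACK_BOUNDARY. The value
128 for Darwin's STACK_BOUNDARY and all helper names below are
assumptions made for the example.

```c
/* Assumed for illustration: Darwin's STACK_BOUNDARY, in bits. */
#define DARWIN_STACK_BOUNDARY 128

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* -mpreferred-stack-boundary=N asks for 2^N bytes; convert to bits. */
int boundary_bits_from_option(int n)
{
    return (1 << n) * 8;
}

/* Hypothetical helper mirroring the gcc 3.4 default quoted above:
   32 bits only for 32-bit code optimized for size, otherwise 128. */
int default_preferred_boundary(int optimize_size, int target_64bit)
{
    return optimize_size ? (target_64bit ? 128 : 32) : 128;
}

/* Hypothetical helper: Darwin-style clamp, never below STACK_BOUNDARY. */
int darwin_preferred_boundary(int requested_bits)
{
    return MAX(DARWIN_STACK_BOUNDARY, requested_bits);
}
```

Under these assumptions, a request for
-mpreferred-stack-boundary=2 (32 bits) is still raised to 128 bits by
the Darwin-style clamp, whereas the plain gcc 3.4 default for 32-bit
-Os code stays at 32 bits.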

On other ia32 targets, 4-byte outgoing stack alignment is
correct and allowed. My patch assumes 4-byte incoming
stack alignment only when SSE variables are put on the stack.
The automatic stack realignment implementation is quite efficient,
and its performance impact is very limited, as shown by the
SPEC CPU 2006 results above. It also fixes a regression:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41156
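
To make the realignment idea concrete, here is a minimal sketch (not
taken from the patch) of the arithmetic involved: if the incoming
stack pointer is only guaranteed 4-byte alignment and a local needs
16 bytes, the prologue rounds the stack pointer down to a multiple of
16. The function names are hypothetical, and the stack pointer is
modeled as a plain integer.

```c
#include <stdint.h>

/* Round a downward-growing stack pointer down to a power-of-two
   alignment given in bytes. */
uintptr_t realign_stack_down(uintptr_t sp, uintptr_t align_bytes)
{
    return sp & ~(align_bytes - 1);
}

/* Realignment is needed only when a local requires more alignment
   than the incoming stack is assumed to guarantee. */
int needs_realignment(unsigned incoming_align_bytes,
                      unsigned required_align_bytes)
{
    return required_align_bytes > incoming_align_bytes;
}
```

For example, realign_stack_down(0xbffff4ac, 16) yields 0xbffff4a0,
while a function with no SSE locals on the stack (required alignment
4) would skip the step entirely.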

OK for trunk?

Thanks.

-- 
H.J.
