public inbox for gcc@gcc.gnu.org
From: Michel LESPINASSE <walken@zoy.org>
To: Roger Sayle <roger@eyesopen.com>
Cc: gcc@gcc.gnu.org, Richard Henderson <rth@redhat.com>,
	Jan Hubicka <jh@suse.cz>
Subject: Re: GCC performance regression - its memset!
Date: Mon, 22 Apr 2002 23:30:00 -0000
Message-ID: <20020423060709.GA21922@zoy.org>
In-Reply-To: <Pine.LNX.4.33.0204222307450.2893-100000@www.eyesopen.com>

On Mon, Apr 22, 2002 at 11:13:09PM -0600, Roger Sayle wrote:
> 
> I think its one of Jan's changes.  I can reproduce the problem, and
> fix it using "-minline-all-stringops" which forces 3.1 to inline the
> memset on i686.  I was concerned that it was a middle-end bug with
> builtins, but it now appears to be an ia32 back-end issue.
> 
> Michel, does "-minline-all-stringops" fix the problem for you?

This option actually generates invalid code for me. Here is a test case:

------------------- cut here -----------------
#include <string.h>

short table[64];

int main (void)
{
    int i;

    for (i = 0; i < 64; i++)
        table[i] = 1234;

    memset (table, 0, 63 * sizeof(short));

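    /* memset covers table[0]..table[62]; table[63] is untouched by design,
       so the last element that must be zero is table[62]. */
    return (table[62] != 0);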
}
------------------- cut here -----------------

This code should return 0, but it returns 1 when compiled with
-O3 -minline-all-stringops.

Here is an extract from the generated asm (the memset part of it):
        movl    $table, %edi
        testl   $1, %edi       <- test 1-byte alignment (hmm, isn't table
                                  already two-byte aligned, being a short?)
        movl    $126, %eax     <- we want to clear 126 bytes
        je      .L7
        movb    $0, table
        movl    $table+1, %edi <- now edi is guaranteed two-byte-aligned
        movl    $125, %eax
.L7:
        testl   $2, %edi       <- test 4-byte alignment
        je      .L8
        movw    $0, (%edi)
        subl    $2, %eax       <- now edi is guaranteed four-byte-aligned
        addl    $2, %edi
.L8:
        cld
        movl    %eax, %ecx
        xorl    %eax, %eax
        shrl    $2, %ecx       <- number of 4-byte words remaining
        rep
        stosl
        testl   $2, %edi       <- oops, this was meant to test the remaining
                                  byte count, not the address! edi is 4-byte
                                  aligned here, so this store is never reached.
        je      .L9
        movw    $0, (%edi)
        addl    $2, %edi
.L9:
        testl   $1, %edi       <- that one too.
        je      .L10
        movb    $0, (%edi)
.L10:
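
As a rough sketch of what the annotations above describe, here is a C model
of the emitted sequence. The function name model_clear and the
tail_tests_count switch are hypothetical, introduced only for illustration:
with the switch set to 0 the two tail tests look at the address, as in the
asm above; with 1 they look at the remaining byte count, which is what the
tail presumably intended. Testing the low bits of a pointer through
uintptr_t is an assumption that holds on common flat-address targets such
as ia32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the inlined memset(dst, 0, n) shown above. */
static void model_clear(unsigned char *dst, size_t n, int tail_tests_count)
{
    assert(n >= 4);  /* the prologue may consume up to 3 bytes */

    /* Prologue (before .L8): bring dst up to 4-byte alignment. */
    if ((uintptr_t)dst & 1) { *dst = 0; dst += 1; n -= 1; }
    if ((uintptr_t)dst & 2) { memset(dst, 0, 2); dst += 2; n -= 2; }

    /* rep stosl: store n / 4 zero dwords. */
    size_t dwords = n >> 2;
    memset(dst, 0, dwords * 4);
    dst += dwords * 4;

    /* Tail: 0 to 3 bytes are still pending.  The asm tests the address,
       which is 4-byte aligned at this point, so neither store runs. */
    size_t key = tail_tests_count ? n : (size_t)(uintptr_t)dst;
    if (key & 2) { memset(dst, 0, 2); dst += 2; }
    if (key & 1) { *dst = 0; }
}
```

Called on a 4-byte-aligned short[64] with n = 126, the address-testing
variant clears table[0]..table[61] and leaves table[62] alone, while the
count-testing variant also clears table[62].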


GCC 2.95 generated simpler code:
        movl $table,%edi
        xorl %eax,%eax
        cld
        movl $31,%ecx
        rep
        stosl
        stosw

This did not handle alignment at all, but it was simpler and
actually faster on my Athlon.
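
The constants in the 2.95 sequence follow directly from the byte count. A
minimal sketch of that split, assuming a compile-time constant size; the
struct and function names are hypothetical, and the trailing stosb for odd
counts is an assumption, since the listing above only shows the stosw case:

```c
#include <stddef.h>

/* Hypothetical helper: how a constant byte count n maps onto the stores. */
struct split {
    size_t dwords;  /* value loaded into ecx for rep stosl */
    int word;       /* 1 if one trailing stosw is needed */
    int byte;       /* 1 if one trailing stosb is needed (assumed) */
};

static struct split split_count(size_t n)
{
    struct split s;
    s.dwords = n >> 2;
    s.word = (n & 2) != 0;
    s.byte = (n & 1) != 0;
    return s;
}
```

For n = 126 this gives 31 dwords plus one word, matching the
"movl $31,%ecx" and the single stosw in the listing.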

Hope this helps,

-- 
Michel "Walken" LESPINASSE
Is this the best that god can do ? Then I'm not impressed.
