From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 6508 invoked by alias); 17 Nov 2005 15:09:20 -0000
Received: (qmail 6484 invoked by uid 48); 17 Nov 2005 15:09:18 -0000
Date: Thu, 17 Nov 2005 15:09:00 -0000
Message-ID: <20051117150918.6483.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug target/19923] [4.0/4.1 Regression] openssl is slower when compiled with gcc 4.0 than 3.3
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "rakdver at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2005-11/txt/msg02446.txt.bz2
List-Id:

------- Comment #35 from rakdver at gcc dot gnu dot org  2005-11-17 15:09 -------
Created an attachment (id=10263)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10263&action=view)
Patch

After some playing with fold, I arrived at the following patch, which almost
works. With the patch, the code for the loop is

:;
  MEM[base: ptr]{*ptr} = cleanse_ctr;
  ptr = ptr + 1B;
  cleanse_ctr = (unsigned char) (((signed char) ptr & 15) + (signed char) cleanse_ctr + 17);
  len = len - 1;
  if (len != 0) goto ; else goto ;

Which seems just fine. The assembler is

.L3:
        movb    (%edi), %al
        movb    %al, (%ecx)
        incl    %ecx
        movb    %cl, %al
        andl    $15, %eax
        movb    (%edi), %dl
        addl    $17, %edx
        addl    %edx, %eax
        movb    %al, (%edi)
        decl    %esi
        jne     .L3

Which also seems OK to me. However, the "ugly" version we produce without
the patch:

.L4:
        movb    (%edi), %al
        movb    %al, (%ecx)
        incl    %ecx
        movb    -16(%ebp), %al
        addl    %esi, %eax
        andl    $15, %eax
        movb    (%edi), %dl
        addl    $17, %edx
        addl    %edx, %eax
        movb    %al, (%edi)
        incl    %esi
        cmpl    12(%ebp), %esi
        jne     .L4

is faster by 30%, for reasons I just don't understand :-(


-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19923
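
For readers following along, here is a rough C sketch of the two loop shapes compared above. It is reconstructed only from the GIMPLE dump and my reading of the two assembler listings, not taken from the attached patch or from the openssl sources. The function names, the variants_agree helper, and the guess that the -16(%ebp) slot holds the low byte of the pointer one past the buffer start are all assumptions made for illustration. The only point is that both shapes write the same bytes, so the 30% difference would have to come from how the loop is scheduled rather than from what it computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the global counter that the dump calls cleanse_ctr. */
static unsigned char cleanse_ctr = 0;

/* Count-down shape, following the GIMPLE shown with the patch applied:
   the low bits are taken from the advancing pointer itself. */
static void cleanse_countdown(unsigned char *ptr, size_t len)
{
    assert(ptr != NULL);
    while (len != 0) {
        *ptr = cleanse_ctr;
        ptr = ptr + 1;
        cleanse_ctr = (unsigned char)(((uintptr_t)ptr & 15) + cleanse_ctr + 17);
        len = len - 1;
    }
}

/* Count-up shape, an assumed reading of the unpatched assembler: a separate
   induction variable i is added to a saved low byte of the pointer
   (assumed here to be the low byte of ptr + 1) and compared against len. */
static void cleanse_countup(unsigned char *ptr, size_t len)
{
    assert(ptr != NULL);
    unsigned char base_lo = (unsigned char)(uintptr_t)(ptr + 1);
    for (size_t i = 0; i != len; i++) {
        ptr[i] = cleanse_ctr;
        cleanse_ctr = (unsigned char)(((unsigned char)(base_lo + i) & 15)
                                      + cleanse_ctr + 17);
    }
}

/* Hypothetical helper: runs both shapes on the same buffer (the result
   depends on the buffer address) and returns 1 if the bytes written and
   the final counter value match. scratch must hold at least len bytes. */
static int variants_agree(unsigned char *buf, unsigned char *scratch, size_t len)
{
    cleanse_ctr = 0;
    cleanse_countdown(buf, len);
    unsigned char ctr_after_countdown = cleanse_ctr;
    memcpy(scratch, buf, len);

    cleanse_ctr = 0;
    cleanse_countup(buf, len);
    return ctr_after_countdown == cleanse_ctr && memcmp(scratch, buf, len) == 0;
}
```

Calling variants_agree on a buffer at a few different offsets and lengths is enough to convince oneself that the two shapes are interchangeable. Timing them would need the real compiler output, which this sketch does not try to reproduce.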