From: "enkovich.gnu at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/64691] New: Suboptimal register allocation for bytes comparison on i386
Date: Tue, 20 Jan 2015 15:26:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64691

            Bug ID: 64691
           Summary: Suboptimal register allocation for bytes comparison on i386
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: enkovich.gnu at gmail dot com

This problem was found in the 256.bzip2 benchmark compiled by GCC 5.0 at -O2.
It contains a small loop with a byte comparison that turned out to be
inefficient because the compared values were not allocated to registers that
allow byte access. That caused additional copies and, as a result, a
significant slowdown of the loop. The situation can be reproduced on a small
test if we restrict register usage.
>cat test.c
void test (unsigned char *p, unsigned char val)
{
  unsigned char tmp1, tmp2;
  int i;

  i = 0;
  tmp1 = p[0];
  while (val != tmp1)
    {
      i++;
      tmp2 = tmp1;
      tmp1 = p[i];
      p[i] = tmp2;
    }
  p[0] = tmp1;
}
>gcc -O2 -m32 -ffixed-ebx test.c -S

Here is the loop:

.L3:
        movzbl  (%eax), %ebp
        movl    %esi, %ecx
        movb    %dl, (%eax)
        addl    $1, %eax
        movl    %ebp, %edx
        cmpb    %dl, %cl
        jne     .L3

There is an extra register copy esi->ecx whose only purpose is to perform the
comparison. In 32-bit mode only eax, ebx, ecx and edx have byte subregisters,
so a value held in esi has to be copied before cmpb can use it. I suppose the
easiest way to get better register allocation here would be to transform the
QI comparison into an SI one to relax the register constraints.
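
To illustrate what comparing in SI instead of QI would mean, here is a minimal source-level sketch. It is not the compiler transformation proposed above, only a hand-written analogue: test_wide and same_result are hypothetical names that do not appear in the report, and nothing here guarantees which instructions or registers GCC will actually pick. The variant keeps the compared values zero-extended in unsigned int, so the loop condition is a 32-bit comparison that does not by itself need a byte-addressable register.

```c
#include <string.h>

/* Test case from the report: the compared values are bytes (QI). */
void test(unsigned char *p, unsigned char val)
{
    unsigned char tmp1, tmp2;
    int i;

    i = 0;
    tmp1 = p[0];
    while (val != tmp1) {
        i++;
        tmp2 = tmp1;
        tmp1 = p[i];
        p[i] = tmp2;
    }
    p[0] = tmp1;
}

/* Hypothetical variant (not from the report): same algorithm, but the
   compared values are kept zero-extended in unsigned int (SI). */
void test_wide(unsigned char *p, unsigned char val)
{
    unsigned int wval = val;   /* zero-extended once, outside the loop */
    unsigned int tmp1, tmp2;
    int i;

    i = 0;
    tmp1 = p[0];
    while (wval != tmp1) {     /* 32-bit comparison */
        i++;
        tmp2 = tmp1;
        tmp1 = p[i];           /* byte load, zero-extended */
        p[i] = (unsigned char) tmp2;
    }
    p[0] = (unsigned char) tmp1;
}

/* Hypothetical helper: returns 1 if both versions leave the buffer in the
   same state. The caller must make sure val occurs in input[0..n-1] and
   n <= 256, otherwise the loops run past the end of the buffer. */
int same_result(const unsigned char *input, size_t n, unsigned char val)
{
    unsigned char a[256], b[256];

    memcpy(a, input, n);
    memcpy(b, input, n);
    test(a, val);
    test_wide(b, val);
    return memcmp(a, b, n) == 0;
}
```

Both functions can be compiled with the command line from the report (gcc -O2 -m32 -ffixed-ebx test.c -S) to compare the generated loops. The stores to p[i] are still byte stores, so this sketch only relaxes the constraint on the comparison itself.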