From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 17279 invoked by alias); 2 Sep 2013 08:39:25 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 17234 invoked by uid 48); 2 Sep 2013 08:39:21 -0000
From: "uranus at tinlans dot org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/58295] New: The combination pass doesn't eliminates some extra zero extensions
Date: Mon, 02 Sep 2013 08:39:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 4.9.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: uranus at tinlans dot org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2013-09/txt/msg00037.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58295

            Bug ID: 58295
           Summary: The combination pass doesn't eliminates some extra
                    zero extensions
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: uranus at tinlans dot org

$ cat test.c
extern char zeb_test_array[10];
unsigned char ee_isdigit2(unsigned int i)
{
  unsigned char c = zeb_test_array[i];
  unsigned char retval;
  retval = ((c>='0') & (c<='9')) ? 1 : 0;
  return retval;
}

$ arm-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-eabi-gcc
COLLECT_LTO_WRAPPER=/home1/lhtseng/arm/4.9/libexec/gcc/arm-eabi/4.9.0/lto-wrapper
Target: arm-eabi
Configured with: ../../../../work/4.9/src/gcc-4.9.0/configure --target=arm-eabi --prefix=/home1/lhtseng/arm/4.9 --disable-nls --disable-shared --enable-languages=c --enable-__cxa_atexit --enable-c99 --enable-long-long --enable-threads=single --with-newlib --disable-multilib --disable-libssp --disable-libgomp --disable-decimal-float --disable-libffi --disable-libmudflap --disable-lto --with-gmp=/home1/lhtseng/work/general --with-mpfr=/home1/lhtseng/work/general --with-mpc=/home1/lhtseng/work/general --with-isl=/home1/lhtseng/work/general --with-cloog=/home1/lhtseng/work/general
Thread model: single
gcc version 4.9.0 20130802 (experimental) (GCC)

$ arm-eabi-gcc -O3 -S test.c
$ cat test.s
...
ee_isdigit2:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L2
        ldrb    r0, [r3, r0]    @ zero_extendqisi2
        sub     r0, r0, #48
        and     r0, r0, #255
        cmp     r0, #9
        movhi   r0, #0
        movls   r0, #1
        bx      lr
...

The instruction 'and r0, r0, #255' is redundant, but the RTL instruction
combination pass does not eliminate it. The pass was able to handle this case
before this commit:

http://gcc.gnu.org/viewcvs/gcc/trunk/gcc/simplify-rtx.c?r1=191909&r2=191928&pathrev=192303

The code was then reorganized into lines 643 ~ 656 by this commit:

http://gcc.gnu.org/viewcvs/gcc/trunk/gcc/simplify-rtx.c?r1=192006&r2=192186&pathrev=192303

For example, GCC 4.6.3 handles this case perfectly. In GCC 4.9.0, reverting the
two commits, or simply commenting out the lines mentioned above, makes the
combination pass handle this case again:

$ arm-eabi-gcc-modified -O3 -da -S test.c
$ cat test.c.166r.expand
...
(insn 9 8 10 2 (set (reg:SI 120)
        (plus:SI (subreg:SI (reg:QI 118) 0)
            (const_int -48 [0xffffffffffffffd0]))) test.c:6 -1
     (nil))
(insn 10 9 11 2 (set (reg:SI 121)
        (and:SI (reg:SI 120)
            (const_int 255 [0xff]))) test.c:6 -1
     (nil))
(insn 11 10 12 2 (set (reg:CC 100 cc)
        (compare:CC (reg:SI 121)
            (const_int 9 [0x9]))) test.c:6 -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 122)
        (leu:SI (reg:CC 100 cc)
            (const_int 0 [0]))) test.c:6 -1
     (nil))
...

$ cat test.c.197r.combine
...
Trying 9, 10 -> 11:
Failed to match this instruction:
(set (reg:CC 100 cc)
    (compare:CC (plus:SI (reg:SI 119)
            (const_int -48 [0xffffffffffffffd0]))
        (const_int 9 [0x9])))
Successfully matched this instruction:
(set (reg:SI 121)
    (plus:SI (reg:SI 119)
        (const_int -48 [0xffffffffffffffd0])))
Successfully matched this instruction:
(set (reg:CC 100 cc)
    (compare:CC (reg:SI 121)
        (const_int 9 [0x9])))
deferring deletion of insn with uid = 9.
modifying insn i2    10: r121:SI=r119:SI-0x30
      REG_DEAD r119:SI
deferring rescan insn with uid = 10.
modifying insn i3    11: cc:CC=cmp(r121:SI,0x9)
      REG_DEAD r121:SI
deferring rescan insn with uid = 11.
...

Insn 10 is generated by (define_expand "zero_extendqisi2" ...) in ARM's machine
description. Before the commits mentioned above, the combination pass
successfully combined it with insn 9. After those commits, however, the
combination pass never attempts the combination '9, 10 -> 11'.

According to the commit messages for 'simplify-rtx.c', commit r191928 was meant
to improve x86 code generation, but it led to suboptimal code generation for
the ARM target.