From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2723 invoked by alias); 24 Feb 2007 10:59:13 -0000
Received: (qmail 2613 invoked by uid 48); 24 Feb 2007 10:59:01 -0000
Date: Sat, 24 Feb 2007 10:59:00 -0000
Message-ID: <20070224105901.2612.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug target/30778] [4.3 Regression] invalid code generation for memset() with -mtune=k8
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "ubizjak at gmail dot com"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2007-02/txt/msg02755.txt.bz2

------- Comment #3 from ubizjak at gmail dot com  2007-02-24 10:59 -------
I'm currently testing this patch:

2007-02-24  Uros Bizjak

        * config/i386/i386.c (expand_set_or_movmem_via_loop): Return if
        expected_size is less than GET_MODE_SIZE (mode) * unroll.

testsuite/ChangeLog:

2007-02-24  Uros Bizjak

        * gcc.target/i386/pr30778.c: New test.

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c  (revision 122286)
+++ config/i386/i386.c  (working copy)
@@ -13315,13 +13315,19 @@
 {
   rtx out_label, top_label, iter, tmp;
   enum machine_mode iter_mode;
-  rtx piece_size = GEN_INT (GET_MODE_SIZE (mode) * unroll);
-  rtx piece_size_mask = GEN_INT (~((GET_MODE_SIZE (mode) * unroll) - 1));
+  HOST_WIDE_INT min_size = GET_MODE_SIZE (mode) * unroll;
+  rtx piece_size = GEN_INT (min_size);
+  rtx piece_size_mask = GEN_INT (~(min_size - 1));
   rtx size;
   rtx x_addr;
   rtx y_addr;
   int i;
 
+  /* Bail out if expected size is less than minimum size
+     that can be emitted.  */
+  if (expected_size < min_size)
+    return;
+
   iter_mode = GET_MODE (count);
   if (iter_mode == VOIDmode)
     iter_mode = word_mode;

There is also an optimization opportunity.
When compiling the testcase to 32-bit code (with -m32 -O2 -mtune=k8), the
following is generated:

init_reg_last:
        pushl   %ebp
        movl    reg_stat, %edx
        xorl    %eax, %eax
        movl    %esp, %ebp
.L2:
        movl    $0, (%edx,%eax)
        movl    $0, 4(%edx,%eax)
        movl    $0, 8(%edx,%eax)
        movl    $0, 12(%edx,%eax)
        addl    $16, %eax
        cmpl    $16, %eax       <<< not needed
        jb      .L2             <<< not needed
        addl    %eax, %edx
        movw    $0, (%edx)
        movb    $0, 2(%edx)
        leave
        ret

We don't need to create a loop in this case, as the loop body will be
executed only once.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30778
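
To make the "executed only once" observation concrete, here is a rough
model of the arithmetic behind the listing above. It is only a sketch and
not GCC code: round_down, loop_iterations and tail_bytes are hypothetical
helper names, and the 19-byte count is inferred from the stores in the
listing (16 bytes in the loop body, then movw and movb). It assumes the
piece size is a power of two, so that ~(piece - 1) works as a round-down
mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: round COUNT down to a multiple of
   PIECE.  PIECE must be a power of two so that ~(PIECE - 1) is a valid
   mask.  */
static uint32_t
round_down (uint32_t count, uint32_t piece)
{
  assert (piece != 0 && (piece & (piece - 1)) == 0);
  return count & ~(piece - 1);
}

/* Number of times a loop shaped like .L2 runs: store PIECE bytes,
   advance the index, then compare against the rounded-down count.  */
static uint32_t
loop_iterations (uint32_t count, uint32_t piece)
{
  uint32_t limit = round_down (count, piece);
  uint32_t iter = 0;
  uint32_t n = 0;

  /* Mirrors the addl / cmpl / jb sequence: the test is at the bottom,
     so the body always runs at least once.  */
  do
    {
      iter += piece;
      n++;
    }
  while (iter < limit);

  return n;
}

/* Bytes left over for the straight-line stores after the loop.  */
static uint32_t
tail_bytes (uint32_t count, uint32_t piece)
{
  return count - round_down (count, piece);
}
```

With count = 19 and piece = 16, round_down gives 16, loop_iterations
gives 1 and tail_bytes gives 3, which matches the listing: one pass
through .L2 followed by a 2-byte and a 1-byte store. Because the test
sits at the bottom of the loop, the model also shows why a count smaller
than the piece size is a problem: the body would still run once.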