From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 30133 invoked by alias); 25 Oct 2015 03:46:49 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 30084 invoked by uid 48); 25 Oct 2015 03:46:37 -0000
From: "igusarov at mail dot ru"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/68086] New: Expression explicitly defined outside the loop is moved inside the loop by the optimizer
Date: Sun, 25 Oct 2015 03:46:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 5.2.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: igusarov at mail dot ru
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-10/txt/msg02038.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68086

            Bug ID: 68086
           Summary: Expression explicitly defined outside the loop is
                    moved inside the loop by the optimizer
           Product: gcc
           Version: 5.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: igusarov at mail dot ru
  Target Milestone: ---

Created attachment 36578
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36578&action=edit
Single function to reproduce the results

Compileable C source in "ex324_core.c" does not include any header files.
It consists of a single function whose performance is spoiled by the
optimizer. Please read the explanatory comments in that file.

"ex324.c" is a compilable test program built around the same core function.
It merely measures the number of CPU clock ticks taken by that core function.
It includes system headers for printf and mmap, and is provided just for
convenience of testing.

The problem was first discovered with the x86_64 gcc 5.2.0 compiler.
A brief regression check showed that 4.8.3 has this problem too.
4.7.4 seems to be good.

The problem in a nutshell. Let's start with this loop:

    // Case 1
    for (i = 0; i < size; ++i)
        accumulator += data[i];

and rewrite it in this equivalent form:

    // Case 2
    int* rebased = data + size;
    for (i = -size; i; ++i)
        accumulator += rebased[i];

It looks like the forward propagation pass decides not to allocate a register
for the variable 'rebased', but rather to recompute its value every time it is
used in the loop. This results in assembly output which, if written in terms
of C, would look like this:

    for (i = -size; i; ++i)
        accumulator += *(data + (size + i));

The extra operation inside the loop only slows the program down.
This happens at any optimization level above -O0.

Command line:
x86_64-unknown-freebsd9.0_5.2.0-gcc -O2 -S ex324_core.c

Compiler:
x86_64-unknown-freebsd9.0_5.2.0-gcc -v
Using built-in specs.
COLLECT_GCC=x86_64-unknown-freebsd9.0_5.2.0-gcc
COLLECT_LTO_WRAPPER=/usr/toolchain/x86_64-unknown-freebsd9.0_5.2.0/libexec/gcc/x86_64-unknown-freebsd9.0/5.2.0/lto-wrapper
Target: x86_64-unknown-freebsd9.0
Configured with: /mnt/hdd/usr/home/toolbuilder/build_scripts/x86_64-unknown-freebsd9.0_5.2.0/build_scripts/../tools_build/x86_64-unknown-freebsd9.0_5.2.0/gcc-5.2.0/configure --target=x86_64-unknown-freebsd9.0 --prefix=/usr/toolchain/x86_64-unknown-freebsd9.0_5.2.0 --with-local-prefix=/usr/local --with-sysroot=/usr/toolchain/x86_64-unknown-freebsd9.0_5.2.0/sysroot --program-prefix=x86_64-unknown-freebsd9.0_5.2.0- --with-gnu-as --with-gnu-ld --with-as=/usr/toolchain/x86_64-unknown-freebsd9.0_5.2.0/bin/x86_64-unknown-freebsd9.0_5.2.0-as --with-ld=/usr/toolchain/x86_64-unknown-freebsd9.0_5.2.0/bin/x86_64-unknown-freebsd9.0_5.2.0-ld --with-nm=/usr/toolchain/x86_64-unknown-freebsd9.0_5.2.0/bin/x86_64-unknown-freebsd9.0_5.2.0-nm --with-objdump=/usr/toolchain/x86_64-unknown-freebsd9.0_5.2.0/bin/x86_64-unknown-freebsd9.0_5.2.0-objdump --with-gmp=/mnt/hdd/usr/home/toolbuilder/build_scripts/x86_64-unknown-freebsd9.0_5.2.0/build_scripts/../tools_build/x86_64-unknown-freebsd9.0_5.2.0/gmp-root --with-mpfr=/mnt/hdd/usr/home/toolbuilder/build_scripts/x86_64-unknown-freebsd9.0_5.2.0/build_scripts/../tools_build/x86_64-unknown-freebsd9.0_5.2.0/mpfr-root --with-mpc=/mnt/hdd/usr/home/toolbuilder/build_scripts/x86_64-unknown-freebsd9.0_5.2.0/build_scripts/../tools_build/x86_64-unknown-freebsd9.0_5.2.0/mpc-root --disable-__cxa_atexit --enable-languages=c,c++ --disable-multilib --disable-nls --enable-shared=libstdc++ --enable-static --enable-threads
Thread model: posix
gcc version 5.2.0 (GCC)

Operating system:
amd64 FreeBSD 9.0-RELEASE

CPU: Intel(R) Core(TM) i7-2700K CPU @ 3.50GHz (3500.10-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x206a7  Family = 6  Model = 2a  Stepping = 7
  Features=0xbfebfbff
  Features2=0x179ae3bf
  AMD Features=0x28100800
  AMD Features2=0x1
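
For anyone who wants to compare the loop forms from the description without
the attachments, the following is a minimal sketch, not the attached
reproducer. The function names, the ptrdiff_t index and the long accumulator
are assumptions made for illustration; "ex324_core.c" remains the
authoritative test case. The third function spells out in C what the
generated code is reported to do. All three assume size >= 0.

```c
#include <stddef.h>

/* Case 1: plain forward loop over data[0 .. size-1]. */
long sum_forward(const int *data, ptrdiff_t size)
{
    long accumulator = 0;
    for (ptrdiff_t i = 0; i < size; ++i)
        accumulator += data[i];
    return accumulator;
}

/* Case 2: the base pointer is moved one past the end, outside the loop,
 * and the index counts up from -size to 0. */
long sum_rebased(const int *data, ptrdiff_t size)
{
    const int *rebased = data + size;   /* computed once, before the loop */
    long accumulator = 0;
    for (ptrdiff_t i = -size; i; ++i)
        accumulator += rebased[i];
    return accumulator;
}

/* The shape the report attributes to the optimized output: the sum
 * size + i is formed again on every iteration. */
long sum_recomputed(const int *data, ptrdiff_t size)
{
    long accumulator = 0;
    for (ptrdiff_t i = -size; i; ++i)
        accumulator += *(data + (size + i));
    return accumulator;
}
```

All three functions return the same sum for the same input, so any
difference shows up only in the generated assembly (for example with -O2 -S)
or in timing, not in the result.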