From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 66736 invoked by alias); 4 May 2015 15:00:14 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 66512 invoked by uid 48); 4 May 2015 15:00:03 -0000
From: "maltsevm at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/53533] [4.8/4.9/5/6 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark
Date: Mon, 04 May 2015 15:00:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 4.7.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: maltsevm at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.8.5
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-05/txt/msg00257.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #29 from Mikhail Maltsev ---
Results for attached testcase:
Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (Haswell)

g++ -O3 -march=native -mtune=native
10000 iterations

Clang 3.7
Total absolute time for int32_t for loop unrolling: 0.99 sec
Total absolute time for int32_t do loop unrolling: 1.00 sec
Total absolute time for double for loop unrolling: 1.37 sec
Total absolute time for double do loop unrolling: 1.37 sec

GCC 4.7.4
Total absolute time for int32_t for loop unrolling: 5.88 sec
Total absolute time for int32_t do loop unrolling: 7.57 sec
Total absolute time for double for loop unrolling: 2.29 sec
Total absolute time for double do loop unrolling: 2.45 sec

GCC 4.8.4
Total absolute time for int32_t for loop unrolling: 3.12 sec
Total absolute time for int32_t do loop unrolling: 3.29 sec
Total absolute time for double for loop unrolling: 1.13 sec
Total absolute time for double do loop unrolling: 1.14 sec

GCC 4.9.2
Total absolute time for int32_t for loop unrolling: 3.02 sec
Total absolute time for int32_t do loop unrolling: 3.29 sec
Total absolute time for double for loop unrolling: 1.10 sec
Total absolute time for double do loop unrolling: 1.13 sec

GCC 6
Total absolute time for int32_t for loop unrolling: 5.95 sec
Total absolute time for int32_t do loop unrolling: 6.95 sec
Total absolute time for double for loop unrolling: 2.39 sec
Total absolute time for double do loop unrolling: 2.39 sec

g++ -DINLINE_MANUALLY -O3 -march=native -mtune=native
50000 iterations

Clang 3.7
Total absolute time for int32_t for loop unrolling: 2.43 sec
Total absolute time for int32_t do loop unrolling: 2.32 sec
Total absolute time for double for loop unrolling: 6.38 sec
Total absolute time for double do loop unrolling: 6.38 sec

GCC 4.9.2
Total absolute time for int32_t for loop unrolling: 10.17 sec
Total absolute time for int32_t do loop unrolling: 10.16 sec
Total absolute time for double for loop unrolling: 3.89 sec
Total absolute time for double do loop unrolling: 3.90 sec

GCC 6
Total absolute time for int32_t for loop unrolling: 10.10 sec
Total absolute time for int32_t do loop unrolling: 10.12 sec
Total absolute time for double for loop unrolling: 3.90 sec
Total absolute time for double do loop unrolling: 3.89 sec

g++ -DINLINE_MANUALLY -Ofast -march=native -mtune=native

GCC 6
Total absolute time for int32_t for loop unrolling: 10.11 sec
Total absolute time for int32_t do loop unrolling: 10.11 sec
Total absolute time for double for loop unrolling: 1.14 sec
Total absolute time for double do loop unrolling: 1.15 sec

So, IMHO there is no regression here (at least w.r.t. vectorization).
The floating-point loop gets constant-folded if reassociation is allowed (which -Ofast permits, since it implies -ffast-math). Also, GCC 6 is able to infer that the "for" and "while" tests are semantically equivalent, and it unifies them.