From: "doug.gilmore at imgtec dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32.
Date: Wed, 03 Sep 2014 00:16:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

            Bug ID: 63148
           Summary: r187042 causes auto-vectorization failure for X86 for
                    -m32.
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: doug.gilmore at imgtec dot com

Created attachment 33440
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33440&action=edit
test example

I noticed that MultiSource/Benchmarks/TSVC/LoopRestructuring-{flt,dbl} from
LLVM test-suite fail on X86 -m32 and I was able to bisect the failure to
commit r187042.
I attached a stripped-down example.

Before the revision, compiling with -fdump-tree-vect-details shows that a
loop-carried dependency is recorded:

(compute_affine_dependence
  stmt_a: D.1748_9 = global_data.b[D.1747_8];
  stmt_b: global_data.b[i.0_2] = D.1750_11;
  (subscript_dependence_tester
    (analyze_overlapping_iterations
      (chrec_a = {0, +, 1}_5)
      (chrec_b = {1, +, 1}_5)
      (analyze_siv_subscript
        (analyze_subscript_affine_affine
          (overlaps_a = [1 + 1 * x_1] )
          (overlaps_b = [0 + 1 * x_1] )
        )
      )
      (overlap_iterations_a = [1 + 1 * x_1] )
      (overlap_iterations_b = [0 + 1 * x_1] )
    )
    (analyze_overlapping_iterations
      (chrec_a = 2816)
      (chrec_b = 2816)
      (overlap_iterations_a = [0] )
      (overlap_iterations_b = [0] )
    )
    (build_classic_dist_vector
      dist_vector = ( 1 )
    )
  )
)

which results in the loop not being vectorized because of the memory
recurrence.

After the change the dependency is not recorded:

(compute_affine_dependence
  stmt_a: D.1748_9 = global_data.b[D.1747_8];
  stmt_b: global_data.b[i.0_2] = D.1750_11;
  (subscript_dependence_tester
    (analyze_overlapping_iterations
      (chrec_a = {536870912, +, 1}_5)
      (chrec_b = {1, +, 1}_5)
      (analyze_siv_subscript
        (analyze_subscript_affine_affine
          (overlaps_a = no dependence )
          (overlaps_b = no dependence )
        )
      )
      (overlap_iterations_a = no dependence )
      (overlap_iterations_b = no dependence )
    )
    (dependence classified: scev_known)
  )

causing the loop to be incorrectly vectorized.

Note that when compiled with -m64 the loop is actually vectorized, but it is
determined that versioning is needed:

45: dependence distance == 0 between global_data.a[D.1767_2] and global_data.a[D.1767_2]
45: versioning for alias required: can't determine dependence between global_data.a[D.1767_2] and *D.1776_10
...
58: LOOP VECTORIZED.
s221_extract.c:40: note: vectorized 5 loops in function.
Merging blocks 2 and 41
Removing basic block 5
...

and the incorrectly vectorized code is removed.
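
For readers without the attachment, here is a minimal sketch of why ignoring a distance-1 dependence on b gives wrong results. It is not the attached test case: the function names, N, and VF are hypothetical, and the loop body is only assumed to resemble the b[i] = b[i-1] + ... recurrence that the stmt_a/stmt_b pair in the dump suggests. The last helper records one arithmetic observation about the new chrec_a start value, 536870912 = 2^29: with 8-byte elements and 32-bit offsets, 2^29 * 8 wraps to 0. Whether that wraparound is what actually happens inside GCC is an assumption and is not stated in this report.

```c
#include <assert.h>
#include <stdint.h>

#define N  16   /* hypothetical array length */
#define VF 4    /* hypothetical vectorization factor */

/* Scalar reference: iteration i reads the value written by iteration i-1. */
void recurrence_scalar(double *b, const double *a, int n)
{
    assert(n >= 1);
    for (int i = 1; i < n; i++)
        b[i] = b[i - 1] + a[i];
}

/* Models a vectorizer that ignores the distance-1 dependence:
   all VF loads of a chunk are done before any of its stores. */
void recurrence_chunked(double *b, const double *a, int n)
{
    assert(n >= 1);
    int i = 1;
    for (; i + VF <= n; i += VF) {
        double tmp[VF];
        for (int k = 0; k < VF; k++)      /* "vector" load and add */
            tmp[k] = b[i + k - 1] + a[i + k];
        for (int k = 0; k < VF; k++)      /* "vector" store */
            b[i + k] = tmp[k];
    }
    for (; i < n; i++)                    /* scalar remainder */
        b[i] = b[i - 1] + a[i];
}

/* Returns 1 if the two versions disagree on the same input, else 0. */
int results_differ(void)
{
    double a[N], b1[N], b2[N];
    for (int i = 0; i < N; i++) {
        a[i] = 1.0;
        b1[i] = b2[i] = 0.0;
    }
    recurrence_scalar(b1, a, N);
    recurrence_chunked(b2, a, N);
    for (int i = 0; i < N; i++)
        if (b1[i] != b2[i])
            return 1;
    return 0;
}

/* Byte offset of element `index` computed in 32-bit unsigned arithmetic,
   which wraps modulo 2^32. */
uint32_t byte_offset_32(uint32_t index, uint32_t elem_size)
{
    return index * elem_size;
}
```

With a[i] = 1 and b zeroed, the scalar loop produces a running sum in b, while the chunked version reads stale b values inside each chunk, so results_differ() reports a mismatch. byte_offset_32(536870912u, 8u) evaluates to 0, the same byte offset as index 0.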