Subject: [PING] [PATCH V3] PR88497 - Extend reassoc for vector bit_field_ref
From: "Kewen.Lin"
To: GCC Patches
Cc: Bill Schmidt, Segher Boessenkool, Richard Guenther
Date: Tue, 21 May 2019 02:03:00 -0000
Message-Id: <0a96e8a9-9219-6fc0-7fb5-a3673e29df52@linux.ibm.com>
X-SW-Source: 2019-05/txt/msg01341.txt.bz2

Hi,

Gentle ping again.  Thanks!

Kewen

on 2019/5/5 2:15 PM, Kewen.Lin wrote:
> Hi,
>
> I'd like to gently ping this patch:
> https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00966.html
>
> OK for trunk now?
>
> Thanks!
>
> on 2019/3/20 11:14 AM, Kewen.Lin wrote:
>> Hi,
>>
>> Please refer to the link below for the previous threads.
>> https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00348.html
>>
>> Compared with patch v2, I've moved the vector operation target
>> check up, next to the vector type target check.  I also ran
>> bootstrap and regtest on powerpc64-linux-gnu (BE), and updated the
>> testcases' requirements and options for robustness.
>>
>> Is it OK for GCC10?
>>
>>
>> gcc/ChangeLog
>>
>> 2019-03-20  Kewen Lin
>>
>>         PR target/88497
>>         * tree-ssa-reassoc.c (reassociate_bb): Swap the positions of
>>         GIMPLE_BINARY_RHS check and gimple_visited_p check, call new
>>         function undistribute_bitref_for_vector.
>>         (undistribute_bitref_for_vector): New function.
>>         (cleanup_vinfo_map): Likewise.
>>         (unsigned_cmp): Likewise.
>>
>> gcc/testsuite/ChangeLog
>>
>> 2019-03-20  Kewen Lin
>>
>>         * gcc.dg/tree-ssa/pr88497-1.c: New test.
>>         * gcc.dg/tree-ssa/pr88497-2.c: Likewise.
>>         * gcc.dg/tree-ssa/pr88497-3.c: Likewise.
>>         * gcc.dg/tree-ssa/pr88497-4.c: Likewise.
>>         * gcc.dg/tree-ssa/pr88497-5.c: Likewise.
>>
>> ---
>>  gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c |  44 +++++
>>  gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c |  33 ++++
>>  gcc/testsuite/gcc.dg/tree-ssa/pr88497-3.c |  33 ++++
>>  gcc/testsuite/gcc.dg/tree-ssa/pr88497-4.c |  33 ++++
>>  gcc/testsuite/gcc.dg/tree-ssa/pr88497-5.c |  33 ++++
>>  gcc/tree-ssa-reassoc.c                    | 306 +++++++++++++++++++++++++++++-
>>  6 files changed, 477 insertions(+), 5 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-3.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-4.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-5.c
>>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
>> new file mode 100644
>> index 0000000..99c9af8
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
>> @@ -0,0 +1,44 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_double } */
>> +/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } } } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +/* { dg-options "-O2 -ffast-math -mvsx -fdump-tree-reassoc1" { target { powerpc*-*-* } } } */
>> +
>> +/* Test that reassoc can undistribute vector bit_field_ref summation.
>> +
>> +   arg1 and arg2 are two arrays whose elements are of type vector double.
>> +   Assuming:
>> +     A0 = arg1[0], A1 = arg1[1], A2 = arg1[2], A3 = arg1[3],
>> +     B0 = arg2[0], B1 = arg2[1], B2 = arg2[2], B3 = arg2[3],
>> +
>> +   Then:
>> +     V0 = A0 * B0, V1 = A1 * B1, V2 = A2 * B2, V3 = A3 * B3,
>> +
>> +   reassoc transforms
>> +
>> +     accumulator += V0[0] + V0[1] + V1[0] + V1[1] + V2[0] + V2[1]
>> +                    + V3[0] + V3[1];
>> +
>> +   into:
>> +
>> +     T = V0 + V1 + V2 + V3
>> +     accumulator += T[0] + T[1];
>> +
>> +   Fewer bit_field_refs, only two for 128-bit or wider vectors.  */
>> +
>> +typedef double v2df __attribute__ ((vector_size (16)));
>> +double
>> +test (double accumulator, v2df arg1[], v2df arg2[])
>> +{
>> +  v2df temp;
>> +  temp = arg1[0] * arg2[0];
>> +  accumulator += temp[0] + temp[1];
>> +  temp = arg1[1] * arg2[1];
>> +  accumulator += temp[0] + temp[1];
>> +  temp = arg1[2] * arg2[2];
>> +  accumulator += temp[0] + temp[1];
>> +  temp = arg1[3] * arg2[3];
>> +  accumulator += temp[0] + temp[1];
>> +  return accumulator;
>> +}
>> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 2 "reassoc1" { target { powerpc*-*-* } } } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
>> new file mode 100644
>> index 0000000..61ed0bf5
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
>> @@ -0,0 +1,33 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_float } */
>> +/* { dg-require-effective-target powerpc_altivec_ok { target { powerpc*-*-* } } } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +/* { dg-options "-O2 -ffast-math -maltivec -fdump-tree-reassoc1" { target { powerpc*-*-* } } } */
>> +
>> +/* Test that reassoc can undistribute vector bit_field_ref on multiplication.
>> +
>> +   v1, v2, v3, v4 are of type vector float.
>> +
>> +   reassoc transforms
>> +
>> +     accumulator *= v1[0] * v1[1] * v1[2] * v1[3] *
>> +                    v2[0] * v2[1] * v2[2] * v2[3] *
>> +                    v3[0] * v3[1] * v3[2] * v3[3] *
>> +                    v4[0] * v4[1] * v4[2] * v4[3] ;
>> +
>> +   into:
>> +
>> +     T = v1 * v2 * v3 * v4;
>> +     accumulator *= T[0] * T[1] * T[2] * T[3];
>> +
>> +   Fewer bit_field_refs, only four for 128-bit or wider vectors.  */
>> +
>> +typedef float v4si __attribute__((vector_size(16)));
>> +float test(float accumulator, v4si v1, v4si v2, v4si v3, v4si v4) {
>> +  accumulator *= v1[0] * v1[1] * v1[2] * v1[3];
>> +  accumulator *= v2[0] * v2[1] * v2[2] * v2[3];
>> +  accumulator *= v3[0] * v3[1] * v3[2] * v3[3];
>> +  accumulator *= v4[0] * v4[1] * v4[2] * v4[3];
>> +  return accumulator;
>> +}
>> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 4 "reassoc1" { target { powerpc*-*-* } } } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-3.c
>> new file mode 100644
>> index 0000000..3790afc
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-3.c
>> @@ -0,0 +1,33 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_int } */
>> +/* { dg-require-effective-target powerpc_altivec_ok { target { powerpc*-*-* } } } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +/* { dg-options "-O2 -ffast-math -maltivec -fdump-tree-reassoc1" { target { powerpc*-*-* } } } */
>> +
>> +/* Test that reassoc can undistribute vector bit_field_ref on bitwise AND.
>> +
>> +   v1, v2, v3, v4 are of type vector int.
>> +
>> +   reassoc transforms
>> +
>> +     accumulator &= v1[0] & v1[1] & v1[2] & v1[3] &
>> +                    v2[0] & v2[1] & v2[2] & v2[3] &
>> +                    v3[0] & v3[1] & v3[2] & v3[3] &
>> +                    v4[0] & v4[1] & v4[2] & v4[3] ;
>> +
>> +   into:
>> +
>> +     T = v1 & v2 & v3 & v4;
>> +     accumulator &= T[0] & T[1] & T[2] & T[3];
>> +
>> +   Fewer bit_field_refs, only four for 128-bit or wider vectors.  */
>> +
>> +typedef int v4si __attribute__((vector_size(16)));
>> +int test(int accumulator, v4si v1, v4si v2, v4si v3, v4si v4) {
>> +  accumulator &= v1[0] & v1[1] & v1[2] & v1[3];
>> +  accumulator &= v2[0] & v2[1] & v2[2] & v2[3];
>> +  accumulator &= v3[0] & v3[1] & v3[2] & v3[3];
>> +  accumulator &= v4[0] & v4[1] & v4[2] & v4[3];
>> +  return accumulator;
>> +}
>> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 4 "reassoc1" { target { powerpc*-*-* } } } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-4.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-4.c
>> new file mode 100644
>> index 0000000..1864aad
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-4.c
>> @@ -0,0 +1,33 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_int } */
>> +/* { dg-require-effective-target powerpc_altivec_ok { target { powerpc*-*-* } } } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +/* { dg-options "-O2 -ffast-math -maltivec -fdump-tree-reassoc1" { target { powerpc*-*-* } } } */
>> +
>> +/* Test that reassoc can undistribute vector bit_field_ref on bitwise IOR.
>> +
>> +   v1, v2, v3, v4 are of type vector int.
>> +
>> +   reassoc transforms
>> +
>> +     accumulator |= v1[0] | v1[1] | v1[2] | v1[3] |
>> +                    v2[0] | v2[1] | v2[2] | v2[3] |
>> +                    v3[0] | v3[1] | v3[2] | v3[3] |
>> +                    v4[0] | v4[1] | v4[2] | v4[3] ;
>> +
>> +   into:
>> +
>> +     T = v1 | v2 | v3 | v4;
>> +     accumulator |= T[0] | T[1] | T[2] | T[3];
>> +
>> +   Fewer bit_field_refs, only four for 128-bit or wider vectors.  */
>> +
>> +typedef int v4si __attribute__((vector_size(16)));
>> +int test(int accumulator, v4si v1, v4si v2, v4si v3, v4si v4) {
>> +  accumulator |= v1[0] | v1[1] | v1[2] | v1[3];
>> +  accumulator |= v2[0] | v2[1] | v2[2] | v2[3];
>> +  accumulator |= v3[0] | v3[1] | v3[2] | v3[3];
>> +  accumulator |= v4[0] | v4[1] | v4[2] | v4[3];
>> +  return accumulator;
>> +}
>> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 4 "reassoc1" { target { powerpc*-*-* } } } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-5.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-5.c
>> new file mode 100644
>> index 0000000..f747372
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-5.c
>> @@ -0,0 +1,33 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_int } */
>> +/* { dg-require-effective-target powerpc_altivec_ok { target { powerpc*-*-* } } } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +/* { dg-options "-O2 -ffast-math -maltivec -fdump-tree-reassoc1" { target { powerpc*-*-* } } } */
>> +
>> +/* Test that reassoc can undistribute vector bit_field_ref on bitwise XOR.
>> +
>> +   v1, v2, v3, v4 are of type vector int.
>> +
>> +   reassoc transforms
>> +
>> +     accumulator ^= v1[0] ^ v1[1] ^ v1[2] ^ v1[3] ^
>> +                    v2[0] ^ v2[1] ^ v2[2] ^ v2[3] ^
>> +                    v3[0] ^ v3[1] ^ v3[2] ^ v3[3] ^
>> +                    v4[0] ^ v4[1] ^ v4[2] ^ v4[3] ;
>> +
>> +   into:
>> +
>> +     T = v1 ^ v2 ^ v3 ^ v4;
>> +     accumulator ^= T[0] ^ T[1] ^ T[2] ^ T[3];
>> +
>> +   Fewer bit_field_refs, only four for 128-bit or wider vectors.  */
>> +
>> +typedef int v4si __attribute__((vector_size(16)));
>> +int test(int accumulator, v4si v1, v4si v2, v4si v3, v4si v4) {
>> +  accumulator ^= v1[0] ^ v1[1] ^ v1[2] ^ v1[3];
>> +  accumulator ^= v2[0] ^ v2[1] ^ v2[2] ^ v2[3];
>> +  accumulator ^= v3[0] ^ v3[1] ^ v3[2] ^ v3[3];
>> +  accumulator ^= v4[0] ^ v4[1] ^ v4[2] ^ v4[3];
>> +  return accumulator;
>> +}
>> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 4 "reassoc1" { target { powerpc*-*-* } } } } */
>> diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
>> index e1c4dfe..a6cd85a 100644
>> --- a/gcc/tree-ssa-reassoc.c
>> +++ b/gcc/tree-ssa-reassoc.c
>> @@ -1772,6 +1772,295 @@ undistribute_ops_list (enum tree_code opcode,
>>    return changed;
>>  }
>>
>> +/* Hold the information of one specific VECTOR_TYPE SSA_NAME.
>> +    - offsets: for different BIT_FIELD_REF offsets accessing same VECTOR.
>> +    - ops_indexes: the index of vec ops* for each relevant BIT_FIELD_REF.  */
>> +struct v_info
>> +{
>> +  auto_vec offsets;
>> +  auto_vec ops_indexes;
>> +};
>> +
>> +typedef struct v_info *v_info_ptr;
>> +
>> +/* Comparison function for qsort on unsigned BIT_FIELD_REF offsets.  */
>> +static int
>> +unsigned_cmp (const void *p_i, const void *p_j)
>> +{
>> +  if (*(const unsigned HOST_WIDE_INT *) p_i
>> +      >= *(const unsigned HOST_WIDE_INT *) p_j)
>> +    return 1;
>> +  else
>> +    return -1;
>> +}
>> +
>> +/* Cleanup hash map for VECTOR information.  */
>> +static void
>> +cleanup_vinfo_map (hash_map &info_map)
>> +{
>> +  for (hash_map::iterator it = info_map.begin ();
>> +       it != info_map.end (); ++it)
>> +    {
>> +      v_info_ptr info = (*it).second;
>> +      delete info;
>> +      (*it).second = NULL;
>> +    }
>> +}
>> +
>> +/* Perform un-distribution of BIT_FIELD_REF on VECTOR_TYPE.
>> +     V1[0] + V1[1] + ... + V1[k] + V2[0] + V2[1] + ... + V2[k] + ... Vn[k]
>> +   is transformed to
>> +     Vs = (V1 + V2 + ... + Vn)
>> +     Vs[0] + Vs[1] + ... + Vs[k]
>> +
>> +   The basic steps are listed below:
>> +
>> +    1) Check the addition chain *OPS by looking for those summands coming from
>> +       VECTOR bit_field_ref on VECTOR type.  Put the information into
>> +       v_info_map for each satisfied summand, using VECTOR SSA_NAME as key.
>> +
>> +    2) For each key (VECTOR SSA_NAME), validate all its BIT_FIELD_REFs are
>> +       contiguous, i.e. they cover the whole VECTOR perfectly without any holes.
>> +       Obtain one VECTOR list which contains the candidates to be transformed.
>> +
>> +    3) Build the addition statements for all VECTOR candidates, generate
>> +       BIT_FIELD_REFs accordingly.
>> +
>> +   TODO:
>> +    1) The current implementation requires all candidate VECTORs to have
>> +       the same VECTOR type, but it can be extended to handle different groups
>> +       by VECTOR type in future if any profitable cases are found.
>> +    2) The current implementation requires each VECTOR to be fully covered,
>> +       but it can be extended to support partial coverage (adjacent offsets
>> +       that do not fill the whole VECTOR); that may need a cost model to
>> +       decide when the transformation is worthwhile.
>> +*/
>> +static bool
>> +undistribute_bitref_for_vector (enum tree_code opcode, vec *ops,
>> +                                struct loop *loop)
>> +{
>> +  if (ops->length () <= 1)
>> +    return false;
>> +
>> +  if (opcode != PLUS_EXPR && opcode != MULT_EXPR && opcode != BIT_XOR_EXPR
>> +      && opcode != BIT_IOR_EXPR && opcode != BIT_AND_EXPR)
>> +    return false;
>> +
>> +  hash_map v_info_map;
>> +  operand_entry *oe1;
>> +  unsigned i;
>> +
>> +  /* Find those summands from VECTOR BIT_FIELD_REF in addition chain, put the
>> +     information into map.  */
>> +  FOR_EACH_VEC_ELT (*ops, i, oe1)
>> +    {
>> +      enum tree_code dcode;
>> +      gimple *oe1def;
>> +
>> +      if (TREE_CODE (oe1->op) != SSA_NAME)
>> +        continue;
>> +      oe1def = SSA_NAME_DEF_STMT (oe1->op);
>> +      if (!is_gimple_assign (oe1def))
>> +        continue;
>> +      dcode = gimple_assign_rhs_code (oe1def);
>> +      if (dcode != BIT_FIELD_REF || !is_reassociable_op (oe1def, dcode, loop))
>> +        continue;
>> +
>> +      tree rhs = gimple_op (oe1def, 1);
>> +      tree op0 = TREE_OPERAND (rhs, 0);
>> +      tree vec_type = TREE_TYPE (op0);
>> +
>> +      if (TREE_CODE (op0) != SSA_NAME || TREE_CODE (vec_type) != VECTOR_TYPE)
>> +        continue;
>> +
>> +      tree op1 = TREE_OPERAND (rhs, 1);
>> +      tree op2 = TREE_OPERAND (rhs, 2);
>> +
>> +      tree elem_type = TREE_TYPE (vec_type);
>> +      unsigned HOST_WIDE_INT size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
>> +      if (size != TREE_INT_CST_LOW (op1))
>> +        continue;
>> +
>> +      /* Ignore it if target machine can't support this VECTOR type.  */
>> +      if (!VECTOR_MODE_P (TYPE_MODE (vec_type)))
>> +        continue;
>> +
>> +      /* Ignore it if target machine can't support this type of VECTOR
>> +         operation.  */
>> +      optab op_tab = optab_for_tree_code (opcode, vec_type, optab_vector);
>> +      if (optab_handler (op_tab, TYPE_MODE (vec_type)) == CODE_FOR_nothing)
>> +        continue;
>> +
>> +      v_info_ptr *info_ptr = v_info_map.get (op0);
>> +      if (info_ptr)
>> +        {
>> +          v_info_ptr info = *info_ptr;
>> +          info->offsets.safe_push (TREE_INT_CST_LOW (op2));
>> +          info->ops_indexes.safe_push (i);
>> +        }
>> +      else
>> +        {
>> +          v_info_ptr info = new v_info;
>> +          info->offsets.safe_push (TREE_INT_CST_LOW (op2));
>> +          info->ops_indexes.safe_push (i);
>> +          v_info_map.put (op0, info);
>> +        }
>> +    }
>> +
>> +  /* Need at least two VECTORs to combine.  */
>> +  if (v_info_map.elements () <= 1)
>> +    {
>> +      cleanup_vinfo_map (v_info_map);
>> +      return false;
>> +    }
>> +
>> +  /* Use the first VECTOR and its information as the reference.
>> +     Firstly, we should validate it, that is:
>> +       1) sorted offsets are adjacent, no holes.
>> +       2) can fill the whole VECTOR perfectly.  */
>> +  hash_map::iterator it = v_info_map.begin ();
>> +  tree ref_vec = (*it).first;
>> +  v_info_ptr ref_info = (*it).second;
>> +  ref_info->offsets.qsort (unsigned_cmp);
>> +  tree vec_type = TREE_TYPE (ref_vec);
>> +  tree elem_type = TREE_TYPE (vec_type);
>> +  unsigned HOST_WIDE_INT elem_size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
>> +  unsigned HOST_WIDE_INT curr;
>> +  unsigned HOST_WIDE_INT prev = ref_info->offsets[0];
>> +
>> +  /* Contiguity check.  */
>> +  FOR_EACH_VEC_ELT_FROM (ref_info->offsets, i, curr, 1)
>> +    {
>> +      if (curr != (prev + elem_size))
>> +        {
>> +          cleanup_vinfo_map (v_info_map);
>> +          return false;
>> +        }
>> +      prev = curr;
>> +    }
>> +
>> +  /* Check whether the offsets fill the whole VECTOR.  */
>> +  if ((prev + elem_size) != TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (ref_vec))))
>> +    {
>> +      cleanup_vinfo_map (v_info_map);
>> +      return false;
>> +    }
>> +
>> +  auto_vec vectors (v_info_map.elements ());
>> +  vectors.quick_push (ref_vec);
>> +
>> +  /* Use the ref_vec to filter others.  */
>> +  for (++it; it != v_info_map.end (); ++it)
>> +    {
>> +      tree vec = (*it).first;
>> +      v_info_ptr info = (*it).second;
>> +      if (TREE_TYPE (ref_vec) != TREE_TYPE (vec))
>> +        continue;
>> +      if (ref_info->offsets.length () != info->offsets.length ())
>> +        continue;
>> +      bool same_offset = true;
>> +      info->offsets.qsort (unsigned_cmp);
>> +      for (unsigned i = 0; i < ref_info->offsets.length (); i++)
>> +        {
>> +          if (ref_info->offsets[i] != info->offsets[i])
>> +            {
>> +              same_offset = false;
>> +              break;
>> +            }
>> +        }
>> +      if (!same_offset)
>> +        continue;
>> +      vectors.quick_push (vec);
>> +    }
>> +
>> +  if (vectors.length () < 2)
>> +    {
>> +      cleanup_vinfo_map (v_info_map);
>> +      return false;
>> +    }
>> +
>> +  tree tr;
>> +  if (dump_file && (dump_flags & TDF_DETAILS))
>> +    {
>> +      fprintf (dump_file, "The bit_field_ref vector list for undistribute: ");
>> +      FOR_EACH_VEC_ELT (vectors, i, tr)
>> +        {
>> +          print_generic_expr (dump_file, tr);
>> +          fprintf (dump_file, " ");
>> +        }
>> +      fprintf (dump_file, "\n");
>> +    }
>> +
>> +  /* Build the sum for all candidate VECTORs.  */
>> +  unsigned idx;
>> +  gimple *sum = NULL;
>> +  v_info_ptr info;
>> +  tree sum_vec = ref_vec;
>> +  FOR_EACH_VEC_ELT_FROM (vectors, i, tr, 1)
>> +    {
>> +      sum = build_and_add_sum (TREE_TYPE (ref_vec), sum_vec, tr, opcode);
>> +      info = *(v_info_map.get (tr));
>> +      unsigned j;
>> +      FOR_EACH_VEC_ELT (info->ops_indexes, j, idx)
>> +        {
>> +          gimple *def = SSA_NAME_DEF_STMT ((*ops)[idx]->op);
>> +          gimple_set_visited (def, true);
>> +          if (opcode == PLUS_EXPR || opcode == BIT_XOR_EXPR
>> +              || opcode == BIT_IOR_EXPR)
>> +            (*ops)[idx]->op = build_zero_cst (TREE_TYPE ((*ops)[idx]->op));
>> +          else if (opcode == MULT_EXPR)
>> +            (*ops)[idx]->op = build_one_cst (TREE_TYPE ((*ops)[idx]->op));
>> +          else
>> +            {
>> +              gcc_assert (opcode == BIT_AND_EXPR);
>> +              (*ops)[idx]->op
>> +                = build_all_ones_cst (TREE_TYPE ((*ops)[idx]->op));
>> +            }
>> +          (*ops)[idx]->rank = 0;
>> +        }
>> +      sum_vec = gimple_get_lhs (sum);
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +        {
>> +          fprintf (dump_file, "Generating addition -> ");
>> +          print_gimple_stmt (dump_file, sum, 0);
>> +        }
>> +    }
>> +
>> +  /* Referring to any well-formed VECTOR (here using ref_vec), generate the
>> +     BIT_FIELD_REF statements accordingly.  */
>> +  info = *(v_info_map.get (ref_vec));
>> +  gcc_assert (sum);
>> +  FOR_EACH_VEC_ELT (info->ops_indexes, i, idx)
>> +    {
>> +      tree dst = make_ssa_name (elem_type);
>> +      gimple *gs
>> +        = gimple_build_assign (dst, BIT_FIELD_REF,
>> +                               build3 (BIT_FIELD_REF, elem_type, sum_vec,
>> +                                       TYPE_SIZE (elem_type),
>> +                                       bitsize_int (info->offsets[i])));
>> +      insert_stmt_after (gs, sum);
>> +      update_stmt (gs);
>> +      gimple *def = SSA_NAME_DEF_STMT ((*ops)[idx]->op);
>> +      gimple_set_visited (def, true);
>> +      (*ops)[idx]->op = gimple_assign_lhs (gs);
>> +      (*ops)[idx]->rank = get_rank ((*ops)[idx]->op);
>> +      if (dump_file && (dump_flags & TDF_DETAILS))
>> +        {
>> +          fprintf (dump_file, "Generating bit_field_ref -> ");
>> +          print_gimple_stmt (dump_file, gs, 0);
>> +        }
>> +    }
>> +
>> +  if (dump_file && (dump_flags & TDF_DETAILS))
>> +    {
>> +      fprintf (dump_file, "undistributing bit_field_ref for vector done.\n");
>> +    }
>> +
>> +  cleanup_vinfo_map (v_info_map);
>> +
>> +  return true;
>> +}
>> +
>>  /* If OPCODE is BIT_IOR_EXPR or BIT_AND_EXPR and CURR is a comparison
>>     expression, examine the other OPS to see if any of them are comparisons
>>     of the same values, which we may be able to combine or eliminate.
>> @@ -5880,11 +6169,6 @@ reassociate_bb (basic_block bb)
>>            tree lhs, rhs1, rhs2;
>>            enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
>>
>> -          /* If this is not a gimple binary expression, there is
>> -             nothing for us to do with it.  */
>> -          if (get_gimple_rhs_class (rhs_code) != GIMPLE_BINARY_RHS)
>> -            continue;
>> -
>>            /* If this was part of an already processed statement,
>>               we don't need to touch it again. */
>>            if (gimple_visited_p (stmt))
>> @@ -5911,6 +6195,11 @@ reassociate_bb (basic_block bb)
>>                continue;
>>              }
>>
>> +          /* If this is not a gimple binary expression, there is
>> +             nothing for us to do with it.  */
>> +          if (get_gimple_rhs_class (rhs_code) != GIMPLE_BINARY_RHS)
>> +            continue;
>> +
>>            lhs = gimple_assign_lhs (stmt);
>>            rhs1 = gimple_assign_rhs1 (stmt);
>>            rhs2 = gimple_assign_rhs2 (stmt);
>> @@ -5950,6 +6239,13 @@ reassociate_bb (basic_block bb)
>>                    optimize_ops_list (rhs_code, &ops);
>>                  }
>>
>> +              if (undistribute_bitref_for_vector (rhs_code, &ops,
>> +                                                  loop_containing_stmt (stmt)))
>> +                {
>> +                  ops.qsort (sort_by_operand_rank);
>> +                  optimize_ops_list (rhs_code, &ops);
>> +                }
>> +
>>                if (rhs_code == PLUS_EXPR
>>                    && transform_add_to_multiply (&ops))
>>                  ops.qsort (sort_by_operand_rank);
>>
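
For readers who want to see the source-level effect of the transformation without building GCC, here is a minimal sketch, not part of the patch. It writes the XOR case from pr88497-5.c both ways by hand, using the GCC/Clang vector extension. The names v4su, xor_lanes_scalar, xor_lanes_undistributed and self_check are made up for illustration, and unsigned lanes are an assumption chosen so the two forms agree exactly; the floating-point tests in the patch rely on -ffast-math because reassociation can change rounding.

```c
#include <assert.h>

/* Four 32-bit lanes in a 128-bit vector (GCC/Clang vector extension).
   The type name is hypothetical, chosen for this sketch only.  */
typedef unsigned int v4su __attribute__ ((vector_size (16)));

/* The chain as a user would write it: every lane of every vector is
   extracted separately, i.e. sixteen element extractions.  */
static unsigned int
xor_lanes_scalar (unsigned int acc, v4su v1, v4su v2, v4su v3, v4su v4)
{
  acc ^= v1[0] ^ v1[1] ^ v1[2] ^ v1[3];
  acc ^= v2[0] ^ v2[1] ^ v2[2] ^ v2[3];
  acc ^= v3[0] ^ v3[1] ^ v3[2] ^ v3[3];
  acc ^= v4[0] ^ v4[1] ^ v4[2] ^ v4[3];
  return acc;
}

/* The undistributed form: combine the whole vectors first, then
   extract each lane of the result once, i.e. four extractions.  */
static unsigned int
xor_lanes_undistributed (unsigned int acc, v4su v1, v4su v2, v4su v3, v4su v4)
{
  v4su t = v1 ^ v2 ^ v3 ^ v4;
  return acc ^ t[0] ^ t[1] ^ t[2] ^ t[3];
}

/* Sanity check on one arbitrary input: XOR is associative and
   commutative, so both forms must produce the same value.  */
static void
self_check (void)
{
  v4su v1 = { 0x1u, 0x20u, 0x300u, 0x4000u };
  v4su v2 = { 0xffu, 0x0u, 0x55u, 0xaau };
  v4su v3 = { 7u, 11u, 13u, 17u };
  v4su v4 = { 0xdeadbeefu, 0x12345678u, 0x0u, 0xffffffffu };

  assert (xor_lanes_scalar (0x5u, v1, v2, v3, v4)
          == xor_lanes_undistributed (0x5u, v1, v2, v3, v4));
}
```

Calling self_check () from a small driver should pass silently. The same pattern carries over to &, | and unsigned +, which are the other integer opcodes the patch accepts.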