From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x542.google.com (mail-ed1-x542.google.com [IPv6:2a00:1450:4864:20::542]) by sourceware.org (Postfix) with ESMTPS id 963FC386F81F; Fri, 4 Sep 2020 07:23:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 963FC386F81F Received: by mail-ed1-x542.google.com with SMTP id t16so334899edw.7; Fri, 04 Sep 2020 00:23:27 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=gxTD/2b8NkuYua0fb5luTJel7j1zQ6ePw5j+uDrZCzs=; b=cEvryv01j4DOoA6kK7dktO404izk3IdipN2r4u+2hdH0CRYDyMUZ29GtUEgjhbwXCr MdpEUA1YD59I2NbBJzqldJijxPJaIpphLfFeq9m/d67d4uLFqVPdkns+nXJJnQEQfyYX T4JgOJCVkOfh98SqvcdZfsLjVOVCvrDAjy1PNFvtk6/5AqmOrN+09HxVSmDD0+FTVtN9 oVyuEK7tSgZaXH6xvpii1VAMHbkPT+G28JzOFqvPxKbPIi95yevt/B00Xh821xzPFs8A 4AjAI0kML3TodMZpAJH+T8vbtpLMkKCNpr6qNo0QidC4D0lsl2T0SrzAJFFAWG3mwgsE uDlA== X-Gm-Message-State: AOAM531KgDiDD987y8L0dgSql4JYI8Kviik8d27e7Q3s5x1cTXmJQfN2 edOGqf+0wNQ+S4xMGguyL+ybq8gSg66AJQzgxOE= X-Google-Smtp-Source: ABdhPJysLSB+E6NvT3jNDeLdFnxNP5B3YqR4DI5oUgeCz2KmtNkYvpLrqqttyOTRbO+P4e96DOkNhBA5Nn2YXUVnzcs= X-Received: by 2002:a50:d304:: with SMTP id g4mr6992721edh.248.1599204206661; Fri, 04 Sep 2020 00:23:26 -0700 (PDT) MIME-Version: 1.0 References: <20200831090647.152432-1-luoxhu@linux.ibm.com> <20200831170414.GJ28786@gate.crashing.org> <3aec8b6f-8cf5-bd48-eb4e-c7b82e88dcd7@linux.ibm.com> <53017f5c-d81b-6fe2-6c5d-1496249313f1@linux.ibm.com> <599b6eff-42fc-82ef-c822-f3502997798a@linux.ibm.com> In-Reply-To: From: Richard Biener Date: Fri, 4 Sep 2020 09:23:15 +0200 Message-ID: Subject: Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251] To: luoxhu Cc: GCC Patches , David Edelsohn , Bill Schmidt , linkw@gcc.gnu.org, Segher Boessenkool Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-7.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Sep 2020 07:23:29 -0000 On Fri, Sep 4, 2020 at 9:19 AM Richard Biener wrote: > > On Fri, Sep 4, 2020 at 8:38 AM luoxhu wrote: > > > > > > > > On 2020/9/4 14:16, luoxhu via Gcc-patches wrote: > > > Hi, > > > > > > > > > Yes, I checked and found that both vec_set and vec_extract doesn't support > > > variable index for most targets, store_bit_field_1 and extract_bit_field_1 > > > would only consider use optabs when index is integer value. Anyway, it > > > shouldn't be hard to extend depending on target requirements. > > > > > > Another problem is v[n&3]=i and vec_insert(v, i, n) are generating with > > > different gimple code: > > > > > > { > > > _1 = n & 3; > > > VIEW_CONVERT_EXPR(v1)[_1] = i; > > > } > > > > > > vs: > > > > > > { > > > __vector signed int v1; > > > __vector signed int D.3192; > > > long unsigned int _1; > > > long unsigned int _2; > > > int * _3; > > > > > > [local count: 1073741824]: > > > D.3192 = v_4(D); > > > _1 = n_7(D) & 3; > > > _2 = _1 * 4; > > > _3 = &D.3192 + _2; > > > *_3 = i_8(D); > > > v1_10 = D.3192; > > > return v1_10; > > > } > > > > Just realized use convert_vector_to_array_for_subscript would generate > > "VIEW_CONVERT_EXPR(v1)[_1] = i;" before produce those instructions, > > your confirmation and comments will be highly appreciated... Thanks in > > advance. :) > > I think what the GCC vector extensions produce is generally better > so wherever "code generation" for vec_insert resides it should be > adjusted to produce the same code. Same for vec_extract. Guess altivec.h, dispatching to __builtin_vec_insert. Wonder why it wasn't #define vec_insert(a,b,c) (a)[c]=(b) anyway, you obviously have some lowering of the builtin somewhere in rs6000.c and thus can adjust that. > > Xionghu > > > > > > > > If not use builtin for "vec_insert(v, i, n)", the pointer is "int*" instead > > > of vector type, will this be difficult for expander to capture so many > > > statements then call the optabs? So shall we still keep the builtin style > > > for "vec_insert(v, i, n)" and expand "v[n&3]=i" with optabs or expand both > > > with optabs??? > > > > > > Drafted a fast patch to expand "v[n&3]=i" with optabs as below, sorry that not > > > using existed vec_set yet as not quite sure, together with the first patch, both > > > cases could be handled as expected: > > > > > > > > > [PATCH] Expander: expand VIEW_CONVERT_EXPR to vec_insert with variable index > > > > > > v[n%4] = i has same semantic with vec_insert (i, v, n), but it will be > > > optimized to "VIEW_CONVERT_EXPR(v1)[_1] = i;" in gimple, this > > > patch tries to recognize the pattern in expander and use optabs to > > > expand it to fast instructions like vec_insert: lvsl+xxperm+xxsel. > > > > > > gcc/ChangeLog: > > > > > > * config/rs6000/vector.md: > > > * expr.c (expand_assignment): > > > * optabs.def (OPTAB_CD): > > > --- > > > gcc/config/rs6000/vector.md | 13 +++++++++++ > > > gcc/expr.c | 46 +++++++++++++++++++++++++++++++++++++ > > > gcc/optabs.def | 1 + > > > 3 files changed, 60 insertions(+) > > > > > > diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md > > > index 796345c80d3..46d21271e17 100644 > > > --- a/gcc/config/rs6000/vector.md > > > +++ b/gcc/config/rs6000/vector.md > > > @@ -1244,6 +1244,19 @@ (define_expand "vec_extract" > > > DONE; > > > }) > > > > > > +(define_expand "vec_insert" > > > + [(match_operand:VEC_E 0 "vlogical_operand") > > > + (match_operand: 1 "register_operand") > > > + (match_operand 2 "register_operand")] > > > + "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)" > > > +{ > > > + rtx target = gen_reg_rtx (V16QImode); > > > + rs6000_expand_vector_insert (target, operands[0], operands[1], operands[2]); > > > + rtx sub_target = simplify_gen_subreg (GET_MODE(operands[0]), target, V16QImode, 0); > > > + emit_insn (gen_rtx_SET (operands[0], sub_target)); > > > + DONE; > > > +}) > > > + > > > ;; Convert double word types to single word types > > > (define_expand "vec_pack_trunc_v2df" > > > [(match_operand:V4SF 0 "vfloat_operand") > > > diff --git a/gcc/expr.c b/gcc/expr.c > > > index dd2200ddea8..ce2890c1a2d 100644 > > > --- a/gcc/expr.c > > > +++ b/gcc/expr.c > > > @@ -5237,6 +5237,52 @@ expand_assignment (tree to, tree from, bool nontemporal) > > > > > > to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE); > > > > > > + tree type = TREE_TYPE (to); > > > + if (TREE_CODE (to) == ARRAY_REF && tree_fits_uhwi_p (TYPE_SIZE (type)) > > > + && tree_fits_uhwi_p (TYPE_SIZE_UNIT (type)) > > > + && tree_to_uhwi (TYPE_SIZE (type)) > > > + * tree_to_uhwi (TYPE_SIZE_UNIT (type)) > > > + == 128) > > > + { > > > + tree op0 = TREE_OPERAND (to, 0); > > > + tree op1 = TREE_OPERAND (to, 1); > > > + if (TREE_CODE (op0) == VIEW_CONVERT_EXPR) > > > + { > > > + tree view_op0 = TREE_OPERAND (op0, 0); > > > + mode = TYPE_MODE (TREE_TYPE (view_op0)); > > > + if (TREE_CODE (TREE_TYPE (view_op0)) == VECTOR_TYPE) > > > + { > > > + rtx value > > > + = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL); > > > + rtx pos > > > + = expand_expr (op1, NULL_RTX, VOIDmode, EXPAND_NORMAL); > > > + rtx temp_target = gen_reg_rtx (mode); > > > + emit_move_insn (temp_target, to_rtx); > > > + > > > + machine_mode outermode = mode; > > > + scalar_mode innermode = GET_MODE_INNER (outermode); > > > + class expand_operand ops[3]; > > > + enum insn_code icode > > > + = convert_optab_handler (vec_insert_optab, innermode, > > > + outermode); > > > + > > > + if (icode != CODE_FOR_nothing) > > > + { > > > + pos = convert_to_mode (E_SImode, pos, 0); > > > + > > > + create_fixed_operand (&ops[0], temp_target); > > > + create_input_operand (&ops[1], value, innermode); > > > + create_input_operand (&ops[2], pos, GET_MODE (pos)); > > > + if (maybe_expand_insn (icode, 3, ops)) > > > + { > > > + emit_move_insn (to_rtx, temp_target); > > > + pop_temp_slots (); > > > + return; > > > + } > > > + } > > > + } > > > + } > > > + } > > > /* If the field has a mode, we want to access it in the > > > field's mode, not the computed mode. > > > If a MEM has VOIDmode (external with incomplete type), > > > diff --git a/gcc/optabs.def b/gcc/optabs.def > > > index 78409aa1453..21b163a969e 100644 > > > --- a/gcc/optabs.def > > > +++ b/gcc/optabs.def > > > @@ -96,6 +96,7 @@ OPTAB_CD(mask_gather_load_optab, "mask_gather_load$a$b") > > > OPTAB_CD(scatter_store_optab, "scatter_store$a$b") > > > OPTAB_CD(mask_scatter_store_optab, "mask_scatter_store$a$b") > > > OPTAB_CD(vec_extract_optab, "vec_extract$a$b") > > > +OPTAB_CD(vec_insert_optab, "vec_insert$a$b") > > > OPTAB_CD(vec_init_optab, "vec_init$a$b") > > > > > > OPTAB_CD (while_ult_optab, "while_ult$a$b") > > >