From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 6BC5C3857816 for ; Tue, 22 Jun 2021 10:56:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6BC5C3857816 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id F056611D4; Tue, 22 Jun 2021 03:56:21 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3F4F63F694; Tue, 22 Jun 2021 03:56:21 -0700 (PDT) From: Richard Sandiford To: Tamar Christina Mail-Followup-To: Tamar Christina , Richard Biener , nd , "gcc-patches\@gcc.gnu.org" , richard.sandiford@arm.com Cc: Richard Biener , nd , "gcc-patches\@gcc.gnu.org" Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes. References: <7q3oonr2-92r0-8o9q-s27q-9r735s4n3s3@fhfr.qr> Date: Tue, 22 Jun 2021 11:56:20 +0100 In-Reply-To: (Tamar Christina's message of "Mon, 14 Jun 2021 12:06:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-6.5 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2021 10:56:24 -0000 Sorry for the slow review. Just concentrating on tree-vect-patterns.c, as before: Tamar Christina writes: > @@ -521,6 +522,9 @@ vect_joust_widened_type (tree type, tree new_type, tr= ee *common_type) > unsigned int precision =3D MAX (TYPE_PRECISION (*common_type), > TYPE_PRECISION (new_type)); > precision *=3D 2; > + > + /* The resulting application is unsigned, check if we have enough > + precision to perform the operation. */ > if (precision * 2 > TYPE_PRECISION (type)) > return false; >=20=20 Not sure what the comment means by =E2=80=9Capplication=E2=80=9D here, but = the common type we pick is signed rather than unsigned. > @@ -546,7 +554,8 @@ static unsigned int > vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_cod= e code, > tree_code widened_code, bool shift_p, > unsigned int max_nops, > - vect_unpromoted_value *unprom, tree *common_type) > + vect_unpromoted_value *unprom, tree *common_type, > + enum optab_subtype *subtype =3D NULL) > { > /* Check for an integer operation with the right code. */ > gassign *assign =3D dyn_cast (stmt_info->stmt); > @@ -607,7 +616,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info = stmt_info, tree_code code, > =3D vinfo->lookup_def (this_unprom->op); > nops =3D vect_widened_op_tree (vinfo, def_stmt_info, code, > widened_code, shift_p, max_nops, > - this_unprom, common_type); > + this_unprom, common_type, > + subtype); > if (nops =3D=3D 0) > return 0; >=20=20 > @@ -625,7 +635,24 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info= stmt_info, tree_code code, > *common_type =3D this_unprom->type; > else if (!vect_joust_widened_type (type, this_unprom->type, > common_type)) > - return 0; > + { > + if (subtype) > + { AIUI, if we get here then: - there must be one unsigned operand (A) of precision P - there must be one signed operand (B) with precision <=3D P - we can't extend to precision 2*P=20 A conversion is needed if B's precision is < P. That conversion should be to a signed type with precision P. So=E2=80=A6 > + tree new_type =3D *common_type; > + /* See if we can sign extend the smaller type. */ > + if (TYPE_PRECISION (this_unprom->type) > TYPE_PRECISION (new_typ= e) > + && (TYPE_UNSIGNED (this_unprom->type) && !TYPE_UNSIGNED (new_type))) =E2=80=A6I think this second line could be an assert and > + new_type =3D build_nonstandard_integer_type (TYPE_PRECISION (this_unp= rom->type), true); =E2=80=A6picking an unsigned type here looks wrong. The net effect would be to convert B (the previous signed operand) to an unsigned type. > + > + if (tree_nop_conversion_p (this_unprom->type, new_type)) > + { > + *subtype =3D optab_vector_mixed_sign; > + *common_type =3D new_type; > + } IMO the sign of the common type shouldn't matter for optab_vector_mixed_sig= n: if we need to convert operands later, it should be to the precision of the common type but retaining the sign of the original type. So I think it would be simpler to do: if (TYPE_PRECISION (this_unprom->type) > TYPE_PRECISION (*common_type) *common_type =3D this_unprom->type; *subtype =3D optab_vector_mixed_sign; here and adjust the conversion code as described below. This also has the advantage of coping with > 2 operands, in case that ever becomes important in future. > @@ -806,12 +833,15 @@ vect_convert_input (vec_info *vinfo, stmt_vec_info = stmt_info, tree type, > } >=20=20 > /* Invoke vect_convert_input for N elements of UNPROM and store the > - result in the corresponding elements of RESULT. */ > + result in the corresponding elements of RESULT. > + > + If SUBTYPE then don't convert the types if they only > + differ by sign. */ >=20=20 > static void > vect_convert_inputs (vec_info *vinfo, stmt_vec_info stmt_info, unsigned = int n, > tree *result, tree type, vect_unpromoted_value *unprom, > - tree vectype) > + tree vectype, enum optab_subtype subtype =3D optab_default) > { > for (unsigned int i =3D 0; i < n; ++i) > { > @@ -819,8 +849,12 @@ vect_convert_inputs (vec_info *vinfo, stmt_vec_info = stmt_info, unsigned int n, > for (j =3D 0; j < i; ++j) > if (unprom[j].op =3D=3D unprom[i].op) > break; > + > if (j < i) > result[i] =3D result[j]; > + else if (subtype =3D=3D optab_vector_mixed_sign > + && tree_nop_conversion_p (type, unprom[i].type)) > + result[i] =3D unprom[i].op; > else > result[i] =3D vect_convert_input (vinfo, stmt_info, > type, &unprom[i], vectype); As noted above, I think we want to preserve the sign of the original type for optab_vector_mixed_sign, even if a conversion is needed. I think we should avoid the special case above and instead push subtype down into vect_convert_input. We can then adjust the type at the head of that function: if (subtype =3D=3D optab_vector_mixed_sign && TYPE_SIGN (type) !=3D TYPE_SIGN (TREE_TYPE (unprom->op))) type =3D build_nonstandard_integer_type (TYPE_PRECISION (type), TYPE_SIGN (this_unprom->type)); > @@ -895,7 +929,8 @@ vect_reassociating_reduction_p (vec_info *vinfo, >=20=20 > Try to find the following pattern: >=20=20 > - type x_t, y_t; > + type1a x_t > + type1b y_t; > TYPE1 prod; > TYPE2 sum =3D init; > loop: > @@ -908,8 +943,10 @@ vect_reassociating_reduction_p (vec_info *vinfo, > [S6 prod =3D (TYPE2) prod; #optional] > S7 sum_1 =3D prod + sum_0; >=20=20 > - where 'TYPE1' is exactly double the size of type 'type', and 'TYPE2' = is the > - same size of 'TYPE1' or bigger. This is a special case of a reduction > + where 'TYPE1' is exactly double the size of type 'type1a' and 'type1b= ', > + the sign of 'TYPE1' must be one of 'type1a' or 'type1b' but the sign = of > + 'type1a' and 'type1b' can differ. 'TYPE2' is the same size of 'TYPE1'= or > + bigger and must be the same sign. This is a special case of a reducti= on This last bit isn't true: TYPE2 is the type of the addition and can be any sign. > computation. >=20=20 > Input: > @@ -946,15 +983,15 @@ vect_recog_dot_prod_pattern (vec_info *vinfo, >=20=20 > /* Look for the following pattern > DX =3D (TYPE1) X; > - DY =3D (TYPE1) Y; > + DY =3D (TYPE1) Y; > DPROD =3D DX * DY; > - DDPROD =3D (TYPE2) DPROD; > + DDPROD =3D (TYPE2) DPROD; > sum_1 =3D DDPROD + sum_0; Spurious whitespace changes: would be better to tabify the whole thing or leave it as-is. > In which > - DX is double the size of X > - DY is double the size of Y > - DX, DY, DPROD all have the same type but the sign > - between DX, DY and DPROD can differ. > + between X, Y and DPROD can differ. > - sum is the same size of DPROD or bigger > - sum has been recognized as a reduction variable. >=20=20 > @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info *vinfo, > /* FORNOW. Can continue analyzing the def-use chain when this stmt in= a phi > inside the loop (in case we are analyzing an outer-loop). */ > vect_unpromoted_value unprom0[2]; > + enum optab_subtype subtype =3D optab_vector; > if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR, WIDEN_MULT_EX= PR, > - false, 2, unprom0, &half_type)) > + false, 2, unprom0, &half_type, &subtype)) > + return NULL; > + > + if (subtype =3D=3D optab_vector_mixed_sign > + && TYPE_UNSIGNED (unprom_mult.type) > + && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION (unprom_mult.ty= pe)) > return NULL; Isn't the final condition here instead that TYPE1 is narrower than TYPE2? I.e. we need to reject the case in which we multiply a signed and an unsigned value to get a (logically) signed result, but then zero-extend it (rather than sign-extend it) to the precision of the addition. That would make the test: if (subtype =3D=3D optab_vector_mixed_sign && TYPE_UNSIGNED (unprom_mult.type) && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION (type)) return NULL;=20=20=20=20 =20=20 instead. Thanks, Richard