Subject: Re: [PATCH v1] LoongArch: Adjust cost of vector_stmt that match multiply-add pattern.
To: Li Wei, gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, xuchenghua@loongson.cn
References: <20240124093615.4137594-1-liwei@loongson.cn>
From: chenglulu
Message-ID: <393d0697-f542-f061-ffa5-6116c8a2d816@loongson.cn>
Date: Fri, 26 Jan 2024 16:09:42 +0800
In-Reply-To: <20240124093615.4137594-1-liwei@loongson.cn>

On 2024/1/24 at 5:36 PM, Li Wei wrote:
> We found that when only 128-bit vectorization was enabled, 549.fotonik3d_r
> failed to vectorize effectively. For this reason, we adjust the cost of
> 128-bit vector_stmt that match the multiply-add pattern to facilitate 128-bit
> vectorization.
> The experimental results show that after the modification, 549.fotonik3d_r
> performance can be improved by 9.77% under the 128-bit vectorization option.
>
> gcc/ChangeLog:
>
>     * config/loongarch/loongarch.cc (loongarch_multiply_add_p): New.
>     (loongarch_vector_costs::add_stmt_cost): Adjust.
>
> gcc/testsuite/ChangeLog:
>
>     * gfortran.dg/vect/vect-10.f90: New test.
> ---
>  gcc/config/loongarch/loongarch.cc          | 42 +++++++++++++
>  gcc/testsuite/gfortran.dg/vect/vect-10.f90 | 71 ++++++++++++++++++++++
>  2 files changed, 113 insertions(+)
>  create mode 100644 gcc/testsuite/gfortran.dg/vect/vect-10.f90
>
> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> index 072c68d97e3..32a0b6f43e8 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -4096,6 +4096,36 @@ loongarch_vector_costs::determine_suggested_unroll_factor (loop_vec_info loop_vi
>    return 1 << ceil_log2 (uf);
>  }
>
> +static bool
> +loongarch_multiply_add_p (vec_info *vinfo, stmt_vec_info stmt_info)
> +{
> +  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
> +  if (!assign)
> +    return false;
> +  tree_code code = gimple_assign_rhs_code (assign);
> +  if (code != PLUS_EXPR && code != MINUS_EXPR)
> +    return false;
> +
> +  auto is_mul_result = [&](int i)
> +    {
> +      tree rhs = gimple_op (assign, i);
> +      if (TREE_CODE (rhs) != SSA_NAME)
> +        return false;
> +
> +      stmt_vec_info def_stmt_info = vinfo->lookup_def (rhs);
> +      if (!def_stmt_info
> +          || STMT_VINFO_DEF_TYPE (def_stmt_info) != vect_internal_def)
> +        return false;
> +      gassign *rhs_assign = dyn_cast <gassign *> (def_stmt_info->stmt);
> +      if (!rhs_assign || gimple_assign_rhs_code (rhs_assign) != MULT_EXPR)
> +        return false;
> +
> +      return true;
> +    };
> +
> +  return is_mul_result (1) || is_mul_result (2);
> +}
> +
>  unsigned
>  loongarch_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>                                         stmt_vec_info stmt_info, slp_tree,
> @@ -4108,6 +4138,18 @@ loongarch_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>      {
>        int stmt_cost = loongarch_builtin_vectorization_cost (kind, vectype,
>                                                              misalign);
> +      if (vectype && stmt_info)
> +        {
> +          gassign *assign = dyn_cast <gassign *> (STMT_VINFO_STMT (stmt_info));
> +          machine_mode mode = TYPE_MODE (vectype);

Hi, Liwei:

I think the code here needs to be commented.

Thanks.

> +          if (kind == vector_stmt && GET_MODE_SIZE (mode) == 16 && assign)
> +            {
> +              if (!vect_is_reduction (stmt_info)
> +                  && loongarch_multiply_add_p (m_vinfo, stmt_info))
> +                stmt_cost = 0;
> +            }
> +        }
> +
>        retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost);
>        m_costs[where] += retval;
>
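
For anyone following the thread, the condition in the quoted hunk can be pictured with a small standalone model. This is only a sketch: Op, Stmt, multiply_add_p and vector_stmt_cost are hypothetical names made up for illustration, not GCC types or functions, and the model leaves out the kind == vector_stmt and vect_internal_def checks. The reading that the add is free because it can presumably be folded into a fused multiply-add instruction is an assumption; the patch description only says the statement matches the multiply-add pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for GIMPLE statements; none of these are GCC types.
enum class Op { Load, Mult, Plus, Minus };

struct Stmt {
  Op op;
  int def1;  // index of the statement defining operand 1, or -1 if none
  int def2;  // index of the statement defining operand 2, or -1 if none
};

// Follows the shape of loongarch_multiply_add_p: an add or subtract whose
// first or second operand is produced by a multiply.
bool multiply_add_p(const std::vector<Stmt>& stmts, std::size_t idx) {
  assert(idx < stmts.size());
  const Stmt& s = stmts[idx];
  if (s.op != Op::Plus && s.op != Op::Minus)
    return false;

  auto is_mul_result = [&](int def) {
    return def >= 0 && stmts[static_cast<std::size_t>(def)].op == Op::Mult;
  };
  return is_mul_result(s.def1) || is_mul_result(s.def2);
}

// Follows the added branch in add_stmt_cost: only 16-byte (128-bit) vector
// statements that are not reductions get their cost set to zero.
int vector_stmt_cost(const std::vector<Stmt>& stmts, std::size_t idx,
                     int vector_bytes, bool is_reduction, int base_cost) {
  if (vector_bytes == 16 && !is_reduction && multiply_add_p(stmts, idx))
    return 0;
  return base_cost;
}
```

With statements modelling t = a * b; r = t + c, the add is costed at 0 for 16-byte vectors and keeps its base cost for 32-byte vectors or when it is part of a reduction.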