Message-ID: <57781582-bb00-75e5-bb1b-bfb3c7a4c019@linux.ibm.com>
Date: Thu, 9 May 2024 13:44:13 +0800
Subject: Re: [PATCH] rs6000: Enable overlapped by-pieces operations
To: HAO CHEN GUI
Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches
References: <5c5c2878-6828-42aa-8cfb-2778aea8050b@linux.ibm.com>
From: "Kewen.Lin"
In-Reply-To: <5c5c2878-6828-42aa-8cfb-2778aea8050b@linux.ibm.com>

Hi,

on 2024/5/8 14:47, HAO CHEN GUI wrote:
> Hi,
> This patch enables overlapped by-piece operations. On rs6000, default
> move/set/clear ratio is 2. So the overlap is only enabled with compare
> by-pieces.

Thanks for enabling this. Did you evaluate whether it helps any benchmark?

>
> Bootstrapped and tested on powerpc64-linux BE and LE with no
> regressions. Is it OK for the trunk?
>
> Thanks
> Gui Haochen
>
> ChangeLog
> rs6000: Enable overlapped by-pieces operations
>
> This patch enables overlapped by-piece operations by defining
> TARGET_OVERLAP_OP_BY_PIECES_P to true. On rs6000, default move/set/clear
> ratio is 2. So the overlap is only enabled with compare by-pieces.
>
> gcc/
> 	* config/rs6000/rs6000.cc (TARGET_OVERLAP_OP_BY_PIECES_P): Define.
>
> gcc/testsuite/
> 	* gcc.target/powerpc/block-cmp-9.c: New.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 6b9a40fcc66..2b5f5cf1d86 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1774,6 +1774,9 @@ static const scoped_attribute_specs *const rs6000_attribute_table[] =
>  #undef TARGET_CONST_ANCHOR
>  #define TARGET_CONST_ANCHOR 0x8000
>
> +#undef TARGET_OVERLAP_OP_BY_PIECES_P
> +#define TARGET_OVERLAP_OP_BY_PIECES_P hook_bool_void_true
> +
>
>
>  /* Processor table. */
> diff --git a/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c b/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c
> new file mode 100644
> index 00000000000..b5f51affbb7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */

Why does it need power8 forced here?

BR,
Kewen

> +/* { dg-final { scan-assembler-not {\ml[hb]z\M} } } */
> +
> +/* Test if by-piece overlap compare is enabled and following case is
> +   implemented by two overlap word loads and compares. */
> +
> +int foo (const char* s1, const char* s2)
> +{
> +  return __builtin_memcmp (s1, s2, 7) == 0;
> +}
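
For readers following the thread, the "two overlap word loads and compares" mentioned in the test comment can be sketched by hand in C. This is only an illustration of the idea, not what GCC actually emits; the helper names load32 and eq7_overlap are made up for this sketch, and it assumes a 4-byte "word" and an equality-only compare like the one in foo above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes from P with no alignment
   assumption.  memcpy keeps the access well-defined in C.  */
static uint32_t
load32 (const char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Hand-written equivalent of __builtin_memcmp (s1, s2, 7) == 0
   using two overlapping 4-byte loads per buffer: one covers
   bytes 0..3, the other bytes 3..6.  Byte 3 is compared twice,
   which is harmless for an equality test.  */
static int
eq7_overlap (const char *s1, const char *s2)
{
  return load32 (s1) == load32 (s2)
	 && load32 (s1 + 3) == load32 (s2 + 3);
}
```

Without overlap, the same 7 bytes would be split into 4 + 2 + 1 byte pieces, i.e. the halfword and byte loads that the scan-assembler-not pattern for lhz/lbz rules out.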