Subject: Re: [PATCH 1/4] rs6000: Add support for SSE4.1 "test" intrinsics
From: Bill Schmidt
Reply-To: wschmidt@linux.ibm.com
To: "Paul A. Clarke", gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org
Date: Sun, 11 Jul 2021 10:45:45 -0500
Message-ID: <8f826ec8-fd50-5deb-49e1-d2891b867d91@linux.ibm.com>
In-Reply-To: <20210629180859.1235662-2-pc@us.ibm.com>
References: <20210629180859.1235662-1-pc@us.ibm.com> <20210629180859.1235662-2-pc@us.ibm.com>

Hi Paul,

On 6/29/21 1:08 PM, Paul A. Clarke via Gcc-patches wrote:
> 2021-06-29  Paul A. Clarke
>
> gcc/ChangeLog:
> 	* config/rs6000/smmintrin.h (_mm_testz_si128, _mm_testc_si128,
> 	_mm_testnzc_si128, _mm_test_all_ones, _mm_test_all_zeros,
> 	_mm_test_mix_ones_zeros): New.
> ---
>  gcc/config/rs6000/smmintrin.h | 50 +++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
> index bdf6eb365d88..1b8cad135ed0 100644
> --- a/gcc/config/rs6000/smmintrin.h
> +++ b/gcc/config/rs6000/smmintrin.h
> @@ -116,4 +116,54 @@ _mm_blendv_epi8 (__m128i __A, __m128i __B, __m128i __mask)
>    return (__m128i) vec_sel ((__v16qu) __A, (__v16qu) __B, __lmask);
>  }
>
> +extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))

Line too long, please fix here and below. (Existing cases can be left.)

> +_mm_testz_si128 (__m128i __A, __m128i __B)
> +{
> +  /* Note: This implementation does NOT set "zero" or "carry" flags.  */

This is reasonable; thanks for documenting.

LGTM; I can't approve, but recommend approval with line lengths fixed.
Thanks!

Bill

> +  const __v16qu __zero = {0};
> +  return vec_all_eq (vec_and ((__v16qu) __A, (__v16qu) __B), __zero);
> +}
> +
> +extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_testc_si128 (__m128i __A, __m128i __B)
> +{
> +  /* Note: This implementation does NOT set "zero" or "carry" flags.  */
> +  const __v16qu __zero = {0};
> +  const __v16qu __notA = vec_nor ((__v16qu) __A, (__v16qu) __A);
> +  return vec_all_eq (vec_and ((__v16qu) __notA, (__v16qu) __B), __zero);
> +}
> +
> +extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_testnzc_si128 (__m128i __A, __m128i __B)
> +{
> +  /* Note: This implementation does NOT set "zero" or "carry" flags.  */
> +  return _mm_testz_si128 (__A, __B) == 0 && _mm_testc_si128 (__A, __B) == 0;
> +}
> +
> +extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_test_all_zeros (__m128i __A, __m128i __mask)
> +{
> +  const __v16qu __zero = {0};
> +  return vec_all_eq (vec_and ((__v16qu) __A, (__v16qu) __mask), __zero);
> +}
> +
> +extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_test_all_ones (__m128i __A)
> +{
> +  const __v16qu __ones = vec_splats ((unsigned char) 0xff);
> +  return vec_all_eq ((__v16qu) __A, __ones);
> +}
> +
> +extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_test_mix_ones_zeros (__m128i __A, __m128i __mask)
> +{
> +  const __v16qu __zero = {0};
> +  const __v16qu __Amasked = vec_and ((__v16qu) __A, (__v16qu) __mask);
> +  const int any_ones = vec_any_ne (__Amasked, __zero);
> +  const __v16qu __notA = vec_nor ((__v16qu) __A, (__v16qu) __A);
> +  const __v16qu __notAmasked = vec_and ((__v16qu) __notA, (__v16qu) __mask);
> +  const int any_zeros = vec_any_ne (__notAmasked, __zero);
> +  return any_ones * any_zeros;
> +}
> +
>  #endif
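
For anyone following along without a Power machine handy, the values the quoted wrappers return can be modelled in plain scalar C. This is only a rough sketch, not part of the patch: it assumes the 128-bit operand is split into two 64-bit halves, and the type ref_u128 and the ref_* function names are made up here for illustration. It mirrors the return values only; like the patch, it sets no processor flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector: two 64-bit halves.  */
typedef struct { uint64_t lo, hi; } ref_u128;

/* Mirrors _mm_testz_si128: 1 when (a & b) has no bits set.  */
static int
ref_testz (ref_u128 a, ref_u128 b)
{
  return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}

/* Mirrors _mm_testc_si128: 1 when (~a & b) has no bits set.  */
static int
ref_testc (ref_u128 a, ref_u128 b)
{
  return ((~a.lo & b.lo) | (~a.hi & b.hi)) == 0;
}

/* Mirrors _mm_testnzc_si128: 1 only when both results above are 0,
   i.e. b selects at least one set bit and one clear bit of a.  */
static int
ref_testnzc (ref_u128 a, ref_u128 b)
{
  return !ref_testz (a, b) && !ref_testc (a, b);
}

/* Small worked cases against a = 0x00ff in the low half.  */
static void
ref_selfcheck (void)
{
  const ref_u128 a = { 0x00ff, 0 };
  const ref_u128 disjoint = { 0xff00, 0 };  /* Selects only clear bits of a.  */
  const ref_u128 partial = { 0x0ff0, 0 };   /* Selects set and clear bits.  */
  const ref_u128 subset = { 0x000f, 0 };    /* Selects only set bits of a.  */

  assert (ref_testz (a, disjoint) == 1 && ref_testc (a, disjoint) == 0);
  assert (ref_testz (a, partial) == 0 && ref_testc (a, partial) == 0);
  assert (ref_testz (a, subset) == 0 && ref_testc (a, subset) == 1);
  assert (ref_testnzc (a, partial) == 1);
  assert (ref_testnzc (a, disjoint) == 0 && ref_testnzc (a, subset) == 0);
}
```

In the same model, _mm_test_all_zeros (a, mask) corresponds to ref_testz (a, mask), _mm_test_all_ones (a) to ref_testc (a, all_ones) with every bit of all_ones set, and _mm_test_mix_ones_zeros (a, mask) to ref_testnzc (a, mask).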