Date: Thu, 3 Jun 2021 12:59:17 -0500
From: "Paul A. Clarke"
To: Segher Boessenkool
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 1/2] rs6000: Add support for _mm_minpos_epu16
Message-ID: <20210603175917.GA7094@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com>
References: <20210602221316.202627-1-pc@us.ibm.com> <20210602221316.202627-2-pc@us.ibm.com> <20210603002735.GO18427@gate.crashing.org>
In-Reply-To: <20210603002735.GO18427@gate.crashing.org>

On Wed, Jun 02, 2021 at 07:27:35PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 02, 2021 at 05:13:15PM -0500, Paul A. Clarke wrote:
> > Add a naive implementation of the subject x86 intrinsic to
> > ease porting.
> >
> > +/* Return horizontal packed word minimum and its index in bits [15:0]
> > +   and bits [18:16] respectively.  */
> > +extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > +_mm_minpos_epu16 (__m128i __A)
> > +{
> > +  union __u
> > +    {
> > +      __m128i __m;
> > +      __v8hu __uh;
> > +    };
> > +  union __u __u = { .__m = __A }, __r = { .__m = {0} };
> > +  unsigned short __ridx = 0;
> > +  unsigned short __rmin = __u.__uh[__ridx];
> > +  for (unsigned long __i = __ridx+1;
>
> (spaces around the "+"?)

ok

>
> > +       __i < sizeof (__u.__uh) / sizeof (__u.__uh[0]);
>
> You should either use a macro for that, or just write "8" :-)

ok.  (There should be a standard thing for this operation.)

> > +       __i++)
> > +    {
> > +      if (__u.__uh[__i] < __rmin)
> > +        {
> > +          __rmin = __u.__uh[__i];
> > +          __ridx = __i;
> > +        }
> > +    }
> > +  __r.__uh[0] = __rmin;
> > +  __r.__uh[1] = __ridx;
> > +  return __r.__m;
> > +}
>
> This does not compute the index correctly for big endian (it needs to
> walk from right to left for that).  The construction of the return value
> looks wrong as well.
>
> Okay for trunk with that fixed.  Thanks!

I'm not seeing the issue here.  The values are numbered by element order,
and the results are in the "first" (minimum value) and "second" (index of
first encountered minimum value in element order) elements of the result.

PC
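
To make the element-order reading above concrete, here is a minimal scalar
sketch in plain C.  It is an illustration only, not the code from the patch:
minpos_result, minpos_u16 and minpos_selfcheck are hypothetical names, and the
vector is modelled as an array of eight uint16_t indexed in element order.  It
deliberately says nothing about how elements map onto register bits on big
endian, which is the question raised in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the minimum value and the element index
   at which it was first encountered.  */
typedef struct
{
  uint16_t min;
  uint16_t idx;
} minpos_result;

/* Scalar model (an assumption, not the rs6000 header code): walk the
   eight elements in element order and keep the first minimum seen.  */
static minpos_result
minpos_u16 (const uint16_t v[8])
{
  minpos_result r = { v[0], 0 };
  for (size_t i = 1; i < 8; i++)
    if (v[i] < r.min)	/* Strict "<" keeps the earliest index on ties.  */
      {
	r.min = v[i];
	r.idx = (uint16_t) i;
      }
  return r;
}

/* Small self-check: the minimum 4 occurs at elements 1, 3 and 7;
   the first occurrence in element order is reported.  */
static void
minpos_selfcheck (void)
{
  const uint16_t v[8] = { 9, 4, 7, 4, 8, 6, 5, 4 };
  minpos_result r = minpos_u16 (v);
  assert (r.min == 4);
  assert (r.idx == 1);
}
```

In this model the two results correspond to what the patch stores in
__r.__uh[0] and __r.__uh[1]; whether those elements land in bits [15:0] and
[18:16] on every endianness is exactly the open point in the thread.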