From mboxrd@z Thu Jan 1 00:00:00 1970
From: Florian Weimer
To: libc-alpha@sourceware.org
Cc: tuliom@linux.ibm.com, rzinsly@linux.ibm.com, anton@ozlabs.org
Subject: Re: [PATCH v2] powerpc64le: Optimize memset for POWER10
References: <20210429234542.aeer5ncryowx34gs@work-tp> <87sg38cr1h.fsf@oldenburg.str.redhat.com> <20210430121633.kqfheh2tidgyiyap@work-tp> <87zgxg55fz.fsf@oldenburg.str.redhat.com>
Date: Fri, 30 Apr 2021 14:23:23 +0200
In-Reply-To: <87zgxg55fz.fsf@oldenburg.str.redhat.com> (Florian Weimer's message of "Fri, 30 Apr 2021 14:21:04 +0200")
Message-ID: <87o8dw55c4.fsf@oldenburg.str.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain

* Florian Weimer:

> * Raoni Fassina Firmino:
>
>> On Fri, Apr 30, 2021 at 06:52:42AM +0200, AL glibc-alpha wrote:
>>> * Raoni Fassina Firmino via Libc-alpha:
>>>
>>> > +L(dcbz_loop):
>>> > +	/* Sets 512 bytes to zero in each iteration, the loop unrolling shows
>>> > +	   a throughput boost for large sizes (2048 bytes or higher).  */
>>> > +	dcbz	0,r6
>>> > +	dcbz	r9,r6
>>> > +	dcbz	r10,r6
>>> > +	dcbz	r11,r6
>>> > +	addi	r6,r6,512
>>> > +	bdnz	L(dcbz_loop)
>>>
>>> > +# ifdef __LITTLE_ENDIAN__
>>> > +	  (hwcap2 & (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_HAS_ISEL)
>>> > +	   && hwcap & PPC_FEATURE_HAS_VSX)
>>> > +	  ? __memset_power10 :
>>> > +# endif
>>>
>>> Should the IFUNC resolver check that the cache line size is 128 bytes?
>>
>> I'm not sure, this part was taken from power8 version and for that it
>> does not check the cache line size.  In fact I looked all other memset
>> versions (power7, power6, power4 and ppc), and only the ppc version does
>> not assume a 128 bytes cache line.  And none has this check (and all of
>> them uses dcbz).  So I don't know.
>
> Hmm, you are right, I think we had a discussion about this already (in
> the POWER8 context) because the string function breaks if the cache line
> size (as emulated by QEMU, if I recall correctly) are unexpected.

And I should have said that this probably means that this isn't relevant
to the POWER10 variant, either, as you suggested.

Florian
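
For illustration only, here is a minimal user-space sketch of the kind of
cache line size check discussed above.  It is not what the patch or the
existing glibc resolvers do (per the thread, none of the POWER variants has
such a check).  The helper names and the two constants are hypothetical; the
only real interfaces used are getauxval and AT_DCACHEBSIZE, and the sketch
assumes the kernel reports the data cache block size through that auxiliary
vector entry (getauxval returns 0 when the entry is absent).

```c
#include <assert.h>
#include <elf.h>        /* AT_DCACHEBSIZE */
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval */

/* Assumption: the unrolled loop hard-codes 128-byte cache lines and
   issues four dcbz instructions per iteration (4 * 128 = 512 bytes),
   as the quoted loop comment implies.  */
enum { ASSUMED_LINE_SIZE = 128, DCBZ_PER_ITERATION = 4 };

/* Hypothetical helper: non-zero if the reported data cache block size
   matches what the dcbz loop assumes.  */
static int
dcbz_line_size_ok (unsigned long reported)
{
  return reported == ASSUMED_LINE_SIZE;
}

/* Hypothetical helper: bytes cleared by one loop iteration for a given
   cache line size.  */
static size_t
dcbz_bytes_per_iteration (unsigned long line_size)
{
  assert (line_size != 0);
  return (size_t) line_size * DCBZ_PER_ITERATION;
}

/* Data cache block size as reported in the auxiliary vector, or 0 if
   the kernel did not provide the entry.  */
static unsigned long
reported_dcache_block_size (void)
{
  return getauxval (AT_DCACHEBSIZE);
}
```

A caller could combine these as
dcbz_line_size_ok (reported_dcache_block_size ()) and fall back to a variant
without dcbz when the result is zero.  Whether that is worth doing in the
resolver is exactly the open question in the thread.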