From: Adhemerval Zanella
To: Noah Goldstein, libc-alpha@sourceware.org
Cc: Alexander Monakov
Subject: Re: [PATCH v5 6/6] elf: Optimize _dl_new_hash in dl-new-hash.h
Date: Tue, 10 May 2022 08:58:57 -0300
Message-ID: <1efb0080-f42e-1fd3-f7ff-c7beea2dfff8@linaro.org>
In-Reply-To: <20220509171747.4153703-6-goldstein.w.n@gmail.com>
References: <20220414041231.926415-1-goldstein.w.n@gmail.com> <20220509171747.4153703-1-goldstein.w.n@gmail.com> <20220509171747.4153703-6-goldstein.w.n@gmail.com>

On 09/05/2022 14:17, Noah Goldstein via Libc-alpha wrote:
> Unroll slightly and enforce good instruction scheduling. This improves
> performance on out-of-order machines. Note the unrolling allows
> for pipelined multiplies which helps a bit, but most of the gain
> is from enforcing better instruction scheduling for more ILP.
> Unrolling further started to induce slowdowns for sizes [0, 4]
> but can help the loop so if larger sizes are the target further
> unrolling can be beneficial.
>
> Results for _dl_new_hash
> Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
>
> Time as Geometric Mean of N=25 runs
> Geometric of all benchmark New / Old: 0.791
> type, length, New Time, Old Time, New Time / Old Time
> fixed, 0, 0.641, 0.658, 0.974
> fixed, 1, 1.888, 1.883, 1.003
> fixed, 2, 2.712, 2.833, 0.957
> fixed, 3, 3.314, 3.739, 0.886
> fixed, 4, 4.316, 4.866, 0.887
> fixed, 5, 5.16, 5.966, 0.865
> fixed, 6, 5.986, 7.241, 0.827
> fixed, 7, 7.264, 8.435, 0.861
> fixed, 8, 8.052, 9.846, 0.818
> fixed, 9, 9.369, 11.316, 0.828
> fixed, 10, 10.256, 12.925, 0.794
> fixed, 11, 12.191, 14.546, 0.838
> fixed, 12, 12.667, 15.92, 0.796
> fixed, 13, 14.442, 17.465, 0.827
> fixed, 14, 14.808, 18.981, 0.78
> fixed, 15, 16.244, 20.565, 0.79
> fixed, 16, 17.166, 22.044, 0.779
> fixed, 32, 35.447, 50.558, 0.701
> fixed, 64, 86.479, 134.529, 0.643
> fixed, 128, 155.453, 287.527, 0.541
> fixed, 256, 302.57, 593.64, 0.51
> random, 2, 11.168, 10.61, 1.053
> random, 4, 13.308, 13.53, 0.984
> random, 8, 16.579, 19.437, 0.853
> random, 16, 21.292, 24.776, 0.859
> random, 32, 30.56, 35.906, 0.851
> random, 64, 49.249, 68.577, 0.718
> random, 128, 81.845, 140.664, 0.582
> random, 256, 152.517, 292.204, 0.522
>
> Co-authored-by: Alexander Monakov

Buildbot failed to build it [1]:

make[2]: Entering directory '/glibc/elf'
gcc -m32 dl-lookup.c -c -std=gnu11 -fgnu89-inline -g -O2 -Wall -Wwrite-strings -Wundef -Werror -fmerge-all-constants -frounding-math -fno-stack-protector -fno-common -Wstrict-prototypes -Wold-style-definition -fmath-errno -fPIC -fno-stack-protector -DSTACK_PROTECTOR_LEVEL=0 -Wa,-mtune=i686 -mno-sse -mno-mmx -mfpmath=387 -fexceptions -fasynchronous-unwind-tables -ftls-model=initial-exec -I../include -I/build/elf -I/build -I../sysdeps/unix/sysv/linux/i386/i686 -I../sysdeps/i386/i686/nptl -I../sysdeps/unix/sysv/linux/i386 -I../sysdeps/unix/sysv/linux/x86/include -I../sysdeps/unix/sysv/linux/x86 -I../sysdeps/x86/nptl -I../sysdeps/i386/nptl -I../sysdeps/unix/sysv/linux/include -I../sysdeps/unix/sysv/linux -I../sysdeps/nptl -I../sysdeps/pthread -I../sysdeps/gnu -I../sysdeps/unix/inet -I../sysdeps/unix/sysv -I../sysdeps/unix/i386 -I../sysdeps/unix -I../sysdeps/posix -I../sysdeps/i386/i686/fpu/multiarch -I../sysdeps/i386/i686/fpu -I../sysdeps/i386/i686/multiarch -I../sysdeps/i386/i686 -I../sysdeps/i386/fpu -I../sysdeps/x86/fpu -I../sysdeps/i386 -I../sysdeps/x86/include -I../sysdeps/x86 -I../sysdeps/wordsize-32 -I../sysdeps/ieee754/float128 -I../sysdeps/ieee754/ldbl-96/include -I../sysdeps/ieee754/ldbl-96 -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754 -I../sysdeps/generic -I.. -I../libio -I. -D_LIBC_REENTRANT -include /build/libc-modules.h -DMODULE_NAME=rtld -include ../include/libc-symbols.h -DPIC -DSHARED -DTOP_NAMESPACE=glibc -o /build/elf/dl-lookup.os -MD -MP -MF /build/elf/dl-lookup.os.dt -MT /build/elf/dl-lookup.os
In file included from dl-lookup.c:27:
./dl-new-hash.h: In function '_dl_new_hash':
./dl-new-hash.h:30:28: error: pointer targets in initialization of 'const unsigned char *' from 'const char *' differ in signedness [-Werror=pointer-sign]
   30 |   const unsigned char *s = signed_s;
      |                            ^~~~~~~~
cc1: all warnings being treated as errors
make[2]: *** [../o-iterator.mk:9: /build/elf/dl-lookup.os] Error 1
gcc -m32 dl-lookup.c -c -std=gnu11 -fgnu89-inline -g -O2 -Wall -Wwrite-strings -Wundef -Werror -fmerge-all-constants -frounding-math -fno-stack-protector -fno-common -Wstrict-prototypes -Wold-style-definition -fmath-errno -fpie -fno-stack-protector -DSTACK_PROTECTOR_LEVEL=0 -Wa,-mtune=i686 -fexceptions -fasynchronous-unwind-tables -ftls-model=initial-exec -I../include -I/build/elf -I/build -I../sysdeps/unix/sysv/linux/i386/i686 -I../sysdeps/i386/i686/nptl -I../sysdeps/unix/sysv/linux/i386 -I../sysdeps/unix/sysv/linux/x86/include -I../sysdeps/unix/sysv/linux/x86 -I../sysdeps/x86/nptl -I../sysdeps/i386/nptl -I../sysdeps/unix/sysv/linux/include -I../sysdeps/unix/sysv/linux -I../sysdeps/nptl -I../sysdeps/pthread -I../sysdeps/gnu -I../sysdeps/unix/inet -I../sysdeps/unix/sysv -I../sysdeps/unix/i386 -I../sysdeps/unix -I../sysdeps/posix -I../sysdeps/i386/i686/fpu/multiarch -I../sysdeps/i386/i686/fpu -I../sysdeps/i386/i686/multiarch -I../sysdeps/i386/i686 -I../sysdeps/i386/fpu -I../sysdeps/x86/fpu -I../sysdeps/i386 -I../sysdeps/x86/include -I../sysdeps/x86 -I../sysdeps/wordsize-32 -I../sysdeps/ieee754/float128 -I../sysdeps/ieee754/ldbl-96/include -I../sysdeps/ieee754/ldbl-96 -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754 -I../sysdeps/generic -I.. -I../libio -I. -D_LIBC_REENTRANT -include /build/libc-modules.h -DMODULE_NAME=libc -include ../include/libc-symbols.h -DPIC -DTOP_NAMESPACE=glibc -o /build/elf/dl-lookup.o -MD -MP -MF /build/elf/dl-lookup.o.dt -MT /build/elf/dl-lookup.o
In file included from dl-lookup.c:27:
./dl-new-hash.h: In function '_dl_new_hash':
./dl-new-hash.h:30:28: error: pointer targets in initialization of 'const unsigned char *' from 'const char *' differ in signedness [-Werror=pointer-sign]
   30 |   const unsigned char *s = signed_s;
      |                            ^~~~~~~~

[1] https://www.delorie.com/trybots/32bit/9123/make.tail.txt
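
For reference, one way the -Werror=pointer-sign failure could probably be avoided is an explicit cast at the initialization. The sketch below is not the code from the patch: the function names are hypothetical, the hash is written in its plain h = h * 33 + c form starting from 5381, and the two-bytes-per-iteration variant is only an illustration of the unrolling idea (it relies on (h * 33 + c0) * 33 + c1 being equal to h * 33 * 33 + c0 * 33 + c1 modulo 2^32).

```c
#include <stdint.h>

/* Plain form of the hash: h = h * 33 + c, seeded with 5381.
   Hypothetical name; this is a sketch, not the glibc implementation.  */
static uint32_t
new_hash_sketch (const char *signed_s)
{
  /* Explicit cast: initializing 'const unsigned char *' directly from
     'const char *' is what triggers -Wpointer-sign.  Reading the bytes
     as unsigned keeps values >= 0x80 from being sign-extended.  */
  const unsigned char *s = (const unsigned char *) signed_s;
  uint32_t h = 5381;
  for (unsigned char c = *s; c != '\0'; c = *++s)
    h = h * 33 + c;
  return h;
}

/* Hypothetical variant that consumes two bytes per iteration.  The
   multiply of h and the multiply of c0 do not depend on each other,
   so an out-of-order core can overlap them.  */
static uint32_t
new_hash_unrolled_sketch (const char *signed_s)
{
  const unsigned char *s = (const unsigned char *) signed_s;
  uint32_t h = 5381;
  for (;;)
    {
      uint32_t c0 = s[0];
      if (c0 == 0)
        return h;
      uint32_t c1 = s[1];
      if (c1 == 0)
        return h * 33 + c0;
      h = h * (33 * 33) + c0 * 33 + c1;
      s += 2;
    }
}
```

Both functions should return the same value for any NUL-terminated string, so the plain one can serve as a check for the unrolled one; a byte such as 0xff is a useful test input because it distinguishes signed from unsigned reads.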