From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <amonakov@ispras.ru>
Received: from mail.ispras.ru (mail.ispras.ru [83.149.199.84])
 by sourceware.org (Postfix) with ESMTPS id 14CA53850431
 for <libc-alpha@sourceware.org>; Mon, 16 May 2022 20:27:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 14CA53850431
Received: from [10.10.3.121] (unknown [10.10.3.121])
 by mail.ispras.ru (Postfix) with ESMTPS id C321440D403E;
 Mon, 16 May 2022 20:27:43 +0000 (UTC)
Date: Mon, 16 May 2022 23:27:43 +0300 (MSK)
From: Alexander Monakov <amonakov@ispras.ru>
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>
cc: GNU C Library <libc-alpha@sourceware.org>
Subject: Re: [PATCH v8 6/6] elf: Optimize _dl_new_hash in dl-new-hash.h
In-Reply-To: <af6f7db1-9ca5-f07f-e919-87916c46d717@linaro.org>
Message-ID: <83124e93-26ce-9e1b-c8e0-668f835f6771@ispras.ru>
References: <20220414041231.926415-1-goldstein.w.n@gmail.com>
 <20220511030635.154689-1-goldstein.w.n@gmail.com>
 <20220511030635.154689-6-goldstein.w.n@gmail.com>
 <1b419b02-0dee-813b-de4c-1fdc0779174a@gotplt.org>
 <1016566-92e6-5aed-b757-c6fdafa68ae@ispras.ru>
 <0cd799bb-5a54-cd71-ca97-58cc62480b4f@gotplt.org>
 <4cb8e190-db42-8284-2237-2d82537f593@ispras.ru>
 <CAFUsyfJR1jd6QLCNK9rOGAFWJEAnSAnxM=J43vZd7vEtkCN6LQ@mail.gmail.com>
 <a48b63fd-8e6b-4e2e-3d76-522a5745e967@ispras.ru>
 <65cda871-9f96-4792-d0d9-923573ec5abd@linaro.org>
 <7a2c4ab-fb44-54ec-7780-8134101480a@ispras.ru>
 <af6f7db1-9ca5-f07f-e919-87916c46d717@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS,
 SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Mon, 16 May 2022 20:27:46 -0000

On Mon, 16 May 2022, Adhemerval Zanella via Libc-alpha wrote:

> >> How hard would it be to make the compiler perform this very optimization?
> >> I raised this on the weekly call because more and more it seems that tuning
> >> computation dependencies in loops is more a compiler job than libc's
> >> (although this is not a blocker, we have had multiple small
> >> micro-optimizations in the past that turned into dead code once the
> >> compiler caught up).
> > 
> > Sorry, since you're responding to a discussion about multiply-add, it's unclear
> > to me which optimization you mean. Is your question about choosing which
> > sequence of additions has shorter cross-iteration chain?
> 
> Indeed I was not clear, I mean the reply to [1] where you explain why
> you suggested the asm to prevent the compiler from reassociating.
> 
> [1] https://sourceware.org/pipermail/libc-alpha/2022-May/138794.html

I think it's pretty hard: you'd have to decompose 'h*33' into '(h<<5)+h'
in the reassociation pass, notice that it's part of the addition chain that
feeds the phi node for 'h', and based on that select a specific
association variant (all to shave off one cycle per iteration). To me it
looks like an optimization for just this exact scenario. And then you
need to "hope" that no other pass undoes the transformation.

That would be quite a lot of nontrivial code in the compiler, when the
alternative is getting a guaranteed outcome with any compiler by adding an
empty asm statement to a loop that iterates thousands of times on every
process startup.
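
For illustration, here is a minimal sketch of the two association variants.
It is not the code from the patch: the function names are made up, it handles
one character per iteration, and it assumes GNU C extended asm. The cycle
counts in the comments assume a core where a shift and an add each take one
cycle.

```c
#include <stdint.h>

/* Plain form: the compiler may associate h*33 + c any way it likes, for
   example ((h << 5) + h) + c, which puts a shift and two adds (three
   dependent operations) on the chain from one iteration's h to the next.  */
static uint32_t
hash_plain (const char *str)
{
  const unsigned char *s = (const unsigned char *) str;
  uint32_t h = 5381;
  for (; *s != 0; s++)
    h = h * 33 + *s;
  return h;
}

/* Pinned form: h + c is computed separately and passed through an empty
   asm, so the compiler has to treat it as an opaque value and cannot fold
   it back into the multiply.  (h << 5) and (h + c) can then execute in
   parallel, leaving two dependent operations on the cross-iteration chain.  */
static uint32_t
hash_pinned (const char *str)
{
  const unsigned char *s = (const unsigned char *) str;
  uint32_t h = 5381;
  for (; *s != 0; s++)
    {
      uint32_t t = h + *s;
      /* Emits no instructions; "+r" only says t is read and written.  */
      __asm__ ("" : "+r" (t));
      h = (h << 5) + t;
    }
  return h;
}
```

Both functions return the same value for any input; only the shape of the
dependency chain the compiler is allowed to emit differs.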

Alexander