Date: Fri, 17 Sep 2021 17:01:58 -0500
From: Segher Boessenkool
To: "Kewen.Lin"
Cc: GCC Patches, Bill Schmidt, David Edelsohn
Subject: Re: [PATCH] rs6000: Modify the way for extra penalized cost
Message-ID: <20210917220158.GW1583@gate.crashing.org>

Hi!

On Thu, Sep 16, 2021 at 09:14:15AM +0800, Kewen.Lin wrote:
> The way with nunits * stmt_cost can get a much exaggerated penalized
> cost, such as: for V16QI on P8, it's 16 * 20 = 320, that's why we
> need a bound.
> To make it scale, this patch doesn't use nunits * stmt_cost any more,
> but it still keeps nunits since there are actually nunits scalar loads
> there. So it uses a cost adjusted from stmt_cost; since the current
> stmt_cost sort of considers nunits, we can stabilize the cost for big
> nunits and retain the cost for small nunits. After some tries, this
> patch gets the adjusted cost as:
>
>   stmt_cost / (log2(nunits) * log2(nunits))

So for V16QI it gives *16/(4*4), so *1;
   V8HI  it gives  *8/(3*3), so *8/9;
   V4SI  it gives  *4/(2*2), so *1;
   V2DI  it gives  *2/(1*1), so *2;
and for V1TI it gives *1/(0*0), which is UB (no, it does not crash for
us, it just gives wildly wrong answers; the division returns 0 on recent
systems).

> For V16QI, the adjusted cost would be 1 and the total penalized cost
> is 16, so it isn't exaggerated. For V2DI, the adjusted cost would be
> 2 and the total penalized cost is 4, which is the same as before.
> Btw, I tried to use a single log2(nunits), but the penalized cost is
> still big enough and can't fix the degraded bmk blender_r.

Does it make sense to treat V2DI (and V2DF) as twice as expensive as
the other vectors, which are all pretty much equal cost (except those
that end up with cost 0)? If so, there are simpler ways to do that.

> +  int nunits_log2 = exact_log2 (nunits);
> +  gcc_assert (nunits_log2 > 0);
> +  unsigned int nunits_sq = nunits_log2 * nunits_log2;

>= 0

This of course assumes nunits will always be a power of 2, but I'm sure
we have many other places in the compiler assuming that already, so that
is fine. And if one day this stops being true we will get a nice ICE,
pretty much the best we could hope for.

> +  unsigned int adjusted_cost = stmt_cost / nunits_sq;

But this can divide by 0. Or are we somehow guaranteed that nunits will
never be 1? Yes, the log2 check above, sure, but that ICEs if it is
violated; is there anything that actually guarantees it is true?

> +  gcc_assert (adjusted_cost > 0);

I don't see how you guarantee this, either.
A magic crazy formula like this is no good. If you want to make the
cost of everything but V2D* be the same, and that of V2D* be twice
that, that is a weird heuristic, but perhaps we can live with it. But
that beats completely unexplained (and unexplainable) magic!

Sorry.


Segher