From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by sourceware.org (Postfix) with ESMTPS id 806613858D20 for ; Fri, 3 Feb 2023 15:44:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 806613858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 642035BF55 for ; Fri, 3 Feb 2023 15:44:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1675439063; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZV8NgKCBD7BBoZfb9o2HbO0W2invKpeQ8sZJ7ak66MU=; b=VScvpKFMs4mv/twUbwXyNxFkgx9U0z+WbkVvSpe+iK4/WT3orbjDpfzpQgAT6q3lQN/Nlx gzBG7x74AMnG0lJwDNdXxgCOw1rrFP/Mh6lssJiOzS3kSb8dP/GNGm2+iQsUBgINKUNhbx Nyl9PFIGJcP5VJnYkRX5XiIF0zQTcys= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1675439063; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZV8NgKCBD7BBoZfb9o2HbO0W2invKpeQ8sZJ7ak66MU=; b=q4Wb9ZYJgB6HQFVECEFmHS/22XFccO6tLbUzQ4vrtThqxzEugWQkugJiURxOTre6veNe8+ HQL4+L1gcWpZWtAA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 56F011346D for ; Fri, 3 Feb 2023 15:44:23 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id lHAoFdcr3WMORgAAMHmgww (envelope-from ) for ; Fri, 03 Feb 2023 15:44:23 +0000 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable From: Richard Biener Mime-Version: 1.0 (1.0) Subject: Re: [PATCH] Improve RTL CSE hash table hash usage Date: Fri, 3 Feb 2023 16:44:12 +0100 Message-Id: <617912D0-503E-4F87-BFE1-CC9324ED67CA@suse.de> References: In-Reply-To: To: Richard Sandiford via Gcc-patches X-Mailer: iPhone Mail (20D47) X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: > Am 03.02.2023 um 15:20 schrieb Richard Sandiford via Gcc-patches : >=20 > =EF=BB=BFRichard Biener via Gcc-patches writes: >> The RTL CSE hash table has a fixed number of buckets (32) each >> with a linked list of entries with the same hash value. The >> actual hash values are computed using hash_rtx which uses adds >> for mixing and adds the rtx CODE as CODE << 7 (apart from some >> exceptions such as MEM). The unsigned int typed hash value >> is then simply truncated for the actual lookup into the fixed >> size table which means that usually CODE is simply lost. >>=20 >> The following improves this truncation by first mixing in more >> bits using xor. It does not change the actual hash function >> since that's used outside of CSE as well. >>=20 >> An alternative would be to bump the fixed number of buckets, >> say to 256 which would retain the LSB of CODE or to 8192 which >> can capture all 6 bits required for the last CODE. >>=20 >> As the comment in CSE says, there's invalidate_memory and >> flush_hash_table done possibly frequently and those at least >> need to walk all slots, so when the hash table is mostly empty >> enlarging it will be a loss. Still there should be more >> regular lookups by hash, so less collisions should pay off >> as well. >=20 > Going purely from this description and without having looked > at the code properly, would it be possible to link all current > values together, not just those with the same hash? And would > that help? It looks like the list is already doubly-linked, > and there's spare room to store a "start of new hash" marker. We already do have equivalent values linked, but I=E2=80=99m not sure that=E2= =80=99s what you are suggesting. Those should also have the same hash value= , so both lists are somewhat redundant and we might be able to save some sto= rage here by making this a list of lists of same hash and value list? >=20 > Thanks, > Richard >=20 >> Without enlarging the table a better hash function is unlikely >> going to make a big difference, simple statistics on the >> number of collisions at insertion time shows a reduction of >> around 10%. Bumping HASH_SHIFT by 1 improves that to 30% >> at the expense of reducing the average table fill by 10% >> (all of this stats from looking just at fold-const.i at -O2). >> Increasing HASH_SHIFT more leaves the table even more sparse >> likely showing that hash_rtx uses add for mixing which is >> quite bad. Bumping HASH_SHIFT by 2 removes 90% of all >> collisions. >>=20 >> Experimenting with using inchash instead of adds for the >> mixing does not improve things when looking at the HASH_SHIFT >> bumped by 2 numbers. >>=20 >> Bootstrapped and tested on x86_64-unknown-linux-gnu. >>=20 >> Any opinions? >>=20 >> * cse.cc (HASH): Turn into inline function and mix >> in another HASH_SHIFT bits. >> (SAFE_HASH): Likewise. >> --- >> gcc/cse.cc | 37 +++++++++++++++++++++++-------------- >> 1 file changed, 23 insertions(+), 14 deletions(-) >>=20 >> diff --git a/gcc/cse.cc b/gcc/cse.cc >> index 37afc88b439..4777e559b86 100644 >> --- a/gcc/cse.cc >> +++ b/gcc/cse.cc >> @@ -420,20 +420,6 @@ struct table_elt >> #define HASH_SIZE (1 << HASH_SHIFT) >> #define HASH_MASK (HASH_SIZE - 1) >>=20 >> -/* Compute hash code of X in mode M. Special-case case where X is a pse= udo >> - register (hard registers may require `do_not_record' to be set). */ >> - >> -#define HASH(X, M) \ >> - ((REG_P (X) && REGNO (X) >=3D FIRST_PSEUDO_REGISTER \ >> - ? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (X))) \ >> - : canon_hash (X, M)) & HASH_MASK) >> - >> -/* Like HASH, but without side-effects. */ >> -#define SAFE_HASH(X, M) \ >> - ((REG_P (X) && REGNO (X) >=3D FIRST_PSEUDO_REGISTER \ >> - ? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (X))) \ >> - : safe_hash (X, M)) & HASH_MASK) >> - >> /* Determine whether register number N is considered a fixed register for= the >> purpose of approximating register costs. >> It is desirable to replace other regs with fixed regs, to reduce need f= or >> @@ -586,6 +572,29 @@ static machine_mode cse_cc_succs (basic_block, basic= _block, rtx, rtx, >>=20 >> static const struct rtl_hooks cse_rtl_hooks =3D RTL_HOOKS_INITIALIZER; >> =0C >> +/* Compute hash code of X in mode M. Special-case case where X is a pse= udo >> + register (hard registers may require `do_not_record' to be set). */ >> + >> +static inline unsigned >> +HASH (rtx x, machine_mode mode) >> +{ >> + unsigned h =3D (REG_P (x) && REGNO (x) >=3D FIRST_PSEUDO_REGISTER >> + ? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (x))) >> + : canon_hash (x, mode)); >> + return (h ^ (h >> HASH_SHIFT)) & HASH_MASK; >> +} >> + >> +/* Like HASH, but without side-effects. */ >> + >> +static inline unsigned >> +SAFE_HASH (rtx x, machine_mode mode) >> +{ >> + unsigned h =3D (REG_P (x) && REGNO (x) >=3D FIRST_PSEUDO_REGISTER >> + ? (((unsigned) REG << 7) + (unsigned) REG_QTY (REGNO (x))) >> + : safe_hash (x, mode)); >> + return (h ^ (h >> HASH_SHIFT)) & HASH_MASK; >> +} >> + >> /* Nonzero if X has the form (PLUS frame-pointer integer). */ >>=20 >> static bool