From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=/rDe=MG=tugraz.at=uecker@sourceware.org>
Received: from mailrelay.tugraz.at (mailrelay.tugraz.at [129.27.2.202])
	by sourceware.org (Postfix) with ESMTPS id C7ED0384AB58
	for <gcc-patches@gcc.gnu.org>; Fri,  3 May 2024 18:04:20 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C7ED0384AB58
Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=tugraz.at
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tugraz.at
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C7ED0384AB58
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=129.27.2.202
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1714759462; cv=none;
	b=TmOZ5BG8gTAs8aNDZh/ALA2wrwUzbUqYWS+cJsdrTAhHgrRu/fEZJmD3X06tSZApDAUhXO8698IBXdeua+GPPlg9AxLWHsoLa9bpCBjvcr+OCnVZBoNNnCp0PHb50ccz5kBEiNf8S33CfUxhu0ZfS8uJqyW1CKIb/BXs0palrmc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1714759462; c=relaxed/simple;
	bh=3vRk8GSVfbwxwnboV/l6KJsr3tBkho+eS2Q3zFLoG3Y=;
	h=DKIM-Signature:Message-ID:Subject:From:To:Date:MIME-Version; b=uu1c2+Chy1xeLk66tNl8TbGRS/c2MXQi6lQtyxwJ9EFds65UVqbbriAY6fpAMGnB+8QZ4z95Om8C3pTA2XK8VXcovC3sIw5zGxRDw7qj4QMTXUZ/riVepPwLnnAxM9fNxjonu21EJVtYMNuLF80z6W5wz+z4xJBZ1sTcWhBk66M=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from [192.168.0.221] (84-115-223-216.cable.dynamic.surfer.at [84.115.223.216])
	by mailrelay.tugraz.at (Postfix) with ESMTPSA id 4VWJbQ4YXHz1HNLy;
	Fri,  3 May 2024 20:04:18 +0200 (CEST)
DKIM-Filter: OpenDKIM Filter v2.11.0 mailrelay.tugraz.at 4VWJbQ4YXHz1HNLy
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tugraz.at;
	s=mailrelay; t=1714759458;
	bh=SxrAX/tPZVIm/OMMcj31rmpjcei8RRYPGpUhKGGqQvs=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=TGlJRj2ocn8cowdDYepo+l6i22QWISakr9dOSuyxiZPDhU9vlEZwzHJlmdDMG38O9
	 YYsLIBN2zlLl6+i8iaxaRBqSIR5QxpijBe7O39ueXWrNaO9NBsYn85PaBSEo0uArJJ
	 zfh64iv/dFzUHj6yf0+vLmHxZASHuD1X37DHwN4I=
Message-ID: <06a98adaf345a1e942a4973546ea15373adb5df5.camel@tugraz.at>
Subject: Re: [PATCH] middle-end/114931 - type_hash_canon and structual
 equality types
From: Martin Uecker <uecker@tugraz.at>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Richard Biener <rguenther@suse.de>, gcc-patches@gcc.gnu.org
Date: Fri, 03 May 2024 20:04:18 +0200
In-Reply-To: <ZjUfRv/0jC/M/XbZ@tucnak>
References: <20240503121400.C8A4313991@imap1.dmz-prg2.suse.org>
	 <3616afcc455c1f80e2bb8a80a408994a67062d85.camel@tugraz.at>
	 <ZjUfRv/0jC/M/XbZ@tucnak>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
User-Agent: Evolution 3.46.4-2 
MIME-Version: 1.0
X-TUG-Backscatter-control: G/VXY7/6zeyuAY/PU2/0qw
X-Spam-Scanner: SpamAssassin 3.003001 
X-Spam-Score-relay: -1.9
X-Scanned-By: MIMEDefang 2.74 on 129.27.10.116
X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_ABUSEAT,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

Am Freitag, dem 03.05.2024 um 19:30 +0200 schrieb Jakub Jelinek:
> On Fri, May 03, 2024 at 05:32:12PM +0200, Martin Uecker wrote:
> > Am Freitag, dem 03.05.2024 um 14:13 +0200 schrieb Richard Biener:
> > > TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
> > > to make sure to include that into the type unification done via
> > > type_hash_canon.  This requires the flag to be set before querying
> > > the hash which is the biggest part of the patch.
> >=20
> > I assume this does not affect structs / unions because they
> > do not make this mechanism of type unification (each tagged type
> > is a unique type), but only derived types that end up having
> > TYPE_STRUCTURAL_EQUALITY_P because they are constructed from
> > incomplete structs / unions before TYPE_CANONICAL is set.
> >=20
> > I do not yet understand why this change is needed. Type
> > identity should not be affected by setting TYPE_CANONICAL, so
> > why do we need to keep such types separate?  I understand that we
> > created some inconsistencies, but I do not see why this change
> > is needed to fix it.  But I also haven't understood how we ended
> > up with a TYPE_CANONICAL having TYPE_STRUCTURAL_EQUALITY_P in
> > PR 114931 ...
>=20
> So, the C23 situation before the r14-10045 change (not counting the
> r14-9763 change that was later reverted) was that sometimes TYPE_CANONICA=
L
> of a RECORD_TYPE/UNION_TYPE could change from self to a different
> RECORD_TYPE/UNION_TYPE and we didn't bother to adjust derived types.
> That was really dangerous, I think e.g. type alias set wasn't recomputed.
>=20
> r14-10045 changed it to the non-ideal, but perhaps less wrong model,
> where we start with TYPE_STRUCTURAL_EQUALITY_P on incomplete types in C23
> and perhaps later on change them to !TYPE_STRUCTURAL_EQUALITY_P when
> the type is completed, and adjust TYPE_CANONICAL of some easily discovera=
ble
> derived types but certainly not all.
>=20
> Still, that change introduces something novel to the type system, namely
> that TYPE_CANONICAL can change on a type, even when it is just limited to
> the TYPE_STRUCTURAL_EQUALITY_P -> !TYPE_STRUCTURAL_EQUALITY_P kind of
> change and we never change one non-NULL TYPE_CANONICAL to a different one
> (ok, not counting the short lived TYPE_CANONICAL being initially set to
> self during make_node and then quickly adjusted in the callers).
>=20
> One question is, could we for C23 somehow limit this for the most common
> case where people just forward declare some aggregate type and then soon
> after define it?  But guess the problematic counterexample there is
> struct S; // forward declare
> struct S *p; // create some derived types from it
> void foo (void)
> {
>   struct S { int s; };	// define the type in a different scope
> 			// (perhaps with a forward declaration as well)
>   struct S s;
>   use (&s);		// create derived types
> }
> struct S { int s; };	// define the type in the global scope to something
> 			// that matches previously defined struct S in
> 			// another scope
> So e.g. noting in the hash table that a particular type has been forward
> declared so far and using TYPE_STRUCTURAL_EQUALITY_P only if it has been
> forward declared in some other scope wouldn't work.
>=20
> Another question is whether c_update_type_canonical can or should try to
> update TYPE_ALIAS_SET if TYPE_ALIAS_SET_KNOWN_P.  Or do we never cache
> alias sets for TYPE_STRUCTURAL_EQUALITY_P types?
>=20
> Anyway, the ICE on the testcase is because alias.cc asserts that
> a !TYPE_STRUCTURAL_EQUALITY_P (type) has
> !TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)).
>=20
> The possibilities to resolve that are either at c_update_type_canonical
> time try to find all the derived types rather than just some and recomput=
e
> their TYPE_CANONICAL.  Guess we could e.g. just traverse the whole
> type_hash_table hash table and for each type see if it is in any way rela=
ted
> to the type that is being changed and then recompute them.  Though,
> especially FUNCTION_TYPEs make that really ugly and furthermore it needs
> to be recomputed in the right order, basically in the derivation order.
> Without doing that, we'll have some TYPE_STRUCTURAL_EQUALITY_P derived
> types in the type_hash_table hash table; that is conservatively correct,
> but can result in worse code generation because of using alias set 0.
>=20
> Another possibility is what Richi posted, essentially stop reusing
> derived types created from the time when the base type was incomplete
> when asking for a new derived type.  We'll get the TYPE_STRUCTURAL_EQUALI=
TY_P
> derived types if they were created before the base type was completed
> when used directly (e.g. when it is a TREE_TYPE of some decl etc.), but
> when we ask for a new type we'd disregard the old type and create a new o=
ne.
> I'm not sure the patch is complete for that, because it doesn't adjust
> check_base_type, build_pointer_type_for_mode, build_reference_type_for_mo=
de
> which don't use type_hash_canon but walk TYPE_NEXT_VARIANT list or
> TYPE_POINTER_TO or TYPE_REFERENCE_TO chains.  Though, maybe it is ok
> as is when c_update_type_canonical adjusts the pointer types
> and variant types, those will be adjusted and as soon as we hit the first
> ARRAY_TYPE/FUNCTION_TYPE or other type_canon_hash using derived types and
> they aren't shared but have a new one created, further derived
> pointer/qualified types will be based on the new ones, not the old ones.
>=20
> Another possibility is what I wrote in the PR, ensure the
> !TYPE_STRUCTURAL_EQUALITY_P (type) ->
> !TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)) implication alias.cc
> relies on is held.  Without the complete update of all derived types or
> Richi's patch, there is always a chance that some TYPE_CANONICAL (...) =
=3D
> assignment will get a TYPE_STRUCTURAL_EQUALITY_P type which was created
> while the base type was incomplete.  So, we could just check for that
> case and use conservative type as well.  I.e. after each TYPE_CANONICAL
> (...) =3D ... (except the TYPE_CANONICAL (t) =3D t; cases which are alway=
s
> fine) add
>   if (TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)))
>     SET_TYPE_STRUCTURAL_EQUALITY (type);
> That is conservatively fine.
>=20
> And yet another option is to change alias.cc to handle the
> TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)) case the same
> as TYPE_STRUCTURAL_EQUALITY_P (type) instead of asserting it is not
> ok.
>=20
> I think we'd get best generated code if we managed to update all derived
> types, Richi's patch is the second and the other changes result in even
> worse code.


A change that is not optimal but would avoid a lot of trouble is to
only use the tag of the struct for computing a TYPE_CANONICAL, which
could then be set already for incomplete types and never needs to
change again. We would not differentiate between different struct
types with the same tag for aliasing analysis, but in most cases
I would expect different structs to have a different tag.

Martin