From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=9rt6=5A=redhat.com=jakub@sourceware.org>
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	by sourceware.org (Postfix) with ESMTPS id 64CD63858C1F
	for <gcc-patches@gcc.gnu.org>; Tue,  3 Jan 2023 13:54:18 +0000 (GMT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1672754058;
	h=from:from:reply-to:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=huLnl5T210Wa1NGBsbEA/gYAnBU9AdPZRNt+Ty2s+Vg=;
	b=Wn8hsST/G+Svj2wrPp8iJi8OnhYu9H0fL+nBd0mT5aMPjuxNT8KEXmNFYLzG4Ay23yF/Vz
	+dZFXwCjWOTOsbZqrN2DaxHpIRp9aBZE7SAh+NumMig8s2cxsfSwo+kxGFry5aBrFYpgPG
	YVcd26URcCIBqlRfdNfDzipLFMl+dBQ=
Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com
 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-438-dL_KZyNAPqq8ifC2BodLjQ-1; Tue, 03 Jan 2023 08:54:15 -0500
X-MC-Unique: dL_KZyNAPqq8ifC2BodLjQ-1
Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C19E61C0A583;
	Tue,  3 Jan 2023 13:54:14 +0000 (UTC)
Received: from tucnak.zalov.cz (unknown [10.39.192.223])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id 78AF5492D8B;
	Tue,  3 Jan 2023 13:54:14 +0000 (UTC)
Received: from tucnak.zalov.cz (localhost [127.0.0.1])
	by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 303DsBux280504
	(version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT);
	Tue, 3 Jan 2023 14:54:11 +0100
Received: (from jakub@localhost)
	by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 303DsAFO280503;
	Tue, 3 Jan 2023 14:54:10 +0100
Date: Tue, 3 Jan 2023 14:54:10 +0100
From: Jakub Jelinek <jakub@redhat.com>
To: Florian Weimer <fweimer@redhat.com>
Cc: gcc-patches@gcc.gnu.org, Andre Vieira <andre.simoesdiasvieira@arm.com>,
        Andrew Pinski <pinskia@gmail.com>, Jeff Law <jeffreyalaw@gmail.com>
Subject: Re: [PATCH] Various fixes for DWARF register size computation
Message-ID: <Y7QzgnjVRfI+lKB8@tucnak>
Reply-To: Jakub Jelinek <jakub@redhat.com>
References: <87pmbvx41g.fsf@oldenburg.str.redhat.com>
 <Y7QaDPdpmBCMU+Cd@tucnak>
 <878rijwy0u.fsf@oldenburg.str.redhat.com>
MIME-Version: 1.0
In-Reply-To: <878rijwy0u.fsf@oldenburg.str.redhat.com>
X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-3.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Tue, Jan 03, 2023 at 02:25:21PM +0100, Florian Weimer wrote:
> > Though, I still wonder, because all of this is a hack for a single target
> > - x86_64-linux -m64 - I think no other target has similar constant
> > sizes,
> 
> Really?  That's odd.

I've tried about 30 cross compilers I had around, I admit it isn't
exhaustive.

> Is it because other architectures track callee-saved vector registers
> through unwinding?

I think it is far more than just vector registers, it can be floating point
registers, or just one integral or special register of a different size etc.

> > Or, if you want to do it on the compiler side, instead of predefining
> > __LIBGCC_DWARF_REG_SIZES_CONSTANT__ and __LIBGCC_DWARF_REG_MAXIMUM__
> > register conditionally a new builtin, __builtin_dwarf_reg_size,
> > which would be defined only if -fbuilding-libgcc and the compiler determines
> > dwarf_reg_size is desirable to be computed inline without a table and
> > would fold the builtin to say that
> > index <= 16U ? 8 : 0 on x86_64 -m64,
> > index <= 9U ? 4 : index - 11U <= 5U ? 12 : 0 on x86_64 -m32 etc.
> > and if the expression is too large/complex, wouldn't predefine the builtin.
> 
> I think the pre-computation of the size array is useful even for targets
> where the expression is not so simple, but the array elements are still
> constants.  A builtin like __builtin_dwarf_reg_size could use a
> reference to a constant static array, so that we can get rid of the
> array initialization code in libgcc.  Before we can do that, we need to

I think constant static array might be sometimes an option too, but not
always, not just because of AArch64, or other potential poly-int arches
(RISCV?), but also if something could depend on some runtime check.
It is true that most of the time it is constant and depends just on the
target or more often on target + options combo (mostly ABI related options).

> figure out if the fully dynamic register sizes on AArch64 with SVE are
> actually correct—and if we need to fix the non-SVE unwinder to work
> properly for SVE programs.
> 
> So I don't want to revert the size array computation just yet.
> 
> > Or, is it actually the use of table that is bad on the unwinder side,
> > or lack of a small upper bound for what you get from the table?
> > In that case you could predefine upper bound on the sizes instead (if
> > constant) and simply add if (size > __LIBGCC_DWARF_REG_SIZE_MAX__)
> > __builtin_unreachable ()).
> 
> It also matters what kind of register sizes are used in practice.

Yes, but that is hard to find out easily.  E.g. some registers might be
only saved/restored in signal frames and nowhere else, others only rarely
touched but still we'd need their sizes if they are ever used.

> Should I repost this patch with the three nits fixed?  Or should I
> revert two of the three patches I committed instead?

I lean towards reversion and trying to figure out something that works
on more than one arch.  It doesn't have to improve all arches (say
AArch64 is clearly a nightmare that isn't handled correctly even without
any changes - as noted in the PRs, if libgcc is built without SVE, it will
hardcode 8, while if it is built with SVE, it will be runtime dependent and
will be wrong in theory when some HW has 2048 bit SVE vectors - when it is
256 bytes), but still watching into what we compile -O2 -nostdinc -fbuilding-libgcc

static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1];

void
foo (void)
{
  __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table);
}

on at least 10 most common arches would be useful and optimizing what is
easily possible would be nice.

	Jakub