From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=MAay=FR=foss.st.com=prvs=86405a88e4=torbjorn.svensson@sourceware.org>
Received: from mx07-00178001.pphosted.com (mx07-00178001.pphosted.com [185.132.182.106])
	by sourceware.org (Postfix) with ESMTPS id 648203858C50
	for <binutils@sourceware.org>; Tue,  3 Oct 2023 13:00:04 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 648203858C50
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=foss.st.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=foss.st.com
Received: from pps.filterd (m0241204.ppops.net [127.0.0.1])
	by mx07-00178001.pphosted.com (8.17.1.22/8.17.1.22) with ESMTP id 393A7Bw2022450;
	Tue, 3 Oct 2023 14:59:59 +0200
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foss.st.com; h=
	message-id:date:mime-version:subject:references:from:to:cc
	:in-reply-to:content-type:content-transfer-encoding; s=
	selector1; bh=CIl6zR63RHgWtfokjzwR3Kp8t1ENb8hbFaX8zsEe4aA=; b=Ic
	z8aXunEKbW8uv6VScnV0FIucYRbowypvfvvAfsbzFizb74h40jgHPBLI8+9juHga
	SdPX3li6wOqgv1AdqcDMkImRPj1toVkCZ+grfwpLjVgSuTlVSPN2zFmNYdOhEMyW
	VhGGIITb277RrW69Gaiw5LeDb4JF2FZ3VzkGsEhUP/GxpWzrY0pOnfoHNUyQ4Ojl
	gFaah33OKRiGnaxWojmNXPE0AsVKmHScXW/XGALu1Pf+2IZrVySYAMyqYXOZO0QP
	m+FZ25IeoXjD86mgrByFUsU/7KmtclyZbzmp5S91F7UlB5eGBi2p6+TwCIQAqczy
	NkwjR2zhJiz6XCGQdLmw==
Received: from beta.dmz-eu.st.com (beta.dmz-eu.st.com [164.129.1.35])
	by mx07-00178001.pphosted.com (PPS) with ESMTPS id 3tggx30tbd-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Tue, 03 Oct 2023 14:59:59 +0200 (MEST)
Received: from euls16034.sgp.st.com (euls16034.sgp.st.com [10.75.44.20])
	by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 97B2C100053;
	Tue,  3 Oct 2023 14:59:58 +0200 (CEST)
Received: from Webmail-eu.st.com (shfdag1node3.st.com [10.75.129.71])
	by euls16034.sgp.st.com (STMicroelectronics) with ESMTP id 8F290229A75;
	Tue,  3 Oct 2023 14:59:58 +0200 (CEST)
Received: from [10.74.18.1] (10.74.18.1) by SHFDAG1NODE3.st.com (10.75.129.71)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Tue, 3 Oct
 2023 14:59:58 +0200
Message-ID: <c131c7ce-216e-779c-503a-f9450833261e@foss.st.com>
Date: Tue, 3 Oct 2023 14:59:57 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v3] libctf: ctf_member_next needs to return (ssize_t)-1 on
 error
Content-Language: en-US
References: <878r9ba2sf.fsf@esperi.org.uk>
 <20230913095727.1420654-1-torbjorn.svensson@foss.st.com>
 <87a5tp2a3n.fsf@esperi.org.uk>
 <657dadf8-4b7b-67e6-9de2-9a1cdb79f081@foss.st.com>
 <87wmwdt2bf.fsf@esperi.org.uk>
 <c5d24991-086d-0198-ee02-488674fc20dc@foss.st.com>
 <87r0mimero.fsf@esperi.org.uk>
 <ce011a56-2a75-5a9e-73ef-04f9307b6850@foss.st.com>
 <87ttr9l2b1.fsf@esperi.org.uk>
From: Torbjorn SVENSSON <torbjorn.svensson@foss.st.com>
To: Nick Alcock <nick.alcock@oracle.com>
CC: <binutils@sourceware.org>, Alan Modra <amodra@gmail.com>,
        Yvan Roux
	<yvan.roux@foss.st.com>
In-Reply-To: <87ttr9l2b1.fsf@esperi.org.uk>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 8bit
X-Originating-IP: [10.74.18.1]
X-ClientProxiedBy: SHFCAS1NODE1.st.com (10.75.129.72) To SHFDAG1NODE3.st.com
 (10.75.129.71)
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26
 definitions=2023-10-03_10,2023-10-02_01,2023-05-22_02
X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <binutils.sourceware.org>

Hello Nick,


On 2023-10-02 12:57, Nick Alcock wrote:
> On 29 Sep 2023, Torbjorn SVENSSON verbalised:
> 
>> On 2023-09-28 18:41, Nick Alcock wrote:
>>> Aha! So this is *not* a problem with functions returning int -- it is
>>> specifically a problem with functions returning *size_t types*.
>>
>> So, I'm not sure if were are talking past each other, but I don't think it's a *size_t type* issue either.
>>
>> As I see it, there are 3 different scenarios.
>>
>> 1. The function returns a signed integral type, then -1 would be fine.
>> Even a simple (int)-1 would work as it would be sign-extended.
> 
> Ack.
> 
>> 2. The function returns an unsigned integral type that is as wide, or
>> wider, than unsigned long. In this case, CTF_ERR will be zero extended
>> and the value will be UINT_something_MAX (the number of binary ones
>> depends on the sizeof(unsigned long)).
> 
> Ack. This is the problem area, because the interface contract for libctf
> specifies (or would if it were written down anywhere) that you can test
> all signed returns for error via < 0 and all unsigned ones (like
> ctf_id_t) via == CTF_ERR, and CTF_ERR... has a cast in it.

I don't think this is true with the current logic in place.
On a signed integral function, it could be either -1 (negative one) or, 
if the data type of the called function is wide enough, CTF_ERR.

> I suppose if we changed CTF_ERR to (uintmax_t)-1 this might work for all
> cases, but a) uintmax_t is an annoying sod that doesn't even extend to
> the largest possible unsigned integral type on all platforms and b) this
> just gives us the same problem from the other side if the unsigned type
> is shorter than uintmax_t (which it almost always would be). What would
> happen to the (unsigned long)-1 the libctf function returned when it was
> promoted? Would it turn into (uintmax_t)-1 consistently, given that the
> type has no sign so sign-extension presumably does not apply? I've tried
> to figure it out from the standard text and I guess fevers and bacterial
> infections are not compatible with reading standardese because I haven't
> worked it out yet :P
> 
> (or maybe you need to be on more serious drugs than I'm on when reading
> it. It might well be that way round!)
> 
>> 3. The function return an unsigned integral type that is narrower than
>> unsigned long. In this case, the cast to unsigned long is wider than
>> the return type and may overflow(?). Is it safe to return something
>> bigger than can be represented in the return type?
> 
> Are there any of those at all? I don't think so. It's possible in theory
> but I don't think we use unsigned int anywhere. I suppose size_t might
> be unsigned int on some platforms, and then this problem might arise.
> 
>> In my mind, I see it as we need 2 different implementations. One that
>> returns a simple -1 and another one that casts it for all unsigned
>> calls, but maybe I'm oversimplifying this.
> 
> I suppose if it was always done *at the return site* so the compiler
> could always see the return type properly, it would always get the cast
> right: my worry is the CTF_ERR at the test site, in the user code we
> cannot affect (except by changing CTF_ERR).

I suppose so, but does it really matter when we cannot do this part? Or 
are you thinking about just having the ctf_set_errno function always 
return -1 (negative one) and let the caller do the automagic conversion? 
If so, I think the idiom with return_value == CTF_ERR is going to fail.

>> To make things easier, I would actually consider just having the
>> assign statement and the return statement inlined (without a macro or
>> inline function) directly where they are supposed to be.
>>
>> Would you be open to removing the ctf_set_errno() function completely
>> and just expand it to where it's called (almost like my v2 patch, but
>> everywhere instead)?
> 
> Well, you could use a macro to make it less gross. All the code
> duplication makes me wince a bit.

How would a macro work for this?

Consider that a function returns a char*, in this case, neither CTF_ERR, 
nor -1 should be returned, but instead NULL. Using a macro would make 
this problematic to differentiate.

A possibility could be to have a macro like:

#define CTF_SET_ERRNO(fp,err) do { fp->ctf_errno = err; } while(0)

but what's the point?
Is it better to write:

if (some_failure_condition)
   {
     CTF_SET_ERRNO(fp, ENOMEM);
     return -1; // or CTF_ERR, NULL, ..., depending on the function
   }

instead of:

if (some_failure_condition)
   {
     fp->ctf_errno = ENOMEM;
     return -1; // or CTF_ERR, NULL, ..., depending on the function
   }


> But more problematic from my perspective is the implications for the
> callers. We're trying to get either -1 or (some type)-1 to the callers
> so they can error-check it in a consistent and not-too-horrifying
> fashion, after all. (We could add a ctf_is_error() function or macro
> that does whatever magic is needed if it's too baroque, but that would
> be my least favourite option because it looks nothing like a
> conditional.)

Adding a ctf_is_error()-function would be one way to go, but it would be 
an API change while all the others I've been talking about is internal 
into the library without actually touching the interface.

>> Alternatively, we could start the discussion on adopting the C11 standard instead, but I fear that this will be a much longer
>> discussion than figuring out what the best approach would be for libctf.
> 
> Agreed, or I'd have proposed it already. Toolchain projects like
> binutils must be conservative, and there are no doubt lots of archaic
> weird but valid targets out there where the bootstrap chain requires
> binutils building before GCC and the system compiler is not C11-capable.
> 
>>> ... sorry, I'm still flailing at it. Maybe the above is helpful? (It's
>>> only a very small change atop what you've already done, I think.)
>>
>> Let me know how you want to handle this. :)
> 
> I think we're still narrowing down the problem, really. -1 is such a
> common error return, it's amazing how hard it is to make it work right :)
> 

So, based on all we have talk about so for, I'm leaning towards just 
having the 2 statements in all places where an error should be returned 
and have the change contained inside the library rather than making 
external changes required.
Then the question is, would the CTF_SET_ERRNO macro have anything to 
add? I.e. should we have it even if it's just setting the field or 
should we set the field directly? Regardless, the return statement needs 
to be kept outside that block.

Please let me know what you think so that we can progress on this topic.
I'm currently blocked by this point for the tool chain that I'm supposed 
to deliver.

Kind regards,
Torbjörn