From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=Ekrk=IU=redhat.com=aburgess@sourceware.org>
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	by sourceware.org (Postfix) with ESMTPS id 6E4D73857C4B
	for <binutils@sourceware.org>; Wed, 10 Jan 2024 13:48:54 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6E4D73857C4B
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6E4D73857C4B
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704894537; cv=none;
	b=TY34rc2mPG/ONVBU4Ws2E6wB33AYdcnw2SfvP6p0eFVdGyvd02YO7rItYuxCDC8IRx7avRsuRKZCwPX5/i4cJ6AtfVUutOmz6B+hFQktmuwsBJiW/fAgSyiHxzsiX9eENcb/dqfLM9eQxcObiVYO+Y7ka8qY/vBESga+tQo+pRY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1704894537; c=relaxed/simple;
	bh=fCxqjVsrMLD+EMH3BiAPHuWk9gDeSZ+OgrpguHyCF50=;
	h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=bMVema9ImTev3aWe+M7StRnNrEo+0AmVKuDNDJlDzNib4qukoRKphm4F2QXAHXgSGtM5SsSwbvUsUrqb6jwjoHn3XGpLKsO32qt0gk0JEF+0EIUqO6fYqa36YmOCnMSMtw2ov/EIMRRVizdIrOLKJpIlBGzM0rmZ3Bi+Q0jVP/0=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1704894534;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 in-reply-to:in-reply-to:references:references;
	bh=gnRKbBeFcNQFaiuRaRp3lD5NO8M3XN/AVX/oAFbpmvY=;
	b=h2K7S4RyaAxK5D2XGwgwnX+BqT5zTesWfsb/lRSgFnL1KL/SsIHo5lpfXyUHGvcFlVUiyI
	f+OPhVW7rnKDDHbfvEQuBXcEXUCjoHx3wNj7Uvv4XjvcYfeyC6DeE55unIt9Ke5ohXHY5K
	tywB4JyNKnIm1O7IydTkJudiKLCVXdA=
Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com
 [209.85.221.71]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id
 us-mta-298-wPCPxaM3NUuxM0Ydih_Exw-1; Wed, 10 Jan 2024 08:48:52 -0500
X-MC-Unique: wPCPxaM3NUuxM0Ydih_Exw-1
Received: by mail-wr1-f71.google.com with SMTP id ffacd0b85a97d-33677bbd570so2639450f8f.3
        for <binutils@sourceware.org>; Wed, 10 Jan 2024 05:48:52 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1704894531; x=1705499331;
        h=mime-version:message-id:date:references:in-reply-to:subject:cc:to
         :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=gnRKbBeFcNQFaiuRaRp3lD5NO8M3XN/AVX/oAFbpmvY=;
        b=L15PhMm5Cegfe0iiq85di1l6h5cBEkbQqI0jh1geLSfKXVWI54F475XSOH5Ub+BK7O
         IaiqAumx55rUjBx9beAk9/cs/izSEkfYha1Gt1DnW/rUF3VMOzH8wTYoxjQrUsVR64F8
         R6gfBmaCUBc0EgEqWTA8NyBKs/Cfi6BXOm2XzLibIcSVnjp+2d4k19PxAjeYBO2wNxPV
         C8ZYUJA7UAhYytHaod2VbQEeHl9BwEYc7RQqnN/I9Sd2Yrxt0IGHP9jkcIzH4//2bKv2
         HMJigzz2Gl9autMyAAHEtWa/jnmNz0BqR3ZXOFD+l5ysK38OrlK04iWPzNUpJc9oUaMx
         AHRw==
X-Gm-Message-State: AOJu0Yyjf1FLNc7YuggF5puWYuDvP2mLYUP2UWeoQw9DcWZCV5XdBTyq
	6HkoT0DGTq7mSwPb+KK0dHBnIMEICVxlVY4/Ry/qhP67r5MhmMMHs2q1FaZO4+Ij/HVnoovX8sX
	X+3KKXtgaJrNaOdS/N4OV+rXgEqtSA/AcoQ==
X-Received: by 2002:adf:f5cf:0:b0:337:51d3:ba68 with SMTP id k15-20020adff5cf000000b0033751d3ba68mr587410wrp.54.1704894531308;
        Wed, 10 Jan 2024 05:48:51 -0800 (PST)
X-Google-Smtp-Source: AGHT+IHgJ6SMnf3tQgD7YhfvrJ3LPUcxnIiKyZdShtcG2qUxi/xE0EgLPXWQ5w2F022jvHblqMUJsA==
X-Received: by 2002:adf:f5cf:0:b0:337:51d3:ba68 with SMTP id k15-20020adff5cf000000b0033751d3ba68mr587402wrp.54.1704894530950;
        Wed, 10 Jan 2024 05:48:50 -0800 (PST)
Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185])
        by smtp.gmail.com with ESMTPSA id k12-20020a5d6e8c000000b00336f43fa654sm4963600wrz.22.2024.01.10.05.48.49
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 10 Jan 2024 05:48:50 -0800 (PST)
From: Andrew Burgess <aburgess@redhat.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: binutils@sourceware.org
Subject: Re: [PATCH] bfd: make _bfd_section_size_insane part of the public API
In-Reply-To: <f1eb82b6-4cda-4583-852a-4b986ce2aa9d@suse.com>
References: <b9ff0c22bf79023edc29f8f661b683542b771cb2.1701879297.git.aburgess@redhat.com>
 <0c54069e-d907-4f03-8d7f-15374d4bfd6a@suse.com>
 <87frz58n7j.fsf@redhat.com>
 <f1eb82b6-4cda-4583-852a-4b986ce2aa9d@suse.com>
Date: Wed, 10 Jan 2024 13:48:48 +0000
Message-ID: <877ckh8fjz.fsf@redhat.com>
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain
X-Spam-Status: No, score=-7.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <binutils.sourceware.org>

Jan Beulich <jbeulich@suse.com> writes:

> On 10.01.2024 12:03, Andrew Burgess wrote:
>> Jan Beulich <jbeulich@suse.com> writes:
>>> On 06.12.2023 17:15, Andrew Burgess wrote:
>>>> If a BFD user is making use of a function like
>>>> bfd_get_section_contents to read a section into a pre-allocated
>>>> buffer, then that BFD user might also want to make use of
>>>> _bfd_section_size_insane prior to allocating the buffer they intend to
>>>> use in order to validate that the buffer size that plan to allocate is
>>>> sane.
>>>>
>>>> This commit makes _bfd_section_size_insane public, by renaming it to
>>>> bfd_section_size_insane.
>>>>
>>>> I've updated the existing uses within bfd/, I don't believe this
>>>> function is used outside of bfd/ currently.
>>>>
>>>> One place that I plan to make use of this function is in
>>>> gdb/gdb_bfd.c, in the function gdb_bfd_get_full_section_contents.
>>>> This change isn't included in this commit, but will come later if/when
>>>> this has been merged into bfd.
>>>
>>> Having seen your ping (and no other response), let me share my view:
>>> This function implements a certain policy, internal to the library.
>>> By exposing it, you would make external users dependent upon this
>>> specific policy. What if later we change our view on what's "insane"?
>> 
>> I would expect and want external users to get the updated definition.
>
> And then break if we decide to lower the limit of "insane"?

You've said this again later in your reply, and I think this is the bit
that I'm having trouble with.

Why do you think things would break if the definition of "insane"
changed?

Of course, I ask this under the assumption that there's no broken code
in the consumer, clearly:

  if (bfd_section_size_insane (...))
    // Broken code here...
  else
    // Working code here...

Is going to break, and if *_insane() changes, then we're going to break
in more or less cases, but that seems slightly contrived.

More likely we're going to write:

  if (bfd_section_size_insane (...))
    // Handle cases where the section is invalid.
  else
    // Load the section and process it.

If the definition of *_insane() changes then I 100% want to enter the
error case more often.

Of course, you might say, but what if the code was working fine before?
Then changing *_insane() is sending more cases down the error path...

... but I would suggest that this would be an indication of a bug being
introduced into *_insane(), I mean, if such a thing happened then we'd
already run into problems.

So, in summary, the bit I'm not understanding here is why you feel
future changes to *_insane() would result in breakage in the consumer?

>
>> The function name of "insane" is a little unfortunate.  I think if the
>> function had a better name then this change would seem far less
>> contentious.  Consider a name of:
>> 
>>   validate_section_size_against_other_bfd_infernal_properties_of_the_elf_to_ensure_that_the_requested_size_is_likely_valid()
>> 
>>> IOW external consumers want to implement their own, independent policy
>>> (if so desired).
>> 
>> Sure, consumers _could_ implement their own policy, but IMHO, this would
>> be far worse than exposing the *_insane() function.
>> 
>> What I (as a consumer) want is to check if the size that the BFD library
>> is reporting is valid or not.  To do that I need to check details of the
>> ELF that I, as a BFD users, shouldn't have to bother with. (I thought)
>> the point of BFD was to abstract details of the file format.
>
> Well, your wording (correctly) makes an important distinction: "valid" !=
> "sane". If this was a validity check, no question would arise about it
> being okay to expose.

I didn't intend to make any such distinction.  I was trying to say that
this function _is_ a validity check, it's just unfortunately named as a
sanity check.

If I proposed renaming the function to 'is_bfd_section_size_valid()'
but left the implementation the same would you see a problem with this?
Would you object to making the *_valid() name public?

>
>>> Taking your intended usage example, things would be different if e.g.
>>> bfd_get_full_section_contents() itself used this check unconditionally.
>>> Then I could see a desire to have a way of checking up front whether
>>> allocating a buffer makes sense at all. And really I consider it
>>> questionable for bfd_get_full_section_contents(), when asked to
>>> allocate a buffer, to actually enforce such a library-internal policy.
>>> Like with exposing bfd_section_size_insane(), any change to the policy
>>> may affect external users in unexpected ways.
>> 
>> I don't understand this paragraph at all.  I'm sure I must be reading it
>> wrong, but it feels like you're saying we shouldn't use
>> bfd_section_size_insane(), which would mean we don't check for this one
>> particular error case, but I'm not sure why you'd feel that way.  Like I
>> said, I'm sure that's _not_ what you're suggesting, I just don't see
>> what it is you are trying to say.
>> 
>> You start this paragraph by saying "Taking your intended usage example,
>> ..." but don't really offer an alternative solution.  I'd be interested
>> if you did have some thoughts.
>
> The only alternative I can think about is for every component to enforce
> its own view of "sane".

And I'm 100% on board with the idea that BFD consumers might want to add
their own rules that determine good or bad sections.  But I don't
understand why you think the idea of having a baseline set of rules
within BFD makes no sense.  Surely this just forces every consumer to
copy the existing rules out of BFD and places a task on them to monitor
BFD and copy over any updates, which seems really wasteful...

... And sure, you can argue that a consumer might not want a particular
future change to *_insane(), but I feel such an argument is rather
contrived.  A BFD consumer might object to _any_ change in _any_ of
BFD's public APIs, that doesn't mean we don't make those changes in BFD,
nor does it mean that those consumers shouldn't use BFD.  I don't see
why _this_ particular API function is any different to any other API
function.

Thanks,
Andrew


>
>> Maybe a better solution is to change bfd_get_section_size() so that this
>> function doesn't always just return the recorded section size, but
>> instead returns 0 (or maybe -1 to indicate an error?) based on calling
>> bfd_section_size_insane()?  This feels far more risky as there's likely
>> many calls to bfd_section_size() in the wild that don't expect to get
>> back a size of 0.... but maybe that's a cleaner solution?
>
> Indeed, this risk makes such a change undesirable. Plus when merely
> dumping headers, for example, the true value will want returning (and
> displaying) anyway. I was rather thinking the other way around, to
> perhaps drop the "insane" checking, for being purely heuristic and
> prone to break at some (hopefully distant) future point. A reasonably
> well implemented allocation function ought to be able to fail without
> first trying hard to free up memory, when enough cannot be made
> available anyway.
>
> Jan