From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=Ekrk=IU=redhat.com=aburgess@sourceware.org>
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124])
	by sourceware.org (Postfix) with ESMTPS id DCAE3385840A
	for <binutils@sourceware.org>; Wed, 10 Jan 2024 16:20:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DCAE3385840A
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DCAE3385840A
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704903627; cv=none;
	b=qkeGIwJhdRZR8HmWpXoyfHkvS+8F9ELYH5z6w6ZHu/D9pAMUqoVDcP4HliB+AV84E5eFltq4D701at6qv8a9rZx2nuWI/R66Y2Xd7fsWxFaFkKiHVBUwGmIczbnrFBI2wh1QsSkfmZc3k59od+eGAtWySt1J7ulCc3IaSg0jo2E=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1704903627; c=relaxed/simple;
	bh=NCktlEbAouv7CQIzbUhSMleR9b5nWZYi25RcjZPClJE=;
	h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=ZFVSRgRkWeO7Hte8t/TnqPCBn9syLJq5QTiW69l3gt1zDHauFWWha1afyIDm5tXCVR+6arkL6RqbaOr+NjMzOMGHDOvPxV7p3vJrkm6AXGfWlBmW9ChsBi+GaFxJFAzwpn/zyI2APgSLnFlkL/VAnbsInv/j6P+327yizOXPJBA=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1704903624;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 in-reply-to:in-reply-to:references:references;
	bh=eFtnD0r5P5Ox7XW2qTcpIBEnyAH7N3a40hI54QfsqxI=;
	b=OAwGagGcIvbSdcPGiVgC4J5fJvqR9D6avy9ne84G/hf3q11CkjQHHoCkQoGhXqL0QTK64B
	MuWaG0pb5FTRGzinFg2j+H7Tw3uCmzg30NI0zd74vFP7l+J0zPb3o0HPRz8knWwTO7fiem
	lnZom2kR6isd2ki8mP9c2eArCUXLlGs=
Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com
 [209.85.221.71]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id
 us-mta-455-kOs2XNwBNbiT53jOh3rHxg-1; Wed, 10 Jan 2024 11:20:22 -0500
X-MC-Unique: kOs2XNwBNbiT53jOh3rHxg-1
Received: by mail-wr1-f71.google.com with SMTP id ffacd0b85a97d-33693a2dae7so2491335f8f.3
        for <binutils@sourceware.org>; Wed, 10 Jan 2024 08:20:22 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1704903621; x=1705508421;
        h=mime-version:message-id:date:references:in-reply-to:subject:cc:to
         :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=eFtnD0r5P5Ox7XW2qTcpIBEnyAH7N3a40hI54QfsqxI=;
        b=wjUMTAPaM4HoQ6YUZ22S5LknpSLsiL00TcJ6vYg1pr+mozME/3i+L5aZeDMtDmE3XM
         wvC71RU3hMhONEoVeCWrEiDPyZFGu1zhELCGSh/Bq9F5FV7qTHkkivM8A2wROv03+AVw
         Rbe5lTPtYjL9Y15uJ2/uElgBlK+lsFLfR7e/Lj0BJAkkQrnVdyvgreUdfwYkVNE/FK1h
         FzVQVIp8Vhv6CTGowPK6Nx/TCvsz590ELhtOBDnvDEAJRc7fDrbz6Ap6QLDi9K//JEnU
         I4D3AiNBkn/SP6cq65VJclLbJfEvSErFe96LwViXuWpPXVe2VIKfqtPTUCb8Hj7+Upcs
         Gv0Q==
X-Gm-Message-State: AOJu0Ywq0TGSqdnBtKpu5PBPUVFkKPeO1YlmdqoBGhr7D6ix4POFOZ0o
	xqZA4Fy6NGsZF83le6dtf4sxZeCTzmI3XCwKM517jZgT9U61QxnnN0+fcmzdqDn9wVu8jGfpme+
	KmE1GayPIkz6BnwagUDPz/21X+Aw2pJyBAA==
X-Received: by 2002:a7b:cb41:0:b0:40e:43e4:a015 with SMTP id v1-20020a7bcb41000000b0040e43e4a015mr665528wmj.74.1704903620825;
        Wed, 10 Jan 2024 08:20:20 -0800 (PST)
X-Google-Smtp-Source: AGHT+IHJDD7LR0TM8SUgjRef3HMVAZeRAOIKHerrsn35/XYqMn/SaqCEAD/0/yIDZysIMENPKTSs+w==
X-Received: by 2002:a7b:cb41:0:b0:40e:43e4:a015 with SMTP id v1-20020a7bcb41000000b0040e43e4a015mr665522wmj.74.1704903620408;
        Wed, 10 Jan 2024 08:20:20 -0800 (PST)
Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185])
        by smtp.gmail.com with ESMTPSA id t21-20020a05600c451500b0040e3ac9f4c8sm2657808wmo.28.2024.01.10.08.20.19
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 10 Jan 2024 08:20:20 -0800 (PST)
From: Andrew Burgess <aburgess@redhat.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: binutils@sourceware.org
Subject: Re: [PATCH] bfd: make _bfd_section_size_insane part of the public API
In-Reply-To: <8803d3e2-688e-43ac-b710-5237fab8b054@suse.com>
References: <b9ff0c22bf79023edc29f8f661b683542b771cb2.1701879297.git.aburgess@redhat.com>
 <0c54069e-d907-4f03-8d7f-15374d4bfd6a@suse.com>
 <87frz58n7j.fsf@redhat.com>
 <f1eb82b6-4cda-4583-852a-4b986ce2aa9d@suse.com>
 <877ckh8fjz.fsf@redhat.com>
 <8803d3e2-688e-43ac-b710-5237fab8b054@suse.com>
Date: Wed, 10 Jan 2024 16:20:19 +0000
Message-ID: <874jfl88jg.fsf@redhat.com>
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain
X-Spam-Status: No, score=-7.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <binutils.sourceware.org>

Jan Beulich <jbeulich@suse.com> writes:

> On 10.01.2024 14:48, Andrew Burgess wrote:
>> Jan Beulich <jbeulich@suse.com> writes:
>>> On 10.01.2024 12:03, Andrew Burgess wrote:
>>>> Jan Beulich <jbeulich@suse.com> writes:
>>>>> On 06.12.2023 17:15, Andrew Burgess wrote:
>>>>>> If a BFD user is making use of a function like
>>>>>> bfd_get_section_contents to read a section into a pre-allocated
>>>>>> buffer, then that BFD user might also want to make use of
>>>>>> _bfd_section_size_insane prior to allocating the buffer they intend to
>>>>>> use in order to validate that the buffer size that plan to allocate is
>>>>>> sane.
>>>>>>
>>>>>> This commit makes _bfd_section_size_insane public, by renaming it to
>>>>>> bfd_section_size_insane.
>>>>>>
>>>>>> I've updated the existing uses within bfd/, I don't believe this
>>>>>> function is used outside of bfd/ currently.
>>>>>>
>>>>>> One place that I plan to make use of this function is in
>>>>>> gdb/gdb_bfd.c, in the function gdb_bfd_get_full_section_contents.
>>>>>> This change isn't included in this commit, but will come later if/when
>>>>>> this has been merged into bfd.
>>>>>
>>>>> Having seen your ping (and no other response), let me share my view:
>>>>> This function implements a certain policy, internal to the library.
>>>>> By exposing it, you would make external users dependent upon this
>>>>> specific policy. What if later we change our view on what's "insane"?
>>>>
>>>> I would expect and want external users to get the updated definition.
>>>
>>> And then break if we decide to lower the limit of "insane"?
>> 
>> You've said this again later in your reply, and I think this is the bit
>> that I'm having trouble with.
>> 
>> Why do you think things would break if the definition of "insane"
>> changed?
>> 
>> Of course, I ask this under the assumption that there's no broken code
>> in the consumer, clearly:
>> 
>>   if (bfd_section_size_insane (...))
>>     // Broken code here...
>>   else
>>     // Working code here...
>> 
>> Is going to break, and if *_insane() changes, then we're going to break
>> in more or less cases, but that seems slightly contrived.
>> 
>> More likely we're going to write:
>> 
>>   if (bfd_section_size_insane (...))
>>     // Handle cases where the section is invalid.
>>   else
>>     // Load the section and process it.
>> 
>> If the definition of *_insane() changes then I 100% want to enter the
>> error case more often.
>> 
>> Of course, you might say, but what if the code was working fine before?
>> Then changing *_insane() is sending more cases down the error path...
>> 
>> ... but I would suggest that this would be an indication of a bug being
>> introduced into *_insane(), I mean, if such a thing happened then we'd
>> already run into problems.
>> 
>> So, in summary, the bit I'm not understanding here is why you feel
>> future changes to *_insane() would result in breakage in the consumer?
>
> Let's assume "insane" is anything above 4G right now (I don't recall what
> the real value is), and people are working with 3G objects, using any
> arbitrary consumer of libbfd. Then we choose lo lower the limit to 2G.

We're talking about the same _bfd_section_size_insane in bfd/section.c,
right?

I don't see any arbitrary limit in there.  If there is such a limit then
it's buried within all the very non-arbitrary sanity checks.

This function isn't answering: "is this section larger than X", it is
instead answering: "can this section possibly be read from this file".

It doesn't matter how much memory your future computer has, if the ELF
is 1M in size, and the section claims to be 2M in size, you aren't going
to be reading that section out of that ELF.

>
> While surely the limit is higher than the given example, the pattern
> remains: People working fine with the present limit might find themselves
> hosed if that limit was lowered. Hence why I dislike such heuristics, in
> turn meaning I would like to not see them put into wider use.
>
>>>> The function name of "insane" is a little unfortunate.  I think if the
>>>> function had a better name then this change would seem far less
>>>> contentious.  Consider a name of:
>>>>
>>>>   validate_section_size_against_other_bfd_infernal_properties_of_the_elf_to_ensure_that_the_requested_size_is_likely_valid()
>>>>
>>>>> IOW external consumers want to implement their own, independent policy
>>>>> (if so desired).
>>>>
>>>> Sure, consumers _could_ implement their own policy, but IMHO, this would
>>>> be far worse than exposing the *_insane() function.
>>>>
>>>> What I (as a consumer) want is to check if the size that the BFD library
>>>> is reporting is valid or not.  To do that I need to check details of the
>>>> ELF that I, as a BFD users, shouldn't have to bother with. (I thought)
>>>> the point of BFD was to abstract details of the file format.
>>>
>>> Well, your wording (correctly) makes an important distinction: "valid" !=
>>> "sane". If this was a validity check, no question would arise about it
>>> being okay to expose.
>> 
>> I didn't intend to make any such distinction.  I was trying to say that
>> this function _is_ a validity check, it's just unfortunately named as a
>> sanity check.
>
> This is then where we disagree. There's no limit the ELF spec imposes
> on sh_size, just to take the most common (for GNU binutils) example.
> Hence there simply are no invalid size values (as far as ELF is
> concerned), and libbfd shouldn't make up any.

Are you sure?  The "spec" (I use quotes because the only ELF spec I've
ever read is pretty light weight) might not say it, but an sh_size for a
section that is bigger than the ELF itself sounds pretty invalid too me.

[ Yes: I'm ignoring !SHT_NOBITS, and compressed sections here for
  simplicity. ]

>
>> If I proposed renaming the function to 'is_bfd_section_size_valid()'
>> but left the implementation the same would you see a problem with this?
>
> Yes, because the size check failing _does not_ mean the size is invalid.
> Memory sizes grow all the time, storage sizes grow all the time, and
> object file as well as section sizes also (almost) only ever grow.
>
>> Would you object to making the *_valid() name public?
>
> If the function truly performed validity checks, of course I wouldn't
> object.

My claim is that the function, as written today, is a validity check.

The only bit I think which could be described as arbitrary relates to
the compressed section handling.  I don't fully understand the comment
there, but the comment is pretty clear the 10x is an arbitrary heuristic
rather than a calculated value.  Obviously this could be improved if it
ever became a problem (calculate actual compressed/uncompressed sizes).

You worries don't seem to align with what I think the function actually
does.

Thanks,
Andrew