From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: undefined references since newlib-3.2.0
To: newlib@sourceware.org
References: <87pna2hog9.fsf@keithp.com> <20200614145006.GG15174@raven.inka.de> <87k109h96z.fsf@keithp.com>
From: Dimitrios Glynos
Organization: CENSUS S.A.
Date: Mon, 15 Jun 2020 07:53:47 +0300
In-Reply-To: <87k109h96z.fsf@keithp.com>
Content-Type: text/plain; charset=utf-8
List-Id: Newlib mailing list

Hello,

I'm responsible for the CVEs that triggered these patches (see [1]) and
would just like to mention on-list that, during the initial (private)
discussion of these issues with both Jeff Johnston of newlib and Keith
Packard of newlib-nano/picolibc, I had provided the following
recommendation:

a) when the standard allows it, have the API return the NULL pointer to
   the caller with errno set to ENOMEM, so that the caller knows there's
   an out-of-memory (OOM) issue.

b) when the standard has no provision for ENOMEM, it would be good
   practice to introduce a function that would be called under OOM
   conditions.

OOM conditions are special, especially for embedded and bare metal
projects, as:

i) a project might not be able to simply call abort(), i.e. other
   important actions must be performed (e.g. communicate the error to
   the user or over some interface)

ii) a project might want to take special actions so that it survives
    this condition (e.g. relinquish some resources, unwind to some other
    state)

For these reasons, I had proposed for case (b) the introduction of a
function that would be implemented by the developer, and would handle
non-reported out-of-memory conditions.
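
A minimal sketch of such a developer-supplied function follows. The hook
name lib_oom_hook, the wrapper lib_alloc_or_hook and the toy fixed pool
standing in for the heap are all hypothetical; none of them exist in
newlib or picolibc. The library only declares the hook, so the link fails
unless the application defines it.

```c
#include <stddef.h>

/* Library side: declared but NOT defined by the library (hypothetical name).
   The application must provide the definition or the link fails. */
void lib_oom_hook(size_t requested);

/* Toy bump allocator standing in for the heap so that exhaustion is
   deterministic. Illustration only: it ignores alignment and never frees. */
static unsigned char pool[64];
static size_t pool_used;

static void *pool_alloc(size_t n)
{
    if (n > sizeof pool - pool_used)
        return NULL;
    void *p = &pool[pool_used];
    pool_used += n;
    return p;
}

/* Library side: hypothetical internal wrapper for code paths whose public
   interface has no way to report ENOMEM. */
static void *lib_alloc_or_hook(size_t n)
{
    void *p = pool_alloc(n);
    if (p == NULL)
        lib_oom_hook(n);   /* the application decides what happens next */
    return p;              /* still NULL if the hook chose to return */
}

/* Application side: record the event; a real project might report the
   error over some interface, release resources, or unwind to a safe state. */
static size_t last_failed_request;
static unsigned oom_events;

void lib_oom_hook(size_t requested)
{
    last_failed_request = requested;
    oom_events++;
}
```

Because the hook may return, library callers in this sketch still have to
cope with a NULL result; a project that cannot continue would simply not
return from the hook.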
Developers already supply implementations for system calls on bare bones
devices, and this could be a similar concept.

There are two ways of implementing this, one at compile-time (e.g. as you
define sbrk today) and another at runtime (similar to an atexit()
registration). However, I feel that the compile-time introduction of the
function conveys more clearly the message that "this is something that
you must take care of when using this library".

I don't recommend masquerading ENOMEM as some other error, as many
developers may miss that this is a point in the code where they should be
covering points (i) and/or (ii).

CVE-wise, newlib with the right compile-time configuration covers the
specific issues presented in the advisory, albeit with the caveats you
have all witnessed. It is wise to also check whether the callers of the
reported functions handle NULL returns correctly, as in picolibc there
were a number of callers that required a patch.

Finally, it would be beneficial for all if both projects (picolibc and
newlib) filed a comment to the standards group stating the cases where an
ENOMEM was found useful but was not covered by the standard.

Hope this helps.

Best regards,

Dimitris

[1] https://census-labs.com/news/2020/01/31/multiple-null-pointer-dereference-vulnerabilities-in-newlib/

On 14/6/20 8:02 PM, Keith Packard via Newlib wrote:
> Josef Wolf writes:
>
>>> atof
>>> atoff
>>> [ ... ]
>>> strtod
>>> strtof
>>> strtold
>>> wcstod
>>> wcstold
>>> strtodg
>>
>> Uh! Why on earth would those functions need to allocate memory?
>
> Because they are performing string to float conversions using code
> written in 1991 by David Gay, based on research done by Will Clinger
> which shows that exact conversion from arbitrary strings of decimal
> digits to fixed precision binary requires arbitrary precision
> arithmetic.
>
> https://dl.acm.org/doi/10.1145/93548.93557
>
>>> These now return infinity and set errno to ERANGE on allocation
>>> failure.
>>> (not ideal, but the options are limited)
>>>
>>> Here are some which do return a pointer, but do not document any errors:
>>>
>>> ecvt
>>> fcvt
>>
>> Maybe the documentation can be fixed?
>
> The documentation is based on a standard, and fixing that standard
> involves a bit of process...
>
>>> gcvt
>>> ecvtbuf
>>> fcvtbuf
>>> gcvtbuf
>>
>> Those get a pointer passed. No need to allocate memory.
>
> These functions are using code also written by David Gay to perform
> float to string conversion, based on research done by Guy Steele and Jon
> White in how to print floating point numbers accurately (which happened
> to be presented at the same conference as the work above!). In this
> work, they showed that exact conversion could be done using 1050 bit
> arithmetic to generate a 64-bit double result:
>
> https://dl.acm.org/doi/10.1145/93548.93559
>
> David Gay's code in newlib for both directions uses arbitrary precision
> arithmetic code found in newlib/libc/stdlib/mprec.c. This code allocates
> variable sized arrays of integers on the heap to hold all of the values.
> Before the eBalloc patch, none of these allocations were checked,
> leading to a rather long list of CVEs as the code could end up storing
> through a NULL pointer, which can cause security problems on some
> architectures.
>
>>> And here's a list of functions which I feel reasonable applications
>>> should not expect an allocation error from:
>>
>> I don't think any application should expect those functions to call exit()
>> and/or abort() either.
>
> I'm in complete agreement here. It's better to return an error that an
> application *might* check than to not give it any chance to recover at
> all.
>
>>> sprintf
>>> snprintf
>>
>> Those should return -1 on failure.
>>
>>> sscanf
>>
>> For this, ENOMEM is documented.
>
> Yes, but as I suggested, applications probably aren't expecting a call
> to sscanf to return EOF and set errno to ENOMEM.
>
> The real answer to your concerns is to replace the old arbitrary
> precision based float/string conversion code with code that uses results
> from new research by Ulf Adams.
>
> That research improves on Steele & White by reducing the precision
> required for exact 64-bit float to string conversion from 1050 bits to
> 128 bits. Adams also presents an algorithm using a similar technique to
> perform (a slightly weaker form of) exact string to float conversion in
> the same precision:
>
> https://dl.acm.org/citation.cfm?doid=3296979.3192369
>
> This reasonably small fixed precision can be statically allocated in
> memory, or allocated on the stack. Either of these solutions eliminates
> the use of the dynamic heap through malloc, and eliminates the need to
> change the specification of all of these functions to account for the
> heap usage in the existing newlib code.
>
> Ulf Adams also published code to implement this algorithm on GitHub:
>
> https://github.com/ulfjack/ryu
>
> I've ported this code to picolibc, a fork of newlib designed for
> embedded systems. That library has an alternate stdio implementation
> that doesn't need to use malloc, and it made sense to add this
> malloc-free float/string conversion code to that (the previous
> float/string conversion code in this implementation was not exact). When
> compiled using that code, picolibc will not return errors from malloc
> failures in the above cases because it does not call malloc in those
> code paths.
>
> The picolibc source repository also includes the stdio code from newlib
> which can be used in place of the default picolibc stdio code by setting
> a build option. That code has been modified to catch allocation
> failures and return the failures above. I did that in case someone
> wanted to use the original stdio code as I felt even this non-default
> code should not expose applications to arbitrary calls to abort from
> inside the library.
> I believe this code should be ported back to newlib
> so that at least newlib wouldn't call abort. Even better would be to
> have someone take a look at the Ryu paper and code and make that work in
> newlib.
>
> (The definition of 'exact' used in Ulf Adams' work offers the guarantee
> that you can print any floating point value, and then re-read that
> string to exactly reproduce the original floating point value in
> memory. This is weaker than what Clinger's research used; in that work,
> the goal was to generate the floating point value closest to an
> arbitrary string of decimal digits.)
>
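
The round-trip guarantee described at the end of the quoted message can be
checked from the application side with a few lines of standard C. This is
only a sketch: it assumes an IEEE 754 binary64 double and a C library whose
conversions are correctly rounded, and it uses "%.17g" as a stand-in for
Ryu's shortest output (17 significant digits are enough to round-trip a
binary64 value, but the string is usually longer than Ryu's). The helper
name round_trips is made up for the example. ERANGE is treated as a
failure because, per the quoted text, that is how an allocation failure in
strtod is now reported; it can also mean a genuine overflow or underflow.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if x survives a print/re-read round trip bit for bit,
   0 if the re-read value differs, and -1 if the library reported an
   error along the way. */
static int round_trips(double x)
{
    char buf[64];
    int n = snprintf(buf, sizeof buf, "%.17g", x);
    if (n < 0 || (size_t)n >= sizeof buf)
        return -1;                 /* formatting failed or was truncated */

    char *end;
    errno = 0;
    double y = strtod(buf, &end);
    if (end == buf || *end != '\0' || errno == ERANGE)
        return -1;                 /* parse error, range error, or OOM */

    return memcmp(&x, &y, sizeof x) == 0;
}
```

An application that depends on exact reproduction (for example when
storing calibration values as text) could run such a check once at
start-up against the library it was linked with.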