From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24189 invoked by alias); 2 Jul 2019 06:34:09 -0000 Mailing-List: contact bzip2-devel-help@sourceware.org; run by ezmlm Precedence: bulk List-Post: List-Help: List-Subscribe: List-Id: Sender: bzip2-devel-owner@sourceware.org Received: (qmail 24012 invoked by uid 89); 2 Jul 2019 06:34:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Checked: by ClamAV 0.100.3 on sourceware.org X-Virus-Found: No X-Spam-SWARE-Status: No, score=-6.9 required=5.0 tests=BAYES_00,GIT_PATCH_2 autolearn=ham version=3.3.1 spammy=disaster, HX-Languages-Length:1927 X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,GIT_PATCH_2 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on sourceware.org X-Spam-Level: X-HELO: black.bookofsand.co.uk DKIM-Filter: OpenDKIM Filter v2.11.0 squarecat1.vs.mythic-beasts.com 54B752934A DKIM-Filter: OpenDKIM Filter v2.11.0 mail.kestrel.ws CAD7523EEC Reply-To: jseward@acm.org Subject: Re: Alternative nSelectors patch (Was: bzip2 1.0.7 released) To: Mark Wielaard , Federico Mena Quintero , bzip2-devel@sourceware.org References: <20190627205837.GD9273@wildebeest.org> <0a2331bc6d0c8500c2c45df1e3ebe01b49ad5831.camel@klomp.org> <8c4d5cf2479253406dacdee122692cc77771afb9.camel@gnome.org> <9998ca428c4c7f895a543aa91941e58efb0d5291.camel@klomp.org> <308d9e82220760205ee673bf0505ee1815d48596.camel@klomp.org> From: Julian Seward Message-ID: <4f434101-5ce3-d757-2f61-c9e419911e00@acm.org> Date: Tue, 01 Jan 2019 00:00:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.2 MIME-Version: 1.0 In-Reply-To: <308d9e82220760205ee673bf0505ee1815d48596.camel@klomp.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV using ClamSMTP X-SW-Source: 2019-q3/txt/msg00003.txt.bz2 Hi Mark, This seems to me like a better patch than my proposal, so I retract my proposal and vote for this one instead. The one thing that concerned me was that, it would be a disaster -- having ignored all selectors above 18002 -- if subsequent decoding actually *did* manage somehow to try to read more than 18002 selectors out of s->selectorMtf, because we'd be reading uninitialised memory. But this seems to me can't happen because, after the selector-reading loop, you added + if (nSelectors > BZ_MAX_SELECTORS) + nSelectors = BZ_MAX_SELECTORS; and the following loop: /*--- Undo the MTF values for the selectors. ---*/ ... is the only place that reads s->selectorMtf, and then only for the range 0 .. nSelectors-1. So it seems good to me. Does this sync with your analysis? J On 01/07/2019 01:36, Mark Wielaard wrote: > Hi, > > On Fri, 2019-06-28 at 13:10 +0200, Mark Wielaard wrote: >>> It seems to me to be important to now split BZ_MAX_SELECTORS into these two >>> parts so as to make it clear to everybody that we're accepting (decompressing) >>> a slightly larger set of inputs than we create (a la that old saying about >>> network protocol implementations), so as to tolerate other compressors. >> >> That seems good. The attached patch does this and makes it possible to >> decode the problematic bz2 file. > > Sorry, it is a bit too late here to properly document this patch and > explain why I think it is a better one than the "split-max-selectors" > fix. But hopefully the new testsuite example and the comment in the > patch make clear what my thinking is. > > This resolved both the issue with the large file reported as with the > new test suite file (lbzip2/32767.bz2). The whole testsuite passes now, > even under valgrind and with gcc -fsanitize=undefined. > > Comments on the patch idea more than welcome. > > Thanks, > > Mark >