From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <bzip2-devel-return-40-listarch-bzip2-devel=sourceware.org@sourceware.org>
Received: (qmail 24189 invoked by alias); 2 Jul 2019 06:34:09 -0000
Mailing-List: contact bzip2-devel-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Post: <mailto:bzip2-devel@sourceware.org>
List-Help: <mailto:bzip2-devel-help@sourceware.org>
List-Subscribe: <mailto:bzip2-devel-subscribe@sourceware.org>
List-Id: <bzip2-devel.sourceware.org>
Sender: bzip2-devel-owner@sourceware.org
Received: (qmail 24012 invoked by uid 89); 2 Jul 2019 06:34:09 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Checked: by ClamAV 0.100.3 on sourceware.org
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-6.9 required=5.0 tests=BAYES_00,GIT_PATCH_2 autolearn=ham version=3.3.1 spammy=disaster, HX-Languages-Length:1927
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,GIT_PATCH_2 autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on sourceware.org
X-Spam-Level: 
X-HELO: black.bookofsand.co.uk
DKIM-Filter: OpenDKIM Filter v2.11.0 squarecat1.vs.mythic-beasts.com 54B752934A
DKIM-Filter: OpenDKIM Filter v2.11.0 mail.kestrel.ws CAD7523EEC
Reply-To: jseward@acm.org
Subject: Re: Alternative nSelectors patch (Was: bzip2 1.0.7 released)
To: Mark Wielaard <mark@klomp.org>,
 Federico Mena Quintero <federico@gnome.org>, bzip2-devel@sourceware.org
References: <b8aab785a67113e4f50c54a6cb59129c11f805b6.camel@klomp.org>
 <20190627205837.GD9273@wildebeest.org>
 <0a2331bc6d0c8500c2c45df1e3ebe01b49ad5831.camel@klomp.org>
 <8c4d5cf2479253406dacdee122692cc77771afb9.camel@gnome.org>
 <e73799c0-cb3b-c9f8-84c8-53d028dbe1d5@acm.org>
 <9998ca428c4c7f895a543aa91941e58efb0d5291.camel@klomp.org>
 <308d9e82220760205ee673bf0505ee1815d48596.camel@klomp.org>
From: Julian Seward <jseward@acm.org>
Message-ID: <4f434101-5ce3-d757-2f61-c9e419911e00@acm.org>
Date: Tue, 01 Jan 2019 00:00:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.7.2
MIME-Version: 1.0
In-Reply-To: <308d9e82220760205ee673bf0505ee1815d48596.camel@klomp.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV using ClamSMTP
X-SW-Source: 2019-q3/txt/msg00003.txt.bz2


Hi Mark,

This seems to me like a better patch than my proposal, so I retract my
proposal and vote for this one instead.

The one thing that concerned me was that, it would be a disaster -- having
ignored all selectors above 18002 -- if subsequent decoding actually *did*
manage somehow to try to read more than 18002 selectors out of s->selectorMtf,
because we'd be reading uninitialised memory.  But this seems to me can't
happen because, after the selector-reading loop, you added

+      if (nSelectors > BZ_MAX_SELECTORS)
+        nSelectors = BZ_MAX_SELECTORS;

and the following loop:

       /*--- Undo the MTF values for the selectors. ---*/
       ...

is the only place that reads s->selectorMtf, and then only for the range
0 .. nSelectors-1.

So it seems good to me.  Does this sync with your analysis?

J


On 01/07/2019 01:36, Mark Wielaard wrote:
> Hi,
> 
> On Fri, 2019-06-28 at 13:10 +0200, Mark Wielaard wrote:
>>> It seems to me to be important to now split BZ_MAX_SELECTORS into these two
>>> parts so as to make it clear to everybody that we're accepting (decompressing)
>>> a slightly larger set of inputs than we create (a la that old saying about
>>> network protocol implementations), so as to tolerate other compressors.
>>
>> That seems good. The attached patch does this and makes it possible to
>> decode the problematic bz2 file.
> 
> Sorry, it is a bit too late here to properly document this patch and
> explain why I think it is a better one than the "split-max-selectors"
> fix. But hopefully the new testsuite example and the comment in the
> patch make clear what my thinking is.
> 
> This resolved both the issue with the large file reported as with the
> new test suite file (lbzip2/32767.bz2). The whole testsuite passes now,
> even under valgrind and with gcc -fsanitize=undefined.
> 
> Comments on the patch idea more than welcome.
> 
> Thanks,
> 
> Mark
>