From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by sourceware.org (Postfix) with ESMTPS id A323B3858C50 for ; Sat, 1 Apr 2023 14:19:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A323B3858C50 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wm1-x32a.google.com with SMTP id o32so14521822wms.1 for ; Sat, 01 Apr 2023 07:19:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; t=1680358756; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:from:to:cc:subject:date:message-id:reply-to; bh=qRPprZTbuPmAAnA1po3Kl2bihAGA4uVHpCaYAPeWrOk=; b=jau/ugaq6sHogRq9z7U44slmnay9foHbSE2wvZpmxofca35l286I7uY006dtUH6uJz Omxha65H//wE3X8DcIOOeaM5ByXFXvVCR2ENxatbxgNNNg2fNqQx+OuauThNS/XQLDxq z6roBVt5/4enL5m5AkzSjdL+QzldqobOgFRspHtHTKlquaVyq0bXEYjrLJhON3GRnZGb StMVY8kjNeiDMrf93VaKPKaygfR9Q2/FQQG2MQvN+2/msriysx4FU3rEaJmlko/j0veo +GyiPCsS/irXdSZUwtycH6QR6ZwvbR1Di/qBFL85ASFD/zbfnazk2SSIaZhcG6MrWdWE zH6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680358756; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qRPprZTbuPmAAnA1po3Kl2bihAGA4uVHpCaYAPeWrOk=; b=EDBz8IrmVXVPgL/Yt8yik7lUwUqpIjQELTZBMzIG0UKA99AyOf1Tq0VaH5O2kb8eii 1k1K0Lk6yTlF0ZyER2dD1mXVU39RhlBcPYasc77ijF2cIHP7mLjPxzds5aB7FN7E/MVn 88uwghoNyhLvAYDb7oNV1msy9JGSBAX5PS+XVhO1JmSCIFlDl/rHY+z7uqmHXom9SySp keFNzF7vcJwFZ9y3p1DTPvl4JEzNNXwj2XvSQQxSo5icq9ktXg+oap1mJaOQgKYfUg5o XySScNH2mQpetMl0t6miY6vXsl0g08lkGXiA6/vqN0U9J1o0NN61MfqdQAZl2e+FP8g4 RZ3g== X-Gm-Message-State: AO0yUKV4l9QQlIc+azAU0USk8gX2AFKjDqOq6So7XXfRCHbMFSa1J/Yl BgfvMfmxxCvGni1q3Wx4hjelYzjx/NFJNw== X-Google-Smtp-Source: AK7set8JJXtwi/50kcIER1nWK6x2TEX5hR1allryLk0l+d91wmayccjiTS4ZSZCdXNJq0D2Iyp0BTg== X-Received: by 2002:a05:600c:22d4:b0:3ed:b56c:9496 with SMTP id 20-20020a05600c22d400b003edb56c9496mr22575972wmg.31.1680358755804; Sat, 01 Apr 2023 07:19:15 -0700 (PDT) Received: from smtpclient.apple ([213.55.220.237]) by smtp.gmail.com with ESMTPSA id u21-20020a05600c00d500b003ee4e99a8f6sm6217578wmm.33.2023.04.01.07.19.15 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sat, 01 Apr 2023 07:19:15 -0700 (PDT) From: Shengyu Huang Message-Id: <7B619AA7-C42D-4486-8017-72BB54297005@gmail.com> Content-Type: multipart/alternative; boundary="Apple-Mail=_B7061BDB-68FE-4F7A-83E1-E09F710E0BD1" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.200.110.1.12\)) Subject: Re: [GSoC][Static Analyzer] First proposal draft and a few more questions/requests Date: Sat, 1 Apr 2023 16:19:04 +0200 In-Reply-To: <0DEEAED6-46F3-4339-B600-AD35C92929B9@gmail.com> Cc: GCC Development To: David Malcolm References: <960EE623-1B17-4321-B77E-FBCD9496BE1F@gmail.com> <40fbb064f56845908f797400e5d9443b6cf97fe4.camel@redhat.com> <0e6a972dac60ad290d21a82b428cc76c4e8565e9.camel@redhat.com> <4CBE37A2-7D50-4ECC-9B70-951AB7176D9B@gmail.com> <3dfad33dec50c9f8bfb13e42a29cfb41b6aab457.camel@redhat.com> <2344350B-6AD2-46A5-A335-BD3ECBBAA4DF@gmail.com> <0BED8B95-9F1A-4350-A63C-616D31E405C3@gmail.com> <0DEEAED6-46F3-4339-B600-AD35C92929B9@gmail.com> X-Mailer: Apple Mail (2.3731.200.110.1.12) X-Spam-Status: No, score=1.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_BARRACUDACENTRAL,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: --Apple-Mail=_B7061BDB-68FE-4F7A-83E1-E09F710E0BD1 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 Hi Dave, >>=20 >> I has looked into compiling those files with the patch some time ago; >> looking at my notes, one issue was with this on-stack buffer: >> char extra[1024]; >> declared outside the loop. Inside the loop, it gets modified in >> various ways: >> extra[0] =3D '\0'; >> and >> if (fread(extra, 1, extsize, fpZip) =3D=3D extsize) { >> where the latter means "extra" becomes tainted. >>=20 >> However "extra" is barely used, and is effectively reset each time >> through the loop - but the analyzer doesn't figure that out. So the >> loop analysis explodes, as it tries to keep track of the possibility >> that "extra" is still tainted from previous iteration(s), despite the >> fact that it's going to be clobbered before it ever gets used. >>=20 >> So one fix might be to extend the state-purging code so that it somehow >> "sees" that "extra" gets clobbered before it gets used, and thus we can >> purge the tainted state from it. >=20 > Thanks for your notes. I think we may be talking about the same thing? If= you look at the updated proposal (I have changed it quite a lot since I fi= rst sent it out), you=E2=80=99ll see there is one relevant paper for state = merging (although it is slightly different from state purging, I think the = goal and general methodology is similar): https://dslab.epfl.ch/pubs/stateM= erging.pdf=20 >=20 > I was trying to say if some similar situation happened for other types of= checkers, I expected state explosion would also happen. I tried to constru= ct a similar example (with the same kind of reset and nested conditionals += a loop) but for double-free, so far no success yet. I=E2=80=99ll pick it u= p afterwards, at latest by next Saturday, because I need to prepare for a c= oming midterm on Friday. I will also put this test case to the proposal bec= ause it seems like a very good starting point for the project. As promised, below is a small example that causes state explosion without t= aint state machine involved. void test() { void *p; int a; scanf(=E2=80=9C%d", &a); =20=20 while (a > 0) { p =3D malloc (1024); if (a > 1) free(p); a--; } =20=20=20 if (a >0) free(p); } This example not only causes state explosion, but also reports false positi= ve of double-free. By the way, do you have any feedback regarding my proposal (https://docs.go= ogle.com/document/d/1MRI1R5DaX8kM6DaqRQsEri5Mx2FvHmWv13qe1W0Bj0g )? I am happy to allocate more time polishing the proposal if you find any= thing off there. If you prefer me sending it via email again (for ease of r= eference in the future maybe?), I am happy to do so as well. Best, Shengyu --Apple-Mail=_B7061BDB-68FE-4F7A-83E1-E09F710E0BD1--