From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-vk1-xa35.google.com (mail-vk1-xa35.google.com [IPv6:2607:f8b0:4864:20::a35]) by sourceware.org (Postfix) with ESMTPS id 6BD2E389C41B for ; Mon, 31 Jan 2022 15:57:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6BD2E389C41B Received: by mail-vk1-xa35.google.com with SMTP id b77so8523353vka.11 for ; Mon, 31 Jan 2022 07:57:28 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=2izzQKLHDaTTkoeEFW6y7cQ+5k+8qBZEHm8Qd9H4Bz8=; b=pHh1DZB/1quPb0WDsEyXVukVDtYUliBCJkfrvHbICPBI9QVy/X7FuWMbDwPjO2opsc luI2WYQQtGg9XHP8QUj/8SFkQYfw5C653A8ZSoodVPBAPRUirZWa/yBk2CElHd3SC9dL 9euTCD9Y7nfXAnb3u9/XLWvMlbbO5+sxsKDnaZb9hTVJZSp+ERysvAAVGFa1TRvVCZ7G VfSLOUWVTTNZRsUkmNJhcDhSveF2YjanM6JVoXJPbP3HfibbVWG/kuWu7iTtJ1KcRj47 9478ftvTqet+tWcKu40dykrByLlOr5oYpVyIONvlimRCH6JvCu2TnZ8f04rWrR0OkdDf 6Gww== X-Gm-Message-State: AOAM530KhWm6OonBggWTq6G7rE+6m1EYf3E6AJxVeMXfmS5K7c5y1jMl konsPpoPj3msY3Mh7xCGXVQTGTeRkTjsMVHXUw1jNw== X-Google-Smtp-Source: ABdhPJxDbta1E1isZt20dSwB4uDiRwiuovb67ZeDD3CgXCRUqaNFQygvAM22TpLjGoI0DkbvGvQOFydoxRYbHEVd8hw= X-Received: by 2002:a1f:5d84:: with SMTP id r126mr8067249vkb.33.1643644647422; Mon, 31 Jan 2022 07:57:27 -0800 (PST) MIME-Version: 1.0 References: <87wnijv616.fsf@dirichlet.schwinge.homeip.net> <87r18oynpn.fsf@euler.schwinge.homeip.net> In-Reply-To: <87r18oynpn.fsf@euler.schwinge.homeip.net> From: Giuliano Procida Date: Mon, 31 Jan 2022 15:56:50 +0000 Message-ID: Subject: Re: 'src/abg-dwarf-reader.cc:compare_dies_string_attribute_value' optimization To: Thomas Schwinge Cc: libabigail@sourceware.org, Dodji Seketeli , Mark Wielaard X-Spam-Status: No, score=-22.4 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH, HTML_MESSAGE, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Jan 2022 15:57:36 -0000 Hi. On Mon, 31 Jan 2022 at 14:38, Thomas Schwinge wrote: > Hi! > > On 2022-01-28T23:39:49+0100, I wrote: > > It's late on a Friday evening, and very well this may just be me not > > understanding libabigail/DWARF ;-) -- in context of > > or > > , I did a > > simple experiment to disable the > > 'src/abg-dwarf-reader.cc:compare_dies_string_attribute_value' > > optimization: > > OK, something doesn't seem right here: > > > --- src/abg-dwarf-reader.cc > > +++ src/abg-dwarf-reader.cc > > [...] > > /// This function is a fast routine (optimization) to compare the > > /// values of two string attributes of two DIEs. > > /// > > /// @param l the first DIE to consider. > > /// > > /// @param r the second DIE to consider. > > /// > > /// @param attr_name the name of the attribute to compare, on the > two > > /// DIEs above. > > /// > > /// @param result out parameter. This is set to the result of the > > /// comparison. If the value of attribute @p attr_name on DIE @p = l > > /// equals the value of attribute @p attr_name on DIE @p r, then t= he > > /// the argument of this parameter is set to true. Otherwise, it'= s > > /// set to false. Note that the argument of this parameter is set > iff > > /// the function returned true. > > /// > > /// @return true iff the comparison could be performed. There are > > /// cases in which the comparison cannot be performed. For > instance, > > /// if one of the DIEs does not have the attribute @p attr_name. = In > > /// any case, if this function returns true, then the parameter @p > > /// result is set to the result of the comparison. > > static bool > > compare_dies_string_attribute_value(const Dwarf_Die *l, const > Dwarf_Die *r, > > unsigned attr_name, > > bool &result) > > { > > Dwarf_Attribute l_attr, r_attr; > > if (!dwarf_attr_integrate(const_cast(l), attr_name, > &l_attr) > > || !dwarf_attr_integrate(const_cast(r), > attr_name, &r_attr)) > > return false; > > > > ABG_ASSERT(l_attr.form =3D=3D DW_FORM_strp > > || l_attr.form =3D=3D DW_FORM_string > > || l_attr.form =3D=3D DW_FORM_GNU_strp_alt > > || form_is_DW_FORM_strx(l_attr.form) > > || form_is_DW_FORM_line_strp(l_attr.form)); > > > > ABG_ASSERT(r_attr.form =3D=3D DW_FORM_strp > > || r_attr.form =3D=3D DW_FORM_string > > || r_attr.form =3D=3D DW_FORM_GNU_strp_alt > > || form_is_DW_FORM_strx(r_attr.form) > > || form_is_DW_FORM_line_strp(r_attr.form)); > > > > +#if 0 > > if ((l_attr.form =3D=3D DW_FORM_strp > > && r_attr.form =3D=3D DW_FORM_strp) > > || (l_attr.form =3D=3D DW_FORM_GNU_strp_alt > > && r_attr.form =3D=3D DW_FORM_GNU_strp_alt) > > || (form_is_DW_FORM_strx(l_attr.form) > > && form_is_DW_FORM_strx(r_attr.form)) > > || (form_is_DW_FORM_line_strp(l_attr.form) > > && form_is_DW_FORM_line_strp(r_attr.form))) > > { > > // So these string attributes are actually pointers into a > > // string table. The string table is most likely > de-duplicated > > // so comparing the *values* of the pointers should be enoug= h. > > // > > // This is the fast path. > > if (l_attr.valp =3D=3D r_attr.valp) > > result =3D true; > > else if (l_attr.valp && r_attr.valp) > > result =3D *l_attr.valp =3D=3D *r_attr.valp; > > else > > result =3D false; > > return true; > > } > > +#endif > > > > // If we reached this point it means we couldn't use the fast pa= th > > // because the string atttributes are strings that are "inline" = in > > // the debug info section. Let's just compare them the slow and > > // obvious way. > > string l_str =3D die_string_attribute(l, attr_name), > > r_str =3D die_string_attribute(r, attr_name); > > result =3D l_str =3D=3D r_str; > > > > return true; > > } > > I re-enabled the "optimized" code, but added a simple 'assert' to verify > that its 'result' does match the "slow" 'result'. (This indeed does > cause a number of more testsuite failures than the single one that I'd > mentioned before!) Here's one case where the 'result's don't match. > > First, the "slow" code: > > (gdb) call die_string_attribute(l, attr_name) > $20 =3D "dwarf_getarangeinfo.c" > (gdb) call die_string_attribute(r, attr_name) > $21 =3D "dwarf_onearange.c" > > Clearly not identical; thus ought to 'return false;'. However, the > "optimized" code: > > (gdb) ptype l_attr > type =3D struct Dwarf_Attribute { > unsigned int code; > unsigned int form; > unsigned char *valp; > Dwarf_CU *cu; > } > (gdb) print l_attr > $5 =3D {code =3D 3, form =3D 14, valp =3D 0x7ffff45c97d9 "\351\024", = cu =3D > 0x7fffe8614670} > (gdb) print r_attr > $6 =3D {code =3D 3, form =3D 14, valp =3D 0x7ffff45c970d "\351\r", cu= =3D > 0x7fffe8614550} > > So these are both 'DW_FORM_strp', eligible for "optimized" comparison: > > (gdb) print l_attr.valp =3D=3D r_attr.valp > $7 =3D false > > OK. Then: > > (gdb) print l_attr.valp && r_attr.valp > $8 =3D true > (gdb) print *l_attr.valp =3D=3D *r_attr.valp > $9 =3D true > > This now does 'return true;'! > > Given 'unsigned char *valp', what this only checks is that one byte at > 'l_attr.valp' equals one byte at 'r_attr.valp' -- which evidently isn't > sufficient to demonstrate that 'l_attr' and 'r_attr' do match. Therefore > it seems that this optimization is not correct? > > Looks like you've found one bug. It has compared addresses and moved on to test data. It could do strcmp at this point, assuming these things are null-terminated strings, which is the case for DW_FORM_strp but is not for other things. But there is another bug. I've instrumented the code and found cases (in the test suite) where the fast path returns false, but the slow path returns true. Both sides are DW_FORM_strx1. The data are not strings but indexes and that the same string appears at different indexes. > I understand the purpose here is that for the supported and matching > 'DW_FORM_*', not to call 'dwarf_formstring' (via the "slow" code's > 'die_string_attribute'), because we don't actually need the strings, but > just identifiers (like, table indices) to compare. But how to do this > properly? > > The optimisation could be restricted to check pointers and indexes only and also no longer assume distinct references mean distinct data. Giuliano. > > Gr=C3=BC=C3=9Fe > Thomas > > > > My assumption was that this shouldn't change anything other than possib= ly > > regress performance. However: > > > > [-PASS:-]{+FAIL:+} runtestreaddwarf > > > > --- build-libabigail/tests/runtestreaddwarf.log > > +++ build-libabigail/tests/runtestreaddwarf.log > > [...] > > -PASS runtestreaddwarf (exit status: 0) > > +--- > [...]/source-libabigail/tests/data/test-read-dwarf/PR25007-sdhci.ko.abi > 2022-01-28 22:48:07.049805043 +0100 > > ++++ > [...]/build-libabigail/tests/output/test-read-dwarf/PR25007-sdhci.ko.abi > 2022-01-28 23:15:30.261483063 +0100 > > +@@ -10107,10 +10107,19 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10138,6 +10147,10 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10181,6 +10194,16 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10370,6 +10393,13 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10519,6 +10549,31 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10531,6 +10586,17 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10591,6 +10657,10 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10631,6 +10701,22 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10746,6 +10832,10 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10770,6 +10860,14 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10815,6 +10913,9 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10859,6 +10960,15 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10885,11 +10995,44 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10923,6 +11066,18 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10936,6 +11091,14 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -10949,6 +11112,10 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -11075,12 +11242,26 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -11124,6 +11305,15 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -11192,10 +11382,19 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -11210,6 +11409,10 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -11224,6 +11427,9 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > + > > + > > + > > +@@ -11241,223 +11447,6 @@ > > + > > + > > + > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > +- > > + > > + > > + > > +@@ -11477,48 +11466,19 @@ > > + > > + > > + > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > ++ > > + > > + comp-dir-path=3D'/ws/android/kernel/aosp/common-mainline/out/android-main= line/common' > language=3D'LANG_C89'> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1555'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1556'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1557'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1558'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1559'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1560'/> > > +- is-declaration-only=3D'yes' id=3D'type-id-1561'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1562'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1563'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1564'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1565'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1566'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1567'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1568'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1569'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1570'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1571'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1572'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1573'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1574'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1575'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1576'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1577'/> > > +- is-declaration-only=3D'yes' id=3D'type-id-1578'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1579'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1580'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1581'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1582'/> > > +- is-declaration-only=3D'yes' id=3D'type-id-1583'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1584'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1585'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1586'/> > > +- is-declaration-only=3D'yes' id=3D'type-id-1587'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1588'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1589'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1590'/> > > +- is-declaration-only=3D'yes' id=3D'type-id-1591'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1592'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1593'/> > > +- visibility=3D'default' is-declaration-only=3D'yes' id=3D'type-id-1594'/> > > + mangled-name=3D'__this_module' visibility=3D'default' > filepath=3D'/ws/android/kernel/aosp/common-mainline/out/android-mainline/= common/drivers/mmc/host/sdhci.mod.c' > line=3D'11' column=3D'1' elf-symbol-id=3D'__this_module'/> > > + > > + > > +ABIs differ: > > > +[...]/source-libabigail/tests/data/test-read-dwarf/PR25007-sdhci.ko.abi > > +and: > > > +[...]/build-libabigail/tests/output/test-read-dwarf/PR25007-sdhci.ko.ab= i > > + > > +FAIL runtestreaddwarf (exit status: 1) > > > > Only that one test case. > > > > The reordered 'function-type's -- but I've not yet verified whether > > they're really just reordered -- may hint towards a sorting stability > > issue, but there are also the disappearing 'class-decl's. > > > > Unless somebody points out any misunderstanding on my side, I suppose I > > shall try to figure out what's going wrong. > > > > > > Gr=C3=BC=C3=9Fe > > Thomas > ----------------- > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 2= 01, > 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: > Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaft: M=C3=BCnchen; > Registergericht M=C3=BCnchen, HRB 106955 >