From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 17 Jan 2022 18:03:07 +0000
From: Matthias Maennich
To: Dodji Seketeli
Cc: libabigail@sourceware.org, gprocida@google.com, kernel-team@android.com
Subject: Re: [PATCH 3/5] XML writer: track emitted types by bare pointer
References: <20211203114622.2944173-1-maennich@google.com>
 <20211203114622.2944173-4-maennich@google.com>
 <87ilvwrawm.fsf@seketeli.org>
 <87wnj7cyqz.fsf@seketeli.org>
In-Reply-To: <87wnj7cyqz.fsf@seketeli.org>
Content-Type: text/plain; charset=utf-8; format=flowed
List-Id: Mailing list of the Libabigail project

Thanks, Dodji, for having a look and for sharing your thoughts! That is,
as always, very helpful for getting the full picture.

On Mon, Jan 10, 2022 at 06:00:04PM +0100, Dodji Seketeli wrote:
>Matthias Maennich wrote:
>
>[...]
>
>> If the XML writer considers two equivalent declaration-only types to be
>> different, one question to ask is: what is the real difference, that is,
>> how will this affect the outcome of abidiff?
>
>The problem is not necessarily at the abidiff level per se.
>
>The problem would be duplication of decl-only types in the abixml
>output, I think, and maybe infinite loops in those cases. The infinite
>loops are easy to debug, though. So I am not concerned about them.
>

Agreed, there might be some duplication. See, for example, the
commentary about tests/data/test-read-dwarf/PR22122-libftdc.so.abi in
PATCH 4/5.

This series is specifically meant to eliminate the risk of infinite
loops in the libabigail version we have downstream, and also to improve
performance. After these fixes there are some more changes that should
make infinite loops even less likely.
In our case the infinite loops only happened when using Clang's standard
library (hash tables), and they were not so easy to debug!

>> If the types never change
>> (kind, name or declaration/definition status), nothing should ever be
>> reported. If a type does change... there are two possibilities: either
>> the types were really one type and now perhaps abidiff reports diffs for
>> the same name in two different ways; or the types were really two
>> different ones and abidiff has a simpler job. In my experience, abidiff
>> doesn't always report declaration-only/defined transitions. It doesn't
>> sound like there will be any really bad impact on diffs from having this
>> kind of duplication. However, if someone can come up with a test case of
>> the kind you mention, that would give some extra reassurance.
>
>The reason why I was pointing to this "general" issue is to make sure
>you are aware of this. As type duplications in abixml was something you
>guys were tracking (and rightly so) I thought I'd point out that we
>still have the risk here.
>
>But because the type id map (writer_context::m_type_id_map) is not
>affected, the duplicated types will correctly be identified as such by
>the reader; thus I don't think abidiff is going to be affected.
>

Duplicates with different type ids could still appear after these
changes. They should not hurt abidiff, though, and they may point to
problems earlier in the pipeline (even in the compiler: we found a Clang
bug during the investigation).

Duplicates with the same type id can be conflicting or non-conflicting.
Non-conflicting duplicates are not ideal, but abidiff can handle them.
Conflicting duplicates mean we have a problem interpreting the XML:
which definition is the right one?

PATCH 4/5 does indeed affect the type id map, specifically so that we
avoid the risk of conflicting definitions.

>>>So maybe it would be better to have an equality operator that uses
>>>is_non_canonicalized_type() to detect those rare cases and use
>>>structural comparison in those cases?
>>
>> That might come at higher cost than it is beneficial.
>
>I could not tell, as I don't necessarily have the right binaries at
>hand. I trust you.

It takes not only the binaries but also the toolchain (the prebuilt
Clang version that we use to build Android is very close to upstream
releases, but the differences can be subtle, as always). Clang produces
different DWARF and has different bugs from GCC, and the standard
library is also more sensitive to how unordered_map and unordered_set
are used.

>
>>
>>>
>>>What do you think?
>>
>> For us specifically - building with clang and for our use cases - if we
>> keep structural equality of any kind then we need a hash function to go
>> along with this and, as we've sadly found out, this isn't working well
>> at the moment. We are currently on a bit dated version of libabigail for
>> our production use, but would like to close that gap again to come
>> closer to master.
>>
>> The risk of infinite loops and the reality of 30x slowdowns for certain
>> workloads mean we would need to apply these changes to remove structural
>> equality testing from the XML writer and then maintain an Android
>> version of libabigail as a more heavily-patched fork, to whatever extent
>> is feasible. I would rather we find a good solution that works for all
>> to get again close to upstream and not having to maintain such a fork.
>>
>> Yet, as an additional piece of assurance: the testing we have done does
>> not only include kernels, but of course we heavily examined the
>> libabigail test suite. Additionally, we maintain a large set of small
>> test cases specifically created for ABI stability testing and to cover
>> corner cases of all sorts. We are in the process of publishing those as
>> well. So far, this has served as great input for this patch series as
>> well.
>>
>> Does this make sense? What do you think?
>
>If you don't really care about the potential type duplication in the
>abixml as stated above, frankly, let's just get this patch in.
>
>Are you okay with that?

Yes. Though I think it's important that you are also reasonably happy
with PATCH 4/5, as the two patches go together.

Cheers,
Matthias

>
>Cheers,
>
>--
>		Dodji