From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 17 Jan 2022 18:03:07 +0000
From: Matthias Maennich <maennich@google.com>
To: Dodji Seketeli <dodji@seketeli.org>
Cc: libabigail@sourceware.org, gprocida@google.com, kernel-team@android.com
Subject: Re: [PATCH 3/5] XML writer: track emitted types by bare pointer
Message-ID: <YeWvWw+JJzaKe1wj@google.com>
References: <20211203114622.2944173-1-maennich@google.com>
 <20211203114622.2944173-4-maennich@google.com>
 <87ilvwrawm.fsf@seketeli.org> <YbtkO/e81Ike+vG9@google.com>
 <87wnj7cyqz.fsf@seketeli.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <87wnj7cyqz.fsf@seketeli.org>
List-Id: Mailing list of the Libabigail project <libabigail.sourceware.org>
List-Archive: <https://sourceware.org/pipermail/libabigail/>

Thanks, Dodji, for having a look and for sharing your thoughts! That is,
as always, very helpful for getting the full picture!

On Mon, Jan 10, 2022 at 06:00:04PM +0100, Dodji Seketeli wrote:
>Matthias Maennich <maennich@google.com> a écrit:
>
>[...]
>
>> If the XML writer considers two equivalent declaration-only types to be
>> different, one question to ask is: what is the real difference, that is,
>> how will this affect the outcome of abidiff?
>
>The problem is not necessarily at the abidiff level per se.
>
>The problem would be duplication of decl-only types in the abixml
>output, I think, and maybe infinite loops in those cases.  The infinite
>loops are easy to debug, though.  So I am not concerned about them.
>

Agreed there might be some duplication. See for example the commentary
about tests/data/test-read-dwarf/PR22122-libftdc.so.abi in PATCH 4/5.

This series is specifically meant to eliminate the risk of infinite
loops in the libabigail version we have downstream, and also to improve
performance. After these fixes there are some more changes that should
make infinite loops even less likely. In our case the infinite loops
only happened when using Clang's standard library (hash tables) and were
not so easy to debug!

>> If the types never change
>> (kind, name or declaration/definition status), nothing should ever be
>> reported. If a type does change... there are two possibilities: either
>> the types were really one type and now perhaps abidiff reports diffs for
>> the same name in two different ways; or the types were really two
>> different ones and abidiff has a simpler job. In my experience, abidiff
>> doesn't always report declaration-only/defined transitions. It doesn't
>> sound like there will be any really bad impact on diffs from having this
>> kind of duplication. However, if someone can come up with a test case of
>> the kind you mention, that would give some extra reassurance.
>
>The reason why I was pointing to this "general" issue is to make sure
>you are aware of this.  As type duplications in abixml were something you
>guys were tracking (and rightly so) I thought I'd point out that we
>still have the risk here.
>
>But because the type id map (writer_context::m_type_id_map) is not
>affected, the duplicated types will correctly be identified as such by
>the reader; thus I don't think abidiff is going to be affected.
>

Duplicates with different type ids could still appear after these
changes. But they should not hurt abidiff and may point to problems
earlier in the pipeline (even in the compiler: we found a Clang bug
during the investigation).

Duplicates with the same type id can be either conflicting or
non-conflicting. Non-conflicting duplicates are not ideal, but abidiff
can handle them. Conflicting duplicates mean we have a problem
interpreting the XML: which definition is the right one?

PATCH 4/5 does indeed affect the type id map specifically so that we
avoid the risk of conflicting definitions.
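
To make the distinction concrete, here is a minimal C++ sketch of how
duplicates sharing one type id could be classified. It is an
illustration only: type_def, dup_kind and classify_duplicates are
hypothetical names, the body string stands in for whatever canonical
form of a definition one would compare, and none of this is libabigail's
reader or writer code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one serialized type definition.
// This is NOT a libabigail type; it only carries what the sketch needs.
struct type_def {
  std::string id;    // the type id the definition is emitted under
  std::string body;  // assumed canonical text form of the definition
};

enum class dup_kind { none, benign, conflicting };

// Classify the worst duplicate situation found in a list of definitions:
//  - none:        every type id occurs once
//  - benign:      some id repeats, but always with the same body
//  - conflicting: some id repeats with a different body
dup_kind classify_duplicates(const std::vector<type_def>& defs) {
  std::unordered_map<std::string, std::string> first_seen;
  dup_kind worst = dup_kind::none;
  for (const type_def& d : defs) {
    assert(!d.id.empty());  // the sketch assumes every definition has an id
    auto res = first_seen.emplace(d.id, d.body);
    if (res.second)
      continue;  // first definition seen for this id
    if (res.first->second != d.body)
      return dup_kind::conflicting;  // same id, different definition
    worst = dup_kind::benign;  // same id, same definition
  }
  return worst;
}
```

A checker along these lines would tolerate the benign case and treat the
conflicting case as an error, since there is then no way to tell which
definition is meant.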


>>>So maybe it would be better to have an equality operator that uses
>>>is_non_canonicalized_type() to detect those rare cases and use
>>>structural comparison in those cases?
>>
>> That might come at higher cost than it is beneficial.
>
>I could not tell, as I don't necessarily have the right binaries at
>hand.  I trust you.

It is not just a matter of having the binaries, but also the toolchain
(the prebuilt Clang version that we use to build Android is very close
to upstream releases, but differences can be subtle, as always). Clang
produces different DWARF and has different bugs from GCC, but the
standard library is also more sensitive to how unordered_map and
unordered_set are used.
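
As a rough illustration of why keying on the bare pointer avoids that
sensitivity, here is a minimal C++ sketch. The names type_node and
emitted_tracker are hypothetical and this is not the actual XML writer
code. With pointer keys, the default std::hash and operator== for
pointers are used, so the hash and the equality can never disagree.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for an IR type node; NOT libabigail's type_base.
struct type_node {
  std::string name;
  bool is_declaration_only;
};

// Records which nodes have already been written, keyed by object identity
// (the bare pointer) rather than by structural equality.
class emitted_tracker {
 public:
  // Returns true only the first time a given node object is recorded.
  bool mark_emitted(const type_node* t) {
    assert(t != nullptr);
    return emitted_.insert(t).second;
  }

  // Returns true if this exact node object was recorded before.
  bool is_emitted(const type_node* t) const {
    return emitted_.count(t) != 0;
  }

 private:
  // Default pointer hash and pointer equality: trivially consistent.
  std::unordered_set<const type_node*> emitted_;
};
```

Note the trade-off: two structurally identical declaration-only nodes
that live at different addresses are tracked as two separate entries,
which is where the possible duplication in the abixml comes from.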

>
>>
>>>
>>>What do you think?
>>
>> For us specifically - building with clang and for our use cases - if we
>> keep structural equality of any kind then we need a hash function to go
>> along with this and, as we've sadly found out, this isn't working well
>> at the moment. We are currently on a somewhat dated version of libabigail for
>> our production use, but would like to close that gap again to come
>> closer to master.
>>
>> The risk of infinite loops and the reality of 30x slowdowns for certain
>> workloads mean we would need to apply these changes to remove structural
>> equality testing from the XML writer and then maintain an Android
>> version of libabigail as a more heavily-patched fork, to whatever extent
>> is feasible. I would rather we find a good solution that works for all,
>> to get close to upstream again and not have to maintain such a fork.
>>
>> Yet, as an additional piece of assurance: the testing we have done does
>> not only cover kernels; we have also examined the libabigail test suite
>> closely. Additionally, we maintain a large set of small test cases
>> specifically created for ABI stability testing and to cover corner
>> cases of all sorts. We are in the process of publishing those as well.
>> So far, they have served as great input for this patch series.
>>
>> Does this make sense? What do you think?
>
>If you don't really care about the potential type duplication in the
>abixml as stated above, frankly, let's just get this patch in.
>
>Are you okay with that?

Yes. Though I think it's important that you are also reasonably happy
with PATCH 4/5, as the two patches go together.

Cheers,
Matthias

>
>Cheers,
>
>-- 
>		Dodji