From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 6099F3858D32 for ; Mon, 30 Jan 2023 10:33:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6099F3858D32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675074815; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Ei8/H7xiPGTYA2mJXOAUIYBHJUVtiYAOmoRvkTEOTSM=; b=UKNwYarLlH4vSNe0NEYHFnok2F9Gm0R+8+sH9nh7YmzEbKZPYA/te8PqWtAWexkAkIkvxA OXehQvGaUpqGXMW9PievCYodepoGX73cRAuAv5yr9glbRtwLbEML2f1pBs7CihJvl0UK7d SdyMlnN2WQTQ2ohTcLjNZ5Cc6kMg2qw= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-605-Z5Z725BsMBaPqqs9LakYng-1; Mon, 30 Jan 2023 05:33:34 -0500 X-MC-Unique: Z5Z725BsMBaPqqs9LakYng-1 Received: by mail-qk1-f198.google.com with SMTP id v7-20020a05620a0f0700b006faffce43b2so6786854qkl.9 for ; Mon, 30 Jan 2023 02:33:34 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=mime-version:message-id:date:references:in-reply-to:subject:cc:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Ei8/H7xiPGTYA2mJXOAUIYBHJUVtiYAOmoRvkTEOTSM=; b=A9YcBrUY70dYoittKDDrchE5lswvfLs19qHgiiTqRs+ZuSbA3XQMch/gY8tLiQzVcZ xmerNjwl5uqWlGx2pWKOCMLidYKkeotd60tatMYeZwFmllQ91l0EPyjbMrcReSRD17BG KtLaJjQPjPGeL2taMLn4mLF+MQqUPBaR87fHrU2XfBn18F6I00bPd3mvBW0FXx9Ft2Xt mQjaRW5ivGxPRKaFr/DNKyYsNw9QH6LRiVIxinA2L8q4QUnw3Wgzu8b8b2FlEb9wvM3C Fe/9R3nhSNeWD/5G+aY44Z2enG28SIDu0uCr8ML5WyWWOdarimGBAlupKFwfwZ4c8cSG aU/Q== X-Gm-Message-State: AO0yUKW+bZyDxEUykYV0z2s7PpTNTS2sPLW16c0QGTwMT7W/k3hY/+D7 YdzzZ6BflauPb6063HXQV/lajzffTda+oVbeImiFlMrZBmY9Ok14YDRLWpw7XuStZgTsn3lP+fW sW8A9HFKLkTxWHdXi6UYlWxP1zyDHaDpfRuCBVlZS2WVGdG+J1d6/Ey4VwLtmz8FzJbrIXk+tpg == X-Received: by 2002:ac8:5988:0:b0:3b8:48e4:83b1 with SMTP id e8-20020ac85988000000b003b848e483b1mr10652494qte.20.1675074813988; Mon, 30 Jan 2023 02:33:33 -0800 (PST) X-Google-Smtp-Source: AK7set9310GYnUwh5/K2JWGe0Tb4zQSfljaog28lU0wbY1M/WPhsSe9PIUs6j/orMflG/CiJj9j+qg== X-Received: by 2002:ac8:5988:0:b0:3b8:48e4:83b1 with SMTP id e8-20020ac85988000000b003b848e483b1mr10652462qte.20.1675074813401; Mon, 30 Jan 2023 02:33:33 -0800 (PST) Received: from localhost (95.72.115.87.dyn.plus.net. [87.115.72.95]) by smtp.gmail.com with ESMTPSA id s39-20020a05622a1aa700b003a7e38055c9sm7818240qtc.63.2023.01.30.02.33.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Jan 2023 02:33:33 -0800 (PST) From: Andrew Burgess To: Tom Tromey via Gdb-patches , gdb-patches@sourceware.org Cc: Tom Tromey Subject: Re: [PATCH] Fix comparator bug in cooked index In-Reply-To: <20230127195632.1570281-1-tromey@adacore.com> References: <20230127195632.1570281-1-tromey@adacore.com> Date: Mon, 30 Jan 2023 10:33:31 +0000 Message-ID: <87ilgouxac.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Tom Tromey via Gdb-patches writes: > Simon pointed out that the cooked index template-matching patch > introduced a failure in libstdc++ debug mode. In particular, the new > code violates the assumption of std::lower_bound and std::upper_bound > that the range is sorted with respect to the comparison. > > When I first debugged this, I thought the problem was unfixable as-is > and that a second layer of filtering would have to be done. However, > on irc, Simon pointed out that it could perhaps be solved if the > comparison function were assured that one operand always came from the > index, with the other always being the search string. > > This patch implements this idea. > > First, a new mode is introduced: a sorting mode for > cooked_index_entry::compare. In this mode, strings are compared > case-insensitively, but we're careful to always sort '<' before any > other printable character. This way, two names like "func" and > "func" will be sorted next to each other -- i.e., "func1" will > not be seen between them. This is important when searching. > > Second, the compare function is changed to work in a strcmp-like way. > This makes it easier to test and (IMO) understand. > > Third, the compare function is modified so that in non-sorting modes, > the index entry is always the first argument. This allows consistency > in compares. > > I regression tested this in libstdc++ debug mode on x86-64 Fedora 36. > It fixes the crash that Simon saw. > --- > gdb/dwarf2/cooked-index.c | 166 ++++++++++++++++++++------------------ > gdb/dwarf2/cooked-index.h | 42 ++++++++-- > 2 files changed, 123 insertions(+), 85 deletions(-) > > diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c > index 09b3fd70b26..f6b1df6e529 100644 > --- a/gdb/dwarf2/cooked-index.c > +++ b/gdb/dwarf2/cooked-index.c > @@ -30,56 +30,38 @@ > > /* See cooked-index.h. */ > > -bool > +int > cooked_index_entry::compare (const char *stra, const char *strb, > - bool completing) > + comparison_mode mode) > { > - /* If we've ever matched "<" in both strings, then we disable the > - special template parameter handling. */ > - bool seen_lt = false; > + /* We want to sort '<' before any other printable character. So, > + rewrite '<' to something just before ' '. */ > +#define MUNGE(c) (c == '<' ? '\x1f' : TOLOWER ((unsigned char) c)) > > while (*stra != '\0' > && *strb != '\0' > - && (TOLOWER ((unsigned char) *stra) > - == TOLOWER ((unsigned char ) *strb))) > + && (MUNGE (*stra) == MUNGE (*strb))) > { > - if (*stra == '<') > - seen_lt = true; > ++stra; > ++strb; > } > > - unsigned c1 = TOLOWER ((unsigned char) *stra); > - unsigned c2 = TOLOWER ((unsigned char) *strb); > + unsigned c1 = MUNGE (*stra); > + unsigned c2 = MUNGE (*strb); > > - if (completing) > - { > - /* When completing, if one string ends earlier than the other, > - consider them as equal. Also, completion mode ignores the > - special '<' handling. */ > - if (c1 == '\0' || c2 == '\0') > - return false; > - /* Fall through to the generic case. */ > - } > - else if (seen_lt) > - { > - /* Fall through to the generic case. */ > - } > - else if (c1 == '\0' || c1 == '<') > - { > - /* Maybe they both end at the same spot. */ > - if (c2 == '\0' || c2 == '<') > - return false; > - /* First string ended earlier. */ > - return true; > - } > - else if (c2 == '\0' || c2 == '<') > + if (c1 == c2) > + return 0; > + > + /* When completing, if STRB ends earlier than STRA, consider them as > + equal. When comparing, if STRB ends earlier and STRA ends with > + '<', consider them as equal. */ > + if (mode == COMPLETE || (mode == COMPARE && c1 == MUNGE ('<'))) > { > - /* Second string ended earlier. */ > - return false; > + if (c2 == '\0') > + return 0; > } > > - return c1 < c2; > + return c1 < c2 ? -1 : 1; > } > > #if GDB_SELF_TEST > @@ -89,45 +71,69 @@ namespace { > void > test_compare () > { > - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false)); > - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false)); > - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true)); > - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true)); > - > - SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE", false)); > - SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", false)); > - SELF_CHECK (!cooked_index_entry::compare ("abcd", "ABCDE", true)); > - SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", true)); > - > - SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", false)); > - SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", false)); > - SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", true)); > - SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", true)); > - > - SELF_CHECK (!cooked_index_entry::compare ("name", "name", false)); > - SELF_CHECK (!cooked_index_entry::compare ("name", "name", false)); > - SELF_CHECK (!cooked_index_entry::compare ("name", "name", true)); > - SELF_CHECK (!cooked_index_entry::compare ("name", "name", true)); > - > - SELF_CHECK (!cooked_index_entry::compare ("name>", > - "name>", false)); > - > - SELF_CHECK (!cooked_index_entry::compare ("name", "name>", false)); > - SELF_CHECK (!cooked_index_entry::compare ("name>", "name", false)); > - SELF_CHECK (cooked_index_entry::compare ("name>", > - false)); > - SELF_CHECK (!cooked_index_entry::compare ("name - "name>", > - true)); > - SELF_CHECK (!cooked_index_entry::compare ("name>", "name - false)); > - SELF_CHECK (!cooked_index_entry::compare ("name>", "name - true)); > - > - SELF_CHECK (cooked_index_entry::compare ("", "abcd", false)); > - SELF_CHECK (!cooked_index_entry::compare ("", "abcd", true)); > - SELF_CHECK (!cooked_index_entry::compare ("abcd", "", false)); > - SELF_CHECK (!cooked_index_entry::compare ("abcd", "", true)); > + /* Convenience aliases. */ > + const auto mode_compare = cooked_index_entry::COMPARE; > + const auto mode_sort = cooked_index_entry::SORT; > + const auto mode_complete = cooked_index_entry::COMPLETE; > + > + SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd", > + mode_compare) == 0); > + SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd", > + mode_compare) == 0); > + SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd", > + mode_complete) == 0); > + SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd", > + mode_complete) == 0); > + > + SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE", > + mode_compare) < 0); > + SELF_CHECK (cooked_index_entry::compare ("ABCDE", "abcd", > + mode_compare) > 0); > + SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE", > + mode_complete) < 0); > + SELF_CHECK (cooked_index_entry::compare ("ABCDE", "abcd", > + mode_complete) == 0); > + > + SELF_CHECK (cooked_index_entry::compare ("name", "name<>", > + mode_compare) < 0); > + SELF_CHECK (cooked_index_entry::compare ("name<>", "name", > + mode_compare) == 0); > + SELF_CHECK (cooked_index_entry::compare ("name", "name<>", > + mode_complete) < 0); > + SELF_CHECK (cooked_index_entry::compare ("name<>", "name", > + mode_complete) == 0); > + > + SELF_CHECK (cooked_index_entry::compare ("name", "name", > + mode_compare) == 0); > + SELF_CHECK (cooked_index_entry::compare ("name", "name", > + mode_compare) > 0); > + SELF_CHECK (cooked_index_entry::compare ("name", "name", > + mode_complete) == 0); > + SELF_CHECK (cooked_index_entry::compare ("name", "name", > + mode_complete) > 0); > + > + SELF_CHECK (cooked_index_entry::compare ("name>", > + "name>", > + mode_compare) == 0); > + > + SELF_CHECK (cooked_index_entry::compare ("name", "name>", > + mode_compare) < 0); > + SELF_CHECK (cooked_index_entry::compare ("name>", "name", > + mode_compare) == 0); > + SELF_CHECK (cooked_index_entry::compare ("name>", "name + mode_compare) > 0); > + SELF_CHECK (cooked_index_entry::compare ("name>", "name + mode_complete) == 0); > + > + SELF_CHECK (cooked_index_entry::compare ("", "abcd", mode_compare) < 0); > + SELF_CHECK (cooked_index_entry::compare ("", "abcd", mode_complete) < 0); > + SELF_CHECK (cooked_index_entry::compare ("abcd", "", mode_compare) > 0); > + SELF_CHECK (cooked_index_entry::compare ("abcd", "", mode_complete) == 0); > + > + SELF_CHECK (cooked_index_entry::compare ("func", "func", > + mode_sort) < 0); > + SELF_CHECK (cooked_index_entry::compare ("func", "func1", > + mode_sort) < 0); > } > > } /* anonymous namespace */ > @@ -359,20 +365,22 @@ cooked_index::find (const std::string &name, bool completing) > { > wait (); > > + cooked_index_entry::comparison_mode mode = (completing > + ? cooked_index_entry::COMPLETE > + : cooked_index_entry::COMPARE); > + > auto lower = std::lower_bound (m_entries.begin (), m_entries.end (), name, > [=] (const cooked_index_entry *entry, > const std::string &n) > { > - return cooked_index_entry::compare (entry->canonical, n.c_str (), > - completing); > + return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) < 0; > }); > > auto upper = std::upper_bound (m_entries.begin (), m_entries.end (), name, > [=] (const std::string &n, > const cooked_index_entry *entry) > { > - return cooked_index_entry::compare (n.c_str (), entry->canonical, > - completing); > + return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) > 0; > }); > > return range (lower, upper); > diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h > index 55eaf9955ab..1c291ba5694 100644 > --- a/gdb/dwarf2/cooked-index.h > +++ b/gdb/dwarf2/cooked-index.h > @@ -143,16 +143,46 @@ struct cooked_index_entry : public allocate_on_obstack > STORAGE. */ > const char *full_name (struct obstack *storage) const; > > - /* Compare two strings, case-insensitively. Return true if STRA is > - less than STRB. If one string has template parameters, but the > - other does not, then they are considered to be equal; so for > - example "t" == "t", "t" < "t", but "t" == "t". */ > - static bool compare (const char *stra, const char *strb, bool completing); > + /* Comparison modes for the 'compare' function. See the function > + for a description. */ > + enum comparison_mode > + { > + COMPARE, > + SORT, > + COMPLETE, > + }; > + > + /* Compare two strings, case-insensitively. Return -1 if STRA is > + less than STRB, 0 if they are equal, and 1 if STRA is greater. > + > + When comparing, '<' is considered to be less than all other > + printable characters. This ensures that "t" sorts before > + "t1", which is necessary when looking up "t". > + > + MODE controls how the comparison proceeds. > + > + MODE==SORT is used when sorting and does no special template > + handling. I think you should replace "template" with "'<'" here. The handling is not limited to template argument blocks, but to anything after a '<'. So languages that don't have templates will be impacted, as will things like 'operator<', 'operator<<', etc in C++. This is all fine, returning false positives from the index is clearly OK, we do the exact matching back in core GDB once the symbols are expanded, but I don't think we should give the impression that this code is actually checking for / skipping template arguments, when all it's really doing is giving up on matching at a '<' (which, again, is fine). I don't know if it's worth explicitly stating that the '<' handling is just a rough heuristic? All the examples you give are template based, so the comment does leave the impression that the compare function does some smart detection/handling of template blocks. Also, it might be worth saying why its fine that we do this C++ specific '<' handling for all languages. On a similar theme, it might be worth saying why a case-insensitive match is fine, even for languages where symbols are case-sensitive. For me, when I started looking at this code, these were all the questions that I had. After some thought I figured out the answers, but I had to dig into other parts of GDB to confirm them. It would have been useful for me if the comments had covered all this stuff. Thanks, Andrew > + > + MODE==COMPARE is used when searching for a symbol. In this case, > + STRB must always be the search name, and STRA must be the name in > + the index that is under consideration. In compare mode, early > + termination of STRB may match STRA -- for example, "t" and > + "t" will be considered to be equal. (However, if A=="t" and > + B=="t", then this will not consider them as equal.) > + > + MODE==COMPLETE is used when searching for a symbol for > + completion. In this case, STRB must always be the search name, > + and STRA must be the name in the index that is under > + consideration. In completion mode, early termination of STRB > + always results in a match. */ > + static int compare (const char *stra, const char *strb, > + comparison_mode mode); > > /* Compare two entries by canonical name. */ > bool operator< (const cooked_index_entry &other) const > { > - return compare (canonical, other.canonical, false); > + return compare (canonical, other.canonical, SORT) < 0; > } > > /* The name as it appears in DWARF. This always points into one of > -- > 2.38.1