From: Torbjorn SVENSSON
Date: Thu, 5 Jan 2023 19:06:55 +0100
Subject: Re: Two observations using GDB 13 snapshot
To: Eli Zaretskii, Simon Marchi
In-Reply-To: <831qoayxuu.fsf@gnu.org>

On 2023-01-04 19:10, Eli Zaretskii via Gdb-patches wrote:
>> Date: Tue, 3 Jan 2023 16:34:53 -0500
>> Cc: tom@tromey.com, gdb-patches@sourceware.org, luis.machado@arm.com
>> From: Simon Marchi
>>
>>> In terms of the code, may be worth trying TOLOWER from
>>> include/safe-ctype.h instead of tolower()
>>
>> The tolower call is inside strcasecmp, we don't call tolower directly:
>>
>> #0  0x77c348d5 in msvcrt!__crtLCMapStringA ()
>>     from C:\WINDOWS\system32\msvcrt.dll
>> #1  0x77c348cd in msvcrt!__crtLCMapStringA ()
>>     from C:\WINDOWS\system32\msvcrt.dll
>> #2  0x77c30045 in wmktemp () from C:\WINDOWS\system32\msvcrt.dll
>> #3  0x77c1c992 in tolower () from C:\WINDOWS\system32\msvcrt.dll
>> #4  0x77c462a1 in stricmp () from C:\WINDOWS\system32\msvcrt.dll
>> #5  0x005107d3 in strcasecmp (__s2=, __s1=)
>>     at d:/usr/include/strings.h:92
>> #6  cooked_index_entry::operator< (this=, other=...)
>>     at ./dwarf2/cooked-index.h:150
>>
>> It would be interesting to change that strcasecmp call to strcmp, just
>> to see if it makes an impact on the performance.  Whether or not that
>> would be correct is another thing, but it would help see if that
>> strcasecmp / tolower call is really at fault here.
>
> Looks like indeed strcasecmp is the culprit.  With the patch below,
> which replaces strcasecmp with a simple case-insensitive comparison
> that only works with ASCII, the phase of reading symbols from gdb.exe
> goes down to just 6 seconds, which is basically the same time as with
> GDB 12.
>
> --- gdb/dwarf2/cooked-index.h~0 2022-12-17 03:47:12.000000000 +0200
> +++ gdb/dwarf2/cooked-index.h 2023-01-04 20:00:04.052250000 +0200
> @@ -35,6 +35,8 @@
>  #include "dwarf2/tag.h"
>  #include "gdbsupport/range-chain.h"
>
> +#define my_tolower(c) (('A' <= (c) && (c) <= 'Z') ? ((c) - 'A' + 'a') : (c))
> +
>  struct dwarf2_per_cu_data;
>
>  /* Flags that describe an entry in the index.  */
> @@ -147,7 +149,20 @@ struct cooked_index_entry : public alloc
>       entries.  */
>    bool operator< (const cooked_index_entry &other) const
>    {
> +#if 0
>      return strcasecmp (canonical, other.canonical) < 0;
> +#else
> +    const unsigned char *s1 = (unsigned char *)canonical, *s2 = (unsigned char *)other.canonical;
> +
> +    while (my_tolower(*s1) == my_tolower(*s2))
> +      {
> +        if (*s1 == 0)
> +          return false;

Without any knowledge of the cooked_index_entry type or what it's
supposed to do, I think the return statement here is problematic in the
case where 2 cooked_index_entry instances contain the same
case-insensitive canonical string.  I'm not sure if this can happen, but
it's just a thought.  It also appears that the old strcasecmp
implementation would suffer from the same limitation.

To avoid the issue, I guess the addresses of the 2 objects could be
compared in order to get a stable result if there is no other property
that is guaranteed to be unique for the 2 instances.

Kind regards,
Torbjörn

> +        s1++;
> +        s2++;
> +      }
> +    return (int)my_tolower(*s1) < (int)my_tolower(*s2);
> +#endif
>    }
>
>    /* The name as it appears in DWARF.  This always points into one of
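To make the address tie-break suggested in the reply above concrete, here is a minimal standalone sketch. It is not GDB code: `entry`, `ascii_tolower`, `ascii_casecmp`, `entry_less` and `self_check` are hypothetical names made up for illustration, and `entry` models only the `canonical` field. It assumes, like the quoted patch, that ASCII-only case folding is acceptable.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for cooked_index_entry; only the field used by
// the comparison is modelled.
struct entry
{
  const char *canonical;
};

// ASCII-only lowercase, mirroring the my_tolower macro in the quoted
// patch.  It never consults the locale.
inline unsigned char
ascii_tolower (unsigned char c)
{
  return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char> (c - 'A' + 'a')
                                : c;
}

// Three-way comparison: negative, zero or positive, in the style of
// strcasecmp, but restricted to ASCII case folding.
inline int
ascii_casecmp (const char *a, const char *b)
{
  const unsigned char *s1 = reinterpret_cast<const unsigned char *> (a);
  const unsigned char *s2 = reinterpret_cast<const unsigned char *> (b);

  while (ascii_tolower (*s1) == ascii_tolower (*s2))
    {
      if (*s1 == 0)
        return 0;   // Both strings ended: equal ignoring case.
      s1++;
      s2++;
    }
  return static_cast<int> (ascii_tolower (*s1))
         - static_cast<int> (ascii_tolower (*s2));
}

// Order by name first.  When the names compare equal, fall back to the
// object addresses, so two distinct entries never compare as equivalent.
// std::less is used because it gives a total order over pointers.
inline bool
entry_less (const entry *lhs, const entry *rhs)
{
  int cmp = ascii_casecmp (lhs->canonical, rhs->canonical);
  if (cmp != 0)
    return cmp < 0;
  return std::less<const entry *> () (lhs, rhs);
}

// Small usage check: two entries whose names differ only in case.
inline void
self_check ()
{
  entry pool[2] = { { "Foo" }, { "foo" } };
  assert (ascii_casecmp (pool[0].canonical, pool[1].canonical) == 0);
  assert (entry_less (&pool[0], &pool[1]));   // Lower address sorts first.
  assert (!entry_less (&pool[1], &pool[0]));
}
```

Returning false when both names are equal is what a strict weak ordering requires of operator<. The consequence is only that such entries count as equivalent, so an unstable sort may leave them in either order. The address fallback makes the order deterministic within one run, but not across runs, since addresses can differ from one run to the next.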