From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from de-smtp-delivery-102.mimecast.com (de-smtp-delivery-102.mimecast.com [194.104.111.102]) by sourceware.org (Postfix) with ESMTPS id 36AE63858D28 for ; Tue, 12 Apr 2022 12:38:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 36AE63858D28 Received: from EUR03-DB5-obe.outbound.protection.outlook.com (mail-db5eur03lp2059.outbound.protection.outlook.com [104.47.10.59]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id de-mta-14-4lAAIBGINT62v6RhaCCNPA-1; Tue, 12 Apr 2022 14:38:53 +0200 X-MC-Unique: 4lAAIBGINT62v6RhaCCNPA-1 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=R5QbLnaJUvD33sBrUWM4sO7EEUvOuXgyHoZShnZCQ0t99r9t8q+ZDIfvGSE3vfx7ULvhRn3Zm1tMSmMjythq69NhGEsr15DFy4dXB+AwbRQSE9wYa8fttKrU1lY1SvQvpwOFwhDnSaqXded6ts9T3ySoMXimfrPtaDvcLEC1aI4D1XeMA2Op7E2TfUB+L0SPiMncWUXKrPwKa3OVSnPF7zXMNUdmbqpC6JKx+JfLllWl1QXDb4jVfLS+urgkitoDlgF0HWQQHMeQ91VLzbeBZFNlrSZp1A6CxnndV0Na5Q7oruQLbrwVHz8pcNzNxm3hwUGEuotjkxuwdIlRE2gVmA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MIeUGo/1C3y0SshvebzGgy2Ce3Qe1Vkg2QLtN40zV/U=; b=MYBBscH0Jpts2unueS3I6qVJvi4NCi9CCEb4U/+vwO48dl1WSypMxpUXLZbDuBzy9vLDGLkMgSZJvyGq88mp2EaiA/VNPoNRuGMaemT5dVlOpg11VaJS7kwDLasfjw0hbtJVog5KsZo1no/1bgKLYg5NjhnbEYjJAn06d28TkKS+JllzEDFyTxU/F4284oV9keJUGMpu3x7PeHmp4GwJya57xUAgFcXcdYe0Ytet76FfkjiHrJK35l8vBW6nRzcN/k74GqILsGhSxJXxyAeNuCRPRGlu9PksD4EIzT02Bi/mbNsr/GUJCnsxsmKYAtvUOxUcvkp7AWtyfk9D3bxm/Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none Received: from DU2PR04MB8616.eurprd04.prod.outlook.com (2603:10a6:10:2db::16) by DB8PR04MB6876.eurprd04.prod.outlook.com (2603:10a6:10:116::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5144.30; Tue, 12 Apr 2022 12:38:50 +0000 Received: from DU2PR04MB8616.eurprd04.prod.outlook.com ([fe80::5592:2abe:fb16:6cd1]) by DU2PR04MB8616.eurprd04.prod.outlook.com ([fe80::5592:2abe:fb16:6cd1%6]) with mapi id 15.20.5144.029; Tue, 12 Apr 2022 12:38:50 +0000 Message-ID: Date: Tue, 12 Apr 2022 14:38:48 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0 Subject: Re: [PATCH v3] Add a trie to map quickly from address range to compilation unit. Content-Language: en-US To: "Steinar H. Gunderson" CC: sesse@chromium.org, binutils@sourceware.org References: <20220408141442.520107-1-sesse@google.com> From: Jan Beulich In-Reply-To: <20220408141442.520107-1-sesse@google.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: AM5PR04CA0004.eurprd04.prod.outlook.com (2603:10a6:206:1::17) To DU2PR04MB8616.eurprd04.prod.outlook.com (2603:10a6:10:2db::16) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: aac3e19f-20c7-4f5e-7f66-08da1c816587 X-MS-TrafficTypeDiagnostic: DB8PR04MB6876:EE_ X-Microsoft-Antispam-PRVS: X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: /YBs4SsPfawaDuiFHNpTd2kwlLdOimwCVDM7SNXgRTDcH+VA4NqeobvskVMjhgXVAcIECo3pZqEVJV26mrv1b/y2Qv7zJ1kJM99bIb2Af4Sl5PsF8Bq1nPNLVdrv4fW1ZzpdThJfVa+XwT7yD3aAHAW4PiWte1YuD/N7ieY/7HrSxLoDJEOs5EBXPG/QbOPSHgWAHxqk/U77qKVwmJH/NKAWhuc58/OXDXzmJf3zp1PqPuoB3l9T1eFENbFV0eu1ME1pJw9KEeOVULOAR6kJBH54aynns/BjS4pNP7JR08M1cJ6kDJ4x14eIwvv3LtQej5Ti2x5bamcL/RkAWqJJwgw14adIWMf6vcaD29qpNhWjkfm+KateafBKlr7l+R3vaC9zeUN4yeB2cN5rQRcndEaCguIvaIkPeOzT0xMDNiSfriwFd6EwhgsAbxLGO1YHcTSALHqiNfjsTe3buXDuwcTsLBE5rmXoANrOkSmvFAbdC92wI5pBguFdFclS2QOqON4gQ5DGQSLMNAYeE0RWZpAu1quuVdNafM//zEkZ9PorNXSW48cK1XCSHH+jJHPqiT9ffBI+zMH04kB1Yz6QyKkIK7J3jzfJOZb+A/30qrhTSp3RYTYt39oYNLsMnvitQz7j+YErbcMKslrtUz7T3ZQw9u3NmAcIuNe3vSVQtJcHxZP4lF1k1WgLQB+BnyW5HakJenVlDRYBpDSCfb3AR7+0eTjwmV7xuusH7skv/4iQISqcBk5J5BMMxIS48LLG X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DU2PR04MB8616.eurprd04.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230001)(366004)(2616005)(6506007)(53546011)(6486002)(6512007)(316002)(508600001)(36756003)(6916009)(8676002)(66946007)(66476007)(4326008)(186003)(83380400001)(26005)(66556008)(31686004)(2906002)(66574015)(38100700002)(31696002)(86362001)(30864003)(5660300002)(8936002)(45980500001)(43740500002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?kK52SflqSg+M6f1Jmg8jXzv7mqX6OsUd6d1dhMlA5UWUq/scgv1vvzl3X9Lk?= =?us-ascii?Q?vh/t/Xt7qfX9kNX5bMdqyLdlbPyo4p4Ptp4PcDkB+keCFpXc76N9jFGQCoEJ?= =?us-ascii?Q?kACJut/PgKa7iuS5lJPW9ssgm7uGqtMDNA4JSRmc8jMxsnKYZuWWGCBm7vC0?= =?us-ascii?Q?vlVJ2NLTSqgLQ4LI7hHrepMMQMzlKzrl40nijP1G7dy2a3ETXK/Dgs1xI9Jc?= =?us-ascii?Q?zgVGMI9rTpkHeVsS/H8FqwVk6zXv/xUzJvRXDXrQfdiJb++Uu5DOkk8BfBah?= =?us-ascii?Q?+HyzYlfMa8iliRi5WcyAxwfoL1n+nAch/slcAArTeMjJN3JMTySnJwkRbiYn?= =?us-ascii?Q?cRRRyikbWKBVqp3rGXjSSX2ikJ5nN5pn6zed5TftGl+NKj99NgeQW3LnOT7m?= =?us-ascii?Q?pKaxfjSE9Z3oKYNIl+bdszWICzmLpoIhK5AJ5cC+oyTm7YIz2W3YeBcIQQn5?= =?us-ascii?Q?ex+hgxGAS3pt0BNKNLhLBOnFXzVJFIHcZOpcrXaPF//VxQaL3JbO3nIMNrMT?= =?us-ascii?Q?KI0ovxjjBRz47kfRzXKlDv0WBhk1APHPZb7Wt1v9pKiQ/rly8FnH5ogNXNOJ?= =?us-ascii?Q?PEraqgoBRdRjjJG5T6/yGle6MugKmHXRyFJGc1WBZ4R8+g3hzODUkJxksXVe?= =?us-ascii?Q?LDG5pEwjp/J+5J0qdyWzF/asrE8gXfLnVU7LgwfScVbiegnVSVdjTjxb1HLi?= =?us-ascii?Q?zDMXUDmylDM7BCndumiVzEjQwyimzL0r+lGpyuIJVMse2+h4o5svnYF0FMeM?= =?us-ascii?Q?cnoSRtahC1KNN39OVQrCHeg8l9Gl4aJ6I9OMZiBCjYgJDd5TpOHdO3lazcOe?= =?us-ascii?Q?7/nNJiMsU4xxpaIMbwhVz4xa7ah1CEyoStCJHbo44hNklSIMO5UszXpwRR6u?= =?us-ascii?Q?cqoBWN8TvQ1phkpqH8EfEp92MwLU7i7llJ4R/WKkhGP4bmKNqxaOXfu2pHyF?= =?us-ascii?Q?NRyzNOjiDIDDAHCt2B4prX6AgGH4MRgSxw8f93VmZXrPasPocLIlIamCDPPv?= =?us-ascii?Q?s4C7JFjr4DhcxsacVXELXbl6SW6zi5/kNWCd+vsawP4M4SIJTwCIddOJ0mdE?= =?us-ascii?Q?s0828HHlomVHVqIJ/fbZQflYQlmsl2jzT2YcSJ5ZY1LPUTGzPaIHuyvUA8US?= =?us-ascii?Q?WCZ3aqV4yMNQeHJu+DTei5wyDxS0pgoNmY03bQ+TaTqBcNm7vXDqMG6G09YC?= =?us-ascii?Q?OOEATBlUt69n0pyXaUEeGRCcc8+pOI+I/+DQJk0dRCxWZHgT8Y2hz3dIdKAK?= =?us-ascii?Q?UmhpvxBV3CrtbPrPkkOC/4PjjP1HTMsinkInfsvJYMilJHGQUAqoynPo3iIa?= =?us-ascii?Q?scDCNUSMHJeE7yt/xDn6+vE+rz/h3T8ZaEQSvQ+CTA3XT+GFcb+rmPV7Zuac?= =?us-ascii?Q?DhtdLcLqMQbILLT1nUA+IxzcRs0cCSQMSR3WiSCEjgQcdVZkgOGavcm3aCxm?= =?us-ascii?Q?3z6tGXhxgRDJ14ec/TuMWDudd/exXLokVi8jNP+WSqOORQ094iud+0XsAgv8?= =?us-ascii?Q?veWYyoL0qvADRbcQAlxHop5swe0ThV40QhDH2cZnKcXty4uGJfgnbStGI1uA?= =?us-ascii?Q?c2RT9Lu8S5iFBI0dKkOs8rzwdbAEGzOZqfmxXJhO5DEVrRj1ATXvh/lMNizk?= =?us-ascii?Q?JgtvsFnSeUgR1AVxa5zMwOc1AtxYgu58W//FITXAPGE5Lg1EegtFhRs56xh9?= =?us-ascii?Q?OzsiaIvKczlfjdGBoIbib8fq4YNIBC9TBBWch+teCZUtDt+8qEZWVvEnqAx1?= =?us-ascii?Q?XMXn86qPSA=3D=3D?= X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: aac3e19f-20c7-4f5e-7f66-08da1c816587 X-MS-Exchange-CrossTenant-AuthSource: DU2PR04MB8616.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Apr 2022 12:38:50.8115 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: bwl5SzOEJWIgMAxoSkeAYNa0sLDojiZTaHdGhpKfP8QFVGc1dS+eAAFY5iN8kFCs5LOlZHubeuEu8dStq6VafA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR04MB6876 X-Spam-Status: No, score=-3037.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Apr 2022 12:39:00 -0000 On 08.04.2022 16:14, Steinar H. Gunderson via Binutils wrote: > When using perf to profile large binaries, _bfd_dwarf2_find_nearest_line(= ) > becomes a hotspot, as perf wants to get line number information > (for inline-detection purposes) for each and every sample. In Chromium > in particular (the content_shell binary), this entails going through > 475k address ranges, which takes a long time when done repeatedly. >=20 > Add a radix-256 trie over the address space to quickly map address to > compilation unit spaces; for content_shell, which is 1.6 GB when some > (but not full) debug information turned is on, we go from 6 ms to > 0.006 ms (6 =C2=B5s) for each lookup from address to compilation unit, a = 1000x > speedup. >=20 > There is a modest RAM increase of 180 MB in this binary (the existing > linked list over ranges uses about 10 MB, and the entire perf job uses > between 2=E2=80=933 GB for a medium-size profile); for smaller binaries w= ith few > ranges, there should be hardly any extra RAM usage at all. > --- > bfd/dwarf2.c | 374 ++++++++++++++++++++++++++++++++++++++++++++++++--- > 1 file changed, 355 insertions(+), 19 deletions(-) Patch is okay with some over-long lines suitably wrapped. Jan > diff --git a/bfd/dwarf2.c b/bfd/dwarf2.c > index 404f35df62b..8987121b2cb 100644 > --- a/bfd/dwarf2.c > +++ b/bfd/dwarf2.c > @@ -82,6 +82,77 @@ struct adjusted_section > bfd_vma adj_vma; > }; > =20 > +/* A trie to map quickly from address range to compilation unit. > + > + This is a fairly standard radix-256 trie, used to quickly locate whic= h > + compilation unit any given address belongs to. Given that each compi= lation > + unit may register hundreds of very small and unaligned ranges (which = may > + potentially overlap, due to inlining and other concerns), and a large > + program may end up containing hundreds of thousands of such ranges, w= e cannot > + scan through them linearly without undue slowdown. > + > + We use a hybrid trie to avoid memory explosion: There are two types o= f trie > + nodes, leaves and interior nodes. (Almost all nodes are leaves, so t= hey > + take up the bulk of the memory usage.) Leaves contain a simple array = of > + ranges (high/low address) and which compilation unit contains those r= anges, > + and when we get to a leaf, we scan through it linearly. Interior nod= es > + contain pointers to 256 other nodes, keyed by the next byte of the ad= dress. > + So for a 64-bit address like 0x1234567abcd, we would start at the roo= t and go > + down child[0x00]->child[0x00]->child[0x01]->child[0x23]->child[0x45] = etc., > + until we hit a leaf. (Nodes are, in general, leaves until they excee= d the > + default allocation of 16 elements, at which point they are converted = to > + interior node if possible.) This gives us near-constant lookup times; > + the only thing that can be costly is if there are lots of overlapping= ranges > + within a single 256-byte segment of the binary, in which case we have= to > + scan through them all to find the best match. > + > + For a binary with few ranges, we will in practice only have a single = leaf node > + at the root, containing a simple array. Thus, the scheme is efficien= t for > + both small and large binaries. > + */ > + > +/* Experiments have shown 16 to be a memory-efficient default leaf size. > + The only case where a leaf will hold more memory than this, is at the > + bottomost level (covering 256 bytes in the binary), where we'll expan= d > + the leaf to be able to hold more ranges if needed. > + */ > +#define TRIE_LEAF_SIZE 16 > + > +/* All trie_node pointers will really be trie_leaf or trie_interior, > + but they have this common head. */ > +struct trie_node > +{ > + /* If zero, we are an interior node. > + Otherwise, how many ranges we have room for in this leaf. */ > + unsigned int num_room_in_leaf; > +}; > + > +struct trie_leaf > +{ > + struct trie_node head; > + unsigned int num_stored_in_leaf; > + struct { > + struct comp_unit *unit; > + bfd_vma low_pc, high_pc; > + } ranges[TRIE_LEAF_SIZE]; > +}; > + > +struct trie_interior > +{ > + struct trie_node head; > + struct trie_node *children[256]; > +}; > + > +static struct trie_node *alloc_trie_leaf (bfd *abfd) > +{ > + struct trie_leaf *leaf =3D > + bfd_zalloc (abfd, sizeof (struct trie_leaf)); > + if (leaf =3D=3D NULL) > + return NULL; > + leaf->head.num_room_in_leaf =3D TRIE_LEAF_SIZE; > + return &leaf->head; > +} > + > struct dwarf2_debug_file > { > /* The actual bfd from which debug info was loaded. Might be > @@ -139,6 +210,9 @@ struct dwarf2_debug_file > /* A list of all previously read comp_units. */ > struct comp_unit *all_comp_units; > =20 > + /* A list of all previously read comp_units with no ranges (yet). */ > + struct comp_unit *all_comp_units_without_ranges; > + > /* Last comp unit in list above. */ > struct comp_unit *last_comp_unit; > =20 > @@ -147,6 +221,9 @@ struct dwarf2_debug_file > =20 > /* Hash table to map offsets to decoded abbrevs. */ > htab_t abbrev_offsets; > + > + /* Root of a trie to map addresses to compilation units. */ > + struct trie_node *trie_root; > }; > =20 > struct dwarf2_debug > @@ -220,6 +297,11 @@ struct comp_unit > /* Chain the previously read compilation units. */ > struct comp_unit *next_unit; > =20 > + /* Chain the previously read compilation units that have no ranges yet= . > + We scan these separately when we have a trie over the ranges. > + Unused if arange.high !=3D 0. */ > + struct comp_unit *next_unit_without_ranges; > + > /* Likewise, chain the compilation unit read after this one. > The comp units are stored in reversed reading order. */ > struct comp_unit *prev_unit; > @@ -296,6 +378,10 @@ struct comp_unit > =20 > /* TRUE if symbols are cached in hash table for faster lookup by name.= */ > bool cached; > + > + /* Used when iterating over trie leaves to know which units we have > + already seen in this iteration. */ > + bool mark; > }; > =20 > /* This data structure holds the information of an abbrev. */ > @@ -1767,9 +1853,183 @@ concat_filename (struct line_info_table *table, u= nsigned int file) > return strdup (filename); > } > =20 > +/* Number of bits in a bfd_vma. */ > +#define VMA_BITS (8 * sizeof (bfd_vma)) > + > +/* Check whether [low1, high1) can be combined with [low2, high2), > + i.e., they touch or overlap. */ > +static bool ranges_overlap(bfd_vma low1, bfd_vma high1, bfd_vma low2, bf= d_vma high2) > +{ > + if (low1 =3D=3D low2 || high1 =3D=3D high2) > + return true; > + > + /* Sort so that low1 is below low2. */ > + if (low1 > low2) > + { > + bfd_vma tmp; > + > + tmp =3D low1; > + low1 =3D low2; > + low2 =3D tmp; > + > + tmp =3D high1; > + high1 =3D high2; > + high2 =3D tmp; > + } > + > + /* We touch iff low2 =3D=3D high1. > + We overlap iff low2 is within [low1, high1). */ > + return (low2 <=3D high1); > +} > + > +/* Insert an address range in the trie mapping addresses to compilation = units. > + Will return the new trie node (usually the same as is being sent in, = but > + in case of a leaf-to-interior conversion, or expansion of a leaf, it = may be > + different), or NULL on failure. > + */ > +static struct trie_node *insert_arange_in_trie(bfd *abfd, > + struct trie_node *trie, > + bfd_vma trie_pc, > + unsigned int trie_pc_bits, > + struct comp_unit *unit, > + bfd_vma low_pc, > + bfd_vma high_pc) > +{ > + bfd_vma clamped_low_pc, clamped_high_pc; > + int ch, from_ch, to_ch; > + bool is_full_leaf =3D false; > + > + /* See if we can extend any of the existing ranges. This merging > + isn't perfect (if merging opens up the possibility of merging two e= xisting > + ranges, we won't find them), but it takes the majority of the cases= . */ > + if (trie->num_room_in_leaf > 0) > + { > + struct trie_leaf *leaf =3D (struct trie_leaf *) trie; > + unsigned int i; > + > + for (i =3D 0; i < leaf->num_stored_in_leaf; ++i) > + { > + if (leaf->ranges[i].unit =3D=3D unit && > + ranges_overlap(low_pc, high_pc, > + leaf->ranges[i].low_pc, leaf->ranges[i].high_pc)) > + { > + if (low_pc < leaf->ranges[i].low_pc) > + leaf->ranges[i].low_pc =3D low_pc; > + if (high_pc > leaf->ranges[i].high_pc) > + leaf->ranges[i].high_pc =3D high_pc; > + return trie; > + } > + } > + > + is_full_leaf =3D leaf->num_stored_in_leaf =3D=3D trie->num_room_in= _leaf; > + } > + > + /* If we're a leaf with no more room and we're _not_ at the bottom, > + convert to an interior node. */ > + if (is_full_leaf && trie_pc_bits < VMA_BITS) > + { > + const struct trie_leaf *leaf =3D (struct trie_leaf *) trie; > + unsigned int i; > + > + trie =3D bfd_zalloc (abfd, sizeof (struct trie_interior)); > + if (!trie) > + return NULL; > + is_full_leaf =3D false; > + > + /* TODO: If we wanted to save a little more memory at the cost of > + complexity, we could have reused the old leaf node as one of the > + children of the new interior node, instead of throwing it away. */ > + for (i =3D 0; i < leaf->num_stored_in_leaf; ++i) > + { > + if (!insert_arange_in_trie (abfd, trie, trie_pc, trie_pc_bits, > + leaf->ranges[i].unit, leaf->ranges[i].low_pc, > + leaf->ranges[i].high_pc)) > + return NULL; > + } > + } > + > + /* If we're a leaf with no more room and we _are_ at the bottom, > + we have no choice but to just make it larger. */ > + if (is_full_leaf) > + { > + const struct trie_leaf *leaf =3D (struct trie_leaf *) trie; > + unsigned int new_room_in_leaf =3D trie->num_room_in_leaf * 2; > + struct trie_leaf *new_leaf; > + > + new_leaf =3D bfd_zalloc (abfd, > + sizeof (struct trie_leaf) + (new_room_in_leaf - TRIE_LEAF_SIZE) * sizeo= f (leaf->ranges[0])); > + new_leaf->head.num_room_in_leaf =3D new_room_in_leaf; > + new_leaf->num_stored_in_leaf =3D leaf->num_stored_in_leaf; > + > + memcpy (new_leaf->ranges, > + leaf->ranges, > + leaf->num_stored_in_leaf * sizeof (leaf->ranges[0])); > + trie =3D &new_leaf->head; > + is_full_leaf =3D false; > + > + /* Now the insert below will go through. */ > + } > + > + /* If we're a leaf (now with room), we can just insert at the end. */ > + if (trie->num_room_in_leaf > 0) > + { > + struct trie_leaf *leaf =3D (struct trie_leaf *) trie; > + > + unsigned int i =3D leaf->num_stored_in_leaf++; > + leaf->ranges[i].unit =3D unit; > + leaf->ranges[i].low_pc =3D low_pc; > + leaf->ranges[i].high_pc =3D high_pc; > + return trie; > + } > + > + /* Now we are definitely an interior node, so recurse into all the rel= evant buckets. */ > + > + /* Clamp the range to the current trie bucket. */ > + clamped_low_pc =3D low_pc; > + clamped_high_pc =3D high_pc; > + if (trie_pc_bits > 0) > + { > + bfd_vma bucket_high_pc =3D trie_pc + ((bfd_vma)-1 >> trie_pc_bits)= ; /* Inclusive. */ > + if (clamped_low_pc < trie_pc) > + clamped_low_pc =3D trie_pc; > + if (clamped_high_pc > bucket_high_pc) > + clamped_high_pc =3D bucket_high_pc; > + } > + > + /* Insert the ranges in all buckets that it spans. */ > + from_ch =3D (clamped_low_pc >> (VMA_BITS - trie_pc_bits - 8)) & 0xff; > + to_ch =3D ((clamped_high_pc - 1) >> (VMA_BITS - trie_pc_bits - 8)) & 0= xff; > + for (ch =3D from_ch; ch <=3D to_ch; ++ch) > + { > + struct trie_interior *interior =3D (struct trie_interior *) trie; > + struct trie_node *child =3D interior->children[ch]; > + > + if (child =3D=3D NULL) > + { > + child =3D alloc_trie_leaf (abfd); > + if (!child) > + return NULL; > + } > + child =3D insert_arange_in_trie (abfd, > + child, > + trie_pc + ((bfd_vma)ch << (VMA_BITS - trie_pc_bits - 8)), > + trie_pc_bits + 8, > + unit, > + low_pc, > + high_pc); > + if (!child) > + return NULL; > + > + interior->children[ch] =3D child; > + } > + > + return trie; > +} > + > + > static bool > -arange_add (const struct comp_unit *unit, struct arange *first_arange, > - bfd_vma low_pc, bfd_vma high_pc) > +arange_add (struct comp_unit *unit, struct arange *first_arange, > + struct trie_node **trie_root, bfd_vma low_pc, bfd_vma high_pc) > { > struct arange *arange; > =20 > @@ -1777,6 +2037,19 @@ arange_add (const struct comp_unit *unit, struct a= range *first_arange, > if (low_pc =3D=3D high_pc) > return true; > =20 > + if (trie_root !=3D NULL) > + { > + *trie_root =3D insert_arange_in_trie (unit->file->bfd_ptr, > + *trie_root, > + 0, > + 0, > + unit, > + low_pc, > + high_pc); > + if (*trie_root =3D=3D NULL) > + return false; > + } > + > /* If the first arange is empty, use it. */ > if (first_arange->high =3D=3D 0) > { > @@ -2411,7 +2684,7 @@ decode_line_info (struct comp_unit *unit) > low_pc =3D address; > if (address > high_pc) > high_pc =3D address; > - if (!arange_add (unit, &unit->arange, low_pc, high_pc)) > + if (!arange_add (unit, &unit->arange, &unit->file->trie_root, low_pc= , high_pc)) > goto line_fail; > break; > case DW_LNE_set_address: > @@ -3134,7 +3407,7 @@ find_abstract_instance (struct comp_unit *unit, > =20 > static bool > read_ranges (struct comp_unit *unit, struct arange *arange, > - bfd_uint64_t offset) > + struct trie_node **trie_root, bfd_uint64_t offset) > { > bfd_byte *ranges_ptr; > bfd_byte *ranges_end; > @@ -3169,7 +3442,7 @@ read_ranges (struct comp_unit *unit, struct arange = *arange, > base_address =3D high_pc; > else > { > - if (!arange_add (unit, arange, > + if (!arange_add (unit, arange, trie_root, > base_address + low_pc, base_address + high_pc)) > return false; > } > @@ -3179,7 +3452,7 @@ read_ranges (struct comp_unit *unit, struct arange = *arange, > =20 > static bool > read_rnglists (struct comp_unit *unit, struct arange *arange, > - bfd_uint64_t offset) > + struct trie_node **trie_root, bfd_uint64_t offset) > { > bfd_byte *rngs_ptr; > bfd_byte *rngs_end; > @@ -3253,19 +3526,19 @@ read_rnglists (struct comp_unit *unit, struct ara= nge *arange, > return false; > } > =20 > - if (!arange_add (unit, arange, low_pc, high_pc)) > + if (!arange_add (unit, arange, trie_root, low_pc, high_pc)) > return false; > } > } > =20 > static bool > read_rangelist (struct comp_unit *unit, struct arange *arange, > - bfd_uint64_t offset) > + struct trie_node **trie_root, bfd_uint64_t offset) > { > if (unit->version <=3D 4) > - return read_ranges (unit, arange, offset); > + return read_ranges (unit, arange, trie_root, offset); > else > - return read_rnglists (unit, arange, offset); > + return read_rnglists (unit, arange, trie_root, offset); > } > =20 > static struct funcinfo * > @@ -3617,7 +3890,7 @@ scan_unit_for_symbols (struct comp_unit *unit) > =20 > case DW_AT_ranges: > if (is_int_form (&attr) > - && !read_rangelist (unit, &func->arange, attr.u.val)) > + && !read_rangelist (unit, &func->arange, &unit->file->trie_root,= attr.u.val)) > goto fail; > break; > =20 > @@ -3733,7 +4006,7 @@ scan_unit_for_symbols (struct comp_unit *unit) > =20 > if (func && high_pc !=3D 0) > { > - if (!arange_add (unit, &func->arange, low_pc, high_pc)) > + if (!arange_add (unit, &func->arange, &unit->file->trie_root, low_pc,= high_pc)) > goto fail; > } > } > @@ -3931,7 +4204,7 @@ parse_comp_unit (struct dwarf2_debug *stash, > =20 > case DW_AT_ranges: > if (is_int_form (&attr) > - && !read_rangelist (unit, &unit->arange, attr.u.val)) > + && !read_rangelist (unit, &unit->arange, &unit->file->trie_root, = attr.u.val)) > return NULL; > break; > =20 > @@ -3973,7 +4246,7 @@ parse_comp_unit (struct dwarf2_debug *stash, > high_pc +=3D low_pc; > if (high_pc !=3D 0) > { > - if (!arange_add (unit, &unit->arange, low_pc, high_pc)) > + if (!arange_add (unit, &unit->arange, &unit->file->trie_root, low_= pc, high_pc)) > return NULL; > } > =20 > @@ -4772,6 +5045,14 @@ _bfd_dwarf2_slurp_debug_info (bfd *abfd, bfd *debu= g_bfd, > if (!stash->alt.abbrev_offsets) > return false; > =20 > + stash->f.trie_root =3D alloc_trie_leaf (abfd); > + if (!stash->f.trie_root) > + return false; > + > + stash->alt.trie_root =3D alloc_trie_leaf (abfd); > + if (!stash->alt.trie_root) > + return false; > + > *pinfo =3D stash; > =20 > if (debug_bfd =3D=3D NULL) > @@ -4943,6 +5224,12 @@ stash_comp_unit (struct dwarf2_debug *stash, struc= t dwarf2_debug_file *file) > each->next_unit =3D file->all_comp_units; > file->all_comp_units =3D each; > =20 > + if (each->arange.high =3D=3D 0) > + { > + each->next_unit_without_ranges =3D file->all_comp_units_without_r= anges; > + file->all_comp_units_without_ranges =3D each->next_unit_without_r= anges; > + } > + > file->info_ptr +=3D length; > return each; > } > @@ -5185,17 +5472,66 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd, > } > else > { > - for (each =3D stash->f.all_comp_units; each; each =3D each->next_u= nit) > + struct trie_node *trie =3D stash->f.trie_root; > + unsigned int bits =3D VMA_BITS - 8; > + struct comp_unit **prev_each; > + > + /* Traverse interior nodes until we get to a leaf. */ > + while (trie && trie->num_room_in_leaf =3D=3D 0) > { > - found =3D ((each->arange.high =3D=3D 0 > - || comp_unit_contains_address (each, addr)) > - && comp_unit_find_nearest_line (each, addr, > + int ch =3D (addr >> bits) & 0xff; > + trie =3D ((struct trie_interior *) trie)->children[ch]; > + bits -=3D 8; > + } > + > + if (trie) > + { > + const struct trie_leaf *leaf =3D (struct trie_leaf *) trie; > + unsigned int i; > + > + for (i =3D 0; i < leaf->num_stored_in_leaf; ++i) > + { > + leaf->ranges[i].unit->mark =3D false; > + } > + > + for (i =3D 0; i < leaf->num_stored_in_leaf; ++i) > + { > + struct comp_unit *unit =3D leaf->ranges[i].unit; > + if (unit->mark || > + addr < leaf->ranges[i].low_pc || > + addr >=3D leaf->ranges[i].high_pc) > + continue; > + unit->mark =3D true; > + > + found =3D comp_unit_find_nearest_line (unit, addr, > filename_ptr, > &function, > linenumber_ptr, > - discriminator_ptr)); > + discriminator_ptr); > + if (found) > + goto done; > + } > + } > + > + /* Also scan through all compilation units without any ranges, > + taking them out of the list if they have acquired any since las= t time. */ > + prev_each =3D &stash->f.all_comp_units_without_ranges; > + for (each =3D *prev_each; each; each =3D each->next_unit_without_r= anges) > + { > + if (each->arange.high !=3D 0) > + { > + *prev_each =3D each->next_unit_without_ranges; > + continue; > + } > + > + found =3D comp_unit_find_nearest_line (each, addr, > + filename_ptr, > + &function, > + linenumber_ptr, > + discriminator_ptr); > if (found) > goto done; > + prev_each =3D &each->next_unit_without_ranges; > } > } > =20