From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 66278 invoked by alias); 9 Feb 2016 15:58:23 -0000
Mailing-List: contact libabigail-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Subscribe:
Sender: libabigail-owner@sourceware.org
Received: (qmail 66047 invoked by uid 48); 9 Feb 2016 15:58:18 -0000
From: "dodji at redhat dot com"
To: libabigail@sourceware.org
Subject: [Bug default/19427] Intern the strings used in Libabigail
Date: Fri, 01 Jan 2016 00:00:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: libabigail
X-Bugzilla-Component: default
X-Bugzilla-Version: unspecified
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: dodji at redhat dot com
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: dodji at redhat dot com
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_status
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2016-q1/txt/msg00066.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=19427

dodji at redhat dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from dodji at redhat dot com ---
So I started to work on this, and I have a working branch (named
'str-intern') with the necessary changes, which are not documented yet.
You can browse it at
https://sourceware.org/git/gitweb.cgi?p=libabigail.git;a=shortlog;h=refs/heads/dodji/str-intern.

I am going to post a time and memory consumption comparison, using a code
base built with optimization (-O2).

So here is the resource usage of abidw --abidiff on r300_dri.so, for the
master branch:

    real    => 5:03.31
    user    => 300.24
    sys     => 2.86
    max mem => 4959036KB

And the resource usage for the str-intern branch:

    real    => 4:56.98
    user    => 294.12
    sys     => 2.65
    max mem => 4617328KB

So, as you can see, it slightly improves the speed of this test (by about 6
seconds), and significantly improves memory usage (saving more than 300
megabytes of memory).

The problem is that on smaller tests, and on tests that don't involve
emitting abixml, things actually get a little bit slower. In other words,
abidiff, for instance, becomes slightly slower. The memory consumption
savings are still there, though.

That is, the cost of looking up strings in a hash table to ensure that each
string exists in only one copy in the environment (this is string interning)
makes the loading of ABI corpora slower. But then, comparing *strings* later
becomes faster, as comparing two strings amounts to just comparing two
pointers. But we need to compare a lot of strings to make up for the cost of
interning them in the first place.

And the place where we compare strings the most at the moment is when we
emit abixml (i.e., in abidw).

During decl comparisons, it turns out we don't compare strings that much,
because we compare the decls' types first. And thanks to type
canonicalization, comparing two types is very fast. And as the majority of
comparisons yield a negative result, we don't even get to compare the names
of the decls.

So I am still not sure if I am going to incorporate this optimization in the
end. I *am* inclined to merge it, because it makes the library consume less
memory, and it speeds up abixml writing, especially for big libraries. In
other words, it makes libabigail scale better. But then it slows it down
slightly on small workloads (which are quite fast anyway).

I'll give this a little bit more thought.
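
To make the trade-off above concrete, here is a minimal sketch of string
interning in C++. It is not the code from the str-intern branch: the names
string_pool and interned_string are hypothetical, and a std::unordered_set is
only assumed as the hash table. The sketch shows where the cost is paid (one
hash lookup per intern() call) and where it is recovered (equality becomes a
single pointer comparison).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical handle type: it wraps a pointer to the single stored copy
// of a string. This is an illustration, not libabigail's actual class.
class interned_string {
  const std::string* ptr_ = nullptr;

  friend class string_pool;
  explicit interned_string(const std::string* p) : ptr_(p) {}

public:
  interned_string() = default;

  // Access the underlying characters; only valid for handles that were
  // returned by string_pool::intern().
  const std::string& str() const {
    assert(ptr_ && "handle was not obtained from a pool");
    return *ptr_;
  }

  // Handles from the same pool are equal exactly when they point to the
  // same stored copy, so equality is a plain pointer comparison.
  bool operator==(const interned_string& o) const { return ptr_ == o.ptr_; }
  bool operator!=(const interned_string& o) const { return ptr_ != o.ptr_; }
};

// Hypothetical pool: owns exactly one copy of each distinct string.
class string_pool {
  // Pointers to elements of std::unordered_set stay valid when the
  // table rehashes, so handles remain usable as the pool grows.
  std::unordered_set<std::string> strings_;

public:
  // The up-front cost: one hash lookup (and possibly one copy) per call.
  interned_string intern(const std::string& s) {
    return interned_string(&*strings_.insert(s).first);
  }

  // Number of distinct strings stored.
  std::size_t size() const { return strings_.size(); }
};
```

With this sketch, interning "foo" twice yields two handles that compare equal
and only one stored copy, which is where the memory saving would come from.
Handles are only meaningful within the pool that produced them, and they must
not outlive it.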

But in the meantime, if you have some thoughts, please share them :-)

-- 
You are receiving this mail because:
You are on the CC list for the bug.