From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <archer-return-2488-listarch-archer=sourceware.org@sourceware.org>
Received: (qmail 21062 invoked by alias); 1 Feb 2012 13:23:30 -0000
Mailing-List: contact archer-help@sourceware.org; run by ezmlm
Sender: <archer@sourceware.org>
Precedence: bulk
List-Post: <mailto:archer@sourceware.org>
List-Help: <mailto:archer-help@sourceware.org>
List-Subscribe: <mailto:archer-subscribe@sourceware.org>
List-Id: <archer.sourceware.org>
Received: (qmail 21046 invoked by uid 22791); 1 Feb 2012 13:23:28 -0000
X-SWARE-Spam-Status: No, hits=-6.6 required=5.0
	tests=AWL,BAYES_00,RCVD_IN_DNSWL_HI,SPF_HELO_PASS,TW_BJ,TW_GD,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Date: Wed, 01 Feb 2012 13:23:00 -0000
From: Jan Kratochvil <jan.kratochvil@redhat.com>
To: archer@sourceware.org
Cc: Jakub Jelinek <jakub@redhat.com>
Subject: Inter-CU DWARF size optimizations and gcc -flto
Message-ID: <20120201132307.GA32578@host2.jankratochvil.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-SW-Source: 2012-q1/txt/msg00005.txt.bz2

Hi,

I am sorry if this is already clear to everyone, but I admit I only played
with it yesterday.

With
	gcc -flto -flto-partition=none

gcc outputs only a single CU (Compilation Unit).  With the default (omitting
-flto-partition) there are multiple CUs, but still only a few compared to the
number of .o files.

-flto is AFAIK the future for all compilations.  It is well known that -flto
debug info is somewhat broken now, but that needs to be fixed anyway.

As the DWARF size has been discussed for the 5+ years I have been in Tools,
this is a long-term project, and waiting for (helping, heh) a working -flto is
an acceptable solution.

This has some implications:

(a) A DWARF post-processing optimization tool no longer makes sense with -flto.

    (a1) Intra-CU optimizations in GCC make sense as it is the final output.

(b) .gdb_index will have limited scope, only to select which objfiles to expand,
    no longer to select which CUs to expand.

(c) The partial CU expansion Tom Tromey talks about is a must in such a case,
    although the smaller LTO debug info takes only 63% of the GDB memory
    requirements of the non-LTO (many-CUs) debug info.
    (GDB memory requirement is not directly proportional to the DWARF size.)

With -flto-partition=none, linking GDB took about 900MB of memory.  I am not
sure how Honza Hubicka's memory requirements for LTO (2.7GB for Mozilla) were
related to -flto-partition.  Still, some GBs of cheap memory for the few hosts
in the build farm (Koji) for Mozilla + LibreOffice should not be such a concern
IMO.

FYI for gdb with Rawhide -O2-style CFLAGS (-gdwarf-4 -fno-debug-types-section):

-fno-debug-types-section:
                       |  non-LTO  |    LTO
stripped binary size   |   5023064 |   4985864
separate .debug size   |  19190280 |  12484312 =65%
GDB RSS -readnow       | 160136 KB | 106252 KB
GDB RSS without .debug |  14964 KB |  14972 KB
GDB RSS difference     | 145172 KB |  91280 KB =63%

My idea was that those 65% (a 35% reduction) could be the magic ratio
achievable by the hypothetically optimal "Roland's" DWARF optimizer.  But
struct range_bounds alone is still defined there (including all its fields)
49x, so this is still far from optimal/"Roland's one".
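
One rough way to reproduce a count like the 49x above is to scan the text dump
from "readelf --debug-dump=info" and count complete struct DIEs by name.  The
sketch below is only an illustration in Python: the helper name
count_struct_definitions is made up here, and it assumes the usual binutils
text layout ("Abbrev Number: N (DW_TAG_...)" followed by "DW_AT_..." lines),
which may differ between versions.  It is not necessarily how the number above
was obtained.

```python
import re
import subprocess
import sys
from collections import Counter

# Assumed readelf text layout; adjust the patterns if your binutils differs.
DIE_RE = re.compile(r"Abbrev Number: \d+ \((DW_TAG_\w+)\)")
ATTR_RE = re.compile(r"\b(DW_AT_\w+)\s*:\s*(.*)$")


def count_struct_definitions(dump_text):
    """Count complete (non-declaration) struct DIEs per name in a dump."""
    counts = Counter()
    tag, name, is_decl = None, None, False

    def flush():
        # Record the DIE collected so far if it is a named struct definition.
        if tag == "DW_TAG_structure_type" and name and not is_decl:
            counts[name] += 1

    for line in dump_text.splitlines():
        die = DIE_RE.search(line)
        if die:
            flush()
            tag, name, is_decl = die.group(1), None, False
            continue
        attr = ATTR_RE.search(line)
        if not attr or tag is None:
            continue
        if attr.group(1) == "DW_AT_name":
            # Drop an "(indirect string, offset: 0x...): " prefix if present.
            name = attr.group(2).rsplit("): ", 1)[-1].strip()
        elif attr.group(1) == "DW_AT_declaration":
            is_decl = True
    flush()
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    dump = subprocess.run(
        ["readelf", "--debug-dump=info", sys.argv[1]],
        check=True, capture_output=True, text=True,
    ).stdout
    print(count_struct_definitions(dump)["range_bounds"])
```

Run it on the separate .debug file (or the unstripped binary); declarations
(DW_AT_declaration) are skipped so only full definitions are counted.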

Additionally with -fdebug-types-section:
                       v like above
                       |  non-LTO  |  non-LTO .debug_types | LTO .debug_types
stripped binary size   |   5023064 |  5023064              |  4985864
separate .debug size   |  19190280 | 12789960 = 67%        | 12170080 = 63%
GDB RSS -readnow       | 160136 KB |  77524 KB             | 227876 KB
GDB RSS without .debug |  14964 KB |  14968 KB             |  14964 KB
GDB RSS difference     | 145172 KB |  62556 KB = 43%       | 212912 KB = 147%
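
The derived rows in both tables (the RSS difference and the percentages
relative to the non-LTO -fno-debug-types-section build) can be rechecked
mechanically.  A minimal sketch in Python, using only the numbers from the
tables; the names BASELINE, VARIANTS, rss_cost and summarize are made up for
this illustration:

```python
# Figures copied from the tables (.debug sizes in bytes, RSS in KB).
BASELINE = {"debug": 19190280, "rss": 160136, "rss_nodebug": 14964}
VARIANTS = {
    "LTO": {"debug": 12484312, "rss": 106252, "rss_nodebug": 14972},
    "non-LTO .debug_types": {"debug": 12789960, "rss": 77524,
                             "rss_nodebug": 14968},
    "LTO .debug_types": {"debug": 12170080, "rss": 227876,
                         "rss_nodebug": 14964},
}


def rss_cost(row):
    """RSS attributable to debug info: -readnow RSS minus RSS without .debug."""
    return row["rss"] - row["rss_nodebug"]


def percent(value, base):
    """Value as a whole-number percentage of base."""
    return round(100 * value / base)


def summarize():
    """Map each variant to (.debug size %, RSS difference in KB, RSS %)."""
    base_cost = rss_cost(BASELINE)
    result = {}
    for name, row in VARIANTS.items():
        cost = rss_cost(row)
        result[name] = (
            percent(row["debug"], BASELINE["debug"]),
            cost,
            percent(cost, base_cost),
        )
    return result


if __name__ == "__main__":
    for name, (size_pct, cost, cost_pct) in summarize().items():
        print(f"{name:22} .debug {size_pct}%  RSS diff {cost} KB = {cost_pct}%")
```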

This IMO has a further implication:

(z) gcc/dwarf2out.c is a viable place to implement "Roland's" DWARF
    optimizer.


Regards,
Jan