From: Jan Kratochvil
To: archer@sourceware.org
Cc: Jakub Jelinek
Date: Wed, 01 Feb 2012 13:23:00 -0000
Subject: Inter-CU DWARF size optimizations and gcc -flto
Message-ID: <20120201132307.GA32578@host2.jankratochvil.net>

Hi,

I am sorry if this is clear to everyone already, but I admit I played with
it only yesterday.

With gcc -flto -flto-partition=none gcc outputs only a single CU
(Compilation Unit). With the default (omitting -flto-partition) there are
multiple CUs, but still few compared to the number of .o files.

-flto is AFAIK the future for all compilations. It is well known that
-flto debug info is somewhat broken now, but that needs to be fixed anyway.
As the DWARF size has been discussed for the 5+ years I have been in Tools,
this is a long-term project, and waiting for (helping, heh) a working -flto
is an acceptable solution.

This has some implications:

(a) A DWARF post-processing optimization tool no longer makes sense with
    -flto.
(a1) Intra-CU optimizations in GCC make sense, as it is the final output.
(b) .gdb_index will have limited scope: only to select which objfiles to
    expand, no longer to select which CUs to expand.
(c) Partial CU expansion, which Tom Tromey talks about, is a must in such a
    case. Although the smaller LTO debug info takes only 63% of the GDB
    memory requirements compared to the non-LTO (many-CUs) debug info.
    (The GDB memory requirement is not directly proportional to the DWARF
    size.)

With -flto-partition=none the linking of GDB took about 900MB. I am not
sure how Honza Hubicka's memory requirements for LTO (2.7GB for Mozilla)
were related to -flto-partition. Still, some GBs of cheap memory for the
few hosts in the build farm (Koji) for Mozilla + LibreOffice should not be
such a concern IMO.

FYI for gdb with Rawhide -O2-style CFLAGS
(-gdwarf-4 -fno-debug-types-section):

-fno-debug-types-section: | non-LTO   | LTO
stripped binary size      |   5023064 |   4985864
separate .debug size      |  19190280 |  12484312 = 65%
GDB RSS -readnow          | 160136 KB | 106252 KB
GDB RSS without .debug    |  14964 KB |  14972 KB
GDB RSS difference        | 145172 KB |  91280 KB = 63%

I had an idea that those 65% (a 35% reduction) could be the magic ratio
achievable by the hypothetically optimal "Roland's" DWARF optimizer. But at
least struct range_bounds is defined there (including all its fields) 49x,
so this is still far from the optimal/"Roland's" one.

Additionally with -fdebug-types-section:

                         v like above
                       | non-LTO   | non-LTO .debug_types | LTO .debug_types
stripped binary size   |   5023064 |   5023064            |   4985864
separate .debug size   |  19190280 |  12789960 = 67%      |  12170080 = 63%
GDB RSS -readnow       | 160136 KB |  77524 KB            | 227876 KB
GDB RSS without .debug |  14964 KB |  14968 KB            |  14964 KB
GDB RSS difference     | 145172 KB |  62556 KB = 43%      | 212912 KB = 147%

This has IMO some implications:

(z) gcc/dwarf2out.c is a viable place to implement "Roland's" DWARF
    optimizer.


Regards,
Jan
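
P.S. The percentages and the "GDB RSS difference" rows above can be
rechecked from the raw figures. Below is a minimal Python sketch of that
arithmetic. The dictionary layout and the function names are hypothetical,
chosen only for illustration; they are not part of any GDB or GCC tooling.
The numbers are copied from the tables, and each percentage is taken
relative to the plain non-LTO column.

```python
# Figures copied from the tables in this mail. Key and function names are
# illustrative assumptions, not an existing tool or API.
MEASUREMENTS = {
    "non-LTO": {
        "debug_size": 19190280, "rss_readnow": 160136, "rss_nodebug": 14964},
    "LTO": {
        "debug_size": 12484312, "rss_readnow": 106252, "rss_nodebug": 14972},
    "non-LTO .debug_types": {
        "debug_size": 12789960, "rss_readnow": 77524, "rss_nodebug": 14968},
    "LTO .debug_types": {
        "debug_size": 12170080, "rss_readnow": 227876, "rss_nodebug": 14964},
}


def rss_difference(m):
    """GDB RSS attributable to debug info: -readnow RSS minus RSS without .debug (KB)."""
    return m["rss_readnow"] - m["rss_nodebug"]


def percent_of_baseline(value, baseline):
    """Value as a whole-number percentage of the baseline."""
    return round(100 * value / baseline)


def summarize(measurements, baseline_name="non-LTO"):
    """Map each configuration to (.debug size %, RSS difference KB, RSS difference %)."""
    base = measurements[baseline_name]
    base_rss = rss_difference(base)
    rows = {}
    for name, m in measurements.items():
        diff = rss_difference(m)
        rows[name] = (
            percent_of_baseline(m["debug_size"], base["debug_size"]),
            diff,
            percent_of_baseline(diff, base_rss),
        )
    return rows


if __name__ == "__main__":
    for name, (size_pct, rss_kb, rss_pct) in summarize(MEASUREMENTS).items():
        print(f"{name:22} .debug {size_pct:3d}%  RSS diff {rss_kb:6d} KB ({rss_pct}%)")
```

On these figures the LTO .debug_types case has the smallest .debug size
(63%) but the largest GDB memory difference (147%), which is why the DWARF
size alone is not a good predictor of the GDB memory requirement.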