Subject: [Bug lto/65536] LTO line number information garbled
From: Jan Hubicka <hubicka at gcc dot gnu.org>
Date: Fri, 27 Mar 2015 19:39:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65536

--- Comment #53 from Jan Hubicka ---
> You can get an estimate of how much memory would be required to stream in/out
> directly the line_table by summing up the memory reported by
> dump_line_table_statistics for each TU before streaming out (perhaps using
> -ftrack-macro-expansion=0 to reduce it further). This would still be an
> overestimate, because one can drop one char (sysp) and one int (included_from)
> per map and one can drop all maps that are not used by LTO. Moreover, you will
> not need a cache and everything will be in order already when you stream in.

I see, the stats actually are in -fmem-report, how convenient ;))

With Firefox I get:

Line Table allocations during the compilation process
Number of ordinary maps used:        407k
Ordinary map used size:              15M
Number of ordinary maps allocated:   409k
Ordinary maps allocated size:        15M
Number of macro maps used:           0
Macro maps used size:                0
Macro maps locations size:           0
Macro maps size:                     0
Duplicated maps locations size:      0
Total allocated maps size:           15M
Total used maps size:                15M

after streaming in all declarations & types, and

Line Table allocations during the compilation process
Number of ordinary maps used:        769k
Ordinary map used size:              30M
Number of ordinary maps allocated:   1638k
Ordinary maps allocated size:        63M
Number of macro maps used:           0
Macro maps used size:                0
Macro maps locations size:           0
Macro maps size:                     0
Duplicated maps locations size:      0
Total allocated maps size:           63M
Total used maps size:                30M

by the end of the WPA stage. The extra growth comes from variable
initializers and function bodies we bring in on demand (for merging, ICF,
and the devirt machinery). Those largely bypass my cache.
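To make the arithmetic in the quoted comment concrete, here is a minimal
sketch (not GCC code; the field set is only loosely modeled on libcpp's
struct line_map, so treat the names as assumptions) of the per-map payload
left over once the droppable sysp char and included_from int are gone:

#include <stdio.h>

/* Hypothetical streamable subset of an ordinary line map.  */
struct streamable_ordinary_map
{
  const char *to_file;          /* file name; interned, streamed once  */
  unsigned int to_line;         /* first source line covered by map    */
  unsigned int start_location;  /* first location_t value of the map   */
  unsigned int column_bits;     /* bits reserved for column numbers    */
  /* Dropped on disk: unsigned char sysp; int included_from;  */
};

int
main (void)
{
  printf ("bytes per streamed ordinary map: ~%zu\n",
          sizeof (struct streamable_ordinary_map));
  return 0;
}

With ~407k ordinary maps as above, even a few dozen bytes per map adds up
to the ~15M the report shows.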
Individual ltrans units consume, after streaming in global decls:

Line Table allocations during the compilation process
Number of ordinary maps used:        29k
Ordinary map used size:              1189k
Number of ordinary maps allocated:   102k
Ordinary maps allocated size:        4095k
Number of macro maps used:           0
Macro maps used size:                0
Macro maps locations size:           0
Macro maps size:                     0
Duplicated maps locations size:      0
Total allocated maps size:           4095k
Total used maps size:                1189k

This does not look terrible.

I tried to sum the sizes for EON and got to 92493kB, and 8631kB without
macro expansion tracking (note that this size is basically the same with or
without my proposed line-maps.c patch, so it does not seem to pessimize
non-LTO builds). At WPA time (with my proposed patch; I will rebuild the
tree without it after lunch) I get:

Line Table allocations during the compilation process
Number of ordinary maps used:        6084
Ordinary map used size:              237k
Number of ordinary maps allocated:   6553
Ordinary maps allocated size:        255k
Number of macro maps used:           0
Macro maps used size:                0
Macro maps locations size:           0
Macro maps size:                     0
Duplicated maps locations size:      0
Total allocated maps size:           255k
Total used maps size:                237k

So about 30x smaller, even though we do have some information loss here
(i.e. inline stacks).

So it seems we cannot really stream linemaps directly: way too much of the
information present at parse time is discarded by LTO time (about 95% of
the trees that are streamed out are discarded by tree merging and thus,
with my cache, never hit the linemaps).

The linemap representation is interesting. I suppose for the next release
we want to do a kind of on-disk variant inspired by libcpp. If we arrange
the cache to be per-LTO-section (not randomly flushed as it is now), we can
read the on-disk format into LTO's own format and apply the relevant part
to the line maps after tree merging.
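To make that last idea concrete, a rough sketch of what such a
per-LTO-section location cache could look like; every name here is
hypothetical rather than existing GCC API (a real version would finish by
calling libcpp's linemap_add for each surviving map):

#include <stdbool.h>
#include <stddef.h>

/* One decoded on-disk ordinary map entry (hypothetical format).  */
struct disk_map
{
  unsigned int file_id;  /* index into the section's file-name table  */
  unsigned int line;     /* first source line covered by the map      */
};

/* Per-section cache: the decoded maps plus a bitmap recording which
   of them were actually referenced while reading trees.  */
struct section_loc_cache
{
  struct disk_map *maps;
  bool *used;
  size_t n_maps;
};

/* Called for every location seen while streaming trees in.  */
void
note_map_use (struct section_loc_cache *c, size_t map_index)
{
  if (map_index < c->n_maps)
    c->used[map_index] = true;
}

/* After tree merging, replay only the surviving maps into the global
   line_table; this sketch just counts them.  */
size_t
apply_cache (const struct section_loc_cache *c)
{
  size_t entered = 0;
  for (size_t i = 0; i < c->n_maps; i++)
    if (c->used[i])
      entered++;
  return entered;
}

int
main (void)
{
  struct disk_map maps[3] = { { 0, 1 }, { 1, 10 }, { 0, 42 } };
  bool used[3] = { false, false, false };
  struct section_loc_cache c = { maps, used, 3 };

  note_map_use (&c, 1);          /* pretend only map 1 survived merging  */
  return (int) apply_cache (&c); /* one map would enter the line_table   */
}

Since roughly 95% of streamed trees are discarded by merging, replaying
only the used maps after merging is where the ~30x reduction above would
come from.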