Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
In-Reply-To: <20130808222332.GA31755@kam.mff.cuni.cz>
References: <20130802150529.GC15776@kam.mff.cuni.cz> <20130808222332.GA31755@kam.mff.cuni.cz>
Date: Thu, 08 Aug 2013 23:04:00 -0000
Subject: Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
From: Teresa Johnson
To: Jan Hubicka
Cc: Bernhard Reutner-Fischer, "gcc-patches@gcc.gnu.org", Steven Bosscher, Jeff Law, marxin.liska@gmail.com
Content-Type: text/plain; charset=ISO-8859-1
X-SW-Source: 2013-08/txt/msg00501.txt.bz2

On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka wrote:
> Hi,
> Martin Liska was kind enough to generate a disk seeking graph of gimp startup with his function reordering.
> His code simply measures the time of first execution of a function and orders functions in that order.
> The functions stay in the subsections (unlikely/startup/exit/hot/normal) that are then glued together
> in this order.
>
> I am attaching disk seeking with and without -freorder-blocks-and-partition (with your patch).
>
> In 2.pdf you can see two increasing sequences in the text segment. If I am not mistaken the bottom
> one comes from the hot and the top one from the normal section. The big unused part at the bottom is the unlikely
> section, since most of gimp is not trained.

2.pdf is reordered with Martin's technique?

>
> Now 1.pdf is with -freorder-blocks-and-partition and your patch. You can see there is a third sequence
> near the bottom of the text section. That is the beginning of the unlikely section, so it tracks jumps where we
> fall into the cold section of a function.

1.pdf is generated using the usual FDO + -freorder-blocks-and-partition
(i.e. not using Martin's technique)?

>
> It still seems rather bad (i.e. a good part of the unlikely section is actually used). I think the dominator
> based approach is not going to work too reliably (I can "fix" my testcase to contain multiple nested
> conditionals and then the heuristic about predecessors won't help).

Yes, this doesn't look good. Did you use the latest version of my
patch that doesn't walk the dominators?

Do you know how many training runs are done for this benchmark?
I think a lot of the issues that you pointed out, such as the hot loop
preceded by non-looping conditional code in your earlier example, or
multiple nested conditionals, come from the fact that the cold cutoff
is not 0, but some number less than the number of training runs.
Perhaps the cutoff for splitting should be 0. Then the main issue that
needs to be corrected is profile insanities, not code that is executed
once (since that would not be marked cold).

The only other issue that I can think of here is that the training
data was not representative and didn't execute these blocks.

>
> What about simply walking the CFG from entry through all edges with non-0 counts and making all reachable
> blocks hot + forcibly making any hot blocks not reachable this way reachable?

Is this different than what I currently have + changing the cold
cutoff to 0? In that case any blocks reachable through non-0 edges
should be non-0 and marked hot, and the current patch forces the most
frequent paths to all hot blocks to be hot.

Thanks,
Teresa

> I think we are really looking primarily for dead parts of the functions (sanity checks/error handling)
> that should not be visited by the train run. We can then see how to make the heuristic more aggressive?
>
> Honza

-- 
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
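
For illustration only, the CFG walk proposed in the quoted text (follow
edges with non-0 counts from the entry, mark everything reached as hot)
can be sketched as below. This is a minimal Python sketch, not GCC code:
the dictionary-based CFG, the block names, and the functions
reachable_via_nonzero_edges and partition are all hypothetical. It only
reports blocks whose count is non-0 but which the walk does not reach
(a profile insanity); it does not attempt the "force a path to be hot"
step that the patch performs.

```python
from collections import deque


def reachable_via_nonzero_edges(entry, succs):
    """Return the blocks reachable from `entry` using only edges with count > 0.

    `succs` (hypothetical representation) maps a block name to a list of
    (successor, edge_count) pairs.
    """
    hot = {entry}
    worklist = deque([entry])
    while worklist:
        block = worklist.popleft()
        for succ, count in succs.get(block, []):
            if count > 0 and succ not in hot:
                hot.add(succ)
                worklist.append(succ)
    return hot


def partition(entry, succs, block_counts):
    """Split blocks into (hot, inconsistent, cold) with a cold cutoff of 0.

    `inconsistent` holds blocks with a non-0 count that are not reachable
    through non-0 edges, i.e. candidates for profile fix-up.
    """
    hot = reachable_via_nonzero_edges(entry, succs)
    inconsistent = {b for b, c in block_counts.items() if c > 0 and b not in hot}
    cold = set(block_counts) - hot - inconsistent
    return hot, inconsistent, cold


# Hypothetical function: a once-executed check, a hot loop, an error path
# never taken in training, and a "fixup" block whose count disagrees with
# its only incoming edge.
succs = {
    "entry": [("check", 1)],
    "check": [("error", 0), ("loop", 1)],
    "loop": [("loop", 1000), ("exit", 1)],
    "error": [("fixup", 0)],
    "fixup": [("exit", 5)],
}
block_counts = {"entry": 1, "check": 1, "error": 0, "loop": 1000, "exit": 1, "fixup": 5}

hot, inconsistent, cold = partition("entry", succs, block_counts)
print(sorted(hot))           # ['check', 'entry', 'exit', 'loop']
print(sorted(inconsistent))  # ['fixup']
print(sorted(cold))          # ['error']
```

With a consistent profile the inconsistent set is empty and the walk
marks exactly the blocks with non-0 counts, which is the equivalence
asked about above for a cutoff of 0. Note that "check", executed only
once, stays hot here; only the untrained error path ends up cold.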