From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 125440 invoked by alias); 23 May 2019 01:28:00 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 125405 invoked by uid 89); 23 May 2019 01:27:59 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-6.1 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE,KAM_SHORT,SPF_HELO_PASS autolearn=ham version=3.3.1 spammy=20th, anticipate, walks, refining X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 23 May 2019 01:27:57 +0000 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 132CC3091754 for ; Thu, 23 May 2019 01:27:56 +0000 (UTC) Received: from [10.10.124.16] (ovpn-124-16.rdu2.redhat.com [10.10.124.16]) by smtp.corp.redhat.com (Postfix) with ESMTP id 03EAC52EA; Thu, 23 May 2019 01:27:54 +0000 (UTC) To: GCC Cc: Jeff Law , Aldy Hernandez From: Andrew MacLeod Subject: On-Demand range technology [1/5] - Executive Summary Message-ID: Date: Thu, 23 May 2019 01:28:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-05/txt/msg00193.txt.bz2 Now that stage 1 has reopened, I’d like to reopen a discussion about the technology and experiences we have from the Ranger project I brought up last year. https://gcc.gnu.org/ml/gcc/2018-05/msg00288.html .  (The original wiki pages are now out of date, and I will work on updating them soon.) The Ranger is designed to evaluate ranges on-demand rather than through a top-down approach. This means you can ask for a range from anywhere, and it walks back thru the IL satisfying any preconditions and doing the required calculations.  It utilizes a cache to avoid re-doing work. If ranges are processed in a forward dominator order, it’s not much different than what we do today. Due to its nature, the order you process things in has minimal impact on the overall time… You can do it in reverse dominator order and get similar times. It requires no outside preconditions (such as dominators) to work, and has a very simple API… Simply query the range of an ssa_name at any point in the IL and all the details are taken care of. We have spent much of the past 6 months refining the prototype (branch “ssa-range”) and adjusting it to share as much code with VRP as possible. They are currently using a common code base for extracting ranges from statements, as well as simplifying statements. The Ranger deals with just  ranges. The other aspects of  VRP are intended to be follow on work that integrates tightly with it, but are also independent and would be available for other passes to use.  These include: - Equivalency tracking - Relational processing - Bitmask tracking We have implemented a VRP pass that duplicates the functionality of EVRP (other than the bits mentioned above), as well as converted a few other passes to use the interface.. I do not anticipate those missing bits having a significant impact on the results. The prototype branch it quite stable and can successfully build and test an entire Fedora distribution (9174 packages). There is an issue with switches I will discuss later whereby the constant range of a switch edge is not readily available and is exponentially expensive to calculate. We have a design to address that problem, and in the common case we are about 20% faster than EVRP is. When utilized in passes which only require ranges for a small number of ssa-names we see significant improvements.  The sprintf warning pass for instance allows us to remove the calculations of dominators and the resulting forced walk order. We see a 95% speedup (yes, 1/20th of the overall time!).  This is primarily due to no additional overhead and only calculating the few things that are actually needed.  The walloca and wrestrict passes are a similar model, but as they have not been converted to use EVRP ranges yet, we don’t see similar speedups there. That is the executive summary.  I will go into more details of each major thing mentioned in follow on notes so that comments and discussions can focus on one thing at a time. We think this approach is very solid and has many significant benefits to GCC. We’d like to address any concerns you may have, and work towards finding a way to integrate this model with the code base during this stage 1. Comments and feedback always welcome! Thanks Andrew