From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 121472 invoked by alias); 9 Mar 2015 18:44:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 121448 invoked by uid 89); 9 Mar 2015 18:44:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Mon, 09 Mar 2015 18:44:02 +0000 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t29Ii0NF006116 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Mon, 9 Mar 2015 14:44:01 -0400 Received: from [10.3.113.56] (ovpn-113-56.phx2.redhat.com [10.3.113.56]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t29Ii0TV021344; Mon, 9 Mar 2015 14:44:00 -0400 Message-ID: <54FDE9F0.4030305@redhat.com> Date: Mon, 09 Mar 2015 18:44:00 -0000 From: Jeff Law User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Richard Biener CC: gcc-patches@gcc.gnu.org Subject: Re: [PATCH][RFC] Fix PR63155 References: <54F9DDB8.9030300@redhat.com> In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2015-03/txt/msg00491.txt.bz2 On 03/09/15 03:42, Richard Biener wrote: > On Fri, 6 Mar 2015, Jeff Law wrote: > >> On 03/06/15 06:16, Richard Biener wrote: >>> >>> This fixes PR63155 and reduces the memory usage at -O0 from reported >>> 10GB (couldn't verify/update on my small box) to 350MB (still worse >>> compared to 4.8 which needs only 50MB). >>> >>> It fixes this by no longer computing live info or building a conflict >>> graph for coalescing of SSA names flowing over abnormal edges >>> (which needs to succeed). >>> >>> Of course this also removes verification that this coalescing is valid >>> (but computing this has quadratic cost). With this it turns >>> ICEs into miscompiles. >>> >>> We could restrict verifying that we can perform abnormal coalescing >>> to ENABLE_CHECKING (and I've wanted a verifier pass that can verify >>> this multiple times to be able to catch what breaks it and not having >>> to work back from out-of-SSA ICEing...). >>> >>> So any opinion on this patch welcome. >>> >>> Bootstrap and regtest running on x86_64-unknown-linux-gnu. >>> >>> Ok for trunk? ;) >>> >>> Thanks, >>> Richard. >>> >>> 2015-03-06 Richard Biener >>> >>> PR middle-end/63155 >>> * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being NULL. >>> (coalesce_partitions): Split out abnormal coalescing to ... >>> (perform_abnormal_coalescing): ... this function. >>> (coalesce_ssa_name): Perform abnormal coalescing without computing >>> live/conflict. >> I'd personally like to keep the checking when ENABLE_CHECKING. >> >> I haven't followed this discussion real closely, but I wonder if some kind of >> blocking approach would work without blowing up the memory consumption. >> There's no inherent reason why we have to coalesce everything at the same >> time. We can use a blocking factor and do coalescing on some N number of >> SSA_NAMEs at a time. > > Yes, that's possible at (quite?) some compile-time cost. Note that we > can't really guarantee that the resulting live/conflict problems shrink > significantly enough without sorting the coalesces in a different way > (not after important coalesces but after their basevars). Yea, it's a class time/space tradeoff. I guess it comes down to how much compile-time pain we'll take for reducing memory usage. It may also be the case that some blocking factors are actually faster than doing everything at once, even for more common input graph sizes. I actually ran into this when looking at the liveness computations for into-ssa eons ago. We were computing liveness in parallel, but a blocking of 1 object is actually best: https://gcc.gnu.org/ml/gcc-patches/2003-10/msg01301.html Jeff