From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 17214 invoked by alias); 25 May 2018 07:13:24 -0000 Mailing-List: contact fortran-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Post: List-Help: , Sender: fortran-owner@gcc.gnu.org Received: (qmail 16971 invoked by uid 89); 25 May 2018 07:13:19 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=H*r:sk:18-v6so X-HELO: mail-wm0-f46.google.com Received: from mail-wm0-f46.google.com (HELO mail-wm0-f46.google.com) (74.125.82.46) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 25 May 2018 07:13:17 +0000 Received: by mail-wm0-f46.google.com with SMTP id 18-v6so5974264wml.2 for ; Fri, 25 May 2018 00:13:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=0IFlbu5woemQlKSVlNpAbyN8RxGn7xw+qD09KxJsNB0=; b=qa7nWbXJ0mnbHsBz9UTrpLFlubo3Ccgk3793cPlQK5XA9lZu8MRofHOvEvMXM6xl58 st5Hq5A/nyiq6gr8LknfSQroEkGainRLkIHB4eslK5I6XfSchIbNoblaTLZzYSdgdFrO zrUJ6KfOFmE3uqr5Vav9cBzGyy7ZMOdPSftzB1C0Rug2VX9prU41z/q86JBLqkiU9gHW CLp2djMo10kgmhfNAdk5jOVwcmGth89/2kHj80l2NNDI4ZqDBREKWuGJk86+LJhGBdtH juwqIpw3Sg4PbvcPTgVBs/hFJexEt4Fv+Gu+z1A6vOYNQz/bXb626DEEe36qbPqi/trc iCqw== X-Gm-Message-State: ALKqPwc8kpJG+iKQZZ8XksGy1dXUebztMqEP4odnvbrB+XxD4TbtxKIQ 5a+aSG1tGp9fDOZeu7n7bvVzovo8MFr69SrYxul5MDUq X-Google-Smtp-Source: AB8JxZrG89HtaJLjEVYkXKYthBVYnG/9xJGhJYDHZSUmc+M4X4+xZsbpRleZWkYTTgRkLH6xOhgNPf3AhQHmP3sgDho= X-Received: by 2002:a2e:4255:: with SMTP id p82-v6mr795758lja.16.1527232395356; Fri, 25 May 2018 00:13:15 -0700 (PDT) MIME-Version: 1.0 References: <6944935.KTWIneVxun@andrew-precision-3520> In-Reply-To: From: Richard Biener Date: Fri, 25 May 2018 07:13:00 -0000 Message-ID: Subject: Re: Optimization of add_dt_to_dt_list() in resolve.c To: abenson@carnegiescience.edu Cc: "fortran@gcc.gnu.org" Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2018-05/txt/msg00102.txt.bz2 On Fri, May 25, 2018 at 9:06 AM Richard Biener wrote: > On Fri, May 25, 2018 at 12:53 AM Andrew Benson < abenson@carnegiescience.edu> > wrote: > > I've been attempting to track down some of the causes of very long compile > > times for files which use modules that contain a large number of symbols. > The > > worst case offender in my code takes around 12 minutes to compile. > > After profiling f951 for this source file it turns out that the majority > of the > > time is spent in add_dt_to_dt_list() in resolve.c. In cases where the > number > > of symbols imported becomes very large (~10,000 in some cases in this > code), > > the O(N) search in this function becomes inefficient. > > A simple optimization for this problem seems to be to just have the > gfc_symbol > > struct include a pointer back to the corresponding entry in the > > gfc_derived_types list. It's then fast to check if a symbol is already on > that > > list by checking if this pointer is non-null. (It could just as easily be > an > > int in gfc_symbol which indicates if the symbol is already added to the > list - > > I don't know if having a pointer to the list entry is useful for any other > > reason.) > > With this change in place compile times are much faster - my worst case > > offender now takes just under 1 minute to compile. > > My patch is attached. I freely admit that I have only a very shallow > > understanding of the inner workings of the compiler, so I would not be > > surprised if there are good reasons not to do this. I did "make > check-fortran" > > and did not see any failures. If any one wants to try this out and/or > provide > > any feedback I'd be happy to hear it. > It looks like it would be cheaper to simply embed gtc_dt_list *next in > gfc_symbol? > (in case a gfc_symbol can be only on one gfc_dt_list which your patch > assumes as well) Note you'll need to make the list cyclic to make the member test work as sym->next_dt != NULL. Also note that gfc_get_derived_type has another linear list walk so it looks like derived type processing is still quadratic in the number of derived types after the change. That suggests that eventually making the derived_types list a hash-map of gfc_symbol to the common derived type (symbol) would be a better fix, thus doing the merging at dt-"list" insertion time. Of course that's likely a bigger change that needs more thorough understanding of how derived types work. A fix along your idea looks simple enough and has obvious benefits as you show so it might be even appropriate for release branches. Richard. > Richard. > > Thanks, > > Andrew > > -- > > * Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html > > * Galacticus: https://bitbucket.org/abensonca/galacticus