From mboxrd@z Thu Jan 1 00:00:00 1970
References: <6cee0cd1c8c6023eac77d32b531e4f4b@imap.linux.ibm.com> <20210322083146.GB231854@tucnak> <1d8a918d84f4fcd5da265f1ac4e81ae1@imap.linux.ibm.com>
In-Reply-To: <1d8a918d84f4fcd5da265f1ac4e81ae1@imap.linux.ibm.com>
From: Richard Biener
Date: Tue, 23 Mar 2021 09:25:30 +0100
Subject: Re: [RFC] avoid type conversion through versioning loop
To: guojiufu
Cc: Jakub Jelinek, GCC Development, Richard Guenther, Segher Boessenkool, Jeff Law
Content-Type: text/plain; charset="UTF-8"
List-Id: Gcc mailing list

On Tue, Mar 23, 2021 at 4:33 AM guojiufu wrote:
>
> On 2021-03-22 16:31, Jakub Jelinek via Gcc wrote:
> > On Mon, Mar 22, 2021 at 09:22:26AM +0100, Richard Biener via Gcc wrote:
> >> Better than doing loop versioning is to enhance SCEV (and thus also
> >> dependence analysis) to track extra conditions they need to handle
> >> cases similar as to how niter analysis computes its 'assumptions'
> >> condition. That allows the versioning to be done when there's an
> >> actual beneficial transform (like vectorization) rather than just
> >> upfront for the eventual chance that there'll be any. Ideally such
> >> transform would then choose IVs in their transformed copy that
> >> are analyzable w/o repeating such versioning exercise for the next
> >> transform.
> >
> > And it might be beneficial to perform some type promotion/demotion
> > pass, either early during vectorization or separately before
> > vectorization on a loop copy guarded with the ifns e.g. ifconv uses too.
> > Find out what type sizes the loop uses, first try to demote computations
> > to narrower types in the vectorized loop candidate (e.g. if something
> > is computed in a wider type only to have the result demoted to narrower
> > type), then pick up the widest type size still in use in the loop (ok,
> > this assumes we don't mix multiple vector sizes in the loop, but
> > currently our vectorizer doesn't do that) and try to promote
> > computations that could be promoted to that type size.
> > We do partially something like that during vect patterns for bool
> > types, but not other types I think.
> >
> > Jakub
>
> Thanks for the suggestions!
>
> Enhancing SCEV could help other optimizations and improve performance in
> some cases.
> But one of the direct ideas of using the '64bit type' is to eliminate
> conversions, even for some cases which are not easy to optimize through
> ifconv/vectorization, for example:
>
> unsigned int i = 0;
> while (a[i]>1e-3)
>   i++;
>
> unsigned int i = 0;
> while (p1[i] == p2[i] && p1[i] != '\0')
>   i++;
>
> Or only do versioning on type for this kind of loop? Any suggestions?

But the benefit of the "optimization" resulting from such versioning is
hard to determine upfront, which means we'll pay quite a big code size
cost for an unknown, questionable gain. What's the particular optimization
in the above cases? Note that, for example, for

unsigned int i = 0;
while (a[i]>1e-3)
  i++;

you know that when 'i' wraps, the loop will not terminate. There's the
address computation, i * sizeof (T), which is done in a larger type to
avoid overflow, so we have &a + zext (i) * 8 - is that the operation
that is 'slow' for you?

Richard.

> BR.
> Jiufu Guo.
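
For readers following the thread, here is a minimal C sketch of the two
forms under discussion. It is an illustration only, not GCC's actual
transform: the function names are hypothetical, the element type is
assumed to be double with sizeof (double) == 8 (matching the "* 8"
above), and the target is assumed to have 64-bit pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The loop from the thread, with the 32-bit unsigned index.
   Each access a[i] needs the address a + zext (i) * sizeof (double).  */
size_t
count_above_u32 (const double *a)
{
  assert (a != NULL);
  unsigned int i = 0;
  while (a[i] > 1e-3)
    i++;
  return i;
}

/* Hypothetical copy of the same loop with a pointer-width index
   (assumption: this is what a "64bit type" version would look like).
   No zero extension is needed before scaling the index.  */
size_t
count_above_u64 (const double *a)
{
  assert (a != NULL);
  size_t i = 0;
  while (a[i] > 1e-3)
    i++;
  return i;
}

/* Byte offset of a[i] for a 32-bit index: the index is zero-extended
   to 64 bits first, so the multiplication itself cannot overflow.  */
uint64_t
byte_offset_u32 (uint32_t i)
{
  return (uint64_t) i * sizeof (double);
}
```

Both loop functions return the same count whenever the loop exits before
the 32-bit index wraps. The two forms differ only if 'i' wraps to 0, and
in that case the first loop revisits elements it has already accepted,
so it cannot terminate.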