From: Jakub Jelinek
To: "Kay F. Jahnke"
Cc: Kyrill Tkachov, "gcc@gcc.gnu.org"
Subject: Re: autovectorization in gcc
Date: Wed, 09 Jan 2019 11:21:00 -0000
Message-ID: <20190109112111.GP30353@tucnak>
In-Reply-To: <20190109110345.GO30353@tucnak>

On Wed, Jan 09, 2019 at 12:03:45PM +0100, Jakub Jelinek wrote:
> > The above is a typical example. So, to give a complete source 'vec_sqrt.cc':
> >
> > #include <cmath>
> >
> > extern float data [ 32768 ] ;
> >
> > extern void vf1()
> > {
> >   #pragma vectorize enable
> >   for ( int i = 0 ; i < 32768 ; i++ )
> >     data [ i ] = std::sqrt ( data [ i ] ) ;
> > }
> >
> > This has a large trip count, the loop is trivial. It's an ideal candidate
> > for autovectorization. When I compile this source, using
> >
> > g++ -O3 -mavx2 -S -o sqrt.s sqrt_gcc.cc
>
> Generally you want -Ofast or -ffast-math or at least some suboptions of that
> if you want to vectorize floating point loops, because vectorization in many
> cases changes where FPU exceptions would be generated, can affect precision
> by reordering the ops etc.  In the above case it is just that glibc
> declares the vector math functions for #ifdef __FAST_MATH__ only, as they
> have worse precision.

Actually, the last sentence was just a wrong guess in this case: for sqrt
no glibc libcall is needed (that is for trigonometric functions and the
like). All you need from -ffast-math for the above to vectorize is
-fno-math-errno, which tells the compiler you don't need errno set if you
call sqrt on a negative argument etc.

With -fopt-info-vec-missed the compiler would tell you:
/tmp/1.c:5:3: note: not vectorized: control flow in loop.
/tmp/1.c:5:3: note: bad loop form.
and you could look at the dumps to see that there is
  _2 = .SQRT (_1);
  if (_1 u>= 0.0)
    goto ; [99.95%]
  else
    goto ; [0.05%]
...
  [local count: 531495]:
  __builtin_sqrt (_1);
which is the idiom for doing sqrt inline with an instruction while, in the
unlikely case that the argument is negative, also calling the library
function so that it handles the errno setting.

	Jakub
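
For anyone who wants to try this, here is a minimal sketch of a self-contained variant of the quoted loop. It assumes GCC on x86-64; the file name in the comment, the constant `N`, the local definition of `data` and the `self_check` helper are illustrative and not from the thread. The compile line simply adds -fno-math-errno (and -fopt-info-vec to report on vectorization) to the options quoted above.

```cpp
// Sketch only. Assumed compile line (file name is hypothetical):
//   g++ -O3 -mavx2 -fno-math-errno -fopt-info-vec -c vec_sqrt.cc
#include <cassert>
#include <cmath>

constexpr int N = 32768;   // same trip count as the quoted source
float data[N];             // defined here so the example links on its own

// Same loop as in the quoted source. With -fno-math-errno the compiler
// does not have to keep the errno-setting library call for negative
// inputs, so the loop body has no control flow left in it.
void vf1()
{
  for (int i = 0; i < N; i++)
    data[i] = std::sqrt(data[i]);
}

// Hypothetical helper: fill the array with small perfect squares, run the
// loop and check the roots. These values are exactly representable in
// float and IEEE sqrt is correctly rounded, so exact comparison is fine.
void self_check()
{
  for (int i = 0; i < N; i++) {
    float r = static_cast<float>(i % 256);
    data[i] = r * r;
  }
  vf1();
  for (int i = 0; i < N; i++)
    assert(data[i] == static_cast<float>(i % 256));
}
```

Dropping -fno-math-errno from the compile line and using -fopt-info-vec-missed instead should give notes along the lines of the ones quoted above, though the exact wording may differ between GCC versions.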