Date: Wed, 09 Jan 2019 11:03:00 -0000
From: Jakub Jelinek
To: "Kay F. Jahnke"
Cc: Kyrill Tkachov, "gcc@gcc.gnu.org"
Subject: Re: autovectorization in gcc
Message-ID: <20190109110345.GO30353@tucnak>
References: <41ea83cd-0ce8-4f25-35e5-888513d69c7b@gmail.com> <5C35C2C2.1050106@foss.arm.com>

On Wed, Jan 09, 2019 at 11:56:03AM +0100, Kay F. Jahnke wrote:
> The above is a typical example. So, to give a complete source 'vec_sqrt.cc':
>
> #include <cmath>
>
> extern float data [ 32768 ] ;
>
> extern void vf1()
> {
>   #pragma vectorize enable
>   for ( int i = 0 ; i < 32768 ; i++ )
>     data [ i ] = std::sqrt ( data [ i ] ) ;
> }
>
> This has a large trip count, the loop is trivial. It's an ideal candidate
> for autovectorization. When I compile this source, using
>
> g++ -O3 -mavx2 -S -o sqrt.s sqrt_gcc.cc

If you want floating-point loops vectorized, you generally want -Ofast or
-ffast-math, or at least some of its suboptions, because vectorization in
many cases changes where FPU exceptions would be generated, can affect
precision by reordering the operations, etc.

In the above case, it is just that glibc declares the vector math functions
only under #ifdef __FAST_MATH__, as they have worse precision.

Note that gcc doesn't recognize #pragma vectorize. You can use e.g.
#pragma omp simd or #pragma GCC ivdep if you want to assert properties of
the loop that the compiler can't easily prove itself and that would help
vectorization.

	Jakub
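
The two pragmas mentioned above could be applied roughly as follows. This is
only a sketch, not code from the thread: the function names sqrt_in_place and
sqrt_copy and the file name in the build line are made up for illustration,
and whether the loops actually get vectorized depends on the GCC and glibc
versions in use.

```cpp
// Possible build line (assumed file name; flags as discussed in the thread,
// plus -fopenmp-simd so that GCC honours "#pragma omp simd"):
//   g++ -O3 -mavx2 -ffast-math -fopenmp-simd -S -o sqrt.s vec_sqrt_sketch.cc

#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: replaces each element with its square root.
void sqrt_in_place(float* data, std::size_t n)
{
    assert(data != nullptr || n == 0);

    // Asks the compiler to treat the loop as a SIMD loop. GCC acts on this
    // only when -fopenmp or -fopenmp-simd is given; otherwise it is ignored.
#pragma omp simd
    for (std::size_t i = 0; i < n; i++)
        data[i] = std::sqrt(data[i]);
}

// Hypothetical helper: writes sqrt(in[i]) to out[i]. The caller must make
// sure the two arrays do not overlap, since the pragma promises exactly that.
void sqrt_copy(const float* in, float* out, std::size_t n)
{
    assert((in != nullptr && out != nullptr) || n == 0);

    // Asserts that there are no loop-carried dependencies, which the
    // compiler cannot prove for two arbitrary pointers.
#pragma GCC ivdep
    for (std::size_t i = 0; i < n; i++)
        out[i] = std::sqrt(in[i]);
}
```

Calling sqrt_in_place(data, 32768) on the array from the quoted example would
correspond to the original vf1(). Adding -fopt-info-vec to the build line
makes GCC report which loops it vectorized.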