Date: Mon, 16 Dec 2019 07:19:00 -0000
From: Marc Glisse
Reply-To: gcc-help@gcc.gnu.org
To: Xi Ruoyao
cc: gcc-help@gcc.gnu.org
Subject: Re: size value of vector_size attribute

On Mon, 16 Dec 2019, Xi Ruoyao wrote:

> Is there any reason to enforce "x must be a power of 2" in
> __attribute__((vector_size(x)))?
>
> I want to use this attribute in my source code to simplify coding
> (instead of utilizing SIMD instructions, normally).  Someone may argue
> that I should use std::valarray but it is stupidly slow.  Now with this
> restriction on size value I may have to write something like
> std::valarray but w/o dynamic allocation.

See PR53024. One main reason is that supporting it would be some work, for
not enough demand.
Also, it can be done in user code, so compiler support is not strictly
necessary (though it would be convenient). Even lowering an unsupported
power of 2 to a set of smaller vectors still generates pretty bad code IIRC.

By the way, for 3 doubles on x86, would you prefer __m128d+double, or
__m256d with one slot ignored?

-- 
Marc Glisse
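
For illustration, here is a minimal sketch of the "done in user code" route for the 3-double case, taking the second option from the question above (a 4-lane vector with one slot ignored). The names v4d, Vec3 and dot are hypothetical and do not come from the thread; the sketch assumes GCC (or a compatible compiler) with the vector extensions enabled, and it makes no claim about the quality of the generated code.

```cpp
#include <cstddef>

// Assumption: GCC vector extension. 32 bytes = 4 doubles, a power of 2,
// so the size restriction discussed in the thread is satisfied.
typedef double v4d __attribute__((vector_size(32)));

// Hypothetical wrapper: 3 logical doubles stored in a 4-lane vector.
// The fourth lane is padding; it is set to zero and never read back.
struct Vec3 {
    v4d v;

    Vec3(double x, double y, double z) {
        v4d tmp = {x, y, z, 0.0};
        v = tmp;
    }
    explicit Vec3(const v4d &raw) : v(raw) {}

    // Read one of the three logical elements (i must be 0, 1 or 2).
    double operator[](std::size_t i) const { return v[i]; }
};

// Element-wise addition; the padding lane is added too, which is harmless.
inline Vec3 operator+(const Vec3 &a, const Vec3 &b) { return Vec3(a.v + b.v); }

// Scaling by a scalar; GCC broadcasts the scalar operand across the lanes.
inline Vec3 operator*(const Vec3 &a, double s) { return Vec3(a.v * s); }

// Dot product over the three logical lanes only.
inline double dot(const Vec3 &a, const Vec3 &b) {
    v4d p = a.v * b.v;
    return p[0] + p[1] + p[2];
}
```

With this, Vec3(1, 2, 3) + Vec3(4, 5, 6) yields the elements 5, 7, 9. Without AVX enabled (for example via -mavx), GCC lowers the 32-byte vector to smaller operations, which is the case where the generated code may be poor, as noted above.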