From: Richard Biener
Date: Mon, 20 Nov 2023 08:24:35 +0100
Subject: Re: [PATCH 0/4] Add vector pair support to PowerPC attribute((vector_size(32)))
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool, "Kewen.Lin", David Edelsohn, Peter Bergner

On Mon, Nov 20, 2023 at
5:19 AM Michael Meissner wrote:
>
> This is similar to the patches on November 10th.
>
>   * https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636077.html
>   * https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636078.html
>   * https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636083.html
>   * https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636080.html
>   * https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636081.html
>
> to add a set of built-in functions that use the PowerPC __vector_pair type and
> that provide a set of functions to do basic operations on vector pair.
>
> After I posted these patches, it was decided that it would be better to have a
> new type that is used rather than a bunch of new built-in functions.  Within
> the GCC context, the best way to add this support is to extend the vector modes
> so that V4DFmode, V8SFmode, V4DImode, V8SImode, V16HImode, and V32QImode are
> used.
>
> These patches are to provide this new implementation.
>
> While in theory you could add a whole new type that isn't a larger size vector,
> my experience with IEEE 128-bit floating point is that GCC really doesn't like
> 2 modes that are the same size but have different implementations (such as we
> see with IEEE 128-bit floating point and IBM double-double 128-bit floating
> point).  So I did not consider adding a new mode for using with vector pairs.
>
> My original intention was to just implement V4DFmode and V8SFmode, since the
> primary users asking for vector pair support are people implementing the high
> end math libraries like Eigen and BLAS.
>
> However in implementing this code, I discovered that we will need integer
> vector pair support as well as floating point vector pair.  The integer modes
> and types are needed to properly implement byte shuffling and vector
> comparisons which need integer vector pairs.
>
> With the current patches, vector pair support is not enabled by default.
> The main reason is I have not implemented the support for byte shuffling which
> various tests depend on.
>
> I would also like to implement overloads for the vector built-in functions like
> vec_add, vec_sum, etc. that if you give it a vector pair, it would handle it
> just like if you give a vector type.
>
> In addition, once the various bugs are addressed, I would then implement the
> support so that automatic vectorization would consider using vector pairs
> instead of vectors.
>
> In terms of benchmarks, I wrote two benchmarks:
>
>    1) One benchmark is a saxpy type loop: value[i] += (a[i] * b[i]).  That is
>       a loop with 3 loads and a store per iteration.
>
>    2) Another benchmark produces a scalar sum of an entire vector.  This is a
>       loop that just has a single load and no store.
>
> For the saxpy type loop, I get the following general numbers for both float and
> double:
>
>    1) The benchmarks that use attribute((vector_size(32))) are roughly 9-10%
>       faster than using normal vector processing (both auto vectorize and
>       using vector types).
>
>    2) The benchmarks that use attribute((vector_size(32))) are roughly 19-20%
>       faster than if I write the loop using the vector pair loads using the
>       existing built-ins, and then manually split the values and do the
>       arithmetic and single vector stores.
>
> Unfortunately, for floating point, doing the sum of the whole vector is slower
> using the new vector pair built-in functions using a simple loop (compared to
> using the existing built-ins for disassembling vector pairs).  If I write more
> complex loops that manually unroll the loop, then the floating point vector
> pair built-in functions become like the integer vector pair integer built-in
> functions.  So there is some amount of tuning that will need to be done.
>
> There are 4 patches in this set:
>
> The first patch adds support for the types, and does moves, and provides some
> optimizations for extracting an element and setting an element.
>
> The second patch implements the floating point arithmetic operations.
>
> The third patch implements the integer operations.
>
> The fourth patch provides new tests to test these features.

I wouldn't expose the "fake" larger modes to the vectorizer but rather
adjust m_suggested_unroll_factor (which you already do to some extent).

> --
> Michael Meissner, IBM
> PO Box 98, Ayer, Massachusetts, USA, 01432
> email: meissner@linux.ibm.com