From: Andrew Pinski
Date: Tue, 29 Nov 2022 07:13:37 -0800
Subject: Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector
To: Prathamesh Kulkarni
Cc: gcc Patches, Richard Sandiford

On Tue, Nov 29, 2022 at 6:40 AM Prathamesh Kulkarni via Gcc-patches wrote:
>
> Hi,
> For the following test-case:
>
> int16x8_t foo(int16_t x, int16_t y)
> {
>   return (int16x8_t) { x, y, x, y, x, y, x, y };
> }

(Not to block this patch)
Seems like this trick can be done even with less than perfect
initializer too:
e.g.
int16x8_t foo(int16_t x, int16_t y)
{
  return (int16x8_t) { x, y, x, y, x, y, x, 0 };
}

Which should generate something like:
        dup     v0.8h, w0
        dup     v1.8h, w1
        zip1    v0.8h, v0.8h, v1.8h
        ins     v0.h[7], wzr

Thanks,
Andrew Pinski

>
> Code gen at -O3:
> foo:
>         dup     v0.8h, w0
>         ins     v0.h[1], w1
>         ins     v0.h[3], w1
>         ins     v0.h[5], w1
>         ins     v0.h[7], w1
>         ret
>
> For 16 elements, it results in 8 ins instructions which might not be
> optimal perhaps.
> I guess, the above code-gen would be equivalent to the following ?
>         dup     v0.8h, w0
>         dup     v1.8h, w1
>         zip1    v0.8h, v0.8h, v1.8h
>
> I have attached patch to do the same, if number of elements >= 8,
> which should be possibly better compared to current code-gen ?
> Patch passes bootstrap+test on aarch64-linux-gnu.
> Does the patch look OK ?
>
> Thanks,
> Prathamesh
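
The dup/dup/zip1/ins sequence suggested in the reply can be sanity-checked with a small scalar model. The sketch below is only an illustration in portable C: it is not the attached patch, and it does not use real NEON intrinsics. The type vec8h and the helpers model_dup, model_zip1, model_ins, build_xy_with_trailing_zero, vec_equals and self_check are hypothetical names introduced here. It assumes the usual lane semantics of the three instructions (dup broadcasts a scalar, zip1 interleaves the low halves of its two inputs, ins overwrites one lane).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 8 /* int16x8_t holds eight 16-bit lanes */

/* Hypothetical scalar stand-in for int16x8_t (not a real NEON type). */
typedef struct { int16_t lane[LANES]; } vec8h;

/* Models "dup vd.8h, wN": broadcast one scalar into every lane. */
static vec8h model_dup(int16_t s)
{
    vec8h r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = s;
    return r;
}

/* Models "zip1 vd.8h, vn.8h, vm.8h": interleave the low halves of a and b,
   so the result is a0, b0, a1, b1, a2, b2, a3, b3. */
static vec8h model_zip1(vec8h a, vec8h b)
{
    vec8h r;
    for (int i = 0; i < LANES / 2; i++) {
        r.lane[2 * i] = a.lane[i];
        r.lane[2 * i + 1] = b.lane[i];
    }
    return r;
}

/* Models "ins vd.h[idx], wN": overwrite a single lane. */
static vec8h model_ins(vec8h v, int idx, int16_t s)
{
    v.lane[idx] = s;
    return v;
}

/* The sequence from the reply: dup, dup, zip1, then ins of zero into lane 7. */
static vec8h build_xy_with_trailing_zero(int16_t x, int16_t y)
{
    vec8h v = model_zip1(model_dup(x), model_dup(y));
    return model_ins(v, 7, 0);
}

/* Returns 1 when every lane of v matches expect, 0 otherwise. */
static int vec_equals(vec8h v, const int16_t expect[LANES])
{
    return memcmp(v.lane, expect, sizeof v.lane) == 0;
}

/* Checks the model against the initializer { x, y, x, y, x, y, x, 0 }. */
static void self_check(void)
{
    const int16_t x = 11, y = 22;
    const int16_t expect[LANES] = { x, y, x, y, x, y, x, 0 };
    assert(vec_equals(build_xy_with_trailing_zero(x, y), expect));
}
```

Calling self_check() from a test driver exercises the whole sequence. Under this model the imperfect initializer costs four instructions (two dup, one zip1, one ins), whereas the current dup-plus-ins approach quoted above already needs five for the fully regular eight-lane case.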