From: Richard Biener
Date: Wed, 9 Aug 2023 13:58:40 +0200
Subject: Re: [PATCH] vect: Add a popcount fallback.
To: Robin Dapp
Cc: gcc-patches

On Wed, Aug 9, 2023 at 12:23 PM Robin Dapp wrote:
>
> > We seem to be looking at promotions of the call argument, lhs_type
> > is the same as the type of the call LHS.  But the comment mentions .POPCOUNT
> > and the following code also handles others, so maybe handling should be
> > moved.  Also when we look to vectorize popcount (x) instead of popcount((T)x)
> > we can simply promote the result accordingly.
>
> IMHO lhs_type is the type of the conversion
>
>   lhs_oprnd = gimple_assign_lhs (last_stmt);
>   lhs_type = TREE_TYPE (lhs_oprnd);
>
> and rhs/unprom_diff has the type of the call's input argument
>
>   rhs_oprnd = gimple_call_arg (call_stmt, 0);
>   vect_look_through_possible_promotion (vinfo, rhs_oprnd, &unprom_diff);
>
> So we can potentially have
>   T0 arg
>   T1 in = (T1)arg
>   T2 ret = __builtin_popcount (in)
>   T3 lhs = (T3)ret
>
> and we're checking if precision (T0) == precision (T3).

Looks like so.  Note T1 == T2.  What we're really after is changing T1/T2
and the actual popcount used closer to T0/T3, like in case T0 was 'char'
and T3 was 'long' we could still use popcountqi and then widen to T3
(or the other way around).  So yes, I think requiring that T0 and T3
are equal isn't necessary.

> This will never be true for a proper __builtin_popcountll except if
> the return value is cast to uint64_t (which I just happened to do
> in my test...).  Therefore it still doesn't really make sense to me.
>
> Interestingly though, it helps for an aarch64 __builtin_popcountll
> testcase where we abort here and then manage to vectorize via
> vectorizable_call.  When we skip this check, recognition succeeds
> and replaces the call with the pattern.  Then scalar costs are lower
> than in the vectorizable_call case because __builtin_popcountll is
> not STMT_VINFO_RELEVANT_P anymore (not live or so?).
> Then, vectorization costs are too high compared to the wrong scalar
> costs and we don't vectorize...

Odd, might require fixing separately.

> We might need to calculate the scalar costs in advance?
>
> > It looks like vect_recog_popcount_clz_ctz_ffs_pattern is specifically for
> > the conversions, so your fallback should possibly apply even when not
> > matching them.
>
> Mhm, yes it appears to only match when casting the return value to
> something else than an int.  So we'd need a fallback in vectorizable_call?
> And it would potentially look a bit out of place there only handling
> popcount and not ctz, clz, ...  Not sure if it is worth it then?

I'd keep the handling as a pattern and just also match on popcount
directly when not converted.

>
> Regards
>  Robin
>
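
For illustration only, here is a scalar sketch of the T0/T1/T2/T3 chain
discussed above, with T0 = uint8_t, T1 = unsigned int, T2 = int and
T3 = uint64_t.  These type choices and both function names are assumptions
made up for this example; nothing here is taken from the patch.  The sketch
shows why the popcount can be done in the narrow type and widened
afterwards, so that T0 and T3 need not have equal precision.  It relies on
the promotion being a zero-extension: a sign-extending promotion of a
negative value would add set bits and change the count.

```c
#include <assert.h>
#include <stdint.h>

/* The chain as written in the mail, with hypothetical concrete types:
   T0 = uint8_t, T1 = unsigned int, T2 = int, T3 = uint64_t.  */
static uint64_t popcount_via_int(uint8_t arg)
{
    unsigned int in = (unsigned int)arg;  /* T1 in = (T1)arg (zero-extends) */
    int ret = __builtin_popcount(in);     /* T2 ret = __builtin_popcount (in) */
    return (uint64_t)ret;                 /* T3 lhs = (T3)ret */
}

/* The alternative: count bits in the narrow type T0, then widen the
   result to T3.  The loop clears the lowest set bit on each iteration.  */
static uint64_t popcount_narrow_then_widen(uint8_t arg)
{
    uint8_t count = 0;
    for (uint8_t v = arg; v != 0; v &= (uint8_t)(v - 1))
        count++;
    return (uint64_t)count;
}
```

Both functions agree for every uint8_t input, because zero-extension adds
only zero bits above the original eight.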