From: Richard Biener
Date: Tue, 7 May 2024 11:52:13 +0200
Subject: Re: [PATCH] tree-optimization/110490 - bitcount for narrow modes
To: Stefan Schulze Frielinghaus
Cc: gcc-patches@gcc.gnu.org
References: <20240425072645.2891385-1-stefansf@linux.ibm.com>

On Tue, May 7, 2024 at 10:58 AM Stefan Schulze Frielinghaus wrote:
>
> Ping.  Ok for mainline?

OK.

Thanks,
Richard.

> On Thu, Apr 25, 2024 at 09:26:45AM +0200, Stefan Schulze Frielinghaus wrote:
> > Bitcount operations popcount, clz, and ctz are emulated for narrow modes
> > in case an operation is only supported for wider modes.  Besides that, ctz
> > may be emulated via clz in expand_ctz.  Reflect this in
> > expression_expensive_p.
> >
> > I considered the emulation of ctz via clz as not expensive since this
> > basically reduces to ctz (x) = c - (clz (x & -x)) where c is the mode
> > precision minus 1 which should be faster than a loop.
> >
> > Bootstrapped and regtested on x86_64 and s390.  Though, this is probably
> > stage1 material?
> >
> > gcc/ChangeLog:
> >
> >       PR tree-optimization/110490
> >       * tree-scalar-evolution.cc (expression_expensive_p): Also
> >       consider mode widening for popcount, clz, and ctz.
> > ---
> >  gcc/tree-scalar-evolution.cc | 23 +++++++++++++++++++++++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > index b0a5e09a77c..622c7246c1b 100644
> > --- a/gcc/tree-scalar-evolution.cc
> > +++ b/gcc/tree-scalar-evolution.cc
> > @@ -3458,6 +3458,28 @@ bitcount_call:
> >                   && (optab_handler (optab, word_mode)
> >                       != CODE_FOR_nothing))
> >                   break;
> > +             /* If popcount is available for a wider mode, we emulate the
> > +                operation for a narrow mode by first zero-extending the value
> > +                and then computing popcount in the wider mode.  Analogue for
> > +                ctz.  For clz we do the same except that we additionally have
> > +                to subtract the difference of the mode precisions from the
> > +                result.  */
> > +             if (is_a (mode, &int_mode))
> > +               {
> > +                 machine_mode wider_mode_iter;
> > +                 FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
> > +                   if (optab_handler (optab, wider_mode_iter)
> > +                       != CODE_FOR_nothing)
> > +                     goto check_call_args;
> > +                 /* Operation ctz may be emulated via clz in expand_ctz.  */
> > +                 if (optab == ctz_optab)
> > +                   {
> > +                     FOR_EACH_WIDER_MODE_FROM (wider_mode_iter, mode)
> > +                       if (optab_handler (clz_optab, wider_mode_iter)
> > +                           != CODE_FOR_nothing)
> > +                         goto check_call_args;
> > +                   }
> > +               }
> >               return true;
> >             }
> >           break;
> > @@ -3469,6 +3491,7 @@ bitcount_call:
> >           break;
> >         }
> >
> > +check_call_args:
> >    FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
> >      if (expression_expensive_p (arg, cond_overflow_p, cache, op_cost))
> >        return true;
> > --
> > 2.44.0
> >
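
For readers following the thread, the emulations the patch treats as cheap can be sketched in plain C. This is only an illustration, not code from GCC: the helper names (popcount8, clz8, ctz8, ctz32_via_clz, check_narrow_bitcount) are made up here, the "narrow mode" is assumed to be 8 bits and the "wider mode" a 32-bit unsigned int, and the GCC/Clang builtins __builtin_popcount, __builtin_clz and __builtin_ctz stand in for the wider-mode instructions. The ctz-via-clz helper uses x & -x to isolate the lowest set bit.

```c
#include <assert.h>
#include <stdint.h>

/* Reference loops: the kind of code that would remain if the bitcount
   call were judged too expensive.  */
static int popcount8_loop(uint8_t x) { int n = 0; for (; x; x >>= 1) n += x & 1; return n; }
static int clz8_loop(uint8_t x) { int n = 0; for (uint8_t m = 0x80; m && !(x & m); m >>= 1) n++; return n; }
static int ctz8_loop(uint8_t x) { int n = 0; for (uint8_t m = 1; m && !(x & m); m <<= 1) n++; return n; }

/* popcount: zero-extend to the wider mode, then count there.  */
static int popcount8(uint8_t x) { return __builtin_popcount((unsigned) x); }

/* ctz: zero-extension leaves the low bits untouched (x must be nonzero).  */
static int ctz8(uint8_t x) { return __builtin_ctz((unsigned) x); }

/* clz: the wider result also counts the 32 - 8 extension bits, so
   subtract the precision difference (x must be nonzero).  */
static int clz8(uint8_t x) { return __builtin_clz((unsigned) x) - (32 - 8); }

/* ctz via clz in one mode: x & -x keeps only the lowest set bit, and
   c = precision - 1 = 31 (x must be nonzero).  */
static int ctz32_via_clz(uint32_t x) { return 31 - __builtin_clz(x & -x); }

/* Compare every emulation against its reference loop; returns 1 if the
   asserts all hold.  */
static int check_narrow_bitcount(void)
{
  assert(popcount8(0) == popcount8_loop(0));
  for (unsigned v = 1; v < 256; v++)
    {
      uint8_t x = (uint8_t) v;
      assert(popcount8(x) == popcount8_loop(x));
      assert(clz8(x) == clz8_loop(x));
      assert(ctz8(x) == ctz8_loop(x));
      assert(ctz32_via_clz(x) == ctz8_loop(x));
    }
  for (int k = 0; k < 32; k++)
    assert(ctz32_via_clz((uint32_t) 1 << k) == k);
  return 1;
}
```

Each emulation is a constant number of operations on top of the wider-mode instruction, which is the reason for not counting it as expensive compared with the loops above.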