From: Thiago Jung Bauermann
To: Jan Hubicka
Cc: rguenther@suse.cz, gcc-patches@gcc.gnu.org
Subject: Re: Fix profile update in scale_profile_for_vect_loop
Date: Tue, 18 Jul 2023 16:29:42 -0300
Message-ID: <875y6h2f2x.fsf@linaro.org>

Hello,

Jan Hubicka via Gcc-patches writes:

> Hi,
> when vectorizing 4 times, we sometimes do
>   for
>     <4x vectorized body>
>   for
>     <2x vectorized body>
>   for
>     <1x vectorized body>
>
> Here the second two fors handling epilogue never iterate.
> Currently the vectorizer thinks that the middle for iterates twice.
> This turns out to be scale_profile_for_vect_loop that uses
> niter_for_unrolled_loop.
>
> At that time we know epilogue will iterate at most 2 times
> but niter_for_unrolled_loop does not know that the last iteration
> will be taken by the epilogue-of-epilogue and thus it thinks
> that the loop may iterate once and exit in middle of second
> iteration.
>
> We already do correct job updating niter bounds and this is
> just ordering issue.
> This patch makes us first update
> the bounds and then do updating of the loop.  I re-implemented
> the function more correctly and precisely.
>
> The loop reducing iteration factor for overly flat profiles is bit funny, but
> only other method I can think of is to compute sreal scale that would have
> similar overhead I think.
>
> Bootstrapped/regtested x86_64-linux, committed.
>
> gcc/ChangeLog:
>
>     PR middle-end/110649
>     * tree-vect-loop.cc (scale_profile_for_vect_loop):
>     (vect_transform_loop):
>     (optimize_mask_stores):

Our CI detected regressions on aarch64-linux-gnu with this commit in
gcc.target/aarch64/sve/aarch64-sve.exp. I checked today's trunk and it
still fails.

I filed the following bug report with the details:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110727

Could you please check?

-- 
Thiago
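
For reference, the iteration bounds described in the quoted message can be
checked with a small model. This is only a sketch under a simplifying
assumption: n scalar iterations are split greedily between a 4x loop, a 2x
epilogue and a 1x epilogue. The names trip_counts and split_iterations are
hypothetical and do not come from GCC; the code does not model what
scale_profile_for_vect_loop or niter_for_unrolled_loop actually compute.

```c
#include <assert.h>

/* Hypothetical model, not GCC code: how many times each loop body runs
   when n scalar iterations are split between a 4x vectorized loop,
   a 2x vectorized epilogue and a 1x epilogue-of-epilogue.  */
struct trip_counts
{
  unsigned main4; /* executions of the 4x vectorized body */
  unsigned epi2;  /* executions of the 2x vectorized body */
  unsigned epi1;  /* executions of the 1x body */
};

static struct trip_counts
split_iterations (unsigned n)
{
  struct trip_counts t;
  unsigned rem = n % 4; /* 0..3 scalar iterations left after the 4x loop */

  t.main4 = n / 4;
  t.epi2 = rem / 2; /* 0 or 1: the 2x body runs at most once */
  t.epi1 = rem % 2; /* 0 or 1: the last odd iteration goes to the 1x body */

  /* Under this model neither epilogue ever takes its back edge.  */
  assert (t.epi2 <= 1 && t.epi1 <= 1);
  return t;
}
```

In this model split_iterations (7) gives one execution of each body, so a
profile that lets the middle loop iterate twice overestimates it.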