From: Richard Biener
Date: Thu, 17 Nov 2022 09:01:51 +0100
Subject: Re: [PATCH] [PR68097] Try to avoid recursing for floats in tree_*_nonnegative_warnv_p.
To: Aldy Hernandez
Cc: GCC patches, Andrew MacLeod, richard.sandiford@arm.com
References: <20221112183048.389811-1-aldyh@redhat.com> <1ea5fc0e-fcc4-a354-b71b-3da3008ea5f2@redhat.com> <503f89de-8e88-2e20-5b14-5493191b19f5@redhat.com>

On Wed, Nov 16, 2022 at 6:38 PM Aldy Hernandez wrote:
>
> On 11/16/22 17:04, Richard Biener wrote:
> > On Tue, Nov 15, 2022 at 11:46 AM Aldy Hernandez wrote:
> >>
> >> On 11/15/22 08:15, Richard Biener wrote:
> >>> On Mon, Nov 14, 2022 at 8:05 PM Aldy Hernandez wrote:
> >>>>
> >>>> On 11/14/22 10:12, Richard Biener wrote:
> >>>>> On Sat, Nov 12, 2022 at 7:30 PM Aldy Hernandez wrote:
> >>>>>>
> >>>>>> It irks me that a PR named "we should track ranges for floating-point"
> >>>>>> hasn't been closed in this release. This is an attempt to do just
> >>>>>> that.
> >>>>>>
> >>>>>> As mentioned in the PR, even though we track ranges for floats, it has
> >>>>>> been suggested that avoiding recursing through SSA defs in
> >>>>>> gimple_assign_nonnegative_warnv_p is also a goal. We can do this with
> >>>>>> various ranger components without the need for a heavy handed approach
> >>>>>> (i.e. a full ranger).
> >>>>>>
> >>>>>> I have implemented two versions of known_float_sign_p() that answer
> >>>>>> the question whether we definitely know the sign for an operation or a
> >>>>>> tree expression.
> >>>>>>
> >>>>>> Both versions use get_global_range_query, which is a wrapper to query
> >>>>>> global ranges. This means that no caching or propagation is done.
> >>>>>> In the case of an SSA, we just return the global range for it (think
> >>>>>> SSA_NAME_RANGE_INFO). In the case of a tree code with operands, we
> >>>>>> also use get_global_range_query to resolve the operands, and then call
> >>>>>> into range-ops, which is our lowest level component. There is no
> >>>>>> ranger or gori involved. All we're doing is resolving the operation
> >>>>>> with the ranges passed.
> >>>>>>
> >>>>>> This is enough to avoid recursing in the case where we definitely know
> >>>>>> the sign of a range. Otherwise, we still recurse.
> >>>>>>
> >>>>>> Note that instead of get_global_range_query(), we could use
> >>>>>> get_range_query() which uses a ranger (if active in a pass), or
> >>>>>> get_global_range_query if not. This would allow passes that have an
> >>>>>> active ranger (with enable_ranger) to use a full ranger. These passes
> >>>>>> are currently VRP, loop unswitching, DOM, loop versioning, etc. If
> >>>>>> no ranger is active, get_range_query defaults to global ranges, so
> >>>>>> there's no additional penalty.
> >>>>>>
> >>>>>> Would this be acceptable, at least enough to close (or rename the PR ;-))?
> >>>>>
> >>>>> I think the checks would belong to the gimple_stmt_nonnegative_warnv_p function
> >>>>> only (that's the SSA name entry from the fold-const.cc ones)?
> >>>>
> >>>> That was my first approach, but I thought I'd cover the unary and binary
> >>>> operators as well, since they had other callers. But I'm happy with
> >>>> just the top-level tweak. It's a lot less code :).
> >>>
> >>> @@ -9234,6 +9235,15 @@ bool
> >>>  gimple_stmt_nonnegative_warnv_p (gimple *stmt, bool *strict_overflow_p,
> >>>                                   int depth)
> >>>  {
> >>> +  tree type = gimple_range_type (stmt);
> >>> +  if (type && frange::supports_p (type))
> >>> +    {
> >>> +      frange r;
> >>> +      bool sign;
> >>> +      return (get_global_range_query ()->range_of_stmt (r, stmt)
> >>> +              && r.signbit_p (sign)
> >>> +              && sign == false);
> >>> +    }
> >>>
> >>> the above means we never fall through to the switch below if
> >>> frange::supports_p (type) - that's eventually good enough, I
> >>> don't think we ever call this very function directly but it gets
> >>> invoked via recursion through operands only. But of course
> >>
> >> Woah, sorry. That was not intended. For that matter, the patch as
> >> posted caused:
> >>
> >> FAIL: gcc.dg/builtins-10.c (test for excess errors)
> >> FAIL: gcc.dg/builtins-57.c (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O1 (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O2 (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O3 -g (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -Os (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-power-1.c -O1 (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-power-1.c -O2 (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-power-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-power-1.c -O3 -g (test for excess errors)
> >> FAIL: gcc.dg/torture/builtin-power-1.c -Os (test for excess errors)
> >
> > Did you investigate why? Because the old patch removed the recursion
> > while the new keeps it in case the global range isn't present which isn't
> > as nice.
>
> For gcc.dg/builtins-10.c, there are a few calls to
> gimple_stmt_nonnegative* that are being made before we have global
> ranges (ccp1 and forwprop1), so the query returns VARYING (i.e. no known
> sign). If you're curious, the call to gimple_stmt_nonnegative* comes
> via courtesy of match.pd:
>
> /* Canonicalization of sequences of math builtins. These rules represent
>    IL simplifications but are not necessarily optimizations.
>
> So ISTM, we still need to fall through if we're being called before
> global ranges are available.
>
> After global ranges are available (evrp), we would avoid further lookups
> if it weren't for an unrelated problem I found.
>
> foperator_abs::fold_range() is trying to set a range of [+0.0, +INF],
> but this little snippet in the frange normalization code adds a -0.0 to
> the range:
>
>    else if (!HONOR_SIGNED_ZEROS (m_type))
>      {
>        if (real_iszero (&m_max, 1))
>          m_max.sign = 0;
>        if (real_iszero (&m_min, 0))
>          m_min.sign = 1;
>      }
>
> We end up with:
>
> [frange] double [-0.0 (-0x0.0p+0),
> 1.79769313486231570814527423731704356798070567525844996599e+308
> (0x0.fffffffffffff8p+1024)]
>
> I must say this is beyond my paygrade :). Jakub, it was your suggestion
> to add the snippet above. Is this correct? Note that this test is for
> -ffast-math.
>
> If I comment out the code above, the regressions are fixed, both with my
> current patch or with the original one. But as I suggested, maybe we
> want the second patch, because we may be called before global ranges are
> available.
>
> IMHO, we could go with the second patch, and fix the ABS problem
> independently.
>
> Yay? Nay?

Yes, the 2nd patch is approved, I was just curious.

Richard.

> Aldy
>
> >
> >> Note that ranger folding calls this function, though it won't run the
> >> risk of endless recursion because range_of_stmt uses the LHS, and only
> >> uses global ranges to solve the LHS.
> >>
> >> Also, frange::supports_p() does not support all floats:
> >>
> >>     static bool supports_p (const_tree type)
> >>     {
> >>       // ?? Decimal floats can have multiple representations for the
> >>       // same number. Supporting them may be as simple as just
> >>       // disabling them in singleton_p. No clue.
> >>       return SCALAR_FLOAT_TYPE_P (type) && !DECIMAL_FLOAT_TYPE_P (type);
> >>     }
> >
> > OK, _Complex types are obviously missing, so are vector ones.
> >
> >> Finally, my patch is more conservative than what the *nonnegative_warn*
> >> friends do. We only return true when we're sure about the sign bit and
> >> it's FALSE. As I mentioned elsewhere, tree_call_nonnegative_warnv_p()
> >> always returns true for:
> >>
> >>       CASE_CFN_ACOS:
> >>       CASE_CFN_ACOS_FN:
> >>       CASE_CFN_ACOSH:
> >>       CASE_CFN_ACOSH_FN:
> >>       CASE_CFN_CABS:
> >>       CASE_CFN_CABS_FN:
> >>       ...
> >>       ...
> >>         /* Always true. */
> >>         return true;
> >>
> >> This means that we'll return true for a NAN, but we're incorrectly
> >> assuming the NAN will be +NAN. In my proposed patch, we don't make such
> >> assumptions. We only return true if the range is non-negative,
> >> including the NAN.
> >
> > Yep, the usual issue whether nonnegative means copysign (1, x) produces
> > 1 or whether !(x < 0) is true.
> >
> >>> I wonder what types are not supported by frange and whether
> >>> the manual processing we fall through to does anything meaningful
> >>> for those?
> >>>
> >>> I won't ask you to thoroughly answer this now but please put in
> >>> a comment reflecting the above before the switch stmt.
> >>>
> >>>    switch (gimple_code (stmt))
> >>>
> >>>
> >>> Otherwise OK, in case your tree gets back to bootstrapping ;)
> >>
> >> Updated patch that passes test.
> >>
> >> OK? And if so, can I close the PR?
> >
> > Yes, I think we now track float ranges - improvements are of course
> > always possible.
> >
> > Richard.
> >
> >> Thanks.
> >> Aldy
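
For readers following the thread, here is a minimal standalone sketch of two points discussed above: the two meanings of "nonnegative" (copysign (1, x) produces 1 versus !(x < 0)), and how widening a zero lower bound to -0.0 makes the sign of a [+0.0, +INF] range unknown. This is not GCC code. Every name in it (nonneg_by_signbit, nonneg_by_compare, toy_range, toy_normalize, toy_signbit_p, demo) is a hypothetical stand-in, and the toy range only assumes that a sign query succeeds when both endpoints share a sign bit.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// All names below are hypothetical stand-ins, not GCC APIs.

// "Non-negative" in the sign-bit sense: copysign (1, x) would produce 1.
bool nonneg_by_signbit(double x) { return !std::signbit(x); }

// "Non-negative" in the comparison sense: !(x < 0) holds.
bool nonneg_by_compare(double x) { return !(x < 0.0); }

// Toy range with two endpoints, loosely modelled on the idea of an frange.
struct toy_range {
  double min;
  double max;
};

// Mirrors the quoted normalization: when signed zeros are not honored,
// a zero upper bound becomes +0.0 and a zero lower bound becomes -0.0.
toy_range toy_normalize(toy_range r, bool honor_signed_zeros) {
  if (!honor_signed_zeros) {
    if (r.max == 0.0) r.max = 0.0;   // either zero compares equal; store +0.0
    if (r.min == 0.0) r.min = -0.0;  // either zero compares equal; store -0.0
  }
  return r;
}

// Assumption: the sign is known only when both endpoints share a sign bit.
bool toy_signbit_p(const toy_range &r, bool &sign) {
  if (std::signbit(r.min) != std::signbit(r.max)) return false;
  sign = std::signbit(r.min);
  return true;
}

void demo() {
  const double inf = std::numeric_limits<double>::infinity();
  const double neg_nan =
      std::copysign(std::numeric_limits<double>::quiet_NaN(), -1.0);

  // For -0.0 and for a NaN with the sign bit set, the two definitions disagree.
  assert(nonneg_by_compare(-0.0) && !nonneg_by_signbit(-0.0));
  assert(nonneg_by_compare(neg_nan) && !nonneg_by_signbit(neg_nan));

  bool sign = true;
  // [+0.0, +INF] with signed zeros honored: sign bit is known to be clear.
  assert(toy_signbit_p(toy_normalize({0.0, inf}, true), sign) && !sign);
  // Same range without signed zeros: the lower bound becomes -0.0,
  // the endpoints disagree, and the sign is no longer known.
  assert(!toy_signbit_p(toy_normalize({0.0, inf}, false), sign));
}
```

Calling demo() from a main function exercises all four checks. Build it without -ffast-math or -fno-signed-zeros, since those options allow the compiler to ignore the sign of zero, which the sketch relies on.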