From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 93C82395BC70 for ; Wed, 16 Nov 2022 17:38:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 93C82395BC70 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668620295; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TIj6vKWC//oYHfkBWLLWV2cNe1M2op1OQV5/hn2ODBM=; b=XtPE7mUMDhEVNSs8bRmzUjV7+JfVt+dhnVdKeH8Sb+TSe8iTXB6RuF8WPqlxQVwYezZ8ZN pY+FHul2jJ4rx7M13WdAB4skNXEwrKmOCiP/HqBLIq/OELlIU88An3+TnBseZs0iq9iEuf jj6OgK9MnDpfJ5sBBt0LaVFbApLbtOU= Received: from mail-wr1-f72.google.com (mail-wr1-f72.google.com [209.85.221.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-621-Uicnpry5Oa2HYm5ImAM36w-1; Wed, 16 Nov 2022 12:38:14 -0500 X-MC-Unique: Uicnpry5Oa2HYm5ImAM36w-1 Received: by mail-wr1-f72.google.com with SMTP id k7-20020adfc707000000b002416f2e9ad5so3268477wrg.6 for ; Wed, 16 Nov 2022 09:38:13 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=TIj6vKWC//oYHfkBWLLWV2cNe1M2op1OQV5/hn2ODBM=; b=jeklQkYxYMFVRcM+i1+sv6avd9le61Z3FbNPrZTAg6EeeUqIsD0AkDTW+wmvM9pV/X A+n1nrdsP9TmuvgnI/Aa0o5LNKW8TTPL0jZVj1ikNCA5Aj4S7unMoERihMEpaM3acDm3 jI0btmq5+OW7gPBoRV0U63ZW7AzlDUahgMXbJsqJaVp+fjoGVY3NbxCcLaTcAFnV2bhh Af0RfwRLWt1cUFvElNNZEPVTYDWv/BY9Lml23TLcdnv7T6P8NUAt+2C/zVmNJlVqrXbe v2exvO1KO6F1I66vTRiY1dpv82pw2khyXAOnhxP9Yrlm9FQLa9UvAUrw4122CMMFcy1q aLrQ== X-Gm-Message-State: ANoB5plvZ4TqXwu98U/kbRvAQzGKiyUmCpIefGoARkb+AVeaJW53SJJH HzCskBpxtgRDYwCLsjWWNbjgdUYpidx/BpCqnDY/0ce2E7cCPmb+KwWOlfsrOfhz/8cJ5PCTKR4 1kXTiCAFjdBGet55zZw== X-Received: by 2002:a05:600c:348d:b0:3cf:88e7:f808 with SMTP id a13-20020a05600c348d00b003cf88e7f808mr2861123wmq.200.1668620292195; Wed, 16 Nov 2022 09:38:12 -0800 (PST) X-Google-Smtp-Source: AA0mqf7wMMzs5+vFCy7fP0kUH/c7K3w1z7mpOyqs+sxyzPfSTtjJHtHuFC98ZFFTSXKEFd4bmoStYQ== X-Received: by 2002:a05:600c:348d:b0:3cf:88e7:f808 with SMTP id a13-20020a05600c348d00b003cf88e7f808mr2861105wmq.200.1668620291888; Wed, 16 Nov 2022 09:38:11 -0800 (PST) Received: from [192.168.1.2] ([139.47.33.22]) by smtp.gmail.com with ESMTPSA id w15-20020a5d4b4f000000b00228cbac7a25sm15594376wrs.64.2022.11.16.09.38.10 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 16 Nov 2022 09:38:11 -0800 (PST) Message-ID: Date: Wed, 16 Nov 2022 18:38:09 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.1 Subject: Re: [PATCH] [PR68097] Try to avoid recursing for floats in tree_*_nonnegative_warnv_p. To: Richard Biener Cc: GCC patches , Andrew MacLeod , richard.sandiford@arm.com References: <20221112183048.389811-1-aldyh@redhat.com> <1ea5fc0e-fcc4-a354-b71b-3da3008ea5f2@redhat.com> <503f89de-8e88-2e20-5b14-5493191b19f5@redhat.com> From: Aldy Hernandez In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 11/16/22 17:04, Richard Biener wrote: > On Tue, Nov 15, 2022 at 11:46 AM Aldy Hernandez wrote: >> >> >> >> On 11/15/22 08:15, Richard Biener wrote: >>> On Mon, Nov 14, 2022 at 8:05 PM Aldy Hernandez wrote: >>>> >>>> >>>> >>>> On 11/14/22 10:12, Richard Biener wrote: >>>>> On Sat, Nov 12, 2022 at 7:30 PM Aldy Hernandez wrote: >>>>>> >>>>>> It irks me that a PR named "we should track ranges for floating-point >>>>>> hasn't been closed in this release. This is an attempt to do just >>>>>> that. >>>>>> >>>>>> As mentioned in the PR, even though we track ranges for floats, it has >>>>>> been suggested that avoiding recursing through SSA defs in >>>>>> gimple_assign_nonnegative_warnv_p is also a goal. We can do this with >>>>>> various ranger components without the need for a heavy handed approach >>>>>> (i.e. a full ranger). >>>>>> >>>>>> I have implemented two versions of known_float_sign_p() that answer >>>>>> the question whether we definitely know the sign for an operation or a >>>>>> tree expression. >>>>>> >>>>>> Both versions use get_global_range_query, which is a wrapper to query >>>>>> global ranges. This means, that no caching or propagation is done. >>>>>> In the case of an SSA, we just return the global range for it (think >>>>>> SSA_NAME_RANGE_INFO). In the case of a tree code with operands, we >>>>>> also use get_global_range_query to resolve the operands, and then call >>>>>> into range-ops, which is our lowest level component. There is no >>>>>> ranger or gori involved. All we're doing is resolving the operation >>>>>> with the ranges passed. >>>>>> >>>>>> This is enough to avoid recursing in the case where we definitely know >>>>>> the sign of a range. Otherwise, we still recurse. >>>>>> >>>>>> Note that instead of get_global_range_query(), we could use >>>>>> get_range_query() which uses a ranger (if active in a pass), or >>>>>> get_global_range_query if not. This would allow passes that have an >>>>>> active ranger (with enable_ranger) to use a full ranger. These passes >>>>>> are currently, VRP, loop unswitching, DOM, loop versioning, etc. If >>>>>> no ranger is active, get_range_query defaults to global ranges, so >>>>>> there's no additional penalty. >>>>>> >>>>>> Would this be acceptable, at least enough to close (or rename the PR ;-))? >>>>> >>>>> I think the checks would belong to the gimple_stmt_nonnegative_warnv_p function >>>>> only (that's the SSA name entry from the fold-const.cc ones)? >>>> >>>> That was my first approach, but I thought I'd cover the unary and binary >>>> operators as well, since they had other callers. But I'm happy with >>>> just the top-level tweak. It's a lot less code :). >>> >>> @@ -9234,6 +9235,15 @@ bool >>> gimple_stmt_nonnegative_warnv_p (gimple *stmt, bool *strict_overflow_p, >>> int depth) >>> { >>> + tree type = gimple_range_type (stmt); >>> + if (type && frange::supports_p (type)) >>> + { >>> + frange r; >>> + bool sign; >>> + return (get_global_range_query ()->range_of_stmt (r, stmt) >>> + && r.signbit_p (sign) >>> + && sign == false); >>> + } >>> >>> the above means we never fall through to the switch below if >>> frange::supports_p (type) - that's eventually good enough, I >>> don't think we ever call this very function directly but it gets >>> invoked via recursion through operands only. But of course >> >> Woah, sorry. That was not intended. For that matter, the patch as >> posted caused: >> >> FAIL: gcc.dg/builtins-10.c (test for excess errors) >> FAIL: gcc.dg/builtins-57.c (test for excess errors) >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O1 (test for excess errors) >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O2 (test for excess errors) >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O2 -flto >> -fno-use-linker-plugin -flto-partition=none (test for excess errors) >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -O3 -g (test for excess errors) >> FAIL: gcc.dg/torture/builtin-nonneg-1.c -Os (test for excess errors) >> FAIL: gcc.dg/torture/builtin-power-1.c -O1 (test for excess errors) >> FAIL: gcc.dg/torture/builtin-power-1.c -O2 (test for excess errors) >> FAIL: gcc.dg/torture/builtin-power-1.c -O2 -flto >> -fno-use-linker-plugin -flto-partition=none (test for excess errors) >> FAIL: gcc.dg/torture/builtin-power-1.c -O3 -g (test for excess errors) >> FAIL: gcc.dg/torture/builtin-power-1.c -Os (test for excess errors) > > Did you investigate why? Because the old patch removed the recursion > while the new keeps it in case the global range isn't present which isn't > as nice. For gcc.dg/builtins-10.c, there are a few calls to gimple_stmt_nonnegative* that are being made before we have global ranges (ccp1 and forwprop1), so the query returns VARYING (i.e. no known sign). If you're curious, the call to gimple_stmt_nonnegative* comes via courtesy of match.pd: /* Canonicalization of sequences of math builtins. These rules represent IL simplifications but are not necessarily optimizations. So ISTM, we still need to fall through if we're being called before global ranges are available. After global ranges are available (evrp), we would avoid further lookups if it weren't for an unrelated problem I found. foperator_abs::fold_range() is trying to set a range of [+0.0, +INF], but this little snpipet in the frange normalization code adds a -0.0 to the range: else if (!HONOR_SIGNED_ZEROS (m_type)) { if (real_iszero (&m_max, 1)) m_max.sign = 0; if (real_iszero (&m_min, 0)) m_min.sign = 1; } We end up with: [frange] double [-0.0 (-0x0.0p+0), 1.79769313486231570814527423731704356798070567525844996599e+308 (0x0.fffffffffffff8p+1024)] I must say this is beyond my paygrade :). Jakub, it was your suggestion to add the snippet above. Is this correct? Note that this test is for -ffast-math. If I comment out the code above, the regressions are fixed, both with my current patch or with the original one. But as I suggested, maybe we want the second patch, because we may be called before global ranges are available. IMHO, we could go with the second patch, and fix the ABS problem independently. Yay? Nay? Aldy > >> Note that ranger folding calls this function, though it won't run the >> risk of endless recursion because range_of_stmt uses the LHS, and only >> use global ranges to solve the LHS. >> >> Also, frange::supports_p() does not support all floats: >> >> static bool supports_p (const_tree type) >> { >> // ?? Decimal floats can have multiple representations for the >> // same number. Supporting them may be as simple as just >> // disabling them in singleton_p. No clue. >> return SCALAR_FLOAT_TYPE_P (type) && !DECIMAL_FLOAT_TYPE_P (type); >> } > > OK, _Complex types are obviously missing, so are vector ones. > >> Finally, my patch is more conservative than what the *nonnegative_warn* >> friends do. We only return true when we're sure about the sign bit and >> it's FALSE. As I mentioned elsewhere, tree_call_nonnegative_warn_p() >> always returns true for: >> >> CASE_CFN_ACOS: >> CASE_CFN_ACOS_FN: >> CASE_CFN_ACOSH: >> CASE_CFN_ACOSH_FN: >> CASE_CFN_CABS: >> CASE_CFN_CABS_FN: >> ... >> ... >> /* Always true. */ >> return true; >> >> This means that we'll return true for a NAN, but we're incorrectly >> assuming the NAN will be +NAN. In my proposed patch, we don't make such >> assumptions. We only return true if the range is non-negative, >> including the NAN. > > Yep, the usual issue whether nonnegative means copysign (1, x) produces > 1 or whether !(x < 0) is true. > >>> I wonder what types are not supported by frange and whether >>> the manual processing we fall through to does anything meaningful >>> for those? >>> >>> I won't ask you to thoroughly answer this now but please put in >>> a comment reflecting the above before the switch stmt. >>> >>> switch (gimple_code (stmt)) >>> >>> >>> Otherwise OK, in case you tree gets back to bootstrapping ;) >> >> Updated patch that passes test. >> >> OK? And if so, can I close the PR? > > Yes, I think we now track float ranges - improvements are of course > always possible. > > Richard. > >> Thanks. >> Aldy >