From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 57DA738A814C for ; Tue, 15 Nov 2022 10:46:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 57DA738A814C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668509175; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Z89H3FLv203t2sDOSqjpkSeOnkuDMto0p733u3Ikcy0=; b=fTnzWQKjDFZmOMHj0VRoyBAEwo2kSMStg53AGy9dv7M0HcmLkhognoHNg5T7VUkQDHOFnx I5lpGeIYb9muEtwXwqEoXj9zkLNw0OxwU15TasSYfmL/bS+XivjASBOYeNlVcuehlEvGgT HCdu5FWVroa/CgWunhF5Ab4oPDGu6xI= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-466-ZBPO-4sMMvK7t-oLVCgifA-1; Tue, 15 Nov 2022 05:46:13 -0500 X-MC-Unique: ZBPO-4sMMvK7t-oLVCgifA-1 Received: by mail-wm1-f69.google.com with SMTP id v188-20020a1cacc5000000b003cf76c4ae66so10335471wme.7 for ; Tue, 15 Nov 2022 02:46:13 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:from:content-language:references:cc:to:subject :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=dxx97XRyvL1dh2RrgI45mMTkKOxbD3YxRGi2vYA28xA=; b=4sA19/caha7/RAGw78HzkLHYRIyWV+6LUeSBS0rruBLZJbwaknBGf/LnyYFiG2ldjC m6C5FLu9L22PPOsw7UXSVCC7EYSZQyuyV5qSkmZ7EVBh/vCfpqwAwhLbu9A6+JlTmDV1 I+mQSvi2gzbhpFpRRshNn59oA39fLJutCMqbZ6ONxFPP82dSDFX27Fs3tZGrK43twv0t +ZDXOosNMZTcmZ11TXXmNFkVIDjPTGE5I32ptODjVZtob8Rxc231P88ARdEJNI06N+xj J7H940NamO7ppIlB/2quk70J6CV22mzz9pkeEmD85LeZ4gt6LAXrCDMc7bzCrgRfzuqy zVVQ== X-Gm-Message-State: ANoB5pk794CC3xqfhk9M+h/8Owd/dwmAVFr3e8FZuIFSZu9eYjv61fTP dhuyRpBVrhkxvax1N+1qzRJZ+w6y3CBQ0dWLBJekbI0yInGxA0S98Ihi1zCsIjSXoQED/tP8V7/ xdip7SCI/fdiCLZTHsA== X-Received: by 2002:adf:d84b:0:b0:236:c65e:dcbd with SMTP id k11-20020adfd84b000000b00236c65edcbdmr10520253wrl.62.1668509172190; Tue, 15 Nov 2022 02:46:12 -0800 (PST) X-Google-Smtp-Source: AA0mqf7xjjFLeO4IoJH22KVDA3Ckzo+qFBk5KDBtmjvs5AknApfobTbHHH3evsHhSQIKTdJgWihP/g== X-Received: by 2002:adf:d84b:0:b0:236:c65e:dcbd with SMTP id k11-20020adfd84b000000b00236c65edcbdmr10520230wrl.62.1668509171806; Tue, 15 Nov 2022 02:46:11 -0800 (PST) Received: from [192.168.1.2] ([139.47.33.22]) by smtp.gmail.com with ESMTPSA id o15-20020a05600c510f00b003c6d21a19a0sm17192585wms.29.2022.11.15.02.46.11 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 15 Nov 2022 02:46:11 -0800 (PST) Message-ID: <503f89de-8e88-2e20-5b14-5493191b19f5@redhat.com> Date: Tue, 15 Nov 2022 11:46:10 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.1 Subject: Re: [PATCH] [PR68097] Try to avoid recursing for floats in tree_*_nonnegative_warnv_p. To: Richard Biener Cc: GCC patches , Andrew MacLeod , richard.sandiford@arm.com References: <20221112183048.389811-1-aldyh@redhat.com> <1ea5fc0e-fcc4-a354-b71b-3da3008ea5f2@redhat.com> From: Aldy Hernandez In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: multipart/mixed; boundary="------------m8hn5YZuQuFzlanItY5S4UIM" Content-Language: en-US X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: This is a multi-part message in MIME format. --------------m8hn5YZuQuFzlanItY5S4UIM Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 11/15/22 08:15, Richard Biener wrote: > On Mon, Nov 14, 2022 at 8:05 PM Aldy Hernandez wrote: >> >> >> >> On 11/14/22 10:12, Richard Biener wrote: >>> On Sat, Nov 12, 2022 at 7:30 PM Aldy Hernandez wrote: >>>> >>>> It irks me that a PR named "we should track ranges for floating-point >>>> hasn't been closed in this release. This is an attempt to do just >>>> that. >>>> >>>> As mentioned in the PR, even though we track ranges for floats, it has >>>> been suggested that avoiding recursing through SSA defs in >>>> gimple_assign_nonnegative_warnv_p is also a goal. We can do this with >>>> various ranger components without the need for a heavy handed approach >>>> (i.e. a full ranger). >>>> >>>> I have implemented two versions of known_float_sign_p() that answer >>>> the question whether we definitely know the sign for an operation or a >>>> tree expression. >>>> >>>> Both versions use get_global_range_query, which is a wrapper to query >>>> global ranges. This means, that no caching or propagation is done. >>>> In the case of an SSA, we just return the global range for it (think >>>> SSA_NAME_RANGE_INFO). In the case of a tree code with operands, we >>>> also use get_global_range_query to resolve the operands, and then call >>>> into range-ops, which is our lowest level component. There is no >>>> ranger or gori involved. All we're doing is resolving the operation >>>> with the ranges passed. >>>> >>>> This is enough to avoid recursing in the case where we definitely know >>>> the sign of a range. Otherwise, we still recurse. >>>> >>>> Note that instead of get_global_range_query(), we could use >>>> get_range_query() which uses a ranger (if active in a pass), or >>>> get_global_range_query if not. This would allow passes that have an >>>> active ranger (with enable_ranger) to use a full ranger. These passes >>>> are currently, VRP, loop unswitching, DOM, loop versioning, etc. If >>>> no ranger is active, get_range_query defaults to global ranges, so >>>> there's no additional penalty. >>>> >>>> Would this be acceptable, at least enough to close (or rename the PR ;-))? >>> >>> I think the checks would belong to the gimple_stmt_nonnegative_warnv_p function >>> only (that's the SSA name entry from the fold-const.cc ones)? >> >> That was my first approach, but I thought I'd cover the unary and binary >> operators as well, since they had other callers. But I'm happy with >> just the top-level tweak. It's a lot less code :). > > @@ -9234,6 +9235,15 @@ bool > gimple_stmt_nonnegative_warnv_p (gimple *stmt, bool *strict_overflow_p, > int depth) > { > + tree type = gimple_range_type (stmt); > + if (type && frange::supports_p (type)) > + { > + frange r; > + bool sign; > + return (get_global_range_query ()->range_of_stmt (r, stmt) > + && r.signbit_p (sign) > + && sign == false); > + } > > the above means we never fall through to the switch below if > frange::supports_p (type) - that's eventually good enough, I > don't think we ever call this very function directly but it gets > invoked via recursion through operands only. But of course Woah, sorry. That was not intended. For that matter, the patch as posted caused: FAIL: gcc.dg/builtins-10.c (test for excess errors) FAIL: gcc.dg/builtins-57.c (test for excess errors) FAIL: gcc.dg/torture/builtin-nonneg-1.c -O1 (test for excess errors) FAIL: gcc.dg/torture/builtin-nonneg-1.c -O2 (test for excess errors) FAIL: gcc.dg/torture/builtin-nonneg-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.dg/torture/builtin-nonneg-1.c -O3 -g (test for excess errors) FAIL: gcc.dg/torture/builtin-nonneg-1.c -Os (test for excess errors) FAIL: gcc.dg/torture/builtin-power-1.c -O1 (test for excess errors) FAIL: gcc.dg/torture/builtin-power-1.c -O2 (test for excess errors) FAIL: gcc.dg/torture/builtin-power-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.dg/torture/builtin-power-1.c -O3 -g (test for excess errors) FAIL: gcc.dg/torture/builtin-power-1.c -Os (test for excess errors) Note that ranger folding calls this function, though it won't run the risk of endless recursion because range_of_stmt uses the LHS, and only use global ranges to solve the LHS. Also, frange::supports_p() does not support all floats: static bool supports_p (const_tree type) { // ?? Decimal floats can have multiple representations for the // same number. Supporting them may be as simple as just // disabling them in singleton_p. No clue. return SCALAR_FLOAT_TYPE_P (type) && !DECIMAL_FLOAT_TYPE_P (type); } Finally, my patch is more conservative than what the *nonnegative_warn* friends do. We only return true when we're sure about the sign bit and it's FALSE. As I mentioned elsewhere, tree_call_nonnegative_warn_p() always returns true for: CASE_CFN_ACOS: CASE_CFN_ACOS_FN: CASE_CFN_ACOSH: CASE_CFN_ACOSH_FN: CASE_CFN_CABS: CASE_CFN_CABS_FN: ... ... /* Always true. */ return true; This means that we'll return true for a NAN, but we're incorrectly assuming the NAN will be +NAN. In my proposed patch, we don't make such assumptions. We only return true if the range is non-negative, including the NAN. > I wonder what types are not supported by frange and whether > the manual processing we fall through to does anything meaningful > for those? > > I won't ask you to thoroughly answer this now but please put in > a comment reflecting the above before the switch stmt. > > switch (gimple_code (stmt)) > > > Otherwise OK, in case you tree gets back to bootstrapping ;) Updated patch that passes test. OK? And if so, can I close the PR? Thanks. Aldy --------------m8hn5YZuQuFzlanItY5S4UIM Content-Type: text/x-patch; charset=UTF-8; name="0001-PR68097-Try-to-avoid-recursing-for-floats-in-gimple_.patch" Content-Disposition: attachment; filename*0="0001-PR68097-Try-to-avoid-recursing-for-floats-in-gimple_.pa"; filename*1="tch" Content-Transfer-Encoding: base64 RnJvbSA4ZjU4OWNmZjIyZTgwZWRhN2NjMzgzNzA4MGRiYzlhZTgyODBkYThjIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBBbGR5IEhlcm5hbmRleiA8YWxkeWhAcmVkaGF0LmNvbT4KRGF0 ZTogU2F0LCAxMiBOb3YgMjAyMiAxMTo1ODowNyArMDEwMApTdWJqZWN0OiBbUEFUQ0hdIFtQUjY4 MDk3XSBUcnkgdG8gYXZvaWQgcmVjdXJzaW5nIGZvciBmbG9hdHMgaW4KIGdpbXBsZV9zdG10X25v bm5lZ2F0aXZlX3dhcm52X3AuCgpJdCBpcmtzIG1lIHRoYXQgYSBQUiBuYW1lZCAid2Ugc2hvdWxk IHRyYWNrIHJhbmdlcyBmb3IgZmxvYXRpbmctcG9pbnQKaGFzbid0IGJlZW4gY2xvc2VkIGluIHRo aXMgcmVsZWFzZS4gIFRoaXMgaXMgYW4gYXR0ZW1wdCB0byBkbyBqdXN0CnRoYXQuCgpBcyBtZW50 aW9uZWQgaW4gdGhlIFBSLCBldmVuIHRob3VnaCB3ZSB0cmFjayByYW5nZXMgZm9yIGZsb2F0cywg aXQgaGFzCmJlZW4gc3VnZ2VzdGVkIHRoYXQgYXZvaWRpbmcgcmVjdXJzaW5nIHRocm91Z2ggU1NB IGRlZnMgaW4KZ2ltcGxlX2Fzc2lnbl9ub25uZWdhdGl2ZV93YXJudl9wIGlzIGFsc28gYSBnb2Fs LiAgVGhpcyBwYXRjaCB1c2VzIGEKZ2xvYmFsIHJhbmdlIHF1ZXJ5IChubyBvbi1kZW1hbmQgbG9v a3VwcywganVzdCBnbG9iYWwgcmFuZ2VzIGFuZAptaW5pbWFsIGZvbGRpbmcpIHRvIGRldGVybWlu ZSBpZiB0aGUgcmFuZ2Ugb2YgYSBzdGF0ZW1lbnQgaXMga25vd24gdG8KYmUgbm9uLW5lZ2F0aXZl LgoKCVBSIHRyZWUtb3B0aW1pemF0aW9uLzY4MDk3CgpnY2MvQ2hhbmdlTG9nOgoKCSogZ2ltcGxl LWZvbGQuY2MgKGdpbXBsZV9zdG10X25vbm5lZ2F0aXZlX3dhcm52X3ApOiBDYWxsCglyYW5nZV9v Zl9zdG10IGZvciBmbG9hdHMuCi0tLQogZ2NjL2dpbXBsZS1mb2xkLmNjIHwgMTAgKysrKysrKysr KwogMSBmaWxlIGNoYW5nZWQsIDEwIGluc2VydGlvbnMoKykKCmRpZmYgLS1naXQgYS9nY2MvZ2lt cGxlLWZvbGQuY2MgYi9nY2MvZ2ltcGxlLWZvbGQuY2MKaW5kZXggMGEyMTJlNmQwZDQuLmM0ZWJl OWEzYjUyIDEwMDY0NAotLS0gYS9nY2MvZ2ltcGxlLWZvbGQuY2MKKysrIGIvZ2NjL2dpbXBsZS1m b2xkLmNjCkBAIC02OCw2ICs2OCw3IEBAIGFsb25nIHdpdGggR0NDOyBzZWUgdGhlIGZpbGUgQ09Q WUlORzMuICBJZiBub3Qgc2VlCiAjaW5jbHVkZSAidHJlZS1zc2Etc3RybGVuLmgiCiAjaW5jbHVk ZSAidmFyYXNtLmgiCiAjaW5jbHVkZSAiaW50ZXJuYWwtZm4uaCIKKyNpbmNsdWRlICJnaW1wbGUt cmFuZ2UuaCIKIAogZW51bSBzdHJsZW5fcmFuZ2Vfa2luZCB7CiAgIC8qIENvbXB1dGUgdGhlIGV4 YWN0IGNvbnN0YW50IHN0cmluZyBsZW5ndGguICAqLwpAQCAtOTIzNCw2ICs5MjM1LDE1IEBAIGJv b2wKIGdpbXBsZV9zdG10X25vbm5lZ2F0aXZlX3dhcm52X3AgKGdpbXBsZSAqc3RtdCwgYm9vbCAq c3RyaWN0X292ZXJmbG93X3AsCiAJCQkJIGludCBkZXB0aCkKIHsKKyAgdHJlZSB0eXBlID0gZ2lt cGxlX3JhbmdlX3R5cGUgKHN0bXQpOworICBpZiAodHlwZSAmJiBmcmFuZ2U6OnN1cHBvcnRzX3Ag KHR5cGUpKQorICAgIHsKKyAgICAgIGZyYW5nZSByOworICAgICAgYm9vbCBzaWduOworICAgICAg aWYgKGdldF9nbG9iYWxfcmFuZ2VfcXVlcnkgKCktPnJhbmdlX29mX3N0bXQgKHIsIHN0bXQpCisJ ICAmJiByLnNpZ25iaXRfcCAoc2lnbikpCisJcmV0dXJuICFzaWduOworICAgIH0KICAgc3dpdGNo IChnaW1wbGVfY29kZSAoc3RtdCkpCiAgICAgewogICAgIGNhc2UgR0lNUExFX0FTU0lHTjoKLS0g CjIuMzguMQoK --------------m8hn5YZuQuFzlanItY5S4UIM--