Message-ID: <3343f06e-8aec-e0af-fff7-ac5c158e2ca9@gmail.com>
Date: Thu, 23 Mar 2023 17:27:58 -0600
Subject: Re: [PATCH v1] [RFC] Improve folding for comparisons with zero in tree-ssa-forwprop.
To: gcc-patches@gcc.gnu.org
References: <20230316152706.2214124-1-manolis.tsamis@vrull.eu>
From: Jeff Law

On 3/20/23 08:01, Manolis Tsamis wrote:
> On Fri, Mar 17, 2023 at 10:31 AM Richard Biener wrote:
>>
>> On Thu, Mar 16, 2023 at 4:27 PM Manolis Tsamis wrote:
>>>
>>> For this C testcase:
>>>
>>> void g();
>>> void f(unsigned int *a)
>>> {
>>>   if (++*a == 1)
>>>     g();
>>> }
>>>
>>> GCC will currently emit a comparison with 1 by using the value
>>> of *a after the increment. This can be improved by comparing
>>> against 0 and using the value before the increment.
>>> As a result there is a potentially shorter dependency chain (no need
>>> to wait for the result of +1) and on targets with compare zero
>>> instructions the generated code is one instruction shorter.
>>
>> The downside is we now need two registers and their lifetime overlaps.
>>
>> Your patch mixes changing / inverting a parameter (which seems unneeded
>> for the actual change) with preferring compares against zero.
>>
>
> Indeed. I thought that without that change the original names wouldn't
> properly describe what the parameter actually does and that's why I've
> changed it. I can undo that in the next revision.

Typically the thing to do is send that separately.  If it has no
functional change, then it can often go in immediately.

>
>> What's the reason to specifically prefer compares against zero? On x86
>> we have add that sets flags, so ++*a == 0 would be preferred, but
>> for your sequence we'd need a test reg, reg; branch on zero, so we do
>> not save any instruction.
>>
>
> My reasoning is that zero is treated preferentially in most if not
> all architectures. Some specifically have zero/non-zero comparisons so
> we get one less instruction. X86 doesn't explicitly have that but I
> think that test reg, reg may not be always needed depending on the
> rest of the code. By what Andrew mentions below there may even be
> optimizations for zero at the microarchitecture level.

There's all kinds of low level ways a test against zero is better than a
test against any other value.  I'm not aware of any architecture where
the opposite is true.

Note that in this specific case rewriting does cause us to need two
registers, so we'll want to think about the right time to make this
transformation.  It may be the case that doing it in gimple is too early.

>
> Because this is still an arch-specific thing I initially tried to make
> it arch-dependent by invoking the target's cost functions (e.g.
> If I recall correctly aarch64 will return a lower cost for zero
> comparisons). But the code turned out complicated and messy so I came
> up with this alternative that just treats zero preferentially.
>
> If you have in mind a way that this can be done in a better way I
> could try to implement it.

And in general I think you approached this in the preferred way -- it's
largely a target independent optimization, so let's tackle it in a
generic way.

Anyway, we'll dive into it once gcc-14 development opens and try to
figure out the best way forward.

jeff
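
For reference, here is a minimal sketch of the two forms discussed in this thread, written out by hand in C. It only illustrates the source-level equivalence for unsigned operands; it is not what the patch does in gimple. The names f_after, f_before, g_calls and same_behaviour are made up for illustration, and g() is given a counting body so the effect can be observed.

```c
/* Hypothetical counter standing in for the side effect of g().  */
static int g_calls;

static void g(void) { g_calls++; }

/* Original form: compare the incremented value against 1.  */
static void f_after(unsigned int *a)
{
  if (++*a == 1)
    g();
}

/* Rewritten form: compare the value before the increment against 0.
   The old value has to stay live next to the incremented one, which
   is the extra register pressure raised in the review.  */
static void f_before(unsigned int *a)
{
  unsigned int old = *a;
  *a = old + 1;
  if (old == 0)
    g();
}

/* Returns 1 if both forms leave the same value in *a and make the
   same number of calls to g() when started from X.  */
static int same_behaviour(unsigned int x)
{
  unsigned int a1 = x, a2 = x;
  int calls_after;

  g_calls = 0;
  f_after(&a1);
  calls_after = g_calls;

  g_calls = 0;
  f_before(&a2);
  return a1 == a2 && calls_after == g_calls;
}
```

Calling same_behaviour() on 0, 1 and ~0u covers the call, no-call and wraparound cases. Since unsigned arithmetic wraps, x + 1 == 1 holds exactly when x == 0, so the two forms agree for every input.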