From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 6BC8C382F0BD for ; Thu, 26 May 2022 21:12:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6BC8C382F0BD Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-198-h2B7u1nuMsC4rIbxs8KB5g-1; Thu, 26 May 2022 17:12:04 -0400 X-MC-Unique: h2B7u1nuMsC4rIbxs8KB5g-1 Received: by mail-qk1-f197.google.com with SMTP id bj2-20020a05620a190200b005084968bb24so2431008qkb.23 for ; Thu, 26 May 2022 14:12:04 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:references:from:in-reply-to :content-transfer-encoding; bh=wc5FvcsqtR6t9U/R+XEsmXna+Iaj4WX1ASz+RHJNxPY=; b=q2rPOCjFv4Ej0RAamIYh02dJmOGRhE9Go8zvKlrvXTZLlVwNDC6J+p4HV3k/JL3KfJ bKRkvmkA71XJL1T4cJvQ7y/XSiJkhV6+p0wxCxWJ+cttJt0uqYn/dHr09rnHHTnKV89C XixDkti7ZggywW7EIPRaGxA2ZlZdphDktude9BUWCP/FmSwCJ3p8stNjCtJfTrGGqnS0 /1Q+hoO3usH597tIet5o9ylm/hxz27bI54ryJEfSrMKT1585qkYwFyvSOsV5e3Ia6rzi K/ZmyFFTNva6BI+ljNBLHjR1ULYP9B63+Xrm88sfWrrD760yO1VCBYbet0DxNDwYBNhm moEA== X-Gm-Message-State: AOAM5325T8YS+vYtL/HmiwAqgrwgAsqCIHFr2SUHBPR9lokbs7VsILik jpYLLOeivlC5r6fMVoZqxBw7AqkHIzKjQ7qy8I9X01ucaD1te9haNVV8ZM9K5LMhdwlphJ9+OZ0 8U94MqYsDFnYMrFroKg== X-Received: by 2002:a05:6214:1742:b0:462:60fb:213 with SMTP id dc2-20020a056214174200b0046260fb0213mr9580692qvb.84.1653599524444; Thu, 26 May 2022 14:12:04 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxnCdH2LcPrDrLyrG/7HnxC/rW8w6jyZTL/gULZ755LuxrdaFbxeRRYViV3Sae9/i18jYZI2A== X-Received: by 2002:a05:6214:1742:b0:462:60fb:213 with SMTP id dc2-20020a056214174200b0046260fb0213mr9580674qvb.84.1653599524167; Thu, 26 May 2022 14:12:04 -0700 (PDT) Received: from [192.168.0.101] ([45.72.226.124]) by smtp.gmail.com with ESMTPSA id g17-20020a379d11000000b006a3696c4739sm1754525qke.19.2022.05.26.14.12.02 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 26 May 2022 14:12:03 -0700 (PDT) Message-ID: <8b505a07-64bb-f483-63cc-cef6e8e4642c@redhat.com> Date: Thu, 26 May 2022 17:12:02 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0 Subject: Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P. To: liuhongt , gcc-patches@gcc.gnu.org References: <20220525033920.77449-1-hongtao.liu@intel.com> From: Vladimir Makarov In-Reply-To: <20220525033920.77449-1-hongtao.liu@intel.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 May 2022 21:12:07 -0000 On 2022-05-24 23:39, liuhongt wrote: > Rigt now, mem_cost for separate mem alternative is 1 * frequency which > is pretty small and caused the unnecessary SSE spill in the PR, I've tried > to rework backend cost model, but RA still not happy with that(regress > somewhere else). I think the root cause of this is cost for separate 'm' > alternative cost is too small, especially considering that the mov cost > of gpr are 2(default for REGISTER_MOVE_COST). So this patch increase mem_cost > to 2*frequency, also increase 1 for reg_class cost when m alternative. > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? Thank you for addressing this problem. And sorry I can not approve this patch at least w/o your additional work on benchmarking this change. This code is very old.  It is coming from older RA (former file regclass.c) and existed practically since GCC day 1.  People tried many times to improve this code.  The code also affects many targets. I can approve this patch if you show that there is no regression at least on x86-64 on some credible benchmark, e.g. SPEC2006 or SPEC2017. I know it is a big work but when I myself do such changes I check SPEC2017.  I rejected my changes like this one several times when I benchmarked them on SPEC2017 although at the first glance they looked reasonable. > gcc/ChangeLog: > > PR target/105513 > * ira-costs.cc (record_reg_classes): Increase both mem_cost > and reg class cost by 1 for separate mem alternative when > REG_P (op). > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr105513-1.c: New test. > --- > gcc/ira-costs.cc | 26 +++++++++++++--------- > gcc/testsuite/gcc.target/i386/pr105513-1.c | 16 +++++++++++++ > 2 files changed, 31 insertions(+), 11 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr105513-1.c > > diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc > index 964c94a06ef..f7b8325e195 100644 > --- a/gcc/ira-costs.cc > +++ b/gcc/ira-costs.cc > @@ -625,7 +625,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops, > for (k = cost_classes_ptr->num - 1; k >= 0; k--) > { > rclass = cost_classes[k]; > - pp_costs[k] = mem_cost[rclass][0] * frequency; > + pp_costs[k] = (mem_cost[rclass][0] > + + 1) * frequency; > } > } > else > @@ -648,7 +649,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops, > for (k = cost_classes_ptr->num - 1; k >= 0; k--) > { > rclass = cost_classes[k]; > - pp_costs[k] = mem_cost[rclass][1] * frequency; > + pp_costs[k] = (mem_cost[rclass][1] > + + 1) * frequency; > } > } > else > @@ -670,9 +672,9 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops, > for (k = cost_classes_ptr->num - 1; k >= 0; k--) > { > rclass = cost_classes[k]; > - pp_costs[k] = ((mem_cost[rclass][0] > - + mem_cost[rclass][1]) > - * frequency); > + pp_costs[k] = (mem_cost[rclass][0] > + + mem_cost[rclass][1] > + + 2) * frequency; > } > } > else > @@ -861,7 +863,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops, > for (k = cost_classes_ptr->num - 1; k >= 0; k--) > { > rclass = cost_classes[k]; > - pp_costs[k] = mem_cost[rclass][0] * frequency; > + pp_costs[k] = (mem_cost[rclass][0] > + + 1) * frequency; > } > } > else > @@ -884,7 +887,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops, > for (k = cost_classes_ptr->num - 1; k >= 0; k--) > { > rclass = cost_classes[k]; > - pp_costs[k] = mem_cost[rclass][1] * frequency; > + pp_costs[k] = (mem_cost[rclass][1] > + + 1) * frequency; > } > } > else > @@ -906,9 +910,9 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops, > for (k = cost_classes_ptr->num - 1; k >= 0; k--) > { > rclass = cost_classes[k]; > - pp_costs[k] = ((mem_cost[rclass][0] > - + mem_cost[rclass][1]) > - * frequency); > + pp_costs[k] = (mem_cost[rclass][0] > + + mem_cost[rclass][1] > + + 2) * frequency; > } > } > else > @@ -929,7 +933,7 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops, > /* Although we don't need insn to reload from > memory, still accessing memory is usually more > expensive than a register. */ > - pp->mem_cost = frequency; > + pp->mem_cost = 2 * frequency; > else > /* If the alternative actually allows memory, make > things a bit cheaper since we won't need an > diff --git a/gcc/testsuite/gcc.target/i386/pr105513-1.c b/gcc/testsuite/gcc.target/i386/pr105513-1.c > new file mode 100644 > index 00000000000..530f5292252 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr105513-1.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile { target { ! ia32 } } } */ > +/* { dg-options "-O2 -msse2 -mtune=skylake -mfpmath=sse" } */ > +/* { dg-final { scan-assembler-not "\\(%rsp\\)" } } */ > + > +static int as_int(float x) > +{ > + return (union{float x; int i;}){x}.i; > +} > + > +float f(double y, float x) > +{ > + int i = as_int(x); > + if (__builtin_expect(i > 99, 0)) return 0; > + if (i*2u < 77) if (i==2) return 0; > + return y*x; > +}