From: Uros Bizjak
Date: Fri, 24 Dec 2021 09:19:10 +0100
Subject: Re: [PATCH v2] ix86: Don't use the 'm' constraint for x86_64_general_operand
To: "H.J. Lu"
Cc: Jakub Jelinek, Hongtao Liu, liwei.xu@intel.com, GCC Patches
List-Id: Gcc-patches mailing list

On Thu, Dec 23, 2021 at 3:42 PM H.J. Lu wrote:
>
> On Mon, Dec 20, 2021 at 2:22 PM H.J. Lu wrote:
> >
> > On Mon, Dec 20, 2021 at 12:38 PM Jakub Jelinek wrote:
> > >
> > > On Mon, Dec 20, 2021 at 11:44:08AM -0800, H.J. Lu wrote:
> > > > The problem is in
> > > >
> > > > (define_memory_constraint "TARGET_MEM_CONSTRAINT"
> > > >   "Matches any valid memory."
> > > >   (and (match_code "mem")
> > > >        (match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, 0),
> > > >                                                  MEM_ADDR_SPACE (op))")))
> > > >
> > > > define_register_constraint allows LRA to convert the operand to the form
> > > > '(mem (reg X))', where X is a base register. I am testing the v2 patch with
> > >
> > > If you mean replacing an immediate with a MEM containing that immediate,
> > > isn't that often the right thing though?
> > > I mean, if the register pressure is high and options are either spill some
> > > register, set it to immediate, use it in one instruction and then fill the
> > > spilled register (i.e. 2 memory loads), compared to a MEM use on the
> > > arithmetic instruction one MEM seems cheaper to me. With -fPIC and the
> > > cst needing runtime relocation slightly less so of course.
> > > >
> > We will check the performance impact on SPEC CPU 2017.
> > Here is the v2 patch. Liwei, can you help collect SPEC CPU 2017
> > impact of the enclosed patch? Thanks.
>
> We checked SPEC CPU 2017 performance with -O2 and -Ofast.
> There is no performance regression. OK for master?

OK if there are no further comments from Jakub.

Thanks,
Uros.

> > > The code due to ivopts is trying to have something like
> > >   size_t a = (size_t) &tunable_list;
> > >   size_t b = 0xffffffffffffffa8 - a;
> > >   size_t c = x + b;
> > > and for that cst - &symbol one needs actually 2 registers, one to hold the
> > > constant and one to hold the (%rip) based address.
> > > (insn 790 789 791 111 (set (reg:DI 292)
> > >         (const_int -88 [0xffffffffffffffa8])) "dl-tunables.c":304:15 76 {*movdi_internal}
> > >      (nil))
> > > (insn 791 790 792 111 (set (reg:DI 293)
> > >         (symbol_ref:DI ("tunable_list") [flags 0x2] )) "dl-tunables.c":304:15 76 {*movdi_internal}
> > >      (nil))
> > > (insn 792 791 793 111 (parallel [
> > >             (set (reg:DI 291)
> > >                 (minus:DI (reg:DI 292)
> > >                     (reg:DI 293)))
> > >             (clobber (reg:CC 17 flags))
> > >         ]) "dl-tunables.c":304:15 299 {*subdi_1}
> > >      (nil))
> > > (insn 793 792 794 111 (parallel [
> > >             (set (reg:DI 294)
> > >                 (plus:DI (reg:DI 291)
> > >                     (reg:DI 198 [ ivtmp.176 ])))
> > >             (clobber (reg:CC 17 flags))
> > >         ]) "dl-tunables.c":304:15 226 {*adddi_1}
> > >      (nil))
> > > It would be smarter to rewrite the above into a lea 88+tunable_list(%rip), %temp1
> > > and use a subtraction instead of addition in the last insn above, or of
> > > course in the particular case even consider the following 2 instructions
> > > that do:
> > > (insn 794 793 795 111 (set (reg:DI 296)
> > >         (symbol_ref:DI ("tunable_list") [flags 0x2] )) "dl-tunables.c":304:15 76 {*movdi_internal}
> > >      (nil))
> > > (insn 795 794 796 111 (parallel [
> > >             (set (reg:DI 295 [ cur ])
> > >                 (plus:DI (reg:DI 294)
> > >                     (reg:DI 296)))
> > >             (clobber (reg:CC 17 flags))
> > >         ]) "dl-tunables.c":304:15 226 {*adddi_1}
> > >      (nil))
> > > and find out that &tunable_list - &tunable_list is 0 and we don't need it at
> > > all. Guess we don't figure that out due to the cast of one of those
> > > addresses to size_t and the other one used in POINTER_PLUS_EXPR as normal
> > > pointer.
> > >
> > >         Jakub
> > >
> >
> >
> > --
> > H.J.
>
> Thanks.
>
> --
> H.J.
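
For illustration only, the cancellation Jakub describes in the quoted text (the symbol address is subtracted in insn 792 and added back in insn 795) can be checked with a small C sketch. This is not code from glibc or from the patch: the tunable_list array here is a hypothetical stand-in whose address is the only thing used, and the function names as_generated, simplified and check_equivalence are made up for the example. It assumes a 64-bit target, matching the DImode arithmetic in the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for glibc's tunable_list; only its address is used.  */
char tunable_list[256];

/* Follows the arithmetic of insns 790-795 in the quoted dump: a constant
   minus the symbol address, plus x, plus the symbol address again.  */
uint64_t as_generated (uint64_t x)
{
  uint64_t a = (uint64_t) (uintptr_t) &tunable_list;  /* insn 791 */
  uint64_t b = UINT64_C (0xffffffffffffffa8) - a;     /* insns 790, 792 */
  uint64_t c = x + b;                                 /* insn 793 */
  return c + a;                                       /* insns 794, 795 */
}

/* What remains once the two address terms cancel: x + (-88).  */
uint64_t simplified (uint64_t x)
{
  return x - 88;
}

/* Unsigned arithmetic wraps modulo 2^64, so both forms agree for any x.  */
void check_equivalence (uint64_t x)
{
  assert (as_generated (x) == simplified (x));
}
```

Calling check_equivalence with any value should pass, since the address term drops out regardless of where the linker places the array. The sketch only shows that the rewrite is arithmetically valid; whether GCC can perform it depends on the size_t cast and POINTER_PLUS_EXPR issue mentioned in the quoted text.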