Date: Sun, 12 Jun 2022 11:27:13 -0600
Subject: Re: [PATCH]middle-end Use subregs to expand COMPLEX_EXPR to set the lowpart.
To: gcc-patches@gcc.gnu.org
From: Jeff Law

On 6/9/2022 1:52 AM, Tamar Christina via Gcc-patches wrote:
> Hi All,
>
> When lowering COMPLEX_EXPR we currently emit two VEC_EXTRACTs.  One for the
> lowpart and one for the highpart.
>
> The problem with this is that in RTL the lvalue of the RTX is the only thing
> tying the two instructions together.
>
> This means that e.g. combine is unable to try to combine the two instructions
> for setting the lowpart and highpart.
>
> For ISAs that have bit extract instructions we can eliminate one of the extracts
> if, and only if we're setting the entire complex number.
>
> This change changes the expand code when we're setting the entire complex number
> to generate a subreg for the lowpart instead of a vec_extract.
>
> This allows us to optimize sequences such as:
Just a note.  I regularly see subregs significantly interfere with
optimization, particularly register allocation.  So be aware that
subregs can often get in the way of generating good code.  When changing
something to use subregs I like to run real benchmarks rather than
working with code snippets.

>
> _Complex int f(int a, int b) {
>   _Complex int t = a + b * 1i;
>   return t;
> }
>
> from:
>
> f:
> 	bfi	x2, x0, 0, 32
> 	bfi	x2, x1, 32, 32
> 	mov	x0, x2
> 	ret
>
> into:
>
> f:
> 	bfi	x0, x1, 32, 32
> 	ret
>
> I have also confirmed the codegen for x86_64 did not change.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 	* emit-rtl.cc (validate_subreg): Accept subregs of complex modes.
> 	* expr.cc (emit_move_complex_parts): Emit subreg of lowpart if possible.
>
> gcc/testsuite/ChangeLog:
>
> 	* g++.target/aarch64/complex-init.C: New test.
OK.

On a related topic, any thoughts on keeping complex objects as complex
types/modes through gimple and into at least parts of the RTL pipeline?

The way complex arithmetic instructions work on our chip is going to be
extremely tough to utilize in GCC -- we really need to keep the complex
types/arithmetic up through RTL generation at the least.  Ideally we'd
even expose complex modes all the way to final.  Is that something
y'all could benefit from as well?  Have y'all poked at this problem at
all?

jeff
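
For reference, here is a rough C model of the two quoted AArch64 sequences. It is only a sketch and not part of the patch: bfi(), seq_before() and seq_after() are hypothetical helper names, and it assumes the real part sits in bits 0-31 and the imaginary part in bits 32-63 of the 64-bit return register. It shows why the lowpart insert is redundant when the real part is already in the low half of the destination.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "bfi dst, src, lsb, width": copy the low
   `width` bits of src into dst starting at bit `lsb`, leaving the
   other bits of dst unchanged.  */
static uint64_t bfi(uint64_t dst, uint64_t src, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 64);
    uint64_t field = (width == 64) ? ~0ull : ((1ull << width) - 1);
    uint64_t mask = field << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}

/* Models the "from" sequence: both halves are inserted into a scratch
   register x2 (whose old contents are arbitrary), then moved to x0.  */
static uint64_t seq_before(int32_t a, int32_t b, uint64_t x2)
{
    uint64_t x0 = (uint32_t)a;
    uint64_t x1 = (uint32_t)b;
    x2 = bfi(x2, x0, 0, 32);   /* bfi x2, x0, 0, 32  */
    x2 = bfi(x2, x1, 32, 32);  /* bfi x2, x1, 32, 32 */
    return x2;                 /* mov x0, x2         */
}

/* Models the "into" sequence: a is already the low half of x0, so
   only the high half needs to be inserted.  */
static uint64_t seq_after(int32_t a, int32_t b)
{
    uint64_t x0 = (uint32_t)a;
    uint64_t x1 = (uint32_t)b;
    return bfi(x0, x1, 32, 32); /* bfi x0, x1, 32, 32 */
}
```

Under these assumptions both models return the same 64-bit value for any a and b, whatever the scratch register held beforehand. This says nothing about the register-allocation effects mentioned above, which would need real benchmarks.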