From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 1 Jun 2022 09:04:07 -0600
Subject: Re: [PATCH] PR rtl-optimization/7061: Complex number arguments on x86_64-like ABIs.
To: gcc-patches@gcc.gnu.org
References: <00e101d8740c$e785e110$b691a330$@nextmovesoftware.com>
From: Jeff Law
In-Reply-To: <00e101d8740c$e785e110$b691a330$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

On 5/30/2022 4:06 AM, Roger Sayle wrote:
> This patch addresses the issue in comment #6 of PR rtl-optimization/7061
> (a four digit PR number) from 2006 where on x86_64 complex number arguments
> are unconditionally spilled to the stack.
>
> For the test cases below:
> float re(float _Complex a) { return __real__ a; }
> float im(float _Complex a) { return __imag__ a; }
>
> GCC with -O2 currently generates:
>
> re:     movq    %xmm0, -8(%rsp)
>         movss   -8(%rsp), %xmm0
>         ret
> im:     movq    %xmm0, -8(%rsp)
>         movss   -4(%rsp), %xmm0
>         ret
>
> with this patch we now generate:
>
> re:     ret
> im:     movq    %xmm0, %rax
>         shrq    $32, %rax
>         movd    %eax, %xmm0
>         ret
>
> [Technically, this shift can be performed on %xmm0 in a single
> instruction, but the backend needs to be taught to do that, the
> important bit is that the SCmode argument isn't written to the
> stack].
>
> The patch itself is to emit_group_store where just before RTL
> expansion commits to writing to the stack, we check if the store
> group consists of a single scalar integer register that holds
> a complex mode value; on x86_64 SCmode arguments are passed in
> DImode registers. If this is the case, we can use a SUBREG to
> "view_convert" the integer to the equivalent complex mode.
>
> An interesting corner case that showed up during testing is that
> x86_64 also passes HCmode arguments in DImode registers(!), i.e.
> using modes of different sizes. This is easily handled/supported
> by first converting to an integer mode of the correct size, and
> then generating a complex mode SUBREG of this. This is similar
> in concept to the patch I proposed here:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
> which was almost (but not quite) approved here:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591139.html

Yea, sorry. Too much to do at the new job. Trying to work my way through
queued up stuff now...

>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures. Ok for mainline?
>
>
> 2022-05-30  Roger Sayle
>
> gcc/ChangeLog
>         PR rtl-optimization/7061
>         * expr.cc (emit_group_store): For groups that consist of a single
>         scalar integer register that holds a complex mode value, use
>         gen_lowpart to generate a SUBREG to "view_convert" to the complex
>         mode. For modes of different sizes, first convert to an integer
>         mode of the appropriate size.
>
> gcc/testsuite/ChangeLog
>         PR rtl-optimization/7061
>         * gcc.target/i386/pr7061-1.c: New test case.
>         * gcc.target/i386/pr7061-2.c: New test case.

OK

jeff
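
For anyone reading this thread later, the "view_convert" idea in the quoted description can be mimicked in plain C. The sketch below is only a source-level analogy and is not the RTL change in expr.cc. It assumes a little-endian x86_64 target where float _Complex is laid out as two 32-bit floats with the real part first. The helper names (pack_sc, bits_to_float, re_from_di, im_from_di, hc_halves_from_di) are made up for illustration, and the HCmode part further assumes the 32-bit payload sits in the low half of the 64-bit register.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float _Complex occupies exactly 64 bits (true on x86_64). */
_Static_assert(sizeof(float _Complex) == sizeof(uint64_t),
               "float _Complex must fit exactly in 64 bits");

/* Hypothetical helper: pack (re, im) into a 64-bit integer, standing in
   for an SCmode argument that arrives in a DImode register. */
uint64_t pack_sc(float re, float im)
{
    float parts[2] = { re, im };
    uint64_t bits;
    memcpy(&bits, parts, sizeof bits);   /* reinterpret bytes, no conversion */
    return bits;
}

/* Reinterpret a 32-bit pattern as a float without going through memory
   that the caller has to manage. */
float bits_to_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Real part: the low 32 bits of the register (little-endian assumption). */
float re_from_di(uint64_t bits)
{
    return bits_to_float((uint32_t)bits);
}

/* Imaginary part: shift right by 32, analogous to the shrq $32 in the
   generated code quoted above. */
float im_from_di(uint64_t bits)
{
    return bits_to_float((uint32_t)(bits >> 32));
}

/* HCmode-like corner case: a 32-bit payload (two 16-bit halves) carried in
   a 64-bit register. First narrow to an integer of the correct size, then
   split into the two halves. Raw 16-bit patterns are returned so the sketch
   does not depend on _Float16 support. */
void hc_halves_from_di(uint64_t reg, uint16_t *re_bits, uint16_t *im_bits)
{
    uint32_t narrowed = (uint32_t)reg;        /* integer of the correct size */
    *re_bits = (uint16_t)narrowed;            /* low half: real part bits */
    *im_bits = (uint16_t)(narrowed >> 16);    /* high half: imaginary part bits */
}
```

Calling re_from_di(pack_sc(1.5f, -2.25f)) gives back 1.5f and im_from_di on the same value gives -2.25f, with no store to a stack slot needed at the source level. The narrowing step in hc_halves_from_di corresponds to "first converting to an integer mode of the correct size" before taking the complex view.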