Date: Sat, 16 Mar 2024 14:27:10 -0600
Subject: Re: [gcc-15 3/3] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]
To: Vineet Gupta, gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, Palmer Dabbelt, gnu-toolchain@rivosinc.com, Robin Dapp
References: <20240316173524.1147760-1-vineetg@rivosinc.com> <20240316173524.1147760-4-vineetg@rivosinc.com>
From: Jeff Law
In-Reply-To: <20240316173524.1147760-4-vineetg@rivosinc.com>

On 3/16/24 11:35 AM, Vineet Gupta wrote:
> If the constant used for stack offset can be expressed as sum of two S12
> values, the constant need not be materialized (in a reg) and instead the
> two S12 bits can be added to instructions involved with frame pointer.
> This avoids burning a register and more importantly can often get down
> to be 2 insn vs. 3.
>
> The prev patches to generally avoid LUI based const materialization didn't
> fix this PR and need this directed fix in function prologue/epilogue
> expansion.
>
> This fix doesn't move the needle for SPEC, at all, but it is still a
> win considering gcc generates one insn fewer than llvm for the test ;-)
>
>  gcc-13.1 release    |    gcc 230823       |                   |
>                      |   g6619b3d4c15c     |    This patch     | clang/llvm
> ---------------------------------------------------------------------------------
> li   t0,-4096        | li   t0,-4096       | addi sp,sp,-2048  | addi sp,sp,-2048
> addi t0,t0,2016      | addi t0,t0,2032     | add  sp,sp,-16    | addi sp,sp,-32
> li   a4,4096         | add  sp,sp,t0       | add  a5,sp,a0     | add  a1,sp,16
> add  sp,sp,t0        | addi a5,sp,-2032    | sb   zero,0(a5)   | add  a0,a0,a1
> li   a5,-4096        | add  a0,a5,a0       | addi sp,sp,2032   | sb   zero,0(a0)
> addi a4,a4,-2032     | li   t0, 4096       | addi sp,sp,32     | addi sp,sp,2032
> add  a4,a4,a5        | sb   zero,2032(a0)  | ret               | addi sp,sp,48
> addi a5,sp,16        | addi t0,t0,-2032    |                   | ret
> add  a5,a4,a5        | add  sp,sp,t0       |
> add  a0,a5,a0        | ret                 |
> li   t0,4096         |
> sd   a5,8(sp)        |
> sb   zero,2032(a0)   |
> addi t0,t0,-2016     |
> add  sp,sp,t0        |
> ret                  |
>
> gcc/ChangeLog:
>         PR target/105733
>         * config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
>         function to split a sum of two s12 values into constituents.
>         (riscv_expand_prologue): Handle offset being sum of two S12.
>         (riscv_expand_epilogue): Ditto.
>         * config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.
>
> gcc/testsuite/ChangeLog:
>         * gcc.target/riscv/pr105733.c: New Test.
>         * gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
>         expect LUI 4096.
>         * gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
>         * gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
>         * gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
>         * gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
>         * gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
>         * gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.

Yea, wouldn't expect this to move the needle on spec since it's just
hitting the prologue/epilogue.  In fact, I wouldn't be surprised if
there were other stack frame sizes that could be improved.
But I wouldn't bother chasing down those other cases.

If we think about the embedded space, they're probably not going to want
to see functions with large frames to begin with.  So optimizing those
cases for the embedded space just doesn't make much sense.

In the distro space, by this time next year we'll be living in a world
where stack clash mitigations are enabled.  So for any given size stack
frame, it'll be allocated in at most 1 page chunks.  So again, going to
any significant length to optimize other cases just doesn't make much
sense.

So we probably should go with this patch in the gcc-15 space, but I
wouldn't suggest heroic efforts for other sized stack frames.

jeff
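
For readers following the thread, the "sum of two S12" idea from the quoted patch description can be sketched in plain C. This is only an illustration, not the code from the patch: the helper name split_sum_of_two_s12, the S12_* macros, and the choice of 2032 as the first positive step (taken from the "This patch" column, presumably so sp stays 16-byte aligned between the two adjustments) are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Range of a signed 12-bit immediate, as used by addi. */
#define S12_MIN (-2048)
#define S12_MAX 2047

/* Assumption: largest 16-byte-aligned S12 value, mirroring the 2032
   that appears in the "This patch" column of the table.  */
#define S12_MAX_ALIGNED 2032

static bool fits_s12(int64_t v)
{
    return v >= S12_MIN && v <= S12_MAX;
}

/* Hypothetical helper (not the patch's riscv_split_sum_of_two_s12):
   if VAL does not fit one S12 immediate but is a sum of two, store
   the two parts in *BASE and *OFF and return true.  */
bool split_sum_of_two_s12(int64_t val, int64_t *base, int64_t *off)
{
    int64_t first;

    if (fits_s12(val))
        return false;          /* a single addi already suffices */

    /* Take the largest step in the direction of VAL first. */
    first = val < 0 ? S12_MIN : S12_MAX_ALIGNED;
    if (!fits_s12(val - first))
        return false;          /* needs more than two S12 immediates */

    *base = first;
    *off = val - first;
    assert(*base + *off == val);
    return true;
}
```

Under these assumptions, a stack adjustment of -2064 splits into -2048 and -16, and +2064 splits into 2032 and 32, which are the addi pairs shown in the "This patch" column. Anything outside the reachable range (for example 4096) is rejected and would still need a materialized constant.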