From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [IPv6:2607:f8b0:4864:20::531]) by sourceware.org (Postfix) with ESMTPS id DF2B73857712 for ; Tue, 18 Apr 2023 16:59:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DF2B73857712 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-pg1-x531.google.com with SMTP id 41be03b00d2f7-517bdc9e81dso1034673a12.1 for ; Tue, 18 Apr 2023 09:59:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1681837149; x=1684429149; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=eXXW5F0Ij3V6ZxhO9IbMOes/N27elU0l76B8esJmgH4=; b=sTgvXBOsgO15JldDGLrfqzrFk1BTtE+3e3+DHisNVlCxV3hyDBG14c8oA3IyX754bQ aF1VjPmGq4LPDK2W0+CIV1gt4aK5KGKvk7G0qZZXe0OCOX7N4MEZFebadxvy3Io8zWma tj/ywMmNwunQxPBm7zmOx2drQqzVOEUCRJFeQIETL9NuXSLtUwNkrL0V9aBLoMjGYOhp zqaWrXF2+dkTv58GMItWrXSvK0YBIfm6FvU3nr2l4fBlMyGTx/QZhJfyOs2cmAdrRLPq mijZ7b8y6zJ/tzrlZU9nd3Zlba06ypByDtklILs/XfNoJ04Ikv2YMTwRcAJwtlqOZUhZ MJCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681837149; x=1684429149; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=eXXW5F0Ij3V6ZxhO9IbMOes/N27elU0l76B8esJmgH4=; b=ONZJ/WXaN2oo38/qYjRuJu1A6EIPQR8mNrb4mlmHXz5Ydb8aD28VOKr50on4GAzoTa 6wMF1WF3j3tyfZ+WS0GwRiywpyvtuLNjncFcECP4fEIJTskOE8xb/k6N/1bY+oeOLWcc zLUMjmlHnhg+EY2IKw0kuzBAhz82DgvKq8d3SAZjt8/JPm2dQbFXbubmLs8s/szNQGWR AaW3pY5YIJBWRzGDjsPcXoQ/UUHd+7yRGQXmkOHzcEZfHHvXLwIWoIylzKTh6HnK1jbn EqLmBlcMETiAFCla5H8u9ja02Tf3KrFvUbUWqG+JPACfH0qxhI6DGtv8sX+1aaOIw/Qz tolw== X-Gm-Message-State: AAQBX9esZ9tW2a2Fgyp39HBu+hp/eBFk0NUGrc6a1jH7AaGctQuannYg k5tqHnrXhTJnNhC9WyKkD0U= X-Google-Smtp-Source: AKy350ZBeufPXQ/uRwOjqANRRIl0IvSVAU0+KFX3OO0RYftEZOBQzDlNjBiWuFYtst408A+3UWL+fw== X-Received: by 2002:a17:903:22cf:b0:1a5:1438:9bcd with SMTP id y15-20020a17090322cf00b001a514389bcdmr3034457plg.35.1681837149167; Tue, 18 Apr 2023 09:59:09 -0700 (PDT) Received: from ?IPV6:2601:681:8600:13d0::f0a? ([2601:681:8600:13d0::f0a]) by smtp.gmail.com with ESMTPSA id p7-20020a170903248700b001a68d45e52dsm9412943plw.249.2023.04.18.09.59.07 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 18 Apr 2023 09:59:08 -0700 (PDT) Message-ID: Date: Tue, 18 Apr 2023 10:59:07 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.1 Subject: Re: [PATCH v5] RISCV: Inline subword atomic ops Content-Language: en-US To: Patrick O'Neill , gcc-patches@gcc.gnu.org Cc: palmer@rivosinc.com, kito.cheng@gmail.com, david.abd@gmail.com References: <20220821215823.18207-1-palmer@rivosinc.com> <20230418142858.2424851-1-patrick@rivosinc.com> From: Jeff Law In-Reply-To: <20230418142858.2424851-1-patrick@rivosinc.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-9.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 4/18/23 08:28, Patrick O'Neill wrote: > RISC-V has no support for subword atomic operations; code currently > generates libatomic library calls. > > This patch changes the default behavior to inline subword atomic calls > (using the same logic as the existing library call). > Behavior can be specified using the -minline-atomics and > -mno-inline-atomics command line flags. > > gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm. > This will need to stay for backwards compatibility and the > -mno-inline-atomics flag. > > 2023-04-18 Patrick O'Neill > > PR target/104338 > * riscv-protos.h: Add helper function stubs. > * riscv.cc: Add helper functions for subword masking. > * riscv.opt: Add command-line flag. > * sync.md: Add masking logic and inline asm for fetch_and_op, > fetch_and_nand, CAS, and exchange ops. > * invoke.texi: Add blurb regarding command-line flag. > * inline-atomics-1.c: New test. > * inline-atomics-2.c: Likewise. > * inline-atomics-3.c: Likewise. > * inline-atomics-4.c: Likewise. > * inline-atomics-5.c: Likewise. > * inline-atomics-6.c: Likewise. > * inline-atomics-7.c: Likewise. > * inline-atomics-8.c: Likewise. > * atomic.c: Add reference to duplicate logic. So for others who may be interested. The motivation here is that for a sub-word atomic we currently have to explicitly link in libatomic or we get undefined symbols. This is particularly problematical for the distros because we're one of the few (only?) architectures supported by the distros that require linking in libatomic for these cases. THe distros don't want to adjust each affected packages and be stuck carrying that change forward or negotiating with all the relevant upstreams. The distros might tackle this problem by porting this patch into their compiler tree which has its own set of problems with long term maintenance. The net is from a usability standpoint it's best if we get this problem addressed and backported to our gcc-13 RISC-V coordination branch. We had held this up pending resolution of some other issues in the atomics space. In retrospect that might have been a mistake. So with that background... Here we go... > > +/* Helper function for extracting a subword from memory. */ > + > +void > +riscv_subword_address (rtx mem, rtx *aligned_mem, rtx *shift, rtx *mask, > + rtx *not_mask) So I'd expand on that comment. The idea is we would like someone working in the backend to be able to read the function comment and have a reasonable sense of what the function does as well as the inputs and return value. So perhaps something like this: /* Given memory reference MEM, expand code to compute the aligned memory address, shift and mask values and store them into *ALIGNED_MEM, *SHIFT, *MASK and *NOT_MASK. */ Or something like that. > +{ > + /* Align the memory addess to a word. */ s/addess/address/ > + rtx addr = force_reg (Pmode, XEXP (mem, 0)); > + > + rtx aligned_addr = gen_reg_rtx (Pmode); > + emit_move_insn (aligned_addr, gen_rtx_AND (Pmode, addr, > + gen_int_mode (-4, Pmode))); So rather than -4 as a magic number, GET_MODE_MASK would be better. That may result in needing to rewrap this code. I'd bring the gen_rtx_AND down on a new line, aligned with aligned_addr. Presumably using SImode is intentional here rather than wanting to use word_mode which would be SImode for rv32 and DImode for rv64? I'm going to work based on that assumption, but if it isn't there's more work to do to generalize this code. > + > + /* Calculate the shift amount. */ > + emit_move_insn (*shift, gen_rtx_AND (SImode, gen_lowpart (SImode, addr), > + gen_int_mode (3, SImode))); > + emit_move_insn (*shift, gen_rtx_ASHIFT (SImode, *shift, > + gen_int_mode(3, SImode))); Formatting nit. space after before open paren in the gen_int_mode call on that last line above. This minor goof shows up in various places, please review the patch as a whole looking for similar nits. > + > + /* Calculate the mask. */ > + int unshifted_mask; > + if (GET_MODE (mem) == QImode) > + unshifted_mask = 0xFF; > + else > + unshifted_mask = 0xFFFF; Can you just use GET_MASK_MODE here which should simplify this to something like unshifted_mask = GET_MODE_MASK (GET_MODE (mem)); > + > +(define_expand "atomic_fetch_nand" > + [(set (match_operand:SHORT 0 "register_operand" "=&r") > + (match_operand:SHORT 1 "memory_operand" "+A")) > + (set (match_dup 1) > + (unspec_volatile:SHORT > + [(not:SHORT (and:SHORT (match_dup 1) > + (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))) > + (match_operand:SI 3 "const_int_operand")] ;; model > + UNSPEC_SYNC_OLD_OP_SUBWORD))] > + "TARGET_ATOMIC && TARGET_INLINE_SUBWORD_ATOMIC" Just a note, constraints aren't necessary for a define_expand. They don't hurt anything though. They do document expectations, but then you have to maintain them over time. I'm OK leaving them, mostly wanted to make sure you're aware they aren't strictly necessary for a define_expand. > + > +(define_expand "atomic_fetch_" > + [(set (match_operand:SHORT 0 "register_operand" "=&r") ;; old value at mem > + (match_operand:SHORT 1 "memory_operand" "+A")) ;; mem location > + (set (match_dup 1) > + (unspec_volatile:SHORT > + [(any_atomic:SHORT (match_dup 1) > + (match_operand:SHORT 2 "reg_or_0_operand" "rJ")) ;; value for op > + (match_operand:SI 3 "const_int_operand")] ;; model > + UNSPEC_SYNC_OLD_OP_SUBWORD))] > + "TARGET_ATOMIC && TARGET_INLINE_SUBWORD_ATOMIC" > +{ > + /* We have no QImode/HImode atomics, so form a mask, then use > + subword_atomic_fetch_strong_ to implement a LR/SC version of the > + operation. */ > + > + /* Logic duplicated in gcc/libgcc/config/riscv/atomic.c for use when inlining > + is disabled */ Makes me wonder if we should expose builtins that could be used by atomic.c rather than having the logic open coded in two places. That could be a follow-up in my opinion rather than a blocker for this patch. > > +(define_expand "atomic_compare_and_swap" > + [(match_operand:SI 0 "register_operand" "") ;; bool output > + (match_operand:SHORT 1 "register_operand" "") ;; val output > + (match_operand:SHORT 2 "memory_operand" "") ;; memory > + (match_operand:SHORT 3 "reg_or_0_operand" "") ;; expected value > + (match_operand:SHORT 4 "reg_or_0_operand" "") ;; desired value > + (match_operand:SI 5 "const_int_operand" "") ;; is_weak > + (match_operand:SI 6 "const_int_operand" "") ;; mod_s > + (match_operand:SI 7 "const_int_operand" "")] ;; mod_f > + "TARGET_ATOMIC && TARGET_INLINE_SUBWORD_ATOMIC" I nearly suggested you use this form for the earlier expanders where the RTL in the expander is ultimately thrown away because we generate all the RTL we need in the C fragment and finish with a "DONE" tag. > diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c > new file mode 100644 > index 00000000000..5c5623d9b2f > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c > @@ -0,0 +1,18 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mno-inline-atomics" } */ > +/* { dg-message "note: '__sync_fetch_and_nand' changed semantics in GCC 4.4" "fetch_and_nand" { target *-*-* } 0 } */ > +/* { dg-final { scan-assembler "\tcall\t__sync_fetch_and_add_1" } } */ > +/* { dg-final { scan-assembler "\tcall\t__sync_fetch_and_nand_1" } } */ > +/* { dg-final { scan-assembler "\tcall\t__sync_bool_compare_and_swap_1" } } */ > + > +char foo; > +char bar; > +char baz; > + > +int > +main () > +{ > + __sync_fetch_and_add(&foo, 1); > + __sync_fetch_and_nand(&bar, 1); > + __sync_bool_compare_and_swap (&baz, 1, 2); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c > new file mode 100644 > index 00000000000..fdce7a5d71f > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c > @@ -0,0 +1,19 @@ > +/* { dg-do compile } */ > +/* Verify that subword atomics do not generate calls. */ > +/* { dg-options "-minline-atomics" } */ > +/* { dg-message "note: '__sync_fetch_and_nand' changed semantics in GCC 4.4" "fetch_and_nand" { target *-*-* } 0 } */ > +/* { dg-final { scan-assembler-not "\tcall\t__sync_fetch_and_add_1" } } */ > +/* { dg-final { scan-assembler-not "\tcall\t__sync_fetch_and_nand_1" } } */ > +/* { dg-final { scan-assembler-not "\tcall\t__sync_bool_compare_and_swap_1" } } */ Note that you can #include another test file. When you do that the dg-directives in the #included file are ignored. Meaning that you can share test code, but use different dg-directives. So for inline-2.c, you can have the dg-* directives above, followed by: #include "inline-atomics-1.c" Overall it looks pretty close to ready. Jeff