From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x1029.google.com (mail-pj1-x1029.google.com [IPv6:2607:f8b0:4864:20::1029]) by sourceware.org (Postfix) with ESMTPS id D9DF73858D20 for ; Wed, 15 May 2024 11:02:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D9DF73858D20 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D9DF73858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::1029 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715770990; cv=none; b=n5yXqS3nLKsAZ0stW7m8BFUEijYuekXp/q92PrXMU1mBc+LfF8a3TtKDMNCxLxG0V5Hw7jcOK90r3kqwaYPqzVBey6RUn0YIgF2Qfa1MwCZ+rRWHzSSjIpWvezpkass0WyTHiJfI8hIN1pzn5iiQ3J/WocPu+RK7kSHTHUdYbz8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715770990; c=relaxed/simple; bh=rX0DULdFpZdCotB7y5aii7uhIUpsDzHREs2X6BDB2es=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=L2ct/VqiXRaoQX6W+n0QQ9zRSGcf516HYfXhvwpyXebxz6b+g9rviat4TJeVzC7IiCHWyZwM0S343qIJ3Cqf9N2YaugtiK3CyZdUnxd3tObCX73mvqMbfj3keIcfQuVO5SklmrgfU7fQbZER/gUZzql/a1iSRdz4EVBRs93hM4g= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pj1-x1029.google.com with SMTP id 98e67ed59e1d1-2b5388087f9so4671706a91.0 for ; Wed, 15 May 2024 04:02:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1715770979; x=1716375779; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=fNQo0HYsa0EBy5/GMa49srmTcLo3xOoEju7nly9bT6M=; b=rXEjHL2a+woDRloulehMCM+Dxlph6CU512nLGoqWw8RqaD3y9JUq+WeylLj3BCpLeJ 6AvUhUb0xdAE+TrtvDJfEOyI1SoeBWdP0l9hVoYLlJhsahTDp88sqzqTQuF2Tq7O6ODd DHKXD2oOWd0n8W9pRj+NFGhOZmvX5gEoSwcasy9FiuU13zNGttUmdbVi6bslJCxSfJX+ ne1JjlvMf1eXLa8IwDCUtCqiO5j2v/kWnR2QKKeCgSAPNyh87WCkqiXfthi4lk7d6fol 6yYxXgEjZI4OlhIWxp9QORH2sUAeOxk36u3Bi5lgBo4pkfwSXi31QKQxbk8M7Xks7i5u ZMMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715770979; x=1716375779; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fNQo0HYsa0EBy5/GMa49srmTcLo3xOoEju7nly9bT6M=; b=vhflMjnp1qaMbtXpvMTCth9dDO0Jir5OsWlHhzLyXeF1Lam52rNJuyJ+akUQzluUpK g5w7NObRL9KaHMvmRer2IL/qUZjdUxubOcZEsi1gScMZvxCXaVPM09lLZawbxRzvL0TI /y1JzqMjK2tqmQnAk8IFX6yJwWa1hXzUiGiScxQBNS5tryIIzaNmQoEN7zm6Vruievg8 KXaA7mc+Lm2gAR9oITiCpPD2wRznvE8SH4Tq39EWZE9HaHHekncp8pNbyiU28bPIlAnH d+gofWRqRkWdAPUI6WVA8eEVxRBAARBsnRIt8w1crjkk4lf3ZB09rhLdOcC/pfRK4ovk X1mA== X-Gm-Message-State: AOJu0Yw/dEWwNQzGBw2tfwPE9UGxcPHFOBPBJRUUeP6QvIO4GkkZNCoQ 8QDkSNHmK6lEJSCfTGZMvbUrmjFSHeN0QfZufOUes05K7x7qZahunDIrMJskzso0IShq8k/Gffl yqXy3gzRYa6QZTc3PqjshETDt9ACCKo5PeXS2mvTfFexbXed3zB8= X-Google-Smtp-Source: AGHT+IESNXdK7ydQkK9SD75puKyciDG0cjPTHn+iAcNMIw8tvu7mnFZYS+zp1GiwDnmKvg5q6UJIxQqiC9R6p1B0qXU= X-Received: by 2002:a17:90a:5802:b0:2b2:af15:f948 with SMTP id 98e67ed59e1d1-2b6ccc72faemr14245114a91.38.1715770978752; Wed, 15 May 2024 04:02:58 -0700 (PDT) MIME-Version: 1.0 References: <20240508051756.3999080-1-christoph.muellner@vrull.eu> <20240508051756.3999080-3-christoph.muellner@vrull.eu> In-Reply-To: From: =?UTF-8?Q?Christoph_M=C3=BCllner?= Date: Wed, 15 May 2024 13:02:46 +0200 Message-ID: Subject: Re: [PATCH 2/4] RISC-V: Allow unaligned accesses in cpymemsi expansion To: Jeff Law Cc: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Vineet Gupta Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Sat, May 11, 2024 at 12:32=E2=80=AFAM Jeff Law w= rote: > > > > On 5/7/24 11:17 PM, Christoph M=C3=BCllner wrote: > > The RISC-V cpymemsi expansion is called, whenever the by-pieces > > infrastructure will not take care of the builtin expansion. > > The code emitted by the by-pieces infrastructure may emits code, > > that includes unaligned accesses if riscv_slow_unaligned_access_p > > is false. > > > > The RISC-V cpymemsi expansion is handled via riscv_expand_block_move(). > > The current implementation of this function does not check > > riscv_slow_unaligned_access_p and never emits unaligned accesses. > > > > Since by-pieces emits unaligned accesses, it is reasonable to implement > > the same behaviour in the cpymemsi expansion. And that's what this patc= h > > is doing. > > > > The patch checks riscv_slow_unaligned_access_p at the entry and sets > > the allowed alignment accordingly. This alignment is then propagated > > down to the routines that emit the actual instructions. > > > > The changes introduced by this patch can be seen in the adjustments > > of the cpymem tests. > > > > gcc/ChangeLog: > > > > * config/riscv/riscv-string.cc (riscv_block_move_straight): Add > > parameter align. > > (riscv_adjust_block_mem): Replace parameter length by align. > > (riscv_block_move_loop): Add parameter align. > > (riscv_expand_block_move_scalar): Set alignment properly if the > > target has fast unaligned access. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/riscv/cpymem-32-ooo.c: Adjust for unaligned access. > > * gcc.target/riscv/cpymem-64-ooo.c: Likewise. > Mostly ok. One concern noted below. > > > > > > Signed-off-by: Christoph M=C3=BCllner > > --- > > gcc/config/riscv/riscv-string.cc | 53 +++++++++++-------= - > > .../gcc.target/riscv/cpymem-32-ooo.c | 20 +++++-- > > .../gcc.target/riscv/cpymem-64-ooo.c | 14 ++++- > > 3 files changed, 59 insertions(+), 28 deletions(-) > > > > @@ -730,8 +732,16 @@ riscv_expand_block_move_scalar (rtx dest, rtx src,= rtx length) > > unsigned HOST_WIDE_INT hwi_length =3D UINTVAL (length); > > unsigned HOST_WIDE_INT factor, align; > > > > - align =3D MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WOR= D); > > - factor =3D BITS_PER_WORD / align; > > + if (riscv_slow_unaligned_access_p) > > + { > > + align =3D MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER= _WORD); > > + factor =3D BITS_PER_WORD / align; > > + } > > + else > > + { > > + align =3D hwi_length * BITS_PER_UNIT; > > + factor =3D 1; > > + } > Not sure why you're using hwi_length here. That's a property of the > host, not the target. ISTM you wanted BITS_PER_WORD here to encourage > word sized moves irrespective of alignment. We set 'align' here to pretend proper alignment to force unaligned accesses (if needed). 'hwi_length' is defined above as: unsigned HOST_WIDE_INT hwi_length =3D UINTVAL (length); So, it is not a host property, but the number of bytes to compare. Setting 'align' to BITS_PER_WORD does the same but is indeed the better cho= ice. I'll also add the comment "Pretend alignment" to make readers aware of the fact that we ignore the actual alignment. > OK with that change after a fresh rounding of testing. Pushed after adjusting as stated above and retesting: * rv32 & rv64: GCC regression tests * rv64: CPU 2017 intrate