From: Evan Green
Date: Fri, 7 Jul 2023 14:37:03 -0700
Subject: Re: [PATCH v4 3/3] riscv: Add and use alignment-ignorant memcpy
To: Jeff Law
Cc: Richard Henderson, libc-alpha@sourceware.org, palmer@rivosinc.com, slewis@rivosinc.com, vineetg@rivosinc.com, Florian Weimer
References: <20230706192947.1566767-1-evan@rivosinc.com> <20230706192947.1566767-4-evan@rivosinc.com>

On Fri, Jul 7, 2023 at 8:25 AM Jeff Law wrote:
>
>
>
> On 7/7/23 03:22, Richard Henderson via Libc-alpha wrote:
> > On 7/6/23 20:29, Evan Green wrote:
> >> +    /* Copy the last few individual bytes */
> >> +    add     a3, a1, a2
> >> +5:
> >> +    lb      a4, 0(a1)
> >> +    addi    a1, a1, 1
> >> +    sb      a4, 0(t6)
> >> +    addi    t6, t6, 1
> >> +    bltu    a1, a3, 5b
> >> +6:
> >> +    ret
> >
> > The only time you should be copying individual bytes is when the copy is
> > smaller than SZREG.  Otherwise the tail can be handled like
> >
> >     add     srcend, a1, a2
> >     add     dstend, a0, a2
> >     REG_L   tmp, -SZREG(srcend)
> >     REG_S   tmp, -SZREG(dstend)
> >
> > There are other tricks that can be used to reduce the number of branches
> > -- please examine the x86 code.  See e.g. the copy_0_15 block in
> > sysdeps/x86_64/multiarch/memmove-ssse3.S.
> The bits we've got here from VRULL use this trick.
>
> Evan, I'm happy to pass those bits along if you want to take a look.
>
> I have no strong opinions if this should be fixed before integration or
> as a follow-up.

This is the vrull patch, right?
https://patchwork.sourceware.org/project/glibc/patch/20230207001618.458947-13-christoph.muellner@vrull.eu/

Sure, I can add the overlapping word access as suggested by Richard,
it's a good idea. My preference is a followup patch, but I am ok
either way. I should be able to get it sent next week.

-Evan
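
For reference, the overlapping word access Richard describes can be sketched
in C. This is only an illustration of the technique, not the patch's assembly
and not the VRULL code. The function name copy_overlapping_tail is
hypothetical, sizeof(uintptr_t) is assumed to stand in for SZREG, and the
fixed-size memcpy calls model word loads and stores that may be misaligned.
It also assumes the source and destination buffers do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): forward copy of n bytes that
   finishes with one overlapping word access instead of a byte loop.
   Assumes dst and src do not overlap. */
void *copy_overlapping_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t W = sizeof(uintptr_t);   /* assumed equivalent of SZREG */
    uintptr_t tmp;
    size_t i;

    assert(d != NULL && s != NULL);

    if (n < W) {
        /* Only copies shorter than one word go byte by byte. */
        for (i = 0; i < n; i++)
            d[i] = s[i];
        return dst;
    }

    /* Copy as many whole words as fit, starting from the front. */
    for (i = 0; i + W <= n; i += W) {
        memcpy(&tmp, s + i, W);   /* models REG_L */
        memcpy(d + i, &tmp, W);   /* models REG_S */
    }

    /* Tail: one word that ends exactly at the end of the buffer.  It may
       re-copy bytes the loop already wrote, which is harmless because the
       same values are stored again. */
    memcpy(&tmp, s + n - W, W);   /* REG_L tmp, -SZREG(srcend) */
    memcpy(d + n - W, &tmp, W);   /* REG_S tmp, -SZREG(dstend) */
    return dst;
}
```

When n is an exact multiple of the word size the final access simply repeats
the last word, so no extra branch is needed to skip it.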