Date: Fri, 7 Jul 2023 10:22:20 +0100
Subject: Re: [PATCH v4 3/3] riscv: Add and use alignment-ignorant memcpy
To: Evan Green, libc-alpha@sourceware.org
Cc: palmer@rivosinc.com, slewis@rivosinc.com, vineetg@rivosinc.com, Florian Weimer
References: <20230706192947.1566767-1-evan@rivosinc.com> <20230706192947.1566767-4-evan@rivosinc.com>
From: Richard Henderson
In-Reply-To: <20230706192947.1566767-4-evan@rivosinc.com>

On 7/6/23 20:29, Evan Green wrote:
> +    /* Copy the last few individual bytes */
> +    add a3, a1, a2
> +5:
> +    lb a4, 0(a1)
> +    addi a1, a1, 1
> +    sb a4, 0(t6)
> +    addi t6, t6, 1
> +    bltu a1, a3, 5b
> +6:
> +    ret

The only time you should be copying individual bytes is when the copy is
smaller than SZREG.
Otherwise the tail can be handled like

    add     srcend, a1, a2
    add     dstend, a0, a2
    REG_L   tmp, -SZREG(srcend)
    REG_S   tmp, -SZREG(dstend)

There are other tricks that can be used to reduce the number of branches --
please examine the x86 code.  See e.g. the copy_0_15 block in
sysdeps/x86_64/multiarch/memmove-ssse3.S.

r~
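
For illustration, here is a rough C sketch of the overlapping-tail idea suggested above. It is not code from the patch or from glibc. The function name copy_overlapping_tail and the WORD constant are hypothetical, WORD stands in for SZREG under the assumption of 64-bit registers, and the fixed-size memcpy calls model a single register load or store that tolerates misalignment. Buffers are assumed not to overlap each other.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: WORD plays the role of SZREG on a 64-bit target. */
enum { WORD = sizeof(uint64_t) };

/* Hypothetical helper: forward copy of n bytes between non-overlapping
   buffers, finishing with one word-sized access instead of a byte loop. */
static void *copy_overlapping_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uint64_t tmp;
    size_t i;

    if (n < WORD) {
        /* The only case that copies individual bytes: fewer than WORD. */
        for (i = 0; i < n; i++)
            d[i] = s[i];
        return dst;
    }

    /* Copy whole words from the front. */
    for (i = 0; i + WORD <= n; i += WORD) {
        memcpy(&tmp, s + i, WORD);   /* models REG_L */
        memcpy(d + i, &tmp, WORD);   /* models REG_S */
    }

    /* Tail: one word that ends exactly at srcend/dstend.  It may rewrite
       bytes that the loop already copied, which is harmless because the
       same values are stored again. */
    memcpy(&tmp, s + n - WORD, WORD);
    memcpy(d + n - WORD, &tmp, WORD);
    return dst;
}
```

The final load and store are unconditional once n >= WORD, so the tail costs no extra branch. When n is an exact multiple of WORD they simply repeat the last word.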