From: Andrew Waterman
Date: Thu, 2 Feb 2023 16:51:35 -0800
Subject: Re: [PATCH 2/2] riscv: vectorised mem* and str* functions
To: Vineet Gupta
Cc: Sergei Lewis, libc-alpha@sourceware.org
References: <20230201095232.15942-1-slewis@rivosinc.com> <20230201095232.15942-2-slewis@rivosinc.com>

On Thu, Feb 2, 2023 at 4:13 PM Vineet Gupta wrote:
>
> On 2/1/23 11:03, Andrew Waterman wrote:
> >> +#ifndef __riscv_strict_align
> > strict-align is not the right thing to check here.
> > As the RVA profile document explains, all RVA-compliant
> > implementations must support misaligned loads and stores (so
> > strict-align will be false), but they might execute extremely slowly
> > (e.g., via trap and emulate), and so this approach will unduly
> > penalize some implementations.
>
> FWIW, the proposed __riscv_strict_align if generated can have 2 possible
> values:
>   - 1 (explicit -mstrict-align used in build)
>   - 2 cpu tune param indicated unaligned access is slow (like with
>     trap-n-emulate)

Yeah, those semantics make sense. It makes the "strict" name a little
misleading, though: "strict" suggests to me that misaligned accesses
are outright illegal. So, it might be better to pick another name,
e.g. __riscv_avoid_misaligned.

>
> So in theory code can still be written to cater to that.

It had better cater to it in practice as well as in theory. Standard
binary distributions need to be suitably generic, and they need to
heed the guidance in the RVA profile spec. So this is OK as long as
the default continues to be to avoid misaligned accesses. The fact
that GCC's default -mtune setting is to mark them as slow means this
is probably OK.

This may be Yet Another IFUNC Case: use misaligned accesses only if
known at runtime that they are fast, or when there's a routine
available that's optimized for a specific microarchitecture.
(Microarchitecture-specific IFUNCs would probably be more appropriate
for these routines, anyway, since they're pretty clearly tuned for a
particular machine. For example, the trivial 7-instruction memcpy
loop recommended in the specification will perform well across a
broader range of vector machines.)

>
> -Vineet
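
To make the "only if known at runtime that they are fast" idea
concrete, here is a rough sketch in portable C. It is not the patch
under review and not glibc's IFUNC machinery: a plain function pointer
stands in for the IFUNC resolver, and every name in it (copy_bytewise,
copy_wordwise, misaligned_access_known_fast, select_copy,
check_copies_agree) is hypothetical. The probe is hard-wired to the
conservative answer, on the assumption that no reliable runtime signal
is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Conservative routine: only ever issues byte-sized accesses, so it
   cannot hit a slow (e.g. trap-and-emulate) misaligned access. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Routine that copies a word at a time without aligning its pointers.
   The fixed-size memcpy calls keep this well-defined C; a compiler may
   lower them to plain word loads/stores where it deems that acceptable. */
static void *copy_wordwise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n >= sizeof(uintptr_t)) {
        uintptr_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical probe, an assumption of this sketch: returns nonzero only
   when misaligned accesses are *known* to be fast.  Hard-wired to 0 so
   the default remains "avoid misaligned accesses". */
static int misaligned_access_known_fast(void)
{
    return 0;
}

/* Stand-in for an IFUNC resolver: pick the implementation once. */
static copy_fn select_copy(int known_fast)
{
    return known_fast ? copy_wordwise : copy_bytewise;
}

/* The two routines must be interchangeable; compare them on a
   deliberately misaligned source/destination pair. */
static void check_copies_agree(void)
{
    unsigned char src[64], a[64], b[64];
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 7 + 1);
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    copy_bytewise(a + 1, src + 3, 37);
    copy_wordwise(b + 1, src + 3, 37);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

A caller would do something like
`copy_fn f = select_copy(misaligned_access_known_fast());` once at
startup and then call `f(dst, src, n)`. The point of the shape is that
a generic binary gets the safe routine unless something positively
reports that misaligned accesses are fast.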