Message-ID: <343d75ee-69f9-94b0-aa35-d5cd645ba0d3@gmail.com>
Date: Fri, 31 Mar 2023 12:07:02 -0600
Subject: Re: [PATCH v2 0/3] RISC-V: ifunced memcpy using new kernel hwprobe interface
To: Adhemerval Zanella Netto, Palmer Dabbelt
Cc: Evan Green, libc-alpha@sourceware.org, slewis@rivosinc.com, Vineet Gupta
References: <9290df69-8946-3b67-63ea-3d386a3c30a6@gmail.com> <0fe9357f-b960-065c-a3a7-8abfbbd5017c@linaro.org>
From: Jeff Law
In-Reply-To: <0fe9357f-b960-065c-a3a7-8abfbbd5017c@linaro.org>

On 3/30/23 13:38, Adhemerval Zanella Netto wrote:
>
> On 30/03/23 03:20, Jeff Law wrote:
>>
>> On 3/29/23 13:45, Palmer Dabbelt wrote:
>>
>>> It's not in for-next yet, but various patch sets / proposals have been on the lists
>>> for a few months and it seems like discussion on the kernel side has
>>> pretty much died down.  That's why I was pinging the glibc side of
>>> things, if anyone here has comments on the interface then it's time
>>> to chime in.  If there's no comments then we're likely to end up with
>>> this in the next release (so queue into for-next soon, Linus' master
>>> in a month or so).
>> Right.  And I've suggested that we at least try to settle on the
>> various mem* and str* implementations independently of the
>> kernel->glibc interface question.
>>
>> I don't much care how we break down the problem of selecting
>> implementations, just that we get started.  That can and probably
>> should be happening in parallel with the kernel->glibc API work.
>>
>> I've got some performance testing to do in this space (primarily of
>> the VRULL implementations).  It's just going to take a long time to
>> get the data.  And that implementation probably needs some revamping
>> after all the work on the mem* and str* infrastructure that landed
>> earlier this year.
>>
>
> I don't think glibc is the right place for a code dump, especially for
> implementations that do not have representative performance numbers on
> real hardware and might require further tuning.  It can be even
> trickier if you require a different build config for testing, as we
> used to have for some ABIs (for instance on powerpc with --with-cpu);
> at least for ifunc we have some mechanism to test multiple variants,
> assuming the chips at least support them (which should be the case for
> unaligned).

It's not meant to be a "code dump".  It's "these are the recommended
implementations and we're just waiting for the final ifunc wiring to use
them automatically."

But I understand your point.  Even just agreeing on the implementations,
without committing until the ifunc interface is settled, would be a
major step forward.

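To make "waiting for the final ifunc wiring" a bit more concrete, here is
a rough sketch of the selection step only.  It is an illustration, not a
proposed resolver: the CAP_* bits, the rv_strlen_* stand-ins and
select_strlen are all names made up for this example, and it deliberately
uses neither the real IFUNC machinery nor the hwprobe interface, since
that interface is still under discussion.  The capability mask is assumed
to come from whatever the kernel ends up reporting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits; a real resolver would derive these from
   whatever the kernel->glibc interface ends up reporting.  */
#define CAP_ZBB    (1u << 0)
#define CAP_VECTOR (1u << 1)

typedef size_t (*strlen_fn) (const char *);

/* Stand-in for the agreed baseline implementation.  */
static size_t
rv_strlen_baseline (const char *s)
{
  const char *p = s;
  while (*p != '\0')
    p++;
  return (size_t) (p - s);
}

/* Stand-ins for the zbb and vector variants.  They reuse the baseline
   body so the sketch runs anywhere; only their identity matters here.  */
static size_t
rv_strlen_zbb (const char *s)
{
  return rv_strlen_baseline (s);
}

static size_t
rv_strlen_vector (const char *s)
{
  return rv_strlen_baseline (s);
}

struct variant
{
  unsigned int required;   /* Capability bits this variant needs.  */
  strlen_fn fn;
};

/* Ordered by preference; the last entry requires nothing.  */
static const struct variant variants[] = {
  { CAP_VECTOR, rv_strlen_vector },
  { CAP_ZBB,    rv_strlen_zbb },
  { 0,          rv_strlen_baseline },
};

/* Return the first variant whose requirements are all present in CAPS.  */
static strlen_fn
select_strlen (unsigned int caps)
{
  size_t n = sizeof variants / sizeof variants[0];
  for (size_t i = 0; i < n; i++)
    if ((caps & variants[i].required) == variants[i].required)
      return variants[i].fn;
  /* Unreachable as long as the table ends with a baseline entry.  */
  assert (!"variant table lacks a baseline entry");
  return rv_strlen_baseline;
}
```

With this table, select_strlen (CAP_ZBB | CAP_VECTOR) picks the vector
stand-in and select_strlen (0) falls back to the baseline.  The point is
just that once the implementations are agreed, the wiring is a small
preference table keyed on capability bits.
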
My larger point is that we need to work through the str* and mem*
implementations and settle on them, and that can happen independently of
the interface discussion with the kernel team.

If we've settled on specific implementations, why not go ahead and put
them into the repo, with the expectation that we can trivially wire them
into the ifunc resolver once the ABI interface is sorted out?

>
> So for experimental routines, where you expect frequent tuning once you
> have tested and benchmarked on different chips, an external project
> might be a better idea; sync with glibc once the routines are tested
> and validated.  And these RISC-V routines do seem to be still very
> experimental, where performance numbers are still synthetic ones from
> emulators.

I think we're actually a lot closer than you might think :-)

My goal would be that we're not doing frequent tuning and that we avoid
uarch-specific versions if we at all can.  There's a reasonable chance we
can do that if we have good baseline, zbb and vector versions.  I'm not
including cboz memory clear right now -- there's already evidence that
uarch considerations around cboz may be significant.

>
> Another possibility might be to improve the generic implementation, as
> we have done recently, where RISC-V bitmanip was a matter of adding just
> 2 files and 4 functions to optimize multiple string functions [2].  I
> have some WIP patches to add support for unaligned memcpy/memmove with
> a very simple strategy.

As I noted elsewhere, I was on the fence about pushing for improvements
to the generic strcmp bits, but could easily be swayed to that position.

jeff