From: Adhemerval Zanella Netto
Organization: Linaro
To: Jeff Law, libc-alpha@sourceware.org
Date: Fri, 27 Jan 2023 13:52:43 -0300
Subject: Re: [PATCH v10 00/24] Improve generic string routines
References: <20230120211622.3445279-1-adhemerval.zanella@linaro.org>

On 20/01/23 20:00, Jeff Law via Libc-alpha wrote:
>
>
> On 1/20/23 14:15, Adhemerval Zanella via Libc-alpha wrote:
>> It is done by:
>>
>>    1. Parametrizing the internal routines (for instance the find zero
>>       in a word) so each architecture can reimplement without the need
>>       to reimplement the whole routine.
>>
>>    2. Vectorizing more string implementations (for instance strcpy
>>       and strcmp).
>>
>>    3. Change some implementations to use already possible optimized
>>       ones (strnlen and strchr).  It makes new ports to focus on
>>       only provide optimized implementation of a handful of symbols
>>       (for instance memchr) and make its improvement to be used in
>>       a larger set of routines.
>>
>> I checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
>> and powerpc64-linux-gnu by removing the arch-specific assembly
>> implementation and disabling multiarch (it covers both LE and BE
>> for 64 and 32 bits). I also checked the string routines on alpha, hppa,
>> and sh.
> [ ... ]
> You guys are making work for me! :)
>
> We've got hand-written RISC-V implementations of various str* and mem*
> routines that I nearly asked to get merged for glibc-2.37.  I held off
> thinking that there wasn't much benefit to including it in 2.37 vs 2.38.
> Bad call on my part!
>

Does this patchset work for the RISC-V routines you are aiming to
optimize? Is there any strategy you think would be worth adding to the
generic framework?

Hand-optimized routines might squeeze out some more cycles in some
scenarios, but the default framework should at least give better
performance.

I see that most of the arch-specific optimizations add alignment
considerations (especially if the architecture provides fast unaligned
access) and loop unrolling for the main loop. The former would require a
different implementation, but for the latter I think we can add per-arch
compiler flags to force unrolling (as we do for powerpc memmove/wordcopy).
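
To make the "find zero in a word" parametrization from the cover letter concrete, here is a rough sketch of the idea. All names (word_t, word_has_zero, sketch_strlen) are made up for illustration and are not the identifiers used in the patch set. The point is only that the generic loop calls a small helper, so a port with a dedicated instruction could override the helper alone. Like other word-at-a-time code, it may read a few bytes past the terminator within the same aligned word, which is safe at the page level but outside strict ISO C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; not the identifiers used by the patch
   set.  An architecture could replace only word_has_zero.  */
typedef uint64_t word_t;

#define ONES  ((word_t) 0x0101010101010101ULL)
#define HIGHS ((word_t) 0x8080808080808080ULL)

/* Nonzero if any byte of X is zero (classic bit trick).  */
static inline int
word_has_zero (word_t x)
{
  return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Generic word-at-a-time strlen built on the helper above.  */
size_t
sketch_strlen (const char *s)
{
  const char *p = s;

  /* Byte loop until P is word-aligned, so that an aligned word load
     never crosses into a page the string does not touch.  */
  while (((uintptr_t) p % sizeof (word_t)) != 0)
    {
      if (*p == '\0')
        return (size_t) (p - s);
      p++;
    }

  /* Main loop: one aligned word per iteration.  */
  for (;;)
    {
      word_t w;
      memcpy (&w, p, sizeof w);
      if (word_has_zero (w))
        break;
      p += sizeof (word_t);
    }

  /* Locate the terminating byte inside the final word.  */
  while (*p != '\0')
    p++;
  return (size_t) (p - s);
}
```

With this split, the main loop is also the natural place for the unrolling mentioned above, whether done by hand or through compiler flags.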