public inbox for libc-alpha@sourceware.org
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Subject: Re: [PATCH 1/2] benchtests: Add wcrtomb microbenchmark
Date: Fri, 6 May 2022 09:50:06 -0300
Message-ID: <e4dfe6d4-3db3-73c7-2a91-037c379643bf@linaro.org>
In-Reply-To: <20220505184348.3357550-2-siddhesh@sourceware.org>



On 05/05/2022 15:43, Siddhesh Poyarekar via Libc-alpha wrote:
> Add a simple benchmark that measures wcrtomb performance with various
> locales with 1-4 byte characters.
> 
> Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
> ---
>  benchtests/Makefile        |   1 +
>  benchtests/bench-wcrtomb.c | 140 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 141 insertions(+)
>  create mode 100644 benchtests/bench-wcrtomb.c
> 
> diff --git a/benchtests/Makefile b/benchtests/Makefile
> index 149d87e22e..de9de5cf58 100644
> --- a/benchtests/Makefile
> +++ b/benchtests/Makefile
> @@ -171,6 +171,7 @@ ifeq (no,$(cross-compiling))
>  wcsmbs-benchset := \
>    wcpcpy \
>    wcpncpy \
> +  wcrtomb \
>    wcscat \
>    wcschr \
>    wcschrnul \
> diff --git a/benchtests/bench-wcrtomb.c b/benchtests/bench-wcrtomb.c
> new file mode 100644
> index 0000000000..6cef69cdbf
> --- /dev/null
> +++ b/benchtests/bench-wcrtomb.c
> @@ -0,0 +1,140 @@
> +/* Measure wcrtomb function.
> +   Copyright The GNU Toolchain Authors.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <limits.h>
> +#include <locale.h>
> +#include <string.h>
> +#include <wchar.h>
> +
> +#include "bench-timing.h"
> +#include "json-lib.h"
> +
> +#define NITERS 100000
> +
> +struct test_inputs
> +{
> +  const char *locale;
> +  const wchar_t *input_chars;
> +};
> +
> +/* The inputs represent different types of characters, e.g. RTL, 1 byte, 2
> +   byte, 3 byte and 4 byte chars.  The exact number of inputs per locale
> +   doesn't really matter because we're not looking to compare performance
> +   between locales.  */
> +struct test_inputs inputs[] =
> +{
> +  /* RTL.  */
> +  {"ar_SA.UTF-8",
> +   L",-.،؟ـًُّ٠٢٣٤ءآأؤإئابةتثجحخدذرزسشصضطظعغفقكلمنهوىي"},
> +
> +  /* Various mixes of 1 and 2 byte chars.  */
> +  {"cs_CZ.UTF-8",
> +   L",.aAábcCčdDďeEéÉěĚfFghHiIíJlLmMnNňŇoóÓpPqQrřsSšŠTťuUúÚůŮvVWxyýz"},
> +
> +  {"el_GR.UTF-8",
> +   L",.αΑβγδΔεΕζηΗθΘιΙκΚλμΜνΝξοΟπΠρΡσΣςτυΥφΦχψω"},
> +
> +  {"en_GB.UTF-8",
> +   L",.aAāĀæÆǽǣǢbcCċdDðÐeEēĒfFgGġhHiIīĪlLmMnNoōpPqQrsSTuUūŪvVwxyȝzþÞƿǷ"},
> +
> +  {"fr_FR.UTF-8",
> +   L",.aAàâbcCçdDeEéèêëfFghHiIîïjlLmMnNoOôœpPqQrRsSTuUùûvVwxyz"},
> +
> +  {"he_IL.UTF-8",
> +   L"',.ִאבגדהוזחטיכךלמםנןסעפףצץקרשת"},
> +
> +  /* Devanagari, Japanese, 3-byte chars.  */
> +  {"hi_IN.UTF-8",
> +   L"(।ं०४५७अआइईउऎएओऔकखगघचछजञटडढणतथदधनपफ़बभमयरलवशषसहािीुूृेैोौ्"},
> +
> +  {"ja_JP.UTF-8",
> +   L".ー0123456789あアいイうウえエおオかカがきキぎくクぐけケげこコごさサざ"},
> +
> +  /* More mixtures of 1 and 2 byte chars.  */
> +  {"ru_RU.UTF-8",
> +   L",.аАбвВгдДеЕёЁжЖзЗийЙкКлЛмМнНоОпПрстТуУфФхХЦчшШщъыЫьэЭюЮя"},
> +
> +  {"sr_RS.UTF-8",
> +   L",.aAbcCćčdDđĐeEfgGhHiIlLmMnNoOpPqQrsSšŠTuUvVxyzZž"},
> +
> +  {"sv_SE.UTF-8",
> +   L",.aAåÅäÄæÆbBcCdDeEfFghHiIjlLmMnNoOöÖpPqQrsSTuUvVwxyz"},
> +
> +  /* Chinese, 3-byte chars  */
> +  {"zh_CN.UTF-8",
> +   L"一七三下不与世両並中串主乱予事二五亡京人今仕付以任企伎会伸住佐体作使"},
> +
> +  /* 4-byte chars, because smileys are the universal language and we want to
> +     ensure optimal performance with them 😊.  */
> +  {"en_US.UTF-8",
> +   L"😀😁😂😃😄😅😆😇😈😉😊😋😌😍😎😏😐😑😒😓😔😕😖😗😘😙😚😛😜😝😞😟😠😡"}
> +};

Could you use hexadecimal character escapes in the tests? Although gcc handles
multiple -fexec-charset values, building with a different compiler usually emits
a lot of warnings.

> +
> +char buf[MB_LEN_MAX];
> +size_t ret;
> +
> +int
> +main (int argc, char **argv)
> +{
> +  const size_t inputs_len = sizeof (inputs) / sizeof (struct test_inputs);
> +
> +  json_ctx_t json_ctx;
> +  json_init (&json_ctx, 0, stdout);
> +  json_document_begin (&json_ctx);
> +
> +  json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
> +  json_attr_object_begin (&json_ctx, "functions");
> +  json_attr_object_begin (&json_ctx, "wcrtomb");
> +
> +  for (size_t i = 0; i < inputs_len; i++)
> +    {
> +      json_attr_object_begin (&json_ctx, inputs[i].locale);
> +      setlocale (LC_ALL, inputs[i].locale);
> +
> +      timing_t min = 0x7fffffffffffffff, max = 0, total = 0;
> +      const wchar_t *inp = inputs[i].input_chars;
> +      const size_t len = wcslen (inp);
> +      mbstate_t s;
> +
> +      memset (&s, '\0', sizeof (s));
> +
> +      for (size_t n = 0; n < NITERS; n++)
> +	{
> +	  timing_t start, end, elapsed;
> +
> +	  TIMING_NOW (start);
> +	  for (size_t j = 0; j < len; j++)
> +	    ret = wcrtomb (buf, inp[j], &s);
> +	  TIMING_NOW (end);
> +	  TIMING_DIFF (elapsed, start, end);
> +	  if (min > elapsed)
> +	    min = elapsed;
> +	  if (max < elapsed)
> +	    max = elapsed;
> +	  TIMING_ACCUM (total, elapsed);
> +	}
> +      json_attr_double (&json_ctx, "max", max);
> +      json_attr_double (&json_ctx, "min", min);
> +      json_attr_double (&json_ctx, "mean", total / NITERS);
> +      json_attr_object_end (&json_ctx);
> +    }
> +
> +  json_attr_object_end (&json_ctx);
> +  json_attr_object_end (&json_ctx);
> +  json_document_end (&json_ctx);
> +}
