public inbox for libc-alpha@sourceware.org
From: Carlos O'Donell <carlos@redhat.com>
To: Ilyahoo Proshel <ip@ipshel.com>, libc-alpha@sourceware.org
Subject: Re: [PATCH] glibc [BZ #27781]
Date: Mon, 24 Jan 2022 16:59:28 -0500
Message-ID: <7d45acad-0065-461d-93b7-3d4e68ed39c2@redhat.com>
In-Reply-To: <20220124180749.1592069-1-ip@ipshel.com>

On 1/24/22 13:07, Ilyahoo Proshel wrote:
> Test case for rif_MA [BZ #27781]

Thanks for the submission!

Please adjust the subject to reflect the contribution.
I would suggest "Add rif_MA locale [BZ #27781]"

The CI/CD framework shows this patch does not pass regression testing:
https://patchwork.sourceware.org/project/glibc/patch/20220124180749.1592069-1-ip@ipshel.com/

Specifically, it looks like the sorting tests fail with the inputs
you've provided.

Did you make sure to run 'make check' to look for errors?

The failures are here:
rif_MA.UTF-8 collate-test FAIL
rif_MA.UTF-8 xfrm-test FAIL

You will need to review these and determine why the characters don't
sort in the order that your test data expects.
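
As a rough way to narrow this down outside the test harness, the sketch
below walks a list of lines and reports the first adjacent pair that a
given comparison function considers out of order. It is written in Python
for brevity and is not part of glibc. The names first_misordered and
codepoint_compare are hypothetical, and checking adjacent pairs is an
assumption made for illustration, not a description of how collate-test
or xfrm-test work internally.

```python
import locale


def first_misordered(lines, compare=locale.strcoll):
    """Return (index, left, right) for the first adjacent pair where
    `left` collates after `right`, or None if the list is in order.

    `compare` defaults to locale.strcoll, which follows whatever
    LC_COLLATE setting is active in the process.
    """
    for i in range(len(lines) - 1):
        if compare(lines[i], lines[i + 1]) > 0:
            return i, lines[i], lines[i + 1]
    return None


def codepoint_compare(a, b):
    # Stand-in comparator (plain code point order) so the demo below
    # does not depend on any particular locale being installed.
    return (a > b) - (a < b)


if __name__ == "__main__":
    # First three lines of the proposed rif_MA.UTF-8.in test input.
    sample = ["a", "A", "aa"]
    # Under code point order "A" (U+0041) sorts before "a" (U+0061),
    # so the first pair is reported as misordered.
    print(first_misordered(sample, codepoint_compare))  # (0, 'a', 'A')
```

To check against the real collation rules, one could call
locale.setlocale(locale.LC_COLLATE, "rif_MA.UTF-8") first and pass the
lines of localedata/rif_MA.UTF-8.in with the default comparator. That
call raises locale.Error unless the locale has been built and installed.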

> ---
>  .../strcoll-inputs/filelist#en_US.UTF-8       |   1 +
>  .../strcoll-inputs/lorem_ipsum#rif_MA.UTF-8   |  15 ++
>  locale/iso-639.def                            |   1 +
>  localedata/Makefile                           |   2 +
>  localedata/SUPPORTED                          |   2 +
>  localedata/locales/rif_MA                     | 161 ++++++++++++++++++
>  localedata/rif_MA.UTF-8.in                    |  55 ++++++
>  7 files changed, 237 insertions(+)
>  create mode 100644 benchtests/strcoll-inputs/lorem_ipsum#rif_MA.UTF-8
>  create mode 100644 localedata/locales/rif_MA
>  create mode 100644 localedata/rif_MA.UTF-8.in
> 
> diff --git a/benchtests/strcoll-inputs/filelist#en_US.UTF-8 b/benchtests/strcoll-inputs/filelist#en_US.UTF-8
> index 197700ec90..ab9e2fc3b1 100644
> --- a/benchtests/strcoll-inputs/filelist#en_US.UTF-8
> +++ b/benchtests/strcoll-inputs/filelist#en_US.UTF-8
> @@ -11478,6 +11478,7 @@ en_ZM
>  de_DE@euro
>  fr_LU@euro
>  sw_KE
> +rif_MA
>  id_ID
>  is_IS
>  gez_ER@abegede
> diff --git a/benchtests/strcoll-inputs/lorem_ipsum#rif_MA.UTF-8 b/benchtests/strcoll-inputs/lorem_ipsum#rif_MA.UTF-8
> new file mode 100644
> index 0000000000..66f1e66c01
> --- /dev/null
> +++ b/benchtests/strcoll-inputs/lorem_ipsum#rif_MA.UTF-8
> @@ -0,0 +1,15 @@
> +Yuffu-d Arrif iḍennaḍ x uneɣmis n tmettant n Prof. Dr. Hassan Benɛaqiya, Lexbar-a yejja ij n uxeyyeq d ameqqran deg imeddukal d imeḥḍaren-nnes.
> +
> +Benɛaqiya i yellan d aselmad n tsekla Tafransist di tesdawit n Selwan, mamec i t-tuɣa zeg yewdan i yessersen ddsas i tebridt n tɣuriwin Timaziɣin di tesdawit n Wejda aseggas n 2007. Aked uyenni, issers ddsas i tebridt n tɣuriwin Timaziɣin di tesdawit n Selwan deg iseggusa imeggura. Ɛawed, ibedd ad yili x yixef n umaster n tutlayt Tamaziɣt di tɣiwant-a s ixef-nnes.
> +
> +Prof. Dr. Hasan Benɛaqiya, yettwaḥsab zeg ipilaren n Tmaziɣt di Arif, la s trezzutin-nnes, la s tira-nnes. Amek yettwassen deg iger n tussna Tamaziɣt s ufus-nnes yessiɣen lebda ad iɛawen imeḥḍaren-nnes, d qaε wi icerken amnus.
> +
> +Deg ussan yeɛdun yemser waṭṭas n wawal jer Irifiyyen belli Aliman, ixes ad yerẓem i ixeddamen zi barra. Niɣ mamek qqarn s wawal n leqhawi: Aliman ad yerẓem i lekwaɣeḍ. Deg umegraḍ anekkes afray x uneɣmis‑a (Lexbar), a nessen tidet, deg ixarriqen.
> +
> +Pulitik n wenɛarq deg uliman tella zeg wami yella Uliman, deg ineggura‑ya n iseggusa, tedwel d ijj n ṭṭema labud a x‑as yili wawal melmi reccḥen i lintixab. Igrawen marra tteggen‑tt deg uprugram‑nsen, Jer wenni yexsen imenneɛraq, d wenni yexsen ad yazzel x yinni da, marra qat ɣer‑sen d ij n lkartet ttiraren zeg‑sen.
> +
> +Lḥukuma n useggwasa, tettwassen s “Ampel” , i yexsen ad yini sṭup n yebriden, izegzawen n twennaḍt (Envirement), Spd (Agraw asuṣyal, azeggwaɣ) FDP (ilibiraliyyen, awraɣ), Igrawen‑a msafaqen x waṭṭas n tmeslayin, munen mani wer yettmun ḥed, minzi lxezrat‑nsen temsebḍa deg waṭṭas n tmeslayin. Zi min xef firmarn, d lqanun amaynu n wenɛarq.Kunṭra‑ya, tiwi‑asen‑dd aṭṭas n ukritik zeg igrawen‑nneɣni, d waṭṭasn Ilimaniyyen,
> +Aṭṭas n imenneɛraq deg iseggusa yeɛdun, inni d irewlen zeg iɣewwiɣen d umquṭṭes, niɣ iḍriben, d yenni wer ɣer llin lekwaɣeḍ. Ssa d usawen ad x‑asen yehwen lḥal. Aṭaf zemmern ad ggen lekwaɣeḍ, wer ttɣimin mkul twalat tticen‑asen sett‑chur niɣ d ɛam n visa waha. Min isseqsiḥen afsay deg wamun aliman, xsen ad teksen. Iwdan‑a ad yedwel ɣer‑sen lḥeq ad xedmen. A ten‑ɛayaren s lekwaɣeḍ umi semman “lekwaɣeḍ n ujarreb”. Di lweqt nni di ten‑ɣa‑jarrben, ixessa ad ḍebbren x uzellif‑nsen, ad afen lxedmet.
> +
> +Kritik: Agraw n merkel (CDU), ittwala wa d acejjeɛ x ukular ɣer Uliman, wer yettɣimi manaya deg yenni da waha, maca ad mmernin inneɣni.
> +Tamunt n yegrawen iḥekkmen aliman, xsen ad ggen ij n usistim d jjdid i ixeddamen yexsen ad adfen zi barra ad xedmen zeg Uliman. Xsen ad ggen ij n usistim am Green Card n Kanada. Asistim yebna x tneqqiḍin, Maca deg Uliman ɛad wer nnin man tneqqiḍin (ccuruṭ) i xef ɣa yebna usistim‑a. Zeg ij n uɣeẓḍis nneɣni, xsen ad shewnen asemquddi n ddiplumat n barra deg Uliman.
> diff --git a/locale/iso-639.def b/locale/iso-639.def
> index 926aebada0..3e07eca967 100644
> --- a/locale/iso-639.def
> +++ b/locale/iso-639.def
> @@ -399,6 +399,7 @@ DEFINE_LANGUAGE_CODE3 ("Quechua, Southern", quz, quz)
>  DEFINE_LANGUAGE_CODE ("Raeto-Romance", rm, roh, roh)
>  DEFINE_LANGUAGE_CODE3 ("Rajasthani", raj, raj)
>  DEFINE_LANGUAGE_CODE3 ("Rapanui", rap, rap)
> +DEFINE_LANGUAGE_CODE3 ("Tarifit", rif, rif)
>  DEFINE_LANGUAGE_CODE3 ("Rarotongan", rar, rar)
>  DEFINE_LANGUAGE_CODE3 ("Romance (Other)", roa, roa)
>  DEFINE_LANGUAGE_CODE ("Romanian", ro, ron, rum)
> diff --git a/localedata/Makefile b/localedata/Makefile
> index 79db713925..d601626ffb 100644
> --- a/localedata/Makefile
> +++ b/localedata/Makefile
> @@ -98,6 +98,7 @@ test-input := \
>  	pl_PL.UTF-8 \
>  	ps_AF.UTF-8 \
>  	ro_RO.UTF-8 \
> +	rif_MA.UTF-8 \
>  	ru_RU.UTF-8 \
>  	sah_RU.UTF-8 \
>  	sc_IT.UTF-8 \
> @@ -268,6 +269,7 @@ LOCALES := \
>  	pl_PL.UTF-8 \
>  	ps_AF.UTF-8 \
>  	ro_RO.UTF-8 \
> +	rif_MA.UTF-8 \
>  	ru_RU.UTF-8 \
>  	sah_RU.UTF-8 \
>  	sc_IT.UTF-8 \
> diff --git a/localedata/SUPPORTED b/localedata/SUPPORTED
> index d768aa4795..ca39131596 100644
> --- a/localedata/SUPPORTED
> +++ b/localedata/SUPPORTED
> @@ -378,6 +378,8 @@ pt_PT/ISO-8859-1 \
>  pt_PT@euro/ISO-8859-15 \
>  quz_PE/UTF-8 \
>  raj_IN/UTF-8 \
> +rif_MA/UTF-8 \
> +rif_MA.UTF-8/UTF-8 \
>  ro_RO.UTF-8/UTF-8 \
>  ro_RO/ISO-8859-2 \
>  ru_RU.KOI8-R/KOI8-R \
> diff --git a/localedata/locales/rif_MA b/localedata/locales/rif_MA
> new file mode 100644
> index 0000000000..526b1b87fd
> --- /dev/null
> +++ b/localedata/locales/rif_MA
> @@ -0,0 +1,161 @@
> +comment_char %
> +escape_char /
> +
> +% This file is part of the GNU C Library and contains locale data.
> +% The Free Software Foundation does not claim any copyright interest
> +% in the locale data contained in this file.  The foregoing does not
> +% affect the license of the GNU C Library as a whole.  It does not
> +% exempt you from the conditions of the license if your use would
> +% otherwise be governed by that license.
> +
> +% Tarifit language locale for Morocco
> +% Contact: Ilyahoo Proshel
> +% Email: ip@ipshel.com
> +
> +
> +LC_IDENTIFICATION
> +title      "Tarifit locale for Morocco"
> +source     ""
> +contact    "Ilyahoo Proshel"
> +email      "ip@ipshel.com"
> +language   "Tarifit"
> +territory  "Morocco"
> +revision   "1.0"
> +date       "2020-04-26"
> +
> +category "i18n:2012";LC_IDENTIFICATION
> +category "i18n:2012";LC_CTYPE
> +category "i18n:2012";LC_COLLATE
> +category "i18n:2012";LC_TIME
> +category "i18n:2012";LC_NUMERIC
> +category "i18n:2012";LC_MONETARY
> +category "i18n:2012";LC_MESSAGES
> +category "i18n:2012";LC_PAPER
> +category "i18n:2012";LC_NAME
> +category "i18n:2012";LC_ADDRESS
> +category "i18n:2012";LC_TELEPHONE
> +category "i18n:2012";LC_MEASUREMENT
> +END LC_IDENTIFICATION
> +
> +LC_CTYPE
> +copy "i18n"
> +
> +translit_start
> +include "translit_combining";""
> +translit_end
> +END LC_CTYPE
> +
> +LC_COLLATE
> +copy "iso14651_t1"
> +END LC_COLLATE
> +
> +LC_TIME
> +abday       	"<U004C><U1E25><U0065>";/
> +		"<U004C><U0065><U0074>";/
> +		"<U0054><U0074><U006C>";/
> +		"<U004C><U0061><U0072>";/
> +		"<U004C><U0065><U0078>";/
> +		"<U004A><U006A><U0065>";/
> +		"<U0053><U0062><U0074>"
> +day		"<U004C><U1E25><U0065><U0064>";/
> +		"<U004C><U0065><U0074><U006E><U0061><U0079><U0065><U006E>";/
> +		"<U0054><U0074><U006C><U0061><U0074>";/
> +		"<U004C><U0061><U0072><U0062><U0065><U025B>";/
> +		"<U004C><U0065><U0078><U006D><U0069><U0073><U0073>";/
> +		"<U004A><U006A><U0065><U006D><U025B><U0061>";/
> +		"<U0053><U0073><U0065><U0062><U0074>"
> +abmon       	"<U0059><U0065><U006E>";/
> +         	"<U0046><U0065><U0062>";/
> +		"<U004D><U0061><U0072>";/
> +           	"<U0059><U0065><U0062>";/
> +           	"<U004D><U0061><U0079>";/
> +          	"<U0059><U0075><U006E>";/
> +          	"<U0059><U0075><U006C>";/
> +          	"<U0194><U0075><U0063>";/
> +          	"<U0043><U0075><U0074>";/
> +          	"<U004B><U1E6D><U0075>";/
> +          	"<U004E><U0075><U0076>";/
> +          	"<U0044><U0075><U006A>"
> +mon         	"<U0059><U0065><U006E><U006E><U0061><U0079><U0065><U0072>";/
> +            	"<U0046><U0065><U0062><U0072><U0061><U0079><U0065><U0072>";/
> +            	"<U004D><U0061><U0072><U0065><U0073>";/
> +            	"<U0059><U0065><U0062><U0072><U0069><U006C>";/
> +            	"<U004D><U0061><U0079><U0079><U0075>";/
> +            	"<U0059><U0075><U006E><U0079><U0075>";/
> +            	"<U0059><U0075><U006C><U0079><U0075><U007A>";/
> +            	"<U0194><U0075><U0063><U0074>";/
> +            	"<U0043><U0075><U0074><U0065><U006E><U0062><U0065><U0072>";/
> +            	"<U004B><U1E6D><U0075><U0062><U0065><U0072>";/
> +            	"<U004E><U0075><U0076><U0065><U006D><U0062><U0065><U0072>";/
> +            	"<U0044><U0075><U006A><U0065><U006D><U0062><U0065><U0072>"
> +d_t_fmt     	"%a %d %b %Y %T %Z"
> +d_fmt       	"%d//%m//%y"
> +t_fmt       	"%T"
> +am_pm       	"<U0073><U0062>";"<U0061><U025B>"
> +t_fmt_ampm  	""
> +date_fmt    	"%a %e %b %Y %H:%M:%S %Z"
> +week    	7;19971130;4
> +first_weekday 	2
> +END LC_TIME
> +
> +LC_NUMERIC
> +decimal_point 	"."
> +thousands_sep 	""
> +grouping      	3
> +END LC_NUMERIC
> +
> +LC_MONETARY
> +int_curr_symbol		"<U004D><U0041><U0044><U0020>"
> +currency_symbol        	"<U0064><U0068>"
> +mon_decimal_point      	"."
> +mon_thousands_sep      	""
> +mon_grouping           	3;3
> +positive_sign          	""
> +negative_sign          	"-"
> +int_frac_digits        	2
> +frac_digits           	2
> +p_cs_precedes          	1
> +p_sep_by_space         	0
> +n_cs_precedes          	0
> +n_sep_by_space         	0
> +p_sign_posn            	1
> +n_sign_posn            	1
> +END LC_MONETARY
> +
> +LC_MESSAGES
> +yesexpr 	"^[<U002B><U0031><U0079><U0059><U0077><U0057>]"
> +noexpr  	"^[<U002D><U0030><U006E><U004E><U006C><U004C>]"
> +yesstr 		"<U0057><U0061><U0068>"
> +nostr  		"<U004C><U006C><U0061>"
> +END LC_MESSAGES
> +
> +LC_PAPER
> +copy "i18n"
> +END LC_PAPER
> +
> +LC_NAME
> +name_fmt  "%g%t%f"
> +END LC_NAME
> +
> +LC_ADDRESS
> +postal_fmt   "%f%N%a%N%d%N%b%N%s %h %e %r%N%z %T%N%c%N"
> +country_name "<U004C><U006D><U0065><U0072><U0072><U0075><U006B>"
> +country_ab2  "<U004D><U0041>"
> +country_ab3  "<U004D><U0041><U0052>"
> +country_num  504
> +country_car  "<U004D><U0041>"
> +lang_term    "<U0072><U0069><U0066>"
> +lang_lib     "<U0072><U0069><U0066>"
> +END LC_ADDRESS
> +
> +LC_TELEPHONE
> +tel_int_fmt    "+%c%l"
> +tel_dom_fmt	"0%l"
> +int_select     "00"
> +int_prefix     "212"
> +END LC_TELEPHONE
> +
> +LC_MEASUREMENT
> +copy "i18n"
> +END LC_MEASUREMENT
> +
> diff --git a/localedata/rif_MA.UTF-8.in b/localedata/rif_MA.UTF-8.in
> new file mode 100644
> index 0000000000..1544dda31c
> --- /dev/null
> +++ b/localedata/rif_MA.UTF-8.in
> @@ -0,0 +1,55 @@
> +a
> +A
> +aa
> +ɛ
> +Ɛ
> +ɛɛ
> +b
> +B
> +bb
> +c
> +C
> +cc
> +d
> +D
> +dd
> +ḍ
> +Ḍ
> +ḍḍ
> +g
> +G
> +gg
> +ɣ
> +Ɣ
> +ɣɣ
> +h
> +H
> +hh
> +ḥ
> +Ḥ
> +ḥḥ
> +q
> +Q
> +qq
> +r
> +R
> +rr
> +s
> +S
> +ss
> +ṣ
> +Ṣ
> +ṣṣ
> +t
> +T
> +tt
> +ṭ
> +Ṭ
> +ṭṭ
> +z
> +Z
> +zz
> +ẓ
> +Ẓ
> +ẓẓ
> +ʷ


-- 
Cheers,
Carlos.

