Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Sender: libc-alpha-owner@sourceware.org
From: Leonhard Holz
Subject: [PATCH V4][BZ #18441] fix sorting multibyte charsets with an improper locale
To: GNU C Library
Cc: Carlos O'Donell
Message-ID: <56D3F8F0.8070401@web.de>
Date: Mon, 29 Feb 2016 10:29:00 -0000
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------030900000606020402020804"
X-SW-Source: 2016-02/txt/msg00875.txt.bz2

This is a multi-part message in MIME format.
--------------030900000606020402020804
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Content-length: 29420

In BZ #18441, sorting a Thai text with the en_US.UTF-8 locale causes a
performance regression. The problem has two causes: a) en_US.UTF-8 has
no information for Thai characters and so always reports a zero sort
weight, which causes the comparison to check the whole string instead of
stopping early, and b) the sequence-to-weight list is partitioned by the
first byte of the first character (TABLEMB); this generates long lists
for multibyte UTF-8 characters, as they tend to share a starting byte
(e.g. all Thai characters start with E0).

The approach of the patch is to interpret TABLEMB as a hash table and
find a better hash key. My first try was to somehow "fold" a multibyte
character into one byte, but that worsened the overall performance a
lot. Enlarging the table to 2-byte keys works much better while needing
a reasonable amount of extra memory.

The patch vastly improves the performance of languages with multibyte
characters (see zh_CN, hi_IN and ja_JP below). A side effect is that
some languages with one-byte characters get a bit slower because of the
extra check for the first byte while finding the right sequence in the
sequence list. It cannot be avoided since the hash key is no longer
equal to the first byte of the sequence. Tests are ok.
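
To make the bucket problem described above concrete, here is a minimal
sketch. It is not the code from the patch (the patch's helper is
utf8index in locale/weight.h); the names first_byte_key, two_byte_key,
distinct_buckets and thai_samples are hypothetical and exist only for
this illustration. It assumes well-formed input, decodes only 1- to
3-byte sequences and does not validate continuation bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old scheme (illustrative): the bucket is chosen by the first byte only.  */
static uint16_t
first_byte_key (const unsigned char *s, size_t len)
{
  return len > 0 ? s[0] : 0;
}

/* Sketch of a 2-byte key: decode the sequence to its code point and keep
   the low 16 bits.  Returns 0 for anything this sketch cannot decode
   (stray continuation bytes, 4-byte sequences, truncated input).  */
static uint16_t
two_byte_key (const unsigned char *s, size_t len)
{
  if (len == 0)
    return 0;
  uint32_t b0 = s[0];
  if (b0 < 0x80)
    return (uint16_t) b0;
  if (b0 >= 0xC0 && b0 < 0xE0 && len >= 2)
    return (uint16_t) (((b0 & 0x1Fu) << 6) | (s[1] & 0x3Fu));
  if (b0 >= 0xE0 && b0 < 0xF0 && len >= 3)
    return (uint16_t) (((b0 & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6)
                       | (s[2] & 0x3Fu));
  return 0;
}

typedef uint16_t (*key_fn) (const unsigned char *, size_t);

/* Count how many different buckets the N three-byte samples fall into.  */
static size_t
distinct_buckets (key_fn key, const unsigned char samples[][3], size_t n)
{
  assert (key != NULL);
  size_t distinct = 0;
  for (size_t i = 0; i < n; ++i)
    {
      int seen = 0;
      for (size_t j = 0; j < i; ++j)
        if (key (samples[j], 3) == key (samples[i], 3))
          seen = 1;
      distinct += !seen;
    }
  return distinct;
}

/* Four Thai letters; every one of them starts with the byte E0.  */
static const unsigned char thai_samples[4][3] = {
  { 0xE0, 0xB8, 0x81 },  /* U+0E01 */
  { 0xE0, 0xB8, 0x82 },  /* U+0E02 */
  { 0xE0, 0xB8, 0x84 },  /* U+0E04 */
  { 0xE0, 0xB8, 0x87 }   /* U+0E07 */
};
```

With the first-byte key all four samples land in one bucket, so a lookup
has to walk a list; with the 2-byte key each sample gets its own bucket.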

benchmark                     change          before         after
filelist#C                     1.75%      23,396,200    23,805,700
filelist#en_US.UTF-8           1.42%      77,186,200    78,285,200
lorem_ipsum#vi_VN.UTF-8       -1.70%       1,680,740     1,652,110
lorem_ipsum#ar_SA.UTF-8       -7.71%       2,134,780     1,970,170
lorem_ipsum#en_US.UTF-8        2.61%       1,685,120     1,729,160
lorem_ipsum#zh_CN.UTF-8      -88.66%         806,176        91,423
lorem_ipsum#cs_CZ.UTF-8       -4.89%       2,150,120     2,045,030
lorem_ipsum#en_GB.UTF-8       -1.47%       2,061,960     2,031,620
lorem_ipsum#da_DK.UTF-8        3.15%       1,703,710     1,757,390
lorem_ipsum#pl_PL.UTF-8        0.86%       1,634,890     1,648,870
lorem_ipsum#fr_FR.UTF-8       -2.06%       2,232,030     2,186,030
lorem_ipsum#pt_PT.UTF-8       -2.60%       2,238,410     2,180,210
lorem_ipsum#el_GR.UTF-8      -34.52%       3,413,330     2,235,010
lorem_ipsum#ru_RU.UTF-8       -9.88%       2,403,370     2,165,950
lorem_ipsum#iw_IL.UTF-8       -9.56%       2,209,740     1,998,500
lorem_ipsum#es_ES.UTF-8        4.92%       1,983,470     2,081,050
lorem_ipsum#hi_IN.UTF-8      -98.88%     220,453,000     2,458,620
lorem_ipsum#sv_SE.UTF-8        1.79%       1,645,370     1,674,760
lorem_ipsum#hu_HU.UTF-8        4.86%       3,179,620     3,334,290
lorem_ipsum#tr_TR.UTF-8      -23.59%       2,473,330     1,889,870
lorem_ipsum#is_IS.UTF-8        2.49%       1,620,370     1,660,680
lorem_ipsum#it_IT.UTF-8       -2.67%       2,186,160     2,127,710
lorem_ipsum#sr_RS.UTF-8        2.70%       1,930,520     1,982,720
lorem_ipsum#ja_JP.UTF-8      -97.43%         958,411        24,664
wikipedia-th#en_US.UTF-8     -99.61%  10,511,700,000    40,577,100

The performance numbers and the size of the patch changed due to the
removal of the strdiff optimization (#18589) and the included Thai test.
Performance degradation for locales in the ASCII plane is still minor.
The patch greatly increases the speed of strcoll for all languages that
mostly use multibyte UTF-8 encoding. Note that it should affect the
regex performance of these languages too, though there is no benchmark
for that.

Regarding Carlos' comments:

>> +  struct element_t *mbheads[256 * 256];
>
> Use #define MBHEADS_SZ or something similar.

Ok.

>> +  bool is_utf8 = strcmp (charmap->code_set_name, "UTF-8") == 0;
>
> OK.
>
> Will this always work? I'm just wondering about a user generated charmap that they
> call 'utf8', which is the other common alias for instance where the dash is not valid
> syntax. Probably not since the official name is UTF-8, and that's what you should use.

Well, if it does not work it's just a speed penalty. But there is no
problem in adding a check for "utf8".

>> +	  /* Special handling of UTF-8: Generate a 2-byte index to mbheads.
>> +	     Also check the UTF-8 encoding. Keep locale/weight.h in sync.  */
>
> Not OK. Can we refactor to avoid keeping the two in sync?

Ok, there is a new function utf8index in locale/weight.h that does the
job.

>> @@ -2239,7 +2281,7 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap,
>>
>>  		  /* Compute how much space we will need.  */
>>  		  added = LOCFILE_ALIGN_UP (sizeof (int32_t) + 1
>> -					    + 2 * (runp->nmbs - 1));
>> +					    + 2 * runp->nmbs);
>
> Doesn't the change to zero indexing make the conditional in the code above this wrong?
>
> e.g.
> 2230               if (runp->mbnext != NULL
> 2231                   && runp->nmbs == runp->mbnext->nmbs
> 2232                   && memcmp (runp->mbs, runp->mbnext->mbs, runp->nmbs - 1) == 0
> 2233                   && (runp->mbs[runp->nmbs - 1]
> 2234                       == runp->mbnext->mbs[runp->nmbs - 1] + 1))

No. runp traverses the input / locale definition file, and this is not
affected by the change. What happens here is a check whether the next
Unicode literal has the same byte sequence as the current one except for
the last byte, which should be 1 higher than the last byte of the
current literal -> beginning of a sequence.


	* benchtests/bench-strcoll.c: Add Thai text with en_US.UTF-8 locale.
	* benchtests/strcoll-inputs/wikipedia-th#en_US.UTF-8: New file.
	* locale/categories.def: Define _NL_COLLATE_ENCODING_TYPE.
	* locale/langinfo.h: Add _NL_COLLATE_ENCODING_TYPE to attribute list.
	* locale/localeinfo.h: Add enum collation_encoding_type.
	* locale/C-collate.c: Set _NL_COLLATE_ENCODING_TYPE to 8bit.
	* locale/programs/ld-collate.c (struct locale_collate_t): Expand
	mbheads array from 256 to 65536 (256 * 256) entries.
	(collate_finish): Generate 2-byte key for mbheads if UTF-8 locale.
	(collate_output): Output larger table and sequences including first
	byte.
	(collate_output): Add encoding type info.
	* locale/weight.h (utf8index): New function to calculate 2 byte index.
	(findidx): Use 2-byte index for table if UTF-8 locale.
	* locale/weightwc.h (findidx): Accept encoding parameter, not used.
	* posix/fnmatch_loop.c (FCT): Call findidx with encoding parameter.
	* posix/regcomp.c (build_equiv_class): Likewise.
	* posix/regex_internal.h (re_string_elem_size_at): Likewise.
	* posix/regexec.c (check_node_accept_bytes): Likewise.
	* string/strcoll_l.c (get_next_seq): Likewise.
	(STRCOLL): Call get_next_seq with encoding parameter.
	* string/strxfrm_l.c (find_idx): Call findidx with encoding parameter.
	(STRXFRM): Call find_idx with encoding parameter.


diff --git a/benchtests/bench-strcoll.c b/benchtests/bench-strcoll.c
index 22ae87c..6ce5b2a 100644
--- a/benchtests/bench-strcoll.c
+++ b/benchtests/bench-strcoll.c
@@ -53,7 +53,8 @@ static const char *const input_files[] = {
   "lorem_ipsum#is_IS.UTF-8",
   "lorem_ipsum#it_IT.UTF-8",
   "lorem_ipsum#sr_RS.UTF-8",
-  "lorem_ipsum#ja_JP.UTF-8"
+  "lorem_ipsum#ja_JP.UTF-8",
+  "wikipedia-th#en_US.UTF-8"
 };
 
 #define TEXTFILE_DELIMITER " \n\r\t.,?!"
diff --git a/locale/C-collate.c b/locale/C-collate.c
index 8214ff5..5a9ed6a 100644
--- a/locale/C-collate.c
+++ b/locale/C-collate.c
@@ -144,6 +144,8 @@ const struct __locale_data _nl_C_LC_COLLATE attribute_hidden =
     /* _NL_COLLATE_COLLSEQWC */
     { .string = (const char *) collseqwc },
     /* _NL_COLLATE_CODESET */
-    { .string = _nl_C_codeset }
+    { .string = _nl_C_codeset },
+    /* _NL_COLLATE_ENCODING_TYPE */
+    { .word = __cet_8bit }
   }
 };
diff --git a/locale/categories.def b/locale/categories.def
index d8a3ab8..cb57eae 100644
--- a/locale/categories.def
+++ b/locale/categories.def
@@ -58,6 +58,7 @@ DEFINE_CATEGORY
   DEFINE_ELEMENT (_NL_COLLATE_COLLSEQMB, "collate-collseqmb", std, wstring)
   DEFINE_ELEMENT (_NL_COLLATE_COLLSEQWC, "collate-collseqwc", std, wstring)
   DEFINE_ELEMENT (_NL_COLLATE_CODESET, "collate-codeset", std, string)
+  DEFINE_ELEMENT (_NL_COLLATE_ENCODING_TYPE, "collate-encoding-type", std, word)
   ), NO_POSTLOAD)
 
 
diff --git a/locale/langinfo.h b/locale/langinfo.h
index 481e226..0906a6a 100644
--- a/locale/langinfo.h
+++ b/locale/langinfo.h
@@ -255,6 +255,7 @@ enum
   _NL_COLLATE_COLLSEQMB,
   _NL_COLLATE_COLLSEQWC,
   _NL_COLLATE_CODESET,
+  _NL_COLLATE_ENCODING_TYPE,
   _NL_NUM_LC_COLLATE,
 
   /* LC_CTYPE category: character classification.
diff --git a/locale/localeinfo.h b/locale/localeinfo.h
index 5c4e6ef..bd284df 100644
--- a/locale/localeinfo.h
+++ b/locale/localeinfo.h
@@ -110,6 +110,14 @@ enum coll_sort_rule
   sort_mask
 };
 
+/* Collation encoding type.  */
+enum collation_encoding_type
+{
+  __cet_other,
+  __cet_8bit,
+  __cet_utf8
+};
+
 /* We can map the types of the entries into a few categories.
*/ enum value_type { diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c index 1e125f6..efaacf6 100644 --- a/locale/programs/ld-collate.c +++ b/locale/programs/ld-collate.c @@ -32,6 +32,8 @@ #include "linereader.h" #include "locfile.h" #include "elem-hash.h" +#include "../localeinfo.h" +#include "../locale/weight.h" /* Uncomment the following line in the production version. */ /* #define NDEBUG 1 */ @@ -243,9 +245,10 @@ struct locale_collate_t Therefore we keep all relevant input in a list. */ struct locale_collate_t *next; - /* Arrays with heads of the list for each of the leading bytes in + /* Arrays with heads of the list for the leading bytes in the multibyte sequences. */ - struct element_t *mbheads[256]; + #define MBHEADS_SZ (256 * 256) + struct element_t *mbheads[MBHEADS_SZ]; /* Arrays with heads of the list for each of the leading bytes in the multibyte sequences. */ @@ -1557,6 +1560,7 @@ collate_finish (struct localedef_t *locale, const struct charmap_t *charmap) struct section_list *sect; int ruleidx; int nr_wide_elems = 0; + bool is_utf8 = strcmp (charmap->code_set_name, "UTF-8") == 0; if (collate == NULL) { @@ -1663,7 +1667,22 @@ collate_finish (struct localedef_t *locale, const struct charmap_t *charmap) struct element_t *lastp = NULL; /* Find the point where to insert in the list. */ - eptr = &collate->mbheads[((unsigned char *) runp->mbs)[0]]; + uint16_t index = ((unsigned char *) runp->mbs)[0]; + + /* Special handling of UTF-8: Generate a 2-byte index to mbheads. 
*/ + if (is_utf8 && index > 0) + { + index = utf8index((unsigned char *) runp->mbs, runp->nmbs); + if (index == 0) + { + WITH_CUR_LOCALE (error_at_line (0, 0, runp->file, runp->line, + _("\ +malformed UTF-8 character in `%s'"), runp->name);); + goto dont_insert; + } + } + + eptr = &collate->mbheads[index]; while (*eptr != NULL) { if ((*eptr)->nmbs < runp->nmbs) @@ -1734,7 +1753,7 @@ symbol `%s' has the same encoding as"), (*eptr)->name); /* Find out whether any of the `mbheads' entries is unset. In this case we use the UNDEFINED entry. */ - for (i = 1; i < 256; ++i) + for (i = 1; i < MBHEADS_SZ; ++i) if (collate->mbheads[i] == NULL) { need_undefined = 1; @@ -2107,7 +2126,7 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, const size_t nelems = _NL_ITEM_INDEX (_NL_NUM_LC_COLLATE); struct locale_file file; size_t ch; - int32_t tablemb[256]; + int32_t tablemb[MBHEADS_SZ]; struct obstack weightpool; struct obstack extrapool; struct obstack indirectpool; @@ -2130,6 +2149,8 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, /* The words have to be handled specially. */ if (idx == _NL_ITEM_INDEX (_NL_COLLATE_SYMB_HASH_SIZEMB)) add_locale_uint32 (&file, 0); + else if (idx == _NL_ITEM_INDEX (_NL_COLLATE_ENCODING_TYPE)) + add_locale_uint32 (&file, __cet_other); else add_locale_empty (&file); } @@ -2183,7 +2204,7 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, if (collate->undefined.used_in_level != 0) output_weight (&weightpool, collate, &collate->undefined); - for (ch = 1; ch < 256; ++ch) + for (ch = 1; ch < MBHEADS_SZ; ++ch) if (collate->mbheads[ch]->mbnext == NULL && collate->mbheads[ch]->nmbs <= 1) { @@ -2208,7 +2229,6 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, and add only one index into the weight table. We can find the consecutive entries since they are also consecutive in the list. 
*/ struct element_t *runp = collate->mbheads[ch]; - struct element_t *lastp; assert (LOCFILE_ALIGNED_P (obstack_object_size (&extrapool))); @@ -2236,7 +2256,7 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, /* Compute how much space we will need. */ added = LOCFILE_ALIGN_UP (sizeof (int32_t) + 1 - + 2 * (runp->nmbs - 1)); + + 2 * runp->nmbs); assert (LOCFILE_ALIGNED_P (obstack_object_size (&extrapool))); obstack_make_room (&extrapool, added); @@ -2259,9 +2279,9 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, /* Now walk backward from here to the beginning. */ curp = runp; - assert (runp->nmbs <= 256); - obstack_1grow_fast (&extrapool, curp->nmbs - 1); - for (i = 1; i < curp->nmbs; ++i) + assert (runp->nmbs <= 255); + obstack_1grow_fast (&extrapool, curp->nmbs); + for (i = 0; i < curp->nmbs; ++i) obstack_1grow_fast (&extrapool, curp->mbs[i]); /* Now find the end of the consecutive sequence and @@ -2281,7 +2301,7 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, /* And add the end byte sequence. Without length this time. 
*/ - for (i = 1; i < curp->nmbs; ++i) + for (i = 0; i < curp->nmbs; ++i) obstack_1grow_fast (&extrapool, curp->mbs[i]); } else @@ -2295,15 +2315,15 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, weightidx = output_weight (&weightpool, collate, runp); added = LOCFILE_ALIGN_UP (sizeof (int32_t) + 1 - + runp->nmbs - 1); + + runp->nmbs); assert (LOCFILE_ALIGNED_P (obstack_object_size (&extrapool))); obstack_make_room (&extrapool, added); obstack_int32_grow_fast (&extrapool, weightidx); - assert (runp->nmbs <= 256); - obstack_1grow_fast (&extrapool, runp->nmbs - 1); + assert (runp->nmbs <= 255); + obstack_1grow_fast (&extrapool, runp->nmbs); - for (i = 1; i < runp->nmbs; ++i) + for (i = 0; i < runp->nmbs; ++i) obstack_1grow_fast (&extrapool, runp->mbs[i]); } @@ -2312,30 +2332,25 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, obstack_1grow_fast (&extrapool, '\0'); /* Next entry. */ - lastp = runp; runp = runp->mbnext; } while (runp != NULL); assert (LOCFILE_ALIGNED_P (obstack_object_size (&extrapool))); - /* If the final entry in the list is not a single character we - add an UNDEFINED entry here. */ - if (lastp->nmbs != 1) - { - int added = LOCFILE_ALIGN_UP (sizeof (int32_t) + 1 + 1); - obstack_make_room (&extrapool, added); + /* Add an UNDEFINED entry at the end of the list. */ + int added = LOCFILE_ALIGN_UP (sizeof (int32_t) + 1 + 1); + obstack_make_room (&extrapool, added); - obstack_int32_grow_fast (&extrapool, 0); - /* XXX What rule? We just pick the first. */ - obstack_1grow_fast (&extrapool, 0); - /* Length is zero. */ - obstack_1grow_fast (&extrapool, 0); + obstack_int32_grow_fast (&extrapool, 0); + /* XXX What rule? We just pick the first. */ + obstack_1grow_fast (&extrapool, 0); + /* Length is zero. */ + obstack_1grow_fast (&extrapool, 0); - /* Add alignment bytes if necessary. 
*/ - while (!LOCFILE_ALIGNED_P (obstack_object_size (&extrapool))) - obstack_1grow_fast (&extrapool, '\0'); - } + /* Add alignment bytes if necessary. */ + while (!LOCFILE_ALIGNED_P (obstack_object_size (&extrapool))) + obstack_1grow_fast (&extrapool, '\0'); } /* Add padding to the tables if necessary. */ @@ -2343,7 +2358,7 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, obstack_1grow (&weightpool, 0); /* Now add the four tables. */ - add_locale_uint32_array (&file, (const uint32_t *) tablemb, 256); + add_locale_uint32_array (&file, (const uint32_t *) tablemb, MBHEADS_SZ); add_locale_raw_obstack (&file, &weightpool); add_locale_raw_obstack (&file, &extrapool); add_locale_raw_obstack (&file, &indirectpool); @@ -2493,6 +2508,12 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap, add_locale_raw_data (&file, collate->mbseqorder, 256); add_locale_collseq_table (&file, &collate->wcseqorder); add_locale_string (&file, charmap->code_set_name); + if (strcmp (charmap->code_set_name, "UTF-8") == 0) + add_locale_uint32 (&file, __cet_utf8); + else if (charmap->mb_cur_max == 1) + add_locale_uint32 (&file, __cet_8bit); + else + add_locale_uint32 (&file, __cet_other); write_locale_data (output_path, LC_COLLATE, "LC_COLLATE", &file); obstack_free (&weightpool, NULL); diff --git a/locale/weight.h b/locale/weight.h index c99730c..5b4103b 100644 --- a/locale/weight.h +++ b/locale/weight.h @@ -19,26 +19,81 @@ #ifndef _WEIGHT_H_ #define _WEIGHT_H_ 1 +/* Generate 2 byte code for the next UTF-8 encoded char. + Returns zero on UTF-8 encoding errors. 
*/ +static __always_inline uint16_t +utf8index (const unsigned char *cp, size_t len) +{ + uint16_t index = cp[0]; + + if (index >= 0x80) + { + if (index < 0xE0) + { + if (len < 2) + return 0; + uint16_t byte2 = cp[1]; + index = (index << 6) + byte2 - 0x3080; + } + else if (index < 0xF0) + { + if (len < 3) + return 0; + uint16_t byte2 = cp[1]; + uint16_t byte3 = cp[2]; + index = (index << 12) + (byte2 << 6) + byte3 - 0xE2080; + } + else if (index < 0xF8) + { + if (len < 4) + return 0; + uint16_t byte2 = cp[1]; + uint16_t byte3 = cp[2]; + uint16_t byte4 = cp[3]; + index = (byte2 << 12) + (byte3 << 6) + byte4 - 0x82080; + } + else + return 0; + } + + return index; +} + /* Find index of weight. */ static inline int32_t __attribute__ ((always_inline)) -findidx (const int32_t *table, +findidx (uint_fast32_t locale_encoding, + const int32_t *table, const int32_t *indirect, const unsigned char *extra, const unsigned char **cpp, size_t len) { - int_fast32_t i = table[*(*cpp)++]; const unsigned char *cp; const unsigned char *usrc; + uint16_t index = (*cpp)[0]; + + /* Special handling of UTF-8: Generate a 2-byte index for table. */ + if (index >= 0x80 && locale_encoding == __cet_utf8) + { + index = utf8index(*cpp, len); + if (index == 0) + { + *cpp += 1; + return 0; + } + } + int_fast32_t i = table[index]; if (i >= 0) - /* This is an index into the weight table. Cool. */ - return i; + { + /* This is an index into the weight table. Cool. */ + *cpp += 1; + return i; + } /* Oh well, more than one sequence starting with this byte. Search for the correct one. */ cp = &extra[-i]; usrc = *cpp; - --len; while (1) { size_t nhere; @@ -57,8 +112,7 @@ findidx (const int32_t *table, /* It is a single character. If it matches we found our index. Note that at the end of each list there is an entry of length zero which represents the single byte - sequence. The first (and here only) byte was tested - already. */ + sequence. 
*/ size_t cnt; for (cnt = 0; cnt < nhere && cnt < len; ++cnt) @@ -68,7 +122,7 @@ findidx (const int32_t *table, if (cnt == nhere) { /* Found it. */ - *cpp += nhere; + *cpp += nhere > 0 ? nhere : 1; return i; } @@ -127,7 +181,7 @@ findidx (const int32_t *table, while (++cnt < nhere); } - *cpp += nhere; + *cpp += nhere > 0 ? nhere : 1; return indirect[-i + offset]; } } diff --git a/locale/weightwc.h b/locale/weightwc.h index ab26482..4101dc8 100644 --- a/locale/weightwc.h +++ b/locale/weightwc.h @@ -21,7 +21,8 @@ /* Find index of weight. */ static inline int32_t __attribute__ ((always_inline)) -findidx (const int32_t *table, +findidx (uint_fast32_t encoding, + const int32_t *table, const int32_t *indirect, const wint_t *extra, const wint_t **cpp, size_t len) diff --git a/posix/fnmatch_loop.c b/posix/fnmatch_loop.c index 229904e..07b60fb 100644 --- a/posix/fnmatch_loop.c +++ b/posix/fnmatch_loop.c @@ -383,6 +383,8 @@ FCT (const CHAR *pattern, const CHAR *string, const CHAR *string_end, const int32_t *indirect; int32_t idx; const UCHAR *cp = (const UCHAR *) &str; + uint_fast32_t encoding = + _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_ENCODING_TYPE); # if WIDE_CHAR_VERSION table = (const int32_t *) @@ -404,7 +406,7 @@ FCT (const CHAR *pattern, const CHAR *string, const CHAR *string_end, _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); # endif - idx = FINDIDX (table, indirect, extra, &cp, 1); + idx = FINDIDX (encoding, table, indirect, extra, &cp, 1); if (idx != 0) { /* We found a table entry. 
Now see whether the @@ -414,7 +416,7 @@ FCT (const CHAR *pattern, const CHAR *string, const CHAR *string_end, int32_t idx2; const UCHAR *np = (const UCHAR *) n; - idx2 = FINDIDX (table, indirect, extra, + idx2 = FINDIDX (encoding, table, indirect, extra, &np, string_end - n); if (idx2 != 0 && (idx >> 24) == (idx2 >> 24) diff --git a/posix/regcomp.c b/posix/regcomp.c index b6126b7..011ef92 100644 --- a/posix/regcomp.c +++ b/posix/regcomp.c @@ -3414,6 +3414,7 @@ build_equiv_class (bitset_t sbcset, const unsigned char *name) uint32_t nrules = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_NRULES); if (nrules != 0) { + uint_fast32_t encoding; const int32_t *table, *indirect; const unsigned char *weights, *extra, *cp; unsigned char char_buf[2]; @@ -3422,6 +3423,7 @@ build_equiv_class (bitset_t sbcset, const unsigned char *name) size_t len; /* Calculate the index for equivalence class. */ cp = name; + encoding = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_ENCODING_TYPE); table = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_TABLEMB); weights = (const unsigned char *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_WEIGHTMB); @@ -3429,7 +3431,7 @@ build_equiv_class (bitset_t sbcset, const unsigned char *name) _NL_COLLATE_EXTRAMB); indirect = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); - idx1 = findidx (table, indirect, extra, &cp, -1); + idx1 = findidx (encoding, table, indirect, extra, &cp, -1); if (BE (idx1 == 0 || *cp != '\0', 0)) /* This isn't a valid character. 
*/ return REG_ECOLLATE; @@ -3440,7 +3442,7 @@ build_equiv_class (bitset_t sbcset, const unsigned char *name) { char_buf[0] = ch; cp = char_buf; - idx2 = findidx (table, indirect, extra, &cp, 1); + idx2 = findidx (encoding, table, indirect, extra, &cp, 1); /* idx2 = table[ch]; */ diff --git a/posix/regex_internal.h b/posix/regex_internal.h index 02e040b..993c7c3 100644 --- a/posix/regex_internal.h +++ b/posix/regex_internal.h @@ -743,17 +743,19 @@ re_string_elem_size_at (const re_string_t *pstr, int idx) # ifdef _LIBC const unsigned char *p, *extra; const int32_t *table, *indirect; + uint_fast32_t encoding; uint_fast32_t nrules = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_NRULES); if (nrules != 0) { + encoding = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_ENCODING_TYPE); table = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_TABLEMB); extra = (const unsigned char *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_EXTRAMB); indirect = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); p = pstr->mbs + idx; - findidx (table, indirect, extra, &p, pstr->len - idx); + findidx (encoding, table, indirect, extra, &p, pstr->len - idx); return p - pstr->mbs - idx; } else diff --git a/posix/regexec.c b/posix/regexec.c index ec46c3a..3d3ad9a 100644 --- a/posix/regexec.c +++ b/posix/regexec.c @@ -3843,6 +3843,7 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx, if (nrules != 0) { unsigned int in_collseq = 0; + uint_fast32_t encoding; const int32_t *table, *indirect; const unsigned char *weights, *extra; const char *collseqwc; @@ -3893,6 +3894,8 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx, if (cset->nequiv_classes) { const unsigned char *cp = pin; + encoding = + _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_ENCODING_TYPE); table = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_TABLEMB); weights = (const unsigned char *) @@ -3901,7 +3904,8 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx, _NL_CURRENT (LC_COLLATE, 
_NL_COLLATE_EXTRAMB); indirect = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); - int32_t idx = findidx (table, indirect, extra, &cp, elem_len); + int32_t idx = findidx (encoding, table, indirect, extra, &cp, + elem_len); if (idx > 0) for (i = 0; i < cset->nequiv_classes; ++i) { diff --git a/string/strcoll_l.c b/string/strcoll_l.c index 4d1e3ab..2c2cab0 100644 --- a/string/strcoll_l.c +++ b/string/strcoll_l.c @@ -63,9 +63,9 @@ typedef struct /* Get next sequence. Traverse the string as required. */ static __always_inline void get_next_seq (coll_seq *seq, int nrules, const unsigned char *rulesets, - const USTRING_TYPE *weights, const int32_t *table, - const USTRING_TYPE *extra, const int32_t *indirect, - int pass) + const USTRING_TYPE *weights, uint_fast32_t encoding, + const int32_t *table, const USTRING_TYPE *extra, + const int32_t *indirect, int pass) { size_t val = seq->val = 0; int len = seq->len; @@ -109,7 +109,7 @@ get_next_seq (coll_seq *seq, int nrules, const unsigned char *rulesets, us = seq->back_us; while (i < backw) { - int32_t tmp = findidx (table, indirect, extra, &us, -1); + int32_t tmp = findidx (encoding, table, indirect, extra, &us, -1); idx = tmp & 0xffffff; i++; } @@ -124,7 +124,7 @@ get_next_seq (coll_seq *seq, int nrules, const unsigned char *rulesets, while (*us != L('\0')) { - int32_t tmp = findidx (table, indirect, extra, &us, -1); + int32_t tmp = findidx (encoding, table, indirect, extra, &us, -1); unsigned char rule = tmp >> 24; prev_idx = idx; idx = tmp & 0xffffff; @@ -253,6 +253,7 @@ STRCOLL (const STRING_TYPE *s1, const STRING_TYPE *s2, __locale_t l) const USTRING_TYPE *weights; const USTRING_TYPE *extra; const int32_t *indirect; + uint_fast32_t encoding; if (nrules == 0) return STRCMP (s1, s2); @@ -271,6 +272,8 @@ STRCOLL (const STRING_TYPE *s1, const STRING_TYPE *s2, __locale_t l) current->values[_NL_ITEM_INDEX (CONCAT(_NL_COLLATE_EXTRA,SUFFIX))].string; indirect = (const int32_t *) current->values[_NL_ITEM_INDEX 
(CONCAT(_NL_COLLATE_INDIRECT,SUFFIX))].string; + encoding = current->values[_NL_ITEM_INDEX (_NL_COLLATE_ENCODING_TYPE)].word; + assert (((uintptr_t) table) % __alignof__ (table[0]) == 0); assert (((uintptr_t) weights) % __alignof__ (weights[0]) == 0); @@ -310,9 +313,9 @@ STRCOLL (const STRING_TYPE *s1, const STRING_TYPE *s2, __locale_t l) while (1) { - get_next_seq (&seq1, nrules, rulesets, weights, table, + get_next_seq (&seq1, nrules, rulesets, weights, encoding, table, extra, indirect, pass); - get_next_seq (&seq2, nrules, rulesets, weights, table, + get_next_seq (&seq2, nrules, rulesets, weights, encoding, table, extra, indirect, pass); /* See whether any or both strings are empty. */ if (seq1.len == 0 || seq2.len == 0) diff --git a/string/strxfrm_l.c b/string/strxfrm_l.c index 22e24d3..5c89b15 100644 --- a/string/strxfrm_l.c +++ b/string/strxfrm_l.c @@ -53,6 +53,7 @@ typedef struct uint_fast32_t nrules; unsigned char *rulesets; USTRING_TYPE *weights; + uint_fast32_t encoding; int32_t *table; USTRING_TYPE *extra; int32_t *indirect; @@ -100,8 +101,8 @@ static __always_inline size_t find_idx (const USTRING_TYPE **us, int32_t *weight_idx, unsigned char *rule_idx, const locale_data_t *l_data, const int pass) { - int32_t tmp = findidx (l_data->table, l_data->indirect, l_data->extra, us, - -1); + int32_t tmp = findidx (l_data->encoding, l_data->table, l_data->indirect, + l_data->extra, us, -1); *rule_idx = tmp >> 24; int32_t idx = tmp & 0xffffff; size_t len = l_data->weights[idx++]; @@ -693,6 +694,8 @@ STRXFRM (STRING_TYPE *dest, const STRING_TYPE *src, size_t n, __locale_t l) /* Get the locale data. 
*/ l_data.rulesets = (unsigned char *) current->values[_NL_ITEM_INDEX (_NL_COLLATE_RULESETS)].string; + l_data.encoding = + current->values[_NL_ITEM_INDEX (_NL_COLLATE_ENCODING_TYPE)].word; l_data.table = (int32_t *) current->values[_NL_ITEM_INDEX (CONCAT(_NL_COLLATE_TABLE,SUFFIX))].string; l_data.weights = (USTRING_TYPE *) @@ -721,8 +724,8 @@ STRXFRM (STRING_TYPE *dest, const STRING_TYPE *src, size_t n, __locale_t l) do { - int32_t tmp = findidx (l_data.table, l_data.indirect, l_data.extra, &cur, - -1); + int32_t tmp = findidx (l_data.encoding, l_data.table, l_data.indirect, + l_data.extra, &cur, -1); rulearr[idxmax] = tmp >> 24; idxarr[idxmax] = tmp & 0xffffff; --------------030900000606020402020804 Content-Type: text/plain; charset=UTF-8; name="wikipedia-th#en_US.UTF-8" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="wikipedia-th#en_US.UTF-8" Content-length: 8988 4LmA4LiZ4Lia4Li04Lin4Lil4Liy4Lib4Li5IOC5gOC4m+C5h+C4meC4i+C4 suC4geC4i+C4ueC5gOC4m+C4reC4o+C5jOC5guC4meC4p+C4suC5geC4peC4 sOC5gOC4meC4muC4tOC4p+C4peC4suC4peC4oeC4nuC4seC4peC4i+C4suC4 o+C5jOC5g+C4meC4geC4peC4uOC5iOC4oeC4lOC4suC4p+C4p+C4seC4pwrg uYDguJnguJrguLTguKfguKXguLLguJnguLXguYnguYTguJTguYnguKPguLHg uJrguIHguLLguKPguKrguLHguIfguYDguIHguJXguYLguJTguKLguIjguK3g uKvguYzguJkg4LmA4Lia4Lin4Li04LiqIOC5g+C4meC4m+C4tSDguJ4u4Lio LiAyMjc0CuC4i+C4tuC5iOC4h+C4quC4reC4lOC4hOC4peC5ieC4reC4h+C4 geC4seC4muC4geC4suC4o+C4muC4seC4meC4l+C4tuC4geC5gOC4q+C4leC4 uOC4geC4suC4o+C4k+C5jOC4i+C4ueC5gOC4m+C4reC4o+C5jOC5guC4meC4 p+C4suC4quC4p+C5iOC4suC4h+C5guC4lOC4ouC4meC4seC4geC4lOC4suC4 o+C4suC4qOC4suC4quC4leC4o+C5jOC4iuC4suC4p+C4iOC4teC4meC5geC4 peC4sOC4iuC4suC4p+C4reC4suC4q+C4o+C4seC4muC5g+C4mQrguJ4u4Lio LiAxNTk3IOC4l+C4teC5iOC4o+C4sOC4lOC4seC4muC4o+C4seC4h+C4quC4 teC5gOC4reC4geC4i+C5jOC5geC4peC4sOC4o+C4seC4h+C4quC4teC5geC4 geC4oeC4oeC4suC4quC4ueC4h+C4geC4p+C5iOC4siAzMCDguIHguLTguYLg uKXguK3guLTguYDguKXguYfguIHguJXguKPguK3guJnguYLguKfguKXguJXg 
uYwK4LmA4LiZ4Lia4Li04Lin4Lil4Liy4Lib4Li54LmA4Lib4LmH4LiZ4LmB 4Lir4Lil4LmI4LiH4Lie4Lil4Lix4LiH4LiH4Liy4LiZ4LiX4Li14LmI4LmA 4LiC4LmJ4Lih4LiX4Li14LmI4Liq4Li44LiU4Lia4LiZ4LiX4LmJ4Lit4LiH 4Lif4LmJ4Liy4Lih4Liy4Lit4Lii4LmI4Liy4LiH4LiV4LmI4Lit4LmA4LiZ 4Li34LmI4Lit4LiHIOC5guC4lOC4ouC4quC4suC4oeC4suC4o+C4luC4p+C4 seC4lOC4n+C4peC4seC4geC4i+C5jOC5hOC4lOC5ieC4luC4tuC4h+C4quC4 ueC4h+C4geC4p+C5iOC4sgoxMDEyIOC4reC4tOC5gOC4peC5h+C4geC4leC4 o+C4reC4meC5guC4p+C4peC4leC5jCDguYDguJnguJrguLTguKfguKXguLLg uJvguLnguJXguLHguYnguIfguK3guKLguLnguYjguKvguYjguLLguIfguIjg uLLguIHguYLguKXguIEgNiw1MDAg4Lib4Li14LmB4Liq4LiHICgyIOC4geC4 tOC5guC4peC4nuC4suC4o+C5jOC5gOC4i+C4gSkK4Lih4Li14LmA4Liq4LmJ 4LiZ4Lic4LmI4Liy4LiZ4Lio4Li54LiZ4Lii4LmM4LiB4Lil4Liy4LiHIDEx IOC4m+C4teC5geC4quC4hyAoMy40IOC4nuC4suC4o+C5jOC5gOC4i+C4gSkg 4LmB4Lil4Liw4LiC4Lii4Liy4Lii4LiV4Lix4Lin4LmD4LiZ4Lit4Lix4LiV 4Lij4LiyIDEsNTAwIOC4geC4tOC5guC4peC5gOC4oeC4leC4o+C4leC5iOC4 reC4p+C4tOC4meC4suC4l+C4tQrguJMg4LmD4LiI4LiB4Lil4Liy4LiH4LmA 4LiZ4Lia4Li04Lin4Lil4Liy4Lib4Li54LmA4Lib4LmH4LiZ4LiX4Li14LmI 4Lit4Lii4Li54LmI4LiC4Lit4LiH4Lie4Lix4Lil4LiL4Liy4Lij4LmM4Lib 4Li5IOC4lOC4suC4p+C4meC4tOC4p+C4leC4o+C4reC4meC4guC4meC4suC4 lOC5gOC4quC5ieC4meC4nOC5iOC4suC4meC4qOC4ueC4meC4ouC5jOC4geC4 peC4suC4hyAyOC0zMCDguIHguLTguYLguKXguYDguKHguJXguKMK4LiL4Li2 4LmI4LiH4Lib4Lil4LiU4Lib4Lil4LmI4Lit4Lii4Lij4Lix4LiH4Liq4Li1 4LiV4Lix4LmJ4LiH4LmB4LiV4LmI4Lij4Lix4LiH4Liq4Li14LmB4LiB4Lih 4Lih4Liy4LmE4Lib4LiI4LiZ4LiW4Li24LiH4LiE4Lil4Li34LmI4LiZ4Lin 4Li04LiX4Lii4Li44LiU4LmJ4Lin4Lii4Lit4Lix4LiV4Lij4Liy4LiB4Liy 4Lij4Lir4Lih4Li44LiZIDMwLjIg4Lij4Lit4Lia4LiV4LmI4Lit4Lin4Li0 4LiZ4Liy4LiX4Li1CuC5gOC4meC4muC4tOC4p+C4peC4suC4m+C4ueC5gOC4 m+C5h+C4meC4p+C4seC4leC4luC4uOC4l+C4suC4h+C4lOC4suC4o+C4suC4 qOC4suC4quC4leC4o+C5jOC4p+C4seC4leC4luC4uOC5geC4o+C4geC4l+C4 teC5iOC4quC4suC4oeC4suC4o+C4luC4o+C4sOC4muC4uOC5hOC4lOC5ieC4 iOC4suC4geC4geC4suC4o+C4o+C4sOC5gOC4muC4tOC4lOC4i+C4ueC5gOC4 m+C4reC4o+C5jOC5guC4meC4p+C4suC5g+C4meC4m+C4o+C4sOC4p+C4seC4 
leC4tOC4qOC4suC4quC4leC4o+C5jArguYDguJnguJrguLTguKfguKXguLLg uJnguLXguYnguJfguLPguJXguLHguKfguYDguKrguKHguLfguK3guJnguKvg uJnguLbguYjguIfguYHguKvguKXguYjguIfguIHguLPguYDguJnguLTguJTg uKPguLHguIfguKrguLXguKrguLPguKvguKPguLHguJrguIHguLLguKPguKjg uLbguIHguKnguLLguYDguJfguKvguYzguJ/guLLguIHguJ/guYnguLLguJfg uLXguYjguYDguITguKXguLfguYjguK3guJnguJzguYjguLLguJnguJXguLHg uKfguKHguLHguJkK4LmD4LiZ4LiK4LmI4Lin4LiH4Lib4Li14LieLuC4qC4g MjQ5MyDguYHguKXguLAgMjUxMgrguKHguLXguIHguLLguKPguJfguLPguYHg uJzguJnguKDguLnguKHguLTguYLguITguYLguKPguJnguLLguILguK3guIfg uJTguKfguIfguK3guLLguJfguLTguJXguKLguYzguILguLbguYnguJnguIjg uLLguIHguIHguLLguKPguYDguJ3guYnguLLguKrguLHguIfguYDguIHguJXg uITguKXguLfguYjguJnguKfguLTguJfguKLguLjguIjguLLguIHguYDguJng uJrguLTguKfguKXguLLguJvguLnguJfguLXguYjguJzguYjguLLguJnguIrg uLHguYnguJnguYLguITguYLguKPguJnguLLguYTguJsK4LmB4Lil4Liw4LmD 4LiZ4Lib4Li1IOC4ni7guKguIDI1NDYg4LmA4Lij4Liy4Liq4Liy4Lih4Liy 4Lij4LiW4Lin4Lix4LiU4LiE4Lin4Liy4Lih4Lir4LiZ4Liy4LiC4Lit4LiH 4LiK4Lix4LmJ4LiZ4Lia4Lij4Lij4Lii4Liy4LiB4Liy4Lio4LiC4Lit4LiH 4LiU4Lin4LiH4LiI4Lix4LiZ4LiX4Lij4LmM4LmE4LiX4LiX4Lix4LiZCuC4 lOC4suC4p+C4muC4o+C4tOC4p+C4suC4o+C4guC4reC4h+C4lOC4suC4p+C5 gOC4quC4suC4o+C5jOC5hOC4lOC5ieC4iOC4suC4geC4geC4suC4o+C4l+C4 teC5iOC4iuC4seC5ieC4meC4muC4o+C4o+C4ouC4suC4geC4suC4qOC4meC4 teC5ieC4geC4teC4lOC4guC4p+C4suC4h+C4o+C4seC4h+C4quC4teC5gOC4 reC4geC4i+C5jOC4iOC4suC4geC5gOC4meC4muC4tOC4p+C4peC4siAo4Lit 4LmI4Liy4LiZ4LiV4LmI4LitLi4uKQrguIzguK3guKPguYzguIwg4LmA4Lil 4Lit4LmB4Lih4LmH4LiX4Lij4LmMIOC4meC4seC4geC4p+C4tOC4l+C4ouC4 suC4qOC4suC4quC4leC4o+C5jOC5geC4peC4sOC4nuC4o+C4sOC5guC4o+C4 oeC4seC4meC4hOC4suC4l+C4reC4peC4tOC4gSDguYDguJvguYfguJnguJzg uLnguYnguYDguKrguJnguK3guYHguJnguKfguITguLTguJTguIHguLLguKPg uIHguLPguYDguJnguLTguJTguILguK3guIfguYDguK3guIHguKDguJ4K4LiL 4Li24LmI4LiH4LiV4LmI4Lit4Lih4Liy4Lij4Li54LmJ4LiI4Lix4LiB4LiB 4Lix4LiZ4LmD4LiZ4LiK4Li34LmI4LitIOC4l+C4pOC4qeC4juC4teC4muC4 tOC4geC5geC4muC4hyDguYPguJnguYDguJrguLfguYnguK3guIfguYHguKPg 
uIHguYDguILguLLguYDguKPguLXguKLguIHguJfguKTguKnguI7guLXguJng uLXguYnguKfguYjguLIK4Liq4Lih4Lih4LiV4Li04LiQ4Liy4LiZ4LmA4LiB 4Li14LmI4Lii4Lin4LiB4Lix4Lia4Lit4Liw4LiV4Lit4Lih4LmB4Lij4LiB 4LmA4Lij4Li04LmI4LihIChoeXBvdGhlc2lzIG9mIHRoZSBwcmltZXZhbCBh dG9tKSDguK3guYDguKXguYfguIHguIvguLLguJnguYDguJTguK3guKPguYwK 4Lif4Lij4Li14LiU4LmB4Lih4LiZCuC4l+C4s+C4geC4suC4o+C4hOC4s+C4 meC4p+C4k+C5geC4muC4muC4iOC4s+C4peC4reC4h+C5guC4lOC4ouC4oeC4 teC4geC4o+C4reC4muC4geC4suC4o+C4nuC4tOC4iOC4suC4o+C4k+C4suC4 reC4ouC4ueC5iOC4muC4meC4nuC4t+C5ieC4meC4kOC4suC4meC4guC4reC4 h+C4l+C4pOC4qeC4juC4teC4quC4seC4oeC4nuC4seC4l+C4mOC4oOC4suC4 nuC4l+C4seC5iOC4p+C5hOC4m+C4guC4reC4h+C4reC4seC4peC5gOC4muC4 tOC4o+C5jOC4lQrguYTguK3guJnguYzguKrguYTguJXguJnguYwg4LiV4LmI 4Lit4Lih4Liy4LmD4LiZ4Lib4Li1IOC4hC7guKguIDE5Mjkg4LmA4Lit4LmH 4LiU4Lin4Li04LiZIOC4ruC4seC4muC5gOC4muC4tOC4peC4hOC5ieC4meC4 nuC4muC4p+C5iOC4sgrguKPguLDguKLguLDguKvguYjguLLguIfguILguK3g uIfguJTguLLguKPguLLguIjguLHguIHguKPguKHguLXguKrguLHguJTguKrg uYjguKfguJnguJfguLXguYjguYDguJvguKXguLXguYjguKLguJnguYHguJvg uKXguIfguKrguLHguKHguJ7guLHguJnguJjguYzguIHguLHguJrguIHguLLg uKPguYDguITguKXguLfguYjguK3guJnguYTguJvguJfguLLguIfguYHguJTg uIcK4LiB4Liy4Lij4Liq4Lix4LiH4LmA4LiB4LiV4LiB4Liy4Lij4LiT4LmM 4LiZ4Li14LmJ4Lia4LmI4LiH4LiK4Li14LmJ4Lin4LmI4LiyIOC4lOC4suC4 o+C4suC4iOC4seC4geC4o+C5geC4peC4sOC4geC4o+C4sOC4iOC4uOC4geC4 lOC4suC4p+C4reC4seC4meC4q+C5iOC4suC4h+C5hOC4geC4peC4geC4s+C4 peC4seC4h+C5gOC4hOC4peC4t+C5iOC4reC4meC4l+C4teC5iOC4reC4reC4 geC4iOC4suC4geC4iOC4uOC4lOC4quC4seC4h+C5gOC4geC4lQrguIvguLbg uYjguIfguKvguKHguLLguKLguITguKfguLLguKHguKfguYjguLLguYDguK3g uIHguKDguJ7guIHguLPguKXguLHguIfguILguKLguLLguKLguJXguLHguKcg 4Lii4Li04LmI4LiH4LiV4Liz4LmB4Lir4LiZ4LmI4LiH4LiU4Liy4Lij4Liy 4LiI4Lix4LiB4Lij4LmE4LiB4Lil4Lii4Li04LmI4LiH4LiC4Li24LmJ4LiZ CuC4hOC4p+C4suC4oeC5gOC4o+C5h+C4p+C4m+C4o+C4suC4geC4j+C4geC5 h+C4ouC4tOC5iOC4h+C5gOC4nuC4tOC5iOC4oeC4oeC4suC4geC4guC4tuC5 ieC4mSDguKvguLLguIHguYDguK3guIHguKDguJ7guYPguJnguJvguLHguIjg 
uIjguLjguJrguLHguJnguIHguLPguKXguLHguIfguILguKLguLLguKLguJXg uLHguKcg4LmB4Liq4LiU4LiH4Lin4LmI4Liy4LiB4LmI4Lit4LiZ4Lir4LiZ 4LmJ4Liy4LiZ4Li14LmJCuC5gOC4reC4geC4oOC4nuC4ouC5iOC4reC4oeC4 oeC4teC4guC4meC4suC4lOC5gOC4peC5h+C4geC4geC4p+C5iOC4siDguKvg uJnguLLguYHguJnguYjguJnguIHguKfguYjguLIg4LmB4Lil4Liw4Lij4LmJ 4Lit4LiZ4LiB4Lin4LmI4Liy4LiX4Li14LmI4LmA4Lib4LmH4LiZ4Lit4Lii 4Li54LmICuC5geC4meC4p+C4hOC4tOC4lOC4meC4teC5ieC4oeC4teC4geC4 suC4o+C4nuC4tOC4iOC4suC4o+C4k+C4suC4reC4ouC5iOC4suC4h+C4peC4 sOC5gOC4reC4teC4ouC4lOC4ouC5ieC4reC4meC5hOC4m+C4iOC4meC4luC4 tuC4h+C4o+C4sOC4lOC4seC4muC4hOC4p+C4suC4oeC4q+C4meC4suC5geC4 meC5iOC4meC5geC4peC4sOC4reC4uOC4k+C4q+C4oOC4ueC4oeC4tOC4l+C4 teC5iOC4iOC4uOC4lOC4quC4ueC4h+C4quC4uOC4lArguYHguKXguLDguJzg uKXguKrguKPguLjguJvguJfguLXguYjguYTguJTguYnguIHguYfguKrguK3g uJTguITguKXguYnguK3guIfguK3guKLguYjguLLguIfguKLguLTguYjguIfg uIHguLHguJrguJzguKXguIjguLLguIHguIHguLLguKPguKrguLHguIfguYDg uIHguJXguIHguLLguKPguJPguYwK4LiX4Lin4LmI4Liy4LiB4Liy4Lij4LmA 4Lie4Li04LmI4Lih4LiC4Lit4LiH4Lit4Lix4LiV4Lij4Liy4LmA4Lij4LmI 4LiH4Lih4Li14LiC4LmJ4Lit4LiI4Liz4LiB4Lix4LiU4LmD4LiZ4LiB4Liy 4Lij4LiV4Lij4Lin4LiI4Liq4Lit4Lia4Liq4Lig4Liy4Lin4Liw4Lie4Lil 4Lix4LiH4LiH4Liy4LiZ4LiX4Li14LmI4Liq4Li54LiH4LiC4LiZ4Liy4LiU 4LiZ4Lix4LmJ4LiZCuC4q+C4suC4geC5hOC4oeC5iOC4oeC4teC4guC5ieC4 reC4oeC4ueC4peC4reC4t+C5iOC4meC4l+C4teC5iOC4iuC5iOC4p+C4ouC4 ouC4t+C4meC4ouC4seC4meC4quC4oOC4suC4p+C4sOC5gOC4o+C4tOC5iOC4 oeC4leC5ieC4meC4iuC4seC5iOC4p+C4guC4k+C4sOC4geC5iOC4reC4meC4 geC4suC4o+C4o+C4sOC5gOC4muC4tOC4lArguKXguLPguJ7guLHguIfguJfg uKTguKnguI7guLXguJrguLTguIHguYHguJrguIfguIHguYfguKLguLHguIfg uYTguKHguYjguKrguLLguKHguLLguKPguJbguYPguIrguYnguK3guJjguLTg uJrguLLguKLguKrguKDguLLguKfguLDguYDguKPguLTguYjguKHguJXguYng uJnguYTguJTguYkK4Lih4Lix4LiZ4LmA4Lie4Li14Lii4LiH4Lit4LiY4Li0 4Lia4Liy4Lii4LiB4Lij4Liw4Lia4Lin4LiZ4LiB4Liy4Lij4LmA4Lib4Lil 4Li14LmI4Lii4LiZ4LmB4Lib4Lil4LiH4LiC4Lit4LiH4LmA4Lit4LiB4Lig 4Lie4LiX4Li14LmI4LmA4LiB4Li04LiU4LiC4Li24LmJ4LiZ4Lir4Lil4Lix 
4LiH4LiI4Liy4LiB4Liq4Lig4Liy4Lin4Liw4LmA4Lij4Li04LmI4Lih4LiV 4LmJ4LiZ4LmA4LiX4LmI4Liy4LiZ4Lix4LmJ4LiZCijguK3guYjguLLguJng uJXguYjguK0uLi4pCgo=
--------------030900000606020402020804--