From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 33BF4385801D for ; Tue, 2 Nov 2021 02:41:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 33BF4385801D Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-292-UHnucg8_Nb6g4VBInvjOsA-1; Mon, 01 Nov 2021 22:40:41 -0400 X-MC-Unique: UHnucg8_Nb6g4VBInvjOsA-1 Received: by mail-qt1-f200.google.com with SMTP id 90-20020aed3163000000b002a6bd958077so13492764qtg.6 for ; Mon, 01 Nov 2021 19:40:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:references:from:organization:in-reply-to :content-transfer-encoding; bh=DSK66ISxBsXKgjUvTkZVDJugDX2uG8+du4vHUG8dQVA=; b=JS4PlYw4sb/hBeT5Zbbkd8WxLlvpce3gLWcUFxCwlyk10zUj5LJu6mVzDNqfySRelZ IEBSU8fexK03ku/Jlmy7yXjHjpl4GqyksXf6QyiUklQTxBlCgdPwK3tykyf/Oyt5cgPa yISRllgfJYx//v/s+b3kO5/xkXwjuUecMlxnBlJBggTf6WA8y6l5s+cunplIHzlTYNwo wC7Uk2Pop7eByC+0Fc4FPzClyYgKCSPSSCGWn6OvVEVbJF155Hn1a7ftbjVA/byat0RF 5GzY4dScV7lMIQ6CdP+klSnSLnVAXgm9nJl2vACwjKw0K7ojAikr7qlFxOFYRNb/L5Kq hx0g== X-Gm-Message-State: AOAM530uODXYv+o4MtPmWbFCla2Q3J9FTnQeCGZowOGgNvvT7aIy0PZA rlRcpvbFNn3eS7F5+OyGXJqK0VM+wk/beeAjuLg3FV4GQs6F6gBLvcsiL2N6RbfAbApt0+BtzHM zBQkRB7qaxCqbJCAVQzl6 X-Received: by 2002:ad4:5d66:: with SMTP id fn6mr33308277qvb.15.1635820840862; Mon, 01 Nov 2021 19:40:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzOL0TyDVymTodKo/A5AOPGVBzPfIJ3xRfiCmEIODpvQ2Y0F/yXT/YJHg/WuwYE7wjobv6qNQ== X-Received: by 2002:ad4:5d66:: with SMTP id fn6mr33308224qvb.15.1635820840320; Mon, 01 Nov 2021 19:40:40 -0700 (PDT) Received: from [192.168.1.16] (198-84-214-74.cpe.teksavvy.com. [198.84.214.74]) by smtp.gmail.com with ESMTPSA id v15sm12027319qtx.54.2021.11.01.19.40.39 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 01 Nov 2021 19:40:39 -0700 (PDT) Message-ID: <5a1ca090-1877-a63d-a83f-451696429964@redhat.com> Date: Mon, 1 Nov 2021 22:40:38 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.0 Subject: Re: [PATCH] regex: Unnest nested functions in regcomp.c To: Fangrui Song , libc-alpha@sourceware.org, Paul Eggert References: <20211027052959.2549214-1-maskray@google.com> From: Carlos O'Donell Organization: Red Hat In-Reply-To: <20211027052959.2549214-1-maskray@google.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_INFOUSMEBIZ, KAM_STOCKGEN, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Nov 2021 02:41:03 -0000 On 10/27/21 01:29, Fangrui Song via Libc-alpha wrote: > If the non-buildable `sysdeps/generic/dl-machine.h` doesn't count, > this patch removes the last `auto inline` usage and makes the file > buildable by Clang once we can resolve the "asm label after first use" > issue[1]. > > collseqwc, table_size, symb_table, extra are now initialized to appease > GCC -Werror=maybe-uninitialized false positive. High-level: Overall the refactor looks correct and removes a nested function which we have as a community agreed is OK. Implementation: The refactor moves four functions out of nested scope and converts them into static always_inline functions. Which overall is correct. Details: The commit message needs an update and you answered Paul's performance question. Please repost a v2 with an updated commit message indicating that the variables are now initialized to zero because that is required by the refactoring (Paul's comment). You also answered Paul's request to comment on the performance of the new code, and as far as I can tell it just adds 4 stores in the hot path. This is nothing compared to the amount of work that regcomp has to do looking up the collation weights, or other functions IMO. I'll review the v2 once posted. Review inline below. > [1]: https://maskray.me/blog/2021-10-10-when-can-glibc-be-built-with-clang#asm-label-after-first-use > --- > posix/regcomp.c | 514 +++++++++++++++++++++++++----------------------- > 1 file changed, 266 insertions(+), 248 deletions(-) > > diff --git a/posix/regcomp.c b/posix/regcomp.c > index 887e5b5068..e2b5c2807a 100644 > --- a/posix/regcomp.c > +++ b/posix/regcomp.c > @@ -2831,6 +2831,263 @@ build_collating_symbol (bitset_t sbcset, const unsigned char *name) > } > #endif /* not _LIBC */ > > +#ifdef _LIBC > +/* Local function for parse_bracket_exp used in _LIBC environment. > + Seek the collating symbol entry corresponding to NAME. > + Return the index of the symbol in the SYMB_TABLE, > + or -1 if not found. */ > + > +static inline int32_t > +__attribute__ ((always_inline)) OK. Static inline and always inline. > +seek_collating_symbol_entry (const unsigned char *name, size_t name_len, OK. Add seek_collating_symbol_entry (1/4) matches original. > + const int32_t *symb_table, int32_t table_size, > + const unsigned char *extra) > +{ > + int32_t elem; > + > + for (elem = 0; elem < table_size; elem++) > + if (symb_table[2 * elem] != 0) > + { > + int32_t idx = symb_table[2 * elem + 1]; > + /* Skip the name of collating element name. */ > + idx += 1 + extra[idx]; > + if (/* Compare the length of the name. */ > + name_len == extra[idx] > + /* Compare the name. */ > + && memcmp (name, &extra[idx + 1], name_len) == 0) > + /* Yep, this is the entry. */ > + return elem; > + } > + return -1; > +} > + > +/* Local function for parse_bracket_exp used in _LIBC environment. > + Look up the collation sequence value of BR_ELEM. > + Return the value if succeeded, UINT_MAX otherwise. */ > + > +static inline unsigned int > +__attribute__ ((always_inline)) OK. Static inline and always inline. > +lookup_collation_sequence_value (bracket_elem_t *br_elem, uint32_t nrules, OK. Add lookup_collation_sequence_value (2/4) matches original. > + const unsigned char *collseqmb, > + const char *collseqwc, int32_t table_size, > + const int32_t *symb_table, > + const unsigned char *extra) > +{ > + if (br_elem->type == SB_CHAR) > + { > + /* if (MB_CUR_MAX == 1) */ > + if (nrules == 0) > + return collseqmb[br_elem->opr.ch]; > + else > + { > + wint_t wc = __btowc (br_elem->opr.ch); > + return __collseq_table_lookup (collseqwc, wc); > + } > + } > + else if (br_elem->type == MB_CHAR) > + { > + if (nrules != 0) > + return __collseq_table_lookup (collseqwc, br_elem->opr.wch); > + } > + else if (br_elem->type == COLL_SYM) > + { > + size_t sym_name_len = strlen ((char *) br_elem->opr.name); > + if (nrules != 0) > + { > + int32_t elem, idx; > + elem = seek_collating_symbol_entry (br_elem->opr.name, > + sym_name_len, > + symb_table, table_size, > + extra); > + if (elem != -1) > + { > + /* We found the entry. */ > + idx = symb_table[2 * elem + 1]; > + /* Skip the name of collating element name. */ > + idx += 1 + extra[idx]; > + /* Skip the byte sequence of the collating element. */ > + idx += 1 + extra[idx]; > + /* Adjust for the alignment. */ > + idx = (idx + 3) & ~3; > + /* Skip the multibyte collation sequence value. */ > + idx += sizeof (unsigned int); > + /* Skip the wide char sequence of the collating element. */ > + idx += sizeof (unsigned int) * > + (1 + *(unsigned int *) (extra + idx)); > + /* Return the collation sequence value. */ > + return *(unsigned int *) (extra + idx); > + } > + else if (sym_name_len == 1) > + { > + /* No valid character. Match it as a single byte > + character. */ > + return collseqmb[br_elem->opr.name[0]]; > + } > + } > + else if (sym_name_len == 1) > + return collseqmb[br_elem->opr.name[0]]; > + } > + return UINT_MAX; > +} > + > +/* Local function for parse_bracket_exp used in _LIBC environment. > + Build the range expression which starts from START_ELEM, and ends > + at END_ELEM. The result are written to MBCSET and SBCSET. > + RANGE_ALLOC is the allocated size of mbcset->range_starts, and > + mbcset->range_ends, is a pointer argument since we may > + update it. */ > + > +static inline reg_errcode_t > +__attribute__ ((always_inline)) OK. Static inline and always inline. > +build_range_exp (bitset_t sbcset, re_charset_t *mbcset, int *range_alloc, OK. Add build_range_exp (3/4) matches original. > + bracket_elem_t *start_elem, bracket_elem_t *end_elem, > + re_dfa_t *dfa, reg_syntax_t syntax, uint32_t nrules, > + const unsigned char *collseqmb, const char *collseqwc, > + int32_t table_size, const int32_t *symb_table, > + const unsigned char *extra) > +{ > + unsigned int ch; > + uint32_t start_collseq; > + uint32_t end_collseq; > + > + /* Equivalence Classes and Character Classes can't be a range > + start/end. */ > + if (__glibc_unlikely (start_elem->type == EQUIV_CLASS > + || start_elem->type == CHAR_CLASS > + || end_elem->type == EQUIV_CLASS > + || end_elem->type == CHAR_CLASS)) > + return REG_ERANGE; > + > + /* FIXME: Implement rational ranges here, too. */ > + start_collseq = lookup_collation_sequence_value (start_elem, nrules, collseqmb, collseqwc, > + table_size, symb_table, extra); > + end_collseq = lookup_collation_sequence_value (end_elem, nrules, collseqmb, collseqwc, > + table_size, symb_table, extra); > + /* Check start/end collation sequence values. */ > + if (__glibc_unlikely (start_collseq == UINT_MAX > + || end_collseq == UINT_MAX)) > + return REG_ECOLLATE; > + if (__glibc_unlikely ((syntax & RE_NO_EMPTY_RANGES) > + && start_collseq > end_collseq)) > + return REG_ERANGE; > + > + /* Got valid collation sequence values, add them as a new entry. > + However, if we have no collation elements, and the character set > + is single byte, the single byte character set that we > + build below suffices. */ > + if (nrules > 0 || dfa->mb_cur_max > 1) > + { > + /* Check the space of the arrays. */ > + if (__glibc_unlikely (*range_alloc == mbcset->nranges)) > + { > + /* There is not enough space, need realloc. */ > + uint32_t *new_array_start; > + uint32_t *new_array_end; > + int new_nranges; > + > + /* +1 in case of mbcset->nranges is 0. */ > + new_nranges = 2 * mbcset->nranges + 1; > + new_array_start = re_realloc (mbcset->range_starts, uint32_t, > + new_nranges); > + new_array_end = re_realloc (mbcset->range_ends, uint32_t, > + new_nranges); > + > + if (__glibc_unlikely (new_array_start == NULL > + || new_array_end == NULL)) > + return REG_ESPACE; > + > + mbcset->range_starts = new_array_start; > + mbcset->range_ends = new_array_end; > + *range_alloc = new_nranges; > + } > + > + mbcset->range_starts[mbcset->nranges] = start_collseq; > + mbcset->range_ends[mbcset->nranges++] = end_collseq; > + } > + > + /* Build the table for single byte characters. */ > + for (ch = 0; ch < SBC_MAX; ch++) > + { > + uint32_t ch_collseq; > + /* if (MB_CUR_MAX == 1) */ > + if (nrules == 0) > + ch_collseq = collseqmb[ch]; > + else > + ch_collseq = __collseq_table_lookup (collseqwc, __btowc (ch)); > + if (start_collseq <= ch_collseq && ch_collseq <= end_collseq) > + bitset_set (sbcset, ch); > + } > + return REG_NOERROR; > +} > + > +/* Local function for parse_bracket_exp used in _LIBC environment. > + Build the collating element which is represented by NAME. > + The result are written to MBCSET and SBCSET. > + COLL_SYM_ALLOC is the allocated size of mbcset->coll_sym, is a > + pointer argument since we may update it. */ > + > +static inline reg_errcode_t > +__attribute__ ((always_inline)) OK. Static inline and always inline. > +build_collating_symbol (bitset_t sbcset, re_charset_t *mbcset, OK. Add build_collating_symbol (4/4), matches original. > + int *coll_sym_alloc, const unsigned char *name, > + uint32_t nrules, int32_t table_size, > + const int32_t *symb_table, const unsigned char *extra) > +{ > + int32_t elem, idx; > + size_t name_len = strlen ((const char *) name); > + if (nrules != 0) > + { > + elem = seek_collating_symbol_entry (name, name_len, symb_table, > + table_size, extra); > + if (elem != -1) > + { > + /* We found the entry. */ > + idx = symb_table[2 * elem + 1]; > + /* Skip the name of collating element name. */ > + idx += 1 + extra[idx]; > + } > + else if (name_len == 1) > + { > + /* No valid character, treat it as a normal > + character. */ > + bitset_set (sbcset, name[0]); > + return REG_NOERROR; > + } > + else > + return REG_ECOLLATE; > + > + /* Got valid collation sequence, add it as a new entry. */ > + /* Check the space of the arrays. */ > + if (__glibc_unlikely (*coll_sym_alloc == mbcset->ncoll_syms)) > + { > + /* Not enough, realloc it. */ > + /* +1 in case of mbcset->ncoll_syms is 0. */ > + int new_coll_sym_alloc = 2 * mbcset->ncoll_syms + 1; > + /* Use realloc since mbcset->coll_syms is NULL > + if *alloc == 0. */ > + int32_t *new_coll_syms = re_realloc (mbcset->coll_syms, int32_t, > + new_coll_sym_alloc); > + if (__glibc_unlikely (new_coll_syms == NULL)) > + return REG_ESPACE; > + mbcset->coll_syms = new_coll_syms; > + *coll_sym_alloc = new_coll_sym_alloc; > + } > + mbcset->coll_syms[mbcset->ncoll_syms++] = idx; > + return REG_NOERROR; > + } > + else > + { > + if (__glibc_unlikely (name_len != 1)) > + return REG_ECOLLATE; > + else > + { > + bitset_set (sbcset, name[0]); > + return REG_NOERROR; > + } > + } > +} > +#endif /* _LIBC */ > + > /* This function parse bracket expression like "[abc]", "[a-c]", > "[[.a-a.]]" etc. */ > > @@ -2840,253 +3097,11 @@ parse_bracket_exp (re_string_t *regexp, re_dfa_t *dfa, re_token_t *token, > { > #ifdef _LIBC > const unsigned char *collseqmb; > - const char *collseqwc; > + const char *collseqwc = NULL; OK. Set collseqwc to NULL. > uint32_t nrules; > - int32_t table_size; > - const int32_t *symb_table; > - const unsigned char *extra; > - > - /* Local function for parse_bracket_exp used in _LIBC environment. > - Seek the collating symbol entry corresponding to NAME. > - Return the index of the symbol in the SYMB_TABLE, > - or -1 if not found. */ > - > - auto inline int32_t > - __attribute__ ((always_inline)) > - seek_collating_symbol_entry (const unsigned char *name, size_t name_len) > - { > - int32_t elem; > - > - for (elem = 0; elem < table_size; elem++) > - if (symb_table[2 * elem] != 0) > - { > - int32_t idx = symb_table[2 * elem + 1]; > - /* Skip the name of collating element name. */ > - idx += 1 + extra[idx]; > - if (/* Compare the length of the name. */ > - name_len == extra[idx] > - /* Compare the name. */ > - && memcmp (name, &extra[idx + 1], name_len) == 0) > - /* Yep, this is the entry. */ > - return elem; > - } > - return -1; > - } > - > - /* Local function for parse_bracket_exp used in _LIBC environment. > - Look up the collation sequence value of BR_ELEM. > - Return the value if succeeded, UINT_MAX otherwise. */ > - > - auto inline unsigned int > - __attribute__ ((always_inline)) > - lookup_collation_sequence_value (bracket_elem_t *br_elem) > - { > - if (br_elem->type == SB_CHAR) > - { > - /* > - if (MB_CUR_MAX == 1) > - */ > - if (nrules == 0) > - return collseqmb[br_elem->opr.ch]; > - else > - { > - wint_t wc = __btowc (br_elem->opr.ch); > - return __collseq_table_lookup (collseqwc, wc); > - } > - } > - else if (br_elem->type == MB_CHAR) > - { > - if (nrules != 0) > - return __collseq_table_lookup (collseqwc, br_elem->opr.wch); > - } > - else if (br_elem->type == COLL_SYM) > - { > - size_t sym_name_len = strlen ((char *) br_elem->opr.name); > - if (nrules != 0) > - { > - int32_t elem, idx; > - elem = seek_collating_symbol_entry (br_elem->opr.name, > - sym_name_len); > - if (elem != -1) > - { > - /* We found the entry. */ > - idx = symb_table[2 * elem + 1]; > - /* Skip the name of collating element name. */ > - idx += 1 + extra[idx]; > - /* Skip the byte sequence of the collating element. */ > - idx += 1 + extra[idx]; > - /* Adjust for the alignment. */ > - idx = (idx + 3) & ~3; > - /* Skip the multibyte collation sequence value. */ > - idx += sizeof (unsigned int); > - /* Skip the wide char sequence of the collating element. */ > - idx += sizeof (unsigned int) * > - (1 + *(unsigned int *) (extra + idx)); > - /* Return the collation sequence value. */ > - return *(unsigned int *) (extra + idx); > - } > - else if (sym_name_len == 1) > - { > - /* No valid character. Match it as a single byte > - character. */ > - return collseqmb[br_elem->opr.name[0]]; > - } > - } > - else if (sym_name_len == 1) > - return collseqmb[br_elem->opr.name[0]]; > - } > - return UINT_MAX; > - } > - > - /* Local function for parse_bracket_exp used in _LIBC environment. > - Build the range expression which starts from START_ELEM, and ends > - at END_ELEM. The result are written to MBCSET and SBCSET. > - RANGE_ALLOC is the allocated size of mbcset->range_starts, and > - mbcset->range_ends, is a pointer argument since we may > - update it. */ > - > - auto inline reg_errcode_t > - __attribute__ ((always_inline)) > - build_range_exp (bitset_t sbcset, re_charset_t *mbcset, int *range_alloc, > - bracket_elem_t *start_elem, bracket_elem_t *end_elem) > - { > - unsigned int ch; > - uint32_t start_collseq; > - uint32_t end_collseq; > - > - /* Equivalence Classes and Character Classes can't be a range > - start/end. */ > - if (__glibc_unlikely (start_elem->type == EQUIV_CLASS > - || start_elem->type == CHAR_CLASS > - || end_elem->type == EQUIV_CLASS > - || end_elem->type == CHAR_CLASS)) > - return REG_ERANGE; > - > - /* FIXME: Implement rational ranges here, too. */ > - start_collseq = lookup_collation_sequence_value (start_elem); > - end_collseq = lookup_collation_sequence_value (end_elem); > - /* Check start/end collation sequence values. */ > - if (__glibc_unlikely (start_collseq == UINT_MAX > - || end_collseq == UINT_MAX)) > - return REG_ECOLLATE; > - if (__glibc_unlikely ((syntax & RE_NO_EMPTY_RANGES) > - && start_collseq > end_collseq)) > - return REG_ERANGE; > - > - /* Got valid collation sequence values, add them as a new entry. > - However, if we have no collation elements, and the character set > - is single byte, the single byte character set that we > - build below suffices. */ > - if (nrules > 0 || dfa->mb_cur_max > 1) > - { > - /* Check the space of the arrays. */ > - if (__glibc_unlikely (*range_alloc == mbcset->nranges)) > - { > - /* There is not enough space, need realloc. */ > - uint32_t *new_array_start; > - uint32_t *new_array_end; > - Idx new_nranges; > - > - /* +1 in case of mbcset->nranges is 0. */ > - new_nranges = 2 * mbcset->nranges + 1; > - new_array_start = re_realloc (mbcset->range_starts, uint32_t, > - new_nranges); > - new_array_end = re_realloc (mbcset->range_ends, uint32_t, > - new_nranges); > - > - if (__glibc_unlikely (new_array_start == NULL > - || new_array_end == NULL)) > - return REG_ESPACE; > - > - mbcset->range_starts = new_array_start; > - mbcset->range_ends = new_array_end; > - *range_alloc = new_nranges; > - } > - > - mbcset->range_starts[mbcset->nranges] = start_collseq; > - mbcset->range_ends[mbcset->nranges++] = end_collseq; > - } > - > - /* Build the table for single byte characters. */ > - for (ch = 0; ch < SBC_MAX; ch++) > - { > - uint32_t ch_collseq; > - /* > - if (MB_CUR_MAX == 1) > - */ > - if (nrules == 0) > - ch_collseq = collseqmb[ch]; > - else > - ch_collseq = __collseq_table_lookup (collseqwc, __btowc (ch)); > - if (start_collseq <= ch_collseq && ch_collseq <= end_collseq) > - bitset_set (sbcset, ch); > - } > - return REG_NOERROR; > - } > - > - /* Local function for parse_bracket_exp used in _LIBC environment. > - Build the collating element which is represented by NAME. > - The result are written to MBCSET and SBCSET. > - COLL_SYM_ALLOC is the allocated size of mbcset->coll_sym, is a > - pointer argument since we may update it. */ > - > - auto inline reg_errcode_t > - __attribute__ ((always_inline)) > - build_collating_symbol (bitset_t sbcset, re_charset_t *mbcset, > - Idx *coll_sym_alloc, const unsigned char *name) > - { > - int32_t elem, idx; > - size_t name_len = strlen ((const char *) name); > - if (nrules != 0) > - { > - elem = seek_collating_symbol_entry (name, name_len); > - if (elem != -1) > - { > - /* We found the entry. */ > - idx = symb_table[2 * elem + 1]; > - /* Skip the name of collating element name. */ > - idx += 1 + extra[idx]; > - } > - else if (name_len == 1) > - { > - /* No valid character, treat it as a normal > - character. */ > - bitset_set (sbcset, name[0]); > - return REG_NOERROR; > - } > - else > - return REG_ECOLLATE; > - > - /* Got valid collation sequence, add it as a new entry. */ > - /* Check the space of the arrays. */ > - if (__glibc_unlikely (*coll_sym_alloc == mbcset->ncoll_syms)) > - { > - /* Not enough, realloc it. */ > - /* +1 in case of mbcset->ncoll_syms is 0. */ > - Idx new_coll_sym_alloc = 2 * mbcset->ncoll_syms + 1; > - /* Use realloc since mbcset->coll_syms is NULL > - if *alloc == 0. */ > - int32_t *new_coll_syms = re_realloc (mbcset->coll_syms, int32_t, > - new_coll_sym_alloc); > - if (__glibc_unlikely (new_coll_syms == NULL)) > - return REG_ESPACE; > - mbcset->coll_syms = new_coll_syms; > - *coll_sym_alloc = new_coll_sym_alloc; > - } > - mbcset->coll_syms[mbcset->ncoll_syms++] = idx; > - return REG_NOERROR; > - } > - else > - { > - if (__glibc_unlikely (name_len != 1)) > - return REG_ECOLLATE; > - else > - { > - bitset_set (sbcset, name[0]); > - return REG_NOERROR; > - } > - } > - } OK. Remove 4 nested functions. > + int32_t table_size = 0; > + const int32_t *symb_table = NULL; > + const unsigned char *extra = NULL; OK. Initialize table_size, symb_table and extra. > #endif > > re_token_t br_token; > @@ -3230,7 +3245,9 @@ parse_bracket_exp (re_string_t *regexp, re_dfa_t *dfa, re_token_t *token, > > #ifdef _LIBC > *err = build_range_exp (sbcset, mbcset, &range_alloc, > - &start_elem, &end_elem); > + &start_elem, &end_elem, > + dfa, syntax, nrules, collseqmb, collseqwc, > + table_size, symb_table, extra); OK. Pass new parameters required for the function to operate. > #else > # ifdef RE_ENABLE_I18N > *err = build_range_exp (syntax, sbcset, > @@ -3283,7 +3300,8 @@ parse_bracket_exp (re_string_t *regexp, re_dfa_t *dfa, re_token_t *token, > #ifdef RE_ENABLE_I18N > mbcset, &coll_sym_alloc, > #endif /* RE_ENABLE_I18N */ > - start_elem.opr.name); > + start_elem.opr.name, > + nrules, table_size, symb_table, extra); OK. Likewise. > if (__glibc_unlikely (*err != REG_NOERROR)) > goto parse_bracket_exp_free_return; > break; > -- Cheers, Carlos.