From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Subject: Re: [PATCH] convert braced initializers to strings (PR 71625)
To: Jason Merrill
Cc: Gcc Patch List, Joseph Myers
From: Martin Sebor
Date: Tue, 07 Aug 2018 23:04:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------434342E14FC3E65EC9871754"
X-IsSubscribed: yes
X-SW-Source: 2018-08/txt/msg00538.txt.bz2

This is a multi-part message in MIME format.
--------------434342E14FC3E65EC9871754
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 3011

On 08/07/2018 02:57 AM, Jason Merrill wrote:
> On Wed, Aug 1, 2018 at 12:49 AM, Martin Sebor wrote:
>> On 07/31/2018 07:38 AM, Jason Merrill wrote:
>>>
>>> On Tue, Jul 31, 2018 at 9:51 AM, Martin Sebor wrote:
>>>>
>>>> The middle-end contains code to determine the lengths of constant
>>>> character arrays initialized by string literals.  The code is used
>>>> in a number of optimizations and warnings.
>>>>
>>>> However, the code is unable to deal with constant arrays initialized
>>>> using the braced initializer syntax, as in
>>>>
>>>>   const char a[] = { '1', '2', '\0' };
>>>>
>>>> The attached patch extends the C and C++ front-ends to convert such
>>>> initializers into a STRING_CST form.
>>>>
>>>> The goal of this work is to both enable existing optimizations for
>>>> such arrays, and to help detect bugs due to using non-nul terminated
>>>> arrays where nul-terminated strings are expected.  The latter is
>>>> an extension of the GCC 8 -Wstringop-overflow and
>>>> -Wstringop-truncation warnings that help detect or prevent reading
>>>> past the end of dynamically created character arrays.  Future work
>>>> includes detecting potential past-the-end reads from uninitialized
>>>> local character arrays.
>>>
>>>
>>>>   && TYPE_MAIN_VARIANT (TREE_TYPE (valtype)) == char_type_node)
>>>
>>>
>>> Why?  Don't we want this for other character types as well?
>>
>> It suppresses narrowing warnings for things like
>>
>>     signed char a[] = { 0xff };
>>
>> (there are a couple of tests that exercise this).
>
> Why is plain char different in this respect?  Presumably one of
>
> char a[] = { -1 };
> char b[] = { 0xff };
>
> should give the same narrowing warning, depending on whether char is signed.

Right.  I've added more tests to verify that it does.

>> At the same time, STRING_CST is supposed to be able to represent
>> strings of any integer type so there should be a way to make it
>> work.  On the flip side, recent discussions of changes in this
>> area suggest there may be bugs in the wide character handling of
>> STRING_CST so those would need to be fixed before relying on it
>> for robust support.
>>
>> In any case, if you have a suggestion for how to make it work for
>> at least the narrow character types I'll adjust the patch.
>
> I suppose braced_list_to_string should call check_narrowing for C++.

I see.  I've made that change.  That has made it possible to
convert arrays of all character types.  Thanks!

> Currently it uses tree_fits_shwi_p (signed host_wide_int) and then
> stores the extracted value in a host unsigned int, which is then
> converted to host char.  Does the right thing happen for -fsigned-char
> or targets with a different character set?

I believe so.  I've added tests for these too (ASCII and EBCDIC)
and also changed the type of the extracted value to HWI to match
(it doesn't change the results of the tests).

Attached is an updated patch with these changes plus more tests
as suggested by Joseph.

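For readers skimming the thread, here is a minimal sketch in plain C of the two initialization styles under discussion. It is not part of the patch: the array names and the helpers `same_object_representation`, `braced_length` and `check_examples` are hypothetical and only illustrate that a nul-terminated braced initializer denotes the same object as the corresponding string literal, while a braced initializer without a nul is not a string at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Braced initializer with an explicit terminating nul: the form the
   patch aims to represent as a STRING_CST.  */
static const char braced[] = { '1', '2', '\0' };

/* The equivalent string literal initializer.  */
static const char literal[] = "12";

/* Braced initializer without a terminating nul: a valid array, but
   not a string, so it must never be passed to strlen.  */
static const char unterminated[] = { '1', '2', '3' };

/* Hypothetical helper: nonzero if both styles yield arrays of the
   same size and contents.  */
static int same_object_representation (void)
{
  return sizeof braced == sizeof literal
         && memcmp (braced, literal, sizeof braced) == 0;
}

/* Hypothetical helper: length of the braced array viewed as a string.  */
static size_t braced_length (void)
{
  return strlen (braced);
}

/* Hypothetical helper: size in bytes of the unterminated array.  */
static size_t unterminated_size (void)
{
  return sizeof unterminated;
}

/* Hypothetical helper: checks the properties described above.  */
static void check_examples (void)
{
  assert (same_object_representation ());
  assert (braced_length () == 2);
  assert (unterminated_size () == 3);
}
```

Calling `check_examples ()` from a test driver exercises all three properties. Whether a given compiler folds the `strlen` call at compile time depends on its version and options, so the sketch makes no claim about that.
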
Martin

--------------434342E14FC3E65EC9871754
Content-Type: text/x-patch;
 name="gcc-71625.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="gcc-71625.diff"
Content-length: 22717

PR tree-optimization/71625 - missing strlen optimization on different array initialization style

gcc/c/ChangeLog:

	PR tree-optimization/71625
	* c-parser.c (c_parser_declaration_or_fndef): Call
	braced_list_to_string.

gcc/c-family/ChangeLog:

	PR tree-optimization/71625
	* c-common.c (braced_list_to_string): New function.
	* c-common.h (braced_list_to_string): Declare it.

gcc/cp/ChangeLog:

	PR tree-optimization/71625
	* parser.c (cp_parser_init_declarator): Call braced_list_to_string.
	(eval_check_narrowing): New function.

gcc/testsuite/ChangeLog:

	PR tree-optimization/71625
	* g++.dg/init/string2.C: New test.
	* g++.dg/init/string3.C: New test.
	* gcc.dg/strlenopt-55.c: New test.
	* gcc.dg/strlenopt-56.c: New test.

Index: gcc/c/c-parser.c
===================================================================
--- gcc/c/c-parser.c	(revision 263372)
+++ gcc/c/c-parser.c	(working copy)
@@ -2126,6 +2126,15 @@ c_parser_declaration_or_fndef (c_parser *parser, b
               if (d != error_mark_node)
                 {
                   maybe_warn_string_init (init_loc, TREE_TYPE (d), init);
+
+                  /* Try to convert a string CONSTRUCTOR into a STRING_CST.  */
+                  tree valtype = TREE_TYPE (init.value);
+                  if (TREE_CODE (init.value) == CONSTRUCTOR
+                      && TREE_CODE (valtype) == ARRAY_TYPE
+                      && TYPE_STRING_FLAG (TREE_TYPE (valtype)))
+                    if (tree str = braced_list_to_string (valtype, init.value))
+                      init.value = str;
+
                   finish_decl (d, init_loc, init.value,
                                init.original_type, asm_name);
                 }
Index: gcc/c-family/c-common.c
===================================================================
--- gcc/c-family/c-common.c	(revision 263372)
+++ gcc/c-family/c-common.c	(working copy)
@@ -8509,4 +8509,84 @@ maybe_add_include_fixit (rich_location *richloc, c
   free (text);
 }
 
+/* Attempt to convert a braced array initializer list CTOR for array
+   TYPE into a STRING_CST for convenience and efficiency.  When non-null,
+   use EVAL to attempt to evaluate constants (used by C++).  Return
+   the converted string on success or null on failure.  */
+
+tree
+braced_list_to_string (tree type, tree ctor, tree (*eval)(tree, tree))
+{
+  /* If the array has an explicit bound, use it to constrain the size
+     of the string.  If it doesn't, be sure to create a string that's
+     as long as implied by the index of the last zero specified via
+     a designator, as in:
+       const char a[] = { [7] = 0 };  */
+  unsigned HOST_WIDE_INT maxelts = HOST_WIDE_INT_M1U;
+  if (tree nelts = TYPE_SIZE_UNIT (type))
+    if (tree_fits_uhwi_p (nelts))
+      {
+        maxelts = tree_to_uhwi (nelts);
+        maxelts /= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (type)));
+      }
+
+  unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor);
+  tree eltype = TREE_TYPE (type);
+
+  auto_vec<char> str;
+  str.reserve (nelts + 1);
+
+  unsigned HOST_WIDE_INT i;
+  tree index, value;
+
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), i, index, value)
+    {
+      unsigned HOST_WIDE_INT idx = index ? tree_to_uhwi (index) : i;
+
+      /* auto_vec is limited to UINT_MAX elements.  */
+      if (idx > UINT_MAX)
+        return NULL_TREE;
+
+      /* Attempt to evaluate constants.  */
+      if (eval)
+        value = eval (eltype, value);
+
+      /* Avoid non-constant initializers.  */
+      if (!tree_fits_shwi_p (value))
+        return NULL_TREE;
+
+      /* Skip over embedded nuls.  */
+      HOST_WIDE_INT val = tree_to_shwi (value);
+      if (!val && i + 1 < nelts)
+        continue;
+
+      /* Bail if the CTOR has a block of more than 256 embedded nuls
+         due to implicitly initialized elements.  */
+      unsigned nelts = (idx - str.length ()) + 1;
+      if (nelts > 256)
+        return NULL_TREE;
+
+      if (nelts > 1)
+        {
+          str.reserve (idx);
+          str.quick_grow_cleared (idx);
+        }
+
+      if (idx > maxelts)
+        return NULL_TREE;
+
+      str.safe_insert (idx, val);
+    }
+
+  if (!nelts || str.length () < i)
+    /* Append a nul for the empty initializer { } and for the last
+       explicit initializer in the loop above that is a nul.  */
+    str.safe_push (0);
+
+  /* Build a string literal but return the embedded STRING_CST.  */
+  tree res = build_string_literal (str.length (), str.begin ());
+  res = TREE_OPERAND (TREE_OPERAND (res, 0), 0);
+  return res;
+}
+
 #include "gt-c-family-c-common.h"
Index: gcc/c-family/c-common.h
===================================================================
--- gcc/c-family/c-common.h	(revision 263372)
+++ gcc/c-family/c-common.h	(working copy)
@@ -1331,6 +1331,7 @@ extern void maybe_add_include_fixit (rich_location
 extern void maybe_suggest_missing_token_insertion (rich_location *richloc,
                                                    enum cpp_ttype token_type,
                                                    location_t prev_token_loc);
+extern tree braced_list_to_string (tree, tree, tree (*)(tree, tree) = NULL);
 
 #if CHECKING_P
 namespace selftest {
Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c	(revision 263372)
+++ gcc/cp/parser.c	(working copy)
@@ -19419,6 +19419,30 @@ strip_declarator_types (tree type, cp_declarator *
   return type;
 }
 
+/* Attempt to determine the constant VALUE of integral type and convert
+   it to TYPE, issuing narrowing warnings/errors as necessary.  Return
+   the constant result or null on failure.  Callback for
+   braced_list_to_string.  */
+
+static tree
+eval_check_narrowing (tree type, tree value)
+{
+  if (tree valtype = TREE_TYPE (value))
+    {
+      if (TREE_CODE (valtype) != INTEGER_TYPE)
+        return NULL_TREE;
+    }
+  else
+    return NULL_TREE;
+
+  value = scalar_constant_value (value);
+  if (!value)
+    return NULL_TREE;
+
+  check_narrowing (type, value, tf_warning_or_error);
+  return value;
+}
+
 /* Declarators [gram.dcl.decl] */
 
 /* Parse an init-declarator.
@@ -19825,6 +19849,18 @@ cp_parser_init_declarator (cp_parser* parser,
             finish_lambda_scope ();
           if (initializer == error_mark_node)
             cp_parser_skip_to_end_of_statement (parser);
+          else if (decl)
+            {
+              /* Try to convert a string CONSTRUCTOR into a STRING_CST.  */
+              tree valtype = TREE_TYPE (decl);
+              if (TREE_CODE (valtype) == ARRAY_TYPE
+                  && TYPE_STRING_FLAG (TREE_TYPE (valtype))
+                  && TREE_CODE (initializer) == CONSTRUCTOR
+                  && TREE_TYPE (initializer) == init_list_type_node)
+                if (tree str = braced_list_to_string (valtype, initializer,
+                                                      eval_check_narrowing))
+                  initializer = str;
+            }
         }
     }
 
Index: gcc/testsuite/g++.dg/init/string2.C
===================================================================
--- gcc/testsuite/g++.dg/init/string2.C	(nonexistent)
+++ gcc/testsuite/g++.dg/init/string2.C	(working copy)
@@ -0,0 +1,85 @@
+// PR tree-optimization/71625 - missing strlen optimization on different
+// array initialization style
+//
+// Verify that strlen() calls with constant character array arguments
+// initialized with string constants are folded.  (This is a small
+// subset of pr63989).
+// { dg-do compile }
+// { dg-options "-O0 -Wno-error=narrowing -fdump-tree-gimple" }
+
+#define A(expr) do { typedef char A[-1 + 2 * !!(expr)]; } while (0)
+
+const char a0[] = { 'a', 'b', 'c', '\0' };
+
+int len0 ()
+{
+  return __builtin_strlen (a0);
+}
+
+// Verify that narrowing warnings are preserved.
+const signed char
+sa0[] = { 'a', 'b', 255, '\0' };    // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
+
+int lens0 ()
+{
+  return __builtin_strlen ((const char*)sa0);
+}
+
+const unsigned char
+ua0[] = { 'a', 'b', -1, '\0' };    // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
+
+int lenu0 ()
+{
+  return __builtin_strlen ((const char*)ua0);
+}
+
+const char c = 0;
+const char a1[] = { 'a', 'b', 'c', c };
+
+int len1 ()
+{
+  return __builtin_strlen (a1);
+}
+
+const wchar_t ws4[] = { 1, 2, 3, 4 };
+const wchar_t ws7[] = { 1, 2, 3, 4, 0, 0, 0 };
+const wchar_t ws9[9] = { 1, 2, 3, 4, 0 };
+
+void wsize ()
+{
+  A (sizeof ws4 == 4 * sizeof *ws4);
+  A (ws4[0] == 1 && ws4[1] == 2 && ws4[2] == 3 && ws4[3] == 4);
+
+  A (sizeof ws7 == 7 * sizeof *ws7);
+  A (ws7[0] == 1 && ws7[1] == 2 && ws7[2] == 3 && ws7[4] == 4
+     && !ws7[5] && !ws7[6]);
+
+  A (sizeof ws9 == 9 * sizeof *ws9);
+  A (ws9[0] == 1 && ws9[1] == 2 && ws9[2] == 3 && ws9[4] == 4
+     && !ws9[5] && !ws9[6] && !ws9[7] && !ws9[8]);
+}
+
+#if 0
+
+// The following aren't handled.
+
+const char &cref = c;
+const char a2[] = { 'a', 'b', 'c', cref };
+
+int len2 ()
+{
+  return __builtin_strlen (a2);
+}
+
+
+const char* const cptr = &cref;
+const char a3[] = { 'a', 'b', 'c', *cptr };
+
+int len3 ()
+{
+  return __builtin_strlen (a3);
+}
+
+#endif
+
+// { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
Index: gcc/testsuite/g++.dg/init/string3.C
===================================================================
--- gcc/testsuite/g++.dg/init/string3.C	(nonexistent)
+++ gcc/testsuite/g++.dg/init/string3.C	(working copy)
@@ -0,0 +1,36 @@
+// PR tree-optimization/71625 - missing strlen optimization on different
+// array initialization style
+//
+// Verify that strlen() call with a constant character array argument
+// initialized with non-constant elements isn't folded.  (This is a small
+// subset of pr63989).
+//
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-optimized" }
+
+
+extern const char c;
+const char a0[] = { 'a', 'b', 'c', c };
+
+int len0 ()
+{
+  return __builtin_strlen (a0);
+}
+
+const char &ref = c;
+const char a1[] = { 'a', 'b', 'c', ref };
+
+int len1 ()
+{
+  return __builtin_strlen (a1);
+}
+
+const char* const ptr = &c;
+const char a2[] = { 'a', 'b', 'c', *ptr };
+
+int len2 ()
+{
+  return __builtin_strlen (a2);
+}
+
+// { dg-final { scan-tree-dump-times "strlen" 3 "optimized" } }
Index: gcc/testsuite/gcc.dg/strlenopt-55.c
===================================================================
--- gcc/testsuite/gcc.dg/strlenopt-55.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/strlenopt-55.c	(working copy)
@@ -0,0 +1,230 @@
+/* PR tree-optimization/71625 - missing strlen optimization on different
+   array initialization style
+
+   Verify that strlen() of braced initialized array is folded
+   { dg-do compile }
+   { dg-options "-O1 -Wall -fdump-tree-gimple -fdump-tree-optimized" } */
+
+#include "strlenopt.h"
+
+#define S \
+  "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
+  "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
+  "\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f" \
+  "\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f" \
+  "\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" \
+  "\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f" \
+  "\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f" \
+  "\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f" \
+  "\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f" \
+  "\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f" \
+  "\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf" \
+  "\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf" \
+  "\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf" \
+  "\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf" \
+  "\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef" \
+  "\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
+
+/* Arrays of char, signed char, and unsigned char to verify that
+   the length and contents of all are the same as that of the string
+   literal above.  */
+
+const char c256[] = {
+  S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10],
+  S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20],
+  S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30],
+  S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40],
+  S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50],
+  S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60],
+  S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70],
+  S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80],
+  S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90],
+  S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100],
+  S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109],
+  S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118],
+  S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127],
+  S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136],
+  S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145],
+  S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154],
+  S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163],
+  S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172],
+  S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181],
+  S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190],
+  S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199],
+  S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208],
+  S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217],
+  S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226],
+  S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235],
+  S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244],
+  S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253],
+  S[254], S[255] /* = NUL */
+};
+
+const signed char sc256[] = {
+  S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10],
+  S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20],
+  S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30],
+  S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40],
+  S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50],
+  S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60],
+  S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70],
+  S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80],
+  S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90],
+  S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100],
+  S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109],
+  S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118],
+  S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127],
+  S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136],
+  S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145],
+  S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154],
+  S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163],
+  S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172],
+  S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181],
+  S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190],
+  S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199],
+  S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208],
+  S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217],
+  S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226],
+  S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235],
+  S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244],
+  S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253],
+  S[254], S[255] /* = NUL */
+};
+
+const unsigned char uc256[] = {
+  S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10],
+  S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20],
+  S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30],
+  S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40],
+  S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50],
+  S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60],
+  S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70],
+  S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80],
+  S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90],
+  S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100],
+  S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109],
+  S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118],
+  S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127],
+  S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136],
+  S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145],
+  S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154],
+  S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163],
+  S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172],
+  S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181],
+  S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190],
+  S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199],
+  S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208],
+  S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217],
+  S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226],
+  S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235],
+  S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244],
+  S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253],
+  S[254], S[255] /* = NUL */
+};
+
+const __CHAR16_TYPE__ c16_4[] = {
+  1, 0x7fff, 0x8000, 0xffff,
+  0x10000   /* { dg-warning "\\\[-Woverflow]" } */
+};
+
+const char a2_implicit[2] = { };
+const char a3_implicit[3] = { };
+
+const char a3_nul[3] = { 0 };
+const char a5_nul1[3] = { [1] = 0 };
+const char a7_nul2[3] = { [2] = 0 };
+
+const char ax_2_nul[] = { '1', '2', '\0' };
+const char ax_3_nul[] = { '1', '2', '3', '\0' };
+
+const char ax_3_des_nul[] = { [3] = 0, [2] = '3', [1] = '2', [0] = '1' };
+
+const char ax_3[] = { '1', '2', '3' };
+const char a3_3[3] = { '1', '2', '3' };
+
+const char ax_100_3[] = { '1', '2', '3', [100] = '\0' };
+
+#define CONCAT(x, y) x ## y
+#define CAT(x, y) CONCAT (x, y)
+#define FAILNAME(name) CAT (call_ ## name ##_on_line_, __LINE__)
+
+#define FAIL(name) do { \
+    extern void FAILNAME (name) (void); \
+    FAILNAME (name)(); \
+  } while (0)
+
+/* Macro to emit a call to function named
+   call_in_true_branch_not_eliminated_on_line_NNN()
+   for each call that's expected to be eliminated.  The dg-final
+   scan-tree-dump-times directive at the bottom of the test verifies
+   that no such call appears in output.  */
+#define ELIM(expr) \
+  if (!(expr)) FAIL (in_true_branch_not_eliminated); else (void)0
+
+#define T(s, n) ELIM (strlen (s) == n)
+
+void test_nulstring (void)
+{
+  T (a2_implicit, 0);
+  T (a3_implicit, 0);
+
+  T (a3_nul, 0);
+  T (a5_nul1, 0);
+  T (a7_nul2, 0);
+
+  T (ax_2_nul, 2);
+  T (ax_3_nul, 3);
+  T (ax_3_des_nul, 3);
+
+  T (ax_100_3, 3);
+  T (ax_100_3 + 4, 0);
+  ELIM (101 == sizeof ax_100_3);
+  ELIM ('\0' == ax_100_3[100]);
+
+  /* Verify that all three character arrays have the same length
+     as the string literal they are initialized with.  */
+  T (S, 255);
+  T (c256, 255);
+  T ((const char*)sc256, 255);
+  T ((const char*)uc256, 255);
+
+  /* Verify that all three character arrays have the same contents
+     as the string literal they are initialized with.  */
+  ELIM (0 == memcmp (c256, S, sizeof c256));
+  ELIM (0 == memcmp (c256, (const char*)sc256, sizeof c256));
+  ELIM (0 == memcmp (c256, (const char*)uc256, sizeof c256));
+
+  ELIM (0 == strcmp (c256, (const char*)sc256));
+  ELIM (0 == strcmp (c256, (const char*)uc256));
+
+  /* Verify that the char16_t array has the expected contents.  */
+  ELIM (c16_4[0] == 1 && c16_4[1] == 0x7fff
+        && c16_4[2] == 0x8000 && c16_4[3] == 0xffff
+        && c16_4[4] == 0);
+}
+
+/* Verify that excessively large initializers don't run out of
+   memory.  Also verify that they have the expected size and
+   contents.  */
+
+#define MAX (__PTRDIFF_MAX__ - 1)
+
+const char large_string[] = { 'a', [1234] = 'b', [MAX] = '\0' };
+
+const void test_large_string_size (void)
+{
+  ELIM (sizeof large_string == MAX + 1);
+
+  /* The following expressions are not folded without optimization.  */
+  ELIM ('a' == large_string[0]);
+  ELIM ('\0' == large_string[1233]);
+  ELIM ('b' == large_string[1234]);
+  ELIM ('\0' == large_string[1235]);
+  ELIM ('\0' == large_string[MAX - 1]);
+}
+
+
+/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "memcmp" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "strcmp" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "call_in_true_branch_not_eliminated" 0 "optimized" } } */
Index: gcc/testsuite/gcc.dg/strlenopt-56.c
===================================================================
--- gcc/testsuite/gcc.dg/strlenopt-56.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/strlenopt-56.c	(working copy)
@@ -0,0 +1,50 @@
+/* PR tree-optimization/71625 - conversion of braced initializers to strings
+   Verify that array elements have the expected values regardless of sign
+   and non-ASCII execution character set.
+   { dg-do compile }
+   { dg-require-iconv "IBM1047" }
+   { dg-options "-O -Wall -fexec-charset=IBM1047 -fdump-tree-gimple -fdump-tree-optimized" } */
+
+#include "strlenopt.h"
+
+const char a[] = { 'a', 129, 0 };
+const signed char b[] = { 'b', 130, 0 };
+const unsigned char c[] = { 'c', 131, 0 };
+
+const char s[] = "a\201";
+const signed char ss[] = "b\202";
+const unsigned char us[] = "c\203";
+
+
+#define A(expr) ((expr) ? (void)0 : __builtin_abort ())
+
+void test_values (void)
+{
+  A (a[0] == a[1]);
+  A (a[1] == 'a');
+
+  A (b[0] == b[1]);
+  A (b[1] == (signed char)'b');
+
+  A (c[0] == c[1]);
+  A (c[1] == (unsigned char)'c');
+}
+
+void test_lengths (void)
+{
+  A (2 == strlen (a));
+  A (2 == strlen ((const char*)b));
+  A (2 == strlen ((const char*)c));
+}
+
+void test_contents (void)
+{
+  A (0 == strcmp (a, s));
+  A (0 == strcmp ((const char*)b, (const char*)ss));
+  A (0 == strcmp ((const char*)c, (const char*)us));
+}
+
+
+/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "strcmp" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "abort" 0 "optimized" } } */

--------------434342E14FC3E65EC9871754--