From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 103648 invoked by alias); 9 Aug 2018 00:17:18 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 102891 invoked by uid 89); 9 Aug 2018 00:17:17 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.9 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=exposure X-HELO: mail-qk0-f194.google.com Received: from mail-qk0-f194.google.com (HELO mail-qk0-f194.google.com) (209.85.220.194) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 09 Aug 2018 00:17:12 +0000 Received: by mail-qk0-f194.google.com with SMTP id 191-v6so2857624qki.13 for ; Wed, 08 Aug 2018 17:17:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to; bh=P9kVNTNPtsneZ0ahQCSlCzp5g9yAJW3V9musTRBAAbE=; b=Hd+nDd45Zr3p8uc+AM7Y/EJTrtbebGGWwYHVAnH0x+H+gZ0vLT4/oGZndPCpLI6OfZ WuqE1B8OObxnwFF46g4+StqGW2D/OpXRxZfRweYTCwT7QQ7uuCBQA324/YWFeoCq+E7g DweePpN/n8LOTMtF5qHLBTd9+QR1OFwqR7rI5ImyGnmc2VuuXJZWQS36yPyofem6Oav/ IhIWkpHz3/juGuDkysklJ+KZtyy6AlqtFcFQSlvUSZ2YVEANFVhpom4Jbw7ez4JW/GVX STBMnCkZDwWk22ij3ddQlphI+sZ1JaOtvstDS++r5jcgKO6q9BMom9Q2dey8dm0tZ36g MJsA== Return-Path: Received: from localhost.localdomain (97-118-124-30.hlrn.qwest.net. [97.118.124.30]) by smtp.gmail.com with ESMTPSA id q51-v6sm3805019qtf.18.2018.08.08.17.17.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 08 Aug 2018 17:17:09 -0700 (PDT) Subject: Re: [PATCH] convert braced initializers to strings (PR 71625) To: Jason Merrill References: Cc: Gcc Patch List , Joseph Myers From: Martin Sebor Message-ID: Date: Thu, 09 Aug 2018 00:17:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------3D62F2456C23E814BFD6FE0B" X-IsSubscribed: yes X-SW-Source: 2018-08/txt/msg00602.txt.bz2 This is a multi-part message in MIME format. --------------3D62F2456C23E814BFD6FE0B Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-length: 4313 On 08/08/2018 05:08 AM, Jason Merrill wrote: > On Wed, Aug 8, 2018 at 9:04 AM, Martin Sebor wrote: >> On 08/07/2018 02:57 AM, Jason Merrill wrote: >>> >>> On Wed, Aug 1, 2018 at 12:49 AM, Martin Sebor wrote: >>>> >>>> On 07/31/2018 07:38 AM, Jason Merrill wrote: >>>>> >>>>> >>>>> On Tue, Jul 31, 2018 at 9:51 AM, Martin Sebor wrote: >>>>>> >>>>>> >>>>>> The middle-end contains code to determine the lengths of constant >>>>>> character arrays initialized by string literals. The code is used >>>>>> in a number of optimizations and warnings. >>>>>> >>>>>> However, the code is unable to deal with constant arrays initialized >>>>>> using the braced initializer syntax, as in >>>>>> >>>>>> const char a[] = { '1', '2', '\0' }; >>>>>> >>>>>> The attached patch extends the C and C++ front-ends to convert such >>>>>> initializers into a STRING_CST form. >>>>>> >>>>>> The goal of this work is to both enable existing optimizations for >>>>>> such arrays, and to help detect bugs due to using non-nul terminated >>>>>> arrays where nul-terminated strings are expected. The latter is >>>>>> an extension of the GCC 8 _Wstringop-overflow and >>>>>> -Wstringop-truncation warnings that help detect or prevent reading >>>>>> past the end of dynamically created character arrays. Future work >>>>>> includes detecting potential past-the-end reads from uninitialized >>>>>> local character arrays. >>>>> >>>>> >>>>> >>>>>> && TYPE_MAIN_VARIANT (TREE_TYPE (valtype)) == char_type_node) >>>>> >>>>> >>>>> >>>>> Why? Don't we want this for other character types as well? >>>> >>>> >>>> It suppresses narrowing warnings for things like >>>> >>>> signed char a[] = { 0xff }; >>>> >>>> (there are a couple of tests that exercise this). >>> >>> >>> Why is plain char different in this respect? Presumably one of >>> >>> char a[] = { -1 }; >>> char b[] = { 0xff }; >>> >>> should give the same narrowing warning, depending on whether char is >>> signed. >> >> >> Right. I've added more tests to verify that it does. >> >>>> At the same time, STRING_CST is supposed to be able to represent >>>> strings of any integer type so there should be a way to make it >>>> work. On the flip side, recent discussions of changes in this >>>> area suggest there may be bugs in the wide character handling of >>>> STRING_CST so those would need to be fixed before relying on it >>>> for robust support. >>>> >>>> In any case, if you have a suggestion for how to make it work for >>>> at least the narrow character types I'll adjust the patch. >>> >>> >>> I suppose braced_list_to_string should call check_narrowing for C++. >> >> >> I see. I've made that change. That has made it possible to >> convert arrays of all character types. Thanks! >> >>> Currently it uses tree_fits_shwi_p (signed host_wide_int) and then >>> stores the extracted value in a host unsigned int, which is then >>> converted to host char. Does the right thing happen for -fsigned-char >>> or targets with a different character set? >> >> I believe so. I've added tests for these too (ASCII and EBCDIC) >> and also changed the type of the extracted value to HWI to match >> (it doesn't change the results of the tests). >> >> Attached is an updated patch with these changes plus more tests >> as suggested by Joseph. > > Great. Can we also move the call to braced_list_to_string into > check_initializer, so it works for templates as well? As a case just > before the block that calls reshape_init seems appropriate. Done in the attached patch. I've also avoided dealing with zero-length arrays and added tests to make sure their size stays is regardless of the form of their initializer and the appropriate warnings are issued. Using build_string() rather than build_string_literal() needed a tweak in digest_init_r(). It didn't break anything but since the array type may not have a domain yet, neither will the string. It looks like that may get adjusted later on but I've temporarily guarded the code with #if 1. If the change is fine I'll remove the #if before committing. This initial patch only handles narrow character initializers (i.e., those with TYPE_STRING_FLAG set). Once this gets some exposure I'd like to extend it to other character types, including wchar_t. Martin --------------3D62F2456C23E814BFD6FE0B Content-Type: text/x-patch; name="gcc-71625.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="gcc-71625.diff" Content-length: 27489 PR tree-optimization/71625 - missing strlen optimization on different array initialization style gcc/c/ChangeLog: PR tree-optimization/71625 * c-parser.c (c_parser_declaration_or_fndef): Call braced_list_to_string. gcc/c-family/ChangeLog: PR tree-optimization/71625 * c-common.c (braced_list_to_string): New function. * c-common.h (braced_list_to_string): Declare it. gcc/cp/ChangeLog: PR tree-optimization/71625 * decl.c (check_initializer): Call braced_list_to_string. (eval_check_narrowing): New function. * gcc/cp/typeck2.c (digest_init_r): Accept strings literals as initilizers for all narrow character types. gcc/testsuite/ChangeLog: PR tree-optimization/71625 * g++.dg/init/string2.C: New test. * g++.dg/init/string3.C: New test. * g++.dg/init/string4.C: New test. * gcc.dg/init-string-3.c: New test. * gcc.dg/strlenopt-55.c: New test. * gcc.dg/strlenopt-56.c: New test. diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index d919605..b10d9c9 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -8509,4 +8509,102 @@ maybe_add_include_fixit (rich_location *richloc, const char *header) free (text); } +/* Attempt to convert a braced array initializer list CTOR for array + TYPE into a STRING_CST for convenience and efficiency. When non-null, + use EVAL to attempt to evalue constants (used by C++). Return + the converted string on success or null on failure. */ + +tree +braced_list_to_string (tree type, tree ctor, tree (*eval)(tree, tree)) +{ + unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor); + + /* If the array has an explicit bound, use it to constrain the size + of the string. If it doesn't, be sure to create a string that's + as long as implied by the index of the last zero specified via + a designator, as in: + const char a[] = { [7] = 0 }; */ + unsigned HOST_WIDE_INT maxelts = HOST_WIDE_INT_M1U; + if (tree size = TYPE_SIZE_UNIT (type)) + { + if (tree_fits_uhwi_p (size)) + { + maxelts = tree_to_uhwi (size); + maxelts /= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (type))); + + /* Avoid converting initializers for zero-length arrays. */ + if (!maxelts) + return NULL_TREE; + } + } + else if (!nelts) + /* Avoid handling the undefined/erroneous case of an empty + initializer for an arrays with unspecified bound. */ + return NULL_TREE; + + tree eltype = TREE_TYPE (type); + + auto_vec str; + str.reserve (nelts + 1); + + unsigned HOST_WIDE_INT i; + tree index, value; + + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), i, index, value) + { + unsigned HOST_WIDE_INT idx = index ? tree_to_uhwi (index) : i; + + /* auto_vec is limited to UINT_MAX elements. */ + if (idx > UINT_MAX) + return NULL_TREE; + + /* Attempt to evaluate constants. */ + if (eval) + value = eval (eltype, value); + + /* Avoid non-constant initializers. */ + if (!tree_fits_shwi_p (value)) + return NULL_TREE; + + /* Skip over embedded nuls except the last one (initializer + elements are in ascending order of indices). */ + HOST_WIDE_INT val = tree_to_shwi (value); + if (!val && i + 1 < nelts) + continue; + + /* Bail if the CTOR has a block of more than 256 embedded nuls + due to implicitly initialized elements. */ + unsigned nchars = (idx - str.length ()) + 1; + if (nchars > 256) + return NULL_TREE; + + if (nchars > 1) + { + str.reserve (idx); + str.quick_grow_cleared (idx); + } + + if (idx > maxelts) + return NULL_TREE; + + str.safe_insert (idx, val); + } + + if (!nelts) + /* Append a nul for the empty initializer { }. */ + str.safe_push (0); + +#if 1 + /* Build a STRING_CST with the same type as the array, which + may be an array of unknown bound. */ + tree res = build_string (str.length (), str.begin ()); + TREE_TYPE (res) = type; +#else + /* Build a string literal but return the embedded STRING_CST. */ + tree res = build_string_literal (str.length (), str.begin (), eltype); + res = TREE_OPERAND (TREE_OPERAND (res, 0), 0); +#endif + return res; +} + #include "gt-c-family-c-common.h" diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index fcec95b..8a802bb 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -1331,6 +1331,7 @@ extern void maybe_add_include_fixit (rich_location *, const char *); extern void maybe_suggest_missing_token_insertion (rich_location *richloc, enum cpp_ttype token_type, location_t prev_token_loc); +extern tree braced_list_to_string (tree, tree, tree (*)(tree, tree) = NULL); #if CHECKING_P namespace selftest { diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 7a92628..5ad4f57 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -2126,6 +2126,15 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok, if (d != error_mark_node) { maybe_warn_string_init (init_loc, TREE_TYPE (d), init); + + /* Try to convert a string CONSTRUCTOR into a STRING_CST. */ + tree valtype = TREE_TYPE (init.value); + if (TREE_CODE (init.value) == CONSTRUCTOR + && TREE_CODE (valtype) == ARRAY_TYPE + && TYPE_STRING_FLAG (TREE_TYPE (valtype))) + if (tree str = braced_list_to_string (valtype, init.value)) + init.value = str; + finish_decl (d, init_loc, init.value, init.original_type, asm_name); } diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 78ebbde..d2c5b5d 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -6282,6 +6282,30 @@ build_aggr_init_full_exprs (tree decl, tree init, int flags) return build_aggr_init (decl, init, flags, tf_warning_or_error); } +/* Attempt to determine the constant VALUE of integral type and convert + it to TYPE, issuing narrowing warnings/errors as necessary. Return + the constant result or null on failure. Callback for + braced_list_to_string. */ + +static tree +eval_check_narrowing (tree type, tree value) +{ + if (tree valtype = TREE_TYPE (value)) + { + if (TREE_CODE (valtype) != INTEGER_TYPE) + return NULL_TREE; + } + else + return NULL_TREE; + + value = scalar_constant_value (value); + if (!value) + return NULL_TREE; + + check_narrowing (type, value, tf_warning_or_error); + return value; +} + /* Verify INIT (the initializer for DECL), and record the initialization in DECL_INITIAL, if appropriate. CLEANUP is as for grok_reference_init. @@ -6397,7 +6421,18 @@ check_initializer (tree decl, tree init, int flags, vec **cleanups) } else { - init = reshape_init (type, init, tf_warning_or_error); + /* Try to convert a string CONSTRUCTOR into a STRING_CST. */ + tree valtype = TREE_TYPE (decl); + if (TREE_CODE (valtype) == ARRAY_TYPE + && TYPE_STRING_FLAG (TREE_TYPE (valtype)) + && TREE_CODE (init) == CONSTRUCTOR + && TREE_TYPE (init) == init_list_type_node) + if (tree str = braced_list_to_string (valtype, init, + eval_check_narrowing)) + init = str; + + if (TREE_CODE (init) != STRING_CST) + init = reshape_init (type, init, tf_warning_or_error); flags |= LOOKUP_NO_NARROWING; } } diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c index 7763d53..72515d9 100644 --- a/gcc/cp/typeck2.c +++ b/gcc/cp/typeck2.c @@ -1056,7 +1056,9 @@ digest_init_r (tree type, tree init, int nested, int flags, if (TYPE_PRECISION (typ1) == BITS_PER_UNIT) { - if (char_type != char_type_node) + if (char_type != char_type_node + && char_type != signed_char_type_node + && char_type != unsigned_char_type_node) { if (complain & tf_error) error_at (loc, "char-array initialized from wide string"); diff --git a/gcc/testsuite/g++.dg/init/string2.C b/gcc/testsuite/g++.dg/init/string2.C new file mode 100644 index 0000000..5da13bd --- /dev/null +++ b/gcc/testsuite/g++.dg/init/string2.C @@ -0,0 +1,104 @@ +// PR tree-optimization/71625 - missing strlen optimization on different +// array initialization style +// +// Verify that strlen() calls with constant character array arguments +// initialized with string constants are folded. (This is a small +// subset of pr71625). +// { dg-do compile } +// { dg-options "-O0 -Wno-error=narrowing -fdump-tree-gimple" } + +#define A(expr) do { typedef char A[-1 + 2 * !!(expr)]; } while (0) + +/* This is undefined but accepted without -Wpedantic. Verify that + the size is zero. */ +const char ax[] = { }; + +void size0 () +{ + A (sizeof ax == 0); +} + +const char a0[] = { 'a', 'b', 'c', '\0' }; + +int len0 () +{ + return __builtin_strlen (a0); +} + +// Verify that narrowing warnings are preserved. +const signed char +sa0[] = { 'a', 'b', 255, '\0' }; // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } } + +int lens0 () +{ + return __builtin_strlen ((const char*)sa0); +} + +const unsigned char +ua0[] = { 'a', 'b', -1, '\0' }; // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } } + +int lenu0 () +{ + return __builtin_strlen ((const char*)ua0); +} + +const char c = 0; +const char a1[] = { 'a', 'b', 'c', c }; + +int len1 () +{ + return __builtin_strlen (a1); +} + +template +int tmplen () +{ + static const T + a[] = { 1, 2, 333, 0 }; // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } } + return __builtin_strlen (a); +} + +template int tmplen(); + +const wchar_t ws4[] = { 1, 2, 3, 4 }; +const wchar_t ws7[] = { 1, 2, 3, 4, 0, 0, 0 }; +const wchar_t ws9[9] = { 1, 2, 3, 4, 0 }; + +void wsize () +{ + A (sizeof ws4 == 4 * sizeof *ws4); + A (ws4[0] == 1 && ws4[1] == 2 && ws4[2] == 3 && ws4[3] == 4); + + A (sizeof ws7 == 7 * sizeof *ws7); + A (ws7[0] == 1 && ws7[1] == 2 && ws7[2] == 3 && ws7[4] == 4 + && !ws7[5] && !ws7[6]); + + A (sizeof ws9 == 9 * sizeof *ws9); + A (ws9[0] == 1 && ws9[1] == 2 && ws9[2] == 3 && ws9[4] == 4 + && !ws9[5] && !ws9[6] && !ws9[7] && !ws9[8]); +} + +#if 0 + +// The following aren't handled. + +const char &cref = c; +const char a2[] = { 'a', 'b', 'c', cref }; + +int len2 () +{ + return __builtin_strlen (a2); +} + + +const char* const cptr = &cref; +const char a3[] = { 'a', 'b', 'c', *cptr }; + +int len3 () +{ + return __builtin_strlen (a3); +} + +#endif + +// { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } } diff --git a/gcc/testsuite/g++.dg/init/string3.C b/gcc/testsuite/g++.dg/init/string3.C new file mode 100644 index 0000000..8212e81 --- /dev/null +++ b/gcc/testsuite/g++.dg/init/string3.C @@ -0,0 +1,35 @@ +// PR tree-optimization/71625 - missing strlen optimization on different +// array initialization style +// +// Verify that strlen() call with a constant character array argument +// initialized with non-constant elements isn't folded. +// +// { dg-do compile } +// { dg-options "-O2 -fdump-tree-optimized" } + + +extern const char c; +const char a0[] = { 'a', 'b', 'c', c }; + +int len0 () +{ + return __builtin_strlen (a0); +} + +const char &ref = c; +const char a1[] = { 'a', 'b', 'c', ref }; + +int len1 () +{ + return __builtin_strlen (a1); +} + +const char* const ptr = &c; +const char a2[] = { 'a', 'b', 'c', *ptr }; + +int len2 () +{ + return __builtin_strlen (a2); +} + +// { dg-final { scan-tree-dump-times "strlen" 3 "optimized" } } diff --git a/gcc/testsuite/g++.dg/init/string4.C b/gcc/testsuite/g++.dg/init/string4.C new file mode 100644 index 0000000..5df4176 --- /dev/null +++ b/gcc/testsuite/g++.dg/init/string4.C @@ -0,0 +1,60 @@ +// PR tree-optimization/71625 - missing strlen optimization on different +// array initialization style + +// Verify that zero-length array initialization results in the expected +// array sizes and in the expected diagnostics. See init-string-3.c +// for the corresponding C test. + +// { dg-do compile } +// { dg-options "-Wall -Wno-unused-local-typedefs -fpermissive" } + +#define A(expr) typedef char A[-1 + 2 * !!(expr)]; + +const char a[] = { }; + +A (sizeof a == 0); + + +const char b[0] = { }; + +A (sizeof b == 0); + +// Also verify that the error is "too many initializers for +// 'const char [0]'" and not "initializer-string is too long." +const char c[0] = { 1 }; // { dg-error "too many initializers for .const char \\\[0]" } + +A (sizeof c == 0); + + +void test_auto_empty (void) +{ + const char a[] = { }; + + A (sizeof a == 0); +} + +void test_auto_zero_length (void) +{ + const char a[0] = { }; + + A (sizeof a == 0); + + const char b[0] = { 0 }; // { dg-error "too many initializers" } + + A (sizeof b == 0); + + const char c[0] = ""; // { dg-warning "too long" } + + A (sizeof c == 0); +} + + +void test_compound_zero_length (void) +{ + A (sizeof (const char[]){ } == 0); + A (sizeof (const char[0]){ } == 0); + A (sizeof (const char[0]){ 0 } == 0); // { dg-error "too many" } + A (sizeof (const char[0]){ 1 } == 0); // { dg-error "too many" } + A (sizeof (const char[0]){ "" } == 0); // { dg-warning "too long" } + A (sizeof (const char[0]){ "1" } == 0); // { dg-warning "too long" } +} diff --git a/gcc/testsuite/gcc.dg/init-string-3.c b/gcc/testsuite/gcc.dg/init-string-3.c new file mode 100644 index 0000000..e955f2e --- /dev/null +++ b/gcc/testsuite/gcc.dg/init-string-3.c @@ -0,0 +1,58 @@ +/* PR tree-optimization/71625 - missing strlen optimization on different + array initialization style + + Verify that zero-length array initialization results in the expected + array sizes. + + { dg-do compile } + { dg-options "-Wall -Wno-unused-local-typedefs" } */ + +#define A(expr) typedef char A[-1 + 2 * !!(expr)]; + +const char a[] = { }; + +A (sizeof a == 0); + + +const char b[0] = { }; + +A (sizeof b == 0); + + +const char c[0] = { 1 }; /* { dg-warning "excess elements" } */ + +A (sizeof c == 0); + + +void test_auto_empty (void) +{ + const char a[] = { }; + + A (sizeof a == 0); +} + +void test_auto_zero_length (void) +{ + const char a[0] = { }; + + A (sizeof a == 0); + + const char b[0] = { 0 }; /* { dg-warning "excess elements" } */ + + A (sizeof b == 0); + + const char c[0] = ""; + + A (sizeof c == 0); +} + + +void test_compound_zero_length (void) +{ + A (sizeof (const char[]){ } == 0); + A (sizeof (const char[0]){ } == 0); + A (sizeof (const char[0]){ 0 } == 0); /* { dg-warning "excess elements" } */ + A (sizeof (const char[0]){ 1 } == 0); /* { dg-warning "excess elements" } */ + A (sizeof (const char[0]){ "" } == 0); + A (sizeof (const char[0]){ "1" } == 0); /* { dg-warning "too long" } */ +} diff --git a/gcc/testsuite/gcc.dg/strlenopt-55.c b/gcc/testsuite/gcc.dg/strlenopt-55.c new file mode 100644 index 0000000..d5a0295 --- /dev/null +++ b/gcc/testsuite/gcc.dg/strlenopt-55.c @@ -0,0 +1,230 @@ +/* PR tree-optimization/71625 - missing strlen optimization on different + array initialization style + + Verify that strlen() of braced initialized array is folded + { dg-do compile } + { dg-options "-O1 -Wall -fdump-tree-gimple -fdump-tree-optimized" } */ + +#include "strlenopt.h" + +#define S \ + "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \ + "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \ + "\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f" \ + "\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f" \ + "\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" \ + "\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f" \ + "\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f" \ + "\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f" \ + "\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f" \ + "\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f" \ + "\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf" \ + "\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf" \ + "\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf" \ + "\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf" \ + "\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef" \ + "\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" + +/* Arrays of char, signed char, and unsigned char to verify that + the length and contents of all are the same as that of the string + literal above. */ + +const char c256[] = { + S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10], + S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20], + S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30], + S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40], + S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50], + S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60], + S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70], + S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80], + S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90], + S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100], + S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109], + S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118], + S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127], + S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136], + S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145], + S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154], + S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163], + S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172], + S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181], + S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190], + S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199], + S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208], + S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217], + S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226], + S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235], + S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244], + S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253], + S[254], S[255] /* = NUL */ +}; + +const signed char sc256[] = { + S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10], + S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20], + S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30], + S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40], + S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50], + S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60], + S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70], + S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80], + S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90], + S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100], + S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109], + S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118], + S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127], + S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136], + S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145], + S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154], + S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163], + S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172], + S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181], + S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190], + S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199], + S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208], + S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217], + S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226], + S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235], + S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244], + S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253], + S[254], S[255] /* = NUL */ +}; + +const unsigned char uc256[] = { + S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10], + S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20], + S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30], + S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40], + S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50], + S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60], + S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70], + S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80], + S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90], + S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100], + S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109], + S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118], + S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127], + S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136], + S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145], + S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154], + S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163], + S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172], + S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181], + S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190], + S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199], + S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208], + S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217], + S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226], + S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235], + S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244], + S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253], + S[254], S[255] /* = NUL */ +}; + +const __CHAR16_TYPE__ c16_4[] = { + 1, 0x7fff, 0x8000, 0xffff, + 0x10000 /* { dg-warning "\\\[-Woverflow]" } */ +}; + +const char a2_implicit[2] = { }; +const char a3_implicit[3] = { }; + +const char a3_nul[3] = { 0 }; +const char a5_nul1[3] = { [1] = 0 }; +const char a7_nul2[3] = { [2] = 0 }; + +const char ax_2_nul[] = { '1', '2', '\0' }; +const char ax_3_nul[] = { '1', '2', '3', '\0' }; + +const char ax_3_des_nul[] = { [3] = 0, [2] = '3', [1] = '2', [0] = '1' }; + +const char ax_3[] = { '1', '2', '3' }; +const char a3_3[3] = { '1', '2', '3' }; + +const char ax_100_3[] = { '1', '2', '3', [100] = '\0' }; + +#define CONCAT(x, y) x ## y +#define CAT(x, y) CONCAT (x, y) +#define FAILNAME(name) CAT (call_ ## name ##_on_line_, __LINE__) + +#define FAIL(name) do { \ + extern void FAILNAME (name) (void); \ + FAILNAME (name)(); \ + } while (0) + +/* Macro to emit a call to funcation named + call_in_true_branch_not_eliminated_on_line_NNN() + for each call that's expected to be eliminated. The dg-final + scan-tree-dump-time directive at the bottom of the test verifies + that no such call appears in output. */ +#define ELIM(expr) \ + if (!(expr)) FAIL (in_true_branch_not_eliminated); else (void)0 + +#define T(s, n) ELIM (strlen (s) == n) + +void test_nulstring (void) +{ + T (a2_implicit, 0); + T (a3_implicit, 0); + + T (a3_nul, 0); + T (a5_nul1, 0); + T (a7_nul2, 0); + + T (ax_2_nul, 2); + T (ax_3_nul, 3); + T (ax_3_des_nul, 3); + + T (ax_100_3, 3); + T (ax_100_3 + 4, 0); + ELIM (101 == sizeof ax_100_3); + ELIM ('\0' == ax_100_3[100]); + + /* Verify that all three character arrays have the same length + as the string literal they are initialized with. */ + T (S, 255); + T (c256, 255); + T ((const char*)sc256, 255); + T ((const char*)uc256, 255); + + /* Verify that all three character arrays have the same contents + as the string literal they are initialized with. */ + ELIM (0 == memcmp (c256, S, sizeof c256)); + ELIM (0 == memcmp (c256, (const char*)sc256, sizeof c256)); + ELIM (0 == memcmp (c256, (const char*)uc256, sizeof c256)); + + ELIM (0 == strcmp (c256, (const char*)sc256)); + ELIM (0 == strcmp (c256, (const char*)uc256)); + + /* Verify that the char16_t array has the expected contents. */ + ELIM (c16_4[0] == 1 && c16_4[1] == 0x7fff + && c16_4[2] == 0x8000 && c16_4[3] == 0xffff + && c16_4[4] == 0); +} + +/* Verify that excessively large initializers don't run out of + memory. Also verify that the they have the expected size and + contents. */ + +#define MAX (__PTRDIFF_MAX__ - 1) + +const char large_string[] = { 'a', [1234] = 'b', [MAX] = '\0' }; + +const void test_large_string_size (void) +{ + ELIM (sizeof large_string == MAX + 1); + + /* The following expressions are not folded without optimization. */ + ELIM ('a' == large_string[0]); + ELIM ('\0' == large_string[1233]); + ELIM ('b' == large_string[1234]); + ELIM ('\0' == large_string[1235]); + ELIM ('\0' == large_string[MAX - 1]); +} + + +/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } } + { dg-final { scan-tree-dump-times "memcmp" 0 "gimple" } } + { dg-final { scan-tree-dump-times "strcmp" 0 "gimple" } } + { dg-final { scan-tree-dump-times "call_in_true_branch_not_eliminated" 0 "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/strlenopt-56.c b/gcc/testsuite/gcc.dg/strlenopt-56.c new file mode 100644 index 0000000..39a532b --- /dev/null +++ b/gcc/testsuite/gcc.dg/strlenopt-56.c @@ -0,0 +1,50 @@ +/* PR tree-optimization/71625 - conversion of braced initializers to strings + Verify that array elements have the expected values regardless of sign + and non-ASCII execution character set. + { dg-do compile } + { dg-require-iconv "IBM1047" } + { dg-options "-O -Wall -fexec-charset=IBM1047 -fdump-tree-gimple -fdump-tree-optimized" } */ + +#include "strlenopt.h" + +const char a[] = { 'a', 129, 0 }; +const signed char b[] = { 'b', 130, 0 }; +const unsigned char c[] = { 'c', 131, 0 }; + +const char s[] = "a\201"; +const signed char ss[] = "b\202"; +const unsigned char us[] = "c\203"; + + +#define A(expr) ((expr) ? (void)0 : __builtin_abort ()) + +void test_values (void) +{ + A (a[0] == a[1]); + A (a[1] == 'a'); + + A (b[0] == b[1]); + A (b[1] == (signed char)'b'); + + A (c[0] == c[1]); + A (c[1] == (unsigned char)'c'); +} + +void test_lengths (void) +{ + A (2 == strlen (a)); + A (2 == strlen ((const char*)b)); + A (2 == strlen ((const char*)c)); +} + +void test_contents (void) +{ + A (0 == strcmp (a, s)); + A (0 == strcmp ((const char*)b, (const char*)ss)); + A (0 == strcmp ((const char*)c, (const char*)us)); +} + + +/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } } + { dg-final { scan-tree-dump-times "strcmp" 0 "gimple" } } + { dg-final { scan-tree-dump-times "abort" 0 "optimized" } } */ --------------3D62F2456C23E814BFD6FE0B--