From: Martin Sebor
To: Jakub Jelinek, "Joseph S. Myers", Marek Polacek, Jason Merrill, Nathan Sidwell
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Reject too large string literals (PR middle-end/87854)
Date: Fri, 16 Nov 2018 18:25:00 -0000
References: <20181116084325.GD11625@tucnak>
In-Reply-To: <20181116084325.GD11625@tucnak>

On 11/16/2018 01:43 AM, Jakub Jelinek wrote:
> Hi!
>
> Both C and C++ FE diagnose arrays larger than half of the address space:
> /tmp/1.c:1:6: error: size of array ‘a’ is too large
>  char a[__SIZE_MAX__ / 2 + 1];
>       ^
> because one can't do pointer arithmetics on them. But we don't have
> anything similar for string literals. As internally we use host int
> as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> size_t only.
>
> The following patch adds that diagnostics and truncates the string literals.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested with
> a cross to avr. I'll defer adjusting testcases to the maintainers of 16-bit
> ports. From the PR it seems gcc.dg/concat2.c, g++.dg/parse/concat1.C and
> pr46534.c tests are affected.
>
> Ok for trunk?
>
> 2018-11-16 Jakub Jelinek
>
>     PR middle-end/87854
>     * c-common.c (fix_string_type): Reject string literals larger than
>     TYPE_MAX_VALUE (ssizetype) bytes.
>
> --- gcc/c-family/c-common.c.jj 2018-11-14 13:37:46.921050615 +0100
> +++ gcc/c-family/c-common.c 2018-11-15 15:20:31.138056115 +0100
> @@ -737,31 +737,44 @@ tree
>  fix_string_type (tree value)
>  {
>    int length = TREE_STRING_LENGTH (value);
> -  int nchars;
> +  int nchars, charsz;
>    tree e_type, i_type, a_type;
>
>    /* Compute the number of elements, for the array type. */
>    if (TREE_TYPE (value) == char_array_type_node || !TREE_TYPE (value))
>      {
> -      nchars = length;
> +      charsz = 1;
>        e_type = char_type_node;
>      }
>    else if (TREE_TYPE (value) == char16_array_type_node)
>      {
> -      nchars = length / (TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT;
>        e_type = char16_type_node;
>      }
>    else if (TREE_TYPE (value) == char32_array_type_node)
>      {
> -      nchars = length / (TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT;
>        e_type = char32_type_node;
>      }
>    else
>      {
> -      nchars = length / (TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT;
>        e_type = wchar_type_node;
>      }
>
> +  /* This matters only for targets where ssizetype has smaller precision
> +     than 32 bits. */
> +  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
> +    {
> +      error ("size of string literal is too large");

It would be helpful to mention the size of the literal and the limit
so users who do run into the error don't wonder how to fix it.

Martin
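
To make the suggestion above concrete, here is a minimal standalone sketch in plain C of a check whose message names both the literal's size and the limit. It is only an illustration: it does not use GCC's internal diagnostic machinery, check_literal_size is a hypothetical helper, the message wording is made up, and the caller-supplied limit merely stands in for TYPE_MAX_VALUE (ssizetype).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of GCC.  LIMIT stands in for
   TYPE_MAX_VALUE (ssizetype) on the target.  Returns 0 and stores an
   empty string in BUF if LENGTH fits within LIMIT; otherwise stores a
   message naming both the size and the limit in BUF and returns 1.  */
static int
check_literal_size (long length, long limit, char *buf, size_t bufsz)
{
  assert (buf != NULL && bufsz > 0);

  if (length <= limit)
    {
      buf[0] = '\0';
      return 0;
    }

  /* Report both numbers so the user can see by how much the literal
     overshoots the limit.  */
  snprintf (buf, bufsz,
            "size of string literal %ld bytes exceeds maximum %ld",
            length, limit);
  return 1;
}
```

Assuming a target with a 16-bit size_t, the limit would be 32767, and a 40000-byte literal would produce "size of string literal 40000 bytes exceeds maximum 32767".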