public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Reject too large string literals (PR middle-end/87854)
@ 2018-11-16  8:43 Jakub Jelinek
  2018-11-16 12:06 ` Nathan Sidwell
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Jakub Jelinek @ 2018-11-16  8:43 UTC (permalink / raw)
  To: Joseph S. Myers, Marek Polacek, Jason Merrill, Nathan Sidwell; +Cc: gcc-patches

Hi!

Both C and C++ FE diagnose arrays larger than half of the address space:
/tmp/1.c:1:6: error: size of array ‘a’ is too large
 char a[__SIZE_MAX__ / 2 + 1];
      ^
because one can't do pointer arithmetic on them.  But we don't have
anything similar for string literals.  As internally we use host int
as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
size_t only.
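
To make the "< 32-bit size_t only" point concrete, here is a minimal sketch
in plain C, not GCC code.  signed_max and literal_too_large are hypothetical
names; signed_max stands in for TYPE_MAX_VALUE (ssizetype) on a target whose
size type has the given precision, and the length is a host int as in the
compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Largest value of a signed type with BITS bits of precision; a stand-in
   (assumption) for TYPE_MAX_VALUE (ssizetype) on a hypothetical target.  */
static long long
signed_max (int bits)
{
  assert (bits >= 2 && bits <= 63);
  return (1LL << (bits - 1)) - 1;
}

/* True if a literal of LENGTH bytes, held in a host int, exceeds the
   limit of a target whose size type has BITS bits.  */
static bool
literal_too_large (int length, int bits)
{
  return signed_max (bits) < length;
}
```

With a 32-bit host int, no length can exceed the limit of a 32-bit (or
wider) target, so only narrower targets can ever see the error.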

The following patch adds that diagnostic and truncates the string literals.
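
As a rough illustration of the truncation arithmetic only (a standalone
sketch, not the patch itself): truncate_literal is a hypothetical helper,
max_bytes plays the role of the ssizetype maximum, and charsz is the size
of one character of the literal in bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the truncation step: if LENGTH exceeds
   MAX_BYTES, round MAX_BYTES down to a multiple of CHARSZ, zero up to one
   character's worth of the dropped bytes (as the patch's memset does),
   and return the new length.  Otherwise return LENGTH unchanged.  */
static int
truncate_literal (char *buf, int length, int max_bytes, int charsz)
{
  assert (charsz > 0 && max_bytes >= 0);
  if (length <= max_bytes)
    return length;
  int new_length = max_bytes / charsz * charsz;
  int tail = length - new_length;
  memset (buf + new_length, '\0', tail < charsz ? tail : charsz);
  return new_length;
}
```

Rounding down to a multiple of charsz keeps the truncated literal a whole
number of characters for the char16_t, char32_t and wchar_t cases.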

Bootstrapped/regtested on x86_64-linux and i686-linux and tested with
a cross to avr.  I'll defer adjusting testcases to the maintainers of 16-bit
ports.  From the PR it seems gcc.dg/concat2.c, g++.dg/parse/concat1.C and
pr46534.c tests are affected.

Ok for trunk?

2018-11-16  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/87854
	* c-common.c (fix_string_type): Reject string literals larger than
	TYPE_MAX_VALUE (ssizetype) bytes.

--- gcc/c-family/c-common.c.jj	2018-11-14 13:37:46.921050615 +0100
+++ gcc/c-family/c-common.c	2018-11-15 15:20:31.138056115 +0100
@@ -737,31 +737,44 @@ tree
 fix_string_type (tree value)
 {
   int length = TREE_STRING_LENGTH (value);
-  int nchars;
+  int nchars, charsz;
   tree e_type, i_type, a_type;
 
   /* Compute the number of elements, for the array type.  */
   if (TREE_TYPE (value) == char_array_type_node || !TREE_TYPE (value))
     {
-      nchars = length;
+      charsz = 1;
       e_type = char_type_node;
     }
   else if (TREE_TYPE (value) == char16_array_type_node)
     {
-      nchars = length / (TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT;
       e_type = char16_type_node;
     }
   else if (TREE_TYPE (value) == char32_array_type_node)
     {
-      nchars = length / (TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT;
       e_type = char32_type_node;
     }
   else
     {
-      nchars = length / (TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT;
       e_type = wchar_type_node;
     }
 
+  /* This matters only for targets where ssizetype has smaller precision
+     than 32 bits.  */
+  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
+    {
+      error ("size of string literal is too large");
+      length = tree_to_shwi (TYPE_MAX_VALUE (ssizetype)) / charsz * charsz;
+      char *str = CONST_CAST (char *, TREE_STRING_POINTER (value));
+      memset (str + length, '\0',
+	      MIN (TREE_STRING_LENGTH (value) - length, charsz));
+      TREE_STRING_LENGTH (value) = length;
+    }
+  nchars = length / charsz;
+
   /* C89 2.2.4.1, C99 5.2.4.1 (Translation limits).  The analogous
      limit in C++98 Annex B is very large (65536) and is not normative,
      so we do not diagnose it (warn_overlength_strings is forced off

	Jakub


* Re: [PATCH] Reject too large string literals (PR middle-end/87854)
  2018-11-16  8:43 [PATCH] Reject too large string literals (PR middle-end/87854) Jakub Jelinek
@ 2018-11-16 12:06 ` Nathan Sidwell
  2018-11-16 14:33   ` Marek Polacek
  2018-11-16 17:35 ` Joseph Myers
  2018-11-16 18:25 ` Martin Sebor
  2 siblings, 1 reply; 6+ messages in thread
From: Nathan Sidwell @ 2018-11-16 12:06 UTC (permalink / raw)
  To: Jakub Jelinek, Joseph S. Myers, Marek Polacek, Jason Merrill; +Cc: gcc-patches

On 11/16/18 3:43 AM, Jakub Jelinek wrote:
> Hi!
> 
> Both C and C++ FE diagnose arrays larger than half of the address space:
> /tmp/1.c:1:6: error: size of array ‘a’ is too large
>   char a[__SIZE_MAX__ / 2 + 1];
>        ^
> because one can't do pointer arithmetics on them.  But we don't have
> anything similar for string literals.  As internally we use host int
> as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> size_t only.
> 
> The following patch adds that diagnostics and truncates the string literals.

Ok by me.

nathan

-- 
Nathan Sidwell


* Re: [PATCH] Reject too large string literals (PR middle-end/87854)
  2018-11-16 12:06 ` Nathan Sidwell
@ 2018-11-16 14:33   ` Marek Polacek
  0 siblings, 0 replies; 6+ messages in thread
From: Marek Polacek @ 2018-11-16 14:33 UTC (permalink / raw)
  To: Nathan Sidwell; +Cc: Jakub Jelinek, Joseph S. Myers, Jason Merrill, gcc-patches

On Fri, Nov 16, 2018 at 07:06:51AM -0500, Nathan Sidwell wrote:
> On 11/16/18 3:43 AM, Jakub Jelinek wrote:
> > Hi!
> > 
> > Both C and C++ FE diagnose arrays larger than half of the address space:
> > /tmp/1.c:1:6: error: size of array ‘a’ is too large
> >   char a[__SIZE_MAX__ / 2 + 1];
> >        ^
> > because one can't do pointer arithmetics on them.  But we don't have
> > anything similar for string literals.  As internally we use host int
> > as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> > size_t only.
> > 
> > The following patch adds that diagnostics and truncates the string literals.
> 
> Ok by me.

No objections from me, either.

Marek


* Re: [PATCH] Reject too large string literals (PR middle-end/87854)
  2018-11-16  8:43 [PATCH] Reject too large string literals (PR middle-end/87854) Jakub Jelinek
  2018-11-16 12:06 ` Nathan Sidwell
@ 2018-11-16 17:35 ` Joseph Myers
  2018-11-16 18:25 ` Martin Sebor
  2 siblings, 0 replies; 6+ messages in thread
From: Joseph Myers @ 2018-11-16 17:35 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Marek Polacek, Jason Merrill, Nathan Sidwell, gcc-patches


On Fri, 16 Nov 2018, Jakub Jelinek wrote:

> Hi!
> 
> Both C and C++ FE diagnose arrays larger than half of the address space:
> /tmp/1.c:1:6: error: size of array ‘a’ is too large
>  char a[__SIZE_MAX__ / 2 + 1];
>       ^
> because one can't do pointer arithmetics on them.  But we don't have
> anything similar for string literals.  As internally we use host int
> as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> size_t only.
> 
> The following patch adds that diagnostics and truncates the string literals.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested with
> a cross to avr.  I'll defer adjusting testcases to the maintainers of 16-bit
> ports.  From the PR it seems gcc.dg/concat2.c, g++.dg/parse/concat1.C and
> pr46534.c tests are affected.
> 
> Ok for trunk?

OK with me.  I'd hope at least one test (existing or new) would actually 
test the new diagnostic on 16-bit systems, rather than just those tests 
being disabled for affected platforms.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Reject too large string literals (PR middle-end/87854)
  2018-11-16  8:43 [PATCH] Reject too large string literals (PR middle-end/87854) Jakub Jelinek
  2018-11-16 12:06 ` Nathan Sidwell
  2018-11-16 17:35 ` Joseph Myers
@ 2018-11-16 18:25 ` Martin Sebor
  2018-11-16 18:31   ` Jakub Jelinek
  2 siblings, 1 reply; 6+ messages in thread
From: Martin Sebor @ 2018-11-16 18:25 UTC (permalink / raw)
  To: Jakub Jelinek, Joseph S. Myers, Marek Polacek, Jason Merrill,
	Nathan Sidwell
  Cc: gcc-patches

On 11/16/2018 01:43 AM, Jakub Jelinek wrote:
> Hi!
>
> Both C and C++ FE diagnose arrays larger than half of the address space:
> /tmp/1.c:1:6: error: size of array ‘a’ is too large
>  char a[__SIZE_MAX__ / 2 + 1];
>       ^
> because one can't do pointer arithmetics on them.  But we don't have
> anything similar for string literals.  As internally we use host int
> as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit
> size_t only.
>
> The following patch adds that diagnostics and truncates the string literals.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested with
> a cross to avr.  I'll defer adjusting testcases to the maintainers of 16-bit
> ports.  From the PR it seems gcc.dg/concat2.c, g++.dg/parse/concat1.C and
> pr46534.c tests are affected.
>
> Ok for trunk?
>
> 2018-11-16  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR middle-end/87854
> 	* c-common.c (fix_string_type): Reject string literals larger than
> 	TYPE_MAX_VALUE (ssizetype) bytes.
>
> --- gcc/c-family/c-common.c.jj	2018-11-14 13:37:46.921050615 +0100
> +++ gcc/c-family/c-common.c	2018-11-15 15:20:31.138056115 +0100
> @@ -737,31 +737,44 @@ tree
>  fix_string_type (tree value)
>  {
>    int length = TREE_STRING_LENGTH (value);
> -  int nchars;
> +  int nchars, charsz;
>    tree e_type, i_type, a_type;
>
>    /* Compute the number of elements, for the array type.  */
>    if (TREE_TYPE (value) == char_array_type_node || !TREE_TYPE (value))
>      {
> -      nchars = length;
> +      charsz = 1;
>        e_type = char_type_node;
>      }
>    else if (TREE_TYPE (value) == char16_array_type_node)
>      {
> -      nchars = length / (TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT;
>        e_type = char16_type_node;
>      }
>    else if (TREE_TYPE (value) == char32_array_type_node)
>      {
> -      nchars = length / (TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT;
>        e_type = char32_type_node;
>      }
>    else
>      {
> -      nchars = length / (TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT);
> +      charsz = TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT;
>        e_type = wchar_type_node;
>      }
>
> +  /* This matters only for targets where ssizetype has smaller precision
> +     than 32 bits.  */
> +  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
> +    {
> +      error ("size of string literal is too large");

It would be helpful to mention the size of the literal and the limit
so users who do run into the error don't wonder how to fix it.
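
Something along these lines is what such a message could look like.  This
is only a sketch in plain C: format_too_large is a hypothetical helper, not
a GCC diagnostic API, and the wording of the message is an assumption made
for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a GCC API: format a "too large" message that
   also reports the offending size and the limit.  The wording is an
   assumption.  Returns snprintf's result.  */
static int
format_too_large (char *out, size_t outsz, long long size, long long limit)
{
  assert (out != NULL && outsz > 0);
  return snprintf (out, outsz,
                   "size of string literal is too large "
                   "(%lld bytes, limit is %lld bytes)", size, limit);
}
```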

Martin


* Re: [PATCH] Reject too large string literals (PR middle-end/87854)
  2018-11-16 18:25 ` Martin Sebor
@ 2018-11-16 18:31   ` Jakub Jelinek
  0 siblings, 0 replies; 6+ messages in thread
From: Jakub Jelinek @ 2018-11-16 18:31 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Joseph S. Myers, Marek Polacek, Jason Merrill, Nathan Sidwell,
	gcc-patches

On Fri, Nov 16, 2018 at 11:25:15AM -0700, Martin Sebor wrote:
> On 11/16/2018 01:43 AM, Jakub Jelinek wrote:
> > 
> > +  /* This matters only for targets where ssizetype has smaller precision
> > +     than 32 bits.  */
> > +  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
> > +    {
> > +      error ("size of string literal is too large");
> 
> It would be helpful to mention the size of the literal and the limit
> so users who do run into the error don't wonder how to fix it.

It is consistent with what we emit for the arrays.
So, if the size and limit info is helpful to users, we should provide that
for those too.  I mean the:
                        if (name)
                          error_at (loc, "size of array %qE is too large",
                                    name);
                        else
                          error_at (loc, "size of unnamed array is too large");
calls in the C FE and similar stuff in C++ FE.
Feel free to add that to all of those.

	Jakub

