public inbox for gcc@gcc.gnu.org
* Re: fdump-ast-original and strg:
@ 2001-11-24 14:37 mike stump
  2001-11-24 17:15 ` Joseph S. Myers
  2001-11-30 17:49 ` mike stump
  0 siblings, 2 replies; 39+ messages in thread
From: mike stump @ 2001-11-24 14:37 UTC (permalink / raw)
  To: rth, zack; +Cc: florian, gcc, guillaume.thouvenin, jbuck

> Date: Fri, 30 Nov 2001 15:48:09 -0800
> From: Richard Henderson <rth@redhat.com>
> To: Zack Weinberg <zack@codesourcery.com>

> On Fri, Nov 30, 2001 at 03:37:43PM -0800, Zack Weinberg wrote:
> > > Unfortunately there is no such function.  That stuff is 
> > > replicated 99 times in various header files.
> > 
> > Dare I ask why?

> Historic accretion.

All the things that the ports have most in common ought to be available
as defaults to all ports, and ports that use those defaults ought not
to define their own.  Further, these things should be documented in
the manual.  This will simplify lots of ports and regularize them,
making them all easier to read and understand.

defaults.h was created to meet this need, people should collapse
common things into it as part of normal maintenance.

To pick a few (that easily fit on one line):

     23 #define BITS_PER_WORD 32
      6 #define BITS_PER_WORD 16
      5 #define BITS_PER_WORD 64
      4 #define BITS_PER_WORD (TARGET_64BIT ? 64 : 32)
      1 #define BITS_PER_WORD 8

     21 #define SHORT_TYPE_SIZE 16
      1 #define SHORT_TYPE_SIZE 32
      1 #define SHORT_TYPE_SIZE (INT_TYPE_SIZE == 8 ? INT_TYPE_SIZE : 16)

      9 #define CHAR_TYPE_SIZE 8
      2 #define CHAR_TYPE_SIZE BITS_PER_UNIT
      2 #define CHAR_TYPE_SIZE 16

     18 #define INT_TYPE_SIZE 32
      3 #define INT_TYPE_SIZE 16
      2 #define INT_TYPE_SIZE (TARGET_SHORT ? 16 : 32)
      1 #define INT_TYPE_SIZE 64

A while ago, I complained about TARGET_VT and friends.  It seemed
kinda pointless, and I noticed that now, thanks to the hard work of
Neil, they are all in defaults.h, with the sole exception of an EBCDIC
machine.  :-)  So, sometimes things do get better.
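
As a rough illustration of the defaults.h approach described above,
here is a minimal sketch.  It assumes a hypothetical 16-bit port that
overrides only INT_TYPE_SIZE; the default values and the helper
check_type_sizes are illustrative assumptions, not a copy of what
GCC's defaults.h contains.

```c
#include <assert.h>

/* --- What a hypothetical 16-bit port header would contain. --- */
#define BITS_PER_UNIT 8
#define BITS_PER_WORD 16
#define INT_TYPE_SIZE 16        /* The one size this port changes.  */

/* --- What a shared defaults header would contain. --- */
#ifndef CHAR_TYPE_SIZE
#define CHAR_TYPE_SIZE BITS_PER_UNIT
#endif

#ifndef SHORT_TYPE_SIZE
#define SHORT_TYPE_SIZE 16      /* Illustrative default only.  */
#endif

#ifndef INT_TYPE_SIZE
#define INT_TYPE_SIZE 32        /* Skipped here: the port defined it.  */
#endif

/* Sanity check: the sizes must be ordered as C requires.  */
void
check_type_sizes (void)
{
  assert (CHAR_TYPE_SIZE <= SHORT_TYPE_SIZE);
  assert (SHORT_TYPE_SIZE <= INT_TYPE_SIZE);
}
```

The port header stays at three lines, and every macro it does not
mention picks up the shared value through the #ifndef guards.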

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-24 14:37 fdump-ast-original and strg: mike stump
@ 2001-11-24 17:15 ` Joseph S. Myers
  2001-11-30 18:35   ` Joseph S. Myers
  2001-12-03 14:23   ` Richard Henderson
  2001-11-30 17:49 ` mike stump
  1 sibling, 2 replies; 39+ messages in thread
From: Joseph S. Myers @ 2001-11-24 17:15 UTC (permalink / raw)
  To: mike stump; +Cc: gcc

On Fri, 30 Nov 2001, mike stump wrote:

> defaults.h was created to meet this need, people should collapse
> common things into it as part of normal maintenance.
> 
> To pick a few (that easily fit on one line):

Some of these are already in defaults.h.  (Of course all defaults should
be documented in tm.texi as well.)

>      21 #define SHORT_TYPE_SIZE 16
>       1 #define SHORT_TYPE_SIZE 32
>       1 #define SHORT_TYPE_SIZE (INT_TYPE_SIZE == 8 ? INT_TYPE_SIZE : 16)

Defaults to (BITS_PER_UNIT * MIN ((UNITS_PER_WORD + 1) / 2, 2)).
I expect most definitions could go away.

>       9 #define CHAR_TYPE_SIZE 8
>       2 #define CHAR_TYPE_SIZE BITS_PER_UNIT
>       2 #define CHAR_TYPE_SIZE 16

Defaults to BITS_PER_UNIT, all values are equal to that, and I don't know
whether support for it being different really works in practice.

>      18 #define INT_TYPE_SIZE 32
>       3 #define INT_TYPE_SIZE 16
>       2 #define INT_TYPE_SIZE (TARGET_SHORT ? 16 : 32)
>       1 #define INT_TYPE_SIZE 64

Defaults to BITS_PER_WORD.

While dealing with this sort of cruft, WCHAR_UNSIGNED doesn't seem to be
used anywhere.  Is it meant to be useful?  If not, it should probably die
too.

-- 
Joseph S. Myers
jsm28@cam.ac.uk


* Re: fdump-ast-original and strg:
  2001-11-24 17:15 ` Joseph S. Myers
  2001-11-30 18:35   ` Joseph S. Myers
@ 2001-12-03 14:23   ` Richard Henderson
  1 sibling, 0 replies; 39+ messages in thread
From: Richard Henderson @ 2001-12-03 14:23 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: mike stump, gcc

On Sat, Dec 01, 2001 at 02:34:21AM +0000, Joseph S. Myers wrote:
> >      21 #define SHORT_TYPE_SIZE 16
> >       1 #define SHORT_TYPE_SIZE 32
> >       1 #define SHORT_TYPE_SIZE (INT_TYPE_SIZE == 8 ? INT_TYPE_SIZE : 16)
> 
> Defaults to (BITS_PER_UNIT * MIN ((UNITS_PER_WORD + 1) / 2, 2)).

Better is MAX (16, BITS_PER_UNIT).

> >      18 #define INT_TYPE_SIZE 32
> >       3 #define INT_TYPE_SIZE 16
> >       2 #define INT_TYPE_SIZE (TARGET_SHORT ? 16 : 32)
> >       1 #define INT_TYPE_SIZE 64
> 
> Defaults to BITS_PER_WORD.

A closer default is just 32.  The 16 and 64 bit outliers are
definitely the exception.  
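
To compare the two SHORT_TYPE_SIZE defaults discussed above, here is
a small sketch that evaluates both formulas exactly as quoted.  The
function names are hypothetical, and min_int merely stands in for
GCC's MIN macro.

```c
#include <assert.h>

/* Smaller of two ints; stands in for GCC's MIN macro.  */
static int
min_int (int a, int b)
{
  return a < b ? a : b;
}

/* The default quoted by Joseph Myers:
   BITS_PER_UNIT * MIN ((UNITS_PER_WORD + 1) / 2, 2).  */
int
short_size_current (int bits_per_unit, int units_per_word)
{
  assert (bits_per_unit > 0 && units_per_word > 0);
  return bits_per_unit * min_int ((units_per_word + 1) / 2, 2);
}

/* The alternative suggested above: MAX (16, BITS_PER_UNIT).  */
int
short_size_proposed (int bits_per_unit)
{
  assert (bits_per_unit > 0);
  return bits_per_unit > 16 ? bits_per_unit : 16;
}
```

With 8-bit units, the first formula yields 8 for words of 1 or 2
units and 16 for words of 4 or 8 units, while the second yields 16
in every case.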


r~


* Re: fdump-ast-original and strg:
  2001-11-24 12:41 ` Joe Buck
@ 2001-11-30 17:12   ` Joe Buck
  0 siblings, 0 replies; 39+ messages in thread
From: Joe Buck @ 2001-11-30 17:12 UTC (permalink / raw)
  To: mike stump; +Cc: jbuck, rth, florian, gcc, guillaume.thouvenin, zack

I wrote:

> > > if the option of simply calling an existing string-emitting function
> > > exists [it should be used].

From: Richard Henderson <rth@redhat.com>
> > Unfortunately there is no such function.  That stuff is 
> > replicated 99 times in various header files.

Mike Stump writes:
> output_quoted_string in toplev.c, should be half way reasonable.

Well, it only treats '"' and '\\' specially (by prepending a \ ); all
other characters (including control characters) go straight to output.

If the only purpose of the patch is to make it possible to read the file
back in, I suppose it could be good enough, but I think that the dump
should look reasonable for the traditional

int main() {
	printf("Hello, world\n");
}

and output_quoted_string will write

"Hello, world
"
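
The behaviour described above can be sketched as follows.  This is
not the toplev.c code; quote_minimal is a hypothetical stand-in that
writes to a buffer and escapes only '"' and '\\', as described.

```c
#include <assert.h>
#include <stddef.h>

/* Copy S into OUT (capacity CAP) as a double-quoted string, putting a
   backslash only before '"' and '\\'.  Every other byte, control
   characters included, is copied unchanged.  Returns the number of
   bytes written, not counting the terminating NUL, or 0 if OUT is
   too small.  */
size_t
quote_minimal (const char *s, char *out, size_t cap)
{
  size_t n = 0;

  assert (s != NULL && out != NULL);
  if (cap < 3)
    return 0;
  out[n++] = '"';
  for (; *s != '\0'; s++)
    {
      /* Reserve room for an escape pair, the closing quote and NUL.  */
      if (n + 4 > cap)
        return 0;
      if (*s == '"' || *s == '\\')
        out[n++] = '\\';
      out[n++] = *s;
    }
  out[n++] = '"';
  out[n] = '\0';
  return n;
}
```

Given "Hello, world\n", the buffer ends up holding a raw newline just
before the closing quote, which is the two-line output shown above.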






* Re: fdump-ast-original and strg:
  2001-11-24 11:14 mike stump
  2001-11-24 12:41 ` Joe Buck
@ 2001-11-30 16:59 ` mike stump
  1 sibling, 0 replies; 39+ messages in thread
From: mike stump @ 2001-11-30 16:59 UTC (permalink / raw)
  To: jbuck, rth; +Cc: florian, gcc, guillaume.thouvenin, zack

> Date: Fri, 30 Nov 2001 15:26:12 -0800
> From: Richard Henderson <rth@redhat.com>
> To: Joe Buck <jbuck@synopsys.COM>

> On Fri, Nov 30, 2001 at 11:01:52AM -0800, Joe Buck wrote:
> > if the option of simply calling an existing string-emitting function
> > exists.

> Unfortunately there is no such function.  That stuff is 
> replicated 99 times in various header files.

output_quoted_string in toplev.c, should be half way reasonable.


* Re: fdump-ast-original and strg:
  2001-11-24  3:38                 ` Richard Henderson
@ 2001-11-30 15:50                   ` Richard Henderson
  0 siblings, 0 replies; 39+ messages in thread
From: Richard Henderson @ 2001-11-30 15:50 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Joe Buck, Florian Krohm, Guillaume, gcc

On Fri, Nov 30, 2001 at 03:37:43PM -0800, Zack Weinberg wrote:
> > Unfortunately there is no such function.  That stuff is 
> > replicated 99 times in various header files.
> 
> Dare I ask why?

Historic accretion.


r~


* Re: fdump-ast-original and strg:
  2001-11-24  3:30               ` Zack Weinberg
  2001-11-24  3:38                 ` Richard Henderson
@ 2001-11-30 15:37                 ` Zack Weinberg
  1 sibling, 0 replies; 39+ messages in thread
From: Zack Weinberg @ 2001-11-30 15:37 UTC (permalink / raw)
  To: Richard Henderson, Joe Buck, Florian Krohm, Guillaume, gcc

On Fri, Nov 30, 2001 at 03:26:12PM -0800, Richard Henderson wrote:
> On Fri, Nov 30, 2001 at 11:01:52AM -0800, Joe Buck wrote:
> > if the option of simply calling an existing string-emitting function
> > exists.
> 
> Unfortunately there is no such function.  That stuff is 
> replicated 99 times in various header files.

Dare I ask why?

zw


* Re: fdump-ast-original and strg:
  2001-11-23 23:16             ` Richard Henderson
  2001-11-24  3:30               ` Zack Weinberg
@ 2001-11-30 15:28               ` Richard Henderson
  1 sibling, 0 replies; 39+ messages in thread
From: Richard Henderson @ 2001-11-30 15:28 UTC (permalink / raw)
  To: Joe Buck; +Cc: Florian Krohm, Zack Weinberg, Guillaume, gcc

On Fri, Nov 30, 2001 at 11:01:52AM -0800, Joe Buck wrote:
> if the option of simply calling an existing string-emitting function
> exists.

Unfortunately there is no such function.  That stuff is 
replicated 99 times in various header files.


r~


* Re: fdump-ast-original and strg:
  2001-11-23 18:26           ` Dale Johannesen
@ 2001-11-30 15:02             ` Dale Johannesen
  0 siblings, 0 replies; 39+ messages in thread
From: Dale Johannesen @ 2001-11-30 15:02 UTC (permalink / raw)
  To: tim; +Cc: Dale Johannesen, Florian Krohm, Guillaume, Joe Buck, gcc

On Friday, November 30, 2001, at 03:07 PM, Tim Hollebeek wrote:

>
>> On Friday, November 30, 2001, at 10:12 AM, Florian Krohm wrote:
>>
>>> I'm afraid, things are even a bit more complex.
>>> Consider a string containing two characters, the first
>>> of which contains the bit pattern 00001010. The second
>>> character is '2'. If you want to recover the original
>>> representation for that string you will have to use a
>>> string concatenation e.g. "\12" "2" or "\x6" "2".
>>> Note that you cannot write "\122" as that would specify
>>> only a single character.
>>
>> "\0122" works.
>
> I believe you can also use "\12\2".

No, that specifies a second character with bit pattern 00000010,
which is not '2'.
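
The escape-sequence claims in this subthread can be checked directly
with a short sketch.  The function name check_octal_escapes is
hypothetical, and the character values assume an ASCII execution
character set.

```c
#include <assert.h>
#include <string.h>

/* Each check compares a literal against the bytes it denotes.
   sizeof counts the terminating NUL, so a two-character string has
   size 3.  In ASCII, 'R' is octal 122.  */
void
check_octal_escapes (void)
{
  /* An octal escape takes at most three digits, so "\0122" is the
     byte 012 (newline) followed by the character '2'.  */
  assert (sizeof ("\0122") == 3);
  assert (memcmp ("\0122", "\n2", 3) == 0);

  /* Splitting the literal has the same effect.  */
  assert (sizeof ("\12" "2") == 3);
  assert (memcmp ("\12" "2", "\n2", 3) == 0);

  /* "\122" is a single character, octal 122.  */
  assert (sizeof ("\122") == 2);
  assert ("\122"[0] == 'R');

  /* "\12\2" is newline followed by the byte 002, not by '2'.  */
  assert (sizeof ("\12\2") == 3);
  assert ("\12\2"[1] == 2);
  assert ("\12\2"[1] != '2');
}
```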


* Re: fdump-ast-original and strg:
  2001-11-23 17:46         ` Tim Hollebeek
  2001-11-23 18:26           ` Dale Johannesen
@ 2001-11-30 14:59           ` Tim Hollebeek
  1 sibling, 0 replies; 39+ messages in thread
From: Tim Hollebeek @ 2001-11-30 14:59 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: Florian Krohm, Guillaume, Joe Buck, gcc

> On Friday, November 30, 2001, at 10:12 AM, Florian Krohm wrote:
> 
> > I'm afraid, things are even a bit more complex.
> > Consider a string containing two characters, the first
> > of which contains the bit pattern 00001010. The second
> > character is '2'. If you want to recover the original
> > representation for that string you will have to use a
> > string concatenation e.g. "\12" "2" or "\x6" "2".
> > Note that you cannot write "\122" as that would specify
> > only a single character.
> 
> "\0122" works.

I believe you can also use "\12\2".


* Re: fdump-ast-original and strg:
  2001-11-23 16:40         ` Guillaume
@ 2001-11-30 13:55           ` Guillaume
  0 siblings, 0 replies; 39+ messages in thread
From: Guillaume @ 2001-11-30 13:55 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Florian Krohm, Guillaume, Joe Buck, gcc

On Fri, 30 Nov 2001, Zack Weinberg wrote:

> > Note that you cannot write "\122" as that would specify
> > only a single character.
> > You could call this a pathological example, but I think
> > you want to come up with an algorithm that can handle
> > the general case.
>
> "\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)

Yes it will produce "\n2"

>
> We already have code to emit strings safely, into the assembly output;
> you could just use that.

I will look at your code tonight. I have put what I've done at the end
of this mail, and I will modify it if I can find code to emit strings
safely.

The function dump_string_cst() produced the following output:

@51     string_cst       type: @58     strg: "Hello\nit's a \t\"test\"\n" lngt: 22


for the following input:

  fprintf (stderr, "Hello\nit's a \t\"test\"\n");

Thanks to all of you for your help
Guillaume

-----------
diff -urN gcc-3.0.2-20011014/gcc/c-dump.c gcc-3.0.2-20011014-mod/gcc/c-dump.c
--- gcc-3.0.2-20011014/gcc/c-dump.c     Tue Jun  5 03:46:58 2001
+++ gcc-3.0.2-20011014-mod/gcc/c-dump.c Fri Nov 30 16:25:48 2001
@@ -214,6 +214,49 @@
     di->column += 14;
 }

+void
+dump_string_cst (di, string)
+     dump_info_p di;
+     const char *string;
+{
+  int index;
+
+  fprintf (di->stream, "strg: \"");
+  for (index = 0; string[index] != '\0' ; index++)
+    {
+    switch (string[index])
+      {
+      case '\a':
+        fprintf (di->stream, "%c%c", '\\', 'a');
+       break;
+      case '\b':
+        fprintf (di->stream, "%c%c", '\\', 'b');
+       break;
+      case '\t':
+        fprintf (di->stream, "%c%c", '\\', 't');
+       break;
+      case '\n':
+        fprintf (di->stream, "%c%c", '\\', 'n');
+       break;
+      case '\v':
+        fprintf (di->stream, "%c%c", '\\', 'v');
+       break;
+      case '\f':
+        fprintf (di->stream, "%c%c", '\\', 'f');
+       break;
+      case '\r':
+        fprintf (di->stream, "%c%c", '\\', 'r');
+       break;
+      case '\"':
+        fprintf (di->stream, "%c%c", '\\', '\"');
+       break;
+      default :
+        fprintf (di->stream, "%c", string[index]);
+      }
+    }
+  fprintf (di->stream, "\"");
+}
+
 /* Dump the string field S.  */

 static void
@@ -646,7 +689,7 @@
       break;

     case STRING_CST:
-      fprintf (di->stream, "strg: %-7s ", TREE_STRING_POINTER (t));
+      dump_string_cst (di, TREE_STRING_POINTER(t));
       dump_int (di, "lngt", TREE_STRING_LENGTH (t));
       break;

diff -urN gcc-3.0.2-20011014/gcc/c-dump.h gcc-3.0.2-20011014-mod/gcc/c-dump.h
--- gcc-3.0.2-20011014/gcc/c-dump.h     Tue Jun  5 03:46:58 2001
+++ gcc-3.0.2-20011014-mod/gcc/c-dump.h Fri Nov 30 16:23:12 2001
@@ -80,6 +80,8 @@
   PARAMS ((dump_info_p, const char *, int));
 extern void dump_string
   PARAMS ((dump_info_p, const char *));
+extern void dump_string_cst
+  PARAMS ((dump_info_p, const char *));
 extern void dump_stmt
   PARAMS ((dump_info_p, tree));
 extern void dump_next_stmt

--------------------
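
The patch above stops at the first '\0', passes backslashes through
unchanged, and prints any other non-printable byte raw.  A minimal
sketch of a variant without those gaps follows.  It takes an explicit
length, escapes the backslash, and falls back to a three-digit octal
escape as suggested elsewhere in the thread.  The name
escape_string_cst is hypothetical, it writes to a buffer rather than
di->stream so that it stands alone, and the printable-range test
assumes ASCII.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write the LEN bytes at S into OUT (capacity CAP) as a C string
   literal.  Returns the length written, or 0 if OUT is too small.  */
size_t
escape_string_cst (const char *s, size_t len, char *out, size_t cap)
{
  size_t i, n = 0;

  assert (s != NULL && out != NULL);
  if (cap < 3)
    return 0;
  out[n++] = '"';
  for (i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) s[i];
      const char *simple = NULL;

      /* Worst case: a four-byte octal escape, closing quote, NUL.  */
      if (n + 6 > cap)
        return 0;
      switch (c)
        {
        case '\a': simple = "\\a"; break;
        case '\b': simple = "\\b"; break;
        case '\t': simple = "\\t"; break;
        case '\n': simple = "\\n"; break;
        case '\v': simple = "\\v"; break;
        case '\f': simple = "\\f"; break;
        case '\r': simple = "\\r"; break;
        case '"':  simple = "\\\""; break;
        case '\\': simple = "\\\\"; break;
        default:   break;
        }
      if (simple != NULL)
        {
          out[n++] = simple[0];
          out[n++] = simple[1];
        }
      else if (c >= 0x20 && c < 0x7f)
        out[n++] = (char) c;
      else
        /* Always three octal digits, so a following digit is safe.  */
        n += (size_t) sprintf (out + n, "\\%03o", (unsigned int) c);
    }
  out[n++] = '"';
  out[n] = '\0';
  return n;
}
```

Called with the two bytes 012 and '2', this yields "\n2"; with the
bytes 001 and '2' it yields "\0012", which reads back as two
characters because the octal escape is always three digits wide.  In
the sample output above, lngt: 22 appears to count the terminating
NUL (the literal has 21 characters), so a caller passing
TREE_STRING_LENGTH would probably want to subtract one.  That is an
inference from the sample, not something checked against the GCC
sources.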


* Re: fdump-ast-original and strg:
  2001-11-23 14:42           ` Joe Buck
  2001-11-23 23:16             ` Richard Henderson
@ 2001-11-30 11:01             ` Joe Buck
  1 sibling, 0 replies; 39+ messages in thread
From: Joe Buck @ 2001-11-30 11:01 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Zack Weinberg, Guillaume, Joe Buck, gcc

> > "\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)
> >
> Yup, you're right. So if you use octal notation to represent a 
> non-printable character and always use 3 octal digits following
> the '\' you have something that should work in all cases.
> 
> > We already have code to emit strings safely, into the assembly output;
> > you could just use that.
> >
> Even better!

Not just "even better", IMHO.  While I'm not the one that will make
a decision about whether a patch is acceptable, I think that any patch
that includes a complete new conversion function should be rejected,
if the option of simply calling an existing string-emitting function
exists.



* Re: fdump-ast-original and strg:
  2001-11-23 14:13         ` Florian Krohm
  2001-11-23 14:42           ` Joe Buck
@ 2001-11-30 10:54           ` Florian Krohm
  1 sibling, 0 replies; 39+ messages in thread
From: Florian Krohm @ 2001-11-30 10:54 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Guillaume, Joe Buck, gcc

>
> "\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)
>
Yup, you're right. So if you use octal notation to represent a 
non-printable character and always use 3 octal digits following
the '\' you have something that should work in all cases.

> We already have code to emit strings safely, into the assembly output;
> you could just use that.
>
Even better!

Florian


* Re: fdump-ast-original and strg:
  2001-11-23 11:14       ` Zack Weinberg
  2001-11-23 14:13         ` Florian Krohm
  2001-11-23 16:40         ` Guillaume
@ 2001-11-30 10:26         ` Zack Weinberg
  2 siblings, 0 replies; 39+ messages in thread
From: Zack Weinberg @ 2001-11-30 10:26 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Guillaume, Joe Buck, gcc

On Fri, Nov 30, 2001 at 01:12:07PM -0500, Florian Krohm wrote:
> I'm afraid, things are even a bit more complex.
> Consider a string containing two characters, the first
> of which contains the bit pattern 00001010. The second
> character is '2'. If you want to recover the original
> representation for that string you will have to use a 
> string concatenation e.g. "\12" "2" or "\x6" "2". 

I think you meant "\xa" "2".

> Note that you cannot write "\122" as that would specify
> only a single character.
> You could call this a pathological example, but I think
> you want to come up with an algorithm that can handle
> the general case.

"\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)

We already have code to emit strings safely, into the assembly output;
you could just use that.

zw


* Re: fdump-ast-original and strg:
  2001-11-23 11:04       ` Dale Johannesen
  2001-11-23 17:46         ` Tim Hollebeek
@ 2001-11-30 10:24         ` Dale Johannesen
  1 sibling, 0 replies; 39+ messages in thread
From: Dale Johannesen @ 2001-11-30 10:24 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Dale Johannesen, Guillaume, Joe Buck, gcc

On Friday, November 30, 2001, at 10:12 AM, Florian Krohm wrote:

> I'm afraid, things are even a bit more complex.
> Consider a string containing two characters, the first
> of which contains the bit pattern 00001010. The second
> character is '2'. If you want to recover the original
> representation for that string you will have to use a
> string concatenation e.g. "\12" "2" or "\x6" "2".
> Note that you cannot write "\122" as that would specify
> only a single character.

"\0122" works.



* Re: fdump-ast-original and strg:
  2001-11-23 10:56       ` Joe Buck
@ 2001-11-30 10:22         ` Joe Buck
  0 siblings, 0 replies; 39+ messages in thread
From: Joe Buck @ 2001-11-30 10:22 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Guillaume, Joe Buck, gcc

[ fixing the dump format to be able to read back strings ]

The correct solution, in my opinion, is to produce a valid C string
literal.  This can always be done, for any input, and furthermore
there should already be routines in the compiler that you can use
(though I can't be more specific off the top of my head).  Don't try
to invent a new format.


* Re: fdump-ast-original and strg:
  2001-11-23  8:52     ` Florian Krohm
                         ` (2 preceding siblings ...)
  2001-11-23 11:14       ` Zack Weinberg
@ 2001-11-30 10:12       ` Florian Krohm
  3 siblings, 0 replies; 39+ messages in thread
From: Florian Krohm @ 2001-11-30 10:12 UTC (permalink / raw)
  To: Guillaume, Joe Buck; +Cc: gcc

I'm afraid, things are even a bit more complex.
Consider a string containing two characters, the first
of which contains the bit pattern 00001010. The second
character is '2'. If you want to recover the original
representation for that string you will have to use a 
string concatenation e.g. "\12" "2" or "\x6" "2". 
Note that you cannot write "\122" as that would specify
only a single character.
You could call this a pathological example, but I think
you want to come up with an algorithm that can handle
the general case.

Florian

On Friday 30 November 2001 12:54, Guillaume wrote:
> On Thu, 29 Nov 2001, Joe Buck wrote:
> > Guillaume Thouvenin writes:
> > ...
> >
> > > The problem is the following. If you have something like:
> > >
> > > -- part of a C code --
> > >
> > > fprintf(stderr, "error strg: toto");
> > >
> > > --
> > >
> > > The asg given by gcc gives the following line:
> > >
> > > @247    string_cst       type: @268    strg: error strg: toto  lngt: 5
> > >
> > > So, I add a very basic modification inside GCC (in c-dump.c) and now,
> > > it produces this line:
> > >
> > > @247    string_cst       type: @268    strg: "error strg: toto"  lngt:
> > > 5
> >
> > This seems reasonable, but does your patch do the whole job?  What
> > happens if the string contains newlines, control characters, or '"'?  It
> > would seem reasonable to make the output match the input (that is, output
> > \", \n, etc).
>
> No it doesn't do the whole job. If you have something like :
>
>  fprintf (stderr, "Hello\nit's a \"test\"\n");
>
> It will produce :
>
> @54     string_cst       type: @67     strg: "Hello
> it's a "test"
> "  lngt: 21
>
> So, the good output should be
>
> @54     string_cst       type: @67     strg: "Hello\nit's a \"test\"\n"
>         lngt: 21
>
>
> Actually, strings with newlines, control characters and '"' are treated by
> my parser. The only modification that I done in GCC is in file c-dump.c:
>
> line 649:
> ---
> 648: case STRING_CST:
> 649:      fprintf (di->stream, "strg: \"%-7s\" ", TREE_STRING_POINTER (t));
>                                       ^^    ^^
> 650:      dump_int (di, "lngt", TREE_STRING_LENGTH (t));
> 651:      break;
>
> So, I can try to path GCC to make output match the input?
>
> Guillaume


* Re: fdump-ast-original and strg:
  2001-11-23  8:49   ` Guillaume
  2001-11-23  8:52     ` Florian Krohm
@ 2001-11-30  9:54     ` Guillaume
  1 sibling, 0 replies; 39+ messages in thread
From: Guillaume @ 2001-11-30  9:54 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

On Thu, 29 Nov 2001, Joe Buck wrote:

> Guillaume Thouvenin writes:
> ...
> > The problem is the following. If you have something like:
> >
> > -- part of a C code --
> >
> > fprintf(stderr, "error strg: toto");
> >
> > --
> >
> > The asg given by gcc gives the following line:
> >
> > @247    string_cst       type: @268    strg: error strg: toto  lngt: 5
> >
> > So, I add a very basic modification inside GCC (in c-dump.c) and now, it
> > produces this line:
> >
> > @247    string_cst       type: @268    strg: "error strg: toto"  lngt: 5
> >
> This seems reasonable, but does your patch do the whole job?  What happens
> if the string contains newlines, control characters, or '"'?  It would
> seem reasonable to make the output match the input (that is, output \",
> \n, etc).

No, it doesn't do the whole job. If you have something like:

 fprintf (stderr, "Hello\nit's a \"test\"\n");

It will produce :

@54     string_cst       type: @67     strg: "Hello
it's a "test"
"  lngt: 21

The correct output would be

@54     string_cst       type: @67     strg: "Hello\nit's a \"test\"\n"
        lngt: 21


Actually, strings with newlines, control characters and '"' are handled by
my parser. The only modification that I made in GCC is in the file c-dump.c:

line 649:
---
648: case STRING_CST:
649:      fprintf (di->stream, "strg: \"%-7s\" ", TREE_STRING_POINTER (t));
                                      ^^    ^^
650:      dump_int (di, "lngt", TREE_STRING_LENGTH (t));
651:      break;

So, should I try to patch GCC to make the output match the input?

Guillaume


* Re: fdump-ast-original and strg:
  2001-11-22 13:14 ` Joe Buck
  2001-11-23  8:49   ` Guillaume
@ 2001-11-29 19:00   ` Joe Buck
  1 sibling, 0 replies; 39+ messages in thread
From: Joe Buck @ 2001-11-29 19:00 UTC (permalink / raw)
  To: Guillaume; +Cc: gcc

Guillaume Thouvenin writes:
...
> The problem is the following. If you have something like:
> 
> -- part of a C code --
> 
> fprintf(stderr, "error strg: toto");
> 
> --
> 
> The asg given by gcc gives the following line:
> 
> @247    string_cst       type: @268    strg: error strg: toto  lngt: 5
> 
> So, I add a very basic modification inside GCC (in c-dump.c) and now, it
> produces this line:
> 
> @247    string_cst       type: @268    strg: "error strg: toto"  lngt: 5
> 
> It is easier to parse. So, I'd like to know if it can be added to official
> gcc futur release. It's only one line and for me it will be easier because
> people won't need to recompile the gcc compiler if they want to use my
> tool (ok for now I'm the only one who use it but it can change...).

This seems reasonable, but does your patch do the whole job?  What happens
if the string contains newlines, control characters, or '"'?  It would
seem reasonable to make the output match the input (that is, output \",
\n, etc).


* fdump-ast-original and strg:
  2001-11-22 13:14 Guillaume
  2001-11-22 13:14 ` Joe Buck
@ 2001-11-29 18:46 ` Guillaume
  1 sibling, 0 replies; 39+ messages in thread
From: Guillaume @ 2001-11-29 18:46 UTC (permalink / raw)
  To: gcc

Hello,

I'm a student and I'm trying to build a tool which uses the ASG given by
g++ with the option -fdump-ast-original. So far I have built a basic
parser which reads the file file.c.original and stores the ASG in memory
in a hash table where the key is the number of a node. I have also built
a visitor which visits the ASG in memory and extracts a CFG for some
analysis.

The problem is the following. If you have something like:

-- part of a C code --

fprintf(stderr, "error strg: toto");

--

The ASG dump produced by gcc contains the following line:

@247    string_cst       type: @268    strg: error strg: toto  lngt: 5

So I made a very basic modification inside GCC (in c-dump.c), and now it
produces this line:

@247    string_cst       type: @268    strg: "error strg: toto"  lngt: 5

This is easier to parse, so I'd like to know if it can be added to a
future official gcc release. It's only a one-line change, and it would
make things easier for me because people wouldn't need to recompile gcc
in order to use my tool (OK, for now I'm the only one who uses it, but
that can change...).
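
To show why the quoted form is easier to parse, here is a minimal
sketch of a field extractor.  The function extract_strg is
hypothetical (it is not part of GASTA or GCC), and it assumes the
dump escapes an embedded '"' with a backslash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the contents of the quoted strg: field of LINE into OUT
   (capacity CAP), without the surrounding quotes and leaving any
   backslash escapes undecoded.  Returns 1 on success, 0 if the field
   is missing, unterminated, or too long.  */
int
extract_strg (const char *line, char *out, size_t cap)
{
  static const char tag[] = "strg: \"";
  const char *p;
  size_t n = 0;

  assert (line != NULL && out != NULL);
  p = strstr (line, tag);
  if (p == NULL)
    return 0;
  for (p += sizeof tag - 1; *p != '\0'; p++)
    {
      if (*p == '"')
        {
          /* Closing quote: terminate and report success.  */
          if (n >= cap)
            return 0;
          out[n] = '\0';
          return 1;
        }
      /* Keep an escaped character together with its backslash.  */
      if (*p == '\\' && p[1] != '\0')
        {
          if (n + 1 >= cap)
            return 0;
          out[n++] = *p++;
        }
      if (n + 1 >= cap)
        return 0;
      out[n++] = *p;
    }
  return 0;
}
```

On the quoted line above this returns "error strg: toto" even though
the text itself contains "strg:".  On the unquoted line there is no
reliable end marker, and the sketch simply reports failure.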

Thank you
Sorry for my English

---
Guillaume Thouvenin
GASTA: Gcc Abstract Syntax Tree
http://gasta.sf.net






* Re: fdump-ast-original and strg:
  2001-11-24 11:14 mike stump
@ 2001-11-24 12:41 ` Joe Buck
  2001-11-30 17:12   ` Joe Buck
  2001-11-30 16:59 ` mike stump
  1 sibling, 1 reply; 39+ messages in thread
From: Joe Buck @ 2001-11-24 12:41 UTC (permalink / raw)
  To: mike stump; +Cc: jbuck, rth, florian, gcc, guillaume.thouvenin, zack

I wrote:

> > > if the option of simply calling an existing string-emitting function
> > > exists [it should be used].

From: Richard Henderson <rth@redhat.com>
> > Unfortunately there is no such function.  That stuff is 
> > replicated 99 times in various header files.

Mike Stump writes:
> output_quoted_string in toplev.c, should be half way reasonable.

Well, it only treats '"' and '\\' specially (by prepending a \ ), all
other characters (including control characters) go straight to output.

If the only purpose of the patch is to make it possible to read the file
back in, I suppose it could be good enough, but I think that the dump
should look reasonable for the traditional

int main() {
	printf("Hello, world\n");
}

and output_quoted_string will write

"Hello, world
"





^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
@ 2001-11-24 11:14 mike stump
  2001-11-24 12:41 ` Joe Buck
  2001-11-30 16:59 ` mike stump
  0 siblings, 2 replies; 39+ messages in thread
From: mike stump @ 2001-11-24 11:14 UTC (permalink / raw)
  To: jbuck, rth; +Cc: florian, gcc, guillaume.thouvenin, zack

> Date: Fri, 30 Nov 2001 15:26:12 -0800
> From: Richard Henderson <rth@redhat.com>
> To: Joe Buck <jbuck@synopsys.COM>

> On Fri, Nov 30, 2001 at 11:01:52AM -0800, Joe Buck wrote:
> > if the option of simply calling an existing string-emitting function
> > exists.

> Unfortunately there is no such function.  That stuff is 
> replicated 99 times in various header files.

output_quoted_string in toplev.c, should be half way reasonable.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-24  3:30               ` Zack Weinberg
@ 2001-11-24  3:38                 ` Richard Henderson
  2001-11-30 15:50                   ` Richard Henderson
  2001-11-30 15:37                 ` Zack Weinberg
  1 sibling, 1 reply; 39+ messages in thread
From: Richard Henderson @ 2001-11-24  3:38 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Joe Buck, Florian Krohm, Guillaume, gcc

On Fri, Nov 30, 2001 at 03:37:43PM -0800, Zack Weinberg wrote:
> > Unfortunately there is no such function.  That stuff is 
> > replicated 99 times in various header files.
> 
> Dare I ask why?

Historic accretion.


r~

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23 23:16             ` Richard Henderson
@ 2001-11-24  3:30               ` Zack Weinberg
  2001-11-24  3:38                 ` Richard Henderson
  2001-11-30 15:37                 ` Zack Weinberg
  2001-11-30 15:28               ` Richard Henderson
  1 sibling, 2 replies; 39+ messages in thread
From: Zack Weinberg @ 2001-11-24  3:30 UTC (permalink / raw)
  To: Richard Henderson, Joe Buck, Florian Krohm, Guillaume, gcc

On Fri, Nov 30, 2001 at 03:26:12PM -0800, Richard Henderson wrote:
> On Fri, Nov 30, 2001 at 11:01:52AM -0800, Joe Buck wrote:
> > if the option of simply calling an existing string-emitting function
> > exists.
> 
> Unfortunately there is no such function.  That stuff is 
> replicated 99 times in various header files.

Dare I ask why?

zw

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23 14:42           ` Joe Buck
@ 2001-11-23 23:16             ` Richard Henderson
  2001-11-24  3:30               ` Zack Weinberg
  2001-11-30 15:28               ` Richard Henderson
  2001-11-30 11:01             ` Joe Buck
  1 sibling, 2 replies; 39+ messages in thread
From: Richard Henderson @ 2001-11-23 23:16 UTC (permalink / raw)
  To: Joe Buck; +Cc: Florian Krohm, Zack Weinberg, Guillaume, gcc

On Fri, Nov 30, 2001 at 11:01:52AM -0800, Joe Buck wrote:
> if the option of simply calling an existing string-emitting function
> exists.

Unfortunately there is no such function.  That stuff is 
replicated 99 times in various header files.


r~

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23 17:46         ` Tim Hollebeek
@ 2001-11-23 18:26           ` Dale Johannesen
  2001-11-30 15:02             ` Dale Johannesen
  2001-11-30 14:59           ` Tim Hollebeek
  1 sibling, 1 reply; 39+ messages in thread
From: Dale Johannesen @ 2001-11-23 18:26 UTC (permalink / raw)
  To: tim; +Cc: Dale Johannesen, Florian Krohm, Guillaume, Joe Buck, gcc


On Friday, November 30, 2001, at 03:07 PM, Tim Hollebeek wrote:

>
>> On Friday, November 30, 2001, at 10:12 AM, Florian Krohm wrote:
>>
>>> I'm afraid, things are even a bit more complex.
>>> Consider a string containing two characters, the first
>>> of which contains the bit pattern 00001010. The second
>>> character is '2'. If you want to recover the original
>>> representation for that string you will have to use a
>>> string concatenation e.g. "\12" "2" or "\x6" "2".
>>> Note that you cannot write "\122" as that would specify
>>> only a single character.
>>
>> "\0122" works.
>
> I believe you can also use "\12\2".

No, that specifies a second character with bit pattern 00000010,
which is not '2'.
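
The spellings discussed in this subthread can be checked directly. The
sketch below assumes an ASCII execution character set; the array names
are made up for illustration. An octal escape takes at most three
digits, whereas a hex escape has no length limit, which is why "\xa"
needs the concatenation with "2" and "\012" does not.

```c
#include <string.h>

/* "\0122": the octal escape stops after three digits, so this is the
   byte 012 (newline) followed by the character '2'.  */
const char two_chars[] = "\0122";

/* "\122": a single character, octal 122, which is 'R' in ASCII.  */
const char one_char[] = "\122";

/* "\12\2": newline followed by the byte with value 2, not '2'.  */
const char wrong_second[] = "\12\2";

/* "\12" "2": escapes are processed before adjacent literals are
   concatenated, so this is again newline followed by '2'.  */
const char concatenated[] = "\12" "2";

/* Return nonzero if the two spellings that are claimed to work really
   denote the same bytes, terminating NUL included.  */
int
spellings_agree (void)
{
  return sizeof two_chars == sizeof concatenated
         && memcmp (two_chars, concatenated, sizeof two_chars) == 0;
}
```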

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23 11:04       ` Dale Johannesen
@ 2001-11-23 17:46         ` Tim Hollebeek
  2001-11-23 18:26           ` Dale Johannesen
  2001-11-30 14:59           ` Tim Hollebeek
  2001-11-30 10:24         ` Dale Johannesen
  1 sibling, 2 replies; 39+ messages in thread
From: Tim Hollebeek @ 2001-11-23 17:46 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: Florian Krohm, Guillaume, Joe Buck, gcc


> On Friday, November 30, 2001, at 10:12 AM, Florian Krohm wrote:
> 
> > I'm afraid, things are even a bit more complex.
> > Consider a string containing two characters, the first
> > of which contains the bit pattern 00001010. The second
> > character is '2'. If you want to recover the original
> > representation for that string you will have to use a
> > string concatenation e.g. "\12" "2" or "\x6" "2".
> > Note that you cannot write "\122" as that would specify
> > only a single character.
> 
> "\0122" works.

I believe you can also use "\12\2".

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23 11:14       ` Zack Weinberg
  2001-11-23 14:13         ` Florian Krohm
@ 2001-11-23 16:40         ` Guillaume
  2001-11-30 13:55           ` Guillaume
  2001-11-30 10:26         ` Zack Weinberg
  2 siblings, 1 reply; 39+ messages in thread
From: Guillaume @ 2001-11-23 16:40 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Florian Krohm, Guillaume, Joe Buck, gcc

On Fri, 30 Nov 2001, Zack Weinberg wrote:

> > Note that you cannot write "\122" as that would specify
> > only a single character.
> > You could call this a pathological example, but I think
> > you want to come up with an algorithm that can handle
> > the general case.
>
> "\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)

Yes it will produce "\n2"

>
> We already have code to emit strings safely, into the assembly output;
> you could just use that.

I will look at your code tonight. I have put what I've done at the end
of this mail, and I will modify it if I can find code to emit strings
safely.

The function dump_string_cst() produced the following output:

@51     string_cst       type: @58     strg: "Hello\nit's a \t\"test\"\n" lngt: 22


for the following input:

  fprintf (stderr, "Hello\nit's a \t\"test\"\n");

Thanks to all of you for your help
Guillaume

-----------
diff -urN gcc-3.0.2-20011014/gcc/c-dump.c gcc-3.0.2-20011014-mod/gcc/c-dump.c
--- gcc-3.0.2-20011014/gcc/c-dump.c     Tue Jun  5 03:46:58 2001
+++ gcc-3.0.2-20011014-mod/gcc/c-dump.c Fri Nov 30 16:25:48 2001
@@ -214,6 +214,49 @@
     di->column += 14;
 }

+void
+dump_string_cst (di, string)
+     dump_info_p di;
+     const char *string;
+{
+  int index;
+
+  fprintf (di->stream, "strg: \"");
+  for (index = 0; string[index] != '\0' ; index++)
+    {
+    switch (string[index])
+      {
+      case '\a':
+        fprintf (di->stream, "%c%c", '\\', 'a');
+       break;
+      case '\b':
+        fprintf (di->stream, "%c%c", '\\', 'b');
+       break;
+      case '\t':
+        fprintf (di->stream, "%c%c", '\\', 't');
+       break;
+      case '\n':
+        fprintf (di->stream, "%c%c", '\\', 'n');
+       break;
+      case '\v':
+        fprintf (di->stream, "%c%c", '\\', 'v');
+       break;
+      case '\f':
+        fprintf (di->stream, "%c%c", '\\', 'f');
+       break;
+      case '\r':
+        fprintf (di->stream, "%c%c", '\\', 'r');
+       break;
+      case '\"':
+        fprintf (di->stream, "%c%c", '\\', '\"');
+       break;
+      default :
+        fprintf (di->stream, "%c", string[index]);
+      }
+    }
+  fprintf (di->stream, "\"");
+}
+
 /* Dump the string field S.  */

 static void
@@ -646,7 +689,7 @@
       break;

     case STRING_CST:
-      fprintf (di->stream, "strg: %-7s ", TREE_STRING_POINTER (t));
+      dump_string_cst (di, TREE_STRING_POINTER(t));
       dump_int (di, "lngt", TREE_STRING_LENGTH (t));
       break;

diff -urN gcc-3.0.2-20011014/gcc/c-dump.h gcc-3.0.2-20011014-mod/gcc/c-dump.h
--- gcc-3.0.2-20011014/gcc/c-dump.h     Tue Jun  5 03:46:58 2001
+++ gcc-3.0.2-20011014-mod/gcc/c-dump.h Fri Nov 30 16:23:12 2001
@@ -80,6 +80,8 @@
   PARAMS ((dump_info_p, const char *, int));
 extern void dump_string
   PARAMS ((dump_info_p, const char *));
+extern void dump_string_cst
+  PARAMS ((dump_info_p, const char *));
 extern void dump_stmt
   PARAMS ((dump_info_p, tree));
 extern void dump_next_stmt

--------------------

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23 14:13         ` Florian Krohm
@ 2001-11-23 14:42           ` Joe Buck
  2001-11-23 23:16             ` Richard Henderson
  2001-11-30 11:01             ` Joe Buck
  2001-11-30 10:54           ` Florian Krohm
  1 sibling, 2 replies; 39+ messages in thread
From: Joe Buck @ 2001-11-23 14:42 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Zack Weinberg, Guillaume, Joe Buck, gcc


> > "\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)
> >
> Yup, you're right. So if you use octal notation to represent a 
> non-printable character and always use 3 octal digits following
> the '\' you have something that should work in all cases.
> 
> > We already have code to emit strings safely, into the assembly output;
> > you could just use that.
> >
> Even better!

Not just "even better", IMHO.  While I'm not the one that will make
a decision about whether a patch is acceptable, I think that any patch
that includes a complete new conversion function should be rejected,
if the option of simply calling an existing string-emitting function
exists.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23 11:14       ` Zack Weinberg
@ 2001-11-23 14:13         ` Florian Krohm
  2001-11-23 14:42           ` Joe Buck
  2001-11-30 10:54           ` Florian Krohm
  2001-11-23 16:40         ` Guillaume
  2001-11-30 10:26         ` Zack Weinberg
  2 siblings, 2 replies; 39+ messages in thread
From: Florian Krohm @ 2001-11-23 14:13 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Guillaume, Joe Buck, gcc

>
> "\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)
>
Yup, you're right. So if you use octal notation to represent a 
non-printable character and always use 3 octal digits following
the '\' you have something that should work in all cases.
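
A minimal sketch of that rule, under stated assumptions: escape_octal3
is a hypothetical helper, not an existing GCC function, and it takes an
explicit length so that an embedded NUL byte is emitted as \000 instead
of ending the string. Printable characters are copied, '"' and '\\' get
a backslash, and everything else becomes a backslash followed by
exactly three octal digits.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC.  Write the LEN bytes at S to
   BUF as a quoted C string literal.  Returns the number of characters
   written (excluding the terminating NUL), or 0 if CAP is too small
   for the worst case.  */
size_t
escape_octal3 (const char *s, size_t len, char *buf, size_t cap)
{
  size_t i, n = 0;

  /* Worst case: four output bytes per input byte, two quotes, a NUL.  */
  if (cap < 4 * len + 3)
    return 0;

  buf[n++] = '"';
  for (i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) s[i];

      if (c == '"' || c == '\\')
        {
          buf[n++] = '\\';
          buf[n++] = (char) c;
        }
      else if (isprint (c))
        buf[n++] = (char) c;
      else
        /* Always three digits, so a following digit cannot be
           absorbed into the escape.  */
        n += (size_t) sprintf (buf + n, "\\%03o", (unsigned int) c);
    }
  buf[n++] = '"';
  buf[n] = '\0';
  return n;
}
```

For the two-character string from the example (bit pattern 00001010
followed by '2') this writes "\0122", the spelling suggested above.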

> We already have code to emit strings safely, into the assembly output;
> you could just use that.
>
Even better!

Florian

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23  8:52     ` Florian Krohm
  2001-11-23 10:56       ` Joe Buck
  2001-11-23 11:04       ` Dale Johannesen
@ 2001-11-23 11:14       ` Zack Weinberg
  2001-11-23 14:13         ` Florian Krohm
                           ` (2 more replies)
  2001-11-30 10:12       ` Florian Krohm
  3 siblings, 3 replies; 39+ messages in thread
From: Zack Weinberg @ 2001-11-23 11:14 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Guillaume, Joe Buck, gcc

On Fri, Nov 30, 2001 at 01:12:07PM -0500, Florian Krohm wrote:
> I'm afraid, things are even a bit more complex.
> Consider a string containing two characters, the first
> of which contains the bit pattern 00001010. The second
> character is '2'. If you want to recover the original
> representation for that string you will have to use a 
> string concatenation e.g. "\12" "2" or "\x6" "2". 

I think you meant "\xa" "2".

> Note that you cannot write "\122" as that would specify
> only a single character.
> You could call this a pathological example, but I think
> you want to come up with an algorithm that can handle
> the general case.

"\0122" will work fine.  (Or, in this case, "\n2" assuming ASCII.)

We already have code to emit strings safely, into the assembly output;
you could just use that.

zw

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23  8:52     ` Florian Krohm
  2001-11-23 10:56       ` Joe Buck
@ 2001-11-23 11:04       ` Dale Johannesen
  2001-11-23 17:46         ` Tim Hollebeek
  2001-11-30 10:24         ` Dale Johannesen
  2001-11-23 11:14       ` Zack Weinberg
  2001-11-30 10:12       ` Florian Krohm
  3 siblings, 2 replies; 39+ messages in thread
From: Dale Johannesen @ 2001-11-23 11:04 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Dale Johannesen, Guillaume, Joe Buck, gcc


On Friday, November 30, 2001, at 10:12 AM, Florian Krohm wrote:

> I'm afraid, things are even a bit more complex.
> Consider a string containing two characters, the first
> of which contains the bit pattern 00001010. The second
> character is '2'. If you want to recover the original
> representation for that string you will have to use a
> string concatenation e.g. "\12" "2" or "\x6" "2".
> Note that you cannot write "\122" as that would specify
> only a single character.

"\0122" works.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23  8:52     ` Florian Krohm
@ 2001-11-23 10:56       ` Joe Buck
  2001-11-30 10:22         ` Joe Buck
  2001-11-23 11:04       ` Dale Johannesen
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 39+ messages in thread
From: Joe Buck @ 2001-11-23 10:56 UTC (permalink / raw)
  To: Florian Krohm; +Cc: Guillaume, Joe Buck, gcc

[ fixing the dump format to be able to read back strings ]

The correct solution, in my opinion, is to produce a valid C string
literal.  This can always be done, for any input, and furthermore
there should already be routines in the compiler that you can use
(though I can't be more specific off the top of my head).  Don't try
to invent a new format.
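
One practical benefit of emitting a valid C literal is that the reading
side only has to understand C escapes. A rough sketch of such a reader
follows; unescape_literal is a hypothetical name, and this version
handles only the simple escapes and octal escapes of one to three
digits (no hex escapes, no trigraphs).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reader-side helper.  P points at the opening '"' of a
   literal; the decoded bytes are stored in OUT (at most CAP of them).
   Returns the decoded length, or (size_t) -1 on malformed input or
   if OUT is too small.  */
size_t
unescape_literal (const char *p, char *out, size_t cap)
{
  static const char simple[] = "abtnvfr";
  static const char values[] = "\a\b\t\n\v\f\r";
  size_t n = 0;

  if (*p++ != '"')
    return (size_t) -1;
  while (*p != '"')
    {
      int c;

      if (*p == '\0' || n == cap)
        return (size_t) -1;
      if (*p != '\\')
        c = (unsigned char) *p++;
      else if (p[1] >= '0' && p[1] <= '7')
        {
          /* Octal escape: consume at most three digits.  */
          int digits = 0;

          p++;
          for (c = 0; digits < 3 && *p >= '0' && *p <= '7'; digits++)
            c = c * 8 + (*p++ - '0');
          if (c > 255)
            return (size_t) -1;
        }
      else if (p[1] == '"' || p[1] == '\\' || p[1] == '\'' || p[1] == '?')
        {
          c = p[1];
          p += 2;
        }
      else
        {
          /* Simple escapes such as \n; reject anything unknown.  */
          const char *hit = p[1] != '\0' ? strchr (simple, p[1]) : NULL;

          if (hit == NULL)
            return (size_t) -1;
          c = values[hit - simple];
          p += 2;
        }
      out[n++] = (char) c;
    }
  return n;
}
```

The returned length does not count a terminating NUL, so it will be one
less than a lngt: value that includes the terminator.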

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-23  8:49   ` Guillaume
@ 2001-11-23  8:52     ` Florian Krohm
  2001-11-23 10:56       ` Joe Buck
                         ` (3 more replies)
  2001-11-30  9:54     ` Guillaume
  1 sibling, 4 replies; 39+ messages in thread
From: Florian Krohm @ 2001-11-23  8:52 UTC (permalink / raw)
  To: Guillaume, Joe Buck; +Cc: gcc

I'm afraid, things are even a bit more complex.
Consider a string containing two characters, the first
of which contains the bit pattern 00001010. The second
character is '2'. If you want to recover the original
representation for that string you will have to use a 
string concatenation e.g. "\12" "2" or "\x6" "2". 
Note that you cannot write "\122" as that would specify
only a single character.
You could call this a pathological example, but I think
you want to come up with an algorithm that can handle
the general case.

Florian

On Friday 30 November 2001 12:54, Guillaume wrote:
> On Thu, 29 Nov 2001, Joe Buck wrote:
> > Guillaume Thouvenin writes:
> > ...
> >
> > > The problem is the following. If you have something like:
> > >
> > > -- part of a C code --
> > >
> > > fprintf(stderr, "error strg: toto");
> > >
> > > --
> > >
> > > The asg given by gcc gives the following line:
> > >
> > > @247    string_cst       type: @268    strg: error strg: toto  lngt: 5
> > >
> > > So, I add a very basic modification inside GCC (in c-dump.c) and now,
> > > it produces this line:
> > >
> > > @247    string_cst       type: @268    strg: "error strg: toto"  lngt:
> > > 5
> >
> > This seems reasonable, but does your patch do the whole job?  What
> > happens if the string contains newlines, control characters, or '"'?  It
> > would seem reasonable to make the output match the input (that is, output
> > \", \n, etc).
>
> No it doesn't do the whole job. If you have something like :
>
>  fprintf (stderr, "Hello\nit's a \"test\"\n");
>
> It will produce :
>
> @54     string_cst       type: @67     strg: "Hello
> it's a "test"
> "  lngt: 21
>
> So, the good output should be
>
> @54     string_cst       type: @67     strg: "Hello\nit's a \"test\"\n"
>         lngt: 21
>
>
> Actually, strings with newlines, control characters and '"' are handled by
> my parser. The only modification that I made in GCC is in file c-dump.c:
>
> line 649:
> ---
> 648: case STRING_CST:
> 649:      fprintf (di->stream, "strg: \"%-7s\" ", TREE_STRING_POINTER (t));
>                                       ^^    ^^
> 650:      dump_int (di, "lngt", TREE_STRING_LENGTH (t));
> 651:      break;
>
> So, can I try to patch GCC to make the output match the input?
>
> Guillaume

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-22 13:14 ` Joe Buck
@ 2001-11-23  8:49   ` Guillaume
  2001-11-23  8:52     ` Florian Krohm
  2001-11-30  9:54     ` Guillaume
  2001-11-29 19:00   ` Joe Buck
  1 sibling, 2 replies; 39+ messages in thread
From: Guillaume @ 2001-11-23  8:49 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

On Thu, 29 Nov 2001, Joe Buck wrote:

> Guillaume Thouvenin writes:
> ...
> > The problem is the following. If you have something like:
> >
> > -- part of a C code --
> >
> > fprintf(stderr, "error strg: toto");
> >
> > --
> >
> > The asg given by gcc gives the following line:
> >
> > @247    string_cst       type: @268    strg: error strg: toto  lngt: 5
> >
> > So, I add a very basic modification inside GCC (in c-dump.c) and now, it
> > produces this line:
> >
> > @247    string_cst       type: @268    strg: "error strg: toto"  lngt: 5
> >
> This seems reasonable, but does your patch do the whole job?  What happens
> if the string contains newlines, control characters, or '"'?  It would
> seem reasonable to make the output match the input (that is, output \",
> \n, etc).

No it doesn't do the whole job. If you have something like :

 fprintf (stderr, "Hello\nit's a \"test\"\n");

It will produce :

@54     string_cst       type: @67     strg: "Hello
it's a "test"
"  lngt: 21

So, the good output should be

@54     string_cst       type: @67     strg: "Hello\nit's a \"test\"\n"
        lngt: 21


Actually, strings with newlines, control characters and '"' are handled by
my parser. The only modification that I made in GCC is in file c-dump.c:

line 649:
---
648: case STRING_CST:
649:      fprintf (di->stream, "strg: \"%-7s\" ", TREE_STRING_POINTER (t));
                                      ^^    ^^
650:      dump_int (di, "lngt", TREE_STRING_LENGTH (t));
651:      break;

So, can I try to patch GCC to make the output match the input?

Guillaume

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: fdump-ast-original and strg:
  2001-11-22 13:14 Guillaume
@ 2001-11-22 13:14 ` Joe Buck
  2001-11-23  8:49   ` Guillaume
  2001-11-29 19:00   ` Joe Buck
  2001-11-29 18:46 ` Guillaume
  1 sibling, 2 replies; 39+ messages in thread
From: Joe Buck @ 2001-11-22 13:14 UTC (permalink / raw)
  To: Guillaume; +Cc: gcc

Guillaume Thouvenin writes:
...
> The problem is the following. If you have something like:
> 
> -- part of a C code --
> 
> fprintf(stderr, "error strg: toto");
> 
> --
> 
> The asg given by gcc gives the following line:
> 
> @247    string_cst       type: @268    strg: error strg: toto  lngt: 5
> 
> So, I add a very basic modification inside GCC (in c-dump.c) and now, it
> produces this line:
> 
> @247    string_cst       type: @268    strg: "error strg: toto"  lngt: 5
> 
> It is easier to parse. So, I'd like to know if it can be added to an
> official future gcc release. It's only one line, and it would make things
> easier for me because people wouldn't need to recompile gcc to use my
> tool (OK, for now I'm the only one who uses it, but that can change...).

This seems reasonable, but does your patch do the whole job?  What happens
if the string contains newlines, control characters, or '"'?  It would
seem reasonable to make the output match the input (that is, output \",
\n, etc).

^ permalink raw reply	[flat|nested] 39+ messages in thread

* fdump-ast-original and strg:
@ 2001-11-22 13:14 Guillaume
  2001-11-22 13:14 ` Joe Buck
  2001-11-29 18:46 ` Guillaume
  0 siblings, 2 replies; 39+ messages in thread
From: Guillaume @ 2001-11-22 13:14 UTC (permalink / raw)
  To: gcc

Hello,

I'm a student, and I'm trying to build a tool that uses the ASG produced
by g++ with the option -fdump-ast-original. So far I have built a basic
parser that reads the file file.c.original and stores the ASG in memory in
a hash table keyed by node number. I have also built a visitor that walks
the in-memory ASG and extracts a CFG for some analysis.

The problem is the following. If you have something like:

-- part of a C code --

fprintf(stderr, "error strg: toto");

--

The ASG dump produced by gcc contains the following line:

@247    string_cst       type: @268    strg: error strg: toto  lngt: 5

So I made a very basic modification inside GCC (in c-dump.c), and now it
produces this line:

@247    string_cst       type: @268    strg: "error strg: toto"  lngt: 5

It is easier to parse. So, I'd like to know if it can be added to an
official future gcc release. It's only one line, and it would make things
easier for me because people wouldn't need to recompile gcc to use my
tool (OK, for now I'm the only one who uses it, but that can change...).

Thank you
Sorry for my English

---
Guillaume Thouvenin
GASTA: Gcc Abstract Syntax Tree
http://gasta.sf.net





^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2001-12-03 22:23 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-11-24 14:37 fdump-ast-original and strg: mike stump
2001-11-24 17:15 ` Joseph S. Myers
2001-11-30 18:35   ` Joseph S. Myers
2001-12-03 14:23   ` Richard Henderson
2001-11-30 17:49 ` mike stump
  -- strict thread matches above, loose matches on Subject: below --
2001-11-24 11:14 mike stump
2001-11-24 12:41 ` Joe Buck
2001-11-30 17:12   ` Joe Buck
2001-11-30 16:59 ` mike stump
2001-11-22 13:14 Guillaume
2001-11-22 13:14 ` Joe Buck
2001-11-23  8:49   ` Guillaume
2001-11-23  8:52     ` Florian Krohm
2001-11-23 10:56       ` Joe Buck
2001-11-30 10:22         ` Joe Buck
2001-11-23 11:04       ` Dale Johannesen
2001-11-23 17:46         ` Tim Hollebeek
2001-11-23 18:26           ` Dale Johannesen
2001-11-30 15:02             ` Dale Johannesen
2001-11-30 14:59           ` Tim Hollebeek
2001-11-30 10:24         ` Dale Johannesen
2001-11-23 11:14       ` Zack Weinberg
2001-11-23 14:13         ` Florian Krohm
2001-11-23 14:42           ` Joe Buck
2001-11-23 23:16             ` Richard Henderson
2001-11-24  3:30               ` Zack Weinberg
2001-11-24  3:38                 ` Richard Henderson
2001-11-30 15:50                   ` Richard Henderson
2001-11-30 15:37                 ` Zack Weinberg
2001-11-30 15:28               ` Richard Henderson
2001-11-30 11:01             ` Joe Buck
2001-11-30 10:54           ` Florian Krohm
2001-11-23 16:40         ` Guillaume
2001-11-30 13:55           ` Guillaume
2001-11-30 10:26         ` Zack Weinberg
2001-11-30 10:12       ` Florian Krohm
2001-11-30  9:54     ` Guillaume
2001-11-29 19:00   ` Joe Buck
2001-11-29 18:46 ` Guillaume
