public inbox for gcc@gcc.gnu.org
* Re: [PATCH]: Proof-of-concept for dynamic format checking
       [not found] <200508111509.j7BF9qMq015700@caipclassic.rutgers.edu>
@ 2005-08-17 19:15 ` Ian Lance Taylor
  2005-08-17 19:19   ` Florian Weimer
  2005-08-19 19:40   ` Tom Tromey
  2005-08-18  2:01 ` Kaveh R. Ghazi
  1 sibling, 2 replies; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-17 19:15 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: gcc, joseph

"Kaveh R. Ghazi" <ghazi@caipclassic.rutgers.edu> writes:

[ Moved from gcc-patches to gcc ]

> At this point, I don't do any parsing of the "format-checking-data",
> this is where I would expect Ian's state machine language to appear.

To make this kind of thing useful, I see two paths that we can follow.


The first is to simply not try to implement all of printf in a special
language.  Most printf extensions are not nearly as complex as printf
itself.  In fact, most simply add a few additional % conversions, with
no modifiers.  So we can get pretty good mileage out of a mechanism
which simply says "like printf, plus these conversions".

For example,

#pragma GCC format "bfd" "inherit printf; %A: asection; %B: bfd"

Here the "inherit" could be simply "printf" for whatever is
appropriate for the current compilation, or it could be a specific
standard name.

Unfortunately it turns out that this doesn't currently describe the
BFD formatting.  The BFD additional conversion specifiers %A and %B
can only appear at the beginning of the string, before any other
conversion specifiers.  But still this would be good enough for many
uses.
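
As a rough illustration of the "like printf, plus these conversions"
idea, the extra conversions could be held in a small table that is
consulted before falling back to the inherited printf checks.  This is
only a sketch: the names extra_conv, bfd_extras and lookup_conversion
are hypothetical and do not exist in GCC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: one extra conversion added on top of an inherited format.  */
struct extra_conv
{
  char letter;            /* conversion character after '%' */
  const char *arg_type;   /* type name the argument must have */
};

/* The two BFD additions from the pragma example above.  */
static const struct extra_conv bfd_extras[] = {
  { 'A', "asection" },
  { 'B', "bfd" },
};

/* Return the expected argument type for an extension conversion, or NULL
   to mean "not an extension; use the inherited printf checks instead".  */
static const char *
lookup_conversion (const struct extra_conv *table, size_t n, char letter)
{
  for (size_t i = 0; i < n; i++)
    if (table[i].letter == letter)
      return table[i].arg_type;
  return NULL;
}
```

A checker would call lookup_conversion (bfd_extras, 2, 'A') for each
directive and only hand the directive to the ordinary printf logic when
the result is NULL.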


The second approach is of course to write a little language which is
powerful enough to describe printf.  The state machine language I
described earlier is too simple and perhaps overly cryptic.  It would
be easier to understand a language based on regular expressions (which
are of course equivalent to state machines).  The main issues I see
are:

* There is duplicated information in printf flags.  For example, in
  many cases we want to accept a width flag, and warn if it is
  zero.  We don't want to have to duplicate that warning for each
  conversion specifier which uses a width flag.

* The use of dollar to specify a particular argument (e.g., "%1$d")
  must be represented cleanly.  The dollar style must be used with
  every conversion or with none.  "%1$*2$d" is a particularly annoying
  construct.

* It might be convenient to be able to access the current standard
  level.  It might also be convenient to access specific warning
  options.

So let's consider a little language in which each conversion specifier
is described by a simplified regular expression.  In order to handle
commonalities, we permit the regular expressions to use subroutines--a
subroutine is itself a named regular expression.  Each regular
expression may have a conditional indicating whether it is valid,
which may reference the standard level and warning options.  Each
regular expression may have an action which is executed if it matches.
The action is a sequence of zero or more predefined functions.  Dollar
specifiers are handled specially by the framework, not by the little
language itself.

In this proposal, we can't use standard regular expressions, because
they have no provision for the subroutines which I think we need.  So
here is the regular expression syntax:

c          where c is not a meta-character, matches c
\c         matches c
[abc...]   matches any of the characters abc....
r1r2       matches r1, then r2
(r)        grouping; matches r
r*         matches zero or more of r
r+         matches one or more of r
r?         matches zero or one r
{NAME}     matches a regular expression matched by {NAME}
{$}        matches [0123456789]+$, and handles dollar specifier

The meta-characters are "\*+?[](){}/:" (colon is used for labels).

Typical regular expression items which are not present (but which we
could add if we want them): ^ and $ anchors, . to match any character,
| for alternation, [^abc...] for a negated match.

The grammar of the little language is:

rules: rule | rule rules
rule: optcond optlabel '/' REGEXP '/' optactions
optcond: /* empty */ | '?' cond '?'
cond: VAR | cond '||' cond | cond '&&' cond | '(' cond ')'
optlabel: /* empty */ | ':' NAME ':'
optactions: /* empty */ | '{' actions '}'
actions: /* empty */ | action actions
action: FNNAME '(' args ')' ';'
args: arg | arg ',' args
arg: STRING

We permit multiple regular expressions to have the same name.  In this
case we try to match each one in order, and apply the actions of the
first one to match.

Conditionals are expressions based on variables, like flag_isoc99.
I'm not sure yet what the permitted variables should be.

The string is processed by matching unnamed regular expressions in
order, anchored at the start of the string.  If nothing matches, the
character is ignored and we start matching at the next character.

So, for printf:

:FLAGS: /[#0- +]*/
:WIDTH: /[123456789][0123456789]*/
:WIDTH: /\*{$}/ { match (int); }
:PREC:  /.[0123456789]*/
:PREC:  /\*{$}/ { match (int); }
:ADJ:   /{FLAGS}{WIDTH}?{PREC}?/
?flag_isoc99? /%{ADJ}hh[di]/  { match (char); }
/%{ADJ}h[di]/   { match (short); }
?flag_isoc99? /%{ADJ}ll[di]/ { match (long long); }
/%{ADJ}l[di]/   { match (long); }
/%{ADJ}[di]/    { match (int); }

etc.  Note that the use of {$} in the regexp controls which argument
is tested by the next call to "match".
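
For comparison, here is a hand-written C sketch of what the %d/%i rules
above amount to.  It is an approximation under stated assumptions: the
function names are hypothetical, the isoc99 parameter stands in for the
flag_isoc99 conditional, and the "*" forms of WIDTH and PREC (with their
{$} handling) are left out.

```c
#include <stddef.h>
#include <string.h>

/* Skip characters of S that belong to SET; return the first that does not.  */
static const char *
skip_set (const char *s, const char *set)
{
  while (*s != '\0' && strchr (set, *s) != NULL)
    s++;
  return s;
}

/* Return the expected argument type for the directive starting at SPEC,
   or NULL if none of the %d/%i rules match.  */
static const char *
int_directive_type (const char *spec, int isoc99)
{
  const char *type = "int";

  if (*spec != '%')
    return NULL;
  spec = skip_set (spec + 1, "#0- +");        /* {FLAGS} */
  if (*spec >= '1' && *spec <= '9')           /* {WIDTH}?, digits only */
    spec = skip_set (spec, "0123456789");
  if (*spec == '.')                           /* {PREC}?, digits only */
    spec = skip_set (spec + 1, "0123456789");

  /* Try the two-character length modifiers before the one-character ones,
     mirroring the order of the rules.  */
  if (strncmp (spec, "hh", 2) == 0 || strncmp (spec, "ll", 2) == 0)
    {
      if (!isoc99)
        return NULL;
      type = (*spec == 'h') ? "char" : "long long";
      spec += 2;
    }
  else if (*spec == 'h' || *spec == 'l')
    {
      type = (*spec == 'h') ? "short" : "long";
      spec++;
    }
  return (*spec == 'd' || *spec == 'i') ? type : NULL;
}
```

The point of the comparison is that the flag, width and precision
handling appears once here, just as {ADJ} appears once in the rules.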

As an example of checking for a zero field width in scanf:

:WIDTH: /[0123456789]*[123456789][0123456789]*/
:WIDTH: /0*/ { warn ("zero width"); }
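
In plain C, the distinction those two WIDTH rules draw could be sketched
as follows.  The function name is hypothetical, and the sketch assumes
that an absent width is acceptable and only an explicit all-zero width
deserves the warning.

```c
#include <stddef.h>

/* Return 1 if the scanf field width at S is present and consists only of
   zeros (the case the second WIDTH rule warns about), 0 otherwise.  */
static int
scanf_width_is_zero (const char *s)
{
  size_t digits = 0;
  int nonzero = 0;

  for (; *s >= '0' && *s <= '9'; s++)
    {
      digits++;
      if (*s != '0')
        nonzero = 1;
    }
  return digits > 0 && !nonzero;
}
```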

For BFD:

/%A/ { match (asection); }
/%B/ { match (bfd); }
/{PRINTF}/

I still don't have a way to say that %A and %B must appear first.  I'm
a bit leery of introducing C style variables and if statements,
although that would be one approach.  We could also do it like this:

/%A/ { warn_if_flag (0, "%A must not follow a printf specifier");
       match (asection); }
/%B/ { warn_if_flag (0, "%B must not follow a printf specifier");
       match (bfd); }
/{PRINTF}/ { set_flag (0); }
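
The effect of the flag-based rules can be sketched in C as a single scan
over the format string.  This is only an illustration: the function name
is hypothetical, "%%" is assumed not to count as a printf specifier, and
any other "%" followed by a character is treated as one without parsing
its flags.

```c
#include <stddef.h>

/* Count the warnings the flag-based BFD rules would give for FORMAT.  */
static int
count_bfd_order_warnings (const char *format)
{
  int seen_printf = 0;   /* plays the role of flag 0 */
  int warnings = 0;

  for (const char *p = format; *p != '\0'; p++)
    {
      if (*p != '%' || p[1] == '\0')
        continue;
      p++;                         /* look at the character after '%' */
      if (*p == 'A' || *p == 'B')
        warnings += seen_printf;   /* warn_if_flag (0, ...) */
      else if (*p != '%')
        seen_printf = 1;           /* set_flag (0) */
    }
  return warnings;
}
```

With this scan, "%A and %B: %d" gives no warning, while "%d then %A"
gives one.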

I haven't tried to flesh this out any further.  I'd be curious to hear
how people react to it.

Ian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 19:15 ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
@ 2005-08-17 19:19   ` Florian Weimer
  2005-08-17 19:25     ` Ian Lance Taylor
                       ` (2 more replies)
  2005-08-19 19:40   ` Tom Tromey
  1 sibling, 3 replies; 40+ messages in thread
From: Florian Weimer @ 2005-08-17 19:19 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Kaveh R. Ghazi, gcc, joseph

* Ian Lance Taylor:

> I haven't tried to flesh this out any further.  I'd be curious to hear
> how people react to it.

Can't we just use some inline function written in plain C to check the
arguments and execute it at compile time using constant folding etc.?


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 19:19   ` Florian Weimer
  2005-08-17 19:25     ` Ian Lance Taylor
@ 2005-08-17 19:25     ` Mike Stump
  2005-08-18  1:00     ` Giovanni Bajo
  2 siblings, 0 replies; 40+ messages in thread
From: Mike Stump @ 2005-08-17 19:25 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Ian Lance Taylor, Kaveh R. Ghazi, gcc, joseph

On Aug 17, 2005, at 12:19 PM, Florian Weimer wrote:
> Can't we just use some inline function written in plain C to check the
> arguments and execute it at compile time using constant folding etc.?

I like this idea, but, I'm probably weird.


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 19:19   ` Florian Weimer
@ 2005-08-17 19:25     ` Ian Lance Taylor
  2005-08-17 19:45       ` Florian Weimer
  2005-08-17 19:25     ` Mike Stump
  2005-08-18  1:00     ` Giovanni Bajo
  2 siblings, 1 reply; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-17 19:25 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Kaveh R. Ghazi, gcc, joseph

Florian Weimer <fw@deneb.enyo.de> writes:

> * Ian Lance Taylor:
> 
> > I haven't tried to flesh this out any further.  I'd be curious to hear
> > how people react to it.
> 
> Can't we just use some inline function written in plain C to check the
> arguments and execute it at compile time using constant folding etc.?

I don't really see how that could work and still do what we want it to
do.  Could you give an example of what it would look like?

Ian


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 19:25     ` Ian Lance Taylor
@ 2005-08-17 19:45       ` Florian Weimer
  2005-08-17 20:00         ` Ian Lance Taylor
  0 siblings, 1 reply; 40+ messages in thread
From: Florian Weimer @ 2005-08-17 19:45 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Kaveh R. Ghazi, gcc, joseph

* Ian Lance Taylor:

> Florian Weimer <fw@deneb.enyo.de> writes:
>
>> * Ian Lance Taylor:
>> 
>> > I haven't tried to flesh this out any further.  I'd be curious to hear
>> > how people react to it.
>> 
>> Can't we just use some inline function written in plain C to check the
>> arguments and execute it at compile time using constant folding etc.?
>
> I don't really see how that could work and still do what we want it to
> do.  Could you give an example of what it would look like?

If I understand your %A/%B example correctly, it would look like this:

/* FORMAT is the complete format string, POS the offset of the current %
   directive.  Returns a C type specifier as a string.  NULL means: do not
   consume any argument.  */
static inline const char *
printf_checker_bfd (const char *format, size_t pos)
{
  if (strncmp (format + pos, "%A", 2) == 0)
    {
      if (pos != 0)
        {
          __builtin_warn ("`%A' must occur at the start of the format string");
          return "void *"; // accept anything
        }
      return "asection *";
    }
  if (strncmp (format + pos, "%B", 2) == 0)
    {
      if (pos != 0)
        {
          __builtin_warn ("`%B' must occur at the start of the format string");
          return "void *"; // accept anything
        }
      return "bfd *";
    }
  return __builtin_printf_checker (format, pos); // handle printf format string
}

#pragma GCC format "bfd" "invoke printf_checker_bfd"

The interface still needs some polishing; it might be desirable to be
able to pass along some kind of flag.  Perhaps it's more obvious to
express the scanning loop in the checking code and explicitly compare
the type using some builtin, but this is probably even more
challenging for the optimizers.


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 19:45       ` Florian Weimer
@ 2005-08-17 20:00         ` Ian Lance Taylor
  2005-08-17 20:25           ` Florian Weimer
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-17 20:00 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Kaveh R. Ghazi, gcc, joseph

Florian Weimer <fw@deneb.enyo.de> writes:

> If I understand your %A/%B example correctly, it would look like this:

OK, I can see how that might work in a simple case.  Now, can you give
me an example of matching %d with the various flags?  In particular,
are you going to write a loop, and is gcc going to somehow fully
unroll that loop at compile time?

Ian


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 20:00         ` Ian Lance Taylor
@ 2005-08-17 20:25           ` Florian Weimer
  0 siblings, 0 replies; 40+ messages in thread
From: Florian Weimer @ 2005-08-17 20:25 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Kaveh R. Ghazi, gcc, joseph

* Ian Lance Taylor:

> Florian Weimer <fw@deneb.enyo.de> writes:
>
>> If I understand your %A/%B example correctly, it would look like this:
>
> OK, I can see how that might work in a simple case.  Now, can you give
> me an example of matching %d with the various flags?  In particular,
> are you going to write a loop, and is gcc going to somehow fully
> unroll that loop at compile time?

This is indeed a problem (with GCC 4.0 at least).  A regexp builtin
which returns the length of the matched string could probably
solve this.  Managing state so that you can still compose multiple
checkers is the harder part, I think.


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 19:19   ` Florian Weimer
  2005-08-17 19:25     ` Ian Lance Taylor
  2005-08-17 19:25     ` Mike Stump
@ 2005-08-18  1:00     ` Giovanni Bajo
  2005-08-18  1:20       ` Ian Lance Taylor
  2005-08-18 12:00       ` Florian Weimer
  2 siblings, 2 replies; 40+ messages in thread
From: Giovanni Bajo @ 2005-08-18  1:00 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Joseph S. Myers, gcc, ghazi, Ian Lance Taylor

Florian Weimer <fw@deneb.enyo.de> wrote:

>> I haven't tried to flesh this out any further.  I'd be curious to
>> hear how people react to it.
>
> Can't we just use some inline function written in plain C to check the
> arguments and execute it at compile time using constant folding etc.?


Do we have a sane way to (partially) execute optimizers at -O0 without messing
with the pass manager too much? They can probably be talked into it, but it
might require some work. The idea is neat though, and I prefer it over
introducing a specific pattern-matching language (which sounds like
over-engineering for such a side feature).

Giovanni Bajo


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  1:00     ` Giovanni Bajo
@ 2005-08-18  1:20       ` Ian Lance Taylor
  2005-08-18 11:56         ` Dave Korn
  2005-08-18 12:00       ` Florian Weimer
  1 sibling, 1 reply; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-18  1:20 UTC (permalink / raw)
  To: Giovanni Bajo; +Cc: Florian Weimer, Joseph S. Myers, gcc, ghazi

"Giovanni Bajo" <giovannibajo@libero.it> writes:

> Florian Weimer <fw@deneb.enyo.de> wrote:
> 
> > Can't we just use some inline function written in plain C to check the
> > arguments and execute it at compile time using constant folding etc.?
> 
> 
> Do we have a sane way to (partially) execute optimizers at -O0 without screwing
> up with the pass manager too much? Probably they can be talked into, but might
> require some work. The idea is neat though, and I prefer it over introducing a
> specific pattern-matching language (which sounds like over-engineering for such
> a side feature).

I suppose I have the reverse opinion about which one is over-
engineering, but that's probably just me.  Remember that it's not
enough simply to execute the optimizers.  You have to build a symbol
table and an environment for the code to execute in.

Another approach would be to dlopen a shared library to do format
checking.  There might be some security implications to that, though.

Ian


* Re: [PATCH]: Proof-of-concept for dynamic format checking
       [not found] <200508111509.j7BF9qMq015700@caipclassic.rutgers.edu>
  2005-08-17 19:15 ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
@ 2005-08-18  2:01 ` Kaveh R. Ghazi
  2005-08-18  2:08   ` Kaveh R. Ghazi
  1 sibling, 1 reply; 40+ messages in thread
From: Kaveh R. Ghazi @ 2005-08-18  2:01 UTC (permalink / raw)
  To: ian; +Cc: gcc, joseph


 > To make this kind of thing useful, I see two paths that we can follow.
 > The first is to simply not try to implement all of printf in a special
 > language.  Most printf extensions are not nearly as complex as printf
 > itself.  In fact, most simply add a few additional % conversions, with
 > no modifiers.  So we can get pretty good mileage out of a mechanism
 > which simply says "like printf, plus these conversions".

I like having a shorthand; however, looking at the custom formats in
the GCC sources, many of them want something much simpler than printf
but with several extra flags.

For example, gcc's asm_fprintf format implements "l" long and "ll"
long long as length modifiers, plus an extension "w" for gcc's
HOST_WIDE_INT.  However it does not implement C90 "h" or the C99 or
GNU extension length modifiers (e.g. "z" or "Z" for size_t).

Ditto for the gcc diagnostic formats.

Specifiers themselves are also a mixed bag.  The asm_fprintf format
doesn't implement %p or the floating point specifiers.  But of course
it has a bunch of its own extension flags.

So clearly many implementations will need a language to specify
exactly what they do.

Alternatively or maybe in addition, we could have a way to say "like
printf, but delete these specifiers, and these modifiers.  Then add
these other things."  Ultimately if a complete language is available
as well as "like printf" then users will do whichever is easier given
their particular format.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  2:01 ` Kaveh R. Ghazi
@ 2005-08-18  2:08   ` Kaveh R. Ghazi
  2005-08-18  2:50     ` Ian Lance Taylor
  0 siblings, 1 reply; 40+ messages in thread
From: Kaveh R. Ghazi @ 2005-08-18  2:08 UTC (permalink / raw)
  To: ian; +Cc: gcc, joseph

 > For example,
 > 
 > #pragma GCC format "bfd" "inherit printf; %A: asection; %B: bfd"
 > 
 > Here the "inherit" could be simply "printf" for whatever is
 > appropriate for the current compilation, or it could be a specific
 > standard name.

I strongly feel that the "inherit" command should not change the
behavior of the inherited format depending on the --std= flag passed
to GCC at compile time of the user's code.  Such a change isn't right
for users: their variable argument output routine will not change its
behavior based on the C standard in effect when compiling it.

Therefore if we implement an inherit, it should force the user to
choose "inherit printf90", "inherit printf99" or "inherit printfGNU".
Or something along those lines.
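
To make the difference between the fixed choices concrete, here is a
rough C sketch of how the accepted length modifiers could depend on the
named base format.  The enum and function names are hypothetical.  The
modifier lists are the standard ones as far as I know (h, l, L in C90;
hh, ll, j, z, t added by C99), and "q" and "Z" are taken as the GNU
additions for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the three fixed base formats.  */
enum printf_dialect { PRINTF_C90, PRINTF_C99, PRINTF_GNU };

/* Return 1 if MOD is one of the N strings in LIST.  */
static int
in_list (const char *const *list, size_t n, const char *mod)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (list[i], mod) == 0)
      return 1;
  return 0;
}

/* Return 1 if length modifier MOD is accepted by dialect D.  Each
   dialect accepts everything the previous one does.  */
static int
length_modifier_ok (enum printf_dialect d, const char *mod)
{
  static const char *const c90[] = { "h", "l", "L" };
  static const char *const c99[] = { "hh", "ll", "j", "z", "t" };
  static const char *const gnu[] = { "q", "Z" };

  if (in_list (c90, 3, mod))
    return 1;
  if (d >= PRINTF_C99 && in_list (c99, 5, mod))
    return 1;
  return d == PRINTF_GNU && in_list (gnu, 2, mod);
}
```

With an explicit "inherit printf90", a "%lld" in the user's format would
be rejected no matter which --std= flag is in effect.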

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  2:08   ` Kaveh R. Ghazi
@ 2005-08-18  2:50     ` Ian Lance Taylor
  2005-08-18  3:07       ` Kaveh R. Ghazi
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-18  2:50 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: gcc

"Kaveh R. Ghazi" <ghazi@caipclassic.rutgers.edu> writes:

> I strongly feel that the "inherit" command should not change the
> behavior of the inherited format depending on the --std= flag passed
> to GCC at compile time of the user's code.  This change isn't right
> for users, their variable argument output routine will not change its
> behavior based on the C standard in effect when compiling it.
> 
> Therefore if we implement an inherit, it should force the user to
> choose "inherit printf90", "inherit printf99" or "inherit printfGNU".
> Or something along those lines.

But in cases like BFD, the code just does some pre-processing and then
calls vfprintf.  So there is no always correct value to inherit.  The
correct value to inherit from is the one which the user will link
against, and for that the closest we can come to the right answer is
the --std= flag used at compile time of the user's code.

Ian


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  2:50     ` Ian Lance Taylor
@ 2005-08-18  3:07       ` Kaveh R. Ghazi
  2005-08-18  3:42         ` Alan Modra
  0 siblings, 1 reply; 40+ messages in thread
From: Kaveh R. Ghazi @ 2005-08-18  3:07 UTC (permalink / raw)
  To: ian; +Cc: gcc

 > But in cases like BFD, the code just does some pre-processing and then
 > calls vfprintf.  So there is no always correct value to inherit.  The
 > correct value to inherit from is the one which the user will link
 > against, and for that the closest we can come to the right answer is
 > the --std= flag used at compile time of the user's code.
 > Ian

Yeah, BFD can only do that because it forces the %A %B specifiers to be
at the front.  (Maybe inheriting the morphing printf is your trigger
for enforcing front position for all extended specifiers?  Or is that
too esoteric for users?)

Anyway, I conclude we need both fixed and adjustable inheriting.
So "inherit printf" for BFD and "inherit printf90" (etc) for other
implementations.  That's easy enough to code up.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  3:07       ` Kaveh R. Ghazi
@ 2005-08-18  3:42         ` Alan Modra
  2005-08-18 12:46           ` Kaveh R. Ghazi
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Modra @ 2005-08-18  3:42 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: ian, gcc

On Wed, Aug 17, 2005 at 11:07:42PM -0400, Kaveh R. Ghazi wrote:
> Yeah, BFD can only do that because it forces the %A %B specifiers be
> in the front.

No, it's worse than that.  %A and %B can appear anywhere in the format
string, but consume their args first.  eg.

_bfd_default_error_handler ("section %d is called %A", sec, 1);

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* RE: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  1:20       ` Ian Lance Taylor
@ 2005-08-18 11:56         ` Dave Korn
  2005-08-18 12:00           ` Florian Weimer
  0 siblings, 1 reply; 40+ messages in thread
From: Dave Korn @ 2005-08-18 11:56 UTC (permalink / raw)
  To: 'Ian Lance Taylor', 'Giovanni Bajo'
  Cc: 'Florian Weimer', 'Joseph S. Myers', gcc, ghazi

----Original Message----
>From: Ian Lance Taylor
>Sent: 18 August 2005 02:20

> "Giovanni Bajo" writes:
> 
>> Florian Weimer wrote:
>> 
>>> Can't we just use some inline function written in plain C to check the
>>> arguments and execute it at compile time using constant folding etc.?
>> 
>> 
>> Do we have a sane way to (partially) execute optimizers at -O0 without
>> screwing up with the pass manager too much? Probably they can be talked
>> into, but might require some work. The idea is neat though, and I prefer
>> it over introducing a specific pattern-matching language (which sounds
>> like over-engineering for such a side feature).
> 
> I suppose I have the reverse opinion about which one is over-
> engineering, but that's probably just me.  Remember that it's not
> enough simply to execute the optimizers.  You have to build a symbol
> table and an environment for the code to execute in.
> 
> Another approach would be to dlopen a shared library to do format
> checking.  There might be some security implications to that, though.


  PMFBI, but how is all this going to work on a cross compiler?


    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 11:56         ` Dave Korn
@ 2005-08-18 12:00           ` Florian Weimer
  2005-08-18 12:09             ` Dave Korn
  0 siblings, 1 reply; 40+ messages in thread
From: Florian Weimer @ 2005-08-18 12:00 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Ian Lance Taylor', 'Giovanni Bajo',
	'Joseph S. Myers',
	gcc, ghazi

* Dave Korn:

>   PMFBI, but how is all this going to work on a cross compiler?

Constant folding works in a cross-compiler, too. 8-)


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  1:00     ` Giovanni Bajo
  2005-08-18  1:20       ` Ian Lance Taylor
@ 2005-08-18 12:00       ` Florian Weimer
  2005-08-18 16:38         ` Ian Lance Taylor
  1 sibling, 1 reply; 40+ messages in thread
From: Florian Weimer @ 2005-08-18 12:00 UTC (permalink / raw)
  To: Giovanni Bajo; +Cc: Joseph S. Myers, gcc, ghazi, Ian Lance Taylor

* Giovanni Bajo:

> Do we have a sane way to (partially) execute optimizers at -O0
> without screwing up with the pass manager too much?

Do we have to provide user-defined format string warnings at -O0?


* RE: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 12:00           ` Florian Weimer
@ 2005-08-18 12:09             ` Dave Korn
  2005-08-18 19:10               ` Mike Stump
  0 siblings, 1 reply; 40+ messages in thread
From: Dave Korn @ 2005-08-18 12:09 UTC (permalink / raw)
  To: 'Florian Weimer'
  Cc: 'Ian Lance Taylor', 'Giovanni Bajo',
	'Joseph S. Myers',
	gcc, ghazi

----Original Message----
>From: Florian Weimer
>Sent: 18 August 2005 13:00

> * Dave Korn:
> 
>>   PMFBI, but how is all this going to work on a cross compiler?
> 
> Constant folding works in a cross-compiler, too. 8-)

  I was referring to this bit:

>   Remember that it's not
> enough simply to execute the optimizers.  You have to build a symbol
> table and an environment for the code to execute in.

  IIUIC, that would be a requirement for the optimisers to be able to
perform the full constant-folding and get the same results as if the
function was executed at runtime instead, wouldn't it?  It seems to me like
it would be quite a difficult thing to get right in a cross environment.


    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18  3:42         ` Alan Modra
@ 2005-08-18 12:46           ` Kaveh R. Ghazi
  2005-08-18 13:41             ` Alan Modra
  0 siblings, 1 reply; 40+ messages in thread
From: Kaveh R. Ghazi @ 2005-08-18 12:46 UTC (permalink / raw)
  To: amodra; +Cc: gcc, ian

 > > Yeah, BFD can only do that because it forces the %A %B specifiers be
 > > in the front.
 > 
 > No, it's worse than that.  %A and %B can appear anywhere in the format
 > string, but consume their args first.  eg.
 > 
 > _bfd_default_error_handler ("section %d is called %A", sec, 1);
 > 
 > Alan Modra

Oh... ick, I didn't realize that.  It means my numbers for format
errors in bfd were off because GCC counted positional mismatches as
bugs in note #1 here:
http://gcc.gnu.org/ml/gcc-patches/2005-08/msg00693.html

GCC's current infrastructure doesn't seem suited to handle this style.
The closest I can shoehorn it is that it's really two separate
formats, one with %A and %B ignoring all other specifiers and checking
against the first N arguments.  Then check all other specifiers
ignoring %A and %B against arguments N+1 until the end.  (Where N
equals the number of %A and %B appearances.)  We can actually do that
in GCC by creating two separate formats.  The trick is to calculate N
on the fly and apply both formats against the "right" arguments.  I
haven't figured out how to solve that without major surgery on GCC.
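
The two-format idea can at least be sketched outside of GCC.  The
following C fragment computes, for each directive in a BFD-style format
string, which argument position it would consume: %A and %B take the
first N arguments in order, and everything else starts at argument N+1.
The function name is hypothetical, and the scan is deliberately naive
(it assumes bare conversions such as "%d", with no "*" widths).

```c
#include <stddef.h>

/* Store in ARG_INDEX[i] the 0-based argument position consumed by the
   i-th directive of FORMAT.  Returns the number of directives stored,
   at most MAX.  */
static size_t
map_bfd_args (const char *format, size_t *arg_index, size_t max)
{
  size_t n_ab = 0;

  /* Pass 1: N is the number of %A and %B directives.  */
  for (const char *p = format; *p != '\0'; p++)
    {
      if (*p == '%' && p[1] != '\0')
        {
          p++;
          if (*p == 'A' || *p == 'B')
            n_ab++;
        }
    }

  /* Pass 2: hand out positions from the two separate ranges.  */
  size_t next_ab = 0, next_other = n_ab, count = 0;
  for (const char *p = format; *p != '\0' && count < max; p++)
    {
      if (*p == '%' && p[1] != '\0')
        {
          p++;
          if (*p == 'A' || *p == 'B')
            arg_index[count++] = next_ab++;
          else if (*p != '%')          /* "%%" consumes nothing */
            arg_index[count++] = next_other++;
        }
    }
  return count;
}
```

For "section %d is called %A" this maps %d to the second argument and
%A to the first, which matches the call Alan showed.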

I don't know how wedded to this style the bfd folks are, but perhaps
we can modify bfd sources into a more conforming format compatible
with GCC's format checking.  However it requires tweaking BFD sources.
I wouldn't consider doing that until format checking is in place so
that we'll know if we introduce bugs into bfd in this conversion.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 12:46           ` Kaveh R. Ghazi
@ 2005-08-18 13:41             ` Alan Modra
  2005-08-18 14:35               ` Kaveh R. Ghazi
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Modra @ 2005-08-18 13:41 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: gcc, ian

On Thu, Aug 18, 2005 at 08:46:04AM -0400, Kaveh R. Ghazi wrote:
> I don't know how wedded to this style the bfd folks are

Not at all.  In fact I don't like it, even though I wrote the code.

It would be great if _bfd_default_error_handler used the natural arg
positions for %A and %B.  I couldn't think of a way to do that without
incorporating a whole lot of knowledge about printf into the bfd
function.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 13:41             ` Alan Modra
@ 2005-08-18 14:35               ` Kaveh R. Ghazi
  2005-08-19  1:08                 ` Alan Modra
  0 siblings, 1 reply; 40+ messages in thread
From: Kaveh R. Ghazi @ 2005-08-18 14:35 UTC (permalink / raw)
  To: amodra; +Cc: gcc, ian

 > > I don't know how wedded to this style the bfd folks are
 > 
 > Not at all.  In fact I don't like it, even though I wrote the code.
 > It would be great if _bfd_default_error_handler used the natural arg
 > positions for %A and %B.  I couldn't think of a way to do that without
 > incorporating a whole lot of knowledge about printf into the bfd
 > function.

Right, in GCC we ended up doing that except we only implemented the
bits of printf commonly used.  So for example we don't implement all
of the specifiers (floating point) or modifiers (%h) or flags.  In fact
the fortran front end has a format that only has %d %i %c and %s from
printf, (plus two custom specifiers.)  No flags or even length
modifiers!

It's likely that bfd doesn't use a big chunk of printf that you could
leave out as well.  (I haven't actually audited bfd though).


Another option is to require positional specifiers for out of order
arguments.  E.g.

_bfd_default_error_handler ("section %2$d is called %1$A", sec, 1);

You could keep "sec" at the front, consume it, replace %1$A with the
appropriate string, and then pass the modified format string and the
partially consumed argument list to vfprintf.

Two problems, one is you'd have to modify (or delete) all the
positional parameters to account for taking out "sec".  So 2$ above
becomes 1$ or is eliminated.

Also, there's nothing to prevent someone from violating the rules
keeping "sec" in the front.


So I favor rewriting _bfd_default_error_handler to do the safer thing
which is to use natural arg positions.  Then create a format check
with only the stuff you need, not the whole printf style.

		--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 12:00       ` Florian Weimer
@ 2005-08-18 16:38         ` Ian Lance Taylor
  0 siblings, 0 replies; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-18 16:38 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Giovanni Bajo, Joseph S. Myers, gcc, ghazi

Florian Weimer <fw@deneb.enyo.de> writes:

> > Do we have a sane way to (partially) execute optimizers at -O0
> > without screwing up with the pass manager too much?
> 
> Do we have to provide user-defined format string warnings at -O0?

Yes, we do.

(But, although I don't like this approach, I think this particular
problem could be solved.)

Ian


* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 12:09             ` Dave Korn
@ 2005-08-18 19:10               ` Mike Stump
  2005-08-18 19:54                 ` Branko Čibej
  0 siblings, 1 reply; 40+ messages in thread
From: Mike Stump @ 2005-08-18 19:10 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Florian Weimer', 'Ian Lance Taylor',
	'Giovanni Bajo', 'Joseph S. Myers',
	gcc, ghazi

On Aug 18, 2005, at 5:08 AM, Dave Korn wrote:
>   I was referring to this bit:
>
>> Remember that it's not enough simply to execute the optimizers.   
>> You have to build a symbol table and an environment for the code  
>> to execute in.
>
> IIUC, that would be a requirement for the optimisers to be able to
> perform the full constant-folding and get the same results as if the
> function was executed at runtime instead, wouldn't it?  It seems to
> me like it would be quite a difficult thing to get right in a cross
> environment.

Imagine the following program:

{
    int i = 234234;
    printf ("%d", i);
}

imagine the folder collapsing this to puts ("234234");

Or:

enum {
     foo = 42
};

void bar (void) {
     printf ("%d", (int) foo);
}

Hint: we already have a symbol table, and it already works for cross
compilation.
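
To make the folding concrete, here is a small sketch (fold_percent_d is
a made-up name, not a GCC function) that computes the text a folder
could substitute for printf ("%d", value) once the value is known.
Since puts appends a newline, the strictly equivalent replacement call
would be something like fputs ("234234", stdout):

```c
#include <assert.h>
#include <stdio.h>

enum { foo = 42 };

/* Produce the text that printf ("%d", value) would emit.  A folder that
   knows VALUE at compile time could substitute this string literal.
   Returns the length, or -1 if BUF (size N) is too small.  */
int fold_percent_d(char *buf, size_t n, int value)
{
    assert(buf != NULL && n > 0);
    int len = snprintf(buf, n, "%d", value);
    return (len >= 0 && (size_t) len < n) ? len : -1;
}
```

Calling fold_percent_d with 234234 or with (int) foo needs nothing from
the target beyond the width of int, which the compiler already knows
when cross compiling.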

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 19:10               ` Mike Stump
@ 2005-08-18 19:54                 ` Branko Čibej
  2005-08-18 21:52                   ` Vincent Lefevre
  2005-08-18 22:51                   ` Mike Stump
  0 siblings, 2 replies; 40+ messages in thread
From: Branko Čibej @ 2005-08-18 19:54 UTC (permalink / raw)
  To: gcc

Mike Stump wrote:

> Imagine the following program:
>
> {
>    int i = 234234;
>    printf ("%d", i);
> }
>
> imagine the folder collapsing this to puts ("234234");

Now imagine that the output of the original program depends on the 
locale that's in force at execution time, which defines numeric output 
to be in arabic numerals (real ones, not the sort we see in ASCII).

-- Brane

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 19:54                 ` Branko Čibej
@ 2005-08-18 21:52                   ` Vincent Lefevre
  2005-08-19  0:54                     ` Joe Buck
  2005-08-18 22:51                   ` Mike Stump
  1 sibling, 1 reply; 40+ messages in thread
From: Vincent Lefevre @ 2005-08-18 21:52 UTC (permalink / raw)
  To: gcc

On 2005-08-18 21:53:47 +0200, Branko Čibej wrote:
> Mike Stump wrote:
[...]
> >   printf ("%d", i);
[...]
> Now imagine that the output of the original program depends on the
> locale that's in force at execution time, which defines numeric
> output to be in arabic numerals (real ones, not the sort we see in
> ASCII).

Is it possible? I would have thought that only the decimal-point
character depends on the locale.

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 19:54                 ` Branko Čibej
  2005-08-18 21:52                   ` Vincent Lefevre
@ 2005-08-18 22:51                   ` Mike Stump
  1 sibling, 0 replies; 40+ messages in thread
From: Mike Stump @ 2005-08-18 22:51 UTC (permalink / raw)
  To: gcc

On Aug 18, 2005, at 12:53 PM, Branko Čibej wrote:
> Now imagine that the output of the original program depends on the  
> locale that's in force at execution time

Now imagine that you can't use locale specific functions for these  
things.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 21:52                   ` Vincent Lefevre
@ 2005-08-19  0:54                     ` Joe Buck
  2005-08-19  1:34                       ` James E Wilson
  2005-08-19 10:32                       ` Vincent Lefevre
  0 siblings, 2 replies; 40+ messages in thread
From: Joe Buck @ 2005-08-19  0:54 UTC (permalink / raw)
  To: gcc

On Thu, Aug 18, 2005 at 11:52:36PM +0200, Vincent Lefevre wrote:
> On 2005-08-18 21:53:47 +0200, Branko Čibej wrote:
> > Mike Stump wrote:
> [...]
> > >   printf ("%d", i);
> [...]
> > Now imagine that the output of the original program depends on the
> > locale that's in force at execution time, which defines numeric
> > output to be in arabic numerals (real ones, not the sort we see in
> > ASCII).
> 
> Is it possible? I would have thought that only the decimal-point
> character depends on the locale.

The digits we use come from the Arabs, and look much the same in Arabic.
Check an Arabic-language site, for example http://www.aljazeera.net/ .

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-18 14:35               ` Kaveh R. Ghazi
@ 2005-08-19  1:08                 ` Alan Modra
  2005-08-19  2:56                   ` Ian Lance Taylor
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Modra @ 2005-08-19  1:08 UTC (permalink / raw)
  To: Kaveh R. Ghazi; +Cc: gcc, ian

On Thu, Aug 18, 2005 at 10:35:22AM -0400, Kaveh R. Ghazi wrote:
>  > > I don't know how wedded to this style the bfd folks are
>  > 
>  > Not at all.  In fact I don't like it, even though I wrote the code.
>  > It would be great if _bfd_default_error_handler used the natural arg
>  > positions for %A and %B.  I couldn't think of a way to do that without
>  > incorporating a whole lot of knowledge about printf into the bfd
>  > function.
> 
> Right, in GCC we ended up doing that except we only implemented the
> bits of printf commonly used.  So for example we don't implement all
> of the specifiers (floating point) or modifiers (%h) or flags.  In fact
> the fortran front end has a format that only has %d %i %c and %s from
> printf, (plus two custom specifiers.)  No flags or even length
> modifiers!
> 
> It's likely that bfd doesn't use a big chunk of printf that you could
> leave out as well.  (I haven't actually audited bfd though).

$ sed -n -e 's,[^%]*\(%[0-9\.# hlL+-]*.\)[^%]*,\1,gp' < bfd/po/bfd.pot | sed -e 's,%,\
%,g' | sort | uniq

%
%"
%-7ld
%.2x
%.8lx
%02X
%02x
%04lx
%08lx
%08x
%4d
%4lx
%4x
%A
%B
%X
%d
%i
%ld
%lu
%lx
%p
%s
%u
%x

(The '%"' is line wrapping in bfd.pot.  I may have missed a few format
specifiers because of that.  And the '%' is really '%%'.)

> Another option is to require positional specifiers for out of order
> arguments.  E.g.

Ick.

> So I favor rewriting _bfd_default_error_handler to do the safer thing
> which is to use natural arg positions.  Then create a format check
> with only the stuff you need, not the whole printf style.

I'm not motivated to do that myself. :)  There aren't that many places
that don't have %A or %B first in the format string.

$ sed -n -e 's,[^%]*\(%[0-9\.# hlL+-]*.\)[^%]*,\1,gp' < bfd/po/bfd.pot | grep '[^%AB]%[AB]'
%B%lx%A
%B%d%B%d
%s%d%d%B
%B%d%B%"
%B%d%B%d
%B%x%A
%B%s%s%A
%s%B
%s%B
%s%B
%s%B
%%%d%s%B%s%B
%s%B%s%B
%s%s%B%B
%s%B%A%B
%s%B%B
%s%B%A%B
%s%B%B%A
%B%lx%lx%lx%A
%u%s%B%u%B
%s%lu%B%lu%B
%B%s%s%B
%B%s%A
%X%s%A%B%A
%B%lx%A
%B%s%s%lx%A%lx
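
The same check can be sketched in C rather than sed and grep.  This is
only an illustration (ab_specs_come_first is a made-up name, not part
of BFD or GCC); it skips the same flag, width and length characters as
the sed pattern above and reports whether every %A or %B precedes all
other conversions:

```c
#include <assert.h>
#include <string.h>

/* Return 1 if every %A or %B in FMT comes before all other
   conversions, 0 otherwise.  "%%" is a literal percent sign and does
   not count as a conversion.  */
int ab_specs_come_first(const char *fmt)
{
    int seen_other = 0;
    assert(fmt != NULL);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;
        /* Skip flags, width, precision and length modifiers.  */
        p += strspn(p, "0123456789.# hlL+-");
        if (*p == '\0')
            break;                  /* incomplete conversion at the end */
        if (*p == 'A' || *p == 'B') {
            if (seen_other)
                return 0;
        } else {
            seen_other = 1;
        }
    }
    return 1;
}
```

A format such as "%B: section %A at %lx" passes, while "%s%B" and
"%B%lx%A" from the list above do not.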

It's a great pity that vfprintf doesn't return its va_list arg.  If it
did, you could chop the format string into pieces and have vprintf
process the normal parts, consuming args as it goes.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-19  0:54                     ` Joe Buck
@ 2005-08-19  1:34                       ` James E Wilson
  2005-08-19  2:23                         ` Robert Dewar
  2005-08-19 10:32                       ` Vincent Lefevre
  1 sibling, 1 reply; 40+ messages in thread
From: James E Wilson @ 2005-08-19  1:34 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

Joe Buck wrote:
> The digits we use come from the Arabs, and look much the same in Arabic.
> Check an Arabic-language site, for example http://www.aljazeera.net/ .

In English, we call them "Arabic Numerals", but that is a bit of a
misnomer.  Once upon a time, a long time ago, some Arabs used digits
that looked something like the ones we use today, but not anymore.
Arabic actually has its own set of digits which are quite different from
the ones that we use.

For more info, see
    http://en.wikipedia.org/wiki/Arabic_numerals
-- 
Jim Wilson, GNU Tools Support, http://www.specifix.com

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-19  1:34                       ` James E Wilson
@ 2005-08-19  2:23                         ` Robert Dewar
  0 siblings, 0 replies; 40+ messages in thread
From: Robert Dewar @ 2005-08-19  2:23 UTC (permalink / raw)
  To: James E Wilson; +Cc: Joe Buck, gcc

James E Wilson wrote:
> Joe Buck wrote:
> 
>>The digits we use come from the Arabs, and look much the same in Arabic.
>>Check an Arabic-language site, for example http://www.aljazeera.net/ .
> 
> In English, we call them "Arabic Numerals", but that is a bit of a
> misnomer.  Once upon a time, a long time ago, some Arabs used digits
> that looked something like the ones we use today, but not anymore.
> Arabic actually has its own set of digits which are quite different from
> the ones that we use.

As far as I know, you are both right and wrong.  The Arabic numerals
we know (and indeed acquired from the Arabs) are characteristic of the
western Arabic dialects.  For example, today in Morocco, that is what
you will see.  The eastern Arabic dialects, e.g. in Egypt, use quite
different numerals closer to Indian roots.

I certainly find the Wikipedia article inconsistent with what I have
read elsewhere, e.g.
http://www-history.mcs.st-andrews.ac.uk/history/HistTopics/Arabic_numerals.html
which shows al-Banna al-Marrakushi's form of the numerals to be very similar
to our own, and with my personal observations in Morocco.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-19  1:08                 ` Alan Modra
@ 2005-08-19  2:56                   ` Ian Lance Taylor
  0 siblings, 0 replies; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-19  2:56 UTC (permalink / raw)
  To: Alan Modra; +Cc: Kaveh R. Ghazi, gcc

Alan Modra <amodra@bigpond.net.au> writes:

> It's a great pity that vfprintf doesn't return its va_list arg.  If it
> did, you could chop the format string into pieces and have vprintf
> process the normal parts, consuming args as it goes.

You can do relatively limited parsing and still identify how printf is
going to use its arguments.  See libiberty/vasprintf.c.  (Although it
admittedly assumes that long == int, and it mishandles %lld.)
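
To illustrate what "relatively limited parsing" might involve, here is
a sketch written from scratch (it is not the libiberty code, and
classify_printf_args is a made-up name).  It records one type code per
argument that printf would consume, and unlike the shortcut mentioned
above it keeps int, long and long long apart:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write one code per consumed argument into OUT (size N): 'i' int,
   'l' long, 'q' long long, 'f' double, 's' char *, 'p' void *.
   A '*' width or precision consumes an int.  Returns the number of
   arguments, or -1 on an unrecognized conversion or if OUT is too
   small.  Positional forms ("%1$d"), "L", and wide "%lc"/"%ls" are not
   modelled.  */
int classify_printf_args(const char *fmt, char *out, size_t n)
{
    size_t count = 0;
    assert(fmt != NULL && out != NULL && n > 0);
    for (const char *p = fmt; *p != '\0'; p++) {
        int longs = 0;
        char code;
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;
        /* Flags, width and precision.  */
        while (*p != '\0' && strchr("-+ #0123456789.*", *p) != NULL) {
            if (*p == '*') {
                if (count + 1 >= n)
                    return -1;
                out[count++] = 'i';
            }
            p++;
        }
        while (*p == 'l') { longs++; p++; }
        while (*p == 'h') p++;      /* short and char promote to int */
        switch (*p) {
        case 'd': case 'i': case 'u': case 'o': case 'x': case 'X':
            code = longs == 0 ? 'i' : longs == 1 ? 'l' : 'q';
            break;
        case 'c': code = 'i'; break;
        case 'e': case 'E': case 'f': case 'g': case 'G': code = 'f'; break;
        case 's': code = 's'; break;
        case 'p': code = 'p'; break;
        default:  return -1;        /* also catches a trailing '%' */
        }
        if (count + 1 >= n)
            return -1;
        out[count++] = code;
    }
    out[count] = '\0';
    return (int) count;
}
```

For "%s: %lld of %*d" this yields the codes "sqii", which is enough to
know how far each standard conversion advances the argument list.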

I suppose we could write the version of vfprintf you want, and put it
in libiberty.  Assuming it is always OK to pass the address of a
va_list to a function.

Ian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-19  0:54                     ` Joe Buck
  2005-08-19  1:34                       ` James E Wilson
@ 2005-08-19 10:32                       ` Vincent Lefevre
  1 sibling, 0 replies; 40+ messages in thread
From: Vincent Lefevre @ 2005-08-19 10:32 UTC (permalink / raw)
  To: gcc

On 2005-08-18 17:53:24 -0700, Joe Buck wrote:
> On Thu, Aug 18, 2005 at 11:52:36PM +0200, Vincent Lefevre wrote:
> > On 2005-08-18 21:53:47 +0200, Branko Čibej wrote:
> > > Mike Stump wrote:
> > [...]
> > > >   printf ("%d", i);
> > [...]
> > > Now imagine that the output of the original program depends on the
> > > locale that's in force at execution time, which defines numeric
> > > output to be in arabic numerals (real ones, not the sort we see in
> > > ASCII).
> > 
> > Is it possible? I would have thought that only the decimal-point
> > character depends on the locale.
> 
> The digits we use come from the Arabs, and look much the same in Arabic.
> Check an Arabic-language site, for example http://www.aljazeera.net/ .

I agree, but you don't answer the question. The point is that they
are different characters. I think the digits mentioned by Branko
are the characters U+0660 to U+0669 (ARABIC-INDIC DIGIT ZERO and
so on). Many languages have their own 0 to 9 digits in Unicode.
But I don't think the decimal digits used for %d above depend on
the locale (e.g. I don't think a C implementation may use them if
it uses the ASCII ones in the C locale).
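
One way to check this on a given system is to format the same value
under different locales and compare.  The helper below is only a
sketch (format_int_in_locale is a made-up name).  As far as I know,
glibc provides an "I" flag extension (as in "%Id") for requesting the
locale's alternative digits, which suggests that plain %d is not
expected to produce them:

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Format VALUE with "%d" while LOCALE_NAME is in force for LC_ALL, then
   restore the previous locale.  Returns 0 on success, -1 if the locale
   is unavailable or BUF (size N) is too small.  */
int format_int_in_locale(const char *locale_name, int value,
                         char *buf, size_t n)
{
    char saved[256];
    const char *old = setlocale(LC_ALL, NULL);
    assert(locale_name != NULL && buf != NULL && n > 0);
    if (old == NULL || strlen(old) >= sizeof saved)
        return -1;
    strcpy(saved, old);             /* the returned string may be reused */
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;
    int len = snprintf(buf, n, "%d", value);
    setlocale(LC_ALL, saved);
    return (len >= 0 && (size_t) len < n) ? 0 : -1;
}
```

Comparing the result for "C" with the result for an installed Arabic
or Persian locale shows directly whether the digits differ on that
implementation.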

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-17 19:15 ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
  2005-08-17 19:19   ` Florian Weimer
@ 2005-08-19 19:40   ` Tom Tromey
  2005-08-19 20:28     ` Internal Behavior of G++ Aoun Raza
  2005-08-19 21:17     ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
  1 sibling, 2 replies; 40+ messages in thread
From: Tom Tromey @ 2005-08-19 19:40 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc, joseph

>>>>> "Ian" == Ian Lance Taylor <ian@airs.com> writes:

Ian> To make this kind of thing useful, I see two paths that we can follow.

Ian> The second approach is of course to write a little language which is
Ian> powerful enough to describe printf.  The state machine language I
Ian> described earlier is too simple and perhaps overly cryptic.

If we're doing that, why not use an already existing little language?
Then we have one less thing to maintain, document, and extend (since
you know there will be a need for extensions).

Of course this leads into the morass of picking a language.  Folks,
please resist the urge... you know what I mean.

The idea of letting gcc load a .so to do the checking also seems fine.
At least then the checking language is a standard one, not one we made
up.  I guess an intermediate approach would be to use that bytecode
back end I heard you're hacking on.

Tom

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Internal Behavior of G++
  2005-08-19 19:40   ` Tom Tromey
@ 2005-08-19 20:28     ` Aoun Raza
  2005-08-19 20:30       ` Joe Buck
  2005-08-19 21:17     ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
  1 sibling, 1 reply; 40+ messages in thread
From: Aoun Raza @ 2005-08-19 20:28 UTC (permalink / raw)
  To: gcc


Hi all, 

I am developing an intermediate compiler using EDG C++
frontend. 

When I try to compile any C++ source file with my
compiler, it reports many errors, like:


James:~/C_PP/test> C_CC -I/usr/include/c++/3.3
-I/usr/include/c++/3.3/i486-linux-gnu -I/usr/include
-c helloworld.cpp
"/usr/include/c++/3.3/i486-linux-gnu/bits/c++locale.h",
line 39: warning:
          unrecognized #pragma
  #pragma GCC system_header
          ^

"/usr/include/locale.h", line 29: catastrophic error:
could not open source
          file "stddef.h"
  #include <stddef.h>
                     ^

1 catastrophic error detected in the compilation of
"helloworld.cpp".
Compilation terminated.
C_CC --with-preincludes -D__linux__
-I/usr/include/c++/3.3 -I/usr/include/c++/3.3
-I/usr/include/c++/3.3/i486-linux-gnu -I/usr/include
-o helloworld.o helloworld.cpp
cafeCC: Error from subprocess: cafe++
"/usr/include/locale.h", line 29: error: could not
open source file "stddef.h"
  #include <stddef.h>
                     ^

"/usr/include/iconv.h", line 24: error: could not open
source file "stddef.h"
  #include <stddef.h>
                     ^

"/usr/include/bits/types.h", line 31: error: could not
open source file
          "stddef.h"
  #include <stddef.h>
                     ^


15 errors detected in the compilation of
"helloworld.cpp".
Compilation terminated.



/******************************************/

Can anyone explain how gcc/g++ starts its preprocessing, and which
macros it defines by default when the user defines none?

Any suggestions will be appreciated.

Thanks.

Raza


		
____________________________________________________
Start your day with Yahoo! - make it your home page 
http://www.yahoo.com/r/hs 
 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Internal Behavior of G++
  2005-08-19 20:28     ` Internal Behavior of G++ Aoun Raza
@ 2005-08-19 20:30       ` Joe Buck
  2005-08-19 20:57         ` Aoun Raza
  0 siblings, 1 reply; 40+ messages in thread
From: Joe Buck @ 2005-08-19 20:30 UTC (permalink / raw)
  To: Aoun Raza; +Cc: gcc

On Fri, Aug 19, 2005 at 01:27:56PM -0700, Aoun Raza wrote:
> I am developing an intermediate compiler using EDG C++
> frontend. 

Then you are on the wrong list.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Internal Behavior of G++
  2005-08-19 20:30       ` Joe Buck
@ 2005-08-19 20:57         ` Aoun Raza
  2005-08-19 21:11           ` Mike Stump
  0 siblings, 1 reply; 40+ messages in thread
From: Aoun Raza @ 2005-08-19 20:57 UTC (permalink / raw)
  To: gcc, Joe Buck

I have developed it already, but I want to use GCC
headers.. and I see the problems described earlier

--- Joe Buck <Joe.Buck@synopsys.COM> wrote:

> On Fri, Aug 19, 2005 at 01:27:56PM -0700, Aoun Raza wrote:
> > I am developing an intermediate compiler using EDG C++
> > frontend. 
> 
> Then you are on the wrong list.
> 



		
____________________________________________________
Start your day with Yahoo! - make it your home page 
http://www.yahoo.com/r/hs 
 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Internal Behavior of G++
  2005-08-19 20:57         ` Aoun Raza
@ 2005-08-19 21:11           ` Mike Stump
  0 siblings, 0 replies; 40+ messages in thread
From: Mike Stump @ 2005-08-19 21:11 UTC (permalink / raw)
  To: Aoun Raza; +Cc: gcc, Joe Buck

On Friday, August 19, 2005, at 01:57  PM, Aoun Raza wrote:
> I have developed it already, but I want to use GCC
> headers.. and I see the problems described earlier

Must be a bug in your compiler, because g++ compiles it just fine, go 
ask them.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-19 19:40   ` Tom Tromey
  2005-08-19 20:28     ` Internal Behavior of G++ Aoun Raza
@ 2005-08-19 21:17     ` Ian Lance Taylor
  2005-08-28 21:17       ` Daniel Jacobowitz
  1 sibling, 1 reply; 40+ messages in thread
From: Ian Lance Taylor @ 2005-08-19 21:17 UTC (permalink / raw)
  To: tromey; +Cc: gcc

Tom Tromey <tromey@redhat.com> writes:

> Ian> The second approach is of course to write a little language which is
> Ian> powerful enough to describe printf.  The state machine language I
> Ian> described earlier is too simple and perhaps overly cryptic.
> 
> If we're doing that, why not use an already existing little language?

I thought about that, but it doesn't quite make sense to me yet.  It
introduces yet another external software dependency into gcc, and it
does it not for any fundamental need but for a rather limited
feature--one which is used only for warnings, not for quality of
generated code.

(I also don't know of any embeddable little language which is really
right for the problem space, although of course there are quite a few
which are powerful enough to solve the problem.  I think the closest
existing languages to what we need are lex or awk, although I'm not
aware of any easily embeddable version of either.)

> The idea of letting gcc load a .so to do the checking also seems fine.
> At least then the checking language is a standard one, not one we made
> up.

Yes.  My main concerns would be

* It's obviously vastly more powerful than anything we actually need,
  and using dlopen exposes the compiler to bugs in the implementation
  of the format checker--slowness, random memory clobbering, etc.

* The compiler is, in its own way, a system security component.  If
  somebody were to put format checking into a system header file which
  used a shared library, then substituting that shared library--
  perhaps by just getting the compiler to pick up a different version
  of it--becomes an avenue for a complex but subtle attack on the
  system as a whole.

Ian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-19 21:17     ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
@ 2005-08-28 21:17       ` Daniel Jacobowitz
  2005-08-28 23:28         ` Ken Raeburn
  0 siblings, 1 reply; 40+ messages in thread
From: Daniel Jacobowitz @ 2005-08-28 21:17 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: tromey, gcc

[Sorry for the way-late response... was on vacation]

On Fri, Aug 19, 2005 at 02:16:53PM -0700, Ian Lance Taylor wrote:
> > The idea of letting gcc load a .so to do the checking also seems fine.
> > At least then the checking language is a standard one, not one we made
> > up.

I think this is a wonderfully good idea.

> Yes.  My main concerns would be
> 
> * It's obviously vastly more powerful than anything we actually need,
>   and using dlopen exposes the compiler to bugs in the implementation
>   of the format checker--slowness, random memory clobbering, etc.

I just don't see this as a problem.

> * The compiler is, in its own way, a system security component.  If
>   somebody were to put format checking into a system header file which
>   used a shared library, then substituting that shared library--
>   perhaps by just getting the compiler to pick up a different version
>   of it--becomes an avenue for a complex but subtle attack on the
>   system as a whole.

I see this as a problem.  OK, let's solve it.  The solution has two
parts:

  - Allow arbitrary shared libraries to be specified on the command
    line.  BFD can then build one before it compiles, and pass it as
    an argument to GCC.

  - Define a trusted directory to allow shared libraries to be loaded
    by the installed system compiler, via #pragma.

I think this has a lot more mileage in it than spending months debating
how to represent the format specifiers in source code.  Of course,
we'll need to create a C interface for doing this, which will take some
time to do right.  But we know how to do that!
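
For the trusted directory part, a minimal sketch of the path check
might look like the following.  The function name is hypothetical, no
such GCC interface exists, and the actual loading step is left out.
The idea is to resolve symlinks and ".." with realpath first, then
compare prefixes:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* asks for the realpath() declaration */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if PATH names an existing file that, after symlinks and
   ".." are resolved, lies inside TRUSTED_DIR; return 0 otherwise.  */
int is_in_trusted_dir(const char *path, const char *trusted_dir)
{
    int ok = 0;
    assert(path != NULL && trusted_dir != NULL);
    char *real_path = realpath(path, NULL);       /* malloc'd or NULL */
    char *real_dir = realpath(trusted_dir, NULL);
    if (real_path != NULL && real_dir != NULL) {
        size_t dlen = strlen(real_dir);
        if (dlen == 1)                /* the trusted directory is "/" */
            ok = real_path[1] != '\0';
        else                          /* require a '/' after the prefix */
            ok = strncmp(real_path, real_dir, dlen) == 0
                 && real_path[dlen] == '/';
    }
    free(real_path);
    free(real_dir);
    return ok;
}
```

Requiring the '/' after the prefix keeps a sibling such as
"/usr/lib-evil" from passing as if it were inside "/usr/lib".  A real
implementation would also have to worry about the file being replaced
between the check and the load.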

-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH]: Proof-of-concept for dynamic format checking
  2005-08-28 21:17       ` Daniel Jacobowitz
@ 2005-08-28 23:28         ` Ken Raeburn
  0 siblings, 0 replies; 40+ messages in thread
From: Ken Raeburn @ 2005-08-28 23:28 UTC (permalink / raw)
  To: gcc

Maybe I should avoid making suggestions that would make the project  
more complex, especially since I'm not implementing it, but...

If we can describe the argument types expected for a given format  
string, then we can produce warnings for values used but not yet set  
(%s with an uninitialized automatic char array, but not %n with an  
uninitialized int), and let the compiler know what values are set by  
the call for use in later warnings.  For additions like bfd's %A and
%B, though, we'd need a way of indicating what fields of the
pointed-to structure are read and/or written, because some of them
may be ignored, or only conditionally used.

Seems to me the best way to describe that is either calling out to  
user-supplied C code, or providing something very much like a C  
function or function fragment to show the compiler how the parameters  
are used -- off the top of my head, say, map 'A' to a static function  
format_asection which takes an asection* argument and reads the name  
field, which function can be analyzed for data usage patterns and  
whether it handles a null pointer, but which probably would be  
discarded by the compiler.  Mapping format specifiers to code  
fragments might also allow the compiler to transform
   bfd_print("%d:%A",sec,num)
to
   printf("%d:%s",num,sec->name)
if it had enough information.  But that requires expressing not just  
the data i/o pattern, but what the formatting actually will be for a  
specifier, which sometimes may be too complex to want to express.
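
Something like the following is what I have in mind for the mapping,
as a sketch only.  demo_asection stands in for the real asection, and
the table and lookup_spec are invented for illustration:

```c
#include <stddef.h>

/* Hypothetical stand-in for BFD's asection; only the name is modelled. */
typedef struct { const char *name; } demo_asection;

/* Handler for 'A': reads only the name field and tolerates a null
   pointer, which is the kind of fact the compiler could learn by
   analyzing this function.  */
const char *format_asection(const void *arg)
{
    const demo_asection *sec = arg;
    return (sec != NULL && sec->name != NULL) ? sec->name : "(null)";
}

typedef const char *(*spec_handler)(const void *);

struct spec_entry {
    char spec;               /* conversion character, e.g. 'A' */
    spec_handler handler;    /* function showing how the arg is used */
};

static const struct spec_entry spec_table[] = {
    { 'A', format_asection },
};

/* Return the handler registered for SPEC, or NULL if there is none.  */
spec_handler lookup_spec(char spec)
{
    for (size_t i = 0; i < sizeof spec_table / sizeof spec_table[0]; i++)
        if (spec_table[i].spec == spec)
            return spec_table[i].handler;
    return NULL;
}
```

Here the handler doubles as the description: what it reads from its
argument is what a %A conversion reads.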

Just a thought...

Ken

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2005-08-28 20:49 UTC | newest]

Thread overview: 40+ messages
     [not found] <200508111509.j7BF9qMq015700@caipclassic.rutgers.edu>
2005-08-17 19:15 ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
2005-08-17 19:19   ` Florian Weimer
2005-08-17 19:25     ` Ian Lance Taylor
2005-08-17 19:45       ` Florian Weimer
2005-08-17 20:00         ` Ian Lance Taylor
2005-08-17 20:25           ` Florian Weimer
2005-08-17 19:25     ` Mike Stump
2005-08-18  1:00     ` Giovanni Bajo
2005-08-18  1:20       ` Ian Lance Taylor
2005-08-18 11:56         ` Dave Korn
2005-08-18 12:00           ` Florian Weimer
2005-08-18 12:09             ` Dave Korn
2005-08-18 19:10               ` Mike Stump
2005-08-18 19:54                 ` Branko Čibej
2005-08-18 21:52                   ` Vincent Lefevre
2005-08-19  0:54                     ` Joe Buck
2005-08-19  1:34                       ` James E Wilson
2005-08-19  2:23                         ` Robert Dewar
2005-08-19 10:32                       ` Vincent Lefevre
2005-08-18 22:51                   ` Mike Stump
2005-08-18 12:00       ` Florian Weimer
2005-08-18 16:38         ` Ian Lance Taylor
2005-08-19 19:40   ` Tom Tromey
2005-08-19 20:28     ` Internal Behavior of G++ Aoun Raza
2005-08-19 20:30       ` Joe Buck
2005-08-19 20:57         ` Aoun Raza
2005-08-19 21:11           ` Mike Stump
2005-08-19 21:17     ` [PATCH]: Proof-of-concept for dynamic format checking Ian Lance Taylor
2005-08-28 21:17       ` Daniel Jacobowitz
2005-08-28 23:28         ` Ken Raeburn
2005-08-18  2:01 ` Kaveh R. Ghazi
2005-08-18  2:08   ` Kaveh R. Ghazi
2005-08-18  2:50     ` Ian Lance Taylor
2005-08-18  3:07       ` Kaveh R. Ghazi
2005-08-18  3:42         ` Alan Modra
2005-08-18 12:46           ` Kaveh R. Ghazi
2005-08-18 13:41             ` Alan Modra
2005-08-18 14:35               ` Kaveh R. Ghazi
2005-08-19  1:08                 ` Alan Modra
2005-08-19  2:56                   ` Ian Lance Taylor
