public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC messing with mcount prologue
@ 2009-11-24 17:05 Ross Ridge
  2009-11-24 17:11 ` Andrew Haley
  0 siblings, 1 reply; 31+ messages in thread
From: Ross Ridge @ 2009-11-24 17:05 UTC (permalink / raw)
  To: gcc

Andrew Haley writes:
>Alright.  So, it is possible in theory for gcc to generate code that
>only uses -maccumulate-outgoing-args when it needs to realign SP.
>And, therefore, we could have a nice option for the kernel: one with
>(mostly) good code density and never generates the bizarre code
>sequence in the prologue.

The best option would be for the Linux people to fix the underlying
problem in their kernel sources.  If the code no longer requested
that certain automatic variables be aligned, then not only would this
bizarre code sequence not be emitted, the unnecessary stack alignment
would disapear as well.  The kernel would then be free to choose to use
whatever code generation options it felt was appropriate.

				Ross Ridge

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-24 17:05 [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC messing with mcount prologue Ross Ridge
@ 2009-11-24 17:11 ` Andrew Haley
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Haley @ 2009-11-24 17:11 UTC (permalink / raw)
  To: Ross Ridge; +Cc: gcc

Ross Ridge wrote:
> Andrew Haley writes:
>> Alright.  So, it is possible in theory for gcc to generate code that
>> only uses -maccumulate-outgoing-args when it needs to realign SP.
>> And, therefore, we could have a nice option for the kernel: one with
>> (mostly) good code density and never generates the bizarre code
>> sequence in the prologue.
> 
> The best option would be for the Linux people to fix the underlying
> problem in their kernel sources.  If the code no longer requested
> that certain automatic variables be aligned, then not only would this
> bizarre code sequence not be emitted, the unnecessary stack alignment
> would disapear as well.  The kernel would then be free to choose to use
> whatever code generation options it felt was appropriate.

Well, yeah.  But, for my sins, I tend to assume that the Linux kernel
people have some kind of reason for the things they do.  Working with
them over the years has helped us improve gcc, even though at times
things get to be a little ill-tempered.

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-25 20:13                                 ` H. Peter Anvin
@ 2009-11-25 21:01                                   ` Andrew Haley
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Haley @ 2009-11-25 21:01 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jakub Jelinek, Ingo Molnar, Thomas Gleixner, H.J. Lu, rostedt,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

H. Peter Anvin wrote:
> On 11/25/2009 08:44 AM, Jakub Jelinek wrote:
>> If you compile kernels 90%+ people out there run with -p on i?86/x86_64,
>> then certainly coming up with a new gcc switch and new profiling ABI is
>> desirable.  -p on i?86/x86_64 e.g. forces -fno-omit-frame-pointer, which
>> makes code on these register starved arches significantly worse.
>> Making GCC output profiling call before prologue instead of after prologue
>> is a 4 liner in generic code and a few lines in target specific code.
>> The important thing is that we shouldn't have 100 different profiling ABIs,
>> so it is desirable to agree on something that will be generally useful not
>> just for the kernel, but perhaps for other purposes.
> 
> There is really just one that makes sense, which is providing the
> ABI-defined entry state, which means intercepting at the point of entry.
> 
> Anything else is/was a mistake.

Indeed.  The problem, though, is that the "naked call" approach, while attractive,
requires the back end to be modified and so requires the help of the gcc maintainers
for every Linux target.  Not that this is a terrible idea, but such co-ordination
is going to take time.

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-25 16:45                               ` Jakub Jelinek
@ 2009-11-25 20:13                                 ` H. Peter Anvin
  2009-11-25 21:01                                   ` Andrew Haley
  0 siblings, 1 reply; 31+ messages in thread
From: H. Peter Anvin @ 2009-11-25 20:13 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Ingo Molnar, Thomas Gleixner, Andrew Haley, H.J. Lu, rostedt,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On 11/25/2009 08:44 AM, Jakub Jelinek wrote:
> 
> If you compile kernels 90%+ people out there run with -p on i?86/x86_64,
> then certainly coming up with a new gcc switch and new profiling ABI is
> desirable.  -p on i?86/x86_64 e.g. forces -fno-omit-frame-pointer, which
> makes code on these register starved arches significantly worse.
> Making GCC output profiling call before prologue instead of after prologue
> is a 4 liner in generic code and a few lines in target specific code.
> The important thing is that we shouldn't have 100 different profiling ABIs,
> so it is desirable to agree on something that will be generally useful not
> just for the kernel, but perhaps for other purposes.
> 

There is really just one that makes sense, which is providing the
ABI-defined entry state, which means intercepting at the point of entry.

Anything else is/was a mistake.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-24 17:30                                     ` Steven Rostedt
@ 2009-11-25 20:05                                       ` H. Peter Anvin
  0 siblings, 0 replies; 31+ messages in thread
From: H. Peter Anvin @ 2009-11-25 20:05 UTC (permalink / raw)
  To: rostedt
  Cc: Andrew Haley, Jakub Jelinek, Thomas Gleixner, H.J. Lu,
	Ingo Molnar, LKML, Andrew Morton, Heiko Carstens, feng.tang,
	Peter Zijlstra, Frederic Weisbecker, David Daney,
	Richard Guenther, gcc, Linus Torvalds, Andi Kleen

On 11/24/2009 09:30 AM, Steven Rostedt wrote:
> 
> For other archs, Linus showed some examples:
> 
> http://lkml.org/lkml/2009/11/19/349
> 

Yes; the key here is that the ABI-defined entry state is readily
mappable onto the state on entry to the __fentry__ function.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC messing with mcount prologue
  2009-11-25 15:45                             ` Ingo Molnar
  2009-11-25 15:53                               ` Thomas Gleixner
@ 2009-11-25 16:45                               ` Jakub Jelinek
  2009-11-25 20:13                                 ` H. Peter Anvin
  1 sibling, 1 reply; 31+ messages in thread
From: Jakub Jelinek @ 2009-11-25 16:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Andrew Haley, H.J. Lu, rostedt, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On Wed, Nov 25, 2009 at 04:44:52PM +0100, Ingo Molnar wrote:
> 
> * Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > On Tue, 24 Nov 2009, Jakub Jelinek wrote:
> > 
> > > On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
> > > > > you should compile your code with -maccumulate-outgoing-args, and there's
> > > > > no need to use -mtune=generic.  Is that right?
> > > > 
> > > > Seems to work. What other side effects has that ?
> > > 
> > > Faster code, significant increase in code size though.  Note that on many
> > > architectures it is the only supported model.
> > 
> > Just checked on the affected -marchs. The increase in code size is 
> > about 3% which is not that bad and definitely acceptable for the 
> > tracing case. Will zap the -mtune=generic patch and use 
> > -maccumulate-outgoing-args instead.
> 
> hm, 3% sounds quite large :( dyn-ftrace is enabled in distro configs, so 
> 3% is a big deal IMO.

If you compile kernels 90%+ people out there run with -p on i?86/x86_64,
then certainly coming up with a new gcc switch and new profiling ABI is
desirable.  -p on i?86/x86_64 e.g. forces -fno-omit-frame-pointer, which
makes code on these register starved arches significantly worse.
Making GCC output profiling call before prologue instead of after prologue
is a 4 liner in generic code and a few lines in target specific code.
The important thing is that we shouldn't have 100 different profiling ABIs,
so it is desirable to agree on something that will be generally useful not
just for the kernel, but perhaps for other purposes.

	Jakub

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-25 15:53                               ` Thomas Gleixner
@ 2009-11-25 16:26                                 ` Ingo Molnar
  0 siblings, 0 replies; 31+ messages in thread
From: Ingo Molnar @ 2009-11-25 16:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jakub Jelinek, Andrew Haley, H.J. Lu, rostedt, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds


* Thomas Gleixner <tglx@linutronix.de> wrote:

> On Wed, 25 Nov 2009, Ingo Molnar wrote:
> > * Thomas Gleixner <tglx@linutronix.de> wrote:
> > 
> > > On Tue, 24 Nov 2009, Jakub Jelinek wrote:
> > > 
> > > > On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
> > > > > > you should compile your code with -maccumulate-outgoing-args, and there's
> > > > > > no need to use -mtune=generic.  Is that right?
> > > > > 
> > > > > Seems to work. What other side effects has that ?
> > > > 
> > > > Faster code, significant increase in code size though.  Note that on many
> > > > architectures it is the only supported model.
> > > 
> > > Just checked on the affected -marchs. The increase in code size is 
> > > about 3% which is not that bad and definitely acceptable for the 
> > > tracing case. Will zap the -mtune=generic patch and use 
> > > -maccumulate-outgoing-args instead.
> > 
> > hm, 3% sounds quite large :( dyn-ftrace is enabled in distro configs, so 
> > 3% is a big deal IMO.
> 
> Distro-configs have -mtune=generic anyway. So it's not changing
> anything for them.
> 
> I'm talking about the -march flags which result in that weird code
> (pentium-mmx ....).

ok!

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-25 15:45                             ` Ingo Molnar
@ 2009-11-25 15:53                               ` Thomas Gleixner
  2009-11-25 16:26                                 ` Ingo Molnar
  2009-11-25 16:45                               ` Jakub Jelinek
  1 sibling, 1 reply; 31+ messages in thread
From: Thomas Gleixner @ 2009-11-25 15:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jakub Jelinek, Andrew Haley, H.J. Lu, rostedt, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On Wed, 25 Nov 2009, Ingo Molnar wrote:
> * Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > On Tue, 24 Nov 2009, Jakub Jelinek wrote:
> > 
> > > On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
> > > > > you should compile your code with -maccumulate-outgoing-args, and there's
> > > > > no need to use -mtune=generic.  Is that right?
> > > > 
> > > > Seems to work. What other side effects has that ?
> > > 
> > > Faster code, significant increase in code size though.  Note that on many
> > > architectures it is the only supported model.
> > 
> > Just checked on the affected -marchs. The increase in code size is 
> > about 3% which is not that bad and definitely acceptable for the 
> > tracing case. Will zap the -mtune=generic patch and use 
> > -maccumulate-outgoing-args instead.
> 
> hm, 3% sounds quite large :( dyn-ftrace is enabled in distro configs, so 
> 3% is a big deal IMO.

Distro-configs have -mtune=generic anyway. So it's not changing
anything for them.

I'm talking about the -march flags which result in that weird code
(pentium-mmx ....).

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-25 15:29                           ` Thomas Gleixner
@ 2009-11-25 15:45                             ` Ingo Molnar
  2009-11-25 15:53                               ` Thomas Gleixner
  2009-11-25 16:45                               ` Jakub Jelinek
  0 siblings, 2 replies; 31+ messages in thread
From: Ingo Molnar @ 2009-11-25 15:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jakub Jelinek, Andrew Haley, H.J. Lu, rostedt, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds


* Thomas Gleixner <tglx@linutronix.de> wrote:

> On Tue, 24 Nov 2009, Jakub Jelinek wrote:
> 
> > On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
> > > > you should compile your code with -maccumulate-outgoing-args, and there's
> > > > no need to use -mtune=generic.  Is that right?
> > > 
> > > Seems to work. What other side effects has that ?
> > 
> > Faster code, significant increase in code size though.  Note that on many
> > architectures it is the only supported model.
> 
> Just checked on the affected -marchs. The increase in code size is 
> about 3% which is not that bad and definitely acceptable for the 
> tracing case. Will zap the -mtune=generic patch and use 
> -maccumulate-outgoing-args instead.

hm, 3% sounds quite large :( dyn-ftrace is enabled in distro configs, so 
3% is a big deal IMO.

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-24 15:06                         ` Jakub Jelinek
  2009-11-24 15:32                           ` Andrew Haley
@ 2009-11-25 15:29                           ` Thomas Gleixner
  2009-11-25 15:45                             ` Ingo Molnar
  1 sibling, 1 reply; 31+ messages in thread
From: Thomas Gleixner @ 2009-11-25 15:29 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Andrew Haley, H.J. Lu, rostedt, Ingo Molnar, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On Tue, 24 Nov 2009, Jakub Jelinek wrote:

> On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
> > > you should compile your code with -maccumulate-outgoing-args, and there's
> > > no need to use -mtune=generic.  Is that right?
> > 
> > Seems to work. What other side effects has that ?
> 
> Faster code, significant increase in code size though.  Note that on many
> architectures it is the only supported model.

Just checked on the affected -marchs. The increase in code size is
about 3% which is not that bad and definitely acceptable for the
tracing case. Will zap the -mtune=generic patch and use
-maccumulate-outgoing-args instead.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC   messing with mcount prologue
  2009-11-24 17:12                                   ` Andrew Haley
  2009-11-24 17:30                                     ` Steven Rostedt
@ 2009-11-24 19:56                                     ` H. Peter Anvin
  1 sibling, 0 replies; 31+ messages in thread
From: H. Peter Anvin @ 2009-11-24 19:56 UTC (permalink / raw)
  To: Andrew Haley
  Cc: Jakub Jelinek, Thomas Gleixner, H.J. Lu, rostedt, Ingo Molnar,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On 11/24/2009 09:12 AM, Andrew Haley wrote:
>>
>> If we're changing gcc anyway, then let's add the option of intercepting
>> the function at the point where the machine state is well-defined by
>> ABI, which is before the function stack frame is set up.
> 
> Hmm.  On the x86 I suppose we could just inject a naked call instruction,
> but not all aeches allow us to call anything before we've saved the return
> address.  Or are you talking x86 only?
> 

For x86, we should use a naked call.

For architectures where that is not possible, we should use a minimal
sequence such that the ABI state at the invocation point is 100% derivable.

On MIPS, for example, we could use a sequence such as:

	mov at, ra
	jal __fentry__

It would be up to __fentry__ to save the value in at and to restore it
back into ra before resuming, meaning that __fentry__ has a nonstandard
calling convention.

	-hpa

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect  GCC messing with mcount prologue
  2009-11-24 17:12                                   ` Andrew Haley
@ 2009-11-24 17:30                                     ` Steven Rostedt
  2009-11-25 20:05                                       ` H. Peter Anvin
  2009-11-24 19:56                                     ` H. Peter Anvin
  1 sibling, 1 reply; 31+ messages in thread
From: Steven Rostedt @ 2009-11-24 17:30 UTC (permalink / raw)
  To: Andrew Haley
  Cc: H. Peter Anvin, Jakub Jelinek, Thomas Gleixner, H.J. Lu,
	Ingo Molnar, LKML, Andrew Morton, Heiko Carstens, feng.tang,
	Peter Zijlstra, Frederic Weisbecker, David Daney,
	Richard Guenther, gcc, Linus Torvalds, Andi Kleen

On Tue, 2009-11-24 at 17:12 +0000, Andrew Haley wrote:
> H. Peter Anvin wrote:

> > If we're changing gcc anyway, then let's add the option of intercepting
> > the function at the point where the machine state is well-defined by
> > ABI, which is before the function stack frame is set up.
> 
> Hmm.  On the x86 I suppose we could just inject a naked call instruction,
> but not all aeches allow us to call anything before we've saved the return
> address.  Or are you talking x86 only?

Earlier in the GCC BUG thread we talked about this. Adding a __fentry__
call at the beginning of the function. This could be done for other
archs as well, but yes, the return address must be stored. For x86 it is
the easiest because it automatically stores the return address on the
stack (Andi already has a working patch I believe).

For other archs, Linus showed some examples:

http://lkml.org/lkml/2009/11/19/349

-- Steve


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC   messing with mcount prologue
  2009-11-24 16:39                                 ` H. Peter Anvin
@ 2009-11-24 17:12                                   ` Andrew Haley
  2009-11-24 17:30                                     ` Steven Rostedt
  2009-11-24 19:56                                     ` H. Peter Anvin
  0 siblings, 2 replies; 31+ messages in thread
From: Andrew Haley @ 2009-11-24 17:12 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jakub Jelinek, Thomas Gleixner, H.J. Lu, rostedt, Ingo Molnar,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

H. Peter Anvin wrote:
> On 11/24/2009 07:46 AM, Andrew Haley wrote:
>>> Yes, a lot.  The difference is that -maccumulate-outgoing-args allocates
>>> space for arguments of the callee with most arguments in the prologue, using
>>> subtraction from sp, then to pass arguments uses movl XXX, 4(%esp) etc.
>>> and the stack pointer doesn't usually change within the function (except for
>>> alloca/VLAs).
>>> With -mno-accumulate-outgoing-args args are pushed using push instructions
>>> and stack pointer is constantly changing.
>> Alright.  So, it is possible in theory for gcc to generate code that
>> only uses -maccumulate-outgoing-args when it needs to realign SP.
>> And, therefore, we could have a nice option for the kernel: one with
>> (mostly) good code density and never generates the bizarre code
>> sequence in the prologue.
> 
> If we're changing gcc anyway, then let's add the option of intercepting
> the function at the point where the machine state is well-defined by
> ABI, which is before the function stack frame is set up.

Hmm.  On the x86 I suppose we could just inject a naked call instruction,
but not all aeches allow us to call anything before we've saved the return
address.  Or are you talking x86 only?

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-24 15:48                               ` Andrew Haley
@ 2009-11-24 16:39                                 ` H. Peter Anvin
  2009-11-24 17:12                                   ` Andrew Haley
  0 siblings, 1 reply; 31+ messages in thread
From: H. Peter Anvin @ 2009-11-24 16:39 UTC (permalink / raw)
  To: Andrew Haley
  Cc: Jakub Jelinek, Thomas Gleixner, H.J. Lu, rostedt, Ingo Molnar,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On 11/24/2009 07:46 AM, Andrew Haley wrote:
>>
>> Yes, a lot.  The difference is that -maccumulate-outgoing-args allocates
>> space for arguments of the callee with most arguments in the prologue, using
>> subtraction from sp, then to pass arguments uses movl XXX, 4(%esp) etc.
>> and the stack pointer doesn't usually change within the function (except for
>> alloca/VLAs).
>> With -mno-accumulate-outgoing-args args are pushed using push instructions
>> and stack pointer is constantly changing.
> 
> Alright.  So, it is possible in theory for gcc to generate code that
> only uses -maccumulate-outgoing-args when it needs to realign SP.
> And, therefore, we could have a nice option for the kernel: one with
> (mostly) good code density and never generates the bizarre code
> sequence in the prologue.
> 

If we're changing gcc anyway, then let's add the option of intercepting
the function at the point where the machine state is well-defined by
ABI, which is before the function stack frame is set up.

-maccumulate-outgoing-args sounds like it would be painful on x86 (not
using its cheap push/pop instructions), but I guess since it's only when
tracing it's less of an issue.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-24 15:36                             ` Jakub Jelinek
@ 2009-11-24 15:48                               ` Andrew Haley
  2009-11-24 16:39                                 ` H. Peter Anvin
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Haley @ 2009-11-24 15:48 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Thomas Gleixner, H.J. Lu, rostedt, Ingo Molnar, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

Jakub Jelinek wrote:
> On Tue, Nov 24, 2009 at 03:32:20PM +0000, Andrew Haley wrote:
>> Jakub Jelinek wrote:
>>> On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
>>>>> you should compile your code with -maccumulate-outgoing-args, and there's
>>>>> no need to use -mtune=generic.  Is that right?
>>>> Seems to work. What other side effects has that ?
>>> Faster code, significant increase in code size though.
>> Does it affect code size when we don't have to realign the stack pointer?
> 
> Yes, a lot.  The difference is that -maccumulate-outgoing-args allocates
> space for arguments of the callee with most arguments in the prologue, using
> subtraction from sp, then to pass arguments uses movl XXX, 4(%esp) etc.
> and the stack pointer doesn't usually change within the function (except for
> alloca/VLAs).
> With -mno-accumulate-outgoing-args args are pushed using push instructions
> and stack pointer is constantly changing.

Alright.  So, it is possible in theory for gcc to generate code that
only uses -maccumulate-outgoing-args when it needs to realign SP.
And, therefore, we could have a nice option for the kernel: one with
(mostly) good code density and never generates the bizarre code
sequence in the prologue.

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC messing with mcount prologue
  2009-11-24 15:32                           ` Andrew Haley
@ 2009-11-24 15:36                             ` Jakub Jelinek
  2009-11-24 15:48                               ` Andrew Haley
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Jelinek @ 2009-11-24 15:36 UTC (permalink / raw)
  To: Andrew Haley
  Cc: Thomas Gleixner, H.J. Lu, rostedt, Ingo Molnar, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On Tue, Nov 24, 2009 at 03:32:20PM +0000, Andrew Haley wrote:
> Jakub Jelinek wrote:
> > On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
> >>> you should compile your code with -maccumulate-outgoing-args, and there's
> >>> no need to use -mtune=generic.  Is that right?
> >> Seems to work. What other side effects has that ?
> > 
> > Faster code, significant increase in code size though.
> 
> Does it affect code size when we don't have to realign the stack pointer?

Yes, a lot.  The difference is that -maccumulate-outgoing-args allocates
space for arguments of the callee with most arguments in the prologue, using
subtraction from sp, then to pass arguments uses movl XXX, 4(%esp) etc.
and the stack pointer doesn't usually change within the function (except for
alloca/VLAs).
With -mno-accumulate-outgoing-args args are pushed using push instructions
and stack pointer is constantly changing.

	Jakub

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-24 15:06                         ` Jakub Jelinek
@ 2009-11-24 15:32                           ` Andrew Haley
  2009-11-24 15:36                             ` Jakub Jelinek
  2009-11-25 15:29                           ` Thomas Gleixner
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Haley @ 2009-11-24 15:32 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Thomas Gleixner, H.J. Lu, rostedt, Ingo Molnar, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

Jakub Jelinek wrote:
> On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
>>> you should compile your code with -maccumulate-outgoing-args, and there's
>>> no need to use -mtune=generic.  Is that right?
>> Seems to work. What other side effects has that ?
> 
> Faster code, significant increase in code size though.

Does it affect code size when we don't have to realign the stack pointer?

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC messing with mcount prologue
  2009-11-24 14:56                       ` Thomas Gleixner
@ 2009-11-24 15:06                         ` Jakub Jelinek
  2009-11-24 15:32                           ` Andrew Haley
  2009-11-25 15:29                           ` Thomas Gleixner
  0 siblings, 2 replies; 31+ messages in thread
From: Jakub Jelinek @ 2009-11-24 15:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andrew Haley, H.J. Lu, rostedt, Ingo Molnar, H. Peter Anvin,
	LKML, Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, gcc,
	Linus Torvalds

On Tue, Nov 24, 2009 at 03:55:49PM +0100, Thomas Gleixner wrote:
> > you should compile your code with -maccumulate-outgoing-args, and there's
> > no need to use -mtune=generic.  Is that right?
> 
> Seems to work. What other side effects has that ?

Faster code, significant increase in code size though.  Note that on many
architectures it is the only supported model.

	Jakub

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-24 14:43                     ` Andrew Haley
@ 2009-11-24 14:56                       ` Thomas Gleixner
  2009-11-24 15:06                         ` Jakub Jelinek
  0 siblings, 1 reply; 31+ messages in thread
From: Thomas Gleixner @ 2009-11-24 14:56 UTC (permalink / raw)
  To: Andrew Haley
  Cc: H.J. Lu, rostedt, Ingo Molnar, H. Peter Anvin, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, jakub, gcc,
	Linus Torvalds

On Tue, 24 Nov 2009, Andrew Haley wrote:
> H.J. Lu wrote:
> > On Sun, Nov 22, 2009 at 9:20 AM, Andrew Haley <aph@redhat.com> wrote:
> >> H.J. Lu wrote:
> >>> On Fri, Nov 20, 2009 at 11:35 AM, Andrew Haley <aph@redhat.com> wrote:
> >>>> Steven Rostedt wrote:
> >>>>> Ingo, Thomas and Linus,
> >>>>>
> >>>>> I know Thomas did a patch to force the -mtune=generic, but just in case
> >>>>> gcc decides to do something crazy again, this patch will catch it.
> >>>>>
> >>>>> Should we try to get this in now?
> >>>> I'm sure this makes sense, but a gcc test case would be even better.
> >>>> If this can be detected in the gcc test suite it'll be found and
> >>>> fixed long before y'all in kernel land get to see it.  That's the
> >>>> only way to guarantee this never bothers you again.
> >>>>
> >>>> H.J., who wrote the code in question, is hopefully looking at why
> >>>> this odd code is being generated.  Once he's done I can put a
> >>>> suitable test case in the gcc test suite.
> >>>>
> >>> See:
> >>>
> >>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109#c7
> >> I saw that, but does it mean you're going to investigate?  There is
> >> no obvious reason why -mtune=generic should affect code generation
> >> in this way, but it does.
> > 
> > Why not, there is
> > 
> > static const unsigned int x86_accumulate_outgoing_args
> >   = m_AMD_MULTIPLE | m_ATOM | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2
> >     | m_GENERIC;
> > 
> > -mtune=generic turns on -maccumulate-outgoing-args.
> 
> Alright, so let's at least try to give the kernel people the information
> that they need.
> 
> What you're saying is, to avoid this:
> 
>   000005f0 <timer_stats_update_stats>:
>  5f0:   57                      push   %edi
>  5f1:   8d 7c 24 08             lea    0x8(%esp),%edi
>  5f5:   83 e4 f0                and    $0xfffffff0,%esp
>  5f8:   ff 77 fc                pushl  -0x4(%edi)
>  5fb:   55                      push   %ebp
>  5fc:   89 e5                   mov    %esp,%ebp
> 
> you should compile your code with -maccumulate-outgoing-args, and there's
> no need to use -mtune=generic.  Is that right?

Seems to work. What other side effects has that ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC    messing with mcount prologue
  2009-11-22 23:31                   ` H.J. Lu
@ 2009-11-24 14:43                     ` Andrew Haley
  2009-11-24 14:56                       ` Thomas Gleixner
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Haley @ 2009-11-24 14:43 UTC (permalink / raw)
  To: H.J. Lu
  Cc: rostedt, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, jakub, gcc,
	Linus Torvalds

H.J. Lu wrote:
> On Sun, Nov 22, 2009 at 9:20 AM, Andrew Haley <aph@redhat.com> wrote:
>> H.J. Lu wrote:
>>> On Fri, Nov 20, 2009 at 11:35 AM, Andrew Haley <aph@redhat.com> wrote:
>>>> Steven Rostedt wrote:
>>>>> Ingo, Thomas and Linus,
>>>>>
>>>>> I know Thomas did a patch to force the -mtune=generic, but just in case
>>>>> gcc decides to do something crazy again, this patch will catch it.
>>>>>
>>>>> Should we try to get this in now?
>>>> I'm sure this makes sense, but a gcc test case would be even better.
>>>> If this can be detected in the gcc test suite it'll be found and
>>>> fixed long before y'all in kernel land get to see it.  That's the
>>>> only way to guarantee this never bothers you again.
>>>>
>>>> H.J., who wrote the code in question, is hopefully looking at why
>>>> this odd code is being generated.  Once he's done I can put a
>>>> suitable test case in the gcc test suite.
>>>>
>>> See:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109#c7
>> I saw that, but does it mean you're going to investigate?  There is
>> no obvious reason why -mtune=generic should affect code generation
>> in this way, but it does.
> 
> Why not, there is
> 
> static const unsigned int x86_accumulate_outgoing_args
>   = m_AMD_MULTIPLE | m_ATOM | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2
>     | m_GENERIC;
> 
> -mtune=generic turns on -maccumulate-outgoing-args.

Alright, so let's at least try to give the kernel people the information
that they need.

What you're saying is, to avoid this:

  000005f0 <timer_stats_update_stats>:
 5f0:   57                      push   %edi
 5f1:   8d 7c 24 08             lea    0x8(%esp),%edi
 5f5:   83 e4 f0                and    $0xfffffff0,%esp
 5f8:   ff 77 fc                pushl  -0x4(%edi)
 5fb:   55                      push   %ebp
 5fc:   89 e5                   mov    %esp,%ebp

you should compile your code with -maccumulate-outgoing-args, and there's
no need to use -mtune=generic.  Is that right?

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC   messing with mcount prologue
  2009-11-22 17:21                 ` Andrew Haley
@ 2009-11-22 23:31                   ` H.J. Lu
  2009-11-24 14:43                     ` Andrew Haley
  0 siblings, 1 reply; 31+ messages in thread
From: H.J. Lu @ 2009-11-22 23:31 UTC (permalink / raw)
  To: Andrew Haley
  Cc: rostedt, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, jakub, gcc,
	Linus Torvalds

On Sun, Nov 22, 2009 at 9:20 AM, Andrew Haley <aph@redhat.com> wrote:
> H.J. Lu wrote:
>> On Fri, Nov 20, 2009 at 11:35 AM, Andrew Haley <aph@redhat.com> wrote:
>>> Steven Rostedt wrote:
>>>> Ingo, Thomas and Linus,
>>>>
>>>> I know Thomas did a patch to force the -mtune=generic, but just in case
>>>> gcc decides to do something crazy again, this patch will catch it.
>>>>
>>>> Should we try to get this in now?
>>> I'm sure this makes sense, but a gcc test case would be even better.
>>> If this can be detected in the gcc test suite it'll be found and
>>> fixed long before y'all in kernel land get to see it.  That's the
>>> only way to guarantee this never bothers you again.
>>>
>>> H.J., who wrote the code in question, is hopefully looking at why
>>> this odd code is being generated.  Once he's done I can put a
>>> suitable test case in the gcc test suite.
>>>
>>
>> See:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109#c7
>
> I saw that, but does it mean you're going to investigate?  There is
> no obvious reason why -mtune=generic should affect code generation
> in this way, but it does.
>

Why not, there is

static const unsigned int x86_accumulate_outgoing_args
  = m_AMD_MULTIPLE | m_ATOM | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2
    | m_GENERIC;

-mtune=generic turns on -maccumulate-outgoing-args.


-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC   messing with mcount prologue
  2009-11-22  9:39               ` H.J. Lu
@ 2009-11-22 17:21                 ` Andrew Haley
  2009-11-22 23:31                   ` H.J. Lu
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Haley @ 2009-11-22 17:21 UTC (permalink / raw)
  To: H.J. Lu
  Cc: rostedt, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, jakub, gcc,
	Linus Torvalds

H.J. Lu wrote:
> On Fri, Nov 20, 2009 at 11:35 AM, Andrew Haley <aph@redhat.com> wrote:
>> Steven Rostedt wrote:
>>> Ingo, Thomas and Linus,
>>>
>>> I know Thomas did a patch to force the -mtune=generic, but just in case
>>> gcc decides to do something crazy again, this patch will catch it.
>>>
>>> Should we try to get this in now?
>> I'm sure this makes sense, but a gcc test case would be even better.
>> If this can be detected in the gcc test suite it'll be found and
>> fixed long before y'all in kernel land get to see it.  That's the
>> only way to guarantee this never bothers you again.
>>
>> H.J., who wrote the code in question, is hopefully looking at why
>> this odd code is being generated.  Once he's done I can put a
>> suitable test case in the gcc test suite.
>>
> 
> See:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109#c7

I saw that, but does it mean you're going to investigate?  There is
no obvious reason why -mtune=generic should affect code generation
in this way, but it does.

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC   messing with mcount prologue
  2009-11-20 19:36             ` Andrew Haley
  2009-11-20 19:48               ` Steven Rostedt
@ 2009-11-22  9:39               ` H.J. Lu
  2009-11-22 17:21                 ` Andrew Haley
  1 sibling, 1 reply; 31+ messages in thread
From: H.J. Lu @ 2009-11-22  9:39 UTC (permalink / raw)
  To: Andrew Haley
  Cc: rostedt, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, jakub, gcc,
	Linus Torvalds

On Fri, Nov 20, 2009 at 11:35 AM, Andrew Haley <aph@redhat.com> wrote:
> Steven Rostedt wrote:
>> Ingo, Thomas and Linus,
>>
>> I know Thomas did a patch to force the -mtune=generic, but just in case
>> gcc decides to do something crazy again, this patch will catch it.
>>
>> Should we try to get this in now?
>
> I'm sure this makes sense, but a gcc test case would be even better.
> If this can be detected in the gcc test suite it'll be found and
> fixed long before y'all in kernel land get to see it.  That's the
> only way to guarantee this never bothers you again.
>
> H.J., who wrote the code in question, is hopefully looking at why
> this odd code is being generated.  Once he's done I can put a
> suitable test case in the gcc test suite.
>

See:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109#c7


-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-20 17:02           ` Steven Rostedt
  2009-11-20 17:15             ` H. Peter Anvin
  2009-11-20 19:36             ` Andrew Haley
@ 2009-11-22  9:06             ` Ingo Molnar
  2 siblings, 0 replies; 31+ messages in thread
From: Ingo Molnar @ 2009-11-22  9:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, H. Peter Anvin, LKML, Andrew Morton,
	Heiko Carstens, feng.tang, Peter Zijlstra, Frederic Weisbecker,
	David Daney, Andrew Haley, Richard Guenther, jakub, gcc,
	Linus Torvalds


* Steven Rostedt <rostedt@goodmis.org> wrote:

> Ingo, Thomas and Linus,
> 
> I know Thomas did a patch to force the -mtune=generic, but just in 
> case gcc decides to do something crazy again, this patch will catch 
> it.
> 
> Should we try to get this in now?

Very nice example of defensive coding - i like this. I've queued it up 
for .33 (unless anyone objects) as i think it's too late for .32.

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect   GCC messing with mcount prologue
  2009-11-20 19:48               ` Steven Rostedt
@ 2009-11-20 19:50                 ` H. Peter Anvin
  0 siblings, 0 replies; 31+ messages in thread
From: H. Peter Anvin @ 2009-11-20 19:50 UTC (permalink / raw)
  To: rostedt
  Cc: Andrew Haley, Thomas Gleixner, Ingo Molnar, LKML, Andrew Morton,
	Heiko Carstens, feng.tang, Peter Zijlstra, Frederic Weisbecker,
	David Daney, Richard Guenther, jakub, gcc, Linus Torvalds

On 11/20/2009 11:46 AM, Steven Rostedt wrote:
> 
> Yes a gcc test suite will help new instances of gcc. But we need to
> worry about the instances of gcc that people have on their desktops now.
> This test case will catch the discrepancy between gcc and the function
> graph tracer. I'm not 100% convince that just adding -mtune=generic will
> help in all cases. If we miss another instance, then the function graph
> tracer may crash someone's kernel.
> 

Furthermore, for future gcc instances what we really want is the early
interception support anyway.

	-hpa

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect   GCC messing with mcount prologue
  2009-11-20 19:36             ` Andrew Haley
@ 2009-11-20 19:48               ` Steven Rostedt
  2009-11-20 19:50                 ` H. Peter Anvin
  2009-11-22  9:39               ` H.J. Lu
  1 sibling, 1 reply; 31+ messages in thread
From: Steven Rostedt @ 2009-11-20 19:48 UTC (permalink / raw)
  To: Andrew Haley
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, jakub, gcc,
	Linus Torvalds

On Fri, 2009-11-20 at 19:35 +0000, Andrew Haley wrote:
> Steven Rostedt wrote:
> > Ingo, Thomas and Linus,
> > 
> > I know Thomas did a patch to force the -mtune=generic, but just in case
> > gcc decides to do something crazy again, this patch will catch it.
> > 
> > Should we try to get this in now?
> 
> I'm sure this makes sense, but a gcc test case would be even better.
> If this can be detected in the gcc test suite it'll be found and
> fixed long before y'all in kernel land get to see it.  That's the
> only way to guarantee this never bothers you again.
> 
> H.J., who wrote the code in question, is hopefully looking at why
> this odd code is being generated.  Once he's done I can put a
> suitable test case in the gcc test suite.

Yes a gcc test suite will help new instances of gcc. But we need to
worry about the instances of gcc that people have on their desktops now.
This test case will catch the discrepancy between gcc and the function
graph tracer. I'm not 100% convince that just adding -mtune=generic will
help in all cases. If we miss another instance, then the function graph
tracer may crash someone's kernel.

-- Steve


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect   GCC messing with mcount prologue
  2009-11-20 17:02           ` Steven Rostedt
  2009-11-20 17:15             ` H. Peter Anvin
@ 2009-11-20 19:36             ` Andrew Haley
  2009-11-20 19:48               ` Steven Rostedt
  2009-11-22  9:39               ` H.J. Lu
  2009-11-22  9:06             ` Ingo Molnar
  2 siblings, 2 replies; 31+ messages in thread
From: Andrew Haley @ 2009-11-20 19:36 UTC (permalink / raw)
  To: rostedt
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Peter Zijlstra,
	Frederic Weisbecker, David Daney, Richard Guenther, jakub, gcc,
	Linus Torvalds

Steven Rostedt wrote:
> Ingo, Thomas and Linus,
> 
> I know Thomas did a patch to force the -mtune=generic, but just in case
> gcc decides to do something crazy again, this patch will catch it.
> 
> Should we try to get this in now?

I'm sure this makes sense, but a gcc test case would be even better.
If this can be detected in the gcc test suite it'll be found and
fixed long before y'all in kernel land get to see it.  That's the
only way to guarantee this never bothers you again.

H.J., who wrote the code in question, is hopefully looking at why
this odd code is being generated.  Once he's done I can put a
suitable test case in the gcc test suite.

Andrew.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
  2009-11-20 17:02           ` Steven Rostedt
@ 2009-11-20 17:15             ` H. Peter Anvin
  2009-11-20 19:36             ` Andrew Haley
  2009-11-22  9:06             ` Ingo Molnar
  2 siblings, 0 replies; 31+ messages in thread
From: H. Peter Anvin @ 2009-11-20 17:15 UTC (permalink / raw)
  To: rostedt
  Cc: Thomas Gleixner, Ingo Molnar, LKML, Andrew Morton,
	Heiko Carstens, feng.tang, Peter Zijlstra, Frederic Weisbecker,
	David Daney, Andrew Haley, Richard Guenther, jakub, gcc,
	Linus Torvalds

On 11/20/2009 09:00 AM, Steven Rostedt wrote:
> Ingo, Thomas and Linus,
> 
> I know Thomas did a patch to force the -mtune=generic, but just in case
> gcc decides to do something crazy again, this patch will catch it.
> 
> Should we try to get this in now?
> 

Sounds like a very good idea to me.
	
	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect  GCC messing with mcount prologue
  2009-11-20  5:24         ` Steven Rostedt
  2009-11-20  5:33           ` Steven Rostedt
@ 2009-11-20 17:02           ` Steven Rostedt
  2009-11-20 17:15             ` H. Peter Anvin
                               ` (2 more replies)
  1 sibling, 3 replies; 31+ messages in thread
From: Steven Rostedt @ 2009-11-20 17:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, LKML, Andrew Morton, Heiko Carstens,
	feng.tang, Peter Zijlstra, Frederic Weisbecker, David Daney,
	Andrew Haley, Richard Guenther, jakub, gcc, Linus Torvalds

Ingo, Thomas and Linus,

I know Thomas did a patch to force the -mtune=generic, but just in case
gcc decides to do something crazy again, this patch will catch it.

Should we try to get this in now?

-- Steve

On Fri, 2009-11-20 at 00:23 -0500, Steven Rostedt wrote:
> commit c7715fb611c69ac4b7f722a891de08b206fb7686
> Author: Steven Rostedt <srostedt@redhat.com>
> Date:   Thu Nov 19 23:41:02 2009 -0500
> 
>     tracing/x86: Add check to detect GCC messing with mcount prologue
>     
>     Latest versions of GCC create a funny prologue for some functions.
>     Instead of the typical:
>     
>     		push   %ebp
>     		mov    %esp,%ebp
>     		and    $0xffffffe0,%esp
>     		[...]
>     		call   mcount
>     
>     GCC may try to align the stack before setting up the frame pointer
>     register:
>     
>     		push   %edi
>     		lea    0x8(%esp),%edi
>     		and    $0xffffffe0,%esp
>     		pushl  -0x4(%edi)
>     		push   %ebp
>     		mov    %esp,%ebp
>     		[...]
>     		call   mcount
>     
>     This crazy code places a copy of the return address into the
>     frame pointer. The function graph tracer uses this pointer to
>     save and replace the return address of the calling function to jump
>     to the function graph tracer's return handler, which will put back
>     the return address. But instead instead of the typical return:
>     
>     		mov    %ebp,%esp
>     		pop    %ebp
>     		ret
>     
>     The return of the function performs:
>     
>     		lea    -0x8(%edi),%esp
>     		pop    %edi
>     		ret
>     
>     The function graph tracer return handler will not be called at the exit
>     of the function, but the parent function will call it. Because we missed
>     the return of the child function, the handler will replace the parent's
>     return address with that of the child. Obviously this will cause a crash
>     (Note, there is code to detect this case and safely panic the kernel).
>     
>     The kicker is that this happens to just a handful of functions.
>     And only with certain gcc options.
>     
>     Compiling with: 	-march=pentium-mmx
>     will cause the problem to appear. But if you were to change
>     pentium-mmx to i686 or add -mtune=generic, then the problem goes away.
>     
>     I first saw this problem when compiling with optimize for size.
>     But it seems that various other options may cause this issue to arise.
>     
>     Instead of completely disabling the function graph tracer for i386 builds
>     this patch adds a check to recordmcount.pl to make sure that all
>     functions that contain a call to mcount start with "push %ebp".
>     If not, it will fail the compile and print out the nasty warning:
>     
>       CC      kernel/time/timer_stats.o
>     
>     ********************************************************
>       Your version of GCC breaks the function graph tracer
>       Please disable CONFIG_FUNCTION_GRAPH_TRACER
>       Failed function was "timer_stats_update_stats"
>     ********************************************************
>     
>     The script recordmcount.pl is given a new parameter "do_check". If
>     this is negative, the script will only perform this check without
>     creating the mcount caller section. This will be executed for x86_32
>     when CONFIG_FUNCTION_GRAPH_TRACER is enabled and CONFIG_DYNAMIC_FTRACE
>     is not.
>     
>     If the arch is x86_32 and $do_check is greater than 1, it will perform
>     the check while processing the mcount callers. If $do_check is 0, then
>     no check will be performed. This is for non x86_32 archs and when
>     compiling without CONFIG_FUNCTION_GRAPH_TRACER enabled, even on x86_32.
>     
>     Reported-by: Thomas Gleixner <tglx@linutronix.de>
>     LKML-Reference: <alpine.LFD.2.00.0911191423190.24119@localhost.localdomain>
>     Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect  GCC messing with mcount prologue
  2009-11-20  5:24         ` Steven Rostedt
@ 2009-11-20  5:33           ` Steven Rostedt
  2009-11-20 17:02           ` Steven Rostedt
  1 sibling, 0 replies; 31+ messages in thread
From: Steven Rostedt @ 2009-11-20  5:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, LKML, Andrew Morton, Heiko Carstens,
	feng.tang, Peter Zijlstra, Frederic Weisbecker, David Daney,
	Andrew Haley, Richard Guenther, jakub, gcc, Linus Torvalds,
	linux-kbuild, Sam Ravnborg

This touches the Makefile scripts. I forgot to CC kbuild and Sam.

-- Steve

On Fri, 2009-11-20 at 00:23 -0500, Steven Rostedt wrote:
> Ingo,
> 
> Not sure if this is too much for this late in the -rc game, but it finds
> the gcc bug at build time, and we don't need to disable function graph
> tracer for all i386 builds.
> 
> This is built on my last urgent repo pull request.
> 
> Please pull the latest tip/tracing/urgent-2 tree, which can be found at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
> tip/tracing/urgent-2
> 
> 
> Steven Rostedt (1):
>       tracing/x86: Add check to detect GCC messing with mcount prologue
> 
> ----
>  kernel/trace/Kconfig    |    1 -
>  scripts/Makefile.build  |   25 +++++++++++++++-
>  scripts/recordmcount.pl |   74 +++++++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 95 insertions(+), 5 deletions(-)
> ---------------------------
> commit c7715fb611c69ac4b7f722a891de08b206fb7686
> Author: Steven Rostedt <srostedt@redhat.com>
> Date:   Thu Nov 19 23:41:02 2009 -0500
> 
>     tracing/x86: Add check to detect GCC messing with mcount prologue
>     
>     Latest versions of GCC create a funny prologue for some functions.
>     Instead of the typical:
>     
>     		push   %ebp
>     		mov    %esp,%ebp
>     		and    $0xffffffe0,%esp
>     		[...]
>     		call   mcount
>     
>     GCC may try to align the stack before setting up the frame pointer
>     register:
>     
>     		push   %edi
>     		lea    0x8(%esp),%edi
>     		and    $0xffffffe0,%esp
>     		pushl  -0x4(%edi)
>     		push   %ebp
>     		mov    %esp,%ebp
>     		[...]
>     		call   mcount
>     
>     This crazy code places a copy of the return address into the
>     frame pointer. The function graph tracer uses this pointer to
>     save and replace the return address of the calling function to jump
>     to the function graph tracer's return handler, which will put back
>     the return address. But instead instead of the typical return:
>     
>     		mov    %ebp,%esp
>     		pop    %ebp
>     		ret
>     
>     The return of the function performs:
>     
>     		lea    -0x8(%edi),%esp
>     		pop    %edi
>     		ret
>     
>     The function graph tracer return handler will not be called at the exit
>     of the function, but the parent function will call it. Because we missed
>     the return of the child function, the handler will replace the parent's
>     return address with that of the child. Obviously this will cause a crash
>     (Note, there is code to detect this case and safely panic the kernel).
>     
>     The kicker is that this happens to just a handful of functions.
>     And only with certain gcc options.
>     
>     Compiling with: 	-march=pentium-mmx
>     will cause the problem to appear. But if you were to change
>     pentium-mmx to i686 or add -mtune=generic, then the problem goes away.
>     
>     I first saw this problem when compiling with optimize for size.
>     But it seems that various other options may cause this issue to arise.
>     
>     Instead of completely disabling the function graph tracer for i386 builds
>     this patch adds a check to recordmcount.pl to make sure that all
>     functions that contain a call to mcount start with "push %ebp".
>     If not, it will fail the compile and print out the nasty warning:
>     
>       CC      kernel/time/timer_stats.o
>     
>     ********************************************************
>       Your version of GCC breaks the function graph tracer
>       Please disable CONFIG_FUNCTION_GRAPH_TRACER
>       Failed function was "timer_stats_update_stats"
>     ********************************************************
>     
>     The script recordmcount.pl is given a new parameter "do_check". If
>     this is negative, the script will only perform this check without
>     creating the mcount caller section. This will be executed for x86_32
>     when CONFIG_FUNCTION_GRAPH_TRACER is enabled and CONFIG_DYNAMIC_FTRACE
>     is not.
>     
>     If the arch is x86_32 and $do_check is greater than 1, it will perform
>     the check while processing the mcount callers. If $do_check is 0, then
>     no check will be performed. This is for non x86_32 archs and when
>     compiling without CONFIG_FUNCTION_GRAPH_TRACER enabled, even on x86_32.
>     
>     Reported-by: Thomas Gleixner <tglx@linutronix.de>
>     LKML-Reference: <alpine.LFD.2.00.0911191423190.24119@localhost.localdomain>
>     Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index b416512..cd39064 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -143,7 +143,6 @@ config FUNCTION_GRAPH_TRACER
>  	bool "Kernel Function Graph Tracer"
>  	depends on HAVE_FUNCTION_GRAPH_TRACER
>  	depends on FUNCTION_TRACER
> -	depends on !X86_32 || !CC_OPTIMIZE_FOR_SIZE
>  	default y
>  	help
>  	  Enable the kernel to trace a function at both its return
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 341b589..3b897f2 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -206,10 +206,33 @@ cmd_modversions =							\
>  endif
>  
>  ifdef CONFIG_FTRACE_MCOUNT_RECORD
> +
> + ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +  ifdef CONFIG_X86_32
> +   rm_do_check = 1
> +  else
> +   rm_do_check = 0
> +  endif
> + else
> +  rm_do_check = 0
> + endif
> +
>  cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
>  	"$(if $(CONFIG_64BIT),64,32)" \
>  	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC)" "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
> -	"$(if $(part-of-module),1,0)" "$(@)";
> +	"$(if $(part-of-module),1,0)" "$(rm_do_check)" "$(@)";
> +
> +else
> +
> + ifdef CONFIG_X86_32
> + ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +   cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
> +	"$(if $(CONFIG_64BIT),64,32)" \
> +	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC)" "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
> +	"$(if $(part-of-module),1,0)" "-1" "$(@)";
> + endif
> + endif
> +
>  endif
>  
>  define rule_cc_o_c
> diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
> index 090d300..164a9d5 100755
> --- a/scripts/recordmcount.pl
> +++ b/scripts/recordmcount.pl
> @@ -99,14 +99,14 @@ $P =~ s@.*/@@g;
>  
>  my $V = '0.1';
>  
> -if ($#ARGV < 7) {
> -	print "usage: $P arch bits objdump objcopy cc ld nm rm mv is_module inputfile\n";
> +if ($#ARGV < 11) {
> +	print "usage: $P arch bits objdump objcopy cc ld nm rm mv is_module do_check inputfile\n";
>  	print "version: $V\n";
>  	exit(1);
>  }
>  
>  my ($arch, $bits, $objdump, $objcopy, $cc,
> -    $ld, $nm, $rm, $mv, $is_module, $inputfile) = @ARGV;
> +    $ld, $nm, $rm, $mv, $is_module, $do_check, $inputfile) = @ARGV;
>  
>  # This file refers to mcount and shouldn't be ftraced, so lets' ignore it
>  if ($inputfile eq "kernel/trace/ftrace.o") {
> @@ -129,6 +129,60 @@ $nm = "nm" if ((length $nm) == 0);
>  $rm = "rm" if ((length $rm) == 0);
>  $mv = "mv" if ((length $mv) == 0);
>  
> +# gcc can do stupid things with the stack pointer on x86_32.
> +# It may pass a copy of the return address to mcount, which will
> +# break the function graph tracer. If this happens then we need
> +# to flag it and break the build.
> +#
> +# For x86_32, the parameter do_check will be negative if we only
> +# want to perform the check, and positive if we should od the check.
> +# If function graph tracing is not enabled, do_check will be zero.
> +#
> +
> +my $check_next_line = 0;
> +my $line_failed = 0;
> +my $last_function;
> +
> +sub test_x86_32_prologue
> +{
> +    if ($check_next_line) {
> +	if (!/push\s*\%ebp/) {
> +	    $line_failed = 1;
> +	}
> +    }
> +
> +    if ($line_failed && /mcount/) {
> +	print STDERR "\n********************************************************\n";
> +	print STDERR "  Your version of GCC breaks the function graph tracer\n";
> +	print STDERR "  Please disable CONFIG_FUNCTION_GRAPH_TRACER\n";
> +	print STDERR "  Failed function was \"$last_function\"\n";
> +	print STDERR "********************************************************\n\n";
> +	exit -1;
> +    }
> +    $check_next_line = 0;
> +
> +    # check the first line after a function starts for
> +    #  push %ebp
> +    if (/^[0-9a-fA-F]+\s+<([a-zA-Z_].*)>:$/) {
> +	$last_function = $1;
> +	$check_next_line = 1;
> +	$line_failed = 0;
> +    }
> +}
> +
> +if ($do_check < 0) {
> +    # Only perform the check and quit
> +    open(IN, "$objdump -dr $inputfile|") || die "error running $objdump";
> +
> +    while (<IN>) {
> +	test_x86_32_prologue;
> +    }
> +    close (IN);
> +    exit 0;
> +}
> +
> +my $check = 0;
> +
>  #print STDERR "running: $P '$arch' '$objdump' '$objcopy' '$cc' '$ld' " .
>  #    "'$nm' '$rm' '$mv' '$inputfile'\n";
>  
> @@ -153,6 +207,12 @@ if ($arch eq "x86") {
>      }
>  }
>  
> +if ($arch eq "i386") {
> +    if ($do_check) {
> +	$check = 1;
> +    }
> +}
> +
>  #
>  # We base the defaults off of i386, the other archs may
>  # feel free to change them in the below if statements.
> @@ -381,6 +441,14 @@ my $text;
>  my $read_headers = 1;
>  
>  while (<IN>) {
> +
> +    # x86_32 may need to test the start of every function to see
> +    # if GCC did not mess up the mcount prologue. All functions must
> +    # start with push %ebp, otherwise it is broken.
> +    if ($check) {
> +	test_x86_32_prologue;
> +    }
> +
>      # is it a section?
>      if (/$section_regex/) {
>  	$read_headers = 0;
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC  messing with mcount prologue
       [not found]       ` <alpine.LFD.2.00.0911191423190.24119@localhost.localdomain>
@ 2009-11-20  5:24         ` Steven Rostedt
  2009-11-20  5:33           ` Steven Rostedt
  2009-11-20 17:02           ` Steven Rostedt
  0 siblings, 2 replies; 31+ messages in thread
From: Steven Rostedt @ 2009-11-20  5:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, LKML, Andrew Morton, Heiko Carstens,
	feng.tang, Peter Zijlstra, Frederic Weisbecker, David Daney,
	Andrew Haley, Richard Guenther, jakub, gcc, Linus Torvalds


Ingo,

Not sure if this is too much for this late in the -rc game, but it finds
the gcc bug at build time, and we don't need to disable function graph
tracer for all i386 builds.

This is built on my last urgent repo pull request.

Please pull the latest tip/tracing/urgent-2 tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/urgent-2


Steven Rostedt (1):
      tracing/x86: Add check to detect GCC messing with mcount prologue

----
 kernel/trace/Kconfig    |    1 -
 scripts/Makefile.build  |   25 +++++++++++++++-
 scripts/recordmcount.pl |   74 +++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 95 insertions(+), 5 deletions(-)
---------------------------
commit c7715fb611c69ac4b7f722a891de08b206fb7686
Author: Steven Rostedt <srostedt@redhat.com>
Date:   Thu Nov 19 23:41:02 2009 -0500

    tracing/x86: Add check to detect GCC messing with mcount prologue
    
    Latest versions of GCC create a funny prologue for some functions.
    Instead of the typical:
    
    		push   %ebp
    		mov    %esp,%ebp
    		and    $0xffffffe0,%esp
    		[...]
    		call   mcount
    
    GCC may try to align the stack before setting up the frame pointer
    register:
    
    		push   %edi
    		lea    0x8(%esp),%edi
    		and    $0xffffffe0,%esp
    		pushl  -0x4(%edi)
    		push   %ebp
    		mov    %esp,%ebp
    		[...]
    		call   mcount
    
    This crazy code places a copy of the return address into the
    frame pointer. The function graph tracer uses this pointer to
    save and replace the return address of the calling function to jump
    to the function graph tracer's return handler, which will put back
    the return address. But instead instead of the typical return:
    
    		mov    %ebp,%esp
    		pop    %ebp
    		ret
    
    The return of the function performs:
    
    		lea    -0x8(%edi),%esp
    		pop    %edi
    		ret
    
    The function graph tracer return handler will not be called at the exit
    of the function, but the parent function will call it. Because we missed
    the return of the child function, the handler will replace the parent's
    return address with that of the child. Obviously this will cause a crash
    (Note, there is code to detect this case and safely panic the kernel).
    
    The kicker is that this happens to just a handful of functions.
    And only with certain gcc options.
    
    Compiling with: 	-march=pentium-mmx
    will cause the problem to appear. But if you were to change
    pentium-mmx to i686 or add -mtune=generic, then the problem goes away.
    
    I first saw this problem when compiling with optimize for size.
    But it seems that various other options may cause this issue to arise.
    
    Instead of completely disabling the function graph tracer for i386 builds
    this patch adds a check to recordmcount.pl to make sure that all
    functions that contain a call to mcount start with "push %ebp".
    If not, it will fail the compile and print out the nasty warning:
    
      CC      kernel/time/timer_stats.o
    
    ********************************************************
      Your version of GCC breaks the function graph tracer
      Please disable CONFIG_FUNCTION_GRAPH_TRACER
      Failed function was "timer_stats_update_stats"
    ********************************************************
    
    The script recordmcount.pl is given a new parameter "do_check". If
    this is negative, the script will only perform this check without
    creating the mcount caller section. This will be executed for x86_32
    when CONFIG_FUNCTION_GRAPH_TRACER is enabled and CONFIG_DYNAMIC_FTRACE
    is not.
    
    If the arch is x86_32 and $do_check is greater than 1, it will perform
    the check while processing the mcount callers. If $do_check is 0, then
    no check will be performed. This is for non x86_32 archs and when
    compiling without CONFIG_FUNCTION_GRAPH_TRACER enabled, even on x86_32.
    
    Reported-by: Thomas Gleixner <tglx@linutronix.de>
    LKML-Reference: <alpine.LFD.2.00.0911191423190.24119@localhost.localdomain>
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index b416512..cd39064 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -143,7 +143,6 @@ config FUNCTION_GRAPH_TRACER
 	bool "Kernel Function Graph Tracer"
 	depends on HAVE_FUNCTION_GRAPH_TRACER
 	depends on FUNCTION_TRACER
-	depends on !X86_32 || !CC_OPTIMIZE_FOR_SIZE
 	default y
 	help
 	  Enable the kernel to trace a function at both its return
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 341b589..3b897f2 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -206,10 +206,33 @@ cmd_modversions =							\
 endif
 
 ifdef CONFIG_FTRACE_MCOUNT_RECORD
+
+ ifdef CONFIG_FUNCTION_GRAPH_TRACER
+  ifdef CONFIG_X86_32
+   rm_do_check = 1
+  else
+   rm_do_check = 0
+  endif
+ else
+  rm_do_check = 0
+ endif
+
 cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
 	"$(if $(CONFIG_64BIT),64,32)" \
 	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC)" "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
-	"$(if $(part-of-module),1,0)" "$(@)";
+	"$(if $(part-of-module),1,0)" "$(rm_do_check)" "$(@)";
+
+else
+
+ ifdef CONFIG_X86_32
+ ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
+	"$(if $(CONFIG_64BIT),64,32)" \
+	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC)" "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
+	"$(if $(part-of-module),1,0)" "-1" "$(@)";
+ endif
+ endif
+
 endif
 
 define rule_cc_o_c
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 090d300..164a9d5 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -99,14 +99,14 @@ $P =~ s@.*/@@g;
 
 my $V = '0.1';
 
-if ($#ARGV < 7) {
-	print "usage: $P arch bits objdump objcopy cc ld nm rm mv is_module inputfile\n";
+if ($#ARGV < 11) {
+	print "usage: $P arch bits objdump objcopy cc ld nm rm mv is_module do_check inputfile\n";
 	print "version: $V\n";
 	exit(1);
 }
 
 my ($arch, $bits, $objdump, $objcopy, $cc,
-    $ld, $nm, $rm, $mv, $is_module, $inputfile) = @ARGV;
+    $ld, $nm, $rm, $mv, $is_module, $do_check, $inputfile) = @ARGV;
 
 # This file refers to mcount and shouldn't be ftraced, so lets' ignore it
 if ($inputfile eq "kernel/trace/ftrace.o") {
@@ -129,6 +129,60 @@ $nm = "nm" if ((length $nm) == 0);
 $rm = "rm" if ((length $rm) == 0);
 $mv = "mv" if ((length $mv) == 0);
 
+# gcc can do stupid things with the stack pointer on x86_32.
+# It may pass a copy of the return address to mcount, which will
+# break the function graph tracer. If this happens then we need
+# to flag it and break the build.
+#
+# For x86_32, the parameter do_check will be negative if we only
+# want to perform the check, and positive if we should od the check.
+# If function graph tracing is not enabled, do_check will be zero.
+#
+
+my $check_next_line = 0;
+my $line_failed = 0;
+my $last_function;
+
+sub test_x86_32_prologue
+{
+    if ($check_next_line) {
+	if (!/push\s*\%ebp/) {
+	    $line_failed = 1;
+	}
+    }
+
+    if ($line_failed && /mcount/) {
+	print STDERR "\n********************************************************\n";
+	print STDERR "  Your version of GCC breaks the function graph tracer\n";
+	print STDERR "  Please disable CONFIG_FUNCTION_GRAPH_TRACER\n";
+	print STDERR "  Failed function was \"$last_function\"\n";
+	print STDERR "********************************************************\n\n";
+	exit -1;
+    }
+    $check_next_line = 0;
+
+    # check the first line after a function starts for
+    #  push %ebp
+    if (/^[0-9a-fA-F]+\s+<([a-zA-Z_].*)>:$/) {
+	$last_function = $1;
+	$check_next_line = 1;
+	$line_failed = 0;
+    }
+}
+
+if ($do_check < 0) {
+    # Only perform the check and quit
+    open(IN, "$objdump -dr $inputfile|") || die "error running $objdump";
+
+    while (<IN>) {
+	test_x86_32_prologue;
+    }
+    close (IN);
+    exit 0;
+}
+
+my $check = 0;
+
 #print STDERR "running: $P '$arch' '$objdump' '$objcopy' '$cc' '$ld' " .
 #    "'$nm' '$rm' '$mv' '$inputfile'\n";
 
@@ -153,6 +207,12 @@ if ($arch eq "x86") {
     }
 }
 
+if ($arch eq "i386") {
+    if ($do_check) {
+	$check = 1;
+    }
+}
+
 #
 # We base the defaults off of i386, the other archs may
 # feel free to change them in the below if statements.
@@ -381,6 +441,14 @@ my $text;
 my $read_headers = 1;
 
 while (<IN>) {
+
+    # x86_32 may need to test the start of every function to see
+    # if GCC did not mess up the mcount prologue. All functions must
+    # start with push %ebp, otherwise it is broken.
+    if ($check) {
+	test_x86_32_prologue;
+    }
+
     # is it a section?
     if (/$section_regex/) {
 	$read_headers = 0;


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2009-11-25 21:01 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-24 17:05 [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC messing with mcount prologue Ross Ridge
2009-11-24 17:11 ` Andrew Haley
     [not found] <alpine.LFD.2.00.0911181933540.24119@localhost.localdomain>
     [not found] ` <tip-887a29f59b93cf54e21814869a4ab6e80b6fa623@git.kernel.org>
     [not found]   ` <20091119072040.GA23579@elte.hu>
     [not found]     ` <alpine.LFD.2.00.0911191053390.24119@localhost.localdomain>
     [not found]       ` <alpine.LFD.2.00.0911191423190.24119@localhost.localdomain>
2009-11-20  5:24         ` Steven Rostedt
2009-11-20  5:33           ` Steven Rostedt
2009-11-20 17:02           ` Steven Rostedt
2009-11-20 17:15             ` H. Peter Anvin
2009-11-20 19:36             ` Andrew Haley
2009-11-20 19:48               ` Steven Rostedt
2009-11-20 19:50                 ` H. Peter Anvin
2009-11-22  9:39               ` H.J. Lu
2009-11-22 17:21                 ` Andrew Haley
2009-11-22 23:31                   ` H.J. Lu
2009-11-24 14:43                     ` Andrew Haley
2009-11-24 14:56                       ` Thomas Gleixner
2009-11-24 15:06                         ` Jakub Jelinek
2009-11-24 15:32                           ` Andrew Haley
2009-11-24 15:36                             ` Jakub Jelinek
2009-11-24 15:48                               ` Andrew Haley
2009-11-24 16:39                                 ` H. Peter Anvin
2009-11-24 17:12                                   ` Andrew Haley
2009-11-24 17:30                                     ` Steven Rostedt
2009-11-25 20:05                                       ` H. Peter Anvin
2009-11-24 19:56                                     ` H. Peter Anvin
2009-11-25 15:29                           ` Thomas Gleixner
2009-11-25 15:45                             ` Ingo Molnar
2009-11-25 15:53                               ` Thomas Gleixner
2009-11-25 16:26                                 ` Ingo Molnar
2009-11-25 16:45                               ` Jakub Jelinek
2009-11-25 20:13                                 ` H. Peter Anvin
2009-11-25 21:01                                   ` Andrew Haley
2009-11-22  9:06             ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).