Function specific optimizations call for discussion

public inbox for gcc@gcc.gnu.org

* Function specific optimizations call for discussion
@ 2007-11-28 22:25 Michael Meissner
  2007-11-29 13:43 ` Karthik Kumar
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Michael Meissner @ 2007-11-28 22:25 UTC (permalink / raw)
  To: gcc, christophe.harle; +Cc: michael.meissner

One of the things that I've been interested in is adding support to GCC to
compile individual functions with specific target options.  I first presented a
draft at the Google mini-summit, and then another draft at the GCC developer
summit last July.

In the x86 world this would mean saying that an individual function can use
SSE5 instructions or SSE4.1 instructions.  This would simplify things for
people who need to write high performance libraries that run on different
architectures, and need to be optimal on each platform.  Ultimately, the goal
is to allow hotspot functions to be compiled several times with different
target specific optimizations.  I would welcome any thoughts or suggestions
about this proposal.
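
As a rough illustration of the idea, here is a minimal C sketch of one
function that is allowed to use SSE4.1 while the rest of the file keeps the
command-line target options. It assumes the `target` function attribute and
the `__builtin_cpu_supports` builtin as found in later GCC releases; the
syntax in the wiki draft may differ, and the names `sum`, `sum_sse41` and
`sum_generic` are made up for the example.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_TARGET_ATTR 1
/* Only this function may be compiled with SSE4.1 instructions enabled. */
static __attribute__((target("sse4.1")))
int sum_sse41(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
#else
#define HAVE_X86_TARGET_ATTR 0
#endif

/* Fallback compiled with the options given on the command line. */
static int sum_generic(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Public entry point: only call the SSE4.1 version when the CPU has it. */
int sum(const int *v, size_t n)
{
#if HAVE_X86_TARGET_ATTR
    if (__builtin_cpu_supports("sse4.1"))
        return sum_sse41(v, n);
#endif
    return sum_generic(v, n);
}
```

Callers only ever see `sum`; which body runs depends on the machine the
program is started on.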

The proposal is at:
http://gcc.gnu.org/wiki/FunctionSpecificOpt

Part of the infrastructure work for doing this is already addressed in the
function adaptation project, and we will work together with that team:
http://gcc.gnu.org/wiki/functionAdaptation

One of the things that I have considered but not added to the current proposal
is the ability to change most of the -f, -O, -W options for a function.  This would be
relatively simple to add once the infrastructure is in place, but it can really
complicate things, since various optimizations depend on other optimizations
having been done.  Similarly, the -mtune= and -march= options can overly
complicate matters.

In addition, attribute(cold) and attribute(hot) will set the optimization level
to -Os and -O3 respectively.
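
For readers unfamiliar with these attributes, a small C sketch of how they
are spelled in GCC follows. The function names are invented for
illustration, and the exact optimization level each attribute selects is
what the proposal describes, not something this snippet can demonstrate.

```c
#if defined(__GNUC__)
#define HOT  __attribute__((hot))
#define COLD __attribute__((cold))
#else
#define HOT
#define COLD
#endif

/* Rarely executed error path: a candidate for size optimization. */
COLD int error_to_status(int code)
{
    return code < 0 ? code : -code;
}

/* Inner loop that dominates run time: a candidate for speed optimization. */
HOT long dot(const int *a, const int *b, int n)
{
    long acc = 0;
    for (int i = 0; i < n; i++)
        acc += (long)a[i] * b[i];
    return acc;
}
```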

I will be away on vacation from December 3-8th, and not reading mail during
that time.

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
michael.meissner@amd.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-28 22:25 Function specific optimizations call for discussion Michael Meissner
@ 2007-11-29 13:43 ` Karthik Kumar
  2007-11-29 13:55 ` Ramana Radhakrishnan
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Karthik Kumar @ 2007-11-29 13:43 UTC (permalink / raw)
  To: Michael Meissner, gcc, christophe.harle

On Nov 29, 2007 2:27 AM, Michael Meissner <michael.meissner@amd.com> wrote:
> One of the things that I've been interested in is adding support to GCC to
> compile individual functions with specific target options.  I first presented a
> draft at the Google mini-summit, and then another draft at the GCC developer
> summit last July.
>
> In the x86 world this would mean saying that an individual function can use
> SSE5 instructions or SSE4.1 instructions.  This would simplify things for
> people who need to write high performance libraries that run on different
> architectures, and need to be optimal on each platform.  Ultimately, the goal
> is to allow hotspot functions to be compiled several times with different
> target specific optimizations.  I would welcome any thoughts or suggestions
> about this proposal.
>
> The proposal is at:
> http://gcc.gnu.org/wiki/FunctionSpecificOpt

Regarding the static constructors/destructors, we can version them as
well. If someone wishes to annotate them, they can use
__attribute__((sse4a)) or change the constructor function name to
name__v__sse4a__ or similar. They can safely assume that it won't execute if
the feature bit isn't present: we would simply return if the feature bit
isn't supported. Multiple attributes are supported on any function,
so that shouldn't be a problem :-) This means the detection of features
would be done before the program's actual execution.
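
A hand-written approximation of that behaviour, as a C sketch: a constructor
that checks the feature bit itself and returns early when it is missing. It
uses `__builtin_cpu_init` and `__builtin_cpu_supports` from later GCC
releases, not the proposed versioning attribute, and it tests "sse4.1"
instead of sse4a purely as an assumption to keep the example widely
compilable. All function and variable names are hypothetical.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define X86_FEATURE_CHECKS 1
#else
#define X86_FEATURE_CHECKS 0
#endif

static int ctor_ran = 0;          /* set once the constructor has executed */
static int fast_tables_ready = 0; /* set only when the feature is present  */

/* Returns 1 if the running CPU reports SSE4.1, 0 otherwise. */
int cpu_has_sse41(void)
{
#if X86_FEATURE_CHECKS
    __builtin_cpu_init();  /* needed when called this early in startup */
    return __builtin_cpu_supports("sse4.1") ? 1 : 0;
#else
    return 0;
#endif
}

/* Runs before main(); does nothing useful if the feature bit is missing. */
#if defined(__GNUC__)
__attribute__((constructor))
#endif
static void init_fast_tables(void)
{
    ctor_ran = 1;
    if (!cpu_has_sse41())
        return;            /* the "simply return" case from the text */
    fast_tables_ready = 1;
}
```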

>
> Part of the infrastructure work for doing this is already addressed in function
> adaption project and we will work together with that team:
> http://gcc.gnu.org/wiki/functionAdaptation
>
> One of the things that I have considered and not added to the current proposal
> is to change most of the -f, -O, -W options for a function.  This would be
> relatively simple to add once the infrastructure is in place, but it can really
> complicate things, since various optimizations depend on other optimizations
> having been done.  Similarly, the -mtune= and -march= options can overly
> complicate matters.
>

As for setting -f, there are flags like stack-protector-all which
cannot be set at a function level. The problem arises when a flag is set
to 0/1 (not 2) and is valid for a whole compilation unit (I'm only saying
such flags might exist). We should clearly split common.opt, or annotate
it with the required information, so we can determine which options
cannot be unset once set, or vice versa. The warning/optimization levels
would inherit from, or differ from, the parent settings (compilation unit
or enclosing function).

> In addition, attribute(cold) and attribute(hot) will set the optimization level
> to -Os and -O3.
>

It might be that someone wishes to optimize for size (not inline
function calls or unroll loops, for instance) but still wants the function
versioned. Versioning will increase text size only if all versions
are loaded at runtime.

> I will be away on vacation from December 3-8th, and not reading mail during
> that time.
>
> --
> Michael Meissner, AMD
> 90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
> michael.meissner@amd.com
>
>
>

-- 
Karthik

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-28 22:25 Function specific optimizations call for discussion Michael Meissner
  2007-11-29 13:43 ` Karthik Kumar
@ 2007-11-29 13:55 ` Ramana Radhakrishnan
       [not found]   ` <d841b3f30711290259m461d7378mc4aab0812c57df38@mail.gmail.com>
  2007-11-29 22:13   ` Michael Meissner
  2007-11-29 14:53 ` Ramana Radhakrishnan
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Ramana Radhakrishnan @ 2007-11-29 13:55 UTC (permalink / raw)
  To: Michael Meissner, gcc, christophe.harle

Hi Michael,

I had a comment / query regarding Stage 2 where you talk about
Function cloning for different targets.

I understand that the mechanism is to have a hidden function pointer
that actually gets initialized based on the cpuid.

I don't know if it is worth the effort to also enhance the debug info
so that a breakpoint put on the function my_min actually sets a
breakpoint on each of the clones, since their logic would be the
same.  This sounds similar to the way that gdb currently handles
breakpoints on multiple constructors.

The other option, of course, is to fake debug information so that it
actually sets a breakpoint on the value of the function pointer that
you set up. I am no DWARF expert but there might be other folks on
the list who have better ideas about how to implement this.
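
To make the mechanism under discussion concrete, here is a hand-written C
sketch of a function pointer that is filled in on first use according to the
CPU. It is only an approximation of what the compiler might generate for
Stage 2; `my_min` comes from the discussion, while `my_min_ptr`,
`my_min_resolve`, `pick_my_min` and the two clones are hypothetical names.

```c
typedef int (*min_fn)(int, int);

/* Generic clone, built with the command-line target options. */
static int my_min_generic(int a, int b) { return a < b ? a : b; }

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_CLONES 1
/* Clone that the compiler may build with SSE4.1 enabled. */
static __attribute__((target("sse4.1")))
int my_min_sse41(int a, int b) { return a < b ? a : b; }
#else
#define HAVE_CLONES 0
#endif

static int my_min_resolve(int a, int b);

/* The "hidden" pointer: starts at the resolver, later holds a clone. */
static min_fn my_min_ptr = my_min_resolve;

/* Choose the best clone for the CPU we are running on. */
static min_fn pick_my_min(void)
{
#if HAVE_CLONES
    if (__builtin_cpu_supports("sse4.1"))
        return my_min_sse41;
#endif
    return my_min_generic;
}

/* First call lands here, patches the pointer, then forwards the call. */
static int my_min_resolve(int a, int b)
{
    my_min_ptr = pick_my_min();
    return my_min_ptr(a, b);
}

/* What callers see: one indirect call through the pointer. */
int my_min(int a, int b) { return my_min_ptr(a, b); }
```

With this shape, a breakpoint on `my_min` catches every call, whereas a
breakpoint on one clone only fires on machines where that clone is selected,
which is the debugging gap raised above.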

cheers
Ramana






-- 
Ramana Radhakrishnan
GNU Tools
Celunite Inc.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-28 22:25 Function specific optimizations call for discussion Michael Meissner
  2007-11-29 13:43 ` Karthik Kumar
  2007-11-29 13:55 ` Ramana Radhakrishnan
@ 2007-11-29 14:53 ` Ramana Radhakrishnan
  2007-11-29 22:06   ` Michael Meissner
  2007-11-29 18:26 ` Sylvain Pion
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Ramana Radhakrishnan @ 2007-11-29 14:53 UTC (permalink / raw)
  To: Michael Meissner, gcc, christophe.harle

Hi,

Hit the send button a bit too soon on my earlier mail.



> In the x86 world this would mean saying that an individual function can use
> SSE5 instructions or SSE4.1 instructions.  This would simplify things for
> people who need to write high performance libraries that run on different
> architectures, and need to be optimal on each platform.  Ultimately, the goal
> is to allow hotspot functions to be compiled several times with different
> target specific optimizations.  I would welcome any thoughts or suggestions
> about this proposal.

I noticed this from your proposal.

Stage1: Teach the inliner about target specific functions

We will teach the inliner not to inline functions compiled with target
specific optimizations inside of a general function. However, if a
function that has target specific optimizations it should be able to
inline normal functions, or functions compiled with the same set of
target specific optimizations. I estimate that this should take 2
weeks of time.



This is already supported by the inliner and could be handled by
defining the target hook TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P. We use
it in the private port that we maintain to disable inlining of certain
attributed functions like interrupt handlers. The way we already do it
is to look at DECL_ATTRIBUTES of the tree to figure this out. You
would have to munge the attributes into DECL_ATTRIBUTES and then
check them later when you do the same, but I guess you know that already.


cheers
Ramana





-- 
Ramana Radhakrishnan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Fwd: Function specific optimizations call for discussion
       [not found]   ` <d841b3f30711290259m461d7378mc4aab0812c57df38@mail.gmail.com>
@ 2007-11-29 15:19     ` Karthik Kumar
  2007-11-29 16:39     ` Ramana Radhakrishnan
  1 sibling, 0 replies; 18+ messages in thread
From: Karthik Kumar @ 2007-11-29 15:19 UTC (permalink / raw)
  To: GCC@GCC

On Nov 29, 2007 2:09 PM, Ramana Radhakrishnan <ramana.r@gmail.com> wrote:
> Hi Michael,
>
> I had a comment / query regarding Stage 2 where you talk about
> Function cloning for different targets.
>
> I understand that the mechanism is to have a hidden function pointer
> that actually gets initialized based on the cpuid.
>
> I don't know if it is worth the effort to have debug info also
> enhanced to be such that a breakpoint put on the function my_min
> actually sets the breakpoint on any of the clones since the logic
> would be similar to the same.  This sounds logically similar to the
> way that gdb currently handles breakpoints to multiple constructors.
>

If the user has written the clone (for manual dispatch), then the
debug info would point to his code version. If it were generated
automatically (stage 2), the information would pertain to the function
cloned multiple times. The disassembly on those breakpoints might be
different, of course.

> The other option ofcourse is to fake debug information for such to
> actually set a breakpoint on the value of the function pointer that
> you so set up. I am no DWARF expert but there might be other folks on
> the list who might have better ideas about how to implement this.
>

There is an idea to modify the dynamic linker to take advantage of the
detection; in that case, setting a breakpoint would be easier. Then
we wouldn't require indirect calls either. The breakpoints can be set
in each of the clones, and they will be processed only after setting
up the pointer and their subsequent execution.

The idea is to make use of the debugging information as provided by
the inline-cloner.

> cheers
> Ramana

Karthik

-- 
Karthik
http://guilt.bafsoft.net

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
       [not found]   ` <d841b3f30711290259m461d7378mc4aab0812c57df38@mail.gmail.com>
  2007-11-29 15:19     ` Fwd: " Karthik Kumar
@ 2007-11-29 16:39     ` Ramana Radhakrishnan
  2007-11-29 23:28       ` Michael Meissner
  1 sibling, 1 reply; 18+ messages in thread
From: Ramana Radhakrishnan @ 2007-11-29 16:39 UTC (permalink / raw)
  To: Karthik Kumar; +Cc: GCC

Hi Karthik,

Thanks for your email.

> > Hi Michael,
> >
> > I had a comment / query regarding Stage 2 where you talk about
> > Function cloning for different targets.
> >
> > I understand that the mechanism is to have a hidden function pointer
> > that actually gets initialized based on the cpuid.
> >
> > I don't know if it is worth the effort to have debug info also
> > enhanced to be such that a breakpoint put on the function my_min
> > actually sets the breakpoint on any of the clones since the logic
> > would be similar to the same.  This sounds logically similar to the
> > way that gdb currently handles breakpoints to multiple constructors.
> >
>
> If the user has written the clone (for manual dispatch), then the
> debug info would point to his code version. If it were generated
> automatically (stage 2), the information would pertain to the function
> cloned multiple times.The disassembly on those breakpoints might be
> different, of course.

All I am saying is that while doing the automatic dispatch, try to
keep the debug info in sync with the source written by the user. The
disassembly is not what I am worried about. It's only the ability to
place breakpoints on all the clones.

b do_min, and voila, gdb will automagically put breakpoints on all the clones.


>
> > The other option ofcourse is to fake debug information for such to
> > actually set a breakpoint on the value of the function pointer that
> > you so set up. I am no DWARF expert but there might be other folks on
> > the list who might have better ideas about how to implement this.
> >
>
> There is an idea to modify the dynamic linker to take advantage of the
> detection; If in such case, setting a breakpoint would be easier. Then
> we wouldn't require indirect calls either. The breakpoints can be set
> in each of the clones, and they will be processed only after setting
> up the pointer and their subsequent execution.

Hmm, some kind of dl symbol resolver work, where you have a cloned
attribute for the symbol in ELF and figure out the best clone with
a runnable hunk that detects the best function. It might add a bit
of overhead at run time, but it's a one-time expense. If we did
prelinking that could be removed too.

This might preclude my earlier statement.

>
> The idea is to make use of the debugging information as provided by
> the inline-cloner.

All I wanted was the requirement of debug information consistency to
be a part of the proposal for the inline cloner.

cheers
Ramana
>
> > cheers
> > Ramana
>
> Karthik
>



-- 
Ramana Radhakrishnan
GNU Tools
Celunite Inc.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-28 22:25 Function specific optimizations call for discussion Michael Meissner
                   ` (2 preceding siblings ...)
  2007-11-29 14:53 ` Ramana Radhakrishnan
@ 2007-11-29 18:26 ` Sylvain Pion
  2007-11-29 22:33   ` Michael Meissner
  2007-11-29 22:03 ` Weddington, Eric
  2007-12-06  4:04 ` Jonathan Adamczewski
  5 siblings, 1 reply; 18+ messages in thread
From: Sylvain Pion @ 2007-11-29 18:26 UTC (permalink / raw)
  To: gcc; +Cc: Michael Meissner, christophe.harle, Sylvain Pion


Michael Meissner wrote:
> One of the things that I've been interested in is adding support to GCC to
> compile individual functions with specific target options.  I first presented a
> draft at the Google mini-summit, and then another draft at the GCC developer
> summit last July.
> 
> In the x86 world this would mean saying that an individual function can use
> SSE5 instructions or SSE4.1 instructions.  This would simplify things for
> people who need to write high performance libraries that run on different
> architectures, and need to be optimal on each platform.  Ultimately, the goal
> is to allow hotspot functions to be compiled several times with different
> target specific optimizations.  I would welcome any thoughts or suggestions
> about this proposal.

I'm wondering if this proposal would support specifying things
like adding -frounding-math when compiling specific functions.
( This particular case is connected to pragma FENV_ACCESS though. )

Also, would this work when the function is inlined?
I mean the case when the caller does not have the same attribute,
but the inlined code of the callee still respects the attribute
set for the inlined callee.

-- 
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/



^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Function specific optimizations call for discussion
  2007-11-28 22:25 Function specific optimizations call for discussion Michael Meissner
                   ` (3 preceding siblings ...)
  2007-11-29 18:26 ` Sylvain Pion
@ 2007-11-29 22:03 ` Weddington, Eric
  2007-11-29 22:37   ` Michael Meissner
  2007-11-29 22:44   ` tbp
  2007-12-06  4:04 ` Jonathan Adamczewski
  5 siblings, 2 replies; 18+ messages in thread
From: Weddington, Eric @ 2007-11-29 22:03 UTC (permalink / raw)
  To: Michael Meissner, gcc, christophe.harle

 

> -----Original Message-----
> From: Michael Meissner [mailto:michael.meissner@amd.com] 
> Sent: Wednesday, November 28, 2007 1:58 PM
> To: gcc@gcc.gnu.org; christophe.harle@amd.com
> Cc: michael.meissner@amd.com
> Subject: Function specific optimizations call for discussion
> 
> One of the things that I've been interested in is adding 
> support to GCC to
> compile individual functions with specific target options.  I 
> first presented a
> draft at the Google mini-summit, and then another draft at 
> the GCC developer
> summit last July.
<snip> 
> I would welcome any thoughts 
> or suggestions
> about this proposal.

As I spoke to you about this at the Summit, the users of the AVR port,
and I would also postulate the general embedded community, would
*really* like to have this functionality, especially your Stage 1. There
are many AVR, or embedded, applications where they are generally
optimized for size, but have a time-critical function that needs to be
optimized for speed. I would vote for including both the attribute
syntax and the pragma syntax. I have many users who would be more
comfortable with the pragma syntax, despite any shortcomings.
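
For comparison, here is a short C sketch of what the two spellings could look
like, using the `optimize` attribute and the `#pragma GCC optimize` /
`push_options` / `pop_options` pragmas from later GCC releases (the syntax
eventually chosen may differ from the draft under discussion). The function
names are invented, and the file is assumed to be built with -Os.

```c
#if defined(__GNUC__) && !defined(__clang__)
#define HAVE_GCC_OPTIMIZE 1
#define OPTIMIZE_FOR_SPEED __attribute__((optimize("O3")))
#else
#define HAVE_GCC_OPTIMIZE 0
#define OPTIMIZE_FOR_SPEED
#endif

/* Attribute syntax: only this function is optimized for speed. */
OPTIMIZE_FOR_SPEED
unsigned checksum_fast(const unsigned char *buf, unsigned len)
{
    unsigned sum = 0;
    for (unsigned i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Pragma syntax: everything between push and pop is optimized for speed. */
#if HAVE_GCC_OPTIMIZE
#pragma GCC push_options
#pragma GCC optimize ("O3")
#endif
void scale_samples(int *samples, unsigned count, int factor)
{
    for (unsigned i = 0; i < count; i++)
        samples[i] *= factor;
}
#if HAVE_GCC_OPTIMIZE
#pragma GCC pop_options
#endif
```

The pragma form covers a whole region, which may suit users who prefer not
to annotate each function individually.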

Eric Weddington

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-29 14:53 ` Ramana Radhakrishnan
@ 2007-11-29 22:06   ` Michael Meissner
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2007-11-29 22:06 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Michael Meissner, gcc, christophe.harle

On Thu, Nov 29, 2007 at 02:25:46PM +0530, Ramana Radhakrishnan wrote:
> Hi,
> 
> Hit the send button a bit too soon on my earlier mail .
> 
> 
> 
> > In the x86 world this would mean saying that an individual function can use
> > SSE5 instructions or SSE4.1 instructions.  This would simplify things for
> > people who need to write high performance libraries that run on different
> > architectures, and need to be optimal on each platform.  Ultimately, the goal
> > is to allow hotspot functions to be compiled several times with different
> > target specific optimizations.  I would welcome any thoughts or suggestions
> > about this proposal.
> 
> I noticed this from your proposal.
> 
> Stage1: Teach the inliner about target specific functions
> 
> We will teach the inliner not to inline functions compiled with target
> specific optimizations inside of a general function. However, if a
> function that has target specific optimizations it should be able to
> inline normal functions, or functions compiled with the same set of
> target specific optimizations. I estimate that this should take 2
> weeks of time.
> 
> 
> 
> This is already handled in the inliner and could be handled by
> defining the target hook TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P .We use
> it in the private port that we maintain to disable inlining of certain
> attributed functions like interrupt handlers. The way we do it already
> is to look at DECL_ATTRIBUTES of the tree to figure this out. You
> would have to munge in the attributes into the DECL_ATTRIBUTES and the
> check later when you do the same but I guess you know that already.

Yes, though some of the work will be gluing the pieces together.  I haven't
looked at the inliner in detail right now.

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
michael.meissner@amd.com


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-29 13:55 ` Ramana Radhakrishnan
       [not found]   ` <d841b3f30711290259m461d7378mc4aab0812c57df38@mail.gmail.com>
@ 2007-11-29 22:13   ` Michael Meissner
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2007-11-29 22:13 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Michael Meissner, gcc, christophe.harle

On Thu, Nov 29, 2007 at 02:09:27PM +0530, Ramana Radhakrishnan wrote:
> Hi Michael,
> 
> I had a comment / query regarding Stage 2 where you talk about
> Function cloning for different targets.
> 
> I understand that the mechanism is to have a hidden function pointer
> that actually gets initialized based on the cpuid.
> 
> I don't know if it is worth the effort to have debug info also
> enhanced to be such that a breakpoint put on the function my_min
> actually sets the breakpoint on any of the clones since the logic
> would be similar to the same.  This sounds logically similar to the
> way that gdb currently handles breakpoints to multiple constructors.
> 
> The other option ofcourse is to fake debug information for such to
> actually set a breakpoint on the value of the function pointer that
> you so set up. I am no DWARF expert but there might be other folks on
> the list who might have better ideas about how to implement this.

I must admit I hadn't thought much about debugging.  I guess I was assuming the
cloning that we already support in the compiler had solved the debug problem,
but that is a whole other discussion that is going on right now.

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
michael.meissner@amd.com


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-29 18:26 ` Sylvain Pion
@ 2007-11-29 22:33   ` Michael Meissner
  2007-11-30 14:53     ` Joseph S. Myers
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Meissner @ 2007-11-29 22:33 UTC (permalink / raw)
  To: Sylvain Pion; +Cc: gcc, Michael Meissner, christophe.harle

On Thu, Nov 29, 2007 at 12:58:55PM +0100, Sylvain Pion wrote:
> Michael Meissner wrote:
> >One of the things that I've been interested in is adding support to GCC to
> >compile individual functions with specific target options.  I first 
> >presented a
> >draft at the Google mini-summit, and then another draft at the GCC 
> >developer
> >summit last July.
> >
> >In the x86 world this would mean saying that an individual function can use
> >SSE5 instructions or SSE4.1 instructions.  This would simplify things for
> >people who need to write high performance libraries that run on different
> >architectures, and need to be optimal on each platform.  Ultimately, the 
> >goal
> >is to allow hotspot functions to be compiled several times with different
> >target specific optimizations.  I would welcome any thoughts or suggestions
> >about this proposal.
> 
> I'm wondering if this proposal would support specifying things
> like adding -frounding-math when compiling specific functions.
> ( This particular case is connected to pragma FENV_ACCESS though. )

I imagine it could be made to work once the infrastructure is in place.  I had
forgotten about the C99 math pragmas.

> Also, would this work when the functions is inline?
> I mean the case when the caller does not have the same attribute,
> but the inlined code of the callee still respects the attribute
> set for the inlined callee.

Right now I specify that for stage 1, if a generic function calls a function
with target specific support, the callee is not inlined.  However, a function with
target specific support can call and inline either generic functions or other
functions with the same target specific options.
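
The stage 1 rule can be written down as a small predicate. The C sketch
below is only a model of the rule as stated, not GCC's inliner code: the
bitmask representation and the names `target_flags`, `TARGET_*` and
`can_inline` are hypothetical, and treating two different non-generic option
sets as not inlinable is an assumption, since the rule only mentions
identical sets.

```c
#include <stdbool.h>

/* Hypothetical encoding of a function's target specific options. */
typedef unsigned target_flags;
enum {
    TARGET_GENERIC = 0,        /* no function specific target options */
    TARGET_SSE41   = 1u << 0,
    TARGET_SSE5    = 1u << 1
};

/* May `callee` be inlined into `caller` under the stage 1 rule? */
bool can_inline(target_flags caller, target_flags callee)
{
    /* A generic callee can be inlined into any caller. */
    if (callee == TARGET_GENERIC)
        return true;
    /* A target specific callee needs a caller with the same options. */
    return caller == callee;
}
```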

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
michael.meissner@amd.com


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-29 22:03 ` Weddington, Eric
@ 2007-11-29 22:37   ` Michael Meissner
  2007-11-29 22:44   ` tbp
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2007-11-29 22:37 UTC (permalink / raw)
  To: Weddington, Eric; +Cc: Michael Meissner, gcc, christophe.harle

On Thu, Nov 29, 2007 at 01:29:51PM -0700, Weddington, Eric wrote:
>  
> 
> > -----Original Message-----
> > From: Michael Meissner [mailto:michael.meissner@amd.com] 
> > Sent: Wednesday, November 28, 2007 1:58 PM
> > To: gcc@gcc.gnu.org; christophe.harle@amd.com
> > Cc: michael.meissner@amd.com
> > Subject: Function specific optimizations call for discussion
> > 
> > One of the things that I've been interested in is adding 
> > support to GCC to
> > compile individual functions with specific target options.  I 
> > first presented a
> > draft at the Google mini-summit, and then another draft at 
> > the GCC developer
> > summit last July.
> <snip> 
> > I would welcome any thoughts 
> > or suggestions
> > about this proposal.
> 
> As I spoke to you about this at the Summit, the users of the AVR port,
> and I would also postulate the general embedded community, would
> *really* like to have this functionality, especially your Stage 1. There
> are many AVR, or embedded, applications where they are generally
> optimized for size, but have a time-critical function that needs to be
> optimized for speed. I would vote for including both the attribute
> syntax and the pragma syntax. I have many users who would be more
> comfortable with the pragma syntax, despite any shortcomings.

Yes, I remember that discussion.  I think a lot of people in the embedded
community (and also things like One Laptop per Child, which tend to have small
memory systems) could use the ability to mark cold functions to save space and
hot functions to get as much optimization as possible.  For example, ARM users
might want to switch to Thumb code generation instead of ARM for cold functions.

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
michael.meissner@amd.com


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-29 22:03 ` Weddington, Eric
  2007-11-29 22:37   ` Michael Meissner
@ 2007-11-29 22:44   ` tbp
  1 sibling, 0 replies; 18+ messages in thread
From: tbp @ 2007-11-29 22:44 UTC (permalink / raw)
  To: Weddington, Eric; +Cc: Michael Meissner, gcc, christophe.harle

On Nov 29, 2007 9:29 PM, Weddington, Eric <eweddington@cso.atmel.com> wrote:
> and I would also postulate the general embedded community, would
> *really* like to have this functionality, especially your Stage 1. There
> are many AVR, or embedded, applications where they are generally
> optimized for size, but have a time-critical function that needs to be
> optimized for speed.
I would personally (and I think it hasn't been raised yet) *really*
like to be able to toggle fast-math (or related flags) per function,
basically for the same reason.
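
As a sketch of what such a per-function toggle might look like in C, the
snippet below uses the `optimize` attribute from later GCC releases with the
"fast-math" option string. Whether the proposal would expose it this way is
an assumption, and the function names are invented.

```c
#if defined(__GNUC__) && !defined(__clang__)
#define FAST_MATH __attribute__((optimize("fast-math")))
#else
#define FAST_MATH
#endif

/* The compiler may reassociate and vectorize this reduction freely. */
FAST_MATH
float sum_fast(const float *v, int n)
{
    float acc = 0.0f;
    for (int i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Same loop under the default (strict) floating point rules. */
float sum_strict(const float *v, int n)
{
    float acc = 0.0f;
    for (int i = 0; i < n; i++)
        acc += v[i];
    return acc;
}
```

The two functions can give different results on inputs where the order of
additions matters; on small exactly representable values they agree.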

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Function specific optimizations call for discussion
  2007-11-29 16:39     ` Ramana Radhakrishnan
@ 2007-11-29 23:28       ` Michael Meissner
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2007-11-29 23:28 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Karthik Kumar, GCC

On Thu, Nov 29, 2007 at 05:08:11PM +0530, Ramana Radhakrishnan wrote:
> Hi Karthik,
> 
> Thanks for your email .
> 
> > > Hi Michael,
> > >
> > > I had a comment / query regarding Stage 2 where you talk about
> > > Function cloning for different targets.
> > >
> > > I understand that the mechanism is to have a hidden function pointer
> > > that actually gets initialized based on the cpuid.
> > >
> > > I don't know if it is worth the effort to have debug info also
> > > enhanced to be such that a breakpoint put on the function my_min
> > > actually sets the breakpoint on any of the clones since the logic
> > > would be similar to the same.  This sounds logically similar to the
> > > way that gdb currently handles breakpoints to multiple constructors.
> > >
> >
> > If the user has written the clone (for manual dispatch), then the
> > debug info would point to his code version. If it were generated
> > automatically (stage 2), the information would pertain to the function
> > cloned multiple times.The disassembly on those breakpoints might be
> > different, of course.
> 
> All I am saying is that while doing the automatic dispatch try and
> have debug info in sync with the source written by the user. The
> disassembly is not what I am worrying about. Its only the ability to
> place breakpoints on all the clones.
> 
> b do_min  and voila gdb will automagically put breakpoints on all the clones.

Yes, that is the desired goal.
 
> 
> >
> > > The other option ofcourse is to fake debug information for such to
> > > actually set a breakpoint on the value of the function pointer that
> > > you so set up. I am no DWARF expert but there might be other folks on
> > > the list who might have better ideas about how to implement this.
> > >
> >
> > There is an idea to modify the dynamic linker to take advantage of the
> > detection; If in such case, setting a breakpoint would be easier. Then
> > we wouldn't require indirect calls either. The breakpoints can be set
> > in each of the clones, and they will be processed only after setting
> > up the pointer and their subsequent execution.
> 
> Hmmm some kind of dl symbol resolver work where  you have a cloned
> attribute for the symbol in ELF and figure out the best clone based on
> a runnable hunk which detects the best function . Might increase a bit
> of overhead at run time but its a one time expense. If we did
> prelinking that could be removed too.

It would be nice if we could get something like this done.  However, given that
it spans several different groups that don't always talk together (compiler,
glibc, binary utilities, dynamic linker, Linux system vendors), it is a much
harder problem to solve.  Also, doing it only in the Linux space means it
isn't available to non-Linux GCC users.

One approach is the fat binary approach, where you compile the program N times,
and at runtime the system maps in the correct code image depending on the
target bits.  However, this is very space inefficient.
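
For comparison, the hidden-function-pointer dispatch discussed earlier in this thread can be sketched by hand without any linker support. This assumes the `target` attribute and the `__builtin_cpu_init`/`__builtin_cpu_supports` built-ins from later GCC releases on x86; the clone names and the `my_min_ptr` pointer are hypothetical.

```c
/* Assumption: x86 GCC-compatible compilers provide the target attribute
   and the CPU-detection built-ins.  Elsewhere only the generic path is used. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define X86_DISPATCH 1
#define TARGET_SSE41 __attribute__((target("sse4.1")))
#else
#define X86_DISPATCH 0
#define TARGET_SSE41
#endif

/* Baseline clone, valid on every machine. */
static int my_min_generic(int a, int b) { return a < b ? a : b; }

/* Clone the compiler may build with SSE4.1 enabled.  The body is the same;
   only the code generation options differ. */
TARGET_SSE41 static int my_min_sse41(int a, int b) { return a < b ? a : b; }

static int my_min_resolve(int a, int b);

/* The "hidden" pointer: it starts at the resolver and is patched on first call. */
static int (*my_min_ptr)(int, int) = my_min_resolve;

static int my_min_resolve(int a, int b)
{
#if X86_DISPATCH
    __builtin_cpu_init();
    my_min_ptr = __builtin_cpu_supports("sse4.1") ? my_min_sse41
                                                  : my_min_generic;
#else
    (void)my_min_sse41;   /* keep the clone referenced on other targets */
    my_min_ptr = my_min_generic;
#endif
    return my_min_ptr(a, b);
}

/* Public entry point: one indirect call per invocation. */
int my_min(int a, int b) { return my_min_ptr(a, b); }
```

Only the selected functions are duplicated, so this costs far less space than a fat binary, at the price of an indirect call on every use.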
 
> This might preclude my earlier statement.
> 
> >
> > The idea is to make use of the debugging information as provided by
> > the inline-cloner.
> 
> All I wanted was the requirement of debug information consistency to
> be a part of the proposal for the inline cloner.
> 
> cheers
> Ramana
> >
> > > cheers
> > > Ramana
> >
> > Karthik
> >
> 
> 
> 
> -- 
> Ramana Radhakrishnan
> GNU Tools
> Celunite Inc.
> 
> 

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
michael.meissner@amd.com


* Re: Function specific optimizations call for discussion
  2007-11-29 22:33   ` Michael Meissner
@ 2007-11-30 14:53     ` Joseph S. Myers
  0 siblings, 0 replies; 18+ messages in thread
From: Joseph S. Myers @ 2007-11-30 14:53 UTC (permalink / raw)
  To: Michael Meissner; +Cc: Sylvain Pion, gcc, christophe.harle

On Thu, 29 Nov 2007, Michael Meissner wrote:

> > I'm wondering if this proposal would support specifying things
> > like adding -frounding-math when compiling specific functions.
> > ( This particular case is connected to pragma FENV_ACCESS though. )
> 
> I imagine it could be made to work once the infrastructure is in place.  I had
> forgotten about the C99 math pragmas.

Stephen Moshier's testcases for FENV_ACCESS, attached to PR 20785, may be 
of use (though more tests may also be needed, certainly for the other 
pragmas).

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: Function specific optimizations call for discussion
  2007-11-28 22:25 Function specific optimizations call for discussion Michael Meissner
                   ` (4 preceding siblings ...)
  2007-11-29 22:03 ` Weddington, Eric
@ 2007-12-06  4:04 ` Jonathan Adamczewski
  5 siblings, 0 replies; 18+ messages in thread
From: Jonathan Adamczewski @ 2007-12-06  4:04 UTC (permalink / raw)
  To: Michael Meissner, gcc, christophe.harle

Michael Meissner wrote:
> One of the things that I've been interested in is adding support to GCC to
> compile individual functions with specific target options.  I first presented a
> draft at the Google mini-summit, and then another draft at the GCC developer
> summit last July.
>
> ...
>
> The proposal is at:
> http://gcc.gnu.org/wiki/FunctionSpecificOpt
>   


Have you given any thought to specifying --param values?

jonathan.

* Re: Function specific optimizations call for discussion
  2007-11-29 18:40 J.C. Pizarro
@ 2007-11-29 20:30 ` J.C. Pizarro
  0 siblings, 0 replies; 18+ messages in thread
From: J.C. Pizarro @ 2007-11-29 20:30 UTC (permalink / raw)
  To: Sylvain Pion, gcc

On 2007/11/29, J.C. Pizarro <jcpiza@gmail.com>, i wrote:
> Autovectorization is still a researching issue.

+--------------+    +------------------+     /-------\     +----------------+
| unroll-loops | -> | inline-functions | -> < big BBs > -> | autovectorize! |
+--------------+    +------------------+     \-------/     +----------------+

Besides inlining functions, inline non-virtual methods (from C++) too!

   J.C.Pizarro

* Re: Function specific optimizations call for discussion
@ 2007-11-29 18:40 J.C. Pizarro
  2007-11-29 20:30 ` J.C. Pizarro
  0 siblings, 1 reply; 18+ messages in thread
From: J.C. Pizarro @ 2007-11-29 18:40 UTC (permalink / raw)
  To: Sylvain Pion, gcc

On 2007/11/29, Sylvain Pion <Sylvain.Pion@sophia.inria.fr>
> Michael Meissner a écrit :
> > One of the things that I've been interested in is adding support to GCC to
> > compile individual functions with specific target options.  I first presented a
> > draft at the Google mini-summit, and then another draft at the GCC developer
> > summit last July
> >
> > In the x86 world this would mean saying that an individual function can use
> > SSE5 instructions or SSE4.1 instructions.  This would simplify things for
> > people who need to write high performance libraries that run on different
> > architectures, and need to be optimal on each platform.  Ultimately, the goal
> > is to allow hotspot functions to be compiled several times with different
> > target specific optimizations.  I would welcome any thoughts or suggestions
> > about this proposal.
>
>
> I'm wondering if this proposal would support specifying things
> like adding -frounding-math when compiling specific functions.
> ( This particular case is connected to pragma FENV_ACCESS though. )
>
>
> Also, would this work when the functions is inline?
> I mean the case when the caller does not have the same attribute,
> but the inlined code of the callee still respects the attribute
> set for the inlined callee.

Autovectorization is still a research issue.

The generated program/library should depend on the capabilities of the
machines in use, from those with no SIMD instructions up to the latest SSE9
instruction set (SSE, SSE2, SSE3, SSE4, SSE4.1, SSE5, 3DNow!, 3DNow+!...).

But many distros ship LiveCDs/LiveDVDs and don't want their sizes to grow
drastically
(e.g. one specific LiveCD for each i-arch that has SSEi, i=1..9).

A good solution is to extend the ELF format so that SIMD stubs can be inserted
into a single portable program/library covering all of these machines; the
loader then does the work of choosing the appropriate stub for the old or new
machine it runs on.

But there is another problem: this strategy doesn't fully work in clusters
with process migration (à la the old OpenMosix), because OpenMosix only
migrates processes between machines that use identical instruction sets.

   J.C.Pizarro


Thread overview: 18+ messages
2007-11-28 22:25 Function specific optimizations call for discussion Michael Meissner
2007-11-29 13:43 ` Karthik Kumar
2007-11-29 13:55 ` Ramana Radhakrishnan
     [not found]   ` <d841b3f30711290259m461d7378mc4aab0812c57df38@mail.gmail.com>
2007-11-29 15:19     ` Fwd: " Karthik Kumar
2007-11-29 16:39     ` Ramana Radhakrishnan
2007-11-29 23:28       ` Michael Meissner
2007-11-29 22:13   ` Michael Meissner
2007-11-29 14:53 ` Ramana Radhakrishnan
2007-11-29 22:06   ` Michael Meissner
2007-11-29 18:26 ` Sylvain Pion
2007-11-29 22:33   ` Michael Meissner
2007-11-30 14:53     ` Joseph S. Myers
2007-11-29 22:03 ` Weddington, Eric
2007-11-29 22:37   ` Michael Meissner
2007-11-29 22:44   ` tbp
2007-12-06  4:04 ` Jonathan Adamczewski
2007-11-29 18:40 J.C. Pizarro
2007-11-29 20:30 ` J.C. Pizarro
