RFC: Patch for gcc-5/changes.html

public inbox for gcc-patches@gcc.gnu.org

* RFC: Patch for gcc-5/changes.html
@ 2015-03-26 19:25 Vladimir Makarov
  2015-03-26 19:54 ` Jan Hubicka
  2015-03-26 20:39 ` Sandra Loosemore
  0 siblings, 2 replies; 4+ messages in thread
From: Vladimir Makarov @ 2015-03-26 19:25 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 258 bytes --]

   Hi, I neglected to write about RA changes for the previous releases,
and people have asked me to write about the RA changes for GCC-5.  Here
is what I'd like to add to gcc-5/changes.html.  I'll add it tomorrow, so
any comments will be appreciated.

   Thanks.

[-- Attachment #2: changes.patch --]
[-- Type: text/x-patch, Size: 2678 bytes --]

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.91
diff -U 5 -r1.91 changes.html
--- changes.html	23 Mar 2015 10:12:23 -0000	1.91
+++ changes.html	26 Mar 2015 19:24:32 -0000
@@ -95,10 +95,40 @@
       <li>The new <code>gcov-tool</code> utility allows manipulating
           profiles.</li>
       <li>Profiles are now more tolerant to source file changes (this can be
 	  controlled by <code>--param profile-func-internal-id</code>).</li>
     </ul></li>
+    <li>Register allocation improvements:
+    <ul>
+      <li>A new local register allocator (LRA) sub-pass was added.
+          The sub-pass implements control-flow sensitive global
+          register rematerialization (controlled via
+	  <code>-flra-remat</code>).  Instead of spilling and
+          restoring register value, it is recalculated if it is
+          profitable.  The sub-pass improved SPEC2000 generated code
+          by 1% and 0.5% correspondingly on ARM and x86-64.</li>
+      <li>In GCC-4.9 and earlier releases PIC hard register was fixed
+          and was not used for other purposes when PIC code was
+          generated.  Reuse of PIC hard register was implemented in RA
+          for GCC-5.0.  It improves generated PIC code performance as
+          more hard registers can be used.  As an example, shared
+          libraries and OS Android would significantly benefit from
+          such optimization.  Currently it is switched on only for
+          x86/x86-64 targets.  As RA infrastructure is already
+          implemented for PIC register reuse, other targets might
+          follow this in the future.</li>
+      <li>A simple form of inter-procedural RA was implemented.  When
+          it is known that a called function does not use caller saved
+          registers, save/restore code is not generated around the
+          call for such registers. This optimization can be controlled
+          by <code>-fipa-ra</code></li>
+      <li>On some architectures (e.g. modern Intel processors),
+          spilling general registers into vector registers can be more
+          profitable than spilling into memory.  LRA had already such
+          optimization.  It was significantly improved for GCC-5.0,
+          permitting more 85% such spills than in GCC-4.9.</li>
+    </ul></li>
     <li>UndefinedBehaviorSanitizer gained a few new sanitization options:
     <ul>
       <li><code>-fsanitize=float-divide-by-zero</code>: detect floating-point
 	   division by zero;</li>
       <li><code>-fsanitize=float-cast-overflow</code>: check that the result
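
To illustrate the inter-procedural RA item in the patch above, here is a
minimal sketch in C.  The function names scale() and sum_scaled() are
hypothetical and not taken from GCC or from the patch.  The idea is that
the values kept live across the call to scale() would normally need to
avoid caller-saved registers or be saved around the call; if the compiler
can see which registers scale() actually uses, it may be able to skip that
save/restore code.  Whether it does so for this exact code depends on the
target and optimization level.

```c
#include <assert.h>

/* Hypothetical leaf helper.  It is small, so it is likely to need only a
   few registers.  noinline (a GCC attribute) keeps a real call in place
   so that there is a call for the register allocator to reason about. */
__attribute__((noinline)) static int scale(int x)
{
    return 3 * x + 1;
}

/* acc, i, v and n are all live across each call to scale().  With
   inter-procedural RA, values held in caller-saved registers that
   scale() is known not to use need no save/restore around the call. */
int sum_scaled(const int *v, int n)
{
    assert(n >= 0);
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc += scale(v[i]);
    return acc;
}
```

One way to look for a difference is to compare the assembly produced with
and without the option, for example "gcc -O2 -S -fipa-ra" against
"gcc -O2 -S -fno-ipa-ra".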


* Re: RFC: Patch for gcc-5/changes.html
  2015-03-26 19:25 RFC: Patch for gcc-5/changes.html Vladimir Makarov
@ 2015-03-26 19:54 ` Jan Hubicka
  2015-03-26 20:39 ` Sandra Loosemore
  1 sibling, 0 replies; 4+ messages in thread
From: Jan Hubicka @ 2015-03-26 19:54 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

> +      <li>In GCC-4.9 and earlier releases PIC hard register was fixed
> +          and was not used for other purposes when PIC code was
> +          generated.  Reuse of PIC hard register was implemented in RA
> +          for GCC-5.0.  It improves generated PIC code performance as
> +          more hard registers can be used.  As an example, shared
> +          libraries and OS Android would significantly benefit from
> +          such optimization.  Currently it is switched on only for
> +          x86/x86-64 targets.  As RA infrastructure is already
> +          implemented for PIC register reuse, other targets might
> +          follow this in the future.</li>

PIC was always a huge performance disaster on x86; I suppose that with this
patch in, -fPIC benchmarks should improve quite noticeably ;)

Thanks!
Honza
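
As a rough illustration of the kind of code affected, here is a small
sketch in C.  The globals and the function bump_all() are hypothetical.
On 32-bit x86, position-independent code reaches such globals through the
GOT, whose address is conventionally held in EBX; when that register is
fixed, the rest of the function has one fewer general register to work
with.  The sketch only shows the shape of the source; the actual register
assignment depends on the compiler version and flags.

```c
#include <assert.h>

/* Hypothetical globals with default visibility.  Under -fPIC on 32-bit
   x86 each access goes through the GOT, so the GOT pointer is needed
   throughout bump_all(). */
int counter_a = 1;
int counter_b = 2;
int counter_c = 3;

/* Updates all three globals and returns their sum.  Several values are
   live at once here, which is where an extra allocatable register can
   help on a register-poor target. */
int bump_all(int step)
{
    assert(step > 0);
    counter_a += step;
    counter_b += 2 * step;
    counter_c += 3 * step;
    return counter_a + counter_b + counter_c;
}
```

Compiling this for a 32-bit x86 target with "-O2 -fPIC -S" (for example
with -m32 where a 32-bit toolchain is installed) and comparing GCC 4.9
with GCC 5 output should show how EBX is treated in each case.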


* Re: RFC: Patch for gcc-5/changes.html
  2015-03-26 19:25 RFC: Patch for gcc-5/changes.html Vladimir Makarov
  2015-03-26 19:54 ` Jan Hubicka
@ 2015-03-26 20:39 ` Sandra Loosemore
  2015-03-26 20:56   ` Vladimir Makarov
  1 sibling, 1 reply; 4+ messages in thread
From: Sandra Loosemore @ 2015-03-26 20:39 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

On 03/26/2015 01:25 PM, Vladimir Makarov wrote:
>    Hi, I neglected to write about RA changes for the previous releases,
> and people have asked me to write about the RA changes for GCC-5.  Here
> is what I'd like to add to gcc-5/changes.html.  I'll add it tomorrow, so
> any comments will be appreciated.
>
> Index: changes.html
> ===================================================================
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
> retrieving revision 1.91
> diff -U 5 -r1.91 changes.html
> --- changes.html	23 Mar 2015 10:12:23 -0000	1.91
> +++ changes.html	26 Mar 2015 19:24:32 -0000
> @@ -95,10 +95,40 @@
>        <li>The new <code>gcov-tool</code> utility allows manipulating
>            profiles.</li>
>        <li>Profiles are now more tolerant to source file changes (this can be
>  	  controlled by <code>--param profile-func-internal-id</code>).</li>
>      </ul></li>
> +    <li>Register allocation improvements:
> +    <ul>
> +      <li>A new local register allocator (LRA) sub-pass was added.
> +          The sub-pass implements control-flow sensitive global
> +          register rematerialization (controlled via
> +	  <code>-flra-remat</code>).  Instead of spilling and

How about rewriting the first two sentences as:

A new local register allocator (LRA) sub-pass, controlled by 
<code>-flra-remat</code>, implements control-flow sensitive global 
register rematerialization.

> +          restoring register value, it is recalculated if it is

s/register value/a register value/

> +          profitable.  The sub-pass improved SPEC2000 generated code
> +          by 1% and 0.5% correspondingly on ARM and x86-64.</li>
> +      <li>In GCC-4.9 and earlier releases PIC hard register was fixed
> +          and was not used for other purposes when PIC code was
> +          generated.  Reuse of PIC hard register was implemented in RA
> +          for GCC-5.0.  It improves generated PIC code performance as
> +          more hard registers can be used.  As an example, shared
> +          libraries and OS Android would significantly benefit from
> +          such optimization.  Currently it is switched on only for
> +          x86/x86-64 targets.  As RA infrastructure is already
> +          implemented for PIC register reuse, other targets might
> +          follow this in the future.</li>

How about making this less verbose and repetitive:

Reuse of the PIC hard register, instead of using a fixed register, was 
implemented on x86/x86-64 targets.  This improves generated PIC code 
performance as more hard registers can be used.  Shared libraries can 
significantly benefit from this optimization.

> +      <li>A simple form of inter-procedural RA was implemented.  When
> +          it is known that a called function does not use caller saved

s/caller saved/caller-saved/

> +          registers, save/restore code is not generated around the
> +          call for such registers. This optimization can be controlled
> +          by <code>-fipa-ra</code></li>
> +      <li>On some architectures (e.g. modern Intel processors),
> +          spilling general registers into vector registers can be more
> +          profitable than spilling into memory.  LRA had already such
> +          optimization.  It was significantly improved for GCC-5.0,
> +          permitting more 85% such spills than in GCC-4.9.</li>

I don't understand the last sentence.  How about just dropping that 85% 
bit and making the whole thing less wordy and more focused on the actual 
improvement:

LRA is now much more effective at generating spills of general registers 
into vector registers instead of memory on architectures (e.g., modern 
Intel processors) where this is profitable.

> +    </ul></li>
>      <li>UndefinedBehaviorSanitizer gained a few new sanitization options:
>      <ul>
>        <li><code>-fsanitize=float-divide-by-zero</code>: detect floating-point
>  	   division by zero;</li>
>        <li><code>-fsanitize=float-cast-overflow</code>: check that the result

-Sandra
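
For readers unfamiliar with the term in the first item, the following is a
source-level analogy of rematerialization, written in C.  It is only a
sketch: the function names are hypothetical, the volatile variable merely
stands in for a stack slot, and the real transformation happens on RTL
inside LRA, not on C source.

```c
#include <assert.h>

/* Spill-and-restore: the computed value is stored to memory and loaded
   back after a region that needs the register for something else. */
long with_spill(long base, long idx)
{
    volatile long slot;            /* stands in for a stack slot */
    long addr = base + 8 * idx;
    slot = addr;                   /* spill */
    /* ... region with high register pressure ... */
    addr = slot;                   /* restore */
    return addr;
}

/* Rematerialization: the inputs are still available, so the value is
   simply recomputed instead of being stored and reloaded. */
long with_remat(long base, long idx)
{
    long addr = base + 8 * idx;
    /* ... region with high register pressure ... */
    addr = base + 8 * idx;         /* recompute instead of reload */
    return addr;
}

/* Both strategies must produce the same value. */
void check_same(long base, long idx)
{
    assert(with_spill(base, idx) == with_remat(base, idx));
}
```

Recomputing is only a win when it is cheaper than the store and load it
replaces, which is why the patch text says "if it is profitable".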


* Re: RFC: Patch for gcc-5/changes.html
  2015-03-26 20:39 ` Sandra Loosemore
@ 2015-03-26 20:56   ` Vladimir Makarov
  0 siblings, 0 replies; 4+ messages in thread
From: Vladimir Makarov @ 2015-03-26 20:56 UTC (permalink / raw)
  To: Sandra Loosemore; +Cc: gcc-patches

On 03/26/2015 04:38 PM, Sandra Loosemore wrote:
> On 03/26/2015 01:25 PM, Vladimir Makarov wrote:
>>    Hi, I neglected to write about RA changes for the previous releases,
>> and people have asked me to write about the RA changes for GCC-5.  Here
>> is what I'd like to add to gcc-5/changes.html.  I'll add it tomorrow, so
>> any comments will be appreciated.
>>
>> Index: changes.html
>> ===================================================================
>> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
>> retrieving revision 1.91
>> diff -U 5 -r1.91 changes.html
>> --- changes.html    23 Mar 2015 10:12:23 -0000    1.91
>> +++ changes.html    26 Mar 2015 19:24:32 -0000
>> @@ -95,10 +95,40 @@
>>        <li>The new <code>gcov-tool</code> utility allows manipulating
>>            profiles.</li>
>>        <li>Profiles are now more tolerant to source file changes 
>> (this can be
>>        controlled by <code>--param 
>> profile-func-internal-id</code>).</li>
>>      </ul></li>
>> +    <li>Register allocation improvements:
>> +    <ul>
>> +      <li>A new local register allocator (LRA) sub-pass was added.
>> +          The sub-pass implements control-flow sensitive global
>> +          register rematerialization (controlled via
>> +      <code>-flra-remat</code>).  Instead of spilling and
>
> How about rewriting the first two sentences as:
>
> A new local register allocator (LRA) sub-pass, controlled by 
> <code>-flra-remat</code>, implements control-flow sensitive global 
> register rematerialization.
>
That is better.  Thanks.
>> +          restoring register value, it is recalculated if it is
>
> s/register value/a register value/
>
Fixed.
>> +          profitable.  The sub-pass improved SPEC2000 generated code
>> +          by 1% and 0.5% correspondingly on ARM and x86-64.</li>
>> +      <li>In GCC-4.9 and earlier releases PIC hard register was fixed
>> +          and was not used for other purposes when PIC code was
>> +          generated.  Reuse of PIC hard register was implemented in RA
>> +          for GCC-5.0.  It improves generated PIC code performance as
>> +          more hard registers can be used.  As an example, shared
>> +          libraries and OS Android would significantly benefit from
>> +          such optimization.  Currently it is switched on only for
>> +          x86/x86-64 targets.  As RA infrastructure is already
>> +          implemented for PIC register reuse, other targets might
>> +          follow this in the future.</li>
>
> How about making this less verbose and repetitive:
>
> Reuse of the PIC hard register, instead of using a fixed register, was 
> implemented on x86/x86-64 targets.  This improves generated PIC code 
> performance as more hard registers can be used.  Shared libraries can 
> significantly benefit from this optimization.
Fixed.  Thanks.
>
>> +      <li>A simple form of inter-procedural RA was implemented.  When
>> +          it is known that a called function does not use caller saved
>
> s/caller saved/caller-saved/
>
Done.
>> +          registers, save/restore code is not generated around the
>> +          call for such registers. This optimization can be controlled
>> +          by <code>-fipa-ra</code></li>
>> +      <li>On some architectures (e.g. modern Intel processors),
>> +          spilling general registers into vector registers can be more
>> +          profitable than spilling into memory.  LRA had already such
>> +          optimization.  It was significantly improved for GCC-5.0,
>> +          permitting more 85% such spills than in GCC-4.9.</li>
>
> I don't understand the last sentence.  How about just dropping that 
> 85% bit and making the whole thing less wordy and more focused on the 
> actual improvement:
>
> LRA is now much more effective at generating spills of general 
> registers into vector registers instead of memory on architectures 
> (e.g., modern Intel processors) where this is profitable.
>
Ok.  Fixed.

Thanks, Sandra.  That was very helpful as English is not my native language.
>> +    </ul></li>
>>      <li>UndefinedBehaviorSanitizer gained a few new sanitization 
>> options:
>>      <ul>
>> <li><code>-fsanitize=float-divide-by-zero</code>: detect floating-point
>>         division by zero;</li>
>> <li><code>-fsanitize=float-cast-overflow</code>: check that the result
>

