public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC] introduce --param max-lto-partition for having an upper bound on partition size
@ 2016-04-01 13:48 Prathamesh Kulkarni
  2016-04-01 17:33 ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-01 13:48 UTC (permalink / raw)
  To: gcc Patches, Richard Biener, Ramana Radhakrishnan, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 1706 bytes --]

Hi,
The attached patch introduces param max-lto-partition which creates an upper
bound for partition size.

My primary motivation for this patch is to fix building chromium for arm
with -flto-partition=one.
Chromium fails to build with -flto-partition={none, one} with assembler error:
"branch out of range error"
because in both these cases LTO creates a single text section of 18 mb
which exceeds thumb's limit of 16 mb and arm backend emits a short
call if caller and callee are in same section.
This is binutils PR18625:
https://sourceware.org/bugzilla/show_bug.cgi?id=18625
With patch, chromium builds for -flto-partition=one (by creating more
than one but minimal number of partitions to honor 16 mb limit).
I haven't tested with -flto-partition=none but I suppose the build
will still fail for none, because it won't involve partitioning?  I am
not sure how to fix for none case.

As suggested by Jim in binutils PR18625, the proper fix would be to
implement branch relaxation in arm's port of gas, however I suppose
only LTO will realistically create such large sections,
and implementing branch relaxation appears to be quite complicated and
probably too much of
an effort for this single use case, so this patch serves as a
work-around to the issue.
I am looking into fine-tuning the param value for ARM backend to
roughly match limit
of 16 mb.

AFAIU, this would change semantics of --param n_lto_partitions (or
-flto-partition=one) from
"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
not desirable maybe we could add
another param/option ?
Cross-tested on arm*-*-*.
Would this patch be OK for stage-1 (after getting param value right
for ARM target) ?

Thanks,
Prathamesh

[-- Attachment #2: max-partition.diff --]
[-- Type: text/plain, Size: 2048 bytes --]

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..f734d56 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3459,6 +3459,11 @@ arm_option_override (void)
 
   /* Init initial mode for testing.  */
   thumb_flipper = TARGET_THUMB;
+
+    maybe_set_param_value (MAX_PARTITION_SIZE,
+			  10000, /* FIXME: fine-tune this value to roughly match 16 mb limit.  */
+                           global_options.x_param_values,
+			   global_options_set.x_param_values);
 }
 
 static void
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..bc0c612 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);
 
   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
+  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
+    {
+      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
+      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
+	n_lto_partitions++;
+      partition_size = total_size / n_lto_partitions;
+    }
+
   npartitions = 1;
   partition = new_partition ("");
   if (symtab->dump_file)
diff --git a/gcc/params.def b/gcc/params.def
index 9362c15..b6055ff 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1029,6 +1029,11 @@ DEFPARAM (MIN_PARTITION_SIZE,
 	  "Minimal size of a partition for LTO (in estimated instructions).",
 	  1000, 0, 0)
 
+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  INT_MAX, 0, INT_MAX)
+
 /* Diagnostic parameters.  */
 
 DEFPARAM (CXX_MAX_NAMESPACES_FOR_DIAGNOSTIC_HELP,

[-- Attachment #3: ChangeLog --]
[-- Type: application/octet-stream, Size: 302 bytes --]

2016-04-01  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	* params.def (MAX_PARTITION_SIZE): New parameter.
	* lto/lto-partition.c (lto_balanced_map): Check if partition size is greater than
	max-lto-partition.
	* config/arm/arm.c (arm_option_override): Set parm max-lto-partition
	to 10000.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-01 13:48 [RFC] introduce --param max-lto-partition for having an upper bound on partition size Prathamesh Kulkarni
@ 2016-04-01 17:33 ` Richard Biener
  2016-04-04  7:47   ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-01 17:33 UTC (permalink / raw)
  To: Prathamesh Kulkarni, gcc Patches, Ramana Radhakrishnan, Jan Hubicka

On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
>Hi,
>The attached patch introduces param max-lto-partition which creates an
>upper
>bound for partition size.
>
>My primary motivation for this patch is to fix building chromium for
>arm
>with -flto-partition=one.
>Chromium fails to build with -flto-partition={none, one} with assembler
>error:
>"branch out of range error"
>because in both these cases LTO creates a single text section of 18 mb
>which exceeds thumb's limit of 16 mb and arm backend emits a short
>call if caller and callee are in same section.
>This is binutils PR18625:
>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
>With patch, chromium builds for -flto-partition=one (by creating more
>than one but minimal number of partitions to honor 16 mb limit).
>I haven't tested with -flto-partition=none but I suppose the build
>will still fail for none, because it won't involve partitioning?  I am
>not sure how to fix for none case.
>
>As suggested by Jim in binutils PR18625, the proper fix would be to
>implement branch relaxation in arm's port of gas, however I suppose
>only LTO will realistically create such large sections,
>and implementing branch relaxation appears to be quite complicated and
>probably too much of
>an effort for this single use case, so this patch serves as a
>work-around to the issue.
>I am looking into fine-tuning the param value for ARM backend to
>roughly match limit
>of 16 mb.
>
>AFAIU, this would change semantics of --param n_lto_partitions (or
>-flto-partition=one) from
>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
>not desirable maybe we could add
>another param/option ?
>Cross-tested on arm*-*-*.
>Would this patch be OK for stage-1 (after getting param value right
>for ARM target) ?

What do you want to achieve?  Changing =one semantics doesn't look right to me.
Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).

Richard.

>
>Thanks,
>Prathamesh


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-01 17:33 ` Richard Biener
@ 2016-04-04  7:47   ` Prathamesh Kulkarni
  2016-04-04  8:26     ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-04  7:47 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc Patches, Ramana Radhakrishnan, Jan Hubicka

On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
> On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
>>Hi,
>>The attached patch introduces param max-lto-partition which creates an
>>upper
>>bound for partition size.
>>
>>My primary motivation for this patch is to fix building chromium for
>>arm
>>with -flto-partition=one.
>>Chromium fails to build with -flto-partition={none, one} with assembler
>>error:
>>"branch out of range error"
>>because in both these cases LTO creates a single text section of 18 mb
>>which exceeds thumb's limit of 16 mb and arm backend emits a short
>>call if caller and callee are in same section.
>>This is binutils PR18625:
>>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
>>With patch, chromium builds for -flto-partition=one (by creating more
>>than one but minimal number of partitions to honor 16 mb limit).
>>I haven't tested with -flto-partition=none but I suppose the build
>>will still fail for none, because it won't involve partitioning?  I am
>>not sure how to fix for none case.
>>
>>As suggested by Jim in binutils PR18625, the proper fix would be to
>>implement branch relaxation in arm's port of gas, however I suppose
>>only LTO will realistically create such large sections,
>>and implementing branch relaxation appears to be quite complicated and
>>probably too much of
>>an effort for this single use case, so this patch serves as a
>>work-around to the issue.
>>I am looking into fine-tuning the param value for ARM backend to
>>roughly match limit
>>of 16 mb.
>>
>>AFAIU, this would change semantics of --param n_lto_partitions (or
>>-flto-partition=one) from
>>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
>>not desirable maybe we could add
>>another param/option ?
>>Cross-tested on arm*-*-*.
>>Would this patch be OK for stage-1 (after getting param value right
>>for ARM target) ?
>
> What do you want to achieve?  Changing =one semantics doesn't look right to me.
> Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
Well, chromium fails to build on ARM with -flto-partition={none, one}
because the size of text section created with LTO,
exceeds the limit of 16 mb for thumb2 which results in assembler
errors: "branch out of range".
I was trying to fix that by creating minimal number of partitions such
that size of each partition is not greater than section size limit.
I suppose in theory the problem could also present with balanced
partitioning if total_size / n_lto_partitions exceeds section size
limit,
although not sure if this will be a practical case.

Thanks,
Prathamesh
>
> Richard.
>
>>
>>Thanks,
>>Prathamesh
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-04  7:47   ` Prathamesh Kulkarni
@ 2016-04-04  8:26     ` Richard Biener
  2016-04-04 11:19       ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-04  8:26 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches, Ramana Radhakrishnan, Jan Hubicka

On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:

> On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
> > On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
> >>Hi,
> >>The attached patch introduces param max-lto-partition which creates an
> >>upper
> >>bound for partition size.
> >>
> >>My primary motivation for this patch is to fix building chromium for
> >>arm
> >>with -flto-partition=one.
> >>Chromium fails to build with -flto-partition={none, one} with assembler
> >>error:
> >>"branch out of range error"
> >>because in both these cases LTO creates a single text section of 18 mb
> >>which exceeds thumb's limit of 16 mb and arm backend emits a short
> >>call if caller and callee are in same section.
> >>This is binutils PR18625:
> >>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
> >>With patch, chromium builds for -flto-partition=one (by creating more
> >>than one but minimal number of partitions to honor 16 mb limit).
> >>I haven't tested with -flto-partition=none but I suppose the build
> >>will still fail for none, because it won't involve partitioning?  I am
> >>not sure how to fix for none case.
> >>
> >>As suggested by Jim in binutils PR18625, the proper fix would be to
> >>implement branch relaxation in arm's port of gas, however I suppose
> >>only LTO will realistically create such large sections,
> >>and implementing branch relaxation appears to be quite complicated and
> >>probably too much of
> >>an effort for this single use case, so this patch serves as a
> >>work-around to the issue.
> >>I am looking into fine-tuning the param value for ARM backend to
> >>roughly match limit
> >>of 16 mb.
> >>
> >>AFAIU, this would change semantics of --param n_lto_partitions (or
> >>-flto-partition=one) from
> >>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
> >>not desirable maybe we could add
> >>another param/option ?
> >>Cross-tested on arm*-*-*.
> >>Would this patch be OK for stage-1 (after getting param value right
> >>for ARM target) ?
> >
> > What do you want to achieve?  Changing =one semantics doesn't look right to me.
> > Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
>
> Well, chromium fails to build on ARM with -flto-partition={none, one}
> because the size of text section created with LTO,
> exceeds the limit of 16 mb for thumb2 which results in assembler
> errors: "branch out of range".
> I was trying to fix that by creating minimal number of partitions such
> that size of each partition is not greater than section size limit.

Ok, but you simply shouldn't use -flto-partition={none,one} if it doesn't 
work.  Note that "partition size" and text section size do not have
a 1:1 correspondence so a safe limit is hard to achieve anyway.

> I suppose in theory the problem could also present with balanced
> partitioning if total_size / n_lto_partitions exceeds section size
> limit,
> although not sure if this will be a practical case.

I guess an artificial testcase can easily hit this.  Or you can
hit this by adjusting --param lto-partitions to 1.  I think
adding a --param lto-max-partition is missing given that we
already have a --param lto-min-partition and the partitioning
algorithm tries to create lto-partitions partitions (but not smaller
than lto-min-partition) but it never creates more than lto-partitions
partitions as there is no upper bound on individual partition size.

This is also why lto-partitions has such a high default (to exploit
parallelism - but if there is only a very small number of CPU cores
available it doesn't make sense to split up so much for small programs).

That said, lto-partitions is a hint currently but also an upper bound
because we lack lto-max-partition.  Let's fix that instead.

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-04  8:26     ` Richard Biener
@ 2016-04-04 11:19       ` Prathamesh Kulkarni
  2016-04-04 11:42         ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-04 11:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc Patches, Ramana Radhakrishnan, Jan Hubicka

On 4 April 2016 at 13:56, Richard Biener <rguenther@suse.de> wrote:
> On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
>> > On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
>> >>Hi,
>> >>The attached patch introduces param max-lto-partition which creates an
>> >>upper
>> >>bound for partition size.
>> >>
>> >>My primary motivation for this patch is to fix building chromium for
>> >>arm
>> >>with -flto-partition=one.
>> >>Chromium fails to build with -flto-partition={none, one} with assembler
>> >>error:
>> >>"branch out of range error"
>> >>because in both these cases LTO creates a single text section of 18 mb
>> >>which exceeds thumb's limit of 16 mb and arm backend emits a short
>> >>call if caller and callee are in same section.
>> >>This is binutils PR18625:
>> >>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
>> >>With patch, chromium builds for -flto-partition=one (by creating more
>> >>than one but minimal number of partitions to honor 16 mb limit).
>> >>I haven't tested with -flto-partition=none but I suppose the build
>> >>will still fail for none, because it won't involve partitioning?  I am
>> >>not sure how to fix for none case.
>> >>
>> >>As suggested by Jim in binutils PR18625, the proper fix would be to
>> >>implement branch relaxation in arm's port of gas, however I suppose
>> >>only LTO will realistically create such large sections,
>> >>and implementing branch relaxation appears to be quite complicated and
>> >>probably too much of
>> >>an effort for this single use case, so this patch serves as a
>> >>work-around to the issue.
>> >>I am looking into fine-tuning the param value for ARM backend to
>> >>roughly match limit
>> >>of 16 mb.
>> >>
>> >>AFAIU, this would change semantics of --param n_lto_partitions (or
>> >>-flto-partition=one) from
>> >>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
>> >>not desirable maybe we could add
>> >>another param/option ?
>> >>Cross-tested on arm*-*-*.
>> >>Would this patch be OK for stage-1 (after getting param value right
>> >>for ARM target) ?
>> >
>> > What do you want to achieve?  Changing =one semantics doesn't look right to me.
>> > Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
>>
>> Well, chromium fails to build on ARM with -flto-partition={none, one}
>> because the size of text section created with LTO,
>> exceeds the limit of 16 mb for thumb2 which results in assembler
>> errors: "branch out of range".
>> I was trying to fix that by creating minimal number of partitions such
>> that size of each partition is not greater than section size limit.
>
> Ok, but you simply shouldn't use -flto-partition={none,one} if it doesn't
> work.  Note that "partition size" and text section size do not have
> a 1:1 correspondence so a safe limit is hard to achieve anyway.
>
>> I suppose in theory the problem could also present with balanced
>> partitioning if total_size / n_lto_partitions exceeds section size
>> limit,
>> although not sure if this will be a practical case.
>
> I guess an artificial testcase can easily hit this.  Or you can
> hit this by adjusting --param lto-partitions to 1.  I think
> adding a --param lto-max-partition is missing given that we
> already have a --param lto-min-partition and the partitioning
> algorithm tries to create lto-partitions partitions (but not smaller
> than lto-min-partition) but it never creates more than lto-partitions
> partitions as there is no upper bound on individual partition size.
>
> This is also why lto-partitions has such a high default (to exploit
> parallelism - but if there is only a very small number of CPU cores
> available it doesn't make sense to split up so much for small programs).
>
> That said, lto-partitions is a hint currently but also an upper bound
> because we lack lto-max-partition.  Let's fix that instead.
Um not sure if I understood correctly.
Do we want to constrain individual partition size by adding parameter
lto-max-partition
for balanced partitioning but not for -flto-partition=one
case (since latter would also change semantics of =one) ?

Thanks,
Prathamesh
>
> Richard.
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-04 11:19       ` Prathamesh Kulkarni
@ 2016-04-04 11:42         ` Richard Biener
  2016-04-04 12:00           ` Jan Hubicka
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-04 11:42 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches, Ramana Radhakrishnan, Jan Hubicka

On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:

> On 4 April 2016 at 13:56, Richard Biener <rguenther@suse.de> wrote:
> > On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
> >> > On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
> >> >>Hi,
> >> >>The attached patch introduces param max-lto-partition which creates an
> >> >>upper
> >> >>bound for partition size.
> >> >>
> >> >>My primary motivation for this patch is to fix building chromium for
> >> >>arm
> >> >>with -flto-partition=one.
> >> >>Chromium fails to build with -flto-partition={none, one} with assembler
> >> >>error:
> >> >>"branch out of range error"
> >> >>because in both these cases LTO creates a single text section of 18 mb
> >> >>which exceeds thumb's limit of 16 mb and arm backend emits a short
> >> >>call if caller and callee are in same section.
> >> >>This is binutils PR18625:
> >> >>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
> >> >>With patch, chromium builds for -flto-partition=one (by creating more
> >> >>than one but minimal number of partitions to honor 16 mb limit).
> >> >>I haven't tested with -flto-partition=none but I suppose the build
> >> >>will still fail for none, because it won't involve partitioning?  I am
> >> >>not sure how to fix for none case.
> >> >>
> >> >>As suggested by Jim in binutils PR18625, the proper fix would be to
> >> >>implement branch relaxation in arm's port of gas, however I suppose
> >> >>only LTO will realistically create such large sections,
> >> >>and implementing branch relaxation appears to be quite complicated and
> >> >>probably too much of
> >> >>an effort for this single use case, so this patch serves as a
> >> >>work-around to the issue.
> >> >>I am looking into fine-tuning the param value for ARM backend to
> >> >>roughly match limit
> >> >>of 16 mb.
> >> >>
> >> >>AFAIU, this would change semantics of --param n_lto_partitions (or
> >> >>-flto-partition=one) from
> >> >>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
> >> >>not desirable maybe we could add
> >> >>another param/option ?
> >> >>Cross-tested on arm*-*-*.
> >> >>Would this patch be OK for stage-1 (after getting param value right
> >> >>for ARM target) ?
> >> >
> >> > What do you want to achieve?  Changing =one semantics doesn't look right to me.
> >> > Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
> >>
> >> Well, chromium fails to build on ARM with -flto-partition={none, one}
> >> because the size of text section created with LTO,
> >> exceeds the limit of 16 mb for thumb2 which results in assembler
> >> errors: "branch out of range".
> >> I was trying to fix that by creating minimal number of partitions such
> >> that size of each partition is not greater than section size limit.
> >
> > Ok, but you simply shouldn't use -flto-partition={none,one} if it doesn't
> > work.  Note that "partition size" and text section size do not have
> > a 1:1 correspondence so a safe limit is hard to achieve anyway.
> >
> >> I suppose in theory the problem could also present with balanced
> >> partitioning if total_size / n_lto_partitions exceeds section size
> >> limit,
> >> although not sure if this will be a practical case.
> >
> > I guess an artificial testcase can easily hit this.  Or you can
> > hit this by adjusting --param lto-partitions to 1.  I think
> > adding a --param lto-max-partition is missing given that we
> > already have a --param lto-min-partition and the partitioning
> > algorithm tries to create lto-partitions partitions (but not smaller
> > than lto-min-partition) but it never creates more than lto-partitions
> > partitions as there is no upper bound on individual partition size.
> >
> > This is also why lto-partitions has such a high default (to exploit
> > parallelism - but if there is only a very small number of CPU cores
> > available it doesn't make sense to split up so much for small programs).
> >
> > That said, lto-partitions is a hint currently but also an upper bound
> > because we lack lto-max-partition.  Let's fix that instead.
> Um not sure if I understood correctly.
> Do we want to constrain individual partition size by adding parameter
> lto-max-partition
> for balanced partitioning but not for -flto-partition=one
> case (since latter would also change semantics of =one) ?

Yes, I think so.

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-04 11:42         ` Richard Biener
@ 2016-04-04 12:00           ` Jan Hubicka
  2016-04-04 13:28             ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Hubicka @ 2016-04-04 12:00 UTC (permalink / raw)
  To: Richard Biener
  Cc: Prathamesh Kulkarni, gcc Patches, Ramana Radhakrishnan, Jan Hubicka

> > Um not sure if I understood correctly.
> > Do we want to constrain individual partition size by adding parameter
> > lto-max-partition
> > for balanced partitioning but not for -flto-partition=one
> > case (since latter would also change semantics of =one) ?
> 
> Yes, I think so.

Yep, I agree.  Having partition one that produces multiple partitions doesn't seem to make much sense.
Given that we have such not so predictable target specific limits on size of single translation unit
we can handle, I suppose adding a resonable limit to the default balanced partitioning makes more sense.
One can always push the behaviour you intend by setting max partitions to 1 (I suppose max size should
have precedence over max partitions)

Honza
> 
> Richard.
> 
> > Thanks,
> > Prathamesh
> > >
> > > Richard.
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> > 
> > 
> 
> -- 
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-04 12:00           ` Jan Hubicka
@ 2016-04-04 13:28             ` Prathamesh Kulkarni
  2016-04-04 14:14               ` Jan Hubicka
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-04 13:28 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, gcc Patches, Ramana Radhakrishnan

[-- Attachment #1: Type: text/plain, Size: 1325 bytes --]

On 4 April 2016 at 17:30, Jan Hubicka <hubicka@ucw.cz> wrote:
>> > Um not sure if I understood correctly.
>> > Do we want to constrain individual partition size by adding parameter
>> > lto-max-partition
>> > for balanced partitioning but not for -flto-partition=one
>> > case (since latter would also change semantics of =one) ?
>>
>> Yes, I think so.
>
> Yep, I agree.  Having partition one that produces multiple partitions doesn't seem to make much sense.
> Given that we have such not so predictable target specific limits on size of single translation unit
> we can handle, I suppose adding a resonable limit to the default balanced partitioning makes more sense.
> One can always push the behaviour you intend by setting max partitions to 1 (I suppose max size should
> have precedence over max partitions)
Thanks for the suggestions, I have updated the patch.
Is it OK if it passes bootstrap+test ?

Thanks,
Prathamesh
>
> Honza
>>
>> Richard.
>>
>> > Thanks,
>> > Prathamesh
>> > >
>> > > Richard.
>> > >
>> > > --
>> > > Richard Biener <rguenther@suse.de>
>> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
>> >
>> >
>>
>> --
>> Richard Biener <rguenther@suse.de>
>> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

[-- Attachment #2: patch-2.diff --]
[-- Type: text/plain, Size: 2133 bytes --]

diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..bc0c612 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);
 
   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
+  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
+    {
+      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
+      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
+	n_lto_partitions++;
+      partition_size = total_size / n_lto_partitions;
+    }
+
   npartitions = 1;
   partition = new_partition ("");
   if (symtab->dump_file)
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 9dd513f..294b8a4 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
   timevar_pop (TV_WHOPR_WPA);
 
   timevar_push (TV_WHOPR_PARTITIONING);
+
+  if (flag_lto_partition != LTO_PARTITION_BALANCED
+      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
+    fatal_error (input_location, "--param max-lto-partition should only"
+		 " be used with balanced partitioning\n");
+
   if (flag_lto_partition == LTO_PARTITION_1TO1)
     lto_1_to_1_map ();
   else if (flag_lto_partition == LTO_PARTITION_MAX)
diff --git a/gcc/params.def b/gcc/params.def
index 9362c15..b6055ff 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1029,6 +1029,11 @@ DEFPARAM (MIN_PARTITION_SIZE,
 	  "Minimal size of a partition for LTO (in estimated instructions).",
 	  1000, 0, 0)
 
+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  INT_MAX, 0, INT_MAX)
+
 /* Diagnostic parameters.  */
 
 DEFPARAM (CXX_MAX_NAMESPACES_FOR_DIAGNOSTIC_HELP,

[-- Attachment #3: ChangeLog --]
[-- Type: application/octet-stream, Size: 353 bytes --]

2016-04-04  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	* params.def (MAX_PARTITION_SIZE): New parameter.
	* lto/lto-partition.c (lto_balanced_map): Check if partition size is greater than
	max-lto-partition.
	* lto/lto.c (do_whole_program_analysis): Report fatal error if lto-max-partition is
	passed and balanced partitioning is not used.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-04 13:28             ` Prathamesh Kulkarni
@ 2016-04-04 14:14               ` Jan Hubicka
  2016-04-05 11:11                 ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Hubicka @ 2016-04-04 14:14 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Jan Hubicka, Richard Biener, gcc Patches, Ramana Radhakrishnan


> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> index 9eb63c2..bc0c612 100644
> --- a/gcc/lto/lto-partition.c
> +++ b/gcc/lto/lto-partition.c
> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>    varpool_order.qsort (varpool_node_cmp);
>  
>    /* Compute partition size and create the first partition.  */
> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> +
>    partition_size = total_size / n_lto_partitions;
>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> +    {
> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> +	n_lto_partitions++;
> +      partition_size = total_size / n_lto_partitions;
> +    }

lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
3/4*parittion_size to 2*partition_size and picks the cheapest range.
Setting partition_size to this value will thus not cause partitioner to produce smaller
partitions only.  I suppose modify the conditional:

      /* Partition is too large, unwind into step when best cost was reached and
         start new partition.  */                                               
      if (partition->insns > 2 * partition_size)                                

and/or in the code above set the partition_size to half of total_size/max_size.

I know this is somewhat sloppy.  This was really just first cut implementation
many years ago. I expected to reimplement it marter soon, but then there was
never really a need for it (I am trying to avoid late IPA optimizations so the
partitioning decisions should mostly affect compile time performance only).
If ARM is more sensitive for partitining, perhaps it would make sense to try to
look for something smarter.

> +
>    npartitions = 1;
>    partition = new_partition ("");
>    if (symtab->dump_file)
> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> index 9dd513f..294b8a4 100644
> --- a/gcc/lto/lto.c
> +++ b/gcc/lto/lto.c
> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>    timevar_pop (TV_WHOPR_WPA);
>  
>    timevar_push (TV_WHOPR_PARTITIONING);
> +
> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> +    fatal_error (input_location, "--param max-lto-partition should only"
> +		 " be used with balanced partitioning\n");
> +

I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
found experimentally may be a good start. For that reason we can't really
refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
balanced partitioning only and add a parameter to lto_balanced_map specifying whether
this param should be honored (because the same path is used for partitioning to one partition)

Otherwise the patch looks good to me modulo missing documentation.

Honza

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-04 14:14               ` Jan Hubicka
@ 2016-04-05 11:11                 ` Prathamesh Kulkarni
  2016-04-05 11:28                   ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-05 11:11 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, gcc Patches, Ramana Radhakrishnan

[-- Attachment #1: Type: text/plain, Size: 4012 bytes --]

On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
>
>> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> index 9eb63c2..bc0c612 100644
>> --- a/gcc/lto/lto-partition.c
>> +++ b/gcc/lto/lto-partition.c
>> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>>    varpool_order.qsort (varpool_node_cmp);
>>
>>    /* Compute partition size and create the first partition.  */
>> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> +
>>    partition_size = total_size / n_lto_partitions;
>>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> +    {
>> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> +     n_lto_partitions++;
>> +      partition_size = total_size / n_lto_partitions;
>> +    }
>
> lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> Setting partition_size to this value will thus not cause partitioner to produce smaller
> partitions only.  I suppose modify the conditional:
>
>       /* Partition is too large, unwind into step when best cost was reached and
>          start new partition.  */
>       if (partition->insns > 2 * partition_size)
>
> and/or in the code above set the partition_size to half of total_size/max_size.
>
> I know this is somewhat sloppy.  This was really just first cut implementation
> many years ago. I expected to reimplement it marter soon, but then there was
> never really a need for it (I am trying to avoid late IPA optimizations so the
> partitioning decisions should mostly affect compile time performance only).
> If ARM is more sensitive for partitining, perhaps it would make sense to try to
> look for something smarter.
>
>> +
>>    npartitions = 1;
>>    partition = new_partition ("");
>>    if (symtab->dump_file)
>> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> index 9dd513f..294b8a4 100644
>> --- a/gcc/lto/lto.c
>> +++ b/gcc/lto/lto.c
>> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>>    timevar_pop (TV_WHOPR_WPA);
>>
>>    timevar_push (TV_WHOPR_PARTITIONING);
>> +
>> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> +    fatal_error (input_location, "--param max-lto-partition should only"
>> +              " be used with balanced partitioning\n");
>> +
>
> I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> found experimentally may be a good start. For that reason we can't really
> refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> this param should be honored (because the same path is used for partitioning to one partition)
>
> Otherwise the patch looks good to me modulo missing documentation.
Thanks for the review. I have updated the patch.
Does this version look OK ?
I had randomly chosen 10000, not sure if that's an appropriate value
for default.

I have a silly question about partitioning: Does it hamper
transformations on ipa optimizations if caller and
callee get placed in separate partitions ? For instance if callee is
supposed to be inlined
into caller, would inlining still take place if callee and caller get
placed in separate partitions ?
I tried with a trivial example with -flto-partition=max
which created 3 partitions for 3 functions (bar, foo and main), and it was
able to inline bar into foo and foo into main.  I am not sure how that happens.
I thought ltrans can perform transformations on functions only within
a single partition
and not across partitions ?

Thanks,
Prathamesh
>
> Honza

[-- Attachment #2: patch-3.diff --]
[-- Type: text/plain, Size: 3606 bytes --]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9e54bb7..f0de7ec 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9477,6 +9477,11 @@ Size of minimal partition for WHOPR (in estimated instructions).
 This prevents expenses of splitting very small programs into too many
 partitions.
 
+@item lto-max-partition
+Size of max partition for WHOPR (in estimated instructions).
+to provide an upper bound for individual size of partition.
+Meant to be used only with balanced partitioning.
+
 @item cxx-max-namespaces-for-diagnostic-help
 The maximum number of namespaces to consult for suggestions when C++
 name lookup fails for an identifier.  The default is 1000.
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..d385dd9 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -447,7 +447,7 @@ add_sorted_nodes (vec<symtab_node *> &next_nodes, ltrans_partition partition)
    and in-partition calls was reached.  */
 
 void
-lto_balanced_map (int n_lto_partitions)
+lto_balanced_map (int n_lto_partitions, bool honor_max_partition)
 {
   int n_nodes = 0;
   int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
@@ -511,6 +511,9 @@ lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);
 
   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
@@ -719,7 +723,9 @@ lto_balanced_map (int n_lto_partitions)
 		 best_cost, best_internal, best_i);
       /* Partition is too large, unwind into step when best cost was reached and
 	 start new partition.  */
-      if (partition->insns > 2 * partition_size)
+      if (partition->insns > 2 * partition_size
+	  || (honor_max_partition
+	      && partition->insns > PARAM_VALUE (MAX_PARTITION_SIZE)))
 	{
 	  if (best_i != i)
 	    {
diff --git a/gcc/lto/lto-partition.h b/gcc/lto/lto-partition.h
index 31e3764..2992bee 100644
--- a/gcc/lto/lto-partition.h
+++ b/gcc/lto/lto-partition.h
@@ -35,7 +35,7 @@ extern vec<ltrans_partition> ltrans_partitions;
 
 void lto_1_to_1_map (void);
 void lto_max_map (void);
-void lto_balanced_map (int);
+void lto_balanced_map (int, bool honor_max_partition = true);
 void lto_promote_cross_file_statics (void);
 void free_ltrans_partitions (void);
 void lto_promote_statics_nonwpa (void);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 9dd513f..82bd9b3 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3117,7 +3118,7 @@ do_whole_program_analysis (void)
   else if (flag_lto_partition == LTO_PARTITION_MAX)
     lto_max_map ();
   else if (flag_lto_partition == LTO_PARTITION_ONE)
-    lto_balanced_map (1);
+    lto_balanced_map (1, false);
   else if (flag_lto_partition == LTO_PARTITION_BALANCED)
     lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS));
   else
diff --git a/gcc/params.def b/gcc/params.def
index 9362c15..9f8a648 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1029,6 +1029,11 @@ DEFPARAM (MIN_PARTITION_SIZE,
 	  "Minimal size of a partition for LTO (in estimated instructions).",
 	  1000, 0, 0)
 
+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  10000, 0, INT_MAX)
+
 /* Diagnostic parameters.  */
 
 DEFPARAM (CXX_MAX_NAMESPACES_FOR_DIAGNOSTIC_HELP,

[-- Attachment #3: ChangeLog --]
[-- Type: application/octet-stream, Size: 535 bytes --]

2016-04-05  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	* params.def (MAX_PARTITION_SIZE): New param.
	* lto/lto-partition.h (lto_balanced_map): New parameter
	honor_max_partition with default value true.
	* lto/lto-partition.c (lto_balanced_map): New parameter
	honor_max_partition with default value true.
	Check if partition size is greater than max-lto-partition.
	* lto/lto.c (do_whole_program_analysis): Pass false as 2nd arg to
	lto_balanced_map in single partition case.
	* invoke.texi: Document lto-max-partition.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-05 11:11                 ` Prathamesh Kulkarni
@ 2016-04-05 11:28                   ` Richard Biener
  2016-04-05 12:55                     ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-05 11:28 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:

> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> >
> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> index 9eb63c2..bc0c612 100644
> >> --- a/gcc/lto/lto-partition.c
> >> +++ b/gcc/lto/lto-partition.c
> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >>    varpool_order.qsort (varpool_node_cmp);
> >>
> >>    /* Compute partition size and create the first partition.  */
> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> +
> >>    partition_size = total_size / n_lto_partitions;
> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> +    {
> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> +     n_lto_partitions++;
> >> +      partition_size = total_size / n_lto_partitions;
> >> +    }
> >
> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> > Setting partition_size to this value will thus not cause partitioner to produce smaller
> > partitions only.  I suppose modify the conditional:
> >
> >       /* Partition is too large, unwind into step when best cost was reached and
> >          start new partition.  */
> >       if (partition->insns > 2 * partition_size)
> >
> > and/or in the code above set the partition_size to half of total_size/max_size.
> >
> > I know this is somewhat sloppy.  This was really just first cut implementation
> > many years ago. I expected to reimplement it marter soon, but then there was
> > never really a need for it (I am trying to avoid late IPA optimizations so the
> > partitioning decisions should mostly affect compile time performance only).
> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
> > look for something smarter.
> >
> >> +
> >>    npartitions = 1;
> >>    partition = new_partition ("");
> >>    if (symtab->dump_file)
> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> index 9dd513f..294b8a4 100644
> >> --- a/gcc/lto/lto.c
> >> +++ b/gcc/lto/lto.c
> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >>    timevar_pop (TV_WHOPR_WPA);
> >>
> >>    timevar_push (TV_WHOPR_PARTITIONING);
> >> +
> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> +    fatal_error (input_location, "--param max-lto-partition should only"
> >> +              " be used with balanced partitioning\n");
> >> +
> >
> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> > found experimentally may be a good start. For that reason we can't really
> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> > this param should be honored (because the same path is used for partitioning to one partition)
> >
> > Otherwise the patch looks good to me modulo missing documentation.
> Thanks for the review. I have updated the patch.
> Does this version look OK ?
> I had randomly chosen 10000, not sure if that's an appropriate value
> for default.

I think it's way too small.  This is roughly the number of GIMPLE stmts
(thus roughly the number of instructions).  So with say a 8 byte
instruction format it is on the order of 80kB.  You'd want to have a
default of at least several ten times of large-unit-insns (also 10000).
I'd choose sth like 1000000 (one million).  I find the lto-min-partition
number quite small as well (and up it by a factor of 10).

Richard.

> I have a silly question about partitioning: Does it hamper
> transformations on ipa optimizations if caller and
> callee get placed in separate partitions ? For instance if callee is
> supposed to be inlined
> into caller, would inlining still take place if callee and caller get
> placed in separate partitions ?
> I tried with a trivial example with -flto-partition=max
> which created 3 partitions for 3 functions (bar, foo and main), and it was
> able to inline bar into foo and foo into main.  I am not sure how that happens.
> I thought ltrans can perform transformations on functions only within
> a single partition
> and not across partitions ?
> 
> Thanks,
> Prathamesh
> >
> > Honza
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-05 11:28                   ` Richard Biener
@ 2016-04-05 12:55                     ` Prathamesh Kulkarni
  2016-04-05 12:58                       ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-05 12:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

[-- Attachment #1: Type: text/plain, Size: 5053 bytes --]

On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >
>> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> >> index 9eb63c2..bc0c612 100644
>> >> --- a/gcc/lto/lto-partition.c
>> >> +++ b/gcc/lto/lto-partition.c
>> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> >>    varpool_order.qsort (varpool_node_cmp);
>> >>
>> >>    /* Compute partition size and create the first partition.  */
>> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> >> +
>> >>    partition_size = total_size / n_lto_partitions;
>> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> +    {
>> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> +     n_lto_partitions++;
>> >> +      partition_size = total_size / n_lto_partitions;
>> >> +    }
>> >
>> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
>> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
>> > Setting partition_size to this value will thus not cause partitioner to produce smaller
>> > partitions only.  I suppose modify the conditional:
>> >
>> >       /* Partition is too large, unwind into step when best cost was reached and
>> >          start new partition.  */
>> >       if (partition->insns > 2 * partition_size)
>> >
>> > and/or in the code above set the partition_size to half of total_size/max_size.
>> >
>> > I know this is somewhat sloppy.  This was really just first cut implementation
>> > many years ago. I expected to reimplement it marter soon, but then there was
>> > never really a need for it (I am trying to avoid late IPA optimizations so the
>> > partitioning decisions should mostly affect compile time performance only).
>> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
>> > look for something smarter.
>> >
>> >> +
>> >>    npartitions = 1;
>> >>    partition = new_partition ("");
>> >>    if (symtab->dump_file)
>> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> >> index 9dd513f..294b8a4 100644
>> >> --- a/gcc/lto/lto.c
>> >> +++ b/gcc/lto/lto.c
>> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> >>    timevar_pop (TV_WHOPR_WPA);
>> >>
>> >>    timevar_push (TV_WHOPR_PARTITIONING);
>> >> +
>> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> >> +    fatal_error (input_location, "--param max-lto-partition should only"
>> >> +              " be used with balanced partitioning\n");
>> >> +
>> >
>> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
>> > found experimentally may be a good start. For that reason we can't really
>> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
>> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
>> > this param should be honored (because the same path is used for partitioning to one partition)
>> >
>> > Otherwise the patch looks good to me modulo missing documentation.
>> Thanks for the review. I have updated the patch.
>> Does this version look OK ?
>> I had randomly chosen 10000, not sure if that's an appropriate value
>> for default.
>
> I think it's way too small.  This is roughly the number of GIMPLE stmts
> (thus roughly the number of instructions).  So with say a 8 byte
> instruction format it is on the order of 80kB.  You'd want to have a
> default of at least several ten times of large-unit-insns (also 10000).
> I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> number quite small as well (and up it by a factor of 10).
Done in this version.
Is it OK after bootstrap+test ?

Thanks,
Prathamesh
>
> Richard.
>
>> I have a silly question about partitioning: Does it hamper
>> transformations on ipa optimizations if caller and
>> callee get placed in separate partitions ? For instance if callee is
>> supposed to be inlined
>> into caller, would inlining still take place if callee and caller get
>> placed in separate partitions ?
>> I tried with a trivial example with -flto-partition=max
>> which created 3 partitions for 3 functions (bar, foo and main), and it was
>> able to inline bar into foo and foo into main.  I am not sure how that happens.
>> I thought ltrans can perform transformations on functions only within
>> a single partition
>> and not across partitions ?
>>
>> Thanks,
>> Prathamesh
>> >
>> > Honza
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

[-- Attachment #2: patch-4.diff --]
[-- Type: text/plain, Size: 3867 bytes --]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9e54bb7..f0de7ec 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9477,6 +9477,11 @@ Size of minimal partition for WHOPR (in estimated instructions).
 This prevents expenses of splitting very small programs into too many
 partitions.
 
+@item lto-max-partition
+Size of max partition for WHOPR (in estimated instructions).
+to provide an upper bound for individual size of partition.
+Meant to be used only with balanced partitioning.
+
 @item cxx-max-namespaces-for-diagnostic-help
 The maximum number of namespaces to consult for suggestions when C++
 name lookup fails for an identifier.  The default is 1000.
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..d385dd9 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -447,7 +447,7 @@ add_sorted_nodes (vec<symtab_node *> &next_nodes, ltrans_partition partition)
    and in-partition calls was reached.  */
 
 void
-lto_balanced_map (int n_lto_partitions)
+lto_balanced_map (int n_lto_partitions, bool honor_max_partition)
 {
   int n_nodes = 0;
   int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
@@ -511,9 +511,13 @@ lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);
 
   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
+
   npartitions = 1;
   partition = new_partition ("");
   if (symtab->dump_file)
@@ -719,7 +723,9 @@ lto_balanced_map (int n_lto_partitions)
 		 best_cost, best_internal, best_i);
       /* Partition is too large, unwind into step when best cost was reached and
 	 start new partition.  */
-      if (partition->insns > 2 * partition_size)
+      if (partition->insns > 2 * partition_size
+	  || (honor_max_partition
+	      && partition->insns > PARAM_VALUE (MAX_PARTITION_SIZE)))
 	{
 	  if (best_i != i)
 	    {
diff --git a/gcc/lto/lto-partition.h b/gcc/lto/lto-partition.h
index 31e3764..2992bee 100644
--- a/gcc/lto/lto-partition.h
+++ b/gcc/lto/lto-partition.h
@@ -35,7 +35,7 @@ extern vec<ltrans_partition> ltrans_partitions;
 
 void lto_1_to_1_map (void);
 void lto_max_map (void);
-void lto_balanced_map (int);
+void lto_balanced_map (int, bool honor_max_partition = true);
 void lto_promote_cross_file_statics (void);
 void free_ltrans_partitions (void);
 void lto_promote_statics_nonwpa (void);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 9dd513f..82bd9b3 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3112,12 +3112,13 @@ do_whole_program_analysis (void)
   timevar_pop (TV_WHOPR_WPA);
 
   timevar_push (TV_WHOPR_PARTITIONING);
+
   if (flag_lto_partition == LTO_PARTITION_1TO1)
     lto_1_to_1_map ();
   else if (flag_lto_partition == LTO_PARTITION_MAX)
     lto_max_map ();
   else if (flag_lto_partition == LTO_PARTITION_ONE)
-    lto_balanced_map (1);
+    lto_balanced_map (1, false);
   else if (flag_lto_partition == LTO_PARTITION_BALANCED)
     lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS));
   else
diff --git a/gcc/params.def b/gcc/params.def
index 9362c15..97e41aa 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1027,7 +1027,12 @@ DEFPARAM (PARAM_LTO_PARTITIONS,
 DEFPARAM (MIN_PARTITION_SIZE,
 	  "lto-min-partition",
 	  "Minimal size of a partition for LTO (in estimated instructions).",
-	  1000, 0, 0)
+	  10000, 0, 0)
+
+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  1000000, 0, INT_MAX)
 
 /* Diagnostic parameters.  */
 

[-- Attachment #3: ChangeLog --]
[-- Type: application/octet-stream, Size: 589 bytes --]

2016-04-05  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	* params.def (MAX_PARTITION_SIZE): New param.
	(MIN_PARTITION_SIZE): Change default value to 10000.
	* lto/lto-partition.h (lto_balanced_map): New parameter
	honor_max_partition with default value true.
	* lto/lto-partition.c (lto_balanced_map): New parameter
	honor_max_partition with default value true.
	Check if partition size is greater than max-lto-partition.
	* lto/lto.c (do_whole_program_analysis): Pass false as 2nd arg to
	lto_balanced_map in single partition case.
	* invoke.texi: Document lto-max-partition.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-05 12:55                     ` Prathamesh Kulkarni
@ 2016-04-05 12:58                       ` Richard Biener
  2016-04-06  7:47                         ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-05 12:58 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:

> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >
> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> >> index 9eb63c2..bc0c612 100644
> >> >> --- a/gcc/lto/lto-partition.c
> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> >>    varpool_order.qsort (varpool_node_cmp);
> >> >>
> >> >>    /* Compute partition size and create the first partition.  */
> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> >> +
> >> >>    partition_size = total_size / n_lto_partitions;
> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> +    {
> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> +     n_lto_partitions++;
> >> >> +      partition_size = total_size / n_lto_partitions;
> >> >> +    }
> >> >
> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
> >> > partitions only.  I suppose modify the conditional:
> >> >
> >> >       /* Partition is too large, unwind into step when best cost was reached and
> >> >          start new partition.  */
> >> >       if (partition->insns > 2 * partition_size)
> >> >
> >> > and/or in the code above set the partition_size to half of total_size/max_size.
> >> >
> >> > I know this is somewhat sloppy.  This was really just first cut implementation
> >> > many years ago. I expected to reimplement it marter soon, but then there was
> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
> >> > partitioning decisions should mostly affect compile time performance only).
> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
> >> > look for something smarter.
> >> >
> >> >> +
> >> >>    npartitions = 1;
> >> >>    partition = new_partition ("");
> >> >>    if (symtab->dump_file)
> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> index 9dd513f..294b8a4 100644
> >> >> --- a/gcc/lto/lto.c
> >> >> +++ b/gcc/lto/lto.c
> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >>    timevar_pop (TV_WHOPR_WPA);
> >> >>
> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
> >> >> +
> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
> >> >> +              " be used with balanced partitioning\n");
> >> >> +
> >> >
> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> >> > found experimentally may be a good start. For that reason we can't really
> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> >> > this param should be honored (because the same path is used for partitioning to one partition)
> >> >
> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> Thanks for the review. I have updated the patch.
> >> Does this version look OK ?
> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> for default.
> >
> > I think it's way too small.  This is roughly the number of GIMPLE stmts
> > (thus roughly the number of instructions).  So with say a 8 byte
> > instruction format it is on the order of 80kB.  You'd want to have a
> > default of at least several ten times of large-unit-insns (also 10000).
> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> > number quite small as well (and up it by a factor of 10).
> Done in this version.

I'd do that separately.

Please no default parameter for lto_balanced_map (), instead change
all callers.

> Is it OK after bootstrap+test ?

Note that this is for stage1 only.  I'll leave approval to Honza
(also verification of the default max param - not sure if for example
chromium or firefox should/will be split to more than 32 partitions
with the patch)

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> >> I have a silly question about partitioning: Does it hamper
> >> transformations on ipa optimizations if caller and
> >> callee get placed in separate partitions ? For instance if callee is
> >> supposed to be inlined
> >> into caller, would inlining still take place if callee and caller get
> >> placed in separate partitions ?
> >> I tried with a trivial example with -flto-partition=max
> >> which created 3 partitions for 3 functions (bar, foo and main), and it was
> >> able to inline bar into foo and foo into main.  I am not sure how that happens.
> >> I thought ltrans can perform transformations on functions only within
> >> a single partition
> >> and not across partitions ?
> >>
> >> Thanks,
> >> Prathamesh
> >> >
> >> > Honza
> >>
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-05 12:58                       ` Richard Biener
@ 2016-04-06  7:47                         ` Prathamesh Kulkarni
  2016-04-06  8:14                           ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-06  7:47 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

[-- Attachment #1: Type: text/plain, Size: 6180 bytes --]

On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
>> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >
>> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> >
>> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> >> >> index 9eb63c2..bc0c612 100644
>> >> >> --- a/gcc/lto/lto-partition.c
>> >> >> +++ b/gcc/lto/lto-partition.c
>> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> >> >>    varpool_order.qsort (varpool_node_cmp);
>> >> >>
>> >> >>    /* Compute partition size and create the first partition.  */
>> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> >> >> +
>> >> >>    partition_size = total_size / n_lto_partitions;
>> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> +    {
>> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> +     n_lto_partitions++;
>> >> >> +      partition_size = total_size / n_lto_partitions;
>> >> >> +    }
>> >> >
>> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
>> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
>> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
>> >> > partitions only.  I suppose modify the conditional:
>> >> >
>> >> >       /* Partition is too large, unwind into step when best cost was reached and
>> >> >          start new partition.  */
>> >> >       if (partition->insns > 2 * partition_size)
>> >> >
>> >> > and/or in the code above set the partition_size to half of total_size/max_size.
>> >> >
>> >> > I know this is somewhat sloppy.  This was really just first cut implementation
>> >> > many years ago. I expected to reimplement it marter soon, but then there was
>> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
>> >> > partitioning decisions should mostly affect compile time performance only).
>> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
>> >> > look for something smarter.
>> >> >
>> >> >> +
>> >> >>    npartitions = 1;
>> >> >>    partition = new_partition ("");
>> >> >>    if (symtab->dump_file)
>> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> >> >> index 9dd513f..294b8a4 100644
>> >> >> --- a/gcc/lto/lto.c
>> >> >> +++ b/gcc/lto/lto.c
>> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> >> >>    timevar_pop (TV_WHOPR_WPA);
>> >> >>
>> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
>> >> >> +
>> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
>> >> >> +              " be used with balanced partitioning\n");
>> >> >> +
>> >> >
>> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
>> >> > found experimentally may be a good start. For that reason we can't really
>> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
>> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
>> >> > this param should be honored (because the same path is used for partitioning to one partition)
>> >> >
>> >> > Otherwise the patch looks good to me modulo missing documentation.
>> >> Thanks for the review. I have updated the patch.
>> >> Does this version look OK ?
>> >> I had randomly chosen 10000, not sure if that's an appropriate value
>> >> for default.
>> >
>> > I think it's way too small.  This is roughly the number of GIMPLE stmts
>> > (thus roughly the number of instructions).  So with say a 8 byte
>> > instruction format it is on the order of 80kB.  You'd want to have a
>> > default of at least several ten times of large-unit-insns (also 10000).
>> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
>> > number quite small as well (and up it by a factor of 10).
>> Done in this version.
>
> I'd do that separately.
>
> Please no default parameter for lto_balanced_map (), instead change
> all callers.
>
>> Is it OK after bootstrap+test ?
>
> Note that this is for stage1 only.  I'll leave approval to Honza
> (also verification of the default max param - not sure if for example
> chromium or firefox should/will be split to more than 32 partitions
> with the patch)
Removed default parameter in this version. I verified with the patch
for chromium LTO build:
n_lto_partitions == 32, ltrans_partitions.length() == 31

Thanks,
Prathamesh
>
> Richard.
>
>> Thanks,
>> Prathamesh
>> >
>> > Richard.
>> >
>> >> I have a silly question about partitioning: Does it hamper
>> >> transformations on ipa optimizations if caller and
>> >> callee get placed in separate partitions ? For instance if callee is
>> >> supposed to be inlined
>> >> into caller, would inlining still take place if callee and caller get
>> >> placed in separate partitions ?
>> >> I tried with a trivial example with -flto-partition=max
>> >> which created 3 partitions for 3 functions (bar, foo and main), and it was
>> >> able to inline bar into foo and foo into main.  I am not sure how that happens.
>> >> I thought ltrans can perform transformations on functions only within
>> >> a single partition
>> >> and not across partitions ?
>> >>
>> >> Thanks,
>> >> Prathamesh
>> >> >
>> >> > Honza
>> >>
>> >
>> > --
>> > Richard Biener <rguenther@suse.de>
>> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

[-- Attachment #2: patch-4.diff --]
[-- Type: text/plain, Size: 3695 bytes --]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9e54bb7..f0de7ec 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9477,6 +9477,11 @@ Size of minimal partition for WHOPR (in estimated instructions).
 This prevents expenses of splitting very small programs into too many
 partitions.
 
+@item lto-max-partition
+Size of max partition for WHOPR (in estimated instructions).
+to provide an upper bound for individual size of partition.
+Meant to be used only with balanced partitioning.
+
 @item cxx-max-namespaces-for-diagnostic-help
 The maximum number of namespaces to consult for suggestions when C++
 name lookup fails for an identifier.  The default is 1000.
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..d385dd9 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -447,7 +447,7 @@ add_sorted_nodes (vec<symtab_node *> &next_nodes, ltrans_partition partition)
    and in-partition calls was reached.  */
 
 void
-lto_balanced_map (int n_lto_partitions)
+lto_balanced_map (int n_lto_partitions, bool honor_max_partition)
 {
   int n_nodes = 0;
   int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
@@ -511,6 +511,9 @@ lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);
 
   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
@@ -719,7 +723,9 @@ lto_balanced_map (int n_lto_partitions)
 		 best_cost, best_internal, best_i);
       /* Partition is too large, unwind into step when best cost was reached and
 	 start new partition.  */
-      if (partition->insns > 2 * partition_size)
+      if (partition->insns > 2 * partition_size
+	  || (honor_max_partition
+	      && partition->insns > PARAM_VALUE (MAX_PARTITION_SIZE)))
 	{
 	  if (best_i != i)
 	    {
diff --git a/gcc/lto/lto-partition.h b/gcc/lto/lto-partition.h
index 31e3764..a55594c 100644
--- a/gcc/lto/lto-partition.h
+++ b/gcc/lto/lto-partition.h
@@ -35,7 +35,7 @@ extern vec<ltrans_partition> ltrans_partitions;
 
 void lto_1_to_1_map (void);
 void lto_max_map (void);
-void lto_balanced_map (int);
+void lto_balanced_map (int, bool honor_max_partition); 
 void lto_promote_cross_file_statics (void);
 void free_ltrans_partitions (void);
 void lto_promote_statics_nonwpa (void);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 9dd513f..ef2ce15 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3117,9 +3118,9 @@ do_whole_program_analysis (void)
   else if (flag_lto_partition == LTO_PARTITION_MAX)
     lto_max_map ();
   else if (flag_lto_partition == LTO_PARTITION_ONE)
-    lto_balanced_map (1);
+    lto_balanced_map (1, false);
   else if (flag_lto_partition == LTO_PARTITION_BALANCED)
-    lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS));
+    lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS), true);
   else
     gcc_unreachable ();
 
diff --git a/gcc/params.def b/gcc/params.def
index 9362c15..b5da384 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1029,6 +1029,11 @@ DEFPARAM (MIN_PARTITION_SIZE,
 	  "Minimal size of a partition for LTO (in estimated instructions).",
 	  1000, 0, 0)
 
+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  1000000, 0, INT_MAX)
+
 /* Diagnostic parameters.  */
 
 DEFPARAM (CXX_MAX_NAMESPACES_FOR_DIAGNOSTIC_HELP,

[-- Attachment #3: ChangeLog --]
[-- Type: application/octet-stream, Size: 477 bytes --]

2016-04-06  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	* params.def (MAX_PARTITION_SIZE): New param.
	* lto/lto-partition.h (lto_balanced_map): New parameter
	honor_max_partition. 
	* lto/lto-partition.c (lto_balanced_map): New parameter
	honor_max_partition.
	Check if partition size is greater than max-lto-partition.
	* lto/lto.c (do_whole_program_analysis): Adjust calls to
	lto_balanced_map() to pass 2nd argument.
	* invoke.texi: Document lto-max-partition.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-06  7:47                         ` Prathamesh Kulkarni
@ 2016-04-06  8:14                           ` Richard Biener
  2016-04-06  8:53                             ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-06  8:14 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:

> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >> >
> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> >> >> index 9eb63c2..bc0c612 100644
> >> >> >> --- a/gcc/lto/lto-partition.c
> >> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> >> >>    varpool_order.qsort (varpool_node_cmp);
> >> >> >>
> >> >> >>    /* Compute partition size and create the first partition.  */
> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> >> >> +
> >> >> >>    partition_size = total_size / n_lto_partitions;
> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> +    {
> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> +     n_lto_partitions++;
> >> >> >> +      partition_size = total_size / n_lto_partitions;
> >> >> >> +    }
> >> >> >
> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
> >> >> > partitions only.  I suppose modify the conditional:
> >> >> >
> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
> >> >> >          start new partition.  */
> >> >> >       if (partition->insns > 2 * partition_size)
> >> >> >
> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
> >> >> >
> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
> >> >> > partitioning decisions should mostly affect compile time performance only).
> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
> >> >> > look for something smarter.
> >> >> >
> >> >> >> +
> >> >> >>    npartitions = 1;
> >> >> >>    partition = new_partition ("");
> >> >> >>    if (symtab->dump_file)
> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> >> index 9dd513f..294b8a4 100644
> >> >> >> --- a/gcc/lto/lto.c
> >> >> >> +++ b/gcc/lto/lto.c
> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >> >>    timevar_pop (TV_WHOPR_WPA);
> >> >> >>
> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
> >> >> >> +
> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
> >> >> >> +              " be used with balanced partitioning\n");
> >> >> >> +
> >> >> >
> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> >> >> > found experimentally may be a good start. For that reason we can't really
> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
> >> >> >
> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> >> Thanks for the review. I have updated the patch.
> >> >> Does this version look OK ?
> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> >> for default.
> >> >
> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
> >> > (thus roughly the number of instructions).  So with say a 8 byte
> >> > instruction format it is on the order of 80kB.  You'd want to have a
> >> > default of at least several ten times of large-unit-insns (also 10000).
> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> >> > number quite small as well (and up it by a factor of 10).
> >> Done in this version.
> >
> > I'd do that separately.
> >
> > Please no default parameter for lto_balanced_map (), instead change
> > all callers.
> >
> >> Is it OK after bootstrap+test ?
> >
> > Note that this is for stage1 only.  I'll leave approval to Honza
> > (also verification of the default max param - not sure if for example
> > chromium or firefox should/will be split to more than 32 partitions
> > with the patch)
> Removed default parameter in this version. I verified with the patch
> for chromium LTO build:
> n_lto_partitions == 32, ltrans_partitions.length() == 31

Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
unlimited) instead of a bool parameter?

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> >> Thanks,
> >> Prathamesh
> >> >
> >> > Richard.
> >> >
> >> >> I have a silly question about partitioning: Does it hamper
> >> >> transformations on ipa optimizations if caller and
> >> >> callee get placed in separate partitions ? For instance if callee is
> >> >> supposed to be inlined
> >> >> into caller, would inlining still take place if callee and caller get
> >> >> placed in separate partitions ?
> >> >> I tried with a trivial example with -flto-partition=max
> >> >> which created 3 partitions for 3 functions (bar, foo and main), and it was
> >> >> able to inline bar into foo and foo into main.  I am not sure how that happens.
> >> >> I thought ltrans can perform transformations on functions only within
> >> >> a single partition
> >> >> and not across partitions ?
> >> >>
> >> >> Thanks,
> >> >> Prathamesh
> >> >> >
> >> >> > Honza
> >> >>
> >> >
> >> > --
> >> > Richard Biener <rguenther@suse.de>
> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> >>
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-06  8:14                           ` Richard Biener
@ 2016-04-06  8:53                             ` Prathamesh Kulkarni
  2016-04-06  9:07                               ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-06  8:53 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

[-- Attachment #1: Type: text/plain, Size: 7307 bytes --]

On 6 April 2016 at 13:44, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
>> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >
>> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
>> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >> >
>> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> >> >
>> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> >> >> >> index 9eb63c2..bc0c612 100644
>> >> >> >> --- a/gcc/lto/lto-partition.c
>> >> >> >> +++ b/gcc/lto/lto-partition.c
>> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> >> >> >>    varpool_order.qsort (varpool_node_cmp);
>> >> >> >>
>> >> >> >>    /* Compute partition size and create the first partition.  */
>> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> >> >> >> +
>> >> >> >>    partition_size = total_size / n_lto_partitions;
>> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> >> +    {
>> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> >> +     n_lto_partitions++;
>> >> >> >> +      partition_size = total_size / n_lto_partitions;
>> >> >> >> +    }
>> >> >> >
>> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
>> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
>> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
>> >> >> > partitions only.  I suppose modify the conditional:
>> >> >> >
>> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
>> >> >> >          start new partition.  */
>> >> >> >       if (partition->insns > 2 * partition_size)
>> >> >> >
>> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
>> >> >> >
>> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
>> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
>> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
>> >> >> > partitioning decisions should mostly affect compile time performance only).
>> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
>> >> >> > look for something smarter.
>> >> >> >
>> >> >> >> +
>> >> >> >>    npartitions = 1;
>> >> >> >>    partition = new_partition ("");
>> >> >> >>    if (symtab->dump_file)
>> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> >> >> >> index 9dd513f..294b8a4 100644
>> >> >> >> --- a/gcc/lto/lto.c
>> >> >> >> +++ b/gcc/lto/lto.c
>> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> >> >> >>    timevar_pop (TV_WHOPR_WPA);
>> >> >> >>
>> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
>> >> >> >> +
>> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
>> >> >> >> +              " be used with balanced partitioning\n");
>> >> >> >> +
>> >> >> >
>> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
>> >> >> > found experimentally may be a good start. For that reason we can't really
>> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
>> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
>> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
>> >> >> >
>> >> >> > Otherwise the patch looks good to me modulo missing documentation.
>> >> >> Thanks for the review. I have updated the patch.
>> >> >> Does this version look OK ?
>> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
>> >> >> for default.
>> >> >
>> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
>> >> > (thus roughly the number of instructions).  So with say a 8 byte
>> >> > instruction format it is on the order of 80kB.  You'd want to have a
>> >> > default of at least several ten times of large-unit-insns (also 10000).
>> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
>> >> > number quite small as well (and up it by a factor of 10).
>> >> Done in this version.
>> >
>> > I'd do that separately.
>> >
>> > Please no default parameter for lto_balanced_map (), instead change
>> > all callers.
>> >
>> >> Is it OK after bootstrap+test ?
>> >
>> > Note that this is for stage1 only.  I'll leave approval to Honza
>> > (also verification of the default max param - not sure if for example
>> > chromium or firefox should/will be split to more than 32 partitions
>> > with the patch)
>> Removed default parameter in this version. I verified with the patch
>> for chromium LTO build:
>> n_lto_partitions == 32, ltrans_partitions.length() == 31
>
> Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
> so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
> unlimited) instead of a bool parameter?
Indeed.  Instead of 0, would it be OK to pass INT_MAX as 2nd parameter in case
of single partition, since in that case partition->insns >
max_partition_size will never
be true, which would effectively ignore max_partition_size.

Thanks,
Prathamesh
>
> Richard.
>
>> Thanks,
>> Prathamesh
>> >
>> > Richard.
>> >
>> >> Thanks,
>> >> Prathamesh
>> >> >
>> >> > Richard.
>> >> >
>> >> >> I have a silly question about partitioning: Does it hamper
>> >> >> transformations on ipa optimizations if caller and
>> >> >> callee get placed in separate partitions ? For instance if callee is
>> >> >> supposed to be inlined
>> >> >> into caller, would inlining still take place if callee and caller get
>> >> >> placed in separate partitions ?
>> >> >> I tried with a trivial example with -flto-partition=max
>> >> >> which created 3 partitions for 3 functions (bar, foo and main), and it was
>> >> >> able to inline bar into foo and foo into main.  I am not sure how that happens.
>> >> >> I thought ltrans can perform transformations on functions only within
>> >> >> a single partition
>> >> >> and not across partitions ?
>> >> >>
>> >> >> Thanks,
>> >> >> Prathamesh
>> >> >> >
>> >> >> > Honza
>> >> >>
>> >> >
>> >> > --
>> >> > Richard Biener <rguenther@suse.de>
>> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
>> >>
>> >
>> > --
>> > Richard Biener <rguenther@suse.de>
>> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

[-- Attachment #2: patch-5.diff --]
[-- Type: text/plain, Size: 3642 bytes --]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9e54bb7..f0de7ec 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9477,6 +9477,11 @@ Size of minimal partition for WHOPR (in estimated instructions).
 This prevents expenses of splitting very small programs into too many
 partitions.

+@item lto-max-partition
+Size of max partition for WHOPR (in estimated instructions).
+to provide an upper bound for individual size of partition.
+Meant to be used only with balanced partitioning.
+
 @item cxx-max-namespaces-for-diagnostic-help
 The maximum number of namespaces to consult for suggestions when C++
 name lookup fails for an identifier.  The default is 1000.
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..bd6dc1e 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -447,7 +447,7 @@ add_sorted_nodes (vec<symtab_node *> &next_nodes, ltrans_partition partition)
    and in-partition calls was reached.  */

 void
-lto_balanced_map (int n_lto_partitions)
+lto_balanced_map (int n_lto_partitions, int max_partition_size)
 {
   int n_nodes = 0;
   int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
@@ -511,6 +511,9 @@ lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);

   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > max_partition_size)
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
@@ -719,7 +723,8 @@ lto_balanced_map (int n_lto_partitions)
 		 best_cost, best_internal, best_i);
       /* Partition is too large, unwind into step when best cost was reached and
 	 start new partition.  */
-      if (partition->insns > 2 * partition_size)
+      if (partition->insns > 2 * partition_size
+	  || partition->insns > max_partition_size)
 	{
 	  if (best_i != i)
 	    {
diff --git a/gcc/lto/lto-partition.h b/gcc/lto/lto-partition.h
index 31e3764..b0b533b 100644
--- a/gcc/lto/lto-partition.h
+++ b/gcc/lto/lto-partition.h
@@ -35,7 +35,7 @@ extern vec<ltrans_partition> ltrans_partitions;

 void lto_1_to_1_map (void);
 void lto_max_map (void);
-void lto_balanced_map (int);
+void lto_balanced_map (int, int);
 void lto_promote_cross_file_statics (void);
 void free_ltrans_partitions (void);
 void lto_promote_statics_nonwpa (void);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 9dd513f..7663a5b 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3117,9 +3118,10 @@ do_whole_program_analysis (void)
   else if (flag_lto_partition == LTO_PARTITION_MAX)
     lto_max_map ();
   else if (flag_lto_partition == LTO_PARTITION_ONE)
-    lto_balanced_map (1);
+    lto_balanced_map (1, INT_MAX);
   else if (flag_lto_partition == LTO_PARTITION_BALANCED)
-    lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS));
+    lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS),
+		      PARAM_VALUE (MAX_PARTITION_SIZE))
   else
     gcc_unreachable ();

diff --git a/gcc/params.def b/gcc/params.def
index 9362c15..b5da384 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1029,6 +1029,11 @@ DEFPARAM (MIN_PARTITION_SIZE,
 	  "Minimal size of a partition for LTO (in estimated instructions).",
 	  1000, 0, 0)

+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  1000000, 0, INT_MAX)
+
 /* Diagnostic parameters.  */

 DEFPARAM (CXX_MAX_NAMESPACES_FOR_DIAGNOSTIC_HELP,

[-- Attachment #3: ChangeLog --]
[-- Type: application/octet-stream, Size: 457 bytes --]

2016-04-06  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	* params.def (MAX_PARTITION_SIZE): New param.
	* lto/lto-partition.h (lto_balanced_map): New parameter.	
	* lto/lto-partition.c (lto_balanced_map): New parameter
	max_partition_size.
	Check if partition size is greater than max_partition_size. 
	* lto/lto.c (do_whole_program_analysis): Adjust calls to
	lto_balanced_map() to pass 2nd argument.
	* invoke.texi: Document lto-max-partition.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-06  8:53                             ` Prathamesh Kulkarni
@ 2016-04-06  9:07                               ` Richard Biener
  2016-04-06  9:24                                 ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-06  9:07 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:

> On 6 April 2016 at 13:44, Richard Biener <rguenther@suse.de> wrote:
> > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> >
> >> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >> >> >
> >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> >> >> >> index 9eb63c2..bc0c612 100644
> >> >> >> >> --- a/gcc/lto/lto-partition.c
> >> >> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> >> >> >>    varpool_order.qsort (varpool_node_cmp);
> >> >> >> >>
> >> >> >> >>    /* Compute partition size and create the first partition.  */
> >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> >> >> >> +
> >> >> >> >>    partition_size = total_size / n_lto_partitions;
> >> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> >> +    {
> >> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> >> +     n_lto_partitions++;
> >> >> >> >> +      partition_size = total_size / n_lto_partitions;
> >> >> >> >> +    }
> >> >> >> >
> >> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> >> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> >> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
> >> >> >> > partitions only.  I suppose modify the conditional:
> >> >> >> >
> >> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
> >> >> >> >          start new partition.  */
> >> >> >> >       if (partition->insns > 2 * partition_size)
> >> >> >> >
> >> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
> >> >> >> >
> >> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
> >> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
> >> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
> >> >> >> > partitioning decisions should mostly affect compile time performance only).
> >> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
> >> >> >> > look for something smarter.
> >> >> >> >
> >> >> >> >> +
> >> >> >> >>    npartitions = 1;
> >> >> >> >>    partition = new_partition ("");
> >> >> >> >>    if (symtab->dump_file)
> >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> >> >> index 9dd513f..294b8a4 100644
> >> >> >> >> --- a/gcc/lto/lto.c
> >> >> >> >> +++ b/gcc/lto/lto.c
> >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >> >> >>    timevar_pop (TV_WHOPR_WPA);
> >> >> >> >>
> >> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
> >> >> >> >> +
> >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
> >> >> >> >> +              " be used with balanced partitioning\n");
> >> >> >> >> +
> >> >> >> >
> >> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> >> >> >> > found experimentally may be a good start. For that reason we can't really
> >> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> >> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> >> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
> >> >> >> >
> >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> >> >> Thanks for the review. I have updated the patch.
> >> >> >> Does this version look OK ?
> >> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> >> >> for default.
> >> >> >
> >> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
> >> >> > (thus roughly the number of instructions).  So with say a 8 byte
> >> >> > instruction format it is on the order of 80kB.  You'd want to have a
> >> >> > default of at least several ten times of large-unit-insns (also 10000).
> >> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> >> >> > number quite small as well (and up it by a factor of 10).
> >> >> Done in this version.
> >> >
> >> > I'd do that separately.
> >> >
> >> > Please no default parameter for lto_balanced_map (), instead change
> >> > all callers.
> >> >
> >> >> Is it OK after bootstrap+test ?
> >> >
> >> > Note that this is for stage1 only.  I'll leave approval to Honza
> >> > (also verification of the default max param - not sure if for example
> >> > chromium or firefox should/will be split to more than 32 partitions
> >> > with the patch)
> >> Removed default parameter in this version. I verified with the patch
> >> for chromium LTO build:
> >> n_lto_partitions == 32, ltrans_partitions.length() == 31
> >
> > Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
> > so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
> > unlimited) instead of a bool parameter?
> Indeed.  Instead of 0, would it be OK to pass INT_MAX as 2nd parameter in case
> of single partition, since in that case partition->insns >
> max_partition_size will never
> be true, which would effectively ignore max_partition_size.

You mean we are limited to INT_MAX partition size anyway, even on 64bit
systems? ...  (but yes, using a suitable large number works as well)

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> >> Thanks,
> >> Prathamesh
> >> >
> >> > Richard.
> >> >
> >> >> Thanks,
> >> >> Prathamesh
> >> >> >
> >> >> > Richard.
> >> >> >
> >> >> >> I have a silly question about partitioning: Does it hamper
> >> >> >> transformations on ipa optimizations if caller and
> >> >> >> callee get placed in separate partitions ? For instance if callee is
> >> >> >> supposed to be inlined
> >> >> >> into caller, would inlining still take place if callee and caller get
> >> >> >> placed in separate partitions ?
> >> >> >> I tried with a trivial example with -flto-partition=max
> >> >> >> which created 3 partitions for 3 functions (bar, foo and main), and it was
> >> >> >> able to inline bar into foo and foo into main.  I am not sure how that happens.
> >> >> >> I thought ltrans can perform transformations on functions only within
> >> >> >> a single partition
> >> >> >> and not across partitions ?
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Prathamesh
> >> >> >> >
> >> >> >> > Honza
> >> >> >>
> >> >> >
> >> >> > --
> >> >> > Richard Biener <rguenther@suse.de>
> >> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> >> >>
> >> >
> >> > --
> >> > Richard Biener <rguenther@suse.de>
> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> >>
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-06  9:07                               ` Richard Biener
@ 2016-04-06  9:24                                 ` Richard Biener
  2016-04-25 11:58                                   ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-06  9:24 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On Wed, 6 Apr 2016, Richard Biener wrote:

> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> 
> > On 6 April 2016 at 13:44, Richard Biener <rguenther@suse.de> wrote:
> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> > >
> > >> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> > >> >
> > >> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> > >> >> >
> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> > >> >> >> >
> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> > >> >> >> >> index 9eb63c2..bc0c612 100644
> > >> >> >> >> --- a/gcc/lto/lto-partition.c
> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> > >> >> >> >>    varpool_order.qsort (varpool_node_cmp);
> > >> >> >> >>
> > >> >> >> >>    /* Compute partition size and create the first partition.  */
> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> > >> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> > >> >> >> >> +
> > >> >> >> >>    partition_size = total_size / n_lto_partitions;
> > >> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> > >> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> > >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> > >> >> >> >> +    {
> > >> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> > >> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> > >> >> >> >> +     n_lto_partitions++;
> > >> >> >> >> +      partition_size = total_size / n_lto_partitions;
> > >> >> >> >> +    }
> > >> >> >> >
> > >> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> > >> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> > >> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
> > >> >> >> > partitions only.  I suppose modify the conditional:
> > >> >> >> >
> > >> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
> > >> >> >> >          start new partition.  */
> > >> >> >> >       if (partition->insns > 2 * partition_size)
> > >> >> >> >
> > >> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
> > >> >> >> >
> > >> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
> > >> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
> > >> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
> > >> >> >> > partitioning decisions should mostly affect compile time performance only).
> > >> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
> > >> >> >> > look for something smarter.
> > >> >> >> >
> > >> >> >> >> +
> > >> >> >> >>    npartitions = 1;
> > >> >> >> >>    partition = new_partition ("");
> > >> >> >> >>    if (symtab->dump_file)
> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> > >> >> >> >> index 9dd513f..294b8a4 100644
> > >> >> >> >> --- a/gcc/lto/lto.c
> > >> >> >> >> +++ b/gcc/lto/lto.c
> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> > >> >> >> >>    timevar_pop (TV_WHOPR_WPA);
> > >> >> >> >>
> > >> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
> > >> >> >> >> +
> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> > >> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> > >> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
> > >> >> >> >> +              " be used with balanced partitioning\n");
> > >> >> >> >> +
> > >> >> >> >
> > >> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> > >> >> >> > found experimentally may be a good start. For that reason we can't really
> > >> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> > >> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> > >> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
> > >> >> >> >
> > >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> > >> >> >> Thanks for the review. I have updated the patch.
> > >> >> >> Does this version look OK ?
> > >> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
> > >> >> >> for default.
> > >> >> >
> > >> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
> > >> >> > (thus roughly the number of instructions).  So with say a 8 byte
> > >> >> > instruction format it is on the order of 80kB.  You'd want to have a
> > >> >> > default of at least several ten times of large-unit-insns (also 10000).
> > >> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> > >> >> > number quite small as well (and up it by a factor of 10).
> > >> >> Done in this version.
> > >> >
> > >> > I'd do that separately.
> > >> >
> > >> > Please no default parameter for lto_balanced_map (), instead change
> > >> > all callers.
> > >> >
> > >> >> Is it OK after bootstrap+test ?
> > >> >
> > >> > Note that this is for stage1 only.  I'll leave approval to Honza
> > >> > (also verification of the default max param - not sure if for example
> > >> > chromium or firefox should/will be split to more than 32 partitions
> > >> > with the patch)
> > >> Removed default parameter in this version. I verified with the patch
> > >> for chromium LTO build:
> > >> n_lto_partitions == 32, ltrans_partitions.length() == 31
> > >
> > > Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
> > > so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
> > > unlimited) instead of a bool parameter?
> > Indeed.  Instead of 0, would it be OK to pass INT_MAX as 2nd parameter in case
> > of single partition, since in that case partition->insns >
> > max_partition_size will never
> > be true, which would effectively ignore max_partition_size.
> 
> You mean we are limited to INT_MAX partition size anyway, even on 64bit
> systems? ...  (but yes, using a suitable large number works as well)

Ah, even 'total_size' is an int ... I wonder what this means for LTOing
a -mcmodel=large app (that really needs the large model).

Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-06  9:24                                 ` Richard Biener
@ 2016-04-25 11:58                                   ` Prathamesh Kulkarni
  2016-04-26 11:01                                     ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-25 11:58 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

[-- Attachment #1: Type: text/plain, Size: 7052 bytes --]

On 6 April 2016 at 14:54, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 6 Apr 2016, Richard Biener wrote:
>
>> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>>
>> > On 6 April 2016 at 13:44, Richard Biener <rguenther@suse.de> wrote:
>> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>> > >
>> > >> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
>> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> > >> >
>> > >> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
>> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> > >> >> >
>> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
>> > >> >> >> >
>> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> > >> >> >> >> index 9eb63c2..bc0c612 100644
>> > >> >> >> >> --- a/gcc/lto/lto-partition.c
>> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
>> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> > >> >> >> >>    varpool_order.qsort (varpool_node_cmp);
>> > >> >> >> >>
>> > >> >> >> >>    /* Compute partition size and create the first partition.  */
>> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> > >> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> > >> >> >> >> +
>> > >> >> >> >>    partition_size = total_size / n_lto_partitions;
>> > >> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> > >> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> > >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> > >> >> >> >> +    {
>> > >> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> > >> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> > >> >> >> >> +     n_lto_partitions++;
>> > >> >> >> >> +      partition_size = total_size / n_lto_partitions;
>> > >> >> >> >> +    }
>> > >> >> >> >
>> > >> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
>> > >> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
>> > >> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
>> > >> >> >> > partitions only.  I suppose modify the conditional:
>> > >> >> >> >
>> > >> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
>> > >> >> >> >          start new partition.  */
>> > >> >> >> >       if (partition->insns > 2 * partition_size)
>> > >> >> >> >
>> > >> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
>> > >> >> >> >
>> > >> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
>> > >> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
>> > >> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
>> > >> >> >> > partitioning decisions should mostly affect compile time performance only).
>> > >> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
>> > >> >> >> > look for something smarter.
>> > >> >> >> >
>> > >> >> >> >> +
>> > >> >> >> >>    npartitions = 1;
>> > >> >> >> >>    partition = new_partition ("");
>> > >> >> >> >>    if (symtab->dump_file)
>> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> > >> >> >> >> index 9dd513f..294b8a4 100644
>> > >> >> >> >> --- a/gcc/lto/lto.c
>> > >> >> >> >> +++ b/gcc/lto/lto.c
>> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> > >> >> >> >>    timevar_pop (TV_WHOPR_WPA);
>> > >> >> >> >>
>> > >> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
>> > >> >> >> >> +
>> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> > >> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> > >> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
>> > >> >> >> >> +              " be used with balanced partitioning\n");
>> > >> >> >> >> +
>> > >> >> >> >
>> > >> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
>> > >> >> >> > found experimentally may be a good start. For that reason we can't really
>> > >> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
>> > >> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
>> > >> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
>> > >> >> >> >
>> > >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
>> > >> >> >> Thanks for the review. I have updated the patch.
>> > >> >> >> Does this version look OK ?
>> > >> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
>> > >> >> >> for default.
>> > >> >> >
>> > >> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
>> > >> >> > (thus roughly the number of instructions).  So with say a 8 byte
>> > >> >> > instruction format it is on the order of 80kB.  You'd want to have a
>> > >> >> > default of at least several ten times of large-unit-insns (also 10000).
>> > >> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
>> > >> >> > number quite small as well (and up it by a factor of 10).
>> > >> >> Done in this version.
>> > >> >
>> > >> > I'd do that separately.
>> > >> >
>> > >> > Please no default parameter for lto_balanced_map (), instead change
>> > >> > all callers.
>> > >> >
>> > >> >> Is it OK after bootstrap+test ?
>> > >> >
>> > >> > Note that this is for stage1 only.  I'll leave approval to Honza
>> > >> > (also verification of the default max param - not sure if for example
>> > >> > chromium or firefox should/will be split to more than 32 partitions
>> > >> > with the patch)
>> > >> Removed default parameter in this version. I verified with the patch
>> > >> for chromium LTO build:
>> > >> n_lto_partitions == 32, ltrans_partitions.length() == 31
>> > >
>> > > Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
>> > > so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
>> > > unlimited) instead of a bool parameter?
>> > Indeed.  Instead of 0, would it be OK to pass INT_MAX as 2nd parameter in case
>> > of single partition, since in that case partition->insns >
>> > max_partition_size will never
>> > be true, which would effectively ignore max_partition_size.
>>
>> You mean we are limited to INT_MAX partition size anyway, even on 64bit
>> systems? ...  (but yes, using a suitable large number works as well)
>
> Ah, even 'total_size' is an int ... I wonder what this means for LTOing
> a -mcmodel=large app (that really needs the large model).
Hi,
Is the attached patch OK for trunk now ?
Bootstrapped and tested on x86_64-unknown-linux-gnu.
Cross tested on arm*-*-* and aarch64*-*-*.

Thanks,
Prathamesh
>
> Richard.

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 3650 bytes --]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 821f8fd..4afa32c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9477,6 +9477,11 @@ Size of minimal partition for WHOPR (in estimated instructions).
 This prevents expenses of splitting very small programs into too many
 partitions.
 
+@item lto-max-partition
+Size of max partition for WHOPR (in estimated instructions).
+to provide an upper bound for individual size of partition.
+Meant to be used only with balanced partitioning.
+
 @item cxx-max-namespaces-for-diagnostic-help
 The maximum number of namespaces to consult for suggestions when C++
 name lookup fails for an identifier.  The default is 1000.
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..c191d24 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -447,7 +447,7 @@ add_sorted_nodes (vec<symtab_node *> &next_nodes, ltrans_partition partition)
    and in-partition calls was reached.  */
 
 void
-lto_balanced_map (int n_lto_partitions)
+lto_balanced_map (int n_lto_partitions, int max_partition_size)
 {
   int n_nodes = 0;
   int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
@@ -511,6 +511,9 @@ lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);
 
   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > max_partition_size)
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
@@ -719,7 +722,8 @@ lto_balanced_map (int n_lto_partitions)
 		 best_cost, best_internal, best_i);
       /* Partition is too large, unwind into step when best cost was reached and
 	 start new partition.  */
-      if (partition->insns > 2 * partition_size)
+      if (partition->insns > 2 * partition_size
+	  || partition->insns > max_partition_size)
 	{
 	  if (best_i != i)
 	    {
diff --git a/gcc/lto/lto-partition.h b/gcc/lto/lto-partition.h
index 31e3764..f7abe62 100644
--- a/gcc/lto/lto-partition.h
+++ b/gcc/lto/lto-partition.h
@@ -35,7 +35,7 @@ extern vec<ltrans_partition> ltrans_partitions;
 
 void lto_1_to_1_map (void);
 void lto_max_map (void);
-void lto_balanced_map (int);
+void lto_balanced_map (int, int);
 void lto_promote_cross_file_statics (void);
 void free_ltrans_partitions (void);
 void lto_promote_statics_nonwpa (void);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 9dd513f..af735cb 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3117,9 +3117,10 @@ do_whole_program_analysis (void)
   else if (flag_lto_partition == LTO_PARTITION_MAX)
     lto_max_map ();
   else if (flag_lto_partition == LTO_PARTITION_ONE)
-    lto_balanced_map (1);
+    lto_balanced_map (1, INT_MAX);
   else if (flag_lto_partition == LTO_PARTITION_BALANCED)
-    lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS));
+    lto_balanced_map (PARAM_VALUE (PARAM_LTO_PARTITIONS),
+		      PARAM_VALUE (MAX_PARTITION_SIZE));
   else
     gcc_unreachable ();
 
diff --git a/gcc/params.def b/gcc/params.def
index dbff305..eceee32 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1029,6 +1029,11 @@ DEFPARAM (MIN_PARTITION_SIZE,
 	  "Minimal size of a partition for LTO (in estimated instructions).",
 	  1000, 0, 0)
 
+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  1000000, 0, INT_MAX)
+
 /* Diagnostic parameters.  */
 
 DEFPARAM (CXX_MAX_NAMESPACES_FOR_DIAGNOSTIC_HELP,

[-- Attachment #3: ChangeLog --]
[-- Type: application/octet-stream, Size: 457 bytes --]

2016-04-25  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	* params.def (MAX_PARTITION_SIZE): New param.
	* lto/lto-partition.h (lto_balanced_map): New parameter.	
	* lto/lto-partition.c (lto_balanced_map): New parameter
	max_partition_size.
	Check if partition size is greater than max_partition_size. 
	* lto/lto.c (do_whole_program_analysis): Adjust calls to
	lto_balanced_map() to pass 2nd argument.
	* invoke.texi: Document lto-max-partition.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-25 11:58                                   ` Prathamesh Kulkarni
@ 2016-04-26 11:01                                     ` Richard Biener
  2016-04-26 20:45                                       ` Prathamesh Kulkarni
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2016-04-26 11:01 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On Mon, 25 Apr 2016, Prathamesh Kulkarni wrote:

> On 6 April 2016 at 14:54, Richard Biener <rguenther@suse.de> wrote:
> > On Wed, 6 Apr 2016, Richard Biener wrote:
> >
> >> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >>
> >> > On 6 April 2016 at 13:44, Richard Biener <rguenther@suse.de> wrote:
> >> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >> > >
> >> > >> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
> >> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> > >> >
> >> > >> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> >> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> > >> >> >
> >> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> > >> >> >> >
> >> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> > >> >> >> >> index 9eb63c2..bc0c612 100644
> >> > >> >> >> >> --- a/gcc/lto/lto-partition.c
> >> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
> >> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> > >> >> >> >>    varpool_order.qsort (varpool_node_cmp);
> >> > >> >> >> >>
> >> > >> >> >> >>    /* Compute partition size and create the first partition.  */
> >> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> > >> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> > >> >> >> >> +
> >> > >> >> >> >>    partition_size = total_size / n_lto_partitions;
> >> > >> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> > >> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> > >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> > >> >> >> >> +    {
> >> > >> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> > >> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> > >> >> >> >> +     n_lto_partitions++;
> >> > >> >> >> >> +      partition_size = total_size / n_lto_partitions;
> >> > >> >> >> >> +    }
> >> > >> >> >> >
> >> > >> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> >> > >> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> >> > >> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
> >> > >> >> >> > partitions only.  I suppose modify the conditional:
> >> > >> >> >> >
> >> > >> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
> >> > >> >> >> >          start new partition.  */
> >> > >> >> >> >       if (partition->insns > 2 * partition_size)
> >> > >> >> >> >
> >> > >> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
> >> > >> >> >> >
> >> > >> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
> >> > >> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
> >> > >> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
> >> > >> >> >> > partitioning decisions should mostly affect compile time performance only).
> >> > >> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
> >> > >> >> >> > look for something smarter.
> >> > >> >> >> >
> >> > >> >> >> >> +
> >> > >> >> >> >>    npartitions = 1;
> >> > >> >> >> >>    partition = new_partition ("");
> >> > >> >> >> >>    if (symtab->dump_file)
> >> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> > >> >> >> >> index 9dd513f..294b8a4 100644
> >> > >> >> >> >> --- a/gcc/lto/lto.c
> >> > >> >> >> >> +++ b/gcc/lto/lto.c
> >> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> > >> >> >> >>    timevar_pop (TV_WHOPR_WPA);
> >> > >> >> >> >>
> >> > >> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
> >> > >> >> >> >> +
> >> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> > >> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> > >> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
> >> > >> >> >> >> +              " be used with balanced partitioning\n");
> >> > >> >> >> >> +
> >> > >> >> >> >
> >> > >> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> >> > >> >> >> > found experimentally may be a good start. For that reason we can't really
> >> > >> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> >> > >> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> >> > >> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
> >> > >> >> >> >
> >> > >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> > >> >> >> Thanks for the review. I have updated the patch.
> >> > >> >> >> Does this version look OK ?
> >> > >> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> > >> >> >> for default.
> >> > >> >> >
> >> > >> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
> >> > >> >> > (thus roughly the number of instructions).  So with say a 8 byte
> >> > >> >> > instruction format it is on the order of 80kB.  You'd want to have a
> >> > >> >> > default of at least several ten times of large-unit-insns (also 10000).
> >> > >> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> >> > >> >> > number quite small as well (and up it by a factor of 10).
> >> > >> >> Done in this version.
> >> > >> >
> >> > >> > I'd do that separately.
> >> > >> >
> >> > >> > Please no default parameter for lto_balanced_map (), instead change
> >> > >> > all callers.
> >> > >> >
> >> > >> >> Is it OK after bootstrap+test ?
> >> > >> >
> >> > >> > Note that this is for stage1 only.  I'll leave approval to Honza
> >> > >> > (also verification of the default max param - not sure if for example
> >> > >> > chromium or firefox should/will be split to more than 32 partitions
> >> > >> > with the patch)
> >> > >> Removed default parameter in this version. I verified with the patch
> >> > >> for chromium LTO build:
> >> > >> n_lto_partitions == 32, ltrans_partitions.length() == 31
> >> > >
> >> > > Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
> >> > > so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
> >> > > unlimited) instead of a bool parameter?
> >> > Indeed.  Instead of 0, would it be OK to pass INT_MAX as 2nd parameter in case
> >> > of single partition, since in that case partition->insns >
> >> > max_partition_size will never
> >> > be true, which would effectively ignore max_partition_size.
> >>
> >> You mean we are limited to INT_MAX partition size anyway, even on 64bit
> >> systems? ...  (but yes, using a suitable large number works as well)
> >
> > Ah, even 'total_size' is an int ... I wonder what this means for LTOing
> > a -mcmodel=large app (that really needs the large model).
> Hi,
> Is the attached patch OK for trunk now ?
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> Cross tested on arm*-*-* and aarch64*-*-*.

Ok.  How many partitions do we generate for linking cc1 with 
bootstrap-lto now?

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-26 11:01                                     ` Richard Biener
@ 2016-04-26 20:45                                       ` Prathamesh Kulkarni
  2016-04-27  7:28                                         ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Prathamesh Kulkarni @ 2016-04-26 20:45 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On 26 April 2016 at 16:31, Richard Biener <rguenther@suse.de> wrote:
> On Mon, 25 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 6 April 2016 at 14:54, Richard Biener <rguenther@suse.de> wrote:
>> > On Wed, 6 Apr 2016, Richard Biener wrote:
>> >
>> >> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>> >>
>> >> > On 6 April 2016 at 13:44, Richard Biener <rguenther@suse.de> wrote:
>> >> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>> >> > >
>> >> > >> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
>> >> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >> > >> >
>> >> > >> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
>> >> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >> > >> >> >
>> >> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> > >> >> >> >
>> >> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> >> > >> >> >> >> index 9eb63c2..bc0c612 100644
>> >> > >> >> >> >> --- a/gcc/lto/lto-partition.c
>> >> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
>> >> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> >> > >> >> >> >>    varpool_order.qsort (varpool_node_cmp);
>> >> > >> >> >> >>
>> >> > >> >> >> >>    /* Compute partition size and create the first partition.  */
>> >> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> > >> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> >> > >> >> >> >> +
>> >> > >> >> >> >>    partition_size = total_size / n_lto_partitions;
>> >> > >> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> >> > >> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> >> > >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> > >> >> >> >> +    {
>> >> > >> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> >> > >> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> > >> >> >> >> +     n_lto_partitions++;
>> >> > >> >> >> >> +      partition_size = total_size / n_lto_partitions;
>> >> > >> >> >> >> +    }
>> >> > >> >> >> >
>> >> > >> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
>> >> > >> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
>> >> > >> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
>> >> > >> >> >> > partitions only.  I suppose modify the conditional:
>> >> > >> >> >> >
>> >> > >> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
>> >> > >> >> >> >          start new partition.  */
>> >> > >> >> >> >       if (partition->insns > 2 * partition_size)
>> >> > >> >> >> >
>> >> > >> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
>> >> > >> >> >> >
>> >> > >> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
>> >> > >> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
>> >> > >> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
>> >> > >> >> >> > partitioning decisions should mostly affect compile time performance only).
>> >> > >> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
>> >> > >> >> >> > look for something smarter.
>> >> > >> >> >> >
>> >> > >> >> >> >> +
>> >> > >> >> >> >>    npartitions = 1;
>> >> > >> >> >> >>    partition = new_partition ("");
>> >> > >> >> >> >>    if (symtab->dump_file)
>> >> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> >> > >> >> >> >> index 9dd513f..294b8a4 100644
>> >> > >> >> >> >> --- a/gcc/lto/lto.c
>> >> > >> >> >> >> +++ b/gcc/lto/lto.c
>> >> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> >> > >> >> >> >>    timevar_pop (TV_WHOPR_WPA);
>> >> > >> >> >> >>
>> >> > >> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
>> >> > >> >> >> >> +
>> >> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> >> > >> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> >> > >> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
>> >> > >> >> >> >> +              " be used with balanced partitioning\n");
>> >> > >> >> >> >> +
>> >> > >> >> >> >
>> >> > >> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
>> >> > >> >> >> > found experimentally may be a good start. For that reason we can't really
>> >> > >> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
>> >> > >> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
>> >> > >> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
>> >> > >> >> >> >
>> >> > >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
>> >> > >> >> >> Thanks for the review. I have updated the patch.
>> >> > >> >> >> Does this version look OK ?
>> >> > >> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
>> >> > >> >> >> for default.
>> >> > >> >> >
>> >> > >> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
>> >> > >> >> > (thus roughly the number of instructions).  So with say a 8 byte
>> >> > >> >> > instruction format it is on the order of 80kB.  You'd want to have a
>> >> > >> >> > default of at least several ten times of large-unit-insns (also 10000).
>> >> > >> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
>> >> > >> >> > number quite small as well (and up it by a factor of 10).
>> >> > >> >> Done in this version.
>> >> > >> >
>> >> > >> > I'd do that separately.
>> >> > >> >
>> >> > >> > Please no default parameter for lto_balanced_map (), instead change
>> >> > >> > all callers.
>> >> > >> >
>> >> > >> >> Is it OK after bootstrap+test ?
>> >> > >> >
>> >> > >> > Note that this is for stage1 only.  I'll leave approval to Honza
>> >> > >> > (also verification of the default max param - not sure if for example
>> >> > >> > chromium or firefox should/will be split to more than 32 partitions
>> >> > >> > with the patch)
>> >> > >> Removed default parameter in this version. I verified with the patch
>> >> > >> for chromium LTO build:
>> >> > >> n_lto_partitions == 32, ltrans_partitions.length() == 31
>> >> > >
>> >> > > Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
>> >> > > so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
>> >> > > unlimited) instead of a bool parameter?
>> >> > Indeed.  Instead of 0, would it be OK to pass INT_MAX as 2nd parameter in case
>> >> > of single partition, since in that case partition->insns >
>> >> > max_partition_size will never
>> >> > be true, which would effectively ignore max_partition_size.
>> >>
>> >> You mean we are limited to INT_MAX partition size anyway, even on 64bit
>> >> systems? ...  (but yes, using a suitable large number works as well)
>> >
>> > Ah, even 'total_size' is an int ... I wonder what this means for LTOing
>> > a -mcmodel=large app (that really needs the large model).
>> Hi,
>> Is the attached patch OK for trunk now ?
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> Cross tested on arm*-*-* and aarch64*-*-*.
>
> Ok.  How many partitions do we generate for linking cc1 with
> bootstrap-lto now?
No difference with patch in number of partitions:
ltrans_partitions.length() == 31, n_lto_partitions == 32.
Should I commit it ?

Thanks,
Prathamesh
>
> Thanks,
> Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
  2016-04-26 20:45                                       ` Prathamesh Kulkarni
@ 2016-04-27  7:28                                         ` Richard Biener
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2016-04-27  7:28 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, gcc Patches, Ramana Radhakrishnan

On Wed, 27 Apr 2016, Prathamesh Kulkarni wrote:

> On 26 April 2016 at 16:31, Richard Biener <rguenther@suse.de> wrote:
> > On Mon, 25 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 6 April 2016 at 14:54, Richard Biener <rguenther@suse.de> wrote:
> >> > On Wed, 6 Apr 2016, Richard Biener wrote:
> >> >
> >> >> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >> >>
> >> >> > On 6 April 2016 at 13:44, Richard Biener <rguenther@suse.de> wrote:
> >> >> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> > >
> >> >> > >> On 5 April 2016 at 18:28, Richard Biener <rguenther@suse.de> wrote:
> >> >> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> > >> >
> >> >> > >> >> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> >> >> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> > >> >> >
> >> >> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >> > >> >> >> >
> >> >> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> >> > >> >> >> >> index 9eb63c2..bc0c612 100644
> >> >> > >> >> >> >> --- a/gcc/lto/lto-partition.c
> >> >> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> >> > >> >> >> >>    varpool_order.qsort (varpool_node_cmp);
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>    /* Compute partition size and create the first partition.  */
> >> >> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> > >> >> >> >> +    fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> >> > >> >> >> >> +
> >> >> > >> >> >> >>    partition_size = total_size / n_lto_partitions;
> >> >> > >> >> >> >>    if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >> > >> >> >> >>      partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> > >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> > >> >> >> >> +    {
> >> >> > >> >> >> >> +      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> >> > >> >> >> >> +      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> > >> >> >> >> +     n_lto_partitions++;
> >> >> > >> >> >> >> +      partition_size = total_size / n_lto_partitions;
> >> >> > >> >> >> >> +    }
> >> >> > >> >> >> >
> >> >> > >> >> >> > lto_balanced_map actually works in a way that looks for cheapest cutpoint in range
> >> >> > >> >> >> > 3/4*parittion_size to 2*partition_size and picks the cheapest range.
> >> >> > >> >> >> > Setting partition_size to this value will thus not cause partitioner to produce smaller
> >> >> > >> >> >> > partitions only.  I suppose modify the conditional:
> >> >> > >> >> >> >
> >> >> > >> >> >> >       /* Partition is too large, unwind into step when best cost was reached and
> >> >> > >> >> >> >          start new partition.  */
> >> >> > >> >> >> >       if (partition->insns > 2 * partition_size)
> >> >> > >> >> >> >
> >> >> > >> >> >> > and/or in the code above set the partition_size to half of total_size/max_size.
> >> >> > >> >> >> >
> >> >> > >> >> >> > I know this is somewhat sloppy.  This was really just first cut implementation
> >> >> > >> >> >> > many years ago. I expected to reimplement it marter soon, but then there was
> >> >> > >> >> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
> >> >> > >> >> >> > partitioning decisions should mostly affect compile time performance only).
> >> >> > >> >> >> > If ARM is more sensitive for partitining, perhaps it would make sense to try to
> >> >> > >> >> >> > look for something smarter.
> >> >> > >> >> >> >
> >> >> > >> >> >> >> +
> >> >> > >> >> >> >>    npartitions = 1;
> >> >> > >> >> >> >>    partition = new_partition ("");
> >> >> > >> >> >> >>    if (symtab->dump_file)
> >> >> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> > >> >> >> >> index 9dd513f..294b8a4 100644
> >> >> > >> >> >> >> --- a/gcc/lto/lto.c
> >> >> > >> >> >> >> +++ b/gcc/lto/lto.c
> >> >> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >> > >> >> >> >>    timevar_pop (TV_WHOPR_WPA);
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>    timevar_push (TV_WHOPR_PARTITIONING);
> >> >> > >> >> >> >> +
> >> >> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> > >> >> >> >> +      && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> > >> >> >> >> +    fatal_error (input_location, "--param max-lto-partition should only"
> >> >> > >> >> >> >> +              " be used with balanced partitioning\n");
> >> >> > >> >> >> >> +
> >> >> > >> >> >> >
> >> >> > >> >> >> > I think we should wire in resonable MAX_PARTITION_SIZE default.  THe value you
> >> >> > >> >> >> > found experimentally may be a good start. For that reason we can't really
> >> >> > >> >> >> > refuse a value when !LTO_PARTITION_BALANCED.  Just document it as parameter for
> >> >> > >> >> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> >> >> > >> >> >> > this param should be honored (because the same path is used for partitioning to one partition)
> >> >> > >> >> >> >
> >> >> > >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> >> > >> >> >> Thanks for the review. I have updated the patch.
> >> >> > >> >> >> Does this version look OK ?
> >> >> > >> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> >> > >> >> >> for default.
> >> >> > >> >> >
> >> >> > >> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
> >> >> > >> >> > (thus roughly the number of instructions).  So with say a 8 byte
> >> >> > >> >> > instruction format it is on the order of 80kB.  You'd want to have a
> >> >> > >> >> > default of at least several ten times of large-unit-insns (also 10000).
> >> >> > >> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> >> >> > >> >> > number quite small as well (and up it by a factor of 10).
> >> >> > >> >> Done in this version.
> >> >> > >> >
> >> >> > >> > I'd do that separately.
> >> >> > >> >
> >> >> > >> > Please no default parameter for lto_balanced_map (), instead change
> >> >> > >> > all callers.
> >> >> > >> >
> >> >> > >> >> Is it OK after bootstrap+test ?
> >> >> > >> >
> >> >> > >> > Note that this is for stage1 only.  I'll leave approval to Honza
> >> >> > >> > (also verification of the default max param - not sure if for example
> >> >> > >> > chromium or firefox should/will be split to more than 32 partitions
> >> >> > >> > with the patch)
> >> >> > >> Removed default parameter in this version. I verified with the patch
> >> >> > >> for chromium LTO build:
> >> >> > >> n_lto_partitions == 32, ltrans_partitions.length() == 31
> >> >> > >
> >> >> > > Just noticed that lto_balanced_map already gets PARAM_LTO_PARTITIONS,
> >> >> > > so why not pass it PARAM_MAX_PARTITION_SIZE or 0 (as magic value for
> >> >> > > unlimited) instead of a bool parameter?
> >> >> > Indeed.  Instead of 0, would it be OK to pass INT_MAX as 2nd parameter in case
> >> >> > of single partition, since in that case partition->insns >
> >> >> > max_partition_size will never
> >> >> > be true, which would effectively ignore max_partition_size.
> >> >>
> >> >> You mean we are limited to INT_MAX partition size anyway, even on 64bit
> >> >> systems? ...  (but yes, using a suitable large number works as well)
> >> >
> >> > Ah, even 'total_size' is an int ... I wonder what this means for LTOing
> >> > a -mcmodel=large app (that really needs the large model).
> >> Hi,
> >> Is the attached patch OK for trunk now ?
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >> Cross tested on arm*-*-* and aarch64*-*-*.
> >
> > Ok.  How many partitions do we generate for linking cc1 with
> > bootstrap-lto now?
> No difference with patch in number of partitions:
> ltrans_partitions.length() == 31, n_lto_partitions == 32.
> Should I commit it ?

Yes please.

Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-04-27  7:28 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-01 13:48 [RFC] introduce --param max-lto-partition for having an upper bound on partition size Prathamesh Kulkarni
2016-04-01 17:33 ` Richard Biener
2016-04-04  7:47   ` Prathamesh Kulkarni
2016-04-04  8:26     ` Richard Biener
2016-04-04 11:19       ` Prathamesh Kulkarni
2016-04-04 11:42         ` Richard Biener
2016-04-04 12:00           ` Jan Hubicka
2016-04-04 13:28             ` Prathamesh Kulkarni
2016-04-04 14:14               ` Jan Hubicka
2016-04-05 11:11                 ` Prathamesh Kulkarni
2016-04-05 11:28                   ` Richard Biener
2016-04-05 12:55                     ` Prathamesh Kulkarni
2016-04-05 12:58                       ` Richard Biener
2016-04-06  7:47                         ` Prathamesh Kulkarni
2016-04-06  8:14                           ` Richard Biener
2016-04-06  8:53                             ` Prathamesh Kulkarni
2016-04-06  9:07                               ` Richard Biener
2016-04-06  9:24                                 ` Richard Biener
2016-04-25 11:58                                   ` Prathamesh Kulkarni
2016-04-26 11:01                                     ` Richard Biener
2016-04-26 20:45                                       ` Prathamesh Kulkarni
2016-04-27  7:28                                         ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).