public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/6] Parallelize Intra-Procedural Optimizations using the LTO Engine.
@ 2020-08-20 22:00 Giuliano Belinassi
  2020-08-20 22:00 ` [PATCH 1/6] Modify gcc driver for parallel compilation Giuliano Belinassi
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Giuliano Belinassi @ 2020-08-20 22:00 UTC
  To: gcc-patches; +Cc: richard.guenther, hubicka

This patch series adds a new flag "-fparallel-jobs=" to control whether
the compiler should try to compile the current file in parallel.

Three modes are supported for now:

1. -fparallel-jobs=<N>: Try to compile the file using a maximum of N
jobs.

2. -fparallel-jobs=jobserver: Check if there is a running GNU Make
Jobserver. If so, communicate with it in order to launch jobs.
Otherwise, alert the user that the jobserver was not found, since this
mode requires modifications in the project Makefile.

3. -fparallel-jobs=auto: Same as 2., but quietly fall back to a maximum
of 2 jobs if the jobserver was not found.
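
The three modes can be pictured with a small C sketch.  It is not taken
from the patches: every identifier below is hypothetical, and the
jobserver check is an assumption that only looks for the
"--jobserver-auth=" (or older "--jobserver-fds=") entry that GNU Make
places in MAKEFLAGS, without talking to the jobserver itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names: none of these identifiers come from the patch.  */
enum parallel_mode { PJ_INVALID, PJ_FIXED, PJ_JOBSERVER, PJ_AUTO };

struct parallel_request
{
  enum parallel_mode mode;
  long max_jobs;   /* Only meaningful for PJ_FIXED.  */
};

/* Parse the text following "-fparallel-jobs=".  */
static struct parallel_request
parse_parallel_jobs (const char *arg)
{
  struct parallel_request req = { PJ_INVALID, 0 };
  char *end;
  long n;

  assert (arg != NULL);
  if (strcmp (arg, "jobserver") == 0)
    req.mode = PJ_JOBSERVER;
  else if (strcmp (arg, "auto") == 0)
    req.mode = PJ_AUTO;
  else
    {
      errno = 0;
      n = strtol (arg, &end, 10);
      /* Accept only a complete, positive decimal number.  */
      if (errno == 0 && end != arg && *end == '\0' && n > 0)
        {
          req.mode = PJ_FIXED;
          req.max_jobs = n;
        }
    }
  return req;
}

/* Return nonzero if MAKEFLAGS advertises a GNU Make jobserver.  */
static int
jobserver_advertised (const char *makeflags)
{
  return makeflags != NULL
         && (strstr (makeflags, "--jobserver-auth=") != NULL
             || strstr (makeflags, "--jobserver-fds=") != NULL);
}

/* Resolve a request to a job limit: N for a fixed request, 0 meaning
   "ask the jobserver", 2 for the quiet auto fallback, and -1 for an
   error the caller should report to the user.  */
static long
resolve_job_limit (struct parallel_request req, const char *makeflags)
{
  switch (req.mode)
    {
    case PJ_FIXED:
      return req.max_jobs;
    case PJ_JOBSERVER:
      return jobserver_advertised (makeflags) ? 0 : -1;
    case PJ_AUTO:
      return jobserver_advertised (makeflags) ? 0 : 2;
    default:
      return -1;
    }
}
```

A caller would pass getenv ("MAKEFLAGS") as the second argument.  The
only difference between modes 2 and 3 in this sketch is the return value
when no jobserver is advertised: an error in one case, a limit of 2 in
the other.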

The parallelization uses a modified LTO engine: no IR is dumped to
disk, and a new partitioner is employed to find symbols which must be
partitioned together.
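
One way to picture "symbols which must be partitioned together" is as
connected components over pairs of symbols that cannot be separated.
The sketch below uses a plain union-find structure for that.  It is an
illustration only: this letter does not state which relations force two
symbols into the same partition, so the pairs are left to the caller,
and all names and the MAX_SYMBOLS limit are hypothetical.

```c
#include <assert.h>

#define MAX_SYMBOLS 64   /* Arbitrary limit for the sketch.  */

/* parent[i] leads towards the representative of symbol i's group.  */
static int parent[MAX_SYMBOLS];

/* Start with every symbol in a group of its own.  */
static void
groups_init (int n)
{
  assert (n >= 0 && n <= MAX_SYMBOLS);
  for (int i = 0; i < n; i++)
    parent[i] = i;
}

/* Return the representative of SYM's group.  */
static int
group_of (int sym)
{
  while (parent[sym] != sym)
    {
      parent[sym] = parent[parent[sym]];   /* Path halving.  */
      sym = parent[sym];
    }
  return sym;
}

/* Record that symbols A and B must land in the same partition.  */
static void
must_stay_together (int a, int b)
{
  int ra = group_of (a);
  int rb = group_of (b);
  if (ra != rb)
    parent[rb] = ra;
}

/* Count the groups among the first N symbols.  */
static int
count_groups (int n)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    if (group_of (i) == i)
      count++;
  return count;
}
```

If count_groups returns 1 after all constraints are recorded, every
symbol is tied to every other one and there is no point at which the
file could be split.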

The parallelism feature is implemented in four steps:

1. The driver will pass a hidden -fsplit-outputs=<filename> to cc1*.

2. After IPA, cc1* will search for symbols which must be partitioned
together.  If the user allows GCC to automatically promote symbols to
globals through "--param=promote-statics=1" for better parallel
compilation performance, that promotion will also be done.  However, if
cc1* decides that partitioning is a bad idea, it will continue with the
default serial compilation, and the additional <filename> will not be
created.  It will avoid compiling in parallel if and only if:

  * File size does not exceed the minimum partition size specified by
  the LTO default --param=lto-min-partition.

  * The partitioner is unable to find any point of partitioning in the
  file.

3. cc1* will fork itself, one fork for each partition. Each child
process will apply the partition mask generated for it by the
partitioner and write the name of its new assembler file to the
<filename> given by the driver.

4. The driver will open each file and partially link them together into
a single .o file if -c was requested, else into a binary.  -S and -E
are unsupported for now and will probably remain so.
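
Step 3 can be sketched with plain POSIX calls.  This is a minimal
illustration under stated assumptions, not the code from patch 3: the
function names, the "<prefix>.<i>.s" naming scheme and the placeholder
file contents are all hypothetical, and each child simply appends the
name of its output to the list file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: write a stand-in assembler file for partition PART and
   append its name to LIST_PATH.  Returns 0 on success, 1 on failure.  */
static int
emit_partition (int part, const char *prefix, const char *list_path)
{
  char line[256];
  int len = snprintf (line, sizeof line, "%s.%d.s\n", prefix, part);
  if (len < 0 || (size_t) len >= sizeof line)
    return 1;

  line[len - 1] = '\0';   /* Drop the newline to get the file name.  */
  FILE *out = fopen (line, "w");
  if (out == NULL)
    return 1;
  fprintf (out, "\t# code for partition %d would go here\n", part);
  if (fclose (out) != 0)
    return 1;
  line[len - 1] = '\n';

  /* O_APPEND makes each write land at the current end of the file, so
     children do not overwrite each other's lines.  */
  int fd = open (list_path, O_WRONLY | O_APPEND | O_CREAT, 0644);
  if (fd < 0)
    return 1;
  ssize_t written = write (fd, line, (size_t) len);
  close (fd);
  return written == (ssize_t) len ? 0 : 1;
}

/* Parent side: fork one child per partition and wait for all of them.
   Returns 0 if every child succeeded, -1 otherwise.  */
static int
compile_partitions (int n_partitions, const char *prefix,
                    const char *list_path)
{
  assert (n_partitions > 0);
  int spawned = 0;
  int failed = 0;

  for (int i = 0; i < n_partitions; i++)
    {
      pid_t pid = fork ();
      if (pid < 0)
        {
          failed = 1;
          break;
        }
      if (pid == 0)
        _exit (emit_partition (i, prefix, list_path));
      spawned++;
    }

  /* Reap every child that was started, even after a failed fork.  */
  for (int i = 0; i < spawned; i++)
    {
      int status;
      if (wait (&status) < 0 || !WIFEXITED (status)
          || WEXITSTATUS (status) != 0)
        failed = 1;
    }
  return failed ? -1 : 0;
}
```

Because the children run concurrently, the order of names in the list
file is not deterministic.  A consumer such as the driver in step 4
would have to treat the list as a set of files to assemble and link
rather than rely on their order.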


Speedups ranged from 0.95x to 1.9x on a Quad-Core Intel Core-i7 8565U
when testing with two files from the GCC source tree, as shown in the
following table.  Each timing is the result of a single execution
preceded by a warm-up execution.  The GCC used for the test had checking
enabled, so a release build might show better timings in both the
sequential and the parallel case, but the speedup may remain the same.

|                |            | Without Static | With Static |   Max   |
| File           | Sequential |    Promotion   |  Promotion  | Speedup |
|----------------|------------|----------------|-------------|---------|
| gimple-match.c |     60s    |       63s      |     34s     |   1.7x  |
| insn-emit.c    |     37s    |       19s      |     20s     |   1.9x  |

Notice that enabling it causes a slowdown in some cases, which is why
the parallelism feature is enabled with a flag for now.
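
For reference, the speedup figures follow from the timings in the table
as sequential time divided by parallel time.  The helper names below are
hypothetical and only restate that arithmetic.

```c
#include <assert.h>

/* Speedup of a parallel run relative to the sequential one; values
   below 1.0 are slowdowns.  */
static double
speedup (double sequential_s, double parallel_s)
{
  assert (sequential_s > 0.0 && parallel_s > 0.0);
  return sequential_s / parallel_s;
}

/* Larger of the two speedups measured for one file.  */
static double
max_speedup (double sequential_s, double without_promotion_s,
             double with_promotion_s)
{
  double a = speedup (sequential_s, without_promotion_s);
  double b = speedup (sequential_s, with_promotion_s);
  return a > b ? a : b;
}
```

With the timings above, speedup (60, 63) is about 0.95 (the slowdown for
gimple-match.c without static promotion), max_speedup (60, 63, 34) is
about 1.76 and max_speedup (37, 19, 20) is about 1.95.  The "Max
Speedup" column therefore appears to truncate these values to one
decimal rather than round them.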

Bootstrapped and Regtested on Linux x86_64.

Giuliano Belinassi (6):
  Modify gcc driver for parallel compilation
  Implement a new partitioner for parallel compilation
  Implement fork-based parallelism engine
  Add `+' for Jobserver Integration
  Add invoke documentation
  New tests for parallel compilation feature

 gcc/Makefile.in                               |    6 +-
 gcc/cgraph.c                                  |   16 +
 gcc/cgraph.h                                  |   13 +
 gcc/cgraphunit.c                              |  198 ++-
 gcc/common.opt                                |    4 +
 gcc/doc/invoke.texi                           |   32 +-
 gcc/gcc.c                                     | 1219 +++++++++++++----
 gcc/ipa-fnsummary.c                           |    2 +-
 gcc/ipa-icf.c                                 |    3 +-
 gcc/ipa-visibility.c                          |    3 +-
 gcc/ipa.c                                     |    4 +-
 gcc/jobserver.cc                              |  168 +++
 gcc/jobserver.h                               |   33 +
 gcc/lto-cgraph.c                              |  172 +++
 gcc/{lto => }/lto-partition.c                 |  463 ++++++-
 gcc/{lto => }/lto-partition.h                 |    4 +-
 gcc/lto-streamer.h                            |    4 +
 gcc/lto/Make-lang.in                          |    4 +-
 gcc/lto/lto.c                                 |    2 +-
 gcc/params.opt                                |    8 +
 gcc/symtab.c                                  |   46 +-
 gcc/testsuite/driver/a.c                      |    6 +
 gcc/testsuite/driver/b.c                      |    6 +
 gcc/testsuite/driver/driver.exp               |   80 ++
 gcc/testsuite/driver/empty.c                  |    0
 gcc/testsuite/driver/foo.c                    |    7 +
 .../gcc.dg/parallel-early-constant.c          |   22 +
 gcc/testsuite/gcc.dg/parallel-static-1.c      |   21 +
 gcc/testsuite/gcc.dg/parallel-static-2.c      |   21 +
 .../gcc.dg/parallel-static-clash-1.c          |   23 +
 .../gcc.dg/parallel-static-clash-aux.c        |   14 +
 gcc/toplev.c                                  |   58 +-
 gcc/toplev.h                                  |    3 +
 gcc/tree.c                                    |   23 +-
 gcc/varasm.c                                  |   26 +-
 intl/Makefile.in                              |    2 +-
 libbacktrace/Makefile.in                      |    2 +-
 libcpp/Makefile.in                            |    2 +-
 libdecnumber/Makefile.in                      |    2 +-
 libiberty/Makefile.in                         |  212 +--
 zlib/Makefile.in                              |   64 +-
 41 files changed, 2539 insertions(+), 459 deletions(-)
 create mode 100644 gcc/jobserver.cc
 create mode 100644 gcc/jobserver.h
 rename gcc/{lto => }/lto-partition.c (78%)
 rename gcc/{lto => }/lto-partition.h (89%)
 create mode 100644 gcc/testsuite/driver/a.c
 create mode 100644 gcc/testsuite/driver/b.c
 create mode 100644 gcc/testsuite/driver/driver.exp
 create mode 100644 gcc/testsuite/driver/empty.c
 create mode 100644 gcc/testsuite/driver/foo.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-early-constant.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-1.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-2.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-clash-1.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-clash-aux.c

-- 
2.28.0



Thread overview: 31+ messages
2020-08-20 22:00 [PATCH 0/6] Parallelize Intra-Procedural Optimizations using the LTO Engine Giuliano Belinassi
2020-08-20 22:00 ` [PATCH 1/6] Modify gcc driver for parallel compilation Giuliano Belinassi
2020-08-24 13:17   ` Richard Biener
2020-08-24 18:06     ` Giuliano Belinassi
2020-08-25  6:53       ` Richard Biener
2020-08-20 22:00 ` [PATCH 2/6] Implement a new partitioner " Giuliano Belinassi
2020-08-27 15:18   ` Jan Hubicka
2020-08-27 21:42     ` Giuliano Belinassi
2020-08-31  9:25   ` Richard Biener
2020-08-20 22:00 ` [PATCH 3/6] Implement fork-based parallelism engine Giuliano Belinassi
2020-08-27 15:25   ` Jan Hubicka
2020-08-27 15:37   ` Jan Hubicka
2020-08-27 18:27     ` Giuliano Belinassi
2020-08-29 11:41       ` Jan Hubicka
2020-08-31  9:33       ` Richard Biener
2020-08-20 22:00 ` [PATCH 4/6] Add `+' for Jobserver Integration Giuliano Belinassi
2020-08-20 22:33   ` Joseph Myers
2020-08-24 13:19     ` Richard Biener
2020-08-27 15:38     ` Jan Hubicka
2020-08-20 22:00 ` [PATCH 5/6] Add invoke documentation Giuliano Belinassi
2020-08-20 22:00 ` [PATCH 6/6] New tests for parallel compilation feature Giuliano Belinassi
2020-08-21 21:08 ` [PATCH 0/6] Parallelize Intra-Procedural Optimizations using the LTO Engine Josh Triplett
2020-08-22 21:04   ` Giuliano Belinassi
2020-08-24 16:44     ` Josh Triplett
2020-08-24 18:38       ` Giuliano Belinassi
2020-08-25  7:03         ` Richard Biener
2020-08-24 12:50 ` Richard Biener
2020-08-24 15:13   ` Giuliano Belinassi
2020-08-29 11:31     ` Jan Hubicka
2020-08-31  8:15       ` Richard Biener
2020-08-31 11:44         ` Jan Hubicka
