public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck
Date: Fri, 05 May 2023 12:48:09 +0000
Message-ID: <bug-84402-4-FbLWIQjiph@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-84402-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #69 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <tnfchris@gcc.gnu.org>:

https://gcc.gnu.org/g:703417a030b3d80f55ba1402adc3f1692d3631e5

commit r14-500-g703417a030b3d80f55ba1402adc3f1692d3631e5
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Fri May 5 13:38:50 2023 +0100

    match.pd: automatically partition *-match.cc files.

    Following on from Richi's RFC[1], this is another attempt to split up
    match.pd into multiple gimple-match and generic-match files.  This version
    is fully automated and requires no human intervention.

    First things first, some perf numbers.  The following shows the effect of
    the patch on my desktop doing parallel compilation of gimple-match:

    +--------+------------------+--------+------------------+
    | splits | rel. improvement | splits | rel. improvement |
    +--------+------------------+--------+------------------+
    |      1 | 0.00%            |     33 | 91.03%           |
    |      2 | 71.77%           |     34 | 84.02%           |
    |      3 | 100.71%          |     35 | 83.42%           |
    |      4 | 143.08%          |     36 | 78.80%           |
    |      5 | 176.18%          |     37 | 74.06%           |
    |      6 | 174.40%          |     38 | 55.76%           |
    |      7 | 176.62%          |     39 | 66.90%           |
    |      8 | 168.35%          |     40 | 18.25%           |
    |      9 | 189.80%          |     41 | 16.55%           |
    |     10 | 171.77%          |     42 | 47.02%           |
    |     11 | 152.82%          |     43 | 15.29%           |
    |     12 | 112.20%          |     44 | 21.63%           |
    |     13 | 158.57%          |     45 | 41.53%           |
    |     14 | 158.57%          |     46 | 21.98%           |
    |     15 | 152.07%          |     47 | -42.74%          |
    |     16 | 151.70%          |     48 | -32.62%          |
    |     17 | 131.52%          |     49 | 11.81%           |
    |     18 | 133.11%          |     50 | 34.07%           |
    |     19 | 137.33%          |     51 | 2.71%            |
    |     20 | 103.83%          |     52 | -22.23%          |
    |     21 | 132.47%          |     53 | 32.30%           |
    |     22 | 116.52%          |     54 | 21.45%           |
    |     23 | 112.73%          |     55 | 40.02%           |
    |     24 | 111.94%          |     56 | 42.83%           |
    |     25 | 112.73%          |     57 | -9.98%           |
    |     26 | 104.07%          |     58 | 18.01%           |
    |     27 | 113.27%          |     59 | -4.91%           |
    |     28 | 96.77%           |     60 | 22.94%           |
    |     29 | 93.42%           |     61 | -3.73%           |
    |     30 | 87.67%           |     62 | -27.43%          |
    |     31 | 89.54%           |     63 | -1.05%           |
    |     32 | 84.42%           |     64 | -5.44%           |
    +--------+------------------+--------+------------------+

    As can be seen, there is a point of diminishing returns in doing splits.
    This comes from the fact that these match files include a sizeable number
    of headers.  At a certain point the overhead of parsing the headers
    dominates and the gains start to shrink.

    As such, I've made the default 10 splits per file to allow some room for
    growth in the future without needing changes to the split amount.  Since
    5-10 splits show roughly the same gains, we can afford to double the file
    sizes before we need to increase the split amount.  The number of splits
    can be controlled with the configure parameter --with-matchpd-partitions=.

    At 10 splits the sizes of the files are:

     1.2M gimple-match-1.cc
     490K gimple-match-2.cc
     459K gimple-match-3.cc
     462K gimple-match-4.cc
     466K gimple-match-5.cc
     690K gimple-match-6.cc
     517K gimple-match-7.cc
     693K gimple-match-8.cc
    1011K gimple-match-9.cc
     490K gimple-match-10.cc
     210K gimple-match-auto.h

    gimple-match-1.cc is so large because it was allocated a very large
    function: gimple_simplify_NE_EXPR.

    Because of these sporadically large functions, allocation to a split is
    based on the amount of data already written to each split rather than on a
    simple round-robin scheme (though the patch supports that too).  This means
    that once gimple_simplify_NE_EXPR is allocated to gimple-match-1.cc, nothing
    else is written to that file until the rest of the files catch up.
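
The size-based allocation described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from genmatch.cc: `Partition`, `pick_partition` and `emit_to_smallest` are hypothetical names, and in-memory strings stand in for the output files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one output partition (e.g. gimple-match-N.cc).
struct Partition {
  std::string buffer;  // text emitted to this partition so far
};

// Return the index of the partition with the least data written so far.
// Ties resolve to the lowest index.
std::size_t pick_partition(const std::vector<Partition>& parts) {
  assert(!parts.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < parts.size(); ++i)
    if (parts[i].buffer.size() < parts[best].buffer.size())
      best = i;
  return best;
}

// Append one generated function to the currently smallest partition
// and return the index that was chosen.
std::size_t emit_to_smallest(std::vector<Partition>& parts,
                             const std::string& body) {
  std::size_t idx = pick_partition(parts);
  parts[idx].buffer += body;
  return idx;
}
```

With three empty partitions, a single very large function lands in the first one, and the following small functions go to the other partitions until their sizes catch up.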

    To support this split, a new header file *-match-auto.h is generated to
    allow the individual files to compile separately.
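
One way to picture the role of the generated header: each function's signature is written twice, once as a declaration in the shared header and once as a definition in whichever partition receives it, so any partition can call functions defined in another. The sketch below assumes hypothetical names (`Outputs`, `emit_function`) and uses strings in place of files; it is not the patch's emit_func.

```cpp
#include <cassert>
#include <string>

// Hypothetical in-memory stand-ins for one partition and the shared header.
struct Outputs {
  std::string source;  // e.g. contents of one *-match-N.cc partition
  std::string header;  // e.g. contents of *-match-auto.h
};

// Write the signature as a declaration into the header and as the start of
// a definition, followed by the body, into the source partition.
void emit_function(Outputs& out, const std::string& signature,
                   const std::string& body) {
  assert(!signature.empty());
  out.header += "extern " + signature + ";\n";
  out.source += signature + "\n{\n" + body + "}\n";
}
```

Calling `emit_function(out, "bool simplify_a (int x)", "  return x != 0;\n")` leaves a one-line declaration in `out.header` and the full definition in `out.source`.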

    Lastly, for the auto-generated files I use pragmas to silence the unused
    predicate warnings instead of the previous Makefile approach, because I
    couldn't find a way to set them without knowing the number of split files
    beforehand.
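
As an illustration of the pragma approach, a generated file might wrap its contents roughly as below. The warning names chosen here (-Wunused-variable, -Wunused-function) and the two helper functions are assumptions for the sake of the example, not a quote from the patch.

```cpp
#include <cassert>

// Suppress unused-entity warnings for everything up to the matching pop.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"
#pragma GCC diagnostic ignored "-Wunused-function"

// A file-local predicate that this translation unit never calls.  Without
// the pragmas above, -Wall would normally flag it as unused.
static bool unused_predicate(int x) { return x > 0; }

// A file-local helper that is actually used.
static int halve_even(int x) {
  assert(x % 2 == 0);  // precondition: caller passes an even value
  return x / 2;
}

#pragma GCC diagnostic pop
```

Because the suppression lives inside each file, the build system does not need a per-file list of warning flags, which is the property the paragraph above relies on.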

    Finally, with this change bootstrap time has dropped by 8 minutes on AArch64.

    [1] https://gcc.gnu.org/legacy-ml/gcc-patches/2018-04/msg01125.html

    gcc/ChangeLog:

            PR bootstrap/84402
            * genmatch.cc (emit_func, SIZED_BASED_CHUNKS, get_out_file): New.
            (decision_tree::gen): Accept list of files instead of single and
            update to write function definition to header and main file.
            (write_predicate): Likewise.
            (write_header): Emit pragmas and new includes.
            (main): Create file buffers and cleanup.
            (showUsage, write_header_includes): New.
