public inbox for gcc-help@gcc.gnu.org
From: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>
To: psmith@gnu.org, Kai Song <kaisong1515@gmail.com>
Cc: gcc-help <gcc-help@gcc.gnu.org>
Subject: Re: Compilation of lengthy C++ Files
Date: Wed, 25 Oct 2023 12:09:43 +0100
Message-ID: <56e28c8b-903f-4e55-965e-a1c1fc92249d@foss.arm.com>
In-Reply-To: <30ea01f8046640cc9cc269a987283fd5294e89ee.camel@gnu.org>



On 24/10/2023 15:57, Paul Smith via Gcc-help wrote:
> On Sat, 2023-10-21 at 16:10 +0200, Kai Song via Gcc-help wrote:
>> I understand (and needless to say couldn't do it better) that it may
>> just be for now a too difficult problem to actually compile lengthy
>> cpp.
> 
> I just want to be very clear here.  When you say "lengthy cpp" that is
> not very precise because "cpp" is not an accurate term.
> 
> What is true is that the compiler will have problems compiling extremely
> long SINGLE FUNCTIONS.  It doesn't really matter how long the source
> file is (how many functions it contains).  What matters is how large
> each function is.
> 

That's only partially true.  Modern compilers read the whole translation
unit into memory, parse all of it and then optimize the various
functions.  Smaller functions help to reduce the overall memory
footprint while doing the optimizations, but do not entirely eliminate
the file size issue.

> In order to make your generated code digestible for compilers, you need
> to break up your algorithms into multiple smaller functions.
> 
>> The question that I was targeting at is: What can I realistically do
>> (implying that refactoring in the remaining project's period appears
>> nonviable)?
> 
> Of course we cannot decide what is realistic or not.  Your comments
> here seem to imply that you would be willing to rewrite the entire thing
> to generate output in a completely different language so it seems you
> are willing and able to make very sweeping changes.
> 
> It's hard for us to understand why you can't extract parts of your
> generated code and put them into separate functions.  Surely you must
> have loops in your code: can't the body of some of these loops become
> separate functions?  If there are steps to the algorithm can't each of
> these steps become a separate function?  You can collect all the state
> of the algorithm into one large global structure, rather than having
> individual local variables, and all the functions can refer to that
> global state structure, so you don't have to pass the data around as
> arguments.
> 
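A minimal sketch of the restructuring described above, assuming a
hypothetical generated algorithm with two steps and one loop.  The names
State, g_state, step_init, loop_body, step_accumulate and run_algorithm
are invented for illustration; none of them come from the original
poster's generator, and real generated code would have far more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical: everything that used to be a local variable of the one
// huge function becomes a member of a single state structure.
struct State {
    std::vector<double> x;
    double sum = 0.0;
};

State g_state;  // one global instance, shared by all generated functions

// Step 1: what used to be the first block of the monolithic function.
void step_init(std::size_t n) {
    g_state.x.assign(n, 0.0);
    g_state.sum = 0.0;
}

// What used to be a loop body is now a small function of its own.
void loop_body(std::size_t i) {
    assert(i < g_state.x.size());
    g_state.x[i] = static_cast<double>(i) * 0.5;
}

// Step 2: what used to be the final block.
void step_accumulate() {
    for (double v : g_state.x) {
        g_state.sum += v;
    }
}

// The original function shrinks to a short sequence of calls.  No data is
// passed as arguments because every step reads and writes g_state.
double run_algorithm(std::size_t n) {
    step_init(n);
    for (std::size_t i = 0; i < n; ++i) {
        loop_body(i);
    }
    step_accumulate();
    return g_state.sum;
}
```

Calling run_algorithm(4) fills x with 0, 0.5, 1.0 and 1.5 and returns
their sum.  A global is used only because the suggestion above calls for
one; passing a State reference to each function would work the same way.
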
>> Are there languages with similar capabilities to cpp that I could
>> generate into equivalently and that are as easy to learn
> 
> Since we don't know which particular capabilities of C++ you are
> relying on that's difficult to say.  However, my feeling is that no
> matter what language you use you will run into these limitations:
> compilers in general simply are not written to have single functions
> which are hundreds of thousands of lines long.  Functions help humans
> organize their thinking and programming, but they also are critical for
> compilers to make their jobs simpler (or even possible).
> 
> Rather than learning a new language, I think you should be revisiting
> your code generation and attempting to understand how to break it up.
> I don't think there's any feasible alternative.

The most important consideration doesn't seem to have been mentioned on 
this thread: CPU architectures.  Modern CPUs have a very strong
preference for repeatedly executing small bits of code.  Linear code will
kill their performance.  This comes from the fact that the CPU can 
consume instructions far faster than the memory system can supply them. 
So to keep performance up, the system uses caches; but those are tiny 
and they only help if the instructions held in them are re-used. 
Furthermore, branches for most programs are practically free these days, 
especially when you take the benefit of caches into account: the days of 
unrolling loops dozens (or more) times to avoid branch overheads are 
pretty much gone.
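
As a rough illustration of the difference between compact and linear
code, here is a sketch with two hypothetical functions,
sum_squares_loop and sum_squares_linear8, that compute the same result.
The names and the eight-element size are assumptions made for the
example; generated code of the kind discussed in this thread would
resemble the second form stretched over many thousands of statements.

```cpp
#include <cassert>
#include <cstddef>

// Compact form: one short loop body that is executed n times, so the
// same few instructions are re-used on every iteration.
long sum_squares_loop(const long* data, std::size_t n) {
    assert(data != nullptr);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i] * data[i];
    }
    return total;
}

// Linear form: every statement is a distinct piece of source code and is
// executed exactly once per call.  Fixed at eight elements.
long sum_squares_linear8(const long* data) {
    assert(data != nullptr);
    long total = 0;
    total += data[0] * data[0];
    total += data[1] * data[1];
    total += data[2] * data[2];
    total += data[3] * data[3];
    total += data[4] * data[4];
    total += data[5] * data[5];
    total += data[6] * data[6];
    total += data[7] * data[7];
    return total;
}
```

Both functions return the same value for the same eight inputs.  At this
size the compiler may well turn either form into similar machine code,
so the sketch shows only the shape of the source, not a measurable
performance difference.
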

Finally, there are practical limits (especially on RISC processors) on
how large a function can be before you need to take defensive
compilation measures to handle very large offsets between addresses. 
Those defensive measures further slow down your compiled programs.

R.

Thread overview: 20+ messages
2023-10-18 16:04 Kai Song
2023-10-18 21:59 ` Jonathan Wakely
2023-10-19  8:36   ` Andrew Haley
2023-10-19 12:47 ` David Brown
2023-10-19 12:47   ` David Brown
2023-10-19 14:16   ` Kai Song
2023-10-19 14:26     ` Jonathan Wakely
2023-10-19 15:11       ` Kai Song
2023-10-19 16:03         ` David Brown
2023-10-20  9:32           ` Kai Song
2023-10-20 10:19             ` Jonathan Wakely
     [not found]             ` <CACJ51z3rYUSSe7XpcL4d2xfAhMaiVZpxWAnpkqZc1cn2DRf+uA@mail.gmail.com>
2023-10-20 21:08               ` Kai Song
2023-10-20 22:03                 ` Paul Smith
2023-10-21  6:52                   ` Jonathan Wakely
2023-10-21 14:10                     ` Kai Song
2023-10-24 14:57                       ` Paul Smith
2023-10-25 11:09                         ` Richard Earnshaw [this message]
2023-10-25 14:49                           ` Paul Smith
2023-10-26 11:19                             ` David Brown
2023-10-19 15:15     ` David Brown
