public inbox for gcc-help@gcc.gnu.org
* High memory usage compiling large ‘xxd -i’ output
From: relliott @ 2020-04-11 20:03 UTC
  To: gcc-help

Hello,

I’m seeing gcc memory usage exceed 15GB when compiling a source file generated from the command:

% xxd -i file.txt

where ‘file.txt’ is a 170MB file.

Is there a way to avoid this high memory usage?  Can anyone explain why it occurs?

Thanks,

Ryan Elliott


* Re: High memory usage compiling large ‘xxd -i’ output
From: Jonathan Wakely @ 2020-04-11 21:22 UTC
  To: relliott; +Cc: gcc-help

On Sat, 11 Apr 2020 at 21:04, relliott--- via Gcc-help
<gcc-help@gcc.gnu.org> wrote:
>
> Hello,
>
> I’m seeing gcc memory usage exceed 15GB when compiling a source file generated from the command:
>
> % xxd -i file.txt
>
> where ‘file.txt’ is a 170MB file.
>
> Is there a way to avoid this high memory usage?  Can anyone explain why it occurs?

Because the compiler is parsing an array initializer with millions of elements.
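
For illustration, here is roughly what 'xxd -i file.txt' prints for a six-byte
file.txt containing "hello" and a newline.  The file contents are an assumption
made up for this sketch; the array and length names are derived by xxd from the
file name.  For a 170MB input the initializer has the same shape, but with
about 170 million elements, each of which the parser sees as a separate
constant:

```c
/* Sketch of 'xxd -i file.txt' output for a hypothetical six-byte file. */
unsigned char file_txt[] = {
  0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x0a
};
unsigned int file_txt_len = 6;
```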


* Re: High memory usage compiling large ‘xxd -i’ output
From: relliott @ 2020-04-11 21:31 UTC
  To: Jonathan Wakely; +Cc: gcc-help

Hello,

Thanks for your reply.  I guess I don’t understand why this requires so much memory all at once.  Is it not possible to write to the object file as the initializer is parsed?  I feel like there is some complexity about the parsing process/requirements I’m not appreciating.  An example or explanation would probably help me a lot.

Thanks,

Ryan 



* Re: High memory usage compiling large ‘xxd -i’ output
From: Xi Ruoyao @ 2020-04-12  2:07 UTC
  To: relliott; +Cc: gcc-help

On 2020-04-11 16:31 -0500, relliott--- via Gcc-help wrote:
> Hello,
> 
> Thanks for your reply.  I guess I don’t understand why this requires so much
> memory all at once.  Is it not possible to write to the object file as the
> initializer is parsed?

GCC doesn't write the object file itself.  It only writes an assembly file,
which the assembler then turns into an object file.  Modern compilers are
designed in layers, so the parser (front end) should not output assembly
(that's the job of the back end).  Doing so would also miss optimizations.
A simple example:

const int b[] = {0, 1, 2};

int foo(void)
{
  int i, x = 0;
  for (i = 0; i < 3; i++)
    x += b[i];
  return x;
}

At -O2 the body of "foo" will be optimized to "return 3;".  If we had written
the array contents to the assembly file and discarded the values from memory,
this optimization would be impossible.

Running "xxd" on a binary file and then compiling the result is just like
"converting a .jpg to .png and then back to .jpg".  A smarter way:

  ld -r -b binary some_big_binary_blob.bin -o some_big_binary_blob.o

There will be symbols _binary_some_big_binary_blob_bin_start and
_binary_some_big_binary_blob_bin_end in some_big_binary_blob.o.  Then it's
possible to access the content of the binary blob with:

  extern const char _binary_some_big_binary_blob_bin_start;
  extern const char _binary_some_big_binary_blob_bin_end;

  /* play_with() stands for your own function, defined elsewhere. */
  void play_with(char c);

  int main(void)
  {
    const char *begin = &_binary_some_big_binary_blob_bin_start;
    const char *end = &_binary_some_big_binary_blob_bin_end;
    const char *p;
    for (p = begin; p != end; p++)
      play_with(*p);
    return 0;
  }
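
As a small sketch of working with the two symbols, the helpers below compute
the blob's size and a byte sum from a begin/end pointer pair.  The names
blob_size and blob_sum are hypothetical and made up for this example; they are
not part of ld or any library:

```c
#include <stddef.h>

/* Number of bytes between the start and end addresses of the blob. */
size_t blob_size(const char *begin, const char *end)
{
  return (size_t)(end - begin);
}

/* Example consumer: add up every byte of the blob as an unsigned value. */
unsigned long blob_sum(const char *begin, const char *end)
{
  unsigned long sum = 0;
  const char *p;
  for (p = begin; p != end; p++)
    sum += (unsigned char)*p;
  return sum;
}
```

With the object file produced by the ld command linked in, they could be called
as blob_size(&_binary_some_big_binary_blob_bin_start,
&_binary_some_big_binary_blob_bin_end), and likewise for blob_sum.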
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University



* Re: High memory usage compiling large ‘xxd -i’ output
From: relliott @ 2020-04-12 22:01 UTC
  To: gcc-help

Hello,
Thanks for the explanation; I can see the issue now.

Ryan



