public inbox for cygwin@cygwin.com
* emacs and large-address awareness under recent snapshots
@ 2011-08-05 23:17 Ken Brown
  2011-08-07 11:34 ` Corinna Vinschen
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-05 23:17 UTC
  To: cygwin

Starting with the 2011-07-21 snapshot, emacs doesn't work well with the 
large-address-awareness flag set (under 64-bit Win7).  As soon as emacs 
is started, a *Warning* buffer is created with the following message:

   Emergency (alloc): Warning: past 95% of memory limit

To reproduce, install emacs and do the following:

$ peflags --bigaddr=1 /usr/bin/emacs-nox.exe

$ emacs-nox.exe -Q

Ken

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: emacs and large-address awareness under recent snapshots
  2011-08-05 23:17 emacs and large-address awareness under recent snapshots Ken Brown
@ 2011-08-07 11:34 ` Corinna Vinschen
  2011-08-07 11:51   ` Corinna Vinschen
  0 siblings, 1 reply; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-07 11:34 UTC
  To: cygwin

On Aug  5 19:16, Ken Brown wrote:
> Starting with the 2011-07-21 snapshot, emacs doesn't work well with
> the large-address-awareness flag set (under 64-bit Win7).  As soon
> as emacs is started, a *Warning* buffer is created with the
> following message:
> 
>   Emergency (alloc): Warning: past 95% of memory limit
> 
> To reproduce, install emacs and do the following:
> 
> $ peflags --bigaddr=1 /usr/bin/emacs-nox.exe
> 
> $ emacs-nox.exe -Q

Yes, I can reproduce the message, but I have not the faintest idea
why emacs thinks so.  If you look into the process map, you'll
see the following:

  $ ps | grep emacs
     280    2852     280       2796    0 11001 13:02:21 /usr/bin/emacs-nox
  $ less /proc/280/maps
  [...]
  80000000-8064E000 rw-p 00000000 0000:0000 0                   [heap]
  8064E000-98000000 ===p 0064E000 0000:0000 0                   [heap]

Starting with the 2011-07-21 snapshot, the heap starts at 0x80000000 if
the application (and the system) is large address aware.  Even if you
don't see the "[heap]" decoration(*), the heap is at that address.
What you can see is this:

- The heap is located at 0x80000000 and has a size of 384 Megs (the
  default start size), up to address 0x98000000.

- Only the first 0x64e000 (== 6610944) bytes are allocated so far, so
  there are still about 378 Megs left on the heap.

I set breakpoints on all functions returning malloc information,
but emacs doesn't call any of them.  Is there a chance that emacs
does some invalid 32 bit pointer arithmetic and just gets confused?


Corinna


(*) I just checked in a patch to Cygwin which fixes printing the
    "[heap]" text in the right column.  In your case there's a good
    chance that it's missing.
   

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 11:34 ` Corinna Vinschen
@ 2011-08-07 11:51   ` Corinna Vinschen
  2011-08-07 14:44     ` Ken Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-07 11:51 UTC
  To: cygwin

On Aug  7 13:33, Corinna Vinschen wrote:
> On Aug  5 19:16, Ken Brown wrote:
> > Starting with the 2011-07-21 snapshot, emacs doesn't work well with
> > the large-address-awareness flag set (under 64-bit Win7).  As soon
> > as emacs is started, a *Warning* buffer is created with the
> > following message:
> > 
> >   Emergency (alloc): Warning: past 95% of memory limit
> > 
> > To reproduce, install emacs and do the following:
> > 
> > $ peflags --bigaddr=1 /usr/bin/emacs-nox.exe
> > 
> > $ emacs-nox.exe -Q
> 
> Yes, I can reproduce the message, but I have not the faintest idea
> why emacs thinks so.  If you look into the process map, you'll
> see the following:
> 
>   $ ps | grep emacs
>      280    2852     280       2796    0 11001 13:02:21 /usr/bin/emacs-nox
>   $ less /proc/280/maps
>   [...]
>   80000000-8064E000 rw-p 00000000 0000:0000 0                   [heap]
>   8064E000-98000000 ===p 0064E000 0000:0000 0                   [heap]
> 
> Starting with the 2011-07-21 the heap starts at 0x80000000 if the
> application (and the system) is large address aware.  Even if you
> dont see the "[heap]" decoration(*), the heap is at that address.
> What you can see is this:
> 
> - The heap is located at 0x80000000 and has a size of 384 Megs (the
>   default start size), up to address 0x98000000.
> 
> - Only the first 0x64e000 (== 6610944) bytes are allocated so far, so
>   there are still about 254 Megs left on the heap.

I forgot to explain.  The first line

  80000000-8064E000 rw-p 00000000 0000:0000 0

means that the address area from 80000000 to 8064E000 is committed R/W
memory.  That's the space for which the application has called sbrk().

In the second line

  8064E000-98000000 ===p 0064E000 0000:0000 0

the "===p" means that the area is reserved, but uncommitted.  That's the
remainder of the current heap, not sbrk'd yet.

Even if emacs had taken all of that space, the next sbrk would still
have enough room, since the space *after* the current heap is
not reserved yet, up to some address in the 0xfff00000 space, so there's
about 1.7 Gigs left to extend the heap.

> I did set breakpoints to all functions returning malloc information,
> but emacs doesn't call one of them.  Is there a chance that emacs
> does some invalid 32 bit pointer arithmetic and just gets confused?
> 
> 
> Corinna
> 
> 
> (*) I just checked in a patch to Cygwin which fixes printing the
>     "[heap]" text in the right column.  In your case there's a good
>     chance that it's missing.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 11:51   ` Corinna Vinschen
@ 2011-08-07 14:44     ` Ken Brown
  2011-08-07 16:19       ` Ken Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-07 14:44 UTC
  To: cygwin

On 8/7/2011 7:50 AM, Corinna Vinschen wrote:
> On Aug  7 13:33, Corinna Vinschen wrote:
>> On Aug  5 19:16, Ken Brown wrote:
>>> Starting with the 2011-07-21 snapshot, emacs doesn't work well with
>>> the large-address-awareness flag set (under 64-bit Win7).  As soon
>>> as emacs is started, a *Warning* buffer is created with the
>>> following message:
>>>
>>>    Emergency (alloc): Warning: past 95% of memory limit
>>>
>>> To reproduce, install emacs and do the following:
>>>
>>> $ peflags --bigaddr=1 /usr/bin/emacs-nox.exe
>>>
>>> $ emacs-nox.exe -Q
>>
>> Yes, I can reproduce the message, but I have not the faintest idea
>> why emacs thinks so.  If you look into the process map, you'll
>> see the following:
>>
>>    $ ps | grep emacs
>>       280    2852     280       2796    0 11001 13:02:21 /usr/bin/emacs-nox
>>    $ less /proc/280/maps
>>    [...]
>>    80000000-8064E000 rw-p 00000000 0000:0000 0                   [heap]
>>    8064E000-98000000 ===p 0064E000 0000:0000 0                   [heap]
>>
>> Starting with the 2011-07-21 the heap starts at 0x80000000 if the
>> application (and the system) is large address aware.  Even if you
>> dont see the "[heap]" decoration(*), the heap is at that address.
>> What you can see is this:
>>
>> - The heap is located at 0x80000000 and has a size of 384 Megs (the
>>    default start size), up to address 0x98000000.
>>
>> - Only the first 0x64e000 (== 6610944) bytes are allocated so far, so
>>    there are still about 254 Megs left on the heap.
>
> I forgot to explain.  The first line
>
>    80000000-8064E000 rw-p 00000000 0000:0000 0
>
> means that the address area from 80000000 to 8064E000 is commited R/W
> memory.  That's the space for which the application has called sbrk().
>
> In the second line
>
>    8064E000-98000000 ===p 0064E000 0000:0000 0
>
> the "===p" means that the area is reserved, but uncommited.  That's the
> remainder of the current heap, not sbrk'd yet.
>
> Even if that space would have been taken by emacs, the next sbrk would
> have enough space left, since ther space *after* the current heap is
> not reserverd yet, up to some address in the 0xfff00000 space, so there's
> about 1.7 Gigs left to extend the heap.
>
>> I did set breakpoints to all functions returning malloc information,
>> but emacs doesn't call one of them.  Is there a chance that emacs
>> does some invalid 32 bit pointer arithmetic and just gets confused?

Thanks for all the information.

Emacs checks available memory in the function check_memory_limits() in 
the source file src/vm-limit.c.  I'm trying to sort it out, but I don't 
see any invalid pointer arithmetic.  If I'm correctly following all the 
preprocessor logic, emacs uses getrlimit() on Cygwin to determine the 
total memory.  Is it possible that this is returning the wrong value 
when the large-address-awareness flag is set?

I tried to use gdb to step through check_memory_limits() with and 
without the flag set.  But when the flag was set, gdb froze.

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 14:44     ` Ken Brown
@ 2011-08-07 16:19       ` Ken Brown
  2011-08-07 20:03         ` Corinna Vinschen
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-07 16:19 UTC
  To: cygwin

On 8/7/2011 10:43 AM, Ken Brown wrote:
> On 8/7/2011 7:50 AM, Corinna Vinschen wrote:
>> On Aug  7 13:33, Corinna Vinschen wrote:
>>> On Aug  5 19:16, Ken Brown wrote:
>>>> Starting with the 2011-07-21 snapshot, emacs doesn't work well with
>>>> the large-address-awareness flag set (under 64-bit Win7).  As soon
>>>> as emacs is started, a *Warning* buffer is created with the
>>>> following message:
>>>>
>>>>     Emergency (alloc): Warning: past 95% of memory limit
>>>>
>>>> To reproduce, install emacs and do the following:
>>>>
>>>> $ peflags --bigaddr=1 /usr/bin/emacs-nox.exe
>>>>
>>>> $ emacs-nox.exe -Q
>>>
>>> Yes, I can reproduce the message, but I have not the faintest idea
>>> why emacs thinks so.  If you look into the process map, you'll
>>> see the following:
>>>
>>>     $ ps | grep emacs
>>>        280    2852     280       2796    0 11001 13:02:21 /usr/bin/emacs-nox
>>>     $ less /proc/280/maps
>>>     [...]
>>>     80000000-8064E000 rw-p 00000000 0000:0000 0                   [heap]
>>>     8064E000-98000000 ===p 0064E000 0000:0000 0                   [heap]
>>>
>>> Starting with the 2011-07-21 the heap starts at 0x80000000 if the
>>> application (and the system) is large address aware.  Even if you
>>> dont see the "[heap]" decoration(*), the heap is at that address.
>>> What you can see is this:
>>>
>>> - The heap is located at 0x80000000 and has a size of 384 Megs (the
>>>     default start size), up to address 0x98000000.
>>>
>>> - Only the first 0x64e000 (== 6610944) bytes are allocated so far, so
>>>     there are still about 254 Megs left on the heap.
>>
>> I forgot to explain.  The first line
>>
>>     80000000-8064E000 rw-p 00000000 0000:0000 0
>>
>> means that the address area from 80000000 to 8064E000 is commited R/W
>> memory.  That's the space for which the application has called sbrk().
>>
>> In the second line
>>
>>     8064E000-98000000 ===p 0064E000 0000:0000 0
>>
>> the "===p" means that the area is reserved, but uncommited.  That's the
>> remainder of the current heap, not sbrk'd yet.
>>
>> Even if that space would have been taken by emacs, the next sbrk would
>> have enough space left, since ther space *after* the current heap is
>> not reserverd yet, up to some address in the 0xfff00000 space, so there's
>> about 1.7 Gigs left to extend the heap.
>>
>>> I did set breakpoints to all functions returning malloc information,
>>> but emacs doesn't call one of them.  Is there a chance that emacs
>>> does some invalid 32 bit pointer arithmetic and just gets confused?
>
> Thanks for all the information.
>
> Emacs checks available memory in the function check_memory_limits() in
> the source file src/vm-limits.c.  I'm trying to sort it out, but I don't
> see any invalid pointer arithmetic.  If I'm correctly following all the
> preprocessor logic, emacs uses getrlimit() on Cygwin to determine the
> total memory.  Is it possible that this is returning the wrong value
> when the large-address-awareness flag is set?
>
> I tried to use gdb to step through check_memory_limits() with and
> without the flag set.  But when the flag was set, gdb froze.

I may have found the problem.  I think emacs is not correctly 
determining the start of the heap, with or without large address 
awareness.  When I run emacs under gdb (without large address 
awareness), I find that emacs thinks the heap starts at 0x7b6b30.  But 
the heap actually starts at 0x20000000, doesn't it?

Here's the code that emacs uses to find the start of the heap:

char *
start_of_data (void)
{
#ifdef BSD_SYSTEM
   extern char etext;
   return (POINTER)(&etext);
#elif defined DATA_START
   return ((POINTER) DATA_START);
#elif defined ORDINARY_LINK
   /*
    * This is a hack.  Since we're not linking crt0.c or pre_crt0.c,
    * data_start isn't defined.  We take the address of environ, which
    * is known to live at or near the start of the system crt0.c, and
    * we don't sweat the handful of bytes that might lose.
    */
   extern char **environ;
   return ((POINTER) &environ);
#else
   extern int data_start;
   return ((POINTER) &data_start);
#endif
}

I left all the preprocessor stuff in there to play it safe, but I think 
we're in the ORDINARY_LINK case.

What would be the right way for emacs to determine the start of the heap 
on Cygwin?

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 16:19       ` Ken Brown
@ 2011-08-07 20:03         ` Corinna Vinschen
  2011-08-07 20:44           ` Ken Brown
  2011-08-07 22:39           ` Ken Brown
  0 siblings, 2 replies; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-07 20:03 UTC
  To: cygwin

On Aug  7 12:18, Ken Brown wrote:
> On 8/7/2011 10:43 AM, Ken Brown wrote:
> >On 8/7/2011 7:50 AM, Corinna Vinschen wrote:
> >>>I did set breakpoints to all functions returning malloc information,
> >>>but emacs doesn't call one of them.  Is there a chance that emacs
> >>>does some invalid 32 bit pointer arithmetic and just gets confused?
> >
> >Thanks for all the information.
> >
> >Emacs checks available memory in the function check_memory_limits() in
> >the source file src/vm-limits.c.  I'm trying to sort it out, but I don't
> >see any invalid pointer arithmetic.  If I'm correctly following all the
> >preprocessor logic, emacs uses getrlimit() on Cygwin to determine the
> >total memory.  Is it possible that this is returning the wrong value
> >when the large-address-awareness flag is set?

You're right, it calls getrlimit(RLIMIT_AS) to get the maximum VM
size, and Cygwin always returned 0x80000000.  Apparently there's some
strange test in emacs which chokes when a memory address lies beyond
the maximum address as returned by getrlimit(RLIMIT_AS).

What I did now is to change Cygwin to always return RLIM_INFINITY from
a call to getrlimit(RLIMIT_AS).  This seems to be more correct anyway,
given the definition in SUSv4(*):

  "If a call to getrlimit() returns RLIM_INFINITY for a resource, it
   means the implementation shall not enforce limits on that resource."

That's exactly our situation.  There's no enforced limit on the VM,
other than the size of the VM itself.  Now emacs is happy.

(*) http://pubs.opengroup.org/onlinepubs/9699919799/functions/getrlimit.html

> I may have found the problem.  I think emacs is not correctly
> determining the start of the heap, with or without large address
> awareness.  When I run emacs under gdb (without large address
> awareness), I find that emacs thinks the heap starts at 0x7b6b30.
> But the heap actually starts at 0x20000000, doesn't it?
> 
> Here's the code that emacs uses to find the start of the heap:
> 
> char *
> start_of_data (void)
> {
> #ifdef BSD_SYSTEM
>   extern char etext;
>   return (POINTER)(&etext);
> #elif defined DATA_START
>   return ((POINTER) DATA_START);
> #elif defined ORDINARY_LINK
>   /*
>    * This is a hack.  Since we're not linking crt0.c or pre_crt0.c,
>    * data_start isn't defined.  We take the address of environ, which
>    * is known to live at or near the start of the system crt0.c, and
>    * we don't sweat the handful of bytes that might lose.
>    */
>   extern char **environ;
>   return ((POINTER) &environ);
> #else
>   extern int data_start;
>   return ((POINTER) &data_start);
> #endif
> }
> 
> I left all the preprocessor stuff in there to play it safe, but I
> think we're in the ORDINARY_LINK case.
> 
> What would be the right way for emacs to determine the start of the
> heap on Cygwin?

There is no such way and in theory it's none of the application's
business.  Whatever the above code thinks it does, it has nothing to do
with the problem you reported, as far as I can see.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 20:03         ` Corinna Vinschen
@ 2011-08-07 20:44           ` Ken Brown
  2011-08-07 22:39           ` Ken Brown
  1 sibling, 0 replies; 34+ messages in thread
From: Ken Brown @ 2011-08-07 20:44 UTC
  To: cygwin

On 8/7/2011 4:02 PM, Corinna Vinschen wrote:
> On Aug  7 12:18, Ken Brown wrote:
>> On 8/7/2011 10:43 AM, Ken Brown wrote:
>>> On 8/7/2011 7:50 AM, Corinna Vinschen wrote:
>>>>> I did set breakpoints to all functions returning malloc information,
>>>>> but emacs doesn't call one of them.  Is there a chance that emacs
>>>>> does some invalid 32 bit pointer arithmetic and just gets confused?
>>>
>>> Thanks for all the information.
>>>
>>> Emacs checks available memory in the function check_memory_limits() in
>>> the source file src/vm-limits.c.  I'm trying to sort it out, but I don't
>>> see any invalid pointer arithmetic.  If I'm correctly following all the
>>> preprocessor logic, emacs uses getrlimit() on Cygwin to determine the
>>> total memory.  Is it possible that this is returning the wrong value
>>> when the large-address-awareness flag is set?
>
> You're right, it calls getrlimit(RLIMIT_AS) to get the information of
> the maximum VM size, and Cygwin always returned 0x80000000.  Apparently
> there's some strange test in emacs, which chokes on the fact that a
> memory address is returned which is beyond the maximum address as
> returned by getrlimit(RLIMIT_AS).
>
> What I did now is to change Cygwin to return always RLIM_INFINITY in
> a call to getrlimit(RLIMIT_AS).  This seems to be more correct anyway,
> given the definition in SUSv4(*):
>
>    "If a call to getrlimit() returns RLIM_INFINITY for a resource, it
>     means the implementation shall not enforce limits on that resource."
>
> That's exactly our situation.  There's no enforced limit on the VM,
> other than the size of the VM itself.  Now emacs is happy.

Thanks!

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 20:03         ` Corinna Vinschen
  2011-08-07 20:44           ` Ken Brown
@ 2011-08-07 22:39           ` Ken Brown
  2011-08-08 13:23             ` Ken Brown
  2011-08-08 15:40             ` Corinna Vinschen
  1 sibling, 2 replies; 34+ messages in thread
From: Ken Brown @ 2011-08-07 22:39 UTC
  To: cygwin

On 8/7/2011 4:02 PM, Corinna Vinschen wrote:
> On Aug  7 12:18, Ken Brown wrote:
>> On 8/7/2011 10:43 AM, Ken Brown wrote:
>>> On 8/7/2011 7:50 AM, Corinna Vinschen wrote:
>>>>> I did set breakpoints to all functions returning malloc information,
>>>>> but emacs doesn't call one of them.  Is there a chance that emacs
>>>>> does some invalid 32 bit pointer arithmetic and just gets confused?
>>>
>>> Thanks for all the information.
>>>
>>> Emacs checks available memory in the function check_memory_limits() in
>>> the source file src/vm-limits.c.  I'm trying to sort it out, but I don't
>>> see any invalid pointer arithmetic.  If I'm correctly following all the
>>> preprocessor logic, emacs uses getrlimit() on Cygwin to determine the
>>> total memory.  Is it possible that this is returning the wrong value
>>> when the large-address-awareness flag is set?
>
> You're right, it calls getrlimit(RLIMIT_AS) to get the information of
> the maximum VM size, and Cygwin always returned 0x80000000.  Apparently
> there's some strange test in emacs, which chokes on the fact that a
> memory address is returned which is beyond the maximum address as
> returned by getrlimit(RLIMIT_AS).
>
> What I did now is to change Cygwin to return always RLIM_INFINITY in
> a call to getrlimit(RLIMIT_AS).  This seems to be more correct anyway,
> given the definition in SUSv4(*):
>
>    "If a call to getrlimit() returns RLIM_INFINITY for a resource, it
>     means the implementation shall not enforce limits on that resource."
>
> That's exactly our situation.  There's no enforced limit on the VM,
> other than the size of the VM itself.  Now emacs is happy.

I've built cygwin1.dll from the latest CVS and confirmed that the 
problem is fixed.  Unfortunately, I've just discovered a second problem, 
also starting with the 2011-07-21 snapshot, that only shows up when I 
try to start emacs under X (with emacs large address aware).  What 
happens here is that emacs keeps using more and more CPU (as shown by 
Windows Task Manager), but the emacs window never opens.  To reproduce, 
install emacs-X11 and then do the following:

1. $ peflags --bigaddr=1 /usr/bin/emacs-X11.exe

2. Start the X server.  (I use the Start Menu shortcut.)

3. Start emacs from an xterm window:

    $ emacs -Q &

As a slight variation on this, you can instead start emacs with the -nw 
option:

    $ emacs -nw -Q

This tells emacs to use the xterm window for display rather than opening 
its own window.  The result this time is that you can see the emacs 
display, but emacs is unresponsive while the CPU usage increases as before.

Ken




* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 22:39           ` Ken Brown
@ 2011-08-08 13:23             ` Ken Brown
  2011-08-08 15:51               ` Corinna Vinschen
  2011-08-08 15:40             ` Corinna Vinschen
  1 sibling, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-08 13:23 UTC
  To: cygwin

On 8/7/2011 6:38 PM, Ken Brown wrote:
> I've built cygwin1.dll from the latest CVS and confirmed that the
> problem is fixed.  Unfortunately, I've just discovered a second problem,
> also starting with the 2011-07-21 snapshot, that only shows up when I
> try to start emacs under X (with emacs large address aware).  What
> happens here is that emacs keeps using more and more CPU (as shown by
> Windows Task Manager), but the emacs window never opens.  To reproduce,
> install emacs-X11 and then do the following:
>
> 1. $ peflags --bigaddr=1 /usr/bin/emacs-X11.exe
>
> 2. Start the X server.  (I use the Start Menu shortcut.)
>
> 3. Start emacs from an xterm window:
>
>      $ emacs -Q&
>
> As a slight variation on this, you can instead start emacs with the -nw
> option:
>
>      $ emacs -nw -Q
>
> This tells emacs to use the xterm window for display rather than opening
> its own window.  The result this time is that you can see the emacs
> display, but emacs is unresponsive while the CPU usage increases as before.

I attached gdb to the running process and got some more information.  It 
turns out that this has nothing to do with X.  It's just that starting 
emacs under X causes emacs to try to allocate memory, and this makes the 
problem show up very quickly.

It looks to me like emacs gets stuck in morecore_nolock() and/or 
_malloc_internal_nolock(), which are defined in src/gmalloc.c. 
Apparently, emacs has a peculiar way of managing memory on Cygwin, and 
this chokes on the changes to the heap start address as of 2011-07-21. 
I don't know enough programming to fix this.  If anyone wants to try, 
the relevant source files to look at are gmalloc.c, sheap.c, and 
unexcw.c.  The second and third are compiled only in the Cygwin build, 
and the first also has some Cygwin-specific stuff.

Maybe I should take this to the emacs-devel list at some point, but I'll 
wait a while to see if someone on this list can help.

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-07 22:39           ` Ken Brown
  2011-08-08 13:23             ` Ken Brown
@ 2011-08-08 15:40             ` Corinna Vinschen
  1 sibling, 0 replies; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-08 15:40 UTC
  To: cygwin

On Aug  7 18:38, Ken Brown wrote:
> On 8/7/2011 4:02 PM, Corinna Vinschen wrote:
> >What I did now is to change Cygwin to return always RLIM_INFINITY in
> >a call to getrlimit(RLIMIT_AS).  This seems to be more correct anyway,
> >given the definition in SUSv4(*):
> >
> >   "If a call to getrlimit() returns RLIM_INFINITY for a resource, it
> >    means the implementation shall not enforce limits on that resource."
> >
> >That's exactly our situation.  There's no enforced limit on the VM,
> >other than the size of the VM itself.  Now emacs is happy.
> 
> I've built cygwin1.dll from the latest CVS and confirmed that the
> problem is fixed.  Unfortunately, I've just discovered a second
> problem, also starting with the 2011-07-21 snapshot, that only shows
> up when I try to start emacs under X (with emacs large address
> aware).  What happens here is that emacs keeps using more and more
> CPU (as shown by Windows Task Manager), but the emacs window never
> opens.  To reproduce, install emacs-X11 and then do the following:
> 
> 1. $ peflags --bigaddr=1 /usr/bin/emacs-X11.exe
> 
> 2. Start the X server.  (I use the Start Menu shortcut.)
> 
> 3. Start emacs from an xterm window:
> 
>    $ emacs -Q &
> 
> As a slight variation on this, you can instead start emacs with the
> -nw option:
> 
>    $ emacs -nw -Q
> 
> This tells emacs to use the xterm window for display rather than
> opening its own window.  The result this time is that you can see
> the emacs display, but emacs is unresponsive while the CPU usage
> increases as before.

I can reproduce this, but from the strace I can't figure out what emacs
is doing.  There are a lot of threads running in parallel.  GDB isn't a
big help either, given that I have no debug symbols for emacs.

I'm still looking into this, but I need your help.  If you have a hunch
where the problem occurs in emacs, add some debug output to the code
or try to set a breakpoint in the affected function.  Everything helps.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 13:23             ` Ken Brown
@ 2011-08-08 15:51               ` Corinna Vinschen
  2011-08-08 16:05                 ` Ken Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-08 15:51 UTC
  To: cygwin

On Aug  8 09:22, Ken Brown wrote:
> I attached gdb to the running process and got some more information.
> It turns out that this has nothing to do with X.  It's just that
> starting emacs under X causes emacs to try to allocate memory, and
> this makes the problem show up very quickly.
> 
> It looks to me like emacs gets stuck in morecore_nolock() and/or
> _malloc_internal_nolock(), which are defined in src/gmalloc.c.
> Apparently, emacs has a peculiar way of managing memory on Cygwin,
> and this chokes on the changes to the heap start address as of
> 2011-07-21. I don't know enough programming to fix this.  If anyone
> wants to try, the relevant source files to look at are gmalloc.c,
> sheap.c, and unexcw.c.  The second and third are compiled only in
> the Cygwin build, and the first also has some Cygwin-specific stuff.
> 
> Maybe I should take this to the emacs-devel list at some point, but
> I'll wait a while to see if someone on this list can help.

I had a look into the sources you're mentioning above, but I don't see
anything suspicious, apart from the fact that emacs uses some static
buffer of 12 Megs as heap on Cygwin... sometimes.  At least that's what
sheap.c is about, afaics.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 15:51               ` Corinna Vinschen
@ 2011-08-08 16:05                 ` Ken Brown
  2011-08-08 16:26                   ` Corinna Vinschen
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-08 16:05 UTC (permalink / raw)
  To: cygwin

On 8/8/2011 11:50 AM, Corinna Vinschen wrote:
> On Aug  8 09:22, Ken Brown wrote:
>> I attached gdb to the running process and got some more information.
>> It turns out that this has nothing to do with X.  It's just that
>> starting emacs under X causes emacs to try to allocate memory, and
>> this makes the problem show up very quickly.
>>
>> It looks to me like emacs gets stuck in morecore_nolock() and/or
>> _malloc_internal_nolock(), which are defined in src/gmalloc.c.
>> Apparently, emacs has a peculiar way of managing memory on Cygwin,
>> and this chokes on the changes to the heap start address as of
>> 2011-07-21. I don't know enough programming to fix this.  If anyone
>> wants to try, the relevant source files to look at are gmalloc.c,
>> sheap.c, and unexcw.c.  The second and third are compiled only in
>> the Cygwin build, and the first also has some Cygwin-specific stuff.
>>
>> Maybe I should take this to the emacs-devel list at some point, but
>> I'll wait a while to see if someone on this list can help.
>
> I had a look into the sources you're mentioning above, but I don't see
> anything suspicious, apart from the fact that emacs uses some static
> buffer of 12 Megs as heap on Cygwin... sometimes.  At least that's what
> sheap.c is about, afaics.

I'll build a debug version and try stepping through the functions I 
mentioned.  Maybe I can figure out what's happening.  I suspect that it 
does have something to do with the static heap.

Ken


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 16:05                 ` Ken Brown
@ 2011-08-08 16:26                   ` Corinna Vinschen
  2011-08-08 18:21                     ` Achim Gratz
  0 siblings, 1 reply; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-08 16:26 UTC (permalink / raw)
  To: cygwin

On Aug  8 12:05, Ken Brown wrote:
> On 8/8/2011 11:50 AM, Corinna Vinschen wrote:
> >On Aug  8 09:22, Ken Brown wrote:
> >>I attached gdb to the running process and got some more information.
> >>It turns out that this has nothing to do with X.  It's just that
> >>starting emacs under X causes emacs to try to allocate memory, and
> >>this makes the problem show up very quickly.
> >>
> >>It looks to me like emacs gets stuck in morecore_nolock() and/or
> >>_malloc_internal_nolock(), which are defined in src/gmalloc.c.
> >>Apparently, emacs has a peculiar way of managing memory on Cygwin,
> >>and this chokes on the changes to the heap start address as of
> >>2011-07-21. I don't know enough programming to fix this.  If anyone
> >>wants to try, the relevant source files to look at are gmalloc.c,
> >>sheap.c, and unexcw.c.  The second and third are compiled only in
> >>the Cygwin build, and the first also has some Cygwin-specific stuff.
> >>
> >>Maybe I should take this to the emacs-devel list at some point, but
> >>I'll wait a while to see if someone on this list can help.
> >
> >I had a look into the sources you're mentioning above, but I don't see
> >anything suspicious, apart from the fact that emacs uses some static
> >buffer of 12 Megs as heap on Cygwin... sometimes.  At least that's what
> >sheap.c is about, afaics.
> 
> I'll build a debug version and try stepping through the functions I
> mentioned.  Maybe I can figure out what's happening.  I suspect that
> it does have something to do with the static heap.

Maybe, but the static heap is in the bss segment, so it's not in the
application heap starting at 0x80000000, but somewhere within the .bss
section's address range from 0x746000 to 0x1380000.

There's also this strange comment at the top of sheap.c:

   simulate `sbrk' with an array in .bss, for `unexec' support for Cygwin;
   complete rewrite of xemacs Cygwin `unexec' code

Whatever "unexec" is.  The code is from 2004.  I'm concerned that it
still tries to work around some old problem in the Cygwin sbrk
implementation in Cygwin 1.5.  Unfortunately the comment doesn't contain
any hint as to what exact problem this code is trying to work around.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 16:26                   ` Corinna Vinschen
@ 2011-08-08 18:21                     ` Achim Gratz
  2011-08-08 20:17                       ` Ken Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Achim Gratz @ 2011-08-08 18:21 UTC (permalink / raw)
  To: cygwin

Corinna Vinschen <...> writes:
> still tries to work around some old problem in the Cygwin sbrk
> implementation in Cygwin 1.5.  Unfortunately the comment doesn't contain
> any hint as to what exact problem this code is trying to work around.

Apologies if that's obvious and you've already checked that: emacs gets
created as a dumpfile of temacs during build, so if peflags moves the
heap retroactively thereafter I can't see how it's going to work since
part of the heap is where it was during dumping and the rest is, well,
somewhere else.  I'd look at the build process first before suspecting
the sources; I would assume that temacs must also be made large address
aware and that right now it just isn't.  There may still be workarounds
that aren't needed anymore and bad assumptions about what the memory map
looks like in Cygwin.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 18:21                     ` Achim Gratz
@ 2011-08-08 20:17                       ` Ken Brown
  2011-08-08 21:17                         ` Ken Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-08 20:17 UTC (permalink / raw)
  To: cygwin

On 8/8/2011 2:20 PM, Achim Gratz wrote:
> Corinna Vinschen<...>  writes:
>> still tries to work around some old problem in the Cygwin sbrk
>> implementation in Cygwin 1.5.  Unfortunately the comment doesn't contain
>> any hint as to what exact problem this code is trying to work around.
>
> Apologies if that's obvious and you've already checked that: emacs gets
> created as a dumpfile of temacs during build, so if peflags moves the
> heap retroactively thereafter I can't see how it's going to work since
> part of the heap is where it was during dumping and the rest is, well,
> somewhere else.  I'd look at the build process first before suspecting
> the sources; I would assume that temacs must also be made large address
> aware and that right now it just isn't.  There may still be workarounds
> that aren't needed anymore and bad assumptions about what the memory map
> looks like in Cygwin.

Thanks for the suggestion, but that doesn't seem to be the issue.  I 
just tried building emacs with LDFLAGS=-Wl,--large-address-aware.  That
should have made temacs and the dumpfile large address aware.  The 
result was that the build didn't finish.  bootstrap-emacs.exe compiled a 
bunch of .el files and then started spinning its wheels, just as in my 
report earlier in this thread.  Attaching gdb and getting a backtrace, I 
again found that emacs was stuck in morecore_nolock, called from 
_malloc_internal_nolock.

Corinna, here's some explanation of the above (and of unexec, which you
were wondering about).  The build process for emacs first compiles the C
source files into an executable temacs.exe, which has no editing
commands.  It then runs temacs.exe, which loads some Lisp files to set
up the editing environment and then dumps itself as emacs.exe.  The 
dumping is done by unexec, which is defined in unexcw.c.  I think that 
the data in the static heap (from sheap.c) is part of what gets dumped, 
so emacs defines a special version of sbrk (called bss_sbrk) that 
simulates sbrk but uses the static heap instead of the ordinary 
application heap.

I don't think emacs is trying to work around problems in Cygwin's sbrk.
In fact, emacs.exe, as opposed to temacs.exe, does use Cygwin's sbrk.
You can see this in the function __default_morecore in gmalloc.c,
which calls bss_sbrk if emacs.exe hasn't yet been dumped (i.e., if
temacs.exe is running) and Cygwin's sbrk otherwise.

I hope this all makes sense and is correct.  It may or may not be 
relevant to figuring out what goes wrong when the ordinary heap starts 
at 0x80000000.
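
The dispatch described above can be sketched roughly as follows.  This
is a minimal illustration, not the code from gmalloc.c or sheap.c: the
names demo_static_heap, demo_bss_sbrk, demo_dumped and demo_morecore are
made up, the buffer is far smaller than the real static heap, and the
post-dump branch is only a stub where the real code would call the
system sbrk.

```c
#include <stddef.h>

/* Hypothetical stand-in for the static heap in sheap.c.  The array is
   zero-initialized, so it lives in .bss and would be part of a dump. */
#define DEMO_STATIC_HEAP_SIZE (64 * 1024)
static char demo_static_heap[DEMO_STATIC_HEAP_SIZE];
static size_t demo_static_used = 0;

/* Hypothetical flag: 0 while "temacs" runs, 1 in the dumped "emacs". */
int demo_dumped = 0;

/* sbrk-like interface on top of the static array: returns the old break
   and moves it by `increment` bytes, or (void *) -1 if the request does
   not fit (the same failure value sbrk uses). */
void *demo_bss_sbrk(ptrdiff_t increment)
{
    if (increment > (ptrdiff_t)DEMO_STATIC_HEAP_SIZE
        || increment < -(ptrdiff_t)DEMO_STATIC_HEAP_SIZE)
        return (void *)-1;

    ptrdiff_t new_used = (ptrdiff_t)demo_static_used + increment;
    if (new_used < 0 || new_used > (ptrdiff_t)DEMO_STATIC_HEAP_SIZE)
        return (void *)-1;

    void *old_break = demo_static_heap + demo_static_used;
    demo_static_used = (size_t)new_used;
    return old_break;
}

/* Chooses the memory source the way the message describes it: the
   static heap before dumping, the system heap afterwards. */
void *demo_morecore(ptrdiff_t increment)
{
    if (!demo_dumped)
        return demo_bss_sbrk(increment);
    /* Stub: the real code would call the system sbrk here. */
    return (void *)-1;
}
```

The point of the sketch is only that memory handed out before dumping
comes from inside the executable image, while memory handed out
afterwards comes from wherever the system places the heap.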

Ken



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 20:17                       ` Ken Brown
@ 2011-08-08 21:17                         ` Ken Brown
  2011-08-08 23:07                           ` Eliot Moss
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-08 21:17 UTC (permalink / raw)
  To: cygwin

On 8/8/2011 4:16 PM, Ken Brown wrote:
> On 8/8/2011 2:20 PM, Achim Gratz wrote:
>> Corinna Vinschen<...>   writes:
>>> still tries to work around some old problem in the Cygwin sbrk
>>> implementation in Cygwin 1.5.  Unfortunately the comment doesn't contain
>>> any hint as to what exact problem this code is trying to work around.
>>
>> Apologies if that's obvious and you've already checked that: emacs gets
>> created as a dumpfile of temacs during build, so if peflags moves the
>> heap retroactively thereafter I can't see how it's going to work since
>> part of the heap is where it was during dumping and the rest is, well,
>> somewhere else.  I'd look at the build process first before suspecting
>> the sources; I would assume that temacs must also be made large address
>> aware and that right now it just isn't.  There may still be workarounds
>> that aren't needed anymore and bad assumptions about what the memory map
>> looks like in Cygwin.
>
> Thanks for the suggestion, but that doesn't seem to be the issue.  I
> just tried building emacs with LDFLAGS=-Wl,--large-address-aware.  That
> should have made temacs and the dumpfile large address aware.  The
> result was that the build didn't finish.  bootstrap-emacs.exe compiled a
> bunch of .el files and then started spinning its wheels, just as in my
> report earlier in this thread.  Attaching gdb and getting a backtrace, I
> again found that emacs was stuck in morecore_nolock, called from
> _malloc_internal_nolock.
>
> Corinna, here's some explanation of the above (and of unexec, which you
> were wondering about.)  The build process for emacs first compiles the C
> source files into an executable temacs.exe, which has no editing
> commands.  It then runs temacs.exe, which loads some lisp files to set
> up the editing environment and then dumps itself as emacs.exe.  The
> dumping is done by unexec, which is defined in unexcw.c.  I think that
> the data in the static heap (from sheap.c) is part of what gets dumped,
> so emacs defines a special version of sbrk (called bss_sbrk) that
> simulates sbrk but uses the static heap instead of the ordinary
> application heap.
>
> I don't think emacs is trying to work around problems in Cygwin's sbrk.
>    In fact, emacs.exe, as opposed to temacs.exe, does use Cygwin's sbrk.
>    You can see this in the function __default_morecore in gmalloc.c,
> which calls bss_sbrk if emacs.exe hasn't yet been dumped (i.e., if
> temacs.exe is running) and Cygwin's sbrk otherwise.
>
> I hope this all makes sense and is correct.  It may or may not be
> relevant to figuring out what goes wrong when the ordinary heap starts
> at 0x80000000.

I built a debug version of emacs, set it for large address awareness, 
let it run for a while, and then attached gdb to it.  It turns out that 
it was stuck in an infinite loop at lines 701-703 of gmalloc.c, with 
newsize = 0:

do
   newsize *= 2;
while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);

My guess now is that there was some invalid pointer arithmetic somewhere 
that led to this, but I don't have time at the moment to look for it. 
I'll do it later (or tomorrow) if no one beats me to it.

Ken




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 21:17                         ` Ken Brown
@ 2011-08-08 23:07                           ` Eliot Moss
  2011-08-09  8:27                             ` Corinna Vinschen
  0 siblings, 1 reply; 34+ messages in thread
From: Eliot Moss @ 2011-08-08 23:07 UTC (permalink / raw)
  To: cygwin

On 8/8/2011 5:17 PM, Ken Brown wrote:

> do
> newsize *= 2;
> while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
>
> My guess now is that there was some invalid pointer arithmetic somewhere that led to this, but I
> don't have time at the moment to look for it. I'll do it later (or tomorrow) if no one beats me to it.

Possibly, Ken. I also wonder about signed vs unsigned calculations
and such. We are looking at the higher end of the address space,
which means negative addresses when considered as signed numbers.

I'm not sure what the above is doing, but if it is trying to
double its understanding of the heap size, based on using the
current end of the heap (result?) as a measure of size, then
if the heap is at 0x80000000, doubling that gives 0 in a 32-bit
address space ...
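
The wrap-around itself is easy to check in isolation.  The sketch below
is only an illustration: uint32_t stands in for the 32-bit size type
(an assumption), and the function name is made up.

```c
#include <stdint.h>

/* Counts how many doublings it takes for a 32-bit unsigned value to
   wrap around to 0.  Returns 0 if the value is already 0, since
   doubling 0 never changes it. */
int doublings_until_zero(uint32_t newsize)
{
    int steps = 0;
    while (newsize != 0) {
        newsize *= 2;   /* unsigned arithmetic wraps modulo 2^32 */
        steps++;
    }
    return steps;
}
```

Starting from 0x80000000, a single doubling already yields 0; a smaller
power of two such as 0x80000 gets there after 13 doublings.  Once the
value is 0, further doubling leaves it at 0, which is consistent with
gdb showing newsize = 0 inside the loop.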

Best wishes -- Eliot Moss


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-08 23:07                           ` Eliot Moss
@ 2011-08-09  8:27                             ` Corinna Vinschen
  2011-08-09 11:19                               ` Ken Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-09  8:27 UTC (permalink / raw)
  To: cygwin

On Aug  8 19:07, Eliot Moss wrote:
> On 8/8/2011 5:17 PM, Ken Brown wrote:
> 
> >do
> >newsize *= 2;
> >while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
> >
> >My guess now is that there was some invalid pointer arithmetic somewhere that led to this, but I
> >don't have time at the moment to look for it. I'll do it later (or tomorrow) if no one beats me to it.
> 
> Possibly, Ken. I also wonder about signed vs unsigned calculations
> and such. We are looking at the higher end of the address space,
> which means negative addresses when considered as signed numbers.
> 
> I'm not sure what the above is doing, but if it is trying to
> double its understanding of the heap size, based on using the
> current end of the heap (result?) as a measure of size, then
> if the heap is at 0x80000000, doubling that gives 0 in a 32-bit
> address space ...

The question is, how could newsize ever become >= 0x80000000?
Ken, what are the values of result and size?  And what is the value of
heapsize?  Consider that the statement before the loop is

  newsize = heapsize;


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-09  8:27                             ` Corinna Vinschen
@ 2011-08-09 11:19                               ` Ken Brown
  2011-08-09 11:35                                 ` Eliot Moss
  2011-08-09 14:13                                 ` Ken Brown
  0 siblings, 2 replies; 34+ messages in thread
From: Ken Brown @ 2011-08-09 11:19 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 4:26 AM, Corinna Vinschen wrote:
> On Aug  8 19:07, Eliot Moss wrote:
>> On 8/8/2011 5:17 PM, Ken Brown wrote:
>>
>>> do
>>> newsize *= 2;
>>> while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
>>>
>>> My guess now is that there was some invalid pointer arithmetic somewhere that led to this, but I
>>> don't have time at the moment to look for it. I'll do it later (or tomorrow) if no one beats me to it.
>>
>> Possibly, Ken. I also wonder about signed vs unsigned calculations
>> and such. We are looking at the higher end of the address space,
>> which means negative addresses when considered as signed numbers.
>>
>> I'm not sure what the above is doing, but if it is trying to
>> double its understanding of the heap size, based on using the
>> current end of the heap (result?) as a measure of size, then
>> if the heap is at 0x80000000, doubling that gives 0 in a 32-bit
>> address space ...
>
> The question is, how could newsize ever become >= 0x80000000?
> Ken, what are the values of result and size?  And what value has
> heapsize?  Consider that the statement before the loop is
>
>    newsize = heapsize;

(gdb) thread 1
[Switching to thread 1 (Thread 19828.0x447c)]
#0  0x00622ee0 in morecore_nolock (size=1052672) at gmalloc.c:703
703           while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
(gdb) p /x size
$1 = 0x101000
(gdb) p /x heapsize
$2 = 0x80000
(gdb) p result
$3 = (void *) 0x807d0000
(gdb) p newsize
$4 = 0
(gdb) p _heapbase
$5 = 0x816000 "\202"
(gdb) p _heapinfo
$6 = (malloc_info *) 0x80060000

Is _heapbase the problem?  This is initialized to _heapinfo at the first 
call of malloc and is never changed.  _heapinfo presumably points into 
the static heap at that point.  (_heapinfo is later changed as a result 
of realloc.)  This low value of _heapbase is used in the BLOCK macro.
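
The gdb values can be plugged into the BLOCK computation by hand.  The
sketch below rests on assumptions that are not quoted anywhere in this
thread: that BLOCK(A) amounts to ((char *) (A) - _heapbase) / BLOCKSIZE + 1
and that BLOCKSIZE is 4096.  It models 32-bit pointers with fixed-width
integers and evaluates the expression twice, once with an unsigned
difference and once treating the pointer difference as a signed 32-bit
value.  All function names are made up for illustration.

```c
#include <stdint.h>

#define DEMO_BLOCKSIZE 4096   /* assumed block size */

/* Block number when the address difference is treated as unsigned. */
uint32_t block_unsigned(uint32_t addr, uint32_t heapbase)
{
    return (addr - heapbase) / DEMO_BLOCKSIZE + 1u;
}

/* Block number when the address difference is treated as a signed
   32-bit value (as a 32-bit pointer difference would be) and the
   result is then cast to an unsigned size type. */
uint32_t block_signed(uint32_t addr, uint32_t heapbase)
{
    int64_t diff = (int64_t)addr - (int64_t)heapbase;
    /* Wrap explicitly into the int32_t range. */
    if (diff > INT32_MAX)
        diff -= INT64_C(4294967296);
    else if (diff < INT32_MIN)
        diff += INT64_C(4294967296);
    int64_t block = diff / DEMO_BLOCKSIZE + 1;
    return (uint32_t)block;   /* negative values become huge */
}

/* Mimics the doubling loop: returns 1 if it would exit, or 0 if
   newsize wraps to 0 first, at which point the real loop would spin
   forever. */
int doubling_loop_exits(uint32_t needed_blocks, uint32_t heapsize)
{
    uint32_t newsize = heapsize;
    do {
        newsize *= 2;
        if (newsize == 0)
            return 0;
    } while (needed_blocks > newsize);
    return 1;
}
```

With result + size = 0x807d0000 + 0x101000 and _heapbase = 0x816000, the
unsigned variant gives 0x800bc, which the loop would cover after one
doubling of heapsize = 0x80000.  The signed variant gives 0xfff800bc,
which no doubling can reach before newsize wraps to 0.  Under these
assumptions, only the signed reading reproduces the observed hang.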

Ken


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 11:19                               ` Ken Brown
@ 2011-08-09 11:35                                 ` Eliot Moss
  2011-08-09 14:13                                 ` Ken Brown
  1 sibling, 0 replies; 34+ messages in thread
From: Eliot Moss @ 2011-08-09 11:35 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 7:19 AM, Ken Brown wrote:
> (gdb) thread 1
> [Switching to thread 1 (Thread 19828.0x447c)]
> #0 0x00622ee0 in morecore_nolock (size=1052672) at gmalloc.c:703
> 703 while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
> (gdb) p /x size
> $1 = 0x101000
> (gdb) p /x heapsize
> $2 = 0x80000
> (gdb) p result
> $3 = (void *) 0x807d0000
> (gdb) p newsize
> $4 = 0
> (gdb) p _heapbase
> $5 = 0x816000 "\202"
> (gdb) p _heapinfo
> $6 = (malloc_info *) 0x80060000
>
> Is _heapbase the problem? This is initialized to _heapinfo at the first call of malloc and is never
> changed. _heapinfo presumably points into the static heap at that point. (_heapinfo is later changed
> as a result of realloc.) This low value of _heapbase is used in the BLOCK macro.

Here's a theory. Emacs is estimating its heap size as being
approximately result+size (i.e., it is assuming its heap is
in relatively low memory). Given the value of result, now in
the upper half of memory, it tries to compute a heap size
(by doubling newsize repeatedly), and thus will double until
newsize goes to 0.

If this theory is correct, a base needs to be subtracted.
If that is happening in the BLOCK macro, and if Ken is
right that heapbase is a small value, then that could
well be the problem: heapbase needs to be "reset" to
0x80000000 ...

Regards -- Eliot Moss


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 11:19                               ` Ken Brown
  2011-08-09 11:35                                 ` Eliot Moss
@ 2011-08-09 14:13                                 ` Ken Brown
  2011-08-09 14:24                                   ` Ken Brown
  1 sibling, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-09 14:13 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 7:19 AM, Ken Brown wrote:
> (gdb) thread 1
> [Switching to thread 1 (Thread 19828.0x447c)]
> #0  0x00622ee0 in morecore_nolock (size=1052672) at gmalloc.c:703
> 703           while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
> (gdb) p /x size
> $1 = 0x101000
> (gdb) p /x heapsize
> $2 = 0x80000
> (gdb) p result
> $3 = (void *) 0x807d0000
> (gdb) p newsize
> $4 = 0
> (gdb) p _heapbase
> $5 = 0x816000 "\202"
> (gdb) p _heapinfo
> $6 = (malloc_info *) 0x80060000
>
> Is _heapbase the problem?  This is initialized to _heapinfo at the first
> call of malloc and is never changed.  _heapinfo presumably points into
> the static heap at that point.  (_heapinfo is later changed as a result
> of realloc.)  This low value of _heapbase is used in the BLOCK macro.

Here's what I think is happening.  When temacs.exe is running during the 
build process (see my explanation of this earlier in the thread), 
malloc_init is called and _heapbase is set.  At this point, temacs is 
using its own static buffer as the heap, and _heapbase gets the value 
0x816000.  This gets dumped as initialized data into emacs.exe, as does 
the value __malloc_initialized = 1.  Now when emacs.exe is run, it sees 
that malloc has already been initialized, so _heapbase retains its 
value, which is no longer appropriate.  All code relying on the BLOCK 
macro is now invalid.

AFAICS, this has always been wrong.  But the error didn't have dramatic 
consequences until the heap was put into high memory.

I'm not sure what's the best way to fix this (assuming my analysis is 
right).  Would it be enough to set __malloc_initialized to 0 before 
dumping?  That would force emacs to reinitialize and get the correct 
value of _heapbase.
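
The stale-state scenario can be modelled in a few lines.  This is a
sketch under stated assumptions, not the gmalloc.c code: the struct and
function names are hypothetical, "dumping" is modelled as a plain struct
copy, and the rest of the allocator's dumped state is left out entirely.

```c
#include <stdint.h>

/* Hypothetical miniature of the allocator state that gets dumped. */
struct demo_malloc_state {
    int initialized;     /* plays the role of __malloc_initialized */
    uint32_t heapbase;   /* plays the role of _heapbase */
};

/* Lazy initialization guarded by a flag: the heap start is recorded
   only the first time this runs. */
void demo_malloc_init(struct demo_malloc_state *st,
                      uint32_t current_heap_start)
{
    if (st->initialized)
        return;
    st->heapbase = current_heap_start;
    st->initialized = 1;
}

/* Initializes in "temacs" with the static heap, dumps the state, then
   starts the dumped image with a different heap start.  Returns the
   heapbase the dumped process ends up using. */
uint32_t demo_heapbase_after_dump(uint32_t static_heap_start,
                                  uint32_t runtime_heap_start)
{
    struct demo_malloc_state temacs = { 0, 0 };
    demo_malloc_init(&temacs, static_heap_start);

    struct demo_malloc_state emacs = temacs;        /* the dump */
    demo_malloc_init(&emacs, runtime_heap_start);   /* has no effect */
    return emacs.heapbase;
}
```

Calling demo_heapbase_after_dump(0x816000, 0x80000000) returns 0x816000:
the dumped process keeps the base recorded during the build, whatever
heap start it is given at run time.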

Ken



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 14:13                                 ` Ken Brown
@ 2011-08-09 14:24                                   ` Ken Brown
  2011-08-09 15:22                                     ` Corinna Vinschen
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-09 14:24 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 10:12 AM, Ken Brown wrote:
> On 8/9/2011 7:19 AM, Ken Brown wrote:
>> (gdb) thread 1
>> [Switching to thread 1 (Thread 19828.0x447c)]
>> #0  0x00622ee0 in morecore_nolock (size=1052672) at gmalloc.c:703
>> 703           while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
>> (gdb) p /x size
>> $1 = 0x101000
>> (gdb) p /x heapsize
>> $2 = 0x80000
>> (gdb) p result
>> $3 = (void *) 0x807d0000
>> (gdb) p newsize
>> $4 = 0
>> (gdb) p _heapbase
>> $5 = 0x816000 "\202"
>> (gdb) p _heapinfo
>> $6 = (malloc_info *) 0x80060000
>>
>> Is _heapbase the problem?  This is initialized to _heapinfo at the first
>> call of malloc and is never changed.  _heapinfo presumably points into
>> the static heap at that point.  (_heapinfo is later changed as a result
>> of realloc.)  This low value of _heapbase is used in the BLOCK macro.
>
> Here's what I think is happening.  When temacs.exe is running during the
> build process (see my explanation of this earlier in the thread),
> malloc_init is called and _heapbase is set.  At this point, temacs is
> using its own static buffer as the heap, and _heapbase gets the value
> 0x816000.  This gets dumped as initialized data into emacs.exe, as does
> the value __malloc_initialized = 1.  Now when emacs.exe is run, it sees
> that malloc has already been initialized, so _heapbase retains its
> value, which is no longer appropriate.  All code relying on the BLOCK
> macro is now invalid.
>
> AFAICS, this has always been wrong.  But the error didn't have dramatic
> consequences until the heap was put into high memory.
>
> I'm not sure what's the best way to fix this (assuming my analysis is
> right).  Would it be enough to set __malloc_initialized to 0 before
> dumping?  That would force emacs to reinitialize and get the correct
> value of _heapbase.

No, that's too simple-minded.  I just tried it, and emacs aborted.  This 
seems like a mess.

Ken


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 14:24                                   ` Ken Brown
@ 2011-08-09 15:22                                     ` Corinna Vinschen
  2011-08-09 16:20                                       ` Ryan Johnson
  2011-08-09 18:21                                       ` Ken Brown
  0 siblings, 2 replies; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-09 15:22 UTC (permalink / raw)
  To: cygwin

On Aug  9 10:23, Ken Brown wrote:
> On 8/9/2011 10:12 AM, Ken Brown wrote:
> >On 8/9/2011 7:19 AM, Ken Brown wrote:
> >>(gdb) thread 1
> >>[Switching to thread 1 (Thread 19828.0x447c)]
> >>#0  0x00622ee0 in morecore_nolock (size=1052672) at gmalloc.c:703
> >>703           while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
> >>(gdb) p /x size
> >>$1 = 0x101000
> >>(gdb) p /x heapsize
> >>$2 = 0x80000
> >>(gdb) p result
> >>$3 = (void *) 0x807d0000
> >>(gdb) p newsize
> >>$4 = 0
> >>(gdb) p _heapbase
> >>$5 = 0x816000 "\202"
> >>(gdb) p _heapinfo
> >>$6 = (malloc_info *) 0x80060000
> >>
> >>Is _heapbase the problem?  This is initialized to _heapinfo at the first
> >>call of malloc and is never changed.  _heapinfo presumably points into
> >>the static heap at that point.  (_heapinfo is later changed as a result
> >>of realloc.)  This low value of _heapbase is used in the BLOCK macro.
> >
> >Here's what I think is happening.  When temacs.exe is running during the
> >build process (see my explanation of this earlier in the thread),
> >malloc_init is called and _heapbase is set.  At this point, temacs is
> >using its own static buffer as the heap, and _heapbase gets the value
> >0x816000.  This gets dumped as initialized data into emacs.exe, as does
> >the value __malloc_initialized = 1.  Now when emacs.exe is run, it sees
> >that malloc has already been initialized, so _heapbase retains its
> >value, which is no longer appropriate.  All code relying on the BLOCK
> >macro is now invalid.
> >
> >AFAICS, this has always been wrong.  But the error didn't have dramatic
> >consequences until the heap was put into high memory.
> >
> >I'm not sure what's the best way to fix this (assuming my analysis is
> >right).  Would it be enough to set __malloc_initialized to 0 before
> >dumping?  That would force emacs to reinitialize and get the correct
> >value of _heapbase.
> 
> No, that's too simple-minded.  I just tried it, and emacs aborted.
> This seems like a mess.

What happens if you remove the Cygwin-specific call to bss_sbrk in
__default_morecore?  In theory that should also break, as long as
temacs isn't also built large address aware.  The only difference would
be that _heapbase = 0x20000000.  But if temacs gets built with large
address awareness set, _heapbase should become 0x80000000.

However, whatever you do, it will not really work.  Keep in mind that
large address awareness only makes sense (and has any effect!) on
systems which provide a large address area.

To me the bottom line here is that emacs is doing the wrong thing.
It makes a couple of assumptions about how a system manages memory which
are just not valid on all systems.  The malloc initialization and the
assignment of the heapbase (the first call to sbrk(0)) should happen
in emacs every time it starts.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 15:22                                     ` Corinna Vinschen
@ 2011-08-09 16:20                                       ` Ryan Johnson
  2011-08-09 18:34                                         ` Ken Brown
  2011-08-09 18:21                                       ` Ken Brown
  1 sibling, 1 reply; 34+ messages in thread
From: Ryan Johnson @ 2011-08-09 16:20 UTC (permalink / raw)
  To: cygwin

On 09/08/2011 11:21 AM, Corinna Vinschen wrote:
> On Aug  9 10:23, Ken Brown wrote:
>> On 8/9/2011 10:12 AM, Ken Brown wrote:
>>> On 8/9/2011 7:19 AM, Ken Brown wrote:
>>>> (gdb) thread 1
>>>> [Switching to thread 1 (Thread 19828.0x447c)]
>>>> #0  0x00622ee0 in morecore_nolock (size=1052672) at gmalloc.c:703
>>>> 703           while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
>>>> (gdb) p /x size
>>>> $1 = 0x101000
>>>> (gdb) p /x heapsize
>>>> $2 = 0x80000
>>>> (gdb) p result
>>>> $3 = (void *) 0x807d0000
>>>> (gdb) p newsize
>>>> $4 = 0
>>>> (gdb) p _heapbase
>>>> $5 = 0x816000 "\202"
>>>> (gdb) p _heapinfo
>>>> $6 = (malloc_info *) 0x80060000
>>>>
>>>> Is _heapbase the problem?  This is initialized to _heapinfo at the first
>>>> call of malloc and is never changed.  _heapinfo presumably points into
>>>> the static heap at that point.  (_heapinfo is later changed as a result
>>>> of realloc.)  This low value of _heapbase is used in the BLOCK macro.
>>> Here's what I think is happening.  When temacs.exe is running during the
>>> build process (see my explanation of this earlier in the thread),
>>> malloc_init is called and _heapbase is set.  At this point, temacs is
>>> using its own static buffer as the heap, and _heapbase gets the value
>>> 0x816000.  This gets dumped as initialized data into emacs.exe, as does
>>> the value __malloc_initialized = 1.  Now when emacs.exe is run, it sees
>>> that malloc has already been initialized, so _heapbase retains its
>>> value, which is no longer appropriate.  All code relying on the BLOCK
>>> macro is now invalid.
>>>
>>> AFAICS, this has always been wrong.  But the error didn't have dramatic
>>> consequences until the heap was put into high memory.
>>>
>>> I'm not sure what's the best way to fix this (assuming my analysis is
>>> right).  Would it be enough to set __malloc_initialized to 0 before
>>> dumping?  That would force emacs to reinitialize and get the correct
>>> value of _heapbase.
>> No, that's too simple-minded.  I just tried it, and emacs aborted.
>> This seems like a mess.
> What happens if you remove the Cygwin-specific call to bss_sbrk in
> __default_morecore?  In theory that should also break, as long as
> temacs isn't also built large address aware.  The only difference:
> _heapbase = 0x20000000.  But if temacs gets built with large address
> awareness set, _heapbase should become 0x80000000.
>
> However, whatever you do, it will not really work.  Keep in mind that
> the large address awareness only makes sense (and has any effect!) on
> systems which provide a large address area.
>
> To me the bottom line here is, that emacs is doing the wrong thing.
> There are a couple of assumptions how a system maintains memory, which
> are just not valid on all systems.  The malloc initialization and the
> assignment of the heapbase (the first call to sbrk(0)) should happen
> in emacs every time it starts.
I'm pretty sure emacs [thinks it] doesn't even use the system heaps 
(sort of how cygwin doesn't use the windows heaps); from what I 
remember, the "heap" in [t]emacs is an .idata section of the image (12MB 
large on my version of emacs) which is supposed to have unused address 
space afterward, similar to how cygwin allocates its heap. There's even 
a comment there that says they got the idea from cygwin.

Does anybody know why emacs is accessing anything at 0x80000000 in the 
first place?

Ryan



* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 15:22                                     ` Corinna Vinschen
  2011-08-09 16:20                                       ` Ryan Johnson
@ 2011-08-09 18:21                                       ` Ken Brown
  2011-08-10  2:33                                         ` Ken Brown
  1 sibling, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-09 18:21 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 11:21 AM, Corinna Vinschen wrote:
> On Aug  9 10:23, Ken Brown wrote:
>> On 8/9/2011 10:12 AM, Ken Brown wrote:
>>> On 8/9/2011 7:19 AM, Ken Brown wrote:
>>>> (gdb) thread 1
>>>> [Switching to thread 1 (Thread 19828.0x447c)]
>>>> #0  0x00622ee0 in morecore_nolock (size=1052672) at gmalloc.c:703
>>>> 703           while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
>>>> (gdb) p /x size
>>>> $1 = 0x101000
>>>> (gdb) p /x heapsize
>>>> $2 = 0x80000
>>>> (gdb) p result
>>>> $3 = (void *) 0x807d0000
>>>> (gdb) p newsize
>>>> $4 = 0
>>>> (gdb) p _heapbase
>>>> $5 = 0x816000 "\202"
>>>> (gdb) p _heapinfo
>>>> $6 = (malloc_info *) 0x80060000
>>>>
>>>> Is _heapbase the problem?  This is initialized to _heapinfo at the first
>>>> call of malloc and is never changed.  _heapinfo presumably points into
>>>> the static heap at that point.  (_heapinfo is later changed as a result
>>>> of realloc.)  This low value of _heapbase is used in the BLOCK macro.
>>>
>>> Here's what I think is happening.  When temacs.exe is running during the
>>> build process (see my explanation of this earlier in the thread),
>>> malloc_init is called and _heapbase is set.  At this point, temacs is
>>> using its own static buffer as the heap, and _heapbase gets the value
>>> 0x816000.  This gets dumped as initialized data into emacs.exe, as does
>>> the value __malloc_initialized = 1.  Now when emacs.exe is run, it sees
>>> that malloc has already been initialized, so _heapbase retains its
>>> value, which is no longer appropriate.  All code relying on the BLOCK
>>> macro is now invalid.
>>>
>>> AFAICS, this has always been wrong.  But the error didn't have dramatic
>>> consequences until the heap was put into high memory.
>>>
>>> I'm not sure what's the best way to fix this (assuming my analysis is
>>> right).  Would it be enough to set __malloc_initialized to 0 before
>>> dumping?  That would force emacs to reinitialize and get the correct
>>> value of _heapbase.
>>
>> No, that's too simple-minded.  I just tried it, and emacs aborted.
>> This seems like a mess.
>
> What happens if you remove the Cygwin-specific call to bss_sbrk in
> __default_morecore?

That will mess up dumping.  The point of using bss_sbrk and simulating 
the heap in a static buffer is that whatever has been stored in that 
buffer gets dumped into emacs.exe as initialized data.  See unexcw.c.

> In theory that should also break, as long as
> temacs isn't also built large address aware.  The only difference:
> _heapbase = 0x20000000.  But if temacs gets built with large address
> awareness set, _heapbase should become 0x80000000.

That would make _heapbase (which is part of the dumped emacs.exe) depend 
on the build system.  Obviously, as you say below, _heapbase needs to be 
determined at run time.
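
To make the failure concrete, here is a minimal sketch of the block
arithmetic in 32-bit terms.  It assumes that BLOCK(A) is the signed
pointer difference from _heapbase divided by a 4096-byte block size,
plus one, which is how I recall gmalloc.c defining it; treat both the
macro shape and the block size as assumptions.  The names block_of and
doublings_needed are hypothetical helpers, not functions from emacs.
The address and size values are the ones from the gdb session quoted
above.

```c
#include <stdint.h>

#define BLOCKSIZE 4096   /* assumed gmalloc block size (4 KiB) */

/* 32-bit model of BLOCK(A): signed distance from the heap base,
   divided by the block size, plus one, then viewed as an unsigned
   size (mirroring the (__malloc_size_t) cast in morecore_nolock). */
static uint32_t block_of(uint32_t addr, uint32_t heapbase)
{
    int32_t diff = (int32_t)(addr - heapbase);  /* ptrdiff_t on 32-bit */
    return (uint32_t)(diff / BLOCKSIZE + 1);
}

/* Model of the table-doubling loop.  Returns how many doublings are
   needed to cover `block`, or -1 if newsize wraps around to 0, in
   which case the real loop would never terminate. */
static int doublings_needed(uint32_t block, uint32_t heapsize)
{
    uint32_t newsize = heapsize;
    int n = 0;

    do {
        newsize *= 2;
        n++;
        if (newsize == 0)
            return -1;          /* wrapped: block can never be covered */
    } while (block > newsize);
    return n;
}
```

With result + size = 0x807d0000 + 0x101000 and the stale _heapbase =
0x816000, the difference is negative as a 32-bit signed value, so
block_of returns a huge unsigned number and doublings_needed reports
-1, consistent with newsize = 0 in the gdb output.  With a hypothetical
low-heap address such as 0x207d0000, the same call needs a single
doubling, which would explain why the stale base went unnoticed before.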

> However, whatever you do, it will not really work.  Keep in mind that
> the large address awareness only makes sense (and has any effect!) on
> systems which provide a large address area.
>
> To me the bottom line here is, that emacs is doing the wrong thing.
> There are a couple of assumptions how a system maintains memory, which
> are just not valid on all systems.  The malloc initialization and the
> assignment of the heapbase (the first call to sbrk(0)) should happen
> in emacs every time it starts.

That makes sense to me.  I thought that was what I was accomplishing 
(for Cygwin) by setting __malloc_initialized to 0 before dumping.  I'm 
not sure why it didn't work.  In any case, the fix shouldn't be Cygwin 
specific.  It's probably time to report this as an emacs bug.

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 16:20                                       ` Ryan Johnson
@ 2011-08-09 18:34                                         ` Ken Brown
  0 siblings, 0 replies; 34+ messages in thread
From: Ken Brown @ 2011-08-09 18:34 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 12:20 PM, Ryan Johnson wrote:
> I'm pretty sure emacs [thinks it] doesn't even use the system heaps
> (sort of how cygwin doesn't use the windows heaps); from what I
> remember, the "heap" in [t]emacs is an .idata section of the image (12MB
> large on my version of emacs) which is supposed to have unused address
> space afterward, similar to how cygwin allocates its heap. There's even
> a comment there that says they got the idea from cygwin.

I think you're misreading the code.  The 12MB you're talking about is 
the static heap, used as the heap by temacs.  See the discussion of 
bss_sbrk earlier in the thread.  But emacs (as opposed to temacs) will 
start getting addresses in the heap allocated to it by Cygwin as soon as 
it calls sbrk, which it does if it needs more memory.  See 
__default_morecore in gmalloc.c.

> Does anybody know why emacs is accessing anything at 0x80000000 in the
> first place?

As explained earlier in the thread, Cygwin starts the heap at 0x20000000 
or 0x80000000 as of the 2011-07-21 snapshot.  It uses the higher value 
if the application and the system support large address awareness.
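
As a rough illustration of determining the heap position at run time
rather than baking it into the dumped image, the sketch below asks the
system for the current break with sbrk(0) and classifies an address
against the 2 GB line.  The helper names current_break and
is_high_address are hypothetical; only sbrk itself is a real call.

```c
#define _DEFAULT_SOURCE   /* asks glibc to declare sbrk under strict -std modes */
#include <stdint.h>
#include <unistd.h>

/* Returns the current program break, or NULL if sbrk fails. */
static void *current_break(void)
{
    void *brk = sbrk(0);
    return brk == (void *)-1 ? NULL : brk;
}

/* Nonzero if the address lies at or above 0x80000000, i.e. in the
   region that only large-address-aware 32-bit processes can use. */
static int is_high_address(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)0x80000000u;
}
```

A startup routine could call current_break() once and record the
result as the heap base for that run, so the value never depends on
the machine that built the executable.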

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-09 18:21                                       ` Ken Brown
@ 2011-08-10  2:33                                         ` Ken Brown
  2011-08-10  2:39                                           ` Ryan Johnson
  2011-08-10 11:47                                           ` Corinna Vinschen
  0 siblings, 2 replies; 34+ messages in thread
From: Ken Brown @ 2011-08-10  2:33 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 2:21 PM, Ken Brown wrote:
> On 8/9/2011 11:21 AM, Corinna Vinschen wrote:
>> However, whatever you do, it will not really work.  Keep in mind that
>> the large address awareness only makes sense (and has any effect!) on
>> systems which provide a large address area.
>>
>> To me the bottom line here is, that emacs is doing the wrong thing.
>> There are a couple of assumptions how a system maintains memory, which
>> are just not valid on all systems.  The malloc initialization and the
>> assignment of the heapbase (the first call to sbrk(0)) should happen
>> in emacs every time it starts.
>
> That makes sense to me.  I thought that was what I was accomplishing
> (for Cygwin) by setting __malloc_initialized to 0 before dumping.  I'm
> not sure why it didn't work.  In any case, the fix shouldn't be Cygwin
> specific.  It's probably time to report this as an emacs bug.

I submitted a bug report and may or may not get a useful response. 
While waiting, I'd like to keep trying to figure out what the right fix 
is.  Unless the dumping mechanism (unexec) is completely revamped, we 
can't just ignore the static heap.  Some of it has already been 
allocated by temacs and has to be taken into account by the memory 
management scheme.  So when emacs starts up (as of 2011-07-21), the heap 
is going to come in two pieces: the static heap in low memory and the 
Cygwin-provided heap starting at 0x20000000 or 0x80000000.  I can't 
think of any easy way of dealing with this, short of drastically 
rewriting malloc.  Do you have any suggestions?

BTW, I don't necessarily have to use the malloc that comes with emacs.
I just verified that I can build emacs so that it uses Cygwin's malloc.
I haven't done any testing yet to make sure there are no glitches, but
I think it will be OK.  Assuming this is the case, does that simplify
dealing with a heap that has two non-contiguous pieces?

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-10  2:33                                         ` Ken Brown
@ 2011-08-10  2:39                                           ` Ryan Johnson
  2011-08-10 14:57                                             ` Ken Brown
  2011-08-10 11:47                                           ` Corinna Vinschen
  1 sibling, 1 reply; 34+ messages in thread
From: Ryan Johnson @ 2011-08-10  2:39 UTC (permalink / raw)
  To: cygwin

On 09/08/2011 10:33 PM, Ken Brown wrote:
> On 8/9/2011 2:21 PM, Ken Brown wrote:
>> On 8/9/2011 11:21 AM, Corinna Vinschen wrote:
>>> However, whatever you do, it will not really work.  Keep in mind that
>>> the large address awareness only makes sense (and has any effect!) on
>>> systems which provide a large address area.
>>>
>>> To me the bottom line here is, that emacs is doing the wrong thing.
>>> There are a couple of assumptions how a system maintains memory, which
>>> are just not valid on all systems.  The malloc initialization and the
>>> assignment of the heapbase (the first call to sbrk(0)) should happen
>>> in emacs every time it starts.
>>
>> That makes sense to me.  I thought that was what I was accomplishing
>> (for Cygwin) by setting __malloc_initialized to 0 before dumping.  I'm
>> not sure why it didn't work.  In any case, the fix shouldn't be Cygwin
>> specific.  It's probably time to report this as an emacs bug.
>
> I submitted a bug report and may or may not get a useful response. 
> While waiting, I'd like to keep trying to figure out what the right 
> fix is.  Unless the dumping mechanism (unexec) is completely revamped, 
> we can't just ignore the static heap.  Some of it has already been 
> allocated by temacs and has to be taken into account by the memory 
> management scheme.  So when emacs starts up (as of 2011-07-21), the 
> heap is going to come in two pieces: the static heap in low memory and 
> the Cygwin-provided heap starting at 0x20000000 or 0x80000000.  I 
> can't think of any easy way of dealing with this, short of drastically 
> rewriting malloc.  Do you have any suggestions?
>
> BTW, I don't necessarily have to use the malloc that comes with emacs. 
> I just verified that I can build emacs so that it uses Cygwin's 
> malloc.  I haven't done any testing yet to make sure there are no 
> glitches, but I think it will be OK.  Assuming this is the case, does 
> that simplify dealing with a heap that has two non-contiguous pieces?

Given that the static heap is only 12MB, with most of that arguably 
occupied by stuff that isn't going away, what if we did "just ignore the 
static heap" (mostly)? Anything freed from that region just gets dropped 
on the floor and all new requests are served from the cygwin heap? I 
assume temacs stays away from the dynamic heap, since otherwise the dump 
would be corrupted.

Ryan



* Re: emacs and large-address awareness under recent snapshots
  2011-08-10  2:33                                         ` Ken Brown
  2011-08-10  2:39                                           ` Ryan Johnson
@ 2011-08-10 11:47                                           ` Corinna Vinschen
  2011-08-10 15:28                                             ` Ken Brown
  1 sibling, 1 reply; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-10 11:47 UTC (permalink / raw)
  To: cygwin

On Aug  9 22:33, Ken Brown wrote:
> On 8/9/2011 2:21 PM, Ken Brown wrote:
> BTW, I don't necessarily have to use the malloc that comes with
> emacs. I just verified that I can build emacs so that it uses
> Cygwin's malloc.  I haven't done any testing yet to make sure there
> are no glitches, but I think it will be OK.  Assuming this is the
> case, does that simplify dealing with a heap that has two
> non-contiguous pieces?

I guess so.  Cygwin's malloc obviously uses Cygwin's heap or mmap to get
memory.  If it works, I don't see a reason to stick to emacs' own malloc
implementation.  Is emacs always using its own malloc by default, or
is it using its own malloc only on certain platforms?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: emacs and large-address awareness under recent snapshots
  2011-08-10  2:39                                           ` Ryan Johnson
@ 2011-08-10 14:57                                             ` Ken Brown
  2011-08-11 21:14                                               ` Ken Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-10 14:57 UTC (permalink / raw)
  To: cygwin

On 8/9/2011 10:39 PM, Ryan Johnson wrote:
> On 09/08/2011 10:33 PM, Ken Brown wrote:
>> I submitted a bug report and may or may not get a useful response.
>> While waiting, I'd like to keep trying to figure out what the right
>> fix is.  Unless the dumping mechanism (unexec) is completely revamped,
>> we can't just ignore the static heap.  Some of it has already been
>> allocated by temacs and has to be taken into account by the memory
>> management scheme.  So when emacs starts up (as of 2011-07-21), the
>> heap is going to come in two pieces: the static heap in low memory and
>> the Cygwin-provided heap starting at 0x20000000 or 0x80000000.  I
>> can't think of any easy way of dealing with this, short of drastically
>> rewriting malloc.  Do you have any suggestions?
>>
>> BTW, I don't necessarily have to use the malloc that comes with emacs.
>> I just verified that I can build emacs so that it uses Cygwin's
>> malloc.  I haven't done any testing yet to make sure there are no
>> glitches, but I think it will be OK.  Assuming this is the case, does
>> that simplify dealing with a heap that has two non-contiguous pieces?
> Given that the static heap is only 12MB, with most of that arguably
> occupied by stuff that isn't going away, what if we did "just ignore the
> static heap" (mostly)? Anything freed from that region just gets dropped
> on the floor and all new requests are served from the cygwin heap? I
> assume temacs stays away from the dynamic heap, since otherwise the dump
> would be corrupted.

I tried forcing malloc to reinitialize itself in emacs.c, and emacs 
crashed almost immediately.  A gdb backtrace showed that the memory got 
corrupted as soon as realloc got called on objects that were originally 
stored in the static heap.  After reinitialization, malloc had no 
knowledge of memory allocation in the static heap.

Ken


* Re: emacs and large-address awareness under recent snapshots
  2011-08-10 11:47                                           ` Corinna Vinschen
@ 2011-08-10 15:28                                             ` Ken Brown
  2011-08-10 15:59                                               ` Eliot Moss
  0 siblings, 1 reply; 34+ messages in thread
From: Ken Brown @ 2011-08-10 15:28 UTC (permalink / raw)
  To: cygwin

On 8/10/2011 7:47 AM, Corinna Vinschen wrote:
> On Aug  9 22:33, Ken Brown wrote:
>> On 8/9/2011 2:21 PM, Ken Brown wrote:
>> BTW, I don't necessarily have to use the malloc that comes with
>> emacs. I just verified that I can build emacs so that it uses
>> Cygwin's malloc.  I haven't done any testing yet to make sure there
>> are no glitches, but I think it will be OK.  Assuming this is the
>> case, does that simplify dealing with a heap that has two
>> non-contiguous pieces?
>
> I guess so.  Cygwin's malloc obviously uses Cygwin's heap or mmap to get
> memory.  If it works, I don't see a reason to stick to emacs' own malloc
> implementation.  Is emacs always using its own malloc by default, or
> is it using its own malloc only on certain platforms?

It uses its own malloc only on certain platforms.  This is determined 
during configuration, and it's easy enough to tell it to use Cygwin's 
malloc.

Of course, Cygwin's malloc won't use bss_sbrk, so nothing that temacs 
loads into memory (which would be in the Cygwin heap) will get dumped. 
I wonder if there's a better way to do the dumping to get around this. 
Here's what happens currently:

temacs starts up and then loads a bunch of lisp files.  Memory for this
is allocated in the static heap by emacs's malloc, which uses bss_sbrk
at this stage.  Note that the static heap also contains the table that
malloc uses to keep track of the memory it has allocated.  temacs then
writes a file emacs.exe, which starts out as a copy of temacs.exe but
with the bss and data sections replaced by those of the running temacs.
The bss section contains the static heap.  Finally, temacs converts
this new bss section in emacs.exe to an initialized data section.  The
bulk of the work for this is done by the function fixup_executable in
unexcw.c.
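
For readers unfamiliar with the idea, the following is a simplified
sketch of an sbrk-style allocator that hands out memory from a static
buffer.  It is not the emacs source: static_sbrk, static_heap and
STATIC_HEAP_SIZE are illustrative names, and the buffer is kept small
on purpose.  Because the buffer is an ordinary static array, whatever
is stored in it is part of the process image, which is the property
the dumping scheme relies on.

```c
#include <errno.h>
#include <stddef.h>

#define STATIC_HEAP_SIZE (64 * 1024)  /* illustrative; the real one is ~12 MB */

static char static_heap[STATIC_HEAP_SIZE];  /* zero-initialized static array */
static char *static_brk = static_heap;      /* current "break" in the buffer */

/* sbrk-like interface over the static buffer: returns the previous
   break and moves it by `increment` bytes.  On exhaustion (or when
   shrinking below the start) it returns (void *)-1 and sets ENOMEM. */
static void *static_sbrk(ptrdiff_t increment)
{
    char *old = static_brk;
    ptrdiff_t used = static_brk - static_heap;

    if (increment > (ptrdiff_t)STATIC_HEAP_SIZE - used || increment < -used) {
        errno = ENOMEM;
        return (void *)-1;
    }
    static_brk += increment;
    return old;
}
```

A malloc built on top of this never touches the system heap until the
buffer is exhausted, so every object it creates ends up inside the
array that gets written out with the image.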

Would it be possible to accomplish the same goal without using bss_sbrk 
and the static heap?  In other words, can one save the information on 
the Cygwin heap as part of emacs.exe, so that when emacs is run the heap 
gets restored?  I know virtually nothing about the structure of .exe 
files and how the loader works, so I have no idea whether that's feasible.

That might be a much better solution.  We'd still have the issue of the 
heap sometimes starting at 0x20000000 and sometimes at 0x80000000, but 
that seems less important.  At worst, we would just have to give up the 
possibility of using large address awareness for emacs.  But I'm afraid 
that emacs will have memory problems in Cygwin 1.7.10 if we just leave 
things the way they are.[*]

Ken

[*] I've actually been running emacs with the 2011-08-03 snapshot 
(patched by your fix at the very beginning of this thread), and I 
haven't noticed any memory problems.  But I feel like that's just luck, 
because I don't see how the code in gmalloc.c could possibly be doing 
the right thing when Cygwin's heap starts at 0x20000000.  Maybe I'm 
missing something.


* Re: emacs and large-address awareness under recent snapshots
  2011-08-10 15:28                                             ` Ken Brown
@ 2011-08-10 15:59                                               ` Eliot Moss
  2011-08-11  7:43                                                 ` Corinna Vinschen
  0 siblings, 1 reply; 34+ messages in thread
From: Eliot Moss @ 2011-08-10 15:59 UTC (permalink / raw)
  To: cygwin

On 8/10/2011 11:28 AM, Ken Brown wrote:

> Would it be possible to accomplish the same goal without using bss_sbrk and the static heap? In
> other words, can one save the information on the Cygwin heap as part of emacs.exe, so that when
> emacs is run the heap gets restored? I know virtually nothing about the structure of .exe files and
> how the loader works, so I have no idea whether that's feasible.

I would think so.  The trick is knowing what pages contain
the Cygwin heap.  As for the other approach, these need to
get dumped as initialized data segments.  It might not be
too hard if the Cygwin heap provides functions telling you
where it starts and ends (more generally, the ranges of pages
in which it lies).

I begin to wonder, though, whether this would mean having
to provide two different copies of emacs, one with the
heap at 0x2... and one with it at 0x8...

Best wishes -- Eliot Moss


* Re: emacs and large-address awareness under recent snapshots
  2011-08-10 15:59                                               ` Eliot Moss
@ 2011-08-11  7:43                                                 ` Corinna Vinschen
  0 siblings, 0 replies; 34+ messages in thread
From: Corinna Vinschen @ 2011-08-11  7:43 UTC (permalink / raw)
  To: cygwin

On Aug 10 11:58, Eliot Moss wrote:
> On 8/10/2011 11:28 AM, Ken Brown wrote:
> 
> >Would it be possible to accomplish the same goal without using bss_sbrk and the static heap? In
> >other words, can one save the information on the Cygwin heap as part of emacs.exe, so that when
> >emacs is run the heap gets restored? I know virtually nothing about the structure of .exe files and
> >how the loader works, so I have no idea whether that's feasible.
> 
> I would think so.  The trick is knowing what pages contain
> the Cygwin heap.  As for the other approach, these need to
> get dumped as initialized data segments.  It might not be
> too hard if the Cygwin heap provides functions telling you
> where it starts and ends (more generally, the ranges of pages
> in which it lies).
> 
> I begin to wonder, though, whether this would mean having
> to provide two different copies of emacs, one with the
> heap at 0x2... and one with it at 0x8...

Never, please.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: emacs and large-address awareness under recent snapshots
  2011-08-10 14:57                                             ` Ken Brown
@ 2011-08-11 21:14                                               ` Ken Brown
  0 siblings, 0 replies; 34+ messages in thread
From: Ken Brown @ 2011-08-11 21:14 UTC (permalink / raw)
  To: cygwin

On 8/10/2011 10:56 AM, Ken Brown wrote:
> On 8/9/2011 10:39 PM, Ryan Johnson wrote:
>> Given that the static heap is only 12MB, with most of that arguably
>> occupied by stuff that isn't going away, what if we did "just ignore the
>> static heap" (mostly)? Anything freed from that region just gets dropped
>> on the floor and all new requests are served from the cygwin heap? I
>> assume temacs stays away from the dynamic heap, since otherwise the dump
>> would be corrupted.
>
> I tried forcing malloc to reinitialize itself in emacs.c, and emacs
> crashed almost immediately.  A gdb backtrace showed that the memory got
> corrupted as soon as realloc got called on objects that were originally
> stored in the static heap.  After reinitialization, malloc had no
> knowledge of memory allocation in the static heap.

I think there's an obvious solution to this.  At the time of 
reinitialization, we save the previous malloc state.  Then if realloc is 
called on a pointer to something in the static heap, we temporarily 
restore the old state and let realloc proceed as it did in temacs prior 
to dumping.
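
One way to picture this is the sketch below.  It is only a sketch
under stated assumptions: struct malloc_state, live_state,
dumped_state, the static-heap bounds, and realloc_with_saved_state are
all hypothetical names, and the real allocator keeps more bookkeeping
than the two fields shown.  probe_realloc is a stand-in for the real
realloc so that the state swap can be observed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bundle of the allocator's global bookkeeping. */
struct malloc_state {
    char  *heapbase;   /* base address used for block arithmetic */
    size_t heapsize;   /* number of entries in the info table */
};

static struct malloc_state live_state;    /* state after reinitialization */
static struct malloc_state dumped_state;  /* state saved from the dump */

/* Bounds of the static heap (hypothetical; set once at startup). */
static char *static_heap_start;
static char *static_heap_end;

static int in_static_heap(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)static_heap_start
        && a <  (uintptr_t)static_heap_end;
}

typedef void *(*realloc_fn)(void *, size_t);

/* For pointers into the static heap, run `inner` with the dumped
   state swapped in, then put the live state back.  Other pointers
   go straight to `inner` under the live state. */
static void *realloc_with_saved_state(void *ptr, size_t size, realloc_fn inner)
{
    if (ptr == NULL || !in_static_heap(ptr))
        return inner(ptr, size);

    struct malloc_state saved = live_state;
    live_state = dumped_state;
    void *result = inner(ptr, size);
    dumped_state = live_state;   /* keep any bookkeeping changes */
    live_state = saved;
    return result;
}

/* Stand-in for the real realloc: records which heap base was active
   during the call and returns the pointer unchanged. */
static char *seen_heapbase;
static void *probe_realloc(void *ptr, size_t size)
{
    (void)size;
    seen_heapbase = live_state.heapbase;
    return ptr;
}
```

The sketch leaves open where a grown object ends up once the old state
is active; a real version would still have to decide whether such a
request may extend into the Cygwin heap or must stay inside the static
buffer.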

Unless I've (again) missed something obvious, it shouldn't be too hard 
to do this.  I'm about to go on vacation, but I should have a new emacs 
release within a couple weeks.

Ken

