* [ECOS] eth_recv out of MBUFs
@ 2011-07-20 19:16 Stanislav Meduna
2011-07-20 21:27 ` Stanislav Meduna
2011-08-05 7:52 ` Lambrecht Jürgen
0 siblings, 2 replies; 20+ messages in thread
From: Stanislav Meduna @ 2011-07-20 19:16 UTC (permalink / raw)
To: ecos-discuss
Hi,
I am quite reproducibly able to produce
eth_recv out of MBUFs
in the FreeBSD stack. The setup is two incoming TCP connections
sending data, a UDP socket sending data, pinging the device,
closing and reopening the TCP connections a few times, and
pulling and replugging the Ethernet cable.
The device then never recovers. The partner sends ARP requests
with no response from eCos, as there is no mbuf to accept them.
eCos never sends anything. eth_drv_tickle_devices runs, but
finds IF_IS_EMPTY(&ifp->if_snd) true and never touches
the interface. Maybe it wants to ARP first, but was unable
to send the request - I don't know. Unless I am blind
I am calling tx_done for every packet passed to send
(but I am going to double-check this).
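One way to double-check the send/tx_done pairing is a pair of counters. The following is only a sketch: the struct and function names are hypothetical and not part of the eCos driver API; the calls would have to be placed by hand in the driver's send routine and wherever it reports tx_done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical audit counters; not part of the eCos driver API. */
struct tx_audit {
    unsigned long queued;    /* packets handed to the driver's send routine */
    unsigned long completed; /* packets reported back through tx_done */
};

/* Call at the top of the driver's send routine. */
static void tx_audit_on_send(struct tx_audit *a)
{
    assert(a != NULL);
    a->queued++;
}

/* Call right before the driver reports tx_done to the stack. */
static void tx_audit_on_done(struct tx_audit *a)
{
    assert(a != NULL);
    a->completed++;
    /* More completions than sends would mean a duplicated tx_done. */
    assert(a->completed <= a->queued);
}

/* Packets still owned by the driver. A value that keeps growing
 * suggests some path (error, link down, reset) skips tx_done. */
static unsigned long tx_audit_outstanding(const struct tx_audit *a)
{
    assert(a != NULL);
    return a->queued - a->completed;
}
```

Printing tx_audit_outstanding() periodically, for example next to the cyg_net_show_mbufs output, would show whether the driver is holding on to packets.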
The mbufs as shown by cyg_net_show_mbufs are full of DATA
with short size (60 or 64) and flags 2 (M_PKTHDR).
Has anyone seen something like that? I am not very familiar
with the TCP/IP stacks in general - so it is quite problematic
for me to debug something there. Are there some constraints
of the "mbuf space has to be larger than tcp window of active
connections" type or something like that?
I am reluctant to "fix" it by enlarging the space for mbufs,
as I think this should never happen regardless of the mbuf space, and I am
also on a quite memory-constrained device where every 100 kB
matters.
Thanks for any hints
--
Stano
--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2011-07-20 19:16 [ECOS] eth_recv out of MBUFs Stanislav Meduna
@ 2011-07-20 21:27 ` Stanislav Meduna
2011-08-05 7:52 ` Lambrecht Jürgen
1 sibling, 0 replies; 20+ messages in thread
From: Stanislav Meduna @ 2011-07-20 21:27 UTC (permalink / raw)
To: ecos-discuss
On 20.07.2011 21:15, Stanislav Meduna wrote:
> The device then never recovers.
Okay - it does recover if left sitting for minutes (or tens
of minutes) with the cable unplugged. So this is not
an unrecoverable leak, the mbufs will eventually get
drained.
Well, time to reproduce it again and look at _what_ is stuck
in those buffers...
--
Stano
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2011-07-20 19:16 [ECOS] eth_recv out of MBUFs Stanislav Meduna
2011-07-20 21:27 ` Stanislav Meduna
@ 2011-08-05 7:52 ` Lambrecht Jürgen
2011-08-09 6:37 ` Stanislav Meduna
1 sibling, 1 reply; 20+ messages in thread
From: Lambrecht Jürgen @ 2011-08-05 7:52 UTC (permalink / raw)
To: ecos-discuss
(try again to send to ecos)
On 07/20/2011 09:15 PM, Stanislav Meduna wrote:
>
> Hi,
>
> I am quite reproducibly able to produce
>
> eth_recv out of MBUFs
>
> in the FreeBSD stack. The setup is two incoming TCP connections
> sending data, a UDP socket sending data, pinging the device,
> closing and reopening the TCP connections a few times, and
> pulling and replugging the Ethernet cable.
>
Be careful: I believe the eCos FreeBSD stack has no timeouts on TCP
connections, but the PC stack often does!
So the eCos TCP session stays alive, while the PC stack (used for testing)
shuts down the TCP connection when the cable breaks (after a timeout, or
it can also detect the cable fault). The two TCP endpoints are then out of
sync (something like that), eCos ends up using a lot of sockets, and indeed
(as in your next mail) the timeouts for freeing sockets are very long.
If that is your problem, I can ask my colleague: he recompiled eCos to
change the timeout from minutes to seconds.
>
>
> The device then never recovers. The partner sends ARP requests
> with no response from eCos, as there is no mbuf to accept them.
> eCos never sends anything. eth_drv_tickle_devices runs, but
> finds IF_IS_EMPTY(&ifp->if_snd) true and never touches
> the interface. Maybe it wants to ARP first, but was unable
> to send the request - I don't know. Unless I am blind
> I am calling tx_done for every packet passed to send
> (but I am going to double-check this).
>
> The mbufs as shown by cyg_net_show_mbufs are full of DATA
> with short size (60 or 64) and flags 2 (M_PKTHDR).
>
> Has anyone seen something like that? I am not very familiar
> with the TCP/IP stacks in general - so it is quite problematic
> for me to debug something there. Are there some constraints
> of the "mbuf space has to be larger than tcp window of active
> connections" type or something like that?
>
> I am reluctant to "fix" it by enlarging the space for mbufs,
> as I think this should never happen regardless of the mbuf space, and I am
> also on a quite memory-constrained device where every 100 kB
> matters.
>
With a slow device and big bursts of data, I had to increase the space
for mbufs.
I have had some problems with segmented data (IP packets larger than
1518 B minus header sizes), but I believe the FreeBSD stack is OK.
It is better to use lwIP on memory-constrained devices, and lwIP is more
actively maintained.
Success,
Jürgen
--
Jürgen Lambrecht
R&D Associate
Tel: +32 (0)51 303045 Fax: +32 (0)51 310670
http://www.televic-rail.com
Televic Rail NV - Leo Bekaertlaan 1 - 8870 Izegem - Belgium
Company number 0825.539.581 - RPR Kortrijk
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2011-08-05 7:52 ` Lambrecht Jürgen
@ 2011-08-09 6:37 ` Stanislav Meduna
2011-08-09 12:55 ` Wayne Visser
0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Meduna @ 2011-08-09 6:37 UTC (permalink / raw)
To: ecos-discuss
On 05.08.2011 09:51, Lambrecht Jürgen wrote:
> Be careful: I believe the eCos FreeBSD stack has no timeouts on TCP
> connections, but the PC stack often does!
Well the TCP timeouts that are in place when the connection
is not sending any data are basically unusable on any
standard-conforming TCP/IP stack. Whoever designed
SO_KEEPALIVE the way it is specified was smoking something
quite strong...
> So the eCos TCP session stays alive, but the PC (used to test) stack
> shuts down the TCP connection when the cable breaks (after a timeout, or
> it can also detect the cable fault). And the TCPs are out of sync
> (something like that), and then it uses a lot of sockets, and indeed (as
> in your next mail) the timeouts are very big to free sockets.
> If that is your problem, I can ask my colleague: he recompiled eCos to
> change the timeout from minutes to seconds.
This is most probably not the source of the problem I am seeing
in this test setup. Thanks for the hint anyway.
> With a slow device and big burst of data, I had to increase the space
> for mbufs.
Well, but to what size? I'd like to know the formula to calculate
the worst-case mbuf needs based on the number of active TCP
and UDP sockets... I am probably asking for too much :(
My gut feeling is that the TCP windows of all open connections
do not fit into the allocated mbufs. I'll try to tune this
stuff - unfortunately the FreeBSD stack is not really
friendly here.
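A back-of-the-envelope check of that gut feeling could look like the sketch below. It assumes that each TCP connection may keep up to one full send buffer plus one full receive buffer queued, and it ignores per-mbuf overhead and UDP, so it is a rough lower bound rather than the formula asked for above. The function names and the buffer sizes used in the example are made up for illustration.

```c
#include <stddef.h>

/* Worst-case bytes that n_tcp connections may keep queued, assuming each
 * one can fill both its send and its receive socket buffer. */
static size_t tcp_worst_case_bytes(size_t n_tcp, size_t sndbuf, size_t rcvbuf)
{
    return n_tcp * (sndbuf + rcvbuf);
}

/* Returns 1 if that demand fits into pool_bytes, 0 otherwise. */
static int tcp_buffers_fit(size_t pool_bytes, size_t n_tcp,
                           size_t sndbuf, size_t rcvbuf)
{
    return tcp_worst_case_bytes(n_tcp, sndbuf, rcvbuf) <= pool_bytes;
}
```

With hypothetical 16 KB send and receive buffers, two connections already need 64 KB under this assumption, and a third one no longer fits into a 64 KB pool.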
> Better use lwIP on memory-constrained devices. And lwIP is more actively
> maintained..
This has been on my todo list for quite a long time - unfortunately
there are issues that have to be solved first. AFAIK lwIP
is not thread-safe at the level of a single socket and our present
framework does read from and write to a single socket from
different threads :/
I also have no idea whether everything from eCos we are
using/planning to use is able to work with lwIP (e.g. SNMP).
Regards
--
Stano
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2011-08-09 6:37 ` Stanislav Meduna
@ 2011-08-09 12:55 ` Wayne Visser
2011-08-11 16:52 ` Stanislav Meduna
0 siblings, 1 reply; 20+ messages in thread
From: Wayne Visser @ 2011-08-09 12:55 UTC (permalink / raw)
To: Stanislav Meduna; +Cc: ecos-discuss
Hi Stano,
I've had a similar problem and found some help in this message:
http://ecos.sourceware.org/ml/ecos-discuss/2003-10/msg00350.html
and the followup:
http://ecos.sourceware.org/ml/ecos-discuss/2003-10/msg00380.html
Basically, I've used the suggested hack to reduce the time required for
TCP connections to time out. Fiddling around with the number of
permissible open connections did not end well...
--
Wayne Visser
LSZ PaperTech Inc.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2011-08-09 12:55 ` Wayne Visser
@ 2011-08-11 16:52 ` Stanislav Meduna
2011-11-22 17:50 ` Stefan Sommerfeld
0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Meduna @ 2011-08-11 16:52 UTC (permalink / raw)
To: ecos-discuss
Hi all,
thank you all for the hints. I think I now know the cause
of the problem - it is most probably the way our
application communicates, which does not go well
with the mbuf allocation.
Our application sends many (about 50/sec in my test
setup, in practice it varies wildly) short (about 60 byte)
TCP packets, and as the latency is more important than
the throughput it does so on a socket with the TCP_NODELAY
option.
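For reference, this is how TCP_NODELAY is usually set through the BSD sockets API. The sketch uses the POSIX host headers so that it compiles stand-alone; on eCos the socket declarations would normally come in through <network.h> instead, which is an assumption about the target build and not something stated in this thread. The helper name is made up.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm on a TCP socket so that small writes are
 * sent immediately instead of being coalesced.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int tcp_set_nodelay(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}
```

The price of the lower latency is that every small write can become its own segment.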
It looks like the stack does not coalesce the data into clusters
in this setup and allocates a separate mbuf for each
such packet. Of course, if the cable is unplugged or something,
this quickly leads to mbuf pressure that is twice what one would
naively expect because of the overhead (a 128-byte mbuf used
for 60 bytes of data). Considering that the amount of data needed
in the mbufs can be (I guess) at least the number of connections times
the TCP send window, this is not easy to manage in a system where
512 kB for network buffers is quite a luxury...
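The overhead can be put into numbers with a small sketch. It assumes, as described above, that every small write ends up in its own 128-byte mbuf; the function names are made up, and how much data the stack really queues depends on its socket buffer limits, which are not modelled here.

```c
#include <stddef.h>

#define MBUF_SIZE 128u  /* mbuf size quoted in the message above */

/* Number of mbufs needed to hold queued_bytes when every write of
 * write_size bytes occupies its own mbuf (no coalescing). */
static size_t mbufs_for_small_writes(size_t queued_bytes, size_t write_size)
{
    if (write_size == 0)
        return 0;  /* nothing sensible to compute */
    return (queued_bytes + write_size - 1) / write_size;  /* round up */
}

/* Pool memory consumed by those mbufs. */
static size_t pool_bytes_for_small_writes(size_t queued_bytes, size_t write_size)
{
    return mbufs_for_small_writes(queued_bytes, write_size) * MBUF_SIZE;
}
```

For example, 50 packets/sec of 60 bytes over 10 seconds of unplugged cable is 30000 bytes of payload, which under this assumption takes 500 mbufs, or 64000 bytes of pool memory.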
I am now testing a patch that allows tuning a few parameters
in the TCP/IP stack, including the ones suggested by some
of you (I know there is sysctl, but not for all of them),
and I hope I will be able to come up with values
that satisfy our needs.
The problem causing 'unrecoverable' mbuf overflows was
a completely different beast - something (a mangled packet?)
sometimes puts the ethernet controller of the LM3S9B90
processor into a state where it always thinks it has a packet,
returns bogus data when trying to read it and can only be
fixed by resetting the controller. Oh well...
Regards
--
Stano
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2011-08-11 16:52 ` Stanislav Meduna
@ 2011-11-22 17:50 ` Stefan Sommerfeld
0 siblings, 0 replies; 20+ messages in thread
From: Stefan Sommerfeld @ 2011-11-22 17:50 UTC (permalink / raw)
To: Stanislav Meduna; +Cc: ecos-discuss
Hi Stanislav,
did you manage to create a patch? I have some MBUF problems too: running out of
MBUFs, or big latencies which can sometimes be reduced by adding debug output
inside the network driver. The whole MBUF stuff seems to be quite critical and I
need a stable network stack.
Bye...
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2010-07-21 1:12 ` John Mills
@ 2010-07-21 1:58 ` cyl cyl
0 siblings, 0 replies; 20+ messages in thread
From: cyl cyl @ 2010-07-21 1:58 UTC (permalink / raw)
To: John Mills; +Cc: eCos Users
Memory designated for network buffers                        294912
Memory designated for network dynamically allocated memory   69632
MBUFs memory size                                            34816 (default was 69632)
Clusters size                                                139264
Max number of open sockets                                   32
Number of supported pending network events                   8
- cyl
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2010-07-20 7:45 ` cyl cyl
@ 2010-07-21 1:12 ` John Mills
2010-07-21 1:58 ` cyl cyl
0 siblings, 1 reply; 20+ messages in thread
From: John Mills @ 2010-07-21 1:12 UTC (permalink / raw)
To: eCos Users; +Cc: cyl cyl
Cyl -
How much memory is your code allocating dynamically? How much statically?
- Mills
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2010-07-17 12:44 ` John Mills
2010-07-18 21:07 ` Ross Younger
@ 2010-07-20 7:45 ` cyl cyl
2010-07-21 1:12 ` John Mills
1 sibling, 1 reply; 20+ messages in thread
From: cyl cyl @ 2010-07-20 7:45 UTC (permalink / raw)
To: John Mills; +Cc: eCos Users
Hello:
A strange thing: if I double the "MBUFs memory size" from
69632 to 139264, it seems worse. There are no "out of MBUFs" warnings,
but a few seconds after I take out the hardware (my computer, not the
development board), I can't ping it (from another computer).
If I set the "MBUFs memory size" to 69632 (the default value), the
same problem occurs, but not as soon as with 139264.
If set to 34816 (half), it works well. I can always ping its
IP, and the "write" to the abnormally closed connection eventually
returns -1.
If set to 17408 (1/4), the "out of MBUFs" warning occurs.
Any suggestions?
Thank you.
- cyl
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2010-07-18 21:07 ` Ross Younger
@ 2010-07-19 1:55 ` John Mills
0 siblings, 0 replies; 20+ messages in thread
From: John Mills @ 2010-07-19 1:55 UTC (permalink / raw)
To: ecos; +Cc: Ross Younger
All -
I was not able to use 'valgrind' successfully on a multithreaded eCos
application, but for another reason: We combine a number of packages from
other sources and parties and these contained so many questionable coding
structures that valgrind gave up the job in a flurry of sarcastic
comments. We were confident that our leaks were in our own code and not
those third-party items because we have a long history of stable
performance with those components. There were also some new pieces we had
not written so we couldn't be 100% sure where our problems were arising.
Since dynamic allocation and freeing are centralized, we used the "smart
macro" approach and hooked a simple memory monitor that way. Indeed the
problems were in code I had written. (Big surprise!)
- John Mills
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2010-07-17 12:44 ` John Mills
@ 2010-07-18 21:07 ` Ross Younger
2010-07-19 1:55 ` John Mills
2010-07-20 7:45 ` cyl cyl
1 sibling, 1 reply; 20+ messages in thread
From: Ross Younger @ 2010-07-18 21:07 UTC (permalink / raw)
To: ecos
John Mills wrote:
> 4. Use a free ('valgrind') or commercial code analyzer to track memory
> allocation (another "learning opportunity"!).
valgrind is a wonderful idea in general, but very Linux-specific at the
moment. I wondered about valgrinding an eCos app running on the synth
target the other month; I got as far as determining that it seems to
Just Work for a simple single-threaded eCos configuration, but as soon
as you bring in the scheduler the threading model - which doesn't use
ordinary Linux threads - causes valgrind to get very very confused.
I found myself staring at the valgrind code, wondering how much effort
it would be to teach valgrind how to notice the synth spawning threads
(at the moment it does so by intercepting the clone syscall) but haven't
had time to actually try as yet...
Ross
--
eCosCentric Ltd, Barnwell House, Barnwell Drive, Cambridge CB5 8UU, UK
Registered in England no. 4422071. www.ecoscentric.com
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2010-07-16 8:26 cyl cyl
2010-07-16 11:52 ` James Hunter
@ 2010-07-17 12:44 ` John Mills
2010-07-18 21:07 ` Ross Younger
2010-07-20 7:45 ` cyl cyl
1 sibling, 2 replies; 20+ messages in thread
From: John Mills @ 2010-07-17 12:44 UTC (permalink / raw)
To: eCos Users; +Cc: cyl cyl
Cyl -
You may be running out of free memory to allocate and the Ethernet
buffer is just the place where it shows up.
Short answer:
You may need to look for a memory leak somewhere in your code. These are
no fun to find.
Long answer: eCos sets up a pool of available memory for allocation when
it starts up. This pool is broken into blocks (the 'mbuf's in question, I
think) and when a service or application allocates memory, eCos first
checks whether any space is available in the one or more mbufs already
assigned to that level of your code. If space is available then storage is
allocated from that source for your data object; if not, the memory
manager attempts to get another mbuf to work with. If none is available,
you are "out of MBUFs".
This block-allocation is a common approach in RTOS as it protects memory
allocation and helps keep overhead code under control.
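The block-allocation idea described above can be illustrated with a generic fixed-block pool. This is only a sketch of the general technique with made-up sizes and names; it is not the eCos mbuf allocator, and it is not thread-safe as written.

```c
#include <stddef.h>

#define BLOCK_SIZE  128   /* bytes per block (illustrative) */
#define BLOCK_COUNT 4     /* blocks in the pool (illustrative) */

/* A free block stores the link to the next free block in its own memory. */
union block {
    union block *next;
    unsigned char data[BLOCK_SIZE];
};

static union block pool[BLOCK_COUNT];
static union block *free_list;

/* Thread all blocks onto the free list; call once at startup. */
static void pool_init(void)
{
    size_t i;
    free_list = NULL;
    for (i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Take one block, or return NULL when the pool is exhausted. */
static void *pool_alloc(void)
{
    union block *b = free_list;
    if (b != NULL)
        free_list = b->next;
    return b;
}

/* Give a block back; it becomes the next one handed out. */
static void pool_free(void *p)
{
    union block *b = p;
    b->next = free_list;
    free_list = b;
}
```

Once all BLOCK_COUNT blocks are handed out and none is returned, pool_alloc() fails every time, which is the same kind of condition the "out of MBUFs" warning reports.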
Diagnosis: If you have generous read-write memory available, increase the
amount which is broken up into mbufs at eCos startup. (Sorry - I can't
point you to the correct code at the moment.) If this solves the problem,
you had some kind of worst-case collision. If this makes the problem less
frequent, that may still be the issue. If this only delays the problem, I
would bet on a memory leak (unbalanced 'malloc' and 'free' usage in some
thread). These are no fun to find.
Suggestions:
1. Inactivate different threads and see if this solves the problem, then
go looking for memory leakage in the thread(s) that were running when the
problem occurred. Recently added code is the prime suspect, of course.
2. Write macros to "wrap" the system's 'malloc' and 'free' code so that
they print from whence they are called and for what objects they are
allocating. Then replace all calls to the system 'malloc' and 'free' with
calls to your macros. (This approach makes it easy to turn the debug code
on and off in one spot by changing the macros' definitions: through a
compilation switch for example.)
3. Use static rather than dynamic ('malloc') storage allocation wherever
this is practical (not a bad idea in embedded real-time applications that
may need to run stably for indefinite periods.)
4. Use a free ('valgrind') or commercial code analyzer to track memory
allocation (another "learning opportunity"!).
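Suggestion 2 could be sketched roughly as follows. The names MALLOC, FREE, MEM_DEBUG, dbg_malloc and dbg_free are made up for this example, and fprintf stands in for whatever diagnostic output the target provides.

```c
#include <stdio.h>
#include <stdlib.h>

static long dbg_live_allocs;   /* allocations not yet freed */

/* malloc wrapper that logs the call site and counts live allocations. */
static void *dbg_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    if (p != NULL)
        dbg_live_allocs++;
    fprintf(stderr, "%s:%d malloc(%lu) -> %p\n",
            file, line, (unsigned long)size, p);
    return p;
}

/* free wrapper that logs the call site before releasing the memory. */
static void dbg_free(void *p, const char *file, int line)
{
    if (p != NULL)
        dbg_live_allocs--;
    fprintf(stderr, "%s:%d free(%p)\n", file, line, p);
    free(p);
}

/* One compilation switch turns the tracing on or off everywhere. */
#ifdef MEM_DEBUG
#define MALLOC(size) dbg_malloc((size), __FILE__, __LINE__)
#define FREE(p)      dbg_free((p), __FILE__, __LINE__)
#else
#define MALLOC(size) malloc(size)
#define FREE(p)      free(p)
#endif
```

Application code then calls MALLOC() and FREE() instead of malloc() and free(); a dbg_live_allocs value that keeps climbing points to an unbalanced pair.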
DISCLAIMERS:
1. I don't have eCos code handy so I may be misunderstanding the problem
completely. Sorry.
2. I'm using eCos-2.0 so the issue may be quite different in 3.x.
3. If I missed the boat here, I hope folks will correct me and we will
both learn from it.
- John Mills
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [ECOS] eth_recv out of MBUFs
2010-07-16 8:26 cyl cyl
@ 2010-07-16 11:52 ` James Hunter
2010-07-17 12:44 ` John Mills
1 sibling, 0 replies; 20+ messages in thread
From: James Hunter @ 2010-07-16 11:52 UTC (permalink / raw)
To: ecos-discuss, cyl cyl
Hi,
It could be a number of things:
1. Your network alarm and network support threads do not have a high enough priority to give the TCP/IP stack time to deal with all your data.
2. You're simply overloading your CPU; it just can't handle that amount of traffic.
3. There's a bug in your network driver that is not releasing MBUFs back to the system.
4. You may need to increase the amount of memory allocated to MBUFs and clusters, but this usually masks the problem.
You could use some of the debug calls to see when the MBUF usage is increasing.
I'm not sure what you mean by "write" returning -1; do you mean SendTo? Or Send?
James
^ permalink raw reply [flat|nested] 20+ messages in thread
* [ECOS] eth_recv out of MBUFs
@ 2010-07-16 8:26 cyl cyl
2010-07-16 11:52 ` James Hunter
2010-07-17 12:44 ` John Mills
0 siblings, 2 replies; 20+ messages in thread
From: cyl cyl @ 2010-07-16 8:26 UTC (permalink / raw)
To: ecos-discuss
Hello!
I got the warning "eth_recv out of MBUFs" when I was writing (frequently)
to a connection which was NOT closed normally (the hardware was
taken out). Usually "write" returns -1, but sometimes it shows that
message. I'm using eCos 3.0, by the way.
Any help is appreciated!
Thank you!
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2001-10-16 17:19 ` Rosimildo da Silva
@ 2001-10-17 1:49 ` Peter Graf
0 siblings, 0 replies; 20+ messages in thread
From: Peter Graf @ 2001-10-17 1:49 UTC (permalink / raw)
To: ecos-discuss
Rosimildo da Silva wrote:
>I have observed the same problem. I have checked on two platforms,
>MIPS and x86 PC. In both cases, under heavy load, the system always
>runs out of MBUFs. A second problem I see is that there is no
>recovery from that: once you reach that state, even if the other end
>stops sending anything, eCos never recovers.
>
>I did some tests, and I can only recreate this with TCP sockets.
>With UDP sockets, I've had the same targets running for over 24 hours.
Same for me, CS8900 Ethernet, Hitachi SH3. Recovers, but needs minutes
after the other end stops sending.
Peter
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2001-10-16 14:55 ` Jonathan Larmour
2001-10-16 17:19 ` Rosimildo da Silva
@ 2001-10-17 1:03 ` Christoph Csebits
1 sibling, 0 replies; 20+ messages in thread
From: Christoph Csebits @ 2001-10-17 1:03 UTC (permalink / raw)
To: Jonathan Larmour; +Cc: eCos mailing list
On Tue, Oct 16, 2001 at 10:55:18PM +0100, Jonathan Larmour wrote:
> Christoph Csebits wrote:
> >
> > hi,
> >
> > I am working on an FCC driver for the MPC8260ADS.
> > (The FCC driver is based on the Viper FEC driver.)
> > cvs-version 02-Oct-2001
> >
> > I am getting "warning: eth_recv out of MBUFs" messages
> > after running the tcp_echo "test suite" for a while.
> > Sending only 100 packets instead of 1024 works fine.
> >
> > What does this message mean, and where should I
> > start in my driver code?
>
> It probably means the mbufs your driver uses for received data aren't
> getting freed after use. Perhaps it isn't getting read fast enough or you
> need to increase CYGPKG_NET_MEM_USAGE if you are hammering it really hard.
> It may not indicate a problem.
>
> Alternatively if the buffers stay allocated even after you've stopped
> hammering it, then that _would_ be a problem :-).
hi, I think I solved my mbuf problem.
My FCC driver is based on the QUICC/FEC drivers, and
according to Jon Hartley ->
http://sources.redhat.com/ml/ecos-discuss/2001-07/msg00089.html
there are still various bugs in them.
So I worked out the patch for my FCC driver, and now it seems
to work,
at least better than before :-) (waiting for new problems to arise)
thanks to Jon,
best regards, christoph
--
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2001-10-16 14:55 ` Jonathan Larmour
@ 2001-10-16 17:19 ` Rosimildo da Silva
2001-10-17 1:49 ` Peter Graf
2001-10-17 1:03 ` Christoph Csebits
1 sibling, 1 reply; 20+ messages in thread
From: Rosimildo da Silva @ 2001-10-16 17:19 UTC (permalink / raw)
To: eCos mailing list
From: "Jonathan Larmour" <jlarmour@redhat.com>
To: "Christoph Csebits" <christoph.csebits@frequentis.com>
Cc: "eCos mailing list" <ecos-discuss@sources.redhat.com>
Sent: Tuesday, October 16, 2001 4:55 PM
Subject: Re: [ECOS] eth_recv out of MBUFs
> Christoph Csebits wrote:
> >
> > hi,
> >
> > I am working on an FCC driver for the MPC8260ADS.
> > (The FCC driver is based on the Viper FEC driver.)
> > cvs-version 02-Oct-2001
> >
> > I am getting "warning: eth_recv out of MBUFs" messages
> > after running the tcp_echo "test suite" for a while.
> > Sending only 100 packets instead of 1024 works fine.
> >
> > What does this message mean, and where should I
> > start in my driver code?
>
> It probably means the mbufs your driver uses for received data aren't
> getting freed after use. Perhaps it isn't getting read fast enough or you
> need to increase CYGPKG_NET_MEM_USAGE if you are hammering it really hard.
> It may not indicate a problem.
>
> Alternatively if the buffers stay allocated even after you've stopped
> hammering it, then that _would_ be a problem :-).
I have observed the same problem. I have checked on two platforms,
MIPS and x86 PC. In both cases, under heavy load, the system always
runs out of MBUFs. A second problem I see is that there is no
recovery from that: once you reach that state, even if the other end
stops sending anything, eCos never recovers.
I did some tests, and I can only recreate this with TCP sockets.
With UDP sockets, I've had the same targets running for over 24 hours.
Rosimildo.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [ECOS] eth_recv out of MBUFs
2001-10-16 9:21 Christoph Csebits
@ 2001-10-16 14:55 ` Jonathan Larmour
2001-10-16 17:19 ` Rosimildo da Silva
2001-10-17 1:03 ` Christoph Csebits
0 siblings, 2 replies; 20+ messages in thread
From: Jonathan Larmour @ 2001-10-16 14:55 UTC (permalink / raw)
To: Christoph Csebits; +Cc: eCos mailing list
Christoph Csebits wrote:
>
> hi,
>
> I am working on an FCC driver for the MPC8260ADS.
> (The FCC driver is based on the Viper FEC driver.)
> cvs-version 02-Oct-2001
>
> I am getting "warning: eth_recv out of MBUFs" messages
> after running the tcp_echo "test suite" for a while.
> Sending only 100 packets instead of 1024 works fine.
>
> What does this message mean, and where should I
> start in my driver code?
It probably means the mbufs your driver uses for received data aren't
getting freed after use. Perhaps it isn't getting read fast enough or you
need to increase CYGPKG_NET_MEM_USAGE if you are hammering it really hard.
It may not indicate a problem.
Alternatively if the buffers stay allocated even after you've stopped
hammering it, then that _would_ be a problem :-).
Jifl
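The distinction drawn above, between buffers that are merely busy under load and buffers that stay allocated once the load stops, can be turned into a simple check. This is a minimal sketch in plain C under stated assumptions: the names mbuf_idle_verdict, baseline and slack are made up for illustration, and the in-use counts are assumed to have been sampled by some other means after traffic was stopped.

```c
#include <assert.h>
#include <stddef.h>

enum mbuf_verdict { MBUF_RECOVERED, MBUF_STUCK };

/* Hypothetical check.
 *   baseline: mbufs in use before the load test started
 *   idle:     in-use counts sampled after the sender stopped
 *   n:        number of samples in idle
 *   slack:    residual usage still accepted as "back to normal"
 * Reports MBUF_RECOVERED as soon as any idle sample is back near baseline. */
enum mbuf_verdict mbuf_idle_verdict(unsigned baseline, const unsigned *idle,
                                    size_t n, unsigned slack)
{
    assert(idle != NULL && n > 0);

    for (size_t i = 0; i < n; i++) {
        if (idle[i] <= baseline + slack) {
            return MBUF_RECOVERED;
        }
    }
    return MBUF_STUCK;
}
```

The sampling window needs to be generous. Elsewhere in this thread a target is reported to recover only minutes after the other end stops sending, so a few seconds of idle samples could wrongly report a stuck pool.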
--
Red Hat, Rustat House, Clifton Road, Cambridge, UK. Tel: +44 (1223) 271062
Maybe this world is another planet's Hell -Aldous Huxley || Opinions==mine
^ permalink raw reply [flat|nested] 20+ messages in thread
* [ECOS] eth_recv out of MBUFs
@ 2001-10-16 9:21 Christoph Csebits
2001-10-16 14:55 ` Jonathan Larmour
0 siblings, 1 reply; 20+ messages in thread
From: Christoph Csebits @ 2001-10-16 9:21 UTC (permalink / raw)
To: eCos mailing list
hi,
I am working on an FCC driver for the MPC8260ADS.
(The FCC driver is based on the Viper FEC driver.)
cvs-version 02-Oct-2001
I am getting "warning: eth_recv out of MBUFs" messages
after running the tcp_echo "test suite" for a while.
Sending only 100 packets instead of 1024 works fine.
What does this message mean, and where should I
start in my driver code?
Any suggestions are appreciated
regards, christoph
target log:
SINK connection from 172.17.2.1:3993
SOURCE connection from 172.17.2.1:3996
Using 100 buffers of 8192 bytes each, 0% background load
Set no background load
Set no background load
439 ticks elapsed, 3273 kloops predicted for an idle system
actual kloops 3084, CPU was 94% idle during transfer
Set no background load
Set no background load
Final load[1458] = 742341 => 1%
Net malloc:
count: 80, min: 0.18, max: 1.17, total: 38.65, ave: 0.47
Net free:
count: 32, min: 0.35, max: 4.50, total: 23.12, ave: 0.71
Mbuf alloc:
count: 4109, min: 0.24, max: 6.23, total: 1603.69, ave: 0.38
Mbuf free:
count: 3747, min: 0.32, max: 1.08, total: 1771.50, ave: 0.47
Cluster alloc:
count: 200, min: 0.24, max: 0.53, total: 70.28, ave: 0.35
Checksum:
count: 3689, min: 0.18, max: 7.91, total: 3214.81, ave: 0.86
Net memcpy:
count: 7119, min: 0.02, max: 21.57, total: 4152.86, ave: 0.58
Net memset:
count: 1128, min: 0.05, max: 1.77, total: 245.93, ave: 0.21
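The alloc and free counts in the log above can be compared directly. A small sketch in plain C, with a hypothetical helper name, assuming the two counters cover every allocation and release path (which the log itself does not confirm):

```c
#include <assert.h>

/* Hypothetical helper: allocations not yet matched by a free.
 * A consistent pair of counters never shows more frees than allocations. */
unsigned long outstanding(unsigned long allocs, unsigned long frees)
{
    assert(allocs >= frees);
    return allocs - frees;
}
```

Applied to the figures above, outstanding(4109, 3747) gives 362 mbufs and outstanding(80, 32) gives 48 net malloc blocks still held when the statistics were printed. That alone does not prove a leak, since open sockets legitimately hold buffers, but it is the number to watch once the test has finished and the connections are closed.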
--
^ permalink raw reply [flat|nested] 20+ messages in thread
Thread overview: 20+ messages
2011-07-20 19:16 [ECOS] eth_recv out of MBUFs Stanislav Meduna
2011-07-20 21:27 ` Stanislav Meduna
2011-08-05 7:52 ` Lambrecht Jürgen
2011-08-09 6:37 ` Stanislav Meduna
2011-08-09 12:55 ` Wayne Visser
2011-08-11 16:52 ` Stanislav Meduna
2011-11-22 17:50 ` Stefan Sommerfeld
-- strict thread matches above, loose matches on Subject: below --
2010-07-16 8:26 cyl cyl
2010-07-16 11:52 ` James Hunter
2010-07-17 12:44 ` John Mills
2010-07-18 21:07 ` Ross Younger
2010-07-19 1:55 ` John Mills
2010-07-20 7:45 ` cyl cyl
2010-07-21 1:12 ` John Mills
2010-07-21 1:58 ` cyl cyl
2001-10-16 9:21 Christoph Csebits
2001-10-16 14:55 ` Jonathan Larmour
2001-10-16 17:19 ` Rosimildo da Silva
2001-10-17 1:49 ` Peter Graf
2001-10-17 1:03 ` Christoph Csebits