public inbox for ecos-discuss@sourceware.org
* [ECOS] MPC860 Ethernet driver problem.
From: Geoff Patch @ 2001-05-01  0:20 UTC (permalink / raw)
  To: 'ecos-discuss@sources.redhat.com'

Hi All,

We have ported eCos to our custom MPC860 based board, and we've run across
a problem with the Ethernet driver. The problem is that when the Ethernet 
cable is disconnected from the board for more than a couple of seconds we 
lose LAN connectivity permanently. In other words, when we reconnect the 
LAN cable, we don't recover the connection.

I've had a look through the Ethernet driver in if_quicc.c and observed that 
when the LAN is disconnected the function quicc_eth_can_send() starts 
returning zero after a few transmission attempts, indicating that there are 
no free transmit buffers available. Once we get into this state we never 
recover from it.
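
A minimal sketch of the symptom described above, assuming a simple ring of transmit buffer descriptors in which a "ready" flag marks a descriptor as owned by the hardware. Every name here (the `sketch_` functions, `SKETCH_NUM_TXBD`, `SKETCH_BD_READY`) is hypothetical and is not taken from if_quicc.c. It only models how the ring fills once the hardware stops consuming frames; it does not model why the real chip fails to resume.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_TXBD 4u        /* assumed ring size */
#define SKETCH_BD_READY 0x8000u   /* assumed "owned by hardware" flag */

struct sketch_ring {
    unsigned short ctrl[SKETCH_NUM_TXBD]; /* one control word per descriptor */
    size_t next_free;                     /* next slot the driver fills */
    size_t next_hw;                       /* next slot the hardware sends */
};

void sketch_ring_init(struct sketch_ring *r)
{
    assert(r != NULL);
    for (size_t i = 0; i < SKETCH_NUM_TXBD; i++)
        r->ctrl[i] = 0;
    r->next_free = 0;
    r->next_hw = 0;
}

/* Nonzero if the driver may queue another frame. */
int sketch_can_send(const struct sketch_ring *r)
{
    return (r->ctrl[r->next_free] & SKETCH_BD_READY) == 0;
}

/* Queue one frame; returns 0 on success, -1 if no descriptor is free. */
int sketch_send(struct sketch_ring *r)
{
    if (!sketch_can_send(r))
        return -1;
    r->ctrl[r->next_free] |= SKETCH_BD_READY;
    r->next_free = (r->next_free + 1) % SKETCH_NUM_TXBD;
    return 0;
}

/* One hardware step: with carrier, send and release one descriptor;
 * without carrier, nothing is consumed. */
void sketch_engine_step(struct sketch_ring *r, int carrier)
{
    if (!carrier)
        return;
    if (r->ctrl[r->next_hw] & SKETCH_BD_READY) {
        r->ctrl[r->next_hw] &= (unsigned short)~SKETCH_BD_READY;
        r->next_hw = (r->next_hw + 1) % SKETCH_NUM_TXBD;
    }
}
```

With the carrier flag held at zero, `sketch_can_send()` starts returning zero after `SKETCH_NUM_TXBD` successful sends, which matches the "few transmission attempts" observed above.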

I've implemented a quick brute force fix, which is to reinitialise the SCC 
when this condition is detected.  This seems to work but I'm not real 
pleased with it.

I'm unsure at the moment whether this is a generic problem, or something 
that is characteristic of our particular board. Has anybody else observed 
this problem, and if so come up with a more elegant solution?

Thanks in Advance

Geoff


------------------------------
Geoff Patch
Senior Software Engineer
CEA Technologies
Canberra Australia
02-6213 0141


* Re: [ECOS] MPC860 Ethernet driver problem.
From: Jonathan Larmour @ 2001-05-01  5:48 UTC (permalink / raw)
  To: Geoff Patch; +Cc: 'ecos-discuss@sources.redhat.com', Gary Thomas

Geoff Patch wrote:
> 
> Hi All,
> 
> We have ported eCos to our custom MPC860 based board, and  we've run across
> a problem with the Ethernet driver. The problem is that when the Ethernet
> cable is disconnected from the board for more than a couple of seconds we
> lose LAN connectivity permanently. In other words, when we reconnect the
> LAN cable, we don't recover the connection.
> 
> I've had a look through the Ethernet driver in if_quicc.c and observed that
> when the LAN is disconnected the function quicc_eth_can_send() starts
> returning zero after a few transmission attempts, indicating that there are
> no free transmit buffers available. Once we get into this state we never
> recover from it.
> 
> I've implemented a quick brute force fix, which is to reinitialise the SCC
> when this condition is detected.  This seems to work but I'm not real
> pleased with it.
> 
> I'm unsure at the moment whether this is a generic problem, or something
> that is characteristic of our particular board. Has anybody else observed
> this problem, and if so come up with a more elegant solution?

I'm pretty sure we've tested this type of thing before on other platforms
so I suspect quicc specific behaviour when it comes to processing the
packet.

See if you can debug the interrupt handler after you've disconnected the
cable to see if you ever get an interrupt, and if you do, what type it is.

I've seen another problem I believe: if the ring buffer fills up it starts
overwriting old txbd's. But if it doesn't call the higher layer's txDone,
the mbuf won't ever be freed as far as I can tell. Perhaps it's even the
same problem and you're out of mbufs? Try adding a call to
quicc_eth_TxEvent after the "No free xmit buffers" printf in
quicc_eth_send(). I think :-).
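
A rough sketch of that suggestion, assuming a send routine that runs a TxEvent-style reclaim pass when it finds no free descriptor and then retries once. All identifiers are hypothetical stand-ins (`sketch_tx_event` for quicc_eth_TxEvent, `sketch_tx_done` for the higher layer's txDone) and the descriptor layout is assumed, not copied from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_TXBD 4u        /* assumed ring size */
#define SKETCH_BD_READY 0x8000u   /* assumed "owned by hardware" flag */

struct sketch_txbd {
    unsigned short ctrl;          /* status/control word */
    unsigned long  key;           /* upper-layer handle, 0 = slot unused */
};

struct sketch_dev {
    struct sketch_txbd bd[SKETCH_NUM_TXBD];
    size_t   next;                /* next slot the driver will fill */
    unsigned done_calls;          /* completions reported upward */
};

void sketch_dev_init(struct sketch_dev *d)
{
    for (size_t i = 0; i < SKETCH_NUM_TXBD; i++) {
        d->bd[i].ctrl = 0;
        d->bd[i].key = 0;
    }
    d->next = 0;
    d->done_calls = 0;
}

/* Stand-in for the upper layer's completion callback. */
void sketch_tx_done(struct sketch_dev *d, unsigned long key)
{
    (void)key;
    d->done_calls++;
}

/* Stand-in for TxEvent: report every slot the hardware has released. */
void sketch_tx_event(struct sketch_dev *d)
{
    for (size_t i = 0; i < SKETCH_NUM_TXBD; i++) {
        struct sketch_txbd *bd = &d->bd[i];
        if (bd->key != 0 && (bd->ctrl & SKETCH_BD_READY) == 0) {
            sketch_tx_done(d, bd->key);
            bd->key = 0;
        }
    }
}

/* Queue a frame; returns 0 on success, -1 if the slot is still busy. */
int sketch_send(struct sketch_dev *d, unsigned long key)
{
    struct sketch_txbd *bd = &d->bd[d->next];
    assert(key != 0);
    if ((bd->ctrl & SKETCH_BD_READY) || bd->key != 0) {
        /* the "No free xmit buffers" path: reclaim, then retry once */
        sketch_tx_event(d);
        if ((bd->ctrl & SKETCH_BD_READY) || bd->key != 0)
            return -1;
    }
    bd->key = key;
    bd->ctrl |= SKETCH_BD_READY;
    d->next = (d->next + 1) % SKETCH_NUM_TXBD;
    return 0;
}
```

In this model the extra call only helps when the hardware has released descriptors without the completion having been reported. If every descriptor is still marked ready, the reclaim pass finds nothing to report and the send still fails.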

Gary, care to comment?

Jifl
-- 
Red Hat, Rustat House, Clifton Road, Cambridge, UK. Tel: +44 (1223) 271062
Maybe this world is another planet's Hell -Aldous Huxley || Opinions==mine


* Re: [ECOS] MPC860 Ethernet driver problem.
From: Gary Thomas @ 2001-05-01  8:03 UTC (permalink / raw)
  To: Jonathan Larmour; +Cc: ecos-discuss, Geoff Patch

 
On 01-May-2001 Jonathan Larmour wrote:
> Geoff Patch wrote:
>> 
>> Hi All,
>> 
>> We have ported eCos to our custom MPC860 based board, and  we've run across
>> a problem with the Ethernet driver. The problem is that when the Ethernet
>> cable is disconnected from the board for more than a couple of seconds we
>> lose LAN connectivity permanently. In other words, when we reconnect the
>> LAN cable, we don't recover the connection.
>> 
>> I've had a look through the Ethernet driver in if_quicc.c and observed that
>> when the LAN is disconnected the function quicc_eth_can_send() starts
>> returning zero after a few transmission attempts, indicating that there are
>> no free transmit buffers available. Once we get into this state we never
>> recover from it.
>> 
>> I've implemented a quick brute force fix, which is to reinitialise the SCC
>> when this condition is detected.  This seems to work but I'm not real
>> pleased with it.
>> 
>> I'm unsure at the moment whether this is a generic problem, or something
>> that is characteristic of our particular board. Has anybody else observed
>> this problem, and if so come up with a more elegant solution?
> 
> I'm pretty sure we've tested this type of thing before on other platforms
> so I suspect quicc specific behaviour when it comes to processing the
> packet.
> 
> See if you can debug the interrupt handler after you've disconnected the
> cable to see if you ever get an interrupt, and if you do, what type it is.
> 
> I've seen another problem I believe: if the ring buffer fills up it starts
> overwriting old txbd's. But if it doesn't call the higher layer's txDone,
> the mbuf won't ever be freed as far as I can tell. Perhaps it's even the
> same problem and you're out of mbufs? Try adding a call to
> quicc_eth_TxEvent after the "No free xmit buffers" printf in
> quicc_eth_send(). I think :-).
> 
> Gary, care to comment?

More likely is that with no carrier, the "engine" won't send packets and
needs to be kicked when the carrier is restored.  If the packet is not
consumed by the ethernet chip (engine), the driver will stop trying to
send anything.

I'm sure that there is room for improvement in this driver.  In particular,
adding some timeout mechanisms to deal with the case when packets are not
going out - currently it will just get stuck.
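
One possible shape for such a timeout, sketched under the assumption that the driver can read a monotonically increasing tick counter and can hook the points where a frame is queued and where a completion is reported. The structure, the function names and the threshold `SKETCH_TX_TIMEOUT_TICKS` are all hypothetical; nothing like this exists in the driver as discussed.

```c
#include <assert.h>

#define SKETCH_TX_TIMEOUT_TICKS 200u   /* assumed stall threshold */

struct sketch_tx_watchdog {
    unsigned      pending;        /* frames queued but not yet completed */
    unsigned long last_progress;  /* tick of the last sign of progress */
};

void sketch_wd_init(struct sketch_tx_watchdog *wd, unsigned long now)
{
    wd->pending = 0;
    wd->last_progress = now;
}

/* Call when a frame is handed to the hardware. */
void sketch_wd_on_queue(struct sketch_tx_watchdog *wd, unsigned long now)
{
    if (wd->pending == 0)
        wd->last_progress = now;  /* start timing from the first frame */
    wd->pending++;
}

/* Call when the hardware reports a frame as sent. */
void sketch_wd_on_done(struct sketch_tx_watchdog *wd, unsigned long now)
{
    assert(wd->pending > 0);
    wd->pending--;
    wd->last_progress = now;
}

/* Nonzero if frames are queued and nothing has completed for too long.
 * Unsigned subtraction keeps the comparison valid across tick wraparound. */
int sketch_wd_stalled(const struct sketch_tx_watchdog *wd, unsigned long now)
{
    return wd->pending > 0 &&
           (now - wd->last_progress) >= SKETCH_TX_TIMEOUT_TICKS;
}
```

A periodic check of `sketch_wd_stalled()` would give the driver a place to trigger whatever recovery is chosen, for instance the SCC reinitialisation Geoff already uses.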


* Re: [ECOS] MPC860 Ethernet driver problem.
From: Jonathan Larmour @ 2001-05-01  8:15 UTC (permalink / raw)
  To: Gary Thomas; +Cc: ecos-discuss, Geoff Patch

Gary Thomas wrote:
> > I've seen another problem I believe: if the ring buffer fills up it starts
> > overwriting old txbd's. But if it doesn't call the higher layer's txDone,
> > the mbuf won't ever be freed as far as I can tell. Perhaps it's even the
> > same problem and you're out of mbufs? Try adding a call to
> > quicc_eth_TxEvent after the "No free xmit buffers" printf in
> > quicc_eth_send(). I think :-).
> >
> > Gary, care to comment?
> 
> More likely is that with no carrier, the "engine" won't send packets and
> needs to be kicked when the carrier is restored.  If the packet is not
> consumed by the ethernet chip (engine), the driver will stop trying to
> send anything.

Still I think this memory/mbuf leak is an issue. Should I do what I
proposed?

Jifl
-- 
Red Hat, Rustat House, Clifton Road, Cambridge, UK. Tel: +44 (1223) 271062
Maybe this world is another planet's Hell -Aldous Huxley || Opinions==mine


* Re: [ECOS] MPC860 Ethernet driver problem.
From: Gary Thomas @ 2001-05-01  8:23 UTC (permalink / raw)
  To: Jonathan Larmour; +Cc: Geoff Patch, ecos-discuss

On 01-May-2001 Jonathan Larmour wrote:
> Gary Thomas wrote:
>> > I've seen another problem I believe: if the ring buffer fills up it starts
>> > overwriting old txbd's. But if it doesn't call the higher layer's txDone,
>> > the mbuf won't ever be freed as far as I can tell. Perhaps it's even the
>> > same problem and you're out of mbufs? Try adding a call to
>> > quicc_eth_TxEvent after the "No free xmit buffers" printf in
>> > quicc_eth_send(). I think :-).
>> >
>> > Gary, care to comment?
>> 
>> More likely is that with no carrier, the "engine" won't send packets and
>> needs to be kicked when the carrier is restored.  If the packet is not
>> consumed by the ethernet chip (engine), the driver will stop trying to
>> send anything.
> 
> Still I think this memory/mbuf leak is an issue. Should I do what I
> proposed?

I don't think it will have any effect.  The message comes because there
are no free transmit descriptors, i.e. all possible "pending" messages
have data in them and the engine is not moving them out.  The TxEvent()
routine will skip over all of these because none of them has gone empty.

The way to handle this would be to detect the condition, then scan the
buffers and call the 'tx_done' callback for each (indicating an error, but
that is unused at the moment) and then invalidate the buffer.  Convincing
the chip to actually start transmitting again when the carrier returns
is probably another matter.
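
The flush described above might look roughly like this. It is a sketch under assumptions: the descriptor layout, the flag values and the `sketch_tx_done` signature with a status argument are all hypothetical, and it deliberately does nothing about restarting the hardware or resynchronising the hardware's own descriptor pointer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_TXBD 4u        /* assumed ring size */
#define SKETCH_BD_READY 0x8000u   /* assumed "owned by hardware" flag */
#define SKETCH_BD_WRAP  0x2000u   /* assumed "last descriptor in ring" flag */

struct sketch_txbd {
    unsigned short ctrl;          /* status/control word */
    unsigned long  key;           /* upper-layer handle, 0 = slot unused */
};

struct sketch_dev {
    struct sketch_txbd bd[SKETCH_NUM_TXBD];
    unsigned done_ok;             /* completions reported as success */
    unsigned done_err;            /* completions reported as failure */
};

void sketch_dev_init(struct sketch_dev *d)
{
    for (size_t i = 0; i < SKETCH_NUM_TXBD; i++) {
        d->bd[i].ctrl = 0;
        d->bd[i].key = 0;
    }
    d->done_ok = 0;
    d->done_err = 0;
}

/* Stand-in for the 'tx_done' callback; status 0 means sent. */
void sketch_tx_done(struct sketch_dev *d, unsigned long key, int status)
{
    (void)key;
    if (status == 0)
        d->done_ok++;
    else
        d->done_err++;
}

/* Give every outstanding frame back to the upper layer as failed and
 * invalidate its descriptor. Returns the number of frames flushed. */
unsigned sketch_tx_flush(struct sketch_dev *d)
{
    unsigned flushed = 0;
    for (size_t i = 0; i < SKETCH_NUM_TXBD; i++) {
        struct sketch_txbd *bd = &d->bd[i];
        if (bd->key == 0)
            continue;
        /* clear only the ready flag so layout bits such as wrap survive */
        bd->ctrl &= (unsigned short)~SKETCH_BD_READY;
        sketch_tx_done(d, bd->key, -1);
        bd->key = 0;
        flushed++;
    }
    return flushed;
}
```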


* Re: [ECOS] MPC860 Ethernet driver problem.
From: Jonathan Larmour @ 2001-05-01  9:09 UTC (permalink / raw)
  To: Gary Thomas; +Cc: Geoff Patch, ecos-discuss

Gary Thomas wrote:
> 
> On 01-May-2001 Jonathan Larmour wrote:
> > Gary Thomas wrote:
> >> > I've seen another problem I believe: if the ring buffer fills up it starts
> >> > overwriting old txbd's. But if it doesn't call the higher layer's txDone,
> >> > the mbuf won't ever be freed as far as I can tell. Perhaps it's even the
> >> > same problem and you're out of mbufs? Try adding a call to
> >> > quicc_eth_TxEvent after the "No free xmit buffers" printf in
> >> > quicc_eth_send(). I think :-).
> >>
> >> More likely is that with no carrier, the "engine" won't send packets and
> >> needs to be kicked when the carrier is restored.  If the packet is not
> >> consumed by the ethernet chip (engine), the driver will stop trying to
> >> send anything.
> >
> > Still I think this memory/mbuf leak is an issue. Should I do what I
> > proposed?
> 
> I don't think it will have any effect.  The message comes because there
> are no free transmit descriptors, i.e. all possible "pending" messages
> have data in them and the engine is not moving them out.  The TxEvent()
> routine will skip over all of these because none of them has gone empty.

If it is stalled, it won't even get a TxEvent I'd have thought.

> The way to handle this would be to detect the condition, then scan the
> buffers and call the 'tx_done' callback for each (indicating an error, but
> that is unused at the moment) and then invalidate the buffer.

You only need worry about the mbuf you're about to overwrite surely.
Scanning for those every time you send a packet and are stalled seems
wasteful.

But yes, the txbd->ctrl should probably be invalidated if overwriting an
existing one, just in case the whole thing does start up again (once that's
working).
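
As a sketch of the cheaper alternative argued for here, the send path could deal only with the one descriptor it is about to reuse. The names and layout below are hypothetical, and the code assumes the driver really does overwrite a full ring rather than refusing to send.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_TXBD 4u        /* assumed ring size */
#define SKETCH_BD_READY 0x8000u   /* assumed "owned by hardware" flag */

struct sketch_txbd {
    unsigned short ctrl;          /* status/control word */
    unsigned long  key;           /* upper-layer handle, 0 = slot unused */
};

struct sketch_dev {
    struct sketch_txbd bd[SKETCH_NUM_TXBD];
    size_t   next;                /* next slot the driver will fill */
    unsigned completed;           /* frames reported as sent */
    unsigned dropped;             /* frames reported as failed */
};

void sketch_dev_init(struct sketch_dev *d)
{
    for (size_t i = 0; i < SKETCH_NUM_TXBD; i++) {
        d->bd[i].ctrl = 0;
        d->bd[i].key = 0;
    }
    d->next = 0;
    d->completed = 0;
    d->dropped = 0;
}

/* Queue a frame, first releasing whatever still occupies the slot. */
void sketch_send_overwrite(struct sketch_dev *d, unsigned long key)
{
    struct sketch_txbd *bd = &d->bd[d->next];
    assert(key != 0);
    if (bd->key != 0) {
        if (bd->ctrl & SKETCH_BD_READY) {
            /* never sent: invalidate before reuse, report as failed */
            bd->ctrl &= (unsigned short)~SKETCH_BD_READY;
            d->dropped++;
        } else {
            /* sent, but the completion was never reported */
            d->completed++;
        }
    }
    bd->key = key;
    bd->ctrl |= SKETCH_BD_READY;
    d->next = (d->next + 1) % SKETCH_NUM_TXBD;
}
```

The cost per send is constant, at the price of releasing stalled frames one at a time instead of all at once.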

>  Convincing
> the chip to actually start transmitting again when the carrier returns
> is probably another matter.

But is obviously a prerequisite to sensible behaviour anyway.

One would like to believe that the Quicc can indicate when the link
returns. I see in quicc_eth.h that the event register can have a
QUICC_SCCE_TXE event indicating a tx error. There's also a transmit buffer
status which includes QUICC_BD_TX_CSL to indicate carrier lost.

It would probably help if Geoff could find out what the state of the queued
up buffers actually is, whether we get a TXE interrupt, and what happens as
a result. Geoff, can you do this?
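
A small diagnostic along these lines could summarise the state in one pass. This is only a sketch: `SKETCH_SCCE_TXE`, `SKETCH_BD_TX_CSL` and `SKETCH_BD_READY` are placeholders for QUICC_SCCE_TXE, QUICC_BD_TX_CSL and the descriptor ready flag, their numeric values here are assumptions that should be replaced by the definitions in quicc_eth.h, and the function takes plain copies of the event register and control words rather than reading hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values; take the real ones from quicc_eth.h. */
#define SKETCH_SCCE_TXE  0x0010u  /* event register: transmit error */
#define SKETCH_BD_TX_CSL 0x0001u  /* descriptor status: carrier lost */
#define SKETCH_BD_READY  0x8000u  /* descriptor still owned by hardware */

struct sketch_tx_report {
    int      tx_error_event;  /* nonzero if the TXE event bit is set */
    unsigned ready;           /* descriptors still waiting to be sent */
    unsigned carrier_lost;    /* descriptors flagged with carrier lost */
};

/* Summarise a snapshot of the event register and the descriptor
 * control words. */
struct sketch_tx_report sketch_tx_inspect(unsigned short scce,
                                          const unsigned short *ctrl,
                                          size_t n)
{
    struct sketch_tx_report rep = { 0, 0, 0 };
    assert(ctrl != NULL || n == 0);
    rep.tx_error_event = (scce & SKETCH_SCCE_TXE) != 0;
    for (size_t i = 0; i < n; i++) {
        if (ctrl[i] & SKETCH_BD_READY)
            rep.ready++;
        if (ctrl[i] & SKETCH_BD_TX_CSL)
            rep.carrier_lost++;
    }
    return rep;
}
```

Printing such a report from the interrupt handler and from the send path after the cable is pulled would show whether a TXE interrupt arrives and whether the queued descriptors stay ready or come back marked carrier lost.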

NB I don't really know anything about the Quicc so I'm reluctant to try
(and I'm pretty busy on other things anyway)

Jifl
-- 
Red Hat, Rustat House, Clifton Road, Cambridge, UK. Tel: +44 (1223) 271062
Maybe this world is another planet's Hell -Aldous Huxley || Opinions==mine
