public inbox for ecos-discuss@sourceware.org
 help / color / mirror / Atom feed
From: "Lambrecht Jürgen" <J.Lambrecht@TELEVIC.com>
To: <ecos-discuss@sources.redhat.com>
Subject: Re: [ECOS] bugs in AT91 Ethernet driver
Date: Wed, 04 Jun 2008 06:40:00 -0000	[thread overview]
Message-ID: <369C2E4EDB94C34881A8178BEA192A124D9F@nt-email.TELEVIC.COM> (raw)

Major bug found in TX part - see below.

Andrew Lunn wrote:
>> During debugging, I see that 'priv->tx_busy' stays 'true', because the  
>> TX-Complete bit (COMP of TSR reg) is never set - I also see that the  
>> TX-Used Bit Read (UBR of TSR reg) is set, which is normal I think  
>> because the TX buffers have not yet been released (because COMP is not  
>> set..). But the TX-GO bit is 0, meaning that transmit is not busy.
>>
>> I added code to release the TX buffers and to clear 'priv->tx_busy' when  
>> TX-GO is read 0. Then at each ARP request, there is an ARP reply, but it  
>> does not get out.. (monitoring with Wireshark)
>> What's happening - it is like the EMAC TX is blocked, but why?
>
> It might be worth printing out the value of isr in at91_eth_isr. As
> the comment says, error conditions are not handled. Maybe you are
> getting a buffer under run?
Indeed Andrew, looking at the value of the ISR lead me to the cause of error,
and hence the solution. 
  The TX status of the ISR can be read in AT91_EMAC_TSR. The problem is bit 0
of this register: UBR = "Used Bit Read". 
  The datasheet (DS) is very unclear on this point (both the ATSAM9260 DS and
ATSAM7X DS have completely the same section about their EMAC). But after a full
day debugging, I'm quite sure I'm right.
  So, that Used Bit indicates if the concerning packet buffer (with sg_list[i]
data) has already been used by the EMAC. This Used Bit must be set to 0 by the
user SW, and is set to 1 by the EMAC. In case of the TX EMAC, the Used Bit is
set to 1 by the EMAC after transmit of the concerning buffer.
  The TX EMAC holds a private counter/pointer (read-only in TBQP) to the
transmit buffer descriptor list - the start location is the write-only
TBQP. (Only the RX part is explained in the DS). Normally this counter/pointer
cycles trough the circular transmit buffer descriptor list.
  Now, after transmit of the last buffer (that last buffer has to be marked by
the SW with a EOF flag) the TX EMAC increases its TX counter/pointer to already
point the next TX buffer. But that next TX buffer - that is free of course - is
marked by the SW as "Used" because the Used Bit is set. And therefore the UBR
error is flagged. And therefore the TX EMAC *resets its counter/pointer* !!
Because the SW does not reset its tbd_idx pointer, both pointers are out of
sync so all further transmits are blocked. 
  The major bug is that the if_at91.c SW rigorously sets the Used Bit (in the
buffer descriptor) of all free buffers to 1, meaning "Used". A the end of the
EMAC DS (36.4.1.3 for SAM9260 vG and 37.4.1.3 for SAM7 vG point 2): "Mark all
entries in this list as owned by EMAC, i.e. bit 31 of word 1 set to 0."
Therefore I changed at91_tb_init() and at91_reset_tbd() accordingly.
  Solving the "Used Bit" issue is not enough. All the TX errors UBR, RLE, BEX
and UND cause the counter/pointer to be reset. Therefore I changed the
deliver() function and added an extra argument to at91_reset_tbd() to reset the
SW tbd_idx pointer. I also added those TX errors to the IRQ IER and ISR.

In attachment a diff - it also contains my previous changes. I also added some
comments, and added '#ifdef CYGINT_IO_ETH_INT_SUPPORT_REQUIRED' to the start
and stop functions. And I added more debug printing - even in the ISR..
I tested this fix during 5 minutes - long enough to see that a ping finally
works... 
  There are of course still some problems - the ping reply is not synchronized
with the ping request: it runs 1 ping iteration behind - more of it in my next
mail. 

Kind regards,
Juergen

P.S.: The current code can only work if the driver is initialized after each
TX, or because each tx uses all buffers so that the pointer is wrapped.
I guess redboot must work this way??
Or am I still missing something?

--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss

             reply	other threads:[~2008-06-04  6:40 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-04  6:40 Lambrecht Jürgen [this message]
2008-06-04 15:05 ` [ECOS] bugs in AT91 Ethernet driver (EMAC) Jürgen Lambrecht
  -- strict thread matches above, loose matches on Subject: below --
2008-06-01 21:36 [ECOS] bugs in AT91 Ethernet driver Lambrecht Jürgen
2008-06-02 16:03 ` Jürgen Lambrecht
2008-06-02 18:04   ` Andrew Lunn
2008-05-31 23:23 Lambrecht Jürgen
2008-05-31 23:13 Lambrecht Jürgen
2008-05-30 22:24 Lambrecht Jürgen
2008-05-31  8:12 ` I-Yanaslov
2008-05-31 11:18   ` Andrew Lunn
2008-05-31 11:51     ` I-Yanaslov
2008-05-31 12:04       ` Andrew Lunn
2008-05-31 10:59 ` Andrew Lunn
2008-05-31 11:47 ` Andrew Lunn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=369C2E4EDB94C34881A8178BEA192A124D9F@nt-email.TELEVIC.COM \
    --to=j.lambrecht@televic.com \
    --cc=ecos-discuss@sources.redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).