public inbox for ecos-patches@sourceware.org
* [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements
@ 2013-08-29  9:37 bugzilla-daemon
  2013-08-30 12:37 ` [Bug 1001897] " bugzilla-daemon
                   ` (29 more replies)
  0 siblings, 30 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-08-29  9:37 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

            Bug ID: 1001897
           Summary: lpc2xxx CAN driver improvements / enhancements
           Product: eCos
           Version: CVS
            Target: Other (please specify)
  Architecture/Host HostOS: Win XP/7
                OS:
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: low
         Component: Patches and contributions
          Assignee: unassigned@bugs.ecos.sourceware.org
          Reporter: uwe_kindler@web.de
                CC: ecos-patches@ecos.sourceware.org

Created attachment 2349
  --> http://bugs.ecos.sourceware.org/attachment.cgi?id=2349&action=edit
lpc2xxx CAN driver fix

The following patch fixes some issues discovered while porting the CANopen Node
project (http://sourceforge.net/projects/canopennode/) to the eCos CAN
framework.

The following problems occurred when the node was physically disconnected from
and reconnected to the CAN bus while the CANopen stack was running:

1. Physically disconnecting the node from the CAN bus causes a bus-off condition
in the CAN controller. Because the bus-off error condition persisted, the error
ISR was invoked endlessly, and the rapid ISR invocation blocked the application
from running.

2. When the node was physically reconnected to the CAN bus, it could not
always recover properly from the bus-off condition. This also happened after a
reset of the CANopen application.

The attached patch fixes both issues for lpc2xxx CAN driver.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
@ 2013-08-30 12:37 ` bugzilla-daemon
  2013-08-30 12:38 ` bugzilla-daemon
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-08-30 12:37 UTC (permalink / raw)
  To: ecos-patches

Uwe Kindler <uwe_kindler@web.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #2349|0                           |1
        is obsolete|                            |


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
  2013-08-30 12:37 ` [Bug 1001897] " bugzilla-daemon
@ 2013-08-30 12:38 ` bugzilla-daemon
  2013-08-30 14:27 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-08-30 12:38 UTC (permalink / raw)
  To: ecos-patches

--- Comment #1 from Uwe Kindler <uwe_kindler@web.de> ---
Created attachment 2351
  --> http://bugs.ecos.sourceware.org/attachment.cgi?id=2351&action=edit
lpc2xxx CAN driver patch

Updated patch that fixes a compile error if only one CAN channel is used.
Compilation tested with 1 and 4 enabled hardware CAN channels.


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
  2013-08-30 12:37 ` [Bug 1001897] " bugzilla-daemon
  2013-08-30 12:38 ` bugzilla-daemon
@ 2013-08-30 14:27 ` bugzilla-daemon
  2013-09-02  7:22 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-08-30 14:27 UTC (permalink / raw)
  To: ecos-patches

Bernard Fouché <bernard.fouche@kuantic.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bernard.fouche@kuantic.com

--- Comment #2 from Bernard Fouché <bernard.fouche@kuantic.com> ---
I have no experience with the LPC2xxx, but I do with the LPC1765, which has the
same CAN cell as the LPC2xxx.

AFAIK, ICR_BUS_ERR indicates an error on the bus; it is not necessarily a BUS
OFF condition.

To know about BUS OFF, you must check bit 7 of the GSR.

If you clear the counters immediately, the DSR cannot read their values and
has no way to help diagnose the problem occurring on the bus.

            // This ensures, that this ISR does not fire again and again and
            // blocks the application while the bus off condition is active.

I ran many tests, for instance with a single node on a Hi-Z bus, and I don't
remember the bus-off condition ever causing the interrupt code to be called as
if in a spin loop.

The LPC1765 IRQ system is different, but I don't see why an MCU would do that.
The problem must be elsewhere, because the CAN controller is expected to exit
the bus-off condition by itself, at least if there is activity on the bus.

            // Setting the TX error counter to 127 ensures that the controller
            // is in TX error passive mode and that it does not flood the CAN
            // bus with error messages

If your controller is flooding the bus with error frames, it is probably
because another node, or the bus itself, has problems, since error frames are
sent by the receiving nodes.

What your patch does is stop the CAN controller from sending error frames as
soon as a single error, of any kind, is detected, which probably breaks the CAN
spec (this may be normal with CANopen, I don't know, but in that case the driver
becomes specific to CANopen).

Why not simply report BUS_ERR and a correct BUS_OFF to the DSR and let it
decide whether it is time to reset the bus? The DSR could, for instance, save
the RX/TX error counters before performing a reset. IIUC, because of the reset,
your patch clears the TX buffer that made the controller go into BUS OFF mode,
and the DSR/application is not made aware of this.
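
A minimal sketch of that idea, decoding a snapshot of ICR and GSR into separate
"bus error" and "bus off" flags while preserving the error counters for the
DSR. The bit 7 positions follow this discussion; the RXERR/TXERR field
positions, the EV_* flag values, and all function and type names are
assumptions made for illustration and are not taken from the eCos driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register layout; verify against the user manual for your part. */
#define CAN_ICR_BEI        (1u << 7)              /* bus error interrupt      */
#define CAN_GSR_BS         (1u << 7)              /* bus status: 1 = bus off  */
#define CAN_GSR_RXERR(gsr) (((gsr) >> 16) & 0xFFu) /* assumed field position   */
#define CAN_GSR_TXERR(gsr) (((gsr) >> 24) & 0xFFu) /* assumed field position   */

/* Hypothetical event flags, NOT the eCos CYGNUM_CAN_EVENT_* values. */
#define EV_BUS_ERR 0x01u
#define EV_BUS_OFF 0x02u

typedef struct {
    unsigned flags;   /* combination of EV_* flags                */
    uint8_t  tx_err;  /* transmit error counter at time of event  */
    uint8_t  rx_err;  /* receive error counter at time of event   */
} can_err_report_t;

/* Decode register snapshots without modifying any controller state. */
void can_decode_error(uint32_t icr, uint32_t gsr, can_err_report_t *out)
{
    assert(out != NULL);
    out->flags  = 0;
    out->tx_err = (uint8_t)CAN_GSR_TXERR(gsr);
    out->rx_err = (uint8_t)CAN_GSR_RXERR(gsr);
    if (icr & CAN_ICR_BEI) {
        out->flags |= EV_BUS_ERR;  /* some error was seen on the bus */
    }
    if (gsr & CAN_GSR_BS) {
        out->flags |= EV_BUS_OFF;  /* controller really is bus off   */
    }
}
```

With the counters captured in the report, whoever handles the event can still
inspect them even if the controller is reset afterwards.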


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (2 preceding siblings ...)
  2013-08-30 14:27 ` bugzilla-daemon
@ 2013-09-02  7:22 ` bugzilla-daemon
  2013-09-02 16:22 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-02  7:22 UTC (permalink / raw)
  To: ecos-patches

--- Comment #3 from Uwe Kindler <uwe_kindler@web.de> ---
(In reply to comment #2)
> I've no experience with LPC2xxx but with LPC1765 that has the same CAN cell
> from the LPC2xxx.
> 
> AFAIK, ICR_BUS_ERR shows an error on the bus, it may not be always a BUS OFF
> condition.

The CAN ISR is triggered by various warning or error conditions. To determine
which error occurred, lpc2xxx_can_getevent tests the ICR register for various
flags. The flag ICR_BUS_ERR is bit 7 of ICR and tests for the bus error
interrupt.

This is what the LPC2xxx manual says about error handling:

manual snippet ------------------>
The CAN Controllers count and handle transmit and receive errors as specified
in CAN Spec 2.0B. The Transmit and Receive Error Counters are incremented for
each detected error and are decremented when operation is error-free. If the
Transmit Error counter contains 255 and another error occurs, the CAN
Controller is forced into a state called Bus-Off. In this state, the following
register bits are set: BS in CANSR, BEI and EI in CANIR if these are enabled,
and RM in CANMOD. RM resets and disables much of the CAN Controller. Also at
this time the Transmit Error Counter is set to 127 and the Receive
Error Counter is cleared. Software must next clear the RM bit. Thereafter the
Transmit Error Counter will count down 128 occurrences of the Bus Free
condition (11 consecutive recessive bits).
<--------------------

So the ICR_BUS_ERR flag is set if BEI (bus error interrupt) is enabled.
According to the manual, if this interrupt occurs, the BS bit in CANSR (status
register) is set. The manual says:

These bits are identical to the BS bit in the GSR (Global Status Register)

Bit 7 in GSR (Global Status Register) is the Bus Status flag. This is what the
manual says about this flag:

Bus Status: the CAN controller is currently prohibited from
bus activity because the Transmit Error Counter reached
its limiting value of 255.

That means the bus error interrupt occurs if there is a Bus-off condition.
For all other error or warning interrupts there are other interrupt flags. So
the bus error is a Bus-off condition, and the flag ICR_BUS_ERR tests for this
Bus-off condition.


> 
> To know about BUS OFF, you must check bit 7 of the GSR.

Yes, but in case of an interrupt we need to check bit 7 of ICR (Interrupt and
Capture Register): bit 7 (BEI, bus error interrupt) is set if bit 7 in GSR
(Bus-off condition) is set.

> 
> If you immediately clear the counters, the DSR can't know about the
> counter values and has no way to help diagnose the problem occurring on the
> bus.

The counter values are cleared in the lpc2xxx_can_getevent function, so the
DSR has already been executed. In lpc2xxx_can_getevent, the event flag is set in
case of a Bus-off condition (pevent->flags |= CYGNUM_CAN_EVENT_BUS_OFF) and
propagated to the application code, which will receive this event. In case of a
Bus-off condition the application doesn't need to read the error counters,
because the error counter registers are 8 bits wide and a Bus-off condition
occurs when the error counter contains 255 and another error occurs (that is,
the counter overflows). So in case of a Bus-off error the error counters no
longer contain "valid" counter values. Normally the application does not need
to care about the error counters, because the LPC2xxx CAN controller has status
flags and interrupts for all important conditions:

1. Warning interrupt if a counter rises above the warning limit (>96)
2. Error passive interrupt if a counter rises above the error passive level
   (>127)
3. Bus-off interrupt if the TX counter overflows (>255)

The error counters are normally used by the internal CAN controller logic to
track the controller's warning and error states.
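
The three thresholds above can be sketched as a small classification function.
This is only an illustration of the state logic, not driver code: the type and
function names are made up, the counters are passed as plain integers so that
the "above 255" case can be expressed, and whether the warning limit comparison
is inclusive on real hardware is an assumption to check against the manual.

```c
#include <assert.h>

/* Error confinement states, named here for illustration only. */
typedef enum {
    CAN_STATE_ERROR_ACTIVE,   /* normal operation                           */
    CAN_STATE_ERROR_WARNING,  /* a counter has passed the warning limit     */
    CAN_STATE_ERROR_PASSIVE,  /* a counter has passed 127                   */
    CAN_STATE_BUS_OFF         /* transmit error counter has passed 255      */
} can_state_t;

/*
 * tec/rec are conceptual counter values. The hardware registers are 8 bits
 * wide, so a value above 255 models "255 and another error occurred".
 */
can_state_t can_classify_state(unsigned tec, unsigned rec, unsigned warn_limit)
{
    assert(warn_limit <= 127u);  /* warning must come before error passive */
    if (tec > 255u) {
        return CAN_STATE_BUS_OFF;          /* only the TX counter causes bus off */
    }
    if (tec > 127u || rec > 127u) {
        return CAN_STATE_ERROR_PASSIVE;
    }
    if (tec > warn_limit || rec > warn_limit) {
        return CAN_STATE_ERROR_WARNING;
    }
    return CAN_STATE_ERROR_ACTIVE;
}
```

Called as can_classify_state(tec, rec, 96), it mirrors the list above with the
warning limit of 96.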


> The LPC1765 irq system is different but I don't see why a MCU would do that,
> the problem must be elsewhere because the CAN controller is expected to exit
> the bus off condition by itself, at least if there is activity on the bus.

Here is a short snippet from the LPC2xxx manual on what the controller does in
case of a Bus-off condition:


manual snippet ------------------->
If the Transmit Error counter contains 255 and another error occurs, the CAN
Controller is forced into a state called Bus-Off. In this state, the following
register bits are set: BS in CANSR, BEI and EI in CANIR if these are enabled,
and RM in CANMOD. RM resets and disables much of the CAN Controller. Also at
this time the Transmit Error Counter is set to 127 and the Receive Error
Counter is cleared.
<----------------------

That means: 

1. Bus error interrupt
2. RM in CANMOD set (RM - Reset Mode resets and disables much of the CAN
Controller)
3. TX counter is set to 127 and RX counter is cleared.

That means that, according to the manual, the hardware should do exactly what
my patch does in case of a Bus-off condition. The problem is that, although it
is written in the manual, it does not happen on my LPC2xxx. Via debug output I
can see the following:

1. Bus error interrupt occurs (BS in CANSR and GSR is set)
2. RM in CANMOD is NOT set (controller remains active)
3. TX counter is NOT set to 127 and RX counter is NOT cleared.

So the hardware acts differently than the manual states. I could not find
anything in the errata sheets, and I don't know if this also happens on newer
(e.g. LPC3xxx) devices, but for the LPC2294 controller on the Olimex board
this is reality. Because the controller does not enter RM (Reset Mode) and
because the counters are not cleared by hardware, the bus error interrupt
fires again immediately, as soon as the ISR/DSR processing has finished.
This blocks the application from running, because the ISR/DSR code fires
again and again. So my patch simply does what is written in the manual:

1. Set the controller into reset mode (RM bit)
2. Set the TX counter to 127 and clear the RX counter.

The only additional step my code performs is clearing the RM bit, so the
controller leaves reset mode again. Because the TX counter value is 127, it
takes a while until the TX counter overflows again and the next bus error
interrupt occurs. During this time the application code can run and will
receive the CYGNUM_CAN_EVENT_BUS_OFF event. As long as the bus-off condition
exists, the application will receive the CYGNUM_CAN_EVENT_BUS_OFF event again
and again, each time the TX error counter overflows. But as soon as the
bus-off condition goes away (e.g. when the node is physically reconnected to
the bus), the controller automatically recovers from it.
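
The sequence described above can be sketched roughly as follows. This is not
the code from the attached patch: it operates on a plain struct standing in for
the controller registers so that it can be compiled and tried on a host, and
the struct, the macro names, and the RXERR/TXERR field positions in GSR are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the memory-mapped registers of one CAN channel (hypothetical). */
typedef struct {
    uint32_t mod;  /* CANMOD: bit 0 is RM (reset mode)                     */
    uint32_t gsr;  /* CANGSR: error counters assumed in the upper 16 bits  */
} can_regs_t;

#define CANMOD_RM          (1u << 0)
#define CANGSR_TXERR_SHIFT 24          /* assumed field position */
#define CANGSR_ERR_MASK    0xFFFF0000u /* assumed: TXERR and RXERR together */

/* Do in software what the manual says the hardware does on bus off. */
void can_bus_off_workaround(volatile can_regs_t *regs)
{
    assert(regs != NULL);

    /* 1. Enter reset mode; the counters are only meant to be written here. */
    regs->mod |= CANMOD_RM;

    /* 2. TX error counter = 127, RX error counter = 0, other bits untouched. */
    regs->gsr = (regs->gsr & ~CANGSR_ERR_MASK) | (127u << CANGSR_TXERR_SHIFT);

    /* 3. Leave reset mode so the controller can start bus-off recovery. */
    regs->mod &= ~CANMOD_RM;
}
```

On real hardware the BS bit would be cleared by the controller itself once
recovery completes; the stand-in struct does not model that.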

> What your patch does is stop the CAN controller from sending error frames as
> soon as a single error, of any kind, is detected, which probably breaks the
> CAN spec (which may be normal with CANopen, I don't know, but in that case
> the driver becomes specific to CANopen).
> 

No, my patch does exactly what is written in the LPC2xxx manual. The Bus-off
condition automatically stops the controller from sending error frames. This is
what the bus-off condition is made for: ensuring that a broken node does not
destroy the whole CAN communication. My patch sets the TX counter back to 127.
This ensures that the controller stays in error passive mode, which means it
sends only Passive Error Flags on the bus. A Passive Error Flag comprises 6
recessive bits and will not destroy other bus traffic. Here is a good article
about CAN bus error handling:

http://www.kvaser.com/zh/about-can/the-can-protocol/23.html

So my patch does the following:

1. Ensures that the application is not blocked by the bus error ISR/DSR
2. Informs the application about the bus-off condition via
CYGNUM_CAN_EVENT_BUS_OFF
3. Keeps the controller in error passive mode, where it sends only 6 recessive
error bits and will not destroy other bus traffic
4. Lets the controller recover properly from the bus-off condition
5. Makes the controller behave as described in the user manual


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (3 preceding siblings ...)
  2013-09-02  7:22 ` bugzilla-daemon
@ 2013-09-02 16:22 ` bugzilla-daemon
  2013-09-03  6:18 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-02 16:22 UTC (permalink / raw)
  To: ecos-patches

--- Comment #4 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
(In reply to comment #3)
> So the hardware acts differently than the manual states. I could not
> find anything in the errata sheets and I don't know if this also
> happens for newer (i.e. LPC3xxx) devices - but for the LPC2294
> controller on the olimex board, this is reality. Because the
> controller does not enter RM (Reset Mode) ...

SYNOPSIS

  http://www.nxp.com/documents/application_note/AN10674.pdf | Fig. 14

May we presume that the description in the AN does not hold for some LPC2294
revisions? Perhaps, http://www.nxp.com/documents/errata_sheet/2294.pdf
See CAN.5 (pp. 11,12) "Normal operation cannot be resumed after reset".
May we presume that they did not raise "Reset Mode" on a "Bus Off" error
as an early "workaround"? Though I found no confirmation of this on the Net.

On the other hand, the driver uses lpc2xxx_enter_reset_mode() in a few
places without (?) issue. Uwe, did you notice any strange behavior
after this call? If you did not, then my guess is wrong. And if the CAN.5
issue can be the reason, then we can enable your workaround conditionally,
controlled by some CDL option. Sorry, I cannot help with testing any
more, for lack of a CAN adapter.


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (4 preceding siblings ...)
  2013-09-02 16:22 ` bugzilla-daemon
@ 2013-09-03  6:18 ` bugzilla-daemon
  2013-09-03  6:38 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03  6:18 UTC (permalink / raw)
  To: ecos-patches

--- Comment #5 from Uwe Kindler <uwe_kindler@web.de> ---

> SYNOPSIS
> 
>   http://www.nxp.com/documents/application_note/AN10674.pdf | Fig. 14
> 
> May we presume that description from AN is not true for some LPC2294
> revisions? 

It seems so. I only have a single LPC2294 here, and it definitely doesn't work
as described on page 16 of the application note. The RM bit is not set, so the
controller doesn't go into reset mode, and the TX/RX counters are not
changed. I would not have put so much time into this patch if the CAN
controller had worked the way it should.

> Perhaps, http://www.nxp.com/documents/errata_sheet/2294.pdf
> See CAN.5 (pp. 11,12) "Normal operation cannot be resumed after reset".
> May we presume that they did not raise "Reset Mode" for "Bus Off" error
> as an early "workaround"? Though, I found no such reassurances on Net.

I think CAN.5 has nothing to do with the problem I see - this is something
different.

> 
> On the other hand the driver uses lpc2xxx_enter_reset_mode() in a few
> places without (?) issue. 

Manually entering reset mode via lpc2xxx_enter_reset_mode() is not a problem.
The problem is that the controller does not automatically enter reset mode in
case of a bus-off condition (bus error). lpc2xxx_enter_reset_mode() works fine,
and the controller properly enters reset mode.

> Uwe, did you notice any strange behaviors
> after this call? If you did not, then my guess is wrong. And if CAN.5
> issue can be the reason then we can use your workaround by a condition
> set according to some CDL option. Sorry, I cannot help in testing any
> more due to lack of a CAN adapter.

OK. Shall I change the patch and make the code conditional via a CDL option?
That would be fine with me as long as no one has confirmed that this problem
exists on other LPCxxxx devices.


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (5 preceding siblings ...)
  2013-09-03  6:18 ` bugzilla-daemon
@ 2013-09-03  6:38 ` bugzilla-daemon
  2013-09-03  8:20 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03  6:38 UTC (permalink / raw)
  To: ecos-patches

--- Comment #6 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
(In reply to comment #5)
> Ok. Shall I change the patch and make the code conditional via a CDL
> option?  That would be o.k. for me as long as no one confirmed that
> this problem exists for other LPCxxxx devices?

IMO, it is not a bad compromise. Could you please update the patch then?


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (6 preceding siblings ...)
  2013-09-03  6:38 ` bugzilla-daemon
@ 2013-09-03  8:20 ` bugzilla-daemon
  2013-09-03  8:22 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03  8:20 UTC (permalink / raw)
  To: ecos-patches

Uwe Kindler <uwe_kindler@web.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #2351|0                           |1
        is obsolete|                            |


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (7 preceding siblings ...)
  2013-09-03  8:20 ` bugzilla-daemon
@ 2013-09-03  8:22 ` bugzilla-daemon
  2013-09-03  9:06 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03  8:22 UTC (permalink / raw)
  To: ecos-patches

--- Comment #7 from Uwe Kindler <uwe_kindler@web.de> ---
Created attachment 2352
  --> http://bugs.ecos.sourceware.org/attachment.cgi?id=2352&action=edit
lpc2xxx CAN driver patch

Updated patch to make the workaround code conditional. The workaround is
enabled by default, as I believe it won't affect parts that don't have the
problem, because it simply implements the behaviour documented in the manual
and the application note.


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (8 preceding siblings ...)
  2013-09-03  8:22 ` bugzilla-daemon
@ 2013-09-03  9:06 ` bugzilla-daemon
  2013-09-03 10:48 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03  9:06 UTC (permalink / raw)
  To: ecos-patches

--- Comment #8 from Bernard Fouché <bernard.fouche@kuantic.com> ---
(In reply to comment #3)
> manual snippet ------------------->
> If the Transmit Error counter contains 255 and another error occurs, the CAN
> Controller is forced into a state called Bus-Off. In this state, the
> following register bits are set: BS in CANSR, BEI and EI in CANIR if these
> are enabled, and RM in CANMOD. RM resets and disables much of the CAN
> Controller. Also at this time the Transmit Error Counter is set to 127 and
> the Receive Error Counter is cleared.
> <----------------------

The manual is unclear: in some places it states that bus off is reached when
the TX error counter is 255, and in what you quote (which is also in the LPC17xx
manual, section 16.8.1), that bus off is reached only when the counter reaches
255+1.

For instance, the definition of BS is:

"7 BS[6] Bus Status. 0 0
0 (Bus-On) The CAN Controller is involved in bus activities
1 (Bus-Off) The CAN controller is currently not involved/prohibited from bus
activity
because the Transmit Error Counter reached its limiting value of 255."

Reading this, one could understand that BS is set when TEC reaches 255, but
that the complete bus-off state is reached only when the counter is incremented
one more time.

> 
> That means: 
> 
> 1. Bus error interrupt
> 2. RM in CANMOD set (RM - Reset Mode resets and disables much of the CAN
> Controller)
> 3. TX counter is set to 127 and RX counter is cleared.
> 
> That means according to the manual the hardware should do exactly what my
> patch does in case of a Bus-off condition. The problem is, although it is
> written in the manual, it does not happen for my LPC2xxx. Via debug output I
> can see the following:
> 
> 1. Bus error interrupt occurs (BS in CANSR and GSR is set)
> 2. RM in CANMOD is NOT set (controller remains active)
> 3. TX counter is NOT set to 127 and RX counter is NOT cleared.
> 
> So the hardware acts differently than the manual states. I could not find
> anything in the errata sheets and I don't know if this also happens for
> newer (i.e. LPC3xxx) devices - but for the LPC2294 controller on the olimex
> board, this is reality. Because the controller does not enter RM (Reset
> Mode) and because the counters are not cleared by hardware, the Bus error
> interrupt will happen immediately again as soon as the ISR / DSR processing
> has finished. This will block application from running because the ISR /DSR
> code will fire again and again. So my patch simply does, what it is written
> in the manual:

Still about BS, there is a note [6] that states:

"[6] Mode bit '1' (present) and an Error Warning Interrupt is generated, if
enabled. Afterwards the Transmit Error Counter is set to '127', and
the Receive Error Counter is cleared. It will stay in this mode until the CPU
clears the Reset Mode bit. Once this is completed the CAN
Controller will wait the minimum protocol-defined time (128 occurrences of the
Bus-Free signal) counting down the Transmit Error
Counter. After that, the Bus Status bit is cleared (Bus-On), the Error Status
bit is set '0' (ok), the Error Counters are reset, and an Error
Warning Interrupt is generated, if enabled. Reading the TX Error Counter during
this time gives information about the status of the
Bus-Off recovery."

Why did they write "afterwards"? Does it mean that parts of the processing
(TEC set to 127 and REC to zero) only occur *after* the interrupt?

If so, one possible explanation of what you observe is that TEC reaches 255, BS
is set to one, and the ISR is triggered, but TEC/REC aren't modified yet. RM is
still zero since TEC has just reached 255 and hasn't yet crossed 255+1, so full
bus off isn't reached.

In the meantime the bus is idle (AFAIK the CAN cell is still running while you
debug), the controller sees 128 'bus free' bits, and TEC goes down until the
controller starts trying again to send a frame that nobody acknowledges. Or the
bus can't handle recessive bits, etc. From that moment TEC goes up until the ISR
is fired again.

How frequently is the ISR triggered? Did you test how this frequency relates
to the bus bit rate? Just to know whether, when the ISR fires "again and
again", it fires immediately one ISR after the other, or whether there is
time for 128 bus-free bits on the bus...
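
For a rough sense of scale, the minimum recovery window can be estimated from
the bit rate. The sketch below assumes, as in the manual text quoted earlier in
this thread, that recovery takes 128 occurrences of the bus-free condition of
11 consecutive recessive bits each; the function name is made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CAN_BUS_FREE_BITS        11u   /* recessive bits per bus-free condition */
#define CAN_RECOVERY_OCCURRENCES 128u  /* bus-free conditions needed to recover */

/* Minimum bus-off recovery time in microseconds for a given bit rate. */
uint32_t can_min_bus_off_recovery_us(uint32_t bitrate_bps)
{
    assert(bitrate_bps > 0u);
    /* 128 * 11 = 1408 bit times in total. */
    uint64_t bit_times = (uint64_t)CAN_BUS_FREE_BITS * CAN_RECOVERY_OCCURRENCES;
    return (uint32_t)((bit_times * 1000000u) / bitrate_bps);
}
```

At 125 kbit/s this gives about 11.3 ms, so an ISR that re-fires within
microseconds of being unmasked is clearly not waiting for a recovery sequence.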


* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (9 preceding siblings ...)
  2013-09-03  9:06 ` bugzilla-daemon
@ 2013-09-03 10:48 ` bugzilla-daemon
  2013-09-03 12:20 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03 10:48 UTC (permalink / raw)
  To: ecos-patches

--- Comment #9 from Uwe Kindler <uwe_kindler@web.de> ---
> The manual is unclear: in some places it states that bus off is reached when
> the tx counter error is 255, and in what you show (in what is also in
> LPC17xx manual section 16.8.1), that bus off is reached only when the
> counter reaches 255+1.
> 
> For instance the definition of BS is:
> 
> "7 BS[6] Bus Status. 0 0
> 0 (Bus-On) The CAN Controller is involved in bus activities
> 1 (Bus-Off) The CAN controller is currently not involved/prohibited from bus
> activity
> because the Transmit Error Counter reached its limiting value of 255."
> 
> By reading this one could understand that BS is set when TEC reaches 255,
> but the complete bus off state is reached only when the counter is
> incremented another time.

The CAN standard from Bosch and the LPC2xxx manual are both clear on this
point:

Manual:
If the Transmit Error counter contains 255 and another error occurs, the CAN
Controller is forced into a state called Bus-Off.

The same is written here
http://www.kvaser.com/zh/about-can/the-can-protocol/23.html:
When any one of the two Error Counters raises above 127, the node will enter a
state known as Error Passive and when the Transmit Error Counter raises above
255, the node will enter the Bus Off state.

And that is how it is implemented in the LPC2xxx CAN controller.



> "[6] Mode bit '1' (present) and an Error Warning Interrupt is generated, if
> enabled. Afterwards the Transmit Error Counter is set to '127', and
> the Receive Error Counter is cleared. It will stay in this mode until the
> CPU clears the Reset Mode bit. Once this is completed the CAN
> Controller will wait the minimum protocol-defined time (128 occurrences of
> the Bus-Free signal) counting down the Transmit Error
> Counter. After that, the Bus Status bit is cleared (Bus-On), the Error
> Status bit is set '0' (ok), the Error Counters are reset, and an Error
> Warning Interrupt is generated, if enabled. Reading the TX Error Counter
> during this time gives information about the status of the
> Bus-Off recovery."
> 
> Why did they write "afterwards" ? Does it mean that parts of the processing
> (TEC set to 127 and REC to zero) only occur *after* the interrupt?

The LPC2xxx manual is more precise here:


--------------> manual snippet
If the Transmit Error counter contains 255 and another error occurs, the CAN
Controller is forced into a state called Bus-Off. In this state, the following
register bits are set: BS in CANSR, BEI and EI in CANIR if these are enabled,
and RM in CANMOD. RM resets and disables much of the CAN Controller. Also at
this time the Transmit Error Counter is set to 127 and the Receive
Error Counter is cleared.
<--------------

So they write "Also at this time the..." and not "afterwards".

> 
> If so a possibility of what you observe is that TEC reaches 255, BS is set
> to one, ISR is triggered, but TEC/REC aren't modified yet. RM is still zero
> since TEC just got 255, and didn't cross 255+1 yet, so full bus off isn't
> reached.

No. BS is set only when the TX error counter rises above 255.

> 
> What is the frequency of the ISR being triggered? Did you test how this
> frequency is related to the bus bit rate ? Just to know if when the ISR
> fires "again and again" it fires immediately one ISR after the other, or if
> there is time for 128 bus free bits on the bus...

As soon as DSR processing has finished and the interrupt is unmasked at the end
of the DSR via cyg_drv_interrupt_unmask, the ISR fires again. That is expected,
because the Bus-Off condition still exists: RM (Reset Mode) is not activated and
BS in CANSR is still set.
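
Why this starves the application can be shown with a tiny host-side model. It is a sketch under stated assumptions, not driver code: time is divided into slices, a pending interrupt condition always wins a slice, and the function name and its parameters are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical model: returns how many of total_slices the application
 * gets to run after a bus-off condition has just been raised.
 */
unsigned sim_app_slices(bool dsr_clears_condition, unsigned total_slices)
{
    bool condition_pending = true;  /* bus-off has just occurred */
    unsigned app_slices = 0;
    unsigned i;

    for (i = 0; i < total_slices; i++) {
        if (condition_pending) {
            /* ISR runs, masks the interrupt and schedules the DSR. */
            if (dsr_clears_condition)
                condition_pending = false;
            /* DSR unmasks the interrupt; if the condition is still
               pending, the ISR fires again in the next slice. */
            continue;
        }
        app_slices++;  /* no interrupt pending: the application runs */
    }
    return app_slices;
}
```

In this model the application gets no slices at all while the handler leaves the condition asserted, which matches the observation that the debug output from the application thread stops.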

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (10 preceding siblings ...)
  2013-09-03 10:48 ` bugzilla-daemon
@ 2013-09-03 12:20 ` bugzilla-daemon
  2013-09-03 12:22 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03 12:20 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

Sergei Gavrikov <sergei.gavrikov@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #2352|0                           |1
        is obsolete|                            |

--- Comment #10 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
Created attachment 2353
  --> http://bugs.ecos.sourceware.org/attachment.cgi?id=2353&action=edit
Force handling "Bus-off" logic for some LPC22xx parts

This is a cleaned-up and slightly fixed version of attachment 2352.

NOTE: By default CDL option CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND is set to
1 for "LPC22xx", and "LPC22xx/00" parts only.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (11 preceding siblings ...)
  2013-09-03 12:20 ` bugzilla-daemon
@ 2013-09-03 12:22 ` bugzilla-daemon
  2013-09-03 12:31 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03 12:22 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #11 from Uwe Kindler <uwe_kindler@web.de> ---
(In reply to comment #10)
> Created attachment 2353 [details]
> Force handling "Bus-off" logic for some LPC22xx parts
> 
> This is a cleaned-up and slightly fixed version of attachment 2352 [details].
> 
> NOTE: By default CDL option CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND is set
> to 1 for "LPC22xx", and "LPC22xx/00" parts only.

Ok, I think that is fine. Thank you Sergei.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (12 preceding siblings ...)
  2013-09-03 12:22 ` bugzilla-daemon
@ 2013-09-03 12:31 ` bugzilla-daemon
  2013-09-03 12:37 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03 12:31 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #12 from Bernard Fouché <bernard.fouche@kuantic.com> ---
FYI I obtained documents from NXP stating that the LPC1765 CAN cell is exactly
the same as the one on the LPC2xxx, that conformance tests have been done by NXP
on the LPC2xxx, and that these tests therefore apply to the LPC1765. NXP wrote
that I'm not allowed to give away these documents, but I got them simply by
asking NXP on their website, so I suppose anyone can get them too.

AFAIK the LPC1765 never had any CAN related errata reported. However I reported
an error in the documentation to NXP in January 2012 (CAN controller numbering
is wrong), they acknowledged the error but they never released an updated
version of UM10360.pdf. So maybe there are other reported problems that don't
make it to an errata document.

The NXP test document I have is dated September 2006 and includes bus off
testing.

I can't test on the LPC1765 before next week at the earliest and the driver I
use is different because of the LPC1765 "pending interrupt problem", but what
you describe should appear here also if we have the same CAN cell.

Can you describe what you do on the CAN bus to reach bus off? I could try to
reproduce here and see if I also see RM staying at zero.

Did you check the value of EWL? If it is 0 or 255 I wonder what happens...

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (13 preceding siblings ...)
  2013-09-03 12:31 ` bugzilla-daemon
@ 2013-09-03 12:37 ` bugzilla-daemon
  2013-09-03 12:53 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03 12:37 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #13 from Uwe Kindler <uwe_kindler@web.de> ---

> I can't test on the LPC1765 before next week at the earliest and the driver
> I use is different because of the LPC1765 "pending interrupt problem", but
> what you describe should appear here also if we have the same CAN cell.
> 
> Can you describe what you do on the CAN bus to reach bus off? I could try to
> reproduce here and see if I also see RM staying at zero.
> 
> Did you check the value of EWL? If it is 0 or 255 I wonder what happens...

Create a thread that prints a debug message in a loop via diag_printf (e.g. one
message per second). Create a second thread that simply sends CAN messages
(e.g. one message per second). When this application is running, simply unplug
the CAN bus cable from the CAN connector.

If your debug message from the thread stops, then you know you are spinning in
the CAN ISR. You can activate the debug output in the driver to see register
contents and ISR reason (ICR_BUS_OFF) - it will be printed again and again ...

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (14 preceding siblings ...)
  2013-09-03 12:37 ` bugzilla-daemon
@ 2013-09-03 12:53 ` bugzilla-daemon
  2013-09-03 18:52 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03 12:53 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #14 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
(In reply to comment #12)
> FYI I obtained documents from NXP stating that the LPC1765 CAN cell is
> exactly the same as the one on the LPC2xxx, that conformance tests have
> been done by NXP on the LPC2XXX ...

Bernard, thank you for your persistence.

Q: But the same as which parts (LPC2XXX, LPC2XXX/00, LPC2XXX/01)?

  https://encrypted.google.com/search?q=lpc2294+errata

I see 3 errata sheets for LPC2294 (LPC2XXX and LPC2XXX/00 parts have 7
CAN errata, CAN.1-CAN.7; LPC2XXX/01 parts have only one, CAN.1). Perhaps
the CAN cells on LPC17XX are more modern.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (15 preceding siblings ...)
  2013-09-03 12:53 ` bugzilla-daemon
@ 2013-09-03 18:52 ` bugzilla-daemon
  2013-09-06  8:19 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-03 18:52 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #15 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
It seems to me we are interested in the experiment on the LPC1765. Looking
forward to Bernard's report. Thanks all.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (16 preceding siblings ...)
  2013-09-03 18:52 ` bugzilla-daemon
@ 2013-09-06  8:19 ` bugzilla-daemon
  2013-09-06 14:20 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-06  8:19 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

Anton Vlaskin <Rinins@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Rinins@gmail.com

--- Comment #16 from Anton Vlaskin <Rinins@gmail.com> ---
Uwe, could I ask you for the test application code? Wanna test this bug on a
couple of 2294H, 2294H/01

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (17 preceding siblings ...)
  2013-09-06  8:19 ` bugzilla-daemon
@ 2013-09-06 14:20 ` bugzilla-daemon
  2013-09-11 21:23 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-06 14:20 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #17 from Uwe Kindler <uwe_kindler@web.de> ---
(In reply to comment #16)
> Uwe, could I ask you for the test application code? Wanna test this bug on a
> couple of 2294H, 2294H/01

Hi,

you simply need to create an eCos configuration with CAN support. Then create
an application that opens the CAN driver and that can send and receive CAN
messages. 

As soon as you physically disconnect the CAN bus from your LPC2294 the problem
will show up.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (18 preceding siblings ...)
  2013-09-06 14:20 ` bugzilla-daemon
@ 2013-09-11 21:23 ` bugzilla-daemon
  2013-09-17  6:10 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-11 21:23 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #18 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
(In reply to comment #16)
> Uwe, could I ask you for the test application code? Wanna test this
> bug on a couple of 2294H, 2294H/01

If you get some results (see comment #17), please post them here. Thank you.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (19 preceding siblings ...)
  2013-09-11 21:23 ` bugzilla-daemon
@ 2013-09-17  6:10 ` bugzilla-daemon
  2013-09-17  6:35 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-17  6:10 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #19 from Uwe Kindler <uwe_kindler@web.de> ---
We just bought a new LPC-L2294-1MB board from Olimex for a new project. While
our old LPC-E2294 board has an LPC2294, the new Olimex board has an LPC2294/01
(suffix 01) processor. I tested the bus-off problem with this new variant and
it shows the same behaviour: a bus-off condition blocks application
execution because the error ISR is called again and again. So the problem still
exists for the new LPC2294 variants.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (20 preceding siblings ...)
  2013-09-17  6:10 ` bugzilla-daemon
@ 2013-09-17  6:35 ` bugzilla-daemon
  2013-09-17 18:39 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-17  6:35 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #20 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
(In reply to comment #19)
> We just bought a new LPC-L2294-1MB board from Olimex for a new
> project. While our old LPC-E2294 board has an LPC2294, the new
> Olimex board has an LPC2294/01 (suffix 01) processor. I tested the
> bus-off problem with this new variant and it shows the same behaviour
> - a bus off condition will block application execution because error
> ISR is called again and again. So the problem still exists for new
> LPC-2294 variants

Thanks for the report. Then I will drop the CDL check for the "01" suffix and
submit the final patch here (I do not plan to delay the check-in any longer).

Sergei

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (21 preceding siblings ...)
  2013-09-17  6:35 ` bugzilla-daemon
@ 2013-09-17 18:39 ` bugzilla-daemon
  2013-09-17 18:42 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-17 18:39 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

Sergei Gavrikov <sergei.gavrikov@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #2353|0                           |1
        is obsolete|                            |

--- Comment #21 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
Created attachment 2366
  --> http://bugs.ecos.sourceware.org/attachment.cgi?id=2366&action=edit
Force handling "Bus-Off" logic for LPC22xx parts

The CDL option CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND is now enabled by
default for all LPC22XX parts.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (22 preceding siblings ...)
  2013-09-17 18:39 ` bugzilla-daemon
@ 2013-09-17 18:42 ` bugzilla-daemon
  2013-09-18  5:45 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-17 18:42 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #22 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
Uwe, it would be great if you could test the final version (attachment
2366). The build is okay, but you recently made small changes to the lpc2xxx
CAN driver, so I ask you to try this patch and confirm that the "Bus-Off"
issue goes away. Thank you.

Sergei

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (23 preceding siblings ...)
  2013-09-17 18:42 ` bugzilla-daemon
@ 2013-09-18  5:45 ` bugzilla-daemon
  2013-09-18  6:46 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-18  5:45 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #23 from Uwe Kindler <uwe_kindler@web.de> ---
Created attachment 2367
  --> http://bugs.ecos.sourceware.org/attachment.cgi?id=2367&action=edit
Updated patch against CVS head

Hi Sergei, I updated the patch and created a diff against the latest CVS head.
The function lpc2xxx_reset_error_counters() is now available only if
CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND is defined, because it is only used
when the workaround is active and would produce a compiler warning otherwise.
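
The guarding pattern described here can be sketched as follows. Only the macro name is taken from this thread; the struct, the function names and the function body are hypothetical placeholders and are not the code from the attachment.

```c
/* Normally provided by the eCos configuration; defined here so that the
   sketch compiles on its own. */
#define CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND 1

/* Hypothetical stand-in for the controller's error counters. */
typedef struct {
    unsigned char txerr;
    unsigned char rxerr;
} sim_counters_t;

#ifdef CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND
/* Compiled only when the workaround is enabled, so a build without the
   option does not contain an unused static function. */
static void sim_reset_error_counters(sim_counters_t *c)
{
    c->txerr = 0;  /* placeholder body, not the real register sequence */
    c->rxerr = 0;
}
#endif

void sim_handle_bus_off(sim_counters_t *c)
{
#ifdef CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND
    sim_reset_error_counters(c);
#else
    (void)c;  /* nothing to do: the hardware is trusted to handle bus-off */
#endif
}
```

Guarding both the definition and the call site with the same macro keeps the two in step, so neither an "unused function" warning nor an undefined reference can appear when the option is toggled.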

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (24 preceding siblings ...)
  2013-09-18  5:45 ` bugzilla-daemon
@ 2013-09-18  6:46 ` bugzilla-daemon
  2013-09-18 16:08 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-18  6:46 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #24 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
(In reply to comment #23)
> Hi Sergei, I updated the patch and created a diff against the latest
> CVS head.  The function lpc2xxx_reset_error_counters() is now available
> only if CYGHWR_DEVS_CAN_LPC2XXX_BUSOFF_WORKAROUND is defined,
> because it is only used when the workaround is active and would produce a
> compiler warning otherwise.

Hi Uwe, good catch, thank you. The build is OK (I hope that you tested it on
real hardware). I will apply it tonight.

Sergei

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (25 preceding siblings ...)
  2013-09-18  6:46 ` bugzilla-daemon
@ 2013-09-18 16:08 ` bugzilla-daemon
  2013-11-05 15:41 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-09-18 16:08 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #25 from Sergei Gavrikov <sergei.gavrikov@gmail.com> ---
The patch (attachment 2367) is now in CVS.
Thank you for your contribution to eCos!

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (26 preceding siblings ...)
  2013-09-18 16:08 ` bugzilla-daemon
@ 2013-11-05 15:41 ` bugzilla-daemon
  2013-11-06  9:30 ` bugzilla-daemon
  2013-11-06 15:28 ` bugzilla-daemon
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-11-05 15:41 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #26 from Bernard Fouché <bernard.fouche@kuantic.com> ---
(In reply to comment #13)
> > I can't test on the LPC1765 before next week at the earliest and the driver
> > I use is different because of the LPC1765 "pending interrupt problem", but
> > what you describe should appear here also if we have the same CAN cell.
> > 
> > Can you describe what you do on the CAN bus to reach bus off? I could try to
> > reproduce here and see if I also see RM staying at zero.
> > 
> > Did you check the value of EWL? If it is 0 or 255 I wonder what happens...
> 
> Create a thread that prints a debug message in a loop via diag_printf (e.g.
> one message per second). Create a second thread that simply sends CAN
> messages (e.g. one message per second). When this application is running,
> simply unplug the CAN bus cable from the CAN connector.
> 
> If your debug message from the thread stops, then you know you are spinning
> in the CAN ISR. You can activate the debug output in the driver to see
> register contents and ISR reason (ICR_BUS_OFF) - it will be printed again
> and again ...

Sorry, it took me ages to be able to go back to that topic.

On the LPC1765, I just did something a bit different than what you wrote, I
unplugged the resistor on the CAN wires. Then I got BEI (Bus Error Interrupt)
in CAN1ICR.

When I look at the error counter, I have 0x00 (Tx) and 0x87 (Rx) while the
warning level is at the default 0x60.

Once I reach this point, the ISR is triggered again and again (and I'm not even
in a bus off condition since the test code didn't have time yet to try to send
a message: it is just floating signals on the wires that show up as impossible
CAN frames).

Once the wires aren't floating any more, I have no more such interrupts. It
seems that the only ways to stop these repeated ISRs are to reset the CAN
hardware driver, to disable the interrupt that reacts to a bus error, or to
have the external condition (in this case bad wiring) disappear. I did not
find in the documentation how to correctly process that case.

Then I re-installed the resistor but I had only my test node on the CAN bus.
Since no one could acknowledge frames, I had a bus off condition, that worked
as described: TxREC was at 0x7F, and CAN1MOD changed from 0x0 to 0x01 (reset
mode).

Hence in my opinion your patch is not needed on LPC1765 BUT the issue with
floating wires needs some attention and the documentation does not describe
exactly what's going on anyway whatever the version of this CAN cell.

All in all I think it is better to have this kind of processing (taking the
decision to reset the CAN controller) to be handled by higher level code
instead of having the ISR or DSR to magically do things.

I did not encounter problems when I was developing my app since the app
powers off the CAN cell as soon as a bus error or bus off condition occurs.
This is detected by a CAN event callback function that takes immediate action
and also reports this to higher level code. The app re-initializes the CAN a few
seconds later and then tries again to use the CAN bus. So I was immune to the
repeated ISR illness, whatever the reason. And my high level code is always
kept aware of what's going on on the wires.
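
The application-level strategy described in this paragraph can be sketched as a two-state machine. Everything here is hypothetical (names, tick units and the retry delay of 3 ticks standing in for "a few seconds"); the actual power-off and re-initialisation calls are hardware specific and are only marked by comments.

```c
typedef enum { APP_CAN_RUNNING, APP_CAN_POWERED_OFF } app_can_state_t;
typedef enum { APP_EV_NONE, APP_EV_BUS_ERROR, APP_EV_BUS_OFF } app_can_event_t;

typedef struct {
    app_can_state_t state;
    unsigned        off_since;     /* tick at which the cell was powered off */
    unsigned        reinit_count;  /* how often the cell was re-initialised */
} app_can_t;

#define APP_CAN_RETRY_TICKS 3u     /* assumed delay, not from the thread */

/* Event callback: react immediately to a bus error or bus-off event. */
void app_can_on_event(app_can_t *c, app_can_event_t ev, unsigned now)
{
    if (c->state == APP_CAN_RUNNING && ev != APP_EV_NONE) {
        /* power off the CAN cell here (hardware specific) */
        c->state = APP_CAN_POWERED_OFF;
        c->off_since = now;
    }
}

/* Called periodically by higher-level code to retry after the delay. */
void app_can_poll(app_can_t *c, unsigned now)
{
    if (c->state == APP_CAN_POWERED_OFF &&
        now - c->off_since >= APP_CAN_RETRY_TICKS) {
        /* re-initialise the CAN cell here (hardware specific) */
        c->state = APP_CAN_RUNNING;
        c->reinit_count++;
    }
}
```

This only works if the event callback actually gets to run, which is the point of contention in the replies below.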

If the CAN hardware driver is reset without the app knowing about it,
other weird things may happen, especially if the higher level code assumes
that the CAN controller can recover from bus off by itself, or that a
previously queued message was correctly sent on the bus, etc.

Maybe the feature you added should send a new event reporting that the CAN
controller was reset? So high level code knows that it should be reconfigured
or messages resent? Anyway I don't use LPC22xx parts so for me this is not a
problem!

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (27 preceding siblings ...)
  2013-11-05 15:41 ` bugzilla-daemon
@ 2013-11-06  9:30 ` bugzilla-daemon
  2013-11-06 15:28 ` bugzilla-daemon
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-11-06  9:30 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #27 from Uwe Kindler <uwe_kindler@web.de> ---
> Sorry, it took me ages to be able to go back to that topic.

Hi Bernard,

thank you for taking the time investigating this problem. 

> On the LPC1765, I just did something a bit different than what you wrote, I
> unplugged the resistor on the CAN wires. Then I got BEI (Bus Error
> Interrupt) in CAN1ICR.

What happens if you do the same thing that I do, that is, simply disconnect
from the CAN bus?

> All in all I think it is better to have this kind of processing (taking the
> decision to reset the CAN controller) to be handled by higher level code
> instead of having the ISR or DSR to magically do things.

My patch does not do any magic in the ISR or DSR. It does exactly what the
hardware manual claims the CAN controller would do, so my implementation only
does something the CAN controller is supposed to do anyway.
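
The manual snippet quoted earlier in this thread says that on Bus-Off the controller sets RM, sets the Transmit Error Counter to 127 and clears the Receive Error Counter. A minimal sketch of forcing that documented state in software, on a simulated register set, could look like this. The struct and function are hypothetical and are not the code from the patch; real code would access CANMOD, CANSR and the error counter register instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the register bits discussed in this thread. */
typedef struct {
    bool    bs;     /* Bus Status: true means bus-off */
    bool    rm;     /* Reset Mode */
    uint8_t txerr;  /* Transmit Error Counter */
    uint8_t rxerr;  /* Receive Error Counter */
} sim_can_regs_t;

/*
 * If the controller reports bus-off but did not enter reset mode by itself,
 * put it into the state the manual documents. Returns true if it acted.
 */
bool sim_force_busoff_state(sim_can_regs_t *r)
{
    if (!r->bs || r->rm)
        return false;  /* no bus-off, or the hardware already did it */
    r->rm = true;      /* enter reset mode */
    r->txerr = 127;    /* value documented for the bus-off state */
    r->rxerr = 0;
    return true;
}
```

The check on RM makes the helper a no-op on parts that behave as documented, so it only changes anything when the hardware has not already entered reset mode.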

> I did not encounter problems when I was developing my app since the app
> powers off the CAN cell as soon as a bus error or bus off condition occurs.

As soon as the bus-off condition occurs, the application is completely blocked
because the ISR fires again and again. So without my patch the application
could never react to a bus-off condition or power off the CAN cell, because the
application code has no chance to run. My patch fixes this. 

> 
> If the CAN hardware driver is reset without having the app to know about it,
> other weird things may happen, especially if the higher level code assumes
> that the CAN controller can recover from bus off by itself, or that a
> previously queued message was correctly sent on the bus, etc.
> 
> Maybe the feature you added should send a new event reporting that the CAN
> controller was reset? So high level code knows that it should be
> reconfigured or messages resent?

No, I don't agree. With my patch the application gets a bus-off event from the
CAN driver and knows what happened. If a bus-off condition occurs, every
application knows that it might need to be reconfigured or that messages might
have been lost or need to be resent. Without my patch the application code is
totally blocked and can't do anything, such as resetting the CAN cell.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 1001897] lpc2xxx CAN driver improvements / enhancements
  2013-08-29  9:37 [Bug 1001897] New: lpc2xxx CAN driver improvements / enhancements bugzilla-daemon
                   ` (28 preceding siblings ...)
  2013-11-06  9:30 ` bugzilla-daemon
@ 2013-11-06 15:28 ` bugzilla-daemon
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2013-11-06 15:28 UTC (permalink / raw)
  To: ecos-patches

Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #28 from Bernard Fouché <bernard.fouche@kuantic.com> ---
(In reply to comment #27)
> > Sorry, it took me ages to be able to go back to that topic.
> 
> Hi Bernard,
> 
> thank you for taking the time investigating this problem. 
> 
> > On the LPC1765, I just did something a bit different than what you wrote, I
> > unplugged the resistor on the CAN wires. Then I got BEI (Bus Error
> > Interrupt) in CAN1ICR.
> 
> What happens if you do the same thing that I do - simply disconnect from
> CAN bus?

I can't do this, on my target the same connector provides power supply and CAN
connectivity :-(

However the only difference should be the length of wire after the CAN
transceiver.

Since you say that you also see BEI raised, maybe the problem you have is the
same as mine: you are stuck with a broken bus, not a real bus-off condition,
and that would explain why you don't see the documented behavior for bus off.

Did you try to xmit on a correct bus with no other node to acknowledge the
frames? If so, were you stuck on the ISR following this "bus off" condition?

> 
> > All in all I think it is better to have this kind of processing (taking the
> > decision to reset the CAN controller) to be handled by higher level code
> > instead of having the ISR or DSR to magically do things.
> 
> My patch does not do any magic in the ISR or DSR. It does exactly what the
> hardware manual claims the CAN controller would do, so my implementation only
> does something the CAN controller is supposed to do anyway.

On the LPC1765 datasheet, it is stated that if TxREC reaches 255 you enter bus
off mode and that means waiting 128 occurrences of Bus Free Condition when RM
is back to zero. If you write a value between 0 and 254, then only a single
occurrence of Bus Free Condition is waited for, whatever the value set in
TxREC.

If it is similar on LPC22xx it means you can't completely simulate the
hardware: writing 127 just sets TxREC to the value it has when bus off is
generated by the hardware, but does not reproduce the full behavior (waiting
128 occurrences of Bus Free Condition).

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread
