From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ecos-bugs-return-8680-listarch-ecos-bugs=sources.redhat.com@sourceware.org>
Received: (qmail 27819 invoked by alias); 16 Jan 2012 14:26:18 -0000
Received: (qmail 27787 invoked by uid 22791); 16 Jan 2012 14:26:17 -0000
X-SWARE-Spam-Status: No, hits=-1.9 required=5.0
	tests=AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from hagrid.ecoscentric.com (HELO mail.ecoscentric.com) (212.13.207.197)
    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 16 Jan 2012 14:26:03 +0000
Received: from localhost (hagrid.ecoscentric.com [127.0.0.1])
	by mail.ecoscentric.com (Postfix) with ESMTP id 75EA72F78005
	for <ecos-bugs@ecos.sourceware.org>; Mon, 16 Jan 2012 14:26:02 +0000 (GMT)
Received: from mail.ecoscentric.com ([127.0.0.1])
	by localhost (hagrid.ecoscentric.com [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id c28pxJP4sBCX; Mon, 16 Jan 2012 14:25:57 +0000 (GMT)
From: bugzilla-daemon@bugs.ecos.sourceware.org
To: ecos-bugs@ecos.sourceware.org
Subject: [Bug 1001456] New: HAL misses Interrupt Clear-Pending Registers
 handling: wasted processing power
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: eCos
X-Bugzilla-Component: HAL
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: major
X-Bugzilla-Who: bernard.fouche@kuantic.com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: low
X-Bugzilla-Assigned-To: unassigned@bugs.ecos.sourceware.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-1001456-13@http.bugs.ecos.sourceware.org/>
X-Bugzilla-URL: http://bugs.ecos.sourceware.org/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Date: Mon, 16 Jan 2012 14:26:00 -0000
Mailing-List: contact ecos-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <ecos-bugs.sourceware.org>
List-Subscribe: <mailto:ecos-bugs-subscribe@sourceware.org>
List-Post: <mailto:ecos-bugs@sourceware.org>
List-Help: <mailto:ecos-bugs-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: ecos-bugs-owner@sourceware.org
X-SW-Source: 2012/txt/msg00109.txt.bz2

Please do not reply to this email. Use the web interface provided at:
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001456

           Summary: HAL misses Interrupt Clear-Pending Registers handling:
                    wasted processing power
           Product: eCos
           Version: CVS
          Platform: Other (please specify)
        OS/Version: Cortex-M
            Status: UNCONFIRMED
          Severity: major
          Priority: low
         Component: HAL
        AssignedTo: unassigned@bugs.ecos.sourceware.org
        ReportedBy: bernard.fouche@kuantic.com
                CC: ecos-bugs@ecos.sourceware.org
             Class: Advice Request


The way the Cortex-M3 manages peripheral interrupts require handling these
registers, otherwise an interrupt may be triggered multiple times.

I was looking for spurious interrupts on the LPC17XX CAN driver, and my error
was to skip handling of ICPR0 and ICPR1. The Cortex-M HAL must provide a
function like this:

//===========================================================================
// Clear pending interrupt.
//===========================================================================
__externC void hal_interrupt_clear_pending(cyg_uint32 vector )
{
    HAL_WRITE_UINT32(
CYGARC_REG_NVIC_BASE+CYGARC_REG_NVIC_CPR(vector-CYGNUM_HAL_INTERRUPT_EXTERNAL),
                     
CYGARC_REG_NVIC_IBIT(vector-CYGNUM_HAL_INTERRUPT_EXTERNAL) );
}


This also means that when writing a driver, especially a driver with some kind
of FIFO mechanism, special care must be taken:

- it is impossible to have a general function able to fully acknowledge an
interrupt since the NVIC knows nothing of the inner details of all peripherals
and each peripheral is very different from the other. Hence
cyg_drv_interrupt_acknowledge() should not be used in Cortex-M drivers: it
resolves to an empty statement while letting the programmer think he/she had
done something useful while the situation is the exact opposite: nothing was
done, and an important step has been skipped. IMHO this function should
generate a message on diag_printf() and freeze the system (after some cdl
setting) on the Cortex arch.

- see next point why cyg_drv_interrupt_acknowledge() is different from the
function provided above: clearing the pending interrupt condition must be done
at a particular time, not at anytime during DSR processing, nor during ISR
processing. If the name 'acknowledge' is kept, it is likely that some design
flaw will occur in driver writing.

- when a driver reads a peripheral register to know what kind of job to do, the
corresponding ICPR register must be used to clear the pending interrupt state
first, before reading the driver specific interrupt register. Otherwise a race
condition may occur, at least in cases where the DSR is not looping until it
has completed everything that requires work to do: DSR clears reads the
peripheral register, thinks it knows everything, and just before clearing ICPR,
a new event occurs. By clearing ICPR at this time, an event may be missed
(depends probably of the kind of peripheral and of the way the MCU designed
cabler ICPR to the peripherals).

- IMHO it is necessary to have DSR to perform loops until they see no reason to
continue processing: a simple one time scanning of a peripheral with FIFO
registers will very probably lead to ISR/DSR being fired uselessly.

- even when doing loops but not handling ICPR, an other interrupt will be
triggered and in many cases the DSR will be fired just to discover that there
is nothing to do, the job has been done in the previous run. Sometimes I had
30% of the calls to the CAN DSR that were done without use, so it can be a real
waste.

- if the hardware has two different bits to reports two different conditions
(typically RX and Overrun), don't assume for instance that overrun shows up
only after a RX, unless explicitly stated in the datasheet. I've seen this on
the generic serial driver (made a report: bug #1001392 but nobody reacted about
it) and also on the CAN driver. This is because of the need to have loops in
DSR for efficient processing: if for some reasons the conditions are unrelated
but the code thinks they are, a loop based DSR may be stuck.

I did not investigate the generic serial driver behavior yet regarding ICPR but
I have seen ICPR issues on the SPI and CAN driver: I would be surprised to not
see this problem in the generic serial driver since it has a FIFO.

Checking the issue for any driver is rather simple: count the number of times
the DSR is fired without doing any processing. A non zero count means an ICPR
issue. The count value depends of a number of conditions hard to reproduce. For
instance an ISR may be fired 1000 times, the DSR 1032 times, it depends
entirely of what's going on the concerned peripheral, what the remote end is
doing, etc.

Conclusion:

- cyg_drv_interrupt_acknowledge() is misleading for Cortex-M.
- not handling IPCR -> wasted processing.
- handling IPCR incorrectly -> lost events.
- using a loop in a DSR that incorrect combines unrelated conditions -> stuck
program.

-- 
Configure bugmail: http://bugs.ecos.sourceware.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.