From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 23512 invoked by alias); 21 Dec 2012 03:23:27 -0000
Received: (qmail 23301 invoked by uid 48); 21 Dec 2012 03:23:09 -0000
From: "joey.ye at arm dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/55757] Suboptimal interrupt prologue/epilogue for ARMv7-M (Cortex-M3)
Date: Fri, 21 Dec 2012 03:23:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: joey.ye at arm dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P5
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: CC
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-12/txt/msg02066.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55757

Joey Ye changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joey.ye at arm dot com

--- Comment #4 from Joey Ye 2012-12-21 03:23:07 UTC ---
> An interrupt handler function (void something(void)), but without attribute,
> doing something inside (posts a FreeRTOS semaphore, calls vPortYieldFromISR()
> if it's needed) actually saves a lot of registers on entry:
>     23b4:       b507            push    {r0, r1, r2, lr}

Pushing scratch registers can serve two purposes:
1. aligning the stack, which Richard has explained
2. allocating the stack frame, as a code-size optimization over sub sp, #x

Consider the following example:

    extern void bar(int *, int *);
    void foo()
    {
        int a, b;
        bar(&a, &b);
    }

Built with -Os -mcpu=cortex-m3:

        push    {r0, r1, r2, lr}

Here, pushing r0 and r1 allocates an 8-byte frame for the local variables.
Pushing r2 along with lr keeps sp aligned to 8 bytes. The values of r0-r2
pushed to the stack don't matter.

But built with -O2:

        push    {lr}
        sub     sp, sp, #12

The former is better on code size; the latter wins on performance.

Hopefully this explains everything.
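
The arithmetic behind both prologues can be sketched in a few lines of C. This is only a model of the numbers in the example above, not how GCC actually computes its frame layout; the helper names (frame_bytes, scratch_regs_to_push, sub_sp_bytes) are made up for illustration, and it assumes 4-byte registers and an 8-byte stack alignment requirement.

```c
#include <assert.h>

/* Assumptions for this sketch: 32-bit registers, 8-byte aligned sp. */
enum { WORD_BYTES = 4, STACK_ALIGN = 8 };

/* Round n up to the next multiple of align. */
static unsigned round_up(unsigned n, unsigned align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Total bytes sp must drop: saved registers plus locals, padded to alignment. */
unsigned frame_bytes(unsigned saved_regs, unsigned local_bytes)
{
    return round_up(saved_regs * WORD_BYTES + local_bytes, STACK_ALIGN);
}

/* -Os style: extra scratch registers added to the push list so that the
   single push covers locals and padding as well as the saved registers. */
unsigned scratch_regs_to_push(unsigned saved_regs, unsigned local_bytes)
{
    return frame_bytes(saved_regs, local_bytes) / WORD_BYTES - saved_regs;
}

/* -O2 style: immediate for the separate "sub sp, sp, #n" that follows a
   push of only the saved registers. */
unsigned sub_sp_bytes(unsigned saved_regs, unsigned local_bytes)
{
    return frame_bytes(saved_regs, local_bytes) - saved_regs * WORD_BYTES;
}
```

For foo() above there is one saved register (lr) and 8 bytes of locals, so scratch_regs_to_push(1, 8) gives 3 (r0, r1, r2) and sub_sp_bytes(1, 8) gives 12. Both prologues lower sp by the same 16 bytes.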