public inbox for gcc@gcc.gnu.org
From: Richard Earnshaw <rearnsha@arm.com>
To: Revital1 Eres <ERES@il.ibm.com>
Cc: Roman Zhuykov <zhroma@ispras.ru>,
	dm@ispras.ru, gcc@gcc.gnu.org,  cltang@codesourcery.com,
	yao@codesourcery.com, Ayal Zaks <ZAKS@il.ibm.com>
Subject: Re: [ARM] Implementing doloop pattern
Date: Wed, 05 Jan 2011 15:35:00 -0000
Message-ID: <1294241704.7406.12.camel@e102346-lin.cambridge.arm.com>
In-Reply-To: <OFFD96764C.38FEC784-ONC2257809.005BD5C2-C2257809.005D0F95@il.ibm.com>


On Thu, 2010-12-30 at 18:56 +0200, Revital1 Eres wrote:
> Hello,
> 
> The attached patch is my latest attempt to model doloop for ARM.
> I followed Chung-Lin Tang's suggestion and used subs+jump, similar to your
> patch.
> On Cortex-A8 I see a gain of 29% on the autocor benchmark (telecom suite)
> with SMS using the following flags: -fmodulo-sched-allow-regmoves
> -funsafe-loop-optimizations -fmodulo-sched   -fno-auto-inc-dec
> -fdump-rtl-sms -mthumb  -mcpu=cortex-a8 -O3 (compared to using only
> -mthumb  -mcpu=cortex-a8 -O3).
> 
> I have not fully tested the patch and it's not yet in the proper format
> for submission.
> 
> Thanks,
> Revital
> 
> (See attached file: patch_arm_doloop.txt)
> 
> 
> 
> From:	Roman Zhuykov <zhroma@ispras.ru>
> To:	gcc@gcc.gnu.org
> Cc:	dm@ispras.ru
> Date:	30/12/2010 04:04 PM
> Subject:	[ARM] Implementing doloop pattern
> Sent by:	gcc-owner@gcc.gnu.org
> 
> 
> 
> Hello!
> 
> The main idea of the work described below was to estimate the speedup we
> can gain from SMS on ARM.  SMS depends on the doloop_end pattern, and ARM
> has no appropriate instruction.  We decided to create a "fake"
> doloop_end pattern on ARM using a pair of "subs" and "bne" assembler
> instructions.  In the implementation we used ideas from the machine
> description files of other architectures, e.g. spu, which expands the
> doloop_end pattern only when SMS is enabled.  The patch is attached.
> 
> This patch allows any register to be used for the doloop pattern.
> It was tested on a trunk snapshot from 30 Aug 2010.  It works fine on
> several small examples, but gives an ICE on the sqlite-amalgamation-3.6.1
> source:
> sqlite3.c: In function 'sqlite3WhereBegin':
> sqlite3.c:76683:1: internal compiler error: in patch_jump_insn, at cfgrtl.c:1020
> 
> The ICE happens in the ira pass, when cleanup_cfg is called at the end of ira.
> 
> The "bad" instruction looks like
> (jump_insn 3601 628 4065 76 (parallel [
>              (set (pc)
>                  (if_then_else (ne (mem/c:SI (plus:SI (reg/f:SI 13 sp)
>                                  (const_int 36 [0x24])) [105 %sfp+-916 S4 A32])
>                          (const_int 1 [0x1]))
>                      (label_ref 3600)
>                      (pc)))
>              (set (mem/c:SI (plus:SI (reg/f:SI 13 sp)
>                          (const_int 36 [0x24])) [105 %sfp+-916 S4 A32])
>                  (plus:SI (mem/c:SI (plus:SI (reg/f:SI 13 sp)
>                              (const_int 36 [0x24])) [105 %sfp+-916 S4 A32])
>                      (const_int -1 [0xffffffffffffffff])))
>          ]) sqlite3.c:75235 328 {doloop_end_internal}
>       (expr_list:REG_BR_PROB (const_int 9100 [0x238c])
>          (nil))
>   -> 3600)
> 
> So, the problem seems to be with ira.  Memory is used instead of a
> register to store the doloop counter.  We tried to fix this by explicitly
> specifying a hard register (r5) for the doloop pattern.  The fixed version
> seems to work, but this doesn't look like a proper fix.  On a trunk
> snapshot from 17 Dec 2010 the ICE described above has disappeared, but
> that is probably just a coincidence, and it will show up anyway on some
> other test case.
> 
> The r5-fix shows the following results (comparing "-O2 -fno-auto-inc-dec
> -fmodulo-sched" vs "-O2 -fno-auto-inc-dec").
> Aburto benchmarks: heapsort and matmult - 3% speedup. nsieve - 7% slowdown.
> Other aburto tests, sqlite tests and the libevas rasterization library
> (expedite testsuite) show close to zero change.
> 
> A motivating example shows about 23% speedup:
> 
> char scal (int n, char *a, char *b)
> {
>    int i;
>    char s = 0;
>    for (i = 0; i < n; i++)
>      s += a[i] * b[i];
>    return s;
> }
> 
> We have analyzed the SMS results and conclude that when SMS
> successfully builds a schedule for the loop we usually gain a speedup,
> and when SMS fails we often see some slowdown, which is caused
> by the do-loop conversion.
> 
> The questions are:
> How to properly fix the ICE described?
> Do you think this approach (after the fixes) can make its way into trunk?
> 
> Happy holidays!
> --
> Roman Zhuykov
> 
> [attachment "sms-doloop-any-reg.diff" deleted by Revital1 Eres/Haifa/IBM]
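
For readers following the thread, the do-loop conversion discussed in the
quoted message can be sketched in C.  This is only an illustration and is
not taken from either patch: scal_doloop is a hypothetical hand-written
equivalent of the scal example, using a separate counter that is
decremented and tested against zero, which is the shape a "subs" + "bne"
pair would implement.  The const qualifiers and the early return for
n <= 0 are additions made for the sketch.

```c
/* Original loop shape from the quoted message: i counts up from 0 to n. */
static char scal(int n, const char *a, const char *b)
{
    char s = 0;
    for (int i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Hypothetical do-loop form: a dedicated counter runs down to zero.
   The loop-closing test corresponds to "subs count, count, #1; bne loop".
   The RTL dump in the message expresses the same thing the other way
   round: branch if the counter is not 1, and decrement it in parallel. */
static char scal_doloop(int n, const char *a, const char *b)
{
    char s = 0;
    int count = n;

    if (n <= 0)               /* guard: the do-while body runs at least once */
        return s;

    do {
        s += *a++ * *b++;
    } while (--count != 0);   /* decrement, then branch while non-zero */

    return s;
}
```

Both functions should return the same value for any valid input; the
second form simply makes the count-down counter explicit, which is what
the doloop_end pattern describes to the compiler.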


@@ -162,6 +175,7 @@ doloop_condition_get (rtx doloop_pat)
     return 0;
 
   if ((XEXP (condition, 0) == reg)
+      || (REGNO (XEXP (condition, 0)) == CC_REGNUM)
       || (GET_CODE (XEXP (condition, 0)) == PLUS
                   && XEXP (XEXP (condition, 0), 0) == reg))

You can't depend on CC_REGNUM in generic code.  That's part of the
private machine description for ARM.  Other cores have different ways of
representing condition codes.

R.
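
The point about CC_REGNUM can be illustrated with a small stand-alone toy
model.  Nothing here is GCC code: toy_target, is_condition_reg and the
register number 24 are all made up for the sketch.  It only shows the
general idea that generic code should ask the target a question through an
interface instead of naming one target's private register number.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in GCC. */
struct toy_target {
    const char *name;
    /* Hook: does REGNO hold this target's condition codes? */
    bool (*is_condition_reg)(unsigned regno);
};

/* Toy target A keeps condition codes in (made-up) register 24. */
static bool toy_a_is_condition_reg(unsigned regno)
{
    return regno == 24;
}

/* Toy target B has no condition-code register at all. */
static bool toy_b_is_condition_reg(unsigned regno)
{
    (void) regno;
    return false;
}

static const struct toy_target toy_a = { "toy-a", toy_a_is_condition_reg };
static const struct toy_target toy_b = { "toy-b", toy_b_is_condition_reg };

/* Generic code: never mentions a register number of its own, so it
   stays correct for targets that represent condition codes differently. */
static bool generic_is_condition_reg(const struct toy_target *t,
                                     unsigned regno)
{
    return t->is_condition_reg(regno);
}
```

With this shape, the same generic check gives the right answer for both toy
targets, whereas a hard-coded comparison against 24 would be wrong for the
second one.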



Thread overview: 8+ messages
2010-12-30 14:04 Roman Zhuykov
2010-12-30 16:02 ` Ulrich Weigand
2010-12-30 16:56 ` Revital1 Eres
2011-01-05 15:35   ` Richard Earnshaw [this message]
2011-01-06  7:59     ` Revital1 Eres
2011-01-06  9:11       ` Andreas Schwab
2011-01-13 11:11         ` Ramana Radhakrishnan
2011-01-13 13:51       ` Nathan Froyd
