public inbox for gcc-patches@gcc.gnu.org
* [PATCH 3/7] [ARC] Update can_follow_jump hook helper.
  2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
                   ` (3 preceding siblings ...)
  2017-07-24  8:46 ` [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P Claudiu Zissulescu
@ 2017-07-24  8:46 ` Claudiu Zissulescu
  2017-08-15 13:56   ` Andrew Burgess
  2017-07-24  8:46 ` [PATCH 7/7] [ARC] Fix errors in arc_ifcvt Claudiu Zissulescu
  2017-07-24  8:46 ` [PATCH 1/7] [ARC] Improves and fixes for small data support Claudiu Zissulescu
  6 siblings, 1 reply; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra

From: claziss <claziss@synopsys.com>

Short branches cannot be used to jump between hot/cold
sections. Update the hook.
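
For readers skimming the thread, the rule can be sketched roughly as below.
This is only an illustrative model, not the hook itself: the name
can_follow_jump_sketch, the enum values and the plain length argument are
stand-ins for the real attribute queries, and the "true" results for the
cases outside the hunk shown in this patch are an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the insn "type" attribute values.  */
enum insn_type
{
  TYPE_BRANCH,
  TYPE_BRCC,
  TYPE_BRCC_NO_DELAY_SLOT,
  TYPE_OTHER
};

/* Simplified model of the check.  FOLLOWEE_CROSSES says whether the
   jump being followed crosses the hot/cold section boundary; TYPE and
   LENGTH (in bytes) describe the follower.  */
bool
can_follow_jump_sketch (bool followee_crosses, enum insn_type type,
                        int length)
{
  if (!followee_crosses)
    return true;

  switch (type)
    {
    case TYPE_BRANCH:
      /* Only the 2-byte (short) branch encoding is rejected.  */
      if (length != 2)
        break;
      /* Fall through.  */
    case TYPE_BRCC:
    case TYPE_BRCC_NO_DELAY_SLOT:
      return false;
    default:
      break;
    }
  return true;
}
```

In this model a 4-byte TYPE_BRANCH follower is still allowed to follow a
crossing jump, while a 2-byte one is not.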

gcc/
2017-04-26  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.c (arc_can_follow_jump): Check for short
	branches.
---
 gcc/config/arc/arc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index cf52df8..3980a0f 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -9941,6 +9941,9 @@ arc_can_follow_jump (const rtx_insn *follower, const rtx_insn *followee)
   if (CROSSING_JUMP_P (followee))
     switch (get_attr_type (u.r))
       {
+      case TYPE_BRANCH:
+	if (get_attr_length (u.r) != 2)
+	  break;
       case TYPE_BRCC:
       case TYPE_BRCC_NO_DELAY_SLOT:
 	return false;
-- 
1.9.1


* [PATCH 5/7] [ARC] Update various patterns
  2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
@ 2017-07-24  8:46 ` Claudiu Zissulescu
  2017-08-31 19:33   ` Andrew Burgess
  2017-07-24  8:46 ` [PATCH 6/7] [ARC] Reimplement ZOL support Claudiu Zissulescu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra

From: claziss <claziss@synopsys.com>

gcc/
2017-04-25  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.md (movqi_insn): Add stores to save constant long
	immediates.
	(movhi_insn): Update the store instruction constraints that save
	6-bit short immediates.
	(movsi_insn): Consider also short scaled load operations.
	(zero_extendhisi2_i): Use Usd constraint instead of T.
	(extendhisi2_i): Add q constraint.
	(arc_clzsi2): Add type and length attributes.
	(arc_ctzsi2): Likewise.
	* config/arc/constraints.md (Usc): Update constraint, the
	assembler can parse two relocations for a single instruction.

gcc/testsuite/
2017-04-25  Claudiu Zissulescu  <claziss@synopsys.com>

	* gcc.target/arc/arc.exp: Test also cpp files.
	* gcc.target/arc/tdelay_slots.cpp: New test.
---
 gcc/config/arc/arc.md                         | 54 ++++++++++++++-------------
 gcc/config/arc/constraints.md                 |  6 +--
 gcc/testsuite/gcc.target/arc/arc.exp          |  2 +-
 gcc/testsuite/gcc.target/arc/tdelay_slots.cpp | 42 +++++++++++++++++++++
 4 files changed, 75 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/tdelay_slots.cpp

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index f595da7..04a1447 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -618,8 +618,8 @@
 ; The iscompact attribute allows the epilogue expander to know for which
 ; insns it should lengthen the return insn.
 (define_insn "*movqi_insn"
-  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,h, w,Rcq,  S,!*x,  r,r, Ucm,m,???m,Usc")
-	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,i,?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3"))]
+  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,h, w,Rcq,  S,!*x,  r,r, Ucm,m,???m,  m,Usc")
+	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,i,?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
   "register_operand (operands[0], QImode)
    || register_operand (operands[1], QImode)"
   "@
@@ -641,11 +641,12 @@
    xstb%U0 %1,%0
    stb%U0%V0 %1,%0
    stb%U0%V0 %1,%0
+   stb%U0%V0 %1,%0
    stb%U0%V0 %1,%0"
-  [(set_attr "type" "move,move,move,move,move,move,move,move,move,move,load,store,load,load,load,store,store,store,store")
-   (set_attr "iscompact" "maybe,maybe,maybe,true,true,false,false,false,maybe_limm,false,true,true,true,false,false,false,false,false,false")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no,yes,yes,yes,no,no,no,no,no,no,no,no,no")
-   (set_attr "cpu_facility" "av1,av1,av1,av2,av2,*,*,*,*,*,*,*,*,*,*,*,*,*,*")])
+  [(set_attr "type" "move,move,move,move,move,move,move,move,move,move,load,store,load,load,load,store,store,store,store,store")
+   (set_attr "iscompact" "maybe,maybe,maybe,true,true,false,false,false,maybe_limm,false,true,true,true,false,false,false,false,false,false,false")
+   (set_attr "predicable" "yes,no,yes,no,no,yes,no,yes,yes,yes,no,no,no,no,no,no,no,no,no,no")
+   (set_attr "cpu_facility" "av1,av1,av1,av2,av2,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*")])
 
 (define_expand "movhi"
   [(set (match_operand:HI 0 "move_dest_operand" "")
@@ -654,8 +655,8 @@
   "if (prepare_move_operands (operands, HImode)) DONE;")
 
 (define_insn "*movhi_insn"
-  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,Rcq#q,h, w,Rcq,  S,  r,r, Ucm,m,???m,VUsc,VUsc")
-	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,    i,i,?i,  T,Rcq,Ucm,m,?Rac,c,?Rac, Cm3,i"))]
+  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,Rcq#q,h, w,Rcq,  S,  r,r, Ucm,m,???m,  m,VUsc")
+	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,    i,i,?i,  T,Rcq,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
   "register_operand (operands[0], HImode)
    || register_operand (operands[1], HImode)
    || (CONSTANT_P (operands[1])
@@ -706,8 +707,8 @@
 ; insns it should lengthen the return insn.
 ; N.B. operand 1 of alternative 7 expands into pcl,symbol@gotpc .
 (define_insn "*movsi_insn"                      ;   0     1     2     3    4  5 6   7   8   9   10  11  12  13    14  15   16  17  18     19     20  21  22    23    24 25 26    27 28  29   30   31
-  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,  w,  w,  w,  w,???w, ?w,  w,Rcq#q,  h,   w,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,VUsc,VUsc")
-	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,Crr,Clo,Chi,Cbi,?Rac,Cpc,Clb, ?Cal,Cal,?Cal,  T,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac, Cm3, C32"))]
+  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,  w,  w,  w,  w,???w, ?w,  w,Rcq#q,  h,   w,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,  m,VUsc")
+	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,Crr,Clo,Chi,Cbi,?Rac,Cpc,Clb, ?Cal,Cal,?Cal,Uts,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac,Cm3, C32"))]
   "register_operand (operands[0], SImode)
    || register_operand (operands[1], SImode)
    || (CONSTANT_P (operands[1])
@@ -730,10 +731,10 @@
    mov%? %0,%1		;11
    add %0,%S1		;12
    add %0,pcl,%1@pcl    ;13
-   mov%? %0,%S1%&	;14
-   mov%? %0,%S1		;15
-   mov%? %0,%S1		;16
-   ld%? %0,%1%&		;17
+   mov%? %0,%1  	;14
+   mov%? %0,%1		;15
+   mov%? %0,%1		;16
+   ld%?%U1 %0,%1	;17
    st%? %1,%0%&		;18
    * return arc_short_long (insn, \"push%? %1%&\", \"st%U0 %1,%0%&\");
    * return arc_short_long (insn, \"pop%? %0%&\",  \"ld%U1 %0,%1%&\");
@@ -747,13 +748,13 @@
    st%U0%V0 %1,%0	;28
    st%U0%V0 %1,%0	;29
    st%U0%V0 %1,%0	;30
-   st%U0%V0 %S1,%0	;31"
+   st%U0%V0 %1,%0	;31"
    ;                         0     1     2     3    4    5      6       7           8     9    10     11    12    13           14        15    16   17    18    19   20    21    22   23  24    25    26    27    28    29   30   31
   [(set_attr "type"       "move, move, move,move,move, move, move,two_cycle_core,shift,shift,shift, move,binary,binary,      move,      move, move,load,store,store,load,load, load,load,load, load,store,store,store,store,store,store")
    (set_attr "iscompact" "maybe,maybe,maybe,true,true,false,false,         false,false,false,false,false, false, false,maybe_limm,maybe_limm,false,true, true, true,true,true,false,true,true,false,false, true,false,false,false,false")
    ; Use default length for iscompact to allow for COND_EXEC.  But set length
    ; of Crr to 4.
-   (set_attr "length" "*,*,*,*,*,4,4,4,4,4,4,4,8,8,*,*,*,*,*,*,*,*,4,*,4,*,*,*,*,*,4,8")
+   (set_attr "length" "*,*,*,*,*,4,4,4,4,4,4,4,8,8,*,*,*,*,*,*,*,*,4,*,4,*,*,*,*,*,*,8")
    (set_attr "predicable" "yes,no,yes,no,no,yes,no,no,no,no,no,yes,no,no,yes,yes,yes,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no")
    (set_attr "cpu_facility" "av1,av1,av1,av2,av2,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,av2,av2,*,*,av2,*,*,av2,*")])
 
@@ -1634,7 +1635,7 @@
 )
 
 (define_insn "*zero_extendqisi2_ac"
-  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcq,Rcq#q,Rcw,w,qRcq,!*x,r,r")
+  [(set (match_operand:SI 0 "dest_reg_operand"    "=Rcq,Rcq#q,Rcw,w,qRcq,!*x,r,r")
 	(zero_extend:SI (match_operand:QI 1 "nonvol_nonimm_operand" "0,Rcq#q,0,c,T,Usd,Ucm,m")))]
   ""
   "@
@@ -1659,19 +1660,19 @@
 
 (define_insn "*zero_extendhisi2_i"
   [(set (match_operand:SI 0 "dest_reg_operand" "=Rcq,q,Rcw,w,!x,Rcqq,r,r")
-	(zero_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "0,q,0,c,Usd,Usd,Ucm,m")))]
+	(zero_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "0,q,0,c,Usd,T,Ucm,m")))]
   ""
   "@
    ext%_%? %0,%1%&
    ext%_%? %0,%1%&
    bmsk%? %0,%1,15
    ext%_ %0,%1
-   ld%_%? %0,%1%&
-   ld%_%U1 %0,%1
+   ld%_%? %0,%1
+   ld%_%? %0,%1
    * return TARGET_EM ? \"xldh%U1%V1 %0,%1\" : \"xldw%U1 %0,%1\";
    ld%_%U1%V1 %0,%1"
   [(set_attr "type" "unary,unary,unary,unary,load,load,load,load")
-   (set_attr "iscompact" "maybe,true,false,false,true,false,false,false")
+   (set_attr "iscompact" "maybe,true,false,false,true,true,false,false")
    (set_attr "predicable" "no,no,yes,no,no,no,no,no")])
 
 
@@ -1726,7 +1727,7 @@
 )
 
 (define_insn "*extendhisi2_i"
-  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,w,Rcq,r,r")
+  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,w,Rcqq,r,r")
 	(sign_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "Rcqq,c,Ucd,Uex,m")))]
   ""
   "@
@@ -4524,7 +4525,9 @@
        gen_rtx_GE (VOIDmode, gen_rtx_REG (CC_ZNmode, CC_REG), const0_rtx),
        gen_rtx_SET (operands[0], plus_constant (SImode, operands[0], 1))));
   DONE;
-})
+}
+[(set_attr "type" "unary")
+ (set_attr "length" "12")])
 
 (define_expand "ctzsi2"
   [(match_operand:SI 0 "register_operand" "")
@@ -4567,8 +4570,9 @@
        gen_rtx_SET (operands[0], gen_rtx_MINUS (SImode, GEN_INT (31),
 						operands[0]))));
   DONE;
-})
-
+}
+[(set_attr "type" "unary")
+ (set_attr "length" "20")])
 
 (define_insn "swap"
   [(set (match_operand:SI  0 "dest_reg_operand" "=w,w,w")
diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md
index 0ad318c..b8a4a90 100644
--- a/gcc/config/arc/constraints.md
+++ b/gcc/config/arc/constraints.md
@@ -357,13 +357,13 @@
    (and (match_code "mem")
 	(match_test "compact_sda_memory_operand (op, VOIDmode, true)")))
 
+; The Usc constraint is only used for storing long constants, hence we can
+; have only [b,s9] and [b] types of addresses.
 (define_memory_constraint "Usc"
   "@internal
    A valid memory operand for storing constants"
   (and (match_code "mem")
-       (match_test "!CONSTANT_P (XEXP (op,0))")
-;; ??? the assembler rejects stores of immediates to small data.
-       (match_test "!compact_sda_memory_operand (op, VOIDmode, false)")))
+       (match_test "!CONSTANT_P (XEXP (op,0))")))
 
 (define_constraint "Us<"
   "@internal
diff --git a/gcc/testsuite/gcc.target/arc/arc.exp b/gcc/testsuite/gcc.target/arc/arc.exp
index 00b5c33..5bb09d8 100644
--- a/gcc/testsuite/gcc.target/arc/arc.exp
+++ b/gcc/testsuite/gcc.target/arc/arc.exp
@@ -104,7 +104,7 @@ if ![info exists DEFAULT_CFLAGS] then {
 dg-init
 
 # Main loop.
-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.{\[cS\],cpp}]] \
 	"" $DEFAULT_CFLAGS
 
 # All done.
diff --git a/gcc/testsuite/gcc.target/arc/tdelay_slots.cpp b/gcc/testsuite/gcc.target/arc/tdelay_slots.cpp
new file mode 100644
index 0000000..3a6d379
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/tdelay_slots.cpp
@@ -0,0 +1,42 @@
+/* { dg-do assemble } */
+/* { dg-skip-if "" { ! { clmcpu } } } */
+/* { dg-options "-O2 -mcpu=em" } */
+
+template <class> struct A;
+int a;
+template <> struct A<char> {
+  typedef int int_type;
+  static int_type eof();
+};
+template <> struct A<wchar_t> {
+  typedef int int_type;
+  static int_type eof() { return -1; }
+};
+class basic_streambuf {
+public:
+  virtual ~basic_streambuf();
+};
+class B {
+  void tie();
+  class C {
+    C();
+  };
+};
+template <typename _CharT, typename _Traits = A<_CharT>>
+class D : basic_streambuf {
+  typedef _Traits traits_type;
+  typename traits_type::int_type _M_unget_buf;
+
+public:
+  D(void *) : _M_unget_buf(traits_type::eof()) {}
+};
+
+extern D<wchar_t> b;
+B c;
+void *operator new(unsigned, void *p2) { return p2; }
+
+B::C::C() {
+  new D<char>(&a);
+  c.tie();
+  new (&b) D<wchar_t>(&a);
+}
-- 
1.9.1


* [PATCH 0/7] [ARC] SDATA and ZOL updates.
@ 2017-07-24  8:46 Claudiu Zissulescu
  2017-07-24  8:46 ` [PATCH 5/7] [ARC] Update various patterns Claudiu Zissulescu
                   ` (6 more replies)
  0 siblings, 7 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra

From: claziss <claziss@synopsys.com>

Hi Andrew,

Please find a new series of ARC patches that improves or fixes a number of key ARC features:
* SDATA updates: there are two patches. The first cleans up and improves the existing sdata implementation; the second adds support for the -G option to our target. The -G option controls the sdata behaviour.
* Reimplement the ZOL support using the modern GCC support for it.
* A number of hooks are updated, plus patches for ARC's if-conversion procedure.

Hi Sandra,

You are cc'ed on this email because I'm changing documentation, and it would be nice if you could check my changes.

Thank you,
Claudiu

Claudiu Zissulescu (7):
  [ARC] Improves and fixes for small data support.
  [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P.
  [ARC] Use -G option to control sdata behavior
  [ARC] Update can_follow_jump hook helper.
  [ARC] Update various patterns
  [ARC] Reimplement ZOL support.
  [ARC] Fix errors in arc_ifcvt.

 gcc/config.gcc                                    |    2 +-
 gcc/config/arc/arc-c.c                            |    2 +
 gcc/config/arc/arc-protos.h                       |    2 +-
 gcc/config/arc/arc.c                              | 1089 ++++++++++++---------
 gcc/config/arc/arc.h                              |   16 +-
 gcc/config/arc/arc.md                             |  461 +++------
 gcc/config/arc/arc.opt                            |   25 +
 gcc/config/arc/constraints.md                     |    8 +-
 gcc/config/arc/predicates.md                      |    2 +
 gcc/config/arc/simdext.md                         |   22 +-
 gcc/doc/invoke.texi                               |   14 +-
 gcc/testsuite/gcc.target/arc/arc.exp              |    2 +-
 gcc/testsuite/gcc.target/arc/arc700-stld-hazard.c |    2 +-
 gcc/testsuite/gcc.target/arc/loop-1.c             |   49 +-
 gcc/testsuite/gcc.target/arc/pr9001184797.c       |   19 +
 gcc/testsuite/gcc.target/arc/sdata-3.c            |   32 +
 gcc/testsuite/gcc.target/arc/sdata-4.c            |   15 +
 gcc/testsuite/gcc.target/arc/sdata-5.c            |    8 +
 gcc/testsuite/gcc.target/arc/tdelay_slots.cpp     |   42 +
 19 files changed, 940 insertions(+), 872 deletions(-)
 mode change 100644 => 100755 gcc/testsuite/gcc.target/arc/loop-1.c
 create mode 100644 gcc/testsuite/gcc.target/arc/pr9001184797.c
 create mode 100644 gcc/testsuite/gcc.target/arc/sdata-3.c
 create mode 100644 gcc/testsuite/gcc.target/arc/sdata-4.c
 create mode 100755 gcc/testsuite/gcc.target/arc/sdata-5.c
 create mode 100644 gcc/testsuite/gcc.target/arc/tdelay_slots.cpp

-- 
1.9.1


* [PATCH 7/7] [ARC] Fix errors in arc_ifcvt.
  2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
                   ` (4 preceding siblings ...)
  2017-07-24  8:46 ` [PATCH 3/7] [ARC] Update can_follow_jump hook helper Claudiu Zissulescu
@ 2017-07-24  8:46 ` Claudiu Zissulescu
  2017-08-31 20:08   ` Andrew Burgess
  2017-07-24  8:46 ` [PATCH 1/7] [ARC] Improves and fixes for small data support Claudiu Zissulescu
  6 siblings, 1 reply; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra

From: claziss <claziss@synopsys.com>

The arc_ifcvt procedure is removing a label even when it is used by
another jump.  This patch fixes dg.exp/pr31507-1.c.
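
The intended label handling can be modelled with a simple use counter.
The sketch below is only an illustration under stated assumptions:
struct label_sketch and release_label_use are hypothetical names standing
in for LABEL_NUSES and delete_insn in the real code.

```c
#include <stdbool.h>

/* Hypothetical model of a code label: NUSES counts the jumps that
   still reference it.  */
struct label_sketch
{
  int nuses;
  bool deleted;
};

/* Drop one reference to LABEL, as happens when a conditional branch
   to it has been converted away.  The label is deleted only once no
   other jump uses it.  */
void
release_label_use (struct label_sketch *label)
{
  if (--label->nuses == 0)
    label->deleted = true;
}
```

A label referenced by two jumps therefore survives the removal of the
first branch and goes away only with the second.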

gcc/
2017-07-10  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.c (arc_ifcvt): Remove use of merge_blocks call.
	(arc_ccfsm_advance): Fix checking for delay slots.
	(arc_reorg): Add rtl dump after each call to arc_ifcvt.
---
 gcc/config/arc/arc.c | 42 +++++++++++-------------------------------
 1 file changed, 11 insertions(+), 31 deletions(-)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 0f9b553..f7a2b61 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -4429,12 +4429,10 @@ arc_ccfsm_advance (rtx_insn *insn, struct arc_ccfsm *state)
 
       /* If this is a non-annulled branch with a delay slot, there is
 	 no need to conditionalize the delay slot.  */
-      if (NEXT_INSN (PREV_INSN (insn)) != insn
+      if ((GET_CODE (PATTERN (NEXT_INSN (PREV_INSN (insn)))) == SEQUENCE)
 	  && state->state == 0 && !INSN_ANNULLED_BRANCH_P (insn))
 	{
 	  this_insn = NEXT_INSN (this_insn);
-	  gcc_assert (NEXT_INSN (NEXT_INSN (PREV_INSN (start_insn)))
-		      == NEXT_INSN (this_insn));
 	}
       /* See how many insns this branch skips, and what kind of insns.  If all
 	 insns are okay, and the label or unconditional branch to the same
@@ -7547,6 +7545,12 @@ arc_reorg (void)
 	  arc_ifcvt ();
 	  unsigned int flags = pass_data_arc_ifcvt.todo_flags_finish;
 	  df_finish_pass ((flags & TODO_df_verify) != 0);
+
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, ";; After if conversion:\n\n");
+	      print_rtl (dump_file, get_insns ());
+	    }
 	}
 
       /* Call shorten_branches to calculate the insn lengths.  */
@@ -8998,7 +9002,6 @@ static unsigned
 arc_ifcvt (void)
 {
   struct arc_ccfsm *statep = &cfun->machine->ccfsm_current;
-  basic_block merge_bb = 0;
 
   memset (statep, 0, sizeof *statep);
   for (rtx_insn *insn = get_insns (); insn; insn = next_insn (insn))
@@ -9008,20 +9011,14 @@ arc_ifcvt (void)
       switch (statep->state)
 	{
 	case 0:
-	  if (JUMP_P (insn))
-	    merge_bb = 0;
 	  break;
 	case 1: case 2:
 	  {
 	    /* Deleted branch.  */
-	    gcc_assert (!merge_bb);
-	    merge_bb = BLOCK_FOR_INSN (insn);
-	    basic_block succ_bb
-	      = BLOCK_FOR_INSN (NEXT_INSN (NEXT_INSN (PREV_INSN (insn))));
 	    arc_ccfsm_post_advance (insn, statep);
 	    gcc_assert (!IN_RANGE (statep->state, 1, 2));
 	    rtx_insn *seq = NEXT_INSN (PREV_INSN (insn));
-	    if (seq != insn)
+	    if (GET_CODE (PATTERN (seq)) == SEQUENCE)
 	      {
 		rtx slot = XVECEXP (PATTERN (seq), 0, 1);
 		rtx pat = PATTERN (slot);
@@ -9035,18 +9032,10 @@ arc_ifcvt (void)
 		  gcc_unreachable ();
 		PUT_CODE (slot, NOTE);
 		NOTE_KIND (slot) = NOTE_INSN_DELETED;
-		if (merge_bb && succ_bb)
-		  merge_blocks (merge_bb, succ_bb);
-	      }
-	    else if (merge_bb && succ_bb)
-	      {
-		set_insn_deleted (insn);
-		merge_blocks (merge_bb, succ_bb);
 	      }
 	    else
 	      {
-		PUT_CODE (insn, NOTE);
-		NOTE_KIND (insn) = NOTE_INSN_DELETED;
+		set_insn_deleted (insn);
 	      }
 	    continue;
 	  }
@@ -9055,17 +9044,8 @@ arc_ifcvt (void)
 	      && statep->target_label == CODE_LABEL_NUMBER (insn))
 	    {
 	      arc_ccfsm_post_advance (insn, statep);
-	      basic_block succ_bb = BLOCK_FOR_INSN (insn);
-	      if (merge_bb && succ_bb)
-		merge_blocks (merge_bb, succ_bb);
-	      else if (--LABEL_NUSES (insn) == 0)
-		{
-		  const char *name = LABEL_NAME (insn);
-		  PUT_CODE (insn, NOTE);
-		  NOTE_KIND (insn) = NOTE_INSN_DELETED_LABEL;
-		  NOTE_DELETED_LABEL_NAME (insn) = name;
-		}
-	      merge_bb = 0;
+	      if (--LABEL_NUSES (insn) == 0)
+		delete_insn (insn);
 	      continue;
 	    }
 	  /* Fall through.  */
-- 
1.9.1


* [PATCH 1/7] [ARC] Improves and fixes for small data support.
  2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
                   ` (5 preceding siblings ...)
  2017-07-24  8:46 ` [PATCH 7/7] [ARC] Fix errors in arc_ifcvt Claudiu Zissulescu
@ 2017-07-24  8:46 ` Claudiu Zissulescu
  2017-08-15 13:36   ` Andrew Burgess
  6 siblings, 1 reply; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches
  Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra,
	Claudiu Zissulescu

From: Claudiu Zissulescu <claziss@gmail.com>

Add an alignment check for the short load/store instructions used for
sdata, as they require a 32-bit aligned short immediate.  Use the sdata
symbol alignment information and emit scaled loads/stores whenever
possible.  The scaled address extends the access range for sdata
symbols.  Allow 64-bit data into the small data section if double
load/store instructions are present.
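
As a rough illustration of the alignment rule, here is a minimal sketch.
It assumes the access size and the symbol alignment are already known in
bytes; sda_alignment_ok is a hypothetical helper, not a function from
this patch, and an alignment of 0 stands for "no declaration available".

```c
#include <stdbool.h>

/* Return true when an access of ACCESS_SIZE bytes to an sdata symbol
   aligned to ALIGN_BYTES may use the short or scaled form.  Byte
   accesses need no alignment, 2-byte accesses need an even alignment
   and larger accesses need a multiple of 4.  */
bool
sda_alignment_ok (int access_size, int align_bytes)
{
  int mask;

  if (access_size == 1)
    return true;

  mask = (access_size == 2) ? 1 : 3;

  /* An unknown alignment (0) is treated as not aligned.  */
  return align_bytes != 0 && (align_bytes & mask) == 0;
}
```

For example, an int declared with aligned (1) fails the check and has to
use the unscaled form, while a naturally aligned int passes.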

gcc/
2017-04-12  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc-protos.h (compact_sda_memory_operand): Update
	prototype.
	* config/arc/arc.c (arc_print_operand): Output scaled address for
	sdata whenever possible.
	(arc_in_small_data_p): Allow sdata for 64bit datum when double
	load/stores are available.
	(compact_sda_memory_operand): Check for the alignment required by
	code density instructions.
	* config/arc/arc.md (movsi_insn): Use newly introduced Us0
	constraint.
	* config/arc/constraints.md (Usd): Update constraint.
	(Us0): New constraint.
	(Usc): Update constraint.

gcc/testsuite/
2017-04-12  Claudiu Zissulescu  <claziss@synopsys.com>

	* gcc.target/arc/sdata-3.c: New file.
---
 gcc/config/arc/arc-protos.h            |  2 +-
 gcc/config/arc/arc.c                   | 64 +++++++++++++++++++++++++++++-----
 gcc/config/arc/constraints.md          |  4 +--
 gcc/testsuite/gcc.target/arc/sdata-3.c | 32 +++++++++++++++++
 gcc/testsuite/gcc.target/arc/sdata-4.c | 15 ++++++++
 5 files changed, 105 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/sdata-3.c
 create mode 100644 gcc/testsuite/gcc.target/arc/sdata-4.c

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index 850795a..c831972 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -27,7 +27,7 @@ extern struct rtx_def *gen_compare_reg (rtx, machine_mode);
 /* Declarations for various fns used in the .md file.  */
 extern void arc_output_function_epilogue (FILE *, HOST_WIDE_INT, int);
 extern const char *output_shift (rtx *);
-extern bool compact_sda_memory_operand (rtx op,machine_mode  mode);
+extern bool compact_sda_memory_operand (rtx, machine_mode, bool);
 extern bool arc_double_limm_p (rtx);
 extern void arc_print_operand (FILE *, rtx, int);
 extern void arc_print_operand_address (FILE *, rtx);
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 89de6cd..091bc89 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -3900,6 +3900,26 @@ arc_print_operand (FILE *file, rtx x, int code)
 		  fputs (".as", file);
 		  output_scaled = 1;
 		}
+	      else if (LEGITIMATE_SMALL_DATA_ADDRESS_P (addr)
+		       && GET_MODE_SIZE (GET_MODE (x)) > 1)
+		{
+		  tree decl = NULL_TREE;
+		  int align = 0;
+		  if (GET_CODE (XEXP (addr, 1)) == SYMBOL_REF)
+		    decl = SYMBOL_REF_DECL (XEXP (addr, 1));
+		  else if (GET_CODE (XEXP (XEXP (XEXP (addr, 1), 0), 0))
+			   == SYMBOL_REF)
+		    decl = SYMBOL_REF_DECL (XEXP (XEXP (XEXP (addr, 1), 0), 0));
+		  if (decl)
+		    align = DECL_ALIGN (decl);
+		  align = align / BITS_PER_UNIT;
+		  if ((GET_MODE_SIZE (GET_MODE (x)) == 2)
+		      && align && ((align & 1) == 0))
+		    fputs (".as", file);
+		  if ((GET_MODE_SIZE (GET_MODE (x)) >= 4)
+		      && align && ((align & 3) == 0))
+		    fputs (".as", file);
+		}
 	      break;
 	    case REG:
 	      break;
@@ -7571,12 +7591,10 @@ arc_in_small_data_p (const_tree decl)
 {
   HOST_WIDE_INT size;
 
+  /* Strings and functions are never in small data area.  */
   if (TREE_CODE (decl) == STRING_CST || TREE_CODE (decl) == FUNCTION_DECL)
     return false;
 
-
-  /* We don't yet generate small-data references for -mabicalls.  See related
-     -G handling in override_options.  */
   if (TARGET_NO_SDATA_SET)
     return false;
 
@@ -7595,7 +7613,7 @@ arc_in_small_data_p (const_tree decl)
 	  return true;
     }
   /* Only global variables go into sdata section for now.  */
-  else if (1)
+  else
     {
       /* Don't put constants into the small data section: we want them
 	 to be in ROM rather than RAM.  */
@@ -7625,9 +7643,6 @@ arc_in_small_data_p (const_tree decl)
 
   size = int_size_in_bytes (TREE_TYPE (decl));
 
-/*   if (AGGREGATE_TYPE_P (TREE_TYPE (decl))) */
-/*     return false; */
-
   /* Allow only <=4B long data types into sdata.  */
   return (size > 0 && size <= 4);
 }
@@ -7719,10 +7734,13 @@ small_data_pattern (rtx op, machine_mode)
 /* volatile cache option still to be handled.  */
 
 bool
-compact_sda_memory_operand (rtx op, machine_mode mode)
+compact_sda_memory_operand (rtx op, machine_mode mode, bool short_p)
 {
   rtx addr;
   int size;
+  tree decl = NULL_TREE;
+  int align = 0;
+  int mask = 0;
 
   /* Eliminate non-memory operations.  */
   if (GET_CODE (op) != MEM)
@@ -7740,7 +7758,35 @@ compact_sda_memory_operand (rtx op, machine_mode mode)
   /* Decode the address now.  */
   addr = XEXP (op, 0);
 
-  return LEGITIMATE_SMALL_DATA_ADDRESS_P  (addr);
+  if (!LEGITIMATE_SMALL_DATA_ADDRESS_P (addr))
+    return false;
+
+  if (!short_p || size == 1)
+    return true;
+
+  /* Now check for the alignment, the short loads using gp require the
+     addresses to be aligned.  */
+  if (GET_CODE (XEXP (addr, 1)) == SYMBOL_REF)
+    decl = SYMBOL_REF_DECL (XEXP (addr, 1));
+  else if (GET_CODE (XEXP (XEXP (XEXP (addr, 1), 0), 0)) == SYMBOL_REF)
+    decl = SYMBOL_REF_DECL (XEXP (XEXP (XEXP (addr, 1), 0), 0));
+  if (decl)
+    align = DECL_ALIGN (decl);
+  align = align / BITS_PER_UNIT;
+
+  switch (mode)
+    {
+    case HImode:
+      mask = 1;
+      break;
+    default:
+      mask = 3;
+      break;
+    }
+
+  if (align && ((align & mask) == 0))
+    return true;
+  return false;
 }
 
 /* Implement ASM_OUTPUT_ALIGNED_DECL_LOCAL.  */
diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md
index 6620daf..0ad318c 100644
--- a/gcc/config/arc/constraints.md
+++ b/gcc/config/arc/constraints.md
@@ -355,7 +355,7 @@
    "@internal
     A valid _small-data_ memory operand for ARCompact instructions"
    (and (match_code "mem")
-	(match_test "compact_sda_memory_operand (op, VOIDmode)")))
+	(match_test "compact_sda_memory_operand (op, VOIDmode, true)")))
 
 (define_memory_constraint "Usc"
   "@internal
@@ -363,7 +363,7 @@
   (and (match_code "mem")
        (match_test "!CONSTANT_P (XEXP (op,0))")
 ;; ??? the assembler rejects stores of immediates to small data.
-       (match_test "!compact_sda_memory_operand (op, VOIDmode)")))
+       (match_test "!compact_sda_memory_operand (op, VOIDmode, false)")))
 
 (define_constraint "Us<"
   "@internal
diff --git a/gcc/testsuite/gcc.target/arc/sdata-3.c b/gcc/testsuite/gcc.target/arc/sdata-3.c
new file mode 100644
index 0000000..cdf3b6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/sdata-3.c
@@ -0,0 +1,32 @@
+/* Check that sdata access is done correctly, especially
+   for variables that have a different alignment
+   than the default data type indicates.  */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int g_a __attribute__ ((aligned (1)));
+int g_b;
+short g_c;
+char g_d;
+
+#define TEST(name, optype)			\
+  void test_ ## name (optype x)			\
+  {						\
+    g_ ## name += x;				\
+  }
+
+TEST (a, int)
+TEST (b, int)
+TEST (c, short)
+TEST (d, char)
+
+/* { dg-final { scan-assembler "ld r2,\\\[gp,@g_a@sda\\\]" } } */
+/* { dg-final { scan-assembler "ld.as r2,\\\[gp,@g_b@sda\\\]" } } */
+/* { dg-final { scan-assembler "ld\[hw\]\\\.as r2,\\\[gp,@g_c@sda\\\]" } } */
+/* { dg-final { scan-assembler "ldb r2,\\\[gp,@g_d@sda\\\]" } } */
+
+/* { dg-final { scan-assembler "st r0,\\\[gp,@g_a@sda\\\]" } } */
+/* { dg-final { scan-assembler "st_s r0,\\\[gp,@g_b@sda\\\]" { target { arcem || archs } } } } */
+/* { dg-final { scan-assembler "st\\\.as r0,\\\[gp,@g_b@sda\\\]" { target { arc700 || arc6xx } } } } */
+/* { dg-final { scan-assembler "st\[hw\]\\\.as r0,\\\[gp,@g_c@sda\\\]" } } */
+/* { dg-final { scan-assembler "stb r0,\\\[gp,@g_d@sda\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arc/sdata-4.c b/gcc/testsuite/gcc.target/arc/sdata-4.c
new file mode 100644
index 0000000..45fe712
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/sdata-4.c
@@ -0,0 +1,15 @@
+/* Check that sdata access is done correctly, especially
+   for variables that have a different alignment
+   than the default data type indicates.  */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+short gA  __attribute__ ((aligned(1)));
+
+void foo (void)
+{
+  gA += gA + 3;
+}
+
+/* { dg-final { scan-assembler-not "ld\[wh\]_s r0,\\\[gp" } } */
+/* { dg-final { scan-assembler-not "st\[wh\]\\\.as.*gp" } } */
-- 
1.9.1


* [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P.
  2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
                   ` (2 preceding siblings ...)
  2017-07-24  8:46 ` [PATCH 2/7] [ARC] Use -G option to control sdata behavior Claudiu Zissulescu
@ 2017-07-24  8:46 ` Claudiu Zissulescu
  2017-08-31 19:13   ` Andrew Burgess
  2017-07-24  8:46 ` [PATCH 3/7] [ARC] Update can_follow_jump hook helper Claudiu Zissulescu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches
  Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra,
	Claudiu Zissulescu

From: Claudiu Zissulescu <claziss@gmail.com>

We don't want to use anchors for small data: the GP register acts as an
anchor in that case.  We also don't want to use them for PC-relative
accesses, where the PC acts as an anchor.  TLS symbols require special
accesses as well, so don't use anchors for such symbols.
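
The decision order can be sketched as below.  This is a simplified
model, not the hook itself: struct symbol_sketch, use_anchor_sketch and
the default_answer parameter are hypothetical stand-ins for the symbol
flags, the target hook and the generic default hook respectively.

```c
#include <stdbool.h>

/* Hypothetical summary of the symbol properties the hook looks at.  */
struct symbol_sketch
{
  bool is_tls;         /* Symbol has a TLS model.  */
  bool is_small_data;  /* Symbol lives in the small data area.  */
};

/* Return true if an anchor may be used for SYM.  PIC models
   PC-relative code generation; DEFAULT_ANSWER models what the generic
   hook would say for the remaining symbols.  */
bool
use_anchor_sketch (const struct symbol_sketch *sym, bool pic,
                   bool default_answer)
{
  if (sym->is_tls)
    return false;   /* TLS needs its own access sequence.  */
  if (pic)
    return false;   /* The PC already acts as the anchor.  */
  if (sym->is_small_data)
    return false;   /* GP already acts as the anchor.  */
  return default_answer;
}
```

Only a symbol that passes all three checks is handed to the default
decision.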

gcc/
2017-04-28  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.c (arc_use_anchors_for_symbol_p): New function.
	(TARGET_USE_ANCHORS_FOR_SYMBOL_P): Define.

gcc/testsuite/
2017-04-28  Claudiu Zissulescu  <claziss@synopsys.com>

	* gcc.target/arc/pr9001184797.c: New test.
---
 gcc/config/arc/arc.c                        | 24 ++++++++++++++++++++++++
 gcc/testsuite/gcc.target/arc/pr9001184797.c | 19 +++++++++++++++++++
 2 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arc/pr9001184797.c

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 3980a0f..c94b187 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -10687,6 +10687,30 @@ arc_builtin_setjmp_frame_value (void)
   return gen_raw_REG (Pmode, FRAME_POINTER_REGNUM);
 }
 
+/* Implement TARGET_USE_ANCHORS_FOR_SYMBOL_P.  We don't want to use
+   anchors for small data: the GP register acts as an anchor in that
+   case.  We also don't want to use them for PC-relative accesses,
+   where the PC acts as an anchor.  Also prohibit TLS symbols from
+   using anchors.  */
+
+static bool
+arc_use_anchors_for_symbol_p (const_rtx symbol)
+{
+  if (SYMBOL_REF_TLS_MODEL (symbol))
+    return false;
+
+  if (flag_pic)
+    return false;
+
+  if (SYMBOL_REF_SMALL_P (symbol))
+    return false;
+
+  return default_use_anchors_for_symbol_p (symbol);
+}
+
+#undef TARGET_USE_ANCHORS_FOR_SYMBOL_P
+#define TARGET_USE_ANCHORS_FOR_SYMBOL_P arc_use_anchors_for_symbol_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-arc.h"
diff --git a/gcc/testsuite/gcc.target/arc/pr9001184797.c b/gcc/testsuite/gcc.target/arc/pr9001184797.c
new file mode 100644
index 0000000..e76c676
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/pr9001184797.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target tls } */
+/* { dg-options "-Os -w -mno-ll64" } */
+
+/* This test studies the use of anchors and tls symbols. */
+
+struct a b;
+struct a {
+  long c;
+  long d
+} e() {
+  static __thread struct a f;
+  static __thread g;
+  g = 5;
+  h();
+  if (f.c)
+    g = g & 5;
+  f = b;
+}
-- 
1.9.1


* [PATCH 6/7] [ARC] Reimplement ZOL support.
  2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
  2017-07-24  8:46 ` [PATCH 5/7] [ARC] Update various patterns Claudiu Zissulescu
@ 2017-07-24  8:46 ` Claudiu Zissulescu
  2017-07-24 22:22   ` Sandra Loosemore
  2017-08-31 20:23   ` Andrew Burgess
  2017-07-24  8:46 ` [PATCH 2/7] [ARC] Use -G option to control sdata behavior Claudiu Zissulescu
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra

From: claziss <claziss@synopsys.com>

2017-05-22  Claudiu Zissulescu <claziss@synopsys.com>

	* config/arc/arc-c.c (__ARC_LPC_WIDTH__): Add builtin define.
	* config/arc/arc.c (ARC_MAX_LOOP_LENGTH): Define.
	(arc_conditional_register_usage): Remove ARC600 lp_count
	exception.
	(arc_file_start): Emit Tag_ARC_CPU_variation.
	(arc_can_use_doloop_p): New conditions to use ZOLs.
	(hwloop_fail): New function.
	(hwloop_optimize): Likewise.
	(hwloop_pattern_reg): Likewise.
	(arc_doloop_hooks): New struct, to be used with reorg_loops.
	(arc_reorg_loops): New function, calls reorg_loops.
	(arc_reorg): Call arc_reorg_loops.  Remove old ZOL handling.
	(arc600_corereg_hazard): Remove ZOL checking, case handled by
	hwloop_optimize.
	(arc_loop_hazard): Remove function, functionality moved into
	hwloop_optimize.
	(arc_hazard): Remove arc_loop_hazard call.
	(arc_adjust_insn_length): Remove ZOL handling, functionality moved
	into hwloop_optimize.
	(arc_label_align): Remove ZOL handling.
	* config/arc/arc.h (LOOP_ALIGN): Changed to 0.
	* config/arc/arc.md (doloop_begin): Remove pattern.
	(doloop_begin_i): Likewise.
	(doloop_end_i): Likewise.
	(doloop_fallback): Likewise.
	(doloop_fallback_m): Likewise.
	(doloop_end): Reimplement expand.
	(arc_lp): New pattern for LP instruction.
	(loop_end): New pattern.
	(loop_fail): Likewise.
	(decrement_and_branch_until_zero): Likewise.
	* config/arc/arc.opt (mlpc-width): New option.
	* doc/invoke.texi (mlpc-width): Document option.

testsuite/
2017-05-22  Claudiu Zissulescu <claziss@synopsys.com>

	* gcc.target/arc/loop-1.c: Update test.
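
As a rough illustration of the new lp_count width condition in
arc_can_use_doloop_p, the check can be modelled as the standalone C
sketch below.  The function name fits_lp_count and its parameters are
hypothetical and not part of the patch; lpc_width plays the role of
arc_lpcwidth.  The patch also rejects a maximum iteration count of 0
for narrow counters, which this sketch assumes means that no upper
bound is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the width check: return true if a loop whose
   maximum iteration count is ITERATIONS_MAX can be counted in an
   LPC_WIDTH-bit lp_count register.  */
static bool
fits_lp_count (uint64_t iterations_max, unsigned int lpc_width)
{
  assert (lpc_width >= 1 && lpc_width <= 32);

  /* As in the patch, a full 32-bit counter needs no extra check.  */
  if (lpc_width == 32)
    return true;

  /* Assumption: 0 stands for "no known upper bound", so a narrow
     counter cannot be proven wide enough.  */
  if (iterations_max == 0)
    return false;

  /* Largest value representable in LPC_WIDTH bits.  */
  uint64_t limit = ((uint64_t) 1 << lpc_width) - 1;
  return iterations_max <= limit;
}
```

With an 8-bit counter, for example, a bound of 255 is accepted and a
bound of 256 is rejected.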
---
 gcc/config/arc/arc-c.c                |   2 +
 gcc/config/arc/arc.c                  | 726 ++++++++++++++++++----------------
 gcc/config/arc/arc.h                  |  10 +-
 gcc/config/arc/arc.md                 | 419 ++++++--------------
 gcc/config/arc/arc.opt                |  25 ++
 gcc/config/arc/predicates.md          |   2 +
 gcc/doc/invoke.texi                   |  14 +-
 gcc/testsuite/gcc.target/arc/loop-1.c |  49 +--
 8 files changed, 561 insertions(+), 686 deletions(-)
 mode change 100644 => 100755 gcc/testsuite/gcc.target/arc/loop-1.c

diff --git a/gcc/config/arc/arc-c.c b/gcc/config/arc/arc-c.c
index de877a1..44ff338 100644
--- a/gcc/config/arc/arc-c.c
+++ b/gcc/config/arc/arc-c.c
@@ -62,6 +62,8 @@ arc_cpu_cpp_builtins (cpp_reader * pfile)
   builtin_define_with_int_value ("__ARC_TLS_REGNO__",
 				 arc_tp_regno);
 
+  builtin_define_with_int_value ("__ARC_LPC_WIDTH__", arc_lpcwidth);
+
   builtin_define (TARGET_BIG_ENDIAN
 		  ? "__BIG_ENDIAN__" : "__LITTLE_ENDIAN__");
   if (TARGET_BIG_ENDIAN)
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index c94b187..0f9b553 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -64,11 +64,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl-iter.h"
 #include "alias.h"
 #include "opts.h"
+#include "hw-doloop.h"
 
 /* Which cpu we're compiling for (ARC600, ARC601, ARC700).  */
 static char arc_cpu_name[10] = "";
 static const char *arc_cpu_string = arc_cpu_name;
 
+/* Maximum size of a loop.  */
+#define ARC_MAX_LOOP_LENGTH 4095
+
 /* ??? Loads can handle any constant, stores can only handle small ones.  */
 /* OTOH, LIMMs cost extra, so their usefulness is limited.  */
 #define RTX_OK_FOR_OFFSET_P(MODE, X) \
@@ -1708,18 +1712,7 @@ arc_conditional_register_usage (void)
 	   i <= ARC_LAST_SIMD_DMA_CONFIG_REG; i++)
 	reg_alloc_order [i] = i;
     }
-  /* For ARC600, lp_count may not be read in an instruction
-     following immediately after another one setting it to a new value.
-     There was some discussion on how to enforce scheduling constraints for
-     processors with missing interlocks on the gcc mailing list:
-     http://gcc.gnu.org/ml/gcc/2008-05/msg00021.html .
-     However, we can't actually use this approach, because for ARC the
-     delay slot scheduling pass is active, which runs after
-     machine_dependent_reorg.  */
-  if (TARGET_ARC600)
-    CLEAR_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], LP_COUNT);
-  else if (!TARGET_LP_WR_INTERLOCK)
-    fixed_regs[LP_COUNT] = 1;
+
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
     if (!call_used_regs[regno])
       CLEAR_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], regno);
@@ -6998,28 +6991,33 @@ arc_pass_by_reference (cumulative_args_t ca_v ATTRIBUTE_UNUSED,
 /* Implement TARGET_CAN_USE_DOLOOP_P.  */
 
 static bool
-arc_can_use_doloop_p (const widest_int &iterations, const widest_int &,
+arc_can_use_doloop_p (const widest_int &,
+		      const widest_int &iterations_max,
 		      unsigned int loop_depth, bool entered_at_top)
 {
-  if (loop_depth > 1)
+  /* Considering limitations in the hardware, only use doloop
+     for innermost loops which must be entered from the top.  */
+  if (loop_depth > 1 || !entered_at_top)
     return false;
-  /* Setting up the loop with two sr instructions costs 6 cycles.  */
-  if (TARGET_ARC700
-      && !entered_at_top
-      && wi::gtu_p (iterations, 0)
-      && wi::leu_p (iterations, flag_pic ? 6 : 3))
+
+  /* Check for lp_count width boundary.  */
+  if (arc_lpcwidth != 32
+      && (wi::gtu_p (iterations_max, ((1 << arc_lpcwidth) - 1))
+	  || wi::eq_p (iterations_max, 0)))
     return false;
   return true;
 }
 
-/* NULL if INSN insn is valid within a low-overhead loop.
-   Otherwise return why doloop cannot be applied.  */
+/* NULL if INSN insn is valid within a low-overhead loop.  Otherwise
+   return why doloop cannot be applied.  */
 
 static const char *
 arc_invalid_within_doloop (const rtx_insn *insn)
 {
   if (CALL_P (insn))
     return "Function call in the loop.";
+
+  /* FIXME! add here all the ZOL exceptions.  */
   return NULL;
 }
 
@@ -7118,6 +7116,368 @@ workaround_arc_anomaly (void)
     }
 }
 
+/* A callback for the hw-doloop pass.  Called when a loop we have discovered
+   turns out not to be optimizable; we have to split the loop_end pattern into
+   a subtract and a test.  */
+
+static void
+hwloop_fail (hwloop_info loop)
+{
+  rtx test;
+  rtx insn = loop->loop_end;
+
+  if (TARGET_V2
+      && (loop->length && (loop->length <= ARC_MAX_LOOP_LENGTH))
+      && REG_P (loop->iter_reg))
+    {
+      /* TARGET_V2 has dbnz instructions.  */
+      test = gen_dbnz (loop->iter_reg, loop->start_label);
+      insn = emit_jump_insn_before (test, loop->loop_end);
+    }
+  else if (REG_P (loop->iter_reg) && (REGNO (loop->iter_reg) == LP_COUNT))
+    {
+      /* We have the lp_count as loop iterator, try to use it.  */
+      emit_insn_before (gen_loop_fail (), loop->loop_end);
+      test = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_ZNmode, CC_REG),
+			 const0_rtx);
+      test = gen_rtx_IF_THEN_ELSE (VOIDmode, test,
+				   gen_rtx_LABEL_REF (Pmode, loop->start_label),
+				   pc_rtx);
+      insn = emit_jump_insn_before (gen_rtx_SET (pc_rtx, test),
+				     loop->loop_end);
+    }
+  else
+    {
+      emit_insn_before (gen_addsi3 (loop->iter_reg,
+				    loop->iter_reg,
+				    constm1_rtx),
+			loop->loop_end);
+      test = gen_rtx_NE (VOIDmode, loop->iter_reg, const0_rtx);
+      insn = emit_jump_insn_before (gen_cbranchsi4 (test,
+						    loop->iter_reg,
+						    const0_rtx,
+						    loop->start_label),
+				    loop->loop_end);
+    }
+  JUMP_LABEL (insn) = loop->start_label;
+  LABEL_NUSES (loop->start_label)++;
+  delete_insn (loop->loop_end);
+}
+
+/* Optimize LOOP.  */
+
+static bool
+hwloop_optimize (hwloop_info loop)
+{
+  int i;
+  edge entry_edge;
+  basic_block entry_bb, bb;
+  rtx iter_reg, end_label;
+  rtx_insn *insn, *seq, *entry_after, *last_insn;
+  unsigned int length;
+  bool need_fix = false;
+  rtx lp_reg = gen_rtx_REG (SImode, LP_COUNT);
+
+  if (loop->depth > 1)
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d is not innermost\n",
+                 loop->loop_no);
+      return false;
+    }
+
+  if (!loop->incoming_dest)
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d has more than one entry\n",
+                 loop->loop_no);
+      return false;
+    }
+
+  if (loop->incoming_dest != loop->head)
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d is not entered from head\n",
+                 loop->loop_no);
+      return false;
+    }
+
+  if (loop->has_call || loop->has_asm)
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d has invalid insn\n",
+                 loop->loop_no);
+      return false;
+    }
+
+  /* Scan all the blocks to make sure they don't use iter_reg.  */
+  if (loop->iter_reg_used || loop->iter_reg_used_outside)
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d uses iterator\n",
+                 loop->loop_no);
+      return false;
+    }
+
+  /* Check if start_label appears before doloop_end.  */
+  length = 0;
+  for (insn = loop->start_label;
+       insn && insn != loop->loop_end;
+       insn = NEXT_INSN (insn))
+    length += NONDEBUG_INSN_P (insn) ? get_attr_length (insn) : 0;
+
+  if (!insn)
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d start_label not before loop_end\n",
+                 loop->loop_no);
+      return false;
+    }
+
+  loop->length = length;
+  if (loop->length > ARC_MAX_LOOP_LENGTH)
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
+      return false;
+    }
+  else if (!loop->length)
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; loop %d is empty\n", loop->loop_no);
+      return false;
+    }
+
+  /* Check if we use a register or not.  */
+  if (!REG_P (loop->iter_reg))
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d iterator is MEM\n",
+                 loop->loop_no);
+      return false;
+    }
+
+  /* Check if loop register is lpcount.  */
+  if (REG_P (loop->iter_reg) && (REGNO (loop->iter_reg)) != LP_COUNT)
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; loop %d doesn't use lp_count as loop"
+		 " iterator\n",
+                 loop->loop_no);
+      /* This loop doesn't use the lp_count, check though if we can
+	 fix it.  */
+      if (TEST_HARD_REG_BIT (loop->regs_set_in_loop, LP_COUNT)
+	  /* In rare cases we may have LP_COUNT live.  */
+	  || (loop->incoming_src
+	      && REGNO_REG_SET_P (df_get_live_out (loop->incoming_src),
+				  LP_COUNT)))
+	return false;
+      else
+	need_fix = true;
+    }
+
+  /* Check for control like instruction as the last instruction of a
+     ZOL.  */
+  bb = loop->tail;
+  last_insn = PREV_INSN (loop->loop_end);
+
+  while (1)
+    {
+      for (; last_insn != BB_HEAD (bb);
+	   last_insn = PREV_INSN (last_insn))
+	if (NONDEBUG_INSN_P (last_insn))
+	  break;
+
+      if (last_insn != BB_HEAD (bb))
+	break;
+
+      if (single_pred_p (bb)
+	  && single_pred_edge (bb)->flags & EDGE_FALLTHRU
+	  && single_pred (bb) != ENTRY_BLOCK_PTR_FOR_FN (cfun))
+	{
+	  bb = single_pred (bb);
+	  last_insn = BB_END (bb);
+	  continue;
+	}
+      else
+	{
+	  last_insn = NULL;
+	  break;
+	}
+    }
+
+  if (!last_insn)
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; loop %d has no last instruction\n",
+		 loop->loop_no);
+      return false;
+    }
+
+  if ((TARGET_ARC600_FAMILY || TARGET_HS)
+      && INSN_P (last_insn)
+      && (JUMP_P (last_insn) || CALL_P (last_insn)
+	  || GET_CODE (PATTERN (last_insn)) == SEQUENCE
+	  /* At this stage we can have (insn (clobber (mem:BLK
+	     (reg)))) instructions, ignore them.  */
+	  || (GET_CODE (PATTERN (last_insn)) != CLOBBER
+	      && (get_attr_type (last_insn) == TYPE_BRCC
+		  || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))))
+    {
+      if (loop->length + 2 > ARC_MAX_LOOP_LENGTH)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
+	  return false;
+	}
+      if (dump_file)
+	fprintf (dump_file, ";; loop %d has a control like last insn;"
+		 " add a nop\n",
+		 loop->loop_no);
+
+      last_insn = emit_insn_after (gen_nopv (), last_insn);
+    }
+
+  if (LABEL_P (last_insn))
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; loop %d has a label as last insn;"
+		 " add a nop\n",
+		 loop->loop_no);
+      last_insn = emit_insn_after (gen_nopv (), last_insn);
+    }
+  loop->last_insn = last_insn;
+
+  /* Get the loop iteration register.  */
+  iter_reg = loop->iter_reg;
+
+  gcc_assert (REG_P (iter_reg));
+
+  entry_edge = NULL;
+
+  FOR_EACH_VEC_SAFE_ELT (loop->incoming, i, entry_edge)
+    if (entry_edge->flags & EDGE_FALLTHRU)
+      break;
+
+  if (entry_edge == NULL)
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; loop %d has no fallthru edge jumping"
+		 " into the loop\n",
+		 loop->loop_no);
+      return false;
+    }
+  /* The loop is good.  */
+  end_label = gen_label_rtx ();
+  loop->end_label = end_label;
+
+  /* Place the zero_cost_loop_start instruction before the loop.  */
+  entry_bb = entry_edge->src;
+
+  start_sequence ();
+
+  if (need_fix)
+    {
+      /* The loop uses a R-register, but the lp_count is free, thus
+	 use lp_count.  */
+      emit_insn (gen_movsi (lp_reg, iter_reg));
+      SET_HARD_REG_BIT (loop->regs_set_in_loop, LP_COUNT);
+      iter_reg = lp_reg;
+      if (dump_file)
+	{
+	  fprintf (dump_file, ";; fix loop %d to use lp_count\n",
+		   loop->loop_no);
+	}
+    }
+
+  insn = emit_insn (gen_arc_lp (iter_reg,
+				loop->start_label,
+				loop->end_label));
+
+  seq = get_insns ();
+  end_sequence ();
+
+  entry_after = BB_END (entry_bb);
+  if (!single_succ_p (entry_bb) || vec_safe_length (loop->incoming) > 1
+      || !entry_after)
+    {
+      basic_block new_bb;
+      edge e;
+      edge_iterator ei;
+
+      emit_insn_before (seq, BB_HEAD (loop->head));
+      seq = emit_label_before (gen_label_rtx (), seq);
+      new_bb = create_basic_block (seq, insn, entry_bb);
+      FOR_EACH_EDGE (e, ei, loop->incoming)
+        {
+          if (!(e->flags & EDGE_FALLTHRU))
+            redirect_edge_and_branch_force (e, new_bb);
+          else
+            redirect_edge_succ (e, new_bb);
+        }
+
+      make_edge (new_bb, loop->head, 0);
+    }
+  else
+    {
+#if 0
+      while (DEBUG_INSN_P (entry_after)
+             || (NOTE_P (entry_after)
+                 && NOTE_KIND (entry_after) != NOTE_INSN_BASIC_BLOCK
+		 /* Make sure we don't split a call and its corresponding
+		    CALL_ARG_LOCATION note.  */
+                 && NOTE_KIND (entry_after) != NOTE_INSN_CALL_ARG_LOCATION))
+        entry_after = NEXT_INSN (entry_after);
+#endif
+      entry_after = next_nonnote_insn_bb (entry_after);
+
+      gcc_assert (entry_after);
+      emit_insn_before (seq, entry_after);
+    }
+
+  delete_insn (loop->loop_end);
+  /* Insert the loop end label after the last instruction of the
+     loop.  */
+  emit_label_after (end_label, loop->last_insn);
+
+  return true;
+}
+
+/* A callback for the hw-doloop pass.  This function examines INSN; if
+   it is a loop_end pattern we recognize, return the reg rtx for the
+   loop counter.  Otherwise, return NULL_RTX.  */
+
+static rtx
+hwloop_pattern_reg (rtx_insn *insn)
+{
+  rtx reg;
+
+  if (!JUMP_P (insn) || recog_memoized (insn) != CODE_FOR_loop_end)
+    return NULL_RTX;
+
+  reg = SET_DEST (XVECEXP (PATTERN (insn), 0, 1));
+  if (!REG_P (reg))
+    return NULL_RTX;
+  return reg;
+}
+
+static struct hw_doloop_hooks arc_doloop_hooks =
+{
+  hwloop_pattern_reg,
+  hwloop_optimize,
+  hwloop_fail
+};
+
+/* Run from machine_dependent_reorg, this pass looks for doloop_end insns
+   and tries to rewrite the RTL of these loops so that proper ARC
+   zero-overhead loops are generated.  */
+
+static void
+arc_reorg_loops (void)
+{
+  reorg_loops (true, &arc_doloop_hooks);
+}
+
 static int arc_reorg_in_progress = 0;
 
 /* ARC's machince specific reorg function.  */
@@ -7131,204 +7491,17 @@ arc_reorg (void)
   long offset;
   int changed;
 
-  workaround_arc_anomaly ();
-
   cfun->machine->arc_reorg_started = 1;
   arc_reorg_in_progress = 1;
 
-  /* Link up loop ends with their loop start.  */
-  {
-    for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
-      if (GET_CODE (insn) == JUMP_INSN
-	  && recog_memoized (insn) == CODE_FOR_doloop_end_i)
-	{
-	  rtx_insn *top_label
-	    = as_a <rtx_insn *> (XEXP (XEXP (SET_SRC (XVECEXP (PATTERN (insn), 0, 0)), 1), 0));
-	  rtx num = GEN_INT (CODE_LABEL_NUMBER (top_label));
-	  rtx_insn *lp, *prev = prev_nonnote_insn (top_label);
-	  rtx_insn *lp_simple = NULL;
-	  rtx_insn *next = NULL;
-	  rtx op0 = XEXP (XVECEXP (PATTERN (insn), 0, 1), 0);
-	  int seen_label = 0;
-
-	  for (lp = prev;
-	       (lp && NONJUMP_INSN_P (lp)
-		&& recog_memoized (lp) != CODE_FOR_doloop_begin_i);
-	       lp = prev_nonnote_insn (lp))
-	    ;
-	  if (!lp || !NONJUMP_INSN_P (lp)
-	      || dead_or_set_regno_p (lp, LP_COUNT))
-	    {
-	      HOST_WIDE_INT loop_end_id
-		= INTVAL (XEXP (XVECEXP (PATTERN (insn), 0, 4), 0));
+  compute_bb_for_insn ();
 
-	      for (prev = next = insn, lp = NULL ; prev || next;)
-		{
-		  if (prev)
-		    {
-		      if (NONJUMP_INSN_P (prev)
-			  && recog_memoized (prev) == CODE_FOR_doloop_begin_i
-			  && (INTVAL (XEXP (XVECEXP (PATTERN (prev), 0, 5), 0))
-			      == loop_end_id))
-			{
-			  lp = prev;
-			  break;
-			}
-		      else if (LABEL_P (prev))
-			seen_label = 1;
-		      prev = prev_nonnote_insn (prev);
-		    }
-		  if (next)
-		    {
-		      if (NONJUMP_INSN_P (next)
-			  && recog_memoized (next) == CODE_FOR_doloop_begin_i
-			  && (INTVAL (XEXP (XVECEXP (PATTERN (next), 0, 5), 0))
-			      == loop_end_id))
-			{
-			  lp = next;
-			  break;
-			}
-		      next = next_nonnote_insn (next);
-		    }
-		}
-	      prev = NULL;
-	    }
-	  else
-	    lp_simple = lp;
-	  if (lp && !dead_or_set_regno_p (lp, LP_COUNT))
-	    {
-	      rtx begin_cnt = XEXP (XVECEXP (PATTERN (lp), 0 ,3), 0);
-	      if (INTVAL (XEXP (XVECEXP (PATTERN (lp), 0, 4), 0)))
-		/* The loop end insn has been duplicated.  That can happen
-		   when there is a conditional block at the very end of
-		   the loop.  */
-		goto failure;
-	      /* If Register allocation failed to allocate to the right
-		 register, There is no point into teaching reload to
-		 fix this up with reloads, as that would cost more
-		 than using an ordinary core register with the
-		 doloop_fallback pattern.  */
-	      if ((true_regnum (op0) != LP_COUNT || !REG_P (begin_cnt))
-	      /* Likewise, if the loop setup is evidently inside the loop,
-		 we loose.  */
-		  || (!lp_simple && lp != next && !seen_label))
-		{
-		  remove_insn (lp);
-		  goto failure;
-		}
-	      /* It is common that the optimizers copy the loop count from
-		 another register, and doloop_begin_i is stuck with the
-		 source of the move.  Making doloop_begin_i only accept "l"
-		 is nonsentical, as this then makes reload evict the pseudo
-		 used for the loop end.  The underlying cause is that the
-		 optimizers don't understand that the register allocation for
-		 doloop_begin_i should be treated as part of the loop.
-		 Try to work around this problem by verifying the previous
-		 move exists.  */
-	      if (true_regnum (begin_cnt) != LP_COUNT)
-		{
-		  rtx_insn *mov;
-		  rtx set, note;
+  df_analyze ();
 
-		  for (mov = prev_nonnote_insn (lp); mov;
-		       mov = prev_nonnote_insn (mov))
-		    {
-		      if (!NONJUMP_INSN_P (mov))
-			mov = 0;
-		      else if ((set = single_set (mov))
-			  && rtx_equal_p (SET_SRC (set), begin_cnt)
-			  && rtx_equal_p (SET_DEST (set), op0))
-			break;
-		    }
-		  if (mov)
-		    {
-		      XEXP (XVECEXP (PATTERN (lp), 0 ,3), 0) = op0;
-		      note = find_regno_note (lp, REG_DEAD, REGNO (begin_cnt));
-		      if (note)
-			remove_note (lp, note);
-		    }
-		  else
-		    {
-		      remove_insn (lp);
-		      goto failure;
-		    }
-		}
-	      XEXP (XVECEXP (PATTERN (insn), 0, 4), 0) = num;
-	      XEXP (XVECEXP (PATTERN (lp), 0, 4), 0) = num;
-	      if (next == lp)
-		XEXP (XVECEXP (PATTERN (lp), 0, 6), 0) = const2_rtx;
-	      else if (!lp_simple)
-		XEXP (XVECEXP (PATTERN (lp), 0, 6), 0) = const1_rtx;
-	      else if (prev != lp)
-		{
-		  remove_insn (lp);
-		  add_insn_after (lp, prev, NULL);
-		}
-	      if (!lp_simple)
-		{
-		  XEXP (XVECEXP (PATTERN (lp), 0, 7), 0)
-		    = gen_rtx_LABEL_REF (Pmode, top_label);
-		  add_reg_note (lp, REG_LABEL_OPERAND, top_label);
-		  LABEL_NUSES (top_label)++;
-		}
-	      /* We can avoid tedious loop start / end setting for empty loops
-		 be merely setting the loop count to its final value.  */
-	      if (next_active_insn (top_label) == insn)
-		{
-		  rtx lc_set
-		    = gen_rtx_SET (XEXP (XVECEXP (PATTERN (lp), 0, 3), 0),
-				   const0_rtx);
-
-		  rtx_insn *lc_set_insn = emit_insn_before (lc_set, insn);
-		  delete_insn (lp);
-		  delete_insn (insn);
-		  insn = lc_set_insn;
-		}
-	      /* If the loop is non-empty with zero length, we can't make it
-		 a zero-overhead loop.  That can happen for empty asms.  */
-	      else
-		{
-		  rtx_insn *scan;
+  /* Doloop optimization.  */
+  arc_reorg_loops ();
 
-		  for (scan = top_label;
-		       (scan && scan != insn
-			&& (!NONJUMP_INSN_P (scan) || !get_attr_length (scan)));
-		       scan = NEXT_INSN (scan));
-		  if (scan == insn)
-		    {
-		      remove_insn (lp);
-		      goto failure;
-		    }
-		}
-	    }
-	  else
-	    {
-	      /* Sometimes the loop optimizer makes a complete hash of the
-		 loop.  If it were only that the loop is not entered at the
-		 top, we could fix this up by setting LP_START with SR .
-		 However, if we can't find the loop begin were it should be,
-		 chances are that it does not even dominate the loop, but is
-		 inside the loop instead.  Using SR there would kill
-		 performance.
-		 We use the doloop_fallback pattern here, which executes
-		 in two cycles on the ARC700 when predicted correctly.  */
-	    failure:
-	      if (!REG_P (op0))
-		{
-		  rtx op3 = XEXP (XVECEXP (PATTERN (insn), 0, 5), 0);
-
-		  emit_insn_before (gen_move_insn (op3, op0), insn);
-		  PATTERN (insn)
-		    = gen_doloop_fallback_m (op3, JUMP_LABEL (insn), op0);
-		}
-	      else
-		XVEC (PATTERN (insn), 0)
-		  = gen_rtvec (2, XVECEXP (PATTERN (insn), 0, 0),
-			       XVECEXP (PATTERN (insn), 0, 1));
-	      INSN_CODE (insn) = -1;
-	    }
-	}
-    }
+  workaround_arc_anomaly ();
 
 /* FIXME: should anticipate ccfsm action, generate special patterns for
    to-be-deleted branches that have no delay slot and have at least the
@@ -7866,11 +8039,11 @@ arc_register_move_cost (machine_mode,
 	return 6;
     }
 
-  /* The ARC700 stalls for 3 cycles when *reading* from lp_count.  */
-  if (TARGET_ARC700
-      && (from_class == LPCOUNT_REG || from_class == ALL_CORE_REGS
-	  || from_class == WRITABLE_CORE_REGS))
-    return 8;
+  /* Using lp_count as scratch reg is a VERY bad idea.  */
+  if (from_class == LPCOUNT_REG)
+    return 1000;
+  if (to_class == LPCOUNT_REG)
+    return 6;
 
   /* Force an attempt to 'mov Dy,Dx' to spill.  */
   if ((TARGET_ARC700 || TARGET_EM) && TARGET_DPFP
@@ -8312,14 +8485,6 @@ arc600_corereg_hazard (rtx_insn *pred, rtx_insn *succ)
 {
   if (!TARGET_ARC600)
     return 0;
-  /* If SUCC is a doloop_end_i with a preceding label, we must output a nop
-     in front of SUCC anyway, so there will be separation between PRED and
-     SUCC.  */
-  if (recog_memoized (succ) == CODE_FOR_doloop_end_i
-      && LABEL_P (prev_nonnote_insn (succ)))
-    return 0;
-  if (recog_memoized (succ) == CODE_FOR_doloop_begin_i)
-    return 0;
   if (GET_CODE (PATTERN (pred)) == SEQUENCE)
     pred = as_a <rtx_sequence *> (PATTERN (pred))->insn (1);
   if (GET_CODE (PATTERN (succ)) == SEQUENCE)
@@ -8393,76 +8558,6 @@ arc_asm_insn_p (rtx x)
   return 0;
 }
 
-/* We might have a CALL to a non-returning function before a loop end.
-   ??? Although the manual says that's OK (the target is outside the
-   loop, and the loop counter unused there), the assembler barfs on
-   this for ARC600, so we must insert a nop before such a call too.
-   For ARC700, and ARCv2 is not allowed to have the last ZOL
-   instruction a jump to a location where lp_count is modified.  */
-
-static bool
-arc_loop_hazard (rtx_insn *pred, rtx_insn *succ)
-{
-  rtx_insn *jump  = NULL;
-  rtx label_rtx = NULL_RTX;
-  rtx_insn *label = NULL;
-  basic_block succ_bb;
-
-  if (recog_memoized (succ) != CODE_FOR_doloop_end_i)
-    return false;
-
-  /* Phase 1: ARC600 and ARCv2HS doesn't allow any control instruction
-     (i.e., jump/call) as the last instruction of a ZOL.  */
-  if (TARGET_ARC600 || TARGET_HS)
-    if (JUMP_P (pred) || CALL_P (pred)
-	|| arc_asm_insn_p (PATTERN (pred))
-	|| GET_CODE (PATTERN (pred)) == SEQUENCE)
-      return true;
-
-  /* Phase 2: Any architecture, it is not allowed to have the last ZOL
-     instruction a jump to a location where lp_count is modified.  */
-
-  /* Phase 2a: Dig for the jump instruction.  */
-  if (JUMP_P (pred))
-    jump = pred;
-  else if (GET_CODE (PATTERN (pred)) == SEQUENCE
-	   && JUMP_P (XVECEXP (PATTERN (pred), 0, 0)))
-    jump = as_a <rtx_insn *> (XVECEXP (PATTERN (pred), 0, 0));
-  else
-    return false;
-
-  /* Phase 2b: Make sure is not a millicode jump.  */
-  if ((GET_CODE (PATTERN (jump)) == PARALLEL)
-      && (XVECEXP (PATTERN (jump), 0, 0) == ret_rtx))
-    return false;
-
-  label_rtx = JUMP_LABEL (jump);
-  if (!label_rtx)
-    return false;
-
-  /* Phase 2c: Make sure is not a return.  */
-  if (ANY_RETURN_P (label_rtx))
-    return false;
-
-  /* Pahse 2d: Go to the target of the jump and check for aliveness of
-     LP_COUNT register.  */
-  label = safe_as_a <rtx_insn *> (label_rtx);
-  succ_bb = BLOCK_FOR_INSN (label);
-  if (!succ_bb)
-    {
-      gcc_assert (NEXT_INSN (label));
-      if (NOTE_INSN_BASIC_BLOCK_P (NEXT_INSN (label)))
-	succ_bb = NOTE_BASIC_BLOCK (NEXT_INSN (label));
-      else
-	succ_bb = BLOCK_FOR_INSN (NEXT_INSN (label));
-    }
-
-  if (succ_bb && REGNO_REG_SET_P (df_get_live_out (succ_bb), LP_COUNT))
-    return true;
-
-  return false;
-}
-
 /* For ARC600:
    A write to a core reg greater or equal to 32 must not be immediately
    followed by a use.  Anticipate the length requirement to insert a nop
@@ -8474,9 +8569,6 @@ arc_hazard (rtx_insn *pred, rtx_insn *succ)
   if (!pred || !INSN_P (pred) || !succ || !INSN_P (succ))
     return 0;
 
-  if (arc_loop_hazard (pred, succ))
-    return 4;
-
   if (TARGET_ARC600)
     return arc600_corereg_hazard (pred, succ);
 
@@ -8494,24 +8586,6 @@ arc_adjust_insn_length (rtx_insn *insn, int len, bool)
   if (GET_CODE (PATTERN (insn)) == SEQUENCE)
     return len;
 
-  /* It is impossible to jump to the very end of a Zero-Overhead Loop, as
-     the ZOL mechanism only triggers when advancing to the end address,
-     so if there's a label at the end of a ZOL, we need to insert a nop.
-     The ARC600 ZOL also has extra restrictions on jumps at the end of a
-     loop.  */
-  if (recog_memoized (insn) == CODE_FOR_doloop_end_i)
-    {
-      rtx_insn *prev = prev_nonnote_insn (insn);
-
-      return ((LABEL_P (prev)
-	       || (TARGET_ARC600
-		   && (JUMP_P (prev)
-		       || CALL_P (prev) /* Could be a noreturn call.  */
-		       || (NONJUMP_INSN_P (prev)
-			   && GET_CODE (PATTERN (prev)) == SEQUENCE))))
-	      ? len + 4 : len);
-    }
-
   /* Check for return with but one preceding insn since function
      start / call.  */
   if (TARGET_PAD_RETURN
@@ -9871,27 +9945,9 @@ arc_scheduling_not_expected (void)
   return cfun->machine->arc_reorg_started;
 }
 
-/* Oddly enough, sometimes we get a zero overhead loop that branch
-   shortening doesn't think is a loop - observed with compile/pr24883.c
-   -O3 -fomit-frame-pointer -funroll-loops.  Make sure to include the
-   alignment visible for branch shortening  (we actually align the loop
-   insn before it, but that is equivalent since the loop insn is 4 byte
-   long.)  */
-
 int
 arc_label_align (rtx_insn *label)
 {
-  int loop_align = LOOP_ALIGN (LABEL);
-
-  if (loop_align > align_labels_log)
-    {
-      rtx_insn *prev = prev_nonnote_insn (label);
-
-      if (prev && NONJUMP_INSN_P (prev)
-	  && GET_CODE (PATTERN (prev)) == PARALLEL
-	  && recog_memoized (prev) == CODE_FOR_doloop_begin_i)
-	return loop_align;
-    }
   /* Code has a minimum p2 alignment of 1, which we must restore after an
      ADDR_DIFF_VEC.  */
   if (align_labels_log < 1)
diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index d4e97cd..4c54b7e 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -581,15 +581,15 @@ enum reg_class
   {0x0000f00f, 0x00000000, 0x00000000, 0x00000000, 0x00000000},	     /* 'q', r0-r3, r12-r15 */		\
   {0x1000f00f, 0x00000000, 0x00000000, 0x00000000, 0x00000000},	     /* 'e', r0-r3, r12-r15, sp */	\
   {0x1c001fff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},    /* "Rsc", r0-r12 */ \
-  {0x9fffffff, 0xc0000000, 0x00000000, 0x00000000, 0x00000000},      /* 'r', r0-r28, blink, ap and pcl */	\
+  {0x9fffffff, 0x80000000, 0x00000000, 0x00000000, 0x00000000},      /* 'r', r0-r28, blink, ap and pcl */	\
   {0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'W',  r0-r31 */ \
   /* Include ap / pcl in WRITABLE_CORE_REGS for sake of symmetry.  As these \
      registers are fixed, it does not affect the literal meaning of the \
      constraints, but it makes it a superset of GENERAL_REGS, thus \
      enabling some operations that would otherwise not be possible.  */ \
-  {0xffffffff, 0xd0000000, 0x00000000, 0x00000000, 0x00000000},      /* 'w', r0-r31, r60 */ \
-  {0xffffffff, 0xdfffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'c', r0-r60, ap, pcl */ \
-  {0xffffffff, 0xdfffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'Rac', r0-r60, ap, pcl */ \
+  {0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'w', r0-r31, r60 */ \
+  {0xffffffff, 0x9fffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'c', r0-r60, ap, pcl */ \
+  {0xffffffff, 0x9fffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'Rac', r0-r60, ap, pcl */ \
   {0x0000000f, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'Rcd', r0-r3 */ \
   {0x00000003, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'Rsd', r0-r1 */ \
   {0x9fffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'h',  r0-28, r30 */ \
@@ -1351,7 +1351,7 @@ do {							\
    of a loop.  */
 /* On the ARC, align loops to 4 byte boundaries unless doing all-out size
    optimization.  */
-#define LOOP_ALIGN JUMP_ALIGN
+#define LOOP_ALIGN(X) 0
 
 #define LABEL_ALIGN(LABEL) (arc_label_align (LABEL))
 
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 04a1447..27afe40 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -554,6 +554,11 @@
    (eq_attr "annul_ret_delay_insn" "yes")
    (eq_attr "cond_ret_delay_insn" "yes")])
 
+(define_delay (eq_attr "type" "loop_end")
+  [(eq_attr "in_delay_slot" "true")
+   (eq_attr "in_delay_slot" "true")
+   (nil)])
+
 ;; For ARC600, unexposing the delay sloy incurs a penalty also in the
 ;; non-taken case, so the only meaningful way to have an annull-true
 ;; filled delay slot is to conditionalize the delay slot insn.
@@ -618,8 +623,8 @@
 ; The iscompact attribute allows the epilogue expander to know for which
 ; insns it should lengthen the return insn.
 (define_insn "*movqi_insn"
-  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,h, w,Rcq,  S,!*x,  r,r, Ucm,m,???m,  m,Usc")
-	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,i,?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
+  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h,w*l,w*l,???w,h,w*l,Rcq,  S,!*x,  r,r, Ucm,m,???m,  m,Usc")
+	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1, cL,  I,?Rac,i, ?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
   "register_operand (operands[0], QImode)
    || register_operand (operands[1], QImode)"
   "@
@@ -655,8 +660,8 @@
   "if (prepare_move_operands (operands, HImode)) DONE;")
 
 (define_insn "*movhi_insn"
-  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,Rcq#q,h, w,Rcq,  S,  r,r, Ucm,m,???m,  m,VUsc")
-	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,    i,i,?i,  T,Rcq,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
+  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h,w*l,w*l,???w,Rcq#q,h,w*l,Rcq,  S,  r,r, Ucm,m,???m,  m,VUsc")
+	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1, cL,  I,?Rac,    i,i, ?i,  T,Rcq,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
   "register_operand (operands[0], HImode)
    || register_operand (operands[1], HImode)
    || (CONSTANT_P (operands[1])
@@ -706,9 +711,9 @@
 ; the iscompact attribute allows the epilogue expander to know for which
 ; insns it should lengthen the return insn.
 ; N.B. operand 1 of alternative 7 expands into pcl,symbol@gotpc .
-(define_insn "*movsi_insn"                      ;   0     1     2     3    4  5 6   7   8   9   10  11  12  13    14  15   16  17  18     19     20  21  22    23    24 25 26    27 28  29   30   31
-  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,  w,  w,  w,  w,???w, ?w,  w,Rcq#q,  h,   w,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,  m,VUsc")
-	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,Crr,Clo,Chi,Cbi,?Rac,Cpc,Clb, ?Cal,Cal,?Cal,Uts,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac,Cm3, C32"))]
+(define_insn "*movsi_insn"                      ;   0     1     2     3    4  5    6   7   8   9   10    11  12  13    14  15   16  17  18     19     20  21  22    23    24 25 26    27 28  29  30   31
+  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h,w*l,w*l,  w,  w,  w,  w,  ???w, ?w,  w,Rcq#q,  h, w*l,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,  m,VUsc")
+	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1, cL,  I,Crr,Clo,Chi,Cbi,?Rac*l,Cpc,Clb, ?Cal,Cal,?Cal,Uts,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac,Cm3, C32"))]
   "register_operand (operands[0], SImode)
    || register_operand (operands[1], SImode)
    || (CONSTANT_P (operands[1])
@@ -5073,317 +5078,123 @@
 				xtr, const0_rtx);
 })
 
+;; -------------------------------------------------------------------
+;; Hardware loop
+;; -------------------------------------------------------------------
+
 ; operand 0 is the loop count pseudo register
-; operand 1 is the loop end pattern
-(define_expand "doloop_begin"
-  [(use (match_operand 0 "register_operand" ""))
-   (use (match_operand 1 "" ""))]
+; operand 1 is the label to jump to at the top of the loop
+(define_expand "doloop_end"
+  [(parallel [(set (pc)
+		   (if_then_else
+		    (ne (match_operand 0 "" "")
+			(const_int 1))
+		    (label_ref (match_operand 1 "" ""))
+		    (pc)))
+	      (set (match_dup 0) (plus (match_dup 0) (const_int -1)))
+	      (unspec [(const_int 0)] UNSPEC_ARC_LP)
+	      (clobber (match_dup 2))])]
   ""
 {
-  /* Using the INSN_UID of the loop end pattern to identify it causes
-     trouble with -fcompare-debug, so allocate a debug-independent
-     id instead.  We use negative numbers so that we can use the same
-     slot in doloop_end_i where we later store a CODE_LABEL_NUMBER, and
-     still be able to tell what kind of number this is.  */
-  static HOST_WIDE_INT loop_end_id = 0;
-
-  rtx id = GEN_INT (--loop_end_id);
-  XEXP (XVECEXP (PATTERN (operands[1]), 0, 4), 0) = id;
-  emit_insn (gen_doloop_begin_i (operands[0], const0_rtx, id,
-				 const0_rtx, const0_rtx));
-  DONE;
+ if (GET_MODE (operands[0]) != SImode)
+   FAIL;
+ operands[2] = gen_rtx_SCRATCH (SImode);
 })
 
-; ??? can't describe the insn properly as then the optimizers try to
-; hoist the SETs.
-;(define_insn "doloop_begin_i"
-;  [(set (reg:SI LP_START) (pc))
-;   (set (reg:SI LP_END) (unspec:SI [(pc)] UNSPEC_ARC_LP))
-;   (use (match_operand 0 "const_int_operand" "n"))]
-;  ""
-;  "lp .L__GCC__LP%0"
-;)
-
-; The operands of doloop_end_i are also read / written by arc_reorg with
-; XVECEXP (PATTERN (lp, 0, N), so if you want to change the pattern, you
-; might have to adjust arc_reorg.
-; operands 0 / 2 are supplied by the expander, 1, 3 and 4 are filled in
-; by arc_reorg.  arc_reorg might also alter operand 0.
-;
-; N in XVECEXP PATTERN (lp, 0 N)
-;  V              rtl                 purpose
-;  0           unspec UNSPEC_ARC_LP identify pattern
-;  1           clobber LP_START     show LP_START is set
-;  2           clobber LP_END       show LP_END is set
-;  3           use operand0         loop count pseudo register
-;  4           use operand1         before arc_reorg: -id
-;                                   after : CODE_LABEL_NUMBER of loop top label
-;  5           use operand2         INSN_UID of loop end insn
-;  6           use operand3         loop setup not at start (1 above, 2 below)
-;  7           use operand4         LABEL_REF of top label, if not
-;                                   immediately following
-; If operand1 is still zero after arc_reorg, this is an orphaned loop
-; instruction that was not at the start of the loop.
-; There is no point is reloading this insn - then lp_count would still not
-; be available for the loop end.
-(define_insn "doloop_begin_i"
-  [(unspec:SI [(pc)] UNSPEC_ARC_LP)
-   (clobber (reg:SI LP_START))
-   (clobber (reg:SI LP_END))
-   (use (match_operand:SI 0 "register_operand" "l,l,????*X"))
-   (use (match_operand 1 "const_int_operand" "n,n,C_0"))
-   (use (match_operand 2 "const_int_operand" "n,n,X"))
-   (use (match_operand 3 "const_int_operand" "C_0,n,X"))
-   (use (match_operand 4 "const_int_operand" "C_0,X,X"))]
+(define_insn "arc_lp"
+  [(unspec:SI [(match_operand:SI 0 "register_operand" "l")]
+	      UNSPEC_ARC_LP)
+   (use (label_ref (match_operand 1 "" "")))
+   (use (label_ref (match_operand 2 "" "")))]
   ""
-{
-  rtx_insn *scan;
-  int len, size = 0;
-  int n_insns = 0;
-  rtx loop_start = operands[4];
-
-  if (CONST_INT_P (loop_start))
-    loop_start = NULL_RTX;
-  /* Size implications of the alignment will be taken care of by the
-     alignment inserted at the loop start.  */
-  if (LOOP_ALIGN (0) && INTVAL (operands[1]))
-    {
-      asm_fprintf (asm_out_file, "\t.p2align %d\\n", LOOP_ALIGN (0));
-      arc_clear_unalign ();
-    }
-  if (!INTVAL (operands[1]))
-    return "; LITTLE LOST LOOP";
-  if (loop_start && flag_pic)
-    {
-      /* ??? Can do better for when a scratch register
-	 is known.  But that would require extra testing.  */
-      return "push_s r0\;add r0,pcl,%4@pcl\;sr r0,[2]; LP_START\;add r0,pcl,.L__GCC__LP%1@pcl\;sr r0,[3]; LP_END\;pop_s r0";
-    }
-  /* Check if the loop end is in range to be set by the lp instruction.  */
-  size = INTVAL (operands[3]) < 2 ? 0 : 2048;
-  for (scan = insn; scan && size < 2048; scan = NEXT_INSN (scan))
-    {
-      if (!INSN_P (scan))
-	continue;
-      if (recog_memoized (scan) == CODE_FOR_doloop_end_i
-	  && (XEXP (XVECEXP (PATTERN (scan), 0, 4), 0)
-	      == XEXP (XVECEXP (PATTERN (insn), 0, 4), 0)))
-	break;
-      len = get_attr_length (scan);
-      size += len;
-    }
-  /* Try to verify that there are at least three instruction fetches
-     between the loop setup and the first encounter of the loop end.  */
-  for (scan = NEXT_INSN (insn); scan && n_insns < 3; scan = NEXT_INSN (scan))
-    {
-      if (!INSN_P (scan))
-	continue;
-      if (rtx_sequence *seq = dyn_cast <rtx_sequence *> (PATTERN (scan)))
-	scan = seq->insn (0);
-      if (JUMP_P (scan))
-	{
-	  if (recog_memoized (scan) != CODE_FOR_doloop_end_i)
-	    {
-	      n_insns += 2;
-	      if (simplejump_p (scan))
-		{
-		  scan = as_a <rtx_insn *> (XEXP (SET_SRC (PATTERN (scan)), 0));
-		  continue;
-		}
-
-	      rtx lab = JUMP_LABEL (scan);
-	      if (!lab)
-		break;
-
-	      rtx_insn *next_scan
-		= next_active_insn (NEXT_INSN (PREV_INSN (scan)));
-	      if (next_scan
-		  && recog_memoized (next_scan) != CODE_FOR_doloop_begin_i)
-		break;
-
-	      /* JUMP_LABEL might be simple_return instead if an insn.  */
-	      if (!INSN_P (lab))
-		{
-		  n_insns++;
-		  break;
-		}
-
-	      rtx_insn *next_lab = next_active_insn (as_a<rtx_insn *> (lab));
-	      if (next_lab
-		  && recog_memoized (next_lab) != CODE_FOR_doloop_begin_i)
-		break;
-
-		n_insns++;
-	    }
-	  break;
-	}
-      len = get_attr_length (scan);
-      /* Size estimation of asms assumes that each line which is nonempty
-	 codes an insn, and that each has a long immediate.  For minimum insn
-	 count, assume merely that a nonempty asm has at least one insn.  */
-      if (GET_CODE (PATTERN (scan)) == ASM_INPUT
-	  || asm_noperands (PATTERN (scan)) >= 0)
-	n_insns += (len != 0);
-      else
-	n_insns += (len > 4 ? 2 : (len ? 1 : 0));
-    }
-  if (LOOP_ALIGN (0))
-    {
-      asm_fprintf (asm_out_file, "\t.p2align %d\\n", LOOP_ALIGN (0));
-      arc_clear_unalign ();
-    }
-  gcc_assert (n_insns || GET_CODE (next_nonnote_insn (insn)) == CODE_LABEL);
-  if (size >= 2048 || (TARGET_ARC600 && n_insns == 1) || loop_start)
-    {
-      if (flag_pic)
-	{
-	  /* ??? Can do better for when a scratch register
-	     is known.  But that would require extra testing.  */
-	  arc_clear_unalign ();
-	  return ".p2align 2\;push_s r0\;add r0,pcl,24\;sr r0,[2]; LP_START\;add r0,pcl,.L__GCC__LP%1@pcl\;sr r0,[3]; LP_END\;pop_s r0";
-	}
-      output_asm_insn ((size < 2048
-			? "lp .L__GCC__LP%1" : "sr .L__GCC__LP%1,[3]; LP_END"),
-		       operands);
-      output_asm_insn (loop_start
-		       ? "sr %4,[2]; LP_START" : "sr 0f,[2]; LP_START",
-		       operands);
-      if (TARGET_ARC600 && n_insns < 1)
-	output_asm_insn ("nop", operands);
-      return (TARGET_ARC600 && n_insns < 3) ? "nop_s\;nop_s\;0:" : "0:";
-    }
-  else if (TARGET_ARC600 && n_insns < 3)
-    {
-      /* At least four instructions are needed between the setting of LP_COUNT
-	 and the loop end - but the lp instruction qualifies as one.  */
-      rtx_insn *prev = prev_nonnote_insn (insn);
-
-      if (!INSN_P (prev) || dead_or_set_regno_p (prev, LP_COUNT))
-	output_asm_insn ("nop", operands);
-    }
-  return "lp .L__GCC__LP%1";
-}
+  "lp\\t@%l2\\t; %0:@%l1->@%l2"
   [(set_attr "type" "loop_setup")
-   (set_attr_alternative "length"
-;     FIXME: length is usually 4, but we need branch shortening
-;     to get this right.
-;     [(if_then_else (match_test "TARGET_ARC600") (const_int 16) (const_int 4))
-     [(if_then_else (match_test "flag_pic") (const_int 24) (const_int 16))
-      (if_then_else (match_test "flag_pic") (const_int 28) (const_int 16))
-      (const_int 0)])]
-  ;; ??? we should really branch shorten this insn, but then we'd
-  ;; need a proper label first.  N.B. the end label can not only go out
-  ;; of range when it is far away, but also when it precedes the loop -
-  ;; which, unfortunately, it sometimes does, when the loop "optimizer"
-  ;; messes things up.
-)
-
-; operand 0 is the loop count pseudo register
-; operand 1 is the label to jump to at the top of the loop
-; Use this for the ARC600 and ARC700.
-; ??? ARC600 might want to check if the loop has few iteration and only a
-; single insn - loop setup is expensive then.
-(define_expand "doloop_end"
-  [(use (match_operand 0 "register_operand" ""))
-   (use (label_ref (match_operand 1 "" "")))]
-  "!TARGET_ARC601"
-{
-  /* We could do smaller bivs with biv widening, and wider bivs by having
-     a high-word counter in an outer loop - but punt on this for now.  */
-  if (GET_MODE (operands[0]) != SImode)
-    FAIL;
-  emit_jump_insn (gen_doloop_end_i (operands[0], operands[1], const0_rtx));
-  DONE;
-})
+   (set_attr "length" "4")])
 
-(define_insn_and_split "doloop_end_i"
+;; If by any chance lp_count is not used, then use an 'r'
+;; register instead of going to memory.
+(define_insn "loop_end"
   [(set (pc)
-	(if_then_else (ne (match_operand:SI 0 "shouldbe_register_operand" "+l,*c,*m")
-			   (const_int 1))
+	(if_then_else (ne (match_operand:SI 2 "nonimmediate_operand" "0,0")
+			  (const_int 1))
 		      (label_ref (match_operand 1 "" ""))
 		      (pc)))
-   (set (match_dup 0) (plus:SI (match_dup 0) (const_int -1)))
-   (use (reg:SI LP_START))
-   (use (reg:SI LP_END))
-   (use (match_operand 2 "const_int_operand" "n,???Cn0,???X"))
-   (clobber (match_scratch:SI 3 "=X,X,&????r"))]
+   (set (match_operand:SI 0 "nonimmediate_operand" "=l!r,m")
+	(plus (match_dup 2) (const_int -1)))
+   (unspec [(const_int 0)] UNSPEC_ARC_LP)
+   (clobber (match_scratch:SI 3 "=X,&r"))]
   ""
-  "*
-{
-  rtx_insn *prev = prev_nonnote_insn (insn);
-
-  /* If there is an immediately preceding label, we must output a nop,
-     lest a branch to that label will fall out of the loop.
-     ??? We could try to avoid this by claiming to have a delay slot if there
-     is a preceding label, and outputting the delay slot insn instead, if
-     present.
-     Or we could have some optimization that changes the source edge to update
-     the loop count and jump to the loop start instead.  */
-  /* For ARC600, we must also prevent jumps inside the loop and jumps where
-     the loop counter value is live at the target from being directly at the
-     loop end.  Being sure that the loop counter is dead at the target is
-     too much hair - we can't rely on data flow information at this point -
-     so insert a nop for all branches.
-     The ARC600 also can't read the loop counter in the last insn of a loop.  */
-  if (LABEL_P (prev))
-    output_asm_insn (\"nop%?\", operands);
-  return \"\\n.L__GCC__LP%2: ; loop end, start is %1\";
-}"
-  "&& memory_operand (operands[0], SImode)"
-  [(pc)]
-{
-  emit_move_insn (operands[3], operands[0]);
-  emit_jump_insn (gen_doloop_fallback_m (operands[3], operands[1], operands[0]));
-  DONE;
-}
-  [(set_attr "type" "loop_end")
-   (set (attr "length")
-	(if_then_else (match_test "LABEL_P (prev_nonnote_insn (insn))")
-		      (const_int 4) (const_int 0)))]
-)
+  "\\t;%0 %1 %2"
+  [(set_attr "length" "0")
+   (set_attr "predicable" "no")
+   (set_attr "type" "loop_end")])
 
-; This pattern is generated by arc_reorg when there is no recognizable
-; loop start.
-(define_insn "*doloop_fallback"
-  [(set (pc) (if_then_else (ne (match_operand:SI 0 "register_operand" "+r,!w")
-				(const_int 1))
-			   (label_ref (match_operand 1 "" ""))
-			   (pc)))
-   (set (match_dup 0) (plus:SI (match_dup 0) (const_int -1)))]
-   ; avoid fooling the loop optimizer into assuming this is a special insn.
-  "reload_completed"
-  "*return get_attr_length (insn) == 8
-   ? \"brne.d %0,1,%1\;sub %0,%0,1\"
-   : \"breq %0,1,0f\;b.d %1\;sub %0,%0,1\\n0:\";"
-  [(set (attr "length")
-	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -256))
- 			   (le (minus (match_dup 1) (pc)) (const_int 244)))
- 		      (const_int 8) (const_int 12)))
-   (set_attr "type" "brcc_no_delay_slot")
-   (set_attr "cond" "nocond")]
-)
+;; Split pattern for the rare case in which the loop register is in
+;; memory.
+(define_split
+  [(set (pc)
+	(if_then_else (ne (match_operand:SI 0 "memory_operand")
+			  (const_int 1))
+		      (label_ref (match_operand 1 ""))
+		      (pc)))
+   (set (match_dup 0) (plus (match_dup 0) (const_int -1)))
+   (unspec [(const_int 0)] UNSPEC_ARC_LP)
+   (clobber (match_scratch:SI 2))]
+  "memory_operand (operands[0], SImode)"
+  [(set (match_dup 2) (match_dup 0))
+   (set (match_dup 2) (plus:SI (match_dup 2) (const_int -1)))
+   (set (match_dup 0) (match_dup 2))
+   (set (reg:CC CC_REG) (compare:CC (match_dup 2) (const_int 0)))
+   (set (pc)
+	(if_then_else (ne (reg:CC CC_REG)
+			  (const_int 0))
+		      (label_ref (match_dup 1))
+		      (pc)))]
+  "")
 
-; reload can't make output reloads for jump insns, so we have to do this by hand.
-(define_insn "doloop_fallback_m"
-  [(set (pc) (if_then_else (ne (match_operand:SI 0 "register_operand" "+&r")
-				(const_int 1))
-			   (label_ref (match_operand 1 "" ""))
-			   (pc)))
-   (set (match_dup 0) (plus:SI (match_dup 0) (const_int -1)))
-   (set (match_operand:SI 2 "memory_operand" "=m")
-	(plus:SI (match_dup 0) (const_int -1)))]
-   ; avoid fooling the loop optimizer into assuming this is a special insn.
-  "reload_completed"
-  "*return get_attr_length (insn) == 12
-   ? \"sub %0,%0,1\;brne.d %0,0,%1\;st%U2%V2 %0,%2\"
-   : \"sub %0,%0,1\;breq %0,0,0f\;b.d %1\\n0:\tst%U2%V2 %0,%2\";"
-  [(set (attr "length")
-	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -252))
- 			   (le (minus (match_dup 1) (pc)) (const_int 244)))
- 		      (const_int 12) (const_int 16)))
-   (set_attr "type" "brcc_no_delay_slot")
-   (set_attr "cond" "nocond")]
-)
+(define_insn "loop_fail"
+  [(set (reg:SI LP_COUNT)
+	(plus:SI (reg:SI LP_COUNT) (const_int -1)))
+   (set (reg:CC_ZN CC_REG)
+	(compare:CC_ZN (plus:SI (reg:SI LP_COUNT) (const_int -1))
+		       (const_int 0)))]
+  ""
+  "sub.f%?\\tlp_count,lp_count,1"
+  [(set_attr "iscompact" "false")
+   (set_attr "type" "compare")
+   (set_attr "cond" "set_zn")
+   (set_attr "length" "4")
+   (set_attr "predicable" "yes")])
+
+(define_insn_and_split "dbnz"
+  [(set (pc)
+	(if_then_else
+	 (ne (plus:SI (match_operand:SI 0 "nonimmediate_operand" "+r!l,m")
+		      (const_int -1))
+	     (const_int 0))
+	 (label_ref (match_operand 1 "" ""))
+	 (pc)))
+   (set (match_dup 0)
+	(plus:SI (match_dup 0)
+		 (const_int -1)))
+   (clobber (match_scratch:SI 2 "=X,r"))]
+  "TARGET_V2"
+  "@
+   dbnz%#\\t%0,%l1
+   #"
+  "TARGET_V2 && reload_completed && memory_operand (operands[0], SImode)"
+  [(set (match_dup 2) (match_dup 0))
+   (set (match_dup 2) (plus:SI (match_dup 2) (const_int -1)))
+   (set (reg:CC CC_REG) (compare:CC (match_dup 2) (const_int 0)))
+   (set (match_dup 0) (match_dup 2))
+   (set (pc) (if_then_else (ge (reg:CC CC_REG)
+			       (const_int 0))
+			   (label_ref (match_dup 1))
+			   (pc)))]
+  ""
+  [(set_attr "iscompact" "false")
+   (set_attr "type" "loop_end")
+   (set_attr "length" "4,20")])
 
 (define_expand "movmemsi"
   [(match_operand:BLK 0 "" "")
diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
index ad2df26..d1ebd44 100644
--- a/gcc/config/arc/arc.opt
+++ b/gcc/config/arc/arc.opt
@@ -494,3 +494,28 @@ Specifies the registers that the processor saves on an interrupt entry and exit.
 mrgf-banked-regs=
 Target RejectNegative Joined Var(arc_deferred_options) Defer
 Specifies the number of registers replicated in second register bank on entry to fast interrupt.
+
+mlpc-width=
+Target RejectNegative Joined Enum(arc_lpc) Var(arc_lpcwidth) Init(32)
+Sets LP_COUNT register width.  Possible values are 8, 16, 20, 24, 28, and 32.
+
+Enum
+Name(arc_lpc) Type(int)
+
+EnumValue
+Enum(arc_lpc) String(8) Value(8)
+
+EnumValue
+Enum(arc_lpc) String(16) Value(16)
+
+EnumValue
+Enum(arc_lpc) String(20) Value(20)
+
+EnumValue
+Enum(arc_lpc) String(24) Value(24)
+
+EnumValue
+Enum(arc_lpc) String(28) Value(28)
+
+EnumValue
+Enum(arc_lpc) String(32) Value(32)
diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
index 1f66438..2610f84 100644
--- a/gcc/config/arc/predicates.md
+++ b/gcc/config/arc/predicates.md
@@ -362,6 +362,8 @@
       else if (TARGET_MUL64_SET
 	       && (REGNO (op) == 57 || REGNO(op) == 58 || REGNO(op) == 59 ))
 	return 0;
+      else if (REGNO (op) == LP_COUNT)
+        return 1;
       else
 	return dest_reg_operand (op, mode);
     case SUBREG :
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f480051..3d1f7f3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -614,7 +614,7 @@ Objective-C and Objective-C++ Dialects}.
 -mcrc  -mdsp-packa  -mdvbf  -mlock  -mmac-d16  -mmac-24  -mrtsc  -mswape @gol
 -mtelephony  -mxy  -misize  -mannotate-align  -marclinux  -marclinux_prof @gol
 -mlong-calls  -mmedium-calls  -msdata -mirq-ctrl-saved @gol
--mrgf-banked-regs @gol
+-mrgf-banked-regs -mlpc-width @gol
 -mvolatile-cache  -mtp-regno=@var{regno} @gol
 -malign-call  -mauto-modify-reg  -mbbit-peephole  -mno-brcc @gol
 -mcase-vector-pcrel  -mcompact-casesi  -mno-cond-exec  -mearly-cbranchsi @gol
@@ -14678,6 +14678,18 @@ registers to avoid memory transactions during interrupt entry and exit
 sequences.  Use this option when you are using fast interrupts in an
 ARC V2 family processor.  Permitted values are 4, 8, 16, and 32.
 
+@item -mlpc-width=@var{lpcw}
+@opindex mlpc-width
+Specify the width of the LP_COUNT register.  Valid values for
+@var{lpcw} are 8, 16, 20, 24, 28 and 32.  The default width is fixed
+to 32.  If the width is less than 32, the compiler does not attempt to
+transform loops in your program to use the zero-delay loop mechanism
+unless it is known that the @samp{LP_COUNT} register can hold the
+required loop-counter value.  Depending on the size specified, the
+compiler and run-time library might continue to use the loop mechanism
+for various needs.  This option defines the macro @code{__ARC_LPC_WIDTH__}
+with the value of @var{lpcw}.
+
 @end table
 
 The following options are passed through to the assembler, and also
diff --git a/gcc/testsuite/gcc.target/arc/loop-1.c b/gcc/testsuite/gcc.target/arc/loop-1.c
old mode 100644
new mode 100755
index 1afe8eb..773f583
--- a/gcc/testsuite/gcc.target/arc/loop-1.c
+++ b/gcc/testsuite/gcc.target/arc/loop-1.c
@@ -1,45 +1,12 @@
 /* { dg-do compile } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -w" } */
 
-/* This case would fail to make use of the zero-overhead loop
-   instruction at one time due to a bug.  */
+/* Check how we handle empty body loops.  */
 
-extern char a[];
-
-struct some_t
-{
-  struct
-  {
-    int aaa;
-    short bbb;
-    char ccc;
-    char ddd;
-  } ppp[8];
-
-  int www[1];
-};
-
-int b;
-
-void
-some_function ()
-{
-  struct some_t *tmp = (struct some_t *) a;
-
-  while ((*tmp).ppp[b].ccc)
-    while(0);
-
-  for (; b; b++)
-    {
-      if (tmp->ppp[b].ccc)
-        {
-          int c = tmp->ppp[b].bbb;
-          int d = tmp->ppp[b].aaa;
-          int e = d - tmp->www[c];
-          if (e)
-            tmp->ppp[b].ddd = 1;
-        }
-    }
+a;
+fn1() {
+  int i;
+  for (; i < 8; i++) {
+    double A[a];
+  }
 }
-
-/* { dg-final { scan-assembler "\[^\n\]+lp \\.L__GCC__" } } */
-- 
1.9.1

^ permalink raw reply	[flat|nested] 27+ messages in thread
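
The -mlpc-width documentation in the patch above says the compiler only
uses the zero-delay loop mechanism when LP_COUNT can hold the required
loop-counter value, and that the option defines __ARC_LPC_WIDTH__.  A
minimal sketch of such a width check in C follows.  It is an illustration
only, not code from the patch: LPC_WIDTH and lpc_can_hold are hypothetical
names, the counter is assumed to be an unsigned field of the given width,
and the fallback of 32 (the documented default) is used when the macro is
not defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Use the compiler-provided width when available; otherwise fall back
   to the documented default of 32 (assumption for non-ARC builds).  */
#ifdef __ARC_LPC_WIDTH__
# define LPC_WIDTH __ARC_LPC_WIDTH__
#else
# define LPC_WIDTH 32
#endif

/* Hypothetical helper: return true when TRIP_COUNT fits in an unsigned
   counter that is WIDTH bits wide.  Widths outside 1..32 are rejected.  */
static bool
lpc_can_hold (uint64_t trip_count, unsigned width)
{
  if (width == 0 || width > 32)
    return false;
  uint64_t max_count = (UINT64_C (1) << width) - 1;
  return trip_count <= max_count;
}
```

Under these assumptions, a loop with 256 iterations would not qualify
with -mlpc-width=8, while one with 255 iterations would.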

* [PATCH 2/7] [ARC] Use -G option to control sdata behavior
  2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
  2017-07-24  8:46 ` [PATCH 5/7] [ARC] Update various patterns Claudiu Zissulescu
  2017-07-24  8:46 ` [PATCH 6/7] [ARC] Reimplement ZOL support Claudiu Zissulescu
@ 2017-07-24  8:46 ` Claudiu Zissulescu
  2017-08-15 13:43   ` Andrew Burgess
  2017-07-24  8:46 ` [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P Claudiu Zissulescu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-07-24  8:46 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess, sandra

From: claziss <claziss@synopsys.com>

Add support for -G option to ARC backend.
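
The effect of the option can be sketched with a small C model.  This is
an illustration only, not code from the patch: default_threshold and
fits_small_data are hypothetical helpers, and the "size <= threshold"
rule is the generic GCC meaning of -G, assumed here to carry over to ARC.
The default of 8 bytes with LL64 and 4 bytes otherwise mirrors the
arc_override_options hunk below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: threshold used when the user gives no -G value.
   Mirrors "g_switch_value = TARGET_LL64 ? 8 : 4" from the patch.  */
static size_t
default_threshold (bool target_ll64)
{
  return target_ll64 ? 8 : 4;
}

/* Hypothetical helper: an object is a small-data candidate when its
   size is known, non-zero and no larger than the threshold (assumed
   generic -G rule; the real hook checks more than size).  */
static bool
fits_small_data (size_t size, size_t threshold)
{
  return size > 0 && size <= threshold;
}
```

Under these assumptions a 4-byte variable qualifies at either default
threshold, while an 8-byte variable qualifies only with LL64 or an
explicit -G 8.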

gcc/
2017-04-24  Claudiu Zissulescu  <claziss@synopsys.com>

	* config.gcc: Use g.opt for arc.
	* config/arc/arc.c (LEGITIMATE_SCALED_ADDRESS_P): Deleted,
	functionality moved to ...
	(legitimate_scaled_address_p): New function, ...here.
	(LEGITIMATE_SMALL_DATA_OFFSET_P): New define.
	(LEGITIMATE_SMALL_DATA_ADDRESS_P): Use the above define.
	(legitimate_offset_address_p): Delete TARGET_NO_SDATA_SET
	condition.
	(arc_override_options): Handle G option.
	(arc_output_pic_addr_const): Correct function definition.
	(arc_legitimate_address_p): Use legitimate_scaled_address_p.
	(arc_decl_anon_ns_mem_p): Delete.
	(arc_in_small_data_p): Overhaul this function to take into
	consideration the value given via G option.
	(arc_rewrite_small_data_1): Renamed and corrected old
	arc_rewrite_small_data function.
	(arc_rewrite_small_data): New function.
	(small_data_pattern): Don't use pic_offset_table_rtx.
	* config/arc/arc.h (CC1_SPEC): Recognize G option.
	* config/arc/simdext.md (movmisalignv2hi): Use
	prepare_move_operands function.
	(mov*): Likewise.
	(movmisalign*): Likewise.

gcc/testsuite/
2017-04-24  Claudiu Zissulescu  <claziss@synopsys.com>

	* gcc.target/arc/sdata-5.c: New test.
	* gcc.target/arc/arc700-stld-hazard.c: Update test options.

Fix test
---
 gcc/config.gcc                                    |   2 +-
 gcc/config/arc/arc.c                              | 236 ++++++++++++----------
 gcc/config/arc/arc.h                              |   6 +-
 gcc/config/arc/simdext.md                         |  22 +-
 gcc/testsuite/gcc.target/arc/arc700-stld-hazard.c |   2 +-
 gcc/testsuite/gcc.target/arc/sdata-5.c            |   8 +
 6 files changed, 152 insertions(+), 124 deletions(-)
 create mode 100755 gcc/testsuite/gcc.target/arc/sdata-5.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 0518cb7..564d7ff 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -318,7 +318,7 @@ arc*-*-*)
 	cpu_type=arc
 	c_target_objs="arc-c.o"
 	cxx_target_objs="arc-c.o"
-	extra_options="${extra_options} arc/arc-tables.opt"
+	extra_options="${extra_options} arc/arc-tables.opt g.opt"
 	extra_headers="arc-simd.h"
 	;;
 arm*-*-*)
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 091bc89..cf52df8 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -78,26 +78,21 @@ static const char *arc_cpu_string = arc_cpu_name;
 		      ? 0 \
 		      : -(-GET_MODE_SIZE (MODE) | -4) >> 1)))
 
-#define LEGITIMATE_SCALED_ADDRESS_P(MODE, X, STRICT) \
-(GET_CODE (X) == PLUS \
- && GET_CODE (XEXP (X, 0)) == MULT \
- && RTX_OK_FOR_INDEX_P (XEXP (XEXP (X, 0), 0), (STRICT)) \
- && GET_CODE (XEXP (XEXP (X, 0), 1)) == CONST_INT \
- && ((GET_MODE_SIZE (MODE) == 2 && INTVAL (XEXP (XEXP (X, 0), 1)) == 2) \
-     || (GET_MODE_SIZE (MODE) == 4 && INTVAL (XEXP (XEXP (X, 0), 1)) == 4)) \
- && (RTX_OK_FOR_BASE_P (XEXP (X, 1), (STRICT)) \
-     || (flag_pic ? CONST_INT_P (XEXP (X, 1)) : CONSTANT_P (XEXP (X, 1)))))
-
-#define LEGITIMATE_SMALL_DATA_ADDRESS_P(X) \
-  (GET_CODE (X) == PLUS \
-   && (REG_P (XEXP ((X), 0)) && REGNO (XEXP ((X), 0)) == SDATA_BASE_REGNUM) \
-   && ((GET_CODE (XEXP((X),1)) == SYMBOL_REF \
-	&& SYMBOL_REF_SMALL_P (XEXP ((X), 1))) \
-       || (GET_CODE (XEXP ((X), 1)) == CONST \
-	   && GET_CODE (XEXP (XEXP ((X), 1), 0)) == PLUS \
-	   && GET_CODE (XEXP (XEXP (XEXP ((X), 1), 0), 0)) == SYMBOL_REF \
-	   && SYMBOL_REF_SMALL_P (XEXP (XEXP (XEXP ((X), 1), 0), 0)) \
-	   && GET_CODE (XEXP(XEXP (XEXP ((X), 1), 0), 1)) == CONST_INT)))
+#define LEGITIMATE_SMALL_DATA_OFFSET_P(X)				\
+  (GET_CODE (X) == CONST						\
+   && GET_CODE (XEXP ((X), 0)) == PLUS					\
+   && GET_CODE (XEXP (XEXP ((X), 0), 0)) == SYMBOL_REF			\
+   && SYMBOL_REF_SMALL_P (XEXP (XEXP ((X), 0), 0))			\
+   && GET_CODE (XEXP(XEXP ((X), 0), 1)) == CONST_INT			\
+   && INTVAL (XEXP (XEXP ((X), 0), 1)) <= g_switch_value)
+
+#define LEGITIMATE_SMALL_DATA_ADDRESS_P(X)				\
+  (GET_CODE (X) == PLUS							\
+     && REG_P (XEXP ((X), 0))						\
+     && REGNO (XEXP ((X), 0)) == SDATA_BASE_REGNUM			\
+     && ((GET_CODE (XEXP ((X), 1)) == SYMBOL_REF			\
+	    && SYMBOL_REF_SMALL_P (XEXP ((X), 1)))			\
+	 || LEGITIMATE_SMALL_DATA_OFFSET_P (XEXP ((X), 1))))
 
 /* Array of valid operand punctuation characters.  */
 char arc_punct_chars[256];
@@ -275,6 +270,61 @@ static bool arc_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT,
 /* Globally visible information about currently selected cpu.  */
 const arc_cpu_t *arc_selected_cpu;
 
+static bool
+legitimate_scaled_address_p (machine_mode mode, rtx op, bool strict)
+{
+  if (GET_CODE (op) != PLUS)
+    return false;
+
+  if (GET_CODE (XEXP (op, 0)) != MULT)
+    return false;
+
+  /* Check multiplication operands.  */
+  if (!RTX_OK_FOR_INDEX_P (XEXP (XEXP (op, 0), 0), strict))
+    return false;
+
+  if (!CONST_INT_P (XEXP (XEXP (op, 0), 1)))
+    return false;
+
+  switch (GET_MODE_SIZE (mode))
+    {
+    case 2:
+      if (INTVAL (XEXP (XEXP (op, 0), 1)) != 2)
+	return false;
+      break;
+    case 8:
+      if (!TARGET_LL64)
+	return false;
+      /* Fall through.  */
+    case 4:
+      if (INTVAL (XEXP (XEXP (op, 0), 1)) != 4)
+	return false;
+    default:
+      return false;
+    }
+
+  /* Check the base.  */
+  if (RTX_OK_FOR_BASE_P (XEXP (op, 1), (strict)))
+    return true;
+
+  if (flag_pic)
+    {
+      if (CONST_INT_P (XEXP (op, 1)))
+	return true;
+      return false;
+    }
+  if (CONSTANT_P (XEXP (op, 1)))
+    {
+      /* Scaled addresses for sdata are handled elsewhere.  */
+      if (GET_CODE (XEXP (op, 1)) == SYMBOL_REF
+	  && SYMBOL_REF_SMALL_P (XEXP (op, 1)))
+	return false;
+      return true;
+    }
+
+  return false;
+}
+
 /* Check for constructions like REG + OFFS, where OFFS can be a
    register, an immediate or an long immediate. */
 
@@ -301,8 +351,7 @@ legitimate_offset_address_p (enum machine_mode mode, rtx x, bool index,
       && (GET_MODE_SIZE (mode) <= 4)
       /* Avoid small data which ends in something like GP +
 	 symb@sda.  */
-      && (!SYMBOL_REF_SMALL_P (XEXP (x, 1))
-	  || TARGET_NO_SDATA_SET))
+      && (!SYMBOL_REF_SMALL_P (XEXP (x, 1))))
     return true;
 
   return false;
@@ -1120,6 +1169,10 @@ arc_override_options (void)
   if (TARGET_COMPACT_CASESI)
     TARGET_CASE_VECTOR_PC_RELATIVE = 1;
 
+  /* Check for small data option.  */
+  if (!global_options_set.x_g_switch_value && !TARGET_NO_SDATA_SET)
+    g_switch_value = TARGET_LL64 ? 8 : 4;
+
   /* These need to be done at start up.  It's convenient to do them here.  */
   arc_init ();
 }
@@ -5518,7 +5571,7 @@ arc_legitimize_pic_address (rtx orig, rtx oldx)
 
 /* Output address constant X to FILE, taking PIC into account.  */
 
-void
+static void
 arc_output_pic_addr_const (FILE * file, rtx x, int code)
 {
   char buf[256];
@@ -5967,7 +6020,7 @@ arc_legitimate_address_p (machine_mode mode, rtx x, bool strict)
      return true;
   if (legitimate_offset_address_p (mode, x, TARGET_INDEXED_LOADS, strict))
      return true;
-  if (LEGITIMATE_SCALED_ADDRESS_P (mode, x, strict))
+  if (legitimate_scaled_address_p (mode, x, strict))
     return true;
   if (LEGITIMATE_SMALL_DATA_ADDRESS_P (x))
      return true;
@@ -7561,28 +7614,6 @@ valid_brcc_with_delay_p (rtx *operands)
   return brcc_nolimm_operator (operands[0], VOIDmode);
 }
 
-/* ??? Hack.  This should no really be here.  See PR32143.  */
-static bool
-arc_decl_anon_ns_mem_p (const_tree decl)
-{
-  while (1)
-    {
-      if (decl == NULL_TREE || decl == error_mark_node)
-	return false;
-      if (TREE_CODE (decl) == NAMESPACE_DECL
-	  && DECL_NAME (decl) == NULL_TREE)
-	return true;
-      /* Classes and namespaces inside anonymous namespaces have
-	 TREE_PUBLIC == 0, so we can shortcut the search.  */
-      else if (TYPE_P (decl))
-	return (TREE_PUBLIC (TYPE_NAME (decl)) == 0);
-      else if (TREE_CODE (decl) == NAMESPACE_DECL)
-	return (TREE_PUBLIC (decl) == 0);
-      else
-	decl = DECL_CONTEXT (decl);
-    }
-}
-
 /* Implement TARGET_IN_SMALL_DATA_P.  Return true if it would be safe to
    access DECL using %gp_rel(...)($gp).  */
 
@@ -7591,60 +7622,43 @@ arc_in_small_data_p (const_tree decl)
 {
   HOST_WIDE_INT size;
 
-  /* Strings and functions are never in small data area.  */
-  if (TREE_CODE (decl) == STRING_CST || TREE_CODE (decl) == FUNCTION_DECL)
+  /* Only variables are going into small data area.  */
+  if (TREE_CODE (decl) != VAR_DECL)
     return false;
 
   if (TARGET_NO_SDATA_SET)
     return false;
 
-  if (TREE_CODE (decl) == VAR_DECL && DECL_SECTION_NAME (decl) != 0)
-    {
-      const char *name;
-
-      /* Reject anything that isn't in a known small-data section.  */
-      name = DECL_SECTION_NAME (decl);
-      if (strcmp (name, ".sdata") != 0 && strcmp (name, ".sbss") != 0)
-	return false;
-
-      /* If a symbol is defined externally, the assembler will use the
-	 usual -G rules when deciding how to implement macros.  */
-      if (!DECL_EXTERNAL (decl))
-	  return true;
-    }
-  /* Only global variables go into sdata section for now.  */
-  else
-    {
-      /* Don't put constants into the small data section: we want them
-	 to be in ROM rather than RAM.  */
-      if (TREE_CODE (decl) != VAR_DECL)
-	return false;
-
-      if (TREE_READONLY (decl)
-	  && !TREE_SIDE_EFFECTS (decl)
-	  && (!DECL_INITIAL (decl) || TREE_CONSTANT (DECL_INITIAL (decl))))
-	return false;
-
-      /* TREE_PUBLIC might change after the first call, because of the patch
-	 for PR19238.  */
-      if (default_binds_local_p_1 (decl, 1)
-	  || arc_decl_anon_ns_mem_p (decl))
-	return false;
-
-      /* To ensure -mvolatile-cache works
-	 ld.di does not have a gp-relative variant.  */
-      if (TREE_THIS_VOLATILE (decl))
-	return false;
-    }
-
   /* Disable sdata references to weak variables.  */
   if (DECL_WEAK (decl))
     return false;
 
-  size = int_size_in_bytes (TREE_TYPE (decl));
+  /* Don't put constants into the small data section: we want them to
+     be in ROM rather than RAM.  */
+  if (TREE_READONLY (decl))
+    return false;
+
+  /* To ensure -mvolatile-cache works ld.di does not have a
+     gp-relative variant.  */
+  if (!TARGET_VOLATILE_CACHE_SET
+      && TREE_THIS_VOLATILE (decl))
+    return false;
 
-  /* Allow only <=4B long data types into sdata.  */
-  return (size > 0 && size <= 4);
+  if (DECL_SECTION_NAME (decl) != 0)
+    {
+      const char *name = DECL_SECTION_NAME (decl);
+      if (strcmp (name, ".sdata") == 0
+	  || strcmp (name, ".sbss") == 0)
+	return true;
+    }
+  /* If it's not public, there's no need to put it in the small data
+     section.  */
+  else if (TREE_PUBLIC (decl))
+    {
+      size = int_size_in_bytes (TREE_TYPE (decl));
+      return (size > 0 && size <= g_switch_value);
+    }
+  return false;
 }
 
 /* Return true if X is a small data address that can be rewritten
@@ -7673,9 +7687,10 @@ arc_rewrite_small_data_p (const_rtx x)
 /* If possible, rewrite OP so that it refers to small data using
    explicit relocations.  */
 
-rtx
-arc_rewrite_small_data (rtx op)
+static rtx
+arc_rewrite_small_data_1 (rtx op)
 {
+  rtx rgp = gen_rtx_REG (Pmode, SDATA_BASE_REGNUM);
   op = copy_insn (op);
   subrtx_ptr_iterator::array_type array;
   FOR_EACH_SUBRTX_PTR (iter, array, &op, ALL)
@@ -7683,28 +7698,33 @@ arc_rewrite_small_data (rtx op)
       rtx *loc = *iter;
       if (arc_rewrite_small_data_p (*loc))
 	{
-	  gcc_assert (SDATA_BASE_REGNUM == PIC_OFFSET_TABLE_REGNUM);
-	  *loc = gen_rtx_PLUS (Pmode, pic_offset_table_rtx, *loc);
-	  if (loc != &op)
-	    {
-	      if (GET_CODE (op) == MEM && &XEXP (op, 0) == loc)
-		; /* OK.  */
-	      else if (GET_CODE (op) == MEM
-		       && GET_CODE (XEXP (op, 0)) == PLUS
-		       && GET_CODE (XEXP (XEXP (op, 0), 0)) == MULT)
-		*loc = force_reg (Pmode, *loc);
-	      else
-		gcc_unreachable ();
-	    }
+	  *loc = gen_rtx_PLUS (Pmode, rgp, *loc);
 	  iter.skip_subrtxes ();
 	}
       else if (GET_CODE (*loc) == PLUS
-	       && rtx_equal_p (XEXP (*loc, 0), pic_offset_table_rtx))
+	       && rtx_equal_p (XEXP (*loc, 0), rgp))
 	iter.skip_subrtxes ();
     }
   return op;
 }
 
+rtx
+arc_rewrite_small_data (rtx op)
+{
+  op = arc_rewrite_small_data_1 (op);
+
+  /* Check if we fit small data constraints.  */
+  if (MEM_P (op)
+      && !LEGITIMATE_SMALL_DATA_ADDRESS_P (XEXP (op, 0)))
+    {
+      rtx addr = XEXP (op, 0);
+      rtx tmp = gen_reg_rtx (Pmode);
+      emit_move_insn (tmp, addr);
+      op = replace_equiv_address_nv (op, tmp);
+    }
+  return op;
+}
+
 /* Return true if OP refers to small data symbols directly, not through
    a PLUS.  */
 
@@ -7713,12 +7733,14 @@ small_data_pattern (rtx op, machine_mode)
 {
   if (GET_CODE (op) == SEQUENCE)
     return false;
+
+  rtx rgp = gen_rtx_REG (Pmode, SDATA_BASE_REGNUM);
   subrtx_iterator::array_type array;
   FOR_EACH_SUBRTX (iter, array, op, ALL)
     {
       const_rtx x = *iter;
       if (GET_CODE (x) == PLUS
-	  && rtx_equal_p (XEXP (x, 0), pic_offset_table_rtx))
+	  && rtx_equal_p (XEXP (x, 0), rgp))
 	iter.skip_subrtxes ();
       else if (arc_rewrite_small_data_p (x))
 	return true;
diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index c7cd798..d4e97cd 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -65,9 +65,9 @@ along with GCC; see the file COPYING3.  If not see
 %(subtarget_cpp_spec)"
 
 #undef CC1_SPEC
-#define CC1_SPEC "\
-%{EB:%{EL:%emay not use both -EB and -EL}} \
-%{EB:-mbig-endian} %{EL:-mlittle-endian} \
+#define CC1_SPEC "%{EB:%{EL:%emay not use both -EB and -EL}}	\
+%{EB:-mbig-endian} %{EL:-mlittle-endian}			\
+%{G*}								\
 "
 extern const char *arc_cpu_to_as (int argc, const char **argv);
 
diff --git a/gcc/config/arc/simdext.md b/gcc/config/arc/simdext.md
index 6c102d3..9f5b4a8 100644
--- a/gcc/config/arc/simdext.md
+++ b/gcc/config/arc/simdext.md
@@ -1383,19 +1383,18 @@
  [(set (match_operand:V2HI 0 "general_operand" "")
        (match_operand:V2HI 1 "general_operand" ""))]
  ""
-{
- if (!register_operand (operands[0], V2HImode)
-      && !register_operand (operands[1], V2HImode))
-    operands[1] = force_reg (V2HImode, operands[1]);
-})
+ "{
+   if (prepare_move_operands (operands, V2HImode))
+     DONE;
+  }")
 
 (define_expand "mov<mode>"
   [(set (match_operand:VWH 0 "move_dest_operand" "")
 	(match_operand:VWH 1 "general_operand" ""))]
   ""
   "{
-    if (GET_CODE (operands[0]) == MEM)
-     operands[1] = force_reg (<MODE>mode, operands[1]);
+    if (prepare_move_operands (operands, <MODE>mode))
+     DONE;
    }")
 
 (define_insn_and_split "*mov<mode>_insn"
@@ -1440,11 +1439,10 @@
  [(set (match_operand:VWH 0 "general_operand" "")
        (match_operand:VWH 1 "general_operand" ""))]
  ""
-{
- if (!register_operand (operands[0], <MODE>mode)
-      && !register_operand (operands[1], <MODE>mode))
-    operands[1] = force_reg (<MODE>mode, operands[1]);
-})
+ "{
+   if (prepare_move_operands (operands, <MODE>mode))
+     DONE;
+  }")
 
 (define_insn "bswapv2hi2"
   [(set (match_operand:V2HI 0 "register_operand" "=r,r")
diff --git a/gcc/testsuite/gcc.target/arc/arc700-stld-hazard.c b/gcc/testsuite/gcc.target/arc/arc700-stld-hazard.c
index bf6ae33..eba03d8 100644
--- a/gcc/testsuite/gcc.target/arc/arc700-stld-hazard.c
+++ b/gcc/testsuite/gcc.target/arc/arc700-stld-hazard.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-mcpu=arc700" } */
+/* { dg-options "-mcpu=arc700 -mno-sdata" } */
 
 volatile int a;
 volatile int b;
diff --git a/gcc/testsuite/gcc.target/arc/sdata-5.c b/gcc/testsuite/gcc.target/arc/sdata-5.c
new file mode 100755
index 0000000..0fdd52d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/sdata-5.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-w -Os" } */
+
+const a[1] = {};
+static short b[] = {};
+
+c;
+fn1() { return a + b[c]; }
-- 
1.9.1


* Re: [PATCH 6/7] [ARC] Reimplement ZOL support.
  2017-07-24  8:46 ` [PATCH 6/7] [ARC] Reimplement ZOL support Claudiu Zissulescu
@ 2017-07-24 22:22   ` Sandra Loosemore
  2017-08-31 20:23   ` Andrew Burgess
  1 sibling, 0 replies; 27+ messages in thread
From: Sandra Loosemore @ 2017-07-24 22:22 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, andrew.burgess

On 07/24/2017 02:42 AM, Claudiu Zissulescu wrote:

> +@item -mlpc-width=@var{lpcw}
> +@opindex mlpc-width
> +Specify the width of the LP_COUNT register.  Valid values for
> +@var{lpcw} are 8, 16, 20, 24, 28 and 32.  The default width is fixed
> +to 32.  If the width is less than 32, the compiler does not attempt to
> +transform loops in your program to use the zero-delay loop mechanism
> +unless it is known that the @samp{LP_COUNT} register can hold the
> +required loop-counter value.  Depending on the size specified, the
> +compiler and run-time library might continue to use the loop mechanism
> +for various needs.  This option defines macro @code{__ARC_LPC_WIDTH__}
> +with the value of size.

I think it would be better to use a more meaningful name for the 
argument than "lpcw".  If "lpcw" refers to the same thing as "width" and 
"size" later in the description, can you either use a consistent naming 
convention everywhere, or else clarify what the different terms mean in 
this context?

Please use consistent markup on both uses of LP_COUNT.  If this is a 
literal register name I think we use @code markup for such things elsewhere.

What are the units of the argument?  Bits?  Best to say so explicitly.

-Sandra
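
To make the question about units concrete: assuming the argument is a
bit count, the condition the quoted text describes ("the LP_COUNT
register can hold the required loop-counter value") could be sketched
as below.  The helper name fits_lp_count is hypothetical and this is
not GCC's implementation; it only illustrates the arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: return true when a loop that
   runs ITERATIONS times can keep its counter in an LP_COUNT register
   that is WIDTH_BITS bits wide (the quoted text lists 8, 16, 20, 24,
   28 and 32 as valid widths).  */
static bool
fits_lp_count (uint64_t iterations, unsigned width_bits)
{
  if (width_bits == 0 || width_bits > 32)
    return false;

  /* Largest value representable in WIDTH_BITS bits.  */
  uint64_t max_count = (UINT64_C (1) << width_bits) - 1;
  return iterations <= max_count;
}
```

With an 8-bit counter, for example, a loop of 255 iterations fits and
one of 256 does not.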


* Re: [PATCH 1/7] [ARC] Improves and fixes for small data support.
  2017-07-24  8:46 ` [PATCH 1/7] [ARC] Improves and fixes for small data support Claudiu Zissulescu
@ 2017-08-15 13:36   ` Andrew Burgess
  2017-08-31 14:37     ` Claudiu Zissulescu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Burgess @ 2017-08-15 13:36 UTC (permalink / raw)
  To: Claudiu Zissulescu
  Cc: gcc-patches, Francois.Bedard, sandra, Claudiu Zissulescu

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:53 +0200]:

> From: Claudiu Zissulescu <claziss@gmail.com>
> 
> Add alignment check for short load/store instructions used for sdata,
> as they request 32-bit aligned short immediate.  Use sdata symbol
> alignment information and emit scalled loads/stores whenever is
> possible. The scalled address will extend the access range for sdata
> symbols.  Allow 64-bit datum into small data section, if double
> load/store instructions are present.
> 
> gcc/
> 2017-04-12  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* config/arc/arc-protos.h (compact_sda_memory_operand): Update
> 	prototype.
> 	* config/arc/arc.c (arc_print_operand): Output scalled address for
> 	sdata whenever is possible.
> 	(arc_in_small_data_p): Allow sdata for 64bit datum when double
> 	load/stores are available.
> 	(compact_sda_memory_operand): Check for the alignment required by
> 	code density instructions.
> 	* config/arc/arc.md (movsi_insn): Use newly introduced Us0
> 	constraint.
> 	* config/arc/constraints.md (Usd): Update constraint.
> 	(Us0): New constraint.
> 	(Usc): Update constraint.
> 
> gcc/testsuite/
> 2017-04-12  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* gcc.target/arc/sdata-3.c: New file.

Remember to add both new testsuite files in the ChangeLog, but
otherwise looks good.

Thanks,
Andrew





> ---
>  gcc/config/arc/arc-protos.h            |  2 +-
>  gcc/config/arc/arc.c                   | 64 +++++++++++++++++++++++++++++-----
>  gcc/config/arc/constraints.md          |  4 +--
>  gcc/testsuite/gcc.target/arc/sdata-3.c | 32 +++++++++++++++++
>  gcc/testsuite/gcc.target/arc/sdata-4.c | 15 ++++++++
>  5 files changed, 105 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/sdata-3.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/sdata-4.c
> 
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index 850795a..c831972 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -27,7 +27,7 @@ extern struct rtx_def *gen_compare_reg (rtx, machine_mode);
>  /* Declarations for various fns used in the .md file.  */
>  extern void arc_output_function_epilogue (FILE *, HOST_WIDE_INT, int);
>  extern const char *output_shift (rtx *);
> -extern bool compact_sda_memory_operand (rtx op,machine_mode  mode);
> +extern bool compact_sda_memory_operand (rtx, machine_mode, bool);
>  extern bool arc_double_limm_p (rtx);
>  extern void arc_print_operand (FILE *, rtx, int);
>  extern void arc_print_operand_address (FILE *, rtx);
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 89de6cd..091bc89 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -3900,6 +3900,26 @@ arc_print_operand (FILE *file, rtx x, int code)
>  		  fputs (".as", file);
>  		  output_scaled = 1;
>  		}
> +	      else if (LEGITIMATE_SMALL_DATA_ADDRESS_P (addr)
> +		       && GET_MODE_SIZE (GET_MODE (x)) > 1)
> +		{
> +		  tree decl = NULL_TREE;
> +		  int align = 0;
> +		  if (GET_CODE (XEXP (addr, 1)) == SYMBOL_REF)
> +		    decl = SYMBOL_REF_DECL (XEXP (addr, 1));
> +		  else if (GET_CODE (XEXP (XEXP (XEXP (addr, 1), 0), 0))
> +			   == SYMBOL_REF)
> +		    decl = SYMBOL_REF_DECL (XEXP (XEXP (XEXP (addr, 1), 0), 0));
> +		  if (decl)
> +		    align = DECL_ALIGN (decl);
> +		  align = align / BITS_PER_UNIT;
> +		  if ((GET_MODE_SIZE (GET_MODE (x)) == 2)
> +		      && align && ((align & 1) == 0))
> +		    fputs (".as", file);
> +		  if ((GET_MODE_SIZE (GET_MODE (x)) >= 4)
> +		      && align && ((align & 3) == 0))
> +		    fputs (".as", file);
> +		}
>  	      break;
>  	    case REG:
>  	      break;
> @@ -7571,12 +7591,10 @@ arc_in_small_data_p (const_tree decl)
>  {
>    HOST_WIDE_INT size;
>  
> +  /* Strings and functions are never in small data area.  */
>    if (TREE_CODE (decl) == STRING_CST || TREE_CODE (decl) == FUNCTION_DECL)
>      return false;
>  
> -
> -  /* We don't yet generate small-data references for -mabicalls.  See related
> -     -G handling in override_options.  */
>    if (TARGET_NO_SDATA_SET)
>      return false;
>  
> @@ -7595,7 +7613,7 @@ arc_in_small_data_p (const_tree decl)
>  	  return true;
>      }
>    /* Only global variables go into sdata section for now.  */
> -  else if (1)
> +  else
>      {
>        /* Don't put constants into the small data section: we want them
>  	 to be in ROM rather than RAM.  */
> @@ -7625,9 +7643,6 @@ arc_in_small_data_p (const_tree decl)
>  
>    size = int_size_in_bytes (TREE_TYPE (decl));
>  
> -/*   if (AGGREGATE_TYPE_P (TREE_TYPE (decl))) */
> -/*     return false; */
> -
>    /* Allow only <=4B long data types into sdata.  */
>    return (size > 0 && size <= 4);
>  }
> @@ -7719,10 +7734,13 @@ small_data_pattern (rtx op, machine_mode)
>  /* volatile cache option still to be handled.  */
>  
>  bool
> -compact_sda_memory_operand (rtx op, machine_mode mode)
> +compact_sda_memory_operand (rtx op, machine_mode mode, bool short_p)
>  {
>    rtx addr;
>    int size;
> +  tree decl = NULL_TREE;
> +  int align = 0;
> +  int mask = 0;
>  
>    /* Eliminate non-memory operations.  */
>    if (GET_CODE (op) != MEM)
> @@ -7740,7 +7758,35 @@ compact_sda_memory_operand (rtx op, machine_mode mode)
>    /* Decode the address now.  */
>    addr = XEXP (op, 0);
>  
> -  return LEGITIMATE_SMALL_DATA_ADDRESS_P  (addr);
> +  if (!LEGITIMATE_SMALL_DATA_ADDRESS_P (addr))
> +    return false;
> +
> +  if (!short_p || size == 1)
> +    return true;
> +
> +  /* Now check for the alignment, the short loads using gp require the
> +     addresses to be aligned.  */
> +  if (GET_CODE (XEXP (addr, 1)) == SYMBOL_REF)
> +    decl = SYMBOL_REF_DECL (XEXP (addr, 1));
> +  else if (GET_CODE (XEXP (XEXP (XEXP (addr, 1), 0), 0)) == SYMBOL_REF)
> +    decl = SYMBOL_REF_DECL (XEXP (XEXP (XEXP (addr, 1), 0), 0));
> +  if (decl)
> +    align = DECL_ALIGN (decl);
> +  align = align / BITS_PER_UNIT;
> +
> +  switch (mode)
> +    {
> +    case HImode:
> +      mask = 1;
> +      break;
> +    default:
> +      mask = 3;
> +      break;
> +    }
> +
> +  if (align && ((align & mask) == 0))
> +    return true;
> +  return false;
>  }
>  
>  /* Implement ASM_OUTPUT_ALIGNED_DECL_LOCAL.  */
> diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md
> index 6620daf..0ad318c 100644
> --- a/gcc/config/arc/constraints.md
> +++ b/gcc/config/arc/constraints.md
> @@ -355,7 +355,7 @@
>     "@internal
>      A valid _small-data_ memory operand for ARCompact instructions"
>     (and (match_code "mem")
> -	(match_test "compact_sda_memory_operand (op, VOIDmode)")))
> +	(match_test "compact_sda_memory_operand (op, VOIDmode, true)")))
>  
>  (define_memory_constraint "Usc"
>    "@internal
> @@ -363,7 +363,7 @@
>    (and (match_code "mem")
>         (match_test "!CONSTANT_P (XEXP (op,0))")
>  ;; ??? the assembler rejects stores of immediates to small data.
> -       (match_test "!compact_sda_memory_operand (op, VOIDmode)")))
> +       (match_test "!compact_sda_memory_operand (op, VOIDmode, false)")))
>  
>  (define_constraint "Us<"
>    "@internal
> diff --git a/gcc/testsuite/gcc.target/arc/sdata-3.c b/gcc/testsuite/gcc.target/arc/sdata-3.c
> new file mode 100644
> index 0000000..cdf3b6d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/sdata-3.c
> @@ -0,0 +1,32 @@
> +/* Check if sdata access is done correctly, specially
> +   for variables which are having a different alignment
> +   than the default data type indicates.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +int g_a __attribute__ ((aligned (1)));
> +int g_b;
> +short g_c;
> +char g_d;
> +
> +#define TEST(name, optype)			\
> +  void test_ ## name (optype x)			\
> +  {						\
> +    g_ ## name += x;				\
> +  }
> +
> +TEST (a, int)
> +TEST (b, int)
> +TEST (c, short)
> +TEST (d, char)
> +
> +/* { dg-final { scan-assembler "ld r2,\\\[gp,@g_a@sda\\\]" } } */
> +/* { dg-final { scan-assembler "ld.as r2,\\\[gp,@g_b@sda\\\]" } } */
> +/* { dg-final { scan-assembler "ld\[hw\]\\\.as r2,\\\[gp,@g_c@sda\\\]" } } */
> +/* { dg-final { scan-assembler "ldb r2,\\\[gp,@g_d@sda\\\]" } } */
> +
> +/* { dg-final { scan-assembler "st r0,\\\[gp,@g_a@sda\\\]" } } */
> +/* { dg-final { scan-assembler "st_s r0,\\\[gp,@g_b@sda\\\]" { target { arcem || archs } } } } */
> +/* { dg-final { scan-assembler "st\\\.as r0,\\\[gp,@g_b@sda\\\]" { target { arc700 || arc6xx } } } } */
> +/* { dg-final { scan-assembler "st\[hw\]\\\.as r0,\\\[gp,@g_c@sda\\\]" } } */
> +/* { dg-final { scan-assembler "stb r0,\\\[gp,@g_d@sda\\\]" } } */
> diff --git a/gcc/testsuite/gcc.target/arc/sdata-4.c b/gcc/testsuite/gcc.target/arc/sdata-4.c
> new file mode 100644
> index 0000000..45fe712
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/sdata-4.c
> @@ -0,0 +1,15 @@
> +/* Check if sdata access is done correctly, specially
> +   for variables which are having a different alignment
> +   than the default data type indicates.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +short gA  __attribute__ ((aligned(1)));
> +
> +void foo (void)
> +{
> +  gA += gA + 3;
> +}
> +
> +/* { dg-final { scan-assembler-not "ld\[wh\]_s r0,\\\[gp" } } */
> +/* { dg-final { scan-assembler-not "st\[wh\]\\\.as.*gp" } } */
> -- 
> 1.9.1
> 
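
The alignment test added to compact_sda_memory_operand in the patch
quoted above can be modelled in isolation as follows.  This is a
simplified sketch: the function name and its plain-integer parameters
are illustrative, whereas the real code works on RTL and reads
DECL_ALIGN from the symbol's declaration.

```c
#include <stdbool.h>

/* Simplified model, not the GCC code.  ACCESS_SIZE is the access size
   in bytes; ALIGN_BYTES is the known alignment of the symbol in bytes,
   or 0 when no declaration is available.  */
static bool
sda_short_access_aligned_p (int access_size, int align_bytes)
{
  /* Byte accesses carry no alignment requirement.  */
  if (access_size == 1)
    return true;

  /* 2-byte accesses need 2-byte alignment; as in the patch, every
     other mode uses mask 3, i.e. 4-byte alignment.  */
  int mask = (access_size == 2) ? 1 : 3;

  return align_bytes != 0 && (align_bytes & mask) == 0;
}
```

This matches what sdata-3.c expects: g_a, declared with aligned (1),
fails the check for a 4-byte access, while the naturally aligned g_b
passes.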


* Re: [PATCH 2/7] [ARC] Use -G option to control sdata behavior
  2017-07-24  8:46 ` [PATCH 2/7] [ARC] Use -G option to control sdata behavior Claudiu Zissulescu
@ 2017-08-15 13:43   ` Andrew Burgess
  2017-08-15 16:26     ` Sandra Loosemore
  2017-08-31 16:04     ` Claudiu Zissulescu
  0 siblings, 2 replies; 27+ messages in thread
From: Andrew Burgess @ 2017-08-15 13:43 UTC (permalink / raw)
  To: Claudiu Zissulescu; +Cc: gcc-patches, Francois.Bedard, sandra

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:54 +0200]:

> From: claziss <claziss@synopsys.com>
> 
> Add support for -G option to ARC backend.
> 
> gcc/
> 2017-04-24  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* config.gcc: Use g.opt for arc.
> 	* config/arc/arc.c (LEGITIMATE_SCALED_ADDRESS_P): Deleted,
> 	functionality moved to ...
> 	(legitimate_scaled_address_p): New function, ...here.
> 	(LEGITIMATE_SMALL_DATA_OFFSET_P): New define.
> 	(LEGITIMATE_SMALL_DATA_ADDRESS_P): Use the above define.
> 	(legitimate_offset_address_p): Delete TARGET_NO_SDATA_SET
> 	condition.
> 	(arc_override_options): Handle G option.
> 	(arc_output_pic_addr_const): Correct function definition.
> 	(arc_legitimate_address_p): Use legitimate_scaled_address_p.
> 	(arc_decl_anon_ns_mem_p): Delete.
> 	(arc_in_small_data_p): Overhaul this function to take into
> 	consideration the value given via G option.
> 	(arc_rewrite_small_data_1): Renamed and corrected old
> 	arc_rewrite_small_data function.
> 	(arc_rewrite_small_data): New function.
> 	(small_data_pattern): Don't use pic_offset_table_rtx.
> 	* config/arc/arc.h (CC1_SPEC): Recognize G option.
> 	* config/arc/simdext.md (movmisalignv2hi): Use
> 	prepare_move_operands function.
> 	(mov*): Likewise.
> 	(movmisalign*): Likewise.
> 
> gcc/testsuite/
> 2017-04-24  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* gcc.target/arc/sdata-5.c: New test.
> 	* gcc.target/arc/arc700-stld-hazard.c: Update test options.

From looking at other targets, I think that we need to add
documentation for -G into the ARC Options section of
gcc/doc/invoke.texi.

> diff --git a/gcc/testsuite/gcc.target/arc/sdata-5.c b/gcc/testsuite/gcc.target/arc/sdata-5.c
> new file mode 100755
> index 0000000..0fdd52d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/sdata-5.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-w -Os" } */
> +
> +const a[1] = {};
> +static short b[] = {};
> +
> +c;
> +fn1() { return a + b[c]; }

Is it critical that we rely on default types here?  I know it's
legitimate, but it just makes me sad to see new code using default
types.

Also, I guess that this test expects some things to be placed into the
small data section? Is there no pattern we can 'scan-assembler' for?
Or is this testing some other feature / bug?

Otherwise seems fine.

Thanks,
Andrew
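
For readers following the review, the rule applied by the reworked
arc_in_small_data_p, together with the -G default set in
arc_override_options, can be modelled as below.  This is a sketch
only: the two structs and both function names are hypothetical
stand-ins for the tree flags and option state the real hook reads.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the declaration properties inspected.  */
struct var_info
{
  bool is_variable;     /* TREE_CODE (decl) == VAR_DECL */
  bool is_weak;         /* DECL_WEAK */
  bool is_readonly;     /* TREE_READONLY */
  bool is_volatile;     /* TREE_THIS_VOLATILE */
  bool is_public;       /* TREE_PUBLIC */
  const char *section;  /* DECL_SECTION_NAME, or NULL */
  long size;            /* int_size_in_bytes (TREE_TYPE (decl)) */
};

/* Hypothetical stand-ins for the relevant option state.  */
struct sdata_opts
{
  bool no_sdata;        /* TARGET_NO_SDATA_SET */
  bool volatile_cache;  /* TARGET_VOLATILE_CACHE_SET */
  bool ll64;            /* TARGET_LL64 */
  bool g_given;         /* -G was passed explicitly */
  long g_value;         /* value passed to -G */
};

/* Size limit: the -G value if given, else 8 with LL64 and 4 without.  */
static long
sdata_limit (const struct sdata_opts *o)
{
  if (o->g_given)
    return o->g_value;
  return o->ll64 ? 8 : 4;
}

static bool
in_small_data_p (const struct var_info *v, const struct sdata_opts *o)
{
  if (!v->is_variable || o->no_sdata || v->is_weak || v->is_readonly)
    return false;
  if (!o->volatile_cache && v->is_volatile)
    return false;

  /* An explicit section decides the matter on its own.  */
  if (v->section != NULL)
    return (strcmp (v->section, ".sdata") == 0
            || strcmp (v->section, ".sbss") == 0);

  /* Otherwise only public variables within the size limit qualify.  */
  if (v->is_public)
    return v->size > 0 && v->size <= sdata_limit (o);
  return false;
}
```

In this model -G0 keeps every variable without an explicit section out
of small data, and an 8-byte variable qualifies by default only when
double load/store instructions are available.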


* Re: [PATCH 3/7] [ARC] Update can_follow_jump hook helper.
  2017-07-24  8:46 ` [PATCH 3/7] [ARC] Update can_follow_jump hook helper Claudiu Zissulescu
@ 2017-08-15 13:56   ` Andrew Burgess
  2017-08-31 16:38     ` Claudiu Zissulescu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Burgess @ 2017-08-15 13:56 UTC (permalink / raw)
  To: Claudiu Zissulescu; +Cc: gcc-patches, Francois.Bedard, sandra

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:55 +0200]:

> From: claziss <claziss@synopsys.com>
> 
> Short branches cannot be used to jump between hot/cold
> sections. Update the hook.
> 
> gcc/
> 2017-04-26  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* config/arc/arc.c (arc_can_follow_jump): Check for short
> 	branches.

This looks fine.

Thanks,
Andrew

> ---
>  gcc/config/arc/arc.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index cf52df8..3980a0f 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -9941,6 +9941,9 @@ arc_can_follow_jump (const rtx_insn *follower, const rtx_insn *followee)
>    if (CROSSING_JUMP_P (followee))
>      switch (get_attr_type (u.r))
>        {
> +      case TYPE_BRANCH:
> +	if (get_attr_length (u.r) != 2)
> +	  break;
>        case TYPE_BRCC:
>        case TYPE_BRCC_NO_DELAY_SLOT:
>  	return false;
> -- 
> 1.9.1
> 
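
The hunk quoted above relies on a conditional fall-through, which is
easy to misread.  A standalone model of the control flow follows.  It
assumes the function returns true once the switch is left, which the
hunk does not show; TYPE_OTHER and the function name are hypothetical.

```c
#include <stdbool.h>

/* TYPE_OTHER is a hypothetical placeholder for every remaining
   instruction type.  */
enum insn_type
{
  TYPE_BRANCH,
  TYPE_BRCC,
  TYPE_BRCC_NO_DELAY_SLOT,
  TYPE_OTHER
};

/* Model of the check applied when the followee is a section-crossing
   jump.  Returns false when the follower must not be redirected.  */
static bool
can_follow_crossing_jump_p (enum insn_type type, int length)
{
  switch (type)
    {
    case TYPE_BRANCH:
      /* Only short (2-byte) branches are rejected.  */
      if (length != 2)
        break;
      /* Fall through.  */
    case TYPE_BRCC:
    case TYPE_BRCC_NO_DELAY_SLOT:
      return false;
    default:
      break;
    }
  return true;
}
```

A long branch therefore leaves the switch through the break and is
accepted, while a 2-byte branch shares the return false of the BRcc
cases.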


* Re: [PATCH 2/7] [ARC] Use -G option to control sdata behavior
  2017-08-15 13:43   ` Andrew Burgess
@ 2017-08-15 16:26     ` Sandra Loosemore
  2017-08-31 16:04     ` Claudiu Zissulescu
  1 sibling, 0 replies; 27+ messages in thread
From: Sandra Loosemore @ 2017-08-15 16:26 UTC (permalink / raw)
  To: Andrew Burgess, Claudiu Zissulescu; +Cc: gcc-patches, Francois.Bedard

On 08/15/2017 07:30 AM, Andrew Burgess wrote:
> * Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:54 +0200]:
>
>> From: claziss <claziss@synopsys.com>
>>
>> Add support for -G option to ARC backend.
>>
>> gcc/
>> 2017-04-24  Claudiu Zissulescu  <claziss@synopsys.com>
>>
>> 	* config.gcc: Use g.opt for arc.
>> 	* config/arc/arc.c (LEGITIMATE_SCALED_ADDRESS_P): Deleted,
>> 	functionality moved to ...
>> 	(legitimate_scaled_address_p): New function, ...here.
>> 	(LEGITIMATE_SMALL_DATA_OFFSET_P): New define.
>> 	(LEGITIMATE_SMALL_DATA_ADDRESS_P): Use the above define.
>> 	(legitimate_offset_address_p): Delete TARGET_NO_SDATA_SET
>> 	condition.
>> 	(arc_override_options): Handle G option.
>> 	(arc_output_pic_addr_const): Correct function definition.
>> 	(arc_legitimate_address_p): Use legitimate_scaled_address_p.
>> 	(arc_decl_anon_ns_mem_p): Delete.
>> 	(arc_in_small_data_p): Overhaul this function to take into
>> 	consideration the value given via G option.
>> 	(arc_rewrite_small_data_1): Renamed and corrected old
>> 	arc_rewrite_small_data function.
>> 	(arc_rewrite_small_data): New function.
>> 	(small_data_pattern): Don't use pic_offset_table_rtx.
>> 	* config/arc/arc.h (CC1_SPEC): Recognize G option.
>> 	* config/arc/simdext.md (movmisalignv2hi): Use
>> 	prepare_move_operands function.
>> 	(mov*): Likewise.
>> 	(movmisalign*): Likewise.
>>
>> gcc/testsuite/
>> 2017-04-24  Claudiu Zissulescu  <claziss@synopsys.com>
>>
>> 	* gcc.target/arc/sdata-5.c: New test.
>> 	* gcc.target/arc/arc700-stld-hazard.c: Update test options.
>
>  From looking at other targets, I think that we need to add
> documentation for -G into the ARC Options section of
> gcc/doc/invoke.texi.

Yes.  It would probably be better to treat -G as a target-independent 
option enabled by some target hook and consolidate the documentation as 
well, but since it's not done that way at present there needs to be an 
entry for -G in the ARC options table.

-Sandra


* RE: [PATCH 1/7] [ARC] Improves and fixes for small data support.
  2017-08-15 13:36   ` Andrew Burgess
@ 2017-08-31 14:37     ` Claudiu Zissulescu
  0 siblings, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-08-31 14:37 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard, sandra

> Remember to add both new testsuite files in the ChangeLog, but
> otherwise looks good.
> 
> Thanks,
> Andrew
> 

Done. Thank you for your review,
Claudiu


* RE: [PATCH 2/7] [ARC] Use -G option to control sdata behavior
  2017-08-15 13:43   ` Andrew Burgess
  2017-08-15 16:26     ` Sandra Loosemore
@ 2017-08-31 16:04     ` Claudiu Zissulescu
  1 sibling, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-08-31 16:04 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard, sandra

> From looking at other targets, I think that we need to add
> documentation for -G into the ARC Options section of
> gcc/doc/invoke.texi.

Added to invoke.texi as suggested.


> Is it critical that we rely on default types here?  I know it's
> legitimate, but it just makes me sad to see new code using default
> types.

No, just a small mistake on my side. I've updated the test as suggested.

> 
> Also, I guess that this test expects some things to be placed into the
> small data section? Is there no pattern we can 'scan-assembler' for?
> Or is this testing some other feature / bug?

The test checks the interaction between small data and section anchors
(which are enabled by default when compiling for size).  If that
interaction is broken, the compiler ICEs.  I've also added a
scan-assembler for clarity.

Committed with above changes. Thank you for your review,
Claudiu


* RE: [PATCH 3/7] [ARC] Update can_follow_jump hook helper.
  2017-08-15 13:56   ` Andrew Burgess
@ 2017-08-31 16:38     ` Claudiu Zissulescu
  0 siblings, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-08-31 16:38 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard, sandra

> This looks fine.
> 

Committed. Thank you for your review,
Claudiu


* Re: [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P.
  2017-07-24  8:46 ` [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P Claudiu Zissulescu
@ 2017-08-31 19:13   ` Andrew Burgess
  2017-09-01 11:51     ` Claudiu Zissulescu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Burgess @ 2017-08-31 19:13 UTC (permalink / raw)
  To: Claudiu Zissulescu
  Cc: gcc-patches, Francois.Bedard, sandra, Claudiu Zissulescu

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:56 +0200]:

> From: Claudiu Zissulescu <claziss@gmail.com>
> 
> We don't want to use anchors for small data: the GP register acts as an anchor in that
> case.  We also don't want to use them for PC-relative accesses,
> where the PC acts as an anchor.  TLS symbols require special accesses as well, don't use
> anchors for such symbols.
> 
> gcc/
> 2017-04-28  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* config/arc/arc.c (arc_use_anchors_for_symbol_p): New function.
> 	(TARGET_USE_ANCHORS_FOR_SYMBOL_P): Define.
> 
> gcc/testsuite
> 2017-04-28  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* gcc.target/arc/pr9001184797.c: New test.

Looks good.

Thanks,
Andrew


> ---
>  gcc/config/arc/arc.c                        | 24 ++++++++++++++++++++++++
>  gcc/testsuite/gcc.target/arc/pr9001184797.c | 19 +++++++++++++++++++
>  2 files changed, 43 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/arc/pr9001184797.c
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 3980a0f..c94b187 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -10687,6 +10687,30 @@ arc_builtin_setjmp_frame_value (void)
>    return gen_raw_REG (Pmode, FRAME_POINTER_REGNUM);
>  }
>  
> +/* Implement TARGET_USE_ANCHORS_FOR_SYMBOL_P.  We don't want to use
> +   anchors for small data: the GP register acts as an anchor in that
> +   case.  We also don't want to use them for PC-relative accesses,
> +   where the PC acts as an anchor.  Prohibit also TLS symbols to use
> +   anchors.  */
> +
> +static bool
> +arc_use_anchors_for_symbol_p (const_rtx symbol)
> +{
> +  if (SYMBOL_REF_TLS_MODEL (symbol))
> +    return false;
> +
> +  if (flag_pic)
> +    return false;
> +
> +  if (SYMBOL_REF_SMALL_P (symbol))
> +    return false;
> +
> +  return default_use_anchors_for_symbol_p (symbol);
> +}
> +
> +#undef TARGET_USE_ANCHORS_FOR_SYMBOL_P
> +#define TARGET_USE_ANCHORS_FOR_SYMBOL_P arc_use_anchors_for_symbol_p
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>  
>  #include "gt-arc.h"
> diff --git a/gcc/testsuite/gcc.target/arc/pr9001184797.c b/gcc/testsuite/gcc.target/arc/pr9001184797.c
> new file mode 100644
> index 0000000..e76c676
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/pr9001184797.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target tls } */
> +/* { dg-options "-Os -w -mno-ll64" } */
> +
> +/* This test studies the use of anchors and tls symbols. */
> +
> +struct a b;
> +struct a {
> +  long c;
> +  long d
> +} e() {
> +  static __thread struct a f;
> +  static __thread g;
> +  g = 5;
> +  h();
> +  if (f.c)
> +    g = g & 5;
> +  f = b;
> +}
> -- 
> 1.9.1
> 
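
For readers following the thread, the order of the checks in
arc_use_anchors_for_symbol_p can be modelled outside GCC roughly as
below.  This is only a sketch under stated assumptions: struct
symbol_model, its fields and use_anchors_for_symbol are hypothetical
stand-ins for SYMBOL_REF_TLS_MODEL, SYMBOL_REF_SMALL_P and flag_pic,
and the fallback to default_use_anchors_for_symbol_p is reduced to a
plain boolean field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the symbol properties the hook inspects.  */
struct symbol_model
{
  int tls_model;        /* nonzero for a TLS symbol */
  bool small_data;      /* symbol is placed in the small data area */
  bool default_answer;  /* what the generic hook would return */
};

/* Mirror the order of the checks in the quoted patch: TLS first, then
   PIC, then small data, and only then defer to the generic answer.  */
static bool
use_anchors_for_symbol (const struct symbol_model *sym, bool pic)
{
  assert (sym != NULL);

  if (sym->tls_model != 0)
    return false;

  if (pic)
    return false;

  if (sym->small_data)
    return false;

  return sym->default_answer;
}
```

Each early return corresponds to one case named in the commit message;
a symbol that matches none of them gets whatever the generic hook
decides.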


* Re: [PATCH 5/7] [ARC] Update various patterns
  2017-07-24  8:46 ` [PATCH 5/7] [ARC] Update various patterns Claudiu Zissulescu
@ 2017-08-31 19:33   ` Andrew Burgess
  2017-09-01 11:50     ` Claudiu Zissulescu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Burgess @ 2017-08-31 19:33 UTC (permalink / raw)
  To: Claudiu Zissulescu; +Cc: gcc-patches, Francois.Bedard, sandra

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:57 +0200]:

> From: claziss <claziss@synopsys.com>
> 
> gcc/
> 2017-04-25  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* config/arc/arc.md (movqi_insn): Add stores to save constant long
> 	immediates.
> 	(movhi_insn): Update store instruction constraints which save
> 	6-bit short immediates.
> 	(movsi_insn): Consider also short scaled load operations.
> 	(zero_extendhisi2_i): Use Usd constraint instead of T.
> 	(extendhisi2_i): Add q constraint.
> 	(arc_clzsi2): Add type and length attributes.
> 	(arc_ctzsi2): Likewise.
> 	* config/arc/constraints.md (Usc): Update constraint, the
> 	assembler can parse two relocations for a single instruction.
> 
> gcc/testsuite/
> 2017-04-25  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* gcc.target/arc/arc.exp: Test also cpp files.
> 	* gcc.target/arc/tdelay_slots.cpp: New test.

That looks fine to me.

Thanks,
Andrew



> ---
>  gcc/config/arc/arc.md                         | 54 ++++++++++++++-------------
>  gcc/config/arc/constraints.md                 |  6 +--
>  gcc/testsuite/gcc.target/arc/arc.exp          |  2 +-
>  gcc/testsuite/gcc.target/arc/tdelay_slots.cpp | 42 +++++++++++++++++++++
>  4 files changed, 75 insertions(+), 29 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/tdelay_slots.cpp
> 
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index f595da7..04a1447 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -618,8 +618,8 @@
>  ; The iscompact attribute allows the epilogue expander to know for which
>  ; insns it should lengthen the return insn.
>  (define_insn "*movqi_insn"
> -  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,h, w,Rcq,  S,!*x,  r,r, Ucm,m,???m,Usc")
> -	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,i,?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3"))]
> +  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,h, w,Rcq,  S,!*x,  r,r, Ucm,m,???m,  m,Usc")
> +	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,i,?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
>    "register_operand (operands[0], QImode)
>     || register_operand (operands[1], QImode)"
>    "@
> @@ -641,11 +641,12 @@
>     xstb%U0 %1,%0
>     stb%U0%V0 %1,%0
>     stb%U0%V0 %1,%0
> +   stb%U0%V0 %1,%0
>     stb%U0%V0 %1,%0"
> -  [(set_attr "type" "move,move,move,move,move,move,move,move,move,move,load,store,load,load,load,store,store,store,store")
> -   (set_attr "iscompact" "maybe,maybe,maybe,true,true,false,false,false,maybe_limm,false,true,true,true,false,false,false,false,false,false")
> -   (set_attr "predicable" "yes,no,yes,no,no,yes,no,yes,yes,yes,no,no,no,no,no,no,no,no,no")
> -   (set_attr "cpu_facility" "av1,av1,av1,av2,av2,*,*,*,*,*,*,*,*,*,*,*,*,*,*")])
> +  [(set_attr "type" "move,move,move,move,move,move,move,move,move,move,load,store,load,load,load,store,store,store,store,store")
> +   (set_attr "iscompact" "maybe,maybe,maybe,true,true,false,false,false,maybe_limm,false,true,true,true,false,false,false,false,false,false,false")
> +   (set_attr "predicable" "yes,no,yes,no,no,yes,no,yes,yes,yes,no,no,no,no,no,no,no,no,no,no")
> +   (set_attr "cpu_facility" "av1,av1,av1,av2,av2,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*")])
>  
>  (define_expand "movhi"
>    [(set (match_operand:HI 0 "move_dest_operand" "")
> @@ -654,8 +655,8 @@
>    "if (prepare_move_operands (operands, HImode)) DONE;")
>  
>  (define_insn "*movhi_insn"
> -  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,Rcq#q,h, w,Rcq,  S,  r,r, Ucm,m,???m,VUsc,VUsc")
> -	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,    i,i,?i,  T,Rcq,Ucm,m,?Rac,c,?Rac, Cm3,i"))]
> +  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,Rcq#q,h, w,Rcq,  S,  r,r, Ucm,m,???m,  m,VUsc")
> +	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,    i,i,?i,  T,Rcq,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
>    "register_operand (operands[0], HImode)
>     || register_operand (operands[1], HImode)
>     || (CONSTANT_P (operands[1])
> @@ -706,8 +707,8 @@
>  ; insns it should lengthen the return insn.
>  ; N.B. operand 1 of alternative 7 expands into pcl,symbol@gotpc .
>  (define_insn "*movsi_insn"                      ;   0     1     2     3    4  5 6   7   8   9   10  11  12  13    14  15   16  17  18     19     20  21  22    23    24 25 26    27 28  29   30   31
> -  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,  w,  w,  w,  w,???w, ?w,  w,Rcq#q,  h,   w,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,VUsc,VUsc")
> -	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,Crr,Clo,Chi,Cbi,?Rac,Cpc,Clb, ?Cal,Cal,?Cal,  T,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac, Cm3, C32"))]
> +  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,  w,  w,  w,  w,???w, ?w,  w,Rcq#q,  h,   w,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,  m,VUsc")
> +	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,Crr,Clo,Chi,Cbi,?Rac,Cpc,Clb, ?Cal,Cal,?Cal,Uts,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac,Cm3, C32"))]
>    "register_operand (operands[0], SImode)
>     || register_operand (operands[1], SImode)
>     || (CONSTANT_P (operands[1])
> @@ -730,10 +731,10 @@
>     mov%? %0,%1		;11
>     add %0,%S1		;12
>     add %0,pcl,%1@pcl    ;13
> -   mov%? %0,%S1%&	;14
> -   mov%? %0,%S1		;15
> -   mov%? %0,%S1		;16
> -   ld%? %0,%1%&		;17
> +   mov%? %0,%1  	;14
> +   mov%? %0,%1		;15
> +   mov%? %0,%1		;16
> +   ld%?%U1 %0,%1	;17
>     st%? %1,%0%&		;18
>     * return arc_short_long (insn, \"push%? %1%&\", \"st%U0 %1,%0%&\");
>     * return arc_short_long (insn, \"pop%? %0%&\",  \"ld%U1 %0,%1%&\");
> @@ -747,13 +748,13 @@
>     st%U0%V0 %1,%0	;28
>     st%U0%V0 %1,%0	;29
>     st%U0%V0 %1,%0	;30
> -   st%U0%V0 %S1,%0	;31"
> +   st%U0%V0 %1,%0	;31"
>     ;                         0     1     2     3    4    5      6       7           8     9    10     11    12    13           14        15    16   17    18    19   20    21    22   23  24    25    26    27    28    29   30   31
>    [(set_attr "type"       "move, move, move,move,move, move, move,two_cycle_core,shift,shift,shift, move,binary,binary,      move,      move, move,load,store,store,load,load, load,load,load, load,store,store,store,store,store,store")
>     (set_attr "iscompact" "maybe,maybe,maybe,true,true,false,false,         false,false,false,false,false, false, false,maybe_limm,maybe_limm,false,true, true, true,true,true,false,true,true,false,false, true,false,false,false,false")
>     ; Use default length for iscompact to allow for COND_EXEC.  But set length
>     ; of Crr to 4.
> -   (set_attr "length" "*,*,*,*,*,4,4,4,4,4,4,4,8,8,*,*,*,*,*,*,*,*,4,*,4,*,*,*,*,*,4,8")
> +   (set_attr "length" "*,*,*,*,*,4,4,4,4,4,4,4,8,8,*,*,*,*,*,*,*,*,4,*,4,*,*,*,*,*,*,8")
>     (set_attr "predicable" "yes,no,yes,no,no,yes,no,no,no,no,no,yes,no,no,yes,yes,yes,no,no,no,no,no,no,no,no,no,no,no,no,no,no,no")
>     (set_attr "cpu_facility" "av1,av1,av1,av2,av2,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,av2,av2,*,*,av2,*,*,av2,*")])
>  
> @@ -1634,7 +1635,7 @@
>  )
>  
>  (define_insn "*zero_extendqisi2_ac"
> -  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcq,Rcq#q,Rcw,w,qRcq,!*x,r,r")
> +  [(set (match_operand:SI 0 "dest_reg_operand"    "=Rcq,Rcq#q,Rcw,w,qRcq,!*x,r,r")
>  	(zero_extend:SI (match_operand:QI 1 "nonvol_nonimm_operand" "0,Rcq#q,0,c,T,Usd,Ucm,m")))]
>    ""
>    "@
> @@ -1659,19 +1660,19 @@
>  
>  (define_insn "*zero_extendhisi2_i"
>    [(set (match_operand:SI 0 "dest_reg_operand" "=Rcq,q,Rcw,w,!x,Rcqq,r,r")
> -	(zero_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "0,q,0,c,Usd,Usd,Ucm,m")))]
> +	(zero_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "0,q,0,c,Usd,T,Ucm,m")))]
>    ""
>    "@
>     ext%_%? %0,%1%&
>     ext%_%? %0,%1%&
>     bmsk%? %0,%1,15
>     ext%_ %0,%1
> -   ld%_%? %0,%1%&
> -   ld%_%U1 %0,%1
> +   ld%_%? %0,%1
> +   ld%_%? %0,%1
>     * return TARGET_EM ? \"xldh%U1%V1 %0,%1\" : \"xldw%U1 %0,%1\";
>     ld%_%U1%V1 %0,%1"
>    [(set_attr "type" "unary,unary,unary,unary,load,load,load,load")
> -   (set_attr "iscompact" "maybe,true,false,false,true,false,false,false")
> +   (set_attr "iscompact" "maybe,true,false,false,true,true,false,false")
>     (set_attr "predicable" "no,no,yes,no,no,no,no,no")])
>  
>  
> @@ -1726,7 +1727,7 @@
>  )
>  
>  (define_insn "*extendhisi2_i"
> -  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,w,Rcq,r,r")
> +  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,w,Rcqq,r,r")
>  	(sign_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "Rcqq,c,Ucd,Uex,m")))]
>    ""
>    "@
> @@ -4524,7 +4525,9 @@
>         gen_rtx_GE (VOIDmode, gen_rtx_REG (CC_ZNmode, CC_REG), const0_rtx),
>         gen_rtx_SET (operands[0], plus_constant (SImode, operands[0], 1))));
>    DONE;
> -})
> +}
> +[(set_attr "type" "unary")
> + (set_attr "length" "12")])
>  
>  (define_expand "ctzsi2"
>    [(match_operand:SI 0 "register_operand" "")
> @@ -4567,8 +4570,9 @@
>         gen_rtx_SET (operands[0], gen_rtx_MINUS (SImode, GEN_INT (31),
>  						operands[0]))));
>    DONE;
> -})
> -
> +}
> +[(set_attr "type" "unary")
> + (set_attr "length" "20")])
>  
>  (define_insn "swap"
>    [(set (match_operand:SI  0 "dest_reg_operand" "=w,w,w")
> diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md
> index 0ad318c..b8a4a90 100644
> --- a/gcc/config/arc/constraints.md
> +++ b/gcc/config/arc/constraints.md
> @@ -357,13 +357,13 @@
>     (and (match_code "mem")
>  	(match_test "compact_sda_memory_operand (op, VOIDmode, true)")))
>  
> +; Usc constraint is only used for storing long constants, hence we can
> +; have only [b,s9] and [b] types of addresses.
>  (define_memory_constraint "Usc"
>    "@internal
>     A valid memory operand for storing constants"
>    (and (match_code "mem")
> -       (match_test "!CONSTANT_P (XEXP (op,0))")
> -;; ??? the assembler rejects stores of immediates to small data.
> -       (match_test "!compact_sda_memory_operand (op, VOIDmode, false)")))
> +       (match_test "!CONSTANT_P (XEXP (op,0))")))
>  
>  (define_constraint "Us<"
>    "@internal
> diff --git a/gcc/testsuite/gcc.target/arc/arc.exp b/gcc/testsuite/gcc.target/arc/arc.exp
> index 00b5c33..5bb09d8 100644
> --- a/gcc/testsuite/gcc.target/arc/arc.exp
> +++ b/gcc/testsuite/gcc.target/arc/arc.exp
> @@ -104,7 +104,7 @@ if ![info exists DEFAULT_CFLAGS] then {
>  dg-init
>  
>  # Main loop.
> -dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
> +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.{\[cS\],cpp}]] \
>  	"" $DEFAULT_CFLAGS
>  
>  # All done.
> diff --git a/gcc/testsuite/gcc.target/arc/tdelay_slots.cpp b/gcc/testsuite/gcc.target/arc/tdelay_slots.cpp
> new file mode 100644
> index 0000000..3a6d379
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/tdelay_slots.cpp
> @@ -0,0 +1,42 @@
> +/* { dg-do assemble } */
> +/* { dg-skip-if "" { ! { clmcpu } } } */
> +/* { dg-options "-O2 -mcpu=em" } */
> +
> +template <class> struct A;
> +int a;
> +template <> struct A<char> {
> +  typedef int int_type;
> +  static int_type eof();
> +};
> +template <> struct A<wchar_t> {
> +  typedef int int_type;
> +  static int_type eof() { return -1; }
> +};
> +class basic_streambuf {
> +public:
> +  virtual ~basic_streambuf();
> +};
> +class B {
> +  void tie();
> +  class C {
> +    C();
> +  };
> +};
> +template <typename _CharT, typename _Traits = A<_CharT>>
> +class D : basic_streambuf {
> +  typedef _Traits traits_type;
> +  typename traits_type::int_type _M_unget_buf;
> +
> +public:
> +  D(void *) : _M_unget_buf(traits_type::eof()) {}
> +};
> +
> +extern D<wchar_t> b;
> +B c;
> +void *operator new(unsigned, void *p2) { return p2; }
> +
> +B::C::C() {
> +  new D<char>(&a);
> +  c.tie();
> +  new (&b) D<wchar_t>(&a);
> +}
> -- 
> 1.9.1
> 


* Re: [PATCH 7/7] [ARC] Fix errors in arc_ifcvt.
  2017-07-24  8:46 ` [PATCH 7/7] [ARC] Fix errors in arc_ifcvt Claudiu Zissulescu
@ 2017-08-31 20:08   ` Andrew Burgess
  2017-09-01 11:50     ` Claudiu Zissulescu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Burgess @ 2017-08-31 20:08 UTC (permalink / raw)
  To: Claudiu Zissulescu; +Cc: gcc-patches, Francois.Bedard, sandra

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:59 +0200]:

> From: claziss <claziss@synopsys.com>
> 
> The arc_ifcvt procedure is removing a label even when it is used by
> another jump.  This patch fixes dg.exp/pr31507-1.c.
> 
> gcc/
> 2017-07-10  Claudiu Zissulescu  <claziss@synopsys.com>
> 
> 	* config/arc/arc.c (arc_ifcvt): Remove use of merge_blocks call.
> 	(arc_ccfsm_advance): Fix checking for delay slots.
> 	(arc_reorg): Add rtl dump after each call to arc_ifcvt.

Looks good.

Thanks,
Andrew




> ---
>  gcc/config/arc/arc.c | 42 +++++++++++-------------------------------
>  1 file changed, 11 insertions(+), 31 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 0f9b553..f7a2b61 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -4429,12 +4429,10 @@ arc_ccfsm_advance (rtx_insn *insn, struct arc_ccfsm *state)
>  
>        /* If this is a non-annulled branch with a delay slot, there is
>  	 no need to conditionalize the delay slot.  */
> -      if (NEXT_INSN (PREV_INSN (insn)) != insn
> +      if ((GET_CODE (PATTERN (NEXT_INSN (PREV_INSN (insn)))) == SEQUENCE)
>  	  && state->state == 0 && !INSN_ANNULLED_BRANCH_P (insn))
>  	{
>  	  this_insn = NEXT_INSN (this_insn);
> -	  gcc_assert (NEXT_INSN (NEXT_INSN (PREV_INSN (start_insn)))
> -		      == NEXT_INSN (this_insn));
>  	}
>        /* See how many insns this branch skips, and what kind of insns.  If all
>  	 insns are okay, and the label or unconditional branch to the same
> @@ -7547,6 +7545,12 @@ arc_reorg (void)
>  	  arc_ifcvt ();
>  	  unsigned int flags = pass_data_arc_ifcvt.todo_flags_finish;
>  	  df_finish_pass ((flags & TODO_df_verify) != 0);
> +
> +	  if (dump_file)
> +	    {
> +	      fprintf (dump_file, ";; After if conversion:\n\n");
> +	      print_rtl (dump_file, get_insns ());
> +	    }
>  	}
>  
>        /* Call shorten_branches to calculate the insn lengths.  */
> @@ -8998,7 +9002,6 @@ static unsigned
>  arc_ifcvt (void)
>  {
>    struct arc_ccfsm *statep = &cfun->machine->ccfsm_current;
> -  basic_block merge_bb = 0;
>  
>    memset (statep, 0, sizeof *statep);
>    for (rtx_insn *insn = get_insns (); insn; insn = next_insn (insn))
> @@ -9008,20 +9011,14 @@ arc_ifcvt (void)
>        switch (statep->state)
>  	{
>  	case 0:
> -	  if (JUMP_P (insn))
> -	    merge_bb = 0;
>  	  break;
>  	case 1: case 2:
>  	  {
>  	    /* Deleted branch.  */
> -	    gcc_assert (!merge_bb);
> -	    merge_bb = BLOCK_FOR_INSN (insn);
> -	    basic_block succ_bb
> -	      = BLOCK_FOR_INSN (NEXT_INSN (NEXT_INSN (PREV_INSN (insn))));
>  	    arc_ccfsm_post_advance (insn, statep);
>  	    gcc_assert (!IN_RANGE (statep->state, 1, 2));
>  	    rtx_insn *seq = NEXT_INSN (PREV_INSN (insn));
> -	    if (seq != insn)
> +	    if (GET_CODE (PATTERN (seq)) == SEQUENCE)
>  	      {
>  		rtx slot = XVECEXP (PATTERN (seq), 0, 1);
>  		rtx pat = PATTERN (slot);
> @@ -9035,18 +9032,10 @@ arc_ifcvt (void)
>  		  gcc_unreachable ();
>  		PUT_CODE (slot, NOTE);
>  		NOTE_KIND (slot) = NOTE_INSN_DELETED;
> -		if (merge_bb && succ_bb)
> -		  merge_blocks (merge_bb, succ_bb);
> -	      }
> -	    else if (merge_bb && succ_bb)
> -	      {
> -		set_insn_deleted (insn);
> -		merge_blocks (merge_bb, succ_bb);
>  	      }
>  	    else
>  	      {
> -		PUT_CODE (insn, NOTE);
> -		NOTE_KIND (insn) = NOTE_INSN_DELETED;
> +		set_insn_deleted (insn);
>  	      }
>  	    continue;
>  	  }
> @@ -9055,17 +9044,8 @@ arc_ifcvt (void)
>  	      && statep->target_label == CODE_LABEL_NUMBER (insn))
>  	    {
>  	      arc_ccfsm_post_advance (insn, statep);
> -	      basic_block succ_bb = BLOCK_FOR_INSN (insn);
> -	      if (merge_bb && succ_bb)
> -		merge_blocks (merge_bb, succ_bb);
> -	      else if (--LABEL_NUSES (insn) == 0)
> -		{
> -		  const char *name = LABEL_NAME (insn);
> -		  PUT_CODE (insn, NOTE);
> -		  NOTE_KIND (insn) = NOTE_INSN_DELETED_LABEL;
> -		  NOTE_DELETED_LABEL_NAME (insn) = name;
> -		}
> -	      merge_bb = 0;
> +	      if (--LABEL_NUSES (insn) == 0)
> +		delete_insn (insn);
>  	      continue;
>  	    }
>  	  /* Fall through.  */
> -- 
> 1.9.1
> 
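
The label handling that the patch ends up with can be read as plain
reference counting: a label goes away only when the last jump using it
has been converted.  The sketch below illustrates that idea and is not
GCC code; struct label_model and drop_label_use are hypothetical names,
with nuses playing the role of LABEL_NUSES and the deleted flag
standing in for the call to delete_insn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a code label and its use count.  */
struct label_model
{
  int nuses;     /* number of jumps still targeting the label */
  bool deleted;  /* set once the label has been removed */
};

/* Drop one use of LABEL, removing the label only when no other jump
   refers to it any more.  */
static void
drop_label_use (struct label_model *label)
{
  assert (label != NULL && label->nuses > 0);

  if (--label->nuses == 0)
    label->deleted = true;
}
```

With two jumps targeting the same label, converting the first one
leaves the label in place, which is the situation the commit message
describes.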


* Re: [PATCH 6/7] [ARC] Reimplement ZOL support.
  2017-07-24  8:46 ` [PATCH 6/7] [ARC] Reimplement ZOL support Claudiu Zissulescu
  2017-07-24 22:22   ` Sandra Loosemore
@ 2017-08-31 20:23   ` Andrew Burgess
  2017-09-01 11:49     ` Claudiu Zissulescu
  2017-09-01 12:36     ` [PATCH] [ARC][ZOL] Account for empty body loops Claudiu Zissulescu
  1 sibling, 2 replies; 27+ messages in thread
From: Andrew Burgess @ 2017-08-31 20:23 UTC (permalink / raw)
  To: Claudiu Zissulescu; +Cc: gcc-patches, Francois.Bedard, sandra

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-07-24 10:42:58 +0200]:

> From: claziss <claziss@synopsys.com>
> 
> 2017-05-22  Claudiu Zissulescu <claziss@synopsys.com>
> 
> 	* config/arc/arc-c.c (__ARC_LPC_WIDTH__): Add builtin define.
> 	* config/arc/arc.c (ARC_MAX_LOOP_LENGTH): Define.
> 	(arc_conditional_register_usage): Remove ARC600 lp_count
> 	exception.
> 	(arc_file_start): Emit Tag_ARC_CPU_variation.
> 	(arc_can_use_doloop_p): New conditions to use ZOLs.
> 	(hwloop_fail): New function.
> 	(hwloop_optimize): Likewise.
> 	(hwloop_pattern_reg): Likewise.
> 	(arc_doloop_hooks): New struct, to be used with reorg_loops.
> 	(arc_reorg_loops): New function, calls reorg_loops.
> 	(arc_reorg): Call arc_reorg_loops.  Remove old ZOL handling.
> 	(arc600_corereg_hazard): Remove ZOL checking, case handled by
> 	hwloop_optimize.
> 	(arc_loop_hazard): Remove function, functionality moved into
> 	hwloop_optimize.
> 	(arc_hazard): Remove arc_loop_hazard call.
> 	(arc_adjust_insn_length): Remove ZOL handling, functionality moved
> 	into hwloop_optimize.
> 	(arc_label_align): Remove ZOL handling.
> 	* config/arc/arc.h (LOOP_ALIGN): Changed to 0.
> 	* config/arc/arc.md (doloop_begin): Remove pattern.
> 	(doloop_begin_i): Likewise.
> 	(doloop_end_i): Likewise.
> 	(doloop_fallback): Likewise.
> 	(doloop_fallback_m): Likewise.
> 	(doloop_end): Reimplement expand.
> 	(arc_lp): New pattern for LP instruction.
> 	(loop_end): New pattern.
> 	(loop_fail): Likewise.
> 	(decrement_and_branch_until_zero): Likewise.
> 	* config/arc/arc.opt (mlpc-width): New option.
> 	* doc/invoke.texi (mlpc-width): Document option.
> 
> testsuite/
> 2017-05-22  Claudiu Zissulescu <claziss@synopsys.com>
> 
> 	* gcc.target/arc/loop-1.c: Update test.

I'm happy with this if the doc is updated in line with Sandra's
suggestions.

Thanks,
Andrew



> ---
>  gcc/config/arc/arc-c.c                |   2 +
>  gcc/config/arc/arc.c                  | 726 ++++++++++++++++++----------------
>  gcc/config/arc/arc.h                  |  10 +-
>  gcc/config/arc/arc.md                 | 419 ++++++--------------
>  gcc/config/arc/arc.opt                |  25 ++
>  gcc/config/arc/predicates.md          |   2 +
>  gcc/doc/invoke.texi                   |  14 +-
>  gcc/testsuite/gcc.target/arc/loop-1.c |  49 +--
>  8 files changed, 561 insertions(+), 686 deletions(-)
>  mode change 100644 => 100755 gcc/testsuite/gcc.target/arc/loop-1.c
> 
> diff --git a/gcc/config/arc/arc-c.c b/gcc/config/arc/arc-c.c
> index de877a1..44ff338 100644
> --- a/gcc/config/arc/arc-c.c
> +++ b/gcc/config/arc/arc-c.c
> @@ -62,6 +62,8 @@ arc_cpu_cpp_builtins (cpp_reader * pfile)
>    builtin_define_with_int_value ("__ARC_TLS_REGNO__",
>  				 arc_tp_regno);
>  
> +  builtin_define_with_int_value ("__ARC_LPC_WIDTH__", arc_lpcwidth);
> +
>    builtin_define (TARGET_BIG_ENDIAN
>  		  ? "__BIG_ENDIAN__" : "__LITTLE_ENDIAN__");
>    if (TARGET_BIG_ENDIAN)
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index c94b187..0f9b553 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -64,11 +64,15 @@ along with GCC; see the file COPYING3.  If not see
>  #include "rtl-iter.h"
>  #include "alias.h"
>  #include "opts.h"
> +#include "hw-doloop.h"
>  
>  /* Which cpu we're compiling for (ARC600, ARC601, ARC700).  */
>  static char arc_cpu_name[10] = "";
>  static const char *arc_cpu_string = arc_cpu_name;
>  
> +/* Maximum size of a loop.  */
> +#define ARC_MAX_LOOP_LENGTH 4095
> +
>  /* ??? Loads can handle any constant, stores can only handle small ones.  */
>  /* OTOH, LIMMs cost extra, so their usefulness is limited.  */
>  #define RTX_OK_FOR_OFFSET_P(MODE, X) \
> @@ -1708,18 +1712,7 @@ arc_conditional_register_usage (void)
>  	   i <= ARC_LAST_SIMD_DMA_CONFIG_REG; i++)
>  	reg_alloc_order [i] = i;
>      }
> -  /* For ARC600, lp_count may not be read in an instruction
> -     following immediately after another one setting it to a new value.
> -     There was some discussion on how to enforce scheduling constraints for
> -     processors with missing interlocks on the gcc mailing list:
> -     http://gcc.gnu.org/ml/gcc/2008-05/msg00021.html .
> -     However, we can't actually use this approach, because for ARC the
> -     delay slot scheduling pass is active, which runs after
> -     machine_dependent_reorg.  */
> -  if (TARGET_ARC600)
> -    CLEAR_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], LP_COUNT);
> -  else if (!TARGET_LP_WR_INTERLOCK)
> -    fixed_regs[LP_COUNT] = 1;
> +
>    for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>      if (!call_used_regs[regno])
>        CLEAR_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], regno);
> @@ -6998,28 +6991,33 @@ arc_pass_by_reference (cumulative_args_t ca_v ATTRIBUTE_UNUSED,
>  /* Implement TARGET_CAN_USE_DOLOOP_P.  */
>  
>  static bool
> -arc_can_use_doloop_p (const widest_int &iterations, const widest_int &,
> +arc_can_use_doloop_p (const widest_int &,
> +		      const widest_int &iterations_max,
>  		      unsigned int loop_depth, bool entered_at_top)
>  {
> -  if (loop_depth > 1)
> +  /* Considering limitations in the hardware, only use doloop
> +     for innermost loops which must be entered from the top.  */
> +  if (loop_depth > 1 || !entered_at_top)
>      return false;
> -  /* Setting up the loop with two sr instructions costs 6 cycles.  */
> -  if (TARGET_ARC700
> -      && !entered_at_top
> -      && wi::gtu_p (iterations, 0)
> -      && wi::leu_p (iterations, flag_pic ? 6 : 3))
> +
> +  /* Check for lp_count width boundary.  */
> +  if (arc_lpcwidth != 32
> +      && (wi::gtu_p (iterations_max, ((1 << arc_lpcwidth) - 1))
> +	  || wi::eq_p (iterations_max, 0)))
>      return false;
>    return true;
>  }
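
To make the lp_count boundary check above easier to follow, here is a
stand-alone sketch of the same arithmetic.  It is a model, not the GCC
code: fits_lp_count and lpc_width are hypothetical names standing in
for the hook and arc_lpcwidth, plain 64-bit integers replace
widest_int, and the loop_depth and entered_at_top tests are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if a loop with upper bound ITERATIONS_MAX passes the
   width check for an lp_count register of LPC_WIDTH bits.  */
static bool
fits_lp_count (uint64_t iterations_max, unsigned lpc_width)
{
  assert (lpc_width >= 1 && lpc_width <= 32);

  /* The patch skips the check entirely for a 32-bit lp_count.  */
  if (lpc_width == 32)
    return true;

  /* Largest count representable in LPC_WIDTH bits.  */
  uint64_t limit = (UINT64_C (1) << lpc_width) - 1;

  /* The patch rejects a zero bound as well as one that is too big.  */
  return iterations_max != 0 && iterations_max <= limit;
}
```

For an 8-bit lp_count, for example, a bound of 255 is accepted while
256 and 0 are both rejected.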
>  
> -/* NULL if INSN insn is valid within a low-overhead loop.
> -   Otherwise return why doloop cannot be applied.  */
> +/* NULL if INSN insn is valid within a low-overhead loop.  Otherwise
> +   return why doloop cannot be applied.  */
>  
>  static const char *
>  arc_invalid_within_doloop (const rtx_insn *insn)
>  {
>    if (CALL_P (insn))
>      return "Function call in the loop.";
> +
> +  /* FIXME! add here all the ZOL exceptions.  */
>    return NULL;
>  }
>  
> @@ -7118,6 +7116,368 @@ workaround_arc_anomaly (void)
>      }
>  }
>  
> +/* A callback for the hw-doloop pass.  Called when a loop we have discovered
> +   turns out not to be optimizable; we have to split the loop_end pattern into
> +   a subtract and a test.  */
> +
> +static void
> +hwloop_fail (hwloop_info loop)
> +{
> +  rtx test;
> +  rtx insn = loop->loop_end;
> +
> +  if (TARGET_V2
> +      && (loop->length && (loop->length <= ARC_MAX_LOOP_LENGTH))
> +      && REG_P (loop->iter_reg))
> +    {
> +      /* TARGET_V2 has dbnz instructions.  */
> +      test = gen_dbnz (loop->iter_reg, loop->start_label);
> +      insn = emit_jump_insn_before (test, loop->loop_end);
> +    }
> +  else if (REG_P (loop->iter_reg) && (REGNO (loop->iter_reg) == LP_COUNT))
> +    {
> +      /* We have the lp_count as loop iterator, try to use it.  */
> +      emit_insn_before (gen_loop_fail (), loop->loop_end);
> +      test = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_ZNmode, CC_REG),
> +			 const0_rtx);
> +      test = gen_rtx_IF_THEN_ELSE (VOIDmode, test,
> +				   gen_rtx_LABEL_REF (Pmode, loop->start_label),
> +				   pc_rtx);
> +      insn = emit_jump_insn_before (gen_rtx_SET (pc_rtx, test),
> +				     loop->loop_end);
> +    }
> +  else
> +    {
> +      emit_insn_before (gen_addsi3 (loop->iter_reg,
> +				    loop->iter_reg,
> +				    constm1_rtx),
> +			loop->loop_end);
> +      test = gen_rtx_NE (VOIDmode, loop->iter_reg, const0_rtx);
> +      insn = emit_jump_insn_before (gen_cbranchsi4 (test,
> +						    loop->iter_reg,
> +						    const0_rtx,
> +						    loop->start_label),
> +				    loop->loop_end);
> +    }
> +  JUMP_LABEL (insn) = loop->start_label;
> +  LABEL_NUSES (loop->start_label)++;
> +  delete_insn (loop->loop_end);
> +}
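
The three-way choice made by hwloop_fail above can be summarised with
a small model.  This is a sketch under assumptions, not the GCC code:
struct loop_model, its fields and choose_fallback are hypothetical, and
the enum values only name the strategy instead of emitting any insns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors ARC_MAX_LOOP_LENGTH from the patch.  */
#define MAX_LOOP_LENGTH 4095

enum fallback
{
  FALLBACK_DBNZ,            /* single dbnz instruction */
  FALLBACK_LP_COUNT_TEST,   /* loop_fail pattern plus conditional jump */
  FALLBACK_ADD_AND_BRANCH   /* add -1, then compare and branch */
};

/* Hypothetical summary of the loop properties hwloop_fail looks at.  */
struct loop_model
{
  bool target_v2;         /* stands in for TARGET_V2 */
  unsigned length;        /* loop length; the patch requires nonzero */
  bool iter_is_reg;       /* iterator is a register */
  bool iter_is_lp_count;  /* that register is lp_count */
};

/* Pick the replacement for a loop_end that cannot become a ZOL, in
   the same order as the if/else chain in the patch.  */
static enum fallback
choose_fallback (const struct loop_model *loop)
{
  assert (loop != NULL);

  if (loop->target_v2
      && loop->length != 0
      && loop->length <= MAX_LOOP_LENGTH
      && loop->iter_is_reg)
    return FALLBACK_DBNZ;

  if (loop->iter_is_reg && loop->iter_is_lp_count)
    return FALLBACK_LP_COUNT_TEST;

  return FALLBACK_ADD_AND_BRANCH;
}
```

Note that the dbnz case is tried first, so on a V2 target a short loop
iterating on lp_count still takes the dbnz route.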
> +
> +/* Optimize LOOP.  */
> +
> +static bool
> +hwloop_optimize (hwloop_info loop)
> +{
> +  int i;
> +  edge entry_edge;
> +  basic_block entry_bb, bb;
> +  rtx iter_reg, end_label;
> +  rtx_insn *insn, *seq, *entry_after, *last_insn;
> +  unsigned int length;
> +  bool need_fix = false;
> +  rtx lp_reg = gen_rtx_REG (SImode, LP_COUNT);
> +
> +  if (loop->depth > 1)
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d is not innermost\n",
> +                 loop->loop_no);
> +      return false;
> +    }
> +
> +  if (!loop->incoming_dest)
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d has more than one entry\n",
> +                 loop->loop_no);
> +      return false;
> +    }
> +
> +  if (loop->incoming_dest != loop->head)
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d is not entered from head\n",
> +                 loop->loop_no);
> +      return false;
> +    }
> +
> +  if (loop->has_call || loop->has_asm)
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d has invalid insn\n",
> +                 loop->loop_no);
> +      return false;
> +    }
> +
> +  /* Scan all the blocks to make sure they don't use iter_reg.  */
> +  if (loop->iter_reg_used || loop->iter_reg_used_outside)
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d uses iterator\n",
> +                 loop->loop_no);
> +      return false;
> +    }
> +
> +  /* Check if start_label appears before doloop_end.  */
> +  length = 0;
> +  for (insn = loop->start_label;
> +       insn && insn != loop->loop_end;
> +       insn = NEXT_INSN (insn))
> +    length += NONDEBUG_INSN_P (insn) ? get_attr_length (insn) : 0;
> +
> +  if (!insn)
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d start_label not before loop_end\n",
> +                 loop->loop_no);
> +      return false;
> +    }
> +
> +  loop->length = length;
> +  if (loop->length > ARC_MAX_LOOP_LENGTH)
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
> +      return false;
> +    }
> +  else if (!loop->length)
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, ";; loop %d is empty\n", loop->loop_no);
> +      return false;
> +    }
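
The length accounting just above (sum the lengths of the counted
insns, then reject loops that are too long or empty) can be sketched
on its own as follows.  The names here are hypothetical and not part
of the patch: check_loop_length models the check, the counted array
stands in for NONDEBUG_INSN_P, and the lengths array for
get_attr_length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors ARC_MAX_LOOP_LENGTH from the patch.  */
#define MAX_LOOP_LENGTH 4095

enum length_verdict
{
  LOOP_LENGTH_OK,
  LOOP_TOO_LONG,
  LOOP_EMPTY
};

/* Sum the lengths of the N insns whose COUNTED flag is set, store the
   sum in *TOTAL and classify it the way the patch does.  */
static enum length_verdict
check_loop_length (const unsigned *lengths, const bool *counted,
		   size_t n, unsigned *total)
{
  assert (lengths != NULL && counted != NULL && total != NULL);

  unsigned sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += counted[i] ? lengths[i] : 0;

  *total = sum;
  if (sum > MAX_LOOP_LENGTH)
    return LOOP_TOO_LONG;
  if (sum == 0)
    return LOOP_EMPTY;
  return LOOP_LENGTH_OK;
}
```

A body of exactly 4095 bytes is still accepted; one more byte tips it
into the "too long" case.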
> +
> +  /* Check if we use a register or not.  */
> +  if (!REG_P (loop->iter_reg))
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d iterator is MEM\n",
> +                 loop->loop_no);
> +      return false;
> +    }
> +
> +  /* Check if loop register is lpcount.  */
> +  if (REG_P (loop->iter_reg) && (REGNO (loop->iter_reg)) != LP_COUNT)
> +    {
> +      if (dump_file)
> +        fprintf (dump_file, ";; loop %d doesn't use lp_count as loop"
> +		 " iterator\n",
> +                 loop->loop_no);
> +      /* This loop doesn't use the lp_count, check though if we can
> +	 fix it.  */
> +      if (TEST_HARD_REG_BIT (loop->regs_set_in_loop, LP_COUNT)
> +	  /* In very unique cases we may have LP_COUNT alive.  */
> +	  || (loop->incoming_src
> +	      && REGNO_REG_SET_P (df_get_live_out (loop->incoming_src),
> +				  LP_COUNT)))
> +	return false;
> +      else
> +	need_fix = true;
> +    }
> +
> +  /* Check for control like instruction as the last instruction of a
> +     ZOL.  */
> +  bb = loop->tail;
> +  last_insn = PREV_INSN (loop->loop_end);
> +
> +  while (1)
> +    {
> +      for (; last_insn != BB_HEAD (bb);
> +	   last_insn = PREV_INSN (last_insn))
> +	if (NONDEBUG_INSN_P (last_insn))
> +	  break;
> +
> +      if (last_insn != BB_HEAD (bb))
> +	break;
> +
> +      if (single_pred_p (bb)
> +	  && single_pred_edge (bb)->flags & EDGE_FALLTHRU
> +	  && single_pred (bb) != ENTRY_BLOCK_PTR_FOR_FN (cfun))
> +	{
> +	  bb = single_pred (bb);
> +	  last_insn = BB_END (bb);
> +	  continue;
> +	}
> +      else
> +	{
> +	  last_insn = NULL;
> +	  break;
> +	}
> +    }
> +
> +  if (!last_insn)
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, ";; loop %d has no last instruction\n",
> +		 loop->loop_no);
> +      return false;
> +    }
> +
> +  if ((TARGET_ARC600_FAMILY || TARGET_HS)
> +      && INSN_P (last_insn)
> +      && (JUMP_P (last_insn) || CALL_P (last_insn)
> +	  || GET_CODE (PATTERN (last_insn)) == SEQUENCE
> +	  /* At this stage we can have (insn (clobber (mem:BLK
> +	     (reg)))) instructions, ignore them.  */
> +	  || (GET_CODE (PATTERN (last_insn)) != CLOBBER
> +	      && (get_attr_type (last_insn) == TYPE_BRCC
> +		  || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))))
> +    {
> +      if (loop->length + 2 > ARC_MAX_LOOP_LENGTH)
> +	{
> +	  if (dump_file)
> +	    fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
> +	  return false;
> +	}
> +      if (dump_file)
> +	fprintf (dump_file, ";; loop %d has a control like last insn;"
> +		 " add a nop\n",
> +		 loop->loop_no);
> +
> +      last_insn = emit_insn_after (gen_nopv (), last_insn);
> +    }
> +
> +  if (LABEL_P (last_insn))
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, ";; loop %d has a label as last insn;"
> +		 " add a nop\n",
> +		 loop->loop_no);
> +      last_insn = emit_insn_after (gen_nopv (), last_insn);
> +    }
> +  loop->last_insn = last_insn;
> +
> +  /* Get the loop iteration register.  */
> +  iter_reg = loop->iter_reg;
> +
> +  gcc_assert (REG_P (iter_reg));
> +
> +  entry_edge = NULL;
> +
> +  FOR_EACH_VEC_SAFE_ELT (loop->incoming, i, entry_edge)
> +    if (entry_edge->flags & EDGE_FALLTHRU)
> +      break;
> +
> +  if (entry_edge == NULL)
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, ";; loop %d has no fallthru edge jumping"
> +		 " into the loop\n",
> +		 loop->loop_no);
> +      return false;
> +    }
> +  /* The loop is good.  */
> +  end_label = gen_label_rtx ();
> +  loop->end_label = end_label;
> +
> +  /* Place the zero_cost_loop_start instruction before the loop.  */
> +  entry_bb = entry_edge->src;
> +
> +  start_sequence ();
> +
> +  if (need_fix)
> +    {
> +      /* The loop uses a R-register, but the lp_count is free, thus
> +	 use lp_count.  */
> +      emit_insn (gen_movsi (lp_reg, iter_reg));
> +      SET_HARD_REG_BIT (loop->regs_set_in_loop, LP_COUNT);
> +      iter_reg = lp_reg;
> +      if (dump_file)
> +	{
> +	  fprintf (dump_file, ";; fix loop %d to use lp_count\n",
> +		   loop->loop_no);
> +	}
> +    }
> +
> +  insn = emit_insn (gen_arc_lp (iter_reg,
> +				loop->start_label,
> +				loop->end_label));
> +
> +  seq = get_insns ();
> +  end_sequence ();
> +
> +  entry_after = BB_END (entry_bb);
> +  if (!single_succ_p (entry_bb) || vec_safe_length (loop->incoming) > 1
> +      || !entry_after)
> +    {
> +      basic_block new_bb;
> +      edge e;
> +      edge_iterator ei;
> +
> +      emit_insn_before (seq, BB_HEAD (loop->head));
> +      seq = emit_label_before (gen_label_rtx (), seq);
> +      new_bb = create_basic_block (seq, insn, entry_bb);
> +      FOR_EACH_EDGE (e, ei, loop->incoming)
> +        {
> +          if (!(e->flags & EDGE_FALLTHRU))
> +            redirect_edge_and_branch_force (e, new_bb);
> +          else
> +            redirect_edge_succ (e, new_bb);
> +        }
> +
> +      make_edge (new_bb, loop->head, 0);
> +    }
> +  else
> +    {
> +#if 0
> +      while (DEBUG_INSN_P (entry_after)
> +             || (NOTE_P (entry_after)
> +                 && NOTE_KIND (entry_after) != NOTE_INSN_BASIC_BLOCK
> +		 /* Make sure we don't split a call and its corresponding
> +		    CALL_ARG_LOCATION note.  */
> +                 && NOTE_KIND (entry_after) != NOTE_INSN_CALL_ARG_LOCATION))
> +        entry_after = NEXT_INSN (entry_after);
> +#endif
> +      entry_after = next_nonnote_insn_bb (entry_after);
> +
> +      gcc_assert (entry_after);
> +      emit_insn_before (seq, entry_after);
> +    }
> +
> +  delete_insn (loop->loop_end);
> +  /* Insert the loop end label after the last instruction of the
> +     loop.  */
> +  emit_label_after (end_label, loop->last_insn);
> +
> +  return true;
> +}
> +
> +/* A callback for the hw-doloop pass.  This function examines INSN; if
> +   it is a loop_end pattern we recognize, return the reg rtx for the
> +   loop counter.  Otherwise, return NULL_RTX.  */
> +
> +static rtx
> +hwloop_pattern_reg (rtx_insn *insn)
> +{
> +  rtx reg;
> +
> +  if (!JUMP_P (insn) || recog_memoized (insn) != CODE_FOR_loop_end)
> +    return NULL_RTX;
> +
> +  reg = SET_DEST (XVECEXP (PATTERN (insn), 0, 1));
> +  if (!REG_P (reg))
> +    return NULL_RTX;
> +  return reg;
> +}
> +
> +static struct hw_doloop_hooks arc_doloop_hooks =
> +{
> +  hwloop_pattern_reg,
> +  hwloop_optimize,
> +  hwloop_fail
> +};
> +
> +/* Run from machine_dependent_reorg, this pass looks for doloop_end insns
> +   and tries to rewrite the RTL of these loops so that proper ARC
> +   hardware loops are generated.  */
> +
> +static void
> +arc_reorg_loops (void)
> +{
> +  reorg_loops (true, &arc_doloop_hooks);
> +}
> +
>  static int arc_reorg_in_progress = 0;
>  
>  /* ARC's machince specific reorg function.  */
> @@ -7131,204 +7491,17 @@ arc_reorg (void)
>    long offset;
>    int changed;
>  
> -  workaround_arc_anomaly ();
> -
>    cfun->machine->arc_reorg_started = 1;
>    arc_reorg_in_progress = 1;
>  
> -  /* Link up loop ends with their loop start.  */
> -  {
> -    for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
> -      if (GET_CODE (insn) == JUMP_INSN
> -	  && recog_memoized (insn) == CODE_FOR_doloop_end_i)
> -	{
> -	  rtx_insn *top_label
> -	    = as_a <rtx_insn *> (XEXP (XEXP (SET_SRC (XVECEXP (PATTERN (insn), 0, 0)), 1), 0));
> -	  rtx num = GEN_INT (CODE_LABEL_NUMBER (top_label));
> -	  rtx_insn *lp, *prev = prev_nonnote_insn (top_label);
> -	  rtx_insn *lp_simple = NULL;
> -	  rtx_insn *next = NULL;
> -	  rtx op0 = XEXP (XVECEXP (PATTERN (insn), 0, 1), 0);
> -	  int seen_label = 0;
> -
> -	  for (lp = prev;
> -	       (lp && NONJUMP_INSN_P (lp)
> -		&& recog_memoized (lp) != CODE_FOR_doloop_begin_i);
> -	       lp = prev_nonnote_insn (lp))
> -	    ;
> -	  if (!lp || !NONJUMP_INSN_P (lp)
> -	      || dead_or_set_regno_p (lp, LP_COUNT))
> -	    {
> -	      HOST_WIDE_INT loop_end_id
> -		= INTVAL (XEXP (XVECEXP (PATTERN (insn), 0, 4), 0));
> +  compute_bb_for_insn ();
>  
> -	      for (prev = next = insn, lp = NULL ; prev || next;)
> -		{
> -		  if (prev)
> -		    {
> -		      if (NONJUMP_INSN_P (prev)
> -			  && recog_memoized (prev) == CODE_FOR_doloop_begin_i
> -			  && (INTVAL (XEXP (XVECEXP (PATTERN (prev), 0, 5), 0))
> -			      == loop_end_id))
> -			{
> -			  lp = prev;
> -			  break;
> -			}
> -		      else if (LABEL_P (prev))
> -			seen_label = 1;
> -		      prev = prev_nonnote_insn (prev);
> -		    }
> -		  if (next)
> -		    {
> -		      if (NONJUMP_INSN_P (next)
> -			  && recog_memoized (next) == CODE_FOR_doloop_begin_i
> -			  && (INTVAL (XEXP (XVECEXP (PATTERN (next), 0, 5), 0))
> -			      == loop_end_id))
> -			{
> -			  lp = next;
> -			  break;
> -			}
> -		      next = next_nonnote_insn (next);
> -		    }
> -		}
> -	      prev = NULL;
> -	    }
> -	  else
> -	    lp_simple = lp;
> -	  if (lp && !dead_or_set_regno_p (lp, LP_COUNT))
> -	    {
> -	      rtx begin_cnt = XEXP (XVECEXP (PATTERN (lp), 0 ,3), 0);
> -	      if (INTVAL (XEXP (XVECEXP (PATTERN (lp), 0, 4), 0)))
> -		/* The loop end insn has been duplicated.  That can happen
> -		   when there is a conditional block at the very end of
> -		   the loop.  */
> -		goto failure;
> -	      /* If Register allocation failed to allocate to the right
> -		 register, There is no point into teaching reload to
> -		 fix this up with reloads, as that would cost more
> -		 than using an ordinary core register with the
> -		 doloop_fallback pattern.  */
> -	      if ((true_regnum (op0) != LP_COUNT || !REG_P (begin_cnt))
> -	      /* Likewise, if the loop setup is evidently inside the loop,
> -		 we loose.  */
> -		  || (!lp_simple && lp != next && !seen_label))
> -		{
> -		  remove_insn (lp);
> -		  goto failure;
> -		}
> -	      /* It is common that the optimizers copy the loop count from
> -		 another register, and doloop_begin_i is stuck with the
> -		 source of the move.  Making doloop_begin_i only accept "l"
> -		 is nonsentical, as this then makes reload evict the pseudo
> -		 used for the loop end.  The underlying cause is that the
> -		 optimizers don't understand that the register allocation for
> -		 doloop_begin_i should be treated as part of the loop.
> -		 Try to work around this problem by verifying the previous
> -		 move exists.  */
> -	      if (true_regnum (begin_cnt) != LP_COUNT)
> -		{
> -		  rtx_insn *mov;
> -		  rtx set, note;
> +  df_analyze ();
>  
> -		  for (mov = prev_nonnote_insn (lp); mov;
> -		       mov = prev_nonnote_insn (mov))
> -		    {
> -		      if (!NONJUMP_INSN_P (mov))
> -			mov = 0;
> -		      else if ((set = single_set (mov))
> -			  && rtx_equal_p (SET_SRC (set), begin_cnt)
> -			  && rtx_equal_p (SET_DEST (set), op0))
> -			break;
> -		    }
> -		  if (mov)
> -		    {
> -		      XEXP (XVECEXP (PATTERN (lp), 0 ,3), 0) = op0;
> -		      note = find_regno_note (lp, REG_DEAD, REGNO (begin_cnt));
> -		      if (note)
> -			remove_note (lp, note);
> -		    }
> -		  else
> -		    {
> -		      remove_insn (lp);
> -		      goto failure;
> -		    }
> -		}
> -	      XEXP (XVECEXP (PATTERN (insn), 0, 4), 0) = num;
> -	      XEXP (XVECEXP (PATTERN (lp), 0, 4), 0) = num;
> -	      if (next == lp)
> -		XEXP (XVECEXP (PATTERN (lp), 0, 6), 0) = const2_rtx;
> -	      else if (!lp_simple)
> -		XEXP (XVECEXP (PATTERN (lp), 0, 6), 0) = const1_rtx;
> -	      else if (prev != lp)
> -		{
> -		  remove_insn (lp);
> -		  add_insn_after (lp, prev, NULL);
> -		}
> -	      if (!lp_simple)
> -		{
> -		  XEXP (XVECEXP (PATTERN (lp), 0, 7), 0)
> -		    = gen_rtx_LABEL_REF (Pmode, top_label);
> -		  add_reg_note (lp, REG_LABEL_OPERAND, top_label);
> -		  LABEL_NUSES (top_label)++;
> -		}
> -	      /* We can avoid tedious loop start / end setting for empty loops
> -		 be merely setting the loop count to its final value.  */
> -	      if (next_active_insn (top_label) == insn)
> -		{
> -		  rtx lc_set
> -		    = gen_rtx_SET (XEXP (XVECEXP (PATTERN (lp), 0, 3), 0),
> -				   const0_rtx);
> -
> -		  rtx_insn *lc_set_insn = emit_insn_before (lc_set, insn);
> -		  delete_insn (lp);
> -		  delete_insn (insn);
> -		  insn = lc_set_insn;
> -		}
> -	      /* If the loop is non-empty with zero length, we can't make it
> -		 a zero-overhead loop.  That can happen for empty asms.  */
> -	      else
> -		{
> -		  rtx_insn *scan;
> +  /* Doloop optimization.  */
> +  arc_reorg_loops ();
>  
> -		  for (scan = top_label;
> -		       (scan && scan != insn
> -			&& (!NONJUMP_INSN_P (scan) || !get_attr_length (scan)));
> -		       scan = NEXT_INSN (scan));
> -		  if (scan == insn)
> -		    {
> -		      remove_insn (lp);
> -		      goto failure;
> -		    }
> -		}
> -	    }
> -	  else
> -	    {
> -	      /* Sometimes the loop optimizer makes a complete hash of the
> -		 loop.  If it were only that the loop is not entered at the
> -		 top, we could fix this up by setting LP_START with SR .
> -		 However, if we can't find the loop begin were it should be,
> -		 chances are that it does not even dominate the loop, but is
> -		 inside the loop instead.  Using SR there would kill
> -		 performance.
> -		 We use the doloop_fallback pattern here, which executes
> -		 in two cycles on the ARC700 when predicted correctly.  */
> -	    failure:
> -	      if (!REG_P (op0))
> -		{
> -		  rtx op3 = XEXP (XVECEXP (PATTERN (insn), 0, 5), 0);
> -
> -		  emit_insn_before (gen_move_insn (op3, op0), insn);
> -		  PATTERN (insn)
> -		    = gen_doloop_fallback_m (op3, JUMP_LABEL (insn), op0);
> -		}
> -	      else
> -		XVEC (PATTERN (insn), 0)
> -		  = gen_rtvec (2, XVECEXP (PATTERN (insn), 0, 0),
> -			       XVECEXP (PATTERN (insn), 0, 1));
> -	      INSN_CODE (insn) = -1;
> -	    }
> -	}
> -    }
> +  workaround_arc_anomaly ();
>  
>  /* FIXME: should anticipate ccfsm action, generate special patterns for
>     to-be-deleted branches that have no delay slot and have at least the
> @@ -7866,11 +8039,11 @@ arc_register_move_cost (machine_mode,
>  	return 6;
>      }
>  
> -  /* The ARC700 stalls for 3 cycles when *reading* from lp_count.  */
> -  if (TARGET_ARC700
> -      && (from_class == LPCOUNT_REG || from_class == ALL_CORE_REGS
> -	  || from_class == WRITABLE_CORE_REGS))
> -    return 8;
> +  /* Using lp_count as scratch reg is a VERY bad idea.  */
> +  if (from_class == LPCOUNT_REG)
> +    return 1000;
> +  if (to_class == LPCOUNT_REG)
> +    return 6;
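
The new costs are asymmetric: copying out of lp_count is priced at 1000, copying into it at 6. Below is a minimal standalone sketch of just that rule, as I read the hunk. The enum, the function name and FALLBACK_COST are hypothetical stand-ins for illustration only; they are not the real arc_register_move_cost or its reg_class values.

```c
/* Hypothetical stand-ins for the two cases the hunk distinguishes.  */
enum sketch_class
{
  SKETCH_OTHER_REGS,   /* Any class other than LPCOUNT_REG.  */
  SKETCH_LPCOUNT_REG   /* Stands in for LPCOUNT_REG.  */
};

/* Assumed cost for moves that do not involve lp_count; the real
   function has further cases that are not modelled here.  */
#define FALLBACK_COST 2

/* Mirror the order of the two tests added by the patch: the source
   class is checked first, so a move out of lp_count always costs 1000,
   even if the destination is lp_count as well.  */
static int
sketch_lp_count_move_cost (enum sketch_class from, enum sketch_class to)
{
  if (from == SKETCH_LPCOUNT_REG)
    return 1000;
  if (to == SKETCH_LPCOUNT_REG)
    return 6;
  return FALLBACK_COST;
}
```

With these values the allocator should strongly avoid picking lp_count as the source of a move, while loading the iteration count into it stays moderately priced.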
>  
>    /* Force an attempt to 'mov Dy,Dx' to spill.  */
>    if ((TARGET_ARC700 || TARGET_EM) && TARGET_DPFP
> @@ -8312,14 +8485,6 @@ arc600_corereg_hazard (rtx_insn *pred, rtx_insn *succ)
>  {
>    if (!TARGET_ARC600)
>      return 0;
> -  /* If SUCC is a doloop_end_i with a preceding label, we must output a nop
> -     in front of SUCC anyway, so there will be separation between PRED and
> -     SUCC.  */
> -  if (recog_memoized (succ) == CODE_FOR_doloop_end_i
> -      && LABEL_P (prev_nonnote_insn (succ)))
> -    return 0;
> -  if (recog_memoized (succ) == CODE_FOR_doloop_begin_i)
> -    return 0;
>    if (GET_CODE (PATTERN (pred)) == SEQUENCE)
>      pred = as_a <rtx_sequence *> (PATTERN (pred))->insn (1);
>    if (GET_CODE (PATTERN (succ)) == SEQUENCE)
> @@ -8393,76 +8558,6 @@ arc_asm_insn_p (rtx x)
>    return 0;
>  }
>  
> -/* We might have a CALL to a non-returning function before a loop end.
> -   ??? Although the manual says that's OK (the target is outside the
> -   loop, and the loop counter unused there), the assembler barfs on
> -   this for ARC600, so we must insert a nop before such a call too.
> -   For ARC700, and ARCv2 is not allowed to have the last ZOL
> -   instruction a jump to a location where lp_count is modified.  */
> -
> -static bool
> -arc_loop_hazard (rtx_insn *pred, rtx_insn *succ)
> -{
> -  rtx_insn *jump  = NULL;
> -  rtx label_rtx = NULL_RTX;
> -  rtx_insn *label = NULL;
> -  basic_block succ_bb;
> -
> -  if (recog_memoized (succ) != CODE_FOR_doloop_end_i)
> -    return false;
> -
> -  /* Phase 1: ARC600 and ARCv2HS doesn't allow any control instruction
> -     (i.e., jump/call) as the last instruction of a ZOL.  */
> -  if (TARGET_ARC600 || TARGET_HS)
> -    if (JUMP_P (pred) || CALL_P (pred)
> -	|| arc_asm_insn_p (PATTERN (pred))
> -	|| GET_CODE (PATTERN (pred)) == SEQUENCE)
> -      return true;
> -
> -  /* Phase 2: Any architecture, it is not allowed to have the last ZOL
> -     instruction a jump to a location where lp_count is modified.  */
> -
> -  /* Phase 2a: Dig for the jump instruction.  */
> -  if (JUMP_P (pred))
> -    jump = pred;
> -  else if (GET_CODE (PATTERN (pred)) == SEQUENCE
> -	   && JUMP_P (XVECEXP (PATTERN (pred), 0, 0)))
> -    jump = as_a <rtx_insn *> (XVECEXP (PATTERN (pred), 0, 0));
> -  else
> -    return false;
> -
> -  /* Phase 2b: Make sure is not a millicode jump.  */
> -  if ((GET_CODE (PATTERN (jump)) == PARALLEL)
> -      && (XVECEXP (PATTERN (jump), 0, 0) == ret_rtx))
> -    return false;
> -
> -  label_rtx = JUMP_LABEL (jump);
> -  if (!label_rtx)
> -    return false;
> -
> -  /* Phase 2c: Make sure is not a return.  */
> -  if (ANY_RETURN_P (label_rtx))
> -    return false;
> -
> -  /* Pahse 2d: Go to the target of the jump and check for aliveness of
> -     LP_COUNT register.  */
> -  label = safe_as_a <rtx_insn *> (label_rtx);
> -  succ_bb = BLOCK_FOR_INSN (label);
> -  if (!succ_bb)
> -    {
> -      gcc_assert (NEXT_INSN (label));
> -      if (NOTE_INSN_BASIC_BLOCK_P (NEXT_INSN (label)))
> -	succ_bb = NOTE_BASIC_BLOCK (NEXT_INSN (label));
> -      else
> -	succ_bb = BLOCK_FOR_INSN (NEXT_INSN (label));
> -    }
> -
> -  if (succ_bb && REGNO_REG_SET_P (df_get_live_out (succ_bb), LP_COUNT))
> -    return true;
> -
> -  return false;
> -}
> -
>  /* For ARC600:
>     A write to a core reg greater or equal to 32 must not be immediately
>     followed by a use.  Anticipate the length requirement to insert a nop
> @@ -8474,9 +8569,6 @@ arc_hazard (rtx_insn *pred, rtx_insn *succ)
>    if (!pred || !INSN_P (pred) || !succ || !INSN_P (succ))
>      return 0;
>  
> -  if (arc_loop_hazard (pred, succ))
> -    return 4;
> -
>    if (TARGET_ARC600)
>      return arc600_corereg_hazard (pred, succ);
>  
> @@ -8494,24 +8586,6 @@ arc_adjust_insn_length (rtx_insn *insn, int len, bool)
>    if (GET_CODE (PATTERN (insn)) == SEQUENCE)
>      return len;
>  
> -  /* It is impossible to jump to the very end of a Zero-Overhead Loop, as
> -     the ZOL mechanism only triggers when advancing to the end address,
> -     so if there's a label at the end of a ZOL, we need to insert a nop.
> -     The ARC600 ZOL also has extra restrictions on jumps at the end of a
> -     loop.  */
> -  if (recog_memoized (insn) == CODE_FOR_doloop_end_i)
> -    {
> -      rtx_insn *prev = prev_nonnote_insn (insn);
> -
> -      return ((LABEL_P (prev)
> -	       || (TARGET_ARC600
> -		   && (JUMP_P (prev)
> -		       || CALL_P (prev) /* Could be a noreturn call.  */
> -		       || (NONJUMP_INSN_P (prev)
> -			   && GET_CODE (PATTERN (prev)) == SEQUENCE))))
> -	      ? len + 4 : len);
> -    }
> -
>    /* Check for return with but one preceding insn since function
>       start / call.  */
>    if (TARGET_PAD_RETURN
> @@ -9871,27 +9945,9 @@ arc_scheduling_not_expected (void)
>    return cfun->machine->arc_reorg_started;
>  }
>  
> -/* Oddly enough, sometimes we get a zero overhead loop that branch
> -   shortening doesn't think is a loop - observed with compile/pr24883.c
> -   -O3 -fomit-frame-pointer -funroll-loops.  Make sure to include the
> -   alignment visible for branch shortening  (we actually align the loop
> -   insn before it, but that is equivalent since the loop insn is 4 byte
> -   long.)  */
> -
>  int
>  arc_label_align (rtx_insn *label)
>  {
> -  int loop_align = LOOP_ALIGN (LABEL);
> -
> -  if (loop_align > align_labels_log)
> -    {
> -      rtx_insn *prev = prev_nonnote_insn (label);
> -
> -      if (prev && NONJUMP_INSN_P (prev)
> -	  && GET_CODE (PATTERN (prev)) == PARALLEL
> -	  && recog_memoized (prev) == CODE_FOR_doloop_begin_i)
> -	return loop_align;
> -    }
>    /* Code has a minimum p2 alignment of 1, which we must restore after an
>       ADDR_DIFF_VEC.  */
>    if (align_labels_log < 1)
> diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
> index d4e97cd..4c54b7e 100644
> --- a/gcc/config/arc/arc.h
> +++ b/gcc/config/arc/arc.h
> @@ -581,15 +581,15 @@ enum reg_class
>    {0x0000f00f, 0x00000000, 0x00000000, 0x00000000, 0x00000000},	     /* 'q', r0-r3, r12-r15 */		\
>    {0x1000f00f, 0x00000000, 0x00000000, 0x00000000, 0x00000000},	     /* 'e', r0-r3, r12-r15, sp */	\
>    {0x1c001fff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},    /* "Rsc", r0-r12 */ \
> -  {0x9fffffff, 0xc0000000, 0x00000000, 0x00000000, 0x00000000},      /* 'r', r0-r28, blink, ap and pcl */	\
> +  {0x9fffffff, 0x80000000, 0x00000000, 0x00000000, 0x00000000},      /* 'r', r0-r28, blink, ap and pcl */	\
>    {0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'W',  r0-r31 */ \
>    /* Include ap / pcl in WRITABLE_CORE_REGS for sake of symmetry.  As these \
>       registers are fixed, it does not affect the literal meaning of the \
>       constraints, but it makes it a superset of GENERAL_REGS, thus \
>       enabling some operations that would otherwise not be possible.  */ \
> -  {0xffffffff, 0xd0000000, 0x00000000, 0x00000000, 0x00000000},      /* 'w', r0-r31, r60 */ \
> -  {0xffffffff, 0xdfffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'c', r0-r60, ap, pcl */ \
> -  {0xffffffff, 0xdfffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'Rac', r0-r60, ap, pcl */ \
> +  {0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'w', r0-r31, r60 */ \
> +  {0xffffffff, 0x9fffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'c', r0-r60, ap, pcl */ \
> +  {0xffffffff, 0x9fffffff, 0x00000000, 0x00000000, 0x00000000},      /* 'Rac', r0-r60, ap, pcl */ \
>    {0x0000000f, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'Rcd', r0-r3 */ \
>    {0x00000003, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'Rsd', r0-r1 */ \
>    {0x9fffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000},      /* 'h',  r0-28, r30 */ \
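
To see which registers these mask changes add or drop, the rows can be decoded bit by bit. The sketch below is a standalone helper, not part of the patch. It assumes the usual layout where register N lives in word N / 32 at bit N % 32, which agrees with the 'q' row (0x0000f00f for r0-r3, r12-r15). The helper names are made up for illustration; the mask values are copied from the hunk above.

```c
#include <stdint.h>

/* Number of 32-bit words in one register-class mask, as in the
   initializers quoted above.  */
#define MASK_WORDS 5

/* The 'w' and 'c' rows before and after the change.  */
static const uint32_t w_old[MASK_WORDS] = {0xffffffff, 0xd0000000, 0, 0, 0};
static const uint32_t w_new[MASK_WORDS] = {0xffffffff, 0x00000000, 0, 0, 0};
static const uint32_t c_old[MASK_WORDS] = {0xffffffff, 0xdfffffff, 0, 0, 0};
static const uint32_t c_new[MASK_WORDS] = {0xffffffff, 0x9fffffff, 0, 0, 0};

/* Return 1 if hard register REGNO is a member of MASK, else 0.
   Assumes word REGNO / 32, bit REGNO % 32, least significant bit
   first.  */
static int
mask_has_reg (const uint32_t *mask, unsigned regno)
{
  return (mask[regno / 32] >> (regno % 32)) & 1u;
}

/* Count the registers that are present in FROM but absent from TO.  */
static int
regs_dropped (const uint32_t *from, const uint32_t *to)
{
  int count = 0;
  for (unsigned regno = 0; regno < MASK_WORDS * 32; regno++)
    if (mask_has_reg (from, regno) && !mask_has_reg (to, regno))
      count++;
  return count;
}
```

Under that reading, the new 'w' row drops registers 60, 62 and 63 and so no longer contains register 60, although its trailing comment still lists r60. The 'c' and 'Rac' rows lose only register 62.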
> @@ -1351,7 +1351,7 @@ do {							\
>     of a loop.  */
>  /* On the ARC, align loops to 4 byte boundaries unless doing all-out size
>     optimization.  */
> -#define LOOP_ALIGN JUMP_ALIGN
> +#define LOOP_ALIGN(X) 0
>  
>  #define LABEL_ALIGN(LABEL) (arc_label_align (LABEL))
>  
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 04a1447..27afe40 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -554,6 +554,11 @@
>     (eq_attr "annul_ret_delay_insn" "yes")
>     (eq_attr "cond_ret_delay_insn" "yes")])
>  
> +(define_delay (eq_attr "type" "loop_end")
> +  [(eq_attr "in_delay_slot" "true")
> +   (eq_attr "in_delay_slot" "true")
> +   (nil)])
> +
>  ;; For ARC600, unexposing the delay sloy incurs a penalty also in the
>  ;; non-taken case, so the only meaningful way to have an annull-true
>  ;; filled delay slot is to conditionalize the delay slot insn.
> @@ -618,8 +623,8 @@
>  ; The iscompact attribute allows the epilogue expander to know for which
>  ; insns it should lengthen the return insn.
>  (define_insn "*movqi_insn"
> -  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,h, w,Rcq,  S,!*x,  r,r, Ucm,m,???m,  m,Usc")
> -	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,i,?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
> +  [(set (match_operand:QI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h,w*l,w*l,???w,h,w*l,Rcq,  S,!*x,  r,r, Ucm,m,???m,  m,Usc")
> +	(match_operand:QI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1, cL,  I,?Rac,i, ?i,  T,Rcq,Usd,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
>    "register_operand (operands[0], QImode)
>     || register_operand (operands[1], QImode)"
>    "@
> @@ -655,8 +660,8 @@
>    "if (prepare_move_operands (operands, HImode)) DONE;")
>  
>  (define_insn "*movhi_insn"
> -  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,???w,Rcq#q,h, w,Rcq,  S,  r,r, Ucm,m,???m,  m,VUsc")
> -	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1,cL,I,?Rac,    i,i,?i,  T,Rcq,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
> +  [(set (match_operand:HI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h,w*l,w*l,???w,Rcq#q,h,w*l,Rcq,  S,  r,r, Ucm,m,???m,  m,VUsc")
> +	(match_operand:HI 1 "move_src_operand" "   cL,   cP,Rcq#q,    P,hCm1, cL,  I,?Rac,    i,i, ?i,  T,Rcq,Ucm,m,?Rac,c,?Rac,Cm3,i"))]
>    "register_operand (operands[0], HImode)
>     || register_operand (operands[1], HImode)
>     || (CONSTANT_P (operands[1])
> @@ -706,9 +711,9 @@
>  ; the iscompact attribute allows the epilogue expander to know for which
>  ; insns it should lengthen the return insn.
>  ; N.B. operand 1 of alternative 7 expands into pcl,symbol@gotpc .
> -(define_insn "*movsi_insn"                      ;   0     1     2     3    4  5 6   7   8   9   10  11  12  13    14  15   16  17  18     19     20  21  22    23    24 25 26    27 28  29   30   31
> -  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h, w,w,  w,  w,  w,  w,???w, ?w,  w,Rcq#q,  h,   w,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,  m,VUsc")
> -	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1,cL,I,Crr,Clo,Chi,Cbi,?Rac,Cpc,Clb, ?Cal,Cal,?Cal,Uts,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac,Cm3, C32"))]
> +(define_insn "*movsi_insn"                      ;   0     1     2     3    4  5    6   7   8   9   10    11  12  13    14  15   16  17  18     19     20  21  22    23    24 25 26    27 28  29  30   31
> +  [(set (match_operand:SI 0 "move_dest_operand" "=Rcq,Rcq#q,    w,Rcq#q,   h,w*l,w*l,  w,  w,  w,  w,  ???w, ?w,  w,Rcq#q,  h, w*l,Rcq,  S,   Us<,RcqRck,!*x,  r,!*Rsd,!*Rcd,r,Ucm,  Usd,m,???m,  m,VUsc")
> +	(match_operand:SI 1 "move_src_operand"  "  cL,   cP,Rcq#q,    P,hCm1, cL,  I,Crr,Clo,Chi,Cbi,?Rac*l,Cpc,Clb, ?Cal,Cal,?Cal,Uts,Rcq,RcqRck,   Us>,Usd,Ucm,  Usd,  Ucd,m,  w,!*Rzd,c,?Rac,Cm3, C32"))]
>    "register_operand (operands[0], SImode)
>     || register_operand (operands[1], SImode)
>     || (CONSTANT_P (operands[1])
> @@ -5073,317 +5078,123 @@
>  				xtr, const0_rtx);
>  })
>  
> +;; -------------------------------------------------------------------
> +;; Hardware loop
> +;; -------------------------------------------------------------------
> +
>  ; operand 0 is the loop count pseudo register
> -; operand 1 is the loop end pattern
> -(define_expand "doloop_begin"
> -  [(use (match_operand 0 "register_operand" ""))
> -   (use (match_operand 1 "" ""))]
> +; operand 1 is the label to jump to at the top of the loop
> +(define_expand "doloop_end"
> +  [(parallel [(set (pc)
> +		   (if_then_else
> +		    (ne (match_operand 0 "" "")
> +			(const_int 1))
> +		    (label_ref (match_operand 1 "" ""))
> +		    (pc)))
> +	      (set (match_dup 0) (plus (match_dup 0) (const_int -1)))
> +	      (unspec [(const_int 0)] UNSPEC_ARC_LP)
> +	      (clobber (match_dup 2))])]
>    ""
>  {
> -  /* Using the INSN_UID of the loop end pattern to identify it causes
> -     trouble with -fcompare-debug, so allocate a debug-independent
> -     id instead.  We use negative numbers so that we can use the same
> -     slot in doloop_end_i where we later store a CODE_LABEL_NUMBER, and
> -     still be able to tell what kind of number this is.  */
> -  static HOST_WIDE_INT loop_end_id = 0;
> -
> -  rtx id = GEN_INT (--loop_end_id);
> -  XEXP (XVECEXP (PATTERN (operands[1]), 0, 4), 0) = id;
> -  emit_insn (gen_doloop_begin_i (operands[0], const0_rtx, id,
> -				 const0_rtx, const0_rtx));
> -  DONE;
> + if (GET_MODE (operands[0]) != SImode)
> +   FAIL;
> + operands[2] = gen_rtx_SCRATCH (SImode);
>  })
>  
> -; ??? can't describe the insn properly as then the optimizers try to
> -; hoist the SETs.
> -;(define_insn "doloop_begin_i"
> -;  [(set (reg:SI LP_START) (pc))
> -;   (set (reg:SI LP_END) (unspec:SI [(pc)] UNSPEC_ARC_LP))
> -;   (use (match_operand 0 "const_int_operand" "n"))]
> -;  ""
> -;  "lp .L__GCC__LP%0"
> -;)
> -
> -; The operands of doloop_end_i are also read / written by arc_reorg with
> -; XVECEXP (PATTERN (lp, 0, N), so if you want to change the pattern, you
> -; might have to adjust arc_reorg.
> -; operands 0 / 2 are supplied by the expander, 1, 3 and 4 are filled in
> -; by arc_reorg.  arc_reorg might also alter operand 0.
> -;
> -; N in XVECEXP PATTERN (lp, 0 N)
> -;  V              rtl                 purpose
> -;  0           unspec UNSPEC_ARC_LP identify pattern
> -;  1           clobber LP_START     show LP_START is set
> -;  2           clobber LP_END       show LP_END is set
> -;  3           use operand0         loop count pseudo register
> -;  4           use operand1         before arc_reorg: -id
> -;                                   after : CODE_LABEL_NUMBER of loop top label
> -;  5           use operand2         INSN_UID of loop end insn
> -;  6           use operand3         loop setup not at start (1 above, 2 below)
> -;  7           use operand4         LABEL_REF of top label, if not
> -;                                   immediately following
> -; If operand1 is still zero after arc_reorg, this is an orphaned loop
> -; instruction that was not at the start of the loop.
> -; There is no point is reloading this insn - then lp_count would still not
> -; be available for the loop end.
> -(define_insn "doloop_begin_i"
> -  [(unspec:SI [(pc)] UNSPEC_ARC_LP)
> -   (clobber (reg:SI LP_START))
> -   (clobber (reg:SI LP_END))
> -   (use (match_operand:SI 0 "register_operand" "l,l,????*X"))
> -   (use (match_operand 1 "const_int_operand" "n,n,C_0"))
> -   (use (match_operand 2 "const_int_operand" "n,n,X"))
> -   (use (match_operand 3 "const_int_operand" "C_0,n,X"))
> -   (use (match_operand 4 "const_int_operand" "C_0,X,X"))]
> +(define_insn "arc_lp"
> +  [(unspec:SI [(match_operand:SI 0 "register_operand" "l")]
> +	      UNSPEC_ARC_LP)
> +   (use (label_ref (match_operand 1 "" "")))
> +   (use (label_ref (match_operand 2 "" "")))]
>    ""
> -{
> -  rtx_insn *scan;
> -  int len, size = 0;
> -  int n_insns = 0;
> -  rtx loop_start = operands[4];
> -
> -  if (CONST_INT_P (loop_start))
> -    loop_start = NULL_RTX;
> -  /* Size implications of the alignment will be taken care of by the
> -     alignment inserted at the loop start.  */
> -  if (LOOP_ALIGN (0) && INTVAL (operands[1]))
> -    {
> -      asm_fprintf (asm_out_file, "\t.p2align %d\\n", LOOP_ALIGN (0));
> -      arc_clear_unalign ();
> -    }
> -  if (!INTVAL (operands[1]))
> -    return "; LITTLE LOST LOOP";
> -  if (loop_start && flag_pic)
> -    {
> -      /* ??? Can do better for when a scratch register
> -	 is known.  But that would require extra testing.  */
> -      return "push_s r0\;add r0,pcl,%4@pcl\;sr r0,[2]; LP_START\;add r0,pcl,.L__GCC__LP%1@pcl\;sr r0,[3]; LP_END\;pop_s r0";
> -    }
> -  /* Check if the loop end is in range to be set by the lp instruction.  */
> -  size = INTVAL (operands[3]) < 2 ? 0 : 2048;
> -  for (scan = insn; scan && size < 2048; scan = NEXT_INSN (scan))
> -    {
> -      if (!INSN_P (scan))
> -	continue;
> -      if (recog_memoized (scan) == CODE_FOR_doloop_end_i
> -	  && (XEXP (XVECEXP (PATTERN (scan), 0, 4), 0)
> -	      == XEXP (XVECEXP (PATTERN (insn), 0, 4), 0)))
> -	break;
> -      len = get_attr_length (scan);
> -      size += len;
> -    }
> -  /* Try to verify that there are at least three instruction fetches
> -     between the loop setup and the first encounter of the loop end.  */
> -  for (scan = NEXT_INSN (insn); scan && n_insns < 3; scan = NEXT_INSN (scan))
> -    {
> -      if (!INSN_P (scan))
> -	continue;
> -      if (rtx_sequence *seq = dyn_cast <rtx_sequence *> (PATTERN (scan)))
> -	scan = seq->insn (0);
> -      if (JUMP_P (scan))
> -	{
> -	  if (recog_memoized (scan) != CODE_FOR_doloop_end_i)
> -	    {
> -	      n_insns += 2;
> -	      if (simplejump_p (scan))
> -		{
> -		  scan = as_a <rtx_insn *> (XEXP (SET_SRC (PATTERN (scan)), 0));
> -		  continue;
> -		}
> -
> -	      rtx lab = JUMP_LABEL (scan);
> -	      if (!lab)
> -		break;
> -
> -	      rtx_insn *next_scan
> -		= next_active_insn (NEXT_INSN (PREV_INSN (scan)));
> -	      if (next_scan
> -		  && recog_memoized (next_scan) != CODE_FOR_doloop_begin_i)
> -		break;
> -
> -	      /* JUMP_LABEL might be simple_return instead if an insn.  */
> -	      if (!INSN_P (lab))
> -		{
> -		  n_insns++;
> -		  break;
> -		}
> -
> -	      rtx_insn *next_lab = next_active_insn (as_a<rtx_insn *> (lab));
> -	      if (next_lab
> -		  && recog_memoized (next_lab) != CODE_FOR_doloop_begin_i)
> -		break;
> -
> -		n_insns++;
> -	    }
> -	  break;
> -	}
> -      len = get_attr_length (scan);
> -      /* Size estimation of asms assumes that each line which is nonempty
> -	 codes an insn, and that each has a long immediate.  For minimum insn
> -	 count, assume merely that a nonempty asm has at least one insn.  */
> -      if (GET_CODE (PATTERN (scan)) == ASM_INPUT
> -	  || asm_noperands (PATTERN (scan)) >= 0)
> -	n_insns += (len != 0);
> -      else
> -	n_insns += (len > 4 ? 2 : (len ? 1 : 0));
> -    }
> -  if (LOOP_ALIGN (0))
> -    {
> -      asm_fprintf (asm_out_file, "\t.p2align %d\\n", LOOP_ALIGN (0));
> -      arc_clear_unalign ();
> -    }
> -  gcc_assert (n_insns || GET_CODE (next_nonnote_insn (insn)) == CODE_LABEL);
> -  if (size >= 2048 || (TARGET_ARC600 && n_insns == 1) || loop_start)
> -    {
> -      if (flag_pic)
> -	{
> -	  /* ??? Can do better for when a scratch register
> -	     is known.  But that would require extra testing.  */
> -	  arc_clear_unalign ();
> -	  return ".p2align 2\;push_s r0\;add r0,pcl,24\;sr r0,[2]; LP_START\;add r0,pcl,.L__GCC__LP%1@pcl\;sr r0,[3]; LP_END\;pop_s r0";
> -	}
> -      output_asm_insn ((size < 2048
> -			? "lp .L__GCC__LP%1" : "sr .L__GCC__LP%1,[3]; LP_END"),
> -		       operands);
> -      output_asm_insn (loop_start
> -		       ? "sr %4,[2]; LP_START" : "sr 0f,[2]; LP_START",
> -		       operands);
> -      if (TARGET_ARC600 && n_insns < 1)
> -	output_asm_insn ("nop", operands);
> -      return (TARGET_ARC600 && n_insns < 3) ? "nop_s\;nop_s\;0:" : "0:";
> -    }
> -  else if (TARGET_ARC600 && n_insns < 3)
> -    {
> -      /* At least four instructions are needed between the setting of LP_COUNT
> -	 and the loop end - but the lp instruction qualifies as one.  */
> -      rtx_insn *prev = prev_nonnote_insn (insn);
> -
> -      if (!INSN_P (prev) || dead_or_set_regno_p (prev, LP_COUNT))
> -	output_asm_insn ("nop", operands);
> -    }
> -  return "lp .L__GCC__LP%1";
> -}
> +  "lp\\t@%l2\\t; %0:@%l1->@%l2"
>    [(set_attr "type" "loop_setup")
> -   (set_attr_alternative "length"
> -;     FIXME: length is usually 4, but we need branch shortening
> -;     to get this right.
> -;     [(if_then_else (match_test "TARGET_ARC600") (const_int 16) (const_int 4))
> -     [(if_then_else (match_test "flag_pic") (const_int 24) (const_int 16))
> -      (if_then_else (match_test "flag_pic") (const_int 28) (const_int 16))
> -      (const_int 0)])]
> -  ;; ??? we should really branch shorten this insn, but then we'd
> -  ;; need a proper label first.  N.B. the end label can not only go out
> -  ;; of range when it is far away, but also when it precedes the loop -
> -  ;; which, unfortunately, it sometimes does, when the loop "optimizer"
> -  ;; messes things up.
> -)
> -
> -; operand 0 is the loop count pseudo register
> -; operand 1 is the label to jump to at the top of the loop
> -; Use this for the ARC600 and ARC700.
> -; ??? ARC600 might want to check if the loop has few iteration and only a
> -; single insn - loop setup is expensive then.
> -(define_expand "doloop_end"
> -  [(use (match_operand 0 "register_operand" ""))
> -   (use (label_ref (match_operand 1 "" "")))]
> -  "!TARGET_ARC601"
> -{
> -  /* We could do smaller bivs with biv widening, and wider bivs by having
> -     a high-word counter in an outer loop - but punt on this for now.  */
> -  if (GET_MODE (operands[0]) != SImode)
> -    FAIL;
> -  emit_jump_insn (gen_doloop_end_i (operands[0], operands[1], const0_rtx));
> -  DONE;
> -})
> +   (set_attr "length" "4")])
>  
> -(define_insn_and_split "doloop_end_i"
> +;; if by any chance the lp_count is not used, then use an 'r'
> +;; register, instead of going to memory.
> +(define_insn "loop_end"
>    [(set (pc)
> -	(if_then_else (ne (match_operand:SI 0 "shouldbe_register_operand" "+l,*c,*m")
> -			   (const_int 1))
> +	(if_then_else (ne (match_operand:SI 2 "nonimmediate_operand" "0,0")
> +			  (const_int 1))
>  		      (label_ref (match_operand 1 "" ""))
>  		      (pc)))
> -   (set (match_dup 0) (plus:SI (match_dup 0) (const_int -1)))
> -   (use (reg:SI LP_START))
> -   (use (reg:SI LP_END))
> -   (use (match_operand 2 "const_int_operand" "n,???Cn0,???X"))
> -   (clobber (match_scratch:SI 3 "=X,X,&????r"))]
> +   (set (match_operand:SI 0 "nonimmediate_operand" "=l!r,m")
> +	(plus (match_dup 2) (const_int -1)))
> +   (unspec [(const_int 0)] UNSPEC_ARC_LP)
> +   (clobber (match_scratch:SI 3 "=X,&r"))]
>    ""
> -  "*
> -{
> -  rtx_insn *prev = prev_nonnote_insn (insn);
> -
> -  /* If there is an immediately preceding label, we must output a nop,
> -     lest a branch to that label will fall out of the loop.
> -     ??? We could try to avoid this by claiming to have a delay slot if there
> -     is a preceding label, and outputting the delay slot insn instead, if
> -     present.
> -     Or we could have some optimization that changes the source edge to update
> -     the loop count and jump to the loop start instead.  */
> -  /* For ARC600, we must also prevent jumps inside the loop and jumps where
> -     the loop counter value is live at the target from being directly at the
> -     loop end.  Being sure that the loop counter is dead at the target is
> -     too much hair - we can't rely on data flow information at this point -
> -     so insert a nop for all branches.
> -     The ARC600 also can't read the loop counter in the last insn of a loop.  */
> -  if (LABEL_P (prev))
> -    output_asm_insn (\"nop%?\", operands);
> -  return \"\\n.L__GCC__LP%2: ; loop end, start is %1\";
> -}"
> -  "&& memory_operand (operands[0], SImode)"
> -  [(pc)]
> -{
> -  emit_move_insn (operands[3], operands[0]);
> -  emit_jump_insn (gen_doloop_fallback_m (operands[3], operands[1], operands[0]));
> -  DONE;
> -}
> -  [(set_attr "type" "loop_end")
> -   (set (attr "length")
> -	(if_then_else (match_test "LABEL_P (prev_nonnote_insn (insn))")
> -		      (const_int 4) (const_int 0)))]
> -)
> +  "\\t;%0 %1 %2"
> +  [(set_attr "length" "0")
> +   (set_attr "predicable" "no")
> +   (set_attr "type" "loop_end")])
>  
> -; This pattern is generated by arc_reorg when there is no recognizable
> -; loop start.
> -(define_insn "*doloop_fallback"
> -  [(set (pc) (if_then_else (ne (match_operand:SI 0 "register_operand" "+r,!w")
> -				(const_int 1))
> -			   (label_ref (match_operand 1 "" ""))
> -			   (pc)))
> -   (set (match_dup 0) (plus:SI (match_dup 0) (const_int -1)))]
> -   ; avoid fooling the loop optimizer into assuming this is a special insn.
> -  "reload_completed"
> -  "*return get_attr_length (insn) == 8
> -   ? \"brne.d %0,1,%1\;sub %0,%0,1\"
> -   : \"breq %0,1,0f\;b.d %1\;sub %0,%0,1\\n0:\";"
> -  [(set (attr "length")
> -	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -256))
> - 			   (le (minus (match_dup 1) (pc)) (const_int 244)))
> - 		      (const_int 8) (const_int 12)))
> -   (set_attr "type" "brcc_no_delay_slot")
> -   (set_attr "cond" "nocond")]
> -)
> +;; split pattern for the very slim chance when the loop register is
> +;; memory.
> +(define_split
> +  [(set (pc)
> +	(if_then_else (ne (match_operand:SI 0 "memory_operand")
> +			  (const_int 1))
> +		      (label_ref (match_operand 1 ""))
> +		      (pc)))
> +   (set (match_dup 0) (plus (match_dup 0) (const_int -1)))
> +   (unspec [(const_int 0)] UNSPEC_ARC_LP)
> +   (clobber (match_scratch:SI 2))]
> +  "memory_operand (operands[0], SImode)"
> +  [(set (match_dup 2) (match_dup 0))
> +   (set (match_dup 2) (plus:SI (match_dup 2) (const_int -1)))
> +   (set (match_dup 0) (match_dup 2))
> +   (set (reg:CC CC_REG) (compare:CC (match_dup 2) (const_int 0)))
> +   (set (pc)
> +	(if_then_else (ne (reg:CC CC_REG)
> +			  (const_int 0))
> +		      (label_ref (match_dup 1))
> +		      (pc)))]
> +  "")
>  
> -; reload can't make output reloads for jump insns, so we have to do this by hand.
> -(define_insn "doloop_fallback_m"
> -  [(set (pc) (if_then_else (ne (match_operand:SI 0 "register_operand" "+&r")
> -				(const_int 1))
> -			   (label_ref (match_operand 1 "" ""))
> -			   (pc)))
> -   (set (match_dup 0) (plus:SI (match_dup 0) (const_int -1)))
> -   (set (match_operand:SI 2 "memory_operand" "=m")
> -	(plus:SI (match_dup 0) (const_int -1)))]
> -   ; avoid fooling the loop optimizer into assuming this is a special insn.
> -  "reload_completed"
> -  "*return get_attr_length (insn) == 12
> -   ? \"sub %0,%0,1\;brne.d %0,0,%1\;st%U2%V2 %0,%2\"
> -   : \"sub %0,%0,1\;breq %0,0,0f\;b.d %1\\n0:\tst%U2%V2 %0,%2\";"
> -  [(set (attr "length")
> -	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -252))
> - 			   (le (minus (match_dup 1) (pc)) (const_int 244)))
> - 		      (const_int 12) (const_int 16)))
> -   (set_attr "type" "brcc_no_delay_slot")
> -   (set_attr "cond" "nocond")]
> -)
> +(define_insn "loop_fail"
> +  [(set (reg:SI LP_COUNT)
> +	(plus:SI (reg:SI LP_COUNT) (const_int -1)))
> +   (set (reg:CC_ZN CC_REG)
> +	(compare:CC_ZN (plus:SI (reg:SI LP_COUNT) (const_int -1))
> +		       (const_int 0)))]
> +  ""
> +  "sub.f%?\\tlp_count,lp_count,1"
> +  [(set_attr "iscompact" "false")
> +   (set_attr "type" "compare")
> +   (set_attr "cond" "set_zn")
> +   (set_attr "length" "4")
> +   (set_attr "predicable" "yes")])
> +
> +(define_insn_and_split "dbnz"
> +  [(set (pc)
> +	(if_then_else
> +	 (ne (plus:SI (match_operand:SI 0 "nonimmediate_operand" "+r!l,m")
> +		      (const_int -1))
> +	     (const_int 0))
> +	 (label_ref (match_operand 1 "" ""))
> +	 (pc)))
> +   (set (match_dup 0)
> +	(plus:SI (match_dup 0)
> +		 (const_int -1)))
> +   (clobber (match_scratch:SI 2 "=X,r"))]
> +  "TARGET_V2"
> +  "@
> +   dbnz%#\\t%0,%l1
> +   #"
> +  "TARGET_V2 && reload_completed && memory_operand (operands[0], SImode)"
> +  [(set (match_dup 2) (match_dup 0))
> +   (set (match_dup 2) (plus:SI (match_dup 2) (const_int -1)))
> +   (set (reg:CC CC_REG) (compare:CC (match_dup 2) (const_int 0)))
> +   (set (match_dup 0) (match_dup 2))
> +   (set (pc) (if_then_else (ge (reg:CC CC_REG)
> +			       (const_int 0))
> +			   (label_ref (match_dup 1))
> +			   (pc)))]
> +  ""
> +  [(set_attr "iscompact" "false")
> +   (set_attr "type" "loop_end")
> +   (set_attr "length" "4,20")])
>  
>  (define_expand "movmemsi"
>    [(match_operand:BLK 0 "" "")
> diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
> index ad2df26..d1ebd44 100644
> --- a/gcc/config/arc/arc.opt
> +++ b/gcc/config/arc/arc.opt
> @@ -494,3 +494,28 @@ Specifies the registers that the processor saves on an interrupt entry and exit.
>  mrgf-banked-regs=
>  Target RejectNegative Joined Var(arc_deferred_options) Defer
>  Specifies the number of registers replicated in second register bank on entry to fast interrupt.
> +
> +mlpc-width=
> +Target RejectNegative Joined Enum(arc_lpc) Var(arc_lpcwidth) Init(32)
> +Sets LP_COUNT register width.  Possible values are 8, 16, 20, 24, 28, and 32.
> +
> +Enum
> +Name(arc_lpc) Type(int)
> +
> +EnumValue
> +Enum(arc_lpc) String(8) Value(8)
> +
> +EnumValue
> +Enum(arc_lpc) String(16) Value(16)
> +
> +EnumValue
> +Enum(arc_lpc) String(20) Value(20)
> +
> +EnumValue
> +Enum(arc_lpc) String(24) Value(24)
> +
> +EnumValue
> +Enum(arc_lpc) String(28) Value(28)
> +
> +EnumValue
> +Enum(arc_lpc) String(32) Value(32)
> diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
> index 1f66438..2610f84 100644
> --- a/gcc/config/arc/predicates.md
> +++ b/gcc/config/arc/predicates.md
> @@ -362,6 +362,8 @@
>        else if (TARGET_MUL64_SET
>  	       && (REGNO (op) == 57 || REGNO(op) == 58 || REGNO(op) == 59 ))
>  	return 0;
> +      else if (REGNO (op) == LP_COUNT)
> +        return 1;
>        else
>  	return dest_reg_operand (op, mode);
>      case SUBREG :
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index f480051..3d1f7f3 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -614,7 +614,7 @@ Objective-C and Objective-C++ Dialects}.
>  -mcrc  -mdsp-packa  -mdvbf  -mlock  -mmac-d16  -mmac-24  -mrtsc  -mswape @gol
>  -mtelephony  -mxy  -misize  -mannotate-align  -marclinux  -marclinux_prof @gol
>  -mlong-calls  -mmedium-calls  -msdata -mirq-ctrl-saved @gol
> --mrgf-banked-regs @gol
> +-mrgf-banked-regs -mlpc-width @gol
>  -mvolatile-cache  -mtp-regno=@var{regno} @gol
>  -malign-call  -mauto-modify-reg  -mbbit-peephole  -mno-brcc @gol
>  -mcase-vector-pcrel  -mcompact-casesi  -mno-cond-exec  -mearly-cbranchsi @gol
> @@ -14678,6 +14678,18 @@ registers to avoid memory transactions during interrupt entry and exit
>  sequences.  Use this option when you are using fast interrupts in an
>  ARC V2 family processor.  Permitted values are 4, 8, 16, and 32.
>  
> +@item -mlpc-width=@var{lpcw}
> +@opindex mlpc-width
> +Specify the width of the LP_COUNT register.  Valid values for
> +@var{lpcw} are 8, 16, 20, 24, 28 and 32.  The default width is fixed
> +to 32.  If the width is less than 32, the compiler does not attempt to
> +transform loops in your program to use the zero-delay loop mechanism
> +unless it is known that the @samp{LP_COUNT} register can hold the
> +required loop-counter value.  Depending on the size specified, the
> +compiler and run-time library might continue to use the loop mechanism
> +for various needs.  This option defines macro @code{__ARC_LPC_WIDTH__}
> +with the value of size.
> +
>  @end table
>  
>  The following options are passed through to the assembler, and also
> diff --git a/gcc/testsuite/gcc.target/arc/loop-1.c b/gcc/testsuite/gcc.target/arc/loop-1.c
> old mode 100644
> new mode 100755
> index 1afe8eb..773f583
> --- a/gcc/testsuite/gcc.target/arc/loop-1.c
> +++ b/gcc/testsuite/gcc.target/arc/loop-1.c
> @@ -1,45 +1,12 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2" } */
> +/* { dg-options "-O2 -w" } */
>  
> -/* This case would fail to make use of the zero-overhead loop
> -   instruction at one time due to a bug.  */
> +/* Check how we handle empty body loops.  */
>  
> -extern char a[];
> -
> -struct some_t
> -{
> -  struct
> -  {
> -    int aaa;
> -    short bbb;
> -    char ccc;
> -    char ddd;
> -  } ppp[8];
> -
> -  int www[1];
> -};
> -
> -int b;
> -
> -void
> -some_function ()
> -{
> -  struct some_t *tmp = (struct some_t *) a;
> -
> -  while ((*tmp).ppp[b].ccc)
> -    while(0);
> -
> -  for (; b; b++)
> -    {
> -      if (tmp->ppp[b].ccc)
> -        {
> -          int c = tmp->ppp[b].bbb;
> -          int d = tmp->ppp[b].aaa;
> -          int e = d - tmp->www[c];
> -          if (e)
> -            tmp->ppp[b].ddd = 1;
> -        }
> -    }
> +a;
> +fn1() {
> +  int i;
> +  for (; i < 8; i++) {
> +    double A[a];
> +  }
>  }
> -
> -/* { dg-final { scan-assembler "\[^\n\]+lp \\.L__GCC__" } } */
> -- 
> 1.9.1
> 
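
A note on the -mlpc-width documentation quoted above: the text says the
option defines __ARC_LPC_WIDTH__ with the selected width. Below is a minimal
sketch of how user code might consult that macro, assuming LP_COUNT holds an
unsigned count of at most 2^width - 1. The helper names and the fallback
definition are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Defined by the compiler according to the documentation quoted above.
   The fallback is an assumption so that this sketch also builds on
   non-ARC hosts.  */
#ifndef __ARC_LPC_WIDTH__
#define __ARC_LPC_WIDTH__ 32
#endif

/* Hypothetical helper: return nonzero if an iteration count N can be
   represented in a loop counter that is WIDTH bits wide.  */
int
fits_in_lp_count (uint32_t n, unsigned width)
{
  if (width >= 32)
    return 1;			/* Every uint32_t value fits.  */
  return n < ((uint32_t) 1 << width);
}

/* Hypothetical helper: the same check against the configured width.  */
int
fits_in_target_lp_count (uint32_t n)
{
  return fits_in_lp_count (n, __ARC_LPC_WIDTH__);
}
```

With a width of 8, for example, a count of 255 fits and a count of 256 does
not, which is presumably the kind of case in which the compiler declines to
use the zero-overhead loop.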


* RE: [PATCH 6/7] [ARC] Reimplement ZOL support.
  2017-08-31 20:23   ` Andrew Burgess
@ 2017-09-01 11:49     ` Claudiu Zissulescu
  2017-09-01 12:36     ` [PATCH] [ARC][ZOL] Account for empty body loops Claudiu Zissulescu
  1 sibling, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-09-01 11:49 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard, sandra

> I'm happy with this if the doc is updated in line with Sandra's
> suggestions.

Updated with Sandra's suggestions. Thank you for your feedback,
Claudiu


* RE: [PATCH 7/7] [ARC] Fix errors in arc_ifcvt.
  2017-08-31 20:08   ` Andrew Burgess
@ 2017-09-01 11:50     ` Claudiu Zissulescu
  0 siblings, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-09-01 11:50 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard

> Looks good.

Committed. Thank you for your review,
Claudiu


* RE: [PATCH 5/7] [ARC] Update various patterns
  2017-08-31 19:33   ` Andrew Burgess
@ 2017-09-01 11:50     ` Claudiu Zissulescu
  0 siblings, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-09-01 11:50 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard

> That looks fine to me.

Committed. Thank you for your review,
Claudiu


* RE: [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P.
  2017-08-31 19:13   ` Andrew Burgess
@ 2017-09-01 11:51     ` Claudiu Zissulescu
  0 siblings, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-09-01 11:51 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard

> Looks good.

Committed. Thank you for your review,
Claudiu


* [PATCH] [ARC][ZOL] Account for empty body loops
  2017-08-31 20:23   ` Andrew Burgess
  2017-09-01 11:49     ` Claudiu Zissulescu
@ 2017-09-01 12:36     ` Claudiu Zissulescu
  2017-09-26 18:56       ` Andrew Burgess
  1 sibling, 1 reply; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-09-01 12:36 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, Francois.Bedard, andrew.burgess

From: claziss <claziss@synopsys.com>

Hi Andrew,

By mistake I've pushed an incomplete ZOL-rework patch, and it is missing the attached parts. Could you please check whether it is OK?

Thank you,
Claudiu

gcc/
2017-09-01  Claudiu Zissulescu <claziss@synopsys.com>

	    * config/arc/arc.c (hwloop_optimize): Account for empty
	    body loops.

testsuite/
2017-09-01  Claudiu Zissulescu <claziss@synopsys.com>

 	        * gcc.target/arc/loop-1.c: Add test.
---
 gcc/config/arc/arc.c                  | 13 +++++++++++--
 gcc/testsuite/gcc.target/arc/loop-1.c | 12 ++++++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100755 gcc/testsuite/gcc.target/arc/loop-1.c

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 52a9b24..d519063 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -7240,6 +7240,12 @@ hwloop_optimize (hwloop_info loop)
 	fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
       return false;
     }
+  else if (!loop->length)
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; loop %d is empty\n", loop->loop_no);
+      return false;
+    }
 
   /* Check if we use a register or not.  */
   if (!REG_P (loop->iter_reg))
@@ -7311,8 +7317,11 @@ hwloop_optimize (hwloop_info loop)
       && INSN_P (last_insn)
       && (JUMP_P (last_insn) || CALL_P (last_insn)
 	  || GET_CODE (PATTERN (last_insn)) == SEQUENCE
-	  || get_attr_type (last_insn) == TYPE_BRCC
-	  || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))
+	  /* At this stage we can have (insn (clobber (mem:BLK
+	     (reg)))) instructions, ignore them.  */
+	  || (GET_CODE (PATTERN (last_insn)) != CLOBBER
+	      && (get_attr_type (last_insn) == TYPE_BRCC
+		  || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))))
     {
       if (loop->length + 2 > ARC_MAX_LOOP_LENGTH)
 	{
diff --git a/gcc/testsuite/gcc.target/arc/loop-1.c b/gcc/testsuite/gcc.target/arc/loop-1.c
new file mode 100755
index 0000000..274bb46
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/loop-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check how we handle empty body loops.  */
+
+int a;
+void fn1(void) {
+  int i;
+  for (; i < 8; i++) {
+    double A[a];
+  }
+}
-- 
1.9.1
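
For readers skimming the first hunk, the new early exit can be modelled
outside of GCC roughly as follows. This is only a sketch under stated
assumptions: struct loop_model, loop_length_ok and MODEL_MAX_LOOP_LENGTH are
hypothetical stand-ins for hwloop_info, the relevant part of hwloop_optimize
and ARC_MAX_LOOP_LENGTH, and the "too long" condition is assumed because the
hunk context does not show it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical limit; the real value of ARC_MAX_LOOP_LENGTH does not
   appear in this thread.  */
#define MODEL_MAX_LOOP_LENGTH 4095

/* Hypothetical stand-in for the fields of hwloop_info used here.  */
struct loop_model
{
  int loop_no;
  int length;			/* Loop body length; 0 means empty body.  */
};

/* Return false when the loop must not become a zero-overhead loop,
   mirroring the order of the checks in the hunk above.  */
bool
loop_length_ok (const struct loop_model *loop, FILE *dump_file)
{
  if (loop->length > MODEL_MAX_LOOP_LENGTH)	/* Assumed condition.  */
    {
      if (dump_file)
	fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
      return false;
    }
  else if (!loop->length)
    {
      /* The case added by this patch: reject empty-body loops.  */
      if (dump_file)
	fprintf (dump_file, ";; loop %d is empty\n", loop->loop_no);
      return false;
    }
  return true;
}
```

Passing a null dump_file suppresses the diagnostics, in the same way the
checks in the hunk are guarded by "if (dump_file)".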


* Re: [PATCH] [ARC][ZOL] Account for empty body loops
  2017-09-01 12:36     ` [PATCH] [ARC][ZOL] Account for empty body loops Claudiu Zissulescu
@ 2017-09-26 18:56       ` Andrew Burgess
  2017-11-02 10:22         ` Claudiu Zissulescu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Burgess @ 2017-09-26 18:56 UTC (permalink / raw)
  To: Claudiu Zissulescu; +Cc: gcc-patches, Francois.Bedard

* Claudiu Zissulescu <Claudiu.Zissulescu@synopsys.com> [2017-09-01 14:32:10 +0200]:

> From: claziss <claziss@synopsys.com>
> 
> Hi Andrew,
> 
> By mistake I've pushed an incomplete ZOL-rework patch, and it is missing the attached parts. Could you please check whether it is OK?
> 
> Thank you,
> Claudiu
> 
> gcc/
> 2017-09-01  Claudiu Zissulescu <claziss@synopsys.com>
> 
> 	    * config/arc/arc.c (hwloop_optimize): Account for empty
> 	    body loops.

Looks good to me.

Thanks,
Andrew


> 
> testsuite/
> 2017-09-01  Claudiu Zissulescu <claziss@synopsys.com>
> 
>  	        * gcc.target/arc/loop-1.c: Add test.
> ---
>  gcc/config/arc/arc.c                  | 13 +++++++++++--
>  gcc/testsuite/gcc.target/arc/loop-1.c | 12 ++++++++++++
>  2 files changed, 23 insertions(+), 2 deletions(-)
>  create mode 100755 gcc/testsuite/gcc.target/arc/loop-1.c
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 52a9b24..d519063 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -7240,6 +7240,12 @@ hwloop_optimize (hwloop_info loop)
>  	fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
>        return false;
>      }
> +  else if (!loop->length)
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, ";; loop %d is empty\n", loop->loop_no);
> +      return false;
> +    }
>  
>    /* Check if we use a register or not.  */
>    if (!REG_P (loop->iter_reg))
> @@ -7311,8 +7317,11 @@ hwloop_optimize (hwloop_info loop)
>        && INSN_P (last_insn)
>        && (JUMP_P (last_insn) || CALL_P (last_insn)
>  	  || GET_CODE (PATTERN (last_insn)) == SEQUENCE
> -	  || get_attr_type (last_insn) == TYPE_BRCC
> -	  || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))
> +	  /* At this stage we can have (insn (clobber (mem:BLK
> +	     (reg)))) instructions, ignore them.  */
> +	  || (GET_CODE (PATTERN (last_insn)) != CLOBBER
> +	      && (get_attr_type (last_insn) == TYPE_BRCC
> +		  || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))))
>      {
>        if (loop->length + 2 > ARC_MAX_LOOP_LENGTH)
>  	{
> diff --git a/gcc/testsuite/gcc.target/arc/loop-1.c b/gcc/testsuite/gcc.target/arc/loop-1.c
> new file mode 100755
> index 0000000..274bb46
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/loop-1.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Check how we handle empty body loops.  */
> +
> +int a;
> +void fn1(void) {
> +  int i;
> +  for (; i < 8; i++) {
> +    double A[a];
> +  }
> +}
> -- 
> 1.9.1
> 


* RE: [PATCH] [ARC][ZOL] Account for empty body loops
  2017-09-26 18:56       ` Andrew Burgess
@ 2017-11-02 10:22         ` Claudiu Zissulescu
  0 siblings, 0 replies; 27+ messages in thread
From: Claudiu Zissulescu @ 2017-11-02 10:22 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gcc-patches, Francois.Bedard

> > gcc/
> > 2017-09-01  Claudiu Zissulescu <claziss@synopsys.com>
> >
> > 	    * config/arc/arc.c (hwloop_optimize): Account for empty
> > 	    body loops.
> 
> Looks good to me.
> 
> Thanks,
> Andrew
> 
> 

Pushed. Thank you for your review,
Claudiu


end of thread, other threads:[~2017-11-02 10:22 UTC | newest]

Thread overview: 27+ messages
2017-07-24  8:46 [PATCH 0/7] [ARC] SDATA and ZOL updates Claudiu Zissulescu
2017-07-24  8:46 ` [PATCH 5/7] [ARC] Update various patterns Claudiu Zissulescu
2017-08-31 19:33   ` Andrew Burgess
2017-09-01 11:50     ` Claudiu Zissulescu
2017-07-24  8:46 ` [PATCH 6/7] [ARC] Reimplement ZOL support Claudiu Zissulescu
2017-07-24 22:22   ` Sandra Loosemore
2017-08-31 20:23   ` Andrew Burgess
2017-09-01 11:49     ` Claudiu Zissulescu
2017-09-01 12:36     ` [PATCH] [ARC][ZOL] Account for empty body loops Claudiu Zissulescu
2017-09-26 18:56       ` Andrew Burgess
2017-11-02 10:22         ` Claudiu Zissulescu
2017-07-24  8:46 ` [PATCH 2/7] [ARC] Use -G option to control sdata behavior Claudiu Zissulescu
2017-08-15 13:43   ` Andrew Burgess
2017-08-15 16:26     ` Sandra Loosemore
2017-08-31 16:04     ` Claudiu Zissulescu
2017-07-24  8:46 ` [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P Claudiu Zissulescu
2017-08-31 19:13   ` Andrew Burgess
2017-09-01 11:51     ` Claudiu Zissulescu
2017-07-24  8:46 ` [PATCH 3/7] [ARC] Update can_follow_jump hook helper Claudiu Zissulescu
2017-08-15 13:56   ` Andrew Burgess
2017-08-31 16:38     ` Claudiu Zissulescu
2017-07-24  8:46 ` [PATCH 7/7] [ARC] Fix errors in arc_ifcvt Claudiu Zissulescu
2017-08-31 20:08   ` Andrew Burgess
2017-09-01 11:50     ` Claudiu Zissulescu
2017-07-24  8:46 ` [PATCH 1/7] [ARC] Improves and fixes for small data support Claudiu Zissulescu
2017-08-15 13:36   ` Andrew Burgess
2017-08-31 14:37     ` Claudiu Zissulescu
