public inbox for gcc-patches@gcc.gnu.org
* [PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
@ 2019-10-11 14:57 Stam Markianos-Wright
  2019-10-13 15:37 ` Ramana Radhakrishnan
  0 siblings, 1 reply; 15+ messages in thread
From: Stam Markianos-Wright @ 2019-10-11 14:57 UTC
  To: gcc-patches; +Cc: nd, James Greenhalgh, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 1567 bytes --]

Hi all,

This patch fixes an issue where the compiler generated a conditional 
branch in Thumb2 whose target was too far away for b{cond} to reach.

This was originally reported at binutils:
https://sourceware.org/bugzilla/show_bug.cgi?id=24991

And then raised for GCC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91816


As can be seen here:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Cihfddaf.html

the range of a 32-bit Thumb B{cond} is +/-1MB.

arm.md now checks for this: if the branch target is more than 1MB away, 
an unconditional branch is generated instead.
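
For illustration only (this is not the code in the patch), the
replacement sequence for an out-of-range b{cond} can be sketched in C
as below. The helper name format_far_branch and its parameters are
hypothetical; the ".Lbcond" label prefix is the one the new test scans
for. The idea is to branch on the inverse condition over an
unconditional branch to the real target.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC.  Writes the far-branch sequence
   into OUT: a branch on the inverse condition to a fresh local label,
   an unconditional branch to the real destination, then the label.
   INV_COND is the inverse of the original condition, e.g. "ne" when an
   out-of-range "beq" is being replaced.  Returns snprintf's result.  */
static int
format_far_branch (char *out, size_t size, const char *inv_cond,
                   const char *dest, int label_no)
{
  assert (out != NULL && size > 0);
  return snprintf (out, size, "b%s\t.Lbcond%d\n\tb\t%s\n.Lbcond%d:",
                   inv_cond, label_no, dest, label_no);
}
```

Called with inv_cond "ne", dest ".L2" and label_no 1, this formats a
"bne" to .Lbcond1, a "b" to .L2, and the .Lbcond1 label itself.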

A new test checks this for beq (from "if (a)") and bne (from 
"if (a==1)").

Patch bootstrapped and regression tested on arm-none-linux-gnueabihf. 
However, on my native AArch32 setup the new test times out when run as 
part of a full "make check-gcc" regression run, but not when run 
individually.

Patch also regression tested on arm-none-eabi, arm-none-linux-gnueabi 
with no issues.

Also, I don't have commit rights yet, so could someone commit it on my 
behalf?

Thanks,
Stam Markianos-Wright



gcc/ChangeLog:

2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>

	* config/arm/arm.md: Update b<cond> for Thumb2 range checks.
	* config/arm/arm.c: New function arm_gen_far_branch.
	* config/arm/arm-protos.h: New function arm_gen_far_branch
	prototype.

gcc/testsuite/ChangeLog:

2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>

	* testsuite/gcc.target/arm/pr91816.c: New test.

[-- Attachment #2: patch.patch --]
[-- Type: text/x-patch; name="patch.patch", Size: 8352 bytes --]

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index f995974f9bb..1dce333d1c3 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
 
 void arm_initialize_isa (sbitmap, const enum isa_feature *);
 
+const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
+
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e1a1ef9a2..1a693d2ddca 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32139,6 +32139,31 @@ arm_run_selftests (void)
 }
 } /* Namespace selftest.  */
 
+
+/* Generate code to enable conditional branches in functions over 1 MiB.  */
+const char *
+arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
+			const char * branch_format)
+{
+  rtx_code_label * tmp_label = gen_label_rtx ();
+  char label_buf[256];
+  char buffer[128];
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
+			CODE_LABEL_NUMBER (tmp_label));
+  const char *label_ptr = arm_strip_name_encoding (label_buf);
+  rtx dest_label = operands[pos_label];
+  operands[pos_label] = tmp_label;
+
+  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
+  output_asm_insn (buffer, operands);
+
+  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
+  operands[pos_label] = dest_label;
+  output_asm_insn (buffer, operands);
+  return "";
+}
+
+
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
 #endif /* CHECKING_P */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f861c72ccfc..634fd0a59da 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6686,9 +6686,16 @@
 ;; And for backward branches we have 
 ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
 ;;
+;; In 16-bit Thumb these ranges are:
 ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
 ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
 
+;; In 32-bit Thumb these ranges are:
+;; For a 'b'       +/- 16MB is not checked for.
+;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
+;; (-1048568 -> 1048576).
+
+
 (define_expand "cbranchsi4"
   [(set (pc) (if_then_else
 	      (match_operator 0 "expandable_comparison_operator"
@@ -6947,22 +6954,42 @@
 		      (pc)))]
   "TARGET_32BIT"
   "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-    {
-      arm_ccfsm_state += 2;
-      return \"\";
-    }
-  return \"b%d1\\t%l0\";
+     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+      {
+	arm_ccfsm_state += 2;
+	return \"\";
+      }
+     switch (get_attr_length (insn))
+      {
+	// Thumb2 16-bit b{cond}
+	case 2:
+
+	// Thumb2 32-bit b{cond}
+	case 4: return \"b%d1\\t%l0\";break;
+
+	// Thumb2 b{cond} out of range.  Use unconditional branch.
+	case 8: return arm_gen_far_branch \
+		(operands, 0, \"Lbcond\", \"b%D1\t\");
+	break;
+
+	// A32 b{cond}
+	default: return \"b%d1\\t%l0\";
+      }
   "
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+	(if_then_else (match_test "TARGET_THUMB2")
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+	(le (minus (match_dup 0) (pc)) (const_int 256)))
+	(const_int 2)
+	(if_then_else (and (ge (minus (match_dup 0) (pc))
+						(const_int -1048568))
+			(le (minus (match_dup 0) (pc)) (const_int 1048576)))
+	(const_int 4)
+	(const_int 8)))
+	(const_int 10)))
+   ]
 )
 
 (define_insn "*arm_cond_branch_reversed"
@@ -6978,17 +7005,36 @@
       arm_ccfsm_state += 2;
       return \"\";
     }
-  return \"b%D1\\t%l0\";
+     switch (get_attr_length (insn))
+      {
+	// Thumb2 16-bit b{cond}
+	case 2:
+
+	// Thumb2 32-bit b{cond}
+	case 4: return \"b%D1\\t%l0\";break;
+
+	// Thumb2 b{cond} out of range.  Use unconditional branch.
+	case 8: return arm_gen_far_branch \
+		(operands, 0, \"Lbcond\", \"b%d1\t\");
+		break;
+	// A32 b{cond}
+	default: return \"b%D1\\t%l0\";
+       }
   "
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+	(if_then_else (match_test "TARGET_THUMB2")
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+		(le (minus (match_dup 0) (pc)) (const_int 256)))
+	(const_int 2)
+	(if_then_else (and (ge (minus (match_dup 0) (pc))
+							(const_int -1048568))
+		(le (minus (match_dup 0) (pc)) (const_int 1048576)))
+	(const_int 4)
+	(const_int 8)))
+	(const_int 10)))
+   ]
 )
 
 \f
diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 00000000000..176bf61780b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,102 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
+int printf(const char *, ...);
+
+__attribute__((noinline,noclone)) void f1(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f2(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f3(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+__attribute__((noinline,noclone)) void f4(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f5(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f6(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+
+int main(void)
+{
+	f1(0);
+	f2(0);
+	f3(0);
+	f4(0);
+	f5(0);
+	f6(0);
+	return 0;
+}
+
+
+/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */


* Re: [PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2019-10-11 14:57 [PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816) Stam Markianos-Wright
@ 2019-10-13 15:37 ` Ramana Radhakrishnan
  2019-10-21  9:40   ` Stam Markianos-Wright
  0 siblings, 1 reply; 15+ messages in thread
From: Ramana Radhakrishnan @ 2019-10-13 15:37 UTC
  To: Stam Markianos-Wright; +Cc: gcc-patches, nd, James Greenhalgh, Richard Earnshaw

> 
> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf, 
> however, on my native Aarch32 setup the test times out when run as part 
> of a big "make check-gcc" regression, but not when run individually.
> 
> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
> 
> 	* config/arm/arm.md: Update b<cond> for Thumb2 range checks.
> 	* config/arm/arm.c: New function arm_gen_far_branch.
>   	* config/arm/arm-protos.h: New function arm_gen_far_branch
> 	prototype.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
> 
>   	* testsuite/gcc.target/arm/pr91816.c: New test.

> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index f995974f9bb..1dce333d1c3 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
>  
>  void arm_initialize_isa (sbitmap, const enum isa_feature *);
>  
> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
> +
> +

Let's get the nits out of the way.

Unnecessary extra newline; also, a space is needed between int and const above.


>  #endif /* ! GCC_ARM_PROTOS_H */
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 39e1a1ef9a2..1a693d2ddca 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>  }
>  } /* Namespace selftest.  */
>  
> +
> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
> +const char *
> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
> +			const char * branch_format)

Not sure if this is some munging from the attachment but check
vertical alignment of parameters.

> +{
> +  rtx_code_label * tmp_label = gen_label_rtx ();
> +  char label_buf[256];
> +  char buffer[128];
> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
> +			CODE_LABEL_NUMBER (tmp_label));
> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
> +  rtx dest_label = operands[pos_label];
> +  operands[pos_label] = tmp_label;
> +
> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
> +  output_asm_insn (buffer, operands);
> +
> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
> +  operands[pos_label] = dest_label;
> +  output_asm_insn (buffer, operands);
> +  return "";
> +}
> +
> +

Unnecessary extra newline.

>  #undef TARGET_RUN_TARGET_SELFTESTS
>  #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>  #endif /* CHECKING_P */
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index f861c72ccfc..634fd0a59da 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -6686,9 +6686,16 @@
>  ;; And for backward branches we have 
>  ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>  ;;
> +;; In 16-bit Thumb these ranges are:
>  ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>  ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
>  
> +;; In 32-bit Thumb these ranges are:
> +;; For a 'b'       +/- 16MB is not checked for.
> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
> +;; (-1048568 -> 1048576).
> +
> +

Unnecessary extra newline.

>  (define_expand "cbranchsi4"
>    [(set (pc) (if_then_else
>  	      (match_operator 0 "expandable_comparison_operator"
> @@ -6947,22 +6954,42 @@
>  		      (pc)))]
>    "TARGET_32BIT"
>    "*
> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> -    {
> -      arm_ccfsm_state += 2;
> -      return \"\";
> -    }
> -  return \"b%d1\\t%l0\";
> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> +      {
> +	arm_ccfsm_state += 2;
> +	return \"\";
> +      }
> +     switch (get_attr_length (insn))
> +      {
> +	// Thumb2 16-bit b{cond}
> +	case 2:
> +
> +	// Thumb2 32-bit b{cond}
> +	case 4: return \"b%d1\\t%l0\";break;
> +
> +	// Thumb2 b{cond} out of range.  Use unconditional branch.
> +	case 8: return arm_gen_far_branch \
> +		(operands, 0, \"Lbcond\", \"b%D1\t\");
> +	break;
> +
> +	// A32 b{cond}
> +	default: return \"b%d1\\t%l0\";
> +      }

Please fix indentation here. 

>    "
>    [(set_attr "conds" "use")
>     (set_attr "type" "branch")
>     (set (attr "length")
> -	(if_then_else
> -	   (and (match_test "TARGET_THUMB2")
> -		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
> -		     (le (minus (match_dup 0) (pc)) (const_int 256))))
> -	   (const_int 2)
> -	   (const_int 4)))]
> +	(if_then_else (match_test "TARGET_THUMB2")
> +	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> +	(le (minus (match_dup 0) (pc)) (const_int 256)))
> +	(const_int 2)
> +	(if_then_else (and (ge (minus (match_dup 0) (pc))
> +						(const_int -1048568))
> +			(le (minus (match_dup 0) (pc)) (const_int 1048576)))
> +	(const_int 4)
> +	(const_int 8)))
> +	(const_int 10)))
> +   ]

This conditional is unreadable and is getting quite complex.

Please fix the indentation and add a comment above the pattern
explaining when the length is 2, 4, 8 or 10, with a note asking for
the comment to be kept in sync with the pattern.

How did we end up with length 10? That would indicate two 4-byte
instructions and a 2-byte instruction. You are handling lengths 2, 4
and 8 in the switch above, so is length 10 going to be a single A32
b<cond> instruction?

What am I missing?



>  )
>  
>  (define_insn "*arm_cond_branch_reversed"
> @@ -6978,17 +7005,36 @@
>        arm_ccfsm_state += 2;
>        return \"\";
>      }
> -  return \"b%D1\\t%l0\";
> +     switch (get_attr_length (insn))
> +      {
> +	// Thumb2 16-bit b{cond}
> +	case 2:
> +
> +	// Thumb2 32-bit b{cond}
> +	case 4: return \"b%D1\\t%l0\";break;
> +
> +	// Thumb2 b{cond} out of range.  Use unconditional branch.
> +	case 8: return arm_gen_far_branch \
> +		(operands, 0, \"Lbcond\", \"b%d1\t\");
> +		break;
> +	// A32 b{cond}
> +	default: return \"b%D1\\t%l0\";
> +       }
>    "
>    [(set_attr "conds" "use")
>     (set_attr "type" "branch")
>     (set (attr "length")
> -	(if_then_else
> -	   (and (match_test "TARGET_THUMB2")
> -		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
> -		     (le (minus (match_dup 0) (pc)) (const_int 256))))
> -	   (const_int 2)
> -	   (const_int 4)))]
> +	(if_then_else (match_test "TARGET_THUMB2")
> +	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> +		(le (minus (match_dup 0) (pc)) (const_int 256)))
> +	(const_int 2)
> +	(if_then_else (and (ge (minus (match_dup 0) (pc))
> +							(const_int -1048568))
> +		(le (minus (match_dup 0) (pc)) (const_int 1048576)))
> +	(const_int 4)
> +	(const_int 8)))
> +	(const_int 10)))
> +   ]

Same comments as above apply here too.

Ramana

>  )
>  
>  \f
> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
> new file mode 100644
> index 00000000000..176bf61780b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
> @@ -0,0 +1,102 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
> +int printf(const char *, ...);
> +
> +__attribute__((noinline,noclone)) void f1(int a)
> +{
> +	if (a) {
> +#define HW0	printf("Hello World!\n");
> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +		HW0
> +	}
> +}
> +
> +__attribute__((noinline,noclone)) void f2(int a)
> +{
> +	if (a) {
> +#define HW0	printf("Hello World!\n");
> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +		HW3
> +	}
> +}
> +
> +
> +__attribute__((noinline,noclone)) void f3(int a)
> +{
> +	if (a) {
> +#define HW0	printf("Hello World!\n");
> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +		HW5
> +	}
> +}
> +
> +__attribute__((noinline,noclone)) void f4(int a)
> +{
> +	if (a==1) {
> +#define HW0	printf("Hello World!\n");
> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +		HW0
> +	}
> +}
> +
> +__attribute__((noinline,noclone)) void f5(int a)
> +{
> +	if (a==1) {
> +#define HW0	printf("Hello World!\n");
> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +		HW3
> +	}
> +}
> +
> +
> +__attribute__((noinline,noclone)) void f6(int a)
> +{
> +	if (a==1) {
> +#define HW0	printf("Hello World!\n");
> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +		HW5
> +	}
> +}
> +
> +
> +int main(void)
> +{
> +	f1(0);
> +	f2(0);
> +	f3(0);
> +	f4(0);
> +	f5(0);
> +	f6(0);
> +	return 0;
> +}
> +
> +
> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */


* Re: [PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2019-10-13 15:37 ` Ramana Radhakrishnan
@ 2019-10-21  9:40   ` Stam Markianos-Wright
  2019-11-15 17:27     ` [PING][PATCH][GCC][ARM] " Stam Markianos-Wright
  0 siblings, 1 reply; 15+ messages in thread
From: Stam Markianos-Wright @ 2019-10-21  9:40 UTC
  To: Ramana Radhakrishnan; +Cc: gcc-patches, nd, James Greenhalgh, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 12135 bytes --]



On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>>
>> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>> however, on my native Aarch32 setup the test times out when run as part
>> of a big "make check-gcc" regression, but not when run individually.
>>
>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>
>> 	* config/arm/arm.md: Update b<cond> for Thumb2 range checks.
>> 	* config/arm/arm.c: New function arm_gen_far_branch.
>>    	* config/arm/arm-protos.h: New function arm_gen_far_branch
>> 	prototype.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>
>>    	* testsuite/gcc.target/arm/pr91816.c: New test.
> 
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index f995974f9bb..1dce333d1c3 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
>>   
>>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>   
>> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>> +
>> +
> 
> Lets get the nits out of the way.
> 
> Unnecessary extra new line, need a space between int and const above.
> 
> 

Fixed!

>>   #endif /* ! GCC_ARM_PROTOS_H */
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 39e1a1ef9a2..1a693d2ddca 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>>   }
>>   } /* Namespace selftest.  */
>>   
>> +
>> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>> +const char *
>> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>> +			const char * branch_format)
> 
> Not sure if this is some munging from the attachment but check
> vertical alignment of parameters.
> 

Fixed!

>> +{
>> +  rtx_code_label * tmp_label = gen_label_rtx ();
>> +  char label_buf[256];
>> +  char buffer[128];
>> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>> +			CODE_LABEL_NUMBER (tmp_label));
>> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>> +  rtx dest_label = operands[pos_label];
>> +  operands[pos_label] = tmp_label;
>> +
>> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>> +  output_asm_insn (buffer, operands);
>> +
>> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>> +  operands[pos_label] = dest_label;
>> +  output_asm_insn (buffer, operands);
>> +  return "";
>> +}
>> +
>> +
> 
> Unnecessary extra newline.
> 

Fixed!

>>   #undef TARGET_RUN_TARGET_SELFTESTS
>>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>   #endif /* CHECKING_P */
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index f861c72ccfc..634fd0a59da 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -6686,9 +6686,16 @@
>>   ;; And for backward branches we have
>>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>   ;;
>> +;; In 16-bit Thumb these ranges are:
>>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>>   ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
>>   
>> +;; In 32-bit Thumb these ranges are:
>> +;; For a 'b'       +/- 16MB is not checked for.
>> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
>> +;; (-1048568 -> 1048576).
>> +
>> +
> 
> Unnecessary extra newline.
> 

Fixed!

>>   (define_expand "cbranchsi4"
>>     [(set (pc) (if_then_else
>>   	      (match_operator 0 "expandable_comparison_operator"
>> @@ -6947,22 +6954,42 @@
>>   		      (pc)))]
>>     "TARGET_32BIT"
>>     "*
>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> -    {
>> -      arm_ccfsm_state += 2;
>> -      return \"\";
>> -    }
>> -  return \"b%d1\\t%l0\";
>> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> +      {
>> +	arm_ccfsm_state += 2;
>> +	return \"\";
>> +      }
>> +     switch (get_attr_length (insn))
>> +      {
>> +	// Thumb2 16-bit b{cond}
>> +	case 2:
>> +
>> +	// Thumb2 32-bit b{cond}
>> +	case 4: return \"b%d1\\t%l0\";break;
>> +
>> +	// Thumb2 b{cond} out of range.  Use unconditional branch.
>> +	case 8: return arm_gen_far_branch \
>> +		(operands, 0, \"Lbcond\", \"b%D1\t\");
>> +	break;
>> +
>> +	// A32 b{cond}
>> +	default: return \"b%d1\\t%l0\";
>> +      }
> 
> Please fix indentation here.
> 

Fixed together with the changes below.

>>     "
>>     [(set_attr "conds" "use")
>>      (set_attr "type" "branch")
>>      (set (attr "length")
>> -	(if_then_else
>> -	   (and (match_test "TARGET_THUMB2")
>> -		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> -		     (le (minus (match_dup 0) (pc)) (const_int 256))))
>> -	   (const_int 2)
>> -	   (const_int 4)))]
>> +	(if_then_else (match_test "TARGET_THUMB2")
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> +	(le (minus (match_dup 0) (pc)) (const_int 256)))
>> +	(const_int 2)
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc))
>> +						(const_int -1048568))
>> +			(le (minus (match_dup 0) (pc)) (const_int 1048576)))
>> +	(const_int 4)
>> +	(const_int 8)))
>> +	(const_int 10)))
>> +   ]
> 
> This conditional is unreadable and is getting quite complex.
> 
> Please fix the indentation and add some comments to indicate when
> this is 2, 4, 8, 10 above the pattern and ask for the comment to
> be in sync with this.
> 
> How did we end up with length 10 ? That indicates 2 4 byte instructions
> and a 2 byte instruction ? You are handling lengths 2, 4, 8 above in
> the switch - is length 10 going to be a single A32 b<cond> instruction ?
> 
> What am I missing ?
> 
> 

Ah, sorry: I had not realised that "length" is the number of bytes in 
the instruction, so I just used it as a variable to check in the 
switch().
And yes, you are correct in assuming that length 10 would have been the 
A32 b<cond> version.
So the mapping I had in mind was:
2->  Thumb2 b<cond> - narrow 16-bit version
4->  Thumb2 b<cond> - wide 32-bit version
8->  Thumb2 b       - "far branch"
10-> A32 b<cond>

The new version, which keeps "length = number of bytes", would be:

2->  Thumb2 b<cond> - narrow 16-bit version
4->  Thumb2 b<cond> - wide 32-bit version OR A32 b<cond>
6->  Thumb2 "far branch", made up of one 16-bit b<cond> to a very close 
     Lbcond label and one 32-bit b (so 2+4 == 6)

I've gone ahead and done this in the new proposed patch. Let me know if 
it's ok! (I also changed the first check to !TARGET_THUMB2, which makes 
it slightly more readable.) I'm still not sure about this, so any 
suggestions are welcome!
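
As a rough sketch of the mapping above (plain C rather than the actual
RTL attribute, so illustrative only), the length selection could be
written as follows. The function names are made up for this example;
the offset bounds are the ones used in the arm.md conditions, and the
6-byte case assumes the short b<cond> to the nearby label is the 16-bit
encoding, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative only, not GCC code.  OFFSET is (target - pc) in bytes,
   as the "length" attribute condition sees it.  */
static int
cond_branch_length (bool thumb2, long offset)
{
  if (!thumb2)
    return 4;			/* A32 b<cond>.  */
  if (offset >= -250 && offset <= 256)
    return 2;			/* Thumb2 narrow 16-bit b<cond>.  */
  if (offset >= -1048568 && offset <= 1048576)
    return 4;			/* Thumb2 wide 32-bit b<cond>.  */
  return 6;			/* Far branch: 2-byte b<cond> + 4-byte b.  */
}

/* Spot checks at the boundaries of each window.  */
static void
cond_branch_length_selfcheck (void)
{
  assert (cond_branch_length (true, 256) == 2);
  assert (cond_branch_length (true, 258) == 4);
  assert (cond_branch_length (true, 1048576) == 4);
  assert (cond_branch_length (true, 1048578) == 6);
  assert (cond_branch_length (false, 1048578) == 4);
}
```

With this shape every returned value is a real byte count, so the
switch in the output template only needs cases for 2, 4 and 6.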

> 
>>   )
>>   
>>   (define_insn "*arm_cond_branch_reversed"
>> @@ -6978,17 +7005,36 @@
>>         arm_ccfsm_state += 2;
>>         return \"\";
>>       }
>> -  return \"b%D1\\t%l0\";
>> +     switch (get_attr_length (insn))
>> +      {
>> +	// Thumb2 16-bit b{cond}
>> +	case 2:
>> +
>> +	// Thumb2 32-bit b{cond}
>> +	case 4: return \"b%D1\\t%l0\";break;
>> +
>> +	// Thumb2 b{cond} out of range.  Use unconditional branch.
>> +	case 8: return arm_gen_far_branch \
>> +		(operands, 0, \"Lbcond\", \"b%d1\t\");
>> +		break;
>> +	// A32 b{cond}
>> +	default: return \"b%D1\\t%l0\";
>> +       }
>>     "
>>     [(set_attr "conds" "use")
>>      (set_attr "type" "branch")
>>      (set (attr "length")
>> -	(if_then_else
>> -	   (and (match_test "TARGET_THUMB2")
>> -		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> -		     (le (minus (match_dup 0) (pc)) (const_int 256))))
>> -	   (const_int 2)
>> -	   (const_int 4)))]
>> +	(if_then_else (match_test "TARGET_THUMB2")
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> +		(le (minus (match_dup 0) (pc)) (const_int 256)))
>> +	(const_int 2)
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc))
>> +							(const_int -1048568))
>> +		(le (minus (match_dup 0) (pc)) (const_int 1048576)))
>> +	(const_int 4)
>> +	(const_int 8)))
>> +	(const_int 10)))
>> +   ]
> 
> Same comments as above apply here too.
> 

Same as above.

Thank you for the feedback and apologies for being clueless :)

And, of course, let me know of any problems or queries!

Cheers,
Stam

> Ramana
> 
>>   )
>>   
>>   \f
>> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
>> new file mode 100644
>> index 00000000000..176bf61780b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>> @@ -0,0 +1,102 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>> +int printf(const char *, ...);
>> +
>> +__attribute__((noinline,noclone)) void f1(int a)
>> +{
>> +	if (a) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW0
>> +	}
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f2(int a)
>> +{
>> +	if (a) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW3
>> +	}
>> +}
>> +
>> +
>> +__attribute__((noinline,noclone)) void f3(int a)
>> +{
>> +	if (a) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW5
>> +	}
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f4(int a)
>> +{
>> +	if (a==1) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW0
>> +	}
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f5(int a)
>> +{
>> +	if (a==1) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW3
>> +	}
>> +}
>> +
>> +
>> +__attribute__((noinline,noclone)) void f6(int a)
>> +{
>> +	if (a==1) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW5
>> +	}
>> +}
>> +
>> +
>> +int main(void)
>> +{
>> +	f1(0);
>> +	f2(0);
>> +	f3(0);
>> +	f4(0);
>> +	f5(0);
>> +	f6(0);
>> +	return 0;
>> +}
>> +
>> +
>> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
>> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
>> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
>> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
>> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
> 

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 1.patch --]
[-- Type: text/x-patch; name="1.patch", Size: 9573 bytes --]

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index f995974f9bb..59ec219da3d 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -570,4 +570,6 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
 
 void arm_initialize_isa (sbitmap, const enum isa_feature *);
 
+const char * arm_gen_far_branch (rtx *, int, const char *, const char *);
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e1a1ef9a2..7a69ddb6b7b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32139,6 +32139,30 @@ arm_run_selftests (void)
 }
 } /* Namespace selftest.  */
 
+
+/* Generate code to enable conditional branches in functions over 1 MiB.  */
+const char *
+arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
+		    const char * branch_format)
+{
+  rtx_code_label * tmp_label = gen_label_rtx ();
+  char label_buf[256];
+  char buffer[128];
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
+			CODE_LABEL_NUMBER (tmp_label));
+  const char *label_ptr = arm_strip_name_encoding (label_buf);
+  rtx dest_label = operands[pos_label];
+  operands[pos_label] = tmp_label;
+
+  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
+  output_asm_insn (buffer, operands);
+
+  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
+  operands[pos_label] = dest_label;
+  output_asm_insn (buffer, operands);
+  return "";
+}
+
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
 #endif /* CHECKING_P */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f861c72ccfc..7e5e1489214 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6686,9 +6686,15 @@
 ;; And for backward branches we have 
 ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
 ;;
+;; In 16-bit Thumb these ranges are:
 ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
 ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
 
+;; In 32-bit Thumb these ranges are:
+;; For a 'b'       +/- 16MB is not checked for.
+;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
+;; (-1048568 -> 1048576).
+
 (define_expand "cbranchsi4"
   [(set (pc) (if_then_else
 	      (match_operator 0 "expandable_comparison_operator"
@@ -6946,23 +6952,56 @@
 		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-    {
-      arm_ccfsm_state += 2;
-      return \"\";
-    }
-  return \"b%d1\\t%l0\";
-  "
+  {
+	if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+	{
+		arm_ccfsm_state += 2;
+		return "";
+	}
+	switch (get_attr_length (insn))
+	{
+		/* Thumb2 16-bit b{cond}.  */
+		case 2:
+
+		/* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+		case 4: return "b%d1\t%l0";
+			break;
+
+		/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+		   unconditional branch b.  */
+		default: return arm_gen_far_branch \
+				(operands, 0, "Lbcond", "b%D1\t");
+	}
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+	;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+	(const_int 4)
+
+	;; Check if target is within 16-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			   (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+		;; Target is Thumb2, within narrow range.
+		;; Generate b{cond}.
+			(const_int 2)
+
+		;; Check if target is within 32-bit Thumb2 b{cond} range.
+			(if_then_else (and (ge (minus (match_dup 0)
+					 (pc))(const_int -1048568))
+					   (le (minus (match_dup 0)
+					 (pc)) (const_int 1048576)))
+
+		;; Target is Thumb2, within wide range.
+		;; Generate b{cond}
+						(const_int 4)
+		;; Target is Thumb2, out of range.
+		;; Generate narrow b{cond} and unconditional branch b.
+						(const_int 6)))))
+  ]
 )
 
 (define_insn "*arm_cond_branch_reversed"
@@ -6972,23 +7011,56 @@
 		      (pc)
 		      (label_ref (match_operand 0 "" ""))))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-    {
-      arm_ccfsm_state += 2;
-      return \"\";
-    }
-  return \"b%D1\\t%l0\";
-  "
+  {
+	if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+	{
+		arm_ccfsm_state += 2;
+		return "";
+	}
+	switch (get_attr_length (insn))
+	{
+		/* Thumb2 16-bit b{cond}.  */
+		case 2:
+
+		/* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+		case 4: return "b%D1\t%l0";
+			break;
+
+		/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+		   unconditional branch b.  */
+		default: return arm_gen_far_branch \
+				(operands, 0, "Lbcond", "b%d1\t");
+	}
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+	;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+	(const_int 4)
+
+	;; Check if target is within 16-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			   (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+		;; Target is Thumb2, within narrow range.
+		;; Generate b{cond}.
+			(const_int 2)
+
+		;; Check if target is within 32-bit Thumb2 b{cond} range.
+			(if_then_else (and (ge (minus (match_dup 0)
+					 (pc))(const_int -1048568))
+					   (le (minus (match_dup 0)
+					 (pc)) (const_int 1048576)))
+
+		;; Target is Thumb2, within wide range.
+		;; Generate b{cond}.
+						(const_int 4)
+		;; Target is Thumb2, out of range.
+		;; Generate narrow b{cond} and unconditional branch b.
+						(const_int 6)))))
+  ]
 )
 
 \f
diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 00000000000..176bf61780b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,102 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
+int printf(const char *, ...);
+
+__attribute__((noinline,noclone)) void f1(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f2(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f3(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+__attribute__((noinline,noclone)) void f4(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f5(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f6(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+
+int main(void)
+{
+	f1(0);
+	f2(0);
+	f3(0);
+	f4(0);
+	f5(0);
+	f6(0);
+	return 0;
+}
+
+
+/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2019-10-21  9:40   ` Stam Markianos-Wright
@ 2019-11-15 17:27     ` Stam Markianos-Wright
  2019-12-02 16:43       ` Stam Markianos-Wright
  2019-12-10 17:03       ` Kyrill Tkachov
  0 siblings, 2 replies; 15+ messages in thread
From: Stam Markianos-Wright @ 2019-11-15 17:27 UTC (permalink / raw)
  To: gcc-patches; +Cc: nickc, Ramana Radhakrishnan, Kyrylo Tkachov, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 12759 bytes --]

Pinging again, this time with the correct maintainers :)

This would also need backporting to GCC 7, 8 and 9, but it needs to be
approved first!

Thank you,
Stam


-------- Forwarded Message --------
Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional 
branches in Thumb2 (PR91816)
Date: Mon, 21 Oct 2019 10:37:09 +0100
From: Stam Markianos-Wright <stam.markianos-wright@arm.com>
To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
CC: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, 
James Greenhalgh <James.Greenhalgh@arm.com>, Richard Earnshaw 
<Richard.Earnshaw@arm.com>



On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>>
>> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>> however, on my native Aarch32 setup the test times out when run as part
>> of a big "make check-gcc" regression, but not when run individually.
>>
>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>
>> 	* config/arm/arm.md: Update b<cond> for Thumb2 range checks.
>> 	* config/arm/arm.c: New function arm_gen_far_branch.
>>    	* config/arm/arm-protos.h: New function arm_gen_far_branch
>> 	prototype.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>
>>    	* testsuite/gcc.target/arm/pr91816.c: New test.
> 
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index f995974f9bb..1dce333d1c3 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
>>   
>>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>   
>> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>> +
>> +
> 
> Let's get the nits out of the way.
> 
> Unnecessary extra new line, need a space between int and const above.
> 
> 

Fixed!

>>   #endif /* ! GCC_ARM_PROTOS_H */
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 39e1a1ef9a2..1a693d2ddca 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>>   }
>>   } /* Namespace selftest.  */
>>   
>> +
>> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>> +const char *
>> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>> +			const char * branch_format)
> 
> Not sure if this is some munging from the attachment but check
> vertical alignment of parameters.
> 

Fixed!

>> +{
>> +  rtx_code_label * tmp_label = gen_label_rtx ();
>> +  char label_buf[256];
>> +  char buffer[128];
>> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>> +			CODE_LABEL_NUMBER (tmp_label));
>> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>> +  rtx dest_label = operands[pos_label];
>> +  operands[pos_label] = tmp_label;
>> +
>> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>> +  output_asm_insn (buffer, operands);
>> +
>> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>> +  operands[pos_label] = dest_label;
>> +  output_asm_insn (buffer, operands);
>> +  return "";
>> +}
>> +
>> +
> 
> Unnecessary extra newline.
> 

Fixed!
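
For anyone reading along, here is a rough, standalone sketch of the text
the function quoted above ends up emitting. It is not the GCC code:
format_far_branch is a hypothetical helper that writes into a plain
buffer instead of calling output_asm_insn, and the label names in the
usage note are made up. The idea is that an out-of-range b<cond> is
replaced by the inverted condition branching over an unconditional b.

```c
#include <stdio.h>

/* Hypothetical stand-in for the two strings arm_gen_far_branch hands to
   output_asm_insn, joined into one buffer.  inv_cond is the inverted
   condition ("ne" when the original branch was a beq), skip_label is the
   freshly generated Lbcond label and dest_label the original target.
   Returns what snprintf returns.  */
int
format_far_branch (char *out, size_t size, const char *inv_cond,
		   const char *skip_label, const char *dest_label)
{
  return snprintf (out, size, "b%s\t%s\n\tb\t%s\n%s:",
		   inv_cond, skip_label, dest_label, skip_label);
}
```

So a far "beq .L2" would, under these assumptions, come out as
"bne .Lbcond7", then "b .L2", then the label ".Lbcond7:" (the numbers are
purely illustrative).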

>>   #undef TARGET_RUN_TARGET_SELFTESTS
>>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>   #endif /* CHECKING_P */
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index f861c72ccfc..634fd0a59da 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -6686,9 +6686,16 @@
>>   ;; And for backward branches we have
>>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>   ;;
>> +;; In 16-bit Thumb these ranges are:
>>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>>   ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
>>   
>> +;; In 32-bit Thumb these ranges are:
>> +;; For a 'b'       +/- 16MB is not checked for.
>> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
>> +;; (-1048568 -> 1048576).
>> +
>> +
> 
> Unnecessary extra newline.
> 

Fixed!
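
In case it helps review the numbers in that comment, this is how I get
the bounds. It is only a sketch of the arithmetic: the helper names are
hypothetical, and I am assuming the forward formula is
(pos_range - pos_base_offs + pc_offs) with pos_base_offs = 2, mirroring
the backward formula quoted above.

```c
/* Upper bound on (target - pc) for a forward branch, assuming the
   formula pos_range - pos_base_offs + pc_offs.  */
long
forward_bound (long pos_range, long pos_base_offs, long pc_offs)
{
  return pos_range - pos_base_offs + pc_offs;
}

/* Lower bound on (target - pc) for a backward branch, using
   neg_range - neg_base_offs + pc_offs from the arm.md comment.  */
long
backward_bound (long neg_range, long neg_base_offs, long pc_offs)
{
  return neg_range - neg_base_offs + pc_offs;
}
```

With pc_offs = 4, the 32-bit b<cond> values are 1048574 - 2 + 4 = 1048576
and -1048576 - (-4) + 4 = -1048568, and the 16-bit b<cond> values are
254 - 2 + 4 = 256 and -256 - (-2) + 4 = -250, which match the comment.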

>>   (define_expand "cbranchsi4"
>>     [(set (pc) (if_then_else
>>   	      (match_operator 0 "expandable_comparison_operator"
>> @@ -6947,22 +6954,42 @@
>>   		      (pc)))]
>>     "TARGET_32BIT"
>>     "*
>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> -    {
>> -      arm_ccfsm_state += 2;
>> -      return \"\";
>> -    }
>> -  return \"b%d1\\t%l0\";
>> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> +      {
>> +	arm_ccfsm_state += 2;
>> +	return \"\";
>> +      }
>> +     switch (get_attr_length (insn))
>> +      {
>> +	// Thumb2 16-bit b{cond}
>> +	case 2:
>> +
>> +	// Thumb2 32-bit b{cond}
>> +	case 4: return \"b%d1\\t%l0\";break;
>> +
>> +	// Thumb2 b{cond} out of range.  Use unconditional branch.
>> +	case 8: return arm_gen_far_branch \
>> +		(operands, 0, \"Lbcond\", \"b%D1\t\");
>> +	break;
>> +
>> +	// A32 b{cond}
>> +	default: return \"b%d1\\t%l0\";
>> +      }
> 
> Please fix indentation here.
> 

Fixed, together with the changes below.

>>     "
>>     [(set_attr "conds" "use")
>>      (set_attr "type" "branch")
>>      (set (attr "length")
>> -	(if_then_else
>> -	   (and (match_test "TARGET_THUMB2")
>> -		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> -		     (le (minus (match_dup 0) (pc)) (const_int 256))))
>> -	   (const_int 2)
>> -	   (const_int 4)))]
>> +	(if_then_else (match_test "TARGET_THUMB2")
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> +	(le (minus (match_dup 0) (pc)) (const_int 256)))
>> +	(const_int 2)
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc))
>> +						(const_int -1048568))
>> +			(le (minus (match_dup 0) (pc)) (const_int 1048576)))
>> +	(const_int 4)
>> +	(const_int 8)))
>> +	(const_int 10)))
>> +   ]
> 
> This conditional is unreadable and is getting quite complex.
> 
> Please fix the indentation and add a comment above the pattern that
> explains when the length is 2, 4, 8 or 10, with a note asking that the
> comment be kept in sync with this condition.
> 
> How did we end up with length 10? That indicates two 4-byte instructions
> and one 2-byte instruction? You are handling lengths 2, 4, 8 above in
> the switch - is length 10 going to be a single A32 b<cond> instruction?
> 
> What am I missing ?
> 
> 

Ah sorry, I had not realised that "length" is the number of bytes in the
instruction, so I just used it as a variable to check in the switch().
And yes, you are correct in assuming that length 10 would have been the
A32 b<cond> version.
So the mapping I had in mind was:
2->  Thumb2 b<cond> - narrow 16-bit version
4->  Thumb2 b<cond> - wide 32-bit version
8->  Thumb2 b       - "far branch".
10-> A32 b<cond>

The new version, which keeps "length == number of bytes", would be:

2->  Thumb2 b<cond> - narrow 16-bit version
4->  Thumb2 b<cond> - wide 32-bit version OR A32 b<cond>
6->  Thumb2 "far branch": one narrow (16-bit) b<cond> to a nearby Lbcond
label plus one wide (32-bit) b, so 2 + 4 == 6.

I've gone ahead and done this in the new proposed patch. Let me know if
it's ok! (I also changed the first check to !TARGET_THUMB2, which makes
it slightly more readable.) I'm still not sure about this, so any
suggestions are welcome!
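
To make the intended mapping concrete, here is a small standalone sketch
of the decision the new "length" attribute is meant to encode. It is
plain C rather than the .md condition, cond_branch_length is a
hypothetical name, and the bounds are simply the constants from the
patch.

```c
#include <stdbool.h>

/* Bounds on (target - pc) in bytes, copied from the arm.md conditions
   in the patch.  */
enum
{
  NARROW_MIN = -250,
  NARROW_MAX = 256,
  WIDE_MIN = -1048568,
  WIDE_MAX = 1048576
};

/* Hypothetical helper, not part of GCC: returns the insn length in
   bytes for a conditional branch with the given offset.  */
int
cond_branch_length (bool target_thumb2, long offset)
{
  if (!target_thumb2)
    return 4;	/* A32 b<cond>.  */
  if (offset >= NARROW_MIN && offset <= NARROW_MAX)
    return 2;	/* Thumb2 narrow b<cond>.  */
  if (offset >= WIDE_MIN && offset <= WIDE_MAX)
    return 4;	/* Thumb2 wide b<cond>.  */
  return 6;	/* Narrow b<cond> over a wide unconditional b.  */
}
```

The switch in the output template then only needs to tell 2 and 4 (emit
a plain b<cond>) apart from everything else (emit the far branch).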

> 
>>   )
>>   
>>   (define_insn "*arm_cond_branch_reversed"
>> @@ -6978,17 +7005,36 @@
>>         arm_ccfsm_state += 2;
>>         return \"\";
>>       }
>> -  return \"b%D1\\t%l0\";
>> +     switch (get_attr_length (insn))
>> +      {
>> +	// Thumb2 16-bit b{cond}
>> +	case 2:
>> +
>> +	// Thumb2 32-bit b{cond}
>> +	case 4: return \"b%D1\\t%l0\";break;
>> +
>> +	// Thumb2 b{cond} out of range.  Use unconditional branch.
>> +	case 8: return arm_gen_far_branch \
>> +		(operands, 0, \"Lbcond\", \"b%d1\t\");
>> +		break;
>> +	// A32 b{cond}
>> +	default: return \"b%D1\\t%l0\";
>> +       }
>>     "
>>     [(set_attr "conds" "use")
>>      (set_attr "type" "branch")
>>      (set (attr "length")
>> -	(if_then_else
>> -	   (and (match_test "TARGET_THUMB2")
>> -		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> -		     (le (minus (match_dup 0) (pc)) (const_int 256))))
>> -	   (const_int 2)
>> -	   (const_int 4)))]
>> +	(if_then_else (match_test "TARGET_THUMB2")
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> +		(le (minus (match_dup 0) (pc)) (const_int 256)))
>> +	(const_int 2)
>> +	(if_then_else (and (ge (minus (match_dup 0) (pc))
>> +							(const_int -1048568))
>> +		(le (minus (match_dup 0) (pc)) (const_int 1048576)))
>> +	(const_int 4)
>> +	(const_int 8)))
>> +	(const_int 10)))
>> +   ]
> 
> Same comments as above apply here too.
> 

Same as above.

Thank you for the feedback and apologies for being clueless :)

And, of course, let me know of any problems or queries!

Cheers,
Stam

> Ramana
> 
>>   )
>>   
>>   \f
>> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
>> new file mode 100644
>> index 00000000000..176bf61780b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>> @@ -0,0 +1,102 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>> +int printf(const char *, ...);
>> +
>> +__attribute__((noinline,noclone)) void f1(int a)
>> +{
>> +	if (a) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW0
>> +	}
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f2(int a)
>> +{
>> +	if (a) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW3
>> +	}
>> +}
>> +
>> +
>> +__attribute__((noinline,noclone)) void f3(int a)
>> +{
>> +	if (a) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW5
>> +	}
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f4(int a)
>> +{
>> +	if (a==1) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW0
>> +	}
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f5(int a)
>> +{
>> +	if (a==1) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW3
>> +	}
>> +}
>> +
>> +
>> +__attribute__((noinline,noclone)) void f6(int a)
>> +{
>> +	if (a==1) {
>> +#define HW0	printf("Hello World!\n");
>> +#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +		HW5
>> +	}
>> +}
>> +
>> +
>> +int main(void)
>> +{
>> +	f1(0);
>> +	f2(0);
>> +	f3(0);
>> +	f4(0);
>> +	f5(0);
>> +	f6(0);
>> +	return 0;
>> +}
>> +
>> +
>> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
>> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
>> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
>> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
>> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
> 



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 1.patch --]
[-- Type: text/x-patch; name="1.patch", Size: 9577 bytes --]

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index f995974f9bb..59ec219da3d 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -570,4 +570,6 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
 
 void arm_initialize_isa (sbitmap, const enum isa_feature *);
 
+const char * arm_gen_far_branch (rtx *, int, const char *, const char *);
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e1a1ef9a2..7a69ddb6b7b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32139,6 +32139,30 @@ arm_run_selftests (void)
 }
 } /* Namespace selftest.  */
 
+
+/* Generate code to enable conditional branches in functions over 1 MiB.  */
+const char *
+arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
+		    const char * branch_format)
+{
+  rtx_code_label * tmp_label = gen_label_rtx ();
+  char label_buf[256];
+  char buffer[128];
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
+			CODE_LABEL_NUMBER (tmp_label));
+  const char *label_ptr = arm_strip_name_encoding (label_buf);
+  rtx dest_label = operands[pos_label];
+  operands[pos_label] = tmp_label;
+
+  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
+  output_asm_insn (buffer, operands);
+
+  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
+  operands[pos_label] = dest_label;
+  output_asm_insn (buffer, operands);
+  return "";
+}
+
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
 #endif /* CHECKING_P */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f861c72ccfc..7e5e1489214 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6686,9 +6686,15 @@
 ;; And for backward branches we have 
 ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
 ;;
+;; In 16-bit Thumb these ranges are:
 ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
 ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
 
+;; In 32-bit Thumb these ranges are:
+;; For a 'b'       +/- 16MB is not checked for.
+;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
+;; (-1048568 -> 1048576).
+
 (define_expand "cbranchsi4"
   [(set (pc) (if_then_else
 	      (match_operator 0 "expandable_comparison_operator"
@@ -6946,23 +6952,56 @@
 		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-    {
-      arm_ccfsm_state += 2;
-      return \"\";
-    }
-  return \"b%d1\\t%l0\";
-  "
+  {
+	if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+	{
+		arm_ccfsm_state += 2;
+		return "";
+	}
+	switch (get_attr_length (insn))
+	{
+		/* Thumb2 16-bit b{cond}.  */
+		case 2:
+
+		/* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+		case 4: return "b%d1\t%l0";
+			break;
+
+		/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+		   unconditional branch b.  */
+		default: return arm_gen_far_branch \
+				(operands, 0, "Lbcond", "b%D1\t");
+	}
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+	;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+	(const_int 4)
+
+	;; Check if target is within 16-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			   (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+		;; Target is Thumb2, within narrow range.
+		;; Generate b{cond}.
+			(const_int 2)
+
+		;; Check if target is within 32-bit Thumb2 b{cond} range.
+			(if_then_else (and (ge (minus (match_dup 0)
+					 (pc))(const_int -1048568))
+					   (le (minus (match_dup 0)
+					 (pc)) (const_int 1048576)))
+
+		;; Target is Thumb2, within wide range.
+		;; Generate b{cond}
+						(const_int 4)
+		;; Target is Thumb2, out of range.
+		;; Generate narrow b{cond} and unconditional branch b.
+						(const_int 6)))))
+  ]
 )
 
 (define_insn "*arm_cond_branch_reversed"
@@ -6972,23 +7011,56 @@
 		      (pc)
 		      (label_ref (match_operand 0 "" ""))))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-    {
-      arm_ccfsm_state += 2;
-      return \"\";
-    }
-  return \"b%D1\\t%l0\";
-  "
+  {
+	if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+	{
+		arm_ccfsm_state += 2;
+		return "";
+	}
+	switch (get_attr_length (insn))
+	{
+		/* Thumb2 16-bit b{cond}.  */
+		case 2:
+
+		/* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+		case 4: return "b%D1\t%l0";
+			break;
+
+		/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+		   unconditional branch b.  */
+		default: return arm_gen_far_branch \
+				(operands, 0, "Lbcond", "b%d1\t");
+	}
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+	;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+	(const_int 4)
+
+	;; Check if target is within 16-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			   (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+		;; Target is Thumb2, within narrow range.
+		;; Generate b{cond}.
+			(const_int 2)
+
+		;; Check if target is within 32-bit Thumb2 b{cond} range.
+			(if_then_else (and (ge (minus (match_dup 0)
+					 (pc))(const_int -1048568))
+					   (le (minus (match_dup 0)
+					 (pc)) (const_int 1048576)))
+
+		;; Target is Thumb2, within wide range.
+		;; Generate b{cond}.
+						(const_int 4)
+		;; Target is Thumb2, out of range.
+		;; Generate narrow b{cond} and unconditional branch b.
+						(const_int 6)))))
+  ]
 )
 
 \f
diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 00000000000..176bf61780b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,102 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
+int printf(const char *, ...);
+
+__attribute__((noinline,noclone)) void f1(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f2(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f3(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+__attribute__((noinline,noclone)) void f4(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f5(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f6(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+
+int main(void)
+{
+	f1(0);
+	f2(0);
+	f3(0);
+	f4(0);
+	f5(0);
+	f6(0);
+	return 0;
+}
+
+
+/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */




* [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2019-11-15 17:27     ` [PING][PATCH][GCC][ARM] " Stam Markianos-Wright
@ 2019-12-02 16:43       ` Stam Markianos-Wright
  2019-12-09 17:50         ` Stam Markianos-Wright
  2019-12-10 17:03       ` Kyrill Tkachov
  1 sibling, 1 reply; 15+ messages in thread
From: Stam Markianos-Wright @ 2019-12-02 16:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: nickc, Ramana Radhakrishnan, Kyrylo Tkachov, Richard Earnshaw



On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
> Pinging with more correct maintainers this time :)
> 
> Also would need to backport to gcc7,8,9, but need to get this approved 
> first!
> 
> Thank you,
> Stam
> 
> 
> -------- Forwarded Message --------
> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional 
> branches in Thumb2 (PR91816)
> Date: Mon, 21 Oct 2019 10:37:09 +0100
> From: Stam Markianos-Wright <stam.markianos-wright@arm.com>
> To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
> CC: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, 
> James Greenhalgh <James.Greenhalgh@arm.com>, Richard Earnshaw 
> <Richard.Earnshaw@arm.com>
> 
> 
> 
> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>>>
>>> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>>> however, on my native Aarch32 setup the test times out when run as part
>>> of a big "make check-gcc" regression, but not when run individually.
>>>
>>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>>
>>>     * config/arm/arm.md: Update b<cond> for Thumb2 range checks.
>>>     * config/arm/arm.c: New function arm_gen_far_branch.
>>>        * config/arm/arm-protos.h: New function arm_gen_far_branch
>>>     prototype.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>>
>>>        * testsuite/gcc.target/arm/pr91816.c: New test.
>>
>>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>>> index f995974f9bb..1dce333d1c3 100644
>>> --- a/gcc/config/arm/arm-protos.h
>>> +++ b/gcc/config/arm/arm-protos.h
>>> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
>>>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>>> +
>>> +
>>
>> Let's get the nits out of the way.
>>
>> Unnecessary extra new line, need a space between int and const above.
>>
>>
> 
> Fixed!
> 
>>>   #endif /* ! GCC_ARM_PROTOS_H */
>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> index 39e1a1ef9a2..1a693d2ddca 100644
>>> --- a/gcc/config/arm/arm.c
>>> +++ b/gcc/config/arm/arm.c
>>> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>>>   }
>>>   } /* Namespace selftest.  */
>>> +
>>> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>>> +const char *
>>> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>>> +            const char * branch_format)
>>
>> Not sure if this is some munging from the attachment but check
>> vertical alignment of parameters.
>>
> 
> Fixed!
> 
>>> +{
>>> +  rtx_code_label * tmp_label = gen_label_rtx ();
>>> +  char label_buf[256];
>>> +  char buffer[128];
>>> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>>> +            CODE_LABEL_NUMBER (tmp_label));
>>> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>>> +  rtx dest_label = operands[pos_label];
>>> +  operands[pos_label] = tmp_label;
>>> +
>>> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>>> +  output_asm_insn (buffer, operands);
>>> +
>>> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>>> +  operands[pos_label] = dest_label;
>>> +  output_asm_insn (buffer, operands);
>>> +  return "";
>>> +}
>>> +
>>> +
>>
>> Unnecessary extra newline.
>>
> 
> Fixed!
> 
>>>   #undef TARGET_RUN_TARGET_SELFTESTS
>>>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>>   #endif /* CHECKING_P */
>>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>>> index f861c72ccfc..634fd0a59da 100644
>>> --- a/gcc/config/arm/arm.md
>>> +++ b/gcc/config/arm/arm.md
>>> @@ -6686,9 +6686,16 @@
>>>   ;; And for backward branches we have
>>>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>>   ;;
>>> +;; In 16-bit Thumb these ranges are:
>>>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>>>   ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
>>> +;; In 32-bit Thumb these ranges are:
>>> +;; For a 'b'       +/- 16MB is not checked for.
>>> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
>>> +;; (-1048568 -> 1048576).
>>> +
>>> +
>>
>> Unnecessary extra newline.
>>
> 
> Fixed!
> 
>>>   (define_expand "cbranchsi4"
>>>     [(set (pc) (if_then_else
>>>             (match_operator 0 "expandable_comparison_operator"
>>> @@ -6947,22 +6954,42 @@
>>>                 (pc)))]
>>>     "TARGET_32BIT"
>>>     "*
>>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> -    {
>>> -      arm_ccfsm_state += 2;
>>> -      return \"\";
>>> -    }
>>> -  return \"b%d1\\t%l0\";
>>> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> +      {
>>> +    arm_ccfsm_state += 2;
>>> +    return \"\";
>>> +      }
>>> +     switch (get_attr_length (insn))
>>> +      {
>>> +    // Thumb2 16-bit b{cond}
>>> +    case 2:
>>> +
>>> +    // Thumb2 32-bit b{cond}
>>> +    case 4: return \"b%d1\\t%l0\";break;
>>> +
>>> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>> +    case 8: return arm_gen_far_branch \
>>> +        (operands, 0, \"Lbcond\", \"b%D1\t\");
>>> +    break;
>>> +
>>> +    // A32 b{cond}
>>> +    default: return \"b%d1\\t%l0\";
>>> +      }
>>
>> Please fix indentation here.
>>
> 
> Fixed together with the changes below.
> 
>>>     "
>>>     [(set_attr "conds" "use")
>>>      (set_attr "type" "branch")
>>>      (set (attr "length")
>>> -    (if_then_else
>>> -       (and (match_test "TARGET_THUMB2")
>>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>>> -       (const_int 2)
>>> -       (const_int 4)))]
>>> +    (if_then_else (match_test "TARGET_THUMB2")
>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> +    (le (minus (match_dup 0) (pc)) (const_int 256)))
>>> +    (const_int 2)
>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>> +                        (const_int -1048568))
>>> +            (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>> +    (const_int 4)
>>> +    (const_int 8)))
>>> +    (const_int 10)))
>>> +   ]
>>
>> This conditional is unreadable and is getting quite complex.
>>
>> Please fix the indentation and add some comments to indicate when
>> this is 2, 4, 8, 10 above the pattern and ask for the comment to
>> be in sync with this.
>>
>> How did we end up with length 10 ? That indicates 2 4 byte instructions
>> and a 2 byte instruction ? You are handling lengths 2, 4, 8 above in
>> the switch - is length 10 going to be a single A32 b<cond> instruction ?
>>
>> What am I missing ?
>>
>>
> 
> Ah sorry, I had not realised that the "length" related to the number of 
> bytes in the instruction, so I just used it as a variable to then check 
> in the switch().
> And yes, you are correct in assuming that length 10 would have been the 
> A32 b<cond> version.
> So the mapping I had in mind was:
> 2->  Thumb2 b<cond> - narrow 16bit version
> 4->  Thumb2 b<cond> - wide 32bit version
> 8->  Thumb2 b       - "far branch".
> 10-> A32 b<cond>
> 
> The new version that maintains the "length=number of bytes" would be:
> 
> 2->  Thumb2 b<cond> - narrow 16bit version
> 4->  Thumb2 b<cond> - wide 32bit version OR A32 b<cond>
> 6->  Thumb2 "far branch": a narrow (16-bit) b<cond> to the nearby Lbcond 
> label followed by a wide (32-bit) b, so 2 + 4 == 6 bytes.
> 
> I've gone ahead and done this in the new proposed patch. Let me know if 
> it's ok! (also I changed the first check to !TARGET_THUMB2 - this makes 
> it slightly more readable). I'm still not sure about this, so any 
> suggestions are welcome!
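
A rough C sketch of the revised mapping quoted above, for illustration only.
The helper name cond_branch_length is hypothetical and is not part of the
patch, and the thresholds are assumed to be the ones from the arm.md hunk
quoted in this thread:

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the patch.  Mirrors the revised
   "length" attribute, whose value is the byte count of the emitted
   sequence.  OFFSET is the branch target minus the pc, in bytes.  */
static int
cond_branch_length (bool thumb2, long offset)
{
  if (!thumb2)
    return 4;	/* A32 b<cond>.  */
  if (offset >= -250 && offset <= 256)
    return 2;	/* Narrow 16-bit Thumb2 b<cond>.  */
  if (offset >= -1048568 && offset <= 1048576)
    return 4;	/* Wide 32-bit Thumb2 b<cond>.  */
  return 6;	/* Far branch: narrow conditional branch plus wide b.  */
}
```

With this mapping the attribute value is always the size in bytes of what
gets emitted, so the far-branch case comes out as 2 + 4 == 6.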

Ping :)

> 
>>
>>>   )
>>>   (define_insn "*arm_cond_branch_reversed"
>>> @@ -6978,17 +7005,36 @@
>>>         arm_ccfsm_state += 2;
>>>         return \"\";
>>>       }
>>> -  return \"b%D1\\t%l0\";
>>> +     switch (get_attr_length (insn))
>>> +      {
>>> +    // Thumb2 16-bit b{cond}
>>> +    case 2:
>>> +
>>> +    // Thumb2 32-bit b{cond}
>>> +    case 4: return \"b%D1\\t%l0\";break;
>>> +
>>> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>> +    case 8: return arm_gen_far_branch \
>>> +        (operands, 0, \"Lbcond\", \"b%d1\t\");
>>> +        break;
>>> +    // A32 b{cond}
>>> +    default: return \"b%D1\\t%l0\";
>>> +       }
>>>     "
>>>     [(set_attr "conds" "use")
>>>      (set_attr "type" "branch")
>>>      (set (attr "length")
>>> -    (if_then_else
>>> -       (and (match_test "TARGET_THUMB2")
>>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>>> -       (const_int 2)
>>> -       (const_int 4)))]
>>> +    (if_then_else (match_test "TARGET_THUMB2")
>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> +        (le (minus (match_dup 0) (pc)) (const_int 256)))
>>> +    (const_int 2)
>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>> +                            (const_int -1048568))
>>> +        (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>> +    (const_int 4)
>>> +    (const_int 8)))
>>> +    (const_int 10)))
>>> +   ]
>>
>> Same comments as above apply here too.
>>
> 
> Same as above.

Ping :)

> 
> Thank you for the feedback and apologies for being clueless :)
> 
> And, of course, let me know of any problems or queries!
> 
> Cheers,
> Stam
> 
>> Ramana
>>
>>>   )
>>>   \f
>>
> 
> 


* [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2019-12-02 16:43       ` Stam Markianos-Wright
@ 2019-12-09 17:50         ` Stam Markianos-Wright
  0 siblings, 0 replies; 15+ messages in thread
From: Stam Markianos-Wright @ 2019-12-09 17:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: nickc, Ramana Radhakrishnan, Kyrylo Tkachov, Richard Earnshaw



On 12/2/19 4:43 PM, Stam Markianos-Wright wrote:
> 
> 
> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>> Pinging with more correct maintainers this time :)
>>
>> Also would need to backport to gcc7,8,9, but need to get this approved 
>> first!
>>
>> Thank you,
>> Stam
>>
>>
>> -------- Forwarded Message --------
>> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional 
>> branches in Thumb2 (PR91816)
>> Date: Mon, 21 Oct 2019 10:37:09 +0100
>> From: Stam Markianos-Wright <stam.markianos-wright@arm.com>
>> To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
>> CC: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, nd 
>> <nd@arm.com>, James Greenhalgh <James.Greenhalgh@arm.com>, Richard 
>> Earnshaw <Richard.Earnshaw@arm.com>
>>
>>
>>
>> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>>>>
>>>> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>>>> however, on my native Aarch32 setup the test times out when run as part
>>>> of a big "make check-gcc" regression, but not when run individually.
>>>>
>>>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>>>
>>>>     * config/arm/arm.md: Update b<cond> for Thumb2 range checks.
>>>>     * config/arm/arm.c: New function arm_gen_far_branch.
>>>>        * config/arm/arm-protos.h: New function arm_gen_far_branch
>>>>     prototype.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>>>
>>>>        * testsuite/gcc.target/arm/pr91816.c: New test.
>>>
>>>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>>>> index f995974f9bb..1dce333d1c3 100644
>>>> --- a/gcc/config/arm/arm-protos.h
>>>> +++ b/gcc/config/arm/arm-protos.h
>>>> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
>>>>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>>> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>>>> +
>>>> +
>>>
>>> Let's get the nits out of the way.
>>>
>>> Unnecessary extra new line, need a space between int and const above.
>>>
>>>
>>
>> Fixed!
>>
>>>>   #endif /* ! GCC_ARM_PROTOS_H */
>>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>>> index 39e1a1ef9a2..1a693d2ddca 100644
>>>> --- a/gcc/config/arm/arm.c
>>>> +++ b/gcc/config/arm/arm.c
>>>> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>>>>   }
>>>>   } /* Namespace selftest.  */
>>>> +
>>>> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>>>> +const char *
>>>> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>>>> +            const char * branch_format)
>>>
>>> Not sure if this is some munging from the attachment but check
>>> vertical alignment of parameters.
>>>
>>
>> Fixed!
>>
>>>> +{
>>>> +  rtx_code_label * tmp_label = gen_label_rtx ();
>>>> +  char label_buf[256];
>>>> +  char buffer[128];
>>>> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>>>> +            CODE_LABEL_NUMBER (tmp_label));
>>>> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>>>> +  rtx dest_label = operands[pos_label];
>>>> +  operands[pos_label] = tmp_label;
>>>> +
>>>> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>>>> +  output_asm_insn (buffer, operands);
>>>> +
>>>> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>>>> +  operands[pos_label] = dest_label;
>>>> +  output_asm_insn (buffer, operands);
>>>> +  return "";
>>>> +}
>>>> +
>>>> +
>>>
>>> Unnecessary extra newline.
>>>
>>
>> Fixed!
>>
>>>>   #undef TARGET_RUN_TARGET_SELFTESTS
>>>>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>>>   #endif /* CHECKING_P */
>>>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>>>> index f861c72ccfc..634fd0a59da 100644
>>>> --- a/gcc/config/arm/arm.md
>>>> +++ b/gcc/config/arm/arm.md
>>>> @@ -6686,9 +6686,16 @@
>>>>   ;; And for backward branches we have
>>>>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>>>   ;;
>>>> +;; In 16-bit Thumb these ranges are:
>>>>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>>>>   ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
>>>> +;; In 32-bit Thumb these ranges are:
>>>> +;; For a 'b'       +/- 16MB is not checked for.
>>>> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
>>>> +;; (-1048568 -> 1048576).
>>>> +
>>>> +
>>>
>>> Unnecessary extra newline.
>>>
>>
>> Fixed!
>>
>>>>   (define_expand "cbranchsi4"
>>>>     [(set (pc) (if_then_else
>>>>             (match_operator 0 "expandable_comparison_operator"
>>>> @@ -6947,22 +6954,42 @@
>>>>                 (pc)))]
>>>>     "TARGET_32BIT"
>>>>     "*
>>>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>>> -    {
>>>> -      arm_ccfsm_state += 2;
>>>> -      return \"\";
>>>> -    }
>>>> -  return \"b%d1\\t%l0\";
>>>> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>>> +      {
>>>> +    arm_ccfsm_state += 2;
>>>> +    return \"\";
>>>> +      }
>>>> +     switch (get_attr_length (insn))
>>>> +      {
>>>> +    // Thumb2 16-bit b{cond}
>>>> +    case 2:
>>>> +
>>>> +    // Thumb2 32-bit b{cond}
>>>> +    case 4: return \"b%d1\\t%l0\";break;
>>>> +
>>>> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>>> +    case 8: return arm_gen_far_branch \
>>>> +        (operands, 0, \"Lbcond\", \"b%D1\t\");
>>>> +    break;
>>>> +
>>>> +    // A32 b{cond}
>>>> +    default: return \"b%d1\\t%l0\";
>>>> +      }
>>>
>>> Please fix indentation here.
>>>
>>
>> Fixed together with the changes below.
>>
>>>>     "
>>>>     [(set_attr "conds" "use")
>>>>      (set_attr "type" "branch")
>>>>      (set (attr "length")
>>>> -    (if_then_else
>>>> -       (and (match_test "TARGET_THUMB2")
>>>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>>>> -       (const_int 2)
>>>> -       (const_int 4)))]
>>>> +    (if_then_else (match_test "TARGET_THUMB2")
>>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> +    (le (minus (match_dup 0) (pc)) (const_int 256)))
>>>> +    (const_int 2)
>>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>>> +                        (const_int -1048568))
>>>> +            (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>>> +    (const_int 4)
>>>> +    (const_int 8)))
>>>> +    (const_int 10)))
>>>> +   ]
>>>
>>> This conditional is unreadable and is getting quite complex.
>>>
>>> Please fix the indentation and add some comments to indicate when
>>> this is 2, 4, 8, 10 above the pattern and ask for the comment to
>>> be in sync with this.
>>>
>>> How did we end up with length 10 ? That indicates 2 4 byte instructions
>>> and a 2 byte instruction ? You are handling lengths 2, 4, 8 above in
>>> the switch - is length 10 going to be a single A32 b<cond> instruction ?
>>>
>>> What am I missing ?
>>>
>>>
>>
>> Ah sorry, I had not realised that the "length" related to the number 
>> of bytes in the instruction, so I just used it as a variable to then 
>> check in the switch().
>> And yes, you are correct in assuming that length 10 would have been 
>> the A32 b<cond> version.
>> So the mapping I had in mind was:
>> 2->  Thumb2 b<cond> - narrow 16bit version
>> 4->  Thumb2 b<cond> - wide 32bit version
>> 8->  Thumb2 b       - "far branch".
>> 10-> A32 b<cond>
>>
>> The new version that maintains the "length=number of bytes" would be:
>>
>> 2->  Thumb2 b<cond> - narrow 16bit version
>> 4->  Thumb2 b<cond> - wide 32bit version OR A32 b<cond>
>> 6->  Thumb2 "far branch": a narrow (16-bit) b<cond> to the nearby Lbcond 
>> label followed by a wide (32-bit) b, so 2 + 4 == 6 bytes.
>>
>> I've gone ahead and done this in the new proposed patch. Let me know 
>> if it's ok! (also I changed the first check to !TARGET_THUMB2 - this 
>> makes it slightly more readable). I'm still not sure about this, so 
>> any suggestions are welcome!
> 
> Ping :)
> 

Ping

>>
>>>
>>>>   )
>>>>   (define_insn "*arm_cond_branch_reversed"
>>>> @@ -6978,17 +7005,36 @@
>>>>         arm_ccfsm_state += 2;
>>>>         return \"\";
>>>>       }
>>>> -  return \"b%D1\\t%l0\";
>>>> +     switch (get_attr_length (insn))
>>>> +      {
>>>> +    // Thumb2 16-bit b{cond}
>>>> +    case 2:
>>>> +
>>>> +    // Thumb2 32-bit b{cond}
>>>> +    case 4: return \"b%D1\\t%l0\";break;
>>>> +
>>>> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>>> +    case 8: return arm_gen_far_branch \
>>>> +        (operands, 0, \"Lbcond\", \"b%d1\t\");
>>>> +        break;
>>>> +    // A32 b{cond}
>>>> +    default: return \"b%D1\\t%l0\";
>>>> +       }
>>>>     "
>>>>     [(set_attr "conds" "use")
>>>>      (set_attr "type" "branch")
>>>>      (set (attr "length")
>>>> -    (if_then_else
>>>> -       (and (match_test "TARGET_THUMB2")
>>>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>>>> -       (const_int 2)
>>>> -       (const_int 4)))]
>>>> +    (if_then_else (match_test "TARGET_THUMB2")
>>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> +        (le (minus (match_dup 0) (pc)) (const_int 256)))
>>>> +    (const_int 2)
>>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>>> +                            (const_int -1048568))
>>>> +        (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>>> +    (const_int 4)
>>>> +    (const_int 8)))
>>>> +    (const_int 10)))
>>>> +   ]
>>>
>>> Same comments as above apply here too.
>>>
>>
>> Same as above.
> 
> Ping :)

Ping


> 
>>
>> Thank you for the feedback and apologies for being clueless :)
>>
>> And, of course, let me know of any problems or queries!
>>
>> Cheers,
>> Stam
>>
>>> Ramana
>>>
>>>>   )
>>>>   \f
>>>
>>
>>


* Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2019-11-15 17:27     ` [PING][PATCH][GCC][ARM] " Stam Markianos-Wright
  2019-12-02 16:43       ` Stam Markianos-Wright
@ 2019-12-10 17:03       ` Kyrill Tkachov
  2020-01-08 15:19         ` Stam Markianos-Wright
  1 sibling, 1 reply; 15+ messages in thread
From: Kyrill Tkachov @ 2019-12-10 17:03 UTC (permalink / raw)
  To: Stam Markianos-Wright, gcc-patches
  Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw

Hi Stam,

On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
> Pinging with more correct maintainers this time :)
>
> Also would need to backport to gcc7,8,9, but need to get this approved
> first!
>

Sorry for the delay.


> Thank you,
> Stam
>
>
> -------- Forwarded Message --------
> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional
> branches in Thumb2 (PR91816)
> Date: Mon, 21 Oct 2019 10:37:09 +0100
> From: Stam Markianos-Wright <stam.markianos-wright@arm.com>
> To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
> CC: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,
> James Greenhalgh <James.Greenhalgh@arm.com>, Richard Earnshaw
> <Richard.Earnshaw@arm.com>
>
>
>
> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
> >>
> >> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
> >> however, on my native Aarch32 setup the test times out when run as part
> >> of a big "make check-gcc" regression, but not when run individually.
> >>
> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
> >>
> >>       * config/arm/arm.md: Update b<cond> for Thumb2 range checks.
> >>       * config/arm/arm.c: New function arm_gen_far_branch.
> >>       * config/arm/arm-protos.h: New function arm_gen_far_branch
> >>       prototype.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
> >>
> >>       * testsuite/gcc.target/arm/pr91816.c: New test.
> >
> >> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> >> index f995974f9bb..1dce333d1c3 100644
> >> --- a/gcc/config/arm/arm-protos.h
> >> +++ b/gcc/config/arm/arm-protos.h
> >> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
> >>
> >>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
> >>
> >> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
> >> +
> >> +
> >
> > Let's get the nits out of the way.
> >
> > Unnecessary extra new line, need a space between int and const above.
> >
> >
>
> Fixed!
>
> >>   #endif /* ! GCC_ARM_PROTOS_H */
> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> >> index 39e1a1ef9a2..1a693d2ddca 100644
> >> --- a/gcc/config/arm/arm.c
> >> +++ b/gcc/config/arm/arm.c
> >> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
> >>   }
> >>   } /* Namespace selftest.  */
> >>
> >> +
> >> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
> >> +const char *
> >> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
> >> +                    const char * branch_format)
> >
> > Not sure if this is some munging from the attachment but check
> > vertical alignment of parameters.
> >
>
> Fixed!
>
> >> +{
> >> +  rtx_code_label * tmp_label = gen_label_rtx ();
> >> +  char label_buf[256];
> >> +  char buffer[128];
> >> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
> >> +                    CODE_LABEL_NUMBER (tmp_label));
> >> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
> >> +  rtx dest_label = operands[pos_label];
> >> +  operands[pos_label] = tmp_label;
> >> +
> >> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
> >> +  output_asm_insn (buffer, operands);
> >> +
> >> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
> >> +  operands[pos_label] = dest_label;
> >> +  output_asm_insn (buffer, operands);
> >> +  return "";
> >> +}
> >> +
> >> +
> >
> > Unnecessary extra newline.
> >
>
> Fixed!
>
> >>   #undef TARGET_RUN_TARGET_SELFTESTS
> >>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
> >>   #endif /* CHECKING_P */
> >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> >> index f861c72ccfc..634fd0a59da 100644
> >> --- a/gcc/config/arm/arm.md
> >> +++ b/gcc/config/arm/arm.md
> >> @@ -6686,9 +6686,16 @@
> >>   ;; And for backward branches we have
> >>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
> >>   ;;
> >> +;; In 16-bit Thumb these ranges are:
> >>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
> >>   ;; For a 'b<cond>' pos_range = 254, neg_range = -256  giving (-250 ->256).
> >>
> >> +;; In 32-bit Thumb these ranges are:
> >> +;; For a 'b'       +/- 16MB is not checked for.
> >> +;; For a 'b<cond>' pos_range = 1048574, neg_range = -1048576  giving
> >> +;; (-1048568 -> 1048576).
> >> +
> >> +
> >
> > Unnecessary extra newline.
> >
>
> Fixed!
>
> >>   (define_expand "cbranchsi4"
> >>     [(set (pc) (if_then_else
> >>             (match_operator 0 "expandable_comparison_operator"
> >> @@ -6947,22 +6954,42 @@
> >>                     (pc)))]
> >>     "TARGET_32BIT"
> >>     "*
> >> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> >> -    {
> >> -      arm_ccfsm_state += 2;
> >> -      return \"\";
> >> -    }
> >> -  return \"b%d1\\t%l0\";
> >> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> >> +      {
> >> +    arm_ccfsm_state += 2;
> >> +    return \"\";
> >> +      }
> >> +     switch (get_attr_length (insn))
> >> +      {
> >> +    // Thumb2 16-bit b{cond}
> >> +    case 2:
> >> +
> >> +    // Thumb2 32-bit b{cond}
> >> +    case 4: return \"b%d1\\t%l0\";break;
> >> +
> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
> >> +    case 8: return arm_gen_far_branch \
> >> +            (operands, 0, \"Lbcond\", \"b%D1\t\");
> >> +    break;
> >> +
> >> +    // A32 b{cond}
> >> +    default: return \"b%d1\\t%l0\";
> >> +      }
> >
> > Please fix indentation here.
> >
>
> Fixed together with below changes.
>
> >>     "
> >>     [(set_attr "conds" "use")
> >>      (set_attr "type" "branch")
> >>      (set (attr "length")
> >> -    (if_then_else
> >> -       (and (match_test "TARGET_THUMB2")
> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
> >> -       (const_int 2)
> >> -       (const_int 4)))]
> >> +    (if_then_else (match_test "TARGET_THUMB2")
> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> >> +    (le (minus (match_dup 0) (pc)) (const_int 256)))
> >> +    (const_int 2)
> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
> >> + (const_int -1048568))
> >> +                    (le (minus (match_dup 0) (pc)) (const_int 1048576)))
> >> +    (const_int 4)
> >> +    (const_int 8)))
> >> +    (const_int 10)))
> >> +   ]
> >
> > This conditional is unreadable and is getting quite complex.
> >
> > Please fix the indentation and add some comments to indicate when
> > this is 2, 4, 8, 10 above the pattern and ask for the comment to
> > be in sync with this.
> >
> > How did we end up with length 10 ? That indicates 2 4 byte instructions
> > and a 2 byte instruction ? You are handling lengths 2, 4, 8 above in
> > the switch - is length 10 going to be a single A32 b<cond> instruction ?
> >
> > What am I missing ?
> >
> >
>
> Ah sorry, I had not realised that the "length" related to the number of
> bytes in the instruction, so I just used it as a variable to then check
> in the switch().
> And yes, you are correct in assuming that length 10 would have been the
> A32 b<cond> version.
> So the mapping I had in mind was:
> 2->  Thumb2 b<cond> - narrow 16bit version
> 4->  Thumb2 b<cond> - wide 32bit version
> 8->  Thumb2 b       - "far branch".
> 10-> A32 b<cond>
>
> The new version that maintains the "length=number of bytes" would be:
>
> 2->  Thumb2 b<cond> - narrow 16bit version
> 4->  Thumb2 b<cond> - wide 32bit version OR A32 b<cond>
> 6->  Thumb2 "far branch" made up from one b<cond> to a very close Lbcond
> label (so 16 bits) and one b for 32 bits. (so 2+4 == 6)
>
> I've gone ahead and done this in the new proposed patch. Let me know if
> it's ok! (also I changed the first check to !TARGET_THUMB2 - this makes
> it slightly more readable). I'm still not sure about this, so any
> suggestions are welcome!
>
> >
> >>   )
> >>
> >>   (define_insn "*arm_cond_branch_reversed"
> >> @@ -6978,17 +7005,36 @@
> >>         arm_ccfsm_state += 2;
> >>         return \"\";
> >>       }
> >> -  return \"b%D1\\t%l0\";
> >> +     switch (get_attr_length (insn))
> >> +      {
> >> +    // Thumb2 16-bit b{cond}
> >> +    case 2:
> >> +
> >> +    // Thumb2 32-bit b{cond}
> >> +    case 4: return \"b%D1\\t%l0\";break;
> >> +
> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
> >> +    case 8: return arm_gen_far_branch \
> >> +            (operands, 0, \"Lbcond\", \"b%d1\t\");
> >> +            break;
> >> +    // A32 b{cond}
> >> +    default: return \"b%D1\\t%l0\";
> >> +       }
> >>     "
> >>     [(set_attr "conds" "use")
> >>      (set_attr "type" "branch")
> >>      (set (attr "length")
> >> -    (if_then_else
> >> -       (and (match_test "TARGET_THUMB2")
> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
> >> -       (const_int 2)
> >> -       (const_int 4)))]
> >> +    (if_then_else (match_test "TARGET_THUMB2")
> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> >> +            (le (minus (match_dup 0) (pc)) (const_int 256)))
> >> +    (const_int 2)
> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
> >> + (const_int -1048568))
> >> +            (le (minus (match_dup 0) (pc)) (const_int 1048576)))
> >> +    (const_int 4)
> >> +    (const_int 8)))
> >> +    (const_int 10)))
> >> +   ]
> >
> > Same comments as above apply here too.
> >
>
> Same as above.
>
> Thank you for the feedback and apologies for being clueless :)
>
> And, of course, let me know of any problems or queries!
>
> Cheers,
> Stam
>
> > Ramana
> >
> >>   )
> >>
> >>
> >> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
> >> new file mode 100644
> >> index 00000000000..176bf61780b
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
> >> @@ -0,0 +1,102 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
> >> +int printf(const char *, ...);
> >> +
> >> +__attribute__((noinline,noclone)) void f1(int a)
> >> +{
> >> +    if (a) {
> >> +#define HW0 printf("Hello World!\n");
> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> >> +            HW0
> >> +    }
> >> +}
> >> +
> >> +__attribute__((noinline,noclone)) void f2(int a)
> >> +{
> >> +    if (a) {
> >> +#define HW0 printf("Hello World!\n");
> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> >> +            HW3
> >> +    }
> >> +}
> >> +
> >> +
> >> +__attribute__((noinline,noclone)) void f3(int a)
> >> +{
> >> +    if (a) {
> >> +#define HW0 printf("Hello World!\n");
> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> >> +            HW5
> >> +    }
> >> +}
> >> +
> >> +__attribute__((noinline,noclone)) void f4(int a)
> >> +{
> >> +    if (a==1) {
> >> +#define HW0 printf("Hello World!\n");
> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> >> +            HW0
> >> +    }
> >> +}
> >> +
> >> +__attribute__((noinline,noclone)) void f5(int a)
> >> +{
> >> +    if (a==1) {
> >> +#define HW0 printf("Hello World!\n");
> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> >> +            HW3
> >> +    }
> >> +}
> >> +
> >> +
> >> +__attribute__((noinline,noclone)) void f6(int a)
> >> +{
> >> +    if (a==1) {
> >> +#define HW0 printf("Hello World!\n");
> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> >> +            HW5
> >> +    }
> >> +}
> >> +
> >> +
> >> +int main(void)
> >> +{
> >> +    f1(0);
> >> +    f2(0);
> >> +    f3(0);
> >> +    f4(0);
> >> +    f5(0);
> >> +    f6(0);
> >> +    return 0;
> >> +}
> >> +
> >> +
> >> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
> >> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
> >> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
> >> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
> >> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
> >
>

1.patch

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index f995974f9bb..59ec219da3d 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -570,4 +570,6 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
  
  void arm_initialize_isa (sbitmap, const enum isa_feature *);
  
+const char * arm_gen_far_branch (rtx *, int, const char *, const char *);
+
  #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e1a1ef9a2..7a69ddb6b7b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32139,6 +32139,30 @@ arm_run_selftests (void)
  }
  } /* Namespace selftest.  */
  
+
+/* Generate code to enable conditional branches in functions over 1 MiB.  */


Please document the function parameters in this comment as other functions in this file (try to) do.
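
As a rough illustration of the kind of parameter documentation being asked for, here is a stand-alone C model of the sequence the function emits. It is only a sketch under stated assumptions: format_far_branch and its parameter names are hypothetical, it uses plain snprintf rather than output_asm_insn, and the label spellings are examples rather than what the compiler would necessarily print.

```c
#include <stdio.h>

/* Hypothetical stand-alone model of a far conditional branch; this is
   not GCC code and does not use any GCC internals.

   Write into OUT (of size OUT_SIZE) a two-instruction sequence:
     b<INV_COND>  SKIP_LABEL
     b            DEST_LABEL
   SKIP_LABEL:
   INV_COND is the inverted condition suffix (e.g. "ne" when the original
   branch was "beq"), SKIP_LABEL is the local label placed directly after
   the sequence, and DEST_LABEL is the real, possibly distant, target.
   Return the value of snprintf, i.e. the length of the full text.  */
int
format_far_branch (char *out, size_t out_size, const char *inv_cond,
		   const char *skip_label, const char *dest_label)
{
  return snprintf (out, out_size, "b%s\t%s\n\tb\t%s\n%s:",
		   inv_cond, skip_label, dest_label, skip_label);
}
```

The short conditional branch only has to hop over one instruction, so it always fits the narrow encoding, while the unconditional b carries the long offset.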


+const char *
+arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
+		    const char * branch_format)
+{
+  rtx_code_label * tmp_label = gen_label_rtx ();
+  char label_buf[256];
+  char buffer[128];
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
+			CODE_LABEL_NUMBER (tmp_label));
+  const char *label_ptr = arm_strip_name_encoding (label_buf);
+  rtx dest_label = operands[pos_label];
+  operands[pos_label] = tmp_label;
+
+  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
+  output_asm_insn (buffer, operands);
+
+  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
+  operands[pos_label] = dest_label;
+  output_asm_insn (buffer, operands);
+  return "";
+}
+
  #undef TARGET_RUN_TARGET_SELFTESTS
  #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
  #endif /* CHECKING_P */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f861c72ccfc..7e5e1489214 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6686,9 +6686,15 @@
  ;; And for backward branches we have
  ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
  ;;
+;; In 16-bit Thumb these ranges are:
  ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
  ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
  
+;; In 32-bit Thumb these ranges are:
+;; For a 'b'       ± 16MB is not checked for.
+;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
+;; (-1048568 -> 1048576).
+
  (define_expand "cbranchsi4"
    [(set (pc) (if_then_else
  	      (match_operator 0 "expandable_comparison_operator"
@@ -6946,23 +6952,56 @@
  		      (label_ref (match_operand 0 "" ""))
  		      (pc)))]
    "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-    {
-      arm_ccfsm_state += 2;
-      return \"\";
-    }
-  return \"b%d1\\t%l0\";
-  "
+  {
+	if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+	{
+		arm_ccfsm_state += 2;
+		return "";
+	}
+	switch (get_attr_length (insn))
+	{
+		/* Thumb2 16-bit b{cond}.  */
+		case 2:
+
+		/* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+		case 4: return "b%d1\t%l0";
+			break;
+
+		/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+		   unconditional branch b.  */
+		default: return arm_gen_far_branch \
+				(operands, 0, "Lbcond", "b%D1\t");
+	}


The indentation here is wrong. Please look at how other switch statements are written in the backend for guidance: 2 space indentation, new line after the cases etc.
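
For reference, a minimal sketch of the GNU-style layout meant here, written as a stand-alone C function. The helper name branch_kind and the returned strings are hypothetical and exist only to show the formatting: braces indented two spaces under the switch, case labels aligned with the braces, and each statement on its own line two spaces further in.

```c
/* Hypothetical helper, used only to show the layout; not part of the
   patch.  LENGTH is the insn length in bytes.  */
const char *
branch_kind (int length)
{
  switch (length)
    {
    /* Thumb2 16-bit b{cond}.  */
    case 2:
    /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
    case 4:
      return "b<cond>";

    /* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
       unconditional branch b.  */
    default:
      return "far branch";
    }
}
```

Note that no break is needed after a return, which also removes the unreachable statements in the hunk above.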

+  }
    [(set_attr "conds" "use")
     (set_attr "type" "branch")
     (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+	;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+	(const_int 4)
+
+	;; Check if target is within 16-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			   (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+		;; Target is Thumb2, within narrow range.
+		;; Generate b{cond}.
+			(const_int 2)
+
+		;; Check if target is within 32-bit Thumb2 b{cond} range.
+			(if_then_else (and (ge (minus (match_dup 0)
+					 (pc))(const_int -1048568))
+					   (le (minus (match_dup 0)
+					 (pc)) (const_int 1048576)))
+
+		;; Target is Thumb2, within wide range.
+		;; Generate b{cond}
+						(const_int 4)
+		;; Target is Thumb2, out of range.
+		;; Generate narrow b{cond} and unconditional branch b.
+						(const_int 6)))))
+  ]


Likewise on the indentation.
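
To make the 2/4/6 mapping easier to check, here is a minimal C model of the decision tree that the length attribute above encodes. It is a stand-alone sketch, not GCC code: cond_branch_length and its parameters are hypothetical names, and OFFSET stands in for the (minus (match_dup 0) (pc)) value compared in the pattern.

```c
#include <stdbool.h>

/* Illustrative model of the "length" attribute, in bytes.  THUMB2 mirrors
   the TARGET_THUMB2 test; OFFSET is the label address minus pc.  */
int
cond_branch_length (bool thumb2, long offset)
{
  /* Not Thumb2, therefore A32: a single 4-byte b<cond>.  */
  if (!thumb2)
    return 4;

  /* Within 16-bit Thumb2 b<cond> range.  */
  if (offset >= -250 && offset <= 256)
    return 2;

  /* Within 32-bit Thumb2 b<cond> range.  */
  if (offset >= -1048568 && offset <= 1048576)
    return 4;

  /* Out of range: narrow b<cond> (2 bytes) plus wide b (4 bytes).  */
  return 6;
}
```

Each returned value is the number of bytes actually emitted, which is what the length attribute is expected to describe.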

  )
  
  (define_insn "*arm_cond_branch_reversed"
@@ -6972,23 +7011,56 @@
  		      (pc)
  		      (label_ref (match_operand 0 "" ""))))]
    "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-    {
-      arm_ccfsm_state += 2;
-      return \"\";
-    }
-  return \"b%D1\\t%l0\";
-  "
+  {
+	if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+	{
+		arm_ccfsm_state += 2;
+		return "";
+	}
+	switch (get_attr_length (insn))
+	{
+		/* Thumb2 16-bit b{cond}.  */
+		case 2:
+
+		/* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+		case 4: return "b%D1\t%l0";
+			break;
+
+		/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+		   unconditional branch b.  */
+		default: return arm_gen_far_branch \
+				(operands, 0, "Lbcond", "b%d1\t");
+	}




+  }
    [(set_attr "conds" "use")
     (set_attr "type" "branch")
     (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+	;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+	(const_int 4)
+
+	;; Check if target is within 16-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			   (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+		;; Target is Thumb2, within narrow range.
+		;; Generate b{cond}.
+			(const_int 2)
+
+		;; Check if target is within 32-bit Thumb2 b{cond} range.
+			(if_then_else (and (ge (minus (match_dup 0)
+					 (pc))(const_int -1048568))
+					   (le (minus (match_dup 0)
+					 (pc)) (const_int 1048576)))
+
+		;; Target is Thumb2, within wide range.
+		;; Generate b{cond}.
+						(const_int 4)
+		;; Target is Thumb2, out of range.
+		;; Generate narrow b{cond} and unconditional branch b.
+						(const_int 6)))))
+  ]
  )
  

Otherwise this looks reasonable to me. Ramana, did you have any further comments on the patch?
Thanks,
Kyrill


  
diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 00000000000..176bf61780b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,102 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
+int printf(const char *, ...);
+
+__attribute__((noinline,noclone)) void f1(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f2(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f3(int a)
+{
+	if (a) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+__attribute__((noinline,noclone)) void f4(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW0
+	}
+}
+
+__attribute__((noinline,noclone)) void f5(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW3
+	}
+}
+
+
+__attribute__((noinline,noclone)) void f6(int a)
+{
+	if (a==1) {
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+		HW5
+	}
+}
+
+
+int main(void)
+{
+	f1(0);
+	f2(0);
+	f3(0);
+	f4(0);
+	f5(0);
+	f6(0);
+	return 0;
+}
+
+
+/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */



* Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2019-12-10 17:03       ` Kyrill Tkachov
@ 2020-01-08 15:19         ` Stam Markianos-Wright
  2020-01-16 16:30           ` Stam Markianos-Wright
  2020-01-28 11:05           ` [PING][PATCH][GCC][ARM] " Kyrill Tkachov
  0 siblings, 2 replies; 15+ messages in thread
From: Stam Markianos-Wright @ 2020-01-08 15:19 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 28101 bytes --]



On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
> Hi Stam,
> 
> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>> Pinging with more correct maintainers this time :)
>>
>> Also would need to backport to gcc7,8,9, but need to get this approved
>> first!
>>
> 
> Sorry for the delay.

Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!

I've made the changes marked below and also removed the unnecessary extra #defines
from the test.

> 
> 
>> Thank you,
>> Stam
>>
>>
>> -------- Forwarded Message --------
>> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional
>> branches in Thumb2 (PR91816)
>> Date: Mon, 21 Oct 2019 10:37:09 +0100
>> From: Stam Markianos-Wright <stam.markianos-wright@arm.com>
>> To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
>> CC: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,
>> James Greenhalgh <James.Greenhalgh@arm.com>, Richard Earnshaw
>> <Richard.Earnshaw@arm.com>
>>
>>
>>
>> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>> >>
>> >> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>> >> however, on my native Aarch32 setup the test times out when run as part
>> >> of a big "make check-gcc" regression, but not when run individually.
>> >>
>> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>> >>
>> >>       * config/arm/arm.md: Update b<cond> for Thumb2 range checks.
>> >>       * config/arm/arm.c: New function arm_gen_far_branch.
>> >>       * config/arm/arm-protos.h: New function arm_gen_far_branch
>> >>       prototype.
>> >>
>> >> gcc/testsuite/ChangeLog:
>> >>
>> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>> >>
>> >>       * testsuite/gcc.target/arm/pr91816.c: New test.
>> >
>> >> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> >> index f995974f9bb..1dce333d1c3 100644
>> >> --- a/gcc/config/arm/arm-protos.h
>> >> +++ b/gcc/config/arm/arm-protos.h
>> >> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
>> >>
>> >>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>> >>
>> >> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>> >> +
>> >> +
>> >
>> > Let's get the nits out of the way.
>> >
>> > Unnecessary extra new line, need a space between int and const above.
>> >
>> >
>>
>> Fixed!
>>
>> >>   #endif /* ! GCC_ARM_PROTOS_H */
>> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> >> index 39e1a1ef9a2..1a693d2ddca 100644
>> >> --- a/gcc/config/arm/arm.c
>> >> +++ b/gcc/config/arm/arm.c
>> >> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>> >>   }
>> >>   } /* Namespace selftest.  */
>> >>
>> >> +
>> >> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>> >> +const char *
>> >> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>> >> +                    const char * branch_format)
>> >
>> > Not sure if this is some munging from the attachment but check
>> > vertical alignment of parameters.
>> >
>>
>> Fixed!
>>
>> >> +{
>> >> +  rtx_code_label * tmp_label = gen_label_rtx ();
>> >> +  char label_buf[256];
>> >> +  char buffer[128];
>> >> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>> >> +                    CODE_LABEL_NUMBER (tmp_label));
>> >> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>> >> +  rtx dest_label = operands[pos_label];
>> >> +  operands[pos_label] = tmp_label;
>> >> +
>> >> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>> >> +  output_asm_insn (buffer, operands);
>> >> +
>> >> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>> >> +  operands[pos_label] = dest_label;
>> >> +  output_asm_insn (buffer, operands);
>> >> +  return "";
>> >> +}
>> >> +
>> >> +
>> >
>> > Unnecessary extra newline.
>> >
>>
>> Fixed!
>>
>> >>   #undef TARGET_RUN_TARGET_SELFTESTS
>> >>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>> >>   #endif /* CHECKING_P */
>> >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> >> index f861c72ccfc..634fd0a59da 100644
>> >> --- a/gcc/config/arm/arm.md
>> >> +++ b/gcc/config/arm/arm.md
>> >> @@ -6686,9 +6686,16 @@
>> >>   ;; And for backward branches we have
>> >>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>> >>   ;;
>> >> +;; In 16-bit Thumb these ranges are:
>> >>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>> >>   ;; For a 'b<cond>' pos_range = 254, neg_range = -256  giving (-250 ->256).
>> >>
>> >> +;; In 32-bit Thumb these ranges are:
>> >> +;; For a 'b'       +/- 16MB is not checked for.
>> >> +;; For a 'b<cond>' pos_range = 1048574, neg_range = -1048576  giving
>> >> +;; (-1048568 -> 1048576).
>> >> +
>> >> +
>> >
>> > Unnecessary extra newline.
>> >
>>
>> Fixed!
>>
>> >>   (define_expand "cbranchsi4"
>> >>     [(set (pc) (if_then_else
>> >>             (match_operator 0 "expandable_comparison_operator"
>> >> @@ -6947,22 +6954,42 @@
>> >>                     (pc)))]
>> >>     "TARGET_32BIT"
>> >>     "*
>> >> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> >> -    {
>> >> -      arm_ccfsm_state += 2;
>> >> -      return \"\";
>> >> -    }
>> >> -  return \"b%d1\\t%l0\";
>> >> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> >> +      {
>> >> +    arm_ccfsm_state += 2;
>> >> +    return \"\";
>> >> +      }
>> >> +     switch (get_attr_length (insn))
>> >> +      {
>> >> +    // Thumb2 16-bit b{cond}
>> >> +    case 2:
>> >> +
>> >> +    // Thumb2 32-bit b{cond}
>> >> +    case 4: return \"b%d1\\t%l0\";break;
>> >> +
>> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>> >> +    case 8: return arm_gen_far_branch \
>> >> +            (operands, 0, \"Lbcond\", \"b%D1\t\");
>> >> +    break;
>> >> +
>> >> +    // A32 b{cond}
>> >> +    default: return \"b%d1\\t%l0\";
>> >> +      }
>> >
>> > Please fix indentation here.
>> >
>>
>> Fixed together with below changes.
>>
>> >>     "
>> >>     [(set_attr "conds" "use")
>> >>      (set_attr "type" "branch")
>> >>      (set (attr "length")
>> >> -    (if_then_else
>> >> -       (and (match_test "TARGET_THUMB2")
>> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
>> >> -       (const_int 2)
>> >> -       (const_int 4)))]
>> >> +    (if_then_else (match_test "TARGET_THUMB2")
>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> >> +    (le (minus (match_dup 0) (pc)) (const_int 256)))
>> >> +    (const_int 2)
>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>> >> + (const_int -1048568))
>> >> +                    (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>> >> +    (const_int 4)
>> >> +    (const_int 8)))
>> >> +    (const_int 10)))
>> >> +   ]
>> >
>> > This conditional is unreadable and is getting quite complex.
>> >
>> > Please fix the indentation and add some comments to indicate when
>> > this is 2, 4, 8, 10 above the pattern and ask for the comment to
>> > be in sync with this.
>> >
>> > How did we end up with length 10 ? That indicates 2 4 byte instructions
>> > and a 2 byte instruction ? You are handling lengths 2, 4, 8 above in
>> > the switch - is length 10 going to be a single A32 b<cond> instruction ?
>> >
>> > What am I missing ?
>> >
>> >
>>
>> Ah sorry, I had not realised that the "length" related to the number of
>> bytes in the instruction, so I just used it as a variable to then check
>> in the switch().
>> And yes, you are correct in assuming that length 10 would have been the
>> A32 b<cond> version.
>> So the mapping I had in mind was:
>> 2->  Thumb2 b<cond> - narrow 16bit version
>> 4->  Thumb2 b<cond> - wide 32bit version
>> 8->  Thumb2 b       - "far branch".
>> 10-> A32 b<cond>
>>
>> The new version that maintains the "length=number of bytes" would be:
>>
>> 2->  Thumb2 b<cond> - narrow 16bit version
>> 4->  Thumb2 b<cond> - wide 32bit version OR A32 b<cond>
>> 6->  Thumb2 "far branch" made up from one b<cond> to a very close Lbcond
>> label (so 16 bits) and one b for 32 bits. (so 2+4 == 6)
>>
>> I've gone ahead and done this in the new proposed patch. Let me know if
>> it's ok! (also I changed the first check to !TARGET_THUMB2 - this makes
>> it slightly more readable). I'm still not sure about this, so any
>> suggestions are welcome!
>>
>> >
>> >>   )
>> >>
>> >>   (define_insn "*arm_cond_branch_reversed"
>> >> @@ -6978,17 +7005,36 @@
>> >>         arm_ccfsm_state += 2;
>> >>         return \"\";
>> >>       }
>> >> -  return \"b%D1\\t%l0\";
>> >> +     switch (get_attr_length (insn))
>> >> +      {
>> >> +    // Thumb2 16-bit b{cond}
>> >> +    case 2:
>> >> +
>> >> +    // Thumb2 32-bit b{cond}
>> >> +    case 4: return \"b%D1\\t%l0\";break;
>> >> +
>> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>> >> +    case 8: return arm_gen_far_branch \
>> >> +            (operands, 0, \"Lbcond\", \"b%d1\t\");
>> >> +            break;
>> >> +    // A32 b{cond}
>> >> +    default: return \"b%D1\\t%l0\";
>> >> +       }
>> >>     "
>> >>     [(set_attr "conds" "use")
>> >>      (set_attr "type" "branch")
>> >>      (set (attr "length")
>> >> -    (if_then_else
>> >> -       (and (match_test "TARGET_THUMB2")
>> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
>> >> -       (const_int 2)
>> >> -       (const_int 4)))]
>> >> +    (if_then_else (match_test "TARGET_THUMB2")
>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> >> +            (le (minus (match_dup 0) (pc)) (const_int 256)))
>> >> +    (const_int 2)
>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>> >> + (const_int -1048568))
>> >> +            (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>> >> +    (const_int 4)
>> >> +    (const_int 8)))
>> >> +    (const_int 10)))
>> >> +   ]
>> >
>> > Same comments as above apply here too.
>> >
>>
>> Same as above.
>>
>> Thank you for the feedback and apologies for being clueless :)
>>
>> And, of course, let me know of any problems or queries!
>>
>> Cheers,
>> Stam
>>
>> > Ramana
>> >
>> >>   )
>> >>
>> >>
>> >> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
>> >> new file mode 100644
>> >> index 00000000000..176bf61780b
>> >> --- /dev/null
>> >> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>> >> @@ -0,0 +1,102 @@
>> >> +/* { dg-do compile } */
>> >> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>> >> +int printf(const char *, ...);
>> >> +
>> >> +__attribute__((noinline,noclone)) void f1(int a)
>> >> +{
>> >> +    if (a) {
>> >> +#define HW0 printf("Hello World!\n");
>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> >> +            HW0
>> >> +    }
>> >> +}
>> >> +
>> >> +__attribute__((noinline,noclone)) void f2(int a)
>> >> +{
>> >> +    if (a) {
>> >> +#define HW0 printf("Hello World!\n");
>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> >> +            HW3
>> >> +    }
>> >> +}
>> >> +
>> >> +
>> >> +__attribute__((noinline,noclone)) void f3(int a)
>> >> +{
>> >> +    if (a) {
>> >> +#define HW0 printf("Hello World!\n");
>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> >> +            HW5
>> >> +    }
>> >> +}
>> >> +
>> >> +__attribute__((noinline,noclone)) void f4(int a)
>> >> +{
>> >> +    if (a==1) {
>> >> +#define HW0 printf("Hello World!\n");
>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> >> +            HW0
>> >> +    }
>> >> +}
>> >> +
>> >> +__attribute__((noinline,noclone)) void f5(int a)
>> >> +{
>> >> +    if (a==1) {
>> >> +#define HW0 printf("Hello World!\n");
>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> >> +            HW3
>> >> +    }
>> >> +}
>> >> +
>> >> +
>> >> +__attribute__((noinline,noclone)) void f6(int a)
>> >> +{
>> >> +    if (a==1) {
>> >> +#define HW0 printf("Hello World!\n");
>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> >> +            HW5
>> >> +    }
>> >> +}
>> >> +
>> >> +
>> >> +int main(void)
>> >> +{
>> >> +    f1(0);
>> >> +    f2(0);
>> >> +    f3(0);
>> >> +    f4(0);
>> >> +    f5(0);
>> >> +    f6(0);
>> >> +    return 0;
>> >> +}
>> >> +
>> >> +
>> >> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
>> >> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
>> >> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
>> >> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
>> >> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
>> >
>>
> 
> 1.patch
> 
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index f995974f9bb..59ec219da3d 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -570,4 +570,6 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
> 
>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
> 
> +const char * arm_gen_far_branch (rtx *, int, const char *, const char *);
> +
>   #endif /* ! GCC_ARM_PROTOS_H */
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 39e1a1ef9a2..7a69ddb6b7b 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -32139,6 +32139,30 @@ arm_run_selftests (void)
>   }
>   } /* Namespace selftest.  */
> 
> +
> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
> 
> 
> Please document the function parameters in this comment as other functions in 
> this file (try to) do.

Done :)
> 
> 
> +const char *
> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
> +            const char * branch_format)
> +{
> +  rtx_code_label * tmp_label = gen_label_rtx ();
> +  char label_buf[256];
> +  char buffer[128];
> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
> +            CODE_LABEL_NUMBER (tmp_label));
> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
> +  rtx dest_label = operands[pos_label];
> +  operands[pos_label] = tmp_label;
> +
> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
> +  output_asm_insn (buffer, operands);
> +
> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
> +  operands[pos_label] = dest_label;
> +  output_asm_insn (buffer, operands);
> +  return "";
> +}
> +
>   #undef TARGET_RUN_TARGET_SELFTESTS
>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>   #endif /* CHECKING_P */
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index f861c72ccfc..7e5e1489214 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -6686,9 +6686,15 @@
>   ;; And for backward branches we have
>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>   ;;
> +;; In 16-bit Thumb these ranges are:
>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>   ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
> 
> +;; In 32-bit Thumb these ranges are:
> +;; For a 'b'       ± 16MB is not checked for.
> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
> +;; (-1048568 -> 1048576).
> +
>   (define_expand "cbranchsi4"
>     [(set (pc) (if_then_else
>             (match_operator 0 "expandable_comparison_operator"
> @@ -6946,23 +6952,56 @@
>                 (label_ref (match_operand 0 "" ""))
>                 (pc)))]
>     "TARGET_32BIT"
> -  "*
> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> -    {
> -      arm_ccfsm_state += 2;
> -      return \"\";
> -    }
> -  return \"b%d1\\t%l0\";
> -  "
> +  {
> +    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> +    {
> +        arm_ccfsm_state += 2;
> +        return "";
> +    }
> +    switch (get_attr_length (insn))
> +    {
> +        /* Thumb2 16-bit b{cond}.  */
> +        case 2:
> +
> +        /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
> +        case 4: return "b%d1\t%l0";
> +            break;
> +
> +        /* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
> +           unconditional branch b.  */
> +        default: return arm_gen_far_branch \
> +                (operands, 0, "Lbcond", "b%D1\t");
> +    }
> 
> 
> The indentation here is wrong. Please look at how other switch statements are 
> written in the backend for guidance: 2 space indentation, new line after the 
> cases etc.

Done
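
(An aside for anyone reading along, not part of the patch: the default case
above hands off to arm_gen_far_branch, quoted earlier in this mail. A rough
sketch of the text it should end up emitting is below. The helper name
sketch_far_branch, its parameters and the example labels are made up for
illustration, and the exact whitespace GCC prints may differ.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for arm_gen_far_branch: it builds the text of the
   two-instruction sequence in OUT instead of calling output_asm_insn.
   INV_COND is the inverted condition suffix (e.g. "ne" for a beq that is
   out of range), DEST_LABEL is the real destination and LABEL_NO numbers
   the local .Lbcond label.  */
static void
sketch_far_branch (char *out, size_t size, const char *inv_cond,
                   const char *dest_label, int label_no)
{
  assert (out != NULL && size > 0);

  /* A short conditional branch on the inverted condition skips the
     unconditional branch; the local label sits right after it.  */
  snprintf (out, size, "b%s\t.Lbcond%d\n\tb\t%s\n.Lbcond%d:",
            inv_cond, label_no, dest_label, label_no);
}
```

With inv_cond "ne", dest_label ".L2" and label_no 7 this gives a bne to
.Lbcond7, then an unconditional b to .L2, then the .Lbcond7: label, which is
the shape the dg-final patterns in the test look for.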
> 
>   +  }
>     [(set_attr "conds" "use")
>      (set_attr "type" "branch")
>      (set (attr "length")
> -    (if_then_else
> -       (and (match_test "TARGET_THUMB2")
> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
> -       (const_int 2)
> -       (const_int 4)))]
> +    (if_then_else (match_test "!TARGET_THUMB2")
> +
> +    ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
> +    (const_int 4)
> +
> +    ;; Check if target is within 16-bit Thumb2 b{cond} range.
> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> +               (le (minus (match_dup 0) (pc)) (const_int 256)))
> +
> +        ;; Target is Thumb2, within narrow range.
> +        ;; Generate b{cond}.
> +            (const_int 2)
> +
> +        ;; Check if target is within 32-bit Thumb2 b{cond} range.
> +            (if_then_else (and (ge (minus (match_dup 0)
> +                     (pc))(const_int -1048568))
> +                       (le (minus (match_dup 0)
> +                     (pc)) (const_int 1048576)))
> +
> +        ;; Target is Thumb2, within wide range.
> +        ;; Generate b{cond}
> +                        (const_int 4)
> +        ;; Target is Thumb2, out of range.
> +        ;; Generate narrow b{cond} and unconditional branch b.
> +                        (const_int 6)))))
> +  ]
> 
> 
> Likewise on the indentation.

Done, sorry about that!
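
(For reference, the nested if_then_else above boils down to roughly the
following when written as plain C. This is only a sketch:
sketch_cond_branch_length and its arguments are hypothetical names, and
offset stands for (match_dup 0) minus (pc) as the length attribute sees it.)

```c
#include <stdbool.h>

/* Hypothetical restatement of the "length" attribute for the conditional
   branch patterns.  Returns the instruction length in bytes.  */
static int
sketch_cond_branch_length (bool thumb2, long offset)
{
  if (!thumb2)
    return 4;   /* A32 b<cond>.  */

  if (offset >= -250 && offset <= 256)
    return 2;   /* Thumb2 narrow (16-bit) b<cond>.  */

  if (offset >= -1048568 && offset <= 1048576)
    return 4;   /* Thumb2 wide (32-bit) b<cond>.  */

  return 6;     /* Narrow b<cond> over a wide b: 2 + 4 bytes.  */
}
```

The switch in the output code then only needs to tell 2 and 4 apart from
everything else, which is why the far branch lives in the default case.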
> 
>   )
> 
>   (define_insn "*arm_cond_branch_reversed"
> @@ -6972,23 +7011,56 @@
>                 (pc)
>                 (label_ref (match_operand 0 "" ""))))]
>     "TARGET_32BIT"
> -  "*
> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> -    {
> -      arm_ccfsm_state += 2;
> -      return \"\";
> -    }
> -  return \"b%D1\\t%l0\";
> -  "
> +  {
> +    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
> +    {
> +        arm_ccfsm_state += 2;
> +        return "";
> +    }
> +    switch (get_attr_length (insn))
> +    {
> +        /* Thumb2 16-bit b{cond}.  */
> +        case 2:
> +
> +        /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
> +        case 4: return "b%D1\t%l0";
> +            break;
> +
> +        /* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
> +           unconditional branch b.  */
> +        default: return arm_gen_far_branch \
> +                (operands, 0, "Lbcond", "b%d1\t");
> +    }
> 
> 
> 
> 
>   +  }
>     [(set_attr "conds" "use")
>      (set_attr "type" "branch")
>      (set (attr "length")
> -    (if_then_else
> -       (and (match_test "TARGET_THUMB2")
> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
> -       (const_int 2)
> -       (const_int 4)))]
> +    (if_then_else (match_test "!TARGET_THUMB2")
> +
> +    ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
> +    (const_int 4)
> +
> +    ;; Check if target is within 16-bit Thumb2 b{cond} range.
> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
> +               (le (minus (match_dup 0) (pc)) (const_int 256)))
> +
> +        ;; Target is Thumb2, within narrow range.
> +        ;; Generate b{cond}.
> +            (const_int 2)
> +
> +        ;; Check if target is within 32-bit Thumb2 b{cond} range.
> +            (if_then_else (and (ge (minus (match_dup 0)
> +                     (pc))(const_int -1048568))
> +                       (le (minus (match_dup 0)
> +                     (pc)) (const_int 1048576)))
> +
> +        ;; Target is Thumb2, within wide range.
> +        ;; Generate b{cond}.
> +                        (const_int 4)
> +        ;; Target is Thumb2, out of range.
> +        ;; Generate narrow b{cond} and unconditional branch b.
> +                        (const_int 6)))))
> +  ]
>   )
> 
> 
> Otherwise this looks reasonable to me. Ramana, did you have any further comments 
> on the patch?
> Thanks,
> Kyrill
> 
> 
> 
> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
> new file mode 100644
> index 00000000000..176bf61780b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
> @@ -0,0 +1,102 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
> +int printf(const char *, ...);
> +
> +__attribute__((noinline,noclone)) void f1(int a)
> +{
> +    if (a) {
> +#define HW0    printf("Hello World!\n");
> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +        HW0
> +    }
> +}
> +
> +__attribute__((noinline,noclone)) void f2(int a)
> +{
> +    if (a) {
> +#define HW0    printf("Hello World!\n");
> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +        HW3
> +    }
> +}
> +
> +
> +__attribute__((noinline,noclone)) void f3(int a)
> +{
> +    if (a) {
> +#define HW0    printf("Hello World!\n");
> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +        HW5
> +    }
> +}
> +
> +__attribute__((noinline,noclone)) void f4(int a)
> +{
> +    if (a==1) {
> +#define HW0    printf("Hello World!\n");
> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +        HW0
> +    }
> +}
> +
> +__attribute__((noinline,noclone)) void f5(int a)
> +{
> +    if (a==1) {
> +#define HW0    printf("Hello World!\n");
> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +        HW3
> +    }
> +}
> +
> +
> +__attribute__((noinline,noclone)) void f6(int a)
> +{
> +    if (a==1) {
> +#define HW0    printf("Hello World!\n");
> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
> +        HW5
> +    }
> +}
> +
> +
> +int main(void)
> +{
> +    f1(0);
> +    f2(0);
> +    f3(0);
> +    f4(0);
> +    f5(0);
> +    f6(0);
> +    return 0;
> +}
> +
> +
> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
> 
> 
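
(One more note on the numbers in the arm.md comment, in case it helps
review. The backward formula is the one quoted in the patch context,
neg_range - neg_base_offs + pc_offs. The forward formula used below,
pos_range - 2 + 4, is my assumption; it does reproduce the 256, 2048 and
1048576 limits quoted in the comment. The function names are made up for
this sketch.)

```c
#include <assert.h>

/* PC offset added to the basic branch range, as in the arm.md comment.  */
enum { SKETCH_PC_OFFS = 4 };

/* Largest forward offset the length attribute may accept.
   Assumed formula: pos_range - 2 + pc_offs.  */
static long
sketch_max_forward (long pos_range)
{
  return pos_range - 2 + SKETCH_PC_OFFS;
}

/* Smallest backward offset the length attribute may accept.
   Formula from the comment: neg_range - neg_base_offs + pc_offs,
   where neg_base_offs is -2 or -4.  */
static long
sketch_min_backward (long neg_range, long neg_base_offs)
{
  assert (neg_base_offs == -2 || neg_base_offs == -4);
  return neg_range - neg_base_offs + SKETCH_PC_OFFS;
}
```

For the 32-bit b<cond>, sketch_max_forward (1048574) is 1048576 and
sketch_min_backward (-1048576, -4) is -1048568, matching the constants used
in the length attribute.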

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: rb11928.patch --]
[-- Type: text/x-patch; name="rb11928.patch", Size: 8628 bytes --]

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 7c4b1003844a1cdbed008e233753663bc8df3beb..8895becc639057b6394df4d3966b960ace5e97db 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -576,4 +576,6 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
 
 void arm_initialize_isa (sbitmap, const enum isa_feature *);
 
+const char * arm_gen_far_branch (rtx *, int, const char * , const char *);
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 07231d722b978b5c99eb5a27d8ad8ece3d6c80fd..ee5de169f3ebdd5e3381144156f136ac5b3af887 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32626,6 +32626,40 @@ arm_run_selftests (void)
 }
 } /* Namespace selftest.  */
 
+
+/* Generate code to enable conditional branches in functions over 1 MiB.
+   Parameters are:
+     operands: is the operands list of the asm insn (see arm_cond_branch or
+       arm_cond_branch_reversed).
+     pos_label: is an index into the operands array where operands[pos_label] is
+       the asm label of the final jump destination.
+     dest: is a string which is used to generate the asm label of the intermediate
+       destination.
+     branch_format: is a string denoting the intermediate branch format, e.g.
+       "beq", "bne", etc.  */
+
+const char *
+arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
+		    const char * branch_format)
+{
+  rtx_code_label * tmp_label = gen_label_rtx ();
+  char label_buf[256];
+  char buffer[128];
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
+			CODE_LABEL_NUMBER (tmp_label));
+  const char *label_ptr = arm_strip_name_encoding (label_buf);
+  rtx dest_label = operands[pos_label];
+  operands[pos_label] = tmp_label;
+
+  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
+  output_asm_insn (buffer, operands);
+
+  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
+  operands[pos_label] = dest_label;
+  output_asm_insn (buffer, operands);
+  return "";
+}
+
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
 #endif /* CHECKING_P */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f89a2d412df8afe621241958b29a8a7d58dce284..fb1d4547e5c990cc719e2cac5bfe5c2ce8fcb2a3 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -7546,9 +7546,15 @@
 ;; And for backward branches we have 
 ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
 ;;
+;; In 16-bit Thumb these ranges are:
 ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
 ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
 
+;; In 32-bit Thumb these ranges are:
+;; For a 'b'       +/- 16MB is not checked for.
+;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
+;; (-1048568 -> 1048576).
+
 (define_expand "cbranchsi4"
   [(set (pc) (if_then_else
 	      (match_operator 0 "expandable_comparison_operator"
@@ -7721,23 +7727,50 @@
 		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+  {
+    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
     {
       arm_ccfsm_state += 2;
-      return \"\";
+      return "";
     }
-  return \"b%d1\\t%l0\";
-  "
+    switch (get_attr_length (insn))
+      {
+	case 2: /* Thumb2 16-bit b{cond}.  */
+	case 4: /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+	  return "b%d1\t%l0";
+	  break;
+
+	/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+	   unconditional branch b.  */
+	default: return arm_gen_far_branch (operands, 0, "Lbcond", "b%D1\t");
+      }
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+      ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+      (const_int 4)
+
+      ;; Check if target is within 16-bit Thumb2 b{cond} range.
+      (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+		         (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+	;; Target is Thumb2, within narrow range.
+	;; Generate b{cond}.
+	(const_int 2)
+
+	;; Check if target is within 32-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc))(const_int -1048568))
+			   (le (minus (match_dup 0) (pc)) (const_int 1048576)))
+
+	  ;; Target is Thumb2, within wide range.
+	  ;; Generate b{cond}
+	  (const_int 4)
+	  ;; Target is Thumb2, out of range.
+	  ;; Generate narrow b{cond} and unconditional branch b.
+	  (const_int 6)))))]
 )
 
 (define_insn "*arm_cond_branch_reversed"
@@ -7747,23 +7780,50 @@
 		      (pc)
 		      (label_ref (match_operand 0 "" ""))))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+  {
+    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
     {
       arm_ccfsm_state += 2;
-      return \"\";
+      return "";
     }
-  return \"b%D1\\t%l0\";
-  "
+    switch (get_attr_length (insn))
+      {
+	case 2: /* Thumb2 16-bit b{cond}.  */
+	case 4: /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+	  return "b%D1\t%l0";
+	  break;
+
+	/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+	   unconditional branch b.  */
+	default: return arm_gen_far_branch (operands, 0, "Lbcond", "b%d1\t");
+      }
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+      ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+      (const_int 4)
+
+      ;; Check if target is within 16-bit Thumb2 b{cond} range.
+      (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			 (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+	;; Target is Thumb2, within narrow range.
+	;; Generate b{cond}.
+	(const_int 2)
+
+	;; Check if target is within 32-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc))(const_int -1048568))
+			   (le (minus (match_dup 0) (pc)) (const_int 1048576)))
+
+	  ;; Target is Thumb2, within wide range.
+	  ;; Generate b{cond}.
+	  (const_int 4)
+	  ;; Target is Thumb2, out of range.
+	  ;; Generate narrow b{cond} and unconditional branch b.
+	  (const_int 6)))))]
 )
 
 \f
diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 0000000000000000000000000000000000000000..757c897e9c0db32709227b3fdf1b4a8033428232
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
+int printf(const char *, ...);
+
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+
+__attribute__((noinline,noclone)) void f1 (int a)
+{
+  if (a) { HW0 }
+}
+
+__attribute__((noinline,noclone)) void f2 (int a)
+{
+  if (a) { HW3 }
+}
+
+
+__attribute__((noinline,noclone)) void f3 (int a)
+{
+  if (a) { HW5 }
+}
+
+__attribute__((noinline,noclone)) void f4 (int a)
+{
+  if (a == 1) { HW0 }
+}
+
+__attribute__((noinline,noclone)) void f5 (int a)
+{
+  if (a == 1) { HW3 }
+}
+
+
+__attribute__((noinline,noclone)) void f6 (int a)
+{
+  if (a == 1) { HW5 }
+}
+
+
+int main(void)
+{
+	f1(0);
+	f2(0);
+	f3(0);
+	f4(0);
+	f5(0);
+	f6(0);
+	return 0;
+}
+
+
+/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */


* Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2020-01-08 15:19         ` Stam Markianos-Wright
@ 2020-01-16 16:30           ` Stam Markianos-Wright
  2020-01-27 16:28             ` [PINGx2][PATCH][GCC][ARM] " Stam Markianos-Wright
  2020-01-28 11:05           ` [PING][PATCH][GCC][ARM] " Kyrill Tkachov
  1 sibling, 1 reply; 15+ messages in thread
From: Stam Markianos-Wright @ 2020-01-16 16:30 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw



On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
> 
> 
> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>> Hi Stam,
>>
>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>>> Pinging with more correct maintainers this time :)
>>>
>>> Also would need to backport to gcc7,8,9, but need to get this approved
>>> first!
>>>
>>
>> Sorry for the delay.
> 
> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
> 
> I've made the changes marked below and also removed the unnecessary extra
> #defines from the test.

Ping :)

Cheers,
Stam

> 
>>
>>
>>> Thank you,
>>> Stam
>>>
>>>
>>> -------- Forwarded Message --------
>>> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional
>>> branches in Thumb2 (PR91816)
>>> Date: Mon, 21 Oct 2019 10:37:09 +0100
>>> From: Stam Markianos-Wright <stam.markianos-wright@arm.com>
>>> To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
>>> CC: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,
>>> James Greenhalgh <James.Greenhalgh@arm.com>, Richard Earnshaw
>>> <Richard.Earnshaw@arm.com>
>>>
>>>
>>>
>>> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>>> >>
>>> >> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>>> >> however, on my native Aarch32 setup the test times out when run as part
>>> >> of a big "make check-gcc" regression, but not when run individually.
>>> >>
>>> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>> >>
>>> >>       * config/arm/arm.md: Update b<cond> for Thumb2 range checks.
>>> >>       * config/arm/arm.c: New function arm_gen_far_branch.
>>> >>       * config/arm/arm-protos.h: New function arm_gen_far_branch
>>> >>       prototype.
>>> >>
>>> >> gcc/testsuite/ChangeLog:
>>> >>
>>> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>> >>
>>> >>       * testsuite/gcc.target/arm/pr91816.c: New test.
>>> >
>>> >> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>>> >> index f995974f9bb..1dce333d1c3 100644
>>> >> --- a/gcc/config/arm/arm-protos.h
>>> >> +++ b/gcc/config/arm/arm-protos.h
>>> >> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
>>> >>
>>> >>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>> >>
>>> >> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>>> >> +
>>> >> +
>>> >
>>> > Lets get the nits out of the way.
>>> >
>>> > Unnecessary extra new line, need a space between int and const above.
>>> >
>>> >
>>>
>>> .Fixed!
>>>
>>> >>   #endif /* ! GCC_ARM_PROTOS_H */
>>> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> >> index 39e1a1ef9a2..1a693d2ddca 100644
>>> >> --- a/gcc/config/arm/arm.c
>>> >> +++ b/gcc/config/arm/arm.c
>>> >> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>>> >>   }
>>> >>   } /* Namespace selftest.  */
>>> >>
>>> >> +
>>> >> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>>> >> +const char *
>>> >> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>>> >> +                    const char * branch_format)
>>> >
>>> > Not sure if this is some munging from the attachment but check
>>> > vertical alignment of parameters.
>>> >
>>>
>>> .Fixed!
>>>
>>> >> +{
>>> >> +  rtx_code_label * tmp_label = gen_label_rtx ();
>>> >> +  char label_buf[256];
>>> >> +  char buffer[128];
>>> >> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>>> >> +                    CODE_LABEL_NUMBER (tmp_label));
>>> >> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>>> >> +  rtx dest_label = operands[pos_label];
>>> >> +  operands[pos_label] = tmp_label;
>>> >> +
>>> >> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>>> >> +  output_asm_insn (buffer, operands);
>>> >> +
>>> >> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>>> >> +  operands[pos_label] = dest_label;
>>> >> +  output_asm_insn (buffer, operands);
>>> >> +  return "";
>>> >> +}
>>> >> +
>>> >> +
>>> >
>>> > Unnecessary extra newline.
>>> >
>>>
>>> .Fixed!
>>>
>>> >>   #undef TARGET_RUN_TARGET_SELFTESTS
>>> >>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>> >>   #endif /* CHECKING_P */
>>> >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>>> >> index f861c72ccfc..634fd0a59da 100644
>>> >> --- a/gcc/config/arm/arm.md
>>> >> +++ b/gcc/config/arm/arm.md
>>> >> @@ -6686,9 +6686,16 @@
>>> >>   ;; And for backward branches we have
>>> >>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>> >>   ;;
>>> >> +;; In 16-bit Thumb these ranges are:
>>> >>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>>> >>   ;; For a 'b<cond>' pos_range = 254, neg_range = -256  giving (-250 ->256).
>>> >>
>>> >> +;; In 32-bit Thumb these ranges are:
>>> >> +;; For a 'b'       +/- 16MB is not checked for.
>>> >> +;; For a 'b<cond>' pos_range = 1048574, neg_range = -1048576  giving
>>> >> +;; (-1048568 -> 1048576).
>>> >> +
>>> >> +
>>> >
>>> > Unnecessary extra newline.
>>> >
>>>
>>> .Fixed!
>>>
>>> >>   (define_expand "cbranchsi4"
>>> >>     [(set (pc) (if_then_else
>>> >>             (match_operator 0 "expandable_comparison_operator"
>>> >> @@ -6947,22 +6954,42 @@
>>> >>                     (pc)))]
>>> >>     "TARGET_32BIT"
>>> >>     "*
>>> >> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> >> -    {
>>> >> -      arm_ccfsm_state += 2;
>>> >> -      return \"\";
>>> >> -    }
>>> >> -  return \"b%d1\\t%l0\";
>>> >> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> >> +      {
>>> >> +    arm_ccfsm_state += 2;
>>> >> +    return \"\";
>>> >> +      }
>>> >> +     switch (get_attr_length (insn))
>>> >> +      {
>>> >> +    // Thumb2 16-bit b{cond}
>>> >> +    case 2:
>>> >> +
>>> >> +    // Thumb2 32-bit b{cond}
>>> >> +    case 4: return \"b%d1\\t%l0\";break;
>>> >> +
>>> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>> >> +    case 8: return arm_gen_far_branch \
>>> >> +            (operands, 0, \"Lbcond\", \"b%D1\t\");
>>> >> +    break;
>>> >> +
>>> >> +    // A32 b{cond}
>>> >> +    default: return \"b%d1\\t%l0\";
>>> >> +      }
>>> >
>>> > Please fix indentation here.
>>> >
>>>
>>> .Fixed together with below changes.
>>>
>>> >>     "
>>> >>     [(set_attr "conds" "use")
>>> >>      (set_attr "type" "branch")
>>> >>      (set (attr "length")
>>> >> -    (if_then_else
>>> >> -       (and (match_test "TARGET_THUMB2")
>>> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
>>> >> -       (const_int 2)
>>> >> -       (const_int 4)))]
>>> >> +    (if_then_else (match_test "TARGET_THUMB2")
>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> >> +    (le (minus (match_dup 0) (pc)) (const_int 256)))
>>> >> +    (const_int 2)
>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>> >> + (const_int -1048568))
>>> >> +                    (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>> >> +    (const_int 4)
>>> >> +    (const_int 8)))
>>> >> +    (const_int 10)))
>>> >> +   ]
>>> >
>>> > This conditional is unreadable and is getting quite complex.
>>> >
>>> > Please fix the indentation and add some comments to indicate when
>>> > this is 2, 4, 8, 10 above the pattern and ask for the comment to
>>> > be in sync with this.
>>> >
>>> > How did we end up with length 10 ? That indicates 2 4 byte instructions
>>> > and a 2 byte instruction ? You are handling lengths 2, 4, 8 above in
>>> > the switch - is length 10 going to be a single A32 b<cond> instruction ?
>>> >
>>> > What am I missing ?
>>> >
>>> >
>>>
>>> Ah sorry, I had not realised that the "length" related to the number of
>>> bytes in the instruction, so I just used it as a variable to then check
>>> in the switch().
>>> And yes, you are correct in assuming that length 10 would have been the
>>> A32 b<cond> version.
>>> So the mapping I had in mind was:
>>> 2->  Thumb2 b<cond> - narrow 16bit version
>>> 4->  Thumb2 b<cond> - wide 32bit version
>>> 8->  Thumb2 b       - "far branch".
>>> 10-> A32 b<cond>
>>>
>>> The new version that maintains the "length=number of bytes" would be:
>>>
>>> 2->  Thumb2 b<cond> - narrow 16bit version
>>> 4->  Thumb2 b<cond> - wide 32bit version OR A32 b<cond>
>>> 6->  Thumb2 "far branch", made up of one narrow b<cond> to a very close
>>> Lbcond label (16 bits) and one wide b (32 bits), so 2+4 == 6 bytes.
>>>
>>> I've gone ahead and done this in the new proposed patch. Let me know if
>>> it's ok! (also I changed the first check to !TARGET_THUMB2 - this makes
>>> it slightly more readable). I'm still not sure about this, so any
>>> suggestions are welcome!
>>>
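A minimal sketch of the byte-count mapping described above, written as plain C rather than as the machine-description attribute. The function name cond_branch_length and its parameters are hypothetical and do not appear in the patch; offset stands in for (minus (match_dup 0) (pc)), and the thresholds are the ones used in the proposed arm.md change.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the patch: mirrors the nested
   if_then_else of the "length" attribute.  OFFSET plays the role of
   (minus (match_dup 0) (pc)); the result is a size in bytes.  */
int
cond_branch_length (bool thumb2, long offset)
{
  /* Not Thumb2, therefore A32: a single 4-byte b<cond>.  */
  if (!thumb2)
    return 4;

  /* Within the narrow (16-bit) Thumb2 b<cond> range.  */
  if (offset >= -250 && offset <= 256)
    return 2;

  /* Within the wide (32-bit) Thumb2 b<cond> range.  */
  if (offset >= -1048568 && offset <= 1048576)
    return 4;

  /* Out of range: narrow b<cond> (2 bytes) plus wide b (4 bytes).  */
  return 6;
}
```

Under these assumptions, cond_branch_length (true, 257) gives 4 and cond_branch_length (true, 1048577) gives 6, so every returned value is a real instruction size in bytes.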
>>> >
>>> >>   )
>>> >>
>>> >>   (define_insn "*arm_cond_branch_reversed"
>>> >> @@ -6978,17 +7005,36 @@
>>> >>         arm_ccfsm_state += 2;
>>> >>         return \"\";
>>> >>       }
>>> >> -  return \"b%D1\\t%l0\";
>>> >> +     switch (get_attr_length (insn))
>>> >> +      {
>>> >> +    // Thumb2 16-bit b{cond}
>>> >> +    case 2:
>>> >> +
>>> >> +    // Thumb2 32-bit b{cond}
>>> >> +    case 4: return \"b%D1\\t%l0\";break;
>>> >> +
>>> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>> >> +    case 8: return arm_gen_far_branch \
>>> >> +            (operands, 0, \"Lbcond\", \"b%d1\t\");
>>> >> +            break;
>>> >> +    // A32 b{cond}
>>> >> +    default: return \"b%D1\\t%l0\";
>>> >> +       }
>>> >>     "
>>> >>     [(set_attr "conds" "use")
>>> >>      (set_attr "type" "branch")
>>> >>      (set (attr "length")
>>> >> -    (if_then_else
>>> >> -       (and (match_test "TARGET_THUMB2")
>>> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
>>> >> -       (const_int 2)
>>> >> -       (const_int 4)))]
>>> >> +    (if_then_else (match_test "TARGET_THUMB2")
>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> >> +            (le (minus (match_dup 0) (pc)) (const_int 256)))
>>> >> +    (const_int 2)
>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>> >> + (const_int -1048568))
>>> >> +            (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>> >> +    (const_int 4)
>>> >> +    (const_int 8)))
>>> >> +    (const_int 10)))
>>> >> +   ]
>>> >
>>> > Same comments as above apply here too.
>>> >
>>>
>>> Same as above.
>>>
>>> Thank you for the feedback and apologies for being clueless :)
>>>
>>> And, of course, let me know of any problems or queries!
>>>
>>> Cheers,
>>> Stam
>>>
>>> > Ramana
>>> >
>>> >>   )
>>> >>
>>> >>
>>> >> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c 
>>> b/gcc/testsuite/gcc.target/arm/pr91816.c
>>> >> new file mode 100644
>>> >> index 00000000000..176bf61780b
>>> >> --- /dev/null
>>> >> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>>> >> @@ -0,0 +1,102 @@
>>> >> +/* { dg-do compile } */
>>> >> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>>> >> +int printf(const char *, ...);
>>> >> +
>>> >> +__attribute__((noinline,noclone)) void f1(int a)
>>> >> +{
>>> >> +    if (a) {
>>> >> +#define HW0 printf("Hello World!\n");
>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> >> +            HW0
>>> >> +    }
>>> >> +}
>>> >> +
>>> >> +__attribute__((noinline,noclone)) void f2(int a)
>>> >> +{
>>> >> +    if (a) {
>>> >> +#define HW0 printf("Hello World!\n");
>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> >> +            HW3
>>> >> +    }
>>> >> +}
>>> >> +
>>> >> +
>>> >> +__attribute__((noinline,noclone)) void f3(int a)
>>> >> +{
>>> >> +    if (a) {
>>> >> +#define HW0 printf("Hello World!\n");
>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> >> +            HW5
>>> >> +    }
>>> >> +}
>>> >> +
>>> >> +__attribute__((noinline,noclone)) void f4(int a)
>>> >> +{
>>> >> +    if (a==1) {
>>> >> +#define HW0 printf("Hello World!\n");
>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> >> +            HW0
>>> >> +    }
>>> >> +}
>>> >> +
>>> >> +__attribute__((noinline,noclone)) void f5(int a)
>>> >> +{
>>> >> +    if (a==1) {
>>> >> +#define HW0 printf("Hello World!\n");
>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> >> +            HW3
>>> >> +    }
>>> >> +}
>>> >> +
>>> >> +
>>> >> +__attribute__((noinline,noclone)) void f6(int a)
>>> >> +{
>>> >> +    if (a==1) {
>>> >> +#define HW0 printf("Hello World!\n");
>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> >> +            HW5
>>> >> +    }
>>> >> +}
>>> >> +
>>> >> +
>>> >> +int main(void)
>>> >> +{
>>> >> +    f1(0);
>>> >> +    f2(0);
>>> >> +    f3(0);
>>> >> +    f4(0);
>>> >> +    f5(0);
>>> >> +    f6(0);
>>> >> +    return 0;
>>> >> +}
>>> >> +
>>> >> +
>>> >> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
>>> >> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
>>> >> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
>>> >> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
>>> >> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
>>> >
>>>
>>
>> 1.patch
>>
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index f995974f9bb..59ec219da3d 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -570,4 +570,6 @@ void arm_parse_option_features (sbitmap, const 
>> cpu_arch_option *,
>>
>>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>
>> +const char * arm_gen_far_branch (rtx *, int, const char *, const char *);
>> +
>>   #endif /* ! GCC_ARM_PROTOS_H */
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 39e1a1ef9a2..7a69ddb6b7b 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -32139,6 +32139,30 @@ arm_run_selftests (void)
>>   }
>>   } /* Namespace selftest.  */
>>
>> +
>> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>>
>>
>> Please document the function parameters in this comment as other functions in 
>> this file (try to) do.
> 
> Done :)
>>
>>
>> +const char *
>> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>> +            const char * branch_format)
>> +{
>> +  rtx_code_label * tmp_label = gen_label_rtx ();
>> +  char label_buf[256];
>> +  char buffer[128];
>> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>> +            CODE_LABEL_NUMBER (tmp_label));
>> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>> +  rtx dest_label = operands[pos_label];
>> +  operands[pos_label] = tmp_label;
>> +
>> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>> +  output_asm_insn (buffer, operands);
>> +
>> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>> +  operands[pos_label] = dest_label;
>> +  output_asm_insn (buffer, operands);
>> +  return "";
>> +}
>> +
>>   #undef TARGET_RUN_TARGET_SELFTESTS
>>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>   #endif /* CHECKING_P */
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index f861c72ccfc..7e5e1489214 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -6686,9 +6686,15 @@
>>   ;; And for backward branches we have
>>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>   ;;
>> +;; In 16-bit Thumb these ranges are:
>>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>>   ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
>>
>> +;; In 32-bit Thumb these ranges are:
>> +;; For a 'b'       ± 16MB is not checked for.
>> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
>> +;; (-1048568 -> 1048576).
>> +
>>   (define_expand "cbranchsi4"
>>     [(set (pc) (if_then_else
>>             (match_operator 0 "expandable_comparison_operator"
>> @@ -6946,23 +6952,56 @@
>>                 (label_ref (match_operand 0 "" ""))
>>                 (pc)))]
>>     "TARGET_32BIT"
>> -  "*
>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> -    {
>> -      arm_ccfsm_state += 2;
>> -      return \"\";
>> -    }
>> -  return \"b%d1\\t%l0\";
>> -  "
>> +  {
>> +    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> +    {
>> +        arm_ccfsm_state += 2;
>> +        return "";
>> +    }
>> +    switch (get_attr_length (insn))
>> +    {
>> +        /* Thumb2 16-bit b{cond}.  */
>> +        case 2:
>> +
>> +        /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
>> +        case 4: return "b%d1\t%l0";
>> +            break;
>> +
>> +        /* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
>> +           unconditional branch b.  */
>> +        default: return arm_gen_far_branch \
>> +                (operands, 0, "Lbcond", "b%D1\t");
>> +    }
>>
>>
>> The indentation here is wrong. Please look at how other switch statements are 
>> written in the backend for guidance: 2 space indentation, new line after the 
>> cases etc.
> 
> Done
>>
>>   +  }
>>     [(set_attr "conds" "use")
>>      (set_attr "type" "branch")
>>      (set (attr "length")
>> -    (if_then_else
>> -       (and (match_test "TARGET_THUMB2")
>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>> -       (const_int 2)
>> -       (const_int 4)))]
>> +    (if_then_else (match_test "!TARGET_THUMB2")
>> +
>> +    ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
>> +    (const_int 4)
>> +
>> +    ;; Check if target is within 16-bit Thumb2 b{cond} range.
>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> +               (le (minus (match_dup 0) (pc)) (const_int 256)))
>> +
>> +        ;; Target is Thumb2, within narrow range.
>> +        ;; Generate b{cond}.
>> +            (const_int 2)
>> +
>> +        ;; Check if target is within 32-bit Thumb2 b{cond} range.
>> +            (if_then_else (and (ge (minus (match_dup 0)
>> +                     (pc))(const_int -1048568))
>> +                       (le (minus (match_dup 0)
>> +                     (pc)) (const_int 1048576)))
>> +
>> +        ;; Target is Thumb2, within wide range.
>> +        ;; Generate b{cond}
>> +                        (const_int 4)
>> +        ;; Target is Thumb2, out of range.
>> +        ;; Generate narrow b{cond} and unconditional branch b.
>> +                        (const_int 6)))))
>> +  ]
>>
>>
>> Likewise on the indentation.
> 
> Done, sorry about that!
>>
>>   )
>>
>>   (define_insn "*arm_cond_branch_reversed"
>> @@ -6972,23 +7011,56 @@
>>                 (pc)
>>                 (label_ref (match_operand 0 "" ""))))]
>>     "TARGET_32BIT"
>> -  "*
>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> -    {
>> -      arm_ccfsm_state += 2;
>> -      return \"\";
>> -    }
>> -  return \"b%D1\\t%l0\";
>> -  "
>> +  {
>> +    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>> +    {
>> +        arm_ccfsm_state += 2;
>> +        return "";
>> +    }
>> +    switch (get_attr_length (insn))
>> +    {
>> +        /* Thumb2 16-bit b{cond}.  */
>> +        case 2:
>> +
>> +        /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
>> +        case 4: return "b%D1\t%l0";
>> +            break;
>> +
>> +        /* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
>> +           unconditional branch b.  */
>> +        default: return arm_gen_far_branch \
>> +                (operands, 0, "Lbcond", "b%d1\t");
>> +    }
>>
>>
>>
>>
>>   +  }
>>     [(set_attr "conds" "use")
>>      (set_attr "type" "branch")
>>      (set (attr "length")
>> -    (if_then_else
>> -       (and (match_test "TARGET_THUMB2")
>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>> -       (const_int 2)
>> -       (const_int 4)))]
>> +    (if_then_else (match_test "!TARGET_THUMB2")
>> +
>> +    ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
>> +    (const_int 4)
>> +
>> +    ;; Check if target is within 16-bit Thumb2 b{cond} range.
>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>> +               (le (minus (match_dup 0) (pc)) (const_int 256)))
>> +
>> +        ;; Target is Thumb2, within narrow range.
>> +        ;; Generate b{cond}.
>> +            (const_int 2)
>> +
>> +        ;; Check if target is within 32-bit Thumb2 b{cond} range.
>> +            (if_then_else (and (ge (minus (match_dup 0)
>> +                     (pc))(const_int -1048568))
>> +                       (le (minus (match_dup 0)
>> +                     (pc)) (const_int 1048576)))
>> +
>> +        ;; Target is Thumb2, within wide range.
>> +        ;; Generate b{cond}.
>> +                        (const_int 4)
>> +        ;; Target is Thumb2, out of range.
>> +        ;; Generate narrow b{cond} and unconditional branch b.
>> +                        (const_int 6)))))
>> +  ]
>>   )
>>
>>
>> Otherwise this looks reasonable to me. Ramana, did you have any further 
>> comments on the patch?
>> Thanks,
>> Kyrill
>>
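To picture what the out-of-range case produces, here is a rough standalone sketch that formats the same three-line sequence: an inverted b<cond> over a nearby .Lbcond label, an unconditional b to the real destination, then the label itself. format_far_branch and its parameters are hypothetical and are not the patch's arm_gen_far_branch; the label numbers and whitespace are illustrative only.

```c
#include <stdio.h>

/* Hypothetical illustration only, not the patch's arm_gen_far_branch.
   Writes the far-branch sequence into OUT:
     b<inv_cond>  .Lbcond<N>   ; skip the long jump when the original
                               ; condition is false
     b            .L<M>        ; unconditional branch to the real target
   .Lbcond<N>:
   INV_COND is the inverse of the original condition, e.g. "ne" for an
   original beq.  Returns what snprintf returns.  */
int
format_far_branch (char *out, size_t size, const char *inv_cond,
		   int skip_label_no, int dest_label_no)
{
  return snprintf (out, size, "b%s\t.Lbcond%d\n\tb\t.L%d\n.Lbcond%d:\n",
		   inv_cond, skip_label_no, dest_label_no, skip_label_no);
}
```

For an original beq to .L5, calling format_far_branch (buf, sizeof buf, "ne", 7, 5) writes a bne to .Lbcond7, then b .L5, then the .Lbcond7 label. This matches the shape the dg-final scan-assembler-times patterns look for.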
>>
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c 
>> b/gcc/testsuite/gcc.target/arm/pr91816.c
>> new file mode 100644
>> index 00000000000..176bf61780b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>> @@ -0,0 +1,102 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>> +int printf(const char *, ...);
>> +
>> +__attribute__((noinline,noclone)) void f1(int a)
>> +{
>> +    if (a) {
>> +#define HW0    printf("Hello World!\n");
>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +        HW0
>> +    }
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f2(int a)
>> +{
>> +    if (a) {
>> +#define HW0    printf("Hello World!\n");
>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +        HW3
>> +    }
>> +}
>> +
>> +
>> +__attribute__((noinline,noclone)) void f3(int a)
>> +{
>> +    if (a) {
>> +#define HW0    printf("Hello World!\n");
>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +        HW5
>> +    }
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f4(int a)
>> +{
>> +    if (a==1) {
>> +#define HW0    printf("Hello World!\n");
>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +        HW0
>> +    }
>> +}
>> +
>> +__attribute__((noinline,noclone)) void f5(int a)
>> +{
>> +    if (a==1) {
>> +#define HW0    printf("Hello World!\n");
>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +        HW3
>> +    }
>> +}
>> +
>> +
>> +__attribute__((noinline,noclone)) void f6(int a)
>> +{
>> +    if (a==1) {
>> +#define HW0    printf("Hello World!\n");
>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>> +        HW5
>> +    }
>> +}
>> +
>> +
>> +int main(void)
>> +{
>> +    f1(0);
>> +    f2(0);
>> +    f3(0);
>> +    f4(0);
>> +    f5(0);
>> +    f6(0);
>> +    return 0;
>> +}
>> +
>> +
>> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
>> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
>> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
>> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
>> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
>>
>>


* [PINGx2][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2020-01-16 16:30           ` Stam Markianos-Wright
@ 2020-01-27 16:28             ` Stam Markianos-Wright
  0 siblings, 0 replies; 15+ messages in thread
From: Stam Markianos-Wright @ 2020-01-27 16:28 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw



On 1/16/20 4:06 PM, Stam Markianos-Wright wrote:
> 
> 
> On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
>>
>>
>> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>>> Hi Stam,
>>>
>>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>>>> Pinging with the correct maintainers this time :)
>>>>
>>>> This would also need backporting to GCC 7, 8 and 9, but it needs to be
>>>> approved first!
>>>>
>>>
>>> Sorry for the delay.
>>
>> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
>>
>> I've made the changes marked below and also removed the unnecessary extra
>> #defines from the test.
> 
> Ping :)
> 
> Cheers,
> Stam
> 
>>
>>>
>>>
>>>> Thank you,
>>>> Stam
>>>>
>>>>
>>>> -------- Forwarded Message --------
>>>> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional
>>>> branches in Thumb2 (PR91816)
>>>> Date: Mon, 21 Oct 2019 10:37:09 +0100
>>>> From: Stam Markianos-Wright <stam.markianos-wright@arm.com>
>>>> To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
>>>> CC: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,
>>>> James Greenhalgh <James.Greenhalgh@arm.com>, Richard Earnshaw
>>>> <Richard.Earnshaw@arm.com>
>>>>
>>>>
>>>>
>>>> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>>>> >>
>>>> >> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>>>> >> however, on my native Aarch32 setup the test times out when run as part
>>>> >> of a big "make check-gcc" regression, but not when run individually.
>>>> >>
>>>> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>>> >>
>>>> >>       * config/arm/arm.md: Update b<cond> for Thumb2 range checks.
>>>> >>       * config/arm/arm.c: New function arm_gen_far_branch.
>>>> >>       * config/arm/arm-protos.h: New function arm_gen_far_branch
>>>> >>       prototype.
>>>> >>
>>>> >> gcc/testsuite/ChangeLog:
>>>> >>
>>>> >> 2019-10-11  Stamatis Markianos-Wright <stam.markianos-wright@arm.com>
>>>> >>
>>>> >>       * testsuite/gcc.target/arm/pr91816.c: New test.
>>>> >
>>>> >> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>>>> >> index f995974f9bb..1dce333d1c3 100644
>>>> >> --- a/gcc/config/arm/arm-protos.h
>>>> >> +++ b/gcc/config/arm/arm-protos.h
>>>> >> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const 
>>>> cpu_arch_option *,
>>>> >>
>>>> >>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>>> >>
>>>> >> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>>>> >> +
>>>> >> +
>>>> >
>>>> > Let's get the nits out of the way.
>>>> >
>>>> > Unnecessary extra new line, need a space between int and const above.
>>>> >
>>>> >
>>>>
>>>> Fixed!
>>>>
>>>> >>   #endif /* ! GCC_ARM_PROTOS_H */
>>>> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>>> >> index 39e1a1ef9a2..1a693d2ddca 100644
>>>> >> --- a/gcc/config/arm/arm.c
>>>> >> +++ b/gcc/config/arm/arm.c
>>>> >> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>>>> >>   }
>>>> >>   } /* Namespace selftest.  */
>>>> >>
>>>> >> +
>>>> >> +/* Generate code to enable conditional branches in functions over 1 
>>>> MiB.  */
>>>> >> +const char *
>>>> >> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>>>> >> +                    const char * branch_format)
>>>> >
>>>> > Not sure if this is some munging from the attachment but check
>>>> > vertical alignment of parameters.
>>>> >
>>>>
>>>> Fixed!
>>>>
>>>> >> +{
>>>> >> +  rtx_code_label * tmp_label = gen_label_rtx ();
>>>> >> +  char label_buf[256];
>>>> >> +  char buffer[128];
>>>> >> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>>>> >> +                    CODE_LABEL_NUMBER (tmp_label));
>>>> >> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>>>> >> +  rtx dest_label = operands[pos_label];
>>>> >> +  operands[pos_label] = tmp_label;
>>>> >> +
>>>> >> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>>>> >> +  output_asm_insn (buffer, operands);
>>>> >> +
>>>> >> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, 
>>>> label_ptr);
>>>> >> +  operands[pos_label] = dest_label;
>>>> >> +  output_asm_insn (buffer, operands);
>>>> >> +  return "";
>>>> >> +}
>>>> >> +
>>>> >> +
>>>> >
>>>> > Unnecessary extra newline.
>>>> >
>>>>
>>>> Fixed!
>>>>
>>>> >>   #undef TARGET_RUN_TARGET_SELFTESTS
>>>> >>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>>> >>   #endif /* CHECKING_P */
>>>> >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>>>> >> index f861c72ccfc..634fd0a59da 100644
>>>> >> --- a/gcc/config/arm/arm.md
>>>> >> +++ b/gcc/config/arm/arm.md
>>>> >> @@ -6686,9 +6686,16 @@
>>>> >>   ;; And for backward branches we have
>>>> >>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>>> >>   ;;
>>>> >> +;; In 16-bit Thumb these ranges are:
>>>> >>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving 
>>>> (-2040->2048).
>>>> >>   ;; For a 'b<cond>' pos_range = 254, neg_range = -256  giving (-250 ->256).
>>>> >>
>>>> >> +;; In 32-bit Thumb these ranges are:
>>>> >> +;; For a 'b'       +/- 16MB is not checked for.
>>>> >> +;; For a 'b<cond>' pos_range = 1048574, neg_range = -1048576  giving
>>>> >> +;; (-1048568 -> 1048576).
>>>> >> +
>>>> >> +
>>>> >
>>>> > Unnecessary extra newline.
>>>> >
>>>>
>>>> Fixed!
>>>>
>>>> >>   (define_expand "cbranchsi4"
>>>> >>     [(set (pc) (if_then_else
>>>> >>             (match_operator 0 "expandable_comparison_operator"
>>>> >> @@ -6947,22 +6954,42 @@
>>>> >>                     (pc)))]
>>>> >>     "TARGET_32BIT"
>>>> >>     "*
>>>> >> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>>> >> -    {
>>>> >> -      arm_ccfsm_state += 2;
>>>> >> -      return \"\";
>>>> >> -    }
>>>> >> -  return \"b%d1\\t%l0\";
>>>> >> +     if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>>> >> +      {
>>>> >> +    arm_ccfsm_state += 2;
>>>> >> +    return \"\";
>>>> >> +      }
>>>> >> +     switch (get_attr_length (insn))
>>>> >> +      {
>>>> >> +    // Thumb2 16-bit b{cond}
>>>> >> +    case 2:
>>>> >> +
>>>> >> +    // Thumb2 32-bit b{cond}
>>>> >> +    case 4: return \"b%d1\\t%l0\";break;
>>>> >> +
>>>> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>>> >> +    case 8: return arm_gen_far_branch \
>>>> >> +            (operands, 0, \"Lbcond\", \"b%D1\t\");
>>>> >> +    break;
>>>> >> +
>>>> >> +    // A32 b{cond}
>>>> >> +    default: return \"b%d1\\t%l0\";
>>>> >> +      }
>>>> >
>>>> > Please fix indentation here.
>>>> >
>>>>
>>>> Fixed together with the changes below.
>>>>
>>>> >>     "
>>>> >>     [(set_attr "conds" "use")
>>>> >>      (set_attr "type" "branch")
>>>> >>      (set (attr "length")
>>>> >> -    (if_then_else
>>>> >> -       (and (match_test "TARGET_THUMB2")
>>>> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
>>>> >> -       (const_int 2)
>>>> >> -       (const_int 4)))]
>>>> >> +    (if_then_else (match_test "TARGET_THUMB2")
>>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> >> +    (le (minus (match_dup 0) (pc)) (const_int 256)))
>>>> >> +    (const_int 2)
>>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>>> >> + (const_int -1048568))
>>>> >> +                    (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>>> >> +    (const_int 4)
>>>> >> +    (const_int 8)))
>>>> >> +    (const_int 10)))
>>>> >> +   ]
>>>> >
>>>> > This conditional is unreadable and is getting quite complex.
>>>> >
>>>> > Please fix the indentation and add some comments to indicate when
>>>> > this is 2, 4, 8, 10 above the pattern and ask for the comment to
>>>> > be in sync with this.
>>>> >
>>>> > How did we end up with length 10 ? That indicates 2 4 byte instructions
>>>> > and a 2 byte instruction ? You are handling lengths 2, 4, 8 above in
>>>> > the switch - is length 10 going to be a single A32 b<cond> instruction ?
>>>> >
>>>> > What am I missing ?
>>>> >
>>>> >
>>>>
>>>> Ah sorry, I had not realised that the "length" related to the number of
>>>> bytes in the instruction, so I just used it as a variable to then check
>>>> in the switch().
>>>> And yes, you are correct in assuming that length 10 would have been the
>>>> A32 b<cond> version.
>>>> So the mapping I had in mind was:
>>>> 2->  Thumb2 b<cond> - narrow 16bit version
>>>> 4->  Thumb2 b<cond> - wide 32bit version
>>>> 8->  Thumb2 b       - "far branch".
>>>> 10-> A32 b<cond>
>>>>
>>>> The new version that maintains the "length=number of bytes" would be:
>>>>
>>>> 2->  Thumb2 b<cond> - narrow 16bit version
>>>> 4->  Thumb2 b<cond> - wide 32bit version OR A32 b<cond>
>>>> 6->  Thumb2 "far branch", made up of one narrow b<cond> to a very close
>>>> Lbcond label (16 bits) and one wide b (32 bits), so 2+4 == 6 bytes.
>>>>
>>>> I've gone ahead and done this in the new proposed patch. Let me know if
>>>> it's ok! (also I changed the first check to !TARGET_THUMB2 - this makes
>>>> it slightly more readable). I'm still not sure about this, so any
>>>> suggestions are welcome!
>>>>
>>>> >
>>>> >>   )
>>>> >>
>>>> >>   (define_insn "*arm_cond_branch_reversed"
>>>> >> @@ -6978,17 +7005,36 @@
>>>> >>         arm_ccfsm_state += 2;
>>>> >>         return \"\";
>>>> >>       }
>>>> >> -  return \"b%D1\\t%l0\";
>>>> >> +     switch (get_attr_length (insn))
>>>> >> +      {
>>>> >> +    // Thumb2 16-bit b{cond}
>>>> >> +    case 2:
>>>> >> +
>>>> >> +    // Thumb2 32-bit b{cond}
>>>> >> +    case 4: return \"b%D1\\t%l0\";break;
>>>> >> +
>>>> >> +    // Thumb2 b{cond} out of range.  Use unconditional branch.
>>>> >> +    case 8: return arm_gen_far_branch \
>>>> >> +            (operands, 0, \"Lbcond\", \"b%d1\t\");
>>>> >> +            break;
>>>> >> +    // A32 b{cond}
>>>> >> +    default: return \"b%D1\\t%l0\";
>>>> >> +       }
>>>> >>     "
>>>> >>     [(set_attr "conds" "use")
>>>> >>      (set_attr "type" "branch")
>>>> >>      (set (attr "length")
>>>> >> -    (if_then_else
>>>> >> -       (and (match_test "TARGET_THUMB2")
>>>> >> -            (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> >> -                 (le (minus (match_dup 0) (pc)) (const_int 256))))
>>>> >> -       (const_int 2)
>>>> >> -       (const_int 4)))]
>>>> >> +    (if_then_else (match_test "TARGET_THUMB2")
>>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>>> >> +            (le (minus (match_dup 0) (pc)) (const_int 256)))
>>>> >> +    (const_int 2)
>>>> >> +    (if_then_else (and (ge (minus (match_dup 0) (pc))
>>>> >> + (const_int -1048568))
>>>> >> +            (le (minus (match_dup 0) (pc)) (const_int 1048576)))
>>>> >> +    (const_int 4)
>>>> >> +    (const_int 8)))
>>>> >> +    (const_int 10)))
>>>> >> +   ]
>>>> >
>>>> > Same comments as above apply here too.
>>>> >
>>>>
>>>> Same as above.
>>>>
>>>> Thank you for the feedback and apologies for being clueless :)
>>>>
>>>> And, of course, let me know of any problems or queries!
>>>>
>>>> Cheers,
>>>> Stam
>>>>
>>>> > Ramana
>>>> >
>>>> >>   )
>>>> >>
>>>> >>
>>>> >> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c 
>>>> b/gcc/testsuite/gcc.target/arm/pr91816.c
>>>> >> new file mode 100644
>>>> >> index 00000000000..176bf61780b
>>>> >> --- /dev/null
>>>> >> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>>>> >> @@ -0,0 +1,102 @@
>>>> >> +/* { dg-do compile } */
>>>> >> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>>>> >> +int printf(const char *, ...);
>>>> >> +
>>>> >> +__attribute__((noinline,noclone)) void f1(int a)
>>>> >> +{
>>>> >> +    if (a) {
>>>> >> +#define HW0 printf("Hello World!\n");
>>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>>> >> +            HW0
>>>> >> +    }
>>>> >> +}
>>>> >> +
>>>> >> +__attribute__((noinline,noclone)) void f2(int a)
>>>> >> +{
>>>> >> +    if (a) {
>>>> >> +#define HW0 printf("Hello World!\n");
>>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>>> >> +            HW3
>>>> >> +    }
>>>> >> +}
>>>> >> +
>>>> >> +
>>>> >> +__attribute__((noinline,noclone)) void f3(int a)
>>>> >> +{
>>>> >> +    if (a) {
>>>> >> +#define HW0 printf("Hello World!\n");
>>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>>> >> +            HW5
>>>> >> +    }
>>>> >> +}
>>>> >> +
>>>> >> +__attribute__((noinline,noclone)) void f4(int a)
>>>> >> +{
>>>> >> +    if (a==1) {
>>>> >> +#define HW0 printf("Hello World!\n");
>>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>>> >> +            HW0
>>>> >> +    }
>>>> >> +}
>>>> >> +
>>>> >> +__attribute__((noinline,noclone)) void f5(int a)
>>>> >> +{
>>>> >> +    if (a==1) {
>>>> >> +#define HW0 printf("Hello World!\n");
>>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>>> >> +            HW3
>>>> >> +    }
>>>> >> +}
>>>> >> +
>>>> >> +
>>>> >> +__attribute__((noinline,noclone)) void f6(int a)
>>>> >> +{
>>>> >> +    if (a==1) {
>>>> >> +#define HW0 printf("Hello World!\n");
>>>> >> +#define HW1 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>>> >> +#define HW2 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>>> >> +#define HW3 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>>> >> +#define HW4 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>>> >> +#define HW5 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>>> >> +            HW5
>>>> >> +    }
>>>> >> +}
>>>> >> +
>>>> >> +
>>>> >> +int main(void)
>>>> >> +{
>>>> >> +    f1(0);
>>>> >> +    f2(0);
>>>> >> +    f3(0);
>>>> >> +    f4(0);
>>>> >> +    f5(0);
>>>> >> +    f6(0);
>>>> >> +    return 0;
>>>> >> +}
>>>> >> +
>>>> >> +
>>>> >> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
>>>> >> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
>>>> >> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
>>>> >> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
>>>> >> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
>>>> >
>>>>
>>>
>>> 1.patch
>>>
>>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>>> index f995974f9bb..59ec219da3d 100644
>>> --- a/gcc/config/arm/arm-protos.h
>>> +++ b/gcc/config/arm/arm-protos.h
>>> @@ -570,4 +570,6 @@ void arm_parse_option_features (sbitmap, const 
>>> cpu_arch_option *,
>>>
>>>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>>
>>> +const char * arm_gen_far_branch (rtx *, int, const char *, const char *);
>>> +
>>>   #endif /* ! GCC_ARM_PROTOS_H */
>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> index 39e1a1ef9a2..7a69ddb6b7b 100644
>>> --- a/gcc/config/arm/arm.c
>>> +++ b/gcc/config/arm/arm.c
>>> @@ -32139,6 +32139,30 @@ arm_run_selftests (void)
>>>   }
>>>   } /* Namespace selftest.  */
>>>
>>> +
>>> +/* Generate code to enable conditional branches in functions over 1 MiB.  */
>>>
>>>
>>> Please document the function parameters in this comment as other functions in 
>>> this file (try to) do.
>>
>> Done :)
>>>
>>>
>>> +const char *
>>> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>>> +            const char * branch_format)
>>> +{
>>> +  rtx_code_label * tmp_label = gen_label_rtx ();
>>> +  char label_buf[256];
>>> +  char buffer[128];
>>> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>>> +            CODE_LABEL_NUMBER (tmp_label));
>>> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>>> +  rtx dest_label = operands[pos_label];
>>> +  operands[pos_label] = tmp_label;
>>> +
>>> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>>> +  output_asm_insn (buffer, operands);
>>> +
>>> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
>>> +  operands[pos_label] = dest_label;
>>> +  output_asm_insn (buffer, operands);
>>> +  return "";
>>> +}
>>> +
>>>   #undef TARGET_RUN_TARGET_SELFTESTS
>>>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>>   #endif /* CHECKING_P */
>>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>>> index f861c72ccfc..7e5e1489214 100644
>>> --- a/gcc/config/arm/arm.md
>>> +++ b/gcc/config/arm/arm.md
>>> @@ -6686,9 +6686,15 @@
>>>   ;; And for backward branches we have
>>>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
>>>   ;;
>>> +;; In 16-bit Thumb these ranges are:
>>>   ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
>>>   ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
>>>
>>> +;; In 32-bit Thumb these ranges are:
>>> +;; For a 'b'       ± 16MB is not checked for.
>>> +;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
>>> +;; (-1048568 -> 1048576).
>>> +
>>>   (define_expand "cbranchsi4"
>>>     [(set (pc) (if_then_else
>>>             (match_operator 0 "expandable_comparison_operator"
>>> @@ -6946,23 +6952,56 @@
>>>                 (label_ref (match_operand 0 "" ""))
>>>                 (pc)))]
>>>     "TARGET_32BIT"
>>> -  "*
>>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> -    {
>>> -      arm_ccfsm_state += 2;
>>> -      return \"\";
>>> -    }
>>> -  return \"b%d1\\t%l0\";
>>> -  "
>>> +  {
>>> +    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> +    {
>>> +        arm_ccfsm_state += 2;
>>> +        return "";
>>> +    }
>>> +    switch (get_attr_length (insn))
>>> +    {
>>> +        /* Thumb2 16-bit b{cond}.  */
>>> +        case 2:
>>> +
>>> +        /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
>>> +        case 4: return "b%d1\t%l0";
>>> +            break;
>>> +
>>> +        /* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
>>> +           unconditional branch b.  */
>>> +        default: return arm_gen_far_branch \
>>> +                (operands, 0, "Lbcond", "b%D1\t");
>>> +    }
>>>
>>>
>>> The indentation here is wrong. Please look at how other switch statements are 
>>> written in the backend for guidance: 2 space indentation, new line after the 
>>> cases etc.
>>
>> Done
>>>
>>>   +  }
>>>     [(set_attr "conds" "use")
>>>      (set_attr "type" "branch")
>>>      (set (attr "length")
>>> -    (if_then_else
>>> -       (and (match_test "TARGET_THUMB2")
>>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>>> -       (const_int 2)
>>> -       (const_int 4)))]
>>> +    (if_then_else (match_test "!TARGET_THUMB2")
>>> +
>>> +    ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
>>> +    (const_int 4)
>>> +
>>> +    ;; Check if target is within 16-bit Thumb2 b{cond} range.
>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> +               (le (minus (match_dup 0) (pc)) (const_int 256)))
>>> +
>>> +        ;; Target is Thumb2, within narrow range.
>>> +        ;; Generate b{cond}.
>>> +            (const_int 2)
>>> +
>>> +        ;; Check if target is within 32-bit Thumb2 b{cond} range.
>>> +            (if_then_else (and (ge (minus (match_dup 0)
>>> +                     (pc))(const_int -1048568))
>>> +                       (le (minus (match_dup 0)
>>> +                     (pc)) (const_int 1048576)))
>>> +
>>> +        ;; Target is Thumb2, within wide range.
>>> +        ;; Generate b{cond}
>>> +                        (const_int 4)
>>> +        ;; Target is Thumb2, out of range.
>>> +        ;; Generate narrow b{cond} and unconditional branch b.
>>> +                        (const_int 6)))))
>>> +  ]
>>>
>>>
>>> Likewise on the indentation.
>>
>> Done, sorry about that!
>>>
>>>   )
>>>
>>>   (define_insn "*arm_cond_branch_reversed"
>>> @@ -6972,23 +7011,56 @@
>>>                 (pc)
>>>                 (label_ref (match_operand 0 "" ""))))]
>>>     "TARGET_32BIT"
>>> -  "*
>>> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> -    {
>>> -      arm_ccfsm_state += 2;
>>> -      return \"\";
>>> -    }
>>> -  return \"b%D1\\t%l0\";
>>> -  "
>>> +  {
>>> +    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
>>> +    {
>>> +        arm_ccfsm_state += 2;
>>> +        return "";
>>> +    }
>>> +    switch (get_attr_length (insn))
>>> +    {
>>> +        /* Thumb2 16-bit b{cond}.  */
>>> +        case 2:
>>> +
>>> +        /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
>>> +        case 4: return "b%D1\t%l0";
>>> +            break;
>>> +
>>> +        /* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
>>> +           unconditional branch b.  */
>>> +        default: return arm_gen_far_branch \
>>> +                (operands, 0, "Lbcond", "b%d1\t");
>>> +    }
>>>
>>>
>>>
>>>
>>>   +  }
>>>     [(set_attr "conds" "use")
>>>      (set_attr "type" "branch")
>>>      (set (attr "length")
>>> -    (if_then_else
>>> -       (and (match_test "TARGET_THUMB2")
>>> -        (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> -             (le (minus (match_dup 0) (pc)) (const_int 256))))
>>> -       (const_int 2)
>>> -       (const_int 4)))]
>>> +    (if_then_else (match_test "!TARGET_THUMB2")
>>> +
>>> +    ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
>>> +    (const_int 4)
>>> +
>>> +    ;; Check if target is within 16-bit Thumb2 b{cond} range.
>>> +    (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
>>> +               (le (minus (match_dup 0) (pc)) (const_int 256)))
>>> +
>>> +        ;; Target is Thumb2, within narrow range.
>>> +        ;; Generate b{cond}.
>>> +            (const_int 2)
>>> +
>>> +        ;; Check if target is within 32-bit Thumb2 b{cond} range.
>>> +            (if_then_else (and (ge (minus (match_dup 0)
>>> +                     (pc))(const_int -1048568))
>>> +                       (le (minus (match_dup 0)
>>> +                     (pc)) (const_int 1048576)))
>>> +
>>> +        ;; Target is Thumb2, within wide range.
>>> +        ;; Generate b{cond}.
>>> +                        (const_int 4)
>>> +        ;; Target is Thumb2, out of range.
>>> +        ;; Generate narrow b{cond} and unconditional branch b.
>>> +                        (const_int 6)))))
>>> +  ]
>>>   )
>>>
>>>
>>> Otherwise this looks reasonable to me. Ramana, did you have any further 
>>> comments on the patch?
>>> Thanks,
>>> Kyrill
>>>
>>>
>>>
>>> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c 
>>> b/gcc/testsuite/gcc.target/arm/pr91816.c
>>> new file mode 100644
>>> index 00000000000..176bf61780b
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>>> @@ -0,0 +1,102 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>>> +int printf(const char *, ...);
>>> +
>>> +__attribute__((noinline,noclone)) void f1(int a)
>>> +{
>>> +    if (a) {
>>> +#define HW0    printf("Hello World!\n");
>>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> +        HW0
>>> +    }
>>> +}
>>> +
>>> +__attribute__((noinline,noclone)) void f2(int a)
>>> +{
>>> +    if (a) {
>>> +#define HW0    printf("Hello World!\n");
>>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> +        HW3
>>> +    }
>>> +}
>>> +
>>> +
>>> +__attribute__((noinline,noclone)) void f3(int a)
>>> +{
>>> +    if (a) {
>>> +#define HW0    printf("Hello World!\n");
>>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> +        HW5
>>> +    }
>>> +}
>>> +
>>> +__attribute__((noinline,noclone)) void f4(int a)
>>> +{
>>> +    if (a==1) {
>>> +#define HW0    printf("Hello World!\n");
>>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> +        HW0
>>> +    }
>>> +}
>>> +
>>> +__attribute__((noinline,noclone)) void f5(int a)
>>> +{
>>> +    if (a==1) {
>>> +#define HW0    printf("Hello World!\n");
>>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> +        HW3
>>> +    }
>>> +}
>>> +
>>> +
>>> +__attribute__((noinline,noclone)) void f6(int a)
>>> +{
>>> +    if (a==1) {
>>> +#define HW0    printf("Hello World!\n");
>>> +#define HW1    HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
>>> +#define HW2    HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
>>> +#define HW3    HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
>>> +#define HW4    HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
>>> +#define HW5    HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
>>> +        HW5
>>> +    }
>>> +}
>>> +
>>> +
>>> +int main(void)
>>> +{
>>> +    f1(0);
>>> +    f2(0);
>>> +    f3(0);
>>> +    f4(0);
>>> +    f5(0);
>>> +    f6(0);
>>> +    return 0;
>>> +}
>>> +
>>> +
>>> +/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
>>> +/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
>>> +/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
>>> +/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
>>> +/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
>>>
>>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2020-01-08 15:19         ` Stam Markianos-Wright
  2020-01-16 16:30           ` Stam Markianos-Wright
@ 2020-01-28 11:05           ` Kyrill Tkachov
  2020-01-30 14:55             ` Stam Markianos-Wright
  1 sibling, 1 reply; 15+ messages in thread
From: Kyrill Tkachov @ 2020-01-28 11:05 UTC (permalink / raw)
  To: Stam Markianos-Wright, gcc-patches
  Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw

Hi Stam,

On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
>
> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>> Hi Stam,
>>
>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>>> Pinging with more correct maintainers this time :)
>>>
>>> Also would need to backport to gcc7,8,9, but need to get this approved
>>> first!
>>>
>> Sorry for the delay.
> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
>
> Done the changes marked below and also removed the unnecessary extra #defines
> from the test.


This is ok with a nit on the testcase...


diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 0000000000000000000000000000000000000000..757c897e9c0db32709227b3fdf1b4a8033428232
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
+int printf(const char *, ...);
+

I think this needs a couple of effective target checks like arm_hard_vfp_ok and arm_thumb2_ok. See other tests in gcc.target/arm that add -mthumb to the options.

Thanks,
Kyrill

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2020-01-28 11:05           ` [PING][PATCH][GCC][ARM] " Kyrill Tkachov
@ 2020-01-30 14:55             ` Stam Markianos-Wright
  2020-01-30 15:21               ` Kyrill Tkachov
  0 siblings, 1 reply; 15+ messages in thread
From: Stam Markianos-Wright @ 2020-01-30 14:55 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 1669 bytes --]



On 1/28/20 10:35 AM, Kyrill Tkachov wrote:
> Hi Stam,
> 
> On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
>>
>> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>>> Hi Stam,
>>>
>>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>>>> Pinging with more correct maintainers this time :)
>>>>
>>>> Also would need to backport to gcc7,8,9, but need to get this approved
>>>> first!
>>>>
>>> Sorry for the delay.
>> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
>>
>> Done the changes marked below and also removed the unnecessary extra #defines
>> from the test.
> 
> 
> This is ok with a nit on the testcase...
> 
> 
> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c 
> b/gcc/testsuite/gcc.target/arm/pr91816.c
> new file mode 100644
> index 
> 0000000000000000000000000000000000000000..757c897e9c0db32709227b3fdf1b4a8033428232
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
> @@ -0,0 +1,61 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
> +int printf(const char *, ...);
> +
> 
> I think this needs a couple of effective target checks like arm_hard_vfp_ok and 
> arm_thumb2_ok. See other tests in gcc.target/arm that add -mthumb to the options.

Hmm, looking back at this now, is there any reason why it can't just be:

/* { dg-do compile } */
/* { dg-require-effective-target arm_thumb2_ok } */
/* { dg-additional-options "-mthumb" }  */

where we don't override the march or fpu options at all, but just use
`require-effective-target arm_thumb2_ok` to make sure that thumb2 is supported?

The attached new diff does just that.
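
For context on what the test in the attached diff relies on, here is a rough
sketch of the macro arithmetic. Each HWn level repeats the previous one ten
times, so HW5 expands to 100000 printf calls. The helper names below are
hypothetical, and the branch offset is approximated as the size of the skipped
body only; the actual bytes emitted per call depend on the compiler and
options and are not assumed here.

```c
#include <assert.h>

/* Upper end of the 32-bit Thumb2 b<cond> window checked in arm.md.  */
#define WIDE_BCOND_MAX 1048576L

/* Number of printf calls produced by HWn, where each level repeats the
   previous one ten times (HW0 is a single call).  */
static long
hw_calls (int level)
{
  assert (level >= 0 && level <= 5);
  long calls = 1;
  for (int i = 0; i < level; i++)
    calls *= 10;
  return calls;
}

/* Smallest whole number of bytes per call at which the body of
   "if (a) { HWn }" no longer fits inside the wide b<cond> window.
   Hypothetical helper: it ignores everything except the calls.  */
static long
min_bytes_per_call_to_overflow (int level)
{
  return WIDE_BCOND_MAX / hw_calls (level) + 1;
}
```

Under this approximation an HW5 body overflows the window as soon as each call
costs 11 bytes or more, while an HW3 body would need over a kilobyte per call.
That is consistent with the test expecting exactly one `.Lbcond` match per
condition (f3 and f6) and plain `.L` targets for the other four functions.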

Cheers :)

Stam.

> 
> Thanks,
> Kyrill
> 


[-- Attachment #2: 1 --]
[-- Type: text/plain, Size: 8164 bytes --]

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 7c4b1003844..8895becc639 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -576,4 +576,6 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
 
 void arm_initialize_isa (sbitmap, const enum isa_feature *);
 
+const char * arm_gen_far_branch (rtx *, int, const char * , const char *);
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 07231d722b9..ee5de169f3e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32626,6 +32626,40 @@ arm_run_selftests (void)
 }
 } /* Namespace selftest.  */
 
+
+/* Generate code to enable conditional branches in functions over 1 MiB.
+   Parameters are:
+     operands: is the operands list of the asm insn (see arm_cond_branch or
+       arm_cond_branch_reversed).
+     pos_label: is an index into the operands array where operands[pos_label] is
+       the asm label of the final jump destination.
+     dest: is a string which is used to generate the asm label of the intermediate
+       destination
+   branch_format: is a string denoting the intermediate branch format, e.g.
+     "beq", "bne", etc.  */
+
+const char *
+arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
+		    const char * branch_format)
+{
+  rtx_code_label * tmp_label = gen_label_rtx ();
+  char label_buf[256];
+  char buffer[128];
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
+			CODE_LABEL_NUMBER (tmp_label));
+  const char *label_ptr = arm_strip_name_encoding (label_buf);
+  rtx dest_label = operands[pos_label];
+  operands[pos_label] = tmp_label;
+
+  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
+  output_asm_insn (buffer, operands);
+
+  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
+  operands[pos_label] = dest_label;
+  output_asm_insn (buffer, operands);
+  return "";
+}
+
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
 #endif /* CHECKING_P */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f89a2d412df..fb1d4547e5c 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -7546,9 +7546,15 @@
 ;; And for backward branches we have 
 ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
 ;;
+;; In 16-bit Thumb these ranges are:
 ;; For a 'b'       pos_range = 2046, neg_range = -2048 giving (-2040->2048).
 ;; For a 'b<cond>' pos_range = 254,  neg_range = -256  giving (-250 ->256).
 
+;; In 32-bit Thumb these ranges are:
+;; For a 'b'       +/- 16MB is not checked for.
+;; For a 'b<cond>' pos_range = 1048574,  neg_range = -1048576  giving
+;; (-1048568 -> 1048576).
+
 (define_expand "cbranchsi4"
   [(set (pc) (if_then_else
 	      (match_operator 0 "expandable_comparison_operator"
@@ -7721,23 +7727,50 @@
 		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+  {
+    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
     {
       arm_ccfsm_state += 2;
-      return \"\";
+      return "";
     }
-  return \"b%d1\\t%l0\";
-  "
+    switch (get_attr_length (insn))
+      {
+	case 2: /* Thumb2 16-bit b{cond}.  */
+	case 4: /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+	  return "b%d1\t%l0";
+	  break;
+
+	/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+	   unconditional branch b.  */
+	default: return arm_gen_far_branch (operands, 0, "Lbcond", "b%D1\t");
+      }
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+      ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+      (const_int 4)
+
+      ;; Check if target is within 16-bit Thumb2 b{cond} range.
+      (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+		         (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+	;; Target is Thumb2, within narrow range.
+	;; Generate b{cond}.
+	(const_int 2)
+
+	;; Check if target is within 32-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc))(const_int -1048568))
+			   (le (minus (match_dup 0) (pc)) (const_int 1048576)))
+
+	  ;; Target is Thumb2, within wide range.
+	  ;; Generate b{cond}
+	  (const_int 4)
+	  ;; Target is Thumb2, out of range.
+	  ;; Generate narrow b{cond} and unconditional branch b.
+	  (const_int 6)))))]
 )
 
 (define_insn "*arm_cond_branch_reversed"
@@ -7747,23 +7780,50 @@
 		      (pc)
 		      (label_ref (match_operand 0 "" ""))))]
   "TARGET_32BIT"
-  "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+  {
+    if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
     {
       arm_ccfsm_state += 2;
-      return \"\";
+      return "";
     }
-  return \"b%D1\\t%l0\";
-  "
+    switch (get_attr_length (insn))
+      {
+	case 2: /* Thumb2 16-bit b{cond}.  */
+	case 4: /* Thumb2 32-bit b{cond} or A32 b{cond}.  */
+	  return "b%D1\t%l0";
+	  break;
+
+	/* Thumb2 b{cond} out of range.  Use 16-bit b{cond} and
+	   unconditional branch b.  */
+	default: return arm_gen_far_branch (operands, 0, "Lbcond", "b%d1\t");
+      }
+  }
   [(set_attr "conds" "use")
    (set_attr "type" "branch")
    (set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		     (le (minus (match_dup 0) (pc)) (const_int 256))))
-	   (const_int 2)
-	   (const_int 4)))]
+    (if_then_else (match_test "!TARGET_THUMB2")
+
+      ;;Target is not Thumb2, therefore is A32.  Generate b{cond}.
+      (const_int 4)
+
+      ;; Check if target is within 16-bit Thumb2 b{cond} range.
+      (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+			 (le (minus (match_dup 0) (pc)) (const_int 256)))
+
+	;; Target is Thumb2, within narrow range.
+	;; Generate b{cond}.
+	(const_int 2)
+
+	;; Check if target is within 32-bit Thumb2 b{cond} range.
+	(if_then_else (and (ge (minus (match_dup 0) (pc))(const_int -1048568))
+			   (le (minus (match_dup 0) (pc)) (const_int 1048576)))
+
+	  ;; Target is Thumb2, within wide range.
+	  ;; Generate b{cond}.
+	  (const_int 4)
+	  ;; Target is Thumb2, out of range.
+	  ;; Generate narrow b{cond} and unconditional branch b.
+	  (const_int 6)))))]
 )
 
 \f
diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c
new file mode 100644
index 00000000000..1fa382d12ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr91816.c
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-additional-options "-mthumb" }  */
+
+int printf(const char *, ...);
+
+#define HW0	printf("Hello World!\n");
+#define HW1	HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0 HW0
+#define HW2	HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1 HW1
+#define HW3	HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2 HW2
+#define HW4	HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3 HW3
+#define HW5	HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4 HW4
+
+__attribute__((noinline,noclone)) void f1 (int a)
+{
+  if (a) { HW0 }
+}
+
+__attribute__((noinline,noclone)) void f2 (int a)
+{
+  if (a) { HW3 }
+}
+
+
+__attribute__((noinline,noclone)) void f3 (int a)
+{
+  if (a) { HW5 }
+}
+
+__attribute__((noinline,noclone)) void f4 (int a)
+{
+  if (a == 1) { HW0 }
+}
+
+__attribute__((noinline,noclone)) void f5 (int a)
+{
+  if (a == 1) { HW3 }
+}
+
+
+__attribute__((noinline,noclone)) void f6 (int a)
+{
+  if (a == 1) { HW5 }
+}
+
+
+int main(void)
+{
+	f1(0);
+	f2(0);
+	f3(0);
+	f4(0);
+	f5(0);
+	f6(0);
+	return 0;
+}
+
+
+/* { dg-final { scan-assembler-times "beq\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "beq\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "bne\\t.L\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "bne\\t.Lbcond\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "b\\t.L\[0-9\]" 2 } } */
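
The range comments and the nested if_then_else in the patch above can be
sketched in plain C as follows. This is only an illustration, not the code GCC
runs: all function names are hypothetical, and POS_BASE_OFFS = 2 is taken from
the surrounding arm.md comment on forward branches, which is not quoted in this
thread. The last helper shows the shape of the text that arm_gen_far_branch
produces (inverted condition to a local label, unconditional branch to the real
destination, then the local label), using made-up label strings in place of
the %l0 operand.

```c
#include <assert.h>
#include <stdio.h>

/* Offsets from the arm.md comment.  POS_BASE_OFFS is an assumption taken
   from the part of that comment not quoted in this thread.  */
enum { POS_BASE_OFFS = 2, PC_OFFS = 4 };

/* Upper bound of (label - pc) for a forward branch.  */
static long
fwd_bound (long pos_range)
{
  return pos_range - POS_BASE_OFFS + PC_OFFS;
}

/* Lower bound of (label - pc) for a backward branch.  NEG_BASE_OFFS is -2
   or -4, as in the arm.md comment.  */
static long
bwd_bound (long neg_range, long neg_base_offs)
{
  return neg_range - neg_base_offs + PC_OFFS;
}

/* Mirror of the nested if_then_else in the "length" attribute: returns the
   number of bytes the pattern is assumed to occupy.  */
static int
cond_branch_length (int is_thumb2, long offset)
{
  if (!is_thumb2)
    return 4;			/* A32 b<cond>.  */
  if (offset >= -250 && offset <= 256)
    return 2;			/* 16-bit Thumb2 b<cond>.  */
  if (offset >= -1048568 && offset <= 1048576)
    return 4;			/* 32-bit Thumb2 b<cond>.  */
  return 6;			/* Narrow inverted b<cond> plus wide b.  */
}

/* Write the text a far branch expands to.  Returns the snprintf result.  */
static int
format_far_branch (char *buf, size_t size, const char *inv_cond,
		   const char *skip_label, const char *dest_label)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "b%s\t%s\n\tb\t%s\n%s:",
		   inv_cond, skip_label, dest_label, skip_label);
}
```

With these helpers, fwd_bound (1048574) and bwd_bound (-1048576, -4) give the
(-1048568 -> 1048576) window from the new comment, and any offset outside it
selects the 6-byte sequence.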

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2020-01-30 14:55             ` Stam Markianos-Wright
@ 2020-01-30 15:21               ` Kyrill Tkachov
  2020-03-04 14:15                 ` Tamar Christina
  0 siblings, 1 reply; 15+ messages in thread
From: Kyrill Tkachov @ 2020-01-30 15:21 UTC (permalink / raw)
  To: Stam Markianos-Wright, gcc-patches
  Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw


On 1/30/20 2:42 PM, Stam Markianos-Wright wrote:
>
>
> On 1/28/20 10:35 AM, Kyrill Tkachov wrote:
>> Hi Stam,
>>
>> On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
>>>
>>> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>>>> Hi Stam,
>>>>
>>>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>>>>> Pinging with more correct maintainers this time :)
>>>>>
>>>>> Also would need to backport to gcc7,8,9, but need to get this 
>>>>> approved
>>>>> first!
>>>>>
>>>> Sorry for the delay.
>>> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
>>>
>>> Done the changes marked below and also removed the unnecessary extra 
>>> #defines
>>> from the test.
>>
>>
>> This is ok with a nit on the testcase...
>>
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c 
>> b/gcc/testsuite/gcc.target/arm/pr91816.c
>> new file mode 100644
>> index 
>> 0000000000000000000000000000000000000000..757c897e9c0db32709227b3fdf1b4a8033428232
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>> @@ -0,0 +1,61 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>> +int printf(const char *, ...);
>> +
>>
>> I think this needs a couple of effective target checks like 
>> arm_hard_vfp_ok and arm_thumb2_ok. See other tests in gcc.target/arm 
>> that add -mthumb to the options.
>
> Hmm, looking back at this now, is there any reason why it can't just be:
>
> /* { dg-do compile } */
> /* { dg-require-effective-target arm_thumb2_ok } */
> /* { dg-additional-options "-mthumb" }  */
>
> where we don't override the march or fpu options at all, but just use
> `require-effective-target arm_thumb2_ok` to make sure that thumb2 is 
> supported?
>
> The attached new diff does just that.
>

Works for me; there are plenty of configurations run with an FPU, so it
should get the right coverage.

Ok (make sure to commit the updated ChangeLog as well, if needed).

Thanks!

Kyrill


> Cheers :)
>
> Stam.
>
>>
>> Thanks,
>> Kyrill
>>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2020-01-30 15:21               ` Kyrill Tkachov
@ 2020-03-04 14:15                 ` Tamar Christina
  2020-03-04 14:18                   ` Kyrill Tkachov
  0 siblings, 1 reply; 15+ messages in thread
From: Tamar Christina @ 2020-03-04 14:15 UTC (permalink / raw)
  To: Kyrill Tkachov, Stam Markianos-Wright, gcc-patches
  Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw

Hi Kyrill,

Ok for backporting this patch to GCC 8 and GCC 9?

Thanks,
Tamar

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
> On Behalf Of Kyrill Tkachov
> Sent: Thursday, January 30, 2020 14:55
> To: Stam Markianos-Wright <Stam.Markianos-Wright@arm.com>; gcc-
> patches@gcc.gnu.org
> Cc: nickc@redhat.com; Ramana Radhakrishnan
> <Ramana.Radhakrishnan@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>
> Subject: Re: [PING][PATCH][GCC][ARM] Arm generates out of range
> conditional branches in Thumb2 (PR91816)
> 
> 
> On 1/30/20 2:42 PM, Stam Markianos-Wright wrote:
> >
> >
> > On 1/28/20 10:35 AM, Kyrill Tkachov wrote:
> >> Hi Stam,
> >>
> >> On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
> >>>
> >>> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
> >>>> Hi Stam,
> >>>>
> >>>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
> >>>>> Pinging with more correct maintainers this time :)
> >>>>>
> >>>>> Also would need to backport to gcc7,8,9, but need to get this
> >>>>> approved first!
> >>>>>
> >>>> Sorry for the delay.
> >>> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
> >>>
> >>> Done the changes marked below and also removed the unnecessary extra
> >>> #defines from the test.
> >>
> >>
> >> This is ok with a nit on the testcase...
> >>
> >>
> >> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c
> >> b/gcc/testsuite/gcc.target/arm/pr91816.c
> >> new file mode 100644
> >> index
> >> 0000000000000000000000000000000000000000..757c897e9c0db32709227b3fdf1b4a8033428232
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
> >> @@ -0,0 +1,61 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
> >> +int printf(const char *, ...);
> >> +
> >>
> >> I think this needs a couple of effective target checks like
> >> arm_hard_vfp_ok and arm_thumb2_ok. See other tests in gcc.target/arm
> >> that add -mthumb to the options.
> >
> > Hmm, looking back at this now, is there any reason why it can't just be:
> >
> > /* { dg-do compile } */
> > /* { dg-require-effective-target arm_thumb2_ok } */
> > /* { dg-additional-options "-mthumb" }  */
> >
> > where we don't override the march or fpu options at all, but just use
> > `require-effective-target arm_thumb2_ok` to make sure that thumb2 is
> > supported?
> >
> > The attached new diff does just that.
> >
> 
> Works for me; there are plenty of configurations run with an FPU, so it
> should get the right coverage.
> 
> Ok (make sure to commit the updated ChangeLog as well, if needed).
> 
> Thanks!
> 
> Kyrill
> 
> 
> > Cheers :)
> >
> > Stam.
> >
> >>
> >> Thanks,
> >> Kyrill
> >>
> >

* Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
  2020-03-04 14:15                 ` Tamar Christina
@ 2020-03-04 14:18                   ` Kyrill Tkachov
  0 siblings, 0 replies; 15+ messages in thread
From: Kyrill Tkachov @ 2020-03-04 14:18 UTC (permalink / raw)
  To: Tamar Christina, Stam Markianos-Wright, gcc-patches
  Cc: nickc, Ramana Radhakrishnan, Richard Earnshaw


On 3/4/20 2:14 PM, Tamar Christina wrote:
> Hi Kyrill,
>
> Ok for backporting this patch to GCC 8 and GCC 9?


Ok, assuming bootstrap and testing show no problems.

Thanks,

Kyrill


>
> Thanks,
> Tamar
>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
>> On Behalf Of Kyrill Tkachov
>> Sent: Thursday, January 30, 2020 14:55
>> To: Stam Markianos-Wright <Stam.Markianos-Wright@arm.com>; gcc-patches@gcc.gnu.org
>> Cc: nickc@redhat.com; Ramana Radhakrishnan
>> <Ramana.Radhakrishnan@arm.com>; Richard Earnshaw
>> <Richard.Earnshaw@arm.com>
>> Subject: Re: [PING][PATCH][GCC][ARM] Arm generates out of range
>> conditional branches in Thumb2 (PR91816)
>>
>>
>> On 1/30/20 2:42 PM, Stam Markianos-Wright wrote:
>>>
>>> On 1/28/20 10:35 AM, Kyrill Tkachov wrote:
>>>> Hi Stam,
>>>>
>>>> On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
>>>>> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>>>>>> Hi Stam,
>>>>>>
>>>>>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>>>>>>> Pinging with more correct maintainers this time :)
>>>>>>>
>>>>>>> Also would need to backport to gcc7,8,9, but need to get this
>>>>>>> approved first!
>>>>>>>
>>>>>> Sorry for the delay.
>>>>> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
>>>>>
>>>>> Done the changes marked below and also removed the unnecessary extra
>>>>> #defines from the test.
>>>>
>>>> This is ok with a nit on the testcase...
>>>>
>>>>
>>>> diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c
>>>> b/gcc/testsuite/gcc.target/arm/pr91816.c
>>>> new file mode 100644
>>>> index 0000000000000000000000000000000000000000..757c897e9c0db32709227b3fdf1b4a8033428232
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/arm/pr91816.c
>>>> @@ -0,0 +1,61 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" }  */
>>>> +int printf(const char *, ...);
>>>> +
>>>>
>>>> I think this needs a couple of effective target checks like
>>>> arm_hard_vfp_ok and arm_thumb2_ok. See other tests in gcc.target/arm
>>>> that add -mthumb to the options.
>>> Hmm, looking back at this now, is there any reason why it can't just be:
>>>
>>> /* { dg-do compile } */
>>> /* { dg-require-effective-target arm_thumb2_ok } */
>>> /* { dg-additional-options "-mthumb" }  */
>>>
>>> where we don't override the march or fpu options at all, but just use
>>> `require-effective-target arm_thumb2_ok` to make sure that thumb2 is
>>> supported?
>>>
>>> The attached new diff does just that.
>>>
>> Works for me; plenty of configurations are run with an FPU, so it should
>> get the right coverage.
>>
>> Ok (make sure to commit the updated ChangeLog as well, if needed).
>>
>> Thanks!
>>
>> Kyrill
>>
>>
>>> Cheers :)
>>>
>>> Stam.
>>>
>>>> Thanks,
>>>> Kyrill
>>>>
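
For reference, the test header agreed in the quoted exchange can be sketched as a complete file. This is only an illustration and not the committed pr91816.c: the three dg- directives are the ones quoted in the thread, while the function name branch_kind and its body are hypothetical, chosen to mirror the two condition shapes named in the original submission (if (a) and if (a == 1)).

```c
/* { dg-do compile } */
/* { dg-require-effective-target arm_thumb2_ok } */
/* { dg-additional-options "-mthumb" } */

/* Hypothetical stand-in body (an assumption, not the real test): it only
   shows the two condition shapes the submission says are exercised.  The
   real test must also make the branch target far enough away to need the
   out-of-range handling; that part is not reproduced here.  */
int
branch_kind (int a)
{
  if (a == 1)   /* equality test against a constant */
    return 2;
  if (a)        /* plain non-zero test */
    return 1;
  return 0;
}
```

Because -mthumb is added with dg-additional-options rather than dg-options, the -march and -mfpu settings of the test run are left as they are, and the arm_thumb2_ok requirement is what keeps the test from running where Thumb-2 code cannot be built.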

end of thread, other threads:[~2020-03-04 14:18 UTC | newest]

Thread overview: 15+ messages
2019-10-11 14:57 [PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816) Stam Markianos-Wright
2019-10-13 15:37 ` Ramana Radhakrishnan
2019-10-21  9:40   ` Stam Markianos-Wright
2019-11-15 17:27     ` [PING][PATCH][GCC][ARM] " Stam Markianos-Wright
2019-12-02 16:43       ` Stam Markianos-Wright
2019-12-09 17:50         ` Stam Markianos-Wright
2019-12-10 17:03       ` Kyrill Tkachov
2020-01-08 15:19         ` Stam Markianos-Wright
2020-01-16 16:30           ` Stam Markianos-Wright
2020-01-27 16:28             ` [PINGx2][PATCH][GCC][ARM] " Stam Markianos-Wright
2020-01-28 11:05           ` [PING][PATCH][GCC][ARM] " Kyrill Tkachov
2020-01-30 14:55             ` Stam Markianos-Wright
2020-01-30 15:21               ` Kyrill Tkachov
2020-03-04 14:15                 ` Tamar Christina
2020-03-04 14:18                   ` Kyrill Tkachov
