public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* arm/thumb bugs
@ 2001-11-01  8:28 Adrian von Bidder
  2001-11-01  8:46 ` Adrian von Bidder
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian von Bidder @ 2001-11-01  8:28 UTC (permalink / raw)
  To: gcc

Hi!

2 bugs are bugging me right now:

 - sometimes (depending on what the optimizer does), gcc generates an
illegal strb rx,[sp, offs] instruction. See also my bug report. 
	I can work around this by not optimizing or by using -fforce-addr

 - in very big functions with long case statements, I get 'Branch out of
range' when compiling for thumb with -fpic. I'm attempting to fix this,
only I have no idea where to start - any hints? - since I don't know any
workaround.

I'm only glad the problems appear on both plain 3.0.2 and 3.0.2 + my
embedded-pic patch.

Again: I'd be glad for hints. (concerning both problems, though the
first one is not that urgent for me.)
 
greets from Zürich
-- vbi

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: arm/thumb bugs
  2001-11-01  8:28 arm/thumb bugs Adrian von Bidder
@ 2001-11-01  8:46 ` Adrian von Bidder
  2001-11-01  9:03   ` Philip Blundell
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian von Bidder @ 2001-11-01  8:46 UTC (permalink / raw)
  To: gcc

Adrian von Bidder wrote:

>  - in very big functions with long case statements, I get 'Branch out of
> range' when compiling for thumb with -fpic. I'm attempting to fix this,
> only I have no idea where to start - any hints? - since I don't know any
> workaround.

Quite helpless on this... I hoped I would find something looking hard at
gcc source, but to no avail...

I thought about tinkering with CASE_VECTOR_MODE, but I don't think it's
the right way. (And I would have no idea what mode to use...)

The switch statements translate to code like
.LCB6:
        ldr     r2, .L20
        lsl     r3, r3, #2
        ldr     r0, [r3, r2]
        mov     pc, r0
        .align  2
        .align  2
.L18:
        b       .L3
        b       .L10
	...

when compiling with -fpic. On thumb, the branch only takes +/- 2040
bytes as addresses, so branches out of that range are generated quite
fast in big functions.

Any help available at all?

greets from Zürich
-- vbi

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: arm/thumb bugs
  2001-11-01  8:46 ` Adrian von Bidder
@ 2001-11-01  9:03   ` Philip Blundell
  2001-11-01 15:27     ` Adrian von Bidder
  2001-11-02  9:28     ` arm/thumb bugs Hartmut Schirmer
  0 siblings, 2 replies; 9+ messages in thread
From: Philip Blundell @ 2001-11-01  9:03 UTC (permalink / raw)
  To: Adrian von Bidder; +Cc: gcc

In message <3BF003A7.F2FE2F5@acter.ch>, Adrian von Bidder writes:
>The switch statements translate to code like
>.LCB6:
>        ldr     r2, .L20
>        lsl     r3, r3, #2
>        ldr     r0, [r3, r2]
>        mov     pc, r0
>        .align  2
>        .align  2
>.L18:
>        b       .L3
>        b       .L10
>	...
>
>when compiling with -fpic. On thumb, the branch only takes +/- 2040
>bytes as addresses, so branches out of that range are generated quite
>fast in big functions.

It looks to me like there are bigger problems with that code than out-of-range branches.  Does it actually work at all, even with small functions?

You probably want to arrange to emit something like:

	lsl	r3, r3, #2
	add	r1, pc, #(.Lb - . - 4)		@ adr r1, .Lb
	ldr	r3, [r1, r3]
	add	r3, r1
	mov	pc, r3

.Lb
	.word	.L1 - .Lb
	.word	.L2 - .Lb
	.word	.L3 - .Lb

.L1:
	...
.L2:
	...

p.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: arm/thumb bugs
  2001-11-01  9:03   ` Philip Blundell
@ 2001-11-01 15:27     ` Adrian von Bidder
  2001-11-01 21:52       ` Nick Clifton
  2001-11-02  9:28     ` arm/thumb bugs Hartmut Schirmer
  1 sibling, 1 reply; 9+ messages in thread
From: Adrian von Bidder @ 2001-11-01 15:27 UTC (permalink / raw)
  To: Philip Blundell; +Cc: gcc

Philip Blundell wrote:

> 
> It looks to me like there are bigger problems with that code than out-of-range branches.  Does it actually work at all, even with small functions?

switch statements basically do work. At least, that's what I'd expect
from a release version...

I probably cut away a little bit too much in my earlier mail. so:

On ARM, a switch statement translates to:
        sub     r3, r0, #1
        cmp     r3, #44
        addls   pc, pc, r3, asl #2
        b       .L17
        .align  2
.L18:
        b       .L3
        b       .L10
        b       .L16
        b       .L19
        b       .L19

(r0 == the var we're switching on. .L17 is obviously the 'default:'
case. Used compiler: gcc-3.0.2 unmodified, target=arm-linux-elf, run
with -fpic) 

The assembler generates relocation entries (R_ARM_PC24 .text) for every
branch instruction. I have tested this - it works as expected. (Tested
with the older gcc with -mdisable-got. The code of this function is
identical, though.)

For thumb:
        sub     r3, r1, #1
        cmp     r3, #44
        bls     .LCB6
        b       .L17    @long jump
.LCB6:
        ldr     r2, .L20
        lsl     r3, r3, #2
        ldr     r3, [r3, r2]
        mov     pc, r3
        .align  2
        .align  2
.L18:
        b       .L3
        b       .L10
        b       .L4
        b       .L5
        b       .L5
	...
.L20:
        .word   .L18(GOTOFF)


(now it's r1... compiler is again gcc-3.0.2 unmodified, same target,
flags -fpic -msingle-pic-base [not that this changes anything here]
-mthumb)

Now this does indeed look strange - it expects the jump table to be the
list of offsets you have mentioned in your mail. But in fact it's the
same series of relative jumps as in the 32bit ARM case.

Since the possible offsets in thumb are only 11 bit, I suppose your
proposal would be the best - and would get rid of the 'branch out of
range' issue as well. (Ok, the jump tables would get bigger than with
branches.) Producing non-pic code also uses 32 bit constants.

I would probably have the time to do this adaption, but I have - as I
said previously - no idea where to start.

greets
-- vbi

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: arm/thumb bugs
  2001-11-01 15:27     ` Adrian von Bidder
@ 2001-11-01 21:52       ` Nick Clifton
  2001-11-02 18:27         ` writing casesi (was: Re: arm/thumb bugs) Adrian von Bidder
  0 siblings, 1 reply; 9+ messages in thread
From: Nick Clifton @ 2001-11-01 21:52 UTC (permalink / raw)
  To: Adrian von Bidder; +Cc: gcc

Hi Adrian,

> Now this does indeed look strange - it expects the jump table to be
> the list of offsets you have mentioned in your mail. But in fact
> it's the same series of relative jumps as in the 32bit ARM case.

Hmm, not good! :-)

> I would probably have the time to do this adaption, but I have - as
> I said previously - no idea where to start.

Have a look in the arm.md file.  Find "casesi" and "casesi_internal".
These are the patterns used for switch table generation for the ARM.
You will porobably want to define similar patterns for the THUMB.

Cheers
        Nick

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: arm/thumb bugs
  2001-11-01  9:03   ` Philip Blundell
  2001-11-01 15:27     ` Adrian von Bidder
@ 2001-11-02  9:28     ` Hartmut Schirmer
  1 sibling, 0 replies; 9+ messages in thread
From: Hartmut Schirmer @ 2001-11-02  9:28 UTC (permalink / raw)
  To: Adrian von Bidder; +Cc: gcc

On Mon, 12 Nov 2001, Philip Blundell wrote:
>In message <3BF003A7.F2FE2F5@acter.ch>, Adrian von Bidder writes:
>>The switch statements translate to code like
>>.LCB6:
>>        ldr     r2, .L20
>>        lsl     r3, r3, #2
>>        ldr     r0, [r3, r2]
>>        mov     pc, r0
>>        .align  2
>>        .align  2
>>.L18:
>>        b       .L3
>>        b       .L10
>>	...
>>
>>when compiling with -fpic. On thumb, the branch only takes +/- 2040
>>bytes as addresses, so branches out of that range are generated quite
>>fast in big functions.
>
>It looks to me like there are bigger problems with that code than out-of-range branches.  Does it actually work at all, even with small functions?
>
>You probably want to arrange to emit something like:
>
>	lsl	r3, r3, #2
>	add	r1, pc, #(.Lb - . - 4)		@ adr r1, .Lb
>	ldr	r3, [r1, r3]
>	add	r3, r1
>	mov	pc, r3
>
>.Lb
>	.word	.L1 - .Lb
>	.word	.L2 - .Lb
>	.word	.L3 - .Lb
>
>.L1:
>	...
>.L2:
>	...

The SH port works this way and even chooses .byte or .word (and .long ?)
if required. You may have a look at it.

Hartmut

^ permalink raw reply	[flat|nested] 9+ messages in thread

* writing casesi (was: Re: arm/thumb bugs)
  2001-11-01 21:52       ` Nick Clifton
@ 2001-11-02 18:27         ` Adrian von Bidder
  2001-11-06 14:56           ` Alexandre Oliva
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian von Bidder @ 2001-11-02 18:27 UTC (permalink / raw)
  To: Nick Clifton, gcc

Heyho!

[ARM/Thumb produces non-working code for switch statements when used
with -fpic]

Nick Clifton wrote:

> Have a look in the arm.md file.  Find "casesi" and "casesi_internal".
> These are the patterns used for switch table generation for the ARM.
> You will porobably want to define similar patterns for the THUMB.

Ok, I'm trying - but not getting anywhere right now.

I'm calling the thumbpic_casesi from casesi if TARGET_THUMB && flag_pic.
This failes when compiling a switch statement, saying there'd be an
unrecognizable pattern.

The define_expand:
(define_expand "thumbpic_casesi"
  [(set (match_dup 5)
        (minus:SI (match_operand:SI 0 "" "")
                  (match_operand:SI 1 "" "")))
   (set (pc)
        (if_then_else (gtu (match_dup 5)
                           (match_operand 2 "" ""))
                      (label_ref (match_operand 4 "" ""))
                      (pc)))
   (set (match_dup 6)
        (mult:SI (match_dup 5)
                 (const_int 2)))
   (set (match_dup 7)
        (label_ref (match_operand 3 "" "")))
   (set (match_dup 8)
        (plus:SI (match_dup 6)
                 (match_dup 7)))
    (set (pc)
         (match_dup 8))]
  "TARGET_THUMB && flag_pic"
  "{
    operands[5] = gen_reg_rtx (SImode);
    operands[6] = gen_reg_rtx (SImode);
    operands[7] = gen_reg_rtx (SImode);
    operands[8] = gen_reg_rtx (SImode);
  }"
)


This results in:
]$ arm-uclinux-gcc -O3 -mno-sched-prolog -dr -save-temps         -mthumb
-fpic -c -o 01.o.thumbpic 01.c -da -dP
01.c: In function `bla':
01.c:20: Unrecognizable insn:

(insn 151 150 152 (set (reg/f:SI 54)
        (label_ref 154)) -1 (nil)
    (expr_list:REG_EQUAL (label_ref 154)
        (insn_list:REG_LABEL 154 (nil))))
01.c:20: Internal compiler error in extract_insn, at recog.c:2218
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://www.gnu.org/software/gcc/bugs.html> for instructions.

I tried to define a pattern to catch this, but without success so far:
(define_insn "*thumbpic_movaddr"
  [(set (match_operand:SI 0 "s_register_operand" "=l")
        (match_operand:DI 1 "address_operand" "p"))]
  "TARGET_THUMB && flag_pic"
  "adr\\t%0, %a1"
  [(set_attr "length" "2")]
)
 seems not to do what I want.

The other possibility would be not to define casesi, but to have the
addr_diff_vec generate not the series of branch instructions that is
generated now, but the series of offsets as Philip Blundell suggested.
Only I couldn't find out where I would specify this (defining
CASE_VECTOR_PC_RELATIVE in arm.h did not change anything.)

 
greets from Zürich
-- vbi

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: writing casesi (was: Re: arm/thumb bugs)
  2001-11-02 18:27         ` writing casesi (was: Re: arm/thumb bugs) Adrian von Bidder
@ 2001-11-06 14:56           ` Alexandre Oliva
  2001-11-07 17:26             ` Adrian von Bidder
  0 siblings, 1 reply; 9+ messages in thread
From: Alexandre Oliva @ 2001-11-06 14:56 UTC (permalink / raw)
  To: Adrian von Bidder; +Cc: Nick Clifton, gcc

On Nov 15, 2001, Adrian von Bidder <avbidder@acter.ch> wrote:

>    (set (match_dup 7)
>         (label_ref (match_operand 3 "" "")))

As far as I can tell, this is not a valid instruction, at least not in
PIC mode.  You should probably built it as the PIC register plus the
GOTOFF of the label.  I don't know details about the implementation of
PIC in the ARM back-end, but it appears to me that pic_load_addr_thumb
may be the insn you're looking for.

> (define_insn "*thumbpic_movaddr"
>   [(set (match_operand:SI 0 "s_register_operand" "=l")
>         (match_operand:DI 1 "address_operand" "p"))]

At the very least, both operands should have the same mode.  I don't
see why you're using DImode, though.  Also, address_operand and the
"p" constraint are not what you're looking for.  "immediate_operand"
and "s" would be closer, but a label_ref is not a legitimate constant,
so it wouldn't match.

> The other possibility would be not to define casesi, but to have the
> addr_diff_vec generate not the series of branch instructions that is
> generated now, but the series of offsets as Philip Blundell suggested.
> Only I couldn't find out where I would specify this (defining
> CASE_VECTOR_PC_RELATIVE in arm.h did not change anything.)

You may have to define ASM_OUTPUT_ADDR_{DIFF,VEC}_ELT for it to make a
difference.  I don't see these macros defined in arm.h.

-- 
Alexandre Oliva   Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer                  aoliva@{cygnus.com, redhat.com}
CS PhD student at IC-Unicamp        oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist    *Please* write to mailing lists, not to me

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: writing casesi (was: Re: arm/thumb bugs)
  2001-11-06 14:56           ` Alexandre Oliva
@ 2001-11-07 17:26             ` Adrian von Bidder
  0 siblings, 0 replies; 9+ messages in thread
From: Adrian von Bidder @ 2001-11-07 17:26 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Nick Clifton, gcc

[-- Attachment #1: Type: text/plain, Size: 1630 bytes --]

Hi again!

Attached is a first cut on the casesi-on-thumb-with-fpic problem. The
patch is not yet tested - I'll do that asap - but the code should work
in theory (an improvement on the current state, I'd say). Patch is
against 3.0.2, but does apply to current cvs.

Alexandre Oliva wrote:
> 
> On Nov 15, 2001, Adrian von Bidder <avbidder@acter.ch> wrote:
> 
> >    (set (match_dup 7)
> >         (label_ref (match_operand 3 "" "")))
> 
> As far as I can tell, this is not a valid instruction, at least not in
> PIC mode.  You should probably built it as the PIC register plus the
> GOTOFF of the label.  I don't know details about the implementation of
> PIC in the ARM back-end, but it appears to me that pic_load_addr_thumb
> may be the insn you're looking for.

The label_ref in question is very close to the current pc, so I can use
"adr rx, #offset" (which is rx = pc + offset). Pattern *thumb_movaddr in
the patch.

It would perhaps be better if I'd call gen_thumb_movaddr directly from
the preparation statements. But how do I indicate in the template that
this operand has been set up? dce will delete the adr instruction if I
don't.
(Best would probably be to specify the range in the condition to the adr
instruction, so it could be used generally if the necessity arised. But
that's beyond me).

Probably there is no need to add pattern *thumb_mulsi3_power2 and the
modification to *thumb_subsi3_insn, if the template were written the
Right[tm] way.

All in all: I'm still hacking away without understanding half of what's
going on behind the scene, so any hand-holding will be appreciated.

greets from Zürich
-- vbi

[-- Attachment #2: arm.md-casesi.patch-2 --]
[-- Type: text/plain, Size: 3838 bytes --]

--- gcc-3.0.2.orig/gcc/config/arm/arm.md	Fri May 18 14:45:22 2001
+++ gcc-3.0.2/gcc/config/arm/arm.md	Mon Nov 19 16:13:59 2001
@@ -983,11 +983,13 @@
 )
 
 (define_insn "*thumb_subsi3_insn"
-  [(set (match_operand:SI           0 "register_operand" "=l")
-	(minus:SI (match_operand:SI 1 "register_operand" "l")
-		  (match_operand:SI 2 "register_operand" "l")))]
+  [(set (match_operand:SI           0 "register_operand" "=l,l")
+	(minus:SI (match_operand:SI 1 "register_operand" "l,0")
+		  (match_operand:SI 2 "register_operand" "l,I")))]
   "TARGET_THUMB"
-  "sub\\t%0, %1, %2"
+  "@
+   sub\\t%0, %1, %2
+   sub\\t%0, %2"
   [(set_attr "length" "2")]
 )
 
@@ -1165,6 +1167,16 @@
    (set_attr "type" "mult")]
 )
 
+; multiply by power of 2
+(define_insn "*thumb_mulsi3_power2"
+  [(set (match_operand:SI          0 "s_register_operand" "=l")
+	(mult:SI (match_operand:SI 1 "s_register_operand" "0")
+		 (match_operand:SI 2 "const_int_operand" "")))]
+  "TARGET_THUMB && (((INTVAL(operands[2]))&((INTVAL(operands[2]) - 1))) == 0)"
+  "lsl\\t%0, %1, %2"
+  [(set_attr "length" "2")]
+)
+
 (define_insn "*mulsi3_compare0"
   [(set (reg:CC_NOOV CC_REGNUM)
 	(compare:CC_NOOV (mult:SI
@@ -6833,10 +6845,17 @@
    (match_operand:SI 2 "const_int_operand" "")	; total range
    (match_operand:SI 3 "" "")			; table label
    (match_operand:SI 4 "" "")]			; Out of range label
-  "TARGET_ARM"
+  "TARGET_ARM || (TARGET_THUMB && flag_pic)"
   "
   {
     rtx reg;
+
+    if (TARGET_THUMB && flag_pic) {
+      emit_insn (gen_thumbpic_casesi (operands[0], operands[1],
+              operands[2], operands[3], operands[4]));
+      DONE;
+    }
+    
     if (operands[1] != const0_rtx)
       {
 	reg = gen_reg_rtx (SImode);
@@ -6849,15 +6868,15 @@
     if (!const_ok_for_arm (INTVAL (operands[2])))
       operands[2] = force_reg (SImode, operands[2]);
 
-    emit_jump_insn (gen_casesi_internal (operands[0], operands[2], operands[3],
-					 operands[4]));
+    emit_jump_insn (gen_arm_casesi_internal (operands[0], operands[2],
+                operands[3], operands[4]));
     DONE;
   }"
 )
 
 ;; The USE in this pattern is needed to tell flow analysis that this is
 ;; a CASESI insn.  It has no other purpose.
-(define_insn "casesi_internal"
+(define_insn "arm_casesi_internal"
   [(parallel [(set (pc)
 	       (if_then_else
 		(leu (match_operand:SI 0 "s_register_operand" "r")
@@ -6875,6 +6894,47 @@
   "
   [(set_attr "conds" "clob")
    (set_attr "length" "12")]
+)
+
+;; tablejump seems to produce wrong code for thumb with -fpic. Fix this:
+(define_expand "thumbpic_casesi"
+  [(set (match_dup 5)
+        (minus:SI (match_operand:SI 0 "" "")
+                  (match_operand:SI 1 "" "")))
+   (set (pc)
+        (if_then_else (gtu (match_dup 5)
+                           (match_operand 2 "" ""))
+                      (label_ref (match_operand 4 "" ""))
+                      (pc)))
+   (set (match_dup 6)
+        (mult:SI (match_dup 5)
+                 (const_int 2)))
+   (set (match_dup 7)
+        (label_ref (match_operand 3 "" "")))
+   (set (match_dup 8)
+        (plus:SI (match_dup 6)
+                 (match_dup 7)))
+   (parallel [
+     (set (pc)
+          (match_dup 8))
+     (use (label_ref (match_dup 3)))])]
+  "TARGET_THUMB && flag_pic"
+  "{
+    operands[5] = gen_reg_rtx (SImode);
+    operands[6] = gen_reg_rtx (SImode);
+    operands[7] = gen_reg_rtx (SImode);
+    operands[8] = gen_reg_rtx (SImode);
+  }"
+)
+
+; this works only if the label_ref is close enough (10 bit word aligned)
+; but it should only be called from thumbpic_casesi anyway.
+(define_insn "*thumbpic_movaddr"
+  [(set (match_operand:SI 0 "s_register_operand" "=l")
+        (label_ref (match_operand 1 "" "")))]
+  "TARGET_THUMB && flag_pic"
+  "adr\\t%0, %a1"
+  [(set_attr "length" "2")]
 )
 
 (define_expand "indirect_jump"

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2001-11-19 15:34 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-11-01  8:28 arm/thumb bugs Adrian von Bidder
2001-11-01  8:46 ` Adrian von Bidder
2001-11-01  9:03   ` Philip Blundell
2001-11-01 15:27     ` Adrian von Bidder
2001-11-01 21:52       ` Nick Clifton
2001-11-02 18:27         ` writing casesi (was: Re: arm/thumb bugs) Adrian von Bidder
2001-11-06 14:56           ` Alexandre Oliva
2001-11-07 17:26             ` Adrian von Bidder
2001-11-02  9:28     ` arm/thumb bugs Hartmut Schirmer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).