[PATCH, ARM] Subregs of VFP registers in big-endian mode

* [PATCH, ARM] Subregs of VFP registers in big-endian mode
@ 2012-10-20 14:13 Julian Brown
  2012-10-21  2:48 ` Andrew Pinski
  2012-10-22  8:23 ` Richard Earnshaw
  0 siblings, 2 replies; 3+ messages in thread
From: Julian Brown @ 2012-10-20 14:13 UTC
  To: gcc-patches, Richard Earnshaw, Ramana Radhakrishnan

Hi,

At present, quite a few tests fail for big-endian multilibs that use VFP
instructions. One reason for many of these failures is glaringly
obvious once you notice it: for D registers interpreted as two S
registers, the lower-numbered register is always the less-significant
part of the value, and the higher-numbered register the
more-significant -- regardless of the endianness the processor is
running in.

However, for big-endian mode, when DFmode values are represented in
memory (or indeed core registers), the opposite is true. So, a subreg
expression such as the following will work fine on core registers (or
e.g. pseudos assigned to stack slots):

(subreg:SI (reg:DF) 0)

but, when applied to a VFP register Dn, it should be resolved to the
hard register S(n*2+1). At present though, it resolves to S(n*2) -- i.e.
the wrong half of the value (for WORDS_BIG_ENDIAN, such a subreg should
be the most-significant part of the value). For the relatively few cases
where DFmode values are interpreted as a pair of (integer) words, this
means that wrong code is generated.

My feeling is that implementing a "proper" solution to this problem is
probably impractical -- the closest existing macros for controlling this
behaviour aren't sufficient for this case:

* FLOAT_WORDS_BIG_ENDIAN only refers to memory layout, which is correct
  as it is.

* REG_WORDS_BIG_ENDIAN controls whether values are stored in big-endian
  order in registers, but refers to *all* registers. We only want to
  change the behaviour for the VFP registers. Defining a new macro
  FLOAT_REG_WORDS_BIG_ENDIAN wouldn't do, because the behaviour would
  differ depending on the hard register under observation: that seems
  like too much to ask of generic machinery in the middle-end.

So, the attached patch just avoids the problem, by pretending that
greater-than-word-size values in VFP registers, in big-endian mode, are
opaque and cannot be subreg'ed. In practice, for at least the test case
I looked at, this isn't as much of a pessimisation as you might expect
-- the value in question might already be stored in core registers
(e.g. for function arguments with -mfloat-abi=softfp), so can be
retrieved directly from those rather than via memory.

This is the testsuite delta for current FSF mainline, with multilibs
adjusted to build for little/big-endian, and using options
"-mbig-endian -mfloat-abi=softfp -mfpu=vfpv3" for testing:

FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O1  execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O2  execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O3 -fomit-frame-pointer  execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O3 -g  execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -Os  execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/ieee/copysign1.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/ieee/mzero6.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr35456.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O1 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O2 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O2 -flto -fno-use-linker-plugin -flto-partition=none 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O3 -fomit-frame-pointer 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O3 -g 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -Og -g 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -Os 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/compat/scalar-by-value-3 c_compat_x_tst.o-c_compat_y_tst.o execute 
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O1  execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O2  execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O3 -fomit-frame-pointer  execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O3 -g  execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -Os  execution test

OK for mainline, or any comments? (I've included the multilib tweaks I
used in the attached patch for reference, though I'm not proposing to
apply those.)

Thanks,

Julian

ChangeLog

    gcc/
    * config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Avoid subreg'ing
    VFP D registers in big-endian mode.

[-- Attachment #2: vfp-subregs-bigendian-2.diff --]
[-- Type: text/x-patch, Size: 2109 bytes --]

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	(revision 192576)
+++ gcc/config/arm/arm.h	(working copy)
@@ -1205,8 +1205,15 @@ enum reg_class
 /* In VFPv1, VFP registers could only be accessed in the mode they
    were set, so subregs would be invalid there.  However, we don't
    support VFPv1 at the moment, and the restriction was lifted in
-   VFPv2.  */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) 0
+   VFPv2.
+   In big-endian mode, modes greater than word size (i.e. DFmode) are stored in
+   VFP registers in little-endian order.  We can't describe that accurately to
+   GCC, so avoid taking subregs of such values.  */
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
+  (TARGET_VFP && TARGET_BIG_END				\
+   && (GET_MODE_SIZE (FROM) > UNITS_PER_WORD		\
+       || GET_MODE_SIZE (TO) > UNITS_PER_WORD)		\
+   && reg_classes_intersect_p (VFP_REGS, (CLASS)))
 
 /* The class value for index registers, and the one for base regs.  */
 #define INDEX_REG_CLASS  (TARGET_THUMB1 ? LO_REGS : GENERAL_REGS)
Index: gcc/config/arm/t-arm-elf
===================================================================
--- gcc/config/arm/t-arm-elf	(revision 192576)
+++ gcc/config/arm/t-arm-elf	(working copy)
@@ -17,8 +17,8 @@
 # along with GCC; see the file COPYING3.  If not see
 # <http://www.gnu.org/licenses/>.
 
-MULTILIB_OPTIONS     = marm/mthumb
-MULTILIB_DIRNAMES    = arm thumb
+MULTILIB_OPTIONS     = marm
+MULTILIB_DIRNAMES    = arm
 MULTILIB_EXCEPTIONS  = 
 MULTILIB_MATCHES     =
 
@@ -49,9 +49,9 @@ MULTILIB_EXCEPTIONS    += *mthumb/*mfloa
 # MULTILIB_DIRNAMES   += ep9312
 # MULTILIB_EXCEPTIONS += *mthumb/*mcpu=ep9312*
 # 	
-# MULTILIB_OPTIONS     += mlittle-endian/mbig-endian
-# MULTILIB_DIRNAMES    += le be
-# MULTILIB_MATCHES     += mbig-endian=mbe mlittle-endian=mle
+MULTILIB_OPTIONS     += mlittle-endian/mbig-endian
+MULTILIB_DIRNAMES    += le be
+MULTILIB_MATCHES     += mbig-endian=mbe mlittle-endian=mle
 # 
 # MULTILIB_OPTIONS    += mfloat-abi=hard/mfloat-abi=soft
 # MULTILIB_DIRNAMES   += fpu soft

* Re: [PATCH, ARM] Subregs of VFP registers in big-endian mode
  2012-10-20 14:13 [PATCH, ARM] Subregs of VFP registers in big-endian mode Julian Brown
@ 2012-10-21  2:48 ` Andrew Pinski
  2012-10-22  8:23 ` Richard Earnshaw
  1 sibling, 0 replies; 3+ messages in thread
From: Andrew Pinski @ 2012-10-21  2:48 UTC
  To: Julian Brown; +Cc: gcc-patches, Richard Earnshaw, Ramana Radhakrishnan

I also tested this on GCC 4.7.0 with armeb-linux-gnueabi defaulting to
the hard-float ABI, and it fixes a lot of failures there too.

Thanks,
Andrew Pinski



* Re: [PATCH, ARM] Subregs of VFP registers in big-endian mode
  2012-10-20 14:13 [PATCH, ARM] Subregs of VFP registers in big-endian mode Julian Brown
  2012-10-21  2:48 ` Andrew Pinski
@ 2012-10-22  8:23 ` Richard Earnshaw
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Earnshaw @ 2012-10-22  8:23 UTC
  To: Julian Brown; +Cc: gcc-patches, Ramana Radhakrishnan

The patch to arm.h is OK.  The patch to t-arm-elf is not.  I presume the 
latter was just an oversight in patch preparation as there is no 
ChangeLog entry for it.

R.
