public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* [MIPS] Test case dspr2-MULT is failed
@ 2010-12-29  8:14 Mingjie Xing
  2010-12-31 13:38 ` Richard Sandiford
  0 siblings, 1 reply; 7+ messages in thread
From: Mingjie Xing @ 2010-12-29  8:14 UTC (permalink / raw)
  To: gcc

Hi,

There are two test cases failed when run 'make check-gcc
RUNTESTFLAGS="mips.exp"'.  The log is,

Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
-B/home/xmj/tools/build-test-trunk-mips/gcc/
/home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULT.c
 -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
-mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S  -o dspr2-MULT.s
(timeout = 300)
PASS: gcc.target/mips/dspr2-MULT.c (test for excess errors)
PASS: gcc.target/mips/dspr2-MULT.c scan-assembler \tmult\t
PASS: gcc.target/mips/dspr2-MULT.c scan-assembler ac1
FAIL: gcc.target/mips/dspr2-MULT.c scan-assembler ac2
Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
-B/home/xmj/tools/build-test-trunk-mips/gcc/
/home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULTU.c
  -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
-mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S  -o dspr2-MULTU.s
(timeout = 300)
PASS: gcc.target/mips/dspr2-MULTU.c (test for excess errors)
PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler \tmultu\t
PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler ac1
FAIL: gcc.target/mips/dspr2-MULTU.c scan-assembler ac2

Is it a bug?

Thanks,
Mingjie

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [MIPS] Test case dspr2-MULT is failed
  2010-12-29  8:14 [MIPS] Test case dspr2-MULT is failed Mingjie Xing
@ 2010-12-31 13:38 ` Richard Sandiford
  2011-01-06 18:32   ` Chung-Lin Tang
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Sandiford @ 2010-12-31 13:38 UTC (permalink / raw)
  To: Mingjie Xing; +Cc: gcc

Mingjie Xing <mingjie.xing@gmail.com> writes:
> There are two test cases failed when run 'make check-gcc
> RUNTESTFLAGS="mips.exp"'.  The log is,
>
> Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
> -B/home/xmj/tools/build-test-trunk-mips/gcc/
> /home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULT.c
>  -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
> -mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S  -o dspr2-MULT.s
> (timeout = 300)
> PASS: gcc.target/mips/dspr2-MULT.c (test for excess errors)
> PASS: gcc.target/mips/dspr2-MULT.c scan-assembler \tmult\t
> PASS: gcc.target/mips/dspr2-MULT.c scan-assembler ac1
> FAIL: gcc.target/mips/dspr2-MULT.c scan-assembler ac2
> Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
> -B/home/xmj/tools/build-test-trunk-mips/gcc/
> /home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULTU.c
>   -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
> -mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S  -o dspr2-MULTU.s
> (timeout = 300)
> PASS: gcc.target/mips/dspr2-MULTU.c (test for excess errors)
> PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler \tmultu\t
> PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler ac1
> FAIL: gcc.target/mips/dspr2-MULTU.c scan-assembler ac2
>
> Is it a bug?

It's a register-allocation optimisation regression that's been around
for quite a long time now (probably over a year).  We're not making as
much use of the 4 accumulator registers as we should.

In truth, I don't think we ever really made good use of them anyway.
ISTR that trivial modifications of the testcase failed even before
the regression.

Richard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [MIPS] Test case dspr2-MULT is failed
  2010-12-31 13:38 ` Richard Sandiford
@ 2011-01-06 18:32   ` Chung-Lin Tang
  2011-01-13  5:50     ` Mingjie Xing
  2011-02-18  1:19     ` Fu, Chao-Ying
  0 siblings, 2 replies; 7+ messages in thread
From: Chung-Lin Tang @ 2011-01-06 18:32 UTC (permalink / raw)
  To: Mingjie Xing, gcc, rdsandiford

On 2010/12/31 09:38 PM, Richard Sandiford wrote:
> Mingjie Xing <mingjie.xing@gmail.com> writes:
>> There are two test cases failed when run 'make check-gcc
>> RUNTESTFLAGS="mips.exp"'.  The log is,
>>
>> Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
>> -B/home/xmj/tools/build-test-trunk-mips/gcc/
>> /home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULT.c
>>  -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
>> -mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S  -o dspr2-MULT.s
>> (timeout = 300)
>> PASS: gcc.target/mips/dspr2-MULT.c (test for excess errors)
>> PASS: gcc.target/mips/dspr2-MULT.c scan-assembler \tmult\t
>> PASS: gcc.target/mips/dspr2-MULT.c scan-assembler ac1
>> FAIL: gcc.target/mips/dspr2-MULT.c scan-assembler ac2
>> Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
>> -B/home/xmj/tools/build-test-trunk-mips/gcc/
>> /home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULTU.c
>>   -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
>> -mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S  -o dspr2-MULTU.s
>> (timeout = 300)
>> PASS: gcc.target/mips/dspr2-MULTU.c (test for excess errors)
>> PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler \tmultu\t
>> PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler ac1
>> FAIL: gcc.target/mips/dspr2-MULTU.c scan-assembler ac2
>>
>> Is it a bug?
> 
> It's a register-allocation optimisation regression that's been around
> for quite a long time now (probably over a year).  We're not making as
> much use of the 4 accumulator registers as we should.
> 
> In truth, I don't think we ever really made good use of them anyway.
> ISTR that trivial modifications of the testcase failed even before
> the regression.

I analyzed this testcase regression a while earlier; the direct cause of
this is due to mips_order_regs_for_local_alloc(), which now serves as
MIPS' ADJUST_REG_ALLOC_ORDER macro.

The mips_order_regs_for_local_alloc() function seems to be written for
the old local-alloc.c, still left as the deprecated
ORDER_REGS_FOR_LOCAL_ALLOC macro after the transition to IRA (actually
not called at all during then), and relatively recently 'revived' after
a patch by Bernd that created the ADJUST_REG_ALLOC_ORDER macro went in.

So you have a local-alloc.c heuristic working in IRA, which seemed to
cause these regressions.

Removing mips_order_regs_for_local_alloc() will let this testcase pass;
of course the real fix should be to review the MIPS reg-ordering logic,
left for you MIPS people...

Chung-Lin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [MIPS] Test case dspr2-MULT is failed
  2011-01-06 18:32   ` Chung-Lin Tang
@ 2011-01-13  5:50     ` Mingjie Xing
  2011-02-18  1:19     ` Fu, Chao-Ying
  1 sibling, 0 replies; 7+ messages in thread
From: Mingjie Xing @ 2011-01-13  5:50 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: gcc, rdsandiford

2011/1/7 Chung-Lin Tang <cltang@codesourcery.com>:
> I analyzed this testcase regression a while earlier; the direct cause of
> this is due to mips_order_regs_for_local_alloc(), which now serves as
> MIPS' ADJUST_REG_ALLOC_ORDER macro.
>
> The mips_order_regs_for_local_alloc() function seems to be written for
> the old local-alloc.c, still left as the deprecated
> ORDER_REGS_FOR_LOCAL_ALLOC macro after the transition to IRA (actually
> not called at all during then), and relatively recently 'revived' after
> a patch by Bernd that created the ADJUST_REG_ALLOC_ORDER macro went in.
>
> So you have a local-alloc.c heuristic working in IRA, which seemed to
> cause these regressions.
>
> Removing mips_order_regs_for_local_alloc() will let this testcase pass;
> of course the real fix should be to review the MIPS reg-ordering logic,
> left for you MIPS people...
>
> Chung-Lin
>

As I can see, mips_order_regs_for_local_alloc() is only used to
reorder $24 (T_REG) for MIPS16.  Since current definition of
REG_ALLOC_ORDER for IRA is not {0,1,2,...} any more, the old loop
code,

  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
    reg_alloc_order[i] = i;

in mips_order_regs_for_local_alloc() seems inadequate.  Because this
will override the definition of REG_ALLOC_ORDER.

Anyway, when I tried to rewrite the function to keep the register
order, the testcase is passed, but I can't see the speedup on my port.

Thanks,
Mingjie

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [MIPS] Test case dspr2-MULT is failed
  2011-01-06 18:32   ` Chung-Lin Tang
  2011-01-13  5:50     ` Mingjie Xing
@ 2011-02-18  1:19     ` Fu, Chao-Ying
  2011-02-18  6:21       ` Mingjie Xing
  1 sibling, 1 reply; 7+ messages in thread
From: Fu, Chao-Ying @ 2011-02-18  1:19 UTC (permalink / raw)
  To: Chung-Lin Tang, Mingjie Xing, gcc, Richard Sandiford; +Cc: Fuhler, Rich

Chung-Lin Tang wrote:
> I analyzed this testcase regression a while earlier; the 
> direct cause of
> this is due to mips_order_regs_for_local_alloc(), which now serves as
> MIPS' ADJUST_REG_ALLOC_ORDER macro.
> 
> The mips_order_regs_for_local_alloc() function seems to be written for
> the old local-alloc.c, still left as the deprecated
> ORDER_REGS_FOR_LOCAL_ALLOC macro after the transition to IRA (actually
> not called at all during then), and relatively recently 
> 'revived' after
> a patch by Bernd that created the ADJUST_REG_ALLOC_ORDER 
> macro went in.
> 
> So you have a local-alloc.c heuristic working in IRA, which seemed to
> cause these regressions.
> 
> Removing mips_order_regs_for_local_alloc() will let this 
> testcase pass;
> of course the real fix should be to review the MIPS 
> reg-ordering logic,
> left for you MIPS people...

I think your analysis is correct.  We should just delete mips_order_regs_for_local_alloc()
in mips.c and delete ADJUST_REG_ALLOC_ORDER in mips.h.
Then, 3 accumulators can be used in dspr2-MULT.c and dspr2-MULTU.c now.  Thanks!
Ex:
/* Test MIPS32 DSP REV 2 MULT instruction.  Tune for a CPU that has
   pipelined mult.  */
/* { dg-do compile } */
/* { dg-options "-mgp32 -mdspr2 -O2 -ffixed-hi -ffixed-lo -mtune=74kc" } */

/* { dg-final { scan-assembler "\tmult\t" } } */
/* { dg-final { scan-assembler "ac1" } } */
/* { dg-final { scan-assembler "ac2" } } */
/* { dg-final { scan-assembler "ac3" } } */

typedef long long a64;

NOMIPS16 a64 test (a64 *a, int *b, int *c)
{
  a[0] = (a64) b[0] * c[0];
  a[1] = (a64) b[1] * c[1];
  a[2] = (a64) b[2] * c[2];
}

Ex:
/* Test MIPS32 DSP REV 2 MULTU instruction.  Tune for a CPU that has
   pipelined multu.  */
/* { dg-do compile } */
/* { dg-options "-mgp32 -mdspr2 -O2 -ffixed-hi -ffixed-lo -mtune=74kc" } */

/* { dg-final { scan-assembler "\tmultu\t" } } */
/* { dg-final { scan-assembler "ac1" } } */
/* { dg-final { scan-assembler "ac2" } } */
/* { dg-final { scan-assembler "ac3" } } */

typedef unsigned long long a64;

NOMIPS16 a64 test (a64 *a, unsigned int *b, unsigned int *c)
{
  a[0] = (a64) b[0] * c[0];
  a[1] = (a64) b[1] * c[1];
  a[2] = (a64) b[2] * c[2];
}

Regards,
Chao-ying

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [MIPS] Test case dspr2-MULT is failed
  2011-02-18  1:19     ` Fu, Chao-Ying
@ 2011-02-18  6:21       ` Mingjie Xing
  2011-02-18  6:43         ` Fu, Chao-Ying
  0 siblings, 1 reply; 7+ messages in thread
From: Mingjie Xing @ 2011-02-18  6:21 UTC (permalink / raw)
  To: Fu, Chao-Ying; +Cc: Chung-Lin Tang, gcc, Richard Sandiford, Fuhler, Rich

2011/2/18 Fu, Chao-Ying <fu@mips.com>:
> I think your analysis is correct.  We should just delete mips_order_regs_for_local_alloc()
> in mips.c and delete ADJUST_REG_ALLOC_ORDER in mips.h.
> Then, 3 accumulators can be used in dspr2-MULT.c and dspr2-MULTU.c now.  Thanks!

/* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order
   to be rearranged based on a particular function.  On the mips16, we
   want to allocate $24 (T_REG) before other registers for
   instructions for which it is possible.  */

#define ADJUST_REG_ALLOC_ORDER mips_order_regs_for_local_alloc ()

I'm just wondering if it Is appropriate to simply remove
ADJUST_REG_ALLOC_ORDER considering its comment.

Regards,
Mingjie

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [MIPS] Test case dspr2-MULT is failed
  2011-02-18  6:21       ` Mingjie Xing
@ 2011-02-18  6:43         ` Fu, Chao-Ying
  0 siblings, 0 replies; 7+ messages in thread
From: Fu, Chao-Ying @ 2011-02-18  6:43 UTC (permalink / raw)
  To: Mingjie Xing; +Cc: Chung-Lin Tang, gcc, Richard Sandiford, Fuhler, Rich

Mingjie Xing wrote:
> 2011/2/18 Fu, Chao-Ying <fu@mips.com>:
> > I think your analysis is correct.  We should just delete 
> mips_order_regs_for_local_alloc()
> > in mips.c and delete ADJUST_REG_ALLOC_ORDER in mips.h.
> > Then, 3 accumulators can be used in dspr2-MULT.c and 
> dspr2-MULTU.c now.  Thanks!
> 
> /* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order
>    to be rearranged based on a particular function.  On the mips16, we
>    want to allocate $24 (T_REG) before other registers for
>    instructions for which it is possible.  */
> 
> #define ADJUST_REG_ALLOC_ORDER mips_order_regs_for_local_alloc ()
> 
> I'm just wondering if it Is appropriate to simply remove
> ADJUST_REG_ALLOC_ORDER considering its comment.

  Ok.  Need to test if allocating $24 first is still better in MIPS16 under IRA.
If yes, we should update mips_order_regs_for_local_alloc() for MIPS16 only
(eg: exchange $24 and $1), as the default register order is as follows.
Ex:
/* We generally want to put call-clobbered registers ahead of
   call-saved ones.  (IRA expects this.)  */

#define REG_ALLOC_ORDER                                                 \
{ /* Accumulator registers.  When GPRs and accumulators have equal      \
     cost, we generally prefer to use accumulators.  For example,       \
     a division of multiplication result is better allocated to LO,     \
     so that we put the MFLO at the point of use instead of at the      \
     point of definition.  It's also needed if we're to take advantage  \
     of the extra accumulators available with -mdspr2.  In some cases,  \
     it can also help to reduce register pressure.  */                  \
  64, 65,176,177,178,179,180,181,                                       \
  /* Call-clobbered GPRs.  */                                           \
  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,            \
  24, 25, 31,                                                           \
...

Regards,
Chao-ying

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-02-18  1:37 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-29  8:14 [MIPS] Test case dspr2-MULT is failed Mingjie Xing
2010-12-31 13:38 ` Richard Sandiford
2011-01-06 18:32   ` Chung-Lin Tang
2011-01-13  5:50     ` Mingjie Xing
2011-02-18  1:19     ` Fu, Chao-Ying
2011-02-18  6:21       ` Mingjie Xing
2011-02-18  6:43         ` Fu, Chao-Ying

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).