public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Dorit Naishlos <DORIT@il.ibm.com>
To: Dale Johannesen <dalej@apple.com>, gcc@gcc.gnu.org
Subject: Re: rs6000.md/altivec.md problem in setting of vector registers
Date: Fri, 05 Mar 2004 00:06:00 -0000	[thread overview]
Message-ID: <OF45AA3A9B.10F22048-ONC2256E4D.005E56AD-C2256E4E.0000D3E5@il.ibm.com> (raw)
In-Reply-To: <790DA3BC-6D42-11D8-8C0E-000A95D7CD40@apple.com>


> Those loads and stores in the loop are happening because RA
> chose to assign int registers to the va pseudo.  This is obviously
> not a good idea and suggests some problem in computing the register
> costs.

Thanks, this indeed looks like the right direction, and it seems to support
the original suspicion that there is a problem with the "insvsi" pattern of
the insns that are used to initialize the va pseudo:

(insn:HI 12 11 13 0 (set (zero_extract:SI (subreg:SI (reg/v:V8HI 120 [ va
]) 0)
            (const_int 16 [0x10])
            (const_int 0 [0x0]))
        (reg/v:SI 118 [ n ])) 106 {insvsi} (insn_list 11 (insn_list 3
(nil)))
    (nil))

(insn:HI 13 12 14 0 (set (zero_extract:SI (subreg:SI (reg/v:V8HI 120 [ va
]) 0)
            (const_int 16 [0x10])
            (const_int 16 [0x10]))
        (reg/v:SI 118 [ n ])) 106 {insvsi} (insn_list 12 (insn_list 3
(nil)))
    (nil))

(insn:HI 14 13 15 0 (set (zero_extract:SI (subreg:SI (reg/v:V8HI 120 [ va
]) 4)
            (const_int 16 [0x10])
            (const_int 0 [0x0]))
        (reg/v:SI 118 [ n ])) 106 {insvsi} (insn_list 13 (insn_list 3
(nil)))
    (nil))

.....

When the RA calculates the cost of assigning an ALTIVEC_REG to temporary
120 in any of these insns, it checks the related cost using the relevant
machine mode of the operand in question. The relevant mode should be V8HI.
However, the "zero_extract" part of the "insvsi" pattern "hides" that, and
instead, the RA calculates the cost of assigning an ALTIVEC_REG to an
*SImode* operand. The default for such "impossible" assignments is 10,000.

Is there a way to express a vector pseudo initialization without the
zero_extract, or in a way that exposes the relevant mode (V8HI)? should we
change the insvsi pattern, or define a new pattern to be generated in this
case?

A little more details below.

thanks,
dorit

The reg allocator calculates the cost of assigning a GENERAL_REG to the
vector temporary 120 as follows: each of the 8 "insvsi" insns contributes 0
to the overall cost; the stvx insn in the loop however contributes a cost
of 20:

OVERALL_COST(GENERAL_REG) = 8 * (0 * freq_out_of_loop) + (20 *
freq_in_loop)
OVERALL_COST(GENERAL_REG) = 8 * (0 * 7) + (20 * 1000) = 20,000

The cost '20' is obtained by checking the cost of moving data of a V8HImode
operand from a GENERAL_REG into an ALTIVEC_REG. The overall cost of
assigning an ALTIVEC_REG should therefore be as follows:

OVERALL_COST(ALTIVEC_REG) = 8 * (20 * freq_out_of_loop) + (0 *
freq_in_loop)
OVERALL_COST(ALTIVEC_REG) = 8 * (20 * 7) + (0 * 1000) = 1,120

However, because the operand in the "insvsi" pattern has an SImode, the
cost that is used instead is of moving SImode data from a GENERAL_REG into
an ALTIVEC_REG which is set to the maximum cost (defaults to 10,000). The
outcome is:

OVERALL_COST(ALTIVEC_REG) = 8 * (10,000 * freq_out_of_loop) + (0 *
freq_in_loop)
OVERALL_COST(ALTIVEC_REG) = 8 * (10,000 * 7) + (0 * 1000) = 56,000



                                                                                                                                       
                      Dale Johannesen                                                                                                  
                      <dalej@apple.com>        To:       Dorit Naishlos/Haifa/IBM@IBMIL                                                
                                               cc:       gcc@gcc.gnu.org                                                               
                      03/03/2004 20:42         Subject:  Re: rs6000.md/altivec.md problem in setting of vector registers               
                                                                                                                                       





On Mar 3, 2004, at 8:56 AM, Dorit Naishlos wrote:

> Hi,
>
> I think there is a problem in the way we model the setting of subregs
> (insn
> "insvsi") in rs6000, or rather - a problem in the way reload phase
> handles
> these patterns when they are generated to express an initialization of
> a
> vector register. Consider the following example:
>
> typedef int __attribute__((mode(V8HI))) v8hi;
> #define N 1024
> void foo5 (short n){
>     short a[N];
>     v8hi *pa = (v8hi *)a;
>>>  v8hi va = {n,n,n,n,n,n,n,n};
>     int i;
>
>     for (i=0; i<N/8; i++){
>       pa[i] = va;
>     }
>     bar1 (pa[2]);
> }

Those loads and stores in the loop are happening because RA
chose to assign int registers to the va pseudo.  This is obviously
not a good idea and suggests some problem in computing the register
costs.

  Register 120 costs: BASE_REGS:20000 GENERAL_REGS:20000
ALTIVEC_REGS:560000
SPEC_OR_GEN_REGS:2048 NON_FLOAT_REGS:20448 MEM:4896

That (from the .lreg dump) doesn't look right, I suggest investigating
why the
Altivec number is so high.  Comparison with x86 may be helpful.

A couple of observations that won't help with your main problem:
>     short a[N];
>     v8hi *pa = (v8hi *)a;
This sort of thing generally violates strict aliasing rules, and has a
tendency
not to work as expected when scheduled.   Vectors are outside the
standard,
but I don't think there's any special handling for aliasing them, is
there Aldy?
I don't see why pa will necessarily by 16-byte aligned, either,
although it always
was when I tried.

The shift-i-and-add inside the loop isn't strength reduced, and should
be.
It is if you change pretty much anything, though, so it's hitting some
edge case.





  reply	other threads:[~2004-03-05  0:06 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-03-03 16:46 Dorit Naishlos
2004-03-03 17:52 ` David Edelsohn
2004-03-03 18:16   ` Dorit Naishlos
2004-03-03 18:44 ` Dale Johannesen
2004-03-05  0:06   ` Dorit Naishlos [this message]
2004-03-05  0:23     ` Dale Johannesen
2004-03-09 18:46       ` David Edelsohn
2004-03-11 22:38     ` David Edelsohn
2004-03-11 23:31       ` Richard Henderson
2004-03-12  3:14         ` David Edelsohn
2004-03-07 18:30   ` Aldy Hernandez
     [not found] <OF6E029669.0E5E3F05-ONC2256E5C.000482C4-C2256E5C.0004D314@il.ibm.com>
2004-03-19  8:45 ` David Edelsohn
2004-03-21  0:36   ` Dorit Naishlos
2004-03-23 22:10     ` David Edelsohn
2004-03-23 17:03       ` Dorit Naishlos
2004-03-19 20:59 ` Dale Johannesen
2004-03-21  1:47   ` Dorit Naishlos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=OF45AA3A9B.10F22048-ONC2256E4D.005E56AD-C2256E4E.0000D3E5@il.ibm.com \
    --to=dorit@il.ibm.com \
    --cc=dalej@apple.com \
    --cc=gcc@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).