public inbox for gcc@gcc.gnu.org
* issues with rtl vector operators
@ 2004-08-10  1:04 James E Wilson
  2004-08-18 19:17 ` Devang Patel
  0 siblings, 1 reply; 3+ messages in thread
From: James E Wilson @ 2004-08-10  1:04 UTC (permalink / raw)
  To: gcc

I've been working on some MIPS vector support, for the paired single
instructions.  I have noticed a number of issues with the RTL support
for vectors while writing this code.  I have not yet looked at the LNO
branch to see if any of these issues are dealt with there.

The rtx operators vec_merge and vec_select are underspecified.  The
result is that different ports interpret them in different ways.
Currently, this isn't an issue, as there is no optimizer support for
them.  However, once we start merging in auto vectorization code from
the LNO branch, this may be an issue that will need to be resolved.

Consider the vec_select operation.  The second operand is a parallel that
lists the subparts that are being selected.  A use of it might look like
this:
(vec_select:V4SF
  (match_operand:V4SF 1 "register_operand" "f")
  (parallel [(const_int 1) (const_int 2) (const_int 3) (const_int 0)]))
So given an input vector "ABCD" the expected result is "BCDA".

There are two issues to consider here: the order of the elements, and
the direction in which we are counting.  We have four elements 1,2,3,0
that need to be concatenated to form the vector.  How are we doing this
concatenation?  Are we concatenating from left to right?  Or maybe this
is endian dependent, in which case we may be concatenating from left to
right for big-endian but right to left for little-endian.  There is also
the issue of which subpart is element 1.  Are we counting from the
left?  Or is this endian dependent, in which case we are counting from
the left for big-endian, but counting from the right for little-endian?
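
To keep the two choices apart, here is a rough C sketch (not GCC code;
the function model_vec_select and its two flags are made up for
illustration).  It treats a vector as a string of one-character
subparts and lets the counting direction and the ordering direction be
chosen independently.

```c
#include <stddef.h>

/* Hypothetical model of vec_select, for illustration only.
   in:  n one-character subparts, written left to right
   sel: the n selector values from the parallel
   count_from_right: if nonzero, subpart 0 is the rightmost one
   order_from_right: if nonzero, the first selected subpart is placed
                     at the right end of the result
   out: buffer of at least n + 1 chars */
void model_vec_select(const char *in, const int *sel, size_t n,
                      int count_from_right, int order_from_right,
                      char *out)
{
    for (size_t k = 0; k < n; k++) {
        /* Which input subpart does selector k name? */
        size_t src = count_from_right ? n - 1 - (size_t)sel[k]
                                      : (size_t)sel[k];
        /* Where does the k-th selected subpart land in the result? */
        size_t dst = order_from_right ? n - 1 - k : k;
        out[dst] = in[src];
    }
    out[n] = '\0';
}
```

With input "ABCD" and the selector 1,2,3,0, this model gives "BCDA"
when both flags are zero; setting either flag changes the result, so
the two choices cannot be left implicit.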

Most backends with vector support are using the interpretation that the
elements are ordered from left to right and counted from left to right. 
The SH port however makes this endian dependent.  It appears that the SH
port orders and counts from left to right for big-endian, but orders and
counts right to left for little-endian.  Using this approach the above
RTL with a vector "ABCD" gives a result "BCDA" when big-endian and
"CBAD" when little-endian.

While the left to right ordering and counting seems very natural, it
only makes sense for vectors that are in registers.  If a target
supports vectors in memory, then an endian dependent approach may make
more sense, as there is no left or right in memory.  In this case, we
could be talking about the most significant byte (MSB) versus the least
significant byte (LSB), or we could be talking about the 0 offset
address versus the N-1 offset address.  These give different results.
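
A small C sketch of why the two numberings differ (the helper names
store_u32 and elem_from_msb are made up, and the vector is assumed to
be four byte-sized subparts packed in a 32-bit value).  Memory is
modelled as an explicit byte array, so the result does not depend on
the host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value into a 4-byte buffer in the given byte order. */
void store_u32(uint8_t mem[4], uint32_t value, int big_endian)
{
    for (size_t i = 0; i < 4; i++) {
        unsigned shift = (unsigned)(big_endian ? 8 * (3 - i) : 8 * i);
        mem[i] = (uint8_t)(value >> shift);
    }
}

/* Subpart idx (0..3) when counting starts at the most significant byte. */
uint8_t elem_from_msb(uint32_t value, size_t idx)
{
    return (uint8_t)(value >> (8 * (3 - idx)));
}
```

For the value 0x41424344 (the bytes 'A','B','C','D' from MSB to LSB),
counting from the MSB always gives 'A' as subpart 0.  Counting from
offset 0 in memory gives 'A' after a big-endian store but 'D' after a
little-endian store.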

The rs6000/Altivec port is yet again different.  Looking at an Altivec
manual, and the patterns in the md file, I can't find a definition of
vec_select that works.  I think the issue here is that the Altivec port
has defined vec_merge differently than the other ports.

vec_merge takes two input vectors, and a const_int bit mask that
indicates whether we are taking each subpart from the first vector or
the second.  A use of it might look like this:
(vec_merge:V4SF
  (match_operand:V4SF 1 "register_operand" "f")
  (match_operand:V4SF 2 "register_operand" "f")
  (const_int 12))  ;; 12 is 1100 in binary
Given input vectors "ABCD" and "WXYZ" the result is "ABYZ".  That is the
naive interpretation of what vec_merge does.
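
To make the naive reading concrete, here is a small C sketch (not GCC
code; model_vec_merge and its msb_is_leftmost flag are made up for
illustration).  It assumes that a set mask bit takes the subpart from
the first vector.  The flag chooses between reading the mask with its
most significant bit on the left, as in the "1100" above, and letting
bit 0 name the leftmost subpart.

```c
#include <stddef.h>

/* Hypothetical model of vec_merge, for illustration only.
   a, b: n one-character subparts each, written left to right
   mask: a set bit takes the subpart from a, a clear bit from b
         (an assumption of this sketch)
   msb_is_leftmost: if nonzero, bit n-1 controls the leftmost subpart;
                    otherwise bit 0 does
   out:  buffer of at least n + 1 chars */
void model_vec_merge(const char *a, const char *b, unsigned mask,
                     size_t n, int msb_is_leftmost, char *out)
{
    for (size_t i = 0; i < n; i++) {
        /* Mask bit that controls subpart i, counted from the left. */
        size_t bit = msb_is_leftmost ? n - 1 - i : i;
        out[i] = ((mask >> bit) & 1u) ? a[i] : b[i];
    }
    out[n] = '\0';
}
```

With mask 12 and inputs "ABCD" and "WXYZ", the first reading gives
"ABYZ" and the second gives "WXCD".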

Now look at the altivec_vmrghw pattern in the rs6000/altivec.md file.
The Altivec PEM says that given two inputs "ABCD" and "WXYZ" the result
is "AWBX".  The Altivec pattern is
(vec_merge:V4SI
  (vec_select:V4SI (match_operand...) [2 3 0 1])
  (match_operand...)
  (const_int 12))
The vec_select rewrites the "ABCD" as "CDAB".  So we are then doing a
vector merge of "CDAB" "WXYZ" 1100 which should be "CDYZ" which has
nothing in common with the result that the hardware delivers.  I don't
know what the Altivec port is doing here.  I think it is just broken.
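
The mismatch can be checked with a short C sketch.  This is only a
model under the left-to-right readings used above, not GCC code, and
all of the function names are made up.  pattern_as_written follows the
RTL literally, while merge_high does the interleave that the PEM
describes for the instruction.

```c
#include <stddef.h>

/* vec_select, counting and ordering left to right (assumed reading). */
void select_ltr(const char *in, const int *sel, size_t n, char *out)
{
    for (size_t k = 0; k < n; k++)
        out[k] = in[sel[k]];
    out[n] = '\0';
}

/* vec_merge, mask read MSB-first; a set bit takes from a (assumed). */
void merge_naive(const char *a, const char *b, unsigned mask,
                 size_t n, char *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = ((mask >> (n - 1 - i)) & 1u) ? a[i] : b[i];
    out[n] = '\0';
}

/* Interleave the first n/2 subparts of a and b: ABCD, WXYZ -> AWBX. */
void merge_high(const char *a, const char *b, size_t n, char *out)
{
    for (size_t i = 0; i < n / 2; i++) {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
    out[n] = '\0';
}

/* Literal reading of the pattern: select [2 3 0 1], then merge with 12.
   a and b must each hold four subparts; out needs five chars. */
void pattern_as_written(const char *a, const char *b, char *out)
{
    static const int sel[4] = {2, 3, 0, 1};
    char tmp[5];
    select_ltr(a, sel, 4, tmp);
    merge_naive(tmp, b, 12u, 4, out);
}
```

For "ABCD" and "WXYZ", pattern_as_written produces "CDYZ" and
merge_high produces "AWBX".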

vec_duplicate seems unambiguous.  It appears that all ports have defined
vec_concat the same way.  I have not seen problems with either of these
yet.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com



* Re: issues with rtl vector operators
  2004-08-10  1:04 issues with rtl vector operators James E Wilson
@ 2004-08-18 19:17 ` Devang Patel
  2004-08-18 20:36   ` James E Wilson
  0 siblings, 1 reply; 3+ messages in thread
From: Devang Patel @ 2004-08-18 19:17 UTC (permalink / raw)
  To: James E Wilson; +Cc: gcc


On Aug 9, 2004, at 5:56 PM, James E Wilson wrote:

> Now look at the altivec_vmrghw pattern in the rs6000/altivec.md file.
> The Altivec PEM says that given two inputs "ABCD" and "WXYZ" the result
> is "AWBX"  The altivec pattern is
> (vec_merge:V4SI
>   (vec_select:V4SI (match_operand...) [2 3 0 1])
>   (match_operand...)
>   (const_int 12))
> The vec_select rewrites the "ABCD" as "CDAB".  So we are then doing a
> vector merge of "CDAB" "WXYZ" 1100 which should be "CDYZ"

vector merge of "CDAB" "WXYZ" 1100 should be WXAB. But I still don't 
understand how it gets the right answer, that is "AWBX".

I tried the following:
	#include <altivec.h>

	vector int a1 = { 100, 200, 300, 400};
	vector int a2 = { 500, 600, 700, 800};
	vector int k;

	int main(void)
	{
	  /* Merge the high halves of a1 and a2 into k. */
	  k = vec_mergeh (a1, a2);
	  return 0;
	}

And it does the right thing: k is assigned {100, 500, 200, 600}.

Thoughts?
-
Devang

>  which has
> nothing in common with the result that the hardware delivers.  I don't
> know what the Altivec port is doing here.  I think it is just broken.


* Re: issues with rtl vector operators
  2004-08-18 19:17 ` Devang Patel
@ 2004-08-18 20:36   ` James E Wilson
  0 siblings, 0 replies; 3+ messages in thread
From: James E Wilson @ 2004-08-18 20:36 UTC (permalink / raw)
  To: Devang Patel; +Cc: gcc

On Wed, 2004-08-18 at 12:01, Devang Patel wrote:
> vector merge of "CDAB" "WXYZ" 1100 should be WXAB. But I still don't 
> understand how it gets the right answer, that is "AWBX".

This is because we currently have no RTL optimizer support for vector
operations.  You could write the pattern any way you like, and it will
still work, because the only thing that matters is that the expander
generates unique RTL that exactly matches the one recognizer pattern
that emits the vmrghw instruction.  This is also why UNSPEC works.

However, when we do start adding RTL optimizer support for vector
operations, then the RTL optimizer will start rewriting patterns based
on what the RTL means, and if the RTL doesn't mean what the instruction
it represents means, then you will be in trouble.

It would be better to fix the RTL in the altivec.md file now before we
get to that point.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


