RE: [patch] extend.texi MIPS PS/3D Support

public inbox for gcc-patches@gcc.gnu.org

* RE: [patch] extend.texi MIPS PS/3D Support
@ 2004-09-23 18:02 Fu, Chao-Ying
  2004-09-24 12:17 ` Dorit Naishlos
  0 siblings, 1 reply; 36+ messages in thread
From: Fu, Chao-Ying @ 2004-09-23 18:02 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Dorit Naishlos, Giovanni Bajo, gcc-patches, dpatel,
	Gerald Pfeifer, Stephens, Nigel, Thekkath, Radhika, Uhler, Mike,
	Jim Wilson

I tested one case and, unfortunately, got an ICE.

Regards,
Chao-ying
---
<540> # cat vect1.c
#define N 16

void fbar (float *);

/* multiple loops */
foo (int n)
{
  int i;
  float a[N];
  float b[N];

  /* Vectorizable.  */
  for (i = 0; i < N; i++){
    a[i] = b[i];
  }

  fbar (a);
}
<541> # mipsisa64-elf-gcc -O2 -mips64 -mpaired-single -ftree-vectorize -fdump-tree-vect-stats -S vect1.c
vect1.c: In function 'foo':
vect1.c:7: internal compiler error: in int_mode_for_mode, at stor-layout.c:251
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.


-----Original Message-----
From: gcc-patches-owner@gcc.gnu.org
[mailto:gcc-patches-owner@gcc.gnu.org]On Behalf Of Richard Sandiford
Sent: Thursday, September 23, 2004 9:10 AM
To: Fu, Chao-Ying
Cc: Dorit Naishlos; Giovanni Bajo; gcc-patches@gcc.gnu.org;
dpatel@apple.com; Gerald Pfeifer; Stephens, Nigel; Thekkath, Radhika;
Uhler, Mike; Jim Wilson
Subject: Re: [patch] extend.texi MIPS PS/3D Support


"Fu, Chao-Ying" <fu@mips.com> writes:
> Ok.  We need to put this into "mips.h".  Thanks!
> [...]
> Index: mips.h
> ===================================================================
> RCS file: /cvsroot/gcc/gcc/gcc/config/mips/mips.h,v
> retrieving revision 1.370
> diff -c -3 -p -r1.370 mips.h
> *** mips.h	18 Sep 2004 19:19:37 -0000	1.370
> --- mips.h	22 Sep 2004 21:07:30 -0000
> *************** extern const struct mips_cpu_info *mips_
> *** 243,248 ****
> --- 243,250 ----
>   				((target_flags & MASK_PAIRED_SINGLE) != 0)
>   #define TARGET_MIPS3D		((target_flags & MASK_MIPS3D) != 0)
>   
> + #define UNITS_PER_SIMD_WORD	(TARGET_PAIRED_SINGLE_FLOAT ? 8 : 0)
> + 
>   /* True if we should use NewABI-style relocation operators for
>      symbolic addresses.  This is never true for mips16 code,
>      which has its own conventions.  */

Have you tested this?

Richard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [patch] extend.texi MIPS PS/3D Support
  2004-09-23 18:02 [patch] extend.texi MIPS PS/3D Support Fu, Chao-Ying
@ 2004-09-24 12:17 ` Dorit Naishlos
  2004-09-24 21:35   ` Chao-ying Fu
  0 siblings, 1 reply; 36+ messages in thread
From: Dorit Naishlos @ 2004-09-24 12:17 UTC (permalink / raw)
  To: Fu, Chao-Ying
  Cc: dpatel, gcc-patches, Gerald Pfeifer, Giovanni Bajo, Stephens,
	Nigel, Thekkath, Radhika, Richard Sandiford, Uhler, Mike,
	Jim Wilson


Could you send me some more detail (gdb traceback, what is the mode that's
failing in int_mode_for_mode, the .vect dump file when compiling it with
-fdump-tree-vect-details)? I can look into it after the weekend (on
Sunday).

dorit






^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-24 12:17 ` Dorit Naishlos
@ 2004-09-24 21:35   ` Chao-ying Fu
  2004-09-25 10:53     ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Chao-ying Fu @ 2004-09-24 21:35 UTC (permalink / raw)
  To: Dorit Naishlos
  Cc: dpatel, gcc-patches, Gerald Pfeifer, Giovanni Bajo, Stephens,
	Nigel, Thekkath, Radhika, Richard Sandiford, Uhler, Mike,
	Jim Wilson

Hello,

After running "cvs update" and rebuilding today, I can compile the code.
But the output assembly is wrong: it contains the literal strings
"<ANYF:loadx>    $f0,$9($2)" and "<ANYF:storex>   $f0,$8($2)".
Thanks!

Regards,
Chao-ying

<67> # cat vect1.c
#define N 16

void fbar (float *);
float a[N];
float b[N];

/* multiple loops */
foo (int n)
{
  int i;

  /* Vectorizable.  */
  for (i = 0; i < N; i++){
    a[i] = b[i];
  }

  fbar (a);
}

<68> # mipsisa64-elf-gcc -O2 -mips64 -mpaired-single -ftree-vectorize -fdump-tree-vect-stats -S vect1.c

<69> # cat vect1.s
        .file   1 "vect1.c"
        .section .mdebug.eabi64
        .section .gcc_compiled_long64
        .previous
        .text
        .align  2
        .align  3
        .globl  foo
        .ent    foo
foo:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro

        lui     $4,%hi(a)
        lui     $2,%hi(b)
        daddiu  $9,$2,%lo(b)
        daddiu  $8,$4,%lo(a)
        move    $6,$0
        move    $5,$0
        li      $7,8                    # 0x8
        .align  3
$L2:
        dsll    $2,$5,32
        dsrl    $2,$2,32
        <ANYF:loadx>    $f0,$9($2)
        addiu   $3,$6,1
        addiu   $5,$5,8
        <ANYF:storex>   $f0,$8($2)
        bne     $7,$3,$L2
        move    $6,$3

        j       fbar
        daddiu  $4,$4,%lo(a)

        .set    macro
        .set    reorder
        .end    foo
        .size   foo, .-foo

        .comm   b,64,8

        .comm   a,64,8
        .ident  "GCC: (GNU) 4.0.0 20040924 (experimental)"

<70> # cat vect1.c.t48.vect

;; Function foo (foo)


loop at vect1.c:14: LOOP VECTORIZED.

vectorized 1 loops in function.
NOTE: no flow-sensitive alias info for vect_pb.2_16 in vect_var_.1_5 = (*vect_pb.2_16)[ivtmp.5_15];
NOTE: no flow-sensitive alias info for vect_pa.6_1 in (*vect_pa.6_1)[ivtmp.9_17] = vect_var_.1_5;

DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     15         60b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 5        260b
PHI arguments                            10        120b
---------------------------------------------------------
Total memory used by DFA/SSA data                  440b
---------------------------------------------------------

Average number of arguments per PHI node: 2.0 (max: 2)


Hash table statistics:
    def_blocks: size 31, 2 elements, 0.000000 collision/search ratio


DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     15         60b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 6        312b
PHI arguments                            11        132b
---------------------------------------------------------
Total memory used by DFA/SSA data                  504b
---------------------------------------------------------

Average number of arguments per PHI node: 1.8 (max: 2)


Hash table statistics:
    def_blocks: size 31, 1 elements, 0.000000 collision/search ratio

foo (n)
{
  int ivtmp.10;
  int ivtmp.9;
  float * vect_pa.8;
  int newinit.7;
  <unnamed type>[<unknown>] * vect_pa.6;
  int ivtmp.5;
  float * vect_pb.4;
  int newinit.3;
  <unnamed type>[<unknown>] * vect_pb.2;
  <unnamed type> vect_var_.1;
  float b[16];
  float a[16];
  int i;
  float D.1169;
  int i.0;

<bb 0>:

<L6>:;
  vect_pb.4_6 = &b[0];
  vect_pb.2_16 = (<unnamed type>[<unknown>] *) vect_pb.4_6;
  vect_pa.8_2 = &a[0];
  vect_pa.6_1 = (<unnamed type>[<unknown>] *) vect_pa.8_2;

  # ivtmp.10_19 = PHI <0(4), ivtmp.10_20(3)>;
  # ivtmp.9_17 = PHI <0(4), ivtmp.9_18(3)>;
  # ivtmp.5_15 = PHI <0(4), ivtmp.5_12(3)>;
  # i_14 = PHI <0(4), i_10(3)>;
<L0>:;
  vect_var_.1_5 = (*vect_pb.2_16)[ivtmp.5_15];
  ivtmp.5_12 = ivtmp.5_15 + 1;
  D.1169_8 = b[i_14];
  (*vect_pa.6_1)[ivtmp.9_17] = vect_var_.1_5;
  ivtmp.9_18 = ivtmp.9_17 + 1;
  i_10 = i_14 + 1;
  ivtmp.10_20 = ivtmp.10_19 + 1;
  if (ivtmp.10_20 < 8) goto <L5>; else goto <L2>;

<L5>:;
  goto <bb 1> (<L0>);

<L2>:;
  fbar (&a);
  return;

}


<71> # cat vect1.c.t49.vect

;; Function foo (foo)


loop at vect1.c:14: LOOP VECTORIZED.

vectorized 1 loops in function.
NOTE: no flow-sensitive alias info for vect_pb.2_2 in vect_var_.1_1 = *vect_pb.2_2;
NOTE: no flow-sensitive alias info for vect_pa.7_23 in *vect_pa.7_23 = vect_var_.1_1;

DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     17         68b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 5        260b
PHI arguments                            10        120b
---------------------------------------------------------
Total memory used by DFA/SSA data                  448b
---------------------------------------------------------

Average number of arguments per PHI node: 2.0 (max: 2)


Hash table statistics:
    def_blocks: size 31, 2 elements, 0.000000 collision/search ratio


DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     17         68b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 6        312b
PHI arguments                            11        132b
---------------------------------------------------------
Total memory used by DFA/SSA data                  512b
---------------------------------------------------------

Average number of arguments per PHI node: 1.8 (max: 2)


Hash table statistics:
    def_blocks: size 31, 1 elements, 0.000000 collision/search ratio

foo (n)
{
  int ivtmp.12;
  int update.11;
  int ivtmp.10;
  float * vect_pa.9;
  int newinit.8;
  <unnamed type> * vect_pa.7;
  int update.6;
  int ivtmp.5;
  float * vect_pb.4;
  int newinit.3;
  <unnamed type> * vect_pb.2;
  <unnamed type> vect_var_.1;
  int i;
  float D.1169;
  int i.0;

<bb 0>:

<L6>:;
  vect_pb.4_9 = &b[0];
  vect_pb.2_17 = (<unnamed type> *) vect_pb.4_9;
  vect_pa.9_18 = &a[0];
  vect_pa.7_19 = (<unnamed type> *) vect_pa.9_18;

  # ivtmp.12_24 = PHI <0(4), ivtmp.12_25(3)>;
  # ivtmp.10_20 = PHI <0(4), ivtmp.10_21(3)>;
  # ivtmp.5_16 = PHI <0(4), ivtmp.5_13(3)>;
  # i_15 = PHI <0(4), i_12(3)>;
<L0>:;
  update.6_8 = ivtmp.5_16 * 8;
  vect_pb.2_2 = update.6_8 + vect_pb.2_17;
  vect_var_.1_1 = *vect_pb.2_2;
  ivtmp.5_13 = ivtmp.5_16 + 1;
  D.1169_10 = b[i_15];
  update.11_22 = ivtmp.10_20 * 8;
  vect_pa.7_23 = vect_pa.7_19 + update.11_22;
  *vect_pa.7_23 = vect_var_.1_1;
  ivtmp.10_21 = ivtmp.10_20 + 1;
  i_12 = i_15 + 1;
  ivtmp.12_25 = ivtmp.12_24 + 1;
  if (ivtmp.12_25 < 8) goto <L5>; else goto <L2>;

<L5>:;
  goto <bb 1> (<L0>);

<L2>:;
  fbar (&a);
  return;

}




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-24 21:35   ` Chao-ying Fu
@ 2004-09-25 10:53     ` Richard Sandiford
  2004-09-27 23:32       ` Chao-ying Fu
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2004-09-25 10:53 UTC (permalink / raw)
  To: Chao-ying Fu
  Cc: Dorit Naishlos, dpatel, gcc-patches, Gerald Pfeifer,
	Giovanni Bajo, Stephens, Nigel, Thekkath, Radhika, Uhler, Mike,
	Jim Wilson

"Chao-ying Fu" <fu@mips.com> writes:
> After running "cvs update" and building again today, I can compile the code.
> But the output assembly is wrong with
> "<ANYF:loadx>    $f0,$9($2)" and "<ANYF:storex>   $f0,$8($2)" in it.

OK, I've checked in the patch below to fix that.  Tested by running
mips.exp for mipsisa64-elf before and after the patch (which is the
only current test of paired-single support).  Committed to head.
I didn't install a testcase since this would be covered by existing
tests once autovectorisation is enabled.

WRT defining UNITS_PER_SIMD_WORD: I'd like that patch to go in, but
it would need to be run against the vectorisation testsuite first
(--target_board mipsisa64-elf/-mpaired-single).

Richard


	* config/mips/mips.md (loadx, storex): Define for V2SF.

Index: config/mips/mips.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/mips/mips.md,v
retrieving revision 1.307
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.307 mips.md
*** config/mips/mips.md	20 Sep 2004 06:54:52 -0000	1.307
--- config/mips/mips.md	25 Sep 2004 06:32:35 -0000
*************** (define_mode_attr load [(SI "lw") (DI "l
*** 355,362 ****
  (define_mode_attr store [(SI "sw") (DI "sd")])
  
  ;; Similarly for MIPS IV indexed FPR loads and stores.
! (define_mode_attr loadx [(SF "lwxc1") (DF "ldxc1")])
! (define_mode_attr storex [(SF "swxc1") (DF "sdxc1")])
  
  ;; The unextended ranges of the MIPS16 addiu and daddiu instructions
  ;; are different.  Some forms of unextended addiu have an 8-bit immediate
--- 355,362 ----
  (define_mode_attr store [(SI "sw") (DI "sd")])
  
  ;; Similarly for MIPS IV indexed FPR loads and stores.
! (define_mode_attr loadx [(SF "lwxc1") (DF "ldxc1") (V2SF "ldxc1")])
! (define_mode_attr storex [(SF "swxc1") (DF "sdxc1") (V2SF "sdxc1")])
  
  ;; The unextended ranges of the MIPS16 addiu and daddiu instructions
  ;; are different.  Some forms of unextended addiu have an 8-bit immediate

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-25 10:53     ` Richard Sandiford
@ 2004-09-27 23:32       ` Chao-ying Fu
  2004-09-28  0:32         ` Dorit Naishlos
  0 siblings, 1 reply; 36+ messages in thread
From: Chao-ying Fu @ 2004-09-27 23:32 UTC (permalink / raw)
  To: Dorit Naishlos
  Cc: Richard Sandiford, Jim Wilson, gcc-patches, Stephens, Nigel,
	Thekkath, Radhika, Uhler, Mike

Hello,

I have one question.  The current MIPS64 architecture doesn't have
SIMD integer instructions, but the vectorizer still vectorized the
C example below (vect2.c) and generated incorrect code (daddu $2,$2,$3).
How do we disable vectorization for integer modes?  Thanks!

Regards,
Chao-ying

<122> # cat vect2.c
#define N 16

void ibar (int *);
int a[N];
int b[N];
int c[N];

/* multiple loops */
foo (int n)
{
  int i;

  /* Vectorizable.  */
  for (i = 0; i < N; i++){
    a[i] = b[i]+c[i];
  }

  ibar (a);
}

<123> # mipsisa64-elf-gcc -O2 -mips64 -mpaired-single -ftree-vectorize -fdump-tree-vect-stats -S vect2.c
<124> # cat vect2.s
 .file 1 "vect2.c"
 .section .mdebug.eabi64
 .section .gcc_compiled_long64
 .previous
 .text
 .align 2
 .align 3
 .globl foo
 .ent foo
foo:
 .frame $sp,0,$31  # vars= 0, regs= 0/0, args= 0, gp= 0
 .mask 0x00000000,0
 .fmask 0x00000000,0
 .set noreorder
 .set nomacro

 lui $10,%hi(a)
 lui $2,%hi(b)
 lui $3,%hi(c)
 daddiu $7,$2,%lo(b)
 daddiu $6,$3,%lo(c)
 daddiu $5,$10,%lo(a)
 move $8,$0
 li $9,16   # 0x10
 .align 3
$L2:
 ld $2,0($7)
 ld $3,0($6)
 addiu $4,$8,1
 daddiu $7,$7,8
 daddu $2,$2,$3 <--------------------
 sd $2,0($5)
 daddiu $6,$6,8
 daddiu $5,$5,8
 bne $9,$4,$L2
 move $8,$4

 j ibar
 daddiu $4,$10,%lo(a)

 .set macro
 .set reorder
 .end foo
 .size foo, .-foo

 .comm c,64,8

 .comm b,64,8

 .comm a,64,8
 .ident "GCC: (GNU) 4.0.0 20040927 (experimental)"

<125> # cat vect2.c.t50.vect

;; Function foo (foo)


loop at vect2.c:15: LOOP VECTORIZED.

vectorized 1 loops in function.
NOTE: no flow-sensitive alias info for vect_pb.2_10 in vect_var_.1_2 = *vect_pb.2_10;
NOTE: no flow-sensitive alias info for vect_pc.8_27 in vect_var_.7_28 = *vect_pc.8_27;
NOTE: no flow-sensitive alias info for vect_pa.14_35 in *vect_pa.14_35 = vect_var_.13_29;

DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     27        108b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 6        312b
PHI arguments                            12        144b
---------------------------------------------------------
Total memory used by DFA/SSA data                  564b
---------------------------------------------------------

Average number of arguments per PHI node: 2.0 (max: 2)


Hash table statistics:
    def_blocks: size 31, 3 elements, 0.333333 collision/search ratio


DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     27        108b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 7        364b
PHI arguments                            13        156b
---------------------------------------------------------
Total memory used by DFA/SSA data                  628b
---------------------------------------------------------

Average number of arguments per PHI node: 1.9 (max: 2)


Hash table statistics:
    def_blocks: size 61, 1 elements, 0.000000 collision/search ratio

foo (n)
{
  int ivtmp.19;
  int update.18;
  int ivtmp.17;
  int * vect_pa.16;
  int newinit.15;
  <unnamed type> * vect_pa.14;
  <unnamed type> vect_var_.13;
  int update.12;
  int ivtmp.11;
  int * vect_pc.10;
  int newinit.9;
  <unnamed type> * vect_pc.8;
  <unnamed type> vect_var_.7;
  int update.6;
  int ivtmp.5;
  int * vect_pb.4;
  int newinit.3;
  <unnamed type> * vect_pb.2;
  <unnamed type> vect_var_.1;
  int i;
  int D.1172;
  int D.1171;
  int D.1170;
  int i.0;

<bb 0>:

<L6>:;
  vect_pb.4_13 = &b[0];
  vect_pb.2_3 = (<unnamed type> *) vect_pb.4_13;
  vect_pc.10_1 = &c[0];
  vect_pc.8_23 = (<unnamed type> *) vect_pc.10_1;
  vect_pa.16_30 = &a[0];
  vect_pa.14_31 = (<unnamed type> *) vect_pa.16_30;

  # ivtmp.19_36 = PHI <0(4), ivtmp.19_37(3)>;
  # ivtmp.17_32 = PHI <0(4), ivtmp.17_33(3)>;
  # ivtmp.11_24 = PHI <0(4), ivtmp.11_25(3)>;
  # ivtmp.5_22 = PHI <0(4), ivtmp.5_21(3)>;
  # i_20 = PHI <0(4), i_17(3)>;
<L0>:;
  update.6_18 = ivtmp.5_22 * 8;
  vect_pb.2_10 = vect_pb.2_3 + update.6_18;
  vect_var_.1_2 = *vect_pb.2_10;
  ivtmp.5_21 = ivtmp.5_22 + 1;
  D.1170_12 = b[i_20];
  update.12_26 = ivtmp.11_24 * 8;
  vect_pc.8_27 = vect_pc.8_23 + update.12_26;
  vect_var_.7_28 = *vect_pc.8_27;
  ivtmp.11_25 = ivtmp.11_24 + 1;
  D.1171_14 = c[i_20];
  vect_var_.13_29 = vect_var_.1_2 + vect_var_.7_28;
  D.1172_15 = D.1170_12 + D.1171_14;
  update.18_34 = ivtmp.17_32 * 8;
  vect_pa.14_35 = vect_pa.14_31 + update.18_34;
  *vect_pa.14_35 = vect_var_.13_29;
  ivtmp.17_33 = ivtmp.17_32 + 1;
  i_17 = i_20 + 1;
  ivtmp.19_37 = ivtmp.19_36 + 1;
  if (ivtmp.19_37 < 16) goto <L5>; else goto <L2>;

<L5>:;
  goto <bb 1> (<L0>);

<L2>:;
  ibar (&a);
  return;

}



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-27 23:32       ` Chao-ying Fu
@ 2004-09-28  0:32         ` Dorit Naishlos
  2004-09-28  0:41           ` Richard Henderson
  0 siblings, 1 reply; 36+ messages in thread
From: Dorit Naishlos @ 2004-09-28  0:32 UTC (permalink / raw)
  To: Chao-ying Fu
  Cc: gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Richard Sandiford, Uhler, Mike, Jim Wilson


The vectorizer is supposed to fail to vectorize if the relevant machine
mode is not supported by the target.
If V*SImode is not supported, it should have failed to vectorize here:

          vectype = get_vectype_for_scalar_type (scalar_type);
          if (!vectype)
            {
              if (vect_debug_stats (loop) || vect_debug_details (loop))
                {
                  fprintf (dump_file, "not vectorized: unsupported data-type ");
                  print_generic_expr (dump_file, scalar_type, TDF_SLIM);
                }
              return false;
            }

Maybe the problem is here (in get_vectype_for_scalar_type):

  vectype = build_vector_type (scalar_type, nunits);
  if (TYPE_MODE (vectype) == BLKmode)
    return NULL_TREE;

Maybe the vectype that is built has a mode other than BLKmode, although
it's not supported by the target? (see
http://gcc.gnu.org/ml/gcc/2004-09/msg00043.html)

-fdump-tree-vect-details may help.

dorit




----- Original Message -----
From: "Chao-ying Fu" <fu@mips.com>
To: Dorit Naishlos/Haifa/IBM@IBMIL
Cc: "Richard Sandiford" <rsandifo@redhat.com>; "Jim Wilson" <wilson@specifixinc.com>;
<gcc-patches@gcc.gnu.org>; "Stephens, Nigel" <nigel@mercury.mips.com>;
"Thekkath, Radhika" <radhika@mercury.mips.com>; "Uhler, Mike" <uhler@mercury.mips.com>
Sent: 28/09/2004 01:08
Subject: Re: [patch] extend.texi MIPS PS/3D Support




Hello,

I have one question.  The current MIPS64 architecture doesn't have
SIMD integer instructions, but the vectorizer still vectorized the
following C example and generated incorrect code (daddu $2,$2,$3).  How
do we disable vectorization for integer modes?  Thanks!

Regards,
Chao-ying

<122> # cat vect2.c
#define N 16

void ibar (int *);
int a[N];
int b[N];
int c[N];

/* multiple loops */
foo (int n)
{
  int i;

  /* Vectorizable.  */
  for (i = 0; i < N; i++){
    a[i] = b[i]+c[i];
  }

  ibar (a);
}

<123> # mipsisa64-elf-gcc -O2 -mips64 -mpaired-single -ftree-vectorize -fdump-tree-vect-stats -S vect2.c
<124> # cat vect2.s
 .file 1 "vect2.c"
 .section .mdebug.eabi64
 .section .gcc_compiled_long64
 .previous
 .text
 .align 2
 .align 3
 .globl foo
 .ent foo
foo:
 .frame $sp,0,$31  # vars= 0, regs= 0/0, args= 0, gp= 0
 .mask 0x00000000,0
 .fmask 0x00000000,0
 .set noreorder
 .set nomacro

 lui $10,%hi(a)
 lui $2,%hi(b)
 lui $3,%hi(c)
 daddiu $7,$2,%lo(b)
 daddiu $6,$3,%lo(c)
 daddiu $5,$10,%lo(a)
 move $8,$0
 li $9,16   # 0x10
 .align 3
$L2:
 ld $2,0($7)
 ld $3,0($6)
 addiu $4,$8,1
 daddiu $7,$7,8
 daddu $2,$2,$3 <--------------------
 sd $2,0($5)
 daddiu $6,$6,8
 daddiu $5,$5,8
 bne $9,$4,$L2
 move $8,$4

 j ibar
 daddiu $4,$10,%lo(a)

 .set macro
 .set reorder
 .end foo
 .size foo, .-foo

 .comm c,64,8

 .comm b,64,8

 .comm a,64,8
 .ident "GCC: (GNU) 4.0.0 20040927 (experimental)"

<125> # cat vect2.c.t50.vect

;; Function foo (foo)


loop at vect2.c:15: LOOP VECTORIZED.

vectorized 1 loops in function.
NOTE: no flow-sensitive alias info for vect_pb.2_10 in vect_var_.1_2 =
*vect_pb.2_10;
NOTE: no flow-sensitive alias info for vect_pc.8_27 in vect_var_.7_28 =
*vect_pc.8_27;
NOTE: no flow-sensitive alias info for vect_pa.14_35 in *vect_pa.14_35 =
vect_var_.13_29;

DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     27        108b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 6        312b
PHI arguments                            12        144b
---------------------------------------------------------
Total memory used by DFA/SSA data                  564b
---------------------------------------------------------

Average number of arguments per PHI node: 2.0 (max: 2)


Hash table statistics:
    def_blocks: size 31, 3 elements, 0.333333 collision/search ratio


DFA Statistics for foo

---------------------------------------------------------
                                Number of        Memory
                                instances         used
---------------------------------------------------------
Referenced variables                     27        108b
Statements annotated                      0          0b
Variables annotated                       0          0b
USE operands                              0          0b
DEF operands                              0          0b
VUSE operands                             0          0b
V_MAY_DEF operands                        0          0b
V_MUST_DEF operands                       0          0b
PHI nodes                                 7        364b
PHI arguments                            13        156b
---------------------------------------------------------
Total memory used by DFA/SSA data                  628b
---------------------------------------------------------

Average number of arguments per PHI node: 1.9 (max: 2)


Hash table statistics:
    def_blocks: size 61, 1 elements, 0.000000 collision/search ratio

foo (n)
{
  int ivtmp.19;
  int update.18;
  int ivtmp.17;
  int * vect_pa.16;
  int newinit.15;
  <unnamed type> * vect_pa.14;
  <unnamed type> vect_var_.13;
  int update.12;
  int ivtmp.11;
  int * vect_pc.10;
  int newinit.9;
  <unnamed type> * vect_pc.8;
  <unnamed type> vect_var_.7;
  int update.6;
  int ivtmp.5;
  int * vect_pb.4;
  int newinit.3;
  <unnamed type> * vect_pb.2;
  <unnamed type> vect_var_.1;
  int i;
  int D.1172;
  int D.1171;
  int D.1170;
  int i.0;

<bb 0>:

<L6>:;
  vect_pb.4_13 = &b[0];
  vect_pb.2_3 = (<unnamed type> *) vect_pb.4_13;
  vect_pc.10_1 = &c[0];
  vect_pc.8_23 = (<unnamed type> *) vect_pc.10_1;
  vect_pa.16_30 = &a[0];
  vect_pa.14_31 = (<unnamed type> *) vect_pa.16_30;

  # ivtmp.19_36 = PHI <0(4), ivtmp.19_37(3)>;
  # ivtmp.17_32 = PHI <0(4), ivtmp.17_33(3)>;
  # ivtmp.11_24 = PHI <0(4), ivtmp.11_25(3)>;
  # ivtmp.5_22 = PHI <0(4), ivtmp.5_21(3)>;
  # i_20 = PHI <0(4), i_17(3)>;
<L0>:;
  update.6_18 = ivtmp.5_22 * 8;
  vect_pb.2_10 = vect_pb.2_3 + update.6_18;
  vect_var_.1_2 = *vect_pb.2_10;
  ivtmp.5_21 = ivtmp.5_22 + 1;
  D.1170_12 = b[i_20];
  update.12_26 = ivtmp.11_24 * 8;
  vect_pc.8_27 = vect_pc.8_23 + update.12_26;
  vect_var_.7_28 = *vect_pc.8_27;
  ivtmp.11_25 = ivtmp.11_24 + 1;
  D.1171_14 = c[i_20];
  vect_var_.13_29 = vect_var_.1_2 + vect_var_.7_28;
  D.1172_15 = D.1170_12 + D.1171_14;
  update.18_34 = ivtmp.17_32 * 8;
  vect_pa.14_35 = vect_pa.14_31 + update.18_34;
  *vect_pa.14_35 = vect_var_.13_29;
  ivtmp.17_33 = ivtmp.17_32 + 1;
  i_17 = i_20 + 1;
  ivtmp.19_37 = ivtmp.19_36 + 1;
  if (ivtmp.19_37 < 16) goto <L5>; else goto <L2>;

<L5>:;
  goto <bb 1> (<L0>);

<L2>:;
  ibar (&a);
  return;

}


----- Original Message -----
From: "Richard Sandiford" <rsandifo@redhat.com>
To: "Chao-ying Fu" <fu@mips.com>
Cc: "Dorit Naishlos" <DORIT@il.ibm.com>; <dpatel@apple.com>;
<gcc-patches@gcc.gnu.org>; "Gerald Pfeifer" <gerald@pfeifer.com>; "Giovanni
Bajo" <giovannibajo@libero.it>; "Stephens, Nigel" <nigel@mercury.mips.com>;
"Thekkath, Radhika" <radhika@mercury.mips.com>; "Uhler, Mike"
<uhler@mercury.mips.com>; "Jim Wilson" <wilson@specifixinc.com>
Sent: Friday, September 24, 2004 11:43 PM
Subject: Re: [patch] extend.texi MIPS PS/3D Support


> "Chao-ying Fu" <fu@mips.com> writes:
> > After running "cvs update" and building again today, I can compile the code.
> > But the output assembly is wrong with
> > "<ANYF:loadx>    $f0,$9($2)" and "<ANYF:storex>   $f0,$8($2)" in it.
>
> OK, I've checked in the patch below to fix that.  Tested by running
> mips.exp for mipsisa64-elf before and after the patch (which is the
> only current test of paired-single support).  Committed to head.
> I didn't install a testcase since this would be covered by existing
> tests once autovectorisation is enabled.
>
> WRT defining UNITS_PER_SIMD_WORD: I'd like that patch to go in, but
> it would need to be run against the vectorisation testsuite first
> (--target_board mipsisa64-elf/-mpaired-single).
>
> Richard
>
>
> * config/mips/mips.md (loadx, storex): Define for V2SF.
>
> Index: config/mips/mips.md
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/config/mips/mips.md,v
> retrieving revision 1.307
> diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.307 mips.md
> *** config/mips/mips.md 20 Sep 2004 06:54:52 -0000 1.307
> --- config/mips/mips.md 25 Sep 2004 06:32:35 -0000
> *************** (define_mode_attr load [(SI "lw") (DI "l
> *** 355,362 ****
>   (define_mode_attr store [(SI "sw") (DI "sd")])
>
>   ;; Similarly for MIPS IV indexed FPR loads and stores.
> ! (define_mode_attr loadx [(SF "lwxc1") (DF "ldxc1")])
> ! (define_mode_attr storex [(SF "swxc1") (DF "sdxc1")])
>
>   ;; The unextended ranges of the MIPS16 addiu and daddiu instructions
>   ;; are different.  Some forms of unextended addiu have an 8-bit immediate
> --- 355,362 ----
>   (define_mode_attr store [(SI "sw") (DI "sd")])
>
>   ;; Similarly for MIPS IV indexed FPR loads and stores.
> ! (define_mode_attr loadx [(SF "lwxc1") (DF "ldxc1") (V2SF "ldxc1")])
> ! (define_mode_attr storex [(SF "swxc1") (DF "sdxc1") (V2SF "sdxc1")])
>
>   ;; The unextended ranges of the MIPS16 addiu and daddiu instructions
>   ;; are different.  Some forms of unextended addiu have an 8-bit immediate
>




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-28  0:32         ` Dorit Naishlos
@ 2004-09-28  0:41           ` Richard Henderson
  2004-09-28 14:07             ` Dorit Naishlos
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Henderson @ 2004-09-28  0:41 UTC (permalink / raw)
  To: Dorit Naishlos
  Cc: Chao-ying Fu, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Richard Sandiford, Uhler, Mike, Jim Wilson

On Tue, Sep 28, 2004 at 01:30:00AM +0200, Dorit Naishlos wrote:
> Maybe the problem is here (in get_vectype_for_scalar_type):
> 
>   vectype = build_vector_type (scalar_type, nunits);
>   if (TYPE_MODE (vectype) == BLKmode)
>     return NULL_TREE;
> 
> Maybe the vectype that is built has a mode other than BLKmode, although
> it's not supported by the target?

Yes, e.g. V4QI may be represented with SImode.  But that's fine,
since the vectorizer may be able to optimize data movement loops
with no other vector support in the target.  (E.g. if target and
destination are alignable.)

The real problem is that you're pulling out SImode and deciding
that add_optab[SImode] is the vector addition operation.


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-28  0:41           ` Richard Henderson
@ 2004-09-28 14:07             ` Dorit Naishlos
  2004-09-28 14:08               ` Paolo Bonzini
                                 ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Dorit Naishlos @ 2004-09-28 14:07 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Chao-ying Fu, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Richard Sandiford, Uhler, Mike, Jim Wilson


> The real problem is that you're pulling out SImode and deciding
> that add_optab[SImode] is the vector addition operation.

but how come we get SImode here? we call build_vector_type with
innertype = SI, and
nunits = UNITS_PER_SIMD_WORD / nbytes = 8 / 4 = 2
so it should return a V2SI vectype, which is not supported by the target
and therefore should have a BLKmode. no?

dorit



                                                                                                                                  



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-28 14:07             ` Dorit Naishlos
@ 2004-09-28 14:08               ` Paolo Bonzini
  2004-09-29 11:41                 ` Paolo Bonzini
  2004-09-28 19:35               ` [patch] extend.texi MIPS PS/3D Support Chao-ying Fu
  2004-09-28 23:04               ` Richard Henderson
  2 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2004-09-28 14:08 UTC (permalink / raw)
  To: Dorit Naishlos
  Cc: Chao-ying Fu, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Richard Sandiford, Uhler, Mike, Jim Wilson

>>The real problem is that you're pulling out SImode and deciding
>>that add_optab[SImode] is the vector addition operation.
> 
> but how come we get SImode here? we call build_vector_type with
> innertype = SI, and
> nunits = UNITS_PER_SIMD_WORD / nbytes = 8 / 4 = 2
> so it should return a V2SI vectype, which is not supported by the target
> and therefore should have a BLKmode. no?

Yes, it should.  Generic vectors should not be created by the vectorizer 
(yet).

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-28 14:08               ` Paolo Bonzini
@ 2004-09-29 11:41                 ` Paolo Bonzini
  2004-09-29 17:33                   ` Richard Henderson
                                     ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Paolo Bonzini @ 2004-09-29 11:41 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Dorit Naishlos
  Cc: Chao-ying Fu, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Richard Sandiford, Uhler, Mike, Jim Wilson

[-- Attachment #1: Type: text/plain, Size: 545 bytes --]

> Yes, it should.  Generic vectors should not be created by the vectorizer 
> (yet).

Ah, I thought that the vectorizer created its vector types with a 
separate routine, but that has been fixed a while ago.  The attached 
patch should help.

It was bootstrapped and passes the vectorizer testsuite.  Full regtest 
on i686-pc-linux-gnu is in progress.  Ok for mainline?

Paolo

2004-09-29  Paolo Bonzini  <bonzini@gnu.org>

	* tree-vectorizer.c (vectorizable_operation): Only look at the
	optab in the case of vector modes; otherwise, bail out.

[-- Attachment #2: fix-vectorizer-generic-types.patch --]
[-- Type: text/plain, Size: 934 bytes --]

Index: tree-vectorizer.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vectorizer.c,v
retrieving revision 2.12
diff -u -r2.12 tree-vectorizer.c
--- tree-vectorizer.c	25 Sep 2004 14:48:03 -0000	2.12
+++ tree-vectorizer.c	29 Sep 2004 07:31:24 -0000
@@ -1408,6 +1408,17 @@
       return false;
     }
   vec_mode = TYPE_MODE (vectype);
+  if (GET_MODE_CLASS (vec_mode) != MODE_VECTOR_INT
+      && GET_MODE_CLASS (vec_mode) != MODE_VECTOR_FLOAT)
+    {
+      /* TODO: tree-complex.c sometimes can parallelize operations
+	 on generic vectors.  We can vectorize the loop in that case,
+	 but then we should re-run the lowering pass.  */
+      if (vect_debug_details (NULL))
+	fprintf (dump_file, "vector mode not supported by target.");
+      return false;
+    }
+
   if (optab->handlers[(int) vec_mode].insn_code == CODE_FOR_nothing)
     {
       if (vect_debug_details (NULL))

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-29 11:41                 ` Paolo Bonzini
@ 2004-09-29 17:33                   ` Richard Henderson
  2004-09-29 23:39                   ` Dorit Naishlos
  2004-10-11 22:48                   ` [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes Dorit Naishlos
  2 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2004-09-29 17:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Dorit Naishlos, Chao-ying Fu, gcc-patches, Stephens, Nigel,
	Thekkath, Radhika, Richard Sandiford, Uhler, Mike, Jim Wilson

On Wed, Sep 29, 2004 at 11:17:57AM +0200, Paolo Bonzini wrote:
> +  if (GET_MODE_CLASS (vec_mode) != MODE_VECTOR_INT
> +      && GET_MODE_CLASS (vec_mode) != MODE_VECTOR_FLOAT)

VECTOR_MODE_P.

Otherwise ok.
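For readers unfamiliar with the predicate, the sketch below is a toy model
of the suggested simplification.  It assumes that VECTOR_MODE_P tests for
exactly the two vector mode classes named in the patch; the enum, the
TOY_VECTOR_MODE_P macro and the function names are hypothetical stand-ins,
not the real machmode.h definitions.

```c
#include <assert.h>

/* Toy stand-ins for GCC's mode classes; not the real enum.  */
enum toy_mode_class
{
  TOY_MODE_INT,
  TOY_MODE_FLOAT,
  TOY_MODE_VECTOR_INT,
  TOY_MODE_VECTOR_FLOAT,
  TOY_MODE_CLASS_COUNT
};

/* Assumed shape of the predicate, applied to a class for simplicity.  */
#define TOY_VECTOR_MODE_P(CLASS) \
  ((CLASS) == TOY_MODE_VECTOR_INT || (CLASS) == TOY_MODE_VECTOR_FLOAT)

/* The test from the patch, written out longhand.  */
int rejected_longhand (enum toy_mode_class c)
{
  return c != TOY_MODE_VECTOR_INT && c != TOY_MODE_VECTOR_FLOAT;
}

/* The same test using the predicate.  */
int rejected_with_predicate (enum toy_mode_class c)
{
  return !TOY_VECTOR_MODE_P (c);
}

/* Self-check: both forms agree for every class in the toy enum.  */
void predicate_demo (void)
{
  int c;

  for (c = 0; c < TOY_MODE_CLASS_COUNT; c++)
    assert (rejected_longhand ((enum toy_mode_class) c)
            == rejected_with_predicate ((enum toy_mode_class) c));
}
```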


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-29 11:41                 ` Paolo Bonzini
  2004-09-29 17:33                   ` Richard Henderson
@ 2004-09-29 23:39                   ` Dorit Naishlos
  2004-10-11 22:48                   ` [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes Dorit Naishlos
  2 siblings, 0 replies; 36+ messages in thread
From: Dorit Naishlos @ 2004-09-29 23:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Paolo Bonzini, Chao-ying Fu, gcc-patches, Stephens, Nigel,
	Thekkath, Radhika, Richard Sandiford, Richard Henderson, Uhler,
	Mike, Jim Wilson


thanks, Paolo!

dorit







^ permalink raw reply	[flat|nested] 36+ messages in thread

* [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-09-29 11:41                 ` Paolo Bonzini
  2004-09-29 17:33                   ` Richard Henderson
  2004-09-29 23:39                   ` Dorit Naishlos
@ 2004-10-11 22:48                   ` Dorit Naishlos
  2004-10-12  0:40                     ` Richard Henderson
  2 siblings, 1 reply; 36+ messages in thread
From: Dorit Naishlos @ 2004-10-11 22:48 UTC (permalink / raw)
  To: gcc-patches; +Cc: mark, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 4407 bytes --]


> > so it should return a V2SI vectype, which is not supported by the target
> > and therefore should have a BLKmode. no?
>
> Nope.  An integer mode is returned if we have one that fits.  It
> makes the routines in tree-complex.c happy.

In this case we also need to fix the computation of the vectorization
factor - currently it is set to GET_MODE_NUNITS, which is correct only if
the mode is VECTOR_MODE_P. This came up on ppc-darwin compiling SPEC's
crafty - build_vector_type returned TImode for a 'long long' type, and the
vectorization factor was set to 1 instead of 2. Crafty failed as a result.
(Ideally build_vector_type would have returned BLKmode rather than TImode;
see machmode.def: "FIXME TI shouldn't be generically available either.  *".
I didn't fail vectorization in this case, although this type is not really
supported and the vectorization will be undone later on).
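As a rough illustration of the crafty failure, the toy model below
contrasts the two ways of computing the vectorization factor.  It assumes a
16-byte SIMD word (implied by the factor of 2 for an 8-byte 'long long')
and that a non-vector mode such as TImode reports a single unit.  The
struct, the constants and the function names are hypothetical and only
mimic GET_MODE_SIZE, GET_MODE_NUNITS and VECTOR_MODE_P.

```c
#include <assert.h>

/* Toy description of a machine mode; field names are illustrative only.  */
struct toy_mode
{
  int size;       /* bytes, in the spirit of GET_MODE_SIZE */
  int nunits;     /* in the spirit of GET_MODE_NUNITS; 1 if not a vector */
  int is_vector;  /* in the spirit of VECTOR_MODE_P */
};

/* A 16-byte integer mode and a 16-byte vector of four 4-byte ints.  */
const struct toy_mode TOY_TImode = { 16, 1, 0 };
const struct toy_mode TOY_V4SImode = { 16, 4, 1 };

/* Old computation: trust the unit count reported by the mode.  */
int vf_from_mode (struct toy_mode m)
{
  return m.nunits;
}

/* New computation: derive the factor from the sizes alone.  */
int vf_from_sizes (int simd_word_bytes, int scalar_bytes)
{
  assert (scalar_bytes > 0);
  return simd_word_bytes / scalar_bytes;
}

/* Self-check: the two agree for a real vector mode, not for TImode.  */
void vf_demo (void)
{
  assert (vf_from_mode (TOY_V4SImode) == vf_from_sizes (16, 4));
  assert (vf_from_mode (TOY_TImode) == 1);
  assert (vf_from_sizes (16, 8) == 2);
}
```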

Bootstrapped and tested on ppc-darwin. ok for mainline?

thanks,
dorit

        * tree-vectorizer.c (get_vectype_for_scalar_type): Added debug
        printouts.  Added checks under ENABLE_CHECKING to verify the
        vectorizer's assumptions on the vectype returned by
        build_vector_type.  Check if TImode.
        (vect_transform_loop): Added gcc_assert under ENABLE_CHECKING.
        (vect_analyze_operations): Don't use GET_MODE_NUNITS to determine
        the vectorization factor.  Make sure the vectorization factor > 1.
        Add gcc_assert under ENABLE_CHECKING.
        (vect_get_vec_def_for_operand): Remove redundant variables.

(See attached file: vectfix2)







[-- Attachment #2: vectfix2 --]
[-- Type: application/octet-stream, Size: 7044 bytes --]

Index: tree-vectorizer.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vectorizer.c,v
retrieving revision 2.14
diff -c -3 -p -r2.14 tree-vectorizer.c
*** tree-vectorizer.c	1 Oct 2004 09:59:00 -0000	2.14
--- tree-vectorizer.c	11 Oct 2004 17:49:35 -0000
*************** static tree
*** 836,841 ****
--- 836,842 ----
  get_vectype_for_scalar_type (tree scalar_type)
  {
    enum machine_mode inner_mode = TYPE_MODE (scalar_type);
+   enum machine_mode vec_mode;
    int nbytes = GET_MODE_SIZE (inner_mode);
    int nunits;
    tree vectype;
*************** get_vectype_for_scalar_type (tree scalar
*** 848,855 ****
    nunits = UNITS_PER_SIMD_WORD / nbytes;
  
    vectype = build_vector_type (scalar_type, nunits);
!   if (TYPE_MODE (vectype) == BLKmode)
      return NULL_TREE;
    return vectype;
  }
  
--- 849,916 ----
    nunits = UNITS_PER_SIMD_WORD / nbytes;
  
    vectype = build_vector_type (scalar_type, nunits);
!   if (vect_debug_details (NULL))
!     {
!       fprintf (dump_file, "get vectype with %d units of type ", nunits);
!       print_generic_expr (dump_file, scalar_type, TDF_SLIM);
!     }
! 
!   if (!vectype)
      return NULL_TREE;
+ 
+   if (vect_debug_details (NULL))
+     {
+       fprintf (dump_file, "vectype: ");
+       print_generic_expr (dump_file, vectype, TDF_SLIM);
+     }
+ 
+   vec_mode = TYPE_MODE (vectype);
+ 
+   /* Is type supported by target?:  */
+   if (vec_mode == BLKmode)
+     {
+       if (vect_debug_details (NULL))
+         fprintf (dump_file, "vectype not supported by target.");
+       return NULL_TREE;
+     }
+ 
+   if (vec_mode == TImode)
+     {
+       /* See comment in machmode.def:
+          "FIXME TI shouldn't be generically available either."
+          The vectorizer will report that it successfully vectorized,
+          but this will actually be lowered during RTL expand. */
+       if (vect_debug_details (NULL))
+         fprintf (dump_file, "TImode; not really supported by target.");
+       /* return NULL_TREE; */ /* CHECKME */
+     }
+ 
+ #ifdef ENABLE_CHECKING
+   if (GET_MODE_SIZE (vec_mode) != UNITS_PER_SIMD_WORD)
+     {
+       if (vect_debug_details (NULL))
+         fprintf (dump_file, "unexpected vectype size.");
+       return NULL_TREE;
+     }
+ 
+   if (VECTOR_MODE_P (vec_mode))
+     {
+       if (nbytes * GET_MODE_NUNITS (vec_mode) != UNITS_PER_SIMD_WORD)
+         {
+           if (vect_debug_details (NULL))
+             fprintf (dump_file, "unexpected vectype nunits: nunits = %d",
+                      GET_MODE_NUNITS (vec_mode));
+           return NULL_TREE;
+         }
+       else
+         if (vect_debug_details (NULL))
+           fprintf (dump_file, "supportable vector type.");
+     }
+   else
+     if (vect_debug_details (NULL))
+       fprintf (dump_file, "type not really a vector.");
+ #endif
+ 
    return vectype;
  }
  
*************** vect_get_vec_def_for_operand (tree op, t
*** 1157,1167 ****
        /* Create 'vect_cst_ = {cst,cst,...,cst}'  */
  
        tree vec_cst;
-       stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt);
-       tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
-       int nunits = GET_MODE_NUNITS (TYPE_MODE (vectype));
-       tree t = NULL_TREE;
-       int i;
  
        /* Build a tree with vector elements.  */
        if (vect_debug_details (NULL))
--- 1218,1223 ----
*************** vect_transform_loop (loop_vec_info loop_
*** 1907,1912 ****
--- 1963,1969 ----
  	  bool is_store;
  #ifdef ENABLE_CHECKING
  	  tree vectype;
+ 	  enum machine_mode vecmode;
  #endif
  
  	  if (vect_debug_details (NULL))
*************** vect_transform_loop (loop_vec_info loop_
*** 1925,1932 ****
  	  /* FORNOW: Verify that all stmts operate on the same number of
  	             units and no inner unrolling is necessary.  */
  	  vectype = STMT_VINFO_VECTYPE (stmt_info);
! 	  gcc_assert (GET_MODE_NUNITS (TYPE_MODE (vectype))
! 		      == vectorization_factor);
  #endif
  	  /* -------- vectorize statement ------------ */
  	  if (vect_debug_details (NULL))
--- 1982,1991 ----
  	  /* FORNOW: Verify that all stmts operate on the same number of
  	             units and no inner unrolling is necessary.  */
  	  vectype = STMT_VINFO_VECTYPE (stmt_info);
! 	  vecmode = TYPE_MODE (vectype);
! 	  gcc_assert (GET_MODE_SIZE (vecmode) == UNITS_PER_SIMD_WORD);
! 	  gcc_assert (GET_MODE_NUNITS (vecmode) == vectorization_factor
! 		      || !VECTOR_MODE_P (vecmode));
  #endif
  	  /* -------- vectorize statement ------------ */
  	  if (vect_debug_details (NULL))
*************** vect_analyze_operations (loop_vec_info l
*** 2044,2050 ****
    int vectorization_factor = 0;
    int i;
    bool ok;
-   tree scalar_type;
  
    if (vect_debug_details (NULL))
      fprintf (dump_file, "\n<<vect_analyze_operations>>\n");
--- 2103,2108 ----
*************** vect_analyze_operations (loop_vec_info l
*** 2059,2064 ****
--- 2117,2124 ----
  	  int nunits;
  	  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
  	  tree vectype;
+ 	  tree scalar_type;
+ 	  int scalar_size;
  
  	  if (vect_debug_details (NULL))
  	    {
*************** vect_analyze_operations (loop_vec_info l
*** 2138,2146 ****
  	      return false;
  	    }
  
! 	  nunits = GET_MODE_NUNITS (TYPE_MODE (vectype));
! 	  if (vect_debug_details (NULL))
! 	    fprintf (dump_file, "nunits = %d", nunits);
  
  	  if (vectorization_factor)
  	    {
--- 2198,2216 ----
  	      return false;
  	    }
  
!           /* Note that:
!              (*) GET_MODE_NUNITS (TYPE_MODE (vectype)) == vf
!              is not necessarily true! It only holds if the mode of vectype
! 	       is VECTOR_MODE_P.  E.g, V4QI may be represented with SI;
!              in this case, the LHS is 1, but the vectorization factor (vf)
!              should be 4.
! 
!              We don't require that mode be always VECTOR_TYPE_P because we 
! 	       can vectorize such datatypes in some cases (for example, if there 
! 	       are only data movements).  */
! 
! 	  scalar_size = GET_MODE_SIZE (TYPE_MODE (scalar_type));
! 	  nunits = UNITS_PER_SIMD_WORD / scalar_size;
  
  	  if (vectorization_factor)
  	    {
*************** vect_analyze_operations (loop_vec_info l
*** 2154,2165 ****
  		}
  	    }
  	  else
! 	    vectorization_factor = nunits;
  	}
      }
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
!   if (!vectorization_factor)
      {
        if (vect_debug_stats (loop) || vect_debug_details (loop))
          fprintf (dump_file, "not vectorized: unsupported data-type");
--- 2224,2240 ----
  		}
  	    }
  	  else
!             vectorization_factor = nunits;
! 
! #ifdef ENABLE_CHECKING
!           gcc_assert (scalar_size * vectorization_factor == UNITS_PER_SIMD_WORD);
! #endif
  	}
      }
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
! 
!   if (vectorization_factor <= 1)
      {
        if (vect_debug_stats (loop) || vect_debug_details (loop))
          fprintf (dump_file, "not vectorized: unsupported data-type");

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-10-11 22:48                   ` [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes Dorit Naishlos
@ 2004-10-12  0:40                     ` Richard Henderson
  2004-10-12 12:54                       ` Dorit Naishlos
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Henderson @ 2004-10-12  0:40 UTC (permalink / raw)
  To: Dorit Naishlos; +Cc: gcc-patches, mark

On Tue, Oct 12, 2004 at 12:01:20AM +0200, Dorit Naishlos wrote:
> (Ideally build_vector_type would have returned BLKmode rather than TImode;
> see machmode.def: "FIXME TI shouldn't be generically available either.  *".

Not ok.

You should be checking VECTOR_MODE_P, not some specific integer mode.

Didn't this get fixed already, anyway?


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-10-12  0:40                     ` Richard Henderson
@ 2004-10-12 12:54                       ` Dorit Naishlos
  2004-10-12 16:37                         ` Richard Henderson
  2004-10-13 19:18                         ` Richard Henderson
  0 siblings, 2 replies; 36+ messages in thread
From: Dorit Naishlos @ 2004-10-12 12:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches, mark

[-- Attachment #1: Type: text/plain, Size: 2388 bytes --]


> Didn't this get fixed already, anyway?

No - the VECTOR_MODE_P check was added only where we check whether an
operation is supportable. It was not added for loads and stores, in order
(I thought) to allow vectorization of data movements. Moving the
VECTOR_MODE_P check from vectorizable_operation to
get_vectype_for_scalar_type would solve the problem. It would be more
restrictive (it would not allow vectorization of data movement in the case
of V4QI represented as SI), but on second thought, the current version is
not restrictive enough: it allows initialization with constants/invariants,
which would not be handled correctly (unless it's an initialization with 0
or -1). I'll use the VECTOR_MODE_P check in all cases (at least for now).
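
A small C sketch of one plausible reading of the "0 or -1" remark. It is an
illustration only, not the vectorizer's code path: it assumes a V4QI held in
a 32-bit integer, and assumes the invariant is written as one sign-extended
whole-word value instead of being replicated into each element. The helper
names broadcast_u8, as_whole_word, constant_is_safe and count_safe_constants
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* What a correct vector initialization needs: the 8-bit value
   replicated into all four byte lanes of the 32-bit word.  */
static uint32_t broadcast_u8(uint8_t c)
{
  return (uint32_t)c * 0x01010101u;
}

/* What the assumed mishandling produces: the scalar sign-extended
   and treated as the value of the whole word.  */
static uint32_t as_whole_word(int8_t c)
{
  return (uint32_t)(int32_t)c;
}

/* A constant is "safe" in this model when both bit patterns agree.  */
static bool constant_is_safe(int8_t c)
{
  return broadcast_u8((uint8_t)c) == as_whole_word(c);
}

/* Count the safe constants over the full signed 8-bit range.  */
static int count_safe_constants(void)
{
  int count = 0;
  for (int v = -128; v <= 127; v++)
    if (constant_is_safe((int8_t)v))
      count++;
  return count;
}
```

Under these assumptions only 0 (all bits clear) and -1 (all bits set) give
the same word either way; a constant such as 2 gives 0x02020202 when
replicated but 0x00000002 when written as one word.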

Is the following ok?

thanks,
dorit

        * tree-vectorizer.c (get_vectype_for_scalar_type): Added debug
        printouts.  Added check that vectype is VECTOR_MODE_P.
        (vect_analyze_operations): Make sure the vectorization factor > 1.
        Add gcc_assert under ENABLE_CHECKING.
        (vectorizable_operation): Remove check for VECTOR_MODE_P (moved to
        get_vectype_for_scalar_type).
        (vect_get_vec_def_for_operand): Remove redundant variables.
        (vect_transform_loop): Likewise.

(See attached file: vectfix3)


                                                                                                                               

[-- Attachment #2: vectfix3 --]
[-- Type: application/octet-stream, Size: 5104 bytes --]

Index: tree-vectorizer.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vectorizer.c,v
retrieving revision 2.14
diff -c -3 -p -r2.14 tree-vectorizer.c
*** tree-vectorizer.c	1 Oct 2004 09:59:00 -0000	2.14
--- tree-vectorizer.c	12 Oct 2004 08:56:26 -0000
*************** static tree
*** 836,841 ****
--- 836,842 ----
  get_vectype_for_scalar_type (tree scalar_type)
  {
    enum machine_mode inner_mode = TYPE_MODE (scalar_type);
+   enum machine_mode vec_mode;
    int nbytes = GET_MODE_SIZE (inner_mode);
    int nunits;
    tree vectype;
*************** get_vectype_for_scalar_type (tree scalar
*** 848,855 ****
    nunits = UNITS_PER_SIMD_WORD / nbytes;
  
    vectype = build_vector_type (scalar_type, nunits);
!   if (TYPE_MODE (vectype) == BLKmode)
      return NULL_TREE;
    return vectype;
  }
  
--- 849,889 ----
    nunits = UNITS_PER_SIMD_WORD / nbytes;
  
    vectype = build_vector_type (scalar_type, nunits);
!   if (vect_debug_details (NULL))
!     {
!       fprintf (dump_file, "get vectype with %d units of type ", nunits);
!       print_generic_expr (dump_file, scalar_type, TDF_SLIM);
!     }
! 
!   if (!vectype)
      return NULL_TREE;
+ 
+   if (vect_debug_details (NULL))
+     {
+       fprintf (dump_file, "vectype: ");
+       print_generic_expr (dump_file, vectype, TDF_SLIM);
+     }
+ 
+   vec_mode = TYPE_MODE (vectype);
+ 
+   /* Is type supported by target?:  */
+   if (vec_mode == BLKmode)
+     {
+       if (vect_debug_details (NULL))
+         fprintf (dump_file, "vectype not supported by target.");
+       return NULL_TREE;
+     }
+ 
+   if (!VECTOR_MODE_P (vec_mode))
+     {
+       /* TODO: tree-complex.c sometimes can parallelize operations
+          on generic vectors.  We can vectorize the loop in that case,
+          but then we should re-run the lowering pass.  */
+       if (vect_debug_details (NULL))
+         fprintf (dump_file, "mode not supported by target.");
+       return NULL_TREE;
+     }
+ 
    return vectype;
  }
  
*************** vect_get_vec_def_for_operand (tree op, t
*** 1157,1167 ****
        /* Create 'vect_cst_ = {cst,cst,...,cst}'  */
  
        tree vec_cst;
-       stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt);
-       tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
-       int nunits = GET_MODE_NUNITS (TYPE_MODE (vectype));
-       tree t = NULL_TREE;
-       int i;
  
        /* Build a tree with vector elements.  */
        if (vect_debug_details (NULL))
--- 1191,1196 ----
*************** vectorizable_operation (tree stmt, block
*** 1408,1423 ****
        return false;
      }
    vec_mode = TYPE_MODE (vectype);
-   if (!VECTOR_MODE_P (vec_mode))
-     {
-       /* TODO: tree-complex.c sometimes can parallelize operations
- 	 on generic vectors.  We can vectorize the loop in that case,
- 	 but then we should re-run the lowering pass.  */
-       if (vect_debug_details (NULL))
- 	fprintf (dump_file, "mode not supported by target.");
-       return false;
-     }
- 
    if (optab->handlers[(int) vec_mode].insn_code == CODE_FOR_nothing)
      {
        if (vect_debug_details (NULL))
--- 1437,1442 ----
*************** vect_transform_loop (loop_vec_info loop_
*** 1905,1913 ****
  	  tree stmt = bsi_stmt (si);
  	  stmt_vec_info stmt_info;
  	  bool is_store;
- #ifdef ENABLE_CHECKING
- 	  tree vectype;
- #endif
  
  	  if (vect_debug_details (NULL))
  	    {
--- 1924,1929 ----
*************** vect_transform_loop (loop_vec_info loop_
*** 1924,1931 ****
  #ifdef ENABLE_CHECKING
  	  /* FORNOW: Verify that all stmts operate on the same number of
  	             units and no inner unrolling is necessary.  */
! 	  vectype = STMT_VINFO_VECTYPE (stmt_info);
! 	  gcc_assert (GET_MODE_NUNITS (TYPE_MODE (vectype))
  		      == vectorization_factor);
  #endif
  	  /* -------- vectorize statement ------------ */
--- 1940,1946 ----
  #ifdef ENABLE_CHECKING
  	  /* FORNOW: Verify that all stmts operate on the same number of
  	             units and no inner unrolling is necessary.  */
! 	  gcc_assert (GET_MODE_NUNITS (TYPE_MODE (STMT_VINFO_VECTYPE (stmt_info)))
  		      == vectorization_factor);
  #endif
  	  /* -------- vectorize statement ------------ */
*************** vect_analyze_operations (loop_vec_info l
*** 2155,2165 ****
  	    }
  	  else
  	    vectorization_factor = nunits;
  	}
      }
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
!   if (!vectorization_factor)
      {
        if (vect_debug_stats (loop) || vect_debug_details (loop))
          fprintf (dump_file, "not vectorized: unsupported data-type");
--- 2170,2186 ----
  	    }
  	  else
  	    vectorization_factor = nunits;
+ 
+ #ifdef ENABLE_CHECKING
+ 	  gcc_assert (GET_MODE_SIZE (TYPE_MODE (scalar_type))
+ 			* vectorization_factor == UNITS_PER_SIMD_WORD);
+ #endif
  	}
      }
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
! 
!   if (vectorization_factor <= 1)
      {
        if (vect_debug_stats (loop) || vect_debug_details (loop))
          fprintf (dump_file, "not vectorized: unsupported data-type");

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-10-12 12:54                       ` Dorit Naishlos
@ 2004-10-12 16:37                         ` Richard Henderson
  2004-10-12 18:21                           ` Dorit Naishlos
  2004-10-13 19:18                         ` Richard Henderson
  1 sibling, 1 reply; 36+ messages in thread
From: Richard Henderson @ 2004-10-12 16:37 UTC (permalink / raw)
  To: Dorit Naishlos; +Cc: gcc-patches, mark

On Tue, Oct 12, 2004 at 02:45:50PM +0200, Dorit Naishlos wrote:
> ... but on second thought, the current version is not restrictive
> enough - it allows initialization with constant/invariants which
> would not be handled correctly (unless it's an initialization with
> 0 or -1).

Test case?


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-10-12 16:37                         ` Richard Henderson
@ 2004-10-12 18:21                           ` Dorit Naishlos
  2004-10-12 20:49                             ` James A. Morrison
  0 siblings, 1 reply; 36+ messages in thread
From: Dorit Naishlos @ 2004-10-12 18:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches, mark


> Test case?

Sure:

Tests vect-[82,83].c below currently ICE.
If compiled in addition with -mpowerpc64 (testcases vect-[82,83]_64.c),
then they don't ICE: vect-82_64.c passes ok, and vect-83_64.c produces
wrong results.
With the patch (which disables vectorization in this case), the tests will
pass, but we lose the opportunity to vectorize vect-82_64.c (which may not
be so terrible, because we may want to convert this loop to a call to memset
instead).

dorit

=====================================
vect-82.c:
/* { dg-do run { target powerpc*-*-* } } */
/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -maltivec" {
target powerpc*-*-* } } */

#include <stdarg.h>
#include "tree-vect.h"

#define N 16

int main1 ()
{
  long long unsigned int ca[N];
  int i;

  for (i = 0; i < N; i++)
    ca[i] = 0;

  /* check results:  */
  for (i = 0; i < N; i++)
    {
      if (ca[i] != 0)
        abort ();
    }

  return 0;
}

int main (void)
{
  check_vect ();
  return main1 ();
}
=====================================
vect-83.c:
/* { dg-do run { target powerpc*-*-* } } */
/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -maltivec" {
target powerpc*-*-* } } */

#include <stdarg.h>
#include "tree-vect.h"

#define N 16

int main1 ()
{
  long long unsigned int ca[N];
  int i;

  for (i = 0; i < N; i++)
    ca[i] = 2;

  /* check results:  */
  for (i = 0; i < N; i++)
    {
      if (ca[i] != 2)
        abort ();
    }

  return 0;
}

int main (void)
{
  check_vect ();
  return main1 ();
}
=====================================
vect-82_64.c:
/* { dg-do run { target powerpc*-*-* } } */
/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats
-maltivec" { target powerpc*-*-* } } */

#include <stdarg.h>
#include "tree-vect.h"

#define N 16

int main1 ()
{
  long long unsigned int ca[N];
  int i;

  for (i = 0; i < N; i++)
    ca[i] = 0;

  /* check results:  */
  for (i = 0; i < N; i++)
    {
      if (ca[i] != 0)
        abort ();
    }

  return 0;
}

int main (void)
{
  check_vect ();
  return main1 ();
}
=====================================
vect-83_64.c:
/* { dg-do run { target powerpc*-*-* } } */
/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats
-maltivec" { target powerpc*-*-* } } */

#include <stdarg.h>
#include "tree-vect.h"

#define N 16

int main1 ()
{
  long long unsigned int ca[N];
  int i;

  for (i = 0; i < N; i++)
    ca[i] = 2;

  /* check results:  */
  for (i = 0; i < N; i++)
    {
      if (ca[i] != 2)
        abort ();
    }

  return 0;
}

int main (void)
{
  check_vect ();
  return main1 ();
}
=====================================






^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-10-12 18:21                           ` Dorit Naishlos
@ 2004-10-12 20:49                             ` James A. Morrison
  2004-10-13  8:46                               ` Dorit Naishlos
  0 siblings, 1 reply; 36+ messages in thread
From: James A. Morrison @ 2004-10-12 20:49 UTC (permalink / raw)
  To: Dorit Naishlos; +Cc: Richard Henderson, gcc-patches, mark


Dorit Naishlos <DORIT@il.ibm.com> writes:

> > Test case?
> 
> Sure:
> 
> Tests vect-[82,83].c below currently ICE.
> If compiled in addition with -mpowerpc64 (testcases vect-[82,83]_64.c),
> then they don't ICE: vect-82_64.c passes ok, and vect-83_64.c produces
> wrong results.
> With the patch (that disables vectorization in this case), the tests will
> pass, but we lose the opportunity to vectorize vect-82_64.c (which may not
> be so terrible cause we may want to convert this loop to a call to memset
> instead).
> 
> dorit
> 
> =====================================
> vect-82.c:
> /* { dg-do run { target powerpc*-*-* } } */
> /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -maltivec" {
> target powerpc*-*-* } } */

 This doesn't look right.  Why do you need -mpowerpc64?  Or even
-fdump-tree-vect-stats if this is a runtime test?
 


> #include <stdarg.h>
> #include "tree-vect.h"
> 
> #define N 16
> 
> int main1 ()
> {
>   long long unsigned int ca[N];
>   int i;
> 
>   for (i = 0; i < N; i++)
>     ca[i] = 0;
> 
>   /* check results:  */
>   for (i = 0; i < N; i++)
>     {
>       if (ca[i] != 0)
>         abort ();
>     }
> 
>   return 0;
> }
> 
> int main (void)
> {
>   check_vect ();
>   return main1 ();
> }
> =====================================
> vect-83.c:
> /* { dg-do run { target powerpc*-*-* } } */
> /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -maltivec" {
> target powerpc*-*-* } } */
> 
> #include <stdarg.h>
> #include "tree-vect.h"
> 
> #define N 16
> 
> int main1 ()
> {
>   long long unsigned int ca[N];
>   int i;
> 
>   for (i = 0; i < N; i++)
>     ca[i] = 2;
> 
>   /* check results:  */
>   for (i = 0; i < N; i++)
>     {
>       if (ca[i] != 2)
>         abort ();
>     }
> 
>   return 0;
> }
> 
> int main (void)
> {
>   check_vect ();
>   return main1 ();
> }
> =====================================
> vect-82_64.c:
> /* { dg-do run { target powerpc*-*-* } } */
> /* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats
> -maltivec" { target powerpc*-*-* } } */
> 
> #include <stdarg.h>
> #include "tree-vect.h"
> 
> #define N 16
> 
> int main1 ()
> {
>   long long unsigned int ca[N];
>   int i;
> 
>   for (i = 0; i < N; i++)
>     ca[i] = 0;
> 
>   /* check results:  */
>   for (i = 0; i < N; i++)
>     {
>       if (ca[i] != 0)
>         abort ();
>     }
> 
>   return 0;
> }
> 
> int main (void)
> {
>   check_vect ();
>   return main1 ();
> }
> =====================================
> vect-83_64.c:
> /* { dg-do run { target powerpc*-*-* } } */
> /* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats
> -maltivec" { target powerpc*-*-* } } */

> #include <stdarg.h>
> #include "tree-vect.h"
> 
> #define N 16
> 
> int main1 ()
> {
>   long long unsigned int ca[N];
>   int i;
> 
>   for (i = 0; i < N; i++)
>     ca[i] = 2;
> 
>   /* check results:  */
>   for (i = 0; i < N; i++)
>     {
>       if (ca[i] != 2)
>         abort ();
>     }
> 
>   return 0;
> }
> 
> int main (void)
> {
>   check_vect ();
>   return main1 ();
> }


-- 
Thanks,
Jim

http://www.student.cs.uwaterloo.ca/~ja2morri/
http://phython.blogspot.com
http://open.nit.ca/wiki/?page=jim

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-10-12 20:49                             ` James A. Morrison
@ 2004-10-13  8:46                               ` Dorit Naishlos
  0 siblings, 0 replies; 36+ messages in thread
From: Dorit Naishlos @ 2004-10-13  8:46 UTC (permalink / raw)
  To: James A. Morrison; +Cc: gcc-patches, mark, Richard Henderson


>  This doesn't look right.  Why do you need -mpowerpc64?

Because I get different failures depending on whether -mpowerpc64 is used.
That's why two of the tests use -mpowerpc64 and two don't.

> Or even
> -fdump-tree-vect-stats if this is a runtime test?

Because I also want to verify that nothing got vectorized, so each of these
testcases will also end with:

/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */

dorit



                                                                                                                           



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] vectorizer: fix handling of non VECTOR_MODE_P vectypes
  2004-10-12 12:54                       ` Dorit Naishlos
  2004-10-12 16:37                         ` Richard Henderson
@ 2004-10-13 19:18                         ` Richard Henderson
  1 sibling, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2004-10-13 19:18 UTC (permalink / raw)
  To: Dorit Naishlos; +Cc: gcc-patches, mark

Ok, except,

> +   if (vec_mode == BLKmode)

This is redundant with 

> +   if (!VECTOR_MODE_P (vec_mode))

This.


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-28 14:07             ` Dorit Naishlos
  2004-09-28 14:08               ` Paolo Bonzini
@ 2004-09-28 19:35               ` Chao-ying Fu
  2004-09-28 23:04               ` Richard Henderson
  2 siblings, 0 replies; 36+ messages in thread
From: Chao-ying Fu @ 2004-09-28 19:35 UTC (permalink / raw)
  To: Dorit Naishlos
  Cc: Richard Henderson, Jim Wilson, gcc-patches, Stephens, Nigel,
	Thekkath, Radhika, Richard Sandiford, Uhler, Mike

Hello,

build_vector_type() -> make_vector_type() -> layout_type() returns DImode
for 2 units of SImode (the inner mode).

Then vect_analyze_operations() -> vectorizable_operation() checks whether
optab->handlers[DImode].insn_code is available.

But we should be checking optab->handlers[V2SImode].insn_code.
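
The chain above can be sketched with a toy model in C. This is not GCC
code: the enum, the struct and every toy_* name are hypothetical stand-ins,
and the rule "use a real vector mode if the target has one, else an integer
mode of the same size, else BLKmode" is a simplification of what the thread
describes for layout_type().

```c
#include <stdbool.h>

/* Toy stand-ins for a few machine modes; not GCC's real enum.  */
enum toy_mode { TOY_BLK, TOY_SI, TOY_DI, TOY_V2SI };

/* Hypothetical target description.  */
struct toy_target {
  int simd_word_bytes;  /* plays the role of UNITS_PER_SIMD_WORD */
  bool has_v2si;        /* target provides a V2SI vector mode */
  bool has_di;          /* target provides a 64-bit integer mode */
};

/* Two example targets used for illustration.  */
static const struct toy_target toy_no_vector = { 8, false, true };
static const struct toy_target toy_with_vector = { 8, true, true };

/* Plays the role of VECTOR_MODE_P.  */
static bool toy_vector_mode_p(enum toy_mode m)
{
  return m == TOY_V2SI;
}

/* nunits = UNITS_PER_SIMD_WORD / nbytes, e.g. 8 / 4 = 2.  */
static int toy_nunits(const struct toy_target *t, int scalar_bytes)
{
  return t->simd_word_bytes / scalar_bytes;
}

/* Mode given to a vector of two 4-byte integers.  */
static enum toy_mode toy_mode_for_2xsi(const struct toy_target *t)
{
  if (t->has_v2si)
    return TOY_V2SI;
  if (t->has_di)
    return TOY_DI;  /* an integer mode that happens to fit */
  return TOY_BLK;
}

/* Is there an "add" handler for this mode?  Scalar integer adds exist
   whenever the integer mode exists.  */
static bool toy_has_add_handler(const struct toy_target *t, enum toy_mode m)
{
  switch (m)
    {
    case TOY_SI:   return true;
    case TOY_DI:   return t->has_di;
    case TOY_V2SI: return t->has_v2si;
    default:       return false;
    }
}

/* The problematic check: looks only at the handler table.  */
static bool toy_add_ok_naive(const struct toy_target *t)
{
  return toy_has_add_handler(t, toy_mode_for_2xsi(t));
}

/* The stricter check: the mode must also be a vector mode.  */
static bool toy_add_ok_checked(const struct toy_target *t)
{
  enum toy_mode m = toy_mode_for_2xsi(t);
  return toy_vector_mode_p(m) && toy_has_add_handler(t, m);
}
```

In this model, for toy_no_vector the naive check accepts the loop because a
64-bit scalar add exists for TOY_DI, while the checked version rejects it;
for toy_with_vector both agree.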

Regards,

Chao-ying

----- Original Message ----- 
From: "Dorit Naishlos" <DORIT@il.ibm.com>
To: "Richard Henderson" <rth@redhat.com>
Cc: "Chao-ying Fu" <fu@mips.com>; <gcc-patches@gcc.gnu.org>; "Stephens,
Nigel" <nigel@mercury.mips.com>; "Thekkath, Radhika"
<radhika@mercury.mips.com>; "Richard Sandiford" <rsandifo@redhat.com>;
"Uhler, Mike" <uhler@mercury.mips.com>; "Jim Wilson"
<wilson@specifixinc.com>
Sent: Tuesday, September 28, 2004 6:25 AM
Subject: Re: [patch] extend.texi MIPS PS/3D Support


>
> > The real problem is that you're pulling out SImode and deciding
> > that add_optab[SImode] is the vector addition operation.
>
> but how come we get SImode here? we call build_vector_type with
> innertype = SI, and
> nunits = UNITS_PER_SIMD_WORD / nbytes = 8 / 4 = 2
> so it should return a V2SI vectype, which is not supported by the target
> and therefore should have a BLKmode. no?
>
> dorit
>
>
>
>
> From: Richard Henderson <rth@redhat.com>
> Date: 28/09/2004 01:43
> To: Dorit Naishlos/Haifa/IBM@IBMIL
> cc: Chao-ying Fu <fu@mips.com>, gcc-patches@gcc.gnu.org,
>     "Stephens, Nigel" <nigel@mercury.mips.com>,
>     "Thekkath, Radhika" <radhika@mercury.mips.com>,
>     Richard Sandiford <rsandifo@redhat.com>,
>     "Uhler, Mike" <uhler@mercury.mips.com>,
>     Jim Wilson <wilson@specifixinc.com>
> Subject: Re: [patch] extend.texi MIPS PS/3D Support
>
>
>
>
>
> On Tue, Sep 28, 2004 at 01:30:00AM +0200, Dorit Naishlos wrote:
> > Maybe the problem is here (in get_vectype_for_scalar_type):
> >
> >   vectype = build_vector_type (scalar_type, nunits);
> >   if (TYPE_MODE (vectype) == BLKmode)
> >     return NULL_TREE;
> >
> > Maybe the vectype that is built has a mode other than BLKmode, although
> > it's not supported by the target?
>
> Yes, e.g. V4QI may be represented with SImode.  But that's fine,
> since the vectorizer may be able to optimize data movement loops
> with no other vector support in the target.  (E.g. if target and
> destination are alignable.)
>
> The real problem is that you're pulling out SImode and deciding
> that add_optab[SImode] is the vector addition operation.
>
>
> r~
>
>
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-28 14:07             ` Dorit Naishlos
  2004-09-28 14:08               ` Paolo Bonzini
  2004-09-28 19:35               ` [patch] extend.texi MIPS PS/3D Support Chao-ying Fu
@ 2004-09-28 23:04               ` Richard Henderson
  2 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2004-09-28 23:04 UTC (permalink / raw)
  To: Dorit Naishlos
  Cc: Chao-ying Fu, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Richard Sandiford, Uhler, Mike, Jim Wilson

On Tue, Sep 28, 2004 at 03:25:39PM +0200, Dorit Naishlos wrote:
> so it should return a V2SI vectype, which is not supported by the target
> and therefore should have a BLKmode. no?

Nope.  An integer mode is returned if we have one that fits.  It
makes the routines in tree-complex.c happy.


r~

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [patch] extend.texi MIPS PS/3D Support
@ 2004-09-24  0:12 Fu, Chao-Ying
  2004-10-02 11:35 ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Fu, Chao-Ying @ 2004-09-24  0:12 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: James E Wilson, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Uhler, Mike

Here is the updated patch!  Thanks!

Regards,
Chao-ying Fu
MIPS Technologies, Inc.

Index: extend.texi
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/doc/extend.texi,v
retrieving revision 1.219
diff -c -3 -p -r1.219 extend.texi
*** extend.texi	23 Sep 2004 16:11:23 -0000	1.219
--- extend.texi	23 Sep 2004 23:42:23 -0000
*************** instructions, but allow the compiler to 
*** 5297,5302 ****
--- 5297,5303 ----
  * ARM Built-in Functions::
  * FR-V Built-in Functions::
  * X86 Built-in Functions::
+ * MIPS Paired-Single Floating Point Support::
  * PowerPC AltiVec Built-in Functions::
  @end menu
  
*************** v2sf __builtin_ia32_pswapdsf (v2sf)
*** 6180,6185 ****
--- 6181,6798 ----
  v2si __builtin_ia32_pswapdsi (v2si)
  @end smallexample
  
+ @node MIPS Paired-Single Floating Point Support
+ @subsection MIPS Paired-Single Floating Point Support
+ 
+ @menu
+ * Built-in Functions for MIPS64 Paired-Single Instructions::
+ * Built-in Functions for MIPS-3D ASE Instructions::
+ @end menu
+ 
+ The MIPS64 Architecture includes a number of paired-single floating point (FP) 
+ instructions.  These paired-single instructions operate on
+ 64-bit floating point registers containing two single floating point values
+ each in two contiguous 32-bits of the register, that is, one in the upper
+ half and one in the lower half.  This type of vector representation of two 
+ single FP values can then be operated upon simultaneously by a single 
+ instruction in a SIMD (single instruction multiple data) fashion. 
+ GCC supports the paired-single instructions via the @code{V2SF} machine mode, 
+ generic C arithmetic operations, and built-in functions.  Having this support 
+ allows a user to declare variables in C that use the new SIMD data type.  
+ When these variables are then used in arithmetic operations, the appropriate 
+ paired-single instructions are automatically generated by the compiler.  
+ Many built-in functions are also available that use this data type and
+ generate appropriate paired-single instructions.
+ 
+ To enable paired-single instruction support, the command line option 
+ @option{-mpaired-single} must be used.  Also, the hardware floating point 
+ option @option{-mhard-float} and the 64-bit FP register option 
+ @option{-mfp64} must be turned on either by default or by explicit command 
+ line options.
+ 
+ @smallexample
+ gcc -mips64 -mpaired-single -c hello.c
+ gcc -mips64 -mpaired-single -mhard-float -mfp64 -c hello.c
+ @end smallexample
+ 
+ To use paired-single variables in C, the following @code{typedef} is
+ required.
+ @smallexample
+ typedef float v2sf __attribute__ ((vector_size (8)));
+ @end smallexample
+ 
+ This @code{typedef} defines @code{v2sf} as an 8-byte data type whose base type 
+ is @code{float} (4 bytes), so a @code{v2sf} variable contains two 
+ @code{float}s in a SIMD fashion.
+ 
+ The following examples illustrate how SIMD variables can be declared,
+ initialized, and used in a C program.  Note that declaring and using SIMD 
+ variables in this way allows the compiler to use the appropriate SIMD 
+ instructions without any explicit direction from the user.
+ 
+ @table @asis
+ @item Variable Declaration
+ @smallexample
+ v2sf a, b, c, d;
+ /* @r{This declares four variables @code{a}, @code{b}, @code{c}, and @code{d} 
+    of the paired-single data type.}  */
+ @end smallexample
+ 
+ @item Variable Initialization
+ @smallexample
+ v2sf a = @{1.5, 9.1@};
+ v2sf b;
+ float e, f;
+ b = (v2sf) @{e, f@};
+ @end smallexample
+ 
+ @emph{Note:} The CPU endianness determines how paired-single variables are 
+ initialized, i.e., which value goes into the upper and which goes into the 
+ lower part of the paired-single floating point register.
+ For little-endian CPUs, the first single value goes into the lower part,
+ and the second single value goes into the upper part.  For example,
+ @code{v2sf a = @{1.5, 9.1@};} will put @code{1.5} into the lower part and 
+ @code{9.1} into the upper part.
+ For big-endian CPUs, the first single value goes into the upper part, and
+ the second single value goes into the lower part.  For the same example,
+ @code{v2sf a = @{1.5, 9.1@};} will put @code{1.5} into the upper part and 
+ @code{9.1} into the lower part.
+ @end table
+ 
+ The examples below show how operations on variables of the SIMD data type
+ cause the compiler to generate paired-single instructions automatically.
+ The instruction generated in each case is shown in parentheses.
+ 
+ @table @asis
+ @item Assignment (@code{MOV.PS})
+ @smallexample
+ v2sf a, b;
+ a = b;
+ @end smallexample
+ 
+ @item Addition (@code{ADD.PS})
+ @smallexample
+ v2sf a, b, c;
+ a = b + c;
+ @end smallexample
+ 
+ @item Subtraction (@code{SUB.PS})
+ @smallexample
+ v2sf a, b, c;
+ a = b - c;
+ @end smallexample
+ 
+ @item Negation (@code{NEG.PS})
+ @smallexample
+ v2sf a, b;
+ a = - b;
+ @end smallexample
+ 
+ @item Multiplication (@code{MUL.PS})
+ @smallexample
+ v2sf a, b, c;
+ a = b * c;
+ @end smallexample
+ 
+ @item Multiplication and Addition (@code{MADD.PS})
+ @smallexample
+ v2sf a, b, c, d;
+ a = b * c + d;
+ @end smallexample
+ 
+ @item Multiplication and Subtraction (@code{MSUB.PS})
+ @smallexample
+ v2sf a, b, c, d;
+ a = b * c - d;
+ @end smallexample
+ 
+ @item Negation of Multiplication and Addition (@code{NMADD.PS})
+ @smallexample
+ v2sf a, b, c, d;
+ a = - (b * c + d);
+ @end smallexample
+ 
+ @item Negation of Multiplication and Subtraction (@code{NMSUB.PS})
+ @smallexample
+ v2sf a, b, c, d;
+ a = - (b * c - d);
+ @end smallexample
+ 
+ @item Conditional Move Based on Integer Comparison (@code{MOVN.PS}, @code{MOVZ.PS})
+ @smallexample
+ int i, j;
+ v2sf a, b, c;
+ a = (i > j) ? b : c;
+ /* @r{Or}  */
+ if (i > j)
+   a = b;
+ else
+   a = c;
+ @end smallexample
+ @end table
+ 
+ @node Built-in Functions for MIPS64 Paired-Single Instructions
+ @subsubsection Built-in Functions for MIPS64 Paired-Single Instructions 
+ 
+ Built-in functions can be used to explicitly invoke a particular SIMD 
+ instruction.  They can be used to merge paired-single variables,
+ convert between paired-single and float variables, calculate absolute
+ values, conditionally assign paired-single variables based on floating-point
+ comparison results, and compare two paired-single variables.
+ Typically, each built-in function refers to a particular paired-single SIMD
+ instruction (and other floating-point instructions if necessary).
+ 
+ @table @code
+ @item v2sf __builtin_mips_pll_ps (v2sf, v2sf)
+ Pair lower lower (@code{PLL.PS}).  
+ (Please refer to the architecture specification for the exact function of this 
+ and other paired-single instructions.)
+ 
+ In order to invoke the @code{PLL.PS} instruction on some paired-single 
+ variables, the user must do the following:  Declare the @code{v2sf} type 
+ shown previously, declare variables using that type, then call the built-in 
+ function on the appropriate variables.  The usage of other built-in functions 
+ is similar.
+ 
+ Example:
+ @smallexample
+ v2sf a, b, c;
+ c = __builtin_mips_pll_ps (a, b);
+ @end smallexample
+ 
+ @item v2sf __builtin_mips_pul_ps (v2sf, v2sf)
+ Pair upper lower (@code{PUL.PS}).
+ 
+ @item v2sf __builtin_mips_plu_ps (v2sf, v2sf)
+ Pair lower upper (@code{PLU.PS}).
+ 
+ @item v2sf __builtin_mips_puu_ps (v2sf, v2sf)
+ Pair upper upper (@code{PUU.PS}).
+ 
+ @item v2sf __builtin_mips_cvt_ps_s (float, float)
+ Floating point convert pair to paired single (@code{CVT.PS.S}).
+ 
+ @item float __builtin_mips_cvt_s_pl (v2sf)
+ Floating point convert pair lower to single floating point (@code{CVT.S.PL}).
+ 
+ @item float __builtin_mips_cvt_s_pu (v2sf)
+ Floating point convert pair upper to single floating point (@code{CVT.S.PU}).
+ 
+ @item v2sf __builtin_mips_abs_ps (v2sf)
+ Floating point absolute value (@code{ABS.PS}).
+ 
+ @item v2sf __builtin_mips_alnv_ps (v2sf, v2sf, int)
+ Floating point align variable (@code{ALNV.PS}).
+ 
+ @emph{Note:} The value of the third parameter @code{int} must be 0 or 4 
+ modulo 8; otherwise the result will be unpredictable.  Please read the 
+ instruction description for details.
+ 
+ @item v2sf __builtin_mips_mov@var{tf}_c_@var{cond}_ps (v2sf, v2sf, v2sf, v2sf)
+ Conditional move based on floating point comparison (@code{C.@var{cond}.PS}, 
+ @code{MOVT.PS}/@code{MOVF.PS}).
+ 
+ @var{tf} = @code{t}, @code{f}
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using one 
+ of the 16 comparison operators specified.  Assign the third @code{v2sf} 
+ parameter to the destination, and then conditionally assign (if true or false) 
+ the fourth @code{v2sf} parameter to the destination, potentially overwriting
+ the previously assigned values.
+ 
+ @emph{Note:} The assignments of the upper and lower parts are independent and
+ based on their own comparison results.
+ 
+ Example:
+ @smallexample
+ v2sf a, b, c, d, e, f;
+ e = __builtin_mips_movt_c_eq_ps (a, b, c, d);
+ f = __builtin_mips_movf_c_eq_ps (a, b, c, d);
+ @end smallexample
+ 
+ @item int __builtin_mips_@var{choice}_c_@var{cond}_ps (v2sf, v2sf)
+ Floating point comparisons on two paired-single values (@code{C.@var{cond}.PS}, 
+ @code{BC1T}/@code{BC1F}).
+ 
+ @var{choice} = @code{upper}, @code{lower}
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using one
+ of the 16 comparison operators specified.  Then return 1 if the specified upper
+ or lower comparison result is true, or return 0 if the specified upper
+ or lower comparison result is false.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_upper_c_eq_ps (a, b))
+   upper_is_true ();
+ else
+   upper_is_false ();
+ 
+ if (__builtin_mips_lower_c_eq_ps (a, b))
+   lower_is_true ();
+ else
+   lower_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b;
+ int i, j;
+ i = __builtin_mips_upper_c_eq_ps (a, b);
+ if (i)
+   upper_is_true ();
+ else
+   upper_is_false ();
+ 
+ j = __builtin_mips_lower_c_eq_ps (a, b);
+ if (j)
+   lower_is_true ();
+ else
+   lower_is_false ();
+ @end smallexample
+ 
+ @end table
+ 
+ @node Built-in Functions for MIPS-3D ASE Instructions
+ @subsubsection Built-in Functions for MIPS-3D ASE Instructions
+ 
+ The MIPS-3D Application-Specific Extension (ASE) includes 13 floating point 
+ instructions designed to improve the performance of 3D graphics geometry 
+ operations.  GCC supports these instructions via built-in functions.  
+ For a description of these instructions, refer to the appropriate architecture 
+ reference manual.
+ 
+ To enable MIPS-3D support, the command line option @option{-mips3d} must be 
+ used.  Also, the hardware floating point option @option{-mhard-float}
+ and the 64-bit FP register option @option{-mfp64} must be turned on either by 
+ default or by explicit command line options.  Note that using @option{-mips3d}
+ implies @option{-mpaired-single}.
+ 
+ @smallexample
+ gcc -mips64 -mips3d -c hello.c
+ gcc -mips64 -mips3d -mhard-float -mfp64 -c hello.c
+ @end smallexample
+ 
+ The MIPS-3D built-in functions are as follows.
+ 
+ @table @code
+ @item v2sf __builtin_mips_addr_ps (v2sf, v2sf)
+ Floating point reduction add (@code{ADDR.PS}).
+ 
+ @item v2sf __builtin_mips_mulr_ps (v2sf, v2sf)
+ Floating point reduction multiply (@code{MULR.PS}).
+ 
+ @item v2sf __builtin_mips_cvt_pw_ps (v2sf)
+ Floating point convert paired single to paired word (@code{CVT.PW.PS}).
+ 
+ @item v2sf __builtin_mips_cvt_ps_pw (v2sf)
+ Floating point convert paired word to paired single (@code{CVT.PS.PW}).
+ 
+ @item float __builtin_mips_recip1_s (float)
+ Floating point reduced precision reciprocal (sequence step 1) (@code{RECIP1.S}).
+ 
+ @item double __builtin_mips_recip1_d (double)
+ Floating point reduced precision reciprocal (sequence step 1) (@code{RECIP1.D}).
+ 
+ @item v2sf __builtin_mips_recip1_ps (v2sf)
+ Floating point reduced precision reciprocal (sequence step 1) 
+ (@code{RECIP1.PS}).
+ 
+ @item float __builtin_mips_recip2_s (float, float)
+ Floating point reduced precision reciprocal (sequence step 2) (@code{RECIP2.S}).
+ 
+ @item double __builtin_mips_recip2_d (double, double)
+ Floating point reduced precision reciprocal (sequence step 2) (@code{RECIP2.D}).
+ 
+ @item v2sf __builtin_mips_recip2_ps (v2sf, v2sf)
+ Floating point reduced precision reciprocal (sequence step 2) 
+ (@code{RECIP2.PS}).
+ 
+ @item float __builtin_mips_rsqrt1_s (float)
+ Floating point reduced precision reciprocal square root (sequence step 1) 
+ (@code{RSQRT1.S}).
+ 
+ @item double __builtin_mips_rsqrt1_d (double)
+ Floating point reduced precision reciprocal square root (sequence step 1) 
+ (@code{RSQRT1.D}).
+ 
+ @item v2sf __builtin_mips_rsqrt1_ps (v2sf)
+ Floating point reduced precision reciprocal square root (sequence step 1) 
+ (@code{RSQRT1.PS}).
+ 
+ @item float __builtin_mips_rsqrt2_s (float, float)
+ Floating point reduced precision reciprocal square root (sequence step 2) 
+ (@code{RSQRT2.S}).
+ 
+ @item double __builtin_mips_rsqrt2_d (double, double)
+ Floating point reduced precision reciprocal square root (sequence step 2) 
+ (@code{RSQRT2.D}).
+ 
+ @item v2sf __builtin_mips_rsqrt2_ps (v2sf, v2sf)
+ Floating point reduced precision reciprocal square root (sequence step 2) 
+ (@code{RSQRT2.PS}).
+ 
+ @item int __builtin_mips_@var{choice}_c_@var{cond}_ps (v2sf, v2sf)
+ Floating point comparisons on two paired-single values (@code{C.@var{cond}.PS}, 
+ @code{BC1ANY2T}/@code{BC1ANY2F}).
+ 
+ @var{choice} = @code{any}, @code{all}
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using 
+ one of the 16 comparison operators specified.  The @code{any} version
+ returns 1 if any comparison result is true, or 0 if all comparison 
+ results are false.  The @code{all} version returns 1 if all comparison
+ results are true, or 0 if any comparison result is false.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_any_c_eq_ps (a, b))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_c_eq_ps (a, b))
+   all_is_true ();
+ else
+   any_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b;
+ int i, j;
+ i = __builtin_mips_any_c_eq_ps (a, b);
+ j = __builtin_mips_all_c_eq_ps (a, b);
+ @end smallexample
+ 
+ @emph{Note:} The remaining built-in functions are similar to this function,
+ except for compare operators (normal or absolute), basic data types,
+ or using two or four consecutive conditional bits.
+ 
+ @item int __builtin_mips_@var{choice}_cabs_@var{cond}_ps (v2sf, v2sf)
+ Floating point absolute comparisons on two paired-single values 
+ (@code{CABS.@var{cond}.PS}, 
+ @code{BC1ANY2T}/@code{BC1ANY2F}/@code{BC1T}/@code{BC1F}).
+ 
+ @var{choice} = @code{any}, @code{all}, @code{upper}, @code{lower}
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{v2sf} parameters 
+ using one of the 16 comparison operators specified.  Then return 1 if 
+ any result/all results/the upper result/the lower result is (are) true, or 
+ return 0 otherwise.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_any_cabs_eq_ps (a, b))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_cabs_eq_ps (a, b))
+   all_is_true ();
+ else
+   any_is_false ();
+ 
+ if (__builtin_mips_upper_cabs_eq_ps (a, b))
+   upper_is_true ();
+ else
+   upper_is_false ();
+ 
+ if (__builtin_mips_lower_cabs_eq_ps (a, b))
+   lower_is_true ();
+ else
+   lower_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b;
+ int i, j, k, l;
+ i = __builtin_mips_any_cabs_eq_ps (a, b);
+ j = __builtin_mips_all_cabs_eq_ps (a, b);
+ k = __builtin_mips_upper_cabs_eq_ps (a, b);
+ l = __builtin_mips_lower_cabs_eq_ps (a, b);
+ @end smallexample
+ 
+ @item int __builtin_mips_cabs_@var{cond}_s (float, float)
+ Floating point absolute comparisons on single (@code{CABS.@var{cond}.S},
+ @code{BC1T}/@code{BC1F}).
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{float} parameters 
+ using one of the 16 comparison operators specified.  Then return 1 if true or 
+ return 0 if false.
+ 
+ Example 1:
+ @smallexample
+ float a, b;
+ if (__builtin_mips_cabs_eq_s (a, b))
+   true ();
+ else
+   false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ float a, b;
+ int i;
+ i = __builtin_mips_cabs_eq_s (a, b);
+ @end smallexample
+ 
+ @item int __builtin_mips_cabs_@var{cond}_d (double, double)
+ Floating point absolute comparisons on double (@code{CABS.@var{cond}.D},
+ @code{BC1T}/@code{BC1F}).
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{double} parameters
+ using one of the 16 comparison operators specified.  Then return 1 if true or 
+ return 0 if false.
+ 
+ Example 1:
+ @smallexample
+ double a, b;
+ if (__builtin_mips_cabs_eq_d (a, b))
+   true ();
+ else
+   false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ double a, b;
+ int i;
+ i = __builtin_mips_cabs_eq_d (a, b);
+ @end smallexample
+ 
+ @item v2sf __builtin_mips_mov@var{tf}_cabs_@var{cond}_ps (v2sf, v2sf, v2sf, v2sf)
+ Conditional move based on floating point absolute comparison 
+ (@code{CABS.@var{cond}.PS}, @code{MOVT.PS}/@code{MOVF.PS}).
+ 
+ @var{tf} = @code{t}, @code{f}
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{v2sf} parameters 
+ using one of the 16 comparison operators specified.  Assign the third
+ @code{v2sf} parameter to the destination, and then conditionally assign (if 
+ true or false) the fourth @code{v2sf} parameter to the destination,
+ potentially overwriting the previously assigned values.
+ 
+ @emph{Note:} The assignments of the upper and lower parts are independent and 
+ based on their own comparison results.
+ 
+ Example:
+ @smallexample
+ v2sf a, b, c, d, e, f;
+ e = __builtin_mips_movt_cabs_eq_ps (a, b, c, d);
+ f = __builtin_mips_movf_cabs_eq_ps (a, b, c, d);
+ @end smallexample
+ 
+ @item int __builtin_mips_@var{choice}_c_@var{cond}_4s (v2sf, v2sf, v2sf, v2sf)
+ Floating point comparisons on four paired-single values 
+ (@code{C.@var{cond}.PS}, @code{BC1ANY4T}/@code{BC1ANY4F}).
+ 
+ @var{choice} = @code{any}, @code{all}
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using one 
+ of the 16 comparison operators specified.  Compare the third and the fourth 
+ @code{v2sf} parameters using one of the 16 comparison operators specified.  
+ Then return 1 if any/all comparison results are true, or return 0 
+ if all/any comparison results are false.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b, c, d;
+ if (__builtin_mips_any_c_eq_4s (a, b, c, d))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_c_eq_4s (a, b, c, d))
+   all_is_true ();
+ else
+   any_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b, c, d;
+ int i, j;
+ i = __builtin_mips_any_c_eq_4s (a, b, c, d);
+ j = __builtin_mips_all_c_eq_4s (a, b, c, d);
+ @end smallexample
+ 
+ @item int __builtin_mips_@var{choice}_cabs_@var{cond}_4s (v2sf, v2sf, v2sf, v2sf)
+ Floating point absolute comparisons on four paired-single values 
+ (@code{CABS.@var{cond}.PS}, @code{BC1ANY4T}/@code{BC1ANY4F}).
+ 
+ @var{choice} = @code{any}, @code{all}
+ 
+ @var{cond} = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{v2sf} parameters 
+ using one of the 16 comparison operators specified.  Absolute compare the 
+ third and the fourth @code{v2sf} parameters using one of the 16 comparison 
+ operators specified.  Then return 1 if any/all comparison results 
+ are true, or return 0 if all/any comparison results are false.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b, c, d;
+ if (__builtin_mips_any_cabs_eq_4s (a, b, c, d))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_cabs_eq_4s (a, b, c, d))
+   all_is_true ();
+ else
+   any_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b, c, d;
+ int i, j;
+ i = __builtin_mips_any_cabs_eq_4s (a, b, c, d);
+ j = __builtin_mips_all_cabs_eq_4s (a, b, c, d);
+ @end smallexample
+ 
+ @end table
+ 
  @node PowerPC AltiVec Built-in Functions
  @subsection PowerPC AltiVec Built-in Functions
  

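For anyone who wants to try the generic-vector side of this without a
MIPS toolchain, here is a minimal sketch.  It uses only the typedef from
the patch and GCC's generic vector arithmetic, so it should build on any
GCC target; the helper names v2sf_unpack and v2sf_muladd are made up for
illustration and are not part of the patch.  Which instruction the
compiler actually picks (MADD.PS with -mpaired-single, per the text
above) depends on the target and options.

```c
#include <assert.h>
#include <string.h>

/* Same typedef as in the patch: two floats packed into 8 bytes.  */
typedef float v2sf __attribute__ ((vector_size (8)));

/* Hypothetical helper: copy the two elements out in memory order.
   Element 0 is the first initializer; whether that is the "upper" or
   "lower" half of a MIPS register depends on endianness, as the
   documentation notes.  */
static void
v2sf_unpack (v2sf v, float out[2])
{
  assert (sizeof (v2sf) == 2 * sizeof (float));
  memcpy (out, &v, sizeof (v));
}

/* Hypothetical helper: element-wise a * b + c using plain C operators
   on the vector type.  */
static v2sf
v2sf_muladd (v2sf a, v2sf b, v2sf c)
{
  return a * b + c;
}
```

Calling v2sf_muladd on @{1.5, 2.0@}, @{0.5, 4.0@} and @{1.0, 1.0@}
(written with plain braces in C) and unpacking the result gives the two
products plus one, element by element.
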
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org
> [mailto:gcc-patches-owner@gcc.gnu.org]On Behalf Of Richard Sandiford
> Sent: Tuesday, September 21, 2004 11:49 PM
> To: Fu, Chao-Ying
> Cc: James E Wilson; gcc-patches@gcc.gnu.org; Stephens, Nigel; 
> Thekkath,
> Radhika; Uhler, Mike
> Subject: Re: [patch] extend.texi MIPS PS/3D Support
> 
> 
> This is awesome.  Thanks a lot for doing this.
> 
> Some very minor niggles:
> 
> "Fu, Chao-Ying" <fu@mips.com> writes:
> > + * MIPS Paired-Single Floating Point Instruction Support::
> 
> Maybe drop "Instruction"?
> 
> > + @menu
> > + * Built-in Functions for MIPS64 Paired-Single Instructions::
> > + * Built-in Functions for MIPS-3D ASE Instructions::
> > + @end menu
> 
> I'm not sure how canonical it is to have the section start with a menu
> and be followed by a lengthy (but very good!) overview.  It might well
> be OK, but maybe a docs maintainer can comment.
> 
> Perhaps a lot of the overview could go in a subsection of its own,
> alongside the two nodes above.
> 
> > + The MIPS64 Architecture includes a number of 
> paired-single floating point (FP) 
> > + instructions.  These paired-single instructions operate on
> > + 64-bit floating point registers containing two single 
> floating point values
> > + each in two contiguous 32-bits of the register, that is, 
> one in the upper
> > + half and one in the lower half.  This type of vector 
> representation of two 
> > + single FP values can then be operated upon simultaneously 
> by a single 
> > + instruction in a SIMD (single instruction multiple data) 
> fashion, also called 
> > + paired-single instructions in the MIPS sentence.  GCC 
> supports the 
>                                   ^^^^^^^^^^^^^^^^^
> I think you mean "MIPS terminology", or something like that.  Although
> everything after "also called..." is a little redundant since you've
> already introduced the term "paired-single instructions".
> 
> > + To enable paired-single instruction support, the command 
> line option 
> > + @option{-mpaired-single} must be used.  Also, the 
> hardware floating point 
> > + option @option{-mhard-float} and the 64-bit FP register option 
> > + @option{-mfp64} must be turned on either by default or by 
> explicit command 
> > + line options.
> > + 
> > + @smallexample
> > + gcc -mips64 -mpaired-single -c hello.c
> > + gcc -mips64 -mpaired-single -mhard-float -mfp64 -c hello.c
> > + @end smallexample
> > + 
> > + To use paired-single variables in C, the following 
> @code{typedef} is
> > + required.
> > + @smallexample
> > + typedef float v2sf __attribute__ ((vector_size (8)));
> > + @end smallexample
> > + 
> > + This @code{typedef} defines @code{v2sf} as an 8-byte data 
> type which base type 
> > + is @code{float} (4 bytes), so a @code{v2sf} variable contains two 
> > + @code{float}s in a SIMD fashion.
> > + 
> > + The following illustrates by example, how SIMD variables 
> can be declared,
> > + initialized, and used in a C program.  Note that 
> declaring and using SIMD 
> > + variables in this way allows the compiler to use the 
> appropriate SIMD 
> > + instructions without any explicit direction from the user.
> > + 
> > + @table @asis
> > + @item Variable Declaration
> > + @smallexample
> > + v2sf a, b, c, d;
> > + /* @r{This declares four variables a, b, c, and d as a 
> paired-single data type.}  */
> 
> I suspect that this should really be "@code{a}, @code{b}, ...".
> 
> > + @end smallexample
> > + 
> > + @item Variable Initialization
> > + @smallexample
> > + v2sf a = @{1.5, 9.1@};
> > + v2sf b;
> > + float e, f;
> > + b = (v2sf) @{e, f@};
> > + @end smallexample
> > + 
> > + NOTE: The CPU endianness determines how paired-single 
> variables are initialized,
> 
> I think this should be @emph{Note:}.
> 
> > + i.e., which value goes into the upper and which goes into 
> the lower part
> > + of the paired-single floating point register.
> > + For little-endian CPUs, the first single value goes into 
> the lower part,
> > + and the second single value goes into the upper part.  
> For example,
> > + @code{v2sf a = @{1.5, 9.1@};}, @code{1.5} goes into the 
> lower part and 
> > + @code{9.1} goes into the upper part.
> 
> How about:
> 
>     For example, @code{v2sf a = @{1.5, 9.1@};} will put 
> @code{1.5} into
>     the lower part and @code{9.1} into the upper part.
> 
> > + The examples below illustrate the usage of the declared 
> SIMD data type 
> > + variables that allow the compiler to automatically invoke 
> paired-single
> > + instructions.  The instruction invoked in each case is 
> shown in parentheses.
> 
> The first sentence doesn't sound quite right, but I can't think of
> anything better right now.
> 
> > + 
> > + @table @asis
> > + @item Assignment (MOV.PS)
> 
> Maybe better to use "(@code{mov.ps})", here and elsewhere?
> 
> > + @item Conditional Move Based on Integer Comparison 
> (MOVN.PS, MOVZ.PS)
> > + @smallexample
> > + int i, j;
> > + v2sf a, b, c;
> > + a = (i > j) ? b : c;
> > + /* Or */
> > + if (i > j)
> > +   a = b;
> > + else
> > +   a = c;
> > + @end smallexample
> > + @end table
> 
> Apparently that should be "@r{Or}".  So many things to remember... ;)
> 
> > + @node Built-in Functions for MIPS64 Paired-Single Instructions
> > + @subsubsection Built-in Functions for MIPS64 
> Paired-Single Instructions 
> > + 
> > + Built-in functions invoke explicit SIMD instructions by 
> the user in C.
> 
> Maybe something like "Built-in functions can be used to explicitly
> invoke a particular SIMD instruction"?
> 
> > + They are used in situations when merging paired-single variables,
>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
> Suggest "to merge" (with rest of sentence changing accordingly).
> 
> > + NOTE: The value of the third parameter @code{int} must be 0 or 4 
> 
> Another @emph{Note:}.
> 
> > + @item v2sf __builtin_mips_mov*_c_*_ps (v2sf, v2sf, v2sf, v2sf)
> > + Conditional move based on floating point comparison 
> (C.*.PS, MOVT.PS/MOVF.PS).
> > + 
> > + The first * = @code{t}, @code{f}
> > + 
> > + The second * = @code{f}, @code{un}, @code{eq}, 
> @code{ueq}, @code{olt}, 
> > + @code{ult}, @code{ole}, @code{ule}, @code{sf}, 
> @code{ngle}, @code{seq}, 
> > + @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
> 
> Probably better to use @var{...} instead of "*" here.  E.g.:
> 
> __builtin_mips_mov@var{tf}_c_@var{cond}_ps
> 
> Richard
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-24  0:12 Fu, Chao-Ying
@ 2004-10-02 11:35 ` Richard Sandiford
  2004-10-05  7:15   ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2004-10-02 11:35 UTC (permalink / raw)
  To: Fu, Chao-Ying
  Cc: James E Wilson, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Uhler, Mike

"Fu, Chao-Ying" <fu@mips.com> writes:
> Here is the updated patch!  Thanks!

Sorry for taking so long to deal with this.  As usual, I couldn't
resist tinkering a bit, and it's taken a while before I could
find the time.

Here's an updated patch.  The main changes were to:

  - Move the bit about the options requirements to invoke.texi.
    The existing options documentation is a bit bare and your
    patch was adding just the details that were needed.

  - Move some of the general paired-single documentation to
    the beginning of the section (ahead of the menu) and put
    the arithmetic bits in a (sub)subsection of its own.

  - Use a table to summarise the arithmetic operations.

  - Combine the documentation for similar built-in functions.

  - Use a code sequence to document the movt/movf functions.

Does it look OK?  If so, and if no-one else has any complaints,
I'll install this in the next few days.

Tested with "make info" and "make dvi".
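
As a reading aid for the movt/movf description, here is a rough
portable C model of the "eq" variants, based only on the wording in
the earlier patch (assign the third argument, then conditionally
overwrite each half with the fourth).  The names model_movt_eq,
model_movf_eq and model_elem are invented for this sketch; it does not
call the real built-ins and is not meant as their definition.

```c
#include <assert.h>
#include <string.h>

typedef float v2sf __attribute__ ((vector_size (8)));

/* Hypothetical helper: return element I (0 or 1) in memory order.  */
static float
model_elem (v2sf v, int i)
{
  float f[2];

  assert (i == 0 || i == 1);
  memcpy (f, &v, sizeof f);
  return f[i];
}

/* Model of the described movt/movf behaviour for the "eq" condition.
   Start from C, then for each half independently take D's value when
   the comparison of A and B in that half equals WANT (1 for movt,
   0 for movf).  */
static v2sf
model_mov_eq (v2sf a, v2sf b, v2sf c, v2sf d, int want)
{
  float fa[2], fb[2], fc[2], fd[2];
  v2sf result;
  int i;

  memcpy (fa, &a, sizeof fa);
  memcpy (fb, &b, sizeof fb);
  memcpy (fc, &c, sizeof fc);
  memcpy (fd, &d, sizeof fd);
  for (i = 0; i < 2; i++)
    if ((fa[i] == fb[i]) == want)
      fc[i] = fd[i];
  memcpy (&result, fc, sizeof result);
  return result;
}

static v2sf
model_movt_eq (v2sf a, v2sf b, v2sf c, v2sf d)
{
  return model_mov_eq (a, b, c, d, 1);
}

static v2sf
model_movf_eq (v2sf a, v2sf b, v2sf c, v2sf d)
{
  return model_mov_eq (a, b, c, d, 0);
}
```

Because each half is handled on its own, the model gives the same
answer regardless of which element is the upper half.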

Richard


2004-10-02  Chao-Ying Fu  <fu@mips.com>
	    Richard Sandiford  <rsandifo@redhat.com>

	* doc/invoke.texi (-mpaired-single): Link to the new description of the
	built-in functions.  Document dependencies.
	(-mips3d): Add link here too.
	* doc/extend.texi (MIPS Paired-Single Support): New section.

Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.539
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.539 invoke.texi
*** doc/invoke.texi	17 Sep 2004 15:51:58 -0000	1.539
--- doc/invoke.texi	2 Oct 2004 11:06:19 -0000
*************** operations.  This is the default.
*** 9446,9459 ****
  @itemx -mno-paired-single
  @opindex mpaired-single
  @opindex mno-paired-single
! Use (do not use) the paired single instructions.
  
  @itemx -mips3d
  @itemx -mno-mips3d
  @opindex mips3d
  @opindex mno-mips3d
! Use (do not use) the MIPS-3D ASE.  The option @option{-mips3d} implies
! @option{-mpaired-single}.
  
  @item -mint64
  @opindex mint64
--- 9446,9462 ----
  @itemx -mno-paired-single
  @opindex mpaired-single
  @opindex mno-paired-single
! Use (do not use) paired-single floating-point instructions.
! @xref{MIPS Paired-Single Support}.  This option can only be used
! when generating 64-bit code and requires hardware floating-point
! support to be enabled.
  
  @itemx -mips3d
  @itemx -mno-mips3d
  @opindex mips3d
  @opindex mno-mips3d
! Use (do not use) the MIPS-3D ASE.  @xref{MIPS-3D Built-in Functions}.
! The option @option{-mips3d} implies @option{-mpaired-single}.
  
  @item -mint64
  @opindex mint64
Index: doc/extend.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/extend.texi,v
retrieving revision 1.221
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.221 extend.texi
*** doc/extend.texi	24 Sep 2004 20:29:52 -0000	1.221
--- doc/extend.texi	2 Oct 2004 11:06:32 -0000
*************** instructions, but allow the compiler to 
*** 5297,5302 ****
--- 5297,5303 ----
  * ARM Built-in Functions::
  * FR-V Built-in Functions::
  * X86 Built-in Functions::
+ * MIPS Paired-Single Support::
  * PowerPC AltiVec Built-in Functions::
  @end menu
  
*************** v2sf __builtin_ia32_pswapdsf (v2sf)
*** 6180,6185 ****
--- 6181,6501 ----
  v2si __builtin_ia32_pswapdsi (v2si)
  @end smallexample
  
+ @node MIPS Paired-Single Support
+ @subsection MIPS Paired-Single Support
+ 
+ The MIPS64 architecture includes a number of instructions that
+ operate on pairs of single-precision floating-point values.
+ Each pair is packed into a 64-bit floating-point register,
+ with one element being designated the ``upper half'' and
+ the other being designated the ``lower half''.
+ 
+ GCC supports paired-single operations using both the generic
+ vector extensions (@pxref{Vector Extensions}) and a collection of
+ MIPS-specific built-in functions.  Both kinds of support are
+ enabled by the @option{-mpaired-single} command-line option.
+ 
+ The vector type associated with paired-single values is usually
+ called @code{v2sf}.  It can be defined in C as follows:
+ 
+ @smallexample
+ typedef float v2sf __attribute__ ((vector_size (8)));
+ @end smallexample
+ 
+ @code{v2sf} values are initialized in the same way as aggregates.
+ For example:
+ 
+ @smallexample
+ v2sf a = @{1.5, 9.1@};
+ v2sf b;
+ float e, f;
+ b = (v2sf) @{e, f@};
+ @end smallexample
+ 
+ @emph{Note:} The CPU's endianness determines which value is stored in
+ the upper half of a register and which value is stored in the lower half.
+ On little-endian targets, the first value is the lower one and the second
+ value is the upper one.  The opposite order applies to big-endian targets.
+ For example, the code above will set the lower half of @code{a} to
+ @code{1.5} on little-endian targets and @code{9.1} on big-endian targets.
+ 
+ @menu
+ * Paired-Single Arithmetic::
+ * Paired-Single Built-in Functions::
+ * MIPS-3D Built-in Functions::
+ @end menu
+ 
+ @node Paired-Single Arithmetic
+ @subsubsection Paired-Single Arithmetic
+ 
+ The table below lists the @code{v2sf} operations for which hardware
+ support exists.  @code{a}, @code{b} and @code{c} are @code{v2sf}
+ values and @code{x} is an integral value.
+ 
+ @multitable @columnfractions .50 .50
+ @item C code @tab MIPS instruction
+ @item @code{a + b} @tab @code{add.ps}
+ @item @code{a - b} @tab @code{sub.ps}
+ @item @code{-a} @tab @code{neg.ps}
+ @item @code{a * b} @tab @code{mul.ps}
+ @item @code{a * b + c} @tab @code{madd.ps}
+ @item @code{a * b - c} @tab @code{msub.ps}
+ @item @code{-(a * b + c)} @tab @code{nmadd.ps}
+ @item @code{-(a * b - c)} @tab @code{nmsub.ps}
+ @item @code{x ? a : b} @tab @code{movn.ps}/@code{movz.ps}
+ @end multitable
+ 
+ Note that the multiply-accumulate instructions can be disabled
+ using the command-line option @code{-mno-fused-madd}.
+ 
+ @node Paired-Single Built-in Functions
+ @subsubsection Paired-Single Built-in Functions
+ 
+ The following paired-single functions map directly to a particular
+ MIPS instruction.  Please refer to the architecture specification
+ for details on what each instruction does.
+ 
+ @table @code
+ @item v2sf __builtin_mips_pll_ps (v2sf, v2sf)
+ Pair lower lower (@code{pll.ps}).
+ 
+ @item v2sf __builtin_mips_pul_ps (v2sf, v2sf)
+ Pair upper lower (@code{pul.ps}).
+ 
+ @item v2sf __builtin_mips_plu_ps (v2sf, v2sf)
+ Pair lower upper (@code{plu.ps}).
+ 
+ @item v2sf __builtin_mips_puu_ps (v2sf, v2sf)
+ Pair upper upper (@code{puu.ps}).
+ 
+ @item v2sf __builtin_mips_cvt_ps_s (float, float)
+ Convert pair to paired single (@code{cvt.ps.s}).
+ 
+ @item float __builtin_mips_cvt_s_pl (v2sf)
+ Convert pair lower to single (@code{cvt.s.pl}).
+ 
+ @item float __builtin_mips_cvt_s_pu (v2sf)
+ Convert pair upper to single (@code{cvt.s.pu}).
+ 
+ @item v2sf __builtin_mips_abs_ps (v2sf)
+ Absolute value (@code{abs.ps}).
+ 
+ @item v2sf __builtin_mips_alnv_ps (v2sf, v2sf, int)
+ Align variable (@code{alnv.ps}).
+ 
+ @emph{Note:} The value of the third parameter must be 0 or 4
+ modulo 8, otherwise the result will be unpredictable.  Please read the
+ instruction description for details.
+ @end table
+ 
+ The following multi-instruction functions are also available.
+ In each case, @var{cond} can be any of the 16 floating-point conditions:
+ @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, @code{ult},
+ @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, @code{ngl},
+ @code{lt}, @code{nge}, @code{le} or @code{ngt}.
+ 
+ @table @code
+ @item v2sf __builtin_mips_movt_c_@var{cond}_ps (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ @itemx v2sf __builtin_mips_movf_c_@var{cond}_ps (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ Conditional move based on floating point comparison (@code{c.@var{cond}.ps},
+ @code{movt.ps}/@code{movf.ps}).
+ 
+ The @code{movt} functions return the value @var{x} computed by:
+ 
+ @smallexample
+ c.@var{cond}.ps @var{cc},@var{a},@var{b}
+ mov.ps @var{x},@var{c}
+ movt.ps @var{x},@var{d},@var{cc}
+ @end smallexample
+ 
+ The @code{movf} functions are similar but use @code{movf.ps} instead
+ of @code{movt.ps}.
+ 
+ @item int __builtin_mips_upper_c_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ @itemx int __builtin_mips_lower_c_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ Comparison of two paired-single values (@code{c.@var{cond}.ps},
+ @code{bc1t}/@code{bc1f}).
+ 
+ These functions compare @var{a} and @var{b} using @code{c.@var{cond}.ps}
+ and return either the upper or lower half of the result.  For example:
+ 
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_upper_c_eq_ps (a, b))
+   upper_halves_are_equal ();
+ else
+   upper_halves_are_unequal ();
+ 
+ if (__builtin_mips_lower_c_eq_ps (a, b))
+   lower_halves_are_equal ();
+ else
+   lower_halves_are_unequal ();
+ @end smallexample
+ @end table
+ 
+ @node MIPS-3D Built-in Functions
+ @subsubsection MIPS-3D Built-in Functions
+ 
+ The MIPS-3D Application-Specific Extension (ASE) includes additional
+ paired-single instructions that are designed to improve the performance
+ of 3D graphics operations.  Support for these instructions is controlled
+ by the @option{-mips3d} command-line option.
+ 
+ The functions listed below map directly to a particular MIPS-3D
+ instruction.  Please refer to the architecture specification for
+ more details on what each instruction does.
+ 
+ @table @code
+ @item v2sf __builtin_mips_addr_ps (v2sf, v2sf)
+ Reduction add (@code{addr.ps}).
+ 
+ @item v2sf __builtin_mips_mulr_ps (v2sf, v2sf)
+ Reduction multiply (@code{mulr.ps}).
+ 
+ @item v2sf __builtin_mips_cvt_pw_ps (v2sf)
+ Convert paired single to paired word (@code{cvt.pw.ps}).
+ 
+ @item v2sf __builtin_mips_cvt_ps_pw (v2sf)
+ Convert paired word to paired single (@code{cvt.ps.pw}).
+ 
+ @item float __builtin_mips_recip1_s (float)
+ @itemx double __builtin_mips_recip1_d (double)
+ @itemx v2sf __builtin_mips_recip1_ps (v2sf)
+ Reduced precision reciprocal (sequence step 1) (@code{recip1.@var{fmt}}).
+ 
+ @item float __builtin_mips_recip2_s (float, float)
+ @itemx double __builtin_mips_recip2_d (double, double)
+ @itemx v2sf __builtin_mips_recip2_ps (v2sf, v2sf)
+ Reduced precision reciprocal (sequence step 2) (@code{recip2.@var{fmt}}).
+ 
+ @item float __builtin_mips_rsqrt1_s (float)
+ @itemx double __builtin_mips_rsqrt1_d (double)
+ @itemx v2sf __builtin_mips_rsqrt1_ps (v2sf)
+ Reduced precision reciprocal square root (sequence step 1)
+ (@code{rsqrt1.@var{fmt}}).
+ 
+ @item float __builtin_mips_rsqrt2_s (float, float)
+ @itemx double __builtin_mips_rsqrt2_d (double, double)
+ @itemx v2sf __builtin_mips_rsqrt2_ps (v2sf, v2sf)
+ Reduced precision reciprocal square root (sequence step 2)
+ (@code{rsqrt2.@var{fmt}}).
+ @end table
+ 
+ The following multi-instruction functions are also available.
+ In each case, @var{cond} can be any of the 16 floating-point conditions:
+ @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, @code{ult},
+ @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq},
+ @code{ngl}, @code{lt}, @code{nge}, @code{le} or @code{ngt}.
+ 
+ @table @code
+ @item int __builtin_mips_cabs_@var{cond}_s (float @var{a}, float @var{b})
+ @itemx int __builtin_mips_cabs_@var{cond}_d (double @var{a}, double @var{b})
+ Absolute comparison of two scalar values (@code{cabs.@var{cond}.@var{fmt}},
+ @code{bc1t}/@code{bc1f}).
+ 
+ These functions compare @var{a} and @var{b} using @code{cabs.@var{cond}.s}
+ or @code{cabs.@var{cond}.d} and return the result as a boolean value.
+ For example:
+ 
+ @smallexample
+ float a, b;
+ if (__builtin_mips_cabs_eq_s (a, b))
+   true ();
+ else
+   false ();
+ @end smallexample
+ 
+ @item int __builtin_mips_upper_cabs_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ @itemx int __builtin_mips_lower_cabs_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ Absolute comparison of two paired-single values (@code{cabs.@var{cond}.ps},
+ @code{bc1t}/@code{bc1f}).
+ 
+ These functions compare @var{a} and @var{b} using @code{cabs.@var{cond}.ps}
+ and return either the upper or lower half of the result.  For example:
+ 
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_upper_cabs_eq_ps (a, b))
+   upper_halves_are_equal ();
+ else
+   upper_halves_are_unequal ();
+ 
+ if (__builtin_mips_lower_cabs_eq_ps (a, b))
+   lower_halves_are_equal ();
+ else
+   lower_halves_are_unequal ();
+ @end smallexample
+ 
+ @item v2sf __builtin_mips_movt_cabs_@var{cond}_ps (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ @itemx v2sf __builtin_mips_movf_cabs_@var{cond}_ps (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ Conditional move based on absolute comparison (@code{cabs.@var{cond}.ps},
+ @code{movt.ps}/@code{movf.ps}).
+ 
+ The @code{movt} functions return the value @var{x} computed by:
+ 
+ @smallexample
+ cabs.@var{cond}.ps @var{cc},@var{a},@var{b}
+ mov.ps @var{x},@var{c}
+ movt.ps @var{x},@var{d},@var{cc}
+ @end smallexample
+ 
+ The @code{movf} functions are similar but use @code{movf.ps} instead
+ of @code{movt.ps}.
+ 
+ @item int __builtin_mips_any_c_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ @itemx int __builtin_mips_all_c_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ @itemx int __builtin_mips_any_cabs_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ @itemx int __builtin_mips_all_cabs_@var{cond}_ps (v2sf @var{a}, v2sf @var{b})
+ Comparison of two paired-single values
+ (@code{c.@var{cond}.ps}/@code{cabs.@var{cond}.ps},
+ @code{bc1any2t}/@code{bc1any2f}).
+ 
+ These functions compare @var{a} and @var{b} using @code{c.@var{cond}.ps}
+ or @code{cabs.@var{cond}.ps}.  The @code{any} forms return true if either
+ result is true and the @code{all} forms return true if both results are true.
+ For example:
+ 
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_any_c_eq_ps (a, b))
+   one_is_true ();
+ else
+   both_are_false ();
+ 
+ if (__builtin_mips_all_c_eq_ps (a, b))
+   both_are_true ();
+ else
+   one_is_false ();
+ @end smallexample
+ 
+ @item int __builtin_mips_any_c_@var{cond}_4s (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ @itemx int __builtin_mips_all_c_@var{cond}_4s (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ @itemx int __builtin_mips_any_cabs_@var{cond}_4s (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ @itemx int __builtin_mips_all_cabs_@var{cond}_4s (v2sf @var{a}, v2sf @var{b}, v2sf @var{c}, v2sf @var{d})
+ Comparison of four paired-single values
+ (@code{c.@var{cond}.ps}/@code{cabs.@var{cond}.ps},
+ @code{bc1any4t}/@code{bc1any4f}).
+ 
+ These functions use @code{c.@var{cond}.ps} or @code{cabs.@var{cond}.ps}
+ to compare @var{a} with @var{b} and to compare @var{c} with @var{d}.
+ The @code{any} forms return true if any of the four results are true
+ and the @code{all} forms return true if all four results are true.
+ For example:
+ 
+ @smallexample
+ v2sf a, b, c, d;
+ if (__builtin_mips_any_c_eq_4s (a, b, c, d))
+   some_are_true ();
+ else
+   all_are_false ();
+ 
+ if (__builtin_mips_all_c_eq_4s (a, b, c, d))
+   all_are_true ();
+ else
+   some_are_false ();
+ @end smallexample
+ @end table
+ 
  @node PowerPC AltiVec Built-in Functions
  @subsection PowerPC AltiVec Built-in Functions
  

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-10-02 11:35 ` Richard Sandiford
@ 2004-10-05  7:15   ` Richard Sandiford
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2004-10-05  7:15 UTC (permalink / raw)
  To: Fu, Chao-Ying
  Cc: James E Wilson, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Uhler, Mike

Richard Sandiford <rsandifo@redhat.com> writes:
> 2004-10-02  Chao-Ying Fu  <fu@mips.com>
> 	    Richard Sandiford  <rsandifo@redhat.com>
>
> 	* doc/invoke.texi (-mpaired-single): Link to the new description of the
> 	built-in functions.  Document dependencies.
> 	(-mips3d): Add link here too.
> 	* doc/extend.texi (MIPS Paired-Single Support): New section.

No-one seemed to have any objections, so I went ahead and installed this.

Richard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [patch] extend.texi MIPS PS/3D Support
@ 2004-09-22 22:13 Fu, Chao-Ying
  0 siblings, 0 replies; 36+ messages in thread
From: Fu, Chao-Ying @ 2004-09-22 22:13 UTC (permalink / raw)
  To: Devang Patel, Giovanni Bajo
  Cc: Gerald Pfeifer, dorit, Stephens, Nigel, Thekkath, Radhika, Uhler,
	Mike, Jim Wilson, gcc-patches

For MIPS machines, we can put the following.

/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -mips64 -mpaired-single -mhard-float -mfp64" { 
target mipsisa64*-*-* } } */
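
As a rough sketch, a vectorizer test carrying that options line could look like the following. The dg-options comment is the one quoted above, joined onto a single line; the function name copy_floats and the loop are illustrative only and are not taken from the existing testsuite.

```c
/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -mips64 -mpaired-single -mhard-float -mfp64" { target mipsisa64*-*-* } } */

#define N 16

/* A simple copy loop of the kind the vectorizer is expected to
   handle: fixed trip count, unit stride, no dependences between
   iterations.  */
void
copy_floats (float *a, const float *b)
{
  int i;

  for (i = 0; i < N; i++)
    a[i] = b[i];
}
```

On non-MIPS hosts the dg-options comment is just a comment, so the file still compiles and runs as ordinary C.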

Regards,
Chao-ying Fu
MIPS Technologies, Inc.

-----Original Message-----
From: gcc-patches-owner@gcc.gnu.org
[mailto:gcc-patches-owner@gcc.gnu.org]On Behalf Of Devang Patel
Sent: Tuesday, September 21, 2004 6:48 PM
To: Giovanni Bajo
Cc: Fu, Chao-Ying; Gerald Pfeifer; dorit@il.ibm.com; Stephens, Nigel;
Thekkath, Radhika; Uhler, Mike; Jim Wilson; gcc-patches@gcc.gnu.org
Subject: Re: [patch] extend.texi MIPS PS/3D Support



On Sep 21, 2004, at 6:33 PM, Giovanni Bajo wrote:

> Does MIPS PS/3D work also with the autovectorizer?

The autovectorizer checks whether the required vector instruction is
supported by the target. I am not familiar with MIPS PS/3D, but if the
vector insns match then why not? It would be a good idea to update the
vectorization tests for other vector architectures. Right now each test
contains the following powerpc- and x86-specific option settings.

/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -maltivec" 
{ target powerpc*-*-* } } */
/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -msse2" { 
target i?86-*-* x86_64-*-* } } */


-
Devang

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [patch] extend.texi MIPS PS/3D Support
@ 2004-09-22 22:05 Fu, Chao-Ying
  2004-09-23 17:40 ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Fu, Chao-Ying @ 2004-09-22 22:05 UTC (permalink / raw)
  To: Dorit Naishlos, Giovanni Bajo, gcc-patches
  Cc: dpatel, Gerald Pfeifer, Stephens, Nigel, Thekkath, Radhika,
	Uhler, Mike, Jim Wilson

Ok.  We need to put this into "mips.h".  Thanks!

Regards,
Chao-ying Fu
MIPS Technologies, Inc.

Index: mips.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/config/mips/mips.h,v
retrieving revision 1.370
diff -c -3 -p -r1.370 mips.h
*** mips.h	18 Sep 2004 19:19:37 -0000	1.370
--- mips.h	22 Sep 2004 21:07:30 -0000
*************** extern const struct mips_cpu_info *mips_
*** 243,248 ****
--- 243,250 ----
  				((target_flags & MASK_PAIRED_SINGLE) != 0)
  #define TARGET_MIPS3D		((target_flags & MASK_MIPS3D) != 0)
  
+ #define UNITS_PER_SIMD_WORD	(TARGET_PAIRED_SINGLE_FLOAT ? 8 : 0)
+ 
  /* True if we should use NewABI-style relocation operators for
     symbolic addresses.  This is never true for mips16 code,
     which has its own conventions.  */

-----Original Message-----
From: Dorit Naishlos [mailto:DORIT@il.ibm.com]
Sent: Wednesday, September 22, 2004 2:34 AM
To: Giovanni Bajo
Cc: dpatel@apple.com; Fu, Chao-Ying; gcc-patches@gcc.gnu.org; Gerald
Pfeifer; Stephens, Nigel; Thekkath, Radhika; Uhler, Mike; Jim Wilson
Subject: Re: [patch] extend.texi MIPS PS/3D Support



> Does MIPS PS/3D work also with the autovectorizer?

It should, if you define UNITS_PER_SIMD_WORD.
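
To make the connection concrete: with an 8-byte SIMD word, each vector holds two floats, so a vectorized loop conceptually processes two elements per iteration plus a scalar tail. The sketch below writes that out by hand using the generic vector extension. SIMD_BYTES is a hypothetical stand-in for the value of UNITS_PER_SIMD_WORD, and add_arrays is an illustrative name; this is not what the vectorizer literally emits.

```c
#include <string.h>

typedef float v2sf __attribute__ ((vector_size (8)));

/* Hypothetical stand-in for UNITS_PER_SIMD_WORD on a paired-single
   target: 8 bytes, i.e. two single-precision floats per vector.  */
#define SIMD_BYTES 8
#define FLOATS_PER_VECTOR ((int) (SIMD_BYTES / sizeof (float)))

/* dst[i] = x[i] + y[i], two elements at a time where possible.
   memcpy is used for the vector loads and stores so that the
   sketch makes no alignment assumptions about the arrays.  */
void
add_arrays (float *dst, const float *x, const float *y, int n)
{
  int i = 0;

  for (; i + FLOATS_PER_VECTOR <= n; i += FLOATS_PER_VECTOR)
    {
      v2sf vx, vy, vr;

      memcpy (&vx, x + i, sizeof vx);
      memcpy (&vy, y + i, sizeof vy);
      vr = vx + vy;
      memcpy (dst + i, &vr, sizeof vr);
    }

  /* Scalar tail for a leftover element when n is odd.  */
  for (; i < n; i++)
    dst[i] = x[i] + y[i];
}
```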

dorit




"Giovanni Bajo" <giovannibajo@libero.it>
22/09/2004 03:33

To:       "Fu, Chao-Ying" <fu@mips.com>, "Gerald Pfeifer" <gerald@pfeifer.com>,
          <dpatel@apple.com>, Dorit Naishlos/Haifa/IBM@IBMIL
cc:       <nigel@mercury.mips.com>, <radhika@mercury.mips.com>,
          <uhler@mercury.mips.com>, "Jim Wilson" <wilson@specifixinc.com>,
          <gcc-patches@gcc.gnu.org>
Subject:  Re: [patch] extend.texi MIPS PS/3D Support
                                                                                                                                   




Fu, Chao-Ying wrote:

> Hi Jim,
>
> Thanks for your feedback!  Here is the updated patch.
> (I forgot to send my first email to gcc-patches.  This time should be
> ok.)

Thanks for writing this up. Would you please prepare also a patch to
http://gcc.gnu.org/gcc-4.0/changes.html mentioning this change? Moreover,
if Gerald agrees, I think we could mention the PS/3D support in the front
page too (http://gcc.gnu.org/news.html).

Does MIPS PS/3D work also with the autovectorizer?

Giovanni Bajo





^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22 22:05 Fu, Chao-Ying
@ 2004-09-23 17:40 ` Richard Sandiford
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2004-09-23 17:40 UTC (permalink / raw)
  To: Fu, Chao-Ying
  Cc: Dorit Naishlos, Giovanni Bajo, gcc-patches, dpatel,
	Gerald Pfeifer, Stephens, Nigel, Thekkath, Radhika, Uhler, Mike,
	Jim Wilson

"Fu, Chao-Ying" <fu@mips.com> writes:
> Ok.  We need to put this into "mips.h".  Thanks!
> [...]
> Index: mips.h
> ===================================================================
> RCS file: /cvsroot/gcc/gcc/gcc/config/mips/mips.h,v
> retrieving revision 1.370
> diff -c -3 -p -r1.370 mips.h
> *** mips.h	18 Sep 2004 19:19:37 -0000	1.370
> --- mips.h	22 Sep 2004 21:07:30 -0000
> *************** extern const struct mips_cpu_info *mips_
> *** 243,248 ****
> --- 243,250 ----
>   				((target_flags & MASK_PAIRED_SINGLE) != 0)
>   #define TARGET_MIPS3D		((target_flags & MASK_MIPS3D) != 0)
>   
> + #define UNITS_PER_SIMD_WORD	(TARGET_PAIRED_SINGLE_FLOAT ? 8 : 0)
> + 
>   /* True if we should use NewABI-style relocation operators for
>      symbolic addresses.  This is never true for mips16 code,
>      which has its own conventions.  */

Have you tested this?

Richard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [patch] extend.texi MIPS PS/3D Support
@ 2004-09-22  0:21 Fu, Chao-Ying
  2004-09-22  2:46 ` Giovanni Bajo
  2004-09-22  8:53 ` Richard Sandiford
  0 siblings, 2 replies; 36+ messages in thread
From: Fu, Chao-Ying @ 2004-09-22  0:21 UTC (permalink / raw)
  To: James E Wilson, gcc-patches
  Cc: Stephens, Nigel, Thekkath, Radhika, Uhler, Mike

Hi Jim,

Thanks for your feedback!  Here is the updated patch.
(I forgot to send my first email to gcc-patches.  This time should be ok.)

Chao-ying Fu
MIPS Technologies, Inc.

Index: doc/extend.texi
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/doc/extend.texi,v
retrieving revision 1.218
diff -c -3 -p -r1.218 extend.texi
*** doc/extend.texi	17 Sep 2004 17:24:17 -0000	1.218
--- doc/extend.texi	21 Sep 2004 23:39:44 -0000
*************** instructions, but allow the compiler to 
*** 5295,5300 ****
--- 5295,5301 ----
  * ARM Built-in Functions::
  * FR-V Built-in Functions::
  * X86 Built-in Functions::
+ * MIPS Paired-Single Floating Point Instruction Support::
  * PowerPC AltiVec Built-in Functions::
  @end menu
  
*************** v2sf __builtin_ia32_pswapdsf (v2sf)
*** 6178,6183 ****
--- 6179,6776 ----
  v2si __builtin_ia32_pswapdsi (v2si)
  @end smallexample
  
+ @node MIPS Paired-Single Floating Point Instruction Support
+ @subsection MIPS Paired-Single Floating Point Instruction Support
+ 
+ @menu
+ * Built-in Functions for MIPS64 Paired-Single Instructions::
+ * Built-in Functions for MIPS-3D ASE Instructions::
+ @end menu
+ 
+ The MIPS64 Architecture includes a number of paired-single floating point (FP) 
+ instructions.  These paired-single instructions operate on
+ 64-bit floating point registers containing two single floating point values
+ each in two contiguous 32-bits of the register, that is, one in the upper
+ half and one in the lower half.  This type of vector representation of two 
+ single FP values can then be operated upon simultaneously by a single 
+ instruction in a SIMD (single instruction multiple data) fashion, also called 
+ paired-single instructions in MIPS terminology.  GCC supports the 
+ paired-single instructions via the @code{V2SF} machine mode, 
+ generic C arithmetic operations, and built-in functions.  Having this support 
+ allows a user to declare variables in C that use the new SIMD data type.  
+ When these variables are then used in arithmetic operations, the appropriate 
+ paired-single instructions are automatically generated by the compiler.  
+ Many built-in functions are also available that use this data type and
+ generate appropriate paired-single instructions.
+ 
+ To enable paired-single instruction support, the command line option 
+ @option{-mpaired-single} must be used.  Also, the hardware floating point 
+ option @option{-mhard-float} and the 64-bit FP register option 
+ @option{-mfp64} must be turned on either by default or by explicit command 
+ line options.
+ 
+ @smallexample
+ gcc -mips64 -mpaired-single -c hello.c
+ gcc -mips64 -mpaired-single -mhard-float -mfp64 -c hello.c
+ @end smallexample
+ 
+ To use paired-single variables in C, the following @code{typedef} is
+ required.
+ @smallexample
+ typedef float v2sf __attribute__ ((vector_size (8)));
+ @end smallexample
+ 
+ This @code{typedef} defines @code{v2sf} as an 8-byte data type whose base type 
+ is @code{float} (4 bytes), so a @code{v2sf} variable contains two 
+ @code{float}s in a SIMD fashion.
+ 
+ The following illustrates by example, how SIMD variables can be declared,
+ initialized, and used in a C program.  Note that declaring and using SIMD 
+ variables in this way allows the compiler to use the appropriate SIMD 
+ instructions without any explicit direction from the user.
+ 
+ @table @asis
+ @item Variable Declaration
+ @smallexample
+ v2sf a, b, c, d;
+ /* @r{This declares four variables a, b, c, and d as a paired-single data type.}  */
+ @end smallexample
+ 
+ @item Variable Initialization
+ @smallexample
+ v2sf a = @{1.5, 9.1@};
+ v2sf b;
+ float e, f;
+ b = (v2sf) @{e, f@};
+ @end smallexample
+ 
+ NOTE: The CPU endianness determines how paired-single variables are initialized,
+ i.e., which value goes into the upper and which goes into the lower part
+ of the paired-single floating point register.
+ For little-endian CPUs, the first single value goes into the lower part,
+ and the second single value goes into the upper part.  For example,
+ @code{v2sf a = @{1.5, 9.1@};}, @code{1.5} goes into the lower part and 
+ @code{9.1} goes into the upper part.
+ For big-endian CPUs, the first single value goes into the upper part, and
+ the second single value goes into the lower part.  For the same example,
+ @code{v2sf a = @{1.5, 9.1@};}, @code{1.5} goes into the upper and @code{9.1} 
+ goes into the lower part.
+ @end table
+ 
+ The examples below illustrate the usage of the declared SIMD data type 
+ variables that allow the compiler to automatically invoke paired-single
+ instructions.  The instruction invoked in each case is shown in parentheses.
+ 
+ @table @asis
+ @item Assignment (MOV.PS)
+ @smallexample
+ v2sf a, b;
+ a = b;
+ @end smallexample
+ 
+ @item Addition (ADD.PS)
+ @smallexample
+ v2sf a, b, c;
+ a = b + c;
+ @end smallexample
+ 
+ @item Subtraction (SUB.PS)
+ @smallexample
+ v2sf a, b, c;
+ a = b - c;
+ @end smallexample
+ 
+ @item Negation (NEG.PS)
+ @smallexample
+ v2sf a, b;
+ a = - b;
+ @end smallexample
+ 
+ @item Multiplication (MUL.PS)
+ @smallexample
+ v2sf a, b, c;
+ a = b * c;
+ @end smallexample
+ 
+ @item Multiplication and Addition (MADD.PS)
+ @smallexample
+ v2sf a, b, c, d;
+ a = b * c + d;
+ @end smallexample
+ 
+ @item Multiplication and Subtraction (MSUB.PS)
+ @smallexample
+ v2sf a, b, c, d;
+ a = b * c - d;
+ @end smallexample
+ 
+ @item Negation of Multiplication and Addition (NMADD.PS)
+ @smallexample
+ v2sf a, b, c, d;
+ a = - (b * c + d);
+ @end smallexample
+ 
+ @item Negation of Multiplication and Subtraction (NMSUB.PS)
+ @smallexample
+ v2sf a, b, c, d;
+ a = - (b * c - d);
+ @end smallexample
+ 
+ @item Conditional Move Based on Integer Comparison (MOVN.PS, MOVZ.PS)
+ @smallexample
+ int i, j;
+ v2sf a, b, c;
+ a = (i > j) ? b : c;
+ /* Or */
+ if (i > j)
+   a = b;
+ else
+   a = c;
+ @end smallexample
+ @end table
+ 
+ @node Built-in Functions for MIPS64 Paired-Single Instructions
+ @subsubsection Built-in Functions for MIPS64 Paired-Single Instructions 
+ 
+ Built-in functions invoke explicit SIMD instructions by the user in C.
+ They are used in situations when merging paired-single variables,
+ converting between paired-single and float variables, calculating absolute
+ values, conditionally assigning paired-single variables based on floating-point
+ comparison results, and comparing two paired-single variables.
+ Typically, each built-in function refers to a particular paired-single SIMD
+ instruction (and other floating-point instructions if necessary).
+ 
+ @table @code
+ @item v2sf __builtin_mips_pll_ps (v2sf, v2sf)
+ Pair lower lower (PLL.PS).  
+ (Please refer to the architecture specification on the exact function of this 
+ and other paired-single instructions.)
+ 
+ In order to invoke the PLL.PS instruction on some paired-single variables,
+ the user must do the following:  Declare the @code{v2sf} type shown previously,
+ declare variables using that type, then call the built-in function on the 
+ appropriate variables.  The usage of other built-in functions is similar.
+ 
+ @item v2sf __builtin_mips_pul_ps (v2sf, v2sf)
+ Pair upper lower (PUL.PS).
+ 
+ @item v2sf __builtin_mips_plu_ps (v2sf, v2sf)
+ Pair lower upper (PLU.PS).
+ 
+ @item v2sf __builtin_mips_puu_ps (v2sf, v2sf)
+ Pair upper upper (PUU.PS).
+ 
+ @item v2sf __builtin_mips_cvt_ps_s (float, float)
+ Floating point convert pair to paired single (CVT.PS.S).
+ 
+ @item float __builtin_mips_cvt_s_pl (v2sf)
+ Floating point convert pair lower to single floating point (CVT.S.PL).
+ 
+ @item float __builtin_mips_cvt_s_pu (v2sf)
+ Floating point convert pair upper to single floating point (CVT.S.PU).
+ 
+ @item v2sf __builtin_mips_abs_ps (v2sf)
+ Floating point absolute value (ABS.PS).
+ 
+ @item v2sf __builtin_mips_alnv_ps (v2sf, v2sf, int)
+ Floating point align variable (ALNV.PS).
+ 
+ NOTE: The value of the third parameter @code{int} must be 0 or 4 
+ modulo 8, otherwise the result will be unpredictable.  Please read the 
+ instruction description for details.
+ 
+ @item v2sf __builtin_mips_mov*_c_*_ps (v2sf, v2sf, v2sf, v2sf)
+ Conditional move based on floating point comparison (C.*.PS, MOVT.PS/MOVF.PS).
+ 
+ The first * = @code{t}, @code{f}
+ 
+ The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using one 
+ of the 16 comparison operators specified.  Assign the third @code{v2sf} 
+ parameter to the destination, and then conditionally assign (if true or false) 
+ the fourth @code{v2sf} parameter to the destination, potentially overwriting
+ the previous assigned values.
+ 
+ NOTE: The assignments of the upper and lower parts are independent and
+ based on their own comparison results.
+ 
+ Example:
+ @smallexample
+ v2sf a, b, c, d, e, f;
+ e = __builtin_mips_movt_c_eq_ps (a, b, c, d);
+ f = __builtin_mips_movf_c_eq_ps (a, b, c, d);
+ @end smallexample
+ 
+ @item int __builtin_mips_*_c_*_ps(v2sf, v2sf)
+ Floating point comparisons on two paired-single values (C.*.PS, BC1T/BC1F).
+ 
+ The first * = @code{upper}, @code{lower}
+ 
+ The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using one
+ of the 16 comparison operators specified.  Then return 1 if the specified upper
+ or lower comparison result is true, or return 0 if the specified upper
+ or lower comparison result is false.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_upper_c_eq_ps (a, b))
+   upper_is_true ();
+ else
+   upper_is_false ();
+ 
+ if (__builtin_mips_lower_c_eq_ps (a, b))
+   lower_is_true ();
+ else
+   lower_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b;
+ int i, j;
+ i = __builtin_mips_upper_c_eq_ps (a, b);
+ if (i)
+   upper_is_true ();
+ else
+   upper_is_false ();
+ 
+ j = __builtin_mips_lower_c_eq_ps (a, b);
+ if (j)
+   lower_is_true ();
+ else
+   lower_is_false ();
+ @end smallexample
+ 
+ @end table
+ 
+ @node Built-in Functions for MIPS-3D ASE Instructions
+ @subsubsection Built-in Functions for MIPS-3D ASE Instructions
+ 
+ The MIPS-3D Application-Specific Extension (ASE) includes 13 floating point 
+ instructions designed to improve the performance of 3D graphics geometry 
+ operations.  GCC supports these instructions via built-in functions.  
+ For a description of these instructions, refer to the appropriate architecture 
+ reference manual.
+ 
+ To enable MIPS-3D support, the command line option @option{-mips3d} must be 
+ used.  Also, the hardware floating point option @option{-mhard-float}
+ and the 64-bit FP register option @option{-mfp64} must be turned on either by 
+ default or by explicit command line options.  Note that using @option{-mips3d}
+ implies @option{-mpaired-single}.
+ 
+ @smallexample
+ gcc -mips64 -mips3d -c hello.c
+ gcc -mips64 -mips3d -mhard-float -mfp64 -c hello.c
+ @end smallexample
+ 
+ The MIPS-3D built-in functions are as follows.
+ 
+ @table @code
+ @item v2sf __builtin_mips_addr_ps (v2sf, v2sf)
+ Floating point reduction add (ADDR.PS).
+ 
+ @item v2sf __builtin_mips_mulr_ps (v2sf, v2sf)
+ Floating point reduction multiply (MULR.PS).
+ 
+ @item v2sf __builtin_mips_cvt_pw_ps (v2sf)
+ Floating point convert paired single to paired word (CVT.PW.PS).
+ 
+ @item v2sf __builtin_mips_cvt_ps_pw (v2sf)
+ Floating point convert paired word to paired single (CVT.PS.PW).
+ 
+ @item float __builtin_mips_recip1_s (float)
+ Floating point reduced precision reciprocal (sequence step 1) (RECIP1.S).
+ 
+ @item double __builtin_mips_recip1_d (double)
+ Floating point reduced precision reciprocal (sequence step 1) (RECIP1.D).
+ 
+ @item v2sf __builtin_mips_recip1_ps (v2sf)
+ Floating point reduced precision reciprocal (sequence step 1) (RECIP1.PS).
+ 
+ @item float __builtin_mips_recip2_s (float, float)
+ Floating point reduced precision reciprocal (sequence step 2) (RECIP2.S).
+ 
+ @item double __builtin_mips_recip2_d (double, double)
+ Floating point reduced precision reciprocal (sequence step 2) (RECIP2.D).
+ 
+ @item v2sf __builtin_mips_recip2_ps (v2sf, v2sf)
+ Floating point reduced precision reciprocal (sequence step 2) (RECIP2.PS).
+ 
+ @item float __builtin_mips_rsqrt1_s (float)
+ Floating point reduced precision reciprocal square root (sequence step 1) (RSQRT1.S).
+ 
+ @item double __builtin_mips_rsqrt1_d (double)
+ Floating point reduced precision reciprocal square root (sequence step 1) (RSQRT1.D).
+ 
+ @item v2sf __builtin_mips_rsqrt1_ps (v2sf)
+ Floating point reduced precision reciprocal square root (sequence step 1) (RSQRT1.PS).
+ 
+ @item float __builtin_mips_rsqrt2_s (float, float)
+ Floating point reduced precision reciprocal square root (sequence step 2) (RSQRT2.S).
+ 
+ @item double __builtin_mips_rsqrt2_d (double, double)
+ Floating point reduced precision reciprocal square root (sequence step 2) (RSQRT2.D).
+ 
+ @item v2sf __builtin_mips_rsqrt2_ps (v2sf, v2sf)
+ Floating point reduced precision reciprocal square root (sequence step 2) (RSQRT2.PS).
+ 
+ @item int __builtin_mips_*_c_*_ps(v2sf, v2sf)
+ Floating point comparisons on two paired-single values (C.*.PS, 
+ BC1ANY2T/BC1ANY2F).
+ 
+ The first * = @code{any}, @code{all}
+ 
+ The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using 
+ one of the 16 comparison operators specified.  The @code{any} version
+ returns 1 if any comparison result is true, or 0 if all comparison 
+ results are false.  The @code{all} version returns 1 if all comparison
+ results are true, or 0 if any comparison result is false.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_any_c_eq_ps (a, b))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_c_eq_ps (a, b))
+   all_is_true ();
+ else
+   any_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b;
+ int i, j;
+ i = __builtin_mips_any_c_eq_ps (a, b);
+ j = __builtin_mips_all_c_eq_ps (a, b);
+ @end smallexample
+ 
+ NOTE: The remaining built-in functions work like this one.  They differ
+ only in the compare operator (normal or absolute), the basic data type,
+ and whether they use two or four consecutive condition bits.
+ 
+ @item int __builtin_mips_*_cabs_*_ps (v2sf, v2sf)
+ Floating point absolute comparisons on two paired-single values (CABS.*.PS, 
+ BC1ANY2T/BC1ANY2F/BC1T/BC1F).
+ 
+ The first * = @code{any}, @code{all}, @code{upper}, @code{lower}
+ 
+ The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{v2sf} parameters 
+ using one of the 16 comparison operators specified.  Return 1 if any result
+ (@code{any}), all results (@code{all}), the upper result (@code{upper}), or
+ the lower result (@code{lower}) is true; otherwise return 0.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b;
+ if (__builtin_mips_any_cabs_eq_ps (a, b))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_cabs_eq_ps (a, b))
+   all_is_true ();
+ else
+   any_is_false ();
+ 
+ if (__builtin_mips_upper_cabs_eq_ps (a, b))
+   upper_is_true ();
+ else
+   upper_is_false ();
+ 
+ if (__builtin_mips_lower_cabs_eq_ps (a, b))
+   lower_is_true ();
+ else
+   lower_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b;
+ int i, j, k, l;
+ i = __builtin_mips_any_cabs_eq_ps (a, b);
+ j = __builtin_mips_all_cabs_eq_ps (a, b);
+ k = __builtin_mips_upper_cabs_eq_ps (a, b);
+ l = __builtin_mips_lower_cabs_eq_ps (a, b);
+ @end smallexample
+ 
+ @item int __builtin_mips_cabs_*_s (float, float)
+ Floating point absolute comparisons on single (CABS.*.S).
+ 
+ * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{float} parameters 
+ using one of the 16 comparison operators specified.  Then return 1 if true or 
+ return 0 if false.
+ 
+ Example 1:
+ @smallexample
+ float a, b;
+ if (__builtin_mips_cabs_eq_s (a, b))
+   true ();
+ else
+   false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ float a, b;
+ int i;
+ i = __builtin_mips_cabs_eq_s (a, b);
+ @end smallexample
+ 
+ @item int __builtin_mips_cabs_*_d (double, double)
+ Floating point absolute comparisons on double (CABS.*.D).
+ 
+ * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{double} parameters
+ using one of the 16 comparison operators specified.  Then return 1 if true or 
+ return 0 if false.
+ 
+ Example 1:
+ @smallexample
+ double a, b;
+ if (__builtin_mips_cabs_eq_d (a, b))
+   true ();
+ else
+   false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ double a, b;
+ int i;
+ i = __builtin_mips_cabs_eq_d (a, b);
+ @end smallexample
+ 
+ @item v2sf __builtin_mips_mov*_cabs_*_ps (v2sf, v2sf, v2sf, v2sf)
+ Conditional move based on floating point absolute comparison (CABS.*.PS, 
+ MOVT.PS/MOVF.PS).
+ 
+ The first * = @code{t}, @code{f}
+ 
+ The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{v2sf} parameters 
+ using one of the 16 comparison operators specified.  Assign the third
+ @code{v2sf} parameter to the destination, and then conditionally assign (if 
+ the comparison is true for @code{t}, or false for @code{f}) the fourth
+ @code{v2sf} parameter, potentially overwriting the previously assigned values.
+ 
+ NOTE: The assignments of the upper and lower parts are independent and based
+       on their own comparison results.
+ 
+ Example:
+ @smallexample
+ v2sf a, b, c, d, e, f;
+ e = __builtin_mips_movt_cabs_eq_ps (a, b, c, d);
+ f = __builtin_mips_movf_cabs_eq_ps (a, b, c, d);
+ @end smallexample
+ 
+ @item int __builtin_mips_*_c_*_4s (v2sf, v2sf, v2sf, v2sf)
+ Floating point comparisons on four paired-single values (C.*.PS, 
+ BC1ANY4T/BC1ANY4F).
+ 
+ The first * = @code{any}, @code{all}
+ 
+ The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Compare the first and the second @code{v2sf} parameters using one 
+ of the 16 comparison operators specified.  Compare the third and the fourth 
+ @code{v2sf} parameters using the same comparison operator.  
+ Return 1 if any (@code{any}) or all (@code{all}) of the four comparison
+ results are true; otherwise return 0.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b, c, d;
+ if (__builtin_mips_any_c_eq_4s (a, b, c, d))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_c_eq_4s (a, b, c, d))
+   all_is_true ();
+ else
+   any_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b, c, d;
+ int i, j;
+ i = __builtin_mips_any_c_eq_4s (a, b, c, d);
+ j = __builtin_mips_all_c_eq_4s (a, b, c, d);
+ @end smallexample
+ 
+ @item int __builtin_mips_*_cabs_*_4s (v2sf, v2sf, v2sf, v2sf)
+ Floating point absolute comparisons on four paired-single values (CABS.*.PS, 
+ BC1ANY4T/BC1ANY4F).
+ 
+ The first * = @code{any}, @code{all}
+ 
+ The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
+ @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
+ @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}
+ 
+ Description: Absolute compare the first and the second @code{v2sf} parameters 
+ using one of the 16 comparison operators specified.  Absolute compare the 
+ third and the fourth @code{v2sf} parameters using the same comparison 
+ operator.  Return 1 if any (@code{any}) or all (@code{all}) of the four
+ comparison results are true; otherwise return 0.
+ 
+ Example 1:
+ @smallexample
+ v2sf a, b, c, d;
+ if (__builtin_mips_any_cabs_eq_4s (a, b, c, d))
+   any_is_true ();
+ else
+   all_is_false ();
+ 
+ if (__builtin_mips_all_cabs_eq_4s (a, b, c, d))
+   all_is_true ();
+ else
+   any_is_false ();
+ @end smallexample
+ 
+ Example 2:
+ @smallexample
+ v2sf a, b, c, d;
+ int i, j;
+ i = __builtin_mips_any_cabs_eq_4s (a, b, c, d);
+ j = __builtin_mips_all_cabs_eq_4s (a, b, c, d);
+ @end smallexample
+ 
+ @end table
+ 
  @node PowerPC AltiVec Built-in Functions
  @subsection PowerPC AltiVec Built-in Functions
  

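For readers without a MIPS toolchain, the return values that the patch
above describes for the paired-single comparison built-ins can be modelled
in portable C.  This is only a sketch of the documented semantics for the
eq condition: the model_* names and abs_f are hypothetical helpers, not
part of GCC, and each paired-single value is represented as a plain
two-element float array.

```c
/* Hypothetical models of the documented return values; these are not the
   real builtins.  a and b each hold the two floats of one paired-single
   value.  */

/* Absolute value helper, used by the cabs models below.  */
float
abs_f (float x)
{
  return x < 0.0f ? -x : x;
}

/* Models __builtin_mips_any_c_eq_ps: 1 if either comparison is true.  */
int
model_any_c_eq_ps (const float a[2], const float b[2])
{
  return (a[0] == b[0]) || (a[1] == b[1]);
}

/* Models __builtin_mips_all_c_eq_ps: 1 only if both comparisons are true.  */
int
model_all_c_eq_ps (const float a[2], const float b[2])
{
  return (a[0] == b[0]) && (a[1] == b[1]);
}

/* Models __builtin_mips_any_cabs_eq_ps: compares absolute values.  */
int
model_any_cabs_eq_ps (const float a[2], const float b[2])
{
  return (abs_f (a[0]) == abs_f (b[0])) || (abs_f (a[1]) == abs_f (b[1]));
}

/* Models __builtin_mips_all_cabs_eq_ps: compares absolute values.  */
int
model_all_cabs_eq_ps (const float a[2], const float b[2])
{
  return (abs_f (a[0]) == abs_f (b[0])) && (abs_f (a[1]) == abs_f (b[1]));
}
```

With a = {-1.0, 2.0} and b = {1.0, -2.0}, the plain models report no match,
while the cabs models report that both elements match.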


-----Original Message-----
From: James E Wilson [mailto:wilson@specifixinc.com]
Sent: Tuesday, September 21, 2004 3:20 PM
To: Fu, Chao-Ying
Cc: Stephens, Nigel; Thekkath, Radhika; Uhler, Mike
Subject: Re: [patch] extend.texi MIPS PS/3D Support


This needs to be sent to the gcc-patches mailing list if it hasn't
already.  Eric Christopher and/or Richard Sandiford will likely want to
comment on it.

Improving the docs is one of the things I am still on the hook for, so
I'm glad to see that you are helping out here.

I got sick over the weekend, so I may not be very responsive for a
little while.  I've only skimmed through this.  It looks pretty
reasonable to me, but I am a poor judge of what Richard wants.

The text for the -mpaired-single and -mips3d options could be improved
to mention that they enable built-in functions.

The first hunk of the patch doesn't mention Built-in Functions, which
looks a little odd next to all of the other Built-in Functions entries,
but I see the next menu down has the PS and MIPS-3D Built-in Functions,
so it seems reasonable.  Though others may disagree.  We won't know
until you post it.

The sentence
"includes several paired-single floating point (FP) instructions"
looks odd to me, as there are more than several such instructions. 
Maybe "a number of" instead of "several".

"Gcc supports the paired-single instructions via a new SIMD data type
@code{v2sf}."
There is no such data type.  The user must create one via a typedef, and
the user can give it any name.
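
To illustrate the point about the typedef, here is a minimal sketch that
uses GCC's generic vector_size attribute with a name other than v2sf.  The
name pair_f32 and the helper add_pairs are made up for this example, and
the code relies only on the generic vector extension, so it is not
specific to MIPS.

```c
#include <string.h>

/* Any name works; the type is created by the typedef, not built in.  */
typedef float pair_f32 __attribute__ ((vector_size (8)));

/* Add two pairs element-wise and store the two sums in out[0] and out[1].  */
void
add_pairs (const float x[2], const float y[2], float out[2])
{
  pair_f32 a, b, sum;

  memcpy (&a, x, sizeof a);
  memcpy (&b, y, sizeof b);
  sum = a + b;                  /* One vector addition on both elements.  */
  memcpy (out, &sum, sizeof sum);
}
```

Which instruction the compiler picks for the addition depends on the
target and options; on other targets it may simply be scalar code.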

"Note that declaring and using SIMD variables in this way allow the
compiler..."
Should be allows not allow.

"as a paired-single data type. */"
Should be two spaces after a period as per GNU coding conventions.

The "Variable Declaration", "Variable Initialization", "Assignment", etc
stuff looks strange by itself.  Maybe use @table/@item/@end here to give
it some structure?

There is no mention of the SB-1 extensions, but I don't expect you to
write that up.  I can add that on top of your docs after your docs get
approved and installed.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com



* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22  0:21 Fu, Chao-Ying
@ 2004-09-22  2:46 ` Giovanni Bajo
  2004-09-22  3:14   ` Devang Patel
                     ` (3 more replies)
  2004-09-22  8:53 ` Richard Sandiford
  1 sibling, 4 replies; 36+ messages in thread
From: Giovanni Bajo @ 2004-09-22  2:46 UTC (permalink / raw)
  To: Fu, Chao-Ying, Gerald Pfeifer, dpatel, dorit
  Cc: nigel, radhika, uhler, Jim Wilson, gcc-patches

Fu, Chao-Ying wrote:

> Hi Jim,
>
> Thanks for your feedback!  Here is the updated patch.
> (I forgot to send my first email to gcc-patches.  This time should be
> ok.)

Thanks for writing this up. Would you please prepare also a patch to
http://gcc.gnu.org/gcc-4.0/changes.html mentioning this change? Moreover, if
Gerald agrees, I think we could mention the PS/3D support in the front page too
(http://gcc.gnu.org/news.html).

Does MIPS PS/3D work also with the autovectorizer?

Giovanni Bajo


* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22  2:46 ` Giovanni Bajo
@ 2004-09-22  3:14   ` Devang Patel
  2004-09-22 11:24     ` Dorit Naishlos
  2004-09-22  7:36   ` Richard Sandiford
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: Devang Patel @ 2004-09-22  3:14 UTC (permalink / raw)
  To: Giovanni Bajo
  Cc: Fu, Chao-Ying, Gerald Pfeifer, dorit, nigel, radhika, uhler,
	Jim Wilson, gcc-patches

On Sep 21, 2004, at 6:33 PM, Giovanni Bajo wrote:

> Does MIPS PS/3D work also with the autovectorizer?

Autovectorizer checks if required vector instruction is supported by 
target or not. I am not familiar with MIPS PS/3D but if vector insns 
match then why not? It would be good idea to update vectorization tests 
for other vector architectures. Right now each test contains the following
powerpc and x86 specific option settings.

/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -maltivec" 
{ target powerpc*-*-* } } */
/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -msse2" { 
target i?86-*-* x86_64-*-* } } */
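
A MIPS entry in the same style might look like the sketch below.  It is
hypothetical, not an existing testsuite directive: the mips*-*-* target
selector is an assumption, and -mips64 -mpaired-single are simply the
options used elsewhere in this thread.  The loop is the kind of simple
copy the vectorizer tests exercise; copy_floats is a made-up name.

```c
/* Hypothetical MIPS counterpart of the two dg-options lines above.  */
/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -mips64 -mpaired-single" { target mips*-*-* } } */

#define N 16

/* Copy N floats; a candidate loop for the vectorizer.  */
void
copy_floats (float *dst, const float *src)
{
  int i;

  for (i = 0; i < N; i++)
    dst[i] = src[i];
}
```

Whether this loop is actually vectorized on MIPS depends on the state of
the back end; another message in this thread reports an internal compiler
error with these options.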

-
Devang


* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22  3:14   ` Devang Patel
@ 2004-09-22 11:24     ` Dorit Naishlos
  0 siblings, 0 replies; 36+ messages in thread
From: Dorit Naishlos @ 2004-09-22 11:24 UTC (permalink / raw)
  To: Devang Patel
  Cc: Giovanni Bajo, Fu, Chao-Ying, Gerald Pfeifer, nigel, radhika,
	uhler, Jim Wilson, gcc-patches


> It would be good idea to update vectorization tests
> for other vector architectures.

Absolutely. When I first introduced the tests I enabled them only for
targets I have access to and could verify that they pass, but people are
welcome to enable the tests for other relevant architectures (x86_64-*-*
for example was added thanks to Andi Kleen's initiative).

dorit



From: Devang Patel <dpatel@apple.com>
Sent: 22/09/2004 03:48
To: Giovanni Bajo <giovannibajo@libero.it>
Cc: "Fu, Chao-Ying" <fu@mips.com>, Gerald Pfeifer <gerald@pfeifer.com>,
    Dorit Naishlos/Haifa/IBM@IBMIL, nigel@mercury.mips.com,
    radhika@mercury.mips.com, uhler@mercury.mips.com,
    Jim Wilson <wilson@specifixinc.com>, gcc-patches@gcc.gnu.org
Subject: Re: [patch] extend.texi MIPS PS/3D Support





On Sep 21, 2004, at 6:33 PM, Giovanni Bajo wrote:

> Does MIPS PS/3D work also with the autovectorizer?

Autovectorizer checks if required vector instruction is supported by
target or not. I am not familiar with MIPS PS/3D but if vector insns
match then why not? It would be good idea to update vectorization tests
for other vector architectures. Right now each test contains the following
powerpc and x86 specific option settings.

/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -maltivec"
{ target powerpc*-*-* } } */
/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-stats -msse2" {
target i?86-*-* x86_64-*-* } } */


-
Devang




* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22  2:46 ` Giovanni Bajo
  2004-09-22  3:14   ` Devang Patel
@ 2004-09-22  7:36   ` Richard Sandiford
  2004-09-22 11:06   ` Dorit Naishlos
  2004-10-26 14:29   ` Gerald Pfeifer
  3 siblings, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2004-09-22  7:36 UTC (permalink / raw)
  To: Giovanni Bajo
  Cc: Fu, Chao-Ying, Gerald Pfeifer, dpatel, dorit, nigel, radhika,
	uhler, Jim Wilson, gcc-patches

"Giovanni Bajo" <giovannibajo@libero.it> writes:
> Fu, Chao-Ying wrote:
>> Hi Jim,
>>
>> Thanks for your feedback!  Here is the updated patch.
>> (I forgot to send my first email to gcc-patches.  This time should be
>> ok.)
>
> Thanks for writing this up. Would you please prepare also a patch to
> http://gcc.gnu.org/gcc-4.0/changes.html mentioning this change?

FWIW, I was planning to do this soon, alongside the other MIPS changes.
(Note that there are still some ABI fixes pending.)

Richard


* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22  2:46 ` Giovanni Bajo
  2004-09-22  3:14   ` Devang Patel
  2004-09-22  7:36   ` Richard Sandiford
@ 2004-09-22 11:06   ` Dorit Naishlos
  2004-10-26 14:29   ` Gerald Pfeifer
  3 siblings, 0 replies; 36+ messages in thread
From: Dorit Naishlos @ 2004-09-22 11:06 UTC (permalink / raw)
  To: Giovanni Bajo
  Cc: dpatel, Fu, Chao-Ying, gcc-patches, Gerald Pfeifer, nigel,
	radhika, uhler, Jim Wilson


> Does MIPS PS/3D work also with the autovectorizer?

It should, if you define UNITS_PER_SIMD_WORD.
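
As a rough illustration of what such a definition could look like in a
target header: the sketch below makes the SIMD word 8 bytes (the size of a
v2sf) when paired-single support is enabled.  TARGET_PAIRED_SINGLE_FLOAT
is an assumed name for "the -mpaired-single option is on", and the two
stand-in definitions exist only so that the fragment compiles by itself;
in GCC they would come from the MIPS back end.

```c
/* Stand-ins so this fragment is self-contained (assumptions).  */
#ifndef UNITS_PER_WORD
#define UNITS_PER_WORD 8
#endif
#ifndef TARGET_PAIRED_SINGLE_FLOAT
#define TARGET_PAIRED_SINGLE_FLOAT 1
#endif

/* Width in bytes of the vectors the vectorizer should try to use:
   8 bytes (two floats) with paired-single, else one machine word.  */
#define UNITS_PER_SIMD_WORD \
  (TARGET_PAIRED_SINGLE_FLOAT ? 8 : UNITS_PER_WORD)
```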

dorit




From: "Giovanni Bajo" <giovannibajo@libero.it>
Sent: 22/09/2004 03:33
To: "Fu, Chao-Ying" <fu@mips.com>, "Gerald Pfeifer" <gerald@pfeifer.com>,
    <dpatel@apple.com>, Dorit Naishlos/Haifa/IBM@IBMIL
Cc: <nigel@mercury.mips.com>, <radhika@mercury.mips.com>,
    <uhler@mercury.mips.com>, "Jim Wilson" <wilson@specifixinc.com>,
    <gcc-patches@gcc.gnu.org>
Subject: Re: [patch] extend.texi MIPS PS/3D Support




Fu, Chao-Ying wrote:

> Hi Jim,
>
> Thanks for your feedback!  Here is the updated patch.
> (I forgot to send my first email to gcc-patches.  This time should be
> ok.)

Thanks for writing this up. Would you please prepare also a patch to
http://gcc.gnu.org/gcc-4.0/changes.html mentioning this change? Moreover, if
Gerald agrees, I think we could mention the PS/3D support in the front page too
(http://gcc.gnu.org/news.html).

Does MIPS PS/3D work also with the autovectorizer?

Giovanni Bajo






* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22  2:46 ` Giovanni Bajo
                     ` (2 preceding siblings ...)
  2004-09-22 11:06   ` Dorit Naishlos
@ 2004-10-26 14:29   ` Gerald Pfeifer
  3 siblings, 0 replies; 36+ messages in thread
From: Gerald Pfeifer @ 2004-10-26 14:29 UTC (permalink / raw)
  To: Giovanni Bajo
  Cc: Fu, Chao-Ying, dpatel, dorit, nigel, radhika, uhler, Jim Wilson,
	gcc-patches

On Wed, 22 Sep 2004, Giovanni Bajo wrote:
> Thanks for writing this up. Would you please prepare also a patch to
> http://gcc.gnu.org/gcc-4.0/changes.html mentioning this change? Moreover, if
> Gerald agrees, I think we could mention the PS/3D support in the front page too
> (http://gcc.gnu.org/news.html).

You know that I'm all for more announcements.  If this constitutes a
significant contribution/amount of work (which Jim or some other MIPS
maintainer should know best), sure.

(Sorry for the late response -- that one hid in the post-holidays pile.)

Gerald


* Re: [patch] extend.texi MIPS PS/3D Support
  2004-09-22  0:21 Fu, Chao-Ying
  2004-09-22  2:46 ` Giovanni Bajo
@ 2004-09-22  8:53 ` Richard Sandiford
  1 sibling, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2004-09-22  8:53 UTC (permalink / raw)
  To: Fu, Chao-Ying
  Cc: James E Wilson, gcc-patches, Stephens, Nigel, Thekkath, Radhika,
	Uhler, Mike

This is awesome.  Thanks a lot for doing this.

Some very minor niggles:

"Fu, Chao-Ying" <fu@mips.com> writes:
> + * MIPS Paired-Single Floating Point Instruction Support::

Maybe drop "Instruction"?

> + @menu
> + * Built-in Functions for MIPS64 Paired-Single Instructions::
> + * Built-in Functions for MIPS-3D ASE Instructions::
> + @end menu

I'm not sure how canonical it is to have the section start with a menu
and be followed by a lengthy (but very good!) overview.  It might well
be OK, but maybe a docs maintainer can comment.

Perhaps a lot of the overview could go in a subsection of its own,
alongside the two nodes above.

> + The MIPS64 Architecture includes a number of paired-single floating point (FP) 
> + instructions.  These paired-single instructions operate on
> + 64-bit floating point registers containing two single floating point values
> + each in two contiguous 32-bits of the register, that is, one in the upper
> + half and one in the lower half.  This type of vector representation of two 
> + single FP values can then be operated upon simultaneously by a single 
> + instruction in a SIMD (single instruction multiple data) fashion, also called 
> + paired-single instructions in the MIPS sentence.  GCC supports the 
                                  ^^^^^^^^^^^^^^^^^
I think you mean "MIPS terminology", or something like that.  Although
everything after "also called..." is a little redundant since you've
already introduced the term "paired-single instructions".

> + To enable paired-single instruction support, the command line option 
> + @option{-mpaired-single} must be used.  Also, the hardware floating point 
> + option @option{-mhard-float} and the 64-bit FP register option 
> + @option{-mfp64} must be turned on either by default or by explicit command 
> + line options.
> + 
> + @smallexample
> + gcc -mips64 -mpaired-single -c hello.c
> + gcc -mips64 -mpaired-single -mhard-float -mfp64 -c hello.c
> + @end smallexample
> + 
> + To use paired-single variables in C, the following @code{typedef} is
> + required.
> + @smallexample
> + typedef float v2sf __attribute__ ((vector_size (8)));
> + @end smallexample
> + 
> + This @code{typedef} defines @code{v2sf} as an 8-byte data type which base type 
> + is @code{float} (4 bytes), so a @code{v2sf} variable contains two 
> + @code{float}s in a SIMD fashion.
> + 
> + The following illustrates by example, how SIMD variables can be declared,
> + initialized, and used in a C program.  Note that declaring and using SIMD 
> + variables in this way allows the compiler to use the appropriate SIMD 
> + instructions without any explicit direction from the user.
> + 
> + @table @asis
> + @item Variable Declaration
> + @smallexample
> + v2sf a, b, c, d;
> + /* @r{This declares four variables a, b, c, and d as a paired-single data type.}  */

I suspect that this should really be "@code{a}, @code{b}, ...".

> + @end smallexample
> + 
> + @item Variable Initialization
> + @smallexample
> + v2sf a = @{1.5, 9.1@};
> + v2sf b;
> + float e, f;
> + b = (v2sf) @{e, f@};
> + @end smallexample
> + 
> + NOTE: The CPU endianness determines how paired-single variables are initialized,

I think this should be @emph{Note:}.

> + i.e., which value goes into the upper and which goes into the lower part
> + of the paired-single floating point register.
> + For little-endian CPUs, the first single value goes into the lower part,
> + and the second single value goes into the upper part.  For example,
> + @code{v2sf a = @{1.5, 9.1@};}, @code{1.5} goes into the lower part and 
> + @code{9.1} goes into the upper part.

How about:

    For example, @code{v2sf a = @{1.5, 9.1@};} will put @code{1.5} into
    the lower part and @code{9.1} into the upper part.

> + The examples below illustrate the usage of the declared SIMD data type 
> + variables that allow the compiler to automatically invoke paired-single
> + instructions.  The instruction invoked in each case is shown in parentheses.

The first sentence doesn't sound quite right, but I can't think of
anything better right now.

> + 
> + @table @asis
> + @item Assignment (MOV.PS)

Maybe better to use "(@code{mov.ps})", here and elsewhere?

> + @item Conditional Move Based on Integer Comparison (MOVN.PS, MOVZ.PS)
> + @smallexample
> + int i, j;
> + v2sf a, b, c;
> + a = (i > j) ? b : c;
> + /* Or */
> + if (i > j)
> +   a = b;
> + else
> +   a = c;
> + @end smallexample
> + @end table

Apparently that should be "@r{Or}".  So many things to remember... ;)

> + @node Built-in Functions for MIPS64 Paired-Single Instructions
> + @subsubsection Built-in Functions for MIPS64 Paired-Single Instructions 
> + 
> + Built-in functions invoke explicit SIMD instructions by the user in C.

Maybe something like "Built-in functions can be used to explicitly
invoke a particular SIMD instruction"?

> + They are used in situations when merging paired-single variables,
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
Suggest "to merge" (with rest of sentence changing accordingly).

> + NOTE: The value of the third parameter @code{int} must be 0 or 4 

Another @emph{Note:}.

> + @item v2sf __builtin_mips_mov*_c_*_ps (v2sf, v2sf, v2sf, v2sf)
> + Conditional move based on floating point comparison (C.*.PS, MOVT.PS/MOVF.PS).
> + 
> + The first * = @code{t}, @code{f}
> + 
> + The second * = @code{f}, @code{un}, @code{eq}, @code{ueq}, @code{olt}, 
> + @code{ult}, @code{ole}, @code{ule}, @code{sf}, @code{ngle}, @code{seq}, 
> + @code{ngl}, @code{lt}, @code{nge}, @code{le}, @code{ngt}

Probably better to use @var{...} instead of "*" here.  E.g.:

__builtin_mips_mov@var{tf}_c_@var{cond}_ps
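
For reference, the per-element behaviour that the patch text describes for
these conditional moves (assign the third parameter, then conditionally
overwrite with the fourth, independently for each half) can be modelled in
portable C.  This is a sketch of one reading of that description for the
eq condition only; model_movt_c_eq_ps and model_movf_c_eq_ps are
hypothetical names, and each v2sf is represented as a two-element float
array.

```c
/* Hypothetical models, not the real builtins.  */

/* movt: start from c, then take d in each element where a == b is true.  */
void
model_movt_c_eq_ps (const float a[2], const float b[2],
                    const float c[2], const float d[2], float out[2])
{
  int i;

  for (i = 0; i < 2; i++)
    out[i] = (a[i] == b[i]) ? d[i] : c[i];
}

/* movf: start from c, then take d in each element where a == b is false.  */
void
model_movf_c_eq_ps (const float a[2], const float b[2],
                    const float c[2], const float d[2], float out[2])
{
  int i;

  for (i = 0; i < 2; i++)
    out[i] = (a[i] == b[i]) ? c[i] : d[i];
}
```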

Richard

