From: Gerald Pfeifer
To: gcc-patches@gcc.gnu.org
Date: Sun, 26 Aug 2018 11:16:00 -0000
Subject: Re: [wwwdocs] Replace by id= attributes in all of projects/

On Sun, 29 Jul 2018, Gerald Pfeifer wrote:
> ...and avoid a few that weren't referenced.
>
> This is the next step in cleaning up and simplifying our pages for
> a transition to the (simpler) HTML 5.

Turns out that here, too, there were quite a few I missed, including
some rather creative ones such as

User Hints

On the way I made the labeling of examples quite a bit more consistent.

Applied.

Gerald

Index: projects/tree-ssa/vectorization.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/tree-ssa/vectorization.html,v
retrieving revision 1.36
diff -u -r1.36 vectorization.html
--- projects/tree-ssa/vectorization.html 29 Jul 2018 20:43:43 -0000 1.36
+++ projects/tree-ssa/vectorization.html 26 Aug 2018 10:45:12 -0000
@@ -159,12 +159,13 @@
 as loop vectorization. Basic block SLP is enabled by default at -O3 and when -ftree-vectorize is enabled.

-Vectorizable
- Loops
+Vectorizable Loops

 "feature" indicates the vectorization capabilities demonstrated by the
- example. example1:
+ example.
+
+Example 1:
 int a[256], b[256], c[256];
@@ -175,7 +176,9 @@
     a[i] = b[i] + c[i];
   }
 }
-example2:
+
+
+Example 2:
 int a[256], b[256], c[256];
@@ -194,7 +197,9 @@
       a[i] = b[i]&c[i]; i++;
    }
 }
-example3:
+
+
+Example 3:
 typedef int aint __attribute__ ((__aligned__(16)));
@@ -205,7 +210,9 @@
       *p++ = *q++;
    }
 }
-example4:
+
+
+Example 4:
 typedef int aint __attribute__ ((__aligned__(16)));
@@ -230,7 +237,9 @@
       b[i] = (j > MAX ? MAX : 0);
    }
 }
-example5:
+
+
+Example 5:
 struct a {
@@ -241,8 +250,9 @@
     /* feature: support for alignable struct access  */
     s.ca[i] = 5;
   }
-example6
-(gfortran):
+
+
+Example 6: gfortran:
 DIMENSION A(1000000), B(1000000), C(1000000)
@@ -250,7 +260,9 @@
 A = LOG(X); B = LOG(Y); C = A + B
 PRINT*, C(500000)
 END
-example7:
+
+
+Example 7:
 int a[256], b[256];
@@ -262,7 +274,9 @@
       a[i] = b[i+x];
    }
 }
-example8:
+
+
+Example 8:
 int a[M][N];
@@ -276,7 +290,9 @@
      }
    }
 }
-example9:
+
+
+Example 9:
 unsigned int ub[N], uc[N];
@@ -289,7 +305,9 @@
   for (i = 0; i < N; i++) {
     udiff += (ub[i] - uc[i]);
   }
-example10:
+
+
+Example 10:
 /* feature: support data-types of different sizes.
@@ -311,7 +329,8 @@
   ia[i] = (int) sb[i];
 }
 
-example11:
+
+Example 11:
 /* feature: support strided accesses - the data elements
@@ -324,7 +343,7 @@
 }
 
-example12: induction:
+Example 12: Induction:
 for (i = 0; i < N; i++) {
@@ -332,7 +351,7 @@
 }
 
-example13: outer-loop:
+Example 13: Outer-loop:
   for (i = 0; i < M; i++) {
@@ -345,7 +364,8 @@
 }
 
-example14: double reduction:
+Example 14: Double reduction:
+
   for (k = 0; k < K; k++) {
     sum = 0;
@@ -357,7 +377,8 @@
   }
 
-example15: condition in nested loop:
+Example 15: Condition in nested loop:
+
   for (j = 0; j < M; j++)
     {
@@ -374,7 +395,8 @@
     }
 
-example16: load permutation in loop-aware SLP:
+Example 16: Load permutation in loop-aware SLP:
+
   for (i = 0; i < N; i++)
     {
@@ -388,7 +410,8 @@
     }
 
-example17: basic block SLP:
+Example 17: Basic block SLP:
+
 void foo ()
 {
@@ -402,7 +425,8 @@
 }
 
-example18: Simple reduction in SLP:
+Example 18: Simple reduction in SLP:
+
 int sum1;
 int sum2;
@@ -419,7 +443,8 @@
 }
 
-example19: Reduction chain in SLP:
+Example 19: Reduction chain in SLP:
+
 int sum;
 int a[128];
@@ -435,9 +460,10 @@
 }
 
-example20: Basic block SLP with
+Example 20: Basic block SLP with
 multiple types, loads with different offsets, misaligned load,
-and not-affine accesses:
+and not-affine accesses:
+
 void foo (int * __restrict__ dst, short * __restrict__ src,
           int h, int stride, short A, short B)
@@ -459,7 +485,8 @@
 }
 
-example21: Backward access:
+Example 21: Backward access:
+
 int foo (int *b, int n)
 {
@@ -472,7 +499,8 @@
 }
 
-example22: Alignment hints:
+Example 22: Alignment hints:
+
 void foo (int *out1, int *in1, int *in2, int n)
 {
@@ -487,7 +515,8 @@
 }
 
-example23: Widening shift:
+Example 23: Widening shift:
+
 void foo (unsigned short *src, unsigned int *dst)
 {
@@ -498,7 +527,8 @@
 }
 
-example24: Condition with mixed types:
+Example 24: Condition with mixed types:
+
 #define N 1024
 float a[N], b[N];
@@ -512,7 +542,8 @@
 }
 
-example25: Loop with bool:
+Example 25: Loop with bool:
+
 #define N 1024
 float a[N], b[N], c[N], d[N];
@@ -531,11 +562,12 @@
 }
 
-Unvectorizable
- Loops
+Unvectorizable Loops

 Examples of loops that currently cannot be
- vectorized: example1: uncountable loop:
+ vectorized:
+
+Example 1: Uncountable loop:
 while (*p != NULL) {
@@ -1564,8 +1596,7 @@
         PLDI 2000.
     
 
-High-Level Plan of
- Implementation (2003-2005)
+High-Level Plan of Implementation (2003-2005)

 The table below outlines the high level vectorization scheme along with a proposal for an implementation scheme, as
@@ -1926,9 +1957,7 @@

  1. - - -

    Loop detection and loop CFG analysis

    +

    >Loop detection and loop CFG analysis

    Detect loops, and record some basic control flow information about them (contained basic blocks, loop @@ -1940,9 +1969,7 @@

  2. - - -

    Modeling the target machine vector capabilities to +

    Modeling the target machine vector capabilities to the tree-level.

    Expose the required target specific information to @@ -1998,9 +2025,7 @@

  3. - - -

    Enhance the Builtins Support

    +

    Enhance the Builtins Support

    Currently the tree optimizers do not know the semantics of target specific builtin functions, so they @@ -2016,9 +2041,7 @@

  4. - - -

    Cost Model

    +

    Cost Model

    There is an overhead associated with vectorization -- moving data in to/out of vector registers @@ -2037,9 +2060,7 @@

  5. - - -

    Induction Variable Analysis

    +

    Induction Variable Analysis

    Used by the vectorizer to detect loop bound, analyze access patterns and analyze data dependencies between @@ -2066,9 +2087,7 @@

  6. - - -

    Dependence Testing

    +

    Dependence Testing

    Following the classic dependence-based approach for vectorization as described in

  7. - - -

    Access Pattern Analysis

    +

    Access Pattern Analysis

    The memory architecture usually allows only restricted accesses to data in memory; one of the @@ -2151,9 +2168,7 @@

  8. - - -

    Extend the range of supportable operations

    +

    Extend the range of supportable operations

    At first, the only computations that will be vectorized are those for which the vectorization @@ -2185,9 +2200,7 @@

  9. - - -

    Alignment

    +

    Alignment

    The memory architecture usually allows only restricted accesses to data in memory. One of the @@ -2237,9 +2250,7 @@

  10. - - -

    Idiom Recognition

    +

    Idiom Recognition

    It is often the case that complicated computations can be reduced into a simpler, straight-line sequence @@ -2301,9 +2312,7 @@

  11. - - -

    Conditional Execution

    +

    Conditional Execution

    The general principle we are trying to follow is to keep the actual code transformation part of the @@ -2333,9 +2342,7 @@

  12. - - -

    Handle Advanced Loop Forms

    +

    Handle Advanced Loop Forms

    1. Support general loop bound (unknown, or doesn't @@ -2355,9 +2362,7 @@
    2. - - -

      Handle Pointer Aliasing

      +

      Handle Pointer Aliasing

      1. Improve aliasing analysis. [various gcc projects @@ -2406,9 +2411,7 @@
      2. - - -

        Loop versioning

        +

        Loop versioning

        Provide utilities that allow performing the following transformation: Given a condition and a loop, @@ -2424,9 +2427,7 @@

      3. - - -

        Loop Transformations to Increase Vectorizability of +

        Loop Transformations to Increase Vectorizability of Loops

        These include:

        @@ -2448,9 +2449,7 @@
      4. - - -

        Other Optimizations

        +

        Other Optimizations

        1. Exploit data reuse (a la "Compiler-Controlled @@ -2477,9 +2476,7 @@
        2. - - -

          User Hints

          +

          User Hints

          Using user hints for different purposes (aliasing, alignment, profitability of vectorizing