public inbox for jit@gcc.gnu.org
From: Michael Cree <mcree@orcon.net.nz>
To: David Malcolm <dmalcolm@redhat.com>
Cc: jit@gcc.gnu.org
Subject: Re: [committed] jit: add gcc_jit_type_get_vector
Date: Sun, 01 Jan 2017 00:00:00 -0000
Message-ID: <20170816095854.dp3qe5dmsuqnblsg@tower>
In-Reply-To: <1502326873-58234-1-git-send-email-dmalcolm@redhat.com>

On Wed, Aug 09, 2017 at 09:01:13PM -0400, David Malcolm wrote:
> On Wed, 2017-08-09 at 20:42 +1200, Michael Cree wrote:
> > On Mon, Aug 07, 2017 at 10:28:57AM -0400, David Malcolm wrote:
> > > What would the ideal API look like?
> > > 
> > > Maybe something like:
> > > 
> > >   extern gcc_jit_type *
> > >   gcc_jit_type_get_vector (gcc_jit_type *type, unsigned nunits);
> > >  
> > > with various requirements (type must be integral/floating point;
> > > nunits must be a power of two).
> > 
> > I suspect that would do the job nicely.
> 
> I implemented the above (although I switched the 2nd arg to be
> "size_t num_units").

Thanks!  I haven't been able to try the vector type yet; current
gcc trunk which I just pulled failed to build.

But I have started work using gcc 6.4 and 7.1 libgccjit (without
the vector type) and have a problem noted below.  But first:

> It looks like you may not need to explicitly use builtins to
> access machine specific simd intrinsics; for example, on x86_64
> when I tried multiplying two of these together for float, with
> GCC_JIT_BINARY_OP_MULT, which led to this gimple:
> 
> jit_v4f_mult (const vector(4) <float:32> * a, const vector(4) <float:32> * b, vector(4) <float:32> * c)
> {
>   initial:
>   _1 = *a;
>   _2 = *b;
>   _3 = _1 * _2;
>   *c = _3;
>   return;
> }
> 
> on this x86_64 box it compiled to:
> 
> 	movaps	(%rdi), %xmm0
> 	mulps	(%rsi), %xmm0
> 	movaps	%xmm0, (%rdx)
> 	ret
> 
> (i.e. using the "mulps" SIMD instruction).

Yep, compiling with optimisation set to -O3 will enable the
vectorisation optimisations.  I normally compile with -O2; historically
I have found -O3 as likely to impair performance as to improve it
(so I tend not to use -O3), but maybe that has changed in recent
decades ;-)
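
For comparison, the quoted jit_v4f_mult can be approximated in plain C
with GCC's generic vector extensions.  This is only a sketch: the names
v4f and v4f_mult are made up for illustration, and the typedef is
presumably the kind of type that gcc_jit_type_get_vector would describe
for a float element type and 4 units.

```c
/* Four packed 32-bit floats (16 bytes); GCC/Clang generic vector
   extension.  The name v4f is illustrative only.  */
typedef float v4f __attribute__ ((vector_size (16)));

/* Same shape as the quoted gimple: load *a and *b, multiply them lane
   by lane, and store the result through c.  The pointers must be
   16-byte aligned, which holds for any object declared as v4f.  */
void
v4f_mult (const v4f *a, const v4f *b, v4f *c)
{
  *c = *a * *b;
}
```

Because the multiply is written on the vector type itself, it does not
rely on the loop vectoriser recognising anything.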

The vectorisation optimisations are not clever enough to optimise
more complicated image processing filters well, so accessing the
builtins will be necessary.
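
As a hedged illustration of writing the vector code by hand, here is a
saturating add on 16 unsigned bytes at a time.  It is a sketch, not the
code under discussion, and all names in it (v16u8, v16u8_add_sat,
add_sat_row) are hypothetical.  It uses GCC's generic vector extensions
rather than a target-specific builtin.

```c
#include <stdint.h>
#include <string.h>

/* Sixteen packed unsigned bytes; the name v16u8 is illustrative.  */
typedef uint8_t v16u8 __attribute__ ((vector_size (16)));

/* Lane-wise saturating add: a lane that wrapped around satisfies
   sum < a, and the comparison yields all-ones in exactly those lanes,
   so OR-ing the mask in clamps them to 255.  */
v16u8
v16u8_add_sat (v16u8 a, v16u8 b)
{
  v16u8 sum = a + b;                    /* wraps modulo 256 per lane */
  v16u8 wrapped = (v16u8) (sum < a);    /* 0xff where the lane wrapped */
  return sum | wrapped;
}

/* dptr[i] = min (dptr[i] + sptr[i], 255) for one row of len pixels.  */
void
add_sat_row (uint8_t *dptr, const uint8_t *sptr, size_t len)
{
  size_t i = 0;

  /* Whole 16-byte blocks; memcpy avoids any alignment requirement.  */
  for (; i + 16 <= len; i += 16)
    {
      v16u8 d, s;
      memcpy (&d, dptr + i, sizeof d);
      memcpy (&s, sptr + i, sizeof s);
      d = v16u8_add_sat (d, s);
      memcpy (dptr + i, &d, sizeof d);
    }

  /* Scalar tail for the remaining len % 16 pixels.  */
  for (; i < len; i++)
    {
      unsigned sum = (unsigned) dptr[i] + sptr[i];
      dptr[i] = (uint8_t) (sum > UINT8_MAX ? UINT8_MAX : sum);
    }
}
```

On targets with a native saturating byte add the compiler may or may
not pick that instruction for this pattern; a target builtin is the way
to request it explicitly.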

But I have hit a problem which I suspect is a bug in the gcc optimiser.

In the vein of your example above, but working on uint8_t pixel data
and adding saturation, the jit compiler segfaults in the optimiser. I
provide below the gimple produced by the function that causes the
problem (I presume that is more useful than the code calling the
gcc_jit routines), and a backtrace from the jit compiler.  This example
is from Debian gcc 6.3.0-18 (but it also happens with gcc 7.1;
unfortunately my build of gcc from the trunk failed).  Should I file a
bug report, and if so, against what component?

For the below I have set optimisation level to -O3 (to get
vectorisation) and specified -mavx2 as a compiler arg.  (BTW, the same
segfault also occurs when compiling for Arm and Arm64.  Also if I set
optimisation level to -O2 the example compiles and runs correctly.)

The offending function I implement in the JIT is essentially:

void ip_jit_im_add_clip_UBYTE (struct ip_image * dest, struct ip_image * src)
{
    int rowlen = dest->size.x;
    int numrows = dest->size.y;
    for (int j=0; j<numrows; j++) {
        uint8_t *sptr, *dptr;
        dptr = (uint8_t *)dest->imrow[j];
        sptr = (uint8_t *)src->imrow[j];
        for (int i=0; i<rowlen; i++) {
            int ival = (int)*dptr + (int)*sptr;
            if (ival > UINT8_MAX)
                ival = UINT8_MAX;
            *dptr = (uint8_t)ival;
            sptr++; dptr++;
        }
    }
}
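
A self-contained variant of the same loop, which can be built with the
ordinary C front end, might help in checking whether a problem is
specific to the JIT.  This is a sketch under assumptions: the real
definition of struct ip_image is not shown here, so the layout below
(and the names ip_coord and im_add_clip_u8) is hypothetical and
contains only the fields the loop touches.

```c
#include <stdint.h>

/* Hypothetical minimal layout, not the library's real definition.  */
struct ip_coord { int x, y; };
struct ip_image
{
  struct ip_coord size;   /* width (x) and height (y) in pixels */
  void **imrow;           /* one pointer per image row */
};

/* dest = min (dest + src, 255) for every 8-bit pixel.  */
void
im_add_clip_u8 (struct ip_image *dest, const struct ip_image *src)
{
  for (int j = 0; j < dest->size.y; j++)
    {
      uint8_t *dptr = dest->imrow[j];
      const uint8_t *sptr = src->imrow[j];

      for (int i = 0; i < dest->size.x; i++)
        {
          /* Widen to int so the sum cannot wrap before clipping.  */
          int ival = dptr[i] + sptr[i];
          dptr[i] = (uint8_t) (ival > UINT8_MAX ? UINT8_MAX : ival);
        }
    }
}
```

The inner loop keeps the widening conversions from unsigned char to int
that the original function performs.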


The gimple produced is:

ip_jit_im_add_clip_UBYTE (struct ip_image * dest, struct ip_image * src)
{
  void * * D.370;
  sizetype D.371;
  sizetype D.372;
  void * * D.373;
  void * * D.374;
  void * * D.375;
  unsigned char D.376;
  signed int D.377;
  unsigned char D.378;
  signed int D.379;
  unsigned char D.380;
  sizetype D.381;
  signed int ival;
  signed int i;
  unsigned char * sptr;
  unsigned char * dptr;
  signed int j;
  signed int numrows;
  signed int rowlen;

  F1:
  rowlen = dest->size.x;
  numrows = dest->size.y;
  j = 0;
  goto C1;
  C1:
  if (j < numrows) goto L1; else goto A1;
  L1:
  D.370 = dest->imrow;
  D.371 = (sizetype) j;
  D.372 = D.371 * 8;
  D.373 = D.370 + D.372;
  dptr = *D.373;
  D.374 = src->imrow;
  D.371 = (sizetype) j;
  D.372 = D.371 * 8;
  D.375 = D.374 + D.372;
  sptr = *D.375;
  i = 0;
  goto C2;
  C2:
  if (i < rowlen) goto L2; else goto A2;
  L2:
  D.376 = *dptr;
  D.377 = (signed int) D.376;
  D.378 = *sptr;
  D.379 = (signed int) D.378;
  ival = D.377 + D.379;
  if (ival > 255) goto p_C1_true; else goto p_C1_end;
  A2:
  j = j + 1;
  goto C1;
  A1:
  return;
  p_C1_true:
  ival = 255;
  goto p_C1_end;
  p_C1_end:
  D.380 = (unsigned char) ival;
  *dptr = D.380;
  D.381 = 1;
  dptr = dptr + D.381;
  D.381 = 1;
  sptr = sptr + D.381;
  i = i + 1;
  goto C2;
}


And the optimiser segfaults while compiling the above with:

Program received signal SIGSEGV, Segmentation fault.
optab_for_tree_code (code=code@entry=VEC_UNPACK_LO_EXPR, type=type@entry=0x0, 
    subtype=subtype@entry=optab_default) at ../../src/gcc/optabs-tree.c:190
190	../../src/gcc/optabs-tree.c: No such file or directory.
(gdb) bt
#0  optab_for_tree_code (code=code@entry=VEC_UNPACK_LO_EXPR, type=type@entry=0x0, 
    subtype=subtype@entry=optab_default) at ../../src/gcc/optabs-tree.c:190
#1  0x00007ffff6148593 in supportable_widening_operation (code=code@entry=NOP_EXPR, 
    stmt=stmt@entry=0x7ffff3d170f0, vectype_out=vectype_out@entry=0x7ffff3d32f18, 
    vectype_in=0x7ffff3cf3f18, code1=code1@entry=0x7fffffffd804, 
    code2=code2@entry=0x7fffffffd808, multi_step_cvt=0x7fffffffd814, 
    interm_types=0x7fffffffd850) at ../../src/gcc/tree-vect-stmts.c:9037
#2  0x00007ffff614c2e5 in vectorizable_conversion (stmt=stmt@entry=0x7ffff3d170f0, 
    gsi=gsi@entry=0x0, vec_stmt=vec_stmt@entry=0x0, slp_node=slp_node@entry=0x0)
    at ../../src/gcc/tree-vect-stmts.c:3803
#3  0x00007ffff6159d25 in vect_analyze_stmt (stmt=stmt@entry=0x7ffff3d170f0, 
    need_to_vectorize=need_to_vectorize@entry=0x7fffffffd978, node=node@entry=0x0)
    at ../../src/gcc/tree-vect-stmts.c:8135
#4  0x00007ffff616830b in vect_analyze_loop_operations (loop_vinfo=0x555555e80660, 
    loop_vinfo=0x555555e80660) at ../../src/gcc/tree-vect-loop.c:1727
#5  vect_analyze_loop_2 (fatal=<synthetic pointer>: <optimized out>, loop_vinfo=0x555555e80660)
    at ../../src/gcc/tree-vect-loop.c:2015
#6  vect_analyze_loop (loop=loop@entry=0x7ffff3d02ee0) at ../../src/gcc/tree-vect-loop.c:2268
#7  0x00007ffff617a37f in vectorize_loops () at ../../src/gcc/tree-vectorizer.c:532
#8  0x00007ffff5eec80a in execute_one_pass (pass=pass@entry=0x555555c335c0)
    at ../../src/gcc/passes.c:2336
#9  0x00007ffff5eecdd8 in execute_pass_list_1 (pass=0x555555c335c0)
    at ../../src/gcc/passes.c:2420
#10 0x00007ffff5eecdea in execute_pass_list_1 (pass=0x555555c32e30)
    at ../../src/gcc/passes.c:2421
#11 0x00007ffff5eecdea in execute_pass_list_1 (pass=0x555555c31c90)
    at ../../src/gcc/passes.c:2421
#12 0x00007ffff5eece3d in execute_pass_list (fn=<optimized out>, pass=<optimized out>)
    at ../../src/gcc/passes.c:2431
#13 0x00007ffff5c7c4b3 in cgraph_node::expand (this=0x7ffff3d132e0)
    at ../../src/gcc/cgraphunit.c:1990
#14 0x00007ffff5c7db6f in expand_all_functions () at ../../src/gcc/cgraphunit.c:2126
#15 symbol_table::compile (this=0x7ffff3cd30a8) at ../../src/gcc/cgraphunit.c:2482
#16 0x00007ffff5c7f53a in symbol_table::finalize_compilation_unit (this=0x7ffff3cd30a8)
    at ../../src/gcc/cgraphunit.c:2572
#17 0x00007ffff5fa227a in compile_file () at ../../src/gcc/toplev.c:488
#18 0x00007ffff5bdb207 in do_compile () at ../../src/gcc/toplev.c:2011
#19 toplev::main (this=this@entry=0x7fffffffdd4e, argc=<optimized out>, argv=<optimized out>)
    at ../../src/gcc/toplev.c:2119
#20 0x00007ffff5bfd066 in gcc::jit::playback::context::compile (this=this@entry=0x7fffffffdda0)
    at ../../src/gcc/jit/jit-playback.c:1789
#21 0x00007ffff5bf3bf9 in gcc::jit::recording::context::compile (this=this@entry=0x555555bb8990)
    at ../../src/gcc/jit/jit-recording.c:1241
#22 0x00007ffff5be9649 in gcc_jit_context_compile (ctxt=0x555555bb8990)
    at ../../src/gcc/jit/libgccjit.c:2677
#23 0x00005555555703c7 in ip_init_jit () at jit.c:615
#24 0x0000555555568c6f in im_add (dest=0x5555559b3530, src=0x5555558b0eb0, flag=0)
    at arith.c:750
#25 0x000055555556364e in run_libip_operator (flag=0, s=0x5555558b0eb0, d=0x5555559b3530, op=0)
    at arith-test.c:228
#26 im_op_ii_check (op=0, type=3, size=..., flag=<optimized out>, source=<optimized out>)
    at arith-test.c:334
#27 0x000055555556428f in run_im_ii_tests (operator=0, size=..., chk_flag=114)
    at arith-test.c:488
#28 0x000055555555ef34 in main (argc=<optimized out>, argv=<optimized out>) at arith-test.c:601

Cheers
Michael.
