From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24123 invoked by alias); 8 Jul 2015 17:54:55 -0000 Mailing-List: contact jit-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Post: List-Help: List-Subscribe: Sender: jit-owner@gcc.gnu.org Received: (qmail 24080 invoked by uid 89); 8 Jul 2015 17:54:51 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Checked: by ClamAV 0.98.7 on sourceware.org X-Virus-Found: No X-HELO: mx1.redhat.com Message-ID: <1436377619.24803.97.camel@surprise> Subject: Re: Filed PR jit/66812 for the code generation issue From: David Malcolm To: Dibyendu Majumdar Cc: jit@gcc.gnu.org Date: Thu, 01 Jan 2015 00:00:00 -0000 In-Reply-To: <1436369443.24803.75.camel@surprise> References: <1436365266.24803.65.camel@surprise> <1436367926.24803.71.camel@surprise> <1436369443.24803.75.camel@surprise> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 X-SW-Source: 2015-q3/txt/msg00034.txt.bz2 On Wed, 2015-07-08 at 11:30 -0400, David Malcolm wrote: > On Wed, 2015-07-08 at 11:05 -0400, David Malcolm wrote: > > On Wed, 2015-07-08 at 10:21 -0400, David Malcolm wrote: > > > On Sat, 2015-07-04 at 16:58 +0100, Dibyendu Majumdar wrote: > > > > On 4 July 2015 at 14:20, Dibyendu Majumdar wrote: > > > > > On 4 July 2015 at 13:11, Dibyendu Majumdar wrote: > > > > >> Looks like in the failure case the code is being incorrectly > > > > >> optimized. I wonder if this is a manifestation of the get_address bug, > > > > >> perhaps the real fix will be better than the patch I am using. I will > > > > >> use the latest gcc 5 branch and see if that helps. > > > > >> > > > > > > > > > > Hi Dave, > > > > > > > > > > I am now using the latest gcc-5-branch from gcc github mirror. > > > > > Unfortunately the issue still persists. > > > > > > > > > > If set optimization level to 0 or 1, then it works ok, but at levels 2 > > > > > or 3 the break occurs. > > > > > > > > > > > > > Adding the -fno-strict-aliasing appears to resolve the issue with -O2 > > > > and -O3 but with this enabled the benchmarks are degraded. > > > > > > I've filed the bad code generation issue as: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66812 > > > > > > and I'm investigating it. > > > > > > Notes on investigation so far. > > > > With the fixes here > > https://gcc.gnu.org/ml/jit/2015-q3/msg00025.html > > and here: > > https://gcc.gnu.org/ml/jit/2015-q3/msg00028.html > > the reproducer from > > https://gcc.gnu.org/ml/jit/2015-q3/msg00012.html > > now correctly fails with: > > error: gcc_jit_context_new_field: unknown size for field > > "errorJmp" (type: struct ravi_lua_longjmp) > > > > Upon applying the equivalent of the fix from: > > https://github.com/dibyendumajumdar/ravi/commit/d65d2e68fbdcf211ed36deea05727f996ede8296 > > to the generator you provided, gcc_jit_context_compile completes. > > > > I don't know if it's generating bad code (given that I don't have Ravi > > itself running, just the reproducer). However, I tried comparing the > > bug_rdump.txt and bug_rdump_ok.txt reproducers, and tried adding > > -fno-strict-aliasing to the former. > > > > I turned on GCC_JIT_BOOL_OPTION_DUMP_EVERYTHING and started comparing > > the dumpfiles. > > > > The diffs stay easily comparable until fake.c.018t.ssa, when numbering > > differences between temporaries mean that a simple "diff" becomes > > unreadable. > > > > So I wrote a tool to try to alleviate this: > > https://github.com/davidmalcolm/gcc-dump-diff > > The tool tries to renumber temporaries in dumpfiles so that they are > > more consistent between runs, and then tries to do a textual diff of two > > dumpfiles. > > > > (note that it might renumber labels also) > > Yes, labels *were* being renumbered. I've fixed that now. > > > With this, the dumpfiles become comparable again. I see that when > > rerunning the bug_rdump.txt reproducer, the first big difference between > > the with and without -fno-strict-aliasing case happens at 035t.fre1. > > > > Note in the following how, without -fno-strict-aliasing, various > > statements and blocks are eliminated at 035t.fre1, which treats them as > > dead code. In particular, it seems to have optimized away > > OP_TEST_do_jmp_5_0 and OP_TEST_do_skip_5_0 (if I'm reading things > > right). > > It's OP_TEST_do_jmp_5_14, jmp_9_2 and OP_TEST_do_skip_5_15 that are > being optimized away by pass fre1 when -fno-strict-aliasing isn't > supplied. > > I'm attaching a fixed diff, showing the correct label names. > > > > My hunch at this time is that this optimization is being too > > aggressive... for some reason. I'll continue poking at this. Dibyendu: what Lua code generated the reproducer? What is the code meant to be doing? I used gcc_jit_function_dump_to_dot to dump the CFG in GraphViz format; you can see the result here: https://dmalcolm.fedorapeople.org/gcc/2015-07-08/rdump.png and with the printfs here: https://dmalcolm.fedorapeople.org/gcc/2015-07-08/rdump_ok.png I see that both paths out of the "entry" block go through empty blocks and then into "jmp_5_1". A similar thing happens later with "jmp_9_2": both paths from the conditional lead through empty blocks to "jmp_12_3". Those pairs of empty blocks look odd. Is the code correct? Looking at the body of "jmp_5_1", and annotating, I see: jmp_5_1: (&L->ci->u.l.base[(int)1])->value_.b = (int)0; (&L->ci->u.l.base[(int)1])->tt_ = (int)1; comparison_0_11 = (&L->ci->u.l.base[(int)1])->tt_ == (int)0; /* this must be true because of the 2nd assignment above */ comparison_0_12 = (&L->ci->u.l.base[(int)1])->tt_ == (int)1; /* similarly this must be false */ comparison_0_13 = (&L->ci->u.l.base[(int)1])->value_.b == (int)0; /* this must be true because of the 1st assignment above */ isfalse_0_10 = comparison_0_11 || comparison_0_12 && comparison_0_13; /* hence we have: true || false && true and hence: true */ if (!(!(isfalse_0_10))) goto OP_TEST_do_jmp_5_14; else goto OP_TEST_do_skip_5_15; /* hence this always takes the 1st path; the 2nd path is indeed dead code */ So it does in fact seem reasonable for the optimizer to optimize away OP_TEST_do_skip_5_15, and I think that once it does that, it merges OP_TEST_do_jmp_5_14 and jmp_9_2 into jmp_5_1, and can then do similar optimizations to the statements that were in jmp_9_2. So it seems that things I reported pass "fre1" as doing are reasonable. It seems that the optimizer is only able to assume the above values when strict aliasing is enabled, but it seems to be a reasonable optimization. (I suspect that for some reason the presence of the printfs also is stopping this optimization; perhaps JIT doesn't know as much as the C frontend about the lack of side-effects of printf?) Is the code being supplied correct? It's not clear to me what it's meant to be doing, but that CFG looks curious to me. Maybe the input is incorrect, but it only turns into a problem when optimized? FWIW, if this is a CFG issue, note that blocks are cheap in libgccjit; I find it easiest to simply create a gcc_jit_block * at the entrypoint of every opcode even if there's no fancy control flow going on; blocks get immediately consolidated internally, so any redundancy there is very brief. Hope this is helpful Dave