From: Teresa Johnson
To: Christophe Lyon
Cc: Jeff Law, "gcc-patches@gcc.gnu.org", Jan Hubicka, David Li
Date: Wed, 01 Oct 2014 14:05:00 -0000
Subject: Re: [PATCH] Redesign jump threading profile updates
References: <53CF1DFD.7080805@redhat.com> <542A32AB.1040708@redhat.com>
X-SW-Source: 2014-10/txt/msg00055.txt.bz2

On Wed, Oct 1, 2014 at 6:20 AM, Teresa Johnson wrote:
> Sorry, yes, will try to reproduce.
> Teresa
>
> On Wed, Oct 1, 2014 at 12:03 AM, Christophe Lyon wrote:
>> On 30 September 2014 20:20, Teresa Johnson wrote:
>>> On Mon, Sep 29, 2014 at 9:33 PM, Jeff Law wrote:
>>>> On 09/29/14 08:19, Teresa Johnson wrote:
>>>>>>
>>>>>>
>>>>>> Just an update - I found some good test cases by compiling the
>>>>>> c-torture tests with profile feedback with and without my patch. But
>>>>>> in the cases I pulled out I saw that there were still a couple profile
>>>>>> or probability insanities introduced by jump threading (albeit far
>>>>>> less than before), so I wanted to investigate before I commit. I ran
>>>>>> out of time this week and will not get to this until I get back from
>>>>>> vacation the week after next.
>>>>>
>>>>>
>>>>> Hi Jeff,
>>>>>
>>>>> I finally had a chance to get back to this and look at the remaining
>>>>> insanities in the new test cases I created. It turns out that there
>>>>> were still a few issues in the case where there were guessed
>>>>> frequencies and no profile counts. The two test cases I created do use
>>>>> FDO, and the insanities in the routines with profile counts went away
>>>>> with my patch. But the outlined copies of routines that were also
>>>>> inlined into the main routine still had estimated frequencies, and
>>>>> these still had a few issues.
>>>>>
>>>>> The problem is that the profile updates are done incrementally as we
>>>>> walk and update the paths in ssa_fix_duplicate_block_edges, including
>>>>> the block and edge counts, the block frequencies and the
>>>>> probabilities.
>>>>> This is very difficult to do when only operating on
>>>>> frequencies since the edge frequencies are derived from the source
>>>>> block frequency and the probability. Therefore, once the source block
>>>>> frequency is updated, the edge frequency is also affected, and it is
>>>>> really difficult to figure out what the update to the edge frequency
>>>>> (essentially the probability) is using the same incremental update
>>>>> approach. I was attempting to handle this with the routine
>>>>> deduce_freq, for example, but this turned out to have issues for
>>>>> certain types of paths. I tried a few other approaches, but they start
>>>>> looking really ugly and I didn't want to add a parallel but different
>>>>> algorithm in the case of no profile counts.
>>>>>
>>>>> So by far the simplest approach was simply to take a snapshot of the
>>>>> existing block and edge frequencies along the path before we start the
>>>>> updates in ssa_fix_duplicate_block_edges, by copying them into the
>>>>> profile count fields of those blocks and edges. Then the existing
>>>>> algorithm operates the same as when we do have counts, and can
>>>>> essentially operate incrementally on the edge frequencies since they
>>>>> live in the count field of the edge and are no longer affected anytime
>>>>> the source block is updated. Since the algorithm does update block
>>>>> frequencies and probabilities as well (based on the count updates
>>>>> performed), we can simply clear out these fake count fields at the end
>>>>> of ssa_fix_duplicate_block_edges. This takes care of the remaining
>>>>> insanities introduced by jump threading from these test cases. During
>>>>> testing I also added in some checking to ensure that the count fields
>>>>> for the whole routine were cleared properly to make sure the new
>>>>> clear_counts_path was not missing anything (checking is a little too
>>>>> heavyweight to add in normally).
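The snapshot idea described in the quoted message can be illustrated with a small, self-contained C sketch. This is not the committed GCC code: the structs, the `PROB_BASE` scale, and the `sketch_*` function names are hypothetical simplifications chosen only to show why a derived edge frequency shifts when its source block changes, while a value copied into a count field does not.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed fixed-point scale for branch probabilities (illustrative). */
#define PROB_BASE 10000

/* Hypothetical, simplified stand-ins for CFG blocks and edges. */
struct sketch_block {
  int frequency;   /* estimated execution frequency */
  long count;      /* profile count; 0 when no real profile exists */
};

struct sketch_edge {
  struct sketch_block *src;  /* source block of the edge */
  int probability;           /* taken probability, scaled by PROB_BASE */
  long count;                /* profile count; 0 when no real profile exists */
};

/* The edge frequency is not stored: it is derived from the source block
   frequency and the edge probability, so it changes whenever either does. */
static int edge_frequency(const struct sketch_edge *e)
{
  assert(e->probability >= 0 && e->probability <= PROB_BASE);
  return (int)(((long)e->src->frequency * e->probability + PROB_BASE / 2)
               / PROB_BASE);
}

/* Snapshot: copy the current block and edge frequencies along the path
   into the otherwise unused count fields before any update starts. */
static void sketch_freqs_to_counts(struct sketch_edge **path, size_t n)
{
  for (size_t i = 0; i < n; i++) {
    path[i]->count = edge_frequency(path[i]);
    path[i]->src->count = path[i]->src->frequency;
  }
}

/* Cleanup: once the count-based updates are done, clear the fake counts
   so later passes do not mistake them for real profile data. */
static void sketch_clear_counts(struct sketch_edge **path, size_t n)
{
  for (size_t i = 0; i < n; i++) {
    path[i]->count = 0;
    path[i]->src->count = 0;
  }
}
```

With a block frequency of 1000 and an edge probability of 2500, the derived edge frequency is 250. After the snapshot, lowering the block frequency to 400 changes the derived value to 100, but the edge's count field still holds 250, which is what lets an incremental, count-based update proceed without being disturbed by earlier block updates.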
>>>>>
>>>>> New patch below (also attached since my mailer sometimes eats spaces).
>>>>> The differences between the old patch and the new one:
>>>>> - removed deduce_freq (which was my least favorite part of the patch
>>>>> anyway!), and its call from recompute_probabilities, since it is no
>>>>> longer necessary.
>>>>> - two new routines freqs_to_counts_path and clear_counts_path, invoked
>>>>> from ssa_fix_duplicate_block_edges.
>>>>> - two new tests
>>>>>
>>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?
>>>>>
>>>>> Thanks,
>>>>> Teresa
>>>>>
>>>>> gcc:
>>>>>
>>>>> 2014-09-29  Teresa Johnson
>>>>>
>>>>>         * tree-ssa-threadupdate.c (struct ssa_local_info_t): New
>>>>>         duplicate_blocks bitmap.
>>>>>         (remove_ctrl_stmt_and_useless_edges): Ditto.
>>>>>         (create_block_for_threading): Ditto.
>>>>>         (compute_path_counts): New function.
>>>>>         (update_profile): Ditto.
>>>>>         (recompute_probabilities): Ditto.
>>>>>         (update_joiner_offpath_counts): Ditto.
>>>>>         (freqs_to_counts_path): Ditto.
>>>>>         (clear_counts_path): Ditto.
>>>>>         (ssa_fix_duplicate_block_edges): Update profile info.
>>>>>         (ssa_create_duplicates): Pass new parameter.
>>>>>         (ssa_redirect_edges): Remove old profile update.
>>>>>         (thread_block_1): New duplicate_blocks bitmap,
>>>>>         remove old profile update.
>>>>>         (thread_single_edge): Pass new parameter.
>>>>>
>>>>> gcc/testsuite:
>>>>>
>>>>> 2014-09-29  Teresa Johnson
>>>>>
>>>>>         * testsuite/gcc.dg/tree-prof/20050826-2.c: New test.
>>>>>         * testsuite/gcc.dg/tree-prof/cmpsf-1.c: Ditto.
>>>>
>>>> Given I'd already been through this pretty thoroughly, I just gave this a
>>>> cursory review.
>>>>
>>>> clear_counts_path needs a function comment. It's pretty obvious what it's
>>>> doing, but for completeness let's go ahead and get the obvious comment in
>>>> there.
>>>
>>> Done and committed as r215739.
>>>
>>
>> Since this commit, I can see all my builds for arm*linux* and
>> aarch64*linux* fail while building glibc:
>>
>> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/bin/aarch64-none-linux-gnu-gcc
>> iso-2022-cn.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline
>> -Wundef -Wwrite-strings -fmerge-all-constants -frounding-math -g
>> -Wstrict-prototypes -fPIC -I../include
>> -I/tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata
>> -I/tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1
>> -I../sysdeps/unix/sysv/linux/aarch64/nptl
>> -I../sysdeps/unix/sysv/linux/aarch64
>> -I../sysdeps/unix/sysv/linux/generic
>> -I../sysdeps/unix/sysv/linux/wordsize-64
>> -I../nptl/sysdeps/unix/sysv/linux
>> -I../nptl/sysdeps/pthread -I../sysdeps/pthread
>> -I../sysdeps/unix/sysv/linux -I../sysdeps/gnu
>> -I../sysdeps/unix/inet -I../nptl/sysdeps/unix/sysv
>> -I../sysdeps/unix/sysv -I../nptl/sysdeps/unix -I../sysdeps/unix
>> -I../sysdeps/posix -I../sysdeps/aarch64/fpu
>> -I../sysdeps/aarch64/nptl -I../sysdeps/aarch64
>> -I../sysdeps/wordsize-64 -I../sysdeps/ieee754/ldbl-128
>> -I../sysdeps/ieee754/dbl-64/wordsize-64
>> -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754/flt-32
>> -I../sysdeps/aarch64/soft-fp -I../sysdeps/ieee754
>> -I../sysdeps/generic -I../nptl -I.. -I../libio -I.
>> -nostdinc -isystem
>> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include
>> -isystem
>> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include-fixed
>> -isystem
>> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-aarch64-none-linux-gnu/usr/include
>> -D_LIBC_REENTRANT -include ../include/libc-symbols.h -DPIC -DSHARED
>> -DNOT_IN_libc -o
>> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os
>> -MD -MP -MF
>> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os.dt
>> -MT
>> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os
>>
>> In file included from iso-2022-cn.c:407:0:
>> ../iconv/skeleton.c: In function 'gconv':
>> ../iconv/skeleton.c:800:1: internal compiler error: in
>> check_probability, at basic-block.h:959
>> 0xe4e2fb find_many_sub_basic_blocks(simple_bitmap_def*)
>>         /tmp/3496222_18.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/basic-block.h:959
>> 0x6623f0 execute
>>         /tmp/3496222_18.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5916
>> Please submit a full bug report,
>> with preprocessed source if appropriate.
>> Please include the complete backtrace with any bug report.
>> See for instructions.
>>
>> Can you have a look?

Unfortunately doesn't reproduce with x86_64, so I will build an
aarch64 cross-compiler next and give that a try. This was with the
latest glibc sources I git cloned from sourceware.org. Let me know if
I need something else.
Also, the instructions I found for configuring an aarch64
cross-compiler are:

./configure --prefix=$prefix \
    --target=aarch64-none-linux --enable-languages=c \
    --disable-threads --disable-shared --disable-libmudflap \
    --disable-libssp --disable-libgomp --disable-libquadmath

Let me know if I need to configure it differently.

Thanks,
Teresa

>>
>> Thanks,
>>
>> Christophe.
>>
>>> Thanks,
>>> Teresa
>>>
>>>>
>>>> With that fix, approved for the trunk. Thanks for taking the time to sort
>>>> out all these issues.
>>>>
>>>> jeff
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413