Subject: Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling
From: Maxim Kuvyrkov
Date: Wed, 11 Dec 2013 02:56:00 -0000
Cc: Paulo Matos, "gcc@gcc.gnu.org"
Message-Id: <849E4FBE-836D-45BB-928A-D780B69D80BA@kugelworks.com>
References: <19EB96622A777C4AB91610E763265F46231E97@SJEXCHMB14.corp.ad.broadcom.com> <6F040EA5-5398-4FB8-9518-6C12EF22B5CD@kugelworks.com> <85C983FD-BB69-4083-8B8C-202CD470BF3D@kugelworks.com>
To: ramrad01@arm.com

On 11/12/2013, at 3:45 pm, Ramana Radhakrishnan wrote:

> On Wed, Dec 11, 2013 at 12:02 AM, Maxim Kuvyrkov wrote:
>> On 11/12/2013, at 11:14 am, Ramana Radhakrishnan wrote:
>>
>>> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov wrote:
>>>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan wrote:
>>>>
>>>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Near the start of schedule_block, find_modifiable_mems is called if DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems on c6x backend currently uses this.
>>>>>> However, it's quite strange that this is not a requirement for all backends since find_modifiable_mems, moves all my dependencies in SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have DO_SPECULATION enabled.
>>>>>>
>>>>>> Since dependencies are accessed later on from try_ready (for example), I would have thought that it would be always good not to call find_modifiable_mems, given that it seems to 'literally' break dependencies.
>>>>>>
>>>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>>>>
>>>> "Breaking" a dependency in scheduler involves modification of instructions that would allow scheduler to move one instruction past the other.  The most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a dependency is not ignoring it, speculatively or otherwise; it is an equivalent code transformation to allow scheduler more freedom to fill up CPU cycles.
>>>
>>>
>>> Yes, but there are times when it does this a bit too aggressively and
>>> this looks like the cause for a performance regression that I'm
>>> investigating on ARM. I was looking for a way of preventing this
>>> transformation and there doesn't seem to be an easy one other than the
>>> obvious hack.
>>
>> If you want a particular transformation from occurring, then you need to investigate why scheduler thinks that there is nothing better to do than to schedule an instruction which requires breaking a dependency.  "Breaking" a dependency only increases pool of instructions available to schedule, and your problem seems to be laying in "why" the wrong instruction is selected from that pool.
>>
>> Are you sure that the problem is introduced by dependency breaking, rather than dependency breaking exposing a latent bug?
>
> From my reading because the dependency breaking is of addresses that
> are in a memcpy type loop which is unrolled and the original
> expectation is that by switching this to an add and a negative offset
> one can get more ILP in theory, but in practice the effects appear to
> be worse because of secondary issues that I'm still investigating.

Is this happening in the 1st or 2nd scheduling pass?  From your comments I get a feeling that dependency breaking is introducing an additional instruction, rather than adding an offset to a memory reference.
Ideally, dependency breaking during 1st scheduling pass should be more conservative and avoid too many new instructions (e.g., by breaking a dependency only if nothing whatsoever can be scheduled on the current cycle).  Dependency breaking during 2nd scheduling pass can be more aggressive as it can make sure that adding offset to a memory instruction will not cause it to be split.

>
>>
>>>
>>> Additionally there appears to be no way to control "flags" in a
>>> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
>>> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
>>> it looks like we should allow for these to also be handled or describe
>>> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
>>> scheduler.
>>
>> I'm not sure I follow you here.  Any port can define TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever it thinks is appropriate.  E.g., c6x does this to disable dependency breaking for a particular kind of loops.
>
> Ah, that will probably work and that's probably what I was missing. I
> don't like the idea in general of the same interface setting global
> state randomly in a backend is probably not the best approach in the
> long term. Expecting to set global state in this form from an
> interface is something I wasn't expecting especially when it takes a
> parameter.

Originally TARGET_SCHED_SET_SCHED_FLAGS was setting current_sched_info->flags and nothing else, hence the name.  The parameter spec_info appeared later to hold flags related to IA64-specific speculative scheduling.

--
Maxim Kuvyrkov
www.kugelworks.com
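
For readers following the thread, the transformation quoted earlier ("r2 = r1 + 4; r3 = [r2];" into "r3 = [r1 + 4]; r2 = r1 + 4;") can be checked with a small simulation. This is only a sketch: the tuple encoding, the run() helper and the register and memory values are made up for illustration and have nothing to do with GCC's RTL or the scheduler's own data structures.

```python
# Toy register machine. The (op, dest, base, offset) encoding is hypothetical
# and exists only to illustrate the example from the thread.

def run(program, regs, mem):
    """Execute the instructions in order and return the final register state."""
    regs = dict(regs)  # copy, so both orderings start from the same state
    for op, dest, base, offset in program:
        if op == "add":        # dest = base + offset
            regs[dest] = regs[base] + offset
        elif op == "load":     # dest = [base + offset]
            regs[dest] = mem[regs[base] + offset]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs

# Original order: the load reads r2, so it has to follow the add.
original = [
    ("add", "r2", "r1", 4),    # r2 = r1 + 4
    ("load", "r3", "r2", 0),   # r3 = [r2]
]

# Dependency broken: the load addresses memory through r1 with the offset
# folded in, so it no longer needs r2 and can be placed before the add.
broken = [
    ("load", "r3", "r1", 4),   # r3 = [r1 + 4]
    ("add", "r2", "r1", 4),    # r2 = r1 + 4
]

initial_regs = {"r1": 100, "r2": 0, "r3": 0}   # made-up starting values
memory = {104: 42}                             # made-up memory contents

result_original = run(original, initial_regs, memory)
result_broken = run(broken, initial_regs, memory)
print(result_original == result_broken)
```

Both orderings contain the same number of instructions and leave the registers in the same state; only the load's address expression changes. That is the "adding an offset to a memory reference" case, as opposed to the case where breaking the dependency introduces an extra instruction.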