From: Jeff Law
To: gcc@gcc.gnu.org
Subject: Re: [RFC] Analysis of PR105586 and possible approaches to fix issue
Date: Sat, 30 Jul 2022 14:01:14 -0600
Message-ID: <363c54e0-a29b-5bec-e719-58df1aa09dc5@gmail.com>

On 7/27/2022 12:58 AM, Richard Biener via Gcc wrote:
> On Tue, Jul 26, 2022 at 8:55 PM Surya Kumari Jangala via Gcc wrote:
>> Hi,
>> I am working on PR105586. This is a -fcompare-debug failure, with the differences starting during sched1 pass. The sequence of two instructions in a basic block (block 4) is flipped with -g.
>> In addition to this, another difference is that an insn is assigned a different cycle in debug vs non-debug modes.
>> More specifically, the 2nd instruction in basic block 4 is assigned to cycle 0 w/o -g but to cycle 1 w/ -g. I felt that this could be resulting in the flipping of the later insns in the bb, so I started to investigate the difference in cycle assignment.
>>
>> In the routine schedule_block(), after scheduling an insn (schedule_insn()), prune_ready_list() is called if the ready list is not empty. This routine goes thru all the insns in the ready list and for each insn it checks if there is a state transition. If there is no state transition, then INSN_TICK(insn) is set to current_cycle+1.
>>
>> After scheduling the first insn in bb 4, when prune_ready_list() is called, we see that for the debug mode run, there is no state transition for the second insn and hence its INSN_TICK is updated. For the non-debug run, a state transition exists and the INSN_TICK is not updated. This was resulting in the second insn being scheduled in cycle 1 in the debug mode, and in cycle 0 in the non-debug mode.
>>
>> It turned out that the initial dfa state of the basic block (‘init_state’ parameter of schedule_block()) was different in debug and non-debug modes.
>>
>> After scheduling a basic block, its current dfa state is copied to the fall-thru basic block. In other words, the initial dfa state of the fall-thru bb is the current state of the bb that was just scheduled.
>>
>> Basic block 4 is the fall-thru bb for basic block 3. In non-debug mode, bb 3 has only a NOTE insn and hence scheduling of bb 3 is skipped. Since bb 3 is not scheduled, its state is not copied to bb 4. Whereas in debug mode, bb 3 has a NOTE insn and a DEBUG insn. So bb 3 is “scheduled” and its dfa state is copied to bb 4. [The dfa state of bb 3 is obtained from its parent bb, i.e., bb 2]. Hence the initial dfa state of bb 4 is different in debug and non-debug modes due to the difference in the insns in the predecessor bb (bb 3).
>>
>> The routine no_real_insns_p() is called to check if scheduling can be skipped for a basic block. This routine checks for NOTE and LABEL insns and it returns ‘true’ if a basic block contains only NOTE/LABEL insns. Hence, any basic block which has only NOTE or LABEL insns is not scheduled.
>>
>> To fix the issue of insns being assigned different cycles, there are two possible solutions:
>>
>> 1. Modify no_real_insns_p() to treat a DEBUG insn as a non-real insn (similar to NOTE and LABEL). With this change, bb 3 will not be scheduled in the debug mode (as it contains only NOTE and DEBUG insns). If scheduling is skipped, then bb 3’s state is not copied to bb 4 and the initial dfa state of bb 4 will be the same in both debug and non-debug modes.
>> 2. Copy dfa state of a basic block to its fall-thru block only if the basic block contains ‘real’ insns (i.e., it should contain at least one insn which is not a LABEL, NOTE or DEBUG). This will prevent copying of dfa state from bb 3 to bb 4 in debug mode.
> Do you know why the DFA state is not always copied to the fallthru
> destination and then copied further even if the block does not contain
> any (real) insns? It somewhat sounds like premature optimization
> breaking things here...

Yea, premature optimization, and probably just an oversight in thinking that a block with no real insns could be totally ignored.

jeff
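
To make the state propagation discussed above concrete, here is a minimal toy model in C. It is a sketch under loud assumptions, not GCC code: the DFA state is reduced to an integer that counts issued real insns, every block starts from state 0 unless a predecessor copies a state into it, and all names (`insn_kind`, `policy`, `final_init_state`, `bb4_init_state`) are hypothetical. It models the bb 2 -> bb 3 -> bb 4 fall-thru chain, where bb 3 holds only a NOTE (plus a DEBUG insn with -g), and compares the current behaviour with the two proposed fixes and with always copying the state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy insn kinds; only REAL insns advance the toy "DFA state". */
enum insn_kind { NOTE, LABEL, DEBUG_INSN, REAL };

/* ORIGINAL:    skip blocks holding only NOTE/LABEL; copy state if scheduled.
   OPTION_1:    also treat DEBUG insns as non-real when deciding to skip.
   OPTION_2:    copy state only if the block has at least one real insn.
   ALWAYS_COPY: copy state to the fall-thru block even for skipped blocks. */
enum policy { ORIGINAL, OPTION_1, OPTION_2, ALWAYS_COPY };

struct block { const enum insn_kind *insns; int n; };

#define MAX_BLOCKS 16

/* True if the block has no REAL insn (and, unless debug_is_unreal is set,
   no DEBUG insn either).  Loosely mirrors the role of no_real_insns_p(). */
static bool only_nonreal(const struct block *b, bool debug_is_unreal)
{
  for (int i = 0; i < b->n; i++) {
    if (b->insns[i] == REAL)
      return false;
    if (b->insns[i] == DEBUG_INSN && !debug_is_unreal)
      return false;
  }
  return true;
}

/* Walk a straight fall-thru chain and return the initial state that the
   last block ends up with under the given policy. */
static int final_init_state(const struct block *bbs, int nblocks, enum policy p)
{
  int init[MAX_BLOCKS] = {0};   /* state 0 stands for the reset DFA state */
  assert(nblocks > 0 && nblocks <= MAX_BLOCKS);

  for (int i = 0; i < nblocks; i++) {
    bool skip = only_nonreal(&bbs[i], p == OPTION_1);
    int state = init[i];

    if (!skip)                  /* "schedule" the block */
      for (int j = 0; j < bbs[i].n; j++)
        if (bbs[i].insns[j] == REAL)
          state++;

    bool copy;
    switch (p) {
    case ALWAYS_COPY: copy = true; break;
    case OPTION_2:    copy = !skip && !only_nonreal(&bbs[i], true); break;
    default:          copy = !skip; break;
    }
    if (copy && i + 1 < nblocks)
      init[i + 1] = state;
  }
  return init[nblocks - 1];
}

/* Initial state of bb 4 for the bb 2 -> bb 3 -> bb 4 chain, with or
   without the DEBUG insn in bb 3. */
int bb4_init_state(enum policy p, bool with_debug)
{
  static const enum insn_kind bb2[] = { REAL, REAL };
  static const enum insn_kind bb3[] = { NOTE, DEBUG_INSN };
  static const enum insn_kind bb4[] = { REAL };
  const struct block bbs[] = {
    { bb2, 2 },
    { bb3, with_debug ? 2 : 1 },  /* drop the DEBUG insn without -g */
    { bb4, 1 },
  };
  return final_init_state(bbs, 3, p);
}
```

In this model, `bb4_init_state(ORIGINAL, false)` is 0 while `bb4_init_state(ORIGINAL, true)` is 2, which is the debug/non-debug mismatch. Under `OPTION_1` and `OPTION_2` both runs give 0, and under `ALWAYS_COPY` both give 2, so each of the three changes makes the initial state of bb 4 independent of -g. They differ only in whether bb 4 inherits bb 2's state.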