From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=ZUY6=62=gmail.com=richard.guenther@sourceware.org>
Received: from mail-lj1-x22a.google.com (mail-lj1-x22a.google.com [IPv6:2a00:1450:4864:20::22a])
	by sourceware.org (Postfix) with ESMTPS id 4F87A3857B8E;
	Thu,  2 Mar 2023 10:45:23 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4F87A3857B8E
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-lj1-x22a.google.com with SMTP id g18so14985509ljl.3;
        Thu, 02 Mar 2023 02:45:23 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20210112; t=1677753922;
        h=cc:to:subject:message-id:date:from:in-reply-to:references
         :mime-version:from:to:cc:subject:date:message-id:reply-to;
        bh=1XN4g8e/eTswUGo3wpEhU5POSLem5bo1JKfF+K4NYkE=;
        b=ZlDxkssJ5xOEyv/hvUMv0QmT2Wf5ClfGKnn63nsWP90+va6vtld/14rVsxRbX+LTa4
         PbR61LxOUtym12WI3G733hJ8jzYlb+95JQHCQj4OGocFJo0qIFxoAGuKvlZx94j35JIg
         RjVyTCOYKTR5z6Fu/0CsdVjs6yPdAJTBk/h8dXT1SqHLIvqD7atJun9LJ1deKo/zkQrG
         Zy+COnVoXKjocV+voz27qwQGUsnIfNUIx4nRY9ZCEYU19qPqW08wlnOdQC8sF4o/Ypzn
         GQH2q5fO1U9K1OYuTmDeqRmGctXBcw/UkrMrqkmzXc1nxMhTDWsJhq1rqhZ36GeFIAnI
         TLEQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112; t=1677753922;
        h=cc:to:subject:message-id:date:from:in-reply-to:references
         :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id
         :reply-to;
        bh=1XN4g8e/eTswUGo3wpEhU5POSLem5bo1JKfF+K4NYkE=;
        b=bAfYvJK7BKaF0EYFl2nJPXJF2XX1OyCPgSmEq84qeJlBXPwt897rkXW1NDsbMyNW2i
         SULVBtUgqRl1ljkYVi/LFFIclYhtMfaA/uJIVRiIpioLvsCIAppDgvqjRtkezyxGkFk0
         kV7WAlHp7feiCcsZjDpilLfkJs0k7JJ2e/2rMGdIZRNxlnm0IOh4jpxYjVvI1wheeAds
         wLWJJvc6MXpxQVjozAwsBwL2+bU86J+MoxqlaRIkzlHFR5a4jdyr5fJc+F5exjDozmqC
         qIR4vAeMqtt0Slw0j0vs51PI7JNHEA1L2Ki/+HKcdyhqonbOE9HfwuVeD2n486XoHwpR
         8rAQ==
X-Gm-Message-State: AO0yUKVKCkSpKB/taP8guTFXZQCuLFwShQsd9IC0b3ueRZ1MQtNo36lc
	oYMyJWT2uWVIfHVVOCst15KWUO5tAtKJixLMo/w=
X-Google-Smtp-Source: AK7set+rxDdNZE9h/pmubQk9bTsq7IDqpP9ucLh/3dsqVDwQSSx/lIKKejVSBj9utMRZr5RdpHazC1+b+0gqgHjvSIQ=
X-Received: by 2002:a05:651c:1243:b0:295:b77c:a3a2 with SMTP id
 h3-20020a05651c124300b00295b77ca3a2mr3021623ljh.6.1677753921859; Thu, 02 Mar
 2023 02:45:21 -0800 (PST)
MIME-Version: 1.0
References: <20230302022921.4055291-1-xionghuluo@tencent.com>
 <CAFiYyc2vdA3R_mBhNwktmjm0_5SJD7egGwoQMy3vgVBP5iD3+w@mail.gmail.com> <00510a56-1fe5-a556-21cf-fa869f70800b@gmail.com>
In-Reply-To: <00510a56-1fe5-a556-21cf-fa869f70800b@gmail.com>
From: Richard Biener <richard.guenther@gmail.com>
Date: Thu, 2 Mar 2023 11:45:08 +0100
Message-ID: <CAFiYyc2vA1yKz5k3E3q=xL-n+7=_oKomsSJNyn8daoiPVAqgeQ@mail.gmail.com>
Subject: Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement
 leads to incorrect code coverage [PR93680]
To: Xionghu Luo <yinyuefengyi@gmail.com>
Cc: Xionghu Luo <xionghuluo@tencent.com>, gcc-patches@gcc.gnu.org, luoxhu@gcc.gnu.org, 
	rguenther@suse.de, hubicka@ucw.cz
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Thu, Mar 2, 2023 at 11:22 AM Xionghu Luo <yinyuefengyi@gmail.com> wrote:
>
>
>
> On 2023/3/2 16:41, Richard Biener wrote:
> > On Thu, Mar 2, 2023 at 3:31 AM Xionghu Luo via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> When spliting edge with self loop, the split edge should be placed just next to
> >> the edge_in->src, otherwise it may generate different position latch bbs for
> >> two consecutive self loops.  For details, please refer to:
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93680#c4
> >>
> >> Regression tested pass on x86_64-linux-gnu and aarch64-linux-gnu, OK for
> >> master?
> >>
> >> gcc/ChangeLog:
> >>
> >>          PR gcov/93680
> >>          * tree-cfg.cc (split_edge_bb_loc): Return edge_in->src for self loop.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>          PR gcov/93680
> >>          * gcc.misc-tests/gcov-pr93680.c: New test.
> >>
> >> Signed-off-by: Xionghu Luo <xionghuluo@tencent.com>
> >> ---
> >>   gcc/testsuite/gcc.misc-tests/gcov-pr93680.c | 24 +++++++++++++++++++++
> >>   gcc/tree-cfg.cc                             |  2 +-
> >>   2 files changed, 25 insertions(+), 1 deletion(-)
> >>   create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-pr93680.c
> >>
> >> diff --git a/gcc/testsuite/gcc.misc-tests/gcov-pr93680.c b/gcc/testsuite/gcc.misc-tests/gcov-pr93680.c
> >> new file mode 100644
> >> index 00000000000..b2bf9e626fc
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.misc-tests/gcov-pr93680.c
> >> @@ -0,0 +1,24 @@
> >> +/* { dg-options "-fprofile-arcs -ftest-coverage" } */
> >> +/* { dg-do run { target native } } */
> >> +
> >> +int f(int s, int n)
> >> +{
> >> +  int p = 0;
> >> +
> >> +  switch (s)
> >> +  {
> >> +    case 0: /* count(5) */
> >> +      do { p++; } while (--n); /* count(5) */
> >> +      return p; /* count(1) */
> >> +
> >> +    case 1: /* count(5) */
> >> +      do { p++; } while (--n); /* count(5) */
> >> +      return p; /* count(1) */
> >> +  }
> >> +
> >> +  return 0;
> >> +}
> >> +
> >> +int main() { f(0, 5); f(1, 5); return 0; }
> >> +
> >> +/* { dg-final { run-gcov gcov-pr93680.c } } */
> >> diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> >> index a9fcc7fd050..6fa1d83d366 100644
> >> --- a/gcc/tree-cfg.cc
> >> +++ b/gcc/tree-cfg.cc
> >> @@ -3009,7 +3009,7 @@ split_edge_bb_loc (edge edge_in)
> >>     if (dest_prev)
> >>       {
> >>         edge e = find_edge (dest_prev, dest);
> >> -      if (e && !(e->flags & EDGE_COMPLEX))
> >> +      if ((e && !(e->flags & EDGE_COMPLEX)) || edge_in->src == edge_in->dest)
> >
> > I think this should eventually apply to all backedge edge_in, correct?
> >   But of course
> > we cannot easily test for this here.
> >
> > Still since this affects ordering in the {next,prev}_bb chain only but not CFG
> > semantics I wonder how it can affect coverage?  Isn't it only by chance that
> > this block order survives?
>
> For case:
>
> 1 int f(int s, int n)
> 2 {
> 3  int p = 0;
> 4  int q = 0;
> 5
> 6  switch (s)
> 7    {
> 8    case 0:
> 9      do { p++; } while (--n);
> 10      return p;
> 11
> 12    case 1:
> 13      do { p++; } while (--n);
> 14      return p;
> 15    }
> 16
> 17  return 0;
> 18 }
> 19
> 20 int main() { f(0, 5); f(1, 5);}
>
>
> current GCC generates:
>
> <bb 2> :
> ...
>
>   <bb 3> :                <= first loop
> ...
>      goto <bb 4>; [INV]
>    else
>      goto <bb 5>; [INV]
>
>    <bb 4> :               <= first latch bb
>    goto <bb 3>; [100.00%]
>
>    <bb 5> :
> ...
>    goto <bb 10>; [INV]
>
>    <bb 6> :               <= second latch bb
>
>    <bb 7> :                <= second loop
> ...
>      goto <bb 6>; [INV]
>    else
>      goto <bb 8>; [INV]
>
>
> <bb 4> and <bb 6> are created by split_edge->split_edge_bb_loc, <bb 4>
> is located after the loop, but <bb 6> is located before the loop.
>
> First call of split_edge_bb_loc, the dest_prev is <bb 2>, and find_edge
> did find a edge from <bb 2> to <bb 3>, the returned afte_bb is <bb 3>, so
> latch <bb 4> is put after the loop
>
> but second call of split_edge_bb_loc, the dest_prev is <bb 5>, so find_edge
> return 0, and the returned after_bb is <bb 5>, then the created latch <bb 6>
> is put before the loop...
>
> Different latch bb position caused different gcno, while gcov has poor
> information and not that smart to recognize it:(, is it reasonable to keep
> this kind of loops same order?
>
>
>   small.gcno:  648:                  block 2:`small.c':1, 3, 4, 6
>   small.gcno:  688:    01450000:  36:LINES
>   small.gcno:  700:                  block 3:`small.c':8, 9
>   small.gcno:  732:    01450000:  32:LINES
>   small.gcno:  744:                  block 5:`small.c':10
> -small.gcno:  772:    01450000:  32:LINES
> -small.gcno:  784:                  block 6:`small.c':12
> -small.gcno:  812:    01450000:  36:LINES
> -small.gcno:  824:                  block 7:`small.c':12, 13
> +small.gcno:  772:    01450000:  36:LINES
> +small.gcno:  784:                  block 6:`small.c':12, 13
> +small.gcno:  816:    01450000:  32:LINES
> +small.gcno:  828:                  block 8:`small.c':14
>   small.gcno:  856:    01450000:  32:LINES
> -small.gcno:  868:                  block 8:`small.c':14
> -small.gcno:  896:    01450000:  32:LINES
> -small.gcno:  908:                  block 9:`small.c':17
> +small.gcno:  868:                  block 9:`small.c':17

Looking at the CFG and the instrumentation shows

  <bb 2> :
  PROF_edge_counter_17 = __gcov0.f[0];
  PROF_edge_counter_18 = PROF_edge_counter_17 + 1;
  __gcov0.f[0] = PROF_edge_counter_18;
  [t.c:3:7] p_6 = 0;
  [t.c:5:3] switch (s_7(D)) <default: <L6> [INV], [t.c:7:5] case 0:
<L0> [INV], [t.c:11:5] case 1: <L3> [INV]>

  <bb 3> :
  # n_1 = PHI <n_8(D)(2), [t.c:8:28] n_13(4)>
  # p_3 = PHI <[t.c:3:7] p_6(2), [t.c:8:15] p_12(4)>
[t.c:7:5] <L0>:
  [t.c:8:15] p_12 = p_3 + 1;
  [t.c:8:28] n_13 = n_1 + -1;
  [t.c:8:28] if (n_13 != 0)
    goto <bb 4>; [INV]
  else
    goto <bb 5>; [INV]

  <bb 4> :
  PROF_edge_counter_21 = __gcov0.f[2];
  PROF_edge_counter_22 = PROF_edge_counter_21 + 1;
  __gcov0.f[2] = PROF_edge_counter_22;
  [t.c:7:5] goto <bb 3>; [100.00%]

  <bb 5> :
  PROF_edge_counter_23 = __gcov0.f[3];
  PROF_edge_counter_24 = PROF_edge_counter_23 + 1;
  __gcov0.f[3] = PROF_edge_counter_24;
  [t.c:9:16] _14 = p_12;
  [t.c:9:16] goto <bb 10>; [INV]

so the reason this goes wrong is that gcov associates the "wrong"
counter with the block containing
the 'case' label(s), for the case 0 it should have chosen the counter
from bb5 but it likely
computed the count of bb3?

It might be that ordering blocks differently puts the instrumentation
to different blocks or it
makes gcovs association chose different blocks but that means it's
just luck and not fixing
the actual issue?

To me it looks like the correct thing to investigate is switch
statement and/or case label
handling.  One can also see that <L0> having line number 7 is wrong to
the extent that
the position of the label doesn't match the number of times it
executes in the source.  So
placement of the label is wrong here, possibly caused by CFG cleanup
after CFG build
(but generally labels are not used for anything once the CFG is built
and coverage
instrumentation is late so it might fail due to us moving labels).  It
might be OK to
avoid moving labels for --coverage but then coverage should possibly
look at edges
rather than labels?

Richard.

>
>
> >
> > For the case when both edge_in->src has more than one successor and
> > edge_in->dest has more than one predecessor there isn't any good heuristic
> > to make printing the blocks in chain order "nice" (well, the backedge
> > one maybe).
> >
> > But as said - this order shouldn't have any effect on semantics ...
> >
> >>          return edge_in->src;
> >>       }
> >>     return dest_prev;
> >> --
> >> 2.27.0
> >>