public inbox for gcc-bugs@sourceware.org
From: "tejohnson at google dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug bootstrap/55051] [4.8 Regression] profiledbootstrap failed
Date: Thu, 15 Nov 2012 22:42:00 -0000
Message-ID: <bug-55051-4-oQz4QvRjhB@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-55051-4@http.gcc.gnu.org/bugzilla/>


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55051

--- Comment #26 from Teresa Johnson <tejohnson at google dot com> 2012-11-15 22:42:12 UTC ---
On Thu, Nov 15, 2012 at 6:33 AM, Teresa Johnson <tejohnson@google.com> wrote:
>
>
>
> On Thu, Nov 15, 2012 at 2:56 AM, hubicka at ucw dot cz
> <gcc-bugzilla@gcc.gnu.org> wrote:
>>
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55051
>>
>> --- Comment #24 from Jan Hubicka <hubicka at ucw dot cz> 2012-11-15
>> 10:56:53 UTC ---
>> > Note though that this is not an assert. It just emits a message to
>> > stderr. Do you think a better error message is appropriate? I'm not
>> > sure the "some data files may have been removed" is an accurate
>> > description of the issue. Perhaps something like "Profile data file
>> > mismatch may indicate corrupt profile data"?
>>
>> Well, we should figure out why sum_all starts to diverge.  If we had
>> problems mixing cc1 and cc1plus executions, we should get mismatches in
>> number of counters.
>
>
> Right, it doesn't appear to be different executables since the number of
> counters is identical. I'll instrument it and see if I can figure out why
> they diverge.
>
>>
>> What happens after the miscompare?
>
>
> A flag is set so that the error is emitted at most once per merge, and then
> we continue on with the merge and ignore it. Basically what it is doing is
> saving the first merged summary (for the first object file's gcda we merge
> into), and then for each additional object file that gets its counters
> merged the resulting program summary is compared against the saved program
> summary. But only if the number of runs is the same as the saved summary.
> This could happen if the gcda files are walked in a different order during
> updates (i.e. the gcov_list is in a different order for different processes
> of the same executable), but I am not sure if that can happen.

It appears that this is what is happening, and I think it makes sense
that it can.

We're essentially doing this:

  /* Now merge each file.  */
  for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
    {
      // Open existing gcda file for gi_ptr
      // Find program summary corresponding to this executable -> save in prg
      // Merge execution counts for each function
      // Merge program summary
      //   - If this is the first merged file for this execution,
      //     save merged summary in all_prg
      //   - Otherwise, if #runs is the same in prg and all_prg,
      //     print error message if prg != all_prg
      // Write merged gcda
    }
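
As an illustration only (this is not the libgcov source), the
first-file/later-file check described in the loop above can be
sketched as follows. The names struct summary, struct merge_state and
check_summary are hypothetical, and the summary is reduced to the
three fields discussed in this thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a program summary; the real
   libgcov structure has more fields.  */
struct summary
{
  unsigned runs;    /* number of merged executions */
  unsigned num;     /* number of counters */
  int64_t sum_all;  /* sum of all counters over all runs */
};

/* State kept by one process while it walks its list of gcda files.  */
struct merge_state
{
  bool have_all_prg;       /* true once the first file has been merged */
  struct summary all_prg;  /* summary saved from the first merged file */
  bool warned;             /* the message is emitted at most once */
};

/* Record or check the merged summary PRG of one gcda file.
   Returns true when this call would print the mismatch message.  */
static bool
check_summary (struct merge_state *st, const struct summary *prg)
{
  assert (st && prg);
  if (!st->have_all_prg)
    {
      /* First merged file for this execution: just remember it.  */
      st->all_prg = *prg;
      st->have_all_prg = true;
      return false;
    }
  /* Only compare when the number of runs matches, and only warn once.  */
  if (st->warned || prg->runs != st->all_prg.runs)
    return false;
  if (prg->num != st->all_prg.num || prg->sum_all != st->all_prg.sum_all)
    {
      st->warned = true;
      return true;
    }
  return false;
}
```

A caller would zero-initialize one struct merge_state per process and
call check_summary once per gcda file, after merging that file's
program summary.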

I found that in a couple of cases, we printed the error message for
libcpp/directives.gcda, where the saved all_prg summary was from
gcc/gcc.gcda.

I then instrumented the code so that each time we merge into one of
these two gcda files I emit the pids, the number of runs, the number
of counters and the merged sum_all. Comparing the results from all the
merges to these two gcda files, I see that most of the time the merges
proceed in the same order. In a few cases, though, the order is
different, resulting in a different sum_all with the same number of
runs; then things go back to normal and the sum_all matches again.
E.g., here is one place where things get out of order briefly,
resulting in one of the error messages being printed:

...
pid 28432 ppid 28429 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/gcc/gcc.gcda with runs 254 num 13193 sum_all 17058327
pid 28437 ppid 28365 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/gcc/gcc.gcda with runs 255 num 13193 sum_all 17064832
pid 28439 ppid 28367 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/gcc/gcc.gcda with runs 256 num 13193 sum_all 17071340
pid 28440 ppid 28436 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/gcc/gcc.gcda with runs 257 num 13193 sum_all 17177525
...

vs

...
pid 28432 ppid 28429 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/libcpp/directives.gcda with runs 254 num 13193 sum_all 17058327
pid 28439 ppid 28367 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/libcpp/directives.gcda with runs 255 num 13193 sum_all 17064835
pid 28437 ppid 28365 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/libcpp/directives.gcda with runs 256 num 13193 sum_all 17071340
pid 28440 ppid 28436 Merging summary for /home/tejohnson/extra/gcc_trunk_3_obj/libcpp/directives.gcda with runs 257 num 13193 sum_all 17177525
...

Notice that the middle two pids (28437 and 28439) are flipped, so the
sum_all differs at runs 255 (17064832 vs 17064835) and matches again
at runs 256 (17071340 in both files).
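
The arithmetic behind this can be checked with a small sketch. It
assumes that each process adds a fixed amount to sum_all when it
merges, and it infers the two amounts from the logs above (6505 for
pid 28437 and 6508 for pid 28439, i.e. the differences between
consecutive sum_all values). The helper name running_sums is
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Add DELTAS[0..N-1] to BASE in the given order and store each
   intermediate total in OUT[0..N-1].  */
static void
running_sums (int64_t base, const int64_t *deltas, int n, int64_t *out)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    {
      base += deltas[i];
      out[i] = base;
    }
}
```

Starting from 17058327, merging 6505 then 6508 gives the gcc.gcda
sequence, and merging 6508 then 6505 gives the directives.gcda
sequence. The intermediate totals differ, while the totals after both
merges are equal because addition is commutative.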

I believe this could happen if pids 28437 and 28439 finished
near-simultaneously and both waited for the lock on gcc.gcda, which
28437 won. Then, by some luck of timing, they both attempted to open
directives.gcda at around the same time, and this time 28439 happened
to win the lock in the fcntl loop.
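
For readers unfamiliar with this kind of locking, here is a minimal
sketch of a blocking fcntl write lock that retries on EINTR. It is not
the libgcov code; lock_whole_file is a hypothetical helper. As far as
I know, POSIX does not specify which of several blocked waiters is
granted the lock next, which is why two processes can acquire two
different files' locks in opposite orders:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Block until FD is write-locked over its whole length.
   Returns 0 on success, -1 on any error other than EINTR.  */
static int
lock_whole_file (int fd)
{
  struct flock fl = {
    .l_type = F_WRLCK,     /* exclusive (write) lock */
    .l_whence = SEEK_SET,  /* l_start is relative to start of file */
    .l_start = 0,
    .l_len = 0             /* 0 means "through end of file" */
  };

  assert (fd >= 0);
  /* F_SETLKW waits for the lock; retry if a signal interrupts it.  */
  while (fcntl (fd, F_SETLKW, &fl) == -1)
    if (errno != EINTR)
      return -1;
  return 0;
}
```

The lock is released when the process closes the descriptor or
unlocks the range with F_UNLCK.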

I believe it is also possible for object files to be in different
orders in the gcov_list in different processes, since they are added
to the head of that list in __gcov_init, which is invoked when running
an object file's global constructors, according to the header comment.
And for C++ at least, the order of initialization across translation
units is unspecified. That could also cause the sum_all to go
temporarily out of sync between the gcda files of different object
files.
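
A minimal sketch of head insertion, to show why the list order follows
the order in which constructors run. This is a simplified model, not
__gcov_init itself; struct info and register_info are hypothetical
names:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced per-object-file record.  */
struct info
{
  const char *filename;  /* gcda file this object writes to */
  struct info *next;     /* next record in the list */
};

/* Prepend P to the list at *HEAD, as a constructor-time registration
   function might do.  */
static void
register_info (struct info **head, struct info *p)
{
  assert (head && p);
  p->next = *head;
  *head = p;
}
```

Registering a.gcda and then b.gcda yields the list b, a; registering
them in the opposite order yields a, b. A later walk over the list
therefore visits the files in the reverse of whatever order the
registrations happened to run in.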

Overall, I think it makes sense to remove this check altogether. Would
you agree? I am testing a patch to remove it right now.

Teresa

--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

