public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Gabriel Charette <gcharette1@gmail.com>
To: Diego Novillo <dnovillo@google.com>
Cc: reply@codereview.appspotmail.com, crowl@google.com,
		gcc-patches@gcc.gnu.org
Subject: Re: [pph] Make libcpp symbol validation a warning (issue5235061)
Date: Thu, 13 Oct 2011 22:39:00 -0000	[thread overview]
Message-ID: <CAAb05gEkkvi+7Tc6ZDSAD8Lo=83oBttyCB0-q2ARUN5AX-Ej-A@mail.gmail.com> (raw)
In-Reply-To: <20111011202636.D76B41DA1D2@topo.tor.corp.google.com>

Just looked at the line_table related sections, but see comments below:

On Tue, Oct 11, 2011 at 4:26 PM, Diego Novillo <dnovillo@google.com> wrote:
>
> Currently, the consistency check done on pre-processor symbols is
> triggering on symbols that are not really problematic (e.g., symbols
> used for double-include guards).
>
> The problem is that in the testsuite, we are refusing to process PPH
> images that fail that test, which means we don't get to test other
> issues.  To avoid this, I changed the error() call to warning().  Seemed
> innocent enough, but there were more problems behind that one:
>
> 1- We do not really try to avoid reading PPH images more than once.
>   This problem is different than the usual double-inclusion guard.
>   For instance, suppose a file foo.pph includes 1.pph, 2.pph and
>   3.pph.  When generating foo.pph, we read all 3 files just once and
>   double-include guards do not need to trigger.  However, if we are
>   later building a TU with:
>        #include 2.pph
>        #include foo.pph
>   we first read 2.pph and when reading foo.pph, we try to read 2.pph
>   again, because it is mentioned in foo.pph's line map table.
>
>   I added a guard in pph_stream_open() so it doesn't try to open the
>   same file more than once, but that meant adjusting some of the
>   assertions while reading the line table.  We should not expect to
>   find foo.pph's line map table exactly like the one we wrote.

That makes sense.

> @@ -328,8 +327,6 @@ pph_in_line_table_and_includes (pph_stream *stream)
>   int entries_offset = line_table->used - PPH_NUM_IGNORED_LINE_TABLE_ENTRIES;
>   enum pph_linetable_marker next_lt_marker = pph_in_linetable_marker (stream);
>
> -  pph_reading_includes++;
> -
>   for (first = true; next_lt_marker != PPH_LINETABLE_END;
>        next_lt_marker = pph_in_linetable_marker (stream))
>     {
> @@ -373,19 +370,33 @@ pph_in_line_table_and_includes (pph_stream *stream)
>          else
>            lm->included_from += entries_offset;
>
> -         gcc_assert (lm->included_from < (int) line_table->used);
> -

This should still hold, it is impossible that included_from points to
an entry that doesn't exist (i.e. beyond line_table->used), but since
we recalculate it on the previous line, adding entries_offset, this
was just a safe check to make sure everything read makes sense.

>          lm->start_location += pph_loc_offset;
>
>          line_table->used++;
>        }
>     }
>
> -  pph_reading_includes--;
> +  /* We used to expect exactly the same number of entries, but files
> +     included from this PPH file may sometimes not be needed.  For
> +     example,
> +
> +       #include "2.pph"
> +       #include "foo.pph"
> +         +-->  #include "1.pph"
> +               #include "2.pph"
> +               #include "3.pph"
> +
> +     When foo.pph was originally created, the line table was built
> +     with inclusions of 1.pph, 2.pph and 3.pph.  But when compiling
> +     the main translation unit, we include 2.pph before foo.pph, so
> +     the inclusion of 2.pph from foo.pph does nothing.  Leaving the
> +     line table in a different shape than the original compilation.
>
> +     Instead of insisting on getting EXPECTED_IN entries, we expect at
> +     most EXPECTED_IN entries.  */
>   {
>     unsigned int expected_in = pph_in_uint (stream);
> -    gcc_assert (line_table->used - used_before == expected_in);
> +    gcc_assert (line_table->used - used_before <= expected_in);

I'm not sure exactly how you skip headers already parsed now (we
didn't used to when I wrote this code and that was the only problem
remaining in the line_table (i.e. duplicate entries for guarded
headers in the non-pph compile)), but couldn't you count the number of
skipped entries and assert (line_table->used - used_before) +
numSkipped == expected_in) ?

I'd have to re-download the code, I've bee following through patches,
but I'm not so sure now exactly how the "guarded headers skipping" is
done, my memorized knowledge of the codebase has diverged I feel..!

A more important note: I think it could be worth having a new flag
that outputs the line_table when done parsing (as a mean to robustly
test it). My way to test it usually was to breakpoint on
varpool_assemble_decl (a random choice, but it was only called after
parsing was done...), both in pph and non-pph compiles and compare the
line_table in gdb.... However, to have a stable test in the long run,
it could be nice to have a flag that asks for an output of the
line_table and then we could checksum and compare the line_table
outputted by the pph and non-pph compiles.

A good test I had found to break in and analyze the line_table was
p4eabi.h as it pretty much had all the problems that I fixed regarding
the line_table (it also has re-includes if I remember correctly, but
that wasn't a problem before as we would not guard out re-includes as
I just mentioned above).

Having such a robust test would be important I feel as, as we saw with
previous bugs, discrepancies in the line_table can result in tricky
unique ID diffs.

If you are now guarding out re-includes the line_table should now be
perfectly identical in pph and non-pph as this was the only thing
remaining I think :)!

Cheers,
Gab

>
> --
> This patch is available for review at http://codereview.appspot.com/5235061

  reply	other threads:[~2011-10-13 21:56 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-11 21:30 Diego Novillo
2011-10-13 22:39 ` Gabriel Charette [this message]
2011-10-14 13:43   ` Diego Novillo
2011-10-15  8:00     ` Gabriel Charette
2011-10-17 13:39       ` Diego Novillo
2011-10-21  5:53 ` Gabriel Charette
2011-10-21 19:06   ` Lawrence Crowl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAAb05gEkkvi+7Tc6ZDSAD8Lo=83oBttyCB0-q2ARUN5AX-Ej-A@mail.gmail.com' \
    --to=gcharette1@gmail.com \
    --cc=crowl@google.com \
    --cc=dnovillo@google.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=reply@codereview.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).