public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug other/109163] SARIF (and other JSON) output files are non-deterministic
Date: Fri, 24 Mar 2023 15:41:25 +0000
Message-ID: <bug-109163-4-0oQgfCpOLO@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-109163-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109163

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:7f1e15f743357e037d7c4f6f6000863c26f3dfc3

commit r13-6851-g7f1e15f743357e037d7c4f6f6000863c26f3dfc3
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Fri Mar 24 11:38:14 2023 -0400

    json: preserve key-insertion order [PR109163]

    PR other/109163 notes that when we write out JSON files, we traverse
    the keys within each object via hash_map iteration, and thus the
    ordering is non-deterministic: it can vary arbitrarily from run to
    run and between machines, making it harder for users to compare
    results and determine if anything has "really" changed.

    I'm running into this issue with SARIF output, but there are several
    places where we're currently emitting JSON:

      * -fsave-optimization-record emits SRCFILE.opt-record.json.gz
            "This option is experimental and the format of the data within
            the compressed JSON file is subject to change."; see
            optinfo-emit-json.{h,cc}, dumpfile.cc, etc.
      * -fdiagnostics-format= with the various "sarif" and "json" options
      * -fdump-analyzer-json is a developer option in the analyzer
      * gcov has:
         "-j, --json-format: Output JSON intermediate format into
         .gcov.json.gz file"

    This patch adds an auto_vec to class json::object to preserve
    key-insertion order, and uses it when writing out objects.  Potentially
    this slightly slows down JSON output, but I believe that this isn't
    normally a bottleneck, and that the benefits to the user of
    deterministic output are worth it.

    I had first attempted to use ordered_hash_map.h for this, but ran into
    impenetrable template errors, so this patch uses a simpler approach of
    just adding an auto_vec to json::object.

    Testing showed a failure of diagnostic-format-json-5.c, which was using
    a convoluted set of regexps to consume the output; I believe that this
    was brittle, and was intermittently failing for some of the random
    orderings of output.  I rewrote these regexps to work with the expected
    output order.  The other such tests seem to pass with the
    now-deterministic orderings.

    gcc/ChangeLog:
            PR other/109163
            * json.cc: Update comments to indicate that we now preserve
            insertion order of keys within objects.
            (object::print): Traverse keys in insertion order.
            (object::set): Preserve insertion order of keys.
            (selftest::test_writing_objects): Add an additional key to verify
            that we preserve insertion order.
            * json.h (object::m_keys): New field.

    gcc/testsuite/ChangeLog:
            PR other/109163
            * c-c++-common/diagnostic-format-json-1.c: Update comment.
            * c-c++-common/diagnostic-format-json-2.c: Likewise.
            * c-c++-common/diagnostic-format-json-3.c: Likewise.
            * c-c++-common/diagnostic-format-json-4.c: Likewise.
            * c-c++-common/diagnostic-format-json-5.c: Rewrite regexps.
            * c-c++-common/diagnostic-format-json-stderr-1.c: Update comment.

    Signed-off-by: David Malcolm <dmalcolm@redhat.com>
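
The approach described in the commit message (a vector of keys kept
alongside the hash map, and walked when printing) can be sketched roughly
as follows.  This is only an illustration, not the code from json.cc: it
uses std::unordered_map and std::vector in place of GCC's hash_map and
auto_vec, stores values as plain strings, does no string escaping, and the
names ordered_object and render are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for json::object; values are plain strings here.
class ordered_object
{
public:
  // Insert or overwrite KEY.  A key is appended to m_keys only the first
  // time it is seen, so in this sketch overwriting keeps its position.
  void set (const std::string &key, const std::string &value)
  {
    auto it = m_map.find (key);
    if (it == m_map.end ())
      {
        m_keys.push_back (key);
        m_map.emplace (key, value);
      }
    else
      it->second = value;
  }

  // Print by walking m_keys in insertion order rather than iterating
  // over m_map, whose iteration order is unspecified.
  void print (std::ostream &out) const
  {
    out << '{';
    for (std::size_t i = 0; i < m_keys.size (); ++i)
      {
        if (i > 0)
          out << ", ";
        out << '"' << m_keys[i] << "\": \"" << m_map.at (m_keys[i]) << '"';
      }
    out << '}';
  }

private:
  std::unordered_map<std::string, std::string> m_map; // key -> value lookup
  std::vector<std::string> m_keys;                    // keys in insertion order
};

// Convenience helper (hypothetical): render an object to a string.
std::string render (const ordered_object &obj)
{
  std::ostringstream ss;
  obj.print (ss);
  return ss.str ();
}
```

With this sketch, setting "zebra", then "apple", then "zebra" again renders
the keys as zebra followed by apple on every run, whatever order the
underlying map happens to use.  The cost is one extra vector append per new
key and one map lookup per key when printing.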
