public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jakub Jelinek <jakub@redhat.com>
To: Dodji Seketeli <dodji@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Kostya Serebryany <kcc@google.com>,
	       Dmitry Vyukov <dvyukov@google.com>
Subject: Re: [PATCH 2/2] [asan] Avoid instrumenting duplicated memory access in the same basic block
Date: Tue, 29 Jan 2013 15:01:00 -0000	[thread overview]
Message-ID: <20130129150022.GS4385@tucnak.redhat.com> (raw)
In-Reply-To: <87ehh5xabo.fsf@redhat.com>

On Mon, Jan 28, 2013 at 10:32:43PM +0100, Dodji Seketeli wrote:

Thanks for working on this.

> @@ -212,6 +213,159 @@ alias_set_type asan_shadow_set = -1;
>     alias set is used for all shadow memory accesses.  */
>  static GTY(()) tree shadow_ptr_types[2];
>  
> +/* Hashtable support for memory references used by gimple
> +   statements.  */
> +
> +/* This type represents a reference to a memory region.  */
> +struct __attribute__ ((visibility ("hidden"))) mem_ref

This is wrong, you aren't checking whether visibility or even hidden
visibility is supposed by the compiler.

The C++ way I think would be to use anonymous namespace,
but I don't see anything like that used in gcc/*.c yet, so
perhaps just name it asan_mem_ref and be done with that.

> +{
> +  /* The expression of the begining of the memory region.  */
> +  tree start;
> +  /* The expression representing the length of the region.  */
> +  tree len;

Do you really need to store len as tree?  Wouldn't it be better
just to store the access size in bytes as integer (1, 2, 4, 8 or 16
bytes) into say unsigned char; variable?

> +struct __attribute__((visibility ("hidden"))) mem_ref_hasher
> +  : typed_delete_remove <mem_ref>

Likewise.

> +static hash_table <mem_ref_hasher>&

Not sure about agreed formatting for references, but grep '>&' doesn't
show anything, while '> &' shows some hits.

> +get_mem_ref_hash_table ()
> +{
> +    static hash_table <mem_ref_hasher> ht;
> +
> +    if (!ht.is_created ())
> +      ht.create (10);
> +
> +    return ht;

The above is all indented too much.

> +  hash_table <mem_ref_hasher> ht = get_mem_ref_hash_table ();
> +  mem_ref **slot = ht.find_slot (&ref, INSERT);
> +  gcc_assert (*slot == NULL);
> +  *slot = new mem_ref (ref);

Wouldn't it be better to use pool_alloc/pool_free here instead of
new/delete?

>      case BUILT_IN_STRLEN:
> -      return instrument_strlen_call (iter);
> +      source0 = gimple_call_arg (call, 0);

Reminds me that we should replace all uses of is_gimple_builtin_call
in asan.c/tsan.c with gimple_call_builtin_p (stmt, BUILT_IN_NORMAL),
and probably nuke is_gimple_builtin_call, otherwise no verification
whether K&R unprototyped size_t strlen (); etc. aren't used
say with x = strlen (); .  Right now gimple_call_arg here is unsafe,
we haven't checked, whether it has at least one argument.  But if we
ignore calls which don't match the builtin prototype, we'd be safe.

> +static int
> +test1 ()
> +{
> +  /*__builtin___asan_report_store1 called 1 time here to instrument
> +    the initialization.  */
> +  char foo[4] = {1}; 
> +
> +  /*__builtin___asan_report_store1 called 2 times here to instrument
> +    the store to the memory region of tab.  */
> +  __builtin_memset (tab, 3, sizeof (tab));
> +
> +  /* There is no instrumentation for the two memset calls below.  */
> +  __builtin_memset (tab, 4, sizeof (tab));
> +  __builtin_memset (tab, 5, sizeof (tab));

If you don't instrument the above two, it shows something bad.
I'd say for calls which test two bytes (first and last) you actually
should enter into the hash table two records, one that &tab[0] with size 1
byte has been instrumented, another one that &tab[3] (resp. 4, resp. 5)
with size 1 byte has been instrumented, and e.g. if the first access has
been instrumented already, but second access hasn't (or vice versa), just
emit instrumentation for the one which is missing.

For future improvements (but 4.9 material likely, after we lower
each instrumentation just to a single builtin first), different access
sizes and/or different is_store should just mean we (at least in the same bb
and with no intervening calls that could never return) could upgrade the
the previous instrumentation to cover a wider access size, or load into
store (or just ignore loads vs. stores?).

> +/* 

I wouldn't start the comment on different line.

> +   { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 7 "instrumented stores"}  }

Space between " and }, only one space between }  }.

> +
> +   { dg-final { scan-tree-dump-times "__builtin___asan_report_load" 4 "instrumented loads"}  }

and just close */ at the end of the above line (or use /* */ on each line
separately.

	Jakub

  reply	other threads:[~2013-01-29 15:01 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-28 21:27 [PATCH 0/2] Avoid duplicated instrumentation in Address Sanitizer Dodji Seketeli
2013-01-28 21:30 ` [PATCH 1/2] [asan] Allow creating/deleting hash table entries with new/delete Dodji Seketeli
2013-01-28 22:08   ` Lawrence Crowl
2013-01-28 21:32 ` [PATCH 2/2] [asan] Avoid instrumenting duplicated memory access in the same basic block Dodji Seketeli
2013-01-29 15:01   ` Jakub Jelinek [this message]
2013-01-29 15:12     ` Jakub Jelinek
2013-02-12 14:20     ` Dodji Seketeli
2013-02-12 14:27       ` Jakub Jelinek
2013-02-12 16:28         ` Dodji Seketeli
2013-02-12 17:03           ` Jakub Jelinek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130129150022.GS4385@tucnak.redhat.com \
    --to=jakub@redhat.com \
    --cc=dodji@redhat.com \
    --cc=dvyukov@google.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=kcc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).