public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adder <adder.thief@gmail.com>
To: DJ Delorie <dj@redhat.com>
Cc: libc-alpha@sourceware.org, fweimer@redhat.com
Subject: Re: [PATCH] malloc tcache: Debugger now sees the address of the corrupted chunk.
Date: Thu, 17 Dec 2020 04:08:30 +0200	[thread overview]
Message-ID: <CAHQ-jgiED=YXhJKFj114CkZO9zaNvfwOpvwOos6BATGOe0cSAQ@mail.gmail.com> (raw)
In-Reply-To: <xna6udz8mj.fsf@greed.delorie.com>

On Thu, Dec 17, 2020 at 12:30 AM DJ Delorie <dj@redhat.com> wrote:
>
> Adder <adder.thief@gmail.com> writes:
>
> > And shuffling the qualifier eliminates the need for casting to void.
>
> The (void) is a common idiom; we should leave it even if it has no
> effect *at the moment*.  It tells future code readers our current
> intentions, and is more resilient to compiler changes as the compiler
> devs know about it too.

I see. Okay.



> > We can also use the following:
> >
> >   ((volatile tcache_entry *) tcache->entries[tc_idx])->next;
> >
> > It produces the exact same machine code, but it is more resilient
> > to changes in the definition of our tcache_entry struct
> > (example: to a field being added before our first field, i.e. before "next").
>
> For validating the "next" pointer, it doesn't matter what the "next"
> tcache_entry struct looks like; it only matters if it's in valid memory.
> Reading the first word of the entry is sufficient.  tcache->entries[] is
> a pointer, e->next is a pointer, either way we need to read *one* word
> to validate that pointer.  We don't need to read the whole struct it
> points to, nor does it matter which word in the struct is read.
>
> I would argue that casting to a non-struct pointer emphasizes that we
> care about the pointer more than what it points to, but a sufficiently
> verbose comment would suffice as well.

After reading the explanation (thanks), I can understand the casts.



> Whether we use e->next or tcache->entries for this purpose is
> irrelevent, I think.  Whichever results in a smaller line of code is
> probably a bit cleaner ;-)

Then does the version without casts win ?  [-:



> >     mov [r9], rsi ; Crash here if e happens to be an only-allow-reading pointer.
>
> Given the size of a 64-bit address space, the odds of randomly hitting a
> read-only page is negligible, compared to the odds of hitting an
> unmapped page, which is very high.  Given that this code is on the fast
> path, I would argue against this as "not useful enough".

I see. I agree about probability.

But perhaps the speed cost is lower than it appears.
I have always been bad at evaluating time.
I am going to think about a way to benchmark this.
Suggestions are warmly welcome.

In the meanwhile, could we also consider the following alternative ?

  if (e_next)
    {
      if (__glibc_unlikely (e_next->key != tcache))
        please_crash_in_a_way_which_allows_debugger_to_print_e ();
    }

Advantages (YMMV):

  - no (volatile T *), no (void), no other cast;
  - easier to understand (even without "corrupted pointer" in mind);
  - only loads, no stores;
  - probability of e_next accidentally being valid and having a good key is low.

Disadvantages:

  - conditional jump;
  - a bit slower than just reading (Really ? Should we benchmark ?).



> > +  if (__glibc_unlikely (e->key != tcache))
> > +    * ((tcache_entry *volatile *) 0) = e;
>
> This is where malloc_printerr should be called.  Even if the data is
> corrupt, this is what we do elsewhere in these cases.

Purpose is to give "e" to the debugger (including the human debugger).
In my testing (on x86_64), adding malloc_printerr here loses "e".

For clarity and consistence with usage of malloc_printerr elsewhere,
I wish to suggest adding a function malloc_printerr_4
which is given the pointer to the string and 4 additional uint64_t args.

It is not necessary to hex-dump the 4 values in the output message
(though it could be done). Just to make them available to the debugger.

Then this could be used to complete other messages
where the address of the corrupted chunk is/might-be lost to the debugger,
e.g. "corrupted size vs. prev_size", "corrupted double-linked list".

(Currently, I am not saying that the address is lost there. To be tested.)



> > I wish to suggest that we quarrel on Comments after we agree on Code. (-:
>
> Perhaps, but good comments will be required ;-)

I agree !

Let us just postpone them for a little bit, if you will, as it helps
me more easily read our emails with versions of Code.

  reply	other threads:[~2020-12-17  2:08 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-07  0:57 Adder
2020-12-11 22:51 ` DJ Delorie
2020-12-11 23:48   ` Adder
2020-12-12  0:09     ` DJ Delorie
2020-12-12  3:17 ` DJ Delorie
2020-12-12  5:18   ` Adder
2020-12-12  5:32     ` DJ Delorie
2020-12-13  2:16       ` Adder
2020-12-16 22:30         ` DJ Delorie
2020-12-17  2:08           ` Adder [this message]
2020-12-17  4:18             ` DJ Delorie
2020-12-22  2:15               ` Adder
2020-12-17  9:34           ` Florian Weimer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHQ-jgiED=YXhJKFj114CkZO9zaNvfwOpvwOos6BATGOe0cSAQ@mail.gmail.com' \
    --to=adder.thief@gmail.com \
    --cc=dj@redhat.com \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).