public inbox for gcc-help@gcc.gnu.org
From: x visitor <visitor20212@outlook.com>
To: Henri Cloetens <henri.cloetens@blueice.be>
Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
Subject: Re: When will gcc assign local variables to registers?
Date: Thu, 12 Nov 2020 11:43:33 +0000
Message-ID: <SL2PR02MB35960841B86F5DFCA1C4177883E70@SL2PR02MB3596.apcprd02.prod.outlook.com>
In-Reply-To: <a6f61a82-0552-07b6-f982-781098c33925@blueice.be>

Thank you, Henri.

The goal is to determine whether two instructions in a binary access the same variable. This
information would be very helpful for detecting code similarity and homology.

The challenge is that the source code is not available (yes, it is closer to reverse engineering).
Even worse, in the more common case the compiler options used to generate the target binary are
unknown and outside our control. I suppose this task is a little easier than full reverse
engineering, because the latter aims at complete recovery of the source code. I am collecting
alias analysis techniques that apply to the binary-only setting, in the hope of finding a solution.
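
To make the question concrete, here is a minimal sketch in C of the most conservative
form of such a check. Everything in it is an assumption made for illustration: the
MemOperand model, the register numbering, and the classify_alias name are hypothetical,
and the result is only meaningful if the base register holds the same value at both
instructions (for example a frame pointer that is not modified in between).

```c
#include <stdint.h>

/* Hypothetical model of a memory operand such as -8(%rbp):
 * a base register, a constant displacement, and an access size in bytes. */
typedef struct {
    int      base_reg;  /* illustrative register number, e.g. 6 for a frame pointer */
    int64_t  disp;      /* constant displacement from the base register */
    uint32_t size;      /* number of bytes read or written */
} MemOperand;

typedef enum { NO_ALIAS, MAY_ALIAS, MUST_ALIAS } AliasResult;

/* Conservative comparison of two memory operands.
 * Assumption: if both operands use the same base register, that register
 * holds the same value at both instructions. */
AliasResult classify_alias(const MemOperand *a, const MemOperand *b)
{
    /* Different base registers: nothing is known about their relation. */
    if (a->base_reg != b->base_reg)
        return MAY_ALIAS;

    /* Same base, same displacement, same size: the same bytes are accessed. */
    if (a->disp == b->disp && a->size == b->size)
        return MUST_ALIAS;

    /* Same base: compare the half-open byte ranges [disp, disp + size). */
    int64_t a_end = a->disp + (int64_t)a->size;
    int64_t b_end = b->disp + (int64_t)b->size;
    if (a->disp < b_end && b->disp < a_end)
        return MAY_ALIAS;   /* the ranges overlap partially */

    return NO_ALIAS;        /* disjoint ranges off the same base */
}
```

With these assumptions, two 4-byte accesses at -8 off the same base are classified as
MUST_ALIAS, accesses at -8 and -12 as NO_ALIAS, and anything involving two different base
registers as MAY_ALIAS. A real binary-level analysis would also have to track how register
values change between the two instructions, which this sketch does not attempt.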

I will try the -fdump options and use them as a good way to learn GCC's behavior.

Thank you again.


From: Gcc-help <gcc-help-bounces@gcc.gnu.org> on behalf of Henri Cloetens <henri.cloetens@blueice.be>
Sent: Thursday, November 12, 2020 14:54
To: gcc-help@gcc.gnu.org <gcc-help@gcc.gnu.org>
Subject: Re: When will gcc assign local variables to registers?

Dear Sir,

What do you want to do?

- GCC, especially when the optimizers are turned on, heavily transforms
  the source code.
  If you want to reverse-engineer, in order to recognize the C source
  in the assembly, some suggestions:
1. Turn on the debug option (-g). GCC then annotates the assembly with
   information about which assembly statement belongs to which source
   line. It may not work well with the optimizer on.
2. Run the compiler with the -fdump options (for example
   -fdump-tree-all and -fdump-rtl-all). It then outputs a lot of verbose
   files documenting how the compilation and the optimizations have been
   done. They include all the restructuring, and also the register
   allocation. Now, good luck with that: these files are long and
   difficult to read.
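
As a small illustration of the two suggestions above, the following C function has a
few locals (total, i, sq) whose placement can be inspected. The file name locals.c and
the function name sum_squares are assumptions made for this sketch; the command lines in
the comment use standard GCC options, but which dump files appear and how they are named
depends on the GCC version.

```c
/* Assumed file name: locals.c
 *
 * Annotated assembly with source-line information:
 *   gcc -O2 -g -S -fverbose-asm locals.c
 *
 * Per-pass dump files, including the register allocation passes:
 *   gcc -O2 -c -fdump-tree-all -fdump-rtl-all locals.c
 */

/* Sums the squares of the first n elements of v.
 * With optimization on, total, i and sq will typically live in registers;
 * without optimization they are typically kept in stack slots. */
int sum_squares(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        int sq = v[i] * v[i];
        total += sq;
    }
    return total;
}
```

Comparing the output at -O0 and at -O2 is probably the quickest way to see how much the
treatment of the same local variable can differ between builds.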

Best Regards,

Henri.


On 11/12/20 6:04 AM, visitor x via Gcc-help wrote:
> Thank you for the pointer.
>
> I learned about SSA and realized that the problem is more challenging than I thought. So far, my understanding of SSA is that compilers restrict each variable to a single definition site by introducing phi functions and other tools. This facilitates data-flow analysis and further optimizations such as dead code elimination.
>
> My earlier idea was to list all the ways in which compilers assign variables to registers, so that it would be easier to recover variables from the binary. Now that seems to be an impossible mission. So I have rethought my ultimate goal, which is essentially to track the sequence of variable accesses, and which (maybe) does not require full decompilation.
>
> All we need to know is whether two instructions access the same variable (or object, if compilers care only about values). It sounds like alias analysis on binaries. Is that a specialized subfield of program/binary analysis?
>
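
The single-definition property of SSA described in the quoted message can be imitated
in plain C. This is only a hand-written sketch: the function names are hypothetical,
the phi function is emulated with a conditional expression, and GCC's own SSA form
(which can be dumped with -fdump-tree-ssa) looks different.

```c
/* Ordinary form: the variable x has three definition sites. */
int clamp_double(int a, int b)
{
    int x = a + b;
    if (x > 100)
        x = 100;
    x = x * 2;
    return x;
}

/* SSA-style form: every name is defined exactly once.
 * x3 plays the role of phi(x2, x1), merging the two incoming definitions. */
int clamp_double_ssa(int a, int b)
{
    const int x1 = a + b;
    const int x2 = 100;
    const int x3 = (x1 > 100) ? x2 : x1;  /* stands in for the phi function */
    const int x4 = x3 * 2;
    return x4;
}
```

Both functions compute the same result. The point of the second form is that the single
source variable x has become four distinct values, which is one reason why mapping
registers in a binary back to source-level variables is hard.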


Thread overview: 5+ messages
2020-11-12  5:04 ` visitor x
2020-11-12  6:54   ` Henri Cloetens
2020-11-12 11:43     ` x visitor [this message]
2020-11-10 11:27 visitor x
2020-11-10 15:24 ` Andrew Haley
