* Re: Re: When will gcc assign local variables to registers?
       [not found] <SL2PR02MB3596DBDC06124B066570F3D383E80@SL2PR02MB3596.apcprd02.prod.outlook.com>
@ 2020-11-12  5:04 ` visitor x
  2020-11-12  6:54   ` Henri Cloetens
  0 siblings, 1 reply; 5+ messages in thread
From: visitor x @ 2020-11-12  5:04 UTC (permalink / raw)
  To: gcc-help

Thank you for the pointer.

I read up on SSA and realized that the problem is more challenging than I thought. So far, my understanding of SSA is that the compiler restricts each variable to a single definition site by introducing phi-functions and other tools. This facilitates data-flow analysis and further optimizations such as dead code elimination.

My earlier idea was to list all the possible ways in which compilers assign variables to registers, which might make it easier to recover variables from a binary. That now looks like an impossible mission. So I have rethought my ultimate goal: essentially, tracking the sequence of variable accesses, which (maybe) doesn’t require full decompilation.

All we need to know is whether two instructions access the same variable (or the same object, if compilers care only about values). It sounds like alias analysis on binaries. Is that a specialized subfield of program/binary analysis?

^ permalink raw reply [flat|nested] 5+ messages in thread
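To make the SSA description above concrete, here is a minimal sketch in C. The function names `pick` and `pick_ssa` are made up for illustration, and the second function is only a hand-written imitation of SSA, not GCC's actual internal representation: every name is assigned exactly once, and the conditional expression stands in for the phi-function at the join point.

```c
/* Source form: one variable `x`, assigned at two different sites. */
int pick(int a, int b)
{
    int x = a;              /* SSA view: x_1 = a                  */
    if (a < b)
        x = b;              /* SSA view: x_2 = b                  */
    /* join point:             x_3 = phi(x_1, x_2)                */
    return x + 1;           /* SSA view: return x_3 + 1           */
}

/* Hand-written SSA-like form (illustrative only): each name has a
 * single definition, so "which definition reaches this use?" can be
 * answered just by looking at the name. */
int pick_ssa(int a, int b)
{
    const int x_1 = a;
    const int x_2 = b;
    const int x_3 = (a < b) ? x_2 : x_1;   /* stands in for phi */
    return x_3 + 1;
}
```

After this renaming the source-level `x` no longer exists as one thing: there are three values, and each may end up in a different register or in none at all.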
* Re: When will gcc assign local variables to registers?
  2020-11-12  5:04 ` Re: When will gcc assign local variables to registers? visitor x
@ 2020-11-12  6:54   ` Henri Cloetens
  2020-11-12 11:43     ` x visitor
  0 siblings, 1 reply; 5+ messages in thread
From: Henri Cloetens @ 2020-11-12  6:54 UTC
  To: gcc-help

Dear Sir,

What do you want to do?
- GCC, especially with the optimizers turned on, heavily transforms the source code.

If you want to reverse-engineer, in order to recognize the C source in the assembly, some suggestions:
1. Turn on the debug option (-g). GCC then annotates the assembly with information about which assembly statement belongs to which source line. It may not work well with the optimizer on.
2. Run the compiler with the dump options (-fdump-tree-all, -fdump-rtl-all). It then outputs a lot of verbose files documenting how the compilation and the optimizations were done. They include all the restructuring, and also the register allocation.
   Now, good luck with that: these are long files that are difficult to read.

Best Regards,

Henri.

On 11/12/20 6:04 AM, visitor x via Gcc-help wrote:
> Thank you for the pointer.
>
> I learned SSA and realized that the problem is more challenging than I thought. So far, my understanding of SSA is that compilers restrict the definition site of each variable to only one by introducing phi-function and other tools. In this way it facilitates data flow analysis and further optimization such as dead code elimination.
>
> My idea before is to list all possible manners that compilers assign variables to registers, then it may be easier to recover variables from binary. Now it seems to be an impossible mission. So I rethink my ultimate goal, essentially a track to variable access sequence, which doesn’t require full decompilation (maybe).
>
> All we need to know is whether two instructions access the same variable (or say object if compilers care about only values). It sounds like an alias analysis in binary. Is it a specialized subfield in program/binary analysis?
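The two suggestions in the message above can be tried on a small file. The sketch below assumes a file called `demo.c` and a function `sum_to`, both made up for illustration. The command lines in the comment use only switches that GCC documents: `-g`, `-S`, `-fverbose-asm`, and the per-family dump switches `-fdump-tree-all` and `-fdump-rtl-all`. The names of the dump files differ between GCC versions, so treat the ones mentioned as typical rather than guaranteed.

```c
/* demo.c (hypothetical file name)
 *
 * Annotated assembly:
 *     gcc -O2 -g -S -fverbose-asm demo.c
 *   writes demo.s; with -g it contains .loc directives that tie
 *   instructions to source lines.
 *
 * Pass-by-pass dumps:
 *     gcc -O2 -c -fdump-tree-all -fdump-rtl-all demo.c
 *   writes many demo.c.* files. The RTL dumps whose names end in
 *   "ira" and "reload" come from the register allocator (exact
 *   numbering and naming vary by GCC version).
 */
int sum_to(int n)
{
    int total = 0;                 /* candidate for a register    */
    for (int i = 1; i <= n; i++)   /* so is the loop counter      */
        total += i;
    return total;
}
```

Comparing `demo.s` at `-O0` and at `-O2` is a quick way to see the same locals move from stack slots into registers, or disappear.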
* Re: When will gcc assign local variables to registers?
  2020-11-12  6:54   ` Henri Cloetens
@ 2020-11-12 11:43     ` x visitor
  0 siblings, 0 replies; 5+ messages in thread
From: x visitor @ 2020-11-12 11:43 UTC
  To: Henri Cloetens; +Cc: gcc-help

Thank you, Henri.

The goal is to determine whether two instructions in a binary access the same variable. This information would be very helpful for detecting code similarity and homology.

The challenge is that the source code is not available (yes, it is more like reverse engineering). Even worse, in the more common case the compiler options used to generate the target binary are unknown and outside our control.

I suppose this task is a little easier than reverse engineering, because the latter aims at complete recovery of the source code. I'm collecting alias analysis techniques that are applicable to the binary-only situation, in the hope of finding a solution.

I will try the -fdump options and take them as a good way to learn gcc's behavior. Thank you again.

From: Gcc-help <gcc-help-bounces@gcc.gnu.org> on behalf of Henri Cloetens <henri.cloetens@blueice.be>
Sent: Thursday, November 12, 2020 14:54
To: gcc-help@gcc.gnu.org <gcc-help@gcc.gnu.org>
Subject: Re: When will gcc assign local variables to registers?

Dear Sir,

What do you want to do?
- GCC, especially with the optimizers turned on, heavily transforms the source code.

If you want to reverse-engineer, in order to recognize the C source in the assembly, some suggestions:
1. Turn on the debug option (-g). GCC then annotates the assembly with information about which assembly statement belongs to which source line. It may not work well with the optimizer on.
2. Run the compiler with the dump options (-fdump-tree-all, -fdump-rtl-all). It then outputs a lot of verbose files documenting how the compilation and the optimizations were done. They include all the restructuring, and also the register allocation.
   Now, good luck with that: these are long files that are difficult to read.

Best Regards,

Henri.
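The question "do two instructions access the same variable?" can be illustrated for the easiest case only: two memory operands addressed relative to the frame pointer inside one function. The sketch below is a toy under stated assumptions, not an established binary alias analysis. The types `mem_access` and `alias_result` and the function `classify` are hypothetical, the frame pointer is assumed not to change between the two instructions, and values that live purely in registers are not covered at all.

```c
#include <stdint.h>

/* Hypothetical model of a decoded memory operand. */
enum base_reg { BASE_FRAME, BASE_UNKNOWN };   /* e.g. rbp vs. anything else */

struct mem_access {
    enum base_reg base;   /* which register the address is relative to */
    int32_t disp;         /* displacement from that register, in bytes */
    uint8_t size;         /* access width in bytes                     */
};

enum alias_result { NO_ALIAS, MAY_ALIAS, MUST_ALIAS };

/* Classify two accesses made within the same function invocation. */
enum alias_result classify(struct mem_access a, struct mem_access b)
{
    /* Without a known common base nothing can be proved: be conservative. */
    if (a.base != BASE_FRAME || b.base != BASE_FRAME)
        return MAY_ALIAS;

    /* Same slot, same width: the same stack location. */
    if (a.disp == b.disp && a.size == b.size)
        return MUST_ALIAS;

    /* Byte ranges [disp, disp + size) overlap only partially. */
    int32_t a_end = a.disp + a.size;
    int32_t b_end = b.disp + b.size;
    if (a.disp < b_end && b.disp < a_end)
        return MAY_ALIAS;

    return NO_ALIAS;   /* disjoint slots in the same frame */
}
```

For example, two 4-byte accesses at displacement -4 classify as `MUST_ALIAS`, while 4-byte accesses at -8 and -4 classify as `NO_ALIAS`. Even a `MUST_ALIAS` answer only says "same stack slot": an optimizing compiler may reuse one slot for several source variables, so slot identity is not variable identity.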
* When will gcc assign local variables to registers?
@ 2020-11-10 11:27 visitor x
  2020-11-10 15:24 ` Andrew Haley
  0 siblings, 1 reply; 5+ messages in thread
From: visitor x @ 2020-11-10 11:27 UTC
  To: gcc-help

Hello,

I'm trying to identify, from a binary, the local variables that were assigned to registers.

I understand that compilers assign some local variables to registers rather than to the stack for performance reasons, and that which variables are chosen depends on the compiler. I also found that the x64 calling convention passes the first six arguments in registers (when their class is INTEGER).

However, I found almost nothing about the remaining cases. I am aware that compilers are free to place variables both on the stack and in registers, but what principle or specification do they follow?

Are there any materials about how gcc decides which variables should be kept in registers? Or any other hints for finding variables stored in registers from a binary?

visitor20
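The calling-convention remark above can be made concrete with a function that takes more than six integer arguments. The sketch assumes the System V AMD64 ABI used on x86-64 Linux (the Windows x64 convention differs), and the function name `weigh8` is made up for illustration.

```c
/* Under the System V AMD64 ABI, INTEGER-class arguments are passed,
 * in order, in rdi, rsi, rdx, rcx, r8 and r9. Further arguments go
 * on the stack. So here a..f arrive in registers, while g and h are
 * read from the caller's stack area. */
long weigh8(long a, long b, long c, long d,
            long e, long f, long g, long h)
{
    /* Distinct weights make each argument recognizable in the
     * generated assembly. */
    return a + 2 * b + 3 * c + 4 * d + 5 * e + 6 * f + 7 * g + 8 * h;
}
```

This rule only fixes where arguments are at the moment of the call. What happens to them inside the function body, like any other local, is left to the compiler.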
* Re: When will gcc assign local variables to registers?
  2020-11-10 11:27 visitor x
@ 2020-11-10 15:24 ` Andrew Haley
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Haley @ 2020-11-10 15:24 UTC
  To: gcc-help

On 10/11/2020 11:27, visitor x via Gcc-help wrote:
> I got that compilers will assign some local variable to registers rather than stack for performance reason, while which variable depends on the compiler. I also found there is a calling convention on x64 that the first six arguments are passed to register (when the class is INTEGER).
>
> However, I got almost nothing about the rest of the case. I am aware that compilers are free to arrange variables in both stack and registers, but what principle or specification will they follow?

None, really. Optimizing compilers don't really care about variables once the initial translation has been done, but about values. https://en.wikipedia.org/wiki/Static_single_assignment_form may help.

> Is there any materials about gcc's behavior on deciding which variables should be kept in registers? Or any other hint to find variables stored in register from a binary?

It's more complicated than that. Not only can local variables be assigned to registers, they can be assigned to different registers at different times. Not only that, but local variables can be eliminated completely or split into multiple copies in different registers.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
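The point about values rather than variables can be seen in a few lines of C. This is a sketch with made-up function names, and the comments describe what an optimizing compiler is allowed to do and typically does, not what any particular GCC version is guaranteed to emit.

```c
/* `tmp` and `result` are distinct variables in the source, but at -O2
 * the whole body is likely to collapse into one computation on n (for
 * instance a shift or multiply by 4), leaving nothing in the binary
 * that corresponds to either local. */
int scaled(int n)
{
    int tmp = n * 2;
    int result = tmp + tmp;
    return result;
}

/* Helper that takes the address of its argument's storage. */
void bump(int *p)
{
    *p += 1;
}

/* Taking the address of `local` asks for a memory location, yet if
 * bump() is inlined the compiler may still keep the value in a
 * register and never touch the stack. */
int with_address(int n)
{
    int local = n;
    bump(&local);
    return local;
}

/* volatile is one of the few cases where the accesses must really be
 * performed, so `slot` normally gets a stack location. */
int forced_memory(int n)
{
    volatile int slot = n;
    return slot + 1;
}
```

Looking at the assembly for these three functions at `-O0` and `-O2` shows all three outcomes mentioned above: locals on the stack, locals in registers, and locals that are gone.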