On 16/03/2023 14:44, David Malcolm wrote:
> On Thu, 2023-03-16 at 09:54 +0100, Pierrick Philippe wrote:
>> On 15/03/2023 17:26, David Malcolm wrote:
>>> On Wed, 2023-03-15 at 16:24 +0100, Pierrick Philippe wrote:

[stripping]

>>>> So, first question: is there any way to associate and track the state
>>>> of a rvalue, independently of its lvalue?
>>>>
>>>> To try to clarify the question, here's an example:
>>>>
>>>> '''
>>>> int __attribute__("__some_attribute__") x = 42;
>>>> /* STEP-1
>>>>   From now on, we consider the state of x as being marked by
>>>> some_attribute.
>>>> But in fact, in the log, we can observe that we'll have something like
>>>> this in the new '/ana::program_state/':
>>>> {0x4b955b0: (int)42: marked_state (‘x’)} */

[stripping]

>>>> int *y = &x;
>>>> /* STEP-2
>>>> For analysis purpose, you want to know that from now on, y is pointing
>>>> to marked data.
>>>> So you set state of the LHS of the GIMPLE statement (i.e. some ssa_name
>>>> instance of y) accordingly, with a state named 'points-to_marked_data'
>>>> and setting 'x' as the origin of the state (in the sense of the argument
>>>> /origin/ from '/ana::sm_context::on_transition/'.
>>>> What we now have in the new '/ana::program_state/' is this:
>>>> {0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0: (int)42
>>>> (‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */
>>> Yes: you've set the state on the svalue "&x", not on "y".
>>>
>>>> int z = *y;
>>>> /* STEP-3
>>>> Now you have implicit copy of marked data, and you want to report it.
>>>> So you state the LHS of the GIMPLE statement (i.e. some ssa_name
>>>> instance of z) as being marked, with 'y' as the origin.
>>>> What we now have in the new '/ana::program_state/' is this:
>>>> {0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0: (int)42
>>>> (‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */
>>>> '''
>>> Presumably the program_state also shows that you have a binding for the
>>> region "z", containing the svalue 42 (this is represented within the
>>> "store", within the region_model within the program_state).

[stripping]

This is an update about tracking the state of svalues instead of regions
for kinds of variables other than pointers.

If you consider the following code:

'''
int __attribute__((__taint__)) x = 42;
/* Program state:
{0x4b955b0: (int)42: marked_state (‘x’)} */
int y = 42;
// Program state unchanged
if (y);
/* When querying the sm_context about the state of y, it returns it as
being in a "marked_state", because its svalue is the same as x's,
even though no call to change y's state has been made.
And here it triggers a diagnostic for my analysis. */
'''

I now understand the internals of the analyzer regarding state tracking
much better. I also fully understand why you said you designed it mainly
for pointers: associating state with a pointer's svalue instead of the
pointer's region lets you avoid doing a points-to analysis.

But as you can see in the above code example, this has a drawback when
analyzing variables with different semantics, such as integer variables.

I will have to modify the analyzer's code to add a way for a state machine
to ask the analyzer to track the state of regions instead of svalues, so
that I can keep using it with my analysis plugin.

Do you think it would be interesting to have such a feature merged into
the analyzer?

In any case, I'll start working on it on top of the latest /trunk/, in an
appropriate branch.

Thank you for your time,

Pierrick
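
For illustration, the difference between the two tracking strategies in the 'x'/'y' example above can be sketched with a small standalone toy model. This is not analyzer code: ToyModel, State and build_example are hypothetical names, and the sketch assumes (as the program state dump above suggests) that both constants 42 are represented by one shared svalue, modelled here by the integer itself.

```cpp
#include <map>
#include <string>

// Toy model only: none of these types exist in GCC's analyzer.
enum class State { Start, Marked };

struct ToyModel {
    // "Store": region (here a variable name) -> svalue (here the constant).
    std::map<std::string, int> store;
    // State keyed by svalue, mimicking the behaviour described above.
    std::map<int, State> svalue_state;
    // State keyed by region, mimicking the proposed alternative.
    std::map<std::string, State> region_state;

    // Bind a variable to a value, like "int x = 42;".
    void bind(const std::string &var, int value) { store[var] = value; }

    // Mark a variable under both strategies so they can be compared.
    void mark(const std::string &var) {
        svalue_state[store.at(var)] = State::Marked;
        region_state[var] = State::Marked;
    }

    // Look the state up through the value currently bound to the variable.
    State state_by_svalue(const std::string &var) const {
        auto it = svalue_state.find(store.at(var));
        return it == svalue_state.end() ? State::Start : it->second;
    }

    // Look the state up through the variable itself.
    State state_by_region(const std::string &var) const {
        auto it = region_state.find(var);
        return it == region_state.end() ? State::Start : it->second;
    }
};

// Reproduces the example: x is marked, y merely holds the same constant.
ToyModel build_example() {
    ToyModel m;
    m.bind("x", 42);
    m.mark("x");
    m.bind("y", 42);
    return m;
}
```

With this model, state_by_svalue("y") reports Marked although only 'x' was ever marked, whereas state_by_region("y") stays at Start. This is the behaviour a per-region tracking mode would be meant to provide for non-pointer variables.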