public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* SUSPICIOUS CODE
@ 2023-07-12 15:12 jacob navia
  2023-07-12 22:33 ` Alan Modra
  2023-07-13 13:54 ` Michael Matz
  0 siblings, 2 replies; 5+ messages in thread
From: jacob navia @ 2023-07-12 15:12 UTC (permalink / raw)
  To: binutils

[-- Attachment #1: Type: text/plain, Size: 2655 bytes --]

Consider this code:

1202 static fragS * get_frag_for_reloc (fragS *last_frag,
1203             const segment_info_type *seginfo,
1204             const struct reloc_list *r)
1205 {   
1206   fragS *f;
1207   
1208   for (f = last_frag; f != NULL; f = f->fr_next)
1209     if (f->fr_address <= r->u.b.r.address
1210     && r->u.b.r.address < f->fr_address + f->fr_fix)
1211       return f;
1212 
1213   for (f = seginfo->frchainP->frch_root; f != NULL; f = f->fr_next)
1214     if (f->fr_address <= r->u.b.r.address
1215     && r->u.b.r.address < f->fr_address + f->fr_fix)
1216       return f;
1217   
1218   for (f = seginfo->frchainP->frch_root; f != NULL; f = f->fr_next)
1219     if (f->fr_address <= r->u.b.r.address
1220     && r->u.b.r.address <= f->fr_address + f->fr_fix)
1221       return f;
1222 
1223   as_bad_where (r->file, r->line,
1224         _("reloc not within (fixed part of) section"));
1225   return NULL;
1226 }

This function consists of 3 loops: 1208-1211, 1213 to 1216 and 1218 to 1221. 

Lines 1213 - 1216 are ALMOST identical to lines 1218 to 1221. The ONLY difference that I can see is that the less in line 1215 is replaced by a less equal in line 1220.

But… why?

This code is searching the fragment that contains a given address in between the start and end addresses of the frags in question, either in the fragment list given by last_frag or in the list given by seginfo.

To know if a fragment is OK you should start with the given address and stop one memory address BEFORE the limit given by fr_address + f->fr_fix. That is what the first two loops are doing. The third loop repeats the second one and changes the less to less equal, so if fr_address+fr_fix is one MORE than the address it will still pass.

Why it is doing that? 

If that code is correct, it is obvious that we could merge the second and third loops and put a <= in t he second one and erase the third one… UNLESS priority should be given to matches that are less and not less equal, what seems incomprehensible … to me.

This change was introduced on Aug 18th 2011 by Mr Alan Modra with the rather terse comment: "(get_frag_for_reloc): New function. ». There are no further comments in the code at all.

This code is run after all relocations are fixed just before the software writes them out. The code is in file « write.c » in the gas directory. Note that this code runs through ALL relocations lists each time for EACH relocation, so it is quite expensive. In general the list data structure is not really optimal here but that is another story.

Thanks in advance for your help.

Jacob

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: SUSPICIOUS CODE
  2023-07-12 15:12 SUSPICIOUS CODE jacob navia
@ 2023-07-12 22:33 ` Alan Modra
  2023-07-13  6:00   ` jacob navia
  2023-07-13 13:54 ` Michael Matz
  1 sibling, 1 reply; 5+ messages in thread
From: Alan Modra @ 2023-07-12 22:33 UTC (permalink / raw)
  To: jacob navia; +Cc: binutils

On Wed, Jul 12, 2023 at 05:12:13PM +0200, jacob navia wrote:
> Consider this code:
> 
> 1202 static fragS * get_frag_for_reloc (fragS *last_frag,
> 1203             const segment_info_type *seginfo,
> 1204             const struct reloc_list *r)
> 1205 {   
> 1206   fragS *f;
> 1207   
> 1208   for (f = last_frag; f != NULL; f = f->fr_next)
> 1209     if (f->fr_address <= r->u.b.r.address
> 1210     && r->u.b.r.address < f->fr_address + f->fr_fix)
> 1211       return f;
> 1212 
> 1213   for (f = seginfo->frchainP->frch_root; f != NULL; f = f->fr_next)
> 1214     if (f->fr_address <= r->u.b.r.address
> 1215     && r->u.b.r.address < f->fr_address + f->fr_fix)
> 1216       return f;
> 1217   
> 1218   for (f = seginfo->frchainP->frch_root; f != NULL; f = f->fr_next)
> 1219     if (f->fr_address <= r->u.b.r.address
> 1220     && r->u.b.r.address <= f->fr_address + f->fr_fix)
> 1221       return f;
> 1222 
> 1223   as_bad_where (r->file, r->line,
> 1224         _("reloc not within (fixed part of) section"));
> 1225   return NULL;
> 1226 }
> 
> This function consists of 3 loops: 1208-1211, 1213 to 1216 and 1218 to 1221. 
> 
> Lines 1213 - 1216 are ALMOST identical to lines 1218 to 1221. The ONLY difference that I can see is that the less in line 1215 is replaced by a less equal in line 1220.
> 
> But… why?
> 
> This code is searching the fragment that contains a given address in between the start and end addresses of the frags in question, either in the fragment list given by last_frag or in the list given by seginfo.
> 
> To know if a fragment is OK you should start with the given address and stop one memory address BEFORE the limit given by fr_address + f->fr_fix. That is what the first two loops are doing. The third loop repeats the second one and changes the less to less equal, so if fr_address+fr_fix is one MORE than the address it will still pass.
> 
> Why it is doing that? 

Take out the third loop and run the testsuite.  Does anything regress?

> If that code is correct, it is obvious that we could merge the second and third loops and put a <= in t he second one and erase the third one… UNLESS priority should be given to matches that are less and not less equal,

That is exactly what needs to happen.

> what seems incomprehensible … to me.
> 
> This change was introduced on Aug 18th 2011 by Mr Alan Modra with the rather terse comment: "(get_frag_for_reloc): New function. ». There are no further comments in the code at all.

Yes, I'm responsible for lots of suspicious code, but this isn't the
full history of get_frag_for_reloc.

> This code is run after all relocations are fixed just before the software writes them out. The code is in file « write.c » in the gas directory. Note that this code runs through ALL relocations lists each time for EACH relocation, so it is quite expensive. In general the list data structure is not really optimal here but that is another story.

The code does not run through all relocations, just those created with
the .reloc directive.

> Thanks in advance for your help.
> 
> Jacob

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: SUSPICIOUS CODE
  2023-07-12 22:33 ` Alan Modra
@ 2023-07-13  6:00   ` jacob navia
  0 siblings, 0 replies; 5+ messages in thread
From: jacob navia @ 2023-07-13  6:00 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils

[-- Attachment #1: Type: text/plain, Size: 4963 bytes --]

Hi Mr Modra

First, thanks for replying. Good thing you are available and can explain things.

Second, I tried to run the test suite but… no Makefile, no documentation on how to run these tests… Impossible to continue unless you would point me to some documentation about the tests. Alternatively you could just tell me which test fails and WHY this change is needed.

Third, basically when looking if an address is between two borders we include the lower limit and exclude the upper limit. If you want to know if a is between Addrx and Addrx+6 you will use Addrx , Addrx+1 until Addrx+5, excluding Addrx+6. This is because we use zero based array indices in C. This is obvious to you of course. 

Fourth, the first two loops of « get_frag_from_reloc » use this principle. What bothers me is that with no justification the valid address range is extended by one in the third.

These relocations (stored in « reloc_list » ) are created by « create_note_reloc » . The two calls to that function are in « maybe_generate_build_notes » The address of the note is composed of the addition of two numbers, note_offset and desc2_offset, and in those calls to « create_note_reloc » I see a division by 2 that *COULD* explain the behavior you programmed since it is an integer division that rounds towards zero.

But if that is the case, why not adding 1 to the division instead of running an additional loop with <= ???

In any case it would be nice if we added some comments to the code to explain why it is there so people reading the code do not pose unwanted questions!

:-)

Jacob

> Le 13 juil. 2023 à 00:33, Alan Modra <amodra@gmail.com> a écrit :
> 
> On Wed, Jul 12, 2023 at 05:12:13PM +0200, jacob navia wrote:
>> Consider this code:
>> 
>> 1202 static fragS * get_frag_for_reloc (fragS *last_frag,
>> 1203             const segment_info_type *seginfo,
>> 1204             const struct reloc_list *r)
>> 1205 {   
>> 1206   fragS *f;
>> 1207   
>> 1208   for (f = last_frag; f != NULL; f = f->fr_next)
>> 1209     if (f->fr_address <= r->u.b.r.address
>> 1210     && r->u.b.r.address < f->fr_address + f->fr_fix)
>> 1211       return f;
>> 1212 
>> 1213   for (f = seginfo->frchainP->frch_root; f != NULL; f = f->fr_next)
>> 1214     if (f->fr_address <= r->u.b.r.address
>> 1215     && r->u.b.r.address < f->fr_address + f->fr_fix)
>> 1216       return f;
>> 1217   
>> 1218   for (f = seginfo->frchainP->frch_root; f != NULL; f = f->fr_next)
>> 1219     if (f->fr_address <= r->u.b.r.address
>> 1220     && r->u.b.r.address <= f->fr_address + f->fr_fix)
>> 1221       return f;
>> 1222 
>> 1223   as_bad_where (r->file, r->line,
>> 1224         _("reloc not within (fixed part of) section"));
>> 1225   return NULL;
>> 1226 }
>> 
>> This function consists of 3 loops: 1208-1211, 1213 to 1216 and 1218 to 1221. 
>> 
>> Lines 1213 - 1216 are ALMOST identical to lines 1218 to 1221. The ONLY difference that I can see is that the less in line 1215 is replaced by a less equal in line 1220.
>> 
>> But… why?
>> 
>> This code is searching the fragment that contains a given address in between the start and end addresses of the frags in question, either in the fragment list given by last_frag or in the list given by seginfo.
>> 
>> To know if a fragment is OK you should start with the given address and stop one memory address BEFORE the limit given by fr_address + f->fr_fix. That is what the first two loops are doing. The third loop repeats the second one and changes the less to less equal, so if fr_address+fr_fix is one MORE than the address it will still pass.
>> 
>> Why it is doing that? 
> 
> Take out the third loop and run the testsuite.  Does anything regress?
> 
>> If that code is correct, it is obvious that we could merge the second and third loops and put a <= in t he second one and erase the third one… UNLESS priority should be given to matches that are less and not less equal,
> 
> That is exactly what needs to happen.
> 
>> what seems incomprehensible … to me.
>> 
>> This change was introduced on Aug 18th 2011 by Mr Alan Modra with the rather terse comment: "(get_frag_for_reloc): New function. ». There are no further comments in the code at all.
> 
> Yes, I'm responsible for lots of suspicious code, but this isn't the
> full history of get_frag_for_reloc.
> 
>> This code is run after all relocations are fixed just before the software writes them out. The code is in file « write.c » in the gas directory. Note that this code runs through ALL relocations lists each time for EACH relocation, so it is quite expensive. In general the list data structure is not really optimal here but that is another story.
> 
> The code does not run through all relocations, just those created with
> the .reloc directive.
> 
>> Thanks in advance for your help.
>> 
>> Jacob
> 
> -- 
> Alan Modra
> Australia Development Lab, IBM


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: SUSPICIOUS CODE
  2023-07-12 15:12 SUSPICIOUS CODE jacob navia
  2023-07-12 22:33 ` Alan Modra
@ 2023-07-13 13:54 ` Michael Matz
  2023-07-13 17:15   ` jacob navia
  1 sibling, 1 reply; 5+ messages in thread
From: Michael Matz @ 2023-07-13 13:54 UTC (permalink / raw)
  To: jacob navia; +Cc: binutils

[-- Attachment #1: Type: text/plain, Size: 627 bytes --]

Hello,

On Wed, 12 Jul 2023, jacob navia wrote:

> This change was introduced on Aug 18th 2011 by Mr Alan Modra with the 
> rather terse comment: "(get_frag_for_reloc): New function. ».

Nope.  The third loop (the one you're asking about) wasn't added with the 
above, but rather by commit 740bdc67c057 (also by Alan).  It contains 
testcase and justification.  (Hint: as last resort, and only then, a reloc 
is associated with a frag at the _end_ of section. frags can be 
var-length).

The testsuite is run (as usual with many GNU projects) with 'make check' 
after the appropriate configure and make steps.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: SUSPICIOUS CODE
  2023-07-13 13:54 ` Michael Matz
@ 2023-07-13 17:15   ` jacob navia
  0 siblings, 0 replies; 5+ messages in thread
From: jacob navia @ 2023-07-13 17:15 UTC (permalink / raw)
  To: Michael Matz; +Cc: binutils

OK then. All those « details » were unknown to me.

This is actually normal. 

Free software supposes that many people read the code and examine it for bugs. I was doing just that, and without comments, the code looked suspicious. 

In no way I was trying to say your work was badly done or similar stuff.

Thanks for your time.

Jacob


> Le 13 juil. 2023 à 15:54, Michael Matz <matz@suse.de> a écrit :
> 
> Hello,
> 
> On Wed, 12 Jul 2023, jacob navia wrote:
> 
>> This change was introduced on Aug 18th 2011 by Mr Alan Modra with the 
>> rather terse comment: "(get_frag_for_reloc): New function. ».
> 
> Nope.  The third loop (the one you're asking about) wasn't added with the 
> above, but rather by commit 740bdc67c057 (also by Alan).  It contains 
> testcase and justification.  (Hint: as last resort, and only then, a reloc 
> is associated with a frag at the _end_ of section. frags can be 
> var-length).
> 
> The testsuite is run (as usual with many GNU projects) with 'make check' 
> after the appropriate configure and make steps.
> 
> 
> Ciao,
> Michael.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-07-13 17:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-12 15:12 SUSPICIOUS CODE jacob navia
2023-07-12 22:33 ` Alan Modra
2023-07-13  6:00   ` jacob navia
2023-07-13 13:54 ` Michael Matz
2023-07-13 17:15   ` jacob navia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).