On 11/23/20 2:04 PM, Jeff Law wrote: > > > On 11/4/20 5:58 PM, Martin Sebor via Gcc-patches wrote: >> To determine the target of a pointer expression and the offset into >> it, the increasingly widely used compute_objsize function traverses >> the IL following the DEF statements of pointer variables, aggregating >> offsets from POINTER_PLUS assignments along the way.  It does that >> for many statements that involve pointers, including calls to >> built-in functions and (so far only) accesses to char arrays.  When >> a function has many such statements with pointers to the same objects >> but with different offsets, the traversal ends up visiting the same >> pointer assignments repeatedly and unnecessarily. >> >> To avoid this repeated traversal, the attached patch adds the ability >> to cache results obtained in prior calls to the function.  The cache >> is optional and only used when enabled. >> >> To exercise the cache I have enabled it for the strlen pass (which >> is probably the heaviest compute_objsize user).  That happens to >> resolve PR 97373 which tracks the pass' failure to detect sprintf >> overflowing allocated buffers at a non-constant offset.  I thought >> about making this a separate patch but the sprintf/strlen changes >> are completely mechanical so it didn't seem worth the effort. >> >> In the benchmarking I've done the cache isn't a huge win there but >> it does have a measurable difference in the project I'm wrapping up >> where most pointer assignments need to be examined.  The space used >> for the cache is negligible on average: fewer than 20 entries per >> Glibc function and about 6 for GCC.  The worst case in Glibc is >> 6 thousand entries and 10k in GCC.  Since the entries are sizable >> (216 bytes each) the worst case memory consumption can be reduced >> by adding a level of indirection.  A further savings can be obtained >> by replacing some of the offset_int members of the entries with >> HOST_WIDE_INT. >> >> The efficiency benefits of the cache should increase further as more >> of the access checking code is integrated into the same pass.  This >> should eventually include the checking currently done in the built-in >> expanders. >> >> Tested on x86_64-linux, along with Glibc and Binutils/GDB. >> >> Martin >> >> PS The patch add the new pointer_query class (loosely modeled on >> range_query) to builtins.{h,c}.  This should be only temporary, >> until the access checking code is moved into a file (and ultimately >> a pass) of its own. >> >> gcc-97373.diff >> >> PR middle-end/97373 - missing warning on sprintf into allocated destination >> >> gcc/ChangeLog: >> >> PR middle-end/97373 >> * builtins.c (compute_objsize): Rename... >> (compute_objsize_r): to this. Change order and types of arguments. >> Use new argument. Adjust calls to self. >> (access_ref::get_ref): New member function. >> (pointer_query::pointer_query): New member function. >> (pointer_query::get_ref): Same. >> (pointer_query::put_ref): Same. >> (handle_min_max_size): Change order and types of arguments. >> (maybe_emit_free_warning): Add a test. >> * builtins.h (class pointer_query): New class. >> (compute_objsize): Declare an overload. >> * gimple-ssa-sprintf.c (get_destination_size): >> (handle_printf_call): >> * tree-ssa-strlen.c (adjust_last_stmt): Add an argument and use it. >> (maybe_warn_overflow): Same. >> (handle_builtin_strcpy): Same. >> (maybe_diag_stxncpy_trunc): Same. >> (handle_builtin_memcpy): Change argument type. Adjust calls. >> (handle_builtin_strcat): Same. >> (handle_builtin_memset): Same. >> (handle_store): Same. >> (strlen_check_and_optimize_call): Same. >> (check_and_optimize_stmt): Same. >> (strlen_dom_walker): Add new data members. >> (strlen_dom_walker::before_dom_children): Use new member. >> (printf_strlen_execute): Dump cache performance counters. >> * tree-ssa-strlen.h (maybe_diag_stxncpy_trunc): Add argument. >> (handle_printf_call): Change argument type. >> >> gcc/testsuite/ChangeLog: >> >> PR middle-end/97373 >> * gcc.dg/warn-strnlen-no-nul.c: >> * g++.dg/warn/Wmismatched-new-delete-2.C: New test. >> * gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: New test. > I'm going to hesitatingly ACK this. Besides rebasing the changes on top the latest trunk I've also adjusted them to avoid the "inefficient hacks" I mentioned in the subsequent patch(*). They were not only inefficient but also would have caused the function to return incorrect results in some cases. After retesting with Binutils/GDB and Glibc I committed the attached patch in r11-5622. Martin [*] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558775.html > > The reason I saw hesitatingly is because I suspect that we're going to > see more cases where caching would be useful and I'd like to have a > general way to do that.  We've already got a ton of caching code in > Ranger and I fully expect more as we continue to write analysis code > that wants to walk backwards.  I don't want to see multiple > implementations of simple caching code. > > Additionally, if you're doing so many queries in a subsequent patch that > caching is needed, then that may be an indication that you're doing too > much work.  I can't really evaluate that yet as I haven't looked at the > subsequent patch where this becomes a significant issue. > > So expect pushback if further cache implementations are introduced. > > Jeff >