public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
* [Bug analyzer/106392] New: Support iteration over C++ containers in -fanalyzer @ 2022-07-21 18:00 redi at gcc dot gnu.org 2023-06-08 17:08 ` [Bug analyzer/106392] " vultkayn at gcc dot gnu.org ` (2 more replies) 0 siblings, 3 replies; 4+ messages in thread From: redi at gcc dot gnu.org @ 2022-07-21 18:00 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106392 Bug ID: 106392 Summary: Support iteration over C++ containers in -fanalyzer Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: analyzer Assignee: dmalcolm at gcc dot gnu.org Reporter: redi at gcc dot gnu.org Blocks: 97110 Target Milestone: --- I haven't really thought this one through completely, and maybe it's too hard, but it would be good to check that 'for' loops using iterators don't do silly things, e.g. for (auto iter = cont.begin(); iter != cont.end() - 1; ++iter) // ... This is unsafe if the container is not empty, because end-1 is invalid. Similarly for begin+1 when the container is empty. Need to check cont.begin() != cont.end() before doing begin+1 or end-1 And end+1 is always wrong. Range-based for loops should not modify the container being iterated over: for (auto& x : cont) if (x == 3) cont.push_back(4); Modifying the container while traversing it might invalidate the iterators and have undefined behaviour. (No iterators are visible in the code, but the compiler simply expands the range-based 'for' into a traditional 'for' using two iterators, and just because the user can't see them, those iterators are still invalidated by modifications to the container. Modifying the container in any loop is potentially risky, if not done correctly. For an associative container such as std::set, std::map etc. this is wrong: for (auto it = set.begin(); it != set.end(); ++it) if (*it == 3) set.erase(it); The ++it increment is undefined after erasing that element, because the erase call invalidates the iterator. This solves the invalidation problem, but is still wrong: for (auto it = set.begin(); it != set.end(); ++it) if (*it == 3) it = set.erase(it); // returns iterator to the element after the erased one Now we get a valid iterator back from erase, but then we skip the next element because the ++it always happens at the end of each loop iteration. And if the erased element was the last one then erase returns the end() iterator, then we do ++it on the past-the-end iterator, which is undefined. This is the right way to do it: for (auto it = set.begin(); it != set.end(); ) if (*it == 3) it = set.erase(it); else ++it; But ideally, just don't. Use std::erase_if(cont, 3) instead. It would be great if the analyzer understood enough to flag some of these. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97110 [Bug 97110] [meta-bug] tracker bug for supporting C++ in -fanalyzer ^ permalink raw reply [flat|nested] 4+ messages in thread
* [Bug analyzer/106392] Support iteration over C++ containers in -fanalyzer 2022-07-21 18:00 [Bug analyzer/106392] New: Support iteration over C++ containers in -fanalyzer redi at gcc dot gnu.org @ 2023-06-08 17:08 ` vultkayn at gcc dot gnu.org 2023-06-08 19:46 ` redi at gcc dot gnu.org 2023-06-08 19:57 ` vultkayn at gcc dot gnu.org 2 siblings, 0 replies; 4+ messages in thread From: vultkayn at gcc dot gnu.org @ 2023-06-08 17:08 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106392 Benjamin Priour <vultkayn at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |vultkayn at gcc dot gnu.org --- Comment #1 from Benjamin Priour <vultkayn at gcc dot gnu.org> --- Access to end iterator should be flagged as undefined behavior. Detecting invalidation is a difficult problem, as many methods are only sometimes invalidating. Checking a reallocation of the underlying container, e.g. for vectors, is (always?) a good check, but it is not sufficient, e.g. consider unordered_map.rehash(0) Additionally, flagging all calls to "possibly invalidating operations" will obviously lead to too many false positive. Thus, it might be relevant to have multiple attributes that keep track of different kinds of invalidation. Invalidating operations not only differ on their certainty, but on the iterator(s) they invalidate. Some will invalidate every iterators, while some [[gnu::invalidate]] - always invalidate [[gnu::invalidate(resize[, size_varname, capacity_varname, factor_varname ])]] - All iterators are invalidated ? Examples of such case is a vector's size exceeding its capacity, or a hashmap too loaded. Would be detected by keeping track of push*, emplace*, insert* methods, as well as clear, extract, erase. If size_, capacity_ and factor_varname are provided, the invalidation is done only if size/capacity >= factor. [[gnu::invalidate(swap)]] - Both containers should be invalidated. Name probably ill-chosen since swapping two lists invalidates nothing. [[gnu::invalidate(rehash)]] - Generic remapping of every key -> element. This one I think could be entirely replaced by the following. [[gnu::invalidate(adaptor)]] - Look at the underlying container invalidation rule. [[gnu::invalidate(_from_[, _to_[, _step_]])]] - Invalidate all iterators from _from_ to _to_ (default end()), with a jump of _step_ (default 1). Or combine them. For adaptors, we must look at their underlying containers. What policy to take for unknown parameters ? Typically if the load factor is determined at runtime, we simply cannot precisely determine the event. I believe that in such case, either the previous known value is used by the analyzer as a heuristic, or if no other known user-provided value was given, we use the implementation's default value. When invalidating a range, how to determine the "following" iterators, that should be invalidated, when we are not dealing with random iterators ? Is there even such a case in the standard library, where a method over a container without random iterators invalidates a range/ a subset of all iterators ? I didn't find any, and it is also counter-intuitive, and std::*_list certainly are not. More research needed. ^ permalink raw reply [flat|nested] 4+ messages in thread
* [Bug analyzer/106392] Support iteration over C++ containers in -fanalyzer 2022-07-21 18:00 [Bug analyzer/106392] New: Support iteration over C++ containers in -fanalyzer redi at gcc dot gnu.org 2023-06-08 17:08 ` [Bug analyzer/106392] " vultkayn at gcc dot gnu.org @ 2023-06-08 19:46 ` redi at gcc dot gnu.org 2023-06-08 19:57 ` vultkayn at gcc dot gnu.org 2 siblings, 0 replies; 4+ messages in thread From: redi at gcc dot gnu.org @ 2023-06-08 19:46 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106392 --- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> --- (In reply to Benjamin Priour from comment #1) > [[gnu::invalidate(swap)]] - Both containers should be invalidated. Name > probably ill-chosen since swapping two lists invalidates nothing. Swapping doesn't invalidate iterators for any standard container. I think the unordered containers might be too hard. I would start with std::vector, as that will probably give the best return on investment of effort. > When invalidating a range, how to determine the "following" iterators, that > should be invalidated, when we are not dealing with random iterators ? > Is there even such a case in the standard library, where a method over a > container without random iterators invalidates a range/ a subset of all > iterators ? erase(first, last) invalidates iterators in the erased range. ^ permalink raw reply [flat|nested] 4+ messages in thread
* [Bug analyzer/106392] Support iteration over C++ containers in -fanalyzer 2022-07-21 18:00 [Bug analyzer/106392] New: Support iteration over C++ containers in -fanalyzer redi at gcc dot gnu.org 2023-06-08 17:08 ` [Bug analyzer/106392] " vultkayn at gcc dot gnu.org 2023-06-08 19:46 ` redi at gcc dot gnu.org @ 2023-06-08 19:57 ` vultkayn at gcc dot gnu.org 2 siblings, 0 replies; 4+ messages in thread From: vultkayn at gcc dot gnu.org @ 2023-06-08 19:57 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106392 --- Comment #3 from Benjamin Priour <vultkayn at gcc dot gnu.org> --- > I think the unordered containers might be too hard. I would start with > std::vector, as that will probably give the best return on investment of > effort. > Indeed, I just found these slides from clang https://llvm.org/devmtg/2018-04/slides/Balogh-Finding%20Iterator-related%20Errors%20with%20Clang%20Static%20Analyzer.pdf They seem to be dividing the containers into 3 categories, list-like, vector-like and deque-like, thus also sticking to ordered containers. But they also are kind of brute-forcing the checks it seems, as they say upon insertion or deletion, they check every iterator position of the container. It sounds overly expensive to me, particularly for lists. Surely there is a reason they are doing that, I'm looking into it. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-06-08 19:57 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-07-21 18:00 [Bug analyzer/106392] New: Support iteration over C++ containers in -fanalyzer redi at gcc dot gnu.org 2023-06-08 17:08 ` [Bug analyzer/106392] " vultkayn at gcc dot gnu.org 2023-06-08 19:46 ` redi at gcc dot gnu.org 2023-06-08 19:57 ` vultkayn at gcc dot gnu.org
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).