From: David Malcolm <dmalcolm@redhat.com>
To: Tim Lange <mail@tim-lange.me>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] analyzer: warn on the use of floating points in the size argument [PR106181]
Date: Mon, 15 Aug 2022 14:16:02 -0400 [thread overview]
Message-ID: <26fa8158e7fe8cb559250397b6494409c98effb6.camel@redhat.com> (raw)
In-Reply-To: <20220815123525.49172-1-mail@tim-lange.me>
On Mon, 2022-08-15 at 14:35 +0200, Tim Lange wrote:
> This patch fixes the ICE reported in PR106181 and adds a new warning
> to
> the analyzer complaining about the use of floating point operands.
Thanks for the patch.
Various comments inline...
>
> I decided to move the warning for floats inside the size argument out
> of
> the allocation size checker code and toward the allocation such that
> the
> warning only appears once.
> I'm not sure about the wording of the diagnostic, my current wording
> feels
> a bit bulky.
Agreed, and the warning itself is probably setting a new record for
option length: -Wanalyzer-imprecise-floating-point-arithmetic is 45
characters. I'm not without sin here: I think the current record is -
Wanalyzer-unsafe-call-within-signal-handler, which is 43.
How about:
-Wanalyzer-imprecise-float-arithmetic
-Wanalyzer-imprecise-fp-arithmetic
instead? (ideas welcome)
> Here is how the diagnostics look like:
>
> /path/to/main.c: In function ‘test_1’:
> /path/to/main.c:10:14: warning: use of floating point arithmetic
> inside the size argument might yield unexpected results
https://gcc.gnu.org/codingconventions.html#Spelling
says we should use "floating-point" rather than "floating point".
How about just this:
"warning: use of floating-point arithmetic here might yield unexpected
results"
here...
> [-Wanalyzer-imprecise-floating-point-arithmetic]
> 10 | int *ptr = malloc (sizeof (int) * n); /* { dg-line test_1 }
> */
> | ^~~~~~~~~~~~~~~~~~~~~~~~~
> ‘test_1’: event 1
> |
> | 10 | int *ptr = malloc (sizeof (int) * n); /* { dg-line
> test_1 } */
> | | ^~~~~~~~~~~~~~~~~~~~~~~~~
> | | |
> | | (1) operand ‘n’ is of type ‘float’
> |
> /path/to/main.c:10:14: note: only use operands of a type that
> represents whole numbers inside the size argument
...and make the note say:
"only use operands of an integer type inside the size argument"
which tells the user that it's a size that we're complaining about.
> /path/to/main.c: In function ‘test_2’:
> /path/to/main.c:20:14: warning: use of floating point arithmetic
> inside the size argument might yield unexpected results [-Wanalyzer-
> imprecise-floating-point-arithmetic]
> 20 | int *ptr = malloc (n * 3.1); /* { dg-line test_2 } */
> | ^~~~~~~~~~~~~~~~
> ‘test_2’: event 1
> |
> | 20 | int *ptr = malloc (n * 3.1); /* { dg-line test_2 } */
> | | ^~~~~~~~~~~~~~~~
> | | |
> | | (1) operand ‘3.1000000000000001e+0’ is of
> type ‘double’
> |
> /path/to/main.c:20:14: note: only use operands of a type that
> represents whole numbers inside the size argument
>
> Also, another point to discuss is the event note in case the
> expression is
> wrapped in a variable, such as in test_3:
> /path/to/main.c:30:10: warning: use of floating point arithmetic
> inside the size argument might yield unexpected results [-Wanalyzer-
> imprecise-floating-point-arithmetic]
> 30 | return malloc (size); /* { dg-line test_3 } */
> | ^~~~~~~~~~~~~
> ‘test_3’: events 1-2
> |
> | 37 | void test_3 (float f)
> | | ^~~~~~
> | | |
> | | (1) entry to ‘test_3’
> | 38 | {
> | 39 | void *ptr = alloc_me (f); /* { dg-message "calling
> 'alloc_me' from 'test_3'" } */
> | | ~~~~~~~~~~~~
> | | |
> | | (2) calling ‘alloc_me’ from ‘test_3’
> |
> +--> ‘alloc_me’: events 3-4
> |
> | 28 | void *alloc_me (size_t size)
> | | ^~~~~~~~
> | | |
> | | (3) entry to ‘alloc_me’
> | 29 | {
> | 30 | return malloc (size); /* { dg-line test_3 } */
> | | ~~~~~~~~~~~~~
> | | |
> | | (4) operand ‘f’ is of type ‘float’
> |
>
> I'm not sure if it is easily discoverable that event (4) does refer
> to
> 'size'. I thought about also printing get_representative_tree
> (capacity)
> in the event but that would clutter the event message if the argument
> does hold the full expression. I don't have any strong feelings about
> the
> decision here but if I had to decide I'd leave it as is (especially
> because the warning is probably quite unusual).
Yeah; get_representative_tree tries gets a tree, but trees don't give
us a good way of referring to a local var within a particular stack
frame within a path. So leaving it as is is OK.
> The index of the argument would also be a possibility, but that would
> get
> tricky for calloc.
[...snip...]
>
> diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
> index 61b58c575ff..bef15eae2c3 100644
> --- a/gcc/analyzer/analyzer.opt
> +++ b/gcc/analyzer/analyzer.opt
> @@ -98,6 +98,10 @@ Wanalyzer-free-of-non-heap
> Common Var(warn_analyzer_free_of_non_heap) Init(1) Warning
> Warn about code paths in which a non-heap pointer is freed.
>
> +Wanalyzer-imprecise-floating-point-arithmetic
> +Common Var(warn_analyzer_imprecise_floating_point_arithmetic)
> Init(1) Warning
> +Warn about code paths in which floating point arithmetic is use in
"use" -> "used".
> locations where precise computation is needed.
> +
> Wanalyzer-jump-through-null
> Common Var(warn_analyzer_jump_through_null) Init(1) Warning
> Warn about code paths in which a NULL function pointer is called.
> diff --git a/gcc/analyzer/region-model-impl-calls.cc
> b/gcc/analyzer/region-model-impl-calls.cc
> index 8eebd122d42..c4d7e186963 100644
> --- a/gcc/analyzer/region-model-impl-calls.cc
> +++ b/gcc/analyzer/region-model-impl-calls.cc
> @@ -237,6 +237,7 @@ region_model::impl_call_alloca (const
> call_details &cd)
> const region *new_reg = create_region_for_alloca (size_sval,
> cd.get_ctxt ());
> const svalue *ptr_sval
> = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
> + check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ());
> cd.maybe_set_lhs (ptr_sval);
> }
>
> @@ -393,6 +394,7 @@ region_model::impl_call_calloc (const
> call_details &cd)
> {
> const svalue *ptr_sval
> = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
> + check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ());
> cd.maybe_set_lhs (ptr_sval);
> }
> }
> @@ -497,6 +499,7 @@ region_model::impl_call_malloc (const
> call_details &cd)
> {
> const svalue *ptr_sval
> = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
> + check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ());
> cd.maybe_set_lhs (ptr_sval);
> }
> }
These checks would be simpler if they were consolidated into a single
call to check_region_capacity_for_floats in:
region_model::set_dynamic_extents
akin to how we have check_dynamic_size_for_taint called there - though
that would make it harder to complain about specific arguments at
function calls. Given that we're not doing the latter, please
consolidate them.
[...snip...]
>
> +/* Abstract class for diagnostics related touse of floating point
"touse" -> "to use"
> + arithmetic where precision is needed. */
> +
> +class imprecise_floating_point_arithmetic
> +: public
> pending_diagnostic_subclass<imprecise_floating_point_arithmetic>
> +{
> +public:
> + imprecise_floating_point_arithmetic ()
> + {}
Do we need thic ctor? There aren't any fields to initialize.
> +
> + const char *get_kind () const final override
> + {
> + return "imprecise_floating_point_arithmetic";
> + }
get_kind should probably only be implemented for concrete subclasses,
such as float_as_size_arg.
> +
> + int get_controlling_option () const final override
> + {
> + return OPT_Wanalyzer_imprecise_floating_point_arithmetic;
> + }
> +
> + bool
> + operator== (const imprecise_floating_point_arithmetic &other
> + ATTRIBUTE_UNUSED) const
> + {
> + return true;
> + }
I don't think it makes sense to have an operator== here for an abstract
class with no fields.
> +};
> +
> +/* Concrete diagnostic to complain about uses of floating point
> arithmetic
> + in the size argument of malloc etc. */
> +
> +class float_as_size_arg : public imprecise_floating_point_arithmetic
> +{
> +public:
> + float_as_size_arg (tree arg) : m_arg (arg)
> + {}
> +
> + bool operator== (const float_as_size_arg &other) const
> + {
> + return pending_diagnostic::same_tree_p (m_arg, other.m_arg);
> + }
> +
> + bool emit (rich_location *rich_loc) final override
> + {
> + diagnostic_metadata m;
> + bool warned = warning_meta (rich_loc, m, get_controlling_option
> (),
> + "use of floating point arithmetic
> inside the"
> + " size argument might yield
> unexpected"
> + " results");
> + if (warned)
> + inform (rich_loc->get_loc (), "only use operands of a type
> that"
> + " represents whole numbers inside
> the"
> + " size argument");
> + return warned;
> + }
> +
> + label_text describe_final_event (const evdesc::final_event &ev)
> final
> + override
> + {
> + if (m_arg)
> + return ev.formatted_print ("operand %qE is of type %qT",
> + m_arg, TREE_TYPE (m_arg));
> + return ev.formatted_print ("at least one operand of the size
> argument is"
> + " of a floating point type");
> + }
> +
> +private:
> + tree m_arg;
> +};
> +
> +/* Visitor to find uses of floating point variables/constants in an
> svalue. */
> +
> +class contains_floating_point_visitor : public visitor
> +{
> +public:
> + contains_floating_point_visitor (const svalue *root_sval) :
> m_result (NULL)
> + {
> + root_sval->accept (this);
> + }
> +
> + const svalue *get_svalue_to_report ()
> + {
> + return m_result;
> + }
> +
> + void visit_constant_svalue (const constant_svalue *sval) final
> override
> + {
> + /* At the point the analyzer runs, constant integer operands in
> a floating
> + point expression are already implictly converted to floating
> points.
> + Thus, we do prefer to report non-constants such that the
> diagnostic
> + always reports a floating point operand. */
If the constant is seen first, then the non-constant won't be favored
(though perhaps binary ops get canonicalized so that constants are on
the RHS?).
Maybe rework this so that the visitor accumulates an auto_vec<const
svalue *> of all floating point svalues it sees, and have
get_svalue_to_report do a qsort of them (if non-empty), and pick the
first, using a custom comparator to order them into preference of
reporting? Though this is possibly overkill.
> + tree type = sval->get_type ();
> + if (type && FLOAT_TYPE_P (type) && !m_result)
> + m_result = sval;
> + }
> +
> + void visit_conjured_svalue (const conjured_svalue *sval) final
> override
> + {
> + tree type = sval->get_type ();
> + if (type && SCALAR_FLOAT_TYPE_P (type))
> + m_result = sval;
> + }
> +
> + void visit_initial_svalue (const initial_svalue *sval) final
> override
> + {
> + tree type = sval->get_type ();
> + if (type && SCALAR_FLOAT_TYPE_P (type))
> + m_result = sval;
> + }
> +
> +private:
> + /* Non-null if at least one floating point operand was found. */
> + const svalue *m_result;
> +};
> +
> +/* May complain about uses of floating point operands in the
> capacity of SVAL.
> +
> + SVAL is expected to be a region_svalue. */
> +
> +void
> +region_model::check_region_capacity_for_floats (const svalue *sval,
> + region_model_context
> *ctxt)
> +const
> +{
> + if (!ctxt)
> + return;
> +
> + if (const region_svalue *reg_sval = dyn_cast <const region_svalue
> *> (sval))
> + {
> + const svalue *capacity = get_capacity (reg_sval->get_pointee
> ());
> + contains_floating_point_visitor v (capacity);
> + if (const svalue *float_sval = v.get_svalue_to_report ())
> + {
> + tree diag_arg = get_representative_tree (float_sval);
> + ctxt->warn (new float_as_size_arg (diag_arg));
> + }
> + }
> +}
> +
> /* Set the value of the region given by LHS_REG to the value given
> by RHS_SVAL.
> Use CTXT to report any warnings associated with writing to
> LHS_REG. */
[...snip...]
> @@ -9758,6 +9759,7 @@ Enabling this option effectively enables the
> following warnings:
> -Wanalyzer-fd-use-without-check @gol
> -Wanalyzer-file-leak @gol
> -Wanalyzer-free-of-non-heap @gol
> +-Wno-analyzer-imprecise-floating-point-arithmetic @gol
Lose the "no-" here.
> -Wanalyzer-jump-through-null @gol
> -Wanalyzer-malloc-leak @gol
> -Wanalyzer-mismatching-deallocation @gol
> @@ -9946,6 +9948,18 @@ is called on a non-heap pointer (e.g. an on-
> stack buffer, or a global).
>
> See @uref{https://cwe.mitre.org/data/definitions/590.html, CWE-590:
> Free of Memory not on the Heap}.
>
> +@item -Wno-analyzer-imprecise-floating-point-arithmetic
> +@opindex Wanalyzer-imprecise-floating-point-arithmetic
> +@opindex Wno-analyzer-imprecise-floating-point-arithmetic
> +This warning requires @option{-fanalyzer}, which enables it; use
> +@option{-Wno-analyzer-imprecise-floating-point-arithmetic}
> +to disable it.
> +
> +This diagnostic warns for paths through the code in which floating
> point
> +arithmetic is used in locations where precise computation is
> needed. This
> +diagnostic only warns on use of floating points operands inside the
> +allocation size at the moment.
"inside the allocation size " -> "inside the calculation of an
allocation size".
[...snip...]
Other than the above, the patch looks good; thanks.
Dave
next prev parent reply other threads:[~2022-08-15 18:16 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-15 12:35 Tim Lange
2022-08-15 18:16 ` David Malcolm [this message]
2022-08-18 9:44 ` [PATCH v2] analyzer: warn on the use of floating-points operands " Tim Lange
2022-08-18 12:21 ` David Malcolm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=26fa8158e7fe8cb559250397b6494409c98effb6.camel@redhat.com \
--to=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=mail@tim-lange.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).