* [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899]
@ 2023-08-24 14:38 David Malcolm
2023-08-24 14:38 ` [PATCH 1/9] analyzer: add logging to impl_path_context David Malcolm
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:38 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
This patch kit makes improvements to the analyzer's new strlen
implementation, and wires it up to strcpy and strcat.
For example, given:
#include <string.h>
void test (void)
{
char buf[10];
strcpy (buf, "hello world!");
}
we now emit:
demo.c: In function ‘test’:
demo.c:6:3: warning: stack-based buffer overflow [CWE-121] [-Wanalyzer-out-of-bounds]
6 | strcpy (buf, "hello world!");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
‘test’: events 1-2
|
| 5 | char buf[10];
| | ^~~
| | |
| | (1) capacity: 10 bytes
| 6 | strcpy (buf, "hello world!");
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (2) out-of-bounds write from byte 10 till byte 12 but ‘buf’ ends at byte 10
|
demo.c:6:3: note: write of 3 bytes to beyond the end of ‘buf’
6 | strcpy (buf, "hello world!");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
demo.c:6:3: note: valid subscripts for ‘buf’ are ‘[0]’ to ‘[9]’
┌─────┬─────┬────┬────┬────┬────┬────┬────┬────┬────┐┌─────┬─────┬─────┐
│ [0] │ [1] │[2] │[3] │[4] │[5] │[6] │[7] │[8] │[9] ││[10] │[11] │[12] │
├─────┼─────┼────┼────┼────┼────┼────┼────┼────┼────┤├─────┼─────┼─────┤
│ ‘h’ │ ‘e’ │‘l’ │‘l’ │‘o’ │‘ ’ │‘w’ │‘o’ │‘r’ │‘l’ ││ ‘d’ │ ‘!’ │ NUL │
├─────┴─────┴────┴────┴────┴────┴────┴────┴────┴────┴┴─────┴─────┴─────┤
│ string literal (type: ‘char[13]’) │
└──────────────────────────────────────────────────────────────────────┘
│ │ │ │ │ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │ │
v v v v v v v v v v v v v
┌─────┬────────────────────────────────────────┬────┐┌─────────────────┐
│ [0] │ ... │[9] ││ │
├─────┴────────────────────────────────────────┴────┤│after valid range│
│ ‘buf’ (type: ‘char[10]’) ││ │
└───────────────────────────────────────────────────┘└─────────────────┘
├─────────────────────────┬─────────────────────────┤├────────┬────────┤
│ │
╭─────────┴────────╮ ╭───────────┴──────────╮
│capacity: 10 bytes│ │⚠️ overflow of 3 bytes│
╰──────────────────╯ ╰──────────────────────╯
in addition to the pre-existing:
demo.c:6:3: warning: ‘__builtin_memcpy’ writing 13 bytes into a region of size 10 overflows the destination [-Wstringop-overflow=]
demo.c:5:8: note: destination object ‘buf’ of size 10
5 | char buf[10];
| ^~~
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3461-g9aaec66917c96a through to
r14-3469-gbbdc0e0d0042ae.
David Malcolm (9):
analyzer: add logging to impl_path_context
analyzer: handle symbolic bindings in scan_for_null_terminator
[PR105899]
analyzer: reimplement kf_strcpy [PR105899]
analyzer: eliminate region_model::get_string_size [PR105899]
analyzer: reimplement kf_memcpy_memmove
analyzer: handle strlen(INIT_VAL(STRING_REG)) [PR105899]
analyzer: handle INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL)
[PR105899]
analyzer: handle strlen(BITS_WITHIN) [PR105899]
analyzer: implement kf_strcat [PR105899]
gcc/analyzer/call-details.cc | 12 +-
gcc/analyzer/call-details.h | 5 +-
gcc/analyzer/engine.cc | 13 +-
gcc/analyzer/kf.cc | 116 +++++---
gcc/analyzer/region-model-manager.cc | 19 ++
gcc/analyzer/region-model.cc | 261 +++++++++++++-----
gcc/analyzer/region-model.h | 22 +-
gcc/doc/invoke.texi | 1 +
.../analyzer/out-of-bounds-diagram-16.c | 31 +++
gcc/testsuite/gcc.dg/analyzer/sprintf-1.c | 11 +
gcc/testsuite/gcc.dg/analyzer/strcat-1.c | 136 +++++++++
gcc/testsuite/gcc.dg/analyzer/strcpy-1.c | 22 ++
gcc/testsuite/gcc.dg/analyzer/strcpy-3.c | 8 +
gcc/testsuite/gcc.dg/analyzer/strcpy-4.c | 51 ++++
14 files changed, 601 insertions(+), 107 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-16.c
create mode 100644 gcc/testsuite/gcc.dg/analyzer/strcat-1.c
create mode 100644 gcc/testsuite/gcc.dg/analyzer/strcpy-4.c
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/9] analyzer: add logging to impl_path_context
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
@ 2023-08-24 14:38 ` David Malcolm
2023-08-24 14:38 ` [PATCH 2/9] analyzer: handle symbolic bindings in scan_for_null_terminator [PR105899] David Malcolm
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:38 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
* engine.cc (impl_path_context::impl_path_context): Add logger
param.
(impl_path_context::bifurcate): Add log message.
(impl_path_context::terminate_path): Likewise.
(impl_path_context::m_logger): New field.
(exploded_graph::process_node): Pass logger to path_ctxt ctor.
---
gcc/analyzer/engine.cc | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 3700154eec2c..a1908cdb364e 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -3848,8 +3848,10 @@ exploded_graph::maybe_create_dynamic_call (const gcall *call,
class impl_path_context : public path_context
{
public:
- impl_path_context (const program_state *cur_state)
+ impl_path_context (const program_state *cur_state,
+ logger *logger)
: m_cur_state (cur_state),
+ m_logger (logger),
m_terminate_path (false)
{
}
@@ -3868,6 +3870,9 @@ public:
void
bifurcate (std::unique_ptr<custom_edge_info> info) final override
{
+ if (m_logger)
+ m_logger->log ("bifurcating path");
+
if (m_state_at_bifurcation)
/* Verify that the state at bifurcation is consistent when we
split into multiple out-edges. */
@@ -3884,6 +3889,8 @@ public:
void terminate_path () final override
{
+ if (m_logger)
+ m_logger->log ("terminating path");
m_terminate_path = true;
}
@@ -3900,6 +3907,8 @@ public:
private:
const program_state *m_cur_state;
+ logger *m_logger;
+
/* Lazily-created copy of the state before the split. */
std::unique_ptr<program_state> m_state_at_bifurcation;
@@ -4044,7 +4053,7 @@ exploded_graph::process_node (exploded_node *node)
exactly one stmt, the one that caused the change. */
program_state next_state (state);
- impl_path_context path_ctxt (&next_state);
+ impl_path_context path_ctxt (&next_state, logger);
uncertainty_t uncertainty;
const supernode *snode = point.get_supernode ();
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/9] analyzer: handle symbolic bindings in scan_for_null_terminator [PR105899]
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
2023-08-24 14:38 ` [PATCH 1/9] analyzer: add logging to impl_path_context David Malcolm
@ 2023-08-24 14:38 ` David Malcolm
2023-08-24 14:38 ` [PATCH 3/9] analyzer: reimplement kf_strcpy [PR105899] David Malcolm
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:38 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (iterable_cluster::iterable_cluster): Add
symbolic binding keys to m_symbolic_bindings.
(iterable_cluster::has_symbolic_bindings_p): New.
(iterable_cluster::m_symbolic_bindings): New field.
(region_model::scan_for_null_terminator): Treat clusters with
symbolic bindings as having unknown strlen.
gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/sprintf-1.c: Include "analyzer-decls.h".
(test_strlen_1): New.
---
gcc/analyzer/region-model.cc | 15 +++++++++++++++
gcc/testsuite/gcc.dg/analyzer/sprintf-1.c | 11 +++++++++++
2 files changed, 26 insertions(+)
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 99817aee3a93..7a2f81f36e0f 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3420,6 +3420,8 @@ public:
if (concrete_key->get_byte_range (&fragment_bytes))
m_fragments.safe_push (fragment (fragment_bytes, sval));
}
+ else
+ m_symbolic_bindings.safe_push (key);
}
m_fragments.qsort (fragment::cmp_ptrs);
}
@@ -3440,8 +3442,14 @@ public:
return false;
}
+ bool has_symbolic_bindings_p () const
+ {
+ return !m_symbolic_bindings.is_empty ();
+ }
+
private:
auto_vec<fragment> m_fragments;
+ auto_vec<const binding_key *> m_symbolic_bindings;
};
/* Simulate reading the bytes at BYTES from BASE_REG.
@@ -3610,6 +3618,13 @@ region_model::scan_for_null_terminator (const region *reg,
/* No binding for this base_region, or no binding at src_byte_offset
(or a symbolic binding). */
+ if (c.has_symbolic_bindings_p ())
+ {
+ if (out_sval)
+ *out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+ return m_mgr->get_or_create_unknown_svalue (size_type_node);
+ }
+
/* TODO: the various special-cases seen in
region_model::get_store_value. */
diff --git a/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c b/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c
index f8dc806d6192..e7c2b3089c5b 100644
--- a/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c
@@ -1,6 +1,8 @@
/* See e.g. https://en.cppreference.com/w/c/io/fprintf
and https://www.man7.org/linux/man-pages/man3/sprintf.3.html */
+#include "analyzer-decls.h"
+
extern int
sprintf(char* dst, const char* fmt, ...)
__attribute__((__nothrow__));
@@ -64,3 +66,12 @@ test_fmt_not_terminated (char *dst)
return sprintf (dst, fmt); /* { dg-warning "stack-based buffer over-read" } */
/* { dg-message "while looking for null terminator for argument 2 \\('&fmt'\\) of 'sprintf'..." "event" { target *-*-* } .-1 } */
}
+
+void
+test_strlen_1 (void)
+{
+ char buf[10];
+ sprintf (buf, "msg: %s\n", "abc");
+ __analyzer_eval (__builtin_strlen (buf) == 8); /* { dg-warning "UNKNOWN" } */
+ // TODO: ideally would be TRUE
+}
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/9] analyzer: reimplement kf_strcpy [PR105899]
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
2023-08-24 14:38 ` [PATCH 1/9] analyzer: add logging to impl_path_context David Malcolm
2023-08-24 14:38 ` [PATCH 2/9] analyzer: handle symbolic bindings in scan_for_null_terminator [PR105899] David Malcolm
@ 2023-08-24 14:38 ` David Malcolm
2023-08-24 14:38 ` [PATCH 4/9] analyzer: eliminate region_model::get_string_size [PR105899] David Malcolm
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:38 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
This patch reimplements the analyzer's implementation of strcpy using
the region_model::scan_for_null_terminator infrastructure, so that e.g.
it can complain about out-of-bounds reads/writes, unterminated strings,
etc.
gcc/analyzer/ChangeLog:
PR analyzer/105899
* kf.cc (kf_strcpy::impl_call_pre): Reimplement using
check_for_null_terminated_string_arg.
* region-model.cc (region_model::get_store_bytes): Shortcut
reading all of a string_region.
(region_model::scan_for_null_terminator): Use get_store_value for
the bytes rather than "unknown" when returning an unknown length.
(region_model::write_bytes): New.
* region-model.h (region_model::write_bytes): New decl.
gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/out-of-bounds-diagram-16.c: New test.
* gcc.dg/analyzer/strcpy-1.c: Add test coverage.
* gcc.dg/analyzer/strcpy-3.c: Likewise.
* gcc.dg/analyzer/strcpy-4.c: New test.
---
gcc/analyzer/kf.cc | 32 +++++-------
gcc/analyzer/region-model.cc | 32 ++++++++++--
gcc/analyzer/region-model.h | 4 ++
.../analyzer/out-of-bounds-diagram-16.c | 31 +++++++++++
gcc/testsuite/gcc.dg/analyzer/strcpy-1.c | 22 ++++++++
gcc/testsuite/gcc.dg/analyzer/strcpy-3.c | 1 +
gcc/testsuite/gcc.dg/analyzer/strcpy-4.c | 51 +++++++++++++++++++
7 files changed, 150 insertions(+), 23 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-16.c
create mode 100644 gcc/testsuite/gcc.dg/analyzer/strcpy-4.c
diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
index 59f46bab581c..6b33cd159dac 100644
--- a/gcc/analyzer/kf.cc
+++ b/gcc/analyzer/kf.cc
@@ -1135,29 +1135,25 @@ void
kf_strcpy::impl_call_pre (const call_details &cd) const
{
region_model *model = cd.get_model ();
- region_model_manager *mgr = cd.get_manager ();
+ region_model_context *ctxt = cd.get_ctxt ();
const svalue *dest_sval = cd.get_arg_svalue (0);
const region *dest_reg = model->deref_rvalue (dest_sval, cd.get_arg_tree (0),
- cd.get_ctxt ());
- const svalue *src_sval = cd.get_arg_svalue (1);
- const region *src_reg = model->deref_rvalue (src_sval, cd.get_arg_tree (1),
- cd.get_ctxt ());
- const svalue *src_contents_sval = model->get_store_value (src_reg,
- cd.get_ctxt ());
- cd.check_for_null_terminated_string_arg (1);
-
+ ctxt);
+ /* strcpy returns the initial param. */
cd.maybe_set_lhs (dest_sval);
- /* Try to get the string size if SRC_REG is a string_region. */
- const svalue *copied_bytes_sval = model->get_string_size (src_reg);
- /* Otherwise, check if the contents of SRC_REG is a string. */
- if (copied_bytes_sval->get_kind () == SK_UNKNOWN)
- copied_bytes_sval = model->get_string_size (src_contents_sval);
-
- const region *sized_dest_reg
- = mgr->get_sized_region (dest_reg, NULL_TREE, copied_bytes_sval);
- model->set_value (sized_dest_reg, src_contents_sval, cd.get_ctxt ());
+ const svalue *bytes_to_copy;
+ if (const svalue *num_bytes_read_sval
+ = cd.check_for_null_terminated_string_arg (1, &bytes_to_copy))
+ {
+ model->write_bytes (dest_reg, num_bytes_read_sval, bytes_to_copy, ctxt);
+ }
+ else
+ {
+ if (cd.get_ctxt ())
+ cd.get_ctxt ()->terminate_path ();
+ }
}
/* Handler for "strdup" and "__builtin_strdup". */
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 7a2f81f36e0f..cc8d895d9665 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3460,6 +3460,13 @@ region_model::get_store_bytes (const region *base_reg,
const byte_range &bytes,
region_model_context *ctxt) const
{
+ /* Shortcut reading all of a string_region. */
+ if (bytes.get_start_byte_offset () == 0)
+ if (const string_region *string_reg = base_reg->dyn_cast_string_region ())
+ if (bytes.m_size_in_bytes
+ == TREE_STRING_LENGTH (string_reg->get_string_cst ()))
+ return m_mgr->get_or_create_initial_value (base_reg);
+
const svalue *index_sval
= m_mgr->get_or_create_int_cst (size_type_node,
bytes.get_start_byte_offset ());
@@ -3533,14 +3540,14 @@ region_model::scan_for_null_terminator (const region *reg,
if (offset.symbolic_p ())
{
if (out_sval)
- *out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+ *out_sval = get_store_value (reg, nullptr);
return m_mgr->get_or_create_unknown_svalue (size_type_node);
}
byte_offset_t src_byte_offset;
if (!offset.get_concrete_byte_offset (&src_byte_offset))
{
if (out_sval)
- *out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+ *out_sval = get_store_value (reg, nullptr);
return m_mgr->get_or_create_unknown_svalue (size_type_node);
}
const byte_offset_t initial_src_byte_offset = src_byte_offset;
@@ -3582,7 +3589,7 @@ region_model::scan_for_null_terminator (const region *reg,
if (is_terminated.is_unknown ())
{
if (out_sval)
- *out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+ *out_sval = get_store_value (reg, nullptr);
return m_mgr->get_or_create_unknown_svalue (size_type_node);
}
@@ -3621,7 +3628,7 @@ region_model::scan_for_null_terminator (const region *reg,
if (c.has_symbolic_bindings_p ())
{
if (out_sval)
- *out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+ *out_sval = get_store_value (reg, nullptr);
return m_mgr->get_or_create_unknown_svalue (size_type_node);
}
@@ -3638,7 +3645,7 @@ region_model::scan_for_null_terminator (const region *reg,
if (base_reg->can_have_initial_svalue_p ())
{
if (out_sval)
- *out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+ *out_sval = get_store_value (reg, nullptr);
return m_mgr->get_or_create_unknown_svalue (size_type_node);
}
else
@@ -3801,6 +3808,21 @@ region_model::zero_fill_region (const region *reg)
m_store.zero_fill_region (m_mgr->get_store_manager(), reg);
}
+/* Copy NUM_BYTES_SVAL of SVAL to DEST_REG.
+ Use CTXT to report any warnings associated with the copy
+ (e.g. out-of-bounds writes). */
+
+void
+region_model::write_bytes (const region *dest_reg,
+ const svalue *num_bytes_sval,
+ const svalue *sval,
+ region_model_context *ctxt)
+{
+ const region *sized_dest_reg
+ = m_mgr->get_sized_region (dest_reg, NULL_TREE, num_bytes_sval);
+ set_value (sized_dest_reg, sval, ctxt);
+}
+
/* Mark REG as having unknown content. */
void
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 3979bf124783..9c6e60bbe824 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -367,6 +367,10 @@ class region_model
void purge_region (const region *reg);
void fill_region (const region *reg, const svalue *sval);
void zero_fill_region (const region *reg);
+ void write_bytes (const region *dest_reg,
+ const svalue *num_bytes_sval,
+ const svalue *sval,
+ region_model_context *ctxt);
void mark_region_as_unknown (const region *reg, uncertainty_t *uncertainty);
tristate eval_condition (const svalue *lhs,
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-16.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-16.c
new file mode 100644
index 000000000000..b0fb409267ea
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-16.c
@@ -0,0 +1,31 @@
+/* { dg-additional-options "-fdiagnostics-text-art-charset=unicode" } */
+
+#include <string.h>
+#include "analyzer-decls.h"
+
+char *test_fixed_size_heap_2_invalid (void)
+{
+ char str[] = "abc";
+ char *p = __builtin_malloc (strlen (str)); /* { dg-message "\\(1\\) capacity: 3 bytes" } */
+ if (!p)
+ return NULL;
+ strcpy (p, str); /* { dg-warning "heap-based buffer overflow" } */
+ return p;
+}
+
+/* { dg-begin-multiline-output "" }
+ ┌──────────────────────────────────────────────────────────────────────┐
+ │ write of 4 bytes │
+ └──────────────────────────────────────────────────────────────────────┘
+ │ │
+ │ │
+ v v
+ ┌───────────────────────────────────────────────────┐┌─────────────────┐
+ │ buffer allocated on heap at (1) ││after valid range│
+ └───────────────────────────────────────────────────┘└─────────────────┘
+ ├─────────────────────────┬─────────────────────────┤├────────┬────────┤
+ │ │
+ ╭────────┴────────╮ ╭─────────┴────────╮
+ │capacity: 3 bytes│ │overflow of 1 byte│
+ ╰─────────────────╯ ╰──────────────────╯
+ { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c b/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c
index d21e77175119..30341061f4cc 100644
--- a/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c
@@ -30,3 +30,25 @@ char *test_uninitialized (char *dst)
return strcpy (dst, buf); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */
/* { dg-message "while looking for null terminator for argument 2 \\('&buf'\\) of 'strcpy'..." "event" { target *-*-* } .-1 } */
}
+
+extern void external_fn (void *ptr);
+
+char *test_external_fn (void)
+{
+ char src[10];
+ char dst[10];
+ external_fn (src);
+ strcpy (dst, src);
+ __analyzer_eval (strlen (dst) == strlen (src)); /* { dg-warning "UNKNOWN" } */
+ // TODO: ideally would be TRUE
+}
+
+void test_sprintf_strcpy (const char *a, const char *b)
+{
+ char buf_1[10];
+ char buf_2[10];
+ __builtin_sprintf (buf_1, "%s/%s", a, b);
+ strcpy (buf_2, buf_1);
+ __analyzer_eval (strlen (buf_1) == strlen (buf_2)); /* { dg-warning "UNKNOWN" } */
+ // TODO: ideally would be TRUE
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c b/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c
index a38f9a7641fe..abb49bc39f27 100644
--- a/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c
+++ b/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c
@@ -20,4 +20,5 @@ void test_1 (void)
__analyzer_eval (result[3] == 'l'); /* { dg-warning "TRUE" } */
__analyzer_eval (result[4] == 'o'); /* { dg-warning "TRUE" } */
__analyzer_eval (result[5] == 0); /* { dg-warning "TRUE" } */
+ __analyzer_eval (strlen (result) == 5); /* { dg-warning "TRUE" } */
}
diff --git a/gcc/testsuite/gcc.dg/analyzer/strcpy-4.c b/gcc/testsuite/gcc.dg/analyzer/strcpy-4.c
new file mode 100644
index 000000000000..435a4cadee9d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/strcpy-4.c
@@ -0,0 +1,51 @@
+/* { dg-additional-options "-Wno-stringop-overflow" } */
+
+#include <string.h>
+#include "analyzer-decls.h"
+
+void
+test_fixed_size_stack_1 (void)
+{
+ char buf[3];
+ strcpy (buf, "abc"); /* { dg-warning "stack-based buffer overflow" } */
+}
+
+char *test_fixed_size_heap_1 (void)
+{
+ char str[] = "abc";
+ char *p = __builtin_malloc (3);
+ if (!p)
+ return NULL;
+ strcpy (p, str); /* { dg-warning "heap-based buffer overflow" } */
+ return p;
+}
+
+char *test_fixed_size_heap_2_invalid (void)
+{
+ char str[] = "abc";
+ char *p = __builtin_malloc (strlen (str));
+ if (!p)
+ return NULL;
+ strcpy (p, str); /* { dg-warning "heap-based buffer overflow" } */
+ return p;
+}
+
+char *test_fixed_size_heap_2_valid (void)
+{
+ char str[] = "abc";
+ char *p = __builtin_malloc (strlen (str) + 1);
+ if (!p)
+ return NULL;
+ strcpy (p, str); /* { dg-bogus "" } */
+ __analyzer_eval (strlen (p) == 3); /* { dg-warning "TRUE" } */
+ return p;
+}
+
+char *test_dynamic_size_heap_1 (const char *src)
+{
+ char *p = __builtin_malloc (strlen (src));
+ if (!p)
+ return NULL;
+ strcpy (p, src); // TODO: write of null terminator is oob
+ return p;
+}
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 4/9] analyzer: eliminate region_model::get_string_size [PR105899]
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
` (2 preceding siblings ...)
2023-08-24 14:38 ` [PATCH 3/9] analyzer: reimplement kf_strcpy [PR105899] David Malcolm
@ 2023-08-24 14:38 ` David Malcolm
2023-08-24 14:38 ` [PATCH 5/9] analyzer: reimplement kf_memcpy_memmove David Malcolm
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:38 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (region_model::get_string_size): Delete both.
* region-model.h (region_model::get_string_size): Delete both
decls.
---
gcc/analyzer/region-model.cc | 29 -----------------------------
gcc/analyzer/region-model.h | 3 ---
2 files changed, 32 deletions(-)
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index cc8d895d9665..1fe66f4719fa 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -2794,35 +2794,6 @@ region_model::get_capacity (const region *reg) const
return m_mgr->get_or_create_unknown_svalue (sizetype);
}
-/* Return the string size, including the 0-terminator, if SVAL is a
- constant_svalue holding a string. Otherwise, return an unknown_svalue. */
-
-const svalue *
-region_model::get_string_size (const svalue *sval) const
-{
- tree cst = sval->maybe_get_constant ();
- if (!cst || TREE_CODE (cst) != STRING_CST)
- return m_mgr->get_or_create_unknown_svalue (size_type_node);
-
- tree out = build_int_cst (size_type_node, TREE_STRING_LENGTH (cst));
- return m_mgr->get_or_create_constant_svalue (out);
-}
-
-/* Return the string size, including the 0-terminator, if REG is a
- string_region. Otherwise, return an unknown_svalue. */
-
-const svalue *
-region_model::get_string_size (const region *reg) const
-{
- const string_region *str_reg = dyn_cast <const string_region *> (reg);
- if (!str_reg)
- return m_mgr->get_or_create_unknown_svalue (size_type_node);
-
- tree cst = str_reg->get_string_cst ();
- tree out = build_int_cst (size_type_node, TREE_STRING_LENGTH (cst));
- return m_mgr->get_or_create_constant_svalue (out);
-}
-
/* If CTXT is non-NULL, use it to warn about any problems accessing REG,
using DIR to determine if this access is a read or write.
Return TRUE if an OOB access was detected.
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 9c6e60bbe824..41df1885ad5b 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -469,9 +469,6 @@ class region_model
const svalue *get_capacity (const region *reg) const;
- const svalue *get_string_size (const svalue *sval) const;
- const svalue *get_string_size (const region *reg) const;
-
bool replay_call_summary (call_summary_replay &r,
const region_model &summary);
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 5/9] analyzer: reimplement kf_memcpy_memmove
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
` (3 preceding siblings ...)
2023-08-24 14:38 ` [PATCH 4/9] analyzer: eliminate region_model::get_string_size [PR105899] David Malcolm
@ 2023-08-24 14:38 ` David Malcolm
2023-08-24 14:39 ` [PATCH 6/9] analyzer: handle strlen(INIT_VAL(STRING_REG)) [PR105899] David Malcolm
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:38 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
* kf.cc (kf_memcpy_memmove::impl_call_pre): Reimplement using
region_model::copy_bytes.
* region-model.cc (region_model::read_bytes): New.
(region_model::copy_bytes): New.
* region-model.h (region_model::read_bytes): New decl.
(region_model::copy_bytes): New decl.
---
gcc/analyzer/kf.cc | 14 ++++----------
gcc/analyzer/region-model.cc | 35 +++++++++++++++++++++++++++++++++++
gcc/analyzer/region-model.h | 9 +++++++++
3 files changed, 48 insertions(+), 10 deletions(-)
diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
index 6b33cd159dac..3eddbe200387 100644
--- a/gcc/analyzer/kf.cc
+++ b/gcc/analyzer/kf.cc
@@ -541,7 +541,6 @@ kf_memcpy_memmove::impl_call_pre (const call_details &cd) const
const svalue *num_bytes_sval = cd.get_arg_svalue (2);
region_model *model = cd.get_model ();
- region_model_manager *mgr = cd.get_manager ();
const region *dest_reg
= model->deref_rvalue (dest_ptr_sval, cd.get_arg_tree (0), cd.get_ctxt ());
@@ -550,15 +549,10 @@ kf_memcpy_memmove::impl_call_pre (const call_details &cd) const
cd.maybe_set_lhs (dest_ptr_sval);
- const region *sized_src_reg
- = mgr->get_sized_region (src_reg, NULL_TREE, num_bytes_sval);
- const region *sized_dest_reg
- = mgr->get_sized_region (dest_reg, NULL_TREE, num_bytes_sval);
- const svalue *src_contents_sval
- = model->get_store_value (sized_src_reg, cd.get_ctxt ());
- model->check_for_poison (src_contents_sval, cd.get_arg_tree (1),
- sized_src_reg, cd.get_ctxt ());
- model->set_value (sized_dest_reg, src_contents_sval, cd.get_ctxt ());
+ model->copy_bytes (dest_reg,
+ src_reg, cd.get_arg_tree (1),
+ num_bytes_sval,
+ cd.get_ctxt ());
}
/* Handler for "memset" and "__builtin_memset". */
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 1fe66f4719fa..00c306ab7dae 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3794,6 +3794,41 @@ region_model::write_bytes (const region *dest_reg,
set_value (sized_dest_reg, sval, ctxt);
}
+/* Read NUM_BYTES_SVAL from SRC_REG.
+ Use CTXT to report any warnings associated with the copy
+ (e.g. out-of-bounds reads, copying of uninitialized values, etc). */
+
+const svalue *
+region_model::read_bytes (const region *src_reg,
+ tree src_ptr_expr,
+ const svalue *num_bytes_sval,
+ region_model_context *ctxt) const
+{
+ const region *sized_src_reg
+ = m_mgr->get_sized_region (src_reg, NULL_TREE, num_bytes_sval);
+ const svalue *src_contents_sval = get_store_value (sized_src_reg, ctxt);
+ check_for_poison (src_contents_sval, src_ptr_expr,
+ sized_src_reg, ctxt);
+ return src_contents_sval;
+}
+
+/* Copy NUM_BYTES_SVAL bytes from SRC_REG to DEST_REG.
+ Use CTXT to report any warnings associated with the copy
+ (e.g. out-of-bounds reads/writes, copying of uninitialized values,
+ etc). */
+
+void
+region_model::copy_bytes (const region *dest_reg,
+ const region *src_reg,
+ tree src_ptr_expr,
+ const svalue *num_bytes_sval,
+ region_model_context *ctxt)
+{
+ const svalue *data_sval
+ = read_bytes (src_reg, src_ptr_expr, num_bytes_sval, ctxt);
+ write_bytes (dest_reg, num_bytes_sval, data_sval, ctxt);
+}
+
/* Mark REG as having unknown content. */
void
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 41df1885ad5b..b1c705e22c28 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -371,6 +371,15 @@ class region_model
const svalue *num_bytes_sval,
const svalue *sval,
region_model_context *ctxt);
+ const svalue *read_bytes (const region *src_reg,
+ tree src_ptr_expr,
+ const svalue *num_bytes_sval,
+ region_model_context *ctxt) const;
+ void copy_bytes (const region *dest_reg,
+ const region *src_reg,
+ tree src_ptr_expr,
+ const svalue *num_bytes_sval,
+ region_model_context *ctxt);
void mark_region_as_unknown (const region *reg, uncertainty_t *uncertainty);
tristate eval_condition (const svalue *lhs,
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 6/9] analyzer: handle strlen(INIT_VAL(STRING_REG)) [PR105899]
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
` (4 preceding siblings ...)
2023-08-24 14:38 ` [PATCH 5/9] analyzer: reimplement kf_memcpy_memmove David Malcolm
@ 2023-08-24 14:39 ` David Malcolm
2023-08-24 14:39 ` [PATCH 7/9] analyzer: handle INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) [PR105899] David Malcolm
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:39 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (fragment::has_null_terminator): Move STRING_CST
handling to fragment::string_cst_has_null_terminator; also use it to
handle INIT_VAL(STRING_REG).
(fragment::string_cst_has_null_terminator): New, from above.
gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/strcpy-3.c (test_2): New.
---
gcc/analyzer/region-model.cc | 68 ++++++++++++++++--------
gcc/testsuite/gcc.dg/analyzer/strcpy-3.c | 7 +++
2 files changed, 54 insertions(+), 21 deletions(-)
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 00c306ab7dae..6574ec140074 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3310,27 +3310,10 @@ struct fragment
switch (TREE_CODE (cst))
{
case STRING_CST:
- {
- /* Look for the first 0 byte within STRING_CST
- from START_READ_OFFSET onwards. */
- const HOST_WIDE_INT num_bytes_to_search
- = std::min<HOST_WIDE_INT> ((TREE_STRING_LENGTH (cst)
- - rel_start_read_offset_hwi),
- available_bytes_hwi);
- const char *start = (TREE_STRING_POINTER (cst)
- + rel_start_read_offset_hwi);
- if (num_bytes_to_search >= 0)
- if (const void *p = memchr (start, 0,
- num_bytes_to_search))
- {
- *out_bytes_read = (const char *)p - start + 1;
- return tristate (true);
- }
-
- *out_bytes_read = available_bytes;
- return tristate (false);
- }
- break;
+ return string_cst_has_null_terminator (cst,
+ rel_start_read_offset_hwi,
+ available_bytes_hwi,
+ out_bytes_read);
case INTEGER_CST:
if (rel_start_read_offset_hwi == 0
&& integer_onep (TYPE_SIZE_UNIT (TREE_TYPE (cst))))
@@ -3357,12 +3340,55 @@ struct fragment
}
}
break;
+
+ case SK_INITIAL:
+ {
+ const initial_svalue *initial_sval = (const initial_svalue *)m_sval;
+ const region *reg = initial_sval->get_region ();
+ if (const string_region *string_reg = reg->dyn_cast_string_region ())
+ {
+ tree string_cst = string_reg->get_string_cst ();
+ return string_cst_has_null_terminator (string_cst,
+ rel_start_read_offset_hwi,
+ available_bytes_hwi,
+ out_bytes_read);
+ }
+ return tristate::TS_UNKNOWN;
+ }
+ break;
+
default:
// TODO: it may be possible to handle other cases here.
return tristate::TS_UNKNOWN;
}
}
+ static tristate
+ string_cst_has_null_terminator (tree string_cst,
+ HOST_WIDE_INT rel_start_read_offset_hwi,
+ HOST_WIDE_INT available_bytes_hwi,
+ byte_offset_t *out_bytes_read)
+ {
+ /* Look for the first 0 byte within STRING_CST
+ from START_READ_OFFSET onwards. */
+ const HOST_WIDE_INT num_bytes_to_search
+ = std::min<HOST_WIDE_INT> ((TREE_STRING_LENGTH (string_cst)
+ - rel_start_read_offset_hwi),
+ available_bytes_hwi);
+ const char *start = (TREE_STRING_POINTER (string_cst)
+ + rel_start_read_offset_hwi);
+ if (num_bytes_to_search >= 0)
+ if (const void *p = memchr (start, 0,
+ num_bytes_to_search))
+ {
+ *out_bytes_read = (const char *)p - start + 1;
+ return tristate (true);
+ }
+
+ *out_bytes_read = available_bytes_hwi;
+ return tristate (false);
+ }
+
byte_range m_byte_range;
const svalue *m_sval;
};
diff --git a/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c b/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c
index abb49bc39f27..a7b324fc445e 100644
--- a/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c
+++ b/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c
@@ -22,3 +22,10 @@ void test_1 (void)
__analyzer_eval (result[5] == 0); /* { dg-warning "TRUE" } */
__analyzer_eval (strlen (result) == 5); /* { dg-warning "TRUE" } */
}
+
+void test_2 (void)
+{
+ char buf[16];
+ __builtin_strcpy (buf, "abc");
+ __analyzer_eval (strlen (buf) == 3); /* { dg-warning "TRUE" } */
+}
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 7/9] analyzer: handle INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) [PR105899]
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
` (5 preceding siblings ...)
2023-08-24 14:39 ` [PATCH 6/9] analyzer: handle strlen(INIT_VAL(STRING_REG)) [PR105899] David Malcolm
@ 2023-08-24 14:39 ` David Malcolm
2023-08-24 14:39 ` [PATCH 8/9] analyzer: handle strlen(BITS_WITHIN) [PR105899] David Malcolm
2023-08-24 14:39 ` [PATCH 9/9] analyzer: implement kf_strcat [PR105899] David Malcolm
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:39 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model-manager.cc
(region_model_manager::get_or_create_initial_value): Simplify
INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) to
CONSTANT_SVAL(STRING[N]).
---
gcc/analyzer/region-model-manager.cc | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
index 65b719056c84..22246876f8f9 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -310,6 +310,25 @@ region_model_manager::get_or_create_initial_value (const region *reg,
get_or_create_initial_value (original_reg));
}
+ /* Simplify:
+ INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL)
+ to:
+ CONSTANT_SVAL(STRING[N]). */
+ if (const element_region *element_reg = reg->dyn_cast_element_region ())
+ if (tree cst_idx = element_reg->get_index ()->maybe_get_constant ())
+ if (const string_region *string_reg
+ = element_reg->get_parent_region ()->dyn_cast_string_region ())
+ if (tree_fits_shwi_p (cst_idx))
+ {
+ HOST_WIDE_INT idx = tree_to_shwi (cst_idx);
+ tree string_cst = string_reg->get_string_cst ();
+ if (idx >= 0 && idx <= TREE_STRING_LENGTH (string_cst))
+ {
+ int ch = TREE_STRING_POINTER (string_cst)[idx];
+ return get_or_create_int_cst (reg->get_type (), ch);
+ }
+ }
+
/* INIT_VAL (*UNKNOWN_PTR) -> UNKNOWN_VAL. */
if (reg->symbolic_for_unknown_ptr_p ())
return get_or_create_unknown_svalue (reg->get_type ());
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 8/9] analyzer: handle strlen(BITS_WITHIN) [PR105899]
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
` (6 preceding siblings ...)
2023-08-24 14:39 ` [PATCH 7/9] analyzer: handle INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) [PR105899] David Malcolm
@ 2023-08-24 14:39 ` David Malcolm
2023-08-24 14:39 ` [PATCH 9/9] analyzer: implement kf_strcat [PR105899] David Malcolm
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:39 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (fragment::has_null_terminator): Handle
SK_BITS_WITHIN.
---
gcc/analyzer/region-model.cc | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 6574ec140074..025b555d7b97 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3357,10 +3357,29 @@ struct fragment
}
break;
+ case SK_BITS_WITHIN:
+ {
+ const bits_within_svalue *bits_within_sval
+ = (const bits_within_svalue *)m_sval;
+ byte_range bytes (0, 0);
+ if (bits_within_sval->get_bits ().as_byte_range (&bytes))
+ {
+ const svalue *inner_sval = bits_within_sval->get_inner_svalue ();
+ fragment f (byte_range
+ (start_read_offset - bytes.get_start_bit_offset (),
+ std::max<byte_size_t> (bytes.m_size_in_bytes,
+ available_bytes)),
+ inner_sval);
+ return f.has_null_terminator (start_read_offset, out_bytes_read);
+ }
+ }
+ break;
+
default:
// TODO: it may be possible to handle other cases here.
- return tristate::TS_UNKNOWN;
+ break;
}
+ return tristate::TS_UNKNOWN;
}
static tristate
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 9/9] analyzer: implement kf_strcat [PR105899]
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
` (7 preceding siblings ...)
2023-08-24 14:39 ` [PATCH 8/9] analyzer: handle strlen(BITS_WITHIN) [PR105899] David Malcolm
@ 2023-08-24 14:39 ` David Malcolm
8 siblings, 0 replies; 10+ messages in thread
From: David Malcolm @ 2023-08-24 14:39 UTC (permalink / raw)
To: gcc-patches; +Cc: David Malcolm
gcc/analyzer/ChangeLog:
PR analyzer/105899
* call-details.cc
(call_details::check_for_null_terminated_string_arg): Split into
overloads, one taking just an arg_idx, the other a new
"include_terminator" param.
* call-details.h: Likewise.
* kf.cc (class kf_strcat): New.
(kf_strcpy::impl_call_pre): Update for change to
check_for_null_terminated_string_arg.
(register_known_functions): Register kf_strcat.
* region-model.cc
(region_model::check_for_null_terminated_string_arg): Split into
overloads, one taking just an arg_idx, the other a new
"include_terminator" param. When returning an svalue, handle
"include_terminator" being false by subtracting one.
* region-model.h
(region_model::check_for_null_terminated_string_arg): Split into
overloads, one taking just an arg_idx, the other a new
"include_terminator" param.
gcc/ChangeLog:
PR analyzer/105899
* doc/invoke.texi (Static Analyzer Options): Add "strcat" to the
list of functions known to the analyzer.
gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/strcat-1.c: New test.
---
gcc/analyzer/call-details.cc | 12 +-
gcc/analyzer/call-details.h | 5 +-
gcc/analyzer/kf.cc | 72 ++++++++++--
gcc/analyzer/region-model.cc | 63 +++++++++--
gcc/analyzer/region-model.h | 6 +-
gcc/doc/invoke.texi | 1 +
gcc/testsuite/gcc.dg/analyzer/strcat-1.c | 136 +++++++++++++++++++++++
7 files changed, 275 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/analyzer/strcat-1.c
diff --git a/gcc/analyzer/call-details.cc b/gcc/analyzer/call-details.cc
index 8f5b28ce6c26..ce1f859c9996 100644
--- a/gcc/analyzer/call-details.cc
+++ b/gcc/analyzer/call-details.cc
@@ -386,13 +386,23 @@ call_details::lookup_function_attribute (const char *attr_name) const
return lookup_attribute (attr_name, TYPE_ATTRIBUTES (allocfntype));
}
+void
+call_details::check_for_null_terminated_string_arg (unsigned arg_idx) const
+{
+ check_for_null_terminated_string_arg (arg_idx, false, nullptr);
+}
+
const svalue *
call_details::
check_for_null_terminated_string_arg (unsigned arg_idx,
+ bool include_terminator,
const svalue **out_sval) const
{
region_model *model = get_model ();
- return model->check_for_null_terminated_string_arg (*this, arg_idx, out_sval);
+ return model->check_for_null_terminated_string_arg (*this,
+ arg_idx,
+ include_terminator,
+ out_sval);
}
} // namespace ana
diff --git a/gcc/analyzer/call-details.h b/gcc/analyzer/call-details.h
index 58b5ccd2acde..ae528e4ab116 100644
--- a/gcc/analyzer/call-details.h
+++ b/gcc/analyzer/call-details.h
@@ -72,9 +72,12 @@ public:
tree lookup_function_attribute (const char *attr_name) const;
+ void
+ check_for_null_terminated_string_arg (unsigned arg_idx) const;
const svalue *
check_for_null_terminated_string_arg (unsigned arg_idx,
- const svalue **out_sval = nullptr) const;
+ bool include_terminator,
+ const svalue **out_sval) const;
private:
const gcall *m_call;
diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
index 3eddbe200387..36d9d10bb013 100644
--- a/gcc/analyzer/kf.cc
+++ b/gcc/analyzer/kf.cc
@@ -1106,6 +1106,61 @@ public:
/* Currently a no-op. */
};
+/* Handler for "strcat" and "__builtin_strcat_chk". */
+
+class kf_strcat : public known_function
+{
+public:
+ kf_strcat (unsigned int num_args) : m_num_args (num_args) {}
+ bool matches_call_types_p (const call_details &cd) const final override
+ {
+ return (cd.num_args () == m_num_args
+ && cd.arg_is_pointer_p (0)
+ && cd.arg_is_pointer_p (1));
+ }
+
+ void impl_call_pre (const call_details &cd) const final override
+ {
+ region_model *model = cd.get_model ();
+ region_model_manager *mgr = cd.get_manager ();
+
+ const svalue *dest_sval = cd.get_arg_svalue (0);
+ const region *dest_reg = model->deref_rvalue (dest_sval, cd.get_arg_tree (0),
+ cd.get_ctxt ());
+
+ const svalue *dst_strlen_sval
+ = cd.check_for_null_terminated_string_arg (0, false, nullptr);
+ if (!dst_strlen_sval)
+ {
+ if (cd.get_ctxt ())
+ cd.get_ctxt ()->terminate_path ();
+ return;
+ }
+
+ const svalue *bytes_to_copy;
+ const svalue *num_src_bytes_read_sval
+ = cd.check_for_null_terminated_string_arg (1, true, &bytes_to_copy);
+ if (!num_src_bytes_read_sval)
+ {
+ if (cd.get_ctxt ())
+ cd.get_ctxt ()->terminate_path ();
+ return;
+ }
+
+ cd.maybe_set_lhs (dest_sval);
+
+ const region *offset_reg
+ = mgr->get_offset_region (dest_reg, NULL_TREE, dst_strlen_sval);
+ model->write_bytes (offset_reg,
+ num_src_bytes_read_sval,
+ bytes_to_copy,
+ cd.get_ctxt ());
+ }
+
+private:
+ unsigned int m_num_args;
+};
+
/* Handler for "strcpy" and "__builtin_strcpy_chk". */
class kf_strcpy : public known_function
@@ -1139,7 +1194,7 @@ kf_strcpy::impl_call_pre (const call_details &cd) const
const svalue *bytes_to_copy;
if (const svalue *num_bytes_read_sval
- = cd.check_for_null_terminated_string_arg (1, &bytes_to_copy))
+ = cd.check_for_null_terminated_string_arg (1, true, &bytes_to_copy))
{
model->write_bytes (dest_reg, num_bytes_read_sval, bytes_to_copy, ctxt);
}
@@ -1188,16 +1243,10 @@ public:
}
void impl_call_pre (const call_details &cd) const final override
{
- if (const svalue *bytes_read = cd.check_for_null_terminated_string_arg (0))
- if (bytes_read->get_kind () != SK_UNKNOWN)
+ if (const svalue *strlen_sval
+ = cd.check_for_null_terminated_string_arg (0, false, nullptr))
+ if (strlen_sval->get_kind () != SK_UNKNOWN)
{
- region_model_manager *mgr = cd.get_manager ();
- /* strlen is (bytes_read - 1). */
- const svalue *one = mgr->get_or_create_int_cst (size_type_node, 1);
- const svalue *strlen_sval = mgr->get_or_create_binop (size_type_node,
- MINUS_EXPR,
- bytes_read,
- one);
cd.maybe_set_lhs (strlen_sval);
return;
}
@@ -1415,6 +1464,8 @@ register_known_functions (known_function_manager &kfm)
kfm.add (BUILT_IN_SPRINTF, make_unique<kf_sprintf> ());
kfm.add (BUILT_IN_STACK_RESTORE, make_unique<kf_stack_restore> ());
kfm.add (BUILT_IN_STACK_SAVE, make_unique<kf_stack_save> ());
+ kfm.add (BUILT_IN_STRCAT, make_unique<kf_strcat> (2));
+ kfm.add (BUILT_IN_STRCAT_CHK, make_unique<kf_strcat> (3));
kfm.add (BUILT_IN_STRCHR, make_unique<kf_strchr> ());
kfm.add (BUILT_IN_STRCPY, make_unique<kf_strcpy> (2));
kfm.add (BUILT_IN_STRCPY_CHK, make_unique<kf_strcpy> (3));
@@ -1429,6 +1480,7 @@ register_known_functions (known_function_manager &kfm)
/* Known builtins and C standard library functions. */
{
kfm.add ("memset", make_unique<kf_memset> ());
+ kfm.add ("strcat", make_unique<kf_strcat> (2));
kfm.add ("strdup", make_unique<kf_strdup> ());
kfm.add ("strndup", make_unique<kf_strndup> ());
}
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 025b555d7b97..02c073c15bcc 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3679,9 +3679,41 @@ region_model::scan_for_null_terminator (const region *reg,
- the buffer pointed to has any uninitalized bytes before any 0-terminator
- any of the reads aren't within the bounds of the underlying base region
- Otherwise, return a svalue for the number of bytes read (strlen + 1),
- and, if OUT_SVAL is non-NULL, write to *OUT_SVAL with an svalue
- representing the content of the buffer up to and including the terminator.
+ Otherwise, return a svalue for strlen of the buffer (*not* including
+ the null terminator).
+
+ TODO: we should also complain if:
+ - the pointer is NULL (or could be). */
+
+void
+region_model::check_for_null_terminated_string_arg (const call_details &cd,
+ unsigned arg_idx)
+{
+ check_for_null_terminated_string_arg (cd,
+ arg_idx,
+ false, /* include_terminator */
+ nullptr); // out_sval
+}
+
+
+/* Check that argument ARG_IDX (0-based) to the call described by CD
+ is a pointer to a valid null-terminated string.
+
+ Simulate scanning through the buffer, reading until we find a 0 byte
+ (equivalent to calling strlen).
+
+ Complain and return NULL if:
+ - the buffer pointed to isn't null-terminated
+ - the buffer pointed to has any uninitalized bytes before any 0-terminator
+ - any of the reads aren't within the bounds of the underlying base region
+
+ Otherwise, return a svalue. This will be the number of bytes read
+ (including the null terminator) if INCLUDE_TERMINATOR is true, or strlen
+ of the buffer (not including the null terminator) if it is false.
+
+ Also, when returning an svalue, if OUT_SVAL is non-NULL, write to
+ *OUT_SVAL with an svalue representing the content of the buffer up to
+ and including the terminator.
TODO: we should also complain if:
- the pointer is NULL (or could be). */
@@ -3689,6 +3721,7 @@ region_model::scan_for_null_terminator (const region *reg,
const svalue *
region_model::check_for_null_terminated_string_arg (const call_details &cd,
unsigned arg_idx,
+ bool include_terminator,
const svalue **out_sval)
{
class null_terminator_check_event : public custom_event
@@ -3786,10 +3819,26 @@ region_model::check_for_null_terminated_string_arg (const call_details &cd,
const region *buf_reg
= deref_rvalue (arg_sval, cd.get_arg_tree (arg_idx), &my_ctxt);
- return scan_for_null_terminator (buf_reg,
- cd.get_arg_tree (arg_idx),
- out_sval,
- &my_ctxt);
+ if (const svalue *num_bytes_read_sval
+ = scan_for_null_terminator (buf_reg,
+ cd.get_arg_tree (arg_idx),
+ out_sval,
+ &my_ctxt))
+ {
+ if (include_terminator)
+ return num_bytes_read_sval;
+ else
+ {
+ /* strlen is (bytes_read - 1). */
+ const svalue *one = m_mgr->get_or_create_int_cst (size_type_node, 1);
+ return m_mgr->get_or_create_binop (size_type_node,
+ MINUS_EXPR,
+ num_bytes_read_sval,
+ one);
+ }
+ }
+ else
+ return nullptr;
}
/* Remove all bindings overlapping REG within the store. */
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index b1c705e22c28..40259625fb06 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -519,10 +519,14 @@ class region_model
const svalue *sval_hint,
region_model_context *ctxt) const;
+ void
+ check_for_null_terminated_string_arg (const call_details &cd,
+ unsigned idx);
const svalue *
check_for_null_terminated_string_arg (const call_details &cd,
unsigned idx,
- const svalue **out_sval = nullptr);
+ bool include_terminator,
+ const svalue **out_sval);
private:
const region *get_lvalue_1 (path_var pv, region_model_context *ctxt) const;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ef3f40989860..209d6b0ce4d3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11109,6 +11109,7 @@ and of the following functions:
@item @code{siglongjmp}
@item @code{signal}
@item @code{sigsetjmp}
+@item @code{strcat}
@item @code{strchr}
@item @code{strlen}
@end itemize
diff --git a/gcc/testsuite/gcc.dg/analyzer/strcat-1.c b/gcc/testsuite/gcc.dg/analyzer/strcat-1.c
new file mode 100644
index 000000000000..e3b698ae73d3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/strcat-1.c
@@ -0,0 +1,136 @@
+/* See e.g. https://en.cppreference.com/w/c/string/byte/strcat */
+
+#include "analyzer-decls.h"
+
+char *strcat (char *dest, const char *src);
+#define NULL ((void *)0)
+
+char *
+test_passthrough (char *dest, const char *src)
+{
+ return strcat (dest, src);
+}
+
+char *
+test_null_dest (const char *src)
+{
+ return strcat (NULL, src); /* { dg-warning "use of NULL where non-null expected" } */
+}
+
+char *
+test_null_src (char *dest)
+{
+ return strcat (dest, NULL); /* { dg-warning "use of NULL where non-null expected" } */
+}
+
+char *
+test_uninit_dest (const char *src)
+{
+ char dest[10];
+ return strcat (dest, src); /* { dg-warning "use of uninitialized value 'dest\\\[0\\\]'" } */
+}
+
+char *
+test_uninit_src (char *dest)
+{
+ const char src[10];
+ return strcat (dest, src); /* { dg-warning "use of uninitialized value 'src\\\[0\\\]'" } */
+}
+
+char *
+test_dest_not_terminated (char *src)
+{
+ char dest[3] = "foo";
+ return strcat (dest, src); /* { dg-warning "stack-based buffer over-read" } */
+ /* { dg-message "while looking for null terminator for argument 1 \\('&dest'\\) of 'strcat'" "" { target *-*-* } .-1 } */
+}
+
+char *
+test_src_not_terminated (char *dest)
+{
+ const char src[3] = "foo";
+ return strcat (dest, src); /* { dg-warning "stack-based buffer over-read" } */
+ /* { dg-message "while looking for null terminator for argument 2 \\('&src'\\) of 'strcat'" "" { target *-*-* } .-1 } */
+}
+
+char * __attribute__((noinline))
+call_strcat (char *dest, const char *src)
+{
+ return strcat (dest, src);
+}
+
+void
+test_concrete_valid_static_size (void)
+{
+ char buf[16];
+ char *p1 = __builtin_strcpy (buf, "abc");
+ char *p2 = call_strcat (buf, "def");
+ __analyzer_eval (p1 == buf); /* { dg-warning "TRUE" } */
+ __analyzer_eval (p2 == buf); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[0] == 'a'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[1] == 'b'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[2] == 'c'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[3] == 'd'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[4] == 'e'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[5] == 'f'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[6] == '\0'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf) == 6); /* { dg-warning "TRUE" } */
+}
+
+void
+test_concrete_valid_static_size_2 (void)
+{
+ char buf[16];
+ char *p1 = __builtin_strcpy (buf, "abc");
+ char *p2 = call_strcat (buf, "def");
+ char *p3 = call_strcat (buf, "ghi");
+ __analyzer_eval (p1 == buf); /* { dg-warning "TRUE" } */
+ __analyzer_eval (p2 == buf); /* { dg-warning "TRUE" } */
+ __analyzer_eval (p3 == buf); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[0] == 'a'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[1] == 'b'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[2] == 'c'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[3] == 'd'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[4] == 'e'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[5] == 'f'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[6] == 'g'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[7] == 'h'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[8] == 'i'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (buf[9] == '\0'); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf) == 9); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 1) == 8); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 2) == 7); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 3) == 6); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 4) == 5); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 5) == 4); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 6) == 3); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 7) == 2); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 8) == 1); /* { dg-warning "TRUE" } */
+ __analyzer_eval (__builtin_strlen (buf + 9) == 0); /* { dg-warning "TRUE" } */
+}
+
+char * __attribute__((noinline))
+call_strcat_invalid (char *dest, const char *src)
+{
+ return strcat (dest, src); /* { dg-warning "stack-based buffer overflow" } */
+}
+
+void
+test_concrete_invalid_static_size (void)
+{
+ char buf[3];
+ buf[0] = '\0';
+ call_strcat_invalid (buf, "abc");
+}
+
+void
+test_concrete_symbolic (const char *suffix)
+{
+ char buf[10];
+ buf[0] = '\0';
+ call_strcat (buf, suffix);
+}
+
+/* TODO:
+ - "The behavior is undefined if the strings overlap."
+*/
--
2.26.3
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-08-24 14:39 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-24 14:38 [pushed 0/9] analyzer: strlen, strcpy, and strcat [PR105899] David Malcolm
2023-08-24 14:38 ` [PATCH 1/9] analyzer: add logging to impl_path_context David Malcolm
2023-08-24 14:38 ` [PATCH 2/9] analyzer: handle symbolic bindings in scan_for_null_terminator [PR105899] David Malcolm
2023-08-24 14:38 ` [PATCH 3/9] analyzer: reimplement kf_strcpy [PR105899] David Malcolm
2023-08-24 14:38 ` [PATCH 4/9] analyzer: eliminate region_model::get_string_size [PR105899] David Malcolm
2023-08-24 14:38 ` [PATCH 5/9] analyzer: reimplement kf_memcpy_memmove David Malcolm
2023-08-24 14:39 ` [PATCH 6/9] analyzer: handle strlen(INIT_VAL(STRING_REG)) [PR105899] David Malcolm
2023-08-24 14:39 ` [PATCH 7/9] analyzer: handle INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) [PR105899] David Malcolm
2023-08-24 14:39 ` [PATCH 8/9] analyzer: handle strlen(BITS_WITHIN) [PR105899] David Malcolm
2023-08-24 14:39 ` [PATCH 9/9] analyzer: implement kf_strcat [PR105899] David Malcolm
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).