From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from www523.your-server.de (www523.your-server.de [159.69.224.22]) by sourceware.org (Postfix) with ESMTPS id 71A083858CDB for ; Sun, 4 Sep 2022 19:23:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 71A083858CDB Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=tim-lange.me Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tim-lange.me DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tim-lange.me; s=default2108; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=AdA9CewYngV1dyUQmlvEPjxbHpLbbVooLICZXCJDmq4=; b=Q6DTVTuVQQj3XGICnrblvVZ1VW WJqQaj2oUzyGuzyy1nZAMDOtEoinLRHs5Ep2kjfY8Y1jkeHLbuRZ85T1w7/l+QlvWVUAY4oYX2BTf Mz9Fv+toBztKbKqioR2PyJcALxgOvEQfPP9E5B5hbKkNfcZnGEzACvLcY/R+K1fx/+Tg92wQewVwg 1oe8hDMgqxnwCJYVnW69j3z7ZyPdtUvexSyTKzXeMe/3bZxwA0ZKZbkc1uU/vKAGMzb1zrJWDjSEc LlAaRR7EdQsbbFpIHAACbt8C8L5KL1dTQL/AOAtq4hW8QNv63PWwQRFpQGza4JW8XkOxVraCsmidT qPZrr6yw==; Received: from sslproxy02.your-server.de ([78.47.166.47]) by www523.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256) (Exim 4.92.3) (envelope-from ) id 1oUvDn-000D2u-Sy; Sun, 04 Sep 2022 21:23:55 +0200 Received: from [2a02:908:1861:d6a0::6b5] (helo=fedora..) by sslproxy02.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oUvDn-000BaS-NL; Sun, 04 Sep 2022 21:23:55 +0200 From: Tim Lange To: dmalcolm@redhat.com Cc: gcc-patches@gcc.gnu.org, Tim Lange Subject: [PATCH 2/2 v2] analyzer: strcpy semantics Date: Sun, 4 Sep 2022 21:17:25 +0200 Message-Id: <20220904191725.81158-1-mail@tim-lange.me> X-Mailer: git-send-email 2.37.2 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Authenticated-Sender: mail@tim-lange.me X-Virus-Scanned: Clear (ClamAV 0.103.6/26648/Sun Sep 4 09:56:21 2022) X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_INFOUSMEBIZ,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Dave, sorry about the strncpy thing, I should've been more careful. Below is the patch with just the strcpy part. - Tim This patch adds modelling for the semantics of strcpy in the simple case where the analyzer is able to infer a concrete string size. Regrtested on Linux x86_64. 2022-09-04 Tim Lange gcc/analyzer/ChangeLog: * region-model-impl-calls.cc (region_model::impl_call_strcpy): Handle the constant string case. * region-model.cc (region_model::get_string_size): New function to get the string size from a region or svalue. * region-model.h (class region_model): Add get_string_size. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/out-of-bounds-4.c: New test. * gcc.dg/analyzer/strcpy-3.c: New test. --- gcc/analyzer/region-model-impl-calls.cc | 16 ++++- gcc/analyzer/region-model.cc | 29 +++++++++ gcc/analyzer/region-model.h | 3 + .../gcc.dg/analyzer/out-of-bounds-4.c | 65 +++++++++++++++++++ gcc/testsuite/gcc.dg/analyzer/strcpy-3.c | 23 +++++++ 5 files changed, 133 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-4.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/strcpy-3.c diff --git a/gcc/analyzer/region-model-impl-calls.cc b/gcc/analyzer/region-model-impl-calls.cc index 8eebd122d42..3790eaf2d79 100644 --- a/gcc/analyzer/region-model-impl-calls.cc +++ b/gcc/analyzer/region-model-impl-calls.cc @@ -1019,13 +1019,23 @@ region_model::impl_call_strcpy (const call_details &cd) const svalue *dest_sval = cd.get_arg_svalue (0); const region *dest_reg = deref_rvalue (dest_sval, cd.get_arg_tree (0), cd.get_ctxt ()); + const svalue *src_sval = cd.get_arg_svalue (1); + const region *src_reg = deref_rvalue (src_sval, cd.get_arg_tree (1), + cd.get_ctxt ()); + const svalue *src_contents_sval = get_store_value (src_reg, + cd.get_ctxt ()); cd.maybe_set_lhs (dest_sval); - check_region_for_write (dest_reg, cd.get_ctxt ()); + /* Try to get the string size if SRC_REG is a string_region. */ + const svalue *copied_bytes_sval = get_string_size (src_reg); + /* Otherwise, check if the contents of SRC_REG is a string. */ + if (copied_bytes_sval->get_kind () == SK_UNKNOWN) + copied_bytes_sval = get_string_size (src_contents_sval); - /* For now, just mark region's contents as unknown. */ - mark_region_as_unknown (dest_reg, cd.get_uncertainty ()); + const region *sized_dest_reg + = m_mgr->get_sized_region (dest_reg, NULL_TREE, copied_bytes_sval); + set_value (sized_dest_reg, src_contents_sval, cd.get_ctxt ()); } /* Handle the on_call_pre part of "strlen". */ diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index ec29be259b5..4ec18c86774 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -3218,6 +3218,35 @@ region_model::get_capacity (const region *reg) const return m_mgr->get_or_create_unknown_svalue (sizetype); } +/* Return the string size, including the 0-terminator, if SVAL is a + constant_svalue holding a string. Otherwise, return an unknown_svalue. */ + +const svalue * +region_model::get_string_size (const svalue *sval) const +{ + tree cst = sval->maybe_get_constant (); + if (!cst || TREE_CODE (cst) != STRING_CST) + return m_mgr->get_or_create_unknown_svalue (size_type_node); + + tree out = build_int_cst (size_type_node, TREE_STRING_LENGTH (cst)); + return m_mgr->get_or_create_constant_svalue (out); +} + +/* Return the string size, including the 0-terminator, if REG is a + string_region. Otherwise, return an unknown_svalue. */ + +const svalue * +region_model::get_string_size (const region *reg) const +{ + const string_region *str_reg = dyn_cast (reg); + if (!str_reg) + return m_mgr->get_or_create_unknown_svalue (size_type_node); + + tree cst = str_reg->get_string_cst (); + tree out = build_int_cst (size_type_node, TREE_STRING_LENGTH (cst)); + return m_mgr->get_or_create_constant_svalue (out); +} + /* If CTXT is non-NULL, use it to warn about any problems accessing REG, using DIR to determine if this access is a read or write. */ diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h index 7ce832f6ce4..a1f2165e145 100644 --- a/gcc/analyzer/region-model.h +++ b/gcc/analyzer/region-model.h @@ -793,6 +793,9 @@ class region_model const svalue *get_capacity (const region *reg) const; + const svalue *get_string_size (const svalue *sval) const; + const svalue *get_string_size (const region *reg) const; + /* Implemented in sm-malloc.cc */ void on_realloc_with_move (const call_details &cd, const svalue *old_ptr_sval, diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-4.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-4.c new file mode 100644 index 00000000000..46f600de658 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-4.c @@ -0,0 +1,65 @@ +/* { dg-additional-options "-Wno-stringop-overflow -Wno-stringop-truncation" } */ +#include + +/* Wanalyzer-out-of-bounds tests for strpy-related overflows. + + The intra-procedural tests are all caught by Wstringop-overflow. + The inter-procedural out-of-bounds are only found by the analyzer. */ + +void test1 (void) +{ + char dst[5]; + strcpy (dst, "Hello"); /* { dg-line test1 } */ + + /* { dg-warning "overflow" "warning" { target *-*-* } test1 } */ + /* { dg-message "dst" "note" { target *-*-* } test1 } */ +} + +void test2 (void) +{ + char dst[6]; + strcpy (dst, "Hello"); +} + +void test3 (void) +{ + char *src = "Hello"; + char dst[5]; + strcpy (dst, src); /* { dg-line test3 } */ + + /* { dg-warning "overflow" "warning" { target *-*-* } test3 } */ + /* { dg-message "dst" "note" { target *-*-* } test3 } */ +} + +void test4 (void) +{ + char *src = "Hello"; + char dst[6]; + strcpy (dst, src); +} + +const char *return_hello (void) +{ + return "hello"; +} + +void test5 (void) +{ + const char *str = return_hello (); + if (!str) + return; + char dst[5]; + strcpy (dst, str); /* { dg-line test5 } */ + + /* { dg-warning "overflow" "warning" { target *-*-* } test5 } */ + /* { dg-message "dst" "note" { target *-*-* } test5 } */ +} + +void test6 (void) +{ + const char *str = return_hello (); + if (!str) + return; + char dst[6]; + strcpy (dst, str); +} diff --git a/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c b/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c new file mode 100644 index 00000000000..a38f9a7641f --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/strcpy-3.c @@ -0,0 +1,23 @@ +#include +#include "analyzer-decls.h" + +void test_1 (void) +{ + char str[] = "Hello"; + char buf[6]; + char *result = strcpy (buf, str); + __analyzer_describe (1, result); /* { dg-warning "region_svalue.*?'buf'" } */ + __analyzer_eval (result == buf); /* { dg-warning "TRUE" } */ + __analyzer_eval (buf[0] == 'H'); /* { dg-warning "TRUE" } */ + __analyzer_eval (buf[1] == 'e'); /* { dg-warning "TRUE" } */ + __analyzer_eval (buf[2] == 'l'); /* { dg-warning "TRUE" } */ + __analyzer_eval (buf[3] == 'l'); /* { dg-warning "TRUE" } */ + __analyzer_eval (buf[4] == 'o'); /* { dg-warning "TRUE" } */ + __analyzer_eval (buf[5] == 0); /* { dg-warning "TRUE" } */ + __analyzer_eval (result[0] == 'H'); /* { dg-warning "TRUE" } */ + __analyzer_eval (result[1] == 'e'); /* { dg-warning "TRUE" } */ + __analyzer_eval (result[2] == 'l'); /* { dg-warning "TRUE" } */ + __analyzer_eval (result[3] == 'l'); /* { dg-warning "TRUE" } */ + __analyzer_eval (result[4] == 'o'); /* { dg-warning "TRUE" } */ + __analyzer_eval (result[5] == 0); /* { dg-warning "TRUE" } */ +} -- 2.37.2