From: Giuliano Procida
Date: Mon, 27 Apr 2020 16:37:47 +0100
Subject: Re: [PATCH v3 04/21] Escape names used in symbol whitelisting regex.
To: Matthias Maennich
Cc: libabigail@sourceware.org, Dodji Seketeli, kernel-team@android.com

Hi.

On Mon, 27 Apr 2020 at 12:14, Matthias Maennich wrote:
>
> On Fri, Apr 24, 2020 at 10:21:15AM +0100, Giuliano Procida wrote:
> >There is the theoretical possibility that symbols may contain special
> >regex characters like '.' and '$'. This patch ensures all such
> >characters in symbol names are escaped before they are added to the
> >whitelisting regex.
> >
> >	* include/regex.h (escape): New string reference holder
> >	class. (operator<<): Declaration of std::ostream,
> >	regex::escape overload.
> >	* include/regex.cc (operator<<): New std::ostream,
> >	regex::escape overload that outputs regex-escaped strings.
> >	* src/abg-tools-utils.cc
> >	(gen_suppr_spec_from_kernel_abi_whitelists): Make sure any
> >	special regex characters in symbol names are escaped.
> >
> >Signed-off-by: Giuliano Procida
> >---
> > include/abg-regex.h | 10 ++++++++++
> > src/abg-regex.cc    | 27 +++++++++++++++++++++++++--
> > 2 files changed, 35 insertions(+), 2 deletions(-)
> >
> >diff --git a/include/abg-regex.h b/include/abg-regex.h
> >index 2f638ef2..59976794 100644
> >--- a/include/abg-regex.h
> >+++ b/include/abg-regex.h
> >@@ -58,6 +58,16 @@ struct regex_t_deleter
> >   }
> > };//end struct regex_deleter
> >
> >+/// A class to hold a reference to a string to regex escape.
> >+struct escape
> >+{
> >+  escape(const std::string& str) : ref(str) { }
> >+  const std::string& ref;
> >+};
> >+
> >+std::ostream&
> >+operator<<(std::ostream& os, const escape& esc);
> >+
> > std::string
> > generate_from_strings(const std::vector<std::string>& strs);
> >
> >diff --git a/src/abg-regex.cc b/src/abg-regex.cc
> >index 79a89033..90e4d144 100644
> >--- a/src/abg-regex.cc
> >+++ b/src/abg-regex.cc
> >@@ -24,6 +24,7 @@
> > ///
> >
> > #include
> >+#include
>
> Sort.

Done.

> > #include "abg-sptr-utils.h"
> > #include "abg-regex.h"
> >
> >@@ -56,6 +57,28 @@ sptr_utils::build_sptr()
> > namespace regex
> > {
> >
> >+/// Escape regex special characters in input string.
> >+///
> >+/// @param os the output stream being written to.
> >+///
> >+/// @param esc the regex_escape object holding a reference to the string
> >+/// needing to be escaped.
> >+///
> >+/// @return the output stream.
> >+std::ostream&
> >+operator<<(std::ostream& os, const escape& esc)
> >+{
> >+  static const std::string specials = "^.[$()|*+?{\\";
>
> What about ']' and '}' ?

I stole my list from somewhere, possibly Wikipedia.

To answer your question: ']' and '}' are omitted because they are only
special when preceded by '[' and '{' respectively. However, that can be
confusing for humans, so I'll add them.

> Cheers,
> Matthias
>
> >+  const std::string str = esc.ref;
> >+  for (std::string::const_iterator i = str.begin(); i != str.end(); ++i)
> >+    {
> >+      if (specials.find(*i) != std::string::npos)
> >+        os << '\\';
> >+      os << *i;
> >+    }
> >+  return os;
> >+}
> >+
> > /// Generate a regex pattern equivalent to testing set membership.
> > ///
> > /// A string will match the resulting pattern regex, if and only if it
> >@@ -71,9 +94,9 @@ generate_from_strings(const std::vector<std::string>& strs)
> >     return "^_^";
> >   std::ostringstream os;
> >   std::vector<std::string>::const_iterator i = strs.begin();
> >-  os << "^(" << *i++;
> >+  os << "^(" << escape(*i++);
> >   while (i != strs.end())
> >-    os << "|" << *i++;
> >+    os << "|" << escape(*i++);
> >   os << ")$";
> >   return os.str();
> > }
> >--
> >2.26.2.303.gf8c07b1a785-goog
> >
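
For reference, here is a minimal standalone sketch of the escaping discussed above, with ']' and '}' added to the specials list as agreed in the reply. It is not the libabigail code: the names escape_regex and set_membership_pattern are hypothetical, and it returns strings rather than using the stream operator from the patch. The "^_^" return value for an empty input is copied from the quoted diff context.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the patch's regex::escape helper (not the
// libabigail API). The specials list is the one from the patch with
// ']' and '}' added, as proposed in the review reply.
std::string
escape_regex(const std::string& str)
{
  static const std::string specials = "^.[]$()|*+?{}\\";
  std::string out;
  for (std::string::const_iterator i = str.begin(); i != str.end(); ++i)
    {
      // Prefix every special character with a backslash.
      if (specials.find(*i) != std::string::npos)
        out += '\\';
      out += *i;
    }
  return out;
}

// Hypothetical counterpart of generate_from_strings: build an anchored
// alternation "^(a|b|c)$" from the escaped input strings.
std::string
set_membership_pattern(const std::vector<std::string>& strs)
{
  // Same value the quoted code returns for an empty input.
  if (strs.empty())
    return "^_^";
  std::ostringstream os;
  std::vector<std::string>::const_iterator i = strs.begin();
  os << "^(" << escape_regex(*i++);
  while (i != strs.end())
    os << "|" << escape_regex(*i++);
  os << ")$";
  return os.str();
}
```

With this sketch, a symbol named a.b contributes a\.b to the pattern, so the '.' is treated as a literal character rather than as a wildcard.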