From: "serhei at serhei dot io" <sourceware-bugzilla@sourceware.org>
To: systemtap@sourceware.org
Subject: [Bug translator/30395] Regex code has invalid memory reads caught by KASAN
Date: Fri, 05 May 2023 16:12:34 +0000
Message-ID: <bug-30395-6586-FC0ZgTDAnJ@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-30395-6586@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=30395
--- Comment #6 from Serhei Makarov <serhei at serhei dot io> ---
There's a simple fix that I think will work, but I'll need to add a bit of code
to double-check/guard against the state 'to' doing anything except exiting on a
NUL. This shouldn't happen -- essentially, the below tweak feeds the DFA an
unending sequence of NULs, which it should terminate on soon enough. (The extra
state transition is needed because of the rather fiddly TNFA bookkeeping I
added in 2017 to handle capture groups.)
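
The guard mentioned above could look roughly like the following. This is only a
sketch on toy types: `toy_state`, `nul_target` and `nul_targets_exit_on_nul`
are hypothetical names for illustration, not SystemTap's `state`/`span`
structures. It checks that any state reached by a NUL transition itself exits
when it sees another NUL, which is the property the tweak relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a DFA state (hypothetical, not SystemTap's `state`).
struct toy_state {
  // What happens when this state reads a NUL:
  //   -1  -> the matcher exits with a final answer;
  //   >=0 -> index of the state jumped to (the extra transition).
  int nul_target;
};

// Returns true if every state reached through a NUL transition exits
// as soon as it is fed one more NUL.
bool nul_targets_exit_on_nul(const std::vector<toy_state> &states)
{
  for (const toy_state &s : states) {
    if (s.nul_target < 0)
      continue; // exits directly on NUL, nothing to check
    const toy_state &to = states.at(static_cast<std::size_t>(s.nul_target));
    if (to.nul_target >= 0)
      return false; // 'to' would keep consuming input after the NUL
  }
  return true;
}
```

In the real translator a check like this would presumably run at code
generation time, so a violation is reported before any matcher code is emitted.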
Note that uncommenting STAPREGEX_DEBUG_DFA in stapregex-dfa.cxx will produce a
trace of visited states. I can clearly see how far it goes beyond the NUL when
matching against a statically allocated string :(
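
The over-read and the effect of rewinding the cursor can be reproduced in
miniature. The sketch below is a hand-written toy matcher for the regex "ab",
not the code stapregex actually generates; `run_toy_dfa`, `run_result` and the
state numbering are assumptions made up for this example. It records the
highest buffer index it reads, so the over-read is visible without KASAN.

```cpp
#include <cassert>
#include <cstddef>

// Outcome of one run: whether the input matched, and the highest buffer
// index that was ever read.
struct run_result {
  bool matched;
  std::size_t max_index_read;
};

// Toy DFA for "ab" (hypothetical). State 2 sees the terminating NUL and then
// takes one extra transition to state 3, mimicking the extra state transition
// described above. If rewind_on_nul is set, the cursor is stepped back first,
// like the emitted "YYCURSOR--;", so state 3 re-reads the same NUL.
run_result run_toy_dfa(const char *buf, bool rewind_on_nul)
{
  const char *cursor = buf; // plays the role of YYCURSOR
  std::size_t max_read = 0;
  int state = 0;
  for (;;) {
    std::size_t idx = static_cast<std::size_t>(cursor - buf);
    if (idx > max_read)
      max_read = idx;
    char c = *cursor++;
    switch (state) {
    case 0:
      if (c != 'a') return {false, max_read};
      state = 1;
      break;
    case 1:
      if (c != 'b') return {false, max_read};
      state = 2;
      break;
    case 2:
      if (c != '\0') return {false, max_read};
      if (rewind_on_nul)
        cursor--; // next state will encounter a repeated NUL
      state = 3;
      break;
    case 3:
      // Final state: expected to exit on NUL.
      return {c == '\0', max_read};
    }
  }
}
```

Run against a four-byte buffer holding 'a', 'b', NUL and one padding byte, the
call without the rewind reads the padding byte at index 3, one past the
terminator, while the call with the rewind never reads beyond index 2. The
padding byte is there only to keep the toy over-read inside the buffer.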
diff --git a/stapregex-dfa.cxx b/stapregex-dfa.cxx
index 3601b28dd..cae8e2494 100644
--- a/stapregex-dfa.cxx
+++ b/stapregex-dfa.cxx
@@ -1020,7 +1020,7 @@ span::emit_jump (translator_output *o, const dfa *d) const
if (to->accepts)
{
- emit_final(o, d);
+ emit_final(o, d, false /*saw_nul*/);
return;
}
@@ -1033,7 +1033,7 @@ span::emit_jump (translator_output *o, const dfa *d) const
/* Assuming the target DFA state of the span is a final state, emit code to
cleanup tags and (if appropriate) exit with a final answer. */
void
-span::emit_final (translator_output *o, const dfa *d) const
+span::emit_final (translator_output *o, const dfa *d, bool saw_nul) const
{
assert (to->accepts); // XXX: must guarantee correct usage of emit_final()
@@ -1087,6 +1087,11 @@ span::emit_final (translator_output *o, const dfa *d) const
o->indent(-1);
o->newline() << "}";
+ if (saw_nul)
+ {
+ o->newline () << "/* XXX PROBLEM TRANSITION XXX */"; /* DEBUG */
+ o->newline () << "YYCURSOR--;"; /* SUGGESTED FIX: the next state will encounter a repeated NUL */
+ }
o->newline () << "goto yystate" << to->label << ";";
}
}
@@ -1119,10 +1124,11 @@ state::emit (translator_output *o, const dfa *d) const
if (it->lb == '\0')
{
o->newline() << "case " << c_char('\0') << ":";
- it->emit_final(o, d);
+ it->emit_final(o, d, true /* saw_nul */);
}
// Emit labels to handle all the other elements of the span:
diff --git a/stapregex-dfa.h b/stapregex-dfa.h
index c9a398fd7..065e1fe41 100644
--- a/stapregex-dfa.h
+++ b/stapregex-dfa.h
@@ -103,7 +103,7 @@ struct span {
state_kernel *reach_pairs; // -- starting point for te_closure computation
void emit_jump (translator_output *o, const dfa *d) const;
- void emit_final (translator_output *o, const dfa *d) const;
+ void emit_final (translator_output *o, const dfa *d, bool saw_nul) const;
};
struct state {
--
You are receiving this mail because:
You are the assignee for the bug.