public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug analyzer/112965] New: Valgrind error on gcc.dg/analyzer/fd-dup-1.c
@ 2023-12-11 18:29 jakub at gcc dot gnu.org
  2023-12-11 18:30 ` [Bug analyzer/112965] " jakub at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-12-11 18:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112965

            Bug ID: 112965
           Summary: Valgrind error on gcc.dg/analyzer/fd-dup-1.c
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: analyzer
          Assignee: dmalcolm at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

With valgrind checking I'm seeing
Executing on host: /home/jakub/src/gcc/obj88/gcc/xgcc
-B/home/jakub/src/gcc/obj88/gcc/ 
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c   
-fdiagnostics-plain-output   -
fanalyzer -Wanalyzer-too-complex -Wanalyzer-symbol-too-complex
-fanalyzer-call-summaries -S -o fd-dup-1.s    (timeout = 300)
spawn -ignore SIGHUP /home/jakub/src/gcc/obj88/gcc/xgcc
-B/home/jakub/src/gcc/obj88/gcc/
/home/jakub/src/gcc/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
-fdiagnostics-plain-output -fana
lyzer -Wanalyzer-too-complex -Wanalyzer-symbol-too-complex
-fanalyzer-call-summaries -S -o fd-dup-1.s
==395421== Conditional jump or move depends on uninitialised value(s)
==395421==    at 0x1DC5D3E: search_line_sse42(unsigned char const*, unsigned
char const*) (lex.cc:467)
==395421==    by 0x1DC6944: _cpp_clean_line (lex.cc:960)
==395421==    by 0x1DC6DD2: bool get_fresh_line_impl<false>(cpp_reader*)
(lex.cc:3747)
==395421==    by 0x1DCB6FC: _cpp_get_fresh_line (lex.cc:3785)
==395421==    by 0x1DCB6FC: _cpp_lex_direct (lex.cc:3838)
==395421==    by 0x1DCD428: _cpp_lex_token (lex.cc:3670)
==395421==    by 0x1DD3A97: cpp_get_token_1(cpp_reader*, unsigned int*)
(macro.cc:2936)
==395421==    by 0x7DD20A: get_token (c-lex.cc:311)
==395421==    by 0x7DD20A: c_lex_with_flags(tree_node**, unsigned int*,
unsigned char*, int) (c-lex.cc:552)
==395421==    by 0x785BFA: consider_macro (c-parser.cc:1854)
==395421==    by 0x785BFA:
ana::c_translation_unit::lookup_constant_by_id(tree_node*) const
(c-parser.cc:1789)
==395421==    by 0x10540A5: ana::maybe_stash_named_constant(ana::logger*,
ana::translation_unit const&, char const*) (analyzer-language.cc:73)
==395421==    by 0x105449E: stash_named_constants (analyzer-language.cc:96)
==395421==    by 0x105449E:
ana::on_finish_translation_unit(ana::translation_unit const&)
(analyzer-language.cc:124)
==395421==    by 0x785442: c_parser_translation_unit (c-parser.cc:1935)
==395421==    by 0x785442: c_parse_file() (c-parser.cc:26713)
==395421==    by 0x7F6331: c_common_parse_file() (c-opts.cc:1301)
==395421== 
==395421== Use of uninitialised value of size 8
==395421==    at 0x1DC6948: _cpp_clean_line (lex.cc:962)
==395421==    by 0x1DC6DD2: bool get_fresh_line_impl<false>(cpp_reader*)
(lex.cc:3747)
==395421==    by 0x1DCB6FC: _cpp_get_fresh_line (lex.cc:3785)
==395421==    by 0x1DCB6FC: _cpp_lex_direct (lex.cc:3838)
==395421==    by 0x1DCD428: _cpp_lex_token (lex.cc:3670)
==395421==    by 0x1DD3A97: cpp_get_token_1(cpp_reader*, unsigned int*)
(macro.cc:2936)
==395421==    by 0x7DD20A: get_token (c-lex.cc:311)
==395421==    by 0x7DD20A: c_lex_with_flags(tree_node**, unsigned int*,
unsigned char*, int) (c-lex.cc:552)
==395421==    by 0x785BFA: consider_macro (c-parser.cc:1854)
==395421==    by 0x785BFA:
ana::c_translation_unit::lookup_constant_by_id(tree_node*) const
(c-parser.cc:1789)
==395421==    by 0x10540A5: ana::maybe_stash_named_constant(ana::logger*,
ana::translation_unit const&, char const*) (analyzer-language.cc:73)
==395421==    by 0x105449E: stash_named_constants (analyzer-language.cc:96)
==395421==    by 0x105449E:
ana::on_finish_translation_unit(ana::translation_unit const&)
(analyzer-language.cc:124)
==395421==    by 0x785442: c_parser_translation_unit (c-parser.cc:1935)
==395421==    by 0x785442: c_parse_file() (c-parser.cc:26713)
==395421==    by 0x7F6331: c_common_parse_file() (c-opts.cc:1301)
==395421==    by 0xCFF87D: compile_file() (toplev.cc:446)
I vaguely remember the buffers for libcpp need to be aligned at the end so that
the lex.cc fastpath can read it 8 bytes at a time, but I coiuld be wrong.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug analyzer/112965] Valgrind error on gcc.dg/analyzer/fd-dup-1.c
  2023-12-11 18:29 [Bug analyzer/112965] New: Valgrind error on gcc.dg/analyzer/fd-dup-1.c jakub at gcc dot gnu.org
@ 2023-12-11 18:30 ` jakub at gcc dot gnu.org
  2023-12-12 22:14 ` dmalcolm at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-12-11 18:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112965

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
gcc.dg/analyzer/fd-leak-pr108252.c, gcc.dg/analyzer/fd-access-mode-macros.c,
gcc.dg/analyzer/fd-4.c, gcc.dg/analyzer/named-constants-Wunused-macros.c etc.
fail the same.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug analyzer/112965] Valgrind error on gcc.dg/analyzer/fd-dup-1.c
  2023-12-11 18:29 [Bug analyzer/112965] New: Valgrind error on gcc.dg/analyzer/fd-dup-1.c jakub at gcc dot gnu.org
  2023-12-11 18:30 ` [Bug analyzer/112965] " jakub at gcc dot gnu.org
@ 2023-12-12 22:14 ` dmalcolm at gcc dot gnu.org
  2023-12-12 22:15 ` dmalcolm at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: dmalcolm at gcc dot gnu.org @ 2023-12-12 22:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112965

--- Comment #2 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
In c-parser.cc's consider_macro:
1843        pretty_printer pp;
1844        pp_string (&pp, (const char *) tok.val.str.text);
1845        pp_newline (&pp);
1846        cpp_push_buffer (parse_in,
1847                         (const unsigned char *) pp_formatted_text (&pp),
1848                         strlen (pp_formatted_text (&pp)),
1849                         0);

(gdb) p pp_formatted_text (&pp)
$10 = 0x5782360 "3\n"

(gdb) p (size_t)strlen(pp_formatted_text (&pp))
$11 = 2

So we have an aligned buffer, but it's only 2 bytes long aka 3 bytes in size
i.e.:
  ['3', '\n', '\0']

Looking at lex.cc's search_line_sse42:

424       uintptr_t si = (uintptr_t)s;
425       uintptr_t index;
426     
427       /* Check for unaligned input.  */
428       if (si & 15)
429         {
430           v16qi sv;
431     
432           if (__builtin_expect (end - s < 16, 0)
433               && __builtin_expect ((si & 0xfff) > 0xff0, 0))
434             {
435               /* There are less than 16 bytes left in the buffer, and less
436                  than 16 bytes left on the page.  Reading 16 bytes at this
437                  point might generate a spurious page fault.  Defer to the
438                  SSE2 implementation, which already handles alignment.  */
439               return search_line_sse2 (s, end);
440             }

we skip the block starting at line 429, since it's aligned:

(gdb) p ((uintptr_t)s) & 15
$14 = 0

but the length is so short that we presumably don't want to read 16 bytes at a
time:

(gdb) p end - s
$15 = 2

Not sure if this is a false positive, or a bug in search_line_sse42 when
dealing with very short aligned buffers.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug analyzer/112965] Valgrind error on gcc.dg/analyzer/fd-dup-1.c
  2023-12-11 18:29 [Bug analyzer/112965] New: Valgrind error on gcc.dg/analyzer/fd-dup-1.c jakub at gcc dot gnu.org
  2023-12-11 18:30 ` [Bug analyzer/112965] " jakub at gcc dot gnu.org
  2023-12-12 22:14 ` dmalcolm at gcc dot gnu.org
@ 2023-12-12 22:15 ` dmalcolm at gcc dot gnu.org
  2023-12-12 22:41 ` dmalcolm at gcc dot gnu.org
  2023-12-12 22:59 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: dmalcolm at gcc dot gnu.org @ 2023-12-12 22:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112965

--- Comment #3 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
A workaround might be to pad pp's buffer with trailing zero bytes up to a
multiple of 16.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug analyzer/112965] Valgrind error on gcc.dg/analyzer/fd-dup-1.c
  2023-12-11 18:29 [Bug analyzer/112965] New: Valgrind error on gcc.dg/analyzer/fd-dup-1.c jakub at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-12-12 22:15 ` dmalcolm at gcc dot gnu.org
@ 2023-12-12 22:41 ` dmalcolm at gcc dot gnu.org
  2023-12-12 22:59 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: dmalcolm at gcc dot gnu.org @ 2023-12-12 22:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112965

--- Comment #4 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
(In reply to David Malcolm from comment #3)
> A workaround might be to pad pp's buffer with trailing zero bytes up to a
> multiple of 16.

The following hack "fixes" it (for some definition of "fix"):

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 5700ccccc493..4858fec0dc3e 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1843,6 +1843,11 @@ private:
     pretty_printer pp;
     pp_string (&pp, (const char *) tok.val.str.text);
     pp_newline (&pp);
+
+    // FIXME: workaround for PR 112965
+    for (int i = 0; i < 16; i++)
+      pp_character (&pp, '\0');
+
     cpp_push_buffer (parse_in,
                     (const unsigned char *) pp_formatted_text (&pp),
                     strlen (pp_formatted_text (&pp)),

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug analyzer/112965] Valgrind error on gcc.dg/analyzer/fd-dup-1.c
  2023-12-11 18:29 [Bug analyzer/112965] New: Valgrind error on gcc.dg/analyzer/fd-dup-1.c jakub at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-12-12 22:41 ` dmalcolm at gcc dot gnu.org
@ 2023-12-12 22:59 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-12-12 22:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112965

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
It could be done just for #ifdef ENABLE_VALGRIND_ANNOTATIONS
Perhaps { const char buf[16] = {}; pp_append_text (pp, buf, buf + 16); } ?
Anyway, I'm really curious why it works for buffers in libcpp.
Because new_buff only aligns length to DEFAULT_ALIGNMENT, which is alignment of
double and pointers; 8 bytes on x86-64, but just 4 bytes on i686.
Obviously, when not under valgrind I think it should be ok in any case, even
when reading random uninitialized bytes after the allocation it shouldn't
change anything on the outcome of the search, and because e.g. the sse4.2
version for < 16 bytes misaligned at the end of page defers to sse2 version and
otherwise does one unaligned load and all the further ones are aligned, so it
should never cross the end of page.
So it is just about valgrind not understanding that the uninitialized bytes
after newline don't matter.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-12-12 22:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-11 18:29 [Bug analyzer/112965] New: Valgrind error on gcc.dg/analyzer/fd-dup-1.c jakub at gcc dot gnu.org
2023-12-11 18:30 ` [Bug analyzer/112965] " jakub at gcc dot gnu.org
2023-12-12 22:14 ` dmalcolm at gcc dot gnu.org
2023-12-12 22:15 ` dmalcolm at gcc dot gnu.org
2023-12-12 22:41 ` dmalcolm at gcc dot gnu.org
2023-12-12 22:59 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).