public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug pch/56549] New: #pragma once ineffective with BOM in include file
@ 2013-03-06  9:47 yurivkhan at gmail dot com
  2013-03-06  9:52 ` [Bug pch/56549] " yurivkhan at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: yurivkhan at gmail dot com @ 2013-03-06  9:47 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

             Bug #: 56549
           Summary: #pragma once ineffective with BOM in include file
    Classification: Unclassified
           Product: gcc
           Version: 4.6.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: pch
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: yurivkhan@gmail.com


Created attachment 29594
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29594
Minimal example demonstrating the problem

When an include file starts with a UTF-8 byte order mark and is included in a
precompiled header, it is not protected against repeated inclusion.

Environment:
 * Ubuntu 12.04 amd64
 * GCC 4.6.3 from Ubuntu repo

To reproduce:

$ unzip pragma-once-bom.zip
$ make

The contents of the archive are listed below for convenience of discussion.

Foo.h:
===
<BOM>#pragma once

struct Foo {};
===

stable.h:
===
#include "Foo.h"
===

main.cpp:
===
#include "stable.h"

int main() { Foo foo; }
===

Makefile:
===
all: main.o

pragma-once.gch/c++: stable.h Foo.h
    @test -d pragma-once.gch || mkdir pragma-once.gch
    g++ -x c++-header -c stable.h -o pragma-once.gch/c++

main.o: main.cpp stable.h Foo.h pragma-once.gch/c++
    g++ -c -include pragma-once -o main.o main.cpp

clean:
    @rm -rf pragma-once.gch main.o
===

Expected behavior:

===
$ make
g++ -x c++-header -c stable.h -o pragma-once.gch/c++
g++ -c -include pragma-once -o main.o main.cpp
===

Observed behavior:

===
$ make
g++ -x c++-header -c stable.h -o pragma-once.gch/c++
g++ -c -include pragma-once -o main.o main.cpp
In file included from stable.h:1:0,
                 from main.cpp:1:
Foo.h:3:8: error: redefinition of ‘struct Foo’
Foo.h:3:8: error: previous definition of ‘struct Foo’
make: *** [main.o] Error 1
===

Workarounds/observations:

* If the header file does not contain a BOM, the problem does not occur.
* With include guards instead of #pragma once, the problem does not occur.
* In a real project, changing an #include <FooLib/Foo.h> to #include
<FooLib/./Foo.h> in the precompiled header stable.h, or changing an #include
<FooLib/Foo.h> to #include "Foo.h" in a different header included by stable.h,
also fixes the immediate problem.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug pch/56549] #pragma once ineffective with BOM in include file
  2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
@ 2013-03-06  9:52 ` yurivkhan at gmail dot com
  2013-09-25  5:22 ` woodroof at gmail dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: yurivkhan at gmail dot com @ 2013-03-06  9:52 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

--- Comment #1 from Yuri Khan <yurivkhan at gmail dot com> 2013-03-06 09:52:01 UTC ---
Also reproduced on 4.7.2.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug pch/56549] #pragma once ineffective with BOM in include file
  2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
  2013-03-06  9:52 ` [Bug pch/56549] " yurivkhan at gmail dot com
@ 2013-09-25  5:22 ` woodroof at gmail dot com
  2015-08-10 20:46 ` tanner at gmx dot net
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: woodroof at gmail dot com @ 2013-09-25  5:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

Danil Ilinykh <woodroof at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |woodroof at gmail dot com

--- Comment #2 from Danil Ilinykh <woodroof at gmail dot com> ---
*** Bug 49837 has been marked as a duplicate of this bug. ***


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug pch/56549] #pragma once ineffective with BOM in include file
  2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
  2013-03-06  9:52 ` [Bug pch/56549] " yurivkhan at gmail dot com
  2013-09-25  5:22 ` woodroof at gmail dot com
@ 2015-08-10 20:46 ` tanner at gmx dot net
  2020-06-21 17:34 ` xaizek at posteo dot net
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: tanner at gmx dot net @ 2015-08-10 20:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

Tom Tanner <tanner at gmx dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tanner at gmx dot net

--- Comment #3 from Tom Tanner <tanner at gmx dot net> ---
confirmed with g++ 4.2.1, 4.7.4, 4.8.5 and 4.9.3


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug pch/56549] #pragma once ineffective with BOM in include file
  2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
                   ` (2 preceding siblings ...)
  2015-08-10 20:46 ` tanner at gmx dot net
@ 2020-06-21 17:34 ` xaizek at posteo dot net
  2020-11-09 17:45 ` julien.ruffin at ivu dot de
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: xaizek at posteo dot net @ 2020-06-21 17:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

xaizek at posteo dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xaizek at posteo dot net

--- Comment #5 from xaizek at posteo dot net ---
The bug is still present on current master of GCC 11
(8ee2640bfdc62f835ec9740278f948034bc7d9f1).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug pch/56549] #pragma once ineffective with BOM in include file
  2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
                   ` (3 preceding siblings ...)
  2020-06-21 17:34 ` xaizek at posteo dot net
@ 2020-11-09 17:45 ` julien.ruffin at ivu dot de
  2020-11-30 12:51 ` jruffin at gmail dot com
  2022-10-06 12:13 ` [Bug preprocessor/56549] " Dmitriy.Poterukha at uvoteam dot com
  6 siblings, 0 replies; 8+ messages in thread
From: julien.ruffin at ivu dot de @ 2020-11-09 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

Julien Ruffin <julien.ruffin at ivu dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |julien.ruffin at ivu dot de

--- Comment #6 from Julien Ruffin <julien.ruffin at ivu dot de> ---
I have been having the same issue with GCC 9.2.0 for a while and ended up
finding the cause of this error. It can be traced back to function
_cpp_save_file_entries in gcc/libcpp/files.c.

Short explanation: the function saves the sizes and MD5 checksums of files
without any encoding conversion or BOM removal into the PCH's file list, even
though it should.

Long explanation: the function fills the PCH's files list which contains, among
other things, the sizes and MD5 checksums of all files in the PCH. Later, when
using the PCH, the compiler compares the files it loads with the files in that
list. If it finds an entry with the same size and checksum as the loaded file,
it is in the PCH and the compiler skips processing it: see
check_file_against_entries for the implementation, also in files.c.

The issue here is that the matching never succeeds for headers that contain a
BOM. The PCH entry is always 3 Bytes longer than the file loaded by the
compiler and the checksums always differ. The following code in
_cpp_save_file_entries is why:

      if (f->buffer_valid)
        md5_buffer ((const char *)f->buffer,
                    f->st.st_size, result->entries[count].sum);
      else
        {
          FILE *ff;
          int oldfd = f->fd;

          if (!open_file (f))
            {
              open_file_failed (pfile, f, 0, 0);
              free (result);
              return false;
            }
          ff = fdopen (f->fd, "rb");
          md5_stream (ff, result->entries[count].sum);
          fclose (ff);
          f->fd = oldfd;
        }
      result->entries[count].size = f->st.st_size;

libcpp caches the contents of the files it reads into their own buffers, here
f->buffer. The read_file function implements this loading and converts the
file's encoding on the fly with _cpp_convert_input. *This conversion strips the
BOM,* so the contents of f->buffer differ from those of the file whenever a BOM
is used.

If f->buffer_valid is not true, which seems to always be the case in the code
above as far as I could test it, the function reopens the file by hand and
computes the MD5 checksum directly from it, without any conversion. open_file()
also overwrites the data size in f->st.st_size with the size of the unconverted
file. That is why the checksum and size of the unconverted file end up in the
PCH's file list.

The compiler later compares those with the files it loads through read_files.
There never is a match because the checksums and sizes differ and the compiler
thinks it it has loaded a different file, so it processes the header with the
BOM a second time and the error we have been observing happens.

I have managed to solve this issue by replacing the manual loading of the
unconverted file in the else block above with a loading through read_file,
yielding the converted buffer and the correct size and, in the end, the correct
checksum. I do not have a patch to offer yet for various reasons but my
amateurish attempt at a fix made me able to build a large C++ code base
successfully with precompiled headers, so it is rather encouraging.

Somebody with more experience in the preprocessor might want to take a look at
this.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug pch/56549] #pragma once ineffective with BOM in include file
  2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
                   ` (4 preceding siblings ...)
  2020-11-09 17:45 ` julien.ruffin at ivu dot de
@ 2020-11-30 12:51 ` jruffin at gmail dot com
  2022-10-06 12:13 ` [Bug preprocessor/56549] " Dmitriy.Poterukha at uvoteam dot com
  6 siblings, 0 replies; 8+ messages in thread
From: jruffin at gmail dot com @ 2020-11-30 12:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

--- Comment #7 from Julien Ruffin <jruffin at gmail dot com> ---
Here is the patch for the current master. I have tested it on large C++ code
bases. So far, it builds successfully and significantly faster.

diff --git a/libcpp/files.c b/libcpp/files.c
index 301b2379a23..cbc2b0f4540 100644
--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -1978,25 +1978,28 @@ _cpp_save_file_entries (cpp_reader *pfile, FILE *fp)
       result->entries[count].once_only = f->once_only;
       /* |= is avoided in the next line because of an HP C compiler bug */
       result->have_once_only = result->have_once_only | f->once_only;
+
       if (f->buffer_valid)
-       md5_buffer ((const char *)f->buffer,
-                   f->st.st_size, result->entries[count].sum);
+        {
+          md5_buffer ((const char *)f->buffer,
+                      f->st.st_size, result->entries[count].sum);
+        }
       else
-       {
-         FILE *ff;
-         int oldfd = f->fd;
-
-         if (!open_file (f))
-           {
-             open_file_failed (pfile, f, 0, 0);
-             free (result);
-             return false;
-           }
-         ff = fdopen (f->fd, "rb");
-         md5_stream (ff, result->entries[count].sum);
-         fclose (ff);
-         f->fd = oldfd;
-       }
+        {
+          if (!read_file (pfile, f, 0))
+            {
+              return false;
+            }
+
+          md5_buffer ((const char *)f->buffer,
+                      f->st.st_size, result->entries[count].sum);
+
+          const void* to_free = f->buffer_start;
+          f->buffer_start = NULL;
+          f->buffer = NULL;
+          f->buffer_valid = false;
+          free ((void*) to_free);
+        }
       result->entries[count].size = f->st.st_size;
     }

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug preprocessor/56549] #pragma once ineffective with BOM in include file
  2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
                   ` (5 preceding siblings ...)
  2020-11-30 12:51 ` jruffin at gmail dot com
@ 2022-10-06 12:13 ` Dmitriy.Poterukha at uvoteam dot com
  6 siblings, 0 replies; 8+ messages in thread
From: Dmitriy.Poterukha at uvoteam dot com @ 2022-10-06 12:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56549

creeon <Dmitriy.Poterukha at uvoteam dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Dmitriy.Poterukha at uvoteam dot c
                   |                            |om

--- Comment #8 from creeon <Dmitriy.Poterukha at uvoteam dot com> ---
Here is the patch for the current master. I have tested it on large C++ code
bases. So far, it builds successfully and significantly faster.

diff --git a/libcpp/files.c b/libcpp/files.c
index 301b2379a23..cbc2b0f4540 https://goo.gl/2DqXGj 100644
--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -1978,25 +1978,28 @@ _cpp_save_file_entries (cpp_reader *pfile, FILE *fp)
       result->entries[count].once_only = f->once_only;
       /* |= is avoided in the next line because of an HP C compiler bug */
       result->have_once_only = result->have_once_only | f->once_only;
+
       if (f->buffer_valid)
-       md5_buffer ((const char *)f->buffer,
-                   f->st.st_size, result->entries[count].sum);
+        {
+          md5_buffer ((const char *)f->buffer,
+                      f->st.st_size, result->entries[count].sum);
+        }
       else
-       {
-         FILE *ff;
-         int oldfd = f->fd;
-
-         if (!open_file (f))
-           {
-             open_file_failed (pfile, f, 0, 0);
-             free (result);
-             return false;
-           }
-         ff = fdopen (f->fd, "rb");
-         md5_stream (ff, result->entries[count].sum);
-         fclose (ff);
-         f->fd = oldfd;
-       }
+        {
+          if (!read_file (pfile, f, 0))
+            {
+              return false;
+            }
+
+          md5_buffer ((const char *)f->buffer,
+                      f->st.st_size, result->entries[count].sum);
+
+          const void* to_free = f->buffer_start;
+          f->buffer_start = NULL;
+          f->buffer = NULL;
+          f->buffer_valid = false;
+          free ((void*) to_free);
+        }
       result->entries[count].size = f->st.st_size;
     }

Hi how are you fixed it?

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-10-06 12:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-06  9:47 [Bug pch/56549] New: #pragma once ineffective with BOM in include file yurivkhan at gmail dot com
2013-03-06  9:52 ` [Bug pch/56549] " yurivkhan at gmail dot com
2013-09-25  5:22 ` woodroof at gmail dot com
2015-08-10 20:46 ` tanner at gmx dot net
2020-06-21 17:34 ` xaizek at posteo dot net
2020-11-09 17:45 ` julien.ruffin at ivu dot de
2020-11-30 12:51 ` jruffin at gmail dot com
2022-10-06 12:13 ` [Bug preprocessor/56549] " Dmitriy.Poterukha at uvoteam dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).