* [PATCH] Allocate extra 16 bytes for -fsanitize=address @ 2012-11-23 17:23 H.J. Lu 2012-11-23 17:39 ` Jakub Jelinek 0 siblings, 1 reply; 14+ messages in thread From: H.J. Lu @ 2012-11-23 17:23 UTC (permalink / raw) To: gcc-patches Hi, This patch allocates extra 16 bytes for -fsanitize=address so that asan won't report read beyond memory buffer. It is used by bootstrap-asan. OK to install? Thanks. H.J. --- 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> PR bootstrap/55380 * charset.c (_cpp_convert_input): Allocate extra 16 bytes for -fsanitize=address if __SANITIZE_ADDRESS__ is defined. diff --git a/libcpp/charset.c b/libcpp/charset.c index cba19a6..dea8bb1 100644 --- a/libcpp/charset.c +++ b/libcpp/charset.c @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, iconv_close (input_cset.cd); /* Resize buffer if we allocated substantially too much, or if we - haven't enough space for the \n-terminator. */ + haven't enough space for the \n-terminator. Allocate extra 16 + bytes for -fsanitize=address. */ if (to.len + 4096 < to.asize || to.len >= to.asize) - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); + { +#ifdef __SANITIZE_ADDRESS__ + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); +#else + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); +#endif + } /* If the file is using old-school Mac line endings (\r only), terminate with another \r, not an \n, so that we do not mistake -- 1.7.11.7 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 17:23 [PATCH] Allocate extra 16 bytes for -fsanitize=address H.J. Lu @ 2012-11-23 17:39 ` Jakub Jelinek 2012-11-23 18:08 ` H.J. Lu 0 siblings, 1 reply; 14+ messages in thread From: Jakub Jelinek @ 2012-11-23 17:39 UTC (permalink / raw) To: H.J. Lu, Tom Tromey; +Cc: gcc-patches On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: > This patch allocates extra 16 bytes for -fsanitize=address so that > asan won't report read beyond memory buffer. It is used by > bootstrap-asan. OK to install? As valgrind warns about that too, I'd say we should do that unconditionally, the additional 16-bytes just aren't a big deal. But, _cpp_convert_input isn't the only place which needs that, IMHO you want to also change the caller, read_file_guts, where it does buf = XNEWVEC (uchar, size + 1); and buf = XRESIZEVEC (uchar, buf, size + 1); I'll defer the review to Tom though. > 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> > > PR bootstrap/55380 > * charset.c (_cpp_convert_input): Allocate extra 16 bytes for > -fsanitize=address if __SANITIZE_ADDRESS__ is defined. > > diff --git a/libcpp/charset.c b/libcpp/charset.c > index cba19a6..dea8bb1 100644 > --- a/libcpp/charset.c > +++ b/libcpp/charset.c > @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, > iconv_close (input_cset.cd); > > /* Resize buffer if we allocated substantially too much, or if we > - haven't enough space for the \n-terminator. */ > + haven't enough space for the \n-terminator. Allocate extra 16 > + bytes for -fsanitize=address. */ > if (to.len + 4096 < to.asize || to.len >= to.asize) > - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > + { > +#ifdef __SANITIZE_ADDRESS__ > + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); > +#else > + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > +#endif > + } > > /* If the file is using old-school Mac line endings (\r only), > terminate with another \r, not an \n, so that we do not mistake Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 17:39 ` Jakub Jelinek @ 2012-11-23 18:08 ` H.J. Lu 2012-11-23 18:12 ` Jakub Jelinek ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: H.J. Lu @ 2012-11-23 18:08 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Tom Tromey, gcc-patches On Fri, Nov 23, 2012 at 9:38 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: >> This patch allocates extra 16 bytes for -fsanitize=address so that >> asan won't report read beyond memory buffer. It is used by >> bootstrap-asan. OK to install? > > As valgrind warns about that too, I'd say we should do that unconditionally, > the additional 16-bytes just aren't a big deal. This isn't sufficient for valgrind since valgrind will also report reading uninitialized data, which requires additional memset on extra 16 bytes. > But, _cpp_convert_input isn't the only place which needs that, IMHO you want This patch is sufficient to compile libcpp with -fsanitize=address. > to also change the caller, read_file_guts, where it does > buf = XNEWVEC (uchar, size + 1); > and > buf = XRESIZEVEC (uchar, buf, size + 1); I don't think it is necessary since there is no real data in those extra 16 bytes. They can have garbage and libcpp still functions correctly. They are purely used to silence ASAN. > I'll defer the review to Tom though. > >> 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> >> >> PR bootstrap/55380 >> * charset.c (_cpp_convert_input): Allocate extra 16 bytes for >> -fsanitize=address if __SANITIZE_ADDRESS__ is defined. >> >> diff --git a/libcpp/charset.c b/libcpp/charset.c >> index cba19a6..dea8bb1 100644 >> --- a/libcpp/charset.c >> +++ b/libcpp/charset.c >> @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, >> iconv_close (input_cset.cd); >> >> /* Resize buffer if we allocated substantially too much, or if we >> - haven't enough space for the \n-terminator. */ >> + haven't enough space for the \n-terminator. Allocate extra 16 >> + bytes for -fsanitize=address. */ >> if (to.len + 4096 < to.asize || to.len >= to.asize) >> - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >> + { >> +#ifdef __SANITIZE_ADDRESS__ >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); >> +#else >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >> +#endif >> + } >> >> /* If the file is using old-school Mac line endings (\r only), >> terminate with another \r, not an \n, so that we do not mistake > > Jakub -- H.J. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 18:08 ` H.J. Lu @ 2012-11-23 18:12 ` Jakub Jelinek 2012-11-23 18:17 ` H.J. Lu 2012-11-23 18:12 ` Konstantin Serebryany 2012-11-24 10:52 ` Hans-Peter Nilsson 2 siblings, 1 reply; 14+ messages in thread From: Jakub Jelinek @ 2012-11-23 18:12 UTC (permalink / raw) To: H.J. Lu; +Cc: Tom Tromey, gcc-patches On Fri, Nov 23, 2012 at 10:08:11AM -0800, H.J. Lu wrote: > > to also change the caller, read_file_guts, where it does > > buf = XNEWVEC (uchar, size + 1); > > and > > buf = XRESIZEVEC (uchar, buf, size + 1); > > I don't think it is necessary since there is no real data in > those extra 16 bytes. They can have garbage and libcpp still > functions correctly. They are purely used to silence ASAN. The thing is, if the buf from the caller has such size/total combination that if (to.len + 4096 < to.asize || to.len >= to.asize) isn't true, there won't be any reallocation and the buffer passed in from the caller will be used instead. And, if it doesn't have those extra 16 bytes, it will still result in asan warning... Guess you need to read file from stdin instead of a file for that to trigger that. For valgrind I bet just clearing those 16 bytes might still be cheap enough not to worry about it. > > I'll defer the review to Tom though. > > > >> 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> > >> > >> PR bootstrap/55380 > >> * charset.c (_cpp_convert_input): Allocate extra 16 bytes for > >> -fsanitize=address if __SANITIZE_ADDRESS__ is defined. > >> > >> diff --git a/libcpp/charset.c b/libcpp/charset.c > >> index cba19a6..dea8bb1 100644 > >> --- a/libcpp/charset.c > >> +++ b/libcpp/charset.c > >> @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, > >> iconv_close (input_cset.cd); > >> > >> /* Resize buffer if we allocated substantially too much, or if we > >> - haven't enough space for the \n-terminator. */ > >> + haven't enough space for the \n-terminator. Allocate extra 16 > >> + bytes for -fsanitize=address. */ > >> if (to.len + 4096 < to.asize || to.len >= to.asize) > >> - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > >> + { > >> +#ifdef __SANITIZE_ADDRESS__ > >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); > >> +#else > >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > >> +#endif > >> + } > >> > >> /* If the file is using old-school Mac line endings (\r only), > >> terminate with another \r, not an \n, so that we do not mistake Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 18:12 ` Jakub Jelinek @ 2012-11-23 18:17 ` H.J. Lu 0 siblings, 0 replies; 14+ messages in thread From: H.J. Lu @ 2012-11-23 18:17 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Tom Tromey, gcc-patches On Fri, Nov 23, 2012 at 10:12 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Nov 23, 2012 at 10:08:11AM -0800, H.J. Lu wrote: >> > to also change the caller, read_file_guts, where it does >> > buf = XNEWVEC (uchar, size + 1); >> > and >> > buf = XRESIZEVEC (uchar, buf, size + 1); >> >> I don't think it is necessary since there is no real data in >> those extra 16 bytes. They can have garbage and libcpp still >> functions correctly. They are purely used to silence ASAN. > > The thing is, if the buf from the caller has such size/total > combination that if (to.len + 4096 < to.asize || to.len >= to.asize) > isn't true, there won't be any reallocation and the buffer passed > in from the caller will be used instead. And, if it doesn't have those > extra 16 bytes, it will still result in asan warning... > Guess you need to read file from stdin instead of a file for that > to trigger that. I see. I will double check and update my patch. > For valgrind I bet just clearing those 16 bytes might still be cheap enough > not to worry about it. This is what I did for valgrind: http://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=libcpp/charset.c;h=16a37c36f0cbf7a69c93ae7acb5d79893d57b643;hp=cba19a67178c796dcef1c8c70ac5c43dcbc69071;hb=8ab0e2cd4ae0fae01d0b84d6ccc382acb81ab876;hpb=e5e0d29f8b8ccff799a26fc3e6435af8dbf358fc /* Resize buffer if we allocated substantially too much, or if we - haven't enough space for the \n-terminator. */ + haven't enough space for the \n-terminator. Allocate and + initialize extra 16 bytes for --enable-checking=valgrind. */ if (to.len + 4096 < to.asize || to.len >= to.asize) - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); + { +#ifdef ENABLE_VALGRIND_CHECKING + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); + memset (to.text + to.len + 1, 0, 16); +#else + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); +#endif + } I will check if I need to update it for stdin. Thanks. -- H.J. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 18:08 ` H.J. Lu 2012-11-23 18:12 ` Jakub Jelinek @ 2012-11-23 18:12 ` Konstantin Serebryany 2012-11-24 10:52 ` Hans-Peter Nilsson 2 siblings, 0 replies; 14+ messages in thread From: Konstantin Serebryany @ 2012-11-23 18:12 UTC (permalink / raw) To: H.J. Lu; +Cc: Jakub Jelinek, Tom Tromey, gcc-patches On Fri, Nov 23, 2012 at 10:08 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Fri, Nov 23, 2012 at 9:38 AM, Jakub Jelinek <jakub@redhat.com> wrote: >> On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: >>> This patch allocates extra 16 bytes for -fsanitize=address so that >>> asan won't report read beyond memory buffer. It is used by >>> bootstrap-asan. OK to install? >> >> As valgrind warns about that too, I'd say we should do that unconditionally, >> the additional 16-bytes just aren't a big deal. > > This isn't sufficient for valgrind since valgrind will also report > reading uninitialized data, which requires additional memset > on extra 16 bytes. Valgrind should not report reading uninitialized data if that data is actually not used later (extra bytes are cleared by AND-ing with zeroes). If the code was mine, I would simply use (to.len + 17) w/o any conditionals. --kcc > >> But, _cpp_convert_input isn't the only place which needs that, IMHO you want > > This patch is sufficient to compile libcpp with -fsanitize=address. > >> to also change the caller, read_file_guts, where it does >> buf = XNEWVEC (uchar, size + 1); >> and >> buf = XRESIZEVEC (uchar, buf, size + 1); > > I don't think it is necessary since there is no real data in > those extra 16 bytes. They can have garbage and libcpp still > functions correctly. They are purely used to silence ASAN. > >> I'll defer the review to Tom though. >> >>> 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> >>> >>> PR bootstrap/55380 >>> * charset.c (_cpp_convert_input): Allocate extra 16 bytes for >>> -fsanitize=address if __SANITIZE_ADDRESS__ is defined. >>> >>> diff --git a/libcpp/charset.c b/libcpp/charset.c >>> index cba19a6..dea8bb1 100644 >>> --- a/libcpp/charset.c >>> +++ b/libcpp/charset.c >>> @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, >>> iconv_close (input_cset.cd); >>> >>> /* Resize buffer if we allocated substantially too much, or if we >>> - haven't enough space for the \n-terminator. */ >>> + haven't enough space for the \n-terminator. Allocate extra 16 >>> + bytes for -fsanitize=address. */ >>> if (to.len + 4096 < to.asize || to.len >= to.asize) >>> - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >>> + { >>> +#ifdef __SANITIZE_ADDRESS__ >>> + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); >>> +#else >>> + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >>> +#endif >>> + } >>> >>> /* If the file is using old-school Mac line endings (\r only), >>> terminate with another \r, not an \n, so that we do not mistake >> >> Jakub > > > > -- > H.J. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 18:08 ` H.J. Lu 2012-11-23 18:12 ` Jakub Jelinek 2012-11-23 18:12 ` Konstantin Serebryany @ 2012-11-24 10:52 ` Hans-Peter Nilsson 2 siblings, 0 replies; 14+ messages in thread From: Hans-Peter Nilsson @ 2012-11-24 10:52 UTC (permalink / raw) To: H.J. Lu; +Cc: Jakub Jelinek, Tom Tromey, gcc-patches On Fri, 23 Nov 2012, H.J. Lu wrote: > On Fri, Nov 23, 2012 at 9:38 AM, Jakub Jelinek <jakub@redhat.com> wrote: > > On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: > >> This patch allocates extra 16 bytes for -fsanitize=address so that > >> asan won't report read beyond memory buffer. It is used by > >> bootstrap-asan. OK to install? > > > > As valgrind warns about that too, I'd say we should do that unconditionally, > > the additional 16-bytes just aren't a big deal. > > This isn't sufficient for valgrind since valgrind will also report > reading uninitialized data, Only if that initialized data is actually used in a conditional. > which requires additional memset > on extra 16 bytes. (For plain operations, valgrind just operate on, typically passing along, the undefinedness.) Maybe that's what you meant, in which case my comment just clarifies what I saw as an ambiguity in your statement. brgds, H-P ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address
@ 2012-11-23 19:01 Uros Bizjak
2012-11-23 19:17 ` H.J. Lu
0 siblings, 1 reply; 14+ messages in thread
From: Uros Bizjak @ 2012-11-23 19:01 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
Hello!
> This patch allocates extra 16 bytes for -fsanitize=address so that
> asan won't report read beyond memory buffer. It is used by
> bootstrap-asan. OK to install?
/* Resize buffer if we allocated substantially too much, or if we
- haven't enough space for the \n-terminator. */
+ haven't enough space for the \n-terminator. Allocate extra 16
+ bytes for -fsanitize=address. */
I guess that extra _15_ bytes should be enough? The maximum we need
for SSE stringops is additional 15 bytes, when only the first byte is
allocated in the 16byte bundle.
Uros,
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 19:01 Uros Bizjak @ 2012-11-23 19:17 ` H.J. Lu 2012-11-23 19:23 ` Uros Bizjak 0 siblings, 1 reply; 14+ messages in thread From: H.J. Lu @ 2012-11-23 19:17 UTC (permalink / raw) To: Uros Bizjak; +Cc: gcc-patches, Jakub Jelinek On Fri, Nov 23, 2012 at 10:59 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > Hello! > >> This patch allocates extra 16 bytes for -fsanitize=address so that >> asan won't report read beyond memory buffer. It is used by >> bootstrap-asan. OK to install? > > /* Resize buffer if we allocated substantially too much, or if we > - haven't enough space for the \n-terminator. */ > + haven't enough space for the \n-terminator. Allocate extra 16 > + bytes for -fsanitize=address. */ > > I guess that extra _15_ bytes should be enough? The maximum we need > for SSE stringops is additional 15 bytes, when only the first byte is > allocated in the 16byte bundle. > If we need to clear the extra bytes, clearing 16 bytes can be faster than 16 bytes. -- H.J. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 19:17 ` H.J. Lu @ 2012-11-23 19:23 ` Uros Bizjak 2012-11-23 19:33 ` H.J. Lu 0 siblings, 1 reply; 14+ messages in thread From: Uros Bizjak @ 2012-11-23 19:23 UTC (permalink / raw) To: H.J. Lu; +Cc: gcc-patches, Jakub Jelinek On Fri, Nov 23, 2012 at 8:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Fri, Nov 23, 2012 at 10:59 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >> Hello! >> >>> This patch allocates extra 16 bytes for -fsanitize=address so that >>> asan won't report read beyond memory buffer. It is used by >>> bootstrap-asan. OK to install? >> >> /* Resize buffer if we allocated substantially too much, or if we >> - haven't enough space for the \n-terminator. */ >> + haven't enough space for the \n-terminator. Allocate extra 16 >> + bytes for -fsanitize=address. */ >> >> I guess that extra _15_ bytes should be enough? The maximum we need >> for SSE stringops is additional 15 bytes, when only the first byte is >> allocated in the 16byte bundle. >> > > If we need to clear the extra bytes, clearing 16 bytes can be faster > than 16 bytes. (I assume "... than 15 bytes"). However, this is ideal for the code below, where we add either +16 or +1, considering also additional byte for the terminator. Uros. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 19:23 ` Uros Bizjak @ 2012-11-23 19:33 ` H.J. Lu 2012-11-23 19:50 ` Jakub Jelinek 0 siblings, 1 reply; 14+ messages in thread From: H.J. Lu @ 2012-11-23 19:33 UTC (permalink / raw) To: Uros Bizjak; +Cc: gcc-patches, Jakub Jelinek, Tom Tromey On Fri, Nov 23, 2012 at 11:23 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Fri, Nov 23, 2012 at 8:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote: >> On Fri, Nov 23, 2012 at 10:59 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >>> Hello! >>> >>>> This patch allocates extra 16 bytes for -fsanitize=address so that >>>> asan won't report read beyond memory buffer. It is used by >>>> bootstrap-asan. OK to install? >>> >>> /* Resize buffer if we allocated substantially too much, or if we >>> - haven't enough space for the \n-terminator. */ >>> + haven't enough space for the \n-terminator. Allocate extra 16 >>> + bytes for -fsanitize=address. */ >>> >>> I guess that extra _15_ bytes should be enough? The maximum we need >>> for SSE stringops is additional 15 bytes, when only the first byte is >>> allocated in the 16byte bundle. >>> >> >> If we need to clear the extra bytes, clearing 16 bytes can be faster >> than 16 bytes. > > (I assume "... than 15 bytes"). > > However, this is ideal for the code below, where we add either +16 or > +1, considering also additional byte for the terminator. > > Uros. Here is the updated patch. I added CLEAR_CPP_PAD_BUFFER to be used later for valgrind. I have no real preferences for 15 vs 16 extra bytes. -- H.J. --- 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> PR bootstrap/55380 * charset.c (_cpp_convert_input): Clear CPP_PAD_BUFFER_SIZE bytes if CLEAR_CPP_PAD_BUFFER isn't 0. Allocate extra CPP_PAD_BUFFER_SIZE bytes and clear it if CLEAR_CPP_PAD_BUFFER isn't 0. * files.c (read_file_guts): Allocate extra CPP_PAD_BUFFER_SIZE bytes. * internal.h (CPP_PAD_BUFFER_SIZE): New. Defined to 16 for -fsanitize=address if __SANITIZE_ADDRESS__ is defined. (CLEAR_CPP_PAD_BUFFER): New. diff --git a/libcpp/charset.c b/libcpp/charset.c index cba19a6..1ca94a1 100644 --- a/libcpp/charset.c +++ b/libcpp/charset.c @@ -1709,6 +1709,8 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_c harset, to.text = input; to.asize = size; to.len = len; + if (CLEAR_CPP_PAD_BUFFER) + memset (input + size, 0, CPP_PAD_BUFFER_SIZE); } else { @@ -1731,7 +1733,12 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_ charset, /* Resize buffer if we allocated substantially too much, or if we haven't enough space for the \n-terminator. */ if (to.len + 4096 < to.asize || to.len >= to.asize) - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); + { + to.text = XRESIZEVEC (uchar, to.text, + to.len + CPP_PAD_BUFFER_SIZE + 1); + if (CLEAR_CPP_PAD_BUFFER) + memset (to.text + to.len + 1, 0, CPP_PAD_BUFFER_SIZE); + } /* If the file is using old-school Mac line endings (\r only), terminate with another \r, not an \n, so that we do not mistake diff --git a/libcpp/files.c b/libcpp/files.c index 9f84d8c..090f928 100644 --- a/libcpp/files.c +++ b/libcpp/files.c @@ -671,7 +671,7 @@ read_file_guts (cpp_reader *pfile, _cpp_file *file) the majority of C source files. */ size = 8 * 1024; - buf = XNEWVEC (uchar, size + 1); + buf = XNEWVEC (uchar, size + CPP_PAD_BUFFER_SIZE + 1); total = 0; while ((count = read (file->fd, buf + total, size - total)) > 0) { @@ -682,7 +682,7 @@ read_file_guts (cpp_reader *pfile, _cpp_file *file) if (regular) break; size *= 2; - buf = XRESIZEVEC (uchar, buf, size + 1); + buf = XRESIZEVEC (uchar, buf, size + CPP_PAD_BUFFER_SIZE + 1); } } diff --git a/libcpp/internal.h b/libcpp/internal.h index 312b8b5..bc1f311 100644 --- a/libcpp/internal.h +++ b/libcpp/internal.h @@ -77,6 +77,16 @@ struct cset_converter efficiency, and partly to limit runaway recursion. */ #define CPP_STACK_MAX 200 +/* Allocate extra 16 bytes for -fsanitize=address. */ +#ifdef __SANITIZE_ADDRESS__ +#define CPP_PAD_BUFFER_SIZE 16 +#else +#define CPP_PAD_BUFFER_SIZE 0 +#endif + +/* 1 to clear extra 16 bytes. */ +#define CLEAR_CPP_PAD_BUFFER 0 + /* Host alignment handling. */ struct dummy { ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 19:33 ` H.J. Lu @ 2012-11-23 19:50 ` Jakub Jelinek 2012-11-26 14:50 ` H.J. Lu 2012-12-03 15:45 ` Tom Tromey 0 siblings, 2 replies; 14+ messages in thread From: Jakub Jelinek @ 2012-11-23 19:50 UTC (permalink / raw) To: H.J. Lu; +Cc: Uros Bizjak, gcc-patches, Tom Tromey On Fri, Nov 23, 2012 at 11:33:37AM -0800, H.J. Lu wrote: > 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> > > PR bootstrap/55380 > * charset.c (_cpp_convert_input): Clear CPP_PAD_BUFFER_SIZE > bytes if CLEAR_CPP_PAD_BUFFER isn't 0. Allocate extra > CPP_PAD_BUFFER_SIZE bytes and clear it if CLEAR_CPP_PAD_BUFFER > isn't 0. > > * files.c (read_file_guts): Allocate extra CPP_PAD_BUFFER_SIZE > bytes. > > * internal.h (CPP_PAD_BUFFER_SIZE): New. Defined to 16 for > -fsanitize=address if __SANITIZE_ADDRESS__ is defined. > (CLEAR_CPP_PAD_BUFFER): New. I'd say you are making this way too much complicated. The following patch just changes those + 1 to + 16, adds memset of the 16 bytes unconditionally, but as it also fixes a thinko which IMHO affects the most common cases (regular file, with no conversion needed), I'd say it ought to be faster than the old version (the old version would allocate st.st_size + 1 bytes long buffer, read st.st_size bytes into it, call _cpp_convert_input with len st.st_size and size st.st_size (the latter should have been st.st_size + 1, i.e. the allocated size) and thus to.len >= to.asize was true and we called realloc with the same length as malloc has been called originally. 2012-11-23 Jakub Jelinek <jakub@redhat.com> PR bootstrap/55380 * files.c (read_file_guts): Allocate extra 16 bytes instead of 1 byte at the end of buf. Pass size + 16 instead of size to _cpp_convert_input. * charset.c (_cpp_convert_input): Reallocate if there aren't at least 16 bytes beyond to.len in the buffer. Clear 16 bytes at to.text + to.len. --- libcpp/files.c.jj 2012-11-22 11:04:24.000000000 +0100 +++ libcpp/files.c 2012-11-23 20:37:03.146379853 +0100 @@ -671,7 +671,7 @@ read_file_guts (cpp_reader *pfile, _cpp_ the majority of C source files. */ size = 8 * 1024; - buf = XNEWVEC (uchar, size + 1); + buf = XNEWVEC (uchar, size + 16); total = 0; while ((count = read (file->fd, buf + total, size - total)) > 0) { @@ -682,7 +682,7 @@ read_file_guts (cpp_reader *pfile, _cpp_ if (regular) break; size *= 2; - buf = XRESIZEVEC (uchar, buf, size + 1); + buf = XRESIZEVEC (uchar, buf, size + 16); } } @@ -699,7 +699,7 @@ read_file_guts (cpp_reader *pfile, _cpp_ file->buffer = _cpp_convert_input (pfile, CPP_OPTION (pfile, input_charset), - buf, size, total, + buf, size + 16, total, &file->buffer_start, &file->st.st_size); file->buffer_valid = true; --- libcpp/charset.c.jj 2011-01-06 10:22:00.000000000 +0100 +++ libcpp/charset.c 2012-11-23 20:40:39.736116642 +0100 @@ -1,6 +1,6 @@ /* CPP Library - charsets Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2008, 2009, - 2010 Free Software Foundation, Inc. + 2010, 2012 Free Software Foundation, Inc. Broken out of c-lex.c Apr 2003, adding valid C99 UCN ranges. @@ -1729,9 +1729,12 @@ _cpp_convert_input (cpp_reader *pfile, c iconv_close (input_cset.cd); /* Resize buffer if we allocated substantially too much, or if we - haven't enough space for the \n-terminator. */ - if (to.len + 4096 < to.asize || to.len >= to.asize) - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); + haven't enough space for the \n-terminator or following + 15 bytes of padding. */ + if (to.len + 4096 < to.asize || to.len + 16 > to.asize) + to.text = XRESIZEVEC (uchar, to.text, to.len + 16); + + memset (to.text + to.len, '\0', 16); /* If the file is using old-school Mac line endings (\r only), terminate with another \r, not an \n, so that we do not mistake Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 19:50 ` Jakub Jelinek @ 2012-11-26 14:50 ` H.J. Lu 2012-12-03 15:45 ` Tom Tromey 1 sibling, 0 replies; 14+ messages in thread From: H.J. Lu @ 2012-11-26 14:50 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Uros Bizjak, gcc-patches, Tom Tromey On Fri, Nov 23, 2012 at 11:50 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Nov 23, 2012 at 11:33:37AM -0800, H.J. Lu wrote: >> 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> >> >> PR bootstrap/55380 >> * charset.c (_cpp_convert_input): Clear CPP_PAD_BUFFER_SIZE >> bytes if CLEAR_CPP_PAD_BUFFER isn't 0. Allocate extra >> CPP_PAD_BUFFER_SIZE bytes and clear it if CLEAR_CPP_PAD_BUFFER >> isn't 0. >> >> * files.c (read_file_guts): Allocate extra CPP_PAD_BUFFER_SIZE >> bytes. >> >> * internal.h (CPP_PAD_BUFFER_SIZE): New. Defined to 16 for >> -fsanitize=address if __SANITIZE_ADDRESS__ is defined. >> (CLEAR_CPP_PAD_BUFFER): New. > > I'd say you are making this way too much complicated. > The following patch just changes those + 1 to + 16, adds memset of the 16 > bytes unconditionally, but as it also fixes a thinko which IMHO affects > the most common cases (regular file, with no conversion needed), I'd say > it ought to be faster than the old version (the old version would allocate > st.st_size + 1 bytes long buffer, read st.st_size bytes into it, > call _cpp_convert_input with len st.st_size and size st.st_size (the latter > should have been st.st_size + 1, i.e. the allocated size) and thus > to.len >= to.asize was true and we called realloc with the same length > as malloc has been called originally. > > 2012-11-23 Jakub Jelinek <jakub@redhat.com> > > PR bootstrap/55380 > * files.c (read_file_guts): Allocate extra 16 bytes instead of > 1 byte at the end of buf. Pass size + 16 instead of size > to _cpp_convert_input. > * charset.c (_cpp_convert_input): Reallocate if there aren't > at least 16 bytes beyond to.len in the buffer. Clear 16 bytes > at to.text + to.len. > > --- libcpp/files.c.jj 2012-11-22 11:04:24.000000000 +0100 > +++ libcpp/files.c 2012-11-23 20:37:03.146379853 +0100 > @@ -671,7 +671,7 @@ read_file_guts (cpp_reader *pfile, _cpp_ > the majority of C source files. */ > size = 8 * 1024; > > - buf = XNEWVEC (uchar, size + 1); > + buf = XNEWVEC (uchar, size + 16); > total = 0; > while ((count = read (file->fd, buf + total, size - total)) > 0) > { > @@ -682,7 +682,7 @@ read_file_guts (cpp_reader *pfile, _cpp_ > if (regular) > break; > size *= 2; > - buf = XRESIZEVEC (uchar, buf, size + 1); > + buf = XRESIZEVEC (uchar, buf, size + 16); > } > } > > @@ -699,7 +699,7 @@ read_file_guts (cpp_reader *pfile, _cpp_ > > file->buffer = _cpp_convert_input (pfile, > CPP_OPTION (pfile, input_charset), > - buf, size, total, > + buf, size + 16, total, > &file->buffer_start, > &file->st.st_size); > file->buffer_valid = true; > --- libcpp/charset.c.jj 2011-01-06 10:22:00.000000000 +0100 > +++ libcpp/charset.c 2012-11-23 20:40:39.736116642 +0100 > @@ -1,6 +1,6 @@ > /* CPP Library - charsets > Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2008, 2009, > - 2010 Free Software Foundation, Inc. > + 2010, 2012 Free Software Foundation, Inc. > > Broken out of c-lex.c Apr 2003, adding valid C99 UCN ranges. > > @@ -1729,9 +1729,12 @@ _cpp_convert_input (cpp_reader *pfile, c > iconv_close (input_cset.cd); > > /* Resize buffer if we allocated substantially too much, or if we > - haven't enough space for the \n-terminator. */ > - if (to.len + 4096 < to.asize || to.len >= to.asize) > - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > + haven't enough space for the \n-terminator or following > + 15 bytes of padding. */ > + if (to.len + 4096 < to.asize || to.len + 16 > to.asize) > + to.text = XRESIZEVEC (uchar, to.text, to.len + 16); > + > + memset (to.text + to.len, '\0', 16); > > /* If the file is using old-school Mac line endings (\r only), > terminate with another \r, not an \n, so that we do not mistake > > > Jakub You should also mention: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54691 in ChangeLog entry. Thanks. -- H.J. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address 2012-11-23 19:50 ` Jakub Jelinek 2012-11-26 14:50 ` H.J. Lu @ 2012-12-03 15:45 ` Tom Tromey 1 sibling, 0 replies; 14+ messages in thread From: Tom Tromey @ 2012-12-03 15:45 UTC (permalink / raw) To: Jakub Jelinek; +Cc: H.J. Lu, Uros Bizjak, gcc-patches >>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes: Jakub> 2012-11-23 Jakub Jelinek <jakub@redhat.com> Jakub> PR bootstrap/55380 Jakub> * files.c (read_file_guts): Allocate extra 16 bytes instead of Jakub> 1 byte at the end of buf. Pass size + 16 instead of size Jakub> to _cpp_convert_input. Jakub> * charset.c (_cpp_convert_input): Reallocate if there aren't Jakub> at least 16 bytes beyond to.len in the buffer. Clear 16 bytes Jakub> at to.text + to.len. Jakub> + buf = XNEWVEC (uchar, size + 16); I think the magic constant 16 could use a comment, here and elsewhere. Otherwise ok. Tom ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2012-12-03 15:45 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-11-23 17:23 [PATCH] Allocate extra 16 bytes for -fsanitize=address H.J. Lu 2012-11-23 17:39 ` Jakub Jelinek 2012-11-23 18:08 ` H.J. Lu 2012-11-23 18:12 ` Jakub Jelinek 2012-11-23 18:17 ` H.J. Lu 2012-11-23 18:12 ` Konstantin Serebryany 2012-11-24 10:52 ` Hans-Peter Nilsson 2012-11-23 19:01 Uros Bizjak 2012-11-23 19:17 ` H.J. Lu 2012-11-23 19:23 ` Uros Bizjak 2012-11-23 19:33 ` H.J. Lu 2012-11-23 19:50 ` Jakub Jelinek 2012-11-26 14:50 ` H.J. Lu 2012-12-03 15:45 ` Tom Tromey
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).