public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Thomas Schwinge <thomas@codesourcery.com>
To: Lewis Hyatt <lhyatt@gmail.com>, <gcc-patches@gcc.gnu.org>
Cc: Richard Sandiford <richard.sandiford@arm.com>,
	Jakub Jelinek <jakub@redhat.com>,
	David Malcolm <dmalcolm@redhat.com>
Subject: GTY: Enhance 'string_length' option documentation (was: 'unsigned int len' field in 'libcpp/include/symtab.h:struct ht_identifier' (was: [PATCH] pch: Fix streaming of strings with embedded null bytes))
Date: Wed, 5 Jul 2023 09:56:23 +0200	[thread overview]
Message-ID: <878rbuvljs.fsf@euler.schwinge.homeip.net> (raw)
In-Reply-To: <CAA_5UQ6VTXUjPBykS0YJaus8MdVkCJSm+cE2RR4844Af8S=t6w@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1604 bytes --]

Hi!

On 2023-07-04T15:56:23-0400, Lewis Hyatt via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> On Tue, Jul 4, 2023 at 11:50 AM Thomas Schwinge <thomas@codesourcery.com> wrote:
>> I came across this one here on my way working through another (somewhat
>> related) GTY issue.  I generally do understand the issue here, but do
>> have a question about 'unsigned int len' field in
>> 'libcpp/include/symtab.h:struct ht_identifier': [...]

> I don't think there is currently any possibility for a null byte to
> end up in an ht_identifier's string. I assumed that ht_identifier
> stores the length as an optimization (especially since it doesn't take
> up any extra space on 64-bit platforms, given the 32-bit hash code is
> stored as well there.) I created the string_length GTY markup mainly
> to support another patch that I have still pending review, which I
> thought would increase the likelihood of PCH needing to handle null
> bytes in general. When I did that, I added the markup to ht_identifier
> simply because the length was already there, so there was no reason
> not to add it. It does save a few cycles when streaming out the PCH,
> but I doubt it is meaningful.

Thanks for confirming.  OK thus to push the attached
"GTY: Enhance 'string_length' option documentation"?


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-GTY-Enhance-string_length-option-documentation.patch --]
[-- Type: text/x-diff, Size: 2723 bytes --]

From a31b6657c26ac70c6e03b8ad81cdcb873f905716 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Wed, 5 Jul 2023 08:38:49 +0200
Subject: [PATCH] GTY: Enhance 'string_length' option documentation

We're (currently) not aware of any actual use of 'ht_identifier's with NUL
characters embedded; its 'len' field appears to exist for optimization
purposes, since "forever".  Before 'struct ht_identifier' was added in
commit 2a967f3d3a45294640e155381ef549e0b8090ad4 (Subversion r42334), we had in
'gcc/cpplib.h:struct cpp_hashnode': 'unsigned short len', or earlier 'length',
earlier in 'gcc/cpphash.h:struct hashnode': 'unsigned short length', earlier
'size_t length' with comment: "length of token, for quick comparison", earlier
'int length', ever since the 'gcc/cpp*' files were added in
commit 7f2935c734c36f84ab62b20a04de465e19061333 (Subversion r9191).

This amends commit f3b957ea8b9dadfb1ed30f24f463529684b7a36a
"pch: Fix streaming of strings with embedded null bytes".

	gcc/
	* doc/gty.texi (GTY Options) <string_length>: Enhance.
	libcpp/
	* include/symtab.h (struct ht_identifier): Document different
	rationale.
---
 gcc/doc/gty.texi        | 11 +++++++++++
 libcpp/include/symtab.h |  4 +---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/gty.texi b/gcc/doc/gty.texi
index 7bd064b5781..15f9fa07405 100644
--- a/gcc/doc/gty.texi
+++ b/gcc/doc/gty.texi
@@ -217,6 +217,17 @@ struct GTY(()) non_terminated_string @{
 @};
 @end smallexample
 
+Similarly, this is useful for (regular NUL-terminated) strings with
+NUL characters embedded (that the default @code{strlen} use would run
+afoul of):
+
+@smallexample
+struct GTY(()) multi_string @{
+  const char * GTY((string_length ("%h.len + 1"))) str;
+  size_t len;
+@};
+@end smallexample
+
 The @code{string_length} option currently is not supported for (fields
 in) global variables.
 @c <https://inbox.sourceware.org/87bkgqvlst.fsf@euler.schwinge.homeip.net>
diff --git a/libcpp/include/symtab.h b/libcpp/include/symtab.h
index c7ccc6db9f0..0c713f2ad30 100644
--- a/libcpp/include/symtab.h
+++ b/libcpp/include/symtab.h
@@ -29,9 +29,7 @@ along with this program; see the file COPYING3.  If not see
 typedef struct ht_identifier ht_identifier;
 typedef struct ht_identifier *ht_identifier_ptr;
 struct GTY(()) ht_identifier {
-  /* This GTY markup arranges that the null-terminated identifier would still
-     stream to PCH correctly, if a null byte were to make its way into an
-     identifier somehow.  */
+  /* We know the 'len'gth of the 'str'ing; use it in the GTY markup.  */
   const unsigned char * GTY((string_length ("1 + %h.len"))) str;
   unsigned int len;
   unsigned int hash_value;
-- 
2.34.1


  reply	other threads:[~2023-07-05  7:56 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-18 22:14 [PATCH] pch: Fix streaming of strings with embedded null bytes Lewis Hyatt
2022-10-19 11:54 ` Richard Sandiford
2022-10-19 12:08   ` Jakub Jelinek
2022-10-19 12:17     ` Richard Sandiford
2022-10-19 12:23       ` Jakub Jelinek
2022-10-19 12:47         ` Lewis Hyatt
2023-07-04 15:50 ` 'unsigned int len' field in 'libcpp/include/symtab.h:struct ht_identifier' (was: [PATCH] pch: Fix streaming of strings with embedded null bytes) Thomas Schwinge
2023-07-04 19:56   ` Lewis Hyatt
2023-07-05  7:56     ` Thomas Schwinge [this message]
2023-07-05  8:15       ` GTY: Enhance 'string_length' option documentation (was: 'unsigned int len' field in 'libcpp/include/symtab.h:struct ht_identifier' (was: [PATCH] pch: Fix streaming of strings with embedded null bytes)) Richard Biener
2023-07-05  7:50 ` GTY: Explicitly reject 'string_length' option for (fields in) global variables (was: [PATCH] pch: Fix streaming of strings with embedded null bytes) Thomas Schwinge
2023-07-05  8:13   ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878rbuvljs.fsf@euler.schwinge.homeip.net \
    --to=thomas@codesourcery.com \
    --cc=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=lhyatt@gmail.com \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).