From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gateway23.websitewelcome.com (gateway23.websitewelcome.com [192.185.50.185]) by sourceware.org (Postfix) with ESMTPS id 806A93938380 for ; Thu, 13 May 2021 16:59:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 806A93938380 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=tromey.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=tom@tromey.com Received: from cm12.websitewelcome.com (cm12.websitewelcome.com [100.42.49.8]) by gateway23.websitewelcome.com (Postfix) with ESMTP id EE73013C46 for ; Thu, 13 May 2021 11:59:32 -0500 (CDT) Received: from box5379.bluehost.com ([162.241.216.53]) by cmsmtp with SMTP id hEgNlDVPQDedfhEgNlSoE9; Thu, 13 May 2021 11:59:31 -0500 X-Authority-Reason: nr=8 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tromey.com; s=default; h=Content-Type:MIME-Version:Message-ID:In-Reply-To:Date:References :Subject:Cc:To:From:Sender:Reply-To:Content-Transfer-Encoding:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=XGjHennr7DG5DtO/HkGfD7gZI7qA8IJ2nEvyd8u30TU=; b=W1qo3KzmWlabtPlCuISNW4PSMj 9ld+YL43diEYTDgqTmLWhOIjx+RsXNk6jOSu34qP7XwB+0TTGjkepWYV/0tvKDAI0qqMa2PxJ+izf LvNUlwcYfiQIBon5THz2b2S5Y; Received: from 75-166-134-27.hlrn.qwest.net ([75.166.134.27]:53028 helo=murgatroyd) by box5379.bluehost.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1lhEgM-002mhA-S2; Thu, 13 May 2021 10:59:30 -0600 From: Tom Tromey To: "Jose E. Marchesi via Gdb-patches" Subject: Re: [PATCH 1/1] Integrate GNU poke in GDB References: <20210510151044.20829-1-jose.marchesi@oracle.com> <20210510151044.20829-2-jose.marchesi@oracle.com> X-Attribution: Tom Date: Thu, 13 May 2021 10:59:30 -0600 In-Reply-To: <20210510151044.20829-2-jose.marchesi@oracle.com> (Jose E. Marchesi via Gdb-patches's message of "Mon, 10 May 2021 17:10:44 +0200") Message-ID: <871raa603x.fsf@tromey.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - box5379.bluehost.com X-AntiAbuse: Original Domain - sourceware.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - tromey.com X-BWhitelist: no X-Source-IP: 75.166.134.27 X-Source-L: No X-Exim-ID: 1lhEgM-002mhA-S2 X-Source: X-Source-Args: X-Source-Dir: X-Source-Sender: 75-166-134-27.hlrn.qwest.net (murgatroyd) [75.166.134.27]:53028 X-Source-Auth: tom+tromey.com X-Email-Count: 1 X-Source-Cap: ZWx5bnJvYmk7ZWx5bnJvYmk7Ym94NTM3OS5ibHVlaG9zdC5jb20= X-Local-Domain: yes X-Spam-Status: No, score=-3026.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, JMQ_SPF_NEUTRAL, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_NEUTRAL, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 May 2021 16:59:42 -0000 >>>>> "Jose" == Jose E Marchesi via Gdb-patches writes: Hi. Thanks for the patch. gdb has its own coding style, not to mention API peculiarities and surprises, so I've gone through the patch and mentioned the ones I could find. I'm sorry it turned out rather long, don't be discouraged, this is typical for a first gdb patch. Before accepting this, I think test cases would also be needed; plus a NEWS entry. Probably not all the maintainers will have poke, so I'd guess any changes needed will be done on a best-effort basis. I see Fedora doesn't package it, at least as of Fedora 32. Jose> +# Integration with GNU poke is done through the libpoke library. Jose> +AC_ARG_ENABLE([poke], Jose> + AS_HELP_STRING([--enable-poke], Jose> + [Build GDB with poke support (default is NO)]), Jose> + [poke_enabled=$enableval], [poke_enabled=no]) It's normal to also support an 'auto' value, often the default; then have this mode probe for the library. Jose> +if test "x$poke_enabled" = "xyes"; then Jose> + # Note that we need a libpoke with support for registering foreign Jose> + # IO devices, hence the symbol pk_register_iod. Jose> + AC_CHECK_LIB(poke, pk_register_iod) Jose> + POKE_OBS="poke.o" Probably the "yes" value should not probe, and just give a compilation (or configure) error if the library isn't found. The idea here is that this is what the user asked for. Jose> +++ b/gdb/doc/gdb.texinfo Jose> @@ -10245,6 +10245,7 @@ being passed the type of @var{arg} as the argument. Jose> * Caching Target Data:: Data caching for targets Jose> * Searching Memory:: Searching memory for a sequence of bytes Jose> * Value Sizes:: Managing memory allocated for values Jose> +* Poke:: Poking at data using GNU poke. I think it would be better if this were conditional on Poke actually being compiled in. Jose> +static void Jose> +poke_term_flush (void) No need for (void) in C++. Jose> +/* Terminal hook that prints a fixed string. */ Jose> + Jose> +static void Jose> +poke_puts (const char *str) Jose> +{ Jose> + printf_filtered ("%s", str); Jose> +} printf_filtered can throw an exception if the user chooses 'q' at a pagination prompt. Is this ok for poke? I don't think other exceptions can be thrown, but it's hard to know for sure. Jose> +/* Terminal hook that prints a formatted string. */ Jose> + Jose> +__attribute__ ((__format__ (__printf__, 1, 2))) ATTRIBUTE_PRINTF Jose> +static void Jose> +poke_printf (const char *format, ...) Jose> +{ Jose> + va_list ap; Jose> + char *str; Jose> + int r; Jose> + Jose> + va_start (ap, format); Jose> + r = vasprintf (&str, format, ap); Jose> + if (r == -1) Jose> + error (_("out of memory in vasprintf")); /* XXX fatal */ Jose> + va_end (ap); Jose> + Jose> + printf_filtered ("%s", str); This can all be replaced with vprintf_filtered. Jose> +/* Terminal hook that sets the terminal background color. */ Jose> + Jose> +static void Jose> +poke_term_set_bgcolor (struct pk_color color) Jose> +{ Jose> + /* Do nothing. */ Jose> +} I don't know what this is for, but gdb has a styling system that can be used. Jose> +/* Implementation of the poke terminal interface, that uses the hooks Jose> + defined above. */ Jose> + Jose> +static struct pk_term_if poke_term_if = Jose> + { Jose> + .flush_fn = poke_term_flush, I don't think this is C++ syntax. Maybe it's a GNU extension, but gdb only uses those when it knows it is being compiled with gcc. Jose> +static char * Jose> +iod_handler_normalize (const char *handler, uint64_t flags, int *error) Jose> +{ Jose> + char *new_handler = NULL; Jose> + Jose> + if (strcmp (handler, "") == 0) Jose> + new_handler = xstrdup (handler); Jose> + if (error) Jose> + *error = PK_IOD_OK; Jose> + Jose> + return new_handler; If poke owns this memory, then I suspect you'll have problems on Windows, because IIRC different libraries may get different malloc/free implementations, and so passing ownership like this isn't ok. Not sure if that's correct, you may want to double-check. Jose> + return (gdbarch_addr_bit (get_current_arch ()) == 32 Jose> + ? 0xffffffff : 0xffffffffffffffff); GNU style here is to add extra parens so that the second line lines up with the expression on the first line. Jose> +static struct pk_alien_token * Jose> +poke_alien_token_handler (const char *id, char **errmsg) Jose> +{ Jose> + /* In GDB alien poke tokens with the form $addr::FOO provide the Jose> + address of the symbol `FOO' as an offset in bytes, i.e. it Jose> + resolves to the GDB value &foo as a Poke offset with unit bytes. Jose> + Jose> + $FOO, on the other hand, provide the value of the symbol FOO Jose> + incarnated in a proper Poke value. */ Jose> + Jose> + if (strncmp (id, "addr::", 6) == 0) startswith Jose> + { Jose> + CORE_ADDR addr; Jose> + Jose> + std::string expr = "&"; Jose> + expr += id + 6; Jose> + Jose> + try Jose> + { Jose> + addr = parse_and_eval_address (expr.c_str ()); Jose> + } "&" is valid in C, but there's no guarantee it is correct for all the languages gdb supports. I think it would be better to parse-and-eval, then use the value API to take the address, the convert using value_as_address. Jose> + catch (const gdb_exception_error &except) Jose> + { Jose> + goto error; Jose> + } This will let through gdb_exception_quit, which can happen if the user types C-c during the evaluation. You probably want to catch gdb_exception instead. Jose> + if (can_dereference (type)) I didn't know about this function. It's only used by annotations... hmm. Classically gdb lets you dereference a plain int. Dunno if you want to support that. Jose> + *errmsg = NULL; Jose> + return &alien_token; If this is the only use, the definition might as well be put inside this function. Jose> + Jose> + error: Jose> + std::string emsg = "can't access GDB variable '"; Jose> + emsg += id; Jose> + emsg += "'"; Jose> + *errmsg = xstrdup (emsg.c_str ()); You can use xstrprintf or concat here, instead of a std::string. Jose> +static std::vector poke_keywords Jose> + { Jose> + "pinned", "struct", "union", "else", "while", "until", Jose> + "for", "in", "where", "if", "sizeof", "fun", "method", Jose> + "type", "var", "unit", "break", "continue", "return", Jose> + "string", "as", "try", "catch", "raise", "void", "any", Jose> + "print", "printf", "isa", "unmap", "big", "little", Jose> + "load", "lambda", "assert", Jose> + }; Jose> + Jose> +static std::string Jose> +normalize_poke_identifier (std::string prefix, std::string str) Jose> +{ Jose> + if (std::find (poke_keywords.begin (), Jose> + poke_keywords.end (), Jose> + str) != poke_keywords.end ()) Jose> + str = prefix + str; Jose> + Jose> + return str; Jose> +} I think it's probably better to use a plain C array of const char * for poke_keywords. Not sure if std::find will work, but an ordinary foreach is also fine. I think both arguments can be 'const std::string &' to avoid copying. Jose> +static std::string Jose> +gdb_type_name_to_poke (std::string str, struct type *type = NULL) nullptr Jose> +{ Jose> + for (int i = 0; i < str.length (); ++i) Jose> + if (!(str.begin()[i] == '_' Jose> + || (str.begin()[i] >= 'a' && str.begin()[i] <= 'z') Jose> + || (str.begin()[i] >= '0' && str.begin()[i] <= '9') Jose> + || (str.begin()[i] >= 'A' && str.begin()[i] <= 'Z'))) Jose> + str.begin()[i] = '_'; You can just use str[i], no need for .begin(). gdb would use the ctype macros here, or maybe the ones in safe-ctype.h. Jose> + if (type) type != nullptr Jose> + { Jose> + /* Prepend struct and union tags with suitable prefixes. This Jose> + is to avoid ending with recursive typedefs in C programs. */ Jose> + if (type->code () == TYPE_CODE_STRUCT) Jose> + str = "struct_" + str; Jose> + else if (type->code () == TYPE_CODE_UNION) Jose> + str = "union_" + str; Before using a type in gdb, you normally have to call check_typedef. However, I don't know if that's appropriate here or not, since I don't really know what this is doing. How does poke handle typedefs? Jose> +#if 0 Jose> +/* Command to shut down the poke incremental compiler. */ I guess this is the deletion thing. Anyway, FAOD, we don't want new dead code. What does the incremental compiler do exactly? Jose> +/* Command to feed the poke compiler with the definition of some given Jose> + GDB type. */ Jose> + Jose> +static void poke_command (const char *args, int from_tty); Jose> + Jose> +static std::string Jose> +poke_add_type (struct type *type) Jose> +{ Comment should be just before the function. Jose> + std::string str = ""; You can drop the "". Jose> + if (type->name ()) != nullptr Jose> + case TYPE_CODE_TYPEDEF: Jose> + { Jose> + struct type *target_type = TYPE_TARGET_TYPE (type); Jose> + std::string target_type_code = poke_add_type (target_type); Jose> + Jose> + if (target_type_code == "") Normally you'd use 'target_type_code.empty ()' here. I think I missed another spot like this. Jose> + case TYPE_CODE_INT: You can probably handle enums like ints. Jose> + case TYPE_CODE_ARRAY: Jose> + { Jose> + struct type *target_type = TYPE_TARGET_TYPE (type); Jose> + size_t target_type_length = TYPE_LENGTH (target_type); You need a check_typedef in here. Jose> + if (target_type->name ()) nullptr Jose> + str += std::to_string (TYPE_LENGTH (type) / target_type_length); I don't think this will work as you expect. Normally you need to fetch the bounds of the array. This can't always be done, though; for example in C, a VLA's bounds depend on some value, so the dynamic type can't be resolved in isolation. Not sure if this is an issue here or not. You can probably just skip dynamically-sized arrays. Jose> + case TYPE_CODE_STRUCT: Don't know if it matters to poke, but unions can often be handled like structs. Jose> + { Jose> + size_t natural_bitpos = 0; Jose> + str += "struct {"; Jose> + Jose> + for (int idx = 0; idx < type->num_fields (); idx++) Jose> + { Jose> + std::string field_name Jose> + = normalize_poke_identifier ("__f", TYPE_FIELD_NAME (type, idx)); Jose> + struct type *field_type = type->field (idx).type (); Jose> + size_t field_bitpos = TYPE_FIELD_BITPOS (type, idx); You probably want to skip static fields in this loop. Jose> + if (poke_add_type (field_type) == "") Jose> + goto skip; Jose> + str += gdb_type_name_to_poke (field_type->name (), field_type); I forgot to mention something else ... this seems to assume that a given name corresponds to a single type. Is that true? I wasn't really sure from the code. Anyway -- this isn't always the case, it's relatively normal in C programs to have an opaque type that has a different meaning in different compilation units. How will poke handle this? You may also want this to skip virtual base classes, depending on whether poke can handle those. And you may want to skip all dynamic types generally, maybe somewhere a bit higher up. Not sure. These are types whose layout depends on other values -- not super common in C (just VLA), but relatively more so in Ada and Rust and maybe Fortran. Jose> + natural_bitpos = field_bitpos + TYPE_LENGTH (field_type) * 8; I didn't understand this. For example, a bitfield will have an explicit length that isn't the length of its underlying type. It's probably better to just remove the "natural_bitpos" logic and defer to what gdb already knows about the type. However, I don't know if this has some kind of bad effect on poke. Also there may be something weird about bit positions on big-endian systems. I'm never quite clear on that. Jose> + type_poke_strings.push_back (deftype); Jose> + poke_command (deftype.c_str(), 0 /* from_tty */); I think it would be better to factor out a common helper instead. Jose> + printf_filtered ("added type %s\n", poke_type_name.c_str ()); Was this some kind of debugging thing, or is it needed? Jose> + skip: Jose> + return ""; I guess all those 'goto's can just be 'return {};' Jose> +static void Jose> +start_poke (void) (void) Jose> + if (poke_compiler == NULL) Jose> + error (_("Couldn't start the poke incremental compiler.")); Jose> + poke_compiler_lives = 1; I was wondering if all the code after this point in this function matters. If so, then probably this flag shouldn't be set until then. Also, it should be bool and use true/false. Jose> +/* Commands to add GDB types to the running poke compiler. */ Jose> + Jose> +static void Jose> +poke_add_type_command (const char *args, int from_tty) Jose> +{ Jose> + std::string type_name = skip_spaces (args); Jose> + type_name = gdb_type_name_to_poke (type_name); Jose> + Jose> + expression_up expr = parse_expression (args); This seems to use 'args' twice for two different things. Jose> + Jose> +static void Jose> +poke_add_types (const char *args, int from_tty) Intro comment. Jose> + std::string symbol_name_regexp = skip_spaces (args); Better to use a const char * here -- avoids a copy. Jose> + int what; /* 0 -> declaration, 1 -> statement */ gdb prefers bool in new code. Jose> + const char *end; Jose> + std::string cmd; Jose> + Jose> +#define IS_COMMAND(input, cmd) \ Jose> + (strncmp ((input), (cmd), sizeof (cmd) - 1) == 0 \ Jose> + && ((input)[sizeof (cmd) - 1] == ' ' || (input)[sizeof (cmd) - 1] == '\t')) Jose> + Jose> + args = skip_spaces (args); Jose> + if (IS_COMMAND (args, "fun")) Jose> + { Jose> + what = 0; Jose> + cmd = args; Jose> + } Jose> + else Jose> + { Jose> + if (IS_COMMAND (args, "var") Jose> + || IS_COMMAND (args, "type") Jose> + || IS_COMMAND (args, "unit")) One idea here would be to register a command prefix and sub-commands instead. This will get completion, abbreviation, and better 'help' handling. I get that this is some kind of poke built-in thing, though, so maybe you'd rather not. Jose> + Jose> + if (what == 0) Jose> + { Jose> + /* Declaration. */ Jose> + if (pk_compile_buffer (poke_compiler, cmd.c_str (), &end) != PK_OK) Jose> + goto error; Normally in gdb, a command will call error() on an error, to stop execution of command sequences at that point. Swallowing an error is sometimes ok, but I wonder if that's the case here. Does poke itself print something if this failed? If not, the user should get some indication of the problem. Jose> +/* Initialize the poke GDB subsystem. */ Jose> + Jose> +void _initialize_poke (void); (void) Jose> + /* XXX commands to set poke settings, like the output base. */ gdb has an output radix, not sure if that's useful. Tom