From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by sourceware.org (Postfix) with ESMTPS id 230193858D20 for ; Fri, 20 Oct 2023 12:51:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 230193858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 230193858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:67c:2178:6::1d ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697806300; cv=none; b=R2P5Xr55zyPGjd2S4hBYVT0c5tVjHrLoifXzy8upb50Is+xgC2xanC0e52jDj8nKuKVOmeSrgXICItdJLJuEyvOcAM5CJMFs4HA8rlJEpLLfTGI5Dz32V6w+nwKNGlj0kN9TqYN4MAus2MiQDMruFvyM73bqN6xathRNIENP8T8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697806300; c=relaxed/simple; bh=aj+XYORTB2X2UC3HpPfAM7yECiZogayGxhMzvMweR0g=; h=DKIM-Signature:DKIM-Signature:Message-ID:Date:MIME-Version: Subject:To:From; b=gvAf8oO3NKwQf6XpHc0cJOwoIBJt1IpDITiC6o8tuyZ5x5uuZABsn9cjFBMVYxuNc6QAAQTl6ikrYK4ChXt41daCBO2gDZ87YHMK0rHZBigs235qvU8fmhvf86w4N81hHMW0lNEzkeUDVLgTGj1rybJeBjc1AfRSehA5YpCqAcc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id DD79E1F750; Fri, 20 Oct 2023 12:51:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1697806296; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Dk6SiBukYw281oWQWyR9/orRbAXu3rwyHGzHzBF/kCw=; b=2O1T8bdIeEF2i0Ejy3gXqXByela+iz1yscb3bdjxh3tC+cBUl44H2ba5dt7bX+u5RLIhIV z+ws1dxbwHe/r5kaQ07XA27pfuKUiAfaJ5HSTPhVSJxajL9ONld/LFTnXduSLriMvjhrgW LhJUDLw3nGTB1yM5g1fwiH1q1WBHjwA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1697806296; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Dk6SiBukYw281oWQWyR9/orRbAXu3rwyHGzHzBF/kCw=; b=0UScIj4wlGfN+av/iekyuXymt5yn/aixLM0DB+MhLrx4vnWZ0XLlBu0V6oRtDhb2gJ36dD eEm2oP5NQlV/0YBA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id C8AAE138E2; Fri, 20 Oct 2023 12:51:36 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id zGy8L9h3MmXwEwAAMHmgww (envelope-from ); Fri, 20 Oct 2023 12:51:36 +0000 Message-ID: Date: Fri, 20 Oct 2023 14:52:14 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] [gdb/cli] Allow source-highlight to autodetect language Content-Language: en-US To: Guinevere Larsen , gdb-patches@sourceware.org References: <20231018165623.24187-1-tdevries@suse.de> <2852bc27-8577-d745-fa3f-691ffd0d1919@redhat.com> From: Tom de Vries In-Reply-To: <2852bc27-8577-d745-fa3f-691ffd0d1919@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Authentication-Results: smtp-out2.suse.de; none X-Spam-Level: X-Spam-Score: -7.09 X-Spamd-Result: default: False [-7.09 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; XM_UA_NO_VERSION(0.01)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%]; MIME_GOOD(-0.10)[text/plain]; NEURAL_HAM_LONG(-3.00)[-1.000]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-1.00)[-1.000]; RCPT_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[] X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 10/20/23 13:30, Guinevere Larsen wrote: > On 18/10/2023 18:56, Tom de Vries wrote: >> Currently when gdb asks the source-highlight library to highlight a >> file, it >> tells it what language file to use. >> >> For instance, if gdb learns from the debug info that the file is >> language_c, >> the language file "c.lang" is used.  This mapping is hardcoded in >> get_language_name. >> >> However, if gdb doesn't know what language file to use, it falls back >> to using >> python pygments, and in absence of that, unhighlighted source text. >> >> In the case of python pygments, it autodetects which language to use >> based on >> the file name. >> >> Add the same capability when using the source-highlight library. >> >> Tested on x86_64-linux. >> >> Verified that it works by: >> - making get_language_name return nullptr for language_c, and >> - checking that source-highlight still manages to highlight a hello >> world. >> >> PR cli/30966 >> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30966 > > Thanks for working on this! I tested this the way you mentioned and it > works. > > I only have one small nit for style. > >> --- >>   gdb/source-cache.c | 18 ++++++++++++++++-- >>   1 file changed, 16 insertions(+), 2 deletions(-) >> >> diff --git a/gdb/source-cache.c b/gdb/source-cache.c >> index 99be3334803..92acb100901 100644 >> --- a/gdb/source-cache.c >> +++ b/gdb/source-cache.c >> @@ -37,6 +37,7 @@ >>   #include >>   #include >>   #include >> +#include >>   #endif >>   /* The number of source files we'll cache.  */ >> @@ -205,8 +206,6 @@ try_source_highlight (std::string &contents >> ATTRIBUTE_UNUSED, >>       return false; >>     const char *lang_name = get_language_name (lang); >> -  if (lang_name == nullptr) >> -    return false; >>     /* The global source highlight object, or null if one was >>        never constructed.  This is stored here rather than in >> @@ -214,6 +213,9 @@ try_source_highlight (std::string &contents >> ATTRIBUTE_UNUSED, >>        conditional compilation in source-cache.h.  */ >>     static srchilite::SourceHighlight *highlighter; >> +  /* The global source highlight language map object.  */ >> +  static srchilite::LangMap *langmap; >> + >>     bool styled = false; >>     try >>       { >> @@ -221,6 +223,18 @@ try_source_highlight (std::string &contents >> ATTRIBUTE_UNUSED, >>       { >>         highlighter = new srchilite::SourceHighlight ("esc.outlang"); >>         highlighter->setStyleFile ("esc.style"); >> + >> +      const std::string &datadir = >> srchilite::Settings::retrieveDataDir (); >> +      langmap = new srchilite::LangMap (datadir, "lang.map"); >> +    } >> + >> +      std::string detected_lang; >> +      if (lang_name == nullptr) >> +    { >> +      detected_lang = langmap->getMappedFileNameFromFileName (fullname); >> +      if (detected_lang.empty ()) >> +        return false; >> +      lang_name = detected_lang.c_str (); > > Instead of an early exit here, I think I would prefer if it was: > > if ( !detected_lang.empty ()) >   lang_name = detected_lang.c_str (); > > mainly because if something else is changed in this function, this might > cause bugs that would be very hard to track down. > > Testing locally, I don't see any performance difference with 5000 > repeats, and I feel like the easier flow graph could be would be worth it. > Hi, thanks for the review. I think the proposed change means that: - we're allowing lang_name to remain nullptr, which - then gets converted into an std::string, which - throws an exception, which - we catch. so agreed, that would work. However, starting c++23, this doesn't compile anymore: ... #include int main (void) { std::string s = nullptr; return 0; } ... so I think it means we shouldn't be relying on that. Thanks, - Tom