public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] [gdb/cli] Allow source-highlight to autodetect language
@ 2023-10-18 16:56 Tom de Vries
  2023-10-20 11:30 ` Guinevere Larsen
  2023-10-20 17:24 ` Tom Tromey
  0 siblings, 2 replies; 5+ messages in thread
From: Tom de Vries @ 2023-10-18 16:56 UTC (permalink / raw)
  To: gdb-patches

Currently when gdb asks the source-highlight library to highlight a file, it
tells it what language file to use.

For instance, if gdb learns from the debug info that the file is language_c,
the language file "c.lang" is used.  This mapping is hardcoded in
get_language_name.

However, if gdb doesn't know what language file to use, it falls back to using
python pygments, and in absence of that, unhighlighted source text.

In the case of python pygments, it autodetects which language to use based on
the file name.

Add the same capability when using the source-highlight library.

Tested on x86_64-linux.

Verified that it works by:
- making get_language_name return nullptr for language_c, and
- checking that source-highlight still manages to highlight a hello world.

PR cli/30966
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30966
---
 gdb/source-cache.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/gdb/source-cache.c b/gdb/source-cache.c
index 99be3334803..92acb100901 100644
--- a/gdb/source-cache.c
+++ b/gdb/source-cache.c
@@ -37,6 +37,7 @@
 #include <sstream>
 #include <srchilite/sourcehighlight.h>
 #include <srchilite/langmap.h>
+#include <srchilite/settings.h>
 #endif
 
 /* The number of source files we'll cache.  */
@@ -205,8 +206,6 @@ try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
     return false;
 
   const char *lang_name = get_language_name (lang);
-  if (lang_name == nullptr)
-    return false;
 
   /* The global source highlight object, or null if one was
      never constructed.  This is stored here rather than in
@@ -214,6 +213,9 @@ try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
      conditional compilation in source-cache.h.  */
   static srchilite::SourceHighlight *highlighter;
 
+  /* The global source highlight language map object.  */
+  static srchilite::LangMap *langmap;
+
   bool styled = false;
   try
     {
@@ -221,6 +223,18 @@ try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
 	{
 	  highlighter = new srchilite::SourceHighlight ("esc.outlang");
 	  highlighter->setStyleFile ("esc.style");
+
+	  const std::string &datadir = srchilite::Settings::retrieveDataDir ();
+	  langmap = new srchilite::LangMap (datadir, "lang.map");
+	}
+
+      std::string detected_lang;
+      if (lang_name == nullptr)
+	{
+	  detected_lang = langmap->getMappedFileNameFromFileName (fullname);
+	  if (detected_lang.empty ())
+	    return false;
+	  lang_name = detected_lang.c_str ();
 	}
 
       std::istringstream input (contents);

base-commit: 29736fc507c7a9c6e797b7f83e8df4be73d37767
-- 
2.35.3


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] [gdb/cli] Allow source-highlight to autodetect language
  2023-10-18 16:56 [PATCH] [gdb/cli] Allow source-highlight to autodetect language Tom de Vries
@ 2023-10-20 11:30 ` Guinevere Larsen
  2023-10-20 12:52   ` Tom de Vries
  2023-10-20 17:24 ` Tom Tromey
  1 sibling, 1 reply; 5+ messages in thread
From: Guinevere Larsen @ 2023-10-20 11:30 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 18/10/2023 18:56, Tom de Vries wrote:
> Currently when gdb asks the source-highlight library to highlight a file, it
> tells it what language file to use.
>
> For instance, if gdb learns from the debug info that the file is language_c,
> the language file "c.lang" is used.  This mapping is hardcoded in
> get_language_name.
>
> However, if gdb doesn't know what language file to use, it falls back to using
> python pygments, and in absence of that, unhighlighted source text.
>
> In the case of python pygments, it autodetects which language to use based on
> the file name.
>
> Add the same capability when using the source-highlight library.
>
> Tested on x86_64-linux.
>
> Verified that it works by:
> - making get_language_name return nullptr for language_c, and
> - checking that source-highlight still manages to highlight a hello world.
>
> PR cli/30966
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30966

Thanks for working on this! I tested this the way you mentioned and it 
works.

I only have one small nit for style.

> ---
>   gdb/source-cache.c | 18 ++++++++++++++++--
>   1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/gdb/source-cache.c b/gdb/source-cache.c
> index 99be3334803..92acb100901 100644
> --- a/gdb/source-cache.c
> +++ b/gdb/source-cache.c
> @@ -37,6 +37,7 @@
>   #include <sstream>
>   #include <srchilite/sourcehighlight.h>
>   #include <srchilite/langmap.h>
> +#include <srchilite/settings.h>
>   #endif
>   
>   /* The number of source files we'll cache.  */
> @@ -205,8 +206,6 @@ try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
>       return false;
>   
>     const char *lang_name = get_language_name (lang);
> -  if (lang_name == nullptr)
> -    return false;
>   
>     /* The global source highlight object, or null if one was
>        never constructed.  This is stored here rather than in
> @@ -214,6 +213,9 @@ try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
>        conditional compilation in source-cache.h.  */
>     static srchilite::SourceHighlight *highlighter;
>   
> +  /* The global source highlight language map object.  */
> +  static srchilite::LangMap *langmap;
> +
>     bool styled = false;
>     try
>       {
> @@ -221,6 +223,18 @@ try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
>   	{
>   	  highlighter = new srchilite::SourceHighlight ("esc.outlang");
>   	  highlighter->setStyleFile ("esc.style");
> +
> +	  const std::string &datadir = srchilite::Settings::retrieveDataDir ();
> +	  langmap = new srchilite::LangMap (datadir, "lang.map");
> +	}
> +
> +      std::string detected_lang;
> +      if (lang_name == nullptr)
> +	{
> +	  detected_lang = langmap->getMappedFileNameFromFileName (fullname);
> +	  if (detected_lang.empty ())
> +	    return false;
> +	  lang_name = detected_lang.c_str ();

Instead of an early exit here, I think I would prefer if it was:

if ( !detected_lang.empty ())
   lang_name = detected_lang.c_str ();

mainly because if something else is changed in this function, this might 
cause bugs that would be very hard to track down.

Testing locally, I don't see any performance difference with 5000 
repeats, and I feel like the easier flow graph could be would be worth it.

-- 
Cheers,
Guinevere Larsen
She/Her/Hers

>   	}
>   
>         std::istringstream input (contents);
>
> base-commit: 29736fc507c7a9c6e797b7f83e8df4be73d37767



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] [gdb/cli] Allow source-highlight to autodetect language
  2023-10-20 11:30 ` Guinevere Larsen
@ 2023-10-20 12:52   ` Tom de Vries
  2023-10-20 12:59     ` Guinevere Larsen
  0 siblings, 1 reply; 5+ messages in thread
From: Tom de Vries @ 2023-10-20 12:52 UTC (permalink / raw)
  To: Guinevere Larsen, gdb-patches

On 10/20/23 13:30, Guinevere Larsen wrote:
> On 18/10/2023 18:56, Tom de Vries wrote:
>> Currently when gdb asks the source-highlight library to highlight a 
>> file, it
>> tells it what language file to use.
>>
>> For instance, if gdb learns from the debug info that the file is 
>> language_c,
>> the language file "c.lang" is used.  This mapping is hardcoded in
>> get_language_name.
>>
>> However, if gdb doesn't know what language file to use, it falls back 
>> to using
>> python pygments, and in absence of that, unhighlighted source text.
>>
>> In the case of python pygments, it autodetects which language to use 
>> based on
>> the file name.
>>
>> Add the same capability when using the source-highlight library.
>>
>> Tested on x86_64-linux.
>>
>> Verified that it works by:
>> - making get_language_name return nullptr for language_c, and
>> - checking that source-highlight still manages to highlight a hello 
>> world.
>>
>> PR cli/30966
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30966
> 
> Thanks for working on this! I tested this the way you mentioned and it 
> works.
> 
> I only have one small nit for style.
> 
>> ---
>>   gdb/source-cache.c | 18 ++++++++++++++++--
>>   1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/gdb/source-cache.c b/gdb/source-cache.c
>> index 99be3334803..92acb100901 100644
>> --- a/gdb/source-cache.c
>> +++ b/gdb/source-cache.c
>> @@ -37,6 +37,7 @@
>>   #include <sstream>
>>   #include <srchilite/sourcehighlight.h>
>>   #include <srchilite/langmap.h>
>> +#include <srchilite/settings.h>
>>   #endif
>>   /* The number of source files we'll cache.  */
>> @@ -205,8 +206,6 @@ try_source_highlight (std::string &contents 
>> ATTRIBUTE_UNUSED,
>>       return false;
>>     const char *lang_name = get_language_name (lang);
>> -  if (lang_name == nullptr)
>> -    return false;
>>     /* The global source highlight object, or null if one was
>>        never constructed.  This is stored here rather than in
>> @@ -214,6 +213,9 @@ try_source_highlight (std::string &contents 
>> ATTRIBUTE_UNUSED,
>>        conditional compilation in source-cache.h.  */
>>     static srchilite::SourceHighlight *highlighter;
>> +  /* The global source highlight language map object.  */
>> +  static srchilite::LangMap *langmap;
>> +
>>     bool styled = false;
>>     try
>>       {
>> @@ -221,6 +223,18 @@ try_source_highlight (std::string &contents 
>> ATTRIBUTE_UNUSED,
>>       {
>>         highlighter = new srchilite::SourceHighlight ("esc.outlang");
>>         highlighter->setStyleFile ("esc.style");
>> +
>> +      const std::string &datadir = 
>> srchilite::Settings::retrieveDataDir ();
>> +      langmap = new srchilite::LangMap (datadir, "lang.map");
>> +    }
>> +
>> +      std::string detected_lang;
>> +      if (lang_name == nullptr)
>> +    {
>> +      detected_lang = langmap->getMappedFileNameFromFileName (fullname);
>> +      if (detected_lang.empty ())
>> +        return false;
>> +      lang_name = detected_lang.c_str ();
> 
> Instead of an early exit here, I think I would prefer if it was:
> 
> if ( !detected_lang.empty ())
>    lang_name = detected_lang.c_str ();
> 
> mainly because if something else is changed in this function, this might 
> cause bugs that would be very hard to track down.
> 
> Testing locally, I don't see any performance difference with 5000 
> repeats, and I feel like the easier flow graph could be would be worth it.
> 

Hi,

thanks for the review.

I think the proposed change means that:
- we're allowing lang_name to remain nullptr, which
- then gets converted into an std::string, which
- throws an exception, which
- we catch.
so agreed, that would work.

However, starting c++23, this doesn't compile anymore:
...
#include <string>

int
main (void)
{
   std::string s = nullptr;

   return 0;
}
...
so I think it means we shouldn't be relying on that.

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] [gdb/cli] Allow source-highlight to autodetect language
  2023-10-20 12:52   ` Tom de Vries
@ 2023-10-20 12:59     ` Guinevere Larsen
  0 siblings, 0 replies; 5+ messages in thread
From: Guinevere Larsen @ 2023-10-20 12:59 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 20/10/2023 14:52, Tom de Vries wrote:
> On 10/20/23 13:30, Guinevere Larsen wrote:
>> On 18/10/2023 18:56, Tom de Vries wrote:
>>> Currently when gdb asks the source-highlight library to highlight a 
>>> file, it
>>> tells it what language file to use.
>>>
>>> For instance, if gdb learns from the debug info that the file is 
>>> language_c,
>>> the language file "c.lang" is used.  This mapping is hardcoded in
>>> get_language_name.
>>>
>>> However, if gdb doesn't know what language file to use, it falls 
>>> back to using
>>> python pygments, and in absence of that, unhighlighted source text.
>>>
>>> In the case of python pygments, it autodetects which language to use 
>>> based on
>>> the file name.
>>>
>>> Add the same capability when using the source-highlight library.
>>>
>>> Tested on x86_64-linux.
>>>
>>> Verified that it works by:
>>> - making get_language_name return nullptr for language_c, and
>>> - checking that source-highlight still manages to highlight a hello 
>>> world.
>>>
>>> PR cli/30966
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30966
>>
>> Thanks for working on this! I tested this the way you mentioned and 
>> it works.
>>
>> I only have one small nit for style.
>>
>>> ---
>>>   gdb/source-cache.c | 18 ++++++++++++++++--
>>>   1 file changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/gdb/source-cache.c b/gdb/source-cache.c
>>> index 99be3334803..92acb100901 100644
>>> --- a/gdb/source-cache.c
>>> +++ b/gdb/source-cache.c
>>> @@ -37,6 +37,7 @@
>>>   #include <sstream>
>>>   #include <srchilite/sourcehighlight.h>
>>>   #include <srchilite/langmap.h>
>>> +#include <srchilite/settings.h>
>>>   #endif
>>>   /* The number of source files we'll cache.  */
>>> @@ -205,8 +206,6 @@ try_source_highlight (std::string &contents 
>>> ATTRIBUTE_UNUSED,
>>>       return false;
>>>     const char *lang_name = get_language_name (lang);
>>> -  if (lang_name == nullptr)
>>> -    return false;
>>>     /* The global source highlight object, or null if one was
>>>        never constructed.  This is stored here rather than in
>>> @@ -214,6 +213,9 @@ try_source_highlight (std::string &contents 
>>> ATTRIBUTE_UNUSED,
>>>        conditional compilation in source-cache.h.  */
>>>     static srchilite::SourceHighlight *highlighter;
>>> +  /* The global source highlight language map object.  */
>>> +  static srchilite::LangMap *langmap;
>>> +
>>>     bool styled = false;
>>>     try
>>>       {
>>> @@ -221,6 +223,18 @@ try_source_highlight (std::string &contents 
>>> ATTRIBUTE_UNUSED,
>>>       {
>>>         highlighter = new srchilite::SourceHighlight ("esc.outlang");
>>>         highlighter->setStyleFile ("esc.style");
>>> +
>>> +      const std::string &datadir = 
>>> srchilite::Settings::retrieveDataDir ();
>>> +      langmap = new srchilite::LangMap (datadir, "lang.map");
>>> +    }
>>> +
>>> +      std::string detected_lang;
>>> +      if (lang_name == nullptr)
>>> +    {
>>> +      detected_lang = langmap->getMappedFileNameFromFileName 
>>> (fullname);
>>> +      if (detected_lang.empty ())
>>> +        return false;
>>> +      lang_name = detected_lang.c_str ();
>>
>> Instead of an early exit here, I think I would prefer if it was:
>>
>> if ( !detected_lang.empty ())
>>    lang_name = detected_lang.c_str ();
>>
>> mainly because if something else is changed in this function, this 
>> might cause bugs that would be very hard to track down.
>>
>> Testing locally, I don't see any performance difference with 5000 
>> repeats, and I feel like the easier flow graph could be would be 
>> worth it.
>>
>
> Hi,
>
> thanks for the review.
>
> I think the proposed change means that:
> - we're allowing lang_name to remain nullptr, which
> - then gets converted into an std::string, which
> - throws an exception, which
> - we catch.
> so agreed, that would work.
>
> However, starting c++23, this doesn't compile anymore:
> ...
> #include <string>
>
> int
> main (void)
> {
>   std::string s = nullptr;
>
>   return 0;
> }
> ...
> so I think it means we shouldn't be relying on that.

ah, I wasn't aware of that change in C++23. With that in mind, i think 
the patch is looking good,

Reviewed-By: Guinevere Larsen <blarsen@redhat.com>

-- 
Cheers,
Guinevere Larsen
She/Her/Hers


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] [gdb/cli] Allow source-highlight to autodetect language
  2023-10-18 16:56 [PATCH] [gdb/cli] Allow source-highlight to autodetect language Tom de Vries
  2023-10-20 11:30 ` Guinevere Larsen
@ 2023-10-20 17:24 ` Tom Tromey
  1 sibling, 0 replies; 5+ messages in thread
From: Tom Tromey @ 2023-10-20 17:24 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Verified that it works by:
Tom> - making get_language_name return nullptr for language_c, and
Tom> - checking that source-highlight still manages to highlight a hello world.

Tom> PR cli/30966
Tom> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30966

Thank you.  I think this looks good.
Approved-By: Tom Tromey <tom@tromey.com>

Tom

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-10-20 17:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-18 16:56 [PATCH] [gdb/cli] Allow source-highlight to autodetect language Tom de Vries
2023-10-20 11:30 ` Guinevere Larsen
2023-10-20 12:52   ` Tom de Vries
2023-10-20 12:59     ` Guinevere Larsen
2023-10-20 17:24 ` Tom Tromey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).