public inbox for fortran@gcc.gnu.org
 help / color / mirror / Atom feed
From: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
To: David Malcolm <dmalcolm@redhat.com>,fortran@gcc.gnu.org
Cc: gcc-patches@gcc.gnu.org,VandeVondele Joost
	<joost.vandevondele@mat.ethz.ch>
Subject: Re: [PATCH, fortran, v3] Use Levenshtein spelling suggestions in Fortran FE
Date: Sat, 23 Apr 2016 18:22:00 -0000	[thread overview]
Message-ID: <15FCD903-179E-4D64-B4C2-0B37E6011E38@gmail.com> (raw)
In-Reply-To: <1457362636.9813.27.camel@redhat.com>

On March 7, 2016 3:57:16 PM GMT+01:00, David Malcolm <dmalcolm@redhat.com> wrote:
>On Sat, 2016-03-05 at 23:46 +0100, Bernhard Reutner-Fischer wrote:
>[...]
>
>> diff --git a/gcc/fortran/misc.c b/gcc/fortran/misc.c
>> index 405bae0..72ed311 100644
>> --- a/gcc/fortran/misc.c
>> +++ b/gcc/fortran/misc.c
>[...]
>
>> @@ -274,3 +275,41 @@ get_c_kind(const char *c_kind_name,teropKind_tki
>> nds_table[])
>>  
>>    return ISOCBINDING_INVALID;
>>  }
>> +
>> +
>> +/* For a given name TYPO, determine the best candidate from
>> CANDIDATES
>> +   perusing Levenshtein distance.  Frees CANDIDATES before
>> returning.  */
>> +
>> +const char *
>> +gfc_closest_fuzzy_match (const char *typo, char **candidates)
>> +{
>> +  /* Determine closest match.  */
>> +  const char *best = NULL;
>> +  char **cand = candidates;
>> +  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
>> +
>> +  while (cand && *cand)
>> +    {
>> +      edit_distance_t dist = levenshtein_distance (typo, *cand);
>> +      if (dist < best_distance)
>> +	{
>> +	   best_distance = dist;
>> +	   best = *cand;
>> +	}
>> +      cand++;
>> +    }
>> +  /* If more than half of the letters were misspelled, the
>> suggestion is
>> +     likely to be meaningless.  */
>> +  if (best)
>> +    {
>> +      unsigned int cutoff = MAX (strlen (typo), strlen (best)) / 2;
>> +
>> +      if (best_distance > cutoff)
>> +	{
>> +	  XDELETEVEC (candidates);
>> +	  return NULL;
>> +	}
>> +      XDELETEVEC (candidates);
>> +    }
>> +  return best;
>> +}
>
>FWIW, there are two overloaded variants of levenshtein_distance in
>gcc/spellcheck.h, the first of which takes a pair of strlen values;
>your patch uses the second one:
>
>extern edit_distance_t
>levenshtein_distance (const char *s, int len_s,
>		      const char *t, int len_t);
>
>extern edit_distance_t
>levenshtein_distance (const char *s, const char *t);
>
>So one minor tweak you may want to consider here is to calculate
>  strlen (typo)
>once at the top of gfc_closest_fuzzy_match, and then pass it in to the
>4-arg variant of levenshtein_distance, which would avoid recalculating
>strlen (typo) for every candidate.

I've pondered this back then but came to the conclusion to use the variant without len because to use the 4 argument variant I would have stored the candidates strlen in the vector too and was not convinced about the memory footprint for that would be justified. Maybe it is, but I would prefer the following tweak in the 4 argument variant:
If you would amend the 4 argument variant with a

  if (len_t == -1)
    len_t = strlen (t);
before the
   if (len_s == 0)
     return len_t;
   if (len_t == 0)
     return len_s;

checks then I'd certainly use the 4 arg variant :)

WDYT?
>
>I can't comment on the rest of the patch (I'm not a Fortran expert),
>though it seems sane to 
>
>Hope this is constructive

It is, thanks for your thoughts!

cheers,

  reply	other threads:[~2016-04-23 18:22 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-01 12:55 [PATCH] Use gfc_add_*_component defines where appropriate Bernhard Reutner-Fischer
2015-12-01 12:55 ` [PATCH] Derive interface buffers from max name length Bernhard Reutner-Fischer
2015-12-01 14:52   ` Janne Blomqvist
2015-12-01 16:51     ` Bernhard Reutner-Fischer
2015-12-03  9:46       ` Janne Blomqvist
2016-06-18 19:46         ` Bernhard Reutner-Fischer
2017-10-19  8:03           ` Bernhard Reutner-Fischer
2017-10-20 22:46             ` Bernhard Reutner-Fischer
2017-10-21 15:18               ` Thomas Koenig
2017-10-21 18:11                 ` Bernhard Reutner-Fischer
2017-10-31 20:35                   ` Bernhard Reutner-Fischer
2018-09-03 16:05                     ` Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 00/29] Move towards stringpool, part 1 Bernhard Reutner-Fischer
2018-09-05 18:57                         ` Janne Blomqvist
2018-09-07  8:09                           ` Bernhard Reutner-Fischer
2018-09-19 14:40                             ` Bernhard Reutner-Fischer
2023-04-13 21:04                               ` Bernhard Reutner-Fischer
     [not found]                         ` <cba81495-832c-2b95-3c30-d2ef819ea9fb@charter.net>
     [not found]                           ` <CAC1BbcThL4Cj=mVRuGg2p8jUipwLOeosB7kwoVD27myRnKcgZA@mail.gmail.com>
2021-04-18 21:30                             ` Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 03/29] Use stringpool for gfc_get_name Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 08/29] Add uop/name helpers Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 13/29] Use stringpool for intrinsics and common Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 07/29] Use stringpool for some gfc_code2string return values Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 02/29] Use stringpool for gfc_match_defined_op_name() Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 04/29] Use stringpool for gfc_match_generic_spec Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 09/29] Use stringpool for modules Bernhard Reutner-Fischer
2018-09-05 18:44                         ` Janne Blomqvist
2018-09-05 20:59                           ` Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 06/29] Use stringpool for association_list Bernhard Reutner-Fischer
2018-09-05 14:57                       ` [PATCH,FORTRAN 01/29] gdbinit: break on gfc_internal_error Bernhard Reutner-Fischer
2021-10-29 18:58                         ` Bernhard Reutner-Fischer
2021-10-29 22:13                           ` Jerry D
2021-10-30 18:25                             ` Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 11/29] Do pointer comparison instead of strcmp Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 24/29] Use stringpool for intrinsic functions Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 22/29] Use stringpool in class and procedure-pointer result Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 14/29] Fix write_omp_udr for user-operator REDUCTIONs Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 26/29] Use stringpool for mangled common names Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 12/29] Use stringpool for remaining names Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 10/29] Do not copy name for check_function_name Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 27/29] Use stringpool for OMP clause reduction code Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 29/29] PR87103: Remove max symbol length check from gfc_new_symbol Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 23/29] Use stringpool for module binding_label Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 05/29] Use stringpool for gfc_match("%n") Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 25/29] Use stringpool on loading module symbols Bernhard Reutner-Fischer
2018-09-19 22:55                         ` [PATCH,FORTRAN v2] " Bernhard Reutner-Fischer
2018-09-05 14:58                       ` [PATCH,FORTRAN 21/29] Use stringpool for module tbp Bernhard Reutner-Fischer
2018-09-05 15:02                       ` [PATCH,FORTRAN 15/29] Use stringpool for iso_c_binding module names Bernhard Reutner-Fischer
2018-09-05 15:02                       ` [PATCH,FORTRAN 16/29] Do pointer comparison in iso_c_binding_module Bernhard Reutner-Fischer
2018-09-05 15:02                       ` [PATCH,FORTRAN 17/29] Use stringpool for iso_fortran_env Bernhard Reutner-Fischer
2018-09-05 15:02                       ` [PATCH,FORTRAN 18/29] Use stringpool for charkind Bernhard Reutner-Fischer
2018-09-05 15:02                       ` [PATCH,FORTRAN 19/29] Use stringpool and unified uppercase handling for types Bernhard Reutner-Fischer
2018-09-05 15:02                       ` [PATCH,FORTRAN 20/29] Use stringpool in class et al Bernhard Reutner-Fischer
2018-09-05 15:02                       ` [PATCH,FORTRAN 28/29] Free type-bound procedure structs Bernhard Reutner-Fischer
2021-10-29  0:05                         ` Bernhard Reutner-Fischer
2021-10-29 14:54                           ` Jerry D
2021-10-29 16:42                             ` Bernhard Reutner-Fischer
     [not found]                           ` <slhifq$rlb$1@ciao.gmane.io>
2021-10-29 20:09                             ` Bernhard Reutner-Fischer
2021-10-31 22:35                               ` Bernhard Reutner-Fischer
2015-12-01 12:55 ` [PATCH] Commentary typo fix for gfc_typenode_for_spec() Bernhard Reutner-Fischer
2015-12-01 16:00   ` Steve Kargl
2016-06-18 20:07     ` Bernhard Reutner-Fischer
2015-12-01 12:56 ` [PATCH] RFC: Use Levenshtein spelling suggestions in Fortran FE Bernhard Reutner-Fischer
2015-12-01 15:02   ` Steve Kargl
2015-12-01 16:13     ` Bernhard Reutner-Fischer
2015-12-01 16:41       ` Steve Kargl
2015-12-01 17:35         ` Bernhard Reutner-Fischer
2015-12-01 19:49           ` Steve Kargl
2015-12-01 17:28   ` David Malcolm
2015-12-01 17:51     ` Bernhard Reutner-Fischer
2015-12-01 17:58       ` David Malcolm
2015-12-01 20:00         ` Steve Kargl
2015-12-03  9:29       ` Janne Blomqvist
2015-12-03 13:53         ` Mikael Morin
2015-12-04  0:08           ` Steve Kargl
2015-12-05 19:53   ` Mikael Morin
2015-12-09  1:07     ` [PATCH] v2 " David Malcolm
2015-12-10 16:15       ` Tobias Burnus
2015-12-22 13:57         ` Fortran release notes (was: [PATCH] v2 ...) Gerald Pfeifer
2015-12-12 17:02       ` [PATCH] v2 Re: [PATCH] RFC: Use Levenshtein spelling suggestions in Fortran FE Bernhard Reutner-Fischer
2015-12-27 21:43   ` [PATCH, RFC, v2] " Bernhard Reutner-Fischer
2016-03-05 22:46     ` [PATCH, fortran, v3] " Bernhard Reutner-Fischer
2016-03-07 14:57       ` David Malcolm
2016-04-23 18:22         ` Bernhard Reutner-Fischer [this message]
2016-04-25 17:07           ` David Malcolm
2016-06-18 19:59             ` [PATCH, fortran, v4] " Bernhard Reutner-Fischer
2016-06-20 10:26               ` VandeVondele  Joost
2016-07-03 22:46               ` Ping: [Re: [PATCH, fortran, v4] Use Levenshtein spelling suggestions in Fortran FE] Bernhard Reutner-Fischer
2016-07-04  3:31                 ` Jerry DeLisle
2016-07-04  5:03                   ` Janne Blomqvist
2017-10-19  7:26                     ` Bernhard Reutner-Fischer
2017-10-19  7:51               ` [PATCH, fortran, v4] Use Levenshtein spelling suggestions in Fortran FE Bernhard Reutner-Fischer
2016-06-18 19:47 ` [PATCH] Use gfc_add_*_component defines where appropriate Bernhard Reutner-Fischer
2016-06-19  9:18   ` Paul Richard Thomas
2016-06-19 10:39     ` Bernhard Reutner-Fischer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=15FCD903-179E-4D64-B4C2-0B37E6011E38@gmail.com \
    --to=rep.dot.nop@gmail.com \
    --cc=dmalcolm@redhat.com \
    --cc=fortran@gcc.gnu.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=joost.vandevondele@mat.ethz.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).