public inbox for fortran@gcc.gnu.org
From: Mikael Morin <mikael.morin@sfr.fr>
To: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>, fortran@gcc.gnu.org
Cc: gcc-patches@gcc.gnu.org, dmalcolm@redhat.com
Subject: Re: [PATCH] RFC: Use Levenshtein spelling suggestions in Fortran FE
Date: Sat, 05 Dec 2015 19:53:00 -0000
Message-ID: <566340AC.3050408@sfr.fr>
In-Reply-To: <1448974501-30981-4-git-send-email-rep.dot.nop@gmail.com>

Hello,

to get things moving again, a few comments on top of David Malcolm's:

On 01/12/2015 13:55, Bernhard Reutner-Fischer wrote:
>
> David Malcolm's nice Levenshtein distance spelling check helpers
> were used in some parts of other frontends. This proposed patch adds
> some spelling corrections to the fortran frontend.
>
> Suggestions are printed if we can find a suitable name, currently
> perusing a very simple cutoff factor:
> /* If more than half of the letters were misspelled, the suggestion is
>     likely to be meaningless.  */
> cutoff = MAX (strlen (typo), strlen (best_guess)) / 2;
> which effectively skips names with less than 4 characters.
> For e.g. structures, one could try to be much smarter in an attempt to
> also provide suggestions for single-letter members/components.
>
> This patch covers (at least partly):
> - user-defined operators
> - structures (types and their components)
> - functions
> - symbols (variables)
>
> I do not immediately see how to handle subroutines. Ideas?
>
I am not sure what you are looking for; I can get an error generated in
gfc_procedure_use when using IMPLICIT NONE (EXTERNAL).

> If anybody has a testcase where a spelling-suggestion would make sense
> then please pass it along so we maybe can add support for GCC-7.
>
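
To make the quoted cutoff rule concrete, here is a small self-contained sketch. The helpers edit_distance and within_cutoff are made up for illustration and are not the spellcheck.h interface; the distance is a plain textbook Levenshtein, assumed to behave like the one the patch calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Textbook Levenshtein distance (hypothetical helper, not GCC's
// levenshtein_distance).
static std::size_t
edit_distance (const std::string &a, const std::string &b)
{
  std::vector<std::size_t> prev (b.size () + 1), cur (b.size () + 1);
  for (std::size_t j = 0; j <= b.size (); j++)
    prev[j] = j;
  for (std::size_t i = 1; i <= a.size (); i++)
    {
      cur[0] = i;
      for (std::size_t j = 1; j <= b.size (); j++)
        {
          std::size_t subst = prev[j - 1] + (a[i - 1] != b[j - 1]);
          cur[j] = std::min ({prev[j] + 1, cur[j - 1] + 1, subst});
        }
      prev.swap (cur);
    }
  return prev[b.size ()];
}

// Same rule as in the patch: reject the guess when the distance exceeds
// half the length of the longer name.
static bool
within_cutoff (const std::string &typo, const std::string &guess)
{
  std::size_t cutoff = std::max (typo.size (), guess.size ()) / 2;
  return edit_distance (typo, guess) <= cutoff;
}
```

Under this rule a single-letter name never gets a suggestion (the cutoff is 0), while a one-letter typo in a three-letter name still passes (distance 1, cutoff 1).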


> diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
> index 685e3f5..6e1f63c 100644
> --- a/gcc/fortran/resolve.c
> +++ b/gcc/fortran/resolve.c
> @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
>   #include "data.h"
>   #include "target-memory.h" /* for gfc_simplify_transfer */
>   #include "constructor.h"
> +#include "spellcheck.h"
>
>   /* Types used in equivalence statements.  */
>
> @@ -2682,6 +2683,61 @@ resolve_specific_f (gfc_expr *expr)
>     return true;
>   }
>
> +/* Recursively append candidate SYM to CANDIDATES.  */
> +
> +static void
> +lookup_function_fuzzy_find_candidates (gfc_symtree *sym,
> +                                       vec<const char *> *candidates)
> +{
> +  gfc_symtree *p;
> +  for (p = sym->right; p; p = p->right)
> +    {
> +      lookup_function_fuzzy_find_candidates (p, candidates);
> +      if (p->n.sym->ts.type != BT_UNKNOWN)
> +	candidates->safe_push (p->name);
> +    }
> +  for (p = sym->left; p; p = p->left)
> +    {
> +      lookup_function_fuzzy_find_candidates (p, candidates);
> +      if (p->n.sym->ts.type != BT_UNKNOWN)
> +	candidates->safe_push (p->name);
> +    }
> +}

It seems you are considering some candidates more than once here.
The first time through the recursive call you will consider, say,
sym->right->right, and then the loop will consider it again after
returning from the recursive call.
The usual way to traverse the whole tree is to handle the current
pointer and recurse on the left and right pointers, that is, without a loop.
There is gfc_traverse_ns, which you might find handy for that (no
obligation).

Same goes for the user operators below.
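
As a rough sketch of that traversal shape (the node type and collect_names are stand-ins for illustration, not frontend code):

```cpp
#include <string>
#include <vector>

// Stand-in for gfc_symtree: a binary tree node carrying a name.
struct node
{
  const char *name;
  node *left;
  node *right;
};

// Visit the current node exactly once, then recurse into both subtrees.
static void
collect_names (const node *n, std::vector<std::string> *out)
{
  if (n == nullptr)
    return;
  out->push_back (n->name);
  collect_names (n->left, out);
  collect_names (n->right, out);
}
```

Each node is pushed once, however deep the tree is, because only the recursion walks the children.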

> +
> +
> +/* Lookup function FN fuzzily, taking names in FUN into account.  */
> +
> +const char*
> +gfc_lookup_function_fuzzy (const char *fn, gfc_symtree *fun)
> +{
> +  auto_vec <const char *> candidates;
> +  lookup_function_fuzzy_find_candidates (fun, &candidates);

You have to start the lookup with the current namespace's sym_root (not 
with fun), otherwise you'll miss some candidates.
You may also want to query parent namespaces for host-associated symbols.
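
A minimal sketch of the namespace walk, assuming a simplified scope type (hypothetical; the real gfc_namespace keeps its symbols in the sym_root tree rather than in a flat list):

```cpp
#include <string>
#include <vector>

// Stand-in for gfc_namespace: names declared in this scope plus a link
// to the enclosing (host) scope.
struct scope
{
  std::vector<std::string> names;
  const scope *parent;
};

// Gather candidates from the current scope first, then from each host
// scope in turn.
static std::vector<std::string>
collect_visible_names (const scope *current)
{
  std::vector<std::string> out;
  for (const scope *s = current; s != nullptr; s = s->parent)
    out.insert (out.end (), s->names.begin (), s->names.end ());
  return out;
}
```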

> +
> +  /* Determine closest match.  */
> +  int i;
> +  const char *name, *best = NULL;
> +  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
> +

[...]

> diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c
> index ff9aff9..212f7d8 100644
> --- a/gcc/fortran/symbol.c
> +++ b/gcc/fortran/symbol.c
> @@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
>   #include "parse.h"
>   #include "match.h"
>   #include "constructor.h"
> +#include "spellcheck.h"
>
>
>   /* Strings for all symbol attributes.  We use these for dumping the
> @@ -235,6 +236,62 @@ gfc_get_default_type (const char *name, gfc_namespace *ns)
>   }
>
>
> +/* Recursively append candidate SYM to CANDIDATES.  */
> +
> +static void
> +lookup_symbol_fuzzy_find_candidates (gfc_symtree *sym,
> +				        vec<const char *> *candidates)
> +{
> +  gfc_symtree *p;
> +  for (p = sym->right; p; p = p->right)
> +    {
> +      lookup_symbol_fuzzy_find_candidates (p, candidates);
> +      if (p->n.sym->ts.type != BT_UNKNOWN)
> +	candidates->safe_push (p->name);
> +    }
> +  for (p = sym->left; p; p = p->left)
> +    {
> +      lookup_symbol_fuzzy_find_candidates (p, candidates);
> +      if (p->n.sym->ts.type != BT_UNKNOWN)
> +	candidates->safe_push (p->name);
> +    }
> +}

This looks the same as lookup_function_fuzzy_find_candidates, doesn't it?
Maybe have one general symbol traversal function with a selection callback
argument that tests whether the symbol is what you want, depending on the
context (is it a function? a subroutine? etc.).
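
One possible shape for such a helper, sketched with stand-in types (sym_node, its is_function flag, and collect_if are all hypothetical names):

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for a symtree node; is_function stands in for whatever
// attribute the real predicate would inspect.
struct sym_node
{
  std::string name;
  bool is_function;
  sym_node *left;
  sym_node *right;
};

// One traversal shared by all callers; WANTED decides which names qualify.
static void
collect_if (const sym_node *n,
            const std::function<bool (const sym_node &)> &wanted,
            std::vector<std::string> *out)
{
  if (n == nullptr)
    return;
  if (wanted (*n))
    out->push_back (n->name);
  collect_if (n->left, wanted, out);
  collect_if (n->right, wanted, out);
}
```

The function and variable lookups would then differ only in the predicate they pass in.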

> +
> +
> +/* Lookup symbol SYM fuzzily, taking names in SYMBOL into account.  */
> +
> +static const char*
> +lookup_symbol_fuzzy (const char *sym, gfc_symbol *symbol)
> +{
> +  auto_vec <const char *> candidates;
> +  lookup_symbol_fuzzy_find_candidates (symbol->ns->sym_root, &candidates);
> +
> +  /* Determine closest match.  */
> +  int i;
> +  const char *name, *best = NULL;
> +  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
> +
> +  FOR_EACH_VEC_ELT (candidates, i, name)
> +    {
> +      edit_distance_t dist = levenshtein_distance (sym, name);
> +      if (dist < best_distance)
> +	{
> +	  best_distance = dist;
> +	  best = name;
> +	}
> +    }
> +  /* If more than half of the letters were misspelled, the suggestion is
> +     likely to be meaningless.  */
> +  if (best)
> +    {
> +      unsigned int cutoff = MAX (strlen (sym), strlen (best)) / 2;
> +      if (best_distance > cutoff)
> +	return NULL;
> +    }
> +  return best;
> +}
> +
> +
>   /* Given a pointer to a symbol, set its type according to the first
>      letter of its name.  Fails if the letter in question has no default
>      type.  */
> @@ -253,8 +310,15 @@ gfc_set_default_type (gfc_symbol *sym, int error_flag, gfc_namespace *ns)
>       {
>         if (error_flag && !sym->attr.untyped)
>   	{
> -	  gfc_error ("Symbol %qs at %L has no IMPLICIT type",
> -		     sym->name, &sym->declared_at);
> +	  const char *guessed
> +	    = lookup_symbol_fuzzy (sym->name, sym);
> +	  if (guessed)
> +	    gfc_error ("Symbol %qs at %L has no IMPLICIT type"
> +		       "; did you mean %qs?",
> +		       sym->name, &sym->declared_at, guessed);
> +	  else
> +	    gfc_error ("Symbol %qs at %L has no IMPLICIT type",
> +		       sym->name, &sym->declared_at);
>   	  sym->attr.untyped = 1; /* Ensure we only give an error once.  */
>   	}
>
> @@ -2188,6 +2252,55 @@ bad:
>   }
>
>
> +/* Recursively append candidate COMPONENT structures to CANDIDATES.  */
> +
> +static void
> +lookup_component_fuzzy_find_candidates (gfc_component *component,
> +				        vec<const char *> *candidates)
> +{
> +  for (gfc_component *p = component; p; p = p->next)
> +    {
> +      if (00 && p->ts.type == BT_DERIVED)
> +	/* ??? There's no (suitable) DERIVED_TYPE which would come in
> +	   handy throughout the frontend; Use CLASS_DATA here for brevity.  */
> +	lookup_component_fuzzy_find_candidates (CLASS_DATA (p), candidates);

I don't understand what you are trying to do here.
Are you trying to handle type extension?  If so, I guess you would have to
pass the derived type symbol instead of its components, and use
gfc_get_derived_super_type to retrieve the parent type.
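
Roughly, and only as a sketch with a made-up derived_type struct (its super pointer stands in for whatever gfc_get_derived_super_type would return; this is not the frontend's actual data layout):

```cpp
#include <string>
#include <vector>

// Stand-in for a derived type: its own component names plus a pointer to
// the type it extends (null when it extends nothing).
struct derived_type
{
  std::vector<std::string> components;
  const derived_type *super;
};

// Collect the component names of TYPE and of every ancestor type.
static std::vector<std::string>
collect_components (const derived_type *type)
{
  std::vector<std::string> out;
  for (const derived_type *t = type; t != nullptr; t = t->super)
    for (const std::string &c : t->components)
      out.push_back (c);
  return out;
}
```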

Mikael
