From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <1448992708.8490.29.camel@surprise>
Subject: Re: [PATCH] RFC: Use Levenshtein spelling suggestions in Fortran FE
From: David Malcolm
To: Bernhard Reutner-Fischer
Cc: gfortran, GCC Patches
Date: Tue, 01 Dec 2015 17:58:00 -0000
References: <1448974501-30981-1-git-send-email-rep.dot.nop@gmail.com> <1448974501-30981-4-git-send-email-rep.dot.nop@gmail.com> <1448990880.8490.24.camel@surprise>

On Tue, 2015-12-01 at 18:51 +0100, Bernhard Reutner-Fischer wrote:
> On 1 December 2015 at 18:28, David Malcolm wrote:
> > On Tue, 2015-12-01 at 13:55 +0100, Bernhard Reutner-Fischer wrote:
> >
> >> +/* Lookup function FN fuzzily, taking names in FUN into account.  */
> >> +
> >> +const char*
> >> +gfc_lookup_function_fuzzy (const char *fn, gfc_symtree *fun)
> >> +{
> >> +  auto_vec <const char *> candidates;
> >> +  lookup_function_fuzzy_find_candidates (fun, &candidates);
> >> +
> >> +  /* Determine closest match.  */
> >> +  int i;
> >> +  const char *name, *best = NULL;
> >> +  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
> >> +
> >> +  FOR_EACH_VEC_ELT (candidates, i, name)
> >> +    {
> >> +      edit_distance_t dist = levenshtein_distance (fn, name);
> >> +      if (dist < best_distance)
> >> +        {
> >> +          best_distance = dist;
> >> +          best = name;
> >> +        }
> >> +    }
> >> +  /* If more than half of the letters were misspelled, the suggestion is
> >> +     likely to be meaningless.  */
> >> +  if (best)
> >> +    {
> >> +      unsigned int cutoff = MAX (strlen (fn), strlen (best)) / 2;
> >> +      if (best_distance > cutoff)
> >> +        return NULL;
> >> +    }
> >> +  return best;
> >> +}
> >
> >
> > Caveat: I'm not very familiar with the Fortran FE, so take the following
> > with a pinch of salt.
> >
> > If I'm reading things right, here, and in various other places, you're
> > building a vec of const char *, and then seeing which one of those
> > candidates is the best match for another const char *.
> >
> > You could simplify things by adding a helper function to spellcheck.h,
> > akin to this one:
> >
> > extern tree
> > find_closest_identifier (tree target, const auto_vec<tree> *candidates);
>
> I was hoping for ipa-icf to fix that up on my behalf. I'll try to see
> if it does. Short of that: yes, should do that.

I was more thinking about code readability; don't rely on ipa-icf - fix
it in the source.

> > This would reduce the amount of duplication in the patch (and slightly
> > reduce the amount of C++).
>
> As said, we could as well use a list of candidates with NULL as record marker.
> Implementation cosmetics. Steve seems to not be thrilled by the
> overall idea in the first place, so unless there is clear support by
> somebody else i won't pursue this any further, it's not that i'm bored
> or ran out of stuff i should do.. ;)

(FWIW I liked the idea, but I'm not a Fortran person so my opinion
counts much less than Steve's)

> > [are there IDENTIFIER nodes in the Fortran FE, or is it all const char
> > *?  this would avoid some strlen calls]
>
> Right, but in the Fortran FE these are const char*.
>
> thanks for your comments!
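
To make the helper suggestion concrete, here is a rough standalone sketch of
what a const char * counterpart to find_closest_identifier might look like.
Everything in it is an assumption for illustration: the name
find_closest_string is hypothetical, std::vector stands in for auto_vec, and
the levenshtein function is a plain textbook implementation rather than the
one in spellcheck.c. The half-length cutoff is taken from the quoted patch.

```cpp
// Hypothetical sketch, not GCC code: std::vector stands in for auto_vec,
// and find_closest_string is an assumed name.
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Textbook Levenshtein edit distance between two NUL-terminated strings,
// keeping only one rolling row of the dynamic-programming table.
std::size_t
levenshtein (const char *s, const char *t)
{
  const std::size_t len_s = std::strlen (s);
  const std::size_t len_t = std::strlen (t);
  std::vector<std::size_t> row (len_t + 1);
  for (std::size_t j = 0; j <= len_t; j++)
    row[j] = j;
  for (std::size_t i = 1; i <= len_s; i++)
    {
      std::size_t diag = row[0];  // Cell (i-1, j-1) for the first column.
      row[0] = i;
      for (std::size_t j = 1; j <= len_t; j++)
        {
          const std::size_t above = row[j];  // Cell (i-1, j).
          const std::size_t subst = diag + (s[i - 1] != t[j - 1] ? 1 : 0);
          row[j] = std::min ({ above + 1, row[j - 1] + 1, subst });
          diag = above;
        }
    }
  return row[len_t];
}

// Return the candidate closest to TARGET, or nullptr if there are no
// candidates or the best distance exceeds half the length of the longer
// of the two strings (the cutoff used in the quoted patch).
const char *
find_closest_string (const char *target,
                     const std::vector<const char *> &candidates)
{
  const char *best = nullptr;
  std::size_t best_distance = 0;
  for (const char *name : candidates)
    {
      const std::size_t dist = levenshtein (target, name);
      if (!best || dist < best_distance)
        {
          best_distance = dist;
          best = name;
        }
    }
  if (best)
    {
      const std::size_t cutoff
        = std::max (std::strlen (target), std::strlen (best)) / 2;
      if (best_distance > cutoff)
        return nullptr;
    }
  return best;
}
```

With something along these lines, each fuzzy-lookup routine would only need
to collect its candidate names and make one call, instead of repeating the
loop and the cutoff check.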