From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 3245 invoked by alias); 31 Aug 2014 06:48:17 -0000 Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-bugs-owner@gcc.gnu.org Received: (qmail 3144 invoked by uid 48); 31 Aug 2014 06:48:06 -0000 From: "jmichae3 at yahoo dot com" To: gcc-bugs@gcc.gnu.org Subject: [Bug libstdc++/62318] New: optimization of string searches using Nigel Horspool's Boyer-Moore-Horpool algorithm Date: Sun, 31 Aug 2014 06:48:00 -0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: libstdc++ X-Bugzilla-Version: 4.9.1 X-Bugzilla-Keywords: X-Bugzilla-Severity: enhancement X-Bugzilla-Who: jmichae3 at yahoo dot com X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-SW-Source: 2014-08/txt/msg02057.txt.bz2 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62318 Bug ID: 62318 Summary: optimization of string searches using Nigel Horspool's Boyer-Moore-Horpool algorithm Product: gcc Version: 4.9.1 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: jmichae3 at yahoo dot com optimization of string searches using Nigel Horspool's Boyer-Moore-Horpool algorithm is what I would like to see. right now it's just a plain old, slow O(n) loop. in just about any OS, at least half of the funcrtions are string functions. the C library should be modified as well, so should the same in objc and objc++ and whatever compiler in GNU compiler set you can put it in. in C libraries I typically see simple while loops. this can be done much better with only a little more code, the algorithm is very short. this could be sped WAY up using this algorithm. google Nigel Horspool and see if you can get permission to use his algorithm. this could apply to 's std::find(), std::string.find() and you would not need to change the function signature. there are probably other uses for this algorithm I am not thinking of right now, but if you put your thinking cap on to see where this could fit, I am sure you could find some choice places for it. "1980 Practical Fast Searching in Strings" under Publications http://webhome.cs.uvic.ca/~nigelh/ if you are compiling an OS or OS kernel with this, I would suspect you would see at least a 2-3x+ general performance improvement, maybe a lot more (though I have done no testing) due to the fact that most every OS API call has some sort of string parameter in it. Donald Knuth's The Art Of Computer Programming volumes 1-5 has quite an array of algorithms, and it's (as far as I know) a well known set that's been around for years.