From: Marco Mason
Date: Sun, 30 Sep 2018 18:41:00 -0000
Subject: Filesystem enumeration performance improvement
To: cygwin@cygwin.com

I recently upgraded from cygwin v2.10 to v2.11.1 and noticed that one of my programs got a tremendous speed boost. It's a custom filesystem enumeration program whose output I feed to frcode to update the /var/locatedb database.
It used to take quite a while (roughly 15-20 minutes) and now runs in about a minute. Since the program still seems to work correctly, just many times faster, I'm rather happy with the change.

The reason I'm writing is that I don't see *why* the timing should have changed at all! I have my own file enumerator for locatedb because the original went through the POSIX layer and was pretty slow, especially for remote mounts. As I only needed enough for locate, I wrote my own enumerator against the Windows API for speed.

Since my loop is essentially just FindFirstFile/FindNextFile and printf(), I don't know why file gathering would be any faster. So either printf() has gotten remarkably faster, or there are some interactions between Cygwin and Windows in the file enumeration area that are surprising me.

Can someone please clue me in to what might be causing the speed increase?

Looking at the git log and mailing list history, my best guess is that it's related to the email threads "Why does readdir() open files ?" (Ben Rubson 2018-03-28) and "Why does (stat() ?) open files ?" (Ben Rubson 2018-04-09). However, I can't seem to pin down which git commits are relevant to those threads.

If anyone can provide a little insight, I'd really appreciate it.

--marco
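
One way to tell the two hypotheses in the message apart (faster printf() versus faster enumeration) is to time the two phases separately. The following is only a minimal sketch, not the original program: it is written in Python rather than C, and the names `walk` and `time_phases` are hypothetical. It relies on `os.scandir`, which on a native Windows CPython build is implemented with FindFirstFileW/FindNextFileW; under Cygwin's own Python the same call goes through the POSIX layer, so it would measure a different code path than the one described above.

```python
import os
import time


def walk(root):
    """Yield the path of every entry under root without following symlinks."""
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    yield entry.path
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable directory (permissions, vanished mount): skip it.
            continue


def time_phases(root, out):
    """Return (entry_count, enumerate_seconds, write_seconds) for root.

    Enumeration and output are timed separately so that a change in one
    phase is not mistaken for a change in the other.
    """
    start = time.perf_counter()
    paths = list(walk(root))          # phase 1: directory enumeration only
    enumerated = time.perf_counter()
    for path in paths:                # phase 2: output only
        out.write(path + "\n")
    out.flush()
    written = time.perf_counter()
    return len(paths), enumerated - start, written - enumerated
```

Calling `time_phases(some_root, sys.stdout)` under both the old and the new Cygwin release, and comparing the two durations, would show which phase actually changed. Collecting every path in a list first costs memory on large trees; that trade-off is made here only to keep the two timings cleanly separated.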