From: Dan Shelton
Date: Mon, 18 Dec 2023 07:22:24 +0100
Subject: Re: Catastrophic Cygwin find . -ls, grep performance on samba share compared to WSL&Linux
To: cygwin@cygwin.com

On Wed, 6 Dec 2023 at 05:08, Dan Shelton wrote:
>
> Hello!
> I am unhappy to report a severe performance issue with find -ls, ls -R
> and grep -r, with Cygwin 3.4.9 and Cygwin 3.5.0 when samba shares are
> involved.
>
> Imagine a directory with 256 subdirs, each containing 256 files,
> all on a samba share; the samba server is on Linux with tmpfs.
>
> mkdir dir1
> for ((i=0;i<256;i++)) ; do
>     mkdir "dir1/subdir$i"
>     for ((j=0; j < 256;j++));do
>         echo "j=$j" >"dir1/subdir$i/j$j.txt"
>     done
> done
>
> Timing comparisons then show a dramatic difference between Debian Linux
> accessing the samba share, WSL accessing the samba share, and Cygwin
> accessing the samba share:
> 1. time find . >/dev/null
>    Cygwin 86 seconds
>    WSL    23 seconds
>    Debian 19 seconds
>
> 2. time find . -ls >/dev/null
>    Cygwin 129 seconds
>    WSL     38 seconds
>    Debian  32 seconds
>
> 3. time grep -r -E NOMATCH 2>/dev/null
>    Cygwin 390 seconds
>    WSL    144 seconds
>    Debian 141 seconds
>
> So where does the bad Cygwin performance come from? Virus checker,
> memory compression and other Windows services known to interfere with
> benchmarking are OFF.
>
> But the network trace shows a dramatic difference: While Debian and
> WSL open files only once, the Cygwin run spends lots of network
> traffic checking whether the txt files are txt.lnk, txt,bat.lnk and so
> on, all non-existent files.
>
> Why does that happen?

It would be nice if one of the Cygwin authors could help me figure
out why this happens.

My working theory is that the extra file and directory lookups are
for soft- and hardlink emulation on file systems which do not have
native soft- or hardlinks.

If this is correct, then a fix might be to:
1) Determine the filesystem type (cached for the lifetime of the
   process, in the absence of /etc/mnttab) and its boundaries (mount
   point, and whether other mount points are below it)
2) Use the emulation only for FAT filesystems; for NTFS, ReFS and
   SMBFS, use the native filesystem links.

Help!

Dan
-- 
Dan Shelton - Cluster Specialist Win/Lin/Bsd
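
P.S. For anyone who wants to reproduce this without waiting for the
full 256x256 tree, here is a minimal sketch of the test setup quoted
above with the sizes as parameters. The function names make_tree and
run_benchmarks are made up for this mail, and I added an explicit "."
to the grep call; everything else follows the commands quoted above.
Treat it as a starting point, not a calibrated benchmark.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name invented for this sketch): build a tree of
# NDIRS subdirectories under ROOT, each holding NFILES one-line files.
make_tree() {
    local root=$1 ndirs=$2 nfiles=$3
    local i=0 j
    mkdir -p "$root"
    while [ "$i" -lt "$ndirs" ]; do
        mkdir -p "$root/subdir$i"
        j=0
        while [ "$j" -lt "$nfiles" ]; do
            echo "j=$j" > "$root/subdir$i/j$j.txt"
            j=$((j + 1))
        done
        i=$((i + 1))
    done
}

# Hypothetical helper: time the three traversals from the report.
# Runs in a subshell so the caller's working directory is unchanged.
run_benchmarks() {
    local root=$1
    (
        cd "$root" || exit 1
        time find . > /dev/null
        time find . -ls > /dev/null
        # grep exits non-zero when nothing matches, which is expected here
        time grep -r -E NOMATCH . > /dev/null 2>&1 || true
    )
}
```

Usage, run from a directory on the samba share: first
make_tree dir1 256 256, then run_benchmarks dir1. Smaller values such
as 16 16 should be enough to see the extra lookups in a network trace.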