From: Cedric Blancher
Date: Wed, 3 Apr 2024 08:44:00 +0200
Subject: Re: Cygwin&Win32 file prefetch, block sizes?
To: cygwin@cygwin.com

On Wed, 3 Apr 2024 at 03:10, Mark Geisert via Cygwin wrote:
>
> On 4/2/2024 3:35 PM, Martin Wege via Cygwin wrote:
> > On Tue, Apr 2, 2024 at 3:17 PM Corinna Vinschen via Cygwin
> > wrote:
> >>
> >> On Apr 2 02:04, Martin Wege via Cygwin wrote:
> >>> Hello,
> >>>
> >>> Is there any document which describes how Cygwin and Win32 file
> >>> prefetch and readahead work, and which sizes are used (e.g. always
> >>> read one full page even if only 16 bytes are requested?)?
> >>
> >> I'm not aware of any docs, but again, keep in mind that Cygwin is a
> >> userspace DLL. We basically do what Windows does for low-level file
> >> access.
> >>
> >>> Quick /usr/bin/stat /etc/profile returns "IO Block: 65536". Does that
> >>> mean the file's block size is really 64k? Is this info per filesystem,
> >>> or hardcoded in Cygwin?
> >>
> >> Hardcoded in Cygwin since 2017, based on a discussion in terms of
> >> file access performance, especially when using stdio.h functions:
> >>
> >> https://cygwin.com/cgit/newlib-cygwin/commit/?id=7bef7db5ccd9c
> >
> > OUCH.
> >
> > While I can understand the motivation, FAT32 on multi-GB-devices
> > having 64k block size, and Win32 API on Win95/98/ME/Win7 being
> > optimized to that insane block size, it is absolutely WRONG with
> > today's NTFS and even more so with ReFS. This only works if you stream
> > files, but as soon as you are doing random read/writes the performance
> > is terrible due to cache thrashing. That could explain the many
> > complaints about Cygwin's IO performance.
>
> No comment.
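
For anyone who wants to check the "IO Block" value from a script rather than via /usr/bin/stat, here is a minimal sketch. It assumes a Python build that exposes st_blksize (Cygwin and Linux builds do, native Windows builds do not), and the helper name preferred_io_block_size is made up for this example. It only reports what stat() returns as the preferred I/O size; it says nothing about the filesystem's real cluster size.

```python
import os


def preferred_io_block_size(path="."):
    """Return st_blksize for `path`, the value stat(1) prints as "IO Block".

    Hypothetical helper for illustration. st_blksize is the preferred
    I/O size reported by stat(), not the on-disk cluster size.
    """
    return os.stat(path).st_blksize


if __name__ == "__main__":
    # /etc/profile is the file used in the thread; fall back to the
    # current directory if it does not exist on this system.
    target = "/etc/profile" if os.path.exists("/etc/profile") else "."
    print(target, preferred_io_block_size(target))
```

Going by the stat output quoted above, a Cygwin run should print 65536 for /etc/profile; other systems will report whatever their stat() returns.
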
>
> > So, what can be done? I'm not a benchmarking guru, so I'd like to
> > propose to add a tunable called EXPERIMENTAL_PREFERRED_IO_BLKSIZE to
> > the CYGWIN env variable (marked as "experimental"), so the
> > benchmarking guys can do performance testing without recompiling
> > everything, get perf results for Cygwin 3.6, and decide what to do for
> > Cygwin 3.7.
>
> That kind of experiment is what folks who can build their own
> cygwin1.dll might do. I doubt we'd want to make a run-time global disk
> I/O strategy changer available like this, even temporarily.

Realistically, that would mean Cygwin stays stuck with an insane IO
block size forever. Building cygwin1.dll requires specialised knowledge
and TIME, and no manager will spend a performance engineer's time on
producing custom binaries.
Cygwin 3.6 is in development right now, so it would be better to add
such a knob now, so that performance engineers can just grab those
binaries and benchmark with them.

BTW: A block size of 64k is CLEARLY harming performance. Have a look at
https://www.zabkat.com/blog/buffered-disk-access.htm
The sweet spot is somewhere between 16k and 32k, and for SMB even below
that. 64k is clearly on the back side of the curve and actively harms
performance, except for "linear reads".

>
> What could make sense is enhancing Cygwin's posix_fadvise() to support
> POSIX_FADV_RANDOM getting mapped to Windows' FILE_RANDOM_ACCESS flag.
> Something like this is currently done for POSIX_FADV_SEQUENTIAL ->
> FILE_SEQUENTIAL_ONLY. These are per-filedescriptor adjustments and due
> to Windows limitations would apply to a whole file rather than having
> the POSIX behavior of being settable for a byte range within a file.

Nope. A per-descriptor hint does not help here, because we are talking
about a sensible default for all applications, and a block size of 64k
is HARMFUL, except on FAT32, where the filesystem block size is already
64k for multi-gigabyte disks.
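
To make the per-descriptor approach from Mark's suggestion concrete, here is a rough sketch of what an application would have to do itself: hint random access with posix_fadvise() and then read fixed-size blocks at random aligned offsets. The function name read_random_blocks and its parameters are made up for this example. The hint is purely advisory, and per the quoted text Cygwin does not currently map POSIX_FADV_RANDOM to anything, so the sketch tolerates the call being missing or refused.

```python
import os
import random


def read_random_blocks(path, block_size, count, seed=0):
    """Read `count` blocks of `block_size` bytes at random aligned offsets.

    Hypothetical helper for illustration. Returns the total number of
    bytes read, so callers can time it for different block sizes.
    """
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        # Advisory hint for the whole file (offset 0, length 0).
        # Guarded because not every platform exposes or honours it.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
            except OSError:
                pass  # hint refused; the reads below still work
        size = os.fstat(fd).st_size
        blocks = max(1, size // block_size)
        total = 0
        for _ in range(count):
            offset = rng.randrange(blocks) * block_size
            total += len(os.pread(fd, block_size, offset))
        return total
    finally:
        os.close(fd)
```

Wrapping calls such as read_random_blocks(path, 16384, n) and read_random_blocks(path, 65536, n) in time.perf_counter() would give a crude comparison, but note that this has to be done in every application, which is exactly why it is no substitute for a sensible default.
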

Ced
-- 
Cedric Blancher
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur