public inbox for cygwin@cygwin.com
* gawk: Bad File Descriptor error with concurrent readonly access to a network file
@ 2015-09-25 16:31 Vermessung AVT - Wolfgang Rieger
  2015-09-25 16:59 ` Marco Atzeri
  2015-10-27 13:21 ` Corinna Vinschen
  0 siblings, 2 replies; 11+ messages in thread
From: Vermessung AVT - Wolfgang Rieger @ 2015-09-25 16:31 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 7046 bytes --]

We process thousands of tiles, each going through the same time-consuming tasks. We use a multi-core Windows 7 workstation and run several tiles simultaneously in separate shell windows (parallel processing). A batch script controls the workflow, with gawk interpreting a number of setup / definition files at run time for each tile / working step. From time to time we get "Bad File Descriptor" errors in gawk (and also in, e.g., cat, head, tail) when accessing these setup files, which are only ever read. The full error line looks similar to:

(With job.awk and first access to datafile.txt at gawk source line 31:)
   "gawk: job.awk:31: fatal: error reading input file `datafile.txt': Bad file descriptor"
(With inline gawk scripts typically:)
   "gawk: fatal: error reading input file `datafile.txt': Bad file descriptor"
(With something like "cat datafile.txt > destination":)
   "cat: datafile.txt: Bad file descriptor"

We use the MS Windows shell cmd.exe, with batch scripts executing gawk and the other commands.
I tried to use gawk's BEGINFILE rule to trap that error. However, the BEGINFILE block is never entered; instead, gawk immediately aborts with the "Bad File Descriptor" error.

I found nothing helpful on the web about this. Several updates to the latest versions over the last few years brought no change in this behaviour.

Isolating and tracking down the problem with the test case included below I found out:

1) Concurrent read access to the setup files was possible and worked fine with local files (24 hrs testing with millions of file accesses in 4 parallel jobs).
2) However, when the file to be read (datafile.txt) is stored on a network share on a file server - which is the case in our working environment - the error could be reproduced. The number of Bad file descriptor errors seems to be related to the work load at the server where the file resides.
3) The MS copy command shows no such error, even with network files. So we can substitute the cat's by copy's. For gawk, however, there is no shell alternative.

It looks like there is a small time frame in opening files when the server file is non-accessible to other processes. If a parallel job happens to access the same file within that short time period while another process is opening it, the "Bad File Descriptor" error is thrown.
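
Since the error appears to be transient, one possible stopgap is to rerun a command that failed in this way. The following is a minimal Python sketch of that idea, not part of the test case: the helper name run_with_retry and its parameters are hypothetical, and whether a retry actually succeeds on the affected share is an untested assumption.

```python
import subprocess
import time

def run_with_retry(cmd, attempts=3, delay=0.5):
    """Run cmd; rerun it only if it failed and stderr mentions 'Bad file descriptor'.

    Hypothetical helper: the retry count and delay are arbitrary choices.
    """
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # Stop on success, or on any failure that is not the error in question.
        if proc.returncode == 0 or "Bad file descriptor" not in proc.stderr:
            return proc
        if attempt < attempts:
            time.sleep(delay)
    # All attempts failed with the same error; hand back the last result.
    return proc
```

A call such as run_with_retry(["gawk", "-f", "job.awk", "datafile.txt"]) could then stand in for the direct gawk call; the returned object carries returncode, stdout and stderr.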


I would at least expect such a file opening error to be passed to a BEGINFILE rule in gawk (as included in the test example); but really I had hoped that Cygwin could cope with these situations.
Microsoft is obviously able to cope with them (if it is a concurrent file access problem, which I am sure is the case), since with copy instead of cat (or gawk) I have never experienced such access problems.




Here is the test case I have used. It consists of 3 files:

datafile.txt              A data file filled with dummy content.
chkParallelError.bat      The control job, which has to be started in a cmd.exe shell window. It takes 2 optional parameters:
    1. The data file name, which defaults to datafile.txt (add a path if the file is stored in a different directory, e.g. on a network share).
    2. The number of parallel jobs to run. It defaults to the cmd.exe variable NUMBER_OF_PROCESSORS, or to 4 if that is not set. It should be chosen in accordance with the number of cores available (e.g. not exceeding 2 * number of processors).
    The syntax is the cmd.exe shell syntax of MS Windows 7.
chkParallelErrorJob.bat   The job itself. It is started in separate shell windows (cmd.exe), as many times as the number of parallel jobs given. It contains 3 calls that are currently all commented out with rem, namely gawk, cat, and copy (the MS Windows cmd.exe command). To run one of them, remove the respective rem.

Each job creates a logfile "chkParallelError_1.log", "chkParallelError_2.log", etc. in the local directory. The output (stderr and stdout) is redirected to it and can be searched for "Bad". Part of the output is also shown in the shell windows. In my environment I saw roughly one "Bad file descriptor" error every 5 seconds to 10 minutes; it may be necessary to use a server that is under some load.
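
Searching the logfiles for the error can be automated. The following is a minimal Python sketch, not part of the test case: the function name count_bad_fd is hypothetical, and it assumes the logfile naming described above (chkParallelError_<instance>.log) and the error text quoted earlier.

```python
import glob
import os
import re

# Matches the error text quoted in this report, case-insensitively.
BAD_FD = re.compile(r"bad file descriptor", re.IGNORECASE)

def count_bad_fd(log_dir="."):
    """Return {logfile name: number of lines mentioning 'Bad file descriptor'}."""
    counts = {}
    pattern = os.path.join(log_dir, "chkParallelError_*.log")
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with open(path, errors="replace") as fh:
            counts[os.path.basename(path)] = sum(
                1 for line in fh if BAD_FD.search(line)
            )
    return counts

if __name__ == "__main__":
    for name, n in count_bad_fd().items():
        print(f"{name}: {n}")
```

Run in the directory that holds the logfiles, it prints one count per instance, which makes it easier to compare error rates between runs.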

Remark 1: When running a similar test case with a gawk script file instead of inline source, I occasionally got "Bad File Descriptor" errors even when gawk accessed the script source itself, if that script was stored on the network share as well. With the gawk script stored locally, that error did not occur during 24 hrs of testing.

Remark 2: Attached cygcheck_150924.out was edited: Deleted several shell symbols, computer names, network shares, etc.

Remark 3: Since screen output is given for each call to gawk, it may be helpful to minimize the shell windows that popped up (parallel processes) in order to speed up the process so that the errors become more frequent.

Remark 4: For the time being we have a workaround, "copy"-ing the setup files to a local SSD in separate directories for each parallel process before access by any Cygwin tool which is not convenient but at least works.
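
The workaround from Remark 4 can be sketched as follows. This is a minimal Python illustration, not the batch code actually in use: the function name stage_locally, the directory name chkParallel_<instance> and the default local root are all hypothetical. It also assumes that a copy made this way is as reliable on the share as the cmd.exe copy command, which has not been tested here.

```python
import os
import shutil
import tempfile

def stage_locally(src, instance, local_root=None):
    """Copy src into a per-instance local directory and return the new path.

    Hypothetical helper: each parallel job gets its own directory, so no
    two jobs ever read the same staged file.
    """
    root = local_root or tempfile.gettempdir()
    dest_dir = os.path.join(root, f"chkParallel_{instance}")
    os.makedirs(dest_dir, exist_ok=True)
    # shutil.copy2 copies the content (and timestamps where possible) and
    # returns the path of the file created inside dest_dir.
    return shutil.copy2(src, dest_dir)
```

Each job would call stage_locally once per setup file and pass the returned local path to gawk or cat instead of the network path.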


=== datafile.txt: ==============================
This is line 1
This is line 2
This is line 3
================================================
=== chkParallelError.bat =======================
@setlocal
@echo off
    set lis=%1
    set njobs=%2
    if "%lis%"   == "" set lis=datafile.txt
    if "%njobs%" == "" set njobs=%NUMBER_OF_PROCESSORS%
    if "%njobs%" == "" set njobs=4
    for /L %%I in (1,1,%njobs%) do echo start chkParallelErrorJob %%I %lis%&start chkParallelErrorJob %%I %lis%
================================================
=== chkParallelErrorJob.bat ====================
@setlocal
@echo off
    set instance=%1
    set lis=%~dpnx2
    echo instance=%instance% lis=%lis%
    set n=0
rem Loop endlessly calling a gawk script that simply counts the lines
rem Write stdout and stderr to a logfile
rem After each call, write timestamp and number of call to logfile and stdout.
:loop
    set/a n=n+1
rem !!! Clear one of the following rems in order to activate that particular command !!!
rem     gawk 'BEGINFILE{if(ERRNO)print "Trapped error",ERRNO,"opening file";}{n++}END{print "%date% %time% call %n%",n,"entries read"}' %lis%   >> chkParallelError_%instance%.log 2>&1
rem     cat  %lis% > %lis%_%instance%                  2>> chkParallelError_%instance%.log
rem     copy %lis%   %lis%_%instance%                   >> chkParallelError_%instance%.log 2>&1
rem In case of error write a note and the (last line of the) error message
    if %errorlevel% neq 0 echo Error: %errorlevel%&tail -1 chkParallelError_%instance%.log
rem Write timestamp and count mark to logfile and screen
    echo %date% %time% Instance %instance% Call %n% >> chkParallelError_%instance%.log 2>&1
    echo %date% %time% Instance %instance% Call %n% copy
    goto loop
================================================

Kind regards,
Wolfgang


[-- Attachment #2: cygcheck_150924.out --]
[-- Type: application/octet-stream, Size: 15222 bytes --]


Cygwin Configuration Diagnostics
Current System Time: Fri Sep 25 17:11:04 2015

Windows 7 Professional Ver 6.1 Build 7601 Service Pack 1

Running under WOW64 on AMD64

Path:	I:\Programme\cygwin_150924\bin
	C:\Windows\system32
	C:\Windows
	C:\Windows\System32\Wbem
	C:\Windows\System32\WindowsPowerShell\v1.0\
	C:\Program Files (x86)\Microsoft SQL Server\90\Tools\binn\
	I:\Software\uti\Cygwin-Tests\gawk-Bad_File_Descriptor

Output from I:\Programme\cygwin_150924\bin\id.exe
544(Administratoren)
545(Benutzer)
4(INTERAKTIV)
66049(KONSOLENANMELDUNG)
11(Authentifizierte Benutzer)
15(Diese Organisation)
4095(CurrentSession)
66048(LOKAL)
70145(Authentication authority asserted identity)
405504(Hohe Verbindlichkeitsstufe)

SysDir: C:\Windows\system32
WinDir: C:\Windows

CYGWIN = 'nodosfilewarning'
Path = 'I:\Programme\cygwin_150924\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;c:\Program Files (x86)\Microsoft SQL Server\90\Tools\binn\;I:\Software\uti\Cygwin-Tests\gawk-Bad_File_Descriptor'

ALLUSERSPROFILE = 'C:\ProgramData'
APPDATA = 'C:\Users\User\AppData\Roaming'
AWKPATH = '/cygdrive/i/Software/awkuti/'
CommonProgramFiles = 'C:\Program Files (x86)\Common Files'
CommonProgramFiles(x86) = 'C:\Program Files (x86)\Common Files'
CommonProgramW6432 = 'C:\Program Files\Common Files'
COMPUTERNAME = 'Computer'
ComSpec = 'C:\Windows\system32\cmd.exe'
FP_NO_HOST_CHECK = 'NO'
HOMEDRIVE = 'C:'
HOMEPATH = '\Users\User'
LOCALAPPDATA = 'C:\Users\User\AppData\Local'
LOGONSERVER = '\\SERVER'
MOZ_PLUGIN_PATH = 'C:\Program Files\Tracker Software\PDF Viewer\Win32'
NUMBER_OF_PROCESSORS = '2'
OS = 'Windows_NT'
PATHEXT = '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC'
PROCESSOR_ARCHITECTURE = 'x86'
PROCESSOR_ARCHITEW6432 = 'AMD64'
PROCESSOR_IDENTIFIER = 'Intel64 Family 6 Model 26 Stepping 5, GenuineIntel'
PROCESSOR_LEVEL = '6'
PROCESSOR_REVISION = '1a05'
ProgramData = 'C:\ProgramData'
ProgramFiles = 'C:\Program Files (x86)'
ProgramFiles(x86) = 'C:\Program Files (x86)'
ProgramW6432 = 'C:\Program Files'
PROMPT = '$D $T $P$+$G '
PSModulePath = 'C:\Windows\system32\WindowsPowerShell\v1.0\Modules\'
PUBLIC = 'C:\Users\Public'
SESSIONNAME = 'Console'
SystemDrive = 'C:'
SystemRoot = 'C:\Windows'
TEMP = 'C:\Users\User\AppData\Local\Temp'
TMP = 'C:\Users\User\AppData\Local\Temp'
TVT = 'C:\Program Files (x86)\Lenovo'
USERDNSDOMAIN = 'AVT-WIEN.LOCAL'
USERDOMAIN = 'AVT-WIEN'
USERNAME = 'User'
USERPROFILE = 'C:\Users\User'
VS100COMNTOOLS = 'C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools\'
VS90COMNTOOLS = 'c:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\Tools\'
windir = 'C:\Windows'
WindowsSysBits = '64'
WindowsSysName = 'Windows 7 Professional'
WindowsVersion = '6.1'

HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin\mounts v2
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin\Program Options
HKEY_CURRENT_USER\Software\Cygwin
HKEY_CURRENT_USER\Software\Cygwin\Program Options
HKEY_CURRENT_USER\Software\Cygwin\setup
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin\mounts v2
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin\Program Options
HKEY_LOCAL_MACHINE\SOFTWARE\Cygwin
HKEY_LOCAL_MACHINE\SOFTWARE\Cygwin\Installations
  (default) = '\??\C:\User_Programme\Cygwin'
  73383fbec20f1541 = '\??\I:\Programme\cygwin'
  4d833d62f3bc3e43 = '\??\I:\Programme\cygwin_150924'
HKEY_LOCAL_MACHINE\SOFTWARE\Cygwin\Program Options
HKEY_LOCAL_MACHINE\SOFTWARE\Cygwin\setup
  (default) = 'I:\Programme\cygwin_150924'

obcaseinsensitive set to 1

Cygwin installations found in the registry:
  System: Key: 35f4e990f06bd1c5 Path: C:\User_Programme\Cygwin
  System: Key: 73383fbec20f1541 Path: I:\Programme\cygwin
  System: Key: 4d833d62f3bc3e43 Path: I:\Programme\cygwin_150924

c:  hd  NTFS    465737Mb  30% CP CS UN PA FC     Windows7_OS
d:  hd  NTFS      9999Mb  65% CP CS UN PA FC     Lenovo_Recovery
e:  cd             N/A    N/A                    
i:  net NTFS   1604909Mb  11% CP CS UN PA FC     Daten
l:  net NTFS   8580073Mb  48% CP CS UN PA FC     Storage2

I:\Programme\cygwin_150924      /          system  binary,auto
I:\Programme\cygwin_150924\bin  /usr/bin   system  binary,auto
I:\Programme\cygwin_150924\lib  /usr/lib   system  binary,auto
cygdrive prefix                 /cygdrive  user    binary,posix=0,auto

Found: I:\Programme\cygwin_150924\bin\awk
 -> I:\Programme\cygwin_150924\bin\gawk.exe
Found: I:\Programme\cygwin_150924\bin\bash.exe
Found: I:\Programme\cygwin_150924\bin\cat.exe
Found: I:\Software\uti\Cygwin-Tests\gawk-Bad_File_Descriptor\cat.exe
Warning: I:\Programme\cygwin_150924\bin\cat.exe hides I:\Software\uti\Cygwin-Tests\gawk-Bad_File_Descriptor\cat.exe
Found: I:\Programme\cygwin_150924\bin\cp.exe
Not Found: cpp (good!)
Not Found: crontab
Found: I:\Programme\cygwin_150924\bin\find.exe
Found: C:\Windows\system32\find.exe
Warning: I:\Programme\cygwin_150924\bin\find.exe hides C:\Windows\system32\find.exe
Not Found: gcc
Not Found: gdb
Found: I:\Programme\cygwin_150924\bin\grep.exe
Found: I:\Programme\cygwin_150924\bin\kill.exe
Not Found: ld
Found: I:\Programme\cygwin_150924\bin\ls.exe
Not Found: make
Found: I:\Programme\cygwin_150924\bin\mv.exe
Not Found: patch
Not Found: perl
Found: I:\Programme\cygwin_150924\bin\rm.exe
Found: I:\Programme\cygwin_150924\bin\sed.exe
Not Found: ssh
Found: I:\Programme\cygwin_150924\bin\sh.exe
Found: I:\Programme\cygwin_150924\bin\tar.exe
Found: I:\Programme\cygwin_150924\bin\test.exe
Found: I:\Programme\cygwin_150924\bin\vi.exe
Not Found: vim

   38k 2015/09/24 I:\Programme\cygwin_150924\bin\cygargp-0.dll - os=4.0 img=1.0 sys=4.0
                  "cygargp-0.dll" v0.0 ts=2013-07-23 16:35
   14k 2015/09/24 I:\Programme\cygwin_150924\bin\cygattr-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygattr-1.dll" v0.0 ts=2012-05-04 13:35
  203k 2015/09/24 I:\Programme\cygwin_150924\bin\cygblkid-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygblkid-1.dll" v0.0 ts=2015-03-23 09:55
   62k 2015/09/24 I:\Programme\cygwin_150924\bin\cygbz2-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygbz2-1.dll" v0.0 ts=2011-05-21 21:16
 1980k 2015/09/24 I:\Programme\cygwin_150924\bin\cygcrypto-1.0.0.dll - os=4.0 img=1.0 sys=4.0
                  "cygcrypto-1.0.0.dll" v0.0 ts=2015-07-09 18:50
   27k 2015/09/24 I:\Programme\cygwin_150924\bin\cygffi-6.dll - os=4.0 img=1.0 sys=4.0
                  "cygffi-6.dll" v0.0 ts=2015-01-02 02:11
   60k 2015/09/24 I:\Programme\cygwin_150924\bin\cygformw-10.dll - os=4.0 img=1.0 sys=4.0
                  "cygformw-10.dll" v0.0 ts=2015-06-10 01:36
  108k 2015/09/24 I:\Programme\cygwin_150924\bin\cyggcc_s-1.dll - os=4.0 img=1.0 sys=4.0
                  "cyggcc_s-1.dll" v0.0 ts=2015-07-02 19:59
   19k 2015/09/24 I:\Programme\cygwin_150924\bin\cyggdbm-4.dll - os=4.0 img=1.0 sys=4.0
                  "cyggdbm-4.dll" v0.0 ts=2009-02-26 08:58
    8k 2015/09/24 I:\Programme\cygwin_150924\bin\cyggdbm_compat-4.dll - os=4.0 img=1.0 sys=4.0
                  "cyggdbm_compat-4.dll" v0.0 ts=2009-02-26 08:58
  505k 2015/09/24 I:\Programme\cygwin_150924\bin\cyggmp-10.dll - os=4.0 img=1.0 sys=4.0
                  "cyggmp-10.dll" v0.0 ts=2015-01-26 17:08
   31k 2015/09/24 I:\Programme\cygwin_150924\bin\cyghistory7.dll - os=4.0 img=1.0 sys=4.0
                  "cyghistory7.dll" v0.0 ts=2015-01-28 00:43
 1010k 2015/09/24 I:\Programme\cygwin_150924\bin\cygiconv-2.dll - os=4.0 img=1.0 sys=4.0
                  "cygiconv-2.dll" v0.0 ts=2015-02-20 17:52
   41k 2015/09/24 I:\Programme\cygwin_150924\bin\cygintl-8.dll - os=4.0 img=1.0 sys=4.0
                  "cygintl-8.dll" v0.0 ts=2015-09-20 21:20
    5k 2015/08/20 I:\Programme\cygwin_150924\bin\cyglsa.dll - os=4.0 img=1.0 sys=4.0
                  "cyglsa.dll" v0.0 ts=2015-08-20 11:40
    6k 2015/08/20 I:\Programme\cygwin_150924\bin\cyglsa64.dll (not x86 dll)
  159k 2015/09/24 I:\Programme\cygwin_150924\bin\cyglzma-5.dll - os=4.0 img=1.0 sys=4.0
                  "cyglzma-5.dll" v0.0 ts=2015-05-04 05:00
  123k 2015/09/24 I:\Programme\cygwin_150924\bin\cygmagic-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygmagic-1.dll" v0.0 ts=2015-08-12 21:06
  173k 2015/09/24 I:\Programme\cygwin_150924\bin\cygman-2-7-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygman-2-7-1.dll" v0.0 ts=2015-04-17 23:31
   22k 2015/09/24 I:\Programme\cygwin_150924\bin\cygmandb-2-7-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygmandb-2-7-1.dll" v0.0 ts=2015-04-17 23:31
   30k 2015/09/24 I:\Programme\cygwin_150924\bin\cygmenuw-10.dll - os=4.0 img=1.0 sys=4.0
                  "cygmenuw-10.dll" v0.0 ts=2015-06-10 01:35
  369k 2015/09/24 I:\Programme\cygwin_150924\bin\cygmpfr-4.dll - os=4.0 img=1.0 sys=4.0
                  "cygmpfr-4.dll" v0.0 ts=2015-06-30 19:39
   57k 2015/09/24 I:\Programme\cygwin_150924\bin\cygncurses++w-10.dll - os=4.0 img=1.0 sys=4.0
                  "cygncurses++w-10.dll" v0.0 ts=2015-06-10 01:42
  327k 2015/09/24 I:\Programme\cygwin_150924\bin\cygncursesw-10.dll - os=4.0 img=1.0 sys=4.0
                  "cygncursesw-10.dll" v0.0 ts=2015-06-10 01:33
  326k 2015/09/24 I:\Programme\cygwin_150924\bin\cygp11-kit-0.dll - os=4.0 img=1.0 sys=4.0
                  "cygp11-kit-0.dll" v0.0 ts=2015-06-01 21:17
   15k 2015/09/24 I:\Programme\cygwin_150924\bin\cygpanelw-10.dll - os=4.0 img=1.0 sys=4.0
                  "cygpanelw-10.dll" v0.0 ts=2015-06-10 01:35
  458k 2015/09/24 I:\Programme\cygwin_150924\bin\cygpcre-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygpcre-1.dll" v0.0 ts=2015-08-11 19:40
   41k 2015/09/24 I:\Programme\cygwin_150924\bin\cygpipeline-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygpipeline-1.dll" v0.0 ts=2015-04-09 21:58
   41k 2015/09/24 I:\Programme\cygwin_150924\bin\cygpopt-0.dll - os=4.0 img=1.0 sys=4.0
                  "cygpopt-0.dll" v0.0 ts=2013-10-21 22:52
  208k 2015/09/24 I:\Programme\cygwin_150924\bin\cygreadline7.dll - os=4.0 img=1.0 sys=4.0
                  "cygreadline7.dll" v0.0 ts=2015-01-28 00:43
   98k 2015/09/24 I:\Programme\cygwin_150924\bin\cygsmartcols-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygsmartcols-1.dll" v0.0 ts=2015-03-23 09:55
  446k 2015/09/24 I:\Programme\cygwin_150924\bin\cygssl-1.0.0.dll - os=4.0 img=1.0 sys=4.0
                  "cygssl-1.0.0.dll" v0.0 ts=2015-07-09 18:50
  944k 2015/09/24 I:\Programme\cygwin_150924\bin\cygstdc++-6.dll - os=4.0 img=1.0 sys=4.0
                  "cygstdc++-6.dll" v0.0 ts=2015-07-02 20:14
   69k 2015/09/24 I:\Programme\cygwin_150924\bin\cygtasn1-6.dll - os=4.0 img=1.0 sys=4.0
                  "cygtasn1-6.dll" v0.0 ts=2015-08-28 10:48
   54k 2015/09/24 I:\Programme\cygwin_150924\bin\cygticw-10.dll - os=4.0 img=1.0 sys=4.0
                  "cygticw-10.dll" v0.0 ts=2015-06-10 01:33
   16k 2015/09/24 I:\Programme\cygwin_150924\bin\cyguuid-1.dll - os=4.0 img=1.0 sys=4.0
                  "cyguuid-1.dll" v0.0 ts=2015-03-23 09:55
   83k 2015/09/24 I:\Programme\cygwin_150924\bin\cygz.dll - os=4.0 img=1.0 sys=4.0
                  "cygz.dll" v0.0 ts=2014-11-19 23:57
 3399k 2015/08/20 I:\Programme\cygwin_150924\bin\cygwin1.dll - os=4.0 img=1.0 sys=4.0
                  "cygwin1.dll" v0.0 ts=2015-08-20 11:40
    Cygwin DLL version info:
        DLL version: 2.2.1
        DLL epoch: 19
        DLL old termios: 5
        DLL malloc env: 28
        Cygwin conv: 181
        API major: 0
        API minor: 289
        Shared data: 5
        DLL identifier: cygwin1
        Mount registry: 3
        Cygwin registry name: Cygwin
        Installations name: Installations
        Cygdrive default prefix: 
        Build date: 
        Shared id: cygwin1S5


Can't find the cygrunsrv utility, skipping services check.


Cygwin Package Information
Last downloaded files to: M:\0_DAT\Programme_Installer\cygwin
Last downloaded files from: http://gd.tuwien.ac.at/gnu/cygwin/

Package              Version            Status
_autorebase          001002-1           OK
_update-info-dir     01437-1            OK
alternatives         1.3.30c-10         OK
base-cygwin          3.8-1              OK
base-files           4.2-3              OK
bash                 4.3.42-3           OK
bzip2                1.0.6-2            OK
ca-certificates      2.5-1              OK
coreutils            8.24-3             OK
cygutils             1.4.14-1           OK
cygwin               2.2.1-1            OK
dash                 0.5.8-3            OK
editrights           1.03-1             OK
file                 5.24-1             OK
findutils            4.5.12-1           OK
gawk                 4.1.3-1            OK
getent               2.18.90-4          OK
grep                 2.21-2             OK
groff                1.22.3-1           OK
gzip                 1.6-1              OK
hostname             3.13-1             OK
info                 6.0-1              OK
ipc-utils            1.0-1              OK
less                 471-1              OK
libargp              20110921-2         OK
libattr1             2.4.46-1           OK
libblkid1            2.25.2-2           OK
libbz2_1             1.0.6-2            OK
libffi6              3.2.1-1            OK
libgcc1              4.9.3-1            OK
libgdbm4             1.8.3-20           OK
libgmp10             6.0.0a-2           OK
libiconv             1.14-3             OK
libiconv2            1.14-3             OK
libintl8             0.19.5.1-2         OK
liblzma5             5.2.1-1            OK
libmpfr4             3.1.3-1            OK
libncursesw10        5.9-20150530-1     OK
libopenssl100        1.0.2d-1           OK
libp11-kit0          0.22.1-1           OK
libpcre1             8.37-2             OK
libpipeline1         1.4.0-1            OK
Empty package libpopt0
libpopt0             1.16-1             OK
libreadline7         6.3.8-1            OK
libsmartcols1        2.25.2-2           OK
libstdc++6           4.9.3-1            OK
libtasn1_6           4.5-1              OK
libuuid1             2.25.2-2           OK
login                1.11-1             OK
lynx                 2.8.7-1            OK
man-db               2.7.1-1            OK
mintty               2.1.5-0            OK
openssl              1.0.2d-1           OK
p11-kit              0.22.1-1           OK
p11-kit-trust        0.22.1-1           OK
popt                 1.16-1             OK
rebase               4.4.1-1            OK
run                  1.3.4-2            OK
sed                  4.2.2-3            OK
tar                  1.28-1             OK
terminfo             5.9-20150530-1     OK
tzcode               2015f-1            OK
util-linux           2.25.2-2           OK
vim-minimal          7.4.873-1          OK
which                2.20-2             OK
xz                   5.2.1-1            OK
zlib0                1.2.8-3            OK
Use -h to see help about each section

[-- Attachment #3: Type: text/plain, Size: 218 bytes --]

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
  2015-09-25 16:31 gawk: Bad File Descriptor error with concurrent readonly access to a network file Vermessung AVT - Wolfgang Rieger
@ 2015-09-25 16:59 ` Marco Atzeri
  2015-10-27 13:21 ` Corinna Vinschen
  1 sibling, 0 replies; 11+ messages in thread
From: Marco Atzeri @ 2015-09-25 16:59 UTC (permalink / raw)
  To: cygwin

On 25/09/2015 18:31, Vermessung AVT - Wolfgang Rieger wrote:
> We let thousands of tiles undergo the same time consuming processing tasks. We use a multi core Windows 7 workstation running several tiles simultaneously in separate shell windows (parallel processing). A batch script controls the work flow of the task with gawk interpreting a number of setup / definition files at run time for each tile / working step. From time to time we get "Bad File Descriptor" errors in gawk (and, e.g., cat, head, tail) when accessing these setup files (they are only read). The full error line reads similar to:
>
> (With job.awk and first access to datafile.txt at gawk source line 31:)
>     "gawk: job.awk:31: fatal: error reading input file `datafile.txt': Bad file descriptor"
> (With inline gawk scripts typically:)
>     "gawk: fatal: error reading input file `datafile.txt': Bad file descriptor"
> (With something like "cat datafile.txt > destination":)
>     "cat: datafile.txt: Bad file descriptor"
>
> We use MS-Windows shell cmd.exe with batch scripts executing the gawk and other commands.
> I tried to use gawk's BEGINFILE rule in order to trap that error. However, the BEGINFILE block is never entered, rather, gawk immediately crashes with the "Bad File Descriptor" error.

Hi Wolfgang,

"Bad file descriptor" came up recently in another problem report:
https://cygwin.com/ml/cygwin/2015-09/msg00374.html
https://cygwin.com/ml/cygwin/2015-09/msg00436.html

Do you by chance have any of the usual suspects installed?
   https://cygwin.com/faq/faq.html#faq.using.bloda

I notice nothing strange in your cygcheck output.

Can you report the type of the network disk with

/usr/lib/csih/getVolInfo <volumename>

Regards
Marco


* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
  2015-09-25 16:31 gawk: Bad File Descriptor error with concurrent readonly access to a network file Vermessung AVT - Wolfgang Rieger
  2015-09-25 16:59 ` Marco Atzeri
@ 2015-10-27 13:21 ` Corinna Vinschen
  2015-10-27 15:18   ` Matt D.
  1 sibling, 1 reply; 11+ messages in thread
From: Corinna Vinschen @ 2015-10-27 13:21 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1618 bytes --]

On Sep 25 16:31, Vermessung AVT - Wolfgang Rieger wrote:
> 1) Concurrent read access to the setup files was possible and worked
> fine with local files (24 hrs testing with millions of file accesses
> in 4 parallel jobs).
> 2) However, when the file to be read (datafile.txt) is stored on a
> network share on a file server - which is the case in our working
> environment - the error could be reproduced. The number of Bad file
> descriptor errors seems to be related to the work load at the server
> where the file resides.
> 3) The MS copy command shows no such error, even with network files.
> So we can substitute the cat's by copy's. For gawk, however, there is
> no shell alternative.
> 
> It looks like there is a small time frame in opening files when the
> server file is non-accessible to other processes. If a parallel job
> happens to access the same file within that short time period while
> another process is opening it, the "Bad File Descriptor" error is
> thrown.

Cygwin uses full sharing for all files it opens, unless the file is
opened in very specific circumstances (e.g., creating a symlink, deleting
a file).  "Bad file descriptor" doesn't point to a sharing problem.
It seems the handle is unusable or something.

I tried your testcase and I can't reproduce the problem in my
environment.  Have you tried catching a trace of the problem via
strace?  It would be helpful to see where the EBADF occurs.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]


* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
  2015-10-27 13:21 ` Corinna Vinschen
@ 2015-10-27 15:18   ` Matt D.
  2015-10-27 17:49     ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Matt D. @ 2015-10-27 15:18 UTC (permalink / raw)
  To: cygwin

I haven't had an opportunity to look into it, but I've also encountered
errors when running a parallel make build (make -j) with too many
threads on a large C++ project that has multiple interdependencies
across a network share.

The reported "Bad File Descriptor" is the same error that I get.

Matt D.

On 10/27/2015 5:52 AM, Corinna Vinschen wrote:
> On Sep 25 16:31, Vermessung AVT - Wolfgang Rieger wrote:
>> 1) Concurrent read access to the setup files was possible and worked
>> fine with local files (24 hrs testing with millions of file accesses
>> in 4 parallel jobs).
>> 2) However, when the file to be read (datafile.txt) is stored on a
>> network share on a file server - which is the case in our working
>> environment - the error could be reproduced. The number of Bad file
>> descriptor errors seems to be related to the work load at the server
>> where the file resides.
>> 3) The MS copy command shows no such error, even with network files.
>> So we can substitute the cat's by copy's. For gawk, however, there is
>> no shell alternative.
>>
>> It looks like there is a small time frame in opening files when the
>> server file is non-accessible to other processes. If a parallel job
>> happens to access the same file within that short time period while
>> another process is opening it, the "Bad File Descriptor" error is
>> thrown.
>
> Cygwin uses full sharing for all files it opens, unless the file is
> opened in very specific circumstances (e.g, creating a symlink, deleting
> a file).  "Bad file descriptor" doesn't point to a sharing problem.
> It seems the handle is unusable or something.
>
> I tried your testcase and I can't reproduce the problem in my
> environment.  Have you tried catching a trace of the problem via
> strace?  It would be helpful to see where the EBADF occurs.
>
>
> Corinna
>


* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
  2015-10-27 15:18   ` Matt D.
@ 2015-10-27 17:49     ` Corinna Vinschen
  0 siblings, 0 replies; 11+ messages in thread
From: Corinna Vinschen @ 2015-10-27 17:49 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2350 bytes --]

On Oct 27 06:48, Matt D. wrote:
> I haven't had an opportunity to look into it but I've also encountered
> errors when performing a parallel make build (make -j) on a large C++
> project which has multiple interdependencies across a network share with too
> many threads.
> 
> The reported "Bad File Descriptor" is the same error that I get.

That's fine and all, but it doesn't add any new info to the case.

> 
> Matt D.
> 
> On 10/27/2015 5:52 AM, Corinna Vinschen wrote:
> >On Sep 25 16:31, Vermessung AVT - Wolfgang Rieger wrote:
> >>1) Concurrent read access to the setup files was possible and worked
> >>fine with local files (24 hrs testing with millions of file accesses
> >>in 4 parallel jobs).
> >>2) However, when the file to be read (datafile.txt) is stored on a
> >>network share on a file server - which is the case in our working
> >>environment - the error could be reproduced. The number of Bad file
> >>descriptor errors seems to be related to the work load at the server
> >>where the file resides.
> >>3) The MS copy command shows no such error, even with network files.
> >>So we can substitute the cat's by copy's. For gawk, however, there is
> >>no shell alternative.
> >>
> >>It looks like there is a small time frame in opening files when the
> >>server file is non-accessible to other processes. If a parallel job
> >>happens to access the same file within that short time period while
> >>another process is opening it, the "Bad File Descriptor" error is
> >>thrown.
> >
> >Cygwin uses full sharing for all files it opens, unless the file is
> >opened in very specific circumstances (e.g, creating a symlink, deleting
> >a file).  "Bad file descriptor" doesn't point to a sharing problem.
> >It seems the handle is unusable or something.
> >
> >I tried your testcase and I can't reproduce the problem in my
> >environment.  Have you tried catching a trace of the problem via
> >strace?  It would be helpful to see where the EBADF occurs.

I've been running the test script for a few hours now, with a server under
heavy load, but I still can't reproduce it.

Is this a bad interaction with some virus scanner, perhaps?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
  2015-10-28  3:45 Vermessung AVT - Wolfgang Rieger
@ 2015-10-28 12:34 ` Corinna Vinschen
  0 siblings, 0 replies; 11+ messages in thread
From: Corinna Vinschen @ 2015-10-28 12:34 UTC (permalink / raw)
  To: cygwin

On Oct 27 23:30, Vermessung AVT - Wolfgang Rieger wrote:
> From: Corinna Vinschen [mailto:corinna-cygwin@cygwin.com]
> Sent: Tuesday, October 27, 2015 10:53
> To: cygwin@cygwin.com
> Subject: Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
>  (Snip)
> > Cygwin uses full sharing for all files it opens, unless the file is opened in very specific circumstances (e.g, creating a symlink,
> > deleting a file).  "Bad file descriptor" doesn't point to a sharing problem.
> > It seems the handle is unusable or something.
> >
> > I tried your testcase and I can't reproduce the problem in my environment.
> > Have you tried catching a trace of the problem via strace?
> > It would be helpful to see where the EBADF occurs.
> 
> Here is the portion of strace output where IMHO the error occurs. The behaviour is quite different when running with strace, so I hope it is the same error situation, but at least the message given is "Bad file descriptor".
> I changed the paths to ???. This is strace --mask=syscall. If you need further info, I would prefer to send it to you directly, since the strace output contains a lot of information I do not want to publish on the web:
> 
>   197   70416 [main] gawk 3844 open: 3 = open(???\datafile.txt, 0x8000)
>   327   70743 [main] gawk 3844 fcntl64: 0 = fcntl(3, 1, 0x0)
>   265   71008 [main] gawk 3844 fcntl64: 0 = fcntl(3, 2, 0x1)
>  3298   74306 [main] gawk 3844 fhandler_base::fstat_helper: 0 = fstat (\??\???\datafile.txt, 0x8004AEF8) st_size=135, st_mode=0100644, st_ino=281474976912276st_atim=551A9D6D.BD48F54 st_ctim=5603C3F5.26F49798 st_mtim=54E1ECAE.B2A66C8 st_birthtim=551A9D6D.BD48F54
>   206   74512 [main] gawk 3844 fstat64: 0 = fstat(3, 0x8004AEF8)
>   216   74728 [main] gawk 3844 isatty: 0 = isatty(3)
>  1887   76615 [main] gawk 3844 fhandler_base::fstat_helper: 0 = fstat (\??\???\datafile.txt, 0x8004AEF8) st_size=135, st_mode=0100644, st_ino=281474976912276st_atim=551A9D6D.BD48F54 st_ctim=5603C3F5.26F49798 st_mtim=54E1ECAE.B2A66C8 st_birthtim=551A9D6D.BD48F54
>   241   76856 [main] gawk 3844 fstat64: 0 = fstat(3, 0x8004AEF8)
>   304   77160 [main] gawk 3844 read: read(3, 0x8004AF90, 135) blocking
>  1722   78882 [main] gawk 3844 seterrno_from_nt_status: /home/corinna/src/cygwin/cygwin-1.7.33/cygwin-1.7.33-1.i686/src/src/winsup/cygwin/fhandler.cc:276 status 0xC0000128 -> windows error 6

0xC0000128 is STATUS_FILE_CLOSED.  "Something" closed the handle.
This is only the result, unfortunately.  It doesn't tell us anything
in terms of the cause.
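
The translation chain visible in the strace output in this thread can be written down as a small lookup. This is only an illustrative sketch, not Cygwin's actual code: the two tables hold just the values that appear in the trace (status 0xC0000128 -> windows error 6 -> errno 9), and the function name errno_for_status is made up for the example.

```python
import errno
import os

# Values copied from the strace output in this thread; the real tables in
# Cygwin are much larger. These two dictionaries are illustrative only.
NTSTATUS_TO_WIN32 = {0xC0000128: 6}  # STATUS_FILE_CLOSED -> ERROR_INVALID_HANDLE
WIN32_TO_ERRNO = {6: 9}              # trace: "windows error 6 == errno 9"


def errno_for_status(status):
    """Follow the two-step mapping NTSTATUS -> Windows error -> errno."""
    win_error = NTSTATUS_TO_WIN32[status]
    return WIN32_TO_ERRNO[win_error]


code = errno_for_status(0xC0000128)
# errno.errorcode gives the symbolic name, os.strerror the message text.
print(hex(0xC0000128), "->", code, errno.errorcode[code], os.strerror(code))
```

The point of the sketch is that "Bad file descriptor" is only the last step of the chain; the information about what happened to the handle is in the NT status, not in the errno.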


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
@ 2015-10-28  3:45 Vermessung AVT - Wolfgang Rieger
  2015-10-28 12:34 ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Vermessung AVT - Wolfgang Rieger @ 2015-10-28  3:45 UTC (permalink / raw)
  To: cygwin

From: Corinna Vinschen [mailto:corinna-cygwin@cygwin.com]
Sent: Tuesday, October 27, 2015 10:53
To: cygwin@cygwin.com
Subject: Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
 (Snip)
> Cygwin uses full sharing for all files it opens, unless the file is opened in very specific circumstances (e.g, creating a symlink,
> deleting a file).  "Bad file descriptor" doesn't point to a sharing problem.
> It seems the handle is unusable or something.
>
> I tried your testcase and I can't reproduce the problem in my environment.
> Have you tried catching a trace of the problem via strace?
> It would be helpful to see where the EBADF occurs.

Here is the portion of strace output where IMHO the error occurs. The behaviour is quite different when running with strace, so I hope it is the same error situation, but at least the message given is "Bad file descriptor".
I changed the paths to ???. This is strace --mask=syscall. If you need further info, I would prefer to send it to you directly, since the strace output contains a lot of information I do not want to publish on the web:

  197   70416 [main] gawk 3844 open: 3 = open(???\datafile.txt, 0x8000)
  327   70743 [main] gawk 3844 fcntl64: 0 = fcntl(3, 1, 0x0)
  265   71008 [main] gawk 3844 fcntl64: 0 = fcntl(3, 2, 0x1)
 3298   74306 [main] gawk 3844 fhandler_base::fstat_helper: 0 = fstat (\??\???\datafile.txt, 0x8004AEF8) st_size=135, st_mode=0100644, st_ino=281474976912276st_atim=551A9D6D.BD48F54 st_ctim=5603C3F5.26F49798 st_mtim=54E1ECAE.B2A66C8 st_birthtim=551A9D6D.BD48F54
  206   74512 [main] gawk 3844 fstat64: 0 = fstat(3, 0x8004AEF8)
  216   74728 [main] gawk 3844 isatty: 0 = isatty(3)
 1887   76615 [main] gawk 3844 fhandler_base::fstat_helper: 0 = fstat (\??\???\datafile.txt, 0x8004AEF8) st_size=135, st_mode=0100644, st_ino=281474976912276st_atim=551A9D6D.BD48F54 st_ctim=5603C3F5.26F49798 st_mtim=54E1ECAE.B2A66C8 st_birthtim=551A9D6D.BD48F54
  241   76856 [main] gawk 3844 fstat64: 0 = fstat(3, 0x8004AEF8)
  304   77160 [main] gawk 3844 read: read(3, 0x8004AF90, 135) blocking
 1722   78882 [main] gawk 3844 seterrno_from_nt_status: /home/corinna/src/cygwin/cygwin-1.7.33/cygwin-1.7.33-1.i686/src/src/winsup/cygwin/fhandler.cc:276 status 0xC0000128 -> windows error 6
  210   79092 [main] gawk 3844 geterrno_from_win_error: windows error 6 == errno 9
  170   79262 [main] gawk 3844 read: -1 = read(3, 0x8004AF90, -1), errno 9

A successful call goes on after the "read: read(3, ..., 135)" like this:
  129  187614 [main] gawk 4452 fhandler_base::read: returning 135, text mode
  147  187761 [main] gawk 4452 read: 135 = read(3, 0x8004AF80, 135)
  181  187942 [main] gawk 4452 read: read(3, 0x8004AF80, 135) blocking
  130  188072 [main] gawk 4452 fhandler_base::read: returning 0, text mode
  150  188222 [main] gawk 4452 read: 0 = read(3, 0x8004AF80, 0)
  350  188572 [main] gawk 4452 close: close(3)
  163  188735 [main] gawk 4452 fhandler_base::close: closing '???/datafile.txt' handle 0x12C
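
For long traces, the failing calls can be picked out mechanically. The following is a rough sketch that assumes the line layout shown above (thread name in brackets, program name, pid, call name, then "-1 = ..., errno N"); the regular expression and the name failed_calls are assumptions made for this example, not part of strace.

```python
import re

# Matches syscall result lines that returned -1, e.g.
#   170   79262 [main] gawk 3844 read: -1 = read(3, 0x8004AF90, -1), errno 9
FAILED_CALL = re.compile(
    r"\[\w+\]\s+(?P<prog>\S+)\s+(?P<pid>\d+)\s+(?P<call>\w+):\s+-1 = .*errno (?P<errno>\d+)"
)


def failed_calls(lines, wanted_errno=9):
    """Return (program, pid, call) for every call that failed with wanted_errno."""
    hits = []
    for line in lines:
        match = FAILED_CALL.search(line)
        if match and int(match.group("errno")) == wanted_errno:
            hits.append((match.group("prog"), int(match.group("pid")), match.group("call")))
    return hits


sample = [
    "  170   79262 [main] gawk 3844 read: -1 = read(3, 0x8004AF90, -1), errno 9",
    "  147  187761 [main] gawk 4452 read: 135 = read(3, 0x8004AF80, 135)",
]
print(failed_calls(sample))
```

Only the first sample line is reported; the second is the successful read and is skipped. The lines just before each hit in the full trace are the ones worth looking at.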

Thanks, Wolfgang

* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
@ 2015-10-22 13:48 Vermessung AVT - Wolfgang Rieger
  0 siblings, 0 replies; 11+ messages in thread
From: Vermessung AVT - Wolfgang Rieger @ 2015-10-22 13:48 UTC (permalink / raw)
  To: cygwin

Marco suggested I wait until Corinna is back to look at this issue. Corinna, have you had a chance to look into it? For reference, my first mail in this thread contains a description and a test case.

Thanks,
Wolfgang

-----Original Message-----
From: Vermessung AVT - Wolfgang Rieger 
Sent: Sat, 26 September 2015 12:01
To: 'cygwin@cygwin.com'
Subject: Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file

On Fri, 25 Sep 2015 18:58:57 +0200, Marco Atzeri wrote:
>> "Bad file descriptor" just arose recently in another problem 
>> https://cygwin.com/ml/cygwin/2015-09/msg00374.html
>> https://cygwin.com/ml/cygwin/2015-09/msg00436.html
>>
I don't think this applies to our case. We use massively parallel processing, and the problem is related to that, as the test case shows in our environment. In single-threaded operation we don't have any problems at all. I don't use fork or any of the other tools mentioned. We don't have Chrome, Comodo, or the like installed. We have an isolated environment without even anti-virus software running on the power workstations, and with as little installed as possible, because computing speed is our main concern.

>> Have you by chance some potential suspect like usual ones
>>   https://cygwin.com/faq/faq.html#faq.using.bloda
I did not find there anything that seems related to our problem.

>> On your cygcheck output I notice nothing strange.
I do not think there is anything strange. I have been using Cygwin for 15+ years now. We started parallelizing our jobs some 12 years ago. Of course, the hardware then was not comparable to what we have today. But the Bad File Descriptor issues only started some 3 or 4 years ago with an update of Cygwin (I really don't remember when; there must have been some major change in the Cygwin DLL: e.g., since then the type-ahead buffer of cmd.exe is no longer usable while Cygwin programs run in the shell). Since these errors were fairly rare (say, 1 in more than 1000 tiles), we did not dig deeper into it. However, it is an ongoing issue.

With rising workload at the file server and new workstations with more cores (allowing for more parallel processes), the errors have become more frequent over the last few years. A server upgrade last winter reduced the problem, but with the recent massive increase in workload it has risen again.

>> Can you provide the type of network disk with 
>> /usr/lib/csih/getVolInfo <volumename>
I am sorry, I have a very small installation of Cygwin running with no getVolInfo. In which package can I find that? We have MS Windows Server 2008 that provides network shares.


Again I want to stress: Running the jobs single-threaded, we have never experienced any such problems. Only with several jobs running in parallel (the same batch job is started independently in several cmd shell windows) do we get these errors. The cause is obviously that, by chance, two processes try to access the same file at the same time, which does not happen often, but it does happen. I assume access to local files is better synchronized by the CPU, whereas these conflicts may arise at the server.

The major question is, what is the underlying access problem within Cygwin? As mentioned, the MS programs (e. g. copy) never show a similar problem.
 
Thanks for your help,
Wolfgang


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
  2015-09-26 10:14 ` Marco Atzeri
@ 2015-09-26 10:35   ` Andrey Repin
  0 siblings, 0 replies; 11+ messages in thread
From: Andrey Repin @ 2015-09-26 10:35 UTC (permalink / raw)
  To: Marco Atzeri, cygwin

Greetings, Marco Atzeri!

>>>> Can you provide the type of network disk with
>>>> /usr/lib/csih/getVolInfo <volumename>
>> I am sorry, I have a very small installation of Cygwin running with no getVolInfo. In which package can I find that? We have MS Windows Server 2008 that provides network shares.
>>

> It is part of the tiny "csih" package

Which, I believe, is part of "base" ?


-- 
With best regards,
Andrey Repin
Saturday, September 26, 2015 13:23:01

Sorry for my terrible english...


* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
  2015-09-26 10:00 Vermessung AVT - Wolfgang Rieger
@ 2015-09-26 10:14 ` Marco Atzeri
  2015-09-26 10:35   ` Andrey Repin
  0 siblings, 1 reply; 11+ messages in thread
From: Marco Atzeri @ 2015-09-26 10:14 UTC (permalink / raw)
  To: cygwin

On 26/09/2015 12:00, Vermessung AVT - Wolfgang Rieger wrote:
> On Fri, 25 Sep 2015 18:58:57 +0200, Marco Atzeri wrote:

>>> Can you provide the type of network disk with
>>> /usr/lib/csih/getVolInfo <volumename>
> I am sorry, I have a very small installation of Cygwin running with no getVolInfo. In which package can I find that? We have MS Windows Server 2008 that provides network shares.
>

It is part of the tiny "csih" package

> Again I want to stress: Running the jobs single-threaded, we have never experienced any such problems. Only with several jobs running in parallel (the same batch job is started independently in several cmd shell windows) do we get these errors. The cause is obviously that, by chance, two processes try to access the same file at the same time, which does not happen often, but it does happen. I assume access to local files is better synchronized by the CPU, whereas these conflicts may arise at the server.
>
> The major question is, what is the underlying access problem within Cygwin? As mentioned, the MS programs (e. g. copy) never show a similar problem.

I suspect you need to wait until Corinna is back.
She wrote most of the code and workarounds in that area.


> Thanks for your help,
> Wolfgang





* Re: gawk: Bad File Descriptor error with concurrent readonly access to a network file
@ 2015-09-26 10:00 Vermessung AVT - Wolfgang Rieger
  2015-09-26 10:14 ` Marco Atzeri
  0 siblings, 1 reply; 11+ messages in thread
From: Vermessung AVT - Wolfgang Rieger @ 2015-09-26 10:00 UTC (permalink / raw)
  To: cygwin

On Fri, 25 Sep 2015 18:58:57 +0200, Marco Atzeri wrote:
>> "Bad file descriptor" just arose recently in another problem
>> https://cygwin.com/ml/cygwin/2015-09/msg00374.html
>> https://cygwin.com/ml/cygwin/2015-09/msg00436.html
>>
I don't think this applies to our case. We use massively parallel processing, and the problem is related to that, as the test case shows in our environment. In single-threaded operation we don't have any problems at all. I don't use fork or any of the other tools mentioned. We don't have Chrome, Comodo, or the like installed. We have an isolated environment without even anti-virus software running on the power workstations, and with as little installed as possible, because computing speed is our main concern.

>> Have you by chance some potential suspect like usual ones
>>   https://cygwin.com/faq/faq.html#faq.using.bloda
I did not find there anything that seems related to our problem.

>> On your cygcheck output I notice nothing strange.
I do not think there is anything strange. I have been using Cygwin for 15+ years now. We started parallelizing our jobs some 12 years ago. Of course, the hardware then was not comparable to what we have today. But the Bad File Descriptor issues only started some 3 or 4 years ago with an update of Cygwin (I really don't remember when; there must have been some major change in the Cygwin DLL: e.g., since then the type-ahead buffer of cmd.exe is no longer usable while Cygwin programs run in the shell). Since these errors were fairly rare (say, 1 in more than 1000 tiles), we did not dig deeper into it. However, it is an ongoing issue.

With rising workload at the file server and new workstations with more cores (allowing for more parallel processes), the errors have become more frequent over the last few years. A server upgrade last winter reduced the problem, but with the recent massive increase in workload it has risen again.

>> Can you provide the type of network disk with
>> /usr/lib/csih/getVolInfo <volumename>
I am sorry, I have a very small installation of Cygwin running with no getVolInfo. In which package can I find that? We have MS Windows Server 2008 that provides network shares.


Again I want to stress: Running the jobs single-threaded, we have never experienced any such problems. Only with several jobs running in parallel (the same batch job is started independently in several cmd shell windows) do we get these errors. The cause is obviously that, by chance, two processes try to access the same file at the same time, which does not happen often, but it does happen. I assume access to local files is better synchronized by the CPU, whereas these conflicts may arise at the server.

The major question is, what is the underlying access problem within Cygwin? As mentioned, the MS programs (e. g. copy) never show a similar problem.
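
The access pattern described above (the same read-only file opened and read over and over by several independent jobs) can be approximated with a small loop. This is a hypothetical sketch, not the test case from the first mail: the function name read_repeatedly and the default iteration count are made up, and it only exercises the Cygwin DLL if it is run with Cygwin's own Python. As in our setup, parallelism comes from starting the script in several shell windows at once.

```python
import errno
import sys


def read_repeatedly(path, iterations):
    """Open, read and close `path` `iterations` times.

    Returns (successful_reads, ebadf_failures). Errors other than EBADF
    are re-raised, since they would point to a different problem.
    """
    ok = bad_fd = 0
    for _ in range(iterations):
        try:
            with open(path, "rb") as handle:
                handle.read()
            ok += 1
        except OSError as exc:
            if exc.errno != errno.EBADF:
                raise
            bad_fd += 1
    return ok, bad_fd


def main(argv):
    if len(argv) < 2:
        print("usage: script.py FILE [ITERATIONS]")
        return
    iterations = int(argv[2]) if len(argv) > 2 else 100000  # arbitrary default
    ok, bad_fd = read_repeatedly(argv[1], iterations)
    print(f"{ok} successful reads, {bad_fd} EBADF failures")


if __name__ == "__main__":
    main(sys.argv)
```

Running one copy against a local file and several copies against the same file on the network share would show whether the failure count depends on the share and on the number of parallel readers.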
 
Thanks for your help,
Wolfgang


Thread overview: 11+ messages
2015-09-25 16:31 gawk: Bad File Descriptor error with concurrent readonly access to a network file Vermessung AVT - Wolfgang Rieger
2015-09-25 16:59 ` Marco Atzeri
2015-10-27 13:21 ` Corinna Vinschen
2015-10-27 15:18   ` Matt D.
2015-10-27 17:49     ` Corinna Vinschen
2015-09-26 10:00 Vermessung AVT - Wolfgang Rieger
2015-09-26 10:14 ` Marco Atzeri
2015-09-26 10:35   ` Andrey Repin
2015-10-22 13:48 Vermessung AVT - Wolfgang Rieger
2015-10-28  3:45 Vermessung AVT - Wolfgang Rieger
2015-10-28 12:34 ` Corinna Vinschen
