On 07/27/2017 12:08 PM, Eric Blake wrote:

> So I'm back to cmd to try and debug things.  Next, I tried:
>
> c:\cygwin\bin> .\dash
>
>
> and again got Ω; pressing Enter complains that ./dash: 1: Ω: not found

To double check things, I started .\dash, typed 'echo $$', then in a
second terminal, typed 'gdb --pid XXX' with the dash pid.

(gdb) b read
(gdb) b select
(gdb) c

Then I typed something in the first window to get dash back to its input
loop, and the second window hit a breakpoint in read.  But that didn't
get me very far:

Thread 1 hit Breakpoint 1, read (fd=0, ptr=0x41b540, len=1024)
    at /usr/src/debug/cygwin-2.8.2-1/winsup/cygwin/syscalls.cc:1118
1118    {
(gdb) fin
Run till exit from #0  read (fd=0, ptr=0x41b540, len=1024)
    at /usr/src/debug/cygwin-2.8.2-1/winsup/cygwin/syscalls.cc:1118
[New Thread 628.0x70c]
readline: readline_callback_read_char() called with no handler!
Aborted (core dumped)

Urgh - gdb uses readline, so debugging readline with gdb may prove
harder than planned if I don't time things right.

A second time around, instead of using fin, I stepped through:

1118    {
(gdb) n
...
(gdb) n
1139      cfd->read (ptr, len);
(gdb) n
[New Thread 736.0x960]

At the point of the new thread, I typed Ω and Enter in the first
terminal, which let the read return, and the buffer contents are
correct:

1140      res = len;
(gdb) p len
$1 = 3
(gdb) p/x ((char*)ptr)[0]
$2 = 0xce
(gdb) p/x ((char*)ptr)[1]
$3 = 0xa9
(gdb) p ((char*)ptr)[2]
$4 = '\n'

So whatever dash did, it read a solid block of input from the terminal
(0xce 0xa9 is the UTF-8 encoding of Ω, followed by the newline).  From
there, I quit debugging - obviously dash is not doing things piecemeal,
and it manages to replay as output the same bytes it just read as input
(when you aren't trying hard to be interactive, life is easy).
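
For anyone who wants to double-check the claim that those three bytes
are "correct", here is a minimal C sketch that decodes them by hand.
It is not taken from dash, readline, or cygwin; utf8_decode() and
check_gdb_buffer() are hypothetical names invented for illustration,
and the decoder is deliberately simplistic (it does not reject overlong
forms or surrogates).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s[0..n).
 * Returns the number of bytes consumed and stores the code point in
 * *cp, or returns 0 if the sequence is malformed or truncated. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t need;
    uint32_t v;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {                    /* plain ASCII byte */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* lead byte of 2-byte form */
        need = 2; v = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* lead byte of 3-byte form */
        need = 3; v = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* lead byte of 4-byte form */
        need = 4; v = s[0] & 0x07;
    } else {
        return 0;                         /* stray continuation byte */
    }
    if (n < need)
        return 0;                         /* not enough bytes yet */
    for (size_t i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                     /* expected 10xxxxxx */
        v = (v << 6) | (s[i] & 0x3F);
    }
    *cp = v;
    return need;
}

/* Replays the buffer contents that gdb printed for dash's read(). */
void check_gdb_buffer(void)
{
    const unsigned char buf[] = { 0xce, 0xa9, '\n' };
    uint32_t cp = 0;

    /* 0xce 0xa9 decodes to U+03A9 (Ω), consuming two bytes. */
    assert(utf8_decode(buf, sizeof buf, &cp) == 2 && cp == 0x3A9);
    /* The remaining byte is the newline from the Enter key. */
    assert(utf8_decode(buf + 2, 1, &cp) == 1 && cp == '\n');
    /* A lone 0xce is an incomplete sequence, not a character. */
    assert(utf8_decode(buf, 1, &cp) == 0);
}
```

Calling check_gdb_buffer() from a main() passes silently if the bytes
match, which agrees with what dash echoed back.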

>
> However, when I try:
>
> c:\cygwin\bin> .\bash --norc
>
>
> the display shows :\251

Repeating the gdb attach trick, I'm able to catch bash at this
breakpoint, even without hitting Enter, just by typing Ω:

Thread 1 hit Breakpoint 1, read (fd=0, ptr=0x28c013, len=1)
    at /usr/src/debug/cygwin-2.8.2-1/winsup/cygwin/syscalls.cc:1118

Notice a difference?  dash had the terminal set up in line-oriented
mode, and blindly reads until EOL or until len=1024 is exhausted; bash
has the terminal set up in byte-oriented mode, and is only reading
len=1 at a time.  So when entering a UTF-8 character to dash, the whole
character lands in the buffer at once, while under bash (presumably, as
I haven't debugged that far yet), bash must reconstruct the Unicode
character from the individual bytes.

Stepping through the breakpoints for that input sees 0xce on the first
read, and 0xa9 on the second.  But, in between the two read
breakpoints, the first terminal displayed ':'.  So the input is still
making it correctly INTO readline, but is being munged on the way to
output, and it very much looks like readline's fault rather than
cygwin's.

I'm still trying to put breakpoints in the right places (the call stack
at read() points to rl_getc(), from rl_read_key(), from
readline_internal_char()...), but this is at least to let you know how
I'm tackling the issue, in case it helps someone else spot a solution
faster than me by starting from the same information.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org