public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
From: Mark Geisert <mark@maxrnd.com>
To: cygwin@cygwin.com
Subject: Re: segfault on 32bit cygwin snapshot
Date: Thu, 4 Mar 2021 01:05:14 -0800	[thread overview]
Message-ID: <e15e6d1f-7680-8f80-871e-0d224a3ed682@maxrnd.com> (raw)
In-Reply-To: <YD9sSea32T7GqlJr@calimero.vinschen.de>

Hi Corinna,

Corinna Vinschen via Cygwin wrote:
> [Ping Mark Geisert]
> 
> On Mar  3 18:56, Takashi Yano via Cygwin wrote:
>> Hi Corinna,
>>
>> On Tue, 2 Mar 2021 16:48:45 +0100
>> Corinna Vinschen wrote:
>>> On Mar  2 20:03, Takashi Yano via Cygwin wrote:
>>>>> The following check code does not work as expected if
>>>>> newly build exe file is linked with old dll which calls
>>>>> uname() as in this case.
>>>>>
>>>>>    if (CYGWIN_VERSION_CHECK_FOR_UNAME_X)
>>>>>      return uname_x (in_name);
>>>>>
>>>>> Any idea?
>>>>
>>>> Ping Corinna?
>>>
>>> I have no idea how we could fix that, other than by rebuilding the DLLs
>>> which call uname, too.  We can't check the Cygwin build of all DLLs an
>>> executable is linked to.
>>
>> I have checked all /usr/bin/*.dll in my cygwin installation and found
>> following dlls call uname() rather than uname_x().
>> [...]
>> Do you think rebuilding all of these (or maybe more) dlls is only
>> the solution?
> 
> No, we could also drop the above code snippet.
> 
> Here's the problem: When we changed some datatypes in the early 2000s,
> the old entry points have been conserved for backward compatibility,
> while the new function using the new datatypes got a new name, e. g.,
> stat vs. _stat64.
> 
> Next, libcygwin.a gets changed so that a newly built executable (using
> the new datatypes as defined in the standard headers) calling stat is
> redirected to _stat64.
> 
> All is well for the next 15+ years or so.
> 
> Then we discover that the exact same mechanism fails to work for
> uname vs. the new uname_x in python.  What happened?
> 
> It turned out that python called uname dynamically Rather then just
> calling uname, it calls dlopen();dlsym("uname");
> 
> This actually fetches the symbol for uname, not the symbol for uname_x.
> The good old mechanism used for ages on standard function, fails as soon
> as the caller uses dynamic loading of symbols.  Surprise, surprise!
> It was just never taken into consideration that standard libc functions
> might be called dynamically, given it usually doesn't make sense.
> 
> Given that this affects *all* of these redirected functions, we're in a
> bit of a fix.  Fortunately, for all other functions this only affects 32
> bit Cygwin, because the 64 bit version never had this backward
> compatibility problem.
> 
> Therefore, uname vs. uname_x is the only function affecting 64 bit
> Cygwin as well, and that's why I added the above crude redirection only
> to uname in the first place.
> 
> So what we can do is this:
> 
> - Either all old DLLs calling uname must be rebuilt.

.. or patched (somehow).  Also, IIUC you're both talking about Cygwin-supplied 
DLLs, not user builds of Cygwin or private DLLs.  Dunno how we'd find every last one.

> - Or we remove the above code snippet again and behave like for all
>    other redirected functions on 32 bit as well.  Python's os.uname is
>    broken, but all the affected DLL sstart working again.

I think you need to keep the above code snippet to handle old exe running with 
current Cygwin DLL, no?

> Is there a way around that?  I'm not quite sure, so let's brain storm
> a bit, ok?
> 
> - One thing we could try is to remove the above code, but add a python
>    hack to dlsym instead.  This would let the "old" DLLs work again as
>    before and for python we could add a hack to dlsym, along these lines:
> 
>      if (CYGWIN_VERSION_CHECK_FOR_UNAME_X
>      	&& modulehandle == cygwin1.dll
> 	&& strcmp (symname, "uname"))
>        symname = "uname_x";
> 
> Thoughts?  Other ideas?

That's a sly fix, but it seems that it would do the job.  That's good!

On a different tack, I was thinking about how run time code could tell the 
difference between the versions of uname() being called.  It can't.  I looked at 
glibc and FreeBSD libc to see if they had to deal with this compatibility issue. 
It seems they don't, or I couldn't locate it if they do.

But FreeBSD has an approach that looked interesting in another way.  They have the 
standard uname() function taking the one usual arg.  But it's just a wrapper on a 
worker function that takes the original arg plus another arg specifying the size 
of the fields in struct utsname.  Paraphrasing, using Cygwin names:
         int uname(struct utsname *uptr)
         {
             return uname_x((int) _UTSNAME_LENGTH, uptr);
         }
They allow the user to override what we call _UTSNAME_LENGTH.  That seems like an 
invitation to exploit so I don't care for that.  But what I was thinking is if we 
make that first arg to uname_x() be a uintptr_t, we could tell (with pretty good 
confidence) at run time inside uname_x() if we were passed a length (from new code 
passing two args) or a ptr-to-buf (from old code just passing one arg).

I am unsure if this is useful.  I am trying to consider the cross-product of exes 
and dlls we care about and whether we can identify which combination we have at 
run time.  I think I need some time with a pad of paper, and maybe a drink.

..mark

  reply	other threads:[~2021-03-04  9:05 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-20 21:01 Marco Atzeri
2021-02-20 22:29 ` Takashi Yano
2021-02-28 18:48   ` Marco Atzeri
2021-03-01  0:55     ` Takashi Yano
2021-03-01  8:41       ` Takashi Yano
2021-03-01 11:26         ` Takashi Yano
2021-03-01 12:17           ` marco atzeri
2021-03-01 12:25       ` Takashi Yano
2021-03-02 11:03         ` Takashi Yano
2021-03-02 15:48           ` Corinna Vinschen
2021-03-03  9:56             ` Takashi Yano
2021-03-03 11:00               ` Corinna Vinschen
2021-03-04  9:05                 ` Mark Geisert [this message]
2021-03-04 16:11                   ` Corinna Vinschen
2021-03-05  9:18                     ` Mark Geisert
2021-03-05 14:37                       ` Corinna Vinschen
2021-03-04  9:05                 ` Takashi Yano
2021-03-04 11:11                   ` marco atzeri
2021-03-04 11:50                     ` Takashi Yano
2021-03-04 15:17                       ` Ken Brown
2021-03-04 15:54                         ` Corinna Vinschen
2021-03-04 20:17                         ` Marco Atzeri
2021-03-05  8:09                           ` Marco Atzeri
2021-03-05  9:11                             ` Mark Geisert
2021-03-05 14:42                               ` Corinna Vinschen
2021-03-05 16:30                                 ` Marco Atzeri
2021-03-06  1:45                                   ` Takashi Yano
2021-03-06 17:16                                     ` Ken Brown
2021-03-06 21:38                                       ` Mark Geisert
2021-03-05 22:55                                 ` Mark Geisert
2021-03-08 11:07                                   ` Mark Geisert
2021-03-08 14:01                                     ` Corinna Vinschen
2021-03-04 15:52                   ` Corinna Vinschen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e15e6d1f-7680-8f80-871e-0d224a3ed682@maxrnd.com \
    --to=mark@maxrnd.com \
    --cc=cygwin@cygwin.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).