From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] Implement IPv6 support for GDB/gdbserver
From: Pedro Alves
To: Sergio Durigan Junior
Cc: GDB Patches, Eli Zaretskii, Jan Kratochvil, Paul Fertser, Tsutomu Seki
Date: Fri, 08 Jun 2018 18:44:00 -0000
Message-ID: <745457f7-9c87-5e0e-e8df-a58900302da5@redhat.com>
In-Reply-To: <87vaattbjq.fsf@redhat.com>
References: <20180523185719.22832-1-sergiodj@redhat.com> <307a63d3-703d-5611-1508-c80daa86fbbf@redhat.com> <874lieulko.fsf@redhat.com> <8721b020-3b0e-bd66-85dc-5e28aef456a8@redhat.com> <87vaattbjq.fsf@redhat.com>

On 06/08/2018
06:47 PM, Sergio Durigan Junior wrote:
> On Friday, June 08 2018, Pedro Alves wrote:
>>>> Does connecting with "localhost6:port" default to IPv6, BTW?
>>>> At least fedora includes "localhost6" in /etc/hosts.
>>>
>>> Using "localhost6:port" works, but it doesn't default to IPv6.
>>> Here's what I see on the gdbserver side:
>>>
>>> $ ./gdb/gdbserver/gdbserver --once localhost6:1234 a.out
>>> Process /path/to/a.out created; pid = 7742
>>> Listening on port 1234
>>> Remote debugging from host ::ffff:127.0.0.1, port 39196
>>>
>>> This means that the connection came using IPv4; it works because
>>> IPv6 sockets also listen for IPv4 connections on Linux (one can
>>> change this behaviour by setting the "IPV6_V6ONLY" socket option).
>>>
>>> This happens because I've made a decision to default to AF_INET
>>> (instead of AF_UNSPEC) when no prefix has been given.  This
>>> basically means that, at least for now, we assume that an unknown
>>> (i.e., not prefixed) address/hostname is IPv4.  I've made this
>>> decision thinking about the convenience of the user: when AF_UNSPEC
>>> is used (and the user hasn't specified any prefix), getaddrinfo
>>> will return a linked list of possible addresses that we should try
>>> to connect to, which usually means an IPv6 and an IPv4 address, in
>>> that order.  Usually this is fine, because (as I said) IPv6 sockets
>>> can also listen for IPv4 connections.  However, if you start
>>> gdbserver with an explicit IPv4 address:
>>>
>>> $ ./gdb/gdbserver/gdbserver --once 127.0.0.1:1234 a.out
>>>
>>> and try to connect GDB to it using an "ambiguous" hostname:
>>>
>>> $ ./gdb/gdb -ex 'target remote localhost:1234' a.out
>>>
>>> you will notice that GDB will take a somewhat long time trying to
>>> connect (to the IPv6 address, because of AF_UNSPEC), and then it
>>> will error out saying that the connection timed out:
>>>
>>> tcp:localhost:1234: Connection timed out.
>>
>> How do other tools handle this?
>
> Just like GDB.
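The ordering Sergio describes above — an AF_UNSPEC lookup returning the IPv6 address before the IPv4 one — can be observed with a small standalone sketch.  This is illustrative only, not GDB code; `count_addresses` is a hypothetical helper name:

```c
/* Sketch (not GDB code): enumerate what getaddrinfo returns for a
   host when the family is left as AF_UNSPEC.  With a typical Linux
   /etc/hosts, "localhost" yields the IPv6 address first, which is
   why an AF_UNSPEC lookup makes a client try ::1 before 127.0.0.1.  */

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Return the number of addresses getaddrinfo reports for HOST,
   printing the address family of each.  */
static int
count_addresses (const char *host)
{
  struct addrinfo hints, *res, *p;
  int n = 0;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;     /* Both IPv4 and IPv6.  */
  hints.ai_socktype = SOCK_STREAM;

  if (getaddrinfo (host, "1234", &hints, &res) != 0)
    return 0;

  for (p = res; p != NULL; p = p->ai_next)
    {
      printf ("%s: family %s\n", host,
              p->ai_family == AF_INET6 ? "AF_INET6"
              : p->ai_family == AF_INET ? "AF_INET" : "other");
      n++;
    }
  freeaddrinfo (res);
  return n;
}

int
main (void)
{
  return count_addresses ("localhost") > 0 ? 0 : 1;
}
```

On a stock dual-stack /etc/hosts, resolving "localhost" typically reports AF_INET6 before AF_INET, matching the behaviour discussed in the thread.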
Well, it sounds like they do the AF_UNSPEC thing, instead of defaulting
to AF_INET.

>
>> For example, with ping, I get:
>>
>> $ ping localhost
>> PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.
>> 64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=1 ttl=64 time=0.048 ms
>> ^C
>>
>> $ ping localhost6
>> PING localhost6(localhost6.localdomain6 (::1)) 56 data bytes
>> 64 bytes from localhost6.localdomain6 (::1): icmp_seq=1 ttl=64 time=0.086 ms
>> ^C
>
> And I get:
>
> $ ping localhost
> PING localhost(localhost (::1)) 56 data bytes
> 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.050 ms
> ^C
>
> $ ping localhost6
> PING localhost6(localhost (::1)) 56 data bytes
> 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.089 ms
> ^C
>
> Maybe your /etc/hosts is different from mine:
>
> 127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4
> ::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

Looks like it:

$ cat /etc/hosts
127.0.0.1  localhost.localdomain localhost
::1        localhost6.localdomain6 localhost6

This is on F27 (upgraded a few times).  I don't remember if I ever
changed this manually.

>
>> How does ping instantly know, without visible delay, that
>> "localhost" resolves to an IPv4 address, and that "localhost6"
>> resolves to an IPv6 address?
>
> It doesn't.  It just tries to connect to the first entry returned by
> getaddrinfo:
>
> https://github.com/iputils/iputils/blob/master/ping.c#L518
>
> In my case, since both IPv4 and IPv6 localhost addresses are valid,
> it just connects to the first one, which is the IPv6.

OK.

> Using telnet allows one to see the algorithm I explained before
> working:

OK.

> The "getaddrinfo loop" is a well-known way to implement IPv6 support
> on IPv4-only tools.
> I think it is totally fine to iterate through the possible addresses
> and try each one until we have success, but our problem is that we
> implemented a retry mechanism on top of that, so when we get
> "connection refused" GDB won't just try the next address, but will
> keep retrying the refused one...  That is the problem.

Yes.

> I don't know why anyone would use different hostnames on both GDB
> and gdbserver; I just stated the fact that if someone does it, she
> will have problems.  And yes, you'd get the same problem if
> localhost always only resolved to IPv6.  The difference is that the
> tools you're using for your example don't implement retry (or at
> least not the way we do), so you don't have huge delays when they
> can't connect to an address.
>
> It is clear to me, after investigating this, that the problem is our
> retry mechanism.

Agreed, it's clear to me too.

> We can either adjust it to a lower delay, get rid of it, or leave it
> as is and assume that unprefixed addresses are IPv4.  I fail to see
> what else we're missing.

The "assume unprefixed addresses are IPv4" option seems like the worst
one to me, as it's a workaround.  Let's tackle the real issue instead.
We could consider, for example, more verbose progress indication, or
cycling through the whole "getaddrinfo loop" before waiting to retry,
instead of waiting after each individual connection failure.

>
>>>>> + char *orig_name = strdup (name);
>>>>
>>>> Do we need a deep copy?  And if we do, how about using
>>>> std::string to avoid having to call free further down?
>>>
>>> This is gdbserver/gdbreplay.c, where apparently we don't have
>>> access to a lot of our regular facilities on GDB.  For example, I
>>> was trying to use std::string, its methods, and other stuff here
>>> (even i18n functions), but the code won't compile, and as far as I
>>> have researched this is intentional, because gdbreplay needs to be
>>> a very small and simple program.
>>
>> What did you find that gave you that impression?
>> There's no reason that gdbreplay needs to be small or simple.  It
>> certainly doesn't need to be smaller than gdbserver.
>
> First, the way it is written.  It doesn't use any of our facilities
> (e.g., i18n; strdup instead of xstrdup), and it seems to be treated
> in a "special" way, because it is a separate program.  I found this
> message:
>
> https://sourceware.org/ml/gdb/2008-06/msg00117.html
>
>   > I've tried to find information in the doc about gdbreplay
>   > without luck.  Really quickly, does gdbreplay, as its name
>   > suggests, allow to record and re-run an application session?
>
>   Yes, exactly -- but with rather stringent limits.  In a nutshell,
>   during the replay session, you must give EXACTLY the same sequence
>   of gdb commands as were given during the record session.
>   gdbreplay will prompt you for the next command, but if you do
>   *anything* different, it will throw up its hands and quit.
>
> And it seems to imply that gdbreplay is a very limited program.

The "stringent limits" refers to what you can do from gdb when
connected to gdbreplay, not to gdbreplay's running environment.

The tool is useful for support, to e.g. reproduce bugs that only
trigger against remote stubs / embedded probes that the user has
access to, other than gdbserver.  You ask the user to use "set
remotelogfile" to record the remote protocol traffic against the
user's remote stub, and then the user reproduces the bug.  The user
sends you the resulting remote protocol log file, and you load it
against gdbreplay.  Then, using the same gdb version and same program
binary the user used, you connect to gdbreplay, and use the exact same
set of gdb commands the user used.  Given those constraints, gdb
should send the exact same remote protocol packets that the user's gdb
sent to the user's stub.  And since gdbreplay is replaying the remote
protocol traffic the original stub sent, you'll be able to reproduce
the bug locally.
If you use a different set of commands, then gdb will send different
packets, and the log that gdbreplay is replaying of course breaks
down.

So gdbreplay runs on the host computer.  There's no need to care about
making it super tiny or anything like that.

> Jan's first patch (back in 2006) implementing IPv6 also duplicated
> code on gdbreplay.  I admit I may have read too much between the
> lines here, but I just assumed that this was just the way things
> were.
>
>>> At least that's what I understood from our
>>> archives/documentation.  I did not feel confident reworking
>>> gdbreplay to make it "modern", so I decided to implement things
>>> "the old way".
>>
>> Seems like adding to technical debt, to be honest.  Did you hit
>> some insurmountable problem, or would just a little bit of fixing
>> here and there be doable?
>
> I don't know if it's insurmountable.  I know I had trouble getting
> i18n working and trying to include a few headers here and there, but
> I haven't tried very hard to work around it.  I just decided to "add
> to the technical debt".
>
> I'll take a better look at this.

Thanks.

Pedro Alves
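Pedro's suggested fix for the retry problem — cycling through the whole getaddrinfo result list before waiting to retry, instead of sleeping after each individual refused connection — could look roughly like the standalone sketch below.  This is a hypothetical illustration, not GDB's actual connection code; `connect_with_retry` is an invented name:

```c
/* Sketch (hypothetical, not GDB code): try every address that
   getaddrinfo returned before waiting, so a refused IPv6 address
   doesn't delay an IPv4 attempt that may succeed immediately.  */

#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Try to connect to HOST:PORT, cycling through all resolved
   addresses up to MAX_RETRIES times, sleeping only between full
   cycles.  Return a connected socket, or -1 on failure.  */
static int
connect_with_retry (const char *host, const char *port, int max_retries)
{
  struct addrinfo hints, *res, *p;
  int attempt, fd = -1;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;

  if (getaddrinfo (host, port, &hints, &res) != 0)
    return -1;

  for (attempt = 0; attempt < max_retries; attempt++)
    {
      /* Cycle the whole list first: don't get stuck retrying one
         refused address while another might accept.  */
      for (p = res; p != NULL; p = p->ai_next)
        {
          fd = socket (p->ai_family, p->ai_socktype, p->ai_protocol);
          if (fd < 0)
            continue;
          if (connect (fd, p->ai_addr, p->ai_addrlen) == 0)
            goto done;
          close (fd);
          fd = -1;
        }
      if (attempt + 1 < max_retries)
        sleep (1);      /* Wait only between full cycles.  */
    }
 done:
  freeaddrinfo (res);
  return fd;
}

int
main (void)
{
  /* Port 1 is almost certainly closed; expect a quick failure
     rather than a long per-address retry delay.  */
  return connect_with_retry ("127.0.0.1", "1", 1) == -1 ? 0 : 1;
}
```

The design point is that the wait moves outside the inner loop: each refused address costs only one failed connect per cycle, so an IPv6-first result list no longer imposes a long delay before the IPv4 address is tried.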