From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21064 invoked by alias); 21 Dec 2009 20:06:40 -0000 Mailing-List: contact archer-help@sourceware.org; run by ezmlm Sender: Precedence: bulk List-Post: List-Help: List-Subscribe: List-Id: Received: (qmail 21053 invoked by uid 22791); 21 Dec 2009 20:06:39 -0000 X-SWARE-Spam-Status: No, hits=-2.5 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS X-Spam-Check-By: sourceware.org From: Tom Tromey To: Phil Muldoon Cc: Project Archer Subject: Re: [patch][rfc] Implement Python lazy strings (PR 10705) References: <4B2CDABA.4050301@redhat.com> Reply-To: Tom Tromey Date: Mon, 21 Dec 2009 20:06:00 -0000 In-Reply-To: <4B2CDABA.4050301@redhat.com> (Phil Muldoon's message of "Sat, 19 Dec 2009 13:52:58 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-SW-Source: 2009-q4/txt/msg00127.txt.bz2 >>>>> "Phil" == Phil Muldoon writes: Phil> I wanted to send this patch out before the holidays for initial Phil> review. This patch implements Python lazy strings, and also Phil> alters the Python pretty-printing process to handle them Phil> accordingly. This looks quite nice. I think you should submit the next revision of this patch upstream. In general I think that fixes and modifications to Python code should go upstream if the underlying functionality is already upstream. Phil> +If the optional @var{encoding} argument is given, it must be a string Phil> +naming the encoding of the string in the @code{gdb.LazyString}, such as Phil> +@code{"ascii"}, @code{"iso-8859-6"} or @code{"utf-8"}. When a lazy Phil> +string is printed, the @value{GDBN} codec machinery is used to convert Phil> +the string during printing. If @var{encoding} is not given, or if Phil> +@var{encoding} is an empty string, then @var{encoding} is set to Phil> +NULL. When @value{GDBN} attempts to print a @code{gdb.LazyString} Phil> +where @var{encoding} is set to NULL, @value{GDBN} will automatically Phil> +select the encoding most suitable for the string type. I think this paragraph should be reorganized a little. The "When a lazy string is printed" sentence should probably be in its own paragraph. Phil> +If the optional @var{length} argument is given, the string will be Phil> +fetched and encoded to the given length. This should mention whether the length is in chars or bytes, and what happens if no length is given. Phil> +@defivar LazyString length Phil> +This attribute holds the length of the string. This attribute is not Phil> +writable. Likewise. Phil> +@defivar LazyString encoding Phil> +This attribute holds the encoding that will be applied to the string Phil> +when the string is printed by @value{GDBN}. This attribute is not Phil> +writable. This should be None if there is no encoding. Phil> +@defivar LazyString type Phil> +This attribute holds the type that is associated with the string. This should mention if it is the string's type or the underlying character type. Phil> +typedef struct { Phil> + PyObject_HEAD Phil> + CORE_ADDR address; Phil> + char *encoding; Phil> + long length; Phil> + struct type *type; Phil> +} lazy_string_object; I try to put comments on each local field (not HEAD) indicating the meaning of the field. This could mirror the docs, more or less. Phil> +static PyObject * Phil> +stpy_get_encoding (PyObject *self, void *closure) Phil> +{ Phil> + lazy_string_object *self_string = (lazy_string_object *) self; Phil> + Phil> + /* An encoding can be set to NULL by the user, so check before Phil> + attempting a Python FromString call. */ Phil> + if (self_string->encoding) Phil> + return PyString_FromString (self_string->encoding); Phil> + Phil> + return NULL; I think this should return None if there is no encoding. Phil> +PyObject * Phil> +stpy_get_type (PyObject *self, void *closure) Phil> +{ Phil> + lazy_string_object *obj = (lazy_string_object *) self; Phil> + PyObject *type; Phil> + Phil> + if (obj->type) Phil> + type = type_to_type_object (obj->type); Phil> + else Phil> + return NULL; I think the lack of a type should be detected and rejected when the object is constructed. BTW it isn't generally valid to just return NULL in a Python method. You must set the error first. Phil> +PyObject * Phil> +create_lazy_string_object (CORE_ADDR address, long length, Phil> + const char *encoding, struct type *type) I think we usually use some py_ name for exported APIs. Phil> + str_obj = PyObject_New (lazy_string_object, &lazy_string_object_type); Phil> + if (!str_obj) Phil> + { Phil> + PyErr_SetString (PyExc_MemoryError, Phil> + "Could not allocate lazy string object."); Phil> + return NULL; Phil> + } I think if PyObject_New returns NULL, then it has probably already set the error, and you can simply return NULL without setting it again. Phil> + if (address == 0) Phil> + { Phil> + PyErr_SetString (PyExc_MemoryError, Phil> + "Cannot create a lazy string from a GDB-side string."); Phil> + return NULL; Phil> + } Do this check first, before making a new object. Otherwise, this leaks the new object. Phil> + str_obj->address = address; Phil> + str_obj->length = length; Phil> + if (encoding == NULL || strcmp (encoding, "")) You probably meant !strcmp here. Phil> + type_incref (type); Upstream doesn't have this stuff FWIW. Hopefully we'll fix it up next year... Phil> + val = value_at_lazy (self_string->type, self_string->address); I guess this means the lazy string holds the string type, not the character type. Phil> + xfree (((lazy_string_object *) self)->encoding); Phil> + type_decref (((lazy_string_object *) self)->type); Introduce a temporary to avoid duplicating the ugly cast. Phil> +/* Determine whether the printer object pointed to by OBJ is a Phil> + Python lazy string. */ Phil> +int is_lazy_string (PyObject *result) Newline after "int". Probably should have a py_ name for this and for extract_lazy_string. Phil> +gdb_byte * Phil> +extract_lazy_string (PyObject *string, struct type Phil> + **str_type, long *length, Don't line break before "**str_type", that looks weird. Phil> + errcode = read_string (value_address (output), *length, width, Phil> + *length, byte_order, &buffer, Phil> + &bytes_read); This seems fishy. Doesn't this mean we end up reading the full contents of the string? I looked a little and it seems this is what gdb already does. I guess I am confused about something here. I suppose this does solve the decoding problem nicely, just not the full laziness problem. Also, note that read_string will allocate 'buffer' even on error (not on exception though). So this should free buffer if read_string returns an error. And, read_string can throw, so you need a try-catch here. Phil> + if (is_lazy_string (py_v)) Phil> + { Phil> + output = extract_lazy_string (py_v, &type, &length, &encoding); Phil> + xfree (encoding); Phil> + } [...] Phil> + fputs_filtered (output, stream); It isn't ok to print bytes from the target like this. They have to be converted to the host encoding. Phil> +static PyObject * Phil> +valpy_lazy_string (PyObject *self, PyObject *args, PyObject *kw) [...] Phil> + str_obj = create_lazy_string_object (value_address (value), length, Phil> + user_encoding, value_type (value)); Phil> + if (str_obj) Phil> + Py_INCREF (str_obj); I think create_lazy_string_object returns a new reference. That means this incref is wrong. Tom