From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <simonsobisch@gnu.org>
Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10])
 by sourceware.org (Postfix) with ESMTPS id 8B4893858D28
 for <gdb@sourceware.org>; Wed,  3 Nov 2021 20:50:07 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 8B4893858D28
Received: from fencepost.gnu.org ([2001:470:142:3::e]:41682)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <simonsobisch@gnu.org>)
 id 1miNCw-0007Q0-KM
 for gdb@sourceware.org; Wed, 03 Nov 2021 16:50:06 -0400
Received: from ip5f5a8d68.dynamic.kabel-deutschland.de ([95.90.141.104]:61229
 helo=[192.168.111.41])
 by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <simonsobisch@gnu.org>)
 id 1miNCw-0005Vg-CG
 for gdb@sourceware.org; Wed, 03 Nov 2021 16:50:06 -0400
From: Simon Sobisch <simonsobisch@gnu.org>
To: gdb@sourceware.org
References: <e5e0522f-2847-1575-6dbd-3795fa6ebdb3@gnu.org>
 <60c53fa8bf160533a2eddf1da280eb50c7461a6a.camel@fit.cvut.cz>
Subject: UnicodeDecodeError on gdb.execute
Message-ID: <33ec492b-3689-80fd-ca78-a4e2e69b9180@gnu.org>
Date: Wed, 3 Nov 2021 21:50:03 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
 Thunderbird/78.14.0
MIME-Version: 1.0
In-Reply-To: <60c53fa8bf160533a2eddf1da280eb50c7461a6a.camel@fit.cvut.cz>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=1.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF,
 RCVD_IN_BARRACUDACENTRAL, SPF_HELO_PASS, SPF_PASS,
 TXREP autolearn=no autolearn_force=no version=3.4.4
X-Spam-Level: *
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gdb@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb mailing list <gdb.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb>,
 <mailto:gdb-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb>,
 <mailto:gdb-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Wed, 03 Nov 2021 20:50:09 -0000

For some special file I need to look at the source code from within the 
GDB extension.

I did this with the reasonable and obvious
	output = gdb.execute("list *" + hex(sal.pc), False, True)
(and get more lines with a follow-up "list"; not all are needed, 
otherwise gdb.parameter("listsize") could be adjusted).

I _think_ the problem I experience now is caused by a system with 
Python 3, which defaults to UTF-8 encoding, but it _may_ have been there 
before as well: a Python UnicodeDecodeError is raised in this line 
whenever the listed source contains "extended" ASCII (bytes above 0x7f).

"list" in GDB shows the code correctly; also
	(gdb) py gdb.execute("list 14")
shows the correct text, but it fails as soon as Python has to decode 
the output internally to store it as a string:
	(gdb) py gdb.execute("list 14", False, True)
	Traceback (most recent call last):
	  File "<string>", line 1, in <module>
	UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in
	position 18: invalid start byte
	Error while executing Python code.
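
The decoding failure itself can be reproduced in plain Python, outside of 
GDB. This is only a minimal sketch: the byte string is a made-up source 
line (not the real one), and ISO-8859-1 is an assumed encoding for the 
source file, chosen because 0xfc is "ü" there.

```python
# Hypothetical source line as raw bytes; 0xfc is "ü" in ISO-8859-1
# but is not a valid start byte in UTF-8.
raw = b"14\t* Pr\xfcfung"


def try_decode(data, encoding):
    """Decode data, returning the text or a short error description."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


utf8_result = try_decode(raw, "utf-8")         # fails like gdb.execute does
latin1_result = try_decode(raw, "iso-8859-1")  # decodes without error
print(utf8_result)
print(latin1_result)
```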

Is there any way I could adjust the encoding used for storing the 
gdb.execute output as a string?
Is there a reason that this isn't set by default to match 
gdb.target_charset()?
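
One possible workaround, sketched here under assumptions and not 
something GDB offers as an encoding switch, would be to bypass 
gdb.execute for this case and read the source file directly with an 
explicit encoding. The helper name read_source_lines and the 
ISO-8859-1 default are hypothetical; the real encoding of the file 
would have to be known or guessed.

```python
def read_source_lines(path, first, last, encoding="iso-8859-1"):
    """Return lines first..last (1-based, inclusive) of the file at path.

    The file is decoded with the given encoding; bytes that cannot be
    decoded are replaced instead of raising UnicodeDecodeError.
    """
    lines = []
    with open(path, "r", encoding=encoding, errors="replace") as src:
        for number, text in enumerate(src, start=1):
            if number > last:
                break
            if number >= first:
                lines.append(text.rstrip("\n"))
    return lines


# Inside the GDB extension this might be called as (untested assumption):
#   read_source_lines(sal.symtab.fullname(), sal.line - 5, sal.line + 5)
```

This loses whatever "list" does beyond printing lines (for example 
source path substitution), so it is only a stopgap.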

Thanks for any insights into this issue, too,
Simon