From: Simon Sobisch
To: gdb@sourceware.org
Subject: UnicodeDecodeError on gdb.execute
Date: Wed, 3 Nov 2021 21:50:03 +0100
Message-ID: <33ec492b-3689-80fd-ca78-a4e2e69b9180@gnu.org>
In-Reply-To: <60c53fa8bf160533a2eddf1da280eb50c7461a6a.camel@fit.cvut.cz>

For some special file I need to look at the source code from within the GDB
extension. I did this with the reasonable and obvious

    output = gdb.execute("list *" + hex(sal.pc), False, True)

(and get more lines with a follow-up "list"; not all of them are needed, otherwise gdb.parameter("listsize") could be adjusted).

I _think_ the problem I experience now comes from a system with Python 3, which has a default UTF-8 encoding, but it _may_ also have been there before: there is a Python exception, UnicodeDecodeError, in this line whenever the listed source contains "extended" ASCII.

"list" in GDB shows the code correctly; also

    (gdb) py gdb.execute("list 14")

shows the correct text, but as soon as Python has to decode it internally to store it as a string:

    (gdb) py gdb.execute("list 14", False, True)
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 18: invalid start byte
    Error while executing Python code.

Is there any way I could adjust the encoding used for storing the gdb.execute output as a string? Is there a reason that this isn't set by default to match gdb.target_charset()?

Thanks for insights into this issue, too,
Simon
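
For reference, the decoding failure itself can be reproduced outside GDB. The following is only a sketch in plain Python: the sample bytes and the helper name decode_listing are made up for illustration, and the assumption that the source file is Latin-1 encoded (where 0xfc is "ü") is a guess, not something confirmed above. It does not change how gdb.execute decodes its output; it only shows why a strict UTF-8 decode rejects the byte while a single-byte charset accepts it.

```python
def decode_listing(raw: bytes, charset: str = "latin-1") -> str:
    """Try strict UTF-8 first, then fall back to the given charset.

    `charset` defaults to Latin-1 purely as an assumption; in a real
    extension it would have to match the encoding of the source file.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(charset)


# Hypothetical source line containing the byte 0xfc ("ü" in Latin-1).
raw = b"MOVE 'f\xfcr' TO X"

try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    # exc.reason is the same text as in the traceback: "invalid start byte"
    utf8_error = exc.reason

text = decode_listing(raw)
print(utf8_error)
print(text)
```

The fallback order matters: valid UTF-8 input is still decoded as UTF-8, and only input that fails strict decoding is reinterpreted with the assumed single-byte charset.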