From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keith Seitz
To: bunsen@sourceware.org
Subject: [PATCH] Use binary mode to read .log/.sum
Date: Wed, 23 Sep 2020 14:06:45 -0700
Message-Id: <20200923210645.1472242-1-keiths@redhat.com>
List-Id: Bunsen mailing list

There's no guarantee that gdb.{log,sum} will contain only valid UTF-8.
Invalid bytes can appear, for example, when either gdb or the inferior
outputs uninitialized data (either intentionally or as a result of some
bug). gdb.fortran/function-calls.exp is a common source of this problem.

Since Cursor will decode lines using UTF-8, nothing is really
sacrificed. However, this does fix several import problems I've
encountered where stray/garbage bytes have caused the import to abort
prematurely.
---
 scripts-master/gdb/parse_dejagnu.py | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/scripts-master/gdb/parse_dejagnu.py b/scripts-master/gdb/parse_dejagnu.py
index 5218e9a..c9f30ac 100755
--- a/scripts-master/gdb/parse_dejagnu.py
+++ b/scripts-master/gdb/parse_dejagnu.py
@@ -65,11 +65,14 @@ def get_outcome_line(testcase):
 datestamp_format = '%a %b %d %H:%M:%S %Y'
 
 def openfile_or_xz(path):
+    # Read in binary mode to suppress encoding problems that might occur
+    # from reading gdb.{log,sum}. Sometimes inferiors or gdb can just output
+    # garbage bytes.
     if os.path.isfile(path):
-        return open(path, mode='rt')
+        return open(path, mode='rb')
     elif os.path.isfile(path+'.xz'):
-        return lzma.open(path+'.xz', mode='rt')
-    return open(path, mode='rt') # XXX trigger default error
+        return lzma.open(path+'.xz', mode='rb')
+    return open(path, mode='rb') # XXX trigger default error
 
 def parse_README(testrun, READMEfile):
     if testrun is None: return None
-- 
2.26.2
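
For illustration only, here is a minimal, self-contained sketch of the
failure mode and of the binary-mode approach. openfile_or_xz mirrors the
patched function above. decode_line, read_lines and demo are hypothetical
helpers, and errors='replace' is an assumed decoding policy; none of this
is taken from Bunsen's Cursor, whose actual decoding may differ.

```python
import lzma
import os
import tempfile


def openfile_or_xz(path):
    """Open path, or path + '.xz', in binary mode (mirrors the patch)."""
    if os.path.isfile(path):
        return open(path, mode='rb')
    elif os.path.isfile(path + '.xz'):
        return lzma.open(path + '.xz', mode='rb')
    return open(path, mode='rb')  # raises the default FileNotFoundError


def decode_line(raw):
    """Hypothetical helper: decode one raw line leniently.

    Assumption: undecodable bytes are replaced with U+FFFD rather than
    raising, so one garbage byte cannot abort the whole import.
    """
    return raw.decode('utf-8', errors='replace').rstrip('\n')


def read_lines(path):
    """Read every line of a (possibly xz-compressed) file as text."""
    with openfile_or_xz(path) as f:
        return [decode_line(raw) for raw in f]


def demo():
    """Write a .sum-like file with stray bytes and read it both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'gdb.sum')
        with open(path, 'wb') as f:
            f.write(b'PASS: gdb.base/a.exp: ok\n')
            f.write(b'FAIL: gdb.fortran/x.exp: \xff\xfe garbage\n')

        # Strict text mode aborts on the first invalid byte sequence.
        try:
            with open(path, mode='rt', encoding='utf-8') as f:
                f.read()
            text_mode_failed = False
        except UnicodeDecodeError:
            text_mode_failed = True

        # Binary mode plus per-line lenient decoding reads everything.
        return text_mode_failed, read_lines(path)


if __name__ == '__main__':
    failed, lines = demo()
    print('text mode failed:', failed)
    for line in lines:
        print(line)
```

Decoding line by line keeps the damage local: only the line that holds
the stray bytes is altered, and every other line decodes exactly as it
would have in text mode.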