From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <keiths@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [216.205.24.124])
 by sourceware.org (Postfix) with ESMTP id F33A13857C5A
 for <bunsen@sourceware.org>; Wed, 23 Sep 2020 21:06:51 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org F33A13857C5A
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-139-GuolWvLKPwWvrL6wI6h61Q-1; Wed, 23 Sep 2020 17:06:48 -0400
X-MC-Unique: GuolWvLKPwWvrL6wI6h61Q-1
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com
 [10.5.11.12])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5FB0B1005C99
 for <bunsen@sourceware.org>; Wed, 23 Sep 2020 21:06:47 +0000 (UTC)
Received: from theo.uglyboxes.com.com (ovpn-113-168.phx2.redhat.com
 [10.3.113.168])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 3303960C04
 for <bunsen@sourceware.org>; Wed, 23 Sep 2020 21:06:47 +0000 (UTC)
From: Keith Seitz <keiths@redhat.com>
To: bunsen@sourceware.org
Subject: [PATCH] Use binary mode to read .log/.sum
Date: Wed, 23 Sep 2020 14:06:45 -0700
Message-Id: <20200923210645.1472242-1-keiths@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"
X-Spam-Status: No, score=-13.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: bunsen@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Bunsen mailing list <bunsen.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/bunsen>,
 <mailto:bunsen-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/bunsen/>
List-Help: <mailto:bunsen-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/bunsen>,
 <mailto:bunsen-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Wed, 23 Sep 2020 21:06:53 -0000

There's no guarantee that gdb.{log,sum} contain only valid UTF-8.
Invalid bytes can appear, for example, when either gdb or the inferior
outputs uninitialized data (either intentionally or as a result of
some bug). gdb.fortran/function-calls.exp is a common offender.

Since Cursor will decode lines using UTF-8 anyway, nothing is really
sacrificed by opening the files in binary mode. This fixes several
import problems I've encountered where stray/garbage bytes caused the
import to abort prematurely.
---
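Note for reviewers: the following is a minimal sketch of the failure
mode and the idea behind the fix, not code from this patch. The helper
names (text_mode_fails, decode_line, read_lines) are hypothetical, and
errors='replace' is an assumption about how lines get decoded
downstream; Cursor's actual decoding is not shown in this patch.

```python
import io

def text_mode_fails(data: bytes) -> bool:
    """Return True if strict UTF-8 text-mode reading aborts on `data`."""
    try:
        # Equivalent in spirit to open(path, mode='rt') with UTF-8 decoding.
        io.TextIOWrapper(io.BytesIO(data), encoding='utf-8').read()
    except UnicodeDecodeError:
        return True
    return False

def decode_line(raw: bytes) -> str:
    # Hypothetical helper. errors='replace' is an assumption: invalid
    # bytes become U+FFFD instead of raising UnicodeDecodeError.
    return raw.decode('utf-8', errors='replace')

def read_lines(binary_stream):
    """Iterate over a binary stream, decoding one line at a time."""
    for raw in binary_stream:
        yield decode_line(raw)

# Made-up sample data: one clean line, one line with garbage bytes.
data = b"PASS: gdb.base/a.exp: ok\nFAIL: gdb.fortran/x.exp: \xff\xfe junk\n"
lines = list(read_lines(io.BytesIO(data)))
```

With the sample above, strict text-mode reading raises on the second
line, while the binary read yields both lines and confines the damage
to the garbage bytes themselves.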
 scripts-master/gdb/parse_dejagnu.py | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/scripts-master/gdb/parse_dejagnu.py b/scripts-master/gdb/parse_dejagnu.py
index 5218e9a..c9f30ac 100755
--- a/scripts-master/gdb/parse_dejagnu.py
+++ b/scripts-master/gdb/parse_dejagnu.py
@@ -65,11 +65,14 @@ def get_outcome_line(testcase):
 datestamp_format = '%a %b %d %H:%M:%S %Y'
 
 def openfile_or_xz(path):
+    # Read in binary mode to suppress encoding problems that might occur
+    # from reading gdb.{log,sum}. Sometimes inferiors or gdb can just output
+    # garbage bytes.
     if os.path.isfile(path):
-        return open(path, mode='rt')
+        return open(path, mode='rb')
     elif os.path.isfile(path+'.xz'):
-        return lzma.open(path+'.xz', mode='rt')
-    return open(path, mode='rt') # XXX trigger default error
+        return lzma.open(path+'.xz', mode='rb')
+    return open(path, mode='rb') # XXX trigger default error
 
 def parse_README(testrun, READMEfile):
     if testrun is None: return None
-- 
2.26.2