Message-ID: <174030E531584E68BF5743D433A6E16E@Paullaptop>
From: "Paul Edwards"
To: "Joseph S. Myers", "Ulrich Weigand"
References: <200906051520.n55FKg7T016481@d12av02.megacenter.de.ibm.com>
Subject: Re: i370 port
Date: Fri, 05 Jun 2009 15:52:00 -0000

>> I understand current GCC supports various source and target character
>> sets a lot better out of the box, so it may be EBCDIC isn't even an
>> issue any more.  If there are other problems related to MVS host
>
> I think the EBCDIC support is largely theoretical and not tested on any
> actual EBCDIC host (or target).
> cpplib knows the character set name
> UTF-EBCDIC, but whenever it does anything internally that involves the
> encoding of its internal character set it uses UTF-8 rules (which is not
> something valid to do with UTF-EBCDIC).

From the hercules-os380 files section, here's the relevant
change to 3.4.6 to stop it being theoretical:

Index: gccnew/gcc/cppcharset.c
diff -c gccnew/gcc/cppcharset.c:1.1.1.1 gccnew/gcc/cppcharset.c:1.6
*** gccnew/gcc/cppcharset.c:1.1.1.1 Wed Apr 15 16:26:16 2009
--- gccnew/gcc/cppcharset.c Wed May 13 11:07:08 2009
***************
*** 23,28 ****
--- 23,30 ----
  #include "cpplib.h"
  #include "cpphash.h"
  #include "cppucnid.h"
+ #include "coretypes.h"
+ #include "tm.h"
  
  /* Character set handling for C-family languages.
  
***************
*** 529,534 ****
--- 531,561 ----
    return conversion_loop (one_utf32_to_utf8, cd, from, flen, to);
  }
  
+ #ifdef MAP_OUTCHAR
+ /* convert ASCII to EBCDIC */
+ static bool
+ convert_asc_ebc (iconv_t cd ATTRIBUTE_UNUSED,
+                  const uchar *from, size_t flen, struct _cpp_strbuf *to)
+ {
+   size_t x;
+   int c;
+ 
+   if (to->len + flen > to->asize)
+     {
+       to->asize = to->len + flen;
+       to->text = xrealloc (to->text, to->asize);
+     }
+   for (x = 0; x < flen; x++)
+     {
+       c = from[x];
+       c = MAP_OUTCHAR(c);
+       to->text[to->len + x] = c;
+     }
+   to->len += flen;
+   return true;
+ }
+ #endif
+ 
  /* Identity conversion, used when we have no alternative.  */
  static bool
  convert_no_conversion (iconv_t cd ATTRIBUTE_UNUSED,
***************
*** 606,611 ****
--- 633,641 ----
    { "UTF-32BE/UTF-8", convert_utf32_utf8, (iconv_t)1 },
    { "UTF-16LE/UTF-8", convert_utf16_utf8, (iconv_t)0 },
    { "UTF-16BE/UTF-8", convert_utf16_utf8, (iconv_t)1 },
+ #if defined(TARGET_EBCDIC)
+   { "UTF-8/UTF-EBCDIC", convert_asc_ebc, (iconv_t)0 },
+ #endif
  };
  
  /* Subroutine of cpp_init_iconv: initialize and return a
***************
*** 683,688 ****
--- 713,722 ----
  
    bool be = CPP_OPTION (pfile, bytes_big_endian);
  
+ #if defined(TARGET_EBCDIC)
+   ncset = "UTF-EBCDIC";
+   wcset = "UTF-EBCDIC";
+ #else
    if (CPP_OPTION (pfile, wchar_precision) >= 32)
      default_wcset = be ? "UTF-32BE" : "UTF-32LE";
    else if (CPP_OPTION (pfile, wchar_precision) >= 16)
***************
*** 696,701 ****
--- 730,736 ----
      ncset = SOURCE_CHARSET;
    if (!wcset)
      wcset = default_wcset;
+ #endif
  
    pfile->narrow_cset_desc = init_iconv_desc (pfile, ncset, SOURCE_CHARSET);
    pfile->wide_cset_desc = init_iconv_desc (pfile, wcset, SOURCE_CHARSET);

The generated code appears to be fine from visual inspection.

BFN.  Paul.
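
For anyone who wants to try the byte-for-byte mapping outside of cpplib, here is a minimal standalone sketch of the same idea as convert_asc_ebc in the patch. It is not the code from the patch. `struct strbuf`, `ascii_to_ebcdic` and `convert_asc_ebc_sketch` are hypothetical names made up for this illustration, and the `MAP_OUTCHAR` defined here is only a stand-in covering letters, digits and space (every other byte becomes the EBCDIC question mark); the real macro comes from the target headers. It also assumes an ASCII host and uses plain realloc with an error check where the patch uses xrealloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable buffer, standing in for cpplib's
   struct _cpp_strbuf.  */
struct strbuf
{
  unsigned char *text;  /* heap storage, NULL while asize is 0 */
  size_t asize;         /* bytes allocated */
  size_t len;           /* bytes in use */
};

/* Hypothetical stand-in for the target's MAP_OUTCHAR.  Assumes the
   host character set is ASCII.  Letters, digits and space get their
   usual EBCDIC code points; everything else becomes 0x6F, the EBCDIC
   question mark.  */
static unsigned char
ascii_to_ebcdic (unsigned char c)
{
  if (c == ' ')
    return 0x40;
  if (c >= '0' && c <= '9')
    return (unsigned char) (0xF0 + (c - '0'));
  if (c >= 'a' && c <= 'i')
    return (unsigned char) (0x81 + (c - 'a'));
  if (c >= 'j' && c <= 'r')
    return (unsigned char) (0x91 + (c - 'j'));
  if (c >= 's' && c <= 'z')
    return (unsigned char) (0xA2 + (c - 's'));
  if (c >= 'A' && c <= 'I')
    return (unsigned char) (0xC1 + (c - 'A'));
  if (c >= 'J' && c <= 'R')
    return (unsigned char) (0xD1 + (c - 'J'));
  if (c >= 'S' && c <= 'Z')
    return (unsigned char) (0xE2 + (c - 'S'));
  return 0x6F;
}

#define MAP_OUTCHAR(c) ascii_to_ebcdic ((unsigned char) (c))

/* Append FLEN bytes from FROM to TO, mapping each byte through
   MAP_OUTCHAR.  Returns false if the buffer cannot be grown.  */
static bool
convert_asc_ebc_sketch (const unsigned char *from, size_t flen,
                        struct strbuf *to)
{
  size_t x;

  assert (from != NULL && to != NULL);

  if (to->len + flen > to->asize)
    {
      size_t want = to->len + flen;
      unsigned char *p = realloc (to->text, want);
      if (p == NULL)
        return false;
      to->text = p;
      to->asize = want;
    }
  for (x = 0; x < flen; x++)
    to->text[to->len + x] = MAP_OUTCHAR (from[x]);
  to->len += flen;
  return true;
}
```

Start with a zeroed struct strbuf, call convert_asc_ebc_sketch once per chunk of input, and free the text member when done. Like the function in the patch, this only works because the mapping is one byte in, one byte out, so the output length always equals the input length.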