public inbox for gcc-bugs@sourceware.org
* [Bug fortran/65125] New: ISO_10646 characters and transfer statement
@ 2015-02-19 19:30 zbeekman at gmail dot com
  2015-03-03 21:57 ` [Bug fortran/65125] " zbeekman at gmail dot com
  2015-03-04  7:53 ` burnus at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: zbeekman at gmail dot com @ 2015-02-19 19:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65125

            Bug ID: 65125
           Summary: ISO_10646 characters and transfer statement
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zbeekman at gmail dot com

Created attachment 34810
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34810&action=edit
reproducer program

I am on OS X Yosemite 10.10.2 with a 64-bit Intel CPU.
Gfortran is version: GNU Fortran (Homebrew gcc49 4.9.2_1 --enable-fortran) 4.9.2

I'm trying to make a portable JSON library that will behave gracefully when
encountering compilers whose ISO_10646 support is lacking. To achieve this,
certain variables use the character kind defined as:

integer, parameter :: CK = merge(tsource=selected_char_kind('ISO_10646'), &
                                 fsource=selected_char_kind('DEFAULT'),   &
                                 mask=selected_char_kind('ISO_10646') /= -1)

The error handling of the library needs a way to handle both DEFAULT and
ISO_10646 characters, but overloading the error handling routine with two
specific interfaces won't work, because when ISO_10646 *isn't* supported the
two specific routines will have matching interfaces. As a workaround, I would
like to print the hex representation of characters of `kind=CK`, which may be
either ISO_10646 or DEFAULT. To accomplish this I have written a function
`char_to_hex(c)`. It uses `transfer()` to put the single character passed in
into a sufficiently large integer, then uses the `z8.8` edit descriptor to
convert that integer to the function result, which is ALWAYS represented in
`kind='DEFAULT'` characters (so that I can safely pass it to the error
handling routine).
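
For readers without the attachment, the approach described above could look
roughly like the following. This is a minimal sketch and not the attached
reproducer: the program name, the body of `char_to_hex` and the test character
are assumptions made for illustration.

```fortran
program char_to_hex_sketch
  use, intrinsic :: iso_fortran_env, only: int32
  implicit none

  ! ISO_10646 kind when the compiler provides it, otherwise the default kind.
  integer, parameter :: CK = merge(selected_char_kind('ISO_10646'), &
                                   selected_char_kind('DEFAULT'),   &
                                   selected_char_kind('ISO_10646') /= -1)

  character(len=10) :: text

  text = char_to_hex(CK_'A')
  print '(a)', text

contains

  ! Hypothetical helper: hex representation of one kind=CK character,
  ! returned as default-kind text so it can go to any error routine.
  function char_to_hex(c) result(hex)
    character(kind=CK, len=1), intent(in) :: c
    character(len=10) :: hex
    integer(int32) :: bits

    ! Copy the raw storage of c into a 32-bit integer. If c is narrower
    ! than 32 bits (CK fell back to the default kind), the remaining bits
    ! of the result are processor dependent.
    bits = transfer(c, 0_int32)

    ! "0x" followed by exactly eight hex digits, zero padded.
    write (hex, '("0x", z8.8)') bits
  end function char_to_hex

end program char_to_hex_sketch
```

Where the raw storage is not needed, `ichar(c, kind=int32)` returns the
character code directly and avoids the processor-dependent padding when `CK`
is a one-byte kind.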

The problem is that the transfer statement appears to write only 2 nibbles
(one byte) to the integer, or the z8.8 descriptor fetches only two nibbles of
the int32 integer. For example, the output for a Unicode SNOWFLAKE is
0x000000E2 when it should be: 0x00E29d84. I think the problem is the transfer
statement, since I observe some strange behavior if I modify the char_to_hex()
function to accept len=4 character strings, adjust the inputs to be the
character in question followed by 4 spaces, and change the storage container
to int64 and the edit descriptor to z18.18. Now SNOWFLAKE prints out as
0x0x0000009D000000E2: the two most significant nibbles are on the right, then
6 zero nibbles to their left, then the 3rd and 4th most significant nibbles,
then more zeroes.
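
One reading that is consistent with both outputs, offered as an assumption
rather than a confirmed diagnosis: each UTF-8 byte of the snowflake (E2 9D 84)
was stored as its own 4-byte character, so the first character holds 0xE2 and
the second 0x9D. The sketch below imitates that layout with two 32-bit
integers and reinterprets them as one 64-bit integer. The variable names are
illustrative, and a little-endian machine such as the 64-bit Intel CPU
mentioned above is assumed.

```fortran
program utf8_bytes_as_wide_chars
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none

  integer(int32) :: units(2)
  integer(int64) :: packed
  character(len=18) :: hex

  ! Assumption: the first two UTF-8 bytes of U+2744 each occupy one
  ! 4-byte storage unit, as two separate kind=4 characters would.
  units = [int(z'E2', int32), int(z'9D', int32)]

  ! Reinterpret the 8 bytes as a single 64-bit integer. On a little-endian
  ! machine the first unit lands in the low half of the result.
  packed = transfer(units, 0_int64)

  write (hex, '("0x", z16.16)') packed
  print '(a)', hex
end program utf8_bytes_as_wide_chars
```

On a little-endian machine this prints 0x0000009D000000E2, the same 64-bit
pattern as reported above, and the low 32 bits alone give the 0x000000E2 seen
in the single-character case.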

Am I missing something? If not, I think this is a bug.



* [Bug fortran/65125] ISO_10646 characters and transfer statement
  2015-02-19 19:30 [Bug fortran/65125] New: ISO_10646 characters and transfer statement zbeekman at gmail dot com
@ 2015-03-03 21:57 ` zbeekman at gmail dot com
  2015-03-04  7:53 ` burnus at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: zbeekman at gmail dot com @ 2015-03-03 21:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65125

Zaak <zbeekman at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Zaak <zbeekman at gmail dot com> ---
After mucking around some more and realizing that gfortran can't/won't handle
input source files with UTF-8 encoding and non-ASCII characters, I also
discovered the `-fbackslash` flag as a means of embedding non-ASCII characters
in strings. When I use this approach things work as I would have expected, so
I'll be closing this bug as invalid, since it appears there is no bug here.
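
A minimal sketch of the `-fbackslash` approach, assuming a gfortran build where
`selected_char_kind('ISO_10646')` is available. The program and variable names
are illustrative and not taken from the attachment.

```fortran
! Build with the flag enabled, for example:
!   gfortran -fbackslash snowflake.f90      (file name is illustrative)
program backslash_snowflake
  use, intrinsic :: iso_fortran_env, only: int32
  implicit none

  integer, parameter :: ucs4 = selected_char_kind('ISO_10646')
  character(kind=ucs4, len=1) :: snow
  character(len=10) :: hex

  ! With -fbackslash, \u followed by four hex digits is translated to the
  ! Unicode character with that code point, so the source file stays ASCII.
  snow = ucs4_'\u2744'

  ! Reinterpret the 4-byte character as a 32-bit integer and format it.
  write (hex, '("0x", z8.8)') transfer(snow, 0_int32)
  print '(a)', hex
end program backslash_snowflake
```

Compiled with `-fbackslash`, this should print 0x00002744. A `kind=4` character
holds the code point itself (U+2744), not the UTF-8 byte sequence E2 9D 84.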



* [Bug fortran/65125] ISO_10646 characters and transfer statement
  2015-02-19 19:30 [Bug fortran/65125] New: ISO_10646 characters and transfer statement zbeekman at gmail dot com
  2015-03-03 21:57 ` [Bug fortran/65125] " zbeekman at gmail dot com
@ 2015-03-04  7:53 ` burnus at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: burnus at gcc dot gnu.org @ 2015-03-04  7:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65125

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #2 from Tobias Burnus <burnus at gcc dot gnu.org> ---
(In reply to Zaak from comment #1)
> After mucking around some more and realizing that gfortran can't/won't
> handle input source files with utf-8 encoding

See also PR45179 which is about implementing this feature.

