public inbox for libc-locales@sourceware.org
From: Robert Schultz <robert@cosmicrealms.com>
To: libc-alpha@sourceware.org
Cc: libc-locales@sourceware.org
Subject: [PATCH] Added RISC OS charmap (RISCOS)
Date: Sat, 15 Jan 2022 10:10:25 -0500
Message-ID: <CAODtNamprgsCU9twN66aBvNJof_0QcEmzNZVnR5RsJndOzgTRg@mail.gmail.com>

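This is my first patch to the glibc charmaps. It adds a charmap and a
gconv module for the RISC OS character set. Two items of note:

1. Code 0x84 corresponds to U+1FBC0 WHITE HEAVY SALTIRE WITH ROUNDED
   CORNERS in Unicode, but I mapped it to U+2613 SALTIRE instead so that
   iconv conversions work properly.

2. Code 0x87 is rendered as two characters, SUBSCRIPT EIGHT and
   SUPERSCRIPT SEVEN, so I mapped it to U+E01E, a code point that is
   otherwise undefined (it lies in the Private Use Area).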
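
To make the two substitutions above concrete, here is a minimal Python
sketch that only inspects the code points involved. It is not part of
the patch; the helper names (in_bmp, in_private_use_area, utf8_length)
are made up for illustration, and it does not claim to explain how
iconv itself treats these values.

```python
# Code points mentioned in the two notes above.
SALTIRE_ROUNDED = 0x1FBC0  # WHITE HEAVY SALTIRE WITH ROUNDED CORNERS
SALTIRE = 0x2613           # SALTIRE, the value chosen for byte 0x84
PLACEHOLDER_87 = 0xE01E    # the value chosen for byte 0x87


def in_bmp(cp):
    """Return True if the code point lies in the Basic Multilingual Plane."""
    return 0 <= cp <= 0xFFFF


def in_private_use_area(cp):
    """Return True if the code point lies in the BMP Private Use Area."""
    return 0xE000 <= cp <= 0xF8FF


def utf8_length(cp):
    """Return how many bytes the code point occupies when encoded as UTF-8."""
    return len(chr(cp).encode("utf-8"))


if __name__ == "__main__":
    for name, cp in [("U+1FBC0", SALTIRE_ROUNDED),
                     ("U+2613", SALTIRE),
                     ("U+E01E", PLACEHOLDER_87)]:
        print(name, "BMP:", in_bmp(cp),
              "PUA:", in_private_use_area(cp),
              "UTF-8 bytes:", utf8_length(cp))
```

The sketch shows that U+1FBC0 lies outside the Basic Multilingual Plane
while U+2613 lies inside it, and that U+E01E falls in the Private Use
Area, so it carries no standard meaning.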

---
iconvdata/Makefile | 5 +-
iconvdata/gconv-modules-extra.conf | 4 +
iconvdata/riscos.c | 28 +++
iconvdata/tst-tables.sh | 1 +
localedata/charmaps/RISCOS | 264 +++++++++++++++++++++++++++++
5 files changed, 300 insertions(+), 2 deletions(-)
create mode 100644 iconvdata/riscos.c
create mode 100644 localedata/charmaps/RISCOS

diff --git a/iconvdata/Makefile b/iconvdata/Makefile
index f4c089ed5d..7ff576d5c4 100644
--- a/iconvdata/Makefile
+++ b/iconvdata/Makefile
@@ -62,7 +62,7 @@ modules := ISO8859-1 ISO8859-2 ISO8859-3 ISO8859-4 ISO8859-5 \
IBM5347 IBM9030 IBM9066 IBM9448 IBM12712 IBM16804 \
IBM1364 IBM1371 IBM1388 IBM1390 IBM1399 ISO_11548-1 MIK BRF \
MAC-CENTRALEUROPE KOI8-RU ISO8859-9E \
- CP770 CP771 CP772 CP773 CP774
+ CP770 CP771 CP772 CP773 CP774 RISCOS
# If lazy binding is disabled, use BIND_NOW for the gconv modules.
ifeq ($(bind-now),yes)
@@ -173,7 +173,8 @@ gen-8bit-gap-modules := koi8-r latin-greek
latin-greek-1 ibm256 ibm273 \
mac-centraleurope koi8-ru hp-roman8 hp-roman9 \
ebcdic-es ebcdic-es-a ebcdic-is-friss ebcdic-uk \
iso8859-16 viscii iso8859-9e hp-turkish8 \
- hp-thai8 hp-greek8 cp770 cp771 cp772 cp773 cp774
+ hp-thai8 hp-greek8 cp770 cp771 cp772 cp773 cp774 \
+ riscos
gen-special-modules := iso8859-7jp
diff --git a/iconvdata/gconv-modules-extra.conf b/iconvdata/gconv-modules-extra.conf
index 82d7be577d..f419965bef 100644
--- a/iconvdata/gconv-modules-extra.conf
+++ b/iconvdata/gconv-modules-extra.conf
@@ -1459,6 +1459,10 @@ module INTERNAL GB18030// GB18030 1
module VISCII// INTERNAL VISCII 1
module INTERNAL VISCII// VISCII 1
+# from to module cost
+module RISCOS// INTERNAL RISCOS 1
+module INTERNAL RISCOS// RISCOS 1
+
# from to module cost
module KOI8-T// INTERNAL KOI8-T 1
module INTERNAL KOI8-T// KOI8-T 1
diff --git a/iconvdata/riscos.c b/iconvdata/riscos.c
new file mode 100644
index 0000000000..e82f5a2d3d
--- /dev/null
+++ b/iconvdata/riscos.c
@@ -0,0 +1,28 @@
+/* Conversion from and to RISCOS.
+ Copyright (C) 2000-2022 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+ Contributed by Robert Schultz <robert@cosmicrealms.com>, 2022.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <stdint.h>
+
+/* Specify the conversion table. */
+#define TABLES <riscos.h>
+
+#define CHARSET_NAME "RISCOS//"
+#define HAS_HOLES 1 /* Not all 256 characters are defined. */
+
+#include <8bit-gap.c>
diff --git a/iconvdata/tst-tables.sh b/iconvdata/tst-tables.sh
index 4207b44175..6b3330f52e 100755
--- a/iconvdata/tst-tables.sh
+++ b/iconvdata/tst-tables.sh
@@ -207,6 +207,7 @@ cat <<EOF |
KOI8-U
#ISIRI-3342 This charset concept is completely broken
VISCII
+ RISCOS
KOI8-T
GEORGIAN-PS
GEORGIAN-ACADEMY
diff --git a/localedata/charmaps/RISCOS b/localedata/charmaps/RISCOS
new file mode 100644
index 0000000000..c22a7ef783
--- /dev/null
+++ b/localedata/charmaps/RISCOS
@@ -0,0 +1,264 @@
+<code_set_name> RISCOS
+<comment_char> %
+<escape_char> /
+% version: 1.0
+% source: https://en.wikipedia.org/wiki/RISC_OS_character_set
+
+CHARMAP
+<U0000> /x00 NULL (NUL)
+<U0001> /x01 START OF HEADING (SOH)
+<U0002> /x02 START OF TEXT (STX)
+<U0003> /x03 END OF TEXT (ETX)
+<U0004> /x04 END OF TRANSMISSION (EOT)
+<U0005> /x05 ENQUIRY (ENQ)
+<U0006> /x06 ACKNOWLEDGE (ACK)
+<U0007> /x07 BELL (BEL)
+<U0008> /x08 BACKSPACE (BS)
+<U0009> /x09 CHARACTER TABULATION (HT)
+<U000A> /x0a LINE FEED (LF)
+<U000B> /x0b LINE TABULATION (VT)
+<U000C> /x0c FORM FEED (FF)
+<U000D> /x0d CARRIAGE RETURN (CR)
+<U000E> /x0e SHIFT OUT (SO)
+<U000F> /x0f SHIFT IN (SI)
+<U0010> /x10 DATALINK ESCAPE (DLE)
+<U0011> /x11 DEVICE CONTROL ONE (DC1)
+<U0012> /x12 DEVICE CONTROL TWO (DC2)
+<U0013> /x13 DEVICE CONTROL THREE (DC3)
+<U0014> /x14 DEVICE CONTROL FOUR (DC4)
+<U0015> /x15 NEGATIVE ACKNOWLEDGE (NAK)
+<U0016> /x16 SYNCHRONOUS IDLE (SYN)
+<U0017> /x17 END OF TRANSMISSION BLOCK (ETB)
+<U0018> /x18 CANCEL (CAN)
+<U0019> /x19 END OF MEDIUM (EM)
+<U001A> /x1a SUBSTITUTE (SUB)
+<U001B> /x1b ESCAPE (ESC)
+<U001C> /x1c FILE SEPARATOR (IS4)
+<U001D> /x1d GROUP SEPARATOR (IS3)
+<U001E> /x1e RECORD SEPARATOR (IS2)
+<U001F> /x1f UNIT SEPARATOR (IS1)
+<U0020> /x20 SPACE
+<U0021> /x21 EXCLAMATION MARK
+<U0022> /x22 QUOTATION MARK
+<U0023> /x23 NUMBER SIGN
+<U0024> /x24 DOLLAR SIGN
+<U0025> /x25 PERCENT SIGN
+<U0026> /x26 AMPERSAND
+<U0027> /x27 APOSTROPHE
+<U0028> /x28 LEFT PARENTHESIS
+<U0029> /x29 RIGHT PARENTHESIS
+<U002A> /x2a ASTERISK
+<U002B> /x2b PLUS SIGN
+<U002C> /x2c COMMA
+<U002D> /x2d HYPHEN-MINUS
+<U002E> /x2e FULL STOP
+<U002F> /x2f SOLIDUS
+<U0030> /x30 DIGIT ZERO
+<U0031> /x31 DIGIT ONE
+<U0032> /x32 DIGIT TWO
+<U0033> /x33 DIGIT THREE
+<U0034> /x34 DIGIT FOUR
+<U0035> /x35 DIGIT FIVE
+<U0036> /x36 DIGIT SIX
+<U0037> /x37 DIGIT SEVEN
+<U0038> /x38 DIGIT EIGHT
+<U0039> /x39 DIGIT NINE
+<U003A> /x3a COLON
+<U003B> /x3b SEMICOLON
+<U003C> /x3c LESS-THAN SIGN
+<U003D> /x3d EQUALS SIGN
+<U003E> /x3e GREATER-THAN SIGN
+<U003F> /x3f QUESTION MARK
+<U0040> /x40 COMMERCIAL AT
+<U0041> /x41 LATIN CAPITAL LETTER A
+<U0042> /x42 LATIN CAPITAL LETTER B
+<U0043> /x43 LATIN CAPITAL LETTER C
+<U0044> /x44 LATIN CAPITAL LETTER D
+<U0045> /x45 LATIN CAPITAL LETTER E
+<U0046> /x46 LATIN CAPITAL LETTER F
+<U0047> /x47 LATIN CAPITAL LETTER G
+<U0048> /x48 LATIN CAPITAL LETTER H
+<U0049> /x49 LATIN CAPITAL LETTER I
+<U004A> /x4a LATIN CAPITAL LETTER J
+<U004B> /x4b LATIN CAPITAL LETTER K
+<U004C> /x4c LATIN CAPITAL LETTER L
+<U004D> /x4d LATIN CAPITAL LETTER M
+<U004E> /x4e LATIN CAPITAL LETTER N
+<U004F> /x4f LATIN CAPITAL LETTER O
+<U0050> /x50 LATIN CAPITAL LETTER P
+<U0051> /x51 LATIN CAPITAL LETTER Q
+<U0052> /x52 LATIN CAPITAL LETTER R
+<U0053> /x53 LATIN CAPITAL LETTER S
+<U0054> /x54 LATIN CAPITAL LETTER T
+<U0055> /x55 LATIN CAPITAL LETTER U
+<U0056> /x56 LATIN CAPITAL LETTER V
+<U0057> /x57 LATIN CAPITAL LETTER W
+<U0058> /x58 LATIN CAPITAL LETTER X
+<U0059> /x59 LATIN CAPITAL LETTER Y
+<U005A> /x5a LATIN CAPITAL LETTER Z
+<U005B> /x5b LEFT SQUARE BRACKET
+<U005C> /x5c REVERSE SOLIDUS
+<U005D> /x5d RIGHT SQUARE BRACKET
+<U005E> /x5e CIRCUMFLEX ACCENT
+<U005F> /x5f LOW LINE
+<U0060> /x60 GRAVE ACCENT
+<U0061> /x61 LATIN SMALL LETTER A
+<U0062> /x62 LATIN SMALL LETTER B
+<U0063> /x63 LATIN SMALL LETTER C
+<U0064> /x64 LATIN SMALL LETTER D
+<U0065> /x65 LATIN SMALL LETTER E
+<U0066> /x66 LATIN SMALL LETTER F
+<U0067> /x67 LATIN SMALL LETTER G
+<U0068> /x68 LATIN SMALL LETTER H
+<U0069> /x69 LATIN SMALL LETTER I
+<U006A> /x6a LATIN SMALL LETTER J
+<U006B> /x6b LATIN SMALL LETTER K
+<U006C> /x6c LATIN SMALL LETTER L
+<U006D> /x6d LATIN SMALL LETTER M
+<U006E> /x6e LATIN SMALL LETTER N
+<U006F> /x6f LATIN SMALL LETTER O
+<U0070> /x70 LATIN SMALL LETTER P
+<U0071> /x71 LATIN SMALL LETTER Q
+<U0072> /x72 LATIN SMALL LETTER R
+<U0073> /x73 LATIN SMALL LETTER S
+<U0074> /x74 LATIN SMALL LETTER T
+<U0075> /x75 LATIN SMALL LETTER U
+<U0076> /x76 LATIN SMALL LETTER V
+<U0077> /x77 LATIN SMALL LETTER W
+<U0078> /x78 LATIN SMALL LETTER X
+<U0079> /x79 LATIN SMALL LETTER Y
+<U007A> /x7a LATIN SMALL LETTER Z
+<U007B> /x7b LEFT CURLY BRACKET
+<U007C> /x7c VERTICAL LINE
+<U007D> /x7d RIGHT CURLY BRACKET
+<U007E> /x7e TILDE
+<U007F> /x7f DELETE (DEL)
+<U20AC> /x80 EURO SIGN
+<U0174> /x81 LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+<U0175> /x82 LATIN SMALL LETTER W WITH CIRCUMFLEX
+<U25F0> /x83 WHITE SQUARE WITH UPPER LEFT QUADRANT
+<U2613> /x84 SALTIRE
+<U0176> /x85 LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+<U0177> /x86 LATIN SMALL LETTER Y WITH CIRCUMFLEX
+<UE01E> /x87 SUBSCRIPT EIGHT SUPERSCRIPT SEVEN
+<U21E6> /x88 LEFTWARDS WHITE ARROW
+<U21E8> /x89 RIGHTWARDS WHITE ARROW
+<U21E9> /x8a DOWNWARDS WHITE ARROW
+<U21E7> /x8b UPWARDS WHITE ARROW
+<U2026> /x8c HORIZONTAL ELLIPSIS
+<U2122> /x8d TRADE MARK SIGN
+<U2030> /x8e PER MILLE SIGN
+<U2022> /x8f BULLET
+<U2018> /x90 LEFT SINGLE QUOTATION MARK
+<U2019> /x91 RIGHT SINGLE QUOTATION MARK
+<U2039> /x92 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
+<U203A> /x93 SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
+<U201C> /x94 LEFT DOUBLE QUOTATION MARK
+<U201D> /x95 RIGHT DOUBLE QUOTATION MARK
+<U201E> /x96 DOUBLE LOW-9 QUOTATION MARK
+<U2013> /x97 EN DASH
+<U2014> /x98 EM DASH
+<U2212> /x99 MINUS SIGN
+<U0152> /x9a LATIN CAPITAL LIGATURE OE
+<U0153> /x9b LATIN SMALL LIGATURE OE
+<U2020> /x9c DAGGER
+<U2021> /x9d DOUBLE DAGGER
+<UFB01> /x9e LATIN SMALL LIGATURE FI
+<UFB02> /x9f LATIN SMALL LIGATURE FL
+<U00A0> /xa0 NO-BREAK SPACE
+<U00A1> /xa1 INVERTED EXCLAMATION MARK
+<U00A2> /xa2 CENT SIGN
+<U00A3> /xa3 POUND SIGN
+<U00A4> /xa4 CURRENCY SIGN
+<U00A5> /xa5 YEN SIGN
+<U00A6> /xa6 BROKEN BAR
+<U00A7> /xa7 SECTION SIGN
+<U00A8> /xa8 DIAERESIS
+<U00A9> /xa9 COPYRIGHT SIGN
+<U00AA> /xaa FEMININE ORDINAL INDICATOR
+<U00AB> /xab LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+<U00AC> /xac NOT SIGN
+<U00AD> /xad SOFT HYPHEN
+<U00AE> /xae REGISTERED SIGN
+<U00AF> /xaf MACRON
+<U00B0> /xb0 DEGREE SIGN
+<U00B1> /xb1 PLUS-MINUS SIGN
+<U00B2> /xb2 SUPERSCRIPT TWO
+<U00B3> /xb3 SUPERSCRIPT THREE
+<U00B4> /xb4 ACUTE ACCENT
+<U00B5> /xb5 MICRO SIGN
+<U00B6> /xb6 PILCROW SIGN
+<U00B7> /xb7 MIDDLE DOT
+<U00B8> /xb8 CEDILLA
+<U00B9> /xb9 SUPERSCRIPT ONE
+<U00BA> /xba MASCULINE ORDINAL INDICATOR
+<U00BB> /xbb RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+<U00BC> /xbc VULGAR FRACTION ONE QUARTER
+<U00BD> /xbd VULGAR FRACTION ONE HALF
+<U00BE> /xbe VULGAR FRACTION THREE QUARTERS
+<U00BF> /xbf INVERTED QUESTION MARK
+<U00C0> /xc0 LATIN CAPITAL LETTER A WITH GRAVE
+<U00C1> /xc1 LATIN CAPITAL LETTER A WITH ACUTE
+<U00C2> /xc2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+<U00C3> /xc3 LATIN CAPITAL LETTER A WITH TILDE
+<U00C4> /xc4 LATIN CAPITAL LETTER A WITH DIAERESIS
+<U00C5> /xc5 LATIN CAPITAL LETTER A WITH RING ABOVE
+<U00C6> /xc6 LATIN CAPITAL LETTER AE
+<U00C7> /xc7 LATIN CAPITAL LETTER C WITH CEDILLA
+<U00C8> /xc8 LATIN CAPITAL LETTER E WITH GRAVE
+<U00C9> /xc9 LATIN CAPITAL LETTER E WITH ACUTE
+<U00CA> /xca LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+<U00CB> /xcb LATIN CAPITAL LETTER E WITH DIAERESIS
+<U00CC> /xcc LATIN CAPITAL LETTER I WITH GRAVE
+<U00CD> /xcd LATIN CAPITAL LETTER I WITH ACUTE
+<U00CE> /xce LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+<U00CF> /xcf LATIN CAPITAL LETTER I WITH DIAERESIS
+<U00D0> /xd0 LATIN CAPITAL LETTER ETH (Icelandic)
+<U00D1> /xd1 LATIN CAPITAL LETTER N WITH TILDE
+<U00D2> /xd2 LATIN CAPITAL LETTER O WITH GRAVE
+<U00D3> /xd3 LATIN CAPITAL LETTER O WITH ACUTE
+<U00D4> /xd4 LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+<U00D5> /xd5 LATIN CAPITAL LETTER O WITH TILDE
+<U00D6> /xd6 LATIN CAPITAL LETTER O WITH DIAERESIS
+<U00D7> /xd7 MULTIPLICATION SIGN
+<U00D8> /xd8 LATIN CAPITAL LETTER O WITH STROKE
+<U00D9> /xd9 LATIN CAPITAL LETTER U WITH GRAVE
+<U00DA> /xda LATIN CAPITAL LETTER U WITH ACUTE
+<U00DB> /xdb LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+<U00DC> /xdc LATIN CAPITAL LETTER U WITH DIAERESIS
+<U00DD> /xdd LATIN CAPITAL LETTER Y WITH ACUTE
+<U00DE> /xde LATIN CAPITAL LETTER THORN (Icelandic)
+<U00DF> /xdf LATIN SMALL LETTER SHARP S (German)
+<U00E0> /xe0 LATIN SMALL LETTER A WITH GRAVE
+<U00E1> /xe1 LATIN SMALL LETTER A WITH ACUTE
+<U00E2> /xe2 LATIN SMALL LETTER A WITH CIRCUMFLEX
+<U00E3> /xe3 LATIN SMALL LETTER A WITH TILDE
+<U00E4> /xe4 LATIN SMALL LETTER A WITH DIAERESIS
+<U00E5> /xe5 LATIN SMALL LETTER A WITH RING ABOVE
+<U00E6> /xe6 LATIN SMALL LETTER AE
+<U00E7> /xe7 LATIN SMALL LETTER C WITH CEDILLA
+<U00E8> /xe8 LATIN SMALL LETTER E WITH GRAVE
+<U00E9> /xe9 LATIN SMALL LETTER E WITH ACUTE
+<U00EA> /xea LATIN SMALL LETTER E WITH CIRCUMFLEX
+<U00EB> /xeb LATIN SMALL LETTER E WITH DIAERESIS
+<U00EC> /xec LATIN SMALL LETTER I WITH GRAVE
+<U00ED> /xed LATIN SMALL LETTER I WITH ACUTE
+<U00EE> /xee LATIN SMALL LETTER I WITH CIRCUMFLEX
+<U00EF> /xef LATIN SMALL LETTER I WITH DIAERESIS
+<U00F0> /xf0 LATIN SMALL LETTER ETH (Icelandic)
+<U00F1> /xf1 LATIN SMALL LETTER N WITH TILDE
+<U00F2> /xf2 LATIN SMALL LETTER O WITH GRAVE
+<U00F3> /xf3 LATIN SMALL LETTER O WITH ACUTE
+<U00F4> /xf4 LATIN SMALL LETTER O WITH CIRCUMFLEX
+<U00F5> /xf5 LATIN SMALL LETTER O WITH TILDE
+<U00F6> /xf6 LATIN SMALL LETTER O WITH DIAERESIS
+<U00F7> /xf7 DIVISION SIGN
+<U00F8> /xf8 LATIN SMALL LETTER O WITH STROKE
+<U00F9> /xf9 LATIN SMALL LETTER U WITH GRAVE
+<U00FA> /xfa LATIN SMALL LETTER U WITH ACUTE
+<U00FB> /xfb LATIN SMALL LETTER U WITH CIRCUMFLEX
+<U00FC> /xfc LATIN SMALL LETTER U WITH DIAERESIS
+<U00FD> /xfd LATIN SMALL LETTER Y WITH ACUTE
+<U00FE> /xfe LATIN SMALL LETTER THORN (Icelandic)
+<U00FF> /xff LATIN SMALL LETTER Y WITH DIAERESIS
+END CHARMAP
--
2.34.1
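
For anyone who wants to spot-check the charmap entries without building
glibc, the following is a minimal Python sketch that reads lines in the
format used in localedata/charmaps/RISCOS above and decodes bytes with
the resulting table. It is not part of the patch and not how glibc
processes charmaps; parse_charmap, decode, and SAMPLE are hypothetical
names, the comment character is assumed to be "%", and only single-byte
"/xNN" entries are handled.

```python
import re

# Matches entries such as "<U2613> /x84 SALTIRE" (single-byte only).
_ENTRY = re.compile(r"^<U([0-9A-Fa-f]{4,8})>\s+/x([0-9A-Fa-f]{2})\b")


def parse_charmap(text):
    """Build a {byte value: code point} table from charmap text.

    A leading "+" on each line (as found in a diff) is tolerated.
    """
    table = {}
    inside = False
    for raw in text.splitlines():
        line = raw.lstrip("+").strip()
        if line == "CHARMAP":
            inside = True
            continue
        if line == "END CHARMAP":
            break
        if not inside or line.startswith("%"):
            continue
        match = _ENTRY.match(line)
        if match:
            table[int(match.group(2), 16)] = int(match.group(1), 16)
    return table


def decode(data, table):
    """Decode bytes to str; raises KeyError for a byte missing from the table."""
    return "".join(chr(table[byte]) for byte in data)


# A few entries copied from the charmap in this patch.
SAMPLE = """\
CHARMAP
<U0041> /x41 LATIN CAPITAL LETTER A
<U20AC> /x80 EURO SIGN
<U2613> /x84 SALTIRE
<UFB01> /x9e LATIN SMALL LIGATURE FI
END CHARMAP
"""

if __name__ == "__main__":
    table = parse_charmap(SAMPLE)
    print(decode(b"A\x80\x84\x9e", table))
```

Feeding the full charmap text to parse_charmap instead of SAMPLE gives a
quick way to count how many of the 256 byte values have an entry.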
