public inbox for binutils@sourceware.org
* [RFC v0 0/6] ASCII Command for output section
@ 2023-02-15 11:40 binutils
  2023-02-15 11:40 ` [PATCH v0 1/6] Add testsuite for ASCII command binutils
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: binutils @ 2023-02-15 11:40 UTC (permalink / raw)
  To: binutils; +Cc: nickc

This is a preliminary patchset implementing the ASCII command.

I would like to support

   ASCII <size> , <string>

When I try this and add
   
   ASCII  32 , "mystring"

I get a "syntax error", and would like to understand why.

If I use the form
   ASCII (<size>) <string>
instead, I do not get a syntax error for

   ASCII (32) "mystring"

I cannot understand why there is a problem...

The testsuite in ld/testsuite/ld-scripts contains
'header.inc', which is included by ascii.t.
ascii.t contains an error: a fixed-size string
is specified, but the argument string does not fit.
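
The intended fixed-size behaviour can be sketched in plain C. This is only
an illustration under assumptions: fill_ascii_block is a hypothetical
helper, not a function in ld, and it mirrors what the expected dump in
ascii.d shows (a zero-filled block, with an over-long string cut so that
one terminating '\0' still fits).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of ld: fill a SIZE-byte block the way
   ASCII (<size>) "string" is described.  The block is zero filled, the
   string is copied to its start, and one trailing '\0' is always kept.
   Returns 1 if the string had to be truncated, 0 otherwise.  */
static int
fill_ascii_block (unsigned char *block, size_t size, const char *s)
{
  size_t len = strlen (s);
  int truncated = 0;

  if (size == 0)
    return 1;			/* No room even for the terminator.  */

  memset (block, 0, size);
  if (len > size - 1)
    {
      len = size - 1;		/* Leave the last byte as '\0'.  */
      truncated = 1;
    }
  memcpy (block, s, len);
  return truncated;
}
```

With an 8-byte block and the string "This is way too long", this leaves
"This is" followed by a single zero byte, which is the same content the
test expects at address 13d0.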

[PATCH v0 1/6] Add testsuite for ASCII command
[PATCH v0 2/6] Add ASCII command info to NEWS
[PATCH v0 3/6] Add ASCII to info file
[PATCH v0 4/6] ldlex.l: add ASCII
[PATCH v0 5/6] ldgram.y: add ASCII
[PATCH v0 6/6] ldlang.*: parse ASCII command



* [PATCH v0 1/6] Add testsuite for ASCII command
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
@ 2023-02-15 11:40 ` binutils
  2023-02-15 11:40 ` [PATCH v0 2/6] Add ASCII command info to NEWS binutils
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: binutils @ 2023-02-15 11:40 UTC (permalink / raw)
  To: binutils; +Cc: nickc, Ulf Samuelsson

From: Ulf Samuelsson <ulf@emagii.com>

Signed-off-by: Ulf Samuelsson <ulf@emagii.com>
---
 ld/testsuite/ld-scripts/ascii.d    | 155 +++++++++++++++++++++++++++++
 ld/testsuite/ld-scripts/ascii.s    |   9 ++
 ld/testsuite/ld-scripts/ascii.t    |  51 ++++++++++
 ld/testsuite/ld-scripts/header.inc |  34 +++++++
 4 files changed, 249 insertions(+)
 create mode 100644 ld/testsuite/ld-scripts/ascii.d
 create mode 100644 ld/testsuite/ld-scripts/ascii.s
 create mode 100644 ld/testsuite/ld-scripts/ascii.t
 create mode 100755 ld/testsuite/ld-scripts/header.inc

diff --git a/ld/testsuite/ld-scripts/ascii.d b/ld/testsuite/ld-scripts/ascii.d
new file mode 100644
index 00000000000..9c59896ff1d
--- /dev/null
+++ b/ld/testsuite/ld-scripts/ascii.d
@@ -0,0 +1,155 @@
+#source: ascii.s
+#ld: -T ascii.t
+#objdump: -s -j .text
+#notarget: [is_aout_format]
+#xfail: tic4x-*-* tic54x-*-*
+
+.*:     file format .*
+
+Contents of section .text:
+ 1100 434f4445 deadbeef 00000000 00000000  CODE............
+ 1110 00120000 f1080000 00000000 00000000  ................
+ 1120 01020304 17000000 00004711 deadbeef  ..........G.....
+ 1130 70726f67 72616d20 6e616d65 00000000  program name....
+ 1140 656d7074 79000000 00000000 00000000  empty...........
+ 1150 00000000 00000000 00000000 00000000  ................
+ 1160 00000000 00000000 00000000 00000000  ................
+ 1170 00000000 00000000 00000000 00000000  ................
+ 1180 636f6d6d 656e7420 315c6e00 00000000  comment 1\n.....
+ 1190 00000000 00000000 00000000 00000000  ................
+ 11a0 636f6d6d 656e7420 325c6e00 00000000  comment 2\n.....
+ 11b0 00000000 00000000 00000000 00000000  ................
+ 11c0 636f6d6d 656e7420 335c6e00 00000000  comment 3\n.....
+ 11d0 00000000 00000000 00000000 00000000  ................
+ 11e0 636f6d6d 656e7420 345c6e00 00000000  comment 4\n.....
+ 11f0 00000000 00000000 deadbeef 434f4445  ............CODE
+ 1200 434f4445 10110000 ffffffff ffffffff  CODE............
+ 1210 ffffffff ffffffff ffffffff ffffffff  ................
+ 1220 ffffffff ffffffff ffffffff ffffffff  ................
+ 1230 ffffffff ffffffff ffffffff ffffffff  ................
+ 1240 ffffffff ffffffff ffffffff ffffffff  ................
+ 1250 ffffffff ffffffff ffffffff ffffffff  ................
+ 1260 ffffffff ffffffff ffffffff ffffffff  ................
+ 1270 ffffffff ffffffff ffffffff ffffffff  ................
+ 1280 ffffffff ffffffff ffffffff ffffffff  ................
+ 1290 ffffffff ffffffff ffffffff ffffffff  ................
+ 12a0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12b0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12c0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12d0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12e0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12f0 ffffffff ffffffff ffffffff ffffffff  ................
+ 1300 54686973 20697320 61207374 72696e67  This is a string
+ 1310 2c203132 38206279 7465206c 6f6e6700  , 128 byte long.
+ 1320 00000000 00000000 00000000 00000000  ................
+ 1330 00000000 00000000 00000000 00000000  ................
+ 1340 00000000 00000000 00000000 00000000  ................
+ 1350 00000000 00000000 00000000 00000000  ................
+ 1360 00000000 00000000 00000000 00000000  ................
+ 1370 00000000 00000000 00000000 00000000  ................
+ 1380 10110000 ffffffff ffffffff ffffffff  ................
+ 1390 54686973 20697320 616e2075 6e616c69  This is an unali
+ 13a0 676e6564 20737472 696e6700 01020304  gned string.....
+ 13b0 04070101 ffffffff ffffffff ffffffff  ................
+ 13c0 01ffffff ffffffff ffffffff ffffffff  ................
+ 13d0 54686973 20697300 ffffffff ffffffff  This is.........
+ 13e0 49206d65 616e7420 746f2073 61793a20  I meant to say: 
+ 13f0 54686973 20697320 77617920 746f6f20  This is way too 
+ 1400 6c6f6e67 00000000 00000000 00000000  long............
+ 1410 00000000 00000000 00000000 00000000  ................
+ 1420 ffffffff ffffffff ffffffff ffffffff  ................
+ 1430 ffffffff ffffffff ffffffff ffffffff  ................
+ 1440 ffffffff ffffffff ffffffff ffffffff  ................
+ 1450 ffffffff ffffffff ffffffff ffffffff  ................
+ 1460 ffffffff ffffffff ffffffff ffffffff  ................
+ 1470 ffffffff ffffffff ffffffff ffffffff  ................
+ 1480 ffffffff ffffffff ffffffff ffffffff  ................
+ 1490 ffffffff ffffffff ffffffff ffffffff  ................
+ 14a0 ffffffff ffffffff ffffffff ffffffff  ................
+ 14b0 ffffffff ffffffff ffffffff ffffffff  ................
+ 14c0 ffffffff ffffffff ffffffff ffffffff  ................
+ 14d0 ffffffff ffffffff ffffffff ffffffff  ................
+ 14e0 ffffffff ffffffff ffffffff ffffffff  ................
+ 14f0 ffffffff ffffffff ffffffff ffffffff  ................
+ 1500 ffffffff ffffffff ffffffff ffffffff  ................
+ 1510 ffffffff ffffffff ffffffff ffffffff  ................
+ 1520 ffffffff ffffffff ffffffff ffffffff  ................
+ 1530 ffffffff ffffffff ffffffff ffffffff  ................
+ 1540 ffffffff ffffffff ffffffff ffffffff  ................
+ 1550 ffffffff ffffffff ffffffff ffffffff  ................
+ 1560 ffffffff ffffffff ffffffff ffffffff  ................
+ 1570 ffffffff ffffffff ffffffff ffffffff  ................
+ 1580 ffffffff ffffffff ffffffff ffffffff  ................
+ 1590 ffffffff ffffffff ffffffff ffffffff  ................
+ 15a0 ffffffff ffffffff ffffffff ffffffff  ................
+ 15b0 ffffffff ffffffff ffffffff ffffffff  ................
+ 15c0 ffffffff ffffffff ffffffff ffffffff  ................
+ 15d0 ffffffff ffffffff ffffffff ffffffff  ................
+ 15e0 ffffffff ffffffff ffffffff ffffffff  ................
+ 15f0 ffffffff ffffffff ffffffff ffffffff  ................
+ 1600 ffffffff ffffffff ffffffff ffffffff  ................
+ 1610 ffffffff ffffffff ffffffff ffffffff  ................
+ 1620 ffffffff ffffffff ffffffff ffffffff  ................
+ 1630 ffffffff ffffffff ffffffff ffffffff  ................
+ 1640 ffffffff ffffffff ffffffff ffffffff  ................
+ 1650 ffffffff ffffffff ffffffff ffffffff  ................
+ 1660 ffffffff ffffffff ffffffff ffffffff  ................
+ 1670 ffffffff ffffffff ffffffff ffffffff  ................
+ 1680 ffffffff ffffffff ffffffff ffffffff  ................
+ 1690 ffffffff ffffffff ffffffff ffffffff  ................
+ 16a0 ffffffff ffffffff ffffffff ffffffff  ................
+ 16b0 ffffffff ffffffff ffffffff ffffffff  ................
+ 16c0 ffffffff ffffffff ffffffff ffffffff  ................
+ 16d0 ffffffff ffffffff ffffffff ffffffff  ................
+ 16e0 ffffffff ffffffff ffffffff ffffffff  ................
+ 16f0 ffffffff ffffffff ffffffff ffffffff  ................
+ 1700 ffffffff ffffffff ffffffff ffffffff  ................
+ 1710 ffffffff ffffffff ffffffff ffffffff  ................
+ 1720 ffffffff ffffffff ffffffff ffffffff  ................
+ 1730 ffffffff ffffffff ffffffff ffffffff  ................
+ 1740 ffffffff ffffffff ffffffff ffffffff  ................
+ 1750 ffffffff ffffffff ffffffff ffffffff  ................
+ 1760 ffffffff ffffffff ffffffff ffffffff  ................
+ 1770 ffffffff ffffffff ffffffff ffffffff  ................
+ 1780 ffffffff ffffffff ffffffff ffffffff  ................
+ 1790 ffffffff ffffffff ffffffff ffffffff  ................
+ 17a0 ffffffff ffffffff ffffffff ffffffff  ................
+ 17b0 ffffffff ffffffff ffffffff ffffffff  ................
+ 17c0 ffffffff ffffffff ffffffff ffffffff  ................
+ 17d0 ffffffff ffffffff ffffffff ffffffff  ................
+ 17e0 ffffffff ffffffff ffffffff ffffffff  ................
+ 17f0 ffffffff ffffffff ffffffff ffffffff  ................
+ 1800 41207665 7279206c 6f6e6720 73747269  A very long stri
+ 1810 6e672066 6f6c6c6f 77656420 62792061  ng followed by a
+ 1820 20273031 27000000 00000000 00000000   '01'...........
+ 1830 00000000 00000000 00000000 00000000  ................
+ 1840 00000000 00000000 00000000 00000000  ................
+ 1850 00000000 00000000 00000000 00000000  ................
+ 1860 00000000 00000000 00000000 00000000  ................
+ 1870 00000000 00000000 00000000 00000000  ................
+ 1880 00000000 00000000 00000000 00000000  ................
+ 1890 00000000 00000000 00000000 00000000  ................
+ 18a0 00000000 00000000 00000000 00000000  ................
+ 18b0 00000000 00000000 00000000 00000000  ................
+ 18c0 00000000 00000000 00000000 00000000  ................
+ 18d0 00000000 00000000 00000000 00000000  ................
+ 18e0 00000000 00000000 00000000 00000000  ................
+ 18f0 00000000 00000000 00000000 00000000  ................
+ 1900 00000000 00000000 00000000 00000000  ................
+ 1910 00000000 00000000 00000000 00000000  ................
+ 1920 00000000 00000000 00000000 00000000  ................
+ 1930 00000000 00000000 00000000 00000000  ................
+ 1940 00000000 00000000 00000000 00000000  ................
+ 1950 00000000 00000000 00000000 00000000  ................
+ 1960 00000000 00000000 00000000 00000000  ................
+ 1970 00000000 00000000 00000000 00000000  ................
+ 1980 00000000 00000000 00000000 00000000  ................
+ 1990 00000000 00000000 00000000 00000000  ................
+ 19a0 00000000 00000000 00000000 00000000  ................
+ 19b0 00000000 00000000 00000000 00000000  ................
+ 19c0 00000000 00000000 00000000 00000000  ................
+ 19d0 00000000 00000000 00000000 00000000  ................
+ 19e0 00000000 00000000 00000000 00000000  ................
+ 19f0 00000000 00000000 00000000 00000000  ................
+ 1a00 01                                   .               
+#pass
diff --git a/ld/testsuite/ld-scripts/ascii.s b/ld/testsuite/ld-scripts/ascii.s
new file mode 100644
index 00000000000..704b492ae61
--- /dev/null
+++ b/ld/testsuite/ld-scripts/ascii.s
@@ -0,0 +1,9 @@
+    .extern ecc_start
+	.section .text
+main:
+	.long 0x45444F43
+	.long ecc_start
+	.section .data
+	.long 0x9abcdef0
+	.section .bss
+	.long 0
diff --git a/ld/testsuite/ld-scripts/ascii.t b/ld/testsuite/ld-scripts/ascii.t
new file mode 100644
index 00000000000..14e01ca5c60
--- /dev/null
+++ b/ld/testsuite/ld-scripts/ascii.t
@@ -0,0 +1,51 @@
+MEMORY {
+  rom : ORIGIN = 0x000000, LENGTH = 0x400000
+  ram : ORIGIN = 0x400000, LENGTH = 0x10000
+}
+
+_start = 0x000000;
+SECTIONS
+{
+  . = 0x1000 + SIZEOF_HEADERS;
+  .text ALIGN (0x100) :
+
+    {
+      INCLUDE "header.inc"
+
+      FILL(0xFF)
+      entry = .;
+      *(.text)
+      . = ALIGN(0x100);
+      ASCII (128) "This is a string, 128 byte long"
+/*      ASCII 32,"This is a string" */
+      LONG(ecc_start)
+      . = ALIGN(16);
+      align_label = .;
+      ASCIZ "This is an unaligned string"
+      unalign_label = .;
+      BYTE(1)
+      BYTE(2)
+      BYTE(3)
+      BYTE(4)
+      BYTE(4)
+      BYTE(7)
+      BYTE(1)
+      BYTE(1)
+      . = ALIGN(16);
+      BYTE(1)
+      . = ALIGN(16);
+      ASCII (8) "This is way too long"
+      . = ALIGN(16);
+      ASCII (64) "I meant to say: This is way too long"
+      . = ALIGN(1024);
+      ASCII (512) "A very long string followed by a '01'"
+      BYTE(1)
+      ecc_end = .;
+    } > rom
+
+  .data : AT (0x400000) { *(.data) } >ram /* NO default AT>rom */
+  . = ALIGN(0x20);
+  .bss : { *(.bss) } >ram /* NO default AT>rom */
+  /DISCARD/ : { *(*) }
+}
+
diff --git a/ld/testsuite/ld-scripts/header.inc b/ld/testsuite/ld-scripts/header.inc
new file mode 100755
index 00000000000..8376d332226
--- /dev/null
+++ b/ld/testsuite/ld-scripts/header.inc
@@ -0,0 +1,34 @@
+      /* HEADER */
+      FILL(0xFF)
+      QUAD(0xEFBEADDE45444F43);
+      crc64 = .;
+      QUAD(0)
+      ecc_start = .;
+      /* Program Entry */
+      LONG(entry)
+
+      /* Program size */
+      LONG(ecc_end - ecc_start)
+
+      /* Time Stamp */
+      time_since_epoch = .;
+      QUAD(0)
+
+      /* 32 bytes here */
+      /* Version info */
+      BYTE(1)
+      BYTE(2)
+      BYTE(3)
+      BYTE(4)
+      LONG(0x17)
+      LONG(0x11470000)
+      LONG(0xEFBEADDE)
+      /* 48 bytes here */
+      ASCII (16) "program name"
+      /* 64 bytes here */
+      ASCII (64) "empty"
+      ASCII (32) "comment 1\n"
+      ASCII (32) "comment 2\n"
+      ASCII (32) "comment 3\n"
+      ASCII (24) "comment 4\n"
+      QUAD(0x45444F43EFBEADDE);
-- 
2.17.1



* [PATCH v0 2/6] Add ASCII command info to NEWS
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
  2023-02-15 11:40 ` [PATCH v0 1/6] Add testsuite for ASCII command binutils
@ 2023-02-15 11:40 ` binutils
  2023-02-15 11:40 ` [PATCH v0 3/6] Add ASCII to info file binutils
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: binutils @ 2023-02-15 11:40 UTC (permalink / raw)
  To: binutils; +Cc: nickc, Ulf Samuelsson

From: Ulf Samuelsson <ulf@emagii.com>

Signed-off-by: Ulf Samuelsson <ulf@emagii.com>
---
 ld/NEWS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/ld/NEWS b/ld/NEWS
index 4ce7e19d40b..38af9cba877 100644
--- a/ld/NEWS
+++ b/ld/NEWS
@@ -1,5 +1,11 @@
 -*- text -*-
 
+* The linker script syntax has a new command for output sections: 
+  ASCII (<size>) "string" (Alt 1 = Working)
+  ASCII <size>, "string"  (Alt 2 = Not Working)
+  This will reserve a zero filled block of <size> bytes at the current
+  location and insert a zero-terminated string at the beginning of the block.
+
 * The linker script syntax has a new command for output sections: ASCIZ "string"
   This will insert a zero-terminated string at the current location.
 
-- 
2.17.1



* [PATCH v0 3/6] Add ASCII to info file
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
  2023-02-15 11:40 ` [PATCH v0 1/6] Add testsuite for ASCII command binutils
  2023-02-15 11:40 ` [PATCH v0 2/6] Add ASCII command info to NEWS binutils
@ 2023-02-15 11:40 ` binutils
  2023-02-15 11:40 ` [PATCH v0 4/6] ldlex.l: add ASCII binutils
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: binutils @ 2023-02-15 11:40 UTC (permalink / raw)
  To: binutils; +Cc: nickc, Ulf Samuelsson

From: Ulf Samuelsson <ulf@emagii.com>

Signed-off-by: Ulf Samuelsson <ulf@emagii.com>
---
 ld/ld.texi | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/ld/ld.texi b/ld/ld.texi
index 335886d4e6b..e309eebfa43 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -5308,6 +5308,7 @@ C identifiers because they contain a @samp{.} character.
 @cindex data
 @cindex section data
 @cindex output section data
+@kindex ASCII (@var{expression}) ``@var{string}''
 @kindex ASCIZ ``@var{string}''
 @kindex BYTE(@var{expression})
 @kindex SHORT(@var{expression})
@@ -5345,14 +5346,27 @@ When the object file format does not have an explicit endianness, as is
 true of, for example, S-records, the value will be stored in the
 endianness of the first input object file.
 
+You can include a fixed size string in an output section by using @code{ASCII}.
+The keyword is followed by a size and a string which is stored at
+the current value of the location counter adding zero bytes at the end.
+
 You can include a zero-terminated string in an output section by using
 @code{ASCIZ}.  The keyword is followed by a string which is stored at
-the current value of the location counter adding a zero byte at the
-end.  If the string includes spaces it must be enclosed in double
-quotes.  The string may contain '\n', '\r', '\t' and octal numbers.
-Hex numbers are not supported.
+the current value of the location counter adding a zero byte at the end.  
+
+If the string in an @code{ASCII} or @code{ASCIZ} command includes spaces
+it must be enclosed in double quotes.
+If the string is too long, a warning is issued and the string is truncated.
+The string can have C escape characters like '\n', '\r', '\t' and octal numbers.
+The '\"' escape is not supported.
+
+Example 1: This is a string of 16 characters and will create a 32 byte area
+@smallexample
+  ASCII 32, "This is 16 bytes"
+  ASCII (32) "This is 16 bytes"
+@end smallexample
 
-For example, this string of 16 characters will create a 17 byte area
+Example 2: This is a string of 16 characters and will create a 17 byte area
 @smallexample
   ASCIZ "This is 16 bytes"
 @end smallexample
-- 
2.17.1



* [PATCH v0 4/6] ldlex.l: add ASCII
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
                   ` (2 preceding siblings ...)
  2023-02-15 11:40 ` [PATCH v0 3/6] Add ASCII to info file binutils
@ 2023-02-15 11:40 ` binutils
  2023-02-15 11:44   ` Ulf Samuelsson
  2023-02-15 11:40 ` [PATCH v0 5/6] ldgram.y: " binutils
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: binutils @ 2023-02-15 11:40 UTC (permalink / raw)
  To: binutils; +Cc: nickc, Ulf Samuelsson

From: Ulf Samuelsson <ulf@emagii.com>

Signed-off-by: Ulf Samuelsson <ulf@emagii.com>
---
 ld/ldlex.l | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ld/ldlex.l b/ld/ldlex.l
index 32336cf0be2..41d03ab40ff 100644
--- a/ld/ldlex.l
+++ b/ld/ldlex.l
@@ -309,6 +309,8 @@ V_IDENTIFIER [*?.$_a-zA-Z\[\]\-\!\^\\]([*?.$_a-zA-Z0-9\[\]\-\!\^\\]|::)*
 <WILD>"LONG"				{ RTOKEN(LONG); }
 <WILD>"SHORT"				{ RTOKEN(SHORT); }
 <WILD>"BYTE"				{ RTOKEN(BYTE); }
+<WILD>"ASCII"				{ RTOKEN(ASCII); }
+<WILD>"ASCIII"				{ RTOKEN(ASCIII); }
 <WILD>"ASCIZ"				{ RTOKEN(ASCIZ); }
 <SCRIPT>"NOFLOAT"			{ RTOKEN(NOFLOAT); }
 <SCRIPT,EXPRESSION>"NOCROSSREFS"	{ RTOKEN(NOCROSSREFS); }
-- 
2.17.1



* [PATCH v0 5/6] ldgram.y: add ASCII
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
                   ` (3 preceding siblings ...)
  2023-02-15 11:40 ` [PATCH v0 4/6] ldlex.l: add ASCII binutils
@ 2023-02-15 11:40 ` binutils
  2023-02-15 11:40 ` [PATCH v0 6/6] ldlang.*: parse ASCII command binutils
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: binutils @ 2023-02-15 11:40 UTC (permalink / raw)
  To: binutils; +Cc: nickc, Ulf Samuelsson

From: Ulf Samuelsson <ulf@emagii.com>

Signed-off-by: Ulf Samuelsson <ulf@emagii.com>
---
 ld/ldgram.y | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/ld/ldgram.y b/ld/ldgram.y
index 8240cf97327..6c3e510111c 100644
--- a/ld/ldgram.y
+++ b/ld/ldgram.y
@@ -91,7 +91,7 @@ static int error_index;
 }
 
 %type <etree> exp opt_exp_with_type mustbe_exp opt_at phdr_type phdr_val
-%type <etree> opt_exp_without_type opt_subalign opt_align
+%type <etree> mustbe_int opt_exp_without_type opt_subalign opt_align
 %type <fill> fill_opt fill_exp
 %type <name_list> exclude_name_list
 %type <wildcard_list> section_name_list
@@ -125,7 +125,7 @@ static int error_index;
 %right UNARY
 %token END
 %left <token> '('
-%token <token> ALIGN_K BLOCK BIND QUAD SQUAD LONG SHORT BYTE ASCIZ
+%token <token> ALIGN_K BLOCK BIND QUAD SQUAD LONG SHORT BYTE ASCII ASCIII ASCIZ
 %token SECTIONS PHDRS INSERT_K AFTER BEFORE
 %token DATA_SEGMENT_ALIGN DATA_SEGMENT_RELRO_END DATA_SEGMENT_END
 %token SORT_BY_NAME SORT_BY_ALIGNMENT SORT_NONE
@@ -668,9 +668,21 @@ statement:
 		{
 		  lang_add_data ((int) $1, $3);
 		}
+	| ASCII '(' mustbe_int ')' NAME
+		{
+		  /* 'value' is a memory leak, do we care? */
+		  etree_type *value = $3;
+		  lang_add_string (value->value.value, $5);
+		}
+	| ASCII mustbe_int ',' NAME
+		{
+		  /* 'value' is a memory leak, do we care? */
+		  etree_type *value = $2;
+		  lang_add_string (value->value.value, $4);
+		}
 	| ASCIZ NAME
 		{
-		  lang_add_string ($2);
+		  lang_add_string (0, $2);
 		}
 	| FILL '(' fill_exp ')'
 		{
@@ -910,6 +922,15 @@ mustbe_exp:		{ ldlex_expression (); }
 			{ ldlex_popstate (); $$ = $2; }
 	;
 
+mustbe_int:		{ ldlex_expression (); }
+		INT
+			{ 
+			  etree_type *value = exp_bigintop ($2.integer, $2.str);
+			  ldlex_popstate ();
+			  $$ = value;
+			}
+	;
+
 exp	:
 		'-' exp %prec UNARY
 			{ $$ = exp_unop ('-', $2); }
-- 
2.17.1



* [PATCH v0 6/6] ldlang.*: parse ASCII command
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
                   ` (4 preceding siblings ...)
  2023-02-15 11:40 ` [PATCH v0 5/6] ldgram.y: " binutils
@ 2023-02-15 11:40 ` binutils
  2023-02-15 17:07 ` [RFC v0 0/6] ASCII Command for output section Nick Clifton
  2023-02-15 17:28 ` Nick Clifton
  7 siblings, 0 replies; 14+ messages in thread
From: binutils @ 2023-02-15 11:40 UTC (permalink / raw)
  To: binutils; +Cc: nickc, Ulf Samuelsson

From: Ulf Samuelsson <ulf@emagii.com>

Signed-off-by: Ulf Samuelsson <ulf@emagii.com>
---
 ld/ldlang.c | 134 +++++++++++++++++++++++-----------------------------
 ld/ldlang.h |   3 +-
 2 files changed, 62 insertions(+), 75 deletions(-)

diff --git a/ld/ldlang.c b/ld/ldlang.c
index b20455c9373..d5d0047c4fc 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -46,6 +46,9 @@
 #include "plugin.h"
 #endif /* BFD_SUPPORTS_PLUGINS */
 
+/* for ASCII command */
+#define	MAX_STRING	256
+
 #ifndef offsetof
 #define offsetof(TYPE, MEMBER) ((size_t) & (((TYPE*) 0)->MEMBER))
 #endif
@@ -8361,89 +8364,72 @@ lang_add_data (int type, union etree_union *exp)
   new_stmt->type = type;
 }
 
-void
-lang_add_string (const char *s)
+/* Count characters in string; escape sequences are not collapsed */
+bfd_vma charcount(const char *s)
 {
-  bfd_vma  len = strlen (s);
-  bfd_vma  i;
-  bool     escape = false;
-
-  /* Add byte expressions until end of string.  */
-  for (i = 0 ; i < len; i++)
-    {
-      char c = *s++;
-
-      if (escape)
-	{
-	  switch (c)
-	    {
-	    default:
-	      /* Ignore the escape.  */
-	      break;
-
-	    case 'n': c = '\n'; break;
-	    case 'r': c = '\r'; break;
-	    case 't': c = '\t'; break;
-	  
-	    case '0':
-	    case '1':
-	    case '2':
-	    case '3':
-	    case '4':
-	    case '5':
-	    case '6':
-	    case '7':
-	      /* We have an octal number.  */
-	      {
-		unsigned int value = c - '0';
-
-		c = *s;
-		if ((c >= '0') && (c <= '7'))
-		  {
-		    value <<= 3;
-		    value += (c - '0');
-		    i++;
-		    s++;
+  bfd_vma count = 0;
+  while (1) {
+    char c = *s++;
+    if (c =='\0')
+      {
+	return count;
+      }
+      count++;
+  }
+}
 
-		    c = *s;
-		    if ((c >= '0') && (c <= '7'))
-		      {
-			value <<= 3;
-			value += (c - '0');
-			i++;
-			s++;
-		      }
-		  }
+void lang_add_string(bfd_vma size, const char *s, ...)
+{
+  bfd_vma len;
+  char string[MAX_STRING];
 
-		if (value > 0xff)
-		  {
-		    /* octal: \777 is treated as '\077' + '7' */
-		    value >>= 3;
-		    i--;
-		    s--;
-		  }
+  /* We allocate enough space for a reasonable buffer */
+  /* If the user specifies a string that does not fit */
+  /* then the user has to split up into several ASCII commands */
+  /* We are a little bit too harsh, since escape chars */
+  /* will reduce the size, but the string should be shortened */
+  bfd_vma alloc_size = charcount(s);
+  if (alloc_size > MAX_STRING-1)
+    {
+      einfo (_("%X%P: ASCII string maximum size exceeded\n"));
+      return;
+    }
 
-		c = value;
-	      }
-	      break;
-	    }
+  memset(string, '\0', MAX_STRING);
+  { /* Evade the -Werror=format-error, if sprintf is called directly */
+    int (*alias)(char *, const char *, ...) = sprintf;
+    alias(string,s);
+  }
+  
+  /* Now we have the actual thing to emit to the object file */
+  len = strlen(string);
 
-	  lang_add_data (BYTE, exp_intop (c));
-	  escape = false;
-	}
-      else
-	{
-	  if (c == '\\')
-	    escape = true;
-	  else
-	    lang_add_data (BYTE, exp_intop (c));
-	}
+  /* Check if it is ASCIZ command (size == 0) */
+  if (size == 0) /* Emit actual length of string + room for '\0' */
+    {
+      size = len + 1;
+    }
+  else if (len > (size - 1))
+    {
+      /* We cannot fit the '\0' at the end */
+      string[size-1] = '\0'; /* truncate string */
+      einfo (_("%P:%pS: warning: ASCII string does not fit in allocated space,"
+               " truncated\n"), NULL);
     }
 
-  /* Remeber to terminate the string.  */
-  lang_add_data (BYTE, exp_intop (0));
+  for (bfd_vma i = 0 ; i < size ; i++)
+    {  if (i < len)
+         {
+           lang_add_data (BYTE, exp_intop (string[i]));
+         }
+       else
+         {
+           lang_add_data (BYTE, exp_intop ('\0'));
+         }
+    }
 }
 
+
 /* Create a new reloc statement.  RELOC is the BFD relocation type to
    generate.  HOWTO is the corresponding howto structure (we could
    look this up, but the caller has already done so).  SECTION is the
diff --git a/ld/ldlang.h b/ld/ldlang.h
index 32819066b8a..fcef937bd44 100644
--- a/ld/ldlang.h
+++ b/ld/ldlang.h
@@ -646,8 +646,9 @@ extern void pop_stat_ptr
   (void);
 extern void lang_add_data
   (int, union etree_union *);
+extern bfd_vma charcount(const char *s);
 extern void lang_add_string
-  (const char *);
+  (bfd_vma, const char *s, ...);
 extern void lang_add_reloc
   (bfd_reloc_code_real_type, reloc_howto_type *, asection *, const char *,
    union etree_union *);
-- 
2.17.1



* Re: [PATCH v0 4/6] ldlex.l: add ASCII
  2023-02-15 11:40 ` [PATCH v0 4/6] ldlex.l: add ASCII binutils
@ 2023-02-15 11:44   ` Ulf Samuelsson
  0 siblings, 0 replies; 14+ messages in thread
From: Ulf Samuelsson @ 2023-02-15 11:44 UTC (permalink / raw)
  To: binutils; +Cc: nickc, Ulf Samuelsson

Oops, this needs to go away...

BR
Ulf

On 2023-02-15 at 12:40, Ulf Samuelsson via Binutils wrote:
> +<WILD>"ASCIII"				{ RTOKEN(ASCIII); }


* Re: [RFC v0 0/6] ASCII Command for output section
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
                   ` (5 preceding siblings ...)
  2023-02-15 11:40 ` [PATCH v0 6/6] ldlang.*: parse ASCII command binutils
@ 2023-02-15 17:07 ` Nick Clifton
  2023-02-15 17:55   ` Ulf Samuelsson
  2023-02-15 17:28 ` Nick Clifton
  7 siblings, 1 reply; 14+ messages in thread
From: Nick Clifton @ 2023-02-15 17:07 UTC (permalink / raw)
  To: binutils, binutils

Hi Ulf,

> I would like to support
> 
>     ASCII <size> , <string>
> 
> when I try, and add
>     
>     ASCII  32 , "mystring"
> 
> I get a "syntax error", and would like to understand why.

Whilst I have not gone into this too deeply, I think that the
short answer is "because that is the way that the linker's
parser works".  It expects numerical expressions, including
constant integer values, to be enclosed in parentheses.

> If I do:
>     ASCII (<size>) <string>
> I do not get a syntax error when I do
> 
>     ASCII (32) "mystring"

Since this method works, I would suggest just sticking with it.

Also, whilst using sprintf() to process escape sequences is a
nice idea, it will not work.  Escape sequences are a C language
feature not a C library feature, so sprintf and its friends will
not translate them for you.
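
A small sketch of the point about escape sequences, under stated
assumptions: the script text reaches the linker as the two bytes '\' and
'n', and both helpers below (copy_with_snprintf and translate_escapes)
are hypothetical names for illustration, not ld functions. Copying
through snprintf leaves the two bytes untouched, which matches the
"5c6e" visible in the expected dump in ascii.d; turning them into a
newline needs an explicit translation step.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: translate the two-character sequences \n, \r and
   \t in SRC into single bytes, writing the result to DST (which must
   have room for strlen (SRC) + 1 bytes).  Any other escaped character
   is copied without its backslash.  Returns the converted length.  */
static size_t
translate_escapes (char *dst, const char *src)
{
  size_t n = 0;

  while (*src != '\0')
    {
      char c = *src++;

      if (c == '\\' && *src != '\0')
	{
	  c = *src++;
	  switch (c)
	    {
	    case 'n': c = '\n'; break;
	    case 'r': c = '\r'; break;
	    case 't': c = '\t'; break;
	    default: break;
	    }
	}
      dst[n++] = c;
    }
  dst[n] = '\0';
  return n;
}

/* Copy TEXT through snprintf as data (never as the format string) and
   return snprintf's count, which is the length of TEXT when DST is
   large enough.  */
static size_t
copy_with_snprintf (char *dst, size_t dst_size, const char *text)
{
  int written = snprintf (dst, dst_size, "%s", text);
  return written < 0 ? 0 : (size_t) written;
}
```

Passing the text as an argument to "%s" rather than as the format itself
also avoids treating any '%' in a user string as a conversion.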

Cheers
   Nick




* Re: [RFC v0 0/6] ASCII Command for output section
  2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
                   ` (6 preceding siblings ...)
  2023-02-15 17:07 ` [RFC v0 0/6] ASCII Command for output section Nick Clifton
@ 2023-02-15 17:28 ` Nick Clifton
  2023-02-15 17:52   ` Ulf Samuelsson
  2023-02-15 18:29   ` Ulf Samuelsson
  7 siblings, 2 replies; 14+ messages in thread
From: Nick Clifton @ 2023-02-15 17:28 UTC (permalink / raw)
  To: binutils, binutils

[-- Attachment #1: Type: text/plain, Size: 273 bytes --]

Hi Ulf,

   What do you think of this alternative version of your patch ?

   It will still need tweaking so that the tests will work on big-endian
   architectures, and the documentation will need fixing, but I think
   that it does most of what you want.
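
   As a rough illustration of the big-endian concern (a sketch only;
   put_le32 and put_be32 are hypothetical helpers, not linker code):
   LONG(0x17) in header.inc appears as "17000000" in the expected dump,
   which is the little-endian layout, so a big-endian target would
   produce different bytes for the same script.

```c
#include <stdint.h>

/* Hypothetical helper: store V least significant byte first.  */
static void
put_le32 (unsigned char out[4], uint32_t v)
{
  out[0] = (unsigned char) (v & 0xff);
  out[1] = (unsigned char) ((v >> 8) & 0xff);
  out[2] = (unsigned char) ((v >> 16) & 0xff);
  out[3] = (unsigned char) ((v >> 24) & 0xff);
}

/* Hypothetical helper: store V most significant byte first.  */
static void
put_be32 (unsigned char out[4], uint32_t v)
{
  out[0] = (unsigned char) ((v >> 24) & 0xff);
  out[1] = (unsigned char) ((v >> 16) & 0xff);
  out[2] = (unsigned char) ((v >> 8) & 0xff);
  out[3] = (unsigned char) (v & 0xff);
}
```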

Cheers
   Nick

[-- Attachment #2: ascii.patch --]
[-- Type: text/x-patch, Size: 20509 bytes --]

diff --git a/ld/NEWS b/ld/NEWS
index 4ce7e19d40b..38af9cba877 100644
--- a/ld/NEWS
+++ b/ld/NEWS
@@ -1,5 +1,11 @@
 -*- text -*-
 
+* The linker script syntax has a new command for output sections: 
+  ASCII (<size>) "string" (Alt 1 = Working)
+  ASCII <size>, "string"  (Alt 2 = Not Working)
+  This will reserve a zero filled block of <size> bytes at the current
+  location and insert a zero-terminated string at the beginning of the block.
+
 * The linker script syntax has a new command for output sections: ASCIZ "string"
   This will insert a zero-terminated string at the current location.
 
diff --git a/ld/ld.texi b/ld/ld.texi
index 335886d4e6b..e309eebfa43 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -5308,6 +5308,7 @@ C identifiers because they contain a @samp{.} character.
 @cindex data
 @cindex section data
 @cindex output section data
+@kindex ASCII (@var{expression}) ``@var{string}''
 @kindex ASCIZ ``@var{string}''
 @kindex BYTE(@var{expression})
 @kindex SHORT(@var{expression})
@@ -5345,14 +5346,27 @@ When the object file format does not have an explicit endianness, as is
 true of, for example, S-records, the value will be stored in the
 endianness of the first input object file.
 
+You can include a fixed size string in an output section by using @code{ASCII}.
+The keyword is followed by a size and a string which is stored at
+the current value of the location counter adding zero bytes at the end.
+
 You can include a zero-terminated string in an output section by using
 @code{ASCIZ}.  The keyword is followed by a string which is stored at
-the current value of the location counter adding a zero byte at the
-end.  If the string includes spaces it must be enclosed in double
-quotes.  The string may contain '\n', '\r', '\t' and octal numbers.
-Hex numbers are not supported.
+the current value of the location counter adding a zero byte at the end.  
+
+If the string in an @code{ASCII} or @code{ASCIZ} command includes spaces
+it must be enclosed in double quotes.
+If the string is too long, a warning is issued and the string is truncated.
+The string can have C escape characters like '\n', '\r', '\t' and octal numbers.
+The '\"' escape is not supported.
+
+Example 1: This is a string of 16 characters and will create a 32 byte area
+@smallexample
+  ASCII 32, "This is 16 bytes"
+  ASCII (32) "This is 16 bytes"
+@end smallexample
 
-For example, this string of 16 characters will create a 17 byte area
+Example 2: This is a string of 16 characters and will create a 17 byte area
 @smallexample
   ASCIZ "This is 16 bytes"
 @end smallexample
diff --git a/ld/ldgram.y b/ld/ldgram.y
index 8240cf97327..faffeec94b8 100644
--- a/ld/ldgram.y
+++ b/ld/ldgram.y
@@ -125,7 +125,7 @@ static int error_index;
 %right UNARY
 %token END
 %left <token> '('
-%token <token> ALIGN_K BLOCK BIND QUAD SQUAD LONG SHORT BYTE ASCIZ
+%token <token> ALIGN_K BLOCK BIND QUAD SQUAD LONG SHORT BYTE ASCII ASCIZ
 %token SECTIONS PHDRS INSERT_K AFTER BEFORE
 %token DATA_SEGMENT_ALIGN DATA_SEGMENT_RELRO_END DATA_SEGMENT_END
 %token SORT_BY_NAME SORT_BY_ALIGNMENT SORT_NONE
@@ -668,9 +668,15 @@ statement:
 		{
 		  lang_add_data ((int) $1, $3);
 		}
+        | ASCII '(' mustbe_exp ')' NAME
+		{
+		  /* 'value' is a memory leak, do we care?  */
+		  etree_type *value = $3;
+		  lang_add_string (value->value.value, $5);
+		}
 	| ASCIZ NAME
 		{
-		  lang_add_string ($2);
+		  lang_add_string (0, $2);
 		}
 	| FILL '(' fill_exp ')'
 		{
diff --git a/ld/ldlang.c b/ld/ldlang.c
index b20455c9373..82e623a6f4a 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -8361,15 +8361,16 @@ lang_add_data (int type, union etree_union *exp)
   new_stmt->type = type;
 }
 
-void
-lang_add_string (const char *s)
+static char *
+convert_string (const char * s)
 {
-  bfd_vma  len = strlen (s);
-  bfd_vma  i;
-  bool     escape = false;
+  int     len = strlen (s);
+  int     i;
+  bool    escape = false;
+  char *  buffer = malloc (len + 1);
+  char *  b;
 
-  /* Add byte expressions until end of string.  */
-  for (i = 0 ; i < len; i++)
+  for (i = 0, b = buffer; i < len; i++)
     {
       char c = *s++;
 
@@ -8404,7 +8405,7 @@ lang_add_string (const char *s)
 		    value += (c - '0');
 		    i++;
 		    s++;
-
+ 
 		    c = *s;
 		    if ((c >= '0') && (c <= '7'))
 		      {
@@ -8422,26 +8423,58 @@ lang_add_string (const char *s)
 		    i--;
 		    s--;
 		  }
-
+		
 		c = value;
 	      }
 	      break;
 	    }
-
-	  lang_add_data (BYTE, exp_intop (c));
 	  escape = false;
 	}
       else
 	{
 	  if (c == '\\')
-	    escape = true;
-	  else
-	    lang_add_data (BYTE, exp_intop (c));
+	    {
+	      escape = true;
+	      continue;
+	    }
 	}
+
+      * b ++ = c;
     }
 
-  /* Remeber to terminate the string.  */
-  lang_add_data (BYTE, exp_intop (0));
+  * b = 0;
+  return buffer;
+}
+
+void
+lang_add_string (int size, const char *s)
+{
+  int     len;
+  int     i;
+  char *  string;
+
+  string = convert_string (s);
+  len = strlen (string);
+
+  /* Check if it is an ASCIZ command (size == 0).  */
+  if (size == 0)
+    size = len + 1;
+  else if (len > size)
+    {
+      /* We cannot fit the '\0' at the end.  */
+      len = size;
+
+      einfo (_("%P:%pS: warning: ASCII string does not fit in allocated space,"
+               " truncated\n"), NULL);
+    }
+
+  for (i = 0 ; i < len ; i++)
+    lang_add_data (BYTE, exp_intop (string[i]));
+
+  while (i++ < size)
+    lang_add_data (BYTE, exp_intop ('\0'));
+
+  free (string);
 }
 
 /* Create a new reloc statement.  RELOC is the BFD relocation type to
diff --git a/ld/ldlang.h b/ld/ldlang.h
index 32819066b8a..fe85e159aa7 100644
--- a/ld/ldlang.h
+++ b/ld/ldlang.h
@@ -646,8 +646,9 @@ extern void pop_stat_ptr
   (void);
 extern void lang_add_data
   (int, union etree_union *);
+extern bfd_vma charcount(const char *s);
 extern void lang_add_string
-  (const char *);
+  (int, const char *s);
 extern void lang_add_reloc
   (bfd_reloc_code_real_type, reloc_howto_type *, asection *, const char *,
    union etree_union *);
diff --git a/ld/ldlex.l b/ld/ldlex.l
index 32336cf0be2..910e7ea3b8b 100644
--- a/ld/ldlex.l
+++ b/ld/ldlex.l
@@ -309,6 +309,7 @@ V_IDENTIFIER [*?.$_a-zA-Z\[\]\-\!\^\\]([*?.$_a-zA-Z0-9\[\]\-\!\^\\]|::)*
 <WILD>"LONG"				{ RTOKEN(LONG); }
 <WILD>"SHORT"				{ RTOKEN(SHORT); }
 <WILD>"BYTE"				{ RTOKEN(BYTE); }
+<WILD>"ASCII"				{ RTOKEN(ASCII); }
 <WILD>"ASCIZ"				{ RTOKEN(ASCIZ); }
 <SCRIPT>"NOFLOAT"			{ RTOKEN(NOFLOAT); }
 <SCRIPT,EXPRESSION>"NOCROSSREFS"	{ RTOKEN(NOCROSSREFS); }
diff --git a/ld/testsuite/ld-scripts/asciz.d b/ld/testsuite/ld-scripts/asciz.d
index 615cf99732f..d3b8e89fb31 100644
--- a/ld/testsuite/ld-scripts/asciz.d
+++ b/ld/testsuite/ld-scripts/asciz.d
@@ -1,17 +1,12 @@
 #source: asciz.s
 #ld: -T asciz.t
-#objdump: -s -j .text
-#target: [is_elf_format]
-#skip: mips*-*-*
-#skip: tilegx*-*-* tilepro-*-*
-# COFF, PE and MIPS targets align code to a 16 byte boundary
-# tilegx andtilepro aligns code to a 8 byte boundary.
+#objdump: -s -j .data
 
 .*:     file format .*
 
-Contents of section .text:
- .... 01010101 54686973 20697320 61207374  ....This is a st
- .... 72696e67 00...... ........ ........  ring............
- .... 54686973 20697320 616e6f74 68657220  This is another 
- .... 0a737472 696e6753 00                 .stringS........
+Contents of section .data:
+ .... 54686973 20697320 61207374 72696e67  This is a string
+ .... 00546869 73206973 20616e6f 74686572  .This is another
+ .... 0a537472 696e6700 006e6f71 756f7465  .String..noquote
+ .... 7300                                 s.              
 #pass
diff --git a/ld/testsuite/ld-scripts/asciz.t b/ld/testsuite/ld-scripts/asciz.t
index ab66f9a5bfb..3aeb7d0c767 100644
--- a/ld/testsuite/ld-scripts/asciz.t
+++ b/ld/testsuite/ld-scripts/asciz.t
@@ -1,23 +1,16 @@
-MEMORY {
-  rom : ORIGIN = 0x00000, LENGTH = 0x10000
-  ram : ORIGIN = 0x10000, LENGTH = 0x10000
-}
 
 _start = 0x000000;
 SECTIONS
 {
   . = 0x1000 + SIZEOF_HEADERS;
-  .text ALIGN (0x20) :
-    {
-      *(.text)
+  
+  .data : AT (0x10000)
+  {
       ASCIZ "This is a string"
-      . = ALIGN(0x20);
-      align_label = .;
-      ASCIZ "This is another \nstring\123"
-      unalign_label = .;
-    }
-  .data : AT (0x10000) { *(.data) } >ram /* NO default AT>rom */
-  . = ALIGN(0x20);
-  .bss : { *(.bss) } >ram /* NO default AT>rom */
+      ASCIZ "This is another\n\123tring"
+      ASCIZ ""
+      ASCIZ noquotes
+  }
+  
   /DISCARD/ : { *(*) }
 }
diff --git a/ld/testsuite/ld-scripts/script.exp b/ld/testsuite/ld-scripts/script.exp
index a574dde034c..56e12da8e61 100644
--- a/ld/testsuite/ld-scripts/script.exp
+++ b/ld/testsuite/ld-scripts/script.exp
@@ -228,6 +228,7 @@ foreach test_script $test_script_list {
 }
 
 run_dump_test "asciz"
+run_dump_test "ascii"
 run_dump_test "align-with-input"
 run_dump_test "pr20302"
 run_dump_test "output-section-types"
--- /dev/null	2023-02-15 08:07:19.809001089 +0000
+++ current/ld/testsuite/ld-scripts/ascii.s	2023-02-15 11:43:49.984490869 +0000
@@ -0,0 +1,9 @@
+    .extern ecc_start
+	.section .text
+main:
+	.long 0x45444F43
+	.long ecc_start
+	.section .data
+	.long 0x9abcdef0
+	.section .bss
+	.long 0
--- /dev/null	2023-02-15 08:07:19.809001089 +0000
+++ current/ld/testsuite/ld-scripts/ascii.t	2023-02-15 16:51:11.265841306 +0000
@@ -0,0 +1,51 @@
+MEMORY {
+  rom : ORIGIN = 0x000000, LENGTH = 0x400000
+  ram : ORIGIN = 0x400000, LENGTH = 0x10000
+}
+
+_start = 0x000000;
+SECTIONS
+{
+  . = 0x1000 + SIZEOF_HEADERS;
+  .text ALIGN (0x100) :
+
+    {
+      INCLUDE "header.inc"
+
+      FILL(0xFF)
+      entry = .;
+      *(.text)
+      . = ALIGN(0x100);
+      ASCII (128) "This is a string, 128 byte long"
+/*      ASCII (32) "This is a string" */
+      LONG(ecc_start)
+      . = ALIGN(16);
+      align_label = .;
+      ASCIZ "This is an unaligned string"
+      unalign_label = .;
+      BYTE(1)
+      BYTE(2)
+      BYTE(3)
+      BYTE(4)
+      BYTE(4)
+      BYTE(7)
+      BYTE(1)
+      BYTE(1)
+      . = ALIGN(16);
+      BYTE(1)
+      . = ALIGN(16);
+      /* ASCII (8) "This is way too long" */
+      . = ALIGN(16);
+      ASCII (64) "I meant to say: This is way too long"
+      . = ALIGN(1024);
+      ASCII (512) "A very long string followed by a '01'"
+      BYTE(1)
+      ecc_end = .;
+    } > rom
+
+  .data : AT (0x400000) { *(.data) } >ram /* NO default AT>rom */
+  . = ALIGN(0x20);
+  .bss : { *(.bss) } >ram /* NO default AT>rom */
+  /DISCARD/ : { *(*) }
+}
+
--- /dev/null	2023-02-15 08:07:19.809001089 +0000
+++ current/ld/testsuite/ld-scripts/ascii.d	2023-02-15 17:25:33.406181103 +0000
@@ -0,0 +1,155 @@
+#source: ascii.s
+#ld: -T ascii.t
+#objdump: -s -j .text
+#notarget: [is_aout_format]
+#xfail: tic4x-*-* tic54x-*-*
+
+.*:     file format .*
+
+Contents of section .text:
+ 1100 434f4445 deadbeef 00000000 00000000  CODE............
+ 1110 00120000 f1080000 00000000 00000000  ................
+ 1120 01020304 17000000 00004711 deadbeef  ..........G.....
+ 1130 70726f67 72616d20 6e616d65 00000000  program name....
+ 1140 656d7074 79000000 00000000 00000000  empty...........
+ 1150 00000000 00000000 00000000 00000000  ................
+ 1160 00000000 00000000 00000000 00000000  ................
+ 1170 00000000 00000000 00000000 00000000  ................
+ 1180 636f6d6d 656e7420 310a0000 00000000  comment 1.......
+ 1190 00000000 00000000 00000000 00000000  ................
+ 11a0 636f6d6d 656e7420 320a0000 00000000  comment 2.......
+ 11b0 00000000 00000000 00000000 00000000  ................
+ 11c0 636f6d6d 656e7420 330a0000 00000000  comment 3.......
+ 11d0 00000000 00000000 00000000 00000000  ................
+ 11e0 636f6d6d 656e7420 340a0000 00000000  comment 4.......
+ 11f0 00000000 00000000 deadbeef 434f4445  ............CODE
+ 1200 434f4445 10110000 ffffffff ffffffff  CODE............
+ 1210 ffffffff ffffffff ffffffff ffffffff  ................
+ 1220 ffffffff ffffffff ffffffff ffffffff  ................
+ 1230 ffffffff ffffffff ffffffff ffffffff  ................
+ 1240 ffffffff ffffffff ffffffff ffffffff  ................
+ 1250 ffffffff ffffffff ffffffff ffffffff  ................
+ 1260 ffffffff ffffffff ffffffff ffffffff  ................
+ 1270 ffffffff ffffffff ffffffff ffffffff  ................
+ 1280 ffffffff ffffffff ffffffff ffffffff  ................
+ 1290 ffffffff ffffffff ffffffff ffffffff  ................
+ 12a0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12b0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12c0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12d0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12e0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12f0 ffffffff ffffffff ffffffff ffffffff  ................
+ 1300 54686973 20697320 61207374 72696e67  This is a string
+ 1310 2c203132 38206279 7465206c 6f6e6700  , 128 byte long.
+ 1320 00000000 00000000 00000000 00000000  ................
+ 1330 00000000 00000000 00000000 00000000  ................
+ 1340 00000000 00000000 00000000 00000000  ................
+ 1350 00000000 00000000 00000000 00000000  ................
+ 1360 00000000 00000000 00000000 00000000  ................
+ 1370 00000000 00000000 00000000 00000000  ................
+ 1380 10110000 ffffffff ffffffff ffffffff  ................
+ 1390 54686973 20697320 616e2075 6e616c69  This is an unali
+ 13a0 676e6564 20737472 696e6700 01020304  gned string.....
+ 13b0 04070101 ffffffff ffffffff ffffffff  ................
+ 13c0 01ffffff ffffffff ffffffff ffffffff  ................
+ 13d0 49206d65 616e7420 746f2073 61793a20  I meant to say: 
+ 13e0 54686973 20697320 77617920 746f6f20  This is way too 
+ 13f0 6c6f6e67 00000000 00000000 00000000  long............
+ .... 00000000 00000000 00000000 00000000  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... 41207665 7279206c 6f6e6720 73747269  A very long stri
+ .... 6e672066 6f6c6c6f 77656420 62792061  ng followed by a
+ .... 20273031 27000000 00000000 00000000   '01'...........
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 01                                   .               
+#pass
--- /dev/null	2023-02-15 08:07:19.809001089 +0000
+++ current/ld/testsuite/ld-scripts/header.inc	2023-02-15 16:49:45.026041032 +0000
@@ -0,0 +1,34 @@
+      /* HEADER */
+      FILL(0xFF)
+      QUAD(0xEFBEADDE45444F43);
+      crc64 = .;
+      QUAD(0)
+      ecc_start = .;
+      /* Program Entry */
+      LONG(entry)
+
+      /* Program size */
+      LONG(ecc_end - ecc_start)
+
+      /* Time Stamp */
+      time_since_epoch = .;
+      QUAD(0)
+
+      /* 32 bytes here */
+      /* Version info */
+      BYTE(1)
+      BYTE(2)
+      BYTE(3)
+      BYTE(4)
+      LONG(0x17)
+      LONG(0x11470000)
+      LONG(0xEFBEADDE)
+      /* 48 bytes here */
+      ASCII (16) "program name"
+      /* 64 bytes here */
+      ASCII (64) "empty"
+      ASCII (32) "comment 1\n"
+      ASCII (32) "comment 2\n"
+      ASCII (32) "comment 3\n"
+      ASCII (24) "comment 4\n"
+      QUAD(0x45444F43EFBEADDE);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC v0 0/6] ASCII Command for output section
  2023-02-15 17:28 ` Nick Clifton
@ 2023-02-15 17:52   ` Ulf Samuelsson
  2023-02-15 18:29   ` Ulf Samuelsson
  1 sibling, 0 replies; 14+ messages in thread
From: Ulf Samuelsson @ 2023-02-15 17:52 UTC (permalink / raw)
  To: Nick Clifton, binutils


On 2023-02-15 at 18:28, Nick Clifton wrote:
> Hi Ulf,
>
>   What do you think of this alternative version of your patch ?
>
Seems to do what I want, so I would be OK with applying this.

Best Regards
Ulf Samuelsson

>   It will still need tweaking so that the tests will work on big-endian
>   architectures, and the documentation will need fixing, but I think
>   that it does most of what you want.
>
> Cheers
>   Nick

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC v0 0/6] ASCII Command for output section
  2023-02-15 17:07 ` [RFC v0 0/6] ASCII Command for output section Nick Clifton
@ 2023-02-15 17:55   ` Ulf Samuelsson
  0 siblings, 0 replies; 14+ messages in thread
From: Ulf Samuelsson @ 2023-02-15 17:55 UTC (permalink / raw)
  To: Nick Clifton, binutils


On 2023-02-15 at 18:07, Nick Clifton wrote:
> Hi Ulf,
>
>> I would like to support
>>
>>     ASCII <size> , <string>
>>
>> when I try, and add
>>         ASCII  32 , "mystring"
>>
>> I get a "syntax error", and would like to understand why.
>
> Whilst I have not gone into this too deeply, I think that the
> short answer is "because that is the way that the linker's
> parser works".  It expects numerical expressions, including
> constant integer values, to be enclosed in parentheses.

I used a small trick that lets me turn on yydebug from the script:

     | DEBUG ON
         {
           yydebug = 1;
         }
     | DEBUG OFF
         {
           yydebug = 0;
         }

With that I could see that the input is parsed as 'ASCII', 'NAME' and not as 'ASCII', 'INT'.

>
>> If I do:
>>     ASCII (<size>) <string>
>> I do not get a syntax error when I do
>>
>>     ASCII (32) "mystring"
>
> Since this method works, I would suggest just sticking with it.
>
If we are forced to do

    ASCII (32) , "mystring"

there is no advantage, so I guess I have to give up my original idea.

> Also, whilst using sprintf() to process escape sequences is a
> nice idea, it will not work.  Escape sequences are a C language
> feature not a C library feature, so sprintf and its friends will
> not translate them for you.
>
Yes, you are right.


> Cheers
>   Nick
>
>
Best Regards

Ulf Samuelsson


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC v0 0/6] ASCII Command for output section
  2023-02-15 17:28 ` Nick Clifton
  2023-02-15 17:52   ` Ulf Samuelsson
@ 2023-02-15 18:29   ` Ulf Samuelsson
  2023-02-16 16:31     ` Nick Clifton
  1 sibling, 1 reply; 14+ messages in thread
From: Ulf Samuelsson @ 2023-02-15 18:29 UTC (permalink / raw)
  To: Nick Clifton, binutils

[-- Attachment #1: Type: text/plain, Size: 1469 bytes --]


On 2023-02-15 at 18:28, Nick Clifton wrote:
> Hi Ulf,
>
>   What do you think of this alternative version of your patch ?
>
>   It will still need tweaking so that the tests will work on big-endian
>   architectures, and the documentation will need fixing, but I think
>   that it does most of what you want.

A closer look reveals some oddities in the white space.
Also, when the string in the ASCII command is longer than the specified
length, we still need to insert a '\0' character at the end.

For this:

     len = size;

must be changed to

     len = size - 1;

===============

The size is specified by a "mustbe_exp"

     | ASCII '(' mustbe_exp ')' NAME

         {
           /* 'value' is a memory leak, do we care?  */
           etree_type *value = $3;
           lang_add_string (value->value.value, $5);
         }

I cannot judge the consequences of allowing a full expression instead of
an INT.

Will "value->value.value" still work if we do

      ASCII (3 * 15) "Long string"

?

===============

We do a malloc in the "etree_type *value = $3;" statement.
This is thrown away without a "free".

Is that a problem anywhere?

I guess most OSes will reclaim everything once 'ld' terminates.

There will not be that many ASCII statements in a linker command file.

===============

I enclose a small update to your patch.

Best Regards
Ulf Samuelsson


>
> Cheers
>   Nick

[-- Attachment #2: 0001-ASCII-command.patch --]
[-- Type: text/x-patch, Size: 21876 bytes --]

From 1f183edc0428706d9839b1863240e7b5a2a6f036 Mon Sep 17 00:00:00 2001
From: Ulf Samuelsson <binutils@emagii.com>
Date: Wed, 15 Feb 2023 19:06:57 +0100
Subject: [PATCH] ASCII command

Signed-off-by: Ulf Samuelsson <binutils@emagii.com>
---
 ld/NEWS                            |   6 ++
 ld/ld.texi                         |  24 ++++-
 ld/ldgram.y                        |  10 +-
 ld/ldlang.c                        |  65 +++++++++---
 ld/ldlang.h                        |   3 +-
 ld/ldlex.l                         |   1 +
 ld/testsuite/ld-scripts/ascii.d    | 155 +++++++++++++++++++++++++++++
 ld/testsuite/ld-scripts/ascii.s    |   9 ++
 ld/testsuite/ld-scripts/ascii.t    |  51 ++++++++++
 ld/testsuite/ld-scripts/asciz.d    |  17 ++--
 ld/testsuite/ld-scripts/asciz.t    |  23 ++---
 ld/testsuite/ld-scripts/header.inc |  34 +++++++
 ld/testsuite/ld-scripts/script.exp |   1 +
 13 files changed, 349 insertions(+), 50 deletions(-)
 create mode 100644 ld/testsuite/ld-scripts/ascii.d
 create mode 100644 ld/testsuite/ld-scripts/ascii.s
 create mode 100644 ld/testsuite/ld-scripts/ascii.t
 create mode 100644 ld/testsuite/ld-scripts/header.inc

diff --git a/ld/NEWS b/ld/NEWS
index 4ce7e19d40b..38af9cba877 100644
--- a/ld/NEWS
+++ b/ld/NEWS
@@ -1,5 +1,11 @@
 -*- text -*-
 
+* The linker script syntax has a new command for output sections: 
+  ASCII (<size>) "string" (Alt 1 = Working)
+  ASCII <size>, "string"  (Alt 2 = Not Working)
+  This will reserve a zero filled block of <size> bytes at the current
+  location and insert a zero-terminated string at the beginning of the block.
+
 * The linker script syntax has a new command for output sections: ASCIZ "string"
   This will insert a zero-terminated string at the current location.
 
diff --git a/ld/ld.texi b/ld/ld.texi
index 335886d4e6b..e309eebfa43 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -5308,6 +5308,7 @@ C identifiers because they contain a @samp{.} character.
 @cindex data
 @cindex section data
 @cindex output section data
+@kindex ASCII (@var{expression}) ``@var{string}''
 @kindex ASCIZ ``@var{string}''
 @kindex BYTE(@var{expression})
 @kindex SHORT(@var{expression})
@@ -5345,14 +5346,27 @@ When the object file format does not have an explicit endianness, as is
 true of, for example, S-records, the value will be stored in the
 endianness of the first input object file.
 
+You can include a fixed size string in an output section by using @code{ASCII}.
+The keyword is followed by a size and a string which is stored at
+the current value of the location counter adding zero bytes at the end.
+
 You can include a zero-terminated string in an output section by using
 @code{ASCIZ}.  The keyword is followed by a string which is stored at
-the current value of the location counter adding a zero byte at the
-end.  If the string includes spaces it must be enclosed in double
-quotes.  The string may contain '\n', '\r', '\t' and octal numbers.
-Hex numbers are not supported.
+the current value of the location counter adding a zero byte at the end.  
+
+If the string in an @code{ASCII} or @code{ASCIZ} command includes spaces
+it must be enclosed in double quotes.
+If the string is too long, a warning is issued and the string is truncated.
+The string can have C escape characters like '\n', '\r', '\t' and octal numbers.
+The '\"' escape is not supported.
+
+Example 1: This is a string of 16 characters and will create a 32 byte area
+@smallexample
+  ASCII 32, "This is 16 bytes"
+  ASCII (32) "This is 16 bytes"
+@end smallexample
 
-For example, this string of 16 characters will create a 17 byte area
+Example 2: This is a string of 16 characters and will create a 17 byte area
 @smallexample
   ASCIZ "This is 16 bytes"
 @end smallexample
diff --git a/ld/ldgram.y b/ld/ldgram.y
index 8240cf97327..8aa7749c1e8 100644
--- a/ld/ldgram.y
+++ b/ld/ldgram.y
@@ -125,7 +125,7 @@ static int error_index;
 %right UNARY
 %token END
 %left <token> '('
-%token <token> ALIGN_K BLOCK BIND QUAD SQUAD LONG SHORT BYTE ASCIZ
+%token <token> ALIGN_K BLOCK BIND QUAD SQUAD LONG SHORT BYTE ASCII ASCIZ
 %token SECTIONS PHDRS INSERT_K AFTER BEFORE
 %token DATA_SEGMENT_ALIGN DATA_SEGMENT_RELRO_END DATA_SEGMENT_END
 %token SORT_BY_NAME SORT_BY_ALIGNMENT SORT_NONE
@@ -668,9 +668,15 @@ statement:
 		{
 		  lang_add_data ((int) $1, $3);
 		}
+	| ASCII '(' mustbe_exp ')' NAME
+		{
+		  /* 'value' is a memory leak, do we care?  */
+		  etree_type *value = $3;
+		  lang_add_string (value->value.value, $5);
+		}
 	| ASCIZ NAME
 		{
-		  lang_add_string ($2);
+		  lang_add_string (0, $2);
 		}
 	| FILL '(' fill_exp ')'
 		{
diff --git a/ld/ldlang.c b/ld/ldlang.c
index b20455c9373..2ba2980f082 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -8361,15 +8361,16 @@ lang_add_data (int type, union etree_union *exp)
   new_stmt->type = type;
 }
 
-void
-lang_add_string (const char *s)
+static char *
+convert_string (const char * s)
 {
-  bfd_vma  len = strlen (s);
-  bfd_vma  i;
-  bool     escape = false;
+  int     len = strlen (s);
+  int     i;
+  bool    escape = false;
+  char *  buffer = malloc (len + 1);
+  char *  b;
 
-  /* Add byte expressions until end of string.  */
-  for (i = 0 ; i < len; i++)
+  for (i = 0, b = buffer; i < len; i++)
     {
       char c = *s++;
 
@@ -8404,7 +8405,7 @@ lang_add_string (const char *s)
 		    value += (c - '0');
 		    i++;
 		    s++;
-
+ 
 		    c = *s;
 		    if ((c >= '0') && (c <= '7'))
 		      {
@@ -8422,26 +8423,58 @@ lang_add_string (const char *s)
 		    i--;
 		    s--;
 		  }
-
+		
 		c = value;
 	      }
 	      break;
 	    }
-
-	  lang_add_data (BYTE, exp_intop (c));
 	  escape = false;
 	}
       else
 	{
 	  if (c == '\\')
-	    escape = true;
-	  else
-	    lang_add_data (BYTE, exp_intop (c));
+	    {
+	      escape = true;
+	      continue;
+	    }
 	}
+
+      * b ++ = c;
     }
 
-  /* Remeber to terminate the string.  */
-  lang_add_data (BYTE, exp_intop (0));
+  * b = 0;
+  return buffer;
+}
+
+void
+lang_add_string (int size, const char *s)
+{
+  int     len;
+  int     i;
+  char *  string;
+
+  string = convert_string (s);
+  len = strlen (string);
+
+  /* Check if it is an ASCIZ command (size == 0).  */
+  if (size == 0)
+    size = len + 1;
+  else if (len > size)
+    {
+      /* We cannot fit the '\0' at the end.  */
+      len = size - 1;
+
+      einfo (_("%P:%pS: warning: ASCII string does not fit in allocated space,"
+	       " truncated\n"), NULL);
+    }
+
+  for (i = 0 ; i < len ; i++)
+    lang_add_data (BYTE, exp_intop (string[i]));
+
+  while (i++ < size)
+    lang_add_data (BYTE, exp_intop ('\0'));
+
+  free (string);
 }
 
 /* Create a new reloc statement.  RELOC is the BFD relocation type to
diff --git a/ld/ldlang.h b/ld/ldlang.h
index 32819066b8a..fe85e159aa7 100644
--- a/ld/ldlang.h
+++ b/ld/ldlang.h
@@ -646,8 +646,9 @@ extern void pop_stat_ptr
   (void);
 extern void lang_add_data
   (int, union etree_union *);
+extern bfd_vma charcount(const char *s);
 extern void lang_add_string
-  (const char *);
+  (int, const char *s);
 extern void lang_add_reloc
   (bfd_reloc_code_real_type, reloc_howto_type *, asection *, const char *,
    union etree_union *);
diff --git a/ld/ldlex.l b/ld/ldlex.l
index 32336cf0be2..910e7ea3b8b 100644
--- a/ld/ldlex.l
+++ b/ld/ldlex.l
@@ -309,6 +309,7 @@ V_IDENTIFIER [*?.$_a-zA-Z\[\]\-\!\^\\]([*?.$_a-zA-Z0-9\[\]\-\!\^\\]|::)*
 <WILD>"LONG"				{ RTOKEN(LONG); }
 <WILD>"SHORT"				{ RTOKEN(SHORT); }
 <WILD>"BYTE"				{ RTOKEN(BYTE); }
+<WILD>"ASCII"				{ RTOKEN(ASCII); }
 <WILD>"ASCIZ"				{ RTOKEN(ASCIZ); }
 <SCRIPT>"NOFLOAT"			{ RTOKEN(NOFLOAT); }
 <SCRIPT,EXPRESSION>"NOCROSSREFS"	{ RTOKEN(NOCROSSREFS); }
diff --git a/ld/testsuite/ld-scripts/ascii.d b/ld/testsuite/ld-scripts/ascii.d
new file mode 100644
index 00000000000..922762b7f97
--- /dev/null
+++ b/ld/testsuite/ld-scripts/ascii.d
@@ -0,0 +1,155 @@
+#source: ascii.s
+#ld: -T ascii.t
+#objdump: -s -j .text
+#notarget: [is_aout_format]
+#xfail: tic4x-*-* tic54x-*-*
+
+.*:     file format .*
+
+Contents of section .text:
+ 1100 434f4445 deadbeef 00000000 00000000  CODE............
+ 1110 00120000 f1080000 00000000 00000000  ................
+ 1120 01020304 17000000 00004711 deadbeef  ..........G.....
+ 1130 70726f67 72616d20 6e616d65 00000000  program name....
+ 1140 656d7074 79000000 00000000 00000000  empty...........
+ 1150 00000000 00000000 00000000 00000000  ................
+ 1160 00000000 00000000 00000000 00000000  ................
+ 1170 00000000 00000000 00000000 00000000  ................
+ 1180 636f6d6d 656e7420 310a0000 00000000  comment 1.......
+ 1190 00000000 00000000 00000000 00000000  ................
+ 11a0 636f6d6d 656e7420 320a0000 00000000  comment 2.......
+ 11b0 00000000 00000000 00000000 00000000  ................
+ 11c0 636f6d6d 656e7420 330a0000 00000000  comment 3.......
+ 11d0 00000000 00000000 00000000 00000000  ................
+ 11e0 636f6d6d 656e7420 340a0000 00000000  comment 4.......
+ 11f0 00000000 00000000 deadbeef 434f4445  ............CODE
+ 1200 434f4445 10110000 ffffffff ffffffff  CODE............
+ 1210 ffffffff ffffffff ffffffff ffffffff  ................
+ 1220 ffffffff ffffffff ffffffff ffffffff  ................
+ 1230 ffffffff ffffffff ffffffff ffffffff  ................
+ 1240 ffffffff ffffffff ffffffff ffffffff  ................
+ 1250 ffffffff ffffffff ffffffff ffffffff  ................
+ 1260 ffffffff ffffffff ffffffff ffffffff  ................
+ 1270 ffffffff ffffffff ffffffff ffffffff  ................
+ 1280 ffffffff ffffffff ffffffff ffffffff  ................
+ 1290 ffffffff ffffffff ffffffff ffffffff  ................
+ 12a0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12b0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12c0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12d0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12e0 ffffffff ffffffff ffffffff ffffffff  ................
+ 12f0 ffffffff ffffffff ffffffff ffffffff  ................
+ 1300 54686973 20697320 61207374 72696e67  This is a string
+ 1310 2c203132 38206279 7465206c 6f6e6700  , 128 byte long.
+ 1320 00000000 00000000 00000000 00000000  ................
+ 1330 00000000 00000000 00000000 00000000  ................
+ 1340 00000000 00000000 00000000 00000000  ................
+ 1350 00000000 00000000 00000000 00000000  ................
+ 1360 00000000 00000000 00000000 00000000  ................
+ 1370 00000000 00000000 00000000 00000000  ................
+ 1380 10110000 ffffffff ffffffff ffffffff  ................
+ 1390 54686973 20697320 616e2075 6e616c69  This is an unali
+ 13a0 676e6564 20737472 696e6700 01020304  gned string.....
+ 13b0 04070101 ffffffff ffffffff ffffffff  ................
+ 13c0 01ffffff ffffffff ffffffff ffffffff  ................
+ 13d0 49206d65 616e7420 746f2073 61793a20  I meant to say: 
+ 13e0 54686973 20697320 77617920 746f6f20  This is way too 
+ 13f0 6c6f6e67 00000000 00000000 00000000  long............
+ .... 00000000 00000000 00000000 00000000  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... ffffffff ffffffff ffffffff ffffffff  ................
+ .... 41207665 7279206c 6f6e6720 73747269  A very long stri
+ .... 6e672066 6f6c6c6f 77656420 62792061  ng followed by a
+ .... 20273031 27000000 00000000 00000000   '01'...........
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 00000000 00000000 00000000 00000000  ................
+ .... 01                                   .               
+#pass
diff --git a/ld/testsuite/ld-scripts/ascii.s b/ld/testsuite/ld-scripts/ascii.s
new file mode 100644
index 00000000000..704b492ae61
--- /dev/null
+++ b/ld/testsuite/ld-scripts/ascii.s
@@ -0,0 +1,9 @@
+    .extern ecc_start
+	.section .text
+main:
+	.long 0x45444F43
+	.long ecc_start
+	.section .data
+	.long 0x9abcdef0
+	.section .bss
+	.long 0
diff --git a/ld/testsuite/ld-scripts/ascii.t b/ld/testsuite/ld-scripts/ascii.t
new file mode 100644
index 00000000000..1f63190a7e5
--- /dev/null
+++ b/ld/testsuite/ld-scripts/ascii.t
@@ -0,0 +1,51 @@
+MEMORY {
+  rom : ORIGIN = 0x000000, LENGTH = 0x400000
+  ram : ORIGIN = 0x400000, LENGTH = 0x10000
+}
+
+_start = 0x000000;
+SECTIONS
+{
+  . = 0x1000 + SIZEOF_HEADERS;
+  .text ALIGN (0x100) :
+
+    {
+      INCLUDE "header.inc"
+
+      FILL(0xFF)
+      entry = .;
+      *(.text)
+      . = ALIGN(0x100);
+      ASCII (128) "This is a string, 128 byte long"
+/*      ASCII (32) "This is a string" */
+      LONG(ecc_start)
+      . = ALIGN(16);
+      align_label = .;
+      ASCIZ "This is an unaligned string"
+      unalign_label = .;
+      BYTE(1)
+      BYTE(2)
+      BYTE(3)
+      BYTE(4)
+      BYTE(4)
+      BYTE(7)
+      BYTE(1)
+      BYTE(1)
+      . = ALIGN(16);
+      BYTE(1)
+      . = ALIGN(16);
+      /* ASCII (8) "This is way too long" */
+      . = ALIGN(16);
+      ASCII (64) "I meant to say: This is way too long"
+      . = ALIGN(1024);
+      ASCII (512) "A very long string followed by a '01'"
+      BYTE(1)
+      ecc_end = .;
+    } > rom
+
+  .data : AT (0x400000) { *(.data) } >ram /* NO default AT>rom */
+  . = ALIGN(0x20);
+  .bss : { *(.bss) } >ram /* NO default AT>rom */
+  /DISCARD/ : { *(*) }
+}
+
diff --git a/ld/testsuite/ld-scripts/asciz.d b/ld/testsuite/ld-scripts/asciz.d
index 615cf99732f..d3b8e89fb31 100644
--- a/ld/testsuite/ld-scripts/asciz.d
+++ b/ld/testsuite/ld-scripts/asciz.d
@@ -1,17 +1,12 @@
 #source: asciz.s
 #ld: -T asciz.t
-#objdump: -s -j .text
-#target: [is_elf_format]
-#skip: mips*-*-*
-#skip: tilegx*-*-* tilepro-*-*
-# COFF, PE and MIPS targets align code to a 16 byte boundary
-# tilegx andtilepro aligns code to a 8 byte boundary.
+#objdump: -s -j .data
 
 .*:     file format .*
 
-Contents of section .text:
- .... 01010101 54686973 20697320 61207374  ....This is a st
- .... 72696e67 00...... ........ ........  ring............
- .... 54686973 20697320 616e6f74 68657220  This is another 
- .... 0a737472 696e6753 00                 .stringS........
+Contents of section .data:
+ .... 54686973 20697320 61207374 72696e67  This is a string
+ .... 00546869 73206973 20616e6f 74686572  .This is another
+ .... 0a537472 696e6700 006e6f71 756f7465  .String..noquote
+ .... 7300                                 s.              
 #pass
diff --git a/ld/testsuite/ld-scripts/asciz.t b/ld/testsuite/ld-scripts/asciz.t
index ab66f9a5bfb..3aeb7d0c767 100644
--- a/ld/testsuite/ld-scripts/asciz.t
+++ b/ld/testsuite/ld-scripts/asciz.t
@@ -1,23 +1,16 @@
-MEMORY {
-  rom : ORIGIN = 0x00000, LENGTH = 0x10000
-  ram : ORIGIN = 0x10000, LENGTH = 0x10000
-}
 
 _start = 0x000000;
 SECTIONS
 {
   . = 0x1000 + SIZEOF_HEADERS;
-  .text ALIGN (0x20) :
-    {
-      *(.text)
+  
+  .data : AT (0x10000)
+  {
       ASCIZ "This is a string"
-      . = ALIGN(0x20);
-      align_label = .;
-      ASCIZ "This is another \nstring\123"
-      unalign_label = .;
-    }
-  .data : AT (0x10000) { *(.data) } >ram /* NO default AT>rom */
-  . = ALIGN(0x20);
-  .bss : { *(.bss) } >ram /* NO default AT>rom */
+      ASCIZ "This is another\n\123tring"
+      ASCIZ ""
+      ASCIZ noquotes
+  }
+  
   /DISCARD/ : { *(*) }
 }
diff --git a/ld/testsuite/ld-scripts/header.inc b/ld/testsuite/ld-scripts/header.inc
new file mode 100644
index 00000000000..8376d332226
--- /dev/null
+++ b/ld/testsuite/ld-scripts/header.inc
@@ -0,0 +1,34 @@
+      /* HEADER */
+      FILL(0xFF)
+      QUAD(0xEFBEADDE45444F43);
+      crc64 = .;
+      QUAD(0)
+      ecc_start = .;
+      /* Program Entry */
+      LONG(entry)
+
+      /* Program size */
+      LONG(ecc_end - ecc_start)
+
+      /* Time Stamp */
+      time_since_epoch = .;
+      QUAD(0)
+
+      /* 32 bytes here */
+      /* Version info */
+      BYTE(1)
+      BYTE(2)
+      BYTE(3)
+      BYTE(4)
+      LONG(0x17)
+      LONG(0x11470000)
+      LONG(0xEFBEADDE)
+      /* 48 bytes here */
+      ASCII (16) "program name"
+      /* 64 bytes here */
+      ASCII (64) "empty"
+      ASCII (32) "comment 1\n"
+      ASCII (32) "comment 2\n"
+      ASCII (32) "comment 3\n"
+      ASCII (24) "comment 4\n"
+      QUAD(0x45444F43EFBEADDE);
diff --git a/ld/testsuite/ld-scripts/script.exp b/ld/testsuite/ld-scripts/script.exp
index a574dde034c..56e12da8e61 100644
--- a/ld/testsuite/ld-scripts/script.exp
+++ b/ld/testsuite/ld-scripts/script.exp
@@ -228,6 +228,7 @@ foreach test_script $test_script_list {
 }
 
 run_dump_test "asciz"
+run_dump_test "ascii"
 run_dump_test "align-with-input"
 run_dump_test "pr20302"
 run_dump_test "output-section-types"
-- 
2.17.1



* Re: [RFC v0 0/6] ASCII Command for output section
  2023-02-15 18:29   ` Ulf Samuelsson
@ 2023-02-16 16:31     ` Nick Clifton
  0 siblings, 0 replies; 14+ messages in thread
From: Nick Clifton @ 2023-02-16 16:31 UTC (permalink / raw)
  To: Ulf Samuelsson, binutils

Hi Ulf,

>>   What do you think of this alternative version of your patch ?

OK, I have now applied a tweaked version of the patch.

> A closer look reveals some oddities in the white space.
> Also, when the string in the ASCII command is longer than the specified
> length, we need to insert a '\0' character at the end.
> 
> For this:
> 
>      len = size;
> 
> must be changed to
> 
>      len = size - 1;

Done.
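
For readers following along, the effect of the "len = size - 1" change can be
sketched in plain C. This is only an illustration, not the code in ldlang.c:
the function name fill_fixed_string and its signature are hypothetical, and
the zero padding is assumed from the expected dump in ascii.d.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the ld implementation: copy STR into a
   fixed area of SIZE bytes.  Shorter strings are zero padded; longer
   strings are cut to SIZE - 1 characters so the last byte is '\0'.  */
void
fill_fixed_string (char *area, size_t size, const char *str)
{
  size_t len;

  if (size == 0)
    return;

  len = strlen (str);
  if (len >= size)
    len = size - 1;		/* Leave room for the terminator.  */

  memcpy (area, str, len);
  memset (area + len, 0, size - len);	/* Padding plus terminator.  */
}
```

With an 8 byte area and the string "This is way too long", the sketch keeps
the 7 characters "This is" and writes '\0' into the eighth byte.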


> The size is specified by a "mustbe_exp"
> 
>      | ASCII '(' mustbe_exp ')' NAME
> 
>          {
>            /* 'value' is a memory leak, do we care?  */
>            etree_type *value = $3;
>            lang_add_string (value->value.value, $5);
>          }
> 
> I cannot judge the consequences of having a full expression, instead of an INT.
> 
> Will the "value->value.value" work, if we do
> 
>       ASCII (3 * 15) "Long string"

It works.  The area assigned will be 45 bytes.  I tested it, and included a version
of this expression in the new linker test.
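
The arithmetic can be mirrored in a few lines of C, purely as an analogy: the
names AREA_SIZE, area, used_bytes and padding_bytes are made up for this
sketch, and zero padding is assumed from the expected dump in ascii.d. A
partially initialised C array is zero filled, which gives the same layout as
a 45 byte area holding an 11 character string.

```c
#include <stddef.h>
#include <string.h>

/* Illustration only: the size is a constant expression, as in
   ASCII (3 * 15) "Long string".  */
enum { AREA_SIZE = 3 * 15 };

/* C zero-fills the remainder of a partially initialised array.  */
char area[AREA_SIZE] = "Long string";

/* Number of bytes taken by the string itself.  */
size_t
used_bytes (void)
{
  return strlen (area);
}

/* Number of zero bytes that follow the string inside the area.  */
size_t
padding_bytes (void)
{
  return sizeof area - strlen (area);
}
```

Here the area is 45 bytes, of which 11 hold the text and the remaining 34
are zero.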


> We do a malloc in the "etree_type *value = $3;" statement.
> This is thrown away without a "free".
> 
> Is that a problem anywhere?
> I guess most OSes would reclaim everything once the 'ld' terminates.
> There will not be that many ASCII statements in a linker command file.

Right.  As far as I am concerned a leak this small is not important.

Cheers
   Nick



Thread overview: 14+ messages
2023-02-15 11:40 [RFC v0 0/6] ASCII Command for output section binutils
2023-02-15 11:40 ` [PATCH v0 1/6] Add testsuite for ASCII command binutils
2023-02-15 11:40 ` [PATCH v0 2/6] Add ASCII command info to NEWS binutils
2023-02-15 11:40 ` [PATCH v0 3/6] Add ASCII to info file binutils
2023-02-15 11:40 ` [PATCH v0 4/6] ldlex.l: add ASCII binutils
2023-02-15 11:44   ` Ulf Samuelsson
2023-02-15 11:40 ` [PATCH v0 5/6] ldgram.y: " binutils
2023-02-15 11:40 ` [PATCH v0 6/6] ldlang.*: parse ASCII command binutils
2023-02-15 17:07 ` [RFC v0 0/6] ASCII Command for output section Nick Clifton
2023-02-15 17:55   ` Ulf Samuelsson
2023-02-15 17:28 ` Nick Clifton
2023-02-15 17:52   ` Ulf Samuelsson
2023-02-15 18:29   ` Ulf Samuelsson
2023-02-16 16:31     ` Nick Clifton
