* [PATCH v2 01/65] gas: consolidate whitespace recognition
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
@ 2025-01-27 15:26 ` Jan Beulich
2025-01-31 6:02 ` Hans-Peter Nilsson
2025-01-27 15:27 ` [PATCH v2 02/65] gas/obj-*.c: use is_whitespace() Jan Beulich
` (65 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 15:26 UTC (permalink / raw)
To: Binutils
Let's extend lex_type[] to also cover whitespace, and then have a simple
macro to uniformly recognize both blanks and tabs (and \r when it's not
EOL) as such.
In macro.c use sb_skip_white() as appropriate, instead of open-coding
it.
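
As a rough illustration of the table-driven approach described above, here
is a minimal standalone sketch. It is not the gas source: the demo_* names
and DEMO_CR_EOL are hypothetical stand-ins for lex_type[], LEX_WHITE,
is_whitespace(), sb_skip_white() and CR_EOL, and the table only populates
the whitespace entries.

```c
#include <assert.h>
#include <stddef.h>

/* Flag bit for whitespace; the value 8 mirrors LEX_WHITE in the patch.  */
#define DEMO_LEX_WHITE 8

/* Hypothetical stand-in for lex_type[]: only the whitespace entries are
   set here, every other entry defaults to 0.  */
static const char demo_lex_type[256] = {
  ['\t'] = DEMO_LEX_WHITE,
  [' ']  = DEMO_LEX_WHITE,
#ifndef DEMO_CR_EOL
  ['\r'] = DEMO_LEX_WHITE,	/* \r counts only when it is not EOL.  */
#endif
};

/* One table lookup replaces the open-coded ' ' / '\t' comparisons.  */
#define demo_is_whitespace(c) \
  (demo_lex_type[(unsigned char) (c)] & DEMO_LEX_WHITE)

/* Return the index of the first non-whitespace character at or after
   IDX, never going beyond LEN.  */
static size_t
demo_skip_white (size_t idx, const char *buf, size_t len)
{
  assert (buf != NULL);
  while (idx < len && demo_is_whitespace (buf[idx]))
    idx++;
  return idx;
}
```

The cast to unsigned char matters on targets where plain char is signed;
without it a byte with the top bit set would index before the table.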
---
There's one place where \f is also checked for. I'm inclined to also
include that in the set, but for starters I didn't want to change
behavior in this regard.
Further, what about \v?
Is there any reason to retain PERMIT_WHITESPACE? It's always defined
in read.h, without any overrides. And even in compiler output some
"unnecessary" whitespace will often appear, simply for readability
reasons.
---
v2: Also replace ISSPACE(). Respect CR_EOL. Re-base.
--- a/gas/cond.c
+++ b/gas/cond.c
@@ -250,7 +250,7 @@ get_mri_string (int terminator, int *len
&& ! is_end_of_line[(unsigned char) *input_line_pointer])
++input_line_pointer;
s = input_line_pointer;
- while (s > ret && (s[-1] == ' ' || s[-1] == '\t'))
+ while (s > ret && is_whitespace (s[-1]))
--s;
}
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -1436,7 +1436,7 @@ operand (expressionS *expressionP, enum
created. Doing it here saves lines of code. */
clean_up_expression (expressionP);
SKIP_ALL_WHITESPACE (); /* -> 1st char after operand. */
- know (*input_line_pointer != ' ');
+ know (!is_whitespace (*input_line_pointer));
/* The PA port needs this information. */
if (expressionP->X_add_symbol)
@@ -1858,7 +1858,7 @@ expr (int rankarg, /* Larger # is highe
retval = operand (resultP, mode);
/* operand () gobbles spaces. */
- know (*input_line_pointer != ' ');
+ know (!is_whitespace (*input_line_pointer));
op_left = operatorf (&op_chars);
while (op_left != O_illegal && op_rank[(int) op_left] > rank)
@@ -1880,7 +1880,7 @@ expr (int rankarg, /* Larger # is highe
right.X_op_symbol = NULL;
}
- know (*input_line_pointer != ' ');
+ know (!is_whitespace (*input_line_pointer));
if (op_left == O_index)
{
--- a/gas/listing.c
+++ b/gas/listing.c
@@ -1152,7 +1152,7 @@ debugging_pseudo (list_info_type *list A
in_debug = false;
#endif
- while (ISSPACE (*line))
+ while (is_whitespace (*line))
line++;
if (*line != '.')
--- a/gas/macro.c
+++ b/gas/macro.c
@@ -29,10 +29,8 @@
/* The routines in this file handle macro definition and expansion.
They are called by gas. */
-#define ISWHITE(x) ((x) == ' ' || (x) == '\t')
-
#define ISSEP(x) \
- ((x) == ' ' || (x) == '\t' || (x) == ',' || (x) == '"' || (x) == ';' \
+ (is_whitespace (x) || (x) == ',' || (x) == '"' || (x) == ';' \
|| (x) == ')' || (x) == '(' \
|| ((flag_macro_alternate || flag_mri) && ((x) == '<' || (x) == '>')))
@@ -139,8 +137,7 @@ buffer_and_nest (const char *from, const
if (! LABELS_WITHOUT_COLONS)
{
/* Skip leading whitespace. */
- while (i < ptr->len && ISWHITE (ptr->ptr[i]))
- i++;
+ i = sb_skip_white (i, ptr);
}
for (;;)
@@ -154,8 +151,7 @@ buffer_and_nest (const char *from, const
if (i < ptr->len && is_name_ender (ptr->ptr[i]))
i++;
/* Skip whitespace. */
- while (i < ptr->len && ISWHITE (ptr->ptr[i]))
- i++;
+ i = sb_skip_white (i, ptr);
/* Check for the colon. */
if (i >= ptr->len || ptr->ptr[i] != ':')
{
@@ -174,8 +170,7 @@ buffer_and_nest (const char *from, const
}
/* Skip trailing whitespace. */
- while (i < ptr->len && ISWHITE (ptr->ptr[i]))
- i++;
+ i = sb_skip_white (i, ptr);
if (i < ptr->len && (ptr->ptr[i] == '.'
|| NO_PSEUDO_DOT
@@ -424,9 +419,7 @@ get_any_string (size_t idx, sb *in, sb *
*in_br = '\0';
while (idx < in->len
- && (*in_br
- || (in->ptr[idx] != ' '
- && in->ptr[idx] != '\t'))
+ && (*in_br || !is_whitespace (in->ptr[idx]))
&& in->ptr[idx] != ','
&& (in->ptr[idx] != '<'
|| (! flag_macro_alternate && ! flag_mri)))
@@ -916,7 +909,7 @@ macro_expand_body (sb *in, sb *out, form
if (! macro
|| src + 5 >= in->len
|| strncasecmp (in->ptr + src, "LOCAL", 5) != 0
- || ! ISWHITE (in->ptr[src + 5])
+ || ! is_whitespace (in->ptr[src + 5])
/* PR 11507: Skip keyword LOCAL if it is found inside a quoted string. */
|| inquote)
{
@@ -1069,9 +1062,7 @@ macro_expand (size_t idx, sb *in, macro_
/* The Microtec assembler ignores this if followed by a white space.
(Macro invocation with empty extension) */
idx++;
- if ( idx < in->len
- && in->ptr[idx] != ' '
- && in->ptr[idx] != '\t')
+ if (idx < in->len && !is_whitespace (in->ptr[idx]))
{
formal_entry *n = new_formal ();
@@ -1192,7 +1183,7 @@ macro_expand (size_t idx, sb *in, macro_
{
if (idx < in->len && in->ptr[idx] == ',')
++idx;
- if (idx < in->len && ISWHITE (in->ptr[idx]))
+ if (idx < in->len && is_whitespace (in->ptr[idx]))
break;
}
}
--- a/gas/read.c
+++ b/gas/read.c
@@ -112,9 +112,13 @@ die horribly;
/* Used by is_... macros. our ctype[]. */
char lex_type[256] = {
- 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* @ABCDEFGHIJKLMNO */
+#ifndef CR_EOL
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 8, 0, 0, /* @ABCDEFGHIJKLMNO */
+#else
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, /* @ABCDEFGHIJKLMNO */
+#endif
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* PQRSTUVWXYZ[\]^_ */
- 0, 0, 0, LEX_HASH, LEX_DOLLAR, LEX_PCT, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, /* _!"#$%&'()*+,-./ */
+ 8, 0, 0, LEX_HASH, LEX_DOLLAR, LEX_PCT, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, /* _!"#$%&'()*+,-./ */
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, LEX_QM, /* 0123456789:;<=>? */
LEX_AT, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, /* @ABCDEFGHIJKLMNO */
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, LEX_BR, 0, LEX_BR, 0, 3, /* PQRSTUVWXYZ[\]^_ */
@@ -1068,11 +1072,11 @@ read_a_source_file (const char *name)
if (*rest == ':')
++rest;
- if (*rest == ' ' || *rest == '\t')
+ if (is_whitespace (*rest))
++rest;
if ((strncasecmp (rest, "EQU", 3) == 0
|| strncasecmp (rest, "SET", 3) == 0)
- && (rest[3] == ' ' || rest[3] == '\t'))
+ && is_whitespace (rest[3]))
{
input_line_pointer = rest + 3;
equals (line_start,
@@ -1080,8 +1084,7 @@ read_a_source_file (const char *name)
continue;
}
if (strncasecmp (rest, "MACRO", 5) == 0
- && (rest[5] == ' '
- || rest[5] == '\t'
+ && (is_whitespace (rest[5])
|| is_end_of_line[(unsigned char) rest[5]]))
mri_line_macro = 1;
}
@@ -1117,7 +1120,7 @@ read_a_source_file (const char *name)
level. */
do
nul_char = next_char = *input_line_pointer++;
- while (next_char == '\t' || next_char == ' ' || next_char == '\f');
+ while (is_whitespace (next_char) || next_char == '\f');
/* C is the 1st significant character.
Input_line_pointer points after that character. */
@@ -1146,12 +1149,12 @@ read_a_source_file (const char *name)
if (*rest == ':')
++rest;
- if (*rest == ' ' || *rest == '\t')
+ if (is_whitespace (*rest))
++rest;
if ((strncasecmp (rest, "EQU", 3) == 0
|| strncasecmp (rest, "SET", 3) == 0)
- && (rest[3] == ' ' || rest[3] == '\t'))
+ && is_whitespace (rest[3]))
{
input_line_pointer = rest + 3;
equals (s, 1);
@@ -1169,7 +1172,7 @@ read_a_source_file (const char *name)
SKIP_WHITESPACE ();
}
else if ((next_char == '=' && *rest == '=')
- || ((next_char == ' ' || next_char == '\t')
+ || (is_whitespace (next_char)
&& rest[0] == '='
&& rest[1] == '='))
{
@@ -1177,7 +1180,7 @@ read_a_source_file (const char *name)
demand_empty_rest_of_line ();
}
else if ((next_char == '='
- || ((next_char == ' ' || next_char == '\t')
+ || (is_whitespace (next_char)
&& *rest == '='))
#ifdef TC_EQUAL_IN_INSN
&& !TC_EQUAL_IN_INSN (next_char, s)
@@ -1284,7 +1287,7 @@ read_a_source_file (const char *name)
/* The following skip of whitespace is compulsory.
A well shaped space is sometimes all that separates
keyword from operands. */
- if (next_char == ' ' || next_char == '\t')
+ if (is_whitespace (next_char))
input_line_pointer++;
/* Input_line is restored.
@@ -1501,7 +1504,7 @@ mri_comment_field (char *stopcp)
know (flag_m68k_mri);
for (s = input_line_pointer;
- ((!is_end_of_line[(unsigned char) *s] && *s != ' ' && *s != '\t')
+ ((!is_end_of_line[(unsigned char) *s] && !is_whitespace (*s))
|| inquote);
s++)
{
@@ -6326,7 +6329,7 @@ equals (char *sym_name, int reassign)
if (reassign < 0 && *input_line_pointer == '=')
input_line_pointer++;
- while (*input_line_pointer == ' ' || *input_line_pointer == '\t')
+ while (is_whitespace (*input_line_pointer))
input_line_pointer++;
if (flag_mri)
@@ -6500,8 +6503,7 @@ s_include (int arg ATTRIBUTE_UNUSED)
SKIP_WHITESPACE ();
i = 0;
while (!is_end_of_line[(unsigned char) *input_line_pointer]
- && *input_line_pointer != ' '
- && *input_line_pointer != '\t')
+ && !is_whitespace (*input_line_pointer))
{
obstack_1grow (&notes, *input_line_pointer);
++input_line_pointer;
--- a/gas/read.h
+++ b/gas/read.h
@@ -29,17 +29,18 @@ extern bool input_from_string;
#ifdef PERMIT_WHITESPACE
#define SKIP_WHITESPACE() \
- ((*input_line_pointer == ' ') ? ++input_line_pointer : 0)
+ (is_whitespace (*input_line_pointer) ? ++input_line_pointer : 0)
#define SKIP_ALL_WHITESPACE() \
- while (*input_line_pointer == ' ') ++input_line_pointer
+ while (is_whitespace (*input_line_pointer)) ++input_line_pointer
#else
-#define SKIP_WHITESPACE() know (*input_line_pointer != ' ' )
+#define SKIP_WHITESPACE() know (!is_whitespace (*input_line_pointer))
#define SKIP_ALL_WHITESPACE() SKIP_WHITESPACE()
#endif
-#define LEX_NAME (1) /* may continue a name */
+#define LEX_NAME (1) /* may continue a name */
#define LEX_BEGIN_NAME (2) /* may begin a name */
#define LEX_END_NAME (4) /* ends a name */
+#define LEX_WHITE (8) /* whitespace */
#define is_name_beginner(c) \
( lex_type[(unsigned char) (c)] & LEX_BEGIN_NAME )
@@ -47,6 +48,8 @@ extern bool input_from_string;
( lex_type[(unsigned char) (c)] & LEX_NAME )
#define is_name_ender(c) \
( lex_type[(unsigned char) (c)] & LEX_END_NAME )
+#define is_whitespace(c) \
+ ( lex_type[(unsigned char) (c)] & LEX_WHITE )
/* The distinction of "line" and "statement" sadly is blurred by unhelpful
naming of e.g. the underlying array. Most users really mean "end of
--- a/gas/sb.c
+++ b/gas/sb.c
@@ -215,9 +215,7 @@ sb_terminate (sb *in)
size_t
sb_skip_white (size_t idx, sb *ptr)
{
- while (idx < ptr->len
- && (ptr->ptr[idx] == ' '
- || ptr->ptr[idx] == '\t'))
+ while (idx < ptr->len && is_whitespace (ptr->ptr[idx]))
idx++;
return idx;
}
@@ -229,18 +227,14 @@ sb_skip_white (size_t idx, sb *ptr)
size_t
sb_skip_comma (size_t idx, sb *ptr)
{
- while (idx < ptr->len
- && (ptr->ptr[idx] == ' '
- || ptr->ptr[idx] == '\t'))
+ while (idx < ptr->len && is_whitespace (ptr->ptr[idx]))
idx++;
if (idx < ptr->len
&& ptr->ptr[idx] == ',')
idx++;
- while (idx < ptr->len
- && (ptr->ptr[idx] == ' '
- || ptr->ptr[idx] == '\t'))
+ while (idx < ptr->len && is_whitespace (ptr->ptr[idx]))
idx++;
return idx;
* Re: [PATCH v2 01/65] gas: consolidate whitespace recognition
2025-01-27 15:26 ` [PATCH v2 01/65] gas: consolidate whitespace recognition Jan Beulich
@ 2025-01-31 6:02 ` Hans-Peter Nilsson
2025-01-31 8:36 ` Jan Beulich
0 siblings, 1 reply; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-01-31 6:02 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, 27 Jan 2025, Jan Beulich wrote:
> --- a/gas/read.c
> +++ b/gas/read.c
> @@ -112,9 +112,13 @@ die horribly;
>
> /* Used by is_... macros. our ctype[]. */
> char lex_type[256] = {
> - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* @ABCDEFGHIJKLMNO */
> +#ifndef CR_EOL
> + 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 8, 0, 0, /* @ABCDEFGHIJKLMNO */
> +#else
> + 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, /* @ABCDEFGHIJKLMNO */
> +#endif
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* PQRSTUVWXYZ[\]^_ */
> - 0, 0, 0, LEX_HASH, LEX_DOLLAR, LEX_PCT, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, /* _!"#$%&'()*+,-./ */
> + 8, 0, 0, LEX_HASH, LEX_DOLLAR, LEX_PCT, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, /* _!"#$%&'()*+,-./ */
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, LEX_QM, /* 0123456789:;<=>? */
> LEX_AT, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, /* @ABCDEFGHIJKLMNO */
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, LEX_BR, 0, LEX_BR, 0, 3, /* PQRSTUVWXYZ[\]^_ */
Please use LEX_WHITE, not 8. (Columns not lining up is not a
valid excuse for naked literals, if that's the claimed reason.)
brgds, H-P
* Re: [PATCH v2 01/65] gas: consolidate whitespace recognition
2025-01-31 6:02 ` Hans-Peter Nilsson
@ 2025-01-31 8:36 ` Jan Beulich
0 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-31 8:36 UTC (permalink / raw)
To: Hans-Peter Nilsson; +Cc: Binutils
On 31.01.2025 07:02, Hans-Peter Nilsson wrote:
> On Mon, 27 Jan 2025, Jan Beulich wrote:
>> --- a/gas/read.c
>> +++ b/gas/read.c
>> @@ -112,9 +112,13 @@ die horribly;
>>
>> /* Used by is_... macros. our ctype[]. */
>> char lex_type[256] = {
>> - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* @ABCDEFGHIJKLMNO */
>> +#ifndef CR_EOL
>> + 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 8, 0, 0, /* @ABCDEFGHIJKLMNO */
>> +#else
>> + 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, /* @ABCDEFGHIJKLMNO */
>> +#endif
>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* PQRSTUVWXYZ[\]^_ */
>> - 0, 0, 0, LEX_HASH, LEX_DOLLAR, LEX_PCT, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, /* _!"#$%&'()*+,-./ */
>> + 8, 0, 0, LEX_HASH, LEX_DOLLAR, LEX_PCT, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, /* _!"#$%&'()*+,-./ */
>> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, LEX_QM, /* 0123456789:;<=>? */
>> LEX_AT, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, /* @ABCDEFGHIJKLMNO */
>> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, LEX_BR, 0, LEX_BR, 0, 3, /* PQRSTUVWXYZ[\]^_ */
>
> Please use LEX_WHITE, not 8. (Columns not lining up is not a
> valid excuse for naked literals, if that's the claimed reason.)
That's the presumed reason I derived, yes. The model appears to
be to use LEX_* here only when they're overridable by targets.
See the many 1 and 3 entries in particular. (I certainly agree
using such literal numbers isn't very nice. Yet when making
these changes - and there's another set to follow - it seemed
more reasonable to me to stay consistent.)
Jan
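
To make the trade-off in this exchange concrete, here is a small sketch
(not taken from the patch) of the first table row written both ways. The
short alias W is purely a hypothetical device to keep the columns aligned
while still avoiding a naked literal; DEMO_LEX_WHITE stands in for
LEX_WHITE, and the row shown is the variant used when CR_EOL is not
defined.

```c
#include <assert.h>
#include <string.h>

#define DEMO_LEX_WHITE 8	/* Same value as LEX_WHITE in the patch.  */
#define W DEMO_LEX_WHITE	/* Hypothetical short alias for alignment.  */

/* Entries 0x00..0x0f in literal form, as posted.  */
static const char row_literal[16] = {
  0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 8, 0, 0
};

/* The same row spelled with the named constant.  */
static const char row_named[16] = {
  0, 0, 0, 0, 0, 0, 0, 0, 0, W, 0, 0, 0, W, 0, 0
};
#undef W

/* Nonzero when both spellings produce identical table contents.  */
static int
rows_match (void)
{
  return memcmp (row_literal, row_named, sizeof row_literal) == 0;
}
```

Either spelling yields the same bytes; the difference is only in how easy
it is to see that positions 9 (\t) and 13 (\r) carry the whitespace flag.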
* [PATCH v2 02/65] gas/obj-*.c: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
2025-01-27 15:26 ` [PATCH v2 01/65] gas: consolidate whitespace recognition Jan Beulich
@ 2025-01-27 15:27 ` Jan Beulich
2025-01-27 15:43 ` [PATCH v2 03/65] Alpha/EVAX: use is_whitespace() / is_end_of_stmt() Jan Beulich
` (64 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 15:27 UTC (permalink / raw)
To: Binutils; +Cc: Tristan Gingold
... for consistency of recognition of what is deemed whitespace.
In obj_elf_section_name() also generalize end-of-statement recognition
at the same time. Conversely drop the unused SKIP_SEMI_COLON() for COFF.
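
For illustration, the generalized scan in obj_elf_section_name() can be
sketched standalone as below. The two predicates are simplified,
hypothetical stand-ins (the real is_whitespace() and is_end_of_stmt() are
table driven), and demo_section_name_len() is not a gas function.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the gas predicates; assumptions only.  */
static int
demo_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

static int
demo_is_end_of_stmt (char c)
{
  return c == '\0' || c == '\n' || c == ';';
}

/* Length of the leading run of characters that are neither whitespace,
   a statement end, nor a comma.  */
static size_t
demo_section_name_len (const char *p)
{
  const char *end = p;

  assert (p != NULL);
  while (!demo_is_whitespace (*end)
	 && !demo_is_end_of_stmt (*end)
	 && *end != ',')
    end++;
  return (size_t) (end - p);
}
```

Note that the replaced strchr ("\n\t,; ", *end) test also stopped at the
nul terminator, since strchr() matches the terminating '\0'; the stand-in
for is_end_of_stmt() keeps that property.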
---
The code being touched in obj_elf_section_name() looks suspicious: I
expect it ought to use get_symbol_name() rather than blindly consuming
everything that's non-whitespace, non-comma, non-statement-separator.
---
v2: New.
--- a/gas/config/obj-coff.c
+++ b/gas/config/obj-coff.c
@@ -567,9 +567,7 @@ obj_coff_ident (int ignore ATTRIBUTE_UNU
a C_EFCN. And a second reason is that the code is more clear this
way. (at least I think it is :-). */
-#define SKIP_SEMI_COLON() while (*input_line_pointer++ != ';')
-#define SKIP_WHITESPACES() while (*input_line_pointer == ' ' || \
- *input_line_pointer == '\t') \
+#define SKIP_WHITESPACES() while (is_whitespace (*input_line_pointer)) \
input_line_pointer++;
static void
--- a/gas/config/obj-elf.c
+++ b/gas/config/obj-elf.c
@@ -1089,7 +1089,7 @@ obj_elf_section_name (void)
{
char *end = input_line_pointer;
- while (0 == strchr ("\n\t,; ", *end))
+ while (!is_whitespace (*end) && !is_end_of_stmt (*end) && *end != ',')
end++;
if (end == input_line_pointer)
{
@@ -1957,8 +1957,8 @@ obj_elf_get_vtable_inherit (void)
++input_line_pointer;
if (input_line_pointer[0] == '0'
- && (input_line_pointer[1] == '\0'
- || ISSPACE (input_line_pointer[1])))
+ && (is_end_of_stmt (input_line_pointer[1])
+ || is_whitespace (input_line_pointer[1])))
{
psym = section_symbol (absolute_section);
++input_line_pointer;
@@ -2032,7 +2032,7 @@ obj_elf_vtable_entry (int ignore ATTRIBU
(void) obj_elf_get_vtable_entry ();
}
-#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
+#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
static inline int
skip_past_char (char ** str, char c)
--- a/gas/config/obj-macho.c
+++ b/gas/config/obj-macho.c
@@ -111,7 +111,7 @@ collect_16char_name (char *dest, const c
{
int len = input_line_pointer - namstart; /* could be zero. */
/* lose any trailing space. */
- while (len > 0 && namstart[len-1] == ' ')
+ while (len > 0 && is_whitespace (namstart[len-1]))
len--;
if (len > 16)
{
@@ -330,7 +330,7 @@ obj_mach_o_section (int ignore ATTRIBUTE
len = input_line_pointer - p;
/* strip trailing spaces. */
- while (len > 0 && p[len-1] == ' ')
+ while (len > 0 && is_whitespace (p[len - 1]))
len--;
tmpc = p[len];
@@ -369,7 +369,7 @@ obj_mach_o_section (int ignore ATTRIBUTE
len = input_line_pointer - p;
/* strip trailing spaces. */
- while (len > 0 && p[len-1] == ' ')
+ while (len > 0 && is_whitespace (p[len - 1]))
len--;
tmpc = p[len];
--- a/gas/config/obj-som.c
+++ b/gas/config/obj-som.c
@@ -79,7 +79,7 @@ obj_som_compiler (int unused ATTRIBUTE_U
quote. */
filename = buf + 1;
p = filename;
- while (*p != ' ' && *p != '\000')
+ while (!is_whitespace (*p) && *p != '\000')
p++;
if (*p == '\000')
{
@@ -89,7 +89,7 @@ obj_som_compiler (int unused ATTRIBUTE_U
*p = '\000';
language_name = ++p;
- while (*p != ' ' && *p != '\000')
+ while (!is_whitespace (*p) && *p != '\000')
p++;
if (*p == '\000')
{
* [PATCH v2 03/65] Alpha/EVAX: use is_whitespace() / is_end_of_stmt()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
2025-01-27 15:26 ` [PATCH v2 01/65] gas: consolidate whitespace recognition Jan Beulich
2025-01-27 15:27 ` [PATCH v2 02/65] gas/obj-*.c: use is_whitespace() Jan Beulich
@ 2025-01-27 15:43 ` Jan Beulich
2025-01-27 15:44 ` [PATCH v2 04/65] arc: use is_whitespace() Jan Beulich
` (63 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 15:43 UTC (permalink / raw)
To: Binutils
Don't open-code checking for ' ', '\t', and statement ending chars.
---
The code being touched looks suspicious: I expect it ought to use
get_symbol_name() rather than blindly consuming everything that's non-
whitespace, non-comma, non-statement-separator. Yet then I notice
obj_elf_section_name() does the same.
There also looks to be a demand_empty_rest_of_line() missing.
---
v2: New.
--- a/gas/config/tc-alpha.c
+++ b/gas/config/tc-alpha.c
@@ -4201,7 +4201,7 @@ s_alpha_section_name (void)
{
char *end = input_line_pointer;
- while (0 == strchr ("\n\t,; ", *end))
+ while (!is_whitespace (*end) && !is_end_of_stmt (*end) && *end != ',')
end++;
if (end == input_line_pointer)
{
* [PATCH v2 04/65] arc: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (2 preceding siblings ...)
2025-01-27 15:43 ` [PATCH v2 03/65] Alpha/EVAX: use is_whitespace() / is_end_of_stmt() Jan Beulich
@ 2025-01-27 15:44 ` Jan Beulich
2025-01-27 15:50 ` [PATCH v2 05/65] Arm: " Jan Beulich
` (62 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 15:44 UTC (permalink / raw)
To: Binutils; +Cc: Claudiu Zissulescu
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). At the same time use is_end_of_stmt() instead of
open-coded nul char checks.
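
A minimal sketch of the mnemonic scan after this change, to show why tabs
matter: with the old "*str == ' '" test a tab after the mnemonic would not
end the scan. The predicates and demo_skip_mnemonic() are hypothetical
stand-ins, not the tc-arc.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the gas predicates; assumptions only.  */
static int
demo_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

static int
demo_is_end_of_stmt (char c)
{
  return c == '\0' || c == '\n' || c == ';';
}

/* Advance to the first whitespace character or statement end.  */
static const char *
demo_skip_mnemonic (const char *str)
{
  assert (str != NULL);
  for (; !demo_is_end_of_stmt (*str); str++)
    if (demo_is_whitespace (*str))
      break;
  return str;
}
```

For input such as "add<TAB>r0,r1,r2" the returned pointer sits on the
tab, so the operand tokenizer starts at the right place.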
---
v2: New.
--- a/gas/config/tc-arc.c
+++ b/gas/config/tc-arc.c
@@ -1374,10 +1374,6 @@ tokenize_flags (const char *str,
{
switch (*input_line_pointer)
{
- case ' ':
- case '\0':
- goto fini;
-
case '.':
input_line_pointer++;
if (saw_dot)
@@ -1387,6 +1383,10 @@ tokenize_flags (const char *str,
break;
default:
+ if (is_end_of_stmt (*input_line_pointer)
+ || is_whitespace (*input_line_pointer))
+ goto fini;
+
if (saw_flg && !saw_dot)
goto err;
@@ -2536,8 +2536,8 @@ md_assemble (char *str)
/* Scan up to the end of the mnemonic which must end in space or end
of string. */
str += opnamelen;
- for (; *str != '\0'; str++)
- if (*str == ' ')
+ for (; !is_end_of_stmt (*str); str++)
+ if (is_whitespace (*str))
break;
/* Tokenize the rest of the line. */
* [PATCH v2 05/65] Arm: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (3 preceding siblings ...)
2025-01-27 15:44 ` [PATCH v2 04/65] arc: use is_whitespace() Jan Beulich
@ 2025-01-27 15:50 ` Jan Beulich
2025-01-27 16:31 ` Richard Earnshaw (lists)
2025-01-27 15:50 ` [PATCH v2 06/65] aarch64: " Jan Beulich
` (61 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 15:50 UTC (permalink / raw)
To: Binutils; +Cc: Nick Clifton, ramana.radhakrishnan, Richard Earnshaw
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). At the same time use is_end_of_stmt() instead of an
open-coded nul char check.
In parse_neon_type() be more aggressive and remove the special casing of
certain characters altogether. The original default case simply having
"break" can't have been correct.
---
I don't think I see why parse_qfloat_immediate() checks for '\n'. If
that was needed, "line" (really: statement) separators would need
checking for, too. Yet an easy experiment demonstrates that this case is
working correctly despite the lack of a check for ';'.
The check for "0x" in parse_qfloat_immediate() seems fishy, too: If it
was actually needed, "0X" would apparently also need checking for. Yet again
experimentally that's properly refused anyway, by atof_ieee() I guess.
While for parse_neon_type() the change improves the handling of this set
of (bad) examples (including the case of passing -f to gas):
vcvt.bf016.f32 d0, q0
vcvt.bf16.f032 d0, q0
vcvt.b16.f32 d0, q0
vcvt.b f16.f32 d0, q0
vcvt.bf 16.f32 d0, q0
vcvt.bf16.f 32 d0, q0
vcvt.b f16.f32 d0, q0
vcvt.b 16.f32 d0, q0
vcvt.b 32.f32 d0, q0
vcvt.bf 16.f32 d0, q0
several are left which imo also ought to be rejected. Yet that will want
sorting separately.
---
v2: Also replace ISSPACE().
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -1081,7 +1081,7 @@ const char FLT_CHARS[] = "rRsSfFdDxXeEpP
/* Separator character handling. */
-#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
+#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
enum fp_16bit_format
{
@@ -1510,13 +1510,9 @@ parse_neon_type (struct neon_type *type,
return FAIL;
}
goto done;
- case '0': case '1': case '2': case '3': case '4':
- case '5': case '6': case '7': case '8': case '9':
- case ' ': case '.':
+ default:
as_bad (_("unexpected type character `b' -- did you mean `bf'?"));
return FAIL;
- default:
- break;
}
break;
default:
@@ -5055,7 +5051,8 @@ set_fp16_format (int dummy ATTRIBUTE_UNU
new_format = ARM_FP16_FORMAT_DEFAULT;
name = input_line_pointer;
- while (*input_line_pointer && !ISSPACE (*input_line_pointer))
+ while (!is_end_of_stmt (*input_line_pointer)
+ && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
@@ -5366,7 +5363,7 @@ parse_qfloat_immediate (char **ccp, int
return FAIL;
else
{
- for (; *fpnum != '\0' && *fpnum != ' ' && *fpnum != '\n'; fpnum++)
+ for (; *fpnum != '\0' && !is_whitespace (*fpnum) && *fpnum != '\n'; fpnum++)
if (*fpnum == '.' || *fpnum == 'e' || *fpnum == 'E')
{
found_fpchar = 1;
@@ -22450,7 +22447,7 @@ opcode_lookup (char **str)
/* Scan up to the end of the mnemonic, which must end in white space,
'.' (in unified mode, or for Neon/VFP instructions), or end of string. */
for (base = end = *str; *end != '\0'; end++)
- if (*end == ' ' || *end == '.')
+ if (is_whitespace (*end) || *end == '.')
break;
if (end == base)
@@ -22481,7 +22478,7 @@ opcode_lookup (char **str)
if (parse_neon_type (&inst.vectype, str) == FAIL)
return NULL;
}
- else if (end[offset] != '\0' && end[offset] != ' ')
+ else if (end[offset] != '\0' && !is_whitespace (end[offset]))
return NULL;
}
else
* Re: [PATCH v2 05/65] Arm: use is_whitespace()
2025-01-27 15:50 ` [PATCH v2 05/65] Arm: " Jan Beulich
@ 2025-01-27 16:31 ` Richard Earnshaw (lists)
2025-01-27 16:55 ` Jan Beulich
0 siblings, 1 reply; 106+ messages in thread
From: Richard Earnshaw (lists) @ 2025-01-27 16:31 UTC (permalink / raw)
To: Jan Beulich, Binutils
Cc: Nick Clifton, ramana.radhakrishnan, Richard Earnshaw
On 27/01/2025 15:50, Jan Beulich wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input). At the same time use is_end_of_stmt() instead of an
> open-coded nul char check.
>
> In parse_neon_type() be more aggressive and remove the special casing of
> certain characters altogether. The original default case simply having
> "break" can't have been correct.
> ---
> I don't think I see why parse_qfloat_immediate() checks for '\n'. If
> that was needed, "line" (really: statement) separators would need
> checking for, too. Yet an easy experiment demonstrates that this case is
> working correctly despite the lack of a check for ';'.
>
> The check for "0x" in parse_qfloat_immediate() seems fishy, too: If it
> was actually needed, "0X" would apparently also need checking for. Yet again
> experimentally that's properly refused anyway, by atof_ieee() I guess.
>
> While for parse_neon_type() the change improves the handling of this set
> of (bad) examples (including the case of passing -f to gas):
>
> vcvt.bf016.f32 d0, q0
> vcvt.bf16.f032 d0, q0
> vcvt.b16.f32 d0, q0
> vcvt.b f16.f32 d0, q0
> vcvt.bf 16.f32 d0, q0
> vcvt.bf16.f 32 d0, q0
> vcvt.b f16.f32 d0, q0
> vcvt.b 16.f32 d0, q0
> vcvt.b 32.f32 d0, q0
> vcvt.bf 16.f32 d0, q0
>
> several are left which imo also ought to be rejected. Yet that will want
> sorting separately.
> ---
> v2: Also replace ISSPACE().
>
> --- a/gas/config/tc-arm.c
> +++ b/gas/config/tc-arm.c
> @@ -1081,7 +1081,7 @@ const char FLT_CHARS[] = "rRsSfFdDxXeEpP
>
> /* Separator character handling. */
>
> -#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
> +#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
>
> enum fp_16bit_format
> {
> @@ -1510,13 +1510,9 @@ parse_neon_type (struct neon_type *type,
> return FAIL;
> }
> goto done;
> - case '0': case '1': case '2': case '3': case '4':
> - case '5': case '6': case '7': case '8': case '9':
> - case ' ': case '.':
> + default:
> as_bad (_("unexpected type character `b' -- did you mean `bf'?"));
> return FAIL;
> - default:
> - break;
> }
This entire switch statement has now degenerated into 'f' or error. So I think it would be better to just replace it with an if-else.
> break;
> default:
> @@ -5055,7 +5051,8 @@ set_fp16_format (int dummy ATTRIBUTE_UNU
> new_format = ARM_FP16_FORMAT_DEFAULT;
>
> name = input_line_pointer;
> - while (*input_line_pointer && !ISSPACE (*input_line_pointer))
> + while (!is_end_of_stmt (*input_line_pointer)
> + && !is_whitespace (*input_line_pointer))
> input_line_pointer++;
>
> saved_char = *input_line_pointer;
> @@ -5366,7 +5363,7 @@ parse_qfloat_immediate (char **ccp, int
> return FAIL;
> else
> {
> - for (; *fpnum != '\0' && *fpnum != ' ' && *fpnum != '\n'; fpnum++)
> + for (; *fpnum != '\0' && !is_whitespace (*fpnum) && *fpnum != '\n'; fpnum++)
> if (*fpnum == '.' || *fpnum == 'e' || *fpnum == 'E')
> {
> found_fpchar = 1;
> @@ -22450,7 +22447,7 @@ opcode_lookup (char **str)
> /* Scan up to the end of the mnemonic, which must end in white space,
> '.' (in unified mode, or for Neon/VFP instructions), or end of string. */
> for (base = end = *str; *end != '\0'; end++)
> - if (*end == ' ' || *end == '.')
> + if (is_whitespace (*end) || *end == '.')
> break;
>
> if (end == base)
> @@ -22481,7 +22478,7 @@ opcode_lookup (char **str)
> if (parse_neon_type (&inst.vectype, str) == FAIL)
> return NULL;
> }
> - else if (end[offset] != '\0' && end[offset] != ' ')
> + else if (end[offset] != '\0' && !is_whitespace (end[offset]))
> return NULL;
> }
> else
>
OK with that change.
R.
* Re: [PATCH v2 05/65] Arm: use is_whitespace()
2025-01-27 16:31 ` Richard Earnshaw (lists)
@ 2025-01-27 16:55 ` Jan Beulich
2025-01-27 17:05 ` Richard Earnshaw (lists)
0 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:55 UTC (permalink / raw)
To: Richard Earnshaw (lists)
Cc: Nick Clifton, ramana.radhakrishnan, Richard Earnshaw, Binutils
On 27.01.2025 17:31, Richard Earnshaw (lists) wrote:
> On 27/01/2025 15:50, Jan Beulich wrote:
>> --- a/gas/config/tc-arm.c
>> +++ b/gas/config/tc-arm.c
>> @@ -1081,7 +1081,7 @@ const char FLT_CHARS[] = "rRsSfFdDxXeEpP
>>
>> /* Separator character handling. */
>>
>> -#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
>> +#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
>>
>> enum fp_16bit_format
>> {
>> @@ -1510,13 +1510,9 @@ parse_neon_type (struct neon_type *type,
>> return FAIL;
>> }
>> goto done;
>> - case '0': case '1': case '2': case '3': case '4':
>> - case '5': case '6': case '7': case '8': case '9':
>> - case ' ': case '.':
>> + default:
>> as_bad (_("unexpected type character `b' -- did you mean `bf'?"));
>> return FAIL;
>> - default:
>> - break;
>> }
>
> This entire switch statement has now degenerated into 'f' or error. So I think it would be better to just replace it with an if-else.
I can do that, but in other projects I'm active in we'd deliberately ask
that switch() be used simply in the expectation that if any further
character would want checking for, code churn would then be lower. If
you're fine with the extra churn, I can of course adjust here.
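
The two shapes under discussion can be compared with a heavily simplified
sketch. It only models the decision on the character following 'b'; the
enum and both function names are hypothetical, and the real
parse_neon_type() does more work in the 'f' case.

```c
#include <assert.h>

enum demo_result { DEMO_OK, DEMO_FAIL };

/* Switch form: one real case plus a catch-all, as the patch leaves it.  */
static enum demo_result
demo_after_b_switch (char next)
{
  switch (next)
    {
    case 'f':
      return DEMO_OK;
    default:
      return DEMO_FAIL;
    }
}

/* If-else form, as suggested in review.  */
static enum demo_result
demo_after_b_if (char next)
{
  if (next == 'f')
    return DEMO_OK;
  return DEMO_FAIL;
}
```

Both forms accept exactly the same input; the choice is about how much
the code would need to change if another character ever had to be
recognized.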
Jan
* Re: [PATCH v2 05/65] Arm: use is_whitespace()
2025-01-27 16:55 ` Jan Beulich
@ 2025-01-27 17:05 ` Richard Earnshaw (lists)
0 siblings, 0 replies; 106+ messages in thread
From: Richard Earnshaw (lists) @ 2025-01-27 17:05 UTC (permalink / raw)
To: Jan Beulich
Cc: Nick Clifton, ramana.radhakrishnan, Richard Earnshaw, Binutils
On 27/01/2025 16:55, Jan Beulich wrote:
> On 27.01.2025 17:31, Richard Earnshaw (lists) wrote:
>> On 27/01/2025 15:50, Jan Beulich wrote:
>>> --- a/gas/config/tc-arm.c
>>> +++ b/gas/config/tc-arm.c
>>> @@ -1081,7 +1081,7 @@ const char FLT_CHARS[] = "rRsSfFdDxXeEpP
>>>
>>> /* Separator character handling. */
>>>
>>> -#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
>>> +#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
>>>
>>> enum fp_16bit_format
>>> {
>>> @@ -1510,13 +1510,9 @@ parse_neon_type (struct neon_type *type,
>>> return FAIL;
>>> }
>>> goto done;
>>> - case '0': case '1': case '2': case '3': case '4':
>>> - case '5': case '6': case '7': case '8': case '9':
>>> - case ' ': case '.':
>>> + default:
>>> as_bad (_("unexpected type character `b' -- did you mean `bf'?"));
>>> return FAIL;
>>> - default:
>>> - break;
>>> }
>>
>> This entire switch statement has now degenerated into 'f' or error. So I think it would be better to just replace it with an if-else.
>
> I can do that, but in other projects I'm active in we'd deliberately ask
> that switch() be used simply in the expectation that if any further
> character would want checking for, code churn would then be lower. If
> you're fine with the extra churn, I can of course adjust here.
>
> Jan
Never say never, but I can't see other characters being needed here. I'll take that hit if it's needed at some point in the future. Note, we already don't handle this when parsing the numbers after an 'f', so we're not consistent anyway.
I don't particularly like the error recovery here anyway, but that's another story. Printing "unexpected type character `b' -- did you mean `bf'?" is not exactly informative when there's so little context shown. It's not even as though 'bf' is the complete answer (the type is 'bf16'), but to do this even close to properly we'd need to tokenize the input and print the entire substring up to the next token separator.
R.
^ permalink raw reply [flat|nested] 106+ messages in thread
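For illustration, the two shapes weighed in the Arm review above can be
put side by side. This is only a minimal sketch, not the tc-arm.c code:
b_type_ok_switch(), b_type_ok_if() and check_forms_agree() are
hypothetical names, and the real parser also issues the as_bad()
diagnostic instead of merely returning false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: after a leading 'b' in a type suffix, only 'f'
   (as in "bf16") is acceptable.  switch() form, as in the patch.  */
static bool
b_type_ok_switch (char next)
{
  switch (next)
    {
    case 'f':
      return true;
    default:
      return false;
    }
}

/* The same decision as an if-else, as suggested in review.  */
static bool
b_type_ok_if (char next)
{
  if (next == 'f')
    return true;
  else
    return false;
}

/* Both forms agree for every character value.  */
static void
check_forms_agree (void)
{
  for (int c = 0; c < 256; ++c)
    assert (b_type_ok_switch ((char) c) == b_type_ok_if ((char) c));
}
```

The behaviour is identical either way. The trade-off discussed is purely
about how much of the code has to be touched if another accepted
character is ever added.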
* [PATCH v2 06/65] aarch64: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (4 preceding siblings ...)
2025-01-27 15:50 ` [PATCH v2 05/65] Arm: " Jan Beulich
@ 2025-01-27 15:50 ` Jan Beulich
2025-01-27 16:31 ` Richard Earnshaw (lists)
2025-01-27 16:06 ` [PATCH v2 07/65] avr: " Jan Beulich
` (60 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 15:50 UTC (permalink / raw)
To: Binutils; +Cc: Richard Earnshaw, Marcus Shawcroft
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -634,7 +634,7 @@ const char FLT_CHARS[] = "rRsSfFdDxXeEpP
/* Separator character handling. */
-#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
+#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
static inline bool
skip_past_char (char **str, char c)
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 06/65] aarch64: use is_whitespace()
2025-01-27 15:50 ` [PATCH v2 06/65] aarch64: " Jan Beulich
@ 2025-01-27 16:31 ` Richard Earnshaw (lists)
0 siblings, 0 replies; 106+ messages in thread
From: Richard Earnshaw (lists) @ 2025-01-27 16:31 UTC (permalink / raw)
To: Jan Beulich, Binutils; +Cc: Richard Earnshaw, Marcus Shawcroft
On 27/01/2025 15:50, Jan Beulich wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input).
>
> --- a/gas/config/tc-aarch64.c
> +++ b/gas/config/tc-aarch64.c
> @@ -634,7 +634,7 @@ const char FLT_CHARS[] = "rRsSfFdDxXeEpP
>
> /* Separator character handling. */
>
> -#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
> +#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
>
> static inline bool
> skip_past_char (char **str, char c)
>
OK.
R.
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 07/65] avr: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (5 preceding siblings ...)
2025-01-27 15:50 ` [PATCH v2 06/65] aarch64: " Jan Beulich
@ 2025-01-27 16:06 ` Jan Beulich
2025-01-27 16:07 ` [PATCH v2 08/65] bfin: " Jan Beulich
` (59 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:06 UTC (permalink / raw)
To: Binutils; +Cc: Denis Chertykov, Marek Michalkiewicz
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-avr.c
+++ b/gas/config/tc-avr.c
@@ -618,7 +618,7 @@ show_mcu_list (FILE *stream)
static inline char *
skip_space (char *s)
{
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
++s;
return s;
}
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 08/65] bfin: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (6 preceding siblings ...)
2025-01-27 16:06 ` [PATCH v2 07/65] avr: " Jan Beulich
@ 2025-01-27 16:07 ` Jan Beulich
2025-01-27 16:08 ` [PATCH v2 09/65] bpf: " Jan Beulich
` (58 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:07 UTC (permalink / raw)
To: Binutils; +Cc: Jie Zhang, Mike Frysinger
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-bfin.c
+++ b/gas/config/tc-bfin.c
@@ -1939,7 +1939,8 @@ bfin_eol_in_insn (char *line)
/* If the || is on the next line, there might be leading whitespace. */
temp++;
- while (*temp == ' ' || *temp == '\t') temp++;
+ while (is_whitespace (*temp))
+ temp++;
if (*temp == '|')
return true;
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 09/65] bpf: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (7 preceding siblings ...)
2025-01-27 16:07 ` [PATCH v2 08/65] bfin: " Jan Beulich
@ 2025-01-27 16:08 ` Jan Beulich
2025-01-28 9:21 ` Alan Modra
2025-01-27 16:09 ` [PATCH v2 10/65] CR16: " Jan Beulich
` (57 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:08 UTC (permalink / raw)
To: Binutils; +Cc: Jose Marchesi
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). At the same time use is_end_of_stmt() instead of
open-coded nul char checks.
---
v2: New.
--- a/gas/config/tc-bpf.c
+++ b/gas/config/tc-bpf.c
@@ -1274,7 +1274,7 @@ parse_expression (char *s, expressionS *
these whitespaces. */
{
char *p;
- for (p = s - 1; p >= saved_s && *p == ' '; --p)
+ for (p = s - 1; p >= saved_s && is_whitespace (*p); --p)
--s;
}
@@ -1501,7 +1501,7 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
if (*p == ' ')
{
/* Expect zero or more spaces. */
- while (*s != '\0' && (*s == ' ' || *s == '\t'))
+ while (!is_end_of_stmt (*s) && is_whitespace (*s))
s += 1;
p += 1;
}
@@ -1520,20 +1520,20 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
else if (*(p + 1) == 'w')
{
/* Expect zero or more spaces. */
- while (*s != '\0' && (*s == ' ' || *s == '\t'))
+ while (!is_end_of_stmt (*s) && is_whitespace (*s))
s += 1;
p += 2;
}
else if (*(p + 1) == 'W')
{
/* Expect one or more spaces. */
- if (*s != ' ' && *s != '\t')
+ if (!is_whitespace (*s))
{
PARSE_ERROR ("expected white space, got '%s'",
s);
break;
}
- while (*s != '\0' && (*s == ' ' || *s == '\t'))
+ while (!is_end_of_stmt (*s) && is_whitespace (*s))
s += 1;
p += 2;
}
@@ -1620,7 +1620,7 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
if (p[1] == 'I')
{
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
s += 1;
if (*s != '+' && *s != '-')
{
@@ -1643,7 +1643,7 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
{
char *exp = NULL;
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
s += 1;
if (*s != '+' && *s != '-')
{
@@ -1735,9 +1735,9 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
if (*p == '\0')
{
/* Allow white spaces at the end of the line. */
- while (*s != '\0' && (*s == ' ' || *s == '\t'))
+ while (!is_end_of_stmt (*s) && is_whitespace (*s))
s += 1;
- if (*s == '\0')
+ if (is_end_of_stmt (*s))
/* We parsed an instruction successfully. */
break;
PARSE_ERROR ("extra junk at end of line");
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 09/65] bpf: use is_whitespace()
2025-01-27 16:08 ` [PATCH v2 09/65] bpf: " Jan Beulich
@ 2025-01-28 9:21 ` Alan Modra
2025-01-28 10:31 ` Jan Beulich
0 siblings, 1 reply; 106+ messages in thread
From: Alan Modra @ 2025-01-28 9:21 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils, Jose Marchesi
On Mon, Jan 27, 2025 at 05:08:18PM +0100, Jan Beulich wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input). At the same time use is_end_of_stmt() instead of
> open-coded nul char checks.
> ---
> v2: New.
>
> --- a/gas/config/tc-bpf.c
> +++ b/gas/config/tc-bpf.c
> @@ -1274,7 +1274,7 @@ parse_expression (char *s, expressionS *
> these whitespaces. */
> {
> char *p;
> - for (p = s - 1; p >= saved_s && *p == ' '; --p)
> + for (p = s - 1; p >= saved_s && is_whitespace (*p); --p)
> --s;
> }
>
> @@ -1501,7 +1501,7 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
> if (*p == ' ')
> {
> /* Expect zero or more spaces. */
> - while (*s != '\0' && (*s == ' ' || *s == '\t'))
> + while (!is_end_of_stmt (*s) && is_whitespace (*s))
Just is_whitespace here.
> s += 1;
> p += 1;
> }
> @@ -1520,20 +1520,20 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
> else if (*(p + 1) == 'w')
> {
> /* Expect zero or more spaces. */
> - while (*s != '\0' && (*s == ' ' || *s == '\t'))
> + while (!is_end_of_stmt (*s) && is_whitespace (*s))
Same.
> s += 1;
> p += 2;
> }
> else if (*(p + 1) == 'W')
> {
> /* Expect one or more spaces. */
> - if (*s != ' ' && *s != '\t')
> + if (!is_whitespace (*s))
> {
> PARSE_ERROR ("expected white space, got '%s'",
> s);
> break;
> }
> - while (*s != '\0' && (*s == ' ' || *s == '\t'))
> + while (!is_end_of_stmt (*s) && is_whitespace (*s))
Same.
> s += 1;
> p += 2;
> }
> @@ -1620,7 +1620,7 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
>
> if (p[1] == 'I')
> {
> - while (*s == ' ' || *s == '\t')
> + while (is_whitespace (*s))
> s += 1;
> if (*s != '+' && *s != '-')
> {
> @@ -1643,7 +1643,7 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
> {
> char *exp = NULL;
>
> - while (*s == ' ' || *s == '\t')
> + while (is_whitespace (*s))
> s += 1;
> if (*s != '+' && *s != '-')
> {
> @@ -1735,9 +1735,9 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
> if (*p == '\0')
> {
> /* Allow white spaces at the end of the line. */
> - while (*s != '\0' && (*s == ' ' || *s == '\t'))
> + while (!is_end_of_stmt (*s) && is_whitespace (*s))
Again.
> s += 1;
> - if (*s == '\0')
> + if (is_end_of_stmt (*s))
> /* We parsed an instruction successfully. */
> break;
> PARSE_ERROR ("extra junk at end of line");
--
Alan Modra
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 09/65] bpf: use is_whitespace()
2025-01-28 9:21 ` Alan Modra
@ 2025-01-28 10:31 ` Jan Beulich
0 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-28 10:31 UTC (permalink / raw)
To: Alan Modra; +Cc: Binutils, Jose Marchesi
On 28.01.2025 10:21, Alan Modra wrote:
> On Mon, Jan 27, 2025 at 05:08:18PM +0100, Jan Beulich wrote:
>> Wherever blanks are permissible in input, tabs ought to be permissible,
>> too. This is particularly relevant when -f is passed to gas (alongside
>> appropriate input). At the same time use is_end_of_stmt() instead of
>> open-coded nul char checks.
>> ---
>> v2: New.
>>
>> --- a/gas/config/tc-bpf.c
>> +++ b/gas/config/tc-bpf.c
>> @@ -1274,7 +1274,7 @@ parse_expression (char *s, expressionS *
>> these whitespaces. */
>> {
>> char *p;
>> - for (p = s - 1; p >= saved_s && *p == ' '; --p)
>> + for (p = s - 1; p >= saved_s && is_whitespace (*p); --p)
>> --s;
>> }
>>
>> @@ -1501,7 +1501,7 @@ md_assemble (char *str ATTRIBUTE_UNUSED)
>> if (*p == ' ')
>> {
>> /* Expect zero or more spaces. */
>> - while (*s != '\0' && (*s == ' ' || *s == '\t'))
>> + while (!is_end_of_stmt (*s) && is_whitespace (*s))
>
> Just is_whitespace here.
Oh, yes. In later patches I did that, but then forgot to check back in
earlier ones. Thanks for noticing.
Jan
^ permalink raw reply [flat|nested] 106+ messages in thread
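The review remark on the bpf patch ("Just is_whitespace here") rests on
the fact that a character cannot be both whitespace and an
end-of-statement marker, so the extra guard never changes where the loop
stops. A minimal sketch follows, assuming simplified stand-ins for the
two classifiers: gas's real is_whitespace() and is_end_of_stmt() are
table-driven and target-dependent, and the skip_guarded() and
skip_plain() names are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins, assumed for illustration only: whitespace is blank or tab;
   a statement ends at NUL, newline or ';'.  */
static bool
is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

static bool
is_end_of_stmt (char c)
{
  return c == '\0' || c == '\n' || c == ';';
}

/* Loop shape as posted in the patch.  */
static const char *
skip_guarded (const char *s)
{
  assert (s != NULL);
  while (!is_end_of_stmt (*s) && is_whitespace (*s))
    s += 1;
  return s;
}

/* Loop shape as requested in review.  */
static const char *
skip_plain (const char *s)
{
  assert (s != NULL);
  while (is_whitespace (*s))
    s += 1;
  return s;
}
```

Under these assumptions both functions return the same pointer for any
NUL-terminated input, since the loop already stops at the first
non-whitespace character, which includes every end-of-statement one.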
* [PATCH v2 10/65] CR16: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (8 preceding siblings ...)
2025-01-27 16:08 ` [PATCH v2 09/65] bpf: " Jan Beulich
@ 2025-01-27 16:09 ` Jan Beulich
2025-01-27 16:10 ` [PATCH v2 11/65] cris: " Jan Beulich
` (56 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:09 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also switch ISSPACE() uses over.
---
v2: New.
--- a/gas/config/tc-cr16.c
+++ b/gas/config/tc-cr16.c
@@ -1219,7 +1219,7 @@ set_operand (char *operand, ins * cr16_i
/* Set register pair base. */
if ((strchr (operandS,'(') != NULL))
{
- while ((*operandE != '(') && (! ISSPACE (*operandE)))
+ while ((*operandE != '(') && (! is_whitespace (*operandE)))
operandE++;
if ((cur_arg->rp = get_index_register_pair (operandE)) == nullregister)
as_bad (_("Illegal register pair `%s' in Instruction `%s'"),
@@ -1400,7 +1400,7 @@ parse_operands (ins * cr16_ins, char *op
continue;
}
- if (*operandT == ' ')
+ if (is_whitespace (*operandT))
as_bad (_("Illegal operands (whitespace): `%s'"), ins_parse);
if (*operandT == '(')
@@ -1545,12 +1545,13 @@ check_cinv_options (char * operand)
switch (*p)
{
case ',':
- case ' ':
case 'i':
case 'u':
case 'd':
break;
default:
+ if (is_whitespace (*p))
+ break;
as_bad (_("Illegal `cinv' parameter: `%c'"), *p);
}
}
@@ -2503,7 +2504,7 @@ md_assemble (char *op)
reset_vars (op);
/* Strip the mnemonic. */
- for (param = op; *param != 0 && !ISSPACE (*param); param++)
+ for (param = op; *param != 0 && !is_whitespace (*param); param++)
;
*param++ = '\0';
^ permalink raw reply [flat|nested] 106+ messages in thread
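The check_cinv_options() hunk in the CR16 patch above shows a pattern
that recurs in this series: a predicate cannot serve as a case label, so
the former "case ' ':" is dropped and the whitespace test moves into the
default arm. A minimal sketch of that shape, assuming a simplified
is_whitespace() stand-in (blank or tab only) and a hypothetical
cinv_char_ok() helper that returns a flag instead of calling as_bad():

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for gas's is_whitespace(), assumed here to accept only
   blank and tab.  */
static bool
is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Hypothetical helper: is C acceptable inside a cinv option list?  */
static bool
cinv_char_ok (char c)
{
  switch (c)
    {
    case ',':
    case 'i':
    case 'u':
    case 'd':
      return true;
    default:
      /* A predicate cannot be a case label, so whitespace is
         tested here before reporting an error.  */
      return is_whitespace (c);
    }
}

/* Small self-check of the behaviour described above.  */
static void
check_cinv_chars (void)
{
  assert (cinv_char_ok (' '));
  assert (cinv_char_ok ('\t'));
  assert (!cinv_char_ok ('x'));
}
```

With this shape a tab is accepted in the same places a blank was
accepted before, while every other unexpected character still reaches
the error path.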
* [PATCH v2 11/65] cris: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (9 preceding siblings ...)
2025-01-27 16:09 ` [PATCH v2 10/65] CR16: " Jan Beulich
@ 2025-01-27 16:10 ` Jan Beulich
2025-01-27 16:22 ` Hans-Peter Nilsson
2025-01-27 16:12 ` [PATCH v2 12/65] CRx: " Jan Beulich
` (55 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:10 UTC (permalink / raw)
To: Binutils; +Cc: Hans-Peter Nilsson
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also switch ISSPACE() uses over.
---
v2: New.
--- a/gas/config/tc-cris.c
+++ b/gas/config/tc-cris.c
@@ -1540,13 +1540,15 @@ cris_process_instruction (char *insn_tex
modified_char = *operands;
/* Fall through. */
- case ' ':
+ zap_char:
/* Consume the character after the mnemonic
and replace it with '\0'. */
*operands++ = '\0';
break;
default:
+ if (is_whitespace (*operands))
+ goto zap_char;
as_bad (_("Unknown opcode: `%s'"), insn_text);
return;
}
@@ -1608,12 +1610,16 @@ cris_process_instruction (char *insn_tex
case '[':
case ']':
case ',':
- case ' ':
/* These must match exactly. */
if (*s++ == *args)
continue;
break;
+ case ' ':
+ if (is_whitespace (*s++))
+ continue;
+ break;
+
case 'A':
/* "ACR", case-insensitive.
Handle a sometimes-mandatory dollar sign as register
@@ -1682,7 +1688,7 @@ cris_process_instruction (char *insn_tex
if (modified_char == '.' && *s == '.')
{
if ((s[1] != 'd' && s[1] == 'D')
- || ! ISSPACE (s[2]))
+ || ! is_whitespace (s[2]))
break;
s += 2;
continue;
@@ -3231,7 +3237,7 @@ get_flags (char **cPP, int *flagsp)
whitespace. Anything else, and we consider it a failure. */
if (**cPP != ','
&& **cPP != 0
- && ! ISSPACE (**cPP))
+ && ! is_whitespace (**cPP))
return 0;
else
return 1;
@@ -4278,7 +4284,7 @@ cris_arch_from_string (const char **str)
int len = strlen (ap->name);
if (strncmp (*str, ap->name, len) == 0
- && (str[0][len] == 0 || ISSPACE (str[0][len])))
+ && (is_end_of_stmt (str[0][len]) || is_whitespace (str[0][len])))
{
*str += strlen (ap->name);
return ap->arch;
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 11/65] cris: use is_whitespace()
2025-01-27 16:10 ` [PATCH v2 11/65] cris: " Jan Beulich
@ 2025-01-27 16:22 ` Hans-Peter Nilsson
0 siblings, 0 replies; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-01-27 16:22 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils
> Date: Mon, 27 Jan 2025 17:10:32 +0100
> From: Jan Beulich <jbeulich@suse.com>
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input). Also switch ISSPACE() uses over.
Hm... This code is supposed to handle formatted input
(#NO_APP in effect; format being simplified,
"canonicalized") so aren't the TABs supposed to be gone
here, with each run of whitespace changed into a single space character?
brgds, H-P
^ permalink raw reply [flat|nested] 106+ messages in thread
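The question in the cris reply above and the "-f" remark in the commit
messages concern the same mechanism: when input has been canonicalized,
a parser testing only for ' ' is sufficient, but when that preprocessing
is skipped a tab can reach the target code unchanged. The following is a
toy sketch under loud assumptions: collapse_whitespace() is NOT gas's
scrubber, merely a simplified model of it, and the two mnemonic_len_*()
helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in classifier, assumed to accept only blank and tab.  */
static bool
is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Toy canonicalizer: copy SRC to DST, turning each run of blanks and
   tabs into a single blank.  DST must hold strlen (SRC) + 1 bytes.  */
static void
collapse_whitespace (const char *src, char *dst)
{
  assert (src != NULL && dst != NULL);
  while (*src != '\0')
    {
      if (is_whitespace (*src))
        {
          *dst++ = ' ';
          while (is_whitespace (*src))
            ++src;
        }
      else
        *dst++ = *src++;
    }
  *dst = '\0';
}

/* Old-style scan: only a blank ends the mnemonic.  */
static size_t
mnemonic_len_blank_only (const char *insn)
{
  size_t n = 0;
  while (insn[n] != '\0' && insn[n] != ' ')
    ++n;
  return n;
}

/* New-style scan: any whitespace ends the mnemonic.  */
static size_t
mnemonic_len_any_whitespace (const char *insn)
{
  size_t n = 0;
  while (insn[n] != '\0' && !is_whitespace (insn[n]))
    ++n;
  return n;
}
```

On canonicalized text both scans find the same mnemonic. On raw text
such as "move\tr1,r2" the blank-only scan runs to the end of the string
and treats the whole line as the mnemonic, whereas the
is_whitespace()-based scan stops at the tab.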
* [PATCH v2 12/65] CRx: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (10 preceding siblings ...)
2025-01-27 16:10 ` [PATCH v2 11/65] cris: " Jan Beulich
@ 2025-01-27 16:12 ` Jan Beulich
2025-01-27 16:13 ` [PATCH v2 13/65] C-Sky: " Jan Beulich
` (54 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:12 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also switch ISSPACE() uses over.
---
v2: New.
--- a/gas/config/tc-crx.c
+++ b/gas/config/tc-crx.c
@@ -721,7 +721,7 @@ set_operand (char *operand, ins * crx_in
operandS = ++operandE;
/* Set register base. */
- while ((*operandE != ',') && (! ISSPACE (*operandE)))
+ while ((*operandE != ',') && (! is_whitespace (*operandE)))
operandE++;
*operandE++ = '\0';
if ((cur_arg->r = get_register (operandS)) == nullregister)
@@ -729,7 +729,7 @@ set_operand (char *operand, ins * crx_in
operandS, ins_parse);
/* Skip leading white space. */
- while (ISSPACE (*operandE))
+ while (is_whitespace (*operandE))
operandE++;
operandS = operandE;
@@ -744,7 +744,7 @@ set_operand (char *operand, ins * crx_in
operandS, ins_parse);
/* Skip leading white space. */
- while (ISSPACE (*operandE))
+ while (is_whitespace (*operandE))
operandE++;
operandS = operandE;
@@ -883,7 +883,7 @@ parse_operands (ins * crx_ins, char *ope
continue;
}
- if (*operandT == ' ')
+ if (is_whitespace (*operandT))
as_bad (_("Illegal operands (whitespace): `%s'"), ins_parse);
if (*operandT == '(')
@@ -1030,7 +1030,7 @@ get_cinv_parameters (const char *operand
while (*++p != ']')
{
- if (*p == ',' || *p == ' ')
+ if (*p == ',' || is_whitespace (*p))
continue;
if (*p == 'd')
@@ -1927,7 +1927,7 @@ md_assemble (char *op)
reset_vars (op);
/* Strip the mnemonic. */
- for (param = op; *param != 0 && !ISSPACE (*param); param++)
+ for (param = op; *param != 0 && !is_whitespace (*param); param++)
;
c = *param;
*param++ = '\0';
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 13/65] C-Sky: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (11 preceding siblings ...)
2025-01-27 16:12 ` [PATCH v2 12/65] CRx: " Jan Beulich
@ 2025-01-27 16:13 ` Jan Beulich
2025-01-27 16:14 ` [PATCH v2 14/65] d10v: " Jan Beulich
` (53 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:13 UTC (permalink / raw)
To: Binutils; +Cc: Lifang Xia, Yunhai Shang
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also switch ISSPACE() uses over. At the same time
use is_end_of_stmt() instead of kind-of-open-coded checks.
---
v2: New.
--- a/gas/config/tc-csky.c
+++ b/gas/config/tc-csky.c
@@ -2287,7 +2287,7 @@ parse_exp (char *s, expressionS *e)
char *new;
/* Skip whitespace. */
- while (ISSPACE (*s))
+ while (is_whitespace (*s))
++s;
save = input_line_pointer;
@@ -3325,14 +3325,14 @@ parse_opcode (char *str)
char macro_name[OPCODE_MAX_LEN + 1];
/* Remove space ahead of string. */
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
str++;
opcode_end = str;
/* Find the opcode end. */
while (nlen < OPCODE_MAX_LEN
- && !is_end_of_line [(unsigned char) *opcode_end]
- && *opcode_end != ' ')
+ && !is_end_of_stmt (*opcode_end)
+ && !is_whitespace (*opcode_end))
{
/* Is csky force 32 or 16 instruction? */
if (IS_CSKY_V2 (mach_flag)
@@ -3378,7 +3378,7 @@ parse_opcode (char *str)
macro_name[nlen] = '\0';
/* Get csky_insn.opcode_end. */
- while (ISSPACE (*opcode_end))
+ while (is_whitespace (*opcode_end))
opcode_end++;
csky_insn.opcode_end = opcode_end;
@@ -4333,13 +4333,13 @@ parse_operands_op (char *str, struct csk
for (j = 0; j < csky_insn.number; j++)
{
- while (ISSPACE (*oper))
+ while (is_whitespace (*oper))
oper++;
flag_pass = get_operand_value (&op[i], &oper,
&op[i].oprnd.oprnds[j]);
if (!flag_pass)
break;
- while (ISSPACE (*oper))
+ while (is_whitespace (*oper))
oper++;
/* Skip the ','. */
if (j < csky_insn.number - 1 && op[i].operand_num != -1)
@@ -4578,7 +4578,7 @@ md_assemble (char *str)
mapping_state (MAP_TEXT);
/* Tie dwarf2 debug info to every insn if set option --gdwarf2. */
dwarf2_emit_insn (0);
- while (ISSPACE (* str))
+ while (is_whitespace (* str))
str++;
/* Get opcode from str. */
if (!parse_opcode (str))
@@ -5905,7 +5905,7 @@ static int
csky_get_macro_operand (char *src_s, char *dst_s, char end_sym)
{
int nlen = 0;
- while (ISSPACE (*src_s))
+ while (is_whitespace (*src_s))
++src_s;
while (*src_s != end_sym)
dst_s[nlen++] = *(src_s++);
@@ -7778,11 +7778,11 @@ csky_s_section (int ignore)
pool. */
char * ilp = input_line_pointer;
- while (*ilp != 0 && ISSPACE (*ilp))
+ while (is_whitespace (*ilp))
++ ilp;
if (startswith (ilp, ".line")
- && (ISSPACE (ilp[5]) || *ilp == '\n' || *ilp == '\r'))
+ && (is_whitespace (ilp[5]) || is_end_of_stmt (ilp[5])))
;
else
dump_literals (0);
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 14/65] d10v: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (12 preceding siblings ...)
2025-01-27 16:13 ` [PATCH v2 13/65] C-Sky: " Jan Beulich
@ 2025-01-27 16:14 ` Jan Beulich
2025-01-27 16:14 ` [PATCH v2 15/65] d30v: " Jan Beulich
` (52 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:14 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks where tabs were
already included. At the same time use is_end_of_stmt() instead of
open-coded checks in adjacent code.
---
v2: New.
--- a/gas/config/tc-d10v.c
+++ b/gas/config/tc-d10v.c
@@ -140,8 +140,7 @@ register_name (expressionS *expressionP)
int reg_number;
char c, *p = input_line_pointer;
- while (*p
- && *p != '\n' && *p != '\r' && *p != ',' && *p != ' ' && *p != ')')
+ while (!is_end_of_stmt (*p) && *p != ',' && !is_whitespace (*p) && *p != ')')
p++;
c = *p;
@@ -356,9 +355,9 @@ get_operands (expressionS exp[])
while (*p)
{
- while (*p == ' ' || *p == '\t' || *p == ',')
+ while (is_whitespace (*p) || *p == ',')
p++;
- if (*p == 0 || *p == '\n' || *p == '\r')
+ if (is_end_of_stmt (*p))
break;
if (*p == '@')
@@ -1410,12 +1409,12 @@ do_assemble (char *str, struct d10v_opco
expressionS myops[6];
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the opcode end. */
for (op_start = op_end = (unsigned char *) str;
- *op_end && !is_end_of_line[*op_end] && *op_end != ' ';
+ !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
{
name[nlen] = TOLOWER (op_start[nlen]);
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 15/65] d30v: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (13 preceding siblings ...)
2025-01-27 16:14 ` [PATCH v2 14/65] d10v: " Jan Beulich
@ 2025-01-27 16:14 ` Jan Beulich
2025-01-27 16:15 ` [PATCH v2 16/65] dlx: " Jan Beulich
` (51 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:14 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks where tabs were
already included. At the same time use is_end_of_stmt() instead of
open-coded checks in adjacent code.
---
v2: New.
--- a/gas/config/tc-d30v.c
+++ b/gas/config/tc-d30v.c
@@ -164,7 +164,7 @@ register_name (expressionS *expressionP)
int reg_number;
char c, *p = input_line_pointer;
- while (*p && *p != '\n' && *p != '\r' && *p != ',' && *p != ' ' && *p != ')')
+ while (!is_end_of_stmt (*p) && *p != ',' && !is_whitespace (*p) && *p != ')')
p++;
c = *p;
@@ -328,7 +328,7 @@ postfix (char *p)
{
while (*p != '-' && *p != '+')
{
- if (*p == 0 || *p == '\n' || *p == '\r' || *p == ' ' || *p == ',')
+ if (is_end_of_stmt (*p) || is_whitespace (*p) || *p == ',')
break;
p++;
}
@@ -400,7 +400,7 @@ get_operands (expressionS exp[], int cmp
while (*p)
{
- while (*p == ' ' || *p == '\t' || *p == ',')
+ while (is_whitespace (*p) || *p == ',')
p++;
if (*p == 0 || *p == '\n' || *p == '\r')
@@ -1294,7 +1294,7 @@ do_assemble (char *str,
long long insn;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the opcode end. */
@@ -1302,7 +1302,7 @@ do_assemble (char *str,
*op_end
&& nlen < (NAME_BUF_LEN - 1)
&& *op_end != '/'
- && !is_end_of_line[(unsigned char) *op_end] && *op_end != ' ';
+ && !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
{
name[nlen] = TOLOWER (op_start[nlen]);
@@ -1829,7 +1829,7 @@ d30v_start_line (void)
{
char *c = input_line_pointer;
- while (ISSPACE (*c))
+ while (is_whitespace (*c))
c++;
if (*c == '.')
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 16/65] dlx: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (14 preceding siblings ...)
2025-01-27 16:14 ` [PATCH v2 15/65] d30v: " Jan Beulich
@ 2025-01-27 16:15 ` Jan Beulich
2025-01-27 16:16 ` [PATCH v2 17/65] Epiphany: " Jan Beulich
` (50 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:15 UTC (permalink / raw)
To: Binutils; +Cc: Nikolaos Kavvadias
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks where tabs were
already included.
---
v2: New.
--- a/gas/config/tc-dlx.c
+++ b/gas/config/tc-dlx.c
@@ -499,12 +499,12 @@ dlx_parse_storeop (char * str)
pb = comma;
/* Duplicate the first register. */
- for (i = comma + 1; (str[i] == ' ' || str[i] == '\t'); i++)
+ for (i = comma + 1; is_whitespace (str[i]); i++)
;
for (m2 = 0; (m2 < 7 && str[i] != '\0'); i++, m2++)
{
- if (str[i] != ' ' && str[i] != '\t')
+ if (!is_whitespace (str[i]))
rd[m2] = str[i];
else
goto badoperand_store;
@@ -672,12 +672,12 @@ machine_ip (char *str)
case '\0':
break;
- /* FIXME-SOMEDAY more whitespace. */
- case ' ':
- *s++ = '\0';
- break;
-
default:
+ if (is_whitespace (*s))
+ {
+ *s++ = '\0';
+ break;
+ }
as_bad (_("Unknown opcode: `%s'"), str);
return;
}
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 17/65] Epiphany: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (15 preceding siblings ...)
2025-01-27 16:15 ` [PATCH v2 16/65] dlx: " Jan Beulich
@ 2025-01-27 16:16 ` Jan Beulich
2025-01-27 16:17 ` [PATCH v2 18/65] fr30: " Jan Beulich
` (49 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:16 UTC (permalink / raw)
To: Binutils; +Cc: Joern Rennecke
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-epiphany.c
+++ b/gas/config/tc-epiphany.c
@@ -353,7 +353,7 @@ parse_reglist (const char * s, int * mas
{
long value;
- while (*s == ' ')
+ while (is_whitespace (*s))
++s;
/* Parse a list with "," or "}" as limiters. */
@@ -371,7 +371,7 @@ parse_reglist (const char * s, int * mas
return _("register is out of order");
*mask |= regmask;
- while (*s==' ')
+ while (is_whitespace (*s))
++s;
if (*s == '}')
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 18/65] fr30: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (16 preceding siblings ...)
2025-01-27 16:16 ` [PATCH v2 17/65] Epiphany: " Jan Beulich
@ 2025-01-27 16:17 ` Jan Beulich
2025-01-27 16:19 ` [PATCH v2 19/65] ft32: " Jan Beulich
` (48 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:17 UTC (permalink / raw)
To: Binutils; +Cc: Nick Clifton
Convert open-coded checks. At the same time use is_end_of_stmt() instead
of an open-coded check in adjacent code.
---
v2: New.
--- a/gas/config/tc-fr30.c
+++ b/gas/config/tc-fr30.c
@@ -359,7 +359,7 @@ fr30_is_colon_insn (char *start, char *n
{
/* Nope - check to see a 'd' follows the colon. */
if ( (i_l_p[1] == 'd' || i_l_p[1] == 'D')
- && (i_l_p[2] == ' ' || i_l_p[2] == '\t' || i_l_p[2] == '\n'))
+ && (is_whitespace (i_l_p[2]) || is_end_of_stmt (i_l_p[2])))
{
/* Yup - it might be delay slot instruction. */
int i;
@@ -393,17 +393,17 @@ fr30_is_colon_insn (char *start, char *n
}
/* Check to see if the text following the colon is '8'. */
- if (i_l_p[1] == '8' && (i_l_p[2] == ' ' || i_l_p[2] == '\t'))
+ if (i_l_p[1] == '8' && is_whitespace (i_l_p[2]))
return restore_colon (i_l_p + 2, nul_char);
/* Check to see if the text following the colon is '20'. */
else if (i_l_p[1] == '2' && i_l_p[2] =='0'
- && (i_l_p[3] == ' ' || i_l_p[3] == '\t'))
+ && is_whitespace (i_l_p[3]))
return restore_colon (i_l_p + 3, nul_char);
/* Check to see if the text following the colon is '32'. */
else if (i_l_p[1] == '3' && i_l_p[2] =='2'
- && (i_l_p[3] == ' ' || i_l_p[3] == '\t'))
+ && is_whitespace (i_l_p[3]))
return restore_colon (i_l_p + 3, nul_char);
return 0;
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 19/65] ft32: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (17 preceding siblings ...)
2025-01-27 16:17 ` [PATCH v2 18/65] fr30: " Jan Beulich
@ 2025-01-27 16:19 ` Jan Beulich
2025-01-27 16:20 ` [PATCH v2 20/65] H8/300: " Jan Beulich
` (47 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:19 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also switch ISSPACE() uses over. At the same time
use is_end_of_stmt() instead of open-coded checks in adjacent code.
---
v2: New.
--- a/gas/config/tc-ft32.c
+++ b/gas/config/tc-ft32.c
@@ -212,15 +212,14 @@ md_assemble (char *str)
bool can_sc;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the op code end. */
op_start = str;
for (op_end = str;
- *op_end
- && !is_end_of_line[*op_end & 0xff]
- && *op_end != ' '
+ !is_end_of_stmt (*op_end)
+ && !is_whitespace (*op_end)
&& *op_end != '.';
op_end++)
nlen++;
@@ -273,7 +272,7 @@ md_assemble (char *str)
b |= dw << FT32_FLD_DW_BIT;
}
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
output = frag_more (4);
@@ -392,7 +391,7 @@ md_assemble (char *str)
if (f)
{
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != ',')
@@ -402,13 +401,13 @@ md_assemble (char *str)
}
op_end++;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
}
}
}
- if (*op_end != 0)
+ if (!is_end_of_stmt (*op_end))
as_warn (_("extra stuff on line ignored"));
can_sc = ft32_shortcode (b, &sc);
@@ -434,10 +433,10 @@ md_assemble (char *str)
dwarf2_emit_insn (4);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
- if (*op_end != 0)
+ if (!is_end_of_stmt (*op_end))
as_warn ("extra stuff on line ignored");
if (pending_reloc)
* [PATCH v2 20/65] H8/300: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (18 preceding siblings ...)
2025-01-27 16:19 ` [PATCH v2 19/65] ft32: " Jan Beulich
@ 2025-01-27 16:20 ` Jan Beulich
2025-01-27 16:20 ` [PATCH v2 21/65] HP-PA: " Jan Beulich
` (46 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:20 UTC (permalink / raw)
To: Binutils; +Cc: Prafulla Thakare
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). At the same time use is_end_of_stmt() instead of an
open-coded check in adjacent code.
---
v2: New.
--- a/gas/config/tc-h8300.c
+++ b/gas/config/tc-h8300.c
@@ -1904,12 +1904,12 @@ md_assemble (char *str)
int size, i;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the op code end. */
for (op_start = op_end = str;
- *op_end != 0 && *op_end != ' ';
+ !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
{
if (*op_end == '.')
* [PATCH v2 21/65] HP-PA: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (19 preceding siblings ...)
2025-01-27 16:20 ` [PATCH v2 20/65] H8/300: " Jan Beulich
@ 2025-01-27 16:20 ` Jan Beulich
2025-01-27 22:50 ` John David Anglin
2025-01-27 16:21 ` [PATCH v2 22/65] kvx: " Jan Beulich
` (45 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:20 UTC (permalink / raw)
To: Binutils; +Cc: Dave Anglin
Convert open-coded checks. At the same time use is_end_of_stmt() instead
of an open-coded check in adjacent code.
---
v2: New.
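The zap_char change in pa_ip() may be easier to follow in isolation. A case label can only name a single constant, so once "whitespace" is a class of characters rather than just ' ', the test has to move into the default branch and jump back to the shared code. The following is a simplified sketch of that pattern only, not the actual pa_ip() logic; demo_is_whitespace() and demo_split_mnemonic() are hypothetical names, and the blank-and-tab character set is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gas's is_whitespace(): blank and tab only.  */
static bool
demo_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Terminate the mnemonic at the start of S in place.  Return a pointer
   to the text following it, or NULL if the mnemonic is followed by an
   unexpected character.  */
char *
demo_split_mnemonic (char *s)
{
  assert (s != NULL);

  /* Skip to something interesting.  */
  while (isalnum ((unsigned char) *s))
    s++;

  switch (*s)
    {
    case '\0':
      return s;

    case ',':
    zap_char:
      *s++ = '\0';
      return s;

    default:
      /* Whitespace cannot be spelled as one case label, so test the
         predicate here and reuse the code above.  */
      if (demo_is_whitespace (*s))
        goto zap_char;
      return NULL;
    }
}
```

Given "ldw\t0(%r1),%r2" the sketch leaves "ldw" NUL-terminated and returns a pointer to "0(%r1),%r2"; before a change of this kind only a blank would have been accepted in that position.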
--- a/gas/config/tc-hppa.c
+++ b/gas/config/tc-hppa.c
@@ -2013,7 +2013,7 @@ pa_parse_number (char **s, int is_float)
bool have_prefix;
/* Skip whitespace before the number. */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p = p + 1;
pa_number = -1;
@@ -2229,12 +2229,12 @@ pa_parse_fp_cmp_cond (char **s)
*s += strlen (fp_cond_map[i].string);
/* If not a complete match, back up the input string and
report an error. */
- if (**s != ' ' && **s != '\t')
+ if (!is_whitespace (**s))
{
*s -= strlen (fp_cond_map[i].string);
break;
}
- while (**s == ' ' || **s == '\t')
+ while (is_whitespace (**s))
*s = *s + 1;
return cond;
}
@@ -2243,7 +2243,7 @@ pa_parse_fp_cmp_cond (char **s)
as_bad (_("Invalid FP Compare Condition: %s"), *s);
/* Advance over the bogus completer. */
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
return 0;
@@ -2416,7 +2416,7 @@ pa_chk_field_selector (char **str)
char *s = *str;
/* Read past any whitespace. */
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
s++;
*str = s;
@@ -2547,7 +2547,7 @@ pa_get_number (struct pa_it *insn, char
contain no whitespace. */
s = *strp;
- while (*s != ',' && *s != ' ' && *s != '\t')
+ while (*s != ',' && !is_whitespace (*s))
s++;
c = *s;
@@ -2627,7 +2627,7 @@ pa_parse_nonneg_cmpsub_cmpltr (char **s)
if (**s == ',')
{
*s += 1;
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
c = **s;
**s = 0x00;
@@ -2697,7 +2697,7 @@ pa_parse_neg_cmpsub_cmpltr (char **s)
if (**s == ',')
{
*s += 1;
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
c = **s;
**s = 0x00;
@@ -2772,7 +2772,7 @@ pa_parse_cmpb_64_cmpltr (char **s)
if (**s == ',')
{
*s += 1;
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
c = **s;
**s = 0x00;
@@ -2865,7 +2865,7 @@ pa_parse_cmpib_64_cmpltr (char **s)
if (**s == ',')
{
*s += 1;
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
c = **s;
**s = 0x00;
@@ -2928,7 +2928,7 @@ pa_parse_nonneg_add_cmpltr (char **s)
if (**s == ',')
{
*s += 1;
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
c = **s;
**s = 0x00;
@@ -2997,7 +2997,7 @@ pa_parse_neg_add_cmpltr (char **s)
if (**s == ',')
{
*s += 1;
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
c = **s;
**s = 0x00;
@@ -3070,7 +3070,7 @@ pa_parse_addb_64_cmpltr (char **s)
if (**s == ',')
{
*s += 1;
- while (**s != ',' && **s != ' ' && **s != '\t')
+ while (**s != ',' && !is_whitespace (**s))
*s += 1;
c = **s;
**s = 0x00;
@@ -3178,7 +3178,7 @@ pa_ip (char *str)
/* Convert everything up to the first whitespace character into lower
case. */
- for (s = str; *s != ' ' && *s != '\t' && *s != '\n' && *s != '\0'; s++)
+ for (s = str; !is_whitespace (*s) && !is_end_of_stmt (*s); s++)
*s = TOLOWER (*s);
/* Skip to something interesting. */
@@ -3198,11 +3198,13 @@ pa_ip (char *str)
/*FALLTHROUGH */
- case ' ':
+ zap_char:
*s++ = '\0';
break;
default:
+ if (is_whitespace (*s))
+ goto zap_char;
as_bad (_("Unknown opcode: `%s'"), str);
return;
}
@@ -3239,7 +3241,7 @@ pa_ip (char *str)
for (args = insn->args;; ++args)
{
/* Absorb white space in instruction. */
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
s++;
switch (*args)
@@ -3264,11 +3266,15 @@ pa_ip (char *str)
case '(':
case ')':
case ',':
- case ' ':
if (*s++ == *args)
continue;
break;
+ case ' ':
+ if (is_whitespace (*s++))
+ continue;
+ break;
+
/* Handle a 5 bit register or control register field at 10. */
case 'b':
case '^':
@@ -3282,7 +3288,7 @@ pa_ip (char *str)
is there. */
case '!':
/* Skip whitespace before register. */
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
s = s + 1;
if (!strncasecmp (s, "%sar", 4))
@@ -3956,7 +3962,7 @@ pa_ip (char *str)
break;
name = s;
- while (*s != ',' && *s != ' ' && *s != '\t')
+ while (*s != ',' && !is_whitespace (*s))
s += 1;
c = *s;
*s = 0x00;
@@ -4131,7 +4137,7 @@ pa_ip (char *str)
break;
name = s;
- while (*s != ',' && *s != ' ' && *s != '\t')
+ while (*s != ',' && !is_whitespace (*s))
s += 1;
c = *s;
*s = 0x00;
@@ -4279,7 +4285,7 @@ pa_ip (char *str)
break;
name = s;
- while (*s != ',' && *s != ' ' && *s != '\t')
+ while (*s != ',' && !is_whitespace (*s))
s += 1;
c = *s;
*s = 0x00;
@@ -4353,7 +4359,7 @@ pa_ip (char *str)
break;
name = s;
- while (*s != ',' && *s != ' ' && *s != '\t')
+ while (*s != ',' && !is_whitespace (*s))
s += 1;
c = *s;
*s = 0x00;
@@ -4497,7 +4503,7 @@ pa_ip (char *str)
s += 3;
}
/* ",*" is a valid condition. */
- else if (*args != 'U' || (*s != ' ' && *s != '\t'))
+ else if (*args != 'U' || !is_whitespace (*s))
as_bad (_("Invalid Unit Instruction Condition."));
}
/* 32-bit is default for no condition. */
* Re: [PATCH v2 21/65] HP-PA: use is_whitespace()
2025-01-27 16:20 ` [PATCH v2 21/65] HP-PA: " Jan Beulich
@ 2025-01-27 22:50 ` John David Anglin
2025-01-28 7:19 ` Jan Beulich
0 siblings, 1 reply; 106+ messages in thread
From: John David Anglin @ 2025-01-27 22:50 UTC (permalink / raw)
To: Jan Beulich, Binutils
Can't test without is_whitespace() being defined but it looks okay.
Dave
On 2025-01-27 11:20 a.m., Jan Beulich wrote:
> Convert open-coded checks. At the same time use is_end_of_stmt() instead
> of an open-coded check in adjacent code.
--
John David Anglin dave.anglin@bell.net
* [PATCH v2 22/65] kvx: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (20 preceding siblings ...)
2025-01-27 16:20 ` [PATCH v2 21/65] HP-PA: " Jan Beulich
@ 2025-01-27 16:21 ` Jan Beulich
2025-01-31 12:34 ` Paul Iannetta
2025-01-27 16:22 ` [PATCH v2 23/65] LoongArch: " Jan Beulich
` (44 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:21 UTC (permalink / raw)
To: Binutils; +Cc: Paul Iannetta
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks where tabs were
already included. At the same time use is_end_of_stmt() instead of open-
coded checks in adjacent code.
---
read_token()'s check for \n looks odd to me; I'm hence leaving it alone.
---
v2: New.
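As a rough illustration (not part of this patch) of the loop in kvx_md_start_line_hook(): patches in this series replace explicit NUL checks with is_end_of_stmt(), which suggests that the predicate is also true for NUL, and that is presumably why the separate `tmp_t[0]` guard has to stay. The sketch below uses hypothetical demo_* stand-ins with assumed character sets (blank and tab; NUL and newline).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gas's is_whitespace(): blank and tab only.  */
static bool
demo_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Hypothetical stand-in for gas's is_end_of_stmt().  Assumed here to be
   true for NUL and newline.  */
static bool
demo_is_end_of_stmt (char c)
{
  return c == '\0' || c == '\n';
}

/* Skip whitespace and statement terminators starting at P.  Set
   *SAW_END if at least one terminator was crossed.  The explicit NUL
   test keeps the loop from running past the end of the buffer, since
   the end-of-statement predicate alone would accept NUL.  */
const char *
demo_skip_to_text (const char *p, bool *saw_end)
{
  assert (p != NULL && saw_end != NULL);

  *saw_end = false;
  while (*p != '\0'
         && (demo_is_whitespace (*p) || demo_is_end_of_stmt (*p)))
    {
      if (demo_is_end_of_stmt (*p))
        *saw_end = true;
      p++;
    }
  return p;
}
```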
--- a/gas/config/kvx-parse.c
+++ b/gas/config/kvx-parse.c
@@ -597,12 +597,12 @@ read_token (struct token_s *tok)
{
if ('0' <= str[*begin - i] && str[*begin - i] <= '9')
last_imm_p = 1;
- else if (str[*begin - i] != ' ' && str[*begin - i] != '\t')
+ else if (!is_whitespace (str[*begin - i]))
break;
}
/* Eat up all leading spaces. */
- while (str[*begin] && (str[*begin] == ' ' || str[*begin] == '\n'))
+ while (str[*begin] && (is_whitespace (str[*begin]) || str[*begin] == '\n'))
*begin += 1;
*end = *begin;
@@ -624,7 +624,9 @@ read_token (struct token_s *tok)
}
if (str[*begin] == '.'
- && (!(*begin > 0 && (str[*begin - 1] == ' ' || is_delim(str[*begin - 1])))
+ && (!(*begin > 0
+ && (is_whitespace (str[*begin - 1])
+ || is_delim (str[*begin - 1])))
|| last_imm_p))
modifier_p = 1;
@@ -633,7 +635,8 @@ read_token (struct token_s *tok)
*end += 1;
/* Stop when reaching the start of the new token. */
- while (!(!str[*end] || is_delim (str[*end]) || str[*end] == ' ' || (modifier_p && str[*end] == '.')))
+ while (!(!str[*end] || is_delim (str[*end]) || is_whitespace (str[*end])
+ || (modifier_p && str[*end] == '.')))
*end += 1;
}
--- a/gas/config/tc-kvx.c
+++ b/gas/config/tc-kvx.c
@@ -1279,7 +1279,7 @@ md_assemble (char *line)
if (get_byte_counter (now_seg) & 3)
as_fatal ("code segment not word aligned in md_assemble");
- while (line_cursor && line_cursor[0] && (line_cursor[0] == ' '))
+ while (is_whitespace (line_cursor[0]))
line_cursor++;
/* ;; was converted to "be" by line hook */
@@ -2125,7 +2125,7 @@ kvx_md_start_line_hook (void)
{
char *t;
- for (t = input_line_pointer; t && t[0] == ' '; t++);
+ for (t = input_line_pointer; is_whitespace (t[0]); t++);
/* Detect illegal syntax patterns:
* - two bundle ends on the same line: ;; ;;
@@ -2144,9 +2144,9 @@ kvx_md_start_line_hook (void)
while (tmp_t && tmp_t[0])
{
while (tmp_t && tmp_t[0] &&
- ((tmp_t[0] == ' ') || (tmp_t[0] == '\n')))
+ (is_whitespace (tmp_t[0]) || is_end_of_stmt (tmp_t[0])))
{
- if (tmp_t[0] == '\n')
+ if (is_end_of_stmt (tmp_t[0]))
newline_seen = true;
tmp_t++;
}
* Re: [PATCH v2 22/65] kvx: use is_whitespace()
2025-01-27 16:21 ` [PATCH v2 22/65] kvx: " Jan Beulich
@ 2025-01-31 12:34 ` Paul Iannetta
0 siblings, 0 replies; 106+ messages in thread
From: Paul Iannetta @ 2025-01-31 12:34 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, Jan 27, 2025 at 05:21:48PM +0100, Jan Beulich wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input). Also convert open-coded checks where tabs were
> already included. At the same time use is_end_of_stmt() instead of open-
> coded checks in adjacent code.
> ---
> read_token()'s check for \n looks odd to me; I'm hence leaving it alone.
> ---
> v2: New.
>
> --- a/gas/config/kvx-parse.c
> +++ b/gas/config/kvx-parse.c
> @@ -597,12 +597,12 @@ read_token (struct token_s *tok)
> {
> if ('0' <= str[*begin - i] && str[*begin - i] <= '9')
> last_imm_p = 1;
> - else if (str[*begin - i] != ' ' && str[*begin - i] != '\t')
> + else if (!is_whitespace (str[*begin - i]))
> break;
> }
>
> /* Eat up all leading spaces. */
> - while (str[*begin] && (str[*begin] == ' ' || str[*begin] == '\n'))
> + while (str[*begin] && (is_whitespace (str[*begin]) || str[*begin] == '\n'))
> *begin += 1;
>
I think the check str[*begin] == '\n' can be safely dropped. This
should never happen. Thank you for pointing it out.
> *end = *begin;
> @@ -624,7 +624,9 @@ read_token (struct token_s *tok)
> }
>
> if (str[*begin] == '.'
> - && (!(*begin > 0 && (str[*begin - 1] == ' ' || is_delim(str[*begin - 1])))
> + && (!(*begin > 0
> + && (is_whitespace (str[*begin - 1])
> + || is_delim (str[*begin - 1])))
> || last_imm_p))
> modifier_p = 1;
>
> @@ -633,7 +635,8 @@ read_token (struct token_s *tok)
> *end += 1;
>
> /* Stop when reaching the start of the new token. */
> - while (!(!str[*end] || is_delim (str[*end]) || str[*end] == ' ' || (modifier_p && str[*end] == '.')))
> + while (!(!str[*end] || is_delim (str[*end]) || is_whitespace (str[*end])
> + || (modifier_p && str[*end] == '.')))
> *end += 1;
>
> }
> --- a/gas/config/tc-kvx.c
> +++ b/gas/config/tc-kvx.c
> @@ -1279,7 +1279,7 @@ md_assemble (char *line)
> if (get_byte_counter (now_seg) & 3)
> as_fatal ("code segment not word aligned in md_assemble");
>
> - while (line_cursor && line_cursor[0] && (line_cursor[0] == ' '))
> + while (is_whitespace (line_cursor[0]))
> line_cursor++;
>
> /* ;; was converted to "be" by line hook */
> @@ -2125,7 +2125,7 @@ kvx_md_start_line_hook (void)
> {
> char *t;
>
> - for (t = input_line_pointer; t && t[0] == ' '; t++);
> + for (t = input_line_pointer; is_whitespace (t[0]); t++);
>
> /* Detect illegal syntax patterns:
> * - two bundle ends on the same line: ;; ;;
> @@ -2144,9 +2144,9 @@ kvx_md_start_line_hook (void)
> while (tmp_t && tmp_t[0])
> {
> while (tmp_t && tmp_t[0] &&
> - ((tmp_t[0] == ' ') || (tmp_t[0] == '\n')))
> + (is_whitespace (tmp_t[0]) || is_end_of_stmt (tmp_t[0])))
> {
> - if (tmp_t[0] == '\n')
> + if (is_end_of_stmt (tmp_t[0]))
> newline_seen = true;
> tmp_t++;
> }
>
Thank you for doing this. I've applied your patch and tested it, and
it looks fine to me.
Paul
* [PATCH v2 23/65] LoongArch: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (21 preceding siblings ...)
2025-01-27 16:21 ` [PATCH v2 22/65] kvx: " Jan Beulich
@ 2025-01-27 16:22 ` Jan Beulich
2025-01-27 16:23 ` [PATCH v2 24/65] m32c: " Jan Beulich
` (43 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:22 UTC (permalink / raw)
To: Binutils; +Cc: liuzhensong, Chenghua Xu
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-loongarch.c
+++ b/gas/config/tc-loongarch.c
@@ -1424,9 +1424,9 @@ loongarch_assemble_INSNs (char *str, uns
the_one.name = str;
the_one.expand_from_macro = expand_from_macro;
- for (; *str && *str != ' '; str++)
+ for (; *str && !is_whitespace (*str); str++)
;
- if (*str == ' ')
+ if (is_whitespace (*str))
*str++ = '\0';
loongarch_split_args_by_comma (str, the_one.arg_strs);
* [PATCH v2 24/65] m32c: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (22 preceding siblings ...)
2025-01-27 16:22 ` [PATCH v2 23/65] LoongArch: " Jan Beulich
@ 2025-01-27 16:23 ` Jan Beulich
2025-01-27 16:23 ` [PATCH v2 25/65] m32r: " Jan Beulich
` (42 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:23 UTC (permalink / raw)
To: Binutils
Convert open-coded checks as well as the sole ISBLANK() use throughout
the gas/ tree.
---
v2: New.
--- a/gas/config/tc-m32c.c
+++ b/gas/config/tc-m32c.c
@@ -223,7 +223,7 @@ m32c_indirect_operand (char *str)
operand = 2;
/* [abs] where abs is not a0 or a1 */
if (s[1] == '[' && ! (s[2] == 'a' && (s[3] == '0' || s[3] == '1'))
- && (ISBLANK (s[0]) || s[0] == ','))
+ && (is_whitespace (s[0]) || s[0] == ','))
indirection[operand] = absolute;
if (s[0] == ']' && s[1] == ']')
indirection[operand] = relative;
@@ -1241,19 +1241,19 @@ m32c_is_colon_insn (char *start ATTRIBUT
++i_l_p;
/* Check to see if the text following the colon is 'G' */
- if (TOLOWER (i_l_p[1]) == 'g' && (i_l_p[2] == ' ' || i_l_p[2] == '\t'))
+ if (TOLOWER (i_l_p[1]) == 'g' && is_whitespace (i_l_p[2]))
return restore_colon (i_l_p + 2, nul_char);
/* Check to see if the text following the colon is 'Q' */
- if (TOLOWER (i_l_p[1]) == 'q' && (i_l_p[2] == ' ' || i_l_p[2] == '\t'))
+ if (TOLOWER (i_l_p[1]) == 'q' && is_whitespace (i_l_p[2]))
return restore_colon (i_l_p + 2, nul_char);
/* Check to see if the text following the colon is 'S' */
- if (TOLOWER (i_l_p[1]) == 's' && (i_l_p[2] == ' ' || i_l_p[2] == '\t'))
+ if (TOLOWER (i_l_p[1]) == 's' && is_whitespace (i_l_p[2]))
return restore_colon (i_l_p + 2, nul_char);
/* Check to see if the text following the colon is 'Z' */
- if (TOLOWER (i_l_p[1]) == 'z' && (i_l_p[2] == ' ' || i_l_p[2] == '\t'))
+ if (TOLOWER (i_l_p[1]) == 'z' && is_whitespace (i_l_p[2]))
return restore_colon (i_l_p + 2, nul_char);
return 0;
* [PATCH v2 25/65] m32r: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (23 preceding siblings ...)
2025-01-27 16:23 ` [PATCH v2 24/65] m32c: " Jan Beulich
@ 2025-01-27 16:23 ` Jan Beulich
2025-01-27 16:24 ` [PATCH v2 26/65] M68HC1x: " Jan Beulich
` (41 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:23 UTC (permalink / raw)
To: Binutils; +Cc: Doug Evans
Convert a lonely ISSPACE().
---
v2: New.
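The loop touched here uses a post-increment inside the condition, which is why the code backs up by one afterwards. A minimal sketch of that idiom, outside of gas and with a hypothetical demo_is_whitespace() (blank and tab only, an assumption) in place of the real predicate:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gas's is_whitespace(): blank and tab only.  */
static bool
demo_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Return a pointer to the first non-whitespace character of S.  The
   post-increment also steps past the character that ended the loop,
   so the pointer is moved back by one before returning.  */
const char *
demo_skip_ws_postinc (const char *s)
{
  assert (s != NULL);

  while (demo_is_whitespace (*s++))
    continue;
  return --s;
}
```

The idiom is safe on an empty string: the pointer briefly sits one past the terminating NUL, which is still a valid pointer value, and is then moved back onto it.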
--- a/gas/config/tc-m32r.c
+++ b/gas/config/tc-m32r.c
@@ -989,7 +989,7 @@ assemble_two_insns (char *str1, char *st
{
char *s2 = str1;
- while (ISSPACE (*s2++))
+ while (is_whitespace (*s2++))
continue;
--s2;
* [PATCH v2 26/65] M68HC1x: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (24 preceding siblings ...)
2025-01-27 16:23 ` [PATCH v2 25/65] m32r: " Jan Beulich
@ 2025-01-27 16:24 ` Jan Beulich
2025-01-27 16:25 ` [PATCH v2 27/65] M68k: " Jan Beulich
` (40 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:24 UTC (permalink / raw)
To: Binutils; +Cc: Stephane Carrez, Sean Keys
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks where tabs were
already included. At the same time use is_end_of_stmt() instead of an
open-coded check in adjacent code.
---
Some further checks for \n look odd to me; I'm hence leaving them alone.
---
v2: New.
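To illustrate why the explicit \n and \r tests are left in place next to is_whitespace() in the "trailing garbage" checks: under the assumption that the whitespace predicate covers only blank and tab, line terminators still need to be skipped separately. The sketch below is not taken from tc-m68hc11.c; demo_is_whitespace() and demo_has_trailing_garbage() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gas's is_whitespace(): blank and tab only.  */
static bool
demo_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Return true if anything other than whitespace and line terminators
   remains at P.  Newline and carriage return are tested explicitly
   because the whitespace stand-in does not cover them.  */
bool
demo_has_trailing_garbage (const char *p)
{
  assert (p != NULL);

  while (demo_is_whitespace (*p) || *p == '\n' || *p == '\r')
    p++;
  return *p != '\0';
}
```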
--- a/gas/config/tc-m68hc11.c
+++ b/gas/config/tc-m68hc11.c
@@ -1082,7 +1082,7 @@ reg_name_search (char *name)
static char *
skip_whites (char *p)
{
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
return p;
@@ -1301,7 +1301,7 @@ get_operand (operand *oper, int which, l
char c = 0;
p = skip_whites (p);
- while (*p && *p != ' ' && *p != '\t')
+ while (*p && !is_whitespace (*p))
p++;
if (*p)
@@ -1461,7 +1461,7 @@ get_operand (operand *oper, int which, l
mode = M6811_OP_IND16 | M6811_OP_JUMP_REL;
p = input_line_pointer;
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
input_line_pointer = p;
oper->mode = mode;
@@ -2823,13 +2823,13 @@ md_assemble (char *str)
int alias_id = -1;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the opcode end and get the opcode in 'name'. The opcode is forced
lower case (the opcode table only has lower case op-codes). */
for (op_start = op_end = (unsigned char *) str;
- *op_end && !is_end_of_line[*op_end] && *op_end != ' ';
+ !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
{
name[nlen] = TOLOWER (op_start[nlen]);
@@ -3445,7 +3445,7 @@ md_assemble (char *str)
{
char *p = input_line_pointer;
- while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
+ while (is_whitespace (*p) || *p == '\n' || *p == '\r')
p++;
if (*p != '\n' && *p)
@@ -3491,15 +3491,15 @@ md_assemble (char *str)
to Motorola assembler specs. */
if (opc == NULL && flag_mri)
{
- if (*op_end == ' ' || *op_end == '\t')
+ if (is_whitespace (*op_end))
{
- while (*op_end == ' ' || *op_end == '\t')
+ while (is_whitespace (*op_end))
op_end++;
if (nlen < 19
&& (*op_end &&
(is_end_of_line[op_end[1]]
- || op_end[1] == ' ' || op_end[1] == '\t'
+ || is_whitespace (op_end[1])
|| !ISALNUM (op_end[1])))
&& (*op_end == 'a' || *op_end == 'b'
|| *op_end == 'A' || *op_end == 'B'
@@ -3548,7 +3548,7 @@ md_assemble (char *str)
{
char *p = input_line_pointer;
- while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
+ while (is_whitespace (*p) || *p == '\n' || *p == '\r')
p++;
if (*p != '\n' && *p)
* [PATCH v2 27/65] M68k: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (25 preceding siblings ...)
2025-01-27 16:24 ` [PATCH v2 26/65] M68HC1x: " Jan Beulich
@ 2025-01-27 16:25 ` Jan Beulich
2025-01-27 16:25 ` [PATCH v2 28/65] M*Core: " Jan Beulich
` (39 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:25 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks where tabs were
already included. At the same time use is_end_of_stmt() instead of open-
coded checks in adjacent code.
---
v2: New.
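Several hunks below share one shape: a "stop" pointer is moved back by one if the character just before it is whitespace, guarded so it never moves in front of the matching "start" pointer. A minimal sketch of that shape, with hypothetical names and a blank-and-tab stand-in for is_whitespace() (an assumption):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gas's is_whitespace(): blank and tab only.  */
static bool
demo_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Given the half-open range [START, STOP), return STOP moved back by
   one if the range ends in a whitespace character.  The comparison
   against START prevents reading before the beginning of the range.  */
const char *
demo_trim_one_trailing (const char *start, const char *stop)
{
  assert (start != NULL && stop != NULL && stop >= start);

  if (stop > start && demo_is_whitespace (stop[-1]))
    --stop;
  return stop;
}
```

Note that, like the code being patched, this drops at most one trailing character; it is not a full right-trim.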
--- a/gas/config/m68k-parse.y
+++ b/gas/config/m68k-parse.y
@@ -755,7 +755,7 @@ yylex (void)
int c = 0;
int tail = 0;
- if (*str == ' ')
+ if (is_whitespace (*str))
++str;
if (*str == '\0')
--- a/gas/config/tc-m68k.c
+++ b/gas/config/tc-m68k.c
@@ -1324,7 +1324,7 @@ m68k_ip (char *instring)
LITTLENUM_TYPE *wordp;
unsigned long ok_arch = 0;
- if (*instring == ' ')
+ if (is_whitespace (*instring))
instring++; /* Skip leading whitespace. */
/* Scan up to end of operation-code, which MUST end in end-of-string
@@ -1332,7 +1332,7 @@ m68k_ip (char *instring)
pdot = 0;
for (p = instring; *p != '\0'; p++)
{
- if (*p == ' ')
+ if (is_whitespace (*p))
break;
if (*p == '.')
pdot = p;
@@ -1375,7 +1375,7 @@ m68k_ip (char *instring)
}
/* Found a legitimate opcode, start matching operands. */
- while (*p == ' ')
+ while (is_whitespace (*p))
++p;
if (opcode->m_operands == 0)
@@ -2358,7 +2358,7 @@ m68k_ip (char *instring)
/* We ran out of space, so replace the end of the list
with ellipsis. */
buf -= 4;
- while (*buf != ' ')
+ while (!is_whitespace (*buf))
buf--;
strcpy (buf, " ...");
}
@@ -4249,7 +4249,7 @@ md_assemble (char *str)
for (s = str; *s != '\0'; s++)
{
- if ((*s == ' ' || *s == '\t') && ! inquote)
+ if (is_whitespace (*s) && ! inquote)
{
if (infield)
{
@@ -6043,7 +6043,7 @@ mri_assemble (char *str)
char *s;
/* md_assemble expects the opcode to be in lower case. */
- for (s = str; *s != ' ' && *s != '\0'; s++)
+ for (s = str; !is_whitespace (*s) && !is_end_of_stmt (*s); s++)
*s = TOLOWER (*s);
md_assemble (str);
@@ -6168,7 +6168,7 @@ parse_mri_control_operand (int *pcc, cha
*leftstart = input_line_pointer;
*leftstop = s;
if (*leftstop > *leftstart
- && ((*leftstop)[-1] == ' ' || (*leftstop)[-1] == '\t'))
+ && is_whitespace ((*leftstop)[-1]))
--*leftstop;
input_line_pointer = s;
@@ -6182,8 +6182,7 @@ parse_mri_control_operand (int *pcc, cha
if d0 <eq> #FOOAND and d1 <ne> #BAROR then
^^^ ^^ */
if ((s == input_line_pointer
- || *(s-1) == ' '
- || *(s-1) == '\t')
+ || is_whitespace (*(s-1)))
&& ((strncasecmp (s, "AND", 3) == 0
&& (s[3] == '.' || ! is_part_of_name (s[3])))
|| (strncasecmp (s, "OR", 2) == 0
@@ -6194,7 +6193,7 @@ parse_mri_control_operand (int *pcc, cha
*rightstart = input_line_pointer;
*rightstop = s;
if (*rightstop > *rightstart
- && ((*rightstop)[-1] == ' ' || (*rightstop)[-1] == '\t'))
+ && is_whitespace ((*rightstop)[-1]))
--*rightstop;
input_line_pointer = s;
@@ -6512,11 +6511,10 @@ s_mri_if (int qual)
|| (flag_mri
&& *s == '*'
&& (s == input_line_pointer
- || *(s-1) == ' '
- || *(s-1) == '\t'))))
+ || is_whitespace (*(s-1))))))
++s;
--s;
- while (s > input_line_pointer && (*s == ' ' || *s == '\t'))
+ while (s > input_line_pointer && is_whitespace (*s))
--s;
if (s - input_line_pointer > 1
@@ -6778,7 +6776,7 @@ s_mri_for (int qual)
varstop = input_line_pointer;
if (varstop > varstart
- && (varstop[-1] == ' ' || varstop[-1] == '\t'))
+ && is_whitespace (varstop[-1]))
--varstop;
++input_line_pointer;
@@ -6814,7 +6812,7 @@ s_mri_for (int qual)
return;
}
if (initstop > initstart
- && (initstop[-1] == ' ' || initstop[-1] == '\t'))
+ && is_whitespace (initstop[-1]))
--initstop;
SKIP_WHITESPACE ();
@@ -6850,7 +6848,7 @@ s_mri_for (int qual)
return;
}
if (endstop > endstart
- && (endstop[-1] == ' ' || endstop[-1] == '\t'))
+ && is_whitespace (endstop[-1]))
--endstop;
if (! by)
@@ -6884,7 +6882,7 @@ s_mri_for (int qual)
return;
}
if (bystop > bystart
- && (bystop[-1] == ' ' || bystop[-1] == '\t'))
+ && is_whitespace (bystop[-1]))
--bystop;
}
@@ -7081,11 +7079,10 @@ s_mri_while (int qual)
|| (flag_mri
&& *s == '*'
&& (s == input_line_pointer
- || *(s-1) == ' '
- || *(s-1) == '\t'))))
+ || is_whitespace (*(s-1))))))
s++;
--s;
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
--s;
if (s - input_line_pointer > 1
&& s[-1] == '.')
@@ -7167,7 +7164,8 @@ s_m68k_cpu (int ignored ATTRIBUTE_UNUSED
}
name = input_line_pointer;
- while (*input_line_pointer && !ISSPACE(*input_line_pointer))
+ while (!is_end_of_stmt (*input_line_pointer)
+ && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
*input_line_pointer = 0;
@@ -7195,8 +7193,9 @@ s_m68k_arch (int ignored ATTRIBUTE_UNUSE
}
name = input_line_pointer;
- while (*input_line_pointer && *input_line_pointer != ','
- && !ISSPACE (*input_line_pointer))
+ while (!is_end_of_stmt (*input_line_pointer)
+ && *input_line_pointer != ','
+ && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
*input_line_pointer = 0;
@@ -7207,11 +7206,13 @@ s_m68k_arch (int ignored ATTRIBUTE_UNUSE
do
{
*input_line_pointer++ = saved_char;
- if (!*input_line_pointer || ISSPACE (*input_line_pointer))
+ if (is_end_of_stmt (*input_line_pointer)
+ || is_whitespace (*input_line_pointer))
break;
name = input_line_pointer;
- while (*input_line_pointer && *input_line_pointer != ','
- && !ISSPACE (*input_line_pointer))
+ while (!is_end_of_stmt (*input_line_pointer)
+ && *input_line_pointer != ','
+ && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
*input_line_pointer = 0;
@@ -7665,9 +7666,9 @@ main (void)
int
is_label (char *str)
{
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
- while (*str && *str != ' ')
+ while (!is_end_of_stmt (*str) && !is_whitespace (*str))
str++;
if (str[-1] == ':' || str[1] == '=')
return 1;
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 28/65] M*Core: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (26 preceding siblings ...)
2025-01-27 16:25 ` [PATCH v2 27/65] M68k: " Jan Beulich
@ 2025-01-27 16:25 ` Jan Beulich
2025-01-27 16:26 ` [PATCH v2 29/65] metag: " Jan Beulich
` (38 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:25 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert ISSPACE() uses. At the same time use
is_end_of_stmt() instead of an open-coded check in adjacent code.
---
v2: New.
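For illustration only, here is a minimal sketch of why the opcode-end scan
matters once tabs reach md_assemble() unchanged (as they can with -f). The
sketch_* helpers are hypothetical stand-ins, not the gas macros; the real
is_whitespace() and is_end_of_stmt() are table-driven, and the set of
statement-ending characters assumed here (nul, newline, ';') is a guess.

```c
#include <stddef.h>

/* Hypothetical stand-ins for gas's is_whitespace() and is_end_of_stmt();
   the real ones are table-driven and target-configurable.  */
static int
sketch_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

static int
sketch_is_end_of_stmt (char c)
{
  return c == '\0' || c == '\n' || c == ';';
}

/* Length of the mnemonic at the start of STR, stopping at any
   whitespace or statement end (the converted form of the check).  */
static size_t
sketch_mnemonic_len (const char *str)
{
  size_t n = 0;

  while (!sketch_is_end_of_stmt (str[n]) && !sketch_is_whitespace (str[n]))
    n++;
  return n;
}

/* The old, open-coded form: only a blank ends the mnemonic, so a tab
   is swallowed into it.  */
static size_t
sketch_mnemonic_len_old (const char *str)
{
  size_t n = 0;

  while (str[n] != '\0' && str[n] != '\n' && str[n] != ' ')
    n++;
  return n;
}
```

With a tab-separated input such as "addu\tr1,r2", the old form runs through
the tab and the operands, while the converted form stops after the mnemonic.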
--- a/gas/config/tc-mcore.c
+++ b/gas/config/tc-mcore.c
@@ -358,11 +358,11 @@ mcore_s_section (int ignore)
pool. */
char * ilp = input_line_pointer;
- while (*ilp != 0 && ISSPACE (*ilp))
+ while (is_whitespace (*ilp))
++ ilp;
if (startswith (ilp, ".line")
- && (ISSPACE (ilp[5]) || *ilp == '\n' || *ilp == '\r'))
+ && (is_whitespace (ilp[5]) || *ilp == '\n' || *ilp == '\r'))
;
else
dump_literals (0);
@@ -493,7 +493,7 @@ static char *
parse_reg (char * s, unsigned * reg)
{
/* Strip leading whitespace. */
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++ s;
if (TOLOWER (s[0]) == 'r')
@@ -551,7 +551,7 @@ parse_creg (char * s, unsigned * reg)
int i;
/* Strip leading whitespace. */
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++s;
if ((TOLOWER (s[0]) == 'c' && TOLOWER (s[1]) == 'r'))
@@ -650,7 +650,7 @@ parse_exp (char * s, expressionS * e)
char * new_pointer;
/* Skip whitespace. */
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++ s;
save = input_line_pointer;
@@ -797,14 +797,14 @@ parse_mem (char * s,
{
* off = 0;
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++ s;
if (* s == '(')
{
s = parse_reg (s + 1, reg);
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++ s;
if (* s == ',')
@@ -830,7 +830,7 @@ parse_mem (char * s,
}
}
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++ s;
if (* s == ')')
@@ -862,12 +862,12 @@ md_assemble (char * str)
char name[21];
/* Drop leading whitespace. */
- while (ISSPACE (* str))
+ while (is_whitespace (* str))
str ++;
/* Find the op code end. */
for (op_start = op_end = str;
- nlen < 20 && !is_end_of_line [(unsigned char) *op_end] && *op_end != ' ';
+ nlen < 20 && !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
{
name[nlen] = op_start[nlen];
@@ -962,7 +962,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (*op_end == ',')
@@ -986,7 +986,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1005,7 +1005,7 @@ md_assemble (char * str)
op_end = parse_reg (op_end + 1, & reg);
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',') /* xtrb- r1,rx. */
@@ -1025,7 +1025,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1045,7 +1045,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1064,7 +1064,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1084,7 +1084,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1113,7 +1113,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1140,7 +1140,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1179,7 +1179,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1212,7 +1212,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1231,7 +1231,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1250,7 +1250,7 @@ md_assemble (char * str)
inst |= reg << 8;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1288,7 +1288,7 @@ md_assemble (char * str)
inst |= (reg << 8);
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1319,7 +1319,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == '-')
@@ -1330,7 +1330,7 @@ md_assemble (char * str)
as_bad (_("ending register must be r15"));
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
}
@@ -1339,7 +1339,7 @@ md_assemble (char * str)
op_end ++;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == '(')
@@ -1368,7 +1368,7 @@ md_assemble (char * str)
as_fatal (_("first register must be r4"));
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == '-')
@@ -1379,7 +1379,7 @@ md_assemble (char * str)
as_fatal (_("last register must be r7"));
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1387,7 +1387,7 @@ md_assemble (char * str)
op_end ++;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == '(')
@@ -1400,7 +1400,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ')')
@@ -1433,7 +1433,7 @@ md_assemble (char * str)
inst |= reg << 4;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1494,7 +1494,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1516,7 +1516,7 @@ md_assemble (char * str)
inst |= reg << 4;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1537,7 +1537,7 @@ md_assemble (char * str)
inst |= reg;
/* Skip whitespace. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
++ op_end;
if (* op_end == ',')
@@ -1589,7 +1589,7 @@ md_assemble (char * str)
}
/* Drop whitespace after all the operands have been parsed. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
op_end ++;
/* Give warning message if the insn has more operands than required. */
* [PATCH v2 29/65] metag: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (27 preceding siblings ...)
2025-01-27 16:25 ` [PATCH v2 28/65] M*Core: " Jan Beulich
@ 2025-01-27 16:26 ` Jan Beulich
2025-01-27 16:26 ` [PATCH v2 30/65] MicroBlaze: " Jan Beulich
` (37 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:26 UTC (permalink / raw)
To: Binutils; +Cc: Markos Chandras
Replace the custom is_whitespace_char().
---
I have no clue why there is a distinction between white space and space
characters here; leaving is_space_char() alone.
---
v2: New.
--- a/gas/config/tc-metag.c
+++ b/gas/config/tc-metag.c
@@ -41,7 +41,6 @@ static char mnemonic_chars[256];
#define is_register_char(x) (register_chars[(unsigned char) x])
#define is_mnemonic_char(x) (mnemonic_chars[(unsigned char) x])
-#define is_whitespace_char(x) (((x) == ' ') || ((x) == '\t'))
#define is_space_char(x) ((x) == ' ')
#define FPU_PREFIX_CHAR 'f'
@@ -221,7 +220,7 @@ skip_whitespace (const char *line)
{
const char *l = line;
- if (is_whitespace_char (*l))
+ if (is_whitespace (*l))
{
l++;
}
@@ -6052,7 +6051,7 @@ parse_prefix (const char *line, metag_in
/* Check this isn't a split condition beginning with L. */
l2 = parse_split_condition (l2, insn);
- if (l2 && is_whitespace_char (*l2))
+ if (l2 && is_whitespace (*l2))
{
l = l2;
}
@@ -6090,7 +6089,7 @@ parse_prefix (const char *line, metag_in
l++;
}
- if (! is_whitespace_char (*l))
+ if (! is_whitespace (*l))
{
l = parse_split_condition (l, insn);
@@ -6116,7 +6115,7 @@ parse_prefix (const char *line, metag_in
insn->dsp_width = DSP_WIDTH_SINGLE;
- while (!is_whitespace_char (*l))
+ while (!is_whitespace (*l))
{
/* We have to check for split condition codes first
because they are the longest strings to match,
* [PATCH v2 30/65] MicroBlaze: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (28 preceding siblings ...)
2025-01-27 16:26 ` [PATCH v2 29/65] metag: " Jan Beulich
@ 2025-01-27 16:26 ` Jan Beulich
2025-01-29 23:56 ` Michael Eager
2025-01-27 16:28 ` [PATCH v2 31/65] MIPS: " Jan Beulich
` (36 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:26 UTC (permalink / raw)
To: Binutils; +Cc: Michael Eager
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert ISSPACE() uses. At the same time use
is_end_of_stmt() instead of an open-coded check in adjacent code.
---
v2: New.
--- a/gas/config/tc-microblaze.c
+++ b/gas/config/tc-microblaze.c
@@ -400,7 +400,7 @@ parse_reg (char * s, unsigned * reg)
unsigned tmpreg = 0;
/* Strip leading whitespace. */
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++ s;
if (strncasecmp (s, "rpc", 3) == 0)
@@ -573,7 +573,7 @@ parse_exp (char *s, expressionS *e)
char *new_pointer;
/* Skip whitespace. */
- while (ISSPACE (* s))
+ while (is_whitespace (* s))
++ s;
save = input_line_pointer;
@@ -892,12 +892,12 @@ md_assemble (char * str)
char name[20];
/* Drop leading whitespace. */
- while (ISSPACE (* str))
+ while (is_whitespace (* str))
str ++;
/* Find the op code end. */
for (op_start = op_end = str;
- *op_end && !is_end_of_line[(unsigned char) *op_end] && *op_end != ' ';
+ !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
{
name[nlen] = op_start[nlen];
@@ -1808,7 +1808,7 @@ md_assemble (char * str)
}
/* Drop whitespace after all the operands have been parsed. */
- while (ISSPACE (* op_end))
+ while (is_whitespace (* op_end))
op_end ++;
/* Give warning message if the insn has more operands than required. */
* Re: [PATCH v2 30/65] MicroBlaze: use is_whitespace()
2025-01-27 16:26 ` [PATCH v2 30/65] MicroBlaze: " Jan Beulich
@ 2025-01-29 23:56 ` Michael Eager
0 siblings, 0 replies; 106+ messages in thread
From: Michael Eager @ 2025-01-29 23:56 UTC (permalink / raw)
To: Jan Beulich, Binutils
On 1/27/25 8:26 AM, Jan Beulich wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input). Also convert ISSPACE() uses. At the same time use
> is_end_of_stmt() instead of an open-coded check in adjacent code.
> ---
> v2: New.
Looks OK.
--
Michael Eager
* [PATCH v2 31/65] MIPS: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (29 preceding siblings ...)
2025-01-27 16:26 ` [PATCH v2 30/65] MicroBlaze: " Jan Beulich
@ 2025-01-27 16:28 ` Jan Beulich
2025-02-03 14:07 ` Maciej W. Rozycki
2025-01-27 16:28 ` [PATCH v2 32/65] MMIX: " Jan Beulich
` (35 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:28 UTC (permalink / raw)
To: Binutils; +Cc: Maciej W. Rozycki, Chenghua Xu
... for consistency of recognition of what is deemed whitespace.
At the same time use is_end_of_stmt() instead of an open-coded nul char
check, and check for statement end in the first place in
parse_relocation().
---
v2: New.
--- a/gas/config/tc-mips.c
+++ b/gas/config/tc-mips.c
@@ -45,7 +45,7 @@ typedef char static_assert2[sizeof (valu
#define streq(a, b) (strcmp (a, b) == 0)
#define SKIP_SPACE_TABS(S) \
- do { while (*(S) == ' ' || *(S) == '\t') ++(S); } while (0)
+ do { while (is_whitespace (*(S))) ++(S); } while (0)
/* Clean up namespace so we can include obj-elf.h too. */
static int mips_output_flavor (void);
@@ -14349,7 +14349,7 @@ mips_ip (char *str, struct mips_cl_insn
opcode_extra = 0;
/* We first try to match an instruction up to a space or to the end. */
- for (end = 0; str[end] != '\0' && !ISSPACE (str[end]); end++)
+ for (end = 0; !is_end_of_stmt (str[end]) && !is_whitespace (str[end]); end++)
continue;
first = mips_lookup_insn (hash, str, end, &opcode_extra);
@@ -14388,7 +14388,7 @@ mips16_ip (char *str, struct mips_cl_ins
struct mips_operand_token *tokens;
unsigned int l;
- for (s = str; *s != '\0' && *s != '.' && *s != ' '; ++s)
+ for (s = str; *s != '\0' && *s != '.' && !is_whitespace (*s); ++s)
;
end = s;
c = *end;
@@ -14399,8 +14399,9 @@ mips16_ip (char *str, struct mips_cl_ins
case '\0':
break;
- case ' ':
- s++;
+ default:
+ if (is_whitespace (*s))
+ s++;
break;
case '.':
@@ -14417,7 +14418,7 @@ mips16_ip (char *str, struct mips_cl_ins
}
if (*s == '\0')
break;
- else if (*s++ == ' ')
+ else if (is_whitespace (*s++))
break;
set_insn_error (0, _("unrecognized opcode"));
return;
@@ -14641,7 +14642,9 @@ parse_relocation (char **str, bfd_reloc_
{
int len = strlen (percent_op[i].str);
- if (!ISSPACE ((*str)[len]) && (*str)[len] != '(')
+ if (!is_end_of_stmt ((*str)[len])
+ && !is_whitespace ((*str)[len])
+ && (*str)[len] != '(')
continue;
*str += strlen (percent_op[i].str);
@@ -14691,7 +14694,7 @@ my_getSmallExpression (expressionS *ep,
/* Skip over whitespace and brackets, keeping count of the number
of brackets. */
- while (*str == ' ' || *str == '\t' || *str == '(')
+ while (is_whitespace (*str) || *str == '(')
if (*str++ == '(')
str_depth++;
}
@@ -14703,7 +14706,7 @@ my_getSmallExpression (expressionS *ep,
str = expr_parse_end;
/* Match every open bracket. */
- while (crux_depth > 0 && (*str == ')' || *str == ' ' || *str == '\t'))
+ while (crux_depth > 0 && (*str == ')' || is_whitespace (*str)))
if (*str++ == ')')
crux_depth--;
* Re: [PATCH v2 31/65] MIPS: use is_whitespace()
2025-01-27 16:28 ` [PATCH v2 31/65] MIPS: " Jan Beulich
@ 2025-02-03 14:07 ` Maciej W. Rozycki
2025-02-03 15:52 ` Jan Beulich
0 siblings, 1 reply; 106+ messages in thread
From: Maciej W. Rozycki @ 2025-02-03 14:07 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils, Chenghua Xu
On Mon, 27 Jan 2025, Jan Beulich wrote:
> --- a/gas/config/tc-mips.c
> +++ b/gas/config/tc-mips.c
> @@ -14388,7 +14388,7 @@ mips16_ip (char *str, struct mips_cl_ins
> struct mips_operand_token *tokens;
> unsigned int l;
>
> - for (s = str; *s != '\0' && *s != '.' && *s != ' '; ++s)
> + for (s = str; *s != '\0' && *s != '.' && !is_whitespace (*s); ++s)
^^^^^^^^^^
Shouldn't this also be `!is_end_of_stmt (*s)'?
> @@ -14399,8 +14399,9 @@ mips16_ip (char *str, struct mips_cl_ins
> case '\0':
> break;
>
> - case ' ':
> - s++;
> + default:
> + if (is_whitespace (*s))
> + s++;
> break;
Why `is_whitespace (*s)' rather than `is_whitespace (c)'?
I think this only causes obfuscation to this already messed up statement.
Since there are only two cases here really ('\0' does nothing and is the
only remaining possibility here, guaranteed by the loop right above) can
you please rewrite this as:
if (c == '.')
{
...
}
else if (is_whitespace (c))
s++;
or suchlike?
> @@ -14417,7 +14418,7 @@ mips16_ip (char *str, struct mips_cl_ins
> }
> if (*s == '\0')
And `is_end_of_stmt (*s)' here presumably too?
Otherwise OK, I think.
Maciej
* Re: [PATCH v2 31/65] MIPS: use is_whitespace()
2025-02-03 14:07 ` Maciej W. Rozycki
@ 2025-02-03 15:52 ` Jan Beulich
2025-02-03 17:50 ` Maciej W. Rozycki
0 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-02-03 15:52 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Binutils, Chenghua Xu
On 03.02.2025 15:07, Maciej W. Rozycki wrote:
> On Mon, 27 Jan 2025, Jan Beulich wrote:
>
>> --- a/gas/config/tc-mips.c
>> +++ b/gas/config/tc-mips.c
>> @@ -14388,7 +14388,7 @@ mips16_ip (char *str, struct mips_cl_ins
>> struct mips_operand_token *tokens;
>> unsigned int l;
>>
>> - for (s = str; *s != '\0' && *s != '.' && *s != ' '; ++s)
>> + for (s = str; *s != '\0' && *s != '.' && !is_whitespace (*s); ++s)
> ^^^^^^^^^^
> Shouldn't this also be `!is_end_of_stmt (*s)'?
This is the kind of question you as the target maintainer really
will want to answer. All I could do is try to guess where
conversion might be in order.
>> @@ -14399,8 +14399,9 @@ mips16_ip (char *str, struct mips_cl_ins
>> case '\0':
>> break;
>>
>> - case ' ':
>> - s++;
>> + default:
>> + if (is_whitespace (*s))
>> + s++;
>> break;
>
> Why `is_whitespace (*s)' rather than `is_whitespace (c)'?
Right, I should have used c here.
> I think this only causes obfuscation to this already messed up statement.
> Since there are only two cases here really ('\0' does nothing and is the
> only remaining possibility here, guaranteed by the loop right above) can
> you please rewrite this as:
>
> if (c == '.')
> {
> ...
> }
> else if (is_whitespace (c))
> s++;
>
> or suchlike?
Possibly. On a similar question from Richard on aarch64 I indicated that
from other projects I'm working on I'm used to using switch() in such
cases, even if at a certain point there may be just a single case label.
This is to ease future addition of new further labels.
>> @@ -14417,7 +14418,7 @@ mips16_ip (char *str, struct mips_cl_ins
>> }
>> if (*s == '\0')
>
> And `is_end_of_stmt (*s)' here presumably too?
See above. It has been a lot of targets all doing things (often just
slightly) differently, so I can only guess that I may have got the
impression that somewhere up the callstack something nul-terminates the
string.
Jan
* Re: [PATCH v2 31/65] MIPS: use is_whitespace()
2025-02-03 15:52 ` Jan Beulich
@ 2025-02-03 17:50 ` Maciej W. Rozycki
2025-02-10 22:17 ` Maciej W. Rozycki
0 siblings, 1 reply; 106+ messages in thread
From: Maciej W. Rozycki @ 2025-02-03 17:50 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils, Chenghua Xu
On Mon, 3 Feb 2025, Jan Beulich wrote:
> >> --- a/gas/config/tc-mips.c
> >> +++ b/gas/config/tc-mips.c
> >> @@ -14388,7 +14388,7 @@ mips16_ip (char *str, struct mips_cl_ins
> >> struct mips_operand_token *tokens;
> >> unsigned int l;
> >>
> >> - for (s = str; *s != '\0' && *s != '.' && *s != ' '; ++s)
> >> + for (s = str; *s != '\0' && *s != '.' && !is_whitespace (*s); ++s)
> > ^^^^^^^^^^
> > Shouldn't this also be `!is_end_of_stmt (*s)'?
>
> This is the kind of question you as the target maintainer really
> will want to answer. All I could do is try to guess where
> conversion might be in order.
All I can say is that if `*s' is '\0', then we're at the end of an assembly
instruction mnemonic that has no operands following, so my understanding
is it is indeed the case for `!is_end_of_stmt (*s)'. However since you
made such changes elsewhere but chose not to make one on this occasion,
I've asked whether this has been intentional or just a missed case.
> > I think this only causes obfuscation to this already messed up statement.
> > Since there are only two cases here really ('\0' does nothing and is the
> > only remaining possibility here, guaranteed by the loop right above) can
> > you please rewrite this as:
> >
> > if (c == '.')
> > {
> > ...
> > }
> > else if (is_whitespace (c))
> > s++;
> >
> > or suchlike?
>
> Possibly. On a similar question from Richard on aarch64 I indicated that
> from other projects I'm working on I'm used to using switch() in such
> cases, even if at a certain point there may be just a single case label.
> This is to ease future addition of new further labels.
This is generic MIPS assembly language syntax, which is unlikely to ever
change, and then for the MIPS16 instruction set, which has been effectively
a dead end for decades now, even the MIPS16e2 extension ~10 years ago was
a huge surprise and a one-off effort due to a specific customer request,
so we can safely assume nothing else will ever happen again here. So I
think we need to optimise for code clarity rather than minimising highly
unlikely future changes. Yes, you need the backend maintainer's knowledge
to decide here.
For the record we're handling explicit instruction size override suffixes
on MIPS16 mnemonics here, so the three cases the current switch statement
handles are:
- '\0': end of mnemonic, instruction ends w/o operands, no size override,
- ' ': end of mnemonic, operands follow, no size override,
- '.': end of mnemonic, a size override suffix follows (then either the
instruction ends or operands follow).
There's simply no room for expansion here, and as I say the instruction
set is a dead end and therefore in the maintenance mode.
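For illustration, the three cases can be sketched roughly as below. This is
not the tc-mips.c code: sketch_is_whitespace() is a hypothetical stand-in
for is_whitespace(), and the enum and function names are made up for the
example.

```c
/* Hypothetical stand-in for gas's is_whitespace().  */
static int
sketch_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

enum sketch_kind
{
  SKETCH_NO_OPERANDS,	/* '\0': instruction ends, no size override.  */
  SKETCH_OPERANDS,	/* whitespace: operands follow, no size override.  */
  SKETCH_SIZE_SUFFIX	/* '.': a size override suffix follows.  */
};

/* Classify the character that ends the mnemonic at the start of STR.  */
static enum sketch_kind
sketch_classify (const char *str)
{
  const char *s;

  /* Same termination conditions as the scanning loop in the patch.  */
  for (s = str; *s != '\0' && *s != '.' && !sketch_is_whitespace (*s); ++s)
    ;

  /* Only three outcomes are possible after the loop above.  */
  if (*s == '.')
    return SKETCH_SIZE_SUFFIX;
  if (sketch_is_whitespace (*s))
    return SKETCH_OPERANDS;
  return SKETCH_NO_OPERANDS;
}
```

Since the loop leaves exactly these three possibilities, a plain if/else
chain covers them without needing a default case.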
NB this stuff is extensively covered and should therefore be safe to
apply cleanups to with little concern as to possible breakage, cf.:
$ grep '\.[et]\b' gas/testsuite/gas/mips/*.s
and I put significant effort into getting this stuff right, as it used to be
broken. See commit 7fd539200562 ("MIPS16: Switch to 32-bit opcode table
interpretation"), commit 3fb49709438e ("MIPS16/GAS: Fix forced size
suffixes with argumentless instructions"), and commit 25499ac7ee92
("MIPS16e2: Add MIPS16e2 ASE support") for the most relevant changes.
> >> @@ -14417,7 +14418,7 @@ mips16_ip (char *str, struct mips_cl_ins
> >> }
> >> if (*s == '\0')
> >
> > And `is_end_of_stmt (*s)' here presumably too?
>
> See above. It has been a lot of targets all doing things (often just
> slightly) differently, so I can only guess that I may have got the
> impression that somewhere up the callstack something nul-terminates the
> string.
It is an analogous situation once the override suffix has been swallowed.
Maciej
* Re: [PATCH v2 31/65] MIPS: use is_whitespace()
2025-02-03 17:50 ` Maciej W. Rozycki
@ 2025-02-10 22:17 ` Maciej W. Rozycki
0 siblings, 0 replies; 106+ messages in thread
From: Maciej W. Rozycki @ 2025-02-10 22:17 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils, Chenghua Xu
On Mon, 3 Feb 2025, Maciej W. Rozycki wrote:
> > > I think this only causes obfuscation to this already messed up statement.
> > > Since there are only two cases here really ('\0' does nothing and is the
> > > only remaining possibility here, guaranteed by the loop right above) can
> > > you please rewrite this as:
> > >
> > > if (c == '.')
> > > {
> > > ...
> > > }
> > > else if (is_whitespace (c))
> > > s++;
> > >
> > > or suchlike?
> >
> > Possibly. On a similar question from Richard on aarch64 I indicated that
> > from other projects I'm working on I'm used to using switch() in such
> > cases, even if at a certain point there may be just a single case label.
> > This is to ease future addition of new further labels.
>
> This is generic MIPS assembly language syntax, which is unlikely to ever
> change, and then for the MIPS16 instruction set, which has been effectively
> a dead end for decades now, even the MIPS16e2 extension ~10 years ago was
> a huge surprise and a one-off effort due to a specific customer request,
> so we can safely assume nothing else will ever happen again here. So I
> think we need to optimise for code clarity rather than minimising highly
> unlikely future changes. Yes, you need the backend maintainer's knowledge
> to decide here.
I have made and committed this cleanup myself now.
As an upside this has let me discover and deal with a regression from my
fix in this area made years ago that caused invalid instruction mnemonics
ending with a dot to be assembled successfully as if the dot wasn't there.
I've left any conversion to `is_end_of_stmt' to a future update. At this
point I've concluded that since it wasn't made with your original commit
it makes no sense to me to do it piecemeal, so it'll be best made through
tc-mips.c in one go.
Maciej
* [PATCH v2 32/65] MMIX: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (30 preceding siblings ...)
2025-01-27 16:28 ` [PATCH v2 31/65] MIPS: " Jan Beulich
@ 2025-01-27 16:28 ` Jan Beulich
2025-01-31 6:18 ` Hans-Peter Nilsson
` (2 more replies)
2025-01-27 16:29 ` [PATCH v2 33/65] mn10200: " Jan Beulich
` (34 subsequent siblings)
66 siblings, 3 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:28 UTC (permalink / raw)
To: Binutils; +Cc: Hans-Peter Nilsson
Convert open-coded checks as well as ISSPACE() uses. At the same time
use is_end_of_stmt() instead of open-coded checks; do the conversion
even when not adjacent to code being modified anyway to cover all cases
where the is_end_of_line[] index was wrongly cast from plain char (which
can be signed) to unsigned int.
---
v2: New.
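As a minimal sketch of the cast issue mentioned in the description: where
plain char is signed, converting a negative value directly to unsigned int
sign-extends it, so the result is far outside a 256-entry table, whereas
going through unsigned char keeps it in range. The function names below are
hypothetical and exist only for this example; signed char is used to model
a platform where plain char is signed.

```c
#include <stddef.h>

/* Index computed via a direct cast to unsigned int: a negative value
   wraps around to a very large number.  */
static size_t
sketch_index_via_unsigned_int (signed char c)
{
  return (unsigned int) c;
}

/* Index computed via unsigned char: always within 0..255.  */
static size_t
sketch_index_via_unsigned_char (signed char c)
{
  return (unsigned char) c;
}
```

For 7-bit ASCII input both forms agree; they only differ for characters
with the top bit set.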
--- a/gas/config/tc-mmix.c
+++ b/gas/config/tc-mmix.c
@@ -454,11 +454,11 @@ get_operands (int max_operands, char *s,
while (nextchar == ',')
{
/* Skip leading whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
/* Check to see if we have any operands left to parse */
- if (*p == 0 || *p == '\n' || *p == '\r')
+ if (is_end_of_stmt (*p))
{
break;
}
@@ -489,7 +489,7 @@ get_operands (int max_operands, char *s,
p = input_line_pointer;
/* Skip leading whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
nextchar = *p++;
}
@@ -545,7 +545,7 @@ get_putget_operands (struct mmix_opcode
int regno;
/* Skip leading whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
input_line_pointer = p;
@@ -565,7 +565,7 @@ get_putget_operands (struct mmix_opcode
p = input_line_pointer;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
if (*p == ',')
@@ -573,7 +573,7 @@ get_putget_operands (struct mmix_opcode
p++;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
sregp = p;
input_line_pointer = sregp;
@@ -594,7 +594,7 @@ get_putget_operands (struct mmix_opcode
p = input_line_pointer;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
if (*p == ',')
@@ -602,7 +602,7 @@ get_putget_operands (struct mmix_opcode
p++;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
input_line_pointer = p;
@@ -829,7 +829,7 @@ md_assemble (char *str)
++operands)
;
- if (ISSPACE (*operands))
+ if (is_whitespace (*operands))
{
modified_char = *operands;
*operands++ = '\0';
@@ -2924,7 +2924,8 @@ mmix_handle_mmixal (void)
/* If we're on a line with a label, check if it's a mmixal fb-label.
Save an indicator and skip the label; it must be set only after all
fb-labels of expressions are evaluated. */
- if (ISDIGIT (s[0]) && s[1] == 'H' && ISSPACE (s[2]))
+ if (ISDIGIT (s[0]) && s[1] == 'H'
+ && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
{
current_fb_label = s[0] - '0';
@@ -2935,12 +2936,12 @@ mmix_handle_mmixal (void)
s += 2;
input_line_pointer = s;
- while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])
+ while (is_whitespace (*s))
s++;
/* For errors emitted here, the book-keeping is off by one; the
caller is about to bump the counters. Adjust the error messages. */
- if (is_end_of_line[(unsigned int) *s])
+ if (is_end_of_stmt (*s))
{
unsigned int line;
const char * name = as_where (&line);
@@ -2973,7 +2974,7 @@ mmix_handle_mmixal (void)
return;
}
- if (*s == 0 || is_end_of_line[(unsigned int) *s])
+ if (is_end_of_stmt (*s))
/* We avoid handling empty lines here. */
return;
@@ -2986,9 +2987,7 @@ mmix_handle_mmixal (void)
/* Find the start of the instruction or pseudo following the label,
if there is one. */
- for (insn = s;
- *insn && ISSPACE (*insn) && ! is_end_of_line[(unsigned int) *insn];
- insn++)
+ for (insn = s; is_whitespace (*insn); insn++)
/* Empty */
;
@@ -3003,7 +3002,7 @@ mmix_handle_mmixal (void)
instruction or MMIXAL-pseudo (getting its alignment). Thus
is acts like a "normal" :-ended label. Ditto if it's
followed by a non-MMIXAL pseudo. */
- && !is_end_of_line[(unsigned int) *insn]
+ && !is_end_of_stmt (*insn)
&& *insn != '.')
{
/* For labels that don't end in ":", we save it so we can later give
@@ -3038,7 +3037,7 @@ mmix_handle_mmixal (void)
while (*s)
{
c = *s++;
- if (is_end_of_line[(unsigned int) c])
+ if (is_end_of_stmt (c))
break;
if (c == MAGIC_FB_BACKWARD_CHAR || c == MAGIC_FB_FORWARD_CHAR)
as_bad (_("invalid characters in input"));
@@ -3048,34 +3047,24 @@ mmix_handle_mmixal (void)
s = insn;
/* Skip the insn. */
- while (*s
- && ! ISSPACE (*s)
- && *s != ';'
- && ! is_end_of_line[(unsigned int) *s])
+ while (! is_whitespace (*s) && ! is_end_of_stmt (*s))
s++;
/* Skip the spaces after the insn. */
- while (*s
- && ISSPACE (*s)
- && *s != ';'
- && ! is_end_of_line[(unsigned int) *s])
+ while (is_whitespace (*s))
s++;
/* Skip the operands. While doing this, replace [0-9][BF] with
(MAGIC_FB_BACKWARD_CHAR|MAGIC_FB_FORWARD_CHAR)[0-9]. */
- while ((c = *s) != 0
- && ! ISSPACE (c)
- && c != ';'
- && ! is_end_of_line[(unsigned int) c])
+ while (! is_whitespace (c = *s) && ! is_end_of_stmt (c))
{
if (c == '"')
{
s++;
/* FIXME: Test-case for semi-colon in string. */
- while (*s
- && *s != '"'
- && (! is_end_of_line[(unsigned int) *s] || *s == ';'))
+ while (*s != '"'
+ && (! is_end_of_stmt (*s) || *s == ';'))
s++;
if (*s == '"')
@@ -3101,10 +3090,7 @@ mmix_handle_mmixal (void)
}
/* Skip any spaces after the operands. */
- while (*s
- && ISSPACE (*s)
- && *s != ';'
- && !is_end_of_line[(unsigned int) *s])
+ while (is_whitespace (*s))
s++;
/* If we're now looking at a semi-colon, then it's an end-of-line
@@ -3114,7 +3100,8 @@ mmix_handle_mmixal (void)
/* Make IS into an EQU by replacing it with "= ". Only match upper-case
though; let lower-case be a syntax error. */
s = insn;
- if (s[0] == 'I' && s[1] == 'S' && ISSPACE (s[2]))
+ if (s[0] == 'I' && s[1] == 'S'
+ && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
{
*s = '=';
s[1] = ' ';
@@ -3163,7 +3150,7 @@ mmix_handle_mmixal (void)
else if (s[0] == 'G'
&& s[1] == 'R'
&& startswith (s, "GREG")
- && (ISSPACE (s[4]) || is_end_of_line[(unsigned char) s[4]]))
+ && (is_whitespace (s[4]) || is_end_of_stmt (s[4])))
{
input_line_pointer = s + 4;
@@ -4258,7 +4245,7 @@ mmix_cons (int nbytes)
SKIP_WHITESPACE ();
- if (is_end_of_line[(unsigned int) *input_line_pointer])
+ if (is_end_of_stmt (*input_line_pointer))
{
/* Default to zero if the expression was absent. */
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-01-27 16:28 ` [PATCH v2 32/65] MMIX: " Jan Beulich
@ 2025-01-31 6:18 ` Hans-Peter Nilsson
2025-01-31 6:33 ` Hans-Peter Nilsson
2025-02-03 6:54 ` Hans-Peter Nilsson
2 siblings, 0 replies; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-01-31 6:18 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, 27 Jan 2025, Jan Beulich wrote:
> Convert open-coded checks as well as ISSPACE() uses. At the same time
> use is_end_of_stmt() instead of open-coded checks; do the conversion
> even when not adjacent to code being modified anyway to cover all cases
> where the is_end_of_line[] index was wrongly cast from plain char (which
> can be signed) to unsigned int.
Sorry, it looks like I'm going to have to finish looking at this
late in week-end.
brgds, H-P
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-01-27 16:28 ` [PATCH v2 32/65] MMIX: " Jan Beulich
2025-01-31 6:18 ` Hans-Peter Nilsson
@ 2025-01-31 6:33 ` Hans-Peter Nilsson
2025-01-31 6:54 ` Jan Beulich
2025-02-03 6:54 ` Hans-Peter Nilsson
2 siblings, 1 reply; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-01-31 6:33 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, 27 Jan 2025, Jan Beulich wrote:
> Convert open-coded checks as well as ISSPACE() uses. At the same time
> use is_end_of_stmt() instead of open-coded checks; do the conversion
> even when not adjacent to code being modified anyway to cover all cases
> where the is_end_of_line[] index was wrongly cast from plain char (which
> can be signed) to unsigned int.
Where is the "wrongly cast" to which you refer?
I see no more "wrongly casts" before the patch than after,
considering that is_whitespace is doing such a cast. But, I
could easily have missed a spot.
brgds, H-P
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-01-31 6:33 ` Hans-Peter Nilsson
@ 2025-01-31 6:54 ` Jan Beulich
2025-01-31 7:05 ` Hans-Peter Nilsson
0 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-31 6:54 UTC (permalink / raw)
To: Hans-Peter Nilsson; +Cc: Binutils
On 31.01.2025 07:33, Hans-Peter Nilsson wrote:
> On Mon, 27 Jan 2025, Jan Beulich wrote:
>> Convert open-coded checks as well as ISSPACE() uses. At the same time
>> use is_end_of_stmt() instead of open-coded checks; do the conversion
>> even when not adjacent to code being modified anyway to cover all cases
>> where the is_end_of_line[] index was wrongly cast from plain char (which
>> can be signed) to unsigned int.
>
> Where is the "wrongly cast" to which you refer?
>
> I see no more "wrongly casts" before the patch than after,
> considering that is_whitespace is doing such a cast. But, I
> could easily have missed a spot.
Take this example:
while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])
s++;
If plain char is signed, negative values will convert to huge positive
unsigned int ones. As opposed to when casting to unsigned char (as all
the IS*() and is*() helper macros effectively[1] do, not just
is_end_of_stmt()).
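To make the difference concrete, here is a minimal standalone sketch (not
gas code; the helper names are invented for illustration) of the index each
cast produces for a plain char holding a negative value:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: the index produced by the (unsigned int) cast.
   When plain char is signed, a negative value is sign-extended first
   and so turns into a huge index.  */
static size_t index_via_unsigned_int (char c)
{
  return (size_t) (unsigned int) c;
}

/* Hypothetical helper: the index produced by the (unsigned char) cast,
   which always stays within 0..UCHAR_MAX.  */
static size_t index_via_unsigned_char (char c)
{
  return (size_t) (unsigned char) c;
}

/* Small self-check that holds whether plain char is signed or not.  */
static void demo_checks (void)
{
  char c = (char) -1;

  assert (index_via_unsigned_char (c) == (size_t) UCHAR_MAX);
  if (CHAR_MIN < 0)
    /* Signed plain char: far outside a 256-entry table.  */
    assert (index_via_unsigned_int (c) > (size_t) UCHAR_MAX);
}
```

On targets where plain char is unsigned both casts give the same index,
which is presumably why the problem can go unnoticed for a long time.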
Jan
[1] safe-ctype.h actually ANDs by 0xff, apparently in an attempt to
deal with CHAR_BIT != 8 cases. That's not really correct though imo;
that case needs dealing with by growing the array dimensions, to
cover all values representable in a char.
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-01-31 6:54 ` Jan Beulich
@ 2025-01-31 7:05 ` Hans-Peter Nilsson
0 siblings, 0 replies; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-01-31 7:05 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Fri, 31 Jan 2025, Jan Beulich wrote:
> On 31.01.2025 07:33, Hans-Peter Nilsson wrote:
> > On Mon, 27 Jan 2025, Jan Beulich wrote:
> >> Convert open-coded checks as well as ISSPACE() uses. At the same time
> >> use is_end_of_stmt() instead of open-coded checks; do the conversion
> >> even when not adjacent to code being modified anyway to cover all cases
> >> where the is_end_of_line[] index was wrongly cast from plain char (which
> >> can be signed) to unsigned int.
> >
> > Where is the "wrongly cast" to which you refer?
> >
> > I see no more "wrongly casts" before the patch than after,
> > considering that is_whitespace is doing such a cast. But, I
> > could easily have missed a spot.
>
> Take this example:
>
> while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])
Oh boy, all those "unsigned int" should have been "unsigned
char". Doh! Thanks.
brgds, H-P
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-01-27 16:28 ` [PATCH v2 32/65] MMIX: " Jan Beulich
2025-01-31 6:18 ` Hans-Peter Nilsson
2025-01-31 6:33 ` Hans-Peter Nilsson
@ 2025-02-03 6:54 ` Hans-Peter Nilsson
2025-02-03 7:30 ` Jan Beulich
2 siblings, 1 reply; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-02-03 6:54 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, 27 Jan 2025, Jan Beulich wrote:
> Convert open-coded checks as well as ISSPACE() uses. At the same time
> use is_end_of_stmt() instead of open-coded checks; do the conversion
> even when not adjacent to code being modified anyway to cover all cases
> where the is_end_of_line[] index was wrongly cast from plain char (which
> can be signed) to unsigned int.
> ---
> v2: New.
>
> --- a/gas/config/tc-mmix.c
> +++ b/gas/config/tc-mmix.c
Sorry, there are a few hunks that are wrong, like missing a test
for ';' and other changes that may work but don't logically
make sense to me.
Most of the stuff like s/ISSPACE/is_whitespace/ is obviously ok,
as is of course fixing those horrible (unsigned int) typos.
I plan to post a replacement patch (5 hunks differ), but I need
to look at it again with fresh eyes so I'll pause it for a day
or two.
brgds, H-P
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-02-03 6:54 ` Hans-Peter Nilsson
@ 2025-02-03 7:30 ` Jan Beulich
2025-02-05 6:22 ` Hans-Peter Nilsson
0 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-02-03 7:30 UTC (permalink / raw)
To: Hans-Peter Nilsson; +Cc: Binutils
On 03.02.2025 07:54, Hans-Peter Nilsson wrote:
> On Mon, 27 Jan 2025, Jan Beulich wrote:
>> Convert open-coded checks as well as ISSPACE() uses. At the same time
>> use is_end_of_stmt() instead of open-coded checks; do the conversion
>> even when not adjacent to code being modified anyway to cover all cases
>> where the is_end_of_line[] index was wrongly cast from plain char (which
>> can be signed) to unsigned int.
>> ---
>> v2: New.
>>
>> --- a/gas/config/tc-mmix.c
>> +++ b/gas/config/tc-mmix.c
>
> Sorry, there are a few hunks that are wrong, like missing a test
> for ';' and other changes that may work but don't logically
> make sense to me.
Missing ';' checks? The patch replaces a few with is_end_of_stmt(), yes,
but that's intentional (to stop its open-coding).
> Most of the stuff like s/ISSPACE/is_whitespace/ is obviously ok,
> as is of course fixing those horrible (unsigned int) typos.
>
> I plan to post a replacement patch (5 hunks differ), but I need
> to look at it again with fresh eyes so I'll pause it for a day
> or two.
Well, okay, I'll leave the patch out then for the time being, as I was
meaning to commit the series today or tomorrow. Nevertheless I continue
to think the patch is correct.
Jan
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-02-03 7:30 ` Jan Beulich
@ 2025-02-05 6:22 ` Hans-Peter Nilsson
2025-02-05 7:14 ` Jan Beulich
0 siblings, 1 reply; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-02-05 6:22 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, 3 Feb 2025, Jan Beulich wrote:
> On 03.02.2025 07:54, Hans-Peter Nilsson wrote:
> > On Mon, 27 Jan 2025, Jan Beulich wrote:
> >> Convert open-coded checks as well as ISSPACE() uses. At the same time
> >> use is_end_of_stmt() instead of open-coded checks; do the conversion
> >> even when not adjacent to code being modified anyway to cover all cases
> >> where the is_end_of_line[] index was wrongly cast from plain char (which
> >> can be signed) to unsigned int.
> >> ---
> >> v2: New.
> >>
> >> --- a/gas/config/tc-mmix.c
> >> +++ b/gas/config/tc-mmix.c
> >
> > Sorry, there are a few hunks that are wrong, like missing a test
> > for ';' and other changes that may work but don't logically
> > make sense to me.
>
> Missing ';' checks? The patch replaces a few with is_end_of_stmt(), yes,
> but that's intentional (to stop its open-coding).
Oh right, I'd looked at the end_of_line[] definition and forgot
about MMIX adding ';' by means of line_separator_chars[]. In the
end, all code-changes were ok though with the caveat of now
supporting just tab and space.
Though bundling all that into saying "stop open-coding" was too
vague. You were simply doing too much and still too little in
the same patch, making review harder than necessary.
Too much: "replacing open-coding" and removing redundant
checking while changing those whitespace tests; the IS... calls
to is_... and fixing erroneous casts.
Unfortunately there isn't a 1:1 equivalence, so the input
language is now different. Now with is_whitespace it accepts
only ' ' and '\t' where previously ISSPACE covered more
whitespace characters in some places. Not sure there are any
assembly language programmers using e.g. \f instead of \t, but
some of their code now no longer assembles for some targets.
Example: replace the space after the first 1H in 1hjmp1b.s with
a form-feed and run the related tests. But, pragmatically, I'll
ok that change. Nobody in their right mind should use other
than \t and ' ' as assembly-code field separators. They had it
coming. 1/2 :-)
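A minimal standalone sketch of that difference, assuming the new predicate
accepts only blank and tab as described above, and using the standard
isspace() in the "C" locale as a stand-in for ISSPACE (both helper names
are invented for illustration):

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-in for the new predicate: blank and tab only.  */
static bool blank_or_tab (char c)
{
  return c == ' ' || c == '\t';
}

/* The kind of test the old code performed.  In the "C" locale isspace()
   accepts ' ', '\t', '\n', '\v', '\f' and '\r'.  */
static bool old_style_space (char c)
{
  return isspace ((unsigned char) c) != 0;
}

/* Small self-check: form-feed and newline are where the two disagree.  */
static void demo_checks (void)
{
  assert (old_style_space ('\t') && blank_or_tab ('\t'));
  assert (old_style_space ('\f') && !blank_or_tab ('\f'));
  assert (old_style_space ('\n') && !blank_or_tab ('\n'));
}
```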
Too little: not explicitly mentioning what "open-coding" you
intentionally changed (as opposed to accidental edits).
The commit message could have helped some of this, but it
said too little to tell the implication of the changes
and for me as a maintainer to feel ok with all the changes.
BTW, please use "git format-patch" for patches sent to the list.
It'd help local testing and may simplify (programming of)
pre-commit CI autotesters.
What follows is what I committed on your behalf. It's your
patch (so naming you as the author) but with a more verbose
commit message plus a comment above a construct that I had to
step through with gdb to see why there had to be an
is_whitespace() || is_end_of_stmt() or else a testcase would
fail: a case where ISSPACE covered '\n'.
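The shape of that construct, as a standalone sketch rather than the actual
gas code (the sketch_* predicates are simplified assumptions: blank/tab for
whitespace, and NUL, newline and ';' for end of statement):

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for the gas predicates.  */
static bool sketch_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

static bool sketch_is_end_of_stmt (char c)
{
  return c == '\0' || c == '\n' || c == ';';
}

/* A digit, then 'H', then either whitespace or the end of the statement.
   Without the second alternative a lone "1H\n" would no longer match,
   because '\n' does not count as whitespace here.  */
static bool looks_like_fb_label (const char *s)
{
  return isdigit ((unsigned char) s[0])
	 && s[1] == 'H'
	 && (sketch_is_whitespace (s[2]) || sketch_is_end_of_stmt (s[2]));
}

/* Small self-check covering the lone-label case.  */
static void demo_checks (void)
{
  assert (looks_like_fb_label ("1H\n"));
  assert (looks_like_fb_label ("1H LDA x"));
  assert (!looks_like_fb_label ("1Hx"));
}
```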
brgds, H-P
----------------------------
From: Jan Beulich <jbeulich@suse.com>
Subject: [PATCH] gas MMIX: Use more of is_... framework like is_whitespace and
is_end_of_stmt
Convert uses of ISSPACE() and testing for specific characters into
calls to is_whitespace and is_end_of_stmt. While doing that, also
remove some redundant tests, like ';' together with is_end_of_line[]
and is_whitespace and !is_end_of_line.
Note the invalid casts being fixed as part of moving to is_... macros;
there were (unsigned int) where there should have been (unsigned char)
applied on char as index to is_end_of_line[].
Beware that the input language changes slightly: some constructs with
whitespace characters other than space and TAB are now invalid.
---
gas/config/tc-mmix.c | 70 +++++++++++++++++++-------------------------
1 file changed, 30 insertions(+), 40 deletions(-)
diff --git a/gas/config/tc-mmix.c b/gas/config/tc-mmix.c
index a43774d755c0..3715790348cf 100644
--- a/gas/config/tc-mmix.c
+++ b/gas/config/tc-mmix.c
@@ -454,11 +454,11 @@ get_operands (int max_operands, char *s, expressionS *exp)
while (nextchar == ',')
{
/* Skip leading whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
/* Check to see if we have any operands left to parse */
- if (*p == 0 || *p == '\n' || *p == '\r')
+ if (is_end_of_stmt (*p))
{
break;
}
@@ -489,7 +489,7 @@ get_operands (int max_operands, char *s, expressionS *exp)
p = input_line_pointer;
/* Skip leading whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
nextchar = *p++;
}
@@ -545,7 +545,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
int regno;
/* Skip leading whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
input_line_pointer = p;
@@ -565,7 +565,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
p = input_line_pointer;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
if (*p == ',')
@@ -573,7 +573,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
p++;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
sregp = p;
input_line_pointer = sregp;
@@ -594,7 +594,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
p = input_line_pointer;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
if (*p == ',')
@@ -602,7 +602,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
p++;
/* Skip whitespace */
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
input_line_pointer = p;
@@ -829,7 +829,7 @@ md_assemble (char *str)
++operands)
;
- if (ISSPACE (*operands))
+ if (is_whitespace (*operands))
{
modified_char = *operands;
*operands++ = '\0';
@@ -2924,7 +2924,11 @@ mmix_handle_mmixal (void)
/* If we're on a line with a label, check if it's a mmixal fb-label.
Save an indicator and skip the label; it must be set only after all
fb-labels of expressions are evaluated. */
- if (ISDIGIT (s[0]) && s[1] == 'H' && ISSPACE (s[2]))
+ if (ISDIGIT (s[0]) && s[1] == 'H'
+ /* A lone "1H" on a line is valid: we'll then see is_end_of_stmt()
+ being true for the following character (likely a '\n' but '\n'
+ doesn't count as is_whitespace). */
+ && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
{
current_fb_label = s[0] - '0';
@@ -2935,12 +2939,12 @@ mmix_handle_mmixal (void)
s += 2;
input_line_pointer = s;
- while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])
+ while (is_whitespace (*s))
s++;
/* For errors emitted here, the book-keeping is off by one; the
caller is about to bump the counters. Adjust the error messages. */
- if (is_end_of_line[(unsigned int) *s])
+ if (is_end_of_stmt (*s))
{
unsigned int line;
const char * name = as_where (&line);
@@ -2973,7 +2977,7 @@ mmix_handle_mmixal (void)
return;
}
- if (*s == 0 || is_end_of_line[(unsigned int) *s])
+ if (is_end_of_stmt (*s))
/* We avoid handling empty lines here. */
return;
@@ -2986,9 +2990,7 @@ mmix_handle_mmixal (void)
/* Find the start of the instruction or pseudo following the label,
if there is one. */
- for (insn = s;
- *insn && ISSPACE (*insn) && ! is_end_of_line[(unsigned int) *insn];
- insn++)
+ for (insn = s; is_whitespace (*insn); insn++)
/* Empty */
;
@@ -3003,7 +3005,7 @@ mmix_handle_mmixal (void)
instruction or MMIXAL-pseudo (getting its alignment). Thus
is acts like a "normal" :-ended label. Ditto if it's
followed by a non-MMIXAL pseudo. */
- && !is_end_of_line[(unsigned int) *insn]
+ && !is_end_of_stmt (*insn)
&& *insn != '.')
{
/* For labels that don't end in ":", we save it so we can later give
@@ -3038,7 +3040,7 @@ mmix_handle_mmixal (void)
while (*s)
{
c = *s++;
- if (is_end_of_line[(unsigned int) c])
+ if (is_end_of_stmt (c))
break;
if (c == MAGIC_FB_BACKWARD_CHAR || c == MAGIC_FB_FORWARD_CHAR)
as_bad (_("invalid characters in input"));
@@ -3048,34 +3050,24 @@ mmix_handle_mmixal (void)
s = insn;
/* Skip the insn. */
- while (*s
- && ! ISSPACE (*s)
- && *s != ';'
- && ! is_end_of_line[(unsigned int) *s])
+ while (! is_whitespace (*s) && ! is_end_of_stmt (*s))
s++;
/* Skip the spaces after the insn. */
- while (*s
- && ISSPACE (*s)
- && *s != ';'
- && ! is_end_of_line[(unsigned int) *s])
+ while (is_whitespace (*s))
s++;
/* Skip the operands. While doing this, replace [0-9][BF] with
(MAGIC_FB_BACKWARD_CHAR|MAGIC_FB_FORWARD_CHAR)[0-9]. */
- while ((c = *s) != 0
- && ! ISSPACE (c)
- && c != ';'
- && ! is_end_of_line[(unsigned int) c])
+ while (! is_whitespace (c = *s) && ! is_end_of_stmt (c))
{
if (c == '"')
{
s++;
/* FIXME: Test-case for semi-colon in string. */
- while (*s
- && *s != '"'
- && (! is_end_of_line[(unsigned int) *s] || *s == ';'))
+ while (*s != '"'
+ && (! is_end_of_stmt (*s) || *s == ';'))
s++;
if (*s == '"')
@@ -3101,10 +3093,7 @@ mmix_handle_mmixal (void)
}
/* Skip any spaces after the operands. */
- while (*s
- && ISSPACE (*s)
- && *s != ';'
- && !is_end_of_line[(unsigned int) *s])
+ while (is_whitespace (*s))
s++;
/* If we're now looking at a semi-colon, then it's an end-of-line
@@ -3114,7 +3103,8 @@ mmix_handle_mmixal (void)
/* Make IS into an EQU by replacing it with "= ". Only match upper-case
though; let lower-case be a syntax error. */
s = insn;
- if (s[0] == 'I' && s[1] == 'S' && ISSPACE (s[2]))
+ if (s[0] == 'I' && s[1] == 'S'
+ && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
{
*s = '=';
s[1] = ' ';
@@ -3163,7 +3153,7 @@ mmix_handle_mmixal (void)
else if (s[0] == 'G'
&& s[1] == 'R'
&& startswith (s, "GREG")
- && (ISSPACE (s[4]) || is_end_of_line[(unsigned char) s[4]]))
+ && (is_whitespace (s[4]) || is_end_of_stmt (s[4])))
{
input_line_pointer = s + 4;
@@ -4258,7 +4248,7 @@ mmix_cons (int nbytes)
SKIP_WHITESPACE ();
- if (is_end_of_line[(unsigned int) *input_line_pointer])
+ if (is_end_of_stmt (*input_line_pointer))
{
/* Default to zero if the expression was absent. */
--
2.30.2
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 32/65] MMIX: use is_whitespace()
2025-02-05 6:22 ` Hans-Peter Nilsson
@ 2025-02-05 7:14 ` Jan Beulich
0 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-02-05 7:14 UTC (permalink / raw)
To: Hans-Peter Nilsson; +Cc: Binutils
On 05.02.2025 07:22, Hans-Peter Nilsson wrote:
> On Mon, 3 Feb 2025, Jan Beulich wrote:
>> On 03.02.2025 07:54, Hans-Peter Nilsson wrote:
>>> On Mon, 27 Jan 2025, Jan Beulich wrote:
>>>> Convert open-coded checks as well as ISSPACE() uses. At the same time
>>>> use is_end_of_stmt() instead of open-coded checks; do the conversion
>>>> even when not adjacent to code being modified anyway to cover all cases
>>>> where the is_end_of_line[] index was wrongly cast from plain char (which
>>>> can be signed) to unsigned int.
>>>> ---
>>>> v2: New.
>>>>
>>>> --- a/gas/config/tc-mmix.c
>>>> +++ b/gas/config/tc-mmix.c
>>>
>>> Sorry, there are a few hunks that are wrong, like missing a test
>>> for ';' and other changes that may work but don't logically
>>> make sense to me.
>>
>> Missing ';' checks? The patch replaces a few with is_end_of_stmt(), yes,
>> but that's intentional (to stop its open-coding).
>
> Oh right, I'd looked at the end_of_line[] definition and forgot
> about MMIX adding ';' by means of line_separator_chars[]. In the
> end, all code-changes were ok though with the caveat of now
> supporting just tab and space.
>
> Though bundling all that into saying "stop open-coding" was too
> vague. You were simply doing too much and still too little in
> the same patch, making review harder than necessary.
Such mixing is pretty common in binutils changes. Other projects
I work on are more strict in this regard; I'm trying to find a
balance, but I don't always succeed.
> Too much: "replacing open-coding" and removing redundant
> checking while changing those whitespace tests; the IS... calls
> to is_... and fixing erroneous casts.
>
> Unfortunately there isn't a 1:1 equivalence, so the input
> language is now different. Now with is_whitespace it accepts
> only ' ' and '\t' where previously ISSPACE covered more
> whitespace characters in some places. Not sure there are any
> assembly language programmers using e.g. \f instead of \t, but
> some of their code now no longer assembles for some targets.
> Example: replace the space after the first 1H in 1hjmp1b.s with
> a form-feed and run the related tests. But, pragmatically, I'll
> ok that change. Nobody in their right mind should use other
> than \t and ' ' as assembly-code field separators. They had it
> coming. 1/2 :-)
Well, I'm aware - see the two respective remarks in the cover
letter. ISSPACE() was problematic anyway, for including \r and
\n as well. ISBLANK() may have been slightly better. In any
event - we now have control over what we want to consider white
space. We can add \f and/or \v, or any others. My goal is though
that for no character is_whitespace() and is_end_of_stmt() both
yield true.
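As a rough illustration of that goal (a standalone sketch with an invented
table and invented flag names, not the real lex_type[] layout), the
invariant can be checked mechanically over all character values:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical class bits for this sketch only.  */
enum { SK_WHITESPACE = 1, SK_END_OF_STMT = 2 };

/* Hypothetical classification table; treating ';' as a statement
   separator is an assumption modelled on the MMIX discussion.  */
static unsigned char sk_class[UCHAR_MAX + 1] = {
  [' ']  = SK_WHITESPACE,
  ['\t'] = SK_WHITESPACE,
  ['\0'] = SK_END_OF_STMT,
  ['\n'] = SK_END_OF_STMT,
  [';']  = SK_END_OF_STMT,
};

/* True if no character is classified as both whitespace and
   end-of-statement.  */
static bool sk_classes_disjoint (void)
{
  for (int i = 0; i <= UCHAR_MAX; i++)
    if ((sk_class[i] & SK_WHITESPACE) && (sk_class[i] & SK_END_OF_STMT))
      return false;
  return true;
}

/* Small self-check on the table as initialised above.  */
static void demo_checks (void)
{
  assert (sk_classes_disjoint ());
}
```

Adding \f or \v to the whitespace set would then just mean setting one more
entry and re-running the check.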
Jan
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 33/65] mn10200: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (31 preceding siblings ...)
2025-01-27 16:28 ` [PATCH v2 32/65] MMIX: " Jan Beulich
@ 2025-01-27 16:29 ` Jan Beulich
2025-01-27 16:29 ` [PATCH v2 34/65] mn10300: " Jan Beulich
` (33 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:29 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks as well as ISSPACE()
uses. At the same time use is_end_of_stmt() instead of kind-of-open-
coded checks in adjacent code.
---
v2: New.
--- a/gas/config/tc-mn10200.c
+++ b/gas/config/tc-mn10200.c
@@ -878,7 +878,7 @@ md_assemble (char *str)
int match;
/* Get the opcode. */
- for (s = str; *s != '\0' && !ISSPACE (*s); s++)
+ for (s = str; !is_end_of_stmt (*s) && !is_whitespace (*s); s++)
;
if (*s != '\0')
*s++ = '\0';
@@ -892,7 +892,7 @@ md_assemble (char *str)
}
str = s;
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
input_line_pointer = str;
@@ -929,7 +929,7 @@ md_assemble (char *str)
errmsg = NULL;
- while (*str == ' ' || *str == ',')
+ while (is_whitespace (*str) || *str == ',')
++str;
if (operand->flags & MN10200_OPERAND_RELAX)
@@ -1102,7 +1102,7 @@ md_assemble (char *str)
str = input_line_pointer;
input_line_pointer = hold;
- while (*str == ' ' || *str == ',')
+ while (is_whitespace (*str) || *str == ',')
++str;
}
@@ -1127,10 +1127,10 @@ md_assemble (char *str)
break;
}
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
- if (*str != '\0')
+ if (!is_end_of_stmt (*str))
as_bad (_("junk at end of line: `%s'"), str);
input_line_pointer = str;
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 34/65] mn10300: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (32 preceding siblings ...)
2025-01-27 16:29 ` [PATCH v2 33/65] mn10200: " Jan Beulich
@ 2025-01-27 16:29 ` Jan Beulich
2025-02-07 6:55 ` Alexandre Oliva
2025-01-27 16:30 ` [PATCH v2 35/65] Moxie: " Jan Beulich
` (32 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:29 UTC (permalink / raw)
To: Binutils; +Cc: Alexandre Oliva
Convert open-coded checks as well as ISSPACE() uses. At the same time
use is_end_of_stmt() instead of kind-of-open-coded checks in adjacent
code.
---
v2: New.
--- a/gas/config/tc-mn10300.c
+++ b/gas/config/tc-mn10300.c
@@ -1241,7 +1241,7 @@ md_assemble (char *str)
int match;
/* Get the opcode. */
- for (s = str; *s != '\0' && !ISSPACE (*s); s++)
+ for (s = str; !is_end_of_stmt (*s) && !is_whitespace (*s); s++)
;
if (*s != '\0')
*s++ = '\0';
@@ -1255,7 +1255,7 @@ md_assemble (char *str)
}
str = s;
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
input_line_pointer = str;
@@ -1304,7 +1304,7 @@ md_assemble (char *str)
next_opindex = 0;
}
- while (*str == ' ' || *str == ',')
+ while (is_whitespace (*str) || *str == ',')
++str;
if (operand->flags & MN10300_OPERAND_RELAX)
@@ -1764,7 +1764,7 @@ md_assemble (char *str)
str = input_line_pointer;
input_line_pointer = hold;
- while (*str == ' ' || *str == ',')
+ while (is_whitespace (*str) || *str == ',')
++str;
}
@@ -1815,10 +1815,10 @@ md_assemble (char *str)
break;
}
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
- if (*str != '\0')
+ if (!is_end_of_stmt (*str))
as_bad (_("junk at end of line: `%s'"), str);
input_line_pointer = str;
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 35/65] Moxie: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (33 preceding siblings ...)
2025-01-27 16:29 ` [PATCH v2 34/65] mn10300: " Jan Beulich
@ 2025-01-27 16:30 ` Jan Beulich
2025-01-27 16:31 ` [PATCH v2 36/65] msp430: " Jan Beulich
` (31 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:30 UTC (permalink / raw)
To: Binutils; +Cc: Anthony Green
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert ISSPACE() uses. At the same time use
is_end_of_stmt() instead of an open-coded check in adjacent code. While
at it also drop a redundant whitespace skipping loop.
---
v2: New.
--- a/gas/config/tc-moxie.c
+++ b/gas/config/tc-moxie.c
@@ -163,13 +163,13 @@ md_assemble (char *str)
int nlen = 0;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the op code end. */
op_start = str;
for (op_end = str;
- *op_end && !is_end_of_line[*op_end & 0xff] && *op_end != ' ';
+ !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
nlen++;
@@ -193,7 +193,7 @@ md_assemble (char *str)
{
case MOXIE_F2_A8V:
iword = (1<<15) | (opcode->opcode << 12);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
expressionS arg;
@@ -214,7 +214,7 @@ md_assemble (char *str)
break;
case MOXIE_F1_AB:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
int dest, src;
@@ -224,7 +224,7 @@ md_assemble (char *str)
op_end++;
src = parse_register_operand (&op_end);
iword += (dest << 4) + src;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -232,7 +232,7 @@ md_assemble (char *str)
break;
case MOXIE_F1_A4:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
expressionS arg;
@@ -240,7 +240,7 @@ md_assemble (char *str)
int regnum;
regnum = parse_register_operand (&op_end);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
iword += (regnum << 4);
@@ -266,7 +266,7 @@ md_assemble (char *str)
case MOXIE_F1_M:
case MOXIE_F1_4:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
expressionS arg;
@@ -284,19 +284,19 @@ md_assemble (char *str)
break;
case MOXIE_F1_NARG:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
break;
case MOXIE_F1_A:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
int reg;
reg = parse_register_operand (&op_end);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -305,7 +305,7 @@ md_assemble (char *str)
break;
case MOXIE_F1_ABi:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
int a, b;
@@ -329,7 +329,7 @@ md_assemble (char *str)
}
op_end++;
iword += (a << 4) + b;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -337,7 +337,7 @@ md_assemble (char *str)
break;
case MOXIE_F1_AiB:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
int a, b;
@@ -361,7 +361,7 @@ md_assemble (char *str)
op_end++;
b = parse_register_operand (&op_end);
iword += (a << 4) + b;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -369,7 +369,7 @@ md_assemble (char *str)
break;
case MOXIE_F1_4A:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
expressionS arg;
@@ -394,7 +394,7 @@ md_assemble (char *str)
op_end++;
a = parse_register_operand (&op_end);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -404,7 +404,7 @@ md_assemble (char *str)
break;
case MOXIE_F1_ABi2:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
expressionS arg;
@@ -412,7 +412,7 @@ md_assemble (char *str)
int a, b;
a = parse_register_operand (&op_end);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != ',')
@@ -448,7 +448,7 @@ md_assemble (char *str)
}
op_end++;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -458,7 +458,7 @@ md_assemble (char *str)
break;
case MOXIE_F1_AiB2:
iword = opcode->opcode << 8;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
expressionS arg;
@@ -499,10 +499,7 @@ md_assemble (char *str)
op_end++;
b = parse_register_operand (&op_end);
- while (ISSPACE (*op_end))
- op_end++;
-
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -512,14 +509,14 @@ md_assemble (char *str)
break;
case MOXIE_F2_NARG:
iword = opcode->opcode << 12;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
break;
case MOXIE_F3_PCREL:
iword = (3<<14) | (opcode->opcode << 10);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
{
expressionS arg;
@@ -535,7 +532,7 @@ md_assemble (char *str)
break;
case MOXIE_BAD:
iword = 0;
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
as_warn (_("extra stuff on line ignored"));
@@ -547,7 +544,7 @@ md_assemble (char *str)
md_number_to_chars (p, iword, 2);
dwarf2_emit_insn (2);
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 36/65] msp430: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (34 preceding siblings ...)
2025-01-27 16:30 ` [PATCH v2 35/65] Moxie: " Jan Beulich
@ 2025-01-27 16:31 ` Jan Beulich
2025-01-27 16:31 ` [PATCH v2 37/65] nds32: " Jan Beulich
` (30 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:31 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert ISSPACE() uses. At the same time use
is_end_of_stmt() instead of open-coded checking in code needing touching
anyway.
---
v2: New.
--- a/gas/config/tc-msp430.c
+++ b/gas/config/tc-msp430.c
@@ -436,11 +436,11 @@ del_spaces (char * s)
{
while (*s)
{
- if (ISSPACE (*s))
+ if (is_whitespace (*s))
{
char *m = s + 1;
- while (ISSPACE (*m) && *m)
+ while (is_whitespace (*m) && *m)
m++;
memmove (s, m, strlen (m) + 1);
}
@@ -452,7 +452,7 @@ del_spaces (char * s)
static inline char *
skip_space (char * s)
{
- while (ISSPACE (*s))
+ while (is_whitespace (*s))
++s;
return s;
}
@@ -1813,7 +1813,7 @@ extract_cmd (char * from, char * to, int
{
int size = 0;
- while (*from && ! ISSPACE (*from) && *from != '.' && limit > size)
+ while (*from && ! is_whitespace (*from) && *from != '.' && limit > size)
{
*(to + size) = *from;
from++;
@@ -2833,14 +2833,12 @@ msp430_operands (struct msp430_opcode_s
check = true;
break;
- case 0:
- case ' ':
- case '\n':
- case '\r':
- as_warn (_("no size modifier after period, .w assumed"));
- break;
-
default:
+ if (is_whitespace (*line) || is_end_of_stmt(*line))
+ {
+ as_warn (_("no size modifier after period, .w assumed"));
+ break;
+ }
as_bad (_("unrecognised instruction size modifier .%c"),
* line);
return 0;
@@ -2853,7 +2851,7 @@ msp430_operands (struct msp430_opcode_s
}
}
- if (*line && ! ISSPACE (*line))
+ if (*line && ! is_whitespace (*line))
{
as_bad (_("junk found after instruction: %s.%s"),
opcode->name, line);
^ permalink raw reply [flat|nested] 106+ messages in thread
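A note on the msp430_operands() hunk above: a predicate such as is_whitespace() is not a constant expression and so cannot be used as a case label, which is presumably why the explicit labels for 0, ' ', '\n' and '\r' move into the default branch. A minimal standalone sketch of that pattern follows. classify_size_modifier(), my_is_whitespace() and my_is_end_of_stmt() are hypothetical stand-ins written for this illustration; they are not the gas definitions, which are table-driven and target-dependent.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the gas predicates (assumptions, for
   illustration only).  */
static bool my_is_whitespace (char c)  { return c == ' ' || c == '\t'; }
static bool my_is_end_of_stmt (char c) { return c == '\0' || c == '\n' || c == '\r'; }

enum size_mod { MOD_BYTE, MOD_WORD, MOD_WORD_ASSUMED, MOD_BAD };

/* Classify the character following the '.' in e.g. "mov.b".  */
static enum size_mod
classify_size_modifier (char c)
{
  switch (c)
    {
    case 'b': return MOD_BYTE;
    case 'w': return MOD_WORD;
    default:
      /* Predicates cannot be case labels, so they are tested here
         before falling back to the error result.  */
      if (my_is_whitespace (c) || my_is_end_of_stmt (c))
        return MOD_WORD_ASSUMED;
      return MOD_BAD;
    }
}
```

With this shape a tab after the period is handled like a blank, which the old case list did not do.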
* [PATCH v2 37/65] nds32: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (35 preceding siblings ...)
2025-01-27 16:31 ` [PATCH v2 36/65] msp430: " Jan Beulich
@ 2025-01-27 16:31 ` Jan Beulich
2025-01-27 16:32 ` [PATCH v2 38/65] NS32k: " Jan Beulich
` (29 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:31 UTC (permalink / raw)
To: Binutils; +Cc: Kuan-Lin Chen, Wei-Cheng Wang
Convert ISSPACE() uses.
---
v2: New.
--- a/gas/config/tc-nds32.c
+++ b/gas/config/tc-nds32.c
@@ -3452,7 +3452,7 @@ nds32_lookup_pseudo_opcode (const char *
for (i = 0; i < maxlen; i++)
{
- if (ISSPACE (op[i] = str[i]))
+ if (is_whitespace (op[i] = str[i]))
break;
}
op[i] = '\0';
@@ -4093,7 +4093,7 @@ nds32_relax_relocs (int relax)
{"", "",};
name = input_line_pointer;
- while (*input_line_pointer && !ISSPACE (*input_line_pointer))
+ while (*input_line_pointer && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
*input_line_pointer = 0;
@@ -4230,7 +4230,7 @@ nds32_relax_hint (int mode ATTRIBUTE_UNU
struct relax_hint_id *record_id;
name = input_line_pointer;
- while (*input_line_pointer && !ISSPACE (*input_line_pointer))
+ while (*input_line_pointer && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
*input_line_pointer = 0;
@@ -4363,7 +4363,7 @@ nds32_flag (int ignore ATTRIBUTE_UNUSED)
/* Skip whitespaces. */
name = input_line_pointer;
- while (*input_line_pointer && !ISSPACE (*input_line_pointer))
+ while (*input_line_pointer && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
*input_line_pointer = 0;
@@ -4400,7 +4400,7 @@ ict_model (int ignore ATTRIBUTE_UNUSED)
/* Skip whitespaces. */
name = input_line_pointer;
- while (*input_line_pointer && !ISSPACE (*input_line_pointer))
+ while (*input_line_pointer && !is_whitespace (*input_line_pointer))
input_line_pointer++;
saved_char = *input_line_pointer;
*input_line_pointer = 0;
^ permalink raw reply [flat|nested] 106+ messages in thread
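Converting ISSPACE() is not a pure rename. ISSPACE() accepts the C-locale isspace() set, which also contains '\n', '\v', '\f' and '\r', whereas is_whitespace() as described in patch 01 covers blanks and tabs (plus '\r' when it is not an end-of-line character). The sketch below shows the difference on a token scan like the ones in the nds32 hunks. It uses isspace() from <ctype.h> as a stand-in for ISSPACE(), and my_is_whitespace() is a hypothetical simplification that ignores the '\r' case.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplification: blank and tab only.  */
static bool my_is_whitespace (char c) { return c == ' ' || c == '\t'; }

/* Length of the leading token when delimited by the isspace() set.  */
static size_t
token_len_isspace (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0' && !isspace ((unsigned char) s[n]))
    n++;
  return n;
}

/* Length of the leading token when delimited by blank/tab only.  */
static size_t
token_len_whitespace (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0' && !my_is_whitespace (s[n]))
    n++;
  return n;
}
```

For ordinary input the two agree; they differ only when a form feed, vertical tab or newline appears inside what would otherwise be one token.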
* [PATCH v2 38/65] NS32k: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (36 preceding siblings ...)
2025-01-27 16:31 ` [PATCH v2 37/65] nds32: " Jan Beulich
@ 2025-01-27 16:32 ` Jan Beulich
2025-01-27 16:32 ` [PATCH v2 39/65] PDP11: " Jan Beulich
` (28 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:32 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-ns32k.c
+++ b/gas/config/tc-ns32k.c
@@ -1097,7 +1097,9 @@ parse (const char *line, int recursive_l
if (recursive_level <= 0)
{
/* Called from md_assemble. */
- for (lineptr = line; (*lineptr) != '\0' && (*lineptr) != ' '; lineptr++)
+ for (lineptr = line;
+ (*lineptr) != '\0' && !is_whitespace (*lineptr);
+ lineptr++)
continue;
c = *lineptr;
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 39/65] PDP11: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (37 preceding siblings ...)
2025-01-27 16:32 ` [PATCH v2 38/65] NS32k: " Jan Beulich
@ 2025-01-27 16:32 ` Jan Beulich
2025-01-27 16:33 ` [PATCH v2 40/65] PicoJava: " Jan Beulich
` (27 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:32 UTC (permalink / raw)
To: Binutils
Convert open-coded checks.
---
v2: New.
--- a/gas/config/tc-pdp11.c
+++ b/gas/config/tc-pdp11.c
@@ -320,7 +320,7 @@ md_chars_to_number (unsigned char *con,
static char *
skip_whitespace (char *str)
{
- while (*str == ' ' || *str == '\t')
+ while (is_whitespace (*str))
str++;
return str;
}
@@ -328,7 +328,7 @@ skip_whitespace (char *str)
static char *
find_whitespace (char *str)
{
- while (*str != ' ' && *str != '\t' && *str != 0)
+ while (!is_whitespace (*str) && *str != 0)
str++;
return str;
}
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 40/65] PicoJava: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (38 preceding siblings ...)
2025-01-27 16:32 ` [PATCH v2 39/65] PDP11: " Jan Beulich
@ 2025-01-27 16:33 ` Jan Beulich
2025-01-27 16:33 ` [PATCH v2 41/65] PPC: " Jan Beulich
` (26 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:33 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert ISSPACE(). At the same time use
is_end_of_stmt() instead of an open-coded check in adjacent code.
---
v2: New.
--- a/gas/config/tc-pj.c
+++ b/gas/config/tc-pj.c
@@ -236,13 +236,13 @@ md_assemble (char *str)
int nlen = 0;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the op code end. */
op_start = str;
for (op_end = str;
- *op_end && !is_end_of_line[*op_end & 0xff] && *op_end != ' ';
+ !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
nlen++;
@@ -301,7 +301,7 @@ md_assemble (char *str)
pending_reloc = 0;
}
- while (ISSPACE (*op_end))
+ while (is_whitespace (*op_end))
op_end++;
if (*op_end != 0)
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 41/65] PPC: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (39 preceding siblings ...)
2025-01-27 16:33 ` [PATCH v2 40/65] PicoJava: " Jan Beulich
@ 2025-01-27 16:33 ` Jan Beulich
2025-01-27 22:36 ` Peter Bergner
2025-01-27 16:34 ` [PATCH v2 42/65] pru: " Jan Beulich
` (25 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:33 UTC (permalink / raw)
To: Binutils; +Cc: Peter Bergner
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also switch ISSPACE() uses over. At the same time
use is_end_of_stmt() instead of an open-coded nul char check.
---
v2: Also replace ISSPACE().
--- a/gas/config/tc-ppc.c
+++ b/gas/config/tc-ppc.c
@@ -3325,7 +3325,7 @@ md_assemble (char *str)
unsigned int insn_length;
/* Get the opcode. */
- for (s = str; *s != '\0' && ! ISSPACE (*s); s++)
+ for (s = str; ! is_end_of_stmt (*s) && ! is_whitespace (*s); s++)
;
if (*s != '\0')
*s++ = '\0';
@@ -3351,7 +3351,7 @@ md_assemble (char *str)
}
str = s;
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
#ifdef OBJ_XCOFF
@@ -3967,7 +3967,7 @@ md_assemble (char *str)
{
do
++str;
- while (ISSPACE (*str));
+ while (is_whitespace (*str));
endc = ',';
}
}
@@ -3996,7 +3996,7 @@ md_assemble (char *str)
}
}
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
if (*str != '\0')
@@ -5831,7 +5831,7 @@ ppc_tc (int ignore ATTRIBUTE_UNUSED)
/* Skip the TOC symbol name. */
while (is_part_of_name (*input_line_pointer)
- || *input_line_pointer == ' '
+ || is_whitespace (*input_line_pointer)
|| *input_line_pointer == '['
|| *input_line_pointer == ']'
|| *input_line_pointer == '{'
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 41/65] PPC: use is_whitespace()
2025-01-27 16:33 ` [PATCH v2 41/65] PPC: " Jan Beulich
@ 2025-01-27 22:36 ` Peter Bergner
0 siblings, 0 replies; 106+ messages in thread
From: Peter Bergner @ 2025-01-27 22:36 UTC (permalink / raw)
To: Jan Beulich, Binutils
On 1/27/25 10:33 AM, Jan Beulich wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input). Also switch ISSPACE() uses over. At the same time
> use is_end_of_stmt() instead of an open-coded nul char check.
> ---
> v2: Also replace ISSPACE().
LGTM. Thanks.
Peter
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 42/65] pru: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (40 preceding siblings ...)
2025-01-27 16:33 ` [PATCH v2 41/65] PPC: " Jan Beulich
@ 2025-01-27 16:34 ` Jan Beulich
2025-01-27 16:34 ` [PATCH v2 43/65] RISC-V: " Jan Beulich
` (24 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:34 UTC (permalink / raw)
To: Binutils
Convert open-coded checks as well as an ISSPACE() use.
---
v2: New.
--- a/gas/config/tc-pru.c
+++ b/gas/config/tc-pru.c
@@ -1441,7 +1441,7 @@ pru_parse_args (pru_insn_infoS *insn ATT
/* Strip trailing whitespace. */
len = strlen (parsed_args[i]);
for (char *temp = parsed_args[i] + len - 1;
- len && ISSPACE (*temp);
+ len && is_whitespace (*temp);
temp--, len--)
*temp = '\0';
@@ -1830,7 +1830,7 @@ pru_frob_label (symbolS *lab)
static inline char *
skip_space (char *s)
{
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
++s;
return s;
}
^ permalink raw reply [flat|nested] 106+ messages in thread
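For reference, a standalone sketch of the trailing-whitespace strip touched in pru_parse_args(). This variant is index-based, so it never forms a pointer before the start of the buffer when the string is empty. strip_trailing_ws() and my_is_whitespace() are hypothetical names used only here, and the predicate is a blank/tab simplification rather than the gas macro.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplification: blank and tab only.  */
static bool my_is_whitespace (char c) { return c == ' ' || c == '\t'; }

/* Overwrite trailing whitespace in S with nul bytes and return the
   new length.  */
static size_t
strip_trailing_ws (char *s)
{
  size_t len = strlen (s);

  while (len > 0 && my_is_whitespace (s[len - 1]))
    s[--len] = '\0';
  return len;
}
```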
* [PATCH v2 43/65] RISC-V: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (41 preceding siblings ...)
2025-01-27 16:34 ` [PATCH v2 42/65] pru: " Jan Beulich
@ 2025-01-27 16:34 ` Jan Beulich
2025-01-27 16:35 ` [PATCH v2 44/65] rl78: " Jan Beulich
` (23 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:34 UTC (permalink / raw)
To: Binutils; +Cc: Palmer Dabbelt, Andrew Waterman, Jim Wilson, Nelson Chu
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Switch places already checking for tabs to use the
macro, too.
---
v2: Also replace ISSPACE() and a strspn() use.
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -2484,7 +2484,7 @@ parse_relocation (char **str, bfd_reloc_
{
size_t len = 1 + strlen (percent_op->str);
- while (ISSPACE ((*str)[len]))
+ while (is_whitespace ((*str)[len]))
++len;
if ((*str)[len] != '(')
continue;
@@ -2548,7 +2548,7 @@ my_getSmallExpression (expressionS *ep,
/* Skip over whitespace and brackets, keeping count of the number
of brackets. */
- while (*str == ' ' || *str == '\t' || *str == '(')
+ while (is_whitespace (*str) || *str == '(')
if (*str++ == '(')
str_depth++;
}
@@ -2577,7 +2577,7 @@ my_getSmallExpression (expressionS *ep,
probing_insn_operands = orig_probing;
/* Match every open bracket. */
- while (crux_depth > 0 && (*str == ')' || *str == ' ' || *str == '\t'))
+ while (crux_depth > 0 && (*str == ')' || is_whitespace (*str)))
if (*str++ == ')')
crux_depth--;
@@ -2844,7 +2844,7 @@ riscv_ip (char *str, struct riscv_cl_ins
/* Parse the name of the instruction. Terminate the string if whitespace
is found so that str_hash_find only sees the name part of the string. */
for (asarg = str; *asarg!= '\0'; ++asarg)
- if (ISSPACE (*asarg))
+ if (is_whitespace (*asarg))
{
save_c = *asarg;
*asarg++ = '\0';
@@ -2891,7 +2891,8 @@ riscv_ip (char *str, struct riscv_cl_ins
for (oparg = insn->args;; ++oparg)
{
opargStart = oparg;
- asarg += strspn (asarg, " \t");
+ while (is_whitespace (*asarg))
+ ++asarg;
switch (*oparg)
{
case '\0': /* End of args. */
@@ -3520,7 +3521,7 @@ riscv_ip (char *str, struct riscv_cl_ins
if (reg_lookup (&asarg, RCLASS_GPR, &regno))
{
char c = *oparg;
- if (*asarg == ' ')
+ if (is_whitespace (*asarg))
++asarg;
/* Now that we have assembled one operand, we use the args
@@ -3554,7 +3555,7 @@ riscv_ip (char *str, struct riscv_cl_ins
? RCLASS_GPR : RCLASS_FPR), &regno))
{
char c = *oparg;
- if (*asarg == ' ')
+ if (is_whitespace (*asarg))
++asarg;
switch (c)
{
@@ -4963,7 +4964,7 @@ s_riscv_option (int x ATTRIBUTE_UNUSED)
else if (strncmp (name, "arch,", 5) == 0)
{
name += 5;
- if (ISSPACE (*name) && *name != '\0')
+ if (is_whitespace (*name) && *name != '\0')
name++;
riscv_update_subset (&riscv_rps_as, name);
riscv_set_arch_str (&riscv_rps_as.subset_list->arch_str);
^ permalink raw reply [flat|nested] 106+ messages in thread
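On the strspn() replacement in riscv_ip(): strspn (asarg, " \t") hard-codes the accepted set, while the loop follows whatever the predicate accepts. A small sketch of the two forms side by side, assuming a hypothetical my_is_whitespace() that recognizes blank and tab only (the real macro may accept more, e.g. '\r' where that is not an end-of-line character):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical simplification: blank and tab only.  */
static bool my_is_whitespace (char c) { return c == ' ' || c == '\t'; }

/* Old form: the set of skipped characters is spelled out.  */
static const char *
skip_strspn (const char *s)
{
  return s + strspn (s, " \t");
}

/* New form: the set of skipped characters is whatever the predicate says.  */
static const char *
skip_predicate (const char *s)
{
  while (my_is_whitespace (*s))
    ++s;
  return s;
}
```

Under the stated assumption both return the same pointer; they diverge as soon as the predicate's set is widened.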
* [PATCH v2 44/65] rl78: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (42 preceding siblings ...)
2025-01-27 16:34 ` [PATCH v2 43/65] RISC-V: " Jan Beulich
@ 2025-01-27 16:35 ` Jan Beulich
2025-01-27 16:35 ` [PATCH v2 45/65] rx: " Jan Beulich
` (22 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:35 UTC (permalink / raw)
To: Binutils
Replace open-coded checks and convert ISSPACE() uses. At the same time
use is_end_of_stmt() instead of an open-coded check in adjacent code.
---
v2: New.
--- a/gas/config/rl78-parse.y
+++ b/gas/config/rl78-parse.y
@@ -1379,7 +1379,7 @@ find_bit_index (char *tok)
{
last_digit = tok;
}
- else if (ISSPACE (*tok))
+ else if (is_whitespace (*tok))
{
/* skip */
}
@@ -1403,7 +1403,7 @@ rl78_lex (void)
char * save_input_pointer;
char * bit = NULL;
- while (ISSPACE (*rl78_lex_start)
+ while (is_whitespace (*rl78_lex_start)
&& rl78_lex_start != rl78_lex_end)
rl78_lex_start ++;
--- a/gas/config/tc-rl78.c
+++ b/gas/config/tc-rl78.c
@@ -425,12 +425,11 @@ md_number_to_chars (char * buf, valueT v
static void
require_end_of_expr (const char *fname)
{
- while (* input_line_pointer == ' '
- || * input_line_pointer == '\t')
+ while (is_whitespace (* input_line_pointer))
input_line_pointer ++;
- if (! * input_line_pointer
- || strchr ("\n\r,", * input_line_pointer)
+ if (is_end_of_stmt (* input_line_pointer)
+ || * input_line_pointer == ','
|| strchr (comment_chars, * input_line_pointer)
|| strchr (line_comment_chars, * input_line_pointer)
|| strchr (line_separator_chars, * input_line_pointer))
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 45/65] rx: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (43 preceding siblings ...)
2025-01-27 16:35 ` [PATCH v2 44/65] rl78: " Jan Beulich
@ 2025-01-27 16:35 ` Jan Beulich
2025-01-27 16:36 ` [PATCH v2 46/65] s12z: " Jan Beulich
` (21 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:35 UTC (permalink / raw)
To: Binutils; +Cc: Nick Clifton
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks as well as ISSPACE()
uses. At the same time use is_end_of_stmt() instead of an open-coded
check in adjacent code.
---
v2: New.
--- a/gas/config/rx-parse.y
+++ b/gas/config/rx-parse.y
@@ -1583,7 +1583,7 @@ rx_lex (void)
unsigned int ci;
char * save_input_pointer;
- while (ISSPACE (*rx_lex_start)
+ while (is_whitespace (*rx_lex_start)
&& rx_lex_start != rx_lex_end)
rx_lex_start ++;
--- a/gas/config/tc-rx.c
+++ b/gas/config/tc-rx.c
@@ -282,7 +282,8 @@ rx_include (int ignore)
last_char = find_end_of_line (filename, false);
input_line_pointer = last_char;
- while (last_char >= filename && (* last_char == ' ' || * last_char == '\n'))
+ while (last_char >= filename
+ && (is_whitespace (* last_char) || is_end_of_stmt (* last_char)))
-- last_char;
end_char = *(++ last_char);
* last_char = 0;
@@ -425,14 +426,14 @@ parse_rx_section (char * name)
{
*p = end_char;
- if (end_char == ' ')
- while (ISSPACE (*p))
+ if (is_whitespace (end_char))
+ while (is_whitespace (*p))
p++;
if (*p == '=')
{
++ p;
- while (ISSPACE (*p))
+ while (is_whitespace (*p))
p++;
switch (*p)
{
@@ -517,7 +518,7 @@ rx_section (int ignore)
{
int len = p - input_line_pointer;
- while (ISSPACE (*++p))
+ while (is_whitespace (*++p))
;
if (*p != '"' && *p != '#')
@@ -1060,7 +1061,7 @@ rx_equ (char * name, char * expression)
char * name_end;
char * saved_ilp;
- while (ISSPACE (* name))
+ while (is_whitespace (* name))
name ++;
for (name_end = name + 1; *name_end; name_end ++)
@@ -1094,7 +1095,7 @@ scan_for_infix_rx_pseudo_ops (char * str
return false;
/* A real pseudo-op must be preceded by whitespace. */
- if (dot[-1] != ' ' && dot[-1] != '\t')
+ if (!is_whitespace (dot[-1]))
return false;
pseudo_op = dot + 1;
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 46/65] s12z: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (44 preceding siblings ...)
2025-01-27 16:35 ` [PATCH v2 45/65] rx: " Jan Beulich
@ 2025-01-27 16:36 ` Jan Beulich
2025-01-27 16:36 ` [PATCH v2 47/65] S/390: " Jan Beulich
` (20 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:36 UTC (permalink / raw)
To: Binutils; +Cc: John Darrington
Convert open-coded checks. At the same time use is_end_of_stmt() instead
of open-coded checks in adjacent code. This then also fixes the prior
use of a wrong cast for an array index: Plain char may, after all, be
signed.
---
v2: New.
--- a/gas/config/tc-s12z.c
+++ b/gas/config/tc-s12z.c
@@ -207,7 +207,7 @@ s12z_init_after_args (void)
static char *
skip_whites (char *p)
{
- while (*p == ' ' || *p == '\t')
+ while (is_whitespace (*p))
p++;
return p;
@@ -347,7 +347,7 @@ static bool
lex_match_string (const char *s)
{
char *p = input_line_pointer;
- while (p != 0 && *p != '\t' && *p != ' ' && *p != '\0')
+ while (p != 0 && !is_whitespace (*p) && !is_end_of_stmt (*p))
{
p++;
}
@@ -3790,7 +3790,7 @@ md_assemble (char *str)
/* Find the opcode end and get the opcode in 'name'. The opcode is forced
lower case (the opcode table only has lower case op-codes). */
for (op_start = op_end = str;
- *op_end && !is_end_of_line[(int)*op_end] && *op_end != ' ';
+ !is_end_of_stmt (*op_end) && !is_whitespace (*op_end);
op_end++)
{
name[nlen] = TOLOWER (op_start[nlen]);
^ permalink raw reply [flat|nested] 106+ messages in thread
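On the cast remark in the s12z description: where plain char is signed, (int) *op_end is negative for bytes of 0x80 and above, so is_end_of_line[(int) *op_end] would index before the start of the array. A minimal sketch of the difference follows. All names are hypothetical, eol_tab[] is a toy table and not the gas one, and the result of index_via_int() for high bytes depends on the platform's char signedness.

```c
/* Toy lookup table (assumption): nul and newline end a line.  */
static const unsigned char eol_tab[256] = { ['\0'] = 1, ['\n'] = 1 };

/* Index as the old code computed it: sign-extends when plain char
   is signed, yielding a negative index for bytes >= 0x80.  */
static int index_via_int (char c) { return (int) c; }

/* Index that always lies within 0..255 (assuming 8-bit char).  */
static int index_via_uchar (char c) { return (unsigned char) c; }

/* Table lookup that is safe for every possible byte value.  */
static int is_eol_safe (char c) { return eol_tab[(unsigned char) c]; }
```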
* [PATCH v2 47/65] S/390: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (45 preceding siblings ...)
2025-01-27 16:36 ` [PATCH v2 46/65] s12z: " Jan Beulich
@ 2025-01-27 16:36 ` Jan Beulich
2025-01-30 8:38 ` Jens Remus
2025-01-27 16:37 ` [PATCH v2 48/65] Score: " Jan Beulich
` (19 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:36 UTC (permalink / raw)
To: Binutils; +Cc: Andreas Krebbel
Convert ISSPACE() uses. At the same time use is_end_of_stmt() instead
of kind-of-open-coded checks in adjacent code.
---
v2: New.
--- a/gas/config/tc-s390.c
+++ b/gas/config/tc-s390.c
@@ -1429,7 +1429,7 @@ md_gather_operands (char *str,
char *f;
int fc, i;
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
str++;
/* Gather the operands. */
@@ -1811,7 +1811,7 @@ md_gather_operands (char *str,
}
}
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
str++;
/* Check for tls instruction marker. */
@@ -1916,7 +1916,7 @@ md_assemble (char *str)
char *s;
/* Get the opcode. */
- for (s = str; *s != '\0' && ! ISSPACE (*s); s++)
+ for (s = str; ! is_end_of_stmt (*s) && ! is_whitespace (*s); s++)
;
if (*s != '\0')
*s++ = '\0';
@@ -1972,7 +1972,7 @@ s390_insn (int ignore ATTRIBUTE_UNUSED)
/* Get the opcode format. */
s = input_line_pointer;
- while (*s != '\0' && *s != ',' && ! ISSPACE (*s))
+ while (! is_end_of_stmt (*s) && *s != ',' && ! is_whitespace (*s))
s++;
if (*s != ',')
as_bad (_("Invalid .insn format\n"));
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 47/65] S/390: use is_whitespace()
2025-01-27 16:36 ` [PATCH v2 47/65] S/390: " Jan Beulich
@ 2025-01-30 8:38 ` Jens Remus
2025-01-30 9:11 ` Jan Beulich
0 siblings, 1 reply; 106+ messages in thread
From: Jens Remus @ 2025-01-30 8:38 UTC (permalink / raw)
To: Jan Beulich, Binutils; +Cc: Andreas Krebbel
On 27.01.2025 17:36, Jan Beulich wrote:
> Convert ISSPACE() uses. At the same time use is_end_of_stmt() instead
> of kind-of-open-coded checks in adjacent code.
Thanks! LGTM.
> --- a/gas/config/tc-s390.c
> +++ b/gas/config/tc-s390.c
> @@ -1916,7 +1916,7 @@ md_assemble (char *str)
> char *s;
>
> /* Get the opcode. */
> - for (s = str; *s != '\0' && ! ISSPACE (*s); s++)
> + for (s = str; ! is_end_of_stmt (*s) && ! is_whitespace (*s); s++)
> ;
> if (*s != '\0')
> *s++ = '\0';
I wonder whether I should look into whether to convert all of
those checks against '\0' to is_end_of_stmt()? What are your
thoughts?
Thanks and regards,
Jens
--
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com
IBM
IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [PATCH v2 47/65] S/390: use is_whitespace()
2025-01-30 8:38 ` Jens Remus
@ 2025-01-30 9:11 ` Jan Beulich
0 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-30 9:11 UTC (permalink / raw)
To: Jens Remus; +Cc: Andreas Krebbel, Binutils
On 30.01.2025 09:38, Jens Remus wrote:
>> --- a/gas/config/tc-s390.c
>> +++ b/gas/config/tc-s390.c
>
>> @@ -1916,7 +1916,7 @@ md_assemble (char *str)
>> char *s;
>>
>> /* Get the opcode. */
>> - for (s = str; *s != '\0' && ! ISSPACE (*s); s++)
>> + for (s = str; ! is_end_of_stmt (*s) && ! is_whitespace (*s); s++)
>> ;
>> if (*s != '\0')
>> *s++ = '\0';
>
> I wonder whether I should look into whether to convert all of
> those checks against '\0' to is_end_of_stmt()? What are your
> thoughts?
Quite likely, unless I didn't spot where the nul char is put in place
(and I will admit I didn't try very hard, as that wouldn't have scaled
very well with the many architectures that needed touching; IOW I went
from the assumption that it's better to go a little too far with
converting, which can always be undone later on by arch maintainers).
Jan
^ permalink raw reply [flat|nested] 106+ messages in thread
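To make the question in this sub-thread concrete: if the string handed to md_assemble() is not guaranteed to carry a nul exactly at the end of the statement, a scan that only tests for '\0' could run past a statement separator. Whether gas already places a nul there is the open point in the reply above. The sketch below assumes a hypothetical my_is_end_of_stmt() that treats nul, newline and ';' as statement ends; that set is an assumption for illustration and not necessarily what any given target uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gas predicates (assumptions).  */
static bool my_is_whitespace (char c)  { return c == ' ' || c == '\t'; }
static bool my_is_end_of_stmt (char c) { return c == '\0' || c == '\n' || c == ';'; }

/* Length of the opcode at the start of STR, stopping at whitespace or
   at the end of the statement, whichever comes first.  */
static size_t
opcode_len (const char *str)
{
  const char *s;

  for (s = str; !my_is_end_of_stmt (*s) && !my_is_whitespace (*s); s++)
    ;
  return (size_t) (s - str);
}
```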
* [PATCH v2 48/65] Score: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (46 preceding siblings ...)
2025-01-27 16:36 ` [PATCH v2 47/65] S/390: " Jan Beulich
@ 2025-01-27 16:37 ` Jan Beulich
2025-01-27 16:37 ` [PATCH v2 49/65] SH: " Jan Beulich
` (18 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:37 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-score.c
+++ b/gas/config/tc-score.c
@@ -259,7 +259,7 @@ const size_t md_longopts_size = sizeof (
#define s3_BAD_SKIP_COMMA s3_BAD_ARGS
#define s3_BAD_GARBAGE _("garbage following instruction");
-#define s3_skip_whitespace(str) while (*(str) == ' ') ++(str)
+#define s3_skip_whitespace(str) while (is_whitespace (*(str))) ++(str)
/* The name of the readonly data section. */
#define s3_RDATA_SECTION_NAME (OUTPUT_FLAVOR == bfd_target_aout_flavour \
@@ -1099,7 +1099,7 @@ s3_skip_past_comma (char **str)
char c;
int comma = 0;
- while ((c = *p) == ' ' || c == ',')
+ while (is_whitespace (c = *p) || c == ',')
{
p++;
if (c == ',' && comma++)
@@ -1376,7 +1376,7 @@ s3_data_op2 (char **str, int shift, enum
for (; *dataptr != '\0'; dataptr++)
{
*dataptr = TOLOWER (*dataptr);
- if (*dataptr == '!' || *dataptr == ' ')
+ if (*dataptr == '!' || is_whitespace (*dataptr))
break;
}
dataptr = (char *)data_exp;
@@ -2650,7 +2650,7 @@ s3_parse_16_32_inst (char *insnstr, bool
s3_skip_whitespace (operator);
for (p = operator; *p != '\0'; p++)
- if ((*p == ' ') || (*p == '!'))
+ if (is_whitespace (*p) || (*p == '!'))
break;
if (*p == '!')
@@ -2700,7 +2700,7 @@ s3_parse_48_inst (char *insnstr, bool ge
s3_skip_whitespace (operator);
for (p = operator; *p != '\0'; p++)
- if (*p == ' ')
+ if (is_whitespace (*p))
break;
c = *p;
--- a/gas/config/tc-score7.c
+++ b/gas/config/tc-score7.c
@@ -104,7 +104,7 @@ static void s7_do_lw_pic (char *);
#define s7_BAD_SKIP_COMMA s7_BAD_ARGS
#define s7_BAD_GARBAGE _("garbage following instruction");
-#define s7_skip_whitespace(str) while (*(str) == ' ') ++(str)
+#define s7_skip_whitespace(str) while (is_whitespace (*(str))) ++(str)
/* The name of the readonly data section. */
#define s7_RDATA_SECTION_NAME (OUTPUT_FLAVOR == bfd_target_aout_flavour \
@@ -1187,7 +1187,7 @@ s7_skip_past_comma (char **str)
char c;
int comma = 0;
- while ((c = *p) == ' ' || c == ',')
+ while (is_whitespace (c = *p) || c == ',')
{
p++;
if (c == ',' && comma++)
@@ -1501,7 +1501,7 @@ s7_data_op2 (char **str, int shift, enum
for (; *dataptr != '\0'; dataptr++)
{
*dataptr = TOLOWER (*dataptr);
- if (*dataptr == '!' || *dataptr == ' ')
+ if (*dataptr == '!' || is_whitespace (*dataptr))
break;
}
dataptr = (char *) data_exp;
@@ -2781,7 +2781,7 @@ s7_parse_16_32_inst (char *insnstr, bool
s7_skip_whitespace (operator);
for (p = operator; *p != '\0'; p++)
- if ((*p == ' ') || (*p == '!'))
+ if (is_whitespace (*p) || (*p == '!'))
break;
if (*p == '!')
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 49/65] SH: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (47 preceding siblings ...)
2025-01-27 16:37 ` [PATCH v2 48/65] Score: " Jan Beulich
@ 2025-01-27 16:37 ` Jan Beulich
2025-02-07 6:54 ` Alexandre Oliva
2025-01-27 16:38 ` [PATCH v2 50/65] Sparc: " Jan Beulich
` (17 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:37 UTC (permalink / raw)
To: Binutils; +Cc: Alexandre Oliva
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks as well as an
ISSPACE() use. At the same time use is_end_of_stmt() instead of
(kind-of-)open-coded checks in adjacent code.
---
v2: New.
--- a/gas/config/tc-sh.c
+++ b/gas/config/tc-sh.c
@@ -1232,7 +1232,7 @@ get_operands (sh_opcode_info *info, char
/* The pre-processor will eliminate whitespace in front of '@'
after the first argument; we may be called multiple times
from assemble_ppi, so don't insist on finding whitespace here. */
- if (*ptr == ' ')
+ if (is_whitespace (*ptr))
ptr++;
get_operand (&ptr, operand + 0);
@@ -2151,7 +2151,7 @@ find_cooked_opcode (char **str_p)
unsigned int nlen = 0;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the op code end.
@@ -2159,9 +2159,8 @@ find_cooked_opcode (char **str_p)
any '@' after the first argument; we may be called from
assemble_ppi, so the opcode might be terminated by an '@'. */
for (op_start = op_end = (unsigned char *) str;
- *op_end
- && nlen < sizeof (name) - 1
- && !is_end_of_line[*op_end] && *op_end != ' ' && *op_end != '@';
+ nlen < sizeof (name) - 1
+ && !is_end_of_stmt (*op_end) && !is_whitespace (*op_end) && *op_end != '@';
op_end++)
{
unsigned char c = op_start[nlen];
@@ -2515,10 +2514,11 @@ md_assemble (char *str)
bool found = false;
/* Identify opcode in string. */
- while (ISSPACE (*name))
+ while (is_whitespace (*name))
name++;
- while (name[name_length] != '\0' && !ISSPACE (name[name_length]))
+ while (!is_end_of_stmt (name[name_length])
+ && !is_whitespace (name[name_length]))
name_length++;
/* Search for opcode in full list. */
@@ -2577,7 +2577,7 @@ md_assemble (char *str)
{
/* Ignore trailing whitespace. If there is any, it has already
been compressed to a single space. */
- if (*op_end == ' ')
+ if (is_whitespace (*op_end))
op_end++;
}
else
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 50/65] Sparc: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (48 preceding siblings ...)
2025-01-27 16:37 ` [PATCH v2 49/65] SH: " Jan Beulich
@ 2025-01-27 16:38 ` Jan Beulich
2025-01-27 16:38 ` [PATCH v2 51/65] spu: " Jan Beulich
` (16 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:38 UTC (permalink / raw)
To: Binutils; +Cc: David S. Miller, Jose E. Marchesi
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-sparc.c
+++ b/gas/config/tc-sparc.c
@@ -1743,13 +1743,15 @@ sparc_ip (char *str, const struct sparc_
case ',':
comma = 1;
- /* Fall through. */
-
- case ' ':
*s++ = '\0';
break;
default:
+ if (is_whitespace (*s))
+ {
+ *s++ = '\0';
+ break;
+ }
as_bad (_("Unknown opcode: `%s'"), str);
*pinsn = NULL;
return special_case;
@@ -1798,11 +1800,11 @@ sparc_ip (char *str, const struct sparc_
goto error;
}
kmask |= jmask;
- while (*s == ' ')
+ while (is_whitespace (*s))
++s;
if (*s == '|' || *s == '+')
++s;
- while (*s == ' ')
+ while (is_whitespace (*s))
++s;
}
}
@@ -2039,7 +2041,7 @@ sparc_ip (char *str, const struct sparc_
goto immediate;
case ')':
- if (*s == ' ')
+ if (is_whitespace (*s))
s++;
if ((s[0] == '0' && s[1] == 'x' && ISXDIGIT (s[2]))
|| ISDIGIT (*s))
@@ -2131,7 +2133,7 @@ sparc_ip (char *str, const struct sparc_
break;
case 'z':
- if (*s == ' ')
+ if (is_whitespace (*s))
{
++s;
}
@@ -2144,7 +2146,7 @@ sparc_ip (char *str, const struct sparc_
break;
case 'Z':
- if (*s == ' ')
+ if (is_whitespace (*s))
{
++s;
}
@@ -2157,7 +2159,7 @@ sparc_ip (char *str, const struct sparc_
break;
case '6':
- if (*s == ' ')
+ if (is_whitespace (*s))
{
++s;
}
@@ -2169,7 +2171,7 @@ sparc_ip (char *str, const struct sparc_
break;
case '7':
- if (*s == ' ')
+ if (is_whitespace (*s))
{
++s;
}
@@ -2181,7 +2183,7 @@ sparc_ip (char *str, const struct sparc_
break;
case '8':
- if (*s == ' ')
+ if (is_whitespace (*s))
{
++s;
}
@@ -2193,7 +2195,7 @@ sparc_ip (char *str, const struct sparc_
break;
case '9':
- if (*s == ' ')
+ if (is_whitespace (*s))
{
++s;
}
@@ -2303,11 +2305,15 @@ sparc_ip (char *str, const struct sparc_
case '[': /* These must match exactly. */
case ']':
case ',':
- case ' ':
if (*s++ == *args)
continue;
break;
+ case ' ':
+ if (is_whitespace (*s++))
+ continue;
+ break;
+
case '#': /* Must be at least one digit. */
if (ISDIGIT (*s++))
{
@@ -2680,7 +2686,7 @@ sparc_ip (char *str, const struct sparc_
/* fallthrough */
immediate:
- if (*s == ' ')
+ if (is_whitespace (*s))
s++;
{
^ permalink raw reply [flat|nested] 106+ messages in thread
* [PATCH v2 51/65] spu: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (49 preceding siblings ...)
2025-01-27 16:38 ` [PATCH v2 50/65] Sparc: " Jan Beulich
@ 2025-01-27 16:38 ` Jan Beulich
2025-01-27 16:39 ` [PATCH v2 52/65] C30: " Jan Beulich
` (15 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:38 UTC (permalink / raw)
To: Binutils; +Cc: Alan Modra
Convert ISSPACE() uses. At the same time use is_end_of_stmt() instead
of a kind-of-open-coded check in adjacent code.
---
v2: New.
--- a/gas/config/tc-spu.c
+++ b/gas/config/tc-spu.c
@@ -263,7 +263,7 @@ md_assemble (char *op)
/* skip over instruction to find parameters */
- for (param = op; *param != 0 && !ISSPACE (*param); param++)
+ for (param = op; !is_end_of_stmt (*param) && !is_whitespace (*param); param++)
;
c = *param;
*param = 0;
@@ -388,7 +388,7 @@ calcop (struct spu_opcode *format, const
arg = format->arg[i];
syntax_error_arg = i;
- while (ISSPACE (*param))
+ while (is_whitespace (*param))
param++;
if (*param == 0 || *param == ',')
return 0;
@@ -406,7 +406,7 @@ calcop (struct spu_opcode *format, const
if (!param)
return 0;
- while (ISSPACE (*param))
+ while (is_whitespace (*param))
param++;
if (arg != A_P && paren)
@@ -426,7 +426,7 @@ calcop (struct spu_opcode *format, const
}
}
}
- while (ISSPACE (*param))
+ while (is_whitespace (*param))
param++;
return !paren && (*param == 0 || *param == '\n');
}
* [PATCH v2 52/65] C30: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (50 preceding siblings ...)
2025-01-27 16:38 ` [PATCH v2 51/65] spu: " Jan Beulich
@ 2025-01-27 16:39 ` Jan Beulich
2025-01-27 16:40 ` [PATCH v2 53/65] C4x: " Jan Beulich
` (14 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:39 UTC (permalink / raw)
To: Binutils
Convert an open-coded check.
---
v2: New.
--- a/gas/config/tc-tic30.c
+++ b/gas/config/tc-tic30.c
@@ -180,7 +180,7 @@ md_begin (void)
if (ISALPHA (c) || c == '_' || c == '.' || ISDIGIT (c))
identifier_chars[c] = c;
- if (c == ' ' || c == '\t')
+ if (is_whitespace (c))
space_chars[c] = c;
if (c == '_')
* [PATCH v2 53/65] C4x: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (51 preceding siblings ...)
2025-01-27 16:39 ` [PATCH v2 52/65] C30: " Jan Beulich
@ 2025-01-27 16:40 ` Jan Beulich
2025-01-27 16:40 ` [PATCH v2 54/65] C54x: " Jan Beulich
` (13 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:40 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). At the same time use is_end_of_stmt() instead of
kind-of-open-coded checks in adjacent code.
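A minimal sketch of the problem being addressed, assuming (as I understand the option) that -f makes gas skip the preprocessing that would otherwise reduce whitespace to single blanks. The function names and the sample mnemonic are hypothetical and only model the two loop shapes seen in the hunks below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gas's is_whitespace(): blank and tab only.  */
static bool
sketch_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Old-style scan: the mnemonic ends only at NUL or a blank.  */
size_t
mnemonic_len_blank_only (const char *s)
{
  size_t n = 0;

  while (s[n] && s[n] != ' ')
    n++;
  return n;
}

/* New-style scan: the mnemonic ends at NUL or any whitespace.  */
size_t
mnemonic_len_any_ws (const char *s)
{
  size_t n = 0;

  while (s[n] && !sketch_is_whitespace (s[n]))
    n++;
  return n;
}

/* With a blank separator both scans agree; with a tab, the old scan runs
   through the operands to the end of the string.  */
void
sketch_demo (void)
{
  assert (mnemonic_len_blank_only ("ldi r0,r1") == 3);
  assert (mnemonic_len_any_ws ("ldi r0,r1") == 3);
  assert (mnemonic_len_blank_only ("ldi\tr0,r1") == 9);
  assert (mnemonic_len_any_ws ("ldi\tr0,r1") == 3);
}
```

In the blank-only variant the whole of "ldi\tr0,r1" would be taken as the mnemonic, so the opcode lookup would fail for otherwise valid input.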
---
v2: New.
--- a/gas/config/tc-tic4x.c
+++ b/gas/config/tc-tic4x.c
@@ -1472,7 +1472,7 @@ tic4x_indirect_parse (tic4x_operand_t *o
s++;
}
}
- if (*s != ' ' && *s != ',' && *s != '\0')
+ if (!is_whitespace (*s) && *s != ',' && !is_end_of_stmt (*s))
return 0;
input_line_pointer = s;
return 1;
@@ -2428,7 +2428,7 @@ md_assemble (char *str)
/* Find mnemonic (second part of parallel instruction). */
s = str;
/* Skip past instruction mnemonic. */
- while (*s && *s != ' ')
+ while (!is_end_of_stmt (*s) && !is_whitespace (*s))
s++;
if (*s) /* Null terminate for str_hash_find. */
*s++ = '\0'; /* and skip past null. */
@@ -2492,7 +2492,7 @@ md_assemble (char *str)
{
/* Find mnemonic. */
s = str;
- while (*s && *s != ' ') /* Skip past instruction mnemonic. */
+ while (!is_end_of_stmt (*s) && !is_whitespace (*s)) /* Skip past instruction mnemonic. */
s++;
if (*s) /* Null terminate for str_hash_find. */
*s++ = '\0'; /* and skip past null. */
* [PATCH v2 54/65] C54x: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (52 preceding siblings ...)
2025-01-27 16:40 ` [PATCH v2 53/65] C4x: " Jan Beulich
@ 2025-01-27 16:40 ` Jan Beulich
2025-01-27 16:41 ` [PATCH v2 55/65] C6x: " Jan Beulich
` (12 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:40 UTC (permalink / raw)
To: Binutils; +Cc: Timothy Wall
Convert ISSPACE() uses. At the same time use is_end_of_stmt() instead
of open-coded checks in adjacent code. is_end_of_stmt() also needs to be
used in next_line_shows_parallel().
---
In next_line_shows_parallel() it's not really clear to me whether
is_end_of_stmt() is to be used, or whether this is one of the very rare
cases where is_end_of_line() really is what's needed. In practice, as
long as line_separator_chars[] is empty, there's no difference between
the two.
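The equivalence claimed here can be sketched with a simplified model. Everything below is hypothetical: the real predicates are table-driven, and the separator set is passed in explicitly only so that the effect of an empty versus non-empty line_separator_chars[] is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: only NUL and newline end a line.  */
bool
sketch_is_end_of_line (char c)
{
  return c == '\0' || c == '\n';
}

/* Hypothetical model: a statement ends at end of line or at any of the
   target's line separator characters.  NUL is handled by the first test,
   so strchr() matching the terminator never matters.  */
bool
sketch_is_end_of_stmt (char c, const char *line_separators)
{
  return sketch_is_end_of_line (c) || strchr (line_separators, c) != NULL;
}

/* Count the ASCII characters on which the two predicates disagree.  */
int
count_disagreements (const char *line_separators)
{
  int n = 0;

  for (int c = 0; c < 128; c++)
    if (sketch_is_end_of_line ((char) c)
	!= sketch_is_end_of_stmt ((char) c, line_separators))
      n++;
  return n;
}

void
sketch_demo (void)
{
  assert (count_disagreements ("") == 0);   /* Empty set: no difference.  */
  assert (count_disagreements (";") == 1);  /* Only ';' differs.  */
}
```

In this model the choice between the two predicates only matters for a target that defines at least one line separator.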
I'd like to note that the way next_line_shows_parallel() works is
unreliable: input_scrub_next_buffer() may have broken up input just
between the two lines, in which case parallel_on_next_line_hint will be
set to false even though the next line starts with ||.
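A minimal sketch of that failure mode, under the assumption that the look-ahead can only inspect text already present in the current buffer. All names are hypothetical, the separator value is assumed to be '|', and the buffer refill is modelled by simply copying a prefix of the input.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_PARALLEL_SEPARATOR '|'	/* Assumed value.  */

static bool
sketch_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Look-ahead restricted to the NUL-terminated text at NEXT_LINE.  */
bool
sketch_next_line_shows_parallel (const char *next_line)
{
  while (sketch_is_whitespace (*next_line) || *next_line == '\n')
    ++next_line;
  return (next_line[0] == SKETCH_PARALLEL_SEPARATOR
	  && next_line[1] == SKETCH_PARALLEL_SEPARATOR);
}

/* Copy the first SPLIT bytes of INPUT into BUF, as a buffer refill might,
   then peek past the first line using only what is in BUF.  */
bool
sketch_hint_with_split (const char *input, size_t split,
			char *buf, size_t bufsize)
{
  size_t len = strlen (input);
  const char *eol;

  if (bufsize == 0)
    return false;
  if (split > len)
    split = len;
  if (split >= bufsize)
    split = bufsize - 1;
  memcpy (buf, input, split);
  buf[split] = '\0';

  eol = strchr (buf, '\n');
  return eol != NULL && sketch_next_line_shows_parallel (eol + 1);
}

void
sketch_demo (void)
{
  char buf[64];
  const char *input = "add a,b\n||sub c,d";

  /* Whole input in one buffer: the second half is seen.  */
  assert (sketch_hint_with_split (input, strlen (input), buf, sizeof buf));
  /* Buffer ends right after the first newline: the hint is lost.  */
  assert (!sketch_hint_with_split (input, 8, buf, sizeof buf));
}
```

The second case gives the wrong answer purely because of where the buffer boundary happens to fall, not because of anything in the source text.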
---
v2: New.
--- a/gas/config/tc-tic54x.c
+++ b/gas/config/tc-tic54x.c
@@ -2327,8 +2327,8 @@ tic54x_mlib (int ignore ATTRIBUTE_UNUSED
{
SKIP_WHITESPACE ();
len = 0;
- while (!is_end_of_line[(unsigned char) *input_line_pointer]
- && !ISSPACE (*input_line_pointer))
+ while (!is_end_of_stmt (*input_line_pointer)
+ && !is_whitespace (*input_line_pointer))
{
obstack_1grow (&notes, *input_line_pointer);
++input_line_pointer;
@@ -3109,7 +3109,7 @@ get_operands (struct opstruct operands[]
int paren_not_balanced = 0;
char *op_start, *op_end;
- while (*lptr && ISSPACE (*lptr))
+ while (is_whitespace (*lptr))
++lptr;
op_start = lptr;
while (paren_not_balanced || *lptr != ',')
@@ -3140,7 +3140,7 @@ get_operands (struct opstruct operands[]
/* Trim trailing spaces; while the preprocessor gets rid of most,
there are weird usage patterns that can introduce them
(i.e. using strings for macro args). */
- while (len > 0 && ISSPACE (operands[numexp].buf[len - 1]))
+ while (len > 0 && is_whitespace (operands[numexp].buf[len - 1]))
operands[numexp].buf[--len] = 0;
lptr = op_end;
++numexp;
@@ -3164,8 +3164,8 @@ get_operands (struct opstruct operands[]
}
}
- while (*lptr && ISSPACE (*lptr++))
- ;
+ while (is_whitespace (*lptr))
+ ++lptr;
if (!is_end_of_line[(unsigned char) *lptr])
{
as_bad (_("Extra junk on line"));
@@ -4218,7 +4218,8 @@ static int
next_line_shows_parallel (char *next_line)
{
/* Look for the second half. */
- while (*next_line != 0 && ISSPACE (*next_line))
+ while (*next_line != 0
+ && (is_whitespace (*next_line) || is_end_of_stmt (*next_line)))
++next_line;
return (next_line[0] == PARALLEL_SEPARATOR
@@ -4804,7 +4805,7 @@ tic54x_start_line_hook (void)
comment = replacement + strlen (replacement) - 1;
/* Trim trailing whitespace. */
- while (ISSPACE (*comment))
+ while (is_whitespace (*comment))
{
comment[0] = endc;
comment[1] = 0;
@@ -4812,7 +4813,7 @@ tic54x_start_line_hook (void)
}
/* Compact leading whitespace. */
- while (ISSPACE (tmp[0]) && ISSPACE (tmp[1]))
+ while (is_whitespace (tmp[0]) && is_whitespace (tmp[1]))
++tmp;
input_line_pointer = endp;
@@ -4915,7 +4916,7 @@ md_assemble (char *line)
otherwise let the assembler pick up the next line for us. */
if (tmp != NULL)
{
- while (ISSPACE (tmp[2]))
+ while (is_whitespace (tmp[2]))
++tmp;
md_assemble (tmp + 2);
}
@@ -5387,16 +5388,16 @@ tic54x_start_label (char * label_start,
rest = input_line_pointer;
if (nul_char == '"')
++rest;
- while (ISSPACE (next_char))
+ while (is_whitespace (next_char))
next_char = *++rest;
if (next_char != '.')
return 1;
/* Don't let colon () define a label for any of these... */
- return ((strncasecmp (rest, ".tag", 4) != 0 || !ISSPACE (rest[4]))
- && (strncasecmp (rest, ".struct", 7) != 0 || !ISSPACE (rest[7]))
- && (strncasecmp (rest, ".union", 6) != 0 || !ISSPACE (rest[6]))
- && (strncasecmp (rest, ".macro", 6) != 0 || !ISSPACE (rest[6]))
- && (strncasecmp (rest, ".set", 4) != 0 || !ISSPACE (rest[4]))
- && (strncasecmp (rest, ".equ", 4) != 0 || !ISSPACE (rest[4])));
+ return ((strncasecmp (rest, ".tag", 4) != 0 || !is_whitespace (rest[4]))
+ && (strncasecmp (rest, ".struct", 7) != 0 || !is_whitespace (rest[7]))
+ && (strncasecmp (rest, ".union", 6) != 0 || !is_whitespace (rest[6]))
+ && (strncasecmp (rest, ".macro", 6) != 0 || !is_whitespace (rest[6]))
+ && (strncasecmp (rest, ".set", 4) != 0 || !is_whitespace (rest[4]))
+ && (strncasecmp (rest, ".equ", 4) != 0 || !is_whitespace (rest[4])));
}
* [PATCH v2 55/65] C6x: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (53 preceding siblings ...)
2025-01-27 16:40 ` [PATCH v2 54/65] C54x: " Jan Beulich
@ 2025-01-27 16:41 ` Jan Beulich
2025-01-27 16:42 ` [PATCH v2 56/65] v850: " Jan Beulich
` (11 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:41 UTC (permalink / raw)
To: Binutils; +Cc: Joseph Myers
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert an ISSPACE() use. At the same time use
is_end_of_stmt() instead of open-coded checks in adjacent code.
---
v2: New.
--- a/gas/config/tc-tic6x.c
+++ b/gas/config/tc-tic6x.c
@@ -489,7 +489,8 @@ s_tic6x_arch (int ignored ATTRIBUTE_UNUS
char *arch;
arch = input_line_pointer;
- while (*input_line_pointer && !ISSPACE (*input_line_pointer))
+ while (!is_end_of_stmt (*input_line_pointer)
+ && !is_whitespace (*input_line_pointer))
input_line_pointer++;
c = *input_line_pointer;
*input_line_pointer = 0;
@@ -1180,7 +1181,7 @@ typedef struct
} value;
} tic6x_operand;
-#define skip_whitespace(str) do { if (*(str) == ' ') ++(str); } while (0)
+#define skip_whitespace(str) do { if (is_whitespace (*(str))) ++(str); } while (0)
/* Parse a register operand, or part of an operand, starting at *P.
If syntactically OK (including that the number is in the range 0 to
@@ -3148,7 +3149,7 @@ md_assemble (char *str)
char *output;
p = str;
- while (*p && !is_end_of_line[(unsigned char) *p] && *p != ' ')
+ while (!is_end_of_stmt (*p) && !is_whitespace (*p))
p++;
/* This function should only have been called when there is actually
@@ -3208,10 +3209,10 @@ md_assemble (char *str)
if (good_func_unit)
{
- if (p[3] == ' ' || is_end_of_line[(unsigned char) p[3]])
+ if (is_whitespace (p[3]) || is_end_of_stmt (p[3]))
p += 3;
else if ((p[3] == 'x' || p[3] == 'X')
- && (p[4] == ' ' || is_end_of_line[(unsigned char) p[4]]))
+ && (is_whitespace (p[4]) || is_end_of_stmt (p[4])))
{
maybe_cross = 1;
p += 4;
@@ -3219,7 +3220,7 @@ md_assemble (char *str)
else if (maybe_base == tic6x_func_unit_d
&& (p[3] == 't' || p[3] == 'T')
&& (p[4] == '1' || p[4] == '2')
- && (p[5] == ' ' || is_end_of_line[(unsigned char) p[5]]))
+ && (is_whitespace (p[5]) || is_end_of_stmt (p[5])))
{
maybe_data_side = p[4] - '0';
p += 5;
* [PATCH v2 56/65] v850: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (54 preceding siblings ...)
2025-01-27 16:41 ` [PATCH v2 55/65] C6x: " Jan Beulich
@ 2025-01-27 16:42 ` Jan Beulich
2025-01-27 16:42 ` [PATCH v2 57/65] VAX: " Jan Beulich
` (10 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:42 UTC (permalink / raw)
To: Binutils
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert open-coded checks as well as ISSPACE()
uses. At the same time use is_end_of_stmt() instead of a kind-of-open-
coded check in adjacent code.
---
v2: New.
--- a/gas/config/tc-v850.c
+++ b/gas/config/tc-v850.c
@@ -1338,8 +1338,7 @@ vector_register_name (expressionS *expre
static void
skip_white_space (void)
{
- while (*input_line_pointer == ' '
- || *input_line_pointer == '\t')
+ while (is_whitespace (*input_line_pointer))
++input_line_pointer;
}
@@ -2306,7 +2305,7 @@ md_assemble (char *str)
most_match_errmsg[0] = 0;
/* Get the opcode. */
- for (s = str; *s != '\0' && ! ISSPACE (*s); s++)
+ for (s = str; ! is_end_of_stmt (*s) && ! is_whitespace (*s); s++)
continue;
if (*s != '\0')
@@ -2323,7 +2322,7 @@ md_assemble (char *str)
}
str = s;
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
start_of_operands = str;
@@ -2384,7 +2383,7 @@ md_assemble (char *str)
errmsg = NULL;
- while (*str == ' ')
+ while (is_whitespace (*str))
++str;
if (operand->flags & V850_OPERAND_BANG
@@ -2397,7 +2396,7 @@ md_assemble (char *str)
if (*str == ',' || *str == '[' || *str == ']')
++str;
- while (*str == ' ')
+ while (is_whitespace (*str))
++str;
if ( (strcmp (opcode->name, "pushsp") == 0
@@ -2792,7 +2791,7 @@ md_assemble (char *str)
str = input_line_pointer;
input_line_pointer = hold;
- while (*str == ' ' || *str == ','
+ while (is_whitespace (*str) || *str == ','
|| *str == '[' || *str == ']')
++str;
continue;
@@ -2996,12 +2995,12 @@ md_assemble (char *str)
str = input_line_pointer;
input_line_pointer = hold;
- while (*str == ' ' || *str == ',' || *str == '[' || *str == ']'
+ while (is_whitespace (*str) || *str == ',' || *str == '[' || *str == ']'
|| *str == ')')
++str;
}
- while (ISSPACE (*str))
+ while (is_whitespace (*str))
++str;
if (*str == '\0')
* [PATCH v2 57/65] VAX: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (55 preceding siblings ...)
2025-01-27 16:42 ` [PATCH v2 56/65] v850: " Jan Beulich
@ 2025-01-27 16:42 ` Jan Beulich
2025-01-27 20:46 ` Jan-Benedict Glaw
2025-01-27 16:43 ` [PATCH v2 58/65] Visium: " Jan Beulich
` (9 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:42 UTC (permalink / raw)
To: Binutils; +Cc: Jan-Benedict Glaw
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
---
v2: New.
--- a/gas/config/tc-vax.c
+++ b/gas/config/tc-vax.c
@@ -1278,13 +1278,13 @@ vip_op (char *optext, struct vop *vopP)
p = optext;
- if (*p == ' ') /* Expect all whitespace reduced to ' '. */
+ if (is_whitespace (*p))
p++; /* skip over whitespace */
if ((at = INDIRECTP (*p)) != 0)
{ /* 1 if *p=='@'(or '*' for Un*x) */
p++; /* at is determined */
- if (*p == ' ') /* Expect all whitespace reduced to ' '. */
+ if (is_whitespace (*p))
p++; /* skip over whitespace */
}
@@ -1302,7 +1302,7 @@ vip_op (char *optext, struct vop *vopP)
len = ' '; /* Len is determined. */
}
- if (*p == ' ') /* Expect all whitespace reduced to ' '. */
+ if (is_whitespace (*p))
p++;
if ((hash = IMMEDIATEP (*p)) != 0) /* 1 if *p=='#' ('$' for Un*x) */
@@ -1318,7 +1318,7 @@ vip_op (char *optext, struct vop *vopP)
;
q--; /* Now q points at last char of text. */
- if (*q == ' ' && q >= p) /* Expect all whitespace reduced to ' '. */
+ if (is_whitespace (*q) && q >= p)
q--;
/* Reverse over whitespace, but don't. */
@@ -1368,7 +1368,7 @@ vip_op (char *optext, struct vop *vopP)
Otherwise ndx == -1 if there was no "[...]".
Otherwise, ndx is index register number, and q points before "[...]". */
- if (*q == ' ' && q >= p) /* Expect all whitespace reduced to ' '. */
+ if (is_whitespace (*q) && q >= p)
q--;
/* Reverse over whitespace, but don't. */
/* Run back over *p. */
@@ -1454,7 +1454,7 @@ vip_op (char *optext, struct vop *vopP)
We remember to save q, in case we didn't want "Rn" anyway. */
if (!paren)
{
- if (*q == ' ' && q >= p) /* Expect all whitespace reduced to ' '. */
+ if (is_whitespace (*q) && q >= p)
q--;
/* Reverse over whitespace, but don't. */
/* Run back over *p. */
@@ -1860,11 +1860,11 @@ vip (struct vit *vitP, /* We build an e
/* Op-code of this instruction. */
vax_opcodeT oc;
- if (*instring == ' ')
+ if (is_whitespace (*instring))
++instring;
/* MUST end in end-of-string or exactly 1 space. */
- for (p = instring; *p && *p != ' '; p++)
+ for (p = instring; *p && !is_whitespace (*p); p++)
;
/* Scanned up to end of operation-code. */
@@ -1939,7 +1939,7 @@ vip (struct vit *vitP, /* We build an e
}
if (!*alloperr)
{
- if (*instring == ' ')
+ if (is_whitespace (*instring))
instring++;
if (*instring)
alloperr = _("Too many operands");
* Re: [PATCH v2 57/65] VAX: use is_whitespace()
2025-01-27 16:42 ` [PATCH v2 57/65] VAX: " Jan Beulich
@ 2025-01-27 20:46 ` Jan-Benedict Glaw
0 siblings, 0 replies; 106+ messages in thread
From: Jan-Benedict Glaw @ 2025-01-27 20:46 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, 2025-01-27 17:42:54 +0100, Jan Beulich <jbeulich@suse.com> wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input).
> ---
> v2: New.
>
> --- a/gas/config/tc-vax.c
> +++ b/gas/config/tc-vax.c
> @@ -1278,13 +1278,13 @@ vip_op (char *optext, struct vop *vopP)
>
> p = optext;
>
> - if (*p == ' ') /* Expect all whitespace reduced to ' '. */
> + if (is_whitespace (*p))
[...]
Thanks a lot, please go for it!
Best regards, JBG
* [PATCH v2 58/65] Visium: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (56 preceding siblings ...)
2025-01-27 16:42 ` [PATCH v2 57/65] VAX: " Jan Beulich
@ 2025-01-27 16:43 ` Jan Beulich
2025-01-27 16:43 ` [PATCH v2 59/65] wasm32: " Jan Beulich
` (8 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:43 UTC (permalink / raw)
To: Binutils; +Cc: Eric Botcazou
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). Also convert an open-coded check.
---
v2: New.
--- a/gas/config/tc-visium.c
+++ b/gas/config/tc-visium.c
@@ -866,7 +866,7 @@ md_atof (int type, char *litP, int *size
static inline char *
skip_space (char *s)
{
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
++s;
return s;
@@ -1029,7 +1029,7 @@ md_assemble (char *str0)
this_dest = 0;
/* Drop leading whitespace (probably not required). */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Get opcode mnemonic and make sure it's in lower case. */
* [PATCH v2 59/65] wasm32: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (57 preceding siblings ...)
2025-01-27 16:43 ` [PATCH v2 58/65] Visium: " Jan Beulich
@ 2025-01-27 16:43 ` Jan Beulich
2025-01-27 16:44 ` [PATCH v2 60/65] x86: " Jan Beulich
` (7 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:43 UTC (permalink / raw)
To: Binutils
Convert an open-coded check.
---
v2: New.
--- a/gas/config/tc-wasm32.c
+++ b/gas/config/tc-wasm32.c
@@ -229,7 +229,7 @@ md_apply_fix (fixS * fixP, valueT * valP
static inline char *
skip_space (char *s)
{
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
++s;
return s;
}
* [PATCH v2 60/65] x86: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (58 preceding siblings ...)
2025-01-27 16:43 ` [PATCH v2 59/65] wasm32: " Jan Beulich
@ 2025-01-27 16:44 ` Jan Beulich
2025-01-27 16:45 ` [PATCH v2 61/65] xgate: " Jan Beulich
` (6 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:44 UTC (permalink / raw)
To: Binutils; +Cc: H.J. Lu
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input).
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -613,7 +613,6 @@ static char operand_chars[256];
/* Lexical macros. */
#define is_operand_char(x) (operand_chars[(unsigned char) x])
#define is_register_char(x) (register_chars[(unsigned char) x])
-#define is_space_char(x) ((x) == ' ')
/* All non-digit non-letter characters that may occur in an operand and
which aren't already in extra_symbol_chars[]. */
@@ -2115,7 +2114,7 @@ check_Scc_OszcOperations (const char *l)
{
const char *suffix_string = l;
- while (is_space_char (*suffix_string))
+ while (is_whitespace (*suffix_string))
suffix_string++;
/* If {oszc flags} is absent, just return. */
@@ -2126,7 +2125,7 @@ check_Scc_OszcOperations (const char *l)
suffix_string++;
/* Parse 'dfv='. */
- while (is_space_char (*suffix_string))
+ while (is_whitespace (*suffix_string))
suffix_string++;
if (strncasecmp (suffix_string, "dfv", 3) == 0)
@@ -2137,7 +2136,7 @@ check_Scc_OszcOperations (const char *l)
return -1;
}
- while (is_space_char (*suffix_string))
+ while (is_whitespace (*suffix_string))
suffix_string++;
if (*suffix_string == '=')
@@ -2151,7 +2150,7 @@ check_Scc_OszcOperations (const char *l)
/* Parse 'of, sf, zf, cf}'. */
while (*suffix_string)
{
- while (is_space_char (*suffix_string))
+ while (is_whitespace (*suffix_string))
suffix_string++;
/* Return for '{dfv=}'. */
@@ -2186,7 +2185,7 @@ check_Scc_OszcOperations (const char *l)
suffix_string += 2;
- while (is_space_char (*suffix_string))
+ while (is_whitespace (*suffix_string))
suffix_string++;
if (*suffix_string == '}')
@@ -7568,7 +7567,7 @@ parse_insn (const char *line, char *mnem
{
++mnem_p;
++l;
- if (is_space_char (*l))
+ if (is_whitespace (*l))
++l;
}
else if (mode == parse_pseudo_prefix)
@@ -7587,7 +7586,7 @@ parse_insn (const char *line, char *mnem
l++;
}
split = l;
- if (is_space_char (*l))
+ if (is_whitespace (*l))
++l;
/* Pseudo-prefixes end with a closing figure brace. */
if (*mnemonic == '{' && *l == '}')
@@ -7597,7 +7596,7 @@ parse_insn (const char *line, char *mnem
goto too_long;
*mnem_p = '\0';
- if (is_space_char (*l))
+ if (is_whitespace (*l))
++l;
}
else if (l == split
@@ -7746,7 +7745,7 @@ parse_insn (const char *line, char *mnem
}
/* Skip past PREFIX_SEPARATOR and reset token_start. */
l += (!intel_syntax && *l == PREFIX_SEPARATOR);
- if (is_space_char (*l))
+ if (is_whitespace (*l))
++l;
token_start = l;
}
@@ -7919,10 +7918,10 @@ parse_insn (const char *line, char *mnem
may work in the future and it doesn't hurt to accept them
now. */
token_start = l++;
- if (is_space_char (*l))
+ if (is_whitespace (*l))
++l;
if (TOLOWER (*l) == 'p' && ISALPHA (l[1])
- && (l[2] == END_OF_INSN || is_space_char (l[2])))
+ && (l[2] == END_OF_INSN || is_whitespace (l[2])))
{
if (TOLOWER (l[1]) == 't')
{
@@ -7990,7 +7989,7 @@ parse_operands (char *l, const char *mne
bool in_quotes = false;
/* Skip optional white space before operand. */
- if (is_space_char (*l))
+ if (is_whitespace (*l))
++l;
if (!is_operand_char (*l) && *l != END_OF_INSN && *l != '"')
{
@@ -8024,7 +8023,7 @@ parse_operands (char *l, const char *mne
++l;
else if (*l == '"')
in_quotes = !in_quotes;
- else if (!in_quotes && !is_operand_char (*l) && !is_space_char (*l))
+ else if (!in_quotes && !is_operand_char (*l) && !is_whitespace (*l))
{
as_bad (_("invalid character %s in operand %d"),
output_invalid (*l),
@@ -13155,7 +13154,7 @@ lex_got (enum bfd_reloc_code_real *rel,
be necessary, but be safe. */
tmpbuf = XNEWVEC (char, first + second + 2);
memcpy (tmpbuf, input_line_pointer, first);
- if (second != 0 && *past_reloc != ' ')
+ if (second != 0 && !is_whitespace (*past_reloc))
/* Replace the relocation token with ' ', so that
errors like foo@GOTOFF1 will be detected. */
tmpbuf[first++] = ' ';
@@ -13302,7 +13301,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
i.tm.extension_opcode = None;
if (startswith (line, "VEX")
- && (line[3] == '.' || is_space_char (line[3])))
+ && (line[3] == '.' || is_whitespace (line[3])))
{
vex = true;
line += 3;
@@ -13313,7 +13312,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
unsigned long n = strtoul (line + 3, &e, 16);
if (e == line + 5 && n >= 0x08 && n <= 0x1f
- && (*e == '.' || is_space_char (*e)))
+ && (*e == '.' || is_whitespace (*e)))
{
xop = true;
/* Arrange for build_vex_prefix() to emit 0x8f. */
@@ -13323,7 +13322,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
}
}
else if (startswith (line, "EVEX")
- && (line[4] == '.' || is_space_char (line[4])))
+ && (line[4] == '.' || is_whitespace (line[4])))
{
evex = true;
line += 4;
@@ -13487,14 +13486,14 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
case '0':
if (TOUPPER (line[2]) != 'F')
break;
- if (line[3] == '.' || is_space_char (line[3]))
+ if (line[3] == '.' || is_whitespace (line[3]))
{
i.insn_opcode_space = SPACE_0F;
line += 3;
}
else if (line[3] == '3'
&& (line[4] == '8' || TOUPPER (line[4]) == 'A')
- && (line[5] == '.' || is_space_char (line[5])))
+ && (line[5] == '.' || is_whitespace (line[5])))
{
i.insn_opcode_space = line[4] == '8' ? SPACE_0F38 : SPACE_0F3A;
line += 5;
@@ -13508,7 +13507,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
unsigned long n = strtoul (line + 2, &e, 10);
if (n <= (evex ? 15 : 31)
- && (*e == '.' || is_space_char (*e)))
+ && (*e == '.' || is_whitespace (*e)))
{
i.insn_opcode_space = n;
line = e;
@@ -13544,10 +13543,10 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
line += 3;
}
- if (line > end && *line && !is_space_char (*line))
+ if (line > end && *line && !is_whitespace (*line))
{
/* Improve diagnostic a little. */
- if (*line == '.' && line[1] && !is_space_char (line[1]))
+ if (*line == '.' && line[1] && !is_whitespace (line[1]))
++line;
goto done;
}
@@ -13564,7 +13563,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
break;
if (*ptr == '+' && ptr[1] == 'r'
- && (ptr[2] == ',' || (is_space_char (ptr[2]) && ptr[3] == ',')))
+ && (ptr[2] == ',' || (is_whitespace (ptr[2]) && ptr[3] == ',')))
{
*ptr = ' ';
ptr[1] = ' ';
@@ -13575,7 +13574,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
if (*ptr == '/' && ISDIGIT (ptr[1])
&& (n = strtoul (ptr + 1, &e, 8)) < 8
&& e == ptr + 2
- && (ptr[2] == ',' || (is_space_char (ptr[2]) && ptr[3] == ',')))
+ && (ptr[2] == ',' || (is_whitespace (ptr[2]) && ptr[3] == ',')))
{
*ptr = ' ';
ptr[1] = ' ';
@@ -14181,7 +14180,7 @@ check_VecOperations (char *op_string)
if (*op_string == '{')
{
op_string++;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
op_string++;
/* Check broadcasts. */
@@ -14353,7 +14352,7 @@ check_VecOperations (char *op_string)
else
goto unknown_vec_op;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
op_string++;
if (*op_string != '}')
{
@@ -14362,7 +14361,7 @@ check_VecOperations (char *op_string)
}
op_string++;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
++op_string;
continue;
@@ -14403,7 +14402,7 @@ i386_immediate (char *imm_start)
exp = &im_expressions[i.imm_operands++];
i.op[this_operand].imms = exp;
- if (is_space_char (*imm_start))
+ if (is_whitespace (*imm_start))
++imm_start;
save_input_line_pointer = input_line_pointer;
@@ -15037,14 +15036,14 @@ RC_SAE_immediate (const char *imm_start)
return 0;
pstr++;
- if (is_space_char (*pstr))
+ if (is_whitespace (*pstr))
pstr++;
pstr = RC_SAE_specifier (pstr);
if (pstr == NULL)
return 0;
- if (is_space_char (*pstr))
+ if (is_whitespace (*pstr))
pstr++;
if (*pstr++ != '}')
@@ -15082,7 +15081,7 @@ i386_att_operand (char *operand_string)
char *end_op;
char *op_string = operand_string;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
++op_string;
/* We check for an absolute prefix (differentiating,
@@ -15091,7 +15090,7 @@ i386_att_operand (char *operand_string)
&& current_templates.start->opcode_modifier.jump)
{
++op_string;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
++op_string;
i.jumpabsolute = true;
}
@@ -15107,7 +15106,7 @@ i386_att_operand (char *operand_string)
/* Check for a segment override by searching for ':' after a
segment register. */
op_string = end_op;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
++op_string;
if (*op_string == ':' && r->reg_type.bitfield.class == SReg)
{
@@ -15115,7 +15114,7 @@ i386_att_operand (char *operand_string)
/* Skip the ':' and whitespace. */
++op_string;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
++op_string;
/* Handle case of %es:*foo. */
@@ -15123,7 +15122,7 @@ i386_att_operand (char *operand_string)
&& current_templates.start->opcode_modifier.jump)
{
++op_string;
- if (is_space_char (*op_string))
+ if (is_whitespace (*op_string))
++op_string;
i.jumpabsolute = true;
}
@@ -15234,7 +15233,7 @@ i386_att_operand (char *operand_string)
/* Handle vector operations. */
--base_string;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
--base_string;
if (*base_string == '}')
@@ -15251,7 +15250,7 @@ i386_att_operand (char *operand_string)
vop_start = base_string;
--base_string;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
--base_string;
if (*base_string != '}')
@@ -15303,7 +15302,7 @@ i386_att_operand (char *operand_string)
/* Skip past '(' and whitespace. */
gas_assert (*base_string == '(');
++base_string;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
++base_string;
if (*base_string == ','
@@ -15319,7 +15318,7 @@ i386_att_operand (char *operand_string)
if (i.base_reg == &bad_reg)
return 0;
base_string = end_op;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
++base_string;
}
@@ -15327,7 +15326,7 @@ i386_att_operand (char *operand_string)
if (*base_string == ',')
{
++base_string;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
++base_string;
if ((i.index_reg = parse_register (base_string, &end_op))
@@ -15336,12 +15335,12 @@ i386_att_operand (char *operand_string)
if (i.index_reg == &bad_reg)
return 0;
base_string = end_op;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
++base_string;
if (*base_string == ',')
{
++base_string;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
++base_string;
}
else if (*base_string != ')')
@@ -15370,7 +15369,7 @@ i386_att_operand (char *operand_string)
return 0;
base_string = end_scale;
- if (is_space_char (*base_string))
+ if (is_whitespace (*base_string))
++base_string;
if (*base_string != ')')
{
@@ -16615,7 +16614,7 @@ parse_real_register (const char *reg_str
if (*s == REGISTER_PREFIX)
++s;
- if (is_space_char (*s))
+ if (is_whitespace (*s))
++s;
p = reg_name_given;
@@ -16642,18 +16641,18 @@ parse_real_register (const char *reg_str
&& !allow_pseudo_reg)
return (const reg_entry *) NULL;
- if (is_space_char (*s))
+ if (is_whitespace (*s))
++s;
if (*s == '(')
{
++s;
- if (is_space_char (*s))
+ if (is_whitespace (*s))
++s;
if (*s >= '0' && *s <= '7')
{
int fpr = *s - '0';
++s;
- if (is_space_char (*s))
+ if (is_whitespace (*s))
++s;
if (*s == ')')
{
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -186,7 +186,7 @@ operatorT i386_operator (const char *nam
if (strcasecmp (i386_types[j].name, name) == 0)
break;
- if (i386_types[j].name && *pc == ' ')
+ if (i386_types[j].name && is_whitespace (*pc))
{
const char *start = ++input_line_pointer;
char *pname;
* [PATCH v2 61/65] xgate: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (59 preceding siblings ...)
2025-01-27 16:44 ` [PATCH v2 60/65] x86: " Jan Beulich
@ 2025-01-27 16:45 ` Jan Beulich
2025-01-27 16:45 ` [PATCH v2 62/65] Xtensa: " Jan Beulich
` (5 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:45 UTC (permalink / raw)
To: Binutils; +Cc: Sean Keys
Convert an open-coded check.
---
v2: New.
--- a/gas/config/tc-xgate.c
+++ b/gas/config/tc-xgate.c
@@ -812,7 +812,7 @@ xgate_elf_final_processing (void)
static inline char *
skip_whitespace (char *s)
{
- while (*s == ' ' || *s == '\t' || *s == '(' || *s == ')')
+ while (is_whitespace (*s) || *s == '(' || *s == ')')
s++;
return s;
* [PATCH v2 62/65] Xtensa: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (60 preceding siblings ...)
2025-01-27 16:45 ` [PATCH v2 61/65] xgate: " Jan Beulich
@ 2025-01-27 16:45 ` Jan Beulich
2025-01-27 16:46 ` [PATCH v2 63/65] Z80: " Jan Beulich
` (4 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:45 UTC (permalink / raw)
To: Binutils; +Cc: Max Filippov, Sterling Augustine
Convert an open-coded check.
---
v2: New.
--- a/gas/config/tc-xtensa.c
+++ b/gas/config/tc-xtensa.c
@@ -1857,11 +1857,12 @@ expression_end (const char *name)
case ',':
case ':':
return name;
- case ' ':
- case '\t':
- ++name;
- continue;
default:
+ if (is_whitespace (*name))
+ {
+ ++name;
+ continue;
+ }
return 0;
}
}
* [PATCH v2 63/65] Z80: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (61 preceding siblings ...)
2025-01-27 16:45 ` [PATCH v2 62/65] Xtensa: " Jan Beulich
@ 2025-01-27 16:46 ` Jan Beulich
2025-01-27 16:46 ` [PATCH v2 64/65] Z8k: " Jan Beulich
` (3 subsequent siblings)
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:46 UTC (permalink / raw)
To: Binutils
Replace an open-coded check and convert ISSPACE() uses.
---
v2: New.
--- a/gas/config/tc-z80.c
+++ b/gas/config/tc-z80.c
@@ -582,7 +582,7 @@ z80_elf_final_processing (void)
static const char *
skip_space (const char *s)
{
- while (*s == ' ' || *s == '\t')
+ while (is_whitespace (*s))
++s;
return s;
}
@@ -623,7 +623,7 @@ z80_start_line_hook (void)
case '#': /* force to use next expression as immediate value in SDCC */
if (!sdcc_compat)
break;
- if (ISSPACE(p[1]) && *skip_space (p + 1) == '(')
+ if (is_whitespace (p[1]) && *skip_space (p + 1) == '(')
{ /* ld a,# (expr)... -> ld a,0+(expr)... */
*p++ = '0';
*p = '+';
@@ -3384,7 +3384,7 @@ assemble_suffix (const char **suffix)
for (i = 0; (i < 3) && (ISALPHA (*p)); i++)
sbuf[i] = TOLOWER (*p++);
- if (*p && !ISSPACE (*p))
+ if (*p && !is_whitespace (*p))
return 0;
*suffix = p;
sbuf[i] = 0;
@@ -3670,7 +3670,7 @@ md_assemble (char *str)
else
{
dwarf2_emit_insn (0);
- if ((*p) && (!ISSPACE (*p)))
+ if ((*p) && !is_whitespace (*p))
{
if (*p != '.' || !(ins_ok & INS_EZ80) || !assemble_suffix (&p))
{
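Illustration only, not part of the patch: a minimal sketch of the SDCC
'#' rewrite touched by the second hunk, "ld a,# (expr)" becoming
"ld a,0+(expr)". The names sketch_is_whitespace(), sketch_skip_space()
and sketch_rewrite_hash() are hypothetical stand-ins, and the predicate
is assumed to accept just blank and tab.

```c
/* Stand-in for gas's is_whitespace(); assumed to accept blank and tab.  */
static int
sketch_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

/* Same shape as skip_space() in the first hunk.  */
static const char *
sketch_skip_space (const char *s)
{
  while (sketch_is_whitespace (*s))
    ++s;
  return s;
}

/* If P points at '#' followed by whitespace and then '(', overwrite the
   '#' with '0' and the first whitespace character with '+'.  Returns 1
   if the buffer was rewritten, 0 otherwise.  */
static int
sketch_rewrite_hash (char *p)
{
  if (*p != '#')
    return 0;
  if (sketch_is_whitespace (p[1]) && *sketch_skip_space (p + 1) == '(')
    {
      *p++ = '0';
      *p = '+';
      return 1;
    }
  return 0;
}
```

The rewrite is done in place and needs no extra room, because the two
inserted characters replace the '#' and one whitespace character.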
* [PATCH v2 64/65] Z8k: use is_whitespace()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (62 preceding siblings ...)
2025-01-27 16:46 ` [PATCH v2 63/65] Z80: " Jan Beulich
@ 2025-01-27 16:46 ` Jan Beulich
2025-01-30 10:16 ` Christian Groessler
2025-01-27 16:47 ` [PATCH v2 65/65] gas: suppress use of ISSPACE() / ISBLANK() Jan Beulich
` (2 subsequent siblings)
66 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:46 UTC (permalink / raw)
To: Binutils; +Cc: Christian Groessler
Wherever blanks are permissible in input, tabs ought to be permissible,
too. This is particularly relevant when -f is passed to gas (alongside
appropriate input). At the same time use is_end_of_stmt() instead of an
open-coded check in adjacent code.
---
v2: New.
--- a/gas/config/tc-z8k.c
+++ b/gas/config/tc-z8k.c
@@ -429,7 +429,7 @@ get_ctrl_operand (char **ptr, struct z8k
char *src = *ptr;
int i, l;
- while (*src == ' ')
+ while (is_whitespace (*src))
src++;
mode->mode = CLASS_CTRL;
@@ -472,7 +472,7 @@ get_flags_operand (char **ptr, struct z8
int i;
int j;
- while (*src == ' ')
+ while (is_whitespace (*src))
src++;
mode->mode = CLASS_FLAGS;
@@ -517,7 +517,7 @@ get_interrupt_operand (char **ptr, struc
char *src = *ptr;
int i, l;
- while (*src == ' ')
+ while (is_whitespace (*src))
src++;
mode->mode = CLASS_IMM;
@@ -607,7 +607,7 @@ get_cc_operand (char **ptr, struct z8k_o
char *src = *ptr;
int i, l;
- while (*src == ' ')
+ while (is_whitespace (*src))
src++;
mode->mode = CLASS_CC;
@@ -634,7 +634,7 @@ get_operand (char **ptr, struct z8k_op *
mode->mode = 0;
- while (*src == ' ')
+ while (is_whitespace (*src))
src++;
if (*src == '#')
{
@@ -737,7 +737,7 @@ get_operands (const opcode_entry_type *o
case 0:
operand[0].mode = 0;
operand[1].mode = 0;
- while (*ptr == ' ')
+ while (is_whitespace (*ptr))
ptr++;
break;
@@ -745,7 +745,7 @@ get_operands (const opcode_entry_type *o
if (opcode->arg_info[0] == CLASS_CC)
{
get_cc_operand (&ptr, operand + 0, 0);
- while (*ptr == ' ')
+ while (is_whitespace (*ptr))
ptr++;
if (*ptr && ! is_end_of_line[(unsigned char) *ptr])
{
@@ -757,7 +757,7 @@ get_operands (const opcode_entry_type *o
else if (opcode->arg_info[0] == CLASS_FLAGS)
{
get_flags_operand (&ptr, operand + 0, 0);
- while (*ptr == ' ')
+ while (is_whitespace (*ptr))
ptr++;
if (*ptr && ! is_end_of_line[(unsigned char) *ptr])
{
@@ -779,7 +779,7 @@ get_operands (const opcode_entry_type *o
if (opcode->arg_info[0] == CLASS_CC)
{
get_cc_operand (&ptr, operand + 0, 0);
- while (*ptr == ' ')
+ while (is_whitespace (*ptr))
ptr++;
if (*ptr != ',' && strchr (ptr + 1, ','))
{
@@ -1219,12 +1219,12 @@ md_assemble (char *str)
opcode_entry_type *opcode;
/* Drop leading whitespace. */
- while (*str == ' ')
+ while (is_whitespace (*str))
str++;
/* Find the op code end. */
for (op_start = op_end = str;
- *op_end != 0 && *op_end != ' ' && ! is_end_of_line[(unsigned char) *op_end];
+ ! is_whitespace (*op_end) && ! is_end_of_stmt (*op_end);
op_end++)
;
@@ -1258,7 +1258,7 @@ md_assemble (char *str)
oc = *old;
*old = '\n';
- while (*input_line_pointer == ' ')
+ while (is_whitespace (*input_line_pointer))
input_line_pointer++;
p = (pseudo_typeS *) (opcode->func);
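Illustration only, not part of the patch: a minimal sketch of the
"drop leading whitespace, then find the op code end" step from the
md_assemble() hunk. The stand-in predicates are assumptions: blank and
tab count as whitespace, and NUL, newline and ';' are taken to end a
statement. What the real is_end_of_stmt() accepts depends on gas's
tables and on the target.

```c
#include <stddef.h>

/* Hypothetical stand-ins for gas's is_whitespace() / is_end_of_stmt().  */
static int
sketch_is_whitespace (char c)
{
  return c == ' ' || c == '\t';
}

static int
sketch_is_end_of_stmt (char c)
{
  return c == '\0' || c == '\n' || c == ';';
}

/* Skip leading whitespace in STR, store the mnemonic's start in *START
   and return the mnemonic's length.  */
static size_t
sketch_find_opcode (const char *str, const char **start)
{
  const char *op_end;

  while (sketch_is_whitespace (*str))
    str++;

  for (op_end = str;
       !sketch_is_whitespace (*op_end) && !sketch_is_end_of_stmt (*op_end);
       op_end++)
    ;

  *start = str;
  return (size_t) (op_end - str);
}
```

With a blank-only check, input such as "\tld\tr0,r1" would not be split
at the tabs. That input can reach this code once the scrubber is bypassed
via -f.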
* Re: [PATCH v2 64/65] Z8k: use is_whitespace()
2025-01-27 16:46 ` [PATCH v2 64/65] Z8k: " Jan Beulich
@ 2025-01-30 10:16 ` Christian Groessler
0 siblings, 0 replies; 106+ messages in thread
From: Christian Groessler @ 2025-01-30 10:16 UTC (permalink / raw)
To: Binutils
On 1/27/25 17:46, Jan Beulich wrote:
> Wherever blanks are permissible in input, tabs ought to be permissible,
> too. This is particularly relevant when -f is passed to gas (alongside
> appropriate input). At the same time use is_end_of_stmt() instead of an
> open-coded check in adjacent code.
> ---
> v2: New.
>
> ....
Ok.
regards,
chris
* [PATCH v2 65/65] gas: suppress use of ISSPACE() / ISBLANK()
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (63 preceding siblings ...)
2025-01-27 16:46 ` [PATCH v2 64/65] Z8k: " Jan Beulich
@ 2025-01-27 16:47 ` Jan Beulich
2025-01-28 2:50 ` [PATCH v2 00/65] gas: whitespace handling Hans-Peter Nilsson
2025-01-28 9:59 ` Alan Modra
66 siblings, 0 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-27 16:47 UTC (permalink / raw)
To: Binutils
We want is_whitespace() to be used uniformly, no matter what this then
expands to.
---
v2: New.
--- a/gas/read.h
+++ b/gas/read.h
@@ -51,6 +51,10 @@ extern bool input_from_string;
#define is_whitespace(c) \
( lex_type[(unsigned char) (c)] & LEX_WHITE )
+/* Don't allow safe-ctype.h's counterparts to be used. */
+#undef ISSPACE
+#undef ISBLANK
+
/* The distinction of "line" and "statement" sadly is blurred by unhelpful
naming of e.g. the underlying array. Most users really mean "end of
statement". Going forward only these wrappers are supposed to be used. */
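Illustration only, not part of the patch: a minimal sketch of a
table-driven whitespace predicate of the kind the series describes,
covering blanks and tabs, plus '\r' when it does not act as end of line.
The table name, the flag value and the init function are hypothetical;
gas's real lex_type[] carries further flags and is set up elsewhere.

```c
#include <stdbool.h>

/* Hypothetical flag value for this sketch.  */
#define SKETCH_LEX_WHITE 0x01

static unsigned char sketch_lex_type[256];

/* Tag blank and tab as whitespace.  '\r' is tagged only when the caller
   says it is not treated as an end-of-line character.  */
static void
sketch_init_lex_type (bool cr_is_eol)
{
  sketch_lex_type[' '] |= SKETCH_LEX_WHITE;
  sketch_lex_type['\t'] |= SKETCH_LEX_WHITE;
  if (!cr_is_eol)
    sketch_lex_type['\r'] |= SKETCH_LEX_WHITE;
}

/* Same shape as the macro in the hunk above: the cast to unsigned char
   keeps negative char values from indexing out of bounds.  */
#define sketch_is_whitespace(c) \
  ((sketch_lex_type[(unsigned char) (c)] & SKETCH_LEX_WHITE) != 0)
```

Because '\n', '\f' and '\v' are never tagged here, this predicate is
narrower than a ctype-style space test.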
* Re: [PATCH v2 00/65] gas: whitespace handling
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (64 preceding siblings ...)
2025-01-27 16:47 ` [PATCH v2 65/65] gas: suppress use of ISSPACE() / ISBLANK() Jan Beulich
@ 2025-01-28 2:50 ` Hans-Peter Nilsson
2025-01-28 7:40 ` Jan Beulich
2025-01-28 9:59 ` Alan Modra
66 siblings, 1 reply; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-01-28 2:50 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils
Sorry for being a downer, but:
> Date: Mon, 27 Jan 2025 16:23:42 +0100
> From: Jan Beulich <jbeulich@suse.com>
> As per observations in target specific code there appears to be disagreement
> across the assembler whether to check for specific characters (blank and tab
> normally) or whether to use ISSPACE().
>
> As agreed upon during the Cauldron in Prague, switch to a single base
> construct for all code to use: is_whitespace().
Such decisions should be made online, with the whole
community, not with the people attending a specific session
at a specific event. (While I had the chance, I had no idea
executive decisions were about to take place.)
> It clearly is an alternative option to have is_whitespace() expand to
> ISSPACE() or ISBLANK() (ISSPACE() also yields "true" for characters we don't
> really consider whitespace), then (obviously) leaving out the last patch. See
> also the CR_EOL uses in read.c and app.c. I think it is advisable though that
> is_whitespace() and is_end_of_{line,stmt}() be non-overlapping; question then
> is what (further) characters to tag as LEX_WHITE (see remarks in patch 01).
>
> Along with recently (as of the v1 submission) committed work for x86 this
> appears to be sufficient to actually use -f (or #NO_APP at start of file)
> for gcc-generated code.
Does that work also clean up gcc-generated code, like
dropping space after comma or multiple spaces or whatever is
judged the #NO_APP behavior of "x86 assembly"? I don't see
such patches but maybe they're not posted yet. It doesn't
just happen to be specified as what gcc generates on the
master branch of today?
> I didn't properly check other architectures yet, but
> I seem to recall that at least Arm32 and PPC would apparently require
> compiler side adjustments, too.
I can't help but think this is going ever so slightly in
the wrong direction with regards to #NO_APP: this change-set
is making that mode more lenient towards formatting;
allowing more types of space characters. With the few
targets that have #NO_APP active in gcc-generated code, you
have the chance of making that mode more strict.
brgds, H-P
* Re: [PATCH v2 00/65] gas: whitespace handling
2025-01-28 2:50 ` [PATCH v2 00/65] gas: whitespace handling Hans-Peter Nilsson
@ 2025-01-28 7:40 ` Jan Beulich
2025-01-28 14:47 ` Richard Earnshaw (lists)
2025-01-28 15:25 ` Hans-Peter Nilsson
0 siblings, 2 replies; 106+ messages in thread
From: Jan Beulich @ 2025-01-28 7:40 UTC (permalink / raw)
To: Hans-Peter Nilsson; +Cc: binutils
On 28.01.2025 03:50, Hans-Peter Nilsson wrote:
> Sorry for being a downer, but:
>
>> Date: Mon, 27 Jan 2025 16:23:42 +0100
>> From: Jan Beulich <jbeulich@suse.com>
>
>> As per observations in target specific code there appears to be disagreement
>> across the assembler whether to check for specific characters (blank and tab
>> normally) or whether to use ISSPACE().
>>
>> As agreed upon during the Cauldron in Prague, switch to a single base
>> construct for all code to use: is_whitespace().
>
> Such decisions should be made online, with the whole
> community, not with the people attending a specific session
> at a specific event. (While I had the chance, I had no idea
> executive decisions were about to take place.)
Well, here we are online. The series hasn't been committed yet, so
objections will be listened to. Albeit in objecting to certain aspects
please keep in mind what the overall goal is: To make #NO_APP and the
-f command line option work for more targets. And to have as uniform
behavior as possible in gas across targets.
As per your comment on the cris-specific patch I can't help the
impression that gcc's avoidance of TABs for this target isn't
"happenstance" as you called it, but simply attributable to gas's past
behavior.
>> It clearly is an alternative option to have is_whitespace() expand to
>> ISSPACE() or ISBLANK() (ISSPACE() also yields "true" for characters we don't
>> really consider whitespace), then (obviously) leaving out the last patch. See
>> also the CR_EOL uses in read.c and app.c. I think it is advisable though that
>> is_whitespace() and is_end_of_{line,stmt}() be non-overlapping; question then
>> is what (further) characters to tag as LEX_WHITE (see remarks in patch 01).
>>
>> Along with recently (as of the v1 submission) committed work for x86 this
>> appears to be sufficient to actually use -f (or #NO_APP at start of file)
>> for gcc-generated code.
>
> Does that work also clean up gcc-generated code, like
> dropping space after comma or multiple spaces or whatever is
> judged the #NO_APP behavior of "x86 assembly"? I don't see
> such patches but maybe they're not posted yet. It doesn't
> just happen to be specified as what gcc generates on the
> master branch of today?
Well, what gcc presently emits needs to be accepted anyway. A goal is
specifically to get -f / #NO_APP working without needing to touch target
specific code in gcc, whenever possible. As said ...
>> I didn't properly check other architectures yet, but
>> I seem to recall that at least Arm32 and PPC would apparently require
>> compiler side adjustments, too.
... here, I'm aware that some targets will require some changes, but
x86 (according to my limited testing) is not among them. And those
changes are expected to be of limited nature, i.e. they're for example
not expected to touch *.md files all over the place. From what I was
able to see, the problems there are with how #APP / #NO_APP are emitted.
And btw - why would you apply different criteria to cris and x86? For
cris you said what gcc emits is an unwritten but de-facto standard. Yet then
you question that same pre-condition to be applied to x86?
> I can't help but think this is going ever so slightly in
> the wrong direction with regards to #NO_APP: this change-set
> is making that mode more lenient towards formatting;
> allowing more types of space characters. With the few
> targets that have #NO_APP active in gcc-generated code, you
> have the chance of making that mode more strict.
Have you ever wondered why it is only so few targets? Permitting TABs
in compiler generated output is, as indicated in the reply to the
cris-specific patch, a readability aid. That's certainly a personal
view, but one I happen to know is shared by many other people. That
said, I'm also aware that for various targets gcc presently avoids
making use of TABs. Of the architectures I'm half-way familiar with it's
actually a minority, though (PPC and ia64 vs aarch64, Arm, RISC-V, and
x86).
Jan
* Re: [PATCH v2 00/65] gas: whitespace handling
2025-01-28 7:40 ` Jan Beulich
@ 2025-01-28 14:47 ` Richard Earnshaw (lists)
2025-01-28 15:00 ` Jan Beulich
2025-01-28 15:25 ` Hans-Peter Nilsson
1 sibling, 1 reply; 106+ messages in thread
From: Richard Earnshaw (lists) @ 2025-01-28 14:47 UTC (permalink / raw)
To: Jan Beulich, Hans-Peter Nilsson; +Cc: binutils
On 28/01/2025 07:40, Jan Beulich wrote:
> On 28.01.2025 03:50, Hans-Peter Nilsson wrote:
>> Sorry for being a downer, but:
>>
>>> Date: Mon, 27 Jan 2025 16:23:42 +0100
>>> From: Jan Beulich <jbeulich@suse.com>
>>
>>> As per observations in target specific code there appears to be disagreement
>>> across the assembler whether to check for specific characters (blank and tab
>>> normally) or whether to use ISSPACE().
>>>
>>> As agreed upon during the Cauldron in Prague, switch to a single base
>>> construct for all code to use: is_whitespace().
>>
>> Such decisions should be made online, with the whole
>> community, not with the people attending a specific session
>> at a specific event. (While I had the chance, I had no idea
>> executive decisions were about to take place.)
>
> Well, here we are online. The series hasn't been committed yet, so
> objections will be listened to. Albeit in objecting to certain aspects
> please keep in mind what the overall goal is: To make #NO_APP and the
> -f command line option work for more targets. And to have as uniform
> behavior as possible in gas across targets.
>
> As per your comment on the cris-specific patch I can't help the
> impression that gcc's avoidance of TABs for this target isn't
> "happenstance" as you called it, but simply attributable to gas's past
> behavior.
>
>>> It clearly is an alternative option to have is_whitespace() expand to
>>> ISSPACE() or ISBLANK() (ISSPACE() also yields "true" for characters we don't
>>> really consider whitespace), then (obviously) leaving out the last patch. See
>>> also the CR_EOL uses in read.c and app.c. I think it is advisable though that
>>> is_whitespace() and is_end_of_{line,stmt}() be non-overlapping; question then
>>> is what (further) characters to tag as LEX_WHITE (see remarks in patch 01).
>>>
>>> Along with recently (as of the v1 submission) committed work for x86 this
>>> appears to be sufficient to actually use -f (or #NO_APP at start of file)
>>> for gcc-generated code.
>>
>> Does that work also clean up gcc-generated code, like
>> dropping space after comma or multiple spaces or whatever is
>> judged the #NO_APP behavior of "x86 assembly"? I don't see
>> such patches but maybe they're not posted yet. It doesn't
>> just happen to be specified as what gcc generates on the
>> master branch of today?
>
> Well, what gcc presently emits needs to be accepted anyway. A goal is
> specifically to get -f / #NO_APP working without needing to touch target
> specific code in gcc, whenever possible. As said ...
>
>>> I didn't properly check other architectures yet, but
>>> I seem to recall that at least Arm32 and PPC would apparently require
>>> compiler side adjustments, too.
>
> ... here, I'm aware that some targets will require some changes, but
> x86 (according to my limited testing) is not among them. And those
> changes are expected to be of limited nature, i.e. they're for example
> not expected to touch *.md files all over the place. From what I was
> able to see, the problems there are with how #APP / #NO_APP are emitted.
>
> And btw - why would you apply different criteria to cris and x86? For
> cris you said what gcc emits is an unwritten but de-facto standard. Yet then
> you question that same pre-condition to be applied to x86?
>
>> I can't help but think this is going ever so slightly in
>> the wrong direction with regards to #NO_APP: this change-set
>> is making that mode more lenient towards formatting;
>> allowing more types of space characters. With the few
>> targets that have #NO_APP active in gcc-generated code, you
>> have the chance of making that mode more strict.
>
> Have you ever wondered why it is only so few targets? Permitting TABs
> in compiler generated output is, as indicated in the reply to the
> cris-specific patch, a readability aid. That's certainly a personal
> view, but one I happen to know is shared by many other people. That
> said, I'm also aware that for various targets gcc presently avoids
> making use of TABs. Of the architectures I'm half-way familiar with it's
> actually a minority, though (PPC and ia64 vs aarch64, Arm, RISC-V, and
> x86).
>
> Jan
Out of interest, did you consider making the scrubber translate all horizontal white space characters into a single space character (as opposed to just collapsing multiple spaces into one)? This would eliminate the need for each backend to handle multiple types of space characters.
R.
* Re: [PATCH v2 00/65] gas: whitespace handling
2025-01-28 14:47 ` Richard Earnshaw (lists)
@ 2025-01-28 15:00 ` Jan Beulich
2025-01-28 15:27 ` Richard Earnshaw (lists)
0 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2025-01-28 15:00 UTC (permalink / raw)
To: Richard Earnshaw (lists); +Cc: binutils, Hans-Peter Nilsson
On 28.01.2025 15:47, Richard Earnshaw (lists) wrote:
> Out of interest, did you consider making the scrubber translate all horizontal white space characters into a single space character (as opposed to just collapsing multiple spaces into one)? This would eliminate the need for each backend to handle multiple types of space characters.
I guess I'm confused: How would altering scrubber behavior matter when
the scrubber is bypassed (by -f or #NO_APP)? And then: This collapsing
already is what the scrubber does (and hence why many targets were
able to get away without checking for \t, and with -f/#NO_APP known to
not be working there), aiui.
Jan
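Illustration only, not from the series: a toy sketch of the collapsing
described above, where each run of blanks and tabs becomes a single
blank. It shows why blank-only checks in target code were enough as long
as input went through the scrubber. The function name is hypothetical,
and the real scrubber in app.c does far more (comments, strings, state
tracking) than this loop.

```c
#include <stddef.h>

/* Copy SRC to DST, replacing every run of blanks/tabs with one blank.
   DST must have room for strlen (SRC) + 1 bytes.  Returns the number
   of characters written, not counting the terminating NUL.  */
static size_t
sketch_scrub (const char *src, char *dst)
{
  size_t n = 0;
  int in_white = 0;

  for (; *src != '\0'; src++)
    {
      if (*src == ' ' || *src == '\t')
        {
          /* Emit a blank only for the first character of a run.  */
          if (!in_white)
            dst[n++] = ' ';
          in_white = 1;
        }
      else
        {
          dst[n++] = *src;
          in_white = 0;
        }
    }
  dst[n] = '\0';
  return n;
}
```

With -f or #NO_APP nothing like this runs, so "ld\t \tr0" reaches the
target code unchanged instead of as "ld r0".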
* Re: [PATCH v2 00/65] gas: whitespace handling
2025-01-28 15:00 ` Jan Beulich
@ 2025-01-28 15:27 ` Richard Earnshaw (lists)
0 siblings, 0 replies; 106+ messages in thread
From: Richard Earnshaw (lists) @ 2025-01-28 15:27 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils, Hans-Peter Nilsson
On 28/01/2025 15:00, Jan Beulich wrote:
> On 28.01.2025 15:47, Richard Earnshaw (lists) wrote:
>> Out of interest, did you consider making the scrubber translate all horizontal white space characters into a single space character (as opposed to just collapsing multiple spaces into one)? This would eliminate the need for each backend to handle multiple types of space characters.
>
> I guess I'm confused: How would altering scrubber behavior matter when
> the scrubber is bypassed (by -f or #NO_APP)? And then: This collapsing
> already is what the scrubber does (and hence why many targets were
> able to get away without checking for \t, and with -f/#NO_APP known to
> not be working there), aiui.
>
> Jan
Sorry, it's me that's confused. I hadn't realized that was what #NO_APP did.
R.
* Re: [PATCH v2 00/65] gas: whitespace handling
2025-01-28 7:40 ` Jan Beulich
2025-01-28 14:47 ` Richard Earnshaw (lists)
@ 2025-01-28 15:25 ` Hans-Peter Nilsson
1 sibling, 0 replies; 106+ messages in thread
From: Hans-Peter Nilsson @ 2025-01-28 15:25 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils
> Date: Tue, 28 Jan 2025 08:40:02 +0100
> From: Jan Beulich <jbeulich@suse.com>
> On 28.01.2025 03:50, Hans-Peter Nilsson wrote:
> As per your comment on the cris-specific patch I can't help the
> impression that gcc's avoidance of TABs for this target isn't
> "happenstance" as you called it, but simply attributable to gas's past
> behavior.
No, it's deliberate. IMHO \\t (the literal characters)
makes for less readable code in the .md, and a literal TAB
looks weird indentation-wise (not sure if it was always valid).
Compare "adcs%?\\t%0, %1, #0" to "adcs%? %0,%1,#0" (random
example from arm.md). In the generated code, I guess it
depends on your preference; whether columns lining up is
better than the slightly extra horizontal distance.
> > Does that work also clean up gcc-generated code, like
> > dropping space after comma or multiple spaces or whatever is
> > judged the #NO_APP behavior of "x86 assembly"? I don't see
> > such patches but maybe they're not posted yet. It doesn't
> > just happen to be specified as what gcc generates on the
> > master branch of today?
>
> Well, what gcc presently emits needs to be accepted anyway. A goal is
> specifically to get -f / #NO_APP working without needing to touch target
> specific code in gcc, whenever possible.
...and putting the burden on the assembler to do the
post-scrubber processing in your patches. IOW, while it's
nice if more targets can skip the scrubbing (and joining
#NO_APP actually being in effect), it's more processing once
that state is entered.
> And btw - why would you apply different criteria to cris and x86? For
> cris you said what gcc emits is an unwritten but de-facto standard. Yet then
> you question that same pre-condition to be applied to x86?
Different criteria for "tier-1" versus "tier-N" (N > 2)
targets aren't exactly a new concept.
> > I can't help but think this is going ever so slightly in
> > the wrong direction with regards to #NO_APP: this change-set
> > is making that mode more lenient towards formatting;
> > allowing more types of space characters. With the few
> > targets that have #NO_APP active in gcc-generated code, you
> > have the chance of making that mode more strict.
>
> Have you ever wondered why it is only so few targets?
Only once. Then I looked and found out, some 30+ years ago. :)
> Permitting TABs
> in compiler generated output is, as indicated in the reply to the
> cris-specific patch, a readability aid.
To that I'll just say: "\\t"! :-)
I'll admit that's the gcc "input" - but which you seem to
overlook in your readability argumentation! And with that,
I see the discussion derails...
> That's certainly a personal
> view, but one I happen to know is shared by many other people. That
> said, I'm also aware that for various targets gcc presently avoids
> making use of TABs. Of the architectures I'm half-way familiar with it's
> actually a minority, though (PPC and ia64 vs aarch64, Arm, RISC-V, and
> x86).
...with this "argumentum ad populum". Let's please drop the
TAB vs space part of the discussion and "agree to disagree".
brgds, H-P
* Re: [PATCH v2 00/65] gas: whitespace handling
2025-01-27 15:23 [PATCH v2 00/65] gas: whitespace handling Jan Beulich
` (65 preceding siblings ...)
2025-01-28 2:50 ` [PATCH v2 00/65] gas: whitespace handling Hans-Peter Nilsson
@ 2025-01-28 9:59 ` Alan Modra
66 siblings, 0 replies; 106+ messages in thread
From: Alan Modra @ 2025-01-28 9:59 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, Jan 27, 2025 at 04:23:42PM +0100, Jan Beulich wrote:
> As per observations in target specific code there appears to be disagreement
> across the assembler whether to check for specific characters (blank and tab
> normally) or whether to use ISSPACE().
>
> As agreed upon during the Cauldron in Prague, switch to a single base
> construct for all code to use: is_whitespace().
>
> It clearly is an alternative option to have is_whitespace() expand to
> ISSPACE() or ISBLANK() (ISSPACE() also yields "true" for characters we don't
> really consider whitespace), then (obviously) leaving out the last patch. See
> also the CR_EOL uses in read.c and app.c. I think it is advisable though that
> is_whitespace() and is_end_of_{line,stmt}() be non-overlapping; question then
> is what (further) characters to tag as LEX_WHITE (see remarks in patch 01).
>
> Along with recently (as of the v1 submission) committed work for x86 this
> appears to be sufficient to actually use -f (or #NO_APP at start of file)
> for gcc-generated code. I didn't properly check other architectures yet, but
> I seem to recall that at least Arm32 and PPC would apparently require
> compiler side adjustments, too.
I like it, thanks!
--
Alan Modra