From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-f48.google.com (mail-wr1-f48.google.com [209.85.221.48]) by sourceware.org (Postfix) with ESMTPS id 37A513856DD1 for ; Mon, 27 Jun 2022 18:12:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 37A513856DD1 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wr1-f48.google.com with SMTP id s1so14184286wra.9 for ; Mon, 27 Jun 2022 11:12:08 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:references:from:in-reply-to :content-transfer-encoding; bh=GvQJXGbMjUkxobs0OACWdrJQojrCnwqbpvRnbBPeGkw=; b=3qAyocuOyoMiCSQaD0ku643+HePEiWY0XdxrYRYs4jGj4yVBlU/aAOxMsiviNdRDfG eqZBeXmAS7pUFY8bNQ/qhF5gQgTErbnG0WXilhOZ0AyBjOp6D0W+aR08AqqsJRQd4DC3 B/8YvwDwdMbepk9LIMy08fo5sb0JSDWhMtCgdDvwNmm+STo/96nirUA65I5duxJpI3rk C1Zr1r33R5xgxbXDZ3CzOqYk7aJfp/Y2zoa4YaJAgNHasxJ9IDhICplF9yvxDTIPn+uP fbtNG83x1QfTzBJSau9s343e+J0MQNM1ZgfmrvZh4DFhKPjLIhxWmY8CkBsnvM4n8Rue CJlQ== X-Gm-Message-State: AJIora/O52KtO/S3dTTryjt/iZ6VHZ6p1SHxATjPT5lVGbipA9Ze1lx6 vStYxhk/5a4Mo0hrCmbXf3GIQXOCIcY= X-Google-Smtp-Source: AGRyM1vTaXn9y24CG2ubqx2UCZliYn/mBPOAws4AhOGfElHck8livddMYJcFr+M2We4bKqxAyswNdQ== X-Received: by 2002:a05:6000:3c6:b0:21b:9d00:db29 with SMTP id b6-20020a05600003c600b0021b9d00db29mr13900364wrg.338.1656353526940; Mon, 27 Jun 2022 11:12:06 -0700 (PDT) Received: from ?IPV6:2001:8a0:f924:2600:5b14:8ad0:780f:bdda? ([2001:8a0:f924:2600:5b14:8ad0:780f:bdda]) by smtp.gmail.com with ESMTPSA id x12-20020a05600c2d0c00b003a04a9504b0sm4198858wmf.40.2022.06.27.11.12.05 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 27 Jun 2022 11:12:06 -0700 (PDT) Message-ID: <8059d2c0-9b9d-e420-fe95-bc4150dfa164@palves.net> Date: Mon, 27 Jun 2022 19:12:04 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: Re: [PATCH 2/4] gdb, gdbserver: Add AMX registers. Content-Language: en-US To: Felix Willgerodt , gdb-patches@sourceware.org References: <20220506121226.137608-1-felix.willgerodt@intel.com> <20220506121226.137608-3-felix.willgerodt@intel.com> From: Pedro Alves In-Reply-To: <20220506121226.137608-3-felix.willgerodt@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jun 2022 18:12:11 -0000 Hi Felix, This largely looks good to me, though I have a couple questions. See below. On 2022-05-06 13:12, Felix Willgerodt via Gdb-patches wrote: > > +/* A helper function to re-size AMX pseudo registers during reads. Copies > + the contents from RAW_BUF to BUF and re-sizes the value. */ I think this should say what does it mean when TILECFG is NULL. > + > +static void > +amd64_tmm_resize_read (const tilecfg_reg *tilecfg, const gdb_byte *raw_buf, > + gdb_byte *buf, value *result_value, const int tmmnum) > +{ > + uint16_t columns = 64; > + uint8_t rows = 16; > + > + if (tilecfg != nullptr) > + { > + columns = tilecfg->bytes_per_row (tmmnum); > + rows = tilecfg->rows (tmmnum); > + if (columns == 0) > + columns = 64; > + if (rows == 0) > + rows = 16; > + } > + > + gdb_assert (TYPE_LENGTH (value_type (result_value)) >= rows * columns); > + > + /* Copy each row from raw_buf into buf. The rows are not consecutive > + but they are on MAX_BYTES_PER_ROW * iRow position. */ > + const gdb_byte *raw_buf_offset > + = raw_buf + tmmnum * tilecfg->MAX_BYTES_PER_TILE; > + for (uint8_t iRow = 0; iRow < rows; ++iRow) > + { > + memcpy (buf + columns * iRow, > + raw_buf_offset + tilecfg->MAX_BYTES_PER_ROW * iRow, > + columns); > + } > + > + /* Adjust the result_value. The value is a union of matrices of different > + types. See i386_tmm_type (). This iterates over each member and > + adjusts the dimensions according to the type. */ > + for (int i = 0; i < value_type (result_value)->num_fields (); ++i) > + { > + type *rows_type = value_type (result_value)->fields ()[i].m_type; > + type *cols_type = rows_type->main_type->target_type; > + > + /* Adjust array bit lengths. */ > + rows_type->length = columns * rows; > + cols_type->length = columns; > + > + /* Adjust array dimensions. */ > + rows_type->bounds ()->high.set_const_val (rows - 1); > + int num_bytes = cols_type->main_type->target_type->length; > + cols_type->bounds ()->high.set_const_val (columns / num_bytes - 1); Does any other target do in-place type rewriting like this? That seems fishy. What happens e.g., to values already in the value history that were recorded before the dimensions changed, for instance? Will they suddenly start re-printing differently / incorrectly with their type changed behind their back? Like: (gdb) print $reg # some register or value mapped to a register that that ends up in the function above $1 = ... # before type changes # something happens and the AMX type changes. (gdb) print $reg $2 = ... # reflects type change (gdb) print $1 $3 = ... # what type does GDB use here? Do the new tests cover something like this already? This may likewise affect, e.g., watchpoints and displays. I haven't traced the new code to check where do those types originally come from, but maybe it would work to reuse/extend the vla support to make those types have dynamic length and bounds (TYPE_DYNAMIC_LENGTH, DYN_PROP_BYTE_SIZE, etc.). Or maybe just tweak these functions such that you create a new type instead of changing the original type. I don't know how frequently the array dimentions change and how open ended the dimensions are, but caching the type keyed on row/col sizes may work well to spare creating too many types, or actually creating them all the time. > + } > +} > + > static struct value * > amd64_pseudo_register_read_value (struct gdbarch *gdbarch, > readable_regcache *regcache, > @@ -401,6 +511,58 @@ amd64_pseudo_register_read_value (struct gdbarch *gdbarch, > mark_value_bytes_unavailable (result_value, 0, > TYPE_LENGTH (value_type (result_value))); > } > + else if (i386_tilecfg_regnum_p (gdbarch, regnum)) > + { > + /* Read tilecfg. */ > + gdb_byte raw_buf[register_size (gdbarch, tdep->tilecfg_raw_regnum)]; > + register_status status = regcache->raw_read (tdep->tilecfg_raw_regnum, > + raw_buf); > + if (status != REG_VALID) > + { > + mark_value_bytes_unavailable ( > + result_value, 0, TYPE_LENGTH (value_type (result_value))); Formatting, "(" never ends a line. This appears multiple times in the patch throughout. > + } > + else > + { > + /* Copy palette and start_row. See tilecfg_type (). */ > + memcpy (buf, raw_buf, 2 * 1); > + /* Copy all colsb. */ > + memcpy (buf + 2, raw_buf + 16, 2 * 8); > + /* Copy all rows. */ > + memcpy (buf + 18, raw_buf + 48, 1 * 8); > + } > + } > + else if (i386_tmm_regnum_p (gdbarch, regnum)) > + { > + /* Read tilecfg. */ > + gdb_byte tilecfg_buf[register_size (gdbarch, tdep->tilecfg_raw_regnum)]; > + register_status status = regcache->raw_read (tdep->tilecfg_raw_regnum, > + tilecfg_buf); > + if (status != REG_VALID) > + { > + mark_value_bytes_unavailable ( > + result_value, 0, TYPE_LENGTH (value_type (result_value))); Ditto. > + } > + else > + { > + tilecfg_reg tilecfg{ tilecfg_buf }; GDB prefers using ()s for ctors. Also space before '(' / '{'. Thus: tilecfg_reg tilecfg (tilecfg_buf); > + gdb_byte raw_buf[register_size (gdbarch, tdep->tiledata_regnum)]; > + status = regcache->raw_read (tdep->tiledata_regnum, raw_buf); > + > + if (status != REG_VALID) > + { > + mark_value_bytes_unavailable ( > + result_value, 0, TYPE_LENGTH (value_type (result_value))); Formatting. (I'll stop pointing these out.) > + } > + else > + { > + /* Re-size value and copy data. */ > + amd64_tmm_resize_read (&tilecfg, raw_buf, > + buf, result_value, > + regnum - tdep->tmm_regnum); > + } > + } > + } > else > i386_pseudo_register_read_into_value (gdbarch, regcache, regnum, > result_value); > > +/* AMX tilecfg_reg constructor. */ > + > +tilecfg_reg::tilecfg_reg (uint8_t *raw_tilecfg) : tilecfg_reg () > +{ Please write: tilecfg_reg::tilecfg_reg (uint8_t *raw_tilecfg) : tilecfg_reg () { Though that is just calling the default ctor. I don't think you need to that explicitly here. > + /* Use default values. */ > + if (raw_tilecfg == nullptr) > + return; > + > + palette = raw_tilecfg[0]; > + start_row = raw_tilecfg[1]; > + > + /* Read TILECFG column and row values via pointers. > + Columns are represented by 2 bytes and rows are represented > + by 1 byte. Column pointer which is *uint8_t needs to be converted > + to *uint16_t pointer. */ > + uint16_t *vec_col_pos > + = reinterpret_cast (raw_tilecfg + COLUMN_MEMORY_OFFSET); > + uint8_t *vec_row_pos = raw_tilecfg + ROW_MEMORY_OFFSET; > + > + for (int i = 0; i < MAX_NAMES; i++) > + columns_and_rows[i] = { vec_col_pos[i], vec_row_pos[i] }; Where was columns_and_rows resized to MAX_NAMES? > +} > + > /* Convert a dbx register number REG to the appropriate register > number used by GDB. */ > > @@ -3307,6 +3389,142 @@ i386_mmx_type (struct gdbarch *gdbarch) > return tdep->i386_mmx_type; > } > > +/* Construct vector type for TMM registers. */ > + > +static struct type * > +i386_tmm_type (struct gdbarch *gdbarch) > +{ > + i386_gdbarch_tdep *tdep = (i386_gdbarch_tdep *) gdbarch_tdep (gdbarch); > + > + if (!tdep->i386_tmm_type) > + { > + const struct builtin_type *bt = builtin_type (gdbarch); > + > + uint8_t bytes_per_row = tilecfg_reg::MAX_BYTES_PER_ROW; > + uint8_t max_rows = tilecfg_reg::MAX_ROWS; > + > + /* The type we're building is this: */ > +#if 0 > + union __gdb_builtin_type_matrix1024i > + { > + int8_t m_int8[max_rows][bytes_per_row]; > + uint8_t m_uint8[max_rows][bytes_per_row]; > + int32_t m_int32[max_rows][bytes_per_row/4]; > + bfloat16_t m_bfloat16[max_rows][bytes_per_row/2]; > + float m_int32[max_rows][bytes_per_row/4]; > + }; > +#endif > + > + struct type *t; > + t = arch_composite_type (gdbarch, "builtin_type_tile", TYPE_CODE_UNION); > + > + append_composite_type_field ( Formatting. (same below.) > + t, "m_int8", > + init_vector_type (init_vector_type (bt->builtin_int8, bytes_per_row), > + max_rows)); > + > + append_composite_type_field ( > + t, "m_uint8", > + init_vector_type ( > + init_vector_type (bt->builtin_uint8, bytes_per_row), max_rows)); > + > + append_composite_type_field ( > + t, "m_int32", > + init_vector_type ( > + init_vector_type (bt->builtin_int32, bytes_per_row / 4), > + max_rows)); > + > + append_composite_type_field ( > + t, "m_bf16", > + init_vector_type ( > + init_vector_type (bt->builtin_bfloat16, bytes_per_row / 2), > + max_rows)); > + > + append_composite_type_field ( > + t, "m_fp32", > + init_vector_type ( > + init_vector_type (bt->builtin_float, bytes_per_row / 4), > + max_rows)); > + > + t->set_is_vector (true); > + t->set_name ("builtin_type_tile"); > + tdep->i386_tmm_type = t; > + } > + > + return tdep->i386_tmm_type; > +} > @@ -348,7 +376,7 @@ enum record_i386_regnum > #define I386_NUM_REGS (I386_GSBASE_REGNUM + 1) > > /* Size of the largest register. */ > -#define I386_MAX_REGISTER_SIZE 64 > +#define I386_MAX_REGISTER_SIZE 8192 > *Eek* > /* Types for i386-specific registers. */ > extern struct type *i387_ext_type (struct gdbarch *gdbarch); > @@ -366,6 +394,8 @@ extern int i386_k_regnum_p (struct gdbarch *gdbarch, int regnum); > extern int i386_zmm_regnum_p (struct gdbarch *gdbarch, int regnum); > extern int i386_zmmh_regnum_p (struct gdbarch *gdbarch, int regnum); > extern bool i386_pkru_regnum_p (struct gdbarch *gdbarch, int regnum); > +extern bool i386_tilecfg_regnum_p (struct gdbarch *gdbarch, int regnum); > +extern bool i386_tmm_regnum_p (struct gdbarch *gdbarch, int regnum); > > extern const char *i386_pseudo_register_name (struct gdbarch *gdbarch, > int regnum); > @@ -485,4 +515,94 @@ extern int i386_stap_is_single_operand (struct gdbarch *gdbarch, > extern expr::operation_up i386_stap_parse_special_token > (struct gdbarch *gdbarch, struct stap_parse_info *p); > > +/* AMX utilities. */ > + > +/* TILECFG register. > + 0 palette > + 1 start_row > + 2-15 reserved, must be zero > + 16-17 tile0.colsb Tile 0 bytes per row. > + 18-19 tile1.colsb Tile 1 bytes per row. > + 20-21 tile2.colsb Tile 2 bytes per row. > + ... (sequence continues) > + 30-31 tile7.colsb Tile 7 bytes per row. > + 32-47 reserved, must be zero > + 48 tile0.rows Tile 0 rows. > + 49 tile1.rows Tile 1 rows. > + 50 tile2.rows Tile 2 rows. > + ... (sequence continues) > + 55 tile7.rows Tile 7 rows. > + 56-63 reserved, must be zero. */ > + > +/* TILECFG class representing the AMX Tilecfg register. */ > + > +class tilecfg_reg > +{ > +public: > + tilecfg_reg () > + : columns_and_rows ( > + std::vector> (MAX_NAMES, { 0, 0 })) Would it be possible to use a struct instead of a pair? Pairs are kind of horrible for readability. Some proper field names would be much clearer than "first" and "second". std::pair is great for general purpose templates, not so much otherwise. Also, how about doing this columns_and_rows initialization where the field is declared? > + { > + } > + > + /* Construct it from raw tilecfg data. */ > + explicit tilecfg_reg (uint8_t *raw_tilecfg); As pointed out above, the implementation of this ctor doesn't seem to resize the columns_and_rows vector, and just accesses the elements straight away: for (int i = 0; i < MAX_NAMES; i++) columns_and_rows[i] = { vec_col_pos[i], vec_row_pos[i] }; } Moving the initialization to the member declaration would fix it, but maybe I'm missing something. > + > +private: > + /* This stores the colsb and rows entries. */ I guess "colsb" is a typo for "cols"? > + std::vector> columns_and_rows; > +}; > + > +set line1 [gdb_get_line_number "BP1"] > +set line2 [gdb_get_line_number "BP2"] > +gdb_breakpoint $line1 > +gdb_breakpoint $line2 > + > +# Registers should be displayed as zeroed before AMX enablement. > +with_test_prefix "Before AMX is enabled" { Lowercase "Before". > + gdb_test "print \$tilecfg_raw" "= 0" > + for {set i 0} {$i < 8} {incr i} { > + test_zeroed_tile "\$tmm$i" > + } > +} > + > + > +gdb_test "continue" \ > + ".*\\\[Inferior $decimal \\\(process $decimal\\\) exited normally\\]" gdb_continue_to_end > diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp > index 47cb2b23676..97664191244 100644 > --- a/gdb/testsuite/lib/gdb.exp > +++ b/gdb/testsuite/lib/gdb.exp > @@ -3429,6 +3429,73 @@ gdb_caching_proc skip_tsx_tests { > return $skip_tsx_tests > } > > +# Run a test on the target to see if it supports AMX. Return 0 if so, > +# 1 if it does not. Based on 'check_vmx_hw_available' from the GCC testsuite. > + > +gdb_caching_proc skip_amx_tests { > + global srcdir subdir gdb_prompt inferior_exited_re > + > + set me "skip_amx_tests" > + if { ![istarget "x86_64-*-*"] } { > + verbose "$me: target does not support AMX, returning 1" 2 Suprious double space after "me:".