public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* ☠ Buildbot (Sourceware): binutils-gdb - failed test (failure) test (failure) (master)
@ 2024-04-16 15:22  2% builder
  0 siblings, 0 replies; 65+ results
From: builder @ 2024-04-16 15:22 UTC (permalink / raw)
  To: A. Wilcox, Aaron Merey, Abdul Basit Ijaz, Aditya Kamath,
	Aditya Vidyadhar Kamath, Aditya Vidyadhar Kamath, Alan Modra,
	Aleksandar Paunovic, Alex Coplan, Alexandra Hájková,
	Alexandre Oliva, Alexey Lapshin, Alok Kumar Sharma, Andre Vieira,
	Andrea Corallo, Andreas Arnez, Andreas K. Huettel,
	Andreas Krebbel, Andreas Schwab, Andreas Schwab, Andrew Burgess,
	Andrew Burgess, Andrew Carlotti, Andrew Pinski, Ari Hannula,
	Arsen Arsenovi?, Arsen Arsenović,
	Benson Muite, Bernd Edlinger, Bernhard Heckel,
	Bernhard M. Wiedemann, Bhuvanendra Kumar N, Branislav Brzak,
	Brett Werling, Bruno Larsen, CaiJingtao, Carl Love, Carl Love,
	Cary Coutant, Christina Schimpe, Christoph Müllner,
	Christophe Lyon, Christophe Lyon, Christophe Lyon,
	Christopher Di Bella, Chung-Ju Wu, Ciaran Woodward,
	Claudio Bantaloukas, Claudiu Zissulescu, Claudiu Zissulescu,
	Clément Chigot, Cristian Sandu, Cui, Cui, Cupertino Miranda,
	Dan Callaghan, David Carew, David Faust, David Guillen Fandos,
	David Seifert, Dimitar Dimitrov, Dmitry Selyutin, Eli Zaretskii,
	Enze Li, Enze Li, Eugene Rozenfeld, Ezra Sitorus, Fangrui Song,
	Fangrui Song, Feiyang Chen, Felix Willgerodt, Flavio Cruz,
	Frederic Cambus, GDB Administrator, Gaius Mulley, Gareth Rees,
	Georg-Johann Lay, Gregory Anders, Guillermo E. Martinez,
	Guinevere Larsen, H.J. Lu, Hannes Domani, Hans-Peter Nilsson,
	Haochen Jiang, Hau Hsu, Himal, Hongyu Wang, Hsinyuan Xavier, Hu,
	Hui Li, Iain Buclaw, Iain Buclaw, Iain Sandoe, Ijaz,
	Ilya Leoshkevich, Indu Bhagat, Jacob Navia, Jakub Jelinek,
	Jan Beulich, Jan Kratochvil, Jan Vrany, Jan-Benedict Glaw,
	Jason Merrill, Jaydeep Patil, Jaydeep Patil, Jedidiah Thompson,
	Jeff Law, Jeff Law, Jens Remus, Jerry Zhang Jian, Jia-Wei Chen,
	Jiajie Chen, Jiangshuai Li, Jiangshuai Li, Jiangshuai Li, Jiawei,
	Jim Wilson, Jin Ma, Jinyang He, Joel Brobecker,
	Johannes Schauer Marin Rodrigues, John Baldwin,
	John David Anglin, Johnson Sun, Jojo R, Jon Turney,
	Jonas Hoerberg, Jonathan Wakely, Jose E. Marchesi, Joseph Faulls,
	Joseph Myers, Joseph Myers, Joseph Myers, Jozef Lawrynowicz,
	Kalvis Duckmanton, Kavitha Natarajan, Kaylee Blake, Keith Seitz,
	Kevin Buettner, Khem Raj, Kito Cheng, Kong Lingling,
	Konstantin Isakov, Kuan-Lin Chen, Kumar, Kévin Le Gouguec,
	LIU Hao, Lancelot SIX, Lancelot Six, Laurent Morichetti, Li Xu,
	Lifang Xia, Luca Bacci, Luca Boccassi, Luca Bonissi,
	Ludovic Courtès, Luis Machado, Lulu Cai, Lulu Cheng,
	Maciej W. Rozycki, Maciej W. Rozycki, Magne Hov, Manoj Gupta,
	Marcus Nilsson, Marek Polacek, Mark Harmstone, Mark Wielaard,
	Markus Metzger, Martin Liska, Martin Storsjö,
	Mary Bennett, Matheus Branco Borella, Matthew strager Glazar,
	Matthew Malcomson, Matthias Klose, Matthieu Longo, Matti Puputti,
	Max Filippov, Meghan Denny, Michael J. Eager, Michael Matz,
	Mihails Strasuns, Mike Frysinger, Mo, Mohamed Bouhaouel,
	Nandakumar Edamana, Natarajan, Nathan Huckleberry,
	Nathan Sidwell, Neal Frager, Neal frager, Nelson Chu, Nelson Chu,
	Nelson Chu, Nick Alcock, Nick Clifton, Nicolas Boulenguez,
	Nicolas Boulenguez, Nikolaos Chatzikonstantinou,
	Nils-Christian Kempke, Oleg Tolmatcev, Olivier Hainque,
	Orgad Shaneh, Palmer Dabbelt, Patrick Monnerat,
	Patrick O'Neill, Paul Iannetta, Paul Koning, Paul Pluzhnikov,
	Pedro Alves, Pedro Alves, Pekka Seppänen, Peter Bergner,
	Peter Edwards, Peter Foley, Peter Jones, Petr Tesarik,
	Philip Herron, Philipp Tomsich, Philippe Blain,
	Philippe Waroquiers, Potharla, Pter Chubb, Puputti, Rainer Orth,
	Ralf Habacker, Richard Ball, Richard Bunt, Richard Earnshaw,
	Richard Purdie, Richard Sandiford, Richard W.M. Jones,
	Roger Sayle, Rohr, Roland McGrath, Romain Geissler, Rui Ueyama,
	Rupesh Potharla, Ruud van der Pas, Sam James, Samuel Tardieu,
	Sandra Loosemore, Sandra Loosemore, Saurabh Jha, Schimpe,
	Sergei Trofimovich, Sergey Bugaev, Shahab Vahedi, Shihua,
	Simon Cook, Simon Farre, Simon Marchi, Simon Marchi,
	Song Mengzhi, Srinath Parvathaneni, Stafford Horne,
	Stam Markianos-Wright, Stefan Liebler,
	Stefan Schulze Frielinghaus, Stefano Moioli,
	Steinar H. Gunderson, Steinar H. Gunderson, Stepan Nemec,
	Stephen Kitt, Szabolcs Nagy, TaiseiIto, Tamar Christina,
	Tankut Baris Aktemur, Tatsuyuki Ishi, Tejas Joshi,
	Thiago Jung Bauermann, Thomas Hebb, Thomas Koenig,
	Thomas Schwinge, Thomas Weißschuh, Tiezhu Yang,
	Tobias Burnus, Toby Lloyd Davies, Tom Tromey, Tom Tromey,
	Tom de Vries, Tom de Vries, Tom de Vries, Tom de Vries,
	Tom de Vries via Gdb-patches, Tomoaki Kawada,
	Torbjörn SVENSSON, Tristan Gingold, Tsukasa OI,
	Ulf Samuelsson, Victor Do Nascimento, Victor Do Nascimento,
	Vijay Shankar, Vladimir Mezentsev, Vladislav Belov,
	Vladislav Khmelevsky, Vsevolod Alekseyev, WANG Rui, WANG Xuerui,
	Weimin Pan, Will Hawkins, Willgerodt, Xi Ruoyao, Xi Ruoyao,
	Xianmiao Qu, Xiao Zeng, Yang Liu, Yichao Yu, Ying Huang,
	Yoshinori Sato, Youling Tang, YunQiang Su, Yuriy Kolerov,
	Yuriy Kolerov, Yury Khrustalev, Yury Khrustalev, Yvan Roux,
	Zac Walker, Zac Walker, Zeke Lu, Zhang, Zhiqing Xiong, cailulu,
	caiyinyu, changjiachen, jiawei, konglin1, liuhongt, liuzhensong,
	mengqinggang, mga-sc, rupesh potharla, rupothar, srinath,
	tangxiaolin, ticat_fp, yaowenbin, zengxiao,
	Дилян
	Палаузов,
	Сергей
	Чернов

A new failure has been detected on builder binutils-ubuntu-riscv while building binutils-gdb.

Full details are available at:
    https://builder.sourceware.org/buildbot/#/builders/283/builds/367

Build state: failed test (failure) test (failure)
Revision: 3f6a060c7543332d0cb4377fc318e2db01ea1d3c
Worker: starfive-2
Build Reason: (unknown)
Blamelist: A. Wilcox <awilfox@adelielinux.org>, Aaron Merey <amerey@redhat.com>, Abdul Basit Ijaz <abdul.b.ijaz@intel.com>, Aditya Kamath <Aditya.Kamath1@ibm.com>, Aditya Vidyadhar Kamath <ADITYA.VIDYADHAR.KAMATH@ibm.com>, Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>, Alan Modra <amodra@gmail.com>, Aleksandar Paunovic <aleksandar.paunovic@intel.com>, Alex Coplan <alex.coplan@arm.com>, Alexandra Hájková <ahajkova@redhat.com>, Alexandre Oliva <oliva@adacore.com>, Alexey Lapshin <alexey.lapshin@espressif.com>, Alok Kumar Sharma <AlokKumar.Sharma@amd.com>, Andre Vieira <andre.simoesdiasvieira@arm.com>, Andrea Corallo <andrea.corallo@arm.com>, Andreas Arnez <arnez@linux.ibm.com>, Andreas K. Huettel <dilfridge@gentoo.org>, Andreas Krebbel <krebbel@linux.ibm.com>, Andreas Schwab <schwab@linux-m68k.org>, Andreas Schwab <schwab@suse.de>, Andrew Burgess <aburgess@redhat.com>, Andrew Burgess <andrew.burgess@embecosm.com>, Andrew Carlotti <andrew.carlotti@arm.com>, Andrew Pinski <apinski@marvell.com>, Ari Hannula <ari.hannula@intel.com>, Arsen Arsenovi? <arsen@aarsen.me>, Arsen Arsenović <arsen@aarsen.me>, Benson Muite <benson_muite@emailplus.org>, Bernd Edlinger <bernd.edlinger@hotmail.de>, Bernhard Heckel <bernhard.heckel@intel.com>, Bernhard M. Wiedemann <bwiedemann@suse.de>, Bhuvanendra Kumar N <Bhuvanendra.KumarN@amd.com>, Branislav Brzak <branislav.brzak@syrmia.com>, Brett Werling <bwerl.dev@gmail.com>, Bruno Larsen <blarsen@redhat.com>, CaiJingtao <caijingtao@huawei.com>, Carl Love <cel@linux.ibm.com>, Carl Love <cel@us.ibm.com>, Cary Coutant <ccoutant@gmail.com>, Christina Schimpe <christina.schimpe@intel.com>, Christoph Müllner <christoph.muellner@vrull.eu>, Christophe Lyon <christophe.lyon@arm.com>, Christophe Lyon <christophe.lyon@linaro.org>, Christophe Lyon <christophe.lyon@st.com>, Christopher Di Bella <cjdb@google.com>, Chung-Ju Wu <jasonwucj@gmail.com>, Ciaran Woodward <ciaranwoodward@xmos.com>, Claudio Bantaloukas <Claudio.Bantaloukas@arm.com>, Claudiu Zissulescu <claziss@gmail.com>, Claudiu Zissulescu <claziss@synopsys.com>, Clément Chigot <chigot@adacore.com>, Cristian Sandu <cristian.sandu@intel.com>, Cui, Lili <lili.cui@intel.com>, Cui,Lili <lili.cui@intel.com>, Cupertino Miranda <cupertino.miranda@oracle.com>, Dan Callaghan <dan.callaghan@morsemicro.com>, David Carew <david@dcarew.com>, David Faust <david.faust@oracle.com>, David Guillen Fandos <david@davidgf.net>, David Seifert <soap@gentoo.org>, Dimitar Dimitrov <dimitar@dinux.eu>, Dmitry Selyutin <ghostmansd@gmail.com>, Eli Zaretskii <eliz@gnu.org>, Enze Li <enze.li@gmx.com>, Enze Li <enze.li@hotmail.com>, Eugene Rozenfeld <erozen@microsoft.com>, Ezra Sitorus <ezra.sitorus@arm.com>, Fangrui Song <i@maskray.me>, Fangrui Song <maskray@google.com>, Feiyang Chen <chenfeiyang@loongson.cn>, Felix Willgerodt <felix.willgerodt@intel.com>, Flavio Cruz <flaviocruz@gmail.com>, Frederic Cambus <fred@statdns.com>, GDB Administrator <gdbadmin@sourceware.org>, Gaius Mulley <gaiusmod2@gmail.com>, Gareth Rees <grees@undo.io>, Georg-Johann Lay <avr@gjlay.de>, Gregory Anders <greg@gpanders.com>, Guillermo E. Martinez <guillermo.e.martinez@oracle.com>, Guinevere Larsen <blarsen@redhat.com>, H.J. Lu <hjl.tools@gmail.com>, Hannes Domani <ssbssa@yahoo.de>, Hans-Peter Nilsson <hp@axis.com>, Haochen Jiang <haochen.jiang@intel.com>, Hau Hsu <hau.hsu@sifive.com>, Himal <himalr@proton.me>, Hongyu Wang <hongyu.wang@intel.com>, Hsinyuan Xavier <TheLastLin@hotmail.com>, Hu, Lin1 <lin1.hu@intel.com>, Hui Li <lihui@loongson.cn>, Iain Buclaw <ibuclaw@gcc.gnu.org>, Iain Buclaw <ibuclaw@gdcproject.org>, Iain Sandoe <iain@sandoe.co.uk>, Ijaz, Abdul B <abdul.b.ijaz@intel.com>, Ilya Leoshkevich <iii@linux.ibm.com>, Indu Bhagat <indu.bhagat@oracle.com>, Jacob Navia <jacob@jacob.remcomp.fr>, Jakub Jelinek <jakub@redhat.com>, Jan Beulich <jbeulich@suse.com>, Jan Kratochvil <jan.kratochvil@redhat.com>, Jan Vrany <jan.vrany@labware.com>, Jan-Benedict Glaw <jbglaw@lug-owl.de>, Jason Merrill <jason@redhat.com>, Jaydeep Patil <Jaydeep.Patil@imgtec.com>, Jaydeep Patil <jaydeep.patil@imgtec.com>, Jedidiah Thompson <wej22007@outlook.com>, Jeff Law <jeffreyalaw@gmail.com>, Jeff Law <jlaw@ventanamicro.com>, Jens Remus <jremus@linux.ibm.com>, Jerry Zhang Jian <jerry.zhangjian@sifive.com>, Jia-Wei Chen <jiawei@iscas.ac.cn>, Jiajie Chen <c@jia.je>, Jiangshuai Li <jiangshuai_li@c-sky.com>, Jiangshuai Li <jiangshuai_li@linux.alibaba-inc.com>, Jiangshuai Li <jiangshuai_li@linux.alibaba.com>, Jiawei <jiawei@iscas.ac.cn>, Jim Wilson <jimw@sifive.com>, Jin Ma <jinma@linux.alibaba.com>, Jinyang He <hejinyang@loongson.cn>, Joel Brobecker <brobecker@adacore.com>, Johannes Schauer Marin Rodrigues <josch@debian.org>, John Baldwin <jhb@FreeBSD.org>, John David Anglin <danglin@gcc.gnu.org>, Johnson Sun <j3.soon777@gmail.com>, Jojo R <rjiejie@linux.alibaba.com>, Jon Turney <jon.turney@dronecode.org.uk>, Jonas Hoerberg <JHorberg@danfoss.com>, Jonathan Wakely <jwakely@redhat.com>, Jose E. Marchesi <jose.marchesi@oracle.com>, Joseph Faulls <Joseph.Faulls@imgtec.com>, Joseph Myers <joseph@codesourcery.com>, Joseph Myers <josmyers@redhat.com>, Joseph Myers <jsm@polyomino.org.uk>, Jozef Lawrynowicz <jozefl@gcc.gnu.org>, Kalvis Duckmanton <kalvisd@gmail.com>, Kavitha Natarajan <kavitha.natarajan@amd.com>, Kaylee Blake <klkblake@gmail.com>, Keith Seitz <keiths@redhat.com>, Kevin Buettner <kevinb@redhat.com>, Khem Raj <raj.khem@gmail.com>, Kito Cheng <kito.cheng@sifive.com>, Kong Lingling <lingling.kong@intel.com>, Konstantin Isakov <ikm@zbackup.org>, Kuan-Lin Chen <rufus@andestech.com>, Kumar N, Bhuvanendra <Kavitha.Natarajan@amd.com>, Kévin Le Gouguec <legouguec@adacore.com>, LIU Hao <lh_mouse@126.com>, Lancelot SIX <lancelot.six@amd.com>, Lancelot Six <lancelot.six@amd.com>, Laurent Morichetti <laurent.morichetti@amd.com>, Li Xu <xuli1@eswincomputing.com>, Lifang Xia <lifang_xia@linux.alibaba.com>, Luca Bacci <luca.bacci@outlook.com>, Luca Boccassi <bluca@debian.org>, Luca Bonissi <gcc@scarsita.it>, Ludovic Courtès <ludo@gnu.org>, Luis Machado <luis.machado@arm.com>, Lulu Cai <cailulu@loongson.cn>, Lulu Cheng <chenglulu@loongson.cn>, Maciej W. Rozycki <macro@embecosm.com>, Maciej W. Rozycki <macro@orcam.me.uk>, Magne Hov <mhov@undo.io>, Manoj Gupta <manojgupta@google.com>, Marcus Nilsson <brainbomb@gmail.com>, Marek Polacek <polacek@redhat.com>, Mark Harmstone <mark@harmstone.com>, Mark Wielaard <mark@klomp.org>, Markus Metzger <markus.t.metzger@intel.com>, Martin Liska <mliska@suse.cz>, Martin Storsjö <martin@martin.st>, Mary Bennett <mary.bennett@embecosm.com>, Matheus Branco Borella <dark.ryu.550@gmail.com>, Matthew "strager" Glazar <strager.nds@gmail.com>, Matthew Malcomson <hardenedapple@gmail.com>, Matthias Klose <doko@debian.org>, Matthieu Longo <matthieu.longo@arm.com>, Matti Puputti <matti.puputti@intel.com>, Max Filippov <jcmvbkbc@gmail.com>, Meghan Denny <hello@nektro.net>, Michael J. Eager <eager@eagercon.com>, Michael Matz <matz@suse.de>, Mihails Strasuns <mihails.strasuns@intel.com>, Mike Frysinger <vapier@gentoo.org>, Mo, Zewei <zewei.mo@intel.com>, Mohamed Bouhaouel <mohamed.bouhaouel@intel.com>, Nandakumar Edamana <nandakumar@nandakumar.co.in>, Natarajan, Kavitha <Kavitha.Natarajan@amd.com>, Nathan Huckleberry <nhuck@google.com>, Nathan Sidwell <nathan@acm.org>, Neal Frager <neal.frager@amd.com>, Neal frager <neal.frager@amd.com>, Nelson Chu <nelson.chu@sifive.com>, Nelson Chu <nelson@nelson.ba.rivosinc.com>, Nelson Chu <nelson@rivosinc.com>, Nick Alcock <nick.alcock@oracle.com>, Nick Clifton <nickc@redhat.com>, Nicolas Boulenguez <nicolas.boulenguez@free.fr>, Nicolas Boulenguez <nicolas@debian.org>, Nikolaos Chatzikonstantinou <nchatz314@gmail.com>, Nils-Christian Kempke <nils-christian.kempke@intel.com>, Oleg Tolmatcev <oleg.tolmatcev@gmail.com>, Olivier Hainque <hainque@adacore.com>, Orgad Shaneh <orgads@gmail.com>, Palmer Dabbelt <palmer@rivosinc.com>, Patrick Monnerat <patrick@monnerat.net>, Patrick O'Neill <patrick@rivosinc.com>, Paul Iannetta <piannetta@kalrayinc.com>, Paul Koning <paulkoning@comcast.net>, Paul Pluzhnikov <ppluzhnikov@google.com>, Pedro Alves <palves@redhat.com>, Pedro Alves <pedro@palves.net>, Pekka Seppänen <pexu@sourceware.mail.kapsi.fi>, Peter Bergner <bergner@linux.ibm.com>, Peter Edwards <peadar@arista.com>, Peter Foley <pefoley2@pefoley.com>, Peter Jones <pjones@redhat.com>, Petr Tesarik <petr@tesarici.cz>, Philip Herron <philip.herron@embecosm.com>, Philipp Tomsich <philipp.tomsich@vrull.eu>, Philippe Blain <levraiphilippeblain@gmail.com>, Philippe Waroquiers <philippe.waroquiers@skynet.be>, Potharla, Rupesh <Rupesh.Potharla@amd.com>, Pter Chubb <peter.chubb@unsw.edu.au>, Puputti, Matti <matti.puputti@intel.com>, Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>, Ralf Habacker <ralf.habacker@freenet.de>, Richard Ball <richard.ball@arm.com>, Richard Bunt <richard.bunt@linaro.org>, Richard Earnshaw <rearnsha@arm.com>, Richard Purdie <richard.purdie@linuxfoundation.org>, Richard Sandiford <richard.sandiford@arm.com>, Richard W.M. Jones <rjones@redhat.com>, Roger Sayle <roger@nextmovesoftware.com>, Rohr, Stephan <stephan.rohr@intel.com>, Roland McGrath <mcgrathr@google.com>, Romain Geissler <romain.geissler@amadeus.com>, Rui Ueyama <rui314@gmail.com>, Rupesh Potharla <Rupesh.Potharla@amd.com>, Ruud van der Pas <ruud.vanderpas@oracle.com>, Sam James <sam@gentoo.org>, Samuel Tardieu <sam@rfc1149.net>, Sandra Loosemore <sandra@codesourcery.com>, Sandra Loosemore <sloosemore@baylibre.com>, Saurabh Jha <saurabh.jha@arm.com>, Schimpe, Christina <christina.schimpe@intel.com>, Sergei Trofimovich <siarheit@google.com>, Sergey Bugaev <bugaevc@gmail.com>, Shahab Vahedi <shahab@synopsys.com>, Shihua <shihua@iscas.ac.cn>, Simon Cook <simon.cook@embecosm.com>, Simon Farre <simon.farre.cx@gmail.com>, Simon Marchi <simon.marchi@efficios.com>, Simon Marchi <simon.marchi@polymtl.ca>, Song Mengzhi <song.mengzhi@zte.com.cn>, Srinath Parvathaneni <srinath.parvathaneni@arm.com>, Stafford Horne <shorne@gmail.com>, Stam Markianos-Wright <stam.markianos-wright@arm.com>, Stefan Liebler <stli@linux.ibm.com>, Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>, Stefano Moioli <smxdev4@gmail.com>, Steinar H. Gunderson <sesse@google.com>, Steinar H. Gunderson <steinar+sourceware@gunderson.no>, Stepan Nemec <stepnem@gmail.com>, Stephen Kitt <steve@sk2.org>, Szabolcs Nagy <szabolcs.nagy@arm.com>, TaiseiIto <taisei1212@outlook.jp>, Tamar Christina <tamar.christina@arm.com>, Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>, Tatsuyuki Ishi <ishitatsuyuki@gmail.com>, Tejas Joshi <TejasSanjay.Joshi@amd.com>, Thiago Jung Bauermann <thiago.bauermann@linaro.org>, Thomas Hebb <tommyhebb@gmail.com>, Thomas Koenig <tkoenig@netcologne.de>, Thomas Schwinge <thomas@codesourcery.com>, Thomas Weißschuh <thomas@t-8ch.de>, Tiezhu Yang <yangtiezhu@loongson.cn>, Tobias Burnus <tobias@codesourcery.com>, Toby Lloyd Davies <tlloyddavies@undo.io>, Tom Tromey <tom@tromey.com>, Tom Tromey <tromey@adacore.com>, Tom de Vries <tdevries@jostaberry-8.arch.suse.de>, Tom de Vries <tdevries@loganberry-1.arch.suse.de>, Tom de Vries <tdevries@space.suse.cz>, Tom de Vries <tdevries@suse.de>, Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>, Tomoaki Kawada <kawada@kmckk.co.jp>, Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>, Tristan Gingold <tgingold@free.fr>, Tsukasa OI <research_trasio@irq.a4lg.com>, Ulf Samuelsson <ulf@emagii.com>, Victor Do Nascimento <Victor.DoNascimento@arm.com>, Victor Do Nascimento <victor.donascimento@arm.com>, Vijay Shankar <shank.vijay@yandex.com>, Vladimir Mezentsev <vladimir.mezentsev@oracle.com>, Vladislav Belov <vladislav.belov@syntacore.com>, Vladislav Khmelevsky <och95@yandex.ru>, Vsevolod Alekseyev <sevaa@sprynet.com>, WANG Rui <r@hev.cc>, WANG Xuerui <git@xen0n.name>, Weimin Pan <weimin.pan@oracle.com>, Will Hawkins <hawkinsw@obs.cr>, Willgerodt, Felix <felix.willgerodt@intel.com>, Xi Ruoyao <xry111@mengyan1223.wang>, Xi Ruoyao <xry111@xry111.site>, Xianmiao Qu <cooper.qu@linux.alibaba.com>, Xiao Zeng <zengxiao@eswincomputing.com>, Yang Liu <liuyang22@iscas.ac.cn>, Yichao Yu <yyc1992@gmail.com>, Ying Huang <ying.huang@oss.cipunited.com>, Yoshinori Sato <ysato@users.sourceforge.jp>, Youling Tang <tangyouling@loongson.cn>, YunQiang Su <yunqiang.su@cipunited.com>, Yuriy Kolerov <Yuriy.Kolerov@synopsys.com>, Yuriy Kolerov <kolerov93@gmail.com>, Yury Khrustalev <Yury.Khrustalev@arm.com>, Yury Khrustalev <yury.khrustalev@arm.com>, Yvan Roux <yvan.roux@foss.st.com>, Zac Walker <zac.walker@linaro.org>, Zac Walker <zacwalker@microsoft.com>, Zeke Lu <lvzecai@gmail.com>, Zhang, Jun <jun.zhang@intel.com>, Zhiqing Xiong <zhiqxion@qti.qualcomm.com>, cailulu <cailulu@loongson.cn>, caiyinyu <caiyinyu@loongson.cn>, changjiachen <changjiachen@stu.xupt.edu.cn>, jiawei <jiawei@iscas.ac.cn>, konglin1 <lingling.kong@intel.com>, liuhongt <hongtao.liu@intel.com>, liuzhensong <liuzhensong@loongson.cn>, mengqinggang <mengqinggang@loongson.cn>, mga-sc <mark.goncharov@syntacore.com>, rupesh potharla <rupesh.potharla@amd.com>, rupothar <rupesh.potharla@amd.com>, srinath <srinath.parvathaneni@arm.com>, tangxiaolin <tangxiaolin@loongson.cn>, ticat_fp <fanpeng@loongson.cn>, yaowenbin <yaowenbin1@huawei.com>, zengxiao <zengxiao@eswincomputing.com>, Дилян Палаузов <dilyan.palauzov@aegee.org>, Сергей Чернов <klen_s@mail.ru>

Steps:

- 0: worker_preparation ( success )

- 1: git checkout ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/1/logs/stdio

- 2: rm -rf binutils-build ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/2/logs/stdio

- 3: configure ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/3/logs/stdio
        - config.log: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/3/logs/config_log

- 4: make ( warnings )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/4/logs/stdio
        - warnings (12): https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/4/logs/warnings__12_

- 5: make check gas binutils ( failure )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/5/logs/stdio
        - gas.sum: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/5/logs/gas_sum
        - gas.log: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/5/logs/gas_log
        - binutils.sum: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/5/logs/binutils_sum
        - binutils.log: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/5/logs/binutils_log
        - warnings (4): https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/5/logs/warnings__4_

- 6: make check ld ( failure )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/6/logs/stdio
        - ld.sum: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/6/logs/ld_sum
        - ld.log: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/6/logs/ld_log
        - warnings (2): https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/6/logs/warnings__2_

- 7: prep ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/7/logs/stdio

- 8: build bunsen.cpio.gz ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/8/logs/stdio

- 9: fetch bunsen.cpio.gz ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/9/logs/stdio

- 10: unpack bunsen.cpio.gz ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/10/logs/stdio

- 11: pass .bunsen.source.gitname ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/11/logs/stdio

- 12: pass .bunsen.source.gitdescribe ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/12/logs/stdio

- 13: pass .bunsen.source.gitbranch ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/13/logs/stdio

- 14: pass .bunsen.source.gitrepo ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/14/logs/stdio

- 15: upload to bunsen ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/15/logs/stdio

- 16: clean up ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/16/logs/stdio

- 17: rm -rf binutils-build_1 ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/283/builds/367/steps/17/logs/stdio


^ permalink raw reply	[relevance 2%]

* ☠ Buildbot (Sourceware): binutils-gdb - failed test (failure) (master)
@ 2024-04-12  6:06  2% builder
  0 siblings, 0 replies; 65+ results
From: builder @ 2024-04-12  6:06 UTC (permalink / raw)
  To: A. Wilcox, Aaron Merey, Abdul Basit Ijaz, Aditya Kamath,
	Aditya Vidyadhar Kamath, Aditya Vidyadhar Kamath, Alan Modra,
	Aleksandar Paunovic, Alex Coplan, Alexandra Hájková,
	Alexandre Oliva, Alexey Lapshin, Alok Kumar Sharma, Andre Vieira,
	Andrea Corallo, Andreas Arnez, Andreas K. Huettel,
	Andreas Krebbel, Andreas Schwab, Andreas Schwab, Andrew Burgess,
	Andrew Burgess, Andrew Carlotti, Andrew Pinski, Ari Hannula,
	Arsen Arsenovi?, Arsen Arsenović,
	Benson Muite, Bernd Edlinger, Bernhard Heckel,
	Bernhard M. Wiedemann, Bhuvanendra Kumar N, Branislav Brzak,
	Brett Werling, Bruno Larsen, CaiJingtao, Carl Love, Carl Love,
	Cary Coutant, Christina Schimpe, Christoph Müllner,
	Christophe Lyon, Christophe Lyon, Christophe Lyon,
	Christopher Di Bella, Chung-Ju Wu, Ciaran Woodward,
	Claudio Bantaloukas, Claudiu Zissulescu, Claudiu Zissulescu,
	Clément Chigot, Cristian Sandu, Cui, Cui, Cupertino Miranda,
	Dan Callaghan, David Carew, David Faust, David Guillen Fandos,
	David Seifert, Dimitar Dimitrov, Dmitry Selyutin, Eli Zaretskii,
	Enze Li, Enze Li, Eugene Rozenfeld, Ezra Sitorus, Fangrui Song,
	Fangrui Song, Feiyang Chen, Felix Willgerodt, Flavio Cruz,
	Frederic Cambus, GDB Administrator, Gaius Mulley, Gareth Rees,
	Georg-Johann Lay, Gregory Anders, Guillermo E. Martinez,
	Guinevere Larsen, H.J. Lu, Hannes Domani, Hans-Peter Nilsson,
	Haochen Jiang, Hau Hsu, Himal, Hongyu Wang, Hsinyuan Xavier, Hu,
	Hui Li, Iain Buclaw, Iain Buclaw, Iain Sandoe, Ijaz,
	Ilya Leoshkevich, Indu Bhagat, Jacob Navia, Jakub Jelinek,
	Jan Beulich, Jan Kratochvil, Jan Vrany, Jan-Benedict Glaw,
	Jason Merrill, Jaydeep Patil, Jaydeep Patil, Jedidiah Thompson,
	Jeff Law, Jeff Law, Jens Remus, Jerry Zhang Jian, Jia-Wei Chen,
	Jiajie Chen, Jiangshuai Li, Jiangshuai Li, Jiangshuai Li, Jiawei,
	Jim Wilson, Jin Ma, Jinyang He, Joel Brobecker,
	Johannes Schauer Marin Rodrigues, John Baldwin,
	John David Anglin, Johnson Sun, Jojo R, Jon Turney,
	Jonas Hoerberg, Jonathan Wakely, Jose E. Marchesi, Joseph Faulls,
	Joseph Myers, Joseph Myers, Joseph Myers, Jozef Lawrynowicz,
	Kalvis Duckmanton, Kavitha Natarajan, Kaylee Blake, Keith Seitz,
	Kevin Buettner, Khem Raj, Kito Cheng, Kong Lingling,
	Konstantin Isakov, Kuan-Lin Chen, Kumar, Kévin Le Gouguec,
	LIU Hao, Lancelot SIX, Lancelot Six, Laurent Morichetti, Li Xu,
	Lifang Xia, Luca Bacci, Luca Boccassi, Luca Bonissi,
	Ludovic Courtès, Luis Machado, Lulu Cai, Lulu Cheng,
	Maciej W. Rozycki, Maciej W. Rozycki, Magne Hov, Manoj Gupta,
	Marcus Nilsson, Marek Polacek, Mark Harmstone, Mark Wielaard,
	Markus Metzger, Martin Liska, Martin Storsjö,
	Mary Bennett, Matheus Branco Borella, Matthew strager Glazar,
	Matthew Malcomson, Matthias Klose, Matthieu Longo, Matti Puputti,
	Max Filippov, Meghan Denny, Michael J. Eager, Michael Matz,
	Mihails Strasuns, Mike Frysinger, Mo, Mohamed Bouhaouel,
	Nandakumar Edamana, Natarajan, Nathan Huckleberry,
	Nathan Sidwell, Neal Frager, Neal frager, Nelson Chu, Nelson Chu,
	Nelson Chu, Nick Alcock, Nick Clifton, Nicolas Boulenguez,
	Nicolas Boulenguez, Nikolaos Chatzikonstantinou,
	Nils-Christian Kempke, Oleg Tolmatcev, Olivier Hainque,
	Orgad Shaneh, Palmer Dabbelt, Patrick Monnerat,
	Patrick O'Neill, Paul Iannetta, Paul Koning, Paul Pluzhnikov,
	Pedro Alves, Pedro Alves, Pekka Seppänen, Peter Bergner,
	Peter Edwards, Peter Foley, Peter Jones, Petr Tesarik,
	Philip Herron, Philipp Tomsich, Philippe Blain,
	Philippe Waroquiers, Potharla, Pter Chubb, Puputti, Rainer Orth,
	Ralf Habacker, Richard Ball, Richard Bunt, Richard Earnshaw,
	Richard Purdie, Richard Sandiford, Richard W.M. Jones,
	Roger Sayle, Rohr, Roland McGrath, Romain Geissler, Rui Ueyama,
	Rupesh Potharla, Ruud van der Pas, Sam James, Samuel Tardieu,
	Sandra Loosemore, Sandra Loosemore, Saurabh Jha, Schimpe,
	Sergei Trofimovich, Sergey Bugaev, Shahab Vahedi, Shihua,
	Simon Farre, Simon Marchi, Simon Marchi, Song Mengzhi,
	Srinath Parvathaneni, Stafford Horne, Stam Markianos-Wright,
	Stefan Liebler, Stefan Schulze Frielinghaus, Stefano Moioli,
	Steinar H. Gunderson, Steinar H. Gunderson, Stepan Nemec,
	Stephen Kitt, Szabolcs Nagy, TaiseiIto, Tamar Christina,
	Tankut Baris Aktemur, Tatsuyuki Ishi, Tejas Joshi,
	Thiago Jung Bauermann, Thomas Hebb, Thomas Koenig,
	Thomas Schwinge, Thomas Weißschuh, Tiezhu Yang,
	Tobias Burnus, Toby Lloyd Davies, Tom Tromey, Tom Tromey,
	Tom de Vries, Tom de Vries, Tom de Vries, Tom de Vries,
	Tom de Vries via Gdb-patches, Tomoaki Kawada,
	Torbjörn SVENSSON, Tristan Gingold, Tsukasa OI,
	Ulf Samuelsson, Victor Do Nascimento, Victor Do Nascimento,
	Vladimir Mezentsev, Vladislav Belov, Vladislav Khmelevsky,
	Vsevolod Alekseyev, WANG Rui, WANG Xuerui, Weimin Pan,
	Will Hawkins, Willgerodt, Xi Ruoyao, Xi Ruoyao, Xianmiao Qu,
	Xiao Zeng, Yang Liu, Yichao Yu, Ying Huang, Yoshinori Sato,
	Youling Tang, YunQiang Su, Yuriy Kolerov, Yuriy Kolerov,
	Yury Khrustalev, Yury Khrustalev, Yvan Roux, Zac Walker,
	Zac Walker, Zeke Lu, Zhang, Zhiqing Xiong, cailulu, caiyinyu,
	changjiachen, jiawei, konglin1, liuhongt, liuzhensong,
	mengqinggang, mga-sc, rupesh potharla, rupothar, srinath,
	tangxiaolin, ticat_fp, yaowenbin, zengxiao,
	Дилян
	Палаузов,
	Сергей
	Чернов

A new failure has been detected on builder binutils-fedora-s390x while building binutils-gdb.

Full details are available at:
    https://builder.sourceware.org/buildbot/#/builders/81/builds/3539

Build state: failed test (failure)
Revision: 0f8adbf77dd3f40e74529fa989dca034c73a7273
Worker: fedora-s390x
Build Reason: (unknown)
Blamelist: A. Wilcox <awilfox@adelielinux.org>, Aaron Merey <amerey@redhat.com>, Abdul Basit Ijaz <abdul.b.ijaz@intel.com>, Aditya Kamath <Aditya.Kamath1@ibm.com>, Aditya Vidyadhar Kamath <ADITYA.VIDYADHAR.KAMATH@ibm.com>, Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>, Alan Modra <amodra@gmail.com>, Aleksandar Paunovic <aleksandar.paunovic@intel.com>, Alex Coplan <alex.coplan@arm.com>, Alexandra Hájková <ahajkova@redhat.com>, Alexandre Oliva <oliva@adacore.com>, Alexey Lapshin <alexey.lapshin@espressif.com>, Alok Kumar Sharma <AlokKumar.Sharma@amd.com>, Andre Vieira <andre.simoesdiasvieira@arm.com>, Andrea Corallo <andrea.corallo@arm.com>, Andreas Arnez <arnez@linux.ibm.com>, Andreas K. Huettel <dilfridge@gentoo.org>, Andreas Krebbel <krebbel@linux.ibm.com>, Andreas Schwab <schwab@linux-m68k.org>, Andreas Schwab <schwab@suse.de>, Andrew Burgess <aburgess@redhat.com>, Andrew Burgess <andrew.burgess@embecosm.com>, Andrew Carlotti <andrew.carlotti@arm.com>, Andrew Pinski <apinski@marvell.com>, Ari Hannula <ari.hannula@intel.com>, Arsen Arsenovi? <arsen@aarsen.me>, Arsen Arsenović <arsen@aarsen.me>, Benson Muite <benson_muite@emailplus.org>, Bernd Edlinger <bernd.edlinger@hotmail.de>, Bernhard Heckel <bernhard.heckel@intel.com>, Bernhard M. Wiedemann <bwiedemann@suse.de>, Bhuvanendra Kumar N <Bhuvanendra.KumarN@amd.com>, Branislav Brzak <branislav.brzak@syrmia.com>, Brett Werling <bwerl.dev@gmail.com>, Bruno Larsen <blarsen@redhat.com>, CaiJingtao <caijingtao@huawei.com>, Carl Love <cel@linux.ibm.com>, Carl Love <cel@us.ibm.com>, Cary Coutant <ccoutant@gmail.com>, Christina Schimpe <christina.schimpe@intel.com>, Christoph Müllner <christoph.muellner@vrull.eu>, Christophe Lyon <christophe.lyon@arm.com>, Christophe Lyon <christophe.lyon@linaro.org>, Christophe Lyon <christophe.lyon@st.com>, Christopher Di Bella <cjdb@google.com>, Chung-Ju Wu <jasonwucj@gmail.com>, Ciaran Woodward <ciaranwoodward@xmos.com>, Claudio Bantaloukas <Claudio.Bantaloukas@arm.com>, Claudiu Zissulescu <claziss@gmail.com>, Claudiu Zissulescu <claziss@synopsys.com>, Clément Chigot <chigot@adacore.com>, Cristian Sandu <cristian.sandu@intel.com>, Cui, Lili <lili.cui@intel.com>, Cui,Lili <lili.cui@intel.com>, Cupertino Miranda <cupertino.miranda@oracle.com>, Dan Callaghan <dan.callaghan@morsemicro.com>, David Carew <david@dcarew.com>, David Faust <david.faust@oracle.com>, David Guillen Fandos <david@davidgf.net>, David Seifert <soap@gentoo.org>, Dimitar Dimitrov <dimitar@dinux.eu>, Dmitry Selyutin <ghostmansd@gmail.com>, Eli Zaretskii <eliz@gnu.org>, Enze Li <enze.li@gmx.com>, Enze Li <enze.li@hotmail.com>, Eugene Rozenfeld <erozen@microsoft.com>, Ezra Sitorus <ezra.sitorus@arm.com>, Fangrui Song <i@maskray.me>, Fangrui Song <maskray@google.com>, Feiyang Chen <chenfeiyang@loongson.cn>, Felix Willgerodt <felix.willgerodt@intel.com>, Flavio Cruz <flaviocruz@gmail.com>, Frederic Cambus <fred@statdns.com>, GDB Administrator <gdbadmin@sourceware.org>, Gaius Mulley <gaiusmod2@gmail.com>, Gareth Rees <grees@undo.io>, Georg-Johann Lay <avr@gjlay.de>, Gregory Anders <greg@gpanders.com>, Guillermo E. Martinez <guillermo.e.martinez@oracle.com>, Guinevere Larsen <blarsen@redhat.com>, H.J. Lu <hjl.tools@gmail.com>, Hannes Domani <ssbssa@yahoo.de>, Hans-Peter Nilsson <hp@axis.com>, Haochen Jiang <haochen.jiang@intel.com>, Hau Hsu <hau.hsu@sifive.com>, Himal <himalr@proton.me>, Hongyu Wang <hongyu.wang@intel.com>, Hsinyuan Xavier <TheLastLin@hotmail.com>, Hu, Lin1 <lin1.hu@intel.com>, Hui Li <lihui@loongson.cn>, Iain Buclaw <ibuclaw@gcc.gnu.org>, Iain Buclaw <ibuclaw@gdcproject.org>, Iain Sandoe <iain@sandoe.co.uk>, Ijaz, Abdul B <abdul.b.ijaz@intel.com>, Ilya Leoshkevich <iii@linux.ibm.com>, Indu Bhagat <indu.bhagat@oracle.com>, Jacob Navia <jacob@jacob.remcomp.fr>, Jakub Jelinek <jakub@redhat.com>, Jan Beulich <jbeulich@suse.com>, Jan Kratochvil <jan.kratochvil@redhat.com>, Jan Vrany <jan.vrany@labware.com>, Jan-Benedict Glaw <jbglaw@lug-owl.de>, Jason Merrill <jason@redhat.com>, Jaydeep Patil <Jaydeep.Patil@imgtec.com>, Jaydeep Patil <jaydeep.patil@imgtec.com>, Jedidiah Thompson <wej22007@outlook.com>, Jeff Law <jeffreyalaw@gmail.com>, Jeff Law <jlaw@ventanamicro.com>, Jens Remus <jremus@linux.ibm.com>, Jerry Zhang Jian <jerry.zhangjian@sifive.com>, Jia-Wei Chen <jiawei@iscas.ac.cn>, Jiajie Chen <c@jia.je>, Jiangshuai Li <jiangshuai_li@c-sky.com>, Jiangshuai Li <jiangshuai_li@linux.alibaba-inc.com>, Jiangshuai Li <jiangshuai_li@linux.alibaba.com>, Jiawei <jiawei@iscas.ac.cn>, Jim Wilson <jimw@sifive.com>, Jin Ma <jinma@linux.alibaba.com>, Jinyang He <hejinyang@loongson.cn>, Joel Brobecker <brobecker@adacore.com>, Johannes Schauer Marin Rodrigues <josch@debian.org>, John Baldwin <jhb@FreeBSD.org>, John David Anglin <danglin@gcc.gnu.org>, Johnson Sun <j3.soon777@gmail.com>, Jojo R <rjiejie@linux.alibaba.com>, Jon Turney <jon.turney@dronecode.org.uk>, Jonas Hoerberg <JHorberg@danfoss.com>, Jonathan Wakely <jwakely@redhat.com>, Jose E. Marchesi <jose.marchesi@oracle.com>, Joseph Faulls <Joseph.Faulls@imgtec.com>, Joseph Myers <joseph@codesourcery.com>, Joseph Myers <josmyers@redhat.com>, Joseph Myers <jsm@polyomino.org.uk>, Jozef Lawrynowicz <jozefl@gcc.gnu.org>, Kalvis Duckmanton <kalvisd@gmail.com>, Kavitha Natarajan <kavitha.natarajan@amd.com>, Kaylee Blake <klkblake@gmail.com>, Keith Seitz <keiths@redhat.com>, Kevin Buettner <kevinb@redhat.com>, Khem Raj <raj.khem@gmail.com>, Kito Cheng <kito.cheng@sifive.com>, Kong Lingling <lingling.kong@intel.com>, Konstantin Isakov <ikm@zbackup.org>, Kuan-Lin Chen <rufus@andestech.com>, Kumar N, Bhuvanendra <Kavitha.Natarajan@amd.com>, Kévin Le Gouguec <legouguec@adacore.com>, LIU Hao <lh_mouse@126.com>, Lancelot SIX <lancelot.six@amd.com>, Lancelot Six <lancelot.six@amd.com>, Laurent Morichetti <laurent.morichetti@amd.com>, Li Xu <xuli1@eswincomputing.com>, Lifang Xia <lifang_xia@linux.alibaba.com>, Luca Bacci <luca.bacci@outlook.com>, Luca Boccassi <bluca@debian.org>, Luca Bonissi <gcc@scarsita.it>, Ludovic Courtès <ludo@gnu.org>, Luis Machado <luis.machado@arm.com>, Lulu Cai <cailulu@loongson.cn>, Lulu Cheng <chenglulu@loongson.cn>, Maciej W. Rozycki <macro@embecosm.com>, Maciej W. Rozycki <macro@orcam.me.uk>, Magne Hov <mhov@undo.io>, Manoj Gupta <manojgupta@google.com>, Marcus Nilsson <brainbomb@gmail.com>, Marek Polacek <polacek@redhat.com>, Mark Harmstone <mark@harmstone.com>, Mark Wielaard <mark@klomp.org>, Markus Metzger <markus.t.metzger@intel.com>, Martin Liska <mliska@suse.cz>, Martin Storsjö <martin@martin.st>, Mary Bennett <mary.bennett@embecosm.com>, Matheus Branco Borella <dark.ryu.550@gmail.com>, Matthew "strager" Glazar <strager.nds@gmail.com>, Matthew Malcomson <hardenedapple@gmail.com>, Matthias Klose <doko@debian.org>, Matthieu Longo <matthieu.longo@arm.com>, Matti Puputti <matti.puputti@intel.com>, Max Filippov <jcmvbkbc@gmail.com>, Meghan Denny <hello@nektro.net>, Michael J. Eager <eager@eagercon.com>, Michael Matz <matz@suse.de>, Mihails Strasuns <mihails.strasuns@intel.com>, Mike Frysinger <vapier@gentoo.org>, Mo, Zewei <zewei.mo@intel.com>, Mohamed Bouhaouel <mohamed.bouhaouel@intel.com>, Nandakumar Edamana <nandakumar@nandakumar.co.in>, Natarajan, Kavitha <Kavitha.Natarajan@amd.com>, Nathan Huckleberry <nhuck@google.com>, Nathan Sidwell <nathan@acm.org>, Neal Frager <neal.frager@amd.com>, Neal frager <neal.frager@amd.com>, Nelson Chu <nelson.chu@sifive.com>, Nelson Chu <nelson@nelson.ba.rivosinc.com>, Nelson Chu <nelson@rivosinc.com>, Nick Alcock <nick.alcock@oracle.com>, Nick Clifton <nickc@redhat.com>, Nicolas Boulenguez <nicolas.boulenguez@free.fr>, Nicolas Boulenguez <nicolas@debian.org>, Nikolaos Chatzikonstantinou <nchatz314@gmail.com>, Nils-Christian Kempke <nils-christian.kempke@intel.com>, Oleg Tolmatcev <oleg.tolmatcev@gmail.com>, Olivier Hainque <hainque@adacore.com>, Orgad Shaneh <orgads@gmail.com>, Palmer Dabbelt <palmer@rivosinc.com>, Patrick Monnerat <patrick@monnerat.net>, Patrick O'Neill <patrick@rivosinc.com>, Paul Iannetta <piannetta@kalrayinc.com>, Paul Koning <paulkoning@comcast.net>, Paul Pluzhnikov <ppluzhnikov@google.com>, Pedro Alves <palves@redhat.com>, Pedro Alves <pedro@palves.net>, Pekka Seppänen <pexu@sourceware.mail.kapsi.fi>, Peter Bergner <bergner@linux.ibm.com>, Peter Edwards <peadar@arista.com>, Peter Foley <pefoley2@pefoley.com>, Peter Jones <pjones@redhat.com>, Petr Tesarik <petr@tesarici.cz>, Philip Herron <philip.herron@embecosm.com>, Philipp Tomsich <philipp.tomsich@vrull.eu>, Philippe Blain <levraiphilippeblain@gmail.com>, Philippe Waroquiers <philippe.waroquiers@skynet.be>, Potharla, Rupesh <Rupesh.Potharla@amd.com>, Pter Chubb <peter.chubb@unsw.edu.au>, Puputti, Matti <matti.puputti@intel.com>, Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>, Ralf Habacker <ralf.habacker@freenet.de>, Richard Ball <richard.ball@arm.com>, Richard Bunt <richard.bunt@linaro.org>, Richard Earnshaw <rearnsha@arm.com>, Richard Purdie <richard.purdie@linuxfoundation.org>, Richard Sandiford <richard.sandiford@arm.com>, Richard W.M. Jones <rjones@redhat.com>, Roger Sayle <roger@nextmovesoftware.com>, Rohr, Stephan <stephan.rohr@intel.com>, Roland McGrath <mcgrathr@google.com>, Romain Geissler <romain.geissler@amadeus.com>, Rui Ueyama <rui314@gmail.com>, Rupesh Potharla <Rupesh.Potharla@amd.com>, Ruud van der Pas <ruud.vanderpas@oracle.com>, Sam James <sam@gentoo.org>, Samuel Tardieu <sam@rfc1149.net>, Sandra Loosemore <sandra@codesourcery.com>, Sandra Loosemore <sloosemore@baylibre.com>, Saurabh Jha <saurabh.jha@arm.com>, Schimpe, Christina <christina.schimpe@intel.com>, Sergei Trofimovich <siarheit@google.com>, Sergey Bugaev <bugaevc@gmail.com>, Shahab Vahedi <shahab@synopsys.com>, Shihua <shihua@iscas.ac.cn>, Simon Farre <simon.farre.cx@gmail.com>, Simon Marchi <simon.marchi@efficios.com>, Simon Marchi <simon.marchi@polymtl.ca>, Song Mengzhi <song.mengzhi@zte.com.cn>, Srinath Parvathaneni <srinath.parvathaneni@arm.com>, Stafford Horne <shorne@gmail.com>, Stam Markianos-Wright <stam.markianos-wright@arm.com>, Stefan Liebler <stli@linux.ibm.com>, Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>, Stefano Moioli <smxdev4@gmail.com>, Steinar H. Gunderson <sesse@google.com>, Steinar H. Gunderson <steinar+sourceware@gunderson.no>, Stepan Nemec <stepnem@gmail.com>, Stephen Kitt <steve@sk2.org>, Szabolcs Nagy <szabolcs.nagy@arm.com>, TaiseiIto <taisei1212@outlook.jp>, Tamar Christina <tamar.christina@arm.com>, Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>, Tatsuyuki Ishi <ishitatsuyuki@gmail.com>, Tejas Joshi <TejasSanjay.Joshi@amd.com>, Thiago Jung Bauermann <thiago.bauermann@linaro.org>, Thomas Hebb <tommyhebb@gmail.com>, Thomas Koenig <tkoenig@netcologne.de>, Thomas Schwinge <thomas@codesourcery.com>, Thomas Weißschuh <thomas@t-8ch.de>, Tiezhu Yang <yangtiezhu@loongson.cn>, Tobias Burnus <tobias@codesourcery.com>, Toby Lloyd Davies <tlloyddavies@undo.io>, Tom Tromey <tom@tromey.com>, Tom Tromey <tromey@adacore.com>, Tom de Vries <tdevries@jostaberry-8.arch.suse.de>, Tom de Vries <tdevries@loganberry-1.arch.suse.de>, Tom de Vries <tdevries@space.suse.cz>, Tom de Vries <tdevries@suse.de>, Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>, Tomoaki Kawada <kawada@kmckk.co.jp>, Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>, Tristan Gingold <tgingold@free.fr>, Tsukasa OI <research_trasio@irq.a4lg.com>, Ulf Samuelsson <ulf@emagii.com>, Victor Do Nascimento <Victor.DoNascimento@arm.com>, Victor Do Nascimento <victor.donascimento@arm.com>, Vladimir Mezentsev <vladimir.mezentsev@oracle.com>, Vladislav Belov <vladislav.belov@syntacore.com>, Vladislav Khmelevsky <och95@yandex.ru>, Vsevolod Alekseyev <sevaa@sprynet.com>, WANG Rui <r@hev.cc>, WANG Xuerui <git@xen0n.name>, Weimin Pan <weimin.pan@oracle.com>, Will Hawkins <hawkinsw@obs.cr>, Willgerodt, Felix <felix.willgerodt@intel.com>, Xi Ruoyao <xry111@mengyan1223.wang>, Xi Ruoyao <xry111@xry111.site>, Xianmiao Qu <cooper.qu@linux.alibaba.com>, Xiao Zeng <zengxiao@eswincomputing.com>, Yang Liu <liuyang22@iscas.ac.cn>, Yichao Yu <yyc1992@gmail.com>, Ying Huang <ying.huang@oss.cipunited.com>, Yoshinori Sato <ysato@users.sourceforge.jp>, Youling Tang <tangyouling@loongson.cn>, YunQiang Su <yunqiang.su@cipunited.com>, Yuriy Kolerov <Yuriy.Kolerov@synopsys.com>, Yuriy Kolerov <kolerov93@gmail.com>, Yury Khrustalev <Yury.Khrustalev@arm.com>, Yury Khrustalev <yury.khrustalev@arm.com>, Yvan Roux <yvan.roux@foss.st.com>, Zac Walker <zac.walker@linaro.org>, Zac Walker <zacwalker@microsoft.com>, Zeke Lu <lvzecai@gmail.com>, Zhang, Jun <jun.zhang@intel.com>, Zhiqing Xiong <zhiqxion@qti.qualcomm.com>, cailulu <cailulu@loongson.cn>, caiyinyu <caiyinyu@loongson.cn>, changjiachen <changjiachen@stu.xupt.edu.cn>, jiawei <jiawei@iscas.ac.cn>, konglin1 <lingling.kong@intel.com>, liuhongt <hongtao.liu@intel.com>, liuzhensong <liuzhensong@loongson.cn>, mengqinggang <mengqinggang@loongson.cn>, mga-sc <mark.goncharov@syntacore.com>, rupesh potharla <rupesh.potharla@amd.com>, rupothar <rupesh.potharla@amd.com>, srinath <srinath.parvathaneni@arm.com>, tangxiaolin <tangxiaolin@loongson.cn>, ticat_fp <fanpeng@loongson.cn>, yaowenbin <yaowenbin1@huawei.com>, zengxiao <zengxiao@eswincomputing.com>, Дилян Палаузов <dilyan.palauzov@aegee.org>, Сергей Чернов <klen_s@mail.ru>

Steps:

- 0: worker_preparation ( success )

- 1: git checkout ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/1/logs/stdio

- 2: rm -rf binutils-build ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/2/logs/stdio

- 3: configure ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/3/logs/stdio
        - config.log: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/3/logs/config_log

- 4: make ( warnings )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/4/logs/stdio
        - warnings (27): https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/4/logs/warnings__27_

- 5: make check ( failure )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/stdio
        - ld.sum: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/ld_sum
        - ld.log: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/ld_log
        - gas.sum: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/gas_sum
        - gas.log: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/gas_log
        - binutils.sum: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/binutils_sum
        - binutils.log: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/binutils_log
        - libsframe.sum: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/libsframe_sum
        - libsframe.log: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/libsframe_log
        - libctf.sum: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/libctf_sum
        - libctf.log: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/libctf_log
        - warnings (15): https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/5/logs/warnings__15_

- 6: prep ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/6/logs/stdio

- 7: build bunsen.cpio.gz ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/7/logs/stdio

- 8: fetch bunsen.cpio.gz ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/8/logs/stdio

- 9: unpack bunsen.cpio.gz ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/9/logs/stdio

- 10: pass .bunsen.source.gitname ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/10/logs/stdio

- 11: pass .bunsen.source.gitdescribe ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/11/logs/stdio

- 12: pass .bunsen.source.gitbranch ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/12/logs/stdio

- 13: pass .bunsen.source.gitrepo ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/13/logs/stdio

- 14: upload to bunsen ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/14/logs/stdio

- 15: clean up ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/15/logs/stdio

- 16: rm -rf binutils-build_1 ( success )
    Logs:
        - stdio: https://builder.sourceware.org/buildbot/#/builders/81/builds/3539/steps/16/logs/stdio


^ permalink raw reply	[relevance 2%]

* Re: Re: [PATCH v2] Add support for symbol addition to the Python API
  2024-02-06 17:50  0%       ` Tom Tromey
@ 2024-02-24 17:35  7%         ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2024-02-24 17:35 UTC (permalink / raw)
  To: tom; +Cc: gdb-patches, dark.ryu.550

Tom Tromey <tom@tromey.com> writes:
> I guess because nothing makes any blocks.  However this seems like a
> kind of big issue to me, because it means that by-name lookups will
> appear to succeed ("function xyz is at address 0xaaaaa") but then
> stopping in that function won't show the name.

The big issue with properly supporting the PC-based lookups, and minsyms
in general, is that we don't actually have a properly set up BFD struct.
And, as far as I know, there's no way to actually create one without go-
ing through the trouble of generating an actual binary file in memory
and having the library read back from it.

So I did the best I could that wouldn't also have to involve potentially
fairly invasive changes to the rest of GDB.

Of course, I could be wrong and there might be a way to properly do what
I'm trying to do, but if there is one, I couldn't find it.

> gdb exposes a gdb.Architecture, maybe we could let the Python code
> specify this.

Yeah, that makes sense. I'll change it so that it uses gdb.Architecture.

> objfile_to_objfile_object returns a new reference so I think the incref
> is wrong here.
> 
> We try to avoid explicit inc/dec-refs in gdb anyway.

Yes, I didn't realize the gdbpy_ref increased the refcount
automatically.

I don't have anything to add for the other points, so suffice to say
I'll fix everything you pointed out and submit a v3 as soon as I can.

Thanks for your time.

^ permalink raw reply	[relevance 7%]

* Re: Re: [PATCH v4] Add support for creating new types from the Python API
  2024-02-06 18:20  4% ` Tom Tromey
@ 2024-02-21 18:11  6%   ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2024-02-21 18:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: dark.ryu.550, tom

Thanks for the review, I've got a few questions and things to add before
I submit the v5, if that's okay.

Tom Tromey <tom@tromey.com> writes:

> WDYT about "make_" instead?

Yeah, that works. I don't feel particularly strongly about the names
that I picked, I'd mostly just picked them to mirror the internal ones
because I didn't have a better idea. I'll switch to `make_*_type`.

> Making any sort of type without filling in the details is probably a
> recipe for crashes.
> 
> Is there a specific situation you needed this for?

The main thing I had in mind was creating a void type, but yeah, it
makes sense to avoid exposing the generic type creation function without
also providing a proper way to fill in the details, so I'll take that
one out.

> I'm curious whether this one is really needed, because
> gdb.Type.pointer() exists.

Like I've told Eli, the main intent behind having it originally was
that, assuming one knows how to create a properly-sized pointer for
the architecture, one could, without having to rely on any of the type
lookup functions.

That being said, now that I think about it, I don't think there's any
case, when avoiding type lookup for pointer types might be necessary,
that can't be solved just as well by using `gdb.Type.pointer()`, while
also avoiding the footguns associated with `make_pointer_type`. So I'll
take it out, too.

> It seems like this could all just create a type_allocator directly and
> be simpler, like the 'kind' isn't needed.

> Is the none case really possible?
> It might be better to just throw an exception from the constructor or
> during argument validation or something like that.

Most of these fall under the same response, so I'll just reply to them
all at once.

When I was writing this patch, I had the following in mind:
 1st - This patch was first written before GDB switched to C++17, so I
       had no access to std::optional<>.
 2nd - I felt like throwing an exception over doing the `->valid()`
       check explicitly would be less clear about my intent for people
	   reading the code.

The design of `type_storage_owner` follows from those, and I don't feel
like changing it to use std::option<> or exceptions would be much of an
improvement in readability.

Would it really be that much of an improvement?

> I think the uses of this could probably use TYPE_ALLOC instead.

Isn't that only valid for `struct type`? I don't think I follow. Some of
the allocations (and I'm pretty sure at least one has to) happen before
the call to `init_*_type`.


^ permalink raw reply	[relevance 6%]

* Re: [PATCH v4] Add support for creating new types from the Python API
  2024-01-16  4:54  1% [PATCH v4] Add support for creating new types from the Python API Matheus Branco Borella
  2024-01-16 12:45  0% ` Eli Zaretskii
@ 2024-02-06 18:20  4% ` Tom Tromey
  2024-02-21 18:11  6%   ` Matheus Branco Borella
  1 sibling, 1 reply; 65+ results
From: Tom Tromey @ 2024-02-06 18:20 UTC (permalink / raw)
  To: Matheus Branco Borella; +Cc: gdb-patches, eli

>>>>> "Matheus" == Matheus Branco Borella <dark.ryu.550@gmail.com> writes:

Matheus> The main drawback of using the `init_*_type` family over implementing type
Matheus> initialization by hand is that any type that's created gets immediately
Matheus> allocated on its owner's obstack, regardless of what its real lifetime
Matheus> requirements are. The main implication of this is that types that become
Matheus> unreachable will remain live for the lifetime of the owner.

Yeah.  gdb leaks a lot of types this way, actually.  We've collectively
put off implementing "type GC", though I do think there's a bug for it.

Matheus> +  ** Functions that allow creation of instances of gdb.Type, and a new
Matheus> +     class gdb.FloatFormat that may be used to create floating point
Matheus> +     types.  The functions that allow new type creation are:
Matheus> +      - gdb.init_type: Create a new type given a type code.
Matheus> +      - gdb.init_integer_type: Create a new integer type.
Matheus> +      - gdb.init_character_type: Create a new character type.
Matheus> +      - gdb.init_boolean_type: Create a new boolean type.
Matheus> +      - gdb.init_float_type: Create a new floating point type.
Matheus> +      - gdb.init_decfloat_type: Create a new decimal floating point type.
Matheus> +      - gdb.can_create_complex_type: Whether a type can be used to create a
Matheus> +          new complex type.
Matheus> +      - gdb.init_complex_type: Create a new complex type.
Matheus> +      - gdb.init_pointer_type: Create a new pointer type.
Matheus> +          * This allows creating pointers of arbitrary size.
Matheus> +      - gdb.init_fixed_point_type: Create a new fixed point type.

I don't really love the "init_" prefixes here.  Like, I get that these
are the names internally, but I don't think they really make sense
externally.

WDYT about "make_" instead?

Matheus> +@findex gdb.init_type
Matheus> +@defun gdb.init_type (owner, type_code, bit_size, name)
Matheus> +This function creates a new @code{gdb.Type} instance corresponding to a
Matheus> +type owned by the given @var{owner}, with the given @var{type_code},
Matheus> +@var{name} and size.
Matheus> +
Matheus> +@var{owner} must be a reference to either a @code{gdb.Objfile} or a
Matheus> +@code{gdb.Architecture} object.  These correspond to objfile and
Matheus> +architecture-owned types, respectively.
Matheus> +
Matheus> +@var{type_code} is one of the @code{TYPE_CODE_} constants defined in
Matheus> +@ref{Types In Python}.
Matheus> +
Matheus> +@var{bit_size} is the size of instances of the newly created type, in
Matheus> +bits. Currently, accepted values are limited to multiples of 8.
Matheus> +@end defun

Making any sort of type without filling in the details is probably a
recipe for crashes.

Is there a specific situation you needed this for?

Matheus> +@findex gdb.init_pointer_type
Matheus> +@defun gdb.init_pointer_type (owner, target, bit_size, name)
Matheus> +This function creates a new @code{gdb.Type} instance corresponding to a
Matheus> +pointer type that points to @var{target} and is owned by the given
Matheus> +@var{owner}, with the given @var{name} and size.
Matheus> +
Matheus> +@var{target} is a @code{gdb.Type} object, corresponding to the type
Matheus> +that will be pointed to by the newly created pointer type.
Matheus> +@end defun

I'm curious whether this one is really needed, because
gdb.Type.pointer() exists.

Like is there a case where you'd want a pointer type that doesn't match
the architecture somehow?  Seems weird and/or not useful.

Matheus> +/* Converts from a Python integer to a unsigned integer. */
Matheus> +
Matheus> +static bool
Matheus> +py_to_unsigned_int (PyObject *object, unsigned int *val)
Matheus> +{
Matheus> +  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
Matheus> +    {
Matheus> +      PyErr_SetString (PyExc_TypeError, "value must be an integer");
Matheus> +      return false;
Matheus> +    }
Matheus> +
Matheus> +  long native_val = PyLong_AsLong (object);
Matheus> +  if (native_val > (long) UINT_MAX)
Matheus> +    {
Matheus> +      PyErr_SetString (PyExc_ValueError, "value is too large");
Matheus> +      return false;
Matheus> +    }
Matheus> +  if (native_val < 0)
Matheus> +    {
Matheus> +      PyErr_SetString (PyExc_ValueError,
Matheus> +		       "value must not be smaller than zero");
Matheus> +      return false;
Matheus> +    }
Matheus> +
Matheus> +  *val = (unsigned int) native_val;
Matheus> +  return true;

See gdb_py_int_as_long.
I think the type-check isn't really needed (probably) and some of the
other error-handling can be simplified.
There's also gdb_py_long_as_ulongest.

Matheus> +/* Functionality for creating new types accessible from python.
Matheus> +
Matheus> +   Copyright (C) 2008-2023 Free Software Foundation, Inc.

Forgot to mention this elsewhere but I think these dates are wrong.

Matheus> +/* An abstraction covering the objects types that can own a type object. */
Matheus> +
Matheus> +class type_storage_owner
Matheus> +{
Matheus> +public:
Matheus> +  /* Creates a new type owner from the given python object. If the object is
Matheus> +   * of a type that is not supported, the newly created instance will be
Matheus> +   * marked as invalid and nothing should be done with it. */
Matheus> +
Matheus> +  type_storage_owner (PyObject *owner)
Matheus> +  {
Matheus> +    if (gdbpy_is_architecture (owner))
Matheus> +      {
Matheus> +	this->kind = owner_kind::arch;
Matheus> +	this->owner.arch = arch_object_to_gdbarch (owner);
Matheus> +	return;
Matheus> +      }
Matheus> +
Matheus> +    this->kind = owner_kind::objfile;
Matheus> +    this->owner.objfile = objfile_object_to_objfile (owner);
Matheus> +    if (this->owner.objfile != nullptr)
Matheus> +	return;

It seems like this could all just create a type_allocator directly and
be simpler, like the 'kind' isn't needed.

Matheus> +
Matheus> +    this->kind = owner_kind::none;
Matheus> +    PyErr_SetString(PyExc_TypeError, "unsupported owner type");

Spaces before parens in a lot of spots...

Matheus> +    /* Should never be reached, but it's better to fail in a safe way than try
Matheus> +     * to instance the allocator with arbitraty parameters here. */
Matheus> +    abort ();

gdb uses gdb_assert_not_reached instead.

Matheus> +  /* Get a reference to the owner's obstack. */
Matheus> +
Matheus> +  obstack *get_obstack ()
Matheus> +  {

I think the uses of this could probably use TYPE_ALLOC instead.

Matheus> +  struct gdbarch *get_arch ()
Matheus> +  {

This could use the type allocator's arch.

Matheus> +  enum class owner_kind { arch, objfile, none };

Is the none case really possible?
It might be better to just throw an exception from the constructor or
during argument validation or something like that.

thanks,
Tom

^ permalink raw reply	[relevance 4%]

* Re: [PATCH v2] Add support for symbol addition to the Python API
  2024-01-13  1:36  3%     ` [PATCH v2] " Matheus Branco Borella
@ 2024-02-06 17:50  0%       ` Tom Tromey
  2024-02-24 17:35  7%         ` Matheus Branco Borella
  0 siblings, 1 reply; 65+ results
From: Tom Tromey @ 2024-02-06 17:50 UTC (permalink / raw)
  To: Matheus Branco Borella; +Cc: gdb-patches, aburgess

>>>>> Matheus Branco Borella <dark.ryu.550@gmail.com> writes:

> I had to walk away from this for a while. I'm pinging it now and I've updated
> the code so that it works on master.

Thank you for the patch.

> This patch adds support for symbol creation and registration. It currently
> supports adding type symbols (VAR_DOMAIN/LOC_TYPEDEF), static symbols
> (VAR_DOMAIN/LOC_STATIC) and goto target labels (LABEL_DOMAIN/LOC_LABEL).

Symbol domains recently went through a change.

Also, a patch that changes the Python API requires a documentation
change and also an entry in NEWS.

> In the same vein, PC-based function name lookup also does not work, although
> there may be a way to have the feature work using overlays.

I guess because nothing makes any blocks.  However this seems like a
kind of big issue to me, because it means that by-name lookups will
appear to succeed ("function xyz is at address 0xaaaaa") but then
stopping in that function won't show the name.

> +  virtual void register_msymbol (const std::string& name,

gdb style puts the "&" next to "name", not next to "string".
There's a lot of instances of this.

> +/* Data being held by the gdb.ObjfileBuilder.
> + *
> + * This structure needs to have its constructor run in order for its lifetime
> + * to begin. Because of how Python handles its objects, we can't just reconstruct
> + * the object structure as a whole, as that would overwrite things the runtime
> + * cares about, so these fields had to be broken off into their own structure. */

gdb doesn't use the "leading *" style of comment.

> +  /* We need to tell GDB what architecture the objfile uses. */
> +  if (has_stack_frames ())
> +    of->per_bfd->gdbarch = get_frame_arch (get_selected_frame (nullptr));
> +  else
> +    of->per_bfd->gdbarch = current_inferior ()->arch ();

gdb exposes a gdb.Architecture, maybe we could let the Python code
specify this.

> +/* Parses a language from a string (coming from Python) into a language
> + * variant. */
> +
> +static enum language
> +parse_language (const char *language)
> +{
> +  if (strcmp (language, "c") == 0)
> +    return language_c;
> +  else if (strcmp (language, "objc") == 0)
> +    return language_objc;

I think this should call language_enum instead.

> +  if (language_name == nullptr)
> +    language_name = "auto";

I think it's kind of weird to use auto here.

> +/* Builds the object file. */
> +static PyObject *
> +objbdpy_build (PyObject *self, PyObject *args)
> +{
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  if (builder->inner.installed)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "build() cannot be run twice on the \
> +		       same object");
> +      return nullptr;
> +    }
> +  auto of = build_new_objfile (*builder);

There's a rule in the Python layer in gdb that code that calls into gdb
has to wrap the call in a try/catch and use GDB_PY_HANDLE_EXCEPTION.
This is because a lot of gdb code can throw exceptions, but letting an
exception cross the Python boundary is catastrophic.

> +  auto objpy = objfile_to_objfile_object (of).get ();
> +  Py_INCREF(objpy);
> +  return objpy;

objfile_to_objfile_object returns a new reference so I think the incref
is wrong here.

We try to avoid explicit inc/dec-refs in gdb anyway.

Tom

^ permalink raw reply	[relevance 0%]

* Re: [PATCH] Make `linux_info_proc` prefer using the LWP over the PID
  2024-01-08 15:50  7% ` Simon Marchi
@ 2024-01-19 16:52  7%   ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2024-01-19 16:52 UTC (permalink / raw)
  To: gdb-patches; +Cc: dark.ryu.550, simark

I've sent in a v2 that should address your points. Thanks for your time.


^ permalink raw reply	[relevance 7%]

* [PATCH v2] Make `linux_info_proc` prefer using the LWP over the PID
  2024-01-06  2:45  6% [PATCH] Make `linux_info_proc` prefer using the LWP over the PID Matheus Branco Borella
  2024-01-08 15:50  7% ` Simon Marchi
@ 2024-01-19 16:49  6% ` Matheus Branco Borella
  1 sibling, 0 replies; 65+ results
From: Matheus Branco Borella @ 2024-01-19 16:49 UTC (permalink / raw)
  To: gdb-patches; +Cc: simark, Matheus Branco Borella

Normally, `linux_info_proc` would use the PID to determine which subfolder in
`/proc` to read information from. While this is usually fine, it breaks down
after the main thread exits, at which point the information in `/proc/$pid`
becomes become unreliable, if it is available at all. While it is the case
that most programs terminate after their main thread exits, some may continue
running from detached threads, in which case `info proc` will start misbehaving.

This patch addresses this by making it so that the LWP - the Lightweight Process
ID, that, in the case of GNU/Linux is the number of the process backing up the
thread[1] - is prefered over the PID. By doing this, `linux_info_proc` will
always access valid procfs information, even after the main thread exits.

[1]: https://man7.org/linux/man-pages/man2/clone.2.html

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31207
---
 gdb/linux-tdep.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 82e8bc3db3c..4fa7a98adde 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -840,7 +840,17 @@ linux_info_proc (struct gdbarch *gdbarch, const char *args,
       if (current_inferior ()->fake_pid_p)
 	error (_("Can't determine the current process's PID: you must name one."));
 
-      pid = current_inferior ()->pid;
+      /* Seeing as, when the main thread exits, the information in /proc/$pid
+       * becomes unreliable, we should prefer using the current TID, whenever
+       * possible. */
+      pid = 0;
+      struct thread_info *info = any_live_thread_of_inferior (current_inferior ());
+      if (info != nullptr)
+	pid = info->ptid.lwp ();
+
+      /* And fall back to the actual PID only when the TID is not available. */
+      if (pid == 0)
+	pid = current_inferior ()->pid;
     }
 
   args = skip_spaces (args);
-- 
2.40.1


^ permalink raw reply	[relevance 6%]

* Re: [PATCH v4] Add support for creating new types from the Python API
  2024-01-16 18:56  0%     ` Eli Zaretskii
@ 2024-01-16 21:27  7%       ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2024-01-16 21:27 UTC (permalink / raw)
  To: gdb-patches; +Cc: eli

> How can that work?  AFAIU, most architectures only allow pointers of
> certain sizes, some allow pointers of just one size.  E.g., what
> happens if I create a 16-bit pointer on a 64-bit target?

My intention is primarily to make it possible to construct such types,
assuming the given configuration is valid. In this case it would be the
consumer of the API who would be responsible for guaranteeing what they
are doing is valid.

Regardless, as far as I could gather, GDB doesn't really seem to care
if values whose types have TYPE_CODE_PTR have sizes that are valid in
the target architecture. `unsigned_pointer_to_address`, as well as all
the `*_pointer_to_address` functions I could find by grepping for uses
of `set_gdbarch_pointer_to_address` in the code are perfectly fine just
reading the data in the pointers as if they were type->length()-sized
integers. And mostly the same goes for their `*_address_to_pointer`
counterparts, except for large values being clipped (including extra
function pointer information). But I believe that behavior should be
fairly unsurprising if you're creating a pointer with the wrong size.

So, to answer your question, AFAICT it would just treat it as an
address stored in an (u)int16_t.

^ permalink raw reply	[relevance 7%]

* Re: [PATCH v4] Add support for creating new types from the Python API
  2024-01-16 18:20  7%   ` [PATCH v4] Add support for creating new types from the Python API Matheus Branco Borella
@ 2024-01-16 18:56  0%     ` Eli Zaretskii
  2024-01-16 21:27  7%       ` Matheus Branco Borella
  0 siblings, 1 reply; 65+ results
From: Eli Zaretskii @ 2024-01-16 18:56 UTC (permalink / raw)
  To: Matheus Branco Borella; +Cc: gdb-patches

> From: Matheus Branco Borella <dark.ryu.550@gmail.com>
> Cc: eli@gnu.org
> Date: Tue, 16 Jan 2024 15:20:24 -0300
> X-Spam-Status: No, score=-4.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,
>  DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_ENVFROM_END_DIGIT,
>  FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP,
>  T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
> 
> Apologies for my previous empty response, my email client got a little
> trigger-happy (And I'm still getting the hang of it).
> 
> On Jan 16, 2024, at 9:45=E2=80=AFAM, Eli Zaretskii <eliz@gnu.org> wrote:
> > I asked previously what does BIT_SIZE mean for pointer types.  Is it
> > the size of the pointer or of the data type to which the pointer
> > points?  If it's the size of the pointer, then does it mean this
> > function can create pointers of arbitrary sizes regardless of the
> > sizes of pointers that are supported by the target?
> 
> It's the size of the pointer itself.

How can that work?  AFAIU, most architectures only allow pointers of
certain sizes, some allow pointers of just one size.  E.g., what
happens if I create a 16-bit pointer on a 64-bit target?

^ permalink raw reply	[relevance 0%]

* Re: [PATCH v4] Add support for creating new types from the Python API
  2024-01-16 12:45  0% ` Eli Zaretskii
  2024-01-16 17:50  7%   ` Matheus Branco Borella
@ 2024-01-16 18:20  7%   ` Matheus Branco Borella
  2024-01-16 18:56  0%     ` Eli Zaretskii
  1 sibling, 1 reply; 65+ results
From: Matheus Branco Borella @ 2024-01-16 18:20 UTC (permalink / raw)
  To: gdb-patches; +Cc: eli

Apologies for my previous empty response, my email client got a little
trigger-happy (And I'm still getting the hang of it).

On Jan 16, 2024, at 9:45=E2=80=AFAM, Eli Zaretskii <eliz@gnu.org> wrote:
> I asked previously what does BIT_SIZE mean for pointer types.  Is it
> the size of the pointer or of the data type to which the pointer
> points?  If it's the size of the pointer, then does it mean this
> function can create pointers of arbitrary sizes regardless of the
> sizes of pointers that are supported by the target?

It's the size of the pointer itself. This feature is mostly intended
for cases where proper types may not be available - such as reverse
engineering - but may still be desirable. While one could get around
the lack of such a function by using `gdb.Type.pointer()`, I felt like
there was no reason to restrict it, and that trusting that the
consumers of the API will pick sizes that are valid for the target was
probably good enough.


^ permalink raw reply	[relevance 7%]

* (no subject)
  2024-01-16 12:45  0% ` Eli Zaretskii
@ 2024-01-16 17:50  7%   ` Matheus Branco Borella
  2024-01-16 18:20  7%   ` [PATCH v4] Add support for creating new types from the Python API Matheus Branco Borella
  1 sibling, 0 replies; 65+ results
From: Matheus Branco Borella @ 2024-01-16 17:50 UTC (permalink / raw)
  To: gdb-patches; +Cc: eli



^ permalink raw reply	[relevance 7%]

* Re: [PATCH v4] Add support for creating new types from the Python API
  2024-01-16  4:54  1% [PATCH v4] Add support for creating new types from the Python API Matheus Branco Borella
@ 2024-01-16 12:45  0% ` Eli Zaretskii
  2024-01-16 17:50  7%   ` Matheus Branco Borella
  2024-01-16 18:20  7%   ` [PATCH v4] Add support for creating new types from the Python API Matheus Branco Borella
  2024-02-06 18:20  4% ` Tom Tromey
  1 sibling, 2 replies; 65+ results
From: Eli Zaretskii @ 2024-01-16 12:45 UTC (permalink / raw)
  To: Matheus Branco Borella; +Cc: gdb-patches

> From: Matheus Branco Borella <dark.ryu.550@gmail.com>
> Cc: eli@gnu.org,
> 	Matheus Branco Borella <dark.ryu.550@gmail.com>
> Date: Tue, 16 Jan 2024 01:54:40 -0300
> 
>  gdb/Makefile.in                           |   2 +
>  gdb/NEWS                                  |  16 +
>  gdb/doc/python.texi                       | 161 +++++++
>  gdb/python/py-float-format.c              | 307 +++++++++++++
>  gdb/python/py-objfile.c                   |  17 +
>  gdb/python/py-type-init.c                 | 520 ++++++++++++++++++++++
>  gdb/python/python-internal.h              |  34 ++
>  gdb/python/python.c                       |  50 +++
>  gdb/testsuite/gdb.python/py-type-init.c   |  21 +
>  gdb/testsuite/gdb.python/py-type-init.exp | 132 ++++++
>  10 files changed, 1260 insertions(+)
>  create mode 100644 gdb/python/py-float-format.c
>  create mode 100644 gdb/python/py-type-init.c
>  create mode 100644 gdb/testsuite/gdb.python/py-type-init.c
>  create mode 100644 gdb/testsuite/gdb.python/py-type-init.exp

Thanks.

> diff --git a/gdb/NEWS b/gdb/NEWS
> index 11cd6c0663e..e541544a027 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -87,6 +87,22 @@ show remote thread-options-packet
>    ** New function gdb.interrupt(), that interrupts GDB as if the user
>       typed control-c.
>  
> +  ** Functions that allow creation of instances of gdb.Type, and a new
> +     class gdb.FloatFormat that may be used to create floating point
> +     types.  The functions that allow new type creation are:
> +      - gdb.init_type: Create a new type given a type code.
> +      - gdb.init_integer_type: Create a new integer type.
> +      - gdb.init_character_type: Create a new character type.
> +      - gdb.init_boolean_type: Create a new boolean type.
> +      - gdb.init_float_type: Create a new floating point type.
> +      - gdb.init_decfloat_type: Create a new decimal floating point type.
> +      - gdb.can_create_complex_type: Whether a type can be used to create a
> +          new complex type.
> +      - gdb.init_complex_type: Create a new complex type.
> +      - gdb.init_pointer_type: Create a new pointer type.
> +          * This allows creating pointers of arbitrary size.
> +      - gdb.init_fixed_point_type: Create a new fixed point type.
> +
>  * Debugger Adapter Protocol changes

This part is okay.

> +@var{format} is an reference to a @code{gdb.FloatFormat} object, as
                   ^^^^^^^^^^^^
"a reference"

> +@findex gdb.init_pointer_type
> +@defun gdb.init_pointer_type (owner, target, bit_size, name)
> +This function creates a new @code{gdb.Type} instance corresponding to a
> +pointer type that points to @var{target} and is owned by the given
> +@var{owner}, with the given @var{name} and size.

I asked previously what does BIT_SIZE mean for pointer types.  Is it
the size of the pointer or of the data type to which the pointer
points?  If it's the size of the pointer, then does it mean this
function can create pointers of arbitrary sizes regardless of the
sizes of pointers that are supported by the target?

> +When creating a floating point type through @code{gdb.init_float_type},
> +one has to use a @code{gdb.FloatFormat} object. These objects may be
                                                 ^^
Two spaces there, please.

> +@defvar FloatFormat.totalsize
> +The size of the floating point number, in bits. Currently, accepted
                                                 ^^
Likewise.

> +@defvar FloatFormat.intbit
> +This is a boolean values that indicates whether the integer bit is part
> +of the value or if it is determined implicitly. A value of true
                                                 ^^
And here.

> +@defvar FloatFormat.name
> +The name of the float format. Used internally, for debugging purposes.
                               ^^
And here.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>

^ permalink raw reply	[relevance 0%]

* Re: [PATCH v3] Add support for creating new types from the Python API
  2024-01-13  7:21  0%         ` Eli Zaretskii
@ 2024-01-16  4:55  7%           ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2024-01-16  4:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: eli, Matheus Branco Borella

Alright, I've sent a v4 that I believe should address your comments. Thank you
for your time.

^ permalink raw reply	[relevance 7%]

* [PATCH v4] Add support for creating new types from the Python API
@ 2024-01-16  4:54  1% Matheus Branco Borella
  2024-01-16 12:45  0% ` Eli Zaretskii
  2024-02-06 18:20  4% ` Tom Tromey
  0 siblings, 2 replies; 65+ results
From: Matheus Branco Borella @ 2024-01-16  4:54 UTC (permalink / raw)
  To: gdb-patches; +Cc: eli, Matheus Branco Borella

This patch adds support for creating types from within the Python API. It does
so by exposing the `init_*_type` family of functions, defined in `gdbtypes.h` to
Python and having them return `gdb.Type` objects connected to the newly minted
types.

These functions are accessible in the root of the gdb module and all require
a reference to either a `gdb.Objfile` or a `gdb.Architecture`. Types created
from them will be owned by the object passed to the function.

This patch also adds an extra type - `gdb.FloatFormat` - to support creation of
floating point types by letting users control the format from within Python. It
is missing, however, a way to specify half formats and validation functions.

It is important to note that types created using this interface are not
automatically registered as a symbol, and so, types will become unreachable
unless used to create a value that otherwise references it or saved in some way.

The main drawback of using the `init_*_type` family over implementing type
initialization by hand is that any type that's created gets immediately
allocated on its owner's obstack, regardless of what its real lifetime
requirements are. The main implication of this is that types that become
unreachable will remain live for the lifetime of the owner.

Keeping track of the initialization of the type by hand would require a
deeper change to the existing type object infrastructure. A bit too ambitious
for a first patch, I'd say.

If it were to be done though, we would gain the ability to only keep in the
obstack types that are known to be referenced in some other way - by allocating
and copying the data to the obstack as other objects are created that reference
it (eg. symbols).
---
 gdb/Makefile.in                           |   2 +
 gdb/NEWS                                  |  16 +
 gdb/doc/python.texi                       | 161 +++++++
 gdb/python/py-float-format.c              | 307 +++++++++++++
 gdb/python/py-objfile.c                   |  17 +
 gdb/python/py-type-init.c                 | 520 ++++++++++++++++++++++
 gdb/python/python-internal.h              |  34 ++
 gdb/python/python.c                       |  50 +++
 gdb/testsuite/gdb.python/py-type-init.c   |  21 +
 gdb/testsuite/gdb.python/py-type-init.exp | 132 ++++++
 10 files changed, 1260 insertions(+)
 create mode 100644 gdb/python/py-float-format.c
 create mode 100644 gdb/python/py-type-init.c
 create mode 100644 gdb/testsuite/gdb.python/py-type-init.c
 create mode 100644 gdb/testsuite/gdb.python/py-type-init.exp

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 195f3a2e2d1..50a758c802b 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -432,6 +432,8 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-threadevent.c \
 	python/py-tui.c \
 	python/py-type.c \
+	python/py-type-init.c \
+	python/py-float-format.c \
 	python/py-unwind.c \
 	python/py-utils.c \
 	python/py-value.c \
diff --git a/gdb/NEWS b/gdb/NEWS
index 11cd6c0663e..e541544a027 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -87,6 +87,22 @@ show remote thread-options-packet
   ** New function gdb.interrupt(), that interrupts GDB as if the user
      typed control-c.
 
+  ** Functions that allow creation of instances of gdb.Type, and a new
+     class gdb.FloatFormat that may be used to create floating point
+     types.  The functions that allow new type creation are:
+      - gdb.init_type: Create a new type given a type code.
+      - gdb.init_integer_type: Create a new integer type.
+      - gdb.init_character_type: Create a new character type.
+      - gdb.init_boolean_type: Create a new boolean type.
+      - gdb.init_float_type: Create a new floating point type.
+      - gdb.init_decfloat_type: Create a new decimal floating point type.
+      - gdb.can_create_complex_type: Whether a type can be used to create a
+          new complex type.
+      - gdb.init_complex_type: Create a new complex type.
+      - gdb.init_pointer_type: Create a new pointer type.
+          * This allows creating pointers of arbitrary size.
+      - gdb.init_fixed_point_type: Create a new fixed point type.
+
 * Debugger Adapter Protocol changes
 
   ** GDB now emits the "process" event.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index d74defeec0c..e79e4b1ac89 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -1743,6 +1743,167 @@ A Fortran namelist.
 Further support for types is provided in the @code{gdb.types}
 Python module (@pxref{gdb.types}).
 
+
+
+@node Creating Types In Python
+@subsubsection Creating Types In Python
+@cindex creating types in Python
+@cindex Python, working with types
+
+@value{GDBN} allows creation of new types from Python extensions.
+
+The following functions available in the @code{gdb} module create
+new types.
+
+They all return an instance of @code{gdb.Type}, and will throw an
+exception in case of an error, unless stated otherwise.  Arguments that
+have the same name behave the same for all functions.
+
+@findex gdb.init_type
+@defun gdb.init_type (owner, type_code, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+type owned by the given @var{owner}, with the given @var{type_code},
+@var{name} and size.
+
+@var{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object.  These correspond to objfile and
+architecture-owned types, respectively.
+
+@var{type_code} is one of the @code{TYPE_CODE_} constants defined in
+@ref{Types In Python}.
+
+@var{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+@end defun
+
+@findex gdb.init_integer_type
+@defun gdb.init_integer_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to an
+integer type owned by the given @var{owner}, with the given
+@var{name}, size and signedness.
+
+@var{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+@end defun
+
+@findex gdb.init_character_type
+@defun gdb.init_character_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+character type owned by the given @var{owner}, with the given
+@var{name}, size and signedness.
+
+This function 
+@end defun
+
+@findex gdb.init_boolean_type
+@defun gdb.init_boolean_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+boolean type owned by the given @var{owner}, with the given
+@var{name}, size and signedness.
+@end defun
+
+@findex gdb.init_float_type
+@defun gdb.init_float_type (owner, format, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+floating point type owned by the given @var{owner}, with the given
+@var{name} and @var{format}.
+
+@var{format} is an reference to a @code{gdb.FloatFormat} object, as
+described below.
+@end defun
+
+@findex gdb.init_decfloat_type
+@defun gdb.init_decfloat_type (owner, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+decimal floating point type owned by the given @var{owner}, with the
+given @var{name} and size.
+@end defun
+
+@findex gdb.can_create_complex_type
+@defun gdb.can_create_complex_type (type)
+This function returns a boolean indicating whether @var{type} can be
+used to create a new complex type using the @code{gdb.init_complex_type}
+function.
+@end defun
+
+@findex gdb.init_complex_type
+@defun gdb.init_complex_type (type, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+complex type with the given @var{name} based on the given base
+@var{type}.
+
+The newly created type will be owned by the same object as the base
+type that was used to create it.
+@end defun
+
+@findex gdb.init_pointer_type
+@defun gdb.init_pointer_type (owner, target, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+pointer type that points to @var{target} and is owned by the given
+@var{owner}, with the given @var{name} and size.
+
+@var{target} is a @code{gdb.Type} object, corresponding to the type
+that will be pointed to by the newly created pointer type.
+@end defun
+
+@findex gdb.init_fixed_point_type
+@defun gdb.init_fixed_point_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+fixed point type owned by the given @var{owner}, with the given
+@var{name}, size and signedness.
+@end defun
+
+When creating a floating point type through @code{gdb.init_float_type},
+one has to use a @code{gdb.FloatFormat} object. These objects may be
+created with no arguments, and the following attributes may be used to
+defined the format of the desired floating point format:
+
+@defvar FloatFormat.totalsize
+The size of the floating point number, in bits. Currently, accepted
+values are limited to multiples of 8.
+@end defvar
+
+@defvar FloatFormat.sign_start
+The bit offset of the sign bit.
+@end defvar
+
+@defvar FloatFormt.exp_start
+The bit offset of the start of the exponent.
+@end defvar
+
+@defvar FloatFormat.exp_len
+The size of the exponent, in bits.
+@end defvar
+
+@defvar FloatFormat.exp_bias
+Bias added to the written exponent to form the biased exponent.
+@end defvar
+
+@defvar FloatFormat.exp_nan
+Exponent value which indicates NaN.
+@end defvar
+
+@defvar FloatFormat.man_start
+The bit offset of the start of the mantissa.
+@end defvar
+
+@defvar FloatFormat.man_len
+The size of the mantissa, in bits.
+@end defvar
+
+@defvar FloatFormat.intbit
+This is a boolean values that indicates whether the integer bit is part
+of the value or if it is determined implicitly. A value of true
+indicates the former, while a value of false indicates the latter.
+@end defvar
+
+@defvar FloatFormat.name
+The name of the float format. Used internally, for debugging purposes.
+@end defvar
+
+
+
 @node Pretty Printing API
 @subsubsection Pretty Printing API
 @cindex python pretty printing api
diff --git a/gdb/python/py-float-format.c b/gdb/python/py-float-format.c
new file mode 100644
index 00000000000..984b96361a7
--- /dev/null
+++ b/gdb/python/py-float-format.c
@@ -0,0 +1,307 @@
+/* Accessibility of float format controls from inside the Python API
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "floatformat.h"
+
+/* Structure backing the float format Python interface. */
+
+struct float_format_object
+{
+  PyObject_HEAD
+  struct floatformat format;
+
+  struct floatformat *float_format ()
+  {
+    return &this->format;
+  }
+};
+
+/* Initializes the float format type and registers it with the Python
+ * interpreter. */
+
+static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
+gdbpy_initialize_float_format (void)
+{
+  if (PyType_Ready (&float_format_object_type) < 0)
+    return -1;
+
+  if (gdb_pymodule_addobject (gdb_module, "FloatFormat",
+			      (PyObject *) &float_format_object_type) < 0)
+    return -1;
+
+  return 0;
+}
+
+GDBPY_INITIALIZE_FILE (gdbpy_initialize_float_format);
+
+/* Creates a function that gets the value of a field of a given name from the
+ * underliying float_format structure in the Python object. */
+
+#define INSTANCE_FIELD_GETTER(getter_name, field_name, field_type, field_conv)\
+  static PyObject *							      \
+  getter_name (PyObject *self, void *closure)				      \
+  {									      \
+    float_format_object *ff = (float_format_object*) self;		      \
+    field_type value = ff->float_format ()->field_name;			      \
+    return field_conv (value);						      \
+  }
+
+/* Creates a function that sets the value of a field of a given name from the
+ * underliying float_format structure in the Python object. */
+
+#define INSTANCE_FIELD_SETTER(setter_name, field_name, field_type, field_conv)\
+  static int								      \
+  setter_name (PyObject *self, PyObject* value, void *closure)		      \
+  {									      \
+    field_type native_value;						      \
+    if (!field_conv (value, &native_value))				      \
+      return -1;							      \
+    float_format_object *ff = (float_format_object*) self;		      \
+    ff->float_format ()->field_name = native_value;			      \
+    return 0;								      \
+  }
+
+/* Converts from the intbit enum to a Python boolean. */
+
+static PyObject *
+intbit_to_py (enum floatformat_intbit intbit)
+{
+  gdb_assert (intbit == floatformat_intbit_yes
+	      || intbit == floatformat_intbit_no);
+
+  if (intbit == floatformat_intbit_no)
+    Py_RETURN_FALSE;
+  else
+    Py_RETURN_TRUE;
+}
+
+/* Converts from a Python boolean to the intbit enum. */
+
+static bool
+py_to_intbit (PyObject *object, enum floatformat_intbit *intbit)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyBool_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "intbit must be True or False");
+      return false;
+    }
+
+  *intbit = PyObject_IsTrue (object) ? floatformat_intbit_yes
+    : floatformat_intbit_no;
+
+  return true;
+}
+
+/* Converts from a Python integer to a unsigned integer. */
+
+static bool
+py_to_unsigned_int (PyObject *object, unsigned int *val)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong (object);
+  if (native_val > (long) UINT_MAX)
+    {
+      PyErr_SetString (PyExc_ValueError, "value is too large");
+      return false;
+    }
+  if (native_val < 0)
+    {
+      PyErr_SetString (PyExc_ValueError,
+		       "value must not be smaller than zero");
+      return false;
+    }
+
+  *val = (unsigned int) native_val;
+  return true;
+}
+
+/* Converts from a Python integer to a signed integer. */
+
+static bool
+py_to_int(PyObject *object, int *val)
+{
+  if(!PyObject_IsInstance(object, (PyObject*)&PyLong_Type))
+    {
+      PyErr_SetString(PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong(object);
+  if(native_val > (long)INT_MAX)
+    {
+      PyErr_SetString(PyExc_ValueError, "value is too large");
+      return false;
+    }
+
+  *val = (int)native_val;
+  return true;
+}
+
+/* Instantiate functions for all of the float format fields we'd like to be
+ * able to read and change from our Python object. These will be used later to
+ * define `getset` entries for them. */
+
+INSTANCE_FIELD_GETTER (ffpy_get_totalsize, totalsize,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_sign_start, sign_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_start, exp_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_len, exp_len,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_bias, exp_bias, int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_nan, exp_nan,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_start, man_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_len, man_len,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_intbit, intbit,
+		       enum floatformat_intbit, intbit_to_py)
+INSTANCE_FIELD_GETTER (ffpy_get_name, name,
+		       const char *, PyUnicode_FromString)
+
+INSTANCE_FIELD_SETTER (ffpy_set_totalsize, totalsize,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_sign_start, sign_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_start, exp_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_len, exp_len,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_bias, exp_bias, int, py_to_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_nan, exp_nan,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_start, man_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_len, man_len,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_intbit, intbit,
+		       enum floatformat_intbit, py_to_intbit)
+
+/* Makes sure float formats created from Python always test as valid. */
+
+static int
+ffpy_always_valid (const struct floatformat *fmt ATTRIBUTE_UNUSED,
+		   const void *from ATTRIBUTE_UNUSED)
+{
+  return 1;
+}
+
+/* Initializes new float format objects. */
+
+static int
+ffpy_init (PyObject *self,
+	   PyObject *args ATTRIBUTE_UNUSED,
+	   PyObject *kwds ATTRIBUTE_UNUSED)
+{
+  auto ff = (float_format_object*) self;
+  ff->format = floatformat ();
+  ff->float_format ()->name = "";
+  ff->float_format ()->is_valid = ffpy_always_valid;
+  return 0;
+}
+
+/* See python/python-internal.h. */
+
+struct floatformat *
+float_format_object_as_float_format (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &float_format_object_type))
+    {
+      PyErr_SetString(PyExc_TypeError, "expected gdb.FloatFormat");
+      return nullptr;
+    }
+  return ((float_format_object*) self)->float_format ();
+}
+
+static gdb_PyGetSetDef float_format_object_getset[] =
+{
+  { "totalsize", ffpy_get_totalsize, ffpy_set_totalsize,
+    "The total size of the floating point number, in bits.", nullptr },
+  { "sign_start", ffpy_get_sign_start, ffpy_set_sign_start,
+    "The bit offset of the sign bit.", nullptr },
+  { "exp_start", ffpy_get_exp_start, ffpy_set_exp_start,
+    "The bit offset of the start of the exponent.", nullptr },
+  { "exp_len", ffpy_get_exp_len, ffpy_set_exp_len,
+    "The size of the exponent, in bits.", nullptr },
+  { "exp_bias", ffpy_get_exp_bias, ffpy_set_exp_bias,
+    "Bias added to the written exponent to form the biased exponent.",
+    nullptr },
+  { "exp_nan", ffpy_get_exp_nan, ffpy_set_exp_nan,
+    "Exponent value which indicates NaN.", nullptr },
+  { "man_start", ffpy_get_man_start, ffpy_set_man_start,
+    "The bit offset of the start of the mantissa.", nullptr },
+  { "man_len", ffpy_get_man_len, ffpy_set_man_len,
+    "The size of the mantissa, in bits.", nullptr },
+  { "intbit", ffpy_get_intbit, ffpy_set_intbit,
+    "Is the integer bit explicit or implicit?", nullptr },
+  { "name", ffpy_get_name, nullptr,
+    "Internal name for debugging.", nullptr },
+  { nullptr }
+};
+
+PyTypeObject float_format_object_type =
+{
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.FloatFormat",		  /*tp_name*/
+  sizeof (float_format_object),   /*tp_basicsize*/
+  0,				  /*tp_itemsize*/
+  nullptr,			  /*tp_dealloc*/
+  0,				  /*tp_print*/
+  nullptr,			  /*tp_getattr*/
+  nullptr,			  /*tp_setattr*/
+  nullptr,			  /*tp_compare*/
+  nullptr,			  /*tp_repr*/
+  nullptr,			  /*tp_as_number*/
+  nullptr,			  /*tp_as_sequence*/
+  nullptr,			  /*tp_as_mapping*/
+  nullptr,			  /*tp_hash */
+  nullptr,			  /*tp_call*/
+  nullptr,			  /*tp_str*/
+  nullptr,			  /*tp_getattro*/
+  nullptr,			  /*tp_setattro*/
+  nullptr,			  /*tp_as_buffer*/
+  Py_TPFLAGS_DEFAULT,		  /*tp_flags*/
+  "GDB float format object",      /* tp_doc */
+  nullptr,			  /* tp_traverse */
+  nullptr,			  /* tp_clear */
+  nullptr,			  /* tp_richcompare */
+  0,				  /* tp_weaklistoffset */
+  nullptr,			  /* tp_iter */
+  nullptr,			  /* tp_iternext */
+  nullptr,			  /* tp_methods */
+  nullptr,			  /* tp_members */
+  float_format_object_getset,     /* tp_getset */
+  nullptr,			  /* tp_base */
+  nullptr,			  /* tp_dict */
+  nullptr,			  /* tp_descr_get */
+  nullptr,			  /* tp_descr_set */
+  0,				  /* tp_dictoffset */
+  ffpy_init,			  /* tp_init */
+  nullptr,			  /* tp_alloc */
+  PyType_GenericNew,		  /* tp_new */
+};
diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index bb5d0d92aba..71d840c3e00 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -705,6 +705,23 @@ objfile_to_objfile_object (struct objfile *objfile)
   return gdbpy_ref<>::new_reference (result);
 }
 
+/* See python/python-internal.h. */
+
+struct objfile *
+objfile_object_to_objfile (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_object_type))
+    {
+      PyErr_SetString(PyExc_TypeError, "expected gdb.Objfile");
+      return nullptr;
+    }
+
+  auto objfile_object = (struct objfile_object*) self;
+  OBJFPY_REQUIRE_VALID (objfile_object);
+
+  return objfile_object->objfile;
+}
+
 static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
 gdbpy_initialize_objfile (void)
 {
diff --git a/gdb/python/py-type-init.c b/gdb/python/py-type-init.c
new file mode 100644
index 00000000000..58f29393413
--- /dev/null
+++ b/gdb/python/py-type-init.c
@@ -0,0 +1,520 @@
+/* Functionality for creating new types accessible from python.
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "gdbtypes.h"
+#include "floatformat.h"
+#include "objfiles.h"
+#include "gdbsupport/gdb_obstack.h"
+
+
+/* An abstraction covering the objects types that can own a type object. */
+
+class type_storage_owner
+{
+public:
+  /* Creates a new type owner from the given python object. If the object is
+   * of a type that is not supported, the newly created instance will be
+   * marked as invalid and nothing should be done with it. */
+
+  type_storage_owner (PyObject *owner)
+  {
+    if (gdbpy_is_architecture (owner))
+      {
+	this->kind = owner_kind::arch;
+	this->owner.arch = arch_object_to_gdbarch (owner);
+	return;
+      }
+
+    this->kind = owner_kind::objfile;
+    this->owner.objfile = objfile_object_to_objfile (owner);
+    if (this->owner.objfile != nullptr)
+	return;
+
+    this->kind = owner_kind::none;
+    PyErr_SetString(PyExc_TypeError, "unsupported owner type");
+  }
+
+  /* Whether the owner is valid. An owner may not be valid if the type that
+   * was used to create it is not known. Operations must only be done on valid
+   * instances of this class. */
+
+  bool valid ()
+  {
+    return this->kind != owner_kind::none;
+  }
+
+  /* Returns a type allocator that allocates on this owner. */
+
+  type_allocator allocator ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+      return type_allocator (this->owner.arch);
+    else if (this->kind == owner_kind::objfile)
+      {
+	/* Creating types on the gdbarch sets their language to minimal, we
+	 * maintain this behavior here. */
+	return type_allocator (this->owner.objfile, language_minimal);
+      }
+
+    /* Should never be reached, but it's better to fail in a safe way than try
+     * to instance the allocator with arbitraty parameters here. */
+    abort ();
+  }
+
+  /* Get a reference to the owner's obstack. */
+
+  obstack *get_obstack ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+	return gdbarch_obstack (this->owner.arch);
+    else if (this->kind == owner_kind::objfile)
+	return &this->owner.objfile->objfile_obstack;
+
+    return nullptr;
+  }
+
+  /* Get a reference to the owner's architecture. */
+
+  struct gdbarch *get_arch ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+	return this->owner.arch;
+    else if (this->kind == owner_kind::objfile)
+	return this->owner.objfile->arch ();
+
+    return nullptr;
+  }
+
+  /* Copy a null-terminated string to the owner's obstack. */
+
+  const char *copy_string (const char *py_str)
+  {
+    gdb_assert (this->valid ());
+
+    unsigned int len = strlen (py_str);
+    return obstack_strndup (this->get_obstack (), py_str, len);
+  }
+
+
+
+private:
+  enum class owner_kind { arch, objfile, none };
+
+  owner_kind kind = owner_kind::none;
+  union {
+    struct gdbarch *arch;
+    struct objfile *objfile;
+  } owner;
+};
+
+/* Creates a new type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "type_code", "bit_size", "name",
+				    NULL };
+  PyObject *owner_object;
+  enum type_code code;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oiis", keywords, &owner_object,
+					&code, &bit_length, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = allocator.new_type (code, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new integer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_integer_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_integer_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object(type);
+}
+
+/* Creates a new character type and returns a new gdb.Type associated
+ * with it. */
+
+PyObject *
+gdbpy_init_character_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_character_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new boolean type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_boolean_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_boolean_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new float type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_float_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "format", "name", NULL };
+  PyObject *owner_object, *float_format_object;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "OOs", keywords, &owner_object,
+					&float_format_object, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  struct floatformat *local_ff = float_format_object_as_float_format
+    (float_format_object);
+  if (local_ff == nullptr)
+    return nullptr;
+
+  /* Persist a copy of the format in the objfile's obstack. This guarantees
+   * that the format won't outlive the type being created from it and that
+   * changes made to the object used to create this type will not affect it
+   * after creation. */
+  auto ff = OBSTACK_CALLOC (owner.get_obstack (), 1, struct floatformat);
+  memcpy (ff, local_ff, sizeof (struct floatformat));
+
+  /* We only support creating float types in the architecture's endianness, so
+   * make sure init_float_type sees the float format structure we need it to.
+   */
+  enum bfd_endian endianness = gdbarch_byte_order (owner.get_arch ());
+  gdb_assert (endianness < BFD_ENDIAN_UNKNOWN);
+
+  const struct floatformat *per_endian[2] = { nullptr, nullptr };
+  per_endian[endianness] = ff;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_float_type (allocator, -1, name, per_endian, endianness);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new decimal float type and returns a new gdb.Type
+ * associated with it. */
+
+PyObject *
+gdbpy_init_decfloat_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "name", NULL };
+  PyObject *owner_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Ois", keywords, &owner_object,
+					&bit_length, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_decfloat_type (allocator, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Returns whether a given type can be used to create a complex type. */
+
+PyObject *
+gdbpy_can_create_complex_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "type", NULL };
+  PyObject *type_object;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords,
+					&type_object))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  bool can_create_complex = false;
+  try
+    {
+      can_create_complex = can_create_complex_type (type);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  if (can_create_complex)
+    Py_RETURN_TRUE;
+  else
+    Py_RETURN_FALSE;
+}
+
+/* Creates a new complex type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_complex_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "type", "name", NULL };
+  PyObject *type_object;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Os", keywords, &type_object,
+					&py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  obstack *obstack;
+  if (type->is_objfile_owned ())
+    obstack = &type->objfile_owner ()->objfile_obstack;
+  else
+    obstack = gdbarch_obstack (type->arch_owner ());
+
+  unsigned int len = strlen (py_name);
+  const char *name = obstack_strndup (obstack,
+				      py_name,
+				      len);
+  struct type *complex_type;
+  try
+    {
+      complex_type = init_complex_type (name, type);
+      gdb_assert (complex_type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (complex_type);
+}
+
+/* Creates a new pointer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_pointer_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "target", "bit_size", "name",
+				    NULL };
+  PyObject *owner_object, *type_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "OOis", keywords,
+					&owner_object, &type_object,
+					&bit_length, &py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *pointer_type = nullptr;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      pointer_type = init_pointer_type (allocator, bit_length, name, type);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (pointer_type);
+}
+
+/* Creates a new fixed point type and returns a new gdb.Type associated
+ * with it. */
+
+PyObject *
+gdbpy_init_fixed_point_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_length;
+  int unsigned_p;
+  const char* py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_length,
+					&unsigned_p, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_fixed_point_type (allocator, bit_length, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index 14e15574685..51e1202d5bd 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -291,6 +291,8 @@ extern PyTypeObject frame_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("frame_object");
 extern PyTypeObject thread_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("thread_object");
+extern PyTypeObject float_format_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("float_format");
 
 /* Ensure that breakpoint_object_type is initialized and return true.  If
    breakpoint_object_type can't be initialized then set a suitable Python
@@ -433,6 +435,26 @@ gdb::unique_xmalloc_ptr<char> gdbpy_parse_command_name
 PyObject *gdbpy_register_tui_window (PyObject *self, PyObject *args,
 				     PyObject *kw);
 
+PyObject *gdbpy_init_type (PyObject *self, PyObject *args, PyObject *kw);
+PyObject *gdbpy_init_integer_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_character_type (PyObject *self, PyObject *args,
+				     PyObject *kw);
+PyObject *gdbpy_init_boolean_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_float_type (PyObject *self, PyObject *args,
+				 PyObject *kw);
+PyObject *gdbpy_init_decfloat_type (PyObject *self, PyObject *args,
+				    PyObject *kw);
+PyObject *gdbpy_can_create_complex_type (PyObject *self, PyObject *args,
+					 PyObject *kw);
+PyObject *gdbpy_init_complex_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_pointer_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_fixed_point_type (PyObject *self, PyObject *args,
+				       PyObject *kw);
+
 PyObject *symtab_and_line_to_sal_object (struct symtab_and_line sal);
 PyObject *symtab_to_symtab_object (struct symtab *symtab);
 PyObject *symbol_to_symbol_object (struct symbol *sym);
@@ -504,6 +526,18 @@ extern void serialize_mi_results (PyObject *results);
 extern PyObject *gdbpy_notify_mi (PyObject *self, PyObject *args,
 				  PyObject *kw);
 
+/* Retrieves a pointer to the underlying float format structure. Expects an
+ * instance of gdb.Objfile for SELF. If SELF is of an incompatible type,
+ * returns nullptr and raises a Python exception. */
+
+extern struct objfile *objfile_object_to_objfile (PyObject *self);
+
+/* Retrieves a pointer to the underlying float format structure. Expects an
+ * instance of gdb.FloatFormat for SELF. If SELF is of an incompatible type,
+ * returns nullptr and raises a Python exception. */
+
+extern struct floatformat *float_format_object_as_float_format (PyObject *self);
+
 /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
    gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
    valid (see gdb.Progspace.is_valid), otherwise return the program_space
diff --git a/gdb/python/python.c b/gdb/python/python.c
index 2ca3c50afd4..1ccba1ca519 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -2626,6 +2626,56 @@ Return current recording object." },
     "stop_recording () -> None.\n\
 Stop current recording." },
 
+  /* Type initialization functions. */
+  { "init_type", (PyCFunction) gdbpy_init_type, METH_VARARGS | METH_KEYWORDS,
+    "init_type (objfile, type_code, bit_length, name) -> type\n\
+    Creates a new type with the given bit length and type code, owned\
+    by the given objfile." },
+  { "init_integer_type", (PyCFunction) gdbpy_init_integer_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_integer_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new integer type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_character_type", (PyCFunction) gdbpy_init_character_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_character_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new character type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_boolean_type", (PyCFunction) gdbpy_init_boolean_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_boolean_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new boolean type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_float_type", (PyCFunction) gdbpy_init_float_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_float_type (objfile, float_format, name) -> type\n\
+    Creates a new floating point type with the given bit length and \
+    format, owned by the given objfile." },
+  { "init_decfloat_type", (PyCFunction) gdbpy_init_decfloat_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_decfloat_type (objfile, bit_length, name) -> type\n\
+    Creates a new decimal float type with the given bit length,\
+    owned by the given objfile." },
+  { "can_create_complex_type", (PyCFunction) gdbpy_can_create_complex_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "can_create_complex_type (type) -> bool\n\
+     Returns whether a given type can form a new complex type." },
+  { "init_complex_type", (PyCFunction) gdbpy_init_complex_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_complex_type (base_type, name) -> type\n\
+    Creates a new complex type whose components belong to the\
+    given type, owned by the given objfile." },
+  { "init_pointer_type", (PyCFunction) gdbpy_init_pointer_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_pointer_type (objfile, target_type, bit_length, name) -> type\n\
+    Creates a new pointer type with the given bit length, pointing\
+    to the given target type, and owned by the given objfile." },
+  { "init_fixed_point_type", (PyCFunction) gdbpy_init_fixed_point_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_fixed_point_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new fixed point type with the given bit length and\
+    signedness, owned by the given objfile." },
+
   { "lookup_type", (PyCFunction) gdbpy_lookup_type,
     METH_VARARGS | METH_KEYWORDS,
     "lookup_type (name [, block]) -> type\n\
diff --git a/gdb/testsuite/gdb.python/py-type-init.c b/gdb/testsuite/gdb.python/py-type-init.c
new file mode 100644
index 00000000000..010e62bd248
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-type-init.c
@@ -0,0 +1,21 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2009-2023 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int main ()
+{
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.python/py-type-init.exp b/gdb/testsuite/gdb.python/py-type-init.exp
new file mode 100644
index 00000000000..8ef3c2c57af
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-type-init.exp
@@ -0,0 +1,132 @@
+# Copyright (C) 2009-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the GDB testsuite.  It tests the mechanism
+# of creating new types from within Python.
+
+load_lib gdb-python.exp
+
+standard_testfile
+
+# Build inferior to language specification.
+proc build_inferior {exefile lang} {
+  global srcdir subdir srcfile testfile hex
+
+  if { [gdb_compile "${srcdir}/${subdir}/${srcfile}" "${exefile}" executable "debug $lang"] != "" } {
+      untested "failed to compile in $lang mode"
+      return -1
+  }
+
+  return 0
+}
+
+# Restart GDB.
+proc restart_gdb {exefile} {
+  clean_restart $exefile
+
+  if {![runto_main]} {
+      return
+  }
+}
+
+# Tests the basic values of a type.
+proc test_type_basic {owner t code sizeof name} {
+  gdb_test "python print(${t}.code == ${code})" \
+    "True" "check the code for the python-constructed type (${owner}/${name})"
+  gdb_test "python print(${t}.sizeof == ${sizeof})" \
+    "True" "check the size for the python-constructed type (${owner}/${name})"
+  gdb_test "python print(${t}.name == ${name})" \
+    "True" "check the name for the python-constructed type (${owner}/${name})"
+}
+
+# Runs the tests for a given owner object.
+proc for_owner {owner} {
+  # Simple direct type creation.
+  gdb_test "python t = gdb.init_type(${owner}, gdb.TYPE_CODE_INT, 24, 'long short int')" \
+    "" "construct a new type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_INT" "3" "'long short int'"
+
+  # Integer type creation.
+  gdb_test "python t = gdb.init_integer_type(${owner}, 24, True, 'test_int_t')" \
+    "" "construct a new integer type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_INT" "3" "'test_int_t'"
+
+  # Character type creation.
+  gdb_test "python t = gdb.init_character_type(${owner}, 24, True, 'test_char_t')" \
+    "" "construct a new character type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_CHAR" "3" "'test_char_t'"
+
+  # Boolean type creation.
+  gdb_test "python t = gdb.init_boolean_type(${owner}, 24, True, 'test_bool_t')" \
+    "" "construct a new boolean type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_BOOL" "3" "'test_bool_t'"
+
+  # Float type creation.
+  gdb_test "python f = gdb.FloatFormat()" "" "create a float format object (${owner})"
+  gdb_test "python f.totalsize = 32" "" "set totalsize for the float format (${owner})"
+  gdb_test "python f.sign_start = 31" "" "set sign_start for the float format (${owner})"
+  gdb_test "python f.exp_start = 23" "" "set exp_start for the float format (${owner})"
+  gdb_test "python f.exp_len = 8" "" "set exp_len for the float format (${owner})"
+  gdb_test "python f.exp_bias = 0" "" "set exp_bias for the float format (${owner})"
+  gdb_test "python f.exp_nan = 0xff" "" "set exp_nan for the float format (${owner})"
+  gdb_test "python f.man_start = 0" "" "set man_start for the float format (${owner})"
+  gdb_test "python f.man_len = 22" "" "set man_len for the float format (${owner})"
+  gdb_test "python f.intbit = False" "" "set intbit for the float format (${owner})"
+  gdb_test "python f.name = 'test_float_fmt'" "" "set name for the float format (${owner})"
+
+  gdb_test "python ft = gdb.init_float_type(${owner}, f, 'test_float_t')" \
+    "" "construct a new float type from inside python (${owner})"
+  test_type_basic $owner "ft" "gdb.TYPE_CODE_FLT" "4" "'test_float_t'"
+
+  # Decfloat type creation.
+  gdb_test "python t = gdb.init_decfloat_type(${owner}, 24, 'test_decfloat_t')" \
+    "" "construct a new decfloat type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_DECFLOAT" "3" "'test_decfloat_t'"
+
+  # Test complex type.
+  gdb_test "python print(gdb.can_create_complex_type(ft))" "True" \
+    "check whether the float type we created can be the basis for a complex (${owner})"
+
+  gdb_test "python t = gdb.init_complex_type(ft, 'test_complex_t')" \
+    "" "construct a new complex type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_COMPLEX" "8" "'test_complex_t'"
+
+  # Create a 24-bit pointer to our floating point type.
+  gdb_test "python t = gdb.init_pointer_type(${owner}, ft, 24, 'test_pointer_t')" \
+    "" "construct a new pointer type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_PTR" "3" "'test_pointer_t'"
+}
+
+# Run the tests.
+if { [build_inferior "${binfile}" "c"] == 0 } {
+  restart_gdb "${binfile}"
+
+  # Skip all tests if Python scripting is not enabled.
+  if { ![allow_python_tests] } { continue }
+
+  # Test objfile-owned type construction
+  for_owner "gdb.objfiles()\[0\]"
+
+  # Objfile-owned fixed point type creation.
+  #
+  # Currently, these cannot be owned by architectures, so we have to
+  # test them separately.
+  gdb_test "python t = gdb.init_fixed_point_type(gdb.objfiles()\[0\], 24, True, 'test_fixed_t')" \
+    "" "construct a new fixed point type from inside python (gdb.objfile()\[0\])"
+  test_type_basic "gdb.objfile()\[0\]" "t" "gdb.TYPE_CODE_FIXED_POINT" "3" "'test_fixed_t'"
+
+  # Test arch-owned type construction
+  for_owner "gdb.inferiors()\[0\].architecture()"
+}
-- 
2.40.1


^ permalink raw reply	[relevance 1%]

* Re: [PATCH v3] Add support for creating new types from the Python API
  2024-01-13  1:37  1%       ` [PATCH v3] " Matheus Branco Borella
@ 2024-01-13  7:21  0%         ` Eli Zaretskii
  2024-01-16  4:55  7%           ` Matheus Branco Borella
  0 siblings, 1 reply; 65+ results
From: Eli Zaretskii @ 2024-01-13  7:21 UTC (permalink / raw)
  To: Matheus Branco Borella; +Cc: gdb-patches, aburgess

> From: Matheus Branco Borella <dark.ryu.550@gmail.com>
> Cc: aburgess@redhat.com,
> 	Matheus Branco Borella <dark.ryu.550@gmail.com>
> Date: Fri, 12 Jan 2024 22:37:57 -0300
> 
>  gdb/Makefile.in                           |   2 +
>  gdb/NEWS                                  |  16 +
>  gdb/doc/python.texi                       | 237 ++++++++++
>  gdb/python/py-float-format.c              | 307 +++++++++++++
>  gdb/python/py-objfile.c                   |  17 +
>  gdb/python/py-type-init.c                 | 520 ++++++++++++++++++++++
>  gdb/python/python-internal.h              |  34 ++
>  gdb/python/python.c                       |  50 +++
>  gdb/testsuite/gdb.python/py-type-init.c   |  21 +
>  gdb/testsuite/gdb.python/py-type-init.exp | 132 ++++++
>  10 files changed, 1336 insertions(+)
>  create mode 100644 gdb/python/py-float-format.c
>  create mode 100644 gdb/python/py-type-init.c
>  create mode 100644 gdb/testsuite/gdb.python/py-type-init.c
>  create mode 100644 gdb/testsuite/gdb.python/py-type-init.exp

Thanks.

> diff --git a/gdb/NEWS b/gdb/NEWS
> index 11cd6c0663e..e21cca24422 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -87,6 +87,22 @@ show remote thread-options-packet
>    ** New function gdb.interrupt(), that interrupts GDB as if the user
>       typed control-c.
>  
> +  ** Functions that allow creation of instances of gdb.Type, and a new
> +     class gdb.FloatFormat that may be used to create floating point
> +     types. The functions that allow new type creation are:
             ^^
Our convention is to leave two spaces between sentences.

> +@node Creating Types In Python
> +@subsubsection Creating Types In Python
> +@cindex creating types in Python
> +@cindex Python, working with types
> +
> +@value{GDBN} makes available functionality to create new types from
> +inside Python.

This is awkward English.  Suggest to reword:

  @value{GDBN} allows creation of new types from Python extensions.

> +The following type creation functions are available in the @code{gdb}
> +module:

  The following functions available in the @code{gdb} module create
  new types:

> +@findex gdb.init_type
> +@defun gdb.init_type (owner, type_code, bit_size, name)
> +This function creates a new @code{gdb.Type} instance corresponding to a
> +type owned by the given @code{owner}, with the given type code,
> +@code{name} and size.

The arguments should have the @var markup, not @code.  So:

  This function creates a new @code{gdb.Type} instance corresponding to a
  type owned by the given @var{owner}, with the given @var{type_code},
  @var{name} and @var{bit_size}.

Likewise in the rest of this section: formal parameters should be
referenced with @var{..}, not @code{..}.

> +@code{owner} must be a reference to either a @code{gdb.Objfile} or a
> +@code{gdb.Architecture} object. These correspond to objfile and
                                 ^^
Two spaces between sentences, here and elsewhere.

> +@code{type_code} is one of the @code{TYPE_CODE_} constants defined in
> +@xref{Types In Python}.
   ^^^^^
This should be @ref, not @xref.  @xref is only used at the beginning
of a sentence.

> +@findex gdb.init_integer_type
> +@defun gdb.init_integer_type (owner, bit_size, unsigned, name)
> +This function creates a new @code{gdb.Type} instance corresponding to an
> +integer type owned by the given @code{owner}, with the given
> +@code{name}, size and signedness.
> +
> +@code{owner} must be a reference to either a @code{gdb.Objfile} or a
> +@code{gdb.Architecture} object. These correspond to objfile and
> +architecture-owned types, respectively.
> +
> +@code{bit_size} is the size of instances of the newly created type, in
> +bits. Currently, accepted values are limited to multiples of 8.
> +
> +@code{unsigned} is a boolean indicating whether the type corresponds to
> +a signed or unsigned value.
> +
> +This function returns an instance of @code{gdb.Type}, and will throw an
> +exception in case of an error.
> +@end defun

There's no need to repeat the same text over and over again, for each
function.  It should be enough to describe these arguments once, and
say that all of the functions accept these arguments.  The only thing
you should say about each function is what data type it creates, all
the rest is similar or identical for all of these functions, and
should be only described once.  If some functions accept arguments
that are unique to them, then describe those unique arguments as part
of the function's description.  Otherwise, just refer to the general
description, which should be at the beginning of the section.

> +@findex gdb.init_pointer_type
> +@defun gdb.init_pointer_type (owner, target, bit_size, name)
> +This function creates a new @code{gdb.Type} instance corresponding to a
> +pointer type that points to @code{target} and is owned by the given
> +@code{owner}, with the given @code{name} and size.
> +
> +@code{owner} must be a reference to either a @code{gdb.Objfile} or a
> +@code{gdb.Architecture} object. These correspond to objfile and
> +architecture-owned types, respectively.
> +
> +@code{target} is a @code{gdb.Type} object, corresponding to the type
> +that will be pointed to by the newly created pointer type.
> +
> +@code{bit_size} is the size of instances of the newly created type, in
> +bits. Currently, accepted values are limited to multiples of 8.

Does this mean one can create pointer types of arbitrary length,
regardless of the target architecture?  Aren't pointers limited to
what the underlying system supports?

> +@findex gdb.init_boolean_type
> +@defun gdb.init_fixed_point_type (owner, bit_size, unsigned, name)

Typo in the first line?

Reviewed-By: Eli Zaretskii <eliz@gnu.org>

^ permalink raw reply	[relevance 0%]

* [PATCH v3] Add support for creating new types from the Python API
  2023-08-08 21:00  1%     ` [PATCH v2] " Matheus Branco Borella
@ 2024-01-13  1:37  1%       ` Matheus Branco Borella
  2024-01-13  7:21  0%         ` Eli Zaretskii
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella @ 2024-01-13  1:37 UTC (permalink / raw)
  To: gdb-patches; +Cc: aburgess, Matheus Branco Borella

I had to walk away from this for a while. I'm pinging it now and I've updated
the code so that it works on master.

This patch adds support for creating types from within the Python API. It does
so by exposing the `init_*_type` family of functions, defined in `gdbtypes.h` to
Python and having them return `gdb.Type` objects connected to the newly minted
types.

These functions are accessible in the root of the gdb module and all require
a reference to either a `gdb.Objfile` or a `gdb.Architecture`. Types created
from them will be owned by the object passed to the function.

This patch also adds an extra type - `gdb.FloatFormat` - to support creation of
floating point types by letting users control the format from within Python. It
is missing, however, a way to specify half formats and validation functions.

It is important to note that types created using this interface are not
automatically registered as a symbol, and so, types will become unreachable
unless used to create a value that otherwise references it or saved in some way.

The main drawback of using the `init_*_type` family over implementing type
initialization by hand is that any type that's created gets immediately
allocated on its owner's obstack, regardless of what its real lifetime
requirements are. The main implication of this is that types that become
unreachable will remain live for the lifetime of the owner.

Keeping track of the initialization of the type by hand would require a
deeper change to the existing type object infrastructure. A bit too ambitious
for a first patch, I'd say.

If it were to be done though, we would gain the ability to only keep in the
obstack types that are known to be referenced in some other way - by allocating
and copying the data to the obstack as other objects are created that reference
it (eg. symbols).
---
 gdb/Makefile.in                           |   2 +
 gdb/NEWS                                  |  16 +
 gdb/doc/python.texi                       | 237 ++++++++++
 gdb/python/py-float-format.c              | 307 +++++++++++++
 gdb/python/py-objfile.c                   |  17 +
 gdb/python/py-type-init.c                 | 520 ++++++++++++++++++++++
 gdb/python/python-internal.h              |  34 ++
 gdb/python/python.c                       |  50 +++
 gdb/testsuite/gdb.python/py-type-init.c   |  21 +
 gdb/testsuite/gdb.python/py-type-init.exp | 132 ++++++
 10 files changed, 1336 insertions(+)
 create mode 100644 gdb/python/py-float-format.c
 create mode 100644 gdb/python/py-type-init.c
 create mode 100644 gdb/testsuite/gdb.python/py-type-init.c
 create mode 100644 gdb/testsuite/gdb.python/py-type-init.exp

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 195f3a2e2d1..50a758c802b 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -432,6 +432,8 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-threadevent.c \
 	python/py-tui.c \
 	python/py-type.c \
+	python/py-type-init.c \
+	python/py-float-format.c \
 	python/py-unwind.c \
 	python/py-utils.c \
 	python/py-value.c \
diff --git a/gdb/NEWS b/gdb/NEWS
index 11cd6c0663e..e21cca24422 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -87,6 +87,22 @@ show remote thread-options-packet
   ** New function gdb.interrupt(), that interrupts GDB as if the user
      typed control-c.
 
+  ** Functions that allow creation of instances of gdb.Type, and a new
+     class gdb.FloatFormat that may be used to create floating point
+     types. The functions that allow new type creation are:
+      - gdb.init_type: Create a new type given a type code.
+      - gdb.init_integer_type: Create a new integer type.
+      - gdb.init_character_type: Create a new character type.
+      - gdb.init_boolean_type: Create a new boolean type.
+      - gdb.init_float_type: Create a new floating point type.
+      - gdb.init_decfloat_type: Create a new decimal floating point type.
+      - gdb.can_create_complex_type: Whether a type can be used to create a
+          new complex type.
+      - gdb.init_complex_type: Create a new complex type.
+      - gdb.init_pointer_type: Create a new pointer type.
+          * This allows creating pointers of arbitrary size.
+      - gdb.init_fixed_point_type: Create a new fixed point type.
+
 * Debugger Adapter Protocol changes
 
   ** GDB now emits the "process" event.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index d74defeec0c..8c2cd393e0c 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -1743,6 +1743,243 @@ A Fortran namelist.
 Further support for types is provided in the @code{gdb.types}
 Python module (@pxref{gdb.types}).
 
+
+
+@node Creating Types In Python
+@subsubsection Creating Types In Python
+@cindex creating types in Python
+@cindex Python, working with types
+
+@value{GDBN} makes available functionality to create new types from
+inside Python.
+
+The following type creation functions are available in the @code{gdb}
+module:
+
+@findex gdb.init_type
+@defun gdb.init_type (owner, type_code, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+type owned by the given @code{owner}, with the given type code,
+@code{name} and size.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{type_code} is one of the @code{TYPE_CODE_} constants defined in
+@xref{Types In Python}.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_integer_type
+@defun gdb.init_integer_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to an
+integer type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_character_type
+@defun gdb.init_character_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+character type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_boolean_type
+@defun gdb.init_boolean_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+boolean type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_float_type
+@defun gdb.init_float_type (owner, format, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+character type owned by the given @code{owner}, with the given
+@code{name} and @code{format}.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{format} is an reference to a @code{gdb.FloatFormat} object, as
+described below.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_decfloat_type
+@defun gdb.init_decfloat_type (owner, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+decimal floating point type owned by the given @code{owner}, with the
+given @code{name} and size.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.can_create_complex_type
+@defun gdb.can_create_complex_type (type)
+This function returns a boolean indicating whether @code{type} can be
+used to create a new complex type using the @code{gdb.init_complex_type}
+function.
+@end defun
+
+@findex gdb.init_complex_type
+@defun gdb.init_complex_type (type, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+complex type with the given @code{name} based on the given base
+@code{type}.
+
+The newly created type will be owned by the same object as the base
+type that was used to create it.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_pointer_type
+@defun gdb.init_pointer_type (owner, target, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+pointer type that points to @code{target} and is owned by the given
+@code{owner}, with the given @code{name} and size.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{target} is a @code{gdb.Type} object, corresponding to the type
+that will be pointed to by the newly created pointer type.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_boolean_type
+@defun gdb.init_fixed_point_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+fixed point type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+When creating a floating point type through @code{gdb.init_float_type},
+one has to use a @code{gdb.FloatFormat} object. These objects may be
+created with no arguments, and the following attributes may be used to
+defined the format of the desired floating point format:
+
+@defvar FloatFormat.totalsize
+The size of the floating point number, in bits. Currently, accepted
+values are limited to multiples of 8.
+@end defvar
+
+@defvar FloatFormat.sign_start
+The bit offset of the sign bit.
+@end defvar
+
+@defvar FloatFormt.exp_start
+The bit offset of the start of the exponent.
+@end defvar
+
+@defvar FloatFormat.exp_len
+The size of the exponent, in bits.
+@end defvar
+
+@defvar FloatFormat.exp_bias
+Bias added to the written exponent to form the biased exponent.
+@end defvar
+
+@defvar FloatFormat.exp_nan
+Exponent value which indicates NaN.
+@end defvar
+
+@defvar FloatFormat.man_start
+The bit offset of the start of the mantissa.
+@end defvar
+
+@defvar FloatFormat.man_len
+The size of the mantissa, in bits.
+@end defvar
+
+@defvar FloatFormat.intbit
+This is a boolean values that indicates whether the integer bit is part
+of the value or if it is determined implicitly. A value of true
+indicates the former, while a value of false indicates the latter.
+@end defvar
+
+@defvar FloatFormat.name
+The name of the float format. Used internally, for debugging purposes.
+@end defvar
+
+
+
 @node Pretty Printing API
 @subsubsection Pretty Printing API
 @cindex python pretty printing api
diff --git a/gdb/python/py-float-format.c b/gdb/python/py-float-format.c
new file mode 100644
index 00000000000..984b96361a7
--- /dev/null
+++ b/gdb/python/py-float-format.c
@@ -0,0 +1,307 @@
+/* Accessibility of float format controls from inside the Python API
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "floatformat.h"
+
+/* Structure backing the float format Python interface. */
+
+struct float_format_object
+{
+  PyObject_HEAD
+  struct floatformat format;
+
+  struct floatformat *float_format ()
+  {
+    return &this->format;
+  }
+};
+
+/* Initializes the float format type and registers it with the Python
+ * interpreter. */
+
+static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
+gdbpy_initialize_float_format (void)
+{
+  if (PyType_Ready (&float_format_object_type) < 0)
+    return -1;
+
+  if (gdb_pymodule_addobject (gdb_module, "FloatFormat",
+			      (PyObject *) &float_format_object_type) < 0)
+    return -1;
+
+  return 0;
+}
+
+GDBPY_INITIALIZE_FILE (gdbpy_initialize_float_format);
+
+/* Creates a function that gets the value of a field of a given name from the
+ * underliying float_format structure in the Python object. */
+
+#define INSTANCE_FIELD_GETTER(getter_name, field_name, field_type, field_conv)\
+  static PyObject *							      \
+  getter_name (PyObject *self, void *closure)				      \
+  {									      \
+    float_format_object *ff = (float_format_object*) self;		      \
+    field_type value = ff->float_format ()->field_name;			      \
+    return field_conv (value);						      \
+  }
+
+/* Creates a function that sets the value of a field of a given name from the
+ * underliying float_format structure in the Python object. */
+
+#define INSTANCE_FIELD_SETTER(setter_name, field_name, field_type, field_conv)\
+  static int								      \
+  setter_name (PyObject *self, PyObject* value, void *closure)		      \
+  {									      \
+    field_type native_value;						      \
+    if (!field_conv (value, &native_value))				      \
+      return -1;							      \
+    float_format_object *ff = (float_format_object*) self;		      \
+    ff->float_format ()->field_name = native_value;			      \
+    return 0;								      \
+  }
+
+/* Converts from the intbit enum to a Python boolean. */
+
+static PyObject *
+intbit_to_py (enum floatformat_intbit intbit)
+{
+  gdb_assert (intbit == floatformat_intbit_yes
+	      || intbit == floatformat_intbit_no);
+
+  if (intbit == floatformat_intbit_no)
+    Py_RETURN_FALSE;
+  else
+    Py_RETURN_TRUE;
+}
+
+/* Converts from a Python boolean to the intbit enum. */
+
+static bool
+py_to_intbit (PyObject *object, enum floatformat_intbit *intbit)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyBool_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "intbit must be True or False");
+      return false;
+    }
+
+  *intbit = PyObject_IsTrue (object) ? floatformat_intbit_yes
+    : floatformat_intbit_no;
+
+  return true;
+}
+
+/* Converts from a Python integer to a unsigned integer. */
+
+static bool
+py_to_unsigned_int (PyObject *object, unsigned int *val)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong (object);
+  if (native_val > (long) UINT_MAX)
+    {
+      PyErr_SetString (PyExc_ValueError, "value is too large");
+      return false;
+    }
+  if (native_val < 0)
+    {
+      PyErr_SetString (PyExc_ValueError,
+		       "value must not be smaller than zero");
+      return false;
+    }
+
+  *val = (unsigned int) native_val;
+  return true;
+}
+
+/* Converts from a Python integer to a signed integer. */
+
+static bool
+py_to_int(PyObject *object, int *val)
+{
+  if(!PyObject_IsInstance(object, (PyObject*)&PyLong_Type))
+    {
+      PyErr_SetString(PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong(object);
+  if(native_val > (long)INT_MAX)
+    {
+      PyErr_SetString(PyExc_ValueError, "value is too large");
+      return false;
+    }
+
+  *val = (int)native_val;
+  return true;
+}
+
+/* Instantiate functions for all of the float format fields we'd like to be
+ * able to read and change from our Python object. These will be used later to
+ * define `getset` entries for them. */
+
+INSTANCE_FIELD_GETTER (ffpy_get_totalsize, totalsize,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_sign_start, sign_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_start, exp_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_len, exp_len,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_bias, exp_bias, int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_nan, exp_nan,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_start, man_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_len, man_len,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_intbit, intbit,
+		       enum floatformat_intbit, intbit_to_py)
+INSTANCE_FIELD_GETTER (ffpy_get_name, name,
+		       const char *, PyUnicode_FromString)
+
+INSTANCE_FIELD_SETTER (ffpy_set_totalsize, totalsize,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_sign_start, sign_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_start, exp_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_len, exp_len,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_bias, exp_bias, int, py_to_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_nan, exp_nan,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_start, man_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_len, man_len,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_intbit, intbit,
+		       enum floatformat_intbit, py_to_intbit)
+
+/* Makes sure float formats created from Python always test as valid. */
+
+static int
+ffpy_always_valid (const struct floatformat *fmt ATTRIBUTE_UNUSED,
+		   const void *from ATTRIBUTE_UNUSED)
+{
+  return 1;
+}
+
+/* Initializes new float format objects. */
+
+static int
+ffpy_init (PyObject *self,
+	   PyObject *args ATTRIBUTE_UNUSED,
+	   PyObject *kwds ATTRIBUTE_UNUSED)
+{
+  auto ff = (float_format_object*) self;
+  ff->format = floatformat ();
+  ff->float_format ()->name = "";
+  ff->float_format ()->is_valid = ffpy_always_valid;
+  return 0;
+}
+
+/* See python/python-internal.h. */
+
+struct floatformat *
+float_format_object_as_float_format (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &float_format_object_type))
+    {
+      PyErr_SetString(PyExc_TypeError, "expected gdb.FloatFormat");
+      return nullptr;
+    }
+  return ((float_format_object*) self)->float_format ();
+}
+
+static gdb_PyGetSetDef float_format_object_getset[] =
+{
+  { "totalsize", ffpy_get_totalsize, ffpy_set_totalsize,
+    "The total size of the floating point number, in bits.", nullptr },
+  { "sign_start", ffpy_get_sign_start, ffpy_set_sign_start,
+    "The bit offset of the sign bit.", nullptr },
+  { "exp_start", ffpy_get_exp_start, ffpy_set_exp_start,
+    "The bit offset of the start of the exponent.", nullptr },
+  { "exp_len", ffpy_get_exp_len, ffpy_set_exp_len,
+    "The size of the exponent, in bits.", nullptr },
+  { "exp_bias", ffpy_get_exp_bias, ffpy_set_exp_bias,
+    "Bias added to the written exponent to form the biased exponent.",
+    nullptr },
+  { "exp_nan", ffpy_get_exp_nan, ffpy_set_exp_nan,
+    "Exponent value which indicates NaN.", nullptr },
+  { "man_start", ffpy_get_man_start, ffpy_set_man_start,
+    "The bit offset of the start of the mantissa.", nullptr },
+  { "man_len", ffpy_get_man_len, ffpy_set_man_len,
+    "The size of the mantissa, in bits.", nullptr },
+  { "intbit", ffpy_get_intbit, ffpy_set_intbit,
+    "Is the integer bit explicit or implicit?", nullptr },
+  { "name", ffpy_get_name, nullptr,
+    "Internal name for debugging.", nullptr },
+  { nullptr }
+};
+
+PyTypeObject float_format_object_type =
+{
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.FloatFormat",		  /*tp_name*/
+  sizeof (float_format_object),   /*tp_basicsize*/
+  0,				  /*tp_itemsize*/
+  nullptr,			  /*tp_dealloc*/
+  0,				  /*tp_print*/
+  nullptr,			  /*tp_getattr*/
+  nullptr,			  /*tp_setattr*/
+  nullptr,			  /*tp_compare*/
+  nullptr,			  /*tp_repr*/
+  nullptr,			  /*tp_as_number*/
+  nullptr,			  /*tp_as_sequence*/
+  nullptr,			  /*tp_as_mapping*/
+  nullptr,			  /*tp_hash */
+  nullptr,			  /*tp_call*/
+  nullptr,			  /*tp_str*/
+  nullptr,			  /*tp_getattro*/
+  nullptr,			  /*tp_setattro*/
+  nullptr,			  /*tp_as_buffer*/
+  Py_TPFLAGS_DEFAULT,		  /*tp_flags*/
+  "GDB float format object",      /* tp_doc */
+  nullptr,			  /* tp_traverse */
+  nullptr,			  /* tp_clear */
+  nullptr,			  /* tp_richcompare */
+  0,				  /* tp_weaklistoffset */
+  nullptr,			  /* tp_iter */
+  nullptr,			  /* tp_iternext */
+  nullptr,			  /* tp_methods */
+  nullptr,			  /* tp_members */
+  float_format_object_getset,     /* tp_getset */
+  nullptr,			  /* tp_base */
+  nullptr,			  /* tp_dict */
+  nullptr,			  /* tp_descr_get */
+  nullptr,			  /* tp_descr_set */
+  0,				  /* tp_dictoffset */
+  ffpy_init,			  /* tp_init */
+  nullptr,			  /* tp_alloc */
+  PyType_GenericNew,		  /* tp_new */
+};
diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index bb5d0d92aba..71d840c3e00 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -705,6 +705,23 @@ objfile_to_objfile_object (struct objfile *objfile)
   return gdbpy_ref<>::new_reference (result);
 }
 
+/* See python/python-internal.h. */
+
+struct objfile *
+objfile_object_to_objfile (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_object_type))
+    {
+      PyErr_SetString(PyExc_TypeError, "expected gdb.Objfile");
+      return nullptr;
+    }
+
+  auto objfile_object = (struct objfile_object*) self;
+  OBJFPY_REQUIRE_VALID (objfile_object);
+
+  return objfile_object->objfile;
+}
+
 static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
 gdbpy_initialize_objfile (void)
 {
diff --git a/gdb/python/py-type-init.c b/gdb/python/py-type-init.c
new file mode 100644
index 00000000000..58f29393413
--- /dev/null
+++ b/gdb/python/py-type-init.c
@@ -0,0 +1,520 @@
+/* Functionality for creating new types accessible from python.
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "gdbtypes.h"
+#include "floatformat.h"
+#include "objfiles.h"
+#include "gdbsupport/gdb_obstack.h"
+
+
+/* An abstraction covering the objects types that can own a type object. */
+
+class type_storage_owner
+{
+public:
+  /* Creates a new type owner from the given python object. If the object is
+   * of a type that is not supported, the newly created instance will be
+   * marked as invalid and nothing should be done with it. */
+
+  type_storage_owner (PyObject *owner)
+  {
+    if (gdbpy_is_architecture (owner))
+      {
+	this->kind = owner_kind::arch;
+	this->owner.arch = arch_object_to_gdbarch (owner);
+	return;
+      }
+
+    this->kind = owner_kind::objfile;
+    this->owner.objfile = objfile_object_to_objfile (owner);
+    if (this->owner.objfile != nullptr)
+	return;
+
+    this->kind = owner_kind::none;
+    PyErr_SetString(PyExc_TypeError, "unsupported owner type");
+  }
+
+  /* Whether the owner is valid. An owner may not be valid if the type that
+   * was used to create it is not known. Operations must only be done on valid
+   * instances of this class. */
+
+  bool valid ()
+  {
+    return this->kind != owner_kind::none;
+  }
+
+  /* Returns a type allocator that allocates on this owner. */
+
+  type_allocator allocator ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+      return type_allocator (this->owner.arch);
+    else if (this->kind == owner_kind::objfile)
+      {
+	/* Creating types on the gdbarch sets their language to minimal, we
+	 * maintain this behavior here. */
+	return type_allocator (this->owner.objfile, language_minimal);
+      }
+
+    /* Should never be reached, but it's better to fail in a safe way than try
+     * to instance the allocator with arbitraty parameters here. */
+    abort ();
+  }
+
+  /* Get a reference to the owner's obstack. */
+
+  obstack *get_obstack ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+	return gdbarch_obstack (this->owner.arch);
+    else if (this->kind == owner_kind::objfile)
+	return &this->owner.objfile->objfile_obstack;
+
+    return nullptr;
+  }
+
+  /* Get a reference to the owner's architecture. */
+
+  struct gdbarch *get_arch ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+	return this->owner.arch;
+    else if (this->kind == owner_kind::objfile)
+	return this->owner.objfile->arch ();
+
+    return nullptr;
+  }
+
+  /* Copy a null-terminated string to the owner's obstack. */
+
+  const char *copy_string (const char *py_str)
+  {
+    gdb_assert (this->valid ());
+
+    unsigned int len = strlen (py_str);
+    return obstack_strndup (this->get_obstack (), py_str, len);
+  }
+
+
+
+private:
+  enum class owner_kind { arch, objfile, none };
+
+  owner_kind kind = owner_kind::none;
+  union {
+    struct gdbarch *arch;
+    struct objfile *objfile;
+  } owner;
+};
+
+/* Creates a new type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "type_code", "bit_size", "name",
+				    NULL };
+  PyObject *owner_object;
+  enum type_code code;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oiis", keywords, &owner_object,
+					&code, &bit_length, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = allocator.new_type (code, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new integer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_integer_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_integer_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object(type);
+}
+
+/* Creates a new character type and returns a new gdb.Type associated
+ * with it. */
+
+PyObject *
+gdbpy_init_character_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_character_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new boolean type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_boolean_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_boolean_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new float type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_float_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "format", "name", NULL };
+  PyObject *owner_object, *float_format_object;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "OOs", keywords, &owner_object,
+					&float_format_object, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  struct floatformat *local_ff = float_format_object_as_float_format
+    (float_format_object);
+  if (local_ff == nullptr)
+    return nullptr;
+
+  /* Persist a copy of the format in the objfile's obstack. This guarantees
+   * that the format won't outlive the type being created from it and that
+   * changes made to the object used to create this type will not affect it
+   * after creation. */
+  auto ff = OBSTACK_CALLOC (owner.get_obstack (), 1, struct floatformat);
+  memcpy (ff, local_ff, sizeof (struct floatformat));
+
+  /* We only support creating float types in the architecture's endianness, so
+   * make sure init_float_type sees the float format structure we need it to.
+   */
+  enum bfd_endian endianness = gdbarch_byte_order (owner.get_arch ());
+  gdb_assert (endianness < BFD_ENDIAN_UNKNOWN);
+
+  const struct floatformat *per_endian[2] = { nullptr, nullptr };
+  per_endian[endianness] = ff;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_float_type (allocator, -1, name, per_endian, endianness);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new decimal float type and returns a new gdb.Type
+ * associated with it. */
+
+PyObject *
+gdbpy_init_decfloat_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "name", NULL };
+  PyObject *owner_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Ois", keywords, &owner_object,
+					&bit_length, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_decfloat_type (allocator, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Returns whether a given type can be used to create a complex type. */
+
+PyObject *
+gdbpy_can_create_complex_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "type", NULL };
+  PyObject *type_object;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords,
+					&type_object))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  bool can_create_complex = false;
+  try
+    {
+      can_create_complex = can_create_complex_type (type);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  if (can_create_complex)
+    Py_RETURN_TRUE;
+  else
+    Py_RETURN_FALSE;
+}
+
+/* Creates a new complex type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_complex_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "type", "name", NULL };
+  PyObject *type_object;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Os", keywords, &type_object,
+					&py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  obstack *obstack;
+  if (type->is_objfile_owned ())
+    obstack = &type->objfile_owner ()->objfile_obstack;
+  else
+    obstack = gdbarch_obstack (type->arch_owner ());
+
+  unsigned int len = strlen (py_name);
+  const char *name = obstack_strndup (obstack,
+				      py_name,
+				      len);
+  struct type *complex_type;
+  try
+    {
+      complex_type = init_complex_type (name, type);
+      gdb_assert (complex_type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (complex_type);
+}
+
+/* Creates a new pointer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_pointer_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "target", "bit_size", "name",
+				    NULL };
+  PyObject *owner_object, *type_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "OOis", keywords,
+					&owner_object, &type_object,
+					&bit_length, &py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *pointer_type = nullptr;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      pointer_type = init_pointer_type (allocator, bit_length, name, type);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (pointer_type);
+}
+
+/* Creates a new fixed point type and returns a new gdb.Type associated
+ * with it. */
+
+PyObject *
+gdbpy_init_fixed_point_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_length;
+  int unsigned_p;
+  const char* py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_length,
+					&unsigned_p, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_fixed_point_type (allocator, bit_length, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index 14e15574685..51e1202d5bd 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -291,6 +291,8 @@ extern PyTypeObject frame_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("frame_object");
 extern PyTypeObject thread_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("thread_object");
+extern PyTypeObject float_format_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("float_format");
 
 /* Ensure that breakpoint_object_type is initialized and return true.  If
    breakpoint_object_type can't be initialized then set a suitable Python
@@ -433,6 +435,26 @@ gdb::unique_xmalloc_ptr<char> gdbpy_parse_command_name
 PyObject *gdbpy_register_tui_window (PyObject *self, PyObject *args,
 				     PyObject *kw);
 
+PyObject *gdbpy_init_type (PyObject *self, PyObject *args, PyObject *kw);
+PyObject *gdbpy_init_integer_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_character_type (PyObject *self, PyObject *args,
+				     PyObject *kw);
+PyObject *gdbpy_init_boolean_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_float_type (PyObject *self, PyObject *args,
+				 PyObject *kw);
+PyObject *gdbpy_init_decfloat_type (PyObject *self, PyObject *args,
+				    PyObject *kw);
+PyObject *gdbpy_can_create_complex_type (PyObject *self, PyObject *args,
+					 PyObject *kw);
+PyObject *gdbpy_init_complex_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_pointer_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_fixed_point_type (PyObject *self, PyObject *args,
+				       PyObject *kw);
+
 PyObject *symtab_and_line_to_sal_object (struct symtab_and_line sal);
 PyObject *symtab_to_symtab_object (struct symtab *symtab);
 PyObject *symbol_to_symbol_object (struct symbol *sym);
@@ -504,6 +526,18 @@ extern void serialize_mi_results (PyObject *results);
 extern PyObject *gdbpy_notify_mi (PyObject *self, PyObject *args,
 				  PyObject *kw);
 
+/* Retrieves a pointer to the underlying float format structure. Expects an
+ * instance of gdb.Objfile for SELF. If SELF is of an incompatible type,
+ * returns nullptr and raises a Python exception. */
+
+extern struct objfile *objfile_object_to_objfile (PyObject *self);
+
+/* Retrieves a pointer to the underlying float format structure. Expects an
+ * instance of gdb.FloatFormat for SELF. If SELF is of an incompatible type,
+ * returns nullptr and raises a Python exception. */
+
+extern struct floatformat *float_format_object_as_float_format (PyObject *self);
+
 /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
    gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
    valid (see gdb.Progspace.is_valid), otherwise return the program_space
diff --git a/gdb/python/python.c b/gdb/python/python.c
index 2ca3c50afd4..1ccba1ca519 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -2626,6 +2626,56 @@ Return current recording object." },
     "stop_recording () -> None.\n\
 Stop current recording." },
 
+  /* Type initialization functions. */
+  { "init_type", (PyCFunction) gdbpy_init_type, METH_VARARGS | METH_KEYWORDS,
+    "init_type (objfile, type_code, bit_length, name) -> type\n\
+    Creates a new type with the given bit length and type code, owned\
+    by the given objfile." },
+  { "init_integer_type", (PyCFunction) gdbpy_init_integer_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_integer_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new integer type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_character_type", (PyCFunction) gdbpy_init_character_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_character_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new character type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_boolean_type", (PyCFunction) gdbpy_init_boolean_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_boolean_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new boolean type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_float_type", (PyCFunction) gdbpy_init_float_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_float_type (objfile, float_format, name) -> type\n\
+    Creates a new floating point type with the given bit length and \
+    format, owned by the given objfile." },
+  { "init_decfloat_type", (PyCFunction) gdbpy_init_decfloat_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_decfloat_type (objfile, bit_length, name) -> type\n\
+    Creates a new decimal float type with the given bit length,\
+    owned by the given objfile." },
+  { "can_create_complex_type", (PyCFunction) gdbpy_can_create_complex_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "can_create_complex_type (type) -> bool\n\
+     Returns whether a given type can form a new complex type." },
+  { "init_complex_type", (PyCFunction) gdbpy_init_complex_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_complex_type (base_type, name) -> type\n\
+    Creates a new complex type whose components belong to the\
+    given type, owned by the given objfile." },
+  { "init_pointer_type", (PyCFunction) gdbpy_init_pointer_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_pointer_type (objfile, target_type, bit_length, name) -> type\n\
+    Creates a new pointer type with the given bit length, pointing\
+    to the given target type, and owned by the given objfile." },
+  { "init_fixed_point_type", (PyCFunction) gdbpy_init_fixed_point_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_fixed_point_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new fixed point type with the given bit length and\
+    signedness, owned by the given objfile." },
+
   { "lookup_type", (PyCFunction) gdbpy_lookup_type,
     METH_VARARGS | METH_KEYWORDS,
     "lookup_type (name [, block]) -> type\n\
diff --git a/gdb/testsuite/gdb.python/py-type-init.c b/gdb/testsuite/gdb.python/py-type-init.c
new file mode 100644
index 00000000000..010e62bd248
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-type-init.c
@@ -0,0 +1,21 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2009-2023 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int main ()
+{
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.python/py-type-init.exp b/gdb/testsuite/gdb.python/py-type-init.exp
new file mode 100644
index 00000000000..8ef3c2c57af
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-type-init.exp
@@ -0,0 +1,132 @@
+# Copyright (C) 2009-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the GDB testsuite.  It tests the mechanism
+# of creating new types from within Python.
+
+load_lib gdb-python.exp
+
+standard_testfile
+
+# Build inferior to language specification.
+proc build_inferior {exefile lang} {
+  global srcdir subdir srcfile testfile hex
+
+  if { [gdb_compile "${srcdir}/${subdir}/${srcfile}" "${exefile}" executable "debug $lang"] != "" } {
+      untested "failed to compile in $lang mode"
+      return -1
+  }
+
+  return 0
+}
+
+# Restart GDB.
+proc restart_gdb {exefile} {
+  clean_restart $exefile
+
+  if {![runto_main]} {
+      return
+  }
+}
+
+# Tests the basic values of a type.
+proc test_type_basic {owner t code sizeof name} {
+  gdb_test "python print(${t}.code == ${code})" \
+    "True" "check the code for the python-constructed type (${owner}/${name})"
+  gdb_test "python print(${t}.sizeof == ${sizeof})" \
+    "True" "check the size for the python-constructed type (${owner}/${name})"
+  gdb_test "python print(${t}.name == ${name})" \
+    "True" "check the name for the python-constructed type (${owner}/${name})"
+}
+
+# Runs the tests for a given owner object.
+proc for_owner {owner} {
+  # Simple direct type creation.
+  gdb_test "python t = gdb.init_type(${owner}, gdb.TYPE_CODE_INT, 24, 'long short int')" \
+    "" "construct a new type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_INT" "3" "'long short int'"
+
+  # Integer type creation.
+  gdb_test "python t = gdb.init_integer_type(${owner}, 24, True, 'test_int_t')" \
+    "" "construct a new integer type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_INT" "3" "'test_int_t'"
+
+  # Character type creation.
+  gdb_test "python t = gdb.init_character_type(${owner}, 24, True, 'test_char_t')" \
+    "" "construct a new character type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_CHAR" "3" "'test_char_t'"
+
+  # Boolean type creation.
+  gdb_test "python t = gdb.init_boolean_type(${owner}, 24, True, 'test_bool_t')" \
+    "" "construct a new boolean type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_BOOL" "3" "'test_bool_t'"
+
+  # Float type creation.
+  gdb_test "python f = gdb.FloatFormat()" "" "create a float format object (${owner})"
+  gdb_test "python f.totalsize = 32" "" "set totalsize for the float format (${owner})"
+  gdb_test "python f.sign_start = 31" "" "set sign_start for the float format (${owner})"
+  gdb_test "python f.exp_start = 23" "" "set exp_start for the float format (${owner})"
+  gdb_test "python f.exp_len = 8" "" "set exp_len for the float format (${owner})"
+  gdb_test "python f.exp_bias = 0" "" "set exp_bias for the float format (${owner})"
+  gdb_test "python f.exp_nan = 0xff" "" "set exp_nan for the float format (${owner})"
+  gdb_test "python f.man_start = 0" "" "set man_start for the float format (${owner})"
+  gdb_test "python f.man_len = 22" "" "set man_len for the float format (${owner})"
+  gdb_test "python f.intbit = False" "" "set intbit for the float format (${owner})"
+  gdb_test "python f.name = 'test_float_fmt'" "" "set name for the float format (${owner})"
+
+  gdb_test "python ft = gdb.init_float_type(${owner}, f, 'test_float_t')" \
+    "" "construct a new float type from inside python (${owner})"
+  test_type_basic $owner "ft" "gdb.TYPE_CODE_FLT" "4" "'test_float_t'"
+
+  # Decfloat type creation.
+  gdb_test "python t = gdb.init_decfloat_type(${owner}, 24, 'test_decfloat_t')" \
+    "" "construct a new decfloat type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_DECFLOAT" "3" "'test_decfloat_t'"
+
+  # Test complex type.
+  gdb_test "python print(gdb.can_create_complex_type(ft))" "True" \
+    "check whether the float type we created can be the basis for a complex (${owner})"
+
+  gdb_test "python t = gdb.init_complex_type(ft, 'test_complex_t')" \
+    "" "construct a new complex type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_COMPLEX" "8" "'test_complex_t'"
+
+  # Create a 24-bit pointer to our floating point type.
+  gdb_test "python t = gdb.init_pointer_type(${owner}, ft, 24, 'test_pointer_t')" \
+    "" "construct a new pointer type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_PTR" "3" "'test_pointer_t'"
+}
+
+# Run the tests.
+if { [build_inferior "${binfile}" "c"] == 0 } {
+  restart_gdb "${binfile}"
+
+  # Skip all tests if Python scripting is not enabled.
+  if { ![allow_python_tests] } { continue }
+
+  # Test objfile-owned type construction
+  for_owner "gdb.objfiles()\[0\]"
+
+  # Objfile-owned fixed point type creation.
+  #
+  # Currently, these cannot be owned by architectures, so we have to
+  # test them separately.
+  gdb_test "python t = gdb.init_fixed_point_type(gdb.objfiles()\[0\], 24, True, 'test_fixed_t')" \
+    "" "construct a new fixed point type from inside python (gdb.objfile()\[0\])"
+  test_type_basic "gdb.objfile()\[0\]" "t" "gdb.TYPE_CODE_FIXED_POINT" "3" "'test_fixed_t'"
+
+  # Test arch-owned type construction
+  for_owner "gdb.inferiors()\[0\].architecture()"
+}
-- 
2.40.1


^ permalink raw reply	[relevance 1%]

* [PATCH v2] Add support for symbol addition to the Python API
  2023-07-07 23:13  3%   ` Matheus Branco Borella
@ 2024-01-13  1:36  3%     ` Matheus Branco Borella
  2024-02-06 17:50  0%       ` Tom Tromey
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella @ 2024-01-13  1:36 UTC (permalink / raw)
  To: gdb-patches; +Cc: aburgess, Matheus Branco Borella

I had to walk away from this for a while. I'm pinging it now and I've updated
the code so that it works on master.

This patch adds support for symbol creation and registration. It currently
supports adding type symbols (VAR_DOMAIN/LOC_TYPEDEF), static symbols
(VAR_DOMAIN/LOC_STATIC) and goto target labels (LABEL_DOMAIN/LOC_LABEL). It
adds a new `gdb.ObjfileBuilder` type, with `add_type_symbol`,
`add_static_symbol` and `add_label_symbol` functions, allowing for the addition
of the aforementioned types of symbols.

Symbol addition is achieved by constructing a new objfile with msyms and full
symbols reflecting the symbols that were previously added to the builder through
its methods. This approach lets us get most of the way to full symbol addition
support, but due to not being backed up by BFD, it does have a few limitations,
which I will go over them here.

PC-based minsym lookup does not work, because those would require a more
complete set of BFD structures than I think would be good practice to pretend to
have them all and crash GDB later on when it expects things to be there that
aren't.

In the same vein, PC-based function name lookup also does not work, although
there may be a way to have the feature work using overlays. However, this patch
does not make an attempt to do so

For now, though, this implementation lets us add symbols that can be used to,
for instance, query registered types through `gdb.lookup_type`, and allows
reverse engineering GDB plugins (such as Pwndbg [0] or decomp2gdb [1]) to add
symbols directly through the Python API instead of having to compile an object
file for the target architecture that they later load through the add-symbol-
file command. [2]

[0] https://github.com/pwndbg/pwndbg/
[1] https://github.com/mahaloz/decomp2dbg
[2] https://github.com/mahaloz/decomp2dbg/blob/055be6b2001954d00db2d683f20e9b714af75880/decomp2dbg/clients/gdb/symbol_mapper.py#L235-L243]
---
 gdb/Makefile.in                         |   1 +
 gdb/python/py-objfile-builder.c         | 633 ++++++++++++++++++++++++
 gdb/testsuite/gdb.python/py-objfile.exp |  11 +
 3 files changed, 645 insertions(+)
 create mode 100644 gdb/python/py-objfile-builder.c

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 195f3a2e2d1..4f268983847 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -418,6 +418,7 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-micmd.c \
 	python/py-newobjfileevent.c \
 	python/py-objfile.c \
+	python/py-objfile-builder.c \
 	python/py-param.c \
 	python/py-prettyprint.c \
 	python/py-progspace.c \
diff --git a/gdb/python/py-objfile-builder.c b/gdb/python/py-objfile-builder.c
new file mode 100644
index 00000000000..57fa7d71c71
--- /dev/null
+++ b/gdb/python/py-objfile-builder.c
@@ -0,0 +1,633 @@
+/* Python class allowing users to build and install objfiles.
+
+   Copyright (C) 2013-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "quick-symbol.h"
+#include "objfiles.h"
+#include "minsyms.h"
+#include "buildsym.h"
+#include "observable.h"
+#include "inferior.h"
+#include <string>
+#include <unordered_map>
+#include <type_traits>
+#include <optional>
+
+/* This module relies on symbols being trivially copyable. */
+static_assert (std::is_trivially_copyable<struct symbol>::value);
+
+/* Interface to be implemented for symbol types supported by this interface. */
+class symbol_def
+{
+public:
+  virtual ~symbol_def () = default;
+
+  virtual void register_msymbol (const std::string& name,
+				 struct objfile* objfile,
+				 minimal_symbol_reader& reader) const = 0;
+  virtual void register_symbol (const std::string& name,
+				struct objfile* objfile,
+				buildsym_compunit& builder) const = 0;
+};
+
+/* Shorthand for a unique_ptr to a symbol. */
+typedef std::unique_ptr<symbol_def> symbol_def_up;
+
+/* Data being held by the gdb.ObjfileBuilder.
+ *
+ * This structure needs to have its constructor run in order for its lifetime
+ * to begin. Because of how Python handles its objects, we can't just reconstruct
+ * the object structure as a whole, as that would overwrite things the runtime
+ * cares about, so these fields had to be broken off into their own structure. */
+struct objfile_builder_data
+{
+  /* Indicates whether the objfile has already been built and added to the
+   * current context. We enforce that objfiles can't be installed twice. */
+  bool installed = false;
+
+  /* The symbols that will be added to new newly built objfile. */
+  std::unordered_map<std::string, symbol_def_up> symbols;
+
+  /* The name given to this objfile. */
+  std::string name;
+
+  /* Adds a symbol definition with the given name. */
+  bool add_symbol_def (std::string name, symbol_def_up&& symbol_def)
+  {
+    return std::get<1> (symbols.insert ({name, std::move (symbol_def)}));
+  }
+};
+
+/* Structure backing the gdb.ObjfileBuilder type. */
+
+struct objfile_builder_object
+{
+  PyObject_HEAD
+
+  /* See objfile_builder_data. */
+  objfile_builder_data inner;
+};
+
+extern PyTypeObject objfile_builder_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("objfile_builder_object_type");
+
+/* Constructs a new objfile from an objfile_builder. */
+static struct objfile *
+build_new_objfile (const objfile_builder_object& builder)
+{
+  gdb_assert (!builder.inner.installed);
+
+  auto of = objfile::make (nullptr, builder.inner.name.c_str (),
+			   OBJF_READNOW | OBJF_NOT_FILENAME,
+			   nullptr);
+
+  /* Setup object file sections. */
+  of->sections_start = OBSTACK_CALLOC (&of->objfile_obstack,
+				       4,
+				       struct obj_section);
+  of->sections_end = of->sections_start + 4;
+
+  const auto init_section = [&](struct obj_section* sec)
+    {
+      sec->objfile = of;
+      sec->ovly_mapped = false;
+
+      /* We're not being backed by BFD. So we have no real section data to speak
+       * of, but, because specifying sections requires BFD structures, we have to
+       * play a little game of predend. */
+      auto bfd = obstack_new<bfd_section> (&of->objfile_obstack);
+      bfd->vma = 0;
+      bfd->size = 0;
+      bfd->lma = 0; /* Prevents insert_section_p in objfiles.c from trying to
+		     * dereference the bfd structure we don't have. */
+      sec->the_bfd_section = bfd;
+    };
+  init_section (&of->sections_start[0]);
+  init_section (&of->sections_start[1]);
+  init_section (&of->sections_start[2]);
+  init_section (&of->sections_start[4]);
+
+  of->sect_index_text = 0;
+  of->sect_index_data = 1;
+  of->sect_index_rodata = 2;
+  of->sect_index_bss = 3;
+
+  /* While buildsym_compunit expects the symbol function pointer structure to be
+   * present, it also gracefully handles the case where all of the pointers in
+   * it are set to null. So, make sure we have a valid structure, but there's
+   * no need to do more than that. */
+  of->sf = obstack_new<struct sym_fns> (&of->objfile_obstack);
+
+  /* We need to tell GDB what architecture the objfile uses. */
+  if (has_stack_frames ())
+    of->per_bfd->gdbarch = get_frame_arch (get_selected_frame (nullptr));
+  else
+    of->per_bfd->gdbarch = current_inferior ()->arch ();
+
+  /* Construct the minimal symbols. */
+  minimal_symbol_reader msym (of);
+  for (const auto& element : builder.inner.symbols)
+      std::get<1> (element)->register_msymbol (std::get<0> (element), of, msym);
+  msym.install ();
+
+  /* Construct the full symbols. */
+  buildsym_compunit fsym (of, builder.inner.name.c_str (), "", language_c, 0);
+  for (const auto& element : builder.inner.symbols)
+    std::get<1> (element)->register_symbol (std::get<0> (element), of, fsym);
+  fsym.end_compunit_symtab (0);
+
+  /* Notify the rest of GDB this objfile has been created. Requires
+   * OBJF_NOT_FILENAME to be used, to prevent any of the functions attatched to
+   * the observable from trying to dereference of->bfd. */
+  gdb::observers::new_objfile.notify (of);
+
+  return of;
+}
+
+/* Implementation of the quick symbol functions used by the objfiles created
+ * using this interface. Turns out work here is fairly light, as we can get
+ * something that works by effectively just using no-ops, and the rest of the
+ * code will fall back to using just the minimal and full symbol data. It is
+ * important to note, though, that this only works because we're marking our
+ * objfile with `OBJF_READNOW`. */
+class runtime_objfile : public quick_symbol_functions
+{
+  virtual bool has_symbols (struct objfile*) override
+  {
+    return false;
+  }
+
+  virtual void dump (struct objfile *objfile) override
+  {
+  }
+
+  virtual bool expand_symtabs_matching
+    (struct objfile *objfile,
+     gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
+     const lookup_name_info *lookup_name,
+     gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
+     gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
+     block_search_flags search_flags,
+     domain_enum domain,
+     enum search_domain kind) override
+  {
+    return true;
+  }
+};
+
+
+/* Create a new symbol alocated in the given objfile. */
+
+static struct symbol *
+new_symbol
+  (struct objfile *objfile,
+   const char *name,
+   enum language language,
+   enum domain_enum domain,
+   enum address_class aclass,
+   short section_index)
+{
+  auto symbol = new (&objfile->objfile_obstack) struct symbol ();
+  OBJSTAT (objfile, n_syms++);
+
+  symbol->set_language (language, &objfile->objfile_obstack);
+  symbol->compute_and_set_names (std::string_view (name), true,
+				 objfile->per_bfd);
+
+  symbol->set_is_objfile_owned (true);
+  symbol->set_section_index (section_index);
+  symbol->set_domain (domain);
+  symbol->set_aclass_index (aclass);
+
+  return symbol;
+}
+
+/* Parses a language from a string (coming from Python) into a language
+ * variant. */
+
+static enum language
+parse_language (const char *language)
+{
+  if (strcmp (language, "c") == 0)
+    return language_c;
+  else if (strcmp (language, "objc") == 0)
+    return language_objc;
+  else if (strcmp (language, "cplus") == 0)
+    return language_cplus;
+  else if (strcmp (language, "d") == 0)
+    return language_d;
+  else if (strcmp (language, "go") == 0)
+    return language_go;
+  else if (strcmp (language, "fortran") == 0)
+    return language_fortran;
+  else if (strcmp (language, "m2") == 0)
+    return language_m2;
+  else if (strcmp (language, "asm") == 0)
+    return language_asm;
+  else if (strcmp (language, "pascal") == 0)
+    return language_pascal;
+  else if (strcmp (language, "opencl") == 0)
+    return language_opencl;
+  else if (strcmp (language, "rust") == 0)
+    return language_rust;
+  else if (strcmp (language, "ada") == 0)
+    return language_ada;
+  else
+    return language_unknown;
+}
+
+/* Convenience function that performs a checked coversion from a PyObject to
+ * a objfile_builder_object structure pointer. */
+inline static struct objfile_builder_object *
+validate_objfile_builder_object (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_builder_object_type))
+    return nullptr;
+  return (struct objfile_builder_object*) self;
+}
+
+/* Registers symbols added with add_label_symbol. */
+class typedef_symbol_def : public symbol_def
+{
+public:
+  struct type* type;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+				 struct objfile *objfile,
+				 minimal_symbol_reader& reader) const override
+  {
+  }
+
+  virtual void register_symbol (const std::string& name,
+				struct objfile *objfile,
+				buildsym_compunit& builder) const override
+  {
+    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
+				LOC_TYPEDEF, objfile->sect_index_text);
+
+    symbol->set_type (type);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a type (LOC_TYPEDEF) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_type_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sOs";
+  static const char *keywords[] =
+    {
+      "name", "type", "language", NULL
+    };
+
+  PyObject *type_object;
+  const char *name;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+					&type_object, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::unique_ptr<typedef_symbol_def> (new typedef_symbol_def ());
+  def->type = type;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+
+/* Registers symbols added with add_label_symbol. */
+class label_symbol_def : public symbol_def
+{
+public:
+  CORE_ADDR address;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+				 struct objfile *objfile,
+				 minimal_symbol_reader& reader) const override
+  {
+    reader.record (name.c_str (),
+		   unrelocated_addr (address),
+		   minimal_symbol_type::mst_text);
+  }
+
+  virtual void register_symbol (const std::string& name,
+				struct objfile *objfile,
+				buildsym_compunit& builder) const override
+  {
+    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
+			      LOC_LABEL, objfile->sect_index_text);
+
+    symbol->set_value_address (address);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a label (LOC_LABEL) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_label_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sks";
+  static const char *keywords[] =
+    {
+      "name", "address", "language", NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+					&address, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::unique_ptr<label_symbol_def> (new label_symbol_def ());
+  def->address = address;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+/* Registers symbols added with add_static_symbol. */
+class static_symbol_def : public symbol_def
+{
+public:
+  CORE_ADDR address;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+				 struct objfile *objfile,
+				 minimal_symbol_reader& reader) const override
+  {
+    reader.record (name.c_str (),
+		   unrelocated_addr (address),
+		   minimal_symbol_type::mst_bss);
+  }
+
+  virtual void register_symbol (const std::string& name,
+				struct objfile *objfile,
+				buildsym_compunit& builder) const override
+  {
+    auto symbol = new_symbol (objfile, name.c_str (), language, VAR_DOMAIN,
+			      LOC_STATIC, objfile->sect_index_bss);
+
+    symbol->set_value_address (address);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a static (LOC_STATIC) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_static_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sks";
+  static const char *keywords[] =
+    {
+      "name", "address", "language", NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+					&address, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::unique_ptr<static_symbol_def> (new static_symbol_def ());
+  def->address = address;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+/* Builds the object file. */
+static PyObject *
+objbdpy_build (PyObject *self, PyObject *args)
+{
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (builder->inner.installed)
+    {
+      PyErr_SetString (PyExc_ValueError, "build() cannot be run twice on the \
+		       same object");
+      return nullptr;
+    }
+  auto of = build_new_objfile (*builder);
+  builder->inner.installed = true;
+
+
+  auto objpy = objfile_to_objfile_object (of).get ();
+  Py_INCREF(objpy);
+  return objpy;
+}
+
+/* Implements the __init__() function. */
+static int
+objbdpy_init (PyObject *self0, PyObject *args, PyObject *kw)
+{
+  static const char *format = "s";
+  static const char *keywords[] =
+    {
+      "name", NULL
+    };
+
+  const char *name;
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name))
+    return -1;
+
+  auto self = (objfile_builder_object *)self0;
+  self->inner.name = name;
+  self->inner.symbols.clear ();
+
+  return 0;
+}
+
+/* The function handling construction of the ObjfileBuilder object.
+ *
+ * We need to have a custom function here as, even though Python manages the
+ * memory backing the object up, it assumes clearing the memory is enough to
+ * begin its lifetime, which is not the case here, and would lead to undefined
+ * behavior as soon as we try to use it in any meaningful way.
+ *
+ * So, what we have to do here is manually begin the lifecycle of our new object
+ * by constructing it in place, using the memory region Python just allocated
+ * for us. This ensures the object will have already started its lifetime by
+ * the time we start using it. */
+static PyObject *
+objbdpy_new (PyTypeObject *subtype, PyObject *args, PyObject *kwds)
+{
+  objfile_builder_object *region =
+    (objfile_builder_object *) subtype->tp_alloc(subtype, 1);
+  gdb_assert ((size_t)region % alignof (objfile_builder_object) == 0);
+  gdb_assert (region != nullptr);
+
+  new (&region->inner) objfile_builder_data ();
+
+  return (PyObject *)region;
+}
+
+/* The function handling destruction of the ObjfileBuilder object.
+ *
+ * While running the destructor of our object isn't _strictly_ necessary, we
+ * would very much like for the memory it owns to be freed, but, because it was
+ * constructed in place, we have to call its destructor manually here. */
+static void
+objbdpy_dealloc (PyObject *self0)
+{
+  auto self = (objfile_builder_object *)self0;
+  PyTypeObject *tp = Py_TYPE(self);
+
+  self->inner.~objfile_builder_data ();
+
+  tp->tp_free(self);
+  Py_DECREF(tp);
+}
+
+static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
+gdbpy_initialize_objfile_builder (void)
+{
+  if (PyType_Ready (&objfile_builder_object_type) < 0)
+    return -1;
+
+  return gdb_pymodule_addobject (gdb_module, "ObjfileBuilder",
+				 (PyObject *) &objfile_builder_object_type);
+}
+
+GDBPY_INITIALIZE_FILE (gdbpy_initialize_objfile_builder);
+
+static PyMethodDef objfile_builder_object_methods[] =
+{
+  { "build", (PyCFunction) objbdpy_build, METH_NOARGS,
+    "build ().\n\
+Build a new objfile containing the symbols added to builder." },
+  { "add_type_symbol", (PyCFunction) objbdpy_add_type_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_type_symbol (name [str], type [gdb.Type], language [str]).\n\
+Add a new type symbol in the given language, associated with the given type." },
+  { "add_label_symbol", (PyCFunction) objbdpy_add_label_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_label_symbol (name [str], address [int], language [str]).\n\
+Add a new label symbol in the given language, at the given address." },
+  { "add_static_symbol", (PyCFunction) objbdpy_add_static_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_static_symbol (name [str], address [int], language [str]).\n\
+Add a new static symbol in the given language, at the given address." },
+  { NULL }
+};
+
+PyTypeObject objfile_builder_object_type = {
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.ObjfileBuilder",               /* tp_name */
+  sizeof (objfile_builder_object),    /* tp_basicsize */
+  0,                                  /* tp_itemsize */
+  objbdpy_dealloc,                    /* tp_dealloc */
+  0,                                  /* tp_vectorcall_offset */
+  nullptr,                            /* tp_getattr */
+  nullptr,                            /* tp_setattr */
+  nullptr,                            /* tp_compare */
+  nullptr,                            /* tp_repr */
+  nullptr,                            /* tp_as_number */
+  nullptr,                            /* tp_as_sequence */
+  nullptr,                            /* tp_as_mapping */
+  nullptr,                            /* tp_hash  */
+  nullptr,                            /* tp_call */
+  nullptr,                            /* tp_str */
+  nullptr,                            /* tp_getattro */
+  nullptr,                            /* tp_setattro */
+  nullptr,                            /* tp_as_buffer */
+  Py_TPFLAGS_DEFAULT,                 /* tp_flags */
+  "GDB object file builder",          /* tp_doc */
+  nullptr,                            /* tp_traverse */
+  nullptr,                            /* tp_clear */
+  nullptr,                            /* tp_richcompare */
+  0,                                  /* tp_weaklistoffset */
+  nullptr,                            /* tp_iter */
+  nullptr,                            /* tp_iternext */
+  objfile_builder_object_methods,     /* tp_methods */
+  nullptr,                            /* tp_members */
+  nullptr,                            /* tp_getset */
+  nullptr,                            /* tp_base */
+  nullptr,                            /* tp_dict */
+  nullptr,                            /* tp_descr_get */
+  nullptr,                            /* tp_descr_set */
+  0,                                  /* tp_dictoffset */
+  objbdpy_init,                       /* tp_init */
+  PyType_GenericAlloc,                /* tp_alloc */
+  objbdpy_new,                        /* tp_new */
+};
+
+
diff --git a/gdb/testsuite/gdb.python/py-objfile.exp b/gdb/testsuite/gdb.python/py-objfile.exp
index 61b9942de79..ab2413e3176 100644
--- a/gdb/testsuite/gdb.python/py-objfile.exp
+++ b/gdb/testsuite/gdb.python/py-objfile.exp
@@ -173,3 +173,14 @@ gdb_py_test_silent_cmd "python objfile = gdb.objfiles()\[0\]" \
     "get first objfile" 1
 gdb_file_cmd ${binfile}
 gdb_test "python print(objfile)" "<gdb.Objfile \\\(invalid\\\)>"
+
+# Test adding a new objfile.
+gdb_py_test_silent_cmd "python builder = gdb.ObjfileBuilder(\"test_objfile\")" \
+    "Create an object file builder" 1
+gdb_test "python print(repr(builder))" "<gdb.ObjfileBuilder .*>"
+
+gdb_py_test_silent_cmd "python builder.add_static_symbol(name = \"test\", address = 0, language = \"c\")" \
+    "Add a static symbol to the object file builder" 1
+gdb_py_test_silent_cmd "python objfile = builder.build()" \
+    "Build an object from an objcect file builder" 1
+gdb_test "python print(repr(objfile.lookup_static_symbol(\"test\")))" "<gdb.Symbol .*>"
-- 
2.40.1


^ permalink raw reply	[relevance 3%]

* Re: [PATCH] Make `linux_info_proc` prefer using the LWP over the PID
  2024-01-06  2:45  6% [PATCH] Make `linux_info_proc` prefer using the LWP over the PID Matheus Branco Borella
@ 2024-01-08 15:50  7% ` Simon Marchi
  2024-01-19 16:52  7%   ` Matheus Branco Borella
  2024-01-19 16:49  6% ` [PATCH v2] " Matheus Branco Borella
  1 sibling, 1 reply; 65+ results
From: Simon Marchi @ 2024-01-08 15:50 UTC (permalink / raw)
  To: Matheus Branco Borella, gdb-patches



On 2024-01-05 21:45, Matheus Branco Borella wrote:
> Fixes: https://sourceware.org/bugzilla/show_bug.cgi?id=31207

We use `Bug:` for these.  Also, move it at the end of the commit
message, like standard git trailers
(https://git-scm.com/docs/git-interpret-trailers).

> Normally, `linux_info_proc` would use the PID to determine which subfolder in
> `/proc` to read information from. While this is usually fine, it breaks down
> after the main thread exits, at which point the information in `/proc/$pid`
> becomes become unreliable, if it is available at all. While it is the case
> that most programs terminate after their main thread exits, some may continue
> running from detached threads, in which case `info proc` will start misbehaving.
> 
> This patch addresses this by making it so that the LWP - the Lightweight Process
> ID, that, in the case of GNU/Linux is the number of the process backing up the
> thread[1] - is prefered over the PID. By doing this, `linux_info_proc` will
> always access valid procfs information, even after the main thread exits.
> 
> [1]: https://man7.org/linux/man-pages/man2/clone.2.html
> ---
>  gdb/linux-tdep.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
> index 82e8bc3db3c..2c91e298d45 100644
> --- a/gdb/linux-tdep.c
> +++ b/gdb/linux-tdep.c
> @@ -840,7 +840,14 @@ linux_info_proc (struct gdbarch *gdbarch, const char *args,
>        if (current_inferior ()->fake_pid_p)
>  	error (_("Can't determine the current process's PID: you must name one."));
>  
> -      pid = current_inferior ()->pid;
> +      /* Seeing as, when the main thread exits, the information in /proc/$pid
> +       * becomes unreliable, we should prefer using the current TID, whenever
> +       * possible. */
> +      pid = inferior_ptid.lwp ();
> +
> +      /* And fall back to the actual PID only when the TID is not available. */
> +      if (pid == 0)
> +	pid = current_inferior ()->pid;

I would suggest trying to use the any_live_thread_of_inferior function
to get a non-exited thread.   This way, if the current thread has
exited, it will find another that should be suitable for reading the
proc information.

I can imagine another case where thing would go wrong.  There might be
threads which have exited, but for which we have not processed the
"exited" event yet.  The exited state will not yet be reflected in the
thread_info structure, so we might pick it thinking it's a live thread.
But I would ignore that problem for now, what you propose is already a
good improvement over the current state.

Simon

^ permalink raw reply	[relevance 7%]

* [PATCH] Make `linux_info_proc` prefer using the LWP over the PID
@ 2024-01-06  2:45  6% Matheus Branco Borella
  2024-01-08 15:50  7% ` Simon Marchi
  2024-01-19 16:49  6% ` [PATCH v2] " Matheus Branco Borella
  0 siblings, 2 replies; 65+ results
From: Matheus Branco Borella @ 2024-01-06  2:45 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella

Fixes: https://sourceware.org/bugzilla/show_bug.cgi?id=31207

Normally, `linux_info_proc` would use the PID to determine which subfolder in
`/proc` to read information from. While this is usually fine, it breaks down
after the main thread exits, at which point the information in `/proc/$pid`
becomes become unreliable, if it is available at all. While it is the case
that most programs terminate after their main thread exits, some may continue
running from detached threads, in which case `info proc` will start misbehaving.

This patch addresses this by making it so that the LWP - the Lightweight Process
ID, that, in the case of GNU/Linux is the number of the process backing up the
thread[1] - is prefered over the PID. By doing this, `linux_info_proc` will
always access valid procfs information, even after the main thread exits.

[1]: https://man7.org/linux/man-pages/man2/clone.2.html
---
 gdb/linux-tdep.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 82e8bc3db3c..2c91e298d45 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -840,7 +840,14 @@ linux_info_proc (struct gdbarch *gdbarch, const char *args,
       if (current_inferior ()->fake_pid_p)
 	error (_("Can't determine the current process's PID: you must name one."));
 
-      pid = current_inferior ()->pid;
+      /* Seeing as, when the main thread exits, the information in /proc/$pid
+       * becomes unreliable, we should prefer using the current TID, whenever
+       * possible. */
+      pid = inferior_ptid.lwp ();
+
+      /* And fall back to the actual PID only when the TID is not available. */
+      if (pid == 0)
+	pid = current_inferior ()->pid;
     }
 
   args = skip_spaces (args);
-- 
2.40.1


^ permalink raw reply	[relevance 6%]

* Re: [PATCH 1/2] [gdb/symtab] Add name_of_main and language_of_main to the DWARF index
  2023-10-10 19:19  6%   ` Tom Tromey
@ 2023-10-11 15:37  0%     ` Tom de Vries
  0 siblings, 0 replies; 65+ results
From: Tom de Vries @ 2023-10-11 15:37 UTC (permalink / raw)
  To: Tom Tromey, Tom de Vries via Gdb-patches

On 10/10/23 21:19, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Tom> From: Matheus Branco Borella <dark.ryu.550@gmail.com>
> Tom> This patch adds a new section to the DWARF index containing the name
> Tom> and the language of the main function symbol, gathered from
> Tom> `cooked_index::get_main`, if available.
> 
> This patch has a bunch of formatting nits.  I think it's also a little
> incorrect in its handling of unknown languages / its understanding of
> its own idea of how the "0" case is handled.
> 
> Tom> +@item The shortcut table
> Tom> +This is a data structure with the following fields:
> Tom> +
> Tom> +@table @asis
> Tom> +@item Language of main
> Tom> +An @code{offset_type} value indicating the language of the main function as a
> Tom> +@code{DW_LANG_} constant.  This value will be zero if main function information
> Tom> +is not present.
> Tom> +
> Tom> +@item Name of main
> Tom> +An @code{offset_type} value indicating the offset of the main function's name
> Tom> +in the constant pool.  This value must be ignored if the value for the language
> Tom> +of main is zero.
> 
> This phrasing is a little strange.  The index-writing code seems to omit
> the name field if the language is unknown.  However, the text here makes
> it sound like the field is present but must be ignored.
> 
> If the writing code is correct, I would suggest changing this to say
> "This field is not present if..."
> 
> Tom> +  dwarf_source_language dw_lang = (dwarf_source_language)0;
> 
> gdb style puts a space after the ")" of a cast.
> There's a lot of cases like this.
> 

Done.

> Tom> +  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
> Tom> +  shortcuts.append_offset (main_name_offset);
> 
> The first ljne using append_uint seems confusing... it made me wonder
> why it isn't using offset_type.  But in the end I don't think there's
> any reason.  "offset_type" is used for all kinds of fields in the index,
> it's better to just use that here.  It is 4 bytes anyway.
> 

Done.

> Tom> +/* Sets the name and language of the main function from the shortcut table. */
> Tom> +
> Tom> +static void
> Tom> +set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
> Tom> +			      mapped_gdb_index *index)
> Tom> +{
> Tom> +  const auto expected_size = 4 + sizeof (offset_type);
> 
> Better to use 2 * sizeof (offset_type) IMO.
> 

Done.

> Tom> +  if (index->shortcut_table.size () < expected_size)
> Tom> +    /* The data in the section is not present, is corrupted or is in a version
> Tom> +     * we don't know about. Regardless, we can't make use of it. */
> Tom> +    return;
> 
> The leading "*" on comments is not gdb style.  Several cases of this.
> 

Done.

> This is where the reader seems to expect that both members are always
> written.
> 
> Tom> +
> Tom> +  auto ptr = index->shortcut_table.data ();
> Tom> +  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
> Tom> +  if (dw_lang >= DW_LANG_hi_user)
> Tom> +    {
> Tom> +      complaint (_(".gdb_index shortcut table has invalid main language %u"),
> Tom> +		   (unsigned) dw_lang);
> Tom> +      return;
> Tom> +    }
> 
> IMO it would be better for this check to happen in
> dwarf_lang_to_enum_language.  Suppose gdb adds support for a new language.
> That version of gdb may still emit index version 9, but the index
> won't be directly usable by an older gdb.
> 

AFAIU the check is the usual: is the value we've just read in the valid 
range.  It makes no assumption about which language gdb supports.  So 
I'm not sure this needs to be moved.

> Tom> +  const auto name = (const char*) (index->constant_pool.data () + name_offset);
> 
> Space before "*".  There were some with "&" as well but I didn't
> remember to point them out before deleting the patch text.
> 

Done.  Submitted here ( 
https://sourceware.org/pipermail/gdb-patches/2023-October/203161.html ).

Thanks,
- Tom


^ permalink raw reply	[relevance 0%]

* Re: [PATCH 1/2] [gdb/symtab] Add name_of_main and language_of_main to the DWARF index
  2023-10-06 18:31  4% ` [PATCH 1/2] [gdb/symtab] " Tom de Vries
@ 2023-10-10 19:19  6%   ` Tom Tromey
  2023-10-11 15:37  0%     ` Tom de Vries
  0 siblings, 1 reply; 65+ results
From: Tom Tromey @ 2023-10-10 19:19 UTC (permalink / raw)
  To: Tom de Vries via Gdb-patches; +Cc: Tom de Vries

>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> From: Matheus Branco Borella <dark.ryu.550@gmail.com>
Tom> This patch adds a new section to the DWARF index containing the name
Tom> and the language of the main function symbol, gathered from
Tom> `cooked_index::get_main`, if available.

This patch has a bunch of formatting nits.  I think it's also a little
incorrect in its handling of unknown languages / its understanding of
its own idea of how the "0" case is handled.

Tom> +@item The shortcut table
Tom> +This is a data structure with the following fields:
Tom> +
Tom> +@table @asis
Tom> +@item Language of main
Tom> +An @code{offset_type} value indicating the language of the main function as a
Tom> +@code{DW_LANG_} constant.  This value will be zero if main function information
Tom> +is not present.
Tom> +
Tom> +@item Name of main
Tom> +An @code{offset_type} value indicating the offset of the main function's name
Tom> +in the constant pool.  This value must be ignored if the value for the language
Tom> +of main is zero.

This phrasing is a little strange.  The index-writing code seems to omit
the name field if the language is unknown.  However, the text here makes
it sound like the field is present but must be ignored.

If the writing code is correct, I would suggest changing this to say
"This field is not present if..."

Tom> +  dwarf_source_language dw_lang = (dwarf_source_language)0;

gdb style puts a space after the ")" of a cast.
There's a lot of cases like this.

Tom> +  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
Tom> +  shortcuts.append_offset (main_name_offset);

The first ljne using append_uint seems confusing... it made me wonder
why it isn't using offset_type.  But in the end I don't think there's
any reason.  "offset_type" is used for all kinds of fields in the index,
it's better to just use that here.  It is 4 bytes anyway.

Tom> +/* Sets the name and language of the main function from the shortcut table. */
Tom> +
Tom> +static void
Tom> +set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
Tom> +			      mapped_gdb_index *index)
Tom> +{
Tom> +  const auto expected_size = 4 + sizeof (offset_type);

Better to use 2 * sizeof (offset_type) IMO.

Tom> +  if (index->shortcut_table.size () < expected_size)
Tom> +    /* The data in the section is not present, is corrupted or is in a version
Tom> +     * we don't know about. Regardless, we can't make use of it. */
Tom> +    return;

The leading "*" on comments is not gdb style.  Several cases of this.

This is where the reader seems to expect that both members are always
written.

Tom> +
Tom> +  auto ptr = index->shortcut_table.data ();
Tom> +  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
Tom> +  if (dw_lang >= DW_LANG_hi_user)
Tom> +    {
Tom> +      complaint (_(".gdb_index shortcut table has invalid main language %u"),
Tom> +		   (unsigned) dw_lang);
Tom> +      return;
Tom> +    }

IMO it would be better for this check to happen in
dwarf_lang_to_enum_language.  Suppose gdb adds support for a new language.
That version of gdb may still emit index version 9, but the index
won't be directly usable by an older gdb.

Tom> +  const auto name = (const char*) (index->constant_pool.data () + name_offset);

Space before "*".  There were some with "&" as well but I didn't
remember to point them out before deleting the patch text.

Tom

^ permalink raw reply	[relevance 6%]

* [PATCH 0/2] Add name_of_main and language_of_main to the DWARF index
@ 2023-10-06 18:31  7% Tom de Vries
  2023-10-06 18:31  4% ` [PATCH 1/2] [gdb/symtab] " Tom de Vries
  0 siblings, 1 reply; 65+ results
From: Tom de Vries @ 2023-10-06 18:31 UTC (permalink / raw)
  To: gdb-patches

This is a patch series containing:
- a gdb patch introducing a new version of the .gdb_index section,
  adding a shortcut table.
- a readelf patch adding the capability to read the new version, as
  well print the shortcuts table.

This is v4 of the gdb patch.  The v3 was submitted here (
https://sourceware.org/pipermail/gdb-patches/2023-August/201568.html ).

Changes in v4:
- fixed a few whitespace issues.
- reformulated 'A 32-bit little-endian value' to 'An @code{offset_type} value'
  in the docs.
- mentioned PR symtab/30946 in the commit log.
- updated a pre-existing unit test.

I've already approved v3, and the v4 changes are trivial.

The readelf patch is new.

Tested on x86_64-linux.

Matheus Branco Borella (1):
  [gdb/symtab] Add name_of_main and language_of_main to the DWARF index

Tom de Vries (1):
  [readelf] Handle .gdb_index section version 9

 binutils/dwarf.c            | 176 +++++++++++++++++++++++-------------
 gdb/NEWS                    |   3 +
 gdb/doc/gdb.texinfo         |  23 ++++-
 gdb/dwarf2/index-write.c    |  54 +++++++++--
 gdb/dwarf2/read-gdb-index.c |  54 ++++++++++-
 gdb/dwarf2/read.c           |  13 ++-
 gdb/dwarf2/read.h           |  12 +++
 7 files changed, 259 insertions(+), 76 deletions(-)


base-commit: 9a896be33224654760c46d3698218241d0a1f354
-- 
2.35.3


^ permalink raw reply	[relevance 7%]

* [PATCH 1/2] [gdb/symtab] Add name_of_main and language_of_main to the DWARF index
  2023-10-06 18:31  7% [PATCH 0/2] " Tom de Vries
@ 2023-10-06 18:31  4% ` Tom de Vries
  2023-10-10 19:19  6%   ` Tom Tromey
  0 siblings, 1 reply; 65+ results
From: Tom de Vries @ 2023-10-06 18:31 UTC (permalink / raw)
  To: gdb-patches

From: Matheus Branco Borella <dark.ryu.550@gmail.com>

This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table". The way this name is both saved and
applied upon an index being loaded in mirrors how it is done in
`cooked_index_functions`, more specifically, the full name of the main function
symbol is saved and `set_objfile_main_name` is used to apply it after it is
loaded.

The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.

In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.

When testing the patch with target board cc-with-gdb-index, test-case
gdb.fortran/nested-funcs-2.exp starts failing, but this is due to a
pre-existing issue, filed as PR symtab/30946.

Tested on x86_64-linux, with target board unix and cc-with-gdb-index.

PR symtab/24549
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
---
 gdb/NEWS                    |  3 +++
 gdb/doc/gdb.texinfo         | 23 ++++++++++++++--
 gdb/dwarf2/index-write.c    | 54 +++++++++++++++++++++++++++++++------
 gdb/dwarf2/read-gdb-index.c | 54 ++++++++++++++++++++++++++++++++++++-
 gdb/dwarf2/read.c           | 13 +++++++--
 gdb/dwarf2/read.h           | 12 +++++++++
 6 files changed, 146 insertions(+), 13 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index 2f6378f9c7a..20f36d278bd 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -21,6 +21,9 @@
   styling according to the spec.  See https://no-color.org/.
   Styling can be re-enabled with "set style enabled on".
 
+* GDB index now contains information about the main function. This speeds up
+  startup when it is being used for some large binaries.
+
 * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication feature string
   has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' feature
   string.
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 4932e49b758..db1a82ec838 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -49663,13 +49663,14 @@ unless otherwise noted:
 
 @enumerate
 @item
-The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
+The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
 Version 4 uses a different hashing function from versions 5 and 6.
 Version 6 includes symbols for inlined functions, whereas versions 4
 and 5 do not.  Version 7 adds attributes to the CU indices in the
 symbol table.  Version 8 specifies that symbols from DWARF type units
 (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and not the
-compilation unit (@samp{DW_TAG_comp_unit}) using the type.
+compilation unit (@samp{DW_TAG_comp_unit}) using the type.  Version 9 adds
+the name and the language of the main function to the index.
 
 @value{GDBN} will only read version 4, 5, or 6 indices
 by specifying @code{set use-deprecated-index-sections on}.
@@ -49690,6 +49691,9 @@ The offset, from the start of the file, of the address area.
 @item
 The offset, from the start of the file, of the symbol table.
 
+@item
+The offset, from the start of the file, of the shortcut table.
+
 @item
 The offset, from the start of the file, of the constant pool.
 @end enumerate
@@ -49766,6 +49770,21 @@ don't currently have a simple description of the canonicalization
 algorithm; if you intend to create new index sections, you must read
 the code.
 
+@item The shortcut table
+This is a data structure with the following fields:
+
+@table @asis
+@item Language of main
+An @code{offset_type} value indicating the language of the main function as a
+@code{DW_LANG_} constant.  This value will be zero if main function information
+is not present.
+
+@item Name of main
+An @code{offset_type} value indicating the offset of the main function's name
+in the constant pool.  This value must be ignored if the value for the language
+of main is zero.
+@end table
+
 @item
 The constant pool.  This is simply a bunch of bytes.  It is organized
 so that alignment is correct: CU vectors are stored first, followed by
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 3acff266ab3..6ea4217fb22 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1079,14 +1079,15 @@ write_gdbindex_1 (FILE *out_file,
 		  const data_buf &types_cu_list,
 		  const data_buf &addr_vec,
 		  const data_buf &symtab_vec,
-		  const data_buf &constant_pool)
+		  const data_buf &constant_pool,
+		  const data_buf &shortcuts)
 {
   data_buf contents;
-  const offset_type size_of_header = 6 * sizeof (offset_type);
+  const offset_type size_of_header = 7 * sizeof (offset_type);
   uint64_t total_len = size_of_header;
 
   /* The version number.  */
-  contents.append_offset (8);
+  contents.append_offset (9);
 
   /* The offset of the CU list from the start of the file.  */
   contents.append_offset (total_len);
@@ -1104,6 +1105,10 @@ write_gdbindex_1 (FILE *out_file,
   contents.append_offset (total_len);
   total_len += symtab_vec.size ();
 
+  /* The offset of the shortcut table from the start of the file.  */
+  contents.append_offset (total_len);
+  total_len += shortcuts.size ();
+
   /* The offset of the constant pool from the start of the file.  */
   contents.append_offset (total_len);
   total_len += constant_pool.size ();
@@ -1125,6 +1130,7 @@ write_gdbindex_1 (FILE *out_file,
   types_cu_list.file_write (out_file);
   addr_vec.file_write (out_file);
   symtab_vec.file_write (out_file);
+  shortcuts.file_write (out_file);
   constant_pool.file_write (out_file);
 
   assert_file_size (out_file, total_len);
@@ -1187,6 +1193,34 @@ write_cooked_index (cooked_index *table,
     }
 }
 
+/* Write shortcut information. */
+
+static void
+write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
+		       data_buf& cpool)
+{
+  const auto main_info = table->get_main ();
+  size_t main_name_offset = 0;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
+
+  if (main_info != nullptr)
+    {
+      dw_lang = main_info->per_cu->dw_lang;
+
+      if (dw_lang != 0)
+	{
+	  auto_obstack obstack;
+	  const auto main_name = main_info->full_name (&obstack, true);
+
+	  main_name_offset = cpool.size ();
+	  cpool.append_cstr0 (main_name);
+	}
+    }
+
+  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
+  shortcuts.append_offset (main_name_offset);
+}
+
 /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
    If OBJFILE has an associated dwz file, write contents of a .gdb_index
    section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
@@ -1263,11 +1297,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
 
   write_hash_table (&symtab, symtab_vec, constant_pool);
 
+  data_buf shortcuts;
+  write_shortcuts_table (table, shortcuts, constant_pool);
+
   write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
-		   symtab_vec, constant_pool);
+		   symtab_vec, constant_pool, shortcuts);
 
   if (dwz_out_file != NULL)
-    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
+    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
   else
     gdb_assert (dwz_cu_list.empty ());
 }
@@ -1573,8 +1610,9 @@ gdb_index ()
   pretend_data_buf addr_vec;
   pretend_data_buf symtab_vec;
   pretend_data_buf constant_pool;
+  pretend_data_buf short_cuts;
 
-  const size_t size_of_header = 6 * sizeof (offset_type);
+  const size_t size_of_header = 7 * sizeof (offset_type);
 
   /* Test that an overly large index will throw an error.  */
   symtab_vec.set_pretend_size (~(offset_type)0 - size_of_header);
@@ -1584,7 +1622,7 @@ gdb_index ()
   try
     {
       write_gdbindex_1 (nullptr, cu_list, types_cu_list, addr_vec,
-			symtab_vec, constant_pool);
+			symtab_vec, constant_pool, short_cuts);
     }
   catch (const gdb_exception_error &e)
     {
@@ -1604,7 +1642,7 @@ gdb_index ()
   try
     {
       write_gdbindex_1 (nullptr, cu_list, types_cu_list, addr_vec,
-			symtab_vec, constant_pool);
+			symtab_vec, constant_pool, short_cuts);
     }
   catch (const gdb_exception_error &e)
     {
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 9bfc5302b0e..b96eaa96e23 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
   /* A pointer to the constant pool.  */
   gdb::array_view<const gdb_byte> constant_pool;
 
+  /* The shortcut table data. */
+  gdb::array_view<const gdb_byte> shortcut_table;
+
   /* Return the index into the constant pool of the name of the IDXth
      symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -583,7 +586,7 @@ to use the section anyway."),
 
   /* Indexes with higher version than the one supported by GDB may be no
      longer backward compatible.  */
-  if (version > 8)
+  if (version > 9)
     return 0;
 
   map->version = version;
@@ -610,6 +613,16 @@ to use the section anyway."),
 						    symbol_table_end));
 
   ++i;
+
+  if (version >= 9)
+    {
+      const gdb_byte *shortcut_table = addr + metadata[i];
+      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
+      map->shortcut_table
+	= gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
+      ++i;
+    }
+
   map->constant_pool = buffer.slice (metadata[i]);
 
   if (map->constant_pool.empty () && !map->symbol_table.empty ())
@@ -758,6 +771,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
     = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
 }
 
+/* Sets the name and language of the main function from the shortcut table. */
+
+static void
+set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
+			      mapped_gdb_index *index)
+{
+  const auto expected_size = 4 + sizeof (offset_type);
+  if (index->shortcut_table.size () < expected_size)
+    /* The data in the section is not present, is corrupted or is in a version
+     * we don't know about. Regardless, we can't make use of it. */
+    return;
+
+  auto ptr = index->shortcut_table.data ();
+  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
+  if (dw_lang >= DW_LANG_hi_user)
+    {
+      complaint (_(".gdb_index shortcut table has invalid main language %u"),
+		   (unsigned) dw_lang);
+      return;
+    }
+  if (dw_lang == 0)
+    {
+      /* Don't bother if the language for the main symbol was not known or if
+       * there was no main symbol at all when the index was built. */
+      return;
+    }
+  ptr += 4;
+
+  const auto lang = dwarf_lang_to_enum_language (dw_lang);
+  const auto name_offset = extract_unsigned_integer (ptr,
+						     sizeof (offset_type),
+						     BFD_ENDIAN_LITTLE);
+  const auto name = (const char*) (index->constant_pool.data () + name_offset);
+
+  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
+}
+
 /* See read-gdb-index.h.  */
 
 int
@@ -843,6 +893,8 @@ dwarf2_read_gdb_index
 
   create_addrmap_from_gdb_index (per_objfile, map.get ());
 
+  set_main_name_from_gdb_index (per_objfile, map.get ());
+
   per_bfd->index_table = std::move (map);
   per_bfd->quick_file_names_table =
     create_quick_file_names_table (per_bfd->all_units.size ());
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 5bbc8e24cf9..d4aec19d31d 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -17796,7 +17796,9 @@ leb128_size (const gdb_byte *buf)
     }
 }
 
-static enum language
+/* Converts DWARF language names to GDB language names. */
+
+enum language
 dwarf_lang_to_enum_language (unsigned int lang)
 {
   enum language language;
@@ -21725,6 +21727,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
   /* Set the language we're debugging.  */
   attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
   enum language lang;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
   if (cu->producer != nullptr
       && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
     {
@@ -21733,18 +21736,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
 	 language detection we fall back to the DW_AT_producer
 	 string.  */
       lang = language_opencl;
+      dw_lang = DW_LANG_OpenCL;
     }
   else if (cu->producer != nullptr
 	   && strstr (cu->producer, "GNU Go ") != NULL)
     {
       /* Similar hack for Go.  */
       lang = language_go;
+      dw_lang = DW_LANG_Go;
     }
   else if (attr != nullptr)
-    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+    {
+      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+      dw_lang = (dwarf_source_language)attr->constant_value (0);
+    }
   else
     lang = pretend_language;
 
+  cu->per_cu->dw_lang = dw_lang;
   cu->language_defn = language_def (lang);
 
   switch (comp_unit_die->tag)
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 9dfc435e861..1d9432c5c11 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
      functions above.  */
   std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
 
+  /* The original DW_LANG_* value of the CU, as provided to us by
+   * DW_AT_language. It is interesting to keep this value around in cases where
+   * we can't use the values from the language enum, as the mapping to them is
+   * lossy, and, while that is usually fine, things like the index have an
+   * understandable bias towards not exposing internal GDB structures to the
+   * outside world, and so prefer to use DWARF constants in their stead. */
+  dwarf_source_language dw_lang;
+
   /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
   bool imported_symtabs_empty () const
   {
@@ -764,6 +772,10 @@ struct dwarf2_per_objfile
 		     std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
 };
 
+/* Converts DWARF language names to GDB language names. */
+
+enum language dwarf_lang_to_enum_language (unsigned int lang);
+
 /* Get the dwarf2_per_objfile associated to OBJFILE.  */
 
 dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);
-- 
2.35.3


^ permalink raw reply	[relevance 4%]

* Re: [PATCH v3] Add name_of_main and language_of_main to the DWARF index
  2023-09-26 14:07  7%         ` Tom de Vries
@ 2023-10-04 22:30  0%           ` Tom de Vries
  0 siblings, 0 replies; 65+ results
From: Tom de Vries @ 2023-10-04 22:30 UTC (permalink / raw)
  To: Matheus Branco Borella (DarkRyu550); +Cc: gdb-patches

On 9/26/23 16:07, Tom de Vries wrote:
> On 9/25/23 20:47, Matheus Branco Borella (DarkRyu550) wrote:
>> The patch should be mostly complete by this point, no? I think I've 
>> addressed all of the concerns that were raised.
>>
> 
> Yes, which is why the patch was approved (while leaving the room for you 
> to fix some nits if you were so inclined).  So I'm just wondering why it 
> hasn't been committed yet.

So, are you expecting somebody else to commit this for you?  I'd be 
happy to do this for you if that's the case.

Or are you somewhere in the process of getting write permissions, and 
would like to do this yourself once that's done?

Please let me know what the situation is.

Thanks,
- Tom

^ permalink raw reply	[relevance 0%]

* Re: [PATCH v3] Add name_of_main and language_of_main to the DWARF index
  2023-09-25 18:47  7%       ` Matheus Branco Borella (DarkRyu550)
@ 2023-09-26 14:07  7%         ` Tom de Vries
  2023-10-04 22:30  0%           ` Tom de Vries
  0 siblings, 1 reply; 65+ results
From: Tom de Vries @ 2023-09-26 14:07 UTC (permalink / raw)
  To: Matheus Branco Borella (DarkRyu550); +Cc: gdb-patches

On 9/25/23 20:47, Matheus Branco Borella (DarkRyu550) wrote:
> The patch should be mostly complete by this point, no? I think I've addressed all of the concerns that were raised.
> 

Yes, which is why the patch was approved (while leaving the room for you 
to fix some nits if you were so inclined).  So I'm just wondering why it 
hasn't been committed yet.

Thanks,
- Tom


^ permalink raw reply	[relevance 7%]

* Re: [PATCH v3] Add name_of_main and language_of_main to the DWARF index
  2023-09-13  7:09  0%     ` Tom de Vries
@ 2023-09-25 18:47  7%       ` Matheus Branco Borella (DarkRyu550)
  2023-09-26 14:07  7%         ` Tom de Vries
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella (DarkRyu550) @ 2023-09-25 18:47 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

The patch should be mostly complete by this point, no? I think I've addressed all of the concerns that were raised.

> On Sep 13, 2023, at 4:09 AM, Tom de Vries <tdevries@suse.de> wrote:
> 
> On 8/14/23 09:31, Tom de Vries via Gdb-patches wrote:
>> On 8/11/23 20:21, Matheus Branco Borella wrote:
>>> This should hopefully be the final version. Has a more descriptive entry in the
>>> NEWS file, but is otherwise the same as v2. I believe the misunderstanding with
>>> my contributor status should have also been sorted out?
>>> 
>> Hi,
>> As per Eli's last comment in this thread, that's indeed the case.
>>> ---
>>> This patch adds a new section to the DWARF index containing the name
>>> and the language of the main function symbol, gathered from
>>> `cooked_index::get_main`, if available. Currently, for lack of a better name,
>>> this section is called the "shortcut table". The way this name is both saved and
>>> applied upon an index being loaded in mirrors how it is done in
>>> `cooked_index_functions`, more specifically, the full name of the main function
>>> symbol is saved and `set_objfile_main_name` is used to apply it after it is
>>> loaded.
>>> 
>>> The main use case for this patch is in improving startup times when dealing with
>>> large binaries. Currently, when an index is used, GDB has to expand symtabs
>>> until it finds out what the language of the main function symbol is. For some
>>> large executables, this may take a considerable amount of time to complete,
>>> slowing down startup. This patch bypasses that operation by having both the name
>>> and language of the main function symbol be provided ahead of time by the index.
>>> 
>>> In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
>>> startup time down from about 34 seconds to about 1.5 seconds.
>>> 
>> I've reviewed the patch, and found a few nits.
>> There are two spots that look unintentional to me, one adding and one
>> removing an empty line.  You might want to remove those.
>> I found this bit:
>> ...
>> +      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
>> +      dw_lang = (dwarf_source_language)attr->constant_value (0);
>> ...
>> I would reformulate as:
>> ...
>> +      dw_lang = (dwarf_source_language)attr->constant_value (0);
>> +      lang = dwarf_lang_to_enum_language (dw_lang);
>> ...
>> but I don't feel strongly about that.
>> I've looked over Tom Tromey's comments earlier in the thread, and I think they all have been addresses.  Furthermore, Eli already approved the documentation part, so I'd say: approved, but wait for one week in case somebody else has other comments.
>> Approved-By: Tom de Vries <tdevries@suse.de>
> 
> Hi,
> 
> any update on this?
> 
> Thanks,
> - Tom
> 
>> Thanks,
>> - Tom
>>> PR symtab/24549
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
>>> ---
>>>   gdb/NEWS                    |  3 ++
>>>   gdb/doc/gdb.texinfo         | 23 +++++++++++++--
>>>   gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
>>>   gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
>>>   gdb/dwarf2/read.c           | 13 +++++++--
>>>   gdb/dwarf2/read.h           | 12 ++++++++
>>>   6 files changed, 143 insertions(+), 11 deletions(-)
>>> 
>>> diff --git a/gdb/NEWS b/gdb/NEWS
>>> index d97e3c15a8..ac455f39f2 100644
>>> --- a/gdb/NEWS
>>> +++ b/gdb/NEWS
>>> @@ -3,6 +3,9 @@
>>>   *** Changes since GDB 13
>>> +* GDB index now contains information about the main function. This speeds up
>>> +  startup when it is being used for some large binaries.
>>> +
>>>   * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication feature string
>>>     has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' feature
>>>     string.
>>> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
>>> index d1059e0cb7..3b2fdcd19e 100644
>>> --- a/gdb/doc/gdb.texinfo
>>> +++ b/gdb/doc/gdb.texinfo
>>> @@ -49093,13 +49093,14 @@ unless otherwise noted:
>>>   @enumerate
>>>   @item
>>> -The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
>>> +The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
>>>   Version 4 uses a different hashing function from versions 5 and 6.
>>>   Version 6 includes symbols for inlined functions, whereas versions 4
>>>   and 5 do not.  Version 7 adds attributes to the CU indices in the
>>>   symbol table.  Version 8 specifies that symbols from DWARF type units
>>>   (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and not the
>>> -compilation unit (@samp{DW_TAG_comp_unit}) using the type.
>>> +compilation unit (@samp{DW_TAG_comp_unit}) using the type.  Version 9 adds
>>> +the name and the language of the main function to the index.
>>>   @value{GDBN} will only read version 4, 5, or 6 indices
>>>   by specifying @code{set use-deprecated-index-sections on}.
>>> @@ -49120,6 +49121,9 @@ The offset, from the start of the file, of the address area.
>>>   @item
>>>   The offset, from the start of the file, of the symbol table.
>>> +@item
>>> +The offset, from the start of the file, of the shortcut table.
>>> +
>>>   @item
>>>   The offset, from the start of the file, of the constant pool.
>>>   @end enumerate
>>> @@ -49196,6 +49200,21 @@ don't currently have a simple description of the canonicalization
>>>   algorithm; if you intend to create new index sections, you must read
>>>   the code.
>>> +@item The shortcut table
>>> +This is a data structure with the following fields:
>>> +
>>> +@table @asis
>>> +@item Language of main
>>> +A 32-bit little-endian value indicating the language of the main function as a
>>> +@code{DW_LANG_} constant.  This value will be zero if main function information
>>> +is not present.
>>> +
>>> +@item Name of main
>>> +An @code{offset_type} value indicating the offset of the main function's name
>>> +in the constant pool.  This value must be ignored if the value for the language
>>> +of main is zero.
>>> +@end table
>>> +
>>>   @item
>>>   The constant pool.  This is simply a bunch of bytes.  It is organized
>>>   so that alignment is correct: CU vectors are stored first, followed by
>>> diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
>>> index 62c2cc6ac7..7117a5184b 100644
>>> --- a/gdb/dwarf2/index-write.c
>>> +++ b/gdb/dwarf2/index-write.c
>>> @@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
>>>             const data_buf &types_cu_list,
>>>             const data_buf &addr_vec,
>>>             const data_buf &symtab_vec,
>>> -          const data_buf &constant_pool)
>>> +          const data_buf &constant_pool,
>>> +          const data_buf &shortcuts)
>>>   {
>>>     data_buf contents;
>>> -  const offset_type size_of_header = 6 * sizeof (offset_type);
>>> +  const offset_type size_of_header = 7 * sizeof (offset_type);
>>>     offset_type total_len = size_of_header;
>>>     /* The version number.  */
>>> -  contents.append_offset (8);
>>> +  contents.append_offset (9);
>>>     /* The offset of the CU list from the start of the file.  */
>>>     contents.append_offset (total_len);
>>> @@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
>>>     contents.append_offset (total_len);
>>>     total_len += symtab_vec.size ();
>>> +  /* The offset of the shortcut table from the start of the file.  */
>>> +  contents.append_offset (total_len);
>>> +  total_len += shortcuts.size ();
>>> +
>>>     /* The offset of the constant pool from the start of the file.  */
>>>     contents.append_offset (total_len);
>>>     total_len += constant_pool.size ();
>>> @@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
>>>     types_cu_list.file_write (out_file);
>>>     addr_vec.file_write (out_file);
>>>     symtab_vec.file_write (out_file);
>>> +  shortcuts.file_write (out_file);
>>>     constant_pool.file_write (out_file);
>>>     assert_file_size (out_file, total_len);
>>> @@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
>>>       }
>>>   }
>>> +/* Write shortcut information. */
>>> +
>>> +static void
>>> +write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
>>> +               data_buf& cpool)
>>> +{
>>> +  const auto main_info = table->get_main ();
>>> +  size_t main_name_offset = 0;
>>> +  dwarf_source_language dw_lang = (dwarf_source_language)0;
>>> +
>>> +  if (main_info != nullptr)
>>> +    {
>>> +      dw_lang = main_info->per_cu->dw_lang;
>>> +
>>> +      if (dw_lang != 0)
>>> +    {
>>> +      auto_obstack obstack;
>>> +      const auto main_name = main_info->full_name (&obstack, true);
>>> +
>>> +      main_name_offset = cpool.size ();
>>> +      cpool.append_cstr0 (main_name);
>>> +    }
>>> +    }
>>> +
>>> +  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
>>> +  shortcuts.append_offset (main_name_offset);
>>> +}
>>> +
>>>   /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
>>>      If OBJFILE has an associated dwz file, write contents of a .gdb_index
>>>      section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
>>> @@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
>>>     write_hash_table (&symtab, symtab_vec, constant_pool);
>>> +  data_buf shortcuts;
>>> +  write_shortcuts_table (table, shortcuts, constant_pool);
>>> +
>>>     write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
>>> -           symtab_vec, constant_pool);
>>> +           symtab_vec, constant_pool, shortcuts);
>>>     if (dwz_out_file != NULL)
>>> -    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
>>> +    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
>>>     else
>>>       gdb_assert (dwz_cu_list.empty ());
>>>   }
>>> diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
>>> index 1006386cb2..f09c5ba234 100644
>>> --- a/gdb/dwarf2/read-gdb-index.c
>>> +++ b/gdb/dwarf2/read-gdb-index.c
>>> @@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
>>>     /* A pointer to the constant pool.  */
>>>     gdb::array_view<const gdb_byte> constant_pool;
>>> +  /* The shortcut table data. */
>>> +  gdb::array_view<const gdb_byte> shortcut_table;
>>> +
>>>     /* Return the index into the constant pool of the name of the IDXth
>>>        symbol in the symbol table.  */
>>>     offset_type symbol_name_index (offset_type idx) const
>>> @@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
>>>     mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
>>>                    (per_objfile->per_bfd->index_table.get ()));
>>> +
>>>     gdb_printf (".gdb_index: version %d\n", index->version);
>>>     gdb_printf ("\n");
>>>   }
>>> @@ -583,7 +587,7 @@ to use the section anyway."),
>>>     /* Indexes with higher version than the one supported by GDB may be no
>>>        longer backward compatible.  */
>>> -  if (version > 8)
>>> +  if (version > 9)
>>>       return 0;
>>>     map->version = version;
>>> @@ -608,8 +612,17 @@ to use the section anyway."),
>>>     map->symbol_table
>>>       = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
>>>                               symbol_table_end));
>>> -
>>>     ++i;
>>> +
>>> +  if (version >= 9)
>>> +    {
>>> +      const gdb_byte *shortcut_table = addr + metadata[i];
>>> +      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
>>> +      map->shortcut_table
>>> +    = gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
>>> +      ++i;
>>> +    }
>>> +
>>>     map->constant_pool = buffer.slice (metadata[i]);
>>>     if (map->constant_pool.empty () && !map->symbol_table.empty ())
>>> @@ -763,6 +776,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
>>>       = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
>>>   }
>>> +/* Sets the name and language of the main function from the shortcut table. */
>>> +
>>> +static void
>>> +set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
>>> +                  mapped_gdb_index *index)
>>> +{
>>> +  const auto expected_size = 4 + sizeof (offset_type);
>>> +  if (index->shortcut_table.size () < expected_size)
>>> +    /* The data in the section is not present, is corrupted or is in a version
>>> +     * we don't know about. Regardless, we can't make use of it. */
>>> +    return;
>>> +
>>> +  auto ptr = index->shortcut_table.data ();
>>> +  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
>>> +  if (dw_lang >= DW_LANG_hi_user)
>>> +    {
>>> +      complaint (_(".gdb_index shortcut table has invalid main language %u"),
>>> +           (unsigned) dw_lang);
>>> +      return;
>>> +    }
>>> +  if (dw_lang == 0)
>>> +    {
>>> +      /* Don't bother if the language for the main symbol was not known or if
>>> +       * there was no main symbol at all when the index was built. */
>>> +      return;
>>> +    }
>>> +  ptr += 4;
>>> +
>>> +  const auto lang = dwarf_lang_to_enum_language (dw_lang);
>>> +  const auto name_offset = extract_unsigned_integer (ptr,
>>> +                             sizeof (offset_type),
>>> +                             BFD_ENDIAN_LITTLE);
>>> +  const auto name = (const char*) (index->constant_pool.data () + name_offset);
>>> +
>>> +  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
>>> +}
>>> +
>>>   /* See read-gdb-index.h.  */
>>>   int
>>> @@ -848,6 +898,8 @@ dwarf2_read_gdb_index
>>>     create_addrmap_from_gdb_index (per_objfile, map.get ());
>>> +  set_main_name_from_gdb_index (per_objfile, map.get ());
>>> +
>>>     per_bfd->index_table = std::move (map);
>>>     per_bfd->quick_file_names_table =
>>>       create_quick_file_names_table (per_bfd->all_units.size ());
>>> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
>>> index 4828409222..89acd94c05 100644
>>> --- a/gdb/dwarf2/read.c
>>> +++ b/gdb/dwarf2/read.c
>>> @@ -17745,7 +17745,9 @@ leb128_size (const gdb_byte *buf)
>>>       }
>>>   }
>>> -static enum language
>>> +/* Converts DWARF language names to GDB language names. */
>>> +
>>> +enum language
>>>   dwarf_lang_to_enum_language (unsigned int lang)
>>>   {
>>>     enum language language;
>>> @@ -21661,6 +21663,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
>>>     /* Set the language we're debugging.  */
>>>     attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
>>>     enum language lang;
>>> +  dwarf_source_language dw_lang = (dwarf_source_language)0;
>>>     if (cu->producer != nullptr
>>>         && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
>>>       {
>>> @@ -21669,18 +21672,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
>>>        language detection we fall back to the DW_AT_producer
>>>        string.  */
>>>         lang = language_opencl;
>>> +      dw_lang = DW_LANG_OpenCL;
>>>       }
>>>     else if (cu->producer != nullptr
>>>          && strstr (cu->producer, "GNU Go ") != NULL)
>>>       {
>>>         /* Similar hack for Go.  */
>>>         lang = language_go;
>>> +      dw_lang = DW_LANG_Go;
>>>       }
>>>     else if (attr != nullptr)
>>> -    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
>>> +    {
>>> +      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
>>> +      dw_lang = (dwarf_source_language)attr->constant_value (0);
>>> +    }
>>>     else
>>>       lang = pretend_language;
>>> +  cu->per_cu->dw_lang = dw_lang;
>>>     cu->language_defn = language_def (lang);
>>>     switch (comp_unit_die->tag)
>>> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
>>> index 37023a2070..6707c400cf 100644
>>> --- a/gdb/dwarf2/read.h
>>> +++ b/gdb/dwarf2/read.h
>>> @@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
>>>        functions above.  */
>>>     std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
>>> +  /* The original DW_LANG_* value of the CU, as provided to us by
>>> +   * DW_AT_language. It is interesting to keep this value around in cases where
>>> +   * we can't use the values from the language enum, as the mapping to them is
>>> +   * lossy, and, while that is usually fine, things like the index have an
>>> +   * understandable bias towards not exposing internal GDB structures to the
>>> +   * outside world, and so prefer to use DWARF constants in their stead. */
>>> +  dwarf_source_language dw_lang;
>>> +
>>>     /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
>>>     bool imported_symtabs_empty () const
>>>     {
>>> @@ -755,6 +763,10 @@ struct dwarf2_per_objfile
>>>                std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
>>>   };
>>> +/* Converts DWARF language names to GDB language names. */
>>> +
>>> +enum language dwarf_lang_to_enum_language (unsigned int lang);
>>> +
>>>   /* Get the dwarf2_per_objfile associated to OBJFILE.  */
>>>   dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);



^ permalink raw reply	[relevance 7%]

* Re: [PATCH v3] Add name_of_main and language_of_main to the DWARF index
  2023-08-14  7:31  7%   ` Tom de Vries
@ 2023-09-13  7:09  0%     ` Tom de Vries
  2023-09-25 18:47  7%       ` Matheus Branco Borella (DarkRyu550)
  0 siblings, 1 reply; 65+ results
From: Tom de Vries @ 2023-09-13  7:09 UTC (permalink / raw)
  To: Matheus Branco Borella, gdb-patches; +Cc: tom

On 8/14/23 09:31, Tom de Vries via Gdb-patches wrote:
> On 8/11/23 20:21, Matheus Branco Borella wrote:
>> This should hopefully be the final version. Has a more descriptive 
>> entry in the
>> NEWS file, but is otherwise the same as v2. I believe the 
>> misunderstanding with
>> my contributor status should have also been sorted out?
>>
> 
> Hi,
> 
> As per Eli's last comment in this thread, that's indeed the case.
> 
>> ---
>> This patch adds a new section to the DWARF index containing the name
>> and the language of the main function symbol, gathered from
>> `cooked_index::get_main`, if available. Currently, for lack of a 
>> better name,
>> this section is called the "shortcut table". The way this name is both 
>> saved and
>> applied upon an index being loaded in mirrors how it is done in
>> `cooked_index_functions`, more specifically, the full name of the main 
>> function
>> symbol is saved and `set_objfile_main_name` is used to apply it after 
>> it is
>> loaded.
>>
>> The main use case for this patch is in improving startup times when 
>> dealing with
>> large binaries. Currently, when an index is used, GDB has to expand 
>> symtabs
>> until it finds out what the language of the main function symbol is. 
>> For some
>> large executables, this may take a considerable amount of time to 
>> complete,
>> slowing down startup. This patch bypasses that operation by having 
>> both the name
>> and language of the main function symbol be provided ahead of time by 
>> the index.
>>
>> In my testing (a binary with about 1.8GB worth of DWARF data) this 
>> change brings
>> startup time down from about 34 seconds to about 1.5 seconds.
>>
> 
> I've reviewed the patch, and found a few nits.
> 
> There are two spots that look unintentional to me, one adding and one
> removing an empty line.  You might want to remove those.
> 
> I found this bit:
> ...
> +      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
> +      dw_lang = (dwarf_source_language)attr->constant_value (0);
> ...
> 
> I would reformulate as:
> ...
> +      dw_lang = (dwarf_source_language)attr->constant_value (0);
> +      lang = dwarf_lang_to_enum_language (dw_lang);
> ...
> but I don't feel strongly about that.
> 
> I've looked over Tom Tromey's comments earlier in the thread, and I 
> think they all have been addresses.  Furthermore, Eli already approved 
> the documentation part, so I'd say: approved, but wait for one week in 
> case somebody else has other comments.
> 
> Approved-By: Tom de Vries <tdevries@suse.de>
> 

Hi,

any update on this?

Thanks,
- Tom

> Thanks,
> - Tom
> 
>> PR symtab/24549
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
>> ---
>>   gdb/NEWS                    |  3 ++
>>   gdb/doc/gdb.texinfo         | 23 +++++++++++++--
>>   gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
>>   gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
>>   gdb/dwarf2/read.c           | 13 +++++++--
>>   gdb/dwarf2/read.h           | 12 ++++++++
>>   6 files changed, 143 insertions(+), 11 deletions(-)
>>
>> diff --git a/gdb/NEWS b/gdb/NEWS
>> index d97e3c15a8..ac455f39f2 100644
>> --- a/gdb/NEWS
>> +++ b/gdb/NEWS
>> @@ -3,6 +3,9 @@
>>   *** Changes since GDB 13
>> +* GDB index now contains information about the main function. This 
>> speeds up
>> +  startup when it is being used for some large binaries.
>> +
>>   * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication 
>> feature string
>>     has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' 
>> feature
>>     string.
>> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
>> index d1059e0cb7..3b2fdcd19e 100644
>> --- a/gdb/doc/gdb.texinfo
>> +++ b/gdb/doc/gdb.texinfo
>> @@ -49093,13 +49093,14 @@ unless otherwise noted:
>>   @enumerate
>>   @item
>> -The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
>> +The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
>>   Version 4 uses a different hashing function from versions 5 and 6.
>>   Version 6 includes symbols for inlined functions, whereas versions 4
>>   and 5 do not.  Version 7 adds attributes to the CU indices in the
>>   symbol table.  Version 8 specifies that symbols from DWARF type units
>>   (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and 
>> not the
>> -compilation unit (@samp{DW_TAG_comp_unit}) using the type.
>> +compilation unit (@samp{DW_TAG_comp_unit}) using the type.  Version 9 
>> adds
>> +the name and the language of the main function to the index.
>>   @value{GDBN} will only read version 4, 5, or 6 indices
>>   by specifying @code{set use-deprecated-index-sections on}.
>> @@ -49120,6 +49121,9 @@ The offset, from the start of the file, of the 
>> address area.
>>   @item
>>   The offset, from the start of the file, of the symbol table.
>> +@item
>> +The offset, from the start of the file, of the shortcut table.
>> +
>>   @item
>>   The offset, from the start of the file, of the constant pool.
>>   @end enumerate
>> @@ -49196,6 +49200,21 @@ don't currently have a simple description of 
>> the canonicalization
>>   algorithm; if you intend to create new index sections, you must read
>>   the code.
>> +@item The shortcut table
>> +This is a data structure with the following fields:
>> +
>> +@table @asis
>> +@item Language of main
>> +A 32-bit little-endian value indicating the language of the main 
>> function as a
>> +@code{DW_LANG_} constant.  This value will be zero if main function 
>> information
>> +is not present.
>> +
>> +@item Name of main
>> +An @code{offset_type} value indicating the offset of the main 
>> function's name
>> +in the constant pool.  This value must be ignored if the value for 
>> the language
>> +of main is zero.
>> +@end table
>> +
>>   @item
>>   The constant pool.  This is simply a bunch of bytes.  It is organized
>>   so that alignment is correct: CU vectors are stored first, followed by
>> diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
>> index 62c2cc6ac7..7117a5184b 100644
>> --- a/gdb/dwarf2/index-write.c
>> +++ b/gdb/dwarf2/index-write.c
>> @@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
>>             const data_buf &types_cu_list,
>>             const data_buf &addr_vec,
>>             const data_buf &symtab_vec,
>> -          const data_buf &constant_pool)
>> +          const data_buf &constant_pool,
>> +          const data_buf &shortcuts)
>>   {
>>     data_buf contents;
>> -  const offset_type size_of_header = 6 * sizeof (offset_type);
>> +  const offset_type size_of_header = 7 * sizeof (offset_type);
>>     offset_type total_len = size_of_header;
>>     /* The version number.  */
>> -  contents.append_offset (8);
>> +  contents.append_offset (9);
>>     /* The offset of the CU list from the start of the file.  */
>>     contents.append_offset (total_len);
>> @@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
>>     contents.append_offset (total_len);
>>     total_len += symtab_vec.size ();
>> +  /* The offset of the shortcut table from the start of the file.  */
>> +  contents.append_offset (total_len);
>> +  total_len += shortcuts.size ();
>> +
>>     /* The offset of the constant pool from the start of the file.  */
>>     contents.append_offset (total_len);
>>     total_len += constant_pool.size ();
>> @@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
>>     types_cu_list.file_write (out_file);
>>     addr_vec.file_write (out_file);
>>     symtab_vec.file_write (out_file);
>> +  shortcuts.file_write (out_file);
>>     constant_pool.file_write (out_file);
>>     assert_file_size (out_file, total_len);
>> @@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
>>       }
>>   }
>> +/* Write shortcut information. */
>> +
>> +static void
>> +write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
>> +               data_buf& cpool)
>> +{
>> +  const auto main_info = table->get_main ();
>> +  size_t main_name_offset = 0;
>> +  dwarf_source_language dw_lang = (dwarf_source_language)0;
>> +
>> +  if (main_info != nullptr)
>> +    {
>> +      dw_lang = main_info->per_cu->dw_lang;
>> +
>> +      if (dw_lang != 0)
>> +    {
>> +      auto_obstack obstack;
>> +      const auto main_name = main_info->full_name (&obstack, true);
>> +
>> +      main_name_offset = cpool.size ();
>> +      cpool.append_cstr0 (main_name);
>> +    }
>> +    }
>> +
>> +  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
>> +  shortcuts.append_offset (main_name_offset);
>> +}
>> +
>>   /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
>>      If OBJFILE has an associated dwz file, write contents of a 
>> .gdb_index
>>      section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not 
>> have an
>> @@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, 
>> cooked_index *table,
>>     write_hash_table (&symtab, symtab_vec, constant_pool);
>> +  data_buf shortcuts;
>> +  write_shortcuts_table (table, shortcuts, constant_pool);
>> +
>>     write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
>> -           symtab_vec, constant_pool);
>> +           symtab_vec, constant_pool, shortcuts);
>>     if (dwz_out_file != NULL)
>> -    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
>> +    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
>>     else
>>       gdb_assert (dwz_cu_list.empty ());
>>   }
>> diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
>> index 1006386cb2..f09c5ba234 100644
>> --- a/gdb/dwarf2/read-gdb-index.c
>> +++ b/gdb/dwarf2/read-gdb-index.c
>> @@ -88,6 +88,9 @@ struct mapped_gdb_index final : public 
>> mapped_index_base
>>     /* A pointer to the constant pool.  */
>>     gdb::array_view<const gdb_byte> constant_pool;
>> +  /* The shortcut table data. */
>> +  gdb::array_view<const gdb_byte> shortcut_table;
>> +
>>     /* Return the index into the constant pool of the name of the IDXth
>>        symbol in the symbol table.  */
>>     offset_type symbol_name_index (offset_type idx) const
>> @@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
>>     mapped_gdb_index *index = 
>> (gdb::checked_static_cast<mapped_gdb_index *>
>>                    (per_objfile->per_bfd->index_table.get ()));
>> +
>>     gdb_printf (".gdb_index: version %d\n", index->version);
>>     gdb_printf ("\n");
>>   }
>> @@ -583,7 +587,7 @@ to use the section anyway."),
>>     /* Indexes with higher version than the one supported by GDB may 
>> be no
>>        longer backward compatible.  */
>> -  if (version > 8)
>> +  if (version > 9)
>>       return 0;
>>     map->version = version;
>> @@ -608,8 +612,17 @@ to use the section anyway."),
>>     map->symbol_table
>>       = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
>>                               symbol_table_end));
>> -
>>     ++i;
>> +
>> +  if (version >= 9)
>> +    {
>> +      const gdb_byte *shortcut_table = addr + metadata[i];
>> +      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
>> +      map->shortcut_table
>> +    = gdb::array_view<const gdb_byte> (shortcut_table, 
>> shortcut_table_end);
>> +      ++i;
>> +    }
>> +
>>     map->constant_pool = buffer.slice (metadata[i]);
>>     if (map->constant_pool.empty () && !map->symbol_table.empty ())
>> @@ -763,6 +776,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile 
>> *per_objfile,
>>       = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, 
>> &mutable_map);
>>   }
>> +/* Sets the name and language of the main function from the shortcut 
>> table. */
>> +
>> +static void
>> +set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
>> +                  mapped_gdb_index *index)
>> +{
>> +  const auto expected_size = 4 + sizeof (offset_type);
>> +  if (index->shortcut_table.size () < expected_size)
>> +    /* The data in the section is not present, is corrupted or is in 
>> a version
>> +     * we don't know about. Regardless, we can't make use of it. */
>> +    return;
>> +
>> +  auto ptr = index->shortcut_table.data ();
>> +  const auto dw_lang = extract_unsigned_integer (ptr, 4, 
>> BFD_ENDIAN_LITTLE);
>> +  if (dw_lang >= DW_LANG_hi_user)
>> +    {
>> +      complaint (_(".gdb_index shortcut table has invalid main 
>> language %u"),
>> +           (unsigned) dw_lang);
>> +      return;
>> +    }
>> +  if (dw_lang == 0)
>> +    {
>> +      /* Don't bother if the language for the main symbol was not 
>> known or if
>> +       * there was no main symbol at all when the index was built. */
>> +      return;
>> +    }
>> +  ptr += 4;
>> +
>> +  const auto lang = dwarf_lang_to_enum_language (dw_lang);
>> +  const auto name_offset = extract_unsigned_integer (ptr,
>> +                             sizeof (offset_type),
>> +                             BFD_ENDIAN_LITTLE);
>> +  const auto name = (const char*) (index->constant_pool.data () + 
>> name_offset);
>> +
>> +  set_objfile_main_name (per_objfile->objfile, name, (enum language) 
>> lang);
>> +}
>> +
>>   /* See read-gdb-index.h.  */
>>   int
>> @@ -848,6 +898,8 @@ dwarf2_read_gdb_index
>>     create_addrmap_from_gdb_index (per_objfile, map.get ());
>> +  set_main_name_from_gdb_index (per_objfile, map.get ());
>> +
>>     per_bfd->index_table = std::move (map);
>>     per_bfd->quick_file_names_table =
>>       create_quick_file_names_table (per_bfd->all_units.size ());
>> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
>> index 4828409222..89acd94c05 100644
>> --- a/gdb/dwarf2/read.c
>> +++ b/gdb/dwarf2/read.c
>> @@ -17745,7 +17745,9 @@ leb128_size (const gdb_byte *buf)
>>       }
>>   }
>> -static enum language
>> +/* Converts DWARF language names to GDB language names. */
>> +
>> +enum language
>>   dwarf_lang_to_enum_language (unsigned int lang)
>>   {
>>     enum language language;
>> @@ -21661,6 +21663,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, 
>> struct die_info *comp_unit_die,
>>     /* Set the language we're debugging.  */
>>     attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
>>     enum language lang;
>> +  dwarf_source_language dw_lang = (dwarf_source_language)0;
>>     if (cu->producer != nullptr
>>         && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
>>       {
>> @@ -21669,18 +21672,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, 
>> struct die_info *comp_unit_die,
>>        language detection we fall back to the DW_AT_producer
>>        string.  */
>>         lang = language_opencl;
>> +      dw_lang = DW_LANG_OpenCL;
>>       }
>>     else if (cu->producer != nullptr
>>          && strstr (cu->producer, "GNU Go ") != NULL)
>>       {
>>         /* Similar hack for Go.  */
>>         lang = language_go;
>> +      dw_lang = DW_LANG_Go;
>>       }
>>     else if (attr != nullptr)
>> -    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
>> +    {
>> +      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
>> +      dw_lang = (dwarf_source_language)attr->constant_value (0);
>> +    }
>>     else
>>       lang = pretend_language;
>> +  cu->per_cu->dw_lang = dw_lang;
>>     cu->language_defn = language_def (lang);
>>     switch (comp_unit_die->tag)
>> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
>> index 37023a2070..6707c400cf 100644
>> --- a/gdb/dwarf2/read.h
>> +++ b/gdb/dwarf2/read.h
>> @@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
>>        functions above.  */
>>     std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
>> +  /* The original DW_LANG_* value of the CU, as provided to us by
>> +   * DW_AT_language. It is interesting to keep this value around in 
>> cases where
>> +   * we can't use the values from the language enum, as the mapping 
>> to them is
>> +   * lossy, and, while that is usually fine, things like the index 
>> have an
>> +   * understandable bias towards not exposing internal GDB structures 
>> to the
>> +   * outside world, and so prefer to use DWARF constants in their 
>> stead. */
>> +  dwarf_source_language dw_lang;
>> +
>>     /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
>>     bool imported_symtabs_empty () const
>>     {
>> @@ -755,6 +763,10 @@ struct dwarf2_per_objfile
>>                std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
>>   };
>> +/* Converts DWARF language names to GDB language names. */
>> +
>> +enum language dwarf_lang_to_enum_language (unsigned int lang);
>> +
>>   /* Get the dwarf2_per_objfile associated to OBJFILE.  */
>>   dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);
> 


^ permalink raw reply	[relevance 0%]

* Re: [PATCH v3] Add name_of_main and language_of_main to the DWARF index
  2023-08-11 18:21  4% ` [PATCH v3] " Matheus Branco Borella
@ 2023-08-14  7:31  7%   ` Tom de Vries
  2023-09-13  7:09  0%     ` Tom de Vries
  0 siblings, 1 reply; 65+ results
From: Tom de Vries @ 2023-08-14  7:31 UTC (permalink / raw)
  To: Matheus Branco Borella, gdb-patches; +Cc: tom

On 8/11/23 20:21, Matheus Branco Borella wrote:
> This should hopefully be the final version. Has a more descriptive entry in the
> NEWS file, but is otherwise the same as v2. I believe the misunderstanding with
> my contributor status should have also been sorted out?
> 

Hi,

As per Eli's last comment in this thread, that's indeed the case.

> ---
> This patch adds a new section to the DWARF index containing the name
> and the language of the main function symbol, gathered from
> `cooked_index::get_main`, if available. Currently, for lack of a better name,
> this section is called the "shortcut table". The way this name is both saved and
> applied upon an index being loaded in mirrors how it is done in
> `cooked_index_functions`, more specifically, the full name of the main function
> symbol is saved and `set_objfile_main_name` is used to apply it after it is
> loaded.
> 
> The main use case for this patch is in improving startup times when dealing with
> large binaries. Currently, when an index is used, GDB has to expand symtabs
> until it finds out what the language of the main function symbol is. For some
> large executables, this may take a considerable amount of time to complete,
> slowing down startup. This patch bypasses that operation by having both the name
> and language of the main function symbol be provided ahead of time by the index.
> 
> In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
> startup time down from about 34 seconds to about 1.5 seconds.
> 

I've reviewed the patch, and found a few nits.

There are two spots that look unintentional to me, one adding and one
removing an empty line.  You might want to remove those.

I found this bit:
...
+      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+      dw_lang = (dwarf_source_language)attr->constant_value (0);
...

I would reformulate as:
...
+      dw_lang = (dwarf_source_language)attr->constant_value (0);
+      lang = dwarf_lang_to_enum_language (dw_lang);
...
but I don't feel strongly about that.

I've looked over Tom Tromey's comments earlier in the thread, and I 
think they all have been addresses.  Furthermore, Eli already approved 
the documentation part, so I'd say: approved, but wait for one week in 
case somebody else has other comments.

Approved-By: Tom de Vries <tdevries@suse.de>

Thanks,
- Tom

> PR symtab/24549
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
> ---
>   gdb/NEWS                    |  3 ++
>   gdb/doc/gdb.texinfo         | 23 +++++++++++++--
>   gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
>   gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
>   gdb/dwarf2/read.c           | 13 +++++++--
>   gdb/dwarf2/read.h           | 12 ++++++++
>   6 files changed, 143 insertions(+), 11 deletions(-)
> 
> diff --git a/gdb/NEWS b/gdb/NEWS
> index d97e3c15a8..ac455f39f2 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -3,6 +3,9 @@
>   
>   *** Changes since GDB 13
>   
> +* GDB index now contains information about the main function. This speeds up
> +  startup when it is being used for some large binaries.
> +
>   * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication feature string
>     has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' feature
>     string.
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index d1059e0cb7..3b2fdcd19e 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -49093,13 +49093,14 @@ unless otherwise noted:
>   
>   @enumerate
>   @item
> -The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
> +The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
>   Version 4 uses a different hashing function from versions 5 and 6.
>   Version 6 includes symbols for inlined functions, whereas versions 4
>   and 5 do not.  Version 7 adds attributes to the CU indices in the
>   symbol table.  Version 8 specifies that symbols from DWARF type units
>   (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and not the
> -compilation unit (@samp{DW_TAG_comp_unit}) using the type.
> +compilation unit (@samp{DW_TAG_comp_unit}) using the type.  Version 9 adds
> +the name and the language of the main function to the index.
>   
>   @value{GDBN} will only read version 4, 5, or 6 indices
>   by specifying @code{set use-deprecated-index-sections on}.
> @@ -49120,6 +49121,9 @@ The offset, from the start of the file, of the address area.
>   @item
>   The offset, from the start of the file, of the symbol table.
>   
> +@item
> +The offset, from the start of the file, of the shortcut table.
> +
>   @item
>   The offset, from the start of the file, of the constant pool.
>   @end enumerate
> @@ -49196,6 +49200,21 @@ don't currently have a simple description of the canonicalization
>   algorithm; if you intend to create new index sections, you must read
>   the code.
>   
> +@item The shortcut table
> +This is a data structure with the following fields:
> +
> +@table @asis
> +@item Language of main
> +A 32-bit little-endian value indicating the language of the main function as a
> +@code{DW_LANG_} constant.  This value will be zero if main function information
> +is not present.
> +
> +@item Name of main
> +An @code{offset_type} value indicating the offset of the main function's name
> +in the constant pool.  This value must be ignored if the value for the language
> +of main is zero.
> +@end table
> +
>   @item
>   The constant pool.  This is simply a bunch of bytes.  It is organized
>   so that alignment is correct: CU vectors are stored first, followed by
> diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
> index 62c2cc6ac7..7117a5184b 100644
> --- a/gdb/dwarf2/index-write.c
> +++ b/gdb/dwarf2/index-write.c
> @@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
>   		  const data_buf &types_cu_list,
>   		  const data_buf &addr_vec,
>   		  const data_buf &symtab_vec,
> -		  const data_buf &constant_pool)
> +		  const data_buf &constant_pool,
> +		  const data_buf &shortcuts)
>   {
>     data_buf contents;
> -  const offset_type size_of_header = 6 * sizeof (offset_type);
> +  const offset_type size_of_header = 7 * sizeof (offset_type);
>     offset_type total_len = size_of_header;
>   
>     /* The version number.  */
> -  contents.append_offset (8);
> +  contents.append_offset (9);
>   
>     /* The offset of the CU list from the start of the file.  */
>     contents.append_offset (total_len);
> @@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
>     contents.append_offset (total_len);
>     total_len += symtab_vec.size ();
>   
> +  /* The offset of the shortcut table from the start of the file.  */
> +  contents.append_offset (total_len);
> +  total_len += shortcuts.size ();
> +
>     /* The offset of the constant pool from the start of the file.  */
>     contents.append_offset (total_len);
>     total_len += constant_pool.size ();
> @@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
>     types_cu_list.file_write (out_file);
>     addr_vec.file_write (out_file);
>     symtab_vec.file_write (out_file);
> +  shortcuts.file_write (out_file);
>     constant_pool.file_write (out_file);
>   
>     assert_file_size (out_file, total_len);
> @@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
>       }
>   }
>   
> +/* Write shortcut information. */
> +
> +static void
> +write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
> +		       data_buf& cpool)
> +{
> +  const auto main_info = table->get_main ();
> +  size_t main_name_offset = 0;
> +  dwarf_source_language dw_lang = (dwarf_source_language)0;
> +
> +  if (main_info != nullptr)
> +    {
> +      dw_lang = main_info->per_cu->dw_lang;
> +
> +      if (dw_lang != 0)
> +	{
> +	  auto_obstack obstack;
> +	  const auto main_name = main_info->full_name (&obstack, true);
> +
> +	  main_name_offset = cpool.size ();
> +	  cpool.append_cstr0 (main_name);
> +	}
> +    }
> +
> +  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
> +  shortcuts.append_offset (main_name_offset);
> +}
> +
>   /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
>      If OBJFILE has an associated dwz file, write contents of a .gdb_index
>      section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
> @@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
>   
>     write_hash_table (&symtab, symtab_vec, constant_pool);
>   
> +  data_buf shortcuts;
> +  write_shortcuts_table (table, shortcuts, constant_pool);
> +
>     write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
> -		   symtab_vec, constant_pool);
> +		   symtab_vec, constant_pool, shortcuts);
>   
>     if (dwz_out_file != NULL)
> -    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
> +    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
>     else
>       gdb_assert (dwz_cu_list.empty ());
>   }
> diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
> index 1006386cb2..f09c5ba234 100644
> --- a/gdb/dwarf2/read-gdb-index.c
> +++ b/gdb/dwarf2/read-gdb-index.c
> @@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
>     /* A pointer to the constant pool.  */
>     gdb::array_view<const gdb_byte> constant_pool;
>   
> +  /* The shortcut table data. */
> +  gdb::array_view<const gdb_byte> shortcut_table;
> +
>     /* Return the index into the constant pool of the name of the IDXth
>        symbol in the symbol table.  */
>     offset_type symbol_name_index (offset_type idx) const
> @@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
>   
>     mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
>   			     (per_objfile->per_bfd->index_table.get ()));
> +
>     gdb_printf (".gdb_index: version %d\n", index->version);
>     gdb_printf ("\n");
>   }
> @@ -583,7 +587,7 @@ to use the section anyway."),
>   
>     /* Indexes with higher version than the one supported by GDB may be no
>        longer backward compatible.  */
> -  if (version > 8)
> +  if (version > 9)
>       return 0;
>   
>     map->version = version;
> @@ -608,8 +612,17 @@ to use the section anyway."),
>     map->symbol_table
>       = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
>   						    symbol_table_end));
> -
>     ++i;
> +
> +  if (version >= 9)
> +    {
> +      const gdb_byte *shortcut_table = addr + metadata[i];
> +      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
> +      map->shortcut_table
> +	= gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
> +      ++i;
> +    }
> +
>     map->constant_pool = buffer.slice (metadata[i]);
>   
>     if (map->constant_pool.empty () && !map->symbol_table.empty ())
> @@ -763,6 +776,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
>       = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
>   }
>   
> +/* Sets the name and language of the main function from the shortcut table. */
> +
> +static void
> +set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
> +			      mapped_gdb_index *index)
> +{
> +  const auto expected_size = 4 + sizeof (offset_type);
> +  if (index->shortcut_table.size () < expected_size)
> +    /* The data in the section is not present, is corrupted or is in a version
> +     * we don't know about. Regardless, we can't make use of it. */
> +    return;
> +
> +  auto ptr = index->shortcut_table.data ();
> +  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
> +  if (dw_lang >= DW_LANG_hi_user)
> +    {
> +      complaint (_(".gdb_index shortcut table has invalid main language %u"),
> +		   (unsigned) dw_lang);
> +      return;
> +    }
> +  if (dw_lang == 0)
> +    {
> +      /* Don't bother if the language for the main symbol was not known or if
> +       * there was no main symbol at all when the index was built. */
> +      return;
> +    }
> +  ptr += 4;
> +
> +  const auto lang = dwarf_lang_to_enum_language (dw_lang);
> +  const auto name_offset = extract_unsigned_integer (ptr,
> +						     sizeof (offset_type),
> +						     BFD_ENDIAN_LITTLE);
> +  const auto name = (const char*) (index->constant_pool.data () + name_offset);
> +
> +  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
> +}
> +
>   /* See read-gdb-index.h.  */
>   
>   int
> @@ -848,6 +898,8 @@ dwarf2_read_gdb_index
>   
>     create_addrmap_from_gdb_index (per_objfile, map.get ());
>   
> +  set_main_name_from_gdb_index (per_objfile, map.get ());
> +
>     per_bfd->index_table = std::move (map);
>     per_bfd->quick_file_names_table =
>       create_quick_file_names_table (per_bfd->all_units.size ());
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 4828409222..89acd94c05 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -17745,7 +17745,9 @@ leb128_size (const gdb_byte *buf)
>       }
>   }
>   
> -static enum language
> +/* Converts DWARF language names to GDB language names. */
> +
> +enum language
>   dwarf_lang_to_enum_language (unsigned int lang)
>   {
>     enum language language;
> @@ -21661,6 +21663,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
>     /* Set the language we're debugging.  */
>     attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
>     enum language lang;
> +  dwarf_source_language dw_lang = (dwarf_source_language)0;
>     if (cu->producer != nullptr
>         && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
>       {
> @@ -21669,18 +21672,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
>   	 language detection we fall back to the DW_AT_producer
>   	 string.  */
>         lang = language_opencl;
> +      dw_lang = DW_LANG_OpenCL;
>       }
>     else if (cu->producer != nullptr
>   	   && strstr (cu->producer, "GNU Go ") != NULL)
>       {
>         /* Similar hack for Go.  */
>         lang = language_go;
> +      dw_lang = DW_LANG_Go;
>       }
>     else if (attr != nullptr)
> -    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
> +    {
> +      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
> +      dw_lang = (dwarf_source_language)attr->constant_value (0);
> +    }
>     else
>       lang = pretend_language;
>   
> +  cu->per_cu->dw_lang = dw_lang;
>     cu->language_defn = language_def (lang);
>   
>     switch (comp_unit_die->tag)
> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
> index 37023a2070..6707c400cf 100644
> --- a/gdb/dwarf2/read.h
> +++ b/gdb/dwarf2/read.h
> @@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
>        functions above.  */
>     std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
>   
> +  /* The original DW_LANG_* value of the CU, as provided to us by
> +   * DW_AT_language. It is interesting to keep this value around in cases where
> +   * we can't use the values from the language enum, as the mapping to them is
> +   * lossy, and, while that is usually fine, things like the index have an
> +   * understandable bias towards not exposing internal GDB structures to the
> +   * outside world, and so prefer to use DWARF constants in their stead. */
> +  dwarf_source_language dw_lang;
> +
>     /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
>     bool imported_symtabs_empty () const
>     {
> @@ -755,6 +763,10 @@ struct dwarf2_per_objfile
>   		     std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
>   };
>   
> +/* Converts DWARF language names to GDB language names. */
> +
> +enum language dwarf_lang_to_enum_language (unsigned int lang);
> +
>   /* Get the dwarf2_per_objfile associated to OBJFILE.  */
>   
>   dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);


^ permalink raw reply	[relevance 7%]

* [PATCH v3] Add name_of_main and language_of_main to the DWARF index
  2023-06-08 21:40  5% [PATCH] Add name_of_main and language_of_main to the DWARF index Matheus Branco Borella
  2023-06-09 16:56  0% ` Tom Tromey
@ 2023-08-11 18:21  4% ` Matheus Branco Borella
  2023-08-14  7:31  7%   ` Tom de Vries
  1 sibling, 1 reply; 65+ results
From: Matheus Branco Borella @ 2023-08-11 18:21 UTC (permalink / raw)
  To: gdb-patches; +Cc: tdevries, tom, Matheus Branco Borella

This should hopefully be the final version. Has a more descriptive entry in the
NEWS file, but is otherwise the same as v2. I believe the misunderstanding with
my contributor status should have also been sorted out?

---
This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table". The way this name is both saved and
applied upon an index being loaded in mirrors how it is done in
`cooked_index_functions`, more specifically, the full name of the main function
symbol is saved and `set_objfile_main_name` is used to apply it after it is
loaded.

The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.

In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.

PR symtab/24549
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
---
 gdb/NEWS                    |  3 ++
 gdb/doc/gdb.texinfo         | 23 +++++++++++++--
 gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
 gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
 gdb/dwarf2/read.c           | 13 +++++++--
 gdb/dwarf2/read.h           | 12 ++++++++
 6 files changed, 143 insertions(+), 11 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index d97e3c15a8..ac455f39f2 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -3,6 +3,9 @@
 
 *** Changes since GDB 13
 
+* GDB index now contains information about the main function. This speeds up
+  startup when it is being used for some large binaries.
+
 * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication feature string
   has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' feature
   string.
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index d1059e0cb7..3b2fdcd19e 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -49093,13 +49093,14 @@ unless otherwise noted:
 
 @enumerate
 @item
-The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
+The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
 Version 4 uses a different hashing function from versions 5 and 6.
 Version 6 includes symbols for inlined functions, whereas versions 4
 and 5 do not.  Version 7 adds attributes to the CU indices in the
 symbol table.  Version 8 specifies that symbols from DWARF type units
 (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and not the
-compilation unit (@samp{DW_TAG_comp_unit}) using the type.
+compilation unit (@samp{DW_TAG_comp_unit}) using the type.  Version 9 adds
+the name and the language of the main function to the index.
 
 @value{GDBN} will only read version 4, 5, or 6 indices
 by specifying @code{set use-deprecated-index-sections on}.
@@ -49120,6 +49121,9 @@ The offset, from the start of the file, of the address area.
 @item
 The offset, from the start of the file, of the symbol table.
 
+@item
+The offset, from the start of the file, of the shortcut table.
+
 @item
 The offset, from the start of the file, of the constant pool.
 @end enumerate
@@ -49196,6 +49200,21 @@ don't currently have a simple description of the canonicalization
 algorithm; if you intend to create new index sections, you must read
 the code.
 
+@item The shortcut table
+This is a data structure with the following fields:
+
+@table @asis
+@item Language of main
+A 32-bit little-endian value indicating the language of the main function as a
+@code{DW_LANG_} constant.  This value will be zero if main function information
+is not present.
+
+@item Name of main
+An @code{offset_type} value indicating the offset of the main function's name
+in the constant pool.  This value must be ignored if the value for the language
+of main is zero.
+@end table
+
 @item
 The constant pool.  This is simply a bunch of bytes.  It is organized
 so that alignment is correct: CU vectors are stored first, followed by
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 62c2cc6ac7..7117a5184b 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
 		  const data_buf &types_cu_list,
 		  const data_buf &addr_vec,
 		  const data_buf &symtab_vec,
-		  const data_buf &constant_pool)
+		  const data_buf &constant_pool,
+		  const data_buf &shortcuts)
 {
   data_buf contents;
-  const offset_type size_of_header = 6 * sizeof (offset_type);
+  const offset_type size_of_header = 7 * sizeof (offset_type);
   offset_type total_len = size_of_header;
 
   /* The version number.  */
-  contents.append_offset (8);
+  contents.append_offset (9);
 
   /* The offset of the CU list from the start of the file.  */
   contents.append_offset (total_len);
@@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
   contents.append_offset (total_len);
   total_len += symtab_vec.size ();
 
+  /* The offset of the shortcut table from the start of the file.  */
+  contents.append_offset (total_len);
+  total_len += shortcuts.size ();
+
   /* The offset of the constant pool from the start of the file.  */
   contents.append_offset (total_len);
   total_len += constant_pool.size ();
@@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
   types_cu_list.file_write (out_file);
   addr_vec.file_write (out_file);
   symtab_vec.file_write (out_file);
+  shortcuts.file_write (out_file);
   constant_pool.file_write (out_file);
 
   assert_file_size (out_file, total_len);
@@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
     }
 }
 
+/* Write shortcut information. */
+
+static void
+write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
+		       data_buf& cpool)
+{
+  const auto main_info = table->get_main ();
+  size_t main_name_offset = 0;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
+
+  if (main_info != nullptr)
+    {
+      dw_lang = main_info->per_cu->dw_lang;
+
+      if (dw_lang != 0)
+	{
+	  auto_obstack obstack;
+	  const auto main_name = main_info->full_name (&obstack, true);
+
+	  main_name_offset = cpool.size ();
+	  cpool.append_cstr0 (main_name);
+	}
+    }
+
+  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
+  shortcuts.append_offset (main_name_offset);
+}
+
 /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
    If OBJFILE has an associated dwz file, write contents of a .gdb_index
    section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
@@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
 
   write_hash_table (&symtab, symtab_vec, constant_pool);
 
+  data_buf shortcuts;
+  write_shortcuts_table (table, shortcuts, constant_pool);
+
   write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
-		   symtab_vec, constant_pool);
+		   symtab_vec, constant_pool, shortcuts);
 
   if (dwz_out_file != NULL)
-    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
+    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
   else
     gdb_assert (dwz_cu_list.empty ());
 }
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 1006386cb2..f09c5ba234 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
   /* A pointer to the constant pool.  */
   gdb::array_view<const gdb_byte> constant_pool;
 
+  /* The shortcut table data. */
+  gdb::array_view<const gdb_byte> shortcut_table;
+
   /* Return the index into the constant pool of the name of the IDXth
      symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
 
   mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
 			     (per_objfile->per_bfd->index_table.get ()));
+
   gdb_printf (".gdb_index: version %d\n", index->version);
   gdb_printf ("\n");
 }
@@ -583,7 +587,7 @@ to use the section anyway."),
 
   /* Indexes with higher version than the one supported by GDB may be no
      longer backward compatible.  */
-  if (version > 8)
+  if (version > 9)
     return 0;
 
   map->version = version;
@@ -608,8 +612,17 @@ to use the section anyway."),
   map->symbol_table
     = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
 						    symbol_table_end));
-
   ++i;
+
+  if (version >= 9)
+    {
+      const gdb_byte *shortcut_table = addr + metadata[i];
+      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
+      map->shortcut_table
+	= gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
+      ++i;
+    }
+
   map->constant_pool = buffer.slice (metadata[i]);
 
   if (map->constant_pool.empty () && !map->symbol_table.empty ())
@@ -763,6 +776,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
     = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
 }
 
+/* Sets the name and language of the main function from the shortcut table. */
+
+static void
+set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
+			      mapped_gdb_index *index)
+{
+  const auto expected_size = 4 + sizeof (offset_type);
+  if (index->shortcut_table.size () < expected_size)
+    /* The data in the section is not present, is corrupted or is in a version
+     * we don't know about. Regardless, we can't make use of it. */
+    return;
+
+  auto ptr = index->shortcut_table.data ();
+  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
+  if (dw_lang >= DW_LANG_hi_user)
+    {
+      complaint (_(".gdb_index shortcut table has invalid main language %u"),
+		   (unsigned) dw_lang);
+      return;
+    }
+  if (dw_lang == 0)
+    {
+      /* Don't bother if the language for the main symbol was not known or if
+       * there was no main symbol at all when the index was built. */
+      return;
+    }
+  ptr += 4;
+
+  const auto lang = dwarf_lang_to_enum_language (dw_lang);
+  const auto name_offset = extract_unsigned_integer (ptr,
+						     sizeof (offset_type),
+						     BFD_ENDIAN_LITTLE);
+  const auto name = (const char*) (index->constant_pool.data () + name_offset);
+
+  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
+}
+
 /* See read-gdb-index.h.  */
 
 int
@@ -848,6 +898,8 @@ dwarf2_read_gdb_index
 
   create_addrmap_from_gdb_index (per_objfile, map.get ());
 
+  set_main_name_from_gdb_index (per_objfile, map.get ());
+
   per_bfd->index_table = std::move (map);
   per_bfd->quick_file_names_table =
     create_quick_file_names_table (per_bfd->all_units.size ());
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 4828409222..89acd94c05 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -17745,7 +17745,9 @@ leb128_size (const gdb_byte *buf)
     }
 }
 
-static enum language
+/* Converts DWARF language names to GDB language names. */
+
+enum language
 dwarf_lang_to_enum_language (unsigned int lang)
 {
   enum language language;
@@ -21661,6 +21663,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
   /* Set the language we're debugging.  */
   attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
   enum language lang;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
   if (cu->producer != nullptr
       && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
     {
@@ -21669,18 +21672,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
 	 language detection we fall back to the DW_AT_producer
 	 string.  */
       lang = language_opencl;
+      dw_lang = DW_LANG_OpenCL;
     }
   else if (cu->producer != nullptr
 	   && strstr (cu->producer, "GNU Go ") != NULL)
     {
       /* Similar hack for Go.  */
       lang = language_go;
+      dw_lang = DW_LANG_Go;
     }
   else if (attr != nullptr)
-    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+    {
+      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+      dw_lang = (dwarf_source_language)attr->constant_value (0);
+    }
   else
     lang = pretend_language;
 
+  cu->per_cu->dw_lang = dw_lang;
   cu->language_defn = language_def (lang);
 
   switch (comp_unit_die->tag)
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 37023a2070..6707c400cf 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
      functions above.  */
   std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
 
+  /* The original DW_LANG_* value of the CU, as provided to us by
+   * DW_AT_language. It is interesting to keep this value around in cases where
+   * we can't use the values from the language enum, as the mapping to them is
+   * lossy, and, while that is usually fine, things like the index have an
+   * understandable bias towards not exposing internal GDB structures to the
+   * outside world, and so prefer to use DWARF constants in their stead. */
+  dwarf_source_language dw_lang;
+
   /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
   bool imported_symtabs_empty () const
   {
@@ -755,6 +763,10 @@ struct dwarf2_per_objfile
 		     std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
 };
 
+/* Converts DWARF language names to GDB language names. */
+
+enum language dwarf_lang_to_enum_language (unsigned int lang);
+
 /* Get the dwarf2_per_objfile associated to OBJFILE.  */
 
 dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);
-- 
2.41.0


^ permalink raw reply	[relevance 4%]

* [PATCH v2] Add support for creating new types from the Python API
  2023-08-07 14:53  5%   ` Andrew Burgess
@ 2023-08-08 21:00  1%     ` Matheus Branco Borella
  2024-01-13  1:37  1%       ` [PATCH v3] " Matheus Branco Borella
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella @ 2023-08-08 21:00 UTC (permalink / raw)
  To: gdb-patches; +Cc: aburgess, disconnect3d, Matheus Branco Borella

Andrew Burgess <aburgess@redhat.com> wrote:
> I'd soften this from "leak their memory" to "remain live" -- it just
> feels like claiming there's a leak here is a little too harsh.

Fair enough. When I wrote that I was coming from the perspective that, while the
memory will get freed eventually, I'd guess that for the average user that will
only happen when GDB exists. So it would be effectively lost anyway. But, yes,
it makes more sense to describe it as remaining live.

> I have a few observations, but I think it will be easier to review once
> there are either some docs and tests that exercise all the parts, as
> I'll be able to see how everything is intended to work together without
> having to figure it out from the code.

I've added a far more complete set of tests, added docs and a news file entry.
Let me know if there's anything else you'd like me to add or something that's
there but need clarifying or changing.

Additionally, I think all of your formatting and styling concerns should have
been addressed in this new version. I might have still missed something though,
since all of my lines came out to less than or equal to 80 characters long in
the original patch, so I could be counting them differently. Just to be safe,
I've made it so that lines in this patch are always strictly less than 80 
characters long.

> Actually, looking at some of the other code, I wonder if the right thing
> here is to switch to PyObject_TypeCheck?

That makes it much more consistent, yes. I've changed to using that.

> I haven't dug into the implications of providing this structure with all
> the fields set to nullptr vs just providing nullptr for the tp_as_number
> field below.

While it wouldn't be a problem, those lists and structures being there was a
consequence of how I'd originally planned to expose things to Python through 
them, but ended up never doing so and forgetting about it. Since they were
unused, I took them out. Thanks for pointing it out.

> But I'm not sure how these pointers would be used ... maybe some tests
> will give examples of how this is different to calling
> gdb.Type.pointer() then I'll understand...

What I had in mind was representing pointers types in environments that have
different pointer types depending on the mode of execution. I'm not sure how I
could test it though, or if this is redundant for that use case.

---
This patch adds support for creating types from within the Python API. It does
so by exposing the `init_*_type` family of functions, defined in `gdbtypes.h` to
Python and having them return `gdb.Type` objects connected to the newly minted
types.

These functions are accessible in the root of the gdb module and all require
a reference to either a `gdb.Objfile` or a `gdb.Architecture`. Types created
from them will be owned by the object passed to the function.

This patch also adds an extra type - `gdb.FloatFormat` - to support creation of
floating point types by letting users control the format from within Python. It
is missing, however, a way to specify half formats and validation functions.

It is important to note that types created using this interface are not
automatically registered as a symbol, and so, types will become unreachable
unless used to create a value that otherwise references it or saved in some way.

The main drawback of using the `init_*_type` family over implementing type
initialization by hand is that any type that's created gets immediately
allocated on its owner's obstack, regardless of what its real lifetime
requirements are. The main implication of this is that types that become
unreachable will remain live for the lifetime of the owner.

Keeping track of the initialization of the type by hand would require a
deeper change to the existing type object infrastructure. A bit too ambitious
for a first patch, I'd say.

If it were to be done though, we would gain the ability to only keep in the
obstack types that are known to be referenced in some other way - by allocating
and copying the data to the obstack as other objects are created that reference
it (eg. symbols).
---
 gdb/Makefile.in                           |   2 +
 gdb/NEWS                                  |  16 +
 gdb/doc/python.texi                       | 237 ++++++++++
 gdb/python/py-float-format.c              | 307 +++++++++++++
 gdb/python/py-objfile.c                   |  17 +
 gdb/python/py-type-init.c                 | 516 ++++++++++++++++++++++
 gdb/python/python-internal.h              |  34 ++
 gdb/python/python.c                       |  50 +++
 gdb/testsuite/gdb.python/py-type-init.c   |  21 +
 gdb/testsuite/gdb.python/py-type-init.exp | 132 ++++++
 10 files changed, 1332 insertions(+)
 create mode 100644 gdb/python/py-float-format.c
 create mode 100644 gdb/python/py-type-init.c
 create mode 100644 gdb/testsuite/gdb.python/py-type-init.c
 create mode 100644 gdb/testsuite/gdb.python/py-type-init.exp

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 14b5dd0bad..108bcea69e 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -431,6 +431,8 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-threadevent.c \
 	python/py-tui.c \
 	python/py-type.c \
+	python/py-type-init.c \
+	python/py-float-format.c \
 	python/py-unwind.c \
 	python/py-utils.c \
 	python/py-value.c \
diff --git a/gdb/NEWS b/gdb/NEWS
index 6aa0d5171f..7d461ff29f 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -170,6 +170,22 @@ info main
      (program-counter) values, and can be used as the frame-id when
      calling gdb.PendingFrame.create_unwind_info.
 
+  ** Functions that allow creation of instances of gdb.Type, and a new
+     class gdb.FloatFormat that may be used to create floating point
+     types. The functions that allow new type creation are:
+      - gdb.init_type: Create a new type given a type code.
+      - gdb.init_integer_type: Create a new integer type.
+      - gdb.init_character_type: Create a new character type.
+      - gdb.init_boolean_type: Create a new boolean type.
+      - gdb.init_float_type: Create a new floating point type.
+      - gdb.init_decfloat_type: Create a new decimal floating point type.
+      - gdb.can_create_complex_type: Whether a type can be used to create a
+          new complex type.
+      - gdb.init_complex_type: Create a new complex type.
+      - gdb.init_pointer_type: Create a new pointer type.
+          * This allows creating pointers of arbitrary size.
+      - gdb.init_fixed_point_type: Create a new fixed point type.
+
 *** Changes in GDB 13
 
 * MI version 1 is deprecated, and will be removed in GDB 14.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index 1113591065..3f3a9a2220 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -1665,6 +1665,243 @@ A Fortran namelist.
 Further support for types is provided in the @code{gdb.types}
 Python module (@pxref{gdb.types}).
 
+
+
+@node Creating Types In Python
+@subsubsection Creating Types In Python
+@cindex creating types in Python
+@cindex Python, working with types
+
+@value{GDBN} makes available functionality to create new types from
+inside Python.
+
+The following type creation functions are available in the @code{gdb}
+module:
+
+@findex gdb.init_type
+@defun gdb.init_type (owner, type_code, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+type owned by the given @code{owner}, with the given type code,
+@code{name} and size.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{type_code} is one of the @code{TYPE_CODE_} constants defined in
+@xref{Types In Python}.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_integer_type
+@defun gdb.init_integer_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to an
+integer type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_character_type
+@defun gdb.init_character_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+character type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_boolean_type
+@defun gdb.init_boolean_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+boolean type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_float_type
+@defun gdb.init_float_type (owner, format, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+character type owned by the given @code{owner}, with the given
+@code{name} and @code{format}.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{format} is an reference to a @code{gdb.FloatFormat} object, as
+described below.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_decfloat_type
+@defun gdb.init_decfloat_type (owner, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+decimal floating point type owned by the given @code{owner}, with the
+given @code{name} and size.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.can_create_complex_type
+@defun gdb.can_create_complex_type (type)
+This function returns a boolean indicating whether @code{type} can be
+used to create a new complex type using the @code{gdb.init_complex_type}
+function.
+@end defun
+
+@findex gdb.init_complex_type
+@defun gdb.init_complex_type (type, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+complex type with the given @code{name} based on the given base
+@code{type}.
+
+The newly created type will be owned by the same object as the base
+type that was used to create it.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_pointer_type
+@defun gdb.init_pointer_type (owner, target, bit_size, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+pointer type that points to @code{target} and is owned by the given
+@code{owner}, with the given @code{name} and size.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{target} is a @code{gdb.Type} object, corresponding to the type
+that will be pointed to by the newly created pointer type.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+@findex gdb.init_boolean_type
+@defun gdb.init_fixed_point_type (owner, bit_size, unsigned, name)
+This function creates a new @code{gdb.Type} instance corresponding to a
+fixed point type owned by the given @code{owner}, with the given
+@code{name}, size and signedness.
+
+@code{owner} must be a reference to either a @code{gdb.Objfile} or a
+@code{gdb.Architecture} object. These correspond to objfile and
+architecture-owned types, respectively.
+
+@code{bit_size} is the size of instances of the newly created type, in
+bits. Currently, accepted values are limited to multiples of 8.
+
+@code{unsigned} is a boolean indicating whether the type corresponds to
+a signed or unsigned value.
+
+This function returns an instance of @code{gdb.Type}, and will throw an
+exception in case of an error.
+@end defun
+
+When creating a floating point type through @code{gdb.init_float_type},
+one has to use a @code{gdb.FloatFormat} object. These objects may be
+created with no arguments, and the following attributes may be used to
+defined the format of the desired floating point format:
+
+@defvar FloatFormat.totalsize
+The size of the floating point number, in bits. Currently, accepted
+values are limited to multiples of 8.
+@end defvar
+
+@defvar FloatFormat.sign_start
+The bit offset of the sign bit.
+@end defvar
+
+@defvar FloatFormt.exp_start
+The bit offset of the start of the exponent.
+@end defvar
+
+@defvar FloatFormat.exp_len
+The size of the exponent, in bits.
+@end defvar
+
+@defvar FloatFormat.exp_bias
+Bias added to the written exponent to form the biased exponent.
+@end defvar
+
+@defvar FloatFormat.exp_nan
+Exponent value which indicates NaN.
+@end defvar
+
+@defvar FloatFormat.man_start
+The bit offset of the start of the mantissa.
+@end defvar
+
+@defvar FloatFormat.man_len
+The size of the mantissa, in bits.
+@end defvar
+
+@defvar FloatFormat.intbit
+This is a boolean values that indicates whether the integer bit is part
+of the value or if it is determined implicitly. A value of true
+indicates the former, while a value of false indicates the latter.
+@end defvar
+
+@defvar FloatFormat.name
+The name of the float format. Used internally, for debugging purposes.
+@end defvar
+
+
+
 @node Pretty Printing API
 @subsubsection Pretty Printing API
 @cindex python pretty printing api
diff --git a/gdb/python/py-float-format.c b/gdb/python/py-float-format.c
new file mode 100644
index 0000000000..984b96361a
--- /dev/null
+++ b/gdb/python/py-float-format.c
@@ -0,0 +1,307 @@
+/* Accessibility of float format controls from inside the Python API
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "floatformat.h"
+
+/* Structure backing the float format Python interface. */
+
+struct float_format_object
+{
+  PyObject_HEAD
+  struct floatformat format;
+
+  struct floatformat *float_format ()
+  {
+    return &this->format;
+  }
+};
+
+/* Initializes the float format type and registers it with the Python
+ * interpreter. */
+
+static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
+gdbpy_initialize_float_format (void)
+{
+  if (PyType_Ready (&float_format_object_type) < 0)
+    return -1;
+
+  if (gdb_pymodule_addobject (gdb_module, "FloatFormat",
+			      (PyObject *) &float_format_object_type) < 0)
+    return -1;
+
+  return 0;
+}
+
+GDBPY_INITIALIZE_FILE (gdbpy_initialize_float_format);
+
+/* Creates a function that gets the value of a field of a given name from the
+ * underliying float_format structure in the Python object. */
+
+#define INSTANCE_FIELD_GETTER(getter_name, field_name, field_type, field_conv)\
+  static PyObject *							      \
+  getter_name (PyObject *self, void *closure)				      \
+  {									      \
+    float_format_object *ff = (float_format_object*) self;		      \
+    field_type value = ff->float_format ()->field_name;			      \
+    return field_conv (value);						      \
+  }
+
+/* Creates a function that sets the value of a field of a given name from the
+ * underliying float_format structure in the Python object. */
+
+#define INSTANCE_FIELD_SETTER(setter_name, field_name, field_type, field_conv)\
+  static int								      \
+  setter_name (PyObject *self, PyObject* value, void *closure)		      \
+  {									      \
+    field_type native_value;						      \
+    if (!field_conv (value, &native_value))				      \
+      return -1;							      \
+    float_format_object *ff = (float_format_object*) self;		      \
+    ff->float_format ()->field_name = native_value;			      \
+    return 0;								      \
+  }
+
+/* Converts from the intbit enum to a Python boolean. */
+
+static PyObject *
+intbit_to_py (enum floatformat_intbit intbit)
+{
+  gdb_assert (intbit == floatformat_intbit_yes
+	      || intbit == floatformat_intbit_no);
+
+  if (intbit == floatformat_intbit_no)
+    Py_RETURN_FALSE;
+  else
+    Py_RETURN_TRUE;
+}
+
+/* Converts from a Python boolean to the intbit enum. */
+
+static bool
+py_to_intbit (PyObject *object, enum floatformat_intbit *intbit)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyBool_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "intbit must be True or False");
+      return false;
+    }
+
+  *intbit = PyObject_IsTrue (object) ? floatformat_intbit_yes
+    : floatformat_intbit_no;
+
+  return true;
+}
+
+/* Converts from a Python integer to a unsigned integer. */
+
+static bool
+py_to_unsigned_int (PyObject *object, unsigned int *val)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong (object);
+  if (native_val > (long) UINT_MAX)
+    {
+      PyErr_SetString (PyExc_ValueError, "value is too large");
+      return false;
+    }
+  if (native_val < 0)
+    {
+      PyErr_SetString (PyExc_ValueError,
+		       "value must not be smaller than zero");
+      return false;
+    }
+
+  *val = (unsigned int) native_val;
+  return true;
+}
+
+/* Converts from a Python integer to a signed integer. */
+
+static bool
+py_to_int(PyObject *object, int *val)
+{
+  if(!PyObject_IsInstance(object, (PyObject*)&PyLong_Type))
+    {
+      PyErr_SetString(PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong(object);
+  if(native_val > (long)INT_MAX)
+    {
+      PyErr_SetString(PyExc_ValueError, "value is too large");
+      return false;
+    }
+
+  *val = (int)native_val;
+  return true;
+}
+
+/* Instantiate functions for all of the float format fields we'd like to be
+ * able to read and change from our Python object. These will be used later to
+ * define `getset` entries for them. */
+
+INSTANCE_FIELD_GETTER (ffpy_get_totalsize, totalsize,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_sign_start, sign_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_start, exp_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_len, exp_len,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_bias, exp_bias, int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_nan, exp_nan,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_start, man_start,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_len, man_len,
+		       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_intbit, intbit,
+		       enum floatformat_intbit, intbit_to_py)
+INSTANCE_FIELD_GETTER (ffpy_get_name, name,
+		       const char *, PyUnicode_FromString)
+
+INSTANCE_FIELD_SETTER (ffpy_set_totalsize, totalsize,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_sign_start, sign_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_start, exp_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_len, exp_len,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_bias, exp_bias, int, py_to_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_nan, exp_nan,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_start, man_start,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_len, man_len,
+		       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_intbit, intbit,
+		       enum floatformat_intbit, py_to_intbit)
+
+/* Makes sure float formats created from Python always test as valid. */
+
+static int
+ffpy_always_valid (const struct floatformat *fmt ATTRIBUTE_UNUSED,
+		   const void *from ATTRIBUTE_UNUSED)
+{
+  return 1;
+}
+
+/* Initializes new float format objects. */
+
+static int
+ffpy_init (PyObject *self,
+	   PyObject *args ATTRIBUTE_UNUSED,
+	   PyObject *kwds ATTRIBUTE_UNUSED)
+{
+  auto ff = (float_format_object*) self;
+  ff->format = floatformat ();
+  ff->float_format ()->name = "";
+  ff->float_format ()->is_valid = ffpy_always_valid;
+  return 0;
+}
+
+/* See python/python-internal.h. */
+
+struct floatformat *
+float_format_object_as_float_format (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &float_format_object_type))
+    {
+      PyErr_SetString(PyExc_TypeError, "expected gdb.FloatFormat");
+      return nullptr;
+    }
+  return ((float_format_object*) self)->float_format ();
+}
+
+static gdb_PyGetSetDef float_format_object_getset[] =
+{
+  { "totalsize", ffpy_get_totalsize, ffpy_set_totalsize,
+    "The total size of the floating point number, in bits.", nullptr },
+  { "sign_start", ffpy_get_sign_start, ffpy_set_sign_start,
+    "The bit offset of the sign bit.", nullptr },
+  { "exp_start", ffpy_get_exp_start, ffpy_set_exp_start,
+    "The bit offset of the start of the exponent.", nullptr },
+  { "exp_len", ffpy_get_exp_len, ffpy_set_exp_len,
+    "The size of the exponent, in bits.", nullptr },
+  { "exp_bias", ffpy_get_exp_bias, ffpy_set_exp_bias,
+    "Bias added to the written exponent to form the biased exponent.",
+    nullptr },
+  { "exp_nan", ffpy_get_exp_nan, ffpy_set_exp_nan,
+    "Exponent value which indicates NaN.", nullptr },
+  { "man_start", ffpy_get_man_start, ffpy_set_man_start,
+    "The bit offset of the start of the mantissa.", nullptr },
+  { "man_len", ffpy_get_man_len, ffpy_set_man_len,
+    "The size of the mantissa, in bits.", nullptr },
+  { "intbit", ffpy_get_intbit, ffpy_set_intbit,
+    "Is the integer bit explicit or implicit?", nullptr },
+  { "name", ffpy_get_name, nullptr,
+    "Internal name for debugging.", nullptr },
+  { nullptr }
+};
+
+PyTypeObject float_format_object_type =
+{
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.FloatFormat",		  /*tp_name*/
+  sizeof (float_format_object),   /*tp_basicsize*/
+  0,				  /*tp_itemsize*/
+  nullptr,			  /*tp_dealloc*/
+  0,				  /*tp_print*/
+  nullptr,			  /*tp_getattr*/
+  nullptr,			  /*tp_setattr*/
+  nullptr,			  /*tp_compare*/
+  nullptr,			  /*tp_repr*/
+  nullptr,			  /*tp_as_number*/
+  nullptr,			  /*tp_as_sequence*/
+  nullptr,			  /*tp_as_mapping*/
+  nullptr,			  /*tp_hash */
+  nullptr,			  /*tp_call*/
+  nullptr,			  /*tp_str*/
+  nullptr,			  /*tp_getattro*/
+  nullptr,			  /*tp_setattro*/
+  nullptr,			  /*tp_as_buffer*/
+  Py_TPFLAGS_DEFAULT,		  /*tp_flags*/
+  "GDB float format object",      /* tp_doc */
+  nullptr,			  /* tp_traverse */
+  nullptr,			  /* tp_clear */
+  nullptr,			  /* tp_richcompare */
+  0,				  /* tp_weaklistoffset */
+  nullptr,			  /* tp_iter */
+  nullptr,			  /* tp_iternext */
+  nullptr,			  /* tp_methods */
+  nullptr,			  /* tp_members */
+  float_format_object_getset,     /* tp_getset */
+  nullptr,			  /* tp_base */
+  nullptr,			  /* tp_dict */
+  nullptr,			  /* tp_descr_get */
+  nullptr,			  /* tp_descr_set */
+  0,				  /* tp_dictoffset */
+  ffpy_init,			  /* tp_init */
+  nullptr,			  /* tp_alloc */
+  PyType_GenericNew,		  /* tp_new */
+};
diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index ad72f3f042..f440a538c1 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -704,6 +704,23 @@ objfile_to_objfile_object (struct objfile *objfile)
   return gdbpy_ref<>::new_reference (result);
 }
 
+/* See python/python-internal.h. */
+
+struct objfile *
+objfile_object_to_objfile (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_object_type))
+    {
+      PyErr_SetString(PyExc_TypeError, "expected gdb.Objfile");
+      return nullptr;
+    }
+
+  auto objfile_object = (struct objfile_object*) self;
+  OBJFPY_REQUIRE_VALID (objfile_object);
+
+  return objfile_object->objfile;
+}
+
 static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
 gdbpy_initialize_objfile (void)
 {
diff --git a/gdb/python/py-type-init.c b/gdb/python/py-type-init.c
new file mode 100644
index 0000000000..9c26f29b2c
--- /dev/null
+++ b/gdb/python/py-type-init.c
@@ -0,0 +1,516 @@
+/* Functionality for creating new types accessible from python.
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "gdbtypes.h"
+#include "floatformat.h"
+#include "objfiles.h"
+#include "gdbsupport/gdb_obstack.h"
+
+
+/* An abstraction covering the objects types that can own a type object. */
+
+class type_storage_owner
+{
+public:
+  /* Creates a new type owner from the given python object. If the object is
+   * of a type that is not supported, the newly created instance will be
+   * marked as invalid and nothing should be done with it. */
+
+  type_storage_owner (PyObject *owner)
+  {
+    if (gdbpy_is_architecture (owner))
+      {
+	this->kind = owner_kind::arch;
+	this->owner.arch = arch_object_to_gdbarch (owner);
+	return;
+      }
+
+    this->kind = owner_kind::objfile;
+    this->owner.objfile = objfile_object_to_objfile (owner);
+    if (this->owner.objfile != nullptr)
+	return;
+
+    this->kind = owner_kind::none;
+    PyErr_SetString(PyExc_TypeError, "unsupported owner type");
+  }
+
+  /* Whether the owner is valid. An owner may not be valid if the type that
+   * was used to create it is not known. Operations must only be done on valid
+   * instances of this class. */
+
+  bool valid ()
+  {
+    return this->kind != owner_kind::none;
+  }
+
+  /* Returns a type allocator that allocates on this owner. */
+
+  type_allocator allocator ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+      return type_allocator (this->owner.arch);
+    else if (this->kind == owner_kind::objfile)
+      return type_allocator (this->owner.objfile);
+
+    /* Should never be reached, but it's better to fail in a safe way than try
+     * to instance the allocator with arbitraty parameters here. */
+    abort ();
+  }
+
+  /* Get a reference to the owner's obstack. */
+
+  obstack *get_obstack ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+	return gdbarch_obstack (this->owner.arch);
+    else if (this->kind == owner_kind::objfile)
+	return &this->owner.objfile->objfile_obstack;
+
+    return nullptr;
+  }
+
+  /* Get a reference to the owner's architecture. */
+
+  struct gdbarch *get_arch ()
+  {
+    gdb_assert (this->valid ());
+
+    if (this->kind == owner_kind::arch)
+	return this->owner.arch;
+    else if (this->kind == owner_kind::objfile)
+	return this->owner.objfile->arch ();
+
+    return nullptr;
+  }
+
+  /* Copy a null-terminated string to the owner's obstack. */
+
+  const char *copy_string (const char *py_str)
+  {
+    gdb_assert (this->valid ());
+
+    unsigned int len = strlen (py_str);
+    return obstack_strndup (this->get_obstack (), py_str, len);
+  }
+
+
+
+private:
+  enum class owner_kind { arch, objfile, none };
+
+  owner_kind kind = owner_kind::none;
+  union {
+    struct gdbarch *arch;
+    struct objfile *objfile;
+  } owner;
+};
+
+/* Creates a new type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "type_code", "bit_size", "name",
+				    NULL };
+  PyObject *owner_object;
+  enum type_code code;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oiis", keywords, &owner_object,
+					&code, &bit_length, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = allocator.new_type (code, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new integer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_integer_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_integer_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object(type);
+}
+
+/* Creates a new character type and returns a new gdb.Type associated
+ * with it. */
+
+PyObject *
+gdbpy_init_character_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_character_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new boolean type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_boolean_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *owner_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&owner_object, &bit_size, &unsigned_p,
+					&py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_boolean_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new float type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_float_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "format", "name", NULL };
+  PyObject *owner_object, *float_format_object;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "OOs", keywords, &owner_object,
+					&float_format_object, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  struct floatformat *local_ff = float_format_object_as_float_format
+    (float_format_object);
+  if (local_ff == nullptr)
+    return nullptr;
+
+  /* Persist a copy of the format in the objfile's obstack. This guarantees
+   * that the format won't outlive the type being created from it and that
+   * changes made to the object used to create this type will not affect it
+   * after creation. */
+  auto ff = OBSTACK_CALLOC (owner.get_obstack (), 1, struct floatformat);
+  memcpy (ff, local_ff, sizeof (struct floatformat));
+
+  /* We only support creating float types in the architecture's endianness, so
+   * make sure init_float_type sees the float format structure we need it to.
+   */
+  enum bfd_endian endianness = gdbarch_byte_order (owner.get_arch ());
+  gdb_assert (endianness < BFD_ENDIAN_UNKNOWN);
+
+  const struct floatformat *per_endian[2] = { nullptr, nullptr };
+  per_endian[endianness] = ff;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_float_type (allocator, -1, name, per_endian, endianness);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new decimal float type and returns a new gdb.Type
+ * associated with it. */
+
+PyObject *
+gdbpy_init_decfloat_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "name", NULL };
+  PyObject *owner_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Ois", keywords, &owner_object,
+					&bit_length, &py_name))
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      type = init_decfloat_type (allocator, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Returns whether a given type can be used to create a complex type. */
+
+PyObject *
+gdbpy_can_create_complex_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "type", NULL };
+  PyObject *type_object;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords,
+					&type_object))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  bool can_create_complex = false;
+  try
+    {
+      can_create_complex = can_create_complex_type (type);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  if (can_create_complex)
+    Py_RETURN_TRUE;
+  else
+    Py_RETURN_FALSE;
+}
+
+/* Creates a new complex type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_complex_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "type", "name", NULL };
+  PyObject *type_object;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Os", keywords, &type_object,
+					&py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  obstack *obstack;
+  if (type->is_objfile_owned ())
+    obstack = &type->objfile_owner ()->objfile_obstack;
+  else
+    obstack = gdbarch_obstack (type->arch_owner ());
+
+  unsigned int len = strlen (py_name);
+  const char *name = obstack_strndup (obstack,
+				      py_name,
+				      len);
+  struct type *complex_type;
+  try
+    {
+      complex_type = init_complex_type (name, type);
+      gdb_assert (complex_type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (complex_type);
+}
+
+/* Creates a new pointer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_pointer_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "target", "bit_size", "name",
+				    NULL };
+  PyObject *owner_object, *type_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "OOis", keywords,
+					&owner_object, &type_object,
+					&bit_length, &py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  type_storage_owner owner (owner_object);
+  if (!owner.valid ())
+    return nullptr;
+
+  const char *name = owner.copy_string (py_name);
+  struct type *pointer_type = nullptr;
+  try
+    {
+      type_allocator allocator = owner.allocator ();
+      pointer_type = init_pointer_type (allocator, bit_length, name, type);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (pointer_type);
+}
+
+/* Creates a new fixed point type and returns a new gdb.Type associated
+ * with it. */
+
+PyObject *
+gdbpy_init_fixed_point_type (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "owner", "bit_size", "unsigned", "name",
+				    NULL };
+  PyObject *objfile_object;
+  int bit_length;
+  int unsigned_p;
+  const char* py_name;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "Oips", keywords,
+					&objfile_object, &bit_length,
+					&unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  unsigned int len = strlen (py_name);
+  const char *name = obstack_strndup (&objfile->objfile_obstack, py_name, len);
+  struct type *type;
+  try
+    {
+      type = init_fixed_point_type (objfile, bit_length, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index dbd33570a7..c355bed212 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -289,6 +289,8 @@ extern PyTypeObject frame_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("frame_object");
 extern PyTypeObject thread_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("thread_object");
+extern PyTypeObject float_format_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("float_format");
 
 /* Ensure that breakpoint_object_type is initialized and return true.  If
    breakpoint_object_type can't be initialized then set a suitable Python
@@ -431,6 +433,26 @@ gdb::unique_xmalloc_ptr<char> gdbpy_parse_command_name
 PyObject *gdbpy_register_tui_window (PyObject *self, PyObject *args,
 				     PyObject *kw);
 
+PyObject *gdbpy_init_type (PyObject *self, PyObject *args, PyObject *kw);
+PyObject *gdbpy_init_integer_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_character_type (PyObject *self, PyObject *args,
+				     PyObject *kw);
+PyObject *gdbpy_init_boolean_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_float_type (PyObject *self, PyObject *args,
+				 PyObject *kw);
+PyObject *gdbpy_init_decfloat_type (PyObject *self, PyObject *args,
+				    PyObject *kw);
+PyObject *gdbpy_can_create_complex_type (PyObject *self, PyObject *args,
+					 PyObject *kw);
+PyObject *gdbpy_init_complex_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_pointer_type (PyObject *self, PyObject *args,
+				   PyObject *kw);
+PyObject *gdbpy_init_fixed_point_type (PyObject *self, PyObject *args,
+				       PyObject *kw);
+
 PyObject *symtab_and_line_to_sal_object (struct symtab_and_line sal);
 PyObject *symtab_to_symtab_object (struct symtab *symtab);
 PyObject *symbol_to_symbol_object (struct symbol *sym);
@@ -481,6 +503,18 @@ struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
 frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
 struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
 
+/* Retrieves a pointer to the underlying float format structure. Expects an
+ * instance of gdb.Objfile for SELF. If SELF is of an incompatible type,
+ * returns nullptr and raises a Python exception. */
+
+extern struct objfile *objfile_object_to_objfile (PyObject *self);
+
+/* Retrieves a pointer to the underlying float format structure. Expects an
+ * instance of gdb.FloatFormat for SELF. If SELF is of an incompatible type,
+ * returns nullptr and raises a Python exception. */
+
+extern struct floatformat *float_format_object_as_float_format (PyObject *self);
+
 /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
    gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
    valid (see gdb.Progspace.is_valid), otherwise return the program_space
diff --git a/gdb/python/python.c b/gdb/python/python.c
index fd5a920cbd..22a1ca7184 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -2521,6 +2521,56 @@ Return current recording object." },
     "stop_recording () -> None.\n\
 Stop current recording." },
 
+  /* Type initialization functions. */
+  { "init_type", (PyCFunction) gdbpy_init_type, METH_VARARGS | METH_KEYWORDS,
+    "init_type (objfile, type_code, bit_length, name) -> type\n\
+    Creates a new type with the given bit length and type code, owned\
+    by the given objfile." },
+  { "init_integer_type", (PyCFunction) gdbpy_init_integer_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_integer_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new integer type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_character_type", (PyCFunction) gdbpy_init_character_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_character_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new character type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_boolean_type", (PyCFunction) gdbpy_init_boolean_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_boolean_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new boolean type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_float_type", (PyCFunction) gdbpy_init_float_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_float_type (objfile, float_format, name) -> type\n\
+    Creates a new floating point type with the given bit length and \
+    format, owned by the given objfile." },
+  { "init_decfloat_type", (PyCFunction) gdbpy_init_decfloat_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_decfloat_type (objfile, bit_length, name) -> type\n\
+    Creates a new decimal float type with the given bit length,\
+    owned by the given objfile." },
+  { "can_create_complex_type", (PyCFunction) gdbpy_can_create_complex_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "can_create_complex_type (type) -> bool\n\
+     Returns whether a given type can form a new complex type." },
+  { "init_complex_type", (PyCFunction) gdbpy_init_complex_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_complex_type (base_type, name) -> type\n\
+    Creates a new complex type whose components belong to the\
+    given type, owned by the given objfile." },
+  { "init_pointer_type", (PyCFunction) gdbpy_init_pointer_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_pointer_type (objfile, target_type, bit_length, name) -> type\n\
+    Creates a new pointer type with the given bit length, pointing\
+    to the given target type, and owned by the given objfile." },
+  { "init_fixed_point_type", (PyCFunction) gdbpy_init_fixed_point_type,
+    METH_VARARGS | METH_KEYWORDS,
+    "init_fixed_point_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new fixed point type with the given bit length and\
+    signedness, owned by the given objfile." },
+
   { "lookup_type", (PyCFunction) gdbpy_lookup_type,
     METH_VARARGS | METH_KEYWORDS,
     "lookup_type (name [, block]) -> type\n\
diff --git a/gdb/testsuite/gdb.python/py-type-init.c b/gdb/testsuite/gdb.python/py-type-init.c
new file mode 100644
index 0000000000..010e62bd24
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-type-init.c
@@ -0,0 +1,21 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2009-2023 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int main ()
+{
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.python/py-type-init.exp b/gdb/testsuite/gdb.python/py-type-init.exp
new file mode 100644
index 0000000000..8ef3c2c57a
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-type-init.exp
@@ -0,0 +1,132 @@
+# Copyright (C) 2009-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the GDB testsuite.  It tests the mechanism
+# of creating new types from within Python.
+
+load_lib gdb-python.exp
+
+standard_testfile
+
+# Build inferior to language specification.
+proc build_inferior {exefile lang} {
+  global srcdir subdir srcfile testfile hex
+
+  if { [gdb_compile "${srcdir}/${subdir}/${srcfile}" "${exefile}" executable "debug $lang"] != "" } {
+      untested "failed to compile in $lang mode"
+      return -1
+  }
+
+  return 0
+}
+
+# Restart GDB.
+proc restart_gdb {exefile} {
+  clean_restart $exefile
+
+  if {![runto_main]} {
+      return
+  }
+}
+
+# Tests the basic values of a type.
+proc test_type_basic {owner t code sizeof name} {
+  gdb_test "python print(${t}.code == ${code})" \
+    "True" "check the code for the python-constructed type (${owner}/${name})"
+  gdb_test "python print(${t}.sizeof == ${sizeof})" \
+    "True" "check the size for the python-constructed type (${owner}/${name})"
+  gdb_test "python print(${t}.name == ${name})" \
+    "True" "check the name for the python-constructed type (${owner}/${name})"
+}
+
+# Runs the tests for a given owner object.
+proc for_owner {owner} {
+  # Simple direct type creation.
+  gdb_test "python t = gdb.init_type(${owner}, gdb.TYPE_CODE_INT, 24, 'long short int')" \
+    "" "construct a new type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_INT" "3" "'long short int'"
+
+  # Integer type creation.
+  gdb_test "python t = gdb.init_integer_type(${owner}, 24, True, 'test_int_t')" \
+    "" "construct a new integer type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_INT" "3" "'test_int_t'"
+
+  # Character type creation.
+  gdb_test "python t = gdb.init_character_type(${owner}, 24, True, 'test_char_t')" \
+    "" "construct a new character type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_CHAR" "3" "'test_char_t'"
+
+  # Boolean type creation.
+  gdb_test "python t = gdb.init_boolean_type(${owner}, 24, True, 'test_bool_t')" \
+    "" "construct a new boolean type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_BOOL" "3" "'test_bool_t'"
+
+  # Float type creation.
+  gdb_test "python f = gdb.FloatFormat()" "" "create a float format object (${owner})"
+  gdb_test "python f.totalsize = 32" "" "set totalsize for the float format (${owner})"
+  gdb_test "python f.sign_start = 31" "" "set sign_start for the float format (${owner})"
+  gdb_test "python f.exp_start = 23" "" "set exp_start for the float format (${owner})"
+  gdb_test "python f.exp_len = 8" "" "set exp_len for the float format (${owner})"
+  gdb_test "python f.exp_bias = 0" "" "set exp_bias for the float format (${owner})"
+  gdb_test "python f.exp_nan = 0xff" "" "set exp_nan for the float format (${owner})"
+  gdb_test "python f.man_start = 0" "" "set man_start for the float format (${owner})"
+  gdb_test "python f.man_len = 22" "" "set man_len for the float format (${owner})"
+  gdb_test "python f.intbit = False" "" "set intbit for the float format (${owner})"
+  gdb_test "python f.name = 'test_float_fmt'" "" "set name for the float format (${owner})"
+
+  gdb_test "python ft = gdb.init_float_type(${owner}, f, 'test_float_t')" \
+    "" "construct a new float type from inside python (${owner})"
+  test_type_basic $owner "ft" "gdb.TYPE_CODE_FLT" "4" "'test_float_t'"
+
+  # Decfloat type creation.
+  gdb_test "python t = gdb.init_decfloat_type(${owner}, 24, 'test_decfloat_t')" \
+    "" "construct a new decfloat type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_DECFLOAT" "3" "'test_decfloat_t'"
+
+  # Test complex type.
+  gdb_test "python print(gdb.can_create_complex_type(ft))" "True" \
+    "check whether the float type we created can be the basis for a complex (${owner})"
+
+  gdb_test "python t = gdb.init_complex_type(ft, 'test_complex_t')" \
+    "" "construct a new complex type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_COMPLEX" "8" "'test_complex_t'"
+
+  # Create a 24-bit pointer to our floating point type.
+  gdb_test "python t = gdb.init_pointer_type(${owner}, ft, 24, 'test_pointer_t')" \
+    "" "construct a new pointer type from inside python (${owner})"
+  test_type_basic $owner "t" "gdb.TYPE_CODE_PTR" "3" "'test_pointer_t'"
+}
+
+# Run the tests.
+if { [build_inferior "${binfile}" "c"] == 0 } {
+  restart_gdb "${binfile}"
+
+  # Skip all tests if Python scripting is not enabled.
+  if { ![allow_python_tests] } { continue }
+
+  # Test objfile-owned type construction
+  for_owner "gdb.objfiles()\[0\]"
+
+  # Objfile-owned fixed point type creation.
+  #
+  # Currently, these cannot be owned by architectures, so we have to
+  # test them separately.
+  gdb_test "python t = gdb.init_fixed_point_type(gdb.objfiles()\[0\], 24, True, 'test_fixed_t')" \
+    "" "construct a new fixed point type from inside python (gdb.objfile()\[0\])"
+  test_type_basic "gdb.objfile()\[0\]" "t" "gdb.TYPE_CODE_FIXED_POINT" "3" "'test_fixed_t'"
+
+  # Test arch-owned type construction
+  for_owner "gdb.inferiors()\[0\].architecture()"
+}
-- 
2.41.0


^ permalink raw reply	[relevance 1%]

* Re: [PATCH] Add support for creating new types from the Python API
  2023-05-26  3:30  2% ` Matheus Branco Borella
@ 2023-08-07 14:53  5%   ` Andrew Burgess
  2023-08-08 21:00  1%     ` [PATCH v2] " Matheus Branco Borella
  0 siblings, 1 reply; 65+ results
From: Andrew Burgess @ 2023-08-07 14:53 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, gdb-patches; +Cc: dark.ryu.550

Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
writes:

> From: "dark.ryu.550@gmail.com" <dark.ryu.550@gmail.com>
>
> On 1/6/23 20:00, simark@simark.ca:
>> Unfortunately, I am unable to apply this patch as well, please send it
>> using git-send-email.
>
> Should be all good to go now. I'm sorry for unearthing this patch after it's 
> been so long, but I hope it's not (too much of) a problem. I've updated the old 
> patch to work with the way symbol allocation is done now, since it changed from 
> six months ago, and I've also added a test case for it.
>
>> It would maybe be nice to be able to create arch-owned types too.  For
>> instance, you could create types just after firing up GDB, without even
>> having an objfile loaded.  It's not necessary to implement it at the
>> same time, but does your approach leave us the option to do that at a
>> later time?
>
> Hmm, I think it shouldn't be a problem. The way it works now, it already uses
> `type_allocator` to do most of the heavy lifting, which can handle both `
> objfile`s and `arch`es. I can see a straightforward way to do that in using
> keyword arguments (e.g. `objfile=` and `arch=`) to separate the two cases in 
> Python and doing a check on the C side for which of the two was used.
>
> ---
>
> This patch adds support for creating types from within the Python API. It does
> so by exposing the `init_*_type` family of functions, defined in `gdbtypes.h` to
> Python and having them return `gdb.Type` objects connected to the newly minted
> types.
>
> These functions are accessible in the root of the gdb module and all require
> a reference to a `gdb.Objfile`. Types created from this API are exclusively
> objfile-owned.
>
> This patch also adds an extra type - `gdb.FloatFormat` - to support creation of
> floating point types by letting users control the format from within Python. It
> is missing, however, a way to specify half formats and validation functions.
>
> It is important to note that types created using this interface are not
> automatically registered as a symbol, and so, types will become unreachable
> unless used to create a value that otherwise references it or saved in some way.
>
> The main drawback of using the `init_*_type` family over implementing type
> initialization by hand is that any type that's created gets immediately
> allocated on its owner objfile's obstack, regardless of what its real
> lifetime requirements are. The main implication of this is that types that
> become unreachable will leak their memory for the lifetime of the objfile.

I'd soften this from "leak their memory" to "remain live" -- it just
feels like claiming there's a leak here is a little too harsh.

>
> Keeping track of the initialization of the type by hand would require a
> deeper change to the existing type object infrastructure. A bit too ambitious
> for a first patch, I'd say.
>
> if it were to be done though, we would gain the ability to only keep in the
> obstack types that are known to be referenced in some other way - by allocating
> and copying the data to the obstack as other objects are created that reference
> it (eg. symbols).
> ---
>  gdb/Makefile.in                      |   2 +
>  gdb/python/py-float-format.c         | 321 +++++++++++++++++++++
>  gdb/python/py-objfile.c              |  12 +
>  gdb/python/py-type-init.c            | 409 +++++++++++++++++++++++++++
>  gdb/python/python-internal.h         |  15 +
>  gdb/python/python.c                  |  41 +++
>  gdb/testsuite/gdb.python/py-type.exp |  10 +

Before this could be merged we would need at least:

  - Updates to the documentation,
  - A NEWS entry,
  - Significantly more tests.

I have a few observations, but I think it will be easier to review once
there are either some docs and tests that exercise all the parts, as
I'll be able to see how everything is intended to work together without
having to figure it out from the code.

>  7 files changed, 810 insertions(+)
>  create mode 100644 gdb/python/py-float-format.c
>  create mode 100644 gdb/python/py-type-init.c
>
> diff --git a/gdb/Makefile.in b/gdb/Makefile.in
> index 14b5dd0bad..108bcea69e 100644
> --- a/gdb/Makefile.in
> +++ b/gdb/Makefile.in
> @@ -431,6 +431,8 @@ SUBDIR_PYTHON_SRCS = \
>  	python/py-threadevent.c \
>  	python/py-tui.c \
>  	python/py-type.c \
> +	python/py-type-init.c \
> +	python/py-float-format.c \
>  	python/py-unwind.c \
>  	python/py-utils.c \
>  	python/py-value.c \
> diff --git a/gdb/python/py-float-format.c b/gdb/python/py-float-format.c
> new file mode 100644
> index 0000000000..8fe92980f1
> --- /dev/null
> +++ b/gdb/python/py-float-format.c
> @@ -0,0 +1,321 @@
> +/* Accessibility of float format controls from inside the Python API
> +
> +   Copyright (C) 2008-2023 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include "defs.h"
> +#include "python-internal.h"
> +#include "floatformat.h"
> +
> +/* Structure backing the float format Python interface. */
> +
> +struct float_format_object
> +{
> +  PyObject_HEAD
> +  struct floatformat format;
> +
> +  struct floatformat *float_format ()
> +  {
> +    return &this->format;
> +  }
> +};
> +
> +/* Initializes the float format type and registers it with the Python interpreter. */

Throughout this patch you have some lines that are longer than we like
for GDB.  Ideally we keep lines under 80 characters.  I'm not going to
point out every such line.

> +
> +static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
> +gdbpy_initialize_float_format (void)
> +{
> +  if (PyType_Ready (&float_format_object_type) < 0)
> +    return -1;
> +
> +  if (gdb_pymodule_addobject (gdb_module, "FloatFormat",
> +                              (PyObject *) &float_format_object_type) < 0)
> +    return -1;
> +
> +  return 0;
> +}
> +
> +GDBPY_INITIALIZE_FILE (gdbpy_initialize_float_format);
> +
> +#define INSTANCE_FIELD_GETTER(getter_name, field_name, field_type, field_conv) \
> +  static PyObject *                                                            \
> +  getter_name (PyObject *self, void *closure)                                  \
> +  {                                                                            \
> +    float_format_object *ff = (float_format_object*) self;                     \
> +    field_type value = ff->float_format ()->field_name;                        \
> +    return field_conv (value);                                                 \
> +  }
> +
> +#define INSTANCE_FIELD_SETTER(getter_name, field_name, field_type, field_conv) \
> +  static int                                                                   \
> +  getter_name (PyObject *self, PyObject* value, void *closure)                 \

Probably setter_name would be better here.

Ideally every function and global macro or variable should have a
comment.  I think these would definitely be improved with a comment.

> +  {                                                                            \
> +    field_type native_value;                                                   \
> +    if (!field_conv (value, &native_value))                                    \
> +      return -1;                                                               \
> +    float_format_object *ff = (float_format_object*) self;                     \
> +    ff->float_format ()->field_name = native_value;                            \
> +    return 0;                                                                  \
> +  }
> +
> +/* Converts from the intbit enum to a Python boolean. */
> +
> +static PyObject *
> +intbit_to_py (enum floatformat_intbit intbit)
> +{
> +  gdb_assert 
> +    (intbit == floatformat_intbit_yes || 
> +     intbit == floatformat_intbit_no);

The '||' operator should be placed at the start of the new line.  The
line wrapping here is a little too aggressive, better would be:

  gdb_assert (intbit == floatformat_intbit_yes
	      || intbit == floatformat_intbit_no);

> +
> +  if (intbit == floatformat_intbit_no)
> +    Py_RETURN_FALSE;
> +  else
> +    Py_RETURN_TRUE;
> +}
> +
> +/* Converts from a Python boolean to the intbit enum. */
> +
> +static bool
> +py_to_intbit (PyObject *object, enum floatformat_intbit *intbit)
> +{
> +  if (!PyObject_IsInstance (object, (PyObject*) &PyBool_Type))
> +    {
> +      PyErr_SetString (PyExc_TypeError, "intbit must be True or False");
> +      return false;
> +    }
> +
> +  *intbit = PyObject_IsTrue (object) ? 
> +    floatformat_intbit_yes : floatformat_intbit_no;

Operator at the start of a line again, and use () to encourage
alignment, so:

  *intbit = (PyObject_IsTrue (object)
	     ? floatformat_intbit_yes : floatformat_intbit_no);


> +  return true;
> +}
> +
> +/* Converts from a Python integer to a unsigned integer. */
> +
> +static bool
> +py_to_unsigned_int (PyObject *object, unsigned int *val)
> +{
> +  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
> +    {
> +      PyErr_SetString (PyExc_TypeError, "value must be an integer");
> +      return false;
> +    }
> +
> +  long native_val = PyLong_AsLong (object);
> +  if (native_val > (long) UINT_MAX)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "value is too large");
> +      return false;
> +    }
> +  if (native_val < 0)
> +    {
> +      PyErr_SetString (PyExc_ValueError, 
> +                       "value must not be smaller than zero");
> +      return false;
> +    }
> +
> +  *val = (unsigned int) native_val;
> +  return true;
> +}
> +
> +/* Converts from a Python integer to a signed integer. */
> +
> +static bool
> +py_to_int(PyObject *object, int *val)
> +{
> +  if(!PyObject_IsInstance(object, (PyObject*)&PyLong_Type))
> +    {
> +      PyErr_SetString(PyExc_TypeError, "value must be an integer");
> +      return false;
> +    }
> +
> +  long native_val = PyLong_AsLong(object);
> +  if(native_val > (long)INT_MAX)
> +    {
> +      PyErr_SetString(PyExc_ValueError, "value is too large");
> +      return false;
> +    }
> +
> +  *val = (int)native_val;
> +  return true;
> +}
> +
> +INSTANCE_FIELD_GETTER (ffpy_get_totalsize, totalsize, 
> +                       unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_sign_start, sign_start, 
> +                       unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_start, exp_start, 
> +                       unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_len, exp_len, 
> +                       unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_bias, exp_bias, int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_nan, exp_nan, 
> +                       unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_man_start, man_start, 
> +                       unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_man_len, man_len, 
> +                       unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_intbit, intbit, 
> +                       enum floatformat_intbit, intbit_to_py)
> +INSTANCE_FIELD_GETTER (ffpy_get_name, name, 
> +                       const char *, PyUnicode_FromString)
> +
> +INSTANCE_FIELD_SETTER (ffpy_set_totalsize, totalsize, 
> +                       unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_sign_start, sign_start, 
> +                       unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_start, exp_start, 
> +                       unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_len, exp_len, 
> +                       unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_bias, exp_bias, int, py_to_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_nan, exp_nan, 
> +                       unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_man_start, man_start,
> +                       unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_man_len, man_len, 
> +                       unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_intbit, intbit, 
> +                       enum floatformat_intbit, py_to_intbit)
> +
> +/* Makes sure float formats created from Python always test as valid. */
> +
> +static int
> +ffpy_always_valid (const struct floatformat *fmt ATTRIBUTE_UNUSED,
> +                   const void *from ATTRIBUTE_UNUSED)
> +{
> +  return 1;
> +}
> +
> +/* Initializes new float format objects. */
> +
> +static int
> +ffpy_init (PyObject *self,
> +           PyObject *args ATTRIBUTE_UNUSED,
> +           PyObject *kwds ATTRIBUTE_UNUSED)
> +{
> +  auto ff = (float_format_object*) self;
> +  ff->format = floatformat ();
> +  ff->float_format ()->name = "";
> +  ff->float_format ()->is_valid = ffpy_always_valid;
> +  return 0;
> +}
> +
> +/* Retrieves a pointer to the underlying float format structure. */

Comments for extern functions should be placed in the header file, and a
comment here should just say:

  /* See python/python-internal.h.  */

I know there are lots of counter examples to this practice in
python-internal.h, but they are all older code.  Newer code is expected
to follow the above style -- and there are also examples of this in that
header too.

> +
> +struct floatformat *
> +float_format_object_as_float_format (PyObject *self)
> +{
> +  if (!PyObject_IsInstance (self, (PyObject*) &float_format_object_type))
> +    return nullptr;

I'm not sure this is right.  I believe PyObject_IsInstance can return 1
(if it is an instance), 0 (if it is not an instance), or -1 (on error).

So:

  if (PyObject_IsInstance (self, (PyObject*) &float_format_object_type) <= 0)
    return nullptr;

Might be better.  Except, an exception is only set for the -1 case, and
looking at the user of float_format_object_as_float_format, I think
there is an expectation that an exception will have been set, so you
might want to either always set an exception here, or only set an
exception for the 0 case?

Either way, this is exactly the sort of thing that the comment should be
saying, e.g.

  /* Retrieves a pointer to the float format structure represented by
     SELF, assuming SELF is of type gdb.XXXX.  If SELF is not of the
     correct type then return nullptr and set an exception.  */

Actually, looking at some of the other code, I wonder if the right thing
here is to switch to PyObject_TypeCheck?  This avoids (possibly) calling
back into Python code, and only returns 0/1.  We seem to use _TypeCheck
extensively in other places, is there a requirement for _IsInstance?

> +  return ((float_format_object*) self)->float_format ();
> +}
> +
> +static gdb_PyGetSetDef float_format_object_getset[] =
> +{
> +  { "totalsize", ffpy_get_totalsize, ffpy_set_totalsize,
> +    "The total size of the floating point number, in bits.", nullptr },
> +  { "sign_start", ffpy_get_sign_start, ffpy_set_sign_start,
> +    "The bit offset of the sign bit.", nullptr },
> +  { "exp_start", ffpy_get_exp_start, ffpy_set_exp_start,
> +    "The bit offset of the start of the exponent.", nullptr },
> +  { "exp_len", ffpy_get_exp_len, ffpy_set_exp_len,
> +    "The size of the exponent, in bits.", nullptr },
> +  { "exp_bias", ffpy_get_exp_bias, ffpy_set_exp_bias,
> +    "Bias added to a \"true\" exponent to form the biased exponent.", nullptr },
> +  { "exp_nan", ffpy_get_exp_nan, ffpy_set_exp_nan,
> +    "Exponent value which indicates NaN.", nullptr },
> +  { "man_start", ffpy_get_man_start, ffpy_set_man_start,
> +    "The bit offset of the start of the mantissa.", nullptr },
> +  { "man_len", ffpy_get_man_len, ffpy_set_man_len,
> +    "The size of the mantissa, in bits.", nullptr },
> +  { "intbit", ffpy_get_intbit, ffpy_set_intbit,
> +    "Is the integer bit explicit or implicit?", nullptr },
> +  { "name", ffpy_get_name, nullptr,
> +    "Internal name for debugging.", nullptr },
> +  { nullptr }
> +};
> +
> +static PyMethodDef float_format_object_methods[] =
> +{
> +  { NULL }
> +};
> +
> +static PyNumberMethods float_format_object_as_number = {
> +  nullptr,             /* nb_add */
> +  nullptr,             /* nb_subtract */
> +  nullptr,             /* nb_multiply */
> +  nullptr,             /* nb_remainder */
> +  nullptr,             /* nb_divmod */
> +  nullptr,             /* nb_power */
> +  nullptr,             /* nb_negative */
> +  nullptr,             /* nb_positive */
> +  nullptr,             /* nb_absolute */
> +  nullptr,             /* nb_nonzero */
> +  nullptr,             /* nb_invert */
> +  nullptr,             /* nb_lshift */
> +  nullptr,             /* nb_rshift */
> +  nullptr,             /* nb_and */
> +  nullptr,             /* nb_xor */
> +  nullptr,             /* nb_or */
> +  nullptr,             /* nb_int */
> +  nullptr,             /* reserved */
> +  nullptr,             /* nb_float */
> +};

I haven't dug into the implications of providing this structure with all
the fields set to nullptr vs just providing nullptr for the tp_as_number
field below.

However, given this is not common in the GDB code, I think, if there is
a reason for doing this, that it would be worth explaining in a comment.

Similarly, you have an empty float_format_object_methods list, in other
places we just set tp_methods to nullptr -- so is there a reason for the
approach you've taken here?

> +
> +PyTypeObject float_format_object_type =
> +{
> +  PyVarObject_HEAD_INIT (NULL, 0)
> +  "gdb.FloatFormat",              /*tp_name*/
> +  sizeof (float_format_object),   /*tp_basicsize*/
> +  0,                              /*tp_itemsize*/
> +  nullptr,                        /*tp_dealloc*/
> +  0,                              /*tp_print*/
> +  nullptr,                        /*tp_getattr*/
> +  nullptr,                        /*tp_setattr*/
> +  nullptr,                        /*tp_compare*/
> +  nullptr,                        /*tp_repr*/
> +  &float_format_object_as_number, /*tp_as_number*/
> +  nullptr,                        /*tp_as_sequence*/
> +  nullptr,                        /*tp_as_mapping*/
> +  nullptr,                        /*tp_hash */
> +  nullptr,                        /*tp_call*/
> +  nullptr,                        /*tp_str*/
> +  nullptr,                        /*tp_getattro*/
> +  nullptr,                        /*tp_setattro*/
> +  nullptr,                        /*tp_as_buffer*/
> +  Py_TPFLAGS_DEFAULT,             /*tp_flags*/
> +  "GDB float format object",      /* tp_doc */
> +  nullptr,                        /* tp_traverse */
> +  nullptr,                        /* tp_clear */
> +  nullptr,                        /* tp_richcompare */
> +  0,                              /* tp_weaklistoffset */
> +  nullptr,                        /* tp_iter */
> +  nullptr,                        /* tp_iternext */
> +  float_format_object_methods,    /* tp_methods */
> +  nullptr,                        /* tp_members */
> +  float_format_object_getset,     /* tp_getset */
> +  nullptr,                        /* tp_base */
> +  nullptr,                        /* tp_dict */
> +  nullptr,                        /* tp_descr_get */
> +  nullptr,                        /* tp_descr_set */
> +  0,                              /* tp_dictoffset */
> +  ffpy_init,                      /* tp_init */
> +  nullptr,                        /* tp_alloc */
> +  PyType_GenericNew,              /* tp_new */
> +};
> +
> +
> diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
> index ad72f3f042..be2121c405 100644
> --- a/gdb/python/py-objfile.c
> +++ b/gdb/python/py-objfile.c
> @@ -704,6 +704,18 @@ objfile_to_objfile_object (struct objfile *objfile)
>    return gdbpy_ref<>::new_reference (result);
>  }
>  
> +struct objfile *
> +objfile_object_to_objfile (PyObject *self)
> +{
> +  if (!PyObject_TypeCheck (self, &objfile_object_type))
> +    return nullptr;
> +
> +  auto objfile_object = (struct objfile_object*) self;
> +  OBJFPY_REQUIRE_VALID (objfile_object);
> +
> +  return objfile_object->objfile;
> +}
> +
>  static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
>  gdbpy_initialize_objfile (void)
>  {
> diff --git a/gdb/python/py-type-init.c b/gdb/python/py-type-init.c
> new file mode 100644
> index 0000000000..a18cce6e51
> --- /dev/null
> +++ b/gdb/python/py-type-init.c
> @@ -0,0 +1,409 @@
> +/* Functionality for creating new types accessible from python.
> +
> +   Copyright (C) 2008-2023 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include "defs.h"
> +#include "python-internal.h"
> +#include "gdbtypes.h"
> +#include "floatformat.h"
> +#include "objfiles.h"
> +#include "gdbsupport/gdb_obstack.h"
> +
> +
> +/* Copies a null-terminated string into an objfile's obstack. */
> +
> +static const char *
> +copy_string (struct objfile *objfile, const char *py_str)
> +{
> +  unsigned int len = strlen (py_str);
> +  return obstack_strndup (&objfile->per_bfd->storage_obstack,
> +                          py_str, len);
> +}
> +
> +/* Creates a new type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object;
> +  enum type_code code;
> +  int bit_length;
> +  const char *py_name;
> +
> +  if(!PyArg_ParseTuple (args, "Oiis", &objfile_object, &code, 
> +                        &bit_length, &py_name))
> +    return nullptr;

I'm a huge fan of named arguments, and would like to see all of these
converted to use named arguments.

With an eye to Simon's suggestion, I would be tempted to name the first
argument `owner` maybe?  We could then use PyObject_TypeCheck to decide
if the owner is a gdb.Objfile or a gdb.Architecture.

I agree that supporting gdb.Architecture isn't a requirement for getting
the patch merged, but I don't want to get stuck with an API that doesn't
quite work, so if you were willing to give that a go, that would be great.

> +
> +  struct objfile* objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +    {
> +      type_allocator allocator (objfile);
> +      type = allocator.new_type (code, bit_length, name);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new integer type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_integer_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object;
> +  int bit_size;
> +  int unsigned_p;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, 
> +                         &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +    {
> +      type_allocator allocator (objfile);
> +      type = init_integer_type (allocator, bit_size, unsigned_p, name);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object(type);
> +}
> +
> +/* Creates a new character type and returns a new gdb.Type associated 
> + * with it. */
> +
> +PyObject *
> +gdbpy_init_character_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *objfile_object;
> +  int bit_size;
> +  int unsigned_p;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, 
> +                         &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +    {
> +      type_allocator allocator (objfile);
> +      type = init_character_type (allocator, bit_size, unsigned_p, name);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new boolean type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_boolean_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *objfile_object;
> +  int bit_size;
> +  int unsigned_p;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, 
> +                         &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +    {
> +      type_allocator allocator (objfile);
> +      type = init_boolean_type (allocator, bit_size, unsigned_p, name);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new float type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_float_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object, *float_format_object;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "OOs", &objfile_object, 
> +                         &float_format_object, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  struct floatformat *local_ff = float_format_object_as_float_format 
> +    (float_format_object);
> +  if (local_ff == nullptr)
> +    return nullptr;
> +
> +  /* Persist a copy of the format in the objfile's obstack. This guarantees that
> +   * the format won't outlive the type being created from it and that changes
> +   * made to the object used to create this type will not affect it after
> +   * creation. */
> +  auto ff = OBSTACK_CALLOC
> +    (&objfile->objfile_obstack,
> +     1,
> +     struct floatformat);
> +  memcpy (ff, local_ff, sizeof (struct floatformat));
> +
> +  /* We only support creating float types in the architecture's endianness, so
> +   * make sure init_float_type sees the float format structure we need it to. */
> +  enum bfd_endian endianness = gdbarch_byte_order (objfile->arch());
> +  gdb_assert (endianness < BFD_ENDIAN_UNKNOWN);
> +
> +  const struct floatformat *per_endian[2] = { nullptr, nullptr };
> +  per_endian[endianness] = ff;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +    {
> +      type_allocator allocator (objfile);
> +      type = init_float_type (allocator, -1, name, per_endian, endianness);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new decimal float type and returns a new gdb.Type 
> + * associated with it. */
> +
> +PyObject *
> +gdbpy_init_decfloat_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object;
> +  int bit_length;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Ois", &objfile_object, &bit_length, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +    {
> +      type_allocator allocator (objfile);
> +      type = init_decfloat_type (allocator, bit_length, name);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Returns whether a given type can be used to create a complex type. */
> +
> +PyObject *
> +gdbpy_can_create_complex_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *type_object;
> +
> +  if (!PyArg_ParseTuple (args, "O", &type_object))
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  bool can_create_complex = false;
> +  try
> +    {
> +      can_create_complex = can_create_complex_type (type);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  if (can_create_complex)
> +    Py_RETURN_TRUE;
> +  else
> +    Py_RETURN_FALSE;
> +}
> +
> +/* Creates a new complex type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_complex_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *type_object;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Os", &type_object, &py_name))
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  obstack *obstack;
> +  if (type->is_objfile_owned ())
> +    obstack = &type->objfile_owner ()->objfile_obstack;
> +  else
> +    obstack = gdbarch_obstack (type->arch_owner ());
> +
> +  unsigned int len = strlen (py_name);
> +  const char *name = obstack_strndup (obstack,
> +                                      py_name,
> +                                      len);
> +  struct type *complex_type;
> +  try
> +    {
> +      complex_type = init_complex_type (name, type);
> +      gdb_assert (complex_type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (complex_type);
> +}
> +
> +/* Creates a new pointer type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_pointer_type (PyObject *self, PyObject *args)
> +{

I guess I was a little surprised to see this one.  I'd sort-of expected
that to get a pointer type you'd create a new type and then do:

  pointer_type = gdb.init_*_type(....).pointer()

I guess this code below does allow for different pointer sizes ... and
there's even a FIXME comment in gdbtypes.c pointing out that GDB only
supports a single pointer size, so maybe this is working towards closing
that issue.

But I'm not sure how these pointers would be used ... maybe some tests
will give examples of how this is different to calling
gdb.Type.pointer() then I'll understand...

> +  PyObject *objfile_object, *type_object;
> +  int bit_length;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "OOis", &objfile_object, &type_object, 
> +                         &bit_length, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *pointer_type = nullptr;
> +  try
> +    {
> +      type_allocator allocator (objfile);
> +      pointer_type = init_pointer_type (allocator, bit_length, 
> +                                        name, type);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (pointer_type);
> +}
> +
> +/* Creates a new fixed point type and returns a new gdb.Type associated 
> + * with it. */
> +
> +PyObject *
> +gdbpy_init_fixed_point_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *objfile_object;
> +  int bit_length;
> +  int unsigned_p;
> +  const char* py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_length, 
> +                         &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +    {
> +      type = init_fixed_point_type (objfile, bit_length, unsigned_p, 
> +                                    name);
> +      gdb_assert (type != nullptr);
> +    }
> +  catch (gdb_exception_error& ex)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (ex);
> +    }
> +
> +  return type_to_type_object (type);
> +}
> +
> diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
> index dbd33570a7..73e2e6ce62 100644
> --- a/gdb/python/python-internal.h
> +++ b/gdb/python/python-internal.h
> @@ -289,6 +289,8 @@ extern PyTypeObject frame_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("frame_object");
>  extern PyTypeObject thread_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("thread_object");
> +extern PyTypeObject float_format_object_type
> +    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("float_format");
>  
>  /* Ensure that breakpoint_object_type is initialized and return true.  If
>     breakpoint_object_type can't be initialized then set a suitable Python
> @@ -431,6 +433,17 @@ gdb::unique_xmalloc_ptr<char> gdbpy_parse_command_name
>  PyObject *gdbpy_register_tui_window (PyObject *self, PyObject *args,
>  				     PyObject *kw);
>  
> +PyObject *gdbpy_init_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_integer_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_character_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_boolean_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_float_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_decfloat_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_can_create_complex_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_complex_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_pointer_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_fixed_point_type (PyObject *self, PyObject *args);
> +
>  PyObject *symtab_and_line_to_sal_object (struct symtab_and_line sal);
>  PyObject *symtab_to_symtab_object (struct symtab *symtab);
>  PyObject *symbol_to_symbol_object (struct symbol *sym);
> @@ -480,6 +493,8 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
>  struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
>  frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
>  struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
> +struct objfile *objfile_object_to_objfile (PyObject *self);
> +struct floatformat *float_format_object_as_float_format (PyObject *self);

These two should be marked 'extern' and (as I said earlier) should have
a their comments here.

>  
>  /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
>     gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
> diff --git a/gdb/python/python.c b/gdb/python/python.c
> index fd5a920cbd..288c8b355c 100644
> --- a/gdb/python/python.c
> +++ b/gdb/python/python.c
> @@ -2521,6 +2521,47 @@ Return current recording object." },
>      "stop_recording () -> None.\n\
>  Stop current recording." },
>  
> +  /* Type initialization functions. */
> +  { "init_type", gdbpy_init_type, METH_VARARGS,
> +    "init_type (objfile, type_code, bit_length, name) -> type\n\
> +    Creates a new type with the given bit length and type code, owned\
> +    by the given objfile." },
> +  { "init_integer_type", gdbpy_init_integer_type, METH_VARARGS,
> +    "init_integer_type (objfile, bit_length, unsigned, name) -> type\n\
> +    Creates a new integer type with the given bit length and \
> +    signedness, owned by the given objfile." },
> +  { "init_character_type", gdbpy_init_character_type, METH_VARARGS,
> +    "init_character_type (objfile, bit_length, unsigned, name) -> type\n\
> +    Creates a new character type with the given bit length and \
> +    signedness, owned by the given objfile." },
> +  { "init_boolean_type", gdbpy_init_boolean_type, METH_VARARGS,
> +    "init_boolean_type (objfile, bit_length, unsigned, name) -> type\n\
> +    Creates a new boolean type with the given bit length and \
> +    signedness, owned by the given objfile." },
> +  { "init_float_type", gdbpy_init_float_type, METH_VARARGS,
> +    "init_float_type (objfile, float_format, name) -> type\n\
> +    Creates a new floating point type with the given bit length and \
> +    format, owned by the given objfile." },
> +  { "init_decfloat_type", gdbpy_init_decfloat_type, METH_VARARGS,
> +    "init_decfloat_type (objfile, bit_length, name) -> type\n\
> +    Creates a new decimal float type with the given bit length,\
> +    owned by the given objfile." },
> +  { "can_create_complex_type", gdbpy_can_create_complex_type, METH_VARARGS,
> +    "can_create_complex_type (type) -> bool\n\
> +     Returns whether a given type can form a new complex type." },
> +  { "init_complex_type", gdbpy_init_complex_type, METH_VARARGS,
> +    "init_complex_type (base_type, name) -> type\n\
> +    Creates a new complex type whose components belong to the\
> +    given type, owned by the given objfile." },
> +  { "init_pointer_type", gdbpy_init_pointer_type, METH_VARARGS,
> +    "init_pointer_type (objfile, target_type, bit_length, name) -> type\n\
> +    Creates a new pointer type with the given bit length, pointing\
> +    to the given target type, and owned by the given objfile." },
> + { "init_fixed_point_type", gdbpy_init_fixed_point_type, METH_VARARGS,
> +   "init_fixed_point_type (objfile, bit_length, unsigned, name) -> type\n\
> +   Creates a new fixed point type with the given bit length and\
> +   signedness, owned by the given objfile." },
> +
>    { "lookup_type", (PyCFunction) gdbpy_lookup_type,
>      METH_VARARGS | METH_KEYWORDS,
>      "lookup_type (name [, block]) -> type\n\
> diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
> index c245d41a1a..aee2b4d60a 100644
> --- a/gdb/testsuite/gdb.python/py-type.exp
> +++ b/gdb/testsuite/gdb.python/py-type.exp
> @@ -388,3 +388,13 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
>        test_type_equality
>    }
>  }
> +
> +# Test python type construction
> +gdb_test "python t = gdb.init_type(gdb.objfiles ()\[0\], gdb.TYPE_CODE_INT, 24, 'long short int')" \
> +  "" "construct a new type from inside python"
> +gdb_test "python print (t.code)" \
> +  "8" "check the code for the python-constructed type"

Rather than including '8' in here, I wonder if:

  gdb_test "python print (t.code == gdb.TYPE_CODE_INT)" \
    "True" "check the code for the python-constructed type"

would be better?

Thanks,
Andrew

> +gdb_test "python print (t.sizeof)" \
> +  "3" "check the size for the python-constructed type"
> +gdb_test "python print (t.name)" \
> +  "long short int" "check the name for the python-constructed type"
> -- 
> 2.40.1


^ permalink raw reply	[relevance 5%]

* Re: [PATCH] Add name_of_main and language_of_main to the DWARF index
  2023-07-07 18:00  0%         ` Eli Zaretskii
@ 2023-08-04 20:55  0%           ` Tom de Vries
  0 siblings, 0 replies; 65+ results
From: Tom de Vries @ 2023-08-04 20:55 UTC (permalink / raw)
  To: Eli Zaretskii, Matheus Branco Borella; +Cc: gdb-patches

On 7/7/23 20:00, Eli Zaretskii via Gdb-patches wrote:
>> From: Matheus Branco Borella <dark.ryu.550@gmail.com>
>> Cc: gdb-patches@sourceware.org,
>> 	Matheus Branco Borella <dark.ryu.550@gmail.com>
>> Date: Fri,  7 Jul 2023 12:00:22 -0300
>>
>> Eli Zaretskii <eliz@gnu.org> wrote:
>>> Your assignment is not on file yet, AFAICT.  Was the paperwork
>>> completed, i.e. did you get a copy of the assignment signed by you and
>>> by the FSF?  If not, you need to wait some more.
>>
>> Huh, that's weird. I do have the copy, signed by both parties. Maybe it just
>> hasn't been filed yet? If you want, I could forward it to you.
> 
> I see now that your assignment was added, but it's only for Emacs, not
> for GDB.
> 
>>> Thanks.  The documentation parts are OK, but please fix the text to
>>> leave two spaces between sentences, not one.
>>
>> Alright, I've fixed that.
>>
>> Is there anything else I'm missing?
> 
> There's still the issue of copyright assignment for GDB contributions,
> AFAICT.

Hi,

Just checking, any update on this?

I'd love to see this get committed.

Thanks,
- Tom

^ permalink raw reply	[relevance 0%]

* [PATCH v2] Add name_of_main and language_of_main to the DWARF index
  2023-08-03  7:29  7%         ` Tom de Vries
@ 2023-08-04 18:09  4%           ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2023-08-04 18:09 UTC (permalink / raw)
  To: gdb-patches; +Cc: tdevries, Matheus Branco Borella

On 8/3/23 04:13 Tom de Vries <tdevries@suse.de> wrote:
> I applied this patch and tried to build it, and ran into:

I had put an extra space where it shouldn't have been before I submitted the
patch. Should work now.

On 8/3/23 04:29 Tom de Vries <tdevries@suse.de> wrote:
> There seem to be a lot of white space issues:

Thanks for pointing them out. They should be fixed now

> So, you could add this to the commit message:

Sure thing.

---
This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table". The way this name is both saved and
applied upon an index being loaded in mirrors how it is done in
`cooked_index_functions`, more specifically, the full name of the main function
symbol is saved and `set_objfile_main_name` is used to apply it after it is
loaded.

The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.

In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.

PR symtab/24549
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
---
 gdb/NEWS                    |  2 ++
 gdb/doc/gdb.texinfo         | 23 +++++++++++++--
 gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
 gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
 gdb/dwarf2/read.c           | 13 +++++++--
 gdb/dwarf2/read.h           | 12 ++++++++
 6 files changed, 142 insertions(+), 11 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index d97e3c15a8..2d940d1f79 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -3,6 +3,8 @@
 
 *** Changes since GDB 13
 
+* DWARF index now contains information about the main function.
+
 * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication feature string
   has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' feature
   string.
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index d1059e0cb7..3b2fdcd19e 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -49093,13 +49093,14 @@ unless otherwise noted:
 
 @enumerate
 @item
-The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
+The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
 Version 4 uses a different hashing function from versions 5 and 6.
 Version 6 includes symbols for inlined functions, whereas versions 4
 and 5 do not.  Version 7 adds attributes to the CU indices in the
 symbol table.  Version 8 specifies that symbols from DWARF type units
 (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and not the
-compilation unit (@samp{DW_TAG_comp_unit}) using the type.
+compilation unit (@samp{DW_TAG_comp_unit}) using the type.  Version 9 adds
+the name and the language of the main function to the index.
 
 @value{GDBN} will only read version 4, 5, or 6 indices
 by specifying @code{set use-deprecated-index-sections on}.
@@ -49120,6 +49121,9 @@ The offset, from the start of the file, of the address area.
 @item
 The offset, from the start of the file, of the symbol table.
 
+@item
+The offset, from the start of the file, of the shortcut table.
+
 @item
 The offset, from the start of the file, of the constant pool.
 @end enumerate
@@ -49196,6 +49200,21 @@ don't currently have a simple description of the canonicalization
 algorithm; if you intend to create new index sections, you must read
 the code.
 
+@item The shortcut table
+This is a data structure with the following fields:
+
+@table @asis
+@item Language of main
+A 32-bit little-endian value indicating the language of the main function as a
+@code{DW_LANG_} constant.  This value will be zero if main function information
+is not present.
+
+@item Name of main
+An @code{offset_type} value indicating the offset of the main function's name
+in the constant pool.  This value must be ignored if the value for the language
+of main is zero.
+@end table
+
 @item
 The constant pool.  This is simply a bunch of bytes.  It is organized
 so that alignment is correct: CU vectors are stored first, followed by
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 62c2cc6ac7..7117a5184b 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
 		  const data_buf &types_cu_list,
 		  const data_buf &addr_vec,
 		  const data_buf &symtab_vec,
-		  const data_buf &constant_pool)
+		  const data_buf &constant_pool,
+		  const data_buf &shortcuts)
 {
   data_buf contents;
-  const offset_type size_of_header = 6 * sizeof (offset_type);
+  const offset_type size_of_header = 7 * sizeof (offset_type);
   offset_type total_len = size_of_header;
 
   /* The version number.  */
-  contents.append_offset (8);
+  contents.append_offset (9);
 
   /* The offset of the CU list from the start of the file.  */
   contents.append_offset (total_len);
@@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
   contents.append_offset (total_len);
   total_len += symtab_vec.size ();
 
+  /* The offset of the shortcut table from the start of the file.  */
+  contents.append_offset (total_len);
+  total_len += shortcuts.size ();
+
   /* The offset of the constant pool from the start of the file.  */
   contents.append_offset (total_len);
   total_len += constant_pool.size ();
@@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
   types_cu_list.file_write (out_file);
   addr_vec.file_write (out_file);
   symtab_vec.file_write (out_file);
+  shortcuts.file_write (out_file);
   constant_pool.file_write (out_file);
 
   assert_file_size (out_file, total_len);
@@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
     }
 }
 
+/* Write shortcut information. */
+
+static void
+write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
+		       data_buf& cpool)
+{
+  const auto main_info = table->get_main ();
+  size_t main_name_offset = 0;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
+
+  if (main_info != nullptr)
+    {
+      dw_lang = main_info->per_cu->dw_lang;
+
+      if (dw_lang != 0)
+	{
+	  auto_obstack obstack;
+	  const auto main_name = main_info->full_name (&obstack, true);
+
+	  main_name_offset = cpool.size ();
+	  cpool.append_cstr0 (main_name);
+	}
+    }
+
+  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
+  shortcuts.append_offset (main_name_offset);
+}
+
 /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
    If OBJFILE has an associated dwz file, write contents of a .gdb_index
    section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
@@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
 
   write_hash_table (&symtab, symtab_vec, constant_pool);
 
+  data_buf shortcuts;
+  write_shortcuts_table (table, shortcuts, constant_pool);
+
   write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
-		   symtab_vec, constant_pool);
+		   symtab_vec, constant_pool, shortcuts);
 
   if (dwz_out_file != NULL)
-    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
+    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
   else
     gdb_assert (dwz_cu_list.empty ());
 }
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 1006386cb2..f09c5ba234 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
   /* A pointer to the constant pool.  */
   gdb::array_view<const gdb_byte> constant_pool;
 
+  /* The shortcut table data. */
+  gdb::array_view<const gdb_byte> shortcut_table;
+
   /* Return the index into the constant pool of the name of the IDXth
      symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
 
   mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
 			     (per_objfile->per_bfd->index_table.get ()));
+
   gdb_printf (".gdb_index: version %d\n", index->version);
   gdb_printf ("\n");
 }
@@ -583,7 +587,7 @@ to use the section anyway."),
 
   /* Indexes with higher version than the one supported by GDB may be no
      longer backward compatible.  */
-  if (version > 8)
+  if (version > 9)
     return 0;
 
   map->version = version;
@@ -608,8 +612,17 @@ to use the section anyway."),
   map->symbol_table
     = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
 						    symbol_table_end));
-
   ++i;
+
+  if (version >= 9)
+    {
+      const gdb_byte *shortcut_table = addr + metadata[i];
+      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
+      map->shortcut_table
+	= gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
+      ++i;
+    }
+
   map->constant_pool = buffer.slice (metadata[i]);
 
   if (map->constant_pool.empty () && !map->symbol_table.empty ())
@@ -763,6 +776,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
     = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
 }
 
+/* Sets the name and language of the main function from the shortcut table. */
+
+static void
+set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile,
+			      mapped_gdb_index *index)
+{
+  const auto expected_size = 4 + sizeof (offset_type);
+  if (index->shortcut_table.size () < expected_size)
+    /* The data in the section is not present, is corrupted or is in a version
+     * we don't know about. Regardless, we can't make use of it. */
+    return;
+
+  auto ptr = index->shortcut_table.data ();
+  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
+  if (dw_lang >= DW_LANG_hi_user)
+    {
+      complaint (_(".gdb_index shortcut table has invalid main language %u"),
+		   (unsigned) dw_lang);
+      return;
+    }
+  if (dw_lang == 0)
+    {
+      /* Don't bother if the language for the main symbol was not known or if
+       * there was no main symbol at all when the index was built. */
+      return;
+    }
+  ptr += 4;
+
+  const auto lang = dwarf_lang_to_enum_language (dw_lang);
+  const auto name_offset = extract_unsigned_integer (ptr,
+						     sizeof (offset_type),
+						     BFD_ENDIAN_LITTLE);
+  const auto name = (const char*) (index->constant_pool.data () + name_offset);
+
+  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
+}
+
 /* See read-gdb-index.h.  */
 
 int
@@ -848,6 +898,8 @@ dwarf2_read_gdb_index
 
   create_addrmap_from_gdb_index (per_objfile, map.get ());
 
+  set_main_name_from_gdb_index (per_objfile, map.get ());
+
   per_bfd->index_table = std::move (map);
   per_bfd->quick_file_names_table =
     create_quick_file_names_table (per_bfd->all_units.size ());
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 4828409222..89acd94c05 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -17745,7 +17745,9 @@ leb128_size (const gdb_byte *buf)
     }
 }
 
-static enum language
+/* Converts DWARF language names to GDB language names. */
+
+enum language
 dwarf_lang_to_enum_language (unsigned int lang)
 {
   enum language language;
@@ -21661,6 +21663,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
   /* Set the language we're debugging.  */
   attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
   enum language lang;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
   if (cu->producer != nullptr
       && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
     {
@@ -21669,18 +21672,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
 	 language detection we fall back to the DW_AT_producer
 	 string.  */
       lang = language_opencl;
+      dw_lang = DW_LANG_OpenCL;
     }
   else if (cu->producer != nullptr
 	   && strstr (cu->producer, "GNU Go ") != NULL)
     {
       /* Similar hack for Go.  */
       lang = language_go;
+      dw_lang = DW_LANG_Go;
     }
   else if (attr != nullptr)
-    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+    {
+      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+      dw_lang = (dwarf_source_language)attr->constant_value (0);
+    }
   else
     lang = pretend_language;
 
+  cu->per_cu->dw_lang = dw_lang;
   cu->language_defn = language_def (lang);
 
   switch (comp_unit_die->tag)
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 37023a2070..6707c400cf 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
      functions above.  */
   std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
 
+  /* The original DW_LANG_* value of the CU, as provided to us by
+   * DW_AT_language. It is interesting to keep this value around in cases where
+   * we can't use the values from the language enum, as the mapping to them is
+   * lossy, and, while that is usually fine, things like the index have an
+   * understandable bias towards not exposing internal GDB structures to the
+   * outside world, and so prefer to use DWARF constants in their stead. */
+  dwarf_source_language dw_lang;
+
   /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
   bool imported_symtabs_empty () const
   {
@@ -755,6 +763,10 @@ struct dwarf2_per_objfile
 		     std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
 };
 
+/* Converts DWARF language names to GDB language names. */
+
+enum language dwarf_lang_to_enum_language (unsigned int lang);
+
 /* Get the dwarf2_per_objfile associated to OBJFILE.  */
 
 dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);
-- 
2.41.0


^ permalink raw reply	[relevance 4%]

* Re: [PATCH] Add name_of_main and language_of_main to the DWARF index
  2023-07-07 15:00  4%       ` Matheus Branco Borella
  2023-07-07 18:00  0%         ` Eli Zaretskii
  2023-08-03  7:12  7%         ` Tom de Vries
@ 2023-08-03  7:29  7%         ` Tom de Vries
  2023-08-04 18:09  4%           ` [PATCH v2] " Matheus Branco Borella
  2 siblings, 1 reply; 65+ results
From: Tom de Vries @ 2023-08-03  7:29 UTC (permalink / raw)
  To: Matheus Branco Borella, eliz; +Cc: gdb-patches

On 7/7/23 17:00, Matheus Branco Borella via Gdb-patches wrote:
> Is there anything else I'm missing?

There seem to be a lot of white space issues:
...
$ git show --pretty=%s --check | grep -v '^\+'
Add name_of_main and language_of_main to the DWARF index

gdb/dwarf2/index-write.c:1084: indent with spaces.
gdb/dwarf2/index-write.c:1206: indent with spaces.
gdb/dwarf2/index-write.c:1217: indent with spaces.
gdb/dwarf2/index-write.c:1218: indent with spaces.
gdb/dwarf2/index-write.c:1219: indent with spaces.
gdb/dwarf2/index-write.c:1221: indent with spaces.
gdb/dwarf2/index-write.c:1222: indent with spaces.
gdb/dwarf2/index-write.c:1223: indent with spaces.
gdb/dwarf2/read-gdb-index.c:622: indent with spaces.
gdb/dwarf2/read-gdb-index.c:777: trailing whitespace.
gdb/dwarf2/read-gdb-index.c:778: indent with spaces.
gdb/dwarf2/read-gdb-index.c:791: indent with spaces.
gdb/dwarf2/read-gdb-index.c:803: trailing whitespace.
gdb/dwarf2/read-gdb-index.c:804: trailing whitespace, indent with spaces.
gdb/dwarf2/read-gdb-index.c:805: indent with spaces.
gdb/dwarf2/read.h:248: trailing whitespace.
gdb/dwarf2/read.h:251: trailing whitespace.
gdb/dwarf2/read.h:252: trailing whitespace.
...

Furthermore, I can confirm that this fixes regressions with target board 
gdb-index in test-cases:
- gdb.fortran/backtrace.exp 

- gdb.fortran/info-main.exp 

- gdb.dwarf2/main-subprogram.exp 

- gdb.fortran/mixed-lang-stack.exp

Some of those are KFAIL-ed for 
https://sourceware.org/bugzilla/show_bug.cgi?id=24549 .

So, you could add this to the commit message:
...
PR symtab/24549
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
...

Thanks,
- Tom

^ permalink raw reply	[relevance 7%]

* Re: [PATCH] Add name_of_main and language_of_main to the DWARF index
  2023-07-07 15:00  4%       ` Matheus Branco Borella
  2023-07-07 18:00  0%         ` Eli Zaretskii
@ 2023-08-03  7:12  7%         ` Tom de Vries
  2023-08-03  7:29  7%         ` Tom de Vries
  2 siblings, 0 replies; 65+ results
From: Tom de Vries @ 2023-08-03  7:12 UTC (permalink / raw)
  To: Matheus Branco Borella, eliz; +Cc: gdb-patches

On 7/7/23 17:00, Matheus Branco Borella via Gdb-patches wrote:
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo

I applied this patch and tried to build it, and ran into:
...
/data/vries/gdb/src/gdb/doc/gdb.texinfo:49297: warning: @item missing 
argument
/data/vries/gdb/src/gdb/doc/gdb.texinfo:49302: warning: @item missing 
argument
/data/vries/gdb/src/gdb/doc/gdb.texinfo:49308: warning: @item missing 
argument
/data/vries/gdb/src/gdb/doc/gdb.texinfo:49321: `@end' expected `table', 
but saw `enumerate'
make[3]: *** [Makefile:495: gdb.info] Error 1
...

Thanks,
- Tom

^ permalink raw reply	[relevance 7%]

* [PATCH] Add support for symbol addition to the Python API
  2023-07-04 15:14  7% ` Andrew Burgess
@ 2023-07-07 23:13  3%   ` Matheus Branco Borella
  2024-01-13  1:36  3%     ` [PATCH v2] " Matheus Branco Borella
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella @ 2023-07-07 23:13 UTC (permalink / raw)
  To: aburgess; +Cc: gdb-patches, Matheus Branco Borella

Andrew Burgess <aburgess@redhat.com> wrote:
> I started taking a look through this.  I didn't manage to build the code
> due to the use of C++17 features, so I've only given a couple of really
> minor bits of feedback.

My bad. My compiler defaults to C++17 and for some reason it didn't 
occur to me that GDB uses an earlier version. It should build with 
C++11 now.

> I think that adding a first simple test would be a solid idea, this will
> give reviewers something to play with, you can always expand the test
> later to cover more cases.

I've added one to py-objfile.exp, it builds a new object and adds a 
symbol to it, then looks it up. It's simple, but it should be able to
illustrate how this feature works.

> Could this make use of `language_enum` (from language.c)?

It could. Are the enum variants available from inside Python? If not, 
should I add them? I can't seem to find them there, but it could be
that I'm just not looking hard enough.

> Is this change really needed?

Nope.

> Likewise, I suspect this change is not needed.

Likewise, you'd be correct.

The branch I'm working on for this patch was spun off the branch for 
another one. I missed that these two had stayed in there when I was
taking out the other patch's changes. I've taken them out now, thanks
for pointing them out.

Anyway, looking forward to hearing your thoughts on this patch.

---
This patch adds support for symbol creation and registration. It currently
supports adding type symbols (VAR_DOMAIN/LOC_TYPEDEF), static symbols
(VAR_DOMAIN/LOC_STATIC) and goto target labels (LABEL_DOMAIN/LOC_LABEL). It
adds a new `gdb.ObjfileBuilder` type, with `add_type_symbol`,
`add_static_symbol` and `add_label_symbol` functions, allowing for the addition
of the aforementioned types of symbols.

Symbol addition is achieved by constructing a new objfile with msyms and full
symbols reflecting the symbols that were previously added to the builder through
its methods. This approach lets us get most of the way to full symbol addition
support, but due to not being backed up by BFD, it does have a few limitations,
which I will go over them here.

PC-based minsym lookup does not work, because those would require a more
complete set of BFD structures than I think would be good practice to pretend to
have them all and crash GDB later on when it expects things to be there that
aren't.

In the same vein, PC-based function name lookup also does not work, although
there may be a way to have the feature work using overlays. However, this patch
does not make an attempt to do so

For now, though, this implementation lets us add symbols that can be used to,
for instance, query registered types through `gdb.lookup_type`, and allows
reverse engineering GDB plugins (such as Pwndbg [0] or decomp2gdb [1]) to add
symbols directly through the Python API instead of having to compile an object
file for the target architecture that they later load through the add-symbol-
file command. [2]

[0] https://github.com/pwndbg/pwndbg/
[1] https://github.com/mahaloz/decomp2dbg
[2] https://github.com/mahaloz/decomp2dbg/blob/055be6b2001954d00db2d683f20e9b714af75880/decomp2dbg/clients/gdb/symbol_mapper.py#L235-L243]
---
 gdb/Makefile.in                         |   1 +
 gdb/python/py-objfile-builder.c         | 642 ++++++++++++++++++++++++
 gdb/testsuite/gdb.python/py-objfile.exp |  11 +
 3 files changed, 654 insertions(+)
 create mode 100644 gdb/python/py-objfile-builder.c

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 14b5dd0bad..c0eecb81b6 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -417,6 +417,7 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-micmd.c \
 	python/py-newobjfileevent.c \
 	python/py-objfile.c \
+	python/py-objfile-builder.c \
 	python/py-param.c \
 	python/py-prettyprint.c \
 	python/py-progspace.c \
diff --git a/gdb/python/py-objfile-builder.c b/gdb/python/py-objfile-builder.c
new file mode 100644
index 0000000000..dd93a95138
--- /dev/null
+++ b/gdb/python/py-objfile-builder.c
@@ -0,0 +1,642 @@
+/* Python class allowing users to build and install objfiles.
+
+   Copyright (C) 2013-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "quick-symbol.h"
+#include "objfiles.h"
+#include "minsyms.h"
+#include "buildsym.h"
+#include "observable.h"
+#include <string>
+#include <unordered_map>
+#include <type_traits>
+#include <optional>
+
+/* This module relies on symbols being trivially copyable. */
+static_assert (std::is_trivially_copyable<struct symbol>::value);
+
+/* Interface to be implemented for symbol types supported by this interface. */
+class symbol_def
+{
+public:
+  virtual ~symbol_def () = default;
+
+  virtual void register_msymbol (const std::string& name, 
+                                 struct objfile* objfile,
+                                 minimal_symbol_reader& reader) const = 0;
+  virtual void register_symbol (const std::string& name, 
+                                struct objfile* objfile,
+                                buildsym_compunit& builder) const = 0;
+};
+
+/* Shorthand for a unique_ptr to a symbol. */
+typedef std::unique_ptr<symbol_def> symbol_def_up;
+
+/* Data being held by the gdb.ObjfileBuilder.
+ *
+ * This structure needs to have its constructor run in order for its lifetime
+ * to begin. Because of how Python handles its objects, we can't just reconstruct
+ * the object structure as a whole, as that would overwrite things the runtime
+ * cares about, so these fields had to be broken off into their own structure. */
+struct objfile_builder_data
+{
+  /* Indicates whether the objfile has already been built and added to the
+   * current context. We enforce that objfiles can't be installed twice. */
+  bool installed = false;
+
+  /* The symbols that will be added to new newly built objfile. */
+  std::unordered_map<std::string, symbol_def_up> symbols;
+
+  /* The name given to this objfile. */
+  std::string name;
+
+  /* Adds a symbol definition with the given name. */
+  bool add_symbol_def (std::string name, symbol_def_up&& symbol_def)
+  {
+    return std::get<1> (symbols.insert ({name, std::move (symbol_def)}));
+  }
+};
+
+/* Structure backing the gdb.ObjfileBuilder type. */
+
+struct objfile_builder_object
+{
+  PyObject_HEAD
+
+  /* See objfile_builder_data. */
+  objfile_builder_data inner;
+};
+
+extern PyTypeObject objfile_builder_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("objfile_builder_object_type");
+
+/* Constructs a new objfile from an objfile_builder. */
+static struct objfile *
+build_new_objfile (const objfile_builder_object& builder)
+{
+  gdb_assert (!builder.inner.installed);
+
+  auto of = objfile::make (nullptr, builder.inner.name.c_str (), 
+                           OBJF_READNOW | OBJF_NOT_FILENAME, 
+                           nullptr);
+
+  /* Setup object file sections. */
+  of->sections_start = OBSTACK_CALLOC (&of->objfile_obstack,
+                                       4,
+                                       struct obj_section);
+  of->sections_end = of->sections_start + 4;
+
+  const auto init_section = [&](struct obj_section* sec)
+    {
+      sec->objfile = of;
+      sec->ovly_mapped = false;
+      
+      /* We're not being backed by BFD. So we have no real section data to speak 
+       * of, but, because specifying sections requires BFD structures, we have to
+       * play a little game of predend. */
+      auto bfd = obstack_new<bfd_section> (&of->objfile_obstack);
+      bfd->vma = 0;
+      bfd->size = 0;
+      bfd->lma = 0; /* Prevents insert_section_p in objfiles.c from trying to 
+                     * dereference the bfd structure we don't have. */
+      sec->the_bfd_section = bfd;
+    };
+  init_section (&of->sections_start[0]);
+  init_section (&of->sections_start[1]);
+  init_section (&of->sections_start[2]);
+  init_section (&of->sections_start[4]);
+
+  of->sect_index_text = 0;
+  of->sect_index_data = 1;
+  of->sect_index_rodata = 2;
+  of->sect_index_bss = 3;
+
+  /* While buildsym_compunit expects the symbol function pointer structure to be
+   * present, it also gracefully handles the case where all of the pointers in
+   * it are set to null. So, make sure we have a valid structure, but there's
+   * no need to do more than that. */
+  of->sf = obstack_new<struct sym_fns> (&of->objfile_obstack);
+
+  /* We need to tell GDB what architecture the objfile uses. */
+  if (has_stack_frames ())
+    of->per_bfd->gdbarch = get_frame_arch (get_selected_frame (nullptr));
+  else
+    of->per_bfd->gdbarch = target_gdbarch ();
+
+  /* Construct the minimal symbols. */
+  minimal_symbol_reader msym (of);
+  for (const auto& element : builder.inner.symbols)
+      std::get<1> (element)->register_msymbol (std::get<0> (element), of, msym);
+  msym.install ();
+
+  /* Construct the full symbols. */
+  buildsym_compunit fsym (of, builder.inner.name.c_str (), "", language_c, 0);
+  for (const auto& element : builder.inner.symbols)
+    std::get<1> (element)->register_symbol (std::get<0> (element), of, fsym);
+  fsym.end_compunit_symtab (0);
+
+  /* Notify the rest of GDB this objfile has been created. Requires 
+   * OBJF_NOT_FILENAME to be used, to prevent any of the functions attatched to
+   * the observable from trying to dereference of->bfd. */
+  gdb::observers::new_objfile.notify (of);
+
+  return of;
+}
+
+/* Implementation of the quick symbol functions used by the objfiles created 
+ * using this interface. Turns out we have our work cut out for us here, as we
+ * can get something that works by effectively just using no-ops, and the rest
+ * of the code will fall back to using just the minimal and full symbol data. It
+ * is important to note, though, that this only works because we're marking our 
+ * objfile with `OBJF_READNOW`. */
+class runtime_objfile : public quick_symbol_functions
+{
+  virtual bool has_symbols (struct objfile*) override
+  {
+    return false;
+  }
+
+  virtual void dump (struct objfile *objfile) override
+  {
+  }
+
+  virtual void expand_matching_symbols
+    (struct objfile *,
+     const lookup_name_info &lookup_name,
+     domain_enum domain,
+     int global,
+     symbol_compare_ftype *ordered_compare) override
+  {
+  }
+
+  virtual bool expand_symtabs_matching
+    (struct objfile *objfile,
+     gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
+     const lookup_name_info *lookup_name,
+     gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
+     gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
+     block_search_flags search_flags,
+     domain_enum domain,
+     enum search_domain kind) override
+  {
+    return true;
+  }
+};
+
+
+/* Create a new symbol alocated in the given objfile. */
+
+static struct symbol *
+new_symbol
+  (struct objfile *objfile,
+   const char *name,
+   enum language language,
+   enum domain_enum domain,
+   enum address_class aclass,
+   short section_index)
+{
+  auto symbol = new (&objfile->objfile_obstack) struct symbol ();
+  OBJSTAT (objfile, n_syms++);
+
+  symbol->set_language (language, &objfile->objfile_obstack);
+  symbol->compute_and_set_names (gdb::string_view (name), true, 
+                                 objfile->per_bfd);
+
+  symbol->set_is_objfile_owned (true);
+  symbol->set_section_index (section_index);
+  symbol->set_domain (domain);
+  symbol->set_aclass_index (aclass);
+
+  return symbol;
+}
+
+/* Parses a language from a string (coming from Python) into a language 
+ * variant. */
+
+static enum language
+parse_language (const char *language)
+{
+  if (strcmp (language, "c") == 0)
+    return language_c;
+  else if (strcmp (language, "objc") == 0)
+    return language_objc;
+  else if (strcmp (language, "cplus") == 0)
+    return language_cplus;
+  else if (strcmp (language, "d") == 0)
+    return language_d;
+  else if (strcmp (language, "go") == 0)
+    return language_go;
+  else if (strcmp (language, "fortran") == 0)
+    return language_fortran;
+  else if (strcmp (language, "m2") == 0)
+    return language_m2;
+  else if (strcmp (language, "asm") == 0)
+    return language_asm;
+  else if (strcmp (language, "pascal") == 0)
+    return language_pascal;
+  else if (strcmp (language, "opencl") == 0)
+    return language_opencl;
+  else if (strcmp (language, "rust") == 0)
+    return language_rust;
+  else if (strcmp (language, "ada") == 0)
+    return language_ada;
+  else
+    return language_unknown;
+}
+
+/* Convenience function that performs a checked coversion from a PyObject to
+ * a objfile_builder_object structure pointer. */
+inline static struct objfile_builder_object *
+validate_objfile_builder_object (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_builder_object_type))
+    return nullptr;
+  return (struct objfile_builder_object*) self;
+}
+
+/* Registers symbols added with add_label_symbol. */
+class typedef_symbol_def : public symbol_def
+{
+public:
+  struct type* type;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+                                 struct objfile *objfile,
+                                 minimal_symbol_reader& reader) const override
+  {
+  }
+
+  virtual void register_symbol (const std::string& name,
+                                struct objfile *objfile,
+                                buildsym_compunit& builder) const override
+  {
+    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
+                              LOC_TYPEDEF, objfile->sect_index_text);
+
+    symbol->set_type (type);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a type (LOC_TYPEDEF) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_type_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sOs";
+  static const char *keywords[] =
+    {
+      "name", "type", "language", NULL
+    };
+
+  PyObject *type_object;
+  const char *name;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &type_object, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::unique_ptr<typedef_symbol_def> (new typedef_symbol_def ());
+  def->type = type;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+
+/* Registers symbols added with add_label_symbol. */
+class label_symbol_def : public symbol_def
+{
+public:
+  CORE_ADDR address;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+                                 struct objfile *objfile,
+                                 minimal_symbol_reader& reader) const override
+  {
+    reader.record (name.c_str (), 
+                   unrelocated_addr (address), 
+                   minimal_symbol_type::mst_text);
+  }
+
+  virtual void register_symbol (const std::string& name,
+                                struct objfile *objfile,
+                                buildsym_compunit& builder) const override
+  {
+    printf("Adding label %s\n", name.c_str ());
+    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
+                              LOC_LABEL, objfile->sect_index_text);
+
+    symbol->set_value_address (address);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a label (LOC_LABEL) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_label_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sks";
+  static const char *keywords[] =
+    {
+      "name", "address", "language", NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &address, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::unique_ptr<label_symbol_def> (new label_symbol_def ());
+  def->address = address;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+/* Registers symbols added with add_static_symbol. */
+class static_symbol_def : public symbol_def
+{
+public:
+  CORE_ADDR address;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+                                 struct objfile *objfile,
+                                 minimal_symbol_reader& reader) const override
+  {
+    reader.record (name.c_str (), 
+                   unrelocated_addr (address), 
+                   minimal_symbol_type::mst_bss);
+  }
+
+  virtual void register_symbol (const std::string& name,
+                                struct objfile *objfile,
+                                buildsym_compunit& builder) const override
+  {
+    auto symbol = new_symbol (objfile, name.c_str (), language, VAR_DOMAIN,
+                              LOC_STATIC, objfile->sect_index_bss);
+
+    symbol->set_value_address (address);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a static (LOC_STATIC) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_static_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sks";
+  static const char *keywords[] =
+    {
+      "name", "address", "language", NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &address, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::unique_ptr<static_symbol_def> (new static_symbol_def ());
+  def->address = address;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+/* Builds the object file. */
+static PyObject *
+objbdpy_build (PyObject *self, PyObject *args)
+{
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (builder->inner.installed)
+    {
+      PyErr_SetString (PyExc_ValueError, "build() cannot be run twice on the \
+                       same object");
+      return nullptr;
+    }
+  auto of = build_new_objfile (*builder);
+  builder->inner.installed = true;
+
+
+  auto objpy = objfile_to_objfile_object (of).get ();
+  Py_INCREF(objpy);
+  return objpy;
+}
+
+/* Implements the __init__() function. */
+static int
+objbdpy_init (PyObject *self0, PyObject *args, PyObject *kw)
+{
+  static const char *format = "s";
+  static const char *keywords[] =
+    {
+      "name", NULL
+    };
+
+  const char *name;
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name))
+    return -1;
+
+  auto self = (objfile_builder_object *)self0;
+  self->inner.name = name;
+  self->inner.symbols.clear ();
+
+  return 0;
+}
+
+/* The function handling construction of the ObjfileBuilder object. 
+ *
+ * We need to have a custom function here as, even though Python manages the 
+ * memory backing the object up, it assumes clearing the memory is enough to
+ * begin its lifetime, which is not the case here, and would lead to undefined 
+ * behavior as soon as we try to use it in any meaningful way.
+ * 
+ * So, what we have to do here is manually begin the lifecycle of our new object
+ * by constructing it in place, using the memory region Python just allocated
+ * for us. This ensures the object will have already started its lifetime by 
+ * the time we start using it. */
+static PyObject *
+objbdpy_new (PyTypeObject *subtype, PyObject *args, PyObject *kwds)
+{
+  objfile_builder_object *region = 
+    (objfile_builder_object *) subtype->tp_alloc(subtype, 1);
+  gdb_assert ((size_t)region % alignof (objfile_builder_object) == 0);
+  gdb_assert (region != nullptr);
+
+  new (&region->inner) objfile_builder_data ();
+
+  return (PyObject *)region;
+}
+
+/* The function handling destruction of the ObjfileBuilder object. 
+ *
+ * While running the destructor of our object isn't _strictly_ necessary, we
+ * would very much like for the memory it owns to be freed, but, because it was
+ * constructed in place, we have to call its destructor manually here. */
+static void 
+objbdpy_dealloc (PyObject *self0)
+{
+  auto self = (objfile_builder_object *)self0;
+  PyTypeObject *tp = Py_TYPE(self);
+
+  self->inner.~objfile_builder_data ();
+
+  tp->tp_free(self);
+  Py_DECREF(tp);
+}
+
+static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
+gdbpy_initialize_objfile_builder (void)
+{
+  if (PyType_Ready (&objfile_builder_object_type) < 0)
+    return -1;
+
+  return gdb_pymodule_addobject (gdb_module, "ObjfileBuilder",
+				 (PyObject *) &objfile_builder_object_type);
+}
+
+GDBPY_INITIALIZE_FILE (gdbpy_initialize_objfile_builder);
+
+static PyMethodDef objfile_builder_object_methods[] =
+{
+  { "build", (PyCFunction) objbdpy_build, METH_NOARGS,
+    "build ().\n\
+Build a new objfile containing the symbols added to builder." },
+  { "add_type_symbol", (PyCFunction) objbdpy_add_type_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_type_symbol (name [str], type [gdb.Type], language [str]).\n\
+Add a new type symbol in the given language, associated with the given type." },
+  { "add_label_symbol", (PyCFunction) objbdpy_add_label_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_label_symbol (name [str], address [int], language [str]).\n\
+Add a new label symbol in the given language, at the given address." },
+  { "add_static_symbol", (PyCFunction) objbdpy_add_static_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_static_symbol (name [str], address [int], language [str]).\n\
+Add a new static symbol in the given language, at the given address." },
+  { NULL }
+};
+
+PyTypeObject objfile_builder_object_type = {
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.ObjfileBuilder",               /* tp_name */
+  sizeof (objfile_builder_object),    /* tp_basicsize */
+  0,                                  /* tp_itemsize */
+  objbdpy_dealloc,                    /* tp_dealloc */
+  0,                                  /* tp_vectorcall_offset */
+  nullptr,                            /* tp_getattr */
+  nullptr,                            /* tp_setattr */
+  nullptr,                            /* tp_compare */
+  nullptr,                            /* tp_repr */
+  nullptr,                            /* tp_as_number */
+  nullptr,                            /* tp_as_sequence */
+  nullptr,                            /* tp_as_mapping */
+  nullptr,                            /* tp_hash  */
+  nullptr,                            /* tp_call */
+  nullptr,                            /* tp_str */
+  nullptr,                            /* tp_getattro */
+  nullptr,                            /* tp_setattro */
+  nullptr,                            /* tp_as_buffer */
+  Py_TPFLAGS_DEFAULT,                 /* tp_flags */
+  "GDB object file builder",          /* tp_doc */
+  nullptr,                            /* tp_traverse */
+  nullptr,                            /* tp_clear */
+  nullptr,                            /* tp_richcompare */
+  0,                                  /* tp_weaklistoffset */
+  nullptr,                            /* tp_iter */
+  nullptr,                            /* tp_iternext */
+  objfile_builder_object_methods,     /* tp_methods */
+  nullptr,                            /* tp_members */
+  nullptr,                            /* tp_getset */
+  nullptr,                            /* tp_base */
+  nullptr,                            /* tp_dict */
+  nullptr,                            /* tp_descr_get */
+  nullptr,                            /* tp_descr_set */
+  0,                                  /* tp_dictoffset */
+  objbdpy_init,                       /* tp_init */
+  PyType_GenericAlloc,                /* tp_alloc */
+  objbdpy_new,                        /* tp_new */
+};
+
+
diff --git a/gdb/testsuite/gdb.python/py-objfile.exp b/gdb/testsuite/gdb.python/py-objfile.exp
index 61b9942de7..ab2413e317 100644
--- a/gdb/testsuite/gdb.python/py-objfile.exp
+++ b/gdb/testsuite/gdb.python/py-objfile.exp
@@ -173,3 +173,14 @@ gdb_py_test_silent_cmd "python objfile = gdb.objfiles()\[0\]" \
     "get first objfile" 1
 gdb_file_cmd ${binfile}
 gdb_test "python print(objfile)" "<gdb.Objfile \\\(invalid\\\)>"
+
+# Test adding a new objfile.
+gdb_py_test_silent_cmd "python builder = gdb.ObjfileBuilder(\"test_objfile\")" \
+    "Create an object file builder" 1
+gdb_test "python print(repr(builder))" "<gdb.ObjfileBuilder .*>"
+
+gdb_py_test_silent_cmd "python builder.add_static_symbol(name = \"test\", address = 0, language = \"c\")" \
+    "Add a static symbol to the object file builder" 1
+gdb_py_test_silent_cmd "python objfile = builder.build()" \
+    "Build an object from an objcect file builder" 1
+gdb_test "python print(repr(objfile.lookup_static_symbol(\"test\")))" "<gdb.Symbol .*>"
-- 
2.40.1


^ permalink raw reply	[relevance 3%]

* Re: [PATCH] Add name_of_main and language_of_main to the DWARF index
  2023-07-07 15:00  4%       ` Matheus Branco Borella
@ 2023-07-07 18:00  0%         ` Eli Zaretskii
  2023-08-04 20:55  0%           ` Tom de Vries
  2023-08-03  7:12  7%         ` Tom de Vries
  2023-08-03  7:29  7%         ` Tom de Vries
  2 siblings, 1 reply; 65+ results
From: Eli Zaretskii @ 2023-07-07 18:00 UTC (permalink / raw)
  To: Matheus Branco Borella; +Cc: gdb-patches

> From: Matheus Branco Borella <dark.ryu.550@gmail.com>
> Cc: gdb-patches@sourceware.org,
> 	Matheus Branco Borella <dark.ryu.550@gmail.com>
> Date: Fri,  7 Jul 2023 12:00:22 -0300
> 
> Eli Zaretskii <eliz@gnu.org> wrote:
> > Your assignment is not on file yet, AFAICT.  Was the paperwork
> > completed, i.e. did you get a copy of the assignment signed by you and
> > by the FSF?  If not, you need to wait some more.
> 
> Huh, that's weird. I do have the copy, signed by both parties. Maybe it just 
> hasn't been filed yet? If you want, I could forward it to you.

I see now that your assignment was added, but it's only for Emacs, not
for GDB.

> > Thanks.  The documentation parts are OK, but please fix the text to
> > leave two spaces between sentences, not one.
> 
> Alright, I've fixed that.
> 
> Is there anything else I'm missing?

There's still the issue of copyright assignment for GDB contributions,
AFAICT.

^ permalink raw reply	[relevance 0%]

* [PATCH] Add name_of_main and language_of_main to the DWARF index
  2023-07-01  5:47  0%     ` Eli Zaretskii
@ 2023-07-07 15:00  4%       ` Matheus Branco Borella
  2023-07-07 18:00  0%         ` Eli Zaretskii
                           ` (2 more replies)
  0 siblings, 3 replies; 65+ results
From: Matheus Branco Borella @ 2023-07-07 15:00 UTC (permalink / raw)
  To: eliz; +Cc: gdb-patches, Matheus Branco Borella

Eli Zaretskii <eliz@gnu.org> wrote:
> Your assignment is not on file yet, AFAICT.  Was the paperwork
> completed, i.e. did you get a copy of the assignment signed by you and
> by the FSF?  If not, you need to wait some more.

Huh, that's weird. I do have the copy, signed by both parties. Maybe it just 
hasn't been filed yet? If you want, I could forward it to you.

> Thanks.  The documentation parts are OK, but please fix the text to
> leave two spaces between sentences, not one.

Alright, I've fixed that.

Is there anything else I'm missing?

---
This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table". The way this name is both saved and
applied upon an index being loaded in mirrors how it is done in
`cooked_index_functions`, more specifically, the full name of the main function
symbol is saved and `set_objfile_main_name` is used to apply it after it is
loaded.

The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.

In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.
---
 gdb/NEWS                    |  2 ++
 gdb/doc/gdb.texinfo         | 23 +++++++++++++--
 gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
 gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
 gdb/dwarf2/read.c           | 13 +++++++--
 gdb/dwarf2/read.h           | 12 ++++++++
 6 files changed, 142 insertions(+), 11 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index d97e3c15a8..2d940d1f79 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -3,6 +3,8 @@
 
 *** Changes since GDB 13
 
+* DWARF index now contains information about the main function.
+
 * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication feature string
   has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' feature
   string.
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index d1059e0cb7..4c58ea5709 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -49093,13 +49093,14 @@ unless otherwise noted:
 
 @enumerate
 @item
-The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
+The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
 Version 4 uses a different hashing function from versions 5 and 6.
 Version 6 includes symbols for inlined functions, whereas versions 4
 and 5 do not.  Version 7 adds attributes to the CU indices in the
 symbol table.  Version 8 specifies that symbols from DWARF type units
 (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and not the
-compilation unit (@samp{DW_TAG_comp_unit}) using the type.
+compilation unit (@samp{DW_TAG_comp_unit}) using the type.  Version 9 adds
+the name and the language of the main function to the index.
 
 @value{GDBN} will only read version 4, 5, or 6 indices
 by specifying @code{set use-deprecated-index-sections on}.
@@ -49120,6 +49121,9 @@ The offset, from the start of the file, of the address area.
 @item
 The offset, from the start of the file, of the symbol table.
 
+@item
+The offset, from the start of the file, of the shortcut table.
+
 @item
 The offset, from the start of the file, of the constant pool.
 @end enumerate
@@ -49196,6 +49200,21 @@ don't currently have a simple description of the canonicalization
 algorithm; if you intend to create new index sections, you must read
 the code.
 
+@item
+The shortcut table.  This is a data structure with the following fields:
+
+@table @asis
+@item
+A 32-bit little-endian value indicating the language of the main function as a
+@code{DW_LANG_} constant.  This value will be zero if main function information
+is not present.
+
+@item
+An @code{offset_type} value indicating the offset of the main function's name 
+in the constant pool.  This value must be ignored if the value for the language
+of main is zero.
+@ end table
+
 @item
 The constant pool.  This is simply a bunch of bytes.  It is organized
 so that alignment is correct: CU vectors are stored first, followed by
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 62c2cc6ac7..ee6eaa7b87 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
 		  const data_buf &types_cu_list,
 		  const data_buf &addr_vec,
 		  const data_buf &symtab_vec,
-		  const data_buf &constant_pool)
+		  const data_buf &constant_pool,
+                  const data_buf &shortcuts)
 {
   data_buf contents;
-  const offset_type size_of_header = 6 * sizeof (offset_type);
+  const offset_type size_of_header = 7 * sizeof (offset_type);
   offset_type total_len = size_of_header;
 
   /* The version number.  */
-  contents.append_offset (8);
+  contents.append_offset (9);
 
   /* The offset of the CU list from the start of the file.  */
   contents.append_offset (total_len);
@@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
   contents.append_offset (total_len);
   total_len += symtab_vec.size ();
 
+  /* The offset of the shortcut table from the start of the file.  */
+  contents.append_offset (total_len);
+  total_len += shortcuts.size ();
+
   /* The offset of the constant pool from the start of the file.  */
   contents.append_offset (total_len);
   total_len += constant_pool.size ();
@@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
   types_cu_list.file_write (out_file);
   addr_vec.file_write (out_file);
   symtab_vec.file_write (out_file);
+  shortcuts.file_write (out_file);
   constant_pool.file_write (out_file);
 
   assert_file_size (out_file, total_len);
@@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
     }
 }
 
+/* Write shortcut information. */
+
+static void
+write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
+                       data_buf& cpool)
+{
+  const auto main_info = table->get_main ();
+  size_t main_name_offset = 0;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
+
+  if (main_info != nullptr)
+    {
+      dw_lang = main_info->per_cu->dw_lang;
+
+      if (dw_lang != 0)
+        {
+          auto_obstack obstack;
+          const auto main_name = main_info->full_name (&obstack, true);
+
+          main_name_offset = cpool.size ();
+          cpool.append_cstr0 (main_name);
+        }
+    }
+
+  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
+  shortcuts.append_offset (main_name_offset);
+}
+
 /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
    If OBJFILE has an associated dwz file, write contents of a .gdb_index
    section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
@@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
 
   write_hash_table (&symtab, symtab_vec, constant_pool);
 
+  data_buf shortcuts;
+  write_shortcuts_table (table, shortcuts, constant_pool);
+
   write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
-		   symtab_vec, constant_pool);
+		   symtab_vec, constant_pool, shortcuts);
 
   if (dwz_out_file != NULL)
-    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
+    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
   else
     gdb_assert (dwz_cu_list.empty ());
 }
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 1006386cb2..534c5a7fd7 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
   /* A pointer to the constant pool.  */
   gdb::array_view<const gdb_byte> constant_pool;
 
+  /* The shortcut table data. */
+  gdb::array_view<const gdb_byte> shortcut_table;
+
   /* Return the index into the constant pool of the name of the IDXth
      symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
 
   mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
 			     (per_objfile->per_bfd->index_table.get ()));
+
   gdb_printf (".gdb_index: version %d\n", index->version);
   gdb_printf ("\n");
 }
@@ -583,7 +587,7 @@ to use the section anyway."),
 
   /* Indexes with higher version than the one supported by GDB may be no
      longer backward compatible.  */
-  if (version > 8)
+  if (version > 9)
     return 0;
 
   map->version = version;
@@ -608,8 +612,17 @@ to use the section anyway."),
   map->symbol_table
     = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
 						    symbol_table_end));
-
   ++i;
+
+  if (version >= 9)
+    {
+      const gdb_byte *shortcut_table = addr + metadata[i];
+      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
+      map->shortcut_table
+        = gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
+      ++i;
+    }
+
   map->constant_pool = buffer.slice (metadata[i]);
 
   if (map->constant_pool.empty () && !map->symbol_table.empty ())
@@ -763,6 +776,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
     = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
 }
 
+/* Sets the name and language of the main function from the shortcut table. */
+
+static void
+set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile, 
+                              mapped_gdb_index *index)
+{
+  const auto expected_size = 4 + sizeof (offset_type);
+  if (index->shortcut_table.size () < expected_size)
+    /* The data in the section is not present, is corrupted or is in a version
+     * we don't know about. Regardless, we can't make use of it. */
+    return;
+
+  auto ptr = index->shortcut_table.data ();
+  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
+  if (dw_lang >= DW_LANG_hi_user)
+    {
+      complaint (_(".gdb_index shortcut table has invalid main language %u"),
+                   (unsigned) dw_lang);
+      return;
+    }
+  if (dw_lang == 0)
+    {
+      /* Don't bother if the language for the main symbol was not known or if
+       * there was no main symbol at all when the index was built. */
+      return;
+    }
+  ptr += 4;
+
+  const auto lang = dwarf_lang_to_enum_language (dw_lang);
+  const auto name_offset = extract_unsigned_integer (ptr, 
+                                                     sizeof (offset_type), 
+                                                     BFD_ENDIAN_LITTLE);
+  const auto name = (const char*) (index->constant_pool.data () + name_offset);
+
+  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
+}
+
 /* See read-gdb-index.h.  */
 
 int
@@ -848,6 +898,8 @@ dwarf2_read_gdb_index
 
   create_addrmap_from_gdb_index (per_objfile, map.get ());
 
+  set_main_name_from_gdb_index (per_objfile, map.get ());
+
   per_bfd->index_table = std::move (map);
   per_bfd->quick_file_names_table =
     create_quick_file_names_table (per_bfd->all_units.size ());
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 4828409222..89acd94c05 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -17745,7 +17745,9 @@ leb128_size (const gdb_byte *buf)
     }
 }
 
-static enum language
+/* Converts DWARF language names to GDB language names. */
+
+enum language
 dwarf_lang_to_enum_language (unsigned int lang)
 {
   enum language language;
@@ -21661,6 +21663,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
   /* Set the language we're debugging.  */
   attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
   enum language lang;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
   if (cu->producer != nullptr
       && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
     {
@@ -21669,18 +21672,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
 	 language detection we fall back to the DW_AT_producer
 	 string.  */
       lang = language_opencl;
+      dw_lang = DW_LANG_OpenCL;
     }
   else if (cu->producer != nullptr
 	   && strstr (cu->producer, "GNU Go ") != NULL)
     {
       /* Similar hack for Go.  */
       lang = language_go;
+      dw_lang = DW_LANG_Go;
     }
   else if (attr != nullptr)
-    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+    {
+      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+      dw_lang = (dwarf_source_language)attr->constant_value (0);
+    }
   else
     lang = pretend_language;
 
+  cu->per_cu->dw_lang = dw_lang;
   cu->language_defn = language_def (lang);
 
   switch (comp_unit_die->tag)
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 37023a2070..1235f62bfc 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
      functions above.  */
   std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
 
+  /* The original DW_LANG_* value of the CU, as provided to us by 
+   * DW_AT_language. It is interesting to keep this value around in cases where
+   * we can't use the values from the language enum, as the mapping to them is
+   * lossy, and, while that is usually fine, things like the index have an 
+   * understandable bias towards not exposing internal GDB structures to the 
+   * outside world, and so prefer to use DWARF constants in their stead. */
+  dwarf_source_language dw_lang;
+
   /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
   bool imported_symtabs_empty () const
   {
@@ -755,6 +763,10 @@ struct dwarf2_per_objfile
 		     std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
 };
 
+/* Converts DWARF language names to GDB language names. */
+
+enum language dwarf_lang_to_enum_language (unsigned int lang);
+
 /* Get the dwarf2_per_objfile associated to OBJFILE.  */
 
 dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);
-- 
2.40.1


^ permalink raw reply	[relevance 4%]

* Re: [PATCH] Add support for symbol addition to the Python API
  2023-05-27  1:24  3% [PATCH] Add support for symbol addition to the Python API Matheus Branco Borella
  2023-06-27  3:53 14% ` [PING] " Matheus Branco Borella
@ 2023-07-04 15:14  7% ` Andrew Burgess
  2023-07-07 23:13  3%   ` Matheus Branco Borella
  1 sibling, 1 reply; 65+ results
From: Andrew Burgess @ 2023-07-04 15:14 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, gdb-patches
  Cc: Matheus Branco Borella

Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
writes:

> Disclaimer:
>
> This patch is a rework of a six-month old patch I submitted to the mailing list
> that considerably reduces the hackyness of the original solution to the problem,
> now that I've had more time to read through and understand how symbols are 
> handled and searched for inside GDB. So, I'd like to please ask for comments on 
> things I can still improve in this patch, before I resubmit it. I also plan to 
> add tests to it once I'm more secure about the approach I'm taking to solve the
> problem now.
>
> The interfaces in this patch can be tested like so:
> ```
> (gdb) pi
>>>> builder = gdb.ObjfileBuilder(name = "some_name")
>>>> builder.add_static_symbol(name = "some_sym", address = 0x41414141, 
>         language = "c")
>>>> objfile = builder.build()
> ```
>
> ---
>
> This patch adds support for symbol creation and registration. It currently
> supports adding type symbols (VAR_DOMAIN/LOC_TYPEDEF), static symbols
> (VAR_DOMAIN/LOC_STATIC) and goto target labels (LABEL_DOMAIN/LOC_LABEL). It
> adds a new `gdb.ObjfileBuilder` type, with `add_type_symbol`,
> `add_static_symbol` and `add_label_symbol` functions, allowing for the addition
> of the aforementioned types of symbols.
>
> Symbol addition is achieved by constructing a new objfile with msyms and full
> symbols reflecting the symbols that were previously added to the builder through
> its methods. This approach lets us get most of the way to full symbol addition
> support, but due to not being backed up by BFD, it does have a few limitations,
> which I will go over them here.
>
> PC-based minsym lookup does not work, because those would require a more
> complete set of BFD structures than I think would be good practice to pretend to
> have them all and crash GDB later on when it expects things to be there that
> aren't.
>
> In the same vein, PC-based function name lookup also does not work, although
> there may be a way to have the feature work using overlays. However, this patch
> does not make an attempt to do so
>
> For now, though, this implementation lets us add symbols that can be used to,
> for instance, query registered types through `gdb.lookup_type`, and allows
> reverse engineering GDB plugins (such as Pwndbg [0] or decomp2gdb [1]) to add
> symbols directly through the Python API instead of having to compile an object
> file for the target architecture that they later load through the add-symbol-
> file command. [2]

I started taking a look through this.  I didn't manage to build the code
due to the use of C++17 features, so I've only given a couple of really
minor bits of feedback.

I think that adding a first simple test would be a solid idea, this will
give reviewers something to play with, you can always expand the test
later to cover more cases.

>
> [0] https://github.com/pwndbg/pwndbg/
> [1] https://github.com/mahaloz/decomp2dbg
> [2] https://github.com/mahaloz/decomp2dbg/blob/055be6b2001954d00db2d683f20e9b714af75880/decomp2dbg/clients/gdb/symbol_mapper.py#L235-L243]
> ---
>  gdb/Makefile.in                 |   1 +
>  gdb/python/py-objfile-builder.c | 648 ++++++++++++++++++++++++++++++++
>  gdb/python/py-objfile.c         |   1 +
>  gdb/python/python-internal.h    |   1 +
>  4 files changed, 651 insertions(+)
>  create mode 100644 gdb/python/py-objfile-builder.c
>
> diff --git a/gdb/Makefile.in b/gdb/Makefile.in
> index 14b5dd0bad..c0eecb81b6 100644
> --- a/gdb/Makefile.in
> +++ b/gdb/Makefile.in
> @@ -417,6 +417,7 @@ SUBDIR_PYTHON_SRCS = \
>  	python/py-micmd.c \
>  	python/py-newobjfileevent.c \
>  	python/py-objfile.c \
> +	python/py-objfile-builder.c \
>  	python/py-param.c \
>  	python/py-prettyprint.c \
>  	python/py-progspace.c \
> diff --git a/gdb/python/py-objfile-builder.c b/gdb/python/py-objfile-builder.c
> new file mode 100644
> index 0000000000..1e3110c613
> --- /dev/null
> +++ b/gdb/python/py-objfile-builder.c
> @@ -0,0 +1,648 @@
> +/* Python class allowing users to build and install objfiles.
> +
> +   Copyright (C) 2013-2023 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include "defs.h"
> +#include "python-internal.h"
> +#include "quick-symbol.h"
> +#include "objfiles.h"
> +#include "minsyms.h"
> +#include "buildsym.h"
> +#include "observable.h"
> +#include <string>
> +#include <unordered_map>
> +#include <type_traits>
> +#include <optional>
> +
> +/* This module relies on symbols being trivially copyable. */
> +static_assert (std::is_trivially_copyable_v<struct symbol>);

I believe that std::is_trivially_copyable_v is a C++17 feature and
(currently) GDB is C++11.  There's actually a bunch of C++17 code in
this patch -- you'll either need to wait until GDB moves to C++17, or
update things to compile with C++11.

> +
> +/* Interface to be implemented for symbol types supported by this interface. */
> +class symbol_def
> +{
> +public:
> +  virtual ~symbol_def () = default;
> +
> +  virtual void register_msymbol (const std::string& name, 
> +                                 struct objfile* objfile,
> +                                 minimal_symbol_reader& reader) const = 0;
> +  virtual void register_symbol (const std::string& name, 
> +                                struct objfile* objfile,
> +                                buildsym_compunit& builder) const = 0;
> +};
> +
> +/* Shorthand for a unique_ptr to a symbol. */
> +typedef std::unique_ptr<symbol_def> symbol_def_up;
> +
> +/* Data being held by the gdb.ObjfileBuilder.
> + *
> + * This structure needs to have its constructor run in order for its lifetime
> + * to begin. Because of how Python handles its objects, we can't just reconstruct
> + * the object structure as a whole, as that would overwrite things the runtime
> + * cares about, so these fields had to be broken off into their own structure. */
> +struct objfile_builder_data
> +{
> +  /* Indicates whether the objfile has already been built and added to the
> +   * current context. We enforce that objfiles can't be installed twice. */
> +  bool installed = false;
> +
> +  /* The symbols that will be added to new newly built objfile. */
> +  std::unordered_map<std::string, symbol_def_up> symbols;
> +
> +  /* The name given to this objfile. */
> +  std::string name;
> +
> +  /* Adds a symbol definition with the given name. */
> +  bool add_symbol_def (std::string name, symbol_def_up&& symbol_def)
> +  {
> +    return std::get<1> (symbols.insert ({name, std::move (symbol_def)}));
> +  }
> +};
> +
> +/* Structure backing the gdb.ObjfileBuilder type. */
> +
> +struct objfile_builder_object
> +{
> +  PyObject_HEAD
> +
> +  /* See objfile_builder_data. */
> +  objfile_builder_data inner;
> +};
> +
> +extern PyTypeObject objfile_builder_object_type
> +    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("objfile_builder_object_type");
> +
> +/* Constructs a new objfile from an objfile_builder. */
> +static struct objfile *
> +build_new_objfile (const objfile_builder_object& builder)
> +{
> +  gdb_assert (!builder.inner.installed);
> +
> +  auto of = objfile::make (nullptr, builder.inner.name.c_str (), 
> +                           OBJF_READNOW | OBJF_NOT_FILENAME, 
> +                           nullptr);
> +
> +  /* Setup object file sections. */
> +  of->sections_start = OBSTACK_CALLOC (&of->objfile_obstack,
> +                                       4,
> +                                       struct obj_section);
> +  of->sections_end = of->sections_start + 4;
> +
> +  const auto init_section = [&](struct obj_section* sec)
> +    {
> +      sec->objfile = of;
> +      sec->ovly_mapped = false;
> +      
> +      /* We're not being backed by BFD. So we have no real section data to speak 
> +       * of, but, because specifying sections requires BFD structures, we have to
> +       * play a little game of predend. */
> +      auto bfd = obstack_new<bfd_section> (&of->objfile_obstack);
> +      bfd->vma = 0;
> +      bfd->size = 0;
> +      bfd->lma = 0; /* Prevents insert_section_p in objfiles.c from trying to 
> +                     * dereference the bfd structure we don't have. */
> +      sec->the_bfd_section = bfd;
> +    };
> +  init_section (&of->sections_start[0]);
> +  init_section (&of->sections_start[1]);
> +  init_section (&of->sections_start[2]);
> +  init_section (&of->sections_start[4]);
> +
> +  of->sect_index_text = 0;
> +  of->sect_index_data = 1;
> +  of->sect_index_rodata = 2;
> +  of->sect_index_bss = 3;
> +
> +  /* While buildsym_compunit expects the symbol function pointer structure to be
> +   * present, it also gracefully handles the case where all of the pointers in
> +   * it are set to null. So, make sure we have a valid structure, but there's
> +   * no need to do more than that. */
> +  of->sf = obstack_new<struct sym_fns> (&of->objfile_obstack);
> +
> +  /* We need to tell GDB what architecture the objfile uses. */
> +  if (has_stack_frames ())
> +    of->per_bfd->gdbarch = get_frame_arch (get_selected_frame (nullptr));
> +  else
> +    of->per_bfd->gdbarch = target_gdbarch ();
> +
> +  /* Construct the minimal symbols. */
> +  minimal_symbol_reader msym (of);
> +  for (const auto& [name, symbol] : builder.inner.symbols)
> +      symbol->register_msymbol (name, of, msym);
> +  msym.install ();
> +
> +  /* Construct the full symbols. */
> +  buildsym_compunit fsym (of, builder.inner.name.c_str (), "", language_c, 0);
> +  for (const auto& [name, symbol] : builder.inner.symbols)
> +    symbol->register_symbol (name, of, fsym);
> +  fsym.end_compunit_symtab (0);
> +
> +  /* Notify the rest of GDB this objfile has been created. Requires 
> +   * OBJF_NOT_FILENAME to be used, to prevent any of the functions attatched to
> +   * the observable from trying to dereference of->bfd. */
> +  gdb::observers::new_objfile.notify (of);
> +
> +  return of;
> +}
> +
> +/* Implementation of the quick symbol functions used by the objfiles created 
> + * using this interface. Turns out we have our work cut out for us here, as we
> + * can get something that works by effectively just using no-ops, and the rest
> + * of the code will fall back to using just the minimal and full symbol data. It
> + * is important to note, though, that this only works because we're marking our 
> + * objfile with `OBJF_READNOW`. */
> +class runtime_objfile : public quick_symbol_functions
> +{
> +  virtual bool has_symbols (struct objfile*) override
> +  {
> +    return false;
> +  }
> +
> +  virtual void dump (struct objfile *objfile) override
> +  {
> +  }
> +
> +  virtual void expand_matching_symbols
> +    (struct objfile *,
> +     const lookup_name_info &lookup_name,
> +     domain_enum domain,
> +     int global,
> +     symbol_compare_ftype *ordered_compare) override
> +  {
> +  }
> +
> +  virtual bool expand_symtabs_matching
> +    (struct objfile *objfile,
> +     gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
> +     const lookup_name_info *lookup_name,
> +     gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
> +     gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
> +     block_search_flags search_flags,
> +     domain_enum domain,
> +     enum search_domain kind) override
> +  {
> +    return true;
> +  }
> +};
> +
> +
> +/* Create a new symbol alocated in the given objfile. */
> +
> +static struct symbol *
> +new_symbol
> +  (struct objfile *objfile,
> +   const char *name,
> +   enum language language,
> +   enum domain_enum domain,
> +   enum address_class aclass,
> +   short section_index)
> +{
> +  auto symbol = new (&objfile->objfile_obstack) struct symbol ();
> +  OBJSTAT (objfile, n_syms++);
> +
> +  symbol->set_language (language, &objfile->objfile_obstack);
> +  symbol->compute_and_set_names (gdb::string_view (name), true, 
> +                                 objfile->per_bfd);
> +
> +  symbol->set_is_objfile_owned (true);
> +  symbol->set_section_index (section_index);
> +  symbol->set_domain (domain);
> +  symbol->set_aclass_index (aclass);
> +
> +  return symbol;
> +}
> +
> +/* Parses a language from a string (coming from Python) into a language 
> + * variant. */
> +
> +static enum language
> +parse_language (const char *language)
> +{

Could this make use of `language_enum` (from language.c)?

> +  if (strcmp (language, "c") == 0)
> +    return language_c;
> +  else if (strcmp (language, "objc") == 0)
> +    return language_objc;
> +  else if (strcmp (language, "cplus") == 0)
> +    return language_cplus;
> +  else if (strcmp (language, "d") == 0)
> +    return language_d;
> +  else if (strcmp (language, "go") == 0)
> +    return language_go;
> +  else if (strcmp (language, "fortran") == 0)
> +    return language_fortran;
> +  else if (strcmp (language, "m2") == 0)
> +    return language_m2;
> +  else if (strcmp (language, "asm") == 0)
> +    return language_asm;
> +  else if (strcmp (language, "pascal") == 0)
> +    return language_pascal;
> +  else if (strcmp (language, "opencl") == 0)
> +    return language_opencl;
> +  else if (strcmp (language, "rust") == 0)
> +    return language_rust;
> +  else if (strcmp (language, "ada") == 0)
> +    return language_ada;
> +  else
> +    return language_unknown;
> +}
> +
> +/* Convenience function that performs a checked coversion from a PyObject to
> + * a objfile_builder_object structure pointer. */
> +inline static struct objfile_builder_object *
> +validate_objfile_builder_object (PyObject *self)
> +{
> +  if (!PyObject_TypeCheck (self, &objfile_builder_object_type))
> +    return nullptr;
> +  return (struct objfile_builder_object*) self;
> +}
> +
> +/* Registers symbols added with add_label_symbol. */
> +class typedef_symbol_def : public symbol_def
> +{
> +public:
> +  struct type* type;
> +  enum language language;
> +
> +  virtual void register_msymbol (const std::string& name,
> +                                 struct objfile *objfile,
> +                                 minimal_symbol_reader& reader) const override
> +  {
> +  }
> +
> +  virtual void register_symbol (const std::string& name,
> +                                struct objfile *objfile,
> +                                buildsym_compunit& builder) const override
> +  {
> +    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
> +                              LOC_TYPEDEF, objfile->sect_index_text);
> +
> +    symbol->set_type (type);
> +
> +    add_symbol_to_list (symbol, builder.get_file_symbols ());
> +  }
> +};
> +
> +/* Adds a type (LOC_TYPEDEF) symbol to a given objfile. */
> +static PyObject *
> +objbdpy_add_type_symbol (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "sOs";
> +  static const char *keywords[] =
> +    {
> +      "name", "type", "language", NULL
> +    };
> +
> +  PyObject *type_object;
> +  const char *name;
> +  const char *language_name = nullptr;
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
> +                                        &type_object, &language_name))
> +    return nullptr;
> +
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  if (language_name == nullptr)
> +    language_name = "auto";
> +  enum language language = parse_language (language_name);
> +  if (language == language_unknown)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "invalid language name");
> +      return nullptr;
> +    }
> +
> +  auto def = std::make_unique<typedef_symbol_def> ();
> +  def->type = type;
> +  def->language = language;
> +
> +  builder->inner.add_symbol_def (name, std::move (def));
> +
> +  Py_RETURN_NONE;
> +}
> +
> +
> +/* Registers symbols added with add_label_symbol. */
> +class label_symbol_def : public symbol_def
> +{
> +public:
> +  CORE_ADDR address;
> +  enum language language;
> +
> +  virtual void register_msymbol (const std::string& name,
> +                                 struct objfile *objfile,
> +                                 minimal_symbol_reader& reader) const override
> +  {
> +    reader.record (name.c_str (), 
> +                   unrelocated_addr (address), 
> +                   minimal_symbol_type::mst_text);
> +  }
> +
> +  virtual void register_symbol (const std::string& name,
> +                                struct objfile *objfile,
> +                                buildsym_compunit& builder) const override
> +  {
> +    printf("Adding label %s\n", name.c_str ());
> +    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
> +                              LOC_LABEL, objfile->sect_index_text);
> +
> +    symbol->set_value_address (address);
> +
> +    add_symbol_to_list (symbol, builder.get_file_symbols ());
> +  }
> +};
> +
> +/* Adds a label (LOC_LABEL) symbol to a given objfile. */
> +static PyObject *
> +objbdpy_add_label_symbol (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "sks";
> +  static const char *keywords[] =
> +    {
> +      "name", "address", "language", NULL
> +    };
> +
> +  const char *name;
> +  CORE_ADDR address;
> +  const char *language_name = nullptr;
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
> +                                        &address, &language_name))
> +    return nullptr;
> +
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  if (language_name == nullptr)
> +    language_name = "auto";
> +  enum language language = parse_language (language_name);
> +  if (language == language_unknown)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "invalid language name");
> +      return nullptr;
> +    }
> +
> +  auto def = std::make_unique<label_symbol_def> ();
> +  def->address = address;
> +  def->language = language;
> +
> +  builder->inner.add_symbol_def (name, std::move (def));
> +
> +  Py_RETURN_NONE;
> +}
> +
> +/* Registers symbols added with add_static_symbol. */
> +class static_symbol_def : public symbol_def
> +{
> +public:
> +  CORE_ADDR address;
> +  enum language language;
> +
> +  virtual void register_msymbol (const std::string& name,
> +                                 struct objfile *objfile,
> +                                 minimal_symbol_reader& reader) const override
> +  {
> +    reader.record (name.c_str (), 
> +                   unrelocated_addr (address), 
> +                   minimal_symbol_type::mst_bss);
> +  }
> +
> +  virtual void register_symbol (const std::string& name,
> +                                struct objfile *objfile,
> +                                buildsym_compunit& builder) const override
> +  {
> +    auto symbol = new_symbol (objfile, name.c_str (), language, VAR_DOMAIN,
> +                              LOC_STATIC, objfile->sect_index_bss);
> +
> +    symbol->set_value_address (address);
> +
> +    add_symbol_to_list (symbol, builder.get_file_symbols ());
> +  }
> +};
> +
> +/* Adds a static (LOC_STATIC) symbol to a given objfile. */
> +static PyObject *
> +objbdpy_add_static_symbol (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "sks";
> +  static const char *keywords[] =
> +    {
> +      "name", "address", "language", NULL
> +    };
> +
> +  const char *name;
> +  CORE_ADDR address;
> +  const char *language_name = nullptr;
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
> +                                        &address, &language_name))
> +    return nullptr;
> +
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  if (language_name == nullptr)
> +    language_name = "auto";
> +  enum language language = parse_language (language_name);
> +  if (language == language_unknown)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "invalid language name");
> +      return nullptr;
> +    }
> +
> +  auto def = std::make_unique<static_symbol_def> ();
> +  def->address = address;
> +  def->language = language;
> +
> +  builder->inner.add_symbol_def (name, std::move (def));
> +
> +  Py_RETURN_NONE;
> +}
> +
> +/* Builds the object file. */
> +static PyObject *
> +objbdpy_build (PyObject *self, PyObject *args)
> +{
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  if (builder->inner.installed)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "build() cannot be run twice on the \
> +                       same object");
> +      return nullptr;
> +    }
> +  auto of = build_new_objfile (*builder);
> +  builder->inner.installed = true;
> +
> +
> +  auto objpy = objfile_to_objfile_object (of).get ();
> +  Py_INCREF(objpy);
> +  return objpy;
> +}
> +
> +/* Implements the __init__() function. */
> +static int
> +objbdpy_init (PyObject *self0, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "s";
> +  static const char *keywords[] =
> +    {
> +      "name", NULL
> +    };
> +
> +  const char *name;
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name))
> +    return -1;
> +
> +  auto self = (objfile_builder_object *)self0;
> +  self->inner.name = name;
> +  self->inner.symbols.clear ();
> +
> +  return 0;
> +}
> +
> +/* The function handling construction of the ObjfileBuilder object. 
> + *
> + * We need to have a custom function here as, even though Python manages the 
> + * memory backing the object up, it assumes clearing the memory is enough to
> + * begin its lifetime, which is not the case here, and would lead to undefined 
> + * behavior as soon as we try to use it in any meaningful way.
> + * 
> + * So, what we have to do here is manually begin the lifecycle of our new object
> + * by constructing it in place, using the memory region Python just allocated
> + * for us. This ensures the object will have already started its lifetime by 
> + * the time we start using it. */
> +static PyObject *
> +objbdpy_new (PyTypeObject *subtype, PyObject *args, PyObject *kwds)
> +{
> +  objfile_builder_object *region = 
> +    (objfile_builder_object *) subtype->tp_alloc(subtype, 1);
> +  gdb_assert ((size_t)region % alignof (objfile_builder_object) == 0);
> +  gdb_assert (region != nullptr);
> +
> +  new (&region->inner) objfile_builder_data ();
> +  
> +  return (PyObject *)region;
> +}
> +
> +/* The function handling destruction of the ObjfileBuilder object. 
> + *
> + * While running the destructor of our object isn't _strictly_ necessary, we
> + * would very much like for the memory it owns to be freed, but, because it was
> + * constructed in place, we have to call its destructor manually here. */
> +static void 
> +objbdpy_dealloc (PyObject *self0)
> +{
> +  
> +  auto self = (objfile_builder_object *)self0;
> +  PyTypeObject *tp = Py_TYPE(self);
> +  
> +  self->inner.~objfile_builder_data ();
> +  
> +  tp->tp_free(self);
> +  Py_DECREF(tp);
> +}
> +
> +static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
> +gdbpy_initialize_objfile_builder (void)
> +{
> +  if (PyType_Ready (&objfile_builder_object_type) < 0)
> +    return -1;
> +
> +  return gdb_pymodule_addobject (gdb_module, "ObjfileBuilder",
> +				 (PyObject *) &objfile_builder_object_type);
> +}
> +
> +GDBPY_INITIALIZE_FILE (gdbpy_initialize_objfile_builder);
> +
> +static PyMethodDef objfile_builder_object_methods[] =
> +{
> +  { "build", (PyCFunction) objbdpy_build, METH_NOARGS,
> +    "build ().\n\
> +Build a new objfile containing the symbols added to builder." },
> +  { "add_type_symbol", (PyCFunction) objbdpy_add_type_symbol,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "add_type_symbol (name [str], type [gdb.Type], language [str]).\n\
> +Add a new type symbol in the given language, associated with the given type." },
> +  { "add_label_symbol", (PyCFunction) objbdpy_add_label_symbol,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "add_label_symbol (name [str], address [int], language [str]).\n\
> +Add a new label symbol in the given language, at the given address." },
> +  { "add_static_symbol", (PyCFunction) objbdpy_add_static_symbol,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "add_static_symbol (name [str], address [int], language [str]).\n\
> +Add a new static symbol in the given language, at the given address." },
> +  { NULL }
> +};
> +
> +PyTypeObject objfile_builder_object_type = {
> +  PyVarObject_HEAD_INIT (NULL, 0)
> +  "gdb.ObjfileBuilder",               /* tp_name */
> +  sizeof (objfile_builder_object),    /* tp_basicsize */
> +  0,                                  /* tp_itemsize */
> +  objbdpy_dealloc,                    /* tp_dealloc */
> +  0,                                  /* tp_vectorcall_offset */
> +  nullptr,                            /* tp_getattr */
> +  nullptr,                            /* tp_setattr */
> +  nullptr,                            /* tp_compare */
> +  nullptr,                            /* tp_repr */
> +  nullptr,                            /* tp_as_number */
> +  nullptr,                            /* tp_as_sequence */
> +  nullptr,                            /* tp_as_mapping */
> +  nullptr,                            /* tp_hash  */
> +  nullptr,                            /* tp_call */
> +  nullptr,                            /* tp_str */
> +  nullptr,                            /* tp_getattro */
> +  nullptr,                            /* tp_setattro */
> +  nullptr,                            /* tp_as_buffer */
> +  Py_TPFLAGS_DEFAULT,                 /* tp_flags */
> +  "GDB object file builder",          /* tp_doc */
> +  nullptr,                            /* tp_traverse */
> +  nullptr,                            /* tp_clear */
> +  nullptr,                            /* tp_richcompare */
> +  0,                                  /* tp_weaklistoffset */
> +  nullptr,                            /* tp_iter */
> +  nullptr,                            /* tp_iternext */
> +  objfile_builder_object_methods,     /* tp_methods */
> +  nullptr,                            /* tp_members */
> +  nullptr,                            /* tp_getset */
> +  nullptr,                            /* tp_base */
> +  nullptr,                            /* tp_dict */
> +  nullptr,                            /* tp_descr_get */
> +  nullptr,                            /* tp_descr_set */
> +  0,                                  /* tp_dictoffset */
> +  objbdpy_init,                       /* tp_init */
> +  PyType_GenericAlloc,                /* tp_alloc */
> +  objbdpy_new,                        /* tp_new */
> +};
> +
> +
> diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
> index ad72f3f042..be21011ce6 100644
> --- a/gdb/python/py-objfile.c
> +++ b/gdb/python/py-objfile.c
> @@ -25,6 +25,7 @@
>  #include "build-id.h"
>  #include "symtab.h"
>  #include "python.h"
> +#include "buildsym.h"
>

Is this change really needed?

>  struct objfile_object
>  {
> diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
> index dbd33570a7..fbf9b06af5 100644
> --- a/gdb/python/python-internal.h
> +++ b/gdb/python/python-internal.h
> @@ -480,6 +480,7 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
>  struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
>  frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
>  struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
> +struct floatformat *float_format_object_as_float_format (PyObject *self);

Likewise, I suspect this change is not needed.

>  
>  /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
>     gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
> -- 
> 2.40.1

Thanks,
Andrew


^ permalink raw reply	[relevance 7%]

* Re: [PATCHv3 0/2] Add __repr__() implementation to a few Python types
  2023-06-09 12:33  7%               ` Andrew Burgess
@ 2023-07-04 11:09  2%                 ` Andrew Burgess
  0 siblings, 0 replies; 65+ results
From: Andrew Burgess @ 2023-07-04 11:09 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, aburgess
  Cc: dark.ryu.550, gdb-patches

Andrew Burgess <aburgess@redhat.com> writes:

> Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
> writes:
>
>> All of the changes look good in my oppinion. I'm still getting used to the hang
>> of things, so I really appreciate the effort you put into making my patch more 
>> presentable! So I think it should be good to go?
>
> Do you have a copyright assignment in place?  If not then, given the
> size of the change, I think you will need one before this could be
> merged.

Matheus,

Thanks for getting the copyright assignment sorted.

I've now gone ahead and merged both patches in this series.

Thanks,
Andrew


^ permalink raw reply	[relevance 2%]

* Re: [PATCH]  Add name_of_main and language_of_main to the DWARF index
  2023-06-30 20:36  4%   ` Matheus Branco Borella
@ 2023-07-01  5:47  0%     ` Eli Zaretskii
  2023-07-07 15:00  4%       ` Matheus Branco Borella
  0 siblings, 1 reply; 65+ results
From: Eli Zaretskii @ 2023-07-01  5:47 UTC (permalink / raw)
  To: Matheus Branco Borella; +Cc: gdb-patches

> Cc: Matheus Branco Borella <dark.ryu.550@gmail.com>
> Date: Fri, 30 Jun 2023 17:36:43 -0300
> From: Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
> 
> Alright, this one should incorporate all of the changes you suggested. And, now
> that I've sorted out my copyright assignment, this should be good to go now?
> Unless I missed something.

Your assignment is not on file yet, AFAICT.  Was the paperwork
completed, i.e. did you get a copy of the assignment signed by you and
by the FSF?  If not, you need to wait some more.

> This patch adds a new section to the DWARF index containing the name
> and the language of the main function symbol, gathered from
> `cooked_index::get_main`, if available. Currently, for lack of a better name,
> this section is called the "shortcut table". The way this name is both saved and 
> applied upon an index being loaded in mirrors how it is done in 
> `cooked_index_functions`, more specifically, the full name of the main function 
> symbol is saved and `set_objfile_main_name` is used to apply it after it is 
> loaded.
> 
> The main use case for this patch is in improving startup times when dealing with
> large binaries. Currently, when an index is used, GDB has to expand symtabs
> until it finds out what the language of the main function symbol is. For some
> large executables, this may take a considerable amount of time to complete,
> slowing down startup. This patch bypasses that operation by having both the name
> and language of the main function symbol be provided ahead of time by the index.
> 
> In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
> startup time down from about 34 seconds to about 1.5 seconds.
> ---
>  gdb/NEWS                    |  2 ++
>  gdb/doc/gdb.texinfo         | 23 +++++++++++++--
>  gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
>  gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
>  gdb/dwarf2/read.c           | 13 +++++++--
>  gdb/dwarf2/read.h           | 12 ++++++++
>  6 files changed, 142 insertions(+), 11 deletions(-)

Thanks.  The documentation parts are OK, but please fix the text to
leave two spaces between sentences, not one.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>

^ permalink raw reply	[relevance 0%]

* [PATCH]  Add name_of_main and language_of_main to the DWARF index
  2023-06-09 16:56  0% ` Tom Tromey
@ 2023-06-30 20:36  4%   ` Matheus Branco Borella
  2023-07-01  5:47  0%     ` Eli Zaretskii
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella @ 2023-06-30 20:36 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella

Alright, this one should incorporate all of the changes you suggested. And, now
that I've sorted out my copyright assignment, this should be good to go now?
Unless I missed something.

---

This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table". The way this name is both saved and 
applied upon an index being loaded in mirrors how it is done in 
`cooked_index_functions`, more specifically, the full name of the main function 
symbol is saved and `set_objfile_main_name` is used to apply it after it is 
loaded.

The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.

In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.
---
 gdb/NEWS                    |  2 ++
 gdb/doc/gdb.texinfo         | 23 +++++++++++++--
 gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++----
 gdb/dwarf2/read-gdb-index.c | 56 +++++++++++++++++++++++++++++++++++--
 gdb/dwarf2/read.c           | 13 +++++++--
 gdb/dwarf2/read.h           | 12 ++++++++
 6 files changed, 142 insertions(+), 11 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index d97e3c15a8..2d940d1f79 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -3,6 +3,8 @@
 
 *** Changes since GDB 13
 
+* DWARF index now contains information about the main function.
+
 * The AArch64 'org.gnu.gdb.aarch64.pauth' Pointer Authentication feature string
   has been deprecated in favor of the 'org.gnu.gdb.aarch64.pauth_v2' feature
   string.
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index d1059e0cb7..b21bfec89b 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -49093,13 +49093,14 @@ unless otherwise noted:
 
 @enumerate
 @item
-The version number, currently 8.  Versions 1, 2 and 3 are obsolete.
+The version number, currently 9.  Versions 1, 2 and 3 are obsolete.
 Version 4 uses a different hashing function from versions 5 and 6.
 Version 6 includes symbols for inlined functions, whereas versions 4
 and 5 do not.  Version 7 adds attributes to the CU indices in the
 symbol table.  Version 8 specifies that symbols from DWARF type units
 (@samp{DW_TAG_type_unit}) refer to the type unit's symbol table and not the
-compilation unit (@samp{DW_TAG_comp_unit}) using the type.
+compilation unit (@samp{DW_TAG_comp_unit}) using the type. Version 9 adds
+the name and the language of the main function to the index.
 
 @value{GDBN} will only read version 4, 5, or 6 indices
 by specifying @code{set use-deprecated-index-sections on}.
@@ -49120,6 +49121,9 @@ The offset, from the start of the file, of the address area.
 @item
 The offset, from the start of the file, of the symbol table.
 
+@item
+The offset, from the start of the file, of the shortcut table.
+
 @item
 The offset, from the start of the file, of the constant pool.
 @end enumerate
@@ -49196,6 +49200,21 @@ don't currently have a simple description of the canonicalization
 algorithm; if you intend to create new index sections, you must read
 the code.
 
+@item
+The shortcut table. This is a data structure with the following fields:
+
+@table @asis
+@item
+A 32-bit little-endian value indicating the language of the main function as a
+@code{DW_LANG_} constant. This value will be zero if main function information
+is not present.
+
+@item
+An @code{offset_type} value indicating the offset of the main function's name 
+in the constant pool. This value must be ignored if the value for the language
+of main is zero.
+@ end table
+
 @item
 The constant pool.  This is simply a bunch of bytes.  It is organized
 so that alignment is correct: CU vectors are stored first, followed by
diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 62c2cc6ac7..ee6eaa7b87 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
 		  const data_buf &types_cu_list,
 		  const data_buf &addr_vec,
 		  const data_buf &symtab_vec,
-		  const data_buf &constant_pool)
+		  const data_buf &constant_pool,
+                  const data_buf &shortcuts)
 {
   data_buf contents;
-  const offset_type size_of_header = 6 * sizeof (offset_type);
+  const offset_type size_of_header = 7 * sizeof (offset_type);
   offset_type total_len = size_of_header;
 
   /* The version number.  */
-  contents.append_offset (8);
+  contents.append_offset (9);
 
   /* The offset of the CU list from the start of the file.  */
   contents.append_offset (total_len);
@@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
   contents.append_offset (total_len);
   total_len += symtab_vec.size ();
 
+  /* The offset of the shortcut table from the start of the file.  */
+  contents.append_offset (total_len);
+  total_len += shortcuts.size ();
+
   /* The offset of the constant pool from the start of the file.  */
   contents.append_offset (total_len);
   total_len += constant_pool.size ();
@@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
   types_cu_list.file_write (out_file);
   addr_vec.file_write (out_file);
   symtab_vec.file_write (out_file);
+  shortcuts.file_write (out_file);
   constant_pool.file_write (out_file);
 
   assert_file_size (out_file, total_len);
@@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
     }
 }
 
+/* Write shortcut information. */
+
+static void
+write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
+                       data_buf& cpool)
+{
+  const auto main_info = table->get_main ();
+  size_t main_name_offset = 0;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
+
+  if (main_info != nullptr)
+    {
+      dw_lang = main_info->per_cu->dw_lang;
+
+      if (dw_lang != 0)
+        {
+          auto_obstack obstack;
+          const auto main_name = main_info->full_name (&obstack, true);
+
+          main_name_offset = cpool.size ();
+          cpool.append_cstr0 (main_name);
+        }
+    }
+
+  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, dw_lang);
+  shortcuts.append_offset (main_name_offset);
+}
+
 /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
    If OBJFILE has an associated dwz file, write contents of a .gdb_index
    section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
@@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
 
   write_hash_table (&symtab, symtab_vec, constant_pool);
 
+  data_buf shortcuts;
+  write_shortcuts_table (table, shortcuts, constant_pool);
+
   write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
-		   symtab_vec, constant_pool);
+		   symtab_vec, constant_pool, shortcuts);
 
   if (dwz_out_file != NULL)
-    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
+    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
   else
     gdb_assert (dwz_cu_list.empty ());
 }
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 1006386cb2..534c5a7fd7 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
   /* A pointer to the constant pool.  */
   gdb::array_view<const gdb_byte> constant_pool;
 
+  /* The shortcut table data. */
+  gdb::array_view<const gdb_byte> shortcut_table;
+
   /* Return the index into the constant pool of the name of the IDXth
      symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
 
   mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
 			     (per_objfile->per_bfd->index_table.get ()));
+
   gdb_printf (".gdb_index: version %d\n", index->version);
   gdb_printf ("\n");
 }
@@ -583,7 +587,7 @@ to use the section anyway."),
 
   /* Indexes with higher version than the one supported by GDB may be no
      longer backward compatible.  */
-  if (version > 8)
+  if (version > 9)
     return 0;
 
   map->version = version;
@@ -608,8 +612,17 @@ to use the section anyway."),
   map->symbol_table
     = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
 						    symbol_table_end));
-
   ++i;
+
+  if (version >= 9)
+    {
+      const gdb_byte *shortcut_table = addr + metadata[i];
+      const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
+      map->shortcut_table
+        = gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
+      ++i;
+    }
+
   map->constant_pool = buffer.slice (metadata[i]);
 
   if (map->constant_pool.empty () && !map->symbol_table.empty ())
@@ -763,6 +776,43 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
     = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
 }
 
+/* Sets the name and language of the main function from the shortcut table. */
+
+static void
+set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile, 
+                              mapped_gdb_index *index)
+{
+  const auto expected_size = 4 + sizeof (offset_type);
+  if (index->shortcut_table.size () < expected_size)
+    /* The data in the section is not present, is corrupted or is in a version
+     * we don't know about. Regardless, we can't make use of it. */
+    return;
+
+  auto ptr = index->shortcut_table.data ();
+  const auto dw_lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
+  if (dw_lang >= DW_LANG_hi_user)
+    {
+      complaint (_(".gdb_index shortcut table has invalid main language %u"),
+                   (unsigned) dw_lang);
+      return;
+    }
+  if (dw_lang == 0)
+    {
+      /* Don't bother if the language for the main symbol was not known or if
+       * there was no main symbol at all when the index was built. */
+      return;
+    }
+  ptr += 4;
+
+  const auto lang = dwarf_lang_to_enum_language (dw_lang);
+  const auto name_offset = extract_unsigned_integer (ptr, 
+                                                     sizeof (offset_type), 
+                                                     BFD_ENDIAN_LITTLE);
+  const auto name = (const char*) (index->constant_pool.data () + name_offset);
+
+  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
+}
+
 /* See read-gdb-index.h.  */
 
 int
@@ -848,6 +898,8 @@ dwarf2_read_gdb_index
 
   create_addrmap_from_gdb_index (per_objfile, map.get ());
 
+  set_main_name_from_gdb_index (per_objfile, map.get ());
+
   per_bfd->index_table = std::move (map);
   per_bfd->quick_file_names_table =
     create_quick_file_names_table (per_bfd->all_units.size ());
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 4828409222..89acd94c05 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -17745,7 +17745,9 @@ leb128_size (const gdb_byte *buf)
     }
 }
 
-static enum language
+/* Converts DWARF language names to GDB language names. */
+
+enum language
 dwarf_lang_to_enum_language (unsigned int lang)
 {
   enum language language;
@@ -21661,6 +21663,7 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
   /* Set the language we're debugging.  */
   attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
   enum language lang;
+  dwarf_source_language dw_lang = (dwarf_source_language)0;
   if (cu->producer != nullptr
       && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
     {
@@ -21669,18 +21672,24 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
 	 language detection we fall back to the DW_AT_producer
 	 string.  */
       lang = language_opencl;
+      dw_lang = DW_LANG_OpenCL;
     }
   else if (cu->producer != nullptr
 	   && strstr (cu->producer, "GNU Go ") != NULL)
     {
       /* Similar hack for Go.  */
       lang = language_go;
+      dw_lang = DW_LANG_Go;
     }
   else if (attr != nullptr)
-    lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+    {
+      lang = dwarf_lang_to_enum_language (attr->constant_value (0));
+      dw_lang = (dwarf_source_language)attr->constant_value (0);
+    }
   else
     lang = pretend_language;
 
+  cu->per_cu->dw_lang = dw_lang;
   cu->language_defn = language_def (lang);
 
   switch (comp_unit_die->tag)
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 37023a2070..1235f62bfc 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -245,6 +245,14 @@ struct dwarf2_per_cu_data
      functions above.  */
   std::vector <dwarf2_per_cu_data *> *imported_symtabs = nullptr;
 
+  /* The original DW_LANG_* value of the CU, as provided to us by 
+   * DW_AT_language. It is interesting to keep this value around in cases where
+   * we can't use the values from the language enum, as the mapping to them is
+   * lossy, and, while that is usually fine, things like the index have an 
+   * understandable bias towards not exposing internal GDB structures to the 
+   * outside world, and so prefer to use DWARF constants in their stead. */
+  dwarf_source_language dw_lang;
+
   /* Return true of IMPORTED_SYMTABS is empty or not yet allocated.  */
   bool imported_symtabs_empty () const
   {
@@ -755,6 +763,10 @@ struct dwarf2_per_objfile
 		     std::unique_ptr<dwarf2_cu>> m_dwarf2_cus;
 };
 
+/* Converts DWARF language names to GDB language names. */
+
+enum language dwarf_lang_to_enum_language (unsigned int lang);
+
 /* Get the dwarf2_per_objfile associated to OBJFILE.  */
 
 dwarf2_per_objfile *get_dwarf2_per_objfile (struct objfile *objfile);
-- 
2.40.1


^ permalink raw reply	[relevance 4%]

* [PING] Re: [PATCH] Add support for symbol addition to the Python API
  2023-05-27  1:24  3% [PATCH] Add support for symbol addition to the Python API Matheus Branco Borella
@ 2023-06-27  3:53 14% ` Matheus Branco Borella
  2023-07-04 15:14  7% ` Andrew Burgess
  1 sibling, 0 replies; 65+ results
From: Matheus Branco Borella @ 2023-06-27  3:53 UTC (permalink / raw)
  To: gdb-patches

Bump on this too, as instructed by the contribution checklist.

On Fri, May 26, 2023 at 10:24 PM Matheus Branco Borella
<dark.ryu.550@gmail.com> wrote:
>
> Disclaimer:
>
> This patch is a rework of a six-month old patch I submitted to the mailing list
> that considerably reduces the hackyness of the original solution to the problem,
> now that I've had more time to read through and understand how symbols are
> handled and searched for inside GDB. So, I'd like to please ask for comments on
> things I can still improve in this patch, before I resubmit it. I also plan to
> add tests to it once I'm more secure about the approach I'm taking to solve the
> problem now.
>
> The interfaces in this patch can be tested like so:
> ```
> (gdb) pi
> >>> builder = gdb.ObjfileBuilder(name = "some_name")
> >>> builder.add_static_symbol(name = "some_sym", address = 0x41414141,
>         language = "c")
> >>> objfile = builder.build()
> ```
>
> ---
>
> This patch adds support for symbol creation and registration. It currently
> supports adding type symbols (VAR_DOMAIN/LOC_TYPEDEF), static symbols
> (VAR_DOMAIN/LOC_STATIC) and goto target labels (LABEL_DOMAIN/LOC_LABEL). It
> adds a new `gdb.ObjfileBuilder` type, with `add_type_symbol`,
> `add_static_symbol` and `add_label_symbol` functions, allowing for the addition
> of the aforementioned types of symbols.
>
> Symbol addition is achieved by constructing a new objfile with msyms and full
> symbols reflecting the symbols that were previously added to the builder through
> its methods. This approach lets us get most of the way to full symbol addition
> support, but due to not being backed up by BFD, it does have a few limitations,
> which I will go over them here.
>
> PC-based minsym lookup does not work, because those would require a more
> complete set of BFD structures than I think would be good practice to pretend to
> have them all and crash GDB later on when it expects things to be there that
> aren't.
>
> In the same vein, PC-based function name lookup also does not work, although
> there may be a way to have the feature work using overlays. However, this patch
> does not make an attempt to do so
>
> For now, though, this implementation lets us add symbols that can be used to,
> for instance, query registered types through `gdb.lookup_type`, and allows
> reverse engineering GDB plugins (such as Pwndbg [0] or decomp2gdb [1]) to add
> symbols directly through the Python API instead of having to compile an object
> file for the target architecture that they later load through the add-symbol-
> file command. [2]
>
> [0] https://github.com/pwndbg/pwndbg/
> [1] https://github.com/mahaloz/decomp2dbg
> [2] https://github.com/mahaloz/decomp2dbg/blob/055be6b2001954d00db2d683f20e9b714af75880/decomp2dbg/clients/gdb/symbol_mapper.py#L235-L243]
> ---
>  gdb/Makefile.in                 |   1 +
>  gdb/python/py-objfile-builder.c | 648 ++++++++++++++++++++++++++++++++
>  gdb/python/py-objfile.c         |   1 +
>  gdb/python/python-internal.h    |   1 +
>  4 files changed, 651 insertions(+)
>  create mode 100644 gdb/python/py-objfile-builder.c
>
> diff --git a/gdb/Makefile.in b/gdb/Makefile.in
> index 14b5dd0bad..c0eecb81b6 100644
> --- a/gdb/Makefile.in
> +++ b/gdb/Makefile.in
> @@ -417,6 +417,7 @@ SUBDIR_PYTHON_SRCS = \
>         python/py-micmd.c \
>         python/py-newobjfileevent.c \
>         python/py-objfile.c \
> +       python/py-objfile-builder.c \
>         python/py-param.c \
>         python/py-prettyprint.c \
>         python/py-progspace.c \
> diff --git a/gdb/python/py-objfile-builder.c b/gdb/python/py-objfile-builder.c
> new file mode 100644
> index 0000000000..1e3110c613
> --- /dev/null
> +++ b/gdb/python/py-objfile-builder.c
> @@ -0,0 +1,648 @@
> +/* Python class allowing users to build and install objfiles.
> +
> +   Copyright (C) 2013-2023 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include "defs.h"
> +#include "python-internal.h"
> +#include "quick-symbol.h"
> +#include "objfiles.h"
> +#include "minsyms.h"
> +#include "buildsym.h"
> +#include "observable.h"
> +#include <string>
> +#include <unordered_map>
> +#include <type_traits>
> +#include <optional>
> +
> +/* This module relies on symbols being trivially copyable. */
> +static_assert (std::is_trivially_copyable_v<struct symbol>);
> +
> +/* Interface to be implemented for symbol types supported by this interface. */
> +class symbol_def
> +{
> +public:
> +  virtual ~symbol_def () = default;
> +
> +  virtual void register_msymbol (const std::string& name,
> +                                 struct objfile* objfile,
> +                                 minimal_symbol_reader& reader) const = 0;
> +  virtual void register_symbol (const std::string& name,
> +                                struct objfile* objfile,
> +                                buildsym_compunit& builder) const = 0;
> +};
> +
> +/* Shorthand for a unique_ptr to a symbol. */
> +typedef std::unique_ptr<symbol_def> symbol_def_up;
> +
> +/* Data being held by the gdb.ObjfileBuilder.
> + *
> + * This structure needs to have its constructor run in order for its lifetime
> + * to begin. Because of how Python handles its objects, we can't just reconstruct
> + * the object structure as a whole, as that would overwrite things the runtime
> + * cares about, so these fields had to be broken off into their own structure. */
> +struct objfile_builder_data
> +{
> +  /* Indicates whether the objfile has already been built and added to the
> +   * current context. We enforce that objfiles can't be installed twice. */
> +  bool installed = false;
> +
> +  /* The symbols that will be added to new newly built objfile. */
> +  std::unordered_map<std::string, symbol_def_up> symbols;
> +
> +  /* The name given to this objfile. */
> +  std::string name;
> +
> +  /* Adds a symbol definition with the given name. */
> +  bool add_symbol_def (std::string name, symbol_def_up&& symbol_def)
> +  {
> +    return std::get<1> (symbols.insert ({name, std::move (symbol_def)}));
> +  }
> +};
> +
> +/* Structure backing the gdb.ObjfileBuilder type. */
> +
> +struct objfile_builder_object
> +{
> +  PyObject_HEAD
> +
> +  /* See objfile_builder_data. */
> +  objfile_builder_data inner;
> +};
> +
> +extern PyTypeObject objfile_builder_object_type
> +    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("objfile_builder_object_type");
> +
> +/* Constructs a new objfile from an objfile_builder. */
> +static struct objfile *
> +build_new_objfile (const objfile_builder_object& builder)
> +{
> +  gdb_assert (!builder.inner.installed);
> +
> +  auto of = objfile::make (nullptr, builder.inner.name.c_str (),
> +                           OBJF_READNOW | OBJF_NOT_FILENAME,
> +                           nullptr);
> +
> +  /* Setup object file sections. */
> +  of->sections_start = OBSTACK_CALLOC (&of->objfile_obstack,
> +                                       4,
> +                                       struct obj_section);
> +  of->sections_end = of->sections_start + 4;
> +
> +  const auto init_section = [&](struct obj_section* sec)
> +    {
> +      sec->objfile = of;
> +      sec->ovly_mapped = false;
> +
> +      /* We're not being backed by BFD. So we have no real section data to speak
> +       * of, but, because specifying sections requires BFD structures, we have to
> +       * play a little game of predend. */
> +      auto bfd = obstack_new<bfd_section> (&of->objfile_obstack);
> +      bfd->vma = 0;
> +      bfd->size = 0;
> +      bfd->lma = 0; /* Prevents insert_section_p in objfiles.c from trying to
> +                     * dereference the bfd structure we don't have. */
> +      sec->the_bfd_section = bfd;
> +    };
> +  init_section (&of->sections_start[0]);
> +  init_section (&of->sections_start[1]);
> +  init_section (&of->sections_start[2]);
> +  init_section (&of->sections_start[4]);
> +
> +  of->sect_index_text = 0;
> +  of->sect_index_data = 1;
> +  of->sect_index_rodata = 2;
> +  of->sect_index_bss = 3;
> +
> +  /* While buildsym_compunit expects the symbol function pointer structure to be
> +   * present, it also gracefully handles the case where all of the pointers in
> +   * it are set to null. So, make sure we have a valid structure, but there's
> +   * no need to do more than that. */
> +  of->sf = obstack_new<struct sym_fns> (&of->objfile_obstack);
> +
> +  /* We need to tell GDB what architecture the objfile uses. */
> +  if (has_stack_frames ())
> +    of->per_bfd->gdbarch = get_frame_arch (get_selected_frame (nullptr));
> +  else
> +    of->per_bfd->gdbarch = target_gdbarch ();
> +
> +  /* Construct the minimal symbols. */
> +  minimal_symbol_reader msym (of);
> +  for (const auto& [name, symbol] : builder.inner.symbols)
> +      symbol->register_msymbol (name, of, msym);
> +  msym.install ();
> +
> +  /* Construct the full symbols. */
> +  buildsym_compunit fsym (of, builder.inner.name.c_str (), "", language_c, 0);
> +  for (const auto& [name, symbol] : builder.inner.symbols)
> +    symbol->register_symbol (name, of, fsym);
> +  fsym.end_compunit_symtab (0);
> +
> +  /* Notify the rest of GDB this objfile has been created. Requires
> +   * OBJF_NOT_FILENAME to be used, to prevent any of the functions attatched to
> +   * the observable from trying to dereference of->bfd. */
> +  gdb::observers::new_objfile.notify (of);
> +
> +  return of;
> +}
> +
> +/* Implementation of the quick symbol functions used by the objfiles created
> + * using this interface. Turns out we have our work cut out for us here, as we
> + * can get something that works by effectively just using no-ops, and the rest
> + * of the code will fall back to using just the minimal and full symbol data. It
> + * is important to note, though, that this only works because we're marking our
> + * objfile with `OBJF_READNOW`. */
> +class runtime_objfile : public quick_symbol_functions
> +{
> +  virtual bool has_symbols (struct objfile*) override
> +  {
> +    return false;
> +  }
> +
> +  virtual void dump (struct objfile *objfile) override
> +  {
> +  }
> +
> +  virtual void expand_matching_symbols
> +    (struct objfile *,
> +     const lookup_name_info &lookup_name,
> +     domain_enum domain,
> +     int global,
> +     symbol_compare_ftype *ordered_compare) override
> +  {
> +  }
> +
> +  virtual bool expand_symtabs_matching
> +    (struct objfile *objfile,
> +     gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
> +     const lookup_name_info *lookup_name,
> +     gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
> +     gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
> +     block_search_flags search_flags,
> +     domain_enum domain,
> +     enum search_domain kind) override
> +  {
> +    return true;
> +  }
> +};
> +
> +
> +/* Create a new symbol alocated in the given objfile. */
> +
> +static struct symbol *
> +new_symbol
> +  (struct objfile *objfile,
> +   const char *name,
> +   enum language language,
> +   enum domain_enum domain,
> +   enum address_class aclass,
> +   short section_index)
> +{
> +  auto symbol = new (&objfile->objfile_obstack) struct symbol ();
> +  OBJSTAT (objfile, n_syms++);
> +
> +  symbol->set_language (language, &objfile->objfile_obstack);
> +  symbol->compute_and_set_names (gdb::string_view (name), true,
> +                                 objfile->per_bfd);
> +
> +  symbol->set_is_objfile_owned (true);
> +  symbol->set_section_index (section_index);
> +  symbol->set_domain (domain);
> +  symbol->set_aclass_index (aclass);
> +
> +  return symbol;
> +}
> +
> +/* Parses a language from a string (coming from Python) into a language
> + * variant. */
> +
> +static enum language
> +parse_language (const char *language)
> +{
> +  if (strcmp (language, "c") == 0)
> +    return language_c;
> +  else if (strcmp (language, "objc") == 0)
> +    return language_objc;
> +  else if (strcmp (language, "cplus") == 0)
> +    return language_cplus;
> +  else if (strcmp (language, "d") == 0)
> +    return language_d;
> +  else if (strcmp (language, "go") == 0)
> +    return language_go;
> +  else if (strcmp (language, "fortran") == 0)
> +    return language_fortran;
> +  else if (strcmp (language, "m2") == 0)
> +    return language_m2;
> +  else if (strcmp (language, "asm") == 0)
> +    return language_asm;
> +  else if (strcmp (language, "pascal") == 0)
> +    return language_pascal;
> +  else if (strcmp (language, "opencl") == 0)
> +    return language_opencl;
> +  else if (strcmp (language, "rust") == 0)
> +    return language_rust;
> +  else if (strcmp (language, "ada") == 0)
> +    return language_ada;
> +  else
> +    return language_unknown;
> +}
> +
> +/* Convenience function that performs a checked coversion from a PyObject to
> + * a objfile_builder_object structure pointer. */
> +inline static struct objfile_builder_object *
> +validate_objfile_builder_object (PyObject *self)
> +{
> +  if (!PyObject_TypeCheck (self, &objfile_builder_object_type))
> +    return nullptr;
> +  return (struct objfile_builder_object*) self;
> +}
> +
> +/* Registers symbols added with add_label_symbol. */
> +class typedef_symbol_def : public symbol_def
> +{
> +public:
> +  struct type* type;
> +  enum language language;
> +
> +  virtual void register_msymbol (const std::string& name,
> +                                 struct objfile *objfile,
> +                                 minimal_symbol_reader& reader) const override
> +  {
> +  }
> +
> +  virtual void register_symbol (const std::string& name,
> +                                struct objfile *objfile,
> +                                buildsym_compunit& builder) const override
> +  {
> +    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
> +                              LOC_TYPEDEF, objfile->sect_index_text);
> +
> +    symbol->set_type (type);
> +
> +    add_symbol_to_list (symbol, builder.get_file_symbols ());
> +  }
> +};
> +
> +/* Adds a type (LOC_TYPEDEF) symbol to a given objfile. */
> +static PyObject *
> +objbdpy_add_type_symbol (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "sOs";
> +  static const char *keywords[] =
> +    {
> +      "name", "type", "language", NULL
> +    };
> +
> +  PyObject *type_object;
> +  const char *name;
> +  const char *language_name = nullptr;
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
> +                                        &type_object, &language_name))
> +    return nullptr;
> +
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  if (language_name == nullptr)
> +    language_name = "auto";
> +  enum language language = parse_language (language_name);
> +  if (language == language_unknown)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "invalid language name");
> +      return nullptr;
> +    }
> +
> +  auto def = std::make_unique<typedef_symbol_def> ();
> +  def->type = type;
> +  def->language = language;
> +
> +  builder->inner.add_symbol_def (name, std::move (def));
> +
> +  Py_RETURN_NONE;
> +}
> +
> +
> +/* Registers symbols added with add_label_symbol. */
> +class label_symbol_def : public symbol_def
> +{
> +public:
> +  CORE_ADDR address;
> +  enum language language;
> +
> +  virtual void register_msymbol (const std::string& name,
> +                                 struct objfile *objfile,
> +                                 minimal_symbol_reader& reader) const override
> +  {
> +    reader.record (name.c_str (),
> +                   unrelocated_addr (address),
> +                   minimal_symbol_type::mst_text);
> +  }
> +
> +  virtual void register_symbol (const std::string& name,
> +                                struct objfile *objfile,
> +                                buildsym_compunit& builder) const override
> +  {
> +    printf("Adding label %s\n", name.c_str ());
> +    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
> +                              LOC_LABEL, objfile->sect_index_text);
> +
> +    symbol->set_value_address (address);
> +
> +    add_symbol_to_list (symbol, builder.get_file_symbols ());
> +  }
> +};
> +
> +/* Adds a label (LOC_LABEL) symbol to a given objfile. */
> +static PyObject *
> +objbdpy_add_label_symbol (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "sks";
> +  static const char *keywords[] =
> +    {
> +      "name", "address", "language", NULL
> +    };
> +
> +  const char *name;
> +  CORE_ADDR address;
> +  const char *language_name = nullptr;
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
> +                                        &address, &language_name))
> +    return nullptr;
> +
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  if (language_name == nullptr)
> +    language_name = "auto";
> +  enum language language = parse_language (language_name);
> +  if (language == language_unknown)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "invalid language name");
> +      return nullptr;
> +    }
> +
> +  auto def = std::make_unique<label_symbol_def> ();
> +  def->address = address;
> +  def->language = language;
> +
> +  builder->inner.add_symbol_def (name, std::move (def));
> +
> +  Py_RETURN_NONE;
> +}
> +
> +/* Registers symbols added with add_static_symbol. */
> +class static_symbol_def : public symbol_def
> +{
> +public:
> +  CORE_ADDR address;
> +  enum language language;
> +
> +  virtual void register_msymbol (const std::string& name,
> +                                 struct objfile *objfile,
> +                                 minimal_symbol_reader& reader) const override
> +  {
> +    reader.record (name.c_str (),
> +                   unrelocated_addr (address),
> +                   minimal_symbol_type::mst_bss);
> +  }
> +
> +  virtual void register_symbol (const std::string& name,
> +                                struct objfile *objfile,
> +                                buildsym_compunit& builder) const override
> +  {
> +    auto symbol = new_symbol (objfile, name.c_str (), language, VAR_DOMAIN,
> +                              LOC_STATIC, objfile->sect_index_bss);
> +
> +    symbol->set_value_address (address);
> +
> +    add_symbol_to_list (symbol, builder.get_file_symbols ());
> +  }
> +};
> +
> +/* Adds a static (LOC_STATIC) symbol to a given objfile. */
> +static PyObject *
> +objbdpy_add_static_symbol (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "sks";
> +  static const char *keywords[] =
> +    {
> +      "name", "address", "language", NULL
> +    };
> +
> +  const char *name;
> +  CORE_ADDR address;
> +  const char *language_name = nullptr;
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
> +                                        &address, &language_name))
> +    return nullptr;
> +
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  if (language_name == nullptr)
> +    language_name = "auto";
> +  enum language language = parse_language (language_name);
> +  if (language == language_unknown)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "invalid language name");
> +      return nullptr;
> +    }
> +
> +  auto def = std::make_unique<static_symbol_def> ();
> +  def->address = address;
> +  def->language = language;
> +
> +  builder->inner.add_symbol_def (name, std::move (def));
> +
> +  Py_RETURN_NONE;
> +}
> +
> +/* Builds the object file. */
> +static PyObject *
> +objbdpy_build (PyObject *self, PyObject *args)
> +{
> +  auto builder = validate_objfile_builder_object (self);
> +  if (builder == nullptr)
> +    return nullptr;
> +
> +  if (builder->inner.installed)
> +    {
> +      PyErr_SetString (PyExc_ValueError, "build() cannot be run twice on the \
> +                       same object");
> +      return nullptr;
> +    }
> +  auto of = build_new_objfile (*builder);
> +  builder->inner.installed = true;
> +
> +
> +  auto objpy = objfile_to_objfile_object (of).get ();
> +  Py_INCREF(objpy);
> +  return objpy;
> +}
> +
> +/* Implements the __init__() function. */
> +static int
> +objbdpy_init (PyObject *self0, PyObject *args, PyObject *kw)
> +{
> +  static const char *format = "s";
> +  static const char *keywords[] =
> +    {
> +      "name", NULL
> +    };
> +
> +  const char *name;
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name))
> +    return -1;
> +
> +  auto self = (objfile_builder_object *)self0;
> +  self->inner.name = name;
> +  self->inner.symbols.clear ();
> +
> +  return 0;
> +}
> +
> +/* The function handling construction of the ObjfileBuilder object.
> + *
> + * We need to have a custom function here as, even though Python manages the
> + * memory backing the object up, it assumes clearing the memory is enough to
> + * begin its lifetime, which is not the case here, and would lead to undefined
> + * behavior as soon as we try to use it in any meaningful way.
> + *
> + * So, what we have to do here is manually begin the lifecycle of our new object
> + * by constructing it in place, using the memory region Python just allocated
> + * for us. This ensures the object will have already started its lifetime by
> + * the time we start using it. */
> +static PyObject *
> +objbdpy_new (PyTypeObject *subtype, PyObject *args, PyObject *kwds)
> +{
> +  objfile_builder_object *region =
> +    (objfile_builder_object *) subtype->tp_alloc(subtype, 1);
> +  gdb_assert ((size_t)region % alignof (objfile_builder_object) == 0);
> +  gdb_assert (region != nullptr);
> +
> +  new (&region->inner) objfile_builder_data ();
> +
> +  return (PyObject *)region;
> +}
> +
> +/* The function handling destruction of the ObjfileBuilder object.
> + *
> + * While running the destructor of our object isn't _strictly_ necessary, we
> + * would very much like for the memory it owns to be freed, but, because it was
> + * constructed in place, we have to call its destructor manually here. */
> +static void
> +objbdpy_dealloc (PyObject *self0)
> +{
> +
> +  auto self = (objfile_builder_object *)self0;
> +  PyTypeObject *tp = Py_TYPE(self);
> +
> +  self->inner.~objfile_builder_data ();
> +
> +  tp->tp_free(self);
> +  Py_DECREF(tp);
> +}
> +
> +static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
> +gdbpy_initialize_objfile_builder (void)
> +{
> +  if (PyType_Ready (&objfile_builder_object_type) < 0)
> +    return -1;
> +
> +  return gdb_pymodule_addobject (gdb_module, "ObjfileBuilder",
> +                                (PyObject *) &objfile_builder_object_type);
> +}
> +
> +GDBPY_INITIALIZE_FILE (gdbpy_initialize_objfile_builder);
> +
> +static PyMethodDef objfile_builder_object_methods[] =
> +{
> +  { "build", (PyCFunction) objbdpy_build, METH_NOARGS,
> +    "build ().\n\
> +Build a new objfile containing the symbols added to builder." },
> +  { "add_type_symbol", (PyCFunction) objbdpy_add_type_symbol,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "add_type_symbol (name [str], type [gdb.Type], language [str]).\n\
> +Add a new type symbol in the given language, associated with the given type." },
> +  { "add_label_symbol", (PyCFunction) objbdpy_add_label_symbol,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "add_label_symbol (name [str], address [int], language [str]).\n\
> +Add a new label symbol in the given language, at the given address." },
> +  { "add_static_symbol", (PyCFunction) objbdpy_add_static_symbol,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "add_static_symbol (name [str], address [int], language [str]).\n\
> +Add a new static symbol in the given language, at the given address." },
> +  { NULL }
> +};
> +
> +PyTypeObject objfile_builder_object_type = {
> +  PyVarObject_HEAD_INIT (NULL, 0)
> +  "gdb.ObjfileBuilder",               /* tp_name */
> +  sizeof (objfile_builder_object),    /* tp_basicsize */
> +  0,                                  /* tp_itemsize */
> +  objbdpy_dealloc,                    /* tp_dealloc */
> +  0,                                  /* tp_vectorcall_offset */
> +  nullptr,                            /* tp_getattr */
> +  nullptr,                            /* tp_setattr */
> +  nullptr,                            /* tp_compare */
> +  nullptr,                            /* tp_repr */
> +  nullptr,                            /* tp_as_number */
> +  nullptr,                            /* tp_as_sequence */
> +  nullptr,                            /* tp_as_mapping */
> +  nullptr,                            /* tp_hash  */
> +  nullptr,                            /* tp_call */
> +  nullptr,                            /* tp_str */
> +  nullptr,                            /* tp_getattro */
> +  nullptr,                            /* tp_setattro */
> +  nullptr,                            /* tp_as_buffer */
> +  Py_TPFLAGS_DEFAULT,                 /* tp_flags */
> +  "GDB object file builder",          /* tp_doc */
> +  nullptr,                            /* tp_traverse */
> +  nullptr,                            /* tp_clear */
> +  nullptr,                            /* tp_richcompare */
> +  0,                                  /* tp_weaklistoffset */
> +  nullptr,                            /* tp_iter */
> +  nullptr,                            /* tp_iternext */
> +  objfile_builder_object_methods,     /* tp_methods */
> +  nullptr,                            /* tp_members */
> +  nullptr,                            /* tp_getset */
> +  nullptr,                            /* tp_base */
> +  nullptr,                            /* tp_dict */
> +  nullptr,                            /* tp_descr_get */
> +  nullptr,                            /* tp_descr_set */
> +  0,                                  /* tp_dictoffset */
> +  objbdpy_init,                       /* tp_init */
> +  PyType_GenericAlloc,                /* tp_alloc */
> +  objbdpy_new,                        /* tp_new */
> +};
> +
> +
> diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
> index ad72f3f042..be21011ce6 100644
> --- a/gdb/python/py-objfile.c
> +++ b/gdb/python/py-objfile.c
> @@ -25,6 +25,7 @@
>  #include "build-id.h"
>  #include "symtab.h"
>  #include "python.h"
> +#include "buildsym.h"
>
>  struct objfile_object
>  {
> diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
> index dbd33570a7..fbf9b06af5 100644
> --- a/gdb/python/python-internal.h
> +++ b/gdb/python/python-internal.h
> @@ -480,6 +480,7 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
>  struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
>  frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
>  struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
> +struct floatformat *float_format_object_as_float_format (PyObject *self);
>
>  /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
>     gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
> --
> 2.40.1
>

^ permalink raw reply	[relevance 14%]

* [PING] Re: [PATCH] Add support for creating new types from the Python API
  2023-01-11  0:58  2% ` [PATCH] Add support for creating new types from " Matheus Branco Borella
@ 2023-06-27  3:52 14%   ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2023-06-27  3:52 UTC (permalink / raw)
  To: gdb-patches

Following the contribution checklist, so, pinging this.

On Tue, Jan 10, 2023 at 9:58 PM Matheus Branco Borella
<dark.ryu.550@gmail.com> wrote:
>
> This patch adds support for creating types from within the Python API. It does
> so by exposing the `init_*_type` family of functions, defined in `gdbtypes.h` to
> Python and having them return `gdb.Type` objects connected to the newly minted
> types.
>
> These functions are accessible in the root of the gdb module and all require
> a reference to a `gdb.Objfile`. Types created from this API are exclusively
> objfile-owned.
>
> This patch also adds an extra type - `gdb.FloatFormat` - to support creation of
> floating point types by letting users control the format from within Python. It
> is missing, however, a way to specify half formats and validation functions.
>
> It is important to note that types created using this interface are not
> automatically registered as a symbol, and so, types will become unreachable
> unless used to create a value that otherwise references it or saved in some way.
>
> The main drawback of using the `init_*_type` family over implementing type
> initialization by hand is that any type that's created gets immediately
> allocated on its owner objfile's obstack, regardless of what its real
> lifetime requirements are. The main implication of this is that types that
> become unreachable will leak their memory for the lifetime of the objfile.
>
> Keeping track of the initialization of the type by hand would require a
> deeper change to the existing type object infrastructure. A bit too ambitious
> for a first patch, I'd say.
>
> if it were to be done though, we would gain the ability to only keep in the
> obstack types that are known to be referenced in some other way - by allocating
> and copying the data to the obstack as other objects are created that reference
> it (eg. symbols).
> ---
>  gdb/Makefile.in              |   2 +
>  gdb/python/py-float-format.c | 297 +++++++++++++++++++++++++++
>  gdb/python/py-objfile.c      |  12 ++
>  gdb/python/py-type-init.c    | 388 +++++++++++++++++++++++++++++++++++
>  gdb/python/python-internal.h |  17 ++
>  gdb/python/python.c          |  44 +++-
>  6 files changed, 759 insertions(+), 1 deletion(-)
>  create mode 100644 gdb/python/py-float-format.c
>  create mode 100644 gdb/python/py-type-init.c
>
> diff --git a/gdb/Makefile.in b/gdb/Makefile.in
> index fb4d42c7baa..789f7dce224 100644
> --- a/gdb/Makefile.in
> +++ b/gdb/Makefile.in
> @@ -432,6 +432,8 @@ SUBDIR_PYTHON_SRCS = \
>         python/py-threadevent.c \
>         python/py-tui.c \
>         python/py-type.c \
> +       python/py-type-init.c \
> +       python/py-float-format.c \
>         python/py-unwind.c \
>         python/py-utils.c \
>         python/py-value.c \
> diff --git a/gdb/python/py-float-format.c b/gdb/python/py-float-format.c
> new file mode 100644
> index 00000000000..e517e410899
> --- /dev/null
> +++ b/gdb/python/py-float-format.c
> @@ -0,0 +1,297 @@
> +/* Accessibility of float format controls from inside the Python API
> +
> +   Copyright (C) 2008-2023 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include "defs.h"
> +#include "python-internal.h"
> +#include "floatformat.h"
> +
> +/* Structure backing the float format Python interface. */
> +
> +struct float_format_object
> +{
> +  PyObject_HEAD
> +  struct floatformat format;
> +
> +  struct floatformat *float_format ()
> +  {
> +    return &this->format;
> +  }
> +};
> +
> +/* Initializes the float format type and registers it with the Python interpreter. */
> +
> +int
> +gdbpy_initialize_float_format (void)
> +{
> +  if (PyType_Ready (&float_format_object_type) < 0)
> +    return -1;
> +
> +  if (gdb_pymodule_addobject (gdb_module, "FloatFormat",
> +    (PyObject *) &float_format_object_type) < 0)
> +    return -1;
> +
> +  return 0;
> +}
> +
> +#define INSTANCE_FIELD_GETTER(getter_name, field_name, field_type, field_conv) \
> +  static PyObject *                                                            \
> +  getter_name (PyObject *self, void *closure)                                  \
> +  {                                                                            \
> +    float_format_object *ff = (float_format_object*) self;                     \
> +    field_type value = ff->float_format ()->field_name;                        \
> +    return field_conv (value);                                                 \
> +  }
> +
> +#define INSTANCE_FIELD_SETTER(getter_name, field_name, field_type, field_conv) \
> +  static int                                                                   \
> +  getter_name (PyObject *self, PyObject* value, void *closure)                 \
> +  {                                                                            \
> +    field_type native_value;                                                   \
> +    if (!field_conv (value, &native_value))                                    \
> +      return -1;                                                               \
> +    float_format_object *ff = (float_format_object*) self;                     \
> +    ff->float_format ()->field_name = native_value;                            \
> +    return 0;                                                                  \
> +  }
> +
> +/* Converts from the intbit enum to a Python boolean. */
> +
> +static PyObject *
> +intbit_to_py (enum floatformat_intbit intbit)
> +{
> +  gdb_assert (intbit == floatformat_intbit_yes || intbit == floatformat_intbit_no);
> +  if (intbit == floatformat_intbit_no)
> +    Py_RETURN_FALSE;
> +  else
> +    Py_RETURN_TRUE;
> +}
> +
> +/* Converts from a Python boolean to the intbit enum. */
> +
> +static bool
> +py_to_intbit (PyObject *object, enum floatformat_intbit *intbit)
> +{
> +  if (!PyObject_IsInstance (object, (PyObject*) &PyBool_Type))
> +  {
> +    PyErr_SetString (PyExc_TypeError, "intbit must be True or False");
> +    return false;
> +  }
> +
> +  *intbit = PyObject_IsTrue (object) ? floatformat_intbit_yes : floatformat_intbit_no;
> +  return true;
> +}
> +
> +/* Converts from a Python integer to a unsigned integer. */
> +
> +static bool
> +py_to_unsigned_int (PyObject *object, unsigned int *val)
> +{
> +  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
> +  {
> +    PyErr_SetString (PyExc_TypeError, "value must be an integer");
> +    return false;
> +  }
> +
> +  long native_val = PyLong_AsLong (object);
> +  if (native_val > (long) UINT_MAX)
> +  {
> +    PyErr_SetString (PyExc_ValueError, "value is too large");
> +    return false;
> +  }
> +  if (native_val < 0)
> +  {
> +    PyErr_SetString (PyExc_ValueError, "value must not be smaller than zero");
> +    return false;
> +  }
> +
> +  *val = (unsigned int) native_val;
> +  return true;
> +}
> +
> +/* Converts from a Python integer to a signed integer. */
> +
> +static bool
> +py_to_int(PyObject *object, int *val)
> +{
> +  if(!PyObject_IsInstance(object, (PyObject*)&PyLong_Type))
> +  {
> +    PyErr_SetString(PyExc_TypeError, u8"value must be an integer");
> +    return false;
> +  }
> +
> +  long native_val = PyLong_AsLong(object);
> +  if(native_val > (long)INT_MAX)
> +  {
> +    PyErr_SetString(PyExc_ValueError, u8"value is too large");
> +    return false;
> +  }
> +
> +  *val = (int)native_val;
> +  return true;
> +}
> +
> +INSTANCE_FIELD_GETTER (ffpy_get_totalsize, totalsize, unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_sign_start, sign_start, unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_start, exp_start, unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_len, exp_len, unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_bias, exp_bias, int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_exp_nan, exp_nan, unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_man_start, man_start, unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_man_len, man_len, unsigned int, PyLong_FromLong)
> +INSTANCE_FIELD_GETTER (ffpy_get_intbit, intbit, enum floatformat_intbit, intbit_to_py)
> +INSTANCE_FIELD_GETTER (ffpy_get_name, name, const char *, PyUnicode_FromString)
> +
> +INSTANCE_FIELD_SETTER (ffpy_set_totalsize, totalsize, unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_sign_start, sign_start, unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_start, exp_start, unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_len, exp_len, unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_bias, exp_bias, int, py_to_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_exp_nan, exp_nan, unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_man_start, man_start, unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_man_len, man_len, unsigned int, py_to_unsigned_int)
> +INSTANCE_FIELD_SETTER (ffpy_set_intbit, intbit, enum floatformat_intbit, py_to_intbit)
> +
> +/* Makes sure float formats created from Python always test as valid. */
> +
> +static int
> +ffpy_always_valid (const struct floatformat *fmt ATTRIBUTE_UNUSED,
> +                   const void *from ATTRIBUTE_UNUSED)
> +{
> +  return 1;
> +}
> +
> +/* Initializes new float format objects. */
> +
> +static int
> +ffpy_init (PyObject *self,
> +           PyObject *args ATTRIBUTE_UNUSED,
> +           PyObject *kwds ATTRIBUTE_UNUSED)
> +{
> +  auto ff = (float_format_object*) self;
> +  ff->format = floatformat ();
> +  ff->float_format ()->name = "";
> +  ff->float_format ()->is_valid = ffpy_always_valid;
> +  return 0;
> +}
> +
> +/* Retrieves a pointer to the underlying float format structure. */
> +
> +struct floatformat *
> +float_format_object_as_float_format (PyObject *self)
> +{
> +  if (!PyObject_IsInstance (self, (PyObject*) &float_format_object_type))
> +    return nullptr;
> +  return ((float_format_object*) self)->float_format ();
> +}
> +
> +static gdb_PyGetSetDef float_format_object_getset[] =
> +{
> +  { "totalsize", ffpy_get_totalsize, ffpy_set_totalsize,
> +    "The total size of the floating point number, in bits.", nullptr },
> +  { "sign_start", ffpy_get_sign_start, ffpy_set_sign_start,
> +    "The bit offset of the sign bit.", nullptr },
> +  { "exp_start", ffpy_get_exp_start, ffpy_set_exp_start,
> +    "The bit offset of the start of the exponent.", nullptr },
> +  { "exp_len", ffpy_get_exp_len, ffpy_set_exp_len,
> +    "The size of the exponent, in bits.", nullptr },
> +  { "exp_bias", ffpy_get_exp_bias, ffpy_set_exp_bias,
> +    "Bias added to a \"true\" exponent to form the biased exponent.", nullptr },
> +  { "exp_nan", ffpy_get_exp_nan, ffpy_set_exp_nan,
> +    "Exponent value which indicates NaN.", nullptr },
> +  { "man_start", ffpy_get_man_start, ffpy_set_man_start,
> +    "The bit offset of the start of the mantissa.", nullptr },
> +  { "man_len", ffpy_get_man_len, ffpy_set_man_len,
> +    "The size of the mantissa, in bits.", nullptr },
> +  { "intbit", ffpy_get_intbit, ffpy_set_intbit,
> +    "Is the integer bit explicit or implicit?", nullptr },
> +  { "name", ffpy_get_name, nullptr,
> +    "Internal name for debugging.", nullptr },
> +  { nullptr }
> +};
> +
> +static PyMethodDef float_format_object_methods[] =
> +{
> +  { NULL }
> +};
> +
> +static PyNumberMethods float_format_object_as_number = {
> +  nullptr,             /* nb_add */
> +  nullptr,             /* nb_subtract */
> +  nullptr,             /* nb_multiply */
> +  nullptr,             /* nb_remainder */
> +  nullptr,             /* nb_divmod */
> +  nullptr,             /* nb_power */
> +  nullptr,             /* nb_negative */
> +  nullptr,             /* nb_positive */
> +  nullptr,             /* nb_absolute */
> +  nullptr,             /* nb_nonzero */
> +  nullptr,             /* nb_invert */
> +  nullptr,             /* nb_lshift */
> +  nullptr,             /* nb_rshift */
> +  nullptr,             /* nb_and */
> +  nullptr,             /* nb_xor */
> +  nullptr,             /* nb_or */
> +  nullptr,             /* nb_int */
> +  nullptr,             /* reserved */
> +  nullptr,             /* nb_float */
> +};
> +
> +PyTypeObject float_format_object_type =
> +{
> +  PyVarObject_HEAD_INIT (NULL, 0)
> +  "gdb.FloatFormat",              /*tp_name*/
> +  sizeof (float_format_object),   /*tp_basicsize*/
> +  0,                              /*tp_itemsize*/
> +  nullptr,                        /*tp_dealloc*/
> +  0,                              /*tp_print*/
> +  nullptr,                        /*tp_getattr*/
> +  nullptr,                        /*tp_setattr*/
> +  nullptr,                        /*tp_compare*/
> +  nullptr,                        /*tp_repr*/
> +  &float_format_object_as_number, /*tp_as_number*/
> +  nullptr,                        /*tp_as_sequence*/
> +  nullptr,                        /*tp_as_mapping*/
> +  nullptr,                        /*tp_hash */
> +  nullptr,                        /*tp_call*/
> +  nullptr,                        /*tp_str*/
> +  nullptr,                        /*tp_getattro*/
> +  nullptr,                        /*tp_setattro*/
> +  nullptr,                        /*tp_as_buffer*/
> +  Py_TPFLAGS_DEFAULT,             /*tp_flags*/
> +  "GDB float format object",      /* tp_doc */
> +  nullptr,                        /* tp_traverse */
> +  nullptr,                        /* tp_clear */
> +  nullptr,                        /* tp_richcompare */
> +  0,                              /* tp_weaklistoffset */
> +  nullptr,                        /* tp_iter */
> +  nullptr,                        /* tp_iternext */
> +  float_format_object_methods,    /* tp_methods */
> +  nullptr,                        /* tp_members */
> +  float_format_object_getset,     /* tp_getset */
> +  nullptr,                        /* tp_base */
> +  nullptr,                        /* tp_dict */
> +  nullptr,                        /* tp_descr_get */
> +  nullptr,                        /* tp_descr_set */
> +  0,                              /* tp_dictoffset */
> +  ffpy_init,                      /* tp_init */
> +  nullptr,                        /* tp_alloc */
> +  PyType_GenericNew,              /* tp_new */
> +};
> +
> +
> diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
> index c278925531b..28a7c9a7873 100644
> --- a/gdb/python/py-objfile.c
> +++ b/gdb/python/py-objfile.c
> @@ -704,6 +704,18 @@ objfile_to_objfile_object (struct objfile *objfile)
>    return gdbpy_ref<>::new_reference (result);
>  }
>
> +struct objfile *
> +objfile_object_to_objfile (PyObject *self)
> +{
> +  if (!PyObject_TypeCheck (self, &objfile_object_type))
> +    return nullptr;
> +
> +  auto objfile_object = (struct objfile_object*) self;
> +  OBJFPY_REQUIRE_VALID (objfile_object);
> +
> +  return objfile_object->objfile;
> +}
> +
>  int
>  gdbpy_initialize_objfile (void)
>  {
> diff --git a/gdb/python/py-type-init.c b/gdb/python/py-type-init.c
> new file mode 100644
> index 00000000000..f3b6813c3ad
> --- /dev/null
> +++ b/gdb/python/py-type-init.c
> @@ -0,0 +1,388 @@
> +/* Functionality for creating new types accessible from python.
> +
> +   Copyright (C) 2008-2023 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include "defs.h"
> +#include "python-internal.h"
> +#include "gdbtypes.h"
> +#include "floatformat.h"
> +#include "objfiles.h"
> +#include "gdbsupport/gdb_obstack.h"
> +
> +
> +/* Copies a null-terminated string into an objfile's obstack. */
> +
> +static const char *
> +copy_string (struct objfile *objfile, const char *py_str)
> +{
> +  unsigned int len = strlen (py_str);
> +  return obstack_strndup (&objfile->per_bfd->storage_obstack,
> +                          py_str, len);
> +}
> +
> +/* Creates a new type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object;
> +  enum type_code code;
> +  int bit_length;
> +  const char *py_name;
> +
> +  if(!PyArg_ParseTuple (args, "Oiis", &objfile_object, &code, &bit_length, &py_name))
> +    return nullptr;
> +
> +  struct objfile* objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +  {
> +    type = init_type (objfile, code, bit_length, name);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new integer type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_integer_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object;
> +  int bit_size;
> +  int unsigned_p;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +  {
> +    type = init_integer_type (objfile, bit_size, unsigned_p, name);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object(type);
> +}
> +
> +/* Creates a new character type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_character_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *objfile_object;
> +  int bit_size;
> +  int unsigned_p;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +  {
> +    type = init_character_type (objfile, bit_size, unsigned_p, name);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new boolean type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_boolean_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *objfile_object;
> +  int bit_size;
> +  int unsigned_p;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +  {
> +    type = init_boolean_type (objfile, bit_size, unsigned_p, name);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new float type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_float_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object, *float_format_object;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "OOs", &objfile_object, &float_format_object, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  struct floatformat *local_ff = float_format_object_as_float_format (float_format_object);
> +  if (local_ff == nullptr)
> +    return nullptr;
> +
> +  /* Persist a copy of the format in the objfile's obstack. This guarantees that
> +   * the format won't outlive the type being created from it and that changes
> +   * made to the object used to create this type will not affect it after
> +   * creation. */
> +  auto ff = OBSTACK_CALLOC
> +    (&objfile->objfile_obstack,
> +     1,
> +     struct floatformat);
> +  memcpy (ff, local_ff, sizeof (struct floatformat));
> +
> +  /* We only support creating float types in the architecture's endianness, so
> +   * make sure init_float_type sees the float format structure we need it to. */
> +  enum bfd_endian endianness = gdbarch_byte_order (objfile->arch());
> +  gdb_assert (endianness < BFD_ENDIAN_UNKNOWN);
> +
> +  const struct floatformat *per_endian[2] = { nullptr, nullptr };
> +  per_endian[endianness] = ff;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +  {
> +    type = init_float_type (objfile, -1, name, per_endian, endianness);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Creates a new decimal float type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_decfloat_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object;
> +  int bit_length;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Ois", &objfile_object, &bit_length, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +  {
> +    type = init_decfloat_type (objfile, bit_length, name);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (type);
> +}
> +
> +/* Returns whether a given type can be used to create a complex type. */
> +
> +PyObject *
> +gdbpy_can_create_complex_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *type_object;
> +
> +  if (!PyArg_ParseTuple (args, "O", &type_object))
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  bool can_create_complex;
> +  try
> +  {
> +    can_create_complex = can_create_complex_type (type);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  if (can_create_complex)
> +    Py_RETURN_TRUE;
> +  else
> +    Py_RETURN_FALSE;
> +}
> +
> +/* Creates a new complex type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_complex_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *type_object;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Os", &type_object, &py_name))
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  obstack *obstack;
> +  if (type->is_objfile_owned ())
> +    obstack = &type->objfile_owner ()->objfile_obstack;
> +  else
> +    obstack = gdbarch_obstack (type->arch_owner ());
> +
> +  unsigned int len = strlen (py_name);
> +  const char *name = obstack_strndup (obstack,
> +                                      py_name,
> +                                      len);
> +  struct type *complex_type;
> +  try
> +  {
> +    complex_type = init_complex_type (name, type);
> +    gdb_assert (complex_type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (complex_type);
> +}
> +
> +/* Creates a new pointer type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_pointer_type (PyObject *self, PyObject *args)
> +{
> +  PyObject *objfile_object, *type_object;
> +  int bit_length;
> +  const char *py_name;
> +
> +  if (!PyArg_ParseTuple (args, "OOis", &objfile_object, &type_object, &bit_length, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  struct type *type = type_object_to_type (type_object);
> +  if (type == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *pointer_type;
> +  try
> +  {
> +    pointer_type = init_pointer_type (objfile, bit_length, name, type);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (pointer_type);
> +}
> +
> +/* Creates a new fixed point type and returns a new gdb.Type associated with it. */
> +
> +PyObject *
> +gdbpy_init_fixed_point_type (PyObject *self, PyObject *args)
> +{
> +
> +  PyObject *objfile_object;
> +  int bit_length;
> +  int unsigned_p;
> +  const char* py_name;
> +
> +  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_length, &unsigned_p, &py_name))
> +    return nullptr;
> +
> +  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
> +  if (objfile == nullptr)
> +    return nullptr;
> +
> +  const char *name = copy_string (objfile, py_name);
> +  struct type *type;
> +  try
> +  {
> +    type = init_fixed_point_type (objfile, bit_length, unsigned_p, name);
> +    gdb_assert (type != nullptr);
> +  }
> +  catch (gdb_exception_error& ex)
> +  {
> +    GDB_PY_HANDLE_EXCEPTION (ex);
> +  }
> +
> +  return type_to_type_object (type);
> +}
> diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
> index 06357cc8c0b..3877f8a7ca9 100644
> --- a/gdb/python/python-internal.h
> +++ b/gdb/python/python-internal.h
> @@ -289,6 +289,8 @@ extern PyTypeObject frame_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("frame_object");
>  extern PyTypeObject thread_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("thread_object");
> +extern PyTypeObject float_format_object_type
> +    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("float_format");
>
>  /* Ensure that breakpoint_object_type is initialized and return true.  If
>     breakpoint_object_type can't be initialized then set a suitable Python
> @@ -431,6 +433,17 @@ gdb::unique_xmalloc_ptr<char> gdbpy_parse_command_name
>  PyObject *gdbpy_register_tui_window (PyObject *self, PyObject *args,
>                                      PyObject *kw);
>
> +PyObject *gdbpy_init_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_integer_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_character_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_boolean_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_float_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_decfloat_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_can_create_complex_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_complex_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_pointer_type (PyObject *self, PyObject *args);
> +PyObject *gdbpy_init_fixed_point_type (PyObject *self, PyObject *args);
> +
>  PyObject *symtab_and_line_to_sal_object (struct symtab_and_line sal);
>  PyObject *symtab_to_symtab_object (struct symtab *symtab);
>  PyObject *symbol_to_symbol_object (struct symbol *sym);
> @@ -481,6 +494,8 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
>  struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
>  frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
>  struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
> +struct objfile *objfile_object_to_objfile (PyObject *self);
> +struct floatformat *float_format_object_as_float_format (PyObject *self);
>
>  /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
>     gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
> @@ -559,6 +574,8 @@ int gdbpy_initialize_micommands (void)
>  void gdbpy_finalize_micommands ();
>  int gdbpy_initialize_disasm ()
>    CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
> +int gdbpy_initialize_float_format ()
> +  CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
>
>  PyMODINIT_FUNC gdbpy_events_mod_func ();
>
> diff --git a/gdb/python/python.c b/gdb/python/python.c
> index 4aa24421dec..1ed29ff4dea 100644
> --- a/gdb/python/python.c
> +++ b/gdb/python/python.c
> @@ -2153,7 +2153,8 @@ do_start_initialization ()
>        || gdbpy_initialize_membuf () < 0
>        || gdbpy_initialize_connection () < 0
>        || gdbpy_initialize_tui () < 0
> -      || gdbpy_initialize_micommands () < 0)
> +      || gdbpy_initialize_micommands () < 0
> +      || gdbpy_initialize_float_format() < 0)
>      return false;
>
>  #define GDB_PY_DEFINE_EVENT_TYPE(name, py_name, doc, base)     \
> @@ -2529,6 +2530,47 @@ Return current recording object." },
>      "stop_recording () -> None.\n\
>  Stop current recording." },
>
> +  /* Type initialization functions. */
> +  { "init_type", gdbpy_init_type, METH_VARARGS,
> +    "init_type (objfile, type_code, bit_length, name) -> type\n\
> +    Creates a new type with the given bit length and type code, owned\
> +    by the given objfile." },
> +  { "init_integer_type", gdbpy_init_integer_type, METH_VARARGS,
> +    "init_integer_type (objfile, bit_length, unsigned, name) -> type\n\
> +    Creates a new integer type with the given bit length and \
> +    signedness, owned by the given objfile." },
> +  { "init_character_type", gdbpy_init_character_type, METH_VARARGS,
> +    "init_character_type (objfile, bit_length, unsigned, name) -> type\n\
> +    Creates a new character type with the given bit length and \
> +    signedness, owned by the given objfile." },
> +  { "init_boolean_type", gdbpy_init_boolean_type, METH_VARARGS,
> +    "init_boolean_type (objfile, bit_length, unsigned, name) -> type\n\
> +    Creates a new boolean type with the given bit length and \
> +    signedness, owned by the given objfile." },
> +  { "init_float_type", gdbpy_init_float_type, METH_VARARGS,
> +    "init_float_type (objfile, float_format, name) -> type\n\
> +    Creates a new floating point type with the given bit length and \
> +    format, owned by the given objfile." },
> +  { "init_decfloat_type", gdbpy_init_decfloat_type, METH_VARARGS,
> +    "init_decfloat_type (objfile, bit_length, name) -> type\n\
> +    Creates a new decimal float type with the given bit length,\
> +    owned by the given objfile." },
> +  { "can_create_complex_type", gdbpy_can_create_complex_type, METH_VARARGS,
> +    "can_create_complex_type (type) -> bool\n\
> +     Returns whether a given type can form a new complex type." },
> +  { "init_complex_type", gdbpy_init_complex_type, METH_VARARGS,
> +    "init_complex_type (base_type, name) -> type\n\
> +    Creates a new complex type whose components belong to the\
> +    given type, owned by the given objfile." },
> +  { "init_pointer_type", gdbpy_init_pointer_type, METH_VARARGS,
> +    "init_pointer_type (objfile, target_type, bit_length, name) -> type\n\
> +    Creates a new pointer type with the given bit length, pointing\
> +    to the given target type, and owned by the given objfile." },
> + { "init_fixed_point_type", gdbpy_init_fixed_point_type, METH_VARARGS,
> +   "init_fixed_point_type (objfile, bit_length, unsigned, name) -> type\n\
> +   Creates a new fixed point type with the given bit length and\
> +   signedness, owned by the given objfile." },
> +
>    { "lookup_type", (PyCFunction) gdbpy_lookup_type,
>      METH_VARARGS | METH_KEYWORDS,
>      "lookup_type (name [, block]) -> type\n\
> --
> 2.37.3.windows.1
>

^ permalink raw reply	[relevance 14%]

* Re: [PATCH] Add name_of_main and language_of_main to the DWARF index
  2023-06-08 21:40  5% [PATCH] Add name_of_main and language_of_main to the DWARF index Matheus Branco Borella
@ 2023-06-09 16:56  0% ` Tom Tromey
  2023-06-30 20:36  4%   ` Matheus Branco Borella
  2023-08-11 18:21  4% ` [PATCH v3] " Matheus Branco Borella
  1 sibling, 1 reply; 65+ results
From: Tom Tromey @ 2023-06-09 16:56 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches; +Cc: Matheus Branco Borella

>>>>> Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org> writes:

Hi.  Thank you for the patch.  I think it's a good addition and is
fundamentally fine.  There are a few nits below.

For a patch of this size, I think we will need a copyright assignment.
The form is here:

http://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future

Normally this is a pretty quick process.

> This patch adds a new section to the DWARF index containing the name
> and the language of the main function symbol, gathered from
> `cooked_index::get_main`, if available. Currently, for lack of a better name,
> this section is called the "shortcut table" (suggestions for a better name are
> appreciated).

Seems fine to me :)

> In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
> startup time down from about 34 seconds to about 1.5 seconds.

Very nice.

> (I feel like I might've changed too much about the index format by adding a 
> breaking change. If there's a better way to do this it'd be glad to hear about 
> it.)

The index format is documented in gdb/doc/gdb.texinfo.  So, your patch
will at least need an update to that file.  I suspect a brief entry in
gdb/NEWS describing the change would also be good.

There's a note on the compatibility issue inline, below.

>    /* The version number.  */
> -  contents.append_offset (8);
> +  contents.append_offset (9);

Someday we should probably use a #define for this.
Not your problem though.

> +/* Write shortcut information. */
> +
> +static void
> +write_shortcuts_table (cooked_index *table, data_buf& shortcuts,

gdb style puts the "&" next to "shortcuts".
There's a few of these I think.

> +      lang = main_info->per_cu->lang ();
...
> +  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, lang);

I think it would be better to use a DW_LANG_ constant here -- it's
better not to expose gdb's enum language values to the world, whereas
the DWARF values are already fixed.

Those aren't really preserved exactly in the reader, but mapping back
from the gdb language to DWARF would be fine (probably).

> +  ++i;
> +
> +  const gdb_byte *shortcut_table = addr + metadata[i];
> +  const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
> +  map->shortcut_table
> +    = gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
 
This section probably has to be conditional on 'version >= 8.
Otherwise a new gdb will fail with an older version of the index.

> +/* Sets the name and language of the main function from the shortcut table. */
> +
> +static void
> +set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile, 
> +                              mapped_gdb_index *index)
> +{
> +  auto ptr = index->shortcut_table.data ();
> +  const auto lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);

This code should probably check the size of the data, both for safety
and also because that's a decent way to handle the index version issue.

Tom

^ permalink raw reply	[relevance 0%]

* Re: [PATCHv3 0/2] Add __repr__() implementation to a few Python types
  2023-06-07 17:05  7%             ` [PATCHv3 0/2] Add " Matheus Branco Borella
  2023-06-08 18:46  7%               ` Andrew Burgess
@ 2023-06-09 12:33  7%               ` Andrew Burgess
  2023-07-04 11:09  2%                 ` Andrew Burgess
  1 sibling, 1 reply; 65+ results
From: Andrew Burgess @ 2023-06-09 12:33 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, aburgess
  Cc: dark.ryu.550, gdb-patches

Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
writes:

> All of the changes look good in my oppinion. I'm still getting used to the hang
> of things, so I really appreciate the effort you put into making my patch more 
> presentable! So I think it should be good to go?

Do you have a copyright assignment in place?  If not then, given the
size of the change, I think you will need one before this could be
merged.

For details see:

  https://sourceware.org/gdb/wiki/ContributionChecklist#FSF_copyright_Assignment

I think you'd need to complete this form:

  https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future

and send it to assign@gnu.org, then you'd be sent the actual assignment
document to complete and send back.

Thanks,
Andrew


^ permalink raw reply	[relevance 7%]

* [PATCH] Add name_of_main and language_of_main to the DWARF index
@ 2023-06-08 21:40  5% Matheus Branco Borella
  2023-06-09 16:56  0% ` Tom Tromey
  2023-08-11 18:21  4% ` [PATCH v3] " Matheus Branco Borella
  0 siblings, 2 replies; 65+ results
From: Matheus Branco Borella @ 2023-06-08 21:40 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella

This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table" (suggestions for a better name are
appreciated). The way this name is both saved and applied upon an index being
loaded in mirrors how it is done in `cooked_index_functions`, more specifically,
the full name of the main function symbol is saved and `set_objfile_main_name`
is used to apply it after it is loaded.

The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.

In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.

(I feel like I might've changed too much about the index format by adding a 
breaking change. If there's a better way to do this it'd be glad to hear about 
it.)

---
 gdb/dwarf2/index-write.c    | 47 +++++++++++++++++++++++++++++++++----
 gdb/dwarf2/read-gdb-index.c | 44 +++++++++++++++++++++++++++++++++-
 2 files changed, 85 insertions(+), 6 deletions(-)

diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 62c2cc6ac7..4554b5bc49 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1080,14 +1080,15 @@ write_gdbindex_1 (FILE *out_file,
 		  const data_buf &types_cu_list,
 		  const data_buf &addr_vec,
 		  const data_buf &symtab_vec,
-		  const data_buf &constant_pool)
+		  const data_buf &constant_pool,
+                  const data_buf &shortcuts)
 {
   data_buf contents;
-  const offset_type size_of_header = 6 * sizeof (offset_type);
+  const offset_type size_of_header = 7 * sizeof (offset_type);
   offset_type total_len = size_of_header;
 
   /* The version number.  */
-  contents.append_offset (8);
+  contents.append_offset (9);
 
   /* The offset of the CU list from the start of the file.  */
   contents.append_offset (total_len);
@@ -1105,6 +1106,10 @@ write_gdbindex_1 (FILE *out_file,
   contents.append_offset (total_len);
   total_len += symtab_vec.size ();
 
+  /* The offset of the shortcut table from the start of the file.  */
+  contents.append_offset (total_len);
+  total_len += shortcuts.size ();
+
   /* The offset of the constant pool from the start of the file.  */
   contents.append_offset (total_len);
   total_len += constant_pool.size ();
@@ -1116,6 +1121,7 @@ write_gdbindex_1 (FILE *out_file,
   types_cu_list.file_write (out_file);
   addr_vec.file_write (out_file);
   symtab_vec.file_write (out_file);
+  shortcuts.file_write (out_file);
   constant_pool.file_write (out_file);
 
   assert_file_size (out_file, total_len);
@@ -1193,6 +1199,34 @@ write_cooked_index (cooked_index *table,
     }
 }
 
+/* Write shortcut information. */
+
+static void
+write_shortcuts_table (cooked_index *table, data_buf& shortcuts,
+                       data_buf& cpool)
+{
+  const auto main_info = table->get_main ();
+  size_t main_name_offset = 0;
+  language lang = language_unknown;
+
+  if (main_info != nullptr)
+    {
+      lang = main_info->per_cu->lang ();
+
+      if (lang != language_unknown)
+        {
+          auto_obstack obstack;
+          const auto main_name = main_info->full_name (&obstack, true);
+
+          main_name_offset = cpool.size ();
+          cpool.append_cstr0 (main_name);
+        }
+    }
+
+  shortcuts.append_uint (4, BFD_ENDIAN_LITTLE, lang);
+  shortcuts.append_offset (main_name_offset);
+}
+
 /* Write contents of a .gdb_index section for OBJFILE into OUT_FILE.
    If OBJFILE has an associated dwz file, write contents of a .gdb_index
    section for that dwz file into DWZ_OUT_FILE.  If OBJFILE does not have an
@@ -1270,11 +1304,14 @@ write_gdbindex (dwarf2_per_bfd *per_bfd, cooked_index *table,
 
   write_hash_table (&symtab, symtab_vec, constant_pool);
 
+  data_buf shortcuts;
+  write_shortcuts_table (table, shortcuts, constant_pool);
+
   write_gdbindex_1(out_file, objfile_cu_list, types_cu_list, addr_vec,
-		   symtab_vec, constant_pool);
+		   symtab_vec, constant_pool, shortcuts);
 
   if (dwz_out_file != NULL)
-    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {});
+    write_gdbindex_1 (dwz_out_file, dwz_cu_list, {}, {}, {}, {}, {});
   else
     gdb_assert (dwz_cu_list.empty ());
 }
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index 1006386cb2..f590c7fb99 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -88,6 +88,9 @@ struct mapped_gdb_index final : public mapped_index_base
   /* A pointer to the constant pool.  */
   gdb::array_view<const gdb_byte> constant_pool;
 
+  /* The shortcut table data. */
+  gdb::array_view<const gdb_byte> shortcut_table;
+
   /* Return the index into the constant pool of the name of the IDXth
      symbol in the symbol table.  */
   offset_type symbol_name_index (offset_type idx) const
@@ -166,6 +169,7 @@ dwarf2_gdb_index::dump (struct objfile *objfile)
 
   mapped_gdb_index *index = (gdb::checked_static_cast<mapped_gdb_index *>
 			     (per_objfile->per_bfd->index_table.get ()));
+
   gdb_printf (".gdb_index: version %d\n", index->version);
   gdb_printf ("\n");
 }
@@ -583,7 +587,7 @@ to use the section anyway."),
 
   /* Indexes with higher version than the one supported by GDB may be no
      longer backward compatible.  */
-  if (version > 8)
+  if (version > 9)
     return 0;
 
   map->version = version;
@@ -608,6 +612,12 @@ to use the section anyway."),
   map->symbol_table
     = offset_view (gdb::array_view<const gdb_byte> (symbol_table,
 						    symbol_table_end));
+  ++i;
+
+  const gdb_byte *shortcut_table = addr + metadata[i];
+  const gdb_byte *shortcut_table_end = addr + metadata[i + 1];
+  map->shortcut_table
+    = gdb::array_view<const gdb_byte> (shortcut_table, shortcut_table_end);
 
   ++i;
   map->constant_pool = buffer.slice (metadata[i]);
@@ -763,6 +773,36 @@ create_addrmap_from_gdb_index (dwarf2_per_objfile *per_objfile,
     = new (&per_bfd->obstack) addrmap_fixed (&per_bfd->obstack, &mutable_map);
 }
 
+/* Sets the name and language of the main function from the shortcut table. */
+
+static void
+set_main_name_from_gdb_index (dwarf2_per_objfile *per_objfile, 
+                              mapped_gdb_index *index)
+{
+  auto ptr = index->shortcut_table.data ();
+  const auto lang = extract_unsigned_integer (ptr, 4, BFD_ENDIAN_LITTLE);
+  if (lang >= nr_languages)
+    {
+      complaint (_(".gdb_index shortcut table has invalid main language %u"),
+                   (unsigned) lang);
+      return;
+    }
+  if (lang == language_unknown)
+    {
+      /* Don't bother if the language for the main symbol was not known or if
+       * there was no main symbol at all when the index was built. */
+      return;
+    }
+  ptr += 4;
+
+  const auto name_offset = extract_unsigned_integer (ptr, 
+                                                     sizeof (offset_type), 
+                                                     BFD_ENDIAN_LITTLE);
+  const auto name = (const char*) (index->constant_pool.data () + name_offset);
+
+  set_objfile_main_name (per_objfile->objfile, name, (enum language) lang);
+}
+
 /* See read-gdb-index.h.  */
 
 int
@@ -848,6 +888,8 @@ dwarf2_read_gdb_index
 
   create_addrmap_from_gdb_index (per_objfile, map.get ());
 
+  set_main_name_from_gdb_index (per_objfile, map.get ());
+
   per_bfd->index_table = std::move (map);
   per_bfd->quick_file_names_table =
     create_quick_file_names_table (per_bfd->all_units.size ());
-- 
2.40.1


^ permalink raw reply	[relevance 5%]

* Re: [PATCHv3 0/2] Add __repr__() implementation to a few Python types
  2023-06-07 17:05  7%             ` [PATCHv3 0/2] Add " Matheus Branco Borella
@ 2023-06-08 18:46  7%               ` Andrew Burgess
  2023-06-09 12:33  7%               ` Andrew Burgess
  1 sibling, 0 replies; 65+ results
From: Andrew Burgess @ 2023-06-08 18:46 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, aburgess
  Cc: dark.ryu.550, gdb-patches

Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
writes:

> All of the changes look good in my oppinion. I'm still getting used to the hang
> of things, so I really appreciate the effort you put into making my patch more 
> presentable! So I think it should be good to go?

I'll update, retest, and push these patches tomorrow.

Thanks,
Andrew


^ permalink raw reply	[relevance 7%]

* Re: [PATCHv3 0/2] Add __repr__() implementation to a few Python types
  2023-05-19 21:27  7%           ` [PATCHv3 0/2] " Andrew Burgess
  2023-05-19 21:27  3%             ` [PATCHv3 2/2] gdb: add " Andrew Burgess
@ 2023-06-07 17:05  7%             ` Matheus Branco Borella
  2023-06-08 18:46  7%               ` Andrew Burgess
  2023-06-09 12:33  7%               ` Andrew Burgess
  1 sibling, 2 replies; 65+ results
From: Matheus Branco Borella @ 2023-06-07 17:05 UTC (permalink / raw)
  To: aburgess; +Cc: dark.ryu.550, gdb-patches

All of the changes look good in my oppinion. I'm still getting used to the hang
of things, so I really appreciate the effort you put into making my patch more 
presentable! So I think it should be good to go?




^ permalink raw reply	[relevance 7%]

* [PATCH] Add support for symbol addition to the Python API
@ 2023-05-27  1:24  3% Matheus Branco Borella
  2023-06-27  3:53 14% ` [PING] " Matheus Branco Borella
  2023-07-04 15:14  7% ` Andrew Burgess
  0 siblings, 2 replies; 65+ results
From: Matheus Branco Borella @ 2023-05-27  1:24 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella

Disclaimer:

This patch is a rework of a six-month old patch I submitted to the mailing list
that considerably reduces the hackyness of the original solution to the problem,
now that I've had more time to read through and understand how symbols are 
handled and searched for inside GDB. So, I'd like to please ask for comments on 
things I can still improve in this patch, before I resubmit it. I also plan to 
add tests to it once I'm more secure about the approach I'm taking to solve the
problem now.

The interfaces in this patch can be tested like so:
```
(gdb) pi
>>> builder = gdb.ObjfileBuilder(name = "some_name")
>>> builder.add_static_symbol(name = "some_sym", address = 0x41414141, 
        language = "c")
>>> objfile = builder.build()
```

---

This patch adds support for symbol creation and registration. It currently
supports adding type symbols (VAR_DOMAIN/LOC_TYPEDEF), static symbols
(VAR_DOMAIN/LOC_STATIC) and goto target labels (LABEL_DOMAIN/LOC_LABEL). It
adds a new `gdb.ObjfileBuilder` type, with `add_type_symbol`,
`add_static_symbol` and `add_label_symbol` functions, allowing for the addition
of the aforementioned types of symbols.

Symbol addition is achieved by constructing a new objfile with msyms and full
symbols reflecting the symbols that were previously added to the builder through
its methods. This approach lets us get most of the way to full symbol addition
support, but due to not being backed up by BFD, it does have a few limitations,
which I will go over them here.

PC-based minsym lookup does not work, because those would require a more
complete set of BFD structures than I think would be good practice to pretend to
have them all and crash GDB later on when it expects things to be there that
aren't.

In the same vein, PC-based function name lookup also does not work, although
there may be a way to have the feature work using overlays. However, this patch
does not make an attempt to do so

For now, though, this implementation lets us add symbols that can be used to,
for instance, query registered types through `gdb.lookup_type`, and allows
reverse engineering GDB plugins (such as Pwndbg [0] or decomp2gdb [1]) to add
symbols directly through the Python API instead of having to compile an object
file for the target architecture that they later load through the add-symbol-
file command. [2]

[0] https://github.com/pwndbg/pwndbg/
[1] https://github.com/mahaloz/decomp2dbg
[2] https://github.com/mahaloz/decomp2dbg/blob/055be6b2001954d00db2d683f20e9b714af75880/decomp2dbg/clients/gdb/symbol_mapper.py#L235-L243]
---
 gdb/Makefile.in                 |   1 +
 gdb/python/py-objfile-builder.c | 648 ++++++++++++++++++++++++++++++++
 gdb/python/py-objfile.c         |   1 +
 gdb/python/python-internal.h    |   1 +
 4 files changed, 651 insertions(+)
 create mode 100644 gdb/python/py-objfile-builder.c

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 14b5dd0bad..c0eecb81b6 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -417,6 +417,7 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-micmd.c \
 	python/py-newobjfileevent.c \
 	python/py-objfile.c \
+	python/py-objfile-builder.c \
 	python/py-param.c \
 	python/py-prettyprint.c \
 	python/py-progspace.c \
diff --git a/gdb/python/py-objfile-builder.c b/gdb/python/py-objfile-builder.c
new file mode 100644
index 0000000000..1e3110c613
--- /dev/null
+++ b/gdb/python/py-objfile-builder.c
@@ -0,0 +1,648 @@
+/* Python class allowing users to build and install objfiles.
+
+   Copyright (C) 2013-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "quick-symbol.h"
+#include "objfiles.h"
+#include "minsyms.h"
+#include "buildsym.h"
+#include "observable.h"
+#include <string>
+#include <unordered_map>
+#include <type_traits>
+#include <optional>
+
+/* This module relies on symbols being trivially copyable. */
+static_assert (std::is_trivially_copyable_v<struct symbol>);
+
+/* Interface to be implemented for symbol types supported by this interface. */
+class symbol_def
+{
+public:
+  virtual ~symbol_def () = default;
+
+  virtual void register_msymbol (const std::string& name, 
+                                 struct objfile* objfile,
+                                 minimal_symbol_reader& reader) const = 0;
+  virtual void register_symbol (const std::string& name, 
+                                struct objfile* objfile,
+                                buildsym_compunit& builder) const = 0;
+};
+
+/* Shorthand for a unique_ptr to a symbol. */
+typedef std::unique_ptr<symbol_def> symbol_def_up;
+
+/* Data being held by the gdb.ObjfileBuilder.
+ *
+ * This structure needs to have its constructor run in order for its lifetime
+ * to begin. Because of how Python handles its objects, we can't just reconstruct
+ * the object structure as a whole, as that would overwrite things the runtime
+ * cares about, so these fields had to be broken off into their own structure. */
+struct objfile_builder_data
+{
+  /* Indicates whether the objfile has already been built and added to the
+   * current context. We enforce that objfiles can't be installed twice. */
+  bool installed = false;
+
+  /* The symbols that will be added to new newly built objfile. */
+  std::unordered_map<std::string, symbol_def_up> symbols;
+
+  /* The name given to this objfile. */
+  std::string name;
+
+  /* Adds a symbol definition with the given name. */
+  bool add_symbol_def (std::string name, symbol_def_up&& symbol_def)
+  {
+    return std::get<1> (symbols.insert ({name, std::move (symbol_def)}));
+  }
+};
+
+/* Structure backing the gdb.ObjfileBuilder type. */
+
+struct objfile_builder_object
+{
+  PyObject_HEAD
+
+  /* See objfile_builder_data. */
+  objfile_builder_data inner;
+};
+
+extern PyTypeObject objfile_builder_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("objfile_builder_object_type");
+
+/* Constructs a new objfile from an objfile_builder. */
+static struct objfile *
+build_new_objfile (const objfile_builder_object& builder)
+{
+  gdb_assert (!builder.inner.installed);
+
+  auto of = objfile::make (nullptr, builder.inner.name.c_str (), 
+                           OBJF_READNOW | OBJF_NOT_FILENAME, 
+                           nullptr);
+
+  /* Setup object file sections. */
+  of->sections_start = OBSTACK_CALLOC (&of->objfile_obstack,
+                                       4,
+                                       struct obj_section);
+  of->sections_end = of->sections_start + 4;
+
+  const auto init_section = [&](struct obj_section* sec)
+    {
+      sec->objfile = of;
+      sec->ovly_mapped = false;
+      
+      /* We're not being backed by BFD. So we have no real section data to speak 
+       * of, but, because specifying sections requires BFD structures, we have to
+       * play a little game of predend. */
+      auto bfd = obstack_new<bfd_section> (&of->objfile_obstack);
+      bfd->vma = 0;
+      bfd->size = 0;
+      bfd->lma = 0; /* Prevents insert_section_p in objfiles.c from trying to 
+                     * dereference the bfd structure we don't have. */
+      sec->the_bfd_section = bfd;
+    };
+  init_section (&of->sections_start[0]);
+  init_section (&of->sections_start[1]);
+  init_section (&of->sections_start[2]);
+  init_section (&of->sections_start[4]);
+
+  of->sect_index_text = 0;
+  of->sect_index_data = 1;
+  of->sect_index_rodata = 2;
+  of->sect_index_bss = 3;
+
+  /* While buildsym_compunit expects the symbol function pointer structure to be
+   * present, it also gracefully handles the case where all of the pointers in
+   * it are set to null. So, make sure we have a valid structure, but there's
+   * no need to do more than that. */
+  of->sf = obstack_new<struct sym_fns> (&of->objfile_obstack);
+
+  /* We need to tell GDB what architecture the objfile uses. */
+  if (has_stack_frames ())
+    of->per_bfd->gdbarch = get_frame_arch (get_selected_frame (nullptr));
+  else
+    of->per_bfd->gdbarch = target_gdbarch ();
+
+  /* Construct the minimal symbols. */
+  minimal_symbol_reader msym (of);
+  for (const auto& [name, symbol] : builder.inner.symbols)
+      symbol->register_msymbol (name, of, msym);
+  msym.install ();
+
+  /* Construct the full symbols. */
+  buildsym_compunit fsym (of, builder.inner.name.c_str (), "", language_c, 0);
+  for (const auto& [name, symbol] : builder.inner.symbols)
+    symbol->register_symbol (name, of, fsym);
+  fsym.end_compunit_symtab (0);
+
+  /* Notify the rest of GDB this objfile has been created. Requires 
+   * OBJF_NOT_FILENAME to be used, to prevent any of the functions attatched to
+   * the observable from trying to dereference of->bfd. */
+  gdb::observers::new_objfile.notify (of);
+
+  return of;
+}
+
+/* Implementation of the quick symbol functions used by the objfiles created 
+ * using this interface. Turns out we have our work cut out for us here, as we
+ * can get something that works by effectively just using no-ops, and the rest
+ * of the code will fall back to using just the minimal and full symbol data. It
+ * is important to note, though, that this only works because we're marking our 
+ * objfile with `OBJF_READNOW`. */
+class runtime_objfile : public quick_symbol_functions
+{
+  virtual bool has_symbols (struct objfile*) override
+  {
+    return false;
+  }
+
+  virtual void dump (struct objfile *objfile) override
+  {
+  }
+
+  virtual void expand_matching_symbols
+    (struct objfile *,
+     const lookup_name_info &lookup_name,
+     domain_enum domain,
+     int global,
+     symbol_compare_ftype *ordered_compare) override
+  {
+  }
+
+  virtual bool expand_symtabs_matching
+    (struct objfile *objfile,
+     gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
+     const lookup_name_info *lookup_name,
+     gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
+     gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
+     block_search_flags search_flags,
+     domain_enum domain,
+     enum search_domain kind) override
+  {
+    return true;
+  }
+};
+
+
+/* Create a new symbol alocated in the given objfile. */
+
+static struct symbol *
+new_symbol
+  (struct objfile *objfile,
+   const char *name,
+   enum language language,
+   enum domain_enum domain,
+   enum address_class aclass,
+   short section_index)
+{
+  auto symbol = new (&objfile->objfile_obstack) struct symbol ();
+  OBJSTAT (objfile, n_syms++);
+
+  symbol->set_language (language, &objfile->objfile_obstack);
+  symbol->compute_and_set_names (gdb::string_view (name), true, 
+                                 objfile->per_bfd);
+
+  symbol->set_is_objfile_owned (true);
+  symbol->set_section_index (section_index);
+  symbol->set_domain (domain);
+  symbol->set_aclass_index (aclass);
+
+  return symbol;
+}
+
+/* Parses a language from a string (coming from Python) into a language 
+ * variant. */
+
+static enum language
+parse_language (const char *language)
+{
+  if (strcmp (language, "c") == 0)
+    return language_c;
+  else if (strcmp (language, "objc") == 0)
+    return language_objc;
+  else if (strcmp (language, "cplus") == 0)
+    return language_cplus;
+  else if (strcmp (language, "d") == 0)
+    return language_d;
+  else if (strcmp (language, "go") == 0)
+    return language_go;
+  else if (strcmp (language, "fortran") == 0)
+    return language_fortran;
+  else if (strcmp (language, "m2") == 0)
+    return language_m2;
+  else if (strcmp (language, "asm") == 0)
+    return language_asm;
+  else if (strcmp (language, "pascal") == 0)
+    return language_pascal;
+  else if (strcmp (language, "opencl") == 0)
+    return language_opencl;
+  else if (strcmp (language, "rust") == 0)
+    return language_rust;
+  else if (strcmp (language, "ada") == 0)
+    return language_ada;
+  else
+    return language_unknown;
+}
+
+/* Convenience function that performs a checked coversion from a PyObject to
+ * a objfile_builder_object structure pointer. */
+inline static struct objfile_builder_object *
+validate_objfile_builder_object (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_builder_object_type))
+    return nullptr;
+  return (struct objfile_builder_object*) self;
+}
+
+/* Registers symbols added with add_label_symbol. */
+class typedef_symbol_def : public symbol_def
+{
+public:
+  struct type* type;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+                                 struct objfile *objfile,
+                                 minimal_symbol_reader& reader) const override
+  {
+  }
+
+  virtual void register_symbol (const std::string& name,
+                                struct objfile *objfile,
+                                buildsym_compunit& builder) const override
+  {
+    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
+                              LOC_TYPEDEF, objfile->sect_index_text);
+
+    symbol->set_type (type);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a type (LOC_TYPEDEF) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_type_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sOs";
+  static const char *keywords[] =
+    {
+      "name", "type", "language", NULL
+    };
+
+  PyObject *type_object;
+  const char *name;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &type_object, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::make_unique<typedef_symbol_def> ();
+  def->type = type;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+
+/* Registers symbols added with add_label_symbol. */
+class label_symbol_def : public symbol_def
+{
+public:
+  CORE_ADDR address;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+                                 struct objfile *objfile,
+                                 minimal_symbol_reader& reader) const override
+  {
+    reader.record (name.c_str (), 
+                   unrelocated_addr (address), 
+                   minimal_symbol_type::mst_text);
+  }
+
+  virtual void register_symbol (const std::string& name,
+                                struct objfile *objfile,
+                                buildsym_compunit& builder) const override
+  {
+    printf("Adding label %s\n", name.c_str ());
+    auto symbol = new_symbol (objfile, name.c_str (), language, LABEL_DOMAIN,
+                              LOC_LABEL, objfile->sect_index_text);
+
+    symbol->set_value_address (address);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a label (LOC_LABEL) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_label_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sks";
+  static const char *keywords[] =
+    {
+      "name", "address", "language", NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &address, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::make_unique<label_symbol_def> ();
+  def->address = address;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+/* Registers symbols added with add_static_symbol. */
+class static_symbol_def : public symbol_def
+{
+public:
+  CORE_ADDR address;
+  enum language language;
+
+  virtual void register_msymbol (const std::string& name,
+                                 struct objfile *objfile,
+                                 minimal_symbol_reader& reader) const override
+  {
+    reader.record (name.c_str (), 
+                   unrelocated_addr (address), 
+                   minimal_symbol_type::mst_bss);
+  }
+
+  virtual void register_symbol (const std::string& name,
+                                struct objfile *objfile,
+                                buildsym_compunit& builder) const override
+  {
+    auto symbol = new_symbol (objfile, name.c_str (), language, VAR_DOMAIN,
+                              LOC_STATIC, objfile->sect_index_bss);
+
+    symbol->set_value_address (address);
+
+    add_symbol_to_list (symbol, builder.get_file_symbols ());
+  }
+};
+
+/* Adds a static (LOC_STATIC) symbol to a given objfile. */
+static PyObject *
+objbdpy_add_static_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sks";
+  static const char *keywords[] =
+    {
+      "name", "address", "language", NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &address, &language_name))
+    return nullptr;
+
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+    {
+      PyErr_SetString (PyExc_ValueError, "invalid language name");
+      return nullptr;
+    }
+
+  auto def = std::make_unique<static_symbol_def> ();
+  def->address = address;
+  def->language = language;
+
+  builder->inner.add_symbol_def (name, std::move (def));
+
+  Py_RETURN_NONE;
+}
+
+/* Builds the object file. */
+static PyObject *
+objbdpy_build (PyObject *self, PyObject *args)
+{
+  auto builder = validate_objfile_builder_object (self);
+  if (builder == nullptr)
+    return nullptr;
+
+  if (builder->inner.installed)
+    {
+      PyErr_SetString (PyExc_ValueError, "build() cannot be run twice on the \
+                       same object");
+      return nullptr;
+    }
+  auto of = build_new_objfile (*builder);
+  builder->inner.installed = true;
+
+
+  auto objpy = objfile_to_objfile_object (of).get ();
+  Py_INCREF(objpy);
+  return objpy;
+}
+
+/* Implements the __init__() function. */
+static int
+objbdpy_init (PyObject *self0, PyObject *args, PyObject *kw)
+{
+  static const char *format = "s";
+  static const char *keywords[] =
+    {
+      "name", NULL
+    };
+
+  const char *name;
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name))
+    return -1;
+
+  auto self = (objfile_builder_object *)self0;
+  self->inner.name = name;
+  self->inner.symbols.clear ();
+
+  return 0;
+}
+
+/* The function handling construction of the ObjfileBuilder object. 
+ *
+ * We need to have a custom function here as, even though Python manages the 
+ * memory backing the object up, it assumes clearing the memory is enough to
+ * begin its lifetime, which is not the case here, and would lead to undefined 
+ * behavior as soon as we try to use it in any meaningful way.
+ * 
+ * So, what we have to do here is manually begin the lifecycle of our new object
+ * by constructing it in place, using the memory region Python just allocated
+ * for us. This ensures the object will have already started its lifetime by 
+ * the time we start using it. */
+static PyObject *
+objbdpy_new (PyTypeObject *subtype, PyObject *args, PyObject *kwds)
+{
+  objfile_builder_object *region = 
+    (objfile_builder_object *) subtype->tp_alloc(subtype, 1);
+  gdb_assert ((size_t)region % alignof (objfile_builder_object) == 0);
+  gdb_assert (region != nullptr);
+
+  new (&region->inner) objfile_builder_data ();
+  
+  return (PyObject *)region;
+}
+
+/* The function handling destruction of the ObjfileBuilder object. 
+ *
+ * While running the destructor of our object isn't _strictly_ necessary, we
+ * would very much like for the memory it owns to be freed, but, because it was
+ * constructed in place, we have to call its destructor manually here. */
+static void 
+objbdpy_dealloc (PyObject *self0)
+{
+  
+  auto self = (objfile_builder_object *)self0;
+  PyTypeObject *tp = Py_TYPE(self);
+  
+  self->inner.~objfile_builder_data ();
+  
+  tp->tp_free(self);
+  Py_DECREF(tp);
+}
+
+static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
+gdbpy_initialize_objfile_builder (void)
+{
+  if (PyType_Ready (&objfile_builder_object_type) < 0)
+    return -1;
+
+  return gdb_pymodule_addobject (gdb_module, "ObjfileBuilder",
+				 (PyObject *) &objfile_builder_object_type);
+}
+
+GDBPY_INITIALIZE_FILE (gdbpy_initialize_objfile_builder);
+
+static PyMethodDef objfile_builder_object_methods[] =
+{
+  { "build", (PyCFunction) objbdpy_build, METH_NOARGS,
+    "build ().\n\
+Build a new objfile containing the symbols added to builder." },
+  { "add_type_symbol", (PyCFunction) objbdpy_add_type_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_type_symbol (name [str], type [gdb.Type], language [str]).\n\
+Add a new type symbol in the given language, associated with the given type." },
+  { "add_label_symbol", (PyCFunction) objbdpy_add_label_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_label_symbol (name [str], address [int], language [str]).\n\
+Add a new label symbol in the given language, at the given address." },
+  { "add_static_symbol", (PyCFunction) objbdpy_add_static_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_static_symbol (name [str], address [int], language [str]).\n\
+Add a new static symbol in the given language, at the given address." },
+  { NULL }
+};
+
+PyTypeObject objfile_builder_object_type = {
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.ObjfileBuilder",               /* tp_name */
+  sizeof (objfile_builder_object),    /* tp_basicsize */
+  0,                                  /* tp_itemsize */
+  objbdpy_dealloc,                    /* tp_dealloc */
+  0,                                  /* tp_vectorcall_offset */
+  nullptr,                            /* tp_getattr */
+  nullptr,                            /* tp_setattr */
+  nullptr,                            /* tp_compare */
+  nullptr,                            /* tp_repr */
+  nullptr,                            /* tp_as_number */
+  nullptr,                            /* tp_as_sequence */
+  nullptr,                            /* tp_as_mapping */
+  nullptr,                            /* tp_hash  */
+  nullptr,                            /* tp_call */
+  nullptr,                            /* tp_str */
+  nullptr,                            /* tp_getattro */
+  nullptr,                            /* tp_setattro */
+  nullptr,                            /* tp_as_buffer */
+  Py_TPFLAGS_DEFAULT,                 /* tp_flags */
+  "GDB object file builder",          /* tp_doc */
+  nullptr,                            /* tp_traverse */
+  nullptr,                            /* tp_clear */
+  nullptr,                            /* tp_richcompare */
+  0,                                  /* tp_weaklistoffset */
+  nullptr,                            /* tp_iter */
+  nullptr,                            /* tp_iternext */
+  objfile_builder_object_methods,     /* tp_methods */
+  nullptr,                            /* tp_members */
+  nullptr,                            /* tp_getset */
+  nullptr,                            /* tp_base */
+  nullptr,                            /* tp_dict */
+  nullptr,                            /* tp_descr_get */
+  nullptr,                            /* tp_descr_set */
+  0,                                  /* tp_dictoffset */
+  objbdpy_init,                       /* tp_init */
+  PyType_GenericAlloc,                /* tp_alloc */
+  objbdpy_new,                        /* tp_new */
+};
+
+
diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index ad72f3f042..be21011ce6 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -25,6 +25,7 @@
 #include "build-id.h"
 #include "symtab.h"
 #include "python.h"
+#include "buildsym.h"
 
 struct objfile_object
 {
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index dbd33570a7..fbf9b06af5 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -480,6 +480,7 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
 struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
 frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
 struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
+struct floatformat *float_format_object_as_float_format (PyObject *self);
 
 /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
    gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
-- 
2.40.1


^ permalink raw reply	[relevance 3%]

* [PATCH] Add support for creating new types from the Python API
    2023-01-11  0:58  2% ` [PATCH] Add support for creating new types from " Matheus Branco Borella
@ 2023-05-26  3:30  2% ` Matheus Branco Borella
  2023-08-07 14:53  5%   ` Andrew Burgess
  1 sibling, 1 reply; 65+ results
From: Matheus Branco Borella @ 2023-05-26  3:30 UTC (permalink / raw)
  To: gdb-patches; +Cc: dark.ryu.550

From: "dark.ryu.550@gmail.com" <dark.ryu.550@gmail.com>

On 1/6/23 20:00, simark@simark.ca:
> Unfortunately, I am unable to apply this patch as well, please send it
> using git-send-email.

Should be all good to go now. I'm sorry for unearthing this patch after it's 
been so long, but I hope it's not (too much of) a problem. I've updated the old 
patch to work with the way symbol allocation is done now, since it changed from 
six months ago, and I've also added a test case for it.

> It would maybe be nice to be able to create arch-owned types too.  For
> instance, you could create types just after firing up GDB, without even
> having an objfile loaded.  It's not necessary to implement it at the
> same time, but does your approach leave us the option to do that at a
> later time?

Hmm, I think it shouldn't be a problem. The way it works now, it already uses
`type_allocator` to do most of the heavy lifting, which can handle both `
objfile`s and `arch`es. I can see a straightforward way to do that in using
keyword arguments (e.g. `objfile=` and `arch=`) to separate the two cases in 
Python and doing a check on the C side for which of the two was used.

---

This patch adds support for creating types from within the Python API. It does
so by exposing the `init_*_type` family of functions, defined in `gdbtypes.h` to
Python and having them return `gdb.Type` objects connected to the newly minted
types.

These functions are accessible in the root of the gdb module and all require
a reference to a `gdb.Objfile`. Types created from this API are exclusively
objfile-owned.

This patch also adds an extra type - `gdb.FloatFormat` - to support creation of
floating point types by letting users control the format from within Python. It
is missing, however, a way to specify half formats and validation functions.

It is important to note that types created using this interface are not
automatically registered as a symbol, and so, types will become unreachable
unless used to create a value that otherwise references it or saved in some way.

The main drawback of using the `init_*_type` family over implementing type
initialization by hand is that any type that's created gets immediately
allocated on its owner objfile's obstack, regardless of what its real
lifetime requirements are. The main implication of this is that types that
become unreachable will leak their memory for the lifetime of the objfile.

Keeping track of the initialization of the type by hand would require a
deeper change to the existing type object infrastructure. A bit too ambitious
for a first patch, I'd say.

if it were to be done though, we would gain the ability to only keep in the
obstack types that are known to be referenced in some other way - by allocating
and copying the data to the obstack as other objects are created that reference
it (eg. symbols).
---
 gdb/Makefile.in                      |   2 +
 gdb/python/py-float-format.c         | 321 +++++++++++++++++++++
 gdb/python/py-objfile.c              |  12 +
 gdb/python/py-type-init.c            | 409 +++++++++++++++++++++++++++
 gdb/python/python-internal.h         |  15 +
 gdb/python/python.c                  |  41 +++
 gdb/testsuite/gdb.python/py-type.exp |  10 +
 7 files changed, 810 insertions(+)
 create mode 100644 gdb/python/py-float-format.c
 create mode 100644 gdb/python/py-type-init.c

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 14b5dd0bad..108bcea69e 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -431,6 +431,8 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-threadevent.c \
 	python/py-tui.c \
 	python/py-type.c \
+	python/py-type-init.c \
+	python/py-float-format.c \
 	python/py-unwind.c \
 	python/py-utils.c \
 	python/py-value.c \
diff --git a/gdb/python/py-float-format.c b/gdb/python/py-float-format.c
new file mode 100644
index 0000000000..8fe92980f1
--- /dev/null
+++ b/gdb/python/py-float-format.c
@@ -0,0 +1,321 @@
+/* Accessibility of float format controls from inside the Python API
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "floatformat.h"
+
+/* Structure backing the float format Python interface. */
+
+struct float_format_object
+{
+  PyObject_HEAD
+  struct floatformat format;
+
+  struct floatformat *float_format ()
+  {
+    return &this->format;
+  }
+};
+
+/* Initializes the float format type and registers it with the Python interpreter. */
+
+static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
+gdbpy_initialize_float_format (void)
+{
+  if (PyType_Ready (&float_format_object_type) < 0)
+    return -1;
+
+  if (gdb_pymodule_addobject (gdb_module, "FloatFormat",
+                              (PyObject *) &float_format_object_type) < 0)
+    return -1;
+
+  return 0;
+}
+
+GDBPY_INITIALIZE_FILE (gdbpy_initialize_float_format);
+
+#define INSTANCE_FIELD_GETTER(getter_name, field_name, field_type, field_conv) \
+  static PyObject *                                                            \
+  getter_name (PyObject *self, void *closure)                                  \
+  {                                                                            \
+    float_format_object *ff = (float_format_object*) self;                     \
+    field_type value = ff->float_format ()->field_name;                        \
+    return field_conv (value);                                                 \
+  }
+
+#define INSTANCE_FIELD_SETTER(getter_name, field_name, field_type, field_conv) \
+  static int                                                                   \
+  getter_name (PyObject *self, PyObject* value, void *closure)                 \
+  {                                                                            \
+    field_type native_value;                                                   \
+    if (!field_conv (value, &native_value))                                    \
+      return -1;                                                               \
+    float_format_object *ff = (float_format_object*) self;                     \
+    ff->float_format ()->field_name = native_value;                            \
+    return 0;                                                                  \
+  }
+
+/* Converts from the intbit enum to a Python boolean. */
+
+static PyObject *
+intbit_to_py (enum floatformat_intbit intbit)
+{
+  gdb_assert 
+    (intbit == floatformat_intbit_yes || 
+     intbit == floatformat_intbit_no);
+
+  if (intbit == floatformat_intbit_no)
+    Py_RETURN_FALSE;
+  else
+    Py_RETURN_TRUE;
+}
+
+/* Converts from a Python boolean to the intbit enum. */
+
+static bool
+py_to_intbit (PyObject *object, enum floatformat_intbit *intbit)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyBool_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "intbit must be True or False");
+      return false;
+    }
+
+  *intbit = PyObject_IsTrue (object) ? 
+    floatformat_intbit_yes : floatformat_intbit_no;
+  return true;
+}
+
+/* Converts from a Python integer to a unsigned integer. */
+
+static bool
+py_to_unsigned_int (PyObject *object, unsigned int *val)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
+    {
+      PyErr_SetString (PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong (object);
+  if (native_val > (long) UINT_MAX)
+    {
+      PyErr_SetString (PyExc_ValueError, "value is too large");
+      return false;
+    }
+  if (native_val < 0)
+    {
+      PyErr_SetString (PyExc_ValueError, 
+                       "value must not be smaller than zero");
+      return false;
+    }
+
+  *val = (unsigned int) native_val;
+  return true;
+}
+
+/* Converts from a Python integer to a signed integer. */
+
+static bool
+py_to_int(PyObject *object, int *val)
+{
+  if(!PyObject_IsInstance(object, (PyObject*)&PyLong_Type))
+    {
+      PyErr_SetString(PyExc_TypeError, "value must be an integer");
+      return false;
+    }
+
+  long native_val = PyLong_AsLong(object);
+  if(native_val > (long)INT_MAX)
+    {
+      PyErr_SetString(PyExc_ValueError, "value is too large");
+      return false;
+    }
+
+  *val = (int)native_val;
+  return true;
+}
+
+INSTANCE_FIELD_GETTER (ffpy_get_totalsize, totalsize, 
+                       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_sign_start, sign_start, 
+                       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_start, exp_start, 
+                       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_len, exp_len, 
+                       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_bias, exp_bias, int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_nan, exp_nan, 
+                       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_start, man_start, 
+                       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_len, man_len, 
+                       unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_intbit, intbit, 
+                       enum floatformat_intbit, intbit_to_py)
+INSTANCE_FIELD_GETTER (ffpy_get_name, name, 
+                       const char *, PyUnicode_FromString)
+
+INSTANCE_FIELD_SETTER (ffpy_set_totalsize, totalsize, 
+                       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_sign_start, sign_start, 
+                       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_start, exp_start, 
+                       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_len, exp_len, 
+                       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_bias, exp_bias, int, py_to_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_nan, exp_nan, 
+                       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_start, man_start,
+                       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_len, man_len, 
+                       unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_intbit, intbit, 
+                       enum floatformat_intbit, py_to_intbit)
+
+/* Makes sure float formats created from Python always test as valid. */
+
+static int
+ffpy_always_valid (const struct floatformat *fmt ATTRIBUTE_UNUSED,
+                   const void *from ATTRIBUTE_UNUSED)
+{
+  return 1;
+}
+
+/* Initializes new float format objects. */
+
+static int
+ffpy_init (PyObject *self,
+           PyObject *args ATTRIBUTE_UNUSED,
+           PyObject *kwds ATTRIBUTE_UNUSED)
+{
+  auto ff = (float_format_object*) self;
+  ff->format = floatformat ();
+  ff->float_format ()->name = "";
+  ff->float_format ()->is_valid = ffpy_always_valid;
+  return 0;
+}
+
+/* Retrieves a pointer to the underlying float format structure. */
+
+struct floatformat *
+float_format_object_as_float_format (PyObject *self)
+{
+  if (!PyObject_IsInstance (self, (PyObject*) &float_format_object_type))
+    return nullptr;
+  return ((float_format_object*) self)->float_format ();
+}
+
+static gdb_PyGetSetDef float_format_object_getset[] =
+{
+  { "totalsize", ffpy_get_totalsize, ffpy_set_totalsize,
+    "The total size of the floating point number, in bits.", nullptr },
+  { "sign_start", ffpy_get_sign_start, ffpy_set_sign_start,
+    "The bit offset of the sign bit.", nullptr },
+  { "exp_start", ffpy_get_exp_start, ffpy_set_exp_start,
+    "The bit offset of the start of the exponent.", nullptr },
+  { "exp_len", ffpy_get_exp_len, ffpy_set_exp_len,
+    "The size of the exponent, in bits.", nullptr },
+  { "exp_bias", ffpy_get_exp_bias, ffpy_set_exp_bias,
+    "Bias added to a \"true\" exponent to form the biased exponent.", nullptr },
+  { "exp_nan", ffpy_get_exp_nan, ffpy_set_exp_nan,
+    "Exponent value which indicates NaN.", nullptr },
+  { "man_start", ffpy_get_man_start, ffpy_set_man_start,
+    "The bit offset of the start of the mantissa.", nullptr },
+  { "man_len", ffpy_get_man_len, ffpy_set_man_len,
+    "The size of the mantissa, in bits.", nullptr },
+  { "intbit", ffpy_get_intbit, ffpy_set_intbit,
+    "Is the integer bit explicit or implicit?", nullptr },
+  { "name", ffpy_get_name, nullptr,
+    "Internal name for debugging.", nullptr },
+  { nullptr }
+};
+
+static PyMethodDef float_format_object_methods[] =
+{
+  { NULL }
+};
+
+static PyNumberMethods float_format_object_as_number = {
+  nullptr,             /* nb_add */
+  nullptr,             /* nb_subtract */
+  nullptr,             /* nb_multiply */
+  nullptr,             /* nb_remainder */
+  nullptr,             /* nb_divmod */
+  nullptr,             /* nb_power */
+  nullptr,             /* nb_negative */
+  nullptr,             /* nb_positive */
+  nullptr,             /* nb_absolute */
+  nullptr,             /* nb_nonzero */
+  nullptr,             /* nb_invert */
+  nullptr,             /* nb_lshift */
+  nullptr,             /* nb_rshift */
+  nullptr,             /* nb_and */
+  nullptr,             /* nb_xor */
+  nullptr,             /* nb_or */
+  nullptr,             /* nb_int */
+  nullptr,             /* reserved */
+  nullptr,             /* nb_float */
+};
+
+PyTypeObject float_format_object_type =
+{
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.FloatFormat",              /*tp_name*/
+  sizeof (float_format_object),   /*tp_basicsize*/
+  0,                              /*tp_itemsize*/
+  nullptr,                        /*tp_dealloc*/
+  0,                              /*tp_print*/
+  nullptr,                        /*tp_getattr*/
+  nullptr,                        /*tp_setattr*/
+  nullptr,                        /*tp_compare*/
+  nullptr,                        /*tp_repr*/
+  &float_format_object_as_number, /*tp_as_number*/
+  nullptr,                        /*tp_as_sequence*/
+  nullptr,                        /*tp_as_mapping*/
+  nullptr,                        /*tp_hash */
+  nullptr,                        /*tp_call*/
+  nullptr,                        /*tp_str*/
+  nullptr,                        /*tp_getattro*/
+  nullptr,                        /*tp_setattro*/
+  nullptr,                        /*tp_as_buffer*/
+  Py_TPFLAGS_DEFAULT,             /*tp_flags*/
+  "GDB float format object",      /* tp_doc */
+  nullptr,                        /* tp_traverse */
+  nullptr,                        /* tp_clear */
+  nullptr,                        /* tp_richcompare */
+  0,                              /* tp_weaklistoffset */
+  nullptr,                        /* tp_iter */
+  nullptr,                        /* tp_iternext */
+  float_format_object_methods,    /* tp_methods */
+  nullptr,                        /* tp_members */
+  float_format_object_getset,     /* tp_getset */
+  nullptr,                        /* tp_base */
+  nullptr,                        /* tp_dict */
+  nullptr,                        /* tp_descr_get */
+  nullptr,                        /* tp_descr_set */
+  0,                              /* tp_dictoffset */
+  ffpy_init,                      /* tp_init */
+  nullptr,                        /* tp_alloc */
+  PyType_GenericNew,              /* tp_new */
+};
+
+
diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index ad72f3f042..be2121c405 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -704,6 +704,18 @@ objfile_to_objfile_object (struct objfile *objfile)
   return gdbpy_ref<>::new_reference (result);
 }
 
+struct objfile *
+objfile_object_to_objfile (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_object_type))
+    return nullptr;
+
+  auto objfile_object = (struct objfile_object*) self;
+  OBJFPY_REQUIRE_VALID (objfile_object);
+
+  return objfile_object->objfile;
+}
+
 static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
 gdbpy_initialize_objfile (void)
 {
diff --git a/gdb/python/py-type-init.c b/gdb/python/py-type-init.c
new file mode 100644
index 0000000000..a18cce6e51
--- /dev/null
+++ b/gdb/python/py-type-init.c
@@ -0,0 +1,409 @@
+/* Functionality for creating new types accessible from python.
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "gdbtypes.h"
+#include "floatformat.h"
+#include "objfiles.h"
+#include "gdbsupport/gdb_obstack.h"
+
+
+/* Copies a null-terminated string into an objfile's obstack. */
+
+static const char *
+copy_string (struct objfile *objfile, const char *py_str)
+{
+  unsigned int len = strlen (py_str);
+  return obstack_strndup (&objfile->per_bfd->storage_obstack,
+                          py_str, len);
+}
+
+/* Creates a new type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object;
+  enum type_code code;
+  int bit_length;
+  const char *py_name;
+
+  if(!PyArg_ParseTuple (args, "Oiis", &objfile_object, &code, 
+                        &bit_length, &py_name))
+    return nullptr;
+
+  struct objfile* objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator (objfile);
+      type = allocator.new_type (code, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new integer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_integer_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, 
+                         &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator (objfile);
+      type = init_integer_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object(type);
+}
+
+/* Creates a new character type and returns a new gdb.Type associated 
+ * with it. */
+
+PyObject *
+gdbpy_init_character_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *objfile_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, 
+                         &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator (objfile);
+      type = init_character_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new boolean type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_boolean_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *objfile_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, 
+                         &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator (objfile);
+      type = init_boolean_type (allocator, bit_size, unsigned_p, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new float type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_float_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object, *float_format_object;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "OOs", &objfile_object, 
+                         &float_format_object, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  struct floatformat *local_ff = float_format_object_as_float_format 
+    (float_format_object);
+  if (local_ff == nullptr)
+    return nullptr;
+
+  /* Persist a copy of the format in the objfile's obstack. This guarantees that
+   * the format won't outlive the type being created from it and that changes
+   * made to the object used to create this type will not affect it after
+   * creation. */
+  auto ff = OBSTACK_CALLOC
+    (&objfile->objfile_obstack,
+     1,
+     struct floatformat);
+  memcpy (ff, local_ff, sizeof (struct floatformat));
+
+  /* We only support creating float types in the architecture's endianness, so
+   * make sure init_float_type sees the float format structure we need it to. */
+  enum bfd_endian endianness = gdbarch_byte_order (objfile->arch());
+  gdb_assert (endianness < BFD_ENDIAN_UNKNOWN);
+
+  const struct floatformat *per_endian[2] = { nullptr, nullptr };
+  per_endian[endianness] = ff;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator (objfile);
+      type = init_float_type (allocator, -1, name, per_endian, endianness);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new decimal float type and returns a new gdb.Type 
+ * associated with it. */
+
+PyObject *
+gdbpy_init_decfloat_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Ois", &objfile_object, &bit_length, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+    {
+      type_allocator allocator (objfile);
+      type = init_decfloat_type (allocator, bit_length, name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
+/* Returns whether a given type can be used to create a complex type. */
+
+PyObject *
+gdbpy_can_create_complex_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *type_object;
+
+  if (!PyArg_ParseTuple (args, "O", &type_object))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  bool can_create_complex = false;
+  try
+    {
+      can_create_complex = can_create_complex_type (type);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  if (can_create_complex)
+    Py_RETURN_TRUE;
+  else
+    Py_RETURN_FALSE;
+}
+
+/* Creates a new complex type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_complex_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *type_object;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Os", &type_object, &py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  obstack *obstack;
+  if (type->is_objfile_owned ())
+    obstack = &type->objfile_owner ()->objfile_obstack;
+  else
+    obstack = gdbarch_obstack (type->arch_owner ());
+
+  unsigned int len = strlen (py_name);
+  const char *name = obstack_strndup (obstack,
+                                      py_name,
+                                      len);
+  struct type *complex_type;
+  try
+    {
+      complex_type = init_complex_type (name, type);
+      gdb_assert (complex_type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (complex_type);
+}
+
+/* Creates a new pointer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_pointer_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object, *type_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "OOis", &objfile_object, &type_object, 
+                         &bit_length, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *pointer_type = nullptr;
+  try
+    {
+      type_allocator allocator (objfile);
+      pointer_type = init_pointer_type (allocator, bit_length, 
+                                        name, type);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (pointer_type);
+}
+
+/* Creates a new fixed point type and returns a new gdb.Type associated 
+ * with it. */
+
+PyObject *
+gdbpy_init_fixed_point_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *objfile_object;
+  int bit_length;
+  int unsigned_p;
+  const char* py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_length, 
+                         &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+    {
+      type = init_fixed_point_type (objfile, bit_length, unsigned_p, 
+                                    name);
+      gdb_assert (type != nullptr);
+    }
+  catch (gdb_exception_error& ex)
+    {
+      GDB_PY_HANDLE_EXCEPTION (ex);
+    }
+
+  return type_to_type_object (type);
+}
+
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index dbd33570a7..73e2e6ce62 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -289,6 +289,8 @@ extern PyTypeObject frame_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("frame_object");
 extern PyTypeObject thread_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("thread_object");
+extern PyTypeObject float_format_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("float_format");
 
 /* Ensure that breakpoint_object_type is initialized and return true.  If
    breakpoint_object_type can't be initialized then set a suitable Python
@@ -431,6 +433,17 @@ gdb::unique_xmalloc_ptr<char> gdbpy_parse_command_name
 PyObject *gdbpy_register_tui_window (PyObject *self, PyObject *args,
 				     PyObject *kw);
 
+PyObject *gdbpy_init_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_integer_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_character_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_boolean_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_float_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_decfloat_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_can_create_complex_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_complex_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_pointer_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_fixed_point_type (PyObject *self, PyObject *args);
+
 PyObject *symtab_and_line_to_sal_object (struct symtab_and_line sal);
 PyObject *symtab_to_symtab_object (struct symtab *symtab);
 PyObject *symbol_to_symbol_object (struct symbol *sym);
@@ -480,6 +493,8 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
 struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
 frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
 struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
+struct objfile *objfile_object_to_objfile (PyObject *self);
+struct floatformat *float_format_object_as_float_format (PyObject *self);
 
 /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
    gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
diff --git a/gdb/python/python.c b/gdb/python/python.c
index fd5a920cbd..288c8b355c 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -2521,6 +2521,47 @@ Return current recording object." },
     "stop_recording () -> None.\n\
 Stop current recording." },
 
+  /* Type initialization functions. */
+  { "init_type", gdbpy_init_type, METH_VARARGS,
+    "init_type (objfile, type_code, bit_length, name) -> type\n\
+    Creates a new type with the given bit length and type code, owned\
+    by the given objfile." },
+  { "init_integer_type", gdbpy_init_integer_type, METH_VARARGS,
+    "init_integer_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new integer type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_character_type", gdbpy_init_character_type, METH_VARARGS,
+    "init_character_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new character type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_boolean_type", gdbpy_init_boolean_type, METH_VARARGS,
+    "init_boolean_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new boolean type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_float_type", gdbpy_init_float_type, METH_VARARGS,
+    "init_float_type (objfile, float_format, name) -> type\n\
+    Creates a new floating point type with the given bit length and \
+    format, owned by the given objfile." },
+  { "init_decfloat_type", gdbpy_init_decfloat_type, METH_VARARGS,
+    "init_decfloat_type (objfile, bit_length, name) -> type\n\
+    Creates a new decimal float type with the given bit length,\
+    owned by the given objfile." },
+  { "can_create_complex_type", gdbpy_can_create_complex_type, METH_VARARGS,
+    "can_create_complex_type (type) -> bool\n\
+     Returns whether a given type can form a new complex type." },
+  { "init_complex_type", gdbpy_init_complex_type, METH_VARARGS,
+    "init_complex_type (base_type, name) -> type\n\
+    Creates a new complex type whose components belong to the\
+    given type, owned by the given objfile." },
+  { "init_pointer_type", gdbpy_init_pointer_type, METH_VARARGS,
+    "init_pointer_type (objfile, target_type, bit_length, name) -> type\n\
+    Creates a new pointer type with the given bit length, pointing\
+    to the given target type, and owned by the given objfile." },
+ { "init_fixed_point_type", gdbpy_init_fixed_point_type, METH_VARARGS,
+   "init_fixed_point_type (objfile, bit_length, unsigned, name) -> type\n\
+   Creates a new fixed point type with the given bit length and\
+   signedness, owned by the given objfile." },
+
   { "lookup_type", (PyCFunction) gdbpy_lookup_type,
     METH_VARARGS | METH_KEYWORDS,
     "lookup_type (name [, block]) -> type\n\
diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
index c245d41a1a..aee2b4d60a 100644
--- a/gdb/testsuite/gdb.python/py-type.exp
+++ b/gdb/testsuite/gdb.python/py-type.exp
@@ -388,3 +388,13 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
       test_type_equality
   }
 }
+
+# Test python type construction
+gdb_test "python t = gdb.init_type(gdb.objfiles ()\[0\], gdb.TYPE_CODE_INT, 24, 'long short int')" \
+  "" "construct a new type from inside python"
+gdb_test "python print (t.code)" \
+  "8" "check the code for the python-constructed type"
+gdb_test "python print (t.sizeof)" \
+  "3" "check the size for the python-constructed type"
+gdb_test "python print (t.name)" \
+  "long short int" "check the name for the python-constructed type"
-- 
2.40.1


^ permalink raw reply	[relevance 2%]

* [PATCHv3 2/2] gdb: add __repr__() implementation to a few Python types
  2023-05-19 21:27  7%           ` [PATCHv3 0/2] " Andrew Burgess
@ 2023-05-19 21:27  3%             ` Andrew Burgess
  2023-06-07 17:05  7%             ` [PATCHv3 0/2] Add " Matheus Branco Borella
  1 sibling, 0 replies; 65+ results
From: Andrew Burgess @ 2023-05-19 21:27 UTC (permalink / raw)
  To: gdb-patches

From: Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>

Only a few types in the Python API currently have __repr__()
implementations.  This patch adds a few more of them. specifically: it
adds __repr__() implementations to gdb.Symbol, gdb.Architecture,
gdb.Block, gdb.Breakpoint, gdb.BreakpointLocation, and gdb.Type.

This makes it easier to play around the GDB Python API in the Python
interpreter session invoked with the 'pi' command in GDB, giving more
easily accessible tipe information to users.

An example of how this would look like:

  (gdb) pi
  >> gdb.lookup_type("char")
  <gdb.Type code=TYPE_CODE_INT name=char>
  >> gdb.lookup_global_symbol("main")
  <gdb.Symbol print_name=main>

The gdb.Block.__repr__() method shows the first 5 symbols from the
block, and then a message to show how many more were elided (if any).
---
 gdb/python/py-arch.c                         | 17 ++++-
 gdb/python/py-block.c                        | 37 ++++++++++-
 gdb/python/py-breakpoint.c                   | 67 ++++++++++++++++++-
 gdb/python/py-symbol.c                       | 15 ++++-
 gdb/python/py-type.c                         | 30 ++++++++-
 gdb/testsuite/gdb.python/py-arch.exp         |  6 ++
 gdb/testsuite/gdb.python/py-block.c          | 31 +++++++++
 gdb/testsuite/gdb.python/py-block.exp        | 38 ++++++++++-
 gdb/testsuite/gdb.python/py-bp-locations.exp | 32 +++++++++
 gdb/testsuite/gdb.python/py-breakpoint.exp   | 69 +++++++++++++++++---
 gdb/testsuite/gdb.python/py-symbol.exp       |  2 +
 gdb/testsuite/gdb.python/py-type.exp         |  4 ++
 12 files changed, 329 insertions(+), 19 deletions(-)

diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 4d133d1fe14..ac519331f18 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -319,6 +319,21 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
   return type_to_type_object (type);
 }
 
+/* __repr__ implementation for gdb.Architecture.  */
+
+static PyObject *
+archpy_repr (PyObject *self)
+{
+  const auto gdbarch = arch_object_to_gdbarch (self);
+  if (gdbarch == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", Py_TYPE (self)->tp_name);
+
+  auto arch_info = gdbarch_bfd_arch_info (gdbarch);
+  return PyUnicode_FromFormat ("<%s arch_name=%s printable_name=%s>",
+			       Py_TYPE (self)->tp_name, arch_info->arch_name,
+			       arch_info->printable_name);
+}
+
 /* Implementation of gdb.architecture_names().  Return a list of all the
    BFD architecture names that GDB understands.  */
 
@@ -395,7 +410,7 @@ PyTypeObject arch_object_type = {
   0,                                  /* tp_getattr */
   0,                                  /* tp_setattr */
   0,                                  /* tp_compare */
-  0,                                  /* tp_repr */
+  archpy_repr,                        /* tp_repr */
   0,                                  /* tp_as_number */
   0,                                  /* tp_as_sequence */
   0,                                  /* tp_as_mapping */
diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
index 09fa74d862c..dd6d6d278a0 100644
--- a/gdb/python/py-block.c
+++ b/gdb/python/py-block.c
@@ -418,6 +418,41 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
   Py_RETURN_TRUE;
 }
 
+/* __repr__ implementation for gdb.Block.  */
+
+static PyObject *
+blpy_repr (PyObject *self)
+{
+  const auto block = block_object_to_block (self);
+  if (block == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", Py_TYPE (self)->tp_name);
+
+  const auto name = block->function () ?
+    block->function ()->print_name () : "<anonymous>";
+
+  std::string str;
+  unsigned int written_symbols = 0;
+  const int len = mdict_size (block->multidict ());
+  static constexpr int SYMBOLS_TO_SHOW = 5;
+  for (struct symbol *symbol : block_iterator_range (block))
+    {
+      if (written_symbols == SYMBOLS_TO_SHOW)
+	{
+	  const int remaining = len - SYMBOLS_TO_SHOW;
+	  if (remaining == 1)
+	    str += string_printf ("... (%d more symbol)", remaining);
+	  else
+	    str += string_printf ("... (%d more symbols)", remaining);
+	  break;
+	}
+      str += symbol->print_name ();
+      if (++written_symbols < len)
+	str += ", ";
+    }
+  return PyUnicode_FromFormat ("<%s %s {%s}>", Py_TYPE (self)->tp_name,
+			       name, str.c_str ());
+}
+
 static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
 gdbpy_initialize_blocks (void)
 {
@@ -482,7 +517,7 @@ PyTypeObject block_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  blpy_repr,                      /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &block_object_as_mapping,	  /*tp_as_mapping*/
diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
index becb04c91c1..caf58e4b101 100644
--- a/gdb/python/py-breakpoint.c
+++ b/gdb/python/py-breakpoint.c
@@ -33,6 +33,7 @@
 #include "location.h"
 #include "py-event.h"
 #include "linespec.h"
+#include "gdbsupport/common-utils.h"
 
 extern PyTypeObject breakpoint_location_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
@@ -981,6 +982,31 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
   return 0;
 }
 
+/* __repr__ implementation for gdb.Breakpoint.  */
+
+static PyObject *
+bppy_repr (PyObject *self)
+{
+  const auto bp = (struct gdbpy_breakpoint_object*) self;
+  if (bp->bp == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", Py_TYPE (self)->tp_name);
+
+  std::string str = " ";
+  if (bp->bp->thread != -1)
+    str += string_printf ("thread=%d ", bp->bp->thread);
+  if (bp->bp->task > 0)
+    str += string_printf ("task=%d ", bp->bp->task);
+  if (bp->bp->enable_count > 0)
+    str += string_printf ("enable_count=%d ", bp->bp->enable_count);
+  str.pop_back ();
+
+  return PyUnicode_FromFormat ("<%s%s number=%d hits=%d%s>",
+			       Py_TYPE (self)->tp_name,
+			       (bp->bp->enable_state == bp_enabled
+				? "" : " disabled"), bp->bp->number,
+			       bp->bp->hit_count, str.c_str ());
+}
+
 /* Append to LIST the breakpoint Python object associated to B.
 
    Return true on success.  Return false on failure, with the Python error
@@ -1406,7 +1432,7 @@ PyTypeObject breakpoint_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  bppy_repr,                     /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
@@ -1624,6 +1650,43 @@ bplocpy_dealloc (PyObject *py_self)
   Py_TYPE (py_self)->tp_free (py_self);
 }
 
+/* __repr__ implementation for gdb.BreakpointLocation.  */
+
+static PyObject *
+bplocpy_repr (PyObject *py_self)
+{
+  const auto self = (gdbpy_breakpoint_location_object *) py_self;
+  if (self->owner == nullptr || self->owner->bp == nullptr
+    || self->owner->bp != self->bp_loc->owner)
+    return PyUnicode_FromFormat ("<%s (invalid)>", Py_TYPE (self)->tp_name);
+
+  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
+
+  std::string str (enabled);
+
+  str += string_printf (" address=%s",
+			paddress (self->bp_loc->owner->gdbarch,
+				  self->bp_loc->address));
+
+  if (self->bp_loc->requested_address != self->bp_loc->address)
+    str += string_printf (" requested_address=%s",
+			  paddress (self->bp_loc->owner->gdbarch,
+				    self->bp_loc->requested_address));
+  if (self->bp_loc->symtab != nullptr)
+    str += string_printf (" source=%s:%d", self->bp_loc->symtab->filename,
+			  self->bp_loc->line_number);
+
+  const auto fn_name = self->bp_loc->function_name.get ();
+  if (fn_name != nullptr)
+    {
+      str += " in ";
+      str += fn_name;
+    }
+
+  return PyUnicode_FromFormat ("<%s %s>", Py_TYPE (self)->tp_name,
+			       str.c_str ());
+}
+
 /* Attribute get/set Python definitions. */
 
 static gdb_PyGetSetDef bp_location_object_getset[] = {
@@ -1655,7 +1718,7 @@ PyTypeObject breakpoint_location_object_type =
   0,					/*tp_getattr*/
   0,					/*tp_setattr*/
   0,					/*tp_compare*/
-  0,					/*tp_repr*/
+  bplocpy_repr,                        /*tp_repr*/
   0,					/*tp_as_number*/
   0,					/*tp_as_sequence*/
   0,					/*tp_as_mapping*/
diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
index ff3d18504e7..ee863aa4df4 100644
--- a/gdb/python/py-symbol.c
+++ b/gdb/python/py-symbol.c
@@ -378,6 +378,19 @@ sympy_dealloc (PyObject *obj)
   Py_TYPE (obj)->tp_free (obj);
 }
 
+/* __repr__ implementation for gdb.Symbol.  */
+
+static PyObject *
+sympy_repr (PyObject *self)
+{
+  const auto symbol = symbol_object_to_symbol (self);
+  if (symbol == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", Py_TYPE (self)->tp_name);
+
+  return PyUnicode_FromFormat ("<%s print_name=%s>", Py_TYPE (self)->tp_name,
+			       symbol->print_name ());
+}
+
 /* Implementation of
    gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
    A tuple with 2 elements is always returned.  The first is the symbol
@@ -741,7 +754,7 @@ PyTypeObject symbol_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  sympy_repr,                    /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
index b9fa741177f..b4d1e230b3b 100644
--- a/gdb/python/py-type.c
+++ b/gdb/python/py-type.c
@@ -1028,6 +1028,34 @@ typy_template_argument (PyObject *self, PyObject *args)
   return result;
 }
 
+/* __repr__ implementation for gdb.Type.  */
+
+static PyObject *
+typy_repr (PyObject *self)
+{
+  const auto type = type_object_to_type (self);
+  if (type == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>",
+				 Py_TYPE (self)->tp_name);
+
+  const char *code = pyty_codes[type->code ()].name;
+  string_file type_name;
+  try
+    {
+      current_language->print_type (type, "", &type_name, -1, 0,
+				    &type_print_raw_options);
+    }
+  catch (const gdb_exception &except)
+    {
+      GDB_PY_HANDLE_EXCEPTION (except);
+    }
+  auto py_typename = PyUnicode_Decode (type_name.c_str (), type_name.size (),
+				       host_charset (), NULL);
+
+  return PyUnicode_FromFormat ("<%s code=%s name=%U>", Py_TYPE (self)->tp_name,
+			       code, py_typename);
+}
+
 static PyObject *
 typy_str (PyObject *self)
 {
@@ -1617,7 +1645,7 @@ PyTypeObject type_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  typy_repr,                     /*tp_repr*/
   &type_object_as_number,	  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &typy_mapping,		  /*tp_as_mapping*/
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index 4f4b4aa766f..597943ff682 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -27,6 +27,8 @@ if ![runto_main] {
 # Test python/15461.  Invalid architectures should not trigger an
 # internal GDB assert.
 gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
+gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
+    "Test empty achitecture __repr__ does not trigger an assert"
 gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
     "Test empty architecture.name does not trigger an assert"
 gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
@@ -44,6 +46,10 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
 gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
   "disassemble no end no count" 0
 
+gdb_test "python print (repr (arch))" \
+    "<gdb.Architecture arch_name=.* printable_name=.*>" \
+    "test __repr__ for architecture"
+
 gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
 gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
 gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
diff --git a/gdb/testsuite/gdb.python/py-block.c b/gdb/testsuite/gdb.python/py-block.c
index a0c6e165605..dd2e195af4a 100644
--- a/gdb/testsuite/gdb.python/py-block.c
+++ b/gdb/testsuite/gdb.python/py-block.c
@@ -30,9 +30,40 @@ int block_func (void)
   }
 }
 
+/* A function with no locals.  Used for testing gdb.Block.__repr__().  */
+int no_locals_func (void)
+{
+  return block_func ();
+}
+
+/* A function with 5 locals.  Used for testing gdb.Block.__repr__().  */
+int few_locals_func (void)
+{
+  int i = 0;
+  int j = 0;
+  int k = 0;
+  int x = 0;
+  int y = 0;
+  return block_func ();
+}
+
+/* A function with 6 locals.  Used for testing gdb.Block.__repr__().  */
+int many_locals_func (void)
+{
+  int i = 0;
+  int j = 0;
+  int k = 0;
+  int x = 0;
+  int y = 0;
+  int z = 0;
+  return block_func ();
+}
 
 int main (int argc, char *argv[])
 {
   block_func ();
+  no_locals_func ();
+  few_locals_func ();
+  many_locals_func ();
   return 0; /* Break at end. */
 }
diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
index 3bdf97294ae..37e3105b4e3 100644
--- a/gdb/testsuite/gdb.python/py-block.exp
+++ b/gdb/testsuite/gdb.python/py-block.exp
@@ -38,7 +38,8 @@ gdb_continue_to_breakpoint "Block break here."
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
 gdb_py_test_silent_cmd "python block = frame.block()" \
     "Get block, initial innermost block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
+gdb_test "python print (block)" "<gdb.Block <anonymous> \{i, f, b\}>" \
+    "check block not None"
 gdb_test "python print (block.function)" "None" "first anonymous block"
 gdb_test "python print (block.start)" "${decimal}" "check start not None"
 gdb_test "python print (block.end)" "${decimal}" "check end not None"
@@ -68,15 +69,46 @@ gdb_test_no_output "python block = block.superblock" "get superblock 2"
 gdb_test "python print (block.function)" "block_func" \
          "Print superblock 2 function"
 
+# Switch frames, then test block for no_locals_func.
+gdb_test "continue" ".*" "continue to no_locals_func breakpoint"
+gdb_test "up" ".*" "up to no_locals_func"
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
+gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
+gdb_test "python print (repr (block))" "<gdb.Block no_locals_func \{\}>" \
+    "Check block in no_locals_func"
+gdb_test "python print (block.function)" "no_locals_func" \
+    "no_locals_func block"
+
+# Switch frames, then test block for few_locals_func.
+gdb_test "continue" ".*" "continue to few_locals_func breakpoint"
+gdb_test "up" ".*" "up to few_locals_func"
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
+gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
+gdb_test "python print (repr (block))" \
+    "<gdb.Block few_locals_func \{i, j, k, x, y\}>" \
+    "Check block in few_locals_func"
+gdb_test "python print (block.function)" "few_locals_func" \
+    "few_locals_func block"
+
+# Switch frames, then test block for many_locals_func.
+gdb_test "continue" ".*" "continue to many_locals_func breakpoint"
+gdb_test "up" ".*" "up to many_locals_func"
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
+gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
+gdb_test "python print (repr (block))" \
+    "<gdb.Block many_locals_func \{i, j, k, x, y, \\.\\.\\. \\(1 more symbol\\)\}>" \
+    "Check block in many_locals_func"
+gdb_test "python print (block.function)" "many_locals_func" \
+    "many_locals_func block"
+
 # Switch frames, then test for main block.
 gdb_test "up" ".*"
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
 gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" \
+gdb_test "python print (repr (block))" "<gdb.Block main \{.*\}>" \
          "Check Frame 2's block not None"
 gdb_test "python print (block.function)" "main" "main block"
 
-
 # Test Block is_valid.  This must always be the last test in this
 # testcase as it unloads the object file.
 delete_breakpoints
diff --git a/gdb/testsuite/gdb.python/py-bp-locations.exp b/gdb/testsuite/gdb.python/py-bp-locations.exp
index f8649f6c105..b3a8c83bc0a 100644
--- a/gdb/testsuite/gdb.python/py-bp-locations.exp
+++ b/gdb/testsuite/gdb.python/py-bp-locations.exp
@@ -31,6 +31,30 @@ if ![runto_main] {
     return -1
 }
 
+# Build a regexp string that represents the __repr__ of a
+# gdb.BreakpointLocation object.  Accepts arguments -enabled, -address,
+# -source, -line, and -func.
+proc build_bpl_regexp { args } {
+    parse_args [list {enabled True} [list address "$::hex"] {source ".*"} \
+		    [list line "$::decimal"] {func ""}]
+
+    set pattern "<gdb.BreakpointLocation"
+
+    if {$enabled} {
+	set pattern "$pattern enabled"
+    } else {
+	set pattern "$pattern disabled"
+    }
+
+    set pattern "$pattern address=${address}(?: requested_address=$::hex)?"
+    set pattern "$pattern source=${source}:${line}"
+    if {$func ne ""} {
+	set pattern "$pattern in ${func}"
+    }
+    set pattern "$pattern>"
+    return $pattern
+}
+
 # Set breakpoint with 2 locations.
 gdb_breakpoint "add"
 
@@ -42,9 +66,17 @@ gdb_test "python print(gdb.breakpoints()\[1\].locations\[0\].source)" \
 	 ".*('.*py-bp-locations.c', $expected_line_a).*"
 gdb_test "python print(gdb.breakpoints()\[1\].locations\[1\].source)" \
 	 ".*('.*py-bp-locations.c', $expected_line_b).*"
+gdb_test "python print(gdb.breakpoints()\[1\].locations\[1\])" \
+    [build_bpl_regexp -enabled True -source ".*py-bp-locations.c" \
+	 -line "$expected_line_b" -func ".*"] \
+    "check repr of enabled breakpoint location"
 
 # Disable first location and make sure we don't hit it.
 gdb_test "python gdb.breakpoints()\[1\].locations\[0\].enabled = False" ""
+gdb_test "python print(gdb.breakpoints()\[1\].locations\[0\])" \
+    [build_bpl_regexp -enabled False -source ".*py-bp-locations.c" \
+	 -line "$expected_line_a" -func ".*"] \
+    "check repr of disabled breakpoint location"
 gdb_continue_to_breakpoint "" ".*25.*"
 
 if ![runto_main] {
diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
index 76094c95d10..df17d646b28 100644
--- a/gdb/testsuite/gdb.python/py-breakpoint.exp
+++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
@@ -38,6 +38,36 @@ if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} ${options}]
 
 set past_throw_catch_line [gdb_get_line_number "Past throw-catch."]
 
+# Build a regexp string that can match against the repr of a gdb.Breakpoint
+# object.  Accepts arguments -enabled, -number, -hits, -thread, -task, and
+# -enable_count.  The -enabled argument is a boolean, while all of the others
+# take a regexp string.
+proc build_bp_repr { args } {
+    parse_args [list {enabled True} [list number "-?$::decimal"] \
+		    [list hits $::decimal] {thread ""} {task ""} \
+		    {enable_count ""}]
+
+    set pattern "<gdb\\.Breakpoint"
+
+    if {!$enabled} {
+	set pattern "$pattern disabled"
+    }
+
+    set pattern "$pattern number=$number hits=$hits"
+
+    if {$thread ne ""} {
+	set pattern "$pattern thread=$thread"
+    }
+    if {$task ne ""} {
+	set pattern "$pattern task=$task"
+    }
+    if {$enable_count ne ""} {
+	set pattern "$pattern enable_count=$enable_count"
+    }
+    set pattern "${pattern}>"
+    return $pattern
+}
+
 proc_with_prefix test_bkpt_basic { } {
     global srcfile testfile hex decimal
 
@@ -54,8 +84,8 @@ proc_with_prefix test_bkpt_basic { } {
     # Now there should be one breakpoint: main.
     gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
+    gdb_test "python print (repr (blist\[0\]))" \
+	[build_bp_repr -number 1 -hits 1] "Check obj exists @main"
     gdb_test "python print (blist\[0\].location)" \
 	"main" "Check breakpoint location @main"
     gdb_test "python print (blist\[0\].pending)" "False" \
@@ -72,12 +102,12 @@ proc_with_prefix test_bkpt_basic { } {
 	"Get Breakpoint List" 0
     gdb_test "python print (len(blist))" \
 	"2" "Check for two breakpoints"
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
+    gdb_test "python print (repr (blist\[0\]))" \
+	[build_bp_repr -number 1 -hits 1] "Check obj exists @main 2"
     gdb_test "python print (blist\[0\].location)" \
 	"main" "Check breakpoint location @main 2"
-    gdb_test "python print (blist\[1\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
+    gdb_test "python print (repr (blist\[1\]))" \
+	[build_bp_repr -number 2 -hits 1] "Check obj exists @mult_line"
 
     gdb_test "python print (blist\[1\].location)" \
 	"py-breakpoint\.c:${mult_line}*" \
@@ -102,6 +132,9 @@ proc_with_prefix test_bkpt_basic { } {
 	"True" "Check breakpoint enabled."
     gdb_py_test_silent_cmd  "python blist\[1\].enabled = False" \
 	"Set breakpoint disabled." 0
+    gdb_test "python print (repr (blist\[1\]))" \
+	[build_bp_repr -enabled False -number 2 -hits 6] \
+	"Check repr for a disabled breakpoint"
     gdb_continue_to_breakpoint "Break at add 2" ".*Break at add.*"
     gdb_py_test_silent_cmd  "python blist\[1\].enabled = True" \
 	"Set breakpoint enabled." 0
@@ -113,6 +146,13 @@ proc_with_prefix test_bkpt_basic { } {
 	"Get Breakpoint List" 0
     gdb_test "python print (blist\[1\].thread)" \
 	"None" "Check breakpoint thread"
+    gdb_py_test_silent_cmd "python blist\[1\].thread = 1" \
+	"set breakpoint thread" 0
+    gdb_test "python print (repr (blist\[1\]))" \
+	[build_bp_repr -number 2 -hits 7 -thread 1] \
+	"Check repr for a thread breakpoint"
+    gdb_py_test_silent_cmd "python blist\[1\].thread = None" \
+	"clear breakpoint thread" 0
     gdb_test "python print (blist\[1\].type == gdb.BP_BREAKPOINT)" \
 	"True" "Check breakpoint type"
     gdb_test "python print (blist\[0\].number)" \
@@ -231,8 +271,8 @@ proc_with_prefix test_bkpt_invisible { } {
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	[build_bp_repr -number 2 -hits 0] "Check invisible bp obj exists 1"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
     gdb_test "python print (ilist\[0\].visible)" \
@@ -244,8 +284,9 @@ proc_with_prefix test_bkpt_invisible { } {
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	[build_bp_repr -number "-$decimal" -hits 0] \
+	"Check invisible bp obj exists 2"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
     gdb_test "python print (ilist\[0\].visible)" \
@@ -835,6 +876,14 @@ proc_with_prefix test_bkpt_auto_disable { } {
     set mult_line [gdb_get_line_number "Break at multiply."]
     gdb_breakpoint ${mult_line}
     gdb_test_no_output "enable count 1 2" "one shot enable"
+
+    # Find the Python gdb.Breakpoint object for breakpoint #2.
+    gdb_py_test_silent_cmd \
+	"python bp = \[b for b in gdb.breakpoints() if b.number == 2\]\[0\]" \
+	"Get breakpoint number 2" 0
+    gdb_test "python print (repr (bp))" \
+	[build_bp_repr -number 2 -hits 0 -enable_count 1]
+
     # Python 2 doesn't support print in lambda function, so use a named
     # function instead.
     gdb_test_multiline "Define print_bp_enabled" \
diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
index 9ec2f44e9c0..9bd5a35ed1c 100644
--- a/gdb/testsuite/gdb.python/py-symbol.exp
+++ b/gdb/testsuite/gdb.python/py-symbol.exp
@@ -43,6 +43,8 @@ clean_restart ${binfile}
 # point where we don't have a current frame, and we don't want to
 # require one.
 gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
+gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=main>" \
+    "test main_func.__repr__"
 gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
 gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
 
diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
index c245d41a1ac..918216ddd69 100644
--- a/gdb/testsuite/gdb.python/py-type.exp
+++ b/gdb/testsuite/gdb.python/py-type.exp
@@ -388,3 +388,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
       test_type_equality
   }
 }
+
+# Test __repr__().
+gdb_test "python print (repr (gdb.lookup_type ('char')))" \
+      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"
-- 
2.25.4


^ permalink raw reply	[relevance 3%]

* [PATCHv3 0/2] Add __repr__() implementation to a few Python types
  2023-05-18  3:33  7%         ` Matheus Branco Borella
@ 2023-05-19 21:27  7%           ` Andrew Burgess
  2023-05-19 21:27  3%             ` [PATCHv3 2/2] gdb: add " Andrew Burgess
  2023-06-07 17:05  7%             ` [PATCHv3 0/2] Add " Matheus Branco Borella
  0 siblings, 2 replies; 65+ results
From: Andrew Burgess @ 2023-05-19 21:27 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Matheus,

Thanks for moving this forward.

I took a look through, and I think this looks great.  I noticed a few
whitespace issues with the patch, so I fixed all of them.  But during
testing I ended up making a few more changes, so I figured I'd just
post what I ended up with.

The biggest change you'll see is a whole extra patch.  It turns out
that the way you were using mdict_size wasn't correct -- mdict_size
doesn't always return the number of symbols!  As a result the symbol
list for block_func (in the py-block.exp test) would end up printed
without comma's between the symbols.  Anyway, the new first patch
fixes this.

While we're in py-block.c, I changed the output format.  Rather than
displaying the symbols one per line, they are now listed all on one
line.  The multi-line format looks great if all you are displaying is
the one object's repr, but if the repr is printed as part of a larger
string then I think the multi-line layout doesn't look as good.  Now
we're only printing a few symbols I figure we can afford to go with a
one line layout.

On the testing side I've tightened the patterns in py-block.exp, and
extended the test to check the ouptut in block_func -- this exposes
the mdict_size issue.

I've added some tests for gdb.BreakpointLocation, which were missing.

And I've made the patterns in py-breakpoint.exp more precise, and
added some additional tests to catch more cases.

Would be great to hear your thoughts on the updates,

Thanks,
Andrew

---

Andrew Burgess (1):
  gdb: have mdict_size always return a symbol count

Matheus Branco Borella via Gdb-patches (1):
  gdb: add __repr__() implementation to a few Python types

 gdb/dictionary.c                             | 13 +++-
 gdb/dictionary.h                             |  3 +-
 gdb/python/py-arch.c                         | 17 ++++-
 gdb/python/py-block.c                        | 37 ++++++++++-
 gdb/python/py-breakpoint.c                   | 67 ++++++++++++++++++-
 gdb/python/py-symbol.c                       | 15 ++++-
 gdb/python/py-type.c                         | 30 ++++++++-
 gdb/symmisc.c                                |  2 +-
 gdb/testsuite/gdb.python/py-arch.exp         |  6 ++
 gdb/testsuite/gdb.python/py-block.c          | 31 +++++++++
 gdb/testsuite/gdb.python/py-block.exp        | 38 ++++++++++-
 gdb/testsuite/gdb.python/py-bp-locations.exp | 32 +++++++++
 gdb/testsuite/gdb.python/py-breakpoint.exp   | 69 +++++++++++++++++---
 gdb/testsuite/gdb.python/py-symbol.exp       |  2 +
 gdb/testsuite/gdb.python/py-type.exp         |  4 ++
 15 files changed, 343 insertions(+), 23 deletions(-)


base-commit: e84060b489746d031ed1ec9e7b6b39fdf4b6cfe3
-- 
2.25.4


^ permalink raw reply	[relevance 7%]

* [PATCH] Add __repr__() implementation to a few Python types
  2023-01-24 14:45  7%       ` Andrew Burgess
@ 2023-05-18  3:33  7%         ` Matheus Branco Borella
  2023-05-19 21:27  7%           ` [PATCHv3 0/2] " Andrew Burgess
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella @ 2023-05-18  3:33 UTC (permalink / raw)
  To: gdb-patches

From: Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>

Only a few types in the Python API currently have __repr__() implementations.
This patch adds a few more of them. specifically: it adds __repr__()
implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
and gdb.Type.

This makes it easier to play around the GDB Python API in the Python interpreter
session invoked with the 'pi' command in GDB, giving more easily accessible tipe
information to users.

An example of how this would look like:
```
(gdb) pi
>> gdb.lookup_type("char")
<gdb.Type code=TYPE_CODE_INT name=char>
>> gdb.lookup_global_symbol("main")
<gdb.Symbol print_name=main>
```

Okay, this should have all of the changes suggested in the replies. Sorry it
took me this long to get back to this. Life happened. I also made it so that
no more than five symbols get written out by `gdb.Block`, showing how many were
elided, if any. This should be a better compromise between expressiveness and
clutter than the first implementation. This patch makes use of `type_allocator`
for type initialization and of `block_iterator_range` for block symbol
iteration.
---
 gdb/python/py-arch.c                       | 17 ++++-
 gdb/python/py-block.c                      | 47 +++++++++++++-
 gdb/python/py-breakpoint.c                 | 75 +++++++++++++++++++++-
 gdb/python/py-symbol.c                     | 17 ++++-
 gdb/python/py-type.c                       | 32 ++++++++-
 gdb/testsuite/gdb.python/py-arch.exp       |  6 ++
 gdb/testsuite/gdb.python/py-block.c        | 28 ++++++++
 gdb/testsuite/gdb.python/py-block.exp      | 37 ++++++++++-
 gdb/testsuite/gdb.python/py-breakpoint.exp | 24 ++++---
 gdb/testsuite/gdb.python/py-symbol.exp     |  2 +
 gdb/testsuite/gdb.python/py-type.exp       |  4 ++
 11 files changed, 270 insertions(+), 19 deletions(-)

diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 4d133d1fe1..44df6db545 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -319,6 +319,21 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
   return type_to_type_object (type);
 }
 
+/* __repr__ implementation for gdb.Architecture.  */
+
+static PyObject *
+archpy_repr (PyObject *self)
+{
+  const auto gdbarch = arch_object_to_gdbarch (self);
+  if (gdbarch == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", Py_TYPE (self)->tp_name);
+
+  return PyUnicode_FromFormat ("<%s arch_name=%s printable_name=%s>",
+                               Py_TYPE (self)->tp_name,
+                               gdbarch_bfd_arch_info (gdbarch)->arch_name,
+                               gdbarch_bfd_arch_info (gdbarch)->printable_name);
+}
+
 /* Implementation of gdb.architecture_names().  Return a list of all the
    BFD architecture names that GDB understands.  */
 
@@ -395,7 +410,7 @@ PyTypeObject arch_object_type = {
   0,                                  /* tp_getattr */
   0,                                  /* tp_setattr */
   0,                                  /* tp_compare */
-  0,                                  /* tp_repr */
+  archpy_repr,                        /* tp_repr */
   0,                                  /* tp_as_number */
   0,                                  /* tp_as_sequence */
   0,                                  /* tp_as_mapping */
diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
index 09fa74d862..6bca88d2fb 100644
--- a/gdb/python/py-block.c
+++ b/gdb/python/py-block.c
@@ -418,6 +418,51 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
   Py_RETURN_TRUE;
 }
 
+/* __repr__ implementation for gdb.Block.  */
+
+static PyObject *
+blpy_repr (PyObject *self)
+{
+  const auto block = block_object_to_block (self);
+  if (block == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", 
+                                 Py_TYPE (self)->tp_name);
+
+  const auto name = block->function () ?
+    block->function ()->print_name () : "<anonymous>";
+
+  std::string str;
+  unsigned int written_symbols = 0;
+  const int len = mdict_size (block->multidict ());
+  for (struct symbol *symbol : block_iterator_range (block))
+    {
+      if(++written_symbols >= 6)
+        {
+          const int remaining = len - 5;
+          if (remaining == 1)
+            {
+              str = (str + "\n    ") + 
+                string_printf("... (%d more symbol)", len - 5);
+            }
+          else
+            {
+              str = (str + "\n    ") + 
+                string_printf("... (%d more symbols)", len - 5);
+            }
+
+          break;
+        }
+      str = (str + "\n    ") + symbol->print_name ();
+      if (written_symbols < len)
+        str = str + ",";
+    }
+  if(!str.empty ())
+    str += "\n";
+
+  return PyUnicode_FromFormat ("<%s %s {%s}>", Py_TYPE (self)->tp_name, 
+                               name, str.c_str ());
+}
+
 static int CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION
 gdbpy_initialize_blocks (void)
 {
@@ -482,7 +527,7 @@ PyTypeObject block_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  blpy_repr,                      /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &block_object_as_mapping,	  /*tp_as_mapping*/
diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
index becb04c91c..1dd746ba94 100644
--- a/gdb/python/py-breakpoint.c
+++ b/gdb/python/py-breakpoint.c
@@ -33,6 +33,7 @@
 #include "location.h"
 #include "py-event.h"
 #include "linespec.h"
+#include "gdbsupport/common-utils.h"
 
 extern PyTypeObject breakpoint_location_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
@@ -981,6 +982,32 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
   return 0;
 }
 
+/* __repr__ implementation for gdb.Breakpoint.  */
+
+static PyObject *
+bppy_repr (PyObject *self)
+{
+  const auto bp = (struct gdbpy_breakpoint_object*) self;
+  if (bp->bp == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", 
+                                 Py_TYPE (self)->tp_name);
+
+  std::string str = " ";
+  if (bp->bp->thread != -1)
+    str += string_printf ("thread=%d ", bp->bp->thread);
+  if (bp->bp->task > 0)
+    str += string_printf ("task=%d ", bp->bp->task);
+  if (bp->bp->enable_count > 0)
+    str += string_printf ("enable_count=%d ", 
+                          bp->bp->enable_count);
+  str.pop_back ();
+
+  return PyUnicode_FromFormat ("<%s number=%d hits=%d%s>",
+                               Py_TYPE (self)->tp_name,
+                               bp->bp->number, bp->bp->hit_count,
+                               str.c_str ());
+}
+
 /* Append to LIST the breakpoint Python object associated to B.
 
    Return true on success.  Return false on failure, with the Python error
@@ -1406,7 +1433,7 @@ PyTypeObject breakpoint_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  bppy_repr,                     /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
@@ -1624,6 +1651,50 @@ bplocpy_dealloc (PyObject *py_self)
   Py_TYPE (py_self)->tp_free (py_self);
 }
 
+/* __repr__ implementation for gdb.BreakpointLocation.  */
+
+static PyObject *
+bplocpy_repr (PyObject *py_self)
+{
+  const auto self = (gdbpy_breakpoint_location_object *) py_self;
+  if (self->owner == nullptr || self->owner->bp == nullptr
+    || self->owner->bp != self->bp_loc->owner)
+    return PyUnicode_FromFormat ("<%s (invalid)>", 
+                                 Py_TYPE (self)->tp_name);
+
+  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
+
+  std::string str(enabled);
+
+  str += " address=0x";
+  str += string_printf 
+    ("%s", core_addr_to_string_nz (self->bp_loc->address));
+
+  if (self->bp_loc->requested_address != self->bp_loc->address)
+    {
+      str += " requested_address=0x";
+      str += string_printf 
+        ("%s", core_addr_to_string_nz (self->bp_loc->requested_address));
+    }
+  if (self->bp_loc->symtab != nullptr)
+    {
+      str += " source=";
+      str += self->bp_loc->symtab->filename;
+      str += ":";
+      str += string_printf ("%d", self->bp_loc->line_number);
+    }
+
+  const auto fn_name = self->bp_loc->function_name.get ();
+  if (fn_name != nullptr)
+    {
+      str += " in ";
+      str += fn_name;
+    }
+
+  return PyUnicode_FromFormat ("<%s %s>", Py_TYPE (self)->tp_name, 
+                               str.c_str ());
+}
+
 /* Attribute get/set Python definitions. */
 
 static gdb_PyGetSetDef bp_location_object_getset[] = {
@@ -1655,7 +1726,7 @@ PyTypeObject breakpoint_location_object_type =
   0,					/*tp_getattr*/
   0,					/*tp_setattr*/
   0,					/*tp_compare*/
-  0,					/*tp_repr*/
+  bplocpy_repr,                        /*tp_repr*/
   0,					/*tp_as_number*/
   0,					/*tp_as_sequence*/
   0,					/*tp_as_mapping*/
diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
index ff3d18504e..6cf9825d7f 100644
--- a/gdb/python/py-symbol.c
+++ b/gdb/python/py-symbol.c
@@ -378,6 +378,21 @@ sympy_dealloc (PyObject *obj)
   Py_TYPE (obj)->tp_free (obj);
 }
 
+/* __repr__ implementation for gdb.Symbol.  */
+
+static PyObject *
+sympy_repr (PyObject *self)
+{
+  const auto symbol = symbol_object_to_symbol (self);
+  if (symbol == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", 
+                                 Py_TYPE (self)->tp_name);
+
+  return PyUnicode_FromFormat ("<%s print_name=%s>",
+                               Py_TYPE (self)->tp_name,
+                               symbol->print_name ());
+}
+
 /* Implementation of
    gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
    A tuple with 2 elements is always returned.  The first is the symbol
@@ -741,7 +756,7 @@ PyTypeObject symbol_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  sympy_repr,                    /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
index b9fa741177..283d595546 100644
--- a/gdb/python/py-type.c
+++ b/gdb/python/py-type.c
@@ -1028,6 +1028,36 @@ typy_template_argument (PyObject *self, PyObject *args)
   return result;
 }
 
+/* __repr__ implementation for gdb.Type.  */
+
+static PyObject *
+typy_repr (PyObject *self)
+{
+  const auto type = type_object_to_type (self);
+  if (type == nullptr)
+    return PyUnicode_FromFormat ("<%s (invalid)>", 
+                                 Py_TYPE (self)->tp_name);
+
+  const char *code = pyty_codes[type->code ()].name;
+  string_file type_name;
+  try
+    {
+      current_language->print_type (type, "", &type_name, -1, 0, 
+                                    &type_print_raw_options);
+    }
+  catch (const gdb_exception &except)
+    {
+      GDB_PY_HANDLE_EXCEPTION (except);
+    }
+  auto py_typename = PyUnicode_Decode (type_name.c_str (), 
+                                       type_name.size (),
+                                       host_charset (), NULL);
+
+  return PyUnicode_FromFormat ("<%s code=%s name=%U>", 
+                               Py_TYPE (self)->tp_name, code, 
+                               py_typename);
+}
+
 static PyObject *
 typy_str (PyObject *self)
 {
@@ -1617,7 +1647,7 @@ PyTypeObject type_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  typy_repr,                     /*tp_repr*/
   &type_object_as_number,	  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &typy_mapping,		  /*tp_as_mapping*/
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index 4f4b4aa766..597943ff68 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -27,6 +27,8 @@ if ![runto_main] {
 # Test python/15461.  Invalid architectures should not trigger an
 # internal GDB assert.
 gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
+gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
+    "Test empty achitecture __repr__ does not trigger an assert"
 gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
     "Test empty architecture.name does not trigger an assert"
 gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
@@ -44,6 +46,10 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
 gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
   "disassemble no end no count" 0
 
+gdb_test "python print (repr (arch))" \
+    "<gdb.Architecture arch_name=.* printable_name=.*>" \
+    "test __repr__ for architecture"
+
 gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
 gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
 gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
diff --git a/gdb/testsuite/gdb.python/py-block.c b/gdb/testsuite/gdb.python/py-block.c
index a0c6e16560..7e26f2d1fe 100644
--- a/gdb/testsuite/gdb.python/py-block.c
+++ b/gdb/testsuite/gdb.python/py-block.c
@@ -30,9 +30,37 @@ int block_func (void)
   }
 }
 
+int no_locals_func (void)
+{
+  return block_func ();
+}
+
+int few_locals_func (void)
+{
+  int i = 0;
+  int j = 0;
+  int k = 0;
+  int x = 0;
+  int y = 0; 
+  return block_func ();
+}
+
+int many_locals_func (void)
+{
+  int i = 0;
+  int j = 0;
+  int k = 0;
+  int x = 0;
+  int y = 0; 
+  int z = 0; 
+  return block_func ();
+}
 
 int main (int argc, char *argv[])
 {
   block_func ();
+  no_locals_func ();
+  few_locals_func ();
+  many_locals_func ();
   return 0; /* Break at end. */
 }
diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
index 3bdf97294a..dc2f2f53af 100644
--- a/gdb/testsuite/gdb.python/py-block.exp
+++ b/gdb/testsuite/gdb.python/py-block.exp
@@ -38,7 +38,7 @@ gdb_continue_to_breakpoint "Block break here."
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
 gdb_py_test_silent_cmd "python block = frame.block()" \
     "Get block, initial innermost block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
+gdb_test "python print (block)" "<gdb.Block <anonymous> \{.*\}>" "check block not None"
 gdb_test "python print (block.function)" "None" "first anonymous block"
 gdb_test "python print (block.start)" "${decimal}" "check start not None"
 gdb_test "python print (block.end)" "${decimal}" "check end not None"
@@ -68,15 +68,46 @@ gdb_test_no_output "python block = block.superblock" "get superblock 2"
 gdb_test "python print (block.function)" "block_func" \
          "Print superblock 2 function"
 
+# Switch frames, then test block for no_locals_func.
+gdb_test "continue" ".*" "continue to no_locals_func breakpoint"
+gdb_test "up" ".*" "up to no_locals_func"
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
+gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
+gdb_test "python print (repr (block))" "<gdb.Block no_locals_func \{\}>" \
+    "Check block in no_locals_func"
+gdb_test "python print (block.function)" "no_locals_func" \
+    "no_locals_func block"
+
+# Switch frames, then test block for few_locals_func.
+gdb_test "continue" ".*" "continue to few_locals_func breakpoint"
+gdb_test "up" ".*" "up to few_locals_func"
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
+gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
+gdb_test "python print (repr (block))" \
+    "<gdb.Block few_locals_func \{.*i,.*j,.*k,.*x,.*y.*\}>" \
+    "Check block in few_locals_func"
+gdb_test "python print (block.function)" "few_locals_func" \
+    "few_locals_func block"
+
+# Switch frames, then test block for many_locals_func.
+gdb_test "continue" ".*" "continue to many_locals_func breakpoint"
+gdb_test "up" ".*" "up to many_locals_func"
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
+gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
+gdb_test "python print (repr (block))" \
+    "<gdb.Block many_locals_func \{.*i,.*j,.*k,.*x,.*y,.*\.\.\. \\(1 more symbol\\).*\}>" \
+    "Check block in many_locals_func"
+gdb_test "python print (block.function)" "many_locals_func" \
+    "many_locals_func block"
+
 # Switch frames, then test for main block.
 gdb_test "up" ".*"
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
 gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" \
+gdb_test "python print (repr (block))" "<gdb.Block main \{.*\}>" \
          "Check Frame 2's block not None"
 gdb_test "python print (block.function)" "main" "main block"
 
-
 # Test Block is_valid.  This must always be the last test in this
 # testcase as it unloads the object file.
 delete_breakpoints
diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
index 76094c95d1..77e44f6e3c 100644
--- a/gdb/testsuite/gdb.python/py-breakpoint.exp
+++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
@@ -51,11 +51,13 @@ proc_with_prefix test_bkpt_basic { } {
 	return 0
     }
 
+    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
+
     # Now there should be one breakpoint: main.
     gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
+    gdb_test "python print (repr (blist\[0\]))" \
+	"$repr_pattern" "Check obj exists @main"
     gdb_test "python print (blist\[0\].location)" \
 	"main" "Check breakpoint location @main"
     gdb_test "python print (blist\[0\].pending)" "False" \
@@ -72,12 +74,12 @@ proc_with_prefix test_bkpt_basic { } {
 	"Get Breakpoint List" 0
     gdb_test "python print (len(blist))" \
 	"2" "Check for two breakpoints"
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
+    gdb_test "python print (repr (blist\[0\]))" \
+	"$repr_pattern" "Check obj exists @main 2"
     gdb_test "python print (blist\[0\].location)" \
 	"main" "Check breakpoint location @main 2"
-    gdb_test "python print (blist\[1\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
+    gdb_test "python print (repr (blist\[1\]))" \
+	"$repr_pattern" "Check obj exists @mult_line"
 
     gdb_test "python print (blist\[1\].location)" \
 	"py-breakpoint\.c:${mult_line}*" \
@@ -225,14 +227,16 @@ proc_with_prefix test_bkpt_invisible { } {
 	return 0
     }
 
+    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
+
     delete_breakpoints
     set ibp_location [gdb_get_line_number "Break at multiply."]
     gdb_py_test_silent_cmd  "python ibp = gdb.Breakpoint(\"$ibp_location\", internal=False)" \
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	"$repr_pattern" "Check invisible bp obj exists 1"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
     gdb_test "python print (ilist\[0\].visible)" \
@@ -244,8 +248,8 @@ proc_with_prefix test_bkpt_invisible { } {
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	"$repr_pattern" "Check invisible bp obj exists 2"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
     gdb_test "python print (ilist\[0\].visible)" \
diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
index 9ec2f44e9c..9bd5a35ed1 100644
--- a/gdb/testsuite/gdb.python/py-symbol.exp
+++ b/gdb/testsuite/gdb.python/py-symbol.exp
@@ -43,6 +43,8 @@ clean_restart ${binfile}
 # point where we don't have a current frame, and we don't want to
 # require one.
 gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
+gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=main>" \
+    "test main_func.__repr__"
 gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
 gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
 
diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
index c245d41a1a..918216ddd6 100644
--- a/gdb/testsuite/gdb.python/py-type.exp
+++ b/gdb/testsuite/gdb.python/py-type.exp
@@ -388,3 +388,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
       test_type_equality
   }
 }
+
+# Test __repr__().
+gdb_test "python print (repr (gdb.lookup_type ('char')))" \
+      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"
-- 
2.40.1


^ permalink raw reply	[relevance 7%]

* Re: [PATCH] Add __repr__() implementation to a few Python types
  2023-01-20  1:43  3%     ` Matheus Branco Borella
  2023-01-20 16:45  5%       ` Andrew Burgess
@ 2023-01-24 14:45  7%       ` Andrew Burgess
  2023-05-18  3:33  7%         ` Matheus Branco Borella
  1 sibling, 1 reply; 65+ results
From: Andrew Burgess @ 2023-01-24 14:45 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, gdb-patches
  Cc: Matheus Branco Borella

Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
writes:

> Only a few types in the Python API currently have __repr__() implementations.
> This patch adds a few more of them. specifically: it adds __repr__()
> implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
> and gdb.Type.
>
> This makes it easier to play around the GDB Python API in the Python interpreter
> session invoked with the 'pi' command in GDB, giving more easily accessible tipe
> information to users.
>
> An example of how this would look like:
> ```
> (gdb) pi
>>> gdb.lookup_type("char")
> <gdb.Type code=TYPE_CODE_INT name=char>
>>> gdb.lookup_global_symbol("main")
> <gdb.Symbol print_name=main>
> ```
>
>> Sorry for being a little slow.  What does this actually mean?  When you
>> say "makes use of u8 string literals" - does this mean you have string
>> literals in this patch containing non ASCII characters?
>>
>> I've trying to understand why this is different to any other part of GDB
>> that prints stuff via Python.
>
> I forgot to take that out of the commit message, my bad. Originally, I'd 
> intended for the string literals in the patch that get handed to Python to be
> all u8 literals so that I could guarantee it wouldn't break in an environment
> that doesn't output regular string literals in an ASCII-compatible encoding,
> as Python expects all strings handed to it to be encoded in UTF-8. But seeing
> as all of the rest of the Python interface code uses regular string literals, 
> I figured it wouldn't make much of difference having them in anyway.
>
>> I guess I was surprised that so many of the new tests included an
>> explicit call to repr, given the premise of the change was that simply
>> 'print(empty)' would now print something useful.
>>
>> I guess maybe it doesn't hurt to _also_ include some explicit repr
>> calls, but I was expecting most tests to just be printing the object
>> directly.
> As blarsen@ also pointed out, `print`-ing an object directly that has an 
> implmentation of __str__() will print whatever its __str__() functions returns, 
> regardless of whether it implements __repr__() or not, which is not what we want 
> here. __repr__() is always preferred in the REPL, though, so it's understandable 
> it might not be clear at first why I'm calling `repr()` explicitly.
>
>> Over long line, please wrap a little.  There's other long lines in your
>> patch, I'll not point out each one.
>
> Should be all fixed now (hopefully I didn't miss any), with the exception of the
> `repr_pattern` strings in `py-breakpoint.exp`, which I couldn't for the life of 
> me get to match properly with the output were they not on a single line.
>
> ---
>  gdb/python/py-arch.c                       | 18 +++++-
>  gdb/python/py-block.c                      | 27 ++++++++-
>  gdb/python/py-breakpoint.c                 | 68 +++++++++++++++++++++-
>  gdb/python/py-symbol.c                     | 16 ++++-
>  gdb/python/py-type.c                       | 30 +++++++++-
>  gdb/testsuite/gdb.python/py-arch.exp       |  6 ++
>  gdb/testsuite/gdb.python/py-block.exp      |  4 +-
>  gdb/testsuite/gdb.python/py-breakpoint.exp | 24 ++++----
>  gdb/testsuite/gdb.python/py-symbol.exp     |  2 +
>  gdb/testsuite/gdb.python/py-type.exp       |  4 ++
>  10 files changed, 181 insertions(+), 18 deletions(-)
>
> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index cf0978560f9..5384a0d0d0c 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -319,6 +319,22 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
>    return type_to_type_object (type);
>  }
>  
> +/* __repr__ implementation for gdb.Architecture.  */
> +
> +static PyObject *
> +archpy_repr (PyObject *self)
> +{
> +  const auto gdbarch = arch_object_to_gdbarch (self);
> +  if (gdbarch == nullptr)
> +    return PyUnicode_FromFormat
> +      ("<gdb.Architecture (invalid)>");

Additionally, I think that instead of hard-coding gdb.Architecture, we
should do:

    return PyUnicode_FromFormat ("<%s (invalid)>", Py_TYPE (self)->tp_name);

The benefit being that if a user sub-classes gdb.Architecture, and
doesn't override the __repr__ method, then the name printed will be the
name of the sub-class, rather than the base-class.

This obviously applies throughout this patch.

Thanks,
Andrew

> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Architecture arch_name=%s printable_name=%s>",
> +     gdbarch_bfd_arch_info (gdbarch)->arch_name,
> +     gdbarch_bfd_arch_info (gdbarch)->printable_name);
> +}
> +
>  /* Implementation of gdb.architecture_names().  Return a list of all the
>     BFD architecture names that GDB understands.  */
>  
> @@ -391,7 +407,7 @@ PyTypeObject arch_object_type = {
>    0,                                  /* tp_getattr */
>    0,                                  /* tp_setattr */
>    0,                                  /* tp_compare */
> -  0,                                  /* tp_repr */
> +  archpy_repr,                        /* tp_repr */
>    0,                                  /* tp_as_number */
>    0,                                  /* tp_as_sequence */
>    0,                                  /* tp_as_mapping */
> diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
> index b9aea3aca69..b4c55add765 100644
> --- a/gdb/python/py-block.c
> +++ b/gdb/python/py-block.c
> @@ -424,6 +424,31 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
>    Py_RETURN_TRUE;
>  }
>  
> +/* __repr__ implementation for gdb.Block.  */
> +
> +static PyObject *
> +blpy_repr (PyObject *self)
> +{
> +  const auto block = block_object_to_block (self);
> +  if (block == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Block (invalid)>");
> +
> +  const auto name = block->function () ?
> +    block->function ()->print_name () : "<anonymous>";
> +
> +  block_iterator iter;
> +  block_iterator_first (block, &iter);
> +
> +  std::string str;
> +  const struct symbol *symbol;
> +  while ((symbol = block_iterator_next (&iter)) != nullptr)
> +    str = (str + "\n") + symbol->print_name () + ",";
> +  if(!str.empty ())
> +    str += "\n";
> +
> +  return PyUnicode_FromFormat ("<gdb.Block %s {%s}>", name, str.c_str ());
> +}
> +
>  int
>  gdbpy_initialize_blocks (void)
>  {
> @@ -486,7 +511,7 @@ PyTypeObject block_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  blpy_repr,                     /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    &block_object_as_mapping,	  /*tp_as_mapping*/
> diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
> index de7b9f4266b..d68a205330c 100644
> --- a/gdb/python/py-breakpoint.c
> +++ b/gdb/python/py-breakpoint.c
> @@ -33,6 +33,7 @@
>  #include "location.h"
>  #include "py-event.h"
>  #include "linespec.h"
> +#include "gdbsupport/common-utils.h"
>  
>  extern PyTypeObject breakpoint_location_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
> @@ -967,6 +968,31 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
>    return 0;
>  }
>  
> +/* __repr__ implementation for gdb.Breakpoint.  */
> +
> +static PyObject *
> +bppy_repr (PyObject *self)
> +{
> +  const auto bp = (struct gdbpy_breakpoint_object*) self;
> +  if (bp->bp == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Breakpoint (invalid)>");
> +
> +  std::string str = " ";
> +  if (bp->bp->thread != -1)
> +    str += string_printf ("thread=%d ", bp->bp->thread);
> +  if (bp->bp->task > 0)
> +    str += string_printf ("task=%d ", bp->bp->task);
> +  if (bp->bp->enable_count > 0)
> +    str += string_printf ("enable_count=%d ", bp->bp->enable_count);
> +  str.pop_back ();
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Breakpoint number=%d hits=%d%s>",
> +     bp->bp->number,
> +     bp->bp->hit_count,
> +     str.c_str ());
> +}
> +
>  /* Append to LIST the breakpoint Python object associated to B.
>  
>     Return true on success.  Return false on failure, with the Python error
> @@ -1389,7 +1415,7 @@ PyTypeObject breakpoint_object_type =
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  bppy_repr,                     /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> @@ -1604,6 +1630,44 @@ bplocpy_dealloc (PyObject *py_self)
>    Py_TYPE (py_self)->tp_free (py_self);
>  }
>  
> +/* __repr__ implementation for gdb.BreakpointLocation.  */
> +
> +static PyObject *
> +bplocpy_repr (PyObject *py_self)
> +{
> +  const auto self = (gdbpy_breakpoint_location_object *) py_self;
> +  if (self->owner == nullptr || self->owner->bp == nullptr
> +    || self->owner->bp != self->bp_loc->owner)
> +    return PyUnicode_FromFormat ("<gdb.BreakpointLocation (invalid)>");
> +
> +  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
> +
> +  std::string str(enabled);
> +
> +  str += " requested_address=0x";
> +  str += string_printf ("%lx", self->bp_loc->requested_address);
> +
> +  str += " address=0x";
> +  str += string_printf ("%lx", self->bp_loc->address);
> +
> +  if (self->bp_loc->symtab != nullptr)
> +  {
> +    str += " source=";
> +    str += self->bp_loc->symtab->filename;
> +    str += ":";
> +    str += string_printf ("%d", self->bp_loc->line_number);
> +  }
> +
> +  const auto fn_name = self->bp_loc->function_name.get ();
> +  if (fn_name != nullptr)
> +  {
> +    str += " in ";
> +    str += fn_name;
> +  }
> +
> +  return PyUnicode_FromFormat ("<gdb.BreakpointLocation %s>", str.c_str ());
> +}
> +
>  /* Attribute get/set Python definitions. */
>  
>  static gdb_PyGetSetDef bp_location_object_getset[] = {
> @@ -1635,7 +1699,7 @@ PyTypeObject breakpoint_location_object_type =
>    0,					/*tp_getattr*/
>    0,					/*tp_setattr*/
>    0,					/*tp_compare*/
> -  0,					/*tp_repr*/
> +  bplocpy_repr,                        /*tp_repr*/
>    0,					/*tp_as_number*/
>    0,					/*tp_as_sequence*/
>    0,					/*tp_as_mapping*/
> diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
> index 93c86964f3e..5a8149bbe66 100644
> --- a/gdb/python/py-symbol.c
> +++ b/gdb/python/py-symbol.c
> @@ -375,6 +375,20 @@ sympy_dealloc (PyObject *obj)
>    Py_TYPE (obj)->tp_free (obj);
>  }
>  
> +/* __repr__ implementation for gdb.Symbol.  */
> +
> +static PyObject *
> +sympy_repr (PyObject *self)
> +{
> +  const auto symbol = symbol_object_to_symbol (self);
> +  if (symbol == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Symbol (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Symbol print_name=%s>",
> +     symbol->print_name ());
> +}
> +
>  /* Implementation of
>     gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
>     A tuple with 2 elements is always returned.  The first is the symbol
> @@ -732,7 +746,7 @@ PyTypeObject symbol_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  sympy_repr,                    /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
> index 928efacfe8a..eb11ef029ca 100644
> --- a/gdb/python/py-type.c
> +++ b/gdb/python/py-type.c
> @@ -1026,6 +1026,34 @@ typy_template_argument (PyObject *self, PyObject *args)
>    return value_to_value_object (val);
>  }
>  
> +/* __repr__ implementation for gdb.Type.  */
> +
> +static PyObject *
> +typy_repr (PyObject *self)
> +{
> +  const auto type = type_object_to_type (self);
> +  if (type == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Type (invalid)>");
> +
> +  const char *code = pyty_codes[type->code ()].name;
> +  string_file type_name;
> +  try
> +    {
> +      current_language->print_type (type, "",
> +				    &type_name, -1, 0,
> +				    &type_print_raw_options);
> +    }
> +  catch (const gdb_exception &except)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (except);
> +    }
> +  auto py_typename = PyUnicode_Decode
> +    (type_name.c_str (), type_name.size (),
> +		 host_charset (), NULL);
> +	
> +  return PyUnicode_FromFormat ("<gdb.Type code=%s name=%U>", code, py_typename);
> +}
> +
>  static PyObject *
>  typy_str (PyObject *self)
>  {
> @@ -1612,7 +1640,7 @@ PyTypeObject type_object_type =
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  typy_repr,                     /*tp_repr*/
>    &type_object_as_number,	  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    &typy_mapping,		  /*tp_as_mapping*/
> diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
> index 1fbbc47c872..d436c957e25 100644
> --- a/gdb/testsuite/gdb.python/py-arch.exp
> +++ b/gdb/testsuite/gdb.python/py-arch.exp
> @@ -29,6 +29,8 @@ if ![runto_main] {
>  # Test python/15461.  Invalid architectures should not trigger an
>  # internal GDB assert.
>  gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
> +gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
> +    "Test empty achitecture __repr__ does not trigger an assert"
>  gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
>      "Test empty architecture.name does not trigger an assert"
>  gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
> @@ -46,6 +48,10 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
>  gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
>    "disassemble no end no count" 0
>  
> +gdb_test "python print (repr (arch))" \
> +    "<gdb.Architecture arch_name=.* printable_name=.*>" \
> +    "test __repr__ for architecture"
> +
>  gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
>  gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
>  gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
> diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
> index 0a88aec56a0..5e3d1c72d5e 100644
> --- a/gdb/testsuite/gdb.python/py-block.exp
> +++ b/gdb/testsuite/gdb.python/py-block.exp
> @@ -39,7 +39,7 @@ gdb_continue_to_breakpoint "Block break here."
>  gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
>  gdb_py_test_silent_cmd "python block = frame.block()" \
>      "Get block, initial innermost block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
> +gdb_test "python print (block)" "<gdb.Block .* \{.*\}>" "check block not None"
>  gdb_test "python print (block.function)" "None" "first anonymous block"
>  gdb_test "python print (block.start)" "${decimal}" "check start not None"
>  gdb_test "python print (block.end)" "${decimal}" "check end not None"
> @@ -73,7 +73,7 @@ gdb_test "python print (block.function)" "block_func" \
>  gdb_test "up" ".*"
>  gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
>  gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" \
> +gdb_test "python print (repr (block))" "<gdb.Block .* \{.*\}>" \
>           "Check Frame 2's block not None"
>  gdb_test "python print (block.function)" "main" "main block"
>  
> diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
> index e36e87dc291..4da46431a3a 100644
> --- a/gdb/testsuite/gdb.python/py-breakpoint.exp
> +++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
> @@ -50,11 +50,13 @@ proc_with_prefix test_bkpt_basic { } {
>  	return 0
>      }
>  
> +    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
> +
>      # Now there should be one breakpoint: main.
>      gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main"
>      gdb_test "python print (blist\[0\].location)" \
>  	"main." "Check breakpoint location @main"
>      gdb_test "python print (blist\[0\].pending)" "False" \
> @@ -71,12 +73,12 @@ proc_with_prefix test_bkpt_basic { } {
>  	"Get Breakpoint List" 0
>      gdb_test "python print (len(blist))" \
>  	"2" "Check for two breakpoints"
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main 2"
>      gdb_test "python print (blist\[0\].location)" \
>  	"main." "Check breakpoint location @main 2"
> -    gdb_test "python print (blist\[1\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
> +    gdb_test "python print (repr (blist\[1\]))" \
> +	"$repr_pattern" "Check obj exists @mult_line"
>  
>      gdb_test "python print (blist\[1\].location)" \
>  	"py-breakpoint\.c:${mult_line}*" \
> @@ -224,14 +226,16 @@ proc_with_prefix test_bkpt_invisible { } {
>  	return 0
>      }
>  
> +    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
> +
>      delete_breakpoints
>      set ibp_location [gdb_get_line_number "Break at multiply."]
>      gdb_py_test_silent_cmd  "python ibp = gdb.Breakpoint(\"$ibp_location\", internal=False)" \
>  	"Set invisible breakpoint" 0
>      gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 1"
>      gdb_test "python print (ilist\[0\].location)" \
>  	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
>      gdb_test "python print (ilist\[0\].visible)" \
> @@ -243,8 +247,8 @@ proc_with_prefix test_bkpt_invisible { } {
>  	"Set invisible breakpoint" 0
>      gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 2"
>      gdb_test "python print (ilist\[0\].location)" \
>  	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
>      gdb_test "python print (ilist\[0\].visible)" \
> diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
> index ad06b07c2c6..979b7dfb8fb 100644
> --- a/gdb/testsuite/gdb.python/py-symbol.exp
> +++ b/gdb/testsuite/gdb.python/py-symbol.exp
> @@ -44,6 +44,8 @@ clean_restart ${binfile}
>  # point where we don't have a current frame, and we don't want to
>  # require one.
>  gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
> +gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=.*>" \
> +    "test main_func.__repr__"
>  gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
>  gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
>  
> diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
> index 594c9749d8e..95cdfa54a6e 100644
> --- a/gdb/testsuite/gdb.python/py-type.exp
> +++ b/gdb/testsuite/gdb.python/py-type.exp
> @@ -393,3 +393,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
>        test_type_equality
>    }
>  }
> +
> +# Test __repr__()
> +gdb_test "python print (repr (gdb.lookup_type ('char')))" \
> +      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"
> -- 
> 2.37.3.windows.1


^ permalink raw reply	[relevance 7%]

* Re: [PATCH] Add __repr__() implementation to a few Python types
  2023-01-20  1:43  3%     ` Matheus Branco Borella
@ 2023-01-20 16:45  5%       ` Andrew Burgess
  2023-01-24 14:45  7%       ` Andrew Burgess
  1 sibling, 0 replies; 65+ results
From: Andrew Burgess @ 2023-01-20 16:45 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, gdb-patches
  Cc: Matheus Branco Borella

Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
writes:

> Only a few types in the Python API currently have __repr__() implementations.
> This patch adds a few more of them. specifically: it adds __repr__()
> implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
> and gdb.Type.
>
> This makes it easier to play around the GDB Python API in the Python interpreter
> session invoked with the 'pi' command in GDB, giving more easily accessible tipe
> information to users.
>
> An example of how this would look like:
> ```
> (gdb) pi
>>> gdb.lookup_type("char")
> <gdb.Type code=TYPE_CODE_INT name=char>
>>> gdb.lookup_global_symbol("main")
> <gdb.Symbol print_name=main>
> ```
>
>> Sorry for being a little slow.  What does this actually mean?  When you
>> say "makes use of u8 string literals" - does this mean you have string
>> literals in this patch containing non ASCII characters?
>>
>> I've trying to understand why this is different to any other part of GDB
>> that prints stuff via Python.
>
> I forgot to take that out of the commit message, my bad. Originally, I'd 
> intended for the string literals in the patch that get handed to Python to be
> all u8 literals so that I could guarantee it wouldn't break in an environment
> that doesn't output regular string literals in an ASCII-compatible encoding,
> as Python expects all strings handed to it to be encoded in UTF-8. But seeing
> as all of the rest of the Python interface code uses regular string literals, 
> I figured it wouldn't make much of difference having them in anyway.

Thanks, that makes sense.

>
>> I guess I was surprised that so many of the new tests included an
>> explicit call to repr, given the premise of the change was that simply
>> 'print(empty)' would now print something useful.
>>
>> I guess maybe it doesn't hurt to _also_ include some explicit repr
>> calls, but I was expecting most tests to just be printing the object
>> directly.
> As blarsen@ also pointed out, `print`-ing an object directly that has an 
> implmentation of __str__() will print whatever its __str__() functions returns, 
> regardless of whether it implements __repr__() or not, which is not what we want 
> here. __repr__() is always preferred in the REPL, though, so it's understandable 
> it might not be clear at first why I'm calling `repr()` explicitly.

Again, thanks (and to Bruno too) for the explanation.  I understand
__str__ and __repr__ better now :)  I agree that what you're doing makes
sense now I understand it.

>
>> Over long line, please wrap a little.  There's other long lines in your
>> patch, I'll not point out each one.
>
> Should be all fixed now (hopefully I didn't miss any), with the exception of the
> `repr_pattern` strings in `py-breakpoint.exp`, which I couldn't for the life of 
> me get to match properly with the output were they not on a single line.
>
> ---
>  gdb/python/py-arch.c                       | 18 +++++-
>  gdb/python/py-block.c                      | 27 ++++++++-
>  gdb/python/py-breakpoint.c                 | 68 +++++++++++++++++++++-
>  gdb/python/py-symbol.c                     | 16 ++++-
>  gdb/python/py-type.c                       | 30 +++++++++-
>  gdb/testsuite/gdb.python/py-arch.exp       |  6 ++
>  gdb/testsuite/gdb.python/py-block.exp      |  4 +-
>  gdb/testsuite/gdb.python/py-breakpoint.exp | 24 ++++----
>  gdb/testsuite/gdb.python/py-symbol.exp     |  2 +
>  gdb/testsuite/gdb.python/py-type.exp       |  4 ++
>  10 files changed, 181 insertions(+), 18 deletions(-)
>
> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index cf0978560f9..5384a0d0d0c 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -319,6 +319,22 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
>    return type_to_type_object (type);
>  }
>  
> +/* __repr__ implementation for gdb.Architecture.  */
> +
> +static PyObject *
> +archpy_repr (PyObject *self)
> +{
> +  const auto gdbarch = arch_object_to_gdbarch (self);
> +  if (gdbarch == nullptr)
> +    return PyUnicode_FromFormat
> +      ("<gdb.Architecture (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Architecture arch_name=%s printable_name=%s>",
> +     gdbarch_bfd_arch_info (gdbarch)->arch_name,
> +     gdbarch_bfd_arch_info (gdbarch)->printable_name);
> +}
> +
>  /* Implementation of gdb.architecture_names().  Return a list of all the
>     BFD architecture names that GDB understands.  */
>  
> @@ -391,7 +407,7 @@ PyTypeObject arch_object_type = {
>    0,                                  /* tp_getattr */
>    0,                                  /* tp_setattr */
>    0,                                  /* tp_compare */
> -  0,                                  /* tp_repr */
> +  archpy_repr,                        /* tp_repr */
>    0,                                  /* tp_as_number */
>    0,                                  /* tp_as_sequence */
>    0,                                  /* tp_as_mapping */
> diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
> index b9aea3aca69..b4c55add765 100644
> --- a/gdb/python/py-block.c
> +++ b/gdb/python/py-block.c
> @@ -424,6 +424,31 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
>    Py_RETURN_TRUE;
>  }
>  
> +/* __repr__ implementation for gdb.Block.  */
> +
> +static PyObject *
> +blpy_repr (PyObject *self)
> +{
> +  const auto block = block_object_to_block (self);
> +  if (block == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Block (invalid)>");
> +
> +  const auto name = block->function () ?
> +    block->function ()->print_name () : "<anonymous>";
> +
> +  block_iterator iter;
> +  block_iterator_first (block, &iter);
> +
> +  std::string str;
> +  const struct symbol *symbol;
> +  while ((symbol = block_iterator_next (&iter)) != nullptr)

You should be using ALL_BLOCK_SYMBOLS here rather than calling the
block_iterator_* functions directly.  As it's currently written this
code will crash on a block with no symbols.

I've included a fix-up patch at the end of this email that both fixes
this issue, and extended the existing test to check this case.  Please
feel free to merge this with your work.

> +    str = (str + "\n") + symbol->print_name () + ",";

I don't object to including all the symbol names as you've done.  But
did you consider that some blocks could have a _lot_ of symbols?

As an alternative did you consider just counting the symbols, and
including the count?

This isn't a requirement, if you feel the symbol list is going to be
more useful then I don't feel strongly enough, but I just wanted to
ask.

> +  if(!str.empty ())
> +    str += "\n";
> +
> +  return PyUnicode_FromFormat ("<gdb.Block %s {%s}>", name, str.c_str ());
> +}
> +
>  int
>  gdbpy_initialize_blocks (void)
>  {
> @@ -486,7 +511,7 @@ PyTypeObject block_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  blpy_repr,                     /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    &block_object_as_mapping,	  /*tp_as_mapping*/
> diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
> index de7b9f4266b..d68a205330c 100644
> --- a/gdb/python/py-breakpoint.c
> +++ b/gdb/python/py-breakpoint.c
> @@ -33,6 +33,7 @@
>  #include "location.h"
>  #include "py-event.h"
>  #include "linespec.h"
> +#include "gdbsupport/common-utils.h"
>  
>  extern PyTypeObject breakpoint_location_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
> @@ -967,6 +968,31 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
>    return 0;
>  }
>  
> +/* __repr__ implementation for gdb.Breakpoint.  */
> +
> +static PyObject *
> +bppy_repr (PyObject *self)
> +{
> +  const auto bp = (struct gdbpy_breakpoint_object*) self;
> +  if (bp->bp == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Breakpoint (invalid)>");
> +
> +  std::string str = " ";
> +  if (bp->bp->thread != -1)
> +    str += string_printf ("thread=%d ", bp->bp->thread);
> +  if (bp->bp->task > 0)
> +    str += string_printf ("task=%d ", bp->bp->task);
> +  if (bp->bp->enable_count > 0)
> +    str += string_printf ("enable_count=%d ", bp->bp->enable_count);
> +  str.pop_back ();
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Breakpoint number=%d hits=%d%s>",
> +     bp->bp->number,
> +     bp->bp->hit_count,
> +     str.c_str ());
> +}
> +
>  /* Append to LIST the breakpoint Python object associated to B.
>  
>     Return true on success.  Return false on failure, with the Python error
> @@ -1389,7 +1415,7 @@ PyTypeObject breakpoint_object_type =
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  bppy_repr,                     /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> @@ -1604,6 +1630,44 @@ bplocpy_dealloc (PyObject *py_self)
>    Py_TYPE (py_self)->tp_free (py_self);
>  }
>  
> +/* __repr__ implementation for gdb.BreakpointLocation.  */
> +
> +static PyObject *
> +bplocpy_repr (PyObject *py_self)
> +{
> +  const auto self = (gdbpy_breakpoint_location_object *) py_self;
> +  if (self->owner == nullptr || self->owner->bp == nullptr
> +    || self->owner->bp != self->bp_loc->owner)
> +    return PyUnicode_FromFormat ("<gdb.BreakpointLocation (invalid)>");
> +
> +  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
> +
> +  std::string str(enabled);
> +
> +  str += " requested_address=0x";
> +  str += string_printf ("%lx", self->bp_loc->requested_address);
> +
> +  str += " address=0x";
> +  str += string_printf ("%lx", self->bp_loc->address);

I'd suggest placing 'address' first, and only include
'requested_address' if it's different to 'address'.  In almost all cases
these two fields will be the same, so this would cut down on some
clutter.

Additionally, you should use core_addr_to_string_nz to format the
CORE_ADDR, like this:

  str += " address=";
  str += string_printf ("%s", core_addr_to_string_nz (self->bp_loc->address));


> +
> +  if (self->bp_loc->symtab != nullptr)
> +  {
> +    str += " source=";
> +    str += self->bp_loc->symtab->filename;
> +    str += ":";
> +    str += string_printf ("%d", self->bp_loc->line_number);
> +  }

The GNU style is to indent the opening and closing curly braces by an
additional 2 spaces, and then everything within the block gets 2 extra
spaces beyond that.

> +
> +  const auto fn_name = self->bp_loc->function_name.get ();
> +  if (fn_name != nullptr)
> +  {
> +    str += " in ";
> +    str += fn_name;
> +  }

Indentation again here.

> +
> +  return PyUnicode_FromFormat ("<gdb.BreakpointLocation %s>", str.c_str ());
> +}
> +
>  /* Attribute get/set Python definitions. */
>  
>  static gdb_PyGetSetDef bp_location_object_getset[] = {
> @@ -1635,7 +1699,7 @@ PyTypeObject breakpoint_location_object_type =
>    0,					/*tp_getattr*/
>    0,					/*tp_setattr*/
>    0,					/*tp_compare*/
> -  0,					/*tp_repr*/
> +  bplocpy_repr,                        /*tp_repr*/
>    0,					/*tp_as_number*/
>    0,					/*tp_as_sequence*/
>    0,					/*tp_as_mapping*/
> diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
> index 93c86964f3e..5a8149bbe66 100644
> --- a/gdb/python/py-symbol.c
> +++ b/gdb/python/py-symbol.c
> @@ -375,6 +375,20 @@ sympy_dealloc (PyObject *obj)
>    Py_TYPE (obj)->tp_free (obj);
>  }
>  
> +/* __repr__ implementation for gdb.Symbol.  */
> +
> +static PyObject *
> +sympy_repr (PyObject *self)
> +{
> +  const auto symbol = symbol_object_to_symbol (self);
> +  if (symbol == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Symbol (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Symbol print_name=%s>",
> +     symbol->print_name ());
> +}
> +
>  /* Implementation of
>     gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
>     A tuple with 2 elements is always returned.  The first is the symbol
> @@ -732,7 +746,7 @@ PyTypeObject symbol_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  sympy_repr,                    /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
> index 928efacfe8a..eb11ef029ca 100644
> --- a/gdb/python/py-type.c
> +++ b/gdb/python/py-type.c
> @@ -1026,6 +1026,34 @@ typy_template_argument (PyObject *self, PyObject *args)
>    return value_to_value_object (val);
>  }
>  
> +/* __repr__ implementation for gdb.Type.  */
> +
> +static PyObject *
> +typy_repr (PyObject *self)
> +{
> +  const auto type = type_object_to_type (self);
> +  if (type == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Type (invalid)>");
> +
> +  const char *code = pyty_codes[type->code ()].name;
> +  string_file type_name;
> +  try
> +    {
> +      current_language->print_type (type, "",
> +				    &type_name, -1, 0,
> +				    &type_print_raw_options);
> +    }
> +  catch (const gdb_exception &except)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (except);
> +    }
> +  auto py_typename = PyUnicode_Decode
> +    (type_name.c_str (), type_name.size (),
> +		 host_charset (), NULL);

I think there's enough space to start the arguments on the
PyUnicode_Decode line (i.e. no line break before '(' char).  The
'host_charset ()' would still be the start of a second line, and should
align under 'type_name':

  auto py_typename = PyUnicode_Decode (type_name.c_str (), type_name.size (),
				       host_charset (), NULL);

> +	

You still have some trailing whitespace on this ^^^ line.

> +  return PyUnicode_FromFormat ("<gdb.Type code=%s name=%U>", code, py_typename);
> +}
> +
>  static PyObject *
>  typy_str (PyObject *self)
>  {
> @@ -1612,7 +1640,7 @@ PyTypeObject type_object_type =
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  typy_repr,                     /*tp_repr*/
>    &type_object_as_number,	  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    &typy_mapping,		  /*tp_as_mapping*/
> diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
> index 1fbbc47c872..d436c957e25 100644
> --- a/gdb/testsuite/gdb.python/py-arch.exp
> +++ b/gdb/testsuite/gdb.python/py-arch.exp
> @@ -29,6 +29,8 @@ if ![runto_main] {
>  # Test python/15461.  Invalid architectures should not trigger an
>  # internal GDB assert.
>  gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
> +gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
> +    "Test empty achitecture __repr__ does not trigger an assert"
>  gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
>      "Test empty architecture.name does not trigger an assert"
>  gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
> @@ -46,6 +48,10 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
>  gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
>    "disassemble no end no count" 0
>  
> +gdb_test "python print (repr (arch))" \
> +    "<gdb.Architecture arch_name=.* printable_name=.*>" \
> +    "test __repr__ for architecture"
> +
>  gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
>  gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
>  gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
> diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
> index 0a88aec56a0..5e3d1c72d5e 100644
> --- a/gdb/testsuite/gdb.python/py-block.exp
> +++ b/gdb/testsuite/gdb.python/py-block.exp
> @@ -39,7 +39,7 @@ gdb_continue_to_breakpoint "Block break here."
>  gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
>  gdb_py_test_silent_cmd "python block = frame.block()" \
>      "Get block, initial innermost block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
> +gdb_test "python print (block)" "<gdb.Block .* \{.*\}>" "check block not None"

I think I'd be tempted to at least replace the first '.*' with the
function name we expect to see.  Might as well make the test as accurate
as possible.

>  gdb_test "python print (block.function)" "None" "first anonymous block"
>  gdb_test "python print (block.start)" "${decimal}" "check start not None"
>  gdb_test "python print (block.end)" "${decimal}" "check end not None"
> @@ -73,7 +73,7 @@ gdb_test "python print (block.function)" "block_func" \
>  gdb_test "up" ".*"
>  gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
>  gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" \
> +gdb_test "python print (repr (block))" "<gdb.Block .* \{.*\}>" \

Same again where with the function name.

>           "Check Frame 2's block not None"
>  gdb_test "python print (block.function)" "main" "main block"
>  
> diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
> index e36e87dc291..4da46431a3a 100644
> --- a/gdb/testsuite/gdb.python/py-breakpoint.exp
> +++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
> @@ -50,11 +50,13 @@ proc_with_prefix test_bkpt_basic { } {
>  	return 0
>      }
>  
> +    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
> +
>      # Now there should be one breakpoint: main.
>      gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main"
>      gdb_test "python print (blist\[0\].location)" \
>  	"main." "Check breakpoint location @main"
>      gdb_test "python print (blist\[0\].pending)" "False" \
> @@ -71,12 +73,12 @@ proc_with_prefix test_bkpt_basic { } {
>  	"Get Breakpoint List" 0
>      gdb_test "python print (len(blist))" \
>  	"2" "Check for two breakpoints"
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main 2"
>      gdb_test "python print (blist\[0\].location)" \
>  	"main." "Check breakpoint location @main 2"
> -    gdb_test "python print (blist\[1\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
> +    gdb_test "python print (repr (blist\[1\]))" \
> +	"$repr_pattern" "Check obj exists @mult_line"
>  
>      gdb_test "python print (blist\[1\].location)" \
>  	"py-breakpoint\.c:${mult_line}*" \
> @@ -224,14 +226,16 @@ proc_with_prefix test_bkpt_invisible { } {
>  	return 0
>      }
>  
> +    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
> +
>      delete_breakpoints
>      set ibp_location [gdb_get_line_number "Break at multiply."]
>      gdb_py_test_silent_cmd  "python ibp = gdb.Breakpoint(\"$ibp_location\", internal=False)" \
>  	"Set invisible breakpoint" 0
>      gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 1"
>      gdb_test "python print (ilist\[0\].location)" \
>  	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
>      gdb_test "python print (ilist\[0\].visible)" \
> @@ -243,8 +247,8 @@ proc_with_prefix test_bkpt_invisible { } {
>  	"Set invisible breakpoint" 0
>      gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 2"
>      gdb_test "python print (ilist\[0\].location)" \
>  	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
>      gdb_test "python print (ilist\[0\].visible)" \
> diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
> index ad06b07c2c6..979b7dfb8fb 100644
> --- a/gdb/testsuite/gdb.python/py-symbol.exp
> +++ b/gdb/testsuite/gdb.python/py-symbol.exp
> @@ -44,6 +44,8 @@ clean_restart ${binfile}
>  # point where we don't have a current frame, and we don't want to
>  # require one.
>  gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
> +gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=.*>" \

I think replace '.*' with 'main' here.

> +    "test main_func.__repr__"
>  gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
>  gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
>  
> diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
> index 594c9749d8e..95cdfa54a6e 100644
> --- a/gdb/testsuite/gdb.python/py-type.exp
> +++ b/gdb/testsuite/gdb.python/py-type.exp
> @@ -393,3 +393,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
>        test_type_equality
>    }
>  }
> +
> +# Test __repr__()

Missing '.' at the end of the comment.

> +gdb_test "python print (repr (gdb.lookup_type ('char')))" \
> +      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"

Thanks
Andrew

---

diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
index a9d45b75dc9..c642be4208d 100644
--- a/gdb/python/py-block.c
+++ b/gdb/python/py-block.c
@@ -437,11 +437,10 @@ blpy_repr (PyObject *self)
     block->function ()->print_name () : "<anonymous>";
 
   block_iterator iter;
-  block_iterator_first (block, &iter);
-
-  std::string str;
   const struct symbol *symbol;
-  while ((symbol = block_iterator_next (&iter)) != nullptr)
+  std::string str;
+
+  ALL_BLOCK_SYMBOLS (block, iter, symbol)
     str = (str + "\n") + symbol->print_name () + ",";
   if(!str.empty ())
     str += "\n";
diff --git a/gdb/testsuite/gdb.python/py-block.c b/gdb/testsuite/gdb.python/py-block.c
index a0c6e165605..4c503827347 100644
--- a/gdb/testsuite/gdb.python/py-block.c
+++ b/gdb/testsuite/gdb.python/py-block.c
@@ -30,9 +30,14 @@ int block_func (void)
   }
 }
 
+int
+no_locals_func (void)
+{
+  return block_func ();
+}
 
 int main (int argc, char *argv[])
 {
-  block_func ();
+  no_locals_func ();
   return 0; /* Break at end. */
 }
diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
index 5cca798eeb3..3ac8c97dc69 100644
--- a/gdb/testsuite/gdb.python/py-block.exp
+++ b/gdb/testsuite/gdb.python/py-block.exp
@@ -68,15 +68,22 @@ gdb_test_no_output "python block = block.superblock" "get superblock 2"
 gdb_test "python print (block.function)" "block_func" \
          "Print superblock 2 function"
 
+gdb_test "up" ".*" "up to no_locals_func"
+gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
+gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
+gdb_test "python print (repr (block))" "<gdb.Block no_locals_func \{\}>" \
+    "Check block in empty_frame"
+gdb_test "python print (block.function)" "no_locals_func" \
+    "no_locals_func block"
+
 # Switch frames, then test for main block.
-gdb_test "up" ".*"
+gdb_test "up" ".*" "up to main"
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
 gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
 gdb_test "python print (repr (block))" "<gdb.Block .* \{.*\}>" \
          "Check Frame 2's block not None"
 gdb_test "python print (block.function)" "main" "main block"
 
-
 # Test Block is_valid.  This must always be the last test in this
 # testcase as it unloads the object file.
 delete_breakpoints


^ permalink raw reply	[relevance 5%]

* [PATCH] Add __repr__() implementation to a few Python types
  2023-01-18 18:02  6%   ` Andrew Burgess
@ 2023-01-20  1:43  3%     ` Matheus Branco Borella
  2023-01-20 16:45  5%       ` Andrew Burgess
  2023-01-24 14:45  7%       ` Andrew Burgess
  0 siblings, 2 replies; 65+ results
From: Matheus Branco Borella @ 2023-01-20  1:43 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella

Only a few types in the Python API currently have __repr__() implementations.
This patch adds a few more of them. specifically: it adds __repr__()
implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
and gdb.Type.

This makes it easier to play around the GDB Python API in the Python interpreter
session invoked with the 'pi' command in GDB, giving more easily accessible tipe
information to users.

An example of how this would look like:
```
(gdb) pi
>> gdb.lookup_type("char")
<gdb.Type code=TYPE_CODE_INT name=char>
>> gdb.lookup_global_symbol("main")
<gdb.Symbol print_name=main>
```

> Sorry for being a little slow.  What does this actually mean?  When you
> say "makes use of u8 string literals" - does this mean you have string
> literals in this patch containing non ASCII characters?
>
> I've trying to understand why this is different to any other part of GDB
> that prints stuff via Python.

I forgot to take that out of the commit message, my bad. Originally, I'd 
intended for the string literals in the patch that get handed to Python to be
all u8 literals so that I could guarantee it wouldn't break in an environment
that doesn't output regular string literals in an ASCII-compatible encoding,
as Python expects all strings handed to it to be encoded in UTF-8. But seeing
as all of the rest of the Python interface code uses regular string literals, 
I figured it wouldn't make much of difference having them in anyway.

> I guess I was surprised that so many of the new tests included an
> explicit call to repr, given the premise of the change was that simply
> 'print(empty)' would now print something useful.
>
> I guess maybe it doesn't hurt to _also_ include some explicit repr
> calls, but I was expecting most tests to just be printing the object
> directly.
As blarsen@ also pointed out, `print`-ing an object directly that has an 
implmentation of __str__() will print whatever its __str__() functions returns, 
regardless of whether it implements __repr__() or not, which is not what we want 
here. __repr__() is always preferred in the REPL, though, so it's understandable 
it might not be clear at first why I'm calling `repr()` explicitly.

> Over long line, please wrap a little.  There's other long lines in your
> patch, I'll not point out each one.

Should be all fixed now (hopefully I didn't miss any), with the exception of the
`repr_pattern` strings in `py-breakpoint.exp`, which I couldn't for the life of 
me get to match properly with the output were they not on a single line.

---
 gdb/python/py-arch.c                       | 18 +++++-
 gdb/python/py-block.c                      | 27 ++++++++-
 gdb/python/py-breakpoint.c                 | 68 +++++++++++++++++++++-
 gdb/python/py-symbol.c                     | 16 ++++-
 gdb/python/py-type.c                       | 30 +++++++++-
 gdb/testsuite/gdb.python/py-arch.exp       |  6 ++
 gdb/testsuite/gdb.python/py-block.exp      |  4 +-
 gdb/testsuite/gdb.python/py-breakpoint.exp | 24 ++++----
 gdb/testsuite/gdb.python/py-symbol.exp     |  2 +
 gdb/testsuite/gdb.python/py-type.exp       |  4 ++
 10 files changed, 181 insertions(+), 18 deletions(-)

diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index cf0978560f9..5384a0d0d0c 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -319,6 +319,22 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
   return type_to_type_object (type);
 }
 
+/* __repr__ implementation for gdb.Architecture.  */
+
+static PyObject *
+archpy_repr (PyObject *self)
+{
+  const auto gdbarch = arch_object_to_gdbarch (self);
+  if (gdbarch == nullptr)
+    return PyUnicode_FromFormat
+      ("<gdb.Architecture (invalid)>");
+
+  return PyUnicode_FromFormat
+    ("<gdb.Architecture arch_name=%s printable_name=%s>",
+     gdbarch_bfd_arch_info (gdbarch)->arch_name,
+     gdbarch_bfd_arch_info (gdbarch)->printable_name);
+}
+
 /* Implementation of gdb.architecture_names().  Return a list of all the
    BFD architecture names that GDB understands.  */
 
@@ -391,7 +407,7 @@ PyTypeObject arch_object_type = {
   0,                                  /* tp_getattr */
   0,                                  /* tp_setattr */
   0,                                  /* tp_compare */
-  0,                                  /* tp_repr */
+  archpy_repr,                        /* tp_repr */
   0,                                  /* tp_as_number */
   0,                                  /* tp_as_sequence */
   0,                                  /* tp_as_mapping */
diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
index b9aea3aca69..b4c55add765 100644
--- a/gdb/python/py-block.c
+++ b/gdb/python/py-block.c
@@ -424,6 +424,31 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
   Py_RETURN_TRUE;
 }
 
+/* __repr__ implementation for gdb.Block.  */
+
+static PyObject *
+blpy_repr (PyObject *self)
+{
+  const auto block = block_object_to_block (self);
+  if (block == nullptr)
+    return PyUnicode_FromFormat ("<gdb.Block (invalid)>");
+
+  const auto name = block->function () ?
+    block->function ()->print_name () : "<anonymous>";
+
+  block_iterator iter;
+  block_iterator_first (block, &iter);
+
+  std::string str;
+  const struct symbol *symbol;
+  while ((symbol = block_iterator_next (&iter)) != nullptr)
+    str = (str + "\n") + symbol->print_name () + ",";
+  if(!str.empty ())
+    str += "\n";
+
+  return PyUnicode_FromFormat ("<gdb.Block %s {%s}>", name, str.c_str ());
+}
+
 int
 gdbpy_initialize_blocks (void)
 {
@@ -486,7 +511,7 @@ PyTypeObject block_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  blpy_repr,                     /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &block_object_as_mapping,	  /*tp_as_mapping*/
diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
index de7b9f4266b..d68a205330c 100644
--- a/gdb/python/py-breakpoint.c
+++ b/gdb/python/py-breakpoint.c
@@ -33,6 +33,7 @@
 #include "location.h"
 #include "py-event.h"
 #include "linespec.h"
+#include "gdbsupport/common-utils.h"
 
 extern PyTypeObject breakpoint_location_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
@@ -967,6 +968,31 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
   return 0;
 }
 
+/* __repr__ implementation for gdb.Breakpoint.  */
+
+static PyObject *
+bppy_repr (PyObject *self)
+{
+  const auto bp = (struct gdbpy_breakpoint_object*) self;
+  if (bp->bp == nullptr)
+    return PyUnicode_FromFormat ("<gdb.Breakpoint (invalid)>");
+
+  std::string str = " ";
+  if (bp->bp->thread != -1)
+    str += string_printf ("thread=%d ", bp->bp->thread);
+  if (bp->bp->task > 0)
+    str += string_printf ("task=%d ", bp->bp->task);
+  if (bp->bp->enable_count > 0)
+    str += string_printf ("enable_count=%d ", bp->bp->enable_count);
+  str.pop_back ();
+
+  return PyUnicode_FromFormat
+    ("<gdb.Breakpoint number=%d hits=%d%s>",
+     bp->bp->number,
+     bp->bp->hit_count,
+     str.c_str ());
+}
+
 /* Append to LIST the breakpoint Python object associated to B.
 
    Return true on success.  Return false on failure, with the Python error
@@ -1389,7 +1415,7 @@ PyTypeObject breakpoint_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  bppy_repr,                     /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
@@ -1604,6 +1630,44 @@ bplocpy_dealloc (PyObject *py_self)
   Py_TYPE (py_self)->tp_free (py_self);
 }
 
+/* __repr__ implementation for gdb.BreakpointLocation.  */
+
+static PyObject *
+bplocpy_repr (PyObject *py_self)
+{
+  const auto self = (gdbpy_breakpoint_location_object *) py_self;
+  if (self->owner == nullptr || self->owner->bp == nullptr
+    || self->owner->bp != self->bp_loc->owner)
+    return PyUnicode_FromFormat ("<gdb.BreakpointLocation (invalid)>");
+
+  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
+
+  std::string str(enabled);
+
+  str += " requested_address=0x";
+  str += string_printf ("%lx", self->bp_loc->requested_address);
+
+  str += " address=0x";
+  str += string_printf ("%lx", self->bp_loc->address);
+
+  if (self->bp_loc->symtab != nullptr)
+  {
+    str += " source=";
+    str += self->bp_loc->symtab->filename;
+    str += ":";
+    str += string_printf ("%d", self->bp_loc->line_number);
+  }
+
+  const auto fn_name = self->bp_loc->function_name.get ();
+  if (fn_name != nullptr)
+  {
+    str += " in ";
+    str += fn_name;
+  }
+
+  return PyUnicode_FromFormat ("<gdb.BreakpointLocation %s>", str.c_str ());
+}
+
 /* Attribute get/set Python definitions. */
 
 static gdb_PyGetSetDef bp_location_object_getset[] = {
@@ -1635,7 +1699,7 @@ PyTypeObject breakpoint_location_object_type =
   0,					/*tp_getattr*/
   0,					/*tp_setattr*/
   0,					/*tp_compare*/
-  0,					/*tp_repr*/
+  bplocpy_repr,                        /*tp_repr*/
   0,					/*tp_as_number*/
   0,					/*tp_as_sequence*/
   0,					/*tp_as_mapping*/
diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
index 93c86964f3e..5a8149bbe66 100644
--- a/gdb/python/py-symbol.c
+++ b/gdb/python/py-symbol.c
@@ -375,6 +375,20 @@ sympy_dealloc (PyObject *obj)
   Py_TYPE (obj)->tp_free (obj);
 }
 
+/* __repr__ implementation for gdb.Symbol.  */
+
+static PyObject *
+sympy_repr (PyObject *self)
+{
+  const auto symbol = symbol_object_to_symbol (self);
+  if (symbol == nullptr)
+    return PyUnicode_FromFormat ("<gdb.Symbol (invalid)>");
+
+  return PyUnicode_FromFormat
+    ("<gdb.Symbol print_name=%s>",
+     symbol->print_name ());
+}
+
 /* Implementation of
    gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
    A tuple with 2 elements is always returned.  The first is the symbol
@@ -732,7 +746,7 @@ PyTypeObject symbol_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  sympy_repr,                    /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
index 928efacfe8a..eb11ef029ca 100644
--- a/gdb/python/py-type.c
+++ b/gdb/python/py-type.c
@@ -1026,6 +1026,34 @@ typy_template_argument (PyObject *self, PyObject *args)
   return value_to_value_object (val);
 }
 
+/* __repr__ implementation for gdb.Type.  */
+
+static PyObject *
+typy_repr (PyObject *self)
+{
+  const auto type = type_object_to_type (self);
+  if (type == nullptr)
+    return PyUnicode_FromFormat ("<gdb.Type (invalid)>");
+
+  const char *code = pyty_codes[type->code ()].name;
+  string_file type_name;
+  try
+    {
+      current_language->print_type (type, "",
+				    &type_name, -1, 0,
+				    &type_print_raw_options);
+    }
+  catch (const gdb_exception &except)
+    {
+      GDB_PY_HANDLE_EXCEPTION (except);
+    }
+  auto py_typename = PyUnicode_Decode
+    (type_name.c_str (), type_name.size (),
+		 host_charset (), NULL);
+	
+  return PyUnicode_FromFormat ("<gdb.Type code=%s name=%U>", code, py_typename);
+}
+
 static PyObject *
 typy_str (PyObject *self)
 {
@@ -1612,7 +1640,7 @@ PyTypeObject type_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  typy_repr,                     /*tp_repr*/
   &type_object_as_number,	  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &typy_mapping,		  /*tp_as_mapping*/
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index 1fbbc47c872..d436c957e25 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -29,6 +29,8 @@ if ![runto_main] {
 # Test python/15461.  Invalid architectures should not trigger an
 # internal GDB assert.
 gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
+gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
+    "Test empty achitecture __repr__ does not trigger an assert"
 gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
     "Test empty architecture.name does not trigger an assert"
 gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
@@ -46,6 +48,10 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
 gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
   "disassemble no end no count" 0
 
+gdb_test "python print (repr (arch))" \
+    "<gdb.Architecture arch_name=.* printable_name=.*>" \
+    "test __repr__ for architecture"
+
 gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
 gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
 gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
index 0a88aec56a0..5e3d1c72d5e 100644
--- a/gdb/testsuite/gdb.python/py-block.exp
+++ b/gdb/testsuite/gdb.python/py-block.exp
@@ -39,7 +39,7 @@ gdb_continue_to_breakpoint "Block break here."
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
 gdb_py_test_silent_cmd "python block = frame.block()" \
     "Get block, initial innermost block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
+gdb_test "python print (block)" "<gdb.Block .* \{.*\}>" "check block not None"
 gdb_test "python print (block.function)" "None" "first anonymous block"
 gdb_test "python print (block.start)" "${decimal}" "check start not None"
 gdb_test "python print (block.end)" "${decimal}" "check end not None"
@@ -73,7 +73,7 @@ gdb_test "python print (block.function)" "block_func" \
 gdb_test "up" ".*"
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
 gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" \
+gdb_test "python print (repr (block))" "<gdb.Block .* \{.*\}>" \
          "Check Frame 2's block not None"
 gdb_test "python print (block.function)" "main" "main block"
 
diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
index e36e87dc291..4da46431a3a 100644
--- a/gdb/testsuite/gdb.python/py-breakpoint.exp
+++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
@@ -50,11 +50,13 @@ proc_with_prefix test_bkpt_basic { } {
 	return 0
     }
 
+    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
+
     # Now there should be one breakpoint: main.
     gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
+    gdb_test "python print (repr (blist\[0\]))" \
+	"$repr_pattern" "Check obj exists @main"
     gdb_test "python print (blist\[0\].location)" \
 	"main." "Check breakpoint location @main"
     gdb_test "python print (blist\[0\].pending)" "False" \
@@ -71,12 +73,12 @@ proc_with_prefix test_bkpt_basic { } {
 	"Get Breakpoint List" 0
     gdb_test "python print (len(blist))" \
 	"2" "Check for two breakpoints"
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
+    gdb_test "python print (repr (blist\[0\]))" \
+	"$repr_pattern" "Check obj exists @main 2"
     gdb_test "python print (blist\[0\].location)" \
 	"main." "Check breakpoint location @main 2"
-    gdb_test "python print (blist\[1\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
+    gdb_test "python print (repr (blist\[1\]))" \
+	"$repr_pattern" "Check obj exists @mult_line"
 
     gdb_test "python print (blist\[1\].location)" \
 	"py-breakpoint\.c:${mult_line}*" \
@@ -224,14 +226,16 @@ proc_with_prefix test_bkpt_invisible { } {
 	return 0
     }
 
+    set repr_pattern "<gdb.Breakpoint number=-?$decimal hits=-?$decimal\( thread=$decimal\)?\( task=$decimal\)?\( enable_count=$decimal\)?>"
+
     delete_breakpoints
     set ibp_location [gdb_get_line_number "Break at multiply."]
     gdb_py_test_silent_cmd  "python ibp = gdb.Breakpoint(\"$ibp_location\", internal=False)" \
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	"$repr_pattern" "Check invisible bp obj exists 1"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
     gdb_test "python print (ilist\[0\].visible)" \
@@ -243,8 +247,8 @@ proc_with_prefix test_bkpt_invisible { } {
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	"$repr_pattern" "Check invisible bp obj exists 2"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
     gdb_test "python print (ilist\[0\].visible)" \
diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
index ad06b07c2c6..979b7dfb8fb 100644
--- a/gdb/testsuite/gdb.python/py-symbol.exp
+++ b/gdb/testsuite/gdb.python/py-symbol.exp
@@ -44,6 +44,8 @@ clean_restart ${binfile}
 # point where we don't have a current frame, and we don't want to
 # require one.
 gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
+gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=.*>" \
+    "test main_func.__repr__"
 gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
 gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
 
diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
index 594c9749d8e..95cdfa54a6e 100644
--- a/gdb/testsuite/gdb.python/py-type.exp
+++ b/gdb/testsuite/gdb.python/py-type.exp
@@ -393,3 +393,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
       test_type_equality
   }
 }
+
+# Test __repr__()
+gdb_test "python print (repr (gdb.lookup_type ('char')))" \
+      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"
-- 
2.37.3.windows.1


^ permalink raw reply	[relevance 3%]

* Re: [PATCH] Add __repr__() implementation to a few Python types
  2023-01-11  0:35  3% ` [PATCH] Add __repr__() implementation to a few Python types Matheus Branco Borella
  2023-01-18 17:05  7%   ` Bruno Larsen
@ 2023-01-18 18:02  6%   ` Andrew Burgess
  2023-01-20  1:43  3%     ` Matheus Branco Borella
  1 sibling, 1 reply; 65+ results
From: Andrew Burgess @ 2023-01-18 18:02 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches, gdb-patches
  Cc: Matheus Branco Borella

Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org>
writes:

> Only a few types in the Python API currently have __repr__() implementations.
> This patch adds a few more of them. specifically: it adds __repr__()
> implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
> and gdb.Type.
>
> This makes it easier to play around the GDB Python API in the Python interpreter
> session invoked with the 'pi' command in GDB, giving more easily accessible tipe
> information to users.
>
> An example of how this would look like:
> ```
> (gdb) pi
>>> gdb.lookup_type("char")
> <gdb.Type code=TYPE_CODE_INT name=char>
>>> gdb.lookup_global_symbol("main")
> <gdb.Symbol print_name=main>
> ```

Thanks for working on this, I think this will be really useful.

>
> One thing to note about this patch is that it makes use of u8 string literals,
> so as to make sure we meet python's expectations of strings passed to it using
> PyUnicode_FromFormat being encoded in utf8. This should remove the chance of
> odd compilation environments spitting out strings Python would consider invalid
> for the function we're calling.

Sorry for being a little slow.  What does this actually mean?  When you
say "makes use of u8 string literals" - does this mean you have string
literals in this patch containing non ASCII characters?

I've trying to understand why this is different to any other part of GDB
that prints stuff via Python.

> ---
>  gdb/python/py-arch.c                       | 18 +++++++-
>  gdb/python/py-block.c                      | 30 ++++++++++++-
>  gdb/python/py-breakpoint.c                 | 49 +++++++++++++++++++++-
>  gdb/python/py-symbol.c                     | 16 ++++++-
>  gdb/python/py-type.c                       | 31 +++++++++++++-
>  gdb/testsuite/gdb.python/py-arch.exp       |  4 ++
>  gdb/testsuite/gdb.python/py-block.exp      |  4 +-
>  gdb/testsuite/gdb.python/py-breakpoint.exp | 26 +++++++-----
>  gdb/testsuite/gdb.python/py-symbol.exp     |  1 +
>  gdb/testsuite/gdb.python/py-type.exp       |  4 ++
>  10 files changed, 165 insertions(+), 18 deletions(-)
>
> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index cf0978560f9..5384a0d0d0c 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -319,6 +319,22 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
>    return type_to_type_object (type);
>  }
>  
> +/* __repr__ implementation for gdb.Architecture.  */
> +
> +static PyObject *
> +archpy_repr (PyObject *self)
> +{
> +  const auto gdbarch = arch_object_to_gdbarch (self);
> +  if (gdbarch == nullptr)
> +    return PyUnicode_FromFormat
> +      ("<gdb.Architecture (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Architecture arch_name=%s printable_name=%s>",
> +     gdbarch_bfd_arch_info (gdbarch)->arch_name,
> +     gdbarch_bfd_arch_info (gdbarch)->printable_name);
> +}
> +
>  /* Implementation of gdb.architecture_names().  Return a list of all the
>     BFD architecture names that GDB understands.  */
>  
> @@ -391,7 +407,7 @@ PyTypeObject arch_object_type = {
>    0,                                  /* tp_getattr */
>    0,                                  /* tp_setattr */
>    0,                                  /* tp_compare */
> -  0,                                  /* tp_repr */
> +  archpy_repr,                        /* tp_repr */
>    0,                                  /* tp_as_number */
>    0,                                  /* tp_as_sequence */
>    0,                                  /* tp_as_mapping */
> diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
> index b9aea3aca69..1b8433d41e7 100644
> --- a/gdb/python/py-block.c
> +++ b/gdb/python/py-block.c
> @@ -23,6 +23,7 @@
>  #include "symtab.h"
>  #include "python-internal.h"
>  #include "objfiles.h"
> +#include <sstream>
>  
>  struct block_object {
>    PyObject_HEAD
> @@ -424,6 +425,33 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
>    Py_RETURN_TRUE;
>  }
>  
> +/* __repr__ implementation for gdb.Block.  */
> +
> +static PyObject *
> +blpy_repr (PyObject *self)
> +{
> +  const auto block = block_object_to_block (self);
> +  if (block == nullptr)
> +    return PyUnicode_FromFormat("<gdb.Block (invalid)>");

Missing space before '('.  I think Bruno pointed out more of these, so
I'll not both pointing out the others I see.

> +
> +  const auto name = block->function () ? block->function ()->print_name () : "<anonymous>";

This line is too long and needs to be wrapped to under 80 characters.

> +
> +  block_iterator iter;
> +  block_iterator_first (block, &iter);
> +
> +  std::stringstream ss;
> +  const struct symbol *symbol;
> +  while ((symbol = block_iterator_next (&iter)) != nullptr)
> +  {
> +    ss << std::endl;
> +    ss << symbol->print_name () << ",";
> +  }
> +  if(!ss.str ().empty ())
> +    ss << std::endl;

We don't make much use of std::stringstream in GDB, and unless there's a
compelling reason, then for consistency, I'd prefer to see this written
using std::string.  While playing with this I rewrote this as:

  std::string str;
  const struct symbol *symbol;
  while ((symbol = block_iterator_next (&iter)) != nullptr)
    str = (str + "\n") + symbol->print_name () + ",";
  if(!str.empty ())
    str += "\n";

  return PyUnicode_FromFormat("<gdb.Block %s {%s}>", name, str.c_str ());

which still seems to pass your tests.

> +
> +  return PyUnicode_FromFormat("<gdb.Block %s {%s}>", name, ss.str ().c_str ());
> +}
> +
>  int
>  gdbpy_initialize_blocks (void)
>  {
> @@ -486,7 +514,7 @@ PyTypeObject block_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  blpy_repr,                     /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    &block_object_as_mapping,	  /*tp_as_mapping*/
> diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
> index de7b9f4266b..ce307f7be21 100644
> --- a/gdb/python/py-breakpoint.c
> +++ b/gdb/python/py-breakpoint.c
> @@ -33,6 +33,7 @@
>  #include "location.h"
>  #include "py-event.h"
>  #include "linespec.h"
> +#include <sstream>
>  
>  extern PyTypeObject breakpoint_location_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
> @@ -967,6 +968,23 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
>    return 0;
>  }
>  
> +/* __repr__ implementation for gdb.Breakpoint.  */
> +
> +static PyObject *
> +bppy_repr(PyObject *self)
> +{
> +  const auto bp = (struct gdbpy_breakpoint_object*) self;
> +  if (bp->bp == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Breakpoint (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Breakpoint number=%d thread=%d hits=%d enable_count=%d>",
> +     bp->bp->number,
> +     bp->bp->thread,
> +     bp->bp->hit_count,
> +     bp->bp->enable_count);

I think I'd rather see the fields that are included here added
dynamically based on their values.  For example you choose to include
'thread' but not 'task' which seems pretty arbitrary.

Of the fields given here 'number' and 'hits' seem like they should
always be included, but I'd then only include 'thread' if it's not -1,
and 'task' if it's not 0, similarly, the 'enabled_count' is probably
only worth including if it's greater than zero I guess.

This leaves the door open for adding more optional fields in the future.

> +}
> +
>  /* Append to LIST the breakpoint Python object associated to B.
>  
>     Return true on success.  Return false on failure, with the Python error
> @@ -1389,7 +1407,7 @@ PyTypeObject breakpoint_object_type =
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  bppy_repr,                     /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> @@ -1604,6 +1622,33 @@ bplocpy_dealloc (PyObject *py_self)
>    Py_TYPE (py_self)->tp_free (py_self);
>  }
>  
> +/* __repr__ implementation for gdb.BreakpointLocation.  */
> +
> +static PyObject *
> +bplocpy_repr (PyObject *py_self)
> +{
> +  const auto self = (gdbpy_breakpoint_location_object *) py_self;
> +  if (self->owner == nullptr || self->owner->bp == nullptr || self->owner->bp != self->bp_loc->owner)
> +    return PyUnicode_FromFormat ("<gdb.BreakpointLocation (invalid)>");

That `if` condition line is too long.

> +
> +  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
> +
> +  std::stringstream ss;
> +  ss << std::endl << enabled << std::endl;
> +  ss << "requested_address=0x" << std::hex << self->bp_loc->requested_address << " ";
> +  ss << "address=0x" << self->bp_loc->address << " " << std::dec << std::endl;
> +  if (self->bp_loc->symtab != nullptr)
> +  {
> +    ss << self->bp_loc->symtab->filename << ":" << self->bp_loc->line_number << " " << std::endl;
> +  }
> +
> +  const auto fn_name = self->bp_loc->function_name.get ();
> +  if (fn_name != nullptr)
> +    ss << "in " << fn_name << " " << std::endl;

I think this can all be rewritten using std::string instead of
std::stringstream, maybe using string_printf to do some of the
formatting.  IMHO this would be more consistent with the rest of GDB.

> +
> +  return PyUnicode_FromFormat ("<gdb.BreakpointLocation %s>", ss.str ().c_str ());
> +}
> +
>  /* Attribute get/set Python definitions. */
>  
>  static gdb_PyGetSetDef bp_location_object_getset[] = {
> @@ -1635,7 +1680,7 @@ PyTypeObject breakpoint_location_object_type =
>    0,					/*tp_getattr*/
>    0,					/*tp_setattr*/
>    0,					/*tp_compare*/
> -  0,					/*tp_repr*/
> +  bplocpy_repr,                        /*tp_repr*/
>    0,					/*tp_as_number*/
>    0,					/*tp_as_sequence*/
>    0,					/*tp_as_mapping*/
> diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
> index 93c86964f3e..5a8149bbe66 100644
> --- a/gdb/python/py-symbol.c
> +++ b/gdb/python/py-symbol.c
> @@ -375,6 +375,20 @@ sympy_dealloc (PyObject *obj)
>    Py_TYPE (obj)->tp_free (obj);
>  }
>  
> +/* __repr__ implementation for gdb.Symbol.  */
> +
> +static PyObject *
> +sympy_repr (PyObject *self)
> +{
> +  const auto symbol = symbol_object_to_symbol (self);
> +  if (symbol == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Symbol (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Symbol print_name=%s>",
> +     symbol->print_name ());
> +}
> +
>  /* Implementation of
>     gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
>     A tuple with 2 elements is always returned.  The first is the symbol
> @@ -732,7 +746,7 @@ PyTypeObject symbol_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  sympy_repr,                    /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
> index 928efacfe8a..abe127eca76 100644
> --- a/gdb/python/py-type.c
> +++ b/gdb/python/py-type.c
> @@ -442,6 +442,7 @@ typy_is_signed (PyObject *self, void *closure)
>      Py_RETURN_TRUE;
>  }
>  
> +

Random extra blank line here.

>  /* Return the type, stripped of typedefs. */
>  static PyObject *
>  typy_strip_typedefs (PyObject *self, PyObject *args)
> @@ -1026,6 +1027,34 @@ typy_template_argument (PyObject *self, PyObject *args)
>    return value_to_value_object (val);
>  }
>  
> +/* __repr__ implementation for gdb.Type.  */
> +
> +static PyObject *
> +typy_repr (PyObject *self)
> +{
> +  const auto type = type_object_to_type (self);
> +  if (type == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Type (invalid)>");
> +
> +  const char *code = pyty_codes[type->code ()].name;
> +  string_file type_name;
> +  try
> +    {
> +      current_language->print_type (type, "",
> +				    &type_name, -1, 0,
> +				    &type_print_raw_options);
> +    }
> +  catch (const gdb_exception &except)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (except);
> +    }
> +  auto py_typename = PyUnicode_Decode
> +    (type_name.c_str (), type_name.size (),
> +		 host_charset (), NULL);
> +	
> +	return PyUnicode_FromFormat ("<gdb.Type code=%s name=%U>", code, py_typename);

Here's your whitespace error.  The return line is over indented, and the
preceding line has trailing whitespace.

> +}
> +
>  static PyObject *
>  typy_str (PyObject *self)
>  {
> @@ -1612,7 +1641,7 @@ PyTypeObject type_object_type =
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  typy_repr,                     /*tp_repr*/
>    &type_object_as_number,	  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    &typy_mapping,		  /*tp_as_mapping*/
> diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
> index 1fbbc47c872..a60b4a25cbb 100644
> --- a/gdb/testsuite/gdb.python/py-arch.exp
> +++ b/gdb/testsuite/gdb.python/py-arch.exp
> @@ -29,6 +29,8 @@ if ![runto_main] {
>  # Test python/15461.  Invalid architectures should not trigger an
>  # internal GDB assert.
>  gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
> +gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
> +    "Test empty achitecture __repr__ does not trigger an assert"

I guess I was surprised that so many of the new tests included an
explicit call to repr, given the premise of the change was that simply
'print(empty)' would now print something useful.

I guess maybe it doesn't hurt to _also_ include some explicit repr
calls, but I was expecting most tests to just be printing the object
directly.

Is there some reasoning here that I'm missing?

>  gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
>      "Test empty architecture.name does not trigger an assert"
>  gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
> @@ -46,6 +48,8 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
>  gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
>    "disassemble no end no count" 0
>  
> +gdb_test "python print (repr (arch))" "<gdb.Architecture arch_name=.* printable_name=.*>" "test __repr__ for architecture"

Over long line, please wrap a little.  There's other long lines in your
patch, I'll not point out each one.

> +
>  gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
>  gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
>  gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
> diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
> index 0a88aec56a0..5e3d1c72d5e 100644
> --- a/gdb/testsuite/gdb.python/py-block.exp
> +++ b/gdb/testsuite/gdb.python/py-block.exp
> @@ -39,7 +39,7 @@ gdb_continue_to_breakpoint "Block break here."
>  gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
>  gdb_py_test_silent_cmd "python block = frame.block()" \
>      "Get block, initial innermost block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
> +gdb_test "python print (block)" "<gdb.Block .* \{.*\}>" "check block not None"
>  gdb_test "python print (block.function)" "None" "first anonymous block"
>  gdb_test "python print (block.start)" "${decimal}" "check start not None"
>  gdb_test "python print (block.end)" "${decimal}" "check end not None"
> @@ -73,7 +73,7 @@ gdb_test "python print (block.function)" "block_func" \
>  gdb_test "up" ".*"
>  gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
>  gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" \
> +gdb_test "python print (repr (block))" "<gdb.Block .* \{.*\}>" \
>           "Check Frame 2's block not None"
>  gdb_test "python print (block.function)" "main" "main block"
>  
> diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
> index e36e87dc291..0c904a12c90 100644
> --- a/gdb/testsuite/gdb.python/py-breakpoint.exp
> +++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
> @@ -50,11 +50,14 @@ proc_with_prefix test_bkpt_basic { } {
>  	return 0
>      }
>  
> +    set num_exp "-?\[0-9\]+"
> +    set repr_pattern "<gdb.Breakpoint number=$num_exp thread=$num_exp hits=$num_exp enable_count=$num_exp>"

We already have $decimal.  I think if you change py-breakpoint.c as I
suggest then you should be able to make use of that and remove num_exp.
This applies below too.

Thanks,
Andrew

> +
>      # Now there should be one breakpoint: main.
>      gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main"
>      gdb_test "python print (blist\[0\].location)" \
>  	"main." "Check breakpoint location @main"
>      gdb_test "python print (blist\[0\].pending)" "False" \
> @@ -71,12 +74,12 @@ proc_with_prefix test_bkpt_basic { } {
>  	"Get Breakpoint List" 0
>      gdb_test "python print (len(blist))" \
>  	"2" "Check for two breakpoints"
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main 2"
>      gdb_test "python print (blist\[0\].location)" \
>  	"main." "Check breakpoint location @main 2"
> -    gdb_test "python print (blist\[1\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
> +    gdb_test "python print (repr (blist\[1\]))" \
> +	"$repr_pattern" "Check obj exists @mult_line"
>  
>      gdb_test "python print (blist\[1\].location)" \
>  	"py-breakpoint\.c:${mult_line}*" \
> @@ -224,14 +227,17 @@ proc_with_prefix test_bkpt_invisible { } {
>  	return 0
>      }
>  
> +    set num_exp "-?\[0-9\]+"
> +    set repr_pattern "<gdb.Breakpoint number=$num_exp thread=$num_exp hits=$num_exp enable_count=$num_exp>"
> +
>      delete_breakpoints
>      set ibp_location [gdb_get_line_number "Break at multiply."]
>      gdb_py_test_silent_cmd  "python ibp = gdb.Breakpoint(\"$ibp_location\", internal=False)" \
>  	"Set invisible breakpoint" 0
>      gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 1"
>      gdb_test "python print (ilist\[0\].location)" \
>  	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
>      gdb_test "python print (ilist\[0\].visible)" \
> @@ -243,8 +249,8 @@ proc_with_prefix test_bkpt_invisible { } {
>  	"Set invisible breakpoint" 0
>      gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>  	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 2"
>      gdb_test "python print (ilist\[0\].location)" \
>  	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
>      gdb_test "python print (ilist\[0\].visible)" \
> diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
> index ad06b07c2c6..e0baed9b6d4 100644
> --- a/gdb/testsuite/gdb.python/py-symbol.exp
> +++ b/gdb/testsuite/gdb.python/py-symbol.exp
> @@ -44,6 +44,7 @@ clean_restart ${binfile}
>  # point where we don't have a current frame, and we don't want to
>  # require one.
>  gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
> +gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=.*>" "test main_func.__repr__"
>  gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
>  gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
>  
> diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
> index 594c9749d8e..95cdfa54a6e 100644
> --- a/gdb/testsuite/gdb.python/py-type.exp
> +++ b/gdb/testsuite/gdb.python/py-type.exp
> @@ -393,3 +393,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
>        test_type_equality
>    }
>  }
> +
> +# Test __repr__()
> +gdb_test "python print (repr (gdb.lookup_type ('char')))" \
> +      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"
> -- 
> 2.37.3.windows.1


^ permalink raw reply	[relevance 6%]

* Re: [PATCH] Add __repr__() implementation to a few Python types
  2023-01-11  0:35  3% ` [PATCH] Add __repr__() implementation to a few Python types Matheus Branco Borella
@ 2023-01-18 17:05  7%   ` Bruno Larsen
  2023-01-18 18:02  6%   ` Andrew Burgess
  1 sibling, 0 replies; 65+ results
From: Bruno Larsen @ 2023-01-18 17:05 UTC (permalink / raw)
  To: Matheus Branco Borella, gdb-patches

On 11/01/2023 01:35, Matheus Branco Borella via Gdb-patches wrote:
> Only a few types in the Python API currently have __repr__() implementations.
> This patch adds a few more of them. specifically: it adds __repr__()
> implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
> and gdb.Type.
>
> This makes it easier to play around the GDB Python API in the Python interpreter
> session invoked with the 'pi' command in GDB, giving more easily accessible tipe
> information to users.
>
> An example of how this would look like:
> ```
> (gdb) pi
>>> gdb.lookup_type("char")
> <gdb.Type code=TYPE_CODE_INT name=char>
>>> gdb.lookup_global_symbol("main")
> <gdb.Symbol print_name=main>
> ```
>
> One thing to note about this patch is that it makes use of u8 string literals,
> so as to make sure we meet python's expectations of strings passed to it using
> PyUnicode_FromFormat being encoded in utf8. This should remove the chance of
> odd compilation environments spitting out strings Python would consider invalid
> for the function we're calling.

Thanks for working on this. I am not very familiar with the python side 
of GDB code, so I'll mostly comment on style. Nits inlined.

Also, while applying I get this:

Applying: Add __repr__() implementation to a few Python types
.git/rebase-apply/patch:267: trailing whitespace.

I have tested and see no regressions on my x86 machine.

Tested-By: Bruno Larsen <blarsen@redhat.com>

> ---
>   gdb/python/py-arch.c                       | 18 +++++++-
>   gdb/python/py-block.c                      | 30 ++++++++++++-
>   gdb/python/py-breakpoint.c                 | 49 +++++++++++++++++++++-
>   gdb/python/py-symbol.c                     | 16 ++++++-
>   gdb/python/py-type.c                       | 31 +++++++++++++-
>   gdb/testsuite/gdb.python/py-arch.exp       |  4 ++
>   gdb/testsuite/gdb.python/py-block.exp      |  4 +-
>   gdb/testsuite/gdb.python/py-breakpoint.exp | 26 +++++++-----
>   gdb/testsuite/gdb.python/py-symbol.exp     |  1 +
>   gdb/testsuite/gdb.python/py-type.exp       |  4 ++
>   10 files changed, 165 insertions(+), 18 deletions(-)
>
> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index cf0978560f9..5384a0d0d0c 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -319,6 +319,22 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
>     return type_to_type_object (type);
>   }
>   
> +/* __repr__ implementation for gdb.Architecture.  */
> +
> +static PyObject *
> +archpy_repr (PyObject *self)
> +{
> +  const auto gdbarch = arch_object_to_gdbarch (self);
> +  if (gdbarch == nullptr)
> +    return PyUnicode_FromFormat
> +      ("<gdb.Architecture (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Architecture arch_name=%s printable_name=%s>",
> +     gdbarch_bfd_arch_info (gdbarch)->arch_name,
> +     gdbarch_bfd_arch_info (gdbarch)->printable_name);
> +}
> +
>   /* Implementation of gdb.architecture_names().  Return a list of all the
>      BFD architecture names that GDB understands.  */
>   
> @@ -391,7 +407,7 @@ PyTypeObject arch_object_type = {
>     0,                                  /* tp_getattr */
>     0,                                  /* tp_setattr */
>     0,                                  /* tp_compare */
> -  0,                                  /* tp_repr */
> +  archpy_repr,                        /* tp_repr */
>     0,                                  /* tp_as_number */
>     0,                                  /* tp_as_sequence */
>     0,                                  /* tp_as_mapping */
> diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
> index b9aea3aca69..1b8433d41e7 100644
> --- a/gdb/python/py-block.c
> +++ b/gdb/python/py-block.c
> @@ -23,6 +23,7 @@
>   #include "symtab.h"
>   #include "python-internal.h"
>   #include "objfiles.h"
> +#include <sstream>
>   
>   struct block_object {
>     PyObject_HEAD
> @@ -424,6 +425,33 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
>     Py_RETURN_TRUE;
>   }
>   
> +/* __repr__ implementation for gdb.Block.  */
> +
> +static PyObject *
> +blpy_repr (PyObject *self)
> +{
> +  const auto block = block_object_to_block (self);
> +  if (block == nullptr)
> +    return PyUnicode_FromFormat("<gdb.Block (invalid)>");
missing space before (
> +
> +  const auto name = block->function () ? block->function ()->print_name () : "<anonymous>";
> +
> +  block_iterator iter;
> +  block_iterator_first (block, &iter);
> +
> +  std::stringstream ss;
> +  const struct symbol *symbol;
> +  while ((symbol = block_iterator_next (&iter)) != nullptr)
> +  {
> +    ss << std::endl;
> +    ss << symbol->print_name () << ",";
> +  }
> +  if(!ss.str ().empty ())
> +    ss << std::endl;
> +
> +  return PyUnicode_FromFormat("<gdb.Block %s {%s}>", name, ss.str ().c_str ());
and here
> +}
> +
>   int
>   gdbpy_initialize_blocks (void)
>   {
> @@ -486,7 +514,7 @@ PyTypeObject block_object_type = {
>     0,				  /*tp_getattr*/
>     0,				  /*tp_setattr*/
>     0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  blpy_repr,                     /*tp_repr*/
>     0,				  /*tp_as_number*/
>     0,				  /*tp_as_sequence*/
>     &block_object_as_mapping,	  /*tp_as_mapping*/
> diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
> index de7b9f4266b..ce307f7be21 100644
> --- a/gdb/python/py-breakpoint.c
> +++ b/gdb/python/py-breakpoint.c
> @@ -33,6 +33,7 @@
>   #include "location.h"
>   #include "py-event.h"
>   #include "linespec.h"
> +#include <sstream>
>   
>   extern PyTypeObject breakpoint_location_object_type
>       CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
> @@ -967,6 +968,23 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
>     return 0;
>   }
>   
> +/* __repr__ implementation for gdb.Breakpoint.  */
> +
> +static PyObject *
> +bppy_repr(PyObject *self)
here too
> +{
> +  const auto bp = (struct gdbpy_breakpoint_object*) self;
> +  if (bp->bp == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Breakpoint (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Breakpoint number=%d thread=%d hits=%d enable_count=%d>",
> +     bp->bp->number,
> +     bp->bp->thread,
> +     bp->bp->hit_count,
> +     bp->bp->enable_count);
I'm not sure if this is a "me" thing, but I try to keep line count as 
low as possible, and only break when reaching the 72 character soft limit.
> +}
> +
>   /* Append to LIST the breakpoint Python object associated to B.
>   
>      Return true on success.  Return false on failure, with the Python error
> @@ -1389,7 +1407,7 @@ PyTypeObject breakpoint_object_type =
>     0,				  /*tp_getattr*/
>     0,				  /*tp_setattr*/
>     0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  bppy_repr,                     /*tp_repr*/
>     0,				  /*tp_as_number*/
>     0,				  /*tp_as_sequence*/
>     0,				  /*tp_as_mapping*/
> @@ -1604,6 +1622,33 @@ bplocpy_dealloc (PyObject *py_self)
>     Py_TYPE (py_self)->tp_free (py_self);
>   }
>   
> +/* __repr__ implementation for gdb.BreakpointLocation.  */
> +
> +static PyObject *
> +bplocpy_repr (PyObject *py_self)
> +{
> +  const auto self = (gdbpy_breakpoint_location_object *) py_self;
> +  if (self->owner == nullptr || self->owner->bp == nullptr || self->owner->bp != self->bp_loc->owner)
> +    return PyUnicode_FromFormat ("<gdb.BreakpointLocation (invalid)>");
> +
> +  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
> +
> +  std::stringstream ss;
> +  ss << std::endl << enabled << std::endl;
> +  ss << "requested_address=0x" << std::hex << self->bp_loc->requested_address << " ";
> +  ss << "address=0x" << self->bp_loc->address << " " << std::dec << std::endl;
> +  if (self->bp_loc->symtab != nullptr)
> +  {
> +    ss << self->bp_loc->symtab->filename << ":" << self->bp_loc->line_number << " " << std::endl;
> +  }
> +
> +  const auto fn_name = self->bp_loc->function_name.get ();
> +  if (fn_name != nullptr)
> +    ss << "in " << fn_name << " " << std::endl;
> +
> +  return PyUnicode_FromFormat ("<gdb.BreakpointLocation %s>", ss.str ().c_str ());
> +}
> +
>   /* Attribute get/set Python definitions. */
>   
>   static gdb_PyGetSetDef bp_location_object_getset[] = {
> @@ -1635,7 +1680,7 @@ PyTypeObject breakpoint_location_object_type =
>     0,					/*tp_getattr*/
>     0,					/*tp_setattr*/
>     0,					/*tp_compare*/
> -  0,					/*tp_repr*/
> +  bplocpy_repr,                        /*tp_repr*/
>     0,					/*tp_as_number*/
>     0,					/*tp_as_sequence*/
>     0,					/*tp_as_mapping*/
> diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
> index 93c86964f3e..5a8149bbe66 100644
> --- a/gdb/python/py-symbol.c
> +++ b/gdb/python/py-symbol.c
> @@ -375,6 +375,20 @@ sympy_dealloc (PyObject *obj)
>     Py_TYPE (obj)->tp_free (obj);
>   }
>   
> +/* __repr__ implementation for gdb.Symbol.  */
> +
> +static PyObject *
> +sympy_repr (PyObject *self)
> +{
> +  const auto symbol = symbol_object_to_symbol (self);
> +  if (symbol == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Symbol (invalid)>");
> +
> +  return PyUnicode_FromFormat
> +    ("<gdb.Symbol print_name=%s>",
> +     symbol->print_name ());
> +}
> +
>   /* Implementation of
>      gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
>      A tuple with 2 elements is always returned.  The first is the symbol
> @@ -732,7 +746,7 @@ PyTypeObject symbol_object_type = {
>     0,				  /*tp_getattr*/
>     0,				  /*tp_setattr*/
>     0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  sympy_repr,                    /*tp_repr*/
>     0,				  /*tp_as_number*/
>     0,				  /*tp_as_sequence*/
>     0,				  /*tp_as_mapping*/
> diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
> index 928efacfe8a..abe127eca76 100644
> --- a/gdb/python/py-type.c
> +++ b/gdb/python/py-type.c
> @@ -442,6 +442,7 @@ typy_is_signed (PyObject *self, void *closure)
>       Py_RETURN_TRUE;
>   }
>   
> +
spurious white line here
>   /* Return the type, stripped of typedefs. */
>   static PyObject *
>   typy_strip_typedefs (PyObject *self, PyObject *args)
> @@ -1026,6 +1027,34 @@ typy_template_argument (PyObject *self, PyObject *args)
>     return value_to_value_object (val);
>   }
>   
> +/* __repr__ implementation for gdb.Type.  */
> +
> +static PyObject *
> +typy_repr (PyObject *self)
> +{
> +  const auto type = type_object_to_type (self);
> +  if (type == nullptr)
> +    return PyUnicode_FromFormat ("<gdb.Type (invalid)>");
> +
> +  const char *code = pyty_codes[type->code ()].name;
> +  string_file type_name;
> +  try
> +    {
> +      current_language->print_type (type, "",
> +				    &type_name, -1, 0,
> +				    &type_print_raw_options);
> +    }
> +  catch (const gdb_exception &except)
> +    {
> +      GDB_PY_HANDLE_EXCEPTION (except);
> +    }
> +  auto py_typename = PyUnicode_Decode
> +    (type_name.c_str (), type_name.size (),
> +		 host_charset (), NULL);
> +	
> +	return PyUnicode_FromFormat ("<gdb.Type code=%s name=%U>", code, py_typename);
> +}
> +
>   static PyObject *
>   typy_str (PyObject *self)
>   {
> @@ -1612,7 +1641,7 @@ PyTypeObject type_object_type =
>     0,				  /*tp_getattr*/
>     0,				  /*tp_setattr*/
>     0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  typy_repr,                     /*tp_repr*/
>     &type_object_as_number,	  /*tp_as_number*/
>     0,				  /*tp_as_sequence*/
>     &typy_mapping,		  /*tp_as_mapping*/
> diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
> index 1fbbc47c872..a60b4a25cbb 100644
> --- a/gdb/testsuite/gdb.python/py-arch.exp
> +++ b/gdb/testsuite/gdb.python/py-arch.exp
> @@ -29,6 +29,8 @@ if ![runto_main] {
>   # Test python/15461.  Invalid architectures should not trigger an
>   # internal GDB assert.
>   gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
> +gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
> +    "Test empty achitecture __repr__ does not trigger an assert"
>   gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
>       "Test empty architecture.name does not trigger an assert"
>   gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
> @@ -46,6 +48,8 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
>   gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
>     "disassemble no end no count" 0
>   
> +gdb_test "python print (repr (arch))" "<gdb.Architecture arch_name=.* printable_name=.*>" "test __repr__ for architecture"
> +
>   gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
>   gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
>   gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
> diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
> index 0a88aec56a0..5e3d1c72d5e 100644
> --- a/gdb/testsuite/gdb.python/py-block.exp
> +++ b/gdb/testsuite/gdb.python/py-block.exp
> @@ -39,7 +39,7 @@ gdb_continue_to_breakpoint "Block break here."
>   gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
>   gdb_py_test_silent_cmd "python block = frame.block()" \
>       "Get block, initial innermost block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
> +gdb_test "python print (block)" "<gdb.Block .* \{.*\}>" "check block not None"
>   gdb_test "python print (block.function)" "None" "first anonymous block"
>   gdb_test "python print (block.start)" "${decimal}" "check start not None"
>   gdb_test "python print (block.end)" "${decimal}" "check end not None"
> @@ -73,7 +73,7 @@ gdb_test "python print (block.function)" "block_func" \
>   gdb_test "up" ".*"
>   gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
>   gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
> -gdb_test "python print (block)" "<gdb.Block object at $hex>" \
> +gdb_test "python print (repr (block))" "<gdb.Block .* \{.*\}>" \
>            "Check Frame 2's block not None"
>   gdb_test "python print (block.function)" "main" "main block"
>   
> diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
> index e36e87dc291..0c904a12c90 100644
> --- a/gdb/testsuite/gdb.python/py-breakpoint.exp
> +++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
> @@ -50,11 +50,14 @@ proc_with_prefix test_bkpt_basic { } {
>   	return 0
>       }
>   
> +    set num_exp "-?\[0-9\]+"
> +    set repr_pattern "<gdb.Breakpoint number=$num_exp thread=$num_exp hits=$num_exp enable_count=$num_exp>"
> +
>       # Now there should be one breakpoint: main.
>       gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
>   	"Get Breakpoint List" 0
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main"
>       gdb_test "python print (blist\[0\].location)" \
>   	"main." "Check breakpoint location @main"
>       gdb_test "python print (blist\[0\].pending)" "False" \
> @@ -71,12 +74,12 @@ proc_with_prefix test_bkpt_basic { } {
>   	"Get Breakpoint List" 0
>       gdb_test "python print (len(blist))" \
>   	"2" "Check for two breakpoints"
> -    gdb_test "python print (blist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
> +    gdb_test "python print (repr (blist\[0\]))" \
> +	"$repr_pattern" "Check obj exists @main 2"
>       gdb_test "python print (blist\[0\].location)" \
>   	"main." "Check breakpoint location @main 2"
> -    gdb_test "python print (blist\[1\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
> +    gdb_test "python print (repr (blist\[1\]))" \
> +	"$repr_pattern" "Check obj exists @mult_line"
>   
>       gdb_test "python print (blist\[1\].location)" \
>   	"py-breakpoint\.c:${mult_line}*" \
> @@ -224,14 +227,17 @@ proc_with_prefix test_bkpt_invisible { } {
>   	return 0
>       }
>   
> +    set num_exp "-?\[0-9\]+"
> +    set repr_pattern "<gdb.Breakpoint number=$num_exp thread=$num_exp hits=$num_exp enable_count=$num_exp>"
> +
>       delete_breakpoints
>       set ibp_location [gdb_get_line_number "Break at multiply."]
>       gdb_py_test_silent_cmd  "python ibp = gdb.Breakpoint(\"$ibp_location\", internal=False)" \
>   	"Set invisible breakpoint" 0
>       gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>   	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 1"
>       gdb_test "python print (ilist\[0\].location)" \
>   	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
>       gdb_test "python print (ilist\[0\].visible)" \
> @@ -243,8 +249,8 @@ proc_with_prefix test_bkpt_invisible { } {
>   	"Set invisible breakpoint" 0
>       gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
>   	"Get Breakpoint List" 0
> -    gdb_test "python print (ilist\[0\])" \
> -	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
> +    gdb_test "python print (repr (ilist\[0\]))" \
> +	"$repr_pattern" "Check invisible bp obj exists 2"
>       gdb_test "python print (ilist\[0\].location)" \
>   	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
>       gdb_test "python print (ilist\[0\].visible)" \
> diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
> index ad06b07c2c6..e0baed9b6d4 100644
> --- a/gdb/testsuite/gdb.python/py-symbol.exp
> +++ b/gdb/testsuite/gdb.python/py-symbol.exp
> @@ -44,6 +44,7 @@ clean_restart ${binfile}
>   # point where we don't have a current frame, and we don't want to
>   # require one.
>   gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
> +gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=.*>" "test main_func.__repr__"
>   gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
>   gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
>   
> diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
> index 594c9749d8e..95cdfa54a6e 100644
> --- a/gdb/testsuite/gdb.python/py-type.exp
> +++ b/gdb/testsuite/gdb.python/py-type.exp
> @@ -393,3 +393,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
>         test_type_equality
>     }
>   }
> +
> +# Test __repr__()
> +gdb_test "python print (repr (gdb.lookup_type ('char')))" \
> +      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"


-- 
Cheers,
Bruno


^ permalink raw reply	[relevance 7%]

* [PATCH] Add support for symbol addition to the Python API
  @ 2023-01-12  2:00  4% ` Matheus Branco Borella
  0 siblings, 0 replies; 65+ results
From: Matheus Branco Borella @ 2023-01-12  2:00 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella


> overlap in the addresses of two compunit_symtabs.  What would functions
> like find_compunit_symtab_by_address return?  Should the new symbol be
> added to an existing compunit_symtab, if the address falls into an
> existing compunit_symtab's address range?

I'm actually not sure, from what I can tell `find_compunit_symtab_by_address`
looks into the qfs, that aren't changed by buildsym_compunit, at least not
from what I can tell. I'm probably wrong though, this part of the code is
still pretty confusing to me.

This patch adds support for symbol creation and registration. It currently
supports adding type symbols (VAR_DOMAIN/LOC_TYPEDEF), static symbols
(VAR_DOMAIN/LOC_STATIC) and goto target labels (LABEL_DOMAIN/LOC_LABEL). It
adds the `add_type_symbol`, `add_static_symbol` and `add_label_symbol`
functions to the `gdb.Objfile` type, allowing for the addition of the 
aforementioned types of symbols.

This is done through building a new `compunit_symtab`s for each symbol that is
to be added, owned by a given objfile and whose lifetimes is bound to it. I
might be missing something here, but there doesn't seem to be an intended way
to add new symbols to a compunit_symtab after it's been finished. If there is,
then the efficiency of this method could very much be improved. It could also be
made more efficient by having a way to add whole batches of symbols at once,
which would then all get added to the same `compunit_symtab`.

For now, though, this implementation lets us add symbols that can be used to,
for instance, query registered types through `gdb.lookup_type`, and allows
reverse engineering GDB plugins (such as Pwndbg [0] or decomp2gdb [1]) to add
symbols directly through the Python API instead of having to compile an object
file for the target architecture that they later load through the add-symbol-
file command. [2]

[0] https://github.com/pwndbg/pwndbg/
[1] https://github.com/mahaloz/decomp2dbg
[2] https://github.com/mahaloz/decomp2dbg/blob/055be6b2001954d00db2d683f20e9b714af75880/decomp2dbg/clients/gdb/symbol_mapper.py#L235-L243]
---
 gdb/python/py-objfile.c      | 258 +++++++++++++++++++++++++++++++++++
 gdb/python/python-internal.h |   2 +
 2 files changed, 260 insertions(+)

diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index c278925531b..00fe8de74f1 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -25,6 +25,7 @@
 #include "build-id.h"
 #include "symtab.h"
 #include "python.h"
+#include "buildsym.h"
 
 struct objfile_object
 {
@@ -527,6 +528,233 @@ objfpy_lookup_static_symbol (PyObject *self, PyObject *args, PyObject *kw)
   Py_RETURN_NONE;
 }
 
+/* Adds a new symbol to the given objfile. */
+
+static struct symbol *
+add_new_symbol
+  (struct objfile *objfile,
+   const char *name,
+   enum language language,
+   enum domain_enum domain,
+   enum address_class aclass,
+   short section_index,
+   CORE_ADDR last_addr,
+   CORE_ADDR end_addr,
+   bool global,
+   std::function<void(struct symbol*)> params)
+{
+  struct symbol *symbol = new (&objfile->objfile_obstack) struct symbol ();
+  OBJSTAT (objfile, n_syms++);
+
+  symbol->set_language (language, &objfile->objfile_obstack);
+  symbol->compute_and_set_names (gdb::string_view (name), true, objfile->per_bfd);
+
+  symbol->set_is_objfile_owned (true);
+  symbol->set_section_index (aclass);
+  symbol->set_domain (domain);
+  symbol->set_aclass_index (aclass);
+
+  params (symbol);
+
+  buildsym_compunit builder (objfile, "", "", language, last_addr);
+  add_symbol_to_list (symbol, global ? builder.get_global_symbols() : builder.get_file_symbols ());
+  builder.end_compunit_symtab (end_addr, section_index);
+
+  return symbol;
+}
+
+/* Parses a language from a string (coming from Python) into a language variant. */
+
+static enum language
+parse_language (const char *language)
+{
+  if (strcmp (language, "c") == 0)
+    return language_c;
+  else if (strcmp (language, "objc") == 0)
+    return language_objc;
+  else if (strcmp (language, "cplus") == 0)
+    return language_cplus;
+  else if (strcmp (language, "d") == 0)
+    return language_d;
+  else if (strcmp (language, "go") == 0)
+    return language_go;
+  else if (strcmp (language, "fortran") == 0)
+    return language_fortran;
+  else if (strcmp (language, "m2") == 0)
+    return language_m2;
+  else if (strcmp (language, "asm") == 0)
+    return language_asm;
+  else if (strcmp (language, "pascal") == 0)
+    return language_pascal;
+  else if (strcmp (language, "opencl") == 0)
+    return language_opencl;
+  else if (strcmp (language, "rust") == 0)
+    return language_rust;
+  else if (strcmp (language, "ada") == 0)
+    return language_ada;
+  else if (strcmp (language, "auto") == 0)
+    return language_auto;
+  else
+    return language_unknown;
+}
+
+/* Adds a type (LOC_TYPEDEF) symbol to a given objfile. */
+
+static PyObject *
+objfpy_add_type_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sO|s";
+  static const char *keywords[] =
+    {
+      "name", "type", "language",NULL
+    };
+
+  PyObject *type_object;
+  const char *name;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &type_object, &language_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (self);
+  if (objfile == nullptr)
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+  {
+    PyErr_SetString (PyExc_ValueError, "invalid language name");
+    return nullptr;
+  }
+
+  struct symbol* symbol = add_new_symbol
+    (objfile,
+     name,
+     language,
+     VAR_DOMAIN,
+     LOC_TYPEDEF,
+     0,
+     0,
+     0,
+     false,
+     [&](struct symbol* temp_symbol)
+     {
+       temp_symbol->set_type(type);
+     });
+
+
+  return symbol_to_symbol_object (symbol);
+}
+
+/* Adds a label (LOC_LABEL) symbol to a given objfile. */
+
+static PyObject *
+objfpy_add_label_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sk|s";
+  static const char *keywords[] =
+    {
+      "name", "address", "language",NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &address, &language_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (self);
+  if (objfile == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+  {
+    PyErr_SetString (PyExc_ValueError, "invalid language name");
+    return nullptr;
+  }
+
+  struct symbol* symbol = add_new_symbol
+    (objfile,
+     name,
+     language,
+     LABEL_DOMAIN,
+     LOC_LABEL,
+     0,
+     0,
+     0,
+     false,
+     [&](struct symbol* temp_symbol)
+     {
+       temp_symbol->set_value_address(address);
+     });
+
+
+  return symbol_to_symbol_object (symbol);
+}
+
+/* Adds a static (LOC_STATIC) symbol to a given objfile. */
+
+static PyObject *
+objfpy_add_static_symbol (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *format = "sk|s";
+  static const char *keywords[] =
+    {
+      "name", "address", "language", NULL
+    };
+
+  const char *name;
+  CORE_ADDR address;
+  const char *language_name = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, format, keywords, &name,
+                                        &address, &language_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (self);
+  if (objfile == nullptr)
+    return nullptr;
+
+  if (language_name == nullptr)
+    language_name = "auto";
+  enum language language = parse_language (language_name);
+  if (language == language_unknown)
+  {
+    PyErr_SetString (PyExc_ValueError, "invalid language name");
+    return nullptr;
+  }
+
+  struct symbol* symbol = add_new_symbol
+    (objfile,
+     name,
+     language,
+     VAR_DOMAIN,
+     LOC_STATIC,
+     0,
+     0,
+     0,
+     false,
+     [&](struct symbol* temp_symbol)
+     {
+       temp_symbol->set_value_address(address);
+     });
+
+
+  return symbol_to_symbol_object (symbol);
+}
+
 /* Implement repr() for gdb.Objfile.  */
 
 static PyObject *
@@ -704,6 +932,18 @@ objfile_to_objfile_object (struct objfile *objfile)
   return gdbpy_ref<>::new_reference (result);
 }
 
+struct objfile *
+objfile_object_to_objfile (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_object_type))
+    return nullptr;
+
+  auto objfile_object = (struct objfile_object*) self;
+  OBJFPY_REQUIRE_VALID (objfile_object);
+
+  return objfile_object->objfile;
+}
+
 int
 gdbpy_initialize_objfile (void)
 {
@@ -737,6 +977,24 @@ Look up a global symbol in this objfile and return it." },
     "lookup_static_symbol (name [, domain]).\n\
 Look up a static-linkage global symbol in this objfile and return it." },
 
+  { "add_type_symbol", (PyCFunction) objfpy_add_type_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_type_symbol(name: string, type: gdb.Type, [language: string])\n\
+    Registers a new symbol inside VAR_DOMAIN/LOC_TYPEDEF, with the given name\
+    referring to the given type." },
+
+  { "add_label_symbol", (PyCFunction) objfpy_add_label_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_label_symbol(name: string, address: int, [language: string])\n\
+    Registers a new symbol inside LABEL_DOMAIN/LOC_LABEL, with the given name\
+    pointing to the given address." },
+
+  { "add_static_symbol", (PyCFunction) objfpy_add_static_symbol,
+    METH_VARARGS | METH_KEYWORDS,
+    "add_static_symbol(name: string, address: int, [language: string])\n\
+    Registers a new symbol inside VAR_DOMAIN/LOC_STATIC, with the given name\
+    pointing to the given address." },
+
   { NULL }
 };
 
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index 06357cc8c0b..bb10df63077 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -481,6 +481,8 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
 struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
 frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
 struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
+struct objfile *objfile_object_to_objfile (PyObject *self);
+struct floatformat *float_format_object_as_float_format (PyObject *self);
 
 /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
    gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
-- 
2.37.3.windows.1


^ permalink raw reply	[relevance 4%]

* [PATCH] Add support for creating new types from the Python API
  @ 2023-01-11  0:58  2% ` Matheus Branco Borella
  2023-06-27  3:52 14%   ` [PING] " Matheus Branco Borella
  2023-05-26  3:30  2% ` Matheus Branco Borella
  1 sibling, 1 reply; 65+ results
From: Matheus Branco Borella @ 2023-01-11  0:58 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella

This patch adds support for creating types from within the Python API. It does
so by exposing the `init_*_type` family of functions, defined in `gdbtypes.h` to
Python and having them return `gdb.Type` objects connected to the newly minted
types.

These functions are accessible in the root of the gdb module and all require
a reference to a `gdb.Objfile`. Types created from this API are exclusively
objfile-owned.

This patch also adds an extra type - `gdb.FloatFormat` - to support creation of
floating point types by letting users control the format from within Python. It
is missing, however, a way to specify half formats and validation functions.

It is important to note that types created using this interface are not
automatically registered as a symbol, and so, types will become unreachable
unless used to create a value that otherwise references it or saved in some way.

The main drawback of using the `init_*_type` family over implementing type
initialization by hand is that any type that's created gets immediately
allocated on its owner objfile's obstack, regardless of what its real
lifetime requirements are. The main implication of this is that types that
become unreachable will leak their memory for the lifetime of the objfile.

Keeping track of the initialization of the type by hand would require a
deeper change to the existing type object infrastructure. A bit too ambitious
for a first patch, I'd say.

if it were to be done though, we would gain the ability to only keep in the
obstack types that are known to be referenced in some other way - by allocating
and copying the data to the obstack as other objects are created that reference
it (eg. symbols).
---
 gdb/Makefile.in              |   2 +
 gdb/python/py-float-format.c | 297 +++++++++++++++++++++++++++
 gdb/python/py-objfile.c      |  12 ++
 gdb/python/py-type-init.c    | 388 +++++++++++++++++++++++++++++++++++
 gdb/python/python-internal.h |  17 ++
 gdb/python/python.c          |  44 +++-
 6 files changed, 759 insertions(+), 1 deletion(-)
 create mode 100644 gdb/python/py-float-format.c
 create mode 100644 gdb/python/py-type-init.c

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index fb4d42c7baa..789f7dce224 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -432,6 +432,8 @@ SUBDIR_PYTHON_SRCS = \
 	python/py-threadevent.c \
 	python/py-tui.c \
 	python/py-type.c \
+	python/py-type-init.c \
+	python/py-float-format.c \
 	python/py-unwind.c \
 	python/py-utils.c \
 	python/py-value.c \
diff --git a/gdb/python/py-float-format.c b/gdb/python/py-float-format.c
new file mode 100644
index 00000000000..e517e410899
--- /dev/null
+++ b/gdb/python/py-float-format.c
@@ -0,0 +1,297 @@
+/* Accessibility of float format controls from inside the Python API
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "floatformat.h"
+
+/* Structure backing the float format Python interface. */
+
+struct float_format_object
+{
+  PyObject_HEAD
+  struct floatformat format;
+
+  struct floatformat *float_format ()
+  {
+    return &this->format;
+  }
+};
+
+/* Initializes the float format type and registers it with the Python interpreter. */
+
+int
+gdbpy_initialize_float_format (void)
+{
+  if (PyType_Ready (&float_format_object_type) < 0)
+    return -1;
+
+  if (gdb_pymodule_addobject (gdb_module, "FloatFormat",
+    (PyObject *) &float_format_object_type) < 0)
+    return -1;
+
+  return 0;
+}
+
+#define INSTANCE_FIELD_GETTER(getter_name, field_name, field_type, field_conv) \
+  static PyObject *                                                            \
+  getter_name (PyObject *self, void *closure)                                  \
+  {                                                                            \
+    float_format_object *ff = (float_format_object*) self;                     \
+    field_type value = ff->float_format ()->field_name;                        \
+    return field_conv (value);                                                 \
+  }
+
+#define INSTANCE_FIELD_SETTER(getter_name, field_name, field_type, field_conv) \
+  static int                                                                   \
+  getter_name (PyObject *self, PyObject* value, void *closure)                 \
+  {                                                                            \
+    field_type native_value;                                                   \
+    if (!field_conv (value, &native_value))                                    \
+      return -1;                                                               \
+    float_format_object *ff = (float_format_object*) self;                     \
+    ff->float_format ()->field_name = native_value;                            \
+    return 0;                                                                  \
+  }
+
+/* Converts from the intbit enum to a Python boolean. */
+
+static PyObject *
+intbit_to_py (enum floatformat_intbit intbit)
+{
+  gdb_assert (intbit == floatformat_intbit_yes || intbit == floatformat_intbit_no);
+  if (intbit == floatformat_intbit_no)
+    Py_RETURN_FALSE;
+  else
+    Py_RETURN_TRUE;
+}
+
+/* Converts from a Python boolean to the intbit enum. */
+
+static bool
+py_to_intbit (PyObject *object, enum floatformat_intbit *intbit)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyBool_Type))
+  {
+    PyErr_SetString (PyExc_TypeError, "intbit must be True or False");
+    return false;
+  }
+
+  *intbit = PyObject_IsTrue (object) ? floatformat_intbit_yes : floatformat_intbit_no;
+  return true;
+}
+
+/* Converts from a Python integer to a unsigned integer. */
+
+static bool
+py_to_unsigned_int (PyObject *object, unsigned int *val)
+{
+  if (!PyObject_IsInstance (object, (PyObject*) &PyLong_Type))
+  {
+    PyErr_SetString (PyExc_TypeError, "value must be an integer");
+    return false;
+  }
+
+  long native_val = PyLong_AsLong (object);
+  if (native_val > (long) UINT_MAX)
+  {
+    PyErr_SetString (PyExc_ValueError, "value is too large");
+    return false;
+  }
+  if (native_val < 0)
+  {
+    PyErr_SetString (PyExc_ValueError, "value must not be smaller than zero");
+    return false;
+  }
+
+  *val = (unsigned int) native_val;
+  return true;
+}
+
+/* Converts from a Python integer to a signed integer. */
+
+static bool
+py_to_int(PyObject *object, int *val)
+{
+  if(!PyObject_IsInstance(object, (PyObject*)&PyLong_Type))
+  {
+    PyErr_SetString(PyExc_TypeError, u8"value must be an integer");
+    return false;
+  }
+
+  long native_val = PyLong_AsLong(object);
+  if(native_val > (long)INT_MAX)
+  {
+    PyErr_SetString(PyExc_ValueError, u8"value is too large");
+    return false;
+  }
+
+  *val = (int)native_val;
+  return true;
+}
+
+INSTANCE_FIELD_GETTER (ffpy_get_totalsize, totalsize, unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_sign_start, sign_start, unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_start, exp_start, unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_len, exp_len, unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_bias, exp_bias, int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_exp_nan, exp_nan, unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_start, man_start, unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_man_len, man_len, unsigned int, PyLong_FromLong)
+INSTANCE_FIELD_GETTER (ffpy_get_intbit, intbit, enum floatformat_intbit, intbit_to_py)
+INSTANCE_FIELD_GETTER (ffpy_get_name, name, const char *, PyUnicode_FromString)
+
+INSTANCE_FIELD_SETTER (ffpy_set_totalsize, totalsize, unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_sign_start, sign_start, unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_start, exp_start, unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_len, exp_len, unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_bias, exp_bias, int, py_to_int)
+INSTANCE_FIELD_SETTER (ffpy_set_exp_nan, exp_nan, unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_start, man_start, unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_man_len, man_len, unsigned int, py_to_unsigned_int)
+INSTANCE_FIELD_SETTER (ffpy_set_intbit, intbit, enum floatformat_intbit, py_to_intbit)
+
+/* Makes sure float formats created from Python always test as valid. */
+
+static int
+ffpy_always_valid (const struct floatformat *fmt ATTRIBUTE_UNUSED,
+                   const void *from ATTRIBUTE_UNUSED)
+{
+  return 1;
+}
+
+/* Initializes new float format objects. */
+
+static int
+ffpy_init (PyObject *self,
+           PyObject *args ATTRIBUTE_UNUSED,
+           PyObject *kwds ATTRIBUTE_UNUSED)
+{
+  auto ff = (float_format_object*) self;
+  ff->format = floatformat ();
+  ff->float_format ()->name = "";
+  ff->float_format ()->is_valid = ffpy_always_valid;
+  return 0;
+}
+
+/* Retrieves a pointer to the underlying float format structure. */
+
+struct floatformat *
+float_format_object_as_float_format (PyObject *self)
+{
+  if (!PyObject_IsInstance (self, (PyObject*) &float_format_object_type))
+    return nullptr;
+  return ((float_format_object*) self)->float_format ();
+}
+
+static gdb_PyGetSetDef float_format_object_getset[] =
+{
+  { "totalsize", ffpy_get_totalsize, ffpy_set_totalsize,
+    "The total size of the floating point number, in bits.", nullptr },
+  { "sign_start", ffpy_get_sign_start, ffpy_set_sign_start,
+    "The bit offset of the sign bit.", nullptr },
+  { "exp_start", ffpy_get_exp_start, ffpy_set_exp_start,
+    "The bit offset of the start of the exponent.", nullptr },
+  { "exp_len", ffpy_get_exp_len, ffpy_set_exp_len,
+    "The size of the exponent, in bits.", nullptr },
+  { "exp_bias", ffpy_get_exp_bias, ffpy_set_exp_bias,
+    "Bias added to a \"true\" exponent to form the biased exponent.", nullptr },
+  { "exp_nan", ffpy_get_exp_nan, ffpy_set_exp_nan,
+    "Exponent value which indicates NaN.", nullptr },
+  { "man_start", ffpy_get_man_start, ffpy_set_man_start,
+    "The bit offset of the start of the mantissa.", nullptr },
+  { "man_len", ffpy_get_man_len, ffpy_set_man_len,
+    "The size of the mantissa, in bits.", nullptr },
+  { "intbit", ffpy_get_intbit, ffpy_set_intbit,
+    "Is the integer bit explicit or implicit?", nullptr },
+  { "name", ffpy_get_name, nullptr,
+    "Internal name for debugging.", nullptr },
+  { nullptr }
+};
+
+static PyMethodDef float_format_object_methods[] =
+{
+  { NULL }
+};
+
+static PyNumberMethods float_format_object_as_number = {
+  nullptr,             /* nb_add */
+  nullptr,             /* nb_subtract */
+  nullptr,             /* nb_multiply */
+  nullptr,             /* nb_remainder */
+  nullptr,             /* nb_divmod */
+  nullptr,             /* nb_power */
+  nullptr,             /* nb_negative */
+  nullptr,             /* nb_positive */
+  nullptr,             /* nb_absolute */
+  nullptr,             /* nb_nonzero */
+  nullptr,             /* nb_invert */
+  nullptr,             /* nb_lshift */
+  nullptr,             /* nb_rshift */
+  nullptr,             /* nb_and */
+  nullptr,             /* nb_xor */
+  nullptr,             /* nb_or */
+  nullptr,             /* nb_int */
+  nullptr,             /* reserved */
+  nullptr,             /* nb_float */
+};
+
+PyTypeObject float_format_object_type =
+{
+  PyVarObject_HEAD_INIT (NULL, 0)
+  "gdb.FloatFormat",              /*tp_name*/
+  sizeof (float_format_object),   /*tp_basicsize*/
+  0,                              /*tp_itemsize*/
+  nullptr,                        /*tp_dealloc*/
+  0,                              /*tp_print*/
+  nullptr,                        /*tp_getattr*/
+  nullptr,                        /*tp_setattr*/
+  nullptr,                        /*tp_compare*/
+  nullptr,                        /*tp_repr*/
+  &float_format_object_as_number, /*tp_as_number*/
+  nullptr,                        /*tp_as_sequence*/
+  nullptr,                        /*tp_as_mapping*/
+  nullptr,                        /*tp_hash */
+  nullptr,                        /*tp_call*/
+  nullptr,                        /*tp_str*/
+  nullptr,                        /*tp_getattro*/
+  nullptr,                        /*tp_setattro*/
+  nullptr,                        /*tp_as_buffer*/
+  Py_TPFLAGS_DEFAULT,             /*tp_flags*/
+  "GDB float format object",      /* tp_doc */
+  nullptr,                        /* tp_traverse */
+  nullptr,                        /* tp_clear */
+  nullptr,                        /* tp_richcompare */
+  0,                              /* tp_weaklistoffset */
+  nullptr,                        /* tp_iter */
+  nullptr,                        /* tp_iternext */
+  float_format_object_methods,    /* tp_methods */
+  nullptr,                        /* tp_members */
+  float_format_object_getset,     /* tp_getset */
+  nullptr,                        /* tp_base */
+  nullptr,                        /* tp_dict */
+  nullptr,                        /* tp_descr_get */
+  nullptr,                        /* tp_descr_set */
+  0,                              /* tp_dictoffset */
+  ffpy_init,                      /* tp_init */
+  nullptr,                        /* tp_alloc */
+  PyType_GenericNew,              /* tp_new */
+};
+
+
diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index c278925531b..28a7c9a7873 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -704,6 +704,18 @@ objfile_to_objfile_object (struct objfile *objfile)
   return gdbpy_ref<>::new_reference (result);
 }
 
+struct objfile *
+objfile_object_to_objfile (PyObject *self)
+{
+  if (!PyObject_TypeCheck (self, &objfile_object_type))
+    return nullptr;
+
+  auto objfile_object = (struct objfile_object*) self;
+  OBJFPY_REQUIRE_VALID (objfile_object);
+
+  return objfile_object->objfile;
+}
+
 int
 gdbpy_initialize_objfile (void)
 {
diff --git a/gdb/python/py-type-init.c b/gdb/python/py-type-init.c
new file mode 100644
index 00000000000..f3b6813c3ad
--- /dev/null
+++ b/gdb/python/py-type-init.c
@@ -0,0 +1,388 @@
+/* Functionality for creating new types accessible from python.
+
+   Copyright (C) 2008-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "python-internal.h"
+#include "gdbtypes.h"
+#include "floatformat.h"
+#include "objfiles.h"
+#include "gdbsupport/gdb_obstack.h"
+
+
+/* Copies a null-terminated string into an objfile's obstack. */
+
+static const char *
+copy_string (struct objfile *objfile, const char *py_str)
+{
+  unsigned int len = strlen (py_str);
+  return obstack_strndup (&objfile->per_bfd->storage_obstack,
+                          py_str, len);
+}
+
+/* Creates a new type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object;
+  enum type_code code;
+  int bit_length;
+  const char *py_name;
+
+  if(!PyArg_ParseTuple (args, "Oiis", &objfile_object, &code, &bit_length, &py_name))
+    return nullptr;
+
+  struct objfile* objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+  {
+    type = init_type (objfile, code, bit_length, name);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new integer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_integer_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+  {
+    type = init_integer_type (objfile, bit_size, unsigned_p, name);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object(type);
+}
+
+/* Creates a new character type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_character_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *objfile_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+  {
+    type = init_character_type (objfile, bit_size, unsigned_p, name);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new boolean type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_boolean_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *objfile_object;
+  int bit_size;
+  int unsigned_p;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_size, &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+  {
+    type = init_boolean_type (objfile, bit_size, unsigned_p, name);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new float type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_float_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object, *float_format_object;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "OOs", &objfile_object, &float_format_object, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  struct floatformat *local_ff = float_format_object_as_float_format (float_format_object);
+  if (local_ff == nullptr)
+    return nullptr;
+
+  /* Persist a copy of the format in the objfile's obstack. This guarantees that
+   * the format won't outlive the type being created from it and that changes
+   * made to the object used to create this type will not affect it after
+   * creation. */
+  auto ff = OBSTACK_CALLOC
+    (&objfile->objfile_obstack,
+     1,
+     struct floatformat);
+  memcpy (ff, local_ff, sizeof (struct floatformat));
+
+  /* We only support creating float types in the architecture's endianness, so
+   * make sure init_float_type sees the float format structure we need it to. */
+  enum bfd_endian endianness = gdbarch_byte_order (objfile->arch());
+  gdb_assert (endianness < BFD_ENDIAN_UNKNOWN);
+
+  const struct floatformat *per_endian[2] = { nullptr, nullptr };
+  per_endian[endianness] = ff;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+  {
+    type = init_float_type (objfile, -1, name, per_endian, endianness);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (type);
+}
+
+/* Creates a new decimal float type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_decfloat_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Ois", &objfile_object, &bit_length, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+  {
+    type = init_decfloat_type (objfile, bit_length, name);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (type);
+}
+
+/* Returns whether a given type can be used to create a complex type. */
+
+PyObject *
+gdbpy_can_create_complex_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *type_object;
+
+  if (!PyArg_ParseTuple (args, "O", &type_object))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  bool can_create_complex;
+  try
+  {
+    can_create_complex = can_create_complex_type (type);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  if (can_create_complex)
+    Py_RETURN_TRUE;
+  else
+    Py_RETURN_FALSE;
+}
+
+/* Creates a new complex type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_complex_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *type_object;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "Os", &type_object, &py_name))
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  obstack *obstack;
+  if (type->is_objfile_owned ())
+    obstack = &type->objfile_owner ()->objfile_obstack;
+  else
+    obstack = gdbarch_obstack (type->arch_owner ());
+
+  unsigned int len = strlen (py_name);
+  const char *name = obstack_strndup (obstack,
+                                      py_name,
+                                      len);
+  struct type *complex_type;
+  try
+  {
+    complex_type = init_complex_type (name, type);
+    gdb_assert (complex_type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (complex_type);
+}
+
+/* Creates a new pointer type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_pointer_type (PyObject *self, PyObject *args)
+{
+  PyObject *objfile_object, *type_object;
+  int bit_length;
+  const char *py_name;
+
+  if (!PyArg_ParseTuple (args, "OOis", &objfile_object, &type_object, &bit_length, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  struct type *type = type_object_to_type (type_object);
+  if (type == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *pointer_type;
+  try
+  {
+    pointer_type = init_pointer_type (objfile, bit_length, name, type);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (pointer_type);
+}
+
+/* Creates a new fixed point type and returns a new gdb.Type associated with it. */
+
+PyObject *
+gdbpy_init_fixed_point_type (PyObject *self, PyObject *args)
+{
+
+  PyObject *objfile_object;
+  int bit_length;
+  int unsigned_p;
+  const char* py_name;
+
+  if (!PyArg_ParseTuple (args, "Oips", &objfile_object, &bit_length, &unsigned_p, &py_name))
+    return nullptr;
+
+  struct objfile *objfile = objfile_object_to_objfile (objfile_object);
+  if (objfile == nullptr)
+    return nullptr;
+
+  const char *name = copy_string (objfile, py_name);
+  struct type *type;
+  try
+  {
+    type = init_fixed_point_type (objfile, bit_length, unsigned_p, name);
+    gdb_assert (type != nullptr);
+  }
+  catch (gdb_exception_error& ex)
+  {
+    GDB_PY_HANDLE_EXCEPTION (ex);
+  }
+
+  return type_to_type_object (type);
+}
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index 06357cc8c0b..3877f8a7ca9 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -289,6 +289,8 @@ extern PyTypeObject frame_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("frame_object");
 extern PyTypeObject thread_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("thread_object");
+extern PyTypeObject float_format_object_type
+    CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("float_format");
 
 /* Ensure that breakpoint_object_type is initialized and return true.  If
    breakpoint_object_type can't be initialized then set a suitable Python
@@ -431,6 +433,17 @@ gdb::unique_xmalloc_ptr<char> gdbpy_parse_command_name
 PyObject *gdbpy_register_tui_window (PyObject *self, PyObject *args,
 				     PyObject *kw);
 
+PyObject *gdbpy_init_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_integer_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_character_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_boolean_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_float_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_decfloat_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_can_create_complex_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_complex_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_pointer_type (PyObject *self, PyObject *args);
+PyObject *gdbpy_init_fixed_point_type (PyObject *self, PyObject *args);
+
 PyObject *symtab_and_line_to_sal_object (struct symtab_and_line sal);
 PyObject *symtab_to_symtab_object (struct symtab *symtab);
 PyObject *symbol_to_symbol_object (struct symbol *sym);
@@ -481,6 +494,8 @@ struct symtab *symtab_object_to_symtab (PyObject *obj);
 struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
 frame_info_ptr frame_object_to_frame_info (PyObject *frame_obj);
 struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
+struct objfile *objfile_object_to_objfile (PyObject *self);
+struct floatformat *float_format_object_as_float_format (PyObject *self);
 
 /* Convert Python object OBJ to a program_space pointer.  OBJ must be a
    gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
@@ -559,6 +574,8 @@ int gdbpy_initialize_micommands (void)
 void gdbpy_finalize_micommands ();
 int gdbpy_initialize_disasm ()
   CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
+int gdbpy_initialize_float_format ()
+  CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
 
 PyMODINIT_FUNC gdbpy_events_mod_func ();
 
diff --git a/gdb/python/python.c b/gdb/python/python.c
index 4aa24421dec..1ed29ff4dea 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -2153,7 +2153,8 @@ do_start_initialization ()
       || gdbpy_initialize_membuf () < 0
       || gdbpy_initialize_connection () < 0
       || gdbpy_initialize_tui () < 0
-      || gdbpy_initialize_micommands () < 0)
+      || gdbpy_initialize_micommands () < 0
+      || gdbpy_initialize_float_format() < 0)
     return false;
 
 #define GDB_PY_DEFINE_EVENT_TYPE(name, py_name, doc, base)	\
@@ -2529,6 +2530,47 @@ Return current recording object." },
     "stop_recording () -> None.\n\
 Stop current recording." },
 
+  /* Type initialization functions. */
+  { "init_type", gdbpy_init_type, METH_VARARGS,
+    "init_type (objfile, type_code, bit_length, name) -> type\n\
+    Creates a new type with the given bit length and type code, owned\
+    by the given objfile." },
+  { "init_integer_type", gdbpy_init_integer_type, METH_VARARGS,
+    "init_integer_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new integer type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_character_type", gdbpy_init_character_type, METH_VARARGS,
+    "init_character_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new character type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_boolean_type", gdbpy_init_boolean_type, METH_VARARGS,
+    "init_boolean_type (objfile, bit_length, unsigned, name) -> type\n\
+    Creates a new boolean type with the given bit length and \
+    signedness, owned by the given objfile." },
+  { "init_float_type", gdbpy_init_float_type, METH_VARARGS,
+    "init_float_type (objfile, float_format, name) -> type\n\
+    Creates a new floating point type with the given bit length and \
+    format, owned by the given objfile." },
+  { "init_decfloat_type", gdbpy_init_decfloat_type, METH_VARARGS,
+    "init_decfloat_type (objfile, bit_length, name) -> type\n\
+    Creates a new decimal float type with the given bit length,\
+    owned by the given objfile." },
+  { "can_create_complex_type", gdbpy_can_create_complex_type, METH_VARARGS,
+    "can_create_complex_type (type) -> bool\n\
+     Returns whether a given type can form a new complex type." },
+  { "init_complex_type", gdbpy_init_complex_type, METH_VARARGS,
+    "init_complex_type (base_type, name) -> type\n\
+    Creates a new complex type whose components belong to the\
+    given type, owned by the given objfile." },
+  { "init_pointer_type", gdbpy_init_pointer_type, METH_VARARGS,
+    "init_pointer_type (objfile, target_type, bit_length, name) -> type\n\
+    Creates a new pointer type with the given bit length, pointing\
+    to the given target type, and owned by the given objfile." },
+ { "init_fixed_point_type", gdbpy_init_fixed_point_type, METH_VARARGS,
+   "init_fixed_point_type (objfile, bit_length, unsigned, name) -> type\n\
+   Creates a new fixed point type with the given bit length and\
+   signedness, owned by the given objfile." },
+
   { "lookup_type", (PyCFunction) gdbpy_lookup_type,
     METH_VARARGS | METH_KEYWORDS,
     "lookup_type (name [, block]) -> type\n\
-- 
2.37.3.windows.1


^ permalink raw reply	[relevance 2%]

* [PATCH] Add __repr__() implementation to a few Python types
       [not found]     <3Cc9d2dc49-45c7-d6b9-c567-4ec78dd870a0>
@ 2023-01-11  0:35  3% ` Matheus Branco Borella
  2023-01-18 17:05  7%   ` Bruno Larsen
  2023-01-18 18:02  6%   ` Andrew Burgess
  0 siblings, 2 replies; 65+ results
From: Matheus Branco Borella @ 2023-01-11  0:35 UTC (permalink / raw)
  To: gdb-patches; +Cc: Matheus Branco Borella

Only a few types in the Python API currently have __repr__() implementations.
This patch adds a few more of them. specifically: it adds __repr__()
implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
and gdb.Type.

This makes it easier to play around the GDB Python API in the Python interpreter
session invoked with the 'pi' command in GDB, giving more easily accessible tipe
information to users.

An example of how this would look like:
```
(gdb) pi
>> gdb.lookup_type("char")
<gdb.Type code=TYPE_CODE_INT name=char>
>> gdb.lookup_global_symbol("main")
<gdb.Symbol print_name=main>
```

One thing to note about this patch is that it makes use of u8 string literals,
so as to make sure we meet python's expectations of strings passed to it using
PyUnicode_FromFormat being encoded in utf8. This should remove the chance of
odd compilation environments spitting out strings Python would consider invalid
for the function we're calling.
---
 gdb/python/py-arch.c                       | 18 +++++++-
 gdb/python/py-block.c                      | 30 ++++++++++++-
 gdb/python/py-breakpoint.c                 | 49 +++++++++++++++++++++-
 gdb/python/py-symbol.c                     | 16 ++++++-
 gdb/python/py-type.c                       | 31 +++++++++++++-
 gdb/testsuite/gdb.python/py-arch.exp       |  4 ++
 gdb/testsuite/gdb.python/py-block.exp      |  4 +-
 gdb/testsuite/gdb.python/py-breakpoint.exp | 26 +++++++-----
 gdb/testsuite/gdb.python/py-symbol.exp     |  1 +
 gdb/testsuite/gdb.python/py-type.exp       |  4 ++
 10 files changed, 165 insertions(+), 18 deletions(-)

diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index cf0978560f9..5384a0d0d0c 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -319,6 +319,22 @@ archpy_integer_type (PyObject *self, PyObject *args, PyObject *kw)
   return type_to_type_object (type);
 }
 
+/* __repr__ implementation for gdb.Architecture.  */
+
+static PyObject *
+archpy_repr (PyObject *self)
+{
+  const auto gdbarch = arch_object_to_gdbarch (self);
+  if (gdbarch == nullptr)
+    return PyUnicode_FromFormat
+      ("<gdb.Architecture (invalid)>");
+
+  return PyUnicode_FromFormat
+    ("<gdb.Architecture arch_name=%s printable_name=%s>",
+     gdbarch_bfd_arch_info (gdbarch)->arch_name,
+     gdbarch_bfd_arch_info (gdbarch)->printable_name);
+}
+
 /* Implementation of gdb.architecture_names().  Return a list of all the
    BFD architecture names that GDB understands.  */
 
@@ -391,7 +407,7 @@ PyTypeObject arch_object_type = {
   0,                                  /* tp_getattr */
   0,                                  /* tp_setattr */
   0,                                  /* tp_compare */
-  0,                                  /* tp_repr */
+  archpy_repr,                        /* tp_repr */
   0,                                  /* tp_as_number */
   0,                                  /* tp_as_sequence */
   0,                                  /* tp_as_mapping */
diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
index b9aea3aca69..1b8433d41e7 100644
--- a/gdb/python/py-block.c
+++ b/gdb/python/py-block.c
@@ -23,6 +23,7 @@
 #include "symtab.h"
 #include "python-internal.h"
 #include "objfiles.h"
+#include <sstream>
 
 struct block_object {
   PyObject_HEAD
@@ -424,6 +425,33 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
   Py_RETURN_TRUE;
 }
 
+/* __repr__ implementation for gdb.Block.  */
+
+static PyObject *
+blpy_repr (PyObject *self)
+{
+  const auto block = block_object_to_block (self);
+  if (block == nullptr)
+    return PyUnicode_FromFormat("<gdb.Block (invalid)>");
+
+  const auto name = block->function () ? block->function ()->print_name () : "<anonymous>";
+
+  block_iterator iter;
+  block_iterator_first (block, &iter);
+
+  std::stringstream ss;
+  const struct symbol *symbol;
+  while ((symbol = block_iterator_next (&iter)) != nullptr)
+  {
+    ss << std::endl;
+    ss << symbol->print_name () << ",";
+  }
+  if(!ss.str ().empty ())
+    ss << std::endl;
+
+  return PyUnicode_FromFormat("<gdb.Block %s {%s}>", name, ss.str ().c_str ());
+}
+
 int
 gdbpy_initialize_blocks (void)
 {
@@ -486,7 +514,7 @@ PyTypeObject block_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  blpy_repr,                     /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &block_object_as_mapping,	  /*tp_as_mapping*/
diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
index de7b9f4266b..ce307f7be21 100644
--- a/gdb/python/py-breakpoint.c
+++ b/gdb/python/py-breakpoint.c
@@ -33,6 +33,7 @@
 #include "location.h"
 #include "py-event.h"
 #include "linespec.h"
+#include <sstream>
 
 extern PyTypeObject breakpoint_location_object_type
     CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
@@ -967,6 +968,23 @@ bppy_init (PyObject *self, PyObject *args, PyObject *kwargs)
   return 0;
 }
 
+/* __repr__ implementation for gdb.Breakpoint.  */
+
+static PyObject *
+bppy_repr(PyObject *self)
+{
+  const auto bp = (struct gdbpy_breakpoint_object*) self;
+  if (bp->bp == nullptr)
+    return PyUnicode_FromFormat ("<gdb.Breakpoint (invalid)>");
+
+  return PyUnicode_FromFormat
+    ("<gdb.Breakpoint number=%d thread=%d hits=%d enable_count=%d>",
+     bp->bp->number,
+     bp->bp->thread,
+     bp->bp->hit_count,
+     bp->bp->enable_count);
+}
+
 /* Append to LIST the breakpoint Python object associated to B.
 
    Return true on success.  Return false on failure, with the Python error
@@ -1389,7 +1407,7 @@ PyTypeObject breakpoint_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  bppy_repr,                     /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
@@ -1604,6 +1622,33 @@ bplocpy_dealloc (PyObject *py_self)
   Py_TYPE (py_self)->tp_free (py_self);
 }
 
+/* __repr__ implementation for gdb.BreakpointLocation.  */
+
+static PyObject *
+bplocpy_repr (PyObject *py_self)
+{
+  const auto self = (gdbpy_breakpoint_location_object *) py_self;
+  if (self->owner == nullptr || self->owner->bp == nullptr || self->owner->bp != self->bp_loc->owner)
+    return PyUnicode_FromFormat ("<gdb.BreakpointLocation (invalid)>");
+
+  const auto enabled = self->bp_loc->enabled ? "enabled" : "disabled";
+
+  std::stringstream ss;
+  ss << std::endl << enabled << std::endl;
+  ss << "requested_address=0x" << std::hex << self->bp_loc->requested_address << " ";
+  ss << "address=0x" << self->bp_loc->address << " " << std::dec << std::endl;
+  if (self->bp_loc->symtab != nullptr)
+  {
+    ss << self->bp_loc->symtab->filename << ":" << self->bp_loc->line_number << " " << std::endl;
+  }
+
+  const auto fn_name = self->bp_loc->function_name.get ();
+  if (fn_name != nullptr)
+    ss << "in " << fn_name << " " << std::endl;
+
+  return PyUnicode_FromFormat ("<gdb.BreakpointLocation %s>", ss.str ().c_str ());
+}
+
 /* Attribute get/set Python definitions. */
 
 static gdb_PyGetSetDef bp_location_object_getset[] = {
@@ -1635,7 +1680,7 @@ PyTypeObject breakpoint_location_object_type =
   0,					/*tp_getattr*/
   0,					/*tp_setattr*/
   0,					/*tp_compare*/
-  0,					/*tp_repr*/
+  bplocpy_repr,                        /*tp_repr*/
   0,					/*tp_as_number*/
   0,					/*tp_as_sequence*/
   0,					/*tp_as_mapping*/
diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
index 93c86964f3e..5a8149bbe66 100644
--- a/gdb/python/py-symbol.c
+++ b/gdb/python/py-symbol.c
@@ -375,6 +375,20 @@ sympy_dealloc (PyObject *obj)
   Py_TYPE (obj)->tp_free (obj);
 }
 
+/* __repr__ implementation for gdb.Symbol.  */
+
+static PyObject *
+sympy_repr (PyObject *self)
+{
+  const auto symbol = symbol_object_to_symbol (self);
+  if (symbol == nullptr)
+    return PyUnicode_FromFormat ("<gdb.Symbol (invalid)>");
+
+  return PyUnicode_FromFormat
+    ("<gdb.Symbol print_name=%s>",
+     symbol->print_name ());
+}
+
 /* Implementation of
    gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
    A tuple with 2 elements is always returned.  The first is the symbol
@@ -732,7 +746,7 @@ PyTypeObject symbol_object_type = {
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  sympy_repr,                    /*tp_repr*/
   0,				  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   0,				  /*tp_as_mapping*/
diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
index 928efacfe8a..abe127eca76 100644
--- a/gdb/python/py-type.c
+++ b/gdb/python/py-type.c
@@ -442,6 +442,7 @@ typy_is_signed (PyObject *self, void *closure)
     Py_RETURN_TRUE;
 }
 
+
 /* Return the type, stripped of typedefs. */
 static PyObject *
 typy_strip_typedefs (PyObject *self, PyObject *args)
@@ -1026,6 +1027,34 @@ typy_template_argument (PyObject *self, PyObject *args)
   return value_to_value_object (val);
 }
 
+/* __repr__ implementation for gdb.Type.  */
+
+static PyObject *
+typy_repr (PyObject *self)
+{
+  const auto type = type_object_to_type (self);
+  if (type == nullptr)
+    return PyUnicode_FromFormat ("<gdb.Type (invalid)>");
+
+  const char *code = pyty_codes[type->code ()].name;
+  string_file type_name;
+  try
+    {
+      current_language->print_type (type, "",
+				    &type_name, -1, 0,
+				    &type_print_raw_options);
+    }
+  catch (const gdb_exception &except)
+    {
+      GDB_PY_HANDLE_EXCEPTION (except);
+    }
+  auto py_typename = PyUnicode_Decode
+    (type_name.c_str (), type_name.size (),
+		 host_charset (), NULL);
+	
+	return PyUnicode_FromFormat ("<gdb.Type code=%s name=%U>", code, py_typename);
+}
+
 static PyObject *
 typy_str (PyObject *self)
 {
@@ -1612,7 +1641,7 @@ PyTypeObject type_object_type =
   0,				  /*tp_getattr*/
   0,				  /*tp_setattr*/
   0,				  /*tp_compare*/
-  0,				  /*tp_repr*/
+  typy_repr,                     /*tp_repr*/
   &type_object_as_number,	  /*tp_as_number*/
   0,				  /*tp_as_sequence*/
   &typy_mapping,		  /*tp_as_mapping*/
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index 1fbbc47c872..a60b4a25cbb 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -29,6 +29,8 @@ if ![runto_main] {
 # Test python/15461.  Invalid architectures should not trigger an
 # internal GDB assert.
 gdb_py_test_silent_cmd "python empty = gdb.Architecture()" "get empty arch" 0
+gdb_test "python print(repr (empty))" "<gdb\\.Architecture \\(invalid\\)>" \
+    "Test empty achitecture __repr__ does not trigger an assert"
 gdb_test "python print(empty.name())" ".*Architecture is invalid.*" \
     "Test empty architecture.name does not trigger an assert"
 gdb_test "python print(empty.disassemble())" ".*Architecture is invalid.*" \
@@ -46,6 +48,8 @@ gdb_py_test_silent_cmd "python insn_list3 = arch.disassemble(pc, count=1)" \
 gdb_py_test_silent_cmd "python insn_list4 = arch.disassemble(gdb.Value(pc))" \
   "disassemble no end no count" 0
 
+gdb_test "python print (repr (arch))" "<gdb.Architecture arch_name=.* printable_name=.*>" "test __repr__ for architecture"
+
 gdb_test "python print (len(insn_list1))" "1" "test number of instructions 1"
 gdb_test "python print (len(insn_list2))" "1" "test number of instructions 2"
 gdb_test "python print (len(insn_list3))" "1" "test number of instructions 3"
diff --git a/gdb/testsuite/gdb.python/py-block.exp b/gdb/testsuite/gdb.python/py-block.exp
index 0a88aec56a0..5e3d1c72d5e 100644
--- a/gdb/testsuite/gdb.python/py-block.exp
+++ b/gdb/testsuite/gdb.python/py-block.exp
@@ -39,7 +39,7 @@ gdb_continue_to_breakpoint "Block break here."
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame" 0
 gdb_py_test_silent_cmd "python block = frame.block()" \
     "Get block, initial innermost block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" "check block not None"
+gdb_test "python print (block)" "<gdb.Block .* \{.*\}>" "check block not None"
 gdb_test "python print (block.function)" "None" "first anonymous block"
 gdb_test "python print (block.start)" "${decimal}" "check start not None"
 gdb_test "python print (block.end)" "${decimal}" "check end not None"
@@ -73,7 +73,7 @@ gdb_test "python print (block.function)" "block_func" \
 gdb_test "up" ".*"
 gdb_py_test_silent_cmd "python frame = gdb.selected_frame()" "Get Frame 2" 0
 gdb_py_test_silent_cmd "python block = frame.block()" "Get Frame 2's block" 0
-gdb_test "python print (block)" "<gdb.Block object at $hex>" \
+gdb_test "python print (repr (block))" "<gdb.Block .* \{.*\}>" \
          "Check Frame 2's block not None"
 gdb_test "python print (block.function)" "main" "main block"
 
diff --git a/gdb/testsuite/gdb.python/py-breakpoint.exp b/gdb/testsuite/gdb.python/py-breakpoint.exp
index e36e87dc291..0c904a12c90 100644
--- a/gdb/testsuite/gdb.python/py-breakpoint.exp
+++ b/gdb/testsuite/gdb.python/py-breakpoint.exp
@@ -50,11 +50,14 @@ proc_with_prefix test_bkpt_basic { } {
 	return 0
     }
 
+    set num_exp "-?\[0-9\]+"
+    set repr_pattern "<gdb.Breakpoint number=$num_exp thread=$num_exp hits=$num_exp enable_count=$num_exp>"
+
     # Now there should be one breakpoint: main.
     gdb_py_test_silent_cmd "python blist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main"
+    gdb_test "python print (repr (blist\[0\]))" \
+	"$repr_pattern" "Check obj exists @main"
     gdb_test "python print (blist\[0\].location)" \
 	"main." "Check breakpoint location @main"
     gdb_test "python print (blist\[0\].pending)" "False" \
@@ -71,12 +74,12 @@ proc_with_prefix test_bkpt_basic { } {
 	"Get Breakpoint List" 0
     gdb_test "python print (len(blist))" \
 	"2" "Check for two breakpoints"
-    gdb_test "python print (blist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @main 2"
+    gdb_test "python print (repr (blist\[0\]))" \
+	"$repr_pattern" "Check obj exists @main 2"
     gdb_test "python print (blist\[0\].location)" \
 	"main." "Check breakpoint location @main 2"
-    gdb_test "python print (blist\[1\])" \
-	"<gdb.Breakpoint object at $hex>" "Check obj exists @mult_line"
+    gdb_test "python print (repr (blist\[1\]))" \
+	"$repr_pattern" "Check obj exists @mult_line"
 
     gdb_test "python print (blist\[1\].location)" \
 	"py-breakpoint\.c:${mult_line}*" \
@@ -224,14 +227,17 @@ proc_with_prefix test_bkpt_invisible { } {
 	return 0
     }
 
+    set num_exp "-?\[0-9\]+"
+    set repr_pattern "<gdb.Breakpoint number=$num_exp thread=$num_exp hits=$num_exp enable_count=$num_exp>"
+
     delete_breakpoints
     set ibp_location [gdb_get_line_number "Break at multiply."]
     gdb_py_test_silent_cmd  "python ibp = gdb.Breakpoint(\"$ibp_location\", internal=False)" \
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 1"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	"$repr_pattern" "Check invisible bp obj exists 1"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 1"
     gdb_test "python print (ilist\[0\].visible)" \
@@ -243,8 +249,8 @@ proc_with_prefix test_bkpt_invisible { } {
 	"Set invisible breakpoint" 0
     gdb_py_test_silent_cmd "python ilist = gdb.breakpoints()" \
 	"Get Breakpoint List" 0
-    gdb_test "python print (ilist\[0\])" \
-	"<gdb.Breakpoint object at $hex>" "Check invisible bp obj exists 2"
+    gdb_test "python print (repr (ilist\[0\]))" \
+	"$repr_pattern" "Check invisible bp obj exists 2"
     gdb_test "python print (ilist\[0\].location)" \
 	"py-breakpoint\.c:$ibp_location*" "Check breakpoint location 2"
     gdb_test "python print (ilist\[0\].visible)" \
diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
index ad06b07c2c6..e0baed9b6d4 100644
--- a/gdb/testsuite/gdb.python/py-symbol.exp
+++ b/gdb/testsuite/gdb.python/py-symbol.exp
@@ -44,6 +44,7 @@ clean_restart ${binfile}
 # point where we don't have a current frame, and we don't want to
 # require one.
 gdb_py_test_silent_cmd "python main_func = gdb.lookup_global_symbol(\"main\")" "Lookup main" 1
+gdb_test "python print (repr (main_func))" "<gdb.Symbol print_name=.*>" "test main_func.__repr__"
 gdb_test "python print (main_func.is_function)" "True" "test main_func.is_function"
 gdb_test "python print (gdb.lookup_global_symbol(\"junk\"))" "None" "test lookup_global_symbol(\"junk\")"
 
diff --git a/gdb/testsuite/gdb.python/py-type.exp b/gdb/testsuite/gdb.python/py-type.exp
index 594c9749d8e..95cdfa54a6e 100644
--- a/gdb/testsuite/gdb.python/py-type.exp
+++ b/gdb/testsuite/gdb.python/py-type.exp
@@ -393,3 +393,7 @@ if { [build_inferior "${binfile}-cxx" "c++"] == 0 } {
       test_type_equality
   }
 }
+
+# Test __repr__()
+gdb_test "python print (repr (gdb.lookup_type ('char')))" \
+      "<gdb.Type code=TYPE_CODE_INT name=char>" "test __repr__()"
-- 
2.37.3.windows.1


^ permalink raw reply	[relevance 3%]

* Re: Parallelization of dwarf2 symtab parsing
  2022-12-21 18:46  5% Parallelization of dwarf2 symtab parsing Matheus Branco Borella
@ 2022-12-21 20:19  0% ` Tom Tromey
  0 siblings, 0 replies; 65+ results
From: Tom Tromey @ 2022-12-21 20:19 UTC (permalink / raw)
  To: Matheus Branco Borella via Gdb-patches; +Cc: Matheus Branco Borella

>>>>> "Matt" == Matheus Branco Borella via Gdb-patches <gdb-patches@sourceware.org> writes:

Matt> I was wondering if there's anything design-wise blocking running the calls to
Matt> `dw2_expand_symtabs_matching_one` made in
Matt> `cooked_index_functions::expand_symtabs_matching`
Matt> in parallel, using a similar approach to what building the psymtabs currently
Matt> uses.

I think it's reasonably difficult, though it might be possible.

Matt> Doing that naively, though, there are a few trouble variables and structures
Matt> that cause data races. The ones I've found are:

Yeah, the biggest problem has to do with data races.  In particular gdb
has some maybe non-obvious spots that can interfere.

First, choosing symbol names in the DWARF reader (the "full" reader, not
the indexer) uses a per-objfile bcache:

  if (need_copy)
    retval = objfile->intern (retval);

(The code to compute symbol names is a horrible mess, I wish we could
clean that up...)

Also any spot in the DWARF reader referring to the objfile obstack is a
potential data race.  According to M-x occur there are 43 of these.
These can probably be changed to some kind of sharded obstack to avoid
interference.

There may also be problems with the buildsym code; and of course when a
symtab is actually installed, that must be made thread-safe.  I tried to
make buildsym less reliant on global variables a while ago, though I
don't recall if that was a 100% success.  (Note there's a "legacy" API
that uses globals, but the DWARF reader avoids this.)

Matt> (4) per_objfile->m_dwarf2_cus, and

Matt> (4), though, is where I've hit a bit of a nag. Since it's a global registry of
Matt> CUs, all threads must have a shared and coherent view of it. Using a mutex to
Matt> lock the map can be done in one of two ways, both with equally undesirable
Matt> results:
...
Matt> - We analyze the graph of which CUs cause other CUs to get loaded and run
Matt>  the parallel batches in topological order.
Matt>    Assuming that we can even know that info. ahead of time, this approach
Matt> would still be the most intrusive to the code and the most complex to
Matt> pull off.
Matt> - We introduce a type that represents a CU that is currently being loaded,
Matt>  but that hasn't finished loading quite yet, for threads that need that CU
Matt>  to await on.

Basically, I think you just want to make sure that a given CU is only
parsed by a single thread.  It's better to arrange things to avoid doing
any extra work here.  I don't think this should be super hard to do, for
example you can have a map that is locked only on use that maps from the
dwarf2_per_cu_data to a std::future that will hold the resulting symtab
or whatever.

Then, when multiple CUs refer to some other CU, whichever thread gets to
that included CU first will win.  I think the data structures should be
set up such that these can be stitched together again after parsing.
It's probably fine to just do this in a post-pass that waits for all the
futures and then sets up the inclusions.

The cooked indexer uses a strategy like this.

Also note that cross-CU references are relatively rare, basically only
occurring when 'dwz' has been used.  So this situation isn't extremely
important for normal users.

Matt> (5) is conceptually pretty simple to understand, but fairly complex to solve. We
Matt> can model the CU cache cleanup functionality as a form of age-based garbage
Matt> collection that we're trying to parallelize. And there's plenty of fun to be
Matt> had by going down that particular rabbit hole. :^)

We have some evidence that this cache is just not very good:

    https://sourceware.org/bugzilla/show_bug.cgi?id=25703

Changing it radically to work in some other way seems totally fine.
Like, I think if the DIEs from some CU are ever needed for a cross-CU
reference, then just keeping those around for the duration of the
current "expansion operation" is fine.

Matt> So I'd like to hear your thoughts and opinions on this. If there's anything I've
Matt> missed or got wrong, please let me know.

A different idea is to try to somewhat merge the partial (cooked index)
DWARF reader and the full reader, and do lazy expansion of CUs.  This
has the same end goal -- speed up CU expansion.

    https://sourceware.org/bugzilla/show_bug.cgi?id=29398

The idea here is that, even when some data from a CU is needed (like a
type, or if gdb stops in some function in the CU), frequently the rest
of the data there is not needed.  So, reading it all is needlessly slow.

There's a second idea here as well, which is unifying the top-level
names between the index and the symtab -- occasionally we've had
divergences here, which result in weird bugs.

The latter could be done without full laziness, though, by using the
index to build the symtab but then filling in the details using the
current code.

The big benefit of lazy expansion is that it would be much faster for
the pathologically large CUs that do sometimes appear (whereas with
parallel reading, a really big CU will still be slow).  The main
drawback is that it's more complicated, so there's more chance for bugs,
and the bugs will be harder to understand when they do occur.

Tom

^ permalink raw reply	[relevance 0%]

* Parallelization of dwarf2 symtab parsing
@ 2022-12-21 18:46  5% Matheus Branco Borella
  2022-12-21 20:19  0% ` Tom Tromey
  0 siblings, 1 reply; 65+ results
From: Matheus Branco Borella @ 2022-12-21 18:46 UTC (permalink / raw)
  To: gdb-patches

I was wondering if there's anything design-wise blocking running the calls to
`dw2_expand_symtabs_matching_one` made in
`cooked_index_functions::expand_symtabs_matching`
in parallel, using a similar approach to what building the psymtabs currently
uses. I tried my hand at it and there doesn't seem to be anything insurmountable
about it, though I haven't been able to get it to work yet.

Doing that naively, though, there are a few trouble variables and structures
that cause data races. The ones I've found are:
    (1) per_objfile->queue,
(2) per_objfile->sym_cu,
(3) per_bfd->dwp_checked,
(4) per_objfile->m_dwarf2_cus, and
(5) the CU cache.

From what I can see:

(1) and (2) can easily be made safe by just making them thread local, seeing as
going down from `dw2_expand_symtabs_matching_one`, they get built fresh and are
only used in the context of that call, before being reset again when it exits.

(3) can also similarly be easily made safe by having it be a `std::once_flag`,
and loading up the DWP in a call to `std::run_once`. Since the flag was only
ever set once to mark the load, and never reset.

(4), though, is where I've hit a bit of a nag. Since it's a global registry of
CUs, all threads must have a shared and coherent view of it. Using a mutex to
lock the map can be done in one of two ways, both with equally undesirable
results:
- We can lock the mutex for the whole lifetime of the parsing of a new CU,
 only unlocking it after the call to `per_objfile->set_cu()`. Obviously,
 the problem with this approach is that it stalls other threads trying to
 parse new CUs.
- We can lock the mutex for just the duration of `per_objfile->set_cu()` and
 let threads parse all CUs as they come in. Problem with this is that there
 will be some amount of rework in the form of multiple calls to
 `load_full_comp_unit`. As, for a given CU 'X', there can now be a window
 of time where a given thread is midway through `load_full_comp_unit(X)`,
 and, because it hasn't called `per_objfile->set_cu(X)` yet, another thread
 can try to load the same CU by calling `load_full_comp_unit(X)` a second
 time, as, from its perspective `per_objfile->get_cu(X) == nullptr`.
As far as solutions to this go, I see three of them:
- We weaken the assertion in `per_objfile->set_cu()` so that, instead of
 checking for whether it exists in the objfile, it instead checks for
 whether the objects are sufficiently similar (for some definition of
 sufficiently similar), before either discarding the new or old CU.
  I'm not familiar with how the lifetimes of these objects are managed, so
I can't say how good of an option this is. Though the possible discarding
issue could maybe be solved by altering the return type and having the
caller give up on its object if someone else has already beaten them to
parsing it faster. This would be the simplest solution to implement.
- We analyze the graph of which CUs cause other CUs to get loaded and run
 the parallel batches in topological order.
   Assuming that we can even know that info. ahead of time, this approach
would still be the most intrusive to the code and the most complex to
pull off.
- We introduce a type that represents a CU that is currently being loaded,
 but that hasn't finished loading quite yet, for threads that need that CU
 to await on.
   This avoids the rework inherent to the first solution, and the need for
dependency info. ahead of time inherent to the second. But also
has the potential to be slower than either of them, seeing as, in the
worst case scenario, some threads can stall multiple times waiting for
CUs to load.

(5) is conceptually pretty simple to understand, but fairly complex to solve. We
can model the CU cache cleanup functionality as a form of age-based garbage
collection that we're trying to parallelize. And there's plenty of fun to be
had by going down that particular rabbit hole. :^)

So I'd like to hear your thoughts and opinions on this. If there's anything I've
missed or got wrong, please let me know.

Thanks,
Matt.

^ permalink raw reply	[relevance 5%]

Results 1-65 of 65 | reverse | sort options + mbox downloads above
-- links below jump to the message on this page --
2022-12-21 18:46  5% Parallelization of dwarf2 symtab parsing Matheus Branco Borella
2022-12-21 20:19  0% ` Tom Tromey
2023-01-06 20:00     [PATCH 1/1] Add support for gdb.Type initialization from within the Python API Simon Marchi
2023-01-11  0:58  2% ` [PATCH] Add support for creating new types from " Matheus Branco Borella
2023-06-27  3:52 14%   ` [PING] " Matheus Branco Borella
2023-05-26  3:30  2% ` Matheus Branco Borella
2023-08-07 14:53  5%   ` Andrew Burgess
2023-08-08 21:00  1%     ` [PATCH v2] " Matheus Branco Borella
2024-01-13  1:37  1%       ` [PATCH v3] " Matheus Branco Borella
2024-01-13  7:21  0%         ` Eli Zaretskii
2024-01-16  4:55  7%           ` Matheus Branco Borella
2023-01-06 20:21     [PATCH 1/1] Add support for symbol addition to " Simon Marchi
2023-01-12  2:00  4% ` [PATCH] " Matheus Branco Borella
     [not found]     <3Cc9d2dc49-45c7-d6b9-c567-4ec78dd870a0>
2023-01-11  0:35  3% ` [PATCH] Add __repr__() implementation to a few Python types Matheus Branco Borella
2023-01-18 17:05  7%   ` Bruno Larsen
2023-01-18 18:02  6%   ` Andrew Burgess
2023-01-20  1:43  3%     ` Matheus Branco Borella
2023-01-20 16:45  5%       ` Andrew Burgess
2023-01-24 14:45  7%       ` Andrew Burgess
2023-05-18  3:33  7%         ` Matheus Branco Borella
2023-05-19 21:27  7%           ` [PATCHv3 0/2] " Andrew Burgess
2023-05-19 21:27  3%             ` [PATCHv3 2/2] gdb: add " Andrew Burgess
2023-06-07 17:05  7%             ` [PATCHv3 0/2] Add " Matheus Branco Borella
2023-06-08 18:46  7%               ` Andrew Burgess
2023-06-09 12:33  7%               ` Andrew Burgess
2023-07-04 11:09  2%                 ` Andrew Burgess
2023-05-27  1:24  3% [PATCH] Add support for symbol addition to the Python API Matheus Branco Borella
2023-06-27  3:53 14% ` [PING] " Matheus Branco Borella
2023-07-04 15:14  7% ` Andrew Burgess
2023-07-07 23:13  3%   ` Matheus Branco Borella
2024-01-13  1:36  3%     ` [PATCH v2] " Matheus Branco Borella
2024-02-06 17:50  0%       ` Tom Tromey
2024-02-24 17:35  7%         ` Matheus Branco Borella
2023-06-08 21:40  5% [PATCH] Add name_of_main and language_of_main to the DWARF index Matheus Branco Borella
2023-06-09 16:56  0% ` Tom Tromey
2023-06-30 20:36  4%   ` Matheus Branco Borella
2023-07-01  5:47  0%     ` Eli Zaretskii
2023-07-07 15:00  4%       ` Matheus Branco Borella
2023-07-07 18:00  0%         ` Eli Zaretskii
2023-08-04 20:55  0%           ` Tom de Vries
2023-08-03  7:12  7%         ` Tom de Vries
2023-08-03  7:29  7%         ` Tom de Vries
2023-08-04 18:09  4%           ` [PATCH v2] " Matheus Branco Borella
2023-08-11 18:21  4% ` [PATCH v3] " Matheus Branco Borella
2023-08-14  7:31  7%   ` Tom de Vries
2023-09-13  7:09  0%     ` Tom de Vries
2023-09-25 18:47  7%       ` Matheus Branco Borella (DarkRyu550)
2023-09-26 14:07  7%         ` Tom de Vries
2023-10-04 22:30  0%           ` Tom de Vries
2023-10-06 18:31  7% [PATCH 0/2] " Tom de Vries
2023-10-06 18:31  4% ` [PATCH 1/2] [gdb/symtab] " Tom de Vries
2023-10-10 19:19  6%   ` Tom Tromey
2023-10-11 15:37  0%     ` Tom de Vries
2024-01-06  2:45  6% [PATCH] Make `linux_info_proc` prefer using the LWP over the PID Matheus Branco Borella
2024-01-08 15:50  7% ` Simon Marchi
2024-01-19 16:52  7%   ` Matheus Branco Borella
2024-01-19 16:49  6% ` [PATCH v2] " Matheus Branco Borella
2024-01-16  4:54  1% [PATCH v4] Add support for creating new types from the Python API Matheus Branco Borella
2024-01-16 12:45  0% ` Eli Zaretskii
2024-01-16 17:50  7%   ` Matheus Branco Borella
2024-01-16 18:20  7%   ` [PATCH v4] Add support for creating new types from the Python API Matheus Branco Borella
2024-01-16 18:56  0%     ` Eli Zaretskii
2024-01-16 21:27  7%       ` Matheus Branco Borella
2024-02-06 18:20  4% ` Tom Tromey
2024-02-21 18:11  6%   ` Matheus Branco Borella
2024-04-12  6:06  2% ☠ Buildbot (Sourceware): binutils-gdb - failed test (failure) (master) builder
2024-04-16 15:22  2% ☠ Buildbot (Sourceware): binutils-gdb - failed test (failure) " builder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).