public inbox for gcc-rust@gcc.gnu.org
* [GSoC] gcc-rs - Unicode Support or Metadata
@ 2023-04-06  2:20 Charlie Hernandez
From: Charlie Hernandez @ 2023-04-06  2:20 UTC (permalink / raw)
  To: gcc, gcc-rust

Dear GCC members,

I understand that I am late in submitting this proposal. However, I found
out about gcc-rust and Google Summer of Code three hours ago, and rather
than do nothing, I decided to apply anyway. I'm interested in Rust and the
GCC front end for many reasons, and I would like to be considered for this
project. I can commit fully to the project if any of my proposals is
accepted.


# General Information
Name: Carlos "Charlie Cruz" Hernandez
Email: cjh16@rice.edu
University: Rice University, Class of 2026
Major/Focus: Mathematics and Linguistics
Country/Timezone: United States / Eastern Standard Time
What is your Open Source Experience so far?

Online I go by "SeniorMars" (https://github.com/SeniorMars), and I have
contributed to the following significant projects: rust-analyzer, Neovim,
coc-rust-analyzer, and the Rust compiler (documentation). I'm highly
active in the Neovim and LaTeX communities, and I'm working on several
Neovim plugins for the Typst markup language. Additionally, at Rice, I
taught https://lazy.rice.edu/ (the website is outdated due to University
policies, for now), which aims to teach open source concepts to students.
Finally, I have a YouTube channel dedicated to open-source concepts:
https://www.youtube.com/@SeniorMarsTries. Relevant to this project, I took
my University's programming class as a freshman. Also, notably, I'm working
on a tree-sitter parser for the Typst markup language that deals with
Unicode. In Neovim, I'm also trying to tackle "concealed text" with virtual
text. Although I have yet to work with gcc-rs, I'm confident I can help.

# Project Information

I wish to tackle one of the three projects suggested in the gcc-rust
section: Unicode support, Metadata, or Improving user errors.

## Unicode support

While working on the Typst tree-sitter project, I've learned how extensive
Unicode is and how difficult it is to parse such a language correctly. In
particular, I learned how to handle Unicode's many edge cases, e.g.,
emojis, the different kinds of whitespace, and identifiers.

My main goal is to apply all the concepts I've learned with Typst to gcc-rs.

Thus, the main difficulty will be modifying the lexer to handle
\p{Whitespace}, \p{XID_Start}, and \p{XID_Continue} properly without
complicating parsing in other areas of the project. Reusing code from
libcpp/ucnid.h in libcpp (the C preprocessor library) may help with this
part. We must also introduce a new Rust::String class to represent Rust
identifiers, strings, and `create_name`, replacing the old implementation.
Of course, I also need to implement the v0 mangling scheme, which Rust uses
to encode Unicode identifiers in symbol names. I can take a lot of
inspiration from tree-sitter.
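
As a rough illustration of the first lexer step, here is a minimal sketch of
decoding one UTF-8 sequence into a code point before any property lookup
happens. The names `Decoded`, `decode_utf8`, and `is_pattern_white_space`
are hypothetical and are not taken from gcc-rs. The whitespace set is an
assumption as well: it is the Pattern_White_Space list that the Rust
Reference uses for whitespace, which is narrower than the full White_Space
property. XID_Start and XID_Continue need generated tables and are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type: the decoded code point and the bytes consumed.
struct Decoded {
  char32_t code_point;
  std::size_t length;
};

// Decode the UTF-8 sequence at the start of `s`. Returns std::nullopt for
// truncated input, bad continuation bytes, overlong forms, surrogates, and
// values above U+10FFFF.
std::optional<Decoded> decode_utf8(std::string_view s) {
  if (s.empty()) return std::nullopt;
  const auto byte = [&](std::size_t i) {
    return static_cast<unsigned char>(s[i]);
  };
  const unsigned char lead = byte(0);
  if (lead < 0x80) return Decoded{lead, 1};

  std::size_t len = 0;
  char32_t cp = 0;
  char32_t min = 0;  // smallest code point allowed for this length
  if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; min = 0x80; }
  else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min = 0x800; }
  else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min = 0x10000; }
  else return std::nullopt;  // stray continuation byte or invalid lead

  if (s.size() < len) return std::nullopt;
  for (std::size_t i = 1; i < len; ++i) {
    if ((byte(i) & 0xC0) != 0x80) return std::nullopt;
    cp = (cp << 6) | (byte(i) & 0x3F);
  }
  assert(len >= 2 && len <= 4);
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return std::nullopt;
  return Decoded{cp, len};
}

// Assumption: whitespace means Pattern_White_Space (11 code points).
bool is_pattern_white_space(char32_t c) {
  return (c >= 0x09 && c <= 0x0D) || c == 0x20 || c == 0x85 ||
         c == 0x200E || c == 0x200F || c == 0x2028 || c == 0x2029;
}
```

A lexer would call `decode_utf8` on the remaining input, advance by
`length`, and classify `code_point`. Rejecting overlong and surrogate
encodings up front keeps the later property checks simple.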

My timeline would closely follow that of the two proposals submitted before
mine. However, I would implement Punycode earlier, as it would give me a
checklist of everything I must test for the lexer to fully support Unicode.
With the rest shifted later, it becomes easier to write tests for the cases
I know will be difficult to deal with.
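
To make the Punycode step concrete, here is a minimal sketch of the encoder
from RFC 3492. The function name `punycode_encode` and its `delimiter`
parameter are hypothetical and not part of gcc-rs. The parameter exists
because, as far as I understand RFC 2603, the v0 scheme writes `_` where
standard Punycode writes `-`. Please treat that as an assumption to verify
against the RFC, along with any other v0-specific details.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>

namespace {
// Parameters fixed by RFC 3492, section 5.
constexpr std::uint32_t kBase = 36, kTMin = 1, kTMax = 26, kSkew = 38,
                        kDamp = 700, kInitialBias = 72, kInitialN = 128;

// Digits 0..25 map to 'a'..'z', digits 26..35 map to '0'..'9'.
char encode_digit(std::uint32_t d) {
  assert(d < kBase);
  return static_cast<char>(d < 26 ? 'a' + d : '0' + (d - 26));
}

// Bias adaptation, RFC 3492 section 6.1.
std::uint32_t adapt(std::uint32_t delta, std::uint32_t num_points,
                    bool first_time) {
  delta = first_time ? delta / kDamp : delta / 2;
  delta += delta / num_points;
  std::uint32_t k = 0;
  while (delta > ((kBase - kTMin) * kTMax) / 2) {
    delta /= kBase - kTMin;
    k += kBase;
  }
  return k + (kBase - kTMin + 1) * delta / (delta + kSkew);
}
}  // namespace

// Encode code points as Punycode. Returns std::nullopt on overflow.
std::optional<std::string> punycode_encode(const std::u32string &input,
                                           char delimiter = '-') {
  constexpr std::uint32_t kMax = std::numeric_limits<std::uint32_t>::max();
  std::string out;
  for (char32_t c : input)  // copy the basic (ASCII) code points first
    if (c < kInitialN) out.push_back(static_cast<char>(c));
  const std::uint32_t basic = static_cast<std::uint32_t>(out.size());
  std::uint32_t handled = basic;
  if (basic > 0) out.push_back(delimiter);

  std::uint32_t n = kInitialN, delta = 0, bias = kInitialBias;
  while (handled < input.size()) {
    std::uint32_t m = kMax;  // smallest unhandled code point >= n
    for (char32_t c : input)
      if (c >= n && c < m) m = c;
    if (m - n > (kMax - delta) / (handled + 1)) return std::nullopt;
    delta += (m - n) * (handled + 1);
    n = m;
    for (char32_t c : input) {
      if (c < n && ++delta == 0) return std::nullopt;
      if (c != n) continue;
      std::uint32_t q = delta;  // emit delta as a variable-length integer
      for (std::uint32_t k = kBase;; k += kBase) {
        const std::uint32_t t =
            k <= bias ? kTMin : (k >= bias + kTMax ? kTMax : k - bias);
        if (q < t) break;
        out.push_back(encode_digit(t + (q - t) % (kBase - t)));
        q = (q - t) / (kBase - t);
      }
      out.push_back(encode_digit(q));
      bias = adapt(delta, handled + 1, handled == basic);
      delta = 0;
      ++handled;
    }
    ++delta;
    ++n;
  }
  return out;
}
```

For example, `punycode_encode(U"b\u00FCcher")` yields `bcher-kva`, the
familiar form behind the IDNA name `xn--bcher-kva`. Known vectors like this
one, together with the samples in RFC 3492, could seed the test checklist
mentioned above.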


## Metadata
While working on typst.nvim, I decided to use Rust to communicate with
Neovim's API and Lua by linking a binary into something Neovim can use.
This piqued my interest, and from the looks of it, the work I would be
doing in this project would be porting all the requirements of
`rustc_metadata::rmeta::CrateRoot` to `rust-export-metadata.cc`, whose spec
is detailed in `src/rustc_metadata/rmeta/encoder.rs`. In particular, I
would ensure that we support the Strict Version Hash (SVH), the Stable
Crate Id, and encoded MIR.

My timeline is therefore based on modifying and implementing the fields in
`CrateRoot`.

In general:

Week 1-2:
- Modify rust-export-metadata.cc to include the "basic" fields in
CrateRoot, such as edition, panic_in_drop_strategy
- MetaItem

Week 3:
- Implement a testing method that correctly loads only specific metadata
when hashes are identical.
- Document all the functions I created

Week 4-5:
- Implement CrateDep
- Implement Strict Version Hash, which also needs:
  - proper StableCrateId, which needs:
    - proper basic metadata support

Week 5-7:
- Implement `SourceFile`, `ForeignModule`, `NativeLib`, and the rest.

Week 8:
- Testing and documentation, plus start a write-up.

Week 9-10:
- Pipelining and crate loading

Week 11-12:
- Modify our rlib and add dylib support with compression
I would appreciate any mentor. I understand that I am late and that this
email could be more thorough; however, I would love to work on gcc-rs this
summer.

Thank you,
Charlie


-- 
Charlie Cruz -- Going through a name change!
Math & Linguistics @ Rice University '26
