Hi Mark,

Thanks for clarifying this, I was getting mixed up between normal str's and byte strings. Your patch was 99% of the way there to fix the type resolution so I finished it off for you:

https://github.com/Rust-GCC/gccrs/pull/698/files

The missing piece was that References and Array's are a type of covariant type so that an array type can look like this: [_, capacity], so the inference variable here is the variant so that we need to make sure it has its own implicit mapping id. You just needed to create one more mapping to get that implicit id so that the reference type similarly doesn't get into a loop of looking up itself. Creating implicit types like this could be made easier, so we should likely add some helpers for this scenario.

Let me know what you think.

Thanks

--Phil

On Sat, 25 Sept 2021 at 12:53, Mark Wielaard <mark@klomp.org> wrote:

Hi Philip,

On Fri, Sep 24, 2021 at 12:01:42PM +0100, Philip Herron wrote:
> This is really useful information, will this mean that the lexer token will
> need to represent strings differently as well? Or is the std::string in the
> lexer still ok?

I think the respresentation as std::string is fine. As long as we
don't mix std::strings between different types (byte strings may
contain sequences of chars that aren't valid utf-8 sequenecs).

> The change you made above has the problem that reference types like, arrays
> are forms of what rust calls covariant types since they might contain an
> inference variable, so they require lookup to determine the base type. Its
> likely there is a reference cycle here. Though this change will not be
> correct for type checking purposes. The design of the type system is purely
> about rust type checking and inferring types.

OK, so how do I represent an reference to an array type that doesn't
contain any inference variables? When we see a b"hello" byte string
that is the same as seeing &[b'h', b'e', b'l', b'l', b'o'] which is
the same as seeing &[0x68u8, 0x65u8, 0x6cu8, 0x6cu8, 0x6fu8];

So we know this is &[u8;5] and if we write:

let a = b"hello";

We want to infer that a has type &[u8;5].

> So for example this change will break the case of:
>
> ```
> let a:str = "test";
> ```
>
> Since the TypePath of str can't know the size of the expected array at
> compilation time. And the error message will end up with something like
> "expected str got [i8, 4]";

Right, but that is for "proper strings". It is somewhat unfortunate
that Rust calls byte strings also "strings", but they really
aren't. b"abc" is static array of u8, not a &str (containing utf-8).

I have to think about the slicing of "proper strings", which sound
more complicated than slicing of byte strings, because I don't think
you want to chop up a utf-8 sequence. For now I would simply try to
get the type of byte strings like b"test" correct.

Cheers,

Mark