Hi Mark,

Thanks for clarifying this, I was getting mixed up between normal strs and
byte strings. Your patch was 99% of the way to fixing the type
resolution, so I finished it off for you:

https://github.com/Rust-GCC/gccrs/pull/698/files

The missing piece was that references and arrays are both covariant
types, so an array type can look like this: [_; capacity]. The inference
variable there is the part that varies, so we need to make sure it has
its own implicit mapping id. You just needed to create one more mapping
to get that implicit id, so that the reference type similarly doesn't get
into a loop of looking itself up. Creating implicit types like this could
be made easier, so we should probably add some helpers for this scenario.

Let me know what you think.

Thanks

--Phil

On Sat, 25 Sept 2021 at 12:53, Mark Wielaard <mark@klomp.org> wrote:

> Hi Philip,
>
> On Fri, Sep 24, 2021 at 12:01:42PM +0100, Philip Herron wrote:
> > This is really useful information, will this mean that the lexer token
> > will need to represent strings differently as well? Or is the
> > std::string in the lexer still ok?
>
> I think the representation as std::string is fine. As long as we
> don't mix std::strings between different types (byte strings may
> contain sequences of chars that aren't valid utf-8 sequences).
>
> > The change you made above has the problem that reference types, like
> > arrays, are forms of what rust calls covariant types since they might
> > contain an inference variable, so they require lookup to determine the
> > base type. It's likely there is a reference cycle here. Though this
> > change will not be correct for type checking purposes. The design of
> > the type system is purely about rust type checking and inferring types.
>
> OK, so how do I represent a reference to an array type that doesn't
> contain any inference variables? When we see a b"hello" byte string
> that is the same as seeing &[b'h', b'e', b'l', b'l', b'o'] which is
> the same as seeing &[0x68u8, 0x65u8, 0x6cu8, 0x6cu8, 0x6fu8];
>
> So we know this is &[u8;5] and if we write:
>
> let a = b"hello";
>
> We want to infer that a has type &[u8;5].
>
> > So for example this change will break the case of:
> >
> > ```
> >   let a:str = "test";
> > ```
> >
> > Since the TypePath of str can't know the size of the expected array at
> > compilation time. And the error message will end up with something like
> > "expected str got [i8, 4]";
>
> Right, but that is for "proper strings". It is somewhat unfortunate
> that Rust calls byte strings also "strings", but they really
> aren't. b"abc" is a static array of u8, not a &str (containing utf-8).
>
> I have to think about the slicing of "proper strings", which sounds
> more complicated than slicing of byte strings, because I don't think
> you want to chop up a utf-8 sequence. For now I would simply try to
> get the type of byte strings like b"test" correct.
>
> Cheers,
>
> Mark
>
>