Hey guys,

I'm new to contributing to GNU projects, but I'm guessing I send commits through here? I'd appreciate a note on the website describing the procedure.

I noticed that your matrix multiplication code has poor cache performance because its loops are nested in the wrong order. In a replicated version of my change, I saw about a 20% performance gain on my AMD FX CPU.

Do let me know if this is not the correct contribution procedure.

-Max
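
P.S. For illustration, here is a minimal sketch of the kind of loop reordering described above. It is not the project's actual code: the function names, the row-major layout, and the square n-by-n `double` matrices are all assumptions made for the example. The idea is that in i-j-k order the inner loop steps through `b` with a stride of n elements, while in i-k-j order the inner loop reads `b` and writes `c` contiguously.

```c
#include <stddef.h>

/* Hypothetical baseline, i-j-k order: the inner loop reads b[k * n + j]
   with k varying, so consecutive accesses to b are n doubles apart. */
void matmul_ijk(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Reordered, i-k-j order: the inner loop varies j, so both b[k * n + j]
   and c[i * n + j] are accessed at consecutive addresses. */
void matmul_ikj(size_t n, const double *a, const double *b, double *c)
{
    /* c is accumulated into, so it must start at zero. */
    for (size_t idx = 0; idx < n * n; idx++)
        c[idx] = 0.0;

    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            double aik = a[i * n + k];  /* constant across the inner loop */
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

Both functions compute the same product; only the memory access pattern differs. How much the reordering helps depends on matrix size, compiler flags, and the CPU's cache hierarchy, so any speedup should be measured rather than assumed.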