A couple months ago I sent my proposal to add improvement for lower_bound. As a reminder I send it once more I want to propose an improvement to std::__lower_bound for arithmetic types with the standard comparators. The main idea is to use linear search on a small number of elements to aid the branch predictor and CPU caches, but only when it is not observable by the user. In other words, if a standard comparator (std::less, std::greater) is used for arithmetic types. In benchmarks I achieved twice the increase in speed for small vectors(16 elements) and increase for 10-20% in large vectors(1'000, 100'000 elements). Code: https://gist.github.com/ATGsan/8a1fdec92371d5778a65b01321c43604 PR: https://github.com/ATGsan/gcc/pull/1