Fields
| hardware | None |
| os | None |
Parameters
Fields
| Name | Value |
|---|
Parameters
| Name | Value |
|---|---|
| git_commit |
commit a83f8e0314fcdda162e54cbba1c9dcf230dff093 Author: David Benjamin <davidben@google.com> Date: Mon Mar 11 23:40:47 2024 -0400 [libc++][hardening] Check bounds on arithmetic in __bounded_iter (#78876) Previously, `__bounded_iter` only checked `operator*`. It allowed the pointer to go out of bounds with `operator++`, etc., and relied on `operator*` (which checked `begin <= current < end`) to handle everything. This has several unfortunate consequences: First, pointer arithmetic is UB if it goes out of bounds. So by the time `operator*` checks, it may be too late and the optimizer may have done something bad. Checking both operations is safer. Second, `std::copy` and friends currently bypass bounded iterator checks. I think the only hope we have to fix this is to key on `iter + n` doing a check. See #78771 for further discussion. Note this PR is not sufficient to fix this. It adds the output bounds check, but ends up doing it after the `memmove`, which is too late. Finally, doing these checks is actually *more* optimizable. See #78829, which is fixed by this PR. Keeping the iterator always in bounds means `operator*` can rely on some invariants and only needs to check `current != end`. This aligns better with common iterator patterns, which use `!=` instead of `<`, so it's easier to delete checks with local reasoning. See https://godbolt.org/z/vEWrWEf8h for how this new `__bounded_iter` impacts compiler output. The old `__bounded_iter` injected checks inside the loops for all the `sum()` functions, which not only added a check inside a loop, but also impeded Clang's vectorization. The new `__bounded_iter` allows all the checks to be optimized out and we emit the same code as if it wasn't here. Not everything is ideal however. `add_and_deref` ends up emitting two comparisons now instead of one. This is because a missed optimization in Clang. I've filed #78875 for that. I suspect (with no data) that this PR is still a net performance win because impeding ranged-for loops is particularly egregious. But ideally we'd fix the optimizer and make `add_and_deref` fine too. There's also something funny going on with `std::ranges::find` which I have not yet figured out yet, but I suspect there are some further missed optimization opportunities. Fixes #78829. (CC @danakj) |