Retrofitting spatial safety to lines of C++

(security.googleblog.com)

82 points | by jandeboevrie 4 days ago ago

133 comments

titzer 4 days ago ago
> We’ve begun by enabling hardened libc++, which adds bounds checking to standard C++ data structures, eliminating a significant class of spatial safety bugs.
Well, it's 2024 and remember arguing this 20+ years ago. Programs have bugs that bounds checking catches. And making it a language built-in exposes it to compiler optimizations specifically targeting bounds checks, eliminating many and bringing the dynamic cost down immensely. Just turning them on in libraries doesn't necessarily expose all the compiler optimizations, but it's a start. Safety checks should really be built into the language.
[-]
- pjmlp 4 days ago ago
  Before C++98, this used to be pretty much table stakes in C++ compiler frameworks, e.g. Turbo Vision, AppToolbox, OWL, MFC,....
  I still don't get why the standard library went the other way, other than starting the tradition of standardised wrong defaults.
  [-]
  - IshKebab 4 days ago ago
    The C++ standards committee is still under the illusion people can read, understand and remember the entire spec, and write code without making mistakes. All these bugs are the fault of the people making mistakes, not C++.
    [-]
    - tialaramex 4 days ago ago
      I don't think there's any such illusion. You do not see, for example, WG21 members who are confident that they understand the entire C++ language (on the contrary they'll often accept corrections about the language from other committee members) and it's not infrequent that a committee member will agree with the statement that C++ is too large and sprawling for any individual to attain such comprehensive understanding. [Today I would guess maybe Sean Baxter, who wrote his own compiler, has the best individual understanding and I believe Sean is not a member of the committee]
      Instead WG21 has very clearly (but without ever admitting it and that's important) taken the path of maintaining a legacy language. Even as debate carried on about whether in future C++ could end up like COBOL, the committee has acted exactly as though it is for some years now. Compatibility is King, no price is too high for compatibility, everything must be sacrificed to make that happen and that's how you end up like COBOL.
      Three important opportunities to divert and pick other ways forward should be highlighted here. P1863 "ABI: Now or Never" by Titus Winters in 2020; P2137 "Goals and priorities for C++" also in 2020 but with a long list of authors and P1818 "Epochs" from 2019 by Vittorio Romeo.
      In all these cases WG21 chose the "hope the problem goes away" path, preferring not only not to address the critical problem highlighted and take a new route forward, but to specifically ignore the problem and press on anyway.
      "Hope the problem goes away" is also, quietly, the preferred strategy by WG21 for the safety problem.
      There's a reason (albeit a terrible one) to prefer the C++ ISO document's language over the approach of Rust. These are both general purpose languages (I might also write separately in this thread about a non-general purpose language which Google should use more, if I have time) and so must wrestle with Rice's Theorem. Rust's solution is to require the compiler to be conservative. This is very difficult and indeed there are known bugs in the code doing this conservative check in the Rust official compiler. But C++ has a much easier (but IMO fatal) path, it says that's the job of the programmer and when the programmer writes C++ software which is nonsense as a result that's their fault, not the compiler's fault for failing to reject the program.
      It would be extremely difficult to explain how a "standards conforming" Rust compiler can correctly accept all the programs Rust's actual compiler accepts and reject all those it rejects without essentially having a black box where the compiler implementation sits. We can explain the purpose of such rules without, but their detailed behaviour not so much.
      Take borrow checking. All the easy scoped borrows (which is all that worked in Rust say eight years ago) can be explained without too much trouble, but today a lot fancier (but to a human obviously correct) borrowing will compile, because the checker is smarter - now, how do you express, not in Rust source code but in the English language, all the checks to be performed, and neither miss things out nor unknowingly accept programs a real Rust compiler will reject ?
      C++ just needn't do that, in effect the ISO document says. "Don't do borrows that last longer than the thing borrowed, if you do, that's not C++ but your compiler won't notice so the result is arbitrary nonsense"
      [-]
      - kibwen 4 days ago ago
        > how do you express, not in Rust source code but in the English language, all the checks to be performed, and neither miss things out nor unknowingly accept programs a real Rust compiler will reject ?
        I think this is unintentionally stuck in the mindset of "the purpose of a language specification document is to enable armchair language lawyers to flame each other on Usenet about whether or not such-and-such degenerate edge case is technically valid". But a specification doesn't need to be written in English, it can be written as a formal proof, and indeed I would expect a theoretical Rust spec to specify the behavior of the borrow checker as just such a proof. Rust's borrow checking may no longer be as simple as the lexically-scoped model that existed as of Rust 1.0, but it's not like the extensions that have been added since then are ad-hoc; they're all still designed to result in a model that is provably sound.
        [-]
        almostgotcaught 4 days ago ago
        > But a specification doesn't need to be written in English, it can be written as a formal proof, and indeed I would expect a theoretical Rust spec to specify the behavior of the borrow checker as just such a proof.
        Did you miss the part where the person you're responding to mentioned Rice's theorem? Do you know Rice's theorem and hence understand what they're implying?
        [-]
        kibwen 3 days ago ago
        Rice's theorem isn't relevant here. The goal is not to create a system that produces no false positives, it's perfectly fine to do a conservative syntactic analysis that allows false positives but disallows false negatives, and it's then possible to produce a formal proof that this analysis is sound. It is this formal proof that I would expect to be included in a specification in lieu of English prose.
        [-]
        tialaramex 3 days ago ago
        Sure, there's a reason I said "Extremely difficult" not "Impossible". Defying Rice without losing generality is mathematically impossible.
        This is on that continuum where it's definitely neither impossible nor easy enough that we can just let some bored grad student knock out the answer, and so now somebody who wants this must do lots of hard work.
        I think a specification which says e.g. here's the semantic requirement, here's a rule for scoped borrows which works, you must do at least that, but you can do more however you must not allow anything which violates the semantic requirement - would be great, but if you had that rule in your standard then people can write conforming Rust programs which don't compile - they need a yet-to-be-written smarter compiler to figure out why they're legal, which is kinda annoying as a language feature.
        Ygg2 3 days ago ago
        Rice theorem doesn't say anything about humans.
        [-]
        almostgotcaught 3 days ago ago
        No clue what this means
        [-]
        tialaramex 3 days ago ago
        Some people believe that the Church-Turing intuition doesn't tell us anything about humans, that what humans are doing isn't computation but something more powerful. In my experience their lack of evidence for this belief just makes them believe it even harder, and they often write whole books which are in effect the argument from incredulity but expanded to book form.
        [-]
        carbotaniuman 3 days ago ago
        There is no proof that humans are just glorified Turing machines and even as a nonreligious person, I find such a statement to be as lacking in evidence as those that claim humanity has some soul or similar that cannot be replicated.
        The actual logic of gggp's statement also doesn't make any sense. We as humans also under and overestimate the soundness of programs.
        Sometimes, a perfectly fine solution is massaged to better adhere to best practices because we can't convince ourselves that it's correct. Rust requires that we convince the compiler, and then we know it's correct via the compiler's proofs, instead of requiring us to do the proof all the time.
        [-]
        IshKebab 3 days ago ago
        > I find such a statement to be as lacking in evidence as those that claim humanity has some soul or similar that cannot be replicated.
        It doesn't need evidence; it is the null hypothesis.
        Brains clearly compute, and it appears that computation is sufficient to produce the observed behaviour of brains. All our experience of the universe and physics suggests that there is no magic or metaphysics or souls or whatever.
        So the onus is on you to show that there's something more going on. It isn't a 50:50 "is it heads or tails", it's more like "I claim that the tooth fairy exists" vs "I'm pretty sure it's your mum".
        Ygg2 3 days ago ago
        It's simpler than this. Turing machines are a beautiful abstraction. Whatever happens in humans is much, much, much, much, much, much messier, on the account of it being subject to laws of evolution and working on a scale where various micro-effects can be felt (radiation, Brownian motion and quantum effects anyone?).
        So even if the Turing machine model is correct (and we don't know that), it's overtly simplified.
        Ygg2 3 days ago ago
        Humans are not Turning machines. I'm not talking how we work on fundamental level.
        I'm saying we don't obey axioms of Turing machine model. So Rice theorem nor Godel theorem can apply to unsafe code written by humans.
        Even if borrow checker is limited by the Rice theorem, you can create either safe abstractions provably or unsound abstractions provably or potentially unsound abstractions, which humans can reject or accept.
      - IshKebab 3 days ago ago
        This doesn't really have anything to do with Rice's theorem. It's difficult to specify the behaviour of the borrow checker simply because it is very complex behaviour.
        You absolutely could do it, but it would be a ridiculous effort for nebulous benefit.
    - pjmlp 4 days ago ago
      Spot on.
- flohofwoe 4 days ago ago
  Yeah. FWIW, we shipped PC games since the early 2000s written in C++ where the C++ stdlib was banned (for various reasons, not just memory safety), and our custom container classes were bounds checked via custom asserts which stayed in the code for the shipped game (and the rest of the code also peppered with asserts).
  ...and then you still had to argue with some circles of the C++ community why the game and engine code doesn't use the stdlib. It's crazy that it takes decades to convince some people that a bad idea is simply a bad idea.
  [-]
  - pjmlp 3 days ago ago
    Which is kind of ironic, given how performance minded the game industry is, and then we have those circles with such attitude.
    [-]
    - flohofwoe 3 days ago ago
      Well, I did measure performance overhead of all those asserts of course (not just the range checks). It was somewhere around 2% of the frame budget which isn't nothing, but also not enough to justify removing the asserts.
      [-]
      - pjmlp 3 days ago ago
        That is already a big difference to those that oppose on principle, never having measured anything.
WalterBright 4 days ago ago
Dlang added array bounds checking 20 years ago. It's a huge win. As evidenced by the article noting that 40% of the memory safety bugs were spacial.
I used to have all kinds of problems with array overflows. I didn't make them very often, but when I did, they took a long time to track down. They've been gone for 20 years now.
Note that it would be easy to add it to C/C++:
https://www.digitalmars.com/articles/C-biggest-mistake.html
It would be the most useful and cost-effective enhancement ever.
[-]
- lpapez 4 days ago ago
  Thanks for sharing, I enjoy reading your posts in regards to how ahead of time Dlang was in adopting these improvements.
  I wanted to ask: did you ever consider what was missing from Dlang to achieve widespread adoption? Clearly it was not features, so I'm wondering what that would be from your pespective.
  [-]
  - WalterBright 3 days ago ago
    The marketing department was what was missing. I've always had that problem. Borland was brilliant at marketing an inferior compiler. Phillippe Kahn is an amazing businessman. (He's also a very fun person to talk to.)
    For example, Borland at one point decided to include the source code to some of its runtime library for free. At a compiler roundup in the magazine, this was hailed as a great advance forward by the reviewer. Meanwhile, Datalight C was also in the roundup, and had always included 100% of the runtime library source code. No mention was made of this.
  - dataflow 3 days ago ago
    > what was missing from Dlang to achieve widespread adoption
    This: https://godbolt.org/z/s49qzPn81
- dataflow 4 days ago ago
  They have it already, it's called std::span.
  [-]
  - pjmlp 4 days ago ago
    No they didn't, if you care about security, gsl::span is the answer.
    [-]
    - coffeeaddict1 4 days ago ago
      It's quite obvious to me that the C++ folks running the committee didn't care about safety much. How can they standardise `std::span` knowing it's unsafe?
      They care now (well they pretend at least) because Rust is going to take significant market share in domains where C++ is still king.
      [-]
      - 3836293648 4 days ago ago
        They didn't just standardise it when it was unsafe. They got a proposal for a safe span and demanded that safety be removed before they'd accept it
      - pjmlp 4 days ago ago
        Rust isn't the reason, rather governments are now serious about security, just like in any other industry.
        [-]
        coffeeaddict1 3 days ago ago
        Well yes and no. Rust is the reason because it's a real memory-safe alternative for system programming. If it didn't exist, governments would give C and C++ a "pass" for being memory unsafe.
        [-]
        pjmlp 3 days ago ago
        Except, Rust is never mentioned alone on cybersecurity advisories, the anti safety folks are the ones doing that.
        aninteger 3 days ago ago
        Meh.. I kind of don't think so. The reason you and I keep ending up on haveibeenpwned has likely nothing to do with Rust or C++.
        [-]
        pjmlp 3 days ago ago
        Indeed it has to do with lack of liability, but it will come.
        And then evolution will take care of which programming ecosystem are less expensive to result in lawsuits, or invalidation of insurance policies.
    - dataflow 4 days ago ago
      > No they didn't, if you care about security, gsl::span is the answer.
      https://godbolt.org/z/Pda9Me45P ?
      [-]
      - pjmlp 4 days ago ago
        Unless you use .at() it isn't portable to assume code safety.
        [-]
        dataflow 4 days ago ago
        So what? Just pass the command-line flag to enable the code safety in your toolchain. The same way you pass it to enable optimizations in your toolchain.
        [-]
        coffeeaddict1 4 days ago ago
        > The same way you pass it to enable optimizations in your toolchain.
        No, it's not the same. I never enable optimisations by manually passing in flags to the compiler. It's always a `cmake -DCMAKE_BUILD_TYPE=...`. There is no such easily accessible equivalent for bounds checking.
        [-]
        dataflow 3 days ago ago
        Have you tried https://discourse.cmake.org/t/strictly-appending-to-cmake-la...
        [-]
        coffeeaddict1 3 days ago ago
        What flag can I pass to CMAKE_CXX_FLAGS to enable bounds checking on all platforms regardless of the compiler used? I can do that for optimisations with `CMAKE_BUILD_TYPE`.
        [-]
        dataflow 3 days ago ago
        (How) do you have no control over the environment variables you call CMake with?
        [-]
        coffeeaddict1 3 days ago ago
        I don't quite get what you mean. Of course, I could get CMake to pass a specific compiler flag at the configuration stage, but that misses the point. What I'm saying is that there is a super-easy way to configure CMake to enable optimisations (CMAKE_BUILD_TYPE=Release), but this cannot be said for bounds checking. Note that my own code does have bounds checking enabled for Clang, GCC and MSVC. What I'm arguing is that setting up the latter is significantly more effort than enabling optimisations. I'm not arguing that it isn't possible or that one shouldn't do that.
        [-]
        dataflow 3 days ago ago
        > I don't quite get what you mean. [...] What I'm saying is that there is a super-easy way to configure CMake to enable optimisations (CMAKE_BUILD_TYPE=Release), but this cannot be said for bounds checking
        Maybe I'm not getting what you mean.
        You are saying you already run
        cmake ...
        So I am saying you can just change that to
        CXXFLAGS="-Dblah" cmake -U CMAKE_CXX_FLAGS ...
        That genuinely seems pretty darn easy to me.
        In any case, any beef you have is clearly with CMake here. You'd have the same issue(s) with any other flag, for any language, if you use CMake.
        almostgotcaught 4 days ago ago
        > I never enable optimisations by manually passing in flags to the compiler
        Lol then you don't use your compiler/toolchain correctly. How is that anyone's problem but yours?
        [-]
        coffeeaddict1 3 days ago ago
        How exactly am I not using my toolchain correctly? What's the "correct" way?
        WalterBright 3 days ago ago
        We had this debate early on with D. The resolution was checking was on by default. In order to get array bounds turned off, you had to throw a switch and it only happened for code marked @system.
        This turned out to be the right move.
        [-]
        dataflow 3 days ago ago
        That wasn't something I was even debating here. People derailed this whole discussion.
        All I was doing here was saying was that the fix for your "C's biggest mistake" (your T arr[..] proposal) is already in C++ and you can get it today: it's called std::span, and it was explicitly designed to let you get bounds-checking, with just a different syntax. It needs a compiler flag, and so do optimizations. You already pass one, so pass the other too, and get what you wanted.
        That was all I was saying. But this being HN, everyone insisted on derailing this into an argument about whether safe-by-default is better than fast-by-default, when that had nothing to do with my point, and when I was certainly not trying to argue one is better than the other.
        [-]
        throw16180339 2 days ago ago
        > That wasn't something I was even debating here. People derailed this whole discussion.
        If you propose something with blatantly obvious flaws here, you'll usually get called out.
        You suggested that people use an interface without bounds checking and jump through a hoop to enable bounds checking with it. Other people disagreed that this is a solution. You kept digging deeper after that while ignoring their responses, but that's on you.
        [-]
        dataflow 2 days ago ago
        > If you propose something with blatantly obvious flaws here, you'll usually get called out. You suggested that people use an interface without bounds checking and jump through a hoop to enable bounds checking with it. Other people disagreed that this is a solution. You kept digging deeper after that while ignoring their responses, but that's on you.
        The problem you don't seem to understand is that, with this being HN, if I'd told people to use gsl::span, then I would have had a similar barrage of people "calling me out" for it having the "obvious flaws" of (1) destroying performance for users who don't want it, and/or (2) being nonstandard and in no way equivalent to the dlang.org proposal, this is why C++ sucks, blah blah. I might as well have just told them to write their own configurable wrappers at that point.
        So I proposed std::span because it was literally the standard solution that was explicitly designed to let people get bounds checking without those problems... so that they can have their cake and eat it however they want, without an immediate performance loss. I frankly thought that was obvious, but this being HN, I was greeted with people "calling me out". It's like it's impossible to tell people something useful here without writing a comprehensive dissertation on the general topic. Makes me regret trying to help people.
        pjmlp 4 days ago ago
        Which command line option from ISO C++23?
        It is not in the standard, it isn't neither portable, nor guaranteed to exist.
  - coffeeaddict1 4 days ago ago
    std::span is not bounds checked by default.
    [-]
    - dataflow 4 days ago ago
      Optimizations aren't enabled by default either, and yet everyone passes a flag to optimize, and nobody argues C++ sucks just because you need to pass a flag to enable optimizations. Is it so hard to pass another flag to enable bounds checking?
      [-]
      - saagarjha 4 days ago ago
        Yes.
        [-]
        dataflow 4 days ago ago
        Eh? How/why?
        [-]
        saagarjha 4 days ago ago
        Because turning on optimizations makes your code faster and turning on bounds checks makes your code slower. Hence, one gets used far more than the other.
        [-]
        dataflow 4 days ago ago
        > Because turning on optimizations makes your code faster and turning on bounds checks makes your code slower. Hence, one gets used far more than the other.
        The question was "is it so hard to pass a command line flag". You said "yes" when you clearly don't see any difficulty with actually passing the flag. Instead you're apparently answering a totally different question: "why do people lack the motivation to do this." Which had nothing to do with the point you replied to.
        It's not like opt-out vs. opt-in somehow changes the performance characteristics. People who want maximum performance will turn it off. People who want safety will turn it on.
        [-]
        saagarjha 4 days ago ago
        Is it so hard to shoot someone? It's just pressing the trigger. When you say it's hard to kill people you're really just answering a different question, one that is about the psychological or legal or moral cost of doing so. Maybe your overly literal interpretation is not the one people actually want.
        [-]
        dataflow 4 days ago ago
        > Is it so hard to shoot someone? It's just pressing the trigger. When you say it's hard to kill people you're really just answering a different question, one that is about the psychological or legal or moral cost of doing so. Maybe your overly literal interpretation is not the one people actually want.
        You don't feel you're missing the point of the discussion?
        The whole discussion started with: "if you want bounds checking in your own code". Notice the "if". That's the premise.... it by definition assumes you've already accepted the performance impact of getting the safety you want, and thus it's not a problem for you.
        The only remaining question at this point is, how hard is it to get you that safety. Asking you "is it so much harder to pass -foo like the -bar you already pass" and expecting you to address the physical difficulty of adding a flag isn't taking an "overly literal" reading of the question, it's literally asking the most obvious and only remaining question.
        If you want to go back to the premise and argue about the psychological hurdle of taking a performance loss, that's fine and all, but then you're completely changing the topic of the thread you replied to.
        P.S. comparing passing an extra command-line flag to shooting someone is a rather insane comparison. Honestly, all this is really making me regret trying to share a tip to help people make their code safer.
        [-]
        saagarjha 4 days ago ago
        You're regretting it because you keep telling people to use an interface that explicitly was designed to not provide bounds checking and claiming that this is the solution to make their code safer, while in reality you have to look up some nonportable flag to enable it for your STL if even offers the functionality at all. Maybe people would be a lot more reasonable if you didn't post intentional bait in the first place.
        [-]
        dataflow 4 days ago ago
        > You're regretting it because
        No, I'm regretting it because having to spend hours replying to comments that ignore the premise is a complete waste of my time.
        > you keep telling people to use an interface that explicitly was designed to not provide bounds checking
        As a matter of fact it was very intentionally and specifically designed to allow bounds-checking to be configured at build time: "As an example, in the current reference implementation, violating a range-check results by default in a call to terminate() but can also be configured via build-time mechanisms to continue execution (albeit with undefined behavior from that point on)." [1]
        Calling that "explicitly designed not to provide bounds checking" is quite a deceptively misleading way to paint it. It's not an accident that you can enable bounds-checking, it's very much by design and intended that you do so. They just didn't happen to standardize the flag name, just like they never standardized the optimization flag names.
        > and claiming that this is the solution to make their code safer, while in reality you have to look up some nonportable flag to enable it for your STL if even offers the functionality at all.
        Like I said, this is literally the same as optimization flags. Everybody passes them and nobody bashes C++ for it. You're making a big deal out of something incredibly tiny just to win an internet argument on the wrong thread.
        [1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p01...
        [-]
        saagarjha 4 days ago ago
        You're the one misunderstanding here. The reference implementation that they provided (which I actually believe is gsl::span) allows configuration. The design for the standard, as you have mentioned elsewhere in this discussion, does not provide bounds checking. I am making a big deal out of this because it is a problem that affects real codebases, not something hypothetical that you can wave away with your idea of how things work. The fact is that people who care about security ship non-bounds-checked spans because this is not the default option.
        orf 4 days ago ago
        Safety by default, opt-in to unsafety.
        It’s not hard to grok.
        [-]
        dataflow 4 days ago ago
        > Safety by default, opt-in to unsafety. It’s not hard to grok.
        Nobody was ever saying that unsafe-by-default is somehow better. That just wasn't the question being asked.
        [-]
        orf 4 days ago ago
        > The question was "is it so hard to pass a command line flag"
        Can your position not be summed up as “unsafe by default doesn’t matter, because changing the default is easy”?
        If so, there’s an obvious flaw in that thinking.
        [-]
        dataflow 3 days ago ago
        > Can your position not be summed up as “unsafe by default doesn’t matter, because changing the default is easy”?
        No.
        >> Nobody was ever saying that unsafe-by-default is somehow better.
vblanco 4 days ago ago
Game developers have been doing this since forever, its one of their main reasons to avoid the STL.
EASTL has this as a feature by default, and unreal engine container library has the boundchecks enabled on most games. The performance cost of those boundchecks in practice is well worth the reduction of bugs even on performance sensitive code.
[-]
- pjmlp 3 days ago ago
  Which is yet another reason to assert (pun intend), how far from reality the anti-bounds check folks are, when even the game industry takes them seriously.
omoikane 4 days ago ago
> Hardening libc++ resulted in an average 0.30% performance impact
Maybe what really happened is that compiler technology has improved such that they are able to remove most redundant checks, such that it only costs 0.30% today. I can imagine things going the opposite direction 20 years ago, as in "we removed some bounds checks and gained X% of performance".
[-]
- panstromek 4 days ago ago
  Probably yes, and branch prediction improved a lot since then, too. Bounds checks are easily predictable.
  [-]
  - Gibbon1 4 days ago ago
    Bounds checking feels to me like low hanging fruit for a processor designer. A low cost operation that can run in parallel or tossed away as the steam from the instruction decoder gets optimized and scheduled.
    Meanwhile the guys on the standards committee thinks of fixed width RISC instructions being executed by jungle logic and the ALU.
  - adgjlsfhk1 4 days ago ago
    the hard part about bounds checks is you need very specific semantics for bounds errors to prevent them from preventing vectorization. specifically, you don't want to promise that they are thrown when they are executed
    [-]
    - almostgotcaught 4 days ago ago
      > preventing vectorization
      No one that wants to emit vectorized code is relying on auto-vectorization to emit that code.
- cma 4 days ago ago
  Unfortunately for many use cases like gamedev, debug builds need to be fast too. So hopefully more of the improvement is from branch prediction.
- masklinn 4 days ago ago
  Bounds checks are trivially predictable though, I would hope code density was the issue rather than branch prediction.
  And as others note, bounds checking was the norm before the STL.
dzogchen 4 days ago ago
To “lines of C++” and to “hundreds of millions of lines of C++” is quite a different title.
alserio 4 days ago ago
> We first enabled hardened libc++ in our tests over a year ago. This allowed us to identify and fix hundreds of previously undetected bugs in our code and tests.
That's something
TinkersW 4 days ago ago
I wonder if google really never had this turned on before? Like this has been available in the C++ standard library for decades(normally as a debug feature to catch errors in development, but some implementations such as MS support it in release also).
Might explain why they claimed 70% of exploits were memory related..
[-]
- alpire 3 days ago ago
  The hardening mode we enabled is quite recent added to libc++. It was proposed in 2022: https://discourse.llvm.org/t/rfc-c-buffer-hardening/65734. It was designed to run in prod, so it's quite fast. Previous debug modes I've seen came with a much higher costs, and therefore weren't (usually) enabled in prod.
DLoupe 4 days ago ago
> The safety checks have uncovered over 1,000 bugs
In most implementations of the standard library, safety checks can be enabled with a simple #define. In some, it's the default behavior in DEBUG mode. I wonder what this library improves on that and why these bugs have not been discovered before.
[-]
- pjmlp 4 days ago ago
  Being actually enforced, even in release.
  Most folks don't use those #defines, and many still haven't leaned about them.
- dataflow 4 days ago ago
  It's a great question (_LIBCPP_DEBUG was already a thing in libc++), and AFAIK the answer is supposedly "it used to be too costly to enable these in production with libc++, and it no longer is." I have no first-hand insight as to how accurate this perception is.
  [-]
  - alpire 3 days ago ago
    That's exactly right. We've had extra hardening enabled in tests, and that does catch many issues. But tests can't exercise every potential out-of-bounds issue, which is why enabling it prod enabled us to find & fix additional issues.
- saagarjha 4 days ago ago
  They turned those on and 1. checked that the software using it didn't break and 2. made sure it didn't tank performance.
  Source: I worked on this apparently
dataflow 4 days ago ago
PSA: Perhaps this is stating the obvious, but if you want bounds checking in your own code, start replacing T* with std::span<T> or std::span<T>::iterator whenever the target is an array.
[-]
- jpc0 4 days ago ago
  std::span is not bounds checked.
  gsl::span is
  [-]
  - dataflow 4 days ago ago
    > std::span is not bounds checked. gsl::span is
    https://godbolt.org/z/Pda9Me45P ?
    [-]
    - debugnik 4 days ago ago
      You've compiled with _LIBCPP_HARDENING_MODE_FAST, which still adds some extra checks not required by the standard.[1] You can also tell it's nonstandard because it doesn't really throw out_of_range, it just traps.
      > Fast mode, which contains a set of security-critical checks that can be done with relatively little overhead in constant time and are intended to be used in production.
      > Using std::span as an example, setting the hardening mode to fast will always enable the valid-element-access checks when accessing elements via a std::span object, but whether dereferencing a std::span iterator does the equivalent check depends on the ABI configuration.
      1: https://libcxx.llvm.org/Hardening.html
      [-]
      - dataflow 4 days ago ago
        > You've compiled with _LIBCPP_HARDENING_MODE_FAST, which still adds some extra checks not required by the standard.
        The standard doesn't require any checks to begin with.
        It also doesn't require optimizations.
        [-]
        debugnik 4 days ago ago
        It does, on explicitly bounds-checked accessors like .at, which span is gaining for C++26.
        But you originally implied using span was sufficient, you didn't mention LLVM's libc++ hardening. (You even mentioned iterators which, I just quoted, might not be bounds-checked on fast mode either.)
        [-]
        dataflow 4 days ago ago
        > It does, on explicitly bounds-checked accessors like .at, which span is gaining for C++26.
        When I said "the standard doesn't require this" I clearly was not referring to C++26, which does not even exist yet. In any case, I'm not sure what the point of this pedantry is. I'm pretty sure the point was clear.
        > But you originally implied using span was sufficient, you didn't mention LLVM's libc++ hardening.
        Because this isn't LLVM-specific, every major STL has bounds checking. You just gotta enable it for your toolchain. Sorry I didn't list every single flag, I guess?
        > (You even mentioned iterators which, I just quoted, might not be bound-checked on fast mode either.)
        Which is why I had _LIBCPP_ABI_BOUNDED_ITERATORS, right? I'm not on HN to write comprehensive documentation for every toolchain, I'm just writing a quick tip for people to look into.
        All this pedantic quibbling over "this isn't required by the standard by default" is just pointless arguing for the sake of arguing on the internet. For all the performance freaks who really care about this: no language I know of guarantees optimizations in the standard, so if you're relying on optimized performance, you're already doing nonstandard stuff.
        And practically every major compiled language you love or hate has a way to enable or disable bounds checking, letting you violate their "standard" one way or another. D itself has -boundscheck, C++ has toolchain-specific flags, Go has -gcflags=-B, etc...
        [-]
        debugnik 4 days ago ago
        So your first answer to being told your initial suggestion is insufficient for bounds-checking was to share a godbolt link without elaborating on where the checking was actually coming from; and when I elaborate, for other readers' sake, not yours, on your solution and other comparable ones, you get defensive and repeatedly call me pedant. Ok, but you know, these discussion are for everyone else to read and maybe learn something too, not just us.
        As for the bounds-checked accessors, I mentioned them because they already exist in current C++ for other collections, they're coming to the one you suggested using, and I thought them relevant to a discussion about C++ lacking spatial safety.
        carbotaniuman 3 days ago ago
        I've used vendor-specific C++ compilers with no bounds checking and a barely conforming stdlib, so by your logic C++ has zero bounds checking... Defaults matter!
        [-]
        dataflow 3 days ago ago
        > I've used vendor-specific C++ compilers with no bounds checking and a barely conforming stdlib, so by your logic C++ has zero bounds checking...
        I literally said exactly that: "The standard doesn't require any checks to begin with."
        > Defaults matter!
        Sigh... nobody claimed otherwise. You're really missing the point of the thread.
        All I did was give people a tip on how to improve their code security. The exact sentence I wrote was:
        >> "If you want bounds checking in your own code, start replacing T* with std::span<T> or std::span<T>::iterator whenever the target is an array."
        "BUT DEFAULTS MATTER!!!", you rebut! Well OK, then I guess keep your raw pointers in and don't migrate your code? Sorry I tried to help!
        [-]
        carbotaniuman 3 days ago ago
        Cool, let me know how to improve the code security on my vendor compiler then, I'll be waiting.
        [-]
        dataflow 3 days ago ago
        > Cool, let me know how to improve the code security on my vendor compiler then, I'll be waiting.
        Switch to std::span and add 1 line to std::span::operator[] to check your bounds...
        [-]
        carbotaniuman 3 days ago ago
        I don't think std::span is bounds checked. Try again.
        [-]
        dataflow 3 days ago ago
        > I don't think std::span is bounds checked. Try again.
        That's why I said add 1 line to std::span::operator[] to check your bounds.
        I'm telling you to modify the STL header. It's a text file. Add 1 line to make it bounds-checked.
        4 days ago ago
        [deleted]
    - jpc0 3 days ago ago
      I do believe the comment thread made the point I would have used here...
      Use gsl::span or write your own bounded span.
      std::span is not bounds checked...
Animats 4 days ago ago
New buzzword for old thing alert.
[-]
- aseipp 4 days ago ago
  People (both practitioners & researchers) have been using the terms "temporal" and "spatial" to refer to different classes of C++ vulnerabilities for at least 12+ years, back when I was actually writing exploits for a job. It is not new at all, and anyone in the field within the past 6-7 years and worth their salt will instantly recognize them.
  [-]
  - tom_ 4 days ago ago
    For whatever it's worth, I've been doing this stupid shit - writing C++, that is - for 25 years, and this is the first time I've heard this term. (This is a data point rather than a complaint. But for a fee, it can become a complaint if you would like.)
    [-]
    - aseipp 3 days ago ago
      I meant security engineers/exploiters actually, but yeah, I can see how most working C++ programmers who aren't security specialists might not be as familiar with it.
- pizlonator 4 days ago ago
  Nah, "spacial safety" is a term of art among security folks and among PL folks who work on security.
  It's the part of memory safety that's just about bounds. You can also call it "bounds safety" and folks will understand what you mean, but "spacial safety" is the more commonly used jargon.
- epage 4 days ago ago
  This term is coming up more frequently in the C++ community as they discuss Rust's safety features so to add more nuance to the discussion and focus on subsets of the problem to solve.
  Note that there are some more heated takes on where these terms are being used. I tried to be as generous as possible in my description.
- vintagedave 4 days ago ago
  I'll say.
  > Attackers regularly exploit spatial memory safety vulnerabilities, which occur when code accesses a memory allocation outside of its intended bounds
  Isn't that... 'out of bounds memory access'?
  [-]
  - moyix 4 days ago ago
    [This is more of a reply to a deleted reply to you, but I don't want my efforts to go to waste]
    Spatial memory safety is a reasonably common term in the security / PL field. You can see examples of it being used at least as far back as 2009: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C33&q=spa...
    It's in contrast to temporal memory safety, which deals with object lifetimes (use after free, for example).
    Here Google is probably also referencing a 2022 post of theirs with a very similar title, dealing with temporal safety: https://security.googleblog.com/2022/05/retrofitting-tempora...
    The terms are also in Wikipedia: https://en.wikipedia.org/wiki/Memory_safety#Classification_o...
  - SAI_Peregrinus 4 days ago ago
    Yes. It's as opposed to temporal memory safety vulnerabilities, like use-after-free or data races.
  - 4 days ago ago
    [deleted]
userbinator 4 days ago ago
[flagged]
[-]
- panstromek 4 days ago ago
  Interestingly, the original quote is "Those who would give up ESSENTIAL Liberty, to purchase a little TEMPORARY Safety, deserve neither Liberty nor Safety." which is quite different meaning. Otherwise you never have any safety, as safety is always at the expense of some freedom.
  Arguably, this kind of work is the opposite - giving up non-essential freedom (in how you write code) for non-temporary (persistent) security.
  [-]
  - userbinator 4 days ago ago
    [flagged]
    [-]
    - SkiFire13 4 days ago ago
      What does that have to do with bound checks?
      [-]
      - saagarjha 4 days ago ago
        The person you are replying to holds the perspective that keeping software insecure allows for a glorious overthrow of our digital overlords when appropriate. A bit like a digital 2nd amendment, really: the hackers will all band together and use their guns^H^H^H^H vulnerabilities to ensure freedom from tyranny. Is that position stupid? Well, they stop replying when you point out what those security holes are overwhelmingly used for. So you be the judge.
        [-]
        userbinator 3 days ago ago
        It's amazing to see how much the mass media along with Big Tech has brainwashed people with their alarmism. The dystopia is rapidly approaching where you cannot do anything without their explicit approval, and it will all be "for your security".
        Some people are more prescient than most, but they've been canceled for telling others about it.
        [-]
        saagarjha 3 days ago ago
        No, you just refuse to do anything about it, because you dream of that dystopian future so you can brush off your hacker skills and save humanity. Why don’t you actually try to fix the problem instead of mandating that loopholes exist for you (and basically nobody else) to use?
        [-]
        userbinator 3 days ago ago
        Look what happened to Stallman.
        Why don’t you actually try to fix the problem
        People have been trying for decades. I voted and hope 45/47 will give them something to be scared of. I fight Big Tech's authoritarianism every day. What did you do besides spreading more pro-corporate FUD and helping my enemy?
        Why is it that the MSM gets all riled up about the people using encryption against the government, but when megacorps use encryption against the population, they're strangely silent? The only true freedom is insecurity. And we will fight against having that freedom taken away, no matter what.
        [-]
        saagarjha 2 days ago ago
        I did watch what happened to Stallman. I don’t think the attack on his personal character was well substantiated but it’s pretty clear that he isn’t fit to lead a movement as important as the one fighting for software freedom. Not only is he seemingly incapable of understanding that his role is one of dealing with people, not code, but he has slowly fallen out of relevance regardless by fighting for freedoms which are largely not useful as more of the world comes online. Being able to recompile software or read its source code is little solace if you don’t actually know how to use that. Knowing free software is out there is not helpful if you’re required to use proprietary software for your job, to interact with your government, or just be part of modern society. The viewpoint of a guy who uses Trisquel but really relies on dozens of other people to actually function (seriously, look up what Stallman writes: he basically isolates himself from proprietary software by asking other people to do things for him and then sending the effects to him via some inconvenient mechanism that he feels happier consuming) is inappropriate for out of touch for someone who is seeking to fight for the rights of the average person.
        As for who you voted for president: I guess you at least learned from last time to include it in your response. But I’m still not seeing any real action there? The new administration has said they’ll take on big tech but I don’t see software freedom driving any of their decisions. Most of them involve large companies having too much control over speech or raising the prices on services, but the solutions do not seem forthcoming. I’m actually concerned because this guy’s right hand man is the world’s richest man who built his fortune on proprietary platforms, and every other big tech CEO also seems to be running to support this guy because they think it will help them become even more entrenched. I can’t say what will happen in the next four years but looking at the previous tenure and spoken statements I see dismantling of anti-monopoly policies, regulation, and a general strengthening of corporate power. None of these seem conducive to an environment where free software can thrive.
        As for myself most of my projects are some form of GPL. Many of them are focused on interoperability, specifically interoperability of proprietary platforms, so those who use them can become familiar with free software and have an easier time leaving them when they can. But the whole effort is a lot more holistic than focusing on one specific area of computer security, which I say even though that is my expertise and area of employment. Like it or not jailbreaks and exploits are neat but do not seem to be resilient drivers of software freedom.
      - panstromek 4 days ago ago
        It's a big bound check conspiracy, actually. Basically, Google will collaborate with US government to gather all the bounds, via Android. FBI will break into your house and look for all the `i < length`, and if they don't find them, they will terminate your `i++` on the spot. This is just the begining. Next time, they will go even for `int = 0`. You'll own nothing, just `for(;;)` and you'll be happy.
andrewstuart 4 days ago ago
>> spatial safety vulnerabilities represent 40% of in-the-wild memory safety exploits
Rust advocates tend to turn stats like this into “40% of all security issues are memory safety”, which sounds very similar but is false.
[-]
- kibwen 4 days ago ago
  > Rust advocates tend to turn stats like this into “40% of all security issues are memory safety”, which sounds very similar but is false.
  You're right that it's false. Historically it's been a much more damning 70% of vulnerabilities that were rooted in memory-unsafety.
  According to the Google Security Blog, in a post linked to from the OP:
  We’ll also share updated data on how the percentage of memory safety vulnerabilities in Android dropped from 76% to 24% over 6 years as development shifted to memory safe languages. [...] The percent of vulnerabilities caused by memory safety issues continues to correlate closely with the development language that’s used for new code. Memory safety issues, which accounted for 76% of Android vulnerabilities in 2019, and are currently 24% in 2024, well below the 70% industry norm, and continuing to drop.
  https://security.googleblog.com/2024/09/eliminating-memory-s...
  [-]
  - 4 days ago ago
    [deleted]
  - andrewstuart 4 days ago ago
    You’re still not getting the point.
    OWASPs top ten security vulnerabilities are not memory safety.
    [-]
    - pornel 4 days ago ago
      Because most applications aren't written in C++.
      People don't write web apps in C++, because they would have to deal with memory safety issues in addition to all the other issues related to auth, injections, etc.
    - 4 days ago ago
      [deleted]
    - 4 days ago ago
      [deleted]
    - jerf 4 days ago ago
      So, maybe you can answer a question I've really had a hard time understanding, that I've posted about before: https://news.ycombinator.com/item?id=39542875
      Why are you offended at the idea that languages should be memory safe by default? What code are you writing that you constantly need memory unsafety, constantly available, without being able to write any sort of "unsafe" keyword? Who cares about whether or not it's the #1 problem in OWASP when it's clearly and undeniably been a massive problem for decades? It is sufficient, after all, that it crashes a program or produces incorrect results for it to be a problem worth pursuing, but it is also extremely well known to produce massive security vulnerabilities regardless of what some list says.
      Why is this a hill you are willing to die on? What are you getting out of it? Is your programming life going to be easier? Are you better off when debugging something to not be able to just know that it's not a memory safety problem, and thus to still have to consider it?
      What actual engineering benefit do those rare few of you who seem to be crusading against memory safety fear disappearing?
      When I got into programming in the late 1990s, I was there to catch the last few holdouts of the "everyone should just write in assembler" opinion. I at least understood their arguments around performance and efficiency, and I understood their arguments around "not needing high level languages" even though I disagree with them both then and now. I think on the net they were wrong, but they did have some legitimate benefits to argue on their side, even if they were already outweighed by the costs then and even more so outweighed today.
      But I don't get what you folk furious about memory safety are looking for. "Using" memory safety is already an invalid program. It's already pretty much automatically a bug, if not worse. You're not losing anything to simply have it, you're not gaining anything except bugs and sharp corners insisting on it. And when you absolutely, positively need it, which I'd call "exceptionally rare but definitely non-zero", it's still there in one form or another of "unsafe". I don't see any benefits at all.
      (And let me reiterate and forstall the usual, memory safety does not mean "Rust". Memory safety is every major language on the market today except C and C++.)
      [-]
      - addaon 3 days ago ago
        > Why are you offended at the idea that languages should be memory safe by default?
        Why are you okay with languages that are not overflow-safe, or unit-safe, or infinite-loop-safe, or safe against bit flips? Memory safety violations are a major chunk of bugs. Writing code to avoid them is about as hard as writing code to avoid other major classes of bugs. In either case, it’s failable. Static analysis and testing then gives confidence that the system is safe, by multiple metrics. Memory safety isn’t special enough to demand a different approach here — quality code requires a coherent approach to quality across multiple bug classes.
        [-]
        jerf 2 days ago ago
        Memory safety doesn't require a special approach. We have abundant experience that says it does not interfere with writing code. It is only C and C++ that lack it. Nobody else in any other language is running around saying "Oh, no, if only I could have memory unsafety back!" No other language community is rushing to put it back into their language. Nobody else even wants it back.
        You argue like we live in some hypothetical universe where only some bizarre academic language has recently invented the idea in a world where nobody else has even heard of the idea, and it's solving a problem we don't generally have. But the truth is, we already have memory safety... everywhere except C and C++. Those languages stand alone now. They are the only ones where it's an issue. And they have demonstrated in as concrete an engineering way as it can be demonstrated that it is a problem, on numerous levels.
        You're not arguing against some new fangled idea that has no evidence. You're arguing against something that is completely normal engineering practice in place almost everywhere, and the rest of us look at you arguing against it as if you're arguing against that source control is a stupid idea for people who can't keep track of the changes they've made, by gosh, just sticking random prefixes and suffixes on my files is enough for me and it ought to be enough for everyone. We're not hypothesizing about it. We've been living it for decades. We're not asking the world to change to be memory safe... it already has. Except C and C++.
- IshKebab 4 days ago ago
  I think you're forgetting about temporal safety (use after free). Presumably that brings it up to the 70% of security issues being related to memory safety, which many studies have shown - remarkably consistently.
- pjmlp 4 days ago ago
  First of all it is 70%, and secondly even if people like to FUD Rust, it is all security advocates that state this, including those of us that would like a better attitude towards safety in C++ world.
  We got too many C refugees that spoiled the soup.
  [-]
  - andrewstuart 4 days ago ago
    The top security issues do not relate to memory safety.
    Rust advocates like to muddy the water and make it sound like memory safety is the biggest issue in security. It isn’t.
    [-]
    - pjmlp 4 days ago ago
      You mean advocates like Microsoft Security Response Center and Google Project Zero?
      Or advocates like NSA and FBI?
      Security FUD, name calling Rust any time someone raises security issues, is quite impressive.