This is a transcript of What's Up With That Episode 1, a 2022 video discussion between Sharon (yangsharon@chromium.org) and Dana (danakj@chromium.org).
The transcript was automatically generated by speech-to-text software. It may contain minor errors.
Welcome to the first episode of What’s Up With That, all about pointers! Our special guest is C++ expert Dana. This talk covers smart pointer types we have in Chrome, how to use them, and what can go wrong.
0:00 SHARON: Hi, everyone, and welcome to the first installment of “What’s Up With That”, the series that demystifies all things Chrome. I’m your host, Sharon, and today’s inaugural episode will be all about pointers. There are so many types of types - which one should I use? What can possibly go wrong? Our guest today is Dana, who is one of our Base and C++ OWNERS and is currently working on introducing Rust to Chromium. Previously, she was part of bringing C++11 support to the Android NDK and then to Chrome. Today, she’ll be telling us what’s up with pointers. Welcome, Dana!
00:31 DANA: Thank you, Sharon. It's super exciting to be here. Thank you for letting me be on your podcast thingy.
00:36 SHARON: Yeah, thanks for being the first episode. So let's just jump right in. So when you use pointers wrong, what can go wrong? What are the problems? What can happen?
00:48 DANA: So pointers are a big cause of security problems for Chrome, and that’s what we mostly think about when things go wrong with pointers. So you have a pointer to some thing, like you’ve pointed to a goat. And then you delete the goat, and you allocate some new thing - a cow. And it gets stuck in the same spot. Your pointer didn’t change. It’s still pointing to what it thinks is a goat, but there’s now a cow there. And so when you go to use that pointer, you use something different. And this is a tool that malicious actors use to exploit software, like Chrome, in order to gain access to your system, your information, et cetera.
01:39 SHARON: And we want to avoid those. So what's that general type of attack called?
01:39 DANA: That’s a Use-After-Free because you have freed the goat and replaced it with a cow. And you’re using your pointer, but the thing it pointed to was freed. There are other kinds of pointer badness that can happen. If you take a pointer and you add to it some number, or you go to an offset off the pointer, and you have an array of five things, and you go and read 20, or minus 2, or something, now you’re reading out of bounds of that memory allocation. And that’s not good. These are both memory safety bugs that occur a lot with pointers.
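To make the goat-and-cow analogy concrete, here is a minimal C++ sketch (not from the talk) of the two bug classes Dana describes; the buggy lines are left commented out because they are undefined behavior:

```cpp
// Illustrative only: the two memory-safety bugs described above.
struct Goat { int horns = 2; };
struct Cow { int bells = 1; };

int main() {
  Goat* goat = new Goat();
  Goat* alias = goat;    // A second pointer to the same allocation.
  delete goat;           // The goat is freed.
  Cow* cow = new Cow();  // May reuse the goat's memory slot.
  // Use-After-Free: `alias` still points at the freed goat, which may now
  // actually be the cow's bytes. Undefined behavior, and exploitable.
  // int horns = alias->horns;

  int values[5] = {0, 1, 2, 3, 4};
  // Out-of-bounds: index 20 (or -2) of a 5-element array is also undefined.
  // int oob = values[20];

  delete cow;
  return 0;
}
```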
02:23 SHARON: Today, we’ll be mostly looking at the Use-After-Free kind of bugs. We definitely see a lot of those. And if you want to see an example of one being used, Dana has previously done a talk called “Life of a Vulnerability.” It’ll be linked below. You can check that out. So that being said, should we ever be using just a regular raw pointer in C++ in Chrome?
02:41 DANA: First of all, let’s call them native pointers. You will see them called raw pointers a lot in literature and stuff. But later on, we’ll see why that could be a bit ambiguous in this context. So we’ll call them a native pointer. So should you use a native pointer? If you don’t want to Use-After-Free, if you don’t want a problem like that, no. However, there is a performance implication with using smart pointers, and so the answer is yes. The style guide that we have right now takes this pragmatic approach of saying you should use raw pointers for giving access to an object. So if you’re passing them as a function parameter, you can share it as a pointer or a reference, which is like a pointer with slightly different rules. But you should not store native pointers as fields in objects because that is a place where they go wrong a lot. And you should not use a native pointer to express ownership. So before C++11, you would just say, this is my pointer, use a comment, say this one is owning it. And then if you wanted to pass the ownership, you just pass this native pointer over to something else as an argument, and put a comment and say this is passing ownership. And you just kind of hope it works out. But then it’s very difficult. It requires the programmer to understand the whole system to do it correctly. There is no help. So in C++11, the type called std::optional_ptr - or sorry, std::unique_ptr - was introduced. And this is expressing unique ownership. That’s why it’s called unique_ptr. And it’s just going to hold your pointer, and when it goes out of scope, it gets deleted. It can’t be copied because it’s unique ownership. But it can be moved around. And so if you’re going to express ownership to an object in the heap, you should use a unique_ptr.
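As a rough illustration of the ownership conventions described above (a sketch, not code from the talk; the Goat type and function names are made up):

```cpp
#include <memory>
#include <utility>

struct Goat { int horns = 2; };

// With std::unique_ptr, ownership is part of the type instead of a comment.
std::unique_ptr<Goat> MakeGoat() {
  return std::make_unique<Goat>();
}

// A native pointer as a function parameter just gives access; the callee
// must not store or delete it.
void InspectGoat(Goat* goat) {}

int main() {
  std::unique_ptr<Goat> goat = MakeGoat();            // Unique ownership.
  InspectGoat(goat.get());                            // Lend access only.
  std::unique_ptr<Goat> new_owner = std::move(goat);  // Transfer ownership.
  // `goat` is now null; the Goat is deleted when `new_owner` goes out of
  // scope. Copying a unique_ptr does not compile.
  return 0;
}
```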
04:48 SHARON: That makes sense. And that sounds good. So you mentioned smart pointers before. You want to tell us a bit more about what those are? It sounds like unique_ptr is one of those.
04:55 DANA: Yes, so a smart pointer, which can also be referred to as a pointer-like object, perhaps as a subset of them, is a class that holds inside of it a pointer and mediates access to it in some way. So unique_ptr mediates access by saying I own this pointer, I will delete this pointer when I go away, but I’ll give you access to it. So you can use the arrow operator or the star operator to get at the underlying pointer. And you can construct them out of native pointers as well. So that’s an example of a smart pointer. There’s a whole bunch of smart pointers, but that’s the general idea. I’m going to add something to what a native pointer is, while giving you access to it in some way.
05:40 SHARON: That makes sense. That’s kind of what our main thing is going to be about today because you look around in Chrome, you’ll see a lot of these wrapper types. It’ll be a unique_ptr and then a type. And you’ll see so many types of these, and talking to other people, myself, I find this all very confusing. So we’ll cover some of the more common types today. We just talked about unique pointers. Next, talk about absl::optional. So why don’t you tell us about that.
06:10 DANA: So that’s actually a really good example of a pointer-like object that’s not actually holding a pointer, so it’s not a smart pointer. But it looks like one. So this is this distinction. So absl::optional, also known as std::optional, if you’re not working in Chromium, and at some point, we will hopefully migrate to it, std::optional and absl::optional hold an object inside of it by value instead of by pointer. This means that the object is held in that space allocated for the optional. So the size of the optional is the size of the thing it’s holding, plus some space for a presence flag. Whereas a unique_ptr holds only a pointer. And its size is the size of a pointer. And then the actual object lives elsewhere. So that’s the difference in how you can think about them. But otherwise, they do look quite similar. An optional is a unique ownership because it’s literally holding the object inside of it. However, an optional is copyable if the object inside is copyable, for instance. So it doesn’t have quite the same semantics. And it doesn’t require a heap allocation, the way unique_ptr does because it’s storing the memory in place. So if you have an optional on the stack, the object inside is also right there on the stack. That’s good or bad, depending what you want. If you’re worried about your object sizes, not so good. If you’re worried about the cost of memory allocation and free, good. So this is the trade-off between the two.
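A small sketch of the storage difference described here, assuming Chromium's abseil include path; the Farm and Goat names are invented for illustration:

```cpp
#include <memory>

#include "third_party/abseil-cpp/absl/types/optional.h"

struct Goat { int horns = 2; };

struct Farm {
  // Held inline: size is roughly sizeof(Goat) plus a presence flag, and no
  // heap allocation is needed.
  absl::optional<Goat> maybe_goat;

  // Held out of line: the field is pointer-sized and the Goat lives in its
  // own heap allocation.
  std::unique_ptr<Goat> owned_goat;
};

void Example() {
  Farm farm;
  farm.maybe_goat.emplace();                   // Constructs the Goat in place.
  farm.owned_goat = std::make_unique<Goat>();  // Separate heap allocation.
  // An optional is copyable when its contents are; a unique_ptr never is.
  absl::optional<Goat> copy = farm.maybe_goat;
}
```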
07:51 SHARON: Can you give any examples of when you might want to use one versus the other? Like you mentioned some kind of general trade-offs, but any specific examples? Because I’ve definitely seen use cases where unique_ptr is used when maybe an optional makes more sense or vice versa. Maybe it’s just because someone didn’t know about it or it was chosen that way. Do you have any specific examples?
08:14 DANA: So one place where you might use a unique_ptr, even though optional is maybe the better choice, is because of forward declarations. So because an optional holds the type inside of it, it needs to know the type size, which means it needs to know the full declaration of that type, or the whole definition of that type. And a unique_ptr doesn’t because it’s just holding a pointer, so it only needs to know the size of a pointer. And so if you have a header file, and you don’t want to include another header file, and you just want to forward declare the types, you can’t stick an optional of that type right there because you don’t know how big it’s supposed to be. So that might be a case where it’s maybe not the right choice, but for other constraining reasons, you choose to use a unique_ptr here. And you pay the cost of a heap allocation and free as a result. But when would you use an optional? So optional is fantastic for returning a value sometimes. I want to do this thing, and I want to give you back a result, but I might fail. Or sometimes there’s no value to give you back. Typically, before C++ - what are we on now, was it came in 14? I’m going to say it wrong. That’s OK. Before we had absl::optional, you would have to do different tricks. So you would pass in a native pointer as a parameter and return a bool as the return value to say did I populate the pointer. And yes, that works. But it’s easy to mess it up. It also generates less optimal code. Pointers cause the optimizer to have troubles. And it doesn’t express as nicely what your intention is. A return, this thing, sometimes. And so in place of using this pointer plus bool, you can put that into a single type, return an optional. Similar for holding something as a field, where you want it to be held inline in your class, but you don’t always have it present, you can do that with an optional now, where you would have probably used a pointer before. Or a union or something, but that gets even more tricky. And then another place you might use it is as a function argument. However, that’s usually not the right choice for a function argument. Why? Because the optional holds the value inside of it. Constructing an optional requires constructing the whole object inside of it. And so that’s not free. It can be arbitrarily expensive, depending on what your type is. And if your caller to your function doesn’t have already an optional, they have to go and construct it to pass it to you. And that’s a copy or move of that inner type. So generally, if you’re going to receive a parameter, maybe sometimes, the right way to spell that is just to pass it as a native pointer, which can be null, when it’s not present.
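A sketch of the "bool plus out-parameter" pattern versus returning an optional, with made-up function names and the parsing stubbed out:

```cpp
#include <string>

#include "third_party/abseil-cpp/absl/types/optional.h"

// The older pattern: a bool return plus an out-parameter.
bool ParseAgeLegacy(const std::string& input, int* out_age);

// The optional version expresses "a value, sometimes" in a single type.
absl::optional<int> ParseAge(const std::string& input) {
  if (input.empty())
    return absl::nullopt;  // Nothing to give back.
  return 42;               // Stand-in for real parsing.
}

void Caller() {
  absl::optional<int> age = ParseAge("42");
  if (age.has_value()) {
    // Use *age here.
  }
}

// As a function *parameter*, prefer a nullable pointer over an optional,
// since callers would otherwise copy or move the value into a temporary
// optional just to make the call.
void RecordAge(const int* age_if_known);  // null means "not present".
```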
11:29 SHARON: Hopefully that clarifies some things for people who are trying to decide which one best suits their use case. So moving on from that, some people might remember from a couple of years ago that instead of being called absl::optional, it used to be called base::optional. And do you want to quickly mention why we switched from base to absl? And you mentioned even switching to std::optional. Why this transition?
11:53 DANA: Yeah, absolutely. So as the C++ standards come out, we want to use them, but we can’t until our toolchain is ready. What’s our toolchain? It’s our compiler, our standard library - and unfortunately, we have more than one compiler that we need to worry about. So we have the NaCl compiler. Luckily, we just have Clang for the compiler choice we really have to worry about. But we do have to wait for these things to be ready, and for a code base to be ready to turn on the new standard because sometimes there are some non-backwards compatible changes. But we can forward port stuff out of the standard library into base. And so we’ve done that. We have a bunch of C++20 backports in base now. We had 17 backports before. We turned on 17, now they should hopefully be gone. And so base::optional was an example of a backport, while optional was still considered experimental in the standard library. We adopted use of absl since then, and absl had also, essentially, a backport of the optional type inside of it for presumably the same reasons. And so why have two when you can have one? That’s a pretty good rule. And so we deprecated the base one, removed it, and moved everything to the absl one. One thing to note here, possibly of interest, is we often add security hardening to things in base. And so sometimes there is something available in the standard library, but we choose not to use it and instead use something in base or absl, because we have extra hardening checks. And so part of the process of removing base::optional and moving to absl::optional was ensuring those same security hardening checks are present in absl. And we’re going to have to do the same thing to stop using absl and start using the standard one. And that’s currently a work in progress.
13:48 SHARON: So let’s go through some of the base types because that’s definitely where most of these kinds of wrapper types live. So let’s just start with one that I learned about recently, and that’s a scoped_refptr. What’s that? When should we use it?
13:59 DANA: So scoped_refptr is kind of your Chromium equivalent to shared_ptr in the standard library. So if you’re familiar with that, it’s quite similar, but it has some slight differences. So what is scoped_refptr? It gives you shared ownership of the underlying object. And it’s a smart pointer. It holds a pointer to an object that’s allocated in the heap. When all scoped_refptrs that point to the same object are gone, it’ll be deleted. So it’s like unique_ptr, except it can be copied to add to your ref count, basically. And when all of them are gone, it’s destroyed. And it gives access to the underlying pointer in exactly the same ways. Oh, but why is it different than shared_ptr? I did say it is. scoped_refptr requires the type that is held inside of it to inherit from RefCounted or RefCountedThreadSafe. shared_ptr doesn’t require this. Why? So shared_ptr sticks an allocation beside your object and then puts your object here. So the ref count is externalized to your object being stored and owned by the shared pointer. Chromium took this position to be doing intrusive ref counting. So because we inherit from a known type, we stick the ref count in that base class, RefCounted or RefCountedThreadSafe. And so that is enforced by the compiler. You must inherit from one of these two in order to be stored and owned in a scoped_refptr. What’s the difference? RefCounted is the default choice, but it’s not thread safe. So the ref counting is cheap. It’s the more performant one, but if you have a scoped_refptr on two different threads owning the same object, their ref counting will race, can be wrong, you can end up with a double free - which is another way that pointers can go wrong, two things freeing the same thing - or you could end up with potentially not freeing it at all, probably. I guess I’ve never checked if that’s possible. But they can race, and then bad things happen. Whereas RefCountedThreadSafe gives you atomic ref counting. So atomic means that across all threads, they’re all going to have the same view of the state. And so it can be used across threads and be owned across threads. And the tricky part there is the last thread that owns that object is where it’s going to be destroyed. So if your object’s destructor does things that you expect to happen on a specific thread, you have to be super careful that you synchronize which thread that last reference is going away on, or it could explode in a really flaky way.
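A minimal sketch of intrusive ref counting with Chromium's scoped_refptr, using an invented GoatPen class and assuming the usual base/memory headers:

```cpp
#include "base/memory/ref_counted.h"
#include "base/memory/scoped_refptr.h"

// Intrusive ref counting: the count lives in the RefCounted base class, so
// the compiler enforces that only such types go into a scoped_refptr.
class GoatPen : public base::RefCounted<GoatPen> {
 public:
  GoatPen() = default;
  int goats() const { return goats_; }

 private:
  friend class base::RefCounted<GoatPen>;  // Lets the ref count delete us.
  ~GoatPen() = default;                    // Private: only deleted via refs.
  int goats_ = 3;
};

void Example() {
  scoped_refptr<GoatPen> pen = base::MakeRefCounted<GoatPen>();
  scoped_refptr<GoatPen> other = pen;  // Copy: the ref count becomes 2.
  // The GoatPen is deleted when the last scoped_refptr goes away. For
  // cross-thread sharing, inherit base::RefCountedThreadSafe<GoatPen>
  // instead, and remember the destructor runs on whichever thread drops
  // the last reference.
}
```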
17:02 SHARON: This sounds useful in other ways. What are some kind of more design things to consider, in terms of when a scoped_refptr is useful and does help enforce things that you want to enforce, like relative lifetimes of certain objects?
17:15 DANA: Generally, we recommend that you don’t use ref counting if you can help it. And that’s because it’s hard to understand when it’s going to be destroyed, like I kind of alluded to with the thread situation. Even in a single thread situation, how do you know which one is the last reference? And is this object going to outlive that other object? Maybe sometimes. It’s not super obvious. It’s a little more clear with a unique_ptr, at least local to where that unique_ptr’s destruction is. But there’s usually no scoped_refptr where you can say, this is the last one, so I know it’s gone after this thing is gone. Maybe it is, maybe it’s not, often. So it’s a bit tricky. However, there are scenarios when you truly want a bunch of things to have access to a piece of data. And you want that data to go away when nobody needs it anymore. And so that is your use case for a scoped_refptr. It is nicer when the thing with shared ownership is not doing a lot of interesting things, especially in its destructor, because of the complexity that’s involved in shared ownership. But you’re welcome to shoot yourself in the foot with this one if you need to.
18:33 SHARON: We’re hoping to help people not shoot themselves in the foot. So use scoped_refptr carefully, is the lesson there. So you mentioned shared_ptr. Is that something we see much of in Chrome, or is that something that we generally try to avoid in terms of things from the standard library?
18:51 DANA: That is something that is banned in Chrome. And that’s just basically because we already have scoped_refptr, and we don’t want two of the same thing. There’s been various times where people have brought up why do we need to have both? Can we just use shared_ptr now? And nobody’s ever done the kind of analysis needed to make that kind of decision. And so we stay with what we’re at.
19:18 SHARON: If you want to do that, there’s someone that’ll tell you what to do. So something that when I was using scoped_refptr, I came across that you need a WeakPtrFactory to create such a pointer. So weak pointers and WeakPtr factories are one of those things that you see a lot in Chrome and one of these base things. So tell us a bit about weak pointers and their factories.
19:42 DANA: So WeakPtr and WeakPtrFactory have a bit of an interesting history. Their major purpose is for asynchronous work. Chrome is basically a large asynchronous machine, and what does that mean? It means that we break all of the work of Chrome up into small pieces of work. And every time you’ve done a piece, you go and say, OK, I’m done. And when the next piece is ready, run this thing. And maybe that next thing is like a user input event, maybe that’s a reply from the network, whatever it might be. And there’s just a ton of steps in things that happen in Chrome. Like, a navigation has a request, a response, maybe another request - some redirects, whatever. That’s an example of tons of smaller asynchronous tasks that all happen independently. So what goes wrong with asynchronous tasks? You don’t have a continuous stack frame. What does that mean? So if you’re just running some synchronous code, you make a variable, you go off and you do some things, you come back. Your variable is still here, right? You’re in this stack frame and you can keep using it. You have asynchronous tasks. You make a variable, you go and do some work, and you are done your task. Boop, your stack’s gone. You come back later, you’re going to continue. You don’t have your variable anymore. So any state that you want to keep across your various tasks has to be stored and what we call bound in with that task. If that’s a pointer, that’s especially risky. So we talked earlier about Use-After-Frees. Well, you can, I hope, imagine how easy it is to stick a pointer into your state. This pointer is valid, I’m using it. I go away, I come back when? I don’t know, sometime in the future. And I’m going to go use this pointer. Is it still around? I don’t own it. I didn’t use a unique_ptr. So who owns it? How do they know that I have a task waiting to use it? Well, unless we have some side channel communicating that, they don’t. And how do I know if they’ve destroyed it if we don’t have some side channel communicating that? I don’t know. And so I’m just going to use this pointer and bad things happen. Your bank account is gone.
22:06 SHARON: No! My bank account!
22:06 DANA: I know. So what’s the side channel? The side channel that we have is WeakPtr. So a WeakPtr and WeakPtrFactory provide this communication mechanism where WeakPtrFactory watches an object, and when the object gets destroyed, the WeakPtrFactory inside of it is destroyed. And that sets this little bit that says, I’m gone. And then when your asynchronous task comes back with its pointer, but it’s a WeakPtr inside of it and tries to run, it can be like, am I still here? If the WeakPtrFactory was destroyed, no, I’m not. And then you have a choice of what to do at that point. Typically, we’re like, abandon ship. Don’t do anything here. This whole task is aborted. But maybe you do something more subtle. That’s totally possible.
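A sketch of the WeakPtr side channel in use, with a hypothetical Navigator class; the include paths are the ones Chromium used around the time of this episode (they have since moved under base/functional/):

```cpp
#include "base/bind.h"      // base/functional/bind.h in newer trees.
#include "base/callback.h"  // base/functional/callback.h in newer trees.
#include "base/memory/weak_ptr.h"

class Navigator {
 public:
  // Produces a task that can be posted and run later, possibly after this
  // Navigator has been destroyed.
  base::OnceClosure MakeNextStep() {
    // Binding a WeakPtr instead of `this` gives the task a side channel: if
    // the Navigator is gone by the time the task runs, the callback is
    // skipped instead of dereferencing a dangling pointer.
    return base::BindOnce(&Navigator::ContinueNavigation,
                          weak_factory_.GetWeakPtr());
  }

 private:
  void ContinueNavigation() { /* the next piece of asynchronous work */ }

  // Conventionally the last member, so it is destroyed first and all
  // outstanding WeakPtrs are invalidated before the rest of the teardown.
  base::WeakPtrFactory<Navigator> weak_factory_{this};
};
```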
22:59 SHARON: I think the example I actually meant to say that uses a WeakPtrFactory is a SafeRef, which is another base type. So tell us a bit about SafeRefs.
23:13 DANA: WeakPtr is cool because of the side channel that you can examine. So you can say, are you still alive, dear object? And it can tell you, no, it’s gone. Or yeah, it’s here. And then you can use it. The problem with this is that in places where you as the code author want to believe that this object is actually always there, but you don’t want a security bug if you’re wrong. And it doesn’t mean that you’re wrong now, even. Sometime later, someone can change code, unrelated to where this is, where the ownership happens, and break you. And maybe they don’t know all the users of a given object and are changing its lifetime in some subtle way, maybe not even realizing they are. Suddenly you’re eventually seeing security bugs. And so that’s why native pointers can be pretty scary. And so SafeRef is something we can use instead of a native pointer to protect you against this type of bug. It’s built on top of WeakPtr and WeakPtrFactory. That’s its relationship, but its purpose is not the same. So what SafeRef does is it says - SafePtr?
24:31 SHARON: SafeRef.
24:31 DANA: SafeRef.
24:31 SHARON: I think there's also a safe pointer, but there -
24:38 DANA: We were going to add it. I’m not sure if it’s there yet. But so two differences between SafeRef and WeakPtr then, ref versus ptr, it can’t be null. So it’s like a reference wrapper. But the other difference is you can’t observe whether the object is actually alive or not. So it has the side channel, but it doesn’t show it to you. Why would you want that? If the information is there anyway, why wouldn’t you want to expose it? And the reason is because you are documenting that you as the author understand and expect that this pointer is always valid at this time. It turns out it’s not valid. What do you do? If it’s a WeakPtr, people tend to say, we don’t know if it’s valid. It’s a WeakPtr. Let’s check. Am I valid? And if I’m not, return. And what does that result in? It results in adding a branch to your code. You do that over, and over, and over, and over, and static analysis, which is what we as humans have to do - we’re not running the program, we’re reading the code - can’t really tell what will happen because there’s so many things that could happen. We could exit here, we could exit there, we could exit here. Who knows. And that makes it increasingly hard to maintain and refactor the code. So SafeRef gives you the option to say this is always going to be valid. You can’t check it. So if it’s not valid, go fix that bug somewhere else. It should be valid here.
26:16 SHARON: So what kind of -
26:16 DANA: The assumptions are broken.
26:16 SHARON: So what kind of errors happen when that assumption is broken? Is that a crash? Is that a DCHECK kind of thing?
26:22 DANA: For SafeRef and for WeakPtr, if you try to use it without checking it, or write it incorrectly, they will crash. And crashing in this case means a safe crash. It’s not going to lead to a security bug. It’s literally just terminating the program.
26:41 SHARON: Does that also mean you get a sad tab as a user? Like when the little sad file comes up?
26:47 DANA: Yep. It would. If you’re in the render process, you take it down. It’s a sad tab. So that’s not great. It’s better than a security bug. Because your options here are: don’t write bugs. Ideal. I love that idea, but we know that bugs happen. Use a native pointer, security problem. Use a WeakPtr, that makes sense if you want it to sometimes not be there. But if you want it to always be there - because you have to make a choice now of what you’re supposed to do if it’s not, and it makes the code very hard to understand. And you’re only going to find out it can’t be there through a crash anyhow. Or use a SafeRef. And it’s going to just give you the option to crash. You’re going to figure out what’s wrong and make it no longer do that.
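A sketch of documenting a lifetime assumption with base::SafeRef rather than a checkable WeakPtr; the Tab and Toolbar classes here are hypothetical:

```cpp
#include "base/memory/safe_ref.h"
#include "base/memory/weak_ptr.h"

class Tab;  // Owned and destroyed somewhere else.

class Toolbar {
 public:
  // The SafeRef documents "this Tab always outlives the Toolbar". It can't
  // be null and can't be checked; if the assumption is ever wrong and the
  // SafeRef is used, the process crashes safely instead of a Use-After-Free.
  explicit Toolbar(base::SafeRef<Tab> tab) : tab_(tab) {}

  void OnClick() {
    // No `if (still alive)` branches here: just use tab_-> directly.
  }

 private:
  base::SafeRef<Tab> tab_;
};

class Tab {
 public:
  base::SafeRef<Tab> GetSafeRef() { return weak_factory_.GetSafeRef(); }

 private:
  base::WeakPtrFactory<Tab> weak_factory_{this};
};
```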
27:38 SHARON: I think wanting to guarantee the lifetime of some other things seems like a pretty common thing that you might come across. So I’m sure there are many cases for many people to be adding SafeRefs to make their code a bit safer, and also ensure that if something does go wrong, it’s not leading to a memory bug that could be exploited in who knows how long. Because we don’t always hear about those. If it crashes, and they can reliably crash, at least you know it’s there. You can fix it. If it’s not, we’re hoping that one of our VRP vulnerability researchers finds it and reports it, but that doesn’t always happen. So if we can know about these things, that’s good. So another new type in base that people might have been seeing recently is a raw_ptr, which is maybe why earlier we were saying let’s call them native pointers, not raw pointers. Because the difference between raw_ptr and raw pointer, very easy to mix those up. So why don’t you tell us a bit about raw_ptrs?
28:40 DANA: So raw_ptr is really cool. It’s a non-owning smart pointer. So that’s kind of like WeakPtr or SafeRef. These are also non-owning. And it’s actually very similar in inspiration to what WeakPtr is. So it has a side channel where it can see if the thing it’s pointing to is alive or gone. So for WeakPtr, it talks to the WeakPtrFactory and says “am I deleted?” And for raw_ptr, what it does is it keeps a reference count, kind of like scoped_refptr, but it’s a weak reference count. It’s not owning. And it keeps this reference count in the memory allocator. So Chrome has its own memory allocator for new and delete called PartitionAlloc. And that lets us do some interesting stuff. And this is one of them. And so what happens is as long as there is a raw_ptr around, this reference count is non-zero. So even if you go and you delete the object, the allocator knows there is some pointer to it. It’s still out there. And so it doesn’t free it. It holds it. And it poisons the memory, so that just means it’s going to write some bit pattern over it, so it’s not really useful anymore. It’s basically re-initialized the memory. And so later, if you go and use this raw_ptr, you get access to just dead memory. It’s there, but it’s not useful anymore. You’re not going to be able to create security bugs in the same way. Because when we first started talking about a Use-After-Free - you have your goat, you free it, a cow is there, and now your pointer is pointing at the wrong thing - you can’t do that because as long as there’s this raw_ptr to your goat, the goat can be gone, but nothing else is going to come back here. It’s still taken by that poisoned memory until all the raw_ptrs are gone. So that’s their job, to protect us from a Use-After-Free being exploitable. It doesn’t necessarily crash when you use it incorrectly, you just get to use this bad memory inside of it. If you try to use it as a pointer, then you’re using a bad pointer, you’re going to probably crash. But it’s a little bit different than a WeakPtr, which is going to deterministically crash as soon as you try to use it when it’s gone. It’s really just a protection or a mitigation against security exploits through Use-After-Free. And then we recently just added raw_ref, which is really the same as raw_ptr, except addressing nullability. So smart pointers in C++ have historically all allowed a null state. That’s representative of what native pointers did in C and C++. And so this is kind of just bringing this along in this obvious, historical way. But if you look at other languages that have been able to break with history and make their own choices kind of fresh, we see that they make choices like not having null pointers, not having null smart pointers. And that increases the readability and the understanding of your code greatly. So just like for WeakPtr, how we said, we just check if it’s there or not. And if it’s not, we return, and so on. It’s every time you have a WeakPtr, if you were thinking of a timeline, every time you touch a WeakPtr, your timeline splits. And so you get this exponential timeline of possible states that your software’s in. That’s really intense. Whereas every time you can not do that, say this can’t be null, so instead of WeakPtr, you’re using SafeRef. This can’t be not here or null, actually - WeakPtr can just be straight up null - this is always present. Then you don’t have a split in your timeline, and that makes it a lot easier to understand what your software is doing. And so for raw_ptr, it followed this historical precedent. It lets you have a null value inside of it. And raw_ref is our kind of modern answer to this new take on nullability. And so raw_ref is a reference wrapper, meaning it holds a reference inside of it, conceptually, meaning it just can’t be null. That is just basically - it’s a pointer, but it can’t be null.
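A sketch of what this looks like in a class, with invented field and class names; the comments describe the allocator protection Dana outlines above (part of the MiraclePtr work mentioned later):

```cpp
#include "base/memory/raw_ptr.h"
#include "base/memory/raw_ref.h"

class Profile;  // Owned elsewhere; these fields only borrow it.

class DownloadTracker {
 public:
  DownloadTracker(Profile& profile, Profile* incognito_profile)
      : profile_(profile), incognito_profile_(incognito_profile) {}

 private:
  // raw_ref: never null, and `const` here means it can't be re-pointed at a
  // different Profile. The allocator quarantines and poisons the memory if
  // the Profile is freed while this field still exists, so a lifetime bug
  // can't turn into an exploitable Use-After-Free.
  const raw_ref<Profile> profile_;

  // raw_ptr: the same protection, but it may be null (for example, no
  // incognito profile yet).
  raw_ptr<Profile> incognito_profile_ = nullptr;
};
```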
33:24 SHARON: So these do sound the most straightforward to use. So basically, if you’re not sure - for your class members at least - any time you would use a native pointer or an ampersand, basically you should always just put those in either a raw_ptr or a raw_ref, right?
33:45 DANA: Yeah, that’s what our style guide recommends, with one nuance. So because raw_ptr and raw_ref interact with the memory allocator, they have the ability to be, like, turned on or off dynamically at runtime. And there’s a performance hit on keeping this reference count around. And so at the moment, they are not turned on in the renderer process because it’s a really performance-critical place. And the impact of security bugs there is a little less than in the browser process, where you just immediately get access to the whole system. And so we’re working on turning it on there. But if you’re writing code that’s only in the renderer process, then there’s no point to use it. And we don’t recommend that you use it. But the default rule is yes. Don’t use a native pointer, don’t use a native reference. As a field in an object, use a raw_ptr, use a raw_ref. Prefer raw_ref - prefer something with less states, always, because you get less branches in your timeline. And then you can make it const if you don’t want it to be able to rebind to a new object, if you don’t want the pointer to change. Or you can make it mutable if you want it to be able to.
34:58 SHARON: So you did mention that these types are ref counted, but earlier you said that you should avoid ref counting things. So -
35:04 DANA: Yes.
35:11 SHARON: So what’s the balance there? Is it because with a scoped_refptr, you’re a bit more involved in the ref counting, or is it just, we’ve done it for you, you can use it, this is OK?
35:19 DANA: No, this is a really good question. Thank you for asking that. So there’s two kinds of ref counts going on here. I tried to kind of allude to it, but it’s great to make it clear. So scoped_refptr is a strong ref count, meaning the ref count owns the object. So the destructor runs, the object is gone and deleted when that ref count goes to 0. raw_ref and raw_ptr are a weak ref count. They could be pointing to something owned in a scoped_refptr even. So they can exist at the same time. You can have both kinds of ref counts going at the same time. A weak ref count, in this case, is holding the memory alive so that it doesn’t get re-used. But it’s not keeping the object in that memory alive. And so from a programming state point-of-view, the weak refs don’t matter. They’re helping protect you from security bugs. When things go wrong, when a bug happens, they’re helping to make it less impactful. But they don’t change your program in a visible way. Whereas strong references do. That destructor’s timing is based on when the ref count goes to 0 for a strong reference. So that’s the difference between these two.
36:46 SHARON: So when you say don’t use ref counting, you mean don’t use strong ref counting.
36:46 DANA: I do, yes.
36:51 SHARON: And if you want to learn more about the raw pointer, raw_ptr, raw_ref, that’s all part of the MiraclePtr project, and there’s a talk about that from BlinkOn. I’ll link that below also. So in terms of other base types, there’s a new one that’s called base::expected. I haven’t even really seen this around. So can you tell us a bit more about how we use that, and what that’s for?
37:09 DANA: base::expected is a backport from C++23, I want to say. So the proposal for base::expected actually cites a Rust type as inspiration, which is called std::result in Rust. And it’s a lot like optional, so it’s used for return values. And it’s more or less kind of a replacement for exceptions. So Chrome doesn’t compile with exceptions enabled even, so we’ve never relied on exceptions to report errors. But we have to do complicated things, like with optional, to return a bool or an enum. And then maybe some value. And so this kind of compresses all that down into a single type, but it’s got more state than just an optional. So expected gives you two choices. It either returns your value, like optional can, or it returns an error. And so that’s the difference between optional and expected. You can give a full error type. And so this is really useful when you want to give more context on what went wrong, or why you’re not returning the value. So it makes a lot of sense in stuff like file IO. So you’re opening a file, and it can fail for various reasons, like I don’t have permission, it doesn’t exist, whatever. And so in that case, the way you would express that in a modern way would be to return base::expected of your file handle or file class. And as an error, some enumerator, perhaps, or even an object that has additional state beyond just I couldn’t open the file. But maybe a string about why you couldn’t open the file or something like this. And so it gives you a way to return a structured error result.
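A sketch of the file-opening example Dana gives, using base::expected with an invented error enum and a stand-in File type:

```cpp
#include <string>

#include "base/types/expected.h"

enum class OpenError { kNotFound, kPermissionDenied };

struct File {};  // Stand-in for a real file handle type.

// Either a File, or a structured reason why one couldn't be produced.
base::expected<File, OpenError> OpenFile(const std::string& path) {
  if (path.empty())
    return base::unexpected(OpenError::kNotFound);
  return File();
}

void Example() {
  auto result = OpenFile("/tmp/goat.txt");
  if (result.has_value()) {
    File& file = result.value();  // Or *result.
  } else {
    OpenError why = result.error();  // Full error context, not just a bool.
  }
}
```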
39:05 SHARON: That sounds useful in lots of cases. So all of these types are making up for basically what is lacking in C++, which is memory safety. C++, it does a lot. It’s been around for a long time. Most of Chrome is written in it. But there are all these memory issues. And a lot of our security bugs are a result of this. So you are working on bringing Rust to Chromium. Why is that a good next step? Why does that solve these problems we’re currently facing?
39:33 DANA: So Rust has some very cool properties to it. Its first property that is really important to this conversation is the way that it handles pointers, which in Rust would be treated pretty much exclusively as references. And what Rust does is it requires you to tell the compiler the relationships between the lifetimes of your references. And the outcome of this additional knowledge to the compiler is memory safety. And so what does that mean? It means that you can’t write a Use-After-Free bug in Rust unless you’re going into the unsafe part of the language, which is where scariness exists. But you don’t need to go there to write a normal program. So we’ll ignore it. And so what that means is you can’t write the bug. And so that doesn’t just mean I also like to believe I can write C++ without a bug. That’s not true. But I would love to believe that. But it means that later, when I come back and refactor my code, or someone comes who’s never seen this before and fixes some random bug somewhere related to it, they can’t introduce a Use-After-Free either. Because if they do, the compiler is like, hey - it’s going to outlive it. You can’t use it. Sorry. And so there’s this whole class of bugs that you never have to debug, you never ship, they never affect users. And so this is a really nice promise, really appealing for a piece of software like Chrome, where our basic purpose is to handle arbitrary and adversarial data. You want to be able to go on some web page, maybe it’s hostile, maybe not. You just get a link. You want to be able to click that link and trust that even if it’s really hostile and wanting to destroy you, it can’t. Chrome is that safety net for you. And so Rust is that kind of safety net for our code, to say no matter how you change it over time, it’s got your back. You can’t introduce this kind of bug.
42:03 SHARON: So this Rust project sounds really cool. If people want to learn more or get involved - if you're into the whole languages, memory safety kind of thing - where can people go to learn more?
42:09 DANA: So if you’re interested in helping out with our Rust experiment, then you can look for us in the Rust channel on Slack. If you’re interested in C++ language stuff, you can find us in the CXX channel on Slack, as well. As well as the cxx@chromium.org mailing list. And there is, of course, the rust-dev@chromium.org mailing list if you want to use email to reach us as well.
42:44 SHARON: Thank you very much, Dana. There will be notes from all of this also linked in the description box. And thank you very much for this first episode.
42:52 DANA: Thanks, Sharon. This was fun.