CppCon 2018: Stoyan Nikolov “OOP Is Dead, Long Live Data-oriented Design”

– Good morning. My name is Stoyan and first of all, thank you for waking up
early to come to this talk. Today we’ll be chatting
about data-oriented design and I decided to do this talk because, when looking at all the
resources, all the talks, all the blocks on data-oriented design, I was unable to find a good example of people showing a
real-world production system made with data-oriented design, comparing it with
objected-oriented programming and outside of games. So, I’ve been in the video games industry for about 10 years now
and I’ve been doing games, I’ve been doing mostly,
however, game technology. I work at a company called Coherent Labs where we create the technology for games, so it’s very likely that if you’re playing and if you’re a gamer,
you machine has executed some of the code we’ll be looking at. For the past six and a half years, we’ve been working on
products based on Chromium, WebKit, and now our
in-house proprietary game UI and the browser engine. If you’re not familiar with the terms, Chromium is the open-source project that is at the heart
of the Chrome browser, of Slack, of Skype, and
many other applications that are probably
running on your machines, and WebKit is the original
project that the core of Chromium is forked
from and works in Safari, on IOS, on Macs, so if
you’re an Apple user, you’re using it as well. So, we have these successful projects and around two and a half years ago when I was working on
optimizing and improving the software based on these technologies, the whole time I felt like I was in a cage and that cage was the
architecture of WebKit, which is also the architecture
in many ways of Chromium, of those layout engines. So, we gather up with my co-founders and did the only sensible
thing, of course, which is to start from scratch, leave all that behind and build our own in-house browser engine for games, and there was a very big question for us: can a browser engine be successful with data-oriented design? We didn’t know that. We had successful projects
with data-oriented design, our graphics engine was built with that, we knew about a lot of games
that have been successful by employing the design, but we didn’t know of
anything with the scale that we were looking at and
we didn’t know if it scaled to a software that is very
object-oriented in its core. If you look at all those
open-source projects, they are all object-oriented
and this has many reasons: it’s part legacy, it’s part
the way that the standard of HTML rendering requires
you to do some things. So, to see how that went,
let’s do a very quick demo. Okay. Let’s run a test page
that I did with Chromium. This is a build that I have here. Okay, it’s a very simple example. These are 3000 rectangles. All that they do is
they move left and right and they change their color and at the center there’s a big rectangle whose only idea is for
us to gain an insight on the performance. So, if you’re a keen gamer
or somebody who works a lot on performance, you
already see that the frame rate is not so great, and
the exact number is… It’s a bit difficult to see. It’s around 13, 15-ish frames per second. So, to zoom that a bit I’ll
use a very old school method of doing this, I’ll paste
it in Paint and zoom it and the reason is that
the Windows magnifier actually reduces the performance
of the Windows presenter, so we will skew our test if
we did it with the magnifier. So, we get 15 point
seven frames per second. It’s very far from the 60
frames that we would like to have for a smooth animation. Okay, let’s see what we did. Okay. If you look down on the right, we got around 50 frames. Actually, in my hotel room, it was 70, but who knows what’s happening? So, we have a significant
improvement on the performance. And the answer to that
question happens to be yes, we were successful, and we’ll
see how that was implemented. So, today’s agenda is we’ll be looking at the basic issues of
object-oriented programming. I’ll just glance at the
things that we believe can be done better. We’ll look at the basics
of data-oriented design. There are a lot of free sources. I have a final slide with references that you can look up later. The idea for today will be
to design a specific system, to look at a design made with
object-oriented programming, then to redo it with data-oriented design and see the difference between those two. What’s so wrong with
object-oriented programming? Why people have a problem with it? I believe that a picture
speaks 1000 words, so this is an inheritance hierarchy. It’s a bit difficult to follow, right? So, you’ve got diamond
inheritance and whatever. So, object-oriented
programming at it’s gist marries the operations with
the data they’re done on and it’s not a really happy marriage, because in the end, you end
up with those giant objects with a lot of unrelated data in them that you poke in different
parts of your software. If we imagine a game
and you have an enemy, you might have a data about its physics, some data about it’s AI, some
data about it’s rendering, and so on, but every time
you access that object, for instance to draw it,
you’re actually polluting your cache and getting a lot
of data from the other stuff that you’re not using right now, and object-oriented programming
also has the feature of hiding state in many places, and by hiding I mean that
you have all these Booleans, all these hidden ifs in your methods that, in the end, increase substantially the complexity of your code. I know those things have a
real impact on performance, scalability, modifiability,
and testability and these will be the
four quality attributes we’ll be analyzing. Data-oriented design takes
a different approach. It puts the data first. You can imagine it’s in the title. So, if we look at the
lower side of our slide, in object-oriented
programming you usually have this logical entity with all its fields, all the stuff that logically
belongs to that object. In data-oriented design, we do a bit of a different approach: we separate the data
according to its usage. So, we might have data A to
C with data A with fields A to C that is used in system Alpha, we get data B from fields from D to F, we use that in Alpha and in Beta, and we transform that
data to a new collection that is used down the pipeline. So, the gist of data-oriented
design is to separate the data from the logic. We try to use the data where necessary without polluting our caches and trying to keep our logic simpler. So, we embrace the data,
we don’t try to hide it. We avoid hidden usage. We’ll see how. It’s an interesting technique. We try to avoid virtual
calls because, very often, we just don’t need them, and
data-oriented design is one of those things that,
at its core, are simple, but they’re very difficult to master, a bit like the game of Go. It promotes domain knowledge. You cannot separate well your data and build a good system if
you don’t know that data, if you don’t know the
problem that you’re solving and the system you’re designing. And as I mentioned, there
are references at the end of the talk for additional information. So, let’s design a system and
the system I have selected is a system for animations,
from the browser engines that I showed earlier,
and to give you an idea about what is possible with such a system, this is a very small example
from one of our samples, but it gives you the
idea that with animations you can do a lot of things in CSS, so the example that we showed
earlier for the performance was just moving rectangles
and changing their color, but you can do more complex
things like transformations, moving text, opacity, and so on, so you can modify different properties. The details of the system
are not that important. It’s more important for
us to gain an insight on how to create it. The definition of the
animation is again very simple. We have textual definition,
we have our key frames, everybody who’s doing animations
know what the key frame is. It’s I want the animation
to start here and end here and has this property here and here and we have the duration of the animation. In a real world system,
there are other properties, obviously, but these are
the basics and the others don’t change the design that much. We already saw that we can modify different types of properties and this will have a real
impact on our design. This means that when we
implement this in C plus plus, we’ll have to interpolate different types of C plus plus structures
or objects or whatever. We can have numbers, we can have colors, we can have transformations, and so on. And another important design
is that this definition is not static. We want to allow the user at runtime, to modify the animations. So, let’s try to implement this with object-oriented programming and for the example today,
I’ve selected Chromium, and I don’t know how
many people are familiar with the Chromium code base, but I really, really encourage
you to go and study it. It’s an incredible piece of software done by one of the best
engineers in the world, so there’s a lot to learn. Of course, it’s a giant code base, many millions line of code, but when you get the gist
of it, it’s very cool. So, Chromium actually has
two animation systems. We’ll be looking at just one of them. The other is similar in its design, so all the things that I’m
talking apply to that as well, and it very closely
follows the HTML standard, so that’s why probably they
have this object-oriented design in the first place. As it’s object-oriented and
we are creating animations, what do we expect to have? A class called animation. Right, that’s exactly what
I looked up in the code. There is a class called animation. That’s great, but wow. It inherits six other non-trivial classes. At least it’s final so we
know there are no others down the line. And you remember the picture that I showed in the beginning about
the inheritance hierarchy? It’s actually the inheritance
hierarchy of this class. So, up you get a lot of inheritance. It’s already difficult to follow, right? Okay, so to gain a better
understanding of the system, we’ll be looking at the
most common operation, which is ticking or updating an animation, that is from the time and the definition to generate the new value for this frame of the property that we’re changing, the new color or the new
position of our elements, and we’ll delve into the code of Chromium. It starts very as we would expect. There is a method called
service animations. It takes time at the second line. It takes the current
time and it calls update on every animation pointer
that is in our collection. There is a collection of
animations needing update, so clear so far. However, we already see some
strange things happening. They are not walking over the animations needing update vector. They are copying all the
pointers and walking over and iterating over
another temporary vector, and the reason for that is
that, in this call stack, but also probably in other call stacks, they are deleting from
the original vector. So, we have some unclear
lifetime semantics. It is very difficult for us now to know when those animations get
created and, most importantly, when they get deleted. If they are writing this code,
there is a reason for that. So, let’s look at the update method, what’s happening right there. Fairly simple, it takes the reason, but the very first line
is the hidden state I was talking to you about. If there is no timeline, which
is some pointer, whatever, they return false. So, the animation will do
nothing in some condition, but we’re already paying a
big cost just for this if because the animations are
all located on the heap, so we’re just touching
that field just to see if it’s set or not and we
are paying for a cache miss and we might be having a lot of branch miss predictions as well, because if we have a good mix of active and inactive animations,
the branch predictor will probably get confused, so we’re paying some performance cost and we’re exponentially the
states and the complexity of our system. Okay, so if we have this content, which is another condition, what do we do? We delegate, actually, the
animation to another class, which is held in the
declaration of our animation as a member. So, Chromium does garbage
collection in C plus plus, but you can think of this member type as a shared pointer. So, we are again paying the
cost of a possible cache miss and this animation effect read only is actually the definition
of our animation. All right. So, I’ll be skipping
some of the call stacks. I had a lot more slides,
but had to remove them for time constraints. We’ll be skipping the call stacks and looking just at the interesting parts. Next up, we’re going to update our value, so we skipped a bit and we
see some interesting stuff that happens a lot in
object-oriented programming. In some cases, the animations
have to call an event. Happens. So, if there is an event delegate, we’ll call an event condition, and here we have brought coupling between our animation
system and the event system. Moreover, we have no idea
what the on event condition will actually do, so it
might call JavaScript, which will destroy all the whole data that we have in our caches, it’ll have to load all the
call data for the JavaScript, execute whatever it’s doing, and they you’ll have
to reload all the data that we just threw away. So, a possible very
heavy performance heap. There are probably also some
instruction cache misses because we don’t know
where this code is going in our giant source. So, we have calculated all this and comes an interesting part that I
mentioned in the beginning. We want to be able to
interpolate different C plus plus types because our animation could be on a color, could
be on a number, whatever. So however, we still want to keep that in our system somehow,
so how do we do that in classic C plus plus, in classic object-oriented programming? We want the method that is different depending on the type that
we want to interpolate and they have different sizes. Anyone? Yep, virtual functions on abstract class. That’s exactly how they do it. So, there is a class interpolation, which is abstract and it is inherited for every possible type
that can be interpolated and the virtual interpolate
method does the magic. Unfortunately, this type
of dynamic type erasure in C plus plus is very costly to do. You have the cost of allocating
all those interpolations and in our example earlier,
that’s 6000 animation, so 6000 of those, and you have
the very likely cache miss when you call the virtual, and a very likely instruction cache miss, depending on the types of
properties you’re animating. Finally, we have calculated the new value, the interpolation did its magic. We have to apply that new
value to the properties, to the objects that are being animated, and again we do it in a
very straightforward manner. We have a member variable target, which is the element that
will receive the new value. And again, we bring some
coupling between the systems because now our animation system, whose only job was to calculate new values according to time, knows about
the document object model, about the styling system,
so that it can call the appropriate method of the element. In our case, the element is
called set needs animation style recalc, and if we
look at the implementation, we’ll see something very, very scary. This is again a symptom of
object-oriented programming, I believe, where you call a method, but as it is a black box, we
have no idea what’s going on, so you might end paying
a very big cost on that and we’re indeed paying a very big cost because set need style
recalc, which is called there, actually walks up the
document object model tree for every element and sets a Boolean, and on each call up this tree, we’ll very likely be getting a cache miss, so we were doing simple animations, now we’re walking up the DOM tree, we’re again changing the
context of our program. So, to recap very quickly,
to do our animation, we’ve used more than
six non-trivial classes, the objects contain smart
pointers to the other systems, they are coupled with the other systems, for the interpolation
we use abstract classes, and we are basically drinking the poison of object-oriented
programming all the way. All right, now let’s try do
it with data-oriented design. Let’s clear our mind, forget what we saw, and return to the drawing board. So, let’s design our system
around the most common operation and this is the operation of ticking, of updating the animation. It happens like 99% of the time, we have other operations obviously, like adding, removing, posing
an animation, whatever, but the most important one, the one that defines our
performance, is the tick. For this tick, we have
two very simple pieces of data as input. We have the definition,
obviously, of the animation, and we have time, and as
output, if you think about it, we also have some very
simple piece of data. We need the properties that have changed, we need the new property values, and we need a mapping the elements that will receive those properties. And again, as a cornerstone
of data-oriented design, we will work on this system
like there are many animations. With object-oriented programming
and the thing that we saw earlier, it works as if
there is just one animation, so you call update on the
animation, it does what it does, but this is not what’s actually happening. You don’t have one
animation, you have dozens, hundreds, even thousands. So, how will our tick work? It can be done very simply. Let’s create a class
called animation controller or whatever name we find interesting. It will contain an array
of our animation states, it will take time as input. Those states are generated
from definitions, and it will output exactly what we need: An array, a vector of
the changed properties, an array of the elements that have changed with the links back to the new properties. And here, the element
pointers are just DOM pointers for our system. It doesn’t know what they are. It doesn’t directly call to
the document object model. With that data as output, we can leave it to the next systems in our pipeline to decide what to do. So, the events system can
walk these changed properties, these changed elements, and decide to call the events where necessary. The styling system can
walk over these elements and decide what to do,
knowing their new properties. So, we’ve reached a lot of the coupling, a lot of the separation of concerns. How do we build those animation states? Well, it’s very easy. Again, go flat, forget about methods, just have plain old data style struct with all the data that we need. The details are not important. This is the runtime data that we need, so the start time, the
post time of the animation, and here is a bit of a twist: we also have all the
animation definition copied for each animation state, and we did this for two reasons: the first one is that empirically, we found out that the
performance is better this way and second, it makes some
operations very easy. Imagine you want to change the duration of a certain animation. If the definitions were shared pointers, you would have to implement
some sort of copy on write or other EDM to do that, but now that they are separate, you change just the
property for that animation, all the others are completely unaffected. So, think about how that might help us, for instance, doing this
in multiple threads. Okay, so we have these properties. However, we haven’t solved
the problem of having to interpolate different
types of different C plus plus types. Remember, they were doing
it with the abstract class, but we don’t want to
pay the cost for that. So, to avoid type erasure, C
plus plus has a very good way to do it statically:
we can use a template. We can use a template because
we know exactly all the types of properties that we will be modifying. There are no new ones. So, we now have different types templated by the animation key frame and we would like to iterate over them to run the animation. However, obviously they have
different sizes in memory, so we cannot put them
in a nice, neat vector and walk it over, but as I
said, we know each on of them, so what we can do is very simple. Let’s have a vector for each type. Now, they are the same size, so we have as many vectors
as types as needed, and when we want to interpolate
them, it’s very simple. We go over all the types A,
all the vectors of that type, then we go to the second one,
then we go to the third one. So, we are now having
an order of magnitude less cache misses than we had
with the object-oriented case. The reason is that we will
be having a cache miss when we change the type of the
vector we are iterating on. However, and this is again
a bit of the main knowledge, but we know that we’ll usually
have a handful of properties that are really animated,
so we’ll end up with a bunch of very large vectors that
we can linearly iterate and the prefecture of
the CPU will do its magic and we’ll not be paying
the cost of cache misses and we’ll end up with a lot of vectors that are simply empty
and we’ll skip over them. So, instead of having one
cache miss for each animation, as it was in the
object-oriented case, so O N, we’re actually having one
cache miss for each type of animation, which is an
order of magnitude less. Actually, in the performance
example I showed earlier, there were 6000 animations because there were 3000 rectangles, but they were changing the
color and the position, so these are different animations, so in the object-oriented case, we got 6000 cache misses each frame. In this case, we’re just
walking over the color and the position, so we’ll be getting just two cache misses, probably. And on the implementation
level, it’s very easy. We can template our
decontamination function, according to the property and just have the same code running. I mentioned that
object-oriented programming tends to hide a lot of state. We saw that with the
activeness of the animations, so you can have some that are enabled, some that are disabled,
and in the easiest way of implementation, you can
have a Boolean is active or in the case of Chromium,
it was if there is a timeline, which is basically the same thing, and do if this, then that, and whatever. And in our system, the
activeness, inactiveness is the most important state,
the most important Boolean that we have, so we can employ something that we call existence-based predication. We just have to erase. One array is for the active animations, one is for the inactive animations. The act of enabling an
animation is just moving data from one to the other or vice versa. In this way, when we iterate
and have to apply an operation on our active animations, we know that there is no hidden state, we can apply the exact same code, the exact same transformation
on all of them. This is a bit difficult to
do for every possible state that you have, so you can prioritize by the most important, by the one that has the most volatility, or you can try to reduce
the number of states. So, to look a bit at
the code that’s final, the details are not that important, but if you look at it, it’s very simple and in many ways beautiful
because there are no ifs, there are no branches, and
you can read it like a book. So, we interpolate the key frames, we interpolate the values, everything works as we would expect. And this get interpolated value function that I have highlighted is just a template that will do the right thing
and call the right function for the property we’re changing. And though this code can be
safely in our C plus plus file, it won’t pollute with
templates all over the place, and will also keep the
code together, very likely, in our final executable, so instruction cache
misses are very unlikely. Okay however, our system is now working. It’s very high performance,
we are very happy with it, but we want to add an API to the user and we want it to be a convenient API. And here, object-oriented style of classes with methods and so on is
really, really convenient. You can give the user
just an animation object and she can call play,
pause, and whatever. So, we want to give that to the user. So, let’s create our animation object. Now is the time. It has all of the methods
that we would like, but contains only one simple piece of data and that’s an animation
handle, just an ID, and the first all the
operations to there functions the animation controller. So, our animation class is very DOM, but it gives this
convenient API to the user while we can still
rework our internal data and make our system as
performant as possible, and we are using a simple
handle to simplify the lifetime of those objects because now
we know that the lifetime is completely controlled
by our controller. There is no shared
ownership over this data. And to implement the DOM
API in the controller is very simple, you just
have all the methods that take the ID. So, to quickly recap
the concepts between OOP and data-oriented design that we saw, in the Chromium case we
were using an inheritance of six classes and all
the logic was crammed in this animation object and its links. In D O D, we did a very simple
flat animation state struct, templated, and we had other
methods that worked on that. We used a list of dynamically
allocated interpolation, so abstract classes, but in C
plus plus you can do better. You can use templates
and if you know the types that you’re having, you
can just have an array for each one of them. OOP used Boolean flags for activeness and again, by flags
it’s not just a Boolean, it can be the presence or
not of a pointer, whatever. And in data-oriented design,
we used different arrays for the different states. And to give an API to the user, with object-oriented
programming we directly give the API of the class. However, our user just
wants to play an animation and immediately gets 100 methods that doesn’t know what to do. We solve that by giving a very clear API only with the stuff that is needed. And finally with object-orientation, you reach out to the other
systems in the program directly. With data-oriented design,
we just output tables with the new data and let the other guys in the pipeline decide
what to do with them. So, the key points with
data-oriented design: Try to keep your data flat, try to use existence-based predication to get rid of this state
all over the place, try to use ID-based handles
to reduce the pointers and to simplify the
lifetime of your objects, and try to use table-based output because this table-based output
will actually be the input for the next thing down the pipeline. Let’s analyze how we did. Okay, the first thing we’ll
be looking at is performance and I’ll be looking at the
performance of this system. It’s six times faster. So, this is a hidden treasure for us. We did not change the algorithm, we did not invent some very
clever way to do animation, we’re executing and doing
pretty much the same thing. Only by reorganizing how the data is used, how the code is structured,
we’ve gained six times better performance on this
system, and think about it, it’s very likely that
you have such systems in your code bases as well and the treasure is waiting to be found. Next up, let’s talk about scalability. How could we hypothetically
scale those systems and hereby scaling, I
mean a very simple thing, making them work on multiple threads. Well, in the object-oriented case, it will be very difficult
because we will have to account for all those dependencies
for the data races that are very probable in the system, so we don’t know when we say to the target to change its style if some
other thread is doing that, we don’t know if all the
methods we are calling are thread-safe and we
have to make sure they are. Another problem with the hidden state is that if we hypothetically
solve these issues and run, let’s say, half
the animation on core A, half the animation core B, if they have a very
different runtime profile, so if core A is lucky and gets
a lot of inactive animations, then it will be idle for a lot of time. We won’t have an even
distribution of work. With data-oriented design,
things look simpler. Our animation states are
completely independent of each other, we duplicated
some data to achieve that, so what we can do is just
let’s cut our vectors in half, half goes here, half
goes on the other core, every one of them outputs its data, and in the end, in a classic
fore-conjoined fashion, we can merge that data together and leave it to the next system. What about testability? Well again, with the
object-oriented, it looks difficult. I wouldn’t be the one
that tries the only test for all of this because I will
have to mock a lot of things, for instance, the events,
the document object model just to test my animation system. Moreover, I have a lot of hidden states, so a lot of ifs and elses and so on. So, if I want to cover all of them, I have this combinatorial
explosion of tests that I would have to do. In the data-oriented cases,
again, looks a bit more simpler just because we have really
isolated the animation system from everything else. We have our very clear
input, our very clear output, so as long as that
contract is not breached and the right input gives
you the right output, you are fine, the system works, and the other systems down the
line have the responsibility to use that output in the correct way. For the modifiability, such
large and complex objects throughout the hierarchy stand to harden, and by that I mean they
become so complicated that the programmer is very much inclined not to modify them, so it’s
very difficult for us to, for instance, to experiment
with changing the layout of the object or moving
some data from one class to the other, just because
everything will break and we have to fix a lot of things. So, those things are usually
left to big re-factorings that in the end, never happen. In the data-oriented case,
as we are completely owning our data and it’s very clear, we have a lot more wiggle room. We could separate our
animation state into arrays, our animation could continue
to use both of those, but we can send part of
that data to another system and we can try and reorder
some of the functions, it’s a bit clearer. However, there is one thing
that object-oriented programming excels at and this is what
I call quick modifications. Imagine that we add a new state, like if it’s Tuesday, the
animation runs twice as fast. With object-oriented programming,
it’s very easy to do. If Tuesday, do whatever,
else do the other thing. So, we add a new state and it just works. If we are to keep the
pure data-oriented design, this is a lot more difficult to do because we would have
to reanalyze our data, we would have to look for
ways to keep this state out of our main kernel of the function that is running the animation, and it will probably
take us a lot more time, but in the end, it will probably pay off. Obviously, data-oriented design is a tool like object-oriented programming is, like everything we do in programming is, and as every tool has, it has its place, but it also has some downsides and the biggest downside, I
believe, is that the current data separation is
actually very hard to do. I mentioned that earlier,
but data-oriented design is easy to explain, but
very difficult to master and very difficult, in
the end, to really achieve what you’re looking for; and here comes a bit the domain knowledge that is good for you to have. If you don’t know very well the
problem that you’re solving, it’s very difficult for you
to design the data in the way that it will be optimal,
so know your problem. As existence-based predication
is also a bit difficult to achieve if you have
a lot of those states because you start having
these multi-dimensional arrays with this and this and whatever and my advice there is
try to reduce the state. If you’ve done GPU programming, there are great examples there. You can try and reduce
the state by executing the same operation on multiple
data or other techniques. Just get rid of those ifs and elses. Quick modifications can be tough. That, I mentioned already. And we as programmers might
have to unlearn a thing or two. So, we have been wired with
this object-oriented type of thinking since forever, actually, so in the beginning especially,
it’s difficult to start and rethink everything
in this new framework. And unfortunately, the
language C plus plus is a bit difficult to use in some cases, is not always your friend. So, should we get rid of object-oriented programming altogether? I don’t think so. It is a tool, as well. It has its place in our code bases and it’s the antithesis of my title. But sometimes we have no choice. You might be using third party libraries that use object-oriented
programming, so you’re out of luck. You might have API
requirements that make you need to use object-oriented programming. You might have seen that, but
object-oriented programming is not a synonym for classes
and methods and so on. With data-oriented design,
I still had classes, I still had structs, it
was that I put them first, I put the data first and not
the operations, not the code. Polymorphism and interfaces
have to be kept under control. I don’t believe that
their role is in a very low-level system, like
interpolating a couple of numbers in the animations, but
they have their place in high-level systems,
in client-facing APIs, they are amazing if
you do a type of plugin and you really need some
dynamic polymorphism. And remember that C plus
plus has amazing facilities for static polymorphism. You can go a long way with templates or just when you know all
the types that you have, you can generate everything, actually. It works surprisingly well. Finally, are there some
changes in C plus plus that will make it easier for us to write data-oriented design code? I have a dream, which will
probably never come true, but there are languages that allow you to change the memory layout
of objects very easily and to experiment, for instance,
with array of structures, structure of arrays, and
not having to rewrite half of your code for that. I don’t see that coming,
but we can always hope. I didn’t show that, but in games, very often data-oriented
design goes hand in hand with entity component systems, and even if you don’t
need the whole flexibility of an entity component system, it will be great if it was possible for us without paying a big cost to reorder and to split our classes into components, and this is actually doable in C plus plus and I’m talking about doing it
without a pointer indirection because if you do it
with pointer indirection, it’s very easy, but it
requires a lot of custom code that it’s a bit of a mess. For me, the ranges proposal
looks very exciting. I believe it will be a great addition and it will help a lot with
some of the constructs, especially to make the code
more clear, more readable, and this is a bit of a pet peeve of mine, but unordered maps and unordered
sets are not that great because they do allocations. I think that we need in the
standard another hash table that has some relaxed requirements and will allow us to use
another scheme of addressing and reduce those allocations. So, obviously there are a
lot of open-source solutions, we wrote our own hash table, and actually reduced the allocation count in one of our versions by 10%, just by changing the
unordered maps and sets with a scheme that uses open addressing; and this again brings you
back to the know your machine and design software for the
machine that will run it. Don’t design for a
hypothetical abstract machine like it’s in the standard
because it doesn’t exist. So to conclude,
object-oriented programming is not a silver bullet,
but data-oriented design is not one either. The thing is that you must
use your best judgment. They are just tools in your toolbox, so they have their places. It’s your job to find them. Thank you. (audience applauding) We have around 10 minutes
for questions, so please. – [Man 1] Yeah, you’re
advocating and compile time polymorphism, and I like that as well, and then there are always
people who say to me, yeah, but then you probably
will stamp out a lot of codes and from your experience in
also knowing the machine, when do you hit problems in that area? That there’s too much
code, so that you get instruction cache misses, and do you know of tools,
how I can see whether those problems occur? – Yes, this is a concern. So, what we try to do
is keep the templates inside implementation files. In the animation system, this is doable. If we see an explosion of templates, this is a problem that we
try to cope to case by case, so you have two problems
that you might explode your build time, which
is obviously very bad for productivity, and the other
thing is what you mentioned, and we have seen in some cases, the compiler producing a lot of binary, and increasing our
instruction cache misses. There is no formula to solve
that, to be perfectly honest. Either you have to reduce the templates or you can rely on some black magic and this is what we’ve done
in some places, honestly. So, if you have the fortune
on working on a platform that really easily shows
you where the instruction cache misses happen,
which we have access to, so on game consoles, for
instance, it’s very easy to see. You can go there and try
to reorder your functions or you can try to use
link time optimization. Actually, in our experience,
link time optimization helps a lot with some of those cases, but sometimes you’re out of
luck and you have version one where everything is fine
and then you change a line somewhere in code and you
see five percent slower, just because Clang has
reordered some stuff, and when you’re working
really close to the maximum possible performance,
it’s a thing of life. I don’t have a formula to solve this all the time, unfortunately. – [Man 1] Great, thank you very much. – Thank you. – [Man 2] Hi. – Hi.
– You’ve described and architecture for this
system that includes data and operations on that data and I’m not familiar with
this domain frame much, but I’ll take your word for
it that it’s a good choice. But with those structures,
you may want to maintain certain variance in the data, for example, and you can do that with
constructors or methods. – Yes.
– And you’ve also described that for certain parts of your system, like the user facing part
or the client facing part, you may want to have those abstractions. – Yes.
– And so, I guess my biggest question is
what exactly do you consider to be object-oriented programming and why is what you’ve described not that? Because I think really you’re
just talking about choosing different abstractions for
different parts of your system, and I don’t really think
that that contradicts this idea of OOP unless
you have a different idea of what that is. – Got it. Yes, it’s different way of
solving the same problem, so it’s the same in variance, that’s true. However, we have turned a
bit around the way it is done because with object-oriented programming, what I believe it means
and what usually people end up doing is having this
object that we saw animation that contains all the
data, all the invariance, and all the operations
for one animation, right? That’s what we saw. However, this does not
map well to the hardware. It also does not map well to
the complexity of our system because we end up with these giant clouds that knows everything
and can do everything. – [Man 2] Let me try to refine
my question a little bit. So, the title of your talk is OOP is Dead, and I know you yourself
sort of weakened that stance at the end a little bit, but what exactly is
dead is what I’m asking? What precisely is OOP
that you want to die? – The notion of having
these giant logical classes that contain unrelated data
and I believe they have to be reworked in different
collections of data used by the different systems. Did that answer your question? – [Man 2] That just sorta sounds
to me like bad engineering versus good engineering and I don’t know if you can call that OOP. – Well, a lot of code base, unfortunately, end up with the thing that I showed because when you have the class animation, every developer usually
goes and adds a new Boolean or adds a new data or adds a new method and you can call it bad engineering, you can call it however you want, but it’s a fact of life. No matter how much you try and definitely the Chromium
and WebKit are done by some of the best engineers, but in the end, you end up with this giant inheritance hierarchy and so on. – [Man 3] Hi, so if I
understand correctly, there’s a difference of
about 15 to 20 milliseconds per frame in your initial
demo versus the Chromium demo. Do you have a sense of where those 15 to 20 milliseconds were going? We talked a lot about virtual indirections and cache misses, but we
didn’t really talk about how many milliseconds per
frame those were taking up, just that they were occurring. So for example, Chrome’s GPU
process is heavily sand boxed and it has to actually communicate via IPC just to even read the frame buffer, and so that’s a potentially
dominating source of cycles that obviously your program doesn’t have. – Absolutely. So, my demo ran a bit
slower than I was expecting, to be honest, but it’s probably because the PC’s not plugged, but I have an idea. This is definitely… It has some affect on the performance. That’s why when I do the final comparison, I compare just this system. The first one was just a way to engage and to see that the
overall system works better and there are a lot of
reasons for it to work better. The GPU process, honestly, is
probably not the biggest one. The biggest one is usually
the less cache misses, the simpler call stacks. I don’t know if you’re
familiar with the code bases of Chromium, et cetera, but
they have very long call stacks and there are reasons for them to do that. I get that. But for instance, the
animation system could be made. It just does the same
thing and it’s a lot faster and the milliseconds are here, so six, eight, and this is
on this PC versus one, one. – [Man 2] So yeah, that kinda
jump me to my next question which is that this simple demo is simple and Chromium is not simple and so, do you think that
the data-oriented design scales well with actual
required design complexity? For example, you got
probably 20 other sub-systems that need to hook into
the animation system and modify it or be
notified when things happen or run JavaScript code
and maybe yours does that, I don’t know, but – Yeah, it does that. – Yeah.
– It does that. And one other point on the performance, here, we usually see on PC
around four to five times better performance and this
was an artificial test, but I’m talking about real UIs and so on, but this is because this
PC has eight megabytes of health-free cache. (alarm ringing) I don’t know what that is. (alarm ringing) Okay?
– Okay. All right, yeah. – If you go to a platform
that has a lot less cache, like a mobile phone or a console, the difference in performance goes way up. – [Man 3] Okay. (alarm ringing) – [Man 4] Hi. – Hi. – [Man 4] So yeah, continuing on the line of what the previous person
on this microphone was saying, I don’t think you’re being entirely fair to object-oriented programming
and not all implementations of object-oriented programming
feature all of those features you listed.
– Sure. – [Man 4] And also, yeah,
this is a case for, obviously, the classic approach with
object-oriented programming is not applicable, but
actually, I wanted to ask you a bit about this: six point 12
on what type of CPU is this? – On this? – [Man 4] On this one,
yeah, because honestly, I think that it would be even more on an ARM CPU, for example.
– Oh, yeah. – [Man 4] That have much lower
tolerance for cache misses. So, perhaps use the ARM CPU
statistics for your next – Yeah, I mentioned
that, but the problem was that I had to build Chromium for ARM and I didn’t want to do that, but definitely, if you have
less cache on your processor, this number goes up, and to answer the first part, yes, I’m sure that there are
in the world well-written object-oriented systems
that have their place. That’s what I say at the end. – [Man 5] Hi, I can see
some challenges in debugging a data-driven system like this. I wonder if you had some challenges and how did you manage to solve them? – Yes, there are some challenges. The biggest challenge
is we are accustomed to we see a bug, we go into the debugger, we have the subject, we can
inspect all of the data, but when you’re doing
data-oriented design, you might know which data
belongs to which logical entity and what we usually do
is that, in the bug, we annotate our data so
that it’s easy for us to relate it to other
stuff in they systems. That’s what we do usually. I don’t have a lot of – [Man 5] Do you think
there could be a tool that would help debugging
that would access the data because the data, if you have one entity, the data is scattered all over the place, so for a human to understand
it, it’s way harder. – Sure, I haven’t seen generic tools. I have seen a lot of
tools ad hoc like the ones that we have just to solve these, so we annotate the data,
have some extensions for the debugger to be a bit
easier for us to find it. I’ve seen a lot of game companies, especially when things go multi-threaded, it’s even more difficult to find it. A lot of companies have some type of graph that they visualize or
try to do it that way, but I’m not aware and
I cannot really think of a generic tool that will really help. It’s very domain-specific. – Thank you.
– Thank you. – [Man 6] What is you’re experience with long-term support and
maintenance using this approach? My biggest concern is that, as time goes, you add more data to the structures and you keep multiple coop is, so the data would be like data upload, so in a few years, you
pretty much has to reorganize all the layers and adjust
them to the architecture the CPU’s using to the caches and pretty much redoing the whole thing. – Yes. Well, we pay attention to
what we do in the first place and second, I believe that
this goes a bit on the slide that I said about the
systems that tend to harden. I believe that it’s easier to re-work and eventually split the data
if it is done in this way than if it is done in a
giant hierarchy of classes. So as time goes by, requirements change and the system has to change. We’ve had so far very good experience, so we’ve worked for
years on Chromium’s base and for some years on this stuff and actually, the maintenance time and the solving a bug and
implementing a new future time is an order of magnitude faster and of course, it might
be related to the fact that Chromium is millions
of lines of code. Ours is very big, but it’s
not that big, so it depends. I believe that in time,
this is easier to maintain. – Thank you.
– Thank you. – [Man 7] You talked about that there was read-only common state
between all of these classes that you needed to duplicate
across all of this state. Could you talk about
the challenge as to why that couldn’t be simply pulled out and read-only state
separated from mutable state? – Sure. Because it’s not read-only. The fact is that the user
can modify that definition at any time, so one way to do it is to do a copy on write
because you might have multiple animation that refer that. The other is to copy it. So, as we have little data,
it’s an empiric finding that we did that it’s easier and faster for us to copy the data. As it changes, that fact
of life might change and you saw in the Chromium
base, you might have noticed that the class was called
animation effect read only, it’s not read only. They have cost methods that do cost cache, remove the cost, and change
them, so that’s not true. So, it’s mutable in their state as well. Unfortunately, we are out of time. I’ll be here, so everybody
who has questions is welcome and thank
you again for your time. (audience applauding)

Leave a Reply

Your email address will not be published. Required fields are marked *