Alright. So, to explain the jitter, there’s a bunch of context I need to cover first. I’ll get to the jitter at the very end, but first I need to explain the basics of Torque Hi-Fi.
To start with, I have this short demo video which shows a few key things about how the online multiplayer in this game works
In this video, I’ve added some fake lag and gone into a multiplayer game against myself, where I play both the host and the client. You can watch it for yourself to see the concepts I’ll be explaining in realtime, but there are a few key takeaways.
In the video, I start out by playing as the host. I position the client marble on the edge of the map. I, playing as the host, roll toward the client marble, but just as I’m about to knock the client off, I turn around.
On the client’s side, they see themselves get knocked off, before suddenly warping back onto the map. This proves that there is some kind of prediction going on, because the client was never actually knocked off on the host’s side.
In the next part of the video, I flip the experiment around. Now, I position the host on the edge of the map, and as the client, I try and knock the host off. But in this case, the same thing doesn’t happen. The host doesn’t seem to predict ahead.
The next part of the video shows the host and the client picking up gems. The host picks up the gem instantly. The client has to wait a short moment before they hear the gem pickup confirmation sound.
What we can deduce from this, is that the host is playing the game in realtime. So from the host’s perspective, there is no prediction. As soon as they press WASD, they move instantly. Similarly, as soon as the host receives a packet from the client, the client is immediately represented in the new position. From the host’s perspective, everything works exactly as you would expect in a singleplayer game: any inputs received, from the host or from other players, are drawn immediately as-is.
But if we were to apply the same rules to the client, then the game would be deeply unfair. Here’s why.
So here’s a beautiful diagram that I’ve drawn of two marbles and a gem. The blue marble is the client, the green marble is the host, and the red triangle is a gem
Now, the host is constantly sending the client its current position over the internet. But that’s not instant: on a somewhat laggy connection, it takes about 400ms for that packet to reach the client.
So, as soon as the client receives the host’s current position, it is already out of date. The host is probably somewhere over there, having already collected the gem. If we drew the host in this outdated position, the client would think they have a shot at the gem, when the host probably already collected it.
So to make up for this, the client is playing the game in the future. The client predicts where the host will be in 400ms, based on the host’s current trajectory, to make up for the time the packet took to get from the host to the client. This way, the game looks consistent
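Here’s a minimal sketch of that idea. It assumes the simplest possible prediction, a straight line along the host’s last known velocity. The real game runs its physics forward instead, so treat this as an illustration only. The names Snapshot and extrapolate are made up for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Snapshot:
    """Last state received from the host (hypothetical structure)."""
    x: float
    y: float
    vx: float  # velocity in units per second
    vy: float


def extrapolate(snap: Snapshot, latency_s: float) -> Tuple[float, float]:
    """Project the last known position forward along its velocity."""
    return (snap.x + snap.vx * latency_s, snap.y + snap.vy * latency_s)


# The packet is about 400ms old by the time it arrives, so look 0.4s ahead.
host = Snapshot(x=0.0, y=0.0, vx=5.0, vy=0.0)
predicted = extrapolate(host, 0.4)
```

With a host rolling at 5 units per second, the client draws it about 2 units further along than the packet says.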
Now, this doesn’t yet explain the jitter (don’t worry, I’ll get there.) In fact, this is pretty standard stuff in online multiplayer games
Players hate input lag. They want to press WASD and move instantly. Without this system in place, the client would press a key, then have to wait until they received the host’s position before they could move. That’s the only other way to get a consistent-looking game, and it means input lag on every keypress.
So, the solution is that the host plays the game in realtime and the client plays the game in the future. The client’s real position is still the one the host sees: everything on the client’s screen is prediction. The host’s location is predicted by the client, but so is the client’s: if the host bumps into the client on their screen, and the client fails to predict it, the client will see themselves warp to their new canonical location – that’s server deconfirmation. The position that the client sees themselves in is, itself, a prediction, 400 milliseconds ahead
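The warp can be pictured with a small sketch like this one. It’s an assumption-heavy simplification: it just compares the client’s predicted position against the host’s answer and snaps if they disagree. The function name reconcile and the tolerance value are made up, and the real game does more than this after a correction.

```python
import math


def reconcile(predicted, authoritative, tolerance=0.05):
    """Return (position to draw, whether the player saw a warp).

    If the prediction was close enough, keep it. Otherwise the
    host's version wins and the marble snaps to it.
    """
    error = math.dist(predicted, authoritative)
    if error > tolerance:
        return authoritative, True
    return predicted, False


# The client predicted it kept rolling, but the host bumped it.
shown, warped = reconcile(predicted=(10.0, 0.0), authoritative=(7.5, -3.0))

# The prediction was right, so nothing visibly changes.
same, warped_again = reconcile(predicted=(7.5, -3.0), authoritative=(7.5, -3.0))
```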
But… what if our connection speed changes? If we don’t need to predict as much anymore, then we shouldn’t. Obviously, it’s impossible to predict where the host will be in five minutes based on their current trajectory. The longer a prediction is, the less likely it is to be true. So it’s in our best interest not to predict too far ahead. But what if we have to?
The answer, then, is to insert just a little input lag on the client side. That way, we get the best of both worlds: we can keep our predictions shorter, and because there isn’t a lot of input lag, the player won’t notice it as much.
But what determines how much input lag to add? Well, what we really want is for our predictions to line up with the next packet we receive. So in the diagram above, we are trying to predict the host’s location in the next packet from the host. If we think the next packet will arrive in 300ms, we try and predict 300ms ahead. If we think it’ll arrive in 500ms, we try and predict 500ms ahead. And of course, connection speeds can change, so we have to account for that. Here’s another visual I’ve made to try and demonstrate how it handles all that.
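One way to picture “guessing when the next packet will arrive” is a smoothed running estimate of the delay. This is only a sketch of the general idea: I’m assuming exponential smoothing here, which is not necessarily what the game does, and DelayEstimator and its parameters are made-up names.

```python
class DelayEstimator:
    """Running guess of how far ahead we need to predict, in ms."""

    def __init__(self, initial_ms, smoothing=0.25):
        self.estimate_ms = float(initial_ms)
        self.smoothing = smoothing  # 0..1, higher reacts faster

    def observe(self, measured_ms):
        """Nudge the estimate toward the newest measured delay."""
        self.estimate_ms += self.smoothing * (measured_ms - self.estimate_ms)
        return self.estimate_ms


# The connection improves from 400ms to 300ms.
est = DelayEstimator(initial_ms=400)
est.observe(300)  # 400 + 0.25 * (300 - 400)   = 375.0
est.observe(300)  # 375 + 0.25 * (300 - 375)   = 356.25
```

The estimate drifts toward the new delay instead of jumping, so one odd packet doesn’t swing the prediction length around.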
For the sake of simplicity, I am assuming that this is a 30 FPS game and one frame = one tick (not actually true but that doesn’t matter.) The blue square represents what the player sees on screen right now. It is a few frames behind the current actual user input, in green, which is constantly being sent off to the server.
The blue bar along the bottom represents the frames that the server has confirmed – it’s gotten back to the client and confirmed that it actually received that input.
What you can see, is that the blue “on screen” square moves closer or further away from the current user input. The game is adjusting the amount of input lag, because the connection speed has changed. That’s the jitter. What’s going on is that the game is moving you back or forward a few frames, so that it can line up its prediction with when the next packet is coming in
So on a good connection, it doesn’t have to predict very much. On a bad connection, it would have to predict more, so it inserts some input lag to keep predictions short, and therefore more accurate.
But when it inserts a frame of input lag, your marble appears to be in the same spot for two frames, because it had to duplicate a frame in order to insert that lag. And when it removes input lag, you appear to jump forward a bit, because it has to take out a frame to remove that lag.
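The duplicated and skipped frames fall out of simple arithmetic. In this sketch (made-up function name, and tick numbers chosen arbitrarily), the frame on screen is just the current input tick minus the input lag chosen for that tick.

```python
def displayed_frames(lag_per_tick, first_tick=10):
    """Frame shown on screen at each tick: input tick minus input lag."""
    return [tick - lag for tick, lag in enumerate(lag_per_tick, start=first_tick)]


# Lag goes 2 -> 3 (one frame inserted), then back 3 -> 2 (one frame removed).
frames = displayed_frames([2, 2, 3, 3, 2, 2])
# frames is [8, 9, 9, 10, 12, 13]
```

Frame 9 is shown twice when the lag goes up, and frame 11 is never shown when the lag comes back down.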
In Torque Engine, this is called “Move Sync.” In OpenMBU you can find the relevant source code in gameConnection.cpp and moveList.cpp.
Any frame that has not yet been server confirmed is stored in the “tick cache.” The positions and forces on all of the marbles on those frames are stored, so it can return to displaying those frames instead of the current user input. As soon as a frame is confirmed by the server, it’s discarded from the cache. There’s no point in rewinding before a server confirmed frame.
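As a rough sketch of what a cache like that looks like, here’s a toy version. The class and method names are made up and this is not the engine’s implementation. I’m also assuming the confirmed tick itself gets dropped along with everything before it.

```python
from collections import OrderedDict


class TickCache:
    """Holds per-tick state for frames the server hasn't confirmed yet."""

    def __init__(self):
        self._states = OrderedDict()  # tick number -> saved state

    def store(self, tick, state):
        self._states[tick] = state

    def get(self, tick):
        """Saved state for a tick, or None if it was already discarded."""
        return self._states.get(tick)

    def confirm(self, confirmed_tick):
        """Discard every tick up to and including the confirmed one."""
        for tick in [t for t in self._states if t <= confirmed_tick]:
            del self._states[tick]

    def ticks(self):
        return list(self._states)


cache = TickCache()
for tick in range(10, 15):
    cache.store(tick, {"pos": (float(tick), 0.0)})

cache.confirm(12)  # the server has confirmed everything through tick 12
remaining = cache.ticks()
```

After the confirmation, only ticks 13 and 14 are left to rewind to.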
So depending on the connection speed, the game varies the amount of input lag. It can be anywhere from the maximum (displaying the last server confirmed frame) to the minimum (displaying the current user input frame). If the connection speed changes, it can use the tick cache to either display the same frame twice or skip ahead, adjusting the input lag to line up with the current amount of prediction.
And that’s what causes the jitter you see. It isn’t server deconfirmation (that only occurs when, say, a collision happens); it’s just the game trying to keep predictions accurate. If you’re playing as the host, who plays in realtime, you’ll never see the jitter.
Here’s the actual source code powering all of this: https://github.com/MBU-Team/OpenMBU/blob/master/engine/source/game/gameConnection.cpp
In this code, mFirstMoveIndex corresponds to the blue bar in the illustration: the server confirmed tick. mLastSentMove is the green square: the current user input, which is constantly being sent off to the server. mLastClientMove corresponds to the blue square: it’s what is currently displayed on screen.
mFirstMoveIndex gets updated whenever a new packet is received from the server. mLastSentMove advances every tick, when we poll for user input. mLastClientMove is what we get to play with, to decide how far what’s on screen is behind user input, that is, the input lag.
The clientCatchup function, which is referenced here (its source is in moveList.cpp), is what’s responsible for predicting the host’s location based on their trajectory, and does a bunch of math-y physics-y stuff.
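Here’s a toy model of how those three indices relate. It loosely mirrors the roles described above, but the field names are made up, the desired lag is just passed in by hand, and none of this is the actual OpenMBU code.

```python
class MoveWindow:
    """Three tick indices: server confirmed <= on screen <= latest input."""

    def __init__(self):
        self.confirmed = 0     # role of the blue bar
        self.displayed = 0     # role of the blue square
        self.latest_input = 0  # role of the green square

    def on_tick(self, desired_lag):
        """Poll input, then place the on-screen frame desired_lag behind it."""
        self.latest_input += 1
        target = self.latest_input - desired_lag
        # Never show anything older than the confirmed frame or newer than input.
        self.displayed = max(self.confirmed, min(target, self.latest_input))

    def on_server_ack(self, tick):
        """A packet from the server confirmed everything up to tick."""
        self.confirmed = max(self.confirmed, tick)
        self.displayed = max(self.displayed, self.confirmed)

    @property
    def input_lag(self):
        return self.latest_input - self.displayed


window = MoveWindow()
for _ in range(10):
    window.on_tick(desired_lag=3)
lag_normal = window.input_lag      # input is at 10, screen is at 7

window.on_server_ack(5)
window.on_tick(desired_lag=20)     # asks for more lag than is available
lag_max = window.input_lag         # clamped to the confirmed frame: 11 - 5

window.on_tick(desired_lag=0)      # no lag at all
lag_min = window.input_lag
```

The input lag is just the gap between the last two indices, and it can only range between zero and the distance back to the last server confirmed frame.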
I don’t know the specifics of how the location is predicted, only that that’s what it does, and otherwise I treat it like a black box
I wasn’t interested in the specifics of the physics since the reason I studied this was that I was interested in at one point making my own multiplayer game (though that’s still only a fantasy and not something that I’ve actually started on)
Note by Hailey: after this he wrote a little more but it wasn’t really about MBU, we were just talking about other games etc then later personal stuff