Code Walk-throughs

It seems overkill to write instructions for placing every single piece of new code. At the same time, it really is necessary to describe the major portions of it, hence the existence of this section.


Backward Reconciliation

We'll start here, because it's what every major Unlagged feature is built on.

The main source of our problems with “lag” (latencies and discrete movement) is the disparity between what the client interpolates from the information it receives from the server, and what's actually going on. Backward reconciliation is an attempt (and a good one) at reducing that disparity.

I've already described in general terms what we'll do, so let's jump into the implementation.

Look for “//unlagged - backward reconciliation #1” in the code. The first hit, in g_local.h, declares clientHistory_t, a structure that describes a single entry in a client history queue. It contains a bounding box, an origin, and a timestamp. gclient_s contains an array of NUM_CLIENT_HISTORY entries of this type. At the end of every server frame, for every client, we'll make a call to G_StoreHistory().

Now open up g_unlagged.c. This is where the majority of the backward reconciliation logic is. I'll give a short explanation of the functions:

void G_ResetHistory( gentity_t *ent ): Resets a client's history, initializing every entry with the client's current origin and bounding box.

void G_StoreHistory( gentity_t *ent ): Stores an entry in the client history, using the client's current origin and bounding box. Notice that it doesn't store ent->r.currentOrigin, it stores ent->s.pos.trBase. This is very important. The current origin is kept purely server-side, so we'll regard it as the client's actual origin. The trajectory base is what's sent to every client. When a client is interpolating between two origins to determine what state to draw a player in, this origin is one of them. The other origin is another trajectory base, from the next snapshot.
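To make the queue idea concrete, here's a minimal sketch of a history ring buffer. It is not the code from g_unlagged.c: historyEntry_t, historyQueue_t, StoreHistory() and ResetHistory() are stand-in names, and the value of NUM_CLIENT_HISTORY is an assumption (use whatever g_local.h defines).

```c
#include <assert.h>
#include <string.h>

#define NUM_CLIENT_HISTORY 17   /* assumed size; use the value from g_local.h */

typedef float vec3_t[3];

/* One entry in the history queue: bounding box, origin, timestamp. */
typedef struct {
    vec3_t mins, maxs;
    vec3_t origin;      /* taken from s.pos.trBase, not r.currentOrigin */
    int    leveltime;
} historyEntry_t;

typedef struct {
    historyEntry_t entries[NUM_CLIENT_HISTORY];
    int            head;    /* index of the most recent entry */
} historyQueue_t;

/* Advance the head, wrapping around, and overwrite the oldest entry. */
void StoreHistory(historyQueue_t *q, const vec3_t mins, const vec3_t maxs,
                  const vec3_t trBase, int leveltime) {
    historyEntry_t *e;

    assert(q != NULL);
    q->head = (q->head + 1) % NUM_CLIENT_HISTORY;
    e = &q->entries[q->head];
    memcpy(e->mins, mins, sizeof(vec3_t));
    memcpy(e->maxs, maxs, sizeof(vec3_t));
    memcpy(e->origin, trBase, sizeof(vec3_t));
    e->leveltime = leveltime;
}

/* Fill every slot with the current state so no stale position survives. */
void ResetHistory(historyQueue_t *q, const vec3_t mins, const vec3_t maxs,
                  const vec3_t trBase, int leveltime) {
    int i;

    assert(q != NULL);
    q->head = 0;
    for (i = 0; i < NUM_CLIENT_HISTORY; i++) {
        StoreHistory(q, mins, maxs, trBase, leveltime);
    }
}
```

The point of the ring buffer is that storing an entry is constant-time: nothing is shifted, the oldest slot is simply overwritten.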

static void TimeShiftLerp( float frac, vec3_t start, vec3_t end, vec3_t result ): Interpolates between two origins. The important thing to note about this is that it uses the exact set of operations that the client does to avoid floating-point error.
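A sketch of what such a lerp looks like, assuming the client interpolates with the usual start + frac * (end - start) form per component. LerpOrigin() is a stand-in name, not the real function.

```c
#include <assert.h>

typedef float vec3_t[3];

/* result = start + frac * (end - start), one component at a time.
   The exact form matters: an algebraically equivalent expression such as
   (1 - frac) * start + frac * end can round differently in floating point. */
void LerpOrigin(float frac, const vec3_t start, const vec3_t end, vec3_t result) {
    int i;

    assert(frac >= 0.0f && frac <= 1.0f);
    for (i = 0; i < 3; i++) {
        result[i] = start[i] + frac * (end[i] - start[i]);
    }
}
```

If the server and client ever disagree on the operations, the reconciled hit box can end up a tiny fraction of a unit away from where the attacker saw the target.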

void G_TimeShiftClient( gentity_t *ent, int time, qboolean debug, gentity_t *debugger ): This is where the backward reconciliation happens. First, a loop determines which two history entries sandwich time. It then saves the parts of the client's state that will change, interpolates between the two states to set the client's new state, and re-links the client's entity. This is also very important. The server engine needs to be updated on the change, so not calling trap_LinkEntity() will cause strange problems.
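The "sandwich" search can be sketched like this. It's a simplified illustration under my own assumptions: FindSandwich() and historyEntry_t are hypothetical names, the history is a ring buffer whose newest entry is at head, and the real function does more work (saving state, setting the bounding box, re-linking).

```c
#include <assert.h>

#define NUM_CLIENT_HISTORY 17   /* assumed size */

typedef struct {
    float origin[3];
    int   leveltime;
} historyEntry_t;

/* Walk backward from the newest entry until one is at or before `time`.
   Returns 1 and fills *older, *newer and *frac when `time` falls between
   two entries. Returns 0 when `time` is at or after the newest entry
   (nothing to shift to) or older than the whole history. */
int FindSandwich(const historyEntry_t *hist, int head, int time,
                 int *older, int *newer, float *frac) {
    int j = head, k = head, steps;

    assert(head >= 0 && head < NUM_CLIENT_HISTORY);
    for (steps = 0; steps < NUM_CLIENT_HISTORY; steps++) {
        if (hist[j].leveltime <= time) {
            break;
        }
        k = j;
        j = (j + NUM_CLIENT_HISTORY - 1) % NUM_CLIENT_HISTORY;
    }
    if (steps == NUM_CLIENT_HISTORY || j == k) {
        return 0;
    }
    *older = j;
    *newer = k;
    /* hist[j].leveltime <= time < hist[k].leveltime, so the divisor is positive */
    *frac = (float)(time - hist[j].leveltime) /
            (float)(hist[k].leveltime - hist[j].leveltime);
    return 1;
}
```

The resulting frac is what gets handed to the lerp, for the origin and for both corners of the bounding box.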

Notice also that this function can dump a whole slew of debugging information to the console of a “debugger” player. This is probably the most important piece of code when it comes to making sure things are working correctly.

void G_TimeShiftAllClients( int time, gentity_t *skip ): This basically calls G_TimeShiftClient() for every connected client, except the one it's supposed to skip. It also determines whether or not the skipped client will have debugging information dumped to his console.

void G_DoTimeShiftFor( gentity_t *ent ): Finally, we get to the interface. This is what the other code almost always calls to do the backward reconciliation. It determines exactly the time that the other clients should be backward-reconciled to, and calls G_TimeShiftAllClients() to do it. If you want to exclude full backward reconciliation, this is where you do it.

void G_UnTimeShiftClient( gentity_t *ent ): Puts a client back into his true state, using the information saved in G_TimeShiftClient(). The call to trap_LinkEntity() here is also very important.

void G_UnTimeShiftAllClients( gentity_t *skip ): Calls G_UnTimeShiftClient() for every connected client except skip.

void G_UndoTimeShiftFor( gentity_t *ent ): This, like G_DoTimeShiftFor(), is the interface. It does some cursory checks and calls G_UnTimeShiftAllClients().

The rest of g_unlagged.c is for skip correction.

Now look in the code for “//unlagged - backward reconciliation #2”. These will be all the places in which players are backward reconciled or restored. Most of it is easy enough, with a call to G_DoTimeShiftFor() before a trace, and a call to G_UndoTimeShiftFor() after.

The call to G_UnTimeShiftClient() in player_die() in g_combat.c needs special attention. It's possible, for rail and shotgun attacks, for G_Damage() to be called while a player is in a backward-reconciled state. Of the functions that G_Damage() can call, player_die() depends on the player being in the correct state. At the beginning of player_die(), then, we put him back. This won't affect him being restored later – he'll just be skipped. In most mods, this is enough to keep things consistent. However, if you have made changes to how players take damage, or to other code that is executed during a time shift, you may need to make some changes. (The easiest way to deal with it is usually to restrict the scope of the backward-reconciled state.)
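Here's a small sketch of why restoring a player early is harmless. It assumes the restore function keeps a "currently shifted" marker and does nothing when the marker is clear. shiftState_t, ShiftPlayer() and UnshiftPlayer() are made-up names, and the real code saves more than just an origin.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    float origin[3];        /* where hit tests currently see the player */
    float savedOrigin[3];   /* true origin, meaningful only while shifted */
    int   shifted;          /* nonzero while backward reconciled */
} shiftState_t;

/* Move the player to a past origin, remembering the true one. */
void ShiftPlayer(shiftState_t *p, const float past[3]) {
    assert(!p->shifted);    /* shifting twice would lose the true origin */
    memcpy(p->savedOrigin, p->origin, sizeof(p->origin));
    memcpy(p->origin, past, sizeof(p->origin));
    p->shifted = 1;
}

/* Put the player back. Safe to call more than once: later calls are skipped. */
void UnshiftPlayer(shiftState_t *p) {
    if (!p->shifted) {
        return;             /* already restored, e.g. early in a death handler */
    }
    memcpy(p->origin, p->savedOrigin, sizeof(p->origin));
    p->shifted = 0;
}
```

With this shape, the usual shift / trace / unshift bracket stays correct even if something in the middle restores one player ahead of time.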

Also, note the direct calls to G_TimeShiftAllClients() and G_UnTimeShiftAllClients() in g_main.c. These exist to backward reconcile players 50ms for the duration of all missile movement. (The missile thinking has been moved to execute after the rest of the non-player thinking to keep from having to backward reconcile players more than once.) This will of course make other players easier to hit, at the expense of making projectiles harder to dodge. By itself, it will also create some small visual problems. We'll completely handle both of these issues on the client later.

The code marked by “//unlagged - backward reconciliation #3” resets a player's history at the right times. “Right times” are always when a player's origin changes dramatically inside a server frame. You may have to reset the history in other places, but luck is with us: it's easy to find all the right places. Search in the code for every instance of EF_TELEPORT_BIT being changed.

“//unlagged - backward reconciliation #4” marks code dealing with time. The first hit stores the command timestamp for the attack, which is used as the time to backward reconcile the other players to during instant-hit attacks when full lag compensation is enabled. The rest deal with estimating the actual server time, which is necessary for backward reconciling only 50ms.

The “//unlagged - backward reconciliation #5” code is a simple way to notify clients of whether or not full lag compensation is enabled.


Attack Prediction

If you've opted not to implement full lag compensation, go ahead and skip this section.

We can't do much about visual inconsistencies that show up for a lag-compensated player's targets, or for people who are spectating a lag-compensated player. However, since full lag compensation is pretty much perfect (barring prediction error and really bad network conditions), we can fix it up for the attacker by predicting weapon effects.

Our first stop is “//unlagged - attack prediction #1” where it actually takes place. The lightning gun is easy: you just draw it pointing straight at the crosshair if full lag compensation is enabled on both the server and the client. The other instant-hit weapons aren't so easy, which is why the function that does it, CG_PredictWeaponEffects(), pretty much has its own source file. The only entry point to that is a single call at the end of CG_FireWeapon().

To predict weapon effects, we calculate them as the server does and draw them immediately. There are a few niggly little details to it, though.

First, if a client is predicting weapon effects, it'll have to suppress the corresponding events as it receives them from the server. (The lightning shaft needs no such treatment, but the others do.)

Second, there is randomness to the effects. We have to somehow synchronize the client and server. Using a random seed does this nicely. The problem is making sure the client and server use the same random seed. We haven't got time to send a seed from the client to the server – we need something predictable. We'll use the attack time for it, since it'll be random enough for our purposes, and it's known by both the client and the server.
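As an illustration of the seed trick, here's a self-contained sketch. The generator is a small linear congruential one in the spirit of Q_rand() from q_math.c, and NextRandom(), SeededCRandom() and PelletOffsets() are hypothetical names. How the real code derives the seed from the attack time may differ.

```c
#include <assert.h>

/* Small linear congruential generator; unsigned so overflow wraps cleanly. */
unsigned int NextRandom(unsigned int *seed) {
    *seed = 69069u * *seed + 1u;
    return *seed;
}

/* Uniform value in [-1, 1). */
float SeededCRandom(unsigned int *seed) {
    float unit = (NextRandom(seed) & 0xffff) / (float)0x10000;   /* [0, 1) */
    return 2.0f * (unit - 0.5f);
}

/* Fill `offsets` with one (right, up) pair per pellet, seeded from the
   attack time. Client and server both know that time, so both get the
   same sequence without exchanging anything extra. */
void PelletOffsets(int attackTime, int count, float offsets[][2]) {
    unsigned int seed = (unsigned int)attackTime;
    int i;

    assert(count >= 0);
    for (i = 0; i < count; i++) {
        offsets[i][0] = SeededCRandom(&seed);
        offsets[i][1] = SeededCRandom(&seed);
    }
}
```

Scale the offsets by the weapon's spread and you have the same pellet pattern on both ends, as long as both sides draw the random numbers in the same order.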

The code marked by “//unlagged - attack prediction #2” contains the changes to random effects and event suppression.

Code marked by “//unlagged - attack prediction #3” makes the useful function SnapVectorTowards() available to both the client and server game. It's now needed by the client since the server uses it to create effects, and the client needs to duplicate some of that functionality.


Lag-compensated Grappling Hook

This one's easy enough. All we do is make the grapple's end point's first step large enough to make up for the latency in firing.

Look for “//unlagged - grapple” in the code. It should all be contained within the fire_grapple() function.
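One way to get a larger first step, and the one I'll assume for this sketch, is to backdate the hook's trajectory start time by the firing latency. linearTrajectory_t, EvaluateLinear() and BackdateHook() are stand-in names, not the real ones in fire_grapple().

```c
#include <assert.h>

typedef struct {
    float trBase[3];    /* origin at trTime */
    float trDelta[3];   /* velocity, units per second */
    int   trTime;       /* msec timestamp the trajectory starts from */
} linearTrajectory_t;

/* Position on a constant-velocity trajectory at time atTime (msec). */
void EvaluateLinear(const linearTrajectory_t *tr, int atTime, float result[3]) {
    float seconds = (atTime - tr->trTime) * 0.001f;
    int i;

    for (i = 0; i < 3; i++) {
        result[i] = tr->trBase[i] + tr->trDelta[i] * seconds;
    }
}

/* Start the hook's trajectory when the player fired, not at "now". */
void BackdateHook(linearTrajectory_t *hook, int now, int latencyMsec) {
    assert(latencyMsec >= 0);
    hook->trTime = now - latencyMsec;
}
```

With a hook speed of 800 units per second and 100ms of latency, the hook is already about 80 units out the first time it's evaluated.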

One other thing you might want to do is add predicted release to the grapple. Ultra Freeze Tag's grapple is off-hand, and I've defined BUTTON_GRAPPLE as 32 so it corresponds with “+button5”. Here's UFT's predicted release:

// if the grapple button is no longer held, stop predicting the pull
if ( !(latestCmd.buttons & BUTTON_GRAPPLE) ) {
	cg_pmove.ps->pm_flags &= ~PMF_GRAPPLE_PULL;
}

You would add something like that to CG_PredictPlayerState(). The safest place to do it is right before the prediction loop (the only loop – easy to spot).


Skip Correction

One of the nicest things about how Quake 3 is written – with regards to skip correction, anyway – is that a player's actual state is somewhat divorced from the state that other clients use to render him.

BG_PlayerStateToEntityState() is where you go to see it in action. In this function, a playerState_t is transformed into an entityState_t. The entityState_t is what is actually sent to every client for rendering. It's almost never used on the server at all for client entities.

I'll add one more piece of information before getting to the point: for backward reconciliation, we're storing part of an entityState_t. Taken together, these ideas mean this: we can smooth out a player's movement without tampering with his actual state, and not have to worry about it breaking hit tests. Remember, for every hit test – including projectile hit tests – players are in a backward-reconciled state. This state uses entityState_t information.

So here's the plan: we'll keep track of the last frame number in which the server received a command from a client. This is pretty easy – just set some value in the client struct to level.framenum inside of ClientThink_real(). If, at the end of a server frame, the client has missed one or two frames, we'll extrapolate what his origin would have been if he had continued at the same velocity, and send that to the other clients instead of his actual origin.
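Here's the plan reduced to a straight-line sketch. SmoothedOrigin() and its parameters are hypothetical, and the real extrapolation goes through stepping and sliding code instead of a plain velocity multiply, so treat this as the idea only.

```c
#include <assert.h>

/* missedFrames is (current server frame number) minus (frame number of the
   last command received from this client).
   Returns 1 and writes the extrapolated origin when the gap is short enough
   to smooth over. Returns 0 when there is nothing to smooth (no frames
   missed) or the gap is longer than we are willing to extrapolate. */
int SmoothedOrigin(const float origin[3], const float velocity[3],
                   int missedFrames, int maxMissedFrames, int frameMsec,
                   float result[3]) {
    float seconds;
    int i;

    assert(frameMsec > 0);
    if (missedFrames < 1 || missedFrames > maxMissedFrames) {
        return 0;
    }
    seconds = missedFrames * frameMsec * 0.001f;
    for (i = 0; i < 3; i++) {
        result[i] = origin[i] + velocity[i] * seconds;
    }
    return 1;
}
```

Note that only the origin sent to other clients would be replaced. The player's actual state is left alone.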

Take a look at the code marked by “//unlagged - smooth clients #1” – I believe the first hit demonstrates that part. Also marked is the code that deals with setting the EF_CONNECTION flag and actually smoothing out movement for players who miss a frame.

A couple of things deserve special attention. First, G_PredictPlayerMove() is defined in g_unlagged.c. It's a subset (though a large subset) of the stepping and sliding code in bg_slidemove.c. You pass it a player entity and a duration, and it will extrapolate the player's new position and store it in ent->s.pos.trBase. (Notice that G_StoreHistory() is called only after that happens.) If you have changed any of the stepping and sliding code, you will have to make changes to this as well.

Second, the EF_CONNECTION flag is set when the server is not willing to extrapolate a laggy player's position. This will put a phone jack over the player's head. In my opinion, this is much more useful knowledge to any would-be attacker than was shown by the old method, when it actually worked. (It didn't in the 1.29h codebase.) Basically, it's a message from the server that says, “this guy is still skipping, even though I'm trying to smooth him out.” It will also indicate connection problems, so you can tell the difference between a player who is standing still out of choice and a player who is standing still because he's “lagged out.”

Now we'll abolish cg_smoothclients altogether. It never worked right anyway. (It was supposed to cause a client to extrapolate up to 50ms when it didn't have enough information to interpolate, but it was using the wrong base time. It would usually extrapolate 50ms no matter what.) We'll also reassign g_smoothclients to enabling and disabling the new skip correction feature.

The code marked by “//unlagged - smooth clients #2” does all this, and is pretty self-explanatory.


Projectile Nudge / Early Transitioning of Missile Entities

So now, players are backward reconciled by 50ms for missile hit tests. That makes projectiles easier to use, but also harder to dodge. What do we do?

First, on the client, we'll advance every projectile 50ms. The code that does this is marked by “//unlagged - projectile nudge”. (It's simple enough to not need to be split into parts.) The main logic is in CG_CalcEntityLerpPositions() in cg_ents.c, and is very straightforward. The rest, scattered about in g_missile.c, is all about setting ent->s.otherEntityNum to projectiles' owner numbers. This is necessary for the cg_projectileNudge option.

cg_projectileNudge allows you to advance other players' projectiles more than 50ms. If you set it to your ping, you will see projectiles about where they will be by the time your commands to move reach the server. This should make them easier to dodge. (Personally, I love it.) The trade-off is that projectiles will appear to have a very long first step (though I think this is fine since it reflects your window of useful action), and they will stick in the floor for a bit before exploding.
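The amount of time to advance a projectile can be sketched as below. I'm assuming here that the nudge is added on top of the base server frame time and applies only to other players' projectiles. ProjectileTimeshift() and its parameter names are made up for the example.

```c
#include <assert.h>

/* How many msec to advance a projectile when drawing it.
   ownerNum is the projectile owner's client number (what s.otherEntityNum
   carries), localClientNum is our own client number. */
int ProjectileTimeshift(int ownerNum, int localClientNum,
                        int projectileNudgeMsec, int serverFrameMsec) {
    assert(serverFrameMsec > 0);
    if (ownerNum != localClientNum && projectileNudgeMsec > 0) {
        /* someone else's projectile: base advance plus the player's nudge */
        return serverFrameMsec + projectileNudgeMsec;
    }
    /* our own projectile, or no nudge set: just the base advance */
    return serverFrameMsec;
}
```

The result would be added to the time at which the projectile's trajectory is evaluated. This is also why the owner number has to be available on the client.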

Now, advancing a rocket only 50ms ahead will create some small visual problems. We can take care of these by transitioning missile entities and their explosions earlier.

cg.snap contains the “current” server snapshot. cg.nextSnap, if not NULL, contains the “next” one. Interestingly enough, entities are never processed until the client game makes cg.nextSnap into cg.snap (“transitions”). That means, with regards to entities, your client game has information 50ms before it actually uses it!

There are certain expectations of order in the code as it is, but I've found that it's perfectly safe to transition ET_MISSILE and ET_GENERAL entities (missiles and their explosions) early. The code marked by “//unlagged - early transitioning” does that.

This is the effect: players see missiles and explosions 50ms earlier than they normally would. Couple that with advancing projectiles by 50ms, and you have missiles and explosions that look perfectly normal again – but you feel like you're pinging 50ms less when you use them. They are also no more difficult to dodge.

Everybody wins. With these changes, 50ms of lag compensation for projectiles involves no trade-offs at all.


Extrapolating With cl_timenudge

It's a small change, but it's nifty.

Look for “//unlagged - timenudge extrapolation” in the code. It's all contained within CG_CalcEntityLerpPositions() in cg_ents.c.

This new code will only be run if the preceding if statements evaluate to false – in other words, if the entity's origin was not interpolated.

If you set cl_timenudge to a large negative value (like -30), cg.nextSnap will almost always be NULL. That means other players' origins will almost never be interpolated. The code drops through to do extrapolation. However, since a player's trajectory type is TR_INTERPOLATE, the trajectory evaluation function (BG_EvaluateTrajectory()) will simply return the input vector.

The input vector will only change when there's a new snapshot. That's why other players look “jerky” when you use a large negative cl_timenudge.

The solution is simple: change players' trajectory types if they're not interpolated, so they'll be properly extrapolated. That's all the code does.
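A sketch of the difference, using a cut-down trajectory evaluator of my own. The TRAJ_ names, trajectory_t and both functions are stand-ins that mirror the idea of BG_EvaluateTrajectory(). The choice of a "linear, but stop after one server frame" type is my assumption about what a sensible replacement looks like.

```c
#include <assert.h>

typedef enum { TRAJ_INTERPOLATE, TRAJ_LINEAR_STOP } trajType_t;

typedef struct {
    trajType_t type;
    int   trTime;       /* msec */
    int   trDuration;   /* msec; used by TRAJ_LINEAR_STOP only */
    float trBase[3];
    float trDelta[3];   /* units per second */
} trajectory_t;

void EvaluateTrajectory(const trajectory_t *tr, int atTime, float result[3]) {
    float seconds = 0.0f;
    int i;

    if (tr->type == TRAJ_LINEAR_STOP) {
        if (atTime > tr->trTime + tr->trDuration) {
            atTime = tr->trTime + tr->trDuration;   /* never run past the cap */
        }
        seconds = (atTime - tr->trTime) * 0.001f;
        if (seconds < 0.0f) {
            seconds = 0.0f;
        }
    }
    /* TRAJ_INTERPOLATE leaves seconds at 0, so the base comes back unchanged. */
    for (i = 0; i < 3; i++) {
        result[i] = tr->trBase[i] + tr->trDelta[i] * seconds;
    }
}

/* Switch a non-interpolated player to capped linear extrapolation. */
void MakeExtrapolatable(trajectory_t *tr, int snapServerTime, int serverFrameMsec) {
    assert(serverFrameMsec > 0);
    tr->type = TRAJ_LINEAR_STOP;
    tr->trTime = snapServerTime;
    tr->trDuration = serverFrameMsec;
}
```

Before the switch, every evaluation between snapshots returns the same point. After it, the player slides along his velocity for up to one server frame.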

The effect is nice: setting cl_timenudge to a negative value will no longer make other players look “jerky.” The client game will basically predict up to 50ms of other players' movement for you.

One caveat: this will not work with the old smoothclients behavior. If you're not doing Unlagged's skip correction and you want to make it work, you'll need to fix smoothclients first. It's simple: pass level.time into BG_PlayerStateToEntityStateExtrapolate() instead of ent->client->ps.commandTime.


Player Prediction Optimization

I suppose it would be correct to label this code as “experimental” since it hasn't had nearly as much time in the wild as the rest of it. Even so, I'm very sure that it's working just fine, having tested it under various conditions myself and with other players. A thorough analysis also shows that there is nothing that depends on the changed code that is broken. Er...probably. So if you find bugs, please let me know.

Recall that the server receives a very small set of data to work with in processing player input, which represents more or less a few key presses and view angles. Also, remember that where you see yourself is predicted, meaning basically that, barring prediction error, you see yourself where you will be on the server by the time your most recent command is processed by the server.

This all means that the client and the server need to process the input in exactly the same way in order to stay synchronized. They accomplish this by each making a call to Pmove() for a command. (This works because Pmove() is defined in bg_pmove.c, which is compiled into both the server and client game.) The server only makes one call to Pmove() per command, but the client can make up to 63 calls to Pmove() per command, depending on latency.

Alright, that's nuts, I hear you saying. There's actually a really good reason for it. (And there's no good reason that we can't optimize it, either.) The simplest way of doing player prediction requires it. Open up cg_predict.c, and take a look at CG_PredictPlayerState(). I'll describe the function, starting at the place which concerns us: where cg.predictedPlayerState is set to a player state from a valid snapshot. (Look for the comment “get the most recent information we have” and you'll be starting in the right place.)

  1. Assign cg.predictedPlayerState to a player state from a valid snapshot. This is where the prediction will start. Either cg.snap->ps or cg.nextSnap->ps represents our own player's true state on the server at the time the corresponding snapshot was sent. It makes sense to start predicting from here.
  2. Loop: get an old command, from 63 commands ago on up to the current command. (CMD_BACKUP is defined as 64.)
  3. If the command happened before the snapshot's player state was produced, or it happened on the previous map, go back to step #2 to get the next command.
  4. If we've finally caught up to our last predicted player state, handle prediction error.
  5. Call Pmove() to advance the player.
  6. Check for touching predicted triggers (like bouncepads, teleporters, and items if we're predicting item pickups).
  7. Go back to step #2 to process the next command.

After this loop is finished (after it's processed the latest available command), your player's predicted state will be valid, and will represent the state you should be in by the time these commands are processed by the server.

Here's the rub: the more commands you have to back up (in other words, the higher your ping is), the more Pmove() calls have to be made. Pmove() calls are expensive, particularly the stepping and sliding code. (Sliding can take an obscene amount of time, and stepping can double it. It's mostly the trace calls that slow it down.)

One optimization we can make to this is to not repredict everything when we haven't got new information. It's really only the first frame after a new snapshot is transitioned that we know anything new about our own state, so it's only then that we should have to repredict everything. On other frames, we should be able to get away with running only one Pmove(), since there is no new information from the server that could change our predicted state.

That works, and that's what the code that arQon sent me did. The only problem with it is that it tends to mess up player state events with regards to predicted item pickups. He doesn't really have to worry about that (most CPMA players don't use that feature), but I do. It may even rarely mess up player state events in general, since it may produce results that are different than they would be with no optimization. (I haven't researched that very much, but I doubt that's really the case.)

So we'll do the optimization suggested in the function's comments, because it's safer. Our particular implementation will leave the predicted player state in almost exactly the same state as it would be if we weren't optimizing.

We'll make a queue to hold all the player states after a Pmove(). When we process a command, we'll either call Pmove(), or play back a previous call to Pmove() by simply copying in an old predicted player state. This will preserve the player state events.

Another thing we can do is this: when we get a new snapshot from the server, we can check the player state in it against its corresponding predicted player state from our queue. If it matches, there is no need to fully repredict our player's state: we can start from the predicted player state instead. In that case, we still only need to run Pmove() once.

If we do that, a full predict will only happen if there is some kind of prediction error. In normal play, it seems that happens, on average, about twice every second. That's really good. It's usually health or armor counting down, getting shot, and picking up items that does it – almost never from just running around.

Here's the new order of things:

  1. Assign cg.predictedPlayerState to a player state from a valid snapshot. This is where the prediction will start. Either cg.snap->ps or cg.nextSnap->ps represents our own player's true state on the server at the time the corresponding snapshot was sent.
  2. Check to see whether the snapshot we're using is different from the last one (whether it's new).
    1. If it's not, decide to start calling Pmove() after the last predicted command. (This will cause Pmove() to be called only once.)
    2. If it is, find the predicted player state that corresponds with the snapshot's player state.
      1. If it doesn't match, decide to start calling Pmove() right away. (This will cause a full predict.)
      2. If it does match, adjust the queue and decide to start calling Pmove() after the last predicted command. (This will cause Pmove() to be called only once.)
  3. Loop: get an old command, from 63 commands ago on up to the current command.
  4. If it happened before the snapshot's player state was produced, or it happened on the previous map, go back to step #3 to get the next command.
  5. If we've finally caught up to our last predicted player state, handle prediction error.
  6. Check the command number against the number at which we decided we'd start calling Pmove().
    1. If we've passed it up, call Pmove(). Save the player state in the queue for later.
    2. If we haven't, copy the correct saved player state from the queue into the predicted player state.
  7. Check for touching predicted triggers (like bouncepads, teleporters, and items if we're predicting item pickups).
  8. Go back to step #3.
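
The decision in step 2 and the branch in step 6 can be sketched in isolation like this. Everything here is a toy model: ChoosePmoveStart(), RunPredictionLoop() and predictStats_t are hypothetical names, and the loop only counts what it would do instead of calling Pmove() or copying player states.

```c
#include <assert.h>

typedef struct {
    int pmoveCalls;   /* commands that needed a real Pmove() */
    int playbacks;    /* commands replayed from the saved-state queue */
} predictStats_t;

/* Step 2: pick the command number after which Pmove() must really run. */
int ChoosePmoveStart(int snapshotIsNew, int savedStateMatches,
                     int firstCmd, int lastPredictedCmd) {
    if (!snapshotIsNew) {
        return lastPredictedCmd;    /* 2.1: nothing new from the server */
    }
    if (!savedStateMatches) {
        return firstCmd - 1;        /* 2.2.1: prediction error, full predict */
    }
    return lastPredictedCmd;        /* 2.2.2: new snapshot agrees with the queue */
}

/* Steps 3 to 8: walk the commands and count which path each one takes. */
predictStats_t RunPredictionLoop(int firstCmd, int currentCmd, int pmoveStart) {
    predictStats_t stats = {0, 0};
    int cmd;

    assert(firstCmd <= currentCmd);
    for (cmd = firstCmd; cmd <= currentCmd; cmd++) {
        if (cmd > pmoveStart) {
            stats.pmoveCalls++;     /* 6.1: run Pmove(), save the result */
        } else {
            stats.playbacks++;      /* 6.2: copy the saved player state */
        }
    }
    return stats;
}
```

With eleven commands outstanding, the usual case costs one Pmove() and ten cheap copies. Only a mismatch costs eleven Pmove() calls.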

Look in the code for “//unlagged - optimized prediction” and you'll find all the changes.

A few things deserve special mention. First is some commented-out code marked with “// debug code”. Uncommenting these lines will dump a line per client frame to the console, indicating how many Pmove() calls were made and how many commands were played back from the player state queue on that frame. This is very useful for making sure everything is working correctly.

A new function, IsUnacceptableError(), takes two player states as arguments and returns a value based on which parts are different. If no parts are different, it returns zero. It is important to note that this function will tolerate a 0.1 unit difference in the player origin, player velocity, and grapple point. 0.1 is more than enough to account for floating-point error, but not enough to cause any problems with player prediction. (For reference, a player is 30 units wide. That means IsUnacceptableError() tolerates error up to 1/300 of a player's width.)
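Here's the flavor of that comparison on a cut-down player state. miniPlayerState_t, TooFarApart() and StateErrorCode() are stand-ins of mine, the return codes are arbitrary, and the real function checks many more fields. Comparing squared lengths is my choice, to avoid a square root.

```c
#include <assert.h>

#define MAX_POSITION_ERROR 0.1f   /* tolerance from the text, in game units */

typedef struct {
    float origin[3];
    float velocity[3];
    int   health;
} miniPlayerState_t;

/* Nonzero when the two vectors are more than `tolerance` units apart. */
static int TooFarApart(const float a[3], const float b[3], float tolerance) {
    float dx = a[0] - b[0];
    float dy = a[1] - b[1];
    float dz = a[2] - b[2];

    return dx * dx + dy * dy + dz * dz > tolerance * tolerance;
}

/* Returns 0 when the states agree, otherwise a code for the first
   field found to differ. */
int StateErrorCode(const miniPlayerState_t *server,
                   const miniPlayerState_t *predicted) {
    assert(server && predicted);
    if (TooFarApart(server->origin, predicted->origin, MAX_POSITION_ERROR)) {
        return 1;
    }
    if (TooFarApart(server->velocity, predicted->velocity, MAX_POSITION_ERROR)) {
        return 2;
    }
    if (server->health != predicted->health) {
        return 3;   /* integer fields get no tolerance */
    }
    return 0;
}
```

Returning a distinct code per field is handy for debugging, because the console line can say which part of the state caused the full repredict.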

cg_showmiss, when nonzero, now also dumps a line to the console when IsUnacceptableError() returns nonzero (meaning the player states are different). It also writes “saved state miss” to the console when the predicted player state's command time differs from its supposedly corresponding player state queue entry. This should only happen for a little while after the value of pmove_fixed changes. If it happens at any other time, something is wrong.


True Ping / Lag Simulation

“True ping” is really quite simple. Every command we receive has a timestamp. If we subtract that from our estimated server time, we get the ping. That could be a little too sporadic to be useful, however, so we'll save up the pings in a queue and return their average.
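A minimal sketch of the averaging queue follows. pingQueue_t, AddPingSample() and AveragePing() are illustrative names, the queue length is an assumption, and clamping negative samples to zero is my own addition.

```c
#include <assert.h>

#define NUM_PING_SAMPLES 64   /* assumed queue length */

typedef struct {
    int samples[NUM_PING_SAMPLES];
    int head;     /* next slot to write */
    int count;    /* number of valid samples, at most NUM_PING_SAMPLES */
} pingQueue_t;

/* Record one sample: estimated server time minus the command's timestamp. */
void AddPingSample(pingQueue_t *q, int estimatedServerTime, int cmdTime) {
    int ping = estimatedServerTime - cmdTime;

    if (ping < 0) {
        ping = 0;   /* a bad time estimate should not produce a negative ping */
    }
    q->samples[q->head] = ping;
    q->head = (q->head + 1) % NUM_PING_SAMPLES;
    if (q->count < NUM_PING_SAMPLES) {
        q->count++;
    }
}

/* Average of the stored samples, or 0 when there are none yet. */
int AveragePing(const pingQueue_t *q) {
    int i, sum = 0;

    assert(q->count >= 0 && q->count <= NUM_PING_SAMPLES);
    if (q->count == 0) {
        return 0;
    }
    for (i = 0; i < q->count; i++) {
        sum += q->samples[i];
    }
    return sum / q->count;
}
```

The queue must start out zeroed. Once it fills up, each new sample overwrites the oldest one, so the average follows roughly the last 64 commands.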

That was easy. Lag simulation is a little bit more difficult.

For almost all intents and purposes, it doesn't matter where your latency comes from. The effects will end up exactly the same whether the latency happens on the way from the server to your client, or from your client to the server. Nevertheless, we'll implement both types.

Server-to-client latency isn't too difficult. The game engine is nice to us, and keeps a backlog of snapshots. We'll simply grab an earlier snapshot than the one we'd normally get. Look for the code marked by “//unlagged - lag simulation #1” in CG_ReadNextSnapshot(). That's the main logic.

We'll also have to fix up the client clock to reflect the latency, since the clock time is sent directly to the VM by the game engine, which has no clue that we're looking into the backlog. Then, there are a couple of calls to CG_Error() that happen when the client believes it's insane. During transitions between different values of cg_latentSnaps, it actually will be, so we have to turn them into calls to CG_Printf() instead.

On the server, we'll adjust the client's ping according to the value of cg_latentSnaps, and adjust the attack time so the backward reconciliation will work properly. (We have to do that because when the client engine sends out commands, it puts the regular timestamps on them, again, having no clue that we're looking into the backlog for snapshots.)

That's it for server-to-client latency. All of this is marked by “//unlagged - lag simulation #1”.

For client-to-server latency, we'll keep a queue of up to 64 past commands. In ClientThink_real(), we'll grab an earlier one according to the value of cg_latentCmds. The code marked by “//unlagged - lag simulation #2” does this.
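The command queue can be sketched as below. miniCmd_t, cmdQueue_t, PushCmd() and DelayedCmd() are hypothetical names and the command struct is cut down to two fields. Clamping the delay to what's actually in the queue is my own safety choice.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LATENT_CMDS 64

typedef struct {
    int serverTime;
    int buttons;
} miniCmd_t;

typedef struct {
    miniCmd_t cmds[MAX_LATENT_CMDS];
    int head;     /* next slot to write */
    int count;    /* number of valid commands, at most MAX_LATENT_CMDS */
} cmdQueue_t;

/* Save the newest command, overwriting the oldest once the queue is full. */
void PushCmd(cmdQueue_t *q, const miniCmd_t *cmd) {
    q->cmds[q->head] = *cmd;
    q->head = (q->head + 1) % MAX_LATENT_CMDS;
    if (q->count < MAX_LATENT_CMDS) {
        q->count++;
    }
}

/* Return the command from `latentCmds` commands ago (0 means the newest).
   The delay is clamped to what the queue holds. NULL when the queue is empty. */
const miniCmd_t *DelayedCmd(const cmdQueue_t *q, int latentCmds) {
    int index;

    assert(latentCmds >= 0);
    if (q->count == 0) {
        return NULL;
    }
    if (latentCmds > q->count - 1) {
        latentCmds = q->count - 1;
    }
    index = (q->head - 1 - latentCmds + MAX_LATENT_CMDS) % MAX_LATENT_CMDS;
    return &q->cmds[index];
}
```

The think function would push the command it was given, then act on whatever DelayedCmd() returns instead.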

The code marked by “//unlagged - lag simulation #3” does outgoing packet loss. This one's pretty simple: it randomly skips the think function according to the client's cg_plOut setting.
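
The drop decision itself is tiny. Here's a sketch that treats the setting as a percentage from 0 to 100, which is my assumption. ShouldDropCommand() is a made-up name, and the C library's rand() stands in for whatever random source the game code uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 when this command should be thrown away, which happens
   roughly plOutPercent times out of 100. */
int ShouldDropCommand(int plOutPercent) {
    double roll;

    assert(plOutPercent >= 0 && plOutPercent <= 100);
    roll = rand() / ((double)RAND_MAX + 1.0);   /* uniform in [0, 1) */
    return roll < plOutPercent / 100.0;
}
```

Because the roll never reaches 1.0, a setting of 100 drops every command and a setting of 0 drops none.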
