Tuesday, October 27, 2009

Anti-Piracy



[ Team LiB ]









Anti-Piracy


I have never been overly concerned with piracy of my games; it happens, and it does hurt financially, but my hunch is that the cost of preventing piracy is higher than the gains. However, I was irritated by the whining of games developers that the pirates were unstoppable, that no anti-piracy technology could withstand their assaults. "Everything is crackable" was the gleeful claim of the pirates, and the developers seemed to have glumly accepted their boasts. I thought it was time to put paid to that myth.


The Code Talkers


Time for a story. Armies face a similar problem: How do various military units communicate without the enemy intercepting their communications? This problem was so important that it quickly generated its own field of study: cryptography. Mathematicians devoted vast intellectual resources to this problem on which hung the lives of thousands. By World War II, cryptography was a major field of study, and cryptographic systems were perched on the sharp edge between security and failure. The Allies were able to crack the Japanese code in short order. The German codes were tougher, but they too succumbed to the assault. But how were the Allies to keep their own communications secure? It seemed at the time that no code system could be devised that was both practical and secure.


The US Marines hit upon a brilliant solution to the problem: the Code Talkers. They recruited Navajo Indians to act as communicators. Commander A gives his orders to Navajo Code Talker A, who speaks on the radio in a code based on Navajo with Navajo Code Talker B, who translates the orders for Commander B. The scheme was a smashing success. Marine communications remained secure throughout the war; Navy and Air Corps codes were successfully cracked by the Japanese.


The scheme even survived the capture of a native Navajo speaker in Bataan. Despite strong inducements to decode their communications, the POW was unable to figure out what the Code Talkers were saying, because he did not know the simple secondary code that the Code Talkers were using.


The brilliance of this scheme lay in its inversion of the normal process of code creation. Most code designers concentrate on the code itself; the Code Talkers scheme was based on an assessment of the people attempting to crack the code. Rather than "What can I build that's strong and uncrackable?" the starting question was "What will be impossible for the Japanese to crack?" The result was the most successful code in World War II.


Technology Versus Psychology


The lesson here is that the designers of anti-piracy schemes were taking the wrong tack. They saw the problem as a technological one, and so attempted to come up with technological solutions. Their opponents, the pirates, also saw the problem as a technological one, and greatly enjoyed the solution to the problem. In effect, the two sides were playing a silly game: The developers were trying to out-techie-macho the pirates, who in turn saw the anti-piracy schemes as a test of their technological manhood and therefore plunged into the cracking problem with relish.


The correct tack was to approach the problem as a psychological one rather than a technological one. The goal is not to challenge the pirate to a contest of skill; the goal is to discourage the pirate from completing the task.


The designer's biggest problem in creating copy protection is that everything he does is out in the open where the pirate can see it. The designer cannot guarantee the security of anything he does, because the cracker can observe any computation while it is being done. In short, our problem is to process information in full view of the cracker, yet not allow him to see it.


The trick here is misdirection. We cannot lock the information up; it's out in the open where he can see it. Our task is to camouflage the information, to make it look like something else. That way, the cracker can look right at it and never notice it. We want to give him so much stuff to look at, so many key look-alikes that, due to fatigue and frustration, he won't notice it when the real key comes into his field of view.


Remember that this is not an absolute solution. We cannot lock up the key behind a wall of steel. It will be out in the open. But we can create so many false keys, so many deceptive images of keys, that the cracker will probably not find the real key. We want to fatigue him and to demoralize him. We want to give him apparent successes and then yank those away. That is our fundamental strategy.


Basic Structure


The basic approach of any copy protection system is simple and requires three steps:






  1. Present the user with a challenge, a question that can be answered only by using information from an uncopyable source, such as a code wheel or a word from the manual.

  2. Compare the user's response with the correct response.

  3. If the response is incorrect, disable the game.


There are many ways that such a system can be defeated. The one way that I will not address is for the pirate to copy the uncopyable external source of information. Some of the most common strategies available to the cracker are:



  1. Find the internal file of challenge/password combinations and publish it.


  2. Skip over the challenge so that it never happens.


  3. Patch the challenge so that it always thinks that the correct password has been entered.


  4. Patch the challenge so that its attempt to disable the game fails.



Hiding the Password File


Our first task is to retrieve the challenge and the password, presumably from a file on the hard disk. This is the most easily cracked process, and we must take special precautions.


First, we must disguise the contents of the file. The cracker could play the game once, find a single challenge/password, and then perform a simple file search for that text. This would reveal the critical file for his publication. The first step, then, is to encode the text of the file. We cannot use a standard encoding scheme; the cracker will surely try all the standard methods. Instead we must make up a simple but unique encoding system. A great variety of such systems are possible: moving-key arrangements, averaging systems, differential schemes, or whatever. The only requirement for such a system is that it render the data in the file immune to simple textual examination and that it be different for each product.


Now that our text is encoded, we must hide it. We can't just have a file called "Password File." And if we have a file just laying around with no obvious purpose, it's sure to attract attention. We want to hide that file somewhere inside another file, preferably in a way that allows the cracker to examine it without realizing that it's not quite right. I once hid my password file inside the sound file of an explosion. The cracker can look at the file, play it with a sound editor, and never realize that funny scratching sound in the middle of the explosion is really my password file. We could hide the password file by interleaving it through some other file; if every hundredth byte of a big sound effect is part of the password file, the user will never hear the difference. We could use a watermark system for an image file, or even a special texture. Lord knows that games these days have so many megabytes of files that a cracker could never find a few kilobytes of data tucked inside one of those files.


Reading the Password File


Now we have to get the password file into the computer so that we can select a password. This shouldn't be difficult; there will be megabytes of data in RAM. The important trick, though, is to load the data during game initialization, not immediately before the challenge�that's a dead giveaway.


Now we select our challenge/password pair from the full set of challenge/password pairs. But we don't decode it! If we decode it, it'll be sitting around in RAM just waiting to be found by a simple text string search. Instead, we leave it encoded; later, when we collect the user's input, we encode the input and compare the encoded strings. That's slightly more secure than decoding the password and comparing it directly with the user input.


How do we pass the encoded challenge/password data to the challenge routine? If you're a good little programmer, you probably want to declare it as data of the "Anti-Piracy" object. This is all wrong! You don't want to be writing clean, easily understood code: You want this code to be a bloody mess! Passing the data as a parameter is just as stupid, and making it a global variable is even worse. That data can't be sitting in any obvious location. It has to sneak into your routine through a back door.


At this point, consider a reverie to the impenetrable obscurities of the stack. Any idiot can scan through flat RAM searching for key data; digging up pointers to the data is only a tad more obscure. But stacks are without doubt the most godawful pains in the butt when debugging time comes. Nowadays, stack structures are kept behind many layers of operating system code to insulate the poor little wimp-programmer from their ferocious tendency to blow sky-high. That's why we cracker-bashers love them so much. Of course, we must remember that we are juggling with the programming equivalent of nitro-glycerin; we had better not drop any balls here!


My favorite solution is to store the data on the local variable stack frame. Remember what happens when a high-level language such as C or Pascal calls a routine? First it sets up a stack frame on the stack to reserve RAM space for the local variables. But these variables are not initialized; they initially contain whatever garbage was already present in the RAM. The trick is to make sure that the "garbage" just happens to contain the data we want to pass.


The stack accounting for this process can be tricky. The easiest way to accomplish it is to have one outer-level routine that calls both routines (the one that puts the data onto the stack and the other that uses it). Then you need merely ensure that the local variable declarations of the two routines are byte-harmonized. My approach was even simpler than that: I combined the local variable declarations for the two routines into one master local variable declaration which I then used for both routines. This has the bonus of making the cracker's stack-snooping that much more tedious.


A code example might help. Here's how the scheme might be implemented in C++. First, we have to place the data onto the stack. For purposes of simplicity, let's declare that the critical data consists of a 32-character string. Our first routine places that data onto the stack:



Void PutTheDataOntoTheStack()
{
Short j, k;
Long a, b;
Char SecretData[32];
// Let's assume that this particular C compiler
// reserves data on the stack in the order in which it
// is declared. Thus, the top two bytes are for the
// short variable j; the next two bytes for k, and so
// forth. Our character string will thus take the 13th
// through 44th bytes on the stack.
// First we do some officious-looking stuff to distract
// the cracker
{ blah, blah, blah }
// Now we get to the real task: this loop pulls data
// out of its secret hiding place and places it onto
// the local stack frame.
For (j = 0; (j < 32); ++j)
{
SecretData[j] = MysteriousLocation* + j;
}
// now we do some more officious-looking stuff to
// distract the cracker.
{ blah, blah, blah }

Void GetTheDataOffTheStack()
{
Short n, m;
Long c, d;
Char NothingImportant[32];
{ blah, blah, blah}
}

So now we have our real challenge/password combination sitting in the garbage of our uninitialized variables, all ready to be used. But we must cover our tracks, lest the alert cracker ask where that data came from. We must now go through the motions of loading in a fake set of challenge/passwords. So we ostentatiously load a file called "Password File." This file must, of course, contain plausible challenge/password combinations. The trick is to make the first dozen password combinations legitimate copies of the real password file. That way, the careful cracker who double-checks the file will of course verify that its first entries are indeed genuine ones.


Now at this point we take advantage of one of the most common programming errors in the business. How many times have you set up some messy loop and screwed up the terminating conditions by just one step? In other words, does your loop go all the way to the very last value, or does it stop short by one, or does it do one loop too many? We all make that mistake�so let's use it as a means of messing with that punk cracker's mind. Let's deliberately screw up the terminating condition such that the very last entry in the list of challenge/passwords is not read in. That's where we arrange the real password to sit. Thus, we start with an uninitialized array of challenge/passwords, except that the very last entry is the honest-to-God challenge password left on the stack previously, then we load in all the false challenge/passwords, but due to a "bug" in our code, we fail to overwrite the honest-to-God one.


Now we have to select the genuine challenge/password. We have to be careful here; since we are inside the challenge routine, we know that the cracker is watching closely. Our problem is very similar to that of a card shark who wants to pick a particular card out of the deck without the audience understanding how he did it. I call the algorithm that I used for this "The Hand Is Quicker Than the Eye." We want to make sure that we always pick the last entry from the challenge/password table�without the watching cracker catching on.


At this point I take advantage of another bit of cracker psychology. Crackers are not well-rounded people. They're twisted personalities, obsessed with the computer and technical trivia. This means that while they know lots of petty crap about computers, they know very little else. In particular, crackers tend to regard arithmetic the same way that vampires view crucifixes. So a nice, complicated arithmetic calculation is sure to send them skipping to safer areas of code. So let's just wrap our critical work in some protective layers of arithmetic.


The algorithm I devised to solve my problem looks very much like a card-shuffling routine. It uses a complicated procedure to repeatedly exchange pairs of table entries, using an arithmetic formula to determine which pairs are shuffled. The formula is recursive, so that the cracker who wishes to penetrate it must follow it through all 300+ steps�and if he makes one mistake along the way, he's lost it. Suffice it to say that the formula is rigged to ensure that after it has gone through a certain number of steps (more than 300), our honest-to-God password is sitting at the top of the table. We now pick the card from the top of the pile. Lo and behold, it's the honest-to-God entry! The hand is quicker than the eye, especially at 600MHz.


So now we go through the normal motions with the player. We perform an ostentatious comparison of the player's entry with our table entry, and then set a global flag that will later be used to disable the game if it is FALSE. Of course, you didn't believe for one moment that this flag really does anything, did you? You didn't think that I would be that obvious, did you? Silly boy!


No, we do the same trick going out that we did coming in. We put the player's response to the challenge into a local variable where�guess what?�it remains on the stack. A few million cycles later, we come back to the stack location from a safe routine buried deep inside boring housekeeping stuff. That routine retrieves the player's response as well as the honest-to-God password, encodes the player's response, and compares the encoded results. If they match, all is well. If not…


Bombs Away!


Our next task is to disable the game. Now, we must be circumspect here. Initiating an immediate crash or lockup or putting up a dialog box that terminates the game are all incorrect solutions, as they are dead giveaways to the cracker. We should have taken care of the confused or ignorant but otherwise innocent player in the original password dialog box. We now need a scheme to deal with the cracker who has somehow cracked the outer layer of protection.


There are two requirements for our disabling system: default failure and delayed response. Default failure means that the program begins life with a fatal but hard-to-find bug in it. The copy protection system repairs the bug only if the player meets the password challenge. That way, the simple-minded strategy of patching out the password challenge will fail. Delayed response means that the disabling does not take place immediately. Instead, it takes place much later, so that when the crash happens, the cracker will have a whole lot of backtracking to do to get to its source.


In Patton Strikes Back, I took one of the military units and fudged its initializing data to a false value that, while initially harmless enough, was certain to cause a program crash just before the end of the game. The copy protection code sets the value back to the correct value. This has the advantage that the code looks for all the world like a typical unit-manipulation routine.


Guard Routines


But this is only the second layer of protection. The next layer uses guard routines. These are routines that checksum other routines, thereby ensuring that patches are detected at runtime. Guard routines should be executed rarely, because every time you call one, there's a chance that the cracker will be looking in on you at that moment. Besides, you only need to catch him once. Since guard routines are short and fast, it's a good idea to sprinkle them liberally throughout your code. Have them guard different things: different portions of the challenge routine or maybe the two routines that precede and follow the challenge routine.


Guard routines have a flaw: Since they use pointers into the code itself, they can be found with a carefully contrived automated code search, and when found, they will point to the routine they are guarding. Thus, guard routines can be "turned" and made to spill their guts about the other routines in the program. The best defense against this is to use many slight variations in your guard routine code, most especially in the line that directly accesses the client code. If just one guard routine uses a code sequence that slips past the automated code search, you've created a gigantic headache for the cracker, the more so because he will assume that his automated code search revealed all guard routines.


Harken to the wisdom of the lowly virus here. It does not seek to conquer the body's immune system. It just keeps making minor changes in its protein sheath, hoping to stay one jump ahead of the immune system's recognition subsystem. You should do the same.


Above all, have guard routines that guard other guard routines. The idea here is to create such an interlocking web of guard routines that the cracker will go nuts just trying to seek them out and neutralize them.


Do not have the guard routines generate the same bomb. Instead, have them do different things. My most fiendish trick was a guard routine that restores its client code to its original state. I can just see the cracker mumbling to himself, "I thought I disabled that thing half an hour ago…."


Recursion


Just as the stack is one of the safest places to hide data, recursion (the processing-equivalent of the stack) is one of the safest places to hide processing. It is easy to write a recursive routine whose function can only be determined by walking through every god-damned level of recursion. If you know where you're going with the routine, it takes a few minutes to write, but it can cost the cracker hours and hours to figure out.


Here's an example. Start with a long binary string:



1001110101011110010

Now create a Pascal triangle of recursive routines, each one looking rather like this:



Void RecurseNumber27(long tInput)
{
short j, Sum;
for (j = 0; (j < 27); ++j)
{
Sum += (tInput shifted right j times) & 1
}
if (Sum & 1)
RecurseNumber22(tInput);
Else
RecurseNumber07(tInput);
}

Theoretically, this code could expand out to 232 different recursive routines. The cracker knows this, but doesn't know that you have rigged the values to ensure that there are only, say, 23 routines. The only way the poor cracker can figure out what's going on is to track his way through every single routine. You, of course, can spike the set of recursive routines with a few never-called dummies that carry out the most bodacious computations. Let 'em wade through that mess!


Somewhere at the end of this shifting-walled rat's maze is a routine that actually does the useful work. You know that this is the only routine that will work�but your victim doesn't.


The Lord Giveth, and the Lord Taketh Away


One of the greatest differences between the experienced programmer and the neophyte is the former's debugging instincts. The neophyte blunders his way through a jungle of code and becomes hopelessly lost. The old pro creeps through the same jungle, sniffing, pausing to listen, and examining the ground for tracks. Working by a combination of instincts and well-honed hunches, he inevitably closes in on his quarry for the kill. Any cracker will rely heavily on those instincts to hunt down your secrets. Therefore, you must turn his instincts against him. Offer him hints that trigger his instincts; let him track down the quarry you offer, and then lead him into a dead end. Sprinkle short cryptic hard strings through your code, strings like "psswrd", "crypto", "crack", and "decrypt". They shouldn't do anything; just stick 'em in the middle of otherwise irrelevant routines. Let him go nuts trying to figure out why those strings are in there. Add a few extra, meaningless parameters to pass around through your routines. Write irrational code that carries out detailed calculations that don't do anything. Deliberately seed a few common programming errors into safe places in your code. Declare arrays to be bigger than you need, and use only the portion that you need. If your array is 10 elements longer than required, skip the first 10 elements rather than the last 10 elements.


Remember that the cracker is counting on you to write clean, rational code, so go wild! Be idiotic, insane; take everything you know about good programming practice and turn it upside down! Become a programming barbarian, lawless and uncouth! Have fun!


Back to Reality


There is one small catch with all this muddy fun: Your code must also work. Debugging deliberately psychotic code is almost as hard for you as it is for the cracker. Moreover, since the average game ships infested with a sewerful of bugs, you don't want to get carried away. My advice is to add the anti-piracy code after you have reached late beta; you don't want to burden your debugging efforts with egregious bugginess until the last moment.


And how did Patton Strikes Back fare against the crackers? I hired a cracker to beat the Mac system, and had him keep notes on his thinking as he worked. I could pay him for only twenty hours of work, but from his notes, it is obvious that, even after twenty hours, he had not come close to cracking the first layer of protection. I have never found a single cracked version of the game. There are plenty of versions posted on pirate sites, but they are all failed cracks; the crackers were so lazy that they never bothered to play the game to the end. If they had, they would have encountered the bomb that makes it impossible to complete the game. This, ultimately, is the best form of psychological protection. Let them think that they have won the battle of skills; let them hold their trophy aloft for all to see. So long as they can't actually play the game to completion, they have not taken anything of value from you. Everybody wins: They win their cracking techie-macho game, and you don't lose sales from people who actually want to play the game. Isn't that the best solution for all concerned?



LESSON 84


Defeat crackers with psychology, not technology.







    [ Team LiB ]



    No comments:

    Post a Comment