The theft by a Russian syndicate of 1.2 billion username and password combinations from 420,000 websites around the world means that the personal details of almost half of all users of the internet must now be considered severely compromised. It can be only a matter of time before the victims find nasty surprises in their bank statements and credit-card accounts. To be on the safe side, anyone who uses financial and shopping websites should change their passwords forthwith—preferably to something longer, more jumbled, and including no word found in any dictionary. The more nonsensical the better.
Heads may nod in agreement, but the advice is then promptly ignored. Human nature, being what it is, has a habit of making people the weakest link in any security chain. For instance, passwords that are easy to remember—the ones most people choose—tend to be the easiest for cybercrooks to guess. By contrast, passwords comprising long, random strings of uppercase and lowercase letters plus numbers and other keyboard characters are far more difficult to fathom. Unfortunately, they are also difficult to remember. As a result, users write them down on scraps of paper that get left lying around for prying eyes to see.
Basically, two factors determine a password’s strength. The first is the number of guesses an attacker must try to find the correct one. This depends on the password’s length, complexity and randomness. The second factor concerns how easy it is to check the validity of each guess. This depends on how the password is stored on a website’s server.
Taking the second factor first, any computer system that requires users to be authenticated when logging on stores the various passwords in a database. Because such tables can be stolen, passwords are normally stored in the form of a “hash” of the user’s authentication details rather than in plain text. A cryptographic hash is a string of characters created from the original plain text by an algorithm (such as MD5 or SHA-1), from which it is supposed to be impossible to recreate the original. When a user enters a password, it is hashed using the same one-way algorithm, and the output is then compared with the hash stored in the database.
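The hash-and-compare step can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample password are hypothetical, and SHA-256 from the standard library stands in for whichever algorithm a given website actually uses. It is deliberately bare, with no salting or stretching.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the hex SHA-256 digest of a password (a one-way transformation)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def check_login(attempt: str, stored: str) -> bool:
    """Hash the attempt with the same algorithm and compare it with the stored hash."""
    return hmac.compare_digest(hash_password(attempt), stored)

# What the server keeps in its database: the hash, never the plain text.
stored_hash = hash_password("correct horse")

print(check_login("correct horse", stored_hash))  # True
print(check_login("wrong guess", stored_hash))    # False
```

The server never needs to reverse the hash; it only needs to repeat the same calculation on whatever the user types.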
The strength of a password therefore depends, to a large extent, on the hashing function used, and how well the database containing all the password hashes is protected. Such things are normally outside the user’s control, depending instead on the integrity of the online banking or retailing firm’s website security.
What is very much under the user’s control is the length, complexity and randomness of the password chosen. Several years ago, an eight-character password was considered more than adequate. Using cracking computers of the day, it would have taken a couple of years to break such a password by brute-force methods—more than enough to deter most criminals. Today, ten characters has to be considered the absolute minimum length.
For its part, the complexity of a password depends on the size of the character set it is selected from. The wider the choice, the greater the complexity—and thus the better the security. Using numbers alone limits the choice to just ten characters. Add upper- and lower-case letters and the complexity rises to 62. If all the symbols in the standard ASCII set of printable characters are available, the pool to choose from increases to 95.
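The three pool sizes quoted above can be checked against Python's standard `string` constants. The sketch below assumes the 95th printable character is the space, and the helper `search_space` is a hypothetical name introduced here to show how quickly the number of possible passwords grows with the pool.

```python
import string

digits = string.digits                               # 0-9
alphanumeric = string.ascii_letters + string.digits  # a-z, A-Z, 0-9
printable = alphanumeric + string.punctuation + " "  # plus symbols and the space

def search_space(pool_size: int, length: int) -> int:
    """Number of distinct passwords of a given length drawn from a pool."""
    return pool_size ** length

for name, pool in [("digits", digits),
                   ("alphanumeric", alphanumeric),
                   ("printable ASCII", printable)]:
    size = len(pool)
    print(f"{name:16} pool={size:3}  8-character passwords={search_space(size, 8):,}")
```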
By contrast, the randomness of a password depends largely on whether it was created automatically by a random-number generator (better), or by the user making a less-than-arbitrary choice (worse). Either way, randomness is measured by its so-called “entropy”—its degree of disorder. In information theory, a tossed coin is said to have an entropy of one bit (ie, one binary digit). That is because it can land randomly in one of two, equally possible, binary states.
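For readers who would rather let a machine do the choosing, here is a minimal sketch of a generator built on Python's `secrets` module, which draws on the operating system's cryptographic random source. The function name, the default length of 12 and the 94-character alphabet (printable ASCII without the space) are assumptions made for illustration.

```python
import secrets
import string

# Letters, digits and punctuation: 94 characters (the space is left out).
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length: int = 12) -> str:
    """Pick each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(random_password())    # a different string on every run
print(random_password(20))
```

Because every character is chosen independently, no position in the password gives an attacker a hint about any other.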
Each time an extra bit of entropy is added to a password, it doubles the number of guesses needed to crack it. Thus, a password with 64 bits of entropy is as strong as a string of data comprising 64 randomly selected binary digits. Put another way, a 64-bit password would require 2 raised to the power of 64 attempts to crack it by brute force—in short, 18 billion-billion attempts.
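The arithmetic is easy to reproduce. The sketch below assumes every character is picked uniformly at random, which is the only case in which entropy is simply the length multiplied by the base-2 logarithm of the pool size; the function name is hypothetical. Passwords chosen by people carry less entropy than this formula suggests.

```python
import math

def entropy_bits(pool_size: int, length: int) -> float:
    """Entropy, in bits, of a password of uniformly random characters."""
    return length * math.log2(pool_size)

print(2 ** 64)                          # 18446744073709551616, about 18 billion billion
print(entropy_bits(2, 64))              # 64.0 (64 random binary digits)
print(round(entropy_bits(95, 10), 1))   # 65.7 (ten random printable ASCII characters)
```

On that reckoning, ten truly random characters from the full 95-character set are worth slightly more than 64 bits.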
That may sound astronomical, but a 64-bit encryption key was cracked in 2002 using brute-force methods. It did, nevertheless, take a network of volunteers nearly five years to do so. However, given the sort of equipment available today, a 64-bit password could be cracked in months.
Two things have changed in recent years to make even strong passwords vulnerable. One is that computers have got a whole lot faster. This is not just the effect of Moore’s Law—the doubling of processing power every two years or so. There has also been a quantum leap in the computational performance of PCs, thanks to the massive parallel processing made possible by the graphics processing unit (GPU) embedded in video cards. When used to crunch numbers instead of drawing complex shapes, colours, textures, highlights and shadows that change rapidly in a video game, a modern graphics card costing less than $1,000 can turn a humble PC into a desktop supercomputer.
Unlike a computer’s central processing unit or CPU, which executes single instruction threads in rapid succession, a GPU’s parallel architecture allows it to execute many threads simultaneously. Some of the latest graphics cards offer the performance equivalent of a CPU with several thousand processing cores. When used with cracking software optimised for parallel processing, attackers can make billions of guesses a second using nothing more than a high-end gaming PC. One modified machine fitted with eight graphics cards is claimed to make over 140 billion guesses a second.
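To put the quoted guessing rate in perspective, the sketch below estimates how long a single such machine would need to try every possible password. It rests on loud assumptions: the 140 billion guesses a second is sustained against whatever hash the website uses, the password is drawn from all 95 printable ASCII characters, and the attacker must exhaust the whole search space (on average, half would do). The names are hypothetical.

```python
GUESSES_PER_SECOND = 140e9  # rate claimed in the text for an eight-card machine

SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 3600 * 24 * 365

def seconds_to_exhaust(pool_size: int, length: int,
                       rate: float = GUESSES_PER_SECOND) -> float:
    """Worst-case time to try every password of this length at the given rate."""
    return pool_size ** length / rate

for length in (8, 10, 12):
    secs = seconds_to_exhaust(95, length)
    print(f"{length:2} characters: {secs / SECONDS_PER_HOUR:,.1f} hours "
          f"({secs / SECONDS_PER_YEAR:,.2f} years)")
```

Under these assumptions an eight-character password falls in about 13 hours, while ten characters would hold out for more than a decade, and each extra character multiplies the time by 95.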
The second thing that has changed is that hackers with malicious intentions no longer rely solely on brute-force methods that try all possible combinations of characters in order to guess a password correctly. These days, they can buy black-market dictionaries of common passwords, along with all their imaginable variants, that run into a billion or so entries. Such dictionaries are used to create tables of pre-generated hash values. Lists of these pre-generated hashes are stored in so-called “rainbow tables” for mounting attacks.
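The idea of pre-generated hashes can be shown with a toy example. What follows is a plain lookup table, not a true rainbow table (which uses chains of hashes to save storage), and the five-word dictionary, the user names and the stolen hashes are all invented for illustration. MD5 is used only because the text names it.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Unsalted MD5 digest, as a careless website might store it."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical mini-dictionary; real ones run to a billion or so entries.
COMMON_PASSWORDS = ["123456", "password", "letmein", "qwerty", "P@ssw0rd"]

# Computed once in advance, then reused against every stolen database.
LOOKUP = {md5_hex(p): p for p in COMMON_PASSWORDS}

# Hypothetical stolen table of user name -> stored hash.
stolen = {
    "alice": md5_hex("letmein"),     # a common choice
    "bob": md5_hex("x9$Lq!v2Tz"),    # a random string, not in the dictionary
}

for user, digest in stolen.items():
    print(user, "->", LOOKUP.get(digest, "<not in dictionary>"))
```

Each lookup is a single dictionary access, which is why common passwords fall almost instantly once the table exists.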
By trying these first, all the low-hanging fruit in a stolen hash table can quickly be unscrambled. As a rule, attackers can usually decipher at least half of the hashes in a database in 5% of the time it would take to do the lot. Weighing time against results, many attackers cease after unscrambling 80% or so of a stolen database.
What can individuals do to protect themselves? Apart from choosing passwords that are strong enough (ie, long, complex and random mixtures of ASCII characters) to make cracking their hashes too time consuming for thieves to bother with, there is actually not all that much more. Passwords get stolen and broken mainly because of poor choices made by those responsible for a website’s security—especially the way it stores customers’ validation details.
Even when passwords are hashed, the most popular algorithm remains MD5. Yet, this has long been known to have a fundamental “collision” flaw that is easily exploited. The other widely used hashing function, SHA-1, is little better. More robust hashing algorithms exist (eg, bcrypt, scrypt, SHA-512 and PBKDF2) that make life difficult for would-be thieves. Among other things, several of these stretch out the hashing process by repeating it thousands of times, which slows every cracking attempt to a snail’s pace.
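Stretching is simple to demonstrate. The sketch below repeats SHA-256 by hand purely to show the principle; the function name and round counts are assumptions, and a real system would use a purpose-built function such as bcrypt, scrypt or PBKDF2 rather than this loop.

```python
import hashlib
import time

def stretched_hash(password: str, rounds: int) -> str:
    """Feed the digest back into SHA-256 `rounds` times (illustration only)."""
    digest = password.encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()

for rounds in (1, 100_000):
    start = time.perf_counter()
    stretched_hash("example password", rounds)
    elapsed = time.perf_counter() - start
    print(f"{rounds:>7} rounds: {elapsed:.4f} seconds")
```

A delay of a fraction of a second is imperceptible to someone logging on once, but it is multiplied across every one of the billions of guesses an attacker has to make.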
Another useful defence is to “salt” each password with a different random number before hashing it. An attacker pre-generating a rainbow table then has to store the hashes of every conceivable salt value for each and every password in the dictionary used. For a salt of more than, say, 32 bits (over four billion possible values, or 2 raised to the power of 32), cracking such a salted hash table in any reasonable amount of time is nigh impossible with today’s technology. Even so, few commercial websites use salting, let alone stretching, to protect their customers’ logon details.
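Salting and stretching can be combined using PBKDF2, which ships with Python's standard library as `hashlib.pbkdf2_hmac`. This is a sketch rather than a recommended configuration: the 16-byte salt, the iteration count and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor, for illustration only

def make_record(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); both are stored alongside the user's account."""
    salt = os.urandom(16)  # a fresh 128-bit random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

salt_a, hash_a = make_record("letmein")
salt_b, hash_b = make_record("letmein")

print(hash_a != hash_b)                   # True: same password, different salts
print(verify("letmein", salt_a, hash_a))  # True
print(verify("letmein", salt_b, hash_a))  # False: wrong salt for this digest
```

Because two users with the same password end up with different stored hashes, a table computed in advance for unsalted hashes is of no use against either of them.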
Given the pace of innovation in graphics processors, coupled with the increasing power of cracking software (mostly available for free on the internet), even the best password defences are destined to be overwhelmed in due course. After two thousand years of development, the password’s days would finally seem numbered. Time to start investing in spoof-proof biometric factors that characterise each person uniquely as an individual.
http://www.economist.com/blogs/babbage/2014/08/difference-engine-1?fsrc=scn/tw/te/bl/youvebeenhacked