Into the Breach — the Password Breach

It seems like there is a “news stories about password stealing” season, where story after story comes out about some web site having a zillion passwords stolen, and we’re deep in that season now. Within the past few weeks we have heard about the attempted sale of 117 million LinkedIn passwords, the discovery of the compromise of over 360 million user records and passwords from Myspace, and the black-market marketing of 65 million Tumblr account records. In a previous surge of interest a year and a half ago, I even helped with a story on the local FOX-8 news on password and email security.

Do you know what it means when a web site has “user records” compromised or when a password database is stolen? The point of this posting is to explain the most common way that systems handle passwords, something you may have never really thought about. Once we’ve seen how passwords are handled, we’ll get to some basic tips on how to protect yourself from password breaches like those in the previous paragraph.

Anatomy of a web site

To talk about how passwords are handled, let’s establish a shared mental picture of how people interact with web sites. Not every interaction works exactly this way, but it’s good enough for our discussion. Here’s the picture:

You, as the user, interact with the web server through your browser, which sends information and requests to the web server by way of the mysterious Internet. The web server then processes the user’s requests, retrieving information from the database as necessary to process each request. Note that database requests can only come from the web server, and the “as necessary” part in that last sentence is important. The database contains all of the data for the web site, for all users, so it is the “crown jewel” for anyone attacking the web site; therefore, a goal of many hackers is to trick the web server into making requests that are broader than necessary.

When you are logging in, what is the information necessary to process the login request? You send your user name and password to the web server, and the web server creates a request for the database server that says something along the lines of “Give me the user record for this user name.” The user record includes information about your password so that the web server can determine whether you provided the correct password, and ideally only the “necessary” information of whether you provided correct login credentials should be revealed. Unfortunately, many web sites are not implemented correctly, and attackers can find some clever way of tricking the web server into providing more information than the system designer intended. For example, what if the attacker could trick the web server into retrieving all user records instead of just the one for the entered user name? A common attack known as SQL Injection can do just this, and is at the heart of most of the password compromises we’re seeing in the news.
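To make the “broader than necessary” idea concrete, here is a minimal sketch of SQL injection using Python’s built-in sqlite3 module. The table layout, user names, and function names are all made up for this illustration; a real web site’s database will look different, but the mistake (pasting user input straight into the query text) is the same.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, pw_info TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "hash-a"), ("bob", "hash-b"), ("carol", "hash-c")],
)

def lookup_unsafe(name):
    # BAD: the user's input is pasted directly into the SQL text,
    # so the input can change the meaning of the query.
    query = "SELECT name, pw_info FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_safe(name):
    # Better: the input is passed as a parameter, so the database
    # treats it purely as data, never as part of the query.
    return conn.execute(
        "SELECT name, pw_info FROM users WHERE name = ?", (name,)
    ).fetchall()

malicious = "x' OR '1'='1"
print(len(lookup_unsafe("alice")))    # the one record that was asked for
print(len(lookup_unsafe(malicious)))  # every record in the table
print(len(lookup_safe(malicious)))    # no records at all
```

With the unsafe version, the “user name” typed by the attacker turns the query into one whose condition is always true, so the web server hands back every user record. The parameterized version just looks for a user with a very strange name and finds nobody.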

What information is in each user record? Does it contain each user’s actual password so that it can compare with what the user sent? While that may be the obvious answer, it’s not true except for really, really poorly designed web sites (more on this below). There’s a reason why I said the user record contains “information about your password” rather than “your password.” The password is typically run through some complex but repeatable transformation before it is stored in the database. Then when you try to log in, the web server takes the password you send and runs it through that same transformation, which is compared to the transformed password in the database. That way, the web site can tell whether you typed the correct password without keeping a copy of your actual password. What does the transformed version look like? Here’s what is stored in the database when the password “password” is run through a standard transformation named bcrypt:

    $2y$05$8G/CcpCKMc4x8MS8DNf6WezoEk/hi/meeLXJUa6tY0HxUkyKmJb/u

This transformation should satisfy certain properties, and to understand those we need to consider what happens if an attacker steals user records with the transformed passwords. What can they do?

First consider the possibility that an attacker can somehow reverse the computation that produced the transformed value, figuring out the password from the transformed version. For example, here’s a (bad) transformation for numerical PINs: take the PIN and add 6315 to it, so a PIN of 1234 is stored as 7549 (that’s 6315+1234). While 7549 looks very different from the PIN, so the PIN seems hidden at first glance, the computation is easily reversible by subtracting 6315 from it to get 1234. For this reason, we want the transformation to be one-way, meaning that it’s not feasible to reverse the transformation and figure out the password that way. While addition is reversible (using subtraction) and multiplication is reversible (using division), there are specially-designed operations called cryptographic hash functions that are not easily reversible.
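Here is a small sketch of that difference, using the PIN example above next to SHA-256 from Python’s standard hashlib module. The function names are my own, and hashing a bare PIN like this is only for illustration, not a recommendation.

```python
import hashlib

OFFSET = 6315  # the offset from the PIN example above

def add_transform(pin):
    # The (bad) reversible transformation: just add a fixed number.
    return pin + OFFSET

def undo_add_transform(stored):
    # Anyone who knows or guesses the offset can undo it by subtracting.
    return stored - OFFSET

def hash_transform(pin):
    # A cryptographic hash: easy to compute, with no practical way
    # to run the computation backwards.
    return hashlib.sha256(str(pin).encode()).hexdigest()

stored = add_transform(1234)
print(stored)                      # 7549
print(undo_add_transform(stored))  # 1234
print(hash_transform(1234))        # 64 hex characters
```

One-way is not the whole story, though. With only 10,000 possible 4-digit PINs, an attacker could simply hash every one of them and compare, which is exactly the kind of attack the next couple of paragraphs deal with.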

Next, what if the attacker could just take big lists of common passwords and put them in a big table so they can quickly look them up later, like in a dictionary? (This is called a “lookup table.”) The attacker would then know that whenever she sees “2JyhZl6yG5alFg7I8/vfMqRKUvjRkNn/MrY” (for example), the password is “secret.” If you could calculate 10 of these transformed values per second, then in a month (about 2.5 million seconds) you could make a lookup table containing transformed values for each of the 25 million most common passwords. Then when the attacker steals a bunch of user records containing transformed passwords, looking up stolen values in this table would be very fast, and would probably be very successful. For this reason, a good password transformation is salted, an odd technical term that basically means that the transformation uses some random value (the “salt”) that it stores with the transformed value. Each user’s transformed password will use a different salt, and the attacker can’t know the salts until after the user database is stolen, which means that the attacker can’t precompute the lookup table. In addition, since each stored password on a system uses a different salt, the attacker can’t even work on more than one user’s entry at a time: testing “secret” as a possibility for 1000 users’ passwords now takes 1000 different computations, since each user’s password is transformed with a different salt.
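The effect of the salt can be sketched in a few lines. This is a simplified illustration and not how any particular web site does it: I use plain SHA-256 as the transformation just to keep the code short, and the record layout and function names are made up.

```python
import hashlib
import hmac
import os

def transform(password, salt):
    # Simplified transformation for illustration: SHA-256 of salt + password.
    return hashlib.sha256(salt + password.encode()).hexdigest()

# With no salt, the attacker can build the lookup table ahead of time.
common_passwords = ["password", "secret", "123456"]
lookup_table = {transform(p, b""): p for p in common_passwords}
stolen_unsalted = transform("secret", b"")
print(lookup_table.get(stolen_unsalted))  # secret

def make_record(password):
    # Each user gets a fresh random salt, stored next to the transformed value.
    salt = os.urandom(16)
    return {"salt": salt, "hash": transform(password, salt)}

def check_login(record, attempt):
    # Redo the transformation with the stored salt and compare the results.
    return hmac.compare_digest(record["hash"], transform(attempt, record["salt"]))

user1 = make_record("secret")
user2 = make_record("secret")
print(user1["hash"] == user2["hash"])    # same password, different stored values
print(lookup_table.get(user1["hash"]))   # None: the precomputed table is useless
print(check_login(user1, "secret"), check_login(user1, "guess"))
```

Notice that the web site can still check a login, because it keeps the salt in the user record. The attacker gets the salts too once the records are stolen, but by then it is too late to have precomputed anything.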

Finally, consider the time it takes to compute the transformation. It should be somewhat quick (you don’t want it to be so slow that it makes users wait to log in) but not too quick. Why not too quick? What we described in the previous paragraph is called a “brute force attack,” where the attacker tries password after password after password. The one-way cryptographic hash functions mentioned above are actually designed to be very fast, and a popular one called “SHA-256” takes less than a microsecond (a millionth of a second) to process one 16-character password. That means that a brute force attack can test out a million different passwords every second, which is bad if you want to make things hard for attackers! Therefore, good password transformations are slower than that, taking say one tenth of a second for the transformation. Now it would take the attacker 100,000 seconds (a little over a day) to test a million possible passwords, and it would take over 3 years to test a billion possibilities for a single user.
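One common way to get a deliberately slow transformation is to repeat a fast hash many times. The sketch below uses PBKDF2 from Python’s standard hashlib module as a stand-in (it is a different function from the bcrypt example above), and the iteration counts are arbitrary values I picked for the demonstration. The actual timings depend entirely on your computer.

```python
import hashlib
import os
import time

def slow_transform(password, salt, iterations):
    # PBKDF2 repeats an HMAC-SHA-256 computation `iterations` times,
    # so the iteration count acts as a "slowness" dial.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)

salt = os.urandom(16)
for iterations in (1, 1_000, 100_000):
    start = time.perf_counter()
    slow_transform("password", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations:>7} iterations: {elapsed:.6f} seconds")

def attack_seconds(guesses, guesses_per_second):
    # Time for a brute force attack to try every guess once.
    return guesses / guesses_per_second

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
print(attack_seconds(1_000_000, 10))                         # seconds for a million guesses
print(attack_seconds(1_000_000_000, 10) / SECONDS_PER_YEAR)  # years for a billion guesses
```

A legitimate user pays the cost once per login, so a tenth of a second is barely noticeable. The attacker pays it for every single guess.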

How does this affect me?

So about now you’re saying “how does any of that affect me?” Let’s consider how this knowledge can help you understand whether the web site you’re using is secure, and how you should pick a password.

Since you now know that any well-designed web site will keep passwords that have been transformed by a one-way function, any web site that can tell you what your password is must be ignoring standard security practices. For example, most web sites have a “forgot password” link you can click, and if this emails your password to you then it’s a big red flag that the web site is not handling your information properly. A properly-designed and run web site will email you a link so that you can reset your password (probably after answering some security questions), since the web site has no way of knowing what your current password is!

Also consider the complexity of your password. A lot of web sites will force you to pick a password that is a combination of letters and numbers, and maybe some symbols as well. That’s all to slow down the brute force attacks that were described above. What if you just picked an actual English word to be your password? There are a little under 100,000 English words (omitting the really obscure ones), so the brute force attack described above (trying 10 passwords per second) could test all of these in 10,000 seconds, or about 3 hours. Your password should definitely not be a word! On the other hand, if your password were 8 randomly chosen letters, then there would be 26⁸ (that’s 26 to the 8th power) possible passwords, or about 209 billion possible passwords. At 10 per second, testing all of those would take about… oh, let’s see… 660 years. Random passwords are very secure! We have two extremes now: a meaningful, easy to remember password (an English word) that is insecure, versus a random, hard to remember password that is very secure. This is why you mix numbers and symbols into a meaningful password: that can make it difficult to find in a brute force search, but meaningful enough for you to remember.
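If you want to check the arithmetic yourself, it fits in a few lines. The rate of 10 guesses per second is the slow-transformation assumption from earlier, and the 100,000-word dictionary is the rough figure used above; the names in the code are my own.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_YEAR = 365 * 24 * SECONDS_PER_HOUR
GUESSES_PER_SECOND = 10  # assumed attacker speed against a slow transformation

def seconds_to_try_all(possibilities, rate=GUESSES_PER_SECOND):
    # Worst case for the attacker: every possibility gets tested once.
    return possibilities / rate

english_words = 100_000   # rough dictionary size used in the text
random_letters = 26 ** 8  # 8 letters, each chosen from 26

print(seconds_to_try_all(english_words) / SECONDS_PER_HOUR)   # hours
print(random_letters)                                         # 208827064576
print(seconds_to_try_all(random_letters) / SECONDS_PER_YEAR)  # years
```

On average the attacker finds the password after trying about half of the possibilities, so these are worst-case figures, but the gap between hours and centuries is the point.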

Some ways to protect yourself.

The first thing to do when considering how to protect yourself is to be paranoid and assume that the user records will be stolen from any web site that you use. Some may in fact be fairly secure, but even some sites run by very smart and professional people get compromised. Furthermore, there’s really nothing you can do about the security of the web server, so concentrate on what you can do to protect yourself. Here are some tips:

Pick complex, hard-to-guess passwords. As explained above, this slows down brute force attacks and can make it so that even if a hacker steals the database with your transformed password, they won’t be able to find your password in a reasonable amount of time. Don’t use words, and don’t just stick “123” on the end of an English word. Pick a longer phrase and use the initial letter of each word, or turn some words into numbers. For example, “Ally Sheedy was great in the Breakfast Club” might turn into “ASwgr8itBC.” That’s 10 characters long, contains upper and lower case letters as well as a number, won’t appear in any dictionary, and is easy to remember (assuming the phrase is meaningful to you).
Use different passwords for different web sites. Again, assume that someone is going to get your password, and further assume that the web site doesn’t properly transform stored passwords like we described above. Then the attacker knows your actual password. That might not be a problem if the password is for a discussion forum for navel lint collectors (which may not be very well protected!), but if you use the same password for your bank then the attacker learns your valuable password by breaking into a poorly-protected but low-value web site. At the very least, use different passwords for important web sites that deal with financial information.
Use two-factor authentication. This is hard to describe in a quick bullet point, and might be better as a separate blog post, but the basic idea is this: set up your important accounts or web sites so that they require multiple ways to check your identity. For example, you can set up Gmail to require a 6-digit code that is texted to you in addition to your password. For easier use, you can have Google remember your trusted systems (your laptop, home computer, phone, etc.) so that you don’t have to enter the code on every login. But if some hacker in China steals your password from the web site, they still won’t be able to log in from their computer since they don’t have access to your phone and the text messages with the secret code.

The bottom line is that you can secure your accounts very well, if you take a little time to follow these tips. Protect yourself!

From a Computer Science Mind

Monday, June 13, 2016
