[Update: The main purpose of this post is to present and demonstrate a method of risk estimation and quantification to support practical policy decisions. The email password policy is just a simplistic case to facilitate the debate. I also modified the blog post title and the text below to make it clear that this method is aimed at supporting quantitative risk estimation.]
What is the risk-driven, correct frequency of changing my email password?
<crickets…. silence… more silence>
Yes, we all can quote that “PCI DSS says 90 days” or “whatever regulation says 30 days”, but what does risk say? What actuarial information we need – if we are to define risk through probability of loss? What info about my email usage? Value of information stored there? Frequency of attacks on other similar email accounts? Chances of attack success? My approach to protecting the password? My personal password reuse “policy?” Anything else? On a related note, maybe this is simpler: what is my risk [of having the account compromised] if I change the password every 30 days, 90 days, 300 days?
So, any idea how to go about it?
This little experiment might well show us that “risk-based security” is an awesome thing – but not one achievable in this world today… [emphasis in original]
I wanted to blog about this, but hadn’t collected enough specifics. Now I can: thanks to the blog conversation by David Mortman, Rich Mogull, Chris Pepper, and “Steve”, we have some smart, experienced people providing the needed detail.
Below, I offer a method of reasoning for estimating the relative risk of alternatives. It is compatible with quantitative risk analysis and management, but doesn’t require massive amounts of risk calculation. I use the conversation by Mortman et al. as an example of this method in action (armchair-style).
The Method — Abductive Validation
The following is a fairly generic method to guide decisions when you only have partial information/evidence and a rough estimate of overall risk. It is a form of abductive validation, which is reasoning to the best explanation from available evidence along with evaluation criteria. [Update: it’s also called “Analysis of Competing Hypotheses” in the intelligence community. Cf. book. There is a cool extension using Subjective Logic described in slides and a paper.]
Step 1. Frame your decision alternatives in terms of hypotheses that could, in principle, be refuted with enough evidence. In this case, I propose the following hypotheses:
- Risk (Policy: “Change password every 90 days”) < Risk (Policy: none)
[the policy decreases risk]
- Risk (Policy: “Change password every 90 days”) > Risk (Policy: none)
[the policy increases risk]
- Risk (Policy: “Change password every 90 days”) = Risk (Policy: none)
[the policy makes no difference in risk]
- Risk (Policy: “Change password every 90 days”) ? Risk (Policy: none)
[i.e. “we don’t know”. It’s indeterminate, unresolved, etc.]
[Update: having these four hypotheses is critical to the method, rather than just a single hypothesis such as #1 (the hypotheses are numbered #1 through #4 in the order listed). The reason is that different pieces of evidence may support or contradict one or more of these hypotheses. For example, just because a given piece of evidence contradicts #1 doesn’t mean it supports #2 or #3.]
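As a minimal sketch (an illustration, not part of the original debate), the four hypotheses can be written down as an enumeration plus a comparison that falls back to “we don’t know” whenever either risk estimate is missing. The names `Hypothesis` and `classify`, and the `tolerance` parameter for treating nearly equal estimates as “no difference”, are assumptions made for this example.

```python
from enum import Enum
from typing import Optional


class Hypothesis(Enum):
    DECREASES = 1      # Risk(policy) < Risk(no policy)
    INCREASES = 2      # Risk(policy) > Risk(no policy)
    NO_DIFFERENCE = 3  # Risk(policy) = Risk(no policy)
    UNKNOWN = 4        # not enough information to compare


def classify(risk_with_policy: Optional[float],
             risk_without_policy: Optional[float],
             tolerance: float = 0.0) -> Hypothesis:
    """Map a pair of risk estimates onto one of the four hypotheses.

    `None` means "no estimate available"; `tolerance` is a hypothetical
    threshold below which a difference is treated as no difference.
    """
    if risk_with_policy is None or risk_without_policy is None:
        return Hypothesis.UNKNOWN
    difference = risk_with_policy - risk_without_policy
    if abs(difference) <= tolerance:
        return Hypothesis.NO_DIFFERENCE
    return Hypothesis.DECREASES if difference < 0 else Hypothesis.INCREASES
```

Keeping `UNKNOWN` as an explicit outcome is the point: the comparison never silently defaults to one of the other three.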
Step 2. Define the Risk( ) function to make it operational.
Start with a metric for aggregate risk for the whole business unit. I prefer “Total Cost of Security”, but other aggregate risk assessment methods will do (e.g. NIST 800 series). Next, go through the steps to reach at least a rough estimate of aggregate risk. In the process, you should be able to identify the operational factors that have the biggest effect on aggregate risk. These are called “risk drivers”, and the concept is analogous to “cost drivers” in Activity-based Costing. You should be able to rank order them by degree of impact. Anything that increases the risk drivers also tends to increase the aggregate risk metric.
Then, define your Risk( ) function in relation to these risk drivers. This is the key to making this method tractable yet also sufficient to guide decisions using quantitative risk management principles. In other words, you aren’t seeking a risk function that relates the policy variable directly to aggregate risk. Instead, you relate the policy variable to the risk drivers.
Here’s a simple example using email password policy. Let’s say your organization has only three risk drivers: 1) Number of confidentiality breaches of person-to-person communications, 2) Number of breaches of end-user email accounts, and 3) Number of non-employees who have access to internal information systems. The Risk( ) function would be defined according to the effect that password policy has on each of these risk drivers, whether it increases or decreases them. You can use either quantitative or qualitative functions to evaluate impact on the risk drivers.
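One way to picture such a function is a qualitative scoring sketch like the one below. It is only an illustration under stated assumptions: the rank weights (3, 2, 1), the -1/0/+1 effect scale, and the names `RISK_DRIVERS` and `relative_risk` are all hypothetical, and the effect values in the usage example are placeholders rather than findings.

```python
from typing import Dict

# Hypothetical rank weights: a larger weight means a bigger effect on
# aggregate risk. Real weights would come out of the Step 2 estimate.
RISK_DRIVERS: Dict[str, int] = {
    "confidentiality_breaches": 3,
    "email_account_breaches": 2,
    "non_employee_access": 1,
}


def relative_risk(effects: Dict[str, int]) -> int:
    """Weighted sum of a policy's effect on each risk driver.

    `effects` maps a driver name to -1 (policy decreases the driver),
    0 (no effect) or +1 (policy increases it). Drivers that are not
    mentioned count as 0. A negative total suggests lower risk.
    """
    unknown = set(effects) - set(RISK_DRIVERS)
    if unknown:
        raise KeyError(f"unknown risk drivers: {sorted(unknown)}")
    return sum(weight * effects.get(driver, 0)
               for driver, weight in RISK_DRIVERS.items())


# Placeholder judgment: the 90-day policy is assumed to reduce email
# account breaches and to leave the other two drivers unchanged.
policy_90_days = {"email_account_breaches": -1}
no_policy: Dict[str, int] = {}
print(relative_risk(policy_90_days), relative_risk(no_policy))
```

The function relates the policy to the risk drivers only, never directly to aggregate risk, which is what keeps the comparison cheap.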
Step 3. Collect evidence regarding each hypothesis, both pro and con. Evidence can be quantitative, qualitative, or a mix. (You’ll need formal rules for evaluating the quality and strength of the evidence, but to keep this post from getting too long, I won’t go into that.) Stop when you have enough evidence to choose among the hypotheses, or you run out of time or money. (If this happens, you are left with the inconclusive hypothesis #4, above.)
Step 4. Evaluate the evidence and decide whether to support or refute each hypothesis. If multiple conflicting hypotheses are supported, then either collect more evidence to exclude one of the hypotheses, or accept a mixed or ambiguous conclusion.
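Steps 3 and 4 can be sketched as a small evidence matrix. The decision rule used here is a deliberately crude assumption (a hypothesis is “supported” if at least one item supports it and none contradicts it); a real study would replace it with the formal rules for evidence quality and strength mentioned in Step 3. The names `HYPOTHESES`, `INCONCLUSIVE`, and `evaluate` are hypothetical.

```python
from typing import Dict, List

HYPOTHESES = (1, 2, 3)  # decreases, increases, no difference
INCONCLUSIVE = 4        # fallback when nothing survives


def evaluate(evidence: List[Dict[int, int]]) -> List[int]:
    """Return the hypotheses left standing after weighing the evidence.

    Each evidence item maps a hypothesis number to +1 (supports) or
    -1 (contradicts); hypotheses it does not mention are unaffected.
    """
    support = {h: 0 for h in HYPOTHESES}
    against = {h: 0 for h in HYPOTHESES}
    for item in evidence:
        for h in HYPOTHESES:
            rating = item.get(h, 0)
            if rating > 0:
                support[h] += 1
            elif rating < 0:
                against[h] += 1
    surviving = [h for h in HYPOTHESES
                 if support[h] > 0 and against[h] == 0]
    return surviving or [INCONCLUSIVE]


# One item supporting #1 and one supporting #3 gives a mixed result;
# no evidence at all falls back to #4.
print(evaluate([{1: +1}, {3: +1}]))
print(evaluate([]))
```

Note that an item contradicting #1 adds nothing to #2 or #3 unless it rates them explicitly, and that a result with more than one hypothesis is the “mixed or ambiguous conclusion” case from Step 4.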
Example of Arguments and Evidence — Pro and Con
Mortman’s blog conversation is an excellent example of how this evidence-based argumentation should be done. This is an armchair debate, so they don’t follow the steps exactly, and the discussion is very incomplete. But I think it has enough specifics to show how a formal study could proceed. (BTW, all uses of the word “evidence” in the commentary below are shorthand for “evidence that he thinks someone could collect…”. No one is actually pointing to solid evidence.)
Mortman starts with a challenge statement that expresses his support for hypothesis #3 (PW change policy makes no difference):
Show me any reasonable evidence that changing all your users’ passwords every 90 days reduces your risk of being exploited.
In the first comment, Steve offers evidence in support of hypothesis #1 (PW change policy reduces risk):
Aside from regular password change intervals, is there a way to mitigate offline brute-force attack? Assuming an attacker uses any of a number of methods to grab a password hash, and that the hash isn’t some sort of weak LM silliness, an attacker is left with a long-running brute force process, depending on the computational power available. For most organizations, a password change policy of 90 or 180 days would likely make the results of the brute force moot.
Given that offline brute-force is a realistic threat, isn’t a password change policy a reasonable control?
Mortman counters this argument in comment 2, not by refuting it, but by offering more evidence in support of hypothesis #3 (no difference):
It sounds like a realistic threat, except for the fact, that if someone has been able to get your password hashes, then they are unlikely to need to brute force passwords. They already have the access they need to get to the data that they want. If you own the authentication system, passwords no longer matter. Even if they need or want passwords, they now have the ability to capture them at will.
Steve counters Mortman in comment 3, again by offering evidence in support of hypothesis #1:
The specific scenario I was thinking of involved cracking an Active Directory domain member, and then dumping the hashes of the last 10 logged-in users (which is used to authenticate users when the domain controller is unavailable). There’s a good chance that a regular user’s PC will have had a Help Desk or other more-privileged account logged in within the last 10, and by cracking that hash, the attacker would gain access to higher privileges.
Chris Pepper chimes in, basically agreeing with Mortman on hypothesis #3, appealing to evidence that such attacks are not frequent in most scenarios:
90-day password change policies are stupid in >90% of scenarios. There are probably some DoD scenarios under active attack where they make sense. The clowns who insist normal business systems need 30 or 90 day password expirations don’t mean *all* users should disbelieve *all* professional security advice.
Then he offers a counterargument to Steve, with evidence in support of hypothesis #3 (i.e. password strength matters much more than change frequency):
If your passwords are realistically brute-forceable in 90 days, they’re too short.
Then he offers evidence supporting Steve’s evidence in favor of hypothesis #1:
There are various ways you might get a UNIX /etc/shadow file from backups, so reversing password hashes is a real threat.
Mortman responds to Steve by proposing a sub-problem to evaluate (i.e. strength of the hash relative to time to crack):
Okay so lets take your scenario. The question you have to ask, is how long is the hash going to stand up to attack. With a strong hash, it’s going to be a heck of a lot longer then 90 or even a 180 days, possibly years. In that case, what’s the justification for changing it on a 90-180 day schedule? Realistically though, what this means is that you now have a more complex risk question around how likely is it that someone is going to break in and get the hashes and how long are you willing for them to have use of those passwords?
Then Mortman asserts that this sub-problem may be unresolvable due to lack of evidence:
We at this point have little to no data on how likely this sort of attack is to occur so we can’t even take even a bad guess at if 90 days is a good number or a bad number. But until we have some data, we’re just making stuff up so make ourselves feel like we’re doing something.
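The time-to-crack half of Mortman’s sub-problem can at least be framed as back-of-the-envelope arithmetic, even though the likelihood half still lacks data. The sketch below assumes an exhaustive search over uniformly random passwords and a constant guess rate; the rate of one billion guesses per second and the 95-character alphabet are made-up inputs for illustration, not measurements of any real hash or attacker.

```python
SECONDS_PER_DAY = 86_400


def expected_days_to_crack(charset_size: int, length: int,
                           guesses_per_second: float) -> float:
    """Expected days for an exhaustive offline search to find a password.

    Assumes the password is uniformly random, so on average half of the
    keyspace (charset_size ** length) is searched before a hit.
    """
    keyspace = charset_size ** length
    expected_guesses = keyspace / 2
    return expected_guesses / guesses_per_second / SECONDS_PER_DAY


def outlasts_interval(charset_size: int, length: int,
                      guesses_per_second: float,
                      change_interval_days: float) -> bool:
    """True if the expected cracking time exceeds the change interval."""
    days = expected_days_to_crack(charset_size, length, guesses_per_second)
    return days > change_interval_days


# Hypothetical inputs: 95 printable characters, 1e9 guesses per second.
for length in (8, 10, 12):
    days = expected_days_to_crack(95, length, 1e9)
    print(f"length {length}: about {days:,.0f} days")
```

Under these assumed inputs, an 8-character password falls well inside a 90-day window while a 10-character one outlasts it by orders of magnitude. Human-chosen passwords are far from uniformly random, so real cracking times can be much shorter than this estimate.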
Mogull chimes in, supporting hypothesis #3 (no difference) with other evidence (rarity of these sorts of attacks, agreeing with Chris Pepper):
Based on all the recent breach reports and investigations, it doesn’t look like password cracking is a major vector anymore (I’m not willing to stand behind that statement, but that’s my reading of these reports).
Finally, Mogull hints that there might be evidence in favor of hypothesis #2 (policy increases risk by increasing total costs):
With modern systems (no more NTLANMAN) is it really a risk? Is that risk greater than the cost of password rotations?
And so on…
I hope this example gives you a feeling for how the Abductive Validation method can work in practice. If you really care about the quality of the answer, you would need to formalize the process through investigation, data collection, and experiments. Also, by enumerating hypotheses in the way I described, this method has the great advantage of telling you when the evidence is inadequate to support any hypothesis or alternative.