Measuring Success in an Anti-Phishing Behavior Management Program

Posted by Mike Broering—Vice President of Sales 


With the rapid uptick in organizations adopting anti-phishing behavioral management (APBM) solutions, the question arises, “How do you measure success?” The answer may seem obvious: the venerable click rate. When the click rate goes down, we have achieved meaningful risk reduction, i.e. “success.” While reducing users’ propensity or willingness to click on a link is a good thing, there are several problems that arise when we try to use click rate as the yardstick for success.


  • What happens when click rates, over multiple phishing simulations,  show a high degree of variation? What can you glean from the results?  Is the program working?  When click rates have a high degree of variability, it is much more difficult to arrive at a meaningful conclusion.  Did we send a harder test?  Did we add new users to the pool that might be affecting the data?
  • Another issue with defining success by the click rate is the problem we refer to as “Hey folks, it's phishing Tuesday.” Organizations that run regular phishing tests have noticed that when an employee spots the phishing simulation, word spreads like wildfire. Instead of training users to spot real-world phishing attempts, these organizations have taught them to spot the simulation and spread the news. What can we take from the results of those tests? Was a very low click rate the result of better-informed, more aware associates, or was it the result of an employee early-warning system for the phishing test?
  • What if there is no clickable link in the phishing simulation? Verizon reports that 46.5% of real-world phishing attacks do not contain a link at all, but rather contain an attachment. In those instances, the click rate is zero by definition. If an organization uses click rate as its sole success yardstick, and therefore avoids running simulations that test attachments rather than clickable links, is it missing a chance to test users against a real-world attack scenario?
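
To make the variability problem in the first bullet concrete, here is a minimal sketch in Python. The campaign numbers are invented for illustration, and the function names are my own; the only point is that an average click rate says little when the spread between simulations is large.

```python
from statistics import mean, pstdev

def click_rate(clicks, recipients):
    """Fraction of recipients who clicked; 0.0 if nobody was targeted."""
    if recipients == 0:
        return 0.0
    return clicks / recipients

def summarize(campaigns):
    """Return (average click rate, population standard deviation)."""
    rates = [click_rate(clicks, recipients) for clicks, recipients in campaigns]
    return mean(rates), pstdev(rates)

# Hypothetical data: (clicks, recipients) for four simulations.
campaigns = [(30, 200), (8, 200), (44, 200), (12, 200)]
average, spread = summarize(campaigns)
print(f"average click rate: {average:.1%}, spread: {spread:.1%}")
```

When the spread is more than half the size of the average, as it is with these made-up numbers, it is hard to say whether any single result reflects better-trained users, an easier test, or a change in the user pool.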

Clearly, there are challenges with using click rate as the sole measurement of success. This is not to say that click rate is unimportant; in fact, it is quite important. Instead, we should treat click rate as one metric among many in any well-planned anti-phishing program that is grounded in real-world attacks.

In discussing how best to measure the effectiveness of your anti-phishing program, I suggest we look back to our college days to find an answer. When in college, I pursued a minor in mathematics. As a requirement, I had to complete Calculus I, II, and III, as well as multivariable calculus and differential equations. If I had skipped Calculus I through III and gone right into multivariable calculus, I would certainly have crashed and burned. Instead, the university fed me bite-sized chunks that I could handle. When I showed mastery of the skills at one level, they moved me on to the next and presented more complex material that built upon what I had already learned.

In running a phishing program and in measuring effectiveness, I recommend a similar approach. Think of your user populations the way a college thinks of its courses: you have users who belong in level 100 courses, others in 200 or 300, and some even in graduate work. You need to define the skills and the passing grade required to move a user to the next level. The two key elements in this scenario are 1) what is appropriate course work, meaning how sophisticated is the phishing attempt, and 2) what are the skills for which you want the user to demonstrate mastery?

For the first question, realize that not all phishing attempts are the same. Some are highly personalized while others are not. Some phishing attempts have spelling and grammatical errors while others are perfectly and professionally worded. Some incorporate a high level of branding (company logos) while others have no branding whatsoever. A test might use URLs that are hard to distinguish from legitimate sites rather than obviously suspicious ones. Well, you get the point. As you are planning, determine the level of sophistication in your testing for each user level.

Once you have determined the level of sophistication (the course work), you need to turn your attention to what constitutes a passing grade. For example, users in Level 100 might have to complete a general security awareness training session and pass the quiz, and spot and not engage with (click, open an attachment, etc.) at least 50% of the phishing simulations sent during a defined period. Those who pass move on to 200-level courses, where perhaps they are presented with the same level of sophistication in the phish, but they must now avoid engaging with 100% of the simulations and must report (take action on) at least one of the messages. As you move users up in level, the material gets harder and the expectations rise.
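
The passing grades in that example can be written down as simple rules. The following is only a sketch in Python of the Level 100 and Level 200 criteria as described above; the class, field, and function names are hypothetical, and the thresholds are the illustrative ones from the example, not a recommendation.

```python
from dataclasses import dataclass

@dataclass
class UserResults:
    """One user's results for a defined testing period (hypothetical fields)."""
    level: int                 # current level: 100, 200, ...
    passed_quiz: bool          # passed the awareness training quiz
    simulations_sent: int      # phishing simulations sent to the user
    simulations_avoided: int   # simulations the user did not engage with
    simulations_reported: int  # simulations the user reported

def passes_level(results):
    """Apply the example passing grade for the user's current level."""
    if results.simulations_sent == 0:
        return False  # nothing to grade yet
    if results.level == 100:
        # Pass the quiz and avoid at least 50% of the simulations.
        avoided_half = 2 * results.simulations_avoided >= results.simulations_sent
        return results.passed_quiz and avoided_half
    if results.level == 200:
        # Avoid 100% of the simulations and report at least one.
        avoided_all = results.simulations_avoided == results.simulations_sent
        return avoided_all and results.simulations_reported >= 1
    raise ValueError(f"no passing grade defined for level {results.level}")

def next_level(results):
    """Promote the user by one level if they passed, otherwise keep them."""
    return results.level + 100 if passes_level(results) else results.level

print(next_level(UserResults(100, True, 4, 2, 0)))
print(next_level(UserResults(200, True, 4, 4, 0)))
```

In the second call the user avoided every simulation but reported none, so under these example rules they stay at Level 200.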

There are several advantages to this type of approach. First, it measures both the user's ability to avoid doing the wrong thing and their awareness to do the right thing (report). Second, because you can now talk about effectiveness in terms of the progress of user populations (e.g. 30% have graduated from 100 to 200, and 10% from 200 to 300), you get away from problematic single measurements such as click rate. It also allows you to handle new employees (they simply get enrolled in the level 100 program) without skewing the data. Finally, this is a measurement that executives can understand, and it goes to the heart of the matter: are we moving the needle when it comes to end-user security awareness?
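
The progress figures mentioned above (30% graduating from 100 to 200, 10% from 200 to 300) could be produced with something like the sketch below. It is written in Python, the function name and the input format are my own assumptions, and the sample data is invented to match the percentages in the text.

```python
from collections import Counter

def graduation_rates(transitions):
    """Share of users at each starting level who moved up during the period.

    `transitions` is a list of (start_level, end_level) pairs, one per user.
    """
    started = Counter(start for start, _ in transitions)
    graduated = Counter(start for start, end in transitions if end > start)
    return {level: graduated[level] / started[level] for level in started}

# Hypothetical period: ten users started at each level.
transitions = (
    [(100, 200)] * 3 + [(100, 100)] * 7    # 3 of 10 graduated from 100
    + [(200, 300)] * 1 + [(200, 200)] * 9  # 1 of 10 graduated from 200
)
for level, rate in sorted(graduation_rates(transitions).items()):
    print(f"level {level}: {rate:.0%} graduated")
```

Each user is counted only against the level they started the period in, which is what lets a report like this be read level by level instead of as one blended number.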
