Effective Incident Response – Practice Makes Perfect
As a security professional, I have seen a wide variety of best practices for incident response. The methodologies vary greatly based on the sensitivity of the data, requirements to notify law enforcement, and so on. Best-practice recommendations come from sources ranging from non-profit security organizations to regulatory compliance initiatives, but all suffer from the same problem: they are far too high level to actually execute.
Every one of the standards will recommend having an incident response plan, assigning roles and responsibilities, preserving critical log information, notifying law enforcement, and prioritizing restoration of services. Sounds great…but how? Creating an incident response plan is one thing, but actually using it effectively without ever having rehearsed it is a completely different matter. How do you take your incident response plan, regardless of its maturity, and actually make it effective? The answer is periodic role playing and practice, much like regular vulnerability assessments and penetration testing.
Roles & Responsibilities
To get started, first ask yourself how often you have fire drills at your office or even at home. You probably have the former at least once a quarter, but you rarely practice fire safety at home, let alone tell your children what to do if there is a fire. This is the first step in exercising an incident response plan. Typically, these plans require you to call out the roles and responsibilities for all the team members involved, but do they actually know what to do? Do they know what to do when the incident happens while someone is on vacation, in the middle of the night, or during a holiday? Who are their backups? This may sound like a procedure maturity issue, but all too often these procedures name executives and various team members who are themselves unaware of their role or of their tasks and responsibilities. This is why practicing an incident response plan is so important: it rehearses each person's participation, including any context-aware variables that may affect the plan outside of business hours. The results, good and bad, should be rolled back into the plan.
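One way to make the "who are their backups?" question concrete before a drill is to check the roster itself. The following is only a minimal sketch: the role names, the people, and the find_coverage_gaps helper are all hypothetical, and a real plan would likely pull this data from an on-call or HR system rather than a hard-coded list.

```python
from dataclasses import dataclass, field


@dataclass
class RoleAssignment:
    """One role from the incident response plan (all names are hypothetical)."""
    role: str
    primary: str
    backups: list[str] = field(default_factory=list)


def find_coverage_gaps(roster, unavailable):
    """Return the roles for which nobody assigned is reachable.

    `unavailable` is the set of people who are on vacation, asleep, or
    otherwise out of reach in the drill scenario.
    """
    gaps = []
    for assignment in roster:
        candidates = [assignment.primary, *assignment.backups]
        if all(person in unavailable for person in candidates):
            gaps.append(assignment.role)
    return gaps


# Hypothetical roster; "Legal contact" has no backup on purpose.
roster = [
    RoleAssignment("Incident commander", "alice", ["bob"]),
    RoleAssignment("Legal contact", "carol"),
    RoleAssignment("Communications lead", "dave", ["erin"]),
]

# Drill scenario: the incident happens during a holiday.
print(find_coverage_gaps(roster, unavailable={"alice", "carol"}))
# prints ['Legal contact']
```

Running the same check against a few scenarios (holiday, middle of the night, a key executive traveling) gives the drill a list of roles to exercise first.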
A second problem is controversial and revolves around transparency. How much information should you disclose internally, to team members, and to the press or law enforcement? During practice exercises, hypothetical scenarios should always include some form of catastrophic use case. This could include access to crown jewels or data leakage that could be a ‘game over’ event for the business, and it could include aspects that carry human liability, such as illicit photos or behavior. Why? Teams need to learn how to communicate this information with each other in order to successfully navigate an incident response plan. The team also needs to understand what information not to share, duplicate, or make available outside of a secure and trusted inner circle. For example, if you are building a component for a government contract and there is a suspicion that designs have been inappropriately accessed, sharing the blueprints with incident team members is inappropriate. The data may have been privileged before the incident, and the policy for its control must continue to be monitored and adhered to during an incident response. A possible leak does not remove existing security procedures, and teams need to learn how and what to communicate. The same is true for credit card data or other personally identifiable information. Just because it may have been leaked does not mean the files should be treated with any less security moving forward; that rules out, for example, putting the data on an insecure share for preservation. Teams need to learn how to talk, know what can and can’t be said, and what can and can’t be shared. Data transparency during an incident needs to be considered as part of the social aspects of any response plan.
Worst-Case Scenario Planning
One of the primary goals after any incident is to return to a normal, steady operating state. The criteria may include remediating systems, restoring backups, and preserving logs and files suspected to have been manipulated during an incident. The human factor in role playing is time dependent. With a catastrophic incident as the worst-case scenario, teams need to consider how long restoring services may take, including the time needed to collect forensic information. Without effective role playing, an hour-long meeting in a conference room to review an incident response plan will never reveal the true pain of an outage that could last days if domain controllers, virtual instances, and infrastructure need to be restored. The cost to the business may dictate the need to leverage disaster recovery procedures or high availability options as a method to mitigate lost revenue during an incident.
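To put rough numbers on that cost, a drill can record how long each restoration task actually takes and multiply the total by an hourly revenue figure. The sketch below is an assumption-laden illustration: the estimate_outage helper, the task names, the hours, and the revenue rate are all made-up placeholders, and it ignores indirect costs such as fines or reputational damage.

```python
def estimate_outage(restore_hours, revenue_per_hour, parallel=False):
    """Estimate downtime and lost revenue for a worst-case rebuild.

    restore_hours: mapping of task name -> estimated hours (placeholders).
    parallel: if True, assume tasks run concurrently, so downtime is the
              longest single task; otherwise assume they run back to back.
    """
    hours = restore_hours.values()
    downtime = max(hours, default=0) if parallel else sum(hours)
    return downtime, downtime * revenue_per_hour


# Hypothetical figures; replace them with times measured during practice.
worst_case = {
    "collect forensic images": 12,
    "rebuild domain controllers": 18,
    "restore virtual instances": 30,
}

for parallel in (False, True):
    downtime, lost = estimate_outage(
        worst_case, revenue_per_hour=5_000, parallel=parallel
    )
    mode = "parallel" if parallel else "sequential"
    print(f"{mode}: {downtime} hours down, about ${lost:,} in lost revenue")
# prints:
# sequential: 60 hours down, about $300,000 in lost revenue
# parallel: 30 hours down, about $150,000 in lost revenue
```

Comparing the sequential and parallel figures is one simple way to argue for or against investing in disaster recovery or high availability.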
This kind of practice exercise obviously touches team members beyond the core incident response team, which only emphasizes the first concern I discussed: roles and responsibilities. Incidents may have associated outages and then again, they may not. Planning, role playing, and practice exercises should consider the worst case, just in case an environment needs to be completely rebuilt. If you think that could never happen, ask someone who has had a domain controller compromised or been a victim of ransomware.
The final piece of any incident response plan, and of practicing it, is self-improvement. Creating a plan, never testing it, and filing it away so you can check the box on a regulatory compliance form is just pathetic. We have fire drills for a reason. We need to learn how to survive when a life-threatening event occurs, and cyber security incidents can be life-threatening to the business, your job, and depending on your line of work, the country or the well-being of others.
Please do not misunderstand my message. I am not suggesting war games. I am suggesting actually following your incident response plan in a mock scenario and on a periodic basis to ensure that people know what to do, how to do it, and what to say. Then, anything that works, fails, or needs clarification gets folded back into the plan before the next scheduled practice. This scheduled practice should align with existing policies for penetration testing and cyber security awareness training, and may involve red and blue teams for the sake of realism.
Incident response plans need to evolve past being a documented procedure. They need to be a working part of any business, and the success of the plan should be measured with appropriate metrics, from response time to loss of services and revenue. Not every incident will be severe, and some may be trivially minor, but a catastrophic incident can happen to any organization. Reading a manual for the first time when it happens is never a good way to manage the crisis. This is why we have fire drills.