
Building ‘Incident Management for Industrial Control Systems’ to address gaps in OT cyber incident response


Industrial cybersecurity programs have matured considerably in the past 10 years, with many organizations spending substantial sums on detection tools, network segmentation, and preventive controls. Yet when incidents happen, response capabilities lag behind these defenses. Playbooks are lacking, roles are undefined, and collaboration between cybersecurity teams, plant operators, and leadership is seldom practiced. In industrial settings, where an interruption can quickly translate into a safety risk or a stopped production line, there is little room for improvisation. It is these gaps that led OT and cybersecurity risk management veteran Durgesh Kalya to write Incident Management for Industrial Control Systems.

Drawing on nearly two decades of experience across IT and OT environments, Kalya argues that organizations remain far better prepared to prevent incidents than to manage them once they unfold, particularly when breaches originate in corporate IT networks rather than on the plant floor, and when confusion over authority, unrealistic isolation expectations, and poor understanding of process dependencies compound the damage.

To give enterprises the coordination structure and decision clarity they need during a cyber incident, the book develops a scalable, repeatable model based on the Incident Command System, tailored for industrial cyber events through the ICS4ICS framework. It explores how cyber incidents can rapidly evolve into operational emergencies with implications for safety, production, and public services and, through case studies, operational guidance, and exercises, shows how the model can strengthen incident response across critical infrastructure sectors, including energy, transportation, and manufacturing.

What distinguishes the book is its insistence that technology alone will not save an organization in crisis. People, communication, authority structures, and cultural readiness are just as decisive in determining whether an incident becomes a manageable disruption or a disaster. Oil pipelines, power grids, and factory floors aren’t immune to cyber attacks, and Kalya is clear that the need to close the gap between prevention and response has never been more urgent.

In the following interview, Kalya discusses with Industrial Cyber the motivation for the book, enduring misconceptions that sabotage OT security programs, and three tenets that he feels every industrial organization must accept to react with confidence instead of chaos.

Origins and Intent

Industrial Cyber: You have nearly two decades of experience in cybersecurity and OT risk management. What specific operational gaps or recurring challenges in industrial incident response compelled you to write this book?

Durgesh Kalya: I repeatedly saw organizations with strong prevention controls but very weak response capability. When incidents occurred, there was confusion about roles, authority, communications, and recovery priorities. In OT, you do not get unlimited time to figure things out. I also saw missing or incomplete checklists, unrealistic expectations from IT teams about how quickly OT assets can be isolated or restored, and a lack of understanding of process dependencies. While OT systems are vulnerable, the majority of the time the initial compromise originates through IT or business devices, remote access, or personnel using corporate networks. My book addresses this gap between cybersecurity programs and real operational response.

IC: When you developed the concept, did you intend it primarily as a practitioner’s field guide, a strategic reference for leadership, or a bridge between operational and executive audiences? How did that positioning evolve as the manuscript took shape?

DK: It started as a practitioner field guide, but it quickly became clear that incident response in industrial environments requires both operational and executive alignment. The final result is a bridge between plant teams, cybersecurity, and leadership, because all three must act together during a crisis. It also became clear that, as we stand at the cusp of another technological evolution, organizations have become heavily technology-focused, while people-to-people coordination and communication remain critical to effective incident response in OT environments. I believe this makes the book especially timely.

Frameworks and Operationalization

IC: In your view, what is the most common misconception IT security professionals have about incident response in industrial control systems?

DK: Coming from working in both IT and OT, I would say it is less a misconception and more a misalignment of priorities and expectations. IT response models focus on rapid containment, while OT environments must prioritize safety, process stability, and controlled recovery. You cannot simply isolate systems or shut things down without understanding operational impact. Availability and safety come first, and recovery can take days or weeks, not hours.

IC: The book draws on the Incident Command System. How do you translate a framework originally designed for emergency management into a practical and sustainable model for cyber incident coordination in OT environments? 

DK: Industrial cyber incidents are operational emergencies, not just technical events. That is why my book is built around multiple pillars, with the Incident Command System as one of the core foundations. ICS brings a proven structure for roles, authority, communication, and decision-making under pressure, something most cyber playbooks lack in industrial settings. By combining ICS with critical infrastructure context and industrial control system realities, I translate these concepts into a practical program, not just a response guide. The goal is to give organizations a repeatable methodology for preparedness, coordination, and recovery, rather than forcing teams to improvise during a crisis.

IC: You introduce the ICS4ICS framework. What gap does it address in the current ICS incident response landscape, and how does it differ from more traditional OT response models?

DK: ICS4ICS adapts FEMA’s ICS model specifically for cyber incidents affecting industrial automation. Traditional IT playbooks do not address plant operations, safety systems, or physical consequences. ICS4ICS, developed through ISA’s Global Cybersecurity Alliance (GCA), provides role definitions, coordination guidance, and exercises tailored for critical infrastructure. My book also explores other established frameworks, recognizing that many organizations already have programs in place, and shows how to integrate or align them rather than replace what is already working.

Threat Environment and Consequence Management

IC: How has the threat landscape for industrial control systems evolved over the past five years, and where do you see persistent misunderstandings about today’s risks to critical infrastructure?

DK: Attacks are now designed to disrupt operations, not just steal data. Ransomware targeting OT networks can halt production for days or weeks, as we saw in the case of Japanese brewer Asahi Group Holdings. A persistent misunderstanding is assuming these events remain contained in IT. In reality, business impact comes from operational disruption.

IC: You emphasize the convergence of cyber events and physical consequences. Can you describe a situation where coordination between IT, OT, and safety teams materially changed the trajectory of an incident?

DK: My book examines real-world case studies, including significant cyber attacks on critical infrastructure, such as the City of St. Paul, Minnesota ransomware incident. In this case, the affected OT systems were tied directly to public safety functions, which raised the stakes far beyond typical IT disruption. It demonstrates how cyber events can quickly become operational emergencies requiring coordination across IT, OT, emergency management, and city leadership. Structured alignment of priorities enabled continuity of critical services while recovery was underway, showing how effective coordination can reduce impact even during a major disruption.

Organizational Readiness

IC: You argue that readiness is as much cultural as it is technical. What structural or organizational changes are most critical for building durable ICS incident response capability?

DK: Start with governance and operating model, not tools. Effective ICS incident response capability requires clear authority, defined workflows, and practiced coordination across IT, OT, safety, and leadership. 

Organizations need predefined decision paths, communication channels, and cross-functional alignment so they can act quickly when production, safety, or public services are at risk. This must be supported by accurate asset visibility, tested backup and recovery, alternate communications, manual or fallback operating procedures, and personnel trained to operate under degraded conditions. Major safety incidents consistently show that failures in communication, situational awareness, and decision clarity often amplify the impact more than the initiating event itself.

Personal Reflection and Strategic Takeaways

IC: As you translated real-world incident experience into a structured framework for the book, were there assumptions you had to reconsider about how industrial organizations approach incident management?

DK: Yes. I assumed most organizations had at least a basic coordinated response structure. In reality, many rely on informal relationships and ad hoc decision-making. That may work during small events, but it breaks down quickly during a major incident, increasing confusion and recovery time.

IC: If readers could implement only three principles from your book to strengthen their incident management posture, what would you want those to be?

DK: Define who is in charge during an incident through clear Delegation of Authority. Establish how decisions and communications will flow across IT, OT, and leadership, and practice realistic scenarios that involve operational teams, not just cybersecurity staff. These steps alone significantly improve an organization’s ability to respond without panic, delay, or conflicting actions.


