Formal and Data-Driven Analytics for Security and Resiliency
of Cyber and Cyber-Physical Systems
1. Automated Verification & Synthesis of Security Configuration
Professor Al-Shaer has a long-standing record of leading a number of projects on modeling complex cyber systems based on their configuration and verifying their behavior against safety and liveness security and resiliency properties. Our model-finding approach can analyze millions of highly interdependent system configurations, device logs, or sensor measurements, and construct a deterministic or probabilistic model checker for analyzing critical properties. Prof. Al-Shaer has developed novel models based on Binary Decision Diagrams (BDDs), SAT, Satisfiability Modulo Theories (SMT), and stochastic model checking to represent the cyber and cyber-physical configurations (e.g., access control rules) of large-scale enterprises, SDN, clouds, WSN, advanced metering infrastructure, and energy management systems. Developing a formal model that is accurate, canonical, and scalable is a key challenge that we addressed in this research. The following are selected projects in this area:
1.1. Configuration Verification
(a) ConfigChecker-- Security configuration analytics for large-scale enterprises [INFOCOM04, JSAC06, ICNP05, ICNP09, SafeConfig10-13, SafeConfig16]: Network systems contain thousands of devices such as routers, firewalls, IPSec devices, intrusion detection and prevention systems, wireless access points, virtual machines, and end hosts, which together comprise millions of configuration parameters (e.g., rules). Although these devices are configured in isolation, their configurations are often interdependent, which complicates validating the enforcement of global security properties in cyber systems. This inherent complexity of cyber configuration significantly increases human error and makes device misconfiguration, policy inconsistency, and security violations highly likely. It has been widely reported that many security breaches are due mainly to misconfigurations in the network: “It is estimated that configuration errors enable 65% of cyber-attacks and cause 62% of infrastructure downtime” (Network World, July 2006).
To address these challenges, we developed a novel formal approach that models the global network behavior based on the access control configurations of routers, IPSec, firewalls, NAT, IDS, wireless access points, and end-host access controls. We model the semantics of access control policies using Binary Decision Diagrams (BDDs) and then use computation tree logic (CTL) and symbolic model checking to verify network reachability and security requirements. We have implemented our approach in a tool called ConfigChecker that can scale to thousands of devices and millions of configuration rules [ICNP09, Safeconfig12, Safeconfig13]. A novel taxonomy of policy conflicts and misconfigurations for firewalls and IPSec devices, with soundness and completeness proofs, has also been presented [ICNP05, COMMAG06].
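As a simplified illustration of the approach, consider composing per-device policies along a path and checking a reachability property. In this sketch, plain Python predicates stand in for the BDD-encoded Boolean formulas, and all rules and addresses are hypothetical:

```python
# Minimal sketch of policy composition for reachability checking.
# ConfigChecker encodes these predicates as BDDs; here plain Python
# functions over packet fields stand in for the Boolean formulas.

def firewall_policy(pkt):
    # Hypothetical rule set: allow TCP/80 and TCP/443, deny the rest.
    return pkt["proto"] == "tcp" and pkt["dport"] in (80, 443)

def router_acl(pkt):
    # Hypothetical ACL: block a quarantined subnet.
    return not pkt["src"].startswith("10.9.")

def reachable(pkt, path_policies):
    """A packet reaches the destination iff every device on the
    path permits it (conjunction of per-device predicates)."""
    return all(policy(pkt) for policy in path_policies)

path = [router_acl, firewall_policy]
web = {"src": "10.1.2.3", "proto": "tcp", "dport": 443}
ssh = {"src": "10.1.2.3", "proto": "tcp", "dport": 22}
assert reachable(web, path) and not reachable(ssh, path)
```

With a BDD encoding, the same conjunction is computed symbolically over all packets at once rather than per packet, which is what makes the analysis scale to millions of rules.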
(b) SDNChecker-- Security configuration analytics for SDN [SafeConfig16]: In this project, we developed an SMT-based bounded model checking framework and tool, called SDNChecker, to verify the OpenFlow data plane of a software-defined network against a high-level enterprise security policy. The SDNChecker framework models the OpenFlow switch's internal pipeline processing and captures packet transformation and forwarding both internally (i.e., across multiple flow tables) and externally (i.e., across different switches). SDNChecker provides a high-level interface to (1) express a wide range of network properties as temporal expressions based on LTL, and (2) define special-purpose variables as arithmetic relations, such as aggregation functions, that the model checker considers when monitoring the system state against user-defined properties. SDNChecker has the flexibility to verify general reachability, security, and resiliency properties. We demonstrated many case studies for discovering conflicts and verifying network reachability with guaranteed quality of service and risk-aware policy constraints. In our performance evaluation, we show that SDNChecker can scale to OpenFlow networks of thousands of nodes (switches) and hundreds of thousands of rules.
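The multi-table pipeline semantics that SDNChecker encodes in SMT can be illustrated with a small self-contained sketch; the flow tables and actions here are hypothetical, and the reachability property is checked by exhaustive enumeration rather than by an SMT solver:

```python
def process(tables, pkt):
    """Walk a switch's flow tables in pipeline order. Each table maps
    a (field, value) match to an action: ('goto', next_table),
    ('out', port), or ('drop', None). A table miss drops the packet."""
    table = 0
    while True:
        for (field, value), (kind, arg) in tables[table].items():
            if pkt.get(field) == value:
                if kind == "goto":
                    table = arg
                    break
                return arg if kind == "out" else None
        else:
            return None  # no rule matched: drop

# Hypothetical two-table switch: table 0 admits VLAN 10 and drops
# VLAN 20; table 1 forwards by destination host.
s1 = [
    {("vlan", 10): ("goto", 1), ("vlan", 20): ("drop", None)},
    {("dst", "h2"): ("out", "p2")},
]
# Bounded "verification": enumerate the small packet space and check
# the property "exactly VLAN-10 traffic to h2 reaches port p2".
for vlan in (10, 20):
    assert (process(s1, {"vlan": vlan, "dst": "h2"}) == "p2") == (vlan == 10)
```

The SMT formulation replaces this enumeration with symbolic packet variables, so the property is checked over the full header space in one solver query.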
(c) CloudChecker-- Security configuration analytics for the cloud [SCC13-1, SCC13-2]: In this project, we developed a risk-aware access control technique that effectively enforces fine-grain isolation and defense-in-depth in the cloud. Our approach involves two key techniques: (1) security-aware resource allocation of VMs, such that each VM is assigned to security groups according to its risk exposure and induced impact, and (2) risk-aware VM placement planning that reduces the global residual risk in the cloud. As risk can change dynamically, the VM placement and group assignment should also be adaptive. Thus, to provide adaptive access control configuration, we developed a provably-safe VM migration planning technique that guarantees a valid VM migration sequence satisfying the service dependency, security requirements, and performance constraints in the cloud. The problem was formalized as a Constraint Satisfaction Problem and solved using Satisfiability Modulo Theories (SMT) solvers.
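The flavor of the placement problem can be sketched as a small constraint satisfaction search; this toy brute-force solver stands in for the SMT solver, and the hosts, VMs, capacities, and risk scores are all illustrative:

```python
from itertools import product

def place_vms(vms, hosts, capacity, risk, conflicts, max_risk):
    """Brute-force stand-in for the SMT solver: find an assignment of
    VMs to hosts that respects host capacity, keeps conflicting
    (high-exposure vs. high-impact) VMs apart, and caps per-host
    aggregate risk. Returns a dict or None if unsatisfiable."""
    for assignment in product(hosts, repeat=len(vms)):
        placement = dict(zip(vms, assignment))
        by_host = {h: [v for v in vms if placement[v] == h] for h in hosts}
        if any(len(vs) > capacity[h] for h, vs in by_host.items()):
            continue  # capacity constraint violated
        if any(placement[a] == placement[b] for a, b in conflicts):
            continue  # isolation constraint violated
        if any(sum(risk[v] for v in vs) > max_risk for vs in by_host.values()):
            continue  # per-host residual-risk cap violated
        return placement
    return None

# Hypothetical inputs: two hosts, three VMs, one isolation constraint.
vms = ["web", "db", "scanner"]
hosts = ["h1", "h2"]
capacity = {"h1": 2, "h2": 2}
risk = {"web": 3, "db": 1, "scanner": 5}
plan = place_vms(vms, hosts, capacity, risk, [("scanner", "db")], max_risk=8)
assert plan is not None and plan["scanner"] != plan["db"]
```

An SMT encoding expresses the same three constraint families declaratively and scales far beyond this exponential enumeration.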
(d) SensorChecker-- Configuration Verification of Wireless Sensor Networks: In this project, we developed static analysis techniques to verify the coverage and the delivery of sensors' activities based on WSN configurations, such as forwarding information and awake/dormant sensor schedules generated by WSN protocols and algorithms. Our contribution is two-fold. First, we create a scalable model that represents the end-to-end reachability and network coverage behavior of a WSN based on sensor configuration, using Binary Decision Diagrams (BDDs) and symbolic model checking, and then define generic reachability and coverage properties using Computation Tree Logic (CTL). Second, we encode the WSN topological information using Boolean functions to verify constraint-based reachability and coverage properties for mission-critical WSNs, and we show soundness and completeness. We implemented this scalable formal approach in a tool, called SensorChecker, and validated its scalability and performance on very large sensor networks (tens of thousands of sensor nodes) and wakeup scheduling parameters.
(e) ACDChecker-- Verification of Active Cyber Defense [SafeConfig16]: Active Cyber Defense (ACD) is the capability of cyber systems to respond to threats or attacks automatically by dynamically reconfiguring themselves without any human assistance. ACDChecker is a project to verify the safety and effectiveness of ACD operation. ACDChecker provides a specification language to define the cyber mission and the ACD policies, and it verifies whether the ACD policies can be implemented effectively without jeopardizing the cyber mission at any point [SafeConfig16].
1.2. Automated Configuration Synthesis
(a) Metric-driven Multi-Layer Security Architecture & Configuration Synthesis [TPDC16, ICDCS13, NDSS13, INFOCOM10, CNS11]: Determining optimal security architectures that specify the required countermeasures, device locations, and configurations, while balancing contending design constraints including risk, usability, and cost, has been a holy grail problem in cybersecurity for many years. This project was funded by the NSA Science of Security Lablet to investigate formal-method techniques and develop tools that synthesize security architecture specifications and configurations considering risk, usability, and cost metrics [ICDCS13, NDSS13]. The proposed approach enables automatic identification and placement of multi-layer security controls (e.g., firewall, IDS, IPSec, NAT, proxy, sandboxing, and virtualization) to maximize/enforce isolation while satisfying connectivity, usability, and cost requirements. Our approach employs automated hypothesis generation and evaluation for refinement and validation of the synthesis results. Our framework also automates the placement of network security devices, the deployment of rules, and the creation of external and internal Demilitarized Zones (DMZs) to increase isolation and security.
(b) ThreatZoom-- Automated Cyber Risk Mitigation [CNSM16]: The goal of this project is to identify the most cost-effective defense actions for reducing cyber risk due to software vulnerabilities or configuration weaknesses. ThreatZoom analyzes XCCDF, end-host vulnerability scanning and compliance reports, and network configuration, and determines the global risk based on threat exposure and impact, two new metrics we developed for measuring cyber risk. ThreatZoom optimizes the cybersecurity return on investment (ROI) by considering local and global risk as well as the cost of the elected countermeasures, which include patching, blocking, inspecting, authentication, encrypting, and other fine-grain defense actions.
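The ROI optimization can be sketched as selecting, under a budget, the subset of defense actions with the greatest risk reduction; the actions, costs, and risk-reduction values below are illustrative only, and the exhaustive search stands in for the actual optimizer:

```python
from itertools import combinations

def best_plan(actions, budget):
    """Exhaustive stand-in for the ROI optimizer: pick the subset of
    defense actions with the largest total risk reduction whose total
    cost fits the budget."""
    names = list(actions)
    best, best_reduction = (), 0.0
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            cost = sum(actions[a][0] for a in subset)
            reduction = sum(actions[a][1] for a in subset)
            if cost <= budget and reduction > best_reduction:
                best, best_reduction = subset, reduction
    return set(best), best_reduction

# (cost, risk reduction) per action -- hypothetical numbers.
actions = {"patch": (4, 7.0), "block": (2, 3.0),
           "inspect": (3, 4.0), "encrypt": (5, 5.0)}
plan, reduction = best_plan(actions, budget=7)
assert plan == {"patch", "inspect"} and reduction == 11.0
```

In practice the risk-reduction terms are not independent and additive as assumed here; ThreatZoom's metrics couple local and global risk, which is what makes the real problem harder than a knapsack.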
(c) SensorPlanner-- Mission-oriented Resilient Configuration Synthesis for Wireless Sensor Networks: In this work, we developed an automated synthesis technique for correct yet resilient “distributed” sensor configurations, including placement, sampling rate, power management, orientation, and communication/forwarding, to guarantee the sensing/monitoring coverage and event reporting stated in the mission requirements, while satisfying security requirements, such as communication credentials, and resiliency requirements in the presence of failures or malicious sensors. Our formal approach is based on Satisfiability Modulo Theories (SMT) to find satisfying coverage and reachability configurations to execute the mission. The presented approaches were implemented using the Yices SMT solver and evaluated extensively to show the feasibility and scalability of deploying our solutions in real-life WSNs. As a case study, we consider multi-mode PTZ sensor cameras for surveillance applications.
2. Data-driven Cybersecurity Analytics for Cyber & Cyber-Physical Systems
Professor Al-Shaer has been leading a number of projects on data-driven cybersecurity analysis for cyber threat intelligence, predictive analytics, fault diagnosis, structural health monitoring, spambot detection, stealthy DDoS detection, security crowdsourcing through games, and security and risk metrics.
(a) TTPDrill-- Automated Threat-Action Extraction and Attack Pattern Inference from Unstructured Text of Cyber Threat Intelligence [ACSAC17]: An innovative approach/tool that analyzes the unstructured text of cyber threat intelligence (CTI) to enable the following capabilities: (1) extracting fine-grain, low-level micro threat actions, including observables and indicators; (2) inferring a complete attack pattern expression by extracting the temporal and logical correlation of threat actions; (3) mapping threat actions to high-level attack behavior of the TTP kill chain based on MITRE ATT&CK; and (4) generating rich and accurate STIX 2.0 automatically from the unstructured text of CTI resources [ACSAC17]. To accomplish these goals, we propose a combination of novel CTI-aware analytic techniques from Natural Language Processing, Text Mining, and Information Retrieval that are well adapted to CTI domain-specific challenges. The proposed analytics are driven by a domain-specific ontology, called the TTP Ontology, that we developed specifically for CTI information.
TTPDrill enables many novel cybersecurity applications, including (1) automated generation of cost-effective courses of action to detect, prevent, or deceive attackers, (2) a TTP Dictionary that creates a graph representing thousands of threat actions, their definitions, synonyms, and relationships to cybersecurity terms and kill chains, and (3) automatic creation of rich and accurate STIX 2.0 reports.
(b) CyberARM-- Automated Multidimensional Decision-Making for Cyber Risk Mitigation: CyberARM is an innovative approach to automating decision-making for optimal cybersecurity planning and risk mitigation. The key objective of CyberARM is to recommend the cyber defense investment that maximizes cybersecurity Return on Investment (ROI) under risk, budget, and resiliency constraints by determining the optimal set of critical security controls (CIS CSC) in a multi-dimensional decision matrix (called the Cyber Defense Matrix) that specifies the following:
(1) What to enforce: which security function (Identify, Protect, Detect, Respond, and Recover) based on the NIST Cybersecurity Framework,
(2) Where to enforce: to decide on enforcement-level (People, Network, Device, Application, Data), and
(3) When to enforce: to decide on the enforcement-trigger in the attack kill-chain (Recon, Weaponize, Deliver, Exploit, Install, Execute, and Maintain).
CyberARM includes predictive analytic models that analyze historical incident reports (such as VERIS) to forecast the most critical threats, assets, and impacts. CyberARM then computes a provably-correct plan that guarantees a bounded residual risk under specific budget constraints [CyberARM].
(c) Predicting Malicious URLs using Prioritized Active Learning [ISI17]: Obtaining data with annotations (i.e., ground-truth labels) is very challenging and a well-known limitation of supervised data-driven analytics in cybersecurity. Because annotation is a largely manual task requiring substantial expert intervention, significant portions of large datasets typically remain unlabeled. In this work, we developed an effective active learning approach that addresses this limitation in a practical cybersecurity problem, phishing categorization, using a human-machine collaborative, semi-supervised design.
An initial classifier is learned on a small amount of annotated data and then gradually refined in an iterative manner by shortlisting, from the large pool of unlabeled data, only the samples most likely to influence classifier performance. In this batch learning framework, a machine-driven, priority-based active learning step down-selects a very small set of relevant unannotated samples for further human analysis, achieving faster convergence in classification performance while placing a much smaller annotation burden on the human analyst.
Our approach shows very promising classification performance for categorizing phishing/malicious URLs without requiring a large amount of annotated training samples. In experiments with several collections of PhishMonger's Targeted Brand dataset, the proposed method improves over the baseline by as much as 12%.
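The prioritization step can be sketched as uncertainty sampling: rank the unlabeled pool by the classifier's decision margin and send only the most ambiguous samples to the analyst. The token-based scorer below is a hypothetical stand-in for the actual phishing classifier:

```python
def margin(score):
    # Distance from the 0.5 decision boundary; smaller = less certain.
    return abs(score - 0.5)

def select_batch(classifier, unlabeled, k):
    """Prioritized active learning step: rank the unlabeled pool by the
    classifier's decision margin and return the k samples closest to
    the boundary (most informative) for human annotation."""
    return sorted(unlabeled, key=lambda x: margin(classifier(x)))[:k]

# Toy classifier: score a URL by the fraction of suspicious tokens
# (illustrative stand-in for the phishing model).
SUSPICIOUS = {"login", "verify", "secure", "update"}
def score(url):
    tokens = url.split("-")
    return sum(t in SUSPICIOUS for t in tokens) / max(len(tokens), 1)

pool = ["bank-login-verify-update", "news-site-sports",
        "secure-mail-portal", "weather-today"]
batch = select_batch(score, pool, k=1)
assert batch == ["secure-mail-portal"]
```

Each selected batch is labeled by the analyst, added to the training set, and the classifier is retrained, which is what drives the faster convergence described above.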
(d) Predicting Zero-day Malicious IP Addresses: Many malware infections, malicious URLs, botnet communications, and Advanced Persistent Threats can be detected by matching against known blacklisted IP addresses. However, today's IP blacklist providers rely on already-observed malicious activities collected from multiple sources around the world. Attackers can evade these reactive IP blacklist defenses by using IP addresses that have not recently been engaged in malicious activities. In this work, we exploit configuration and domain correlation to predict IP addresses that are likely to be used in malicious activities in the near future. Our evaluation shows that this approach can detect 88% of zero-day malware instances missed by the top five antivirus products. It can also block 68% of phishing websites before they are reported by PhishTank.
(e) Data-driven Probabilistic and Evidential Reasoning for Fault Diagnosis: Professor Al-Shaer proposed several reasoning techniques for analyzing system events and determining the root cause of failures and performance problems. In his early work, he developed an integrated fault diagnosis technique that uses passively collected system artifacts (e.g., symptoms, alerts, user observations) to generate hypotheses about the root cause of a fault [DSN09, INFOCOM09c, INFOCOM08a, TNMS08, IM05]; he then used Bayesian analysis and belief networks to evaluate these hypotheses and determine the optimal (minimum-cost) active probing required for high-confidence root cause analysis. The simulation study shows that both the performance and the accuracy of fault diagnosis were greatly improved using this technique. In our recent work [DSN09, CNSM12-1, TSNM12], we proposed a new direction for fault diagnosis in overlay networks using only end-user (negative) observations, without deploying any passive or active probing sensors. Our approach uses constraint satisfiability to reduce the fault scenarios and then Dempster-Shafer evidential reasoning to rank the selected solutions. Our simulation and PlanetLab evaluations show that this technique is scalable and accurate when reasonable user participation exists. We extended this work to identify the root cause of network performance bottlenecks (congestion) using SMT and plausible reasoning [DSN09, TNSM11, CNSM12, CNSM13, GComm13]. Our technique can determine the “location” and “distribution” of packet loss in the network by analyzing only the user observations [DSN09, INFOCOM09c, INFOCOM08a, TNMS08, IM05, CNSM12-1, CNSM13, TNSM12, SGCOMM13].
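The evidential-reasoning step combines per-user evidence with Dempster's rule of combination, which can be sketched as follows (the mass assignments over candidate faulty links are hypothetical):

```python
def combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose
    focal elements are frozensets of candidate faulty components.
    Conflicting (disjoint) evidence is discarded and the remaining
    mass is renormalized."""
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Two end-users' (hypothetical) evidence about which overlay link failed.
L1, L2 = frozenset({"link1"}), frozenset({"link2"})
EITHER = L1 | L2
m_user1 = {L1: 0.6, EITHER: 0.4}
m_user2 = {L1: 0.3, EITHER: 0.7}
belief = combine(m_user1, m_user2)
assert abs(belief[L1] - 0.72) < 1e-9 and abs(belief[EITHER] - 0.28) < 1e-9
```

The combined masses are then used to rank the fault scenarios that survived the constraint-satisfiability pruning.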
(f) Real-time Detection and Mitigation of Low and Slow Stealthy DDoS Attacks: Professor Al-Shaer has several projects in this area with the financial industry. These projects involve using information theory (entropy analysis) to analyze data from financial transaction web logs and to identify critical traffic features for detecting highly stealthy (low-and-slow rate) application-level DDoS attacks. Man-in-the-browser is one manifestation of such a stealthy DDoS attack; it mimics user behavior and is indistinguishable under application-level analysis. The objective of this project is to detect application-level low-and-slow DDoS attacks with high accuracy, make attack evasion difficult, and make the data-driven analytics actionable by enabling risk-based decision making. We identified statistically critical features for anomaly detection and analyzed the correlation of feature combinations using conditional entropy and Markovian analysis to detect malicious behavior, based on real data provided by our industry partners [ ]. Our contributions in this project include the following:
(1) developing new multi-dimensional detectors based on spatial, temporal, and behavioral features from both the network and application levels, (2) quantifying attack evasion for DDoS, (3) developing methods to differentiate normal and botnet distributions, rank ISPs, and monitor the Internet neighborhood, and (4) metrics to verify the feasibility and accuracy of the detectors.
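The conditional-entropy analysis behind these detectors can be sketched concretely: a feature pair whose conditional entropy is low in normal traffic but rises under mimicry is a useful detector dimension. The feature classes and samples below are illustrative, not drawn from the project data:

```python
from collections import Counter
from math import log2

def conditional_entropy(pairs):
    """H(Y | X) in bits, estimated from observed (x, y) samples. A low
    value means feature Y is strongly determined by feature X; a rise
    relative to the normal baseline flags a broken dependency."""
    n = len(pairs)
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    h = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n                 # joint probability P(x, y)
        p_y_given_x = c / marginal_x[x]  # conditional probability P(y | x)
        h -= p_xy * log2(p_y_given_x)
    return h

# Hypothetical per-session features: (request-rate class, think-time class).
normal = [("low", "long"), ("low", "long"), ("high", "short"), ("high", "short")]
bot = [("low", "long"), ("low", "short"), ("high", "long"), ("high", "short")]
# Human sessions: think time fully determined by rate (0 bits);
# the mimicry bot breaks the dependency (1 bit).
assert conditional_entropy(normal) == 0.0
assert conditional_entropy(bot) == 1.0
```

In the detectors, such entropy shifts across feature combinations are tracked over time, with Markovian analysis capturing the temporal dimension.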
(g) Risk Metrics based on Vulnerability History: What is likely to be exploited next in your enterprise? How should different services be ranked based on security risk? These are some of the major questions that this research attempts to answer. Professor Al-Shaer has developed techniques and metrics for (1) predicting the occurrence and severity of future software vulnerabilities (0-day attacks), and (2) creating a stochastic risk ranking of network services by mining the vulnerability history data (NVD) of well-known operating systems and network services [JNSM11, NOMS08].
(h) Security Metrics: Professor Al-Shaer worked on developing various security metrics for measuring the quality of protection for firewalls and network software services (e.g., OS, web services) [ICDCS13, NDSS13, Safeconfig13, Safeconfig12, JNSM11, INFOCOM08b, NOMS12]. As a part of NSA Science of Security Lablet, Professor Al-Shaer developed four metrics for measuring and comparing firewall (single and distributed) policies: (1) risk resistance, (2) risk mitigation, (3) resilience to attack, and (4) manageability/complexity. These metrics were tested and rigorously validated against the ground truth using simulation and real data experiments [Safeconfig13]. Professor Al-Shaer has also developed several other enterprise security metrics including (1) network resistance metric [INFOCOM08b] to measure the potential attack propagation, (2) host vulnerability metric to predict future vulnerability based on vulnerability history from NVD, and (3) stochastic risk ranking metric to order network services in an enterprise based on the risk of future potential vulnerabilities [NOMS12].
(i) Crowd-Sourcing Security Analytics Using Multi-player Games: The goal of this project is to transform network specification and security configuration into a multi-player game. The player traces generated by the game are used to identify unknown system vulnerabilities and to characterize new attacker strategies and human behavior [CNS14-game].
3. Security and Resiliency Analytics for Cyber-Physical Systems—Smart Grids, IoT and Structural Health Monitoring
Our research in the science of security and resiliency of cyber-physical systems covers smart grids, industrial control systems, and the Internet of Things. Our contributions in this area were published in top journals, including ACM TISSEC, IEEE TSG, and IEEE TDSC, and in reputed conference venues, including CCS, INFOCOM, SmartGridComm, ICDCS, DSN, ICCPS, and CNS. The following is a brief description of our research contributions in this area.
(a) Security Compliance Verification, Diagnosis and Remediation of AMI Configurations: Advanced Metering Infrastructure (AMI) is one of the most critical elements of the smart grid. It connects the smart meters at users' premises, the most likely vulnerable points in a smart grid system, to the rest of the smart grid, including the energy management and control systems. In this work, we created a new formal model for AMI data flow analysis to proactively discover vulnerabilities and violations of AMI security policies. Our techniques were implemented in the AMIAnalyzer tool, which provides the following capabilities and contributions: (1) creating a scalable formal model of AMI that considers the reporting and communication configurations of millions of AMI devices (smart meters and collectors), AMI topology, functional and communication interdependencies between AMI devices, and security properties; (2) logical formalization of invariants of AMI operational integrity/resiliency based on user-defined objectives and device interdependencies, and of AMI security properties based on security control requirements in NISTIR 7628, NIST SP 800-82, and NERC CIP guidelines; (3) developing a scalable verification model based on Satisfiability Modulo Theories (SMT) and compositional verification; (4) determining the impact of potential vulnerabilities and violations (counterexamples); and (5) creating a remediation plan automatically. Our approach scales to tens of millions of meters and hundreds of thousands of constraints/properties [INFOCOM12, TSG13].
(b) Proactive Threat Analytics and Mitigation for Energy Management Systems in Smart Grid: The Energy Management System (EMS) is the core component of the smart grid. EMS interconnects the industrial control system with both power generation and distribution (e.g., advanced metering infrastructure) to satisfy power supply-demand requirements and guarantee operational sustainability in the presence of faults. Given the high criticality of EMS, it is important to proactively analyze EMS resiliency against arbitrary attackers of various capabilities. Our goal in this research project is to develop techniques and tools to proactively predict attacks, particularly stealthy and coordinated ones, and to measure their potential impact on EMS, considering three key factors: (1) the EMS components' interdependencies (specifically PMU, state estimation, optimal power flow, contingency analysis, topology mapper, and automatic generation control), (2) a wide range of adversarial capabilities (knowledge, capacity, and resources), and (3) potential attack evasion due to limitations in physical or cyber capabilities (e.g., the number and distribution of sensors). Our proposed model and tool can systematically measure the resiliency of EMS by quantifying the feasibility and impact of stealthy coordinated attacks that can be launched simultaneously, yet undetectably, on various parts of the system. Moreover, our EMS Threat Analytics techniques and tools can also determine cost-effective risk mitigation planning considering both the benefit and the cost of the security controls.
Unlike previous work, our formal model concretely captures the complex interdependency between cyber-physical and physical-physical components of the grid, and provably identifies how this interdependency can be exploited by attackers of arbitrary capabilities to plan undetectable stealthy attacks that can successfully propagate to critical elements of the smart grid such as Automatic Generation Control (AGC) and Optimal Power Flow (OPF). In other words, our EMS Threat Analytics tool can answer critical security/resiliency questions such as: given the EMS physical (e.g., bus system) and cyber (e.g., sensing and connectivity) configurations, what attacks can start and percolate in the system, where, and how? And to what extent can these attacks be devised to adversely impact EMS functionality and/or energy markets?
Our case study experimentation shows that the amplification of stealthy coordinated attacks on both the state estimation and the topology mapper of the smart grid can compromise the integrity of OPF, the most critical part of EMS, and undermine the economic and operational stability of the smart grid. Our contribution to the science of CPS security is threefold. First, it provides a formal foundation to measure the impact of components' interdependency on the attackability of the smart grid. Second, it develops a formal analytic approach to quantify the attacker capabilities required to launch a successful stealthy attack. Third, it quantifies the direct and indirect impact of the attack and the required mitigation techniques.
(c) Resilient Risk Mitigation Planning for EMS: We developed automated synthesis techniques for resilient mitigation plans that are effective against devastating attacks of increasing scale. There are various choices of security measures for hardening the security of state estimation in EMS, including adding or relocating sensors such as PMUs, increasing the number of selected/active sensors through redundancy, and/or enforcing more encrypted communications. Our mitigation planning obtains the most cost-effective solution for a resilient architecture because it considers both the benefit of risk mitigation (based on potential attacks and impact) and the budgetary/cost constraints of each alternative plan. To manage the time complexity, which was a serious challenge in solving this constraint satisfaction problem, we developed a multi-level (compositional) verification approach based on SMT to provide a scalable solution [ICDCS14, DSN14].
(d) Real-time Intrusion Deterrence, Detection, and Mitigation for AMI: Despite efforts to proactively mitigate risk by shrinking the attack surface, attacks on complex systems are often inevitable. Unlike traditional networks, however, the smart grid has unique challenges, such as devices with limited computational power and potentially high deployment cost, which restrict the deployment options for intrusion detectors. On the other hand, we believe that the smart grid exhibits behavior that can be accurately modeled using its configuration and possibly leveraged to design efficient intrusion detectors. In this project, we show, based on our statistical analysis of AMI smart collectors' logs, that AMI behavior is predictable and can therefore be verified against specification invariants generated from the configurations of the AMI devices. We create an accurate model of AMI behavior using a fourth-order Markov chain. We then used the logs to construct a stochastic model checker that can be probabilistically verified against the configuration specifications represented as Linear Temporal Logic properties. Our model is capable of detecting malicious behavior in the AMI network due to intrusions or device malfunction. We validate our approach on a real-world dataset of thousands of meters collected at the AMI of a leading utility provider [TISSEC15, CCS13].
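The core of the detector can be sketched with an order-k Markov model over the device's event symbols (the project uses k = 4); the event alphabet and traces below are hypothetical:

```python
from collections import Counter, defaultdict

class MarkovDetector:
    """Sketch of the AMI behavior model: an order-k Markov chain over
    a device's event symbols. A window containing a transition never
    observed in training is flagged as anomalous."""
    def __init__(self, k=4):
        self.k = k
        self.transitions = defaultdict(Counter)

    def fit(self, trace):
        # Count next-symbol frequencies for every length-k history.
        for i in range(len(trace) - self.k):
            state = tuple(trace[i:i + self.k])
            self.transitions[state][trace[i + self.k]] += 1

    def is_anomalous(self, window):
        for i in range(len(window) - self.k):
            state = tuple(window[i:i + self.k])
            if self.transitions[state][window[i + self.k]] == 0:
                return True
        return False

# Hypothetical collector log: a repeating poll/report/ack/sleep cycle.
normal = ["poll", "report", "ack", "sleep"] * 20
det = MarkovDetector(k=4)
det.fit(normal)
assert not det.is_anomalous(["poll", "report", "ack", "sleep"] * 2)
assert det.is_anomalous(["poll", "report", "report", "report", "report"])
```

The actual system is probabilistic rather than binary: transition probabilities feed a stochastic model checker that verifies LTL properties derived from the device configurations.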
(e) Real-time Multi-layer Intrusion Detection and Prediction for Automatic Generation Control: Energy generation in power systems is usually adjusted based on load feedback. However, due to its dependency on the cyber infrastructure for load monitoring and reporting, generation control is inherently vulnerable to undetected attacks via bad data injection into state estimation. Recent studies have shown that data integrity attacks on generation control can significantly disrupt the energy system. In this work, we present a lightweight yet effective data-driven two-tier intrusion detection system for automatic generation control (AGC). The first tier is a short-term adaptive predictor for system variables such as load and area control error (ACE); it provides a real-time measurement predictor that adapts to the underlying changing behavior of these system variables and flags abnormal behavior in each variable independently. The second tier provides deep state inspection to investigate the presence of anomalies by incorporating the overall system variable correlation using Markov models. Moreover, we expand our second-tier inspection to a multi-AGC environment, where the behavior of one AGC is validated against the behavior of the interconnected AGCs. The combination of tier-1 lightweight prediction and tier-2 offline deep state inspection offers a great advantage in balancing the accuracy and real-time requirements of intrusion detection for AGC environments. Our results show high detection accuracy (95%) under different multi-attack scenarios, and the second tier successfully verified all the injected intrusions.
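The tier-1 predictor can be sketched with a simple exponentially weighted moving average whose residual is thresholded; the smoothing factor, threshold, and readings are illustrative, not the values used in the project:

```python
class AdaptivePredictor:
    """Tier-1 sketch: an exponentially weighted moving average tracks a
    system variable (e.g., load or ACE); a reading whose residual from
    the prediction exceeds `threshold` is flagged. The estimate is only
    updated from readings judged normal, so it keeps adapting to the
    variable's changing behavior without absorbing the attack."""
    def __init__(self, alpha=0.3, threshold=10.0):
        self.alpha, self.threshold = alpha, threshold
        self.estimate = None

    def observe(self, value):
        if self.estimate is None:
            self.estimate = value  # bootstrap on the first reading
            return False
        alarm = abs(value - self.estimate) > self.threshold
        if not alarm:
            self.estimate += self.alpha * (value - self.estimate)
        return alarm

pred = AdaptivePredictor()
readings = [100, 102, 101, 103, 150, 102]  # 150 = injected measurement
alarms = [pred.observe(v) for v in readings]
assert alarms == [False, False, False, False, True, False]
```

Windows flagged by this tier are handed to the tier-2 Markov-model inspection, which checks whether the correlated system variables are jointly consistent.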
(f) Resiliency Synthesis of AMI: The Advanced Metering Infrastructure (AMI) of the smart grid is highly vulnerable to attacks that can impact the operational integrity of the smart grid. Therefore, resilient configurations are important to ensure AMI functionality even under failures due to misconfigurations or attacks. We identify two major aspects of a resilient AMI architecture: (1) the report-scheduling meter configuration, which determines when and how to report power consumption data, and (2) the placement of the smart collectors that connect meters to energy control systems. In this project, we developed formal models and constraints to automatically synthesize a dependable AMI architecture, including meter configuration and smart collector placement, to guarantee service continuity and reachability even if the network is under attack. Our AMI configuration synthesis can consider provisioned resiliency against attacks, such as ensuring that meters can still communicate even if 10% of the meters are malicious. We formalized and implemented our proposed framework using Satisfiability Modulo Theories (SMT) and evaluated the scalability of our model with a large AMI (10,000 meters in one area) [SGCOMM15].
(g) Moving Target Defense for Smart Grid Agility: Recent work has shown that the state estimation process is vulnerable to stealthy attacks in which an adversary corrupts measurements while evading existing bad data detection algorithms and remaining invisible. However, an adversary needs sufficient resources as well as the necessary knowledge to achieve the desired attack outcome. The knowledge required to launch an attack mainly includes the measurements considered in state estimation, the connectivity among the buses, and the power line admittances; uncertainty in this information limits the potential attack space. In this project, we developed new techniques for enabling smart grid agility by frequently randomizing its configuration in order to increase the attacker's uncertainty and cost and to significantly reduce the attack window of opportunity. We developed two provably-safe mechanisms to randomize smart grid configurations that make both state estimation and the advanced metering infrastructure highly resilient/resistant to attacks. The first technique frequently randomizes the state estimation measurements and the topology (e.g., line admittances) to minimize the potential for bad data injection attacks; it was tested and validated using the standard IEEE 14-bus and 30-bus test systems. The second technique randomly changes smart meter configurations, such as the reporting/polling schedule and the meter-collector association, to make the communication pattern in AMI unpredictable, thereby significantly increasing the detectability of malicious behavior even under attack evasion; it was evaluated using synthetic AMI logs based on smart grid data [MTD14-sg].
(h) Automated Security Verification and Hardening of IoT Systems: Adversaries can exploit a misconfiguration or a weakness in order to manipulate or jeopardize the behavior of IoT systems through unauthorized, incorrect, contradicting, malicious, delayed, or prevented actuation. Considering the multi-level and complex interdependency among IoT devices and between IoT and cyber systems, the surface of misconfigurations and attacks becomes enormous.
Our approach develops a flexible, global model that captures IoT behavior based on sensing information and its transformation into actions. The modeling is based on ontologies, SMT-based model checkers, and new security and resiliency metrics expressed in linear-time temporal logic (LTL) and rule-based constraints for automated generation and hardening of firewall rulesets. Our current work advances cybersecurity automation for IoT by (1) developing models and tools for automated verification and synthesis that capture the behaviors of large-scale IoT systems, considering policy configuration and the multi-level interdependencies between IoT devices, IoT systems, and IoT-cyber systems, (2) developing techniques for rigorous conflict analysis and resolution in IoT systems, and (3) creating new IoT-aware firewall models that integrate both state- and context-aware analysis into firewall operation, enabling intelligently dynamic security enforcement and automated reactive mitigation for security and resiliency. Leveraging our active research in security analytics and hardening of IoT systems [IoTChecker], we plan to investigate interdependencies between configuration and policy within the same IoT system and across different IoT systems to predict, off-line and on-line, potential malicious sensing or actuation, enabling the system either to avoid reaching a bad state or to auto-regenerate and recover after reaching it.
(i) Formal Analytics for Diagnosis and Prognostication of Structural Health Monitoring: This multi-disciplinary project brings together computer scientists and civil engineers to offer a unique perspective that redirects the search for diagnostic and prognostic methods in sensor-based infrastructure health monitoring. The research replaces the conventional approach to infrastructure health monitoring with a direct solution to the structural model by means of a Satisfiability Modulo Theories (SMT) solver, constrained by mechanics-based constraints over data collected from sensors. The overarching goal of this project is to facilitate reliable methods for detecting and estimating the severity of damage in civil structures through structural identification. By formalizing the finite element model updating problem as a constraint satisfaction problem rather than an optimization problem, we explore a potential paradigm shift in structural identification that may lead to an effective means of generating multiple or alternative solutions in the presence of measurement noise [IWSHM15, SEMS15, ESD15]. Our contributions include the following:
4. Agility for Cyber and Cyber-Physical Systems
Our goal in this research is to reverse the asymmetry in cyber warfare by embedding agility into cyber systems. Cyber agility is a system property that allows cyber systems to proactively defend against unknown threats by dynamically changing system parameters and defense strategies in a timely and cost-effective fashion, in order to deter, deceive, or resist attacks and mitigate their consequences without degrading the quality of service. In other words, cyber agility provides proactive and resilient defense by deceiving attackers so they cannot reach their goals, disrupting their plans by changing adversarial behaviors, and deterring them by prohibitively increasing the attack cost. Prof. Al-Shaer has contributed extensively to the science and theoretical foundations of cyber agility. He has also developed many metrics, techniques, and prototypes in cyber agility [INFOCOM15-arhm, INFOCOM15-movenet, HotSDN12, Securecomm12, ESORICS13, CNS13-1, CNS13-2, MTD14-storm, MTD14-sg].
(a) Random Host Mutation (RHM): Prof. Al-Shaer developed Adaptive Random Host Mutation (RHM) [SecureComm12, HotSDN12, INFOCOM15-arhm, TIFS16-rhm], which randomly mutates IP addresses to defeat reconnaissance and scanning worm attacks. RHM can also learn and adapt to the adversary's propagation model to increase deception [INFOCOM15-arhm]. The RHM mechanism was formulated using constraint satisfaction logic to ensure mutation unpredictability while preserving network integrity and constraints such as limited table size, routing restrictions, DNS consistency, and reachability and security policy requirements. RHM was implemented and demonstrated on traditional (using RHM proxies) and SDN networks without requiring any changes to the underlying network infrastructure. Our evaluation results, based on mathematical modeling, simulation, and real experimentation, show that our RHM implementation can defeat fast scanning tools (such as NMAP and Nessus) by invalidating at least 97% of their discovery. We also show that non-adaptive RHM can defeat scanning worms by decreasing the number of infected hosts by 40% (for a purely random worm) to 80% (for cooperative worms), and by slowing down the propagation speed by at least 50% [SecureComm12, TIFS16-rhm]. In adaptive RHM, the infection ratio is less than 5% for all worms [INFOCOM15-arhm]. Our implementation also shows that the routing and DNS update overhead is tens of times smaller than random mutation without optimization, and the packet delay due to the mutation process is less than one-tenth of a millisecond [HotSDN12, SecureComm12].
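A minimal sketch of the core RHM idea, collision-free per-epoch virtual IP (vIP) assignment, is shown below. The class name, address pool, and epoch handling are illustrative; the real system encodes routing, DNS, and table-size constraints in a constraint solver rather than a simple random draw.

```python
import random

class RandomHostMutator:
    """Toy Random Host Mutation: each mutation epoch, every real IP (rIP)
    receives a fresh virtual IP (vIP) drawn from an unused pool, with no
    collisions and no reuse of real addresses."""
    def __init__(self, real_ips, vip_pool):
        self.real_ips = list(real_ips)
        # Never hand out a vIP that collides with a real address.
        self.vip_pool = [v for v in vip_pool if v not in set(real_ips)]
        self.r2v = {}

    def mutate(self, rng=random):
        # sample() guarantees distinct vIPs, i.e. a collision-free mapping.
        vips = rng.sample(self.vip_pool, len(self.real_ips))
        self.r2v = dict(zip(self.real_ips, vips))
        return self.r2v
```

In the actual work, unpredictability is balanced against constraints (table sizes, reachability, DNS consistency) that this sketch does not model.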
(b) Spatio-Temporal Host Mutation (STORM): Prof. Al-Shaer has also developed techniques for adversary-aware agility that deceive APT and stepping-stone attacks by virtually changing the network topology at each location in the network [MTD14-storm]. In this project, we investigated new dimensions for IP mutation beyond time and explored a theoretical framework for integrating them efficiently. For instance, we investigated a new spatial IP mutation that mutates each host's view of destination hosts by mapping their IPs to random vIPs based on the source (client) identity, without changing the real IP on the end-host. We call this spatio-temporal random mutation (STORM). STORM's objective is to invalidate key assumptions of reconnaissance by: (1) making any scanning information collected by an adversary at one location (host) invalid at any other location in the network, (2) causing reconnaissance information (collected even on the same host) to expire after a period of time, and (3) disrupting attackers' ability to infer network topology using IP correlation (e.g., the assumption that similar IP addresses are likely to be on the same subnet). Our proposed architecture does not require any changes to end-host configuration; end-hosts continue to use their real IPs (rIPs).
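The spatial-plus-temporal keying can be sketched with a keyed hash, so that the vIP a source sees for a destination depends on (source identity, destination rIP, time epoch). The subnet prefix, identifier format, and two-octet vIP space below are hypothetical, and a real deployment must also resolve vIP collisions per source, which this sketch ignores.

```python
import hashlib
import hmac

def storm_vip(key, src_id, dst_rip, epoch, subnet="10.77"):
    """Toy STORM mapping: derive the vIP for (source, destination, epoch)
    from a keyed hash. Scans collected at one host or in one epoch are
    useless at another host or after the epoch changes; the rIP on the
    end-host itself never changes."""
    msg = f"{src_id}|{dst_rip}|{epoch}".encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    # Use two digest bytes as the host part of the virtual address.
    return f"{subnet}.{digest[0]}.{digest[1]}"
```

Because the mapping is deterministic given the key, the mutation gateway can translate vIPs back to rIPs without per-flow state.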
(c) Random Route Mutation (RRM): To proactively defend against denial-of-service and man-in-the-middle attacks, we developed an agile multipath routing approach called random route mutation (RRM), which combines game theory and constraint satisfaction optimization to determine the optimal strategy for re-routing packets in order to increase attack deterrence, while satisfying the security, performance, and QoS requirements of the network [ESORICS13, CNS13-1]. Two metrics were developed to measure the major parameters that determine the benefits and limitations of this technique.
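A toy version of the route-selection step might look like the following. The path representation and QoS inputs are hypothetical, and the actual RRM work derives the mutation strategy from a game-theoretic and constraint-satisfaction formulation rather than a uniform random choice.

```python
import random

def random_route(paths, latency, demand, max_latency, capacity, rng=random):
    """Toy Random Route Mutation: among precomputed candidate paths, keep
    only those satisfying the QoS latency bound and link capacities, then
    pick one at random so an attacker cannot predict which links carry
    the flow."""
    feasible = [p for p in paths
                if latency[p] <= max_latency
                and min(capacity[link] for link in p) >= demand]
    return rng.choice(feasible) if feasible else None
```

In the published approach, the probability distribution over feasible routes is itself optimized against the attacker model instead of being uniform.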
(d) Fingerprinting Deception: Remote Operating System (OS) fingerprinting is a precursory step for characterizing vulnerabilities and launching attacks on the Internet. In this project, we use a game-theoretic approach to deceive fingerprinting attackers by dynamically generating fake fingerprinting identities, while minimizing the side effects on benign clients [CNS13-2]. We also created metrics to objectively measure the deception imposed on attackers and the disruption imposed on benign clients.
(e) Virtual Network Mutation (MoveNet): Recent stealthy DDoS attacks can be devastating because they can target critical links or devices without sending traffic directly to the victim; thus, they are not detectable by existing network or application analysis tools. We developed a moving target technique, called MoveNet, that makes the discovery of critical links and resources infeasible by randomly changing the virtual network map. MoveNet migrates virtual networks randomly to mutate the physical network footprint and keep critical links away from DDoS targeting, while satisfying bandwidth, VN policy, and quality-of-service requirements [INFOCOM15-movenet].
(f) Wireless Access Point Mutation: Wireless Access Points (WAPs) and clients can be easily discovered and targeted by attacks such as denial of service or eavesdropping, particularly due to the static nature of wireless AP topology and configuration. In this project, we developed formal foundations for two wireless agility techniques: (1) Random Range Mutation (RNM), which periodically and randomly changes AP coverage ranges, and (2) Random Topology Mutation (RTM), which allows random motion and placement of APs in the wireless infrastructure. The goal of these techniques is to proactively defend against attacks such as DoS and eavesdropping by forcing wireless clients to randomly change their WAP association. We developed a Satisfiability Modulo Theories (SMT) based formal framework that optimizes wireless AP mutation while maintaining service requirements, including coverage, security, and energy properties, under incomplete information about the adversary's strategies. Our evaluation validates the feasibility, scalability, and effectiveness of the formal-methods-based approach.
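A brute-force sketch of Random Range Mutation under a coverage constraint is shown below. The AP/client positions, radius options, and coverage test are illustrative stand-ins for the SMT model, which also encodes security and energy constraints.

```python
import random
from itertools import product

def mutate_ranges(aps, clients, range_options, rng=random):
    """Toy Random Range Mutation: randomly pick a coverage radius per AP
    from the allowed options such that every client remains covered by at
    least one AP, forcing clients to re-associate unpredictably."""
    assignments = list(product(range_options, repeat=len(aps)))
    rng.shuffle(assignments)  # randomize which valid assignment is chosen
    for radii in assignments:
        if all(any((cx - ax) ** 2 + (cy - ay) ** 2 <= r * r
                   for (ax, ay), r in zip(aps, radii))
               for cx, cy in clients):
            return dict(zip(aps, radii))
    return None  # no radius assignment keeps every client covered
```

The SMT framework replaces this enumeration with solver-driven search over the same kind of coverage constraints.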
(g) AMI Configuration Mutation: We developed a technique for randomly mutating the configuration of the Advanced Metering Infrastructure (AMI), such as the reporting pattern, meter-collector association, etc., to significantly increase the detectability of stealthy malicious activities (e.g., malware propagation) in the smart grid [CCS13, MTD14-sg]. In all of the above work, we also developed rigorous metrics and performed experimentation to evaluate the benefit and cost (effectiveness) of this defense technology.
(h) Mutable Sensor Set for Evasion-Resistant State Estimation in Energy Management Systems: This technique was described in "Moving Target Defense for Smart Grid Agility" in Section 2.2.
5. Automating Cyber Resilience – Techniques, Verification, and Synthesis
As attacks on cyber and cyber-physical systems are inevitable, defense techniques that are capable of both resisting attacks and recovering from their consequences are required for the sustainability of the system mission. We define system resiliency as the property of a system to deter, resist, and mitigate attacks, rather than merely prevent them, in order to maintain the mission integrity of cyber systems.
Many proactive and reactive resiliency techniques have been proposed in the research literature to achieve this goal. However, insufficient effort has been made to develop formal models and metrics to verify the safety, correctness, and effectiveness of such techniques in specific network environments. Prof. Al-Shaer has contributed extensively to the science and engineering of resilient systems by developing metrics to measure the resiliency of static and evolving systems.
(5.1) Traffic-aware Statistical Firewall Optimization for Adaptive Defense: Professor Al-Shaer proposed two novel adaptive filtering techniques that dynamically adapt the firewall policy based on real-time traffic statistics to optimize packet filtering against abnormal or malicious traffic. The first is a technique for early packet rejection that dynamically inserts the minimum number of rules that can deny the maximum amount of traffic otherwise discarded by the default-deny rule. Although this is an NP-hard problem, we developed efficient off-line and on-line approximation algorithms [INFOCOM06, INFOCOM09a]. This technique shows a 19%-50% improvement in filtering performance even for small- to medium-size policies. The second technique adapts the filtering structure by dynamically reordering matching rules based on the traffic distribution "skewness," characterized by real-time entropy measurements of the traffic, to minimize the average matching cost. This approach significantly improves the average-case performance of firewalls (up to 40%) over previous techniques. Alphabetic trees [INFOCOM06] and bounded-depth Huffman trees of fields and segments [INFOCOM07] were used to build the adaptive filtering structures.
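The rule-reordering idea can be sketched as a frequency-driven reorder that respects rule dependencies: a rule may move ahead of another only if their match sets do not overlap, since swapping overlapping rules would change firewall semantics. The overlap predicate and hit counts below are hypothetical, and the published work uses alphabetic and Huffman tree structures rather than this simple pass.

```python
def reorder_rules(rules, hits, overlaps):
    """Toy traffic-aware rule reordering: move frequently matched rules
    earlier to cut the average matching cost, but never swap two rules
    whose match sets overlap. `hits` maps rule -> observed match count;
    `overlaps` is a predicate on rule pairs (real systems derive it from
    the rules' field values). Adjacent swaps repeated to a fixpoint keep
    all relative-order constraints between overlapping rules intact."""
    order = list(rules)
    changed = True
    while changed:
        changed = False
        for i in range(len(order) - 1):
            a, b = order[i], order[i + 1]
            if hits[b] > hits[a] and not overlaps(a, b):
                order[i], order[i + 1] = b, a
                changed = True
    return order
```

The average matching cost is then the hit-weighted sum of rule positions, which this greedy pass reduces subject to the semantic constraints.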
(5.2) Metrics and Verification of Proactive Resiliency Mechanisms—Isolation, Diversity, and Redundancy: Proactive resiliency techniques, such as diversity, isolation, and redundancy, are usually statically integrated into the cyber architecture and configuration to create built-in attack resistance and enhance the system resiliency. Due to the increasing complexity of cyber configuration and inter-dependency between its components, it is hard to ensure that these techniques are configured correctly and efficiently. Our goal in this work is to develop models and properties to verify the effectiveness, integrity and consistency of proactive static resiliency based on access control configurations in traditional or software defined networks.
(5.3) Resiliency-by-construction Synthesis: We developed a formal model to enable resilient-by-construction development of cyber systems. Our preliminary model has two components: (1) resiliency metrics based on the Cyber Resilience Engineering Framework (CREF), and (2) formal verification to investigate and extend the cyber model so that it exhibits the desired resiliency properties at design time. This work has two objectives. The first is to propose a formal approach to measuring cyber resilience from different aspects (i.e., attacks and failures) and at different levels (i.e., proactive, resistive, and reactive); to achieve this, we propose the CREF framework. The second objective is to build a resilient system by construction: we build a formal model of a cyber system that is initially not resilient against attacks, and then attain resiliency through systematic refinements of the model guided by model checking. We exemplify our technique through a simple case study of firewall security [SafeConfig15]. To reduce the state space generated for model checking as a result of dynamic reactive actions and system evolution, we developed a new slicing-based verification technique that restricts model checking to the part of the model that has evolved, rather than the entire system [CSEA15].
Software Tools and Systems Developed in Research Group
The following are selected tools and systems that were developed and demonstrated by our research team as part of projects funded by government and industry.