Prepare to be hacked

In the IoT, a cyberattack exposes an entire network, but unikernels and AI can play a part in making systems secure, says Ian Ferguson of Lynx Software Technologies.

When a connected system is compromised, estimates of the cost of the cyberattack vary widely. This is partly due to embarrassment on the part of the business, which may also be in denial about the scale and frequency of attacks, even though it is more or less inevitable that, sooner or later, its defences will be penetrated.

Alongside implementing best-in-class security to harden systems, enterprises need to plan their response to a successful hacking attack. This should include mechanisms for detecting an intrusion as quickly as possible and limiting its extent and/or closing it down.

The thinking is not dissimilar to the approach for fire safety in buildings. Building managers are required to adhere to a series of regulations to prevent fire breaking out in the first place, covering, for example, the use of flammable materials and the maintenance of electrical appliances. There are further regulations to ensure an effective response when fire does break out, for example, mandating alarms and fire doors to contain the blaze and sprinkler systems to reduce its impact and potentially put it out. New technologies, such as AI, used alongside established technologies such as hypervisors, can provide a robust response to neutralising those hacking attacks that penetrate the system’s defences.



Secure and connected?

Cybersecurity has become a major focus for system designers who are architecting connected embedded platforms. Ecosystem providers such as Arm and Microsoft provide blueprints and best practices for creating more secure platforms. An example of this is the set of specifications associated with Arm’s Platform Security Architecture initiative.

First, work is done to identify the types of attacks that can be instigated on the system, the information that can potentially be accessed and the impact this can have on the system. This will vary by use case. This threat analysis helps to determine the level of effort and type of strategy that will be deployed.

At a high level, the techniques to mitigate the impact of attacks include isolation, least privilege and divide and conquer.

Isolation is ensuring separation (ideally using hardware) between applications to ensure that if an application is compromised, the impact to the system is constrained to a specific area and that critical system functionality is not affected.

Least privilege is ensuring an application has access to only the bare minimum set of system resources that are necessary to perform its function.
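The least-privilege principle can be illustrated with a minimal sketch (the broker, task names and resource paths below are invented for illustration, not from the article): each task declares up front the only resources it may touch, and every access is checked against that fixed allowlist.

```python
# Illustrative least-privilege resource broker: grants are fixed at
# registration, and any access outside the grant set is refused.

class PrivilegeError(Exception):
    pass

class ResourceBroker:
    def __init__(self):
        self._grants = {}  # task name -> frozenset of permitted resources

    def register(self, task, resources):
        # Grants are fixed here; a task cannot widen them later.
        self._grants[task] = frozenset(resources)

    def access(self, task, resource):
        if resource not in self._grants.get(task, frozenset()):
            raise PrivilegeError(f"{task} may not access {resource}")
        return f"{task} -> {resource}"

broker = ResourceBroker()
broker.register("sensor_reader", {"/dev/i2c-1"})

broker.access("sensor_reader", "/dev/i2c-1")       # permitted
try:
    broker.access("sensor_reader", "/etc/shadow")  # denied: not in grant set
except PrivilegeError:
    pass
```

In a real system the enforcement would sit in hardware or the kernel (MMU protection, capabilities) rather than application code; the point is that a compromised task can reach only what it was explicitly granted.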

To divide and conquer, developers shift from a monolithic software architecture to one composed of many application-specific tasks. Many attacks have exploited an OS as a point of vulnerability: breaching it can create an opportunity to directly access critical assets and rewrite memory areas with new information. Shifting to an architecture of multiple simple, dedicated applications provides an opportunity to create more scalable, modular platforms. Unikernels execute only in user mode, thereby limiting the processor instruction set to which the software has access. These architectures are underpinned by hypervisors.

Detecting an intrusion

This raises the bar in terms of cybersecurity and immunity to system attack, but does not do enough to identify and reverse the attacks that manage to get through. There needs to be increased focus on recognising that a system has been hacked: networks have remained compromised for months without anyone noticing. To return to the building analogy, the fire is recognised only when the building is a smoking ruin.

Time is of the essence when a system is hacked. Normally, it will take cyberhackers some time to find valuable assets once they have gained access to a system. If the attack is identified quickly, there is a good chance that most, if not all, of the potential damage can be prevented. Today’s system designer needs to plan for a system to be compromised, focusing technology on early recognition of that fact (the fire alarm), containing it (fire doors) and then returning the system to a known good state (sprinkler).

Developers can use one of two intrusion detection techniques. The first is signature-based, in which a database of known attack identities is created and the system compares activity against it. This approach cannot detect new types of attack, or even a known attack whose signature has been slightly changed.
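A minimal sketch of the signature-based approach (the signatures below are made up for illustration) also shows its stated weakness: a trivial mutation of a known attack slips past the database.

```python
# Signature-based detection: flag a payload only if it contains a byte
# pattern from a database of known attack signatures.

KNOWN_SIGNATURES = [
    b"\xde\xad\xbe\xef",   # invented signature of a known exploit
    b"DROP TABLE",          # invented SQL-injection fragment
]

def matches_signature(payload: bytes) -> bool:
    return any(sig in payload for sig in KNOWN_SIGNATURES)

matches_signature(b"...DROP TABLE users;--")    # known attack: caught
matches_signature(b"...DROP  TABLE users;--")   # one extra space: missed
```

The second call returns False even though the payload is the same attack, which is exactly the limitation described above.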

The second approach is anomaly based. Here, a baseline model for normal behaviour of the system is defined. Any activity outside this is flagged as an anomaly. This sounds like the better approach, but there are two primary challenges to bear in mind.
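The anomaly-based approach can be sketched as follows (the three-standard-deviation band and the CPU-load figures are illustrative assumptions, not from the article): normal behaviour is summarised from training samples, and any new sample outside the resulting band is flagged.

```python
import statistics

def build_baseline(samples):
    # Summarise "normal" as mean +/- 3 standard deviations.
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return mean - 3 * stdev, mean + 3 * stdev

def is_anomaly(baseline, sample):
    low, high = baseline
    return not (low <= sample <= high)

# CPU-load percentages gathered while the system ran in a controlled
# environment (hypothetical training data).
normal_cpu = [21.0, 23.5, 22.1, 24.0, 22.8, 23.2, 21.7, 22.5]
baseline = build_baseline(normal_cpu)

is_anomaly(baseline, 23.0)   # within the normal band: not flagged
is_anomaly(baseline, 95.0)   # sudden spike: flagged as an anomaly
```

The same scheme detects attacks that have never been seen before, provided they perturb the monitored metrics, which is why it is attractive despite the training and false-positive challenges discussed below.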

First, gathering the data to train the system relies on unsupervised learning to establish, from the ground up, what normal behaviour looks like. Today, the path most customers take is to build up the data model by running systems in a controlled environment.

Second, an IoT system can be extremely complex, making it difficult to keep false positives to a minimum (that is, cases where the system is actually fine, but the software flags it as having been compromised).

AI for intrusion detection

One of the greatest opportunities for AI is intrusion detection. For this, the AI resides inside the system, as close to the hardware as possible.

Normal system behaviour creates data logs in the hypervisor, which help to give specific system signatures/profiles for individual tasks. This includes patterns of CPU loads and accesses, memory usage and I/O activity.

Consider the Stuxnet worm which, a little over a decade ago, targeted a specific type of SCADA platform running Windows. Part of this worm was a rootkit: malware designed to gain access to parts of a system that would otherwise be off limits. What makes it especially challenging to detect is that it often masks its own existence, or the existence of other software. In poorly architected systems, that access enables the program to gain privileged system access. Full control over a system means that existing software can be modified, including software that might otherwise be used to detect or circumvent the rootkit. At the hypervisor level, there would be immediate flags; Stuxnet needed around 0.5MB of memory, and the hypervisor would see requests to write to areas of memory which, in a well-architected system, will be identified as protected.
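The hypervisor's view of such writes can be sketched as a simple check (the region addresses and names below are hypothetical): every guest write is compared against the regions the system configuration marks as protected, and any hit is flagged.

```python
# Illustrative hypervisor-style write monitor: flag any guest memory write
# that lands in a region configured as protected.

PROTECTED_REGIONS = [
    (0x0000_0000, 0x000F_FFFF),   # hypothetical kernel code pages
    (0x00F0_0000, 0x00FF_FFFF),   # hypothetical device-config area
]

def check_write(address: int) -> bool:
    """Return True if the write should be flagged as suspicious."""
    return any(start <= address <= end for start, end in PROTECTED_REGIONS)

writes = (0x0010_0000, 0x000A_1000, 0x00F2_0000)
alerts = [addr for addr in writes if check_write(addr)]
# Only the two writes into protected regions are flagged.
```

Real hypervisors enforce this in hardware via the MMU page tables rather than a software loop; the sketch shows only the policy being applied.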

Other hypervisor capabilities include API intercept, whereby authorised guest applications or OSes can be notified when another guest executes code in specified memory locations, and can obtain information about the context. There is also API monitoring of specific memory locations for code execution, which involves looking for patterns that could represent malicious activity.
There are also 'pages of interest': watching specified memory pages, for example kernel pages, for reads and writes of sensitive data structures that indicate potential malicious software activity.

There is also secure domain isolation, which ensures that secure data is exposed only to the guests authorised to see it.

There is also hypervisor fingerprinting. Quite a lot has been written about running applications that determine what type of hypervisor is running on a platform with a view to then identifying how best to access it. One example is the ‘back door’ in VMware, where a communication channel exists between the guest and the hypervisor. The back door responds to certain interrupt calls, which would crash a user mode application in a physical machine. The hypervisor therefore needs to include anti-evasion techniques to prevent malicious software from fingerprinting the hypervisor.

To be clear, the interpretation of the data does not occur in the hypervisor; it is performed in a trusted virtual machine that is secured and isolated from other applications. The hypervisor instead acts as a continual auditor of the platform, delivering the gathered data to the appropriate applications for processing and, ultimately, for system decision-making if a breach is identified.

System return

Once a breach has been mitigated, the system needs to be returned to a known good state. From a software point of view, there are two exciting areas.

The first is improved methods of storing snapshots of the system state in memory and accomplishing rapid restore for infected systems. The second is the use of an immutable separation kernel configuration to nullify any detected threats when one or more virtual machines are rebooted.
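The snapshot-and-restore idea can be sketched in miniature (the class and state layout below are invented for illustration): the state is captured while the system is known good, and restored wholesale once a breach is detected.

```python
import copy

class VirtualMachineState:
    """Hypothetical VM state with snapshot/restore of a known good state."""

    def __init__(self, config):
        self.config = config
        self._snapshot = None

    def snapshot(self):
        # Deep copy so later tampering cannot reach the stored state.
        self._snapshot = copy.deepcopy(self.config)

    def restore(self):
        if self._snapshot is None:
            raise RuntimeError("no known good state recorded")
        self.config = copy.deepcopy(self._snapshot)

vm = VirtualMachineState({"services": ["logger", "control_loop"]})
vm.snapshot()                              # capture the known good state
vm.config["services"].append("malware")    # simulated compromise
vm.restore()                               # roll back to the snapshot
# vm.config is back to the pre-compromise configuration.
```

In practice the snapshot lives in memory protected from the guests (for example, held by the separation kernel), so a compromised virtual machine cannot corrupt the state it will be restored to.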

As the sophistication of cyberattacks increases, the capabilities of connected systems must rise to rebuff them. Deployed systems will inevitably encounter new attack methods that were unknown, or not yet conceived, when the system was originally created, so the system architect has to plan for the system to be breached. The focus then turns to early recognition of that fact, a thorough sandboxed approach that minimises access to valuable system assets, and a proven path to returning the system to a known good state.

About The Author

Ian Ferguson is vice-president of sales and marketing, Lynx Software Technologies

