NonStop Systems and Applications – Detecting and Protecting against Ransomware
Keith Moore HPE Distinguished Technologist
HPE


Ransomware protection and recovery is a hot topic, especially in the financial services industry (FSI). A lot of this is due to the recent (2018-2021) increase in publicly reported IT attacks. One would assume that this correlates with an actual increase in the number of attacks, but it is always wise to assume that some of the reaction is due to greater visibility as much as to a real increase.
So let’s assume a significant rise in activity specifically relating to hacks that hold systems and/or data for ransom. This is clearly a small subset of overall cyber security events worldwide (10% of computer attacks are ransomware, per Norton Software), but according to several sources ransomware is the #1 most common EXPLOITATION used by criminals who go beyond less disruptive illegal attacks (such as phishing or stealing email data and privacy information).
The most important thing to consider, according to several sources, is that 95% of all ransomware attacks are against Windows and are usually delivered via a Trojan horse DLL or remote access Trojans (RATs). The remaining reported incidents are on Android and OSX (Apple). The reasons that Windows systems are so vulnerable have mostly to do with their ubiquity and the ease of developing intrusive software for them. In other words, Windows is also the platform with the most vulnerabilities to virus attacks (a superset of the ransomware tools mentioned above).
What, realistically, is a true ransomware event?
A ransomware event occurs when malicious activity allows the actor to take control of the operating system and keep some portion of it from being usable by its systems-level users. Most often, the action perpetrated is to obfuscate the data that is managed by the operating system, rendering it useless to the system's user(s). As with all ransom hostage situations, there is a promise to deliver a resolution in return for something. Usually this is something of monetary value, but it could be some other form of extortion.
I say all of this not to shelve the concern, but rather to express the perspective that NonStop users likely have some time to address any future attacks that could come to any computer. NonStop users should develop specific, detailed plans to address this concern at the same level at which they have disaster recovery plans. In all cases, NonStop systems management must address all appropriate computer security procedures and should also focus on data protection and system-wide recovery schemes.
The overlap with disaster recovery processes is apparent, but some specific differences are:
- Ransomware introduces the concept of a fuzzy Recovery Point Objective (RPO). “Fuzzy RPO” is my term, not an industry term. Because most NonStop servers are transactional in nature, the data state used for recovery is ever moving. NonStop audit replication tools provide some possible sources for recovery, but the RPO remains fuzzy until the second item (below) is addressed.
- Known-good. Most NonStop transactional systems do not often go down or stop, nor do they have a simple, definitive perspective of “known good” for the database(s). This is because the data is ever-changing based upon the transactional activities. Audit trails represent a state-in-time, but the time at which they are considered “known good” is a subjective judgment based upon post-facto analysis after an attack.
- When is something wrong? Unless a malicious actor announces an event, how does the NonStop system user know that there has been a malicious event? To be quite open, this is probably the biggest concern in protecting against disastrous events. It is critical to discover an event quickly, so that there is not a lot of time between discovery and recovery. There are several reasons for this, not the least of which is the loss of possible transactional activity during the assaulted timeframes. There are proactive tools provided by HPE NonStop partners that enrich and assist with this need for discovery.
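As an illustration of the discovery problem in the last item, one generic heuristic is to watch for a sudden jump in the byte entropy of data that is normally structured, since encrypted data looks statistically random. The sketch below is only an assumption about how such a check might be written; it is not a feature of NonStop or of any partner tool, and the function names and the 7.5 threshold are hypothetical. Compressed data also scores high, so a check like this can only raise a flag for further analysis.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # Sum -p * log2(p) over every byte value that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose entropy is at or above a (hypothetical) threshold.

    High entropy is a hint, not proof: compressed data also scores high.
    """
    return shannon_entropy(data) >= threshold
```

A sample of a record file that normally scores around 3 to 5 bits per byte and suddenly scores close to 8 would be a reason to investigate before the next backup overwrites an older one.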
Common with Disaster Recovery (DR):
- DevSecOps. Development and Operations must also include security as part of the process. Multi-user code audits and strict change management processes are among the most important protections against all Trojans and against accidental security issues on any platform. As a security proponent, I tell all users that the third most valuable system-side security procedure they could adopt to protect their system would be to enforce multi-party change authorization with multi-party code audits. (What are the other two? Two-factor authentication and RBAC. SIEM is a close #4.)
- Firewalls.
- Audit every add, change, and delete of system data and users.
- Recovery schemes are similar. Recovery time objectives (RTO) and RPOs are both relevant in ransomware planning. Ransomware almost always requires a restoration of data, whereas pure DR plans often do not require a restoration of data to a point in time (assuming active-active DR protection, for example). Often with DR, the data is replicated and synchronized as part of the DR planning process, but with ransomware strategies, the data recovered into the active site must be “clean” as of a point in time. It must reflect the last-known-good state. For DR designs, the limit is that the data is good to the point of the disaster. It’s a subtle but significant difference. This means that for ransomware recovery, in addition to providing a known-good system, the user must recover to an RPO using “known-good-to-a-point” recovery. The use of NonStop Backup and Restore (aka BR2 on NonStop) is not usually sufficient, because the data is in motion and the known-good recovery point is not easily restored from a traditional BR2 backup. This means that combinations of BR2, replication, and recovery from the audit trail play into most designs.
- Often ransomware recovery will require a known-good new server in some form. This is especially true because forensics on the attacked server will often take longer than the RTO allows.
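The “known-good-to-a-point” idea in the list above can be sketched as a small planning step: pick the newest backup taken at or before the agreed known-good time, then replay audit records only up to that time. This is a minimal sketch under stated assumptions. `Backup`, `AuditRecord`, and `plan_recovery` are hypothetical names invented for illustration and do not correspond to BR2, audit trail, or replication product interfaces.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Backup:
    label: str            # hypothetical identifier for a full backup
    taken_at: datetime    # time the backup was taken


@dataclass(frozen=True)
class AuditRecord:
    committed_at: datetime  # commit time of the audited change
    description: str


def plan_recovery(
    backups: List[Backup],
    audit: List[AuditRecord],
    known_good: datetime,
) -> Tuple[Optional[Backup], List[AuditRecord]]:
    """Choose a base backup and the audit records to replay on top of it.

    Anything committed after known_good is treated as suspect and skipped.
    """
    candidates = [b for b in backups if b.taken_at <= known_good]
    if not candidates:
        return None, []  # no backup predates the known-good time
    base = max(candidates, key=lambda b: b.taken_at)
    replay = sorted(
        (r for r in audit if base.taken_at < r.committed_at <= known_good),
        key=lambda r: r.committed_at,
    )
    return base, replay
```

The hard part in practice is not this selection but agreeing on the `known_good` timestamp, which is exactly the fuzzy RPO described earlier.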
One last topic that needs to be mentioned:
The very deliberate and aggressive directives to deliver a documented ransomware recovery process, in the style of a Disaster Recovery plan, are driven by the trend mentioned in the first paragraph. The Windows perspective is that the file system is likely to be encrypted by the attacker and that the data needed for recovery will also be unavailable because it also resides on a Microsoft machine.
This is as if NonStop were to recover from a remote NonStop server that had a copy of the data, again much like a DR recovery scheme. Therefore, the wider community of IT security pundits imposes a further requirement: create an immutable copy of all data for the purposes of a “known-good” recovery. Putting aside the fact that a backup of data on Microsoft would still need to pass the “known-good” scrutiny for any immutable backup, moving data to a device that cannot be modified is a critical requirement.
NonStop backup data is just data, so NonStop should also avail itself of an immutable destination. Delivery to an immutable store is not difficult, and it ticks the box for protecting a state in time.
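A copy on an immutable store is most useful if it can be checked before it is restored. One generic way to do that, shown here purely as an assumption and not as a NonStop or BR2 feature, is to record a SHA-256 digest when the backup is written and compare it when the backup is read back. The function names are hypothetical, and the backup is modeled as an iterable of byte chunks so the sketch stays self-contained.

```python
import hashlib
import hmac
from typing import Iterable


def digest_of(chunks: Iterable[bytes]) -> str:
    """Return the SHA-256 hex digest of a backup streamed as byte chunks."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def matches_recorded(chunks: Iterable[bytes], recorded_hex: str) -> bool:
    """Compare a freshly computed digest with the one recorded at backup time."""
    return hmac.compare_digest(digest_of(chunks), recorded_hex)
```

A matching digest shows only that the copy is unchanged since it was written. Whether its contents were “known good” at that moment still has to be established separately.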