Automating Malware Analysis Operations (MAOps)
I believe that automating analysis is a challenge every malware analyst works on in order to make daily incident investigations more efficient. Cloud-based technologies (CI/CD, serverless, IaC, etc.) are well suited to automating MAOps. In this article, I describe how JPCERT/CC automates malware analysis on the cloud, based on the following case studies.
- Malware C2 Monitoring
- Malware Hunting using Cloud
- YARA CI/CD system
- Surface Analysis System on Cloud
- Memory Forensic on Cloud
Malware C2 Monitoring
Monitoring C2 servers is important for understanding attackers’ activities, and many malware analysts do it on a regular basis. However, if you access C2 servers too many times, you may be blocked by attackers, and furthermore, you may even become a target of the attack. Such problems can be solved with cloud services.
JPCERT/CC uses cloud services to monitor the C2 server used for the lucky visitor scam. Attackers send commands from their C2 servers to the defaced sites to redirect visitors to the scam site. The redirect URLs sent from C2 servers change periodically, so you can find new redirect URLs if you keep monitoring the C2 servers.
In response to this kind of attack, JPCERT/CC automated the process of discovering C2 servers as well as collecting and denylisting the redirect URLs to block redirects to the lucky visitor scam site on the web browser. The detailed flow is as shown in Figure 1.
The system runs on AWS as shown in Figure 2. The serverless service Lambda scans C2 servers and sends the results to Google Safe Browsing via GitHub Actions. Because the C2 servers are monitored from the cloud, even when the attackers impose access restrictions, you can easily get around them by accessing from a different region.
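The denylisting step can be sketched roughly as follows. This is a minimal illustration and not JPCERT/CC's actual code: the function names `normalize_url` and `merge_denylist` are hypothetical, and the sketch assumes the scanner has already extracted candidate redirect URLs from a C2 response.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlsplit


def normalize_url(url: str) -> Optional[str]:
    """Return a normalized http(s) URL, or None if the input is not usable."""
    try:
        parts = urlsplit(url.strip())
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return None
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    # Lower-case the host only; path and query may be case-sensitive.
    netloc = parts.hostname.lower()
    if port:
        netloc += f":{port}"
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme}://{netloc}{path}{query}"


def merge_denylist(known: List[str], observed: List[str]) -> Tuple[List[str], List[str]]:
    """Merge newly observed redirect URLs into the denylist.

    Returns (updated sorted denylist, URLs that were new in this run).
    """
    known_set = set(known)
    new_urls = []
    for raw in observed:
        url = normalize_url(raw)
        if url and url not in known_set:
            known_set.add(url)
            new_urls.append(url)
    return sorted(known_set), new_urls
```

Returning the new URLs separately makes it easy to commit only when something has changed, which keeps the repository history meaningful.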
The C2 server monitoring results (list of redirect URLs) are available in the following GitHub repository:
https://github.com/JPCERTCC/Lucky-Visitor-Scam-IoC
Malware Hunting using Cloud
Malware analysts routinely hunt for malware in order to gather IoCs and other information. They repeatedly download malware from VirusTotal and other resources and analyze it. However, it is impossible to manually analyze every single type of malware because attackers create a huge number of new types every day. In such cases, you can easily build an automated analysis system using cloud services.
JPCERT/CC operates a system that automatically collects Cobalt Strike Beacons from the Internet and stores the analysis results. Figure 3 shows the operation flow of the system. There are a number of Cobalt Strike C2 servers on the Internet, and information collected by many researchers is available on open platforms such as VirusTotal. Based on this information, the system investigates C2 servers and automatically collects and analyzes samples.
This system was developed using AWS serverless services and GitHub. The key steps, connecting to C2 servers and analyzing the malware, are performed on Lambda. In addition to automatically collecting C2 server information, we are also preparing an API (API Gateway) for investigating C2 servers manually. Open-source data alone does not cover all C2 servers, so when an unknown C2 server is discovered, it needs to be analyzed manually. With such an API in place, however, you can investigate a C2 server without downloading samples yourself, simply by providing its URL.
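A Lambda function behind API Gateway that accepts a URL might look roughly like the sketch below. It assumes the Lambda proxy integration, where the request body arrives as a JSON string in `event["body"]`. The names `handler` and `investigate_c2` are hypothetical, and `investigate_c2` is only a placeholder for the actual collection and analysis logic, which is not shown in this article.

```python
import json
from urllib.parse import urlsplit


def investigate_c2(url: str) -> dict:
    """Hypothetical placeholder: the real system would fetch and analyze the Beacon here."""
    return {"url": url, "status": "queued"}


def _response(status: int, payload: dict) -> dict:
    # Response shape expected by the API Gateway Lambda proxy integration.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event: dict, context: object) -> dict:
    """Validate the submitted URL and hand it to the analysis step."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "request body must be JSON"})

    url = body.get("url") if isinstance(body, dict) else None
    if not isinstance(url, str):
        return _response(400, {"error": "missing 'url' field"})

    try:
        parts = urlsplit(url)
    except ValueError:
        return _response(400, {"error": "malformed URL"})
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return _response(400, {"error": "only http(s) URLs are accepted"})

    return _response(200, investigate_c2(url))
```

Validating the input before any outbound connection is made keeps the function from being used to reach arbitrary schemes or malformed targets.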
The analysis results of the Cobalt Strike Beacon collection system are available in the following GitHub repository:
https://github.com/JPCERTCC/CobaltStrike-Config
YARA CI/CD system
Although various tools for automated YARA rule creation have been released, none of them has become a de facto standard. Therefore, malware analysts still need to create YARA rules manually. Since this is a time-consuming task, automating YARA rule creation is an important challenge. No automatic method applies to all types of malware, but automation may be possible when samples follow a known pattern.
JPCERT/CC operates a system that automatically generates YARA rules for malware with specific patterns. As an example, I introduce a CI/CD system that generates YARA rules for HUI Loader, which is used by APT10, Blue Termite, A41APT, etc. HUI Loader loads the encoded malware that serves as the main body, decodes it, and executes it in memory. Refer to the past JPCERT/CC Eyes article for details.
A common problem with loader malware such as HUI Loader is that although the loader can often be detected by malware hunting, the encoded malware itself that it loads often cannot be found. However, in such cases, since the encoding algorithm of the loader is known, it is possible to create YARA rules to detect the encoded malware. Figure 5 shows the operation flow of this system.
This system also uses AWS serverless services and GitHub. It automatically collects samples from VirusTotal, analyzes them, generates YARA rules, and applies the rules to VirusTotal to collect new samples. So far, I have introduced three kinds of automation systems as examples, and in practice you can build a wide range of such systems on AWS with little more than Lambda and S3.
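The idea of deriving a rule for the encoded payload from a known encoding algorithm can be illustrated with the sketch below. It is not the HUI Loader algorithm: a single-byte XOR is used purely as a stand-in, and the function names, rule name, and marker string are assumptions made for the example. The sketch encodes a byte sequence that is expected in the decoded payload with every candidate key and emits the results as YARA hex strings.

```python
def xor_encode(data: bytes, key: int) -> bytes:
    """Stand-in encoder (single-byte XOR); the real loader's algorithm would go here."""
    return bytes(b ^ key for b in data)


def build_yara_rule(rule_name: str, marker: bytes, keys: range) -> str:
    """Emit a YARA rule with one hex string per candidate key."""
    lines = [f"rule {rule_name}", "{", "    strings:"]
    for key in keys:
        hex_bytes = " ".join(f"{b:02X}" for b in xor_encode(marker, key))
        lines.append(f"        $k{key:02x} = {{ {hex_bytes} }}")
    lines += ["    condition:", "        any of them", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed marker: the DOS stub message found in typical PE files.
    marker = b"This program cannot be run in DOS mode"
    print(build_yara_rule("encoded_payload_sketch", marker, range(1, 256)))
```

For a plain single-byte XOR, YARA's own `xor` string modifier already covers this case. Generating the strings in code becomes useful when the encoding is something YARA cannot express directly, which is the situation the system described above addresses.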
The analysis results of HUI Loader are available in the following GitHub repository:
https://github.com/JPCERTCC/HUILoader-research
Surface Analysis System on Cloud
Malware analysts gather information on the latest malware from reports and blog posts published by security vendors. However, vendors often give unique names to the same type of malware or publish only hash values. Therefore, it can be difficult to compare a type of malware to the one which your organization recognizes based on the content of the report alone. If you can instantly identify a type of malware from the hash values in the report (e.g. based on scan results with your YARA rules), you can find the connection between the report and the information your organization is collecting, which can enhance the speed of your research.
Cloud services can be used for this purpose as well. Specifically, this problem can be solved by building a system that operates in the flow shown in Figure 7. A web browser plug-in sends the hash values to the analysis system, which downloads the malware from VirusTotal, analyzes it using various analysis tools such as a YARA scan, and shows the results (Figure 8).
This system can be built on AWS and works as shown in Figure 9. Hash values sent from a web browser plug-in are received by API Gateway, which triggers Batch to launch a Docker container that analyzes the malware. The analysis results are saved as HTML in S3 and are also sent by email via SNS. In addition, since many researchers post malware hash values on Twitter, you can automate this process further by incorporating a function that collects and analyzes such information.
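Picking hash values out of a report or a post is the first step of this flow, and it can be sketched as below. The function name `extract_hashes` is hypothetical, and the sketch classifies hashes by length only (32, 40, or 64 hexadecimal characters), which is an assumption that will occasionally match other hex strings of the same length.

```python
import re
from typing import List, Tuple

# Longest alternative first so a SHA-256 is never reported as a shorter hash.
_HASH_RE = re.compile(r"\b(?:[A-Fa-f0-9]{64}|[A-Fa-f0-9]{40}|[A-Fa-f0-9]{32})\b")
_HASH_TYPES = {32: "md5", 40: "sha1", 64: "sha256"}


def extract_hashes(text: str) -> List[Tuple[str, str]]:
    """Return unique (hash, type) pairs in order of first appearance."""
    seen = {}
    for match in _HASH_RE.finditer(text):
        value = match.group(0).lower()
        seen.setdefault(value, _HASH_TYPES[len(value)])
    return list(seen.items())
```

Lower-casing and de-duplicating before submission avoids analyzing the same sample twice when a report repeats a hash in different forms.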
Furthermore, by defining such an analysis system as IaC, new analysis tools can be easily incorporated and used. For example, you can write a script that extracts the configuration information of a specific type of malware, add it to the IaC definition, and deploy it to the cloud with little effort. A Terraform version of the analysis system described above is available in the following GitHub repository:
https://github.com/JPCERTCC/SurfaceAnalysis-on-Cloud
Memory Forensic on Cloud
During incident investigation, there are cases where a large number of devices need to be investigated at the same time. Performing memory forensics on many devices simultaneously is more than a single analysis machine can handle, and it may take a long time depending on the machine's specifications. These problems can be solved by using cloud services. Figure 10 shows a memory forensic system built on AWS.
When a memory image is saved in S3, it triggers Batch to launch Docker, and the memory image is analyzed by Volatility Framework, a memory forensics tool. The analysis results are saved as HTML in S3 and are also sent by email via SNS. As shown in Figure 11, your analysis becomes more efficient because you can refer to past results instead of repeating the same process. In this system, since Batch launches a separate Docker container for each memory image, multiple devices can be analyzed at the same time.
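The hand-off from an S3 upload to a Batch job can be sketched as follows. How the actual system wires the trigger is defined in its Terraform code, so this is only an assumed variant in which a small piece of glue code receives the S3 event notification. The queue name, job definition name, environment variable names, and the function `build_batch_jobs` are all hypothetical.

```python
import re
from typing import List
from urllib.parse import unquote_plus

JOB_QUEUE = "memory-forensic-queue"      # hypothetical Batch job queue name
JOB_DEFINITION = "volatility-analysis"   # hypothetical Batch job definition name


def build_batch_jobs(event: dict) -> List[dict]:
    """Turn an S3 event notification into one Batch job request per uploaded image."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Batch job names allow only letters, digits, hyphens, and underscores.
        safe_key = re.sub(r"[^A-Za-z0-9_-]", "-", key)[:100]
        jobs.append({
            "jobName": f"memforensic-{safe_key}",
            "jobQueue": JOB_QUEUE,
            "jobDefinition": JOB_DEFINITION,
            "containerOverrides": {
                "environment": [
                    {"name": "IMAGE_BUCKET", "value": bucket},
                    {"name": "IMAGE_KEY", "value": key},
                ]
            },
        })
    return jobs
```

Each returned dictionary is shaped so that it could be passed to `boto3.client("batch").submit_job(**job)`. Because every image becomes its own job, the number of images analyzed in parallel is limited by the Batch compute environment rather than by a single machine.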
The Terraform code to build the above system is available in the following GitHub repository:
https://github.com/JPCERTCC/MemoryForensic-on-Cloud
In Closing
This article has presented just a few examples of using cloud services for MAOps. Many more use patterns are certainly possible. I hope using cloud services can make your daily malware analysis more efficient!
Shusei Tomonaga
(Translated by Takumi Nakano)