Study of Binaries Created with Rust through Reverse Engineering
Rust has attracted attention in recent years as a potential replacement for C and C++ because of its memory safety and high performance. As adoption of the language grows, malware developed in Rust (hereinafter referred to as "Rust malware"), such as the Rust variants of SysJoker and the BlackCat ransomware, has also been increasing. However, knowledge of reverse engineering techniques for Rust malware remains limited compared with the well-established techniques for C/C++ malware. For this reason, JPCERT/CC has published "Study of Binaries Created with Rust through Reverse Engineering", a report that summarizes the results of verifications conducted on the reverse engineering of binaries created with Rust (hereinafter referred to as "Rust binaries").
This article provides an overview of the report.
Contents of the Report
The report selects study items related to the reverse engineering of Rust binaries and summarizes the results of studying and verifying each one. For the full list of study items, please refer to Appendix A. All binaries were compiled in a Windows MSVC environment during the study and verification. The versions of the tools used are listed below.
cargo: 1.82.0
rustc: 1.82.0
IDA Pro: 8.3.230608
Usage Scenarios
Each study item in the report is independent, so readers can go straight to the items that interest them rather than reading the report from start to finish. Some study items include sample programs. We therefore recommend first reviewing the items of interest, then compiling the sample programs and examining the resulting Rust binaries alongside the report.
Conclusion
Rust adoption is growing rapidly, and because Rust binaries are considered relatively difficult to reverse engineer, their abuse by attackers is expected to increase. We hope that this report will be of some help in the reverse engineering of Rust malware. If you find any issues or have comments regarding the content, we welcome your feedback.
Tomoya Kamei
(This document was machine-translated and manually reviewed.)
Appendix A: Study Items
| No. | Title | Overview |
|---|---|---|
| 1 | Differences between binaries, associated with setting modifications of Profiles in Cargo | Study on to what extent approaches that use cargo can reduce binary sizes and what information is left unremoved. Only approaches available from publicly disclosed information are covered. |
| 2 | Reducing binary sizes | Study on to what extent approaches that use rustc can reduce binary sizes and what information is left unremoved. Only approaches available from publicly disclosed information are covered. |
| 3 | Identifying Rust binaries | Study on approaches that determine whether a binary is a Rust binary |
| 4 | Exception Directory | Study on information available from an Exception Directory structure |
| 5 | TLS Directory | Study on information available from a TLS Directory structure and contents of a TLS Callback |
| 6 | Identifying the main function and initialization | Approaches for identifying user-defined main functions |
| 7 | Strings | Approaches for handling strings |
| 8 | Mangling function names | Structure of mangled function names and how to demangle mangled function names |
| 9 | Closure | Behavior of closures and memory layouts to be used |
| 10 | Enum types | Study on how behavior of enum types in Rust is implemented in assembly code |
| 11 | Match statement | Study on how behavior of match statements in Rust is implemented in assembly code |
| 12 | Panic statement | Differences in assembly code between the "unwind" and "abort" panic behaviors |
| 13 | Iterator | Study on how code using iterators or the "next" function is implemented in assembly code |
| 14 | Trait | Differences between calls to a function using traits and to a common function |
| 15 | Identifying typical traits | Approaches for identifying, in assembly code, the traits that the #[derive] attribute uses |
| 16 | Dynamic dispatch reference | Characteristics of assembly code and differences between calls using dynamic and static dispatches |
| 17 | Collection | Memory layout to be used |
| 18 | Identifying functions generated from the same generics | Study on approaches for identifying functions that were generated from the same generic function |
| 19 | Smart pointer | Characteristics and memory layouts of smart pointers |
| 20 | Inline assembly | Characteristic code patterns |
| 21 | Link attribute | Differences in how to link libraries |
| 22 | Repr attribute | Study on how memory layouts change according to specifiable options |
| 23 | How to identify code in standard and third-party libraries | Approaches for identifying statically linked standard and third-party library functions |
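As an illustration of the kind of sample program that can be compiled and examined in a disassembler, the following minimal sketch touches on items 16 (dynamic dispatch) and 22 (Repr attribute). It is not taken from the report: the trait `Shape`, the types `Square`, `Rect`, `ReprC`, and `ReprPacked`, and the functions `area_static` and `area_dynamic` are all hypothetical names chosen for this example. Note also that in optimized builds the compiler may inline or devirtualize these calls, so the resulting assembly may differ from what the source suggests.

```rust
use std::mem::size_of;

// Hypothetical trait and types, used only for illustration.
trait Shape {
    fn area(&self) -> u32;
}

struct Square(u32);
struct Rect(u32, u32);

impl Shape for Square {
    fn area(&self) -> u32 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> u32 {
        self.0 * self.1
    }
}

// Static dispatch: the compiler generates one copy of this function
// per concrete type T it is called with (monomorphization).
fn area_static<T: Shape>(s: &T) -> u32 {
    s.area()
}

// Dynamic dispatch: `&dyn Shape` is a fat pointer (data pointer plus
// vtable pointer), and the call goes through the vtable.
fn area_dynamic(s: &dyn Shape) -> u32 {
    s.area()
}

// With repr(C), fields keep their declared order and are padded
// according to C layout rules.
#[allow(dead_code)]
#[repr(C)]
struct ReprC {
    a: u8,
    b: u32,
    c: u8,
}

// Adding `packed` removes the padding between fields.
#[allow(dead_code)]
#[repr(C, packed)]
struct ReprPacked {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    let sq = Square(3);
    let rc = Rect(2, 5);

    println!("static:  {}", area_static(&sq));
    println!("dynamic: {}", area_dynamic(&rc));

    println!("size of &Square:    {}", size_of::<&Square>());
    println!("size of &dyn Shape: {}", size_of::<&dyn Shape>());
    println!("size of ReprC:      {}", size_of::<ReprC>());
    println!("size of ReprPacked: {}", size_of::<ReprPacked>());
}
```

Comparing the call sites of `area_static` and `area_dynamic` in the compiled binary, and the field offsets used for `ReprC` and `ReprPacked`, is one way to follow along with the corresponding study items.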