266 lines
9.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Malware Analysis
{{#include ../../banners/hacktricks-training.md}}
## Forensics CheatSheets
[https://www.jaiminton.com/cheatsheet/DFIR/#](https://www.jaiminton.com/cheatsheet/DFIR/)
## Online Services
- [VirusTotal](https://www.virustotal.com/gui/home/upload)
- [HybridAnalysis](https://www.hybrid-analysis.com)
- [Koodous](https://koodous.com)
- [Intezer](https://analyze.intezer.com)
- [Any.Run](https://any.run/)
## Offline Antivirus and Detection Tools
### Yara
#### Install
```bash
sudo apt-get install -y yara
```
#### Prepare rules
Use this script to download and merge all the yara malware rules from github: [https://gist.github.com/andreafortuna/29c6ea48adf3d45a979a78763cdc7ce9](https://gist.github.com/andreafortuna/29c6ea48adf3d45a979a78763cdc7ce9)\
Create the _**rules**_ directory and execute it. This will create a file called _**malware_rules.yar**_ which contains all the yara rules for malware.
```bash
wget https://gist.githubusercontent.com/andreafortuna/29c6ea48adf3d45a979a78763cdc7ce9/raw/4ec711d37f1b428b63bed1f786b26a0654aa2f31/malware_yara_rules.py
mkdir rules
python malware_yara_rules.py
```
#### Scan
```bash
yara -w malware_rules.yar image #Scan 1 file
yara -w malware_rules.yar folder #Scan the whole folder
```
#### YaraGen: Check for malware and Create rules
You can use the tool [**YaraGen**](https://github.com/Neo23x0/yarGen) to generate yara rules from a binary. Check out these tutorials: [**Part 1**](https://www.nextron-systems.com/2015/02/16/write-simple-sound-yara-rules/), [**Part 2**](https://www.nextron-systems.com/2015/10/17/how-to-write-simple-but-sound-yara-rules-part-2/), [**Part 3**](https://www.nextron-systems.com/2016/04/15/how-to-write-simple-but-sound-yara-rules-part-3/)
```bash
python3 yarGen.py --update
python3.exe yarGen.py --excludegood -m ../../mals/
```
### ClamAV
#### Install
```
sudo apt-get install -y clamav
```
#### Scan
```bash
sudo freshclam #Update rules
clamscan filepath #Scan 1 file
clamscan folderpath #Scan the whole folder
```
### [Capa](https://github.com/mandiant/capa)
**Capa** detects potentially malicious **capabilities** in executables: PE, ELF, .NET. So it will find things such as Att\&ck tactics, or suspicious capabilities such as:
- check for OutputDebugString error
- run as a service
- create process
Get it int he [**Github repo**](https://github.com/mandiant/capa).
### IOCs
IOC means Indicator Of Compromise. An IOC is a set of **conditions that identify** some potentially unwanted software or confirmed **malware**. Blue Teams use this kind of definition to **search for this kind of malicious files** in their **systems** and **networks**.\
To share these definitions is very useful as when malware is identified in a computer and an IOC for that malware is created, other Blue Teams can use it to identify the malware faster.
A tool to create or modify IOCs is [**IOC Editor**](https://www.fireeye.com/services/freeware/ioc-editor.html)**.**\
You can use tools such as [**Redline**](https://www.fireeye.com/services/freeware/redline.html) to **search for defined IOCs in a device**.
### Loki
[**Loki**](https://github.com/Neo23x0/Loki) is a scanner for Simple Indicators of Compromise.\
Detection is based on four detection methods:
```
1. File Name IOC
Regex match on full file path/name
2. Yara Rule Check
Yara signature matches on file data and process memory
3. Hash Check
Compares known malicious hashes (MD5, SHA1, SHA256) with scanned files
4. C2 Back Connect Check
Compares process connection endpoints with C2 IOCs (new since version v.10)
```
### Linux Malware Detect
[**Linux Malware Detect (LMD)**](https://www.rfxn.com/projects/linux-malware-detect/) is a malware scanner for Linux released under the GNU GPLv2 license, that is designed around the threats faced in shared hosted environments. It uses threat data from network edge intrusion detection systems to extract malware that is actively being used in attacks and generates signatures for detection. In addition, threat data is also derived from user submissions with the LMD checkout feature and malware community resources.
### rkhunter
Tools like [**rkhunter**](http://rkhunter.sourceforge.net) can be used to check the filesystem for possible **rootkits** and malware.
```bash
sudo ./rkhunter --check -r / -l /tmp/rkhunter.log [--report-warnings-only] [--skip-keypress]
```
### FLOSS
[**FLOSS**](https://github.com/mandiant/flare-floss) is a tool that will try to find obfuscated strings inside executables using different techniques.
### PEpper
[PEpper ](https://github.com/Th3Hurrican3/PEpper)checks some basic stuff inside the executable (binary data, entropy, URLs and IPs, some yara rules).
### PEstudio
[PEstudio](https://www.winitor.com/download) is a tool that allows to get information of Windows executables such as imports, exports, headers, but also will check virus total and find potential Att\&ck techniques.
### Detect It Easy(DiE)
[**DiE**](https://github.com/horsicq/Detect-It-Easy/) is a tool to detect if a file is **encrypted** and also find **packers**.
### NeoPI
[**NeoPI** ](https://github.com/CiscoCXSecurity/NeoPI)is a Python script that uses a variety of **statistical methods** to detect **obfuscated** and **encrypted** content within text/script files. The intended purpose of NeoPI is to aid in the **detection of hidden web shell code**.
### **php-malware-finder**
[**PHP-malware-finder**](https://github.com/nbs-system/php-malware-finder) does its very best to detect **obfuscated**/**dodgy code** as well as files using **PHP** functions often used in **malwares**/webshells.
### Apple Binary Signatures
When checking some **malware sample** you should always **check the signature** of the binary as the **developer** that signed it may be already **related** with **malware.**
```bash
#Get signer
codesign -vv -d /bin/ls 2>&1 | grep -E "Authority|TeamIdentifier"
#Check if the apps contents have been modified
codesign --verify --verbose /Applications/Safari.app
#Check if the signature is valid
spctl --assess --verbose /Applications/Safari.app
```
## Detection Techniques
### File Stacking
If you know that some folder containing the **files** of a web server was **last updated on some date**. **Check** the **date** all the **files** in the **web server were created and modified** and if any date is **suspicious**, check that file.
### Baselines
If the files of a folder **shouldn't have been modified**, you can calculate the **hash** of the **original files** of the folder and **compare** them with the **current** ones. Anything modified will be **suspicious**.
### Statistical Analysis
When the information is saved in logs you can **check statistics like how many times each file of a web server was accessed as a web shell might be one of the most**.
---
## Deobfuscating Dynamic Control-Flow (JMP/CALL RAX Dispatchers)
Modern malware families heavily abuse Control-Flow Graph (CFG) obfuscation: instead of a direct jump/call they compute the destination at run-time and execute a `jmp rax` or `call rax`. A small *dispatcher* (typically nine instructions) sets the final target depending on the CPU `ZF`/`CF` flags, completely breaking static CFG recovery.
The technique showcased by the SLOW#TEMPEST loader can be defeated with a three-step workflow that only relies on IDAPython and the Unicorn CPU emulator.
### 1. Locate every indirect jump / call
```python
import idautils, idc
for ea in idautils.FunctionItems(idc.here()):
mnem = idc.print_insn_mnem(ea)
if mnem in ("jmp", "call") and idc.print_operand(ea, 0) == "rax":
print(f"[+] Dispatcher found @ {ea:X}")
```
### 2. Extract the dispatcher byte-code
```python
import idc
def get_dispatcher_start(jmp_ea, count=9):
s = jmp_ea
for _ in range(count):
s = idc.prev_head(s, 0)
return s
start = get_dispatcher_start(jmp_ea)
size = jmp_ea + idc.get_item_size(jmp_ea) - start
code = idc.get_bytes(start, size)
open(f"{start:X}.bin", "wb").write(code)
```
### 3. Emulate it twice with Unicorn
```python
from unicorn import *
from unicorn.x86_const import *
import struct
def run(code, zf=0, cf=0):
BASE = 0x1000
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(BASE, 0x1000)
mu.mem_write(BASE, code)
mu.reg_write(UC_X86_REG_RFLAGS, (zf << 6) | cf)
mu.reg_write(UC_X86_REG_RAX, 0)
mu.emu_start(BASE, BASE+len(code))
return mu.reg_read(UC_X86_REG_RAX)
```
Run `run(code,0,0)` and `run(code,1,1)` to obtain the *false* and *true* branch targets.
### 4. Patch back a direct jump / call
```python
import struct, ida_bytes
def patch_direct(ea, target, is_call=False):
op = 0xE8 if is_call else 0xE9 # CALL rel32 or JMP rel32
disp = target - (ea + 5) & 0xFFFFFFFF
ida_bytes.patch_bytes(ea, bytes([op]) + struct.pack('<I', disp))
```
After patching, force IDA to re-analyse the function so the full CFG and Hex-Rays output are restored:
```python
import ida_auto, idaapi
idaapi.reanalyze_function(idc.get_func_attr(ea, idc.FUNCATTR_START))
```
### 5. Label indirect API calls
Once the real destination of every `call rax` is known you can tell IDA what it is so parameter types & variable names are recovered automatically:
```python
idc.set_callee_name(call_ea, resolved_addr, 0) # IDA 8.3+
```
### Practical benefits
* Restores the real CFG → decompilation goes from *10* lines to thousands.
* Enables string-cross-reference & xrefs, making behaviour reconstruction trivial.
* Scripts are reusable: drop them into any loader protected by the same trick.
---
## References
- [Unit42 Evolving Tactics of SLOW#TEMPEST: A Deep Dive Into Advanced Malware Techniques](https://unit42.paloaltonetworks.com/slow-tempest-malware-obfuscation/)
{{#include ../../banners/hacktricks-training.md}}