hacktricks/src/pentesting-web/json-xml-yaml-hacking.md
carlospolop 1d33f0bcc8 c
2025-07-08 18:27:17 +02:00

# JSON, XML & Yaml Hacking & Issues
{{#include ../banners/hacktricks-training.md}}
## Go JSON Decoder
The following issues were found in Go's JSON decoder, although they may also be present in other languages. They were published in [**this blog post**](https://blog.trailofbits.com/2025/06/17/unexpected-security-footguns-in-gos-parsers/).
Go's JSON, XML, and YAML parsers have a long trail of inconsistencies and insecure defaults that can be abused to **bypass authentication**, **escalate privileges**, or **exfiltrate sensitive data**.
### (Un)Marshaling Unexpected Data
The goal is to exploit structs that allow an attacker to read/write sensitive fields (e.g., `IsAdmin`, `Password`).
- Example Struct:
```go
type User struct {
    Username string `json:"username,omitempty"`
    Password string `json:"password,omitempty"`
    IsAdmin  bool   `json:"-"`
}
```
- Common Vulnerabilities
1. **Missing tag** (an exported field without a tag is still (un)marshaled by default, using its Go field name as the key):
```go
type User struct {
    Username string
}
```
Payload:
```json
{"Username": "admin"}
```
2. **Incorrect use of `-`**:
```go
type User struct {
    IsAdmin bool `json:"-,omitempty"` // ❌ wrong: the field is (un)marshaled under the key "-"
}
```
Payload:
```json
{"-": true}
```
✔️ Proper way to block a field from being (un)marshaled:
```go
type User struct {
    IsAdmin bool `json:"-"`
}
```
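The three tag variants above can be compared side by side. The following is a minimal sketch, assuming hypothetical struct and helper names (`userNoTag`, `userWrongDash`, `userSafe` and the `decode*` functions are illustrative only), that feeds attacker-style payloads to each variant:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical structs, used only for this illustration.
type userNoTag struct {
	IsAdmin bool // no tag: decoded from the key "IsAdmin" (matched case-insensitively)
}

type userWrongDash struct {
	IsAdmin bool `json:"-,omitempty"` // the JSON key is literally "-"
}

type userSafe struct {
	IsAdmin bool `json:"-"` // always skipped by encoding/json
}

// decodeNoTag reports whether the payload managed to set IsAdmin.
func decodeNoTag(payload string) bool {
	var u userNoTag
	_ = json.Unmarshal([]byte(payload), &u)
	return u.IsAdmin
}

// decodeWrongDash reports whether the payload managed to set IsAdmin.
func decodeWrongDash(payload string) bool {
	var u userWrongDash
	_ = json.Unmarshal([]byte(payload), &u)
	return u.IsAdmin
}

// decodeSafe reports whether the payload managed to set IsAdmin.
func decodeSafe(payload string) bool {
	var u userSafe
	_ = json.Unmarshal([]byte(payload), &u)
	return u.IsAdmin
}

func main() {
	fmt.Println(decodeNoTag(`{"isadmin": true}`))           // true
	fmt.Println(decodeWrongDash(`{"-": true}`))             // true
	fmt.Println(decodeSafe(`{"-": true, "IsAdmin": true}`)) // false
}
```

Only the plain `json:"-"` tag keeps the field out of reach of the payload.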
### Parser Differentials
The goal is to bypass authorization by exploiting the fact that different parsers interpret the same payload differently, as in:
- CVE-2017-12635: Apache CouchDB bypass via duplicate keys
- 2022: Zoom 0-click RCE via XML parser inconsistency
- GitLab 2025 SAML bypass via XML quirks
**1. Duplicate Fields:**
Go's `encoding/json` keeps the value of the **last** duplicate key.
```go
json.Unmarshal([]byte(`{"action":"UserAction", "action":"AdminAction"}`), &req)
fmt.Println(req.Action) // AdminAction
```
Other parsers (e.g., Java's Jackson) may take the **first**.
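Since the standard library silently accepts duplicates, one option is to detect them before decoding. The following is a rough sketch, assuming a hypothetical helper `hasDuplicateTopLevelKey` that only inspects the top-level object (nested objects are not checked), built on the `json.Decoder` token API:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// lastWins decodes payload into a struct with a single "action" field.
func lastWins(payload string) string {
	var req struct {
		Action string `json:"action"`
	}
	_ = json.Unmarshal([]byte(payload), &req)
	return req.Action
}

// hasDuplicateTopLevelKey reports whether the top-level JSON object
// repeats a key (exact, case-sensitive comparison). Hypothetical helper.
func hasDuplicateTopLevelKey(payload string) (bool, error) {
	dec := json.NewDecoder(strings.NewReader(payload))
	tok, err := dec.Token()
	if err != nil {
		return false, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return false, errors.New("expected a JSON object")
	}
	seen := map[string]bool{}
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return false, err
		}
		key, ok := keyTok.(string)
		if !ok {
			return false, errors.New("expected an object key")
		}
		if seen[key] {
			return true, nil
		}
		seen[key] = true
		// Skip the value, whatever its type.
		var skip json.RawMessage
		if err := dec.Decode(&skip); err != nil {
			return false, err
		}
	}
	return false, nil
}

func main() {
	payload := `{"action":"UserAction", "action":"AdminAction"}`
	fmt.Println(lastWins(payload)) // AdminAction
	dup, err := hasDuplicateTopLevelKey(payload)
	fmt.Println(dup, err) // true <nil>
}
```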
**2. Case Insensitivity:**
Go's `encoding/json` matches keys to struct fields case-insensitively:
```go
json.Unmarshal([]byte(`{"AcTiOn":"AdminAction"}`), &req)
// matches `Action` field
```
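Even Unicode tricks work, because keys are compared using Unicode case folding: `ſ` (U+017F, long s) folds to `s` and `K` (U+212A, Kelvin sign) folds to `k`:
```go
json.Unmarshal([]byte(`{"aKtionſ": "bypass"}`), &req)
// matches a field whose JSON name is `aktions`
```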
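The folding behaviour can be checked with a short program. This is a minimal sketch, assuming hypothetical struct names and a field named `aktions`; the non-ASCII characters are written as `\u` escapes so they stay visible: U+212A is the Kelvin sign and U+017F is the long s.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical request types, used only for this illustration.
type actionRequest struct {
	Action string `json:"action"`
}

type foldedRequest struct {
	Aktions string `json:"aktions"`
}

// decodeAction returns the Action field after decoding payload.
func decodeAction(payload string) string {
	var req actionRequest
	_ = json.Unmarshal([]byte(payload), &req)
	return req.Action
}

// decodeFolded returns the Aktions field after decoding payload.
func decodeFolded(payload string) string {
	var req foldedRequest
	_ = json.Unmarshal([]byte(payload), &req)
	return req.Aktions
}

func main() {
	// Mixed-case ASCII key still matches the "action" field.
	fmt.Println(decodeAction(`{"AcTiOn":"AdminAction"}`)) // AdminAction

	// Kelvin sign (U+212A) folds to "k", long s (U+017F) folds to "s".
	fmt.Println(decodeFolded("{\"a\u212Ation\u017F\": \"bypass\"}")) // bypass
}
```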
**3. Cross-service mismatch:**
Imagine:
- Proxy written in Go
- AuthZ service written in Python
Attacker sends:
```json
{
    "action": "UserAction",
    "AcTiOn": "AdminAction"
}
```
- Python sees `UserAction`, allows it
- Go sees `AdminAction`, executes it
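The mismatch can be reproduced inside a single Go program. In this sketch, a case-sensitive lookup in a `map[string]any` stands in for the Python AuthZ service (an assumption made for illustration, it is not Python's actual parser), while a struct decode represents the Go proxy; all names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// caseSensitiveView mimics a parser that looks up the exact key "action".
// It is a stand-in for the AuthZ service, not a real Python parser.
func caseSensitiveView(payload string) string {
	var m map[string]any
	_ = json.Unmarshal([]byte(payload), &m)
	s, _ := m["action"].(string)
	return s
}

// goStructView decodes into a struct, as a typical Go handler would.
func goStructView(payload string) string {
	var req struct {
		Action string `json:"action"`
	}
	_ = json.Unmarshal([]byte(payload), &req)
	return req.Action
}

func main() {
	payload := `{"action":"UserAction","AcTiOn":"AdminAction"}`
	fmt.Println(caseSensitiveView(payload)) // UserAction
	fmt.Println(goStructView(payload))      // AdminAction
}
```

The two views disagree on the same bytes, which is exactly the gap the attacker needs.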
### Data Format Confusion (Polyglots)
The goal is to exploit systems that mix formats (JSON/XML/YAML) or fail open on parser errors, as in:
- **CVE-2020-16250**: HashiCorp Vault parsed JSON with an XML parser after STS returned JSON instead of XML.
Attacker controls:
- The `Accept: application/json` header
- Part of the JSON body
Go's XML parser parsed it **anyway** and trusted the injected identity.
- Crafted payload:
```json
{
    "action": "Action_1",
    "AcTiOn": "Action_2",
    "ignored": "<?xml version=\"1.0\"?><Action>Action_3</Action>"
}
```
Result:
- **Go JSON** parser: `Action_2` (case-insensitive + last wins)
- **YAML** parser: `Action_1` (case-sensitive)
- **XML** parser: skips the surrounding JSON as garbage and parses `Action_3` from the element inside the string
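The JSON and XML halves of this result can be reproduced with the standard library alone. The following is a simplified sketch, assuming hypothetical struct names and a payload without the XML declaration; the YAML case is left out because it needs a third-party package.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// Hypothetical types, used only for this illustration.
type jsonAction struct {
	Action string `json:"action"`
}

type xmlAction struct {
	XMLName xml.Name `xml:"Action"`
	Value   string   `xml:",chardata"`
}

// Simplified polyglot: valid JSON that also contains an XML element.
const polyglot = `{"action":"Action_1","AcTiOn":"Action_2","ignored":"<Action>Action_3</Action>"}`

// viaJSON decodes the payload with encoding/json.
func viaJSON(payload string) string {
	var req jsonAction
	_ = json.Unmarshal([]byte(payload), &req)
	return req.Action
}

// viaXML decodes the same bytes with encoding/xml, which skips the
// text before the first start element.
func viaXML(payload string) (string, error) {
	var req xmlAction
	err := xml.Unmarshal([]byte(payload), &req)
	return req.Value, err
}

func main() {
	fmt.Println(viaJSON(polyglot)) // Action_2
	v, err := viaXML(polyglot)
	fmt.Println(v, err) // Action_3 <nil>
}
```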
### 🔐 Mitigations
| Risk | Fix |
|-----------------------------|---------------------------------------|
| Unknown fields | `decoder.DisallowUnknownFields()` |
| Duplicate fields (JSON) | ❌ No fix in stdlib |
| Case-insensitive match | ❌ No fix in stdlib |
| XML garbage data | ❌ No fix in stdlib |
| YAML: unknown keys          | `decoder.KnownFields(true)` (yaml.v3) |
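As a minimal sketch of the first row, assuming a hypothetical `strictDecode` helper and request type, the decoder below rejects unknown fields. As the table notes, this does not stop case-insensitive matching, which the second call illustrates.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Hypothetical request type, used only for this illustration.
type actionRequest struct {
	Action string `json:"action"`
}

// strictDecode decodes payload and fails on keys that match no field.
func strictDecode(payload string) (actionRequest, error) {
	var req actionRequest
	dec := json.NewDecoder(strings.NewReader(payload))
	dec.DisallowUnknownFields()
	err := dec.Decode(&req)
	return req, err
}

func main() {
	// The unknown key "role" is rejected.
	_, err := strictDecode(`{"action":"UserAction","role":"admin"}`)
	fmt.Println(err != nil) // true

	// A case variant of a known key is still accepted.
	req, err := strictDecode(`{"AcTiOn":"AdminAction"}`)
	fmt.Println(req.Action, err) // AdminAction <nil>
}
```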
{{#include ../banners/hacktricks-training.md}}