Secure, Efficient and Easy C programming

Easiest way to write secure code is to do it so that you don't have to think about it.

UPDATE: I thought of writing this to be more easily understandable and with some more content and explanations. But of course I got tired of doing that. Anyway, beginnings are here.

Introduction

Everyone knows about buffer overflows nowadays. Everyone knows some ways to prevent them. But most people still do it in a way that requires them to be unnecessarily careful while coding - even a single carelessly written part of code can be a security disaster.

None of these ideas I've written about is new, but most of them are very rarely used with C programs. I think it's mostly because many people don't know about them or just haven't realized how they could be easily used. Sure, there are also people who will never change their way of coding. And there's also the problem that using non-libc functions make the program bigger, which is pretty annoying with otherwise very small programs.

Besides not only making your code more resistant to buffer overflows, I think many of these features will actually make writing C-code a lot easier and more fun.

I'm not a writer and I'm not too good at english, so sorry about all the spelling and grammar errors :) All of this stuff was written at sunday morning, tired after being awake the whole night and not being able to do anything useful..

Memory Allocations

The Common Ways:

Use static buffers - fast, but lack the ability to grow when needed.
malloc() - slighly slower than static buffers, possible memory fragmentation, and most importantly it requires the memory to be freed, which can be sometimes very annoying. It's easy to forget to free the memory and cause memory leaks. free()ing already freed memory may also be an exploitable a security flaw, always setting the free()d pointer to NULL helps quite a lot though (create a macro to do it).
Garbage collector - this would be the best way to manage memory. For example OCaml's garbage collector is quite smart by treating long-living allocations differently than temporary allocations. However with C it's not really possible - you can't go moving the allocated memory elsewhere unless you write your program in special way. So there's a few simple non-portable garbage collector implementations for C, but they're not much more different from malloc()ing other than that you don't need to free() memory. They're not fool proof either.

Data Stack

What I haven't yet seen used anywhere outside my own software and some programming languages internals (eg. calling Perl code from C), is using data stack for temporary memory allocations. It has the most important advantage of garbage collectors; allocate memory without worrying about freeing it. It also has a few gotchas, but I'd say it's advantages are well worth it. UPDATE: GNU obstack is a data stack, although I think it's a bit bloated and the API isn't that nice.

The way it works is simply letting the programmer define the stack frames. All memory allocated within the frame are freed at once when the frame ends. This works best with programs running in some event loop so you don't have to worry about the stack frames too much. Here's an example program:

/* asprintf()-like functon, but allocates the memory from data stack */
const char *t_printf(const char *fmt, ...) __attribute__((format (printf, 1, 2)));

void client_input(struct client *client)
{
	const char *cmd;
	int i;

	cmd = read_command(client);
	if (cmd == NULL)
		return;

	if (strcasecmp(cmd, "status") == 0) {
		time_t now = time(NULL);
		client_send(client, t_printf("Time: %s", ctime(&now)));
		client_send(client, t_printf("Clients: %u", clients_count));

		for (i = 0; i < 10000; i++) {
			/* without an extra stack frame here, we'd allocate the
			   t_printf() 10000 times before freeing them all.
			   That's probably not be very good. */
			t_push();
			client_send(client, t_printf("%d: %u", i, stuff[i]));
			t_pop();
		}
	} else {
		client_send(client, t_printf("Unknown command: %s", cmd));
	}
}

void main_loop(void)
{
	unsigned int i, id;
	fd_set rfds;

	if (select(max_fd, &rfds, NULL, NUL, NULL) <= 0)
		return;

	for (i = 0; i < clients_count; i++) {
		if (FD_ISSET(clients[i].fd, &rds)) {
			id = t_push();
			client_input(&clients[i]);
			if (t_pop() != id) {
				/* we could simply call the missing t_pop()s,
				   but this  usually indicates a problem which
				   we want to know. for example we might have
				   leaked it inside a for loop which caused
				   unnecessarily large memory usage. */
				panic("missing t_pop()");
			}
		}
	}
}

Advantages over control stack:

Functions can return a value allocated from data stack.
We can portably specify how much data we want to allocate at runtime, and in some cases we can grow it without copying data.
alloca() function should be mentioned here as a way of nonportably allocating memory from control stack. So it does none of the above.

Advantages over malloc():

FAST, most of the time allocating memory means only updating a couple of pointers and integers. Freeing the memory all at once also is a fast operation.
No need to free() each allocation. This has a huge impact on code simplicity. For example: fprintf(f, "'%s'\n", escape_string(input)); vs. char *str = escape_string(input); fprintf(f, "'%s'\n", str); free(str);
No memory leaks.
No memory fragmentation.

Disadvantages:

Allocating memory inside loops can accidentally allocate a lot of memory if the loops are long and you forgot to place stack frame there.
The memory could accidentally be saved somewhere and accessed after the end of stack frame.
Debugging for buffer overflows can be difficult, since the regular bound checkers don't work. To kludge around this, I've made an compile time option to allocate all memory using malloc(), but this of course hurts the performance and grows memory usage. Of course, by coding good code you should never need this :)

The second disadvantage is the most problematic one. It's actually two problems; first it shouldn't be mixed with permanent data, and second it shouldn't be temporarily stored to location where it could be accessed outside the stack frame.

The first one is easier to handle. The ideal solution would be to have some tags in C language that would give warning if being lost, for example temporary char *str = t_strdup("str"); char *str2 = str; where the str2 assignment would give a compiler warning about the missing temporary tag. But since this isn't possible we can do almost as well using const keyword, which has exactly that behaviour but restricts us from modifying the memory. Luckily that's not usually needed.

The second one is much more difficult, I haven't found any good way to handle it other than by a) don't even try it, b) fill the freed memory when freeing stack frame so the corrupted value is easily noticed at runtime.

And why ever even do that? Because it's simpler in some situations. For example I have this message header parser, which calls a callback function for each header line. Suppose I only want to find the message's Content-Type, so save it with context->content_type = t_strdup(content_type); Then later I just read it without need to worry about freeing it. Now, the only problem with this is that it relies on the parser not to create a stack frame around the callback which would invalidate our saved return value. While writing the callback it's easy to check if it's possible, but what if the parser was later modified without remembering the special needs of the one callback..

Memory Pools

Alloc-only pools can be quite useful for storing larger amount of return values from some function, especially when data stack can't be used (the second problem case above). They still provide relatively easy memory management since you only need to free the pool once.

Alloc-free pools could be useful with some applications to prevent memory fragmentation and memory leaks by grouping related data together into single malloc() block. They could also make the program run faster due to (possibly) better CPU cache utilization. Besides performance reasons, they could also be useful for statistics to find out where exactly the allocated memory goes. They'd however need their own internal malloc() implementation so they're not very simple.

Memory Pool API

When you have a function that could return a large return value, it's not very efficient to first store it into data stack and then copy it somewhere else. The traditional way would be to just give a (void *result, size_t size) as parameters. This again has the problem that we might not know at all what buffer size is large enough.

So, one way to deal with this is to give the function a memory pool object which can be used by the function to allocate the correct amount of data. Besides the dynamically created alloc-only and alloc-free pools there could be global pools as well: data_stack_pool and system_malloc_pool. Here's an example:

extern Pool data_stack_pool, system_malloc_pool;

/* generic dynamically growing array */
struct array *array_create(Pool pool, size_t initial_size)
{
	struct array *arr;

	arr = p_malloc(pool, sizeof(struct array));
	arr->pool = pool;
	arr->size = initial_size;
	arr->data = p_malloc(pool, arr->size);
	return arr;
}

void array_destroy(struct array *arr)
{
	p_free(arr->pool, arr->data);
	p_free(arr->pool, arr);
}

void array_grow(struct array *arr, size_t size)
{
	if (size > arr->size)
		arr->data = p_realloc(arr->pool, arr->data, size);
}

void func(void)
{
	struct array *arr;

	t_push();
	array_create(data_stack_pool, 1024);
	/* ... */
	array_destroy(arr); /* not really needed, t_pop() frees it anyway */
	t_pop();
}

String Handling

This is where most of the buffer overflows have happened. Nowadays people are more aware of the problem, but many still they do it in unnecessarily hard way.

The Common Ways:

strncpy(), strncat(), snprintf() - only snprintf() of these is easy to use safely (except it's return value) but it's still somewhat unportable (Windows). strncpy() doesn't necessarily NUL-terminate requiring it to be done explicitly. strncat() was probably never meant to be a way to prevent buffer overflows, it's behaviour is just too insane for that, for example: strncat(buf, "foo", sizeof(buf)-strlen(buf)-1); Then again there's people who are happily using strncat() by giving the full buffer size to it's 3rd argument. While that would sound logical, it's completely wrong.
strlcpy(), strlcat() - much better replacements to above by OpenBSD. Very unportable, but you can easily create your own ones. But these can still be used unsafely if the buffer size parameter is wrong or if the programmer goes playing around with the buffer indirectly, by eg. appending single characters and missing size checks (yes, I've seen this in software that contained "secure" in it's name).
Dynamically allocating the amount of wanted memory and then using strcpy(), strcat(), sprintf() and direct accessing. This requires you to be very careful with the string size calculations. I don't understand why so many people think that's not a problem, they have this "If you can't calculate the sizes correctly, you're stupid and you shouldn't be coding at all" attitude. Why bother wasting time with that at all when you could be doing more important things?
Dynamically growing buffers, used by for example GLIB, vsftpd, qmail, djbdns and Postfix. This is definitely the right way; string manipulation is done through API which discourages - or even disallows - direct buffer manipulation.

String API

There's several slight variations how to implement the dynamically growing buffers. Most work by allocating a new string object, using it and then freeing it. qmail and djbdns uses statically created buffers which are reused and never freed - that's pretty dangerous unless you can be sure the buffer isn't in use anymore. It's actually pretty much the same as having static char bigbuf[8192] which is used by several functions.

There's two ways how to manage a buffer - first being the explicit way done by eg. djb:

struct stralloc str = {0};
stralloc_copys(&str, "string");
stralloc_cats(&str, str_variable);
stralloc_catulong0(&str, int_variable);

GLIB supports doing this much more easily:

/* portable asprintf() */
char *str = g_strdup_printf("string%s%d", str_variable, int_variable);

Or if you wanted a modifyable buffer:

GString *str = g_string_new(NULL);
g_string_append(str, "string");
g_string_append(str, str_variable);
g_string_sprintfa(str, "%d", int_variable); /* no g_string_append_int() */

Some people don't seem to like the printf-style or believe it's unsafe. GCC however gives very good warnings about parameters with incorrect type so I don't think there's any need to worry about.

String API with Memory Pool API Support

Nothing fancy here really. Just by being able to specify the used memory pool we can easily allocate strings from data stack. I've made several wrapper functions so that instead of p_strdup_printf(data_stack_pool,...) I can just use t_strdup_printf(). Here's a list of few useful functions:

const char *t_strdup(const char *str);
char *t_strdup_noconst(const char *str);
const char *t_strdup_empty(const char *str); /* return NULL if str = "" */
const char *t_strdup_until(const char *start, const char *end); /* *end isn't included */
const char *t_strndup(const char *str, size_t max_chars);
const char *t_strdup_printf(const char *format, ...);
const char *t_strconcat(const char *str1, ...); /* NULL terminated */

t_strdup_noconst is there mostly to avoid casting const away in those few situations where it's needed. Because it returns char * it could be more easily mixed with permanent data so it's usage should be kept minimal.

t_strconcat() is one function that I also copied from GLIB. It's a bit dangerous though, the terminating NULL is too easy to forget. I've been thinking about removing it entirely, but it's much more efficient than t_strdup_printf() so I haven't yet had the heart :)

Buffer Handling

Many people concentrate only on string related buffer overflows. Yes, they've been the most common ones but they're quite known now and there's more to buffer overflows than just them. The most obvious one is just another type of buffer handling. All the other data that couldn't be called strings. Often people just hack away separate memory allocations and/or size checks for them. The problems are exactly the same as with strings.

If we create a buffer API and write to buffers only through it, we would prevent almost all buffer overflows, even integer related ones, simply because the buffer API implementation is the only part of code that directly writes to memory, and that of course should be well audited not to overflow in any situation.

Buffer API would be somewhat similar to string API, ideally the string API should be created using the buffer API, or they could even be the same. My buffer API currently looks like this:

/* Create a static sized buffer. Writes past this size will simply not
   succeed. */
Buffer *buffer_create_static(Pool pool, size_t size);

/* Create a static sized buffer. Writes past this size will kill the program. */
Buffer *buffer_create_static_hard(Pool pool, size_t size);

/* Create a modifyable buffer from given data. For example:
   { char buf[1024]; buf = buffer_create_data(pool, buf, sizeof(buf)); } */
Buffer *buffer_create_data(Pool pool, void *data, size_t size);

/* Create a non-modifyable buffer from given data. */
Buffer *buffer_create_const_data(Pool pool, const void *data, size_t size);

/* Creates a dynamically growing buffer. Whenever write would exceed the
   current size it's grown. */
Buffer *buffer_create_dynamic(Pool pool, size_t init_size, size_t max_size);

Then there are several write, append and truncate functions and a few functions to temporarily limit the accessible buffer range. An example:

int fill_buffer(Buffer *buf, struct data *data)
{
	struct bufdata *bufdata;

	/* these return value checks aren't really needed, since they fail
	   only if the buffer gets full. and since we're using dynamically
	   growing buffer with no limits, malloc() failure will kill the
	   program long before reaching the high limit. */
	if (buffer_append(buf, "data\n", 5) != 5)
		return -1;

	/* returns a pointer to just appended space */
	bufdata = buffer_append_space(buf, sizeof(*bufdata));
	if (bufdata == NULL)
		return -1;

	bufdata->field1 = htonl(data->field1);
	bufdata->field2 = htonl(data->field2);
	return 1;
}

int write_data(int fd, struct data *data)
{
	Buffer *buf;
	size_t size;
	int ret;

	t_push();
	buf = buffer_create_dynamic(data_stack_pool, 256, (size_t)-1);
	if ((ret = fill_buffer(buf, data)) > 0) {
		if (write(fd, buffer_get_data(buf, &size), size) != size)
			ret = -1;
	}
	t_pop();
	return ret;
}

There is of course also the possibility of buffer overflows while reading data. That's a bit more difficult to prevent except by careful coding, but usually it's not such a big deal anyway. It might crash the program, or at worst could expose some private information to attacker. If the later is a real problem, you should use multiple processes and IPC to hide the private data.

Real World Usage

I've written a fully featured IMAP server using pretty much these techniques. The code still needs some cleanups and there's probably a couple of bugs left, but mostly I think it's not too bad :) The library functions in it are MIT licenced, so go ahead and rip them to your own programs. I could try to create some minimal library out of them if there's enough interest.

My data stack implementation.

vsftpd uses it's own string API for pretty much everything. In only few of the files it's even possible to create a buffer overflow with the used coding style.