Besides having slight performance problems, malloc() also causes memory fragmentation. How much there is, and how harmful it is, depends a lot on the program. Usually it matters only for long-running processes; short-lived processes free all their memory soon anyway.
Fragmentation happens when a program allocates and frees memory blocks of different sizes. The malloc() implementation may not find a free block of exactly the requested size, so it uses a larger one. The unused remainder can serve other, even smaller allocations, but if this happens often, the heap soon fills up with small unused holes.
Programs that allocate memory blocks of only a few sizes generally don't get fragmented much. Since most programs have traditionally used statically sized buffers for most temporary allocations (such as string manipulation), they don't get too fragmented.
There are many different malloc implementations. Some concentrate on speed and some on reducing fragmentation, but most try to find a balance between the two. Most implementations are portable and quite small, so if you find that one of them works much better than the others with your program, you could just include it with your program and use it directly.
There are several ways to reduce or even completely eliminate fragmentation by using different allocation methods:
The Memory Fragmentation Problem: Solved?
The reason that most objects allocated are of so few object sizes is that, for most programs, the majority of dynamic objects are of just a few types.
Are Mallocs Free of Fragmentation?
To our surprise, when we run Hummingbird on several operating systems with different malloc implementations, the heap size of the process was several times larger than the total size of live memory objects.
Hummingbird implements a memory-based cache, which stores variable-sized objects in addition to fixed-sized objects. The total size of live objects is fixed. We believe the Hummingbird's memory access pattern is actually quite common for many long-running applications, which store dynamic memory objects of variable sizes.
The best malloc we measured was PhK/BSD malloc version 42. It caused 30.5% heap fragmentation.