+ 7
When/how often realloc from an efficiency standpoint?
When I want to use a flexibly-sized array in C, I would write a few functions that regulate the realloc-matter for my use-case. Like for example I'd have an 'append' function that checks if the size of an array of ints needs to be longer now, and does the realloc business. Now it doesn't seem to be efficient to do that every time the length changes by 1, since it might lead to the whole array having to call the mover company more often than necessary. 😉 Now my question: Is there a guideline or a rule of thumb, or are there underlying principles for how to decide when to realloc, and how will it differ depending on the datatype or the size of the array?
9 Réponses
+ 5
I don't think that realloc() is "expensive", and here is the proof:
https://code.sololearn.com/cXj9ROKT3esL/?ref=app
512 mega rellocations with granularity one byte: we do not incur in "time limit exceeded" and memory copy is done only 14 times
+ 5
OK Martin Taylor, you are right...
I've added a malloc(42) before each realloc(), and the result changes drammatically...
But also this is a rather unrealistic case...
The truth is in the middle.
+ 5
Ace, I have no specific scenario in mind, but I am thinking of the dynamic types in higher languages, like C++ vectors, Python lists and so on.
For general use cases, like lists of numbers or words or whatever. You append something, they grow, you pop something, they shrink.
And the memory-grabbing and dropping is done without me even knowing.
So if I kind of wanted to simulate a 'container type' and append and pop stuff, changing the size of the memory chunk every time seems inefficient. So that's where my question comes from.
Martin Taylor, I suppose I should finally look into data structures and algorithms more. Right now I'm not there yet, writing mostly stuff with Python with the built-in types.
I have no specific performance issues at hand right now, since I'm only starting to play around with this.
So, ~ swim ~, I would take a big blob of memory when my program starts, and use sub-sections of it for my needs, and only grab more when really needed, as to reduce number of necessary reallocations?
+ 5
~ swim ~, sure sounds messy, like you could easily swirl up different sets of data if you get a number just a little bit wrong.
Probably best to create one pattern very carefully and make a 'method' for starting the procedure when needed.
+ 5
HonFu,
if you are concerned with this kind of stuff, you can consider the use of alternate libraries:
http://ithare.com/testing-memory-allocators-ptmalloc2-tcmalloc-hoard-jemalloc-while-trying-to-simulate-real-world-loads/
+ 5
What about the idea of realloc'ing an extra N bytes at a time instead of 1 where N is dependent on the memory needs of the application for the array. Seems to me like it would be more efficient than adding one byte each time you need an extra byte. Or else, each time you need M bytes, realloc by an extra N*M bytes? Is that considered messy bookkeeping?
+ 4
Also, if, as Martin Taylor said, a memory block will always first try to just grow if the space is there, does it even matter if I realloc with every change? I mean, if it only moves if it has to anyway?
+ 4
Ah, so there's a mechanism that will try to fill up small holes in memory first?
That would lead to a lot of array-hopping then. 😂
Hmm... seems there's not really 'one way to do it' here, right?
It's not Python after all...
+ 2
Google table doubling - it is the most common technique for dynamic memory allocation. As a rule of thumb: multiply array size by N if more than X% is full. Divide the array size by M, if less than Y% is full.
Both increasing and decreasing array size have O(n) computational complexity since you need to copy the data between old and new arrays.
Be aware, that because of that this doesn't work for real-time applications.
Bilbo Baggins I got 4000+ address changes before timeout... If you use larger memory blocks it will be even slower. The point of realloc in this situation is to increase array size.