![]() |
|
|
Welcome to the { mindfrost82.com } forums. You are currently viewing our boards as a guest which gives you limited access to view most discussions and access our other features. By joining our free community you will have access to post topics, communicate privately with other members (PM), respond to polls, upload content and access many other special features. Registration is fast, simple and absolutely free so please, join our community today! If you have any problems with the registration process or your account login, please contact contact us. |
|
|||||||
![]() |
|
|
LinkBack | Thread Tools | Search this Thread | Display Modes |
|
|||
|
memory efficient STL allocator alternatives?
Hello,
I am looking for an allocator that minimizes memory losses due to fragmentation. Use case: I have a program that builds a data structure containing variable length sequences that are currently implemented as vector<...>. While the data structure is built up, the vectors get elements added (or removed) on a regular basis. It is not possible to predict the number of elements a vector will get in advance, and in a typical structure there is a wide range of sizes (real example: mininum length 1, maximum 182, average 49). After building up the data structure I write it to disk, and do a visualisation step. In subsequent runs of the program I read the data structure from disk instead of recomputing it, to do alternative visualisation steps. For one concrete case I measured that in the run that actually computes the data structure it uses 1.88 GB of memory. A subsequent run that reads the same data structure from disk uses only 463 MB. The difference in usage is due to fragmentation, as far as I can tell, and the 463 MB figure makes sense from the actual data that is in memory. The program runs on linux (Debian Lenny) and is built with g++ 4.3.1. I played around with the malloc_allocator and the __pool_allocator that come with the GNU stdlibc++ library, but they did not make any significant difference. I also tried the Hoard memory allocator (http://www.hoard.org/), but that used more in building the data structure, and significantly more in reruns that only read the data structure (compared to reruns with the vanilla allocator, that is). What I would like to find is an allocator that agressively reuses earlier freed bytes, possibly at the cost of some runtime performance. Are there any such allocators known, or should I invent this wheel myself? Kind regards, Eric -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
On 31 Jul., 06:28, Eric Meijer <eric_no1sp...@xs4all.nl> wrote:
> Hello, > > I am looking for an allocator that minimizes memory losses due to > fragmentation. > [snip] > > What I would like to find is an allocator that agressively reuses > earlier freed bytes, possibly at the cost of some runtime performance. > Are there any such allocators known, or should I invent this wheel myself? > Did you consider replacing std::vector with std::deque? Most probably, you'll be able to just replace and forget, and std::deque should be kinder to the memory-allocator as this structure does not need contiguous memory. /Peter -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
Alberto Ganesh Barbati wrote:
> Eric Meijer ha scritto: >> Hello, >> >> I am looking for an allocator that minimizes memory losses due to >> fragmentation. > > You start with the wrong foot. Most industry-strength allocators try > their best to do that. By making this request you are implicitly > assuming that the allocators you are using don't do it and that is > probably unfair. Um, no need to take it personal :-). I am seeing memory behaviour in my program that indicates that some "discarded" memory is not reused. After some analysis I concluded that the problem was in the allocator. Then I got stuck trying to improve with alternative allocators. Then I decided to post to this helpful forum. Now I have had a good night's sleep, did some more analysis, and I think the memory is stuck in the vector class. > The issue is strongly related to your specific > allocation pattern. Indeed. > The problem is that a general purpose allocator > can't behave well in every allocation patterns... so you need one that > behaves well in your case. Unfortunately, despite your very detailed > description of your allocation pattern, I still don't have enough > information to answer that question. [... snip my use case description ...] The hint was hidden in the description, but I didn't realize it myself. The point is that the vectors not only grow, but also shrink (via resize()), except that they don't really shrink. > [ ... question about specific distribution of allocation sizes ...] Here is some output of my analysis of a run that computes the data structure: min, avg, max elements : 1, 43.8756, 201 So on average some 44 elements, in a range from 1 to 201. I also checked the vector.capacity(), which gave me this interesting result: min, avg, max vector capacity : 1, 63.801, 256 Apparently on average any vector was 20 elements larger at some point during the computation than at the end, something I hadn't realised. I verified that the vector implementation does not overallocate. My conclusion so far is that I should find / write a container that actually releases memory back to the allocator if it shrinks. It remains to be seen if the actual peak memory usage can be reduced. The vector's strategy of the gnu libstdc++ appears to be at least very reluctant to really release memory when it is reduced in size. This strategy is too wasteful for my particular use case. > HTH, > > Ganesh Thanks for taking part in a discussion, it does help. Eric -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
On Jul 31, 12:57 pm, Eric Meijer <eric_no1sp...@xs4all.nl> wrote:
> My conclusion so far is that I should find / write a container that > actually releases memory back to the allocator if it shrinks. It > remains to be seen if the actual peak memory usage can be reduced. The > vector's strategy of the gnu libstdc++ appears to be at least very > reluctant to really release memory when it is reduced in size. This > strategy is too wasteful for my particular use case. In case you overlooked it, might I again suggest what lhyatt@princeton.edu recommended. It's the common idiom that if you want to shrink the vector's capacity to its current size, then you do: On Jul 31, 7:19 am, "lhy...@princeton.edu" <lhy...@gmail.com> wrote: > template<typename Vector> > void shrink_vector(Vector& v) { > Vector(v.begin(), v.end()).swap(v); > > } Or, more commonly, vector<int>(x).swap(x); If you build each vector one at a time, then at the end of working with each vector, you can shrink its capacity to its size, and proceed to the next one. You will gain your desired space savings with a small amount of time overhead. If you're building all the vectors at once, and you don't know when you're done with some vector, then there is no "end point" when you're done working with some vector. Thus the only way to save space is to shrink it after every remove and insert, or some sort of heuristic to determine when to shrink. This problem would be more interesting, and we don't have enough information on your exact problem to solve it. -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
Eric Meijer ha scritto:
> > Here is some output of my analysis of a run that computes the data > structure: > > min, avg, max elements : 1, 43.8756, 201 > > So on average some 44 elements, in a range from 1 to 201. > I also checked the vector.capacity(), which gave me this interesting > result: > > min, avg, max vector capacity : 1, 63.801, 256 > > Apparently on average any vector was 20 elements larger at some point > during the computation than at the end, something I hadn't realised. I > verified that the vector implementation does not overallocate. vector implementations usually don't overallocate on resize() or reserve() but they usually do so on push_back(). In fact, the gcc implementation of push_back() adds elements up to capacity(), then reallocates by doubling the capacity. It's not a coincidence that the max capacity is exactly 256! > My conclusion so far is that I should find / write a container that > actually releases memory back to the allocator if it shrinks. It > remains to be seen if the actual peak memory usage can be reduced. The > vector's strategy of the gnu libstdc++ appears to be at least very > reluctant to really release memory when it is reduced in size. This > strategy is too wasteful for my particular use case. Another poster already mentioned the swap-trick to shrink a vector, that is: std::vector<..>().swap(v); Notice that C++0x will provide std::vector with a new method called shrink_to_fit() to perform that task without "tricks". HTH, Ganesh -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
Alberto Ganesh Barbati wrote:
> vector implementations usually don't overallocate on resize() or > reserve() but they usually do so on push_back(). In fact, the gcc > implementation of push_back() adds elements up to capacity(), then > reallocates by doubling the capacity. It's not a coincidence that the > max capacity is exactly 256! I think you have a disconnect somewhere. I've had vectors much larger than 256 in GCC. -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
red floyd wrote:
> Alberto Ganesh Barbati wrote: >> vector implementations usually don't overallocate on resize() or >> reserve() but they usually do so on push_back(). In fact, the gcc >> implementation of push_back() adds elements up to capacity(), then >> reallocates by doubling the capacity. It's not a coincidence that the >> max capacity is exactly 256! > > > I think you have a disconnect somewhere. I've had vectors much larger > than 256 in GCC. > You must be using one of those new-fangled 16 bit systems then! ;-) (sorry, couldn't resist, I don't think Ganesh implied that 256 was an absolute maximum, he just noted that the allocated reserve was a power of 2). Eric -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives? [ solved ]
JoshuaMaurice@gmail.com wrote:
[ ... ] > In case you overlooked it, might I again suggest what > lhyatt@princeton.edu recommended. It's the common idiom that if you > want to shrink the vector's capacity to its current size, then you do: > > On Jul 31, 7:19 am, "lhy...@princeton.edu" <lhy...@gmail.com> wrote: >> template<typename Vector> >> void shrink_vector(Vector& v) { >> Vector(v.begin(), v.end()).swap(v); >> >> } > > Or, more commonly, > vector<int>(x).swap(x); > > If you build each vector one at a time, then at the end of working > with each vector, you can shrink its capacity to its size, and proceed > to the next one. You will gain your desired space savings with a small > amount of time overhead. > > If you're building all the vectors at once, and you don't know when > you're done with some vector, then there is no "end point" when you're > done working with some vector. Thus the only way to save space is to > shrink it after every remove and insert, or some sort of heuristic to > determine when to shrink. This problem would be more interesting, and > we don't have enough information on your exact problem to solve it. This is indeed the case, the whole set of vectors is updated in a chaotic pattern (it _is_ a fractal (http://www.elmweb.nl/)), and I don't know when a particular vector is "ready". I took the shrink_vector function as inspiration to implement alternatives for the insert() and erase() methods for vectors (see code fragment below). The advantage above the shrink_vector function is that it prevents an extra copy of the elements of the vector. The "ShrinkThreshold" allows a trade-off between cpu and memory resources. This tackles the problem. What I don't like about it is that to my taste the solution contains too much "inside knowledge" about the memory allocation strategies of vector. I also don't know to what degree this is standard required behaviour, common usage, or even GNU libstdc++ specific (I guess not). Would there be a "market" for a lean_vector<T> class that behaves along the lines of the code below, or can this behaviour be factored out of vector somehow? Thanks everyone for the suggestions. Eric ---- code fragment ---- // threshold for shrinking intervals vector const size_t ShrinkThreshold = 4; // "lean" erase that never lets the capacity grow // beyond size + ShrinkThreshold template<typename V> void LeanErase(V& v, typename V::iterator i) { if(v.capacity() >= v.size() + ShrinkThreshold) { V w(v.size() - 1); typename V::iterator wi = w.begin(); for(typename V::iterator vi = v.begin(); vi != v.end(); ++vi) { if(vi == i) continue; *wi++ = *vi; } w.swap(v); } else { v.erase(i); } } // lean insert that grows the vector capacity only by ShrinkThreshold template<typename T> void LeanInsert(vector<T>& v, typename vector<T>::iterator i, const T& t) { if(v.capacity() < v.size() + 1) { vector<T> w; w.reserve(v.size() + ShrinkThreshold); w.resize(v.size() + 1); typename vector<T>::iterator vi(v.begin()); typename vector<T>::iterator wi = w.begin(); while(vi != i) *wi++ = *vi++; *wi++ = t; while(vi != v.end()) *wi++ = *vi++; w.swap(v); } else { v.insert(i, t); } } -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
red floyd ha scritto:
> Alberto Ganesh Barbati wrote: >> vector implementations usually don't overallocate on resize() or >> reserve() but they usually do so on push_back(). In fact, the gcc >> implementation of push_back() adds elements up to capacity(), then >> reallocates by doubling the capacity. It's not a coincidence that the >> max capacity is exactly 256! > > > I think you have a disconnect somewhere. I've had vectors much larger > than 256 in GCC. > Aww, be serious! Of course you can have vectors larger than that! But has the OP has said that *his vectors* hold at most 201 elements, it's not a coincidence that the capacity of *those vectors* is exactly 256. Ganesh -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
|
|||
|
Re: memory efficient STL allocator alternatives?
red floyd wrote:
> Alberto Ganesh Barbati wrote: >> vector implementations usually don't overallocate on resize() or >> reserve() but they usually do so on push_back(). In fact, the gcc >> implementation of push_back() adds elements up to capacity(), then >> reallocates by doubling the capacity. It's not a coincidence that >> the max capacity is exactly 256! > > > I think you have a disconnect somewhere. I've had vectors much > larger than 256 in GCC. It was about the capacity of the OP's vector > Here is some output of my analysis of a run that computes the data > structure: > > min, avg, max elements : 1, 43.8756, 201 > > So on average some 44 elements, in a range from 1 to 201. > I also checked the vector.capacity(), which gave me this interesting > result: > > min, avg, max vector capacity : 1, 63.801, 256 Seeing a capacity of 256 for a vector with sizes around 200 is not surprising, nor is the average capacity being very close to 64. Bo Persson -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
![]() |
|
| Thread Tools | Search this Thread |
| Display Modes | |
|
|