Welcome to the { mindfrost82.com } forums.

You are currently viewing our boards as a guest which gives you limited access to view most discussions and access our other features. By joining our free community you will have access to post topics, communicate privately with other members (PM), respond to polls, upload content and access many other special features. Registration is fast, simple and absolutely free so please, join our community today!

If you have any problems with the registration process or your account login, please contact contact us.

Go Back   { mindfrost82.com } > Gadget Corner > Tech Newsgroups > Programming > C++

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 07-31-2008, 05:28 AM
Eric Meijer
 
Posts: n/a
memory efficient STL allocator alternatives?

Hello,

I am looking for an allocator that minimizes memory losses due to
fragmentation.

Use case:
I have a program that builds a data structure containing variable length
sequences that are currently implemented as vector<...>. While the data
structure is built up, the vectors get elements added (or removed) on a
regular basis. It is not possible to predict the number of elements a
vector will get in advance, and in a typical structure there is a wide
range of sizes (real example: mininum length 1, maximum 182, average
49). After building up the data structure I write it to disk, and do a
visualisation step. In subsequent runs of the program I read the data
structure from disk instead of recomputing it, to do alternative
visualisation steps. For one concrete case I measured that in the run
that actually computes the data structure it uses 1.88 GB of memory. A
subsequent run that reads the same data structure from disk uses only
463 MB. The difference in usage is due to fragmentation, as far as I
can tell, and the 463 MB figure makes sense from the actual data that is
in memory.

The program runs on linux (Debian Lenny) and is built with g++ 4.3.1.
I played around with the malloc_allocator and the __pool_allocator that
come with the GNU stdlibc++ library, but they did not make any
significant difference. I also tried the Hoard memory allocator
(http://www.hoard.org/), but that used more in building the data
structure, and significantly more in reruns that only read the data
structure (compared to reruns with the vanilla allocator, that is).

What I would like to find is an allocator that agressively reuses
earlier freed bytes, possibly at the cost of some runtime performance.
Are there any such allocators known, or should I invent this wheel myself?

Kind regards,
Eric

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #2 (permalink)  
Old 07-31-2008, 12:18 PM
peter koch larsen
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

On 31 Jul., 06:28, Eric Meijer <eric_no1sp...@xs4all.nl> wrote:
> Hello,
>
> I am looking for an allocator that minimizes memory losses due to
> fragmentation.
>

[snip]
>
> What I would like to find is an allocator that agressively reuses
> earlier freed bytes, possibly at the cost of some runtime performance.
> Are there any such allocators known, or should I invent this wheel myself?
>

Did you consider replacing std::vector with std::deque? Most probably,
you'll be able to just replace and forget, and std::deque should be
kinder to the memory-allocator as this structure does not need
contiguous memory.

/Peter

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #3 (permalink)  
Old 07-31-2008, 08:57 PM
Eric Meijer
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

Alberto Ganesh Barbati wrote:
> Eric Meijer ha scritto:
>> Hello,
>>
>> I am looking for an allocator that minimizes memory losses due to
>> fragmentation.

>
> You start with the wrong foot. Most industry-strength allocators try
> their best to do that. By making this request you are implicitly
> assuming that the allocators you are using don't do it and that is
> probably unfair.


Um, no need to take it personal :-).
I am seeing memory behaviour in my program that indicates that some
"discarded" memory is not reused. After some analysis I concluded that
the problem was in the allocator. Then I got stuck trying to improve
with alternative allocators. Then I decided to post to this helpful
forum. Now I have had a good night's sleep, did some more analysis, and
I think the memory is stuck in the vector class.

> The issue is strongly related to your specific
> allocation pattern.


Indeed.

> The problem is that a general purpose allocator
> can't behave well in every allocation patterns... so you need one that
> behaves well in your case. Unfortunately, despite your very detailed
> description of your allocation pattern, I still don't have enough
> information to answer that question.


[... snip my use case description ...]
The hint was hidden in the description, but I didn't realize it myself.
The point is that the vectors not only grow, but also shrink (via
resize()), except that they don't really shrink.

> [ ... question about specific distribution of allocation sizes ...]


Here is some output of my analysis of a run that computes the data
structure:

min, avg, max elements : 1, 43.8756, 201

So on average some 44 elements, in a range from 1 to 201.
I also checked the vector.capacity(), which gave me this interesting result:

min, avg, max vector capacity : 1, 63.801, 256

Apparently on average any vector was 20 elements larger at some point
during the computation than at the end, something I hadn't realised. I
verified that the vector implementation does not overallocate.

My conclusion so far is that I should find / write a container that
actually releases memory back to the allocator if it shrinks. It
remains to be seen if the actual peak memory usage can be reduced. The
vector's strategy of the gnu libstdc++ appears to be at least very
reluctant to really release memory when it is reduced in size. This
strategy is too wasteful for my particular use case.

> HTH,
>
> Ganesh


Thanks for taking part in a discussion, it does help.

Eric

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #4 (permalink)  
Old 07-31-2008, 11:32 PM
JoshuaMaurice@gmail.com
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

On Jul 31, 12:57 pm, Eric Meijer <eric_no1sp...@xs4all.nl> wrote:
> My conclusion so far is that I should find / write a container that
> actually releases memory back to the allocator if it shrinks. It
> remains to be seen if the actual peak memory usage can be reduced. The
> vector's strategy of the gnu libstdc++ appears to be at least very
> reluctant to really release memory when it is reduced in size. This
> strategy is too wasteful for my particular use case.


In case you overlooked it, might I again suggest what
lhyatt@princeton.edu recommended. It's the common idiom that if you
want to shrink the vector's capacity to its current size, then you do:

On Jul 31, 7:19 am, "lhy...@princeton.edu" <lhy...@gmail.com> wrote:
> template<typename Vector>
> void shrink_vector(Vector& v) {
> Vector(v.begin(), v.end()).swap(v);
>
> }


Or, more commonly,
vector<int>(x).swap(x);

If you build each vector one at a time, then at the end of working
with each vector, you can shrink its capacity to its size, and proceed
to the next one. You will gain your desired space savings with a small
amount of time overhead.

If you're building all the vectors at once, and you don't know when
you're done with some vector, then there is no "end point" when you're
done working with some vector. Thus the only way to save space is to
shrink it after every remove and insert, or some sort of heuristic to
determine when to shrink. This problem would be more interesting, and
we don't have enough information on your exact problem to solve it.

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #5 (permalink)  
Old 07-31-2008, 11:33 PM
Alberto Ganesh Barbati
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

Eric Meijer ha scritto:
>
> Here is some output of my analysis of a run that computes the data
> structure:
>
> min, avg, max elements : 1, 43.8756, 201
>
> So on average some 44 elements, in a range from 1 to 201.
> I also checked the vector.capacity(), which gave me this interesting
> result:
>
> min, avg, max vector capacity : 1, 63.801, 256
>
> Apparently on average any vector was 20 elements larger at some point
> during the computation than at the end, something I hadn't realised. I
> verified that the vector implementation does not overallocate.


vector implementations usually don't overallocate on resize() or
reserve() but they usually do so on push_back(). In fact, the gcc
implementation of push_back() adds elements up to capacity(), then
reallocates by doubling the capacity. It's not a coincidence that the
max capacity is exactly 256!

> My conclusion so far is that I should find / write a container that
> actually releases memory back to the allocator if it shrinks. It
> remains to be seen if the actual peak memory usage can be reduced. The
> vector's strategy of the gnu libstdc++ appears to be at least very
> reluctant to really release memory when it is reduced in size. This
> strategy is too wasteful for my particular use case.


Another poster already mentioned the swap-trick to shrink a vector, that is:

std::vector<..>().swap(v);

Notice that C++0x will provide std::vector with a new method called
shrink_to_fit() to perform that task without "tricks".

HTH,

Ganesh

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #6 (permalink)  
Old 08-01-2008, 02:55 AM
red floyd
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

Alberto Ganesh Barbati wrote:
> vector implementations usually don't overallocate on resize() or
> reserve() but they usually do so on push_back(). In fact, the gcc
> implementation of push_back() adds elements up to capacity(), then
> reallocates by doubling the capacity. It's not a coincidence that the
> max capacity is exactly 256!



I think you have a disconnect somewhere. I've had vectors much larger
than 256 in GCC.

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #7 (permalink)  
Old 08-01-2008, 04:44 PM
Eric Meijer
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

red floyd wrote:
> Alberto Ganesh Barbati wrote:
>> vector implementations usually don't overallocate on resize() or
>> reserve() but they usually do so on push_back(). In fact, the gcc
>> implementation of push_back() adds elements up to capacity(), then
>> reallocates by doubling the capacity. It's not a coincidence that the
>> max capacity is exactly 256!

>
>
> I think you have a disconnect somewhere. I've had vectors much larger
> than 256 in GCC.
>

You must be using one of those new-fangled 16 bit systems then! ;-)

(sorry, couldn't resist, I don't think Ganesh implied that 256 was an
absolute maximum, he just noted that the allocated reserve was a power
of 2).

Eric

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #8 (permalink)  
Old 08-01-2008, 04:58 PM
Eric Meijer
 
Posts: n/a
Re: memory efficient STL allocator alternatives? [ solved ]

JoshuaMaurice@gmail.com wrote:

[ ... ]
> In case you overlooked it, might I again suggest what
> lhyatt@princeton.edu recommended. It's the common idiom that if you
> want to shrink the vector's capacity to its current size, then you do:
>
> On Jul 31, 7:19 am, "lhy...@princeton.edu" <lhy...@gmail.com> wrote:
>> template<typename Vector>
>> void shrink_vector(Vector& v) {
>> Vector(v.begin(), v.end()).swap(v);
>>
>> }

>
> Or, more commonly,
> vector<int>(x).swap(x);
>
> If you build each vector one at a time, then at the end of working
> with each vector, you can shrink its capacity to its size, and proceed
> to the next one. You will gain your desired space savings with a small
> amount of time overhead.
>
> If you're building all the vectors at once, and you don't know when
> you're done with some vector, then there is no "end point" when you're
> done working with some vector. Thus the only way to save space is to
> shrink it after every remove and insert, or some sort of heuristic to
> determine when to shrink. This problem would be more interesting, and
> we don't have enough information on your exact problem to solve it.


This is indeed the case, the whole set of vectors is updated in a
chaotic pattern (it _is_ a fractal (http://www.elmweb.nl/)), and I don't
know when a particular vector is "ready". I took the shrink_vector
function as inspiration to implement alternatives for the insert() and
erase() methods for vectors (see code fragment below). The advantage
above the shrink_vector function is that it prevents an extra copy of
the elements of the vector. The "ShrinkThreshold" allows a trade-off
between cpu and memory resources. This tackles the problem.

What I don't like about it is that to my taste the solution contains too
much "inside knowledge" about the memory allocation strategies of
vector. I also don't know to what degree this is standard required
behaviour, common usage, or even GNU libstdc++ specific (I guess not).
Would there be a "market" for a lean_vector<T> class that behaves along
the lines of the code below, or can this behaviour be factored out of
vector somehow?

Thanks everyone for the suggestions.

Eric

---- code fragment ----

// threshold for shrinking intervals vector
const size_t ShrinkThreshold = 4;

// "lean" erase that never lets the capacity grow
// beyond size + ShrinkThreshold

template<typename V>
void LeanErase(V& v, typename V::iterator i)
{
if(v.capacity() >= v.size() + ShrinkThreshold) {
V w(v.size() - 1);
typename V::iterator wi = w.begin();
for(typename V::iterator vi = v.begin(); vi != v.end(); ++vi) {
if(vi == i) continue;
*wi++ = *vi;
}
w.swap(v);
}
else {
v.erase(i);
}
}

// lean insert that grows the vector capacity only by ShrinkThreshold

template<typename T>
void LeanInsert(vector<T>& v, typename vector<T>::iterator i,
const T& t)
{
if(v.capacity() < v.size() + 1) {
vector<T> w;
w.reserve(v.size() + ShrinkThreshold);
w.resize(v.size() + 1);
typename vector<T>::iterator vi(v.begin());
typename vector<T>::iterator wi = w.begin();
while(vi != i) *wi++ = *vi++;
*wi++ = t;
while(vi != v.end()) *wi++ = *vi++;
w.swap(v);
}
else {
v.insert(i, t);
}
}

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #9 (permalink)  
Old 08-01-2008, 04:58 PM
Alberto Ganesh Barbati
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

red floyd ha scritto:
> Alberto Ganesh Barbati wrote:
>> vector implementations usually don't overallocate on resize() or
>> reserve() but they usually do so on push_back(). In fact, the gcc
>> implementation of push_back() adds elements up to capacity(), then
>> reallocates by doubling the capacity. It's not a coincidence that the
>> max capacity is exactly 256!

>
>
> I think you have a disconnect somewhere. I've had vectors much larger
> than 256 in GCC.
>


Aww, be serious! Of course you can have vectors larger than that! But
has the OP has said that *his vectors* hold at most 201 elements, it's
not a coincidence that the capacity of *those vectors* is exactly 256.

Ganesh

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
  #10 (permalink)  
Old 08-01-2008, 04:58 PM
Bo Persson
 
Posts: n/a
Re: memory efficient STL allocator alternatives?

red floyd wrote:
> Alberto Ganesh Barbati wrote:
>> vector implementations usually don't overallocate on resize() or
>> reserve() but they usually do so on push_back(). In fact, the gcc
>> implementation of push_back() adds elements up to capacity(), then
>> reallocates by doubling the capacity. It's not a coincidence that
>> the max capacity is exactly 256!

>
>
> I think you have a disconnect somewhere. I've had vectors much
> larger than 256 in GCC.


It was about the capacity of the OP's vector

> Here is some output of my analysis of a run that computes the data
> structure:
>
> min, avg, max elements : 1, 43.8756, 201
>
> So on average some 44 elements, in a range from 1 to 201.
> I also checked the vector.capacity(), which gave me this interesting
> result:
>
> min, avg, max vector capacity : 1, 63.801, 256


Seeing a capacity of 256 for a vector with sizes around 200 is not
surprising, nor is the average capacity being very close to 64.


Bo Persson



--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Reply With Quote
Reply

  { mindfrost82.com } > Gadget Corner > Tech Newsgroups > Programming > C++


Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are Off
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT. The time now is 11:20 AM.


Powered by vBulletin, Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Friendly URLs by vBSEO 3.1.0 ©2007, Crawlability, Inc.
© 1999-2008 mindfrost82.com v11.0


Sponsors:
Mortgages | Loans | Bad Credit Mortgages | Gas Suppliers | Literatura fantastica



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114