+ 3

How to allocate vector on heap for short string?

Hi,

My basic understanding is that stack capacity is very small compared to heap capacity. When I allocate millions of elements in a vector, I get a bad allocation failure. Please refer to the code snippet below:

1. demoInts allocates three ints on the stack, with no heap allocation.
2. In the code for demo vector int (), all the ints of the vector<int> are allocated on the heap. Because of this, I do not get any bad allocation failure for a vector<int> with millions of elements pushed into it.
3. demostring allocates three strings. One is large enough to defeat the small string optimization and is allocated on the heap. The other two (s1 and S2) are small strings and go on the stack.
4. In the code for demo vector string (), the strings of the vector<string> are not all allocated on the heap, unlike with vector<int>. They follow the small string optimization criteria.

My concern is this: a vector<string> with millions of small strings seems to do all its allocation on the stack, which results in bad_alloc.

14th Dec 2022, 11:35 AM
Ketan Lalcheta
16 Answers
+ 3
3. `std::string s;` takes 32 bytes on the stack (32 is the usual size on a 64-bit libstdc++ build; other implementations differ), unless you allocate `s` on the heap manually or through a vector.
5. Here too, `s` is on the stack. But yes, if the string to be stored inside the std::string object isn't small, then space for the characters (the character array, not the std::string object `s`) is allocated on the heap.

The main thing is that whatever is inside a vector is always on the heap. The size of the std::string object is always the same. If the string to be stored is small enough, it goes into a fixed-size character array that is a field of the std::string object, i.e. it is part of those 32 bytes. Otherwise the whole string (the character array) is allocated on the heap.

EDIT: made small changes to the answer. In case you were reading as I made changes, please reload the page and read again.
14th Dec 2022, 6:10 PM
XXX
+ 1
In order to implement Small String Optimisation (SSO) and still keep sizeof(std::string) constant, the basic_string class always keeps a small buffer inside the object, ready to be filled with characters. If a string fits in this buffer (often around 15 to 22 characters, depending on your standard library implementation), it is considered a small string. If it doesn't fit, a dynamic memory allocation happens and the characters are stored on the heap.

So if you have a vector<string> with millions of elements, each and every element consumes the same amount of space in the vector (i.e. sizeof(string)), whether it is a small string or a long one. Long strings take additional space on the heap to accommodate the characters that do not fit.
14th Dec 2022, 5:02 PM
Arsenic
+ 1
Thanks Arsenic. I got your point, but my concern is allocation on the stack for vector<string>.

Actually, I have:

vector<int> ids
vector<string> names

I need to push 10,000,000 elements into both vectors.

Now, assume that each int element takes 4 bytes, so vector<int> ids will allocate 40,000,000 bytes on the heap. I am OK with this, as the heap has more memory available than the stack (please correct me if I am wrong).

Now, assume that each string element takes 8 bytes (obviously this depends on the compiler implementation, but my strings are small enough for the small string optimization). So vector<string> names will allocate 80,000,000 bytes on the stack (not on the heap, even though it is a vector, unlike in the case of vector<int>). I am NOT OK (I REPEAT, I AM NOT OK) with this, as the stack does not have much memory.

I know SSO allocates on the stack, but with SSO and millions of elements, even a vector seems to allocate so much on the stack that it becomes a bad allocation for me. Is there a way to stop this?
14th Dec 2022, 5:33 PM
Ketan Lalcheta
+ 1
I am also getting small strings allocated on the heap, like the vector itself, in this code: https://code.sololearn.com/cvjSl2OEb2CA/?ref=app

Here, in this new code, the vector of strings is allocated properly on the heap (32*10). But in the original code I shared earlier, only one string out of three is allocated on the heap in the function for the vector of strings. Basically, I don't want to use the stack.
14th Dec 2022, 5:45 PM
Ketan Lalcheta
+ 1
Ketan Lalcheta When it is said that "small strings go on the stack", what it actually means is that the small string is stored inside the std::string object itself. So when the std::string object goes on the heap, its contents will also be on the heap.

See this answer for an overview of how SSO is implemented: https://stackoverflow.com/questions/10315041/meaning-of-acronym-sso-in-the-context-of-stdstring

EDIT: That's the link to the thread, here's the answer: https://stackoverflow.com/a/10319672
14th Dec 2022, 5:45 PM
XXX
+ 1
Ketan Lalcheta Just to clarify, inside a vector the std::string objects are stored on the heap only (as in your first code, where 96 bytes of space are allocated for 3 strings). Their contents can either go in a separate heap block, or in the fixed-size array that each std::string holds for small strings.
14th Dec 2022, 5:47 PM
XXX
+ 1
OK, got your point XXX. Just to confirm:

1. int a; // 4 bytes on stack
2. vector<int> v(10); // 40 bytes on heap + 24 bytes on stack for the vector object
3. string s; // 32 bytes on heap even though it is a small string
4. vector<string> v(10); // 320 bytes on heap even though the strings are small + 24 bytes on stack for the vector object
5. string s; // 32 bytes on heap + some additional bytes on heap that could not be accommodated in the string object. Correct?
6. vector<string> v(10); // 320 bytes on heap + 24 bytes on stack for the vector object + additional bytes on heap for each string that could not be accommodated in the string object because it is large

Is this correct?
14th Dec 2022, 5:55 PM
Ketan Lalcheta
+ 1
Okay, thanks XXX. It seems my assumption that a vector of small strings allocates so much on the stack was wrong. The fact is this: for small strings too, vector<string> puts only the vector object on the stack (a mere 24 bytes or so), and everything else, including the small strings, is allocated on the heap.

With this, I think I have to dig more into my production code, where I was getting bad_alloc when reserving space for millions of small strings through vector<string> v and v.reserve(10000000).

Many thanks again.
14th Dec 2022, 6:49 PM
Ketan Lalcheta
+ 1
Ketan Lalcheta bad_alloc is generally thrown when an allocation on the heap fails, so most likely the allocator could not provide that much space. 10,000,000 strings at 32 bytes each is 320 million bytes, which is roughly 305 MiB that you are trying to allocate in one contiguous block.
14th Dec 2022, 7:15 PM
XXX
+ 1
Thanks XXX for this insight. RAM is responsible for providing the space for the heap, correct? I have 32 GB of RAM, and when I started my application the system had 21 GB of free RAM. So shouldn't this be fine? Or am I mistaken about RAM and how space is allocated from it?
15th Dec 2022, 4:29 AM
Ketan Lalcheta
+ 1
Ketan Lalcheta Sorry, I wanted to reply to this, but completely forgot. I don't actually know about that. I tried on my phone, and I was able to allocate 2^27 strings, but at 2^28 strings I got a bad_alloc error. So I don't know why your code is giving that error. A bad_alloc error is most likely not your fault; it is thrown by the allocator when an allocation fails: https://stackoverflow.com/questions/18684951/how-and-why-an-allocation-memory-can-fail
22nd Dec 2022, 5:33 PM
XXX
+ 1
Ketan Lalcheta The '^' operator is the bitwise XOR operator in C++, and (2 ^ 270) gives 268. By 2^27 I meant 2 to the power of 27.

Anyway, as I said, I tried it on my phone (on Termux) and it allowed allocation of space for 2^27 strings (just 13 times what you were trying to allocate). But I also tried it on Sololearn, and it allows space for over 2^41 strings (about 200,000 times 10 million): https://code.sololearn.com/cm9D2QrYoKKT/?ref=app

I don't know if this information is of much use, since there is a lot about memory allocation in the OS that I don't understand. Also, there is pretty much no real-life case where you need a vector of that many strings, so I don't think there's any point in worrying about the bad_alloc error.
25th Dec 2022, 8:55 PM
XXX
14th Dec 2022, 11:36 AM
Ketan Lalcheta
0
Basically, I need vector<string> to do all its allocation on the heap for small strings too, the same way vector<int> does all its allocation on the heap. This would help me avoid bad allocation on the stack.
14th Dec 2022, 5:34 PM
Ketan Lalcheta
0
Thanks XXX. Did you get bad_alloc on Sololearn? I tried, but I am not getting it with the code below: https://code.sololearn.com/crJzewitWuNJ/?ref=app
23rd Dec 2022, 5:55 PM
Ketan Lalcheta
0
Thanks XXX
26th Dec 2022, 4:22 AM
Ketan Lalcheta