+ 1
How to implement effective way of set_intersection for duplicate values
Hi All I tried to implement the better way of finding set_intersection for non sorted unique elements of the input... I am stuck at how can I make it work for duplicate values as well.. please throw some light on how to achieve the same ? More details is present as code comment for below code: https://code.sololearn.com/crlS14DO8EvC/?ref=app
5 Réponses
+ 2
Oh! Like, a multiset intersection or so.
Hmm, how about this:
keep a `std::unordered_map<T, std::tuple<int, int>>` where all the keys are your vector elements and the values are tuples, where the first element is the number of occurrences in list `a` and the second element is the number of occurrences in list `b`.
You can build that map in O(n).
Then, filter list `b`. Foreach `x` in `b`: Look `x` up in the map. Are both elements in the tuple > 0?
If no, throw `x` away.
If yes, keep `x` and decrement both elements in the tuple by 1.
Should be O(n)!
+ 1
After you intersect you could filter your vector by running through it once and putting each element into a new set. Unless the element is already present, then you found a duplicate. You can even do that in the same single loop.
Also I don't believe that's really O(n). It's O(n) with a big asterisk, as inserting into a set is only O(1) in the average case, with O(n) in the worst case so you have O(n²) worst case performance.
Also the sort-and-remove algorithm is so simple that it probably works better for average sized arrays anyway. I guess you'd have to test it.
+ 1
Schindlabua true that algorithm function are almost most of the time great compared to the custom developed...
however , that set_intersection misplace output sequence input need to be sorted before passing to function..... So yes , it's always almost a better choice to use already available untill we are not sure about fully custom developed.
I am sorry but I could not understand your first para... So , request you to elaborate on how to achieve custom developed algorithm for the finding of common elements when we have duplicates in the input... I feel I have mis articulated the problem... Let me try with example:
input a = {10,2,3,5,10,5,6,9}
input b = {5,5,9,7,10}should have answer as {5,5,9,10} for common elements
and many thanks for your help 👍
+ 1
Thinking about it you only need to count how many `a`s there are. So scratch the tuple and make it a unordered_map<T, int>! That's almost what you are already doing in your first example anyway.
0
Perfect 👍 thanks Schindlabua