+ 6

[Assignment] [Challenge]: Longuest common substring

Given X and Y two strings, build a function C=S(X,Y) so that C is a subsequence of X and Y and C is the longuest (you can ignore cases where there are many of them) # Tests (X,Y),C => f(X,Y) should equal C ("azertyuiop","zwwxuiop"),"zuiop" ("aaaataaaazbaazzz","yitafgabjjazebza"),"taaazbz" ("12345","aeetyyy"),"" ("a2z3e4r5t6zz","a234562ertzz"),"a23ertzz" Bigger test case here : https://code.sololearn.com/ceO4P41uQE7E/#py

3rd May 2018, 12:15 PM
VcC
VcC - avatar
14 odpowiedzi
+ 19
https://code.sololearn.com/c0rMCRCnPRCP/?ref=app
3rd May 2018, 6:17 PM
LukArToDo
LukArToDo - avatar
+ 5
@VcC I didn't want to spoil it to let the others find out their own way, but if you ask... Here, it is a recursive one together with some unnecessary optimizations (as usual well commented): https://code.sololearn.com/cpsKMKAAHnS3/?ref=app However, I have just got an idea, how it can be done even easier by realization with about the same time complexity and less memory complexity. I will try to code it later...
3rd May 2018, 3:28 PM
Nevfy
+ 4
I think people try to solve then look at others code... Mine same as yours. https://code.sololearn.com/csH8yT3oNfBJ/?ref=app
3rd May 2018, 4:04 PM
VcC
VcC - avatar
+ 3
@VcC One more super cool task! It took me for the almost half of hour to solve it properly. A brute force solution will have a huge complexity of about O(2^(M+N)) by time and O(M*2^M+N*2^N) by memory, where M and N - lengths of input strings. However, a smarter approach allows to reduce it (if I estimate it correctly) up to O(2^max(M,N)) by time and O(M+N) by memory.
3rd May 2018, 3:20 PM
Nevfy
+ 3
Cool. Show us your solution ! As i am concerned i focused on doing it in one line as short as possible :-)
3rd May 2018, 3:22 PM
VcC
VcC - avatar
+ 3
Bigger test case if you want to see how efficient your solution is: https://code.sololearn.com/ceO4P41uQE7E/#py
3rd May 2018, 8:22 PM
VcC
VcC - avatar
+ 3
Since nobody else didn't want to demonstrate the solution based on dynamic programming with memoization approach, I have done it by my own: https://code.sololearn.com/c7W8OV49STkF/?ref=app Its time and memory complexities should be both equal to O(N*M), where M and N - lengths of input strings. However, I'm not the expert in a field, so feel free to test it and inform me about any bugs that you have met!
6th May 2018, 2:14 AM
Nevfy
+ 3
I have found some time to perform several time measurements. PART 1: My recursive solution and its optimizations PART 2: Comparision to @VcC recursive solution PART 3: Notes about dynamic programming solutions To test recursive solutions I took 2 examples: 1. ("aaaataaaazbaazzz","yitafgabjjazebza") -> "taaazbz" 2. ("a2z3e4r5t6zz","a234562ertzz") -> "a23ertzz" All times are given in relative units. ##### PART 1 ##### I test 2 optimizations that I introduced in my code. 1. First one checks if two input strings are equal. If so, there is no need to continue recursion, just this string can be returned. It is useful for long strings, which have similar symbols at their ends. 2. Second optimization: first string (from which symbols are taken) is always shorter (or equal by length) than second string (in which symbols are searched). Test 1: w/o optimizations: 1x 1 optimization: 0.9x 2 optimization: 0.45x both optimizations: 0.45x Test 2: w/o optimizations: 1x 1 optimization: 0.65x 2 optimization: 0.98x both optimizations: 0.73x We can see that for first example second optimization helps to gain more than twice! First optimization is also useful when used independently but doesn't gain more together with the second one. And it is completely different for the other test case. First optimization is much more useful that the second one. And when used together second optimization only slows down the solution! ##### PART 2 ##### Comparision of 2 solutions from me (with both optimizations) and @VcC also shows interesting results! Test 1: @Nevfy: 1x @VcC: 0.54x Test 2: @Nevfy: 1x @VcC: 1.9x We see that each solution works better for specific tests. ##### PART 3 ##### I have found that my and @VcC solutions for dynamic programming approach are almost identical. However, there is a major thing that allows VcC's solution to be almost two times quicker. He uses build-in Python lists, when I use NumPy arrays. So you should choose your data structures wisely!
8th May 2018, 2:12 AM
Nevfy
+ 2
Agree ! Updated test cases
3rd May 2018, 1:20 PM
VcC
VcC - avatar
+ 2
Nevfy Nice analysis. The rule for python efficiency is : find the good algorithm, use the efficient datastructure and prebuilt functions and leave the smallest possible work to the slow python interpreter. One reason i m trying to oneline any problem - make me find new ways to use python possibilities while using the interpreter as less as possible
8th May 2018, 2:32 PM
VcC
VcC - avatar
+ 2
#what u hv given as eg: is not substring .. its substring of all possible permutations azertyuiop zwwxuiop for substring... ans shud be uiop zuiop is not a substring or subsequence of either two...
3rd Aug 2018, 5:09 PM
sayan chandra
sayan chandra - avatar
3rd Aug 2018, 5:10 PM
sayan chandra
sayan chandra - avatar
+ 1
test cases unclear. in case 2, the subsequence "taaazba" can be formed from both, and in case 3 the same can applied to the subsequence "a23ertzz"
3rd May 2018, 12:45 PM
hinanawi
hinanawi - avatar
3rd May 2018, 8:49 PM
Sebastian Keßler
Sebastian Keßler - avatar