+ 4

How do I compress a string in Python? I am not getting the proper output. Can you please help me?

A string s is given as input. Example: "aaabb" gives the output 3a2b, and "bbcd" gives the output 2bcd. Here we count the frequency of each character and have to display each count together with its character. https://code.sololearn.com/cBVuy2M68ul6/?ref=app

python character count string frequency compress

29th Feb 2020, 10:47 AM

Shelly Kapoor

15 Answers

+ 7

Gordon, storing individual letters in a dictionary with their frequencies only works for sequences like 'aaabb', not for 'aaabbaabbb':

st = input() or 'aaabbaabbb'
d = dict(zip(list(st), map(st.count, st)))
print(''.join([str(d[i]) + i for i in d]))

The output is '5a5b', but it should be '3a2b2a3b'. I did it like this: https://code.sololearn.com/c2206HU7VFDA/?ref=app
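A minimal sketch of counting consecutive runs with a plain loop, so that 'aaabbaabbb' becomes '3a2b2a3b'. This is not the linked code; the function name compress_runs is made up for illustration.

```python
def compress_runs(text):
    """Run-length encode text, e.g. 'aaabb' -> '3a2b'."""
    if not text:
        return ""
    parts = []
    current = text[0]  # character of the run being counted
    count = 1
    for ch in text[1:]:
        if ch == current:
            count += 1
        else:
            # The run ended: emit it and start counting the new character.
            parts.append(f"{count}{current}")
            current = ch
            count = 1
    parts.append(f"{count}{current}")  # emit the final run
    return "".join(parts)


print(compress_runs("aaabbaabbb"))  # 3a2b2a3b
```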

29th Feb 2020, 11:43 AM

Vitaly Sokol

+ 2

Looping through the string is correct, but instead of using a single count variable and comparing only with the first character, use a dictionary to store the counts:

count = {}  # before the loop

If the current character is not yet a key in the dictionary, initialize its value to 1:

if ch not in count:
    count[ch] = 1

Otherwise, increment it by 1:

else:
    count[ch] = count[ch] + 1

Finally, loop through the dictionary to display the result.
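Put together, the steps above might look like the sketch below. The function names char_frequencies and format_counts are hypothetical. Note that this counts each character over the whole string, so repeated runs of the same letter are merged.

```python
def char_frequencies(text):
    """Count how often each character occurs in the whole string."""
    count = {}  # before the loop
    for ch in text:
        if ch not in count:
            count[ch] = 1
        else:
            count[ch] = count[ch] + 1
    return count


def format_counts(count):
    """Join the counts as '<count><char>' in first-seen order (Python 3.7+ dicts)."""
    return "".join(f"{n}{ch}" for ch, n in count.items())


print(format_counts(char_frequencies("aaabb")))       # 3a2b
print(format_counts(char_frequencies("aaabbaabbb")))  # 5a5b
```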

29th Feb 2020, 10:51 AM

Gordon

+ 2

If we are talking about RLE (run-length encoding), we can use takewhile() from Python's itertools. It can pick off each run of consecutive identical characters, which can then be encoded easily.
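One way this could be done with takewhile(), as a rough sketch: take the leading run of the current character, record its length, then skip past it. The name rle_takewhile is an assumption, and slicing the string on every step makes this slower than necessary for long inputs.

```python
from itertools import takewhile


def rle_takewhile(text):
    """Run-length encode text by repeatedly taking the leading run."""
    parts = []
    i = 0
    while i < len(text):
        ch = text[i]
        # Length of the run of `ch` that starts at position i.
        run = len(list(takewhile(lambda c: c == ch, text[i:])))
        parts.append(f"{run}{ch}")
        i += run  # jump to the first character after the run
    return "".join(parts)


print(rle_takewhile("aaabbaabbb"))  # 3a2b2a3b
```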

29th Feb 2020, 4:45 PM

Lothar

+ 2

rodwynnejones, your code sample collects, for example, all the "a" characters together, so the output is not correct in terms of RLE: 7a6b4c

29th Feb 2020, 8:05 PM

Lothar

+ 2

Here is code that works with itertools groupby(): https://code.sololearn.com/cH6ss6iK6EP8/?ref=app
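For readers who cannot open the link, a groupby() version might look roughly like this. It is a sketch, not the linked code. The name rle_groupby and the omit_ones flag are assumptions; the flag is there because the original question shows "bbcd" encoded as 2bcd, with counts of 1 left out.

```python
from itertools import groupby


def rle_groupby(text, omit_ones=False):
    """Run-length encode text; groupby() yields one group per consecutive run."""
    parts = []
    for ch, run in groupby(text):
        n = sum(1 for _ in run)  # length of this run
        if omit_ones and n == 1:
            parts.append(ch)
        else:
            parts.append(f"{n}{ch}")
    return "".join(parts)


print(rle_groupby("aaabbaabbb"))            # 3a2b2a3b
print(rle_groupby("bbcd", omit_ones=True))  # 2bcd
```

With omit_ones=True the result cannot be decoded unambiguously if the input itself contains digits.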

29th Feb 2020, 8:15 PM

Lothar

+ 1

Maybe you should first have a look at what the code is producing:

Input string: abbc deffg hiiki
Result from program: 1a2b1a2b1c1 1d1e2f1a2b1c1 1d1e2f1g1 1h2i
Result expected: 1a2b1c 1d1e2f1g 1h2i1k1i

If spaces should also be handled, the encoding should look like this: 1a2b1c1 1d1e2f1g1 1h2i1k1i

29th Feb 2020, 3:45 PM

Lothar

+ 1

Lothar, ahh, I see... I was not aware of the RLE thing (I just now googled it), and I hadn't understood the requirement correctly.

29th Feb 2020, 8:27 PM

rodwynnejones

+ 1

Shelly Kapoor, is your input one single-word string, i.e. no spaces? Have a look at regular expressions (re.findall). I've had a go using re (after I finally understood what you're trying to do). Here's the pattern I used:

mystring = "aaabbaabbbbc"
pattern = '+|'.join(set(mystring)) + '+'

https://code.sololearn.com/cIKvkA7HZdHq/#py
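A sketch of how such a pattern could be used with re.findall (not the linked code; rle_regex is a made-up name). It adds re.escape() as a precaution, since characters such as "." or "+" in the input would otherwise be read as regex syntax.

```python
import re


def rle_regex(text):
    """Run-length encode text by matching runs like 'a+|b+|c+'."""
    if not text:
        return ""
    # One alternative per distinct character, each matching a run of it.
    pattern = "|".join(re.escape(ch) + "+" for ch in set(text))
    runs = re.findall(pattern, text)
    return "".join(f"{len(run)}{run[0]}" for run in runs)


print(rle_regex("aaabbaabbbbc"))  # 3a2b2a4b1c
```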

29th Feb 2020, 10:34 PM

rodwynnejones

+ 1

Shelly Kapoor, any feedback on the answers you got?

2nd Mar 2020, 9:49 AM

Vadym M

Vitaly Sokol Why do you think that it won't work for aaabbaabbbb? https://code.sololearn.com/cjwW0ypPO75g/?ref=app JSON is applicable to any programming language ~

29th Feb 2020, 11:57 AM

Gordon

Gordon, How can we restore the original sequence from its compressed form '5a5b'? We need compression in the form of '3a2b2a3b' to restore the original string, as I understand it.
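To illustrate the point about restoring the original, here is a small decoding sketch. It assumes the original string contains no digits, and decompress_runs is a hypothetical name.

```python
import re


def decompress_runs(encoded):
    """Expand '<count><char>' pairs, e.g. '3a2b' -> 'aaabb'."""
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(ch * int(n) for n, ch in pairs)


print(decompress_runs("3a2b2a3b"))  # aaabbaabbb
print(decompress_runs("5a5b"))      # aaaaabbbbb
```

The second call shows why '5a5b' is not enough: it expands to a different string than the original 'aaabbaabbb'.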

29th Feb 2020, 12:02 PM

Vitaly Sokol

Vitaly Sokol 🤔 I don't see the need for decompression in the original question. But I can see your point now.

29th Feb 2020, 12:12 PM

Gordon

- 1

mystring = "aaabbaabbccaabbcc"
for x in sorted(set(mystring)):
    print(mystring.count(x), x, sep='', end='')

Or you can use "Counter" from the "collections" module.
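A possible Counter version, as a sketch (total_counts is a made-up name). Like the loop above, it reports totals per character rather than consecutive runs.

```python
from collections import Counter


def total_counts(text):
    """Total occurrences per character, in sorted character order."""
    counts = Counter(text)
    return "".join(f"{counts[ch]}{ch}" for ch in sorted(counts))


print(total_counts("aaabbaabbccaabbcc"))  # 7a6b4c
```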

29th Feb 2020, 5:05 PM

rodwynnejones