1.6 KiB
1.6 KiB
1.6. String compression
Compression of count of repeated characters. If the compressed smaller isn't smaller than the original string, return the original string. The string only has lowercase and uppercase letters
example:
"aabcccccaaa" -> "a2b1c5a3"
return compr if len(compr) < slen else s
First idea
O(n), create a new empty string and add to that. Iterate through the string, keep count and add to the string. Start with compr=s[0] and count=1, and start at index=1.
def compress(s):
slen = len(s)
if slen < 3:
return s
compr = s[0]
count = 1
for i in range(1, slen):
if len(compr) >= slen:
return s
if s[i] == compr[-1]:
count += 1
if i == slen - 1:
compr += str(count)
else:
compr += str(count)
compr += s[i]
count = 1
if i == slen - 1:
compr += str(count)
return compr if len(compr) < slen else s
If character changes, update count in compr and reset to 1. If the string ends, add the count at the end.
Tests:
- "" -> "", correct
- "a" -> "a", correct
- "aa" -> "a2", correct
- "aabbb" -> "a2b3", correct
- "aabbbc" -> "aabbbc", correct
Remember that if len(compr) < len(s), return s. This has O(n) time, O(n) space.
Solution
Uses StringBuilder as optimal solution, because in the original (my) case, runtime is O(n + k2), where k is the number of character sequences. An optimization is, while creating the compressed string, if it gets bigger than the original one, stop and return s.