+ 1

How do I replace a substring only if it is a whole word?

For example, I want to replace the word "in" with "on" in the string "I was walking in Memphis." If I do this:

    result = re.sub("in", "on", string)
    # result is "I was walkong on Memphis."

(Note "walkong".)

I tried using regular expressions like r"\Win\W" to only replace the expression if it is surrounded by non-word characters. But this also replaces the surrounding whitespace, for example:

    import re
    pattern = r"\Win\W"
    repl = "on"
    string = "I was walking in Memphis."
    result = re.sub(pattern, repl, string)
    print(result)  # prints "I was walkingonMemphis."

Any ideas?

Update: Sharing my progress. The following code works, but only as long as there is only one occurrence of "in". Maybe I could fix that, but it feels like an unnecessarily complicated way to go:

    import re
    old_word = "in"
    pattern = r"\W" + old_word + r"\W"
    repl = "on"
    string = "I was walking in Memphis."
    match = re.search(pattern, string).group()
    splt = re.split(pattern, string)
    new_word = re.sub(old_word, repl, match)
    result = splt[0] + new_word + splt[1]
    print(result)  # prints "I was walking on Memphis."

2nd Jan 2019, 10:15 PM
Lennart Wisbar
10 Answers
0
Thank you, everybody! You really helped me. I decided I liked my \b-solution best, so here is the full working code. I added a comma to the string to emphasize that the solution doesn't only work for whitespace.

    import re
    s = "I was walking in, Memphis."
    old_word = "in"
    new_word = "on"
    pattern = r"\b" + old_word + r"\b"
    result = re.sub(pattern, new_word, s)
    print(result)  # prints "I was walking on, Memphis."
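In case it helps anyone reusing this: a minimal sketch of the same \b idea wrapped in a function. The function name replace_whole_word is made up for illustration. The sketch assumes old_word starts and ends with a word character, because \b only matches between a word character and a non-word character (or the edge of the string). re.escape is added so that characters like "." or "+" inside old_word are not read as regex syntax.

```python
import re

def replace_whole_word(text, old_word, new_word):
    """Replace old_word with new_word only where it stands as a whole word.

    Hypothetical helper, not part of the re module.
    Assumes old_word begins and ends with a word character.
    """
    # Escape old_word so it is matched literally, then fence it with \b.
    pattern = r"\b" + re.escape(old_word) + r"\b"
    # A function replacement inserts new_word literally
    # (no backslash or group-reference processing).
    return re.sub(pattern, lambda match: new_word, text)

print(replace_whole_word("I was walking in, Memphis. In fact, in the rain.", "in", "on"))
# prints "I was walking on, Memphis. In fact, on the rain."
```

The match is case-sensitive, so "In" stays untouched; passing flags=re.IGNORECASE to re.sub would change that.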
3rd Jan 2019, 10:01 AM
Lennart Wisbar
+ 2
Try this:

    result = re.sub(r"(?<!\S)in(?!\S)", "on", string)

(?<!\S)in(?!\S) will match all occurrences of "in" enclosed by whitespace or at the start/end of the string. Check the following links for more information on the pattern:
https://docs.python.org/3/library/re.html#index-20
https://docs.python.org/3/library/re.html#index-22
https://docs.python.org/3/library/re.html#index-30
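A small sketch of why this keeps the spaces while r"\Win\W" from the question does not: the lookarounds only check the neighbouring characters without consuming them, so only "in" itself gets replaced. The sample sentence and variable names are chosen just for this illustration.

```python
import re

string = "I was walking in Memphis and in Nashville"

# \W consumes the surrounding spaces, so they are replaced along with "in".
consuming = re.sub(r"\Win\W", "on", string)

# The lookarounds are zero-width: they test the neighbours but leave them in place.
lookaround = re.sub(r"(?<!\S)in(?!\S)", "on", string)

print(consuming)   # prints "I was walkingonMemphis andonNashville"
print(lookaround)  # prints "I was walking on Memphis and on Nashville"
```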
2nd Jan 2019, 11:18 PM
Diego
+ 2
Lennart Wisbar You're welcome! Always happy to help. And feel free to use the answer you like the most. Happy coding!
3rd Jan 2019, 2:18 AM
Diego
+ 2
You don't even need regex for that 😜

    print(' '.join([w if w != 'in' else 'on' for w in 'I was walking in Memphis'.split()]))

Or you can define a dict with arbitrary words and their replacements:

    repl_dict = {
        'in': 'on',
        'I': 'You',
        'was': 'were',
        'Memphis': 'the sun',
    }
    s = 'I was walking in Memphis'
    print(' '.join([repl_dict.get(w, w) for w in s.split()]))
    # output: You were walking on the sun
3rd Jan 2019, 6:35 AM
Anna
3rd Jan 2019, 10:00 AM
Anna
+ 2
Lennart Wisbar Right. I fixed it
3rd Jan 2019, 10:09 AM
Anna
+ 1
Diego Acero Thank you! This is almost it. First, I modified your regex by replacing \S with \w, so it also worked if the word was followed by a comma, for example. Then I followed one of your links and found out about \b (again). It signifies word boundaries. This is much shorter:

    pattern = r"\b" + old_word + r"\b"

Now I'm not sure if I should accept your answer or mine. Your answer missed non-whitespace characters like commas, but it was your answer that led me to mine, so it feels a bit unfair ...
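To make the difference concrete, here is a rough side-by-side of the three patterns on a string where "in" is followed by a comma. The dict labels and the sample string are only for this illustration.

```python
import re

s = "I was walking in, Memphis."

# Labels are illustrative; the patterns are the ones discussed in this thread.
patterns = {
    "whitespace lookarounds": r"(?<!\S)in(?!\S)",
    "word-character lookarounds": r"(?<!\w)in(?!\w)",
    "word boundaries": r"\bin\b",
}

results = {name: re.sub(pattern, "on", s) for name, pattern in patterns.items()}

for name, result in results.items():
    print(f"{name}: {result}")
# whitespace lookarounds: I was walking in, Memphis.
# word-character lookarounds: I was walking on, Memphis.
# word boundaries: I was walking on, Memphis.
```

The whitespace version leaves the string unchanged because the comma right after "in" is a non-whitespace character, so (?!\S) fails there.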
2nd Jan 2019, 11:52 PM
Lennart Wisbar
0
Anna Nice, but it only works if you know exactly what separates the words. In your example, that's " ". If the sentence is "I was walking in Memphis.", then "Memphis" is not replaced, because it is directly followed by a full stop. I guess you do need a regex to be that flexible. Thank you anyway, still a helpful answer!
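For what it's worth, the dict idea and a regex could probably be combined. This is only a sketch: it reuses the repl_dict from the answer above, replace_words is a made-up name, and it assumes every key starts and ends with a word character so that \b behaves as expected.

```python
import re

repl_dict = {
    "in": "on",
    "I": "You",
    "was": "were",
    "Memphis": "the sun",
}

# One alternation of all keys, escaped and fenced with word boundaries.
pattern = re.compile(r"\b(" + "|".join(re.escape(word) for word in repl_dict) + r")\b")

def replace_words(text):
    # Look up each matched word in the dict; everything else stays as it is.
    return pattern.sub(lambda match: repl_dict[match.group(1)], text)

print(replace_words("I was walking in Memphis."))
# prints "You were walking on the sun."
```

Because the punctuation is never split off or joined back, the full stop after "Memphis" stays where it was.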
3rd Jan 2019, 9:50 AM
Lennart Wisbar
0
Anna I see it's possible without regex, but now you would have to do that for every non-word character like (, ), ", !, : etc.
3rd Jan 2019, 10:07 AM
Lennart Wisbar
0
Anna Nice! It works perfectly. The regex solution with \b seems easier and more readable to me, but it's nice to know that there are also other ways.
3rd Jan 2019, 10:33 AM
Lennart Wisbar