How do I extract only known URLs?

For example, I have a text file like this:

https://www.site.com/part1/part3/...........
https://www.site.com/part1/part2/...........
https://www.site.com/part1/part3/...........

I want to extract only the URLs like this one:

https://www.site.com/part1/part3/...........

and not the part2 ones. How can I do this? 🧐 Thank you, friends 😸 I have been looking for 2 days and can't find anything :(

python regular-expressions text json python3 extract

12th Mar 2019, 11:30 AM

Halil İbrahim Yalçın

3 Answers

+ 5

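This pattern should work for you:

    import re

    # The negative lookahead (?!part2/) rejects any link whose
    # segment right after /part1/ is part2.
    pat = r'.*/part1/(?!part2/).*'
    m = re.search(pat, link)
    if m:
        pass  # then select
    else:
        pass  # reject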

12th Mar 2019, 11:47 AM

Шащи Ранжан

+ 1

You should be able to use a regex and match the URLs against "https://www.site.com/part1/part3/.*". I don't know what you actually need, though, because you gave us only this example site and a very limited set of test cases. Do you want only part3, or everything except part2? Does part2 appear only after part1? Does your text file contain anything other than URLs?
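To make the difference between those two readings concrete, here is a rough sketch. The sample links (including the part4 one and the file names) and the variable names `links`, `only_part3`, and `not_part2` are made up for illustration, since the real paths in the question are elided.

```python
import re

# Hypothetical sample links; the real paths in the question are elided.
links = [
    "https://www.site.com/part1/part3/a.jpg",
    "https://www.site.com/part1/part2/b.jpg",
    "https://www.site.com/part1/part4/c.jpg",
]

# Reading 1: keep only URLs under /part1/part3/.
only_part3 = [
    u for u in links
    if re.match(r"https://www\.site\.com/part1/part3/", u)
]

# Reading 2: keep everything under /part1/ except /part1/part2/.
# (?!part2/) is a negative lookahead: it fails if "part2/" comes next.
not_part2 = [
    u for u in links
    if re.match(r"https://www\.site\.com/part1/(?!part2/)", u)
]

print(only_part3)
print(not_part2)
```

If the file only ever contains part2 and part3 links, both readings give the same result, which is why the extra test cases matter.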

12th Mar 2019, 11:47 AM

Hatsy Rei

+ 1

Thank you both. You gave me ideas. I figured it out:

    # text_file holds the contents of the text file as a string
    urls = re.findall(r'https://www\.site\.com/part1/part3/\S*\.jpg', text_file)
    with open("a.txt", "a") as out:
        for url in urls:
            print(url, file=out)
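For anyone who finds this thread later, the same idea can be sketched end to end. This is only an illustration under some assumptions: the links end in .jpg, they contain no whitespace, and the names `PATTERN`, `extract_urls`, `save_urls`, and the sample text are hypothetical.

```python
import re

# Assumed pattern: URLs under /part1/part3/ that end in .jpg and contain
# no whitespace. \S*? is non-greedy, so each match stops at the first ".jpg".
PATTERN = re.compile(r"https://www\.site\.com/part1/part3/\S*?\.jpg")


def extract_urls(text):
    """Return every matching URL found in text, in order of appearance."""
    return PATTERN.findall(text)


def save_urls(urls, path):
    """Append one URL per line to the file at path."""
    with open(path, "a", encoding="utf-8") as out:
        for url in urls:
            out.write(url + "\n")


# Hypothetical sample standing in for the contents of the text file.
sample = (
    "https://www.site.com/part1/part3/a.jpg "
    "https://www.site.com/part1/part2/b.jpg "
    "https://www.site.com/part1/part3/c.jpg"
)
found = extract_urls(sample)
print(found)
```

Calling `save_urls(found, "a.txt")` afterwards would append the two part3 links to a.txt, one per line. Opening the file once with `with` avoids reopening it for every URL.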

12th Mar 2019, 1:02 PM

Halil İbrahim Yalçın