+ 1
How to implement word count if my file has hyphens?
Language: Scala or Java. I want to count every word and its frequency in a text file, for example *.txt. However, the file contains a word that is broken by a hyphen. The .txt reads: hello everyone, I am Bill Ga- tes. My aim is to print each word and its frequency in the *.txt using a Map[String,Int]. So how do I deal with the word "Gates"? I am a rookie in Scala, and the only method I know for reading this file is source.getLines(), so my output contains two words instead of one "Gates". How can I read a hyphen "-" in a file and count one word instead of two? I appreciate your help! Scala would be better, but Java is also accepted. Thanks!
6 Answers
+ 5
Each hyphen splits one word into two pieces, so if a hyphenated word is counted as two words, we can subtract the number of hyphens from the final result.
Let's say you had this text:
Hello I am Ay-mane and I wa-nt to say some-thing.
We have 10 words. A naive 'count words' method will count 13 pieces; subtracting the 3 hyphens gives 10.
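Here is a minimal Java sketch of that subtraction. The class and method names are made up for illustration, and it assumes every hyphen in the text sits inside a broken word (a stand-alone dash would throw the count off).

```java
class HyphenAdjustedCount {
    // Counts runs of letters, then subtracts one for every hyphen,
    // because each hyphen split one word into two pieces.
    static int countWords(String text) {
        int pieces = 0;
        for (String piece : text.split("[^A-Za-z]+")) {
            if (!piece.isEmpty()) { // split can yield an empty leading piece
                pieces++;
            }
        }
        int hyphens = 0;
        for (char c : text.toCharArray()) {
            if (c == '-') {
                hyphens++;
            }
        }
        return pieces - hyphens;
    }

    public static void main(String[] args) {
        String text = "Hello I am Ay-mane and I wa-nt to say some-thing.";
        System.out.println(countWords(text)); // 13 pieces - 3 hyphens = 10
    }
}
```

This only gives the total; it does not tell you which words were split.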
+ 4
Tinm jac, you can follow these steps:
- read the whole file
- count then remove hyphens
- remove "\n" (they represent newline)
I'm sure there are methods to do that. You can use what Aleksandrs Kalinins suggested; it will be something like string.replaceAll("-", ""), which will remove all your hyphens. The same goes for "\n". I think that is the result you want, but I do not know Scala, so I can't write a functioning script (but I can probably help you in DMs, and then you can share the final solution here).
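The steps above could look roughly like this in Java. It is only a sketch: the names are hypothetical, and it deviates from the steps in one place by turning the remaining line breaks into spaces instead of deleting them, because deleting them would glue the last word of a line to the first word of the next one. Note that it also merges genuine compounds such as "well-known" into one token.

```java
class HyphenJoiner {
    // Joins words that were broken by a hyphen and flattens the text to one line.
    static String joinHyphenated(String text) {
        return text
                .replaceAll("-\\s*", "")      // drop each hyphen plus any spaces or line breaks after it
                .replaceAll("\\r?\\n", " ");  // turn the remaining line breaks into single spaces
    }

    public static void main(String[] args) {
        String text = "hello everyone, I am Bill Ga-\ntes\nnice to meet you";
        System.out.println(joinHyphenated(text));
    }
}
```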
+ 1
Well, in Java, before putting a string into a map, you can call the replaceAll method to remove any characters you want. I bet this method exists in Scala as well.
https://code.sololearn.com/cijFhzvfH35Z/?ref=app
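As a rough illustration of calling replaceAll before the map insert (this is not the linked code, and the class name is invented), each whitespace-separated token is stripped of everything except ASCII letters and then counted:

```java
import java.util.Map;
import java.util.TreeMap;

class CleanedWordFrequency {
    // Cleans every token with replaceAll, then counts it in a sorted map.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> freq = new TreeMap<>();
        for (String token : text.split("\\s+")) {
            // Remove hyphens, commas, full stops and any other non-letter.
            String word = token.replaceAll("[^A-Za-z]", "");
            if (!word.isEmpty()) {
                freq.merge(word, 1, Integer::sum); // insert 1 or add 1 to the old count
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        System.out.println(count("I wa-nt to say some-thing, I wa-nt it."));
    }
}
```

This handles a hyphen inside a token ("wa-nt"), but a word broken with a space or line break after the hyphen ("Ga- tes") still arrives as two tokens, so the pieces have to be joined before splitting.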
0
But what if the file is very long? I cannot pick out the words and call replaceAll on them one by one 😂
0
Aymane's method is brilliant for counting the total number of words
But what if I want more?😂😂
Each word and its frequency!
Maybe I ask for too much😂😂
0
My idea is to convert the whole content of this file into one huge String that could be printed on a single line, so I can replace each hyphen with "" and the two pieces merge into one word
BUT HOW TO IMPLEMENT IT? That's hard for me 😂😂😂😂
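One possible way to implement that idea in Java is sketched below. The class and method names are hypothetical, the file is assumed to be UTF-8, and the whole file is loaded into memory, which should be fine for ordinary text files. Only a hyphen that is followed by whitespace is treated as a word break, so "Ga-" at the end of a line is glued to "tes" on the next line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

class FileWordFrequency {
    // Counts each word in a text after re-joining words broken by "-" plus whitespace.
    static Map<String, Integer> countText(String text) {
        String joined = text.replaceAll("-\\s+", ""); // "Ga-\ntes" becomes "Gates"
        Map<String, Integer> freq = new TreeMap<>();
        for (String word : joined.split("[^A-Za-z]+")) {
            if (!word.isEmpty()) {
                freq.merge(word, 1, Integer::sum);
            }
        }
        return freq;
    }

    // Reads the whole file into one String and counts its words.
    static Map<String, Integer> countFile(Path path) throws IOException {
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        return countText(text);
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            System.out.println(countFile(Paths.get(args[0]))); // path given on the command line
        } else {
            System.out.println(countText("hello everyone, I am Bill Ga-\ntes"));
        }
    }
}
```

Since Scala strings are Java strings, the same replaceAll and split calls can be used from Scala on the text read from the file.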