+ 3

How to stop Sklearn converting values to float

I'm trying to make a language difficulty predictor, but when I input my data, it converts all the strings to floats. Why won't it stay the way I defined it? How can I input the data? https://code.sololearn.com/chup4G6FlQBE/?ref=app

19th Jul 2020, 2:56 PM
Clueless Coder
9 Answers
+ 2
Check out LabelEncoder and OneHotEncoder. Scikit-learn does not work well with raw strings/text; its estimators expect numeric input, so the strings need to be encoded as numbers first. You could also try some NLP in your code too. Good luck 👍👍
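To make the encoder suggestion concrete, here is a minimal sketch of both encoders. The `languages` list is made-up illustration data, not the dataset from the linked code.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical data: the dataset from the linked code is not shown in this thread.
languages = ["Python", "C++", "Java", "Python"]

# LabelEncoder maps each distinct string to an integer (classes are sorted).
label_encoder = LabelEncoder()
language_ids = label_encoder.fit_transform(languages)
print(language_ids)            # [2 0 1 2]
print(label_encoder.classes_)  # ['C++' 'Java' 'Python']

# inverse_transform turns the integers back into the original strings.
decoded = label_encoder.inverse_transform(language_ids)

# OneHotEncoder expects a 2-D array: one row per sample, one column per feature.
onehot_encoder = OneHotEncoder()
language_column = np.array(languages).reshape(-1, 1)
language_onehot = onehot_encoder.fit_transform(language_column).toarray()
print(language_onehot.shape)   # (4, 3)
```

LabelEncoder gives one integer per string, which implies an ordering between categories. OneHotEncoder avoids that by giving each category its own 0/1 column.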
19th Jul 2020, 3:34 PM
Steven M
+ 2
Arnesh Thanks
19th Jul 2020, 4:23 PM
Clueless Coder
+ 2
Steven M I tried LabelEncoder, but it throws an error (that attempt is commented out in my code). It says the data needs a different shape, but it also throws an error if the data is 1-dimensional.
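The failing code is not reproduced in this thread, so the following is only a guess at the cause. A common source of this pair of errors is that LabelEncoder wants a 1-D array, while an estimator's `fit` wants the feature matrix `X` to be 2-D. The sketch below uses hypothetical data and an arbitrarily chosen DecisionTreeClassifier to show where each shape belongs.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data, not the dataset from the linked code.
languages = np.array(["Python", "C++", "Java", "Python"])
difficulty = np.array(["easy", "hard", "medium", "easy"])

# LabelEncoder takes a 1-D array and returns a 1-D array of integers.
encoder = LabelEncoder()
language_ids = encoder.fit_transform(languages)  # shape (4,)

# Estimators expect X as 2-D (n_samples, n_features); y stays 1-D.
X = language_ids.reshape(-1, 1)                  # shape (4, 1)
y = difficulty                                   # classifiers accept string labels

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# New inputs go through the same encoder and the same reshape.
new_sample = encoder.transform(["Java"]).reshape(-1, 1)
prediction = model.predict(new_sample)
print(prediction)
```

Only the feature column needs encoding in this sketch, since scikit-learn classifiers handle string class labels in `y` on their own.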
19th Jul 2020, 4:24 PM
Clueless Coder
+ 1
Instead of fitting the strings "Python", "C++", etc., you could fit the numbers 1, 2, 3, ... and look up the name from the number when predicting.
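A minimal sketch of that idea, using two plain dictionaries for the lookup. The feature values, the mapping, and the choice of DecisionTreeClassifier are all assumptions made for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical mapping between language names and numbers.
name_to_id = {"Python": 1, "C++": 2, "Java": 3}
id_to_name = {number: name for name, number in name_to_id.items()}

# Hypothetical features: two made-up numeric scores per language.
X = [[1, 2], [9, 7], [5, 8]]
names = ["Python", "C++", "Java"]

# Fit on the numbers instead of the strings.
y = [name_to_id[name] for name in names]
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Predict a number, then display the matching name.
predicted_id = int(model.predict([[9, 7]])[0])
predicted_name = id_to_name[predicted_id]
print(predicted_name)
```

This does by hand what LabelEncoder's `fit_transform` and `inverse_transform` do automatically.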
19th Jul 2020, 3:10 PM
Arnesh
+ 1
Arnesh I don't know how to do that. It's even more annoying considering they're all inside an object
19th Jul 2020, 3:19 PM
Clueless Coder
+ 1
There must be a better solution, but this is what I managed to do. https://code.sololearn.com/cYVUr0ONbsBC/?ref=app Hope that helps.
19th Jul 2020, 3:35 PM
Arnesh
+ 1
Clueless Coder I modified your code a little. Be advised that this is a very small dataset, so the results will vary, but I think this will help. https://code.sololearn.com/cX4SByLk6jEU/?ref=app
19th Jul 2020, 5:07 PM
Steven M
+ 1
Steven M Wow! That's awesome, thanks! I don't really understand it though.
19th Jul 2020, 5:13 PM
Clueless Coder
+ 1
Clueless Coder I have added some more notes to the code, and there are some helpful links too. I know it's a lot to take in at once. Essentially, I encoded the strings so scikit-learn can fit them into the model. I also added some metrics and reporting so you can see how well the model worked. Scikit-learn has built-in datasets; you should play with them, they're fun. I added a link at the bottom of the code.
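The modified code itself is only linked, not shown here, so the following is not that code. It is a separate minimal sketch of the same workflow (fit a model, then report metrics) on one of scikit-learn's built-in datasets. The iris dataset, the DecisionTreeClassifier, and the split parameters are all assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Built-in dataset: numeric features, integer targets, string target names.
iris = load_iris()

# Hold out a quarter of the samples to measure how well the model generalises.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Overall accuracy plus per-class precision, recall and F1.
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print(f"accuracy: {accuracy:.2f}")
print(report)
```

With a very small dataset, as in the original question, these numbers can swing a lot from one train/test split to the next, so treat them as a rough indication only.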
19th Jul 2020, 5:50 PM
Steven M