+ 3
How to stop Sklearn converting values to float
I'm trying to make a language difficulty predictor, but when I input my data, it converts all the strings to floats. Why won't it stay the way I defined it? How can I input the data? https://code.sololearn.com/chup4G6FlQBE/?ref=app
9 Answers
+ 2
Check out LabelEncoder and OneHotEncoder. Scikit-learn does not work well with strings/text; its models expect numeric input, so text has to be encoded as numbers first. You could also try some NLP in your code. Good luck 👍👍
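For anyone reading along, here is a minimal sketch of those two encoders. The language names and the `typing` feature are made-up illustration data, not taken from the linked code, so treat this as one possible setup rather than a fix for the original program.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical data: target labels and one categorical feature per row
languages = ["Python", "C++", "Java", "Python"]
typing = np.array([["dynamic"], ["static"], ["static"], ["dynamic"]])

# LabelEncoder turns the target strings into integer codes (classes are sorted)
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(languages)

# OneHotEncoder turns each category of a feature column into its own 0/1 column
onehot = OneHotEncoder(handle_unknown="ignore")
X = onehot.fit_transform(typing).toarray()

# inverse_transform recovers the original strings from the integer codes
decoded = label_encoder.inverse_transform(y)

print(y)        # integer codes for the languages
print(X)        # one column per category of the feature
print(decoded)  # back to the language names
```

The strings are not lost: `inverse_transform` maps predictions back to names whenever you need to display them.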
+ 2
Arnesh Thanks
+ 2
Steven M I tried LabelEncoder, but it throws an error (that attempt is commented out in my code). The message says the data needs a different shape, but it also throws an error when the data is 1-dimensional.
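Without the exact traceback this is only a guess, but shape errors like this often come from mixing up which encoder wants which shape: `LabelEncoder` is meant for a 1-D array of labels, while feature encoders such as `OrdinalEncoder` and `OneHotEncoder` expect a 2-D array of shape (n_samples, n_features). A small sketch with made-up names:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Hypothetical labels, not taken from the linked code
names = np.array(["Python", "C++", "Java"])

# LabelEncoder: pass the 1-D array as-is
codes = LabelEncoder().fit_transform(names)

# OrdinalEncoder: needs 2-D input, so reshape a single column to (n_samples, 1)
column = names.reshape(-1, 1)
feature_codes = OrdinalEncoder().fit_transform(column)

print(codes.shape)          # 1-D result
print(feature_codes.shape)  # 2-D result, one column
```

If your data is a single column going into a feature encoder, `reshape(-1, 1)` is usually what the error message is asking for.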
+ 1
Instead of fitting the names (Python, C++, etc.), you could fit numbers (1, 2, 3, ...) and look up the name from the number when displaying a prediction.
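A rough sketch of that idea, using a plain dictionary for the mapping. The feature values and the choice of `DecisionTreeClassifier` are assumptions for illustration only; the original code may use different features and a different model.

```python
from sklearn.tree import DecisionTreeClassifier

# Map each language name to a number, and build the reverse lookup
name_to_id = {"Python": 1, "C++": 2, "Java": 3}
id_to_name = {number: name for name, number in name_to_id.items()}

# Made-up numeric features, one row per language (purely illustrative)
X = [[1, 2], [9, 8], [5, 6]]
labels = ["Python", "C++", "Java"]

# Fit on the numbers instead of the strings
y = [name_to_id[name] for name in labels]
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# Predict a number, then translate it back to a name for display
predicted_id = int(model.predict([[1, 2]])[0])
predicted_name = id_to_name[predicted_id]
print(predicted_name)
```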
+ 1
Arnesh I don't know how to do that. It's even more annoying considering they're all inside an object
+ 1
There must be a better solution, but this is what I managed to do.
https://code.sololearn.com/cYVUr0ONbsBC/?ref=app
Hope that helps.
+ 1
Clueless Coder I modified your code a little. Be advised that this is a very small dataset, so the results will vary from run to run, but I think this will help.
https://code.sololearn.com/cX4SByLk6jEU/?ref=app
+ 1
Steven M Wow! That's awesome
Thanks, I don't really understand it though
+ 1
Clueless Coder I have added some more notes to the code, and there are some helpful links too. I know it's a lot to take in at once. Essentially, I encoded the strings so scikit-learn can fit them into the model. I also added some metrics and reporting so you can see how well the model worked. Scikit-learn has built-in datasets that are fun to play with; I added a link at the bottom of the code.
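To give a feel for the metrics and built-in datasets mentioned above, here is a small sketch on the bundled iris dataset. This is not the linked code; the model, split size, and `random_state` values are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a dataset that ships with scikit-learn (already numeric)
iris = load_iris()

# Hold out part of the data so the model is scored on rows it has not seen
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Overall accuracy plus per-class precision, recall, and F1
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=iris.target_names)

print(f"Accuracy: {accuracy:.2f}")
print(report)
```

With a dataset as small as the one in this thread, the scores on a held-out split can swing a lot, so it helps to practice on a larger built-in dataset first.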