0
Need guidance on sklearn nearest neighbors - numpy array shape, please
I was given the task below. The problem is that every tutorial and document I've read focuses on KNN with datasets that are already shaped. Does anyone have a good resource (article, tutorial, video, etc.) on this specific topic (NumPy arrays, .fit, .reshape)? My big question is: what shape does KNN require? "So what you need to do is fit a knn model with the vectors the problem is that, depending on the array shape, you will get errors. That's why you might need to reshape the vectors list using the .reshape function" TIA
5 answers
+ 1
The KNN model, much like every other model, will be trained (.fit) based on a training dataset that you provide. As I understand, your test set will be of the same shape as the training set, right?
So why would you have problems with array shapes anyway?
+ 1
Ok, so if you split your dataset into a training and test set, you should still be fine, as both will still have 300 features. Which by the way is *a lot* for such a small dataset. Have you considered feature selection first?
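To illustrate the point about splitting, here is a minimal sketch. It assumes the data is already a 2D NumPy array of shape (830, 300); the random values, the 80/20 split, and the variable names are placeholders, not part of the original task.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data (assumption): 830 datapoints with 300 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(830, 300))

# Splitting only divides the rows; the number of columns (features) is unchanged.
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

print(X_train.shape)  # (n_train, 300)
print(X_test.shape)   # (n_test, 300)
```

Both parts keep 300 columns, so a model fitted on X_train can be queried with rows from X_test without any reshaping.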
0
My training dataset and test set have the same shape, yes.
The data has: a sentence and 300 vectors for it; and then this repeats for 830 questions. So I don't know if this is the reason for the possible problems. I'm still trying to fill in the blanks in my mind for this task.
0
Hmm... So is it a dataset of 830 datapoints with 300 features each, or are those vectors actually the datapoints?
0
830 datapoints with 300 features each
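With 830 datapoints of 300 features each, scikit-learn estimators expect a 2D array of shape (n_samples, n_features), here (830, 300). The sketch below is only an illustration under assumptions: the random data stands in for the real sentence vectors, and NearestNeighbors with n_neighbors=5 is one possible choice, not necessarily what the task intends. It shows where .reshape typically becomes necessary: when querying with a single vector, which is 1D on its own.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Placeholder data (assumption): replace with the real sentence vectors.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(830, 300))  # shape (n_samples, n_features)

# Fit on the full 2D array: one row per datapoint, one column per feature.
nn = NearestNeighbors(n_neighbors=5)
nn.fit(vectors)

# A single datapoint taken from the array is 1D, shape (300,).
query = vectors[0]

# kneighbors expects 2D input, so turn it into one row: shape (1, 300).
query_2d = query.reshape(1, -1)

distances, indices = nn.kneighbors(query_2d)
print(query.shape, query_2d.shape)  # (300,) (1, 300)
print(indices.shape)                # (1, 5)
```

If the vectors arrive as a Python list of 830 arrays, np.array(list_of_vectors) should give the (830, 300) shape directly, provided every vector has length 300. Passing the 1D query without the reshape is the usual source of the "Expected 2D array, got 1D array instead" error.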