0

Need guidance on sklearn nearest neighbors - numpy array shape, please

I was given the task below. The problem is that every tutorial and document I've read focuses on KNN with datasets that are already shaped. Does anyone have a good resource (article, tutorial, video, etc.) on this specific topic (NumPy (np) arrays, .fit, .reshape)? My big question is: what shape does KNN require? "So what you need to do is fit a KNN model with the vectors. The problem is that, depending on the array shape, you will get errors. That's why you might need to reshape the vectors list using the .reshape function." TIA

5th Jun 2020, 7:49 PM
Jairo Arce Hernandez
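A minimal sketch of the shape question, assuming the vectors arrive as a Python list of equal-length 1D arrays (the random data and the names `vectors`, `X`, `query` are made up for illustration). scikit-learn estimators expect a 2D array of shape (n_samples, n_features), so a list of vectors is stacked into one array, and a single vector is reshaped into one row:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the task data: 830 vectors with 300 values each.
vectors = [rng.normal(size=300) for _ in range(830)]

# Stack the list into one 2D array: rows are samples, columns are features.
X = np.asarray(vectors)
print(X.shape)

# A single vector is 1D; reshape it into a 2D array with one row
# before passing it to a fitted model.
query = X[0]
query_2d = query.reshape(1, -1)
print(query.shape, query_2d.shape)
```

Passing a 1D array where a 2D one is expected is the usual cause of the "Expected 2D array, got 1D array instead" error.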
5 Answers
+ 1
The KNN model, much like every other model, will be trained (.fit) on a training dataset that you provide. As I understand it, your test set will have the same shape as the training set, right? So why would you have problems with array shapes anyway?
5th Jun 2020, 8:13 PM
Kuba Siekierzyński
+ 1
OK, so if you split your dataset into a training set and a test set, you should still be fine, as both will still have 300 features, which by the way is *a lot* for such a small dataset. Have you considered feature selection first?
7th Jun 2020, 9:01 AM
Kuba Siekierzyński
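To illustrate the point about splitting, here is a small sketch using random placeholder data in place of the real vectors (the 80/20 split and the variable names are assumptions, not part of the original task). A split only divides the rows, so both parts keep all 300 feature columns:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 830 datapoints with 300 features each.
X = rng.normal(size=(830, 300))

# Split the rows into a training part and a test part.
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

# Both parts still have 300 columns; only the row counts differ.
print(X_train.shape, X_test.shape)
```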
0
My training dataset and test set have the same shape, yes. The data has a sentence and 300 vectors for it, and then this repeats for 830 questions. So I don't know if this is the reason for the possible problems. I'm still trying to fill in the blanks in my mind for this task.
5th Jun 2020, 8:59 PM
Jairo Arce Hernandez
0
Hmm... So is it a dataset of 830 datapoints with 300 features each, or are those vectors actually the datapoints?
6th Jun 2020, 9:24 AM
Kuba Siekierzyński
0
830 datapoints with 300 features each
7th Jun 2020, 1:20 AM
Jairo Arce Hernandez
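With 830 datapoints of 300 features each, the array passed to .fit would have shape (830, 300). A hedged sketch with scikit-learn's `NearestNeighbors`, using random numbers in place of the real sentence vectors (the choice of 5 neighbors is arbitrary and only for illustration):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Placeholder for the real data: 830 rows (datapoints), 300 columns (features).
X = rng.normal(size=(830, 300))

# Fit the neighbor index on the full 2D array.
nn = NearestNeighbors(n_neighbors=5)
nn.fit(X)

# A single query must also be 2D: one row, 300 columns.
query = X[0].reshape(1, -1)
distances, indices = nn.kneighbors(query)

# One row of results per query, one column per neighbor.
print(indices.shape)
```

Because the query here is itself one of the fitted rows, its closest neighbor is that same row; with a new sentence vector you would get the indices of the 5 most similar stored vectors.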