+ 2

Pandas pandas pandas help!

My code for the last DS with Python project passes the first 2 test cases but none of the hidden ones. Here is the code: https://code.sololearn.com/cNU0uG00kA7n/?ref=app

Here is the challenge:

Assume that there are two clusters among the given two-dimensional data points and two random points (0, 0), and (2, 2) are the initial cluster centroids. Calculate the Euclidean distance between each data point and each of the centroids, assign each data point to its nearest centroid, then calculate the new centroid. If there's a tie, assign the data point to the cluster with centroid (0, 0). If none of the data points were assigned to the given centroid, return None.

Input Format
First line: an integer to indicate the number of data points (n)
Next n lines: two numeric values per line to represent a data point in two-dimensional space.

Output Format
Two lists for two centroids. Numbers are rounded to the second decimal place.

Sample Input
3
1 0
0 .5
4 0

Sample Output
[0.5 0.25]
[4. 0.]

Note: I would prefer to modify my own code to get the right answer, and I would like to use kmeans to solve this.
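For reference, the single assignment-and-update step described in the challenge could be sketched in plain Python as below. This is only an illustration under stated assumptions, not the code from the link above: the function name `one_kmeans_step` is hypothetical, and the points are hard-coded from the sample input instead of being read with `input()`.

```python
import math

def one_kmeans_step(points, centroids=((0.0, 0.0), (2.0, 2.0))):
    """Assign each point to its nearest centroid, then average each cluster."""
    clusters = [[] for _ in centroids]
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        # list.index returns the first minimum, so a tie goes to centroid (0, 0).
        clusters[dists.index(min(dists))].append(p)

    result = []
    for members in clusters:
        if not members:
            # No data point was assigned to this centroid.
            result.append(None)
        else:
            n = len(members)
            result.append([round(sum(m[0] for m in members) / n, 2),
                           round(sum(m[1] for m in members) / n, 2)])
    return result

# Sample input from the challenge: (1, 0) and (0, .5) are nearest to (0, 0),
# while (4, 0) is nearest to (2, 2).
sample = [(1, 0), (0, 0.5), (4, 0)]
print(one_kmeans_step(sample))  # [[0.5, 0.25], [4.0, 0.0]]
```

Note that this returns plain Python lists, so the printed format differs from the NumPy-style sample output; it is meant for checking the logic, not for submitting as is.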

3rd Mar 2021, 9:26 AM
Ethan
2 Answers
0
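https://www.sololearn.com/compiler-playground/cvUcsMq6W81C

import numpy as np

# Read the number of points, then one "x y" pair per line.
n, k, datapoints = int(input()), 2, []
for i in range(n):
    datapoints.append([float(x) for x in input().split()])
datapoints = np.array(datapoints)
centroids = np.array([[0, 0], [2, 2]])

def euclidean_distance(x, y):
    # This is the squared distance; leaving out the square root
    # does not change which centroid is nearest.
    return np.sum(np.square(x - y))

# ed[j, i] holds the distance from data point j to centroid i.
ed = np.zeros((n, k))
for i, c in enumerate(centroids):
    for j, d in enumerate(datapoints):
        ed[j, i] = euclidean_distance(c, d)

# argmin returns the first minimum, so a tie goes to centroid (0, 0).
nearest_c = np.argmin(ed, axis=1)

for c in range(k):
    dps = datapoints[np.asarray(c==nearest_c).nonzero()]
    if len(dps) == 0:
        # No data point was assigned to this centroid.
        print(None)
        continue
    new_centroid = dps.mean(axis=0).round(2)
    print(new_centroid)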
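The nested distance loops above could also be replaced with NumPy broadcasting. The following is a rough sketch of that variant, assuming the same tie and empty-cluster rules as the challenge; the function name `new_centroids` is made up for illustration, and the sample points are passed in directly instead of being read from input.

```python
import numpy as np

def new_centroids(datapoints, centroids):
    """One assignment-and-update step using broadcasting instead of loops."""
    datapoints = np.asarray(datapoints, dtype=float)
    centroids = np.asarray(centroids, dtype=float)

    # (n, 1, 2) - (1, k, 2) broadcasts to (n, k, 2);
    # the norm over the last axis gives an (n, k) distance matrix.
    diffs = datapoints[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)

    # argmin returns the first minimum, so a tie goes to centroid 0.
    nearest = dists.argmin(axis=1)

    result = []
    for c in range(len(centroids)):
        members = datapoints[nearest == c]
        # None marks a centroid that received no data points.
        result.append(members.mean(axis=0).round(2) if len(members) else None)
    return result

for centroid in new_centroids([[1, 0], [0, 0.5], [4, 0]], [[0, 0], [2, 2]]):
    print(centroid)
```

The boolean mask `nearest == c` does the same job as the `nonzero()` indexing in the answer; which one to use is mostly a matter of taste.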
22nd Feb 2023, 7:31 PM
kwesiquest
- 1
PANDAS
4th Mar 2021, 7:35 PM
BUZZSAW