+ 7
Python for Data Science - House Prices
You are given an array that represents house prices. Calculate and output the percentage of houses that are within one standard deviation from the mean. To calculate the percentage, divide the number of houses that satisfy the condition by the total number of houses, and multiply the result by 100. I'm stuck on this question; can anyone help out? This is my code: https://code.sololearn.com/cA255a72a21A
40 Answers
- 2
You need to output the percentage of houses rather than the count:
print(100*count/data.size)
+ 35
My code
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])
m = np.mean(data)
s = np.std(data)
low = m-s
high = m+s
print(
len(data[(low < data) & (data < high)])
/ len(data) *100
)
+ 8
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])
mean = np.mean(data)
std = np.std(data)
low = mean-std
high = mean+std
count = 0
for i in data:
    if low < i < high:
        count += 1
result = (count / len(data))*100
print(result)
+ 4
Hi! Here's my answer:
m = np.mean(data)
d = np.std(data)
y1 = m-d
y2 = m+d
s = len(data[(data > y1) & (data < y2)])
r = (s/len(data))*100
print(r)
I hope it helped!
+ 4
One-line answer ✌️
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])
print(len([i for i in data if i > (np.mean(data) - np.std(data)) and i < (np.mean(data) + np.std(data))]) / len(data) * 100)
+ 3
A short and simple answer for the problem:
mean = np.mean(data)
std = np.std(data)
x=(data[(data <= mean+std) & (data >= mean-std)])
print(x.size/data.size*100)
+ 2
Lai Kai Yong: Please help me with the first Python for Data Science Basketball Players exercise.
+ 2
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])
mean = np.mean(data)
std = np.std(data)
result = data[np.logical_and(data <= mean + std, data >= mean - std)]
print(result.size/data.size*100)
+ 2
print(len([i for i in data if (np.mean(data)-np.std(data))< i < (np.mean(data)+np.std(data))])/len(data)*100)
+ 2
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])
strd=np.std(data)
means=np.mean(data)
alt=means-strd
ust=means+strd
a=(data>alt)&(data<ust)
print((len(data[a])/len(data))*100)
+ 1
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])
low = np.mean(data)-np.std(data)
high = np.mean(data)+np.std(data)
print (len(data[(data>low)&(data<high)])/len(data)*100)
0
You must compute the mean of your data, then compute the standard deviation, and finally count how many values fall in the range from mean - deviation to mean + deviation...
0
@visph, I'm still confused. I modified my code and it now produces an output, but the result is still wrong.
0
Lai Kai Yong, check my edited post above again: np.std should take data as its argument ^^
0
@visph, it does not work...
0
Bruno, it hasn't helped.
0
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])
mean = np.mean(data)
std = np.std(data)
low = mean-std
high = mean+std
count = 0
for i in data:
    if low < i < high:
        count += 1
result = (count / len(data))*100
print(result)
0
import numpy as np
data = np.array([150000, 125000, 320000, 540000, 200000,
120000, 160000, 230000, 280000, 290000, 300000, 500000,
420000, 100000, 150000, 280000])
mean = np.mean(data)
std = np.std(data)
a = mean + std
b = mean - std
c = data[(data <= a) & (data >= b)]
print(c.size / data.size * 100)
0
COVID Data Analysis
You are working with the COVID dataset for California, which includes the number of cases and deaths for each day of 2020.
Find the day when the deaths/cases ratio was largest.
To do this, you need to first calculate the deaths/cases ratio and add it as a column to the DataFrame with the name 'ratio', then find the row that corresponds to the largest value.
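A minimal sketch of those two steps, assuming pandas and a tiny made-up DataFrame in place of the real dataset. The column names 'cases' and 'deaths', the 'date' index, and the numbers themselves are all assumptions for illustration; only the 'ratio' column name comes from the task.

```python
import pandas as pd

# Made-up sample rows; the real exercise supplies its own data
df = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-02", "2020-03-03"],
    "cases": [100, 200, 50],
    "deaths": [1, 10, 1],
}).set_index("date")

# Step 1: add the deaths/cases ratio as a new column named 'ratio'
df["ratio"] = df["deaths"] / df["cases"]

# Step 2: keep the row(s) whose ratio equals the largest value
largest = df[df["ratio"] == df["ratio"].max()]
print(largest)
```

Filtering on the maximum keeps every tied row; df["ratio"].idxmax() would instead return the index label of the first maximum only.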
0