0
You are working on the Titanic Survivors data set, which includes information on the passengers of the ship. The data is stored
I have tried many times, but I always get an error. If anybody knows why, please explain.
15 Answers
+ 2
Amala Yakin
Can you please post your attempt along with your question?
It helps us see the problem.
+ 2
Amala Yakin
According to the task description, we should get mean Pclass grouped by Survived
+ 2
Exaucé Maruba
Search the internet for titanic.csv
Downloads are readily available.
+ 1
You could approach the task like this:
1. subset so that only the passengers >= 18 are left
2. Use tapply to get the Pclass means by Survived
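The two steps above could be sketched roughly as follows. This uses a small invented data frame in place of titanic.csv, and the column names Survived, Pclass and Age are assumptions about the real file, so adjust them to match its header.

```r
# Toy stand-in for titanic.csv: the rows are invented, and the column
# names (Survived, Pclass, Age) are assumptions about the real file.
x <- data.frame(
  Survived = c(0, 1, 1, 0, 1, 0),
  Pclass   = c(3, 1, 2, 3, 1, 2),
  Age      = c(22, 38, 4, 35, 27, 15)
)

# Step 1: keep only the passengers aged 18 or older.
adults <- subset(x, Age >= 18)

# Step 2: mean Pclass within each Survived group.
res <- tapply(adults$Pclass, adults$Survived, mean)
print(res)
```

subset() also drops rows where the condition is NA, which matters if some ages are missing. Indexing with x[x$Age >= 18, ] would keep those rows as all-NA rows instead.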
+ 1
Same problem.
I don't know what the problem is.
data <- x[x$age >= 18, ]
res <- tapply(data$Pclass, data$Survived, mean)
The output is logical(0).
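One possible cause, offered as a guess because the file's header is not shown in this thread: column names in R are case-sensitive. If the column is actually called Age, then x$age is NULL, the filter selects zero rows, and tapply has nothing to group. A small sketch with invented data (the column names are assumptions):

```r
# Invented rows; the real titanic.csv is assumed to use "Age", not "age".
x <- data.frame(
  Survived = c(0, 1, 1, 0),
  Pclass   = c(3, 1, 2, 3),
  Age      = c(22, 38, 4, 35)
)

# x$age does not exist, so it is NULL and the row index is empty.
data <- x[x$age >= 18, ]
empty_res <- tapply(data$Pclass, data$Survived, mean)
print(nrow(data))    # zero rows survive the filter
print(empty_res)     # nothing to group

# With the matching column name the filter keeps the adult rows.
adults <- x[x$Age >= 18, ]
res <- tapply(adults$Pclass, adults$Survived, mean)
print(res)
```

Running names(x) on the real data frame shows the exact spelling of each column.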
+ 1
Amirreza Farahani
Get rid of res <-
tapply(....
is all you need
+ 1
Or print "res" after assigning it :)
+ 1
Can I have titanic.csv? I want to use this file for my model.
0
Ok
0
x <- read.csv('/usercode/files/titanic.csv')
y <- tapply(x$passenger > 18, x$survived, mean)
print(y)
0
x <- read.csv('/usercode/files/titanic.csv')
t <- data.frame(x)
survived <- subset(t, t$passenger >= 18)
print(mean(survived))
This is wrong, but why?
0
Amala Yakin
Please review lesson 26.1 which shows how to filter a dataframe.
You need to find the adults first
adults <- x[lesson 26.1]
Then you can complete the rest.
Remember, you need the mean of the Pclass from those who Survived
0
x <- read.csv('/usercode/files/titanic.csv')
t <- data.frame(x)
survived <- subset(t, t$passenger >= 18)
print(mean(survived))
The code above reads the CSV file "titanic.csv" from the "/usercode/files" directory. It then creates a data frame called "t" from the contents (read.csv already returns a data frame, so this step is redundant). Next, it filters "t", keeping only the rows where the "passenger" column is greater than or equal to 18, and stores the result in a data frame named "survived". Finally, it tries to calculate the mean of that data frame and print the result.
However, the last line is the problem. print(mean(survived)) does not work because the mean() function expects a numeric vector as its argument. In this case, the "survived" variable is a data frame, not a numeric vector, so mean() returns NA with a warning. To fix this, specify the column of the data frame whose mean you want. Let's assume that the "survived" column contains numeric values representing survival status (e.g., 0 for not survived, 1 for survived).
0
We can modify the code as follows:
0
x <- read.csv('/usercode/files/titanic.csv')
t <- data.frame(x)
survived <- subset(t, t$passenger >= 18)
print(mean(survived$survived))
By specifying survived$survived, we are accessing the "survived" column within the "survived" data frame, which is a numeric vector. This will allow us to correctly calculate the mean of the "survived" column.
Remember, it's important to pay attention to the data types and structures when working with functions in programming languages.
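To illustrate the difference between passing a data frame and passing a column, here is a small sketch with invented data. The column names "passenger" and "survived" are taken from the code above and may not match the real file.

```r
# Invented rows; column names follow the code in this post.
t <- data.frame(
  passenger = c(22, 38, 4, 35),
  survived  = c(0, 1, 1, 0)
)
survived <- subset(t, passenger >= 18)

# mean() on a whole data frame returns NA with a warning
# ("argument is not numeric or logical"); the warning is silenced here.
whole <- suppressWarnings(mean(survived))

# mean() on a single numeric column works as expected.
column <- mean(survived$survived)

print(whole)
print(column)
```

str(survived) is a quick way to check whether an object is a data frame or a vector before passing it to a function.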