How do I replace one value with another in R?


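I have a variable of 0's and 1's that I want to convert to NO's and YES's. The variable is named default and the dataset is named credit_train. The first solution I tried did not work: it converted the integer variable default to a factor with credit_train$default <- factor(credit_train$default). This changed the class from integer to factor, as the two checks before and after the conversion show: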

> class(credit_train$default)
[1] "integer"

> class(credit_train$default)
[1] "factor"

The following decision tree algorithm requires a factor:

credit_model <- C5.0(credit_train[-1],credit_train$default)

However, inspecting the model shows the following (tree size = 0):

> credit_model

Call:
C5.0.default(x = credit_train[-1], y = credit_train$default)

Classification Tree
Number of samples: 900 
Number of predictors: 20 

Tree size: 0 

Non-standard options: attempt to group attributes

So I am now trying to set the factor levels to YES and NO, because the 1 and 0 levels may be the problem.
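A minimal sketch of that recoding, assuming default contains only the integers 0 and 1. The data frame credit_train_demo and its values are made up for illustration and stand in for the real credit_train:

```r
# Toy stand-in for credit_train (hypothetical values, not the German credit data).
credit_train_demo <- data.frame(
  default = c(0L, 1L, 0L, 0L, 1L),
  credit_amount = c(1169, 5951, 2096, 7882, 4870)
)

# Map 0 -> "NO" and 1 -> "YES" explicitly, so the result does not
# depend on the order in which the values happen to appear.
credit_train_demo$default <- factor(
  credit_train_demo$default,
  levels = c(0, 1),
  labels = c("NO", "YES")
)

class(credit_train_demo$default)   # class after the conversion
levels(credit_train_demo$default)  # level names after the conversion
table(credit_train_demo$default)   # counts per level
```

Any value other than 0 or 1 would become NA under this mapping, so it may be worth checking table(credit_train$default, useNA = "ifany") afterwards.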

I will include the full code here (up to the point where the problem appears):

install.packages("C50", dependencies=TRUE, repos='http://cran.rstudio.com/')
library(C50)  # Gives the decision tree algorithm


#######Step 2: Exploring and Preparing the Data####
credit <- read.csv("german.csv")
credit
str(credit)
table(credit$account_check_status)
table(credit$savings)

summary(credit$duration_in_month)
summary(credit$credit_amount)


# A successful model that identifies applicants who are at
# high risk of default, allowing the bank to refuse the credit 
# request before the money is given.
table(credit$default)

# Data preparation: create RANDOM training and test datasets
# Use 90% of the data for training and 10% for testing.
# Sample at random because the file is not in random order (the bank
# sorted the data by loan amount, largest at the end of the file), so a
# sequential split would train only on the smallest loans.
set.seed(123)

# select 900 values at random out of the sequence of integers
# of 1 to 1,000
train_sample <- sample(1000,900)

# Shows the random selection
str(train_sample)

# The 'train_sample'(900) is passed as selected rows.
credit_train <- credit[train_sample,]

# The REMAINING rows NOT passed (100) become the test
credit_test <- credit[-train_sample,]

# Check to see if randomization was done correctly by having
# 30 percent of loans with default in each of the datasets
prop.table(table(credit_train$default))
prop.table(table(credit_test$default))

#####STEP3: Training a model on the Data ######

credit_model <- C5.0(credit_train[-1],credit_train$default)
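The call above drops the first column with credit_train[-1], which removes the target only if default happens to be column 1. A minimal sketch of selecting the predictors by name instead, on a made-up data frame (demo and its values are hypothetical, not the German credit data):

```r
# Hypothetical toy data; the real columns come from german.csv.
demo <- data.frame(
  default = factor(c("NO", "YES", "NO", "YES")),
  duration_in_month = c(6, 48, 12, 42),
  credit_amount = c(1169, 5951, 2096, 7882)
)

# See which column a positional [-1] would drop.
names(demo)[1]

# Select predictors by name so the target is excluded wherever it sits.
predictors <- demo[, setdiff(names(demo), "default"), drop = FALSE]
target <- demo$default

# Sanity checks before handing the pieces to a classifier:
# the target must be a factor and must not leak into the predictors.
stopifnot(is.factor(target))
stopifnot(!("default" %in% names(predictors)))

# With the C50 package loaded, these would be passed as C5.0(predictors, target).
```

This does not by itself explain the tree size of 0; it only rules out one possible source of confusion about which columns reach the model.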

Here is the dataset:

Please click here for Data


r machine-learning decision-tree
1 Answer

This should do it:
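One way to do this, sketched under the assumption that default has already been converted to a factor with levels "0" and "1" (as in the question), is to rename the levels in place. The vector default_demo is a hypothetical stand-in for credit_train$default:

```r
# Hypothetical factor with levels "0" and "1", as factor() produces from 0/1 integers.
default_demo <- factor(c(0L, 1L, 1L, 0L))

# Rename the levels by name; the list form maps old level -> new level
# explicitly and does not depend on the existing level order.
levels(default_demo) <- list(NO = "0", YES = "1")

levels(default_demo)        # level names after renaming
as.character(default_demo)  # values after renaming
```

Applied to the real data, the same assignment would read levels(credit_train$default) <- list(NO = "0", YES = "1").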
