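I'm new to R and am trying to run the apriori function on a set of transactions. When I inspect the resulting rules, the LHS comes back empty. What am I doing wrong?
Below is the code I used. I have also attached images of the different file formats I tried: data format 1, data format 2, and data format 3. The code below uses format 3.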
> dianacsv <- read.csv("diana.csv")
> dianatrans <- read.transactions(file="diana.csv", format = c("basket"), header = TRUE, sep = ",")
> summary(dianatrans)
transactions as itemMatrix in sparse format with
114091 rows (elements/itemsets/transactions) and
114149 columns (items) and a density of 3.32023e-05
most frequent items:
CS12 PS12 BU12 GB12 CC12 (Other)
23819 18268 16166 15544 14452 344157
element (itemset/transaction) length distribution:
sizes
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
65024 26446 13237 5185 2243 873 435 250 133 83 65 47 26 8 17
18 19 20 21 22 23 24
3 3 6 3 1 2 1
Min. 1st Qu. Median Mean 3rd Qu. Max.
3.00 3.00 3.00 3.79 4.00 24.00
includes extended item information - examples:
labels
1 1
2 10
3 100
> dianarules <- apriori(dianatrans, parameter = list(supp = 0.01, conf = 0.01, target = "rules"))
Apriori
Parameter specification:
confidence minval smax arem aval originalSupport maxtime support minlen maxlen target
0.01 0.1 1 none FALSE TRUE 5 0.01 1 10 rules
ext
FALSE
Algorithmic control:
filter tree heap memopt load sort verbose
0.1 TRUE TRUE FALSE TRUE 2 TRUE
Absolute minimum support count: 1140
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[114149 item(s), 114091 transaction(s)] done [0.23s].
sorting and recoding items ... [38 item(s)] done [0.01s].
creating transaction tree ... done [0.04s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [237 rule(s)] done [0.00s].
creating S4 object ... done [0.02s].
> inspect(dianarules[1:5])
lhs rhs support confidence lift count
[1] {} => {BBK16} 0.01076334 0.01076334 1 1228
[2] {} => {CS4} 0.01036892 0.01036892 1 1183
[3] {} => {BU4} 0.01196413 0.01196413 1 1365
[4] {} => {CHK16} 0.01362071 0.01362071 1 1554
[5] {} => {PJK16} 0.01575059 0.01575059 1 1797
Data format 1
Data format 2
Data format 3
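You need to set minlen when calling apriori. The default is minlen = 1 (it shows up in your parameter printout), which allows rules of length 1, i.e. rules with an empty LHS such as {} => {item}. Setting minlen = 2 excludes them: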
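library(arules)
# Random 10 x 10 binary matrix; no seed is set, so your numbers will differ
M = matrix(sample(0:1,100,replace=TRUE),ncol=10)
rownames(M) = paste0("tx",1:10)
colnames(M) = paste0("item",1:10)
# Default minlen = 1: rules with an empty LHS are returned
M.rules = apriori(as(M,"transactions"),
parameter = list(supp = 0.01, conf = 0.01, target = "rules"))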
inspect(M.rules[1:5])
lhs rhs support confidence lift count
[1] {} => {item1} 0.2 0.2 1 2
[2] {} => {item2} 0.4 0.4 1 4
[3] {} => {item4} 0.4 0.4 1 4
[4] {} => {item9} 0.5 0.5 1 5
[5] {} => {item7} 0.6 0.6 1 6
# minlen = 2: every rule has at least one item on the LHS
M.rules = apriori(as(M,"transactions"),
parameter = list(supp = 0.01, conf = 0.01,
target = "rules",minlen=2))
inspect(M.rules[1:5])
lhs rhs support confidence lift count
[1] {item1} => {item4} 0.1 0.50 1.2500000 1
[2] {item4} => {item1} 0.1 0.25 1.2500000 1
[3] {item1} => {item9} 0.1 0.50 1.0000000 1
[4] {item9} => {item1} 0.1 0.20 1.0000000 1
[5] {item1} => {item7} 0.1 0.50 0.8333333 1
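If you already have a rule set mined with the default minlen, you may not need to rerun apriori: the empty-LHS rules can probably be filtered out afterwards. The following is only a minimal sketch on a small hand-made transaction list (the items a, b, c and the names tx, all_rules and nonempty are made up for illustration); it relies on size() and lhs() from arules to count the items on each rule's left-hand side.

```r
library(arules)

# Hypothetical toy data: four transactions over the items a, b, c
tx <- as(list(
  c("a", "b"),
  c("a", "b", "c"),
  c("a", "c"),
  c("b", "c")
), "transactions")

# Default minlen = 1, so rules with an empty LHS are included
all_rules <- apriori(tx,
                     parameter = list(supp = 0.01, conf = 0.01, target = "rules"),
                     control = list(verbose = FALSE))

# Keep only rules whose LHS contains at least one item
nonempty <- all_rules[size(lhs(all_rules)) > 0]

# Show the five remaining rules with the highest lift
inspect(head(sort(nonempty, by = "lift"), 5))
```

This should give the same rules as rerunning with minlen = 2, so it is mainly useful when mining the full data set again would be slow.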