Fixing a loop that imitates Excel's INDEX & MATCH in R

Question

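I'm trying to imitate an Excel formula that uses INDEX and MATCH in R with a loop, but the result is NA.

The Excel formula with INDEX and MATCH arranges the data the way I need, but I can't get the same thing to work in R. Here is a sample of the data as it looks in Excel: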

NR  TRUNK     lemat     HEAD
1   balony    balon     2
2   są        być       4
3   swobodne  swobodny  2
4   ale       ale       14
5   w         w         4
6   ramach    rama      5
7   długości  długość   6
8   sznurka   sznurek   7
9   [         [         14
10  #         #         9

In Excel I can join words from the TRUNK column based on the number in the HEAD column:

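HEAD      TRUNK     JOINED
są        balony    są balony
ale       są        ale są
są        swobodne  są swobodne
…         ale       … ale
ale       w         ale w
w         ramach    w ramach
ramach    długości  ramach długości
długości  sznurka   długości sznurka
…         [         … [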

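The formula in the HEAD column uses INDEX and MATCH to take data from TRUNK: it pairs a word, e.g. [balony], with the word whose number is given in HEAD, here [2], i.e. [są]. In other words, the formula below generates two-word phrases from the table.

=INDEX(PARSER!B:B;(MATCH(PARSER!G3;PARSER!A:A;0)))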

In R I can read the data, build data.frames, and loop over them to fill a new table with the head and trunk words, but it doesn't work properly.

graf <- read.csv("graf.txt", sep = "\t", quote = "\t", header = FALSE)
names(graf)[1] = "nr"
names(graf)[2] = "trunk"
names(graf)[3] = "lemat"
names(graf)[4] = "head"
nrheaddf = cbind.data.frame(graf$head,as.character(graf$trunk))
names(nrheaddf)[1] = "HEAD"
names(nrheaddf)[2] = "TRUNK"
nrtrunkdf = cbind.data.frame(graf$nr,as.character(graf$trunk))
names(nrtrunkdf)[1] = "NR"
names(nrtrunkdf)[2] = "TRUNK"



as.character(nrheaddf$TRUNK[6]) #BALONY
which(nrtrunkdf$NR == as.character(nrheaddf$HEAD[6])) #7
nrtrunkdf$TRUNK[which(nrtrunkdf$NR == as.character(nrheaddf$HEAD[6]))[1]] #są
grafi <- as.numeric(count(graf))
JOINER <- data.frame(matrix(nrow = grafi, ncol = 2))
joinv <- list()
for (i in grafi) {
  joinv <- nrtrunkdf$V2[which(nrheaddf$V1 == nrtrunkdf$V1[i])][1]
  JOINER[i] <- joinv
}

Error in `[<-.data.frame`(`*tmp*`, i, value = NULL) : new columns would leave holes after existing columns

#### NEW DATA:

head(WSD$Lemma)

"someone", "it", "crocodile", "think", "colored", "glass"

head(KEYWORDS$V1)

"someone comes", "crocodile", "I think", "colored glass", "I", "reflecting"

WSDKEY <- as.data.frame(cbind.na(WSD$Lemma,KEYWORDS$V1), stringsAsFactors = FALSE)

But then this solution doesn't work :-(

get_head <- function(i){
  if (!(i %in% WSDKEY$V2))
    return(NA)
  else
    head <- WSDKEY[WSDKEY$V2 == i, 'V1']
    return(as.character(head))
}

Answer 1

Is this what you mean?

library(dplyr)
# The used Data
my_data <- read.table(text = "nr TRUNK lemat HEAD
                1 balony balon 2
                2 są być 4
                3 swobodne swobodny 2
                4 ale ale 14
                5 w w 4
                6 ramach rama 5
                7 długości długość 6
                8 sznurka sznurek 7
                9 [ [ 14
                10 '#' '#' 9", header = TRUE)
my_data

my_data %>% 
  mutate(HEAD = my_data[HEAD, 'TRUNK']) %>%                # replace the numbers with the values from TRUNK
  mutate(joined_text = paste(HEAD, TRUNK)) %>%        # paste the text together in a new column
  select(HEAD, TRUNK, joined_text)                    # select the needed columns

Then I get this:

#       HEAD    TRUNK      joined_text
#         są   balony        są balony
#        ale       są           ale są
#         są swobodne      są swobodne
#       <NA>      ale           NA ale
#        ale        w            ale w
#          w   ramach         w ramach
#     ramach długości  ramach długości
#   długości  sznurka długości sznurka
#       <NA>        [             NA [
#         [        #              [ #
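For comparison, here is a minimal sketch of how the loop from the question could be repaired. As far as I can tell, three things go wrong in the original: `for (i in grafi)` runs only once (with `i` equal to the row count), the columns `V1` and `V2` no longer exist after renaming (so `$V1` and `$V2` return NULL), and `JOINER[i] <- ...` assigns a whole column rather than a cell. The data frame below is retyped from the sample, and the variable names `joined`, `pos` and `head_word` are my own choices, not something from the question.

```r
# Sample data from the question (HEAD 14 has no matching nr in this sample)
graf <- data.frame(
  nr    = 1:10,
  trunk = c("balony", "są", "swobodne", "ale", "w",
            "ramach", "długości", "sznurka", "[", "#"),
  head  = c(2, 4, 2, 14, 4, 5, 6, 7, 14, 9),
  stringsAsFactors = FALSE
)

n <- nrow(graf)            # base R row count, no dplyr::count() needed
joined <- character(n)     # preallocate the result vector

for (i in seq_len(n)) {    # iterate over 1..n, not over the single value n
  # MATCH step: first row whose nr equals this row's HEAD (NA if there is none)
  pos <- which(graf$nr == graf$head[i])[1]
  # INDEX step: the TRUNK word in that row (NA if there is none)
  head_word <- graf$trunk[pos]
  joined[i] <- paste(head_word, graf$trunk[i])
}

JOINER <- data.frame(HEAD = graf$head, TRUNK = graf$trunk, JOINED = joined,
                     stringsAsFactors = FALSE)
```

Rows whose HEAD number is missing from the data end up as "NA ale" and "NA [", the same as in the dplyr output above.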

Update:

Here is another way, which also works if you don't want to rely on row numbers:

# define a function to find and extract the right HEAD
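get_head <- function(i){
  # no row with this number: return a character NA
  if (!(i %in% my_data$nr)) {
    return(NA_character_)
  }
  # otherwise look up the TRUNK word in the row whose nr equals i
  head_word <- my_data[my_data$nr == i, 'TRUNK']
  return(as.character(head_word))
}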

# replace with the new values
my_data$HEAD <- sapply(my_data$HEAD, get_head)

# now concatenate the text and select the columns you want
my_data %>% 
  mutate(joined_text = paste(HEAD, TRUNK)) %>%        # paste the text together in a new column
  select(HEAD, TRUNK, joined_text)

This approach also works if you want to match strings instead of numbers.
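If a vectorised version is preferred, base R's `match()` is probably the closest analogue of Excel's MATCH with exact matching, and plain indexing plays the role of INDEX. The sketch below uses string keys; the `lookup` table and the `wanted` vector are made-up illustration data, not taken from the question.

```r
# Hypothetical lookup table: keys are strings rather than row numbers
lookup <- data.frame(
  key   = c("a", "b", "c"),
  value = c("alpha", "beta", "gamma"),
  stringsAsFactors = FALSE
)
wanted <- c("b", "z", "a")   # "z" is deliberately absent from the keys

# MATCH step: position of the first exact match for each key, NA if absent
pos <- match(wanted, lookup$key)

# INDEX step: NA positions simply yield NA values
found <- lookup$value[pos]
```

Because `match()` returns NA for keys it cannot find, no separate `%in%` check is needed before indexing.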
