Fixing a loop that imitates INDEX & MATCH in R

Question  Votes: 0  Answers: 1

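I am trying to imitate an Excel formula that uses INDEX and MATCH in R with a loop, but the result is NA.

The Excel formula with INDEX and MATCH arranges the data the way I need, but I cannot get the same thing to work in R. Here is a sample of the data as it looks in Excel: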

  • nr TRUNK lemat HEAD
  • 1 balony balon 2
  • 2 są być 4
  • 3 swobodne swobodny 2
  • 4 ale ale 14
  • 5 w w 4
  • 6 ramach rama 5
  • 7 długości długość 6
  • 8 sznurka sznurek 7
  • 9 [ [ 14
  • 10 # # 9

In Excel I can join words from the TRUNK column based on the number in the HEAD column:

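  • HEAD TRUNK JOINED
  • są balony są balony
  • ale są ale są
  • są swobodne są swobodne
  • (word at nr 14, outside this sample) ale
  • ale w ale w
  • w ramach w ramach
  • ramach długości ramach długości
  • długości sznurka długości sznurka
  • (word at nr 14, outside this sample) [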

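The formula in the HEAD column uses INDEX and MATCH to pull data from TRUNK: it pairs a word, e.g. [balony], with the word whose row number is given in HEAD, here [2], i.e. [są]. In other words, the formula below generates two-word phrases from the table.

=INDEX(PARSER!B:B;(MATCH(PARSER!G3;PARSER!A:A;0)))

Now, in R I can read the data, build data.frames, and run a loop to fill a new table with the HEAD and TRUNK words, but it does not work properly.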

graf <- read.csv("graf.txt", sep = "\t", quote = "\t", header = FALSE)
names(graf)[1] = "nr"
names(graf)[2] = "trunk"
names(graf)[3] = "lemat"
names(graf)[4] = "head"
nrheaddf = cbind.data.frame(graf$head,as.character(graf$trunk))
names(nrheaddf)[1] = "HEAD"
names(nrheaddf)[2] = "TRUNK"
nrtrunkdf = cbind.data.frame(graf$nr,as.character(graf$trunk))
names(nrtrunkdf)[1] = "NR"
names(nrtrunkdf)[2] = "TRUNK"



as.character(nrheaddf$TRUNK[6]) #BALONY
which(nrtrunkdf$NR == as.character(nrheaddf$HEAD[6])) #7
nrtrunkdf$TRUNK[which(nrtrunkdf$NR == as.character(nrheaddf$HEAD[6]))[1]] #są
grafi <- as.numeric(count(graf))
JOINER <- data.frame(matrix(nrow = grafi, ncol = 2))
joinv <- list()
for (i in grafi) {
  joinv <- nrtrunkdf$V2[which(nrheaddf$V1 == nrtrunkdf$V1[i])][1]
  JOINER[i] <- joinv
}

Error in `[<-.data.frame`(`*tmp*`, i, value = NULL) : new columns would leave holes after existing columns

#### NEW DATA:
head(WSD$Lemma)

"someone", "it", "crocodile", "think", "colored", "glass"

head(KEYWORDS$V1)

"someone comes", "crocodile", "I think", "colored glass", "I", "reflecting"

WSDKEY <- as.data.frame(cbind.na(WSD$Lemma,KEYWORDS$V1), stringsAsFactors = FALSE)

But then the solution does not work :-(

get_head <- function(i){
  if (!(i %in% WSDKEY$V2))
    return(NA)
  else
    head <- WSDKEY[WSDKEY$V2 == i, 'V1']
    return(as.character(head))
}

r
1 answer
0
votes

Is this what you mean?

library(dplyr)
# The used Data
my_data <- read.table(text = "nr TRUNK lemat HEAD
                1 balony balon 2
                2 są być 4
                3 swobodne swobodny 2
                4 ale ale 14
                5 w w 4
                6 ramach rama 5
                7 długości długość 6
                8 sznurka sznurek 7
                9 [ [ 14
                10 '#' '#' 9", header = TRUE)
my_data

my_data %>% 
  mutate(HEAD = my_data[HEAD, 'TRUNK']) %>%                # replace the numbers with the values from TRUNK
  mutate(joined_text = paste(HEAD, TRUNK)) %>%        # paste the text together in a new column
  select(HEAD, TRUNK, joined_text)                    # select the needed columns 

Then I get this:

#       HEAD    TRUNK      joined_text
#         są   balony        są balony
#        ale       są           ale są
#         są swobodne      są swobodne
#       <NA>      ale           NA ale
#        ale        w            ale w
#          w   ramach         w ramach
#     ramach długości  ramach długości
#   długości  sznurka długości sznurka
#       <NA>        [             NA [
#         [        #              [ #
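
Note that `my_data[HEAD, 'TRUNK']` indexes rows by position, so it gives the right words here only because `nr` happens to equal the row number. As a rough base R analogue of Excel's INDEX/MATCH pair, the sketch below uses `match()` to look up each HEAD number in `nr` explicitly. The data frame is retyped from the sample above (without the `lemat` column), and the names `pos`, `head_word` and `result` are made up for this illustration.

```r
# Sample data from the question (first ten rows only, lemat column omitted).
my_data <- data.frame(
  nr    = 1:10,
  TRUNK = c("balony", "są", "swobodne", "ale", "w",
            "ramach", "długości", "sznurka", "[", "#"),
  HEAD  = c(2, 4, 2, 14, 4, 5, 6, 7, 14, 9),
  stringsAsFactors = FALSE
)

# match() plays the role of Excel's MATCH(..., 0): for each HEAD number it
# returns the position of the first equal value in nr, or NA if there is none.
pos <- match(my_data$HEAD, my_data$nr)

# Indexing TRUNK by those positions plays the role of INDEX.
head_word <- my_data$TRUNK[pos]

# Build the two-word phrases; rows without a matching head stay NA.
result <- data.frame(
  HEAD        = head_word,
  TRUNK       = my_data$TRUNK,
  joined_text = ifelse(is.na(head_word), NA_character_,
                       paste(head_word, my_data$TRUNK)),
  stringsAsFactors = FALSE
)
result
```

Rows 4 and 9 point at `nr` 14, which is not part of the ten-row sample, so their head word is `NA` in this sketch; with the full file they would presumably resolve to a real word.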

Update:

Here is another way, which also works if you do not want to rely on row numbers:

# define a function to find and extract the right HEAD
get_head <- function(i){
  if (!(i %in% my_data$nr)) {
    return(NA)
  } else {
    head <- my_data[my_data$nr == i,'TRUNK']
    return(as.character(head))
  }
}

# replace with the new values
my_data$HEAD <- sapply(my_data$HEAD, get_head)

# now concatenate the text and select the columns you want
my_data %>% 
  mutate(joined_text = paste(HEAD, TRUNK)) %>%        # paste the text together in a new column
  select(HEAD, TRUNK, joined_text) 

This approach also works if you want to match strings rather than numbers.
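
If you would rather keep the loop from the question, it can probably be repaired along the following lines. As far as I can tell, three things go wrong in the original: `for (i in grafi)` runs once with `i` equal to the row count instead of iterating over the rows; `nrtrunkdf$V2` and `nrheaddf$V1` are `NULL` because those columns were renamed to `NR`, `TRUNK` and `HEAD`; and `JOINER[i] <- joinv` assigns a whole column rather than one cell, which is where the "would leave holes" error comes from. The sketch below is only one possible fix. It rebuilds `graf` by hand from the sample (an assumption, in place of reading graf.txt) and uses `nrow()` instead of `count()`.

```r
# Stand-in for the graf data frame read from graf.txt in the question;
# only the first ten rows of the sample are reproduced here.
graf <- data.frame(
  nr    = 1:10,
  trunk = c("balony", "są", "swobodne", "ale", "w",
            "ramach", "długości", "sznurka", "[", "#"),
  head  = c(2, 4, 2, 14, 4, 5, 6, 7, 14, 9),
  stringsAsFactors = FALSE
)

n <- nrow(graf)  # number of rows, no dplyr needed

# Pre-allocate the result with named columns.
JOINER <- data.frame(HEAD  = rep(NA_character_, n),
                     TRUNK = graf$trunk,
                     stringsAsFactors = FALSE)

for (i in seq_len(n)) {                    # iterate over 1..n, not over n itself
  hit <- which(graf$nr == graf$head[i])    # rows whose nr equals this row's head
  if (length(hit) > 0) {
    JOINER$HEAD[i] <- graf$trunk[hit[1]]   # fill a single cell, not a column
  }
}

# Join the two words; rows without a matching head stay NA.
JOINER$joined_text <- ifelse(is.na(JOINER$HEAD), NA_character_,
                             paste(JOINER$HEAD, JOINER$TRUNK))
JOINER
```

For a file of any real size a vectorised lookup is likely to be both shorter and faster than the loop, but the loop form stays closest to the original code.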
