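I have data from a model simulation with multiple replicates (run_num). Within each run I record length and egg output (phys_length and no_eggs) at every time step, and the number of time steps differs between runs.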
> head(params)
run_num time_step phys_length no_eggs
1 1 0 0.000000000 0
2 1 1 0.008209734 0
3 1 2 0.016332967 0
4 1 3 0.024238314 0
5 1 4 0.031594308 0
6 1 5 0.033077672 0
> tail(params)
run_num time_step phys_length no_eggs
607395 49 13728 15.04109 727
607396 49 13729 15.04111 727
607397 49 13730 15.04112 727
607398 49 13731 15.04113 727
607399 49 13732 15.04114 727
607400 49 13733 15.04115 727
>
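I want to find every instance where an individual (a run) never started laying eggs, and remove that entire run from the data frame. My idea is to find the maximum time_step of each run and, if no_eggs is 0 at that point, delete the whole run.
I am new to R and don't know where to start in actually telling R to do this. I was thinking of a for loop, but I'm having trouble working out how to tell R to look only at the maximum time_step within each run, and then how to delete all rows with that run_num.
This is what I have so far, but I'm not sure I'm on the right track, since I have never used a for loop before.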
for (val in params$run_num)) {
if(no_eggs )
}
Any ideas on how to do this would be appreciated.
This can be a bit hard to grasp when you are starting out in R. Here is how I would approach it in base R:
#your data
run_num <- c(1,1,1,1,1,1,49,49,49,49,49,49)
timestep <- c(0,1,2,3,4,5,13728,13729,13730,13731,13732,13733)
phys_length <- c(0.000000000,0.008209734,0.016332967,0.024238314,0.031594308,0.033077672,15.04109,15.04111,15.04112,15.04113,15.04114,15.04115)
eggs <- c(0,0,0,0,0,0,727,727,727,727,727,727)
params <- data.frame(run_num, timestep, phys_length, eggs)
# This comparison returns a logical vector (TRUE/FALSE), one value per row; we can use it to select the rows where the condition holds
params$eggs == 0
# In params[rows, columns], the first index selects rows and the second selects columns
# Leaving the second index empty keeps all columns, so this returns every row where eggs == 0
no_eggs <- params[params$eggs == 0,]
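The subsetting above picks out individual rows with zero eggs, but it does not yet drop whole runs. Here is a minimal base R sketch of one way to finish the job, following the idea in the question: look at the last time step of each run and remove the run if the egg count is still 0 there. It uses the column names from the example above (`timestep`, `eggs`) rather than those in the original data. The rows for run 2 are made-up illustration data, and `last_step`, `barren_runs` and `params_laying` are hypothetical names.

```r
# Small example: run 1 never lays eggs, run 2 (made-up data) and run 49 do
params <- data.frame(
  run_num     = c(1, 1, 1, 2, 2, 2, 49, 49, 49),
  timestep    = c(0, 1, 2, 0, 1, 2, 13731, 13732, 13733),
  phys_length = c(0.000, 0.008, 0.016, 0.000, 0.009, 0.017, 15.04113, 15.04114, 15.04115),
  eggs        = c(0, 0, 0, 0, 0, 3, 727, 727, 727)
)

# For every row, the largest timestep reached by that row's run
last_step <- ave(params$timestep, params$run_num, FUN = max)

# Runs whose row at the final time step still has zero eggs
barren_runs <- unique(params$run_num[params$timestep == last_step & params$eggs == 0])

# Keep only the rows that belong to the remaining runs
params_laying <- params[!(params$run_num %in% barren_runs), ]
```

No explicit for loop is needed: `ave()` does the per-run grouping, and `%in%` drops all rows of the flagged runs in one step. If the egg count is cumulative and can never decrease, testing `ave(params$eggs, params$run_num, FUN = max) > 0` should give the same result without looking at the time step at all.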