
Calculate deltas between different rows in same table [duplicate]

I have a table that contains many measurements from different meters. Each measurement is stored in its own row together with the meter reading at that time. I need the difference between successive measurements for each meter.

Simplified input:

 [2016-11-03,MeterA,45]
 [2016-11-03,MeterB,45]
 [2016-11-04,MeterA,47]
 [2016-11-04,MeterB,54]

Currently I do this with several nested for loops, but it takes a long time and there is probably a more efficient way. My current code:

data$diff <- 0
for (address in unique(data$Address)) {
    subaddr <- subset(data, data$Address == address)
    for (meter in unique(subaddr$Meter)) {
        submeter <- subset(subaddr, subaddr$Meter == meter)
        for (i in 1:nrow(submeter)) {
            if (i > 1) {
                prow <- submeter[i - 1, ]
                row <- submeter[i, ]
                data[which(data$Address == address & data$Meter == meter & data$UCPTlogTime == row$UCPTlogTime), ]$diff <- row$UCPTvalue - prow$UCPTvalue
            }
        }
    }
}

Desired output

 [2016-11-03,MeterA,0]
 [2016-11-03,MeterB,0]
 [2016-11-04,MeterA,2]
 [2016-11-04,MeterB,9]
Updated: 2023-03-21 17:03

Best answer

This is a breeze with dplyr using the lag function. Assuming the columns in your dataframe are named UCPTlogTime, Address, Meter, and UCPTvalue:

library(dplyr)

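# For each Address/Meter pair, subtract the previous reading (ordered by UCPTlogTime).
data <- data %>% group_by(Address, Meter) %>%
  mutate(delta = UCPTvalue - lag(UCPTvalue, order_by = UCPTlogTime)) %>%
  # The first reading in each group has no predecessor, so its delta becomes 0.
  mutate(delta = ifelse(is.na(delta), 0, delta))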
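
If dplyr is not available, the same per-group delta can probably be obtained in base R with ave() and diff(). The following is only a minimal sketch and not part of the original answer. It rebuilds the question's sample input, and the Address value "Addr1" is made up because the sample omits that column.

```r
# Sample rows from the question. "Addr1" is a hypothetical Address value,
# added only because the question's loop groups by Address.
data <- data.frame(
  UCPTlogTime = as.Date(c("2016-11-03", "2016-11-03", "2016-11-04", "2016-11-04")),
  Address     = "Addr1",
  Meter       = c("MeterA", "MeterB", "MeterA", "MeterB"),
  UCPTvalue   = c(45, 45, 47, 54),
  stringsAsFactors = FALSE
)

# Sort so that each meter's readings are in time order.
data <- data[order(data$Address, data$Meter, data$UCPTlogTime), ]

# Within each Address/Meter group, take the difference to the previous
# reading; the first reading of each group gets a delta of 0.
data$delta <- ave(data$UCPTvalue, data$Address, data$Meter,
                  FUN = function(v) c(0, diff(v)))

# Restore chronological order for display.
data <- data[order(data$UCPTlogTime, data$Meter), ]
print(data[, c("UCPTlogTime", "Meter", "delta")])
```

On the sample rows the delta column comes out as 0, 0, 2, 9, matching the desired output. Unlike the dplyr version, this sketch sorts the whole table first, so the original row order is not preserved.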
