Combining two dataframes with different column names in a time series

Question (votes: 0, answers: 1)

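I have two dataframes, one named sensor and one named train. The sensor dataframe holds time-series data whose index is the ts_sensor column (it appears as sensor_ts in the column listing below). Each remaining column is named after a sensor, and each cell holds the value that sensor took at time ts_sensor. The train dataframe contains an id_sensor column with the sensor names (matching the column names of the sensor dataframe) and a boot_threshold column with the threshold for each sensor.

Here is the information for the two dataframes described above.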

train.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 264 entries, 0 to 263
Data columns (total 17 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   id_train             264 non-null    object 
 1   asset_id             264 non-null    object 
 2   train_type           264 non-null    object 
 3   equipment_type       264 non-null    object 
 4   id_sensor            264 non-null    object 
 5   unit                 262 non-null    object 
 6   is_operating         262 non-null    float64
 7   will_be_available    262 non-null    float64
 8   pi_acquisition_date  0 non-null      float64
 9   pi_description       264 non-null    object 
 10  sensor_description   264 non-null    object 
 11  tag_id               262 non-null    float64
 12  priority             262 non-null    float64
 13  for_ml               264 non-null    int64  
 14  onoff_parameter      264 non-null    int64  
 15  boot_threshold       264 non-null    int64  
 16  efficiency_index     264 non-null    int64  
dtypes: float64(5), int64(4), object(8)
memory usage: 35.2+ KB

sensor.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1729930 entries, 0 to 1729929
Columns: 135 entries, level_0 to Australia.Blacktip.ICSS.KPI.EE.GasTurbineEfficiency_A
dtypes: float64(132), int64(2), object(1)
memory usage: 1.7+ GB

sensor.columns
Index(['level_0', 'index', 'sensor_ts',
       'Australia.Blacktip.ICSS.360140413-100.PV',
       'Australia.Blacktip.ICSS.360140415-100.PV',
       'Australia.Blacktip.ICSS.360140417-100.PV',
       'Australia.Blacktip.ICSS.360140421-100.PV',
       'Australia.Blacktip.ICSS.360140423-100.PV',
       'Australia.Blacktip.ICSS.360140425-100.PV',
       'Australia.Blacktip.ICSS.360140433-100.PV',
       ...
       'Australia.Blacktip.ICSS.3601PDI144.DACA.PV',
       'Australia.Blacktip.ICSS.3601PIT111.PV',
       'Australia.Blacktip.ICSS.3601PIT141.PV',
       'Australia.Blacktip.ICSS.3601TIT042.PV',
       'Australia.Blacktip.ICSS.3601TIT052.PV',
       'Australia.Blacktip.ICSS.3601TIT111.PV',
       'Australia.Blacktip.ICSS.3601TIT122.PV',
       'Australia.Blacktip.ICSS.3601TIT141.PV',
       'Australia.Blacktip.ICSS.3601TIT152.PV',
       'Australia.Blacktip.ICSS.KPI.EE.GasTurbineEfficiency_A'],
      dtype='object', length=135)

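I want to obtain a new dataframe that preserves the time series and contains only the sensors whose value at ts_sensor is above the corresponding boot_threshold.

Thanks, everyone.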

python dataframe join time-series
1 answer
0 votes

Is this what you are looking for?

import pandas as pd

# Simulated sensor data (a small subset of the actual data, for demonstration)
sensor_data = {
    'sensor_ts': pd.date_range(start='2023-01-01', periods=4, freq='D'),
    'Australia.Blacktip.ICSS.360140413-100.PV': [100, 200, 300, 400],
    'Australia.Blacktip.ICSS.360140415-100.PV': [100, 220, 330, 440],
}

sensor = pd.DataFrame(sensor_data)

# Simulated train data
train_data = {
    'id_sensor': ['Australia.Blacktip.ICSS.360140413-100.PV', 'Australia.Blacktip.ICSS.360140415-100.PV'],
    'boot_threshold': [250, 300],
}

train = pd.DataFrame(train_data)

# Reshape sensor DataFrame from wide to long format
sensor_long = pd.melt(sensor, id_vars=['sensor_ts'], var_name='id_sensor', value_name='value')

# Merge the reshaped sensor data with the train data
merged_data = pd.merge(sensor_long, train, on='id_sensor')

# Filter to include only rows where sensor value is greater than the boot_threshold
filtered_data = merged_data[merged_data['value'] > merged_data['boot_threshold']]

# Display the filtered data for verification
print(filtered_data)

Output:

   sensor_ts                                 id_sensor  value  boot_threshold
2 2023-01-03  Australia.Blacktip.ICSS.360140413-100.PV    300             250
3 2023-01-04  Australia.Blacktip.ICSS.360140413-100.PV    400             250
6 2023-01-03  Australia.Blacktip.ICSS.360140415-100.PV    330             300
7 2023-01-04  Australia.Blacktip.ICSS.360140415-100.PV    440             300
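
If the result should keep the original wide layout (one row per timestamp, one column per sensor), the same filter can be sketched without melting. This is only an illustration under assumptions not stated in the question: every id_sensor value in train is unique, the timestamp column is named sensor_ts as in the column listing, and the names thresholds, common, wide, above and wide_filtered are made up for this example.

```python
import pandas as pd

# Toy data mirroring the answer above (illustrative values, not real readings)
sensor = pd.DataFrame({
    'sensor_ts': pd.date_range(start='2023-01-01', periods=4, freq='D'),
    'Australia.Blacktip.ICSS.360140413-100.PV': [100, 200, 300, 400],
    'Australia.Blacktip.ICSS.360140415-100.PV': [100, 220, 330, 440],
})
train = pd.DataFrame({
    'id_sensor': ['Australia.Blacktip.ICSS.360140413-100.PV',
                  'Australia.Blacktip.ICSS.360140415-100.PV'],
    'boot_threshold': [250, 300],
})

# Map each sensor name to its threshold (assumes id_sensor values are unique)
thresholds = train.set_index('id_sensor')['boot_threshold']

# Keep only the sensor columns that have a threshold, indexed by timestamp
common = [c for c in sensor.columns if c in thresholds.index]
wide = sensor.set_index('sensor_ts')[common]

# Compare each column with its own threshold; the Series is aligned on column names
above = wide.gt(thresholds[common], axis='columns')

# Replace values at or below the threshold with NaN,
# then drop timestamps where no sensor is above its threshold
wide_filtered = wide.where(above).dropna(how='all')
print(wide_filtered)
```

On a frame as large as the one in the question (about 1.7 million rows by 135 columns), staying in wide format may also avoid the very long intermediate table that melt produces, although that has not been measured here.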