Classifying accelerometer data


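I am trying to classify accelerometer data (sampled at 100 Hz) into 4 different transportation modes (0, 1, 2, 3). I have 41 different CSV files, each representing one time series. I store each file as a DataFrame in a list called "subjects". Each CSV file looks like this: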

    # Check if the label mapping worked
    test = subjects[0]
    print(test.head())
    print(test.info())
    print(len(test))
              x         y         z  label
    0 -0.154881  0.383397 -0.653029      0
    1 -0.189302  0.410185 -0.597840      0
    2 -0.202931  0.408217 -0.490296      0
    3 -0.205011  0.407853 -0.360820      0
    4 -0.196665  0.430047 -0.147033      0

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 128628 entries, 0 to 128627
    Data columns (total 4 columns):
     #   Column  Non-Null Count   Dtype  
    ---  ------  --------------   -----  
     0   x       128628 non-null  float64
     1   y       128628 non-null  float64
     2   z       128628 non-null  float64
     3   label   128628 non-null  int64  
    dtypes: float64(3), int64(1)
    memory usage: 3.9 MB
    None

    128628

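To begin with, I would like to train a random forest classifier. However, I am not sure how to create the training and test sets, because the data is spread across separate CSV files.

How should I create the training and test sets for this task? At first I considered concatenating all the CSV files, but since each file represents its own time series, I am not sure that is the right approach.

Thanks in advance for your help!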

python classification random-forest accelerometer train-test-split
1 Answer

Here is a rough example of what you want to do:

    # Import the necessary libraries
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, classification_report

    # Concatenate the list of DataFrames (the "subjects" list from the question)
    df = pd.concat(subjects, ignore_index=True)

    # Split the data into features (X) and target labels (y)
    X = df[['x', 'y', 'z']]
    y = df['label']

    # Split the data into training and testing sets
    # (a random row-level split: rows from one file can land in both sets)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Create a Random Forest Classifier
    clf = RandomForestClassifier(n_estimators=100, random_state=42)

    # Fit the classifier to the training data
    clf.fit(X_train, y_train)

    # Make predictions on the test data
    y_pred = clf.predict(X_test)

    # Evaluate the classifier's performance
    accuracy = accuracy_score(y_test, y_pred)
    report = classification_report(y_test, y_pred)

    print("Accuracy:", accuracy)
    print("Classification Report:\n", report)
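Since each file is its own time series, one possible variation is to hold out whole files instead of individual rows, so that no recording contributes to both the training and the test set. The sketch below is only an illustration under several assumptions that are not in the question: the synthetic `subjects` list stands in for the real data, the 2-second window (`WINDOW = 200` samples at 100 Hz) and the mean/std features are arbitrary choices, and `window_features` and `file_id` are hypothetical names. It uses scikit-learn's `GroupShuffleSplit` with the file index as the group.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GroupShuffleSplit

# Synthetic stand-in for the real `subjects` list (assumption: same columns
# as in the question; replace this with the DataFrames loaded from the CSVs).
rng = np.random.default_rng(0)
subjects = []
for _ in range(10):
    n = 2000
    frame = pd.DataFrame(rng.normal(size=(n, 3)), columns=["x", "y", "z"])
    # Four contiguous segments of 500 samples, each with a random mode 0-3.
    frame["label"] = np.repeat(rng.integers(0, 4, size=n // 500), 500)
    subjects.append(frame)

WINDOW = 200  # 2 seconds at 100 Hz (an arbitrary choice for illustration)


def window_features(frame, file_id, window=WINDOW):
    """Summarise non-overlapping windows of one recording (hypothetical helper)."""
    rows = []
    for start in range(0, len(frame) - window + 1, window):
        chunk = frame.iloc[start:start + window]
        feats = {}
        for axis in ("x", "y", "z"):
            feats[f"{axis}_mean"] = chunk[axis].mean()
            feats[f"{axis}_std"] = chunk[axis].std()
        # Label the window with its most frequent per-sample label.
        feats["label"] = chunk["label"].mode().iloc[0]
        # Remember which file the window came from.
        feats["file_id"] = file_id
        rows.append(feats)
    return pd.DataFrame(rows)


windows = pd.concat(
    [window_features(frame, i) for i, frame in enumerate(subjects)],
    ignore_index=True,
)
X = windows.drop(columns=["label", "file_id"])
y = windows["label"]
groups = windows["file_id"]

# Hold out roughly 20% of the files; every window of a file stays on one side.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(X, y, groups=groups))

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X.iloc[train_idx], y.iloc[train_idx])
y_pred = clf.predict(X.iloc[test_idx])
print("Accuracy:", accuracy_score(y.iloc[test_idx], y_pred))
```

With the synthetic noise used here the accuracy is meaningless; the point is only the split. On real recordings, a file-level hold-out usually gives a more honest estimate than a row-level shuffle, because neighbouring samples of the same recording are highly correlated.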