RandomForest n_estimators calculation [closed]

Question

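I am analyzing data and training a model that will later be used in a date prediction application. My server currently holds about 7 million rows (table size = 6,400,000 rows x 8 columns). Is there a recommended n_estimators value for a dataset of this size? I need a reasonable balance between prediction accuracy and application speed.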

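from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def train_random_forest(data):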
    try:
        # Split data into features and target
        X = data.drop(columns=['ident'])  # Features
        y = data['ident']  # Target
        
        # Split data into train and test sets
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        
        # Initialize Random Forest model
        model = RandomForestClassifier(n_estimators=100, random_state=42)
        
        # Train the model
        model.fit(X_train, y_train)
        
        # Evaluate the model
        accuracy = model.score(X_test, y_test)
        print(f"Model accuracy: {accuracy}")
        
        return model
    except Exception as e:
        print("Error training Random Forest model:", e)
        return None

I tried different values from 1 to 1500 but could not find the best one.

Answer 1

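I am not aware of any objectively best choice for n_estimators, but you could plot accuracy vs. n_estimators and computation time vs. n_estimators to find a suitable value (somewhat like in this post).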
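
A minimal sketch of that sweep, in case it helps. The helper names (sweep_n_estimators, plot_sweep), the candidate values, and the synthetic data from make_classification are all assumptions for illustration; in practice you would pass in your own features and the 'ident' target, probably on a subsample first, since fitting several forests on 6.4 million rows can be slow.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def sweep_n_estimators(X, y, candidates, random_state=42):
    """Fit one forest per candidate; return (n, test accuracy, fit seconds) tuples."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    results = []
    for n in candidates:
        model = RandomForestClassifier(n_estimators=n, random_state=random_state)
        start = time.perf_counter()
        model.fit(X_train, y_train)
        fit_seconds = time.perf_counter() - start
        results.append((n, model.score(X_test, y_test), fit_seconds))
    return results


def plot_sweep(results, path="n_estimators_sweep.png"):
    """Optional: save accuracy and fit-time curves (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    ns = [r[0] for r in results]
    fig, (ax_acc, ax_time) = plt.subplots(1, 2, figsize=(10, 4))
    ax_acc.plot(ns, [r[1] for r in results], marker="o")
    ax_acc.set_xlabel("n_estimators")
    ax_acc.set_ylabel("test accuracy")
    ax_time.plot(ns, [r[2] for r in results], marker="o")
    ax_time.set_xlabel("n_estimators")
    ax_time.set_ylabel("fit time (s)")
    fig.savefig(path)


# Synthetic stand-in data (an assumption, not the asker's table).
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=7, n_informative=5, random_state=0
)
demo_results = sweep_n_estimators(X_demo, y_demo, candidates=[5, 20, 50])
for n, acc, secs in demo_results:
    print(f"n_estimators={n:4d}  accuracy={acc:.3f}  fit_time={secs:.2f}s")
```

Accuracy usually flattens out after some number of trees while fit and prediction time keep growing roughly linearly, so a reasonable pick is the smallest value on the flat part of the accuracy curve. Calling plot_sweep(demo_results) writes both curves to an image file.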
