How do I select records based on the PCs to reduce dimensionality in RapidMiner?


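I am new to RapidMiner. I have a huge dataset and used principal component analysis (PCA) to reduce its dimensionality. The problem is that once I have the PCs, I don't know how to select records based on them. How do I build the new, reduced dataset?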

This is what I tried to use:

This is what I got:

pca rapidminer dimensionality-reduction
1 Answer

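You can use the "Weight by PCA" operator to compute attribute-importance weights, then use the "Select by Weights" operator to reduce the number of attributes in the original dataset.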

Check the sample process attached below (just copy the XML into the RapidMiner process window). Feel free to also browse or ask questions in the RapidMiner community.


<?xml version="1.0" encoding="UTF-8"?><process version="9.2.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Root" origin="GENERATED_TUTORIAL">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<process expanded="true">
  <operator activated="true" class="retrieve" compatibility="9.2.000" expanded="true" height="68" name="Sonar" origin="GENERATED_TUTORIAL" width="90" x="112" y="34">
    <parameter key="repository_entry" value="//Samples/data/Sonar"/>
  </operator>
  <operator activated="true" class="weight_by_pca" compatibility="9.2.000" expanded="true" height="82" name="Weight by PCA" width="90" x="313" y="34">
    <parameter key="normalize_weights" value="true"/>
    <parameter key="sort_weights" value="true"/>
    <parameter key="sort_direction" value="ascending"/>
    <parameter key="component_number" value="1"/>
  </operator>
  <operator activated="true" class="select_by_weights" compatibility="9.2.000" expanded="true" height="103" name="Select by Weights" width="90" x="581" y="34">
    <parameter key="weight_relation" value="greater equals"/>
    <parameter key="weight" value="0.5"/>
    <parameter key="k" value="10"/>
    <parameter key="p" value="0.5"/>
    <parameter key="deselect_unknown" value="true"/>
    <parameter key="use_absolute_weights" value="true"/>
  </operator>
  <connect from_op="Sonar" from_port="output" to_op="Weight by PCA" to_port="example set"/>
  <connect from_op="Weight by PCA" from_port="weights" to_op="Select by Weights" to_port="weights"/>
  <connect from_op="Weight by PCA" from_port="example set" to_op="Select by Weights" to_port="example set input"/>
  <connect from_op="Select by Weights" from_port="example set output" to_port="result 1"/>
  <portSpacing port="source_input 1" spacing="0"/>
  <portSpacing port="sink_result 1" spacing="0"/>
  <portSpacing port="sink_result 2" spacing="162"/>
</process>
</operator>
</process>
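
For readers who want to see the idea outside RapidMiner, here is a rough NumPy sketch of weighting attributes by one principal component and then keeping those whose weight is at least 0.5. It is only an illustration, not RapidMiner's implementation: the function names `pca_component_weights` and `select_by_weights` are hypothetical, the data is synthetic, and the min-max scaling of absolute loadings is an assumption about what normalized weights look like.

```python
import numpy as np

def pca_component_weights(X, component=0):
    """Absolute loadings of one principal component, min-max scaled to [0, 1].

    Assumption: PCA on the covariance matrix; this may differ from what
    RapidMiner's "Weight by PCA" operator does internally.
    """
    cov = np.cov(X, rowvar=False)              # attributes are columns
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # largest variance first
    loadings = np.abs(eigvecs[:, order[component]])
    span = loadings.max() - loadings.min()
    if span == 0:
        return np.ones_like(loadings)
    return (loadings - loadings.min()) / span

def select_by_weights(X, weights, threshold=0.5):
    """Keep the columns whose weight is >= threshold; all rows are kept."""
    keep = np.flatnonzero(weights >= threshold)
    return X[:, keep], keep

# Synthetic data: two high-variance correlated attributes, two noise attributes.
rng = np.random.default_rng(2001)
base = rng.normal(size=(200, 1))
X = np.hstack([
    base * 5.0 + rng.normal(scale=0.1, size=(200, 1)),
    base * 4.0 + rng.normal(scale=0.1, size=(200, 1)),
    rng.normal(scale=0.1, size=(200, 2)),
])

weights = pca_component_weights(X, component=0)
X_reduced, kept = select_by_weights(X, weights, threshold=0.5)
print("kept attribute indices:", kept)
print("reduced shape:", X_reduced.shape)
```

Note that the reduced dataset keeps every record and drops only attributes, which matches what the answer describes: the weights rank attributes, and the selection step filters columns rather than rows.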