How do I print categorical features in machine learning?

Question (votes: -1, answers: 1)

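Suppose I have a training dataset:

r1: cheap, expensive -> price

r2: excited -> entertainment

r3: hot, summer -> weather

r4: money -> price

r5: rain -> weather


I then want to display it grouped by category, like this:

price -> cheap, expensive, money

entertainment -> excited

weather -> hot, summer, rain

Does anyone know how to do this? I am doing NLP research. Thanks.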

python machine-learning nlp nltk data-science
1 Answer
0 votes
import pandas as pd

# Dictionary of items
d = {'words': [['cheap', 'expensive'], ['excited'], ['hot', 'summer'], ['money'], ['rain']],
     'category': ['price', 'entertainment', 'weather', 'price', 'weather']}

# Convert dictionary to dataframe
df = pd.DataFrame(d)

# Unpack the list of 'words' by joining with ','
df.words = df.words.str.join(',')

# Groupby and aggregate to get the unique 'words' for each 'category'
new_df = df.groupby('category').agg({'words': 'unique'})

# The aggregation yields an array of items per category, so unpack by joining with ','
new_df.words = new_df.words.str.join(',')

# reset_index() turns the 'category' index back into a regular column.
# This is optional. If not used, 'category' will be the index of the dataframe.
new_df.reset_index(inplace=True)
new_df
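If pandas is not otherwise needed, the same grouping can probably be done with the standard library alone. The following is a minimal sketch, not part of the original answer: the `rows` list and the `group_words_by_category` helper are hypothetical names introduced for illustration, and it assumes the data is available as (words, category) pairs.

```python
from collections import defaultdict

# Hypothetical input: the rows from the question as (words, category) pairs.
rows = [
    (["cheap", "expensive"], "price"),
    (["excited"], "entertainment"),
    (["hot", "summer"], "weather"),
    (["money"], "price"),
    (["rain"], "weather"),
]


def group_words_by_category(rows):
    """Collect the unique words for each category, keeping first-seen order."""
    grouped = defaultdict(list)
    for words, category in rows:
        for word in words:
            # Skip words already recorded for this category.
            if word not in grouped[category]:
                grouped[category].append(word)
    return dict(grouped)


grouped = group_words_by_category(rows)

# Print one line per category in the "category -> word, word" pattern.
for category, words in grouped.items():
    print(f"{category} -> {', '.join(words)}")
```

Unlike the groupby version, which sorts the categories alphabetically, this sketch keeps categories in the order they first appear in the data.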