How do I select the uppercase words from a column and split them into a new column?

Question

I have a dataset of genes and drugs, all in a single column, that looks like this:

Molecules
3-nitrotyrosine
4-phenylbutyric acid
5-fluorouracil/leucovorin/oxaliplatin
5-hydroxytryptamine
ABCB4
ABCC8
ABCC9
ABCF2
ABHD4

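Genes and drugs are distributed randomly through the column, so I can't split it at a fixed position. I want to remove the genes and put them in a new column. I was wondering whether I could use isupper() to select the genes and move them to a new column, although I know it only works on strings. Is there a way to select the rows that are written in uppercase and put them into a new column? Any guidance would be appreciated.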

Expected Output:
  Column 1                                Column 2
3-nitrotyrosine                           ABCB4
4-phenylbutyric acid                      ABCC8
5-fluorouracil/leucovorin/oxaliplatin     ABCC9
5-hydroxytryptamine                       ABCF2

Answer 1

Read your file into a list:

with open('test.txt', 'r') as f:
    lines = [line.strip() for line in f]

Separate out the all-uppercase entries as follows:

mols = [x for x in lines if x.upper() != x]
genes = [x for x in lines if x.upper() == x]

Result:

mols
['3-nitrotyrosine', '4-phenylbutyric acid', 
 '5-fluorouracil/leucovorin/oxaliplatin', '5-hydroxytryptamine']
genes
['ABCB4', 'ABCC8', 'ABCC9', 'ABCF2', 'ABHD4']
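
The two lists can then be paired row by row to get something like the two-column layout in the question. The following is a minimal sketch, not part of the original answer. It assumes the lines are already in memory (the `lines` literal stands in for the file contents), uses `str.isupper()` in place of the `x.upper() == x` comparison, and pads the shorter column with empty strings via `itertools.zip_longest`.

```python
from itertools import zip_longest

# Stand-in for the stripped lines read from the file (assumed data).
lines = [
    "3-nitrotyrosine",
    "4-phenylbutyric acid",
    "5-fluorouracil/leucovorin/oxaliplatin",
    "5-hydroxytryptamine",
    "ABCB4",
    "ABCC8",
    "ABCC9",
    "ABCF2",
    "ABHD4",
]

# str.isupper() is True only if the string has at least one cased
# character and every cased character is uppercase.
genes = [x for x in lines if x.isupper()]
mols = [x for x in lines if not x.isupper()]

# Pair the two lists by position; the shorter one is padded with "".
rows = list(zip_longest(mols, genes, fillvalue=""))

for mol, gene in rows:
    print(f"{mol:<40}{gene}")
```

Unlike `x.upper() == x`, `isupper()` returns False for strings that contain no letters at all (an empty line or `'123'`, for example), so such lines end up in `mols` rather than `genes`.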

Answer 2

As mentioned above, separating out the uppercase entries is simple:

df.loc[df['Molecules'].str.isupper()]

  Molecules
5     ABCB4
6     ABCC8
7     ABCC9
8     ABCF2
9     ABHD4

df.loc[~df['Molecules'].str.isupper()]

                               Molecules
0                        3-nitrotyrosine
1                        4-phenylbutyric
2                                   acid
3  5-fluorouracil/leucovorin/oxaliplatin
4                    5-hydroxytryptamine

However, until you can provide more details, it isn't clear how you want the rows to be matched.
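
If pairing purely by position is acceptable, the two halves can be placed side by side. This is only a sketch under that assumption, since the question does not say how drugs and genes should be matched. The sample frame and the names `is_gene`, `drugs`, `genes`, and `result` are illustrative and not taken from the original answer.

```python
import pandas as pd

# Assumed sample data: one entry per row, as in the question.
df = pd.DataFrame({"Molecules": [
    "3-nitrotyrosine",
    "4-phenylbutyric acid",
    "5-fluorouracil/leucovorin/oxaliplatin",
    "5-hydroxytryptamine",
    "ABCB4", "ABCC8", "ABCC9", "ABCF2", "ABHD4",
]})

# Boolean mask: True for the all-uppercase (gene) rows.
is_gene = df["Molecules"].str.isupper()

# Reset the index on each half so both columns start at row 0.
drugs = df.loc[~is_gene, "Molecules"].reset_index(drop=True)
genes = df.loc[is_gene, "Molecules"].reset_index(drop=True)

# concat aligns on the index; the shorter column is filled with NaN.
result = pd.concat(
    [drugs.rename("Column 1"), genes.rename("Column 2")], axis=1
)
print(result)
```

Because there are four drugs and five genes in the sample, the last row of "Column 1" is NaN; call `result.fillna("")` if blank cells are preferred.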
