How do I keep only the substring before the first whitespace character in a column?

Question

Here is a sample of my data:

import pandas as pd
a=pd.DataFrame({'ID':[1,2,3,4,5],
                'Str':['aa aafae afre ht4','v fef 433','1234334 a','bijf 049tu0q4g vie','aaa 1']})

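Now I want to keep only the substring before the first whitespace character. I can find the position of the first whitespace character, but I don't know how to do the rest.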

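I am trying to print the first letter of each of the first three words of a sentence, but at the

d4 = y.find(" ", d3)

step, the program does not recognize the value as an integer, and if I convert it to an integer I get an error because the base is 10.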

How can I solve this?

Answer 1

Let's use the example string

this is just a test

You can do this:

test = "this is just a test"
first_word = test.split(" ")[0]
print(first_word)

This results in

this

What I'm doing here is splitting the string on spaces and taking the first element of the resulting list.
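One detail worth knowing about the split approach: `split(" ")` splits on single space characters only, so a tab or a leading space changes the result. The sketch below wraps the idea in a helper; the name `first_token` is hypothetical and not part of the original answer. It assumes that leading whitespace should be ignored, which is a slightly different rule from "everything before the first whitespace character".

```python
def first_token(text: str) -> str:
    """Return the first whitespace-delimited token of text, or '' if there is none."""
    # split() with no separator splits on any run of whitespace (spaces, tabs, newlines)
    # and ignores leading whitespace; maxsplit=1 stops after the first split.
    parts = text.split(maxsplit=1)
    return parts[0] if parts else ""


print(first_token("this is just a test"))
print(first_token("v\tfef 433"))
```

If leading whitespace should instead yield an empty result, `text.split(" ")[0]` from the answer above keeps that behavior for spaces.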

You can use a regular expression like this:

import re

test = "this is just a test"
first_word = re.sub(r'\s.*', '', test)  # raw string so that \s is not treated as an invalid escape
print(first_word)

Here, I search for the first whitespace character ('\s') followed by any text (.*) and replace the match with nothing ('').
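The same pattern can be applied to the whole column from the question through the pandas string accessor. This is a minimal sketch, assuming the DataFrame `a` and the column `Str` look like the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question
a = pd.DataFrame({'ID': [1, 2, 3, 4, 5],
                  'Str': ['aa aafae afre ht4', 'v fef 433', '1234334 a',
                          'bijf 049tu0q4g vie', 'aaa 1']})

# Remove the first whitespace character and everything after it, row by row.
# regex=True is passed explicitly because the default differs between pandas versions.
a['Str'] = a['Str'].str.replace(r'\s.*', '', regex=True)
print(a)
```

Rows without any whitespace are left unchanged, because the pattern simply does not match.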

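Alternatively, you can find the position of the first space and slice the string up to it:

test = "this is just a test"
space_pos = test.find(" ")  # index of the first space, or -1 if there is none
# Guard against -1: test[:-1] would silently drop the last character
first_word = test[:space_pos] if space_pos != -1 else test
print(first_word)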
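The question also mentions printing the first letter of each of the first three words with a call like `y.find(" ", d3)`. `str.find` returns an `int` and accepts an `int` start index, so no conversion should be needed as long as the start value itself comes from an earlier `find` call. A minimal sketch, assuming `y` holds the sentence (the sentence and the names `d1`, `d2` and `initials` are hypothetical):

```python
y = "this is just a test"  # hypothetical sentence standing in for the asker's input

# Simplest route: split on whitespace and take the first letter of up to three words
initials = [word[0] for word in y.split()[:3]]
print(initials)

# Same idea with chained find() calls: each search starts just past the previous space
d1 = y.find(" ")
d2 = y.find(" ", d1 + 1)
print(y[0], y[d1 + 1], y[d2 + 1])
```

The `find` version assumes the sentence has at least three words separated by single spaces; with fewer words `find` returns -1 and the indexing no longer means what it appears to.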

Answer 2

a['Str'] = a['Str'].str.split(' ')  # split each string on spaces
a['Str'] = a['Str'].str.get(0)      # keep the first item of each resulting list
print(a)
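A slightly more compact variant of the same idea, offered as a sketch rather than a replacement: calling `str.split` without a pattern splits on any whitespace, and `n=1` stops after the first split so the rest of each string is not broken up needlessly. The sample data is assumed to match the question.

```python
import pandas as pd

# Sample data copied from the question
a = pd.DataFrame({'ID': [1, 2, 3, 4, 5],
                  'Str': ['aa aafae afre ht4', 'v fef 433', '1234334 a',
                          'bijf 049tu0q4g vie', 'aaa 1']})

# Split each string on whitespace at most once, then keep the first piece
a['Str'] = a['Str'].str.split(n=1).str[0]
print(a)
```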