print(movie_idname['rating'])
I want to convert all of these values to int. Here is some of the code I have tried:
for rating in movie_idname:
    if rating == float:
        int_rating = movie_idname['rating'].astype(int)
        print(int_rating)
        break
int_rating = movie_idname['rating'].astype(int)
This is what the rating data looks like when I run print(movie_idname['rating']):
0 4.0
1 5.0
2 5.0
3 4.0
4 4.0
...
82624 3.0
82625 4.5
82626 4.0
82627 5.0
82628 4.5
Name: rating, Length: 82629, dtype: object
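You have a Series of strings that look like floats (which is why the output shows dtype: object). Strings like these cannot be converted directly to int, but you can get there by converting to float along the way: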
>>> import pandas as pd
>>> pd.Series(["1.0", "2.5"])
0 1.0
1 2.5
dtype: object
>>> pd.Series(["1.0", "2.5"]).astype(int)
Traceback (most recent call last):
...
ValueError: invalid literal for int() with base 10: '1.0'
>>> pd.Series(["1.0", "2.5"]).astype(float)
0 1.0
1 2.5
dtype: float64
>>> pd.Series(["1.0", "2.5"]).astype(float).astype(int)
0 1
1 2
dtype: int64
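As a complement to the session above, here is a minimal sketch of the same two-step conversion using pd.to_numeric, which also shows what happens to fractional ratings. The sample values are made up for illustration and are not the asker's data. Note that astype(int) truncates toward zero, while round() rounds halves to the nearest even number, so the two can disagree.

```python
import pandas as pd

# Hypothetical sample: ratings stored as strings, as in the question.
ratings = pd.Series(["4.0", "5.0", "3.5", "2.5"], name="rating")

# Parse the strings into float64 first; int() cannot parse "4.0" directly.
as_float = pd.to_numeric(ratings)

# astype(int) drops the fractional part (truncation toward zero).
truncated = as_float.astype(int)

# Rounding before the cast is an alternative; halves go to the even neighbour.
rounded = as_float.round().astype(int)

print(truncated.tolist())  # [4, 5, 3, 2]
print(rounded.tolist())    # [4, 5, 4, 2]
```

Which of the two is appropriate depends on what a rating of 4.5 should become in your analysis.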
Consider the following DataFrame:
In [1055]: df
Out[1055]:
rating
0 4.0
1 5.0
2 NaN
3 4.0
4 4.0
Converting to the nullable Int64 dtype will work here:
In [1053]: df['rating'].astype('Int64')
Out[1053]:
0 4
1 5
2 <NA>
3 4
4 4
Name: rating, dtype: Int64
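One caveat, sketched below under assumptions: the example above only contains whole-number floats. As far as I know, pandas refuses to cast a float column with fractional values (such as the 4.5 ratings in the question) straight to Int64, so the fractional part has to be removed first. The DataFrame here is hypothetical, and flooring is just one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value and one fractional rating.
df = pd.DataFrame({"rating": [4.0, 5.0, np.nan, 4.5]})

# A direct cast is expected to fail because 4.5 has no exact integer value.
try:
    df["rating"].astype("Int64")
    direct_cast_failed = False
except (TypeError, ValueError):
    direct_cast_failed = True

# Flooring first leaves only whole numbers (NaN stays NaN), so the cast is safe.
nullable = np.floor(df["rating"]).astype("Int64")

print(nullable.tolist())  # missing value is kept as <NA>
```

If the column holds strings rather than floats, it would need to be converted to float first (for example with astype(float)) before this step.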
What you did should work. Did you forget to look at int_rating? You did not assign it back to the DataFrame, e.g. movie_idname['int_rating'] = ...
For example, try this:
import pandas as pd
from random import uniform
movie_idname = pd.DataFrame({
'rating': [uniform(0, 10) for _ in range(100)]
})
print(movie_idname)
rating
0 6.032252
1 0.492256
2 7.474722
3 0.175150
4 7.286012
.. ...
95 1.385851
96 9.070880
97 7.222838
98 4.941222
99 1.443023
movie_idname['rating_int'] = movie_idname['rating'].astype(int)
print(movie_idname)
rating rating_int
0 6.032252 6
1 0.492256 0
2 7.474722 7
3 0.175150 0
4 7.286012 7
.. ... ...
95 1.385851 1
96 9.070880 9
97 7.222838 7
98 4.941222 4
99 1.443023 1