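I have a list a (shown below), and I want to sort these file names in ascending order of the number they contain.
If I use the split function, it only splits off the extension npy, and the sort function only seems to work on integers. What should I do to achieve this?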
import os

a = ['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy']
b = 'kernal_1.0.npy'
print(os.path.splitext(b))
# ('kernal_1.0', '.npy')
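To make the problem concrete, here is a small sketch of what a plain sorted() call does with this list. It does run on strings, but it compares them character by character, which is why the result is not in numeric order.

```python
a = ['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy']

# sorted() works on strings too, but compares them character by character,
# so 'kernal_100.npy' sorts before 'kernal_50.npy' ('1' < '5').
lexicographic = sorted(a)
print(lexicographic)
```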
Since the beginning and the end of every name are always the same, you can slice each name by index.
a = ['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy']
prefix_len = len('kernal_')
suffix_len = len('.npy')
# The key parameter tells sorted() *how* to compare the items: it calls the
# function once per item and sorts by the values that function returns.
# The lambda is just a small anonymous function; it is worth reading up on.
# This line says: sort the list, comparing items using only the characters
# between index prefix_len and the last suffix_len characters, converted
# to float.
b = sorted(a, key=lambda x: float(x[prefix_len:-suffix_len]))
print(b)
# ['kernal_1.0.npy', 'kernal_10.npy', 'kernal_50.npy', 'kernal_100.npy']
def show_list_based_on_lambda(arr, key):
    """When you pass the key parameter to a sorting function, it behaves
    the same way as here: for every item, the sort only
    considers the value returned by the function you passed in.
    """
    for elem in arr:
        print(key(elem))
# This function is supposed to strip off the first and last character of an iterable.
f = lambda x:x[1:-1]
arr = ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
show_list_based_on_lambda(arr, f)
# a
# b
# c
# d
# e
# This function is supposed to add one to every element that passes by.
f = lambda x:x+1
arr = [10, 20, 30, 40, 50]
show_list_based_on_lambda(arr, f)
# 11
# 21
# 31
# 41
# 51
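The slicing above relies on fixed lengths. As a hedged variant, assuming Python 3.9 or later, str.removeprefix and str.removesuffix express the same idea by name. The helper name number_from_name is made up for this sketch.

```python
a = ['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy']

def number_from_name(name, prefix='kernal_', suffix='.npy'):
    # Hypothetical helper: drop the fixed prefix and suffix, then parse
    # what is left as a float. Requires Python 3.9+.
    return float(name.removeprefix(prefix).removesuffix(suffix))

b = sorted(a, key=number_from_name)
print(b)
```

Unlike slicing, this leaves a name untouched if it does not start or end as expected, so float() fails loudly instead of silently parsing the wrong characters.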
You can use a Pandas Series to generalize the solution:
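import numpy as np
import pandas as pd

a = np.array(['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy'])
# Take the text before the first '.', then the part after '_', and sort by it.
# Note: this keeps only the integer part of the number.
idx_ = pd.Series(a).str.split('.', expand=True).iloc[:, 0]\
    .str.split('_', expand=True).iloc[:, 1]\
    .astype(int).sort_values().index
a[idx_]
# array(['kernal_1.0.npy', 'kernal_10.npy', 'kernal_50.npy',
#        'kernal_100.npy'], dtype='<U14')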
Use os.path.splitext and str.split in the key for sorted (or list.sort):
import os
a = ['kernal_1.0.npy','kernal_100.npy','kernal_50.npy','kernal_10.npy']
sorted(a, key = lambda x: float(os.path.splitext(x)[0].split('_')[1]))
# ['kernal_1.0.npy', 'kernal_10.npy', 'kernal_50.npy', 'kernal_100.npy']
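The same key also works with list.sort, which sorts the list in place and returns None. A minimal sketch; the helper name kernel_number is only illustrative:

```python
import os

def kernel_number(name):
    # Illustrative helper: 'kernal_1.0.npy' -> 'kernal_1.0' -> '1.0' -> 1.0
    stem = os.path.splitext(name)[0]
    return float(stem.split('_')[1])

a = ['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy']
result = a.sort(key=kernel_number)  # sorts in place, so result is None
print(a)
```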
Try this:
b = sorted(a, key = lambda x : int(x[x.find('_')+1:].split('.')[0]))
Output:
b = ['kernal_1.0.npy', 'kernal_10.npy', 'kernal_50.npy', 'kernal_100.npy']
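One caveat worth checking: int(...split('.')[0]) keeps only the digits before the first dot, so names that differ only after the decimal point get the same key. The sketch below uses a made-up list (the names 'kernal_1.5.npy' and 'kernal_1.2.npy' are not from the question) to compare it with an assumed float-based key.

```python
names = ['kernal_1.5.npy', 'kernal_10.npy', 'kernal_1.2.npy']  # made-up names

def int_key(x):
    # Key from the answer above: only the integer part survives.
    return int(x[x.find('_') + 1:].split('.')[0])

def float_key(x):
    # Assumed alternative: strip the extension by splitting on the last
    # dot, then parse the whole number as a float.
    return float(x[x.find('_') + 1:].rsplit('.', 1)[0])

by_int = sorted(names, key=int_key)
by_float = sorted(names, key=float_key)
print(by_int)
print(by_float)
```

With the integer key, 1.5 and 1.2 tie and keep their input order; the float key orders them numerically.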
You can try the following old, classic approach:
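import re
from functools import cmp_to_key

def numeric_compare(x, y):
    # Extract the numbers found in each name, e.g. '1.0' from 'kernal_1.0.npy'.
    u = re.findall(r"\d+(?:\.\d+)?", x)
    v = re.findall(r"\d+(?:\.\d+)?", y)
    # Names without any number are treated as 0.
    u = [0] if len(u) == 0 else u
    v = [0] if len(v) == 0 else v
    # Return the sign as an int, so differences below 1 are not truncated to 0.
    d = float(u[0]) - float(v[0])
    return (d > 0) - (d < 0)

a = ['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy']
print(a)
# Python 3 removed sorted(..., cmp=...); wrap the comparison with cmp_to_key.
print(sorted(a, key=cmp_to_key(numeric_compare)))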
Output:
['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy']
['kernal_1.0.npy', 'kernal_10.npy', 'kernal_50.npy', 'kernal_100.npy']
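Explanation:
numeric_compare extracts the first number found in each name, falls back to 0 when a name contains no number, and returns an int, because the comparison function used by sorted() has to return an integer.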
This approach is robust and also works when some file names contain no number at all:
Input:
b = ['kernal_1.0.npy','kernal_100.npy','kernal_50.npy','kernal_10.npy', 'abc' ]
Output:
['abc', 'kernal_1.0.npy', 'kernal_10.npy', 'kernal_50.npy', 'kernal_100.npy']
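If you want the names without a number to appear at the end of the list rather than at the beginning, replace u = [0] and v = [0] with u = [sys.maxsize] and v = [sys.maxsize]. (You need to add import sys at the beginning of your code.)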
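The same fallback idea can also be written as a key function instead of a comparison function. This is a sketch under assumptions, not the answer's own code: first_number is a hypothetical helper, and math.inf stands in for sys.maxsize as the "sort last" value.

```python
import math
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")

def first_number(name, default=math.inf):
    # Hypothetical helper: first number in the name, or `default`
    # (infinity, so such names sort last) when there is none.
    match = NUMBER.search(name)
    return float(match.group()) if match else default

b = ['kernal_1.0.npy', 'kernal_100.npy', 'kernal_50.npy', 'kernal_10.npy', 'abc']
names_last = sorted(b, key=first_number)
names_first = sorted(b, key=lambda n: first_number(n, default=0))
print(names_last)
print(names_first)
```

A key function is called once per item, whereas a comparison function is called for every pair that gets compared, so this form tends to be the simpler choice in Python 3.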
Regex demo and explanation: https://regex101.com/r/evIeVD/1/