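I am trying to sort DICOM files from multiple subjects into separate folders based on their PatientID. The current directory contains all DICOM files for all subjects, unsorted. I can walk through a directory of DICOM files, group them by PatientID, and count how many files each subject has. Is it possible to copy or move the files into another directory, sorted by PatientID?
Code: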
import os
from collections import Counter

import pandas as pd
import pydicom
import torch
import tqdm

os.listdir('\\dicoms')
device = torch.device("cuda")
print(device)
input_path = '\\dicoms\\'
ds_columns = ['ID', 'PatientID', 'Modality', 'StudyInstance',
              'SeriesInstance', 'PhotoInterpretation', 'Position0',
              'Position1', 'Position2', 'Orientation0', 'Orientation1',
              'Orientation2', 'Orientation3', 'Orientation4', 'Orientation5',
              'PixelSpacing0', 'PixelSpacing1']

def extract_dicom_features(ds):
    # Collect the header fields of interest from one DICOM dataset.
    ds_items = [ds.SOPInstanceUID,
                ds.PatientID,
                ds.Modality,
                ds.StudyInstanceUID,
                ds.SeriesInstanceUID,
                ds.PhotometricInterpretation,
                ds.ImagePositionPatient,
                ds.ImageOrientationPatient,
                ds.PixelSpacing]
    line = []
    for item in ds_items:
        # Multi-valued fields (position, orientation, spacing) are flattened to floats.
        if type(item) is pydicom.multival.MultiValue:
            line += [float(x) for x in item]
        else:
            line.append(item)
    return line

list_img = os.listdir(input_path + 'imgs')
print(len(list_img))
df_features = []
for img in tqdm.tqdm(list_img):
    img_path = input_path + 'imgs/' + img
    ds = pydicom.dcmread(img_path)
    df_features.append(extract_dicom_features(ds))
df_features = pd.DataFrame(df_features, columns=ds_columns)
df_features.head()
df_features.to_csv('\\meta.csv')
print(Counter(df_features['PatientID']))
Sample metadata:
,ID,PatientID,Modality,StudyInstance,SeriesInstance,PhotoInterpretation,Position0,Position1,Position2,Orientation0,Orientation1,Orientation2,Orientation3,Orientation4,Orientation5,PixelSpacing0,PixelSpacing1
0,ID_000012eaf,ID_f15c0eee,CT,ID_30ea2b02d4,ID_0ab5820b2a,MONOCHROME2,-125.0,-115.89798,77.970825,1.0,0.0,0.0,0.0,0.927184,-0.374607,0.488281,0.488281
Sample Counter output:
Counter({'ID_19702df6': 28, 'ID_b799ed34': 26, 'ID_e3523464': 26, 'ID_cd9169c2': 26, 'ID_e326a8a4': 24, 'ID_45da90cb': 24, 'ID_99e4f787': 24, 'ID_df751e93': 24, 'ID_929a5b39': 20})
I added the following code to try to sort the images into subdirectories, but I get an error:
import cv2
from os.path import basename

dest_path = input_path+'imageProcessDir'
counter = 0
for index, rows in df_features.iterrows():
    filename = basename(rows['ID'])
    image = cv2.imread(input_path+rows['ID'])
    counter=counter+1
    fold = rows['PatientID']+"/"
    dest_fold = dest_path+fold
    cv2.imwrite(dest_fold+"/"+filename+ "_" +str(counter)+".dcm", img)
Error:
Traceback (most recent call last):
  File "ct_move.py", line 77, in <module>
    cv2.imwrite(dest_fold+"/"+filename+ "_" +str(counter)+".dcm", img)
TypeError: Expected cv::UMat for argument 'img'
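Using OpenCV at all seems like overkill for this problem. If all you want to do is move DICOM files from one location in the file system to another, you can use os.rename (if you are on a UNIX-like system) or shutil.move. Unless you intend to modify the image content, these are cleaner and faster solutions. Note that os.rename can fail when the source and destination are on different file systems, whereas shutil.move handles that case.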
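As a minimal sketch of the difference between moving and copying (the helper name place_file and its parameters are made up for illustration and are not part of the question's code): shutil.move relocates a file, shutil.copy2 leaves the original in place, and neither creates the destination folder, so it is created first with os.makedirs.

```python
import os
import shutil


def place_file(src_file, dest_dir, copy=False):
    """Move (or copy) src_file into dest_dir and return the new path.

    Hypothetical helper for illustration; dest_dir is created if missing.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest_file = os.path.join(dest_dir, os.path.basename(src_file))
    if copy:
        shutil.copy2(src_file, dest_file)  # the original stays where it was
    else:
        shutil.move(src_file, dest_file)   # the original no longer exists at src_file
    return dest_file
```

Passing copy=True keeps the unsorted originals around until the sorted result has been checked.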
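I noticed two things in your last code block:
1. I think you want the fold variable to have "/" as a prefix rather than as a suffix, so that it joins correctly onto the working path.
2. The running counter presumably should restart for each patient rather than grow across the whole loop. The code below resets it when the PatientID changes; the Counter class is another way to track per-patient counts.
dest_path = input_path + 'imageProcessDir'
counter = 0
prev_fold = '/' + df_features.loc[0, 'PatientID']
for index, rows in df_features.iterrows():
    filename = basename(rows['ID'])
    fold = '/' + rows['PatientID']
    if fold != prev_fold:
        counter = 0  # reset when the PatientID changes
        prev_fold = fold
    counter = counter + 1
    dest_fold = dest_path + fold
    os.makedirs(dest_fold, exist_ok=True)  # os.rename does not create missing folders
    out_file = dest_fold + "/" + filename + "_" + str(counter) + ".dcm"
    os.rename(input_path + rows['ID'], out_file)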
I would also use os.path.join to build file system paths rather than adding "/" everywhere:
fold = rows['PatientID']
dest_fold = os.path.join(dest_path, fold)
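I suggest this because I think there is also a problem with the input file path, input_path + rows['ID']: the files were read from input_path + 'imgs/' + img, so the imgs subdirectory appears to be missing here.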
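Putting these points together, one way the whole loop could look is sketched below. It is only a sketch under assumptions: the function name sort_by_patient, the records argument (pairs of on-disk file name and PatientID) and the ext parameter are invented for this example. A Counter keyed by PatientID replaces the manual reset, so the rows do not need to be grouped by patient.

```python
import os
import shutil
from collections import Counter


def sort_by_patient(records, src_dir, dest_dir, ext=".dcm"):
    """Move each file into dest_dir/<PatientID>/ and return the new paths.

    records: iterable of (file_name, patient_id) pairs, where file_name is
    the name of the file inside src_dir (hypothetical input format).
    """
    per_patient = Counter()  # running file count per PatientID
    moved = []
    for file_name, patient_id in records:
        per_patient[patient_id] += 1
        patient_dir = os.path.join(dest_dir, patient_id)
        os.makedirs(patient_dir, exist_ok=True)  # create the patient folder on first use
        stem = os.path.splitext(file_name)[0]
        new_name = f"{stem}_{per_patient[patient_id]}{ext}"
        out_file = os.path.join(patient_dir, new_name)
        shutil.move(os.path.join(src_dir, file_name), out_file)
        moved.append(out_file)
    return moved
```

With the data from the question, records could be built as zip(list_img, df_features['PatientID']), since df_features was filled in the same order as list_img, and src_dir would be the imgs folder the files were read from.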