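import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer
from simpletransformers.classification import MultiLabelClassificationModel

m = MultiLabelBinarizer()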
X = pd.read_csv('data/data.csv', sep=None, engine='python')
X = X.dropna()
Y_train = m.fit_transform(X['labels'])
Y_train2 = [list(i) for i in Y_train]
data = pd.DataFrame({'text': pd.Series(X[text_col]), 'labels': Y_train2})
data = data.dropna()
train_df, eval_df = train_test_split(data, test_size=0.2)
numLabels = len(pd.unique(X['labels']))  # count of the labels
model = MultiLabelClassificationModel('roberta', 'roberta-base', num_labels=numLabels, use_cuda=False)
model.train_model(pd.DataFrame(train_df))
The data structure of my labels column is: [[0,1,0,0,0,1,0,0],[0,1,1,0,0,1,0,0],[0,0,0,0,0,1]......]. Each row holds one list of labels, such as [0,1,0,0,0,1,0,0].
As for the text, each row contains one text (a newspaper article).
(Adapted from this source: https://github.com/ThilinaRajapakse/simpletransformers#minimal-start-for-multilabel-classification)
If I train the model with only 4 entries, training works. But when I try to train it on the whole dataset, I get "RuntimeError: shape '[-1, 9]' is invalid for input of size 8". The traceback is:
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/simpletransformers/classification/multi_label_classification_model.py", line 121, in train_model
return super().train_model(train_df, multi_label=multi_label, eval_df=eval_df, output_dir=output_dir, show_running_loss=show_running_loss, args=args)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/simpletransformers/classification/classification_model.py", line 208, in train_model
global_step, tr_loss = self.train(train_dataset, output_dir, multi_label=multi_label, show_running_loss=show_running_loss, eval_df=eval_df, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/simpletransformers/classification/classification_model.py", line 306, in train
outputs = model(**inputs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/simpletransformers/custom_models/models.py", line 117, in forward
loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1, self.num_labels))
RuntimeError: shape '[-1, 9]' is invalid for input of size 8
I don't know where the size 8 comes from or what to do now, since training only works with very few entries. Can anyone help me?
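[0,1,0,0,0,1,0,0] has size 8, but your model expects size 9. That means your numLabels = 9. If you really have 9 classes, each label list in the labels column should have 9 entries, like this: [0,1,0,0,0,1,0,0,0]. But I think you just need to pass num_labels=8.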
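To illustrate where the two numbers can come from, here is a minimal sketch with hypothetical toy data (not your dataset). I am assuming, without being able to confirm it from the post, that `len(pd.unique(X['labels']))` ends up counting distinct label rows rather than classes. The toy list below has 9 distinct rows, but every row is 8 wide, and the width is what the model needs as `num_labels`.

```python
# Hypothetical toy data: 9 distinct label vectors, each of width 8.
width = 8
label_lists = [[1 if j == i else 0 for j in range(width)] for i in range(width)]
label_lists.append([0] * width)  # a ninth distinct row (no label set)

# Counting distinct rows, analogous to counting unique values of the column.
num_unique_rows = len({tuple(row) for row in label_lists})

# What the model needs: the single width shared by all label vectors.
widths = {len(row) for row in label_lists}
assert len(widths) == 1, f"inconsistent label widths: {widths}"
num_labels = widths.pop()

print(num_unique_rows, num_labels)  # 9 8
```

In the code from the question, the same width should be available after `fit_transform` as `Y_train.shape[1]` or `len(m.classes_)`, so passing one of those as `num_labels` is probably safer than counting unique values.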