我正在编写代码,使用 openMP 并行解决 NxN 数独难题。当我在没有 -fopenmp 选项的情况下进行编译时,代码工作正常,但是当我使用 -fopenmp 进行编译时,我得到了奇怪的结果。 这是代码相关部分的片段(我在使用 openmp 时在并行部分注释掉了 break 语句):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <omp.h>
const int SUBSIZE = 4;
const int SIZE = SUBSIZE*SUBSIZE;
typedef struct {
int *grid;
int row;
int col;
int filled_cells;
} State;
int solveBoardDFS(int *grid, int row, int col){
//printf("New branch (%d)\n!", rec_count++);
// end of grid, solution found.
if (row == SIZE) {
return 1;
}
//End of row, move to next row
if (col == SIZE) {
return solveBoardDFS(grid, row + 1, 0);
}
// Skip if non-empty cell
if (grid[row*SIZE + col] != 0){
return solveBoardDFS(grid, row, col+1);
}
//Try all possible values for the cell
for (int value = 1; value <= SIZE; value++) {
if (validateBoard(grid, row, col, value)) {
grid[row*SIZE + col] = value;
if (solveBoardDFS(grid, row, col + 1)) {
return 1;
}
// backtrack
grid[row*SIZE + col] = 0;
}
}
return 0;
}
int main(int argc, char **argv){
int grid[SIZE * SIZE];
FILE *fp = fopen("sudoku.csv", "r");
char *line = NULL;
size_t len = 0;
ssize_t read;
int i = 0;
while ((read = getline(&line, &len, fp)) != -1) {
char *token = strtok(line, ",");
while (token != NULL) {
grid[i++] = atoi(token);
token = strtok(NULL, ",");
}
}
fclose(fp);
// Allocate memory for the array
int **boards = (int **)malloc(200000 * sizeof(int *));
for (int i = 0; i < 200000; i++) {
boards[i] = (int *)malloc(SIZE*SIZE * sizeof(int));
}
State state = {grid, 0, 0};
solvePartialBoardBFS(&state, boards);
int flag = 0;
#pragma omp parallel for schedule(dynamic) shared(flag)
for (i = 0; i < last; i++){
int solution_found = solveBoardDFS(boards[i], 0, 0);
if (solution_found && flag == 0){
#pragma omp critical
{
flag = 1;
printSolution(boards[i]);
}
}
}
return 0;
}
代码说明: 此代码段中未包含的函数使用 BFS 为前 10 个填充单元格生成所有可能的板,创建大约 130k 个独特的板。我保存每个板并将它们存储在指针“板”数组中。然后我对此进行迭代,有效地将每个电路板传递给函数 solveBoardDFS,它将使用 DFS 完全解决电路板问题。
当我在没有 openMP 的情况下运行它时,我得到了预期的结果:
+
| 16 2 14 1 | 5 8 6 11 | 10 13 15 7 | 9 3 4 12 |
| 3 4 10 5 | 1 2 7 13 | 14 9 12 11 | 6 15 8 16 |
| 15 7 8 13 | 9 3 12 10 | 1 6 16 4 | 2 5 11 14 |
| 6 12 11 9 | 14 4 15 16 | 2 3 5 8 | 13 1 7 10 |
+
| 13 3 7 2 | 6 16 11 12 | 4 1 8 10 | 5 9 14 15 |
| 4 5 16 8 | 2 1 9 14 | 3 15 11 12 | 7 6 10 13 |
| 9 1 6 15 | 3 7 10 4 | 13 5 2 14 | 16 11 12 8 |
| 14 10 12 11 | 15 5 13 8 | 6 16 7 9 | 3 4 1 2 |
+
| 10 6 1 4 | 7 11 8 9 | 12 2 13 16 | 15 14 5 3 |
| 2 8 5 3 | 10 14 16 15 | 11 7 9 1 | 4 12 13 6 |
| 12 15 9 7 | 4 13 3 2 | 8 14 6 5 | 11 10 16 1 |
| 11 14 13 16 | 12 6 5 1 | 15 10 4 3 | 8 7 2 9 |
+
| 7 13 15 14 | 11 12 4 3 | 16 8 10 6 | 1 2 9 5 |
| 1 11 3 12 | 16 9 2 6 | 5 4 14 13 | 10 8 15 7 |
| 8 9 2 6 | 13 10 14 5 | 7 11 1 15 | 12 16 3 4 |
| 5 16 4 10 | 8 15 1 7 | 9 12 3 2 | 14 13 6 11 |
+
但是当我用 openMP 运行它时,我得到这个输出:
+
| 1597449657 5 893616101 -904939001 | 9 8 2 11 | 16 13 10 7 | 15 3 4 14 |
| 3 4 4977 2 | 16 8 10 7 | 14 5 15 9 | 6 1 11 13 |
| 11 15 1 2 | 3 5 0 7 | 0 6 1918611648 22003 | 1918607104 22003 1918607360 22003 |
| 9 8 1 13 | 16 5 6 0 | 0 14 0 0 | 0 0 7 2 |
+
| 13 2 7 0 | 16 15 11 3 | 0 0 6 10 | 4 14 9 1 |
| 0 0 6 9 | 0 10 0 0 | 3 15 0 0 | 1918607200 22003 0 7 |
| 0 1 0 0 | 0 4 8 13 | 1918607232 22003 6 14 | 16 11 12 3 |
| 16 10 11 5 | 2 1 1 6 | 132164 8 1 4 | 1918607328 22003 15 14 |
+
| 10 14 8 4 | 1 0 18 12 | 2 0 0 16 | 0 0 5 0 |
| 0 0 5 0 | 0 3 13 10 | 14 7 9 15 | 2 4 1 6 |
| 2 3 9 7 | 4 6 14 16 | 1918607488 22003 12 5 | 8 10 15 13 |
| 1 14 16 15 | 5 0 12 0 | 0 3 4 0 | 0 0 0 11 |
+
| 7 16 15 14 | 11 12 0 0 | 1918607616 22003 0 6 | 1 2 0 0 |
| 0 11 3 12 | 0 0 0 0 | 0 16 7 1 | 10 8 14 9 |
| 4 9 13 1 | 8 7 6 5 | 1918607744 22003 14 2 | 11 16 3 12 |
| 8 6 2 10 | 1 16 14 4 | 9 12 3 11 | 5 7 13 15 |
+
Segmentation fault (core dumped)
我是 C 的新手,所以我确定我的代码非常混乱,我会很感激一般提示,但我主要关注并行化在所有板上运行 BFS 算法的循环。