Check whether a record exists before inserting into the database [duplicate]


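I have a database with more than 640,000 records that I update every week with data from a JSON file. I only want to load records that are not already in the database. My script below works on small amounts of data, but when I try to load a large file it times out (I get a 500 Internal Server Error). Is there a better way to do this?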

<?php
set_time_limit(0);
ini_set('memory_limit','2000M');

$url = 'json/OERrecordstest.json';
$contents = file_get_contents($url);
$records = json_decode($contents, true);
include("../config.php"); 

echo "<div class='card card-body'>";
foreach($records as $record) {  

    $type = $record['type'];
    $name = $record['title'];
    $title = addslashes($name);
    $creator = $record['author'];
    $author = addslashes($creator);
    $link = addslashes($record['link']);
    $origin = $record['source'];
    $source = addslashes($origin);
    $description = addslashes($record['description']);
    $base_url = $record['base_url'];
    $isbn_number = $record['isbn_number'];
    $e_isbn_number = $record['e_isbn_number'];
    $publication_date = $record['publication_date'];
    $license = $record['license'];
    $subject = addslashes($record['subject']);
    $image_url = $record['image_url'];
    $review = $record['review'];
    $language = $record['language'];
    $license_url = $record['license_url'];
    $publisher = addslashes($record['publisher']);
    $publisher_url = $record['publisher_url'];

    $query = $conn->prepare("SELECT * FROM oer_search WHERE title=:title AND author=:author AND source=:source");
    $query->bindParam(":title", $name);
    $query->bindParam(":author", $creator);
    $query->bindParam(":source", $origin);
    $query->execute();

    if ($query->rowCount() == 0) {
        $insert = $conn->prepare("INSERT INTO oer_search (type, title, author, link, source, description, base_url, isbn_number, e_isbn_number, publication_date, license, subject, image_url, review, language, license_url, publisher, publisher_url) VALUES ('$type', '$title', '$author', '$link', '$source', '$description', '$base_url', '$isbn_number', '$e_isbn_number', '$publication_date', '$license', '$subject', '$image_url', '$review', '$language', '$license_url', '$publisher', '$publisher_url')");
        $insert->execute();
    }

}

if($insert){
    echo "<p><span class='recordInserted'><em>$name was successfully inserted into SOAR.</em></span></p>";
}
else {
    echo "<p><span class='recordInserted'><em>Record(s) already exist in SOAR.</em></span></p>";
}
echo "</div>";
?>
php mysql pdo
1 answer
-1
votes

I can't comment because I don't have enough reputation, so I'm writing this as an answer. Could you try changing the query like this?

 $query = $conn->prepare("SELECT id FROM oer_search WHERE title=:title AND author=:author AND source=:source limit 1");

Or:

<?php
if(!session_id()) session_start();
ini_set('memory_limit', '2000M');

$url = 'json/OERrecordstest.json';
$contents = file_get_contents($url);
$records = json_decode($contents, true);
include("../config.php");

echo "<div class='card card-body'>";

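// Queue the decoded records in the session on the first request only.
if (empty($_SESSION["records"])) {
    foreach ($records as $record) {
        // Store the single record (not the whole array), keyed by its id.
        $_SESSION["records"][$record["id"]] = $record;
    }
}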
$i = 0;
foreach ($_SESSION["records"] as $record) {
    $i++;
    if ($i > 1000) break;

    $type = $record['type'];
    $name = $record['title'];
    $title = addslashes($name);
    $creator = $record['author'];
    $author = addslashes($creator);
    $link = addslashes($record['link']);
    $origin = $record['source'];
    $source = addslashes($origin);
    $description = addslashes($record['description']);
    $base_url = $record['base_url'];
    $isbn_number = $record['isbn_number'];
    $e_isbn_number = $record['e_isbn_number'];
    $publication_date = $record['publication_date'];
    $license = $record['license'];
    $subject = addslashes($record['subject']);
    $image_url = $record['image_url'];
    $review = $record['review'];
    $language = $record['language'];
    $license_url = $record['license_url'];
    $publisher = addslashes($record['publisher']);
    $publisher_url = $record['publisher_url'];

    $query = $conn->prepare("SELECT id FROM oer_search WHERE title=:title AND author=:author AND source=:source limit 1");
    $query->bindParam(":title", $name);
    $query->bindParam(":author", $creator);
    $query->bindParam(":source", $origin);
    $query->execute();

    if ($query->rowCount() == 0) {
        $insert = $conn->prepare("INSERT INTO oer_search (type, title, author, link, source, description, base_url, isbn_number, e_isbn_number, publication_date, license, subject, image_url, review, language, license_url, publisher, publisher_url) VALUES ('$type', '$title', '$author', '$link', '$source', '$description', '$base_url', '$isbn_number', '$e_isbn_number', '$publication_date', '$license', '$subject', '$image_url', '$review', '$language', '$license_url', '$publisher', '$publisher_url')");
        $insert->execute();

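    }

    // Drop the record from the queue whether it was inserted or already existed;
    // otherwise existing records are re-checked on every request.
    unset($_SESSION["records"][$record["id"]]);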

}

print "remaining data :". count($_SESSION["records"]);
?>

-1
votes

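Tips for speeding up large imports:

  • Move your SQL prepare calls outside the loop (each statement only needs to be prepared once).
  • Collect the rows to insert into batches of 1,000 (for example; much larger batches are usually possible).
  • Use transactions, and disable index calculation during the inserts.
  • Find duplicates with a lookup array of the existing data (don't query the database for every row of the import).
  • In general, avoid running SQL queries inside loops.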
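
The tips above could be combined roughly as follows. This is only a sketch under stated assumptions: the helper names (recordKey, filterNewRecords, importRecords) are made up for illustration, $conn is assumed to be a PDO connection in exception error mode, and only the three columns the question uses for its duplicate check are inserted, to keep the example short. The lookup array also holds every existing key in memory, which should be acceptable for about 640,000 rows but is worth measuring.

```php
<?php
// Sketch only: helper names are hypothetical; table and columns follow the question.

// Build a lookup key from the three columns the question uses to detect duplicates.
function recordKey(array $r): string
{
    // "\x1f" (unit separator) is an assumed delimiter unlikely to occur in the data.
    return implode("\x1f", [$r['title'] ?? '', $r['author'] ?? '', $r['source'] ?? '']);
}

// Return the records whose key is not in $seen; repeats inside the file are dropped too.
function filterNewRecords(array $records, array $seen): array
{
    $new = [];
    foreach ($records as $r) {
        $key = recordKey($r);
        if (!isset($seen[$key])) {
            $seen[$key] = true; // $seen is a local copy; the caller's array is untouched
            $new[] = $r;
        }
    }
    return $new;
}

// Insert the new records in batches, one transaction per batch.
function importRecords(PDO $conn, array $records, int $batchSize = 1000): int
{
    // A single query loads the existing keys instead of one SELECT per imported row.
    $seen = [];
    $sql = "SELECT title, author, source FROM oer_search";
    foreach ($conn->query($sql, PDO::FETCH_ASSOC) as $row) {
        $seen[recordKey($row)] = true;
    }

    // Prepared once, outside the loop, with bound parameters instead of addslashes().
    $insert = $conn->prepare(
        "INSERT INTO oer_search (title, author, source) VALUES (:title, :author, :source)"
    );

    $inserted = 0;
    foreach (array_chunk(filterNewRecords($records, $seen), $batchSize) as $batch) {
        $conn->beginTransaction();
        try {
            foreach ($batch as $r) {
                $insert->execute([
                    ':title'  => $r['title'],
                    ':author' => $r['author'],
                    ':source' => $r['source'],
                ]);
                $inserted++;
            }
            $conn->commit();
        } catch (Throwable $e) {
            $conn->rollBack(); // undo the partial batch, then let the caller see the error
            throw $e;
        }
    }
    return $inserted;
}
```

With the question's variables, the call would look like importRecords($conn, $records). Note that the array lookup compares strings exactly, whereas a MySQL comparison may ignore case depending on the column collation, so the keys may need normalising (for example with mb_strtolower) to match the database's notion of equality.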
