How to update more than 500 documents in Firestore using a batch?


I am trying to update a timestamp field with the Firestore admin serverTimestamp in a collection containing more than 500 documents:

const batch = db.batch();
const serverTimestamp = admin.firestore.FieldValue.serverTimestamp();

db
  .collection('My Collection')
  .get()
  .then((docs) => {
    docs.forEach((doc) => {
      batch.set(doc.ref, {
        timestamp: serverTimestamp,
      }, {
        merge: true,
      });
    });
    return batch.commit();
  })
  .then(() => res.send('All docs updated'))
  .catch(console.error);

This throws the error:

{ Error: 3 INVALID_ARGUMENT: cannot write more than 500 entities in a single call
    at Object.exports.createStatusError (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\common.js:87:15)
    at Object.onReceiveStatus (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:1188:28)
    at InterceptingListener._callNext (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:564:42)
    at InterceptingListener.onReceiveStatus (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:614:8)
    at callback (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:841:24)
  code: 3,
  metadata: Metadata { _internal_repr: {} },
  details: 'cannot write more than 500 entities in a single call' }

Is there a way I can write a recursive method that creates a batch object and updates a batch of 500 documents at a time, until all of the documents are updated?

From the documentation I know that delete operations are possible with the recursive approach mentioned here:

https://firebase.google.com/docs/firestore/manage-data/delete-data#collections

However, for updates, I'm not sure how to end the execution, since the documents are not being deleted.
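For reference, a minimal sketch of what such a recursive method could look like (a hypothetical helper, reusing the db and serverTimestamp from the snippet above; unlike recursive delete, it paginates with startAfter() so execution ends once a page comes back empty rather than when documents disappear):

function updateBatch(query, lastDoc) {
  // Resume after the last document of the previous page; updated docs
  // are not deleted, so plain re-querying would loop forever.
  const paged = lastDoc ? query.startAfter(lastDoc) : query;
  return paged.get().then((snapshot) => {
    if (snapshot.empty) return; // nothing left: the recursion ends here
    const batch = db.batch();
    snapshot.docs.forEach((doc) => {
      batch.set(doc.ref, { timestamp: serverTimestamp }, { merge: true });
    });
    return batch
      .commit()
      .then(() => updateBatch(query, snapshot.docs[snapshot.docs.length - 1]));
  });
}

updateBatch(db.collection('My Collection').orderBy('__name__').limit(500));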

javascript firebase google-cloud-firestore firebase-admin
10 Answers

69 votes

I also ran into the problem of updating more than 500 documents in a Firestore collection, and I'd like to share how I solved it.

I use Cloud Functions to update the collection in Firestore, but this should also work in client-side code.

The solution counts every operation made on the batch, and once the limit is reached it creates a new batch and pushes it onto the batchArray. After all updates are queued, the code loops through the batchArray and commits every batch in the array.

It is important to count every operation (set(), update(), delete()) made on a batch, because they all count toward the 500-operation limit.

const documentSnapshotArray = await firestore.collection('my-collection').get();

const batchArray = [];
batchArray.push(firestore.batch());
let operationCounter = 0;
let batchIndex = 0;

documentSnapshotArray.forEach(documentSnapshot => {
    const documentData = documentSnapshot.data();

    // update document data here...

    batchArray[batchIndex].update(documentSnapshot.ref, documentData);
    operationCounter++;

    if (operationCounter === 499) {
        batchArray.push(firestore.batch());
        batchIndex++;
        operationCounter = 0;
    }
});

// forEach with an async callback would not actually wait for the
// commits, so wait for all of them explicitly.
await Promise.all(batchArray.map(batch => batch.commit()));
return;

28 votes

I like this simple solution:

const users = await db.collection('users').get()

const batches = _.chunk(users.docs, 500).map(userDocs => {
    const batch = db.batch()
    userDocs.forEach(doc => {
        batch.set(doc.ref, { field: 'myNewValue' }, { merge: true })
    })
    return batch.commit()
})

await Promise.all(batches)
Remember to add import * as _ from "lodash" at the top. Based on this answer.
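If you'd rather not depend on lodash, a tiny helper can stand in for _.chunk (a sketch, not part of the original answer):

// Hypothetical replacement for _.chunk(array, size): splits an array
// into subarrays of at most `size` elements.
const chunk = (array, size) =>
    Array.from({ length: Math.ceil(array.length / size) }, (_, i) =>
        array.slice(i * size, i * size + size)
    );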


8 votes

You can use the built-in BulkWriter. It throttles writes according to the 500/50/5 rule (start at 500 operations per second, then increase throughput by 50% every 5 minutes).

Example:

let bulkWriter = firestore.bulkWriter();

bulkWriter.create(documentRef, { foo: 'bar' });
bulkWriter.update(documentRef2, { foo: 'bar' });
bulkWriter.delete(documentRef3);

await bulkWriter.close().then(() => {
    console.log('Executed all writes');
});
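Applied to the original question, the whole collection can be updated without any manual chunking (a sketch, assuming the same 'My Collection' and Admin SDK setup as in the question):

const bulkWriter = firestore.bulkWriter();
const serverTimestamp = admin.firestore.FieldValue.serverTimestamp();

const snapshot = await firestore.collection('My Collection').get();
snapshot.docs.forEach((doc) => {
    bulkWriter.update(doc.ref, { timestamp: serverTimestamp });
});
// close() flushes every pending write and resolves once all have completed
await bulkWriter.close();
console.log('All docs updated');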
    

4 votes

As stated above, @Sebastian's answer is good and I upvoted it too. But I ran into a problem when updating more than 25,000 documents in one go. The tweaked logic is as follows.

console.log(`Updating documents...`);
let collectionRef = db.collection('cities');
try {
    let batch = db.batch();
    const documentSnapshotArray = await collectionRef.get();
    const records = documentSnapshotArray.docs;
    const index = documentSnapshotArray.size;
    console.log(`TOTAL SIZE=====${index}`);
    for (let i = 0; i < index; i++) {
        const docRef = records[i].ref;
        // YOUR UPDATES
        batch.update(docRef, { isDeleted: false });
        if ((i + 1) % 499 === 0) {
            await batch.commit();
            batch = db.batch();
        }
    }
    // Commit the final, partially filled batch
    if (index % 499 !== 0) {
        await batch.commit();
    }
    console.log('write completed');
} catch (error) {
    console.error(`updateWorkers() errored out : ${error.stack}`);
    reject(error); // `reject` comes from the enclosing Promise executor
}

1 vote

The explanations given in the previous answers already cover the problem.

I'm sharing the final code that I built and that works for me, since I needed something that worked in a more decoupled way than most of the solutions above.

import { FireDb } from "@services/firebase"; // = firebase.firestore();

type TDocRef = FirebaseFirestore.DocumentReference;
type TDocData = FirebaseFirestore.DocumentData;

let fireBatches = [FireDb.batch()];
let batchSizes = [0];
let batchIdxToUse = 0;

export default class FirebaseUtil {
  static addBatchOperation(
    operation: "create",
    ref: TDocRef,
    data: TDocData
  ): void;
  static addBatchOperation(
    operation: "update",
    ref: TDocRef,
    data: TDocData,
    precondition?: FirebaseFirestore.Precondition
  ): void;
  static addBatchOperation(
    operation: "set",
    ref: TDocRef,
    data: TDocData,
    setOpts?: FirebaseFirestore.SetOptions
  ): void;
  static addBatchOperation(
    operation: "create" | "update" | "set",
    ref: TDocRef,
    data: TDocData,
    opts?: FirebaseFirestore.Precondition | FirebaseFirestore.SetOptions
  ): void {
    // Lines below make sure we stay below the limit of 500 writes per batch
    if (batchSizes[batchIdxToUse] === 500) {
      fireBatches.push(FireDb.batch());
      batchSizes.push(0);
      batchIdxToUse++;
    }
    batchSizes[batchIdxToUse]++;

    const batchArgs: [TDocRef, TDocData] = [ref, data];
    if (opts) batchArgs.push(opts);

    switch (operation) {
      // Specific case for "set" is required because of some weird TS
      // glitch that doesn't allow me to use the arg "operation" to
      // call the function
      case "set":
        fireBatches[batchIdxToUse].set(...batchArgs);
        break;
      default:
        fireBatches[batchIdxToUse][operation](...batchArgs);
        break;
    }
  }

  public static async runBatchOperations() {
    // The lines below clear the globally available batches so we
    // don't run them twice if we call this function more than once
    const currentBatches = [...fireBatches];
    fireBatches = [FireDb.batch()];
    batchSizes = [0];
    batchIdxToUse = 0;

    await Promise.all(currentBatches.map((batch) => batch.commit()));
  }
}
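Usage could look like this (hypothetical refs and data, not part of the original answer):

// Queue as many operations as needed; batches roll over automatically.
FirebaseUtil.addBatchOperation("update", docRef, { updatedAt: new Date() });
FirebaseUtil.addBatchOperation("set", docRef2, { active: true }, { merge: true });
// Commit everything queued so far.
await FirebaseUtil.runBatchOperations();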
    

1 vote

Based on all the answers above, I put together the following snippets. They can be placed in a module and used in both a JavaScript back end and front end to use Firestore batch writes easily, without worrying about the 500-write limit.

Back end (Node.js)

// The Firebase Admin SDK to access Firestore.
const admin = require("firebase-admin");
admin.initializeApp();

// Firestore does not accept more than 500 writes in a transaction or batch write.
const MAX_TRANSACTION_WRITES = 499;

const isFirestoreDeadlineError = (err) => {
  console.log({ err });
  const errString = err.toString();
  return (
    errString.includes("Error: 13 INTERNAL: Received RST_STREAM") ||
    errString.includes("Error: 4 DEADLINE_EXCEEDED: Deadline exceeded")
  );
};

const db = admin.firestore();

// How many writes out of 500 so far.
// The following functions make it easy to use batch writes without
// worrying about the 500 limit.
let writeCounts = 0;
let batchIndex = 0;
let batchArray = [db.batch()];

// Commit all batches.
const makeCommitBatch = async () => {
  console.log("makeCommitBatch");
  await Promise.all(batchArray.map((bch) => bch.commit()));
};

// Commit the batch writes; on a Firestore deadline error, try again
// every 4 seconds until it gets resolved.
const commitBatch = async () => {
  try {
    await makeCommitBatch();
  } catch (err) {
    console.log({ err });
    if (isFirestoreDeadlineError(err)) {
      const theInterval = setInterval(async () => {
        try {
          await makeCommitBatch();
          clearInterval(theInterval);
        } catch (err) {
          console.log({ err });
          if (!isFirestoreDeadlineError(err)) {
            clearInterval(theInterval);
            throw err;
          }
        }
      }, 4000);
    }
  }
};

// If the batch exceeds 499 writes, start a new batch object and reset the counter.
const checkRestartBatchWriteCounts = () => {
  writeCounts += 1;
  if (writeCounts >= MAX_TRANSACTION_WRITES) {
    batchIndex++;
    batchArray.push(db.batch());
    writeCounts = 0;
  }
};

const batchSet = (docRef, docData) => {
  batchArray[batchIndex].set(docRef, docData);
  checkRestartBatchWriteCounts();
};

const batchUpdate = (docRef, docData) => {
  batchArray[batchIndex].update(docRef, docData);
  checkRestartBatchWriteCounts();
};

const batchDelete = (docRef) => {
  batchArray[batchIndex].delete(docRef);
  checkRestartBatchWriteCounts();
};

module.exports = {
  admin,
  db,
  MAX_TRANSACTION_WRITES,
  checkRestartBatchWriteCounts,
  commitBatch,
  isFirestoreDeadlineError,
  batchSet,
  batchUpdate,
  batchDelete,
};
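Inside an async function, usage might look like this (a hypothetical example, assuming the module above is saved as ./firestoreBatch):

const { db, batchUpdate, commitBatch } = require("./firestoreBatch");

// Queue one update per document; batches roll over past 499 writes.
const snapshot = await db.collection("cities").get();
snapshot.docs.forEach((doc) => batchUpdate(doc.ref, { isDeleted: false }));
await commitBatch();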

Front end

// Firestore does not accept more than 500 writes in a transaction or batch write.
const MAX_TRANSACTION_WRITES = 499;

const isFirestoreDeadlineError = (err) => {
  return (
    err.message.includes("DEADLINE_EXCEEDED") ||
    err.message.includes("Received RST_STREAM")
  );
};

class Firebase {
  constructor(fireConfig, instanceName) {
    // `fbApp` is assumed to be the imported firebase app namespace
    let app = fbApp;
    if (instanceName) {
      app = app.initializeApp(fireConfig, instanceName);
    } else {
      app.initializeApp(fireConfig);
    }
    this.name = app.name;
    this.db = app.firestore();
    this.firestore = app.firestore;
    // How many writes out of 500 so far.
    // The following methods make it easy to use batch writes without
    // worrying about the 500 limit.
    this.writeCounts = 0;
    this.batch = this.db.batch();
    this.isCommitting = false;
  }

  async makeCommitBatch() {
    console.log("makeCommitBatch");
    if (!this.isCommitting) {
      this.isCommitting = true;
      await this.batch.commit();
      this.writeCounts = 0;
      this.batch = this.db.batch();
      this.isCommitting = false;
    } else {
      // Another commit is in flight; poll until it finishes, then commit.
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.isCommitting = true;
          await this.batch.commit();
          this.writeCounts = 0;
          this.batch = this.db.batch();
          this.isCommitting = false;
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async commitBatch() {
    try {
      await this.makeCommitBatch();
    } catch (err) {
      console.log({ err });
      if (isFirestoreDeadlineError(err)) {
        // On a deadline error, retry every 4 seconds until resolved.
        const theInterval = setInterval(async () => {
          try {
            await this.makeCommitBatch();
            clearInterval(theInterval);
          } catch (err) {
            console.log({ err });
            if (!isFirestoreDeadlineError(err)) {
              clearInterval(theInterval);
              throw err;
            }
          }
        }, 4000);
      }
    }
  }

  async checkRestartBatchWriteCounts() {
    this.writeCounts += 1;
    if (this.writeCounts >= MAX_TRANSACTION_WRITES) {
      await this.commitBatch();
    }
  }

  async batchSet(docRef, docData) {
    if (!this.isCommitting) {
      this.batch.set(docRef, docData);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.set(docRef, docData);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async batchUpdate(docRef, docData) {
    if (!this.isCommitting) {
      this.batch.update(docRef, docData);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.update(docRef, docData);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async batchDelete(docRef) {
    if (!this.isCommitting) {
      this.batch.delete(docRef);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.delete(docRef);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }
}
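Usage could look like this (hypothetical config object and doc ref):

const fire = new Firebase(fireConfig);
await fire.batchUpdate(docRef, { field: "value" });
await fire.commitBatch();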
    

1 vote

No references or documentation here; I invented this code myself. It works for me, looks clean, and is easy to read and use. If anyone likes it, feel free to use it too.

It's best to have automated tests around it, because the code relies on the private variable _ops, which can change after a package upgrade. In older versions, for example, it was _mutations.

async function commitBatch(batch) {
    const MAX_OPERATIONS_PER_COMMIT = 500;

    while (batch._ops.length > MAX_OPERATIONS_PER_COMMIT) {
        const batchPart = admin.firestore().batch();
        batchPart._ops = batch._ops.splice(0, MAX_OPERATIONS_PER_COMMIT - 1);
        await batchPart.commit();
    }

    await batch.commit();
}

Usage:

const batch = admin.firestore().batch();
batch.delete(someRef);
batch.update(someRef);
...
await commitBatch(batch);

1 vote

I like this implementation:

https://github.com/qualdesk/firestore-big-batch

Here's a blog post about it (not mine):

https://www.qualdesk.com/blog/2021/the-solution-to-firestore-batched-write-limit/

It's a drop-in replacement for a Firestore batch. Instead of this:

const batch = db.batch();
...do this:

const batch = new BigBatch({ db });
Here is my variant of it, updated to be compatible with the latest firebase-admin and TypeScript types. I also added a setGroup option, which guarantees that a group of operations ends up in the same batch.

// Inspired by: https://github.com/qualdesk/firestore-big-batch

import type {
  DocumentReference,
  Firestore,
  SetOptions,
  WriteBatch,
} from 'firebase-admin/firestore';

const MAX_OPERATIONS_PER_FIRESTORE_BATCH = 499;

export class BigBatch {
  private db: Firestore;
  private currentBatch: WriteBatch;
  private batchArray: Array<WriteBatch>;
  private operationCounter: number;

  constructor({ db }: { db: Firestore }) {
    this.db = db;
    this.currentBatch = db.batch();
    this.batchArray = [this.currentBatch];
    this.operationCounter = 0;
  }

  private startNewBatch() {
    this.currentBatch = this.db.batch();
    this.batchArray.push(this.currentBatch);
    this.operationCounter = 0;
  }

  private checkLimit() {
    if (this.operationCounter < MAX_OPERATIONS_PER_FIRESTORE_BATCH) return;
    this.startNewBatch();
  }

  private ensureGroupOperation(operations: unknown[]) {
    if (operations.length > MAX_OPERATIONS_PER_FIRESTORE_BATCH)
      throw new Error(
        `Group can only accept ${MAX_OPERATIONS_PER_FIRESTORE_BATCH} operations.`,
      );
    if (
      this.operationCounter + operations.length >
      MAX_OPERATIONS_PER_FIRESTORE_BATCH
    )
      this.startNewBatch();
  }

  /**
   * Add a single set operation to the batch.
   */
  set(ref: DocumentReference, data: object, options: SetOptions = {}) {
    this.currentBatch.set(ref, data, options);
    this.operationCounter++;
    this.checkLimit();
  }

  /**
   * Add a group of set operations to the batch. This method ensures that
   * everything in a group will be included in the same batch.
   * @param operations Array of objects with ref, data, and options
   */
  setGroup(
    operations: {
      ref: DocumentReference;
      data: object;
      options?: SetOptions;
    }[],
  ) {
    this.ensureGroupOperation(operations);
    operations.forEach(o =>
      this.currentBatch.set(o.ref, o.data, o.options ?? {}),
    );
    this.operationCounter += operations.length;
    this.checkLimit();
  }

  update(ref: DocumentReference, data: object) {
    this.currentBatch.update(ref, data);
    this.operationCounter++;
    this.checkLimit();
  }

  delete(ref: DocumentReference) {
    this.currentBatch.delete(ref);
    this.operationCounter++;
    this.checkLimit();
  }

  commit() {
    const promises = this.batchArray.map(batch => batch.commit());
    return Promise.all(promises);
  }
}
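Usage mirrors a regular batch (hypothetical refs, for illustration only):

const batch = new BigBatch({ db });
batch.update(someRef, { updatedAt: new Date() });
// These two writes are guaranteed to land in the same underlying batch.
batch.setGroup([
    { ref: parentRef, data: { childCount: 2 } },
    { ref: childRef, data: { ok: true }, options: { merge: true } },
]);
await batch.commit();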
    

0 votes

A simple solution: just fire twice. My array is resultsFinal. I commit one batch with a limit of 490, and a second one up to the array's length (resultsFinal.length). It works well for me. :)

How do you verify it? Go to Firebase and delete your collection; Firebase reports how many documents it deleted. If that matches the length of your array, you're good to go.

async function quickstart(results) {
    // `results` is passed in so we can access the data inside quickstart
    const resultsFinal = results;
    // console.log(resultsFinal.length);
    let batch = firestore.batch();

    // The Firestore limit is 500 writes per transaction/batch
    for (let i = 0; i < 490; i++) {
        const doc = firestore.collection('testMore490').doc();
        const object = resultsFinal[i];
        batch.set(doc, object);
    }
    await batch.commit();

    // Second batch for the remaining documents
    batch = firestore.batch();
    for (let i = 490; i < resultsFinal.length; i++) {
        const objectPartTwo = resultsFinal[i];
        const doc = firestore.collection('testMore490').doc();
        batch.set(doc, objectPartTwo);
    }
    await batch.commit();
}

0 votes

As of March 2023, Firestore no longer limits the number of writes that can be passed to a commit operation or performed in a transaction.
