How to write more than 25 items to a DynamoDB table using PHP's batchWriteItem

Question (votes: 0, answers: 4)

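I am using the AWS SDK for PHP 3.x.

A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests. Individual items to be written can be as large as 400 KB.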

$result = $dynamodbClient->batchWriteItem([
  'RequestItems' => [
    $tableName => [
      [
        'PutRequest' => [
          'Item' => [
            'Id' => ['N' => '1'],
            'AlbumTitle' => [
              'S' => 'Somewhat Famous',
            ],
            'Artist' => [
              'S' => 'No One You Know',
            ],
            'SongTitle' => [
              'S' => 'Call Me Today',
            ],
          ],
        ],
      ],          
    ],
  ],
]);

This works fine for a single item. How can I write more than 25 items?

amazon-dynamodb
4 Answers
5
votes

To write more than 25 items, you have to call BatchWriteItem repeatedly, adding items from your collection 25 at a time.

Something along these lines (pseudocode):

requests = []; // use an array to stage your put item requests
foreach(item in SourceCollection) {
    addItem(item, requests); // add this item to the array
    if(count(requests) == 25) { // when you have 25 ready..
       // result = dynamodbClient->batchWriteItem(...)
       requests = []; // clean up the array of put item requests
       // handle the failed items from the result object
    }
}
if(count(requests) > 0) { // flush the final, partially filled batch
    // result = dynamodbClient->batchWriteItem(...)
    // handle the failed items from the result object
}

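Make sure you handle the failed items from each batchWriteItem result (they are returned under UnprocessedItems) by adding them back to the requests.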
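The pseudocode and the retry advice above could be sketched in PHP roughly as follows. This is a minimal sketch, not the SDK's own API: `writeInBatches()`, `$send`, and `$maxRetries` are hypothetical names introduced here, and the 100 ms backoff base is an arbitrary assumption. The `$send` callable is expected to wrap `batchWriteItem` and return its result.

```php
<?php
// Sketch only: writeInBatches() and $send are hypothetical names, not SDK APIs.

/**
 * Sends $requests in batches of at most 25 and retries whatever comes back
 * under 'UnprocessedItems'.
 *
 * @param callable $send       receives the batchWriteItem params, returns its result
 * @param string   $tableName  target table
 * @param array    $requests   list of ['PutRequest' => ...] or ['DeleteRequest' => ...]
 * @param int      $maxRetries retries per batch before giving up (assumed default)
 * @return array requests still unprocessed after all retries
 */
function writeInBatches(callable $send, string $tableName, array $requests, int $maxRetries = 5): array
{
    $failed = [];
    foreach (array_chunk($requests, 25) as $batch) {
        $pending = $batch;
        for ($attempt = 0; $attempt <= $maxRetries && count($pending) > 0; $attempt++) {
            if ($attempt > 0) {
                // Exponential backoff before a retry: 100 ms, 200 ms, 400 ms, ...
                usleep(100000 * (2 ** ($attempt - 1)));
            }
            $result = $send(['RequestItems' => [$tableName => $pending]]);
            // Anything DynamoDB did not apply is sent again on the next attempt.
            $pending = $result['UnprocessedItems'][$tableName] ?? [];
        }
        // Whatever is left after the last attempt is reported to the caller.
        $failed = array_merge($failed, $pending);
    }
    return $failed;
}
```

With the client from the question, `$send` could be `function (array $params) use ($dynamodbClient) { return $dynamodbClient->batchWriteItem($params); }`. Passing the call in as a callable keeps the batching logic independent of the SDK, so it can be exercised without a real table.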


2
votes

Here is the approach from my Lambda function:

exports.handler = (event, context, callback) => {
  console.log(`EVENT: ${JSON.stringify(event)}`);

  var AWS = require('aws-sdk');

  AWS.config.update({ region: process.env.REGION });

  var docClient = new AWS.DynamoDB.DocumentClient();

  // A function cannot survive JSON parsing, so only data and table are read
  const { data, table } = JSON.parse(event.body);

  console.log('{data, table}:', { data, table });

  // Build the batches: BatchWriteItem accepts at most 25 requests per call
  var batches = [];
  var current_batch = [];

  for (var i = 0; i < data.length; i++) {
    // Add the item to the current batch
    current_batch.push({
      PutRequest: {
        Item: data[i],
      },
    });
    // Once the batch holds 25 items, store it and start a new one
    if (current_batch.length === 25) {
      batches.push(current_batch);
      current_batch = [];
    }
  }

  // Add the last, partially filled batch if it has records
  if (current_batch.length > 0) {
    batches.push(current_batch);
  }

  // Build the HTTP response returned once all writes have finished
  function buildResponse (results) {
    return {
      statusCode: 200,
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Methods': 'GET,POST,OPTIONS',
        'Access-Control-Allow-Origin': '*',
        'Access-Control-Allow-Credentials': true
      },
      body: JSON.stringify(results),
      isBase64Encoded: false
    };
  }

  // Nothing to write: respond right away
  if (batches.length === 0) {
    return callback(null, buildResponse([]));
  }

  // Handler for the database operations
  var completed_requests = 0;
  var first_error = null;
  var results = [];

  function requestHandler (request) {
    return function (err, result) {
      // Increment the completed requests
      completed_requests++;

      if (err) {
        // Log the error and remember the first one
        console.error(JSON.stringify(err, null, 2));
        console.error('Request that caused database error:');
        console.error(JSON.stringify(request, null, 2));
        first_error = first_error || err;
      } else {
        // result.UnprocessedItems lists any writes DynamoDB did not apply
        results.push(result);
      }

      // Invoke the Lambda callback once, after all requests have completed
      if (completed_requests === batches.length) {
        if (first_error) {
          return callback(first_error);
        }
        callback(null, buildResponse(results));
      }
    };
  }

  // Make the requests
  for (var j = 0; j < batches.length; j++) {
    // Format for the items is {PutRequest: {Item: ITEM_OBJECT}}
    var params = { RequestItems: { [table]: batches[j] } };

    console.log('before docClient.batchWrite: ', params);

    // Perform the batchWrite operation
    docClient.batchWrite(params, requestHandler(params));
  }
};



0
votes

I am using the following code to add data with batchWriteItem. Please advise if there is a better way.

use Aws\DynamoDb\Exception\DynamoDbException;
use Aws\DynamoDb\Marshaler;

// Assumes $dynamodb (an Aws\DynamoDb\DynamoDbClient) and $tableName are already defined
$marshaler = new Marshaler();
$albums = []; // collection of albums, each an associative array

// Build the batches
$batches = [];
$current_batch = [];
foreach ($albums as $album) {
    // Marshal the album into DynamoDB's attribute format and add it to the current batch
    $json = json_encode($album);
    $current_batch[] = ['PutRequest' => ['Item' => $marshaler->marshalJson($json)]];
    // Once the batch holds 25 items, store it and start a new one
    if (count($current_batch) === 25) {
        $batches[] = $current_batch;
        $current_batch = [];
    }
}
// Add the last, partially filled batch if it has records
if (count($current_batch) > 0) {
    $batches[] = $current_batch;
}

// Send the batches one at a time
$batch_count = 0;
foreach ($batches as $batch) {
    try {
        $batch_count++;
        $params = [
            'RequestItems' => [$tableName => $batch],
            'ReturnConsumedCapacity' => 'TOTAL',
            'ReturnItemCollectionMetrics' => 'SIZE',
        ];
        $response = $dynamodb->batchWriteItem($params);
        echo "Batch $batch_count added." . "<br>";
        echo "<pre>";
        print_r($response);
        echo "</pre>";
    } catch (DynamoDbException $e) {
        echo "Unable to add batch $batch_count:\n";
        echo $e->getMessage() . "\n";
    }
}

0
votes

PHP has a very handy array function for this, called array_chunk.

// $items must be a list of requests, one element per put or delete
$items = [
    [
        'PutRequest' => [
            'Item' => [
                'Id' => ['N' => '1'],
                'AlbumTitle' => [
                    'S' => 'Somewhat Famous',
                ],
                'Artist' => [
                    'S' => 'No One You Know',
                ],
                'SongTitle' => [
                    'S' => 'Call Me Today',
                ],
            ],
        ],
    ],
    // ... more requests
];

foreach (array_chunk($items, 25) as $chunk) {
    try {
        $response = $dynamoDbClient->batchWriteItem([
            'RequestItems' => [
                $tableName => $chunk
            ]
        ]);
    } catch (Exception $e) {
        // Log the exception and handle it
    }
}

Make sure the two variables $dynamoDbClient and $tableName are defined somewhere before the foreach loop.
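To see how array_chunk splits a request list, here is a small sketch with 60 placeholder put requests. The Id values are made up for illustration and nothing is sent to DynamoDB.

```php
<?php
// Build 60 placeholder put requests (illustrative data only).
$items = [];
for ($i = 1; $i <= 60; $i++) {
    $items[] = ['PutRequest' => ['Item' => ['Id' => ['N' => (string) $i]]]];
}

// array_chunk() splits the list into pieces of at most 25 elements each.
$chunks = array_chunk($items, 25);

foreach ($chunks as $index => $chunk) {
    echo "chunk $index: " . count($chunk) . " requests\n";
}
// prints:
// chunk 0: 25 requests
// chunk 1: 25 requests
// chunk 2: 10 requests
```

Each chunk is itself a zero-indexed list, so it can be passed directly as the value for the table name under RequestItems.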
