Keras / TensorFlow training on GCP with a TPU

Question (votes: 0, answers: 1)

I am trying to train a model on GCP using Keras and TensorFlow 1.15. So far, my code is similar to what I can run on Colab, namely:

# TPUs
import tensorflow as tf
print(tf.__version__)
cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver("tpu-name")
tf.config.experimental_connect_to_cluster(cluster_resolver)
tf.tpu.experimental.initialize_tpu_system(cluster_resolver)
tpu_strategy = tf.distribute.experimental.TPUStrategy(cluster_resolver)
print("Number of accelerators: ", tpu_strategy.num_replicas_in_sync)


import numpy as np

np.random.seed(123)  # for reproducibility
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Convolution2D, MaxPooling2D, Input
from tensorflow.keras import utils
from tensorflow.keras.datasets import mnist, cifar10
from tensorflow.keras.models import Model

# 4. Load data into train and test sets
(X_train, y_train) = load_data(sets="gs://BUCKETS/dogscats/train/",target_size=img_size)
(X_test, y_test) =  load_data(sets="gs://BUCKETS/dogscats/valid/",target_size=img_size)
print(X_train.shape, X_test.shape)

# 5. Preprocess input data
#X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
#X_test = X_test.reshape(X_test.shape[0], 28, 28,1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

print(y_train.shape, y_test.shape)
# 6. Preprocess class labels One hot encoding
Y_train = utils.to_categorical(y_train, 2)
Y_test = utils.to_categorical(y_test, 2)
print(Y_train.shape, Y_test.shape)

with tpu_strategy.scope():
  model = make_model((img_size, img_size, 3))
  # 8. Compile model
  model.compile(loss='categorical_crossentropy',
                optimizer="sgd",
                metrics=['accuracy'])

model.summary()

batch_size = 1250 * tpu_strategy.num_replicas_in_sync
# 9. Fit model on training data
model.fit(X_train, Y_train, steps_per_epoch=len(X_train)//batch_size,
          epochs=5, verbose=1)

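But my data is in a bucket and my code runs on a VM. I tried loading the data with "gs://BUCKETS", but it does not work. What should I do? Edit: I have added my data-loading code, sorry for leaving it out.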

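# Imports this function relies on (not shown in the original post; assumed here)
from os import listdir
from numpy import asarray
from sklearn.utils import shuffle
from tensorflow.keras.preprocessing.image import load_img, img_to_array

def load_data(sets="dogcats/train/", k=5000, target_size=250):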
  # define location of dataset
  folder = sets
  photos, labels = list(), list()
  # determine class
  output = 0.0
  for i, dog in enumerate(listdir(folder + "dogs/")):
    if i >= k:
      break
    # load image
    photo = load_img(folder + "dogs/" +dog, target_size=(target_size, target_size))
    # convert to numpy array
    photo = img_to_array(photo)
    # store
    photos.append(photo)
    labels.append(output)

  output = 1.0

  for i, cat in enumerate(listdir(folder + "cats/") ):
    if i >= k:
      break
    # load image
    photo = load_img(folder + "cats/"+cat, target_size=(target_size, target_size))
    # convert to numpy array
    photo = img_to_array(photo)
    # store
    photos.append(photo)
    labels.append(output)

  # convert to a numpy arrays
  photos = asarray(photos)
  labels = asarray(labels)
  print(photos.shape, labels.shape)
  photos, labels = shuffle(photos, labels, random_state=0)
  return photos, labels
tensorflow keras google-cloud-platform bucket tpu
1 Answer (votes: 0)
(X_train, y_train) = load_data(sets="gs://BUCKETS/dogscats/train/",target_size=img_size)
(X_test, y_test) =  load_data(sets="gs://BUCKETS/dogscats/valid/",target_size=img_size)

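This obviously doesn't work, because all you are really doing is passing a string: load_data hands the "gs://..." path to listdir and load_img, which read from the local filesystem. What you need to do is download the data as a string (bytes) and then work with that.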
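The google-cloud-storage library used below addresses an object by bucket name plus object name rather than by a single "gs://" string. As a minimal sketch, the helper below (split_gcs_uri is a hypothetical name, not part of any library) splits the path from the question into those two parts:

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/obj' into ('bucket', 'path/to/obj')."""
    scheme = "gs://"
    if not uri.startswith(scheme):
        raise ValueError("not a gs:// URI: %r" % uri)
    # Everything up to the first "/" is the bucket; the rest is the object prefix.
    bucket_name, _, object_prefix = uri[len(scheme):].partition("/")
    if not bucket_name:
        raise ValueError("missing bucket name in %r" % uri)
    return bucket_name, object_prefix


bucket_name, prefix = split_gcs_uri("gs://BUCKETS/dogscats/train/")
print(bucket_name, prefix)  # BUCKETS dogscats/train/
```

Note that the resulting prefix has no leading slash, which matches how object names are stored in a bucket.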

First, install the package with one of:

pip install google-cloud-storage
pip3 install google-cloud-storage

pip -> Python

pip3 -> Python3

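After this, you need a service account so that your code can interact with GCP. It is used for authentication.

Once you have the service account key as a JSON file, you need to do one of the following two things:

Set it as an environment variable: export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json"

Or use the alternative I prefer (note that this authenticates the gcloud command-line tool; the Python client library itself looks for GOOGLE_APPLICATION_CREDENTIALS or the service account attached to the VM):

gcloud auth activate-service-account \
  <replace-with-email-from-json-file> \
  --key-file=<path/to/your/json/file> --project=<name-of-your-gcp-project>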
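Both options need the path to the key file, and the gcloud command also needs the email stored inside it. The following is a small sketch, assuming a standard service account key file (which carries "type", "client_email" and "project_id" fields); describe_service_account_key is a hypothetical helper name, not part of any Google library:

```python
import json
import os


def describe_service_account_key(path=None):
    """Return (client_email, project_id) from a service account key file.

    If no path is given, fall back to GOOGLE_APPLICATION_CREDENTIALS.
    """
    path = path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(path) as handle:
        key = json.load(handle)
    if key.get("type") != "service_account":
        raise ValueError("%s is not a service account key" % path)
    return key["client_email"], key["project_id"]
```

Calling it with no argument is also a quick way to check that the environment variable points at a readable key file before starting a long training run.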

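Now let's see how to download a file as a string using the google-cloud-storage library:

from google.cloud import storage

# Uses the credentials configured above
client = storage.Client()
bucket = client.get_bucket("BUCKETS")
# Object names have no leading slash; this must name a single file, not a directory
blob = bucket.get_blob('dogscats/train/<you-will-need-to-point-to-a-file-and-not-a-directory>')
# Returns the file contents as bytes (get_blob returns None if the object does not exist)
data = blob.download_as_string()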

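Now that you have the data as a string, you can pass it into your loading code, along the lines of (X_train, y_train) = load_data(sets=data, target_size=img_size). Keep in mind that load_data as posted expects a directory path and calls listdir and load_img on it, so it has to be adapted to work on downloaded bytes (one blob per image) instead of a path.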
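Since load_data as posted walks a local directory, one way to work with downloaded bytes is a bucket-aware variant of it. The sketch below is an assumption-laden illustration, not the asker's or the library's method: iter_image_bytes, decode_image and load_data_from_bucket are hypothetical names, decoding uses Pillow and NumPy rather than load_img, and the shuffle step from the original function is left out.

```python
import io


def iter_image_bytes(bucket, prefix, k=5000):
    """Yield (blob_name, raw_bytes) for at most k objects under prefix.

    `bucket` is expected to be a google.cloud.storage Bucket, e.g.
    storage.Client().get_bucket("BUCKETS").
    """
    count = 0
    for blob in bucket.list_blobs(prefix=prefix):
        if blob.name.endswith("/"):
            # Skip zero-byte "folder" placeholder objects.
            continue
        if count >= k:
            break
        # download_as_string() returns bytes; newer library releases
        # offer the same call under the name download_as_bytes().
        yield blob.name, blob.download_as_string()
        count += 1


def decode_image(raw_bytes, target_size=250):
    """Decode raw image bytes into a float32 array of shape (size, size, 3)."""
    # Imported lazily so the listing helper works without these packages.
    import numpy as np
    from PIL import Image

    image = Image.open(io.BytesIO(raw_bytes)).convert("RGB")
    image = image.resize((target_size, target_size))
    return np.asarray(image, dtype="float32")


def load_data_from_bucket(bucket, prefix, k=5000, target_size=250):
    """Hypothetical GCS-aware variant of load_data (dogs=0.0, cats=1.0)."""
    import numpy as np

    photos, labels = [], []
    for label, class_dir in ((0.0, "dogs/"), (1.0, "cats/")):
        for _name, raw in iter_image_bytes(bucket, prefix + class_dir, k):
            photos.append(decode_image(raw, target_size))
            labels.append(label)
    return np.asarray(photos), np.asarray(labels)
```

With the bucket from the snippet above, usage would look like X_train, y_train = load_data_from_bucket(bucket, "dogscats/train/", target_size=img_size). Every image is fetched over the network into memory, so for larger datasets it may be worth copying the files to the VM's disk once instead.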

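It sounds complicated, but here is a quick outline:

  1. Install google-cloud-storage
  2. Go to the Google Cloud Platform console -> IAM & Admin -> Service Accounts
  3. Create a service account with the relevant permissions (google-cloud-storage)
  4. Download the key (JSON) file and remember where you saved it.
  5. Activate the service account
  6. Download the file as a string and pass that string to your load_data(data)

Hope this helps!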
