Cat vs. Dog Classifier: How to Train a Simple CNN Using Free Computing Resources and Deploy It in the Browser

Train a simple CNN using free computing resources, then deploy it in the browser to classify cats and dogs

We will learn to create a simple Convolutional Neural Network for binary classification using TensorFlow, train the model with free resources on Google Colab, and deploy the trained model in the browser using TensorFlow.js.

Find a simple demo here (not mobile-friendly yet) and code for the demo here. Find the code for model training and saving here on Colab.

a simple demo

Note that most of the code used for the classification task in this tutorial comes from the Google Machine Learning Practica; read it if you want a much more detailed tutorial on image classification. Also, if you are not already familiar with them, read more about CNNs here and Colab here (a detailed tutorial on Colab usage can be found here).

Train a CNN using TensorFlow in Google Colab

The model will train on a filtered version of the Kaggle Dogs vs. Cats dataset. The filtered dataset contains 2,000 labeled color images of cats and dogs: 1,000 cat examples for training, 1,000 dog examples for training, 500 cat examples for validation, and 500 dog examples for validation.

Open a Colab notebook and type the following into a code cell to download the filtered dataset as a zip file:

!wget --no-check-certificate https://storage.googleapis.com/mledudatasets/cats_and_dogs_filtered.zip -O /tmp/cats_and_dogs_filtered.zip

Unzip the file and obtain addresses to the training and validation datasets:


import os
import zipfile

# extract all files from the downloaded zip file
# the resulting root folder is named 'tmp' and it contains
# a folder named 'cats_and_dogs_filtered', which
# contains two separate folders: 'train' and 'validation'
local_zip = '/tmp/cats_and_dogs_filtered.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp')
zip_ref.close()

# construct addresses for two datasets
base_dir = '/tmp/cats_and_dogs_filtered'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

# construct addresses for two classes of examples
train_cats_dir = os.path.join(train_dir, 'cats')
train_dogs_dir = os.path.join(train_dir, 'dogs')
val_cats_dir = os.path.join(validation_dir, 'cats')
val_dogs_dir = os.path.join(validation_dir, 'dogs')

These addresses will later be used to generate batched examples.

Besides reading about the data here, let's display a random image from the dataset to get more familiar with our data 😉

import random
import matplotlib.pyplot as plt
from keras.preprocessing.image import load_img

# list the image file names in each training folder
train_cats_fnames = os.listdir(train_cats_dir)
train_dogs_fnames = os.listdir(train_dogs_dir)

cat_img_files = [os.path.join(train_cats_dir, f) for f in train_cats_fnames]
dog_img_files = [os.path.join(train_dogs_dir, f) for f in train_dogs_fnames]
img_path = random.choice(cat_img_files + dog_img_files)
img = load_img(img_path)

# display the image without grid lines or axis ticks
plt.imshow(img)
plt.grid(False)
plt.xticks([])
plt.yticks([])
plt.show()

an example from the cat class

Data pre-processing is an important part of any learning system, especially when the dataset is small, as it is in our case. Pre-process the images in the training set using data augmentation techniques, including re-scaling, rotation, shifting, zooming, flipping, and shearing:

# The ImageDataGenerator class generates batches of tensor image data
# and their labels with real-time data augmentation
from keras.preprocessing.image import ImageDataGenerator

# apply data augmentation to the training set
train_datagen = ImageDataGenerator(
    # normalize pixel values to be between [0,1]
    # note that the images use the byte image pixel format,
    # which stores each of the three channels (i.e. RGB) as an 8-bit integer,
    # giving a range of possible values from 0 to 255 (b/c 2^8).
    # read more about pixel values here:
    # https://homepages.inf.ed.ac.uk/rbf/HIPR2/value.htm
    rescale=1./255,
    # Read about the following arguments:
    # https://keras.io/preprocessing/image/
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# do not augment the validation set,
# just normalize the pixel values
val_datagen = ImageDataGenerator(rescale=1./255)

# flow_from_directory() takes the path to a directory
# and generates batches of augmented data
train_generator = train_datagen.flow_from_directory(
    train_dir,
    # resize all images to 150x150, will give reasons later
    target_size=(150, 150),
    # 20 examples will be used in each iteration, or,
    # one gradient update of model training
    batch_size=20,
    # binary labels for binary cross-entropy loss
    class_mode='binary')

val_generator = val_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

Build a simple model using Keras.

The input layer of the model accepts raw pixels from images that have a width and height of 150 and three color channels; this is why we resized all images in our dataset to 150 by 150.

After the input layer, the model contains three Convolutional layers with ReLU activation and three Max Pooling layers. The details of the structure of these hidden layers are included in the comments of the code snippet below.

The model then uses a Flatten layer to map the resulting features to a 1D tensor in order to feed them into a fully-connected layer of 512 hidden units. After that, dropout regularization with a rate of 0.5 is applied to further prevent over-fitting.

Finally, the output layer of the model uses a Sigmoid function as the activation function; it squeezes the output to a class score between 0 and 1, which is perfect for our task since the dataset labels the cat class with 0 and the dog class with 1.
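For reference, the sigmoid function is

σ(x) = 1 / (1 + e^(−x))

which squashes any real-valued input x into the open interval (0, 1), so the single output node can be read as the model's estimated probability that the image contains a dog.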


from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras.optimizers import RMSprop

# the input images are required to have width and height of 150
# and 3 channels, one for each color
# this is the reason why all images were resized above
input_layer = layers.Input(shape=(150, 150, 3))

# note that each layer from now on is a function of the previous layer

# the first convolutional layer has 16 3x3 filters
# output volume = (W-F+2P)/S+1 = (150-3+2(0))/1+1 = 148 --> (148, 148, 16), where
# W:=input size, F:=filter size, S:=stride (moving the filter 1 pixel at a time), P:=padding
# num of parameters = FxFxinputxoutput + output = 3x3x3x16 + 16 = 432 + 16 = 448
x = layers.Conv2D(16, 3, activation='relu')(input_layer)  # relu := max(0, x)
# max pooling layer with 2x2 window
# output volume = (W-F)/S+1 = (148-2)/2+1 = 74 --> (74, 74, 16)
x = layers.MaxPooling2D(2)(x)
# the second convolutional layer has 32 3x3 filters
# output volume = (W-F+2P)/S+1 = (74-3+2(0))/1+1 = 72 --> (72, 72, 32)
# num of parameters = FxFxinputxoutput + output = 3x3x16x32 + 32 = 4640
x = layers.Conv2D(32, 3, activation='relu')(x)
# max pooling layer with 2x2 window
# output volume = (W-F)/S+1 = (72-2)/2+1 = 36 --> (36, 36, 32)
x = layers.MaxPooling2D(2)(x)
# the third convolutional layer has 64 3x3 filters
# output volume = (W-F+2P)/S+1 = (36-3+2(0))/1+1 = 34 --> (34, 34, 64)
# num of parameters = FxFxinputxoutput + output = 3x3x32x64 + 64 = 18496
x = layers.Conv2D(64, 3, activation='relu')(x)
# max pooling layer with 2x2 window
# output volume = (W-F)/S+1 = (34-2)/2+1 = 17 --> (17, 17, 64)
x = layers.MaxPooling2D(2)(x)
# the shape is really (None, 17, 17, 64)

# flatten the feature map to a 1D tensor
# output shape = (None, 17x17x64) = (None, 18496)
x = layers.Flatten()(x)
# fully connected layer with relu activation and 512 hidden units
# output shape = (None, 512)
# num of parameters = 18496x512 + 512 = 9470464
x = layers.Dense(512, activation='relu')(x)

# dropout regularization with dropout rate of 0.5
# output shape = (None, 512)
x = layers.Dropout(0.5)(x)
# output layer with a single node and a sigmoid activation,
# since we have a binary classification problem
# output shape = (None, 1)
# num of parameters = 1x512 + 1 = 513
output = layers.Dense(1, activation='sigmoid')(x)  # [0,1]

# the model is defined by its input and output tensor(s)
model = Model(input_layer, output)

# print out a summary of the model to confirm our calculations above
print(model.summary())

define model structure and training process

After model definition, we then define the training configuration and plot the learning curves. The loss function will be binary cross-entropy and the optimizer will be RMSprop. The model will be trained on data generated batch-by-batch (we defined each batch to contain 20 examples) for 100 epochs, each containing 100 batches of examples (because 2,000/20 = 100). The model will also be validated on data generated batch-by-batch (also 20 examples per batch), with 50 validation batches per epoch (because 1,000/20 = 50).

# define the model training configuration
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(lr=0.001),
              metrics=['acc'])

# train the model and save the training history
history = model.fit_generator(train_generator,
                              steps_per_epoch=100,
                              epochs=100,
                              validation_data=val_generator,
                              validation_steps=50,
                              verbose=2)

# visualize the training process
# all data are pulled from 'history'
acc = history.history['acc']
val_acc = history.history['val_acc']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, label='training')
plt.plot(epochs, val_acc, label='validation')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()

plt.plot(epochs, loss, label='training')
plt.plot(epochs, val_loss, label='validation')
plt.title('Training and validation loss')
plt.legend()

train the model and plot the learning curves

the learning curves

As you can see, the classification model we constructed works but is not yet a great model: the accuracy is only around 80% and the losses are still high. After you get the model to work in the browser, it would be a good idea to test a few model structures of your own or tune the hyper-parameters of the current model to achieve a higher accuracy. You could also try to train the model on the full Kaggle dataset, or try to leverage a pre-trained classification model such as Inception-V3. The details of training a better model for our task can all be found in the Google Image Classification Practica linked above.
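As a starting point for the transfer-learning route, here is a minimal sketch, assuming a frozen Inception-V3 base from Keras with the same dense head we used above; the layer sizes and frozen-base choice are assumptions, not the Practica's exact recipe:

from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras import layers, Model

# load the Inception-V3 convolutional base pre-trained on ImageNet,
# dropping its original classification head
base = InceptionV3(input_shape=(150, 150, 3), include_top=False, weights='imagenet')

# freeze the base so only the new head is trained
for layer in base.layers:
    layer.trainable = False

# attach the same kind of head we used above
x = layers.Flatten()(base.output)
x = layers.Dense(512, activation='relu')(x)
x = layers.Dropout(0.5)(x)
output = layers.Dense(1, activation='sigmoid')(x)

transfer_model = Model(base.input, output)
# compile and fit with the same generators and settings as before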


Save the trained model structure and weights to Google Drive

Establish a connection with your Google Drive:

# from https://medium.com/deep-learning-turkey/google-colab-free-gpu-tutorial-e113627b9f5d

# install the necessary libraries and perform authorization
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

# mount your Google Drive
!mkdir -p ggdrive
!google-drive-ocamlfuse ggdrive

Save the model structure and weights and move the files into your Google Drive:

# save the model
model.save('cats_dogs.h5')

# install tensorflow.js
!pip install tensorflowjs

# create weight files and a json file containing the structure of the model
!mkdir model
!tensorflowjs_converter --input_format keras cats_dogs.h5 model/

# zip the model up
!zip -r model.zip model

# move the zip file into the mounted Google Drive folder
!mv model.zip ggdrive

The model.save() method (line 2 of the snippet above) saves a Keras model into a single HDF5 file that contains the architecture, weights, training configuration, and the state of the optimizer of the model; see more details here. Your resulting 'model' folder should contain files similar to the following:
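As a rough illustration (the exact file names and shard count depend on the converter version and the model size), the converter emits a model.json topology file plus binary weight shards:

model/
├── model.json        # layer topology and weights manifest
├── group1-shard1of3
├── group1-shard2of3
└── group1-shard3of3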


Reconstruct the pre-trained model in the browser using TensorFlow.js

Make sure you have a place to host your project, such as github.io. After that, include the following in your HTML file to enable the usage of TensorFlow.js:
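A minimal sketch, assuming you pull TensorFlow.js from a CDN; pinning an early 0.x version is an assumption here, chosen to match the older API (tf.loadModel, tf.fromPixels) used in this tutorial:

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@0.12.0"></script>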

Download the files from Google Drive to your project folder, then load the saved model from the files:

model = await tf.loadModel('cat_dog_model/model.json');
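Since tf.loadModel() is asynchronous, the await has to live inside an async function; here is a minimal sketch of one way to wrap it (the immediately-invoked wrapper is illustrative, not the demo's exact code):

let model;
(async () => {
  // fetch model.json and its weight shards over HTTP
  model = await tf.loadModel('cat_dog_model/model.json');
})();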

Create an element to display the image meant to be classified and another to show the classification result:

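A minimal sketch of the two elements; the IDs (inputImage and catOrDogLabel) match what the classification script below looks up, and the image source is a placeholder:

<img id="inputImage" src="my_pet.jpg" width="300">
<p>Your image contains a <span id="catOrDogLabel">____</span>.</p>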

Use the model we loaded to perform classification on inputImage and display the result on the HTML label tag:

// step 1: grab the raw input image pixels;
// unfortunately we cannot get pixel data directly from the html img element

// get the html element that contains the input image
var img = document.getElementById('inputImage');
// create an html canvas element
// the canvas element is a container for graphics that allows javascript to draw graphics on the fly
var canvas = document.createElement('canvas');
// https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/getContext
// context is a CanvasRenderingContext2D object (which represents a 2D rendering context)
var context = canvas.getContext('2d');
// set the width and the height of the canvas to be
// the same as inputImage
canvas.width = img.width;
canvas.height = img.height;
// position the image on the canvas
context.drawImage(img, 0, 0);
/* https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/getImageData
   CanvasRenderingContext2D.getImageData(sx, sy, sw, sh) returns an ImageData that
   represents the underlying pixel data for the area of the canvas starting at (sx, sy)
   and has width of sw and height of sh */
var imgData = context.getImageData(0, 0, img.width, img.height);

// step 2: turn raw pixels into tensors, so our model can work with the data
// use tidy() to dispose of any intermediate tensors
const preprocessed_imgData = tf.tidy(() => {
  // convert the image data into a tensor of shape (height, width, 3)
  let tensor = tf.fromPixels(imgData, 3);
  // resize the tensor to (150, 150, 3)
  // because the input layer of our model requires it
  var resized = tf.image.resizeBilinear(tensor, [150, 150]).toFloat();
  // reshape it to (1, 150, 150, 3) because the input layer takes 4 dimensions (i.e. batch size = 1)
  const input_tensor = resized.reshape([1, 150, 150, 3]);
  // normalize the pixel values to the range [0, 1],
  // matching the rescale=1./255 used at training time
  const offset = tf.scalar(255.0);
  const normalized = input_tensor.div(offset);
  // return the normalized image
  return normalized;
});

// step 3: make a prediction
// https://www.bisque.com/products/orchestrate/RASCOMHelp/RASCOM/Synchronous_vs_Asynchronous_Execution.htm
// dataSync() synchronously downloads the values from the tf.Tensor
// this could be why the demo has performance issues on mobile devices
const pred = model.predict(preprocessed_imgData).dataSync();

// step 4: display the prediction for users
// the dataset labels cats with 0 and dogs with 1
var catOrDog = pred[0] <= 0.5 ? 'cat' : 'dog';
document.getElementById('catOrDogLabel').innerHTML = catOrDog;

I set the decision threshold to 0.5, but it could be set to anything reasonable.

