Machine Learning by Tutorials

Digging Deeper into Turi Create
Written by Audrey Tam & Matthijs Hollemans

In this chapter, you’ll use the SqueezeNet base model to train the snacks classifier, then explore more ways to evaluate its results.

You’ll also try to improve the model’s accuracy, first with more iterations, then by tweaking some of the underlying Turi Create source code. The SqueezeNet model overfits at a much lower training accuracy than VisionFeaturePrint_Scene, so any improvements will be easier to see.

You’ll also use the Netron tool to view the model — a SqueezeNet-based model has a lot more inside it than the Create ML version from last chapter.

Getting started

You can continue to use the turienv environment, Jupyter notebook, and snacks dataset from the previous chapter, or start fresh with the DiggingDeeper_starter notebook in this chapter’s starter folder.

If you skipped Chapter 4, “Getting Started with Python & Turi Create,” the quickest way to set up the turienv environment is to run these commands in a Terminal window:

$ cd /path/to/chapter/resources
$ conda env create --file=starter/turienv.yaml
$ conda activate turienv
$ jupyter notebook

In the web browser window that opens, navigate to the starter/notebook folder for this chapter, and open DiggingDeeper_starter.ipynb.

If you downloaded the snacks dataset for a previous chapter, copy or move it into starter/notebook. Otherwise, double-click starter/notebook/snacks-download-link.webloc to download and unzip the snacks dataset in your default download location, then move the snacks folder into starter/notebook.

Note: In this book we’re using Turi Create version 5.6. Other versions may give different results or even errors. This is why we suggest using the turienv that comes with the book.

Transfer learning with SqueezeNet

If you’re not continuing from the previous chapter’s notebook, then run the following cells one by one.

import turicreate as tc
import matplotlib.pyplot as plt
train_data = tc.image_analysis.load_images("snacks/train", with_path=True)
test_data = tc.image_analysis.load_images("snacks/test", with_path=True)
import os
train_data["label"] = train_data["path"].apply(
              lambda path: os.path.basename(os.path.split(path)[0]))

test_data["label"] = test_data["path"].apply(
              lambda path: os.path.basename(os.path.split(path)[0]))

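# Either load the model you saved earlier...
model = tc.load_model("MultiSnacks.model")

# ...or train it again with the SqueezeNet base model.
model = tc.image_classifier.create(train_data, target="label",
                                   model="squeezenet_v1.1",
                                   verbose=True, max_iterations=100)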
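
If the label-extraction lambda looks opaque, here is a minimal standalone sketch of what it computes. The helper name label_from_path and the sample file names are hypothetical, used only for illustration, and the sketch assumes POSIX-style paths like the ones you get on macOS.

```python
import os

def label_from_path(path):
    # os.path.split() separates the file name from its directory;
    # basename() of that directory is the class folder, e.g. "apple".
    directory = os.path.split(path)[0]
    return os.path.basename(directory)

# Hypothetical file names, for illustration only.
for path in ["snacks/train/apple/image_001.jpg",
             "snacks/test/hot dog/image_002.jpg"]:
    print(path, "->", label_from_path(path))
```

Because the label comes from the folder name, the approach only works if every image sits directly inside a folder named after its class.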

metrics = model.evaluate(test_data)
print("Accuracy: ", metrics["accuracy"])
print("Precision: ", metrics["precision"])
print("Recall: ", metrics["recall"])
Accuracy:  0.6470588235294118
Precision:  0.6441343963604582
Recall:  0.6445289115646259

Getting individual predictions

So far, you’ve just repeated the steps from the previous chapter. The evaluate() metrics give you an idea of the model’s overall accuracy but you can get a lot more information about individual predictions. Especially interesting are predictions where the model is wrong, but has very high confidence that it’s right. Knowing where the model is wrong can help you improve your training dataset.

The interactive evaluation window

Predicting and classifying

Turi Create models have other functions, in addition to evaluate(). Enter and run these commands in the next cell, and wait a while:

model.predict(test_data)
['apple', 'grape', 'orange', 'orange', 'orange', 'apple', 'orange', 'apple', 'candy', 'apple', 'grape', 'apple', 'strawberry', 'apple', 'apple', 'carrot', 'candy', 'ice cream', 'apple', 'apple', 'apple', ...

output = model.classify(test_data)
The head of the SFrame with classification results

imgs_with_pred = test_data.add_columns(output)
Visually inspecting the classification results

imgs_filtered = imgs_with_pred[(imgs_with_pred["probability"] > 0.9) &
                 (imgs_with_pred["label"] != imgs_with_pred["class"] )]
Inspecting the filtered classification results
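
To see the filtering logic without an SFrame, here is a rough plain-Python sketch of the same idea: keep only the rows where the prediction is wrong but the probability is above 0.9. The function name confident_mistakes and the sample rows are made up for illustration; only the column names (label, class, probability) come from the code above.

```python
def confident_mistakes(rows, threshold=0.9):
    """Return rows whose predicted class is wrong but highly confident."""
    return [
        row for row in rows
        if row["probability"] > threshold and row["label"] != row["class"]
    ]

# Made-up rows that mimic the columns of imgs_with_pred.
rows = [
    {"path": "a.jpg", "label": "apple", "class": "apple", "probability": 0.97},
    {"path": "b.jpg", "label": "muffin", "class": "doughnut", "probability": 0.95},
    {"path": "c.jpg", "label": "cake", "class": "ice cream", "probability": 0.55},
]

for row in confident_mistakes(rows):
    print(row["path"], "true:", row["label"], "predicted:", row["class"])
```

Only b.jpg passes both conditions: a.jpg is correct, and c.jpg is wrong but not confidently so.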

Sorting the prediction probabilities

Turi Create’s predict() method can also give you the probability distribution for each image. Enter and run these lines, then wait a while:

predictions = model.predict(test_data, output_type="probability_vector")
print("Probabilities for 2nd image", predictions[1])
array('d', [0.20337662077520557, 0.010500386379535839, 2.8464920324200633e-07, 0.0034932724790819624, 0.0013391166287066811, 0.0005122369124003818, 5.118841868115829e-06, 0.699598450277612, 2.0208374302686123e-07, 7.164497444549948e-07, 2.584012081941193e-06, 5.5645094234565224e-08, 0.08066298157942492, 0.00021689939485918623, 2.30074608705137e-06, 3.6511378835730773e-10, 5.345215832976188e-05, 9.897270575019545e-06, 2.1477438456101293e-08, 0.00022540187389448156])
labels = test_data["label"].unique().sort()
preds = tc.SArray(predictions[1])
tc.SFrame({"preds": preds, "labels": labels}).sort([("preds", False)])
Top five probabilities for the second image.
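
The sorting step can also be sketched in plain Python. This version pairs each probability with its label and sorts the pairs in descending order. It assumes the probability vector follows the alphabetically sorted label order, as the labels variable above suggests. The probabilities are the ones printed for the second image, rounded for readability, and the helper name top_k is hypothetical.

```python
labels = ["apple", "banana", "cake", "candy", "carrot", "cookie", "doughnut",
          "grape", "hot dog", "ice cream", "juice", "muffin", "orange",
          "pineapple", "popcorn", "pretzel", "salad", "strawberry",
          "waffle", "watermelon"]

# Probability vector for the second test image, rounded.
probs = [0.2034, 0.0105, 2.8e-07, 0.0035, 0.0013, 0.0005, 5.1e-06, 0.6996,
         2.0e-07, 7.2e-07, 2.6e-06, 5.6e-08, 0.0807, 0.0002, 2.3e-06,
         3.7e-10, 5.3e-05, 9.9e-06, 2.1e-08, 0.0002]

def top_k(probs, labels, k=5):
    # Sort (probability, label) pairs from most to least likely.
    pairs = sorted(zip(probs, labels), reverse=True)
    return [(label, prob) for prob, label in pairs[:k]]

for label, prob in top_k(probs, labels):
    print("%10s %.4f" % (label, prob))
```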

Using a fixed validation set

Turi Create extracts a random validation dataset from the training dataset — 5% of the images. The problem with using a small random validation set is that sometimes you get great results, but only because — this time! — the validation dataset just happens to be in your favor.
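
To get a feel for how much luck is involved, here is a back-of-the-envelope sketch. It uses the standard binomial approximation for the standard error of an accuracy estimate, which is an assumption added here and not something Turi Create reports. The 4838 training images and the 63% validation accuracy are the figures that appear elsewhere in this chapter.

```python
import math

def accuracy_standard_error(accuracy, num_images):
    # Binomial approximation: treat each image as an independent hit or miss.
    return math.sqrt(accuracy * (1.0 - accuracy) / num_images)

num_train = 4838                      # training-set size reported in this chapter
num_val = round(0.05 * num_train)     # roughly 5% is held out for validation
se = accuracy_standard_error(0.63, num_val)

print("validation images:", num_val)
print("standard error: %.3f" % se)
```

With only a couple of hundred validation images, a swing of a few percentage points between runs can be pure chance, which is a good reason to prefer a larger, fixed validation set.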

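val_data = tc.image_analysis.load_images("snacks/val", with_path=True)
val_data["label"] = val_data["path"].apply(
              lambda path: os.path.basename(os.path.split(path)[0]))

# Pass the fixed validation set instead of letting Turi Create pick a random one.
model = tc.image_classifier.create(train_data, target="label",
                                   model="squeezenet_v1.1",
                                   verbose=True, max_iterations=100,
                                   validation_set=val_data)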

Increasing max iterations

So, is a validation accuracy of 63% good? Meh, not really. Turi Create knows it, too — at the end of the training output it says:

This model may not be optimal. To improve it, consider increasing `max_iterations`.
model = tc.image_classifier.create(train_data, target="label",
                                   model="squeezenet_v1.1",
                                   verbose=True, max_iterations=200,
                                   validation_set=val_data)

metrics = model.evaluate(test_data)
print("Accuracy: ", metrics["accuracy"])
print("Precision: ", metrics["precision"])
print("Recall: ", metrics["recall"])
Accuracy:  0.6554621848739496
Precision:  0.6535792163681828
Recall:  0.6510697278911566

Confusing apples with oranges?

print("Confusion Matrix:\n", metrics["confusion_matrix"])
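+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|    cookie    |      juice      |   1   |
|    carrot    |    watermelon   |   1   |
|   pretzel    |     pretzel     |   14  |
|     cake     |    ice cream    |   2   |
|  pineapple   |      carrot     |   1   |
|   doughnut   |      muffin     |   1   |
|    muffin    |     doughnut    |   7   |
+--------------+-----------------+-------+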
import numpy as np
import seaborn as sns

def compute_confusion_matrix(metrics, labels):
    num_labels = len(labels)
    label_to_index = {l:i for i,l in enumerate(labels)}

    # Use the built-in int: the np.int alias was removed from newer NumPy releases.
    conf = np.zeros((num_labels, num_labels), dtype=int)
    for row in metrics["confusion_matrix"]:
        true_label = label_to_index[row["target_label"]]
        pred_label = label_to_index[row["predicted_label"]]
        conf[true_label, pred_label] = row["count"]

    return conf

def plot_confusion_matrix(conf, labels, figsize=(8, 8)):
    fig = plt.figure(figsize=figsize)
    heatmap = sns.heatmap(conf, annot=True, fmt="d")
    heatmap.xaxis.set_ticklabels(labels, rotation=45,
                                 ha="right", fontsize=12)
    heatmap.yaxis.set_ticklabels(labels, rotation=0,
                                 ha="right", fontsize=12)
    plt.xlabel("Predicted label", fontsize=12)
    plt.ylabel("True label", fontsize=12)
conf = compute_confusion_matrix(metrics, labels)
plot_confusion_matrix(conf, labels, figsize=(16, 16))
The confusion matrix
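
If you’re wondering where the rows of the confusion matrix come from, this plain-Python sketch counts (true label, predicted label) pairs and returns them in the same three-column shape that compute_confusion_matrix() consumes. The function name confusion_rows and the sample labels are invented for illustration; this isn’t how Turi Create computes the matrix internally.

```python
from collections import Counter

def confusion_rows(true_labels, predicted_labels):
    # Count how often each (true, predicted) pair occurs.
    counts = Counter(zip(true_labels, predicted_labels))
    return [
        {"target_label": t, "predicted_label": p, "count": n}
        for (t, p), n in sorted(counts.items())
    ]

# Made-up labels for five images.
true_labels = ["muffin", "muffin", "doughnut", "pretzel", "muffin"]
predicted = ["doughnut", "muffin", "doughnut", "pretzel", "doughnut"]

for row in confusion_rows(true_labels, predicted):
    print(row)
```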

Computing recall for each class

Turi Create’s evaluate() function gives you the overall test dataset accuracy but, as mentioned in the AI Ethics section of the first chapter, accuracy might be much lower or higher for specific subsets of the dataset. With a bit of code, you can get the accuracy for each individual class from the confusion matrix. This per-class accuracy is the fraction of a class’s images that the model classified correctly, which is the recall for that class:

for i, label in enumerate(labels):
    correct = conf[i, i]
    images_per_class = conf[i].sum()
    print("%10s %.1f%%" % (label, 100. * correct/images_per_class))
     apple 64.0%
    banana 68.0%
      cake 54.0%
     candy 58.0%
    carrot 66.0%
    cookie 56.0%
  doughnut 62.0%
     grape 84.0%
   hot dog 76.0%
 ice cream 44.0%
     juice 74.0%
    muffin 50.0%
    orange 74.0%
 pineapple 67.5%
   popcorn 62.5%
   pretzel 56.0%
     salad 72.0%
strawberry 67.3%
    waffle 62.0%
watermelon 64.0%
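
The loop above divides each diagonal entry by its row total, which gives recall. Dividing by the column total instead gives precision: of all the images the model predicted as a class, how many really belong to it. Here is a small sketch of both on a made-up three-class matrix. It uses plain nested lists rather than the NumPy array from the chapter, and the function name is hypothetical.

```python
def per_class_recall_precision(conf, labels):
    results = {}
    for i, label in enumerate(labels):
        correct = conf[i][i]
        row_total = sum(conf[i])                  # images whose true class is label
        col_total = sum(row[i] for row in conf)   # images predicted as label
        recall = correct / row_total if row_total else 0.0
        precision = correct / col_total if col_total else 0.0
        results[label] = (recall, precision)
    return results

# Rows are true labels, columns are predicted labels (made-up counts).
labels = ["apple", "grape", "orange"]
conf = [[8, 2, 0],
        [1, 9, 0],
        [3, 0, 7]]

for label, (recall, precision) in per_class_recall_precision(conf, labels).items():
    print("%10s recall %.1f%%  precision %.1f%%"
          % (label, 100 * recall, 100 * precision))
```

In this toy matrix, orange has the lowest recall but perfect precision: the model misses some oranges, yet never calls anything else an orange.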

Training the classifier with regularization

A typical hyperparameter that machine learning practitioners like to play with is the amount of regularization that’s being used by the model. Regularization helps to prevent overfitting. Since overfitting seemed to be an issue for our model, it will be instructive to play with this regularization setting.

model = tc.image_classifier.create(train_data, target="label",
                                   model="squeezenet_v1.1",
                                   verbose=True, max_iterations=200,
                                   l2_penalty=10.0, l1_penalty=0.0,
                                   validation_set=val_data)
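
As a rough picture of what these two settings do, the sketch below shows the textbook form of a regularized loss: the data loss plus a penalty on the size of the weights. This is the general idea only. The exact scaling Turi Create applies internally is an assumption here, and the function and variable names are made up.

```python
def penalized_loss(data_loss, weights, l2_penalty=0.0, l1_penalty=0.0):
    # L2 penalizes squared weights; L1 penalizes absolute weights.
    l2_term = l2_penalty * sum(w * w for w in weights)
    l1_term = l1_penalty * sum(abs(w) for w in weights)
    return data_loss + l2_term + l1_term

small_weights = [0.1, -0.2, 0.05]
large_weights = [3.0, -4.0, 2.5]

# Same data loss, very different total once the penalty is added.
print(penalized_loss(1.0, small_weights, l2_penalty=10.0))
print(penalized_loss(1.0, large_weights, l2_penalty=10.0))
```

Because large weights make the total loss much worse, the optimizer is pushed toward smaller weights, and a model with small weights is less able to memorize quirks of the training images.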

Wrangling Turi Create code

Saving the extracted features

Wouldn’t it be nice if there were a way to save time during the training phase, so you don’t have to regenerate the features extracted by SqueezeNet every time? Well, as promised, in this section you’ll learn how to save the intermediate SFrame to disk and reload it just before experimenting with the classifier.

from turicreate.toolkits import _pre_trained_models
from turicreate.toolkits import _image_feature_extractor

ptModel = _pre_trained_models.MODELS["squeezenet_v1.1"]()
feature_extractor = _image_feature_extractor.MXFeatureExtractor(ptModel)
train_features = feature_extractor.extract_features(train_data,
                                          "image", verbose=True)
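extracted_train_features = tc.SFrame({
    "label": train_data["label"],
    "__image_features__": train_features,
})
# Save the features so you can reload them later without extracting again.
extracted_train_features.save("extracted_train_features.sframe")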
# Run this tomorrow or next week
extracted_train_features = tc.SFrame("extracted_train_features.sframe")
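
The save-then-reload pattern is worth seeing in isolation. The sketch below is a generic, plain-Python illustration that stores the result in a JSON file, not an SFrame, and the helper load_or_compute and the fake extraction function are hypothetical. The point is only that the expensive step runs once and every later call reads from disk.

```python
import json
import os
import tempfile

def load_or_compute(cache_path, compute):
    # Reuse the cached result if it exists; otherwise compute and store it.
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    result = compute()
    with open(cache_path, "w") as f:
        json.dump(result, f)
    return result

calls = []

def slow_feature_extraction():
    # Stand-in for the expensive SqueezeNet feature extraction.
    calls.append(1)
    return {"label": ["apple", "grape"],
            "features": [[0.1, 0.2], [0.3, 0.4]]}

cache_path = os.path.join(tempfile.mkdtemp(), "features.json")
first = load_or_compute(cache_path, slow_feature_extraction)
second = load_or_compute(cache_path, slow_feature_extraction)
print("extractions run:", len(calls))
```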

Inspecting the extracted features

Let’s see what these features actually look like. Enter and run this command:

extracted_train_features.head()

The head of the extracted features table

array('d', [6.1337385177612305, 10.12844181060791, 13.025101661682129, 7.931194305419922, 12.03809928894043, 15.103202819824219, 12.722893714904785, 10.930903434753418, 12.778315544128418, 14.208030700683594, 16.8399658203125, 11.781684875488281, ...
val_features = feature_extractor.extract_features(val_data,
                                      "image", verbose=True)

extracted_val_features = tc.SFrame({
    "label": val_data["label"],
    "__image_features__": val_features,
})


Training the classifier

Now you’re ready to train the classifier! Enter and run this statement:

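# These settings mirror the regularized run and the summary printed for this model.
lr_model = tc.logistic_classifier.create(extracted_train_features,
                             features=["__image_features__"],
                             target="label",
                             validation_set=extracted_val_features,
                             max_iterations=200,
                             l2_penalty=10.0, l1_penalty=0.0,
                             verbose=True)
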
from turicreate.toolkits.image_classifier import ImageClassifier

state = {
    'classifier': lr_model,
    'model': ptModel.name,
    'max_iterations': lr_model.max_iterations,
    'feature_extractor': feature_extractor,
    'input_image_shape': ptModel.input_image_shape,
    'target': lr_model.target,
    'feature': "image",
    'num_features': 1,
    'num_classes': lr_model.num_classes,
    'classes': lr_model.classes,
    'num_examples': lr_model.num_examples,
    'training_time': lr_model.training_time,
    'training_loss': lr_model.training_loss,
}

model = ImageClassifier(state)
metrics = model.evaluate(test_data)
print("Accuracy: ", metrics["accuracy"])
print("Precision: ", metrics["precision"])
print("Recall: ", metrics["recall"])
Accuracy:  0.6712184873949579
Precision:  0.6755916486674352
Recall:  0.6698818027210884

Saving the model

You can save the model as a Turi Create model:

Class                                    : ImageClassifier

Number of classes                        : 20
Number of feature columns                : 1
Input image shape                        : (3, 227, 227)
Training summary
Number of examples                       : 4838
Training loss                            : 3952.4993
Training time (sec)                      : 59.2703
Class                          : LogisticClassifier

Number of coefficients         : 19019
Number of examples             : 4838
Number of classes              : 20
Number of feature columns      : 1
Number of unpacked features    : 1000

L1 penalty                     : 0.0
L2 penalty                     : 10.0

Training Summary
Solver                         : lbfgs
Solver iterations              : 200
Solver status                  : Completed (Iteration limit reached).
Training time (sec)            : 59.2703

Log-likelihood                 : 3952.4993

Highest Positive Coefficients
(intercept)                    : 1.8933
(intercept)                    : 1.4506
(intercept)                    : 0.6717
(intercept)                    : 0.5232
(intercept)                    : 0.4072

Lowest Negative Coefficients
(intercept)                    : -1.6521
(intercept)                    : -1.5588
(intercept)                    : -1.4143
(intercept)                    : -0.8959
(intercept)                    : -0.5863
no_reg_model = tc.load_model("MultiSnacks.model")
Log-likelihood                 : 2400.3284

Highest Positive Coefficients
(intercept)                    : 0.3808
(intercept)                    : 0.3799
(intercept)                    : 0.1918
__image_features__[839]        : 0.1864
(intercept)                    : 0.15

Lowest Negative Coefficients
(intercept)                    : -0.3996
(intercept)                    : -0.3856
(intercept)                    : -0.3353
(intercept)                    : -0.2783
__image_features__[820]        : -0.1423

A peek behind the curtain

SqueezeNet and VisionFeaturePrint_Scene are convolutional neural networks. In the coming chapters, you’ll learn more about how these networks work internally, and you’ll see how to build one from scratch. In the meantime, it might be fun to take a peek inside your Core ML model.

Using Netron to examine the .mlmodel file


Challenge 1: Binary classifier

Remember the healthy/unhealthy snacks model? Try to train that binary classifier using Turi Create. The approach is actually very similar to what you did in this chapter. The only difference is that you need to assign the label “healthy” or “unhealthy” to each row in the training data SFrame.

healthy = [
    'apple', 'banana', 'carrot', 'grape', 'juice', 'orange',
    'pineapple', 'salad', 'strawberry', 'watermelon'
]

unhealthy = [
    'cake', 'candy', 'cookie', 'doughnut', 'hot dog',
    'ice cream', 'muffin', 'popcorn', 'pretzel', 'waffle'
]

train_data["label"] = train_data["path"].apply(
    lambda path: "healthy"
    if any("/" + class_name in path for class_name in healthy)
    else "unhealthy")

test_data["label"] = test_data["path"].apply(
    lambda path: "healthy"
    if any("/" + class_name in path for class_name in healthy)
    else "unhealthy")
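
The substring check above works for the snacks folder layout, but it would also match a file whose own name happens to start with a healthy class name. As a hedged alternative, this sketch looks only at the image’s parent folder. The function name and the sample paths are hypothetical, and it assumes the same one-folder-per-class layout used throughout the chapter.

```python
import os

healthy = {
    "apple", "banana", "carrot", "grape", "juice", "orange",
    "pineapple", "salad", "strawberry", "watermelon",
}

def healthy_or_unhealthy(path):
    # The class is the name of the folder that directly contains the image.
    class_name = os.path.basename(os.path.dirname(path))
    return "healthy" if class_name in healthy else "unhealthy"

# Hypothetical paths, for illustration only.
for path in ["snacks/train/apple/0001.jpg",
             "snacks/train/cake/apple_pie.jpg"]:
    print(path, "->", healthy_or_unhealthy(path))
```

You could pass a function like this to apply() in place of the lambda.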

Challenge 2: ResNet50-based model

Train the 20-class classifier using the ResNet-50 model and see if that gets a better validation and test set score. Use model="resnet-50" when creating the classifier object. How many FPS does this get in the app compared to the SqueezeNet-based model?

Challenge 3: Use another dataset

Create your own training, validation, and test datasets from Google Open Images or some other image source. I suggest keeping the number of categories limited.

Key points


© 2020 Razeware LLC
