Introduction to Using OpenCV With Unity


Version

  • C# 3.5, Unity 2018.2

Introduction

If you’ve been working with Unity for a while, you must have realized how powerful the game engine is. From making simple 2D and 3D mobile games, to full-fledged virtual reality applications, you can do it all with Unity.

However, if you are a fan of playing motion-based games like Kinect Table Tennis or Motion Sports, as a developer, you might have wondered whether it’s possible to make these kinds of games with Unity.

This tutorial will serve as the perfect starting point as you’ll learn about OpenCV (Open Source Computer Vision) — one of the world’s most popular and widely used Computer Vision Libraries — which makes use of your webcam for real-time movement detection. You’ll learn how to use Hand Gesture Recognition in this tutorial, as well as how to communicate and make use of the data generated by the Hand Gesture Recognition in order to control a player in your Unity game.

Specifically, you’ll learn:

  • The basics of communication protocols in networking.
  • What OpenCV is and some of its applications.
  • How to install, set up and run the Python version of OpenCV on your system.
  • How to send data to a specific port via sockets in Python.
  • How to read data from a socket with the help of a UDP Client in Unity.
Note: This tutorial assumes you are familiar with Unity and have an intermediate knowledge of C#. If you need some refreshers, you can check out our other Unity tutorials. You will also need to know your way around your operating system, as you will need to install OpenCV and Python. Installation instructions are provided below!

Getting Started

You’ll need the following for this tutorial:

  • The latest stable copy of Unity installed on your machine — currently 2018.2.
  • A code editor of your choice.
  • Python and OpenCV installed (installation instructions in the appropriate section).

For now, assuming you have Unity set up, download the project materials using the “Download Materials” link at the top or bottom of this tutorial.

Unzip the contents. Inside the Starter folder, you will find two more folders:

  1. Unity3D, which contains the Unity Project.
  2. Python, which contains Recognition.py and requirements.txt.

Open the Unity3D folder as a Project with Unity.

Note: If you are using the Unity Hub, the Unity3D folder may not be recognized as a valid Project folder. In that case, quit the Unity Hub and start Unity without using the Hub.

Exploring the Starter Project

Take a look at the folder structure in the Project window:


Here’s what each contains:

  • Animation Controllers: Contains the Animation Controller for the Player.
  • Animations: Contains the Idle and Jumping animations for the Player.
  • Materials: Contains the necessary materials required for this tutorial.
  • Prefabs: Contains the Models for the Gym and the Player.
  • Scenes: Contains the Main scene.
  • Scripts: Contains the PlayerController script, which will store the OpenCV logic for controlling player movement.
  • Sounds: Music and sound files for the project are kept here.
  • Textures: Contains the main textures for the project.

Setting Up Your Scene

Open the Main scene inside the Scenes folder.

You’ll see a gym, some nice motivational posters (all gyms need them) and the star of the show: Mr. Jumpy McJumper.

To save you the hassle of fiddling with the Transform values of the GameObjects in the scene, you already have all the right things in all the right places.

If you click the Play button now, you’ll hear background music playing and see the player in his Idle animation, but not much else going on yet.

You should also see the PlayerController under Managers, with a PlayerController script attached to it. This is the file you’ll add all your code to later on.

In this tutorial, you’ll use Python to create a small server, use OpenCV to detect the number of fingers your hand is holding up in the webcam feed, and use sockets to send that information to a predefined port.

Once that is working, you’ll use a UdpClient, which is part of the System.Net.Sockets namespace, to receive that data in the Unity project.

Understanding Key Concepts

Before moving on to the actual coding, take a look at the underlying concepts that will be applied throughout the rest of this tutorial.

Communication Protocols and UDP

In order to understand how the Python-OpenCV script and your Unity instance will communicate with each other, it’s important to know about the underlying principles of how data is shared between two or more applications in computer systems.

Communication protocols are formal descriptions of digital message formats and rules. They are required to exchange messages in or between computing systems and are required in telecommunications.

Two commonly used Protocols are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol).

TCP is the dominant protocol used for the bulk of internet connectivity owing to the services it provides for breaking large data sets into individual packets, checking for and resending lost packets, and reassembling packets into the correct sequence — sending and receiving email (SMTP) and browsing the web, for example.

In contrast, applications that communicate using the UDP protocol just send the packets and don’t check or wait for an acknowledgement before sending the next packet, which means that they have much lower bandwidth overhead and latency.

Some common real-world applications of UDP: tunneling/VPN, live media streaming services (where it’s OK if some frames are lost) and, most commonly, online multiplayer games like Counter-Strike: Global Offensive or Dota 2.
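The difference is easy to see with Python’s built-in socket module. In this small sketch (port 50000 is an arbitrary port assumed to be closed), the UDP send “succeeds” even though nobody is listening, while TCP refuses to do anything without a connection first:

```python
import socket

# UDP: no connection, no acknowledgement. Sending to a port nobody is
# listening on still "succeeds" locally -- the datagram is simply dropped.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent = udp.sendto(b"hello", ("127.0.0.1", 50000))
print(sent)  # 5 bytes handed to the OS; delivery is never guaranteed
udp.close()

# TCP: a handshake is required before any data can flow, so connecting
# to a closed port fails immediately.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    tcp.connect(("127.0.0.1", 50000))
except ConnectionRefusedError:
    print("TCP connection refused")
finally:
    tcp.close()
```

This fire-and-forget behavior is exactly why UDP is a good fit here: if one “Jump!” message is lost, the next frame will simply produce another one.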

Now that you understand a little bit about how the data will be communicated between Unity and our Python instance, here’s a little bit about OpenCV and how it will help in detecting the fingers of your hand.

OpenCV

OpenCV is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage, then Itseez (which was later acquired by Intel). The library is cross-platform and free for use under the open-source BSD license. OpenCV also supports the deep-learning frameworks TensorFlow, Torch/PyTorch and Caffe.

In this tutorial, you will use the Python API for OpenCV to detect the number of fingers your hand displays when it is open as opposed to when you make a fist (zero fingers). It will then send a message to a predefined port using sockets, which will be used to trigger an action in the Unity project.

A broad overview of the steps performed by OpenCV to detect the number of fingers:

  1. Frames are captured from the camera (webcam) as images.
  2. A Gaussian Blur is applied to reduce noise in the image and to separate out the outlines of the fingers for the later processing stages. (Left: normal image; Right: blurred image.)
  3. A binary image is created by reducing the image to just two colors: the skin color is replaced with white and everything else with black.
  4. Contour detection is applied to find the outline of the hand and its convexity defects (the valleys between the fingers).
  5. Based on the number of defects detected, the number of fingers is calculated.
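The counting in steps 4 and 5 usually comes down to measuring the angle at each convexity defect (the valley between two fingertips): a sharp angle means a gap between two raised fingers. Here is that geometric check in plain Python; the coordinates and the simple gaps-plus-one counting rule are illustrative assumptions, not the exact logic in Recognition.py (there, the defect points come from cv2.convexityDefects):

```python
import math

def defect_angle(start, far, end):
    """Angle in degrees at the defect point `far`, between fingertips `start` and `end`."""
    a = math.dist(start, end)  # distance between the two fingertips
    b = math.dist(start, far)
    c = math.dist(end, far)
    # Law of cosines: the angle opposite side a, i.e. the angle at the valley point.
    return math.degrees(math.acos((b * b + c * c - a * a) / (2 * b * c)))

def count_fingers(defects, angle_threshold=90.0):
    """Count sharp defects as gaps between fingers; no gaps is treated as a fist."""
    gaps = sum(1 for s, f, e in defects if defect_angle(s, f, e) < angle_threshold)
    return gaps + 1 if gaps else 0

# Hypothetical (fingertip, valley, fingertip) triples for an open hand:
open_hand = [((0, 0), (5, 9), (10, 0)),
             ((10, 0), (15, 9), (20, 0)),
             ((20, 0), (25, 9), (30, 0)),
             ((30, 0), (35, 9), (40, 0))]
print(count_fingers(open_hand))  # 5
print(count_fingers([]))         # 0 (fist)
```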

You can read through the Recognition.py script to get an idea of what each line of code is doing.

Once the number of fingers is recognized, Python’s socket library is used to send the relevant data via UDP to port number 5065 (line number 128 in Recognition.py).
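That send step takes only a few lines with Python’s socket module. The sketch below approximates what Recognition.py does (see the script itself for the exact code; the function name send_jump is made up here for illustration):

```python
import socket

# A UDP socket: no connection setup is needed before sending.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def send_jump(port=5065, host="127.0.0.1"):
    # Fire-and-forget: encode the message and send it to the listening port.
    sock.sendto("JUMP!".encode("utf-8"), (host, port))
```

Whatever is listening on that port (in this tutorial, the UdpClient in Unity) receives the datagram as raw bytes and decodes it back into a string.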

Installing Additional Software

Setting Up Python and OpenCV

The process of setting up OpenCV varies a lot across the two major operating systems supported by Unity (Windows and macOS).

Follow the instructions below to set up OpenCV and Python according to your operating system.

Windows

  1. Download and install Anaconda for Python 3.6 from https://www.continuum.io/downloads.
  2. Make sure to check both of these options while installing:
     – Register Anaconda as my default Python.
     – Add Anaconda to my PATH environment variable.
  3. After Anaconda is installed, open Command Prompt as Administrator and execute the following command to install the required packages:

     conda install -c menpo opencv

     You will be asked Proceed ([y]/n)? Type ‘y’ and press Enter.
  4. Once all packages have successfully installed, test your install by executing the following commands in the command prompt:

     python

     >>> import cv2

If you do not see any error, it means OpenCV has been successfully installed.

macOS

  1. Download and install Anaconda for Python 3 from https://www.continuum.io/downloads.
  2. After Anaconda is installed, open Terminal (Applications ▸ Utilities ▸ Terminal) and execute the following command to install the required packages:

     conda install -c menpo opencv

     You will be asked Proceed ([y]/n)? Type ‘y’ and press Enter.
  3. Once all packages have successfully installed, test your install by executing the following commands:

     python

     >>> import cv2


If you do not see any error, it means OpenCV has been successfully installed.

Getting the Python Server Running

Now that you have the theoretical part out of the way, start with getting your Hand Gesture Recognition working.

Open a command line utility, such as Terminal (if you’re using macOS) or CMD (if you’re using Windows), and change the directory to Starter ▸ Python.

Now, run the command python Recognition.py

Your webcam should become active and two new windows will open: “Full Frame” and “Recognition.”

The Full Frame window shows the complete frame of what your webcam is capturing. However, you will process only a small part of that (the frames within the green outline box) where the user’s hand is supposed to be.

Now, align your hand so that it is completely inside the green box. You should see it in effect in the “Recognition” window, with two previews of your hand:

  1. The outline of your hand.
  2. Your hand as seen by the webcam, with the outline overlaid on it as detected by OpenCV.

If everything works as expected, you should be able to see something like this:

The red colored outline makes up a complete polygon area covering your entire hand, and the green colored outline joins from point to point around the tips of your fingers:

Note: Make sure you have a Monochromatic background behind your hand and good lighting conditions for the detection to work flawlessly.

To trigger a Jump action, completely open your hand and then close it to make a fist. If you do this properly, you’ll see “Jump Action Triggered!” in the command line logs:

When a fist is detected, the string “JUMP!” is sent via sockets to port 5065.
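One plausible way such an open-then-fist trigger can be implemented is to watch the per-frame finger count and fire only on the transition from an open hand to zero fingers, rather than on every frame showing a fist. The actual decision logic lives in Recognition.py; this class and its thresholds are an illustrative sketch:

```python
class JumpDetector:
    """Fires once when the hand goes from open (>= 4 fingers) to a fist (0)."""

    def __init__(self, open_threshold=4):
        self.open_threshold = open_threshold
        self.was_open = False

    def update(self, finger_count):
        # Trigger only on the open -> fist transition, not on every fist frame.
        triggered = self.was_open and finger_count == 0
        self.was_open = finger_count >= self.open_threshold
        return triggered

detector = JumpDetector()
frames = [0, 5, 5, 0, 0, 5, 0]  # per-frame finger counts from OpenCV
print([detector.update(c) for c in frames])
# [False, False, False, True, False, False, True]
```

Tracking the previous state this way is what prevents a held fist from spamming jump messages.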

Check out line number 126 in Recognition.py to see the exact code that is being executed.

To stop the Python server, press the ‘Q’ key on your keyboard with one of the OpenCV windows active, or press Control + C with your command line window active.

Next, learn how to receive this data in Unity.

Receiving Data Via UDP in Unity

Now that you understand the basics of communication protocols and have Hand Detection working with OpenCV, the following steps will guide you through receiving that data in your Unity instance. Open the Main scene from the Scenes folder to get started.

In the Hierarchy window, select Managers ▸ PlayerController. In the Inspector window, you’ll see that a script called PlayerController.cs is attached to it.

Open PlayerController.cs in your favorite code editor. You should see six comments.

You’ll now add code snippets below these comments to add the necessary functionality to your project.

To start, add the following code below the comment // 1. Declare Variables. These types live in the System.Threading, System.Net.Sockets and UnityEngine namespaces, and later snippets also use System, System.Net and System.Text; the starter script should already include the matching using directives, but check the top of the file if you see compile errors.

// 1. Declare Variables

Thread receiveThread; //1
UdpClient client; //2
int port; //3

public GameObject Player; //4
AudioSource jumpSound; //5
bool jump; //6

Looking at each piece comment-by-comment:

  1. Declare a variable of class Thread: This will be used to start a thread that will be continuously running in the background.
  2. Declare a variable of class UdpClient: This will listen on the predefined port and receive the data sent to it.
  3. An integer type variable that stores the port number.
  4. A reference to the Player.
  5. An AudioSource type variable to store the reference to the ‘Jump’ sound that will be played whenever the Player jumps.
  6. A Boolean variable. The value of this variable will be checked in each frame and, based on that, the Jump action will be triggered.

Save the file and go back to the editor.

From the Hierarchy window, select Managers ▸ PlayerController to make it active and see its properties in the Inspector. Now, select PlayerObject ▸ Player and drag-and-drop it onto the placeholder for the Player GameObject in PlayerController script:

Also, with the PlayerController selected, add an AudioSource component to it by selecting Add Component ▸ Audio ▸ Audio Source.

In the Inspector window, de-select Play On Awake.

From the Project window, open the Sounds folder, then drag the JumpSound audio file onto the AudioClip placeholder in the Inspector window:

Now that you have all the necessary variables declared and the GameObject references set, you’ll add pieces of program logic.

Add the following method under // 2. Initialize variables.

// 2. Initialize variables

void Start () 
{
  port = 5065; //1 
  jump = false; //2 
  jumpSound = gameObject.GetComponent<AudioSource>(); //3

  InitUDP(); //4
}

This is fairly straightforward:

  1. Variable port is initialized with the value 5065 (the same port the Python script sends to).
  2. Variable jump is initialized to Boolean false. This variable will be set to true whenever you get the “Jump!” message from the Python instance.
  3. A reference to the AudioSource component attached to the PlayerController GameObject is stored in the variable jumpSound.
  4. The InitUDP() method is called. Don’t worry about the compiler error; you will add the code for it in the next step.

In order to read the data arriving on a given port via UDP, a thread has to be created and set to run in the background.

If you’re unfamiliar with threads: they are components of a process that execute concurrently, sharing resources such as memory within that process. In simple terms, a thread can be started to do work (such as polling for UDP data, in this case) in the background while the rest of your Unity script code continues to run in your game process.

When you are dealing with computationally expensive or long-term operations, threads are very useful. In addition to performing network communication, they are also commonly used for running AI sub-processes in the background, running path-finding algorithms, performing file operations and much more.
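For comparison, here is the same pattern (a background thread blocking on a socket while the main program keeps working) sketched in Python. The port is chosen by the OS in this sketch; the tutorial’s Unity code will use the fixed port 5065 instead:

```python
import socket
import threading

received = []
ready = threading.Event()

# Bind a UDP socket; port 0 lets the OS pick any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
port = server.getsockname()[1]

def receive_data():
    # Blocks on recvfrom without freezing the main thread.
    data, _ = server.recvfrom(1024)
    received.append(data.decode("utf-8"))
    ready.set()

# daemon=True mirrors IsBackground = true in C#: the thread
# won't keep the process alive on its own.
thread = threading.Thread(target=receive_data, daemon=True)
thread.start()

# Meanwhile the main thread stays free; here it plays the sender's role.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"JUMP!", ("127.0.0.1", port))

ready.wait(timeout=5)
print(received)  # ['JUMP!']
```

The flag-plus-main-loop handoff (the thread records what arrived, the main loop reacts to it) is exactly what the Unity script does with its jump variable and Update() method.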

Now, add the following method under // 3. InitUDP.

// 3. InitUDP

private void InitUDP()
{
  print ("UDP Initialized");

  receiveThread = new Thread (new ThreadStart(ReceiveData)); //1 
  receiveThread.IsBackground = true; //2
  receiveThread.Start(); //3
}

This method creates a new thread, and gives the method ReceiveData() as an argument, which will be defined in the next step, for it to handle all the data.

Take a look at what each line of code means:

  1. Variable receiveThread is initialized as a new Thread, with the method ReceiveData() as its entry point.
  2. The thread is marked as a background thread, meaning it runs alongside your game code and won’t keep the application alive by itself.
  3. Thread receiveThread is started.

Once a separate thread has been initiated, you’ll need a method to actually read the data that the Python-OpenCV server sends to the predefined port.

Add the following lines of code below // 4. Receive Data:

// 4. Receive Data

private void ReceiveData()
{
  client = new UdpClient (port); //1
  while (true) //2
  {
    try
    {
      IPEndPoint anyIP = new IPEndPoint(IPAddress.Parse("0.0.0.0"), port); //3
      byte[] data = client.Receive(ref anyIP); //4

      string text = Encoding.UTF8.GetString(data); //5
      print (">> " + text);

      jump = true; //6

    } 
    catch(Exception e)
    {
      print (e.ToString()); //7
    }
  }
}

A comment-by-comment explanation is below:

  1. Variable client is initialized as a UdpClient listening on port 5065.
  2. A while loop is started. You could use a variable as an exit condition if you need one, but for the purposes of this tutorial, this will do just fine. Inside the body, wrapping the receiving code in a try/catch is good practice: if an error occurs or no data is received, the program won’t crash, and logging the exception makes errors and bugs easier to track down.
  3. The IP endpoint (where the value “Jump!” will be read from) is declared.
  4. Data read from the IP endpoint declared above is stored in the variable data in binary form.
  5. The binary data is decoded into a UTF-8 string and stored in the text variable.
  6. Since the only data being sent from the Python instance is the string “Jump!”, the Boolean variable jump is set to true. Later on, you’ll add the Update() method, which is called once every frame and checks the value of this variable; if it’s true, it triggers the Jump action on the player.
  7. If an exception occurs, it’s logged to the console.

Now that all the checks for getting the data are in place, add the functionality to trigger the Jump animation and play the Jump sound.

Add the following method below // 5. Make the Player Jump:

// 5. Make the Player Jump

public void Jump()
{
  Player.GetComponent<Animator>().SetTrigger ("Jump"); //1
  jumpSound.PlayDelayed(1f); // Play the Jump sound with a one-second delay to match the animation
}

Save the file and return to the editor. From the Hierarchy window, select the player from PlayerObjects ▸ Player and open the Animator window.

You will see that there are two animations baked within the player: “Idle” (the default Animation) and “Jump”, which is set to run when “Jump” Trigger is set.

  1. The first line in the Jump() method sets the Trigger as “Jump” so that the player moves from “Idle” animation to the “Jump” animation and back.
  2. The next line simply plays the Jump sound with a one-second delay, in order to be in sync with the Jump animation.

Now that you have added all the necessary logic for initializing the required thread and for checking and receiving data via UDP, all that’s left is to add a check for the value of the Boolean variable jump in the Update() method.

Finally, add the following lines of code after the comment // 6. Check for variable value, and make the Player jump!.

// 6. Check for variable value, and make the Player jump!

void Update () 
{
  if(jump == true)
  {
    Jump ();
    jump = false;
  }
}

Here, the Update() method, which is called once every frame, checks whether the Boolean variable jump is set to true.

When it is, it calls the Jump() method and then sets the value of jump back to false so that the player isn’t continuously jumping.

And that’s it! That’s all the code you needed to add to get the project working.

Finally, save the PlayerController.cs script and go back to the Unity editor.

As done previously, type python Recognition.py and press Enter in your command line utility. Your Python OpenCV instance should now be running.

Simultaneously, click on the Play button in Unity to play the scene.

If you’ve done everything correctly, you should now be able to make a fist with your hand and it will make the player in Unity jump!

Here’s how it should look:

And that’s it! You now have a working example project of how to use OpenCV with Unity.

As mentioned earlier, if you got stuck at some point or have any errors, you can find the complete project using the “Download Materials” link at the top or the bottom of this tutorial.

Where to Go From Here?

The aim of this tutorial was to serve as a template for you to make more applications by harnessing the power of a Computer Vision library like OpenCV and a game engine as versatile as Unity.

There are several things you can do at this point, such as send different values depending on the number of fingers detected by OpenCV to perform different operations in your Unity project.

If you want to learn more about Hand Gesture Recognition, you should check out the official OpenCV tutorial docs.

To learn more about Sockets, you can check out this tutorial. (https://www.geeksforgeeks.org/socket-programming-python/)

To learn more about threads and how multithreading works in .NET, check out http://www.yoda.arachsys.com/csharp/threads/.

In addition to performing Hand Gesture Recognition, there are many more things you can do with OpenCV, such as Object Detection, Face Recognition, Template Matching, SLAM, building your own self-driving car, making Augmented Reality applications and much more!

If you have any questions or comments, or you just want to show what you experimented with from the learnings of this tutorial, join the discussion below!
