8 Steps to Mastering Your Computer Vision Development Skills
Look at 8 steps to mastering your computer vision development skills.
If you’ve recently followed the FaceApp hype on social media and tried this AI app to see what you’ll look like in old age, you already have a sense of the power behind computer vision technologies. They are still in their infancy, and more compelling computer vision use cases have yet to emerge across domains and verticals, so now is a great time to build your AI skills and meet future demand by becoming a computer vision guru.
Having talked to several developers working on AI and computer vision projects, I’ve come up with eight steps to becoming a stellar computer vision developer. However, before delving deep into each step, let’s see what cases computer vision tech is best suited for:
- Image segmentation
- Object detection
- Classification of images
- Tracking moving objects over time
- Face detection and recognition
- Optical character recognition
- Image generation
The essential skills required of a computer vision specialist today:
- Python syntax
- Mathematical analysis
- Linear algebra
- OpenCV library
- TensorFlow Deep Learning Framework
Now let’s review 8 steps to mastering computer vision skills.
Step 1: Basic imaging techniques
You can start by watching this excellent YouTube series by Joseph Redmon called “The Ancient Secrets of Computer Vision.”
Then make sure to read “Computer Vision: Algorithms and Applications” by Richard Szeliski. The book addresses such computer vision methods as image formation and processing, feature detection and matching, segmentation, feature-based alignment, computational photography, 3D reconstruction, and rendering. All in all, it should become your handbook and an essential guide to the world of computer vision development.
To exercise and practice your knowledge from the above-mentioned book, try this OpenCV tool.
The site also contains many tutorials to help you practice GUI features, image processing, video analysis, camera calibration, and solve different computer vision challenges.
Step 2: Motion tracking and optical flow analysis
Optical flow is the pattern of apparent motion of objects, surfaces, and edges between consecutive frames of an image sequence, caused by relative motion between the observer (the camera) and the scene.
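One common way to estimate optical flow between two frames is the Lucas-Kanade method, which assumes that brightness stays constant and that all pixels in a small window move together. The sketch below is only an illustration under those assumptions: the synthetic test pattern, the function names, and the single-window setup are made up for this example, and it uses plain Python lists rather than OpenCV (in practice you would more likely reach for something like `cv2.calcOpticalFlowPyrLK`).

```python
import math

def make_frame(shift_x=0.0, shift_y=0.0, size=20):
    """Smooth synthetic image (an assumed test pattern), translated by the given shift."""
    return [[math.sin(0.3 * (x - shift_x)) + math.cos(0.25 * (y - shift_y))
             for x in range(size)]
            for y in range(size)]

def lucas_kanade_window(prev, curr):
    """Estimate a single (u, v) displacement for the whole window by least squares."""
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, len(prev) - 1):
        for x in range(1, len(prev[0]) - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # horizontal gradient
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # vertical gradient
            it = curr[y][x] - prev[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # gradients too uniform to recover motion (aperture problem)
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = -[sxt, syt]
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v

prev_frame = make_frame()
curr_frame = make_frame(shift_x=0.2, shift_y=0.1)
flow = lucas_kanade_window(prev_frame, curr_frame)
print(flow)  # should land close to the true shift (0.2, 0.1)
```

The estimate is only approximate because the gradients are finite differences. Real trackers repeat this per feature window and across an image pyramid to handle larger motions.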
Take a course in Computer Vision on Udacity and pay special attention to Lesson 6 on oriented gradients. The focus of the course is to develop the intuitions and mathematics of the methods in the lectures, and then to learn about the difference between theory and practice in the problem sets.
Along with the course, rewatch Episode 8 of "The Ancient Secrets of Computer Vision" and read sections 10.5 and 8.4 of Szeliski’s book.
Step 3: Basic segmentation
In computer vision, segmentation is the process of dividing a digital image into several segments (super-pixels). The purpose of segmentation is to simplify and/or change the representation of the image to make it easier and more accessible to analyze.
For example, the Hough Transform helps find imperfect instances of objects within a particular class of shapes by a voting procedure.
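To make the voting procedure concrete, here is a minimal sketch of a Hough transform for straight lines. Each edge point votes for every (rho, theta) line that could pass through it, and the most-voted bin is taken as the detected line. The point set, bin sizes, and function name are assumptions chosen for illustration; OpenCV offers a production version as `cv2.HoughLines`.

```python
import math
from collections import Counter

def hough_strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (rho, theta_degrees, votes) of the most-voted line x*cos(t) + y*sin(t) = rho."""
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), t)] += 1  # one vote per candidate line
    (rho_bin, t), count = votes.most_common(1)[0]
    return rho_bin * rho_step, 180.0 * t / n_theta, count

# Hypothetical edge points: 50 samples on the horizontal line y = 30, plus a few outliers.
edge_points = [(x, 30) for x in range(0, 100, 2)]
edge_points += [(10, 70), (55, 12), (80, 90), (33, 47)]

rho, theta_deg, n_votes = hough_strongest_line(edge_points)
print(rho, theta_deg, n_votes)  # 30.0 90.0 50
```

The outliers barely matter because each one spreads its votes over many bins, while points on the same line pile up in a single bin. That is why the method tolerates gaps and noise in the shape.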
Watch these videos for knowledge enhancement:
Also, take a look at this Lane Finding Project for Self-Driving Car.
Step 4: Fitting
Different data require a specific fitting approach and particular algorithms. This video will be helpful!
Besides, read sections 4.3.2 and 5.1.1 of “Computer Vision: Algorithms and Applications”.
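As one illustration of why the fitting approach depends on the data, the sketch below compares ordinary least squares with RANSAC on points that contain gross outliers. RANSAC is a technique chosen here for the example, not one named above; the data, tolerance, iteration count, and function names are all assumptions.

```python
import random

def fit_line_least_squares(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_line_ransac(points, n_iters=200, tol=0.5, seed=0):
    """Fit a line to the largest consensus set found from random two-point samples."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate; skipped in this simplified sketch
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return fit_line_least_squares(best_inliers)

# Hypothetical data: 20 points on y = 2x + 1 and five gross outliers.
data = [(x, 2 * x + 1) for x in range(20)]
data += [(3, 40), (7, -25), (12, 60), (15, -10), (18, 90)]

naive_fit = fit_line_least_squares(data)
robust_fit = fit_line_ransac(data)
print(naive_fit)   # pulled away from the true line by the outliers
print(robust_fit)  # close to slope 2, intercept 1
```

Least squares is a reasonable choice when noise is small and roughly symmetric. A consensus-based method like this one is usually the safer option when the data contains mismatches or spurious detections.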
For homework, analyze detection and tracking of the vanishing point on the horizon. This will give a powerful boost to your computer vision skills.
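A starting point for this exercise could look like the sketch below. Lines that are parallel in the scene, such as lane markings, meet at the vanishing point in the image, and in homogeneous coordinates both the line through two points and the intersection of two lines are cross products. The pixel coordinates and function names are made up for the example.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def intersection(line_a, line_b):
    """Intersection of two homogeneous lines, or None if they are parallel in the image."""
    x, y, w = cross(line_a, line_b)
    if abs(w) < 1e-12:
        return None
    return (x / w, y / w)

# Hypothetical lane edges, each given by two detected points in pixel coordinates.
left_edge = line_through((120, 400), (220, 300))
right_edge = line_through((520, 400), (420, 300))

vanishing_point = intersection(left_edge, right_edge)
print(vanishing_point)  # (320.0, 200.0)
```

With real footage the edges come from noisy detections, so you would typically intersect many line pairs and take a robust estimate, then smooth the result over time to track it.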
Step 5: Matching images from different viewpoints
This YouTube playlist by Sean Mullery will come in handy.
For homework, take your own data, such as pictures of a piece of furniture shot from different angles, and use OpenCV to build a 3D object from that set of flat images.
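Before any 3D reconstruction, the same physical points have to be matched across views. A common recipe, sketched here as an assumption rather than a prescribed method, is to compare feature descriptors and keep a match only if the nearest neighbour is clearly better than the second nearest (often called the ratio test). The tiny four-dimensional descriptors below are invented for illustration; real ones would come from a detector such as ORB or SIFT.

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Return (index_a, index_b) pairs that pass the nearest-neighbour ratio test."""
    matches = []
    for i, d in enumerate(desc_a):
        ranked = sorted((math.dist(d, e), j) for j, e in enumerate(desc_b))
        (best_dist, best_j), (second_dist, _) = ranked[0], ranked[1]
        if best_dist < ratio * second_dist:  # keep only unambiguous matches
            matches.append((i, best_j))
    return matches

# Hypothetical descriptors from two views of the same object.
view_a = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (0.5, 0.5, 0.5, 0.5)]
view_b = [(0.0, 0.9, 0.1, 0.0), (0.95, 0.0, 0.05, 0.0),
          (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]

matches = match_descriptors(view_a, view_b)
print(matches)  # [(0, 1), (1, 0)]
```

The third descriptor in `view_a` is almost equally far from several candidates, so it is dropped. Discarding ambiguous matches early makes the later geometry estimation much more stable.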
Step 6: 3D scenes
Once you know how to create 3D objects from flat images, you can move on to reconstructing entire 3D scenes.
Consider taking a course on Stereo Vision, Dense Motion and Tracking available for free on Coursera.
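The core relationship behind stereo vision fits in one line: for a rectified camera pair, depth is focal length times baseline divided by disparity. The sketch below assumes a rectified pair and uses invented camera parameters and a hypothetical function name.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point seen by a rectified stereo pair."""
    if disparity_px <= 0:
        return float("inf")  # no measurable shift: point is effectively at infinity
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, 12 cm baseline.
near = depth_from_disparity(42.0, focal_px=700.0, baseline_m=0.12)
far = depth_from_disparity(7.0, focal_px=700.0, baseline_m=0.12)
print(near, far)  # roughly 2 m and 12 m
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for near ones.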
To consolidate your new knowledge, watch these videos:
For homework, try to play with 3D scene reconstruction and build a real-time application to estimate the camera pose in order to track a textured object with six degrees of freedom given a 2D image and its 3D textured model.
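The six degrees of freedom are three for rotation and three for translation. Pose estimation searches for the rotation and translation under which the 3D model points project onto their observed 2D positions, so it helps to have the forward projection clear first. The sketch below shows only that forward pinhole model, with a rotation limited to one axis for brevity; the intrinsics, pose, and function names are assumptions. The inverse problem is what a solver such as OpenCV's `cv2.solvePnP` addresses.

```python
import math

def rotation_about_y(angle_rad):
    """3x3 rotation matrix for a turn about the camera's vertical axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D model point into pixel coordinates under the given pose."""
    cam = [sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
           for i in range(3)]
    if cam[2] <= 0:
        return None  # point is behind the camera
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)

# Hypothetical setup: object 5 m in front of the camera, no rotation.
pose_rotation = rotation_about_y(0.0)
pose_translation = (0.0, 0.0, 5.0)
pixel = project_point((1.0, -0.5, 0.0), pose_rotation, pose_translation,
                      fx=800.0, fy=800.0, cx=320.0, cy=240.0)
print(pixel)  # (480.0, 160.0)
```

For tracking, the pose is re-estimated for every frame, usually starting from the previous frame's estimate.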
Step 7: Object recognition and image classification
As a framework for deep learning, TensorFlow is very convenient to use. It's one of the most popular frameworks, so you'll find plenty of examples. To start working with images in TensorFlow, go through this tutorial.
Next, using the links below, consider exploring the following topics:
- Semantic segmentation: categorization of objects, scenes, activities
- Object detection (non-maximum suppression, sliding windows, anchor boxes)
- Real-time object detection with YOLO and Darknet, region proposal networks (RPN)
- Supervised image classification
- Visual attributes
- Optical character and text recognition
- Face Detection
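Of the topics above, non-maximum suppression is small enough to sketch in full. Detectors tend to fire several overlapping boxes for one object, so the usual greedy approach keeps the highest-scoring box and discards any remaining box that overlaps a kept one too much, measured by intersection over union (IoU). The boxes, scores, and threshold below are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Hypothetical detections: two near-duplicates and one separate object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]

kept = non_max_suppression(boxes, scores)
print(kept)  # [0, 2]
```

The threshold is a trade-off: lower values remove more duplicates but can also merge genuinely distinct objects that stand close together.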
For homework, create a TensorFlow neural network that can identify a dog’s breed from an image.
Step 8: Deep Learning
It's highly recommended that you watch all 16 lectures from the Stanford University School of Engineering, which cover an array of AI and computer vision topics, from convolutional neural networks and CNN architectures to detection, segmentation, and deep reinforcement learning.
That’s pretty much it. Good luck with your new career as a computer vision consultant!