Mimic Snapchat Filters Programmatically
Learn to create a facial recognition type app that mimics the filters feature so popular with Snapchat users, for use in your own mobile app.
Join the DZone community and get the full member experience.
Join For FreeSince you probably own a smartphone, you may have noticed someone from your entourage popping up on social networks with a flower crown or some dog stuff on his head. This effect is produced by the Snapchat app, is called filters, and every competitor of the app is copying it nowadays.
These filters can be regarded as a computer vision and programming challenge. In this blog post, we'll try to answer some questions about these filters: How they are actually made? Which software libraries are needed to mimic their behavior? Finally, we'll implement some famous filters using Python or whatever language that support HTTP requests with the help of the PixLab API (more on this later).
With no further ado, let's dig a little bit onto this Snapchat output (in fact, it is made by our program that we'll implement below):
To produce such an effect, two phases are actually needed: analysis and processing.
Computer Vision to the Rescue
The analysis phase is always the first pass and is the most complicated. It requires some computer vision algorithms running under the hood and performing the following tasks:
Face Detection
Given an input image or video frame, it finds all present human faces and outputs their bounding box (the rectangle coordinates in the form: X, Y, Width & Height).
Face detection has been a solved problem since the early 2000s, but faces some challenges including detecting tiny, partial and non-frontal faces. The most widely used technique is a combination of Histogram of Oriented Gradients (HOG for short) and Support Vector Machine (SVM) that achieves mediocre to relatively good detection ratios given a good quality image, but this method is not capable of real-time detection at least on the CPU.
Here is how the HOG/SVM detector works:
Given an input image, compute the pyramidal representation of that image, which is a pyramid of multi-scaled (maybe Gaussian) downed version of the original image. For each entry on the pyramid, a sliding window approach is used. The sliding window concept is quite simple. By looping over an image with a constant step size, small image patches, typically of size 64 x 128 pixels, are extracted at different scales. For each patch, the algorithm makes a decision if it contains a face or not. The HOG is computed for the current window and passed to the SVM classifier (linear or not) for the decision to take place (face or not). When done with the pyramid, a non-maxima suppression (NMS for short) operation usually takes place in order to discard stacked rectangles. You can read more about the HOG/SVM combination here.
At Symisc, we'll be soon launching a general purpose embedded object detection framework (including face detection) that achieves real-time performance on the CPU using a combination of random-forest and neural networks.
Facial Landmarks
This is the next step in our analysis phase and works as follows:
For each detected face, the output is the local region coordinates for each member or facial feature of that face. This includes the eyes, bone, lips, nose, mouth,... coordinates usually in the form of points (X,Y).
Extracting facial landmarks is a relatively cheap operation for the CPU given a bounding box (a cropped image with the target face), but quite difficult to implement for the programmer unless some not-so-fast machine learning techniques such as training and running a classifier is used.
You can find out more about extracting facial landmarks here or this PDF: One millisecond face alignment with an ensemble of regression trees (it's a scientific paper, you have been warned!).
In some obviously useful cases, face detection and landmarks extraction are combined into a single operation.
Dlib requires face detection to take place at first and shape extraction next, so it's a two-step operation. PixLab achieves that in a single call via the facelandmarks API endpoint that we will be using later.
We move now to the next part of our quest for the best snap: the processing phase.
Image & Video Processing
Once we have the facial landmarks, 70% of the work is done; all we have to do right now is to superimpose the target filter such as the flower crown, dog nose, etc. on top of the desired region of the face like the bone for the flower crown case. This operation is named compositing. That is, combining multiple visual elements from separate sources into a single image.
For the sake of simplicity, we'll stick with image processing only. Video processing is similar in concept but requires some additional steps such as extracting each frame using a decoder (like FFmpeg or CvCapture from OpenCV) and treating that frame exactly like you would do with an image.
Now that we understand how those filters are created, it's time to start producing a few ones using some code in the next section!
Finding the Best Software Library
Since there is a huge gap between theory and practice, finding a software library, whether open source or not, that is capable of detecting faces, extracting their landmarks with video and image processing capability, and above all, is embeddable in your app, is not an easy task.
In fact, no software library is capable of doing that, not even the popular OpenCV library that you may think of at first. OpenCV has no landmark extraction capability, a mediocre face detector, and weighs a ton, but is built with very good video and image processing capability, I must say.
Your best bet if you are looking for an embedded solution is mixing Dlib with the MagickWand C library from the ImageMagick project, but compiling and setting up these two in your app require another
blog post
book (one is written in C++11, the other plain C, so...). Stasm is relatively good at facial feature extraction but requires a third-party library for face detection. CLM-framework is also capable of facial landmark extraction, but again, no face detector is built-in. GPUImage is a library of choice for simple image and video processing but appears to be iOS exclusive.
libccv also seems like a potential candidate. It is shipped with a face detector without a landmark extractor though, a few image manipulation routines (didn't find the composite routine), but it has not been updated in at least two years.
Note that we are talking about an embedded solution here, so we have explicitly omitted the nice Python libraries of NumPY or SciKit, unless you are embedding the Python interpreter in your own app!?
RESTful APIs Come to the Rescue
Fortunately for the app builder, producing those filters is quite easy if some cloud vision service is used instead of building and compiling your own. All you need to do is to make a simple HTTP request to the target service for the analysis phase to take place on behalf of you. The notable cloud providers are:
- Microsoft Cognitive Services.
- Google Vision API.
- PixLab - Machine Vision & Media Processing APIs (Our service).
The former two (Microsoft and Google) offer machine vision only. In others words, you'll be able to detect faces and extract their shapes using state-of-the-art machine learning algorithms, but it is up to you to perform the processing phase. That is, composite the flower crown or whatever filter using your own image processing library.
PixLab, on the other hand, offers both computer vision and media processing as a single set of unified RESTful APIs.
What Is PixLab?
As I said, PixLab is a machine learning SaaS platform with a set of unified RESTful APIs for all your media analysis and processing tasks. It is shipped with over 130 commands (API endpoints) including:
- Face detection, recognition, emotion, generation, lookup, landmarks.
- Content Moderation & Extraction: nsfw, sfw, urlcapture, header, ocr, tagimg.
- Pixel Generation, including human faces and landscapes.
- Image processing: blur, composite, negate, grayscale, oilpaint, etc.
- The ability to train your own object detector.
- Media encryption & decryption.
You are invited to take a look at the Github sample page to see the possible transformations you can achieve on your input media, such as content filtering, blurring/cropping human faces, image to speech, meme generation, and much more.
Enough talking, let's start programming our filters...
Programming Our First Filter
Given an input image with some pretty faces:
and this flower crown:
(located at pixlab.xyz/images/flower_crown.png)
The output will be something like this:
Using this code:
import requests
import json
# Detect all human faces & extract their facial landmarks via `facelandmarks`.
# Once done, mimic the famous Snapchat flower crown filter.
# Only three commands are actually needed in order to mimic the Snapchat filters:
# face landmarks: https://pixlab.io/#/cmd?id=facelandmarks
# smart resize: https://pixlab.io/#/cmd?id=smartresize
# merge: https://pixlab.io/#/cmd?id=merge
# Optionally: blur, grayscale, drawtext, oilpaint, etc. for cool background effects.
# The following is target image that we'll superpose our filter on top of it.
# This image must contain at least one face. free free to change the link to whatever your want.
# Note that you can upload your own images from your app very easily. Refer to the docs for additional info.
img = 'https://ak6.picdn.net/shutterstock/videos/10819841/thumb/8.jpg'
# The flower crown to be composited on top of the target face
flower_crown = 'http://pixlab.xyz/images/flower_crown.png'
# You PixLab API key
key = 'Your_PixLab_Key'
# This list contain all the coordinates of the regions where the flower crown should be
# Composited on top of the target face later using the `merge` command.
coordinates = []
# First off, call `facelandmarks` and extract all present faces plus their landmarks.
print ("Detecting and extracting facial landmarks..")
req = requests.get('https://api.pixlab.io/facelandmarks',params={
'img': img,
'key': key,
})
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
exit();
total = len(reply['faces']) # Total detected faces
if total < 1:
# No faces were detected
print ("No faces were detected..exiting")
exit()
print(str(total)+" faces were detected")
# Iterate all over the detected faces and make our flower crown filter..
for face in reply['faces']:
cord = face['rectangle']
# Show the face coordinates
print ("Coordinates...")
print ('\twidth: ' + str(cord['width']) + ' height: ' + str(cord['height']) + ' x: ' + str(cord['left']) +' y: ' + str(cord['top']))
# Show landmarks:
print ("Landmarks...")
landmarks = face['landmarks']
print ("\tBone Center: X: " + str(landmarks['bone']['center']['x']) + ", Y: "+str(landmarks['bone']['center']['y']))
print ("\tBone Outer Left: X: " + str(landmarks['bone']['outer_left']['x']) + ", Y: "+str(landmarks['bone']['outer_left']['y']))
print ("\tBone Outer Right: X: "+ str(landmarks['bone']['outer_right']['x'])+ ", Y: "+str(landmarks['bone']['outer_right']['y']))
# More landmarks on the docs..Let's make our flower crown filter now
# Resize the flower crown which is quite big right now to exactly the face width using smart resize.
print ("Resizing the snap flower crown...")
req = requests.get('https://api.pixlab.io/smartresize',params={
'img':flower_crown,
'key':key,
'width': 20 + cord['width'], # Face width
'height':0 # Let Pixlab decide the best height for this picture
})
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
exit()
else:
fit_crown = reply['link']
# Composite the flower crown at the bone center region
coordinates.append({
'img': fit_crown, # The resized crown flower
'x': landmarks['bone']['center']['x'],
'y': landmarks['bone']['center']['y'] - 10,
'center': True,
'center_y': True
})
# Finally, Perform the composite operation
print ("Composite operation...")
req = requests.post('https://api.pixlab.io/merge',
headers={'Content-Type':'application/json'},
data=json.dumps({
'src':img, # The target image.
'key':key,
'cord': coordinates # The coordinates list filled earlier with the resized images (i.e. The flower crown & the dog parts) and regions of interest
})
)
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
else:
# Optionally call blur, oilpaint, grayscale, meme for cool background effects..
print ("Snap Filter Effect: "+ reply['link'])
The program output when run should look like:
Detecting and extracting facial landmarks..
4 faces were detected
Coordinates...
width: 223 height: 223 x: 376 y: 112
Landmarks...
Nose: X: 479.3, Y: 244.2
Bottom Lip: X: 481.7, Y: 275.3
Top Lip: X: 481.6, Y: 267.1
Chin: X: 486.7, Y: 327.3
Bone Center: X: 490, Y: 125
Bone Outer Left: X: 356, Y: 72
Bone Outer Right: X: 564, Y: 72
Bone Center: X: 490, Y: 125
Eye Pupil Left: X: 437.1, Y: 177.3
Eye Pupil Right: X: 543.7, Y: 173.3
Eye Left Brown Inner: X: 458.2, Y: 161.1
Eye Right Brown Inner: X: 504.9, Y: 156.1
Eye Left Outer: X: 417.5, Y: 180.9
Eye Right Outer: X: 559.4, Y: 175.6
Resizing the snap flower crown...
Composite operation...
...
Snap Filter Effect: https://pixlab.xyz/24p596187b0b1b68.jpg
If this is the first time you've seen the PixLab API in action, your are invited to read the excellent introduction to the API in 5 minutes or less.
To mimic such behavior, only three commands (API endpoints) were actually needed to produce such a filter:
- facelandmarks is the analysis command we called first on line 29 of the Python gist. As said earlier, it is an all in one operation. It tries to detect all present human faces, extract their rectangle coordinates and more importantly, it outputs the landmarks for each facial feature such as the eyes, bone, nose, mouth, chin, lips, etc. of the target face. We'll use these coordinates later to composite stuff on top of the desired face region. In our case, only the bone region coordinates were actually needed to composite the flower crown. Refer to the PixLab documentation for additional information on the facelandmarks command.
- Now for each detected face, perform the following processing operations (we are done with the analysis phase at this stage):
- smartresize is called next on line 84 in order to fit the image to be composited (the flower crown) to the desired dimension such as the bone width of the target face. This is an essential step since the flower crown is quite big at this stage compared to the bone width. The bone width is assumed to be the same as the face width so we use that value. The engine is smart enough to calculate the height for us so we leave the height field untouched.
- merge aka composite is the main processing command we call next. It expects a list of coordinates (X, Y) and a list of images to be composited (the flower crown) on top of the target region (the bone center in our case). The coordinates were collected on line 97 and passed verbatim to the merge command on line 108. Refer to the PixLab documentation for additional information on the merge command.
- Optionally, you may be tempted to use some photo filter commands such as: grayscale, blur, oil paint, etc. for some cool background effects if desired.
We move now to a more complete example with the famous dog filter.
A More Complex Example
Given a input image:
and these dog facial parts:
available separately:
- Left Ear: pixlab.xyz/images/dog_left_ear.png
- Right Ear: pixlab.xyz/images/dog_right_ear.png
- Nose: pixlab.xyz/images/dog_nose.png
- Tongue: pixlab.xyz/images/dog_tongue.png
The output will be something like this:
Using this code:
import requests
import json
# Mimic the two famous Snapchat filters: The flower crown & the dog facial parts filter.
# The target image must contain at least one human face. The more faces you got, the more funny it should be!
#
# Only three commands are actually needed in order to mimic the Snapchat filters:
# face landmarks: https://pixlab.io/#/cmd?id=facelandmarks
# smart resize: https://pixlab.io/#/cmd?id=smartresize
# merge: https://pixlab.io/#/cmd?id=merge
# rotate (Optionally): https://pixlab.io/#/cmd?id=rotate
# meme (Optionally): https://pixlab.io/#/cmd?id=meme for Drawing some funny text
# Optionally: blur, grayscale, oilpaint, etc. for cool background effects.
# Target image to composite stuff on. Must contain at least one face.
img = 'https://trekkerscrapbook.files.wordpress.com/2013/09/face-08.jpg'
# The flower crown.
flower = 'http://pixlab.xyz/images/flower_crown.png'
# The dog facial parts: Left & right ears, nose & optionally the tongue
dog_left_ear = 'http://pixlab.xyz/images/dog_left_ear.png'
dog_right_ear = 'http://pixlab.xyz/images/dog_right_ear.png'
dog_nose = 'http://pixlab.xyz/images/dog_nose.png'
dog_tongue = 'http://pixlab.xyz/images/dog_tongue.png'
# Your PixLab API key
key = 'Your_Pixlab_Key'
# If set to True then composite the flower crown. Otherwise, composite the dog stuff.
draw_crown = False
# Resize an image (Dog parts or the flower crown) to fit the face dimension using smartresize.
def smart_resize(img,width,height):
print ("Resizing filter image...")
req = requests.get('https://api.pixlab.io/smartresize',params={
'img':img,
'key':key,
'width': width,
'height': height
})
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
exit()
else:
return reply['link'] # Resized image
# First step, Detect & extract the landmarks for each human face present in the image.
req = requests.get('https://api.pixlab.io/facelandmarks',params={
'img': img,
'key': key,
})
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
exit();
total = len(reply['faces']) # Total detected faces
if total < 1:
print ("No faces were detected..exiting")
exit()
print(str(total)+" faces were detected")
# This list contain all the coordinates of the regions where the flower crown or the dog facial parts should be
# Composited on top of the target face later using the `merge` command.
coordinates = []
# Iterate all over the detected faces and make our stuff
for face in reply['faces']:
# Show the face coordinates
print ("Coordinates...")
cord = face['rectangle']
print ('\twidth: ' + str(cord['width']) + ' height: ' + str(cord['height']) + ' x: ' + str(cord['left']) +' y: ' + str(cord['top']))
# Show landmarks of interest:
print ("Landmarks...")
landmarks = face['landmarks']
print ("\tNose: X: " + str(landmarks['nose']['x'] ) + ", Y: "+str(landmarks['nose']['y']))
print ("\tBone Center: X: " + str(landmarks['bone']['center']['x']) + ", Y: "+str(landmarks['bone']['center']['y']))
print ("\tBone Outer Left: X: " + str(landmarks['bone']['outer_left']['x']) + ", Y: "+str(landmarks['bone']['outer_left']['y']))
print ("\tBone Outer Right: X: "+ str(landmarks['bone']['outer_right']['x'])+ ", Y: "+str(landmarks['bone']['outer_right']['y']))
#More landmarks on the docs...
draw_crown = not draw_crown
if draw_crown:
# Resize the flower crown to fit the face width
fit_crown = smart_resize(
flower,
cord['width'] + 20, # Face width
0 # Let Pixlab decide the best height for this picture
)
# Composite the flower crown at the bone center region.
print ("\tCrown flower at: X: " + str(landmarks['bone']['center']['x']) + ", Y: "+str(landmarks['bone']['center']['y']))
coordinates.append({
'img': fit_crown, # The resized crown flower
'x': landmarks['bone']['center']['x'],
'y': landmarks['bone']['center']['y'] + 5,
'center': True,
'center_y': True
})
else:
# Do the dog facial parts using the bone left & right regions and the nose coordinates.
print ("\tDog Facial Parts...")
coordinates.append({
'img': smart_resize(dog_left_ear, cord['width']/2,cord['height']/2),
'x': landmarks['bone']['outer_left']['x'], # Adjust to get optimal effect
'y': landmarks['bone']['outer_left']['y'] # Adjust to get optimal effect
})
coordinates.append({
'img': smart_resize(dog_right_ear, cord['width']/2,cord['height']/2),
'x': landmarks['bone']['outer_right']['x'], # Adjust to get optimal effect
'y': landmarks['bone']['outer_right']['y'] # Adjust to get optimal effect
})
coordinates.append({
'img': smart_resize(dog_nose, cord['width']/2,cord['height']/2),
'x': landmarks['nose']['x'], # Adjust to get optimal effect
'y': landmarks['nose']['y'] + 2, # Adjust to get optimal effect
'center': True,
'center_y': True
})
# Finally, Perform the composite operation
print ("Composite operation...")
req = requests.post('https://api.pixlab.io/merge',
headers={'Content-Type':'application/json'},
data=json.dumps({
'src':img, # The target image.
'key':key,
'cord': coordinates # The coordinates list filled earlier with the resized images (i.e. The flower crown & the dog parts) and regions of interest
})
)
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
else:
# Optionally call blur, oilpaint, grayscale,meme for cool background effects..
print ("Snap Filter Effect: "+ reply['link'])
snapchat_dog_filter.py on Github. PHP/JS code available here.
As you may notice, no matter how complex your filter, the logic is always the same: facelandmarks first, smartresize next, and finally merge to generate the filter.
Tip: If you want to download the image output immediately without remote storage, set the Blob parameter to true in your final HTTP request (line 148).
Compositing the flower crown is quite easy to implement. All we have to do is to locate the bone center region of the target face and invoke merge with the X & Y coordinates of that region. Don't forget to set the center & center_y parameters for each entry the merge command takes to True. This will calculate the best position for the filter. The same rule applies to the dog nose, except we used the face nose coordinates.
The difficulty here is to find the best location for the left and right ear. The bone left most region is selected for the left ear and we adjusted the Y position for optimal effect. The same rule applies for the right ear. The rotate command can be of particular help here if the target face is inclined to some degree.
One Last Example
Given this woman:
and this eye mask:
located at. pixlab.xyz/images/eye_mask.png
plus this mustache:
located at. pixlab.xyz/images/mustache.png
The output will be something like this, with some text (e.g. a meme) at the bottom of the image:
Using this code:
import requests
import json
# Make an eye mask plus a mustache filter and finally draw some text on the bottom of the image.
#
# Only three commands are actually needed in order to mimic the Snapchat filters:
# face landmarks: https://pixlab.io/#/cmd?id=facelandmarks
# smart resize: https://pixlab.io/#/cmd?id=smartresize
# merge: https://pixlab.io/#/cmd?id=merge
# rotate (Optionally): https://pixlab.io/#/cmd?id=rotate
# meme (Optionally): https://pixlab.io/#/cmd?id=meme Draw some funny text
# Optionally: blur, grayscale, oilpaint, etc. for cool background effects.
# Target image to composite stuff on. Must contain at least one face.
img = 'http://pixlab.xyz/images/wm.jpg'
# The eye mask.
eye_mask = 'http://pixlab.xyz/images/eye_mask.png'
# The mustache!
mustache = 'http://pixlab.xyz/images/mustache.png'
# Your PixLab API key
key = 'My_Pix_Key'
# Resize and image (Eye mask, mustache, etc.) to fit the face dimension using smart resize.
def smart_resize(img,width,height):
print ("Resizing image...")
req = requests.get('https://api.pixlab.io/smartresize',params={
'img':img,
'key':key,
'width': width,
'height': height
})
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
exit()
else:
return reply['link'] # Resized image
# First step, Detect & extract the landmarks for each human face present in the image.
req = requests.get('https://api.pixlab.io/facelandmarks',params={
'img': img,
'key': key,
})
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
exit();
total = len(reply['faces']) # Total detected faces
if total < 1:
print ("No faces were detected..exiting")
exit()
print(str(total)+" faces were detected")
# This list contain all the coordinates of the regions where the flower crown or the dog part should be
# Composited on the target image later using the `merge` command.
coordinates = []
# Iterate all over the detected faces and make our stuff
for face in reply['faces']:
# Show the face coordinates
print ("Coordinates...")
cord = face['rectangle']
print ('\twidth: ' + str(cord['width']) + ' height: ' + str(cord['height']) + ' x: ' + str(cord['left']) +' y: ' + str(cord['top']))
# Show landmarks of interest:
print ("Landmarks...")
landmarks = face['landmarks']
print ("\tNose: X: " + str(landmarks['nose']['x'] ) + ", Y: "+str(landmarks['nose']['y']))
print ("\tTop Lip: X: " + str(landmarks['top_lip']['x']) + ", Y: "+str(landmarks['top_lip']['y']))
print ("\tChin: X: " + str(landmarks['chin']['x']) + ", Y: "+str(landmarks['chin']['y']))
print ("\tEye Pupil Left: X: " + str(landmarks['eye']['pupil_left']['x']) + ", Y: "+str(landmarks['eye']['pupil_left']['y']))
print ("\tEye Pupil Right: X: " + str(landmarks['eye']['pupil_right']['x']) + ", Y: "+str(landmarks['eye']['pupil_right']['y']))
print ("\tEye Left Outer: X: " + str(landmarks['eye']['left_outer']['x']) + ", Y: "+str(landmarks['eye']['left_outer']['y']))
print ("\tEye Right Outer: X: " + str(landmarks['eye']['right_outer']['x']) + ", Y: "+str(landmarks['eye']['right_outer']['y']))
# More landmarks on the docs...
# Do the mustache using the top lip coordinates.
print ("\tMustache...")
coordinates.append({
'img': smart_resize(mustache, cord['width']/2,0),
'x': landmarks['top_lip']['x'], # Adjust to get optimal effect
'y': landmarks['top_lip']['y'] - 50, # Adjust to get optimal effect
'center': True # Composite at the center of the X point.
})
# Do the eye mask using the top lip coordinates.
print ("\tEye Mask...")
coordinates.append({
'img': smart_resize(eye_mask, cord['width'],0),
'x': landmarks['bone']['center']['x'], # Adjust to get optimal effect
'y': landmarks['eye']['left_brow_inner']['y'], # Adjust to get optimal effect
'center': True, # Composite at the center of the X point.
'center_y': True # Composite at the center of the Y point.
})
# Finally, Perform the composite operation when we exit the loop
print ("Composite operation...")
req = requests.post('https://api.pixlab.io/merge',
headers={'Content-Type':'application/json'},
data=json.dumps({
'src':img, # The target image.
'key':key,
'cord': coordinates # The coordinates list filled earlier with the composited images & regions of interest
})
)
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
else:
snap = reply['link'];
# Snap created, let's draw some MEME at the bottom of the pic.
req = requests.get('http://api.pixlab.io/meme',params={
'img': snap,
'bottom': 'sounds good?',
'cap':True, # Capitalize text,
'strokecolor': 'black',
'key':key
})
reply = req.json()
if reply['status'] != 200:
print (reply['error'])
else:
print ("Snap Filter + Meme: "+ reply['link'])
snapchat_eye_mask_mustache_meme_filter.py on Github. PHP/JS code available here.
Note how smartresize is of particular help here. If we did not rely on it, the mustache and the eye mask would occupy the entire face instead of a small region. When called, the PixLab engine should calculate the best output dimension for us; all we have to do is to pass the desired width and the height will be automatically adjusted for us (or vice versa).
We also called the meme command for the first time to draw some text on the bottom of the image. This command is so flexible that you can supply font name, size & colors, stroke width and height. Take a look at the meme documentation for additional information.
Making Your Own Stuff
You may be tempted to execute the code listed above and see it in action. All you have to do is to put your own API key instead of the dummy one. If you don't have a key, you may generate one from your PixLab dashboard (we do offer free trials right now, so you don't have to pay anything).
If you want to try it immediately, you can generate your own API Key from the PixLab dashboard. Give it a try, you won't pay anything.
You are invited to take a look at the Github sample page for dozens of other interesting samples in action, such as censoring images based on their nsfw score, blurring human faces, making gifs, encrypting images, etc. All of them are documented on the PixLab sample page.
Conclusion
As we saw, making Snpachat filters is a two step operation. The analysis phase is always the first pass but also the most complicated. For each detected face in the image or video frame, extract and record the facial landmarks for that face. Most filters require a tiny set of landmarks, if not one like the bone region for the flower crown filter. We saw that few open source software libraries are capable of doing face detection and landmarks extraction at the same time, except Dlib and the future Symisc SoD, but alternatives are always possible.
Once you have the facial landmarks, most of the complicated work is done. We can start the processing phase which is a relatively cheap composite operation. All we have to do is to superimpose the target filter on top of the desired region of the face. We saw that GPUImage and MagickWand are potential candidates for our image processing tasks.
Our proposed solution is to rely on the PixLab Restful APIs for this kind of task since it is capable of media analysis and processing at the same time and avoids all the hassle of integrating the necessary libraries needed to perform such a task. A simple HTTP request from any programming language to the target commands:
- facelandmarks for facial landmarks extraction.
- smartresize to fit the target filter to the desired dimension such as the face width for the flower crown case.
- merge to superpose the target filter on top of the desired facial region.
- Optionally, grayscale, meme, oil paint, etc. for some background effect if desired.
- Et Voila! you're done.
Finally, if you have any suggestions or critiques, your can email the authors directly or leave a comment below.
Published at DZone with permission of Garnier Vincent. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments