Next-Gen Lie Detector: Stack Selection
Follow the steps of the creation of a lie detector's backend stack and learn the importance of remaining open to unconventional solutions for common tasks.
The first lie detector that relied on eye movement appeared in 2014. The Converus team, together with Dr. John C. Kircher, Dr. David C. Raskin, and Dr. Anne Cook, launched EyeDetect, a brand-new solution to detect deception quickly and accurately. This event became a turning point in the polygraph industry.
In 2021, we finished working on a contactless lie detection technology based on eye-tracking and presented it at the International Scientific and Practical Conference. As a member of the development team, I would like to share some insights into how we built the new system, particularly how we chose our backend stack.
What Is a Contactless Lie Detector and How Does It Work?
We created a multifunctional hardware and software system for contactless lie detection. This is how it works: the system tracks a person's psychophysiological reactions by monitoring eye movements and pupil dynamics and automatically calculates the final test results.
Its software consists of three applications:
- Administrator application: Allows the creation of tests and the administration of processes
- Operator application: Enables scheduling test dates and times, assigning tests, and monitoring the testing process
- Respondent application: Allows users to take tests using a special code
On the computer screen, along with simultaneous audio (either synthesized or pre-recorded by a specialist), the respondent is given instructions on how to take the test. This is followed by written true/false statements based on developed testing methodologies.
The respondent reads each statement and presses the "true" or "false" key according to their assessment of the statement's relevance. After half a second, the computer displays the next statement.
Then, the lie detector measures response time and error frequency, extracts features from recordings of eye position and pupil size, and calculates the significance of the statement, or the "probability of deception."
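The two behavioral measures mentioned above, response time and error frequency, can be illustrated with a minimal sketch. The `Response` record and the `summarize` function are hypothetical names introduced here for illustration. The sketch does not cover the eye-position and pupil features or the scoring model, which are not described in this article.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    """One answered statement (hypothetical record layout)."""
    shown_at: float      # seconds, when the statement appeared on screen
    answered_at: float   # seconds, when the respondent pressed a key
    answer: bool         # key pressed by the respondent
    expected: bool       # answer expected by the testing methodology


def summarize(responses: list[Response]) -> dict[str, float]:
    """Return the mean response time (s) and the share of unexpected answers."""
    if not responses:
        raise ValueError("no responses to summarize")
    times = [r.answered_at - r.shown_at for r in responses]
    errors = sum(r.answer != r.expected for r in responses)
    return {
        "mean_response_time": mean(times),
        "error_rate": errors / len(responses),
    }
```

In a real pipeline these values would be only two of many inputs, alongside the features extracted from the eye-tracking recordings.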
To make this more concrete, here is a comparison of the traditional polygraph and the contactless lie detector.
Criteria | Classic Polygraph | Contactless Lie Detector |
---|---|---|
Working Principle | Registers changes in GSR, cardiovascular, and respiratory activity to measure emotional arousal | Registers involuntary changes in eye movements and pupil diameter to measure cognitive effort |
Duration | Tests take from 1.5 to 5 hours, depending on the type of examination | Tests take from 15 to 40 minutes |
Report Time | From 5 minutes to several hours; written reports can take several days | Test results and reports in less than 5 minutes automatically |
Accuracy | Screening test: 85%; investigation: 89% | Screening test: 86-90%; investigation: 89% |
Sensor Contact | Sensors are placed on the body, some of which cause discomfort, particularly the two pneumatic tubes around the chest and the blood pressure cuff | No sensors are attached to the person |
Objectivity | Specialists interpret changes in responses. The specialist can influence the result. Manual evaluation of polygraphs requires training and is a potential source of errors. | Automated testing process ensuring maximum reliability and objectivity. AI evaluates responses and generates a report. |
Training | Specialists undergo 2 to 10 weeks of training, plus regular advanced training courses | Standard operator training takes less than 4 hours; administrator training for creating tests takes 8 hours. Remote training with a qualification exam. |
As you can see, our lie detector makes the process more comfortable and convenient compared to traditional lie detectors. First of all, the tests take less time, from 15 to 40 minutes. Besides, the results are available almost immediately, as they are generated automatically within minutes. Another advantage is that no sensors are physically attached to the person; such sensors can add discomfort to an already stressful environment. Operator training is also less time-consuming. Most importantly, the results' credibility is still very high.
Backend Stack Choice
Our team had experience with Python and asyncio. Previously, we developed projects using Tornado. But at that time FastAPI was gaining popularity, so this time we decided to use Python with FastAPI and SQLAlchemy (with asynchronous support).
To complement our choice of a popular backend stack, we decided to host our infrastructure on virtual machines using Docker.
Avoiding Celery
Given the nature of our lie detector, several mathematical operations require time to complete, making real-time execution during HTTP requests impractical. We developed multiple background tasks. Although Celery is a popular framework for such tasks, we opted to implement our own task manager.
This decision stemmed from our use of CI/CD, where we restart various services independently. Sometimes, services could lose connection with Redis during these restarts. Our custom task manager, extending the base aioredis library, ensures reconnection if a connection is lost.
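The reconnection idea can be sketched without any Redis dependency. This is a minimal illustration, not our actual task manager: `call_with_reconnect` and `FlakyClient` are hypothetical names, and the built-in `ConnectionError` stands in for whatever exception the real client library raises when the connection drops.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_reconnect(
    operation: Callable[[], Awaitable[T]],
    reconnect: Callable[[], Awaitable[None]],
    retries: int = 5,
    base_delay: float = 0.1,
) -> T:
    """Run operation(); on ConnectionError, wait, reconnect, and try again."""
    for attempt in range(retries + 1):
        try:
            return await operation()
        except ConnectionError:
            if attempt == retries:
                raise  # give up after the last allowed attempt
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
            await reconnect()
    raise AssertionError("unreachable")


class FlakyClient:
    """Stand-in for a Redis client (assumption): fails until reconnected twice."""

    def __init__(self) -> None:
        self.reconnects = 0

    async def get(self, key: str) -> str:
        if self.reconnects < 2:
            raise ConnectionError("connection lost")
        return f"value-of-{key}"

    async def reconnect(self) -> None:
        self.reconnects += 1
```

A caller would wrap each operation, for example `await call_with_reconnect(lambda: client.get("job:1"), client.reconnect)`, so that a service restart during deployment delays a task instead of failing it.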
Background Tasks Architecture
At the project's outset, we had a few background tasks, which increased as functionality expanded. Some tasks were interdependent, requiring sequential execution. Initially, we used a queue manager where each task, upon completion, would trigger the next task via a message queue. However, asynchronous execution could lead to data issues due to varying execution speeds of related tasks.
We then replaced this with a task manager that uses gRPC to call related tasks, ensuring execution order and resolving data dependency issues between tasks.
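The ordering guarantee can be shown with a small sketch that awaits each task only after the tasks it depends on have finished. The task names and the shared `context` dictionary are hypothetical, and the plain `await` stands in for the gRPC call; the sketch assumes Python 3.9+ for `graphlib`.

```python
import asyncio
from graphlib import TopologicalSorter
from typing import Awaitable, Callable

Task = Callable[[dict], Awaitable[None]]


async def run_in_order(
    tasks: dict[str, Task], depends_on: dict[str, set[str]]
) -> tuple[list[str], dict]:
    """Await every task after all of its dependencies; return order and data."""
    context: dict = {}
    executed: list[str] = []
    graph = {name: depends_on.get(name, set()) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        await tasks[name](context)  # stands in for a blocking remote call
        executed.append(name)
    return executed, context


# Hypothetical three-step pipeline; each step needs the previous step's data.
async def extract_features(ctx: dict) -> None:
    ctx["features"] = [0.2, 0.4]


async def score_statements(ctx: dict) -> None:
    ctx["score"] = sum(ctx["features"])


async def build_report(ctx: dict) -> None:
    ctx["report"] = f"score={ctx['score']:.1f}"


PIPELINE = {
    "build_report": build_report,
    "score_statements": score_statements,
    "extract_features": extract_features,
}
DEPENDS_ON = {
    "score_statements": {"extract_features"},
    "build_report": {"score_statements"},
}
```

Because each call is awaited before the next one starts, a dependent task can no longer run ahead of the data it needs, regardless of how fast the individual tasks are.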
Logging
We couldn't use popular error-tracking systems like Sentry for a few reasons. First, we didn't want to use any third-party services managed and deployed outside of our infrastructure, so we were limited to using a self-hosted Sentry. At that time, we only had one dedicated server divided into multiple virtual servers, and there weren't enough resources for Sentry. Additionally, we needed to store not only bugs but also all information about requests and responses, which required the use of Elasticsearch.
Thus, we chose to store logs in Elasticsearch. However, memory leak issues later led us to switch to Prometheus and Typesense. Elasticsearch often requires a huge amount of memory, which never seems to be sufficient; this is a common problem discussed in various forums. Since Typesense is written in C++, it requires considerably fewer resources. Maintaining backward compatibility between Elasticsearch and Typesense was a priority for us, as we were still determining whether the new setup would meet our needs. The decision worked quite well, and we saw improvements in resource usage.
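One way to keep two log backends interchangeable during such a migration is to write every request/response document through a small common interface. The sketch below is an assumption about how this could look, not our production code: `LogSink`, `InMemorySink`, and `RequestLogger` are hypothetical names, and the in-memory sink stands in for a real Elasticsearch or Typesense client.

```python
import json
import time
from typing import Protocol


class LogSink(Protocol):
    """Anything that can store one log document."""

    def index(self, document: dict) -> None: ...


class InMemorySink:
    """Stand-in backend (assumption); a real sink would call a search engine."""

    def __init__(self) -> None:
        self.documents: list[dict] = []

    def index(self, document: dict) -> None:
        # Round-trip through JSON to ensure the document is serializable.
        self.documents.append(json.loads(json.dumps(document)))


class RequestLogger:
    """Writes the same request/response document to every configured sink."""

    def __init__(self, sinks: list[LogSink]) -> None:
        self._sinks = sinks

    def log(self, method: str, path: str, status: int,
            request_body: dict, response_body: dict) -> dict:
        document = {
            "timestamp": time.time(),
            "method": method,
            "path": path,
            "status": status,
            "request": request_body,
            "response": response_body,
        }
        for sink in self._sinks:
            sink.index(document)
        return document
```

With this shape, the old and the new backend can receive identical documents in parallel until the new one has proven itself, after which the old sink is simply removed from the list.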
Full-Text Search (FTS)
Using PostgreSQL as our main database, we needed an efficient FTS mechanism. Based on previous experience, PostgreSQL's built-in tsquery and tsvector did not perform as well as we needed with Cyrillic text. Thus, we decided to synchronize PostgreSQL with Elasticsearch. While not the fastest solution, it provided enough speed and flexibility for our needs.
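As a rough illustration of one step in such a synchronization, the sketch below turns database rows into the newline-delimited JSON body that Elasticsearch's `_bulk` endpoint accepts. The function name, the `respondents` index, and the row layout are hypothetical; reading the rows from PostgreSQL and sending the payload over HTTP are left out.

```python
import json
from typing import Iterable


def to_bulk_payload(index: str, rows: Iterable[dict], id_field: str = "id") -> str:
    """Build an NDJSON body for the _bulk endpoint from database rows."""
    lines: list[str] = []
    for row in rows:
        # Action line: tells Elasticsearch which index and document ID to use.
        action = {"index": {"_index": index, "_id": str(row[id_field])}}
        # Source line: the searchable fields, without the primary key.
        source = {k: v for k, v in row.items() if k != id_field}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source, ensure_ascii=False))  # keep Cyrillic readable
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical row, as it might come from a PostgreSQL query.
payload = to_bulk_payload("respondents", [{"id": 7, "full_name": "Иван Петров"}])
```

Using the primary key as the document ID makes repeated synchronization runs overwrite existing documents instead of duplicating them.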
PDF Report Generation
As you may know, generating PDFs in Python can be quite complicated. This issue is rather common: a typical approach is to render an HTML file first and only then convert it to PDF, similar to how it's done in other languages.
This conversion process can sometimes produce unpredictable artifacts that are difficult to debug. Meanwhile, generating PDFs with JavaScript is much easier. We used Puppeteer to create an HTML page and then save it as a PDF just as we would in a browser, avoiding these problems altogether.
To Conclude
In conclusion, I would like to stress that this project turned out to be demanding in terms of choosing the right solutions but at the same time, it was more than rewarding. We received numerous unconventional customer requests that often questioned standard rules and best practices.
The most exciting part of the journey was implementing mathematical models developed by another team into the backend architecture and designing a database architecture to handle a vast amount of unique data. It made me realize once again that popular technologies and tools are not always the best option for every case. We always need to explore different methodologies and remain open to unconventional solutions for common tasks.