The number of connected devices is skyrocketing, and every day new solutions are being introduced to process and visualize data. But it's not all sunshine and roses.
Learn how to get started with free, preprocessed CommonCrawl web crawl datasets that you can use for machine learning, natural language processing, and more.
Learn about in-memory MapReduce, the Ignite in-memory file system, and the Hadoop file system cache. Also learn how to install and configure Hadoop and Ignite.
In this post, we take a look at how you can control the humidity in a room using a Raspberry Pi, a switch, and a sensor with a dash of JavaScript and Python.
You can augment and enhance Apache Spark clusters using Amazon EC2's computing resources. Find out how to set up clusters and run master and slave daemons on one node.
When it comes to integrating and managing data, there are quite a few tasks that are downright tedious. Data engineering is a tough job, but somebody's gotta do it!
It's not about SQL vs. NoSQL, but rather when to use each option. This guide walks through the benefits of relational and non-relational databases as well as use cases.
In this post, we'll walk you through a tutorial on how to create an MVC CRUD application using KnockoutJS in conjunction with an SQL database and Visual Studio.
Securing cloud-based IoT is hard: you have to deal with a combination of local software, cloud, and hardware. Let's take a look at a possible solution.
Test your backup and restore procedures right after you install your cluster. Backups are a waste of time and space if they don't work and you can't get your data back!
Apache Hive is a powerful tool for analyzing data. When you are processing petabytes of data, it's important to know how to improve query performance.
If you're planning to invest in connected devices, make sure your middleware and communications are up to snuff. In this case, we look at how to mix Camel with a Pi.