Learn how to back up HBase data and tables in Hortonworks Sandbox 2.5 and explore the different ways a backup can be taken for HBase datasets.
RabbitMQ and Apache Kafka are two of the most popular messaging technologies on the market today. Get the insight you need to choose the right software for you.
Learn what tape data storage is, when it makes the most sense to use it, and why it has become more popular among organizations that need to retain large amounts of data.
Without best practices, storage can become unmaintainable. Automating data quality, lifecycle, and privacy provides ongoing cleansing and movement of the data in your lake.
HDF collects, curates, analyzes, and delivers real-time data to data stores quickly and easily. It can be used with Spark Streaming and Solr to process weather events.
There are many reasons that Cassandra could be the right tool for your app. Knowing your system's requirements, workloads, and future will help you make the right choice.
In the past, database approaches have required the translation of your data model design to the underlying data modeling language of the database. Redis reverses this.
The number of connected devices is skyrocketing, and new solutions are being introduced to process and visualize data every day. But it's not all sunshine and roses.
How to get started with free, preprocessed CommonCrawl web crawl datasets that you can use for machine learning, natural language processing, and more.
Learn about in-memory MapReduce, the Ignite in-memory file system, and the Hadoop file system cache. Also learn how to install and configure Hadoop and Ignite.
In this post, we take a look at how you can control the humidity in a room using a Raspberry Pi, a switch, and a sensor with a dash of JavaScript and Python.
You can augment and enhance Apache Spark clusters using Amazon EC2's computing resources. Find out how to set up clusters and run master and slave daemons on one node.
When it comes to integrating and managing data, there are quite a few tasks that are downright tedious. Data engineering is a tough job, but somebody's gotta do it!