Big Data Resources

Running Apache Spark Applications in Docker Containers

Even once your Spark cluster is configured and ready, you still have a lot of work to do before you can run it in a Docker container. But these tips can help make it easier!

August 26, 2017

by Arseniy Tashoyan

· 54,607 Views · 4 Likes

Where the Industrial Robots Roam

You might expect that industrial robots would be evenly distributed across the U.S. or concentrated in states with big high-tech industries — but you'd be wrong.

August 25, 2017

by Amy Groden-Morrison

· 1,800 Views · 3 Likes

Reasons Your Database Might Crash

This article explores the reasons databases crash, helping database administrators prevent unexpected behavior and optimize performance.

August 23, 2017

by Dzmitry Ivanou

· 28,219 Views · 7 Likes

Understanding Kafka Failover

This Kafka tutorial aims to help you understand failover for brokers and consumers; you'll run a consumer to test and prove Kafka's consumer failover.

August 23, 2017

by Jean-Paul Azar

· 48,445 Views · 12 Likes

Concept Learning: The Stepping Stone Toward Machine Learning With Find-S

Learn about one of the simplest algorithms in artificial intelligence, the Find-S algorithm, to help you get started with machine learning.

August 21, 2017

by Girish Bharti

· 25,278 Views · 5 Likes

Data Science for Java Developers With Tablesaw

Tablesaw is like an open-source Java power tool for data manipulation with hooks for interactive visualization, analytics, and machine learning. Come learn all about it!

August 20, 2017

by Larry White

· 28,077 Views · 38 Likes

What Is a Data Science Workbench and Why Do Data Scientists Need One?

The factors described in this article make data scientists self-sufficient, improve the effectiveness of their models, and accelerate the time to insight.

August 19, 2017

by Syed Mahmood

· 16,613 Views · 3 Likes

Kafka Consumer Architecture - Consumer Groups and Subscriptions

In this installment, learn about Kafka consumer architecture, consumer groups, how record processing is shared, and failover for consumers.

August 19, 2017

by Jean-Paul Azar

· 56,785 Views · 12 Likes

Kafka Producer Architecture - Picking the Partition of Records

This article covers Kafka Producer Architecture, including how a partition is chosen, producer cadence, partitioning strategies, and consumers.

August 18, 2017

by Jean-Paul Azar

· 69,079 Views · 12 Likes

Kafka Topic Architecture - Replication, Failover, and Parallel Processing

Digging deeper into Kafka architecture, this article covers the details of replication, failover, and parallel processing in this data pipeline software.

August 17, 2017

by Jean-Paul Azar

· 96,604 Views · 29 Likes

Analyst, Scientist, or Specialist? Choosing Your Data Job Title

There are tons of data job titles, including data scientist, data analyst, and data specialist. It’s important to pick one that matches your capabilities and aspirations.

August 15, 2017

by Shelby Blitz

· 10,054 Views · 7 Likes

Hadoop Distributions: Past, Present, and Future

In a world where open-source software can avoid vendor lock-in, are major Hadoop distributors discarding some of that benefit to the detriment of Hadoop users?

August 15, 2017

by Mark Chopping

· 10,270 Views · 3 Likes

Solving a Clustering Problem Using the k-Means Algorithm With Oracle

Clustering algorithms let machines group data points or items into groups with similar characteristics. See how to use the k-means algorithm with Oracle to do clustering.

August 15, 2017

by Emrah Mete

· 12,886 Views · 7 Likes

How to Order Streamed DataFrames

Many of the solutions you try for ordering streamed DataFrames will end in disappointment. Luckily, there's a light at the end of the table!

August 15, 2017

by Mahesh Chand Kandpal

· 6,578 Views · 1 Like

Testing MQTT Messaging Brokers

If you're looking to test your IoT app's communication, here's how JMeter can load test the popular MQTT protocol, with an overview of the protocol itself.

August 13, 2017

by Roman Aladev

· 17,612 Views · 7 Likes

Kafka Producer in Java

This detailed tutorial will help you create a simple Kafka producer, which allows you to publish records to the Kafka cluster.

August 10, 2017

by Jean-Paul Azar

· 107,745 Views · 14 Likes

Kafka Architecture

Learn about the architecture and functionality of Kafka, the software for building real-time streaming data pipelines, in this comprehensive primer.

August 9, 2017

by Jean-Paul Azar

· 135,303 Views · 99 Likes

The Role of Predictive Analytics in DevOps

Learn how data and predictive analysis can be used by DevOps engineers to further develop and optimize the DevOps workflow.

August 5, 2017

by Badri Srinivasan

· 11,320 Views · 4 Likes

Coffee With a Data Scientist: Tuhin Chattopadhyay, Ph.D.

In the fourth issue of DZone's Coffee With a Data Scientist, we chatted with business analytics evangelist Tuhin Chattopadhyay to glean some of his expert insights and opinions on the Big Data space.

August 4, 2017

by Michael Tharrington

· 8,630 Views · 4 Likes

Event Driven Microservices Patterns

Read about the motivation behind the switch to microservices, and some of the patterns that make these applications more scalable.

August 4, 2017

by Carol McDonald

· 66,988 Views · 75 Likes

The Latest Big Data Topics