Real-Time Analytics on MongoDB Data in Power BI
We take a look at how to use a particular BI tool with MongoDB to gain insights from data and visualize this data in interesting ways.
Join the DZone community and get the full member experience.
Join For FreePower BI is expanding self-service data prep to help business analysts extract insights from big data and introducing enterprise BI platform capabilities. With recent updates, Power BI has enabled connectivity to more data sources than ever before. That said, no product is able to do everything, which is where the CData Power BI Connectors come in.
With CData, you get live connectivity to data in Power BI (meaning DirectQuery) from any of the 120+ supported sources, ranging from CRM and marketing automation to big data and NoSQL. With the CData connectors, you can access MongoDB data quickly (faster than you can with any other connector) and easily, leveraging the built-in modeling and data flattening features of the connector to create a table-like model of your schema-less data, ready to be viewed, reported and analyzed from Power BI — no code or data curation required.
Making MongoDB Data Look Relational
Our connectors provide the fastest and most flexible support for NoSQL data integration, allowing you to interact with NoSQL databases as if they were a relational database, leveraging the flexibility of NoSQL data storage with the familiarity and utility of SQL-92 queries. The CData Connectors include several features for mapping or flattening existing NoSQL data structures (like JSON objects, arrays, etc.) to simplify integration with Power BI. The options range from free-form queries to horizontal and vertical flattening to custom schema definitions. While we discuss flattening in brief below, you can read the following article to learn more: Leading-Edge Drivers for NoSQL Integration.
The explanation below is based on the following MongoDB sample document, from the MongoDB primer dataset. You can download a truncated version of the dataset and import it into your MongoDB instance using the following command:
mongoimport --db $database --collection restaurants --file primer-dataset.json
Sample Document
{
"_id" : ObjectId("5780046cd5a397806c3dab38"),
"address" : {
"building" : "1007",
"coord" : [-73.856077, 40.848447],
"street" : "Morris Park Ave",
"zipcode" : "10462"
},
"borough" : "Bronx",
"cuisine" : "Bakery",
"grades" : [{
"date" : ISODate("2014-03-03T00:00:00Z"),
"grade" : "A",
"score" : 2
}, {
"date" : ISODate("2013-09-11T00:00:00Z"),
"grade" : "A",
"score" : 6
}, {
"date" : ISODate("2013-01-24T00:00:00Z"),
"grade" : "A",
"score" : 10
}, {
"date" : ISODate("2011-11-23T00:00:00Z"),
"grade" : "A",
"score" : 9
}, {
"date" : ISODate("2011-03-10T00:00:00Z"),
"grade" : "B",
"score" : 14
}],
"name" : "Morris Park Bake Shop",
"restaurant_id" : "30075445"
}
The Relational Model
With this document in mind, we configure the CData Connector to build a schema based on the hierarchy of the data, using dot notation and array indexes to drill down into the data.
Field Name in Power BI | MongoDB Reference | Value |
---|---|---|
_id | _id | 5780046cd5a397806c3dab38 |
building | address.building | 1007 |
longitude | address.coord.0 | -73.865077 |
latitude | address.coord.1 | 40.848447 |
street | address.street | Morris Park Ave |
zipcode | address.zipcode | 10462 |
borough | borough | Bronx |
cuisine | cuisine | Bakery |
latest_grade | grades.0.grade | A |
latest_score | grades.1.grade | 2 |
name | name | Morris Park Bake Shop |
restaurant_id | restaurant_id | 30075445 |
Drilling down into the MongoDB data to build this schema is as easy as setting a few configuration properties and then modifying the generated file to change the names of the fields.
Configuring the Connector
Use the included ODBCConfigure.exe application to configure the connection properties (by section).
- Authentication: Set the Server, User, and Password properties to connect to your MongoDB instance.
- Database: Set the Database property to your MongoDB database.
- Misc: Set the Flatten Arrays property to "2", set Flatten Objects to "true" and set Generate Schema Files to "OnStart."
- Schema: Set Location to the directory for your schema files and (optionally) set the Tables property to "restaurants" to only work with the 'restaurants' collection.
With the connection properties configured, click Test Connection to ensure the properties are properly configured and to generate the schema file.
The Schema File
Open the generated schema file (restaurants.rsd) to see the table/column definitions. Each column has specific attributes (like whether or not the column is read-only, it's source datatype, and more). For this example, we simply edit the existing schema definition, removing unwanted columns (grades_0_date) and changing the name attribute for other columns (note the attributes have been truncated for readability).
<attr name="_id" key="true" xs:type="bson:ObjectId" other:bsonpath="$._id"/>
<attr name="building" xs:type="string" other:bsonpath="$.address.building"/>
<attr name="longitude" xs:type="double" other:bsonpath="$.address.coord.0"/>
<attr name="latitude" xs:type="double" other:bsonpath="$.address.coord.1"/>
<attr name="street" xs:type="string" other:bsonpath="$.address.street"/>
<attr name="zipcode" xs:type="string" other:bsonpath="$.address.zipcode"/>
<attr name="borough" xs:type="string" other:bsonpath="$.borough"/>
<attr name="cuisine" xs:type="string" other:bsonpath="$.cuisine"/>
<attr name="name" xs:type="string" other:bsonpath="$.name"/>
<attr name="restaurant_id" xs:type="string" other:bsonpath="$.restaurant_id"/>
<attr name="latest_grade" xs:type="string" other:bsonpath="$.grades.0.grade"/>
<attr name="latest_score" xs:type="integer" other:bsonpath="$.grades.0.score"/>
(Live) MongoDB Data in Power BI
Once the connection and schema files are configured, you can view live MongoDB data in Power BI. Since the CData Connector is built using the Custom Connector interface, you have to option to pull the MongoDB data into memory (Import) or connect live (DirectQuery).
- Open Power BI and enable Custom Connectors (menu -> Options & Settings -> Options -> Preview Features -> Custom data connectors).
- Click Get Data, search for MongoDB (or click Other and scroll to CData MongoDB) and click Connect.
- Select the DSN you configured earlier, select Import or DirectQuery and click OK.
- Select the table (restaurants) and click Load.
At this point, you will have access to the restaurants' data (as defined by the schema file) in Power BI, as seen in the fields menu in Power BI.
Since the Power BI Connectors allow direct connectivity to MongoDB from Power BI, you can bypass the traditional data replication or virtualization steps associated with working on SaaS and NoSQL data and jump straight into building visualizations and reports.
Visualizing MongoDB Data in Power BI
Now that our MongoDB data is loaded, we're ready to build visualizations in Power BI. There are many distinguishing parts of the restaurant data. Thanks to built-in location mapping, we can build a map of the restaurants using the latitude and longitude. We can also view the restaurants based on Borough or cuisine. For starters, let's build the map.
Building a Map Visualization
- Click on the Map icon in the Visualizations menu.
- Drag latitude and longitude to the corresponding Fields in the Visualizations menu.
- Drag borough to the Legend field and latest_score to the Size field chart.
- Expand the latest_score field and change the aggregation to Average.
Now you have a Map chart that plots each restaurant, colored by the borough and with its size based on the average score.
Building a Stacked Column Chart
- Click on the Stacked column chart icon in the Visualizations menu.
- Drag borough to the Axis field.
- Drag cuisine to the Legend field.
- Drag restaurant_id to the Value field, then expand restaurant_id, rename the field to "# of restaurants," and change the aggregation to Count (distinct).
Now you have a Stacked column chart for the restaurants, where each borough is a column and each column is divided by cuisine.
More Visualizations
We went through the process of building a report with two more tables to show the numbers of restaurants by borough and cuisine. In this report, you can use the focus functionality of Power BI to select a Borough or a cuisine type (or both) and drill down to those data points associated with your selections.
Free Trial and More Information
Thanks to the CData Power BI Connectors, users can connect Power BI to live data from SaaS, Big Data and NoSQL sources to perform analytics and build visualizations to drive business. Get started with a free, 30-day trial of any of the Power BI Connectors. For more videos, head over to our YouTube channel. As always, let us know if you have any questions during your evaluation. Our world-class CData Support Team is always available to help.
Published at DZone with permission of Jerod Johnson, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments