Building a Reference Architecture for a Demand-Side Platform in AdTech
Learn the key architectural components and explore a real-time bidding use case featuring Redpanda, Aerospike, and Flink.
AdTech, short for Advertising Technology, represents the convergence of technology and advertising within the digital landscape. At its core, AdTech encompasses a range of tools, platforms, and strategies designed to streamline and optimize the process of buying, selling, and delivering digital advertisements.
Programmatic advertising, a key component of AdTech, shakes up how ad placements are bought and sold. It involves automated, data-driven transactions, enabling advertisers to precisely target their audiences and optimize ad placements in real time. At the heart of programmatic advertising is real-time bidding (RTB), a dynamic auction-based system where advertisers, represented by Demand-Side Platforms (DSPs), compete to bid on ad impressions in the milliseconds before they are displayed to users.
In this post, we delve deeper into the intricacies of real-time bidding and propose a reference architecture for a DSP with improved performance, scalability, and accuracy.
The Programmatic Advertising Ecosystem
This ecosystem comprises various components that work together to enable the automated buying and selling of digital ad inventory in real time.
At its core are advertisers, who seek to reach their target audiences with relevant messages, and publishers, who offer digital ad space on their websites or apps. Supply-side platforms (SSPs) manage and monetize publishers' ad inventory, while demand-side platforms (DSPs) represent advertisers and bid on ad impressions.
Ad exchanges bridge the gap between SSPs and DSPs, acting as marketplaces where auctions occur. Data management platforms (DMPs) provide audience data and insights to enhance targeting, while ad networks may facilitate the distribution of ads across multiple publishers. These components rely on real-time bidding (RTB) protocols and algorithms to make instantaneous decisions on ad placements, ensuring that the right ad is displayed to the right user at the right moment.
Together, these elements form a dynamic ecosystem that drives digital advertising efficiency, precision, and relevance.
What Is Real-Time Bidding (RTB)?
Real-time bidding is an intricate process that involves a coordinated sequence of actions across different components in the AdTech ecosystem. Below is the sequence of actions that kicks off once a user visits a website or opens a mobile app that has ad space to fill.
- A bid request is sent to an ad exchange with information on the website or app and opted-in visitor data (demographic, contextual, behavioral, and device data, included only if the user has given tracking permission).
- The website or app owner puts the ad impression up for auction on the SSP.
- The opted-in visitor information is then matched with available advertisers.
- Advertisers on the DSP offer bids for the impression.
- The highest bidder wins the ad impression.
- The ad is served to the user on the website or app.
- Ideally, the user clicks on the ad and converts. If not, retargeting tactics like “sticky” ads can be used to encourage them to convert at a later time.
The whole sequence usually takes about 20 to 30 milliseconds to complete, and users are completely unaware that it is even happening.
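The "highest bidder wins" step can be illustrated with a minimal sketch. It assumes a simple first-price auction with an optional floor price; real exchanges may use other auction rules, and the bidder names and prices below are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bid:
    bidder: str       # hypothetical DSP name
    price_cpm: float  # bid price, expressed as cost per thousand impressions


def run_auction(bids: list, floor_cpm: float = 0.0) -> Optional[Bid]:
    """Return the highest bid at or above the floor, or None if nobody qualifies."""
    eligible = [b for b in bids if b.price_cpm >= floor_cpm]
    if not eligible:
        return None
    # On a tie, max() keeps the first bid it encountered.
    return max(eligible, key=lambda b: b.price_cpm)


bids = [Bid("dsp-a", 1.20), Bid("dsp-b", 2.50), Bid("dsp-c", 0.40)]
winner = run_auction(bids, floor_cpm=0.50)
```

Here `winner` is the `dsp-b` bid, and raising the floor above every bid leaves the impression unsold.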
A Reference Architecture for a Demand-Side Platform (DSP)
Now that we understand the critical components of a programmatic advertising ecosystem, real-time bidding, and the associated sequences, let’s discuss the design and implementation of a DSP.
Given that every millisecond costs money in the AdTech industry, DSPs play a crucial role by representing advertisers and making split-second decisions to bid on ad slots. They must distinguish themselves by swiftly processing data, targeting the right audience, and optimizing bids to win the auction.
We propose the following reference architecture for a DSP with these design objectives in mind:
- Performance: Ability to decide and respond to bid requests with low latency. Ideally, the response time should be in the range of 20-30 milliseconds.
- Scalability: Ability to handle high throughput bid requests and respond to them within defined SLAs.
- Cost-efficient storage for ad events: Store critical data points at scale, including bid requests, responses, clicks, and user behavioral data, and make them available to downstream applications.
- Efficient data integration: Integrate with ecosystem components to ingest behavioral data from relevant external providers, process it, and use it to make better bidding decisions.
Now, let’s divide the solution into four important stages:
- Integration with the RTB ecosystem
- User profile database
- Real-time user data enrichment
- Reporting and analytics
Integration With the RTB Ecosystem
In a typical programmatic advertising ecosystem, DSPs communicate with SSPs or ad exchanges to bid for available ad slots. The communication between the supply side (SSPs or ad exchanges) and the demand side (DSPs) takes place over HTTP, according to a standard real-time bidding (RTB) specification, such as OpenRTB.
In the scope of bidding, the DSP expects three types of HTTP POST requests from the supply-side platform, all received at a configured HTTP endpoint in the DSP:
- Bid requests
- Win notifications
- Loss notifications
The bidding application (widely known as the bidder) in the DSP is responsible for handling them and responding accordingly. The bidder is a critical component in the DSP that evaluates incoming ad impressions and determines whether to bid on them. It decides which ad creatives to display and how much to bid for the opportunity to show those ads to the targeted user.
Bidders operate in real-time, processing ad requests within milliseconds. They use a combination of algorithms, data analysis, and targeting criteria to make rapid decisions based on factors like user demographics, browsing behavior, location, device type, and the context of the ad placement.
In our proposed solution, a bidder could be a packaged application managed by a DSP vendor or an in-house application, like a microservice.
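As a rough sketch of how an in-house bidder might route the three request types, the function below maps a request path and body to an HTTP status and response body. The paths, the price cap, and the pricing rule are hypothetical placeholders. The JSON field names (`id`, `imp`, `bidfloor`, `seatbid`, `bid`, `impid`, `price`) follow OpenRTB 2.x, but the exact contract should be checked against the specification and each exchange's integration guide.

```python
import json
from typing import Optional, Tuple

# Hypothetical endpoint paths and price cap; a real DSP agrees these per exchange.
BID_PATH, WIN_PATH, LOSS_PATH = "/bid", "/win", "/loss"
MAX_CPM = 5.0

notifications = []  # stand-in for an event log of (path, payload) pairs


def decide_price(imp: dict) -> Optional[float]:
    """Placeholder pricing: bid one cent above the floor, or pass if too expensive."""
    price = round(float(imp.get("bidfloor", 0.0)) + 0.01, 2)
    return price if price <= MAX_CPM else None


def handle_post(path: str, body: bytes) -> Tuple[int, bytes]:
    """Return (HTTP status, response body) for one supply-side POST."""
    if path == BID_PATH:
        request = json.loads(body)
        bids = []
        for imp in request.get("imp", []):
            price = decide_price(imp)
            if price is not None:
                bids.append({"id": f"{request['id']}-{imp['id']}",
                             "impid": imp["id"], "price": price})
        if not bids:
            return 204, b""  # an empty 204 response signals "no bid"
        response = {"id": request["id"], "seatbid": [{"bid": bids}]}
        return 200, json.dumps(response).encode("utf-8")
    if path in (WIN_PATH, LOSS_PATH):
        notifications.append((path, json.loads(body)))  # record for reporting
        return 200, b""
    return 404, b""
```

Keeping the handler a pure function of path and body makes it easy to mount behind any HTTP server and to test without a network.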
User Profile Database
A user profile database is a valuable asset for the DSP, enabling highly targeted and personalized advertising strategies. The bidder leverages the data stored in the database to make more informed bidding decisions, improve ad relevance, and deliver better results for advertisers.
This database contains detailed information about individual users' preferences, behaviors, demographics, and historical interactions with digital content. Using that data, users are segmented into audience groups according to their characteristics and interests. The bidder queries these user profiles to target ads to users who are most likely to be interested in the advertised products or services. It can match the user segments with the targeting criteria specified by advertisers, such as age, location, interests, and browsing history.
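Matching user segments against advertiser targeting criteria can be sketched as a set comparison. The profile shape, the `Campaign` fields, and the segment names here are all hypothetical; a production bidder would typically work from precomputed indexes rather than scanning every campaign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Campaign:
    campaign_id: str
    required_segments: frozenset        # user must belong to all of these
    countries: frozenset = frozenset()  # empty means "any country"


def eligible_campaigns(profile: dict, campaigns: list) -> list:
    """Return IDs of campaigns whose targeting criteria the profile satisfies."""
    user_segments = set(profile.get("segments", ()))
    country = profile.get("country")
    return [
        c.campaign_id
        for c in campaigns
        if c.required_segments <= user_segments
        and (not c.countries or country in c.countries)
    ]


profile = {"user_id": "u1", "segments": ["sports", "age_25_34"], "country": "DE"}
campaigns = [
    Campaign("running-shoes", frozenset({"sports"}), frozenset({"DE", "FR"})),
    Campaign("luxury-cars", frozenset({"auto", "high_income"})),
    Campaign("streaming", frozenset({"age_25_34"})),
]
matches = eligible_campaigns(profile, campaigns)
```

With this sample data, `matches` contains the running-shoes and streaming campaigns, since the user lacks the segments the luxury-cars campaign requires.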
The bidder and the user profile database both sit in the critical path of bidding: the bidder queries the database before deciding on every bid, so it needs a fast response each time. The ability to serve reads with ultra-low latency, and to sustain that latency at high query throughput (QPS), is a deciding factor when choosing a user profile database. The solution therefore recommends a database proven in the AdTech industry, like Aerospike.
For the unfamiliar, Aerospike is a real-time database that ingests, stores, and retrieves data, handling millions of transactions per second (TPS) throughput with sub-millisecond latency. Aerospike can ingest events directly, or through its connectors, to streaming data platforms like Redpanda, Apache Kafka®, Apache Pulsar, and others.
Real-Time User Data Enrichment
Maintaining a user profile database with high-quality data is crucial for bidders to make informed bidding decisions. Proper ad placements increase brand awareness, drive more sales, and help advertisers maximize the ROI on ad spend. However, the data used for decision-making is constantly growing in volume, velocity, and variety.
DMPs often enrich user profile data with the data collected from outside sources for better ad targeting and personalization. These data sources include:
- Third-party data: These are data sets obtained from external sources that are not directly affiliated with the organization. Third-party data items can be diverse and may include demographic data (e.g., age, gender, income), geolocation data (e.g., IP addresses, GPS data), behavioral data (e.g., online shopping habits, content consumption), interest data (e.g., hobbies, preferences), and intent data (e.g., search queries, online activities indicating purchase intent).
- Device and platform data: Information related to the devices and platforms used by users, such as device types (e.g., mobile, desktop, tablet), operating systems and versions, browser types and versions, and device IDs and mobile advertising IDs (e.g., IDFA, AAID).
- Location data: Geographical information about users, including GPS coordinates, IP geolocation data, location-based services data, and user-reported location data (e.g., zip code, city, country).
- Contextual data: Information about the context in which users engage with digital content, including content categories (e.g., news, sports, entertainment), keywords, and search terms.
The challenge for DMPs lies in collecting this data from external sources, cleaning it for use in user profile enrichment, and making it available during real-time bidding. The external sources can be third-party data vendors, partners, or other DMPs. The need here is to collect this data quickly, while it is still relevant.
Enter Redpanda
Redpanda is a powerful streaming data platform that enables scalable and real-time ingestion of large volumes of external data. Redpanda supports ingesting from several data sources, including change data capture (CDC) from transactional databases, flat files, and legacy systems via managed connectors.
The DMP could securely expose Redpanda APIs for external partners to produce data with appropriate levels of RBAC. Alternatively, non-Redpanda producers can use PandaProxy—an HTTP-based API for reading and writing to Redpanda. Once the external data lands in Redpanda, it must be processed before consumption.
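For a partner without a Kafka-compatible client, a produce call over HTTP might look like the sketch below. It assumes the Kafka REST-style interface that Redpanda's HTTP Proxy exposes (`POST /topics/{topic}` with a `records` array and the `application/vnd.kafka.json.v2+json` content type). The base URL, port, and topic name are placeholders, and the details should be verified against the Redpanda HTTP Proxy documentation for the version in use.

```python
import json
import urllib.request


def build_produce_request(base_url: str, topic: str, values: list) -> urllib.request.Request:
    """Build (but do not send) an HTTP request that produces JSON records to a topic."""
    payload = {"records": [{"value": v} for v in values]}
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


# Hypothetical proxy address and topic name.
request = build_produce_request(
    "http://localhost:8082",
    "geolocation",
    [{"user_id": "u1", "country": "DE"}],
)
# To actually send it (requires a reachable proxy):
# with urllib.request.urlopen(request, timeout=2) as response:
#     print(response.status, response.read())
```

Authentication and TLS are omitted here; an externally exposed endpoint would need both.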
However, the processing must be done on the fly, as the data arrives, to minimize latency and keep the data fresh. The solution recommends using a stream processing engine, such as Apache Flink®, for building a streaming ETL pipeline that does the preprocessing.
A good practice is to use different Redpanda topics for different data streams, such as devices, preferences, geolocation data, etc. The changes in the user profile database can also be streamed into a Redpanda topic as a changelog. Flink consumes these streams to build and maintain different changelog streams and materialized views inside its state, allowing the enrichment of user profiles by performing joins and aggregations.
Finally, the enriched stream is materialized as a Redpanda topic, enabling the user profile database to consume it and update itself. Modern real-time databases, like Aerospike, are capable of streaming data ingestion.
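The enrichment logic can be pictured as a keyed, stateful merge. The sketch below is plain Python, not Flink API code: it stands in for the per-user state a Flink job would keep, and the topic names and event fields are hypothetical. Each incoming event updates the materialized profile for its `user_id`, and the returned record is what would be written to the enriched topic.

```python
from collections import defaultdict


class ProfileEnricher:
    """Keeps one merged profile per user, updated as events arrive from each topic."""

    def __init__(self):
        self._state = defaultdict(dict)  # user_id -> materialized profile

    def process(self, topic: str, event: dict) -> dict:
        user_id = event["user_id"]
        profile = self._state[user_id]
        profile["user_id"] = user_id
        if topic == "devices":
            profile["device_type"] = event["device_type"]  # latest value wins
        elif topic == "geolocation":
            profile["country"] = event["country"]          # latest value wins
        elif topic == "preferences":
            merged = set(profile.get("interests", [])) | set(event["interests"])
            profile["interests"] = sorted(merged)          # accumulate over time
        return dict(profile)  # copy to emit downstream


enricher = ProfileEnricher()
enricher.process("devices", {"user_id": "u1", "device_type": "mobile"})
enricher.process("preferences", {"user_id": "u1", "interests": ["sports"]})
enriched = enricher.process("geolocation", {"user_id": "u1", "country": "DE"})
```

After the three events, `enriched` carries the device type, interests, and country for `u1` in a single record. In Flink, the state would be checkpointed and partitioned by key, which this in-memory dictionary does not attempt.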
Reporting and Analytics
Redpanda can also ingest and store the different types of events generated by the DSP, including bid requests and responses, win notifications, and loss notifications. This data can power analytics and reporting use cases, such as real-time dashboards and ML model training. Alternatively, it can be moved into a data warehouse or a data lake for further analysis.
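As one example of a reporting use case, a consumer of these events could compute a per-campaign win rate. The event shape below (`type` and `campaign_id` fields, with `bid_response` and `win` as type values) is an assumption made for illustration, not a format defined by Redpanda or OpenRTB.

```python
from collections import defaultdict


def win_rates(events) -> dict:
    """Aggregate bid responses and wins per campaign and derive the win rate."""
    stats = defaultdict(lambda: {"bids": 0, "wins": 0})
    for event in events:
        if event["type"] == "bid_response":
            stats[event["campaign_id"]]["bids"] += 1
        elif event["type"] == "win":
            stats[event["campaign_id"]]["wins"] += 1
    return {
        campaign_id: {**s, "win_rate": s["wins"] / s["bids"] if s["bids"] else 0.0}
        for campaign_id, s in stats.items()
    }


events = [
    {"type": "bid_response", "campaign_id": "c1"},
    {"type": "bid_response", "campaign_id": "c1"},
    {"type": "bid_response", "campaign_id": "c1"},
    {"type": "bid_response", "campaign_id": "c1"},
    {"type": "win", "campaign_id": "c1"},
    {"type": "bid_response", "campaign_id": "c2"},
]
report = win_rates(events)
```

In this sample, campaign `c1` wins one of four bids and `c2` wins none. A real-time dashboard would compute the same figures over time windows rather than over a finite list.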
The proposed solution keeps existing RTB integrations intact because they have been tried and tested in the AdTech industry for many years, delivering latency SLAs as expected.
Conclusion
The world of AdTech and programmatic advertising is a dynamic and rapidly evolving landscape where the ability to make real-time decisions can be the difference between a successful campaign and missed opportunities. In this post, we explored the intricacies of real-time bidding and the key architectural components required to build a highly scalable demand-side platform (DSP).
Bidding applications can benefit from having the right data to make informed decisions. Oftentimes, DSPs enrich ad requests with the data received from external sources for better ad targeting and personalization. Redpanda, as a streaming data platform, enables streaming data ingestion from these external sources at scale. Its API compatibility with Kafka, single binary deployment, and rich developer experience accelerate data integrations across teams, departments, and even organizations. Additionally, the Tiered Storage feature reduces the storage costs for ingesting and retaining large volumes of external ad events.
By leveraging Redpanda's capabilities and the insights shared in this post, organizations can create DSPs that keep pace with the demands of programmatic advertising and drive success for advertisers, publishers, and audiences alike in this fast-paced digital ecosystem.
Published at DZone with permission of Dunith Dhanushka. See the original article here.