Implement a Counter Table in Elasticsearch
To understand your product, you need in-depth information about feature usage, including when exactly a particular feature was last touched. In this article, we're going to solve the problem of counting clicks on different entities/pages of your website and recording when a particular page was last touched.
To solve this problem, two main solutions come to mind:
Cassandra counter tables: Cassandra supports counter tables, but a counter table is allowed to contain only primary key columns and counter columns. The timestamp we want to store is a regular column, not part of the primary key, so it cannot be mixed into a counter table. With this limitation, we cannot proceed with Cassandra counter tables.
Elasticsearch: There is no built-in support for counter tables in Elasticsearch, but we can take advantage of its script functionality to implement them.
Solution
Let's assume a user has just seen a video with a unique ID of 88d19b07-86d2-471d-be33-ed10eef5d38e on our website.
POST counter/count/VIDEO_VIEW_88d19b07-86d2-471d-be33-ed10eef5d38e/_update
{
  "script": {
    "source": "ctx._source.counter += params.count;ctx._source.recorded_timestamp = params.recorded_timestamp",
    "lang": "painless",
    "params": {
      "count": 1,
      "recorded_timestamp": "2019-11-19T22:15:30Z"
    }
  },
  "upsert": {
    "entity": "VIDEO",
    "counter": 1,
    "action": "VIEW",
    "uuid": "88d19b07-86d2-471d-be33-ed10eef5d38e",
    "recorded_timestamp": "2019-11-19T22:15:30Z"
  }
}
We have an Elasticsearch index named "counter" with the type "count." With this request, we're trying to update a document with the ID ${entity}_${action}_${UUID} (VIDEO_VIEW_88d19b07-86d2-471d-be33-ed10eef5d38e in this example).
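As a rough illustration of how the document ID and request body fit together, here is a minimal Python sketch. The function name build_counter_update and its parameters are hypothetical and only for illustration; the script source and field names are taken from the request above. It only builds the ID and the JSON body and does not talk to Elasticsearch.

```python
import json


def build_counter_update(entity, action, uuid, recorded_timestamp, count=1):
    """Return (doc_id, body) for the scripted upsert shown in the article.

    Hypothetical helper: it only assembles data and sends nothing.
    """
    # The ID follows the ${entity}_${action}_${UUID} pattern.
    doc_id = f"{entity}_{action}_{uuid}"
    body = {
        "script": {
            "source": (
                "ctx._source.counter += params.count;"
                "ctx._source.recorded_timestamp = params.recorded_timestamp"
            ),
            "lang": "painless",
            "params": {"count": count, "recorded_timestamp": recorded_timestamp},
        },
        # Used as the initial document when the ID does not exist yet.
        "upsert": {
            "entity": entity,
            "counter": count,
            "action": action,
            "uuid": uuid,
            "recorded_timestamp": recorded_timestamp,
        },
    }
    return doc_id, body


doc_id, body = build_counter_update(
    "VIDEO", "VIEW", "88d19b07-86d2-471d-be33-ed10eef5d38e", "2019-11-19T22:15:30Z"
)
print(doc_id)
print(json.dumps(body, indent=2))
```

The body could then be sent to the _update endpoint for that ID with whichever HTTP or Elasticsearch client you already use.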
If this video is being watched for the first time, no document with that ID exists yet, so the block given under the "upsert" key is indexed as the new document, with a counter value of 1.
The next time a user watches this video, the same request is sent, and a document with the given ID is found in the "counter" index under the "count" type. This time the script runs: the counter is increased by 1 (the count given under the params key), and recorded_timestamp is overwritten with the value given under the params key. Other keys remain untouched.
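To make the two paths concrete, the following Python sketch imitates the behavior in memory. It is only a model of the semantics described here (insert the upsert document on the first call, apply the script's two assignments on later calls); apply_counter_update and the store dictionary are hypothetical names and are not part of any Elasticsearch API.

```python
def apply_counter_update(store, doc_id, upsert, params):
    """Imitate the scripted upsert against a plain dict standing in for the index."""
    if doc_id not in store:
        # First view: the "upsert" block becomes the document as-is.
        store[doc_id] = dict(upsert)
    else:
        # Later views: same effect as the Painless script in the request.
        doc = store[doc_id]
        doc["counter"] += params["count"]
        doc["recorded_timestamp"] = params["recorded_timestamp"]
    return store[doc_id]


store = {}
doc_id = "VIDEO_VIEW_88d19b07-86d2-471d-be33-ed10eef5d38e"
upsert = {
    "entity": "VIDEO",
    "counter": 1,
    "action": "VIEW",
    "uuid": "88d19b07-86d2-471d-be33-ed10eef5d38e",
    "recorded_timestamp": "2019-11-19T22:15:30Z",
}

# First view creates the document with counter 1.
apply_counter_update(store, doc_id, upsert,
                     {"count": 1, "recorded_timestamp": "2019-11-19T22:15:30Z"})
# Second view increments the counter and moves the timestamp forward.
result = apply_counter_update(store, doc_id, upsert,
                              {"count": 1, "recorded_timestamp": "2019-11-20T08:00:00Z"})
print(result["counter"], result["recorded_timestamp"])
```

Only counter and recorded_timestamp change on the second call; entity, action, and uuid keep the values from the first insert.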
Now I have data available that tells me:
Which entity is used widely, and which particular item of that entity is used widely. In this case, which particular video is watched most often.
Which action of which entity is not in use, just by querying Elasticsearch with "give me all the entities where recorded_timestamp is older than 30 days."
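The "older than 30 days" question can be expressed as a range query with date math. The sketch below builds such a search body in Python; build_stale_query is a hypothetical helper, and it assumes recorded_timestamp is mapped as a date field in the index, which the article does not state explicitly.

```python
import json


def build_stale_query(days=30, timestamp_field="recorded_timestamp"):
    """Build a search body matching documents last touched more than `days` days ago.

    Assumes `timestamp_field` is mapped as a date so that date math applies.
    """
    return {
        "query": {
            "range": {
                # "now-30d" is Elasticsearch date math for "30 days before now".
                timestamp_field: {"lt": f"now-{days}d"}
            }
        }
    }


print(json.dumps(build_stale_query(), indent=2))
```

The printed JSON could be sent as the body of a search request against the "counter" index; each hit is an entity/action pair that has not been touched within the window.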