Sr Software Development Manager at Salesforce
Technical Manager, Hands-on AWS Solution Architect, Backend development (Java and Python) with expertise in ETL systems, API design, Big Data Pipelines, and Observability
Stats
Reputation: 1084
Pageviews: 422.7K
Articles: 11
Comments: 8
Comments
Sep 12, 2020 · Kai Wähner
My work with large messages (GBs, TBs) has mostly been about solving problems of volume and variety. We used a scheduled ETL job that partitions large payloads and then collects and transforms them in chunks (< 10 MB). Sending these chunks into Kafka for further processing is a better alternative, and it meant we did not have to worry about varying payload sizes in Kafka.
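The chunking step described above could look roughly like the sketch below. It is only an illustration under assumptions: the names `Chunk`, `split_payload`, `reassemble`, and `payload_id` are hypothetical, the 10 MB limit is taken from the figure in the comment, and the actual send to Kafka is left out so the example stays dependency-free.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Assumed upper bound per chunk, based on the "< 10 MB" figure in the comment.
MAX_CHUNK_BYTES = 10 * 1024 * 1024


@dataclass(frozen=True)
class Chunk:
    payload_id: str  # hypothetical id tying every chunk to its source payload
    index: int       # zero-based position of this chunk within the payload
    total: int       # total number of chunks the payload was split into
    data: bytes      # the chunk body, at most max_bytes long


def split_payload(payload_id: str, payload: bytes,
                  max_bytes: int = MAX_CHUNK_BYTES) -> Iterator[Chunk]:
    """Split one large payload into bounded, ordered chunks."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    # Ceiling division; an empty payload still yields one (empty) chunk.
    total = max(1, -(-len(payload) // max_bytes))
    for index in range(total):
        start = index * max_bytes
        yield Chunk(payload_id, index, total, payload[start:start + max_bytes])


def reassemble(chunks: Iterable[Chunk]) -> bytes:
    """Rebuild the payload on the consumer side, whatever the arrival order."""
    ordered = sorted(chunks, key=lambda c: c.index)
    if not ordered or len(ordered) != ordered[0].total:
        raise ValueError("missing chunks")
    return b"".join(c.data for c in ordered)
```

In a setup like this, each chunk would presumably be sent as one Kafka record keyed by `payload_id`, so that all chunks of a payload land on the same partition and keep their order.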
May 07, 2020 · Preetdeep Kumar
There are multiple options; most of them rely on creating a Java POJO from your JSON and then serializing/deserializing the Java object. You can also use Avro if you want versioning and strict typing in your schema.
Apr 27, 2020 · Preetdeep Kumar
Thanks for the feedback.
Apr 14, 2020 · Preetdeep Kumar
Glad you liked it
Apr 14, 2020 · Preetdeep Kumar
Glad you liked it
Apr 14, 2020 · Preetdeep Kumar
There is a nice blog on DZone that compares Pulsar and Kafka. As for Flink, I don't think it should be compared with Pulsar.
May 11, 2019 · Brian Hannaway
Nice, we are also using Lombok in our project to reduce a lot of repetitive code, especially setters/getters.
Apr 20, 2019 · Preetdeep Kumar
Thanks, Hammad, for your feedback. You are correct, and I have also pointed this out in the blog. This application is more than a decade old and has many limitations when it comes to deployment and integration. My team will be refactoring this application in the next phase.