Deduplicating streaming data at scale

July 7, 2017 by Kevin Zellmer

Here at Qubit we’ve been known to discuss customer data – quantitative and qualitative – and the value that comes from analysing it.

It’s pretty much our thing.

We’re delighted when brands use Qubit to dig into their data, uncover what makes their site visitors tick, and deliver great personalizations and tightly targeted segmentation. We’re also hugely excited when we can share results from our data that offer useful insight for the whole personalization space.

We enable over 400 million personalized experiences per month. Doing that means processing vast quantities of data from websites, customer touchpoints and other systems in near real time. There’s a whole world of creative problem-solving and technological wizardry beneath the surface that enables user-friendly, reliable personalization delivery.

Jibran Saithi, Lead Architect at Qubit, has written a blog post on how we tackle one of the challenges of handling large-scale data from multiple sources: duplicate data.

It may not sound sexy, but it is vital. We have to deduplicate data to make sure we’re getting all the information we need, with no unnecessary repeats that could throw off results and introduce inaccuracies, right down to whether a personalization worked and by how much.

To find out how we overcame this challenge, read Jibran’s blog.


