HyperLogLog in Google BigQuery

1 · Unsplash · May 15, 2019, 11:57 a.m.
Counting and reporting uniques is always a challenge, as it usually requires a full scan of the dataset to count the number of distinct values. On small datasets that's fine, but with larger volumes it quickly becomes a performance and resource issue. We recently ran into this problem when trying to measure the number of unique users reached by Unsplash images.

Photo by Joanna Kosinska on Unsplash

Uniques can’t be aggregated

The uniqueness of a value also depends on the time range...
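To make the aggregation problem concrete, here is a minimal sketch in Python. The user IDs and dates are hypothetical toy data, not figures from Unsplash. It shows that adding up per-day unique counts overstates the number of unique users for the whole period, because a user seen on several days is counted once per day.

```python
# Hypothetical toy data: user IDs seen on each day (not from the article).
daily_users = {
    "2019-05-13": {"alice", "bob"},
    "2019-05-14": {"bob", "carol"},
    "2019-05-15": {"alice", "carol", "dave"},
}

# Per-day unique counts, as a daily report would store them.
daily_uniques = {day: len(users) for day, users in daily_users.items()}

# Naively summing the daily counts double-counts returning users.
summed = sum(daily_uniques.values())

# The true figure needs the union of the underlying IDs across all days.
true_uniques = len(set().union(*daily_users.values()))

print(f"sum of daily uniques: {summed}")
print(f"uniques over the period: {true_uniques}")
```

Getting the correct figure for the period requires going back to the underlying IDs, which is exactly the full scan that becomes expensive at scale.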