Problem
The problem is that the sent amount of messages on sender_firebase_statistics table does not correspond to the actual sent messages taken from channels_stat_daily table. Here is the example:
channels stat daily output
2024-08-24 - 23833877937
2024-08-25 - 23653522348
2024-08-26 - 23751405481
2024-08-27 - 16153471015
sender_firebase_statistics output
2024-08-24 - 15243943345
2024-08-25 - 15063587756
2024-08-26 - 15161470889
2024-08-27 - 11893535303
Investigation
The difference is quite a bit. The interesting thing is that the amount of deliveries is equal on both tables. A mechanism by which data is being written o the sender_firebase_statistics table is different for sent and deliveries, because:
sent_countis taken frommessage_eventstabledelivered_countis taken frommessagecallbackeventstable
SELECT
toDate(timestamp) AS date,
firebase_id,
toInt32(0) AS sender_id,
toInt32(0) AS delivered_count,
toInt32(0) AS clicked_count,
toInt32(0) AS unique_click_count,
toInt32(0) AS closed_count,
sum(subscribe_count) AS subscribe_count,
sum(unsubscribe_count) AS unsubscribe_count,
toInt32(0) AS sent_count,
toInt32(0) AS errors_count
FROM statistics.subscription_events WHERE date > '2024-08-20'
GROUP BY toDate(timestamp), firebase_id
SELECT
toDate(timestamp) as date,
firebase_id,
sender_id,
toInt32(0) as delivered_count,
toInt32(0) as clicked_count,
toInt32(0) as unique_click_count,
toInt32(0) as closed_count,
toInt32(0) as subscribe_count,
toInt32(0) as unsubscribe_count,
toInt32(0) as sent_count,
sum(errors_count) as errors_count
FROM message_errors_events GROUP BY date, firebase_id, sender_id ORDER BY date
SELECT date, sum(sent_count) as sent FROM (
SELECT
toDate(timestamp) as date,
firebase_id,
sender_id,
sum(sent_count) as sent_count
FROM message_events WHERE date = '2024-08-26' GROUP BY toDate(timestamp), firebase_id, sender_id ORDER BY date
) GROUP BY date ORDER BY date
Quering message_events table in the same manner that is should be taken by materialized view to pass data to the sender_firebase_statistics gives me the correct amount of sent messages in total. Let’s take it closer by comparing a single day and a signle firebase + sender pair:
date | firebase_id | sender_id | sent_count
2024-08-26 | 2205 | 400 | 250757829
| 1984 | 433 | 263732410
| 2197 | 433 | 355744845
They are all the same for sender_firebase_statistics table. But there is a unique record living in message_events grouped by firebase_id and sender_id which refers to firebase=0, sender=0 pair with a huge amount of sent messages (8775431127) which is the exact piece we’re missing for the sender_firebase_statistics table.
Such data is definitely missing for our table:
select * from sender_firebase_statistics where sender_id = 0 and firebase_id = 0 and date = '2024-08-26'Questions
- why does this data is not being written to the
sender_firebase_statisticstable? - why do we have such a record in the
message_eventstable?
Solution
- DeliveryReportContext compound_id should contain sender_id and firebase_id
- it looks like those numbers came from stale reports that are extracted by compound id. all DeliveryReportContext contexts had sender and firebase id set to zero.
- fix can be found here - https://git.systransfer.com/webpush/webpush-directory/-/merge_requests/611