The story that numbers tell

I always tell my students that data can either be interpreted correctly or end up confusing the reader.

I have provided the link to the Department of Health’s COVID-19 tracker. There are various sources from which one can obtain good data analytics. If you go to this link, you will see the improvement in the COVID-19 tracker.

Then there’s the downside. The data analytics is wanting in useful information: the kind of information that should help us decide how bad this novel coronavirus pandemic is in the Philippines and whether we’re actually “flattening the curve”.

Once you get into the site, this is what you see.

It gives you a pretty good picture of the daily information on the epidemiology of COVID-19 in the country. 4,932 confirmed. 3,082 currently admitted (they don’t identify how many are critically ill; the rest are probably mild and quarantined). 315 died so far. 242 recovered. Then in the gray bar are the ones “for validation”. There’s a caveat that says that “around 25% of province-level data are still undergoing validation”. But wait, there’s more!

If you scroll down a bit, you will see the table on testing capacity. Here’s where the data actually differ. The testing capacity table as of April 11, 2020 showed 4,913 positives. The total confirmed as of April 13, 2020 is 4,932. On April 11, the DOH officially reported 4,428 confirmed cases. What accounts for the discrepancies in the numbers coming from the same agency?

The 1,293 cases for validation (said to be province-level data) are confusing.

I would assume that province-level data would point to tests coming from Southern Philippines Medical Center, Baguio General Hospital, Vicente Sotto Memorial Medical Center, Western Visayas, and Bicol Regional Diagnostic Laboratory. If you total their positive cases, that would just be 204 positives from the provincial testing centers.

We can assume that some specimens were sent to Manila for analysis. How many were sent to Manila? These inconsistencies in the numbers make one wonder whether they simply deducted the accounted-for cases from the total cases in order to arrive at the discrepancy.

Let’s look at the numbers again.

4932 total confirmed. 3082 currently admitted. 315 died. 242 recovered.

Assuming the numbers are correct, that’s 3639 cases (currently admitted, died, and recovered). What are the remaining 1,293 cases “undergoing validation”? What does “undergoing validation” mean?

First, if they’re not yet validated, they shouldn’t even be part of the total statistic.

Second, why are the numbers from the provincial testing centers not tallying (202 from all the provincial testing sites vs 1293 cases at the national level)?

The third and most vital query: does the DOH actually have data on the patients who tested positive but were sent home for quarantine? How many of them returned for retesting? When you look at the numbers, of 4932 confirmed only 3082 are currently admitted. That leaves 1850 cases, distributed as follows: 315 deaths, 242 recovered and 1293 for validation(?).
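The reconciliation above can be checked in a few lines of Python. This is only a sketch using the figures quoted in this post; the variable names are my own and are not field names from the DOH tracker.

```python
# Figures quoted in this post from the DOH COVID-19 tracker (April 13, 2020).
# Variable names are illustrative, not DOH field names.
total_confirmed = 4932
admitted = 3082
died = 315
recovered = 242
for_validation = 1293

# Cases with a stated status: currently admitted, died, or recovered.
accounted_for = admitted + died + recovered

# Whatever is left of the total has no stated status.
unaccounted = total_confirmed - accounted_for

print(f"Accounted for: {accounted_for}")
print(f"Unaccounted:   {unaccounted}")
print(f"Equals the 'for validation' count: {unaccounted == for_validation}")
```

The unaccounted remainder equals the “for validation” figure exactly, which is why that category looks like a balancing item rather than an independent count.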

Data integrity is important in the analysis of outcomes and planning of mitigation strategies. It is, after all, the basis of our life after April 30.

The new site, while providing information on cases, deaths, and recoveries, is wanting in information on the severity of the illness. We all know that not all admitted cases are in the ICU or are critically ill.

If we scroll down a bit, there’s interesting information regarding the availability of beds and mechanical ventilators in various hospitals.

The information above, while helpful, is disturbing. If you look at the ICU beds, only 391 beds (out of 1,085) are filled. The remaining two-thirds are unoccupied. More than 87% of mechanical ventilators (87.29%) are still available. Yet reports coming from various private hospitals say that they are filled to the brim and that the healthcare system is so overwhelmed by the coronavirus infection that some hospitals have had to turn away patients.

Based on the data provided in the DoH website, of the 3082 currently admitted, 391 are in the ICU (as of this writing). That means only 12.7% are critically ill (needing intensive care).

There are more daily deaths than daily recoveries among those currently admitted. The graph below is a screenshot of the new daily deaths and new daily recoveries on the website. Set against the 391 cases in the ICU, the 315 deaths would imply an 80% mortality rate when intubated or in intensive care.

Of the 3082 cases admitted in the hospital, assuming that only 391 are in the ICU, what happened to the remaining 2691? That’s the majority of the patients, who happen to be mild or moderate and will probably recover.
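The hospital figures above can be worked out as follows. This is a rough sketch using only the numbers quoted in this post. It assumes, as the text does, that every occupied ICU bed holds a COVID-19 patient, and the variable names are mine.

```python
# Figures quoted in this post from the DOH tracker; names are illustrative.
admitted = 3082
icu_beds_total = 1085
icu_beds_occupied = 391
deaths = 315

# Share of ICU beds that are filled.
icu_occupancy = icu_beds_occupied / icu_beds_total

# Share of admitted cases in the ICU (assumes all occupied ICU beds are COVID-19 cases).
icu_share_of_admitted = icu_beds_occupied / admitted

# Admitted cases that are not in the ICU.
non_icu_admitted = admitted - icu_beds_occupied

# Deaths set against ICU occupancy, as in the text. Deaths are cumulative while
# ICU occupancy is a snapshot, so treat this as a rough ratio only.
deaths_to_icu_ratio = deaths / icu_beds_occupied

print(f"ICU beds occupied:      {icu_occupancy:.1%}")
print(f"Admitted who are in ICU: {icu_share_of_admitted:.1%}")
print(f"Admitted, not in ICU:    {non_icu_admitted}")
print(f"Deaths vs ICU cases:     {deaths_to_icu_ratio:.1%}")
```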

Finally, there’s the interpretation of the data.

Cumulative is the operative word in the presentation of data.

This means that regardless of patients getting better or dying, the total number of cases is what you see. But that is not the real picture. Minus the deaths and recoveries, we are actually tracking 4375 cases that are still active. The deaths and recoveries are considered closed.
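The difference between cumulative and active cases is simple subtraction. The sketch below uses the figures quoted in this post, with variable names of my own choosing.

```python
# Figures quoted in this post from the DOH tracker; names are illustrative.
total_confirmed = 4932   # cumulative: never goes down
died = 315
recovered = 242
admitted = 3082
for_validation = 1293

# Closed cases have a final outcome.
closed = died + recovered

# Active cases are what remains of the cumulative total.
active = total_confirmed - closed

print(f"Closed cases: {closed}")
print(f"Active cases: {active}")

# Cross-check: the active count equals admitted plus "for validation".
print(f"Active == admitted + for validation: {active == admitted + for_validation}")
```

Note that the active count is made up entirely of the admitted cases plus the ones “for validation”, so the unexplained category accounts for nearly 30% of what is being tracked as active.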

The red herring here? The ones for validation. They form the inconsistent information that needs an explanation so that it is not misinterpreted.

I’ve always told my former students in biostatistics that all the numbers should have an explanation. It is imperative that the data gathered are accurate, valid and not confusing to the reader. It is also important that all tables and graphs are reconcilable. Otherwise, any conclusion drawn from this kind of data is confusing and simply leads to bad decisions in preparation and planning.

The bottom line of good solid data? The April 30 deadline.

10 thoughts on “The story that numbers tell”

  1. jamalashley April 13, 2020 / 10:38 pm

    Where did the data on testing come from? According to WHO and world data, the Philippines had 4,000-plus people tested as of April 4. How in the world could there be more than 33,000 tested by April 11? That means in 7 days, some 9,000 people were tested. That’s about 1300 tests a day. DOH just announced that tomorrow, April 14, MASS Testing will start with about 3000 tests a day. So, DOH did not know that the country was already “mass testing” since April 5 or thereabouts!


  2. jamalashley April 13, 2020 / 11:40 pm

    Sorry, wrong math in my previous comment. It should go like this:

    According to the Covid19 Tracker of DOH, they have already tested 33,814 people as of April 11. This is soooo different from the world data @…/total-covid-19-tests…

    According to the World Data under the United Nations Office for the Coordination of Humanitarian Affairs, these are the cumulative numbers of reported Covid19 tests by the Philippines:

    Philippines PHL Jan 28, 2020 3
    Philippines PHL Feb 10, 2020 112
    Philippines PHL Feb 18, 2020 456
    Philippines PHL Mar 30, 2020 2559
    Philippines PHL Apr 2, 2020 3908
    Philippines PHL Apr 3, 2020 4367
    By April 4, Philippines tested only 4,568 people.

    How in the world could there be 33,814 tested by April 11? That means in 7 days, some 29,000 people were tested. That’s about 4178 tests a day. DOH just announced that tomorrow, April 14, MASS Testing will start with about 3000 tests a day. So, DOH did not know that the country was already “mass testing” since April 5 or thereabouts and doing more than 4000 tests a day!?!

    Something is wrong there.



  3. Jill Gale de Villa April 14, 2020 / 9:39 am

    Statistics wasn’t my forte, but it seems a lot of data interpretation is needed, with cautionary notes, and your analysis hopefully will help move in that direction. One uncertainty factor is that the PCR tests are known to be only 60%-70% accurate. For example, given the accuracy issue of the test results, could there be, say, over 10,000 more positives among the negatives tested, and over 1,700 more negatives among those testing positive, for a balance of an additional 8,300+ positives out there (using 65% as the average accuracy)? Just curious.


    • kidatheart April 14, 2020 / 9:44 am

      That is correct. There are so many unknowns and many limitations. But one thing I am sure of is that if this continues a bit longer, people will die of hunger, not of coronavirus.


  4. Roman Jebulan April 14, 2020 / 9:21 pm

    Hi! I was able to read this through ANCX after I tried digging through the data of the DOH Data Drop spreadsheet file. I am from Bicol and I immediately looked for the data from the local testing center. I found there that the said testing center recorded 4 positive cases as of April 13. However, the DOH Bicol confirmed only an additional 1 case from today, April 14. There were no new cases yesterday. I can only assume that the data from the Data Drop is more updated than those from DOH Bicol…or, the testing center may have carried out tests from other regions. 🤔


    • kidatheart April 14, 2020 / 9:22 pm

      There should be consistency in data always because we need it for planning.


  5. Greg April 16, 2020 / 6:36 am

    A big problem with the stats is that they are presented based on “Date of Public Announcement”. This doesn’t tell an accurate or useful story. DOH should be presenting it based on Date of Death (for death numbers) and Date of Onset of Symptoms (for number of cases).

    I’ve plotted the Daily Deaths based on date of actual death, which shows a pattern/story. See the image here:

    Deaths steadily rose to a peak of 19 deaths per day on March 26-27, and may have started declining then already. However, the data is incomplete due to delays in communication and tabulation. Forty four (44) of the “Died” entries in the Data Drop do not include a Date of Death, plus many more have not even been reported yet.

    I’m glad the info is publicly available but big improvements are needed for the government to have reliable basis for their decisions.


    • kidatheart April 16, 2020 / 7:32 am

      Thanks for this information, and yes, I agree. That’s why it’s startling that we end up having so many fluctuations in the graph. It comes from the inconsistency of data.


  6. Tj April 17, 2020 / 12:16 am

    Thank you. You might not remember me but you definitely left an impression on a former student. I have always wondered about the obsession with the total confirmed. It was fine at the start, but once the numbers reached 500 I started to wonder about its true utility. Yes, the “for validation” category just confuses me further. Would an epidemiologist and statistician combined have sorted the data a bit differently? I do wonder.

    You mentioned that testing matters; truth is I feel that the delayed creation of testing facilities is due to red tape on the part of DOH or RITM. Some hospitals had PCR testing capabilities probably way even before RITM started using them and I have always wondered why some big named hospitals have to bow down to the RITM approvals list when these hospitals have probably been running far more PCR tests than RITM with technical staffs just as proficient as RITM personnel.

    I have seen the checklist that labs must comply with to run one of the 45-minute PCR tests using GeneXpert, which has yet to be introduced into the local market. These are checklists for existing licensed labs in this country. The required data are incredibly redundant for existing labs… Almost as if you are building a new lab from scratch, with required protocols to be submitted that include statements like “no animals are admitted in the laboratory” and “no sniffing in vitro cultures or mouth pipetting”, among a few.

    Incredibly, these are protocols for working laboratory professionals. I certainly do not believe these professionals will lick a used nasal swab just for fun. Nor would they put a cactus plant inside the biosafety cabinet, citing that it’s for stress relief and to encourage nature in the lab.

    We need more labs that can run the tests, but there are more questions, like why San Lazaro, 3 months into the epidemic with thousands of kits donated, has only managed to run 409 tests. Or why Makati Med, newly granted the power to do PCR, is rumored to reserve it only for its patients. Can’t RITM give these kits out and have DOH order collected samples to be distributed to the newly approved hospitals to be run? Why do I feel RITM prefers to handle the workload when it has maxed out its capacity?

    Truth is, I do not feel DOH is in a rush. I feel they are just reporters citing the daily statistics of new cases and how many died and reciting the timeless covid rhetoric of wash your hands.

