The data published by authorities on current and rising infection numbers omits how many tests were performed to obtain the Active Infection counts we're seeing.
Currently widely tracked and published are:
Regular reporting of:
Active Infection Cases
Case "Recoveries" (with different criteria for "recovery" by region)
Sporadic reporting of:
Overall number of tests performed (usually country-level), with positive/negative test counts within that set.
Different regions are using different tests.
This data is likely skewed towards cases presenting at Healthcare providers with more severe symptoms vs. the community spread in the population as a whole.
We're getting an incomplete picture of how prevalent the spread of the virus is.
This makes it hard for the general population to judge the risk in their area, since a low infection count could simply reflect poor intel or a low testing rate.
In turn, this may explain some regions' apparent lack of behaviour change and prevention measures: low detection rates lead to underestimated risk.
Poor intel on any community reservoir of infected people may increase the risk of another outbreak through premature relaxing of prevention measures.
Regional data on:
Active Infection Cases with:
initial detection method/presentation
Criteria used to judge and/or test for COVID-19 fatalities among the recently deceased
i.e. testing/detection rate among the dead
Number of tests with:
How the cases were flagged for testing (e.g. high risk area, high risk group, presenting with symptoms at point of care, thermal screening, random sampling etc.)
What test(s) were used
Which testing facilities
In lieu of official (and detailed) data from regional authorities, we could create a website or mobile app where people who've been tested can self-report, with:
how they were flagged to be tested
test type if known
location before presenting for testing
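The three self-reported fields above could be captured in a record like the following sketch. The type and field names are hypothetical; the flagging categories are taken from the examples listed earlier, and `result_positive` is an added assumption (the proposal does not list it explicitly, but any analysis of positivity would need it).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FlagReason(Enum):
    """How the person came to be tested (categories from the list above)."""
    HIGH_RISK_AREA = "high risk area"
    HIGH_RISK_GROUP = "high risk group"
    SYMPTOMS_AT_POINT_OF_CARE = "presenting with symptoms at point of care"
    THERMAL_SCREENING = "thermal screening"
    RANDOM_SAMPLING = "random sampling"
    OTHER = "other"


@dataclass(frozen=True)
class SelfReport:
    """One self-reported test; all names here are illustrative assumptions."""
    flag_reason: FlagReason
    town_before_test: str                    # location before presenting for testing
    test_type: Optional[str] = None          # None when the reporter does not know
    result_positive: Optional[bool] = None   # assumption: None means result pending

    def __post_init__(self) -> None:
        # Basic guard against empty location entries.
        if not self.town_before_test.strip():
            raise ValueError("town_before_test must not be empty")


report = SelfReport(FlagReason.RANDOM_SAMPLING, "Exampletown")
print(report)
```

Keeping `test_type` optional reflects that many reporters will not know which test was used.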
Once we have the data, stronger analysis of and modelling from the existing data becomes possible, with dashboards, confidence ratings for published infection rates, and estimates of likely community spread.
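One possible form of "confidence rating" is an interval around the observed positivity rate that narrows as more tests are performed. The sketch below uses the Wilson score interval, which is a choice made here for illustration rather than something the proposal specifies. It describes uncertainty in the tested sample only; it does not correct for who was selected for testing.

```python
import math
from typing import Tuple


def wilson_interval(positives: int, tests: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    if not 0 <= positives <= tests:
        raise ValueError("positives must be between 0 and tests")
    p = positives / tests
    z2 = z * z
    denom = 1 + z2 / tests
    centre = (p + z2 / (2 * tests)) / denom
    half_width = z * math.sqrt(p * (1 - p) / tests + z2 / (4 * tests * tests)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Same 10% positivity, but ten times as many tests in the second case.
print(wilson_interval(10, 100))
print(wilson_interval(100, 1000))
```

The second interval is noticeably narrower, which is the property a dashboard could surface: the same headline rate deserves more trust when it rests on more tests.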
Data privacy would need to be at the forefront, but the data is probably useful if we can track cases down to town level to help inform local behaviour.
Place of test is likely less important than the place of likely contacts (where the infection was caught or spread).
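One way to balance town-level usefulness against privacy is to publish only aggregated counts and withhold any count small enough to point at individuals. The following is a minimal sketch of that idea; the function name and the threshold of 5 are assumptions, and a real deployment would need a properly reviewed disclosure policy.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def town_counts(towns: Iterable[str], min_count: int = 5) -> Dict[str, Optional[int]]:
    """Count reports per town, suppressing counts below min_count.

    Town names are normalised (trimmed, case-folded) so that
    "Ashford" and " ashford " are counted together.
    Suppressed towns map to None rather than their true count.
    """
    counts = Counter(town.strip().casefold() for town in towns)
    return {
        town: (n if n >= min_count else None)
        for town, n in counts.items()
    }


# Hypothetical town names, invented for the example.
reports = ["Ashford"] * 6 + ["Brook", " brook "]
print(town_counts(reports))
```

Here the town with six reports is published, while the town with only two is shown as suppressed.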
Self-reported data means a self-selecting sample, likely biased towards negative tests, those with less serious symptoms, and those who are more comfortable with technology.
Self-reporters may not know the specifics of the test performed.
We can't verify that the data is accurate, so there will likely be noise in the data from:
data entry mistakes
duplicate reporting when reporting on behalf of a patient
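Duplicate reports could be partly mitigated by flagging submissions that agree on a set of fields. The sketch below is one rough heuristic, assuming reports are stored as dictionaries with hypothetical keys `town`, `test_date`, `test_type` and `result`. Two different people can legitimately match on all of these, so the function only flags candidates for review and removes nothing.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def candidate_duplicates(
    reports: Sequence[Dict[str, object]],
    key_fields: Tuple[str, ...] = ("town", "test_date", "test_type", "result"),
) -> List[List[int]]:
    """Return groups of report indices that agree on every key field."""
    groups: Dict[Tuple[str, ...], List[int]] = defaultdict(list)
    for index, report in enumerate(reports):
        # Normalise values so trivial differences in case or spacing still match.
        key = tuple(str(report.get(field, "")).strip().casefold() for field in key_fields)
        groups[key].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]


# Invented sample data: the first and third entries differ only in formatting.
sample = [
    {"town": "Ashford", "test_date": "2020-04-01", "test_type": "PCR", "result": "negative"},
    {"town": "Brook", "test_date": "2020-04-01", "test_type": "PCR", "result": "negative"},
    {"town": " ashford", "test_date": "2020-04-01", "test_type": "pcr", "result": "Negative"},
]
print(candidate_duplicates(sample))
```

The same normalisation step also catches some simple data entry differences, such as stray spaces or inconsistent capitalisation.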