COVID-19 Open Data Updates
Posted by Oscar Wahltinez, Developer Programs Engineer
It's been over two years since the COVID-19 Open Data repository launched. The world has changed a lot since then, and the purpose of the repository has changed, too.
The major features and associated use cases were:
- Real-time COVID-19 data updates for monitoring purposes
- Comprehensive location coverage for forecasting and research purposes
- Comprehensive covariate coverage for forecasting and research purposes
A major global shift in policy and in the resources allocated to COVID-19 monitoring has drastically changed our ability to maintain good-quality, up-to-date data. As society has shifted its focus away from the pandemic, many health authorities have reduced how often they publish updates and the types of data they report, and some have stopped updating data altogether.
Changing our focus
With that in mind, we have decided to shift our focus to the research use case: the data will remain available for retrospective analysis, and we will no longer provide real-time updates.
Snapshot of existing data sources for the COVID-19 Open Data repository.
The project site, GitHub repository, and data will continue to be accessible to users, including the associated BigQuery Datasets entry.
Related tools and further research
Although COVID-19 was evidently the focus of the repository, the breadth of data available is such that a number of generalizable tools and research projects were also built using our data. For example, the agent-based epidemic simulator can be seeded with real data from any chosen location and date, and the clinical trial site selection tool can be used to plan future large-scale, diverse vaccine trials.
Plot of visits to parks compared to Google search trends for skin rash and podalgia.
As evidenced by the plot, the correlation among the three variables is quite remarkable! The Pearson correlation coefficient is greater than 0.8 for any two of the three variables. The chart above can be replicated with a SQL query against the dataset hosted on BigQuery.
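The pairwise correlation quoted above can also be checked outside BigQuery once the three daily series have been exported. The following is a minimal Python sketch of that check, not the query used to produce the chart. The function names `pearson` and `pairwise_pearson` are hypothetical, and the sample values are made-up placeholders rather than repository data.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def pairwise_pearson(series):
    """Compute the coefficient for every unordered pair of named series."""
    return {
        (a, b): pearson(series[a], series[b])
        for a, b in combinations(series, 2)
    }


# Placeholder values only (assumption): substitute the exported daily series
# for park visits and the two search trends, aligned by date.
sample_series = {
    "parks_visits": [10.0, 12.0, 15.0, 14.0, 18.0],
    "skin_rash": [1.1, 1.3, 1.6, 1.4, 1.9],
    "podalgia": [0.5, 0.6, 0.8, 0.8, 1.0],
}

if __name__ == "__main__":
    for (a, b), r in pairwise_pearson(sample_series).items():
        print(f"{a} vs {b}: r = {r:.3f}")
```

With real data, each of the three printed coefficients would be compared against the 0.8 threshold mentioned above. The series must be aligned on the same dates before they are passed in, since the computation pairs values by position.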
Feel free to reach out to us via the COVID-19 Open Data site if you use this work for interesting and impactful research, or if you have any questions for us!