Accessing Hub Data
Accessing hub data from GitHub
All model-output, target, and configuration files for this hub are hosted on GitHub in the dailypartita/China-COVID-19-Forecast-Hub repository.
GitHub serves as the primary interface for operating the hub and collecting forecasts from modelers. You can access hub data by cloning the repository or by downloading individual files directly from GitHub.
Data Access Methods
The sections below provide examples for accessing hub data depending on your goals and preferred tools:
| Access Method | Description |
|---|---|
| Git/GitHub | Clone the repository for full local access |
| GitHub Raw Files | Direct HTTP access to individual files |
| GitHub API | Programmatic access to repository contents |
Cloning the Repository
To get a complete local copy of all hub data:
git clone https://github.com/dailypartita/China-COVID-19-Forecast-Hub.git
cd China-COVID-19-Forecast-Hub
This gives you access to:
- model-output/: All model forecasts
- target-data/: Observed data (time-series.csv)
- hub-config/: Hub configuration files
- model-metadata/: Model metadata files
Repository Structure
China-COVID-19-Forecast-Hub/
├── model-output/ # Model forecasts by team
│ ├── GZNL-test_001/
│ ├── GZNL-test_002/
│ ├── GZNL-test_003/
│ └── GZNL-test_004/
├── target-data/ # Observed/target data
│ └── time-series.csv
├── hub-config/ # Configuration files
├── model-metadata/ # Model descriptions
└── README.md
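Given the layout above, the forecasts in a local clone can be gathered into one collection. The following is a minimal Python sketch using only the standard library. It is not part of the hub's tooling: the function name `load_model_outputs` and the added `model_id` key are illustrative assumptions, and it assumes every team directory contains CSV files only.

```python
import csv
from pathlib import Path


def load_model_outputs(hub_path):
    """Read every CSV under <hub_path>/model-output into a list of row dicts.

    Hypothetical helper: each row gains a "model_id" key taken from the name
    of the team directory that contains the file. All values stay strings,
    exactly as csv.DictReader returns them.
    """
    rows = []
    for csv_file in sorted(Path(hub_path).glob("model-output/*/*.csv")):
        with csv_file.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["model_id"] = csv_file.parent.name  # e.g. "GZNL-test_001"
                rows.append(row)
    return rows
```

After cloning, `load_model_outputs("China-COVID-19-Forecast-Hub")` would return one dictionary per forecast row, which can be passed to `pandas.DataFrame(...)` if a table is preferred.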
Accessing Individual Files
You can directly download individual files using GitHub’s raw file URLs:
# R example: Load target data
library(readr)
target_data <- read_csv("https://raw.githubusercontent.com/dailypartita/China-COVID-19-Forecast-Hub/main/target-data/time-series.csv")
# Load a specific model's forecast
model_data <- read_csv("https://raw.githubusercontent.com/dailypartita/China-COVID-19-Forecast-Hub/main/model-output/GZNL-test_001/2025-08-28-GZNL-test_001.csv")
# Python example: Load target data
import pandas as pd
target_data = pd.read_csv("https://raw.githubusercontent.com/dailypartita/China-COVID-19-Forecast-Hub/main/target-data/time-series.csv")
# Load a specific model's forecast
model_data = pd.read_csv("https://raw.githubusercontent.com/dailypartita/China-COVID-19-Forecast-Hub/main/model-output/GZNL-test_001/2025-08-28-GZNL-test_001.csv")
Programmatic Access
Use the GitHub API to programmatically explore repository contents:
# R example using httr
library(httr)
library(jsonlite)
# List all model teams
response <- GET("https://api.github.com/repos/dailypartita/China-COVID-19-Forecast-Hub/contents/model-output")
# Pass the encoding explicitly so httr does not have to guess it
teams <- fromJSON(content(response, "text", encoding = "UTF-8"))$name
print(teams)
# Python example using requests
import requests
# List all model teams
response = requests.get("https://api.github.com/repos/dailypartita/China-COVID-19-Forecast-Hub/contents/model-output")
response.raise_for_status()  # stop on HTTP errors such as API rate limiting
teams = [item['name'] for item in response.json()]
print(teams)
Batch Download Script
#!/bin/bash
# Download all model outputs for a specific date
DATE="2025-08-28"
mkdir -p model-outputs-${DATE}
for team in GZNL-test_001 GZNL-test_002 GZNL-test_003 GZNL-test_004; do
  wget "https://raw.githubusercontent.com/dailypartita/China-COVID-19-Forecast-Hub/main/model-output/${team}/${DATE}-${team}.csv" \
    -O "model-outputs-${DATE}/${team}.csv"
done
Data Format
All model output files in this hub are stored in CSV format and follow the Hubverse data format standards.
Target Data
The target data (target-data/time-series.csv) contains observed SARS-CoV-2 positivity rates among department influenza-like illness (ILI) cases, as reported in China CDC’s weekly National Sentinel Surveillance of Acute Respiratory Infectious Diseases.
Model Output Data
Each model output file contains quantile forecasts with the following columns:
- reference_date: The date the forecast was made
- target: The forecasting target (wk inc covid prop ili)
- target_end_date: The date for which the prediction is made
- location: Geographic location code (CN for China)
- output_type: Type of prediction (quantile)
- output_type_id: Quantile level (0.01, 0.025, 0.05, …, 0.975, 0.99)
- value: The predicted value
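To illustrate how rows in this layout might be consumed, the sketch below groups quantile rows by target_end_date and pulls out a median and a 95% prediction interval (the 0.025 and 0.975 levels). The helper name `summarize_quantiles` is hypothetical and the sample values are made up, not real hub data. It also assumes that 0.5 is among the submitted quantile levels, which the column description does not state explicitly.

```python
import csv
import io

# Made-up sample rows in the column layout described above (not real hub data).
SAMPLE_CSV = """reference_date,target,target_end_date,location,output_type,output_type_id,value
2025-08-28,wk inc covid prop ili,2025-09-06,CN,quantile,0.025,0.04
2025-08-28,wk inc covid prop ili,2025-09-06,CN,quantile,0.5,0.07
2025-08-28,wk inc covid prop ili,2025-09-06,CN,quantile,0.975,0.11
"""


def summarize_quantiles(rows, lower=0.025, median=0.5, upper=0.975):
    """Return {target_end_date: (lower, median, upper)} for quantile rows.

    Assumes all rows share one location and target. A quantile level that is
    missing for a given date is reported as None.
    """
    by_date = {}
    for row in rows:
        if row["output_type"] != "quantile":
            continue  # ignore any other output types
        level = float(row["output_type_id"])
        by_date.setdefault(row["target_end_date"], {})[level] = float(row["value"])
    return {
        date: (levels.get(lower), levels.get(median), levels.get(upper))
        for date, levels in by_date.items()
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_quantiles(rows)
print(summary)
```

Other interval widths can be obtained by changing `lower` and `upper`, provided the requested levels are present in the file.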
For more details on the data format and hub structure, see the repository README.