The Weatherlogics Climate Database (WCD) expertly joins, merges, and quality-controls climate data to give the most complete database for Canada. This process is complex and involves many steps. The methods used for joining, merging, and quality-controlling data are described in detail below.
Joining refers to the process of combining two or more weather stations to produce a longer historical record. Over time weather stations are added and removed, but by joining stations we can maintain a long historical record for a given location.
All stations in the Weatherlogics Climate Database have been automatically joined based on specific rules. To qualify for joining, stations must:
- Be within 25 km of each other.
- Be in the same climate (e.g. no joining coastal and inland stations).
- Be within 100 m elevation of each other.
All active, automated stations (data up to 2019) with both hourly and daily data (at minimum) are used as the starting point for joining. Active stations are not joined together, so that we preserve a distinct climate record for each active station. Even if two or more active stations are close together, they will be kept separate.
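As an illustration only, the three criteria above could be checked along the following lines. This is a minimal sketch, not the WCD implementation: the `Station` fields, the `climate_zone` label, the use of the haversine formula for distance, and the sample coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0        # mean Earth radius
MAX_DISTANCE_KM = 25.0          # from the joining rules
MAX_ELEVATION_DIFF_M = 100.0    # from the joining rules


@dataclass
class Station:
    # Hypothetical record layout, chosen for this sketch.
    name: str
    lat: float
    lon: float
    elevation_m: float
    climate_zone: str  # assumed label such as "coastal" or "inland"


def haversine_km(a: Station, b: Station) -> float:
    """Great-circle distance between two stations in kilometres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def can_join(a: Station, b: Station) -> bool:
    """True if the pair meets all three joining criteria."""
    return (haversine_km(a, b) <= MAX_DISTANCE_KM
            and a.climate_zone == b.climate_zone
            and abs(a.elevation_m - b.elevation_m) <= MAX_ELEVATION_DIFF_M)


# Made-up stations for demonstration.
station_a = Station("A", 49.9, -97.2, 239.0, "inland")
station_b = Station("B", 49.9, -97.1, 232.0, "inland")   # a few km east of A
station_c = Station("C", 50.5, -97.2, 240.0, "inland")   # well over 25 km north of A
station_d = Station("D", 49.9, -97.1, 400.0, "inland")   # too high relative to A
```

Here `can_join(station_a, station_b)` passes all three checks, while `station_c` fails on distance and `station_d` fails on elevation.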
Order of joining
Station joining is not always simple, since many stations often overlap. Below is the priority of joining. In the case of ties, the period of record is the tie breaker.
1. Complete* airport stations with WMO and Transport Canada (TC) IDs.
2. Non-airport (usually climate) stations with WMO and/or TC IDs.
3. Airport stations not included in (1)**.
4. Any automated station not included in 1-3.
5. Any non-automated station.
*Complete means it records temperature, humidity, wind, precipitation and weather conditions at minimum.
**If the station continues the record from a previous station with the same TC ID, it moves ahead of a station classified as position 2.
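The priority order above can be sketched as a sort key. This is a hedged illustration rather than the actual WCD code: the `Candidate` fields are hypothetical, the rank of 1.5 for the same-TC-ID case is one way of modelling "moves ahead of position 2" (the text does not say whether it also moves ahead of position 1), and treating a longer period of record as the winner of a tie is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    # All field names are hypothetical, chosen for this sketch.
    name: str
    record_years: float
    is_airport: bool = False
    is_complete: bool = False      # temperature, humidity, wind, precipitation, weather
    is_automated: bool = False
    has_wmo_id: bool = False
    has_tc_id: bool = False
    continues_same_tc_id: bool = False


def priority_rank(c: Candidate) -> float:
    """Lower rank is joined first."""
    if c.is_airport and c.is_complete and c.has_wmo_id and c.has_tc_id:
        return 1.0
    if c.is_airport and c.continues_same_tc_id:
        return 1.5  # assumption: ahead of position 2, behind position 1
    if not c.is_airport and (c.has_wmo_id or c.has_tc_id):
        return 2.0
    if c.is_airport:
        return 3.0
    if c.is_automated:
        return 4.0
    return 5.0


def joining_order(candidates: List[Candidate]) -> List[Candidate]:
    # Ties are broken by period of record (assumed: longer record first).
    return sorted(candidates, key=lambda c: (priority_rank(c), -c.record_years))


candidates = [
    Candidate("manual", 60),
    Candidate("climate_short", 10, is_automated=True, has_wmo_id=True),
    Candidate("climate_long", 30, is_automated=True, has_wmo_id=True),
    Candidate("airport_partial", 20, is_airport=True, has_tc_id=True,
              continues_same_tc_id=True),
    Candidate("airport_full", 15, is_airport=True, is_complete=True,
              is_automated=True, has_wmo_id=True, has_tc_id=True),
]
ordered_names = [c.name for c in joining_order(candidates)]
```

Encoding the whole priority as a single tuple key keeps the tie-breaker in one place, so changing the tie-break rule only touches `joining_order`.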
Exceptions are sometimes made to the joining rules outlined above. Here are two types of exceptions:
- Co-located, active stations can be joined if they represent the same location. An example of co-located stations would be Winnipeg Intl A (Nav Canada) and Winnipeg Airport CS (ECCC), which are both at the Winnipeg Airport. An example of non-co-located stations would be Winnipeg Intl A and Winnipeg The Forks. These two stations meet the joining criteria, but are both active and represent different locations of the city, which therefore means they cannot be joined.
- Subjective exceptions have been made to the rules above in cases where breaking the rules is justifiable to continue a long climate record. All such cases have been documented.
Merging is the process where missing values are filled using alternative data sources or stations. This is done to limit the amount of missing data in the climate record. If a value is missing, the following steps outline how that value is filled:
1. METARs and SYNOPs from the same station are retrieved to see if they contain the missing data.
2. If step (1) fails, alternative data from the same station may be used to estimate the missing value. For example, if the daily maximum temperature is missing, the highest hourly temperature is used to estimate the daily maximum. This only applies to certain variables, such as temperature, which are measured hourly.
3. If steps (1) and (2) fail to fill the missing value, a nearby station is selected as a replacement. If there is more than one nearby station, the priority list in the joining section is used to set the order in which they will be used for replacement of the missing value.
4. If (1), (2), and (3) fail to fill the missing value, the value can sometimes be filled during the QC process. For example, if daily rain is missing and the METARs from that day recorded no precipitation, the daily rain value is logically filled with 0. See the QC checks for more details.
5. If all of the above fail to fill the missing value, it remains missing.
Always check the flag for a data value to see if it was filled using the steps above. If the value was filled using a different station, check the ecid column to see which station was used.
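The fill order described above amounts to a fallback chain: each source is tried in turn and the first one that yields a value wins. The sketch below shows that pattern under stated assumptions. The source labels are hypothetical and are not the actual WCD flag values, and the lambdas stand in for real lookups.

```python
from typing import Callable, List, Optional, Tuple

# A source is a label plus a zero-argument function returning a value or None.
Source = Tuple[str, Callable[[], Optional[float]]]


def fill_missing(sources: List[Source]) -> Tuple[Optional[float], Optional[str]]:
    """Return the first available value and the label of the source that supplied it."""
    for label, fetch in sources:
        value = fetch()
        if value is not None:   # a legitimate 0.0 still counts as a value
            return value, label
    return None, None           # nothing worked: the value remains missing


# Example: daily maximum temperature is missing for one day.
hourly_temps = [12.1, 15.4, 18.9, 17.2]

sources: List[Source] = [
    ("metar_synop", lambda: None),                          # step 1: nothing found
    ("same_station_estimate", lambda: max(hourly_temps)),   # step 2: highest hourly
    ("nearby_station", lambda: 19.3),                       # step 3: not reached
    ("qc_logic", lambda: None),                             # step 4: not reached
]

value, label = fill_missing(sources)
```

Returning the label alongside the value mirrors the advice above to check how a value was filled; in this example the value comes from the same-station estimate, so the nearby station is never consulted.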
Every data point in the WCD is quality controlled. If an erroneous value is found, one of three actions is taken.
- If the value is suspected to be erroneous, but there is reasonable doubt, the value is maintained, but marked with a suspect (Z) flag.
- If the value is found to be erroneous with certainty, the value is rejected and marked missing. We then use the priority list from the merging section to determine how the missing value will be filled.
- If a value is rejected and no replacement value is available, it remains missing.
Below is a list of QC Checks performed by our climate QC program. For more information, please contact us.
- Gross limits: Check if a value is obviously above/below fixed gross limits (e.g. air temperature of 200 C). For the specific limits, please contact us.
- Too warm for snow: If snow is recorded with a daily minimum temperature above 6 C.
- Fake snow from ice crystals: If more than 0.4 cm of snow occurs, but only ice crystals are reported, change snowfall total to 0.
- Fill missing with logic: If a daily precipitation value is missing, fill with 0 if no precipitation was recorded in the METARs (and no METARs are missing) that day.
- Fill missing snow depth with logic: If the daily snow depth value is missing, and no new snow has occurred, fill with the previous day's snow depth. This fill is applied for no more than one day in a row.
- Verify estimate: If a value is marked as estimated, verify it using other data.
- Absent snowfall: If snowfall was reported in the METARs (and no METARs are missing), but no daily snowfall was recorded, fill with Trace. If snowfall is zero and rainfall is 0.2 mm or higher, replace precipitation amount with rainfall amount.
- Fake precipitation: If daily precipitation is 2.0 mm or higher and no precipitation was recorded in the METARs (and no METARs are missing), replace precipitation with 0.
- Max/min inconsistent with hourlies: If the highest/lowest hourly temperature is higher/lower than the daily max/min, mark the daily max/min value suspect.
- Suspect max/min: If the daily max/min is more than 3 C higher/lower than the highest/lowest hourly temperature, mark suspect.
- Suspect gusts: Case 1) If the max gust is 100 km/h or higher and the maximum hourly sustained wind is 40 km/h or less and the maximum hourly gust is 70 km/h or less and no thunderstorm was reported, replace with missing. Case 2) If the max gust is 150 km/h or higher and the maximum hourly sustained wind is 50 km/h or less and the maximum hourly gust is 90 km/h or less, replace with missing.
- Suspect precipitation type: Mark suspect any case where snow or ice pellets are reported with an air temperature above 10 C.
- Fill missing max/min with hourlies: If the daily max/min temperature is missing, fill with the highest/lowest hourly temperature. No more than one hourly observation can be missing.
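To make the thresholds concrete, here is a minimal sketch of two of the checks listed above: the suspect gust check and the two daily max/min checks against hourly temperatures. Function and parameter names are hypothetical, and this is an illustration of the stated rules rather than the actual QC program. Wind values are in km/h and temperatures in C.

```python
from typing import Sequence


def gust_is_rejected(daily_max_gust: float, max_hourly_wind: float,
                     max_hourly_gust: float, thunderstorm: bool) -> bool:
    """True if the daily max gust should be replaced with missing."""
    case_1 = (daily_max_gust >= 100 and max_hourly_wind <= 40
              and max_hourly_gust <= 70 and not thunderstorm)
    case_2 = (daily_max_gust >= 150 and max_hourly_wind <= 50
              and max_hourly_gust <= 90)
    return case_1 or case_2


def daily_max_is_suspect(daily_max: float, hourly_temps: Sequence[float]) -> bool:
    """Suspect if an hourly value exceeds the max, or the max is over 3 C above it."""
    highest = max(hourly_temps)
    return highest > daily_max or daily_max - highest > 3


def daily_min_is_suspect(daily_min: float, hourly_temps: Sequence[float]) -> bool:
    """Suspect if an hourly value is below the min, or the min is over 3 C below it."""
    lowest = min(hourly_temps)
    return lowest < daily_min or lowest - daily_min > 3


# Made-up observations for demonstration.
hourly = [15.0, 18.0, 21.0]
gust_no_storm = gust_is_rejected(110, 35, 60, thunderstorm=False)   # case 1 applies
gust_with_storm = gust_is_rejected(110, 35, 60, thunderstorm=True)  # storm explains gust
max_ok = daily_max_is_suspect(22.0, hourly)        # 1 C above highest hourly
max_too_high = daily_max_is_suspect(25.0, hourly)  # 4 C above highest hourly
```

Note that case 2 of the gust check does not look at the thunderstorm report, so a 150 km/h gust with weak hourly winds is rejected even on a thunderstorm day.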
While our QC process is quite rigorous, we are aware of some issues. This is a list of known issues. It will be updated over time as issues are discovered or corrected:
- TS Days: There are some erroneous SYNOPs which recorded a thunderstorm when one did not occur. These are often obvious, since daytime temperatures are below freezing. However, some are less obvious. We have removed all instances of thunderstorms below -10 C, but some erroneous cases still exist.
- Total precip/rain: There are some erroneous SYNOPs which had an extreme amount of daily rain or precipitation. These erroneous daily amounts are often over 100 mm. Most of these cases are flagged QS or ZB. We have removed all cases where daily precipitation is greater than 100 mm with a ZB flag, but some erroneous cases still exist.
- Daily max/min temps: There are some erroneous SYNOPs which had incorrect daily max/min temps. We have fixed the cases that we were aware of, but some errors still exist.