Behind the scenes of the 2021 Census: the role of data
Every five years, the Australian Bureau of Statistics (ABS) counts every person and household in Australia, in what is officially known as the Census of Population and Housing. The information provided by people across the country is used to inform important decisions about transport, schools, health care, infrastructure and business. Lateral Economics has released an independent valuation of the Australian Census, which found that every $1 invested in the Census generated $6 of value for the Australian economy.
Clearly, it is critical to get the Census count right. In this article, three members of the ABS Census team explain how their work contributes to improved Census enumeration. In the first example, Heather Cleaves from the Census Project Management Office addresses the important topic of data ethics. The ABS’ approach was underpinned by the ‘privacy by design’ principle, which informed all aspects of the Census design and has helped to maintain community trust. In the second example, methodologist Ben Ingram shows how data analysis was used to optimise the recruitment and deployment of the Census workforce in the field. In the final example, Ross Watmuff from the Census Futures section discusses the role administrative data played in enhancing Census counts in certain areas of the country.
Data ethics and the 2021 Census
The ABS implemented a ‘privacy by design’ approach for the 2021 Census. This means that privacy is considered at all stages of any process or product. The 2021 Census incorporated privacy into business planning, staff training, priorities, project objectives and design processes.
In line with this approach, the ABS commissioned two Privacy Impact Assessments (PIAs) in advance of the 2021 Census. The 2021 Census PIA considered the alignment of the Census with privacy and ABS legislation, user acceptance and public perception issues. This included the statistical data that is collected in the Census, information that is used to support Census operations and information about the large temporary Census workforce. The second PIA considered the proposed use of integrated administrative data in the 2021 Census.
While the PIA was under way, a risk assessment of potential privacy issues was provided so that privacy practices could be strengthened promptly. This enabled the Census program to make privacy-preserving decisions as the planning and design work progressed. The early adoption of privacy enhancements encouraged a culture of treating privacy impacts as a primary consideration.
In response to the PIAs, the ABS has strengthened its data protection practices by developing a range of privacy-enhancing measures, such as a 7-8 year Census Privacy Strategy, which covers more than one Census, and principle-based approaches to managing privacy risk.
Ensuring the privacy and protection of data remains a critical priority throughout the Census lifecycle, extending well after the data has been collected. Census data must continue to be well managed, safe and secure for as long as it is held by the ABS.
The Census program developed a ‘Data Protection Toolkit’ for the 2021 Census to assist staff in developing robust personal information-handling practices, procedures and systems.
In order to pass assurance processes, each Census team has developed Data Protection and Retention Plans. These plans stipulate how data will be managed and protected in accordance with data management and privacy principles, security and access controls, and data retention and secure destruction procedures. The assurance process also requires that teams develop Privacy Action Plans for specific workflows. These ensure preparedness to respond effectively in the event of a privacy breach.
Further information on the privacy practices of the 2021 Census, including the Census PIA reports and ABS’ responses, is available on the ABS website.
Using data to inform the count of the population in the Census
Ben Ingram, Statistical Methodology Branch
Counting the Australian population is a major task that involves a very large workforce of temporary ABS staff in the field. We use data and modelling to determine where we need to hire staff as well as how to marshal them to produce the highest quality results possible.
A major project in the lead-up to the 2021 Census was determining our Field Officer recruitment targets. That is, determining how many staff we expected to need in each area around the country to knock on doors and encourage responses from households. To do this, we used data from the 2016 Census and created statistical and machine learning models for several different things at the area level:
- Dwelling projections - how many dwellings we expected to be in each area by August 2021.
- Self-response rates - what percentage of dwellings we expected to respond without us needing to visit them.
- Visit resolution rates - what percentage of dwellings would respond after receiving a visit.
- Visit and travel times - how long the visits would take.
Results of these models were combined to produce estimates of the amount of work required in each area, and hence how many staff we needed to hire to complete this vital work.
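As a rough illustration of how the four model outputs listed above could be combined into a staffing estimate, here is a minimal Python sketch. It is not the ABS method: the names `AreaForecast`, `estimated_visits` and `officers_needed`, the cap of three visits per dwelling, and the 100 working hours per officer are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class AreaForecast:
    """Hypothetical per-area outputs of the four models described above."""
    dwellings: int                 # projected dwellings in the area
    self_response_rate: float      # share responding without a visit
    visit_resolution_rate: float   # share of visited dwellings that respond after a visit
    minutes_per_visit: float       # visit time plus travel time


def estimated_visits(area: AreaForecast, max_visits: int = 3) -> float:
    """Expected number of visits, assuming each round of visits resolves
    a fixed share of the dwellings still outstanding (an assumption)."""
    outstanding = area.dwellings * (1 - area.self_response_rate)
    total = 0.0
    for _ in range(max_visits):
        total += outstanding
        outstanding *= 1 - area.visit_resolution_rate
    return total


def officers_needed(area: AreaForecast,
                    hours_per_officer: float = 100.0,
                    max_visits: int = 3) -> int:
    """Convert expected visits into hours of work, then into whole officers."""
    hours = estimated_visits(area, max_visits) * area.minutes_per_visit / 60
    return math.ceil(hours / hours_per_officer)


# Made-up inputs for one area, for illustration only.
example_area = AreaForecast(dwellings=1000, self_response_rate=0.8,
                            visit_resolution_rate=0.5, minutes_per_visit=30)
print(estimated_visits(example_area), officers_needed(example_area))
```

With these made-up inputs, about 200 dwellings need a first visit, 100 a second and 50 a third, giving roughly 350 visits, 175 hours of work and a target of 2 officers. In practice each input would itself be a model estimate with uncertainty attached.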
Another area where we made use of data to maximise results was in operational monitoring during the Census count. We created models for the expected response rate over time. Comparing these with actual response rates as they came in told us which areas were underperforming, allowing us to prioritise our efforts appropriately.
The models for expected response rates over time used data from the 2016 Census and incorporated adjustments for known and expected changes in 2021. These models provided valuable information to help guide decisions on prioritising field effort to maximise response. This was particularly important in areas where we might need to fly in extra field staff, or where we could cease visits earlier or extend them if necessary. Ensuring the workforce was targeted to the highest priority areas was an important part of the enumeration (counting) phase of the 2021 Census.
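The comparison of expected and actual response rates can be sketched in a few lines of Python. This is an illustration only: the function name `underperforming_areas`, the area identifiers and the 5 percentage point tolerance are assumptions, not details of the ABS monitoring system.

```python
def underperforming_areas(expected, actual, tolerance=0.05):
    """Return the areas whose actual response rate lags the modelled
    expectation by more than `tolerance`, largest shortfall first.

    `expected` and `actual` map a hypothetical area id to a response
    rate between 0 and 1 on the same day of the count.
    """
    # Areas with no actual rate recorded yet are treated as 0.0 (an assumption).
    gaps = {area: expected[area] - actual.get(area, 0.0) for area in expected}
    flagged = [area for area, gap in gaps.items() if gap > tolerance]
    return sorted(flagged, key=lambda area: gaps[area], reverse=True)


# Made-up rates for three areas on a given day.
expected_today = {"A": 0.70, "B": 0.65, "C": 0.80}
actual_today = {"A": 0.69, "B": 0.50, "C": 0.70}
print(underperforming_areas(expected_today, actual_today))
```

Here areas B and C are flagged, with B listed first because its shortfall is larger. Ranking by shortfall is one simple way to decide where extra field effort would go first; a real system would likely also weigh area size and remoteness.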
Using insights from data to assess occupancy and improve Census counts
We are using administrative data to help decide which houses were empty on Census night. This is important for the accuracy of our final Census counts.
Once the Census has collected as many forms as possible, there will still be a small set of houses where, despite our best efforts in the field, we can't get a response (this was about 3% of houses in 2016). It's important to make an accurate assessment of how many of these were empty on Census night because, if we believe a house was occupied, we adjust the Census count to cover the people who were missed. For the 2016 Census, our Post Census Review showed that we assumed too many houses were occupied. This effect was particularly pronounced in areas like inner Sydney and Melbourne, where dense apartment blocks make occupancy harder to assess.
For the 2021 Census, we are using administrative data to help assess occupancy of these houses. This is now more important for areas like inner Sydney and Melbourne, where COVID lockdowns are making occupancy assessment in the field a lot more difficult than usual. First, we will create indicators of occupancy from administrative data for the houses where we need to make a decision. The administrative data we use includes data from government services like Medicare, Centrelink and the Australian Taxation Office, information on electricity use from energy distributors, information from our Address Register and data from the last Census.
We then use these indicators to assess whether houses that didn’t send us a form were empty or occupied. For example, if the house is in an area where electricity use shows that not many houses are occupied, we will probably decide it's empty. Applying this new approach to 2016 Census data shows that we would have decided that a further 1.7% of houses were empty. This matches closely with the findings from the 2016 Post Census Review, making us confident that this approach will improve the accuracy of our counts in the 2021 Census.
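To make the idea of occupancy indicators concrete, the sketch below combines three hypothetical indicators into a single occupied or empty decision. The article does not describe how the ABS weighs its indicators, so the `OccupancyIndicators` fields, the weights and the 0.5 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class OccupancyIndicators:
    """Hypothetical indicators for one non-responding house."""
    admin_activity: bool         # recent activity at the address in government service data
    area_occupied_share: float   # share of nearby houses that electricity use suggests are occupied
    occupied_last_census: bool   # whether the house was occupied at the previous Census


def assess_occupied(ind: OccupancyIndicators, threshold: float = 0.5) -> bool:
    """Return True if the house is assessed as occupied on Census night.

    The weights are illustrative assumptions, not ABS parameters.
    """
    score = 0.5 * ind.area_occupied_share
    score += 0.3 if ind.admin_activity else 0.0
    score += 0.2 if ind.occupied_last_census else 0.0
    return score >= threshold


# A house in a low-occupancy area with no other signals is assessed as empty.
quiet_house = OccupancyIndicators(admin_activity=False,
                                  area_occupied_share=0.2,
                                  occupied_last_census=False)
print(assess_occupied(quiet_house))
```

A fixed weighted score keeps the example readable. A statistical model fitted to houses whose occupancy is known would be a more realistic way to set the weights.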