Apr 11, 2023 | 3 min read

What is "data quality" and why is it important?

Sacha van Tuijn

To understand the performance of your portfolio or an individual asset, it’s important to have a high rate of data coverage. This ensures that you are getting a complete picture of what is going on in a building. However, to make good business decisions based on that data, it is also important to understand the quality of that data. Higher-quality data gives managers, investors and other stakeholders confidence in the integrity and reliability of the reported information.

What is meant when we talk about data quality?

Data quality refers to the reliability of the data being collected, or in other words, how much you can trust that the data you are looking at reflects reality. We can think of this along two lines: the source of the data, including how error-prone it is given the amount of manual work involved, and whether the data is actual or estimated.

Data is of the highest quality when its collection is fully automated, that is, when it goes directly from a smart meter to an analytics tool or platform. In such cases, there is no opportunity for human error to be introduced. More often, however, data is exposed to manual processing at some point before reaching the point of analysis. Each touch point introduces a potential for human error. A traditional meter, for example, has to be read manually on site. The reading could be taken or recorded incorrectly and, on its way to the final database, be copied or entered with an incorrect digit or an incorrect unit. We often find energy consumption values that are entered in megawatts instead of kilowatts or vice versa, throwing the data off by a factor of 1000.
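To make the factor-of-1000 problem concrete, here is a minimal Python sketch of how such entries could be flagged. It is an illustration only, not a description of how Scaler or any other platform checks data. The function name flag_unit_errors, the tolerance parameter and the sample readings are all hypothetical, and the check assumes that most readings in a series are in the correct unit.

```python
from statistics import median


def flag_unit_errors(readings, tolerance=3.0):
    """Flag readings that look roughly 1000x too high or too low.

    Hypothetical helper: each reading is compared with the median of the
    series, on the assumption that most values use the correct unit.
    """
    typical = median(readings)
    flagged = []
    if typical <= 0:
        return flagged
    for index, value in enumerate(readings):
        if value <= 0:
            continue
        ratio = value / typical
        # A ratio near 1000 suggests a value recorded in the smaller unit
        # (e.g. kilowatt figures where megawatt figures were expected).
        if 1000 / tolerance <= ratio <= 1000 * tolerance:
            flagged.append((index, "about 1000x too high"))
        # A ratio near 1/1000 suggests the opposite mix-up.
        elif 1 / (1000 * tolerance) <= ratio <= tolerance / 1000:
            flagged.append((index, "about 1000x too low"))
    return flagged


# Made-up monthly readings for one meter, with two unit mix-ups.
monthly = [1200.0, 1150.0, 1.18, 1230.0, 1190000.0, 1210.0]
for index, reason in flag_unit_errors(monthly):
    print(f"reading {index} ({monthly[index]}): {reason}")
```

A check like this can only point at suspicious values. Someone still has to go back to the original meter reading to confirm which unit was intended.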

Actual meter data refers to data that comes from a meter. Estimated meter data refers to a range of calculations based on known variables such as floor area, property type, location and the presence of energy labels, and, for energy consumption specifically, average emissions factors instead of supplier-specific emissions factors for energy sources. Actual data is of course of higher quality, but estimated data may be necessary when data is missing for a certain period of time, or when data is difficult to collect due to privacy issues. This is often the case for residential portfolios.

Why is data quality important?

Data quality is emerging as an important metric that can give managers, investors and other stakeholders more visibility into the true state of their portfolios. Higher-quality data can be better trusted and relied upon when making business decisions. ESG data is increasingly being looked at from a strategic perspective and used to inform capital and operational expenditure. More stringent requirements on data are already being put in place by regulatory bodies, and many reporting requirements disallow the use of estimated data.

How is Scaler helping managers ensure the quality of their data?

Data quality is given a prominent position in the Scaler platform. We look at this metric at both the portfolio and asset level, displaying it alongside data coverage and data completion to give a more complete picture of your data characteristics. By making data quality accessible, we give clients insight into their investments and the necessary contextual information to take action and improve quality for single assets or across an entire portfolio.

Our method for calculating data quality is aligned with the Partnership for Carbon Accounting Financials (PCAF), which uses a scale of 1 to 5 based on actual vs. estimated emissions. Scaler takes this a step further by incorporating the data source (e.g. smart or conventional meter) and by applying a data quality score to water and waste consumption in addition to energy consumption.
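As a rough illustration of how the two dimensions could combine into one score, consider the Python sketch below. It is not Scaler's actual method and not the PCAF rubric. The score values, the Reading class, the function names and the consumption-weighted average are all assumptions made for the example, with 1 treated as the best score and 5 as the worst.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    consumption: float  # energy, water or waste, in any single consistent unit
    is_actual: bool     # False means the value was estimated
    source: str         # "smart_meter" or "conventional_meter"


def quality_score(reading):
    """Hypothetical 1-5 score: 1 is best, 5 is worst."""
    if not reading.is_actual:
        return 5  # estimated data scores lowest in this sketch
    if reading.source == "smart_meter":
        return 1  # automated, actual data scores highest
    return 3      # actual data that was read or entered manually


def weighted_score(readings):
    """Consumption-weighted average score for an asset or portfolio."""
    total = sum(r.consumption for r in readings)
    if total == 0:
        raise ValueError("no consumption to weight by")
    return sum(quality_score(r) * r.consumption for r in readings) / total


# Made-up readings for one asset.
asset = [
    Reading(600.0, True, "smart_meter"),
    Reading(300.0, True, "conventional_meter"),
    Reading(100.0, False, "conventional_meter"),
]
print(weighted_score(asset))
```

Weighting by consumption is one possible design choice. It makes the largest data streams dominate the result, so improving the quality of a large meter moves the score more than improving a small one.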

You can find more information on the PCAF Global GHG Standard here. Along with CRREM and GRESB, two other important industry players, PCAF has recently released “harmonized technical guidance for the financial industry on Accounting and Reporting of GHG emissions from Real Estate Operations.” The technical guidance comes after intensive public consultation and is a great example of the growing emphasis on, and need for, standardization and clarification around data quality.

How should managers use their data quality score?

As with all emerging metrics, it is important for managers to first get visibility of their baseline. Once this is clear, managers should start developing internal targets and a strategy to improve their data quality over time, that is, moving toward automated data and toward data that reflects actual rather than estimated consumption.

The way a manager goes about this depends on their starting point. They may need to think about installing smart meters across their properties and training stakeholders to enter data into their data management platform as carefully and effectively as possible. In Europe and other regions that ensure privacy protection for tenants, managers will need to set up processes to collect tenant data, though this can be tedious and complex. Scaler is working with clients to educate, better engage with and better manage data collection from tenants and stakeholders. As part of our mission to empower the industry, we are developing tools to provide clients with a more complete picture of how their buildings are performing and the confidence to use this data in their decision making.