
Gathering Quality Data

Introduction

Embedding practices to manage and improve your data quality and collection is the foundation of any retrofit project. Confidence in your underlying stock portfolio information will aid you in retrofit programme decision making and allow you to have a full understanding of your housing stock.

Contents

    Good quality information

    Please note that although the words ‘information’ and ‘data’ are used interchangeably throughout this document, the term ‘information’ is generally preferred when referring to information about property type, condition and usage. This is to avoid confusion with the data gathered to measure property energy efficiency before and after retrofit.

    Collecting data

    To ensure reliable stock analysis results, it's crucial to have comprehensive information on your homes. Filling gaps with assumptions based on similar properties (known as ‘cloning’) can be a useful tool in the early stages of preparing a retrofit project but naturally reduces accuracy.

    Therefore, as you develop your retrofit plans, it's important to gather enough information on individual homes to understand their build type, existing components and, ideally, occupant behaviours and usage patterns. Before collecting new information, check existing data from asset management systems to avoid duplication and unnecessary time and cost. Below are six factors to consider to improve your data quality:

    1. Cloning: Copying information about homes with similar archetypes, and their usage, may meet certain information requirements. This may be especially useful where timescales are limited, but the approach is generally considered a much less reliable way of understanding each property, largely because it assumes the homes are identical. If using cloning, ensure you are confident in the information available and have explored any differences between the buildings that mean they are not directly comparable. For example, a database entry containing information about a non-retrofitted property could not be reliably cloned to provide data about a very similar building that has had some retrofit measures installed.
    2. Registered Energy Performance Certificate (EPC): Information from an EPC register can act as a guide to the energy performance of a property. However, EPCs are often out of date, are not aligned to net zero carbon planning and, like cloning, can be assumption-based and so contain data quality issues. When an EPC assessor is conducting your EPC audit, you should encourage use of the Standard Assessment Procedure (SAP) to collect data. Reduced SAP (RdSAP) contains more assumptions and so is less reliable (see point 4 below).
    3. Stock condition surveys: As a landlord you will already collect a raft of data through periodic stock condition surveys, for general asset management purposes and to help comply with the Decent Homes Standard. Usually these surveys include detailed information on internal and external building components, which can provide relevant data on the current thermal performance and energy efficiency of homes.
    4. SAP data: This consists of two levels. The full SAP provides a comprehensive dataset using floor plans, drawings and specifications, and is required for new buildings. RdSAP, a reduced version for existing homes, uses some assumed default data, making it less accurate. Collecting additional data points, such as U-values, can improve RdSAP accuracy.
    5. Thermal imaging surveys: These surveys help to identify where homes are experiencing excessive heat loss, providing evidence that fabric measures in specific areas may be needed as part of the overall retrofit approach. This information can complement other calculations and assessments of a home’s energy performance.
    6. Retrofit assessments: A retrofit assessment will identify the measures to be installed to improve the energy performance of a home. Additional energy assessments will be required to understand the overall energy performance, but a retrofit assessment will use standards such as PAS 2035 to identify the key building data to be established and will include an appraisal of occupancy, including the number of occupants and any special considerations.
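
The cloning approach in point 1 can be illustrated with a minimal Python sketch. It fills gaps in one home's record from a similar home, refuses to clone when the archetype or retrofit status differs, and records which fields were cloned so they can be clearly marked later. The field names (`archetype`, `retrofitted`, `loft_insulation_mm`) and the records are hypothetical and are not taken from any particular asset management system.

```python
def clone_missing(target, donor, fields):
    """Fill gaps in `target` from `donor` and record which fields were cloned."""
    # Only clone between homes that are directly comparable (assumed rule).
    for key in ("archetype", "retrofitted"):
        if target.get(key) != donor.get(key):
            raise ValueError(f"Homes differ on '{key}', so cloning is unreliable")
    result = dict(target)
    cloned = []
    for field in fields:
        if result.get(field) in (None, "") and donor.get(field) not in (None, ""):
            result[field] = donor[field]
            cloned.append(field)
    result["cloned_fields"] = cloned  # keeps assumed data clearly marked
    return result


# Hypothetical records for two similar homes
donor = {"uprn": "1001", "archetype": "1930s semi", "retrofitted": False,
         "wall_type": "cavity", "loft_insulation_mm": 100}
target = {"uprn": "1002", "archetype": "1930s semi", "retrofitted": False,
          "wall_type": "cavity", "loft_insulation_mm": None}

filled = clone_missing(target, donor, ["wall_type", "loft_insulation_mm"])
print(filled["loft_insulation_mm"], filled["cloned_fields"])
```

Keeping a `cloned_fields` list alongside each record is one possible way to make it easy to replace assumptions with surveyed values as they become available.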

    Aggregating Data

    Once enough information has been collected, best practice for storing and managing it involves two key principles: a single version of the truth and consistent formatting.

    Single version of the truth

    Different datasets may exist within your organisation. Ensuring these are consistent will mean there is limited ambiguity between them, allowing all departments within your organisation to work cohesively even when using the different sources. Using a centralised database that stores all the portfolio’s information will enable even better streamlining of work processes. If different software is used to gather information, such as asset management systems, it is important to keep the central database up to date.
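
As a rough sketch of how consistency between sources might be checked, the Python snippet below compares a central database with an extract from another system and lists every field where the two disagree. Both datasets, the identifiers and the field names are hypothetical; real systems will need their own export and matching logic.

```python
def find_conflicts(central, other):
    """Return (uprn, field, central_value, other_value) for each disagreement."""
    conflicts = []
    for uprn, other_record in other.items():
        central_record = central.get(uprn)
        if central_record is None:
            continue  # home missing from the central database; handle separately
        for field, other_value in other_record.items():
            central_value = central_record.get(field)
            if central_value is not None and central_value != other_value:
                conflicts.append((uprn, field, central_value, other_value))
    return conflicts


# Hypothetical records keyed by property identifier
central_db = {
    "1001": {"wall_type": "solid brick", "heating": "gas boiler"},
    "1002": {"wall_type": "cavity", "heating": "gas boiler"},
}
asset_system = {
    "1001": {"wall_type": "solid brick", "heating": "heat pump"},
    "1002": {"wall_type": "cavity"},
}

conflicts = find_conflicts(central_db, asset_system)
for conflict in conflicts:
    print(conflict)
```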

    Consistent formatting

    To ensure your data has consistent formatting you will need to do three things:

    1. Store information digitally and in the same file types, e.g., Excel or CSV.
    2. Use consistent variable labels, such as always first identifying properties by their Unique Property Reference Number (UPRN) or first line of address. This also applies to the naming of variables, such as whether to include or exclude dashes, e.g., ‘Solid brick wall’ or ‘solid-brick’.
    3. Apply version control and file labelling, including the date of the last update, if the file or system is regularly being updated with new data.
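
The formatting points above can be sketched in Python as follows. This is an illustration under stated assumptions: the alias table, the choice of ‘solid brick wall’ as the preferred label, and the file naming pattern are all hypothetical and should be replaced with your organisation's own conventions.

```python
import re
from datetime import date

# Hypothetical mapping from known variants to the preferred label
ALIASES = {"solid brick": "solid brick wall"}


def normalise_label(label):
    """Lower-case a label and treat dashes, underscores and extra spaces alike."""
    cleaned = re.sub(r"[-_\s]+", " ", label.strip().lower())
    return ALIASES.get(cleaned, cleaned)


def dated_filename(stem, updated_on, extension="csv"):
    """Build a file name that carries the date of the last update."""
    return f"{stem}_{updated_on.isoformat()}.{extension}"


labels = [normalise_label(raw) for raw in ["Solid brick wall", "solid-brick", " SOLID_BRICK "]]
filename = dated_filename("stock_data", date(2024, 5, 1))
print(labels)
print(filename)
```

Normalising labels at the point of import, rather than during analysis, means every downstream filter and count sees one spelling per building component.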

    Improving quality

    Focusing on these six key factors will help you maintain the reliability of your housing stock records:

    Accuracy 

    Correct, precise and up to date. 

    Completeness 

    All possible data is present (no gaps or blanks, with any cloned or assumed information clearly marked). 

    Consistency 

    No conflicting information within or between systems and attributes. 

    Timeliness 

    Data is created, maintained and available when required. 

    Uniqueness 

    Where appropriate, there are no duplicates or redundant data elements.

    Validity 

    Data is authentic, proven to be valid, and derived from good, reliable sources. 

    You can achieve the above through home energy analytics software or manual checks.

    1. Home energy analytics software: This provides a data quality score based on any gaps in the information.
    2. Manual checks: These can be carried out in your database by filtering for missing data and ensuring that updates (e.g., new EPC ratings) are recorded and tracked.
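
A manual gap check of this kind could look something like the Python sketch below. It calculates a simple completeness score (the share of required fields that are filled in) and lists the missing fields for each home. The list of required fields and the scoring method are assumptions for illustration; commercial analytics software may score data quality quite differently.

```python
# Hypothetical list of fields every record is expected to hold
REQUIRED_FIELDS = ["uprn", "wall_type", "heating", "epc_rating"]


def completeness(records, fields=REQUIRED_FIELDS):
    """Share of required fields that are filled in, from 0.0 to 1.0."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields if r.get(f) not in (None, ""))
    return filled / total


def missing_fields(records, fields=REQUIRED_FIELDS):
    """Map each property identifier to the required fields it is missing."""
    gaps = {}
    for r in records:
        missing = [f for f in fields if r.get(f) in (None, "")]
        if missing:
            gaps[r.get("uprn")] = missing
    return gaps


# Hypothetical stock records with some blanks
stock = [
    {"uprn": "1001", "wall_type": "cavity", "heating": "gas boiler", "epc_rating": "C"},
    {"uprn": "1002", "wall_type": "", "heating": "gas boiler", "epc_rating": None},
    {"uprn": "1003", "wall_type": "solid brick", "heating": "", "epc_rating": "D"},
]

score = completeness(stock)
gaps = missing_fields(stock)
print(score)
print(gaps)
```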

    When conducting data checks and cleaning, there are a few questions to ask yourself:

    Is the data logical?
    E.g., if cavity walls were introduced from 1920, why is a cavity wall recorded for a property with a pre-1900 construction date?

    Is the data consistent?
    E.g., does the data align to the other 20 properties in the same block or postcode? And if not, is there a good reason for this?

    Is the data plausible?
    E.g., when filtering by heating types, are there any unusual heating types that don’t match the rest of your housing stock?
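
The three questions above can be turned into simple automated checks. The Python sketch below is one possible approach: it flags cavity walls recorded on homes built before a cut-off year (1920, taken from the example above), homes that differ from the most common value in their block, and values that appear only rarely across the stock. The records, field names and the `max_count` threshold are hypothetical, and a flagged record is a prompt for investigation rather than proof of an error.

```python
from collections import Counter


def illogical_walls(records, cavity_from=1920):
    """Logical check: cavity walls recorded on homes built before `cavity_from`."""
    return [r["uprn"] for r in records
            if r["wall_type"] == "cavity" and r["built"] < cavity_from]


def block_outliers(records, field):
    """Consistency check: homes that differ from the most common value in their block."""
    by_block = {}
    for r in records:
        by_block.setdefault(r["block"], []).append(r)
    flagged = []
    for homes in by_block.values():
        most_common, _ = Counter(h[field] for h in homes).most_common(1)[0]
        flagged += [h["uprn"] for h in homes if h[field] != most_common]
    return flagged


def rare_values(records, field, max_count=1):
    """Plausibility check: values that appear `max_count` times or fewer."""
    counts = Counter(r[field] for r in records)
    return sorted(value for value, n in counts.items() if n <= max_count)


# Hypothetical stock records
stock = [
    {"uprn": "1001", "block": "A", "built": 1895, "wall_type": "cavity", "heating": "gas boiler"},
    {"uprn": "1002", "block": "A", "built": 1895, "wall_type": "solid brick", "heating": "gas boiler"},
    {"uprn": "1003", "block": "A", "built": 1895, "wall_type": "solid brick", "heating": "gas boiler"},
    {"uprn": "1004", "block": "B", "built": 1965, "wall_type": "cavity", "heating": "gas boiler"},
    {"uprn": "1005", "block": "B", "built": 1965, "wall_type": "cavity", "heating": "coal fire"},
]

print(illogical_walls(stock))
print(block_outliers(stock, "wall_type"))
print(rare_values(stock, "heating"))
```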

    Next steps

    At this point, your organisation should have enough quality data, held in a reliable enough system. This should enable you to produce a baseline for the current energy performance of your housing stock, and to start planning which homes to prioritise for retrofit. To understand more thoroughly the steps to improve your data and why this is important, refer to our detailed toolkit.
