Tracking improvements of Companies House data

Changes at Companies House

On 26 October 2023, the Economic Crime and Corporate Transparency Bill was given royal assent and became UK law.

The purpose of the bill is to help prevent organised criminals, fraudsters, kleptocrats and terrorists from using companies to abuse the UK’s open economy.

The law provides Companies House with a range of additional powers to implement measures which are designed to improve the accuracy of Companies House data.

This is good news for those of us who rely upon Companies House data for due diligence, especially where that relates to ownership structures and company director data.

What steps have Companies House announced?

Companies House have announced some initial steps, to be implemented from 4 March 2024, to improve the register.

These include increased scrutiny of information submitted to the register, but they also claim that existing data will be improved using data matching.

But how will the success of these changes be measured, tracked and reported upon? 

Xama’s interest in Companies House data and our capabilities

At Xama we have a vested interest in the accuracy of Companies House data. Our integration with Companies House provides our users with an overview of the directors and ownership structure of a company. As we rely on the data provided by Companies House, improvements to this dataset will also improve the data we provide to our users.

To provide this level of Companies House integration and reporting, Xama has built a service that holds a clone of Companies House data. Through the streaming API provided by Companies House, we maintain a real-time, up-to-date copy of the data within our own environment.
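
As a rough illustration of how a local copy can be kept in step with a stream of change events, here is a minimal Python sketch. It is not Xama's actual service. It assumes each stream line is a JSON object carrying a `resource_uri`, a `data` payload and an `event` object with a `type` and a `timepoint`; those field names, the `apply_event` helper and the sample lines are assumptions made for this example.

```python
import json
from typing import Optional


def apply_event(store: dict, line: str) -> Optional[int]:
    """Apply one stream line to the local copy; return its timepoint, if any."""
    if not line.strip():
        return None  # blank keep-alive line, nothing to apply
    event = json.loads(line)
    uri = event["resource_uri"]      # assumed field name
    meta = event.get("event", {})    # assumed shape: {"type": ..., "timepoint": ...}
    if meta.get("type") == "deleted":
        store.pop(uri, None)         # resource removed from the register
    else:
        store[uri] = event.get("data", {})  # insert or overwrite the record
    return meta.get("timepoint")


# Made-up sample lines for a hypothetical company number.
sample_lines = [
    '{"resource_uri": "/company/00000000", "data": {"company_name": "EXAMPLE LIMITED"},'
    ' "event": {"type": "changed", "timepoint": 101}}',
    "",
    '{"resource_uri": "/company/00000000", "event": {"type": "deleted", "timepoint": 102}}',
]

local_copy: dict = {}
last_timepoint = None
for line in sample_lines:
    timepoint = apply_event(local_copy, line)
    if timepoint is not None:
        last_timepoint = timepoint  # remember where to resume after a disconnect
```

Keeping the last timepoint seen means a consumer can ask to resume from that position after a dropped connection rather than rebuilding the whole copy.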

The datasets include more than:

  • 6.4 million companies
  • 9 million persons with significant control relationships
  • 20 million officer appointments

Maintaining an up-to-date Companies House dataset gives us some unique capabilities to analyse data within the register.

The duplicate officer problem

A well-known problem with Companies House data, which needs improvement, is that an individual often exists within Companies House as two or more different people, even though they are, in fact, the same person.

It does not take long to find an example. In fact, I only need to look at my own record. Whether through an unwitting fault of my own or a mistake made by Companies House, I appear as two separate people in Companies House.

Firstly, as a director of Malan Consulting Limited (06831339) and secretary of Smart Project Solutions (06475025).

And then separately as a director of Xama Technologies Limited (11398708) and previously Excluserv Limited (05633814).

There is no reliable way to tell from Companies House records that I am in fact the same person and have had relationships as an officer with 4 companies.

Of course, it is possible for two separate people to share the same name and be born in the same month and year, but we know that many individuals are duplicated within Companies House, as shown here (I am connected to all 4 of these companies).

This complicates due diligence, because there is no reliable way to interrogate the relationships an officer has with all their related companies.

Xama analysis and KPIs

At Xama, we have looked at this problem and believe we can start tracking whether improvements to the duplicate officer problem are being made.

We identified two possible indicators that we can use to measure the issue and to track progress over time.

Indicator 1: Matched by name, month and year of birth

The first query we constructed identifies cases where people with the same name and the same month and year of birth exist more than once within Companies House, with unconnected appointments as an officer.
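
The shape of such a query can be sketched in Python as follows. This is only an illustration, not the query Xama runs: the record fields (`officer_id`, `name`, `dob_month`, `dob_year`), the `normalise_name` helper and the sample records are all hypothetical, and "unconnected" is approximated here as "held under different officer ids".

```python
from collections import defaultdict


def normalise_name(name: str) -> str:
    """Hypothetical normalisation: upper-case and collapse whitespace."""
    return " ".join(name.upper().split())


def indicator_one(officers):
    """Return (total_cases, potential_duplicate_cases).

    A case is one distinct (name, birth month, birth year) combination.
    It counts as a potential duplicate when it appears under two or more
    different officer ids, and it is counted once however many ids share it.
    """
    ids_by_key = defaultdict(set)
    for officer in officers:
        key = (normalise_name(officer["name"]), officer["dob_month"], officer["dob_year"])
        ids_by_key[key].add(officer["officer_id"])
    total_cases = len(ids_by_key)
    duplicate_cases = sum(1 for ids in ids_by_key.values() if len(ids) >= 2)
    return total_cases, duplicate_cases


# Made-up sample: the first two records share a key under different officer ids.
sample = [
    {"officer_id": "a1", "name": "Jane  Example", "dob_month": 5, "dob_year": 1980},
    {"officer_id": "b2", "name": "JANE EXAMPLE", "dob_month": 5, "dob_year": 1980},
    {"officer_id": "c3", "name": "John Sample", "dob_month": 7, "dob_year": 1975},
]
total_cases, duplicate_cases = indicator_one(sample)
```

With this sample there are two cases in total, one of which is a potential duplicate.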

We know that this query will highlight some legitimate matches, where distinct people genuinely share a name and a month and year of birth, but we also know that a significant percentage of these results are incorrectly duplicated individuals.

Our findings:

The total number of cases analysed was 8,781,431, with:

  • 826,318 cases where 2 or more people match the same criteria and
  • 7,955,113 cases where the criteria are unique

Where a potential duplicate has been identified, they are only counted as one case even though they represent multiple officer records within Companies House.

If we use the total number of cases as the denominator, potential duplicates make up 9.41% of cases.
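
The arithmetic behind this percentage can be checked with a few lines of Python. The figures are the ones reported above; the `duplicate_rate` helper is simply a name chosen for this example.

```python
def duplicate_rate(duplicate_cases: int, unique_cases: int):
    """Return (total cases, potential duplicates as a percentage of the total)."""
    total = duplicate_cases + unique_cases
    return total, round(100 * duplicate_cases / total, 2)


# Indicator 1 figures from the findings above.
total, rate = duplicate_rate(826_318, 7_955_113)
print(total, rate)  # 8781431 9.41
```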

Indicator 2: Matched by name, month and year of birth as well as postcode

We also performed the analysis with the postcode as a further matching criterion.

While it is quite likely that two people exist with the same name, month and year of birth, it is far less likely that those people will also have the same postcode attached to their correspondence address. As with my own example, the correspondence address is quite likely to differ across officer appointments, so this query will miss some duplicates, but the cases it does highlight have a very high likelihood of being duplicated officer records.
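
To illustrate how adding the postcode changes the matching, here is a small Python sketch. Again, it is an assumption-laden example rather than our production query: the field names, the `normalise_postcode` and `match_key` helpers and the sample records are hypothetical.

```python
from collections import defaultdict


def normalise_postcode(postcode: str) -> str:
    """Hypothetical normalisation: upper-case and strip all whitespace."""
    return "".join(postcode.upper().split())


def match_key(officer: dict) -> tuple:
    """Name, month and year of birth, plus correspondence postcode."""
    return (
        " ".join(officer["name"].upper().split()),
        officer["dob_month"],
        officer["dob_year"],
        normalise_postcode(officer["postcode"]),
    )


def indicator_two(officers):
    """Return (total_cases, potential_duplicate_cases) for the stricter key."""
    ids_by_key = defaultdict(set)
    for officer in officers:
        ids_by_key[match_key(officer)].add(officer["officer_id"])
    duplicates = sum(1 for ids in ids_by_key.values() if len(ids) >= 2)
    return len(ids_by_key), duplicates


# Made-up sample: three officer ids share a name and birth month/year,
# but only the first two share a postcode.
sample = [
    {"officer_id": "a1", "name": "Jane Example", "dob_month": 5, "dob_year": 1980, "postcode": "ab1 2cd"},
    {"officer_id": "b2", "name": "Jane Example", "dob_month": 5, "dob_year": 1980, "postcode": "AB12CD"},
    {"officer_id": "c3", "name": "Jane Example", "dob_month": 5, "dob_year": 1980, "postcode": "ZZ9 9ZZ"},
]
total_cases, duplicate_cases = indicator_two(sample)
```

In this sample the three records would form a single case under name and birth month and year alone. With the postcode added they split into two cases, only one of which is a potential duplicate, which is why a stricter key yields more cases overall and fewer potential duplicates.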

Our findings:

The total number of cases analysed was 11,107,633, with:

  • 324,589 cases where 2 or more people match the same criteria and
  • 10,783,044 cases where the criteria are unique

The total number of cases increases because the additional matching criterion produces many more unique records. Using the same method to calculate the percentage of potential duplicates compared to total cases, we arrive at 2.92%.


Should Companies House start to enforce their powers and make use of data matching to clean up officer records, we should see a reduction in these percentages over time.

Where the postcodes can be matched in particular, we would expect Companies House to be able to identify and correct these duplicates fairly easily.

Where addresses cannot be matched, matching might be more challenging. However, if the data being accepted into Companies House are of better quality and do not introduce new duplicates, we should see a further reduction of this number over time.

We will continue to track these indicators and any others we can identify (with your help!) to provide an overview of the results over time.

How can you get involved?

Hopefully with this analysis we have demonstrated the capability Xama has to quickly analyse Companies House data in real time.

We invite anyone who is interested in this project to reach out to us and would appreciate further ideas for indicators which will allow us to track the progress Companies House is making on cleaning up the register.

If you want to get in touch, you can contact us or reach out to me on LinkedIn.
