Up-to-date national beneficial ownership data from Denmark, Slovakia, and the UK is currently available via these tools for download or analysis in a variety of formats, such as comma-separated values (CSV), SQLite, PostgreSQL, Parquet, and JSON.
Latvian BODS data from 2021, mapped data from the Global Legal Entity Identifier Foundation, and a combined dataset from the Open Ownership Register are also available here. All datasets are published under an open licence for anyone to reuse, and they are connected up to the BigQuery and Datasette services to support detailed analysis and the running of queries using Structured Query Language (SQL). The code for these tools is published on GitHub.
BigQuery is a service from Google which provides organisations access to “data warehouse” tools so that they can carry out detailed or complex data queries.. Users may need to pay Google for this depending on the complexity of the queries. Each of the datasets in the BODS data analysis tools are configured so that users can go to the dataset in BigQuery, copy it to their own workspace, and analyse it as they wish.
Datasette is a tool for exploring and publishing data. It helps people take data of any shape and analyse, explore, and publish it as an interactive website and accompanying application programming interface.
Step 1: Click on the Datasette label
For each dataset on the BODS data analysis tools, there is a link labelled “Datasette”. Clicking this will take you to the Datasette instance to explore that particular dataset. Watch the introduction video from the Datasette team to learn more about how to use it.
Step 2: Choose the table you want to use
Once you enter the selected country’s Datasette page, you will see a list of tables. Choose the one you want to explore. Below each table’s name, in italics, you will see all the fields contained in each table and the number of rows (statements) there are.
In the example below, we can see that Latvia's entity statement table contains over 120,000 rows as well as each of the fields from this part of the entity statement (type, name, date of incorporation, publication date, etc.).
Step 3: Create your own filters
You can select what information you want to see from each table. Following the previous example, from the entity statements you can select the “column” you want to filter and choose how you would like to do so by selecting an item from the middle dropdown menu.
For example, you can look for an individual company name:
You can also filter by companies that have been founded after a particular year (such 1992, as in this example):
You can also add another filter to only show companies registered between particular years (in this case, 1992-1994).
Step 4: Edit the SQL query
Using the automatic filters provided by the systems will make it easier for non-data experts to search. Using those filters will limit the search within the same “table”. For instance, following the example above, we can filter information within entities but we cannot find general statistics or experiment with the data, and we cannot combine an entity with person statements (e.g. searching for companies registered after 1992, whose beneficial owner was born after 1980). In order to do so, you will need to manually edit the SQL query.
4.1. Explore the data
In the example below, we used Datasette to show us the most common addresses from all UK-registered entities. To do this, we selected all the addresses in the entity_statement table, then ordered them by “desc” (descendant), to show the most common ones at the top of the list.
In this example, we can see that there are over 2,000 entities registered at 20-22 Wenlock Road, London. This might be a red flag for investigators, or it could be that a company formation agent which creates companies is registered at that address.
4.2. Combining two or more tables
If you want to know, for example, who is the oldest beneficial owner listed in a country’s jurisdiction, you can do this by writing queries which bring together data from different Datasette tables:
- Person_statement (beneficial owner name)
- Person_addresses (beneficial owner’s residence)
- Person_statement (contains birth date)
In this case, the query looks at the person’s date of birth (person_statement), country of residence (person_addresses), and full name (person_statement), filters it by country to provide only Latvian examples, then sorts them by date of birth so the oldest person appears at the top.
You can copy this query and adapt it to your needs. If you want to see the youngest beneficial owner, you will need to sort the list in descending order so the youngest person will be listed at the top (see query here).
You can identify which tables you need to select by looking into the writing in italics which describe the items contained in each (see Step 2), or by looking at the BODS data schema.