Relational database design considerations for beneficial ownership information

  • Publication date: 16 December 2021
  • Authors: Open Ownership, Extractive Industries Transparency Initiative (EITI)
  • |  View in: Spanish  French 

5. Database structure

Below is an Entity-Relationship diagram of the structure that a beneficial ownership relational database could take. It does not contain details of all of the fields that should be captured.

For example, the “declaration” table contains the fields “person”, “entity”, and “relationship”, which are placeholders. Therefore, it is representative of the overall structure, but does not represent a complete model that could be used to implement a beneficial ownership register.

Example of a simplified beneficial ownership database diagram

Figure 2. Example of a simplified database diagram

a) Tables

i) Declaration table

A table, where the declaration of the beneficial ownership is stored, should exist. This declaration table acts as an audit log of company declarations. It should:

  • Contain as much raw user input as possible

It is important to make sure that no data is lost from data capture. This is to ensure that, if necessary, the processing of that data can be replicated.

  • Capture the best quality information the person making the declaration has on the known people, entities, and relationships

Capture as much information as possible aiming for structured fields, such as codelist lookups. If collected online, the system should try to contain lookups to known people, entities, and codelists, as detailed in the data capture section above.

  • Be stored in a log form, making sure a new record is created for each new declaration

Add a new row for each declaration with a date associated with each entry. UPDATES should not be applied to them. For example, if the only difference is the date of the declaration (such as for an annual return where there has been no change in company ownership or control), then a new row should still be added.

ii) Statement tables

The declaration data should be processed into statement tables for people, entities, and relationships. These tables should:

  • Represent the whole history (all changes) of the person, entity, or relationship

When a declaration is made that differs from a previous one, then a new row should be added to the affected table. This potentially includes confirmation statements where only the declaration id and date are different, in jurisdictions where such confirmations are required by law.

For example, if a person’s shareholding increases, then a new row should be added to the Relationship table with that updated information.

  • Only create new entries when there is a change from the last statement

There should be less duplication than in the declaration tables and new statements need to be generated only if there has been a change.

For example, in the situation described above, where shareholding level has changed, there is no need to update the Person or Entity tables unless information has changed in them as well.

  • Generate BODS statements

These tables should contain the data, processed from the declaration, that are then used to populate BODS statements. There should be a direct mapping between these internal database tables and their related top level objects in the BODS schema.

For example, the Person table(s) in a beneficial ownership database should be capable of publishing to all of the fields in a BODS Person statement. An array in BODS is an indication of a one-to-many relationship that the database could implement using a parent-child table structure.

  • Contain rows whose values are rarely altered

Any fields that will be published to BODS should be unchanged after the creation of the row that contains their values, but certain fields that relate to the processing of the data could be updated, e.g:

– links to previous statement rows to indicate this row supersedes the previous statement;

– date ranges of when the information was applicable.

  • Be fully reproducible from the declarations

These tables should be able to be reproduced from information in the declaration tables for all fields apart from those ids that are generated by the system.

  • Contain the declaration id that produced these statements

The system should publish the declaration id in each of the rows that potentially represent multiple changes across multiple tables. This means that all of the updates can be traced back to the originating declaration, supporting data use.

iii) Current tables

The system should process statement data to generate person, entity, and relationship tables that reference the current best-known information. This allows snapshots of current, up-to-date beneficial ownership information to be published. These tables should:

  • Contain a unique generated “id” for each person within the declarations of each entity

This enables a history of changes to each person and their relationship to that entity to be recorded and reported. This is in addition to any externally generated ids, such as a national identification number (person) or a company number (entity).

  • Contain a unique “id” for each person, entity, or relationship within the declarations of the database

This supports the understanding of the case when a person is a beneficial owner of two or more entities. These ids may be internally generated or externally generated ids, such as a national identification number (person) or a company number (entity).

  • Link to the most up-to-date statement record

This enables the current state of each person, entity, and relationship to be recorded and reported.

  • Be reproducible from the statement tables

This table should be showing current-point-in-time information and could be calculated from the statement tables if needed, apart from the “id” fields.

b) Fields

This section highlights notable fields in the declaration, statement, and current tables in a beneficial ownership database. It does not attempt to cover every field in the database.

i) Person fields

These fields should exist in declaration tables along with person and person statement tables.

  • References to person identifiers

To uniquely identify a person, both internally and globally as part of a BODS publication, there should be reference to one or more identifiers. Consideration should be given to the use of internal database identifiers, BODS statement identifiers, and real-world identifiers.

It may be possible to construct BODS statement identifiers from internal database identifiers. See: Strategies for identifier creation in the BODS documentation.

The three forms of real world identifiers that BODS accepts are a national identification number, a passport number, or a tax identifier. Verifying a person identifier avoids duplication of person data in the beneficial ownership database.

The declarer may enter this through a lookup which will check those identification details are valid according to the relevant database. At that point, it could be encrypted or hidden from database users through access controls to meet any privacy requirements in the jurisdiction.

When publishing to BODS, a real world identifier is not a required field and should not be published unless there are legal grounds to do so. The person statement id is required to be published and it can be used to uniquely identify the person.

  • Status fields

These indicate the current state of the information about an entity, person, or relationship (as reported by a particular entity). Status fields will not necessarily exist in the declaration tables: they can be derived and exist in the statement and current tables.

Status values should, as a minimum, include the equivalent of “new”, “updated”, and “closed” Statements made over time about an entity, person, or relationship form a series. The first in the series would have the status “new”, and the final statement would have the status “closed”. All other statements in the series would have the status “updated”.

  • Name and address information

When collecting addresses, the international nature of beneficial ownership should be considered.

For example, where addresses outside the jurisdiction of the database are possible, the system should not enforce local postal/zip code formats.

ii) Entity fields

These should exist in declaration tables along with entity and entity statement tables.

  • Organisation identifier

Preferably using an org-id identifier in conjunction with a register unique identifier.

  • Status fields

This indicates the current state of the information about this entity and may contain just “new”, “updated”, and “closed”.

iii) Relationship fields

These should exist in declaration tables along with relationship and relationship statement tables.

  • Links to person and entity tables

The relationship statement table should also link to the person statement and entity statement tables.

  • Known information about beneficial ownership

Such as shareholding; direct or indirect ownership.

  • Equivalent to ownership or control statement in BODS

These fields should capture all of the details of the information provided about ownership or controlling interests. It should be possible to publish these as ownership-or-control statements in BODS.

  • Status fields

This indicates the current state of this information and may contain just “new”, “updated”, and “closed”.

Next page: 6. Case studies