See updates across the data model, metadata structure, and API of our service. Breaking changes that require updates to data consumer applications are announced prior to their implementation.
Effective date: | |
---|---|
Components affected: | yente |
Announcement: |
We will officially end support for yente 3.x at the end of March 2025. This means we will be unable to provide support for this version, and will not guarantee that new versions of the data will be correctly and completely loadable by the old application release.
To ensure continued compatibility and access to the latest features, we strongly encourage all users to upgrade to the latest version. For detailed instructions on upgrading, visit our documentation page on upgrading yente.
Effective date: | |
---|---|
Components affected: | Hosted API |
Announcement: |
A new version of the API will remove the `regression-v2` scoring algorithm, but leave `regression-v1` in place for now. `regression-v2` is a historic scoring method that is discouraged for screening use (its intended use is internal de-duplication of entities in our database).
A new `regression-v3` scoring mechanism is now being developed and will be released, but it is not recommended for screening use. Any requests that are sent to the API using `regression-v2` will be re-written to use `regression-v3` in the future.
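Clients that select a scoring algorithm explicitly may want to catch uses of `regression-v2` before requests get silently re-written. The following is a minimal sketch, not an official client: it assumes the algorithm is chosen through an `algorithm` query parameter on the `/match/<dataset>` endpoint, and the base URL `https://api.example.com` is a placeholder.

```python
from urllib.parse import urlencode

# Algorithms named as being removed in the announcement above.
REMOVED_ALGORITHMS = {"regression-v2"}


def match_url(base_url: str, dataset: str, algorithm: str) -> str:
    """Build a /match URL, refusing algorithms that are being removed.

    Assumption: the algorithm is passed as an `algorithm` query parameter.
    """
    if algorithm in REMOVED_ALGORITHMS:
        raise ValueError(f"{algorithm} is being removed; choose another algorithm")
    query = urlencode({"algorithm": algorithm})
    return f"{base_url.rstrip('/')}/match/{dataset}?{query}"


# Placeholder base URL, for illustration only.
print(match_url("https://api.example.com", "default", "regression-v1"))
```

Failing loudly on the client side makes the change visible in your own logs, rather than appearing only as a shift in scores.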
Effective date: | |
---|---|
Components affected: | Data model |
Announcement: |
In the `us_sam_exclusions` dataset, which is based on the SAM.gov exclusion and debarment list, the designated entity's UEI is currently stored in the `registrationNumber` field. Going forward, a new property called `uniqueEntityId` is available and will contain these identifiers. Starting Feb 1, 2025, we will remove the UEI mapping to `registrationNumber`.
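During the transition, a consumer can read the new property first and fall back to the old one. This is a minimal sketch, assuming entities are plain dictionaries in the usual followthemoney JSON shape, where `properties` maps property names to lists of string values. The helper name `get_uei` and the sample value are hypothetical.

```python
def get_uei(entity: dict) -> list[str]:
    """Return UEI values for an entity from the us_sam_exclusions dataset.

    Prefers the new `uniqueEntityId` property. The fallback to
    `registrationNumber` is only meaningful for this dataset and only
    until the old mapping is removed.
    """
    props = entity.get("properties", {})
    ueis = props.get("uniqueEntityId", [])
    if ueis:
        return list(ueis)
    return list(props.get("registrationNumber", []))


# Made-up entity that still uses the old mapping.
legacy = {"schema": "Company", "properties": {"registrationNumber": ["ABC123DEF456"]}}
print(get_uei(legacy))
```

After the old mapping is removed, the fallback branch can be dropped so that unrelated registration numbers are never mistaken for UEIs.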
Effective date: | took effect on |
---|---|
Components affected: | Data model, Export formats |
Announcement: |
We're phasing out the use of the `target` flag throughout the system, and switching the export formats that are based on `target` to use a defined list of `topics` as their source of truth.
A binary flag (`target`) is an insufficient method to describe which entities are associated with risk. For the past few months, we've been recommending the use of topics to decide if a match is relevant (e.g. as a PEP or sanctioned entity). However, some export formats, such as `targets.nested.json` and `targets.simple.csv`, still use targets to decide which entities to include.
On January 15, we will switch these two export formats (`targets.nested.json` and `targets.simple.csv`) to include any entities tagged with one of the topics listed below. This is guaranteed to include all current targets, but will also bring in additional entities that have topics assigned but are not marked as targets. In short: the new exports will be more correct, and a bit larger.
This will result in the `targets.nested.json` export of the `default` dataset becoming equivalent to the `topics.nested.json` export of the same collection. The `topics.nested.json` export can be used for testing until the change becomes effective on January 15, 2025. We will remove the `topics.nested.json` export format on February 15, 2025, and only generate the file named `targets.nested.json` going forward.
Topics included in new target definition:
corp.disqual
crime.boss
crime.fin
crime.fraud
crime.terror
crime.theft
crime.traffick
crime.war
crime
debarment
export.control
export.risk
poi
reg.action
reg.warn
role.oligarch
role.pep
role.rca
sanction.counter
sanction.linked
sanction
wanted
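Consumers who currently filter on the `target` flag can reproduce the new definition with a topic check. This is a minimal sketch, assuming entities are dictionaries whose `properties.topics` value is a list of topic strings. The helper name `is_target` is hypothetical.

```python
# The topics listed above as the new target definition.
TARGET_TOPICS = frozenset({
    "corp.disqual", "crime.boss", "crime.fin", "crime.fraud", "crime.terror",
    "crime.theft", "crime.traffick", "crime.war", "crime", "debarment",
    "export.control", "export.risk", "poi", "reg.action", "reg.warn",
    "role.oligarch", "role.pep", "role.rca", "sanction.counter",
    "sanction.linked", "sanction", "wanted",
})


def is_target(entity: dict) -> bool:
    """True if the entity carries at least one of the listed topics."""
    topics = entity.get("properties", {}).get("topics", [])
    return not TARGET_TOPICS.isdisjoint(topics)


print(is_target({"properties": {"topics": ["role.pep"]}}))
```

The check matches exact topic strings only: a topic that is not in the list does not count, even if it shares a prefix with one that is.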
Effective date: | took effect on |
---|---|
Components affected: | Metadata |
Announcement: |
This change is relevant for users of the `index.json` and `catalog.json` metadata. API and yente users, as well as those users who download static files based on location, are not affected.
As the number of datasets in the `default` collection grows, the metadata size is becoming a factor: the main index file is now larger than two megabytes. In order to prevent scaling issues in the future, we're splitting up the dataset metadata into two files: `index.json` and `statistics.json`. `index.json` will contain fewer summary facts about each dataset. The extended statistics (e.g. the number of entities in the dataset linked to each country) will be in the more in-depth `statistics.json`.
The following fields will be removed from dataset metadata:
`schemata`
`properties`
`targets` (and all nested items)
`things` (and all nested items)
These fields continue to be published in a `statistics.json` file. A `statistics.json` file is published with each data export of a dataset, and referenced from the `index.json` via the `statistics_url` field.
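Code that used to read these fields from the index can follow the reference instead. This is a minimal sketch, assuming `index.json` holds a top-level `datasets` list whose items have a `name` and a `statistics_url`. The helper name and the example URL are hypothetical.

```python
from typing import Optional


def statistics_url_for(index: dict, dataset_name: str) -> Optional[str]:
    """Look up the statistics.json location for one dataset in a parsed index."""
    # Assumption: datasets are listed under a top-level "datasets" key.
    for dataset in index.get("datasets", []):
        if dataset.get("name") == dataset_name:
            return dataset.get("statistics_url")
    return None


# Hand-written stand-in for a parsed index.json; the URL is a placeholder.
index = {
    "datasets": [
        {"name": "default", "statistics_url": "https://data.example.com/default/statistics.json"},
    ]
}
print(statistics_url_for(index, "default"))
```

The returned URL can then be fetched separately, so the larger statistics are only downloaded for the datasets that need them.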
Additionally, the following fields will be removed in favor of replacements that are already in `index.json`:
`sources` - use `datasets` instead
`externals` - use `datasets` instead
Effective date: | took effect on |
---|---|
Components affected: | yente, Hosted API |
Announcement: |
This release updates various dependencies, introducing new fields in the followthemoney schema and bringing in security patches for the web stack dependencies.
We're also implementing the Reconciliation API's Data Extension protocol for the first time, allowing users to enrich OpenRefine tables with new columns using the API.
See more: https://github.com/opensanctions/yente/releases/tag/v4.2.0
Effective date: | took effect on |
---|---|
Components affected: | Datasets |
Announcement: |
The sanctions regime under Section 353 of the Corrupt and Undemocratic Actors Report expired in December 2023. To reflect this, we've removed the topic `sanction` from all entities in the dataset. We will remove the relevant dataset entirely after November 1, 2024, and move the formerly designated entities to the US Special Legislative Exclusions dataset for reference.
Section 353 concerns foreign individuals who have been reported for knowingly engaging in activities that undermine democratic processes or institutions, participating in significant corruption, or obstructing investigations into such acts in El Salvador, Guatemala, Honduras, and Nicaragua.
Effective date: | took effect on |
---|---|
Components affected: | Data model |
Announcement: |
The `followthemoney` data model currently stores the citizenship of individuals in the `nationality` property. After being advised that the two concepts are not identical in some jurisdictions, we've now also introduced a `citizenship` property. From the effective date, we will begin moving country affiliations for individuals into the `citizenship` property if that nomenclature is used in the data source (e.g. the UK sanctions list).
Data consumers should check both properties in the future. To get a complete picture of the countries linked to an individual, you may also want to check the `birthCountry` and `country` fields. The latter serves as a catch-all field for affiliations that may not involve citizenship or holding a passport: simple residence might be enough.
See: Person schema.
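The advice to check several properties can be sketched as a small helper. It assumes entities are dictionaries whose `properties` map property names to lists of country codes. The function name `linked_countries` and the sample entity are hypothetical.

```python
# Properties named above as carrying country affiliations for a person.
COUNTRY_PROPERTIES = ("citizenship", "nationality", "birthCountry", "country")


def linked_countries(entity: dict) -> set[str]:
    """Collect every country value found across the relevant properties."""
    props = entity.get("properties", {})
    countries: set[str] = set()
    for name in COUNTRY_PROPERTIES:
        countries.update(props.get(name, []))
    return countries


# Made-up person with values spread over several properties.
person = {
    "schema": "Person",
    "properties": {"citizenship": ["gb"], "nationality": ["ru"], "country": ["gb", "ae"]},
}
print(sorted(linked_countries(person)))
```

Merging the properties into one set suits broad country-based screening. Keep them separate if your process treats citizenship differently from residence.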
Effective date: | took effect on |
---|---|
Components affected: | Datasets |
Announcement: |
The dataset Liechtenstein Posted Workers Act (EntsG) Sanctions was previously running off a static, outdated version of the data, which had been published in HTML format. It will now be updated from a regularly published PDF file. As part of this update, the IDs of the listed entities will change, and the topic applied to first-time penalised entities ("Übertretungen" i.S.v Art 9) will change from `debarment` to `reg.warn`.
Effective date: | took effect on |
---|---|
Components affected: | Data model |
Announcement: |
The `permId` property for LSEG/Refinitiv company codes has been moved up from the `Company` to the `Organization` schema, so that government entities (using the `PublicBody` schema) can also carry these identifiers.
Effective date: | took effect on |
---|---|
Components affected: | yente |
Announcement: |
`yente` 4.1.0 (release page) is a minor patch release. It introduces clearer error reporting during index runs, and fixes the number of matches reported in the `total` section of the `/match` API. Various dependencies have been updated to their latest versions, including `followthemoney`.
Effective date: | took effect on |
---|---|
Components affected: | Data model |
Announcement: |
A soft length limit, measured in Unicode codepoints, has been added for all properties. The limits can be seen in the data dictionary. The goal of this is to make it easier for data consumers to import our data into systems with fixed-length column types.
Property values are not yet guaranteed to be limited to this value, but our tooling now alerts us when values are longer than this, so that we can identify sources which don’t adhere to sensible limits and eventually enforce hard limits.
Imposing a length limit has also identified many instances where the data required further cleaning, which we've implemented as needed.
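Because the limits are still soft, importers into fixed-length columns may want to check values themselves. This is a minimal sketch: the `limits` mapping and its numbers are made up for illustration, and the real limits should be taken from the data dictionary. Python's `len()` on a `str` counts Unicode codepoints, which matches the unit the limits are defined in.

```python
from typing import Iterator


def overlong_values(entity: dict, limits: dict[str, int]) -> Iterator[tuple[str, str, int]]:
    """Yield (property, value, length) for values longer than the soft limit."""
    for prop, values in entity.get("properties", {}).items():
        limit = limits.get(prop)
        if limit is None:
            continue  # no known limit for this property
        for value in values:
            # len() of a str counts Unicode codepoints, not bytes.
            if len(value) > limit:
                yield prop, value, len(value)


# Illustrative limit only; not taken from the data dictionary.
limits = {"name": 5}
entity = {"properties": {"name": ["Zo\u00eb", "Alexander"]}}
for prop, value, length in overlong_values(entity, limits):
    print(prop, value, length)
```

Databases that measure column width in bytes need a separate check on the encoded length, since a value within the codepoint limit can still exceed a byte-based limit.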