Changelog

#2 Breaking: Removal of dataset statistics from metadata

Effective date:
Components affected: Metadata
Announcement:

This change is relevant to users of the index.json and catalog.json metadata. Users of the API and yente, as well as users who download static files based on location, are not affected.

As the number of datasets in the default collections grows, the metadata size is becoming a factor: the main index file is now larger than two megabytes. To prevent scaling issues in the future, we're splitting the dataset metadata into two files: index.json and statistics.json. index.json will contain a smaller set of summary facts about each dataset. The extended statistics (e.g. the number of entities in the dataset linked to each country) will be in the more in-depth statistics.json.

The following fields will be removed from dataset metadata:

  • schemata
  • properties
  • targets (and all nested items)
  • things (and all nested items)

These fields will continue to be published in statistics.json. One statistics.json file is published with each data export of a dataset and is referenced from index.json via the statistics_url field.
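
For consumers that currently read these fields straight from the dataset metadata, the lookup could be adapted roughly as follows. This is a minimal sketch, not an official client: the helper names fetch_json and load_statistics are hypothetical, and it assumes that each dataset object carries statistics_url as a top-level key and that metadata published before the change may still hold the removed fields inline.

```python
import json
from typing import Any, Callable, Dict
from urllib.request import urlopen

# Fields named in this changelog entry as moving out of the dataset metadata.
STATS_FIELDS = ("schemata", "properties", "targets", "things")


def fetch_json(url: str) -> Dict[str, Any]:
    """Download and parse a JSON document (hypothetical helper)."""
    with urlopen(url) as resp:
        return json.load(resp)


def load_statistics(
    dataset: Dict[str, Any],
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
) -> Dict[str, Any]:
    """Return the extended statistics for one dataset metadata object.

    Assumption: `statistics_url` is a top-level key of the dataset object.
    If it is absent (e.g. metadata from before the change), fall back to
    whatever statistics fields are still present inline.
    """
    url = dataset.get("statistics_url")
    if url is None:
        return {key: dataset[key] for key in STATS_FIELDS if key in dataset}
    return fetch(url)
```

Passing the fetch function as a parameter keeps the network call replaceable, so the same helper can be pointed at a local cache or a stub in tests.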

Additionally, the following fields will be removed in favor of replacements that are already in index.json:

  • sources - use datasets instead
  • externals - use datasets instead
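
Code that reads sources or externals can switch to datasets with a small compatibility shim. The sketch below is illustrative only: child_datasets is a hypothetical helper, and it assumes that all three fields hold lists of dataset names and that datasets covers the entries previously split across the other two.

```python
from typing import Any, Dict, List


def child_datasets(meta: Dict[str, Any]) -> List[str]:
    """Return the dataset names referenced by a metadata object.

    Prefers the `datasets` field. For metadata that lacks it, falls back to
    merging the deprecated `sources` and `externals` lists (an assumption
    about how the old fields relate to the new one), dropping duplicates
    while preserving order.
    """
    if "datasets" in meta:
        return list(meta["datasets"])
    merged: List[str] = []
    for key in ("sources", "externals"):
        for name in meta.get(key, []):
            if name not in merged:
                merged.append(name)
    return merged
```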