CKAN is an enterprise metadata management tool used by various types of organisations to improve the search and discoverability of datasets.

Commonly used by key players in the open data space, including Government agencies, this data management tool is increasingly used by research organisations. A growing number of public data portals publish not only public administration data but also research artifacts that support public policy and programs.

Government organisations that rely on scientific data (environment, geoscience, etc.) often need workflows with the features researchers use and value. One of these features is support for Digital Object Identifiers (DOIs).

What is a DOI?

A DOI is a persistent alphanumeric string used to identify datasets, research documents, and other outputs, including online articles. It is often preferred over a plain URL for locating an online source because of its reliability and permanence. The International DOI Foundation (IDF) handles the central database where the identifiers are stored.

While a URL can be changed or decommissioned at any point, DOIs are designed to be persistent. As long as the DOI record is kept up to date, anyone citing a dataset or research document can rely on it rather than risk a broken link or a 404 error, because the DOI is a permanent reference to the digital object.
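To make the idea concrete, here is a minimal Python sketch of how a DOI string is turned into a link through the public doi.org resolver. The DOI shown is hypothetical, and the pattern is only a loose syntax check, not a full validation of the DOI specification.

```python
import re
from urllib.parse import quote

# Loose syntax check: "10.", a numeric registrant code, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org resolver URL for a DOI string."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a valid-looking DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return "https://doi.org/" + quote(doi, safe="/")


example_doi = "10.1234/example-dataset"  # hypothetical DOI, for illustration only
print(doi_to_url(example_doi))
```

A citation carries the resolver link rather than the portal's own URL, which is why the reference can survive a change of hosting.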

Can CKAN support DOIs?

Many wonder whether CKAN can be extended to support DOIs. The short answer is: absolutely. DOI support can be very helpful for many existing open Government data portals. DataCite (a registration agency accredited by the IDF) provides registration of DOI names for research data, and a CKAN extension can assign a DOI to datasets using the DataCite DOI service.
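As a rough illustration of what such an integration has to do, the sketch below maps a CKAN dataset dictionary onto the kind of descriptive metadata a DOI registration typically needs (creator, title, publisher, publication year, resource type, landing page URL). The function name, the choice of CKAN fields, and the output keys are assumptions made for this example, loosely modelled on DataCite's metadata properties; this is not the code of any particular extension, and the field names should be checked against the current DataCite schema.

```python
# Keys that must be non-empty before we would attempt to mint (assumed list).
REQUIRED = ("creators", "titles", "publisher", "publicationYear", "types", "url")


def build_doi_attributes(dataset: dict, site_url: str) -> dict:
    """Build DOI registration metadata from a CKAN-style dataset dict (hypothetical helper)."""
    org = dataset.get("organization") or {}
    created = dataset.get("metadata_created") or ""
    name = dataset.get("name")
    attributes = {
        "creators": [{"name": dataset["author"]}] if dataset.get("author") else [],
        "titles": [{"title": dataset["title"]}] if dataset.get("title") else [],
        "publisher": org.get("title", ""),
        # CKAN timestamps are ISO 8601, so the first four characters are the year.
        "publicationYear": int(created[:4]) if created[:4].isdigit() else None,
        "types": {"resourceTypeGeneral": "Dataset"},
        # The DOI should point at the public landing page of the dataset.
        "url": f"{site_url.rstrip('/')}/dataset/{name}" if name else "",
    }
    missing = [key for key in REQUIRED if not attributes[key]]
    if missing:
        raise ValueError(f"cannot mint a DOI, missing metadata: {', '.join(missing)}")
    return attributes


dataset = {  # illustrative values only
    "name": "soil-survey",
    "title": "Soil Survey",
    "author": "Example Agency",
    "organization": {"title": "Example Organisation"},
    "metadata_created": "2021-03-04T10:00:00",
}
print(build_doi_attributes(dataset, "https://data.example.org"))
```

Checking for missing metadata before calling the registration service keeps incomplete records from receiving a permanent identifier.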

How does a dataset become eligible for DOI minting?

Typically, outputs should satisfy the following requirements: they must be unique, accessible, discoverable, immutable, citable and persistent.

As an example, let’s look at the case in Research Data Australia.

Research Data Australia is an online portal for the discovery of data collections from several Australian research organisations and Government agencies. It is a flagship service offered by the Australian National Data Service (ANDS), an entity that aims to make research data assets more valuable. The DOI service provided by ANDS is free for all publicly funded Australian Government agencies and research organisations. Both manual and machine-to-machine DOI minting are available, but to use either you first need to register with the Australian Research Data Commons (ARDC) for a DOI Service Account.

Most organisations like ANDS that offer DOI minting services do not manage the DOIs themselves; they only provide the infrastructure for minting. Putting processes and policies in place to maintain the DOIs and support their persistence is still up to the organisations that use the service.

Before a DOI is minted, the dataset must have a publicly accessible landing page for the DOI to point to. If the dataset or resource’s location changes (e.g., another organisation takes ownership and moves it to a different portal), the DOI record must be updated so that it points to the new location.

When a dataset’s location changes, the DOI itself stays the same, so users can still reach the dataset with the identifier they already have. Once its record is updated, the same DOI keeps resolving to the correct resource at its new location, readily available for reuse and reference.
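The relationship between a fixed DOI and a changeable location can be modelled in a few lines of Python. This is a toy in-memory registry written for illustration; the class, its method names, the DOI, and the URLs are all hypothetical and do not represent the ANDS or DataCite API.

```python
class DoiRegistry:
    """Toy model of a DOI registry: the DOI is the key, the landing URL is the value."""

    def __init__(self):
        self._targets = {}

    def mint(self, doi: str, landing_url: str) -> None:
        # A DOI is unique, so it can only be minted once.
        if doi in self._targets:
            raise ValueError(f"DOI already minted: {doi}")
        self._targets[doi] = landing_url

    def update_target(self, doi: str, new_url: str) -> None:
        # Only the target changes; the identifier itself never does.
        if doi not in self._targets:
            raise KeyError(doi)
        self._targets[doi] = new_url

    def resolve(self, doi: str) -> str:
        return self._targets[doi]


registry = DoiRegistry()
doi = "10.1234/example-dataset"  # hypothetical DOI
registry.mint(doi, "https://old-portal.example.org/dataset/example")
print(registry.resolve(doi))

# The dataset moves to another portal, so its owner updates the DOI record.
registry.update_target(doi, "https://new-portal.example.org/dataset/example")
print(registry.resolve(doi))
```

Citations that use the DOI keep working after the move, but only because someone performed the update step; the registry does not discover the new location on its own.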

DOI Create: IAR portal as a use-case

The NSW Government’s Information Asset Register (IAR) portal provides searchable metadata and contact details for a list of core-value information assets, including datasets for the NSW Government. Link Digital has worked with the NSW Office of Environment and Heritage (OEH) (now Department of Planning, Industry, and Environment) to develop the capability for scientists to assign DOIs or Handles to selected research papers and datasets. This allows more standardised participation in scientific communities and lets work be cited with permanent URLs based on the DOIs.

CKAN and DOI

Link Digital used ANDS to mint the DOIs. The then OEH Science Division assisted us in getting access to required information for ANDS DOI Services and Handle Services.

In doing so, we were able to achieve the following:

  • IAR has fields to supply all required metadata for minting a DOI
  • The asset/dataset summary page on IAR and the SEED Open Data Portal displays the dataset DOI
  • The DOI resolves to the dataset summary page in the SEED Open Data Portal
  • Admin users are able to delete assets from the IAR and un-publish them from the SEED Open Data Portal

CKAN’s extensibility to support DOIs shows why the platform remains the standard when it comes to open data publishing. If your organisation is considering adopting CKAN for open research data, don’t hesitate to give this tool a try.

Want to learn more about how CKAN can be used to support DOIs? Feel free to get in touch with the Link Digital team today.