The Data4me site is working on linking to more portals for data harvesting (Data.NSW and the NSW Climate Data Portal to start with), increasing its usability, and exploring the potential for Artificial Intelligence to help with the data searching process.


In a post in early February, Link Digital discussed some of the emerging best practices for open data portals and how they might be reimagined in the future. This included the need for portals to focus more on data transferability and usability features, improved standards to maximise interoperability, and better metadata and data lineage features to support user comprehension and re-use. Such improvements may be especially important for government open data activities as these evolve from a primary focus on transparency to the more complex issues involved in knowledge management: for example, how data is shared internally and externally, how it is catalogued, and how it is made discoverable for a range of human and non-human actors.

One example of how open data portals might be improved was the subject of Link Digital’s recent Monthly Forum, at which Tom Barrett, Senior Scientist in the Science, Economics and Insights Division of the New South Wales Department of Climate Change, Energy, the Environment and Water (DCCEEW), presented on an innovative new portal called Data4me.

Data4me emerged from Barrett’s personal experience, including applying Geographic Information System-based tools to support conservation and natural resource management planning decisions. “It’s really a culmination of thirty years of real frustration and anguish associated with trying to use data to undertake projects.” The core problem was the huge and growing volume of data in open data portals at national, state and local levels. This rendered the portals little better than data registries and made it difficult not only to find the right data, but also to share knowledge about it and to manage data for specific projects. Managing data “is often something that gets left to the end of project… It often is done hastily or drops off altogether. So, one of the objectives of Data4me is trying to develop a tool that users can actually use to manage their data from the beginning and all the way through the project, so it’s not a mad rush at the end.”

During the presentation, Barrett drew an analogy with online tools such as TripAdvisor, which allow users to pick a service and see what others have said about it, only in this instance applied to government workers who use data. The Data4me platform, which Link Digital worked on, is an internally facing data portal built on the Comprehensive Knowledge Archive Network (CKAN), open-source software that can be configured to function as an open data platform once it is deployed and hosted on a web server. Data4me harvests data from two NSW government portals, both of which Link Digital helped develop and both of which also use CKAN: DCCEEW’s internal asset register and Sharing and Enabling Environmental Data (SEED), a public-facing repository of environment data in NSW. Data4me enables DCCEEW staff to register projects, search for data and metadata from SEED and the internal asset register, and consolidate what they find on their own project page. Searches can be based on themes, keywords or other projects under way that might be relevant to what the searcher is doing. Staff can also see who is running other projects and contact them.
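For readers unfamiliar with CKAN, the kind of keyword and theme search described above can be sketched against CKAN's standard Action API, which exposes a `package_search` action on any CKAN instance. This is a minimal illustration only, not Data4me's actual implementation: the base URL is a placeholder, the function names are our own, and modelling a "theme" as a CKAN tag is an assumption.

```python
from urllib.parse import urlencode

# Placeholder address (an assumption); substitute the URL of a real CKAN instance.
CKAN_BASE_URL = "https://ckan.example.org"


def build_search_url(keywords, theme=None, rows=10, base_url=CKAN_BASE_URL):
    """Build a URL for CKAN's package_search action.

    `q` is the free-text query and `rows` caps the number of results.
    `fq` is a filter query; filtering on tags is one assumed way of
    representing a theme and may not match how a given portal is set up.
    """
    params = {"q": keywords, "rows": rows}
    if theme:
        params["fq"] = f'tags:"{theme}"'
    return f"{base_url}/api/3/action/package_search?{urlencode(params)}"


def summarise_results(response):
    """Return (name, title) pairs from a decoded package_search JSON response."""
    if not response.get("success"):
        return []
    return [
        (dataset["name"], dataset.get("title", ""))
        for dataset in response["result"]["results"]
    ]
```

The URL returned by `build_search_url` can be fetched with any HTTP client, and the decoded JSON passed to `summarise_results` to list the matching datasets.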

Crucially, in terms of debates around how open data portals might be better configured to provide a more comprehensive descriptive record of datasets, one that goes beyond very basic metadata, Data4me enables users to record feedback on data. This can include whether the data was fit for purpose for the project in question and its pros and cons. Users can also attach other related information, for example relevant journal articles that they may have written. The aim is that this contributes to a much larger repository of supporting documentation that can be immediately accessed from within the dataset and is context sensitive. Users can see comments from others who have used data that they are thinking of examining, to help them make decisions as to whether it might be useful, and to join the dots between projects, documentation, and the people involved in them. It also gives users the ability to run a report at the end of their project that creates a list of citations of the data they have used. As Barrett put it: “In an ideal world you would also register the outputs of your project, if they are data, and they then can be re-related to your project, which completes its life cycle.”
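To make the relationship between projects, datasets, feedback and the end-of-project citation report more concrete, here is a rough sketch of how such records might be modelled. Everything in it is hypothetical: the class names, fields and citation format are our own illustration and are not drawn from Data4me's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    """One user's comment on a dataset (hypothetical structure)."""
    dataset_id: str
    author: str
    fit_for_purpose: bool
    comment: str = ""


@dataclass
class Project:
    """A registered project and the datasets it has used (hypothetical structure)."""
    title: str
    lead: str
    datasets: dict = field(default_factory=dict)  # dataset_id -> dataset title
    feedback: list = field(default_factory=list)

    def add_dataset(self, dataset_id, title):
        """Attach a dataset to the project."""
        self.datasets[dataset_id] = title

    def add_feedback(self, item):
        """Record feedback, but only for datasets the project actually uses."""
        if item.dataset_id not in self.datasets:
            raise ValueError(f"{item.dataset_id} is not attached to this project")
        self.feedback.append(item)

    def feedback_for(self, dataset_id):
        """Return all feedback recorded against one dataset."""
        return [f for f in self.feedback if f.dataset_id == dataset_id]

    def citation_report(self):
        """List the datasets used, sorted by title, in an assumed citation format."""
        ordered = sorted(self.datasets.items(), key=lambda pair: pair[1])
        return [f"{title} ({dataset_id})" for dataset_id, title in ordered]
```

In this sketch a project collects datasets as it goes, so the citation report is simply a by-product of the records kept along the way rather than something assembled in a rush at the end.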

In addition to an ongoing process of monitoring and evaluation, the Data4me site is working on linking to more portals for data harvesting (Data.NSW and the NSW Climate Data Portal to start with), increasing its usability, and exploring the potential for Artificial Intelligence to help with the data searching process.

Barrett told the Monthly Forum that future challenges include figuring out how Data4me fits into DCCEEW’s larger data management ecosystem, as well as other communities of data practice. The most significant challenge, however, is promoting the site and motivating users to leave feedback on data they use. This requires cultural change within DCCEEW so that Data4me comes to be regarded as just another tool staff use when they start a project. “I suppose there is an element of crowd sourcing that this relies on, and this is a critical factor, that we need that crowdsourcing of sharing of knowledge, so it gets to a critical mass where people find it useful and will use it.”

While Data4me has not reached such a critical mass yet, Link Digital believes it provides a fascinating example of what might be possible in relation to data portal configuration, and we look forward to seeing how it develops in the future.

You can watch Barrett’s entire presentation and the discussion that followed it on Link Digital’s YouTube channel here.

If you have found this post useful, you might be interested in taking part in a series of forums held by Link Digital every month at 11 am AEST. These forums will connect you with like-minded experts who are passionate about the importance of open data and want to stay updated on the latest developments in the field. They are free to attend and open to everyone. Register today.