by Andrew Treloar, Ross Wilkinson, and the ANDS team
Over the last seven years, Australia has invested strongly in research infrastructure, and data infrastructure is a core part of that investment.
Much has been achieved already. The Government understands the importance of data, our research institutions are putting research data infrastructure in place, and we can store data and compute over it. Our data-providing partners (research institutions, public providers, and NCRIS data-intensive investments) are ensuring that we establish world-best data and data infrastructure.
The Australian National Data Service (ANDS) commenced in 2009 to establish an Australian research data commons. It has progressively refined its mission towards making data more valuable to researchers, research institutions and the nation. Over the last five years ANDS has worked across the whole sector in partnership with major research organisations and NCRIS facilities. It has worked collaboratively to make data more valuable by bringing about some critical data transformations: moving to structured data collections that are managed, connected, discoverable and reusable. This requires both technical infrastructure and community capability, and can deliver significant research changes.
We have seen many examples where these transformations have been successful. We give three examples, showing the technical, community and research effects.
Figure 1: Transformation of Data
Parkes Observatory Pulsar Data Archive
Information gathered over many years using CSIRO’s Parkes Radio Telescope is being used as a major resource in the international search for gravitational waves. This is one of the more unexpected outcomes of the construction of the Parkes Observatory Pulsar Data Archive. This project fulfilled a CSIRO commitment to the world’s astronomers: to make data from the Parkes telescope publicly available within 18 months of observation.
The data archive was established with support from ANDS. It has also freed CSIRO astronomers from the time-consuming task of satisfying requests for the Parkes data from all over the world. Those requests come flooding in because Parkes is where the bulk of all known pulsating neutron stars, or pulsars, have been discovered.
In addition, astronomers from Peking University, who are using the Parkes data archive as a training tool for radio telescope data analysis, have published several papers which include descriptions of pulsars and other astronomical bodies they have newly discovered.
The archive is accessible through both Research Data Australia (the national data portal) and CSIRO’s Data Access Portal.
Data Citation Support
Citation is a pivotal concern in the publication and reuse of data. ANDS’ approach to raising awareness has resulted in a vibrant and burgeoning Australian community that now routinely applies DOIs (Digital Object Identifiers) to data. The community was engaged through webinars, roundtables, workshops, train-the-trainer sessions, seminars and YouTube recordings, and countless resource links were provided, including links to key international resources.
Alongside this community development there was corresponding technical support through the ANDS Cite My Data service. Technical documentation is accompanied by plain-English information to help institutions effect cultural change. Many additional resources supporting the community were also made available through the ANDS website.
As a result, 26 institutions in Australia are minting DOIs through the ANDS Cite My Data service, and citation metrics are being harvested by data collecting agencies. The growing practice of data citation is underpinned and driven forward by librarians who are knowledgeable and skilled in research data management and reuse, and by institutionally focused services and materials.
The Research Student
A PhD student is quietly changing the way we assess our fisheries without setting foot on a boat. By analysing more than 25 years of Australian Fisheries Management Authority records, he has found that a key assumption of the models employed to make predictions of the future of Australia’s ocean fish stocks is not justified. “It has always been assumed for each different species that the relationship of length with age does not change through time,” says Mr Athol Whitten from CSIRO and the University of Melbourne. By going back and pulling the data apart, Whitten found that some species are showing a long-term decline of length with age, and in others, the growth rate depends on the year they were born. The information now amounts to more than 180,000 data points on 15 species of fish over 25 years. Access to this wealth of information has provided Whitten with an efficient way to pursue his research on testing the assumptions of the fisheries models.
There is now a strong research data management capacity in Australia that takes a coherent approach to Australia’s research data assets and can support substantial change in research data practice in the light of policy or technical changes.
Significant progress has been made in enabling improved data management, connectivity, discoverability, and usability by:
- Establishing the Australian Research Data Commons, a network of shared data resources
- Populating the Australian Research Data Commons with over 100,000 research data collections
- Dramatically improving institutional research data management capacity
- Helping to establish institutional research data infrastructure
- Co-leading the establishment of the Research Data Alliance, improving international data exchange.
This has meant that Australian researchers, research institutions and the nation are at the forefront of the opportunities inherent in global data-intensive research activity.
 Material drawn from Australian National Data Service, http://ands.org.au/, accessed 2014
 G. Hobbs et al.: “The Parkes Observatory Pulsar Data Archive”, Publications of the Astronomical Society of Australia 28, 202, 2011
 A. Whitten, N. Klaer, G. Tuck, R. Day: “Variable Growth in South-Eastern Australian Fish Species: Evidence and Implications for Stock Assessment”, 141st American Fisheries Society Annual Meeting held in Seattle, 2011