Quantifying Data Sharing

Issue #91

Data, Numbers

by Michael Seadle (Humboldt-Universität zu Berlin)


Loren Frank wrote an article for The Transmitter on 8 October 2024 entitled “The S-index Challenge: Develop a metric to quantify data-sharing success”.¹ The context is the neuroscience community, but the issues are far broader. He cites several reasons for the importance of data sharing:


“Sharing the data generated by experiments is critical for reproducibility, and it enables reuse of data that may have taken years to collect. Sharing the code used to transform data into scientific results is also critical, both to boost reproducibility and to reduce the amount of time trainees spend developing these tools.”¹


The US National Institutes of Health are sponsoring a challenge to devise “... the best idea on how to quantify sharing.”¹ The incentive is a million-dollar prize for the best idea. Frank warns that the challenge is not easy:

 

“The index would ideally consider not only how many datasets or software repositories a group shares, but how useful they are to the community. At the same time, … Major hurdles still remain—most notably, limited funding to support data-sharing tools and limited technical expertise to adopt them.”¹

 

The requirement to share the code may discourage corporate researchers, partly because of the complexity of the task:


“… writing code designed to derive a result and writing code that is useful to others can be very different things, and much of the code written in a laboratory is incomprehensible to anyone other than the author (and sometimes even to them.)”¹


Anyone who has done significant coding in a corporate shared-code environment knows how real this problem is, even for people who are trying to write easily maintained code. The index must also make tracking possible:

 

“... the S-index will need to offer a way to track when datasets or code are used by others, as well as how often that use turns into one or more publications.”¹


Tracking also matters because it makes it possible to follow up on errors in the data and in the code, both of which affect reuse.


If someone wins the prize, there is good reason to think that other fields could benefit from the resulting standard, assuming they have the resources and the will to adopt it.

 

1: Frank, Loren. “The S-Index Challenge: Develop a Metric to Quantify Data-Sharing Success.” The Transmitter: Neuroscience News and Perspectives, October 8, 2024. https://www.thetransmitter.org/open-neuroscience-and-data-sharing/the-s-index-challenge-develop-a-metric-to-quantify-data-sharing-success/.
