Creating citable Code

From HPC Wiki

Latest revision as of 12:41, 14 July 2022

Citability and reproducibility of scientific results have always been an integral part of research. More traditional aspects of research, including theoretical verification and experimental validation, have a long history of peer-reviewed publications, which enable referencing and reproduction by other parties. However, this often does not apply to self-written code, which is becoming more and more common in every part of research, from small evaluation scripts to whole software packages. This page therefore gives a brief overview of how to make your code citable and how to contribute to its reproducibility.


Benefits of citable Code

Naturally, an unavoidable part of creating citable code is making your code publicly available. Therefore, the first step should always be to find out whether your chair or institute agrees to an open publication of the code. Publication does not necessarily mean free use by other parties, so you should look into the licenses that could be suitable for your code. While the availability of your code opens it to criticism by your peers, it will contribute to the advancement of science in several ways[1][2][3]:

  • transparency on how published results were achieved
  • significant improvement of reproducibility
  • fostering of collaborations among researchers across institutions
  • sustainability of the software (through prolonged support / updates)
  • peer-review and validation
  • proper attribution and credit for the code development
  • depending on the license, other researchers are allowed to use the code for their own research including modifications and redistribution

Re-usability can be further promoted by providing sufficient documentation of the code. Guidelines for creating and maintaining good documentation can be found on the Better Scientific Software website. A common tool for C++ code documentation is Doxygen. It is available under the GNU General Public License, is well documented, and extensions exist for other languages such as MATLAB or Perl.
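As a minimal sketch of what such documentation can look like, the following Python snippet uses Doxygen-style comment blocks (lines starting with `##`) together with the Doxygen commands `@file`, `@brief`, `@param`, `@return` and `@throws`. The file name `stats_example.py` and the function `mean` are hypothetical and only serve as an illustration; the same commands are used in comment blocks for C++.

```python
## @file stats_example.py
#  @brief Hypothetical evaluation helper used to illustrate Doxygen comments.

## @brief Compute the arithmetic mean of a sequence of numbers.
#
#  @param values  Non-empty sequence of int or float samples.
#  @return        The mean of the samples as a float.
#  @throws ValueError if `values` is empty.
def mean(values):
    # Reject empty input explicitly instead of failing with a division by zero.
    if len(values) == 0:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)
```

Documenting parameters, return values and error cases next to the code keeps the description close to what it describes, which makes it easier to keep both in sync.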

Making Code citable

As mentioned above, the first step is to make your software publicly available on a platform of your choice. Preferably, the platform offers some type of revision control, as GitLab or GitHub do, which is also beneficial to the overall development of your code. The publication should come with a clear license agreement (not just "open access" or "open source"). Preferably, it should include a license file and a description of the preferred way of referencing / citing the code. Generally, any Creative Commons license should be avoided for code. If the code is intended to be free to use, the GNU GPL can be a good choice. Once the code has been published with an appropriate license, its citability can be further improved by adding metadata. These include [2][4][5]:

  • code version
  • unique persistent identifier like a DOI (digital object identifier)
  • code description
  • code language and utilized code standard
  • code requirements and dependencies (e.g., libraries or hardware)
  • small data set or example to confirm correct functionality
  • related publications
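The metadata items listed above can be kept in a machine-readable record next to the code. The following is only a sketch under stated assumptions: the field names, the example values and the helper `missing_fields` are hypothetical and do not follow any particular metadata standard, and the DOI is a placeholder.

```python
import json

# Hypothetical field names, one per metadata item from the list above.
REQUIRED_FIELDS = (
    "version",
    "identifier",
    "description",
    "language",
    "dependencies",
    "example",
    "related_publications",
)


def missing_fields(metadata):
    """Return the required fields that are absent or empty in `metadata`."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# Placeholder values for illustration only; none of them refer to real software.
metadata = {
    "version": "1.0.0",
    "identifier": "doi:10.0000/placeholder",
    "description": "Evaluation scripts for an example simulation",
    "language": "C++17",
    "dependencies": ["example-library >= 2.0"],
    "example": "examples/minimal_case/",
    "related_publications": ["placeholder reference"],
}

# Serialize the record so it can be stored alongside the source code.
print(json.dumps(metadata, indent=2))
```

A check such as `missing_fields(metadata)` can be run before each release to notice metadata that has not been filled in yet.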

One way to get a DOI for your code is to use the GitHub integration of Zenodo. This of course requires your code to be published on GitHub. Please keep in mind that the DOI is usually only valid for one particular version of your code.
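Because a DOI usually refers to one particular version, a citation should name the version together with its DOI. The sketch below illustrates this with a hypothetical version-to-DOI table and a hypothetical helper `format_citation`; the DOIs, author and title are placeholders, and the citation layout is just one possible choice.

```python
# Hypothetical mapping of released versions to their version-specific DOIs.
# The DOI strings are placeholders, not real identifiers.
VERSION_DOIS = {
    "1.0.0": "10.0000/example.100",
    "1.1.0": "10.0000/example.110",
}


def format_citation(authors, title, version, year):
    """Build a citation string that pins one released version and its DOI."""
    doi = VERSION_DOIS[version]  # raises KeyError for versions without a DOI
    return (
        f"{authors} ({year}). {title} (Version {version}) "
        f"[Computer software]. https://doi.org/{doi}"
    )


print(format_citation("Doe, J.", "Example Solver", "1.1.0", 2022))
```

Looking up the DOI by version makes it explicit that an unreleased or unregistered version cannot be cited this way.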

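Links and more Information

  • Guide: Getting a DOI through Zenodo: https://genr.eu/wp/cite/
  • GNU General Public License: https://www.gnu.org/licenses/licenses.en.html
  • Documentation guidelines on the Better Scientific Software website: https://bssw.io/items?topic=documentation
  • Doxygen: https://doxygen.nl/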

References

  1. https://www.researchsoft.org/guidelines/
  2. https://datascience.nih.gov/tools-and-analytics/best-practices-for-sharing-research-software-faq
  3. DS Katz et al. “Recognizing the value of software: a software citation guide [version 2; peer review: 2 approved]”. In: F1000Research 9.1257 (2021). doi: 10.12688/f1000research.26932.2.
  4. Greg Wilson et al. “Good enough practices in scientific computing”. In: PLOS Computational Biology 13 (June 2017), pp. 1–20. doi: 10.1371/journal.pcbi.1005510.
  5. Neil P. Chue Hong et al. Software Citation Checklist for Developers. Version 0.9.0. Oct. 2019. doi:10.5281/zenodo.3482769.