Guidelines on Research Practice in Computer Science

Compiled by Justin Zobel, Department of Computer Science, RMIT University, Melbourne, Australia, jz@cs.rmit.edu.au, http://www.cs.rmit.edu.au/~jz.

May 1999

Summary

"The broad principles that guide research have been long established. Central to these are the maintenance of high ethical standards, and validity and accuracy in the collection and reporting of data ... The processes of research protect the truth. Communication between collaborators; maintenance and reference to records; presentation and discussion of work at scholarly meetings; publication of results, including the important element of peer refereeing; and the possibility that investigations will be repeated or extended by other researchers, all contribute to the intrinsically self-correcting and ethical nature of research." [The Joint NHMRC/AVCC Statement and Guidelines on Research Practice, May 1997 are currently under review. -JZ, Feb 2004]

The process of research - that is, the discovery of scientific truths - relies on the assumption that researchers observe ethical standards. Over the past decade, the major organisations responsible for research and science, such as the AVCC and NHMRC in Australia, have developed guidelines describing these standards. The principal topics of the AVCC guidelines are Data Storage and Retention, Authorship, Publication, Supervision of Students, Disclosure of Potential Conflicts of Interest, and Research Misconduct.

The NHMRC/AVCC guidelines are written for the general scientific community. In a series of discussions of research ethics that I held with computer scientists, difficulties with the guidelines emerged. In particular, amplification or clarification was needed because in some cases it is not clear how the guidelines should be applied in computer science. This document, then, concerns the ethical conduct of research in computer science. It augments the NHMRC/AVCC guidelines: each section has a brief restatement of the major points in the original guidelines (which every researcher should read in full), focusing on issues of particular relevance to research practice in computer science, followed by commentary based on discussions with practising computer scientists.

The principal recommendations in this document are that:

General considerations

Three elements of the NHMRC/AVCC opening statement are of particular relevance to practising researchers in computer science:

Data storage and retention

The guidelines also consider issues relating to data available through confidential sources.

Commentary

In this context, data does not necessarily include the subject of an experiment - by analogy, a chemist is not required to keep test-tubes of chemicals once the work is complete - but rather the outcomes, results, and conclusions of research. What should be held is not always clear, but at a minimum researchers should keep programs and scripts used to conduct experiments, and if possible should also keep both input and output; but it is recognised that inputs often cannot be retained, because of their size or transient nature. Output as processed for analysis should always be kept.

In other disciplines it is expected that scientists keep careful, dated notebooks describing their experiments, ideas, intentions, methodology, results, and so on. Computer scientists should also keep such notebooks, which can function as effective reminders of the differences between versions of software, of parameters used and data tested, and of failed runs. In conjunction with centralised backups (which are evidence that software, inputs, and outputs were present in the system at certain dates), such notebooks - even if held privately - are an adequate mechanism for retaining data. Notebooks should be bound (that is, not bundles of loose pages), dated, and should clearly identify software, data, versions, and outcomes. Software used to produce recorded outcomes should be retained if possible.
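As an illustration only, the kind of information a notebook entry should identify (date, software version, parameters, data, and outcomes) can also be captured automatically each time an experiment is run. The Python sketch below is one possible way to do this and is not part of the NHMRC/AVCC guidelines; the function names `sha256_of` and `record_run`, the log file layout, and the choice of SHA-256 checksums to identify versions of files are all assumptions made for the example. Such a log might complement a bound notebook and centralised backups, but is not suggested here as a replacement for either.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_run(log_path, script, parameters, inputs=(), outputs=(), note=""):
    """Append one dated entry describing an experimental run to a log file.

    Hypothetical helper: each entry is a single JSON object on its own line,
    identifying the script, parameters, input files, and output files.
    """
    entry = {
        "date": datetime.now(timezone.utc).isoformat(),
        "script": {"path": str(script), "sha256": sha256_of(script)},
        "parameters": parameters,
        # Checksums identify exactly which version of each file was used,
        # even when large inputs themselves cannot be retained.
        "inputs": {str(p): sha256_of(p) for p in inputs},
        "outputs": {str(p): sha256_of(p) for p in outputs},
        "note": note,
    }
    # Append, never overwrite, so that failed runs stay on record too.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry
```

A call such as `record_run("runs.jsonl", "experiment.py", {"seed": 1}, outputs=["results.txt"], note="baseline")` (all file names hypothetical) would add one line to the log. Because only checksums of inputs are stored, the log remains small even when the inputs are too large or too transient to keep.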

Data related to publications should be made available for discussion with other researchers; but any requirements of confidentiality in relation to data should be observed.

Authorship

Commentary

Thus the minimum requirement for authorship of a publication is participation in conceiving, executing or interpreting a significant part of the outcomes of the research reported. Note that this does not include subsidiary tasks such as implementation conducted under the direction of a researcher. Honorary authorship on any basis - seniority, "tit-for-tat", generosity, or coercion - is unacceptable.

There is no simple rule that establishes how much contribution to a paper is enough to merit authorship. Providing comments on a draft or two is almost certainly not sufficient; but conception in detail of the original idea almost certainly is.

On the one hand, a researcher should always be given an opportunity to be included as an author if their contribution has added to the quality of the paper sufficiently for it to be accepted by a more prestigious journal or conference than would otherwise have been the case. On the other hand, involvement in an extended project does not guarantee authorship on every paper that is an outcome of the project; and in most circumstances a researcher will have participated to some degree in every part of a publication of which they are an author, from the commencement or conception of the research to completion of the publication itself. A researcher who has only minimally met requirements for authorship should consider choosing to be acknowledged instead, particularly when the researcher is a supervisor of the other authors. Co-authorship is a consequence of having made a genuine contribution to the intellectual property embodied in a paper, and simply being a student's supervisor is not sufficient to merit co-authorship.

A related issue is author order, since many readers will assume that the first author was the primary contributor. A researcher who is clearly the primary contributor should be listed first. Where there is no obvious first author, possible approaches to ordering include: alphabetical or reverse alphabetical, with an explanatory footnote; a reversal or rotation of the order used on a previous paper by the same authors; choosing the first author based on considerations such as the value to each individual, so that for example in a paper jointly written by a student and supervisor the student should be listed first; or, if all else fails, tossing a coin. The order of authors should always be explicitly discussed prior to submission, and is the joint decision of all of the authors.

A publication should contain due recognition of the contributions made by all participants in the relevant research. The work of research students, research assistants, and the assistance or advice of colleagues, should be properly acknowledged. Acknowledged persons should, if the presence of their name could be interpreted as their endorsing the contents of the paper, be given the opportunity to read the paper prior to submission.

Any groups or organisations that funded or contributed significantly to the research should be acknowledged. Where, for example, the address of an author is not the institution that employed them while they conducted the research, the institution should be acknowledged explicitly.

Publication

Commentary

When presenting research results, researchers should:

Publication of multiple papers based on the same results is improper unless there is full cross-referencing. It should be made unambiguously clear to what degree the results are new.

Other forms of self-plagiarism are the subject of debate. The conflicting positions are represented by two articles: "Copyrights and author responsibilities", H.S. Stone, IEEE Computer, 25:12, December 1992, pp. 46-51; and "Self-plagiarism or fair use?", P. Samuelson, Communications of the ACM, 37:8, August 1994, pp. 21-25. The central issue is whether it is acceptable for authors to reuse their text, even text describing background or principles (in contrast to text describing new research). The arguments concern both copyright and ethics.

Researchers should be aware that some referees consider reuse of text to be unethical, on the grounds that a contribution can only be made once; that even discussion of background or principles can present a new view of a topic that is itself a contribution to understanding of research; and that the quality of such discussion is a factor in the decision as to whether the paper should be accepted. Moreover, any significant reuse of text by a researcher that is not explicitly referenced as a quotation can be regarded as plagiarism once the text has already appeared in a copyrighted form such as a journal or conference proceedings. But the question of how much text constitutes "significant" is open. Stone argues from copyright that reuse of two paragraphs or more is unacceptable; Samuelson argues from an informal survey of colleagues that, without specific permission, a paper in which 30% of the prose is reused is "a grey area", and would recommend less reuse than that. (In contrast, an informal survey of my colleagues suggested that perhaps 10% - one page of a typical conference paper - was a limit and that they would be uncomfortable recommending acceptance of a paper with more than that level of reuse, an exception being a journal submission that was a substantial expansion of a preliminary conference paper.) Researchers should not reuse text to a degree that is likely to be interpreted by a referee as plagiarism. Note that the text of a paper is owned collectively by the authors, regardless of who actually wrote what.

Any plagiarism of the work of others is unacceptable. Note that plagiarism is not limited to copying of material from published papers, but also includes copying of material in electronic form, such as technical articles made available on the Internet or comments or suggestions made in e-mail.

Gender-inclusive language - that is, language that does not specify gender unnecessarily - should be used in all writing, including research publications.

Supervision

Commentary

When a researcher supervises a postgraduate or honours student, the student undertakes a research program under the supervisor's direction, culminating in a written report that is assessed. Often material in the report is the product of joint research, and must be explicitly acknowledged as such. However, if the research or report is substantially the work of the supervisor, the student is in breach of University regulations if the work is submitted as their own. As far as possible, supervisors should ensure that the work submitted by research students is the work of the student, and that the research is valid.

It is improper for a supervisor to publish a student's work without giving appropriate credit (usually authorship) to the student. Where a supervisor is enrolled in a research degree, the supervised project must be distinct from the supervisor's research. That is, the research undertaken by, for example, an honours student should not subsequently be incorporated into the supervisor's assessed work.

Published work that is generated during the course of a postgraduate degree is often jointly attributed to both student and supervisor. It is usually the case that the student has undertaken the bulk of the task: capturing some idea in text, conducting experiments, and creating the paper that describes the idea. However, it is often the case that the paper would not have existed without ongoing input from the supervisor, and that the conception and initial development of the idea is due to the supervisor. In these cases student and supervisor should both claim authorship. This practice of shared authorship does not diminish the student's final work, and it helps to prevent the supervisor from limiting their responsibility to the student and to the quality of the research.

There are also cases in which sole authorship by a student is appropriate, such as during the latter stages of a candidacy when the student can be expected to be in contact with other experts in the field, and is by then expert. Good PhD candidates should be able to prove themselves towards the end of their candidacy by undertaking research largely unassisted. Supervisors should encourage each student to write a paper of which the student is sole author - that is, to develop an idea, conduct experiments and write a paper - and should give the student the freedom to do so. In such cases the supervisor should be consulted prior to submission.

In no circumstances should a supervisor use their position to force a student to include the supervisor as an author; indeed, a supervisor who has only minimally met the requirements for authorship should consider choosing instead to be acknowledged. Nor should a supervisor assume that they are automatically an author of a student's paper - authorship should always be explicitly discussed. Disputes over authorship should be raised at the earliest opportunity, and taken to the research coordinator or the head of department if they cannot be resolved amicably.

A supervisor, and in particular the senior supervisor in a supervisory team, is responsible for ensuring that the student has reasonable access to resources necessary for their project and that the academic aspects of the program proceed steadily and at a sufficient rate.

Disclosure of potential conflicts of interest

Research misconduct

Misconduct or Scientific misconduct is taken here to mean fabrication, falsification, plagiarism, or other practices that seriously deviate from those that are commonly accepted within the scientific community for proposing, conducting, or reporting research. It includes the misleading ascription of authorship including the listing of authors without their permission, attributing work to others who have not in fact contributed to the research, and the lack of appropriate acknowledgment of work primarily produced by a research student/trainee or associate. It does not include honest errors or honest differences in interpretation or judgements of data.

Examples of research misconduct include but are not limited to the following:

Misappropriation: A researcher or reviewer shall not intentionally or recklessly

Interference: A researcher or reviewer shall not intentionally and without authorization take or sequester or materially damage any research-related property of another, including without limitation the apparatus, reagents, biological materials, writings, data, hardware, software, or any other substance or device used or produced in the conduct of research.

Misrepresentation: A researcher or reviewer shall not with intent to deceive, or in reckless disregard for the truth,

Commentary

Thus misconduct in research includes:

Misconduct does not include honest errors or honest differences in interpretation or judgements of data; but significant errors should be corrected as promptly as possible. Accusations of misconduct or unethical behaviour are serious matters and should only be made after careful consideration.

Refereeing and examination

The guidelines in this section are adapted and extended from a resolution passed by the Transactions Advisory Committee of the IEEE Computer Society.

Researchers should not referee a paper or examine a thesis where there is a real or perceived conflict of interest, or where there is some reasonable likelihood that it will be difficult for the referee to maintain objectivity. Examples are:

In such cases, the referee should notify the editor that an alternative referee should be sought. Where there is no alternative referee, the editor may request that the referee evaluate the work despite the conflict. In such cases, assuming that the conflict is not so severe as to prohibit objectivity, the referee may evaluate the paper but the referee's report should carry an appropriate caveat.

Referees should respect the confidential nature of the papers they referee. Such papers should not be shown to colleagues, except as part of the refereeing process; they should not be used as a basis for the referee's own research or for the referee's personal gain; and referees should not indicate whose papers they have been reviewing or the outcome of the review process. It will however sometimes be the case that the authors have already made the work publicly available, for example on the Internet, in which case the publicly available version does not have to be treated in a confidential manner.

When a referee recommends acceptance of a paper, the referee is assuring the technical content, originality, and proper credit to previous work to the best of the referee's ability to judge these aspects; a referee should not recommend acceptance if the paper is not of adequate standard in some respect. The onus is on the referee to take sufficient care to fully evaluate the paper. Referees who are not able to assure the quality of the paper should not recommend acceptance without an appropriate caveat.

Referees should make every endeavour to complete reviews in a timely and professional manner. Reviews should be constructive. For example, in the review process it may be possible to correct a proof or generalise a result, and thus strengthen the paper anonymously on behalf of the author. Rejections should clearly explain not only the faults of the paper but also a process that the author might use to produce a more acceptable outcome. Even in the case of a paper that a referee believes to be totally without contribution, it is helpful to explain how the author might verify for themselves whether this evaluation is correct. Every paper, no matter how weak, should have a careful and thorough review. Researchers should only decline to referee a paper with good reason.

Acknowledgements

Many people contributed to the content of this document. In particular I am grateful to Alistair Moffat, Lin Padgham, and Ross Wilkinson.
