Glossary

A B C D E F G H I J K L M

N O P Q R S T U V W X Y Z

A

  • Altmetrics: A project that produces article-level metrics for scholarly articles from information collected from the Internet, such as social media sites, newspapers, and other sources.

  • Arbodat: A database program based on MS Access. It links several data tables containing data from archaeobotanical analyses, details of archaeological excavations, and data about ecological and other traits of plant taxa.

  • Archaeobotany: The study of plant remains from archaeological sites.

  • Article processing charge (APC): Also known as a publication fee, this is a fee sometimes charged to authors to make a work available open access in either an open access journal or a hybrid journal. The fee may be paid by the author, the author’s institution, or their research funder.

B

  • Binder: The Binder Project is a software project to package and share interactive, reproducible environments. A Binder or “Binder-ready repository” is a code repository that contains both code and content to run, and configuration files for the environment needed to run it.

C

  • Citizen science: The involvement of members of the public in scientific research.

  • Computational environment: Features of a computer which can impact the behaviour of work done on it, such as its operating system, what software it has installed, and what versions of software packages are installed.

  • Container: A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.

  • Contributing guidelines: Guidelines outlining how a person should go about contributing to an open source project.

  • Creative Commons: A suite of standardized licences that allow copyright holders to grant some rights to users by default. CC licences are widely used, simple to use, machine readable, and have been created by legal experts. There are a variety of CC licences, each of which uses one or more clauses. Some licences are compatible with Open Access in the Budapest sense (CC0 or those carrying the BY, SA, and ND clauses), and some are not (carrying the NC clause).
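The features listed under "Computational environment" above (operating system, installed software, and package versions) can be recorded alongside an analysis. The following is a minimal sketch using only the Python standard library; the function name `describe_environment` and the example package name are assumptions made for illustration, not part of any tool named in this glossary.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return a dict describing the OS, the Python version, and package versions.

    `packages` is a list of distribution names to look up (hypothetical input).
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # the package is not installed in this environment
    return {
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "python_version": platform.python_version(),
        "packages": versions,
    }


if __name__ == "__main__":
    # "pip" is only an example; list the packages your own analysis depends on.
    print(describe_environment(["pip"]))
```

Saving such a record next to the results gives others a starting point for rebuilding a comparable environment, for example in a container or a Binder-ready repository.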

D

  • Data availability statement: A data availability statement (also sometimes called a ‘data access statement’) tells the reader where the research data associated with a paper is available, and under what conditions the data can be accessed. It also includes links, using a DOI where applicable, to the data set, code, and other documentation.

  • Data paper: A data paper is a peer-reviewed document describing a dataset, published in a peer-reviewed journal. It makes datasets more findable and accessible.

  • Digital Object Identifier (DOI): A unique text string that is used to identify digital objects such as journal articles, data sets or open source software releases. A DOI is one type of Persistent Identifier (PID).

E

  • EDI: Equity, diversity and inclusion.

F

  • FAIR data: FAIR data are Findable, Accessible, Interoperable, and Re-usable, in order to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of, task-appropriate scientific data and their associated algorithms and workflows.

G

  • Generalisable: Combining replicable and robust findings allows us to form generalisable results. Note that running an analysis on a different software implementation and with a different dataset does not, on its own, provide generalised results. Many more steps are needed to know how well the work applies to all the different aspects of the research question. Generalisation is an important step towards understanding that the result is not dependent on a particular dataset nor a particular version of the analysis pipeline.

  • Git: Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

  • GitHub: GitHub is a corporation that provides hosting for software development and version control using Git. It is free to use and is commonly used to host open-source projects.

  • Gold open access: The publisher makes all articles and related content available for free immediately on the journal’s website. In such publications, articles are licensed for sharing and reuse via Creative Commons licenses or similar. An article processing charge (APC) is typically paid by the authors.

  • Green open access: Independently of publication by a publisher, the author posts the work to a website controlled by the author, to the research institution that funded or hosted the work, or to an independent central open repository, where people can download the work without paying. This can be a pre-print (the version of the article prior to peer review) or a post-print (the version that has been peer reviewed). This is free for the author.

H

  • Hybrid journal: This is a subscription journal in which some of the articles are open access. This status typically requires the payment of a publication fee (also called an article processing charge or APC) to the publisher in order to publish an article open access, in addition to the continued payment of subscriptions to access all other content.

I

  • Issue: The GitHub term for tasks, enhancements, and bugs for your projects.

J

K

L

  • License: A license is a document that provides legally binding guidelines for the use and distribution of software and other works published on the internet.

M

  • Metadata: Metadata provide a basic description of the data, often including authorship, dates, title, abstract, keywords, and license information. They serve first and foremost the findability of data (e.g. creator, time period, geographic location).

  • Milestone: A milestone is an event or state marking a specific stage in development on the project.

N

O

  • Open access: The practice through which research outputs are distributed online, free of cost or other access barriers.

  • Open data: Data that anyone can access, use and share.

  • Open education: Education that has no academic admission requirements and is typically offered online. Open education broadens access to the learning and training traditionally offered through formal education systems.

  • Open hardware: Physical artifacts of technology designed and offered by the open-design movement.

  • Open lab/notebooks: Laboratory research records, diaries, journals, workbooks etc., offered online free of cost with terms that allow reuse and redistribution of the recorded material.

  • Open materials: Sharing of research materials, for example, biological and geological samples, is another Open Science practice.

  • Open methods/protocols: The practice of documenting and communicating your research methods unambiguously, so that other researchers can easily replicate your exact procedures.

  • Open peer review: Peer validation process conducted openly on the Internet.

  • Open reproducible research: The practice of Open Science combined with giving users free access to the experimental elements needed to reproduce the research.

  • Open repositories: Open archives that host scientific literature and make their content freely accessible to everyone in the world.

  • Open science: The movement to make scientific research (including publications, data, physical samples, and software) transparent and accessible to all.

  • Open source: Software where the source code is available free of cost with terms that allow dissemination and adaptation.

  • Open workflow tools: Apparatuses and services that promote open scientific projects.

P

  • Paradata: Paradata of a data set or survey are data about the process by which the data were collected.

  • Peer community in (PCI): PCI is a non-profit scientific organization that aims to create specific communities of researchers reviewing and recommending, for free, unpublished preprints in their field (i.e. unpublished articles deposited on open online archives like arXiv and bioRxiv).

  • Persistent Identifier: A long-lived method for identifying a resource that is unique, and widely understandable by a community. This includes ORCIDs as an identifier of researchers and digital object identifiers (DOI) as identifiers of research objects.

  • Phytoliths: Microscopic silica bodies formed in living plant cells.

  • Post-print: A digital draft of a research journal article after it has been peer reviewed and accepted for publication, but before it has been typeset and formatted by the journal.

  • Pre-print: A version of a scientific paper that precedes formal peer review and publication in a scientific journal.

  • Preregistration: Researchers have the option or are required to submit important information about their study (for example: research rationale, hypotheses, design and analytic strategy) to a public registry before beginning the study. Preregistration can help counter reporting bias.

  • Proprietary software: Software that requires a paid license to use and is closed-source (the code behind the software, and the code that you produce in your analysis, is not available to see).

  • Python: A high-level, interpreted, general-purpose programming language. Its design philosophy emphasises code readability with the use of significant indentation.

Q

R

  • R: R is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing.

  • README file: A document that introduces an open project to the public and any potential contributors.

  • Registered report: A published report describing the hypotheses and planned method of a study, before the data is collected. Also known as a ‘pre-registration’ or ‘pre-reg’.

  • Replicable/Replication: A result is replicable when the same analysis performed on different datasets produces qualitatively similar answers.

  • Repository or repo: A collection of documents related to your project, in which you create and save new code or content.

  • Reproducible: A result is reproducible when the same analysis steps performed on the same dataset consistently produce the same answer.

  • Reproducible workflow: A transparent record of the research that includes data, methods, and analysis to allow other researchers to review, reproduce and replicate the study.

  • Roadmap: A document outlining the schedule of work to be done on a project.

  • Robust: A result is robust when the same dataset is subjected to different analysis workflows to answer the same research question and a qualitatively similar or identical answer is produced. Robust results show that the work is not dependent on the specificities of the programming language chosen to perform the analysis.
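The distinction between the "Reproducible", "Replicable", and "Robust" entries above can be illustrated with a toy analysis. This is a minimal sketch under stated assumptions: the datasets are simulated, the two analyses (mean and median) and the tolerance of 1.0 for "qualitatively similar" are arbitrary choices for illustration, and none of the names come from the glossary itself.

```python
import random
import statistics


def make_dataset(seed, n=200):
    """Simulate a hypothetical dataset of n measurements centred on 10."""
    rng = random.Random(seed)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


def analysis_mean(data):
    """First analysis workflow: estimate the typical value with the mean."""
    return statistics.mean(data)


def analysis_median(data):
    """Alternative analysis workflow: estimate the typical value with the median."""
    return statistics.median(data)


TOLERANCE = 1.0  # assumed threshold for "qualitatively similar"

dataset_a = make_dataset(seed=1)
dataset_b = make_dataset(seed=2)

# Reproducible: same analysis on the same data gives the same answer.
reproducible = analysis_mean(dataset_a) == analysis_mean(dataset_a)

# Replicable: same analysis on a different dataset gives a similar answer.
replicable = abs(analysis_mean(dataset_a) - analysis_mean(dataset_b)) < TOLERANCE

# Robust: a different analysis on the same data gives a similar answer.
robust = abs(analysis_mean(dataset_a) - analysis_median(dataset_a)) < TOLERANCE

if __name__ == "__main__":
    print("reproducible:", reproducible)
    print("replicable:", replicable)
    print("robust:", robust)
```

Each check changes only one ingredient at a time (the data or the analysis), which is what separates these three terms from a generalisable result, where both change.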

S

T

U

V

  • Version Control: Version control is the management of changes to documents, computer programs, large web sites, and other collections of information in a logical and persistent manner, allowing both tracking of changes and reverting a piece of information to a previous revision.

W

X

Y

Z