CoLRev Tutorial¶

We welcome your feedback and suggestions on this notebook!


The goal of this tutorial is to demonstrate the CoLRev workflow on GitHub codespaces.

Part   Label     Time (min)
1      Init      5
2      Search    15
3      Screen    15
4      Data      10
5      Bonus     –
       Overall   45


Start¶

Start the CoLRev tutorial in a separate window using GitHub Codespaces:

Start Codespace

CoLRev is installed automatically together with required packages. The process takes some time to complete.

You may need to confirm that the initial setup was completed by pressing Enter.

Important: Please run all of the following commands in the shell as shown in the screenshot.
[Screenshot: running the commands in the Codespaces terminal]

1. Init ¶

To get started, simply type colrev init on the command line and press Enter.

In [ ]:
colrev init

colrev init initializes a new CoLRev project. With this operation, the directories and files, including the git history, are set up. For more information check the documentation.

Notes:

  • Check out the folder scaffold created on the left (a sketch is shown after these notes).
  • When starting the project, it is good practice to use data/data/paper.md as a protocol.
  • You can always add --help to the following commands to display more information (e.g., colrev init --help).
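For orientation, the created scaffold typically includes the following files (a simplified sketch; the exact contents depend on the CoLRev version and the selected review type):

.
├── settings.json        # project settings (search sources, data endpoints, ...)
├── data/
│   ├── records.bib      # main records file (filled once search results are loaded)
│   ├── search/          # search result files
│   └── data/
│       └── paper.md     # protocol / manuscript (paper-md endpoint)
└── .git/                # git history set up by colrev init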

It is good practice to run colrev status regularly to assess the current progress of the review and identify the next steps:

In [ ]:
colrev status

2. Search ¶

To add records, use the colrev search command.

In [ ]:
colrev search --add

Use colrev search --add to add searches of the Crossref API and the DBLP API, using the search term microsourcing.
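For example, the two sources could be added one after the other. The identifiers below assume the built-in colrev.crossref and colrev.dblp SearchSource packages; CoLRev will prompt you for the remaining parameters, such as the search term microsourcing:

In [ ]:
colrev search --add colrev.crossref
colrev search --add colrev.dblp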

After the search, continue with the following commands:

In [ ]:
colrev load
colrev prep
colrev dedupe
  • In the colrev load operation, search results are added to the main records. Records from the search result files are identified based on unique origin IDs and added to the main records file (data/records.bib). For more information check the documentation.
  • In the colrev prep operation, metadata quality of records is improved. Preparation procedures include general formatting rules, SearchSource-specific fixes, and cross-checks with metadata repositories and curated CoLRev repositories. For more information check the documentation.
  • In the colrev dedupe operation, duplicate records are identified and merged. The predecessors of a merged record can be identified through the colrev_origin list field in data/records.bib, enabling ex-post validation and offering the possibility to undo merges. For more information check the documentation.
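To illustrate the colrev_origin field, a merged record in data/records.bib could look roughly like the following (a simplified, made-up sketch; the actual IDs, origin file names, and field values will differ in your project):

@article{ExampleRecord2020,
  colrev_status = {md_processed},
  colrev_origin = {crossref.bib/000001;dblp.bib/000042},
  author        = {Doe, Jane},
  title         = {An Illustrative Title on Microsourcing},
  journal       = {Example Journal},
  year          = {2020},
}

A record with two entries in colrev_origin was merged from two search result files, which is what keeps the dedupe decisions traceable and reversible.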

Notes:

  • At this point, it may be a good idea to check the state of your CoLRev repository via colrev status.
  • If you check the changes introduced by prep, you may recognize that local quality rules (specific to AIS content) were applied.

To update the search, you can simply run the following command again (and again, ...):

In [ ]:
colrev search

3. Screen ¶

The metadata prescreen refers to the inclusion or exclusion of records based on titles and abstracts (if available).

In [ ]:
colrev prescreen

The main purpose of colrev prescreen is to reduce the number of records by excluding those that are clearly irrelevant to the review objectives. When in doubt, records can be retained (included provisionally) and decided on later in the screen, which is based on the full-text documents. For more information check the documentation.

Provide a short explanation of the prescreen and decide on inclusion/exclusion of each record based on the titles and abstracts displayed.

PDF retrieval refers to the activities of acquiring PDF documents (or documents in other formats) and of ascertaining or improving their quality.

In [ ]:
colrev pdfs

colrev pdfs ensures that PDF documents correspond to their associated metadata (no mismatches), that they are machine readable (OCR and semantically annotated), and that unnecessary materials (such as cover pages) are removed. For more information check the documentation.

You may need to collect some PDFs manually. For colrev pdf-get-man, follow the instructions provided by CoLRev on the command line.

You may also need to manually prepare some PDFs via colrev pdf-prep-man.
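Both manual operations are run like the other commands; follow the prompts on the command line:

In [ ]:
colrev pdf-get-man
colrev pdf-prep-man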

The PDF screen refers to the final inclusion or exclusion of records based on PDF documents.

In [ ]:
colrev screen

In colrev screen, screening criteria, which can be inclusion or exclusion criteria, are a means of making these decisions more transparent (e.g., in a PRISMA flow chart). Records are only included when none of the criteria are violated. For more information check the documentation.

Screening criteria can be added by providing a short name, defining an inclusion (i) or exclusion (e) criterion, and by providing a short explanation. After defining all screening criteria, each record will be screened against each criterion.
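As an illustration, a defined criterion is stored in settings.json in a form roughly like the following excerpt (a sketch only; the criterion name and explanation are made up, and the exact field names may differ across CoLRev versions, so rely on the file generated in your project):

"screen": {
  "criteria": {
    "focus_microsourcing": {
      "explanation": "Study does not focus on microsourcing",
      "comment": "",
      "criterion_type": "exclusion_criterion"
    }
  }
}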

Notes:

  • Both prescreen and screen can be split to distribute the workload among multiple researchers.
  • You can extract abstracts from the PDFs before the screen by running colrev screen --add_abstracts_from_tei.

4. Data ¶

The last step corresponds to the data extraction, analysis and synthesis activities. Depending on the type of review, this step can involve very different activities and outcomes.

In [ ]:
colrev data

In the colrev data operation, records transition from rev_included to rev_synthesized. The data analysis and synthesis can involve different activities (data endpoints). For more information check the documentation.

5. Bonus (Optional) ¶

  1. Analyze how the change history of the review develops by navigating to the Versioning tab on the left, iterating through the commits (the version history), and analyzing the file changes:
[Screenshot: Versioning tab with the commit history and file changes]
  2. The paper-md endpoint is added by default for most review types. It can be used to create a review protocol or a manuscript based on pandoc and CSL citation styles.
  3. Add a PRISMA chart as our second data endpoint (if it is not already in the settings.json):
colrev data --add colrev.prisma

The prisma endpoint generates a PRISMA flow chart based on the current state of the review.

  4. Add a BibTeX export as our third data endpoint:
colrev data --add colrev.bibliography_export

The bibliography-export endpoint exports the records in different bibliographical formats, which can be useful when the team works with a particular reference manager.

  5. Update an existing search.
  6. Conduct a backward search.


Wrap-up ¶

🎉🎈 You have completed the CoLRev Tutorial - good work! 🎈🎉

To use CoLRev for literature review projects, we recommend a local setup.
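A minimal local setup could look like the following sketch (assuming a recent Python 3 and git are available; see the installation section of the documentation for the authoritative instructions):

# install CoLRev from PyPI
pip install colrev

# create a new project directory and initialize the review
mkdir my-literature-review
cd my-literature-review
colrev init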