Creating and Using DataLad Datasets


This unit introduces the DataLad command-line tool and the basic operations for creating and working with datasets. You will both create your own datasets and use published datasets from OpenNeuro. After completing this unit, you’ll understand how DataLad stores and manages digital objects and their version history, and you’ll know how file content can be added, downloaded and dropped. You’ll also learn how to use the dataset’s commit history to inspect old versions of specific files and restore previous states of the dataset. The more advanced sections in this unit focus on using Git-annex to configure how DataLad behaves (e.g. which files DataLad should annex or how many copies of a file you want to keep).

After this unit, you’ll know the following commands for DataLad, Git and Git-annex (note that the linked command-line reference is very detailed and contains a lot more information than will be covered in the notebooks):

DataLad

Git

Git-annex