File and Data Management
Explore database management with SQL, DuckDB, HDF5, and JSON to seamlessly integrate and analyze complex neuroscience datasets.
Neuroscience is evolving rapidly, with experimental data becoming increasingly complex. How can you seamlessly integrate vast and diverse datasets for insightful analysis and easy sharing? And how would your process improve if, instead of having to write long scripts, you could analyze data with just a few lines of code?
In this course, discover the power of database management systems, a game-changer in neuroscience research. We will dive into the world of SQL and learn about the DuckDB SQL engine, which makes it easy to apply industry-standard data organization methods to research data as a relational database, with no server management needed. You’ll also gain hands-on experience with HDF5 and JSON for key-value data storage, and learn how to combine these techniques into hybrid database systems that balance convenience and performance.
By the end, you’ll be adept at writing Python scripts to create databases and extract data from them, query large databases in SQL, store complex data in HDF5, manage your work with Git, and publish your projects on GitHub.
Prerequisites: This workshop is ideal for neuroscience researchers at any level (Master's student, PhD candidate, postdoc, PI) with some background in data analysis using MATLAB, Python, or R.
Installation
To run the course materials on your own machine, it is recommended that you:
- Install VSCode as your editor
- Install pixi or alternatively conda to create virtual Python environments (see the lessons on environment and package management)
- Create a dedicated folder for this course and install the virtual environment:
With pixi, download the pixi.toml file and install the environment:
pixi install --manifest-path pixi.toml
pixi shell
With conda, download the environment.yml file and install the environment:
conda env create -f environment.yml
conda activate file_and_data_management
Course Contents
Organizing Structured Data
Organizing Data into Dictionaries
Organizing data into dictionaries for key-value mapping
Extracting Metadata from Strings
Extracting meaningful information from filenames
Navigating and Searching through Local and Remote Filesystems
Navigating the filesystem
Managing and navigating files and directories
Sciebo/NextCloud/Owncloud Folders as a Remote Filesystem
Managing project workspaces with Pixi
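As a small taste of the first two lessons listed above (organizing data into dictionaries and extracting metadata from filenames), here is a minimal Python sketch. The naming scheme `<subject>_<session>_<date>.<ext>`, the example filenames, and the helper `parse_filename` are assumptions made up for illustration; they are not taken from the course materials.

```python
import re

# Hypothetical naming scheme (an assumption for illustration):
#   sub-<id>_ses-<number>_<YYYY-MM-DD>.<extension>
FILENAME_PATTERN = re.compile(
    r"(?P<subject>sub-[A-Za-z0-9]+)"
    r"_(?P<session>ses-\d+)"
    r"_(?P<date>\d{4}-\d{2}-\d{2})"
    r"\.(?P<ext>\w+)"
)


def parse_filename(name: str) -> dict[str, str]:
    """Return the metadata encoded in a filename as a dictionary."""
    match = FILENAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"filename does not follow the expected scheme: {name!r}")
    # groupdict() maps each named group to the text it matched
    return match.groupdict()


# Made-up example filenames
filenames = [
    "sub-01_ses-1_2024-03-05.h5",
    "sub-02_ses-1_2024-03-06.h5",
]

# Key-value mapping: filename -> metadata dictionary
records = {name: parse_filename(name) for name in filenames}

for name, metadata in records.items():
    print(name, metadata)
```

Keeping the metadata in dictionaries keyed by filename makes it straightforward to look up a single recording or to filter recordings by subject, session, or date later on.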