Effective Memory Management
In this unit, we examine the most common source of poor performance in scientific Python projects: memory pressure. By understanding when, how, and how much memory is copied in a given operation, we can reduce the amount of preparatory work our software does to get our data where it’s needed most: the processor doing the calculation we care about.
Sessions
Understanding and Controlling Memory Usage in NumPy
In this session, we explore how NumPy uses memory: we calculate how much space our data really takes, examine how arrays are created, and investigate when memory is copied, reused, or temporarily expanded.
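As a rough preview of the kind of measurement this session involves, the sketch below checks an array's size, tells a view apart from a copy, and halves the footprint by changing the data type. The array shape and variable names are illustrative assumptions, not taken from the session material.

```python
import numpy as np

# One million float64 values: 8 bytes each.
a = np.zeros((1000, 1000), dtype=np.float64)
size_bytes = a.nbytes  # 1000 * 1000 * 8

# Basic slicing returns a view: no data is copied.
view = a[::2]
is_view = np.shares_memory(a, view)

# Indexing with a list of integers ("fancy" indexing) returns a copy.
picked = a[[0, 1, 2]]
is_copy = not np.shares_memory(a, picked)

# Converting to float32 allocates a new array half the size.
small = a.astype(np.float32)
small_bytes = small.nbytes

# An in-place operation reuses the existing buffer, whereas
# `a = a + 1.0` would allocate a second full-size array.
a += 1.0

print(size_bytes, is_view, is_copy, small_bytes)
```

`np.shares_memory` is a convenient way to check whether an operation copied data without having to reason about the indexing rules by hand.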
Data Representation and Disk IO: Performance Beyond RAM
In this session, we measure what happens when we write arrays to disk, compare text and binary formats, and explore how data types determine both memory usage and file size.
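A minimal sketch of such a comparison follows, assuming NumPy's own `savetxt` and `save` as the text and binary writers; the session itself may use other formats. The file names and array size are hypothetical.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
data = rng.random(10_000)  # float64, 8 bytes per value in memory

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.txt")
    npy_path = os.path.join(tmp, "data.npy")
    npy32_path = os.path.join(tmp, "data32.npy")

    # Text: every value is written out as a string of decimal digits.
    np.savetxt(txt_path, data)

    # Binary: the raw bytes of the array plus a small header.
    np.save(npy_path, data)

    # Same values at lower precision: half the bytes per element.
    np.save(npy32_path, data.astype(np.float32))

    txt_size = os.path.getsize(txt_path)
    npy_size = os.path.getsize(npy_path)
    npy32_size = os.path.getsize(npy32_path)

print(f"in memory: {data.nbytes} bytes")
print(f"text:      {txt_size} bytes")
print(f"binary:    {npy_size} bytes")
print(f"float32:   {npy32_size} bytes")
```

The binary file is only slightly larger than the in-memory array, so the data type chosen for the array largely determines the file size as well.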
Structured Scientific Data with HDF5: Design, Access, and Compression
In this session, we use the h5py library to explore how HDF5, one of the most widely used scientific data formats, works at a practical level and how it can reduce memory pressure.
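One way HDF5 can ease memory pressure is by letting us read only the part of a dataset we need. The sketch below writes a chunked, gzip-compressed dataset and then reads back a single block of rows. The dataset name `measurements`, the chunk shape, and the compression level are assumptions chosen for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# A repetitive 1000 x 1000 float64 array (8 MB in memory).
data = np.tile(np.arange(1000, dtype=np.float64), (1000, 1))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.h5")

    # Store the array in chunks of 100 rows, each compressed with gzip.
    with h5py.File(path, "w") as f:
        f.create_dataset(
            "measurements",
            data=data,
            chunks=(100, 1000),
            compression="gzip",
            compression_opts=4,
        )

    file_size = os.path.getsize(path)

    # Opening the file does not load the dataset; slicing reads
    # only the requested rows into a NumPy array.
    with h5py.File(path, "r") as f:
        dset = f["measurements"]
        dset_shape = dset.shape
        block = dset[0:100, :]

print(f"full array in memory: {data.nbytes} bytes")
print(f"file on disk:         {file_size} bytes")
print(f"block read back:      {block.nbytes} bytes")
```

Here the slice matches one chunk exactly, so only that chunk has to be decompressed. How well real data compresses depends on its content; this array is deliberately repetitive.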