Polars is a dataframe library for Python, intended to be a modern and performant replacement for pandas.

Basics

What makes Polars better than pandas?

LazyFrames

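One of Polars’ key features is “lazy evaluation”: operations on a LazyFrame are recorded as a query plan rather than executed immediately. This should be the default way we interact with dataframes, because Polars can optimise the whole query under the hood before running it.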

To run in lazy mode, we either use a function that returns a LazyFrame directly (scan_csv() rather than read_csv()) or call the .lazy() method to convert an existing DataFrame to a LazyFrame.

To convert back to a DataFrame (and evaluate all the built-up queries), we call the .collect() method. Use this sparingly and only when necessary, since each call triggers execution of the query.

Resources

See also