PyCon X


2nd - 5th May 2019

How to use pandas the wrong way

The pandas library is a very efficient and convenient tool for data manipulation, but it sometimes hides unexpected pitfalls, which can surface in various and sometimes hard-to-diagnose ways.

By briefly referring to some aspects of the pandas internals, I will review specific situations in which a change of approach can make a difference, for instance in terms of performance.

UPDATE (April 12, 2017) - SLIDES: the talk had very few slides; still, you can find those few, together with the notebooks I used live, here.


  1. Can you pls share some insights?
    — Roberto Polli,
  2. Hadn't seen the comment, sorry. Some examples: storage in blocks and memory usage, (non-)efficient storage in HDF5, concatenated assignments, (non-)vectorization, (non-)efficient use of string methods, unexpected cases of type casting.

    (Nothing extremely advanced, but some "features" might be unexpected even for relatively advanced users)
    — Pietro Battiston,
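
Three of the pitfalls named in the reply above (unexpected type casting, non-vectorized code, and memory usage) can be illustrated with a minimal sketch. This is not taken from the talk's notebooks; the variable names and data are made up for illustration, and the exact memory figures will depend on the pandas version and platform.

```python
import numpy as np
import pandas as pd

# Unexpected type casting: an integer Series is upcast to float64 as soon
# as a missing value appears, because NaN is a float.
ints = pd.Series([1, 2, 3])
with_gap = ints.reindex([0, 1, 2, 3])  # label 3 does not exist, so it becomes NaN

# (Non-)vectorization: both lines compute the same values, but the first
# calls a Python function once per element.
values = pd.Series(np.arange(100_000))
doubled_slow = values.apply(lambda x: x * 2)
doubled_fast = values * 2
same_values = bool((doubled_slow == doubled_fast).all())

# Memory usage: repeated strings stored as Python objects versus as a
# categorical (integer codes plus a small table of distinct values).
labels = pd.Series(["red", "green", "blue"] * 10_000, dtype=object)
object_bytes = labels.memory_usage(deep=True)
category_bytes = labels.astype("category").memory_usage(deep=True)

print(ints.dtype, "->", with_gap.dtype)
print("same values:", same_values)
print(object_bytes, "bytes as object vs", category_bytes, "bytes as category")
```

Timing the two doubling variants (for example with `timeit`) is left out on purpose, since the size of the gap varies by machine and pandas version.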
