PyCon X


2nd - 5th May 2019

Intro to Natural Language Processing in Python

This tutorial will introduce the audience to the fields of Natural Language Processing (NLP) and Machine Learning (ML) using a hands-on approach. Analysing and truly understanding natural language is a difficult task, but Python makes this topic approachable, thanks to its rich ecosystem and active community.

First, we’ll provide an overview of NLP using NLTK, the Natural Language Toolkit, one of the most popular Python libraries for NLP. We’ll discuss some of the most important text processing steps and how they impact the quality of our final application.
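As a rough illustration of the kind of text processing steps mentioned above (not necessarily the ones covered in the tutorial), the sketch below tokenises a sentence, drops punctuation and stop words, and stems what remains. The `preprocess` helper and the tiny `STOP_WORDS` set are assumptions made for this example; NLTK ships a fuller stop-word list, but it requires a one-off data download.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# Tiny illustrative stop-word list (an assumption for this sketch).
# NLTK's own list is available after nltk.download("stopwords").
STOP_WORDS = {"the", "a", "an", "and", "are", "is", "of", "on", "to", "in"}

stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, tokenise, drop punctuation and stop words, then stem."""
    tokens = wordpunct_tokenize(text.lower())
    # Keep purely alphabetic tokens, which discards punctuation and numbers.
    words = [t for t in tokens if t.isalpha()]
    content_words = [w for w in words if w not in STOP_WORDS]
    return [stemmer.stem(w) for w in content_words]


tokens = preprocess("The cats are running on the mats.")
print(tokens)
```

Each step trades information for robustness: stemming merges "running" and "run" into one feature, which may help a classifier generalise but can also conflate words with different meanings.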

We’ll then move on to some notions of word frequency analysis, such as term frequency (TF) and inverse document frequency (IDF). All these concepts are going to be applied in the context of a text classification system. Using scikit-learn, we’ll understand how to build a classifier, and how to use it to categorise documents according to pre-defined labels. We’ll briefly discuss how to assess the quality of our model, and how we can tune it to achieve better predictions.
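A minimal sketch of such a classification system, assuming a TF-IDF representation and a Naive Bayes model (the tutorial may well use a different algorithm or dataset). The toy documents and the "sport" and "tech" labels are invented for illustration only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, made up for this example.
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the players after the match",
    "the new laptop has a fast processor",
    "the software update fixes a security bug",
    "developers released a new programming library",
]
train_labels = ["sport", "sport", "sport", "tech", "tech", "tech"]

# Held-out documents used to assess the model.
test_docs = [
    "the players scored in the match",
    "a bug in the software library",
]
test_labels = ["sport", "tech"]

# The pipeline turns raw text into TF-IDF weights, then fits the classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_docs, train_labels)

predictions = model.predict(test_docs)
accuracy = accuracy_score(test_labels, predictions)
print(list(predictions), accuracy)
```

With a realistic dataset, the evaluation would use a larger held-out split (for example via `sklearn.model_selection.train_test_split`) and tuning could be explored with `GridSearchCV` over parameters such as the vectoriser's `ngram_range`.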

There are no particular requirements to attend this tutorial, so Python beginners are welcome. Some prior basic notions of machine learning can be beneficial, but the theoretical aspects are kept to the bare minimum: the aim of this tutorial is to be practical and approachable. Attendees are encouraged to bring their own laptop with Python 3, NLTK and scikit-learn pre-installed (instructions for pip/virtualenv will be provided, but please feel free to set up the environment in advance).

The material is available at


  1. Hi, I am very interested in this topic.
    Unfortunately I probably cannot be in Florence for the talk that day.
    I would like to know if there will be some options for watching it online.

    Thank you
    — Fabio Pettenuzzo,
  2. Hello Fabio,

    I was there last year, and conferences are live streamed on this website.
    — Téva,
