PyCon X

Florence

2-5 May 2019

Playing with Data Using Python and Pandas

Elevator Pitch: This is the era of data technology rather than information technology. Data is the new gold, and to make sense of raw data we need to perform ETL operations. Python, pandas and NumPy make this easier: gather the raw data (Extraction), clean and manipulate it (Transformation), and export the dataset in the required format (Loading).
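As a rough illustration of the three ETL steps named in the pitch, here is a minimal sketch. The sample data, the column names and the `run_etl` helper are all invented for this example and are not taken from the workshop material.

```python
import io

import pandas as pd

# Hypothetical raw data; in practice this would come from a file or URL.
RAW_CSV = """name,city,amount
alice,Florence,10
bob,,20
carol,Penang,
alice,Florence,5
"""


def run_etl(raw_text: str) -> str:
    # Extraction: parse the raw CSV text into a DataFrame.
    df = pd.read_csv(io.StringIO(raw_text))

    # Cleaning: drop rows that are missing a city or an amount.
    df = df.dropna(subset=["city", "amount"])

    # Transformation: normalise the names, then total the amount per person.
    df = df.assign(name=lambda d: d["name"].str.title())
    totals = df.groupby("name", as_index=False)["amount"].sum()

    # Loading: export as CSV text (pass a file path to to_csv to write a file).
    return totals.to_csv(index=False)


result = run_etl(RAW_CSV)
print(result)
```

Reading from an in-memory string keeps the sketch self-contained; the same calls work with a file path.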

Installation of the pandas and IPython packages using pip, with a quick guide for Anaconda and Spyder

- Check the installed versions of Python and pandas
- Reading csv/txt/excel/json files using pandas
- Understanding what a DataFrame is
- Understanding what a Series is
- Using head, tail, describe
- Getting your hands dirty with data manipulation
- Create a pandas DataFrame
- Create blank DataFrames
- Data filtering
- How to select an index or column from a pandas DataFrame
- Using lambda expressions, adding new columns to an existing DataFrame
- Joins of DataFrames
- Iterating over the dataset
- Filter, sort and groupby
- Removing blank values from a DataFrame
- Replacing values in a DataFrame (data transformation)
- Playing with time series data
- Writing the new DataFrame to csv, excel and txt formats
- Plotting the DataFrame using matplotlib (and discussing other alternatives)
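To give a flavour of the data manipulation topics in the outline (creating a DataFrame, selecting a column, adding a column with a lambda, joining, filtering, sorting and grouping), here is a small sketch. The `orders` and `customers` tables and their column names are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "customer": ["ann", "ben", "ann", "cat"],
        "price": [10.0, 25.0, 40.0, 5.0],
        "quantity": [2, 1, 1, 4],
    }
)
customers = pd.DataFrame(
    {"customer": ["ann", "ben", "cat"], "country": ["IT", "MY", "IT"]}
)

# Selecting a single column returns a Series.
prices = orders["price"]

# Add a new column with a lambda applied row by row.
orders["total"] = orders.apply(lambda row: row["price"] * row["quantity"], axis=1)

# Join the two DataFrames on the shared "customer" column.
merged = orders.merge(customers, on="customer", how="left")

# Filter, sort and group.
large = merged[merged["total"] >= 20].sort_values("total", ascending=False)
per_country = merged.groupby("country")["total"].sum()

print(merged.head())
print(per_country)
```

The row-wise lambda is shown because the outline mentions lambda expressions; for plain arithmetic like this, `orders["price"] * orders["quantity"]` would be the faster vectorised form.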

Audience Level: Intermediate (Python programming experience is required)

Description:

- Installation of the pandas and IPython packages using pip (5 mins), with a quick guide for Anaconda and Spyder
- Check the installed versions of Python and pandas (2 mins)
- Reading csv/txt/excel/json files using pandas (10 mins)
- Understanding what a DataFrame is
- Understanding what a Series is
- Using head, tail, describe
- Getting your hands dirty with data manipulation (20 mins)
- Create a pandas DataFrame
- Create blank DataFrames
- Data filtering
- How to select an index or column from a pandas DataFrame
- Using lambda expressions, adding new columns to an existing DataFrame
- Joins of DataFrames
- Iterating over the dataset
- Filter, sort and groupby
- Removing blank values from a DataFrame (5 mins)
- Replacing values in a DataFrame (data transformation) (5 mins)
- Playing with time series data (15 mins)
- Writing the new DataFrame to csv, excel and txt formats (10 mins)
- Plotting the DataFrame using matplotlib, and discussing other alternatives (15 mins)
- Dask: increasing performance using parallel computing (15 mins)
- Q&A session (10 mins)

Total duration: around 2 hours
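The cleaning and time series items in the description could look roughly like the sketch below. The readings, timestamps and the "N/A" placeholder are hypothetical values chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical 12-hourly sensor readings, invented for illustration.
index = pd.to_datetime(
    [
        "2019-05-02 00:00",
        "2019-05-02 12:00",
        "2019-05-03 00:00",
        "2019-05-03 12:00",
        "2019-05-04 00:00",
        "2019-05-04 12:00",
    ]
)
readings = pd.DataFrame(
    {
        "status": ["ok", "N/A", "ok", "ok", "N/A", "ok"],
        "value": [1.0, np.nan, 3.0, 4.0, 5.0, np.nan],
    },
    index=index,
)

# Replace a placeholder string with a clearer label.
readings["status"] = readings["status"].replace("N/A", "unknown")

# Remove rows whose value is blank (NaN).
clean = readings.dropna(subset=["value"])

# Time series: resample the 12-hourly values to a daily mean.
daily = clean["value"].resample("D").mean()

print(daily)
```

Resampling needs a datetime index, which is why the timestamps are parsed with `pd.to_datetime` before building the DataFrame.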

Notes: I have been working as a Python developer for the last 4 years and have run many workshops for students and teachers at engineering institutes. I have given in-house training to 50+ employees, helping them become intermediate Python programmers. I started a Python meetup group in Penang, Malaysia. Open source, AI and data science enthusiast.

Objective: Understand how to use the pandas library for data cleansing and transformation. By the end of this workshop, you will be able to work with the Python pandas library.

Special requirements: Anaconda 5.x (Python 3.x version). Download link: https://www.anaconda.com/download/
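Once the environment is installed, a quick check like the following sketch can confirm that Python 3.x and pandas are available. The variable names are illustrative only.

```python
import sys

import pandas as pd

# Report the interpreter and pandas versions in use.
python_version = "{}.{}.{}".format(*sys.version_info[:3])
pandas_version = pd.__version__

# The workshop assumes a Python 3.x interpreter.
is_python3 = sys.version_info.major == 3

print("Python:", python_version)
print("pandas:", pandas_version)
print("Python 3.x:", is_python3)
```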

Scheduled on Saturday 4 May at 17:15

Comments

  1. Do you have a docker image for the training? It would save a lot of time and deliver a better experience for trainee.
    — Roberto Polli,
  2. Currently I don't have but I can prepare it immediately if you guys need it.
    Please do, let me know about it.
    — abhijeet mote,
