Talk

The `__magic__` of the Python Model for Magic™ data

Friday, May 26

16:05 - 16:35
Room: Lasagna
Language: English
Audience level: Intermediate
Elevator pitch

The Python Data Model is __magic__ (like, literally!), and what better use case than processing a dataset of Magic: The Gathering™ cards to deep dive into the power of the __dunders__ and create flexible data representations for complex pipelines?

Abstract

Everything in Python is an object! This may not sound surprising at first: Python is an object-oriented language, after all. But objects are Python’s abstraction for data: all data in a Python program is represented by objects, or by relations between objects [1]. Data and objects are therefore so tightly coupled in Python that the two terms can be used almost interchangeably. This concept is very powerful, and it becomes even more compelling in data science, where data abstractions are essential for sensible data processing.
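As a small illustration of this point (my own sketch, not taken from the talk materials), the snippet below checks that integers, strings, functions, classes and `None` are all objects, and that an operator such as `+` or a built-in such as `len` is dispatched to a special method on the object.

```python
# Everything is an object: integers, strings, functions, classes and None
# can all be stored in a list and inspected like any other value.
def greet(name):
    return f"Hello, {name}!"


values = [42, "forest", greet, int, None]
all_objects = all(isinstance(value, object) for value in values)

# Operators and built-ins are dispatched to special ("dunder") methods.
sum_via_operator = 1 + 2
sum_via_dunder = (1).__add__(2)
length_via_builtin = len("forest")
length_via_dunder = "forest".__len__()

print(all_objects)
print(sum_via_operator == sum_via_dunder)
print(length_via_builtin == length_via_dunder)
```

Calling the dunder directly is only done here to make the dispatch visible; everyday code should keep using the operator or built-in.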

In this talk, we will work our way through Python __magic__ methods and custom class definitions to create reusable data abstractions. To make things more entertaining, we will use a fun (but challenging!!) data case to drive the whole talk: processing a dataset of Magic: The Gathering™ (M:TG) cards. And what better use case than M:TG to showcase the power of the __magic__ methods for creating flexible data representations?

We will start by creating the skeleton of basic Python abstractions to ingest a collection of unstructured JSON data as provided by the Scryfall Cards API. Throughout the talk, we will extend these abstractions with more and more features to cover multiple use cases that can arise in real data science applications: from data collection and filtering to data transformation for machine learning.
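To give a flavour of what such a skeleton might look like, here is a minimal sketch under stated assumptions. The `Card` and `CardCollection` classes, the `from_json` and `filter` helpers, and the `RAW_CARDS` sample are all hypothetical names invented for illustration, not the code presented in the talk. The sample records are written by hand and only mirror a few field names found in Scryfall card objects; nothing is downloaded.

```python
from dataclasses import dataclass

# Hand-written sample records (assumption): the keys mirror a few fields of
# Scryfall card objects, but no network request is made here.
RAW_CARDS = [
    {"name": "Lightning Bolt", "mana_cost": "{R}", "cmc": 1.0,
     "type_line": "Instant", "colors": ["R"]},
    {"name": "Counterspell", "mana_cost": "{U}{U}", "cmc": 2.0,
     "type_line": "Instant", "colors": ["U"]},
    {"name": "Black Lotus", "mana_cost": "{0}", "cmc": 0.0,
     "type_line": "Artifact", "colors": []},
]


@dataclass(frozen=True)
class Card:
    """Hypothetical card abstraction.

    The dataclass decorator generates __init__, __repr__ and __eq__ for us.
    """

    name: str
    mana_cost: str
    cmc: float
    type_line: str
    colors: tuple

    @classmethod
    def from_json(cls, record):
        # Missing keys fall back to neutral defaults instead of raising.
        return cls(
            name=record["name"],
            mana_cost=record.get("mana_cost", ""),
            cmc=float(record.get("cmc", 0.0)),
            type_line=record.get("type_line", ""),
            colors=tuple(record.get("colors", ())),
        )


class CardCollection:
    """Hypothetical container that behaves like a sequence of cards."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __iter__(self):
        return iter(self._cards)

    def __getitem__(self, index):
        # Slices return a new collection, integers return a single Card.
        if isinstance(index, slice):
            return CardCollection(self._cards[index])
        return self._cards[index]

    def __contains__(self, name):
        # Supports membership tests by card name.
        return any(card.name == name for card in self._cards)

    def __repr__(self):
        return f"CardCollection({[card.name for card in self._cards]})"

    def filter(self, predicate):
        # Keep only the cards for which predicate(card) is true.
        return CardCollection(card for card in self._cards if predicate(card))


collection = CardCollection(Card.from_json(record) for record in RAW_CARDS)
instants = collection.filter(lambda card: "Instant" in card.type_line)

print(len(collection))
print("Counterspell" in collection)
print(instants)
```

With only a handful of special methods, the collection works with `len()`, `in`, indexing, slicing and `for` loops, which is the kind of flexibility the rest of the talk builds on.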

With this talk, I hope to show you how amazing the Python Data Model is, and to pass on some of its __magic__ so that we can all become better Pythonistas.

Side Note

I won’t assume any specific prior knowledge as a requirement to attend and follow the examples. Familiarity with the Python language is desirable, though, hence I picked “Intermediate” as the audience level.

I wouldn’t dare consider this talk interesting for absolute beginners, but I believe that any developer or practitioner with some experience with Python could benefit from its content. The ideal personas I have in mind are Python developers and data scientists who are interested in developing (better) Python solutions for their data problems.

Tags: Best Practice, Abstractions, Data Structures

Valerio Maggio

Valerio Maggio is a Researcher, a Data Scientist Advocate at Anaconda, and a casual “Magic: The Gathering” wizard. He is well versed in open science and research software, supporting the adoption of best software development practices (e.g. code review) in data science. Valerio is also an open-source contributor and an active member of the Python community.