The Python Data Model is __magic__ (like, literally!), and what better use case than processing a Magic: The Gathering™ cards dataset to deep dive into the power of the __dunders__ to create flexible data representations for complex pipelines?
Everything in Python is an object! This may not sound surprising at first: Python is an object-oriented language, after all. But objects are Python’s abstraction for data: all data in a Python program is represented by objects, or by relations between objects [1]. Data and objects are so tightly coupled in Python that the two terms can almost be used interchangeably. This concept is very powerful, and becomes even more compelling in data science, where data abstractions are essential to sensible data processing.
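As a small illustration of the point above (a sketch, with values chosen only for this example), numbers, strings, functions, classes, and even `None` are all objects, and built-in syntax such as `+` or `len()` is dispatched to special methods defined on the object's type:

```python
# A handful of very different values: all of them are objects.
values = [42, "Black Lotus", len, int, None]
all_objects = all(isinstance(value, object) for value in values)

# Operators and built-ins are dispatched to special ("dunder") methods.
same_sum = (1 + 2) == (1).__add__(2)
same_len = len("mana") == "mana".__len__()

print(all_objects, same_sum, same_len)  # prints: True True True
```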
In this talk, we will work our way through Python __magic__ methods and custom class definitions to create reusable data abstractions. To make things more entertaining, we will consider a fun (but challenging!!) data case to drive the whole talk: processing a Magic: The Gathering™ (M:TG) cards dataset. And what better use case than M:TG to showcase the power of the __magic__ methods to create flexible data representations?
We will start by creating the skeleton of basic Python abstractions to ingest a collection of unstructured JSON data as provided by the Scryfall Cards API. Throughout the talk, we will extend these abstractions with more and more features to cover multiple use cases that can arise in real data science applications: from data collection and filtering to data transformation for machine learning.
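To give a flavour of what such a skeleton might look like, here is a minimal sketch. The `Card` and `CardCollection` classes are hypothetical names made up for this example, not the abstractions built in the talk. The sample data is hand-written rather than a real API response, and it assumes that card objects expose `name`, `mana_cost`, `cmc`, `type_line`, and `colors` fields.

```python
import json
from dataclasses import dataclass

# Hand-written sample, shaped like the card objects assumed above.
SAMPLE_JSON = """[
  {"name": "Lightning Bolt", "mana_cost": "{R}", "cmc": 1.0,
   "type_line": "Instant", "colors": ["R"]},
  {"name": "Counterspell", "mana_cost": "{U}{U}", "cmc": 2.0,
   "type_line": "Instant", "colors": ["U"]},
  {"name": "Black Lotus", "mana_cost": "{0}", "cmc": 0.0,
   "type_line": "Artifact", "colors": []}
]"""


@dataclass(frozen=True)
class Card:
    """Hypothetical card abstraction; the dataclass generates __init__,
    __repr__, __eq__ and (because it is frozen) __hash__."""
    name: str
    mana_cost: str = ""
    cmc: float = 0.0
    type_line: str = ""
    colors: tuple = ()

    @classmethod
    def from_dict(cls, data):
        # Keep only the fields we model; ignore anything else in the payload.
        return cls(
            name=data["name"],
            mana_cost=data.get("mana_cost", ""),
            cmc=float(data.get("cmc", 0.0)),
            type_line=data.get("type_line", ""),
            colors=tuple(data.get("colors", [])),
        )


class CardCollection:
    """Hypothetical container that behaves like a built-in sequence."""

    def __init__(self, cards):
        self._cards = list(cards)

    @classmethod
    def from_json(cls, text):
        return cls(Card.from_dict(item) for item in json.loads(text))

    def __len__(self):                 # len(collection)
        return len(self._cards)

    def __iter__(self):                # for card in collection
        return iter(self._cards)

    def __getitem__(self, index):      # collection[0], collection[:2]
        result = self._cards[index]
        if isinstance(index, slice):
            return CardCollection(result)
        return result

    def __contains__(self, item):      # "Counterspell" in collection
        if isinstance(item, str):
            return any(card.name == item for card in self._cards)
        return item in self._cards

    def __repr__(self):
        return f"CardCollection({len(self)} cards)"


cards = CardCollection.from_json(SAMPLE_JSON)
cheap = [card.name for card in cards if card.cmc <= 1]
print(len(cards), cards[0].name, "Counterspell" in cards, cheap)
```

With only a few special methods in place, the collection already works with `len()`, indexing, slicing, iteration, and the `in` operator, just like a built-in container.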
With this talk, I hope to show you how amazing the Python Data Model is, and to pass on some of its __magic__ to help you become better Pythonistas.
I won’t assume any specific prior knowledge as a requirement to attend and follow the examples. Familiarity with the Python language is desirable, though, which is why I picked “intermediate” as the audience level.
I wouldn’t dare consider this talk interesting for absolute beginners, but I believe that any developer or practitioner with some Python experience could benefit from its content. The ideal personas I have in mind are Python developers and data scientists who are interested in developing (better) Python solutions for their data problems.
Valerio Maggio is a Researcher, a Data Scientist Advocate at Anaconda, and a casual “Magic: The Gathering” wizard. He is well versed in open science and research software, supporting the adoption of best software development practices (e.g. code review) in data science. Valerio is also an open-source contributor and an active member of the Python community.