Online, 09.06 | Tue

#60 Talk4Devs | Data Science @ PMI – Journey from business problem to the data product industrialization | If everything seems under control, you just aren't going fast enough

Michał Dyrda - Data Science @ PMI – Journey from business problem to the data product industrialization

Data Science is not a one-man show. It is a team effort that requires every team member to master the tools of the trade. This is extremely important for putting data science to work effectively in a global organization. We introduced best practices for starting, developing and shipping data science products; they are currently used by 30+ data scientists across the three locations where PMI established its data science labs in 2017.
We inherited technologies (e.g. Kubernetes, Docker, Jenkins) and methods (CI/CD with GitFlow) from software engineering. They help us create data science products, starting from the business requirement, through the proof of concept, up to the industrialization phase. The use cases we develop provide transparency and reproducibility at each stage of development.
So far, we have delivered tangible results from dozens of use cases to business users.
One of the examples is Duty-Free Portfolio Optimization. 
Have you ever wondered what is behind the choice of products available on store shelves? Why can you buy different products at a given airport than at a train station? In this talk I would like to show you the techniques and tools used in the portfolio optimization problem at Philip Morris International, which allow us to progress towards a smoke-free future. I will also share our best practices for developing data science products and shipping them to production.

Jarek Pałka - If everything seems under control, you just aren't going fast enough.

Have you ever wondered what you need to do to make your code run faster? Or how to become a "performance man"? Perhaps you deal with performance only from time to time, from one failure to the next "severity 1" incident? And you have this weird feeling that you are doing it wrong? Maybe you've never cared how fast your code runs. After all, it has been known for ages that this is always a database problem (or someone else's problem). Or maybe it's just hard to admit that you don't know how to improve your code?

In this presentation, I will show you how to become a programmer who is aware of the performance of their code, with the help of tools such as JMH, JFR and flame graphs. We will focus not only on tools but also on the process of optimizing performance. We will talk about how good-quality code, the so-called Clean Code, affects performance, why the data sets we use are crucial, and when more is not faster.

Michał Dyrda

Ph.D. in astrophysics. More than 12 years of experience working in data science. Started my data journey working with data from different astrophysical experiments, then worked with various business and engineering teams to provide new insights. Experienced in designing and implementing complex software systems for various international projects, including big data analysis projects.

Currently Data Science Best Practice Team Lead @PMI and Senior Enterprise Data Scientist in Cracow, Poland, working daily on technology to increase the business value provided by the Data Science team. Putting a lot of effort into pushing data science solutions into production and reducing time to market for data products. Addicted to long-distance runs together with my dog.

Jarek Pałka

For more than 20 years in the IT industry, as a database administrator, programmer, architect, manager and "onsite disaster engineer". At the moment, working at Neo4j as a performance engineer, enjoying the way of code and exploring the dungeons of the JVM and OS, after a few years as chief architect in a SaaS business and tech lead. I took part in small, medium and large nonsense projects, run under the principles of "Waterfall", Agile and in the absence of any methodology, always with the same effect. This led me to the conclusion that it doesn't matter what you do, as long as you do it well, in the simplest possible way, and use appropriate tools that do the work for you. In the meantime, I fell in love with the ideas of TDD and Software Craftsmanship and explored to the limit ideas beautiful in their simplicity, such as REST and NoSQL, only to abandon them to explore the secrets of "systems thinking", admire the strength that "metaphor" brings, and discover that we are all objects in an eternal virtual machine. A humble follower of the church of the JVM, bytecode and JIT researcher, exploring all sorts of parsers, interpreters and compilers.

From time to time you can hear my low-quality jokes about architecture at conferences in Poland. I am also the author of a blog and the self-proclaimed dictator of the program committee at the SegFault, CoreDump, 4Developers and JDD conferences.