Kabelsalat

http://philmassie.github.com

Serverless ML

Bring your models to life with AWS Lambdas.

Wrangling data and training models can feel a little... disconnected. In this post I'll walk through deploying an ML model in the cloud. It's quick, easy and free (I think).

PU Learning

Positive/unknown class machine learning approaches

A challenge that keeps presenting itself at work is needing to train a binary classifier without a labelled negative class. Typically the issue is paired with horribly imbalanced data sets and, pressed for time, I have often taken the simplistic route of sub-sampling the unknown set and treating the samples as negatives. Obviously this isn’t ideal: the unknown set is contaminated with positives, and as a result the classifiers don’t train that well.
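To make the simplistic route concrete, here is a minimal sketch of the sub-sampling step, assuming the examples are plain Python lists of feature vectors. The function name `naive_pu_training_set` and the toy data are hypothetical, made up for illustration; this is not the code from the post.

```python
import random


def naive_pu_training_set(positives, unknowns, n_negatives, seed=0):
    """Sub-sample the unknown set and label the sample as negatives (0).

    This is the naive baseline: any positives hiding in the unknown
    set end up mislabelled, which is the contamination problem.
    """
    rng = random.Random(seed)
    # Sample without replacement, capped at the number of unknowns available.
    sampled = rng.sample(unknowns, min(n_negatives, len(unknowns)))
    features = list(positives) + sampled
    labels = [1] * len(positives) + [0] * len(sampled)
    return features, labels


# Hypothetical toy data: two labelled positives and five unknowns.
positives = [[0.9, 0.8], [0.7, 0.95]]
unknowns = [[0.1, 0.2], [0.85, 0.9], [0.3, 0.1], [0.2, 0.4], [0.15, 0.05]]

features, labels = naive_pu_training_set(positives, unknowns, n_negatives=2)
```

The resulting `features` and `labels` can be handed to any binary classifier. Balancing the classes this way is quick, but a sampled unknown that is really a positive (such as `[0.85, 0.9]` in the toy data) would be trained on with the wrong label.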

R NetCDF cheatsheet

NetCDF files are often used to distribute gridded, multidimensional spatial data such as sea surface temperature, chlorophyll-a levels and so on. NetCDF is more than just a file format, and so googling it can be a little intimidating. I hope this helps make these files a little easier to use in R.

South African municipal elections 2016

A visual comparison of party effort

Employing and promoting candidates costs money. Assuming that our political parties don’t have infinite financial resources, it follows that where they invest those resources may be a reasonable proxy for effort. Furthermore, looking at the change in effort adds a temporal dimension, suggesting where effort has increased or decreased between the elections.

Windrose Plots

Plotting a windrose in R with ggplot2

This post covers plotting windroses in R.