With South Africa's municipal elections in a few days' time (3rd August 2016), I wondered about the effort being expended by the parties in each district or municipality. Has it increased or decreased since the 2011 elections? I must point out that I am not a political analyst or a political scientist. I am nevertheless a scientist with an appreciation for analytics.

Employing and promoting candidates costs money. Assuming that our political parties don't have infinite financial resources, it follows that where they invest those resources may be a reasonable proxy for effort. Furthermore, looking at the change in effort adds a temporal dimension, suggesting where effort has increased or decreased between the two elections.

The map below shows the change in PR candidate representation by each party across the country. If you hover your mouse pointer over a district, you can see by how much the representation has changed in terms of two metrics:

Relative change: The change in a party's share of all PR candidates in a district. E.g. in 2011 parties A and B each had 50% of a district's PR candidates. In 2016 party A has 60% and party B has 40%. Party A's relative change is +10% (percentage points) and party B's is -10%.

Absolute change: The change in actual number of PR candidates in a party from 2011 to 2016.
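
To make the two metrics concrete, here is a minimal sketch in base R that reproduces the party A and party B example above. The head counts (10 PR candidates per year) are an assumption chosen to give the 50/50 and 60/40 splits; they are not real data.

```r
# Toy data for the example above. The counts are made up so that the
# shares come out as 50/50 in 2011 and 60/40 in 2016.
toy <- data.frame(
  party    = c("A", "B"),
  num.2011 = c(5, 5),
  num.2016 = c(6, 4)
)

# Share of the district's PR candidates held by each party in each year
toy$prop.2011 <- toy$num.2011 / sum(toy$num.2011)
toy$prop.2016 <- toy$num.2016 / sum(toy$num.2016)

# Relative change: difference between the two shares
toy$rel <- toy$prop.2016 - toy$prop.2011
# Absolute change: difference between the two head counts
toy$abs <- toy$num.2016 - toy$num.2011

print(toy)
```

In this toy district the relative changes cancel out (one party's gain in share is the other's loss), while the absolute changes need not cancel if the total number of candidates differs between years.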

Scroll below the map for more info regarding data sources and methods.

Disclaimer: As I said though, I am no political scientist and so maybe this is all nonsense.

Embed this map using the following iframe code:

<iframe width="100%" height="720px" src="https://philmassie.github.io/election_effort_d3/za_election_effort_2016.html" scrolling = "no" frameborder="0" seamless name="iframe-election-content" id="iframe-election-content">
        <p>Your browser does not support iframes. Click <a href="https://philmassie.github.io/election_effort_d3/">here</a> to navigate to the content.</p>
        </iframe>
        

Data sources

The Independent Electoral Commission have published the 2016 candidate lists as PDF documents. The good people at openAfrica have already gone to the trouble of processing these PDFs (which can be a bit painful), so I used their lists instead. openAfrica also host the 2011 candidate lists. After learning a little more about the electoral system, it seemed that the proportional representation (PR) candidates would be the interesting group to look at.

The Municipal Demarcation Board website hosts shape files for South Africa's districts and provinces.

R data processing

Some simple wrangling in R (scroll down for the process) exposed the list of district codes, which I joined with the appropriate candidate lists from each election year. From there I produced a list of parties, ordered by their total number of district or municipal PR candidates. The Economic Freedom Fighters (EFF) lead this list with a total of 1501 candidates, 106 more than the African National Congress (ANC) and 139 more than the Democratic Alliance (DA). I mention these three because they are the three largest parties. I chose to load the map with ANC data because the EFF didn't exist during the 2011 elections, so their numbers cannot have decreased in any district. I believe that a few candidates have withdrawn from the elections. I'm not sure who they were, but I doubt their withdrawal would make any substantive difference to this analysis.
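
Before the full script, the idea of matching candidate rows to district codes can be sketched in a few lines of base R. The labels and the code list here are illustrative stand-ins, typed in by hand; the real script reads the codes from the shape file and uses stringr rather than `sub()`.

```r
# Illustrative municipality labels in the "CODE - Name" form the script expects
municipality <- c("CPT - City of Cape Town", "DC1 - West Coast", "WC011 - Matzikama")
# Illustrative district codes, standing in for the list read from the shape file
district.codes <- c("CPT", "DC1")

# Split each label at the first " - " into a code and a name
dist.code <- sub(" - .*$", "", municipality)
dist.name <- sub("^.*? - ", "", municipality, perl = TRUE)

# Keep only the rows whose code appears in the district list
keep <- dist.code %in% district.codes
districts.only <- data.frame(dist.code, dist.name)[keep, ]
print(districts.only)
```

Rows with a code that is not in the district list (the third label here) are simply dropped, which is how the analysis ends up at district level.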

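## Libraries
        library(rgdal)   # readOGR() for reading the district shape file
        library(xlsx)    # read.xlsx2() for the 2011 candidate spreadsheet
        library(plyr)    # ddply() and join(); loaded before dplyr on purpose
        library(stringr) # str_replace(), str_split_fixed(), str_to_title()
        library(dplyr)   # group_by(), summarise(), arrange()
        setwd("D:/My Folders/R/2016/blog/20160714_election_effort")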
        
# Load the district shape file to get a list of district codes
        za.districts <- readOGR("Districts", layer="DistrictMunicipalities2011")
        za.district.list <- unique(za.districts@data$DISTRICT)
        rm(za.districts)
        
# load and process electoral candidate data
        # 2011 Proportional Representative candidates
        pr.cand.2011 <- read.xlsx2("data/2011-pr-candidate-lists.xls", 1, startRow = 3)
        # A few minor edits to party names
        pr.cand.2011$Party <- str_to_title(str_replace_all(pr.cand.2011$Party, "  ", " "))
        pr.cand.2011$Party[pr.cand.2011$Party == "Democratic Alliance/Demokratiese Alliansie"] <- "Democratic Alliance"
        pr.cand.2011$Party[pr.cand.2011$Party == "Independent Ratepayers Association Of Sa"] <- "Independent Ratepayers Association Of SA"
        pr.cand.2011$Party[pr.cand.2011$Party == "South African Maintanance And Estate Beneficiaries Associati"] <- "South African Maintanance And Estate Beneficiaries Association"
        
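# 2016 ward & proportional Representative candidates were in the same data set
        # stringsAsFactors = FALSE keeps text columns as character, so the string edits below behave the same on any R version
        ward.pr.cand.2016 <- read.csv("data/Electoral_Candidates_2016.csv", stringsAsFactors = FALSE)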
        # A few minor edits to party names
        ward.pr.cand.2016$Party <- str_to_title(str_replace_all(ward.pr.cand.2016$Party, "  ", " "))
        ward.pr.cand.2016$Party[ward.pr.cand.2016$Party == "Independent Ratepayers Association Of Sa"] <- "Independent Ratepayers Association Of SA"
        ward.pr.cand.2016$Party[ward.pr.cand.2016$Party == "South African Maintanance And Estate Beneficiaries Associati"] <- "South African Maintanance And Estate Beneficiaries Association"

        # A number of the 2016 district codes had a weird "\f" (form feed) prefix.
        # I consulted the original IEC data, confirmed that these were errors, and tidied them up.
        err_index <- grep("\f", ward.pr.cand.2016$Municipality)
        ward.pr.cand.2016$Municipality[err_index] <- str_replace(ward.pr.cand.2016$Municipality[err_index], "\f", "")
        ward.pr.cand.2016 <- droplevels(ward.pr.cand.2016)

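        #### Keep only the PR candidates from the combined ward and PR data
        # The shared column holds either a PR list order number or a ward number;
        # the filter assumes that values below 1000 are PR list order numbers.
        pr.cand.2016 <- subset(ward.pr.cand.2016, PR.List.OrderNo...Ward.No < 1000)
        # change the variable name so it matches later
        names(pr.cand.2016)[4] <- "list.order.no"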
        
#### Process ward and PR data
        processor <- function(df){
            # get the variable name of passed data
            data.name <- (deparse(substitute(df)))
            # split the municipality names into codes and names
            dist.split <- str_split_fixed(df$Municipality, " - ", 2)
            df$dist.code <- dist.split[,1]
            df$dist.name <- dist.split[,2]
            # Keep only the district values from the shape file. Main centers and DC areas.
            df <- df[df$dist.code %in% za.district.list, ]
            # split data by district and party, counting how many candidates per party per district
            df.long <- ddply(df, c("dist.code", "Party"), function(df) nrow(df))
            # temporary column name, replaced at the end of the function
            names(df.long)[3] <- "num"
            # split new data by district, summing tot candidates from all parties (in district)
            df.tot <- ddply(df.long, "dist.code", function(df) sum(df$num))
            # join tot numbers by district to main data
            df.long <- join(df.long, df.tot, by = "dist.code")
            # temporary column name, replaced at the end of the function
            names(df.long)[4] <- "tot"
            # calculate the proportion of each party to total, for each district
            df.long$prop <- df.long$num / df.long$tot
            # sort by district, then by descending proportion (cosmetic only)
            df.long <- arrange(df.long, dist.code, -prop)
            # renaming based on variable name
            names(df.long)[c(3:5)] <- c(paste0(data.name, ".num"), paste0(data.name, ".tot"), paste0(data.name, ".prop"))
            return(df.long)
        }
        pr.cand.2011.long <- processor(pr.cand.2011)
        pr.cand.2016.long <- processor(pr.cand.2016)

        # Right join: keep every party/district row from 2016 and drop
        # party/district rows that only appear in 2011
        pr.cand <- join(pr.cand.2011.long, pr.cand.2016.long, by = c("dist.code", "Party"), type = "right")

        # replace NAs with zero for parties new in 2016
        pr.cand[is.na(pr.cand)] <- 0

        # calculate relative and absolute change
        pr.cand$rel <- pr.cand$pr.cand.2016.prop - pr.cand$pr.cand.2011.prop
        pr.cand$abs <- pr.cand$pr.cand.2016.num - pr.cand$pr.cand.2011.num
        
# generate data for d3 map
        # full data list
        d3_data_all <- pr.cand[, c(1, 2, 9, 10)]
        names(d3_data_all) <- c("dist_code", "party", "Relative Change", "Absolute Change")
        write.csv(d3_data_all, file="data_d3/d3_data_all.csv", row.names = FALSE, quote = FALSE)

        # Party data list, ordered by total number of district PR candidates
        # group the data by Party
        grouped <- group_by(pr.cand, Party)
        # Summarise by summing the total candidates per party, excluding zeros and sorting
        d3_party_list <- summarise(grouped, tot = sum(pr.cand.2016.num))
        d3_party_list <- d3_party_list[d3_party_list$tot > 0, ]
        d3_party_list <- arrange(d3_party_list, -tot)
        # now we only need the party list
        d3_party_list <- data.frame(party = d3_party_list$Party)

        write.csv(d3_party_list, file="data_d3/d3_party_list.csv", row.names = FALSE, quote = FALSE)
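
The join-and-fill step in the script above is easier to follow on toy data. This is a minimal sketch with made-up counts for one district. It swaps in base R's `merge()` with `all.y = TRUE` for plyr's `join(type = "right")`, so that it runs without extra packages. The column names `num.2011` and `num.2016` are hypothetical.

```r
# Made-up counts for one district: party "C" is new in 2016, party "B" has gone
cand.2011 <- data.frame(dist.code = "DC1", Party = c("A", "B"), num.2011 = c(6, 4))
cand.2016 <- data.frame(dist.code = "DC1", Party = c("A", "C"), num.2016 = c(5, 5))

# all.y = TRUE keeps every 2016 row (a right join); rows found only in 2011 are dropped
both <- merge(cand.2011, cand.2016, by = c("dist.code", "Party"), all.y = TRUE)

# A party with no 2011 entry gets NA, which is treated as zero candidates
both$num.2011[is.na(both$num.2011)] <- 0

# Absolute change in head count per party
both$abs <- both$num.2016 - both$num.2011
print(both)
```

Party C, which is new in 2016, can only show a change of zero or more, which is the reason given earlier for loading the map with ANC data rather than EFF data. Party B disappears from the result altogether.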
        

The map was built with JavaScript and the mighty D3 library.