Röstigraben: Applied data analysis of Swiss votation patterns

Matthieu: create the dataframe, get the parties and propositions, create the maps, create the clustering, create the PNG images, participate in the final report

David: clean the data, create the map theme settings, manually match themes to votations from available data, participate in the final report

Alain: scraping of votation data, cleaning of the language dataset, manual cleaning of the GeoJSON to draw the maps correctly, interpretation of results for the clustering of municipalities (also by decades), participate in the final report

Introduction

This project is a data analysis of Swiss votation results from 1981 to the present. Switzerland is a unique case study: it is the only country that implements direct democracy at the national level, as opposed to the representative democracies of most countries, where people delegate their voting power to the decision-makers they elect. This means that only about 0.1% of the world’s population gets the opportunity to make direct decisions over changes to their national constitution. The language divides, as we will see below, make this work even more interesting.

We seek to use the votation data from the Swiss Federal Statistical Office to gain insight into political trends in the country. Since Switzerland sits at the crossroads of three of the major Western European languages, it is understandable that cultural differences emerge. It must be noted that Switzerland is one of the few developed countries with such linguistic variety (we may also cite Belgium and Canada). Still, this does not prevent strong national cohesion. However, each region is influenced by different news channels and ideologies in line with the language spoken (e.g. French-speaking citizens are more influenced by French authors and politics than German-speaking citizens are, and the same holds the other way around with Germany). Note that this should have been particularly true before the internet era and the sharing of news and opinions on a global scale that we witness today.

The term Röstigraben (Rösti is a typical Swiss-German dish, Graben is Swiss-German for ‘ditch’) has long been used in Switzerland to describe the cultural differences between the German-speaking and the French-speaking parts of Switzerland. Similarly, the term Polentagraben (Polenta is a typical Swiss-Italian dish) is used to refer to the cultural differences between the Italian-speaking part of the country and the rest. However, can we see such differences in the voting trends of these communities? If so, is it something that has evolved since 1981? Are there specific votation topics that show no cultural split or, on the contrary, topics that show a clear divide? Do we observe the emergence of multiple identities within the same nation?

Several articles have been written on the subject and the Röstigraben phenomenon is part of common political talk in Switzerland. However, evidence has emerged that these differences have been attenuated today to leave another kind of split: a gap in voting patterns between urban areas and the countryside. Can we verify this statement? Are there other demographic explanations?

These are questions we will try to answer using our data.

Research questions

Here are the main questions we asked ourselves when we started working on this project:

  1. Röstigraben, Polentagraben, countryside-cities split: Our main point of focus is to observe political division in Switzerland, and its correlation with different cultural aspects such as language and urban status. Is there a recurrence in the splits during votations? Do they differ depending on the topics?

  2. Outliers: We are interested in finding entities that we may call «chronic winners» and «chronic losers», in other words cantons/regions/towns that often vote in line with the final votation result and those that often vote the opposite. Is this phenomenon limited to a few entities or is it a common one?
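To make the notion of «winner» and «loser» concrete, here is a minimal sketch in plain Python. It assumes that a community "wins" a votation when its own majority (more than 50% Yes) matches the national outcome, and that the national outcome is supplied as an input. The function name, the votation identifiers and the toy numbers are hypothetical and only serve as illustration.

```python
def share_of_votations_won(community_yes_pct, national_accepted):
    """Return the fraction of votations in which a community sided with the outcome.

    community_yes_pct: dict mapping votation id -> % Yes in the community.
    national_accepted: dict mapping votation id -> True if the object was accepted.
    Votations with no result for the community are skipped.
    """
    wins = 0
    counted = 0
    for votation, accepted in national_accepted.items():
        yes_pct = community_yes_pct.get(votation)
        if yes_pct is None:
            continue  # no data for this community (e.g. after a merger)
        counted += 1
        # Assumption: the community "wins" if its majority matches the outcome.
        if (yes_pct > 50.0) == accepted:
            wins += 1
    return wins / counted if counted else float("nan")


# Toy data, not real results.
national_accepted = {"V1": True, "V2": False, "V3": True, "V4": False}
community_a = {"V1": 60.0, "V2": 40.0, "V3": 55.0, "V4": 45.0}
community_b = {"V1": 45.0, "V2": 70.0, "V3": 52.0, "V4": 30.0}

share_a = share_of_votations_won(community_a, national_accepted)
share_b = share_of_votations_won(community_b, national_accepted)
```

With such a score per community, the «chronic winners» and «chronic losers» are simply the entities at the top and bottom of the ranking.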

Dataset

The main dataset used is the results of all the popular votations in Switzerland from 1981 to May 2017, published by the Swiss Federal Statistical Office. The dataset is available here.

The data are available as HTML tables that can be parsed using the requests library and the convenient pandas function read_html. The only small issue is that the entire table is not available at once: it is necessary to scrape multiple small tables and then merge them, which is what we do in the notebook named bfs_scrap.ipynb. In that notebook we also clean the data in order to get one line for each votation for each community, with a district and a canton for each community. Names had to be thoroughly cleaned to get usable data that matches our JSON data when generating maps.
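The scrape-and-merge step can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of bfs_scrap.ipynb: the function names and the column names (`votation`, `community`, `yes_pct`) are hypothetical, and dropping duplicate rows assumes that the partial tables may overlap.

```python
import io

import pandas as pd
import requests


def fetch_tables(url):
    """Download one page and parse every HTML <table> on it into a DataFrame."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    # read_html needs an HTML parser (lxml, or bs4 with html5lib) to be installed.
    return pd.read_html(io.StringIO(response.text))


def merge_tables(tables):
    """Stack partial tables and keep one row per (votation, community)."""
    merged = pd.concat(tables, ignore_index=True)
    # Assumption: partial tables may overlap, so repeated rows are dropped.
    merged = merged.drop_duplicates(subset=["votation", "community"])
    return merged.reset_index(drop=True)


# Toy stand-ins for two scraped partial tables (not real results).
part_1 = pd.DataFrame(
    {"votation": ["V1", "V1"], "community": ["Aarau", "Bern"], "yes_pct": [55.0, 48.2]}
)
part_2 = pd.DataFrame(
    {"votation": ["V1", "V2"], "community": ["Bern", "Aarau"], "yes_pct": [48.2, 61.3]}
)
results = merge_tables([part_1, part_2])
```

In practice, `fetch_tables` would be called once per partial page and the resulting lists of DataFrames passed to `merge_tables`.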

Note that votes are anonymized. Thus, it is not possible to get reliable data on an individual basis (fortunately for the voters). However, it is possible to enrich our dataset as needed with demographic data for each town, also published by the Swiss Federal Statistical Office. These data are always available either as Excel spreadsheets or as HTML tables. In both cases we managed to parse and integrate them easily.

Objectives

Our main objective in this project is to apply machine learning techniques, namely K-Means and DBSCAN, to the votation data in order to generate clusters of similarly voting communities and determine whether political divides exist in Switzerland. With the data we have, it is possible to observe not only the divides, but also their evolution over time and their status depending on the theme of the vote. Our reasoning behind this approach is that we feel it is extremely arbitrary to sort votation topics by how liberal or conservative they are, and thus we cannot just aggregate votation scores together to get «liberal scores» for each region.
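As a rough illustration of this idea, the sketch below clusters communities on their vector of % Yes results with both algorithms. It assumes scikit-learn (the text names the algorithms, not the library), and the function name, the parameter values and the synthetic data are purely illustrative.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans


def cluster_communities(yes_share, n_clusters=2, eps=0.3, min_samples=5, seed=0):
    """Cluster communities on their vector of % Yes results.

    yes_share: array of shape (n_communities, n_votations), values in [0, 100].
    Returns (kmeans_labels, dbscan_labels); DBSCAN marks noise points with -1.
    """
    X = np.asarray(yes_share, dtype=float) / 100.0  # rescale to [0, 1]
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans_labels = kmeans.fit_predict(X)
    dbscan_labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    return kmeans_labels, dbscan_labels


# Synthetic data only: two groups of 30 communities that vote differently
# on 8 votations (around 65% Yes versus around 35% Yes).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=65, scale=3, size=(30, 8))
group_b = rng.normal(loc=35, scale=3, size=(30, 8))
toy = np.clip(np.vstack([group_a, group_b]), 0, 100)

km_labels, db_labels = cluster_communities(toy)
```

K-Means needs the number of clusters up front, whereas DBSCAN infers it from the density parameters `eps` and `min_samples`, which therefore have to be tuned to the data.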

To visualize our data, we use the Folium library for Python to generate maps from our results and from geographical data taken from MapZen.com, which we have slightly modified for various reasons, including community mergers, misspelled location names and various inconsistencies in the boundaries of certain communities.

Structure of the story

In this data story we will use two methods: simple aggregation of data (for example adding the voting results of all years) and clustering using K-Means and DBSCAN. We will try to extract as much relevant information as possible from that. Our data is given at the community level (there are about 3’000 communities in Switzerland). What we will look at is the following, using several methods when we see fit:



Languages

The first important element of our data story is to know where each language is spoken in Switzerland. It is important to understand the distribution of languages throughout the country in order to fully appreciate the results we will observe later on. Complementary dataset: The main dataset we use in this project does not contain any linguistic... Read more

Winner and loser

In this post we determine whether some communes win more votations than others and, if so, which ones lose the most votations and which ones win the most. To do so, we look at the percentage of votations won by each commune and plot it on a map, then we observe and analyze the 5 best and the 5 worst. See the full html here... Read more

Theme

The third thing we have done with our voting data is to split it by theme. This required some manual work, as the official databases only tell us how many times each theme was voted on in a given year. We had to manually deduce from that which votation was registered under which theme, which was easy to do most of the time. There might be one or two ... Read more

Party Recommendations

Another meaningful piece of information we have is the parties' recommendations for each votation. An interesting thing we can extract from it is popular agreement with party policies. For that we generate one map for each party, showing the percentage of votations in which each community voted in line with that party’s recommendation. Allianc... Read more

Clustering of municipalities

Introduction: In this part, we are going to take a closer look at our votation data and try to identify distinct communities (composed of the municipalities) with different votation patterns in Switzerland. In order to do so, we will use only the votation dataset and more specifically the results (in % Yes) of all the municipalities for all the... Read more

Data clustering by theme

Now that we have implemented clustering, we can restrict the data to a specific theme and thus possibly observe divisions within themes. We will do this with both K-Means and DBSCAN (which had a hard time giving us pertinent results here). Here we can compare the non-clustering solution with the clustering one, and additionally we can look at the Pr... Read more

Party recommendation clusters

Let us now look at our party data clustered using K-Means and DBSCAN. This time around, DBSCAN was able to give us more interesting results, as the clusters are much better defined than in the theme clustering. We are interested to see whether party allegiance is a defining factor of the Röstigraben phenomenon. A... Read more

Clustering of municipalities by decades

Introduction: This post is mainly an extension of the previous post on the clustering of the municipalities using the votation dataset. In this post we are going to see what happens when we consider timeframes of 10 years using the exact same methods. Please be sure to have read the previous post before this one. A concern we raised in th... Read more

Conclusion

In the end, we cannot deny the presence of the Röstigraben. We found some data suggesting that a division exists between the big cities and the countryside, but more often than not it simply means that the big cities of German-speaking Switzerland align themselves politically with the French-speaking communes. So the Röstigraben might be weakening a little, but we are far from seeing a new divide between cities and countryside, and far from seeing the end of the separation between languages.

But even with this divide, differences in Switzerland are not always present: there is still some common ground that most people agree on across every boundary, be it linguistic or geographical.