This tutorial is based on a presentation I made for a Hacks/Hackers Montreal meetup. Become a member if you live in the area!
In this post, I’ll show you how to use the raw data from the 2011 Canadian General Election to make an interactive map with CartoDB.
For the moment, let’s play with the vote results and some Excel formulas!
PS1: I made the map in french and I simply translated the french version of this tutorial. If you want a map in english, just replace (CMD + F and choose replace in Excel) the names of the political parties by their english translation at the end of this tutorial.
PS2: You will be using the results per ridings. But if you want something more precise, you could use the results per poll and the geographic files for them.
PS3: If you get stuck, write me!
#I Getting the data ready
First, download the results from the Elections Canada’s website.
At the end of the webpage, you’ll find the results per riding. It’s the Table 12.
The downloaded file is in a CSV format (Comma-Separated Values), which is very common. When you open it in Excel, it looks like something is wrong with it.
It’s just because each information is separated by a comma and you have to tell Excel to convert the commas into columns.
Select the first column. In the Data tab, click on Text to Columns.
Then on Delimited, on Next, on Comma and finally on Finish.
Here, Excel reorganized your data and it’s much clearer now!
Let’s start by deleting the useless columns A, E, F, I et J. Only keep the name of the riding, its identification number, the name of the candidate with its political party, the number of votes and the percentage of votes.
Rename your columns.
Perfect! We can now start to work with the data. First, let’s sort the data in the right order.
Click on Custom Sort.
Sort by the riding identification number from the smallest to the largest, then click on + and sort by the number of votes from the smallest to the largest. Click on OK.
Your data is now in the right order. You will understand why later. For the moment, just trust me! 🙂
Now, let’s clean all that.
As you can see, in the percentage column, when it’s less than 1, the zero before the dot is missing. Also, for a better presentation on the map, it would be great to have % after the numbers.
To do so, we will use the following formula:
This formula has three parts:
- The IF function allows you to set a condition. If the condition is met, Excel will do whatever you want it to do. If it’s not met, you can ask Excel to do something else.
- The LEFT function allows you to select X number of characters from a cell, from the left.
- The CONCATENATE function allows you to merge text and cells content one after the other.
These functions all toghether do the following:
- They check if the first character is a dot.
- If it’s a dot, they add a 0 before the cell content and a % after it. The result is placed in column F, where the formula is.
- If it’s not a dot, they simply add a % after the cell content. The result is also placed in the column F.
Once the formula written in F2, apply it for the whole column. To do so, move your cursor to the bottom right corner of the cell. When you see a small black cross, double click and Excel will copy and adapt the formula for all the cells below.
Now select all the cells from column F (starting from F2) and copy them.
Click on E2.
In the Home tab, click on Paste Special and choose Values.
Delete the column where your formula is written, the column F. We don’t need it anymore!
Let’s work on the name of the candidates and their political party. We need to separate these two different informations.
To start, we can see that the french name is isolated from the rest by a slash. We can easily use that.
Move the column to have a better hand at it. Select column C. Copy and paste it in column F. Delete column C.
Select the whole column E (where the candidate name and the party are now in).
In the Data tab, click one more time on Text to Columns, then on Delimited and on Next.
Now, click on Other and type a slash / in the text field. Click on Finish.
Here we go! The names of the parties are in column F. Name it Parti.
However, one party name had several slashs in it. At the lines 638, 699, 786, 834, 1020, 1046 and 1580, you can see that the Animal Alliance Environnement Voters filled the cells in columns G and H.
Delete them manually and put the full name in column F.
In column E, we still have the english names of the parties. On top of that, there’re two stars after the names in some lines.
To delete the stars, press CMD + F and click on Replace.
Then type ~* in the search field and let the replace field empty.
Click now on Replace All and the starts will be erased.
For the name of the political party in column E, the simpliest way to get rid of it would be to do a search and replace like we did for the stars.
But, since there’re more than twenty parties, it would be quite long!
So I created a formula that will erase everything for you:
=TRIM(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(E2;”No affiliation”;””);”Bloc Québécois”;””);”Communist”;””);”Conservative”;””);”Environment Voters”;””);”FPNP”;””);”Independent”;””);”Liberal”;””);”Libertarian”;””);”Marxist-Leninist”;””);”NDP-New Democratic Party”;””);”CAP”;””);”PC Party”;””);”Pirate Party”;””);”United Party”;””);”Green Party”;””);”CHP Canada”;””);”Radical Marijuana”;””);”Rhinoceros”;””);”WBP”;””);”Animal Alliance”;””))
Insert a new column between E and F. To do so, right click on the column header and choose Insert.
Now copy-paste the formula in F2.
Move your cursor to the bottom right corner of the cell and double click. The formula is applied to the whole column. Here we go! All the candidates names are clean!
Copy the cells in the column F (from F2). Click on E2. On the Home tab, click on Paste Special and select Values (as we’ve done it for the percentage).
Delete the column F.
Et voilà! We have a clean dataset!
#II Convert the data into HTML code
We now have our data sorted and cleaned. However, in our map data, we will only have 308 lines, since there were 308 ridings in 2011.
But we have 1 587 lines in our current Excel file, because there were 1 587 candidates for the 2011 General Election!
Therefore, we have to group our data. All the information for each riding have to hold on one line.
And for a better appearance on our map, we will also transform the data into HTML code.
Here we go!
Copy this formula and paste it in G2.
The <b> and </b> tags will go before and after the name of the candidate and will put it in bold.
The <br> tags insert a break line, which will help us to show each information on top of the other. And we put two of them at the end, so an empty line will separate different candidates.
We also add votes after the number of votes.
Move your cursor at the bottom right corner of the cell and double click when you’ll see a small black cross. The formula is now applied for the whole column.
We are getting there! Now, we need to group the HTML code of the candidates by ridings.
Copy the following formula and paste it in H2:
Apply the formula for the whole column H.
This formula adds the HTML code of the candidate on the current line but, if the name of the riding on the next line is the same (in column A), the formula also adds the HTML code of the previous cell (in column H).
Therefore, on the last line for each riding, you’ll find the HTML code with all the information about each candidate!
For example, line 6 is the last line for the Avalon riding.
If we copy the HTML code of H6 and paste it into the HTML viewer from Code Beautify, we can see the list of all the candidates of the riding. Our formula is working!
Furthermore, you can see that the name of the political party on the last line is the party that won the riding, thanks to the custom sort we did at the beginning. It’s perfect. We will be able to use this column on our map!
However, we still have our 1 587 lines for our 1 587 candidates and we need to reduce this number to 308.
Let’s use another formula to distinguish the last line of each riding, since it’s where all the HTML code can be found.
In I2, copy and paste the formula below and apply it for the whole column I:
This formula adds the word DERNIERE on each last line for each riding.
We can now filter the lines to keep only the ones where DERNIERE is written!
Click on the small funnel.
And now click on the filter in column I to choose only the cells with DERNIERE.
Tadam! You now only have the last lines displayed and there are… 308 of them!
We are getting closer to our goal.
Now select all of the filtered cells (CMD+ A) and copy them (CMD + C).
Open a new Excel file (CMD + N). Click on the first cell and paste the values.
Renamme column H HTML and delete columns C, D, E, G et I (you should keep A, B, F, H).
Save your file under Resultats_2011_CLEAN into an Excel file.
Here we go! You have your Excel file with, for each riding, the results for each candidate in HTML code!
#V Make a map with CartoDB
Now that we have our file, the last thing left to do is the map!
To do so, we will merge our Excel file with the geographic files for the ridings.
To be able to use them with CartoDB, you have to compress them (.zip format).
Select the files (in french) and compress them.
Rename your new file Circonscriptions_2011.zip.
Now, let’s use CartoDB!
If you don’t have an account, it’s time to create one. CartoDB allows you to host up to 50 mo of data for free, which will be enough for our map.
Once logged in, go to your Dashboard and click on New Dataset.
Now click on SELECT A FILE (or BROWSE) and add your file Circonscriptions_2011.zip by clicking on CONNECT DATASET.
When the file will be sent and ready, you should see a table with your data.
Click on MAP VIEW and you’ll see your ridings mapped!
Now that we have the geographic file in CartoDB, let’s upload the Excel file with the results.
Go back on your Dashboard, click on New dataset, and SELECT A FILE (or BROWSE). Then send your file Resultats_2011_CLEAN by clicking on CONNECT DATASET.
We now have to merge the geographic file with the Excel file with the results!
We will use the identification number of the ridings to create a link between the two files.
Click on MERGE DATASETS at the bottom right the webpage.
Click on COLUMN JOIN. A new window will open.
On the left side, for the file Resultats_2011_CLEAN, choose numero_de_circonscription.
On the right side, be sure you selected circonscriptions_2011 and choose numcef (at the end of the page).
Then click on NEXT STEP.
These two columns holds the identification numbers of the ridings, which will tell CartoDB which lines should be merged together.
On the next page, CartoDB asks you which columns you want to keep.
For resultats_2011_CLEAN, choose them all except geometry.
For circonscriptions_2011, only choose geometry, since it’s the line that has all the geographic data about the ridings.
Click on MERGE DATASETS to launch the operation!
There we go! All the data is brought together and ready to be mapped!
Click on MAP VIEW. Then on WIZARDS (the small paint brush on the left) and select CATEGORY. Choose the column Parti to have a map colored in function of the name of the political parties.
Change the colors for the official colors of the parties.
You now have to choose which information will be shown when the users will click on the ridings.
Click on the INFOWINDOW tab. Select nom_de_circonscription and html. Put them in the right order and change the titles.
Look how our HTML code perfectly works!
Now click on VISUALIZE, at the top right, then on OK, CREATE MAP.
You can now click on PUBLISH (at the top right again), which will give you the code to embed your map on any website you want!
Voilà! It’s done! You juste created your interactive map of the 2011 General Election results.
You can now do the same for the 2015 vote results! 🙂