Data Visualisation

Research

Hans Rosling’s 200 Countries, 200 Years, 4 Minutes

00HansDataVis

The graph displays health against wealth, in the form of lifespan against income in an attempt to depict a country’s quality of life. Each country is represented by a circle; its size showing its population and colour showing its continent. The graph refreshes every frame to show the new year. This creates motion for each circle, all of which depict an upward trajectory as they approach the present day.

Using motion to depict progress makes the graph very easy for the viewer to process, while the finer details can still be worked out. Combined with its presentation in AR, this allows Hans Rosling to point out areas of interest, much as a weather presenter would.

The data visualisation also allows for interaction. We see Rosling tap on a country’s circle for it to split into districts so the wealth disparity within a country can be seen.

What this piece lacks in aesthetics, it makes up for in accessibility and interaction.

 

Jer Thorp- The Weight of Data

Jer Thorp describes his work as sitting at the intersection of art, design and science, and this shows in the aesthetics, function and complexity of his work. Where data visualisation tends to be used to spot patterns and outliers in data, Thorp uses it to explore humanity. His New York Times Cascade depicts the network of interest in a news story while allowing close examination of the individuals and cultures that show interest in it, and other pieces use geolocation data to observe people moving home or going on a date.

 

Gestalt Data

Data often breaks down objects into their parts. For example, a human could be broken down into simple properties such as eye colour, personality traits, nationality, etc. but after listing these properties in a spreadsheet, one is not left with a human or a society.

Data visualisation can create a phenomenon greater than the sum of its parts. For example, Brendan Dawes’s Here, Around And Beyond creates an image of networking circles to represent the attendees at an event.

The raw data is translated into a new visualisation of the event. The original event has been abstracted.

Edward R. Tufte

Envisioning Information

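Tufte uses ‘flatland’ for the two-dimensional surface of the page or screen and ‘spaceland’ for the three-dimensional world. The challenge he describes is bringing three-dimensional depictions of data into just two dimensions.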

Domestic Data Streamers

Domestic Data Streamers are a creative studio based in Barcelona that specialises in data vis. If I understand correctly, their main goal is not to make data easy to consume but to make an experience from the data.

Their data visualisations exist in three dimensions, and often in gallery space. The audience is invited to walk through visualisations, to view the data from different angles.

Many projects are a combination of digital and physical space. Some are purely physical, like Lifeline, which uses a grid of balloons to represent an individual’s age and the age at which they would like to die.

Ethics of Data Vis

Raw data presents itself as objective truth and so data visualisations of this raw data are often falsely presented as truth.

For example, I recall watching a PragerU video on YouTube (PragerU is a right-wing ‘educational’ channel). I can’t remember which video, but I think it argued that the youth have been corrupted by the absence of strong masculine figures. It cited data showing that men brought up without a father were more likely to go to prison, and presented this correlation as a cause. A conservative reading would conclude that growing up without a father results in a poor upbringing. However, I am willing to bet that if you added a third piece of data, such as social class or wealth, you would find that the less money an individual has, the more likely they are both to spend time in prison and to grow up without a father.

I believe that data is visualised with the intent of bringing an audience to a specific conclusion. So once data is visualised, it is no longer unbiased.

It would be interesting to explore this phenomenon by deliberately trying to force a false conclusion from correct data.

Politics

Data visualisations are often political. They are particularly useful for highlighting patterns that can suggest inequalities in demographics.

The BBC does a good bit of data vis. Many articles offer interactive panels where readers can see how their MP voted on a specific issue. Their general election map is intuitive: constituencies are outlined and filled with the colour of the party with the most votes. It is interesting to see how people voted geographically, but it could be misleading, because large rural constituencies take up far more of the map than densely populated urban ones. For example, someone looking at this map might believe that the Conservatives got the most votes, followed by the SNP, then Labour, then the Lib Dems.

It would be interesting to construct a map that shows where First-Past-The-Post fails, although I am sure many have been created already.
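One way to find where First-Past-The-Post diverges from the popular vote is to compare each party's share of seats with its share of votes. The sketch below is in Python rather than Processing, and the constituency names, party names and vote counts are all invented for illustration; they are not real election results.

```python
from collections import Counter

# Invented results for three imaginary constituencies (votes per party).
RESULTS = {
    "Seat 1": {"Red": 40, "Blue": 35, "Yellow": 25},
    "Seat 2": {"Red": 45, "Blue": 30, "Yellow": 25},
    "Seat 3": {"Red": 20, "Blue": 50, "Yellow": 30},
}

def seats_and_votes(results):
    """Count seats won (most votes in a seat) and total votes for each party."""
    seats, votes = Counter(), Counter()
    for tallies in results.values():
        winner = max(tallies, key=tallies.get)
        seats[winner] += 1
        votes.update(tallies)
    return seats, votes

def share_gap(results):
    """Seat share minus vote share for each party, in percentage points."""
    seats, votes = seats_and_votes(results)
    total_votes = sum(votes.values())
    total_seats = len(results)
    return {
        party: round(100 * seats[party] / total_seats
                     - 100 * votes[party] / total_votes, 1)
        for party in votes
    }

gaps = share_gap(RESULTS)
```

A large positive gap means a party holds more seats than its votes alone would suggest, and a large negative gap means the opposite. Those gaps are the values a map like this could colour by.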

05BBCElectionResults

SIMD (Scottish Index of Multiple Deprivation)

I have a strange fascination with the SIMD16 map.

06SIMD16

The map uses government data for urban Scottish areas to give each area a deprivation ranking based on income, employment, health, education, housing, geographical access and crime. Higher scores are blue, and lower scores are red. Just by looking at the image above, you can get a general idea of the affluence of areas in Scotland. Edinburgh is mostly blue, and Glasgow is mostly red.

07GlasgowSIMD16

Even from a distance, you can see the social inequality in Glasgow. The city is a giant red patch on the map, with the exception of the West End.

08SIMD16Dundee

I think it’s particularly interesting to observe places you know well. Dundee has an interesting pattern. Affluence stretches along the coast and exists mostly in the West End and Broughty Ferry; both places seem to have retained wealth from when they were home to wealthy jute barons.

There is a big blotch of red that surrounds the north side of the city centre. These are mostly schemes. If you venture out further, the areas are mostly affluent suburbs. It is also interesting to see that all areas just outside of Dundee are affluent. The map also allows you to see the difference in population density.

I find it interesting that this map essentially shows how the residents voted. If my memory is correct, I believe Broughty Ferry and the West End are the only council wards that elected a Conservative.

This map is extremely interesting and it is a shame that it has not been updated since 2016. I am not sure why this is the case. I have had trouble finding the data on government websites, but I will have a closer look soon. I think it would be extremely interesting to see how these areas have changed over time.

After having a look at the forthcoming publications on the Scottish Government website, I found a mention of what may be an upcoming update to the SIMD data.

09possibleSIMDdate

Hopefully, this means that a SIMD19 will be published before the end of this project. I have sent an email to the Scottish Government statistics team to find out if this is the case.

Requested Information

I am not sure what data I will use, but I know it can take a while to get datasets from online services. I am currently going through a couple of accounts to request their data on me.

Spotify

I received my data from Spotify today. There is nothing too interesting. I did get a list of my entire listening and search history from the past three months that would be easy to make a visualisation for. The information arrived in JSON format.
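As a rough idea of how a listening history in JSON could be turned into something plottable, the sketch below totals listening time per artist. It is written in Python rather than Processing. The field names `artistName`, `trackName` and `msPlayed` and the sample entries are assumptions made for illustration, so they would need checking against the real export.

```python
import json
from collections import Counter

# Hypothetical sample shaped like a listening-history export; the field
# names and the entries themselves are assumptions, not taken from the post.
SAMPLE = """[
  {"artistName": "Artist A", "trackName": "Song 1", "msPlayed": 180000},
  {"artistName": "Artist B", "trackName": "Song 2", "msPlayed": 240000},
  {"artistName": "Artist A", "trackName": "Song 3", "msPlayed": 60000}
]"""

def minutes_per_artist(raw_json):
    """Sum listening time in minutes for each artist."""
    totals = Counter()
    for play in json.loads(raw_json):
        totals[play["artistName"]] += play["msPlayed"] / 60000
    return dict(totals)

totals = minutes_per_artist(SAMPLE)
```

The resulting totals could drive the size of a circle or the length of a bar for each artist.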

Facebook

The zipped file is 5.8GB but I am sure I downloaded it in the past and the data was pretty mundane. However, hopefully, there is some good data to work with. I also haven’t been very active on Facebook since about 2013.

Instagram

Google

Edward R. Tufte

The Visual Display of Quantitative Information

Graphical displays of data should:

  • show the data
  • induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else
  • avoid distorting what the data have to say
  • make large data sets coherent
  • encourage the eye to compare different pieces of data
  • reveal the data at several levels of detail, from a broad overview to the fine structure
  • serve a reasonably clear purpose: description, exploration, tabulation, or decoration
  • be closely integrated with the statistical and verbal descriptions of the data set

A data visualisation can only be as good as the raw data.

Dr. John Snow plotted the locations of deaths from cholera onto a map of London. Once the data was visualised, it became obvious that the deaths were clustered near a water pump that turned out to be contaminated.

Charles Joseph Minard’s work has a certain style. A method as simple as adjusting the thickness of a line is extremely effective at establishing the amount of activity in a specific route or area.

Time-series charts depict change over time, most commonly in graph form.

“small, noncomparative, highly labelled data sets usually belong in tables”

 

David McCandless

David McCandless is a British journalist (or ‘data-journalist’, as written in his Wikipedia bio) who specialises in the visualisation of data. He is the author of the books Information Is Beautiful and Knowledge Is Beautiful and runs a blog.

Image result for david mccandless data vis

His work tends to consist of simple, brightly coloured shapes against a white background. These shapes often overlap to illustrate areas of commonality. The design looks very ‘webby’, as if it were created with CSS. I admire the piece above because a narrative can be extracted from it. We see the tax revenue from different drugs, but on its own a figure like £5bn is abstract: it sounds like a large number but is difficult to contextualise. The piece compares the area of this tax income with government income from tobacco and alcohol, then shows that the revenue would be enough to fund every university in the UK.

Image result for david mccandless data vis

Million Dollar Homepage

10MillionDollarHomepage

Visualising Data by Ben Fry

I think I read the first couple of chapters of this book a while back so I’m just going to start at chapter 3.

In chapter 3, we plot an ellipse at the centre of each state. The location data is from a .tsv file that contains the x and y positions for the Processing sketch.

11TSV

A .tsv appears to be the same as a .csv, but the values are separated by tabs instead of commas.
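To show how little differs between the two formats, here is a minimal sketch that reads tab-separated rows into a lookup of positions. It is in Python rather than Processing, and the sample rows (a code followed by an x and a y position) are made up for illustration rather than taken from the book's file.

```python
import csv
import io

# Hypothetical rows: a state code followed by x and y pixel positions.
SAMPLE_TSV = "AL\t439\t270\nAK\t94\t325\n"

def read_locations(text):
    """Return {code: (x, y)} from tab-separated text."""
    # The only change from reading a .csv is the delimiter.
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: (float(row[1]), float(row[2])) for row in reader if row}

locations = read_locations(SAMPLE_TSV)
```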

12MapResult

This book may be written for complete beginners so I think I will skim the bits that appear to be quite simple and note any methods that are new to me.

Next, we get a .tsv with random values. At first it appeared as though you could get the max and min value in a column by writing MAX_FLOAT or MIN_FLOAT, and I assumed the same could be done with INT. After looking this up, this isn’t true: MAX_FLOAT and MIN_FLOAT are the largest and smallest possible float values. Later in the code, the actual max and min are found with a for loop.
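The loop works by starting the running minimum at the largest possible value and the running maximum at the smallest, so that the first real value replaces both. A minimal Python sketch of the same idea follows, with infinity standing in for Processing's MAX_FLOAT and MIN_FLOAT. The function name and sample values are made up for illustration.

```python
def column_range(values):
    """Return (minimum, maximum) found by scanning the values once."""
    # Start from the extremes so the first real value replaces both.
    lowest = float("inf")
    highest = float("-inf")
    for v in values:
        if v < lowest:
            lowest = v
        if v > highest:
            highest = v
    return lowest, highest

data_min, data_max = column_range([4.2, -1.5, 9.0, 3.3])
```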

The chapter goes on to discuss mouse rollover interaction. But I reckon I can work all this out.

 

Geographical Data Visualisation

The image above shows the same sort of data I plan to use. It uses raised, coloured 3D objects to show deprivation in England. This is simple yet effective: the simple design and limited colour keep the data clear and obvious. The shadows cast by the objects are a nice touch and further emphasise the data.

 


Processing Skills

Using a Mock Dataset to Visualise Data

Using Mockaroo, I created a data set for companies, their wealth, and the number of employees.

01CopmanyWealthExcelSheet

I quickly ran into a problem. By using the data type of “currency” in Mockaroo, the wealth property is a string as it uses the dollar symbol. This means I either have to adjust my spreadsheet, or I need to write a function in Processing that removes the ‘$’ from the beginning of each wealth property then converts the String into an Integer.
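The second option, cleaning the string in code, could look something like the sketch below. It is written in Python rather than Processing, and it assumes the values look like `$1,234,567.89` (a leading dollar sign, optional thousands separators and optional cents), which may not match the exact Mockaroo output.

```python
def parse_currency(text):
    """Turn a string such as '$1,234,567.89' into the integer 1234567."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    # Go through float first so any cents are dropped rather than causing an error.
    return int(float(cleaned))

wealth = parse_currency("$1,234,567.89")
```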

02CompanyRandomWealth

The image above shows 1000 companies. The colours and positions are random, but the size represents wealth.

03CompanyGraph

04CompanyGraphCursor

I then mapped the X position to the number of employees and the Y position to the wealth of the company. The wealth also drives the colour of the circles. If the user hovers their mouse over the circle, it will show the name of the company.
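Both steps come down to small pieces of arithmetic: rescaling a value from the data's range to the canvas's range, and checking whether the mouse is within a circle's radius. The Python sketch below shows one way to write them. `remap` does the same linear rescaling as Processing's `map()`, and the function names and sample numbers are made up for illustration.

```python
import math

def remap(value, in_min, in_max, out_min, out_max):
    """Linearly rescale value from one range to another."""
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)

def hovered(mouse_x, mouse_y, cx, cy, radius):
    """True when the mouse lies inside the circle centred at (cx, cy)."""
    return math.hypot(mouse_x - cx, mouse_y - cy) <= radius

# Hypothetical company: 250 employees in a 0-1000 range, on an 800 px wide canvas.
x = remap(250, 0, 1000, 0, 800)
```

Running the hover check against every circle each frame is enough to decide which company name to draw.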

This piece isn’t supposed to be entirely functional or aesthetically pleasing; I am just practising what we learned in the first tutorial.

 


Development

Initial Thoughts

I am leaning towards creating a data visualisation in Processing. I would like it to be aesthetically pleasing yet function well, and if all goes well, it will be interactive. I would also like to get a couple of prints from this.

Although creating a physical piece interests me, and has huge potential for exploration, I am more interested in the idea of creating an interactive application that responds to data. Processing also has the capacity to bring data vis into three dimensions.

Artboard 1ProjectVis

Food Standards

I found the data for the food standards rating of every cafe/restaurant in Dundee. The data comes with loads of information, including the geolocation. The only issue is that it comes in XML format, which I currently do not know how to use. Looking at the XML, I can understand it as a tree of child elements and data; I am just unsure how to access it in Processing. Thankfully, Dan Shiffman has a video on it.

An XML URL often contains a query string. It looks like this: q=London&mode=xml&units=metric&cnt=7

This tutorial wasn’t very helpful. I am wanting to find the child of a child. This video only goes as far as showing you how to find the first child.

It took me a while to work out, but I was using XML arrays when it was unnecessary and therefore used a for loop in the wrong place.

This is the code required to get every name in the XML document:

13ProcessingXMLBusinessName

I adapted the code to show the geolocation of each business. However, I run into an error after a number of the geocodes are printed to the console. I suspect there may be some entries in the XML sheet that do not have geolocation.

14Error

I can debug the code to find that the error occurs at the 58th entry.

15ErrorDebug

If we look in the XML file, we can see that there is indeed an empty geocode element.

16ErrorFound

I fixed this by adding a line of code that checks that the geolocation is not empty before finding the latitude and longitude.

17FixforGeolocation
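The same two ideas, reaching the child of a child and skipping entries with an empty geocode, can be sketched in Python with the standard library's ElementTree. This is not the Processing code from the screenshots. The element names (`Establishment`, `BusinessName`, `Geocode`, `Longitude`, `Latitude`) and the sample entries are assumptions about the file's layout, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical extract; element names and values are assumptions.
SAMPLE_XML = """
<Establishments>
  <Establishment>
    <BusinessName>Cafe One</BusinessName>
    <Geocode><Longitude>-2.97</Longitude><Latitude>56.46</Latitude></Geocode>
  </Establishment>
  <Establishment>
    <BusinessName>Cafe Two</BusinessName>
    <Geocode/>
  </Establishment>
</Establishments>
"""

def read_geocodes(xml_text):
    """Return (name, longitude, latitude) for every entry that has a geocode."""
    found = []
    root = ET.fromstring(xml_text.strip())
    for entry in root.iter("Establishment"):
        name = entry.findtext("BusinessName")
        # A path reaches the child of a child in one step.
        lon = entry.findtext("Geocode/Longitude")
        lat = entry.findtext("Geocode/Latitude")
        if lon is None or lat is None:
            continue  # empty <Geocode/>: skip instead of crashing
        found.append((name, float(lon), float(lat)))
    return found

places = read_geocodes(SAMPLE_XML)
```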

I then used the code from Ben Fry’s visualising data to find max and min values. I had to make

18FindMaxandMin

I then mapped the values to the width and height to plot a map.

19PlottedPoints

This is interesting and you can definitely see a pattern. But I realise I have squashed and stretched the data to fit into a square, and hopefully that is why I am struggling to find recognisable town centres and high streets. I think I need to establish a scale. Perhaps I could work out a single scale factor from the range of values across the width and the width of the screen, then use that same factor for the height.
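One way to avoid the squashing is to compute a scale factor for each axis and then use the smaller one for both, so the whole data set fits on the canvas without being stretched. The Python sketch below shows the arithmetic. The function name and the sample bounds are made up for illustration.

```python
def fit_scale(min_x, max_x, min_y, max_y, width, height):
    """Largest single scale factor that fits the data's bounding box on the canvas."""
    return min(width / (max_x - min_x), height / (max_y - min_y))

# Hypothetical bounds: data twice as wide as it is tall, on a square 600 px canvas.
scale = fit_scale(0.0, 2.0, 0.0, 1.0, 600, 600)
```

Multiplying both coordinates by this one value keeps distances in proportion, at the cost of leaving part of the canvas empty.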

I was delaying the inevitable but I may need to put each establishment into an object array. I want to be able to hover my mouse over each circle to find out the name of the establishment and other data. From there, I’ll be able to work out if the map has been flipped.

 

I added the names of the establishments so I could try and work out where was what.

20MapWithNames

I live about halfway between the Royal Tay Yacht Club and Cragie News (the red circle), but I live east of the city centre (the big clump of circles in the middle right), so I can tell the map is flipped.

The latitude and longitude are not to scale in the image below. But I scaled it manually to try and see if I could make more sense of the data. You can now clearly see the city centre and some of the roads that stem from it. You can also see where the river Tay is and another small clump on the right is Broughty Ferry.

21GooderMap

On my walk I realised that I can bring the minimum values for the latitude and longitude to zero, then I would be working from a graph.

22MapToScale

I did some maths, and I am not entirely sure how, but I think I arrived at the proper scale for my map- at least it looks right.
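For reference, one common way to get a plausible scale for a small area is sketched below in Python. It is not necessarily the maths used here. It shifts the minimum longitude to zero, measures latitude down from the top edge so north is up, and shrinks the horizontal distances by the cosine of the latitude, since a degree of longitude covers less ground than a degree of latitude away from the equator. The function name, parameters and sample coordinates are assumptions for illustration.

```python
import math

def to_screen(lon, lat, min_lon, max_lat, scale):
    """Convert degrees to pixel coordinates with north at the top."""
    # Assumption: the area is small, so one reference latitude is enough.
    ref_lat = math.radians(max_lat)
    x = (lon - min_lon) * math.cos(ref_lat) * scale
    # Screen y grows downwards while latitude grows upwards, hence the flip.
    y = (max_lat - lat) * scale
    return x, y

# Hypothetical point a tenth of a degree east and south of the top-left corner.
x, y = to_screen(-2.9, 56.4, -3.0, 56.5, 1000.0)
```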

Take a Break

This experiment was a handy way to learn how to navigate and plot data. However, I feel as though it was going to consume too much time and I should move on.

Scottish Index of Multiple Deprivation Map

Initial Idea

During the workshop on 28th Jan, Paul showed us a technique for creating 3D data visualisations from Processing-generated displacement maps.

From this, I discovered I may be able to create a landscape of Glasgow based on the levels of deprivation. Mountains can be areas of affluence, and valleys or areas below sea level are areas of deprivation.

The SIMD data now contains three sets, from 2012, 2016 and 2020. This would also allow me to interpolate each vertex’s y position to create an animation of the distribution of wealth over a decade.
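The interpolation could be as simple as a straight line between the two nearest releases. The sketch below is in Python and the heights for the example vertex are invented. It returns a height for any year between the known ones.

```python
def height_at(year, keyframes):
    """Linearly interpolate a height for any year between known keyframes."""
    years = sorted(keyframes)
    if year <= years[0]:
        return keyframes[years[0]]
    for start, end in zip(years, years[1:]):
        if year <= end:
            t = (year - start) / (end - start)
            return keyframes[start] + t * (keyframes[end] - keyframes[start])
    # Past the last release: hold the final value.
    return keyframes[years[-1]]

# Hypothetical heights for one vertex in each SIMD release.
vertex = {2012: 10.0, 2016: 20.0, 2020: 14.0}
halfway = height_at(2014, vertex)
```

Stepping the year forward a little each frame and recalculating every vertex would give the animated landscape.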

The Data

The government has just released its 2020 SIMD data. I downloaded the spreadsheet and it contains more data than I thought. Some of it is very interesting, like levels of depression in certain areas. I began working with it in Processing when I realised it does not have geolocation data.

After a search online, I found that the government uses data zones from 2011 in all of its statistics. Hopefully, I can find a data set that I can link with the SIMD data to gain the geolocations for each area.

I can’t seem to find location data for data zones anywhere. The SIMD interactive map uses shape files to plot the information. I think this might be too complex for me to work out in this project so I may need to consider another option.

In The Meantime…

I want to still create an interactive visualisation using the SIMD data. I decided I would create a graph.

There are over 6000 data points so each point could only be a few pixels wide.

The process was relatively painless. There were no sticking points and I managed to create the following sketch in about an hour or two.

The size of the circle represents the population. The x position and colour show the level of deprivation. Each area has a rank. The least deprived area was Stockbridge in Edinburgh and the most deprived was Greenock, which I believe has moved several places down over the years. This BBC article explains the rankings.

The Y position is driven by the level of alcohol use in each area. This graph shows that the more deprived an area is, the higher the alcohol consumption. Parkhead West and Barrowfield have by far the highest alcohol consumption.

The map also displays the name of the area if you hover the mouse over the top.
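Driving colour from rank is a matter of turning the rank into a value between 0 and 1 and blending two colours by that amount, much as Processing's `lerpColor()` does. The Python sketch below assumes red for the most deprived rank and blue for the least, which is a guess at the colour scheme. The function names are made up for illustration.

```python
def lerp_colour(start, end, t):
    """Blend two RGB tuples; t = 0 gives start, t = 1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))

def rank_colour(rank, total):
    """Red for the most deprived rank (1), blue for the least deprived (total)."""
    t = (rank - 1) / (total - 1)
    return lerp_colour((255, 0, 0), (0, 0, 255), t)

# Hypothetical data set of 101 ranked areas.
most_deprived = rank_colour(1, 101)
least_deprived = rank_colour(101, 101)
```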

The Code:

 

23SIMDALCDATACODE0123SIMDALCDATACODE0223SIMDALCDATACODE03

Height Map

I found that I could fiddle around with the settings on the SIMD map to remove labels. From here, I could just build a high res image by taking several screenshots.

I then selected each colour, then gave it an alpha value from 95-5.

01ColourToGrey-1

The black areas are the most deprived and white the least. I then applied a blur so the districts would have a smooth transition in 3D.

01ColourToGreyBlur

I may play around with this blur value in the future.
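To illustrate what the blur does to the height data, the sketch below applies a simple box blur to a grid of grey values in Python. A box blur is swapped in here for simplicity and is not necessarily the blur used on the image. The sample strip is made up: a hard step from black to white that becomes a ramp.

```python
def box_blur(grid, radius=1):
    """Average each cell with its neighbours to soften edges between levels."""
    rows, cols = len(grid), len(grid[0])
    blurred = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Collect the neighbourhood, clipped at the edges of the grid.
            cells = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            row.append(sum(cells) / len(cells))
        blurred.append(row)
    return blurred

# Hypothetical 1 x 4 strip: a hard step from black (0) to white (255).
strip = [[0, 0, 255, 255]]
soft = box_blur(strip)
```

A larger radius spreads each step over more pixels, which is the value to play around with when the slopes between districts look too steep or too soft.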

Next, I took it into Maya, applied it as a displacement map, then applied the displacement to the geometry.

Environment

This was the result. The peaks and troughs are very distinct, so I may want to blur my map some more and play around with the settings. I may also add a sea level and a skybox and see how it looks.

Creating a 3D Visualisation In Maya

I decided to play about with shaders in Maya to see what kind of result I could produce.

At first, I tried to apply a diffuse texture map to the mesh above. But seeing as the mesh already has the displacement map applied in geometry, when I reapplied the shader it gave me a really weird result.

The layered displacement maps made each area on the map balloon. The result was really interesting but there’s not much I can really do with it since it appears to be so unpredictable.

I started from a fresh scene and rebuilt a shader using a transparency map, a diffuse map, and a height map. I kept the displacement height very low to begin with, so it is easier to work with while I work things out. Here is the result:

 

Next, I needed to increase the scale of it so that the camera can do a fly-through. I played around with the settings and this was the result.

31HiResFail

I exported a high-resolution render of the scene. As you can see, there are many issues with this. You can see the difference in height between each 0-255 alpha value of the displacement map. This could possibly be fixed by enabling ‘smoothing’. There is also a strange area at the back where it appears as though the normals are facing the wrong direction.

I had been having trouble with the displacement map. Each time I rendered the scene, the displacement map would be cropped by the outer edge of the plane. I discovered that this was caused by the normals being the wrong way. This was odd because it didn’t look like meshes normally do when the normals are inside out.

 

I found a website that has hundreds of skymaps for free called HDRI Haven. I found a nice overcast scene and brought it into Maya.

33SkyDome

32SkyDome
Rendered landscape with too few polygons

34Water

I then decided I would bring in a sea-level. I found an official Maya tutorial that shows how to create an ocean effect. It’s as easy as applying a displacement image to a plane. You can then adjust transmission values to make it transparent.

36WaterScene

The texture is still too large so I would need to tone it down a good bit.

I also looked into creating grass for the landscape. I found a tutorial that uses Maya’s XGen to generate grass. It looked great on a small plane. But my landscape mesh was too large, and the density of vertices was totally uneven. Either the mountains appeared too bare, or the render would crash as there were too many blades of grass.

37grassPlane35GrassFail

I think I may have to paint my grass textures onto the landscape. This may work better in the long run, as I will be able to paint different textures like rock for higher peaks. I am just realising now, that it will be a nightmare to UV map this, but I’ll see how it goes.

https://area.autodesk.com/tutorials/creating_grass_with_splines/

38SeaAndSky

Another Approach

I am sceptical about pursuing a realistic representation of land. It is a lot of effort for a rarely interesting result. I am considering taking a different aesthetic approach. Currently, I am not exactly sure what this means, but I am thinking about bringing the 3D landscape into another medium.

I may produce an interactive environment in Unity, I could use Maya’s sculpting tools to change the environment, or I may create a 2D map using the 3D data.

In Edward Tufte’s books, there are many references to orthographic maps. These often have a practical use but tend to be very aesthetically interesting. Perhaps I could build an orthographic map, then build upon it in Illustrator to create a print.

Image result for Michel-Étienne Turgot and Louis Bretez

Look into Italo Calvino invisible cities

Plan of Action

I am going to need a draft render in order to start playing around with the design. I am currently unsure of the angle, texture and lighting for the 3D render but I will just try things out.

I will then begin to play around with different designs, and maybe go back and adjust the renders.

Hopefully I will be able to get into the workflow of creating graphic designs, then adjusting renders, then adjusting the design until I reach a design that I am happy with.

 

 

Draft Render

39OrthoBlackWhite40OrthoBlackWhiteSmooth

I rendered out a couple of orthographic shots of Glasgow, each at a slight tilt so the height of the peaks and troughs is obvious. There are a couple of problems with this, however. You can see a deep, dark shadow where the geometry ends above sea level. This could be fixed by extruding the outer edge and bringing it downwards along the Y-axis, creating cliffs.

Also, I quite like the way the render looks without subdivision, but I will carry on with the subdivisions until I am experimenting with the aesthetic. The sea is also distracting, and I may need to find a way to keep it within the land.

41OrthoTopDown0242OrthoTopDown03

I then decided it would probably be best to start simple, with a top-down shot, and work from there. I find this result very effective, although I struggled to work out where was where until I found out that the image is upside down and flipped.

Also, adding a bit of specularity to the material really emphasised the peaks and troughs, making it easier to recognise high points and low points. There are some annoying jaggy bits at the edges, where the displacement map ends. I need to go into Photoshop and match the colours with the edges.

43EarlyMockup01

I then added some labels to see what areas are what. I discovered that my displacement map is actually inverted, so the lowest points are the least deprived. The result is quite accurate; however, when compared to an actual map of Glasgow, it is distorted. I am not sure why this is, but I suspect that it becomes distorted when converting to geometry. I will have to work this out in Maya, but it is not a priority.

 

I decided that I quite like the idea of the lochs being the least deprived areas and the mountains being the most. The real focus of the data is areas of deprivation, and it makes sense to give these the most mass to highlight them.

I decided I would have a look at applying colour maps to give it more of a map-like appearance.

47ColourOrthoRender

I find the colour a bit overwhelming, and it is difficult to derive information from it. I will try removing the blue colours.

45DeprivationOnlyRender

I feel as though this is far more effective as data visualisation. I may flatten out the grey areas as well to draw less attention. I also discovered that I can use the cutout filter in PS to give more of a graphical map-like quality to the render.

With Cutout

46DeprivationOnlyCutout

Without Cutout

44DeprivationOnly

48OrthoTopDown05

The cutout works well. I will need to go back in and flatten the grey areas.

Starting From Scratch(ish)

I have been using the first maps I created. They worked well enough, but are unsuitable for a final render as they are too low resolution and have some wonky areas.

I recreated the displacement map and diffuse map, see below.

49ColourMap50Displacement

After applying the displacement map to a plane, then adding a shader with the colour map, I added a physical sky. I had never used this before but it is really handy. Essentially it is a skybox with a directional light. The time of day can be changed in the attributes.

Rendering with this light gave a nice result.

51PhysicalSky

I may be able to paint on map graphics to make it look as though this is raised from a real map. I created a quick test, where I painted the affluent areas light blue, and added a label for Kelvinside.

52ColourMapMap

The render came out as follows…

53ColourMapRender

Although it is slightly off, I am very happy with this render. The benefit of this workflow is that I am still not limited to one medium. I could use this to produce a series of prints consisting of close-ups and other data. Or I could make an animation, with fly-throughs and slow pans.

54ColourMapCloseUp

This close-up reveals the versatility of working in 3D; however, the imperfections become more obvious. There are some funny raised vertices. These can be fixed really easily, but it’s not worth doing until the end. The individual polygons are also visible in some areas, as are some of the changes between colours. This is trickier to fix. I could update my geometry to have more vertices; I cannot guarantee this will fix it, but it’s the only thing I can think of. I quite like its appearance, though, and it does not necessarily have to be fixed, so it is not a priority. I will bear it in mind, however.

Labels

The data is difficult to link back to Glasgow unless you know it well. It would be useful for me to include labels, for key areas. I have written a list of areas of interest…

Areas of Least Deprivation

  • Kelvinside/ West End
  • Mount Vernon & Garrowhill
  • Pollokshields – Newland’s & Kingspark
  • Dennistoun and Port Dundas

Areas of Highest Deprivation

  • Carntyne West & Haghill- most deprived in Glasgow
  • Easterhouse South and North Barlanark
  • Castlemilk
  • Drumchapel
  • Pollok
  • Possil Park & Milton
  • Govan
  • East End – Bridgeton, Dalmarnock, Parkhead, Easterhouse, Queenslie

 

Seeing as this is a fictional landscape I am creating, it may be interesting to give them names like Loch Kelvin (may exist in real life though), or the East End Munros, Ben Pollock.

I am also contemplating whether to use 3D text, which may look nice in animation, or plain 2D text within the diffuse map to give it more of a map look.

Finishing Touches

The more time I have spent polishing the render, the more I realise that the 3D artefact should entirely be the piece. I no longer wish to produce a design with the accompanying information- and if I do, it shall be minimal. I think the data visualisation should speak for itself. If there is to be other info, it may just be a key to the side. At a push I will point to the most and least deprived areas in Glasgow.

I think I shall just produce several print-quality renders, then I can decide what I want to do with them, but they should be able to stand alone as a piece.

If I have the time, I would like to produce a short video that pans across the landscape to really show the three dimensions. The one issue is, whether I can render it for Friday. I may be able to let it render on the studio’s PC overnight.

I liked these renders, but I feel as though the flat text does not work so well in perspective shots and the height of each deprivation level is uneven. Next I will raise the text, and smooth out the landscape.

56aisedOrTexture

I removed the texture text and replaced it with 3D. I also smoothed out the landscape and made it less shiny. I feel like the white text doesn’t look quite right, however. There is also no painted border for Glasgow. Now the border is entirely raised.

57RaisedText58RaisedText00

59RaisedText0160RaisedText02

The text is now blue or red depending on its background. This looks much better but it is too difficult to read from a distance- I tried to fix this but changing the scale or size of the text just breaks Arnold, then it refuses to apply the shaders in the render.


Final Renders

61A3Render

The light looks a little bit harsh in this but it looks good. I rendered this out in A3 so the text will hopefully look better when printed.

62A3OutputLabeled01

I tried labelling it for a print. I can’t tell if I like it or not. I may just print it to see. First, I think I am going to need to update the render, as the sunlight is too harsh and makes it difficult to read the text.
