Tutorial written by Keith Jenkins, GIS Librarian at Mann Library, Cornell University, spring 2019.
https://kgjenkins.github.io/pam5280/
In this workshop, we will use QGIS to create a map of tract-level cancer data for New York City. The steps below could be modified to map other health variables, or other cities. (The same methods could also be applied to any other spreadsheet of data that includes tract ids.) The map we create for New York City will look something like this:
The CDC (Centers for Disease Control and Prevention) has compiled a variety of health data for 500 cities across the United States. This data is aggregated at the Census tract level, which is detailed enough to see variation across a city.
500 Cities tract-level health data
The only location data provided by this CSV file is found in the columns containing the state abbreviation, placename, and tract FIPS codes (which uniquely identify each tract). In order to make a map, we will need the Census tract boundaries.
Census Tract boundaries
1. Download the workshop data
2. Open QGIS
QGIS is a free, open-source geographic information system that can be used to create maps and perform spatial analysis. If you would like to install QGIS on your own Windows, Mac, or Linux computer, visit the QGIS web site at qgis.org.
3. Load the tract boundaries
Browse to the tracts-ny folder within the unzipped workshop files, select the tl_2018_36_tract.shp file, and click the “Open” button.
“Vector” means points, lines, and polygons. In this case, the tract boundaries are polygons.
Shapefiles are an archaic spatial data format, but still commonly used. A shapefile is composed of several separate files, all with the same name but different extensions. In this case, the shapefile consists of all the files that start with tl_2018_36_tract. When selecting the shapefile to load, pick the one with the .shp extension.
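To see which files make up a shapefile, it can help to list everything in the folder that shares the same base name. The following is a minimal Python sketch, not part of the workshop itself; the function name shapefile_parts is made up for illustration, and it assumes the component files sit side by side in one folder.

```python
from pathlib import Path


def shapefile_parts(shp_path):
    """Return the names of all files that share the .shp file's base name."""
    shp = Path(shp_path)
    # Compare the text before the first dot, so that sidecar files with
    # double extensions (for example "name.shp.xml") are matched too.
    base = shp.name.split(".")[0]
    return sorted(
        p.name
        for p in shp.parent.iterdir()
        if p.is_file() and p.name.split(".")[0] == base
    )
```

Calling shapefile_parts on the path to tl_2018_36_tract.shp should list the .shp file along with its companions (typically .shx, .dbf, and .prj, among others). If any of these are missing, the layer may not load correctly.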
4. Basic Styling
The shapefile only contains data, and lacks any sense of style. The default polygon style is used with a random color, but we can change it.
5. Explore the tract boundary data
There are two basic ways to explore a shapefile:
Notice that there is no demographic data in the boundary shapefile, just identifiers and geometric data. We will be using the GEOID column, which contains a unique id for each tract, to join the 500 Cities data in order to have something interesting to map.
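A tract GEOID is an 11-digit code made up of a 2-digit state FIPS code, a 3-digit county code, and a 6-digit tract code. The short Python sketch below splits one apart; the function name is hypothetical and the sample id is only an illustrative value. Treating the id as text rather than a number keeps any leading zeros intact.

```python
def split_tract_geoid(geoid):
    """Split an 11-digit tract GEOID into state, county, and tract codes."""
    geoid = str(geoid)
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected 11 digits, got {geoid!r}")
    return {"state": geoid[:2], "county": geoid[2:5], "tract": geoid[5:]}


# Illustrative id: state 36 (New York), county 061, tract 000100
print(split_tract_geoid("36061000100"))
# → {'state': '36', 'county': '061', 'tract': '000100'}
```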
6. Load the 500 Cities health data
Select the cdc_500_cities.csv file.
QGIS is pretty good at detecting things in a CSV, but check the options and make sure that the “Sample Data” preview at the bottom looks good. Make sure that “Detect field types” is checked and that “No geometry” is selected (meaning that there are no coordinates in our table).
A delimited table won’t yet show up on the map, but you can explore the data table:
Notice especially the “TractFIPS” column, which contains the unique ids for each tract. We will use these values to match each row to the corresponding row in the tract boundary shapefile.
7. Join tables
We want to join the 500 Cities data to the tract boundaries, so that we can use colors to display the 500 Cities data on the map.
Set the following options in the “Add Vector Join” dialog:
Join layer: cdc_500_cities
Join field: TractFIPS
Target field: GEOID
Custom field name prefix: _
Be sure to save the join!
Open the attribute table for the tract boundaries to see the joined data. Note that some of the rows will have null values because they were not part of the 500 Cities dataset.
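Conceptually, the join looks up each tract's GEOID in the health table's TractFIPS column and copies the matching values across; tracts with no match keep a null. Here is a rough Python sketch of that idea, not of how QGIS implements it. The two tiny tables are hypothetical stand-ins with made-up values, and join_by_id is an invented helper name.

```python
import csv
import io

# Hypothetical miniature stand-ins for the two tables (values are made up).
tracts = [
    {"GEOID": "36061000100"},
    {"GEOID": "36061000201"},
    {"GEOID": "36001000100"},  # no matching row in the health table
]
health_csv = io.StringIO(
    "TractFIPS,CANCER_CrudePrev\n"
    "36061000100,5.1\n"
    "36061000201,6.3\n"
)


def join_by_id(tracts, csv_file, prefix="_"):
    """Attach CANCER_CrudePrev to each tract, or None where nothing matches."""
    # csv.DictReader reads every value as text, so the ids compare exactly.
    lookup = {row["TractFIPS"]: row for row in csv.DictReader(csv_file)}
    joined = []
    for tract in tracts:
        match = lookup.get(tract["GEOID"])
        out = dict(tract)
        value = float(match["CANCER_CrudePrev"]) if match else None
        out[prefix + "CANCER_CrudePrev"] = value
        joined.append(out)
    return joined


joined = join_by_id(tracts, health_csv)
for row in joined:
    print(row)
```

Every tract is kept in the result, which is why the attribute table still contains rows for tracts outside the 500 Cities dataset.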
8. Graduated colors
Now that we have some interesting health data in our tract boundary attribute table, we can make a choropleth map, using the numeric values of a particular column to set the color of the corresponding polygon.
We will be mapping New York City, so zoom to that area on the map. Now we’ll change the map styles.
For the graduated style, choose _CANCER_CrudePrev as the column to map.
If all goes well, you should see a range of colors across the city. Notice that the tracts that had null values (i.e. were not included in the 500 Cities dataset) do not appear at all.
The default classification mode is “Equal Interval”, which is not usually the best way to view the data. Change it to “Quantile” and observe the difference. Also try “Natural Breaks”, which is designed specifically for mapping numeric data: it tries to minimize the variation within each class.
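The difference between equal interval and quantile classification is easiest to see with skewed data. The Python sketch below uses a made-up list of values with one outlier and counts how many values land in each of 5 classes. The helper names are invented, and the quantile cut points come from Python's statistics.quantiles, which may not match the exact values QGIS computes.

```python
import bisect
import statistics


def equal_interval_cuts(values, k):
    """Interior class boundaries that split the value range into k equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def quantile_cuts(values, k):
    """Interior class boundaries that put roughly equal counts in each class."""
    return statistics.quantiles(values, n=k)


def class_counts(values, cuts):
    """Count the values per class; a value equal to a cut goes in the lower class."""
    counts = [0] * (len(cuts) + 1)
    for v in values:
        counts[bisect.bisect_left(cuts, v)] += 1
    return counts


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # made-up data with one outlier
print(class_counts(values, equal_interval_cuts(values, 5)))  # → [9, 0, 0, 0, 1]
print(class_counts(values, quantile_cuts(values, 5)))        # → [2, 2, 2, 2, 2]
```

With equal intervals, the single outlier stretches the range so that nearly everything falls into one class and most of the map would share one color. Quantiles spread the values evenly across the classes.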
You can change the number of classes (default is 5), and you can also change the color ramp – click it to edit the colors. There are many other ways to customize map styles in QGIS – the possibilities are endless!
9. Add a Basemap
Sometimes it is useful to load a global base layer from the web, to add context to your map, or just to help confirm that your data is correctly aligned. The QuickMapServices plugin makes this easy. (It’s already installed on the Mann Library computers.)
To avoid the pixelization caused by reprojecting the basemap image, right-click the basemap layer > Set CRS > Set Project CRS from Layer. (CRS means Coordinate Reference System.) In fact, the CRS of the basemap is a better choice for keeping local shapes true to the ground, rather than distorted (as they were in the original latitude/longitude CRS that was being used when we first loaded the tract boundary shapefile).
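The distortion can be estimated with a little arithmetic. When latitude and longitude are drawn as a square grid, a degree of longitude takes up as much screen space as a degree of latitude, even though on the ground it is shorter by a factor of cos(latitude). The Python sketch below assumes the basemap uses Web Mercator (EPSG:3857), as web tile basemaps commonly do, and uses a simple spherical model; the function names are invented for illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def lonlat_stretch(lat_deg):
    """East-west stretch when degrees are drawn as a square grid."""
    return 1 / math.cos(math.radians(lat_deg))


def web_mercator(lon_deg, lat_deg):
    """Spherical Web Mercator x, y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def local_scale_factors(lat_deg, step=1e-4):
    """Map distance per ground distance, east-west and north-south."""
    x0, y0 = web_mercator(0.0, lat_deg)
    x1, _ = web_mercator(step, lat_deg)
    _, y1 = web_mercator(0.0, lat_deg + step)
    ground_ns = EARTH_RADIUS_M * math.radians(step)
    ground_ew = ground_ns * math.cos(math.radians(lat_deg))
    return (x1 - x0) / ground_ew, (y1 - y0) / ground_ns


print(round(lonlat_stretch(40.7), 2))  # New York City is near 40.7 degrees north
print(local_scale_factors(40.7))
```

At New York City's latitude, the square-grid display stretches shapes east-west by roughly a third. In Web Mercator the two scale factors come out nearly equal, so local shapes keep their proportions, although sizes are still exaggerated away from the equator.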
10. Creating a Print Layout
Now it’s time to decide how our map will appear on a page, or in an exported image. We can create a print layout that specifies the extent of the map we want to show (just New York City, for example). With a print layout, we can also add a title, a legend, a scale bar, and other elements to the page.
First, we add a map to the page:
By default, the map will be centered and scaled just as it appeared in the main QGIS window. To adjust the extent of the map, use the “Move Content” tool.
Now you can pan the map content, or zoom with the mouse wheel. To zoom with finer control, hold the CTRL key while zooming.
11. Add a title
12. Add a Legend
The default legend is a bit ugly, as it uses the original layer names, and also shows layers that are not needed. But we can customize the legend in the legend’s Item Properties:
Remove (with the “-” button) any rows you don’t want to show, like cdc_500_cities and ESRI Gray (light).
Select tl_2018_36_tract and click the “Edit” icon below (pencil on paper) to change the layer name to “Crude Prevalence %”.
13. Add a Scale Bar
14. Add a North Arrow
In QGIS, a north arrow can be added as an SVG image. (SVG = scalable vector graphic)
C:\OSGeo4W64\apps\qgis\svg\arrows\NorthArrow_04.svg
(or one of the other files there)
15. Export the map image