GIS 5935 - Lab 15 -- Dasymetric Mapping

I wish I could say the hardest part of this lab was learning to spell and pronounce dasymetric...

For the last lab of the semester we learned about dasymetric mapping, a mapping approach for overcoming the limitations of spatial data aggregation. Put simply, the technique uses ancillary information to estimate population for areas that don't follow typical census boundaries.

For example, in this lab we wanted to see how many 5 to 14 year olds fall within each high school attendance zone in order to estimate future attendance. For this method, we used an impervious surface dataset, on the assumption that population is concentrated within impervious areas. Using the Zonal Statistics as Table tool, we found the area where the population was concentrated within each census zone. With the Intersect tool and Summarize we were able to see where the impervious spots were in…
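To make the allocation idea concrete, here is a minimal sketch in Python rather than the ArcGIS workflow itself. It assumes each census unit's population is split among its intersect pieces in proportion to impervious area, and then totaled by attendance zone. The field names, zone names, and numbers are all hypothetical.

```python
def allocate_population(pieces):
    """Split each census unit's population among its intersect pieces
    in proportion to impervious area, then total the result by zone.

    pieces: list of dicts with hypothetical keys
        'census_id', 'zone', 'census_pop', 'impervious_area'
    """
    # Total impervious area per census unit (the denominator of each weight).
    impervious_by_census = {}
    for p in pieces:
        cid = p["census_id"]
        impervious_by_census[cid] = impervious_by_census.get(cid, 0.0) + p["impervious_area"]

    pop_by_zone = {}
    for p in pieces:
        total = impervious_by_census[p["census_id"]]
        # Under this assumption a unit with no impervious surface allocates nothing.
        share = p["impervious_area"] / total if total > 0 else 0.0
        pop_by_zone[p["zone"]] = pop_by_zone.get(p["zone"], 0.0) + p["census_pop"] * share
    return pop_by_zone


# Made-up example: census unit T1 straddles two zones, T2 sits inside one.
pieces = [
    {"census_id": "T1", "zone": "North HS", "census_pop": 400, "impervious_area": 30.0},
    {"census_id": "T1", "zone": "South HS", "census_pop": 400, "impervious_area": 10.0},
    {"census_id": "T2", "zone": "South HS", "census_pop": 200, "impervious_area": 5.0},
]
estimates = allocate_population(pieces)
```

In this toy case three quarters of T1's impervious area falls in the North zone, so three quarters of its 400 children are assigned there.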

GIS 5936 -- Lab 14 Spatial Data Aggregation

This week's material was on gerrymandering, the practice of manipulating district boundaries to gain a political advantage over a party or group. Gerrymandered districts split counties and divide multiple counties between different districts, so that decision makers can choose which demographics and ideals in line with their own to include in their district. We examined two ways to measure its effects on the shape of political districts. The first is a measure of compactness. The theory is that oddly shaped districts have been reshaped and redrawn, as opposed to districts with regular, roughly rectangular shapes. To measure this, I used the formula (400 * pi * Area) / (Perimeter^2). This gave me a score between 0 and 100, where higher scores are relatively compact and the lowest scores were considered "the worst". Below is an example of the least compact district:

The second is a measure of community or the relationship between how many counties fell within a particu…
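The compactness formula from the first measure is easy to check by hand. Here is a small sketch in Python; the function name and test shapes are my own, and it assumes area and perimeter are in matching units (for example square meters and meters).

```python
import math


def compactness(area, perimeter):
    """Compactness score: 400 * pi * area / perimeter^2.

    A perfect circle scores 100; more contorted shapes score lower.
    """
    return 400 * math.pi * area / perimeter ** 2


# A circle of radius 1: area = pi, perimeter = 2 * pi.
circle_score = compactness(math.pi, 2 * math.pi)

# A 1 x 1 square: area = 1, perimeter = 4.
square_score = compactness(1.0, 4.0)

# A long thin 100 x 1 strip: area = 100, perimeter = 202.
strip_score = compactness(100.0, 202.0)
```

The circle scores 100, the square scores about 78.5, and the thin strip scores only about 3, which is the kind of low value a stretched-out district would get.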

GIS5935 - Lab 12: Geographically Weighted Regression

This week built on the last two weeks of regression, this time with geographically weighted regression (GWR). This type of regression allows analysts to include a spatial component in their regression models. The need for this comes partly from the unique properties of spatial data: spatial autocorrelation, non-normal datasets, collinearity, and heteroscedasticity. In this lab we ran both OLS and GWR regressions on crime data, compared the two, and then looked at the residuals as a first step toward predicting where future crime may occur.

The biggest difference between OLS and GWR regression analysis is that GWR includes a spatial component: a linear regression is fitted for every location, so the best-fit model varies from location to location, whereas regular OLS fits a single best-fit line to the entire dataset. In this respect OLS can be thought of as a "global model", and GWR is more of a "local model…
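The global versus local idea can be illustrated with a toy example. This is only a sketch, not what the ArcGIS GWR tool does internally: it assumes a single explanatory variable, a Gaussian distance-decay kernel, and a fixed bandwidth that I picked by hand. The data are synthetic, with a true slope of 1 in the west and 3 in the east.

```python
import math


def weighted_fit(xs, ys, ws):
    """Weighted least-squares line y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    b = sxy / sxx
    return ybar - b * xbar, b


def ols(xs, ys):
    """Global model: every observation gets the same weight."""
    return weighted_fit(xs, ys, [1.0] * len(xs))


def gwr(coords, xs, ys, bandwidth):
    """Local model: one weighted fit per location, with nearer
    observations weighted more heavily (Gaussian kernel, assumed)."""
    fits = []
    for ux, uy in coords:
        ws = [
            math.exp(-0.5 * (math.hypot(ux - vx, uy - vy) / bandwidth) ** 2)
            for vx, vy in coords
        ]
        fits.append(weighted_fit(xs, ys, ws))
    return fits


# Synthetic data: 20 locations along a west-east line.
coords = [(float(i), 0.0) for i in range(20)]
xs = [float((i * 7) % 5 + 1) for i in range(20)]               # predictor cycles through 1..5
ys = [(1.0 if i < 10 else 3.0) * x for i, x in enumerate(xs)]  # slope 1 in the west, 3 in the east

global_intercept, global_slope = ols(xs, ys)
local_fits = gwr(coords, xs, ys, bandwidth=2.0)
west_slope = local_fits[0][1]
east_slope = local_fits[-1][1]
```

The single OLS line averages the two regimes into one slope, while the local fits at the western and eastern ends recover slopes close to 1 and 3 respectively.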

GIS5935 - Lab 11: Multivariate Regression

For this week, we built a little on last week's regression basics and learned about multivariate regressions, or regressions with multiple independent variables, as well as useful ways to perform regression analysis within ArcMap.

Overall, analysis within ArcMap allowed us to assess the performance of the different models that we put together. One way to do this is to use the exploratory regression tools within ArcMap. The output lists the different combinations of variables and reports statistics for the 6 checks for a meaningful regression (listed below):

Are the explanatory variables helping my model? Look at the p-values of the coefficients (p-value < 0.05 is significant and helpful).
Are the relationships what I expected? Check whether the signs of the coefficients are what is expected.
Are any of the explanatory variables redundant? Check for multicollinearity and redundancy. Look at the VIF (only available when there is more than one variable); VIF < 7.5 generally means no redundancy.
Is the model bi…

GIS5935: Lab 10 -- Intro statistics

For this lab in introductory statistics, we used regression analysis to estimate rainfall for missing years using a nearby weather station. To do this, we first ran a regression analysis in Excel using the Data Analysis tool from the Analysis ToolPak add-in. Below are the results from the regression analysis:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.903336
R Square             0.816017
Adjusted R Square    0.812545
Standard Error       70.23828
Observations         55

ANOVA
             df    SS          MS          F          Significance F
Regression   1     1159697     1159697     235.0697   3.95E-21
Residual     53    261471.1    4933.417
Total        54    1421168

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      162.3421       66.17926         2.453066   0.017488   29.60332    295.0809    29.60332      295.0809
X Variable 1   0.846171       0.05519          15.33198   3.95E-21   0.735474    0.956868    0.735474      0.956868

In particular, we used the Intercept and X Variable 1 (slope) coefficients from the output above.
To estimate the rainfall levels we used the equation for a line of best fit, Y = bX + a, where Y represents the predicted value, or what wi…
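As a quick illustration of the line-of-best-fit equation, here is a sketch in Python that plugs in the intercept and slope reported in the summary output above. The function name and the input value of 1000 are made up for the example, and the units are whatever the station records use.

```python
INTERCEPT = 162.3421  # "Intercept" coefficient (a) from the summary output
SLOPE = 0.846171      # "X Variable 1" coefficient (b) from the summary output


def predict_rainfall(x):
    """Estimate Y = b*X + a from the nearby station's value X."""
    return SLOPE * x + INTERCEPT


# Hypothetical reading of 1000 at the nearby station for a missing year.
estimate = predict_rainfall(1000.0)
```

With that made-up reading, the estimate works out to 0.846171 * 1000 + 162.3421 = 1008.5131.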

GIS5935 - Lab 9: Accuracy of Elevation Models

Continuing with the surfaces series, this week we assessed the vertical accuracy of Digital Elevation Models (DEMs), which are interpolated surfaces. Because they are interpolated, they can vary in accuracy and reliability. To assess this, one needs high-accuracy sample points and an understanding of a few statistics and calculations.

One statistic is the RMSE (root mean square error). To calculate it, you square the difference between the DEM and each sample point, average those squared differences over the number of sample points, and then take the square root. The RMSE is a measure of overall error, though it is only a summary of the differences. Lower values are better. The confidence intervals describe how close the DEM is to the actual value for a given percentage of occurrences. For example, for land cover A, the elevation is within 0.206m 95% of the time and within 0.105m 68% of the time.
*The most confusing part of this lab for me was calculating these.  I did…
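For anyone else who finds the calculation confusing, here is a minimal sketch of the RMSE in Python with made-up elevations. The 95% figure at the end uses the common 1.96 * RMSE rule, which assumes the errors are normally distributed with no systematic bias. The 0.105m and 0.206m values quoted above differ by a factor of about 1.96, which would be consistent with that rule, but that is my inference rather than something the lab states.

```python
import math


def rmse(dem_values, reference_values):
    """Root mean square error between DEM elevations and reference points."""
    diffs = [d - r for d, r in zip(dem_values, reference_values)]
    mean_squared = sum(e * e for e in diffs) / len(diffs)
    return math.sqrt(mean_squared)


# Made-up elevations in meters: DEM cell values vs. surveyed sample points.
dem = [10.2, 11.9, 13.1, 9.8]
reference = [10.0, 12.0, 13.0, 10.0]

error_68 = rmse(dem, reference)  # roughly the 68% level if errors are normal
error_95 = 1.96 * error_68       # assumed normal-distribution multiplier
```

Here the squared differences are 0.04, 0.01, 0.01, and 0.04, so the mean is 0.025 and the RMSE is about 0.158m.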

GIS5935: Lab 8 -- Interpolation Methods

This week we learned about interpolating surfaces. We started by comparing the IDW method and the spline method and calculating the difference between the two sets of values. It was very informative to see where the IDW or the spline method was more accurate and where the values were roughly the same.
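To show what IDW (inverse distance weighting) is doing under the hood, here is a bare-bones sketch in Python. It is not the ArcGIS tool: it assumes a power of 2, uses every sample point with no search radius, and the sample values are invented.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: list of (sample_x, sample_y, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample point, so use its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two invented sample points 2 units apart with values 10 and 20.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

midpoint = idw(samples, 1.0, 0.0)      # equal distances, so a plain average
near_first = idw(samples, 0.5, 0.0)    # pulled toward the closer sample
```

At the midpoint both samples get equal weight and the estimate is 15, while at x = 0.5 the nearer sample dominates and the estimate drops to 11. Estimates always stay between the lowest and highest sample values, which is one reason IDW surfaces tend to form small rings around each sample point.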

For the second part, we compared interpolated surfaces using Thiessen polygons, IDW, as well as both the regularized and tension spline methods. It was very interesting to see how each one differed in different areas and which ones were the better representation of the sampled points. Below is a screen capture of all four different methods and how they compare:

As you can see, each method varied from the others. Creating the Thiessen polygons (nearest neighbor method) produced very strict boundaries, which I think is rather unrealistic in an oceanographic environment. The IDW method created very small diameters of the different concentrations, and most of the surface was aroun…