Frawley: R Geospatial Wrap Up

Background –

This independent study was more about learning R than about project-based learning, which is how I approached Python. R is a very powerful, data-driven language that handles large data sets well, and it is widely used for geospatial analysis. I created a county population map of the United States using the census data built into usmap, a library whose primary use is this kind of mapping. The code looks very simple, but getting there was not. I had significantly more problems with R than I did with Python. The syntax used in R is an absolute cluster and I despise it.

 

Steps

  1. Import the libraries usmap and ggplot2.
  2. Add some data to plot on the map: data = countypop draws the county lines, and values picks the column of the built-in census data to plot.
  3. Create a fill scale and set its breaks.
  4. Change the fill scale to one that is not a single constant color, for easier viewing.
  5. Cry because you couldn't get an interactive map working with tmap.

 

The Code –

library(usmap)
library(ggplot2)

# County-level map of 2015 population, colored on a log2 scale
plot_usmap(data = countypop, values = "pop_2015", color = "grey") +
  scale_fill_viridis_c(name = "County Population 2015", labels = scales::comma,
                       option = "D", trans = "log2",
                       breaks = scales::trans_breaks("log2", function(x) 2^x)) +
  theme(legend.position = "right")

 

 

Problems –

As I said before, I had tons of issues with this project. I programmed the map at least 15 different ways and it never looked the way I wanted it to. I think R is a little over my head when it comes to the math-based parts, like the scale. The breaks never came out how I wanted them, so I chose the option that seemed the most reasonable.

Another issue was that this map is static: it cannot move, and you cannot click on counties to see their population. I tried for at least three weeks to create an interactive map with the tmap library. In principle it is very simple: you join the counties to the county population CSV and map the result. The issue I had was joining the two tables by their GEOID. It never seemed to work, even after I created my own spreadsheet with the counties' populations, names, states, and GEOIDs. Didn't work. This was the most frustrating part of the project.
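The cause of the failed join is not known. One common culprit with county GEOIDs, offered here only as a possibility, is a mismatch between text IDs such as "01001" and numeric IDs such as 1001, which happens when a spreadsheet drops the leading zero. The sketch below is in Python (the language used in the rest of these posts) rather than R, and everything in it is hypothetical: the two small tables, the made-up population values, and the helper names normalize_geoid and join_on_geoid.

```python
# Hypothetical stand-ins for the two tables. The population values are made up.
counties = [
    {"GEOID": "01001", "name": "Autauga County"},
    {"GEOID": "39041", "name": "Delaware County"},
]
population = [
    {"GEOID": 1001, "pop_2015": 100},    # leading zero lost, stored as a number
    {"GEOID": 39041, "pop_2015": 200},
]

# A naive comparison finds nothing, because "01001" != 1001 and "39041" != 39041.
population_ids = {row["GEOID"] for row in population}
naive_matches = [row for row in counties if row["GEOID"] in population_ids]


def normalize_geoid(value):
    """Return a county GEOID as a 5-character, zero-padded string."""
    return str(value).strip().zfill(5)


def join_on_geoid(left_rows, right_rows):
    """Inner-join two lists of dicts on their normalized GEOID."""
    lookup = {normalize_geoid(row["GEOID"]): row for row in right_rows}
    joined_rows = []
    for row in left_rows:
        key = normalize_geoid(row["GEOID"])
        if key in lookup:
            # Combine both rows and store the GEOID in its normalized form
            joined_rows.append({**lookup[key], **row, "GEOID": key})
    return joined_rows


joined = join_on_geoid(counties, population)
```

If the IDs in the real tables already match in type and padding, the problem lies elsewhere and this sketch does not apply.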

Again, I took more of a book approach to this course, so it definitely felt like I could have figured it out by making other things in R first. I can follow code and see what someone is doing, but trying to do it on your own when starting out is very difficult. Also, R is just way worse than Python and not as user-friendly. I feel that I could definitely become better at R if I just spent more time with it and brushed up on my math skills a little more.

 

Frawley: Python Wrap Up

Background – For this independent study, I wanted to solve a problem that isn't necessarily tedious, just annoying. For me, that was creating true color images in ArcGIS Pro. Working with Dr. Rowley has made me quite familiar with creating true color images, mostly several of them over a melt season for the Sermeq Kujalleq Ablation Region. This script makes everything a little easier, so you don't have to manually pick which .TIFF files to import. I had some experience with Python before starting this course, so it was definitely a little easier than I expected, and the syntax of Python is just leaps and bounds ahead of that stinky language R.

 

Introduction – (what each step does)

  1. Create a geodatabase (.gdb): this creates a server-like storage location that houses your TIFF files, so you can run calculations and pull the data back down.
  2. Create Mosaic Dataset: this creates a dataset to house your images, another important step before using the Composite Bands tool.
  3. Add rasters to Mosaic: add your images to the dataset.
  4. Composite Bands tool: from the mosaic dataset, pass the images to the Composite Bands tool, which creates the true color image raster for you.
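The script itself is not reproduced in this post. As a rough illustration of the part that removes the manual work, choosing which .TIFF files go into the composite, here is a minimal sketch in plain Python that runs without ArcGIS. It assumes Landsat 8-style file names ending in _B<number>.TIF, where bands 4, 3, and 2 are red, green, and blue; the file names and the function pick_true_color_bands are hypothetical.

```python
import re
from pathlib import Path

# Assumption: Landsat 8-style names, where bands 4, 3, 2 are red, green, blue.
TRUE_COLOR_BANDS = (4, 3, 2)
BAND_PATTERN = re.compile(r"_B(\d+)\.TIFF?$", re.IGNORECASE)


def pick_true_color_bands(filenames, bands=TRUE_COLOR_BANDS):
    """Return the band files in red, green, blue order.

    Raises ValueError if any requested band is missing.
    """
    by_band = {}
    for name in filenames:
        match = BAND_PATTERN.search(Path(name).name)
        if match:
            by_band[int(match.group(1))] = name
    missing = [band for band in bands if band not in by_band]
    if missing:
        raise ValueError(f"missing band(s): {missing}")
    return [by_band[band] for band in bands]


# Hypothetical scene folder contents
scene_files = [
    "scene_B2.TIF",
    "scene_B3.TIF",
    "scene_B4.TIF",
    "scene_B5.TIF",
    "scene_MTL.txt",
]
rgb_files = pick_true_color_bands(scene_files)
```

The resulting list, in red-green-blue order, is what would then be handed to the Composite Bands tool in step 4. That call is left out here because it needs ArcGIS Pro to run.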

 

 

 

What Did I Learn? –

I learned a lot from this project, first of all that Esri software is insanely tedious to work with. I needed to jump through several hoops just to create a geodatabase. The geodatabase is created through ArcServer – everything must be run through the server and you need to have the software installed to even create one.

I learned to problem-solve much better on my own; knowing exactly what to search for to find solutions to my problems was a huge success. I genuinely didn't have any major problems with this project, nothing so halting that I felt I needed to stop the project or get outside help. I dedicated about 8-10 hours a week to this project, and most of that was spent debugging and finding solutions for my code.

One issue came up at the very beginning of the project. To use the arcpy library, you cannot have the newest version of Python installed. At the time of writing, the newest version is 3.10, but arcpy is only compatible with 3.7 and below. I felt like this project was a great introduction to the GIS development world, and I am very glad that I took it. Overall, I am very confident in my work for this course. I believe that this script has significant use for GIS creators and researchers.

Wade: Chapter 8: Manipulating Spatial and Tabular Data

 

Mack Wade

Terms:

Comma-separated values (CSV): a delimited text file that uses a comma to separate values. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format.

Data lock: a restriction placed on a dataset while it is being edited, so that two processes cannot change it at the same time. (Python's threading module has a related Lock class: a Lock object is created in the unlocked state and has two primary methods, acquire() and release().)

Mode: the most frequently occurring value in a set of numbers.

Parsing: reading a piece of text, such as Python source code or a line of a data file, and breaking it into its meaningful parts according to a set of rules.

Postfix clause: An SQL postfix clause is positioned in the second position and will be appended to the SELECT statement, following the WHERE clause. The SQL postfix clause is most commonly used for clauses such as ORDER BY.

Prefix clause: An SQL prefix clause is positioned in the first position and will be inserted between the SELECT keyword and the list of columns. The SQL prefix clause is most commonly used for clauses such as DISTINCT or ALL.

SQL clause: a component of an SQL statement, such as WHERE, ORDER BY, or GROUP BY, that filters, sorts, or groups the data a query returns. When we have large amounts of data stored in the database, we use clauses to query and get just the data required by the user.

SQL expression: An expression is a combination of one or more values, operators, and SQL functions that evaluate to a value. An expression generally assumes the datatype of its components.

SQL keyword: one of the reserved words that are used to perform various operations in the database. There are many keywords in SQL, and because SQL keywords are case-insensitive, it does not matter whether we write, for example, SELECT or select.

Structured query language (SQL): a programming language designed to get information out of and put it into a relational database. Queries are constructed from a command language that lets you select, insert, update and locate data.

Triple quotes: a way of writing strings that span multiple lines, including verbatim newlines, tabs, and any other special characters. The syntax for triple quotes consists of three consecutive single or double quotes.
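To tie a few of these terms together, here is a small sketch using only Python's standard library. It assumes a made-up table of street names and lane counts: a triple-quoted string holds the CSV text, the csv module parses it into records, and statistics.mode finds the most common value.

```python
import csv
import io
from statistics import mode

# Triple quotes let the sample CSV text span several lines.
# The table is made up for illustration.
csv_text = """name,lanes
Main St,2
Oak Ave,4
Elm St,2
"""

# Parse the text: each line becomes one record, each comma separates fields.
reader = csv.DictReader(io.StringIO(csv_text))
records = list(reader)

# CSV fields arrive as strings, so cast them before doing math.
lane_counts = [int(record["lanes"]) for record in records]
most_common_lanes = mode(lane_counts)
```

With a real file, io.StringIO(csv_text) would be replaced by an open file object.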

Review Questions

  1. What are the three different cursors, and what purpose does each one serve?

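In arcpy's data access module, the three cursors are the SearchCursor, which reads rows; the UpdateCursor, which changes or deletes rows; and the InsertCursor, which adds new rows. Database cursors in general Python also offer three ways to fetch the rows of a query result. cursor.fetchall() fetches all the rows of a query result. It returns all the rows as a list of tuples. An empty list is returned if there is no record to fetch.

cursor.fetchmany(size) returns the number of rows specified by the size argument. When called repeatedly, this method fetches the next set of rows of a query result and returns a list of tuples. If no more rows are available, it returns an empty list.

cursor.fetchone() returns a single record, or None if no more rows are available.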
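The three fetch methods can be tried out with Python's built-in sqlite3 module. This is a minimal sketch with a made-up in-memory table; it uses a general database cursor, not an arcpy cursor.

```python
import sqlite3

# Build a small in-memory table with made-up rows
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE counties (name TEXT, pop INTEGER)")
cursor.executemany(
    "INSERT INTO counties VALUES (?, ?)",
    [("A", 10), ("B", 20), ("C", 30), ("D", 40)],
)

cursor.execute("SELECT name, pop FROM counties ORDER BY pop")
first = cursor.fetchone()        # a single row, as a tuple
next_two = cursor.fetchmany(2)   # the next two rows, as a list of tuples
rest = cursor.fetchall()         # every row that is left
nothing = cursor.fetchone()      # no rows remain, so this is None
connection.close()
```

Each call picks up where the previous one stopped, which is why fetchall() here returns only the last row.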

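  2. Explain how data locks occur and how they can be removed.
    1. Update and insert cursors place a lock on the dataset they are editing, and the lock stays in place until the cursor is released. Running the cursor inside a with statement, or deleting the cursor with del, releases the lock. (Python's threading module has a related idea: a Lock object is created in the unlocked state and has two primary methods, acquire() and release().)
  3. When writing SQL expressions, why are quotation marks sometimes an issue?
    1. Single quotes are used to indicate the beginning and end of a string in SQL. Double quotes generally aren’t used for strings in SQL, but that can vary from database to database. Because Python also uses quotes to delimit the string that holds the SQL expression, the two sets of quotes have to be kept distinct. Stick to using single quotes for SQL strings.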
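The quoting issue is easier to see in code. The sketch below again uses sqlite3 with a made-up table, so the exact rules may differ for other databases and for arcpy where clauses.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE cities (name TEXT)")
connection.executemany(
    "INSERT INTO cities VALUES (?)", [("Delaware",), ("O'Brien",)]
)

# Double quotes delimit the Python string; single quotes delimit the SQL string.
plain = connection.execute(
    "SELECT name FROM cities WHERE name = 'Delaware'"
).fetchall()

# An apostrophe inside an SQL string is escaped by doubling it.
escaped = connection.execute(
    "SELECT name FROM cities WHERE name = 'O''Brien'"
).fetchall()

# A ? placeholder avoids the quoting problem altogether.
bound = connection.execute(
    "SELECT name FROM cities WHERE name = ?", ("O'Brien",)
).fetchall()
connection.close()
```

Using placeholders is generally the safer habit whenever the value comes from user input.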

Wade: Chapter 7: Debugging and Error Handling

Terms:

Breakpoint: a marker used to interrupt a running program immediately before the execution of a programmer-specified instruction. This is often referred to as an instruction breakpoint. … Breakpoints can also be used to interrupt execution at a particular time, upon a keystroke, etc.

Commenting out code: using comment syntax to remove something from the parsed code.

Custom class: a developer-defined class, based on one of the stock classes.

Debugging: the process of finding and fixing errors in a program, often by taking step-by-step control over the program's execution.

Error handling: the process of responding to the occurrence of exceptions (anomalous or exceptional conditions requiring special processing) during the execution of a program.

Exception: an event that occurs during the execution of a program that disrupts the normal flow of instructions.

Exception object: the object Python creates when an exception is raised. It carries the error's type and message and can be caught in an except clause.

Logic error: a mistake in a program’s source code that results in incorrect or unexpected behavior

Traceback: a report containing the function calls made in your code at a specific point.
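Several of these terms show up together in even a short script. The following is a minimal sketch, not taken from the book: the class MissingFieldError and both functions are hypothetical names chosen for illustration.

```python
import traceback


class MissingFieldError(Exception):
    """Custom class, based on the stock Exception class."""


def read_population(record):
    """Return the record's population, or raise MissingFieldError."""
    if "pop" not in record:
        raise MissingFieldError("record has no 'pop' field")
    return int(record["pop"])


def safe_population(record):
    """Error handling: return (value, report) instead of crashing."""
    try:
        return read_population(record), None
    except (MissingFieldError, ValueError) as error:
        # 'error' is the exception object; format_exc() gives the traceback text
        report = f"{type(error).__name__}: {error}\n{traceback.format_exc()}"
        return None, report


good_value, good_report = safe_population({"pop": "120"})
bad_value, bad_report = safe_population({"name": "nowhere"})
```

A logic error, by contrast, would raise nothing at all: the script would run to the end and simply produce the wrong number.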

Wade: Chapter 6: Exploring Spatial Data

Terms:

Dynamic (property): an attribute that is defined at runtime, after the object or instance is created.

List comprehension: an easy and compact syntax for creating a list from a string or another list. It is a very concise way to create a new list by performing an operation on each item in the existing list. List comprehension is often faster than building the same list with a for loop.

System path: the path to a file or folder as the operating system recognizes it.

Wildcard: a symbol used to replace or represent one or more characters. Wildcards are used in computer programs, languages, search engines, and operating systems to simplify search criteria.
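As a small illustration of wildcards, Python's standard fnmatch module matches names against patterns in which * stands for any run of characters. The file names below are made up.

```python
import fnmatch

# Made-up file names for illustration
names = ["roads.shp", "rivers.shp", "roads.dbf", "parcels.tif"]

# * matches any run of characters
shapefiles = fnmatch.filter(names, "*.shp")
starts_with_r = fnmatch.filter(names, "r*")

# ? matches exactly one character
five_letter_shp = fnmatch.filter(names, "?????.shp")
```

The arcpy listing functions accept a similar wildcard argument, though this sketch does not use arcpy.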

 

Review Questions:

  1. What are the key differences and similarities between the Describe() and da.Describe() functions?
    1. Both report the properties of a dataset, such as data type, fields, indexes, and many others. arcpy.Describe returns a Describe object whose properties are read with dot notation, while arcpy.da.Describe returns the same information as a dictionary. In both cases the properties are dynamic, meaning that depending on what data type is described, different properties will be available for use.
  2. Explain the difference between system and catalog paths, and how they affect exploring data in a folder.
    1. A system path is one the operating system recognizes, such as a folder or a shapefile on disk. A catalog path is one that only ArcGIS recognizes, such as a feature class inside a geodatabase. Ordinary Python file functions only see system paths, so the contents of a geodatabase have to be listed with arcpy functions. (Separately, an absolute path specifies the location from the root directory, whereas a relative path is given relative to the current directory.)
  3. What is list comprehension, and when would you consider using it?
    1. List comprehension is an easy and compact syntax for creating a list from a string or another list. It is a very concise way to create a new list by performing an operation on each item in the existing list. It is often faster than building the same list with a for loop, so it is worth considering whenever a loop does nothing more than build a list.
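For comparison, here is the same list built with a for loop and with a list comprehension, followed by a comprehension with a filter. The layer names are made up.

```python
names = ["roads", "rivers", "parcels"]

# For-loop version: create an empty list, then append to it
upper_loop = []
for name in names:
    upper_loop.append(name.upper())

# List comprehension version: one line, same result
upper_comp = [name.upper() for name in names]

# A comprehension can also filter items with an if clause
r_names = [name for name in names if name.startswith("r")]
```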

Wade: Chapter 5: Geoprocessing Using Python

Terms

Class: like an outline for creating a new object

Environment: in arcpy, a setting, such as the current workspace, that affects how geoprocessing tools run. In general Python, an environment is also a tool that helps to keep dependencies required by different projects separate.

Factory Code: Factory method is a creational design pattern which solves the problem of creating product objects without specifying their concrete classes. Factory Method defines a method, which should be used for creating objects instead of a direct constructor call (new operator).

Function: a block of code that only runs when it is called.

Hard-Coded: a value written directly into the code, so that it stays unchanged unless the code itself is edited.

Instance: An individual object of a certain class.

Method: a function that "belongs to" an object.

Namespace: a system that ensures every object in Python has a unique name.

Object: Everything in Python is treated as an object, including variables, functions, lists, tuples, dictionaries, sets, etc. Every object belongs to its class.

Package: A package is basically a directory with Python files and a file with the name __init__.py. This means that every directory inside of the Python path which contains a file named __init__.py will be treated as a package by Python.

Property: a named attribute of an object, read with dot notation. Python's built-in property() function creates a property of a class.

Well-known ID (WKID): a unique number assigned to a coordinate system.

Well-known text (WKT): an Open Geospatial Consortium (OGC) standard that is used to represent spatial data in a textual format.

Workspace: the default location, such as a folder or geodatabase, where geoprocessing tools look for inputs and place outputs.
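The object-related terms above (class, instance, method, property, function) can be seen side by side in a short sketch. The Raster class here is a hypothetical example written for these notes, not an arcpy class.

```python
class Raster:
    """A class: the outline for creating raster objects."""

    def __init__(self, name, rows, columns):
        self.name = name
        self.rows = rows
        self.columns = columns

    @property
    def cell_count(self):
        """A property: read like an attribute, computed when accessed."""
        return self.rows * self.columns

    def describe(self):
        """A method: a function that belongs to the object."""
        return f"{self.name}: {self.rows} x {self.columns}"


def make_square_raster(name, size):
    """A plain function: a block of code that only runs when called."""
    return Raster(name, size, size)


dem = make_square_raster("dem", 10)   # dem is an instance of Raster
summary = dem.describe()              # method call, using dot notation
cells = dem.cell_count                # property access, no parentheses
```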

 

Review Questions

 

  • Explain some of the uses of the "Result" object.
    • The result object can be used as an input to another function. The result object also has properties and methods.

 

 

  • Why are classes used as input parameters for geoprocessing tools?
    • Classes are often used as shortcuts for tool parameters that would otherwise have a more complicated equivalent.

 

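  • What are some of the typical environments set in a script?
    • Commonly set environments include the current workspace (arcpy.env.workspace) and whether existing outputs may be overwritten (arcpy.env.overwriteOutput). The arcpy functions that manage environment settings are:

      1. ClearEnvironment
      2. ListEnvironments
      3. LoadSettings
      4. ResetEnvironments
      5. SaveSettings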

 

  • Explain how Pro is licensed and how this impacts handling licensing when writing scripts.
    • You must have a license to run geoprocessing tools, which includes running a stand-alone python script that uses these tools. You will receive an error message if you try to run a tool that you do not have a license for, such as Spatial Analyst tools.

Wade: Chapter 4: Learning Python Language Fundamentals

Mack Wade

Terms

Boolean: denoting a system of algebraic notation used to represent logical propositions, especially in computing and electronics.

Boolean Expression: In computer science, a Boolean expression is an expression used in programming languages that produces a Boolean value when evaluated. A Boolean value is either true or false.

Boolean Logic: In mathematics and mathematical logic, Boolean algebra is the branch of algebra in which the values of the variables are the truth values true and false, usually denoted 1 and 0, respectively.

Boolean Operator: Boolean operators are simple words (AND, OR, NOT; written and, or, not in Python) used to combine or exclude conditions. In a search, for example, they combine or exclude keywords, resulting in more focused and productive results and fewer inappropriate hits to scan and discard.

Built-in Operator: symbols built into the language, such as +, -, *, and /, that perform arithmetic computations on their operands. For example, + returns the result of adding its two operands (operand1 and operand2).

Camel Case: Camel case is the practice of writing phrases without spaces or punctuation, indicating the separation of words with a single capitalized letter, and the first word starting with either case. Common examples include “iPhone” and “eBay”

Casting: Casting is when you convert a variable value from one type to another.

Comparison Operator: an operator that compares two values and returns true or false. In Python the comparison operators are >, <, >=, <=, ==, and !=.

Condition: The boolean expression in a conditional statement that determines which branch is executed.

Docstring: a string literal specified in source code that is used, like a comment, to document a specific segment of code

Dot Notation: You can access properties on an object by specifying the name of the object, followed by a dot (period) followed by the property name.

Escape character: a character that invokes an alternative interpretation on the following characters in a character sequence.

Float: a data type composed of a number that is not an integer, because it includes a fraction represented in decimal format.

Floor Division: a division operation, written //, that rounds the result down to the largest whole number that is less than or equal to it.

F-String: a string literal prefixed with f that provides a way to embed expressions inside string literals, using a minimal syntax.

Immutable: unchanging over time or unable to be changed.

Indexing: a way to refer to the individual items within an iterable by their position.

Modulus: The Python modulo operator calculates the remainder of dividing two values. This operator is represented by the percentage sign (%). The syntax for the modulo operator is: number1 % number2. The first number is divided by the second, then the remainder is returned.

Parameter: the variable listed inside the parentheses in the function definition.

Sentry Variable: a variable used in the condition and it is compared to some other value or values

Snake Case: refers to the style of writing in which each space is replaced by an underscore character, and the first letter of each word written in lowercase. It is a commonly used naming convention in computing, for example for variable and subroutine names, and for filenames.

String: an immutable sequence data type. It is the sequence of Unicode characters wrapped inside single, double, or triple quotes

True Division: division with the / operator, which keeps the fractional part of the result and always returns a float.

Unicode: formally the Unicode Standard, an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems.

Whitespace: characters which are used for spacing, and have an "empty" representation. In the context of Python, it means tabs and spaces.

Zero-Based Language: Zero-based numbering is a way of numbering in which the initial element of a sequence is assigned the index 0, rather than the index 1 as is typical in everyday non-mathematical or non-programming circumstances
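Many of the terms above can be checked directly in a few lines of Python. The values are arbitrary and chosen only for illustration.

```python
true_quotient = 7 / 2      # true division keeps the fraction
floor_quotient = 7 // 2    # floor division rounds down
negative_floor = -7 // 2   # rounds down, toward negative infinity
remainder = 7 % 2          # modulus: the remainder of the division

as_int = int("42")         # casting a string to an integer
as_text = str(3.5)         # casting a float to a string

word = "python"
first_letter = word[0]     # zero-based indexing: position 0 is the first item
last_letter = word[-1]     # negative indexes count from the end

label = f"{word} has {len(word)} letters"     # f-string
two_lines = "line one\nline two"              # \n is an escape character
is_long_word = len(word) > 5 and word != "R"  # Boolean expression
```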

Review Questions

  1. What are the main data types in Python?
    1. The most common ones are float (floating point), int (integer), str (string), bool (Boolean), list, and dict (dictionary). float – used for real numbers. int – used for integers.
  2. What are unicode strings?
    1. A Unicode string is a sequence of zero or more code points.
  3. What is dot notation?
    1. In general, dot notation tells Python to look inside the space that is before the dot for code to execute. You can use dot notation to access the specific version of a certain function that is defined in a different class or a different module.
  4. Name three methods of string objects in Python.
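    1. Three commonly used string methods are upper(), which returns an uppercase copy of the string; replace(), which swaps one substring for another; and split(), which breaks a string into a list. (The static method, the class method, and the instance method are the three kinds of methods a class can define, which is a different question.)
  5. What are some of the key similarities and differences between lists and tuples in Python?
    1. Lists and tuples are both data structures that can store one or more objects or values in order. A list is used to store multiple items in one variable and is created using square brackets. Similarly, a tuple can store multiple items in a single variable and is declared using parentheses. The key difference is that a list can be changed after it is created, while a tuple is immutable.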
  6. Describe the use of Boolean expressions to evaluate a condition.
    1. A Boolean expression is a logical statement that is either TRUE or FALSE . Boolean expressions can compare data of any type as long as both parts of the expression have the same basic data type. You can test data to see if it is equal to, greater than, or less than other data.
  7. Describe the steps to create a dictionary and add new items to the dictionary.
    1. A dictionary is created by placing a sequence of elements within curly braces {}, separated by commas. Each element is a key:value pair, with the key on the left and its corresponding value on the right. A new item is added by assigning a value to a new key, in the form dictionary[key] = value.
  8. Describe how branching is implemented in Python code.
    1. Branching statements in Python are used to change the normal flow of execution based on some condition. The main branching structure is the if statement, optionally followed by elif and else, where each branch is an indented block that runs only when its condition is met. The return statement is used to explicitly return from a function. A break statement is used to break out of a loop and transfer control to the line immediately after the loop.
  9. What is a compound statement in Python?
    1. Compound statements are made up of two or more program statements that are executed together. This usually occurs while handling conditions wherein a series of statements are executed when a TRUE or FALSE is evaluated. Compound statements can also be executed within a loop.
  10. Describe the two main looping structures in Python.
    1. "for" and "while". A for loop iterates over the items of a sequence, such as a list or the numbers produced by the range function. A while loop repeats as long as its condition remains true. In Python 2 there were both range and xrange: range returned a new list with the numbers of the specified range, whereas xrange returned an iterator, which is more efficient. In Python 3, range itself produces its numbers on demand and xrange no longer exists.
  11. Why should you add comments to your scripts?
    1. Using comments in any script or code is very important to make the script more readable. Comments work as a documentation for the script. The reader can easily understand each step of the script if it is properly commented by the author.
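Several of these answers can be checked with a short script. This is a minimal sketch with made-up county values; it creates a dictionary and adds an item, contrasts a list with a tuple, and uses both looping structures with if/elif/else branching.

```python
populations = {"Delaware": 100, "Franklin": 300}  # made-up values
populations["Union"] = 50                          # add a new key:value pair

layers = ["roads", "rivers"]   # list: can be changed
layers.append("parcels")
coordinates = (40.3, -83.1)    # tuple: cannot be changed

try:
    coordinates[0] = 0
    tuple_changed = True
except TypeError:
    tuple_changed = False

# for loop with if / elif / else branching
sizes = {}
for county, count in populations.items():
    if count >= 200:
        sizes[county] = "large"
    elif count >= 100:
        sizes[county] = "medium"
    else:
        sizes[county] = "small"

# while loop: repeats until the condition becomes false
countdown = 3
ticks = []
while countdown > 0:
    ticks.append(countdown)
    countdown -= 1

# three string methods
shout = "main st".upper()
swapped = "main st".replace("st", "street")
parts = "main st".split()
```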

McConkey Set-up Procedures and ArcGIS Pro

The first thing I did was buy Python Scripting for ArcGIS Pro by Paul A. Zandbergen from Amazon. It arrived within a couple of days and I was able to start reading chapter one. After reading chapters 1 and 2, I downloaded ArcGIS Pro with the files provided by Dr. Krygier. The download took around 30 minutes to complete. Once it was downloaded, I made a new folder and transferred the data to the folder. I then extracted all the data, which took several minutes. Afterwards, I went through the data files and clicked on the ones that I thought would open the program. I honestly do not remember what the file was called, but eventually I got the software installed. It is probably a good idea to add ArcGIS Pro to your taskbar at the bottom of the screen. I did this by right-clicking the application on the desktop and choosing the option “Pin to taskbar.”

After I installed ArcGIS Pro, I logged into Esri and looked up training modules for Pro. I was able to find a free one that would take around 3 hours to complete. I downloaded the accompanying data and got started. I did not finish the module, partially because I was already familiar with ArcMap. However, the course was very interesting and I saw how much more intuitive Pro is compared to its older counterpart. For instance, you can start a project from several premade templates, which will also automatically make a project folder for various related elements such as geodatabases, layouts, maps, and toolboxes.

Another upgrade is that ArcGIS Pro is context-sensitive. When you click on layers of different types, the ribbon above may change to show ways to edit the appearance of that layer or tools specific to that layer type. This makes editing the appearance of layers and labeling them more convenient. It also makes it easier to learn how the different types of layers can be edited. You can still find tools in the toolbox, but now you can put tools into a favorites folder for easier access. This is useful when you plan on performing the same or similar tasks in a project. Learning new software is difficult, but I believe many of the new features of ArcGIS Pro ultimately make the GIS experience more convenient.

Wade: Chapter 3: Geoprocessing in ArcGIS Pro

Mack Wade

This chapter was pretty easy to follow along with. I had never used ModelBuilder before, so that was interesting.

  • Batch Mode: a way of running a geoprocessing tool multiple times, using many input datasets or different parameter settings, with very little interaction.
  • Batch Processing: Computerized batch processing is the running of "jobs that can run without end user interaction, or can be scheduled to run as resources permit."
  • Current Workspace: the workspace from which inputs are taken and in which outputs are placed when running tools.
  • Geoprocessing: Geoprocessing is a framework and set of tools for processing geographic and related data.
  • Iteration: repetition of a mathematical or computational procedure applied to the result of a previous application, typically as a means of obtaining successively closer approximations to the solution of a problem.
  • ModelBuilder: ModelBuilder is a visual programming language for building geoprocessing workflows.

Review Questions

  • Describe some of the general elements of the geoprocessing framework in ArcGIS Pro.
    • A collection of tools; methods to find and execute tools; environment settings and other geoprocessing options that control how tools are run; the Python window; and the geoprocessing history.
  • What are the three types of tools in ArcGIS Pro?
    • Built-in tools, script tools, and model tools. Tools are also grouped into system tools and custom tools.
  • Explain the difference between system tools and custom tools.
    • System tools are created by Esri.
    • Custom tools are created by a user or third party.
  • What are the strengths and limitations of batch processing in Pro?
    • Geoprocessing tools can be run in a batch mode that allows you to run the tool multiple times using many input datasets or different parameter settings. This makes it possible to run a tool many times with very little interaction.
  • What are some of the similarities and differences between models, tools, and script tools in Pro?
    • Model tools are built in ModelBuilder and help you run multiple tools as one workflow.
    • Script tools help you create your own detailed tool from a Python script.