Multiple Regressions with Python

Multiple Regression and Model Building: Introduction. In the last chapter, we ran a simple linear regression on cereal data to see whether a cereal’s nutritional rating was related to its sugar content. It was. But with all this other data, like fiber(!), we want to see which other variables are related to the rating, both in combination and on their own. Multiple regression seems like a friendly tool for this, so that’s what we’ll use here. [Read More]
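
To make the idea concrete, here is a minimal sketch of a multiple regression in plain Python, fitting a rating against two predictors at once. The sugar and fiber values, the ratings, and the helper names `solve` and `fit_multiple` are all invented for illustration. The post itself may well use a statistics library and the real cereal data, so treat this as a sketch of the technique, not the post's method.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(rows, y):
    """Return least-squares coefficients [intercept, b1, b2, ...].

    Solves the normal equations (X'X) beta = X'y, where X is the
    predictor matrix with a leading column of ones for the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical (sugar, fiber) pairs, NOT the real cereal data.
rows = [(0, 1), (3, 0), (6, 4), (9, 2), (12, 5), (15, 1)]
# Invented ratings, generated from rating = 50 - 2*sugar + 3*fiber,
# so the fit should recover coefficients close to 50, -2, and 3.
ratings = [53.0, 44.0, 50.0, 38.0, 41.0, 23.0]

intercept, b_sugar, b_fiber = fit_multiple(rows, ratings)
print(f"rating = {intercept:.2f} + ({b_sugar:.2f})*sugar + ({b_fiber:.2f})*fiber")
```

Because the invented ratings follow an exact linear rule, the fitted coefficients match that rule up to rounding. Real data would leave residuals, and the normal equations can become numerically fragile when predictors are strongly correlated.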

Cereal Regression with Python

Simple Linear Regression: Cereal Nutritional Rating against Sugar Content. Being the cereal enthusiasts we are, we might want to know what sort of relationship exists between a cereal’s nutritional rating and its sugar content. A simple linear regression can tell us. With a linear model, we can also take any given cereal’s sugar content and estimate its nutritional rating. [Read More]
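
As a rough illustration of what fitting and using such a line looks like, the sketch below computes the ordinary least-squares slope and intercept by hand and then estimates a rating from a sugar value. The data points and the names `fit_line` and `predict` are made up for this example and are not taken from the post or the cereal dataset.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical sugar contents (grams) and ratings, for illustration only.
sugars = [0, 3, 6, 9, 12, 15]
ratings = [60.0, 52.0, 45.0, 38.0, 31.0, 24.0]

slope, intercept = fit_line(sugars, ratings)


def predict(sugar):
    """Estimate a rating from sugar content using the fitted line."""
    return intercept + slope * sugar


print(f"rating = {intercept:.2f} + ({slope:.2f})*sugar")
print(f"estimated rating at 7 g of sugar: {predict(7):.2f}")
```

With these invented numbers the slope comes out negative, meaning more sugar goes with a lower estimated rating. Whether the real data behaves the same way is for the full post to show.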

A Simpler Tutorial on Jupyter (IPython) Widgets

Jupyter widgets are an awesome tool for creating interactive dashboards, but documentation can be a little excessive if you’re just looking for basic functionality. It really doesn’t have to be so complicated. Our widget use case: My data visualization team wanted to provide researchers with a GUI for visualizing species distributions, and we wanted to give them the power and flexibility to specify different parameters. [Read More]

An Introduction to the Stylo Library [R]

What is Stylometry? Stylometry uses linguistic style to determine who authored an anonymous piece of writing, and it has diverse applications. The authorship of some suicide notes may be questionable. Most forum users adopt aliases to anonymize themselves. And some authors publish their writings under pseudonyms. In each of these cases, stylometry can be used to deanonymize an author. What is Stylo? [Read More]

Path Distance Analysis [GIS]

(This is a fictitious scenario.) Let’s suppose we’re a team of highly skilled NGA agents. We just received a briefing about some suspicious activity between two rival factions. These factions are typically hostile towards each other, but a trusted informant has brought forward some intel suggesting the two factions have been cooperating. Our team must assist in verifying these claims. Wielding the all-powerful eye in the sky (or, an array of US Government satellites), we will first identify which outposts are linked to each faction, and we will then infer the inter- and intra-faction transportation networks. [Read More]

Sentiment Analysis of First GOP Debate in 2015 [R]

A Statistical Analysis of Sentiments. Every four years, the United States goes through the process of electing (or re-electing) its president. Politics becomes a popular topic of conversation and, inadvertently, a popular emotional outlet. Our digital landscape has given people the ability to express their thoughts and feelings on political ideas and events en masse, mostly in text but sometimes by video or podcast. Suffice it to say, an examination of the language people use in these expressions can yield many useful insights into human (or American) character and the influence of rhetoric on political philosophy and national pride. [Read More]

Projecting with Python [GIS, Python]

My Introduction to GIS with Python. Python is a powerful tool in the GIS world, so I wanted to get a little bit of practice with it. I have had a lot of fun working with the Global Terrorism Database, so I figured I would convert it from its CSV format to one that is better supported by GIS: the shapefile. The dataset contains information related to terrorist attacks, including attack locations. [Read More]

Visibility Analysis [GIS]

This post is intended to be informative, and not so much reproducible. Viewshed Analysis. A viewshed is the area that is visible to an observer at a given location. The problem of visibility lends itself to many domains: a guard on a prison tower must see the entire prison yard to ensure everyone’s safety; a scout must maximize visibility of a battlefield for intelligence gathering; autonomous vehicles must have 360-degree visibility; and cell towers must provide maximum coverage. In this post, we will turn our attention to the ancient world, where according to Christopherson and Guertin*, “fortified sites were often located in order to visually control their territory, sacred sites might be located to provide views of other sacred sites, and the settlement patterns of hinterland sites might be located to facilitate, or to impede, visual communication. [Read More]
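
For readers who want a feel for how visibility can be computed, the sketch below runs a simple line-of-sight test along a single terrain profile: a cell is visible when the slope of the sight line to it is at least as steep as the slope to every cell in between. This is a one-dimensional simplification written for illustration. The function name `visible_cells`, the `eye_height` parameter, and the elevation values are assumptions, and GIS software computes viewsheds over a full two-dimensional elevation raster.

```python
def visible_cells(elevations, observer_index, eye_height=1.7):
    """Mark which cells of a 1-D terrain profile the observer can see.

    elevations: terrain heights at evenly spaced cells.
    observer_index: the cell the observer stands on.
    eye_height: observer's eye level above the terrain (assumed value).
    """
    eye = elevations[observer_index] + eye_height
    visible = [False] * len(elevations)
    visible[observer_index] = True

    # Scan outward in both directions from the observer.
    for step in (1, -1):
        max_slope = float("-inf")  # steepest sight line seen so far
        i = observer_index + step
        while 0 <= i < len(elevations):
            distance = abs(i - observer_index)
            slope = (elevations[i] - eye) / distance
            # A cell is visible if no nearer cell rises above its sight line.
            if slope >= max_slope:
                visible[i] = True
                max_slope = slope
            i += step
    return visible


# Hypothetical elevation profile, with the observer on the first cell.
profile = [10, 5, 8, 3, 2, 9]
print(visible_cells(profile, observer_index=0, eye_height=0.0))
```

In this made-up profile, the rise at the third cell hides the two lower cells behind it, while the taller final cell pokes back into view.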

Terrorist Targets [Exploratory]

If you run some Google searches for terrorist attacks and specify the terrorist organization (e.g., Boko Haram or the Lord’s Resistance Army), you will likely see different themes in the listed results. Going to a page that aggregates LRA attacks, you get: and for Boko Haram: Assuming these sources weren’t cherry-picked, we might infer that the LRA tends to attack people and their property with assault rifles, while Boko Haram might be more likely to attack crowded public places with explosives. [Read More]