Tor Stem API

Tor Stem Controller

Just in case somebody out there is looking for an example of how to use Tor’s Stem API, here is a tiny Python script that lets the user monitor circuit connections. Here, I’m just getting basic information like description, nickname, and address, but the API docs show that much more is available. Just be mindful that you will need to change the string passed to the authenticate method. [Read More]
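For a rough idea of what such a script can look like, here is a minimal sketch. It is not the script from the post. It assumes Tor's ControlPort is open on 9051 with password authentication, and the names `describe_circuits`, `main`, and the `"my_password"` placeholder are made up for illustration.

```python
def describe_circuits(controller):
    """Return one summary line per circuit, plus one line per relay in it."""
    lines = []
    for circ in controller.get_circuits():
        if str(circ.status) != "BUILT":
            continue  # skip circuits that are still being built or are closed
        lines.append(f"Circuit {circ.id} ({circ.purpose})")
        for fingerprint, nickname in circ.path:
            # Consensus entry for the relay; None if Tor has no record of it.
            entry = controller.get_network_status(fingerprint, None)
            address = entry.address if entry else "unknown"
            lines.append(f"  {nickname} {fingerprint} {address}")
    return lines


def main():
    # Imported here so the helper above can be read and tested without Stem.
    from stem.control import Controller

    # Assumption: ControlPort 9051 and a hashed control password are configured.
    with Controller.from_port(port=9051) as controller:
        controller.authenticate("my_password")  # placeholder, use your own
        print("\n".join(describe_circuits(controller)))
```

Calling `main()` with Tor running prints each built circuit followed by the nickname, fingerprint, and address of every relay on its path. The string given to `authenticate` is the one you would need to change.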

Sklearn Neural Network

Artificial Neural Networks with scikit-learn

The Gist of Neural Nets

A neural network is a supervised learning algorithm, used here for classification. With your help, it kind of teaches itself how to make better classifications. A basic neural net has three primary components: an input layer, a hidden layer, and an output layer, each consisting of nodes. The nodes of the input layer are basically your input variables; the nodes of the hidden layer are neurons, each applying some function to your input data; and, in this basic setup, there is one output node, which applies a function to the values given by the hidden layer and puts out one final calculation. [Read More]
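The three components described above can be sketched as a single forward pass in plain Python. This is only an illustration of the layer structure, not scikit-learn's implementation: the sigmoid activation, the function names, and every weight and input value are assumptions picked for the example, and no training happens here.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Run one input vector through a single hidden layer and one output node."""
    # Each hidden neuron takes a weighted sum of the inputs, then applies sigmoid.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The single output node does the same with the hidden-layer values.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Made-up values: two input variables, three hidden neurons, one output node.
inputs = [0.5, -1.2]
hidden_weights = [[0.4, -0.6], [0.3, 0.8], [-0.5, 0.1]]
hidden_biases = [0.0, 0.1, -0.2]
output_weights = [0.7, -0.3, 0.5]
output_bias = 0.05

prediction = forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias)
print(round(prediction, 3))
```

Training is the part where the network "teaches itself": the weights and biases get adjusted so that `prediction` moves closer to the known labels.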

k-Nearest Neighbors Classifier

Classification using K-Nearest Neighbor (KNN)

    import numpy as np
    import pandas as pd
    from sklearn.neighbors import KNeighborsClassifier
    from IPython.display import display
    pd.set_option('display.notebook_repr_html', True)

Prescription Drug Classification

KNN bases its classifications on the k nearest neighbors. A neighbor’s “nearness” is based on its attributes, or predictors. In the example used here, the attributes are simple: every patient at a hospital has an age attribute and a Na/K ratio attribute. Based on those attributes, a patient is assigned a classification (a type of drug). [Read More]
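To make the voting idea concrete, here is a from-scratch sketch in plain Python rather than the `KNeighborsClassifier` call the post imports. The patient values and the labels `"drug A"` and `"drug B"` are invented for illustration, and the helper name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((age, na_k_ratio), label) pairs.
    """
    # Sort the training points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Invented patients: (age, Na/K ratio) -> prescribed drug.
patients = [
    ((25, 28.0), "drug A"),
    ((30, 26.5), "drug A"),
    ((28, 30.1), "drug A"),
    ((60, 9.0), "drug B"),
    ((65, 11.2), "drug B"),
    ((58, 8.4), "drug B"),
]

print(knn_predict(patients, (27, 27.0)))
print(knn_predict(patients, (62, 10.0)))
```

One caveat with raw attributes: age and Na/K ratio sit on different scales, so whichever has the larger range tends to dominate the distance. Rescaling the predictors first is a common remedy.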

Multiple Regressions with Python

Multiple Regression and Model Building

Introduction

In the last chapter we ran a simple linear regression on cereal data. We wanted to see if there was a relationship between a cereal’s nutritional rating and its sugar content. There was. But with all this other data, like fiber(!), we want to see which other variables are related to the rating, both in conjunction with and without each other. Multiple regression seems like a friendly tool we can use to do this, so that’s what we’ll be doing here. [Read More]

Cereal Regression with Python

Simple Linear Regression

Cereal Nutritional Rating against Sugar Content

Being the cereal enthusiasts we are, we might be interested in knowing what sort of relationship exists between a cereal’s nutritional rating and its sugar content. For that, we can turn to a simple linear regression. Using a linear model, we can also take any given cereal’s sugar content and estimate its nutritional rating. [Read More]
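The fit-then-estimate idea can be sketched with the ordinary least squares formulas for a single predictor. This is a plain-Python illustration, not the code from the post: the sugar and rating numbers are made up, and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Estimate y for a new x using the fitted line."""
    return intercept + slope * x


# Made-up values: grams of sugar per serving and nutritional rating.
sugars = [0, 3, 6, 9, 12]
ratings = [60, 52, 45, 38, 30]

slope, intercept = fit_line(sugars, ratings)
print(f"rating = {intercept:.2f} + {slope:.2f} * sugars")
print(f"estimated rating at 5 g of sugar: {predict(slope, intercept, 5):.2f}")
```

A negative slope here would mean that the estimated rating drops as sugar content rises.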

Exploratory Analysis of Churn Data

Exploratory Data Analysis

Conquering Earth by Phone

It’s the year 3000 and we’re in the Futurama universe. Per usual, Lrrr has been up to no good. Scheming, he had a breakthrough. He will use Earth’s telephone services to recruit an army to conquer the planet. Unfortunately, he has many people leaving his service and joining competitors. To carry out his plan, he needs to get a better understanding of what is causing people to leave, and he will do this by employing an analyst. [Read More]

Quick Notes on Decision Trees

Decision Trees in Classification

The Gist of Decision Trees

Decision trees aren’t the most accurate method of classification, as they often overfit, but they’re still a very intuitive way of understanding how classification works. A decision tree is basically a collection of decision nodes, where if-then rules decide which branch to follow. Each tier of nodes is a step down in the significance of its information: the first node will likely contain the most useful information for the model, the second tier of nodes will contain the second most useful information, and so forth. [Read More]
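One common way to measure which attribute is "most useful" for the first node is information gain, the drop in entropy after splitting on that attribute. The sketch below assumes that measure (other tree algorithms use alternatives such as Gini impurity), and the tiny dataset and the name `best_root` are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy removed by splitting the rows on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_root(rows, labels):
    """Pick the feature with the highest information gain for the first node."""
    return max(rows[0], key=lambda f: information_gain(rows, labels, f))


# Invented data: "outlook" separates the classes perfectly, "windy" not at all.
rows = [
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
]
labels = ["play", "play", "stay", "stay"]

print(best_root(rows, labels))
```

The same scoring would then be repeated within each branch to choose the second tier of nodes, which is why usefulness tends to fall off as the tree gets deeper.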

A Simpler Tutorial on Jupyter (IPython) Widgets

Jupyter widgets are an awesome tool for creating interactive dashboards, but the documentation can be a little excessive if you’re just looking for basic functionality. It really doesn’t have to be so complicated.

Our widget use-case

My data visualization team wanted to provide researchers with a GUI for visualizing species distributions, and we wanted to give them the power and flexibility to specify different parameters. [Read More]

An Introduction to the Stylo Library [R]

What is Stylometry?

Stylometry uses linguistic style to determine who authored an anonymous piece of writing, and it has diverse applications. The authorship of some suicide notes may be questionable. Most forum users adopt aliases in an attempt to anonymize themselves. And some authors publish their writings under pseudonyms. In each of these cases, stylometry can be used to deanonymize an author.

What is Stylo? [Read More]

Projecting with Python [GIS, Python]

My Introduction to GIS with Python

Python is a powerful tool in the GIS world, so I wanted to get a little bit of practice with it. I have had a lot of fun working with the Global Terrorism Database, so I figured I would convert it from its CSV format to one that is better supported by GIS: the shapefile. The dataset contains information related to terrorist attacks, including attack locations. [Read More]