## R: Converting to Numeric Part II

While a one-liner to convert categorical values to numerics was mentioned in a previous post, this post covers a better way to achieve similar results (using levels). Another way to convert data to numeric is with the as.numeric function. In R, you can access a column of a … Continue Reading

## R: Converting to Numeric 0,1

When using a dataset, I sometimes want to convert a column of two values (such as Gender) to numerical values such as 0 and 1. Let’s take this dataset. It’s called “Mall_Customers” and it has a column called “Gender.” Gender has values like “Female” and “Male.” To subset the column from the … Continue Reading

## R: Decision Making

While taking the Udemy course on Data Analysis with R by Sandeep Kumar, I took one of his tests. It’s a very simple question, but it gets a person thinking and running various tests in RStudio with R. Explaining R is beyond the scope of this simple post. As a … Continue Reading

## Descriptive Statistics: Covariance

One way of checking the relationship between two variables is with covariance. It indicates whether the two variables tend to move together. A related post is Dispersion and Variance. Population Covariance: The formula for population covariance is: … Sample Covariance: If the covariance value (between variables) … Continue Reading
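As a rough illustration of the two quantities named in this excerpt, here is a minimal Python sketch using the standard textbook definitions: the population version divides by n, the sample version by n - 1. The function names and the sample numbers are made up for this example and do not come from the original post.

```python
def population_covariance(xs, ys):
    """Average product of deviations from the means, dividing by n."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def sample_covariance(xs, ys):
    """Same sum of products, but dividing by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data: ys rises whenever xs rises.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

print(population_covariance(xs, ys))  # 4.0
print(sample_covariance(xs, ys))      # 5.0
```

A positive value means the two variables tend to move in the same direction, and a negative value means they tend to move in opposite directions. The magnitude depends on the units of the data, so it is not a standardized measure of strength.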

## Measures of Asymmetry

Building off the article below, I wanted to continue the discussion of right and left skewness. Specifically, I want to show how to calculate skewness in Python and how to visualize asymmetry in Python. Skewness: Right skewness is when the mean is larger than the median. The tail is longer towards the right … Continue Reading
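The mean-versus-median rule in this excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the data are hypothetical, and the moment-based coefficient (third central moment divided by the 1.5 power of the second) is one common definition of skewness, not necessarily the one the original post uses.

```python
import statistics


def skew_direction(values):
    """Rule of thumb: compare the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"
    if mean < median:
        return "left"
    return "none"  # mean equals median; the rule shows no skew


def moment_skewness(values):
    """Population moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical data with one large value pulling the tail to the right.
data = [1, 2, 2, 3, 3, 3, 4, 20]
print(skew_direction(data))       # right (mean 4.75 > median 3)
print(moment_skewness(data) > 0)  # True
```

The comparison of mean and median is a quick check rather than a guarantee. The moment coefficient gives a signed number, positive for a longer right tail and negative for a longer left tail.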

## Bar and Scatter Plots

In the above code, I’ve got a data frame of some fake car data. I have five cars, with column values for Top Speed, Price, MPG, and Deaths per Year. To plot this as a bar graph, we can combine the column data into one plot using Pandas: I’ve not … Continue Reading

## Descriptive Statistics: Cross Tables with Python

In a previous post, frequency distribution tables were briefly discussed. In this post, I’m going to quickly show how to build a cross table in Python Pandas. Cross Tables: Unlike a spreadsheet where each row is an observation, a cross tab (or cross table) gives a tally of multiple variables. Continue Reading
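To make the idea of a tally across two variables concrete, here is a dependency-free Python sketch. It is not the Pandas approach the post describes (in Pandas, `pd.crosstab` produces this kind of table); the `cross_table` helper, the column names, and the observations below are all hypothetical.

```python
from collections import Counter


def cross_table(rows, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # Nested dict: outer keys are row labels, inner keys are column labels.
    return {
        r: {c: counts.get((r, c), 0) for c in col_labels}
        for r in row_labels
    }


# Hypothetical observations, one dict per row of a spreadsheet.
observations = [
    {"gender": "Female", "smoker": "No"},
    {"gender": "Male", "smoker": "Yes"},
    {"gender": "Female", "smoker": "Yes"},
    {"gender": "Female", "smoker": "No"},
    {"gender": "Male", "smoker": "No"},
]

table = cross_table(observations, "gender", "smoker")
for label, row in table.items():
    print(label, row)
# Female {'No': 2, 'Yes': 1}
# Male {'No': 1, 'Yes': 1}
```

Each cell holds the number of observations that share both values, which is the difference from the one-row-per-observation layout of the raw data.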

## Plotting a Histogram

In a previous post about frequency distribution tables, a lot of pre-work was done to set up frequency and relative frequency distributions in tabular form. This time, I’ll take the same data set and plot a histogram to show the frequency distribution visually. Unlike tabular data, … Continue Reading

## Descriptive Statistics: Frequency Distribution Table

A frequency distribution table is much like a spreadsheet. This is where we have one-way data represented in table form. Figure 1 above shows a frequency distribution table. Specifically, it shows a tally of suicides per year by country. Each row is an observation and each column … Continue Reading