While a one-liner to convert categorical values to numerics was mentioned in a previous post, this post covers a better way to achieve similar results, using levels.
Another way to convert data to numeric is with the as.numeric function.
In R, you can access a column of a data frame using the $ (dollar sign) syntax. RStudio can even auto-complete this for you.
# Converting a list to a category:
> gender.factor <- factor(genderlist$Gender)
# Or we can convert a data frame to a category:
> gender.factor <- factor(gender$Gender)
In the example above, I first use a list (genderlist). I also show the same conversion with a data frame (gender), supplying the column header in each case.
Next I created a factor from that list (or data frame). In R, a factor is a categorical type. Once the values “Male” and “Female” are explicitly treated as categories, we can convert them to numbers.
Notice the $Gender syntax: factor(df$Gender)
Here Gender is the column name in the list or data frame. What’s nice about this, unlike in Pandas, is that the IDE auto-completes the column names by raising a contextual menu as you type after the $ sign.
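To make the step above concrete, here is a minimal sketch. The contents of the gender data frame are not shown in this post, so the sample values below are an assumption picked only for illustration.

```r
# Hypothetical sample data; the real contents of `gender` are not shown in the post.
gender <- data.frame(
  Gender = c("Male", "Male", "Female", "Female",
             "Female", "Female", "Female", "Female"),
  stringsAsFactors = FALSE
)

# Access the column with $ and convert it to a factor.
gender.factor <- factor(gender$Gender)

# Confirm the type and list the categories R found.
print(is.factor(gender.factor))
print(levels(gender.factor))
```

With the default settings, factor() collects the unique values and sorts them, so levels() should list Female before Male.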
Once we have a factor, we can run the as.numeric function like so:
> gender.numeric <- as.numeric(gender.factor)
> gender.numeric
[1] 2 2 1 1 1 1 1 1
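There we get our output of 1 for Female and 2 for Male. The numeric assignment is done automatically. The distinct categories are called levels; by default R sorts them alphabetically, and as.numeric returns each value’s position in that sorted list. In this case there are 2 levels.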
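One way to see which number goes with which category is to line the levels up next to their positions. This is a small sketch, not the only way to do it; the input values are an assumption chosen to reproduce the output shown earlier, and the codes table is a hypothetical helper.

```r
# Hypothetical values chosen to match the output shown in the post.
gender.factor <- factor(c("Male", "Male", "Female", "Female",
                          "Female", "Female", "Female", "Female"))

# Pair each level with the integer code as.numeric() reports for it.
codes <- data.frame(
  level = levels(gender.factor),
  code  = seq_along(levels(gender.factor))
)
print(codes)

# Convert the factor to its numeric codes and count the levels.
gender.numeric <- as.numeric(gender.factor)
print(gender.numeric)
print(nlevels(gender.factor))
```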
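If we had a case of 6 distinct values, like “Up,” “Down,” “Left,” “Right,” “Forward,” and “Back,” we would get 6 levels with the codes 1, 2, 3, 4, 5, 6. By default the codes follow alphabetical order (Back is 1, Up is 6), not the order the values appear in.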
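The six-value case can be sketched as follows. The variable names are made up for illustration. The second half assumes you want the codes to follow your own ordering, which factor() supports through its levels argument.

```r
directions <- c("Up", "Down", "Left", "Right", "Forward", "Back")

# Default behaviour: levels are the sorted unique values.
default.factor <- factor(directions)
print(levels(default.factor))
print(as.numeric(default.factor))

# Explicit levels: the codes follow the order supplied in `levels`.
custom.factor <- factor(directions, levels = directions)
print(levels(custom.factor))
print(as.numeric(custom.factor))
```

Passing levels explicitly is worth considering whenever the numeric codes need to mean something specific, since the default alphabetical order depends only on how the labels are spelled.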