Data Source

The Open-Source Psychometrics Personalities dataset is available on the project's website, but the data is not entirely clean. Because the dataset is widely used by researchers and data scientists, several people have shared cleaned versions of it. The version I used for my project was obtained from GitHub, shared by Amber Thomas under the CC BY-NC-SA license, which makes it usable for my portfolio. The dataset can be found at the following link: https://data.world/popculture/fictional-character-personality-traits.

Data Cleaning

I first used Excel to explore the dataset and decided to omit some of the columns, such as the ratings and emoji spectrums, as neither provides a meaningful indication of the characters or the personality spectrums. In addition, including all 252 personality spectrums would be overwhelming and would not allow for meaningful visualizations. Hence, only 36 spectrums were selected: the 36 used for the Recommended length of the personality quiz.
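The selection described above was done in Excel, but the same idea can be expressed in R. The following is only a minimal sketch on a toy data frame: the column names `rating_count` and `emoji`, the spectrum code `BAP999`, and the `keep_codes` vector are hypothetical stand-ins, not the actual column names or the full list of 36 codes.

```r
# Toy stand-in for the full dataset; the real file has many more rows and columns.
# The `rating_count` and `emoji` columns and their values are assumed for illustration.
toy <- data.frame(
  character_name = c("Ash", "Ash", "Ash"),
  spectrum       = c("BAP4", "BAP5", "BAP999"),
  mean           = c(-16.9, 23.1, 5.0),
  rating_count   = c(120, 118, 97),
  emoji          = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Hypothetical subset of spectrum codes to keep (the real list would have 36 entries).
keep_codes <- c("BAP4", "BAP5")

# Keep only rows for the selected spectrums, then drop the unwanted columns.
drop_cols <- c("rating_count", "emoji")
selected <- toy[toy$spectrum %in% keep_codes, !(names(toy) %in% drop_cols)]

print(selected)
```

Keeping the list of codes in a single vector makes it easy to see which spectrums were chosen and to swap one out later.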

Note that the dataset dates from 2020, and one of the spectrums on the website has changed slightly since then, so it is unavailable in the dataset. The spectrum in question is joyful-miserable. After reviewing the more than one hundred unused spectrums in the dataset, I deemed haunted-blissful the most suitable replacement for it among the selected 36 spectrums.

Additionally, there was a character from the series Alien whose gender was unspecified; for simplicity, all NA gender values were changed to Male.
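For reference, the gender recoding described above could be done in R along these lines. This is a sketch on a toy data frame with made-up character names, assuming the missing values are stored as `NA` in a character column called `gender`.

```r
# Toy stand-in for the character table; names and values are illustrative only.
toy <- data.frame(
  character_name = c("Character A", "Character B", "Character C"),
  gender         = c("Female", NA, "Male"),
  stringsAsFactors = FALSE
)

# Replace every missing gender value with "Male".
toy$gender[is.na(toy$gender)] <- "Male"

print(toy$gender)
```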

Cleaned Raw Data Preview

Here’s a preview of the further cleaned raw dataset that I will use as the base for my DV portfolio.

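# Set the working directory to the portfolio folder
setwd("~/COMM2501 Portfolio - z5218332")

# Load the cleaned dataset (path is relative to the working directory set above)
raw_data_cleaned <- read.csv("files/raw_data_cleaned.csv")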

head(raw_data_cleaned)
##   character_code fictional_work character_name gender spectrum
## 1           A/04          Alien            Ash   Male     BAP4
## 2           A/04          Alien            Ash   Male     BAP5
## 3           A/04          Alien            Ash   Male     BAP8
## 4           A/04          Alien            Ash   Male    BAP12
## 5           A/04          Alien            Ash   Male    BAP15
## 6           A/04          Alien            Ash   Male    BAP20
##   spectrum_positive spectrum_negative  mean   sd
## 1         masculine          feminine -16.9 22.3
## 2          charming           awkward  23.1 25.2
## 3            strict           lenient -32.8 20.9
## 4          artistic        scientific  43.0 12.1
## 5           orderly           chaotic -28.2 27.1
## 6         spiritual         skeptical  34.5 22.1
summary(raw_data_cleaned)
##  character_code     fictional_work     character_name        gender         
##  Length:28800       Length:28800       Length:28800       Length:28800      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    spectrum         spectrum_positive  spectrum_negative       mean         
##  Length:28800       Length:28800       Length:28800       Min.   :-49.4000  
##  Class :character   Class :character   Class :character   1st Qu.:-17.8000  
##  Mode  :character   Mode  :character   Mode  :character   Median :  1.0000  
##                                                           Mean   :  0.8211  
##                                                           3rd Qu.: 19.3000  
##                                                           Max.   : 48.1000  
##        sd       
##  Min.   : 0.80  
##  1st Qu.:19.90  
##  Median :24.20  
##  Mean   :23.48  
##  3rd Qu.:27.50  
##  Max.   :42.70
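The 28800 rows reported in the summary are consistent with 36 spectrums per character (28800 / 36 = 800), assuming each character has exactly one row per selected spectrum. A quick check of that assumption might look like the sketch below, shown on a toy data frame where the character code `X/01` is hypothetical.

```r
# Toy stand-in: 2 characters x 3 spectrums, one row per pair.
toy <- expand.grid(
  character_code = c("A/04", "X/01"),
  spectrum       = c("BAP4", "BAP5", "BAP8"),
  stringsAsFactors = FALSE
)

n_characters <- length(unique(toy$character_code))
n_spectrums  <- length(unique(toy$spectrum))

# Each character should have one row per spectrum, with no repeated pairs.
rows_per_character <- table(toy$character_code)
no_duplicates <- !any(duplicated(toy[c("character_code", "spectrum")]))
is_balanced <- no_duplicates &&
  all(rows_per_character == n_spectrums) &&
  nrow(toy) == n_characters * n_spectrums

print(is_balanced)
```

The same expressions could be run against `raw_data_cleaned` in place of `toy`.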