Data Source
The Open-Source Psychometrics Personalities dataset is available on the website itself, but its data is not entirely clean. Because the dataset is widely used by researchers, data scientists, and others, several people have shared cleaned versions of it. The dataset I used for my project is the version by Amber Thomas, obtained from GitHub. It is released under the CC BY-NC-SA license, so it can be used for my portfolio. The dataset can be found at https://data.world/popculture/fictional-character-personality-traits.
Data Cleaning
I first explored the dataset in Excel and decided to omit some of the columns, such as the ratings and emoji spectrums, as neither provides a very meaningful indication of characters or personality spectrums. In addition, including all 252 personality spectrums would be exhausting to work with and would not allow for meaningful visualizations. Hence, only 36 spectrums were selected: the 36 used for the Recommended length of the personality quiz.
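Although this step was done in Excel, it could be reproduced in R along the following lines. This is only a minimal sketch on a toy data frame: the column names `ratings` and `is_emoji`, the codes `BAP99` and `BAP200`, and the two-element `selected_spectrums` vector are assumptions for illustration, not the actual contents of the dataset.

```r
# Toy stand-in for the raw data; `ratings` and `is_emoji` are assumed column names
raw <- data.frame(
  character_code = c("A/04", "A/04", "A/04", "A/04"),
  spectrum       = c("BAP4", "BAP5", "BAP99", "BAP200"),
  mean           = c(-16.9, 23.1, 5.0, -2.0),
  ratings        = c(10, 12, 9, 11),
  is_emoji       = c(FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical subset of the selected spectrum codes (the real list has 36 entries)
selected_spectrums <- c("BAP4", "BAP5")

# Keep only rows for the selected spectrums and drop the unneeded columns
cleaned <- raw[raw$spectrum %in% selected_spectrums,
               !(names(raw) %in% c("ratings", "is_emoji"))]

print(cleaned)
```

Filtering by an explicit list of spectrum codes keeps the selection reproducible, which is harder to guarantee when rows are deleted by hand in a spreadsheet.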
Note that the dataset dates from 2020, and one of the spectrums on the website was changed slightly, so that spectrum is unavailable in the dataset. The spectrum concerned is joyful-miserable. From the more than one hundred unused spectrums in the dataset, haunted-blissful was deemed the most suitable replacement for it among the selected 36 spectrums.
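The substitution described above can be sketched as follows. The spectrum labels are written as "pole-pole" strings purely for illustration, and `available`, `wanted`, and `replacements` are hypothetical toy vectors rather than the full lists from the dataset or the quiz.

```r
# Spectrums present in the (toy) dataset
available <- c("masculine-feminine", "charming-awkward", "haunted-blissful")

# Spectrums wanted from the quiz (toy subset of the 36)
wanted <- c("masculine-feminine", "charming-awkward", "joyful-miserable")

# Identify wanted spectrums that the dataset does not contain
missing_spectrums <- setdiff(wanted, available)
print(missing_spectrums)

# Manually chosen replacement, mirroring the choice described in the text
replacements <- c("joyful-miserable" = "haunted-blissful")

# Swap in the replacement where one is defined, otherwise keep the original
final_spectrums <- ifelse(wanted %in% names(replacements),
                          replacements[wanted],
                          wanted)
final_spectrums <- unname(final_spectrums)
print(final_spectrums)
```

The replacement itself is a judgment call. The code only records that decision explicitly so that it is visible and repeatable.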
Additionally, there was a character from the series Alien whose gender was unspecified. For simplicity, all NA gender values were changed to Male.
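A minimal sketch of this recoding in R is shown below. The data frame and the character names in it are hypothetical placeholders; only the `gender` column name matches the dataset.

```r
# Toy data with one unspecified gender (placeholder character names)
characters <- data.frame(
  character_name = c("Character X", "Character Y", "Character Z"),
  gender         = c(NA, "Female", "Male"),
  stringsAsFactors = FALSE
)

# Count how many values will be affected before changing anything
n_missing <- sum(is.na(characters$gender))
print(n_missing)

# Recode every missing gender value to "Male"
characters$gender[is.na(characters$gender)] <- "Male"

print(characters)
```

Counting the missing values first makes it easy to confirm that the recoding only touches the handful of rows it is meant to.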
Cleaned Raw Data Preview
Here’s a preview of the further cleaned raw dataset I will be using as a base for my DV portfolio.
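# Set the working directory to the portfolio folder
setwd("~/COMM2501 Portfolio - z5218332")
# Load the cleaned dataset (path is relative to the working directory)
raw_data_cleaned <- read.csv("files/raw_data_cleaned.csv")
# Preview the first six rows
head(raw_data_cleaned)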
##   character_code fictional_work character_name gender spectrum
## 1           A/04          Alien            Ash   Male     BAP4
## 2           A/04          Alien            Ash   Male     BAP5
## 3           A/04          Alien            Ash   Male     BAP8
## 4           A/04          Alien            Ash   Male    BAP12
## 5           A/04          Alien            Ash   Male    BAP15
## 6           A/04          Alien            Ash   Male    BAP20
##   spectrum_positive spectrum_negative  mean   sd
## 1         masculine          feminine -16.9 22.3
## 2          charming           awkward  23.1 25.2
## 3            strict           lenient -32.8 20.9
## 4          artistic        scientific  43.0 12.1
## 5           orderly           chaotic -28.2 27.1
## 6         spiritual         skeptical  34.5 22.1
summary(raw_data_cleaned)
##  character_code     fictional_work     character_name        gender
##  Length:28800       Length:28800       Length:28800       Length:28800
##  Class :character   Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
##
##
##
##    spectrum         spectrum_positive  spectrum_negative       mean
##  Length:28800       Length:28800       Length:28800       Min.   :-49.4000
##  Class :character   Class :character   Class :character   1st Qu.:-17.8000
##  Mode  :character   Mode  :character   Mode  :character   Median :  1.0000
##                                                           Mean   :  0.8211
##                                                           3rd Qu.: 19.3000
##                                                           Max.   : 48.1000
##        sd
##  Min.   : 0.80
##  1st Qu.:19.90
##  Median :24.20
##  Mean   :23.48
##  3rd Qu.:27.50
##  Max.   :42.70
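As a quick sanity check on the summary above, the row count can be related to the number of spectrums. The sketch below assumes that every character has exactly one row for each of the 36 selected spectrums, which the source does not state explicitly. `is_balanced` is a hypothetical helper and `toy` is made-up data with placeholder character codes.

```r
# Arithmetic implied by the summary: 28,800 rows spread over 36 spectrums
n_characters <- 28800 / 36
print(n_characters)

# Hypothetical helper: TRUE if every character has `n_spectrums` rows
# and no (character, spectrum) pair appears more than once
is_balanced <- function(df, n_spectrums) {
  counts <- table(df$character_code)
  all(counts == n_spectrums) &&
    !any(duplicated(df[, c("character_code", "spectrum")]))
}

# Toy data: two characters with three spectrums each
toy <- data.frame(
  character_code = rep(c("A/01", "A/02"), each = 3),
  spectrum       = rep(c("BAP4", "BAP5", "BAP8"), times = 2),
  stringsAsFactors = FALSE
)

balanced <- is_balanced(toy, n_spectrums = 3)
print(balanced)
```

Under that assumption, the same check could be run on the real data as `is_balanced(raw_data_cleaned, 36)`, and 28,800 rows would correspond to 800 characters.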