Disney Female Characters and Baby Names Over Time

Author

Rebecca Wycoff

Does the release of Disney Movies impact baby naming?

This project looks into the connection, if any, between baby names over the years and the names of Disney characters.

More specifically, we will look at main female characters from Disney movies and at baby names recorded from 1917 through 2017. For my data sets, I used the babynames package and ChatGPT Data Analyst. To create my data set of Disney characters, I used the prompt, “Create a data set of Female main Disney character names from 1880 - 2017 in movies, listing the year the movie came out, and their name.” This gave me a base list of characters that I cross-checked and adjusted to make it usable. This data set can be found here: Disney Female Characters

I began this project by loading the necessary packages and importing my data set into RStudio. I then merged it with the babynames data to create one base file, which contains the names that appear both in Disney films and in the baby name records. Finally, I filtered it to keep only what I was looking for: female names recorded after 1916.

Code
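library(tidyverse)
library(babynames)
library(plotly)

# readr is already attached by tidyverse; loaded explicitly here for read_csv()
library(readr)
disney_female_characters <- read_csv("disney_female_characters.csv")

# Keep only baby names that also appear in the Disney data
# (inner_join() matches on the columns the two tables share),
# then keep female records after 1916
babynames |> 
  inner_join(disney_female_characters) |> 
  filter(sex == "F" & year > 1916) -> merged_disney_female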
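
To illustrate what the merge and filter do, here is a minimal sketch on two toy tables. The tables `toy_babynames` and `toy_disney` and all of their values are made up for illustration, and the explicit `by = "name"` assumes that `name` is the only column the real data sets share.

```r
library(dplyr)

# Toy stand-ins for the two tables; every value here is made up.
toy_babynames <- tibble(
  year = c(1950, 1950, 1950, 1900),
  sex  = c("F", "F", "M", "F"),
  name = c("Wendy", "Linda", "Ariel", "Wendy"),
  n    = c(100, 500, 20, 5)
)
toy_disney <- tibble(
  name    = c("Wendy", "Ariel"),
  release = c(1953, 1989),
  role    = c("Heroine", "Princess")
)

# Naming the key explicitly avoids joining on any other shared column by accident.
toy_merged <- toy_babynames |>
  inner_join(toy_disney, by = "name") |>  # drops "Linda", which has no Disney match
  filter(sex == "F", year > 1916)         # drops the male row and the 1900 row

toy_merged
```

Only the 1950 female "Wendy" row survives: the join removes names without a Disney match, and the filter removes the male and pre-1917 records.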

I first found the 15 most common female names and colored them by the character's “role” in the movies. To do this, I summed the counts for each name and then graphed only the top 15.

Code
# Avoid scientific notation on the axis and in the tooltips
options(scipen = 100000)

# Total number of babies per name and role, keeping one row per name
merged_disney_female |> 
  group_by(name, role) |> 
  mutate(total = sum(n)) |>
  distinct(name, role, .keep_all = TRUE) |> 
  arrange(desc(total)) -> totals_by_name

# Bar chart of the 15 names with the largest totals
totals_by_name |> 
  filter(name %in% c("Helen", "Alice", "Anna", "Olivia", "Judy", "Jane", "Wendy", "Jasmine", "Anita", "Joy", "Penny", "Jenny", "Maggie", "Bianca", "Ariel")) |> 
  ggplot(aes(total, reorder(name, total), fill = role, 
             text = paste("name:", name, "<br>",
                          "movie:", `movie title`, "<br>",
                          "release year:", release, "<br>", 
                          "total:", total))) +
  geom_col() + 
  labs(title = "Top 15 Female Disney Names by Role", x = "Total Number", y = "Name") -> top_disney_names
ggplotly(top_disney_names, tooltip = "text")

This graph shows that Heroine names are the most popular, and that the top 15 names range from pretty common names, such as Helen and Anna, to more obscure ones, like Jasmine. This leaves us with the question: did the Disney characters influence this naming?
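
The top names can also be selected programmatically instead of being typed out by hand. The following is a minimal sketch, assuming the same `name`, `role`, and `n` columns as the merged data; the `toy_counts` table and its values are made up, and `slice_max()` is swapped in here as one possible way to keep the largest totals.

```r
library(dplyr)

# Made-up counts, for illustration only.
toy_counts <- tibble(
  name = c("Helen", "Helen", "Anna", "Anna", "Meg", "Elsa"),
  role = c("Heroine", "Heroine", "Princess", "Princess", "Heroine", "Princess"),
  n    = c(300, 200, 250, 150, 10, 40)
)

top_names <- toy_counts |>
  group_by(name, role) |>
  summarize(total = sum(n), .groups = "drop") |>  # one row per name/role pair
  slice_max(total, n = 3)                          # keep the 3 largest totals

top_names
```

With the real data, `n = 15` would give the top fifteen, and `top_names$name` could be passed to `filter(name %in% ...)` in place of the hard-coded vector.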

To look at the names over time and see whether they were influenced by the movie release date, I plotted the top 10 names. When hovering over a line on each graph, you can see the year, proportion, character name, release year, and movie title. Comparing the release year with the trend around it shows the impact, if any.

First, I looked at the female Heroine characters.

Due to the large number of Heroine characters, I started by finding the top fifteen names.

Code
merged_disney_female |>
  filter(role == "Heroine") |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(15)
# A tibble: 15 × 2
   name       total
   <chr>      <int>
 1 Helen     755513
 2 Alice     431135
 3 Judy      381208
 4 Jane      344813
 5 Wendy     260673
 6 Anita     205133
 7 Joy       132961
 8 Penny      97335
 9 Jenny      89447
10 Maggie     83049
11 Bianca     71910
12 Esmeralda  44916
13 Giselle    42817
14 Meg         5261
15 Cruz        3769

From there, I plotted the top ten names. I removed the name Helen from the graph because of its high popularity, especially in 1917. Its total makes it clear that keeping it would stretch the scale and hide the other names, so removing it gives a closer look at the rest of the top 10.

Code
merged_disney_female |> 
  filter(role == "Heroine") |> 
  filter(name %in% c("Alice", "Jane", "Judy", "Maggie", "Anita", "Wendy", "Joy", "Jenny", "Penny", "Bianca")) |> 
  ggplot(aes(year, prop, color = name,
             text = paste("release year:", release, "<br>",
                          "movie: ", `movie title`))) + geom_line() +
  labs(title = "Disney Heroine Names Over Time") -> heroine_plot
ggplotly(heroine_plot)

This graph shows that names such as Alice, Jane, and Judy were popular through the years, regardless of the release of a movie associated with them. However, the peak of the name Wendy in 1969 could have been influenced by the release of Peter Pan in 1953. Some names, such as Bianca and Jenny, remain low and steady throughout the 100 years, with no spikes or dips around the time of the movie releases.
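
Reading peaks off a graph can be double-checked with a small calculation. The sketch below finds each name's peak year and how many years after the release it falls. It assumes the `name`, `year`, `prop`, and `release` columns of the merged data; `toy_trend`, the names `NameA` and `NameB`, and all values are hypothetical.

```r
library(dplyr)

# Made-up proportions for two hypothetical names.
toy_trend <- tibble(
  name    = rep(c("NameA", "NameB"), each = 4),
  year    = rep(c(1950, 1955, 1960, 1965), times = 2),
  prop    = c(0.001, 0.002, 0.004, 0.003,   # NameA peaks in 1960
              0.005, 0.004, 0.003, 0.002),  # NameB peaks in 1950
  release = rep(c(1953, 1961), each = 4)
)

peak_vs_release <- toy_trend |>
  group_by(name, release) |>
  slice_max(prop, n = 1, with_ties = FALSE) |>  # row with the highest proportion
  ungroup() |>
  mutate(peak_year = year,
         years_after_release = peak_year - release) |>
  select(name, release, peak_year, years_after_release)

peak_vs_release
```

A small positive `years_after_release` is consistent with a movie effect, while a negative value means the name had already peaked before the movie came out. Neither proves causation on its own.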

Next, I looked at the Supporting female characters.

As with Helen, I removed Olivia because it is so common that it would skew the graph.

Code
merged_disney_female |> 
  filter(role == "Supporting") |> 
  filter(name != "Olivia") |> 
  ggplot(aes(year, prop, color = name, 
             text = paste("release year:", release, "<br>",
                          "movie:", `movie title`))) + geom_line() +
  labs(title = "Disney Supporting Character Names Over Time") -> supporting_plot
ggplotly(supporting_plot)

Looking at the graph, these movie releases do not appear to correlate with the popularity of baby names. For example, the three names with the largest spikes, Winifred, Abby, and Roxanne, are all on the decline at the time of their respective movie releases, meaning something else drove their earlier rise.
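
Whether a name is rising or falling around a release can be put into numbers by comparing its average proportion in the years just before and just after the movie. This is only a sketch under stated assumptions: `toy_window`, the name `NameC`, and its values are made up, and the three-year `half_width` is an arbitrary choice.

```r
library(dplyr)

# Made-up, steadily declining proportions for one hypothetical name.
toy_window <- tibble(
  name    = "NameC",
  year    = 1990:1999,
  prop    = c(0.010, 0.009, 0.008, 0.007, 0.006,
              0.005, 0.004, 0.003, 0.002, 0.001),
  release = 1995
)

half_width <- 3  # hypothetical number of years to compare on each side

before_after <- toy_window |>
  filter(year != release, abs(year - release) <= half_width) |>
  mutate(period = if_else(year < release, "before", "after")) |>
  group_by(name, period) |>
  summarize(mean_prop = mean(prop), .groups = "drop")

before_after
```

Here the mean proportion is lower after the release than before it, which matches a name that was already declining when its movie came out.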

I then looked into the Disney Female Villains.

Code
merged_disney_female |> 
  filter(role == "Villain") |> 
  ggplot(aes(year, prop, color = name,
             text = paste("release year:", release, "<br>",
                          "movie: ", `movie title`))) + geom_line() +
  labs(title = "Disney Villain Names Over Time") -> villain_plot
ggplotly(villain_plot) 

The only names that appear in both data sets are Maleficent and Ursula. The count for Maleficent is so small that information about it only appears when hovering along the x-axis, near the year 2015.

Here, you can see that the name Ursula has been on the decline since 1972. After the release of The Little Mermaid, the name continued to decrease. It is possible that the name's association with a mean octopus contributed to this decline, but this cannot be confirmed.

Lastly, I looked into the Disney Princesses.

As with the Heroines, I first found the top fifteen names.

Code
merged_disney_female |> 
  filter(role == "Princess") |>
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(15)
# A tibble: 14 × 2
   name        total
   <chr>       <int>
 1 Anna       666493
 2 Jasmine    244181
 3 Ariel       67779
 4 Aurora      50911
 5 Tiana       26104
 6 Elsa        24831
 7 Belle       15536
 8 Cinderella    783
 9 Moana         741
10 Snow          621
11 Pocahontas    145
12 Kida           65
13 Eilonwy        15
14 Rapunzel       15

Because Anna has a total count much higher than the other names, I removed it from the graph.

Code
merged_disney_female |> 
  filter(role == "Princess") |> 
  filter(name %in% c("Jasmine", "Belle", "Ariel", "Aurora", "Elsa", "Tiana", "Cinderella", "Snow", "Moana", "Pocahontas")) |> 
  ggplot(aes(year, prop, color = name, 
             text = paste("release year:", release, "<br>",
                          "movie: ", `movie title`))) + geom_line() + 
  labs(title = "Disney Princess Names Over Time") -> princess_plot
ggplotly(princess_plot)

Of all the Disney names, these are the most obscure, yet they are possibly the ones with the most influence. It is likely that some parents take these princess names for their own children.

The graph shows that the name Jasmine, although already rising, hit its peak in 1993, one year after the movie Aladdin was released. Elsa and Tiana also peaked one year after the releases of Frozen and The Princess and the Frog. Similarly, the name Ariel peaked two years after the release of The Little Mermaid. Some of these names, however, appear to have no connection.

Overall, this analysis reveals a mixed impact of Disney female character names on baby name trends. Some names, such as Jasmine, Ariel, and Elsa, show a clear rise in popularity following their movie release, while others, like Winifred, Roxanne, and Abby, show none. Some names even show a decrease after the associated movie was released. This suggests that although Disney films may have some influence, outside factors are likely to play a larger role in shaping name popularity.