From Broadway to the Big Screen: Analyzing Audience Sentiment in Theatrical Adaptations

Author

Rebecca Wycoff

Introduction

Since 1866, Broadway has been a place for people to gather and watch live storytelling through music, movement, and performance. During the early 1900s, the Theater District grew as electric lights and bright signs increased its visibility, attracting larger and newer audiences (Jaramillo, 2016). Broadway has since shifted from a local theatrical hub to a global phenomenon, and people now travel from all over the world to watch its productions in New York City. Broadway showcases diverse stories and cultures, sharing unique voices and fostering inclusivity.

As technology advances, Broadway and other theatrical productions have had to evolve with it. What began as a local experience has become a global entertainment industry. Through adaptations, social media, and filmed recordings, audiences everywhere can engage with theatre in new ways, which changes how they interact with it. There is, however, a gap in understanding how adaptations influence audience sentiment over time. Studies have examined shifting audience behaviors, financial trends, and marketing strategies, but little research has addressed whether audience perceptions change with the release of a film adaptation.

Hypothesis

This study investigates how audience sentiment shifts when a Broadway musical is adapted into a film format. The hypothesis is that while adaptations generally maintain an overall positive sentiment, specific factors—such as casting, cinematography, and the adaptation’s faithfulness to the original production—strongly influence audience reception. It is expected that successful adaptations will show high positive sentiment linked to these elements, while less successful adaptations will reveal audience dissatisfaction tied to perceived shortcomings in these areas.

Gathering Data

The analysis focuses on the cinematic adaptations of Wicked: Part One, West Side Story, Cats, and Dear Evan Hansen. These movies were selected for their widespread popularity, their recent release dates (which allow a fair comparison of visual effects and technology), and the variation in their critical reception. Wicked: Part One, the most recent, was released in November 2024 and received an 87% on Rotten Tomatoes’ Tomatometer, a score calculated from critic reviews. Steven Spielberg’s West Side Story, released in December 2021, received a 91%. Cats received a 19% after being released in December 2019, and Dear Evan Hansen, released in September 2021, received a 28%.

Reviews were gathered from Google.com by searching “[movie name] movie reviews.” Data was collected directly from the reviews displayed in the results, which are posted online by the general public. To ensure that all reviews were relevant to the study’s focus, each selected review had to include the word “adaptation.” A sample of 20 reviews per film was drawn by selecting every fourth review (a systematic rather than strictly random sample). The data was then entered into an Excel spreadsheet containing the movie title, star rating, and full text of each review, allowing for comparison across films and sentiment scores.
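The every-fourth-review selection can be sketched in a few lines of R. This is only an illustration of the idea, not the procedure actually used (the reviews were picked by hand from the search results); the `all_reviews` vector is a made-up placeholder, and starting at the fourth review is an assumption.

```r
# Hypothetical stand-in for the list of reviews displayed for one film
all_reviews <- paste("review", 1:100)

k        <- 4    # sampling interval: every fourth review
n_wanted <- 20   # reviews kept per film

# Positions 4, 8, 12, ... until 20 reviews have been selected
picked_idx <- seq(from = k, by = k, length.out = n_wanted)
sampled    <- all_reviews[picked_idx]

length(sampled)
head(picked_idx)
```

Because the interval is fixed, the result depends on the order in which the reviews happen to be listed.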

Reviews can be found here: Adaptation Reviews

Closer Look

I began by loading my packages and importing my data set.

Code

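library(tidyverse)   # dplyr, tidyr, ggplot2, readr
library(tidytext)    # unnest_tokens(), stop_words, get_sentiments()
library(devtools)
library(wordcloud2)  # wordcloud2()
library(textdata)    # needed by get_sentiments("afinn")
library(quanteda)
library(plotly)      # ggplotly()
library(scales)
library(lubridate)   # mdy(); loaded explicitly in case tidyverse < 2.0.0

# Columns used below: Movie and Review
reviews <- read_csv("adaptationreviews.csv")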

The first step was to look at the most frequent words in the collected reviews after removing stop words, which are common words such as “the,” “a,” and “of.”

Code

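# Split each review into one lowercase word per row
reviews |> 
  unnest_tokens(word, Review) -> review_words

# Drop stop words and show the ten most frequent remaining words
review_words |> 
  anti_join(stop_words, by = "word") |> 
  count(word, sort = TRUE) |> 
  head(10) |> 
  knitr::kable()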

word	n
musical	109
movie	107
film	103
adaptation	102
original	56
cats	55
wicked	54
story	50
broadway	44
cast	36

Without any further filtering, the top words simply describe what was being reviewed: the movies’ titles, the medium (“movie,” “film,” “musical”), and the connection to Broadway.

After filtering out these common words, which appear across all of the reviews, a word cloud was created to see what else reviewers discussed.

Code

review_words |> 
  filter(!word %in% c('movie', 'musical', 'wicked', 'cats', 'dear', 'evan', 'hansen', 
                      'west', 'side', 'story', 'film', 'production', 'adaptation', 'original', 'broadway')) |> 
  anti_join(stop_words, by = "word") |> 
  count(word, sort = TRUE) |> 
  head(180) -> review_top_words

wordcloud2(review_top_words, size =.5)

Sentiment Analysis

To examine the reviews in more depth, a sentiment analysis was performed. Sentiment analysis categorizes text as positive, negative, or neutral by leveraging Natural Language Processing techniques to analyze word choice, order, and combinations (Kennedy, 2012). To perform this, the reviews were imported into RStudio, a development environment for the R language, which is widely used for data analysis and statistical computing. Within RStudio, the reviews were tokenized (broken down into individual words) to enable more detailed analysis of textual patterns. Each word was then assigned a sentiment value from the AFINN lexicon, which scores words on an integer scale from -5 (most negative) to +5 (most positive), with zero as the neutral midpoint.

The first graph shows the overall sentiment of each movie’s reviews.

Code

review_words |> 
  arrange(Movie) |> 
  inner_join(get_sentiments('afinn'), by = "word") -> review_sentiment

review_sentiment |> 
  group_by(Movie) |> 
  summarise(avg_sentiment = mean(value)) |> 
  ggplot(aes(reorder(Movie, avg_sentiment), avg_sentiment, fill = Movie, 
             text = paste("Movie:", Movie, "<br> Average Sentiment:", round(avg_sentiment, 2)))) +
  geom_col() + coord_flip() +
  labs(title = "Average Sentiment By Movie",
       y = "Average Sentiment",
       x = NULL) +  scale_fill_manual(values = c(
         "West Side Story" = "lightcoral",  # pastel red
         "Wicked" = "lightgreen",            # pastel green
         "Cats" = "lightsalmon",             # pastel orange
         "Dear Evan Hansen" = "lightblue"    # pastel blue
       )) + theme(legend.position = "none") -> movie_sentiment
ggplotly(movie_sentiment, tooltip = "text")

The sentiment of each adaptation was generally positive, as reflected in the average sentiment scores of their public reviews. These scores were calculated by assigning a sentiment value to each word in the reviews and then averaging those values per production. Among the four films, West Side Story received the highest average sentiment score at 1.52, followed by Wicked (1.16), Cats (0.93), and Dear Evan Hansen (0.65). For instance, West Side Story’s positive sentiment is echoed in comments such as, “Brilliant adaptation. Great cinematography, excellent choreography, beautiful costumes and good acting.” Conversely, Dear Evan Hansen’s lower score is reflected in reviews like, “None of the choices make sense, it’s a bad adaptation of an already kinda shaky musical premise.”
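The averaging step described above can be illustrated with a small worked example in base R. It is a sketch under stated assumptions: the two reviews, the film labels, and the lexicon values are made up for illustration (the study itself used the AFINN lexicon through tidytext).

```r
# Toy lexicon: illustrative values, not guaranteed to match AFINN
toy_lexicon <- data.frame(
  word  = c("brilliant", "great", "good", "bad", "shaky"),
  value = c(4, 3, 3, -3, -1),
  stringsAsFactors = FALSE
)

# Two made-up reviews for two placeholder films
toy_reviews <- data.frame(
  Movie  = c("Film A", "Film B"),
  Review = c("Brilliant adaptation, great cinematography and good acting",
             "A bad adaptation of a shaky premise"),
  stringsAsFactors = FALSE
)

# Tokenize: lowercase, strip punctuation, split on spaces
tokens <- strsplit(gsub("[^a-z ]", "", tolower(toy_reviews$Review)), " +")
words  <- data.frame(
  Movie = rep(toy_reviews$Movie, lengths(tokens)),
  word  = unlist(tokens),
  stringsAsFactors = FALSE
)

# Keep only words found in the lexicon, then average their values per movie
scored <- merge(words, toy_lexicon, by = "word")
avg    <- aggregate(value ~ Movie, data = scored, FUN = mean)
avg
```

Words such as “adaptation” or “cinematography” have no entry in the toy lexicon, so they drop out before the mean is taken; only scored words contribute to each film’s average.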

Looking more closely at the words contributing to each average provides insight into why certain productions were received more positively or negatively. For each movie, the 20 most frequent sentiment-bearing words in its reviews are charted below, ordered by sentiment value.

Code

# West Side Story: 20 most frequent scored words, ordered by sentiment value
review_sentiment |> 
  select(Movie, word, value) |> 
  filter(Movie == "West Side Story") |> 
  count(word, value, sort = TRUE) |> 
  head(20) |> 
  ggplot(aes(reorder(word, -value), value, fill = value,
             text = paste("Word:", word, "<br>Value:", value))) +
  geom_col() + coord_flip() +
  scale_fill_gradient(low = "#a50f15", high = "#fee0d2") +
  labs(title = "'West Side Story' Words with Sentiment",
       y = "Value",
       x = NULL) +
  theme(legend.position = "none", plot.title = element_text(size = 8.5)) -> westside_sentiment
ggplotly(westside_sentiment, tooltip = "text")

# Wicked: the title word "wicked" is dropped so it is not counted as sentiment
review_sentiment |> 
  select(Movie, word, value) |> 
  filter(Movie == "Wicked") |> 
  filter(word != "wicked") |> 
  count(word, value, sort = TRUE) |> 
  head(20) |> 
  ggplot(aes(reorder(word, -value), value, fill = value,
             text = paste("Word:", word, "<br>Value:", value))) +
  geom_col() + coord_flip() +
  scale_fill_gradient(low = "#00441b", high = "#c7e9c0") + 
  labs(title = "'Wicked' Words with Sentiment",
       y = "Value",
       x = NULL) +
  theme(legend.position = "none", plot.title = element_text(size = 8.5)) -> wicked_sentiment
ggplotly(wicked_sentiment, tooltip = "text")

# Dear Evan Hansen
review_sentiment |> 
  select(Movie, word, value) |> 
  filter(Movie == "Dear Evan Hansen") |> 
  count(word, value, sort = TRUE) |> 
  head(20) |> 
  ggplot(aes(reorder(word, -value), value, fill = value,
             text = paste("Word:", word, "<br>Value:", value))) +
  geom_col() + coord_flip() +
  scale_fill_gradient(low = "#08306b", high = "#c6dbef") + 
  labs(title = "'Dear Evan Hansen' Words with Sentiment",
       y = "Value",
       x = NULL) +
  theme(legend.position = "none", plot.title = element_text(size = 8.5)) -> DEH_sentiment
ggplotly(DEH_sentiment, tooltip = "text")

# Cats
review_sentiment |> 
  select(Movie, word, value) |> 
  filter(Movie == "Cats") |> 
  count(word, value, sort = TRUE) |> 
  head(20) |> 
  ggplot(aes(reorder(word, -value), value, fill = value,
             text = paste("Word:", word, "<br>Value:", value))) +
  geom_col() + coord_flip() +
  scale_fill_gradient(low = "#7f2704", high = "#fee6ce") +
  labs(title = "'Cats' Words with Sentiment",
       y = "Value",
       x = NULL) +
  theme(legend.position = "none", plot.title = element_text(size = 8.5)) -> cats_sentiment
ggplotly(cats_sentiment, tooltip = "text")

These charts help explain each production’s average. West Side Story and Wicked each have only 1 negative word among their top 20, while Dear Evan Hansen has 6 and Cats has 8.

It is important to recognize that, without context, a word may not match its assigned sentiment. For example, the word “no” appears in one West Side Story review as “No regrets about seeing this adaptation,” where the overall meaning is neutral to positive, and in another review in a clearly negative way: “In my opinion there was absolutely no chemistry between Tony and Maria.”
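One common way to reduce this problem is to look at the word directly before each scored word and reverse the score when that word is a negation. The sketch below is a hypothetical illustration of that idea and was not part of this study’s analysis; the lexicon values and the negation list are assumptions chosen for the example.

```r
# Made-up lexicon values and an assumed list of negation words
toy_lexicon <- c(regrets = -2, chemistry = 2, good = 3)
negations   <- c("no", "not", "never", "without")

score_with_negation <- function(text) {
  words <- strsplit(gsub("[^a-z ]", "", tolower(text)), " +")[[1]]
  prev  <- c("", head(words, -1))          # word directly before each word
  value <- unname(toy_lexicon[words])      # NA for words not in the lexicon
  flip  <- ifelse(prev %in% negations, -1, 1)
  sum(value * flip, na.rm = TRUE)          # unscored words are ignored
}

score_with_negation("No regrets about seeing this adaptation")
score_with_negation("There was absolutely no chemistry between Tony and Maria")
```

With these toy values, “no regrets” comes out positive and “no chemistry” comes out negative, which matches how a reader would interpret the two sentences. A rule this simple still misses sarcasm and negations that sit further away from the word they modify.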

Top Words

Discovering the top words of all productions gave insight into what the audience focuses on the most when writing reviews. These terms reflected key themes and emotional responses, while highlighting praise or criticism.

To understand how reviewers described the films as adaptations, the word “adaptation” itself was analyzed. Since all reviews included the term - appearing 104 times - the analysis focused on the word directly preceding it to capture tone and context.

Examining the preceding word requires pairing each word with the word that directly follows it, creating two-word sequences known as bigrams.
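As a minimal base R illustration of the pairing (the analysis itself builds bigrams with tidytext), the sketch below splits one sentence fragment taken from a Wicked review into words, pairs each word with the next, and picks out the word that precedes “experience.”

```r
text  <- "The 2024 adaptation of Wicked is a spellbinding cinematic experience"
words <- strsplit(tolower(text), " +")[[1]]

# Pair each word (word1) with the word that directly follows it (word2)
bigrams <- data.frame(
  word1 = head(words, -1),
  word2 = tail(words, -1),
  stringsAsFactors = FALSE
)

# The word directly before "experience"
preceding <- bigrams$word1[bigrams$word2 == "experience"]
preceding
```

A sentence of n words always yields n - 1 bigrams, so every word except the first appears once in the `word2` position.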

Code

review_bigrams <- reviews |> 
  unnest_tokens(bigram, Review, token = "ngrams", n = 2) |> 
  filter(!is.na(bigram))


bigrams_separated <- review_bigrams %>%
  separate(bigram, c("word1", "word2"), sep = " ")

bigrams_filtered <- bigrams_separated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)

# new bigram counts:
bigram_counts <- bigrams_filtered %>% 
  count(word1, word2, sort = TRUE)

bigrams_united <- bigrams_filtered %>%
  unite(bigram, word1, word2, sep = " ")

To focus on “adaptation,” the bigrams were filtered so that “word2” was “adaptation,” leaving “word1” as the word that precedes it. Only preceding words with an AFINN score were kept.

Code

afinn <-  get_sentiments("afinn")

adaptation_afinn <- bigrams_filtered %>%
  filter(word2 == "adaptation") |> 
  inner_join(afinn, by = c("word1" = "word")) %>%
  group_by(word1) %>%
  summarise(freq = n(), avg_sentiment = mean(value)) %>%
  arrange(desc(freq))

adaptation_afinn %>%
  knitr::kable()

word1	freq	avg_sentiment
ambitious	2	2
wonderful	2	4
amazing	1	4
bad	1	-3
beautiful	1	3
brilliant	1	4
perfect	1	3
stunning	1	4
successful	1	3
worst	1	-3

This shows that the adaptations were described in both a positive and a negative light, with positive descriptors appearing more often.

Another word that appeared often, but held little meaning on its own, was “experience.” To better understand how it was framed in reviews, the preceding word was examined.

Code

bigrams_filtered %>%
  filter(word2 == "experience") %>%
  count( word1, sort = TRUE) |> 
  knitr::kable()

word1	n
cinematic	7
immersive	2
magical	2
transformative	2
unforgettable	2
enthralling	1
quality	1
recent	1
stunning	1
theatrical	1
viewing	1
wonderful	1

Of the 28 uses of “experience,” “cinematic experience” was the most common phrase, appearing seven times. This suggests a strong focus on the visual quality of the adaptations. One reviewer wrote of Wicked, “The 2024 adaptation of Wicked is a spellbinding cinematic experience that breathes fresh life into one of Broadway’s most beloved musicals.” The other preceding words indicate that audiences felt absorbed in the story, experienced an emotional or nostalgic impact, and regarded the storytelling as high quality.

Casting

Each production features a star-studded cast of well-known actors and actresses. To examine how reviewers responded to the casting choices in adaptations, the word “cast” was analyzed. Across the 80 reviews, the word “cast” appeared 36 times. To see how it was being used, bigrams were used to look at common phrases.

Code

bigrams_filtered %>%
  filter(word2 == "cast") %>%
  count( word1, sort = TRUE) |> 
  knitr::kable()

word1	n
talented	4
original	2
perfectly	2
supporting	2
amazing	1
brilliantly	1
ensemble	1
entire	1
glorious	1
main	1
possibly	1
stellar	1
studded	1
unforgettable	1
writers	1

Most of these words show that reviewers had positive things to say about the cast, with one review of West Side Story even stating, “Everyone was perfectly cast!”

To look more specifically at each production, bigrams matching the actors’ names were counted so that they could be presented on a graph.

Code

actor_colors <- c(
  # Wicked (greens)
  "jonathan bailey" = "#cff7b7",
  "cynthia erivo"   = "#9df567",
  "ariana grande"   = "#78ed2f",

  # Cats (oranges)
  "taylor swift"    = "#f7cca3",
  "rebel wilson"    = "#fabf87",
  "jennifer hudson" = "#f5a85f",
  "jason derulo"    = "#f28f30",
  "james corden"    = "#e67a12",

  # West Side Story (reds)
  "rita moreno"     = "#fc9292",
  "rachel zegler"   = "#fc7c7c",
  "mike faist"      = "#fa5050",
  "ansel elgort"    = "#f51b1b",

  # Dear Evan Hansen (blue)
  "ben platt"       = "#97dcf7"
)

desired_order <- c("Wicked", "Cats", "West Side Story", "Dear Evan Hansen")

bigrams_united |> 
  count(Movie, bigram, sort = TRUE) |> 
  filter(bigram %in% names(actor_colors)) |> 
  mutate(bigram = factor(bigram, levels = names(actor_colors)), 
         Movie = factor(Movie, levels = desired_order)) |>
  ggplot(aes(Movie, n, fill = bigram,
             text = paste("Actor:", bigram, "<br>Mentions:", n))) + 
  geom_col(color = "white", width = 0.8, linewidth = 0.2) +
  scale_fill_manual(values = actor_colors, name = "Actor") +
  labs(
    title = "Actors Mentioned in Movies",
    y = "Mentions",
    x = NULL
  ) -> name_plot

ggplotly(name_plot, tooltip = "text")

Wicked stood out with the highest number of references to its lead cast. Ariana Grande (Glinda) was mentioned 17 times, Cynthia Erivo (Elphaba) 16 times, and Jonathan Bailey (Fiyero) 4 times. Reviews include, “Elphaba, played by the talented Cynthia Erivo, is a force to be reckoned with. Erivo’s powerhouse vocals and emotional depth breathe new life into the character, making her both sympathetic and fierce. Ariana Grande as Glinda is a perfect match, effortlessly blending comedic timing with her own impressive vocal range.” Not all reviews were as positive. In Dear Evan Hansen, Ben Platt, who also originated the role on Broadway, was mentioned 6 times. His casting drew controversy, however, as reviewers felt he was no longer a fit for the role. One review states, “Ben Platt is way too old for this role - he seems at times as if he’s acting in a different movie.”

Cinematography

One major difference between staging a show and putting it on screen is the addition of cinematography, which helps tell the story in ways that cannot be shown otherwise. The word “cinematography” was examined first, and the words preceding it indicated a positive reception to the visual style.

To explore which productions placed the most emphasis on cinematography, technical terms such as “cinematography,” “camera,” “angles,” “shot,” and “shots” were analyzed.

Code

# Words that directly precede "cinematography"
bigrams_filtered %>%
  filter(word2 == "cinematography") %>%
  count(word1, sort = TRUE) |> 
  knitr::kable()

camera_colors <- c(
  "cinematography" = "#FF8383", 
  "camera" = "#A19AD3",
  "angles" = "#A1D6CB",
  "shot" = "#FFC785",   # assumed color: "shot" is counted below but had no entry
  "shots" = "#FFF574")

camera_order <- c("West Side Story", "Dear Evan Hansen", "Wicked", "Cats")

# Count camera-related terms per movie and stack them in one bar per film
review_words |> 
  count(Movie, word, sort = TRUE) |> 
  filter(word %in% c("cinematography","camera","angles","shot","shots")) |> 
  mutate(Movie = factor(Movie, levels = camera_order)) |>
  ggplot(aes(Movie, n, fill = word, text = paste("Word:", word, "<br>Mentions:", n))) +
  geom_col(color = "white", width = 0.8, linewidth = 0.2) +
  scale_fill_manual(values = camera_colors) +
  labs(title = "Cinematography Mentioned in Movies",
       y = "Mentions",
       x = NULL) -> camera_plot
ggplotly(camera_plot, tooltip = "text")

word1	n
expressive	1
winning	1

This examination revealed that West Side Story was the most discussed in terms of cinematographic techniques, with seven mentions across its reviews. One reviewer described the film as having “sublime and expressive cinematography that brilliantly compliments the film’s immersive technical details.” Dear Evan Hansen followed with three mentions, though the response was more mixed. One reviewer criticized the cinematography, stating, “Camera angles were neither broad enough nor personal enough, instead staying at an awkward, in-between documentary-like distance.” Meanwhile, Wicked and Cats each garnered two references, suggesting that cinematography played a role but was not a dominant topic in their reviews.

Broadway Production

Each of these films was adapted from a well-known Broadway production. To explore how reviewers connected the films to their stage counterparts, the words “Broadway” and “original” were counted across reviews. The term “Broadway” appeared a total of 44 times throughout the reviews.

Code

broadway_colors <- c(
  "broadway" = "#FF8383", 
  "original" = "#A1D6CB")

broadway_order <- c("Wicked", "Cats",  "West Side Story", "Dear Evan Hansen")

review_words |> 
  count(Movie, word, sort = TRUE) |> 
  filter(word %in% c("broadway","original")) |> 
  mutate(Movie = factor(Movie, levels = broadway_order)) |>
  ggplot(aes(Movie, n, fill = word, text = paste("Word:", word, "<br>Mentions:", n))) +
  geom_col(color = "white", width = 0.8, linewidth = 0.2) +
  scale_fill_manual(values = broadway_colors) +
  labs(title = "Reference to Original Broadway Production",
       y = "Mentions",
       x = NULL) -> broadway_plot
ggplotly(broadway_plot, tooltip="text")

Wicked, which still runs on Broadway, accounted for the majority with 23 mentions, followed by Dear Evan Hansen with 13 and Cats with 8. No West Side Story review in this sample used the word “Broadway”; the word “original,” however, appeared 21 times in reviews of the movie. Reading those reviews shows that they refer back to the original 1961 film rather than the 1957 stage musical.

What is the Impact?

These adaptations introduce audiences to the music of the original productions and expose them to the theater world. To see whether there is any relationship between an adaptation and its Broadway show, the Broadway Weekly Grosses can be examined. This website is updated weekly with each production’s figures: the gross, gross difference, average ticket price, top ticket price, attendance, and capacity % (the percentage of the theater filled).

Because it is the most recent adaptation and is currently running on Broadway, Wicked’s weekly capacity percentage was plotted to see whether it increased after the release of Wicked: Part One on November 22nd, 2024. After downloading the data (using the link above), the weeks were filtered to a one-year window, from the week ending April 27th, 2024 to the week ending April 27th, 2025. Each week’s capacity percentage was then plotted as a line graph.

Code

library(readr)
library(lubridate)   # mdy()
Wicked_Report <- read_csv("Wicked.Report.csv")

# Parse the week-ending dates and strip the "%" sign so capacity is numeric
Wicked_Report |> 
  mutate(
    Week = mdy(Week),
    `This Week %` = as.numeric(gsub("%", "", `This Week %`)) 
  ) %>%
  filter(!is.na(Week)) -> Updated_Report

# Filter for the date range
Updated_Report %>%
  filter(Week >= as.Date("2024-04-27") & Week <= as.Date("2025-04-27")) -> Filtered_Weeks

Filtered_Weeks |> 
  ggplot(aes(Week, `This Week %`)) +
  geom_line(color = "#9df567", linewidth = 1) +
  geom_point(color = "#2f8f2f", size = 0.8) +
  labs(title = "Wicked On Broadway Theatre Capacity <br>(4/27/24 to 4/27/25)",
       x = NULL,
       y = "Capacity %") +
  theme_minimal() -> Capacity_plot

ggplotly(Capacity_plot)

Looking at the graph, theatre capacity rose in the weeks surrounding the film’s release and has remained high since, showing that a larger audience was brought in each night. Although other factors could contribute, the rave reviews of the production together with the numbers shown here suggest that the adaptation increased attendance at the stage show.
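One rough way to put a number on the visual impression is to compare the average capacity before and after the film’s release date. The sketch below is illustrative only: the weekly figures in `toy_weeks` are made up and are not the real Broadway grosses data, and the `capacity` column name is a placeholder.

```r
release_date <- as.Date("2024-11-22")   # Wicked: Part One release date

# Made-up weekly capacity figures, for illustration only
toy_weeks <- data.frame(
  Week     = as.Date(c("2024-10-27", "2024-11-03", "2024-11-10",
                       "2024-11-24", "2024-12-01", "2024-12-08")),
  capacity = c(90, 92, 91, 99, 100, 98)
)

# Label each week as falling before or after the release
toy_weeks$period <- ifelse(toy_weeks$Week < release_date, "before", "after")

# Mean capacity % in each period, and the difference between them
period_means <- tapply(toy_weeks$capacity, toy_weeks$period, mean)
difference   <- unname(period_means["after"] - period_means["before"])

period_means
difference
```

A simple before-and-after difference cannot separate the film’s effect from anything else that changed over the same weeks, so it describes the shift rather than proving its cause.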

To extend these findings, other measures, such as ticket pricing, could be observed to see how they change over time and after the release of an adaptation. Although this analysis covers only one production, which is still running, other adapted shows go on tour and perform in various venues. Retrieving and analyzing data for these productions outside New York could show whether they have similar outcomes.

Another way to extend these findings would be to compare these reviews with reviews of the actual Broadway productions. This would allow a direct comparison of acting choices, music, and staging.

Conclusion

This study provides insight into how audience members react to film adaptations of Broadway musicals. With a mostly positive sentiment, these adaptations are well received and enjoyed by the public. Critics’ opinions may differ, but the general public enjoys how the productions transport the audience into a different world for the duration of the movie. The cinematography may bring out moments that might not have been caught on stage, or amplify those that are, and the cast helps bring it all together through acting skill and emotion.

The importance of this research lies in the idea that adaptations not only bring a new audience to theatre but also create a path for upcoming theatrical adaptations, such as Kiss of the Spider Woman, The 25th Annual Putnam County Spelling Bee, Be More Chill, Beautiful: The Carole King Musical, and Guys and Dolls. This study showed how some parts of a production are well received, and why others are not.

One limitation of this study is that only four productions were analyzed, which does not give a full picture of all adaptations. Reviews of some productions may have focused on casting more than others because of the performers’ popularity: Ariana Grande, for example, is famous in the music industry, whereas Ben Platt is better known in the theatre world. With a sample of 20 reviews per film, not all opinions of each movie were included, which might have skewed the results.

The AFINN lexicon worked well for assigning values to words. However, because the analysis is based on individual words rather than full-sentence context, the sentiment assigned may not always reflect the intended meaning. In addition, not every word in the reviews had a corresponding sentiment score, because the AFINN lexicon only assigns values to a limited set of predefined words. This means that some words were left out of the calculations, either because they are neutral or because they fall outside the lexicon. In the future, using an expanded or context-aware sentiment analysis tool could improve the accuracy of the results.
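The share of words that actually receive a score can be checked directly. The following is a small hypothetical sketch with a placeholder word list standing in for the lexicon and a made-up tokenized review; it is not a measurement taken from this study’s data.

```r
# Placeholder word list standing in for the lexicon's vocabulary
toy_lexicon <- c("brilliant", "bad", "good", "worst")

# Made-up tokenized review
review_tokens <- c("brilliant", "adaptation", "of", "a", "good", "musical",
                   "with", "immersive", "cinematography")

in_lexicon <- review_tokens %in% toy_lexicon
coverage   <- mean(in_lexicon)     # proportion of tokens that get a score

coverage
review_tokens[!in_lexicon]         # words that drop out of the average
```

In this toy case only 2 of the 9 words are scored, so the average rests on a small part of the text. Reporting a coverage figure like this alongside the sentiment averages would show how much of each review the scores are based on.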

References

Burston, J. (2009). Recombinant Broadway. Continuum: Journal of Media & Cultural Studies, 23(2), 159–169. https://doi.org/10.1080/10304310802710504.

Jaramillo, C. (2016). The history of theater on Broadway. Octane Seating. https://octaneseating.com/blog/the-history-of-theater-on-broadway/.

Kennedy, H. (2012). Perspectives on sentiment analysis. Journal of Broadcasting & Electronic Media, 56(4), 435–450. https://doi.org/10.1080/08838151.2012.732141.