Natural Language Processing: Emotion, Sentiment, Readability, Complexity. An App to Test Books, Essays and Websites
- Simon A
- Aug 8, 2024
Updated: Aug 20, 2024
In the digital age, text is everywhere—customer reviews, social media posts, academic papers, and more. But how do you extract valuable insights from all that data? That's where a Shiny app comes in, offering a powerful tool to make text analysis accessible and user-friendly. Whether you're a researcher, marketer, or simply curious about what your data is saying, this app could be exactly what you need.
Test the app below.
*Further comments on how the app can be used are below the app.
The test below is based on the KJV Bible from https://www.gutenberg.org/. To try another book, find it there and insert its URL into the app.




What Can This App Do?
This Shiny app brings together some of the most robust R packages—quanteda, sentimentr, syuzhet, and more—to provide a comprehensive text analysis platform. Here's a breakdown of the features and how you might use them:
Sentiment Analysis:
Example: Imagine you're a brand manager monitoring customer reviews on a new product. With sentiment analysis, you can quickly gauge the overall mood—are customers loving it or leaving it? The app plots sentiment scores across the text, helping you identify trends and potential issues.
Good Indicator: An average sentiment score above 0 suggests generally positive feedback, while an average below 0 indicates negative sentiment.
Application: Track shifts in public opinion after a product launch or a PR campaign.
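To make the idea of an average sentiment score concrete, here is a minimal Python sketch. It is not the app's code (the app relies on R packages such as sentimentr and syuzhet), and the tiny lexicon below is a hypothetical stand-in for a real one. It scores each sentence as the sum of lexicon values divided by the number of words, then averages across sentences.

```python
import re

# Hypothetical toy lexicon for illustration only; real lexicons hold thousands of scored words.
TOY_LEXICON = {"love": 1.0, "great": 1.0, "good": 0.5,
               "bad": -0.5, "terrible": -1.0, "broken": -1.0}

def sentence_scores(text):
    """Return one score per sentence: sum of lexicon values divided by word count."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    scores = []
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        if not words:
            continue
        total = sum(TOY_LEXICON.get(w, 0.0) for w in words)
        scores.append(total / len(words))
    return scores

def average_sentiment(text):
    """Mean of the per-sentence scores; 0.0 for empty input."""
    scores = sentence_scores(text)
    return sum(scores) / len(scores) if scores else 0.0

reviews = "I love this phone. The battery is great! The screen arrived broken."
scores = sentence_scores(reviews)
print(scores, average_sentiment(reviews))
```

Plotting the per-sentence scores in order gives the kind of trend line described above. This sketch ignores negation ("not good") and intensifiers, which dedicated sentiment packages typically handle.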
Emotion Analysis:
Example: Let’s say you’re analyzing social media posts about a recent event. Emotion analysis can reveal whether posts are filled with joy, anger, or sadness. For instance, tracking emotions during a political debate can offer insights into public reactions and concerns.
Good Indicator: High counts in positive emotions like joy and trust suggest a favorable reception, whereas spikes in anger or fear might signal dissatisfaction or concern.
Application: Understand the emotional impact of marketing campaigns or news articles.
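Emotion counts work in a similar way to sentiment scores, except that each matched word adds to a category tally. The Python sketch below is only an illustration with a hypothetical word-to-emotion table. Real emotion lexicons are far larger and can assign one word to several emotions, which this simplified version does not do.

```python
import re
from collections import Counter

# Hypothetical mapping for illustration; each word maps to a single emotion here.
TOY_EMOTIONS = {
    "happy": "joy", "delighted": "joy",
    "trust": "trust", "reliable": "trust",
    "furious": "anger", "outraged": "anger",
    "afraid": "fear", "worried": "fear",
    "sad": "sadness",
}

def emotion_counts(text):
    """Count how many words in the text fall into each emotion category."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(TOY_EMOTIONS[w] for w in words if w in TOY_EMOTIONS)

posts = "Voters were furious and outraged, but some felt happy. Many are worried."
counts = emotion_counts(posts)
print(counts.most_common())
```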
Keyword-in-Context (KWIC):
Example: You’re researching how a specific term is used in academic literature. KWIC allows you to see how a keyword is used within its surrounding text. This is particularly useful for linguists or anyone conducting content analysis.
Good Indicator: KWIC results provide context for understanding the sentiment or meaning behind specific keywords, helping you capture the nuances of language.
Application: Investigate how often and in what context certain words or phrases appear in customer feedback.
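A keyword-in-context search can be sketched in a few lines of Python. This is an illustrative version, not the app's implementation; the function name `kwic` and the `window` parameter (number of words shown on each side) are choices made for this example.

```python
import re

def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for every match of keyword."""
    tokens = re.findall(r"[A-Za-z']+", text)
    target = keyword.lower()
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, tok, right))
    return rows

feedback = "The delivery was late. Support fixed the late delivery quickly."
rows = kwic(feedback, "late", window=2)
for left, word, right in rows:
    print(f"{left:>15} | {word} | {right}")
```

Note that this sketch drops punctuation, so context windows can cross sentence boundaries.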
Word Frequency and Lexical Variety:
Example: If you’re an author or content creator, knowing the most frequently used words in your text can help you avoid redundancy and improve your writing style. Lexical variety measures how diverse your vocabulary is, which is crucial for engaging content.
Good Indicator: High lexical variety suggests a rich and engaging text, while a narrow range of frequent words might indicate repetition.
Application: Enhance readability and engagement in blog posts, articles, or even technical documentation.
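One common way to put a number on lexical variety is the type-token ratio: the number of distinct words divided by the total number of words. Whether the app uses exactly this measure is an assumption; the Python sketch below simply shows word frequencies and the ratio side by side.

```python
import re
from collections import Counter

def word_stats(text, top_n=3):
    """Return the top_n most frequent words and the type-token ratio."""
    tokens = re.findall(r"[a-z']+", text.lower())
    freq = Counter(tokens)
    ttr = len(freq) / len(tokens) if tokens else 0.0
    return freq.most_common(top_n), ttr

draft = "The cat sat on the mat and the dog sat too"
top_words, ttr = word_stats(draft)
print(top_words, round(ttr, 3))
```

No stop words are removed here, so common words such as "the" dominate the frequency list. The ratio also tends to fall as texts get longer, so it is best compared across texts of similar length.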
Hapax Richness:
Example: In literary studies, hapax richness (the proportion of words that appear only once in a text, known as hapax legomena) can indicate the uniqueness of a text. Analysing hapax richness helps scholars understand the distinctiveness of an author’s language.
Good Indicator: A higher proportion of hapax words indicates a more unique or creative use of language, which might be desirable in poetry or literary works.
Application: Use this to assess the originality of written work or to ensure varied vocabulary in educational content.
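As a rough illustration, hapax richness can be computed by counting the words that occur exactly once and dividing by the total word count. The choice of denominator (all tokens rather than distinct words) is an assumption made for this Python sketch, not something stated about the app.

```python
import re
from collections import Counter

def hapax_richness(text):
    """Share of tokens that are words occurring exactly once in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    freq = Counter(tokens)
    hapaxes = sum(1 for count in freq.values() if count == 1)
    return hapaxes / len(tokens)

line = "to be or not to be that is the question"
richness = hapax_richness(line)
print(richness)
```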
Readability (SMOG Index):
Example: Writing for a general audience? The SMOG Index helps you assess how easy or difficult your text is to read. By adjusting your content based on this score, you can ensure it’s appropriately tailored to your target audience.
Good Indicator:
Score of 7-9: Easily understood by middle school students (7th-9th grade level).
Score of 10-12: Suitable for high school students (10th-12th grade level).
Score of 13-15: Appropriate for college students (undergraduate level).
Score of 16-18: Best for advanced readers (postgraduate level).
Score above 18: Highly complex, suitable for specialised or professional audiences.
Application: Use the SMOG Index to ensure your content meets the right reading level for your audience, whether you’re writing for middle school students, the general public, or academic readers. For instance, the King James Version (KJV) of the Bible, known for its rich but archaic language, would likely score around 22 on the SMOG Index, indicating a challenging text suitable for those with advanced reading skills.
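For readers curious about where the score comes from, the standard SMOG formula is 1.0430 × sqrt(polysyllables × 30 / sentences) + 3.1291, where polysyllables are words of three or more syllables. The Python sketch below applies that formula with a crude vowel-group syllable counter, which is an assumption of this example; dedicated readability tools count syllables more carefully, so their scores may differ.

```python
import math
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels; silent letters are not handled."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_index(text):
    """SMOG grade: 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    words = re.findall(r"[A-Za-z]+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291

simple = "The cat sat. The dog ran."
dense = "Comprehensive evaluation requires considerable experience."
print(round(smog_index(simple), 2), round(smog_index(dense), 2))
```

SMOG was designed for samples of about 30 sentences, so scores on very short passages like these are only indicative.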
How Does It Work?
Using the app is straightforward. Upload a text file, paste text, or scrape content from a URL. Select the analyses you want to perform, and let the app do the heavy lifting. Within seconds, you'll have detailed visualisations and summaries at your fingertips.
Who Can Benefit?
Researchers can analyse large volumes of text data to uncover trends and patterns.
Marketers can monitor brand sentiment and adjust strategies based on customer feedback.
Writers can refine their content to ensure it's both engaging and accessible.
Educators can assess and adjust the readability of educational materials.
Try the app below. Let me know your thoughts.
The attached code is below: