kjv.Rmd
Sacred texts in R. sacred
includes 6 tidy datasets about the ‘Apocrypha’, the ‘King James Bible’, the ‘Greek New Testament’, the ‘Septuagint’, the ‘Tanach’ and the ‘Vulgate’. The package also includes utility datasets such as Middle English keywords and Sentiment terms (positive and negative).
We can have a look at the most mentioned terms in the Bible. Here I use the king_james_version
dataset which is a corpus of documents from the tm package.
sacred
also comes with a set of 295 Middle English stopwords.
library(dplyr)
library(sacred) # devtools::install_github("JohnCoene/sacred")
library(echarts4r)
#> Welcome to echarts4r
#>
#> Docs: echarts4r.john-coene.com
library(tidytext)
data("king_james_version")
data("middle_english_stopwords") # get middle english stopwords
# set echarts4r theme
THEME <- "vintage"
# add common English stopwords
sw <- plyr::rbind.fill(stop_words, middle_english_stopwords)
king_james_version %>%
unnest_tokens(word, text) %>%
anti_join(sw, by = "word") %>%
count(word, sort = TRUE) %>%
slice(1:120) %>%
e_charts() %>%
e_cloud(word, n) %>%
e_title("Most mentioned words", "King James Bible") %>%
e_tooltip(trigger = "item") %>%
e_theme(THEME)
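To make the counting steps concrete, here is a minimal base R sketch of what the pipeline above does before the chart is drawn: tokenise, drop stopwords, count. The `verses` data frame and the short `stopwords` vector are made-up stand-ins for `king_james_version` and the combined stopword list, not the package's actual data.

```r
# Toy stand-in for king_james_version: one row per verse (assumed layout).
verses <- data.frame(
  book.num = c(1, 1, 19),
  text = c(
    "And God said, Let there be light",
    "And God saw the light, that it was good",
    "Praise ye the Lord"
  ),
  stringsAsFactors = FALSE
)

# Toy stopword list standing in for stop_words + middle_english_stopwords.
stopwords <- c("and", "the", "that", "it", "was", "there", "be", "ye", "let")

# 1. Tokenise: lower-case, then split on anything that is not a letter or apostrophe.
tokens <- unlist(strsplit(tolower(verses$text), "[^a-z']+"))
tokens <- tokens[nzchar(tokens)]

# 2. Drop stopwords (the role played by anti_join(sw, by = "word")).
kept <- tokens[!tokens %in% stopwords]

# 3. Count and sort by decreasing frequency (the role of count(word, sort = TRUE)).
word_counts <- sort(table(kept), decreasing = TRUE)
top_words <- data.frame(
  word = names(word_counts),
  n = as.integer(word_counts),
  stringsAsFactors = FALSE
)
print(top_words)
```

The resulting `top_words` table has the same two columns, `word` and `n`, that are passed to `e_cloud()` above.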
Most sentiment lexicons are based on online reviews or social media conversations and are therefore not well suited to 17th-century text. To remedy this, sacred
also comes with lexicons of positive and negative Middle English terms.
We can now assess the sentiment of each book and plot the sentiment score by book in order of appearance in the Bible.
data("middle_english_sentiments")
# add common sentiment
common_sentiments <- get_sentiments("afinn")
common_sentiments <- common_sentiments %>%
anti_join(middle_english_sentiments, by = "word")
sent <- plyr::rbind.fill(
middle_english_sentiments,
common_sentiments
) %>%
select(-sentiment) %>%
unique()
king_james_version %>%
unnest_tokens(word, text) %>%
anti_join(sw) %>%
inner_join(sent) %>%
group_by(book.num) %>%
summarise(score = sum(score)) %>%
e_charts(book.num) %>%
e_bar(score) %>%
e_visual_map(
score,
show = FALSE,
type = "piecewise",
pieces = list(
list(gt = -1500, lte = 0, color = "#dc322f"),
list(gt = 0, lte = 600, color = "#859900")
)
) %>%
e_legend(FALSE) %>%
e_title("Total sentiment by book", "King James Bible") %>%
e_tooltip(trigger = "axis") %>%
e_theme(THEME) -> p
#> Joining, by = "word"
#> Joining, by = "word"
p
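The lexicon merge and the per-book scoring can be sketched in base R as follows. Both lexicons and the `tokens` table are hypothetical toy data, chosen only to show that a word present in both lexicons keeps its period-specific score and that each book's score is a plain sum.

```r
# Hypothetical period lexicon (stand-in for middle_english_sentiments).
period_lexicon <- data.frame(
  word = c("smite", "rejoice", "wrath"),
  score = c(-3, 3, -3),
  stringsAsFactors = FALSE
)

# Hypothetical general-purpose lexicon (stand-in for AFINN).
general_lexicon <- data.frame(
  word = c("rejoice", "good", "evil"),
  score = c(4, 3, -3),
  stringsAsFactors = FALSE
)

# Keep only general terms absent from the period lexicon (the anti_join step),
# so the period-specific score wins when a word appears in both.
general_only <- general_lexicon[!general_lexicon$word %in% period_lexicon$word, ]
lexicon <- rbind(period_lexicon, general_only)

# Toy tokens: one row per word, tagged with the book it came from.
tokens <- data.frame(
  book.num = c(1, 1, 1, 19, 19),
  word = c("evil", "wrath", "good", "rejoice", "good"),
  stringsAsFactors = FALSE
)

# Inner join on word, then sum the scores within each book.
scored <- merge(tokens, lexicon, by = "word")
book_scores <- aggregate(score ~ book.num, data = scored, FUN = sum)
print(book_scores)
```

One caveat, stated as an assumption to check against your installed version: recent tidytext releases appear to return the AFINN score in a column named `value` rather than `score`, in which case the column would need renaming before the two lexicons are combined.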
There is indeed more negativity at the “beginning” of the Bible: the Old Testament is far more gruesome than the New Testament.
The Psalms (book #19) are a ray of light amid the dark and spine-chilling godly injunctions of the Old Testament. Since “psalm” means “praises,” it seems fitting that this is the most positive book.
John 3 in the New Testament is also positive, by the Bible's standards anyway: John accepts Jesus as the Messiah, gets baptised, finds the path to God, etc. Good stuff.
The most negative book according to the analysis is the Book of Isaiah (book #24), which is not inaccurate. The message of the book is summed up by Wikipedia:
“The book opens by setting out the themes of judgment and subsequent restoration for the righteous. God has a plan which will be realised on the “Day of Yahweh”, when Jerusalem will become the centre of his worldwide rule.”