Disclaimer: This tutorial requires some experience with Python, R, and the command line.
First, I should point out that much of the content in this section will be based on the book Text Analysis with R for Students of Literature by Matthew L. Jockers.
There are at least two different kinds of environments that you can work with R in. Currently, the environment that I am using is a terminal-based editor called vim. Vim works using a number of plugins and, for my purposes, the main one is Nvim-R. I am not going to recommend this if you do not have experience with it. Rather, I will specify the basic method for getting started from scratch with no experience.
- To download the current version of R, head to the R download page and select your operating system. For Linux, choose your distribution and installer file.
- Download the “Desktop” version of RStudio
- Follow the installation instructions
- Launch RStudio like any other program
Primarily, this tutorial is for my own personal purposes, meaning that these instructions are ones that I’ve used to accomplish some task and do not want to forget.
Ultimately, I am aiming to compare contextual sentiment expression across two genres of Billboard music charts, namely Country and Hip-Hop R-n-B. To do this, it is helpful to identify a number of Billboard music charts that correspond to these styles. For that, I’ve found the billboard.py Python API by Allen Guo to be central. It can be found here. It is merely an API that gives you access to Billboard.com’s music charts.
First, find a chart:
- Create a new file with a .py extension
Enter these contents:
```python
import billboard
import sys

chart = billboard.ChartData('Hot-Country-Songs', year=2021)
original_stdout = sys.stdout
with open('Hot-Country-Songs-2022.txt', 'w') as f:
    sys.stdout = f
    print(chart)
    sys.stdout = original_stdout

chart2 = billboard.ChartData('Country-Streaming-Songs', year=2022)
original2_stdout = sys.stdout
with open('Country-Streaming-Songs-2022.txt', 'w') as g:
    sys.stdout = g
    print(chart2)
    sys.stdout = original2_stdout
```
import billboard imports the billboard API, while import sys gives you access to parts of the Python runtime environment, here the standard output stream sys.stdout.
chart = billboard.ChartData('your chosen chart', year=your chosen year) saves the chart data (artist and song) to the variable “chart”.
```python
with open('some text file.txt', 'w') as f:
    sys.stdout = f
    print(chart)
    sys.stdout = original_stdout
```
Here, you’ve opened a new file “some text file.txt” for writing as f, pointed sys.stdout at f so that print(chart) writes the “chart” data into that file instead of the terminal, and then restored the original standard output.
It should look something like this:
```
Country-Streaming-Songs chart (2021)
------------------------------------
1. 'Forever After All' by Luke Combs
2. 'Tennessee Whiskey' by Chris Stapleton
3. 'Starting Over' by Chris Stapleton
4. 'Wasted On You' by Morgan Wallen
5. 'I Hope' by Gabby Barrett
6. 'Fancy Like' by Walker Hayes
7. 'Whiskey Glasses' by Morgan Wallen
```
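The save-and-restore dance with sys.stdout can also be left to the standard library. The following is a minimal sketch using contextlib.redirect_stdout, which puts the original stream back when its block ends. The helper name save_chart is hypothetical, and the string below only stands in for a real billboard.ChartData object so that the sketch runs offline.

```python
import os
import tempfile
from contextlib import redirect_stdout


def save_chart(chart, path):
    """Print chart into the file at path; stdout is restored afterwards."""
    with open(path, "w", encoding="utf-8") as f:
        with redirect_stdout(f):  # stdout points at f only inside this block
            print(chart)


# Stand-in for a real chart object (an assumption for illustration).
fake_chart = "Country-Streaming-Songs chart (2021)\n1. 'Forever After All' by Luke Combs"

demo_path = os.path.join(tempfile.mkdtemp(), "demo-chart.txt")
save_chart(fake_chart, demo_path)

# Read the file back to confirm what was written.
with open(demo_path, encoding="utf-8") as f:
    saved_text = f.read()
```

With the real package, the call would presumably look like save_chart(billboard.ChartData('Hot-Country-Songs', year=2021), 'Hot-Country-Songs-2021.txt').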
In my particular case, I ran a number of regex substitutions in vim so that only the title of each song and the name of its artist were left, which I saved as a comma-delimited (csv) file like this:
```
I Hope,Gabby Barrett
The Bones,Maren Morris
Heartless,Diplo Presents Thomas Wesley Featuring Morgan Wallen
One Man Band,Old Dominion
10000 Hours,Dan + Shay & Justin Bieber
Tennessee Whiskey,Chris Stapleton
Whiskey Glasses,Morgan Wallen
```
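The same cleanup can be done in Python instead of vim. This is only a sketch, not the substitutions actually used: it assumes every chart line looks like the printed output shown earlier (a rank, a quoted title, the word "by", then the artist), and the names ENTRY and chart_text_to_rows are made up for illustration.

```python
import csv
import io
import re

# Matches lines such as: 1. 'Forever After All' by Luke Combs
ENTRY = re.compile(r"^\s*\d+\.\s+'(?P<title>.+)'\s+by\s+(?P<artist>.+?)\s*$")


def chart_text_to_rows(text):
    """Return [title, artist] pairs for every numbered chart line in text."""
    rows = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:  # the header and the dashed rule do not match and are skipped
            rows.append([match.group("title"), match.group("artist")])
    return rows


sample = """Country-Streaming-Songs chart (2021)
------------------------------------
1. 'Forever After All' by Luke Combs
2. 'Tennessee Whiskey' by Chris Stapleton
3. 'Starting Over' by Chris Stapleton
"""

rows = chart_text_to_rows(sample)

# csv.writer quotes any title or artist that itself contains a comma.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
```

Letting csv.writer produce the output, rather than joining the fields with a comma by hand, keeps titles that contain commas readable by csv.reader later on.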
Briefly, it should be noted that you can retrieve as many charts as you’d like with this code:
```python
import billboard
import sys

## chart 1, Hot-Country-Songs year-end 2021
chart = billboard.ChartData('Hot-Country-Songs', year=2021)
original_stdout = sys.stdout
with open('Hot-Country-Songs-2022.txt', 'w') as f:
    sys.stdout = f
    print(chart)
    sys.stdout = original_stdout

## chart 2, Country-Streaming-Songs year-end 2021
chart2 = billboard.ChartData('Country-Streaming-Songs', year=2021)
original2_stdout = sys.stdout
with open('Country-Streaming-Songs-2022.txt', 'w') as g:
    sys.stdout = g
    print(chart2)
    sys.stdout = original2_stdout
```
Each set will save a chart to a new text file.
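When the list of charts grows, copying the block once per chart gets tedious. A loop is one alternative. The sketch below is an assumption-laden illustration: save_charts and fake_fetch are hypothetical names, the file-naming scheme is invented here, and the default fetcher simply wraps the billboard.ChartData(name, year=...) call used above.

```python
import os
import tempfile


def save_charts(chart_names, year, out_dir, fetch=None):
    """Fetch each named chart for year and write it to its own text file.

    fetch is any callable taking (name, year). By default it wraps
    billboard.ChartData, which needs billboard.py and a network connection.
    """
    if fetch is None:
        import billboard  # imported lazily so the offline demo below still runs

        fetch = lambda name, yr: billboard.ChartData(name, year=yr)

    paths = []
    for name in chart_names:
        path = os.path.join(out_dir, f"{name}-{year}.txt")
        with open(path, "w", encoding="utf-8") as f:
            print(fetch(name, year), file=f)  # print straight into the file
        paths.append(path)
    return paths


# Offline demonstration with a fake fetcher (an assumption for illustration).
def fake_fetch(name, year):
    return f"{name} chart ({year})"


demo_dir = tempfile.mkdtemp()
written = save_charts(
    ["Hot-Country-Songs", "Country-Streaming-Songs"], 2021, demo_dir, fetch=fake_fetch
)
```

Dropping the fetch argument should make the function call the real API, one request per chart name.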
Before you use the lyricsgenius package, you will need to sign up for an account to get access to the API. An account authorizes your access to the Genius API, and signing up can be done here.
First, install the api from your terminal command line:
pip install lyricsgenius
Or get the latest version from GitHub:
pip install git+https://github.com/johnwmillr/LyricsGenius.git
For standard usage, see the website above. In my particular case, I had someone help me write this script, and as of today it still works:
```python
# import lyricsgenius
import lyricsgenius
# import csv, the Python package of functions for handling csv files
import csv

# this line gives you access to lyrics genius
genius = lyricsgenius.Genius("your api")

# artist and song list
here = "your-saved-csv.csv"
# file you will save lyrics to
there = "new-lyrics-csv-file.csv (or .txt)"

# open artist song file to read from
with open(here, "r") as source:
    reader = csv.reader(source)
    songartistlist = list(reader)

# Admittedly, it gets a little vague here, but I think that what is happening
# is that you create two lists, one per column. The column on the left holds
# the song titles (songlist) while the one on the right holds the artists'
# names (artistlist).
songlist = []
artistlist = []

# each row "i" of songartistlist is one line of the csv; the first item "j"
# in a row is the song title and the next one is the artist
for i in songartistlist:
    count = 0
    for j in i:
        if count == 0:
            songlist.append(j)
            count += 1
        else:
            # keep the artist as a plain string for the search below
            artistlist.append(j.strip())

# look up each title/artist pair on Genius and collect the lyrics
count1 = 0
songlyrics = []
for k in songartistlist:
    song = genius.search_song(songlist[count1], artistlist[count1])
    count1 += 1
    # skip songs for which the search returned nothing
    if song is not None:
        songlyrics.append(song.lyrics + "\n")

# save the lyrics file
with open(there, "w") as directionsFile:
    directionsFile.writelines(songlyrics)
```
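The pairing of titles with artists and the lookup loop can be written more compactly. What follows is a sketch under stated assumptions rather than a drop-in replacement: read_pairs, collect_lyrics, FakeSong and fake_search are all hypothetical names, and the fake search function stands in for genius.search_song so that the example runs without an API token.

```python
import csv
import io


def read_pairs(source):
    """Return (title, artist) tuples from a two-column CSV file object."""
    pairs = []
    for row in csv.reader(source):
        if len(row) >= 2:
            # Rejoin in case an unquoted artist name itself contained commas.
            pairs.append((row[0].strip(), ",".join(row[1:]).strip()))
    return pairs


def collect_lyrics(pairs, search):
    """Call search(title, artist) for each pair; record songs not found."""
    found, missing = [], []
    for title, artist in pairs:
        song = search(title, artist)
        if song is None:
            missing.append((title, artist))
        else:
            found.append(song.lyrics)
    return found, missing


class FakeSong:
    """Minimal stand-in exposing only the .lyrics attribute used above."""

    def __init__(self, lyrics):
        self.lyrics = lyrics


def fake_search(title, artist):
    # Pretend that one particular song cannot be found.
    if title == "Unknown Song":
        return None
    return FakeSong(f"lyrics of {title}")


sample_csv = io.StringIO("I Hope,Gabby Barrett\nUnknown Song,Nobody\n")
pairs = read_pairs(sample_csv)
found, missing = collect_lyrics(pairs, fake_search)
```

With a real client, search would be genius.search_song, on the assumption that it returns None when nothing matches. Keeping the misses in a separate list makes it easy to retry them by hand later.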
This section will extensively use Matthew Jockers’s book. I’ve done analysis on similar sets before, but I am anticipating that Jockers’s book will help clarify much of what I’ve attempted in the past.
In section 1.5 of his book, Jockers asks us to download the materials we would use to do his specific analysis. Instead, we will use the material we’ve just created.
```r
# We first set the working directory
# setwd("/home/redapemusic35/1-2021-22-Projects/Publications/Research-Projects/Music_Corpora/")
directory <- "/home/redapemusic35/1-2021-22-Projects/Publications/Research-Projects/Music_Corpora/Song-Charts/Country-Streaming-Songs-2020-there-3.txt"

# load the first text file. I am following Jockers's nomenclature, vectors will be denoted with a .v
# Jockers calls for the scan function but this gave me an error. Found on
# https://stackoverflow.com/questions/7797395/data-type-error-with-scan/7797830
# that I should use 'read.csv' instead.
text.v <- read.csv(directory, sep="\n")

# Not the entire text
text.v[23:50,]
```
```
## "I hope you stay up all night all alone, waitin' by the phone"
## "And then she calls"
## "And baby, I, I hope you work it out"
## "Forgive and just about forget"
## "And take her on a first date again"
## "And when you lean in for a kiss"
## "[Chorus]"
## "I hope you're both feelin' sparks by the end of the drive"
## "I hope you know she's the one by the end of the night"
## "I hope you never ever felt more free"
## "Tell your friends that you're so happy"
## "I hope she comes along and wrecks every one of your plans"
## "I hope you spend your last dime to put a rock on her hand"
## "I hope she's wilder than your wildest dreams"
## "She's everything you're ever gonna need"
## "And then I hope she cheats"
## "Like you did on me"
## "And then I hope she cheats"
## "Like you did on me"
## "[Bridge]"
## "I hope what goes comes all the way around"
## "I hope she makes you feel the same way about her"
## "That I feel about you right now"
## "[Chorus]"
## "I hope you're both feelin' sparks by the end of the drive"
## "I hope you know she's the one by the end of the night"
## "I hope you never ever felt more free"
## "Tell your friends that you're so happy"
```
```r
# I would like to separate the content from the meta-data, [Chorus], [Outro] etc.
intro.v <- which(text.v == "[Intro]")
verse1.v <- which(text.v == "[Verse 1]")
verse2.v <- which(text.v == "[Verse 2]")
Verse3.v <- which(text.v == "[Verse 3]")
outro.v <- which(text.v == "[Outro]")
bridge.v <- which(text.v == "[Bridge]")
```