Rrrr Country

Running some analysis on Country music using R.

By Monty Reynolds

January 25, 2021


Disclaimer: This tutorial requires some experience with Python, R, and the command line.

Some Basics

First, I should point out that much of the content in this section will be based on the book Text Analysis with R for Students of Literature by Matthew L. Jockers.

R and RStudio

There are at least two different kinds of environments in which you can work with R. Currently, I am using Vim, a terminal-based text editor. Vim can be extended with plugins, and for my purposes the main one is Nvim-R. I do not recommend this setup unless you already have experience with it. Instead, I will describe the basic method for getting started from scratch with no experience.

  1. To download the current version of R, head to R and select your operating system:
    1. For Linux, choose your distribution and installer file.
  2. Download the “Desktop” version of RStudio
    1. Follow the installation instructions
    2. Launch RStudio like any other program

Creating the Materials

This tutorial is primarily for my own reference: these are instructions that I’ve used to accomplish some task and do not want to forget.

Billboard Music Charts

Ultimately, I am aiming to compare contextual sentiment expression across two genres of Billboard music charts, namely Country and Hip-Hop/R&B. To do this, it helps to identify a number of Billboard charts that correspond to these styles. For that step, I’ve found billboard.py, a Python API by Allen Guo, to be central. It can be found here. It simply gives you access to Billboard.com’s music charts.

First, find a chart:

chart = billboard.ChartData('Hot-Country-Songs', year=2021) selects a specific year-end chart from Billboard.com. The available charts are listed by category here, and the year-end charts here.

Next

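  1. Create a new file with the extension .py. Do not name it billboard.py, because import billboard would then import your own script instead of the library. Any other name works, for example:
touch get_charts.py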

Enter these contents:

import billboard
import sys

# chart 1, Hot-Country-Songs year-end 2021
chart = billboard.ChartData('Hot-Country-Songs', year=2021)

# remember the real stdout so it can be restored afterwards
original_stdout = sys.stdout

with open('Hot-Country-Songs-2021.txt', 'w') as f:
    sys.stdout = f
    print(chart)
    sys.stdout = original_stdout

# chart 2, Country-Streaming-Songs year-end 2021
chart2 = billboard.ChartData('Country-Streaming-Songs', year=2021)

with open('Country-Streaming-Songs-2021.txt', 'w') as g:
    sys.stdout = g
    print(chart2)
    sys.stdout = original_stdout

import billboard imports the billboard.py API, while import sys gives access to parts of the Python runtime environment, here sys.stdout, the stream that print writes to.

Next, chart = billboard.ChartData('your chosen chart', year=your_chosen_year) saves the chart data (rank, song title and artist) to the variable “chart”.

Then

with open('some text file.txt', 'w') as f:
    sys.stdout = f
    print(chart)
    sys.stdout = original_stdout

Here, you open a new file “some text file.txt” for writing as f, point sys.stdout at f so that print(chart) writes the chart into the file instead of the terminal, and then restore the original stdout.

It should look something like this:

Country-Streaming-Songs chart (2021)
------------------------------------
1. 'Forever After All' by Luke Combs
2. 'Tennessee Whiskey' by Chris Stapleton
3. 'Starting Over' by Chris Stapleton
4. 'Wasted On You' by Morgan Wallen
5. 'I Hope' by Gabby Barrett
6. 'Fancy Like' by Walker Hayes
7. 'Whiskey Glasses' by Morgan Wallen

In my particular case, I ran a number of regex substitutions in vim so that only the song title and the artist name were left, which I saved as a comma-delimited (csv) file like this:

I Hope,Gabby Barrett
The Bones,Maren Morris
Heartless,Diplo Presents Thomas Wesley Featuring Morgan Wallen
One Man Band,Old Dominion
10000 Hours,Dan + Shay & Justin Bieber
Tennessee Whiskey,Chris Stapleton
Whiskey Glasses,Morgan Wallen
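
The same clean-up can also be scripted instead of done by hand in vim. What follows is a minimal Python sketch, not the regex commands I actually ran. It assumes every entry line in the saved .txt file has the form N. 'Title' by Artist, as in the sample above, and the names ENTRY, chart_text_to_rows and rows_to_csv are made up for illustration.

```python
import csv
import io
import re

# Assumed line format: "<rank>. '<title>' by <artist>"
ENTRY = re.compile(r"^\d+\.\s+'(?P<title>.*)' by (?P<artist>.+)$")


def chart_text_to_rows(text):
    """Return (title, artist) tuples for every line that looks like a chart entry."""
    rows = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:  # header and separator lines do not match and are skipped
            rows.append((match.group("title"), match.group("artist")))
    return rows


def rows_to_csv(rows):
    """Render the rows as comma-delimited text, one song per line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Using csv.writer rather than joining the two fields with a comma means that a title which itself contains a comma is quoted automatically.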

Briefly, it should be noted that you can retrieve as many charts as you’d like with this code:

import billboard
import sys

## chart 1, Hot-Country-Songs year-end 2021

chart = billboard.ChartData('Hot-Country-Songs', year=2021)

original_stdout = sys.stdout

with open('Hot-Country-Songs-2021.txt', 'w') as f:
    sys.stdout = f
    print(chart)
    sys.stdout = original_stdout

## chart 2, Country-Streaming-Songs year-end 2021

chart2 = billboard.ChartData('Country-Streaming-Songs', year=2021)

original2_stdout = sys.stdout

with open('Country-Streaming-Songs-2021.txt', 'w') as g:
    sys.stdout = g
    print(chart2)
    sys.stdout = original2_stdout

Each block saves one chart to its own new .txt file.
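
When the list of charts grows, the repeated blocks can be folded into a loop. The following is only a sketch under a few assumptions: billboard.ChartData(name, year=...) is the same call used above, while fetch_year_end, save_charts and the fetch parameter are hypothetical names of my own. It also writes with print(chart, file=f) instead of swapping sys.stdout, which produces the same text in the file.

```python
from pathlib import Path


def fetch_year_end(name, year):
    """Fetch one year-end chart with billboard.py (needs network access)."""
    import billboard  # imported here so the helper below also works without it

    return billboard.ChartData(name, year=year)


def save_charts(names, year, fetch=fetch_year_end, out_dir="."):
    """Write each chart's printed form to '<name>-<year>.txt' and return the paths."""
    paths = []
    for name in names:
        chart = fetch(name, year)
        path = Path(out_dir) / f"{name}-{year}.txt"
        with open(path, "w", encoding="utf-8") as f:
            print(chart, file=f)  # same text as print(chart), sent to the file
        paths.append(path)
    return paths


# Example call (performs real requests to Billboard.com):
#   save_charts(["Hot-Country-Songs", "Country-Streaming-Songs"], 2021)
```

Passing the fetching function in as an argument is a design choice that lets you try the file-writing part with a stand-in before making any real requests.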

Retrieving the Music Lyrics

To retrieve the lyrics, I use John W. Miller’s excellent LyricsGenius API client, which in turn gets the lyrics you want from the website genius.com. The full instructions for its use can be found here.

Before you use this package, you will need to sign up for an account to get access to the API. An account authorizes your access to the Genius API, and signing up can be done here.

First, install the api from your terminal command line:

pip install lyricsgenius

Or get the latest version from GitHub:

pip install git+https://github.com/johnwmillr/LyricsGenius.git

For standard usage, see the website above. In my particular case, I had someone help me write this script and currently, as of today, it still works:

# import lyricsgenius
import lyricsgenius
# import csv python package of functions for handling csv files
import csv

# create the Genius client; replace "your api" with your own Genius API access token
genius = lyricsgenius.Genius("your api")

# artist and song list
here = "your-saved-csv.csv"
# file you will save lyrics to
there = "new-lyrics-csv-file.csv (or .txt)"

# open new-lyrics-file for writing to
directionsFile = open(there, "+w")

# open artist song file to read from
with open(here, "r") as source:
    reader = csv.reader(source)
    songartistlist = list(reader)

# Create two empty lists, one per CSV column: songlist will hold the song
# titles (left column) and artistlist the artist names (right column).

songlist = []
artistlist = []

# Loop over every row i of the CSV. Within a row, the first field j is the
# song title; any later field is treated as the artist, of which only the
# first word is kept (e.g. "Gabby" for "Gabby Barrett").

for i in songartistlist:
    count = 0
    for j in i:
        if count == 0:
            songlist.append(j)
            count += 1
        else:
            jsplit = j.split()
            j = jsplit[0]
            artistlist.append(j)

# Look up each song on Genius using the title and the (shortened) artist
# name, and collect the lyrics in the same order as the CSV.

songlyrics = []
for title, artist in zip(songlist, artistlist):
    song = genius.search_song(title, artist)
    # search_song returns None when no match is found, so skip those songs
    if song is not None:
        songlyrics.append(song.lyrics + "\n")

# save the lyrics file and close it
directionsFile.writelines(songlyrics)
directionsFile.close()
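
For comparison, here is a more compact sketch of the same idea. It is not the script I was given, and it rests on a few assumptions: the CSV has exactly the two columns shown earlier, the full artist name is passed to the search (the script above keeps only its first word), and read_pairs and collect_lyrics are hypothetical helper names. The search_song argument can be any function that takes a title and an artist, such as genius.search_song.

```python
import csv


def read_pairs(path):
    """Return (title, artist) tuples from a two-column CSV file."""
    with open(path, newline="", encoding="utf-8") as source:
        return [(row[0], row[1]) for row in csv.reader(source) if len(row) >= 2]


def collect_lyrics(pairs, search_song):
    """Look up each pair; return (found lyrics, pairs that had no match)."""
    lyrics, missing = [], []
    for title, artist in pairs:
        song = search_song(title, artist)
        if song is None:
            missing.append((title, artist))
        else:
            lyrics.append(song.lyrics)
    return lyrics, missing


# Usage with the real client (needs an access token and network access):
#   genius = lyricsgenius.Genius("your api")
#   lyrics, missing = collect_lyrics(read_pairs("your-saved-csv.csv"), genius.search_song)
```

Keeping the list of songs that returned nothing makes it easier to see afterwards which titles need a manual search.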

Preparing the Files for Analysis

This section draws extensively on Matthew Jockers’s book. I’ve done analysis on similar sets before, but I anticipate that Jockers’s book will help clarify much of what I’ve attempted in the past.

Creating the R environment

In section 1.5 of his book, Jockers asks us to download the materials used for his specific analysis. Instead, we will use the material we’ve just created.

# We first set the working directory
# setwd("/home/redapemusic35/1-2021-22-Projects/Publications/Research-Projects/Music_Corpora/")

directory <- "/home/redapemusic35/1-2021-22-Projects/Publications/Research-Projects/Music_Corpora/Song-Charts/Country-Streaming-Songs-2020-there-3.txt"

# Load the first text file. I am following Jockers's nomenclature: vectors are denoted with a .v suffix.

# Jockers calls for the scan function, but this gave me an error. I found on https://stackoverflow.com/questions/7797395/data-type-error-with-scan/7797830 that I should use 'read.csv' instead. Note that read.csv returns a data frame rather than a vector, which is why rows are indexed below as text.v[23:50,].

text.v <- read.csv(directory, sep="\n")

# Print only part of the text (rows 23 to 50)

text.v[23:50,]
##  [1] "I hope you stay up all night all alone, waitin' by the phone"
##  [2] "And then she calls"                                          
##  [3] "And baby, I, I hope you work it out"                         
##  [4] "Forgive and just about forget"                               
##  [5] "And take her on a first date again"                          
##  [6] "And when you lean in for a kiss"                             
##  [7] "[Chorus]"                                                    
##  [8] "I hope you're both feelin' sparks by the end of the drive"   
##  [9] "I hope you know she's the one by the end of the night"       
## [10] "I hope you never ever felt more free"                        
## [11] "Tell your friends that you're so happy"                      
## [12] "I hope she comes along and wrecks every one of your plans"   
## [13] "I hope you spend your last dime to put a rock on her hand"   
## [14] "I hope she's wilder than your wildest dreams"                
## [15] "She's everything you're ever gonna need"                     
## [16] "And then I hope she cheats"                                  
## [17] "Like you did on me"                                          
## [18] "And then I hope she cheats"                                  
## [19] "Like you did on me"                                          
## [20] "[Bridge]"                                                    
## [21] "I hope what goes comes all the way around"                   
## [22] "I hope she makes you feel the same way about her"            
## [23] "That I feel about you right now"                             
## [24] "[Chorus]"                                                    
## [25] "I hope you're both feelin' sparks by the end of the drive"   
## [26] "I hope you know she's the one by the end of the night"       
## [27] "I hope you never ever felt more free"                        
## [28] "Tell your friends that you're so happy"
# I would like to separate the content from the meta-data, [Chorus], [Outro] etc.
intro.v  <- which(text.v == "[Intro]")
verse1.v  <- which(text.v == "[Verse 1]")
verse2.v  <- which(text.v == "[Verse 2]")
Verse3.v  <- which(text.v == "[Verse 3]")
outro.v  <- which(text.v == "[Outro]")
bridge.v  <- which(text.v == "[Bridge]")