The Natural Language Toolkit (NLTK) is a collection of libraries and programs written in Python to perform Natural Language Processing (NLP) operations on data sets. One of the simplest statistical measurements to perform is a Frequency Distribution, which measures how often a given word is used within a particular text. I thought it might be fun to use NLTK to analyze the lyrics of three classic hardcore records from different time periods and see how well the analysis conveys the message of the individual band.
Album #1: Bad Brains – Bad Brains (1982)
Album #2: Cro-Mags – Age of Quarrel (1986)
Album #3: Gorilla Biscuits – Start Today (1989)
Many NLTK functions operate on tokenized text. Because the texts analyzed here are lyrics, I used a very simple regular-expression tokenizer that just splits words on whitespace. This yielded the following results:
| # | Bad Brains | Cro-Mags | Gorilla Biscuits |
|---|---|---|---|
| 1 | to – 53 | you – 75 | you – 82 |
| 2 | the – 51 | the – 68 | the – 68 |
| 3 | you – 43 | i – 51 | i – 60 |
| 4 | i – 35 | and – 43 | to – 51 |
| 5 | and – 23 | to – 43 | and – 48 |
| 6 | we – 22 | a – 25 | a – 37 |
| 7 | me – 20 | it – 25 | it – 34 |
| 8 | my – 20 | just – 25 | not – 30 |
| 9 | i’m – 19 | gotta – 24 | my – 29 |
| 10 | a – 18 | i’m – 24 | it’s – 26 |
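The whitespace tokenization and counting behind the table above can be sketched with the standard library alone. This is a minimal sketch, not the code from the repository: `tokenize`, `top_words`, and the sample text are hypothetical, and `re.findall` plus `collections.Counter` stand in for NLTK's regular-expression tokenizer and frequency distribution.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into tokens on runs of whitespace."""
    return re.findall(r"\S+", text.lower())


def top_words(text, n=10):
    """Return the n most frequent tokens as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(n)


# Hypothetical sample text, not taken from any of the three records.
sample = "We run and we run and we fall"
print(top_words(sample, 3))
```

Splitting on whitespace leaves any punctuation attached to the neighboring word, which is usually acceptable for lyrics but would matter more for ordinary prose.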
The results in this case are extremely generic. The second-ranked word is the same for all three records. Only near the end of the Cro-Mags list, with the word “gotta” (We Gotta Know), does any semblance of personality start to emerge. Obviously we need a way to filter out more of the very basic, generic English words. These are called stop words, and NLTK provides a collection of them for filtering texts. Filtering out the stop words yielded the following results:
| # | Bad Brains | Cro-Mags | Gorilla Biscuits |
|---|---|---|---|
| 1 | i’m – 19 | gotta – 24 | it’s – 26 |
| 2 | got – 18 | i’m – 24 | don’t – 22 |
| 3 | don’t – 14 | can’t – 19 | i’m – 16 |
| 4 | it’s – 14 | don’t – 19 | time – 16 |
| 5 | right – 11 | it’s – 19 | you’re – 15 |
| 6 | that’s – 11 | see – 19 | like – 13 |
| 7 | gonna – 9 | world – 18 | know – 11 |
| 8 | guess – 8 | cause – 15 | things – 11 |
| 9 | like – 8 | know – 14 | get – 10 |
| 10 | i’ve – 7 | things – 14 | that’s – 10 |
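Stop-word filtering of the kind described above amounts to dropping every token that appears in a fixed set before counting. The sketch below is illustrative only: `STOP_WORDS` is a tiny hand-picked set and the token list is made up. With NLTK installed and its stop-word corpus downloaded, `stopwords.words("english")` from `nltk.corpus` would be the natural replacement for the hand-picked set.

```python
from collections import Counter

# Tiny illustrative stop-word set (an assumption, far smaller than NLTK's list).
STOP_WORDS = {"i", "you", "the", "to", "and", "a", "it", "we", "me", "my"}


def filter_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word set."""
    return [token for token in tokens if token not in stop_words]


# Hypothetical token list, already lowercased and split on whitespace.
tokens = "you gotta know the world and i gotta see it".split()
counts = Counter(filter_stop_words(tokens))
print(counts.most_common(3))
```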
Filtering out the stop words definitely gives better results in this case. The word we pointed out for the Cro-Mags, ‘gotta’, shoots right to the top of the list. Other patterns come to light as well, including ‘world’ (World Peace), ‘cause’ (John Joseph’s preferred choice over ‘because’), and ‘know’ (We Gotta Know). For the Bad Brains, words like ‘right’ (Right Brigade) and ‘guess’ (I) start to shine through. And for the Gorilla Biscuits, ‘time’ (New Direction, Start Today) surfaces as a lyrical theme. While these lists are much better than the first set, they still contain too many generic words, and most of those appear to be contractions that the stop-word filter skipped. Here are the further processed results with the contractions filtered out:
| # | Bad Brains | Cro-Mags | Gorilla Biscuits |
|---|---|---|---|
| 1 | got – 18 | gotta – 24 | time – 16 |
| 2 | right – 11 | see – 19 | like – 13 |
| 3 | gonna – 9 | world – 18 | know – 11 |
| 4 | guess – 8 | cause – 15 | things – 11 |
| 5 | like – 8 | know – 14 | get – 10 |
| 6 | see – 7 | things – 14 | think – 9 |
| 7 | want – 7 | get – 13 | see – 8 |
| 8 | bad – 8 | justice – 13 | want – 8 |
| 9 | pay – 6 | street – 13 | make – 7 |
| 10 | sail – 6 | back – 12 | up – 7 |
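The post does not say how the contractions were removed, so the following is only one plausible approach: treat any token with an apostrophe between two word characters as a contraction and drop it. The `drop_contractions` helper and the sample tokens are hypothetical.

```python
import re

# An apostrophe (straight or curly) with a word character on both sides,
# as in "i'm" or "don't". A trailing apostrophe as in "sailin'" does not match.
CONTRACTION = re.compile(r"\w['\u2019]\w")


def drop_contractions(tokens):
    """Remove tokens that look like contractions."""
    return [token for token in tokens if not CONTRACTION.search(token)]


tokens = ["i'm", "got", "don\u2019t", "right", "sailin'"]
print(drop_contractions(tokens))
```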
Filtering those lists down just a little further produces three entirely different lists.
The Cro-Mags list adds both ‘justice’ and ‘street’ (Street Justice), whereas the Bad Brains list adds ‘pay’ (Pay to Cum) and ‘sail’ (Sailin’ On). The Gorilla Biscuits list is a little different, though. None of the words pop out as instantly recognizable, but if the words are looked at as a whole set, a pattern becomes clear. Many of the words have an action quality to them: ‘like’, ‘know’, ‘get’, ‘think’, ‘see’. That is very much in line with the title of the record, “Start Today”. So overall, this was a very good first pass at looking at hardcore lyrics using Natural Language Processing and NLTK.
Code is available on GitHub.
A question came up recently about how to write an algorithm that, given a list of coin denominations, returns an exact amount of change using the smallest number of coins possible.
Here is my solution which has not been fully optimized.
Here is a sample run of the program.
rhino:python_code roneill$ python minimum_coins_change.py
Smallest change for 67 using denominations:  67 coins needed distributed via [(1, 67)]
Smallest change for 69 using denominations:  No change possible
Smallest change for 43 using denominations: [5, 1] 11 coins needed distributed via [(1, 3), (5, 8)]
Smallest change for 72 using denominations: [100, 50, 25, 10, 5, 1] 5 coins needed distributed via [(1, 2), (5, 0), (10, 2), (25, 0), (50, 1), (100, 0)]
Smallest change for 124 using denominations: [67, 42, 34, 15, 8] 3 coins needed distributed via [(8, 0), (15, 1), (34, 0), (42, 1), (67, 1)]
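One standard way to solve this problem is bottom-up dynamic programming over every amount from 1 up to the target. The sketch below is not the program that produced the run above. `min_coins` is a hypothetical name, denominations are assumed to be positive integers, and the return shape is chosen to resemble the printed output.

```python
def min_coins(amount, denominations):
    """Return (count, distribution) for the fewest coins that sum to amount,
    or None when no exact change exists. Assumes positive integer coins."""
    inf = float("inf")
    best = [0] + [inf] * amount    # best[a] = fewest coins that make amount a
    last = [None] * (amount + 1)   # last[a] = coin chosen last in that solution
    for a in range(1, amount + 1):
        for coin in denominations:
            if coin <= a and best[a - coin] + 1 < best[a]:
                best[a] = best[a - coin] + 1
                last[a] = coin
    if best[amount] == inf:
        return None
    # Walk back through the recorded choices to count each coin used.
    used = {coin: 0 for coin in denominations}
    a = amount
    while a > 0:
        used[last[a]] += 1
        a -= last[a]
    return best[amount], sorted(used.items())


print(min_coins(72, [100, 50, 25, 10, 5, 1]))
print(min_coins(124, [67, 42, 34, 15, 8]))
```

A greedy strategy that always takes the largest coin that fits would fail on the last denomination set: it would take 67, then 42, then 15, which happens to work here, but for a set like [4, 3, 1] and an amount of 6 it would use three coins where two suffice. The table-based approach costs time proportional to the amount times the number of denominations.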
A question came up recently about how to build a very simple program that can look up anagrams for a given word. An anagram of a given word is a word spelled using the same letters as the given word but in a different order. Here is my take on the solution using Python.
Here is the output from running the program:
rhino:python_code roneill$ ./anagrams.py
['opt', 'pot', 'top']
['asterin', 'eranist', 'restain', 'stainer', 'starnie', 'stearin']
['something']
rhino:python_code roneill$
Here is the source code for the program:
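A common way to build such a lookup is to index every dictionary word by its letters in sorted order, so that all anagrams share one key. What follows is a minimal sketch of that idea rather than the program that produced the output above. `build_index`, `anagrams`, and the small in-memory `WORDS` list are hypothetical stand-ins for a real dictionary file.

```python
from collections import defaultdict


def signature(word):
    """Return the word's letters in sorted order; anagrams share a signature."""
    return "".join(sorted(word.lower()))


def build_index(words):
    """Map each signature to the list of words that have it."""
    index = defaultdict(list)
    for word in words:
        index[signature(word)].append(word)
    return index


def anagrams(word, index):
    """Return all indexed words spelled with the same letters, sorted."""
    return sorted(index.get(signature(word), []))


# Hypothetical word list standing in for a dictionary file.
WORDS = ["top", "stop", "pot", "something", "opt"]
index = build_index(WORDS)
print(anagrams("pot", index))
```

Building the index takes one pass over the word list, after which each lookup is a single dictionary access. As in the output above, the result includes the query word itself when it is in the list.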
In his book “The Evolution of a Cro-Magnon”, Cro-Mags lead singer John Joseph (McGowan) recounts his time stationed in Norfolk, VA while a part of the United States Navy. It is here around 1980 that Joseph first encounters Washington, DC’s Bad Brains, a band which he would eventually become a roadie for before singing for the Cro-Mags:
“They were in the middle of their sound check when the drummer clicked off a four count and the shit really hit the fan. I never heard anything like it in my life; lightning fast chords that they were able to stop on a dime, flip, turn around and race back the other way. The singer moved like no one I had ever seen before. He was fit as hell and had the finesse of James Brown, the anger of Johnny Rotten and a shaking ability that made Elvis look like a fuckin’ paraplegic.”
– John Joseph, “The Evolution of a Cro-Magnon”
My first encounter with the Bad Brains came about 15 years later. The first record I owned that had anything of theirs on it was the amazing compilation put out by Another Planet Records called “Sunday Matinee: The Best of NY Hardcore.” (Bad Brains moved to the East Village from Washington, DC and holed up at 171A.) This record kicks off with Sailin’ On. This led me straight to the revolutionary ROIR Sessions, the album with the now-famous image of the Capitol building in Washington, DC being struck by lightning on its cover. I even remember being a little too excited at finding a copy of the Rock for Light CD while I was on a class trip in Minneapolis in high school. When I took guitar lessons briefly a few years ago, one of the first songs I wanted to learn how to play was Re-Ignition.
With all that said, sadly the closest I had come to actually seeing Bad Brains live was seeing Darryl Jenifer’s band Stealth play at one of the Superbowls of Hardcore in Washington, DC. After talking to a few people who had seen Bad Brains somewhat recently, I had a pretty good idea of what to expect, and because of that I was happy with what I got. I knew that long gone were the days when HR moved the way John Joseph described and when HR would do a perfectly timed backflip to the ending of “At The Movies”, as in this YouTube video:
Bad Brains – At The Movies (1979)
Two things from the show did surprise me, though. One, the band can still hang with anyone musically, in much the same way as Joseph describes. Dr. Know, Darryl Jenifer, and Earl Hudson are all amazing musicians, and each of them can still play at a very high level. Two, HR at this point just seems lost and either unable or unwilling to get into it. The best parts of the show were when the band was playing and the entire crowd was singing along, especially to favorites like “Sailin’ On”, “Right Brigade”, and “I Against I”. With HR in his current state, they are unfortunately just a nostalgia act. I was expecting a more balanced decline all the way around, and instead it seems as if the band is being held back by HR’s condition. I knew to expect the nostalgia comparison, even if not in this exact manner, and so I still had a great time.
H2O played as one of the openers. This was probably about the 10th or 12th time that I had seen them, but the first time in about 10 years. They always put on an entertaining show and tonight was no different.