It’s hardly possible to write something about Bob Dylan that has not been written before. But using a computer to analyze an artist’s lyrics is rather uncommon, so my first thought was to find the words Dylan uses most often in his songs. Bruce Springsteen cites him as one of his influences, so Springsteen’s lyrics are added as a comparison.
Python was my language of choice for writing some simple code that sorts words by how often they occur. That task is easy, sure, but you have to get the full lyrics first, preferably stored in one place (i.e. a file). I didn’t even bother to look for something like that on the Net. Instead, I used Beautiful Soup to parse the artists’ web pages, download the lyrics and write them to a file. Beautiful Soup is a fast and simple Python parser, an excellent tool for this kind of job. http://www.bobdylan.com/ and http://www.brucespringsteen.net/ contain up-to-date and error-free song lyrics (including the 2009 albums), which you can’t say about the lyrics found on various strange fan-made pages on the internet.
Although I specified some popular prepositions, postpositions, pronouns etc. to be omitted, running my “lyrics processor” for the first time gave poor results: the recurring words were “it’s”, “have”, “where”, “their”, “don’t”, “from” and so on. I added some more common/uninteresting words to the omitted list, and got nicer results in graph format:
Those are definitely more interesting. Dylan likes to ask questions (but not necessarily answer them), so “know” is not a surprise. Both artists use “baby”, “love” or “girl”, words common in love-oriented songs (“It’s All Over Now, Baby Blue”, “Tunnel of Love”). “Night” and “tonight” are frequently used by Springsteen; night time is often featured or anticipated in his songs (“Born to Run”, “Thunder Road”, “Because the Night”).
This simple analysis isn’t 100% accurate, but it shows some of the words Dylan and Springsteen use most frequently. In the future, I may compare more artists and introduce a comparison method other than simple word frequency. I used Python 3000 for this task because it is more consistent, and the new dictionary views make it harder to make some bad mistakes. Unfortunately, there are incompatibility problems with Beautiful Soup and the graph plotting libraries (and many others); they have yet to be ported to 3000.
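As a small illustration of the dictionary views mentioned above (a made-up example with invented counts, not code from the package): in Python 3000, `keys()` and `items()` return live views instead of list copies, so they always reflect the current state of the dictionary, and an accidental in-place `sort()` on them fails loudly.

```python
# Invented word counts, used only to illustrate dictionary views.
counts = {"love": 3, "night": 5}

keys = counts.keys()      # a live view, not a copied list
counts["baby"] = 4        # the view sees the new key immediately
view_is_live = "baby" in keys

# Views have no sort() method, so this mistake is caught right away.
try:
    keys.sort()
    sort_error = None
except AttributeError as exc:
    sort_error = exc

# The intended way: build a new sorted list from the items view.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(ordered)
```

In Python 2.6 the same `keys()` call would return a separate list that silently goes stale once the dictionary changes.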
Below you can find a .rar with the Python 3000 code I used to calculate word frequency and sanitize the input. You can use the WordAnalizer class to find the most common words in any text file, and both lyric files are also there. I didn’t include the plotting and parsing code, because they are written in Python 2.6, and including 3000 and 2.6 files in one package could be misleading.
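If you just want the idea without unpacking the archive, the counting step can be sketched roughly like this. This is a minimal sketch and not the WordAnalizer class itself: the function name `most_common_words`, the short `STOP_WORDS` set and the sample sentence are all made up for illustration, and the real omitted-words list is longer.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the one used for the graphs is longer.
STOP_WORDS = {"it's", "have", "where", "their", "don't", "from",
              "the", "a", "and", "to", "of", "in", "i", "you"}


def most_common_words(text, stop_words=STOP_WORDS, top=10):
    """Return the `top` most frequent words in `text`, skipping stop words."""
    # Sanitize input: lowercase, and turn curly apostrophes into plain ones.
    text = text.lower().replace("\u2019", "'")
    # Keep runs of letters and apostrophes; drop punctuation and digits.
    words = (w.strip("'") for w in re.findall(r"[a-z']+", text))
    counts = Counter(w for w in words if w and w not in stop_words)
    return counts.most_common(top)


# Made-up sample text, not an actual lyric.
sample = "Baby, it's all over now. Baby, don't you know the night is over?"
print(most_common_words(sample, top=2))
```

To run it on a whole lyrics file, read the file into a string first and pass that string as `text`.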

