In Part I, we applied NLTK to James Joyce's Ulysses and found some interesting features of Chapter 8, Lestrygonians. We started by analyzing characters and letter frequencies, and then moved on to words. In this notebook, we'll be looking at phrases.
In particular, we'll try to improve the part-of-speech tagger by looking at the text at the phrase level, and we'll also apply chunking algorithms that group words into phrases based on their parts of speech.
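As a tiny preview of that chunking, here's a sketch using NLTK's off-the-shelf tagger and a simple noun-phrase grammar of our own invention (the grammar is just illustrative, not the one we'll develop, and this assumes the standard NLTK tagger models are downloaded):
import nltk
# Tag the opening line of Lestrygonians, then chunk noun phrases
# with a simple hand-written grammar:
words = nltk.word_tokenize("Pineapple rock, lemon platt, butter scotch.")
tagged = nltk.pos_tag(words)
parser = nltk.RegexpParser(r"NP: {<DT>?<JJ>*<NN.*>+}")
print parser.parse(tagged)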
Let's start by importing our libraries.
# In case we want to plot something:
%matplotlib inline
from __future__ import division
import nltk, re
import numpy as np
# The io module makes unicode easier to deal with
import io
# Shortcut for printing a separator line:
def p():
    print "-"*20
file_contents = io.open('txt/08lestrygonians.txt','r').read()
print type(file_contents)
# Tokenize the chapter using the Punkt Tokenizer:
sentences = nltk.sent_tokenize(file_contents)
print len(sentences)
print sentences[:21]
Now that we've tokenized the text by sentence, we can set to work. The first useful task is to print out a sentence if it contains a given word. We could create a Text object and use its concordance('word') method, but that only prints a fixed-width window of characters around each match; it doesn't return the sentences for further processing.
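To see the difference, here's the concordance approach (a quick sketch; it prints its output rather than returning it):
t = nltk.Text(nltk.word_tokenize(file_contents))
# Prints a fixed-width character window around each match:
t.concordance('eye')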
Suppose we want to search for a word, like "eye", and we want to return the sentence that contains it, along with two sentences of context (the sentence before, and the sentence after).
We can do this by looping through each sentence, breaking it apart using a word tokenizer, and searching for the word of interest. If we find it, we add the prior sentence, current sentence, and next sentence to the list of instances.
small_sentences = sentences[:21]

def word_with_context(word,sentences):
    final_list = []
    for i,sentence in enumerate(sentences):
        if i>0 and i<(len(sentences)-1):
            words = nltk.word_tokenize(sentence)
            if word in words:
                # Store [previous, current, next] with newlines stripped:
                final_list.append( [re.sub('\n',' ',sentences[i-1]),
                                    re.sub('\n',' ',sentences[i]),
                                    re.sub('\n',' ',sentences[i+1])] )
    return final_list
for i in word_with_context('eyes',sentences):
    p()
    print '\n'.join(i)
This is a useful function that we can combine with some other conditions - such as searching a wordlist for words matching a certain pattern. Then we can pass a pattern, and get back each word matching our pattern, with three sentences of context. We'll need a wordlist first, which we can obtain by tokenizing each of our sentences.
wordlist = nltk.word_tokenize(file_contents)
wordlist = [w.lower() for w in wordlist]
english_words = [w for w in nltk.corpus.words.words('en') if w.islower()]
z1 = set(wordlist)
z2 = set(english_words)
print "Number of words in Chapter 8:",len(wordlist)
print "Number of unique words in Ch. 8:",len(z1)
print "Number of words in English dictionary:",len(z2)
print "Numer of words in Ch. 8 in English dictionary:",len( z1.intersection(z2) )
intersection = z1.intersection(z2)
# Since the intersection is a subset of z1, this is just z1 minus z2:
non_dictionary_words = z1.symmetric_difference(intersection)
print len(non_dictionary_words)
non_dictionary_words = sorted(list(non_dictionary_words))
print non_dictionary_words[110:125]
We now have a list of words that aren't found in an English dictionary provided by the NLTK corpus, so these have the potential to be interesting words. We'll use these results to print out some context for each word.
While we're at it, we can also get word counts of each of these words using a Text object:
text = nltk.Text(wordlist)
print "Number of occurences of",non_dictionary_words[115],":",text.count(non_dictionary_words[113])
result = word_with_context(non_dictionary_words[115],sentences)
print '\n'.join(result[0])
The phrase "woman's breasts full" is reminiscent of Lady Macbeth's speech from Macbeth, Act 1 Scene 5, when she discovers Duncan is staying the night (it has a somewhat, uh, different tone):
Stop up the access and passage to remorse,
That no compunctious visitings of nature
Shake my fell purpose, nor keep peace between
The effect and it! Come to my woman’s breasts,
And take my milk for gall, you murd'ring ministers,
Wherever in your sightless substances
You wait on nature’s mischief.
- Macbeth, Act 1, Scene 5
If instead we wanted to search for words matching a regular expression, we could write a similar function that takes a regular expression, searches each sentence for words matching that expression, and gathers the same context as word_with_context().
def re_with_context(rex,sentences):
    final_list = []
    for i,sentence in enumerate(sentences):
        if i>0 and i<(len(sentences)-1):
            words = nltk.word_tokenize(sentence)
            for word in words:
                if len(re.findall(rex,word))>0:
                    final_list.append( [re.sub('\n',' ',sentences[i-1]),
                                        re.sub('\n',' ',sentences[i]),
                                        re.sub('\n',' ',sentences[i+1])] )
    return final_list
for i,ss in enumerate(re_with_context(r'ood\b',sentences)):
    if i<25:
        p()
        print '\n'.join(ss)
Now we are able to pass words and regular expressions, and get a few sentences of context back in return. We can use various techniques to identify keywords, or provide keywords from a file or a list and iterate through them.
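For instance, a quick sketch (keywords.txt is a hypothetical file with one keyword per line):
# Hypothetical keyword file, one keyword per line:
keywords = [line.strip() for line in io.open('keywords.txt','r')]
for kw in keywords:
    for group in word_with_context(kw,sentences):
        p()
        print '\n'.join(group)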
We can also look for particular phonetic sounds, which often occur in groups (as we can see from the word searches above, many of the sentences are repeated because the "ood" pattern often shows up repeatedly over a few sentences).
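One way to suppress those repeats is to keep only the first context group for each matching sentence (a sketch, reusing re_with_context() from above):
seen = set()
for group in re_with_context(r'ood\b',sentences):
    # group[1] is the sentence containing the match:
    if group[1] not in seen:
        seen.add(group[1])
        p()
        print '\n'.join(group)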
We can also look for patterns across the chapters - something we haven't done yet, since we've been focusing on Chapter 8 alone, as a smaller and more manageable body of text.
First, let's expand on that context function, to print out N sentences of context:
def re_with_context(rex,sentences,n_sentences):
    final_list = []
    half = int(np.floor(n_sentences/2))
    for i,sentence in enumerate(sentences):
        if i>=half and i<(len(sentences)-half):
            words = nltk.word_tokenize(sentence)
            for word in words:
                if len(re.findall(rex,word))>0:
                    short_list = []
                    for s in sentences[i-half:i+half+1]:
                        short_list.append( re.sub(r'[\n\t]',' ',s) )
                    final_list.append(short_list)
    return final_list
for group in re_with_context('eyes',sentences,5):
    p()
    print '\n'.join(group)
If we want to start analyzing Ulysses as a whole and look for connections across chapters, we'll need objects to store data about each chapter, objects that will encapsulate much of the functionality laid out in Part I and Part II of these notebooks.
To design such an object, we would first want to define a UlyssesChapter class; a Lestrygonians object would then be an instance of it, constructed from a text file representing the chapter. The class would provide a number of methods to get useful lists, dictionaries, and sets.
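A minimal sketch of what that class might look like, reusing the imports and the word_with_context() function from above (the method names here are hypothetical):
class UlyssesChapter(object):
    """Encapsulates the per-chapter analysis from Parts I and II."""
    def __init__(self,filename):
        self.text = io.open(filename,'r',encoding='utf-8').read()
        self.sentences = nltk.sent_tokenize(self.text)
        self.wordlist = [w.lower() for w in nltk.word_tokenize(self.text)]
    def unique_words(self):
        return set(self.wordlist)
    def non_dictionary_words(self):
        english = set(w for w in nltk.corpus.words.words('en') if w.islower())
        return self.unique_words() - english
    def word_with_context(self,word):
        return word_with_context(word,self.sentences)

lestrygonians = UlyssesChapter('txt/08lestrygonians.txt')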
One example: building wordlists. Starting from a list of seed words like ['orange','yellow','green','blue','indigo','rose','violet'], matching each seed as a substring against the chapter's vocabulary expands the list into [u'blue', u'greenhouses', u'greens', u'penrose', u'orangepeels', u'bluecoat', u'orangegroves', u'greeny', u'yellow', u'bluey', u'yellowgreen', u'green', u'rose', u'blues', u'greenwich']
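A minimal sketch of that expansion, reusing the wordlist variable from earlier (matching seeds as substrings is an assumption about how the expansion works, based on the output above):
seed_words = ['orange','yellow','green','blue','indigo','rose','violet']
expanded = set()
for w in set(wordlist):
    for seed in seed_words:
        # Keep any vocabulary word containing a seed word as a substring:
        if seed in w:
            expanded.add(w)
print sorted(expanded)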
If we were to tag various sentences, based on the nouns, verbs, and actions they contained, their neighbor sentences, the chapter they're in, etc., it would be possible to tag different themes (e.g., the bar of soap, Paddy Dignam, ghosts) and plot their occurrences temporally throughout the novel.
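For single words, NLTK already offers a rough version of that temporal plot: a lexical dispersion plot, which marks the offset of every occurrence of a word across the text. A sketch using the Text object from above (the target words are just examples, lowercase because our wordlist was lowercased):
text.dispersion_plot(['soap','dignam','eyes','blood'])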
Tracking the appearance and disappearance of various characters (and indeed simply extracting a list of characters from the novel) would be marvelous.
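As a rough first pass at such a character list, we could try NLTK's named-entity chunker (a sketch that assumes the maxent NE chunker models are downloaded; it will certainly be noisy on Joyce's prose):
people = set()
for sentence in sentences:
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    # Collect subtrees that the chunker labels as PERSON:
    for subtree in nltk.ne_chunk(tagged).subtrees():
        if subtree.label()=='PERSON':
            people.add(' '.join(word for word,tag in subtree.leaves()))
print sorted(people)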