Search all six Jane Austen novels


*Return to Jane Austen info page
*Return to Jane Austen's writings

The following form will search the text of Jane Austen's novels, and of Lady Susan and The Watsons. The text of the novels is stored in a file in which each sentence is put on a separate line, and searching is done within each line. The result returned by the search will be those lines on which the requested search pattern is found. All searching is done in a case-insensitive way.


Search for:


Show chapter and volume markers

These are the details of the different search methods:

Phrase (keyword sequence) search:
This search finds any passage in which all the keywords occur in the same order that they were entered into the form box. Each individual keyword is matched as a whole word (in the same way as explained for the next search method below), and all punctuation is ignored, both in the search pattern and in the text being searched.
Keyword Search -- ANY:
This type of search most closely resembles those of WWW search engines (though it is not exactly the same). In this search, all punctuation is ignored, each sequence of alphabetic characters is searched for as a whole word, and every sentence that contains any of these keywords is returned as a result of the search (in other words, this is a logical "or" search). So searching for "hat" will return only the sentences that contain the word "hat", and searching for "French, Italian" will return all sentences which contain either the word "French" or the word "Italian" (the comma is ignored here, except in its function of separating the two keywords).
When using this search, you need to search separately for noun plurals, inflected forms of verbs, etc. (So searching for "hat" will not find sentences which contain the word "hats", unless they also happen to contain the word "hat".) However, you can find all occurrences of words beginning with a certain sequence of letters by ending a keyword with the special "*" wildcard character (so that using the search keyword "hat*" will find sentences that contain any of the words "hat", "hats", "hatred", etc.).
A final refinement of this search is that a hyphen directly preceded and followed by alphabetic characters forms a special composite keyword, which will match against cases in the e-texts where the hyphen is present, absent, or replaced by a space. So the keyword "mantel-piece" will return sentences that contain "mantel-piece", "mantelpiece", or "mantel piece". (This feature gets around inconsistencies of hyphenation.)
(Note that any hyphen which does not have letters on both sides will be ignored, as will any alphabetic characters that occur after an asterisk character in a search keyword.)
Exact String Search:
This search simply takes the exact string of characters that you have typed in (except that any leading or trailing spaces are removed), and returns all the sentences in the e-texts which contain this precise sequence (including punctuation characters) -- even where the string matches against parts of words, rather than whole words. So if you do an exact string search on "hat", the search will return the rather unmanageably large list of all sentences which happen to include words that contain the sequence of letters h, a, t. And searching for "French, Italian" will only return the sentences where the words "French" and "Italian" occur next to each other, in this order, and are separated by a comma.
Regular Expression Search (egrep):
This type of search allows you to use the regular expression wildcard language which is available with the Unix egrep command.
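
For readers who want to reproduce the keyword rules on their own copy of the texts, here is a minimal Python sketch of the "Keyword Search -- ANY" and "Exact String Search" behaviour described above. It is not the code that runs behind the form; the names TOKEN, keyword_patterns, search_any and search_exact are hypothetical, and the sketch assumes the sentences are already held in a list of strings, one sentence per entry.

```python
import re

# One keyword: letters, optional letter-hyphen-letter joins, optional trailing "*".
# Letters typed after the "*" are captured only so that they can be discarded.
TOKEN = re.compile(r"([A-Za-z]+(?:-[A-Za-z]+)*)(\*[A-Za-z]*)?")


def keyword_patterns(query):
    """Return one whole-word regex string per keyword found in the query."""
    patterns = []
    for match in TOKEN.finditer(query):
        parts = match.group(1).split("-")
        # A hyphen between letters may appear as "-", " ", or nothing in the text.
        body = "[- ]?".join(re.escape(part) for part in parts)
        if match.group(2) is not None:
            body += "[A-Za-z]*"  # trailing "*": any word starting with the keyword
        # Lookarounds rather than \b, so that "_" italics markers count as word edges.
        patterns.append("(?<![A-Za-z])" + body + "(?![A-Za-z])")
    return patterns


def search_any(query, sentences):
    """Keyword search, ANY: keep sentences that match at least one keyword."""
    patterns = keyword_patterns(query)
    if not patterns:
        return []
    regex = re.compile("|".join(patterns), re.IGNORECASE)
    return [s for s in sentences if regex.search(s)]


def search_exact(query, sentences):
    """Exact string search: case-insensitive substring match on the trimmed query."""
    needle = query.strip().lower()
    return [s for s in sentences if needle in s.lower()]
```

With this sketch, search_any("hat", ...) keeps only sentences containing the word "hat", while search_exact("hat", ...) also keeps sentences containing "hats", "hatred" or "What", which is the difference between the two methods explained above.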

Search results are separated by novel, but no indication of each sentence's exact location within a novel is given, unless the "Show Chapter and Volume Markers" box is clicked (in which case all chapter headings will be shown). Also, the "Show Surrounding Context" option in the search form above causes the three sentences which precede and follow every matching sentence to be shown. (Selecting "Show Chapter and Volume Markers" has no effect if the "Show Surrounding Context" option is also specified.)
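
The "Show Surrounding Context" behaviour can be pictured with the following Python sketch. It rests on assumptions that the description above does not settle: that "three sentences" means three before and three after each match, that overlapping windows are merged rather than repeated, and that the window is simply cut short at the start or end of a text. The function name with_context and its parameters are hypothetical.

```python
def with_context(sentences, is_match, width=3):
    """Return matching sentences plus up to `width` neighbours on each side."""
    keep = set()
    for i, sentence in enumerate(sentences):
        if is_match(sentence):
            start = max(0, i - width)
            stop = min(len(sentences), i + width + 1)
            keep.update(range(start, stop))  # overlapping windows merge here
    # Emit the kept sentences in their original order.
    return [sentences[i] for i in sorted(keep)]
```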

Caveats: Publicly available (i.e. non-scholarly) e-texts of the novels were used, and these often have modernized punctuation and spelling. (The e-text of Lady Susan is closer to the original manuscript, and has some occasional idiosyncratic Jane Austen spellings, such as "ei" for "ie".) Sentences from the middle of a letter, or a multi-sentence quotation, do not have any punctuation that indicates they are not part of the narrative. Paragraph breaks in the original texts have not been preserved. Chapter numbering has not yet been harmonized between the e-texts of the different novels (i.e. roman numerals vs. decimal numbers, and volume-relative chapter numbering vs. whole-book chapter numbering).

Details of the format of the texts: Abbreviations such as "Mr", "Mrs", "Col" and "St" do not have any period at the end. There is no sequence of more than one space character in the search text. In the texts of Mansfield Park, Pride and Prejudice, Lady Susan, and The Watsons, the beginnings and ends of italics are marked by "_" characters; italicization is not marked in the other novels. The e-texts of the novels are otherwise entirely in plain ASCII form (no HTML or other markup), except that there is a "<P>" tag at the end of each line.
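
If you copy result lines for use elsewhere, the two markup conventions just mentioned are easy to strip. The following Python sketch shows one way; the function name clean_line is hypothetical, and the sketch assumes that "_" never occurs in the texts except as an italics marker.

```python
def clean_line(line):
    """Strip the trailing "<P>" tag and the "_" italics markers from one line."""
    text = line.rstrip()
    if text.endswith("<P>"):
        text = text[: -len("<P>")]
    # Assumption: every "_" is an italics marker, so all of them can be dropped.
    return text.replace("_", "").rstrip()
```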





- Republic of Pemberley - To our Amazon storefront page