Text Analysis Info - Information retrieval software
Last update: 18. August 2008
The programs listed here fall into two finer groups:
- pure information retrievers: searching and displaying texts, indexers
- concordancers: programs providing concordances
program: Analysis 2.94
author: Giovanni Lo Conti
distributor: Giovanni Lo Conti (gloconti@romascuola.net)
documentation: none
download: free version
operating system: MS-Windows, Digital Unix, Acorn RiscOS
description: Analysis is a program that offers several types of text
analysis: concordances, KWIC, KWOC, readability indexes, co-occurrences,
lemmatization, sentence statistics, non-intelligent abstract, summary,
meaningful and sense, incipit, explicit, and frequency. For many
procedures it is possible to delimit the range or to compare the text with
an electronic dictionary. It is provided with Help, online Help, and Wimp.
program: AntConc 3.2.1.
author: Laurence Anthony
distributor: Linguist's Software
documentation: readme file for usage
download: free version
operating system: MS-Windows, MacOS, Linux
description: This is a free concordance program.
program: AnyText
author: Linguist's Software
distributor: Linguist's Software
documentation: no
download: no
operating system: MacOS System 7.1-9.2, or the Classic system in OS X. (You must be able to boot into Classic to install.) 2 MB of RAM.
description:
AnyText is a HyperCard®-based Full Proximity Boolean Search Engine and Index Generator that allows you to create concordances and do FAST word searches on ordinary text files in English, Greek and Russian languages. AnyText was designed especially to work with the Greek, English, Cyrillic and Latin Bible texts, but can be used with any text-only file. The text files can be on diskettes, hard disk drives or CD-ROM drives, as long as there is disk space for the special indexing files that AnyText must create and access for operation. Requires 2 Megabytes of RAM.
program: Ask Sam 7.0
author: Ask Sam Software Development
distributor: Ask Sam Software Development
documentation: overview and quick tour
download: trial version
operating system: MS-Windows
description: AskSam is a fast information retrieval program that can search e-mails
and PDF files. The new professional version allows programming (e.g. with Visual Basic).
ATA - Ashton Text Analyser (WinATA Mark 2)
program: ATA - Ashton Text Analyser
author and distributor: Peter Roe

documentation: user's guide
download: no, but it is free for non-commercial applications
operating system(s): Win9x, WinNT
description: ATA generates word lists, KWIC, KWOC
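For readers unfamiliar with the term, a KWIC (keyword in context) listing shows every occurrence of a search word together with a few words to its left and right. The following is a minimal sketch in Python, not the method used by ATA or any other program listed here; the function name `kwic`, the `width` parameter, and the simple `\w+` tokenizer are assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit.

    Illustrative only: tokens are lowercased runs of word characters,
    and `width` is the number of context words on each side.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

sample = "The cat sat on the mat. The dog saw the cat."
for left, kw, right in kwic(sample, "cat", width=2):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>12} | {kw} | {right}")
```

A KWOC (keyword out of context) listing could be derived from the same hits by printing the keyword in a margin followed by the full line it occurs in.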
program: Collocate
author: Michael Barlow
distributor: Athelstan
documentation: is in the test version file
download: demo. The demo processes data in the same manner as the full version, but the results are limited to the top 5 items.
operating system(s): Win9x
description: Collocate is a new software program that can be used to find collocations or terms in a corpus. There are three main components:
- Search for a word (phrase) within a set span (e.g. 4 words). The program lists all the collocations containing the search word and provides frequency and/or statistical information (Log Likelihood, Mutual Information).
- Produce an n-gram list for the corpus.
- Extract collocations from the corpus as a whole.
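The span search and n-gram list described above can be sketched as follows. This is an illustration under stated assumptions, not Collocate's actual algorithm: the function names `collocates` and `ngrams` are hypothetical, tokenization is a plain `\w+` split, and the Mutual Information score uses one common simplified form, log2(joint * N / (f(node) * f(word))). Real tools may normalize by span size or use other variants.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercased runs of word characters (a simplifying assumption)."""
    return re.findall(r"\w+", text.lower())

def collocates(text, node, span=4):
    """List (word, joint frequency, MI) for words within `span` of `node`."""
    tokens = tokenize(text)
    n = len(tokens)
    freq = Counter(tokens)
    joint = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            # Words to the left and right of this occurrence of the node.
            joint.update(tokens[max(0, i - span):i])
            joint.update(tokens[i + 1:i + 1 + span])
    rows = []
    for word, count in joint.items():
        mi = math.log2(count * n / (freq[node] * freq[word]))
        rows.append((word, count, mi))
    # Highest MI first; ties broken alphabetically for stable output.
    return sorted(rows, key=lambda r: (-r[2], r[0]))

def ngrams(tokens, n=2):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))

text = "strong tea and strong coffee but weak tea"
for word, count, mi in collocates(text, "strong", span=1):
    print(word, count, round(mi, 2))
print(ngrams(tokenize(text), 2).most_common(3))
```

Mutual Information tends to favour rare words, which is one reason tools in this category usually offer Log Likelihood as an alternative score.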
program: Concordance 3.2
author and distributor: Rob J C Watt

documentation: manual
download: trial version
operating system(s): Win9x, WinNT, WinXP
description:
phrases, proximity search, samples, regular expression search, references
book-like indexing, treat upper and lower case separately, show duplicate words separately,
analyse characters instead of words. It can also handle East Asian languages (e.g. Chinese).
sort headwords by order of occurrence,
sort word endings using a string sort,
sort contexts by string before and string after headword
language support including East Asian languages on Windows 2000/XP
user-definable HTML entity translation
program: Corpus Presenter 10.0
author: Raymond Hickey
distributor: Raymond Hickey
documentation: manual
download: full and free version
operating system(s): WinXP
description: Corpus Presenter is a suite of programs designed to work with both existing corpora and any files which users might wish to examine for linguistically interesting structures. It has all the options of standard corpus software, i.e. it can generate concordances, word lists and perform a whole range of text retrieval tasks and generate reverse dictionaries of words in texts. It does not require that texts are prepared in any way, e.g. by indexing them in advance.
Note: some pages are 10 years old, and I couldn't find current information. (HK)
This program was removed because the web pages were outdated and I couldn't find current information. (HK)
LEXA 7.0 - Corpus Processing Software
program: LEXA 7.0
author: Raymond Hickey, University of Essen, Germany
distributor: University of Bergen, Norway
documentation: documentation quite like a manual
download: test
operating system(s): DOS
description: LEXA is an open system based on files. It can perform lemmatisation
and produce word lists and lexical density tables. It also offers file comparison,
global find and replace, database and corpus management functions (print, sort),
statistics on characters, words, and sentences, and searching groups of files for
strings, also with the wildcards * and ?, and also in databases (DBF files).
There are also many DOS utilities.
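As an illustration of wildcard searching with * and ?, the sketch below uses Python's standard `fnmatch` module, where * matches any run of characters and ? matches exactly one. It is an assumption that LEXA's wildcards behave the same way; the function name `wildcard_search` is hypothetical.

```python
import fnmatch
import re

def wildcard_search(text, pattern):
    """Return the words in `text` that match a shell-style pattern.

    Matching is done per word and case-insensitively by lowercasing
    both sides; fnmatchcase avoids platform-dependent case folding.
    """
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if fnmatch.fnmatchcase(w, pattern.lower())]

sample = "Walk walked walking talk"
print(wildcard_search(sample, "walk*"))
print(wildcard_search(sample, "?alk"))
```

Note that `fnmatch` also interprets square brackets as character sets, which goes beyond the two wildcards named in the description.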
program: Metamorph
distributor: Thunderstone Software
documentation: manual
download: none
operating systems: DOS, Win9x, WinNT, Unix
description:
Metamorph is a realtime concept based search package. It will search through
anything without any pre-processing steps.
Metamorph has an English language vocabulary of 250,000 word and phrase concept
associations for natural language queries, also boolean logic (with weights),
and wildcards can be used.
It also provides proximity control, fuzzy searches, true regular expression
matching, and numerical value searches.
The Metamorph API alone is available for most operating systems.
program: MicroConcord
author: Mike Scott, Tim Johns
distributor: Mike Scott
documentation: none
download: freeware
operating system(s): DOS
description: MicroConcord is the predecessor of WordSmith. It is faster than
Windows but the number of concordance lines is limited to around 1,500, and you
can't save a concordance except as a text file.
MicroOCP - Oxford Concordance Package
The program is no longer available. Outdated information on the web may tell you otherwise, but it leads only to dead links.
program: MonoConc 2.2
author: Michael Barlow
distributor: Athelstan
documentation: unknown
download: demo limited to 20 hits
operating system(s): Win9x
description: MonoConc is a concordancer. It can create concordances and word lists
(with exclusion lists, case sensitive/insensitive), convert texts, and work
with tagged texts and with different languages. Searching can be done with
wildcard characters and variable (multi-line) context (also a sentence).
Sorting by the words to the left and right and collocation of words are possible, too.
program: Phrase Context

author/distributor: Hans J. Klarskov Mortensen
download: test version
documentation: none
operating systems: Windows ?
description: Phrase Context is a versatile program that counts words and phrases, produces concordances, calculates TTR and lexical density values, accepts regular expressions as search patterns, and writes XML-formatted output files.
The author also provides some free utilities, e.g. for extracting text from PDF files.
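The two measures named in this entry can be sketched briefly. TTR (type-token ratio) is the number of distinct words divided by the total number of words; lexical density is commonly computed as the share of content words among all words. This is not Phrase Context's implementation: the function names are hypothetical, and the tiny `FUNCTION_WORDS` set is an illustrative stand-in for a real function-word list.

```python
import re

# Illustrative stand-in only; a real analysis needs a fuller list.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "and", "or",
                  "to", "is", "are", "was", "it"}

def tokenize(text):
    """Lowercased runs of word characters (a simplifying assumption)."""
    return re.findall(r"\w+", text.lower())

def type_token_ratio(tokens):
    """Distinct words divided by total words; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def lexical_density(tokens):
    """Share of tokens that are not in FUNCTION_WORDS; 0.0 if empty."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

tokens = tokenize("The cat sat on the mat")
print(type_token_ratio(tokens), lexical_density(tokens))
```

TTR falls as texts get longer, so values are only comparable between samples of similar size.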
SCP 4.0.9 - Simple Concordance Program
program: SCP 4.0.9 - Simple Concordance Program
author/distributor: Alan Reed
download: free software
documentation: none
operating systems: WinXP, MacOSX
description: This free program lets you create word lists and search natural language text files for words, phrases, and patterns. SCP is a concordance and word listing program that is able to read texts written in many languages. There are built-in alphabets for English, French, German, Greek, Russian, etc. SCP contains an alphabet editor which you can use to create alphabets for any other language.
Sonar 2003.32 Text Retrieval/Document Management Systems
program: Sonar 2003.32
distributor: Virginiasystems
download: demo
documentation: none
operating systems: Win9x, WinNT, MacOS
description: High-speed program that can process many types of text and word processing files.
program: Textalyser
author: Bernhard Huber
distributor: none
documentation: self-explanatory
download: none
operating system: runs on a web site
description: Textalyser is a free text analysis tool that counts words, sentences, and syllables and computes lexical density. It also computes the Gunning readability index. A small but nice tool that counts syllables correctly, at least for English, French, and German. You can cut and paste text or specify a web page.
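The Gunning index mentioned above is usually given as 0.4 * (average sentence length + percentage of words with three or more syllables). The sketch below follows that formula; it is not Textalyser's code. The syllable counter is a rough English-only heuristic that counts groups of vowels, and the names `count_syllables` and `gunning_fog` are hypothetical.

```python
import re

def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def gunning_fog(text):
    """0.4 * (words per sentence + percent of words with 3+ syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_length + percent_complex)

print(gunning_fog("The cat sat. The dog ran."))
```

The vowel-group heuristic miscounts words with silent vowels (for example a final "e"), so its results will differ from a tool that counts syllables with language-specific rules.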
program: Textstat 2.7
author: Matthias Hüning

distributor: Matthias Hüning
documentation: manual
download: freeware
operating system: Windows, MacOS, Linux
description:
TextSTAT is a simple programme for the analysis of texts. It reads ASCII/ANSI texts and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. The programme runs on MS Windows and is distributed as freeware. Source code in Python is also available for free. User interface in German (default), English, and French.
program: WordSmith 5.0
author: Mike Scott
distributor: Mike Scott, Liverpool University
documentation: manual in English, French, and German
download: test version shows a sample of the results only
operating system: Win9x, WinNT
description: WordSmith is the successor of MicroConcord.
Please send comments and suggestions to