|
Text Analysis Info - Content: quantitative with category system |
Last update: 12. February 2008
new:
this section was re-edited
|
CoAn 2.08 - Content Analysis (German only) |
author: Matthias Romppel
program: CoAn 2.08
documentation: printed manual in German
download: test
operating system: Win 3.x, Win9x, WinNT
description: word list, concordances, frequencies of categories
CoAn is inspired by an earlier version of Intext. It uses dictionaries
to code texts; special features are interactive coding and powerful search patterns
such as word co-occurrences. It is available in German only. Personal comment: the website has not been updated since 2001.
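Dictionary-based coding and a co-occurrence search pattern can be illustrated with a minimal Python sketch. This is not CoAn's own implementation: the category names, the word lists, and the functions `code_text` and `cooccur` are hypothetical and only show the general idea.

```python
import re
from collections import Counter

# Hypothetical category dictionary: category name -> set of lower-case words.
DICTIONARY = {
    "ECONOMY": {"tax", "taxes", "budget", "market"},
    "CONFLICT": {"war", "attack", "fight"},
}

def tokenize(text):
    """Split text into lower-case word tokens (including German umlauts)."""
    return re.findall(r"[a-zäöüß]+", text.lower())

def code_text(text, dictionary):
    """Count how many tokens fall into each dictionary category."""
    counts = Counter()
    for token in tokenize(text):
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    return counts

def cooccur(text, word_a, word_b, window=5):
    """True if word_a and word_b occur within `window` tokens of each other."""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == word_a]
    pos_b = [i for i, t in enumerate(tokens) if t == word_b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

sample = "The budget debate turned into a fight over taxes."
print(code_text(sample, DICTIONARY))
print(cooccur(sample, "budget", "taxes", window=10))
```

The window size is measured in tokens; a real program would typically also offer sentence or paragraph boundaries as the co-occurrence unit.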
program: DICTION 5.0
author: Roderick P. Hart

distributor: Roderick P. Hart
download: trial version
manual: manual
operating system: Windows
description:
Diction 5.0 uses dictionaries (word-lists) to search a text for these qualities:
- Certainty: Language indicating resoluteness, inflexibility, and completeness and a tendency to speak ex-cathedra.
- Activity: Language featuring movement, change, the implementation of ideas and the avoidance of inertia.
- Optimism: Language endorsing some person, group, concept or event, or highlighting their positive entailments.
- Commonality: Language highlighting the agreed-upon values of a group and rejecting idiosyncratic modes of engagement.
- Realism: Language describing tangible, immediate, recognizable matters that affect people's everyday lives.
The results can be statistically analysed and are compared with other texts, so that an under- or overrepresentation of categories can be detected.
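How a category score might be compared with other texts can be sketched as follows. This is an assumption-laden illustration, not Diction's actual procedure: the word list, the reference rates, and the cut-off of one standard deviation are all made up for the example.

```python
from statistics import mean, stdev

def category_rate(tokens, word_list):
    """Occurrences of word_list entries per 100 tokens."""
    hits = sum(1 for t in tokens if t in word_list)
    return 100.0 * hits / len(tokens)

def z_score(value, reference_values):
    """Standardise a score against reference texts (sample standard deviation)."""
    return (value - mean(reference_values)) / stdev(reference_values)

# Hypothetical word list and invented reference rates, for illustration only.
CERTAINTY = {"always", "never", "must", "all"}
reference_rates = [2.0, 4.0, 6.0]

tokens = "we must always act and we must never wait".split()
rate = category_rate(tokens, CERTAINTY)
z = z_score(rate, reference_rates)

# Assumed rule: more than one standard deviation from the mean is flagged.
flag = "over" if z > 1 else "under" if z < -1 else "normal"
print(round(rate, 1), round(z, 2), flag)
```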
program: General Inquirer
author and distributor: Philip J. Stone
download: yes, but only the category systems
operating system: Java; the category systems are Excel files (XLS)
documentation: description of categories
description: The grandfather of many content analysis programs is now available
for computers that run Java and can read the category systems (Excel files).
|
LIWC 2007 - Linguistic Inquiry and Word Count |
program: LIWC 2007 - Linguistic Inquiry and Word Count
authors: James W. Pennebaker, Roger J. Booth, and Martha E. Francis
distributor: Erlbaum Associates
download: with registration only
operating system: MS-Windows, MacOS
documentation: LIWC 2007 manual
description:
The program analyses text files word by word, calculating the percentage of words that match each
of several language dimensions. The program has 68 preset dimensions (output variables), including linguistic dimensions, word categories tapping psychological constructs, and personal concern
categories, and can accommodate user-defined dimensions as well.
In the new LIWC 2007 version the dictionary has been extended. The MacOS version has new features such as phrases and parts of words (stems) as search patterns, and highlighting of the text. A lite version for students is also available.
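The word-by-word percentage calculation described above can be sketched in a few lines of Python. The dimension names and word lists below are hypothetical and are not taken from the LIWC dictionary; a trailing `*` is assumed here to mark a stem.

```python
import re

# Hypothetical mini-dictionary; a trailing "*" marks a stem (prefix) pattern.
DIMENSIONS = {
    "posemo": ["happy", "joy*", "good"],
    "negemo": ["sad*", "hate*"],
}

def matches(word, pattern):
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern

def dimension_percentages(text, dimensions):
    """Percentage of all words in the text that match each dimension."""
    words = re.findall(r"[a-z']+", text.lower())
    result = {}
    for name, patterns in dimensions.items():
        hits = sum(1 for w in words if any(matches(w, p) for p in patterns))
        result[name] = 100.0 * hits / len(words) if words else 0.0
    return result

scores = dimension_percentages(
    "Joyful people hate sadness but enjoy good days", DIMENSIONS
)
print(scores)
```

Note that a stem only matches the beginning of a word, so "enjoy" is not counted under `joy*` in this sketch.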
|
MCCA - Minnesota Contextual Content Analysis |
program: Dimap 4.0 with MCCA
operating system: Win95
authors: Ken Litkowski, Donald McTavish
distributor: CL Research
download: test
documentation: no, but many white papers on the website
description: DIMAP/MCCA description
|
INTEXT 4.1 - INhaltsanalyse von TEXTen (content analysis of texts) |
program: INTEXT 4.1
author and distributor: Harald Klein

documentation: manual as a PDF-file in English and German, CD 20 € including postage and packing.
download: full free version
operating system: DOS
description: Intext is the MS-DOS version of TextQuest. It uses dictionaries
to code texts; special features are interactive coding, powerful search patterns
such as word co-occurrences, and negation detection. Readability measures, text
statistics, and word sequences are also part of the program. It is available in English and German.
Online help is available. The Windows version, TextQuest, has a modern user interface. Intext will no longer be supported.
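Negation detection can be illustrated with a simple rule: a coded word counts as negated if a negator appears shortly before it. The following Python sketch is an assumption about the general technique, not Intext's actual algorithm; the negator list and the window of two tokens are illustrative choices.

```python
import re

# Illustrative negator list for English and German; not taken from Intext.
NEGATORS = {"not", "no", "never", "nicht", "kein", "keine"}

def code_with_negation(text, category_words, window=2):
    """Return (word, negated) pairs for every category word in the text.

    A hit counts as negated if a negator occurs within `window`
    tokens before it.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        if token in category_words:
            preceding = tokens[max(0, i - window):i]
            hits.append((token, any(t in NEGATORS for t in preceding)))
    return hits

result = code_with_negation("I am not happy, but she is happy.", {"happy"})
print(result)
```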
program: PCAD 2000
author and distributor: GB Software

documentation: manual
download: no
operating system(s): Win9x
description: The primary area of interest is measuring psychobiologically interesting states such as anxiety, hostility, and hope using the Gottschalk-Gleser content analysis scales.
These scales have been empirically developed and tested, and have been shown to be reliable and valid in a wide range of studies. Louis A. Gottschalk (M.D. Ph.D.) has been the principal developer of these scales, and has applied them in many areas of medicine and beyond.
|
Protan - Protocol Analyser |
program: PROTAN
author and distributor: Robert Hogenraad

documentation: overview on the analysis modules
download: no
operating system(s): DOS, MacOS, Unix
description: word list, concordances, frequencies of categories, sequences of categories
manuals in electronic and printed form; documentation in French only
PROTAN is also a successor of the General Inquirer, with many utilities that perform
numerous text analysis tasks. PROTAN is very complex and difficult to handle. The documentation
is in French, whereas the command language for the utilities is English.
comment: Although the functions of PROTAN are very impressive, it takes some
time to make use of everything it offers. Many standardised category systems are available for different languages, such as all language versions of Colin Martindale's RID (Regressive Imagery Dictionary), the Harvard dictionary, the Whissell dictionary, and many others.
removed, no relevant information on the website
|
TEXTPACK 7.0 - TextPackage |
program: TEXTPACK 7.0
authors: Peter Ph. Mohler and Cornelia Züll
distributor: ZUMA Mannheim
documentation: short manual
download: demo with attached texts only
operating systems: Win9x, WinNT; available in English or Spanish
description:
TEXTPACK features:
- Word frequencies for the entire text or its sub-units; these can be filtered by external variables (identifiers) and/or frequency and sorted alphabetically or by frequency; sort order tables are possible.
- Keyword-in-Context and Keyword-out-of-Context (KWIC/KWOC): single words, word roots (beginnings of strings), or word sequences can be shown in their context.
- Cross-references and concordances.
- Word comparison of two texts.
- Categorisation: TEXTPACK categorises/classifies a text according to a user dictionary. It generates files with either category frequencies or category sequences. The validity of the coding can be checked with various options (e.g., the insertion of category numbers or category labels in the continuous text).
- Selection of text units: filtering on the basis of external variables, or using a numeric file to select text units.
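A keyword-in-context (KWIC) listing, one of the features named above, can be sketched as follows. This is a minimal illustration and not TEXTPACK's implementation; the function name `kwic`, the context width in words, and the output layout are assumptions.

```python
def kwic(text, keyword, width=3):
    """Return keyword-in-context lines with `width` words on either side."""
    tokens = text.split()
    lines = []
    for i, token in enumerate(tokens):
        # Compare case-insensitively and ignore surrounding punctuation.
        if token.strip(".,;:!?\"'").lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the keywords line up in a column.
            lines.append(f"{left:>30} | {token} | {right}")
    return lines

text = "The analysis of text is old. Computer text analysis is newer."
for line in kwic(text, "text"):
    print(line)
```

Searching for word roots instead of whole words would only require replacing the equality test with a prefix test such as `startswith`.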
program: TextQuest 3.0
author and distributor: Harald Klein

documentation: manual (PDF-file) included in the test version
download: test version in English, German, and Spanish
operating system(s): MS-Windows - planned: MacOS, Linux
description: TextQuest is the Windows version of INTEXT.
It uses dictionaries to code texts; special features are interactive coding, powerful search patterns
such as word co-occurrences, and negation detection for English and German.
The text exploration features are word lists supporting sort order tables,
exclusion lists (stop words), KWIC lines of variable length, and lists of
word sequences (phrases) and word permutations. The readability module consists of 68 readability formulas for 7 languages (English, French, German, Spanish, Dutch, Danish, and Swedish).
It is available in English and German; version 1.9
is also available in Spanish.
In version 3.0 there are two new modules: the multiple vocabulary comparison module and the category manager.
Since version 1.8, standard category systems such as the RID (Regressive Imagery Dictionary) for English and
German and the HKW (Hamburger kommunikationssoziologisches Wörterbuch) have been included.
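To give an idea of what a readability formula computes, here is a sketch of the Flesch Reading Ease score for English. The formula is chosen for illustration only, since the list of formulas implemented in TextQuest is not given here, and the syllable counter is a rough heuristic rather than a linguistic rule.

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * len(words) / len(sentences)
        - 84.6 * syllables / len(words)
    )

score = flesch_reading_ease("The cat sat on the mat. It was happy.")
print(round(score, 1))
```

Higher scores indicate easier text. Formulas for other languages generally use different constants, so this function should not be applied unchanged to non-English text.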
|
VB-pro - Verbatim Protocol |
removed
|
Whissell's Dictionary of Affect in Language |
author: Cynthia Whissell
download: demo
program: Dictionary of Affect in Language (DAL)
operating system: MS-Windows, MacOS, Lynx, Web
documentation: manual
description: none yet
program: WordStat 5.1
author: Normand Peladeau
distributor: Provalis Research or Social Science Consulting (Europe)
documentation: manual as a PDF-file
download: test version (expires after 30 days) or self-running demo
operating systems: MS-Windows
description: WordStat is an add-on to SimStat, a general-purpose
statistics program (comparable to SPSS, for example). The two packages are integrated and are especially
useful for coding answers to open-ended questions. WordStat also includes thesauri and
spell-checkers for different languages. It comes with Colin Martindale's RID (Regressive Imagery Dictionary) in English, French, Portuguese, Swedish, German, and Latin, and a few other dictionaries and thesauri (WordNet, Roget's thesaurus).
program: Yoshikoder 0.6.3
author: Will Lowe
distributor: Will Lowe
documentation: user documentation
download: free version
operating systems: MS-Windows, MacOS-X, Linux, with Java environment
description: Yoshikoder works with text documents, whether in plain ASCII, Unicode (e.g. UTF-8), or a national encoding (e.g. Big5 Chinese). You can construct, view, and save keywords-in-context. Content analysis dictionaries can be constructed using Perl-style regular expressions. Yoshikoder provides summaries of documents, either as word frequency tables or according to a content analysis dictionary. You can also compare documents according to their word frequency profiles or with respect to a content analysis dictionary. Yoshikoder's native file format is XML, so dictionaries and keyword-in-context files are non-proprietary and human-readable. The RID and LIWC dictionaries are also available.
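A content analysis dictionary whose entries are regular expressions can be sketched in Python, whose `re` module uses a Perl-style syntax. The categories and patterns below are hypothetical and this is not Yoshikoder's code or file format; each pattern is assumed to match a whole token.

```python
import re
from collections import Counter

# Hypothetical dictionary: category -> list of regular expressions,
# each matched against a whole token.
REGEX_DICTIONARY = {
    "government": [r"govern(s|ed|ment|ments)?", r"minist(er|ers|ry)"],
    "economy": [r"econom\w*", r"tax(es)?"],
}

def apply_dictionary(text, dictionary):
    """Count tokens per category; a token counts once per matching category."""
    compiled = {
        cat: [re.compile(p) for p in patterns]
        for cat, patterns in dictionary.items()
    }
    counts = Counter()
    for token in re.findall(r"\w+", text.lower()):
        for cat, patterns in compiled.items():
            if any(p.fullmatch(token) for p in patterns):
                counts[cat] += 1
    return counts

report = apply_dictionary(
    "The minister said the government taxes the economy.", REGEX_DICTIONARY
)
print(report)
```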
program: UIMA
authors: many
distributor: IBM Research and IBM Software Group and Carnegie Mellon University
documentation: user documentation
download: list of available components
operating systems: Java; the SDK is independent of the operating system
description: UIMA stands for the Unstructured Information Management Architecture.
It is an open, industrial-strength, scalable and extensible platform for creating, integrating and deploying unstructured information management solutions from combinations of semantic analysis and search components.
IBM is making UIMA available as free, open source software to provide a common foundation for industry and academia to collaborate and accelerate the world-wide development of technologies critical for discovering the vital knowledge present in the fastest growing sources of information today.
The UIMA Software Development Kit (SDK) is freely available, as is the source code of the UIMA core Java framework.
In particular, the UIMA APIs are available for creating customized solutions in WebSphere Information Integrator OmniFind Edition.
Please send comments and suggestions to