r2 - 07 Jul 2005 - 14:22:19 - LisaDusseault
Here is a list of useful programs that implement functions for the CATY toolbox. Most of them include their source under the GPL. I encourage prospective implementers to look at them for reference only.

Do get an impression of how they work!
Do your own implementation!
Do not copy any code!

Chgrep

'chgrep' searches the input files (or standard input if no files are named) for oldpattern and changes each match to newpattern (grep doesn't support this). You can use .lock files (or another extension). It is useful in (but not limited to) mail servers.

Learn more about it at:
Get sources at: http://www.bmk.bicom.pl/chgrep/chgrep-1.2.2.tgz
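
To get an impression of the idea without looking at the chgrep sources, the following is a minimal Python sketch of a locked search-and-replace on a single file. It is not chgrep's implementation: the function name `replace_in_file`, the `lock_ext` parameter, the use of regular expressions for oldpattern, and the write-then-rename step are all assumptions made for illustration.

```python
import os
import re
import tempfile


def replace_in_file(path, old_pattern, new_text, lock_ext=".lock"):
    """Replace every regex match of old_pattern in path; return the match count.

    Hypothetical helper, not taken from chgrep.
    """
    lock_path = path + lock_ext
    # O_EXCL makes creation fail (FileExistsError) if the lock file already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)
    try:
        with open(path, encoding="utf-8") as f:
            text = f.read()
        updated, count = re.subn(old_pattern, new_text, text)
        if count:
            # Write to a temporary file in the same directory, then swap it in,
            # so readers never see a half-written file.
            directory = os.path.dirname(os.path.abspath(path))
            with tempfile.NamedTemporaryFile(
                "w", dir=directory, delete=False, encoding="utf-8"
            ) as tmp:
                tmp.write(updated)
            os.replace(tmp.name, path)
        return count
    finally:
        # Release the lock whether or not the replacement succeeded.
        os.remove(lock_path)
```

A call such as `replace_in_file("aliases", r"old@", "new@")` would rewrite the file only when at least one match is found. Note that the temporary file does not inherit the original file's permissions, which a real tool would have to handle.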

Meld

'Meld' is a GNOME 2 diff and merge tool. It lets you edit files in place (diffs update dynamically), and a middle column shows detailed changes and allows merges. It has user-friendly diff-browsing. The margins show the location of changes, and it also has a tabbed interface that lets you open multiple diffs at once.

Learn more about it at:
Get sources at: http://prdownloads.sourceforge.net/meld/meld-0.9.1.tgz?use_mirror=twtelecom
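
The core of any diff view is a list of changed regions between two versions of a file. As a rough sketch, and not a description of how Meld computes its diffs, Python's standard `difflib` module can produce such regions. The helper name `changed_regions` and the sample data are assumptions for illustration.

```python
import difflib


def changed_regions(old_lines, new_lines):
    """Return (tag, old_start, old_end, new_start, new_end) for each changed region."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    # Keep only the regions that differ; "equal" regions need no highlighting.
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


old = ["alpha", "beta", "gamma"]
new = ["alpha", "BETA", "gamma", "delta"]
for tag, i1, i2, j1, j2 in changed_regions(old, new):
    print(tag, old[i1:i2], "->", new[j1:j2])
```

Each tuple gives the line ranges on both sides, which is the information a tool needs to mark changes in the margin or to offer a merge of one region into the other file.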

Bow

Bow: A Toolkit for Statistical Language Modeling, Text Retrieval, Classification and Clustering

Learn more about it at: http://www-2.cs.cmu.edu/~mccallum/bow/
Get sources at: http://www-2.cs.cmu.edu/~mccallum/bow/src/

The library source distribution currently includes three executable programs based on the library.

  • Rainbow is an executable program that does document classification. While mostly designed for classification by naive Bayes, it also provides TFIDF/Rocchio, Probabilistic Indexing and K-nearest neighbor.
  • Arrow is an executable program that does document retrieval. It currently only performs simple TFIDF-based retrieval.
  • Crossbow is an executable program that does document clustering (and also classification).
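
For readers unfamiliar with naive Bayes classification, the method Rainbow is mostly designed around, here is a minimal self-contained sketch in Python. It is written independently of the Bow sources and does not reflect Rainbow's actual code or options. The class name `NaiveBayes`, the whitespace tokenizer, the add-one smoothing, and the training examples are all assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = NaiveBayes()
nb.train("meeting agenda project deadline", "work")
nb.train("project budget review meeting", "work")
nb.train("dinner party family weekend", "personal")
nb.train("family holiday photos weekend", "personal")
print(nb.classify("budget meeting"))  # prints "work"
```

Working in log space avoids numeric underflow on long documents, and the add-one smoothing keeps a single unseen word from driving a category's probability to zero.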

JPlag

JPlag is a system that finds similarities among multiple sets of source code files. This way it can detect software plagiarism. JPlag does not merely compare bytes of text, but is aware of programming language syntax and program structure and hence is robust against many kinds of attempts to disguise similarities between plagiarized files. JPlag currently supports Java, C, C++, Scheme, and natural language text.

Learn more about it at: http://www.ipd.uka.de/jplag/
Get it at: http://wwwipd.ira.uka.de:2222/user.cgi
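
To illustrate why comparing token structure is more robust than comparing bytes, here is a small Python sketch that normalizes source code into a token stream before measuring similarity. It does not reproduce JPlag's own matching algorithm: the standard `tokenize` module and `difflib.SequenceMatcher` are stand-ins, the sketch handles Python source only, and the function names `normalized_tokens` and `similarity` are assumptions.

```python
import difflib
import io
import keyword
import tokenize


def normalized_tokens(source):
    """Turn Python source into a token list that ignores names, literals and layout."""
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            result.append("ID")        # every identifier looks the same
        elif tok.type == tokenize.NUMBER:
            result.append("NUM")
        elif tok.type == tokenize.STRING:
            result.append("STR")
        else:
            result.append(tok.string)  # keywords and operators are kept
    return result


def similarity(source_a, source_b):
    """Return a ratio in [0, 1] based on the normalized token sequences."""
    a, b = normalized_tokens(source_a), normalized_tokens(source_b)
    return difflib.SequenceMatcher(a=a, b=b, autojunk=False).ratio()


original = "def total(values):\n    s = 0\n    for v in values:\n        s += v\n    return s\n"
renamed = ("def acc(xs):  # sum\n    result = 0\n    for item in xs:\n"
           "        result += item\n    return result\n")
print(similarity(original, renamed))  # prints 1.0
```

Renaming variables and adding comments leaves the normalized token stream unchanged, so the two snippets score as identical even though their text differs.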

-- BernhardGroehl - 28 Dec 2003

Open Source Applications Foundation
Except where otherwise noted, this site and its content are licensed by OSAF under a Creative Commons License, Attribution Only 3.0.
See list of page contributors for attributions.