Chris Biemann Homepage

Please note: this page is outdated! It is no longer maintained.

Find my current homepage here: FG Sprachtechnologie, FB Informatik, TU Darmstadt

Software:

  • Download the Chinese Whispers graph clustering algorithm.
  • Download the tinyCC2.0 corpus production engine
  • Download the langSepP Language Separation Program
  • Download the unsuPos Unsupervised Part-of-Speech Tagger
  • Download the ASV-Toolbox: a compilation of NLP tools

Professional Activities:

Co-Chairing

Program Committee Member

Visits

Publications:

Book chapters

  • Biemann, C.: Bootstrapping. In G. Heyer, U. Quasthoff, T. Wittig (Eds.): Wissensrohstoff Text, Bochum, w3L
  • Biemann, C., Heyer, G.: Aufbau einer Wissenslandkarte. In G. Heyer, U. Quasthoff, T. Wittig (Eds.): Wissensrohstoff Text, Bochum, w3L
  • Biemann, C.: Hidden Markov Modelle. In G. Heyer, U. Quasthoff, T. Wittig (Eds.): Wissensrohstoff Text, Bochum, w3L
  • Biemann, C.: Tagging – als Anwendung HMM. In G. Heyer, U. Quasthoff, T. Wittig (Eds.): Wissensrohstoff Text, Bochum, w3L
  • Biemann, C.: Kookkurrenzen höherer Ordnung. In G. Heyer, U. Quasthoff, T. Wittig (Eds.): Wissensrohstoff Text, Bochum, w3L
  • Biemann, C., Heyer, G.: Netze von Kookkurrenzen. In G. Heyer, U. Quasthoff, T. Wittig (Eds.): Wissensrohstoff Text, Bochum, w3L


Journal publications

  • Biemann, C. (2010): Unsupervised Part-of-Speech Tagging in the Large. Research on Language and Computation: Volume 7, Issue 2 (2010), pp. 101-135. (SpringerLink )
  • Cysouw, M., Biemann, C. and Ongyerth, M. (2007): Using Strong’s Numbers in the Bible to test an automatic alignment of parallel texts. In: Michael Cysouw and Bernhard Wälchli (eds.) Parallel Texts: Using translational equivalents in linguistic typology. Special issue of Sprachtypologie und Universalienforschung (STUF), pp. 66-79. (pdf )
  • Biemann, C.: Ontology Learning from Text - a Survey of Methods. LDV-Forum 20(2):75-93, 2005 (pdf ) (ps)
  • Biemann, C., Quasthoff, U., Böhm, K., Wolff, C. (2003): Automatic discovery and Aggregation of Compound Names for the Use in Knowledge Representations, Journal of Universal Computer Science (JUCS), Volume 9, Number 6, pp. 530-541, June 2003 (pdf ) (ppt )

Conference proceedings

  • Joydeep Nath, Monojit Choudhury, Animesh Mukherjee, Chris Biemann and Niloy Ganguly (2008): Unsupervised Parts-of-Speech Induction for Bengali. Proceedings of LREC-08, Marrakech, Morocco
  • Chris Biemann, Uwe Quasthoff, Gerhard Heyer and Florian Holz (2008): ASV Toolbox: a Modular Collection of Language Exploration Tools. Proceedings of LREC-08, Marrakech, Morocco 
  • Holz, F., Biemann, C. (2008): Unsupervised and Knowledge-Free Learning of Compound Splits and Periphrases. Proceedings of CICLing-08, Haifa, Israel (pdf)
  • Biemann, C., Giuliano, C. and Gliozzo, A. (2007): Unsupervised Part of Speech Tagging Supporting Supervised Methods. In: Proceedings of RANLP-07, Borovets, Bulgaria (pdf) (poster-png)
  • Biemann, C., Heyer, G., Quasthoff U. and Richter, M. (2007): The Leipzig Corpora Collection – Monolingual corpora of standard size. In: Proceedings of Corpus Linguistics 2007, Birmingham, UK (pdf)
  • Loos, B. and Biemann, C. (2007): Supporting Web-based Address Extraction with Unsupervised Tagging. In: Proceedings of the 31st Annual Conference of the German Classification Society GfKl 2007 and Springer LNCS (pdf)
  • Hallsteinsdóttir, E., Eckart, T., Biemann, C., Quasthoff, U. and Richter, M. (2007). Íslenskur Orðasjóður - Building a Large Icelandic Corpus. Proceedings of NODALIDA-07, Tartu, Estonia (pdf) (poster-ppt)
  • Socher, R., Biemann, C. and Osswald, R. (2007). Combining Contexts in Lexicon Learning for Semantic Parsing. Proceedings of NODALIDA-07, Tartu, Estonia (pdf) (ppt)
  • Biemann, C. (2007). Unsupervised Natural Language Processing using Graph Models. HLT-NAACL-07 Doctoral Consortium and Poster Proceedings, Rochester, NY, USA (pdf ) (poster-png) (ppt)
  • Biemann, C. (2007): A Random Text Model for the Generation of Statistical Language Invariants. Proceedings of HLT-NAACL-07, Rochester, NY, USA (pdf ) (ppt)
  • Biemann, C. and Quasthoff, U. (2007). Similarity of Documents and Document Collections using Attributes with Low Noise. Proceedings of WEBIST-07, Barcelona, Spain (pdf ) (ppt )
  • Richter, M., Quasthoff, U., Hallsteinsdóttir, E. and Biemann, C. (2006): Exploiting the Leipzig Corpora Collection. Proceedings of IS-LTC'06, Ljubljana, Slovenia ( pdf)
  • Eiken, U.C., Liseth, A.T., Richter, M., Witschel, H. F. and Biemann, C. (2006): Ord i Dag: Mining Norwegian Daily Newswire. Proceedings of FinTAL, Turku, Finland ( pdf )
  • Quasthoff, U., Richter, M. and Biemann, C. (2006): Corpus Portal for Search in Monolingual Corpora. Proceedings of LREC-06, Genoa, Italy (pdf ) (poster-pdf )
  • Witschel, F., Biemann, C. (2005): Rigorous dimensionality reduction through linguistically motivated feature selection for text categorisation. Proceedings of NODALIDA 2005, Joensuu, Finland (pdf )
  • Biemann, C., Quasthoff, U. (2005): Dictionary acquisition using parallel text and cooccurrence statistics. Proceedings of NODALIDA 2005, Joensuu, Finland (pdf ) (ppt)
  • Melz, R., Biemann, C., Böhm, K., Heyer, G. (2005): Real-time Analysis of Voice Streams and their Representation as Conceptual Structures, Proceedings of HCI-05, Las Vegas, USA (pdf)
  • Biemann, C., Osswald, R. (2005): Automatic Extension of Feature-based Semantic Lexicons via Contextual Features, Proceedings of the 29th Annual Conference of the German Classification Society GfKl 2005 and Springer LNCS (pdf) (ppt)
  • Biemann, C., Osswald, R. (2005): Automatische Erweiterung eines semantikbasierten Lexikons durch Bootstrapping auf großen Korpora. In W. Hess und W. Lenders (Hrsg.) "Sprache, Sprechen und Computer/Computer Studies in Language and Speech" und Proceedings of GLDV-Frühjahrstagung 2005, Bonn, Peter-Lang-Verlag, Frankfurt am Main (pdf) (ppt)
  • Biemann, C., Teresniak, S. (2005): Disentangling from Babylonian Confusion - Unsupervized Language Identification, Proceedings of CICLing-2005, Computational Linguistics and Intelligent Text Processing, Mexico City, Mexico and Springer LNCS 3406 (pdf) (ppt)
  • Biemann, C., Böhm, K., Heyer, G., Melz, R. (2004): SemanticTalk: Software for Visualizing Brainstorming Sessions and Thematic Concept Trails on Document Collections, Proceedings of ECML/PKDD 2004, Pisa, Italy and Springer LNAI 3202 (pdf )
  • Biemann, C., Shin, S.-I., Choi, K.-S. (2004): "Semiautomatic Extension of CoreNet using a Bootstrapping Mechanism on Corpus-based Co-occurrences", Proceedings of the 20th International Conference on Computational Linguistics (COLING04), Geneva, Switzerland (pdf ) (ppt )
  • Biemann, C.; Quasthoff, U.; Wolff, C. (2004). "Linguistic Corpus Search." In: Proceedings Fourth International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, May 2004. (pdf ) (poster-pdf )
  • Biemann, C.; Bordag, S.; Quasthoff, U.; Wolff, C. (2004). "Web Services for Language Resources and Language Technology Applications." In: Proceedings Fourth International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, May 2004. (pdf )
  • Biemann, C.; Bordag, S.; Quasthoff, U. (2004): Automatic Acquisition of Paradigmatic Relations using Iterated Co-occurrences, Proceedings of LREC2004, Lisbon, Portugal (pdf ) (ppt )
  • Biemann, C.; Böhm, K., Heyer, G., Melz, R. (2004): Automatically Building Concept Structures and Displaying Concept Trails for the Use in Brainstorming Sessions and Content Management Systems, Proceedings of I2CS, Guadalajara, Mexico and Springer LNCS (pdf ) (ppt )
  • Biemann, Chr.; Bordag, S.; Heyer, G.; Quasthoff, U.; Wolff, Chr.: Language-independent Methods for Compiling Monolingual Lexical Data, Proceedings of CICLing 2004, Seoul, Korea and Springer LNCS 2945, pp. 215-228, Springer Verlag Berlin Heidelberg (pdf )
  • Biemann, C. (2003): Extraktion semantischer Relationen aus natürlichsprachlichem Text mit Hilfe von maschinellem Lernen, in Uta Seewald-Heeg (Hrsg.), Sprachtechnologie für die multilinguale Kommunikation, Proceedings of GLDV-Frühjahrstagung 2003, Gardez!-Verlag, Sankt Augustin (pdf )
  • Quasthoff, U., Biemann, C., Wolff, C. (2002): Named Entity Learning and Verification: Expectation Maximisation in Large Corpora, Proceedings of CoNLL-2002, Taipei, Taiwan (pdf )

Workshop proceedings

  • Biemann, C., Witschel, F. (2007): Webspam Detection via Semi-Supervised Graph Partitioning. Proceedings of the WEBSPAM Challenge in Conjunction with ECML/PKDD-07, Warsaw, Poland (pdf)
  • Biemann, C. and Quasthoff, U. (2007): Examining Higher Order Transformations for Scale-free Small World Graphs. Abstracts of the ECCS-07 Workshop on Dynamics on and of Complex Networks, Dresden, Germany (pdf)
  • Biemann, C. (2006): Unsupervised Part-of-Speech Tagging Employing Efficient Graph Clustering. Proceedings of the COLING/ACL-06 Student Research Workshop 2006, Sydney, Australia (pdf ) (poster-png) (tagsets-pdf )
  • Biemann, C. (2006): Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems. Proceedings of the HLT-NAACL-06 Workshop on Textgraphs-06, New York, USA (pdf ) (ppt )
  • Quasthoff, U., Biemann, C. (2006): Measuring Monolinguality. Proceedings of the LREC-06 workshop on Quality assurance and quality measurement for language and speech resources, Genoa, Italy (pdf ) (ppt )
  • Mahn, M., Biemann, C. (2005): Tuning Co-occurrences of Higher Orders for Generating Ontology Extension Candidates, Proceedings of the ICML-2005 Workshop on Learning and Extending Lexical Ontologies using Machine Learning Methods, Bonn, Germany (pdf)
  • Biemann, C. (2005): Semantic Indexing with Typed Terms Using Rapid Annotation, Proceedings of the TKE-05-Workshop on Methods and Applications of Semantic Indexing, Copenhagen, Denmark (pdf) (ppt)
  • Biemann, C., Bordag, S., Quasthoff, U. (2003): Lernen von paradigmatischen Relationen auf iterierten Kollokationen, Beiträge zum GermaNet-Workshop: Anwendungen des deutschen Wortnetzes in Theorie und Praxis, Tübingen, Oktober 2003 and LDV-Forum 19 (1/2), 2004 (pdf) (ppt)

Theses

  • Biemann, C. (2007): Unsupervised and Knowledge-Free Natural Language Processing in the Structure Discovery Paradigm. PhD Thesis, University of Leipzig. Submitted 07.07.2007, Accepted 19.11.2007 (pdf)
  • Biemann, C. (2002): Extraktion semantischer Relationen aus natürlichsprachlichem Text mit Hilfe von maschinellem Lernen, Diplomarbeit (Master’s Thesis), University of Leipzig, September 2002 (pdf )