Tommi A Pirinen, Dublin City University / Universität Hamburg

Note: these pages will soon move with me to the University of Hamburg.


Short academic history

Research interests

The things I have studied, am good at, and am interested in spending my time on:

The list is not exhaustive.


The following is a list of all my accepted publications, with links to the author's post-print versions. The versions on this page may differ significantly from the official ones: they have been optimised for screen reading, reformatted, supplied with hyperlinks, and so forth.

I can haz PDF–for free and open science for everyone!1

It may be worth noting that Google Scholar currently offers a good way to browse my publications and see their incoming citations.

Publications in conferences and journals

  1. Tommi A Pirinen (2016, to appear) ...
  2. Francis Tyers, Tommi Pirinen (2016b) Intermediate Representations in Rule-Based Machine Translation for Uralic languages in Proceedings of Second International Workshop on Computational Linguistics for Uralic Languages (IWCLUL)
  3. Tommi Pirinen, Antonio Toral, Raphael Rubino (2016a) Rule-Based and Statistical Morph Segments in English-to-Finnish SMT, in Proceedings of Second International Workshop on Computational Linguistics for Uralic Languages (IWCLUL), Szeged, Hungary
  4. Antonio Toral, Xiaofeng Wu, Tommi Pirinen, Zhengwei Qiu, Ergun Bicici and Jinhua Du (2015d) Dublin City University at the TweetMT 2015 Shared Task in Proceedings of TweetMT shared task at SEPLN 2015 (forthcoming)
  5. Raphael Rubino, Tommi Pirinen, Miquel Esplà-Gomis, Nikola Ljubešić, Sergio Ortiz Rojas, Vassilis Papavassiliou, Prokopis Prokopidis and Antonio Toral (2015c), Abu-MaTran at WMT 2015 Translation Task: Morphological Segmentation and Web Crawling, in Proceedings of WMT shared task at EMNLP 2015 (forthcoming)
  6. Tommi A Pirinen (2015a), Omorfi—Free and open source morphological lexical database for Finnish, in Proceedings of the 20th Nordic Conference of Computational Linguistics NODALIDA 2015
  7. Tommi A Pirinen (2015b), Using weighted finite state morphology with VISL CG-3—Some experiments with free open source Finnish resources, in Proceedings of Constraint grammar - methods, tools and applications Workshop at NoDaLiDa
  8. Antonio Toral, Raphael Rubino, Miquel Esplà, Tommi Pirinen, Andy Way and Gema Ramírez-Sánchez (2014d). Extrinsic Evaluation of Web-Crawlers in Machine Translation: a Case Study on Croatian–English for the Tourism Domain in Proceedings of EAMT 2014
  9. Sjur Moshagen, Trond Trosterud, Jack Rueter, Francis Tyers and Tommi A Pirinen (2014c), Open-source infrastructures for collaborative work on under-resourced languages, in Proceedings of CCURL workshop 2014 in LREC
  10. Senka Drobac, Krister Lindén, Tommi A Pirinen and Miikka Silfverberg (2014b), Heuristic Hyperminimisation of Finite-State Lexicons, in Proceedings of LREC 2014
  11. Tommi A Pirinen, Krister Lindén (2014a) State-of-the-art in Weighted Finite-State Spell-Checking in Proceedings of CICLing 2014
  12. Sjur Moshagen, Tommi A Pirinen, Trond Trosterud (2013a) Building an open-source development infrastructure for language technology projects, in Proceedings of Nodalida 2013
  13. Tommi A Pirinen, Sam Hardwick (2012d) Effect of Language and Error Models on Efficiency of Finite-State Spell-Checking and Correction, in Proceedings of 10th International Workshop on Finite-State Methods and/in Natural Language Processing FSMNLP 2012
  14. Krister Lindén, Miikka Silfverberg, Erik Axelson, Senka Drobac, Sam Hardwick, Tommi A Pirinen (to appear, 2012c) Using HFST for creating Computational Linguistic Applications in Computational Linguistics-Applications 2012
  15. Tommi A Pirinen, Francis M. Tyers (2012b) Compiling Apertium morphological dictionaries with HFST and using them in HFST applications in Proceedings of Workshops in Language Resources and Evaluation conference LREC 2012, in saltmil-aflat workshop on “language technology for normalisation of less-resourced languages”
  16. Tommi A Pirinen, Miikka Silfverberg (2012a) Improving Finite-State Spell-Checker Suggestions with Part-of-Speech N-grams in Proceedings of International Conference on Intelligent Text Processing and Computational Linguistics CICLING 2012
  17. Krister Lindén, Miikka Silfverberg, Erik Axelson, Sam Hardwick, Tommi A Pirinen (2011c) HFST—Framework for Compiling and Applying Morphologies in Systems and Frameworks for Computational Morphology 2011, in Communications in Computer and Information Science (100), ISBN: 978-3-642-23138-4
  18. Miikka Silfverberg, Mirka Hyvärinen, Tommi A Pirinen (2011b), Improving Predictive Entry of Finnish Text Messages using IRC Logs in Proceedings of the Computational Linguistics-Applications Conference 2011, ISBN: 978-83-60810-47-7.
  19. Tommi A Pirinen (2011a), Modularisation of Finnish Finite-State Language Description—Towards Wide Collaboration in Open Source Development of Morphological Analyser in Proceedings of Nodalida 2011 (18).
  20. Tommi A Pirinen, Krister Lindén (2010c), Creating and Weighting Hunspell Dictionaries as Finite-State Automata, in Investigationes Linguisticae (19).
  21. Tommi A Pirinen, Krister Lindén (2010b), Building and Using Existing Hunspell Dictionaries and TeX Hyphenators as Finite-State Automata, in Proceedings of International Multiconference in Computer Science and Information Technology
  22. Tommi A Pirinen, Krister Lindén (2010a), Finite-State Spell-Checking with Weighted Language and Error Models, in Proceedings of Workshops of Language Resources and Evaluation Conference 7 in Valletta, Malta.
  23. Krister Lindén, Tommi A Pirinen (2009a), Weighted Finite-State Morphological Analysis of Finnish Compounding with hfst-lexc, in Proceedings of Nodalida 2009
  24. Krister Lindén, Tommi A Pirinen (2009b), Weighting Finite-State Morphological Analyzers using HFST tools, in Pre-proceedings of FSMNLP 2009
  25. Krister Lindén, Miikka Silfverberg, Tommi A Pirinen (2009c), HFST Tools for morphology—An Efficient Open-Source Package for Construction of Morphological Analyzers in Proceedings of Workshop on Systems and Frameworks for Computational Morphology

Theses


  1. Tommi Pirinen (2008), Suomen kielen äärellistilainen automaattinen morfologinen analyysi avoimen lähdekoodin menetelmin [Finite-state automatic morphological analysis of Finnish using open source methods], Master's Thesis, University of Helsinki (in Finnish)
  2. Tommi A Pirinen (2014), Weighted Finite-State Methods in Spell-Checking and Correction, Doctoral dissertation, University of Helsinki

Presentations, tutorials, invited speeches

  1. Tommi A Pirinen, Antonio Toral, Why linguistics in SMT? in Why Linguistics? workshop, Tartu, 2015
  2. Morphological segmentation for machine translation, in internal project meeting of Abu-MaTran, in Elx, 2014
  3. Weighted finite-state methods as a bridge between strictly rule-based and mostly statistical nlp systems in NCLT seminar series, DCU, 2014
  4. Crowd-Sourcing morphology and lexicography, productising NLP research, in FSCONS 2013, Gothenburg.
  5. Weighted Finite-State Spell-Checking, in Research Seminar of Uni Helsinki. A ~final report on PhD thesis.
  6. Building finite-state spell-checkers with HFST tools, in FSMNLP 2012, Donostia-San Sebastian. (a step-by-step guide on building a simple spell-checker from beginning to end and using it in real-world software). In the same conference: Effects of Finite-State Language and Error Models to Efficiency (presented by Sam Hardwick)
  7. Building and Using Apertium Dictionaries with HFST in LREC 2012, Istanbul. (A GENERic rant about how cheap it is to build new formalisms, engineering-wise, rather than bothering linguists with unuseful legacy coding formats)
  8. Using POS taggers to rerank spell-checking results in CICLING 2012, Delhi. (a short blurb on how it fares for Finnish)
  9. Building and using Hunspell and TeX hyphenation descriptions with HFST in CLA 2010, Wisla.
  10. Using Wikipedia to Weight a Spelling-Checker in LREC 2010, Valletta. (first demo of HFST spell-checkers with automaton as error model)
  11. Weighting Finnish Compound Boundaries in Nodalida 2009, Odense.
  12. Weighted Finite-State analysis of Finnish Compounds, in CLARIN/D-SPIN meeting in ???
  13. (Unigram-)Weighting Language Models with HFST in FSMNLP 2009, Pretoria.
  14. Avoimen lähdekoodin menetelmät äärellistilaista morfologiaa varten, in 2008, Helsinki. (first public demo of omorfi)

Software projects

The following projects I participate in are more or less related to my work at the university and to spare-time hobbies related to science:

Courses I've taught or TA'd

About me

Contact information

When contacting me, I prefer the methods listed earlier in the table below, e.g. e-mail rather than telephone.

Contact Address (link)
E-Mail (work email)
IRC Flammie in IRCNet
Flammie in Freenode
AIM Flammie in AOL Instant Messenger
ICQ UIN 27204357
88 Custom Hall
Gardiner Street Lower
Dublin 1
Linkedin (Professional)
Google+ (Professional / Personal; allows different circles)
Facebook (Personal; mostly mundane ranting) Facebook profile
Telephone On request, but I prefer other communication channels, including Google Hangouts


I've been trying to organize my time with the following Google Calendar. Note that it contains both calendar entries from my phone (red?) and all Facebook invites (green?); the green ones are not necessarily accurate.

University Home Page by Tommi A Pirinen is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.