Gene/protein named entity recognition and normalization software
GNAT is a library and web service for gene named entity recognition (NER) and normalization in biomedical articles. Mentions of genes and proteins in the articles are linked to Entrez Gene identifiers. GNAT is available both for local download (suitable for large-scale processing) and as a web service (suitable for more limited processing or testing). A combination of local and remote processing is also available, where CPU-heavy operations are performed locally and memory-intensive operations are performed remotely (suitable for large-scale processing when a large amount of memory is not available locally). GNAT uses LINNAEUS (Gerner et al., 2010) for species detection and BANNER (Leaman et al., 2008) in one part of its false positive filtering process.
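To make the idea of normalization concrete, the following is a minimal toy sketch in Python of species-aware linking of gene mentions to Entrez Gene identifiers. It is not GNAT's code or API: the names `TOY_LEXICON`, `SPECIES_CUES`, `detect_species` and `normalize_genes` are hypothetical, the keyword lookup merely stands in for the species detection that GNAT delegates to LINNAEUS, and the tiny lexicon stands in for the much larger dictionaries and filtering steps a real system needs.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon: (NCBI Taxonomy ID, lowercased name) -> Entrez Gene ID.
# A real system would use a far larger dictionary built from Entrez Gene.
TOY_LEXICON = {
    (9606, "tp53"): 7157,     # human TP53
    (9606, "p53"): 7157,      # common synonym, ambiguous across species
    (9606, "brca1"): 672,     # human BRCA1
    (10090, "trp53"): 22059,  # mouse Trp53
    (10090, "p53"): 22059,
}

# Hypothetical keyword cues standing in for a real species tagger.
SPECIES_CUES = {
    "human": 9606, "patient": 9606, "patients": 9606,
    "mouse": 10090, "mice": 10090, "murine": 10090,
}


@dataclass(frozen=True)
class GeneMention:
    start: int           # character offset where the mention begins
    end: int             # character offset just past the mention
    text: str            # surface form as it appears in the text
    taxon_id: int        # NCBI Taxonomy ID of the assigned species
    entrez_gene_id: int  # identifier the mention was normalized to


def detect_species(text):
    """Return the set of taxon IDs whose cue words occur in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {SPECIES_CUES[t] for t in tokens if t in SPECIES_CUES}


def normalize_genes(text):
    """Link gene names to Entrez Gene IDs, restricted to detected species.

    Names that remain ambiguous (more than one candidate) are skipped,
    which is a crude stand-in for false positive filtering.
    """
    species = detect_species(text)
    mentions = []
    for match in re.finditer(r"[A-Za-z][A-Za-z0-9]*", text):
        name = match.group().lower()
        candidates = sorted(
            (taxon, gene_id)
            for (taxon, lex_name), gene_id in TOY_LEXICON.items()
            if lex_name == name and taxon in species
        )
        if len(candidates) == 1:
            taxon, gene_id = candidates[0]
            mentions.append(
                GeneMention(match.start(), match.end(), match.group(), taxon, gene_id)
            )
    return mentions


if __name__ == "__main__":
    for m in normalize_genes("Loss of p53 in mice accelerates tumor growth."):
        print(m)
```

In this sketch the synonym "p53" resolves to the mouse gene only because a mouse cue appears in the same text; when both human and mouse cues are present, the mention is left unlinked rather than guessed.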
For questions, suggestions or bug reports, please contact Jörg Hakenberg, Martin Gerner or Casey Bergman. The files on this webpage can also be accessed from the project's SourceForge page.
GNAT is described in the following papers:
- Hakenberg J, Gerner M, Haeussler M, Solt I, Plake C, Schroeder M, Gonzalez G, Nenadic G, Bergman CM: The GNAT library for local and remote gene mention normalization. Bioinformatics 27(19):2769-71, 2011 [html] [pdf]
- Solt I, Gerner M, Thomas P, Nenadic G, Bergman CM, Leser U, Hakenberg J: Gene mention normalization in full texts using GNAT and LINNAEUS. In Proceedings of the BioCreative III Workshop, Bethesda, USA, 2010 [pdf]
- Hakenberg J, Plake C, Royer L, Strobelt H, Leser U, Schroeder M: Gene mention normalization and interaction extraction with context models and sentence motifs. Genome Biology 9:S14, 2008 [html] [pdf]
- Hakenberg J, Plake C, Leaman R, Schroeder M, Gonzalez G: Inter-species normalization of gene mentions with GNAT. Bioinformatics 24:i126-i132, 2008 [html] [pdf]