
Software resources

A non-exhaustive list of software produced by the TALN team


TermSuite is a Java UIMA-based toolbox for terminology extraction and multilingual term alignment.
It extracts monolingual terminologies and generates bilingual dictionaries from them by means of distributional and compositional methods.
The languages covered are English, French, German, Spanish, and Russian.
More information here
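
To give a rough idea of what a distributional method for term alignment can look like, here is a minimal sketch in Java. It is not TermSuite's API or its actual algorithm: the class name, the method names, and the toy data are all hypothetical. It assumes that each term is described by a vector of co-occurring context words, that a small seed bilingual dictionary is available, and that candidates are ranked by cosine similarity.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not TermSuite's API or implementation.
class DistributionalAlignmentSketch {

    // Cosine similarity between two sparse context vectors (context word -> weight).
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an empty vector is similar to nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Maps a source-language context vector into the target language through a
    // seed dictionary; context words without a dictionary entry are dropped.
    static Map<String, Double> translate(Map<String, Double> source, Map<String, String> seed) {
        Map<String, Double> out = new HashMap<>();
        for (Map.Entry<String, Double> e : source.entrySet()) {
            String target = seed.get(e.getKey());
            if (target != null) {
                out.merge(target, e.getValue(), Double::sum);
            }
        }
        return out;
    }

    // Returns the candidate term whose context vector is closest to the
    // translated vector, or null when there are no candidates.
    static String bestCandidate(Map<String, Double> translated,
                                Map<String, Map<String, Double>> candidates) {
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, Map<String, Double>> c : candidates.entrySet()) {
            double score = cosine(translated, c.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = c.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy context vector of a French term, with invented counts.
        Map<String, Double> sourceContext = Map.of("vent", 3.0, "pale", 1.0, "courant", 2.0);
        Map<String, String> seed = Map.of("vent", "wind", "pale", "blade", "courant", "current");
        Map<String, Map<String, Double>> candidates = Map.of(
                "wind turbine", Map.of("wind", 2.0, "blade", 1.0, "current", 2.0),
                "solar panel", Map.of("sun", 3.0, "current", 2.0));
        System.out.println(bestCandidate(translate(sourceContext, seed), candidates));
    }
}
```

With this toy data the translated vector shares three context words with "wind turbine" and only one with "solar panel", so the former is selected. A real system would build its vectors from corpora and use more careful weighting and normalisation than raw counts.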


An opinion detector that explores tweets on the topic of your choice: a person, a product, a news item. Enter the topic and see, in real time, the opinions being expressed, classified as positive, negative, or neutral.
Try a demonstrator here

Apache OpenNLP models for processing French

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.
More information here

Source code and Apache UIMA components

Various contributions to the NLP and Apache UIMA (Unstructured Information Management Architecture) communities to facilitate the development of NLP components and pipelines, connect with various data formats, solve interoperability issues (within a UIMA workflow or by integrating third-party tools), and perform some NLP analysis tasks.

Here are two source code repositories, dev-star and jules-star; both should be merged soon.
Some components are available as Apache Maven dependencies for Java.