nlp - What is the NLTK equivalent of the UIMA CAS (Common Analysis Structure)?


In UIMA, the CAS (Common Analysis Structure) plays a major role in structuring an NLP application. It allows metadata to be passed from one component to the next, with each component adding to it. For example, the sentence boundaries found by a sentence tokenizer can be added to the CAS and then used by a subsequent word tokenizer.

What is the equivalent data structure in NLTK?

In short, there is no equivalent of the CAS (Common Analysis System) concept in NLTK. The latter uses much simpler means of representing texts than UIMA does. In NLTK, texts are simply lists of words, whereas in UIMA you have complex (and heavy-weight) data structures, defined as part of the CAS, for the purpose of describing the input data and its flow through a UIMA system.

That being said, in my view the two of them serve quite different purposes anyway. If I had to name a Java equivalent of NLTK, I would choose the OpenNLP toolkit rather than UIMA. The former offers a number of algorithms for NLP based on machine learning (as does NLTK, among other things), while the latter is a component-based framework not just for NLP, but for unstructured data in general. That is, it defines a general model for building applications that work with unstructured data.
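If you do want CAS-like behaviour in Python, you would have to build the shared structure yourself. The following is only a minimal sketch of the idea from the question (a sentence annotator whose results are reused by a word tokenizer), not anything NLTK provides. The names `Annotation`, `Document`, `sentence_annotator`, `token_annotator` and `run_pipeline` are hypothetical, and the regex-based splitting is a deliberately naive stand-in for real tokenizers.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """A typed span over the document text (standoff annotation)."""
    kind: str   # e.g. "Sentence" or "Token"
    begin: int  # inclusive character offset
    end: int    # exclusive character offset


@dataclass
class Document:
    """Hypothetical CAS-like container: raw text plus accumulated annotations."""
    text: str
    annotations: list = field(default_factory=list)

    def select(self, kind):
        # Return all annotations of one type, in insertion order.
        return [a for a in self.annotations if a.kind == kind]

    def covered_text(self, ann):
        # Recover the text an annotation points at.
        return self.text[ann.begin:ann.end]


def sentence_annotator(doc):
    # Naive assumption: a sentence is a run of characters ending in '.', '!' or '?'.
    for m in re.finditer(r"[^.!?\s][^.!?]*[.!?]", doc.text):
        doc.annotations.append(Annotation("Sentence", m.start(), m.end()))


def token_annotator(doc):
    # Reuses the Sentence annotations added earlier and tokenizes inside each one.
    for sent in doc.select("Sentence"):
        for m in re.finditer(r"\w+|[^\w\s]", doc.covered_text(sent)):
            doc.annotations.append(
                Annotation("Token", sent.begin + m.start(), sent.begin + m.end())
            )


def run_pipeline(text):
    # Each component receives the same document object and adds to it.
    doc = Document(text)
    for annotator in (sentence_annotator, token_annotator):
        annotator(doc)
    return doc
```

Calling `run_pipeline(...)` and then `doc.select("Token")` returns token spans whose offsets refer to the original text, which is roughly the role annotations play in a CAS. In plain NLTK usage you would instead pass the list returned by one tokenizer directly to the next step.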

