Build information extraction and text summarization algorithms on large text corpora.
Perform end-to-end text analytics: text preprocessing (regex transformations, stemming, etc.), feature extraction, algorithm implementation for pattern recognition or sentiment analysis, and data visualization.
Develop & test best-in-class NLP & machine learning algorithms for supervised or unsupervised learning in a big data environment.
Maintain existing algorithm implementations & optimize them for speed or accuracy.
Support intellectual property (IP) and proof of concept (POC) initiatives by identifying opportunities and building analytics solutions around them.
Work with other team members to build data platforms for large-scale data analysis, modeling & real-time analytics.
Skill Set & Qualifications:
Understanding of statistical NLP methods and key machine learning algorithms (neural networks, decision forests, support vector machines, regression, etc.).
At least 2 years of experience in natural language processing, text analytics, information extraction, information retrieval.
Proficiency in using at least one open-source NLP library (NLTK, etc.).
Skills in R/Python/Java/C++ (or another high-level programming language) are essential.
Working knowledge of relational databases, SQL, distributed data platforms (Hadoop, etc.), REST APIs, etc.
Experience in handling large text corpora.
Good data visualization skills and the ability to communicate algorithm implementations and analysis results clearly to stakeholders.
Experience with text search engines like Lucene would be an added advantage.
Desire to learn new skills and technologies.