Proficiency in one or more modern programming languages such as Java or Python, and in analytics packages such as R, SAS, or MATLAB.
Experience working in a Unix/Linux environment and proficiency in command-line scripting
Ability to implement, maintain, and troubleshoot big data infrastructure, including distributed processing paradigms, stream processing (Storm, Spark), search APIs (Solr), and data platforms such as Hadoop, HBase, Hive, and SQL
Strong mathematical background, with the ability to understand algorithms and methods from both a mathematical and an intuitive viewpoint
Ability to break down complex problems and develop strategies that prioritize key areas
Experience processing large datasets in a cloud environment
Experience working with large datasets and large-scale problems
Experience in machine learning, natural language processing and/or information retrieval
Knowledge of search engines, spam detection, recommendation systems, and/or social networks
Strong data extraction and processing skills; experience with MapReduce, Pig, and/or Hive preferred
Good-to-Have Skills
Analyze and model structured data using advanced statistical methods.