Develop the data infrastructure and related services for real-time and streaming data analytics.
Assist in data exploration, feature engineering, model training, testing, and deployment at scale.
Develop quick prototypes and demonstrations to showcase key data insights.
Design and develop time-series machine learning and statistical models for anomaly detection, forecasting, pattern identification, data aggregation and transformation at scale.
Evaluate and deploy the developed models in distributed and non-distributed environments.
Deploy automation frameworks to achieve the highest levels of engineering quality.
Collaborate and communicate effectively with the business and technical teams to deliver strong results.
Experience handling massive data pipelines using messaging systems such as Kafka (preferred), Kinesis, RabbitMQ, or ActiveMQ.
Experience building streaming analytics and real-time processing systems using Confluent Streams, Storm, Spark Streaming, or similar.
Excellent understanding of data structures, algorithms and distributed systems.
Experience with and understanding of traditional, NoSQL, and columnar databases such as Oracle, MySQL, PostgreSQL, Cassandra, DynamoDB, Redshift, and Vertica.
2+ years of experience developing highly distributed systems using cloud and open-source technologies.
4+ years of strong experience in some of the following programming languages: Python, Java, Scala, C++.
Experience partnering with architects and engineers in data environments that are complex, enterprise-wide, and multi-tenant, and that host data at large scale.
Great team player with excellent written & verbal communication skills.