1. Building and maintaining data pipelines in Hive and Spark/Spark Streaming to process billions of events.
2. Working on building a machine learning framework to train, build and deploy models using SageMaker, Docker, etc.
3. Working with Data Scientists to build features and analyze data.
4. Built the core data pipelines using Spark on Spot Instances, significantly reducing cost.
5. Helped integrate the third-party service Qubole and migrated heavy jobs to it.
6. Moved jobs from MapReduce to Tez and Spark SQL; also brought Presto into the company.
7. Extended Airflow by writing a FastForward operator and a dynamic DAG builder, and by integrating a Hive metastore browser and SQL-based sanity checks.
8. Created a basic stats collector using Presto, Airflow, MySQL and Superset for quality/stats checks.
9. Introduced Athena to the company.
10. Introduced Superset and integrated Presto, Hive (in-house) and Athena for querying.
11. Managed the Hadoop cluster.
Always exploring and adding new tools for our data warehouse.
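The SQL-based sanity check mentioned above can be sketched roughly as follows. This is a minimal, hypothetical version: it uses Python's built-in `sqlite3` in place of the actual Presto/MySQL stack, and the table and check names are illustrative, not the real ones.

```python
import sqlite3

# Hypothetical sanity checks: each query should return 0 rows on healthy data.
# Table and check names are illustrative placeholders.
SANITY_CHECKS = {
    "no_null_user_ids": "SELECT * FROM events WHERE user_id IS NULL",
    "no_negative_amounts": "SELECT * FROM events WHERE amount < 0",
}

def run_sanity_checks(conn):
    """Run each check; return a dict of check name -> number of violating rows."""
    return {name: len(conn.execute(query).fetchall())
            for name, query in SANITY_CHECKS.items()}

# Demo with an in-memory database standing in for the warehouse
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, amount REAL)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("u1", 10.0), (None, 5.0), ("u2", -3.0)])
print(run_sanity_checks(conn))
```

In the described setup, a check like this would presumably be scheduled by Airflow and its results surfaced in Superset.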
Working on data pipelining using Apache Spark and Cassandra.
Created REST API endpoints for querying different data sources.
Built an analytics platform using Spark + Postgres + Tableau.
Built an S3 metadata dashboard using Lambda + DynamoDB.
Analyzed proxy IP addresses.
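The Lambda behind the S3 metadata dashboard can be sketched as below. This is an assumption about its shape, not the actual implementation: the function only extracts flat metadata items from an S3 event notification, and the DynamoDB write itself is omitted so the logic stays self-contained.

```python
# Hypothetical sketch: turn S3 event notification records into flat
# metadata items suitable for a DynamoDB table. The actual put_item
# call (via boto3) is intentionally left out.

def s3_event_to_items(event):
    """Extract (bucket, key, size) metadata from an S3 event notification."""
    items = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        items.append({
            "bucket": s3["bucket"]["name"],
            "key": s3["object"]["key"],
            "size_bytes": s3["object"]["size"],
        })
    return items

# Example event shaped like an S3 ObjectCreated notification
event = {"Records": [{"s3": {"bucket": {"name": "logs"},
                             "object": {"key": "2016/01/app.log", "size": 1024}}}]}
print(s3_event_to_items(event))
```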
1. Working on a malicious-website-detection research project, prototyping and analyzing various learning algorithms.
2. Implemented an Expectation-Maximization variant ML algorithm ("combining supervised with unsupervised" learning) in Python.
3. Used Pandas, NumPy, Matplotlib and Tableau for data analysis.
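The "combining supervised with unsupervised" EM idea can be illustrated with a toy sketch: a 1-D two-Gaussian mixture where a few labeled points keep hard class assignments while unlabeled points get soft (E-step) assignments. The data and model here are illustrative assumptions, not the actual project's algorithm or features.

```python
import math

# Toy semi-supervised EM: labeled points pin their responsibilities,
# unlabeled points are softly assigned. All values are illustrative.
labeled = [(0.1, 0), (0.2, 0), (4.9, 1), (5.1, 1)]   # (value, class)
unlabeled = [0.0, 0.3, 4.8, 5.2, 2.6]

def pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mus, sigmas, priors = [0.0, 5.0], [1.0, 1.0], [0.5, 0.5]
for _ in range(20):
    # E-step: labeled points keep hard assignments; unlabeled get soft ones
    resp = [[1.0 - y, float(y)] for _, y in labeled]
    for x in unlabeled:
        p = [priors[k] * pdf(x, mus[k], sigmas[k]) for k in range(2)]
        s = sum(p)
        resp.append([p[0] / s, p[1] / s])
    xs = [x for x, _ in labeled] + unlabeled
    # M-step: re-estimate means, variances and priors from all points
    for k in range(2):
        w = sum(r[k] for r in resp)
        mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / w
        var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / w
        sigmas[k] = max(math.sqrt(var), 1e-3)
        priors[k] = w / len(xs)

print(round(mus[0], 2), round(mus[1], 2))  # means settle near the two clusters
```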
Offer Targeting:
1. Analyzed millions of users' credit/debit card transactions across hundreds of merchants/categories for advertisement targeting.
2. Worked on recommendation-engine algorithms to create a user-based targeting model that predicts how likely an offer is to appeal to a given user.
3. Wrote many MapReduce jobs in Java and Python (Hadoop Streaming) to create features from the raw transactions.
4. Used Hive on transaction data for preprocessing operations such as aggregation, filtering and joins.
5. Used machine learning libraries such as Mahout and scikit-learn to build and continuously optimize the model.
User Segmentation and Association:
1. Analyzed users' spending data across different categories/merchants and created features for user segmentation using clustering algorithms.
2. Worked on association mining to generate rules for merchants based on measures such as lift, confidence and support.
3. Used Tableau for reporting and visual analysis of the data.
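The association-rule measures named above (support, confidence, lift) can be sketched over toy transaction baskets. The merchant categories and baskets below are illustrative, not real transaction data.

```python
# Toy baskets: each set is one user's transaction across merchant categories.
transactions = [
    {"coffee", "bakery"},
    {"coffee", "bakery", "grocery"},
    {"coffee", "grocery"},
    {"bakery"},
    {"grocery"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """P(consequent | antecedent) estimated from the baskets."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support; >1 means positive association."""
    return confidence(antecedent, consequent) / support(consequent)

# Rule: {coffee} -> {bakery}
a, c = {"coffee"}, {"bakery"}
print(support(a | c), confidence(a, c), lift(a, c))
```

A rule would typically be kept only if its support, confidence and lift all clear chosen thresholds, which matches the measures listed in the bullet above.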
1. Updated the UICHR website to meet campus accessibility standards.
2. Created and maintained software applications for UIC's Department of Human Resources.
3. Used Google Analytics to track user behavior and other metrics to improve the efficiency of the websites.
1. Worked as an Automation Quality Assurance engineer on a Fixed Income portfolio management tool for a leading US-based investment management company.
2. Used SilkTest to run regressions, monitor results, log defects and develop new scripts.
3. Developed automated processes for running functional and regression testing on financial applications.
4. Enhanced existing automation scripts, improving the runtime of critical modules in the regression suite from 52 hours to 14 hours.
5. Worked on manual testing efforts, including running test cases and analyzing test results.
6. Gained financial knowledge through various online courses.