1. Building and maintaining data pipelines in Hive and Spark/Spark Streaming to process billions of events.
2. Working on building a machine learning framework to train, build and deploy models using SageMaker, Docker, etc.
3. Working with Data Scientists to build features and analyze data.
4. Built the core data pipelines using Spark on Spot Instances, significantly reducing cost.
5. Helped integrate the third-party service Qubole and migrated heavy jobs to it.
6. Moved jobs from MapReduce to Tez and Spark SQL; also brought Presto into the company.
7. Extended Airflow by writing a FastForward operator and a dynamic DAG builder, and by integrating a Hive metastore browser and SQL-based sanity checks.
8. Created a basic stats collector using Presto, Airflow, MySQL and Superset for quality/stats checks.
9. Introduced Athena to the company.
10. Introduced Superset and integrated Presto, Hive (in-house) and Athena for querying.
11. Managed the Hadoop cluster.
Always exploring and adding new tools for our data warehouse.
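The SQL-based sanity check mentioned above can be sketched roughly as follows. This is a minimal, hypothetical version: it uses Python's built-in `sqlite3` in place of the actual Presto/MySQL stack, and the table and check names are illustrative, not the real ones.

```python
import sqlite3

# Hypothetical sanity checks: each query should return 0 rows on healthy data.
# Table and check names are illustrative placeholders.
SANITY_CHECKS = {
    "no_null_user_ids": "SELECT * FROM events WHERE user_id IS NULL",
    "no_negative_amounts": "SELECT * FROM events WHERE amount < 0",
}

def run_sanity_checks(conn):
    """Run each check; return a dict of check name -> number of violating rows."""
    return {name: len(conn.execute(query).fetchall())
            for name, query in SANITY_CHECKS.items()}

# Demo with an in-memory database standing in for the warehouse
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, amount REAL)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("u1", 10.0), (None, 5.0), ("u2", -3.0)])
print(run_sanity_checks(conn))
```

In the described setup, a check like this would presumably be scheduled by Airflow and its results surfaced in Superset.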
Working on data pipelining using Apache Spark and Cassandra.
Created REST API endpoints for querying different data sources.
Built an analytics platform using Spark + Postgres + Tableau.
Built an S3 metadata dashboard using Lambda + DynamoDB.
Analyzed proxy IP addresses.
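The Lambda behind the S3 metadata dashboard can be sketched as below. This is an assumption about its shape, not the actual implementation: the function only extracts flat metadata items from an S3 event notification, and the DynamoDB write itself is omitted so the logic stays self-contained.

```python
# Hypothetical sketch: turn S3 event notification records into flat
# metadata items suitable for a DynamoDB table. The actual put_item
# call (via boto3) is intentionally left out.

def s3_event_to_items(event):
    """Extract (bucket, key, size) metadata from an S3 event notification."""
    items = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        items.append({
            "bucket": s3["bucket"]["name"],
            "key": s3["object"]["key"],
            "size_bytes": s3["object"]["size"],
        })
    return items

# Example event shaped like an S3 ObjectCreated notification
event = {"Records": [{"s3": {"bucket": {"name": "logs"},
                             "object": {"key": "2016/01/app.log", "size": 1024}}}]}
print(s3_event_to_items(event))
```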
1. Working on a malicious-website-detection research project, prototyping and analyzing various learning algorithms.
2. Implemented an Expectation-Maximization variant ML algorithm ("combining supervised with unsupervised" learning) in Python.
3. Used Pandas, NumPy, Matplotlib and Tableau for data analysis.
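The "combining supervised with unsupervised" EM idea can be illustrated with a toy sketch: a 1-D two-Gaussian mixture where a few labeled points keep hard class assignments while unlabeled points get soft (E-step) assignments. The data and model here are illustrative assumptions, not the actual project's algorithm or features.

```python
import math

# Toy semi-supervised EM: labeled points pin their responsibilities,
# unlabeled points are softly assigned. All values are illustrative.
labeled = [(0.1, 0), (0.2, 0), (4.9, 1), (5.1, 1)]   # (value, class)
unlabeled = [0.0, 0.3, 4.8, 5.2, 2.6]

def pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mus, sigmas, priors = [0.0, 5.0], [1.0, 1.0], [0.5, 0.5]
for _ in range(20):
    # E-step: labeled points keep hard assignments; unlabeled get soft ones
    resp = [[1.0 - y, float(y)] for _, y in labeled]
    for x in unlabeled:
        p = [priors[k] * pdf(x, mus[k], sigmas[k]) for k in range(2)]
        s = sum(p)
        resp.append([p[0] / s, p[1] / s])
    xs = [x for x, _ in labeled] + unlabeled
    # M-step: re-estimate means, variances and priors from all points
    for k in range(2):
        w = sum(r[k] for r in resp)
        mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / w
        var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / w
        sigmas[k] = max(math.sqrt(var), 1e-3)
        priors[k] = w / len(xs)

print(round(mus[0], 2), round(mus[1], 2))  # means settle near the two clusters
```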
Offer Targeting:
1. Analyzed millions of users' credit/debit card transactions across hundreds of merchants/categories for advertisement targeting.
2. Worked on recommendation-engine algorithms to create a user-based targeting model that predicts how likely an offer is to appeal to a given user.
3. Wrote many MapReduce jobs in Java and Python (Hadoop Streaming) to create features from the raw transactions.
4. Used Hive on transaction data for preprocessing operations such as aggregation, filtering and joins.
5. Used machine learning libraries such as Mahout and scikit-learn to build and continuously optimize the model.
User Segmentation and Association:
1. Analyzed users' spending data across different categories/merchants and created features for user segmentation using clustering algorithms.
2. Worked on association mining to generate rules for merchants based on measures such as lift, confidence and support.
3. Used Tableau for reporting and visual analysis of the data.
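The association-rule measures named above (support, confidence, lift) can be sketched over toy transaction baskets. The merchant categories and baskets below are illustrative, not real transaction data.

```python
# Toy baskets: each set is one user's transaction across merchant categories.
transactions = [
    {"coffee", "bakery"},
    {"coffee", "bakery", "grocery"},
    {"coffee", "grocery"},
    {"bakery"},
    {"grocery"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """P(consequent | antecedent) estimated from the baskets."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support; >1 means positive association."""
    return confidence(antecedent, consequent) / support(consequent)

# Rule: {coffee} -> {bakery}
a, c = {"coffee"}, {"bakery"}
print(support(a | c), confidence(a, c), lift(a, c))
```

A rule would typically be kept only if its support, confidence and lift all clear chosen thresholds, which matches the measures listed in the bullet above.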
1. Updated the UICHR website to meet campus accessibility standards.
2. Created and maintained software applications for UIC's Department of Human Resources.
3. Used Google Analytics to track user behavior and other metrics to improve the efficiency of the websites.
1. Worked as an Automation Quality Assurance engineer on a Fixed Income portfolio management tool for a leading US-based investment management company.
2. Used SilkTest to run regressions, monitor results, log defects and develop new scripts.
3. Developed automated processes for running functional and regression testing on financial applications.
4. Enhanced existing automation scripts, improving the runtime of critical modules in the regression suite from 52 hours to 14 hours.
5. Worked on manual testing efforts, including running test cases and analyzing test results.
6. Gained financial knowledge through various online courses.