Yahoo • 2 roles • (Oct 2019 - Apr 2024) 4 years 6 months
Senior Software Development Engineer (Dec 2022 – Apr 2024) - 1 year, 4 months
Served as technical lead and active contributor across multiple projects within the Big Data Platform division, orchestrating teams and engaging directly in the development of distributed systems and web applications. These projects processed petabytes of data and millions of files daily, covering data engineering applications such as data warehousing systems, grid onboarding, permissions tracking, and various full stack web applications and visualization tools. Additionally, contributed significantly to cloud migration initiatives, employing a hybrid cloud approach to successfully transition numerous on-premise projects to AWS and GCP.
-
Yahoo's Big Data Warehouse System: Led the development of Yahoo’s Big Data warehouse system, which aggregates and processes logs from all Grid projects. The system stores data extracted from these logs in distributed SQL databases, including Hive and Amazon Redshift. The resulting SQL tables are crucial for cybersecurity teams, business analysts, and executives, enabling them to evaluate project efficacy, monitor growth trends, and detect security anomalies. Initially developed using Java, Hadoop MapReduce, and shell scripts, the modernized cloud version runs on AWS, using Python, AWS EMR, Lambda, EC2, and Amazon Redshift.
-
Business Intelligence Solution: Led the design, development and implementation of a business intelligence solution that visualizes cloud migration efforts, enabling strategic, data-driven decisions across Yahoo.
-
Doppler: Worked on enhancements and modernization of Yahoo’s Grid project onboarding and access management application, Doppler. This full-stack application, built with Node.js, Ember.js, and MySQL, supports projects using Hadoop, Storm, Presto, and HBase. I led efforts to implement new features, optimize performance, and transition the application to a cloud environment. Additionally, I developed functionality within Doppler that lets users manage memberships in Grid Unix groups, manage access rights for Grid headless users, and create and modify grid datasets and HTTP proxy allow-list entries.
-
Hue: Developed and managed Hue, an open-source, web-based SQL Assistant for querying databases and data warehouses, built primarily with Python and Django. My role included integrating Hue with Yahoo's Big Data infrastructure and Okta SSO to enhance security and user authentication. I was deeply involved in every phase of Hue’s lifecycle—from product development and deployment to customer support.
-
Big Data Platform Monitoring: Developed web applications that monitor, visualize, and administer Big Data platforms like Hadoop, HBase, Storm, and Presto for Yahoo and Verizon. Employing Node.js, JavaScript, Vue.js and MySQL, we created advanced dashboards that provided insights into grid utilization. These tools enabled the visualization of resource utilization metrics with varied dimensions and granularity, from cluster-wide to individual user levels, facilitating more informed decision-making.
-
Cloud Migration: Spearheaded cloud migration strategies, successfully transitioning significant on-premise projects to AWS and GCP using a hybrid cloud approach, which improved system scalability and operational efficiency.
-
Skills involved: Java, Golang, JavaScript, MySQL, Looker, AWS, GCP, SQLite, Airflow, Jenkins, Screwdriver, APM, CI/CD, JUnit, Shell, Bash, AWK, Python, Django, Vue.js, Ember.js, CSS, Hive, Hue, Jupyter, Linux, Unix, macOS, ETLs, REST APIs
Software Development Engineer II (Oct 2019 – Dec 2022) - 3 years, 3 months
Engaged in the technical development and continuous optimization of large-scale Big Data and full stack projects, handling petabytes of storage and processing millions of files daily. Embraced a DevOps approach, taking ownership of production systems to enhance operational success through automation, active alerting, and self-healing mechanisms. Led the resolution of production issues to ensure high availability and performance.
-
Data Warehouse Development: Played a key role in enhancing Yahoo’s data warehouse, implementing features such as adaptive memory allocation for failed MapReduce jobs and integrating data pipelines from Verizon grids.
-
Grid Project Onboarding and Access Management: Modernized the codebase, improving MySQL authentication, performance, and database scalability. Actively supported new feature deployments.
-
Hue Development: Improved deployment processes and monitoring of Hue, a web-based SQL Assistant. Also improved integration with Yahoo's data infrastructure.
-
Resource Usage Dashboards: Co-developed a Node.js, JavaScript, and Vue.js based proof of concept to visualize various aspects of grid utilization, aiding detailed resource usage monitoring.
-
Starling Cloud Migration: Contributed to migrating Yahoo’s Big Data warehouse, Starling, to the cloud (AWS), focusing on seamless integration and cloud optimization.
-
Technical Skills: Java, Python, JavaScript, Hadoop, Spark, Hive, AWS EMR, MySQL, SQLite, Airflow, Jenkins, APM, CI/CD, JUnit, Shell, Bash, AWK, Django, Vue.js, Ember.js, Hue, Jupyter, AWS, Linux, Unix, macOS.
Wolfram Research • 2 roles • (Aug 2014 - Oct 2019) 5 years 2 months
Software Engineer (Aug 2015 – Oct 2019) - 4 years, 2 months
Enhanced NLP capabilities for Wolfram Alpha, contributing to technologies used in virtual assistants like Amazon Alexa and Apple Siri. Developed several key components that led to a 40% improvement in query response times.
Software Engineering Co-op (Aug 2014 – Jul 2015) - 1 year
Developed a high-performance distributed licensing server that helped streamline the product activation process, significantly reducing server load and improving user experience. Also worked on NLP projects.