
< Hello World, I'm Mehroos Ali />

  • I am a collaborative data engineering professional with substantial experience in the analysis, design, development, implementation, migration, convergence, management, and support of large-scale databases, data warehouses, and big data systems. I build intuitive architectures and frameworks that help organizations capture, store, process, visualize, and analyze large volumes of structured, semi-structured, unstructured, and streaming heterogeneous data.
  • I am currently pursuing my Master's in Computer Science at the University of Texas at Dallas, specializing in Intelligent Systems.
  • I interned at Amazon as a Data Engineer this past summer, where I gained experience designing and developing streaming data pipelines.
  • I previously worked as a Data Engineer at Onward Technologies, a global IT service provider in domains such as data analytics, data science, Artificial Intelligence (AI) and Machine Learning (ML). Before that, I worked with Cognizant on their flagship core banking and insurance customer, Suncorp.
  • I am interested in Big Data Engineering, Cloud Data Warehousing, DevOps and Full Stack Development.
  • 📩 Feel free to reach me at [email protected].

🛠 My Toolkit

Java, Python, SQL, Hadoop, Spark, Kafka, Airflow, Hive, Sqoop, NiFi, Docker, IntelliJ

Oracle, MySQL, AWS, GCP, Azure, Maven, GitHub, Postman, Linux, Databricks, Jenkins, VS Code

Mehroos Ali's Projects

abcstorespipeline

Batch ETL data pipeline built on HDP 3.0 to process daily sales and business data and produce Power BI reports. The pipelines are automated using Airflow.

assembly_file_statistics

Assembly project to compute file statistics using MIPS for class CS5330 (Computer Architecture) at the University of Texas at Dallas.

batch_number_conversion

Batch number conversion project using MIPS for class CS5330 (Computer Architecture) at the University of Texas at Dallas.

bigquery-sparksql-batch-etl

Batch ETL pipeline project on GCP to load and transform daily flight data using Spark to update tables in BigQuery. The pipeline is automated using Airflow.

databricks-f1-project

A data pipeline project built on Databricks and Azure to demonstrate the lifecycle of a cloud data project.

ebay-db-design

eBay database design project for the class CS6360 (Database Design) at the University of Texas at Dallas.

kruskals-algorithm

Kruskal's algorithm project using Java for class CS5343 (Data Structures and Algorithms) at the University of Texas at Dallas.
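
For readers unfamiliar with the technique, here is a minimal sketch of Kruskal's minimum spanning tree algorithm. It is written in Python for brevity rather than the project's Java, and it is not taken from the repository; the function name `kruskal` and the `(weight, u, v)` edge layout are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical edge layout for this sketch: (weight, u, v)
Edge = Tuple[int, int, int]


def kruskal(num_vertices: int, edges: List[Edge]) -> List[Edge]:
    """Return the edges of a minimum spanning forest, in the order they are picked."""
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x: int) -> int:
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: int, b: int) -> bool:
        # Merge the two sets; return False if a and b were already connected.
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False
        if rank[root_a] < rank[root_b]:
            root_a, root_b = root_b, root_a
        parent[root_b] = root_a
        if rank[root_a] == rank[root_b]:
            rank[root_a] += 1
        return True

    mst: List[Edge] = []
    # Consider edges from lightest to heaviest; keep those that join two components.
    for weight, u, v in sorted(edges):
        if union(u, v):
            mst.append((weight, u, v))
    return mst
```

Calling `kruskal(4, [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 2, 3)])` skips the edge of weight 3, because vertices 0 and 2 are already connected through vertex 1 by the time it is considered.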

maze-solver

Maze Solver project using Java for class CS5343 (Data Structures and Algorithms) at the University of Texas at Dallas.

muy-feliz

Android application project in the React ecosystem for the class CS 6326 (Human Computer Interaction).

realtime-customer-viewership-analysis

A data pipeline built on the lambda architecture that unifies and consolidates real-time customer web events, weblogs, and profile data into a Hive warehouse for ad hoc analysis.

s3-redshift-batch-etl-pipeline

Built a functional Python ETL script whose functions initialize Spark clusters using the PySpark library and extract songs stored in an S3 bucket. Partitioned the songs data by year and artist_id and wrote it as compressed Parquet files to improve load performance. Used Spark's overwrite mode so that every new run of the script replaces the previous output in the data lake, avoiding duplicates. Orchestrated an ELT data pipeline that extracts from S3, loads into Redshift for transformation, and loads the output back to S3. Used Airflow hooks to keep connection credentials configurable, separating access rights from the code base for security. Used Airflow operators to execute the Redshift loading and transformation scripts from an Airflow DAG.
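
The duplicate-avoidance idea above can be illustrated without a Spark cluster. The following is a standard-library sketch, not the project's code: it stands in for a partitioned Spark write in overwrite mode by clearing the output directory before each run. The helper names `write_partitioned` and `count_rows`, the JSON-lines file format, and the directory layout are all assumptions made for this illustration.

```python
import json
import shutil
from collections import defaultdict
from pathlib import Path


def write_partitioned(records, out_dir, mode="overwrite"):
    """Write records into year=/artist_id= subdirectories (hypothetical layout)."""
    out = Path(out_dir)
    # Overwrite mode: drop the previous output so a rerun cannot duplicate rows.
    if mode == "overwrite" and out.exists():
        shutil.rmtree(out)

    # Group rows by their partition key.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["year"], rec["artist_id"])].append(rec)

    for (year, artist_id), rows in groups.items():
        part_dir = out / f"year={year}" / f"artist_id={artist_id}"
        part_dir.mkdir(parents=True, exist_ok=True)
        # Append within a partition file; only overwrite mode clears old data.
        with open(part_dir / "part-0000.jsonl", "a", encoding="utf-8") as fh:
            for row in rows:
                fh.write(json.dumps(row) + "\n")


def count_rows(out_dir):
    """Count rows across all partition files under out_dir."""
    return sum(
        len(path.read_text(encoding="utf-8").splitlines())
        for path in Path(out_dir).rglob("*.jsonl")
    )
```

Running `write_partitioned` twice on the same input in overwrite mode leaves the row count unchanged, whereas passing `mode="append"` doubles it, which is the duplication the overwrite setting is meant to prevent.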

twitter-sentiment-analysis

Personal project that pulls live Twitter data using the NiFi GetTwitter processor and pushes it to a Kafka topic. A Spark Streaming application consumes the topic and performs basic sentiment analysis, and the final result is stored in Elasticsearch for visualization with Kibana.

word-puzzle

Word Puzzle project using Java for class CS5343 (Data Structures and Algorithms) at the University of Texas at Dallas.
