Welcome to my data portfolio! I focus on solving real-world problems through data cleaning, advanced visualization, and statistical modeling. I am currently expanding my skillset into advanced machine learning and predictive analytics.
A comprehensive analysis of the global engineering workforce, identifying market standards and emerging technology trends.
The Challenge: Synthesize complex survey data from Stack Overflow to predict future hiring needs and developer skill gaps.
Key Accomplishments:
- Data Cleaning: Engineered a custom Python pipeline to handle semi-structured data and remove 20 hidden duplicates using `ResponseId` logic.
- Advanced Analysis: Utilized K-Nearest Neighbors (KNN) to impute missing AI adoption data and IQR methods to filter salary outliers.
- Visualization: Created an interactive dashboard in IBM Cognos to map global developer density and technology adoption rates.
- Insight: Identified a "Golden Triangle" of essential skills (JS, Python, SQL) and predicted the rise of TypeScript and Go as high-value future skills.
Tech Stack: Python (Pandas, NumPy, Scikit-learn), IBM Cognos Analytics.
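
To give a feel for the cleaning steps listed above, here is a minimal sketch on a tiny made-up table. It is not the actual project pipeline: every column name other than `ResponseId` (`YearsCode`, `AIUsageScore`, `Salary`) is a hypothetical stand-in, the values are invented, and the neighbor count and 1.5×IQR fences are common defaults rather than the settings used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Toy stand-in for survey rows. Only ResponseId comes from the project
# description; the other columns and all values are hypothetical.
df = pd.DataFrame({
    "ResponseId": [1, 2, 2, 3, 4, 5, 6],
    "YearsCode": [2.0, 5.0, 5.0, 10.0, 3.0, 8.0, 6.0],
    "AIUsageScore": [1.0, 3.0, 3.0, np.nan, 2.0, 4.0, 3.0],
    "Salary": [50_000, 65_000, 65_000, 90_000, 55_000, 80_000, 1_000_000],
})

# 1. Drop repeated submissions, keeping the first row per ResponseId.
deduped = df.drop_duplicates(subset="ResponseId", keep="first").reset_index(drop=True)

# 2. Impute missing numeric values from the 2 most similar rows (KNN).
features = ["YearsCode", "AIUsageScore"]
imputer = KNNImputer(n_neighbors=2)
deduped[features] = imputer.fit_transform(deduped[features])

# 3. Keep only salaries inside the 1.5 * IQR fences.
q1, q3 = deduped["Salary"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = deduped["Salary"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = deduped[in_range]

print(f"rows: {len(df)} raw, {len(deduped)} deduplicated, {len(cleaned)} after outlier filter")
```

On real survey data the imputation step would usually need encoded and scaled features first, since KNN distances are sensitive to column units.
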
- Google Advanced Data Analytics Professional Certificate (In Progress)
  - Focus: Advanced regression, machine learning models, and Python automation.
- IBM Data Analyst Professional Certificate
  - Focus: Data analysis, visualization, and SQL databases.
- Market Trends Analyzer (Tesla & GameStop)
  - Focus: Time-series analysis and stock price correlation.
  - Tools: Python, yfinance, Matplotlib.
  - Summary: Analyzed historical revenue and stock data to visualize the volatility of "Meme Stocks" vs. traditional automotive giants.
- Twitter Sentiment Analysis
  - Focus: Sentiment mining and text processing.
  - Tools: Python, NLTK/TextBlob.
  - Summary: Processed social media data to classify public sentiment (Positive/Negative/Neutral) regarding trending topics.
- Gender Classifier
  - Focus: Classification algorithms.
  - Tools: Python, Scikit-learn (Decision Trees).
  - Summary: A foundational ML project building a decision tree to classify gender based on biometric feature inputs.
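
As a rough illustration of the decision-tree approach in the Gender Classifier project, the sketch below fits Scikit-learn's `DecisionTreeClassifier` on a handful of synthetic rows. The features (height, weight, shoe size), the values, and the labels are assumptions made up for this example and are not the project's dataset.

```python
from sklearn.tree import DecisionTreeClassifier

# Synthetic toy measurements: [height_cm, weight_kg, shoe_size_eu].
# Invented for illustration only; not real data.
X = [
    [181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37],
    [166, 65, 40], [190, 90, 47], [175, 64, 39], [177, 70, 40],
    [159, 55, 37], [171, 75, 42], [181, 85, 43],
]
y = [
    "male", "male", "female", "female",
    "male", "male", "female", "female",
    "female", "male", "male",
]

# Fit an unpruned tree; random_state makes tie-breaking reproducible.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# Classify one new, equally hypothetical sample.
prediction = clf.predict([[190, 70, 43]])
print(prediction[0])
```

A tree this small memorizes its training rows, so a held-out test split would be needed before reading anything into its accuracy.
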
| Category | Tools & Technologies |
|---|---|
| Languages | Python, SQL, HTML/CSS |
| Data Manipulation | Pandas, NumPy, Statistics |
| Visualization | Matplotlib, Seaborn, IBM Cognos, Tableau |
| Machine Learning | Scikit-learn (Regression, KNN, Decision Trees) |
| Tools | Jupyter Notebooks, Git/GitHub, Excel |
I am constantly building new things and refining my models. If you have any questions about my projects or would like to discuss data analytics, feel free to reach out!
- 📧 Email: hiusaidk@gmail.com