Skip to content

UMwai/resumerank-ai

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Investment Intelligence Platform

Leveraging Data Engineering + Finance + Biotech expertise for systematic alpha generation

A complete, production-ready platform that detects investment opportunities through automated signal detection across clinical trials, patent expirations, and insider activity.

🎯 Quick Start

# 1. Clone or navigate to the project
cd /Users/waiyang/Desktop/repo/dreamers-v2

# 2. Set up databases
createdb clinical_trials patent_intelligence insider_signals

# 3. Load schemas
psql clinical_trials < clinical_trial_signals/database/schema.sql
psql patent_intelligence < patent_intelligence/sql/schema.sql
psql insider_signals < insider_hiring_signals/schema.sql

# 4. Run the dashboard (demo mode works without databases)
cd dashboard
pip install -r requirements.txt
streamlit run app.py

Access dashboard at: http://localhost:8501


📊 Three Core Systems

1. Clinical Trial Signal Detection

Goal: Detect trial outcomes 2-4 weeks before announcements

Edge: Protocol amendments, enrollment patterns, patent filings signal trial success/failure

ROI Target: 5-10x on options plays

📖 Full Specs | 📁 Code


2. Patent/IP Intelligence

Goal: Identify patent cliff events 6-18 months in advance

Edge: Predict generic competition timing with 90%+ accuracy

ROI Target: 20-40% on pair trades (long generic, short branded)

📖 Full Specs | 📁 Code


3. Insider Activity + Hiring Signals

Goal: Detect company trajectory through behavioral signals

Edge: Combine Form 4, 13F, and hiring data for directional conviction

ROI Target: 15-25% on 3-6 month positions

📖 Full Specs | 📁 Code


🎛️ Integrated Dashboard

Streamlit-based dashboard combining all three systems:

  • Combined scoring (weighted across all signals)
  • Real-time data from all pipelines
  • Interactive charts and filterable tables
  • Watchlist management
  • Configurable alerts

📁 Code | [🚀 Demo Mode Available]


⚙️ Orchestration & Automation

Automated scheduling using Apache Airflow (or Prefect):

Pipeline Frequency Time (ET)
Clinical Trial Signals Daily 6:00 PM
Patent/IP Intelligence Weekly Monday 8:00 AM
Form 4 Insider Trades Every 30 min Market hours
13F Holdings Quarterly 45 days post-quarter

Multi-channel alerts: Email, Slack, SMS

📁 Code | [🐳 Docker Deployment Available]


💰 Budget & Costs

Initial Investment

  • Total Budget: $5,000
  • Infrastructure Setup: ~$1,000
  • Remaining for Trading: ~$4,000

Monthly Operating Costs

  • Local Deployment: $670/month
  • Cloud Deployment (AWS): $600-800/month

Includes: Cloud infrastructure, AI APIs, LinkedIn access, PACER account, email/SMS


📈 Expected Returns

Based on signal accuracy and market volatility:

Strategy Allocation Expected Return Risk
Clinical trial options 30% 50-100% High
Patent cliff pairs 40% 20-40% Medium
Insider-driven longs 30% 15-25% Low-Med

Target Blended Return: 30-50% annually


🏗️ Project Structure

dreamers-v2/
├── README.md                              # You are here
├── PROJECT_SUMMARY.md                     # Complete documentation
│
├── clinical_trial_signals/                # System 1: Trial signals
│   ├── main.py
│   ├── demo.py
│   └── database/, scrapers/, scoring/, alerts/
│
├── patent_intelligence/                   # System 2: Patent cliffs
│   ├── src/pipeline.py
│   └── extractors/, transformers/, loaders/
│
├── insider_hiring_signals/                # System 3: Insider/hiring
│   ├── main.py
│   └── scrapers/, models/, reports/
│
├── dashboard/                             # Integrated UI
│   ├── app.py
│   └── components/, pages/, utils/
│
└── orchestration/                         # Automation
    ├── dags/                              # Airflow DAGs
    ├── docker/                            # Local deployment
    └── cloud/                             # AWS/GCP deployment

🚀 Usage Examples

Run Individual Systems

# Clinical Trial Signals
cd clinical_trial_signals
python main.py --full

# Patent Intelligence
cd patent_intelligence
python -m src.pipeline --top-drugs 50

# Insider/Hiring Signals
cd insider_hiring_signals
python main.py scrape --all
python main.py score --ticker MRNA

Deploy with Docker

cd orchestration/docker
docker-compose up -d

# Access services
# Airflow: http://localhost:8080 (admin/admin)
# Grafana: http://localhost:3000 (admin/admin)
# Health: http://localhost:9091/health

📚 Documentation

Each system folder also has its own README with specific setup instructions.


✅ Next Steps

Week 1: Validation

  • Run all scrapers and verify data quality
  • Review signal reports for accuracy
  • Backtest scoring models on historical data

Week 2: Paper Trading

  • Track 5-10 hypothetical trades
  • Monitor signal accuracy
  • Refine weights and thresholds

Week 3: Go Live

  • Start with small positions ($500 each)
  • Focus on highest-confidence signals (score >= 7)
  • Implement strict stop losses

Month 2-3: Scale

  • Increase position sizes gradually
  • Expand watchlist
  • Consider automated execution

🎓 Why This Works

This platform provides an unfair advantage because:

  1. Rare Skill Combination: Data engineering + finance + biotech expertise is extremely uncommon
  2. Information Asymmetry: Automated detection of public signals that others miss
  3. Systematic Edge: Removes emotion from investment decisions
  4. Scalable: Can manage larger capital with same infrastructure
  5. Defensible: Can't be easily replicated by general AI consultants

💡 Future Enhancements

Additional Systems (when budget allows):

  • HEOR model validation
  • Clinical trial cost benchmarking
  • FDA submission QC service
  • Biotech M&A due diligence

Advanced Features:

  • Automated trade execution (via Interactive Brokers API)
  • Machine learning models for signal weighting
  • Real-time news sentiment analysis
  • Options strategy optimizer

Monetization (after proven track record):

  • Raise a biotech-focused fund
  • Sell research reports/newsletter
  • License platform to hedge funds
  • Offer consulting services

🔒 Risk Management

Position Sizing

  • Max 10% per trade
  • Max 40% per system
  • 20% cash reserve

Stop Losses

  • Clinical trials: 50% (options decay)
  • Patent cliffs: 15% (thesis break)
  • Insider plays: 10% (momentum loss)

Signal Validation

  • Minimum confidence: 0.6
  • Prefer 3+ clustered signals
  • Always review contrarian signals

📞 Support

Each system has detailed troubleshooting in its README. Common issues:

Database errors: Check PostgreSQL is running (pg_isready) API limits: Review rate limiting in configs Missing data: Check scraper logs in logs/ folders


📜 License

This is proprietary investment software. Unauthorized distribution is prohibited.


🏆 Success Metrics

Track these KPIs weekly:

System Performance:

  • Data freshness < 24 hours
  • Uptime > 99%
  • Signal accuracy > 70%

Investment Performance:

  • Win rate > 60%
  • Average return > 25%
  • Sharpe ratio > 1.5
  • Max drawdown < 20%

Built with: Python, PostgreSQL, Streamlit, Apache Airflow, Docker

Powered by: AI (Claude Sonnet, GPT-4), SEC EDGAR API, ClinicalTrials.gov API, USPTO API


Let's generate alpha! 📈

About

AI-powered resume screening API for staffing agencies. Screen 100 resumes in 10 minutes. K budget → .5K MRR in 90 days. Built with Node.js, TypeScript, and Anthropic Claude.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors