ijshd7/datanexus

DataNexus

DataNexus is a command-line tool for correlating public datasets directly from the terminal. It fetches time-series data from multiple free APIs, normalizes it into a common format, computes statistical correlations, and presents results in plain-language summaries that anyone can understand. Detailed statistical tables, ASCII charts, and scatter plots are available for deeper analysis.

Architecture

CLI (Cobra)
  ├── Data Source Registry (FRED, World Bank, Alpha Vantage, NOAA, FBI UCR)
  ├── HTTP Client (timeouts, exponential backoff retry, rate limiting, disk cache)
  ├── Normalization Layer (resample, align, z-score, min-max, percent change)
  ├── Correlation Engine (Pearson, Spearman, Kendall, p-values, lag analysis)
  └── Renderer (go-pretty tables, asciigraph charts, ASCII scatter plots, lipgloss colors)

Data Sources

| Source | ID | API Key | Description |
|---|---|---|---|
| FRED | fred | Required | Federal Reserve Economic Data (GDP, unemployment, CPI, etc.) |
| World Bank | worldbank | Not needed | World development indicators (population, GNI, etc.) |
| Alpha Vantage | alphavantage | Required | Stock market data (monthly close prices) |
| NOAA | noaa | Required | Climate data (temperature, precipitation, wind, snow) |
| FBI UCR | fbi | Required | National crime statistics (violent crime, property crime, etc.) |

Prerequisites

  • Go 1.24 or later
  • One or more API keys (see API Keys below)

Building

# Build the binary
make build

# Build and run
make run

# Run tests
make test

# Clean build artifacts
make clean

Or build directly with Go:

go build -o datanexus .

All examples below use ./datanexus to run the locally built binary. To run datanexus without the ./ prefix, copy it to a directory on your PATH (this may require sudo):

cp ./datanexus /usr/local/bin/

API Keys

DataNexus reads API keys from environment variables. All keys are free to obtain.

| Variable | Source | Registration |
|---|---|---|
| DATANEXUS_FRED_KEY | FRED | https://fred.stlouisfed.org/docs/api/api_key.html |
| DATANEXUS_ALPHAVANTAGE_KEY | Alpha Vantage | https://www.alphavantage.co/support/#api-key |
| DATANEXUS_NOAA_KEY | NOAA | https://www.ncdc.noaa.gov/cdo-web/token |
| DATANEXUS_FBI_KEY | FBI UCR | https://api.data.gov/signup/ |

World Bank requires no API key.

Option 1: Use a .env file

Copy the example file and add your keys:

cp .env.example .env
# Edit .env with your API keys

DataNexus loads .env from the current working directory at startup. The .env file is gitignored.

Option 2: Set in your shell

export DATANEXUS_FRED_KEY=your_fred_key
export DATANEXUS_ALPHAVANTAGE_KEY=your_alphavantage_key
export DATANEXUS_NOAA_KEY=your_noaa_key
export DATANEXUS_FBI_KEY=your_fbi_key

Run ./datanexus configure to see all keys and registration links.

Usage

List available data sources

./datanexus sources

Shows all registered sources with their configuration status. Unconfigured sources include a hint to run ./datanexus configure.

Search for datasets

./datanexus search "unemployment"
./datanexus search "GDP"
./datanexus search "temperature"

Searches across all configured sources concurrently and displays matching datasets with their full IDs. Results include a tip on how to use the dataset IDs to run a correlation.
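A concurrent fan-out search like the one described above can be sketched with goroutines and a WaitGroup. The `searcher` type, `staticSource`, and `searchAll` are hypothetical stand-ins; the real registry's interface is not shown in this README.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// searcher is a hypothetical stand-in for a data source's search method.
type searcher func(query string) []string

// staticSource fakes a source with a fixed catalog of dataset IDs.
func staticSource(ids ...string) searcher {
	return func(query string) []string {
		var hits []string
		for _, id := range ids {
			if strings.Contains(strings.ToLower(id), strings.ToLower(query)) {
				hits = append(hits, id)
			}
		}
		return hits
	}
}

// searchAll queries every source in its own goroutine and returns
// full IDs in the source:id form, sorted for stable output.
func searchAll(sources map[string]searcher, query string) []string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []string
	)
	for name, search := range sources {
		wg.Add(1)
		go func(name string, search searcher) {
			defer wg.Done()
			for _, id := range search(query) {
				mu.Lock()
				results = append(results, name+":"+id)
				mu.Unlock()
			}
		}(name, search)
	}
	wg.Wait()
	sort.Strings(results)
	return results
}

func main() {
	sources := map[string]searcher{
		"fred":      staticSource("GDP", "UNRATE"),
		"worldbank": staticSource("NY.GDP.MKTP.CD", "SP.POP.TOTL"),
	}
	for _, id := range searchAll(sources, "gdp") {
		fmt.Println(id)
	}
}
```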

Correlate two datasets

./datanexus correlate fred:GDP fred:UNRATE

Fetches both datasets, aligns them to a common time axis, applies z-score scaling, and computes Pearson correlation. By default, results are shown as a human-readable summary:

  Correlation Results
  ───────────────────────────────────────

  Gross Domestic Product vs Unemployment Rate

  Correlation:    -0.55
  Strength:       Moderate negative
  Significance:   Highly significant (p < 0.001)
  Data points:    100
  Method:         Pearson

Add --verbose to see the full statistical table with raw R values, p-values, and significance stars.

Choose a correlation method

./datanexus correlate fred:GDP fred:UNRATE --method spearman
./datanexus correlate fred:GDP fred:UNRATE --method kendall

Supported methods: pearson (default), spearman, kendall.

Set a date range

./datanexus correlate fred:GDP fred:UNRATE --start 2010-01-01 --end 2023-12-31

Default start is 2000-01-01. Default end is today.

Change the scaling method

./datanexus correlate fred:GDP fred:UNRATE --scale minmax
./datanexus correlate fred:GDP fred:UNRATE --scale pctchange
./datanexus correlate fred:GDP fred:UNRATE --scale none

Scaling options: zscore (default), minmax (0-1 range), pctchange (period-over-period %), none.
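The two alternative scalings can be sketched as follows. This is an illustration of the standard formulas, with hypothetical function names; in particular, how DataNexus treats the first point under pctchange (dropped here) or a constant series under minmax (mapped to zeros here) is an assumption.

```go
package main

import "fmt"

// minMax rescales xs linearly onto the 0-1 range.
func minMax(xs []float64) []float64 {
	if len(xs) == 0 {
		return nil
	}
	lo, hi := xs[0], xs[0]
	for _, x := range xs {
		if x < lo {
			lo = x
		}
		if x > hi {
			hi = x
		}
	}
	out := make([]float64, len(xs))
	if hi == lo {
		return out // constant series: all zeros (assumed convention)
	}
	for i, x := range xs {
		out[i] = (x - lo) / (hi - lo)
	}
	return out
}

// pctChange returns period-over-period change in percent.
// The result has one fewer point than the input.
func pctChange(xs []float64) []float64 {
	if len(xs) < 2 {
		return nil
	}
	out := make([]float64, 0, len(xs)-1)
	for i := 1; i < len(xs); i++ {
		out = append(out, (xs[i]-xs[i-1])/xs[i-1]*100)
	}
	return out
}

func main() {
	series := []float64{100, 150, 75}
	fmt.Println(minMax(series))
	fmt.Println(pctChange(series))
}
```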

Show an ASCII time-series chart

./datanexus correlate fred:GDP fred:UNRATE --chart

Overlays both series on a single ASCII line chart.

Show a scatter plot

./datanexus correlate fred:GDP fred:UNRATE --scatter

Renders an ASCII scatter plot of the two series.

Correlation matrix (3+ datasets)

./datanexus correlate fred:GDP fred:UNRATE worldbank:NY.GDP.MKTP.CD --matrix

Computes pairwise correlations across all datasets and displays an NxN matrix with color-coded significance.

Lag analysis

./datanexus correlate fred:GDP fred:UNRATE --lag 5

Tests time-shifted correlations at every lag from -5 to +5 periods to find whether one series leads or lags the other. The default output explains the result in plain language:

  Lag Analysis
  ───────────────────────────────────────

  Gross Domestic Product vs Unemployment Rate

  Best lag:        +3 (3 quarters)
  Correlation:     -0.72
  Strength:        Strong negative
  Significance:    Highly significant (p < 0.001)

  Changes in Unemployment Rate tend to follow changes
  in Gross Domestic Product by about 3 quarters.

Add --verbose to see the full table of correlations at every lag offset.
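A lag scan can be sketched as follows: for each offset, shift one series against the other, correlate the overlapping part, and keep the offset with the largest absolute correlation. The sign convention (a positive lag means the second series follows the first) matches the example output above but is otherwise an assumption, as are the function names.

```go
package main

import (
	"fmt"
	"math"
)

// pearson returns the Pearson correlation of two equal-length series.
func pearson(xs, ys []float64) float64 {
	n := float64(len(xs))
	var sx, sy float64
	for i := range xs {
		sx += xs[i]
		sy += ys[i]
	}
	mx, my := sx/n, sy/n
	var sxy, sxx, syy float64
	for i := range xs {
		dx, dy := xs[i]-mx, ys[i]-my
		sxy += dx * dy
		sxx += dx * dx
		syy += dy * dy
	}
	return sxy / math.Sqrt(sxx*syy)
}

// bestLag pairs xs[t] with ys[t+lag] for each lag in [-maxLag, maxLag]
// and returns the lag with the largest |r|. Assumes len(xs) == len(ys).
func bestLag(xs, ys []float64, maxLag int) (int, float64) {
	best, bestR := 0, 0.0
	for lag := -maxLag; lag <= maxLag; lag++ {
		var a, b []float64
		if lag >= 0 {
			if lag >= len(xs) {
				continue
			}
			a, b = xs[:len(xs)-lag], ys[lag:]
		} else {
			if -lag >= len(xs) {
				continue
			}
			a, b = xs[-lag:], ys[:len(ys)+lag]
		}
		if len(a) < 3 {
			continue // too little overlap to be meaningful
		}
		r := pearson(a, b)
		if !math.IsNaN(r) && math.Abs(r) > math.Abs(bestR) {
			best, bestR = lag, r
		}
	}
	return best, bestR
}

func main() {
	x := []float64{1, 3, 2, 5, 4, 7, 6, 9, 8, 10}
	y := []float64{0, 0, 1, 3, 2, 5, 4, 7, 6, 9} // x delayed by two periods
	lag, r := bestLag(x, y, 3)
	fmt.Printf("best lag: %+d, r = %.2f\n", lag, r)
}
```

Each shift shortens the overlap, so large lags on short series rest on few data points and should be read with caution.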

Cross-source correlation

./datanexus correlate fred:GDP worldbank:NY.GDP.MKTP.CD --chart
./datanexus correlate alphavantage:AAPL fred:DFF --method spearman
./datanexus correlate noaa:TAVG fbi:violent_crime --start 2005-01-01

Any combination of sources can be correlated. The normalization layer handles different frequencies by resampling to the coarsest common frequency and intersecting date ranges.

View version

./datanexus version

View API key configuration help

./datanexus configure

Global Flags

| Flag | Short | Description |
|---|---|---|
| --verbose | -v | Show detailed statistical tables instead of plain-language summaries |
| --output | -o | Output format: table (default), json |
| --no-color | | Disable colored terminal output |

Correlate Command Flags

| Flag | Short | Default | Description |
|---|---|---|---|
| --method | -m | pearson | Correlation method: pearson, spearman, kendall |
| --start | -s | 2000-01-01 | Start date (YYYY-MM-DD) |
| --end | -e | today | End date (YYYY-MM-DD) |
| --scale | | zscore | Scaling: zscore, minmax, pctchange, none |
| --lag | -l | 0 | Max lag periods for time-shifted analysis |
| --chart | -c | false | Show ASCII time-series chart |
| --scatter | | false | Show ASCII scatter plot |
| --matrix | | false | Show full NxN correlation matrix |

Testing

Run all tests

make test

Or directly:

go test ./... -v

Manual testing workflow

  1. Verify the build compiles:

     make build

  2. Check source registration (no API keys needed):

     ./datanexus sources

     You should see a table listing all 5 data sources with their configuration status. Unconfigured sources will show a setup hint.

  3. Test with World Bank (no API key required):

     ./datanexus search "GDP"
     ./datanexus correlate worldbank:NY.GDP.MKTP.CD worldbank:SP.POP.TOTL --chart

  4. Test with FRED (requires API key):

     export DATANEXUS_FRED_KEY=your_key
     ./datanexus search "unemployment"
     ./datanexus correlate fred:GDP fred:UNRATE --method pearson --chart --scatter

  5. Test cross-source correlation:

     ./datanexus correlate fred:GDP worldbank:NY.GDP.MKTP.CD

     You should see a plain-language summary with correlation strength and significance. Add -v to see the full statistical table.

  6. Test correlation matrix:

     ./datanexus correlate fred:GDP fred:UNRATE fred:CPIAUCSL --matrix

     The matrix table is displayed along with a key findings summary highlighting the strongest and weakest pairs.

  7. Test lag analysis:

     ./datanexus correlate fred:GDP fred:UNRATE --lag 4

     You should see a summary describing which series leads or lags the other and by how many periods. Add -v for the full lag table.

  8. Test scaling modes:

     ./datanexus correlate fred:GDP fred:UNRATE --scale none
     ./datanexus correlate fred:GDP fred:UNRATE --scale minmax
     ./datanexus correlate fred:GDP fred:UNRATE --scale pctchange

Caching

API responses are cached on disk at ~/.datanexus/cache/ with a 24-hour TTL. This avoids redundant API calls during testing. To force fresh data, delete the cache directory:

rm -rf ~/.datanexus/cache

Project Structure

datanexus/
├── main.go                          # Entry point
├── cmd/                             # CLI commands (Cobra)
│   ├── root.go                      # Root command, global flags
│   ├── correlate.go                 # correlate command (fetch, align, scale, compute, render)
│   ├── search.go                    # search command
│   ├── sources.go                   # sources command
│   ├── configure.go                 # configure command (API key help)
│   ├── version.go                   # version command
│   └── registry.go                  # Shared data source registry initializer
├── internal/
│   ├── client/                      # HTTP client infrastructure
│   │   ├── http.go                  # Base client with timeouts
│   │   ├── retry.go                 # Exponential backoff retry wrapper
│   │   ├── ratelimit.go             # Token bucket rate limiter wrapper
│   │   └── cache.go                 # Disk-based response cache with TTL
│   ├── config/
│   │   └── config.go                # Environment variable config (API keys, cache dir)
│   ├── datasource/                  # Data source interface and connectors
│   │   ├── source.go                # DataSource interface
│   │   ├── dataset.go               # DatasetMeta model, ParseFullID
│   │   ├── registry.go              # Registry with concurrent search
│   │   ├── fred/fred.go             # FRED connector
│   │   ├── worldbank/worldbank.go   # World Bank connector
│   │   ├── alphavantage/alphavantage.go  # Alpha Vantage connector
│   │   ├── noaa/noaa.go             # NOAA Climate Data connector
│   │   └── fbi/fbi.go              # FBI UCR crime data connector
│   ├── normalize/                   # Time-series normalization
│   │   ├── timeseries.go            # TimeSeries, DataPoint, Frequency types
│   │   ├── align.go                 # Resample, align, intersect time ranges
│   │   └── scale.go                 # Z-score, min-max, percent change scaling
│   ├── correlation/                 # Statistical correlation engine
│   │   ├── correlation.go           # Pearson, Spearman, Kendall with p-values
│   │   ├── matrix.go                # NxN pairwise correlation matrix
│   │   └── lag.go                   # Time-shifted lag analysis
│   └── render/                      # Terminal output rendering
│       ├── style.go                 # Lipgloss color-coded significance
│       ├── summary.go              # Human-readable result summaries
│       ├── table.go                 # go-pretty table rendering (--verbose)
│       ├── chart.go                 # asciigraph ASCII line charts
│       └── scatter.go               # ASCII scatter plots
├── Makefile
├── go.mod
└── go.sum
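The tree lists a ParseFullID helper in dataset.go, and the examples throughout use full IDs of the form source:id (for example fred:GDP). Its real signature is not shown in this README, so the following is only a guess at what such a parser might do, written with a hypothetical lowercase `parseFullID`:

```go
package main

import (
	"fmt"
	"strings"
)

// parseFullID splits "source:id" at the first colon.
// Hypothetical sketch; the actual ParseFullID may differ.
func parseFullID(full string) (string, string, error) {
	source, id, ok := strings.Cut(full, ":")
	if !ok || source == "" || id == "" {
		return "", "", fmt.Errorf("invalid dataset ID %q: want source:id", full)
	}
	return source, id, nil
}

func main() {
	for _, full := range []string{"fred:GDP", "worldbank:NY.GDP.MKTP.CD", "GDP"} {
		source, id, err := parseFullID(full)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("source=%s id=%s\n", source, id)
	}
}
```

Splitting at the first colon only means a dataset ID that itself contains a colon would be preserved intact.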

License

This project is for research and educational purposes.
