OpenRef

OpenRef is a production-oriented TypeScript SDK for web-grounded answers with optional inline citations.

It keeps the runtime model simple:

Query -> Web Search -> Content Extraction/Chunking -> Streaming Chat Response

Features

  • Web search with provider fallback (Brave -> DuckDuckGo -> Bing)
  • Source deduplication and domain diversity filtering
  • Optional LLM reranking of search candidates
  • Optional LLM query expansion for broader retrieval
  • Query-aware page extraction and chunk scoring
  • Streaming chat response
  • Configurable citation behavior (citationStrictness)
  • Typed SDK surface (search, chat, event streams)

Install

npm install @altamsh04/openref

Quick Start

import { OpenRef } from "@altamsh04/openref";

const agent = new OpenRef({
  llm: {
    apiKey: "sk-or-v1......",
    chatModel: "stepfun/step-3.5-flash:free",
    fallbackChatModels: [
      "nvidia/nemotron-3-nano-30b-a3b:free",
      "mistralai/mistral-small-3.1-24b-instruct:free"
    ],
    systemPrompt: "Answer in short bullet points.",
    citationStrictness: true,
    maxRetries: 2,
    retryDelayMs: 1200,
    maxOutputTokens: 2048,
    maxContinuationRequests: 2
  },
  search: {
    engineProvider: { provider: "brave" }, // fallback order: duckduckgo -> bing
    preferLatest: true,
    timeZone: "America/New_York",
    maxSources: 5,
    queryExpansion: true,
    queryExpansionValue: 3,
    queryExpansionTimeout: 1200,
    searchTimeout: 5000,
    enableReranking: true,
    rerankTimeout: 4000
  },
  retrieval: {
    contentTimeout: 6000,
    maxContextTokens: 6000,
    chunkTargetTokens: 400
  },
  response: {
    stream: true
  }
});

const query = "Today's top news in AI";

async function run() {
  // Per-request overrides
  const response = await agent.chat(query, {
    stream: false,
    systemPrompt: "Keep it under 120 words and mention uncertainty clearly.",
    citationStrictness: false
  });

  console.log(JSON.stringify(response, null, 2));
}

run();

Local Example

Run the local SDK example:

OPENROUTER_API_KEY=sk-or-v1-xxxx npm run example:run -- "Today's top news in AI"

Run dedicated non-stream and stream examples:

OPENROUTER_API_KEY=sk-or-v1-xxxx npm run example:non-stream -- "What is OpenRouter?"
OPENROUTER_API_KEY=sk-or-v1-xxxx npm run example:stream -- "What is OpenRouter?"

Files:

  • example/index.ts: example app using the local source (../src)
  • example/non-stream.ts: non-stream response example
  • example/stream.ts: stream response example
  • tsconfig.example.json: example build config

API

new OpenRef(config)

config.llm

  • apiKey?: string OpenRouter API key. Preferred key location.
  • chatModel?: string Primary chat model.
  • fallbackChatModels?: string[] Models used if primary model fails.
  • systemPrompt?: string Base system instruction for response style/behavior.
  • citationStrictness?: boolean Citation policy in response text.
  • maxRetries?: number Retry attempts per model request.
  • retryDelayMs?: number Backoff delay between retries.
  • maxOutputTokens?: number Max tokens per generation request.
  • maxContinuationRequests?: number Extra continuation requests when output is truncated.

citationStrictness behavior:

  • true (default): the model is instructed to include inline [N] citations for factual claims.
  • false: the model is instructed to avoid [N] citations unless the user explicitly asks for them.
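Because citation behavior is an instruction to the model rather than a hard filter, markers can occasionally slip through. If you need to guarantee citation-free output, a small post-processing pass works; stripCitations below is a hypothetical helper, not part of the SDK:

```typescript
// Hypothetical post-processing helper (not part of OpenRef):
// remove inline [N] citation markers from a response string.
function stripCitations(text: string): string {
  // Drop each marker and the single space that may precede it.
  return text.replace(/\s?\[\d+\]/g, "");
}

console.log(stripCitations("OpenRouter proxies many models [1], including free tiers [2]."));
// → "OpenRouter proxies many models, including free tiers."
```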

config.search

  • preferLatest?: boolean Adds recency bias to search and prompting.
  • timeZone?: string Used for date context formatting.
  • maxSources?: number Final number of sources to keep.
  • queryExpansion?: boolean Expand user query into subqueries using LLM before retrieval.
  • queryExpansionValue?: number Number of expanded subqueries (0-5).
  • queryExpansionTimeout?: number Max expansion wait time in ms.
  • engineProvider?: { provider?: "brave" | "duckduckgo" | "bing" | "searxng" | Array<...>, queryUrl?: string } Choose the preferred engine order. If a single provider is given (e.g. "brave"), OpenRef automatically falls back to the other mainstream engines. For provider: "searxng", you can pass a custom queryUrl (for example http://localhost:8080/search?q={query}).
  • searchTimeout?: number Search request timeout in ms.
  • enableReranking?: boolean Enable LLM reranking for candidates.
  • rerankTimeout?: number Reranking timeout in ms.
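For instance, a config.search fragment selecting an explicit provider order with a self-hosted SearXNG instance could look like the sketch below (field names follow the list above; the values are illustrative):

```typescript
// Illustrative search config fragment; shapes follow the README field list.
const search = {
  engineProvider: {
    provider: ["searxng", "brave", "duckduckgo"],       // explicit preference order
    queryUrl: "http://localhost:8080/search?q={query}"  // custom SearXNG endpoint
  },
  searchTimeout: 5000,
  enableReranking: true
};
```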

config.retrieval

  • contentTimeout?: number Page fetch/extraction timeout in ms.
  • maxContextTokens?: number Token budget for assembled context.
  • chunkTargetTokens?: number Approximate target size of chunks.
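To make chunkTargetTokens concrete, here is a rough illustration of token-budgeted chunking. This is a sketch only: OpenRef's actual chunker and scoring are internal, and the ~4 characters-per-token heuristic is an assumption:

```typescript
// Illustrative chunker: split text into pieces of roughly `targetTokens`
// tokens, using the common ~4 characters-per-token approximation.
function chunkByTokens(text: string, targetTokens: number): string[] {
  const targetChars = targetTokens * 4;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += targetChars) {
    chunks.push(text.slice(i, i + targetChars));
  }
  return chunks;
}
```

With chunkTargetTokens: 400 this yields chunks of about 1600 characters each, which the retrieval step can then score against the query.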

config.response

  • stream?: boolean Default chat mode (true for event stream, false for aggregated response).

search(query: string): Promise<SearchResult>

Runs retrieval and ranking only.

chat(query: string, options?)

Per-request options:

  • stream?: boolean
  • systemPrompt?: string Overrides constructor llm.systemPrompt.
  • citationStrictness?: boolean Overrides constructor llm.citationStrictness.

When stream: true, chat returns an AsyncGenerator<ChatEvent> that yields:

  • expanded_queries (optional, early event when enabled)
  • sources
  • text (multiple chunks)
  • citations
  • done
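A sketch of consuming these events follows. The ChatEvent payload fields below (type, delta, and so on) are assumptions about the event shape, not confirmed by this README; the real type ships with the SDK:

```typescript
// Assumed minimal event shape for illustration.
type ChatEvent =
  | { type: "expanded_queries"; queries: string[] }
  | { type: "sources"; sources: Array<{ url: string }> }
  | { type: "text"; delta: string }
  | { type: "citations"; citationMap: Record<string, string> }
  | { type: "done" };

// Pure helper: fold the text chunks of an event sequence into one string.
function joinTextEvents(events: ChatEvent[]): string {
  let out = "";
  for (const e of events) if (e.type === "text") out += e.delta;
  return out;
}

// Live usage would iterate the generator directly, e.g.:
// for await (const event of await agent.chat(query, { stream: true })) {
//   if (event.type === "text") process.stdout.write(event.delta);
//   if (event.type === "done") break;
// }
```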

When stream: false, chat returns a Promise<ChatResponse> with:

  • text
  • sources
  • citationMap
  • chatTokenUsage
  • metadata

search() and chat(..., { stream: false }) include metadata.expandedQueries when query expansion is enabled.

Legacy Config Support

Top-level fields like openRouterApiKey, chatModel, maxSources, etc. are still accepted for backward compatibility.

Preferred format is grouped config (llm, search, retrieval, response).
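For example, these two configs are intended to carry the same settings. This mapping is a sketch based on the field descriptions above (openRouterApiKey, chatModel, and maxSources are the legacy names mentioned here; other legacy fields are not listed):

```typescript
// Legacy flat config (still accepted for backward compatibility).
const legacyConfig = {
  openRouterApiKey: "sk-or-v1......",
  chatModel: "stepfun/step-3.5-flash:free",
  maxSources: 5
};

// Preferred grouped config carrying the same settings.
const groupedConfig = {
  llm: {
    apiKey: "sk-or-v1......",
    chatModel: "stepfun/step-3.5-flash:free"
  },
  search: {
    maxSources: 5
  }
};
```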

Test Before Publish

Run full pre-publish checks:

npm run check

This runs:

  • npm run typecheck (tsc --noEmit)
  • npm run test:smoke (build + SDK smoke checks)
  • npm run test:pack (build + npm pack install/import verification in a temp project)

test:pack tries a real temp-project install first. If npm registry access is unavailable, it automatically runs an offline tarball import fallback.

Smoke Test Modes

  • Without OPENROUTER_API_KEY:

    • validates constructor/config compatibility (grouped + legacy)
    • validates API surface (search, chat)
    • skips live network requests
  • With OPENROUTER_API_KEY:

    • runs live search
    • runs chat non-stream
    • runs chat stream and confirms done event

Example:

OPENROUTER_API_KEY=sk-or-v1-xxxx npm run test:smoke

Notes

  • query must be a non-empty string.
  • If no sources are found, chat returns a graceful text response with an empty citation map.
  • OpenRef uses OpenRouter-compatible chat models for reranking and response generation.
