OpenRef

OpenRef is a production-oriented TypeScript SDK for web-grounded answers with optional inline citations.

It keeps the runtime model simple:

Query -> Web Search -> Content Extraction/Chunking -> Streaming Chat Response

Features

  • Web search with provider fallback (Brave -> DuckDuckGo -> Bing)
  • Source deduplication and domain diversity filtering
  • Optional LLM reranking of search candidates
  • Optional LLM query expansion for broader retrieval
  • Query-aware page extraction and chunk scoring
  • Streaming chat response
  • Configurable citation behavior (citationStrictness)
  • Typed SDK surface (search, chat, event streams)

Install

npm install @altamsh04/openref

Quick Start

import { OpenRef } from "@altamsh04/openref";

const agent = new OpenRef({
  llm: {
    apiKey: "sk-or-v1......",
    chatModel: "stepfun/step-3.5-flash:free",
    fallbackChatModels: [
      "nvidia/nemotron-3-nano-30b-a3b:free",
      "mistralai/mistral-small-3.1-24b-instruct:free"
    ],
    systemPrompt: "Answer in short bullet points.",
    citationStrictness: true,
    maxRetries: 2,
    retryDelayMs: 1200,
    maxOutputTokens: 2048,
    maxContinuationRequests: 2
  },
  search: {
    engineProvider: { provider: "brave" }, // fallback order: duckduckgo -> bing
    preferLatest: true,
    timeZone: "America/New_York",
    maxSources: 5,
    queryExpansion: true,
    queryExpansionValue: 3,
    queryExpansionTimeout: 1200,
    searchTimeout: 5000,
    enableReranking: true,
    rerankTimeout: 4000
  },
  retrieval: {
    contentTimeout: 6000,
    maxContextTokens: 6000,
    chunkTargetTokens: 400
  },
  response: {
    stream: true
  }
});

const query = "Today's top news in AI";

async function run() {
  // Per-request overrides
  const response = await agent.chat(query, {
    stream: false,
    systemPrompt: "Keep it under 120 words and mention uncertainty clearly.",
    citationStrictness: false
  });

  console.log(JSON.stringify(response, null, 2));
}

run();

Local Example

Run the local SDK example:

OPENROUTER_API_KEY=sk-or-v1-xxxx npm run example:run -- "Today's top news in AI"

Run dedicated non-stream and stream examples:

OPENROUTER_API_KEY=sk-or-v1-xxxx npm run example:non-stream -- "What is OpenRouter?"
OPENROUTER_API_KEY=sk-or-v1-xxxx npm run example:stream -- "What is OpenRouter?"

Files:

  • example/index.ts: example app using the local source (../src)
  • example/non-stream.ts: non-stream response example
  • example/stream.ts: stream response example
  • tsconfig.example.json: example build config

API

new OpenRef(config)

config.llm

  • apiKey?: string OpenRouter API key. Preferred key location.
  • chatModel?: string Primary chat model.
  • fallbackChatModels?: string[] Models used if primary model fails.
  • systemPrompt?: string Base system instruction for response style/behavior.
  • citationStrictness?: boolean Citation policy in response text.
  • maxRetries?: number Retry attempts per model request.
  • retryDelayMs?: number Backoff delay between retries.
  • maxOutputTokens?: number Max tokens per generation request.
  • maxContinuationRequests?: number Extra continuation requests when output is truncated.

citationStrictness behavior:

  • true (default): the model is instructed to include inline [N] citations for factual claims.
  • false: the model is instructed to avoid [N] citations unless the user explicitly asks for them.
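Because citation behavior is an instruction to the model rather than a hard filter, markers can occasionally slip through. If you need to guarantee citation-free output, a small post-processing pass works; stripCitations below is a hypothetical helper, not part of the SDK:

```typescript
// Hypothetical post-processing helper (not part of OpenRef):
// remove inline [N] citation markers from a response string.
function stripCitations(text: string): string {
  // Drop each marker and the single space that may precede it.
  return text.replace(/\s?\[\d+\]/g, "");
}

console.log(stripCitations("OpenRouter proxies many models [1], including free tiers [2]."));
// → "OpenRouter proxies many models, including free tiers."
```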

config.search

  • preferLatest?: boolean Adds recency bias to search and prompting.
  • timeZone?: string Used for date context formatting.
  • maxSources?: number Final number of sources to keep.
  • queryExpansion?: boolean Expand user query into subqueries using LLM before retrieval.
  • queryExpansionValue?: number Number of expanded subqueries (0-5).
  • queryExpansionTimeout?: number Max expansion wait time in ms.
  • engineProvider?: { provider?: "brave" | "duckduckgo" | "bing" | "searxng" | Array<...>, queryUrl?: string } Choose the preferred engine order. If a single provider is given (e.g. "brave"), OpenRef automatically falls back to the other mainstream engines. For provider: "searxng", you can pass a custom queryUrl (for example http://localhost:8080/search?q={query}).
  • searchTimeout?: number Search request timeout in ms.
  • enableReranking?: boolean Enable LLM reranking for candidates.
  • rerankTimeout?: number Reranking timeout in ms.
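For instance, a config.search fragment selecting an explicit provider order with a self-hosted SearXNG instance could look like the sketch below (field names follow the list above; the values are illustrative):

```typescript
// Illustrative search config fragment; shapes follow the README field list.
const search = {
  engineProvider: {
    provider: ["searxng", "brave", "duckduckgo"],       // explicit preference order
    queryUrl: "http://localhost:8080/search?q={query}"  // custom SearXNG endpoint
  },
  searchTimeout: 5000,
  enableReranking: true
};
```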

config.retrieval

  • contentTimeout?: number Page fetch/extraction timeout in ms.
  • maxContextTokens?: number Token budget for assembled context.
  • chunkTargetTokens?: number Approximate target size of chunks.
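To make chunkTargetTokens concrete, here is a rough illustration of token-budgeted chunking. This is a sketch only: OpenRef's actual chunker and scoring are internal, and the ~4 characters-per-token heuristic is an assumption:

```typescript
// Illustrative chunker: split text into pieces of roughly `targetTokens`
// tokens, using the common ~4 characters-per-token approximation.
function chunkByTokens(text: string, targetTokens: number): string[] {
  const targetChars = targetTokens * 4;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += targetChars) {
    chunks.push(text.slice(i, i + targetChars));
  }
  return chunks;
}
```

With chunkTargetTokens: 400 this yields chunks of about 1600 characters each, which the retrieval step can then score against the query.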

config.response

  • stream?: boolean Default chat mode (true for event stream, false for aggregated response).

search(query: string): Promise<SearchResult>

Runs retrieval and ranking only.

chat(query: string, options?)

Per-request options:

  • stream?: boolean
  • systemPrompt?: string Overrides constructor llm.systemPrompt.
  • citationStrictness?: boolean Overrides constructor llm.citationStrictness.

When stream: true, chat returns an AsyncGenerator<ChatEvent> that yields:

  • expanded_queries (optional, early event when enabled)
  • sources
  • text (multiple chunks)
  • citations
  • done
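A sketch of consuming these events follows. The ChatEvent payload fields below (type, delta, and so on) are assumptions about the event shape, not confirmed by this README; the real type ships with the SDK:

```typescript
// Assumed minimal event shape for illustration.
type ChatEvent =
  | { type: "expanded_queries"; queries: string[] }
  | { type: "sources"; sources: Array<{ url: string }> }
  | { type: "text"; delta: string }
  | { type: "citations"; citationMap: Record<string, string> }
  | { type: "done" };

// Pure helper: fold the text chunks of an event sequence into one string.
function joinTextEvents(events: ChatEvent[]): string {
  let out = "";
  for (const e of events) if (e.type === "text") out += e.delta;
  return out;
}

// Live usage would iterate the generator directly, e.g.:
// for await (const event of await agent.chat(query, { stream: true })) {
//   if (event.type === "text") process.stdout.write(event.delta);
//   if (event.type === "done") break;
// }
```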

When stream: false, chat returns a Promise<ChatResponse> with:

  • text
  • sources
  • citationMap
  • chatTokenUsage
  • metadata

search() and chat(..., { stream: false }) include metadata.expandedQueries when query expansion is enabled.

Legacy Config Support

Top-level fields like openRouterApiKey, chatModel, maxSources, etc. are still accepted for backward compatibility.

Preferred format is grouped config (llm, search, retrieval, response).
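For example, these two configs are intended to carry the same settings. This mapping is a sketch based on the field descriptions above (openRouterApiKey, chatModel, and maxSources are the legacy names mentioned here; other legacy fields are not listed):

```typescript
// Legacy flat config (still accepted for backward compatibility).
const legacyConfig = {
  openRouterApiKey: "sk-or-v1......",
  chatModel: "stepfun/step-3.5-flash:free",
  maxSources: 5
};

// Preferred grouped config carrying the same settings.
const groupedConfig = {
  llm: {
    apiKey: "sk-or-v1......",
    chatModel: "stepfun/step-3.5-flash:free"
  },
  search: {
    maxSources: 5
  }
};
```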

Test Before Publish

Run full pre-publish checks:

npm run check

This runs:

  • npm run typecheck (tsc --noEmit)
  • npm run test:smoke (build + SDK smoke checks)
  • npm run test:pack (build + npm pack install/import verification in a temp project)

test:pack tries a real temp-project install first. If npm registry access is unavailable, it automatically runs an offline tarball import fallback.

Smoke Test Modes

  • Without OPENROUTER_API_KEY:

    • validates constructor/config compatibility (grouped + legacy)
    • validates API surface (search, chat)
    • skips live network requests
  • With OPENROUTER_API_KEY:

    • runs live search
    • runs chat non-stream
    • runs chat stream and confirms done event

Example:

OPENROUTER_API_KEY=sk-or-v1-xxxx npm run test:smoke

Notes

  • query must be a non-empty string.
  • If no sources are found, chat returns a graceful text response with an empty citation map.
  • OpenRef uses OpenRouter-compatible chat models for reranking and response generation.
