feat: v0.5.0 - Local vector storage, OpenRouter support, free-text questions, and more #8
sherlock2010song wants to merge 1 commit into hi-jin:main
Conversation
Pull request overview
This PR introduces version 0.5.0 with significant performance and usability improvements. The main enhancement is local vector storage using IndexedDB to cache embeddings, eliminating the need to re-process PDFs on subsequent queries. Additionally, the plugin now supports OpenRouter API for accessing various LLM models beyond OpenAI, and enables free-text questions without requiring specific highlight references.
- Local caching via IndexedDB significantly improves performance for repeat queries
- OpenRouter API support expands available model options
- Free-text questions enhance usability by allowing direct questions without highlights
- Progress indicators and API key verification improve user experience
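The two-tier lookup called out above can be sketched roughly as follows. This is an illustrative sketch, not the PR's actual code; all identifiers (`memoryCache`, `getVectorStore`, the callback names) are assumed:

```typescript
// Sketch of a two-tier cache: check an in-memory map first, then fall
// back to a persistent store, and only build (re-embed) on a full miss.
// All identifiers here are illustrative, not the PR's actual names.
type VectorStore = { vectors: number[][] };

const memoryCache = new Map<string, VectorStore>();

async function getVectorStore(
  key: string,
  loadFromDB: (key: string) => Promise<VectorStore | null>,
  build: () => Promise<VectorStore>,
  saveToDB: (key: string, store: VectorStore) => Promise<void>,
): Promise<VectorStore> {
  const cached = memoryCache.get(key); // tier 1: in-memory
  if (cached) return cached;

  const persisted = await loadFromDB(key); // tier 2: IndexedDB
  if (persisted) {
    memoryCache.set(key, persisted);
    return persisted;
  }

  const fresh = await build(); // full miss: embed the PDF once
  memoryCache.set(key, fresh);
  await saveToDB(key, fresh);
  return fresh;
}
```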
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/utils/indexedDB.ts | New utility file implementing IndexedDB operations for storing and retrieving vector embeddings, with file hashing for cache keys |
| src/settings.ts | Added API base URL configuration, an API key verification button, and a cache-clearing option; updated UI labels to reflect OpenRouter support |
| src/openai.ts | Implemented two-tier caching (in-memory + IndexedDB), batch processing with progress callbacks, and support for free-text questions via an updated invoke signature |
| src/main.ts | Added logic to handle both highlight-based and free-text questions; integrated progress indicators for PDF processing |
| package.json | Version bumped to 0.5.0 |
| package-lock.json | Version updated to match package.json |
| README.md | Updated documentation with OpenRouter setup instructions, free-text question usage, and the v0.5.0 changelog |
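The "file hashing for cache keys" mentioned for `src/utils/indexedDB.ts` can be illustrated with a minimal sketch. The PR most likely uses a cryptographic hash (e.g. SHA-256 via Web Crypto); this FNV-1a variant only shows the idea of deriving a stable key from file bytes, and the function name is assumed:

```typescript
// Illustrative only: derive a stable cache key from a file's bytes so the
// same PDF maps to the same IndexedDB entry across sessions. A real
// implementation would likely use SHA-256 via Web Crypto instead of FNV-1a.
function fileCacheKey(bytes: Uint8Array): string {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (const b of bytes) {
    hash ^= b;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep 32 bits
  }
  return hash.toString(16).padStart(8, "0");
}
```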
```ts
if (!highlight) {
  await logseq.UI.showMsg(`Please check whether the highlight uuid is on current line.`, "warning");
  return;
// ...
let inputData: any = null;
```
The variable name `inputData` with type `any` is too generic. Consider a more descriptive name like `questionSource` and a union type such as `Highlight | string` instead of `any` for better type safety.
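A sketch of what that union typing could look like; the `Highlight` shape is assumed for illustration and is not taken from the PR:

```typescript
// Sketch of the suggested typing: a union instead of `any`.
// The Highlight shape here is assumed, not the PR's actual interface.
interface Highlight {
  uuid: string;
  text: string;
}

type QuestionSource = Highlight | string; // highlight-based or free-text

function describeSource(source: QuestionSource): string {
  // Narrowing the union: a string is a free-text question,
  // an object is a highlight reference.
  return typeof source === "string"
    ? `free-text question: ${source}`
    : `highlight ${source.uuid}: ${source.text}`;
}
```

With this typing, the compiler forces each branch to handle only the fields that actually exist on that variant.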
```ts
});

// ask the model to describe the image
const apiBaseUrl = readAPIBaseUrl();
```
`apiBaseUrl` is read again here (line 227) even though it was already read at line 149 and is still in scope from the parent function. Consider removing this redundant line and reusing the existing `apiBaseUrl` variable.
Suggested change:

```diff
- const apiBaseUrl = readAPIBaseUrl();
```
```ts
logseq.updateSettings({ checkApiKey: false });
const apiKey = newSettings.openaiApiKey;
if (!apiKey) {
  logseq.UI.showMsg("Please enter an OpenAI API key first.", "error");
```
The error message "Please enter an OpenAI API key first." is misleading since the field now accepts both OpenAI and OpenRouter API keys. Consider changing to "Please enter an API key first." to match the updated field description.
Suggested change:

```diff
- logseq.UI.showMsg("Please enter an OpenAI API key first.", "error");
+ logseq.UI.showMsg("Please enter an API key first.", "error");
```
```ts
const model = new ChatOpenAI({
  openAIApiKey: apiKey,
  modelName: "gpt-3.5-turbo", // Use a cheap model for checking
  maxTokens: 5,
  configuration,
});
await model.invoke("Hello");
```
The API key verification always uses the `gpt-3.5-turbo` model, which will fail for OpenRouter or other non-OpenAI endpoints. OpenRouter requires a provider-prefixed model name (e.g., `openai/gpt-3.5-turbo`). Consider using a model from settings, or detecting the endpoint type from `apiBaseUrl` to choose an appropriate test model.
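One way to implement that suggestion, sketched under assumptions: the OpenRouter provider-prefix convention (`openai/gpt-3.5-turbo`) is real, but the function name and the substring check on the base URL are illustrative, not the PR's code:

```typescript
// Illustrative sketch: choose a verification model based on the endpoint.
// OpenRouter namespaces models by provider ("openai/gpt-3.5-turbo"), while
// the OpenAI API expects the bare model name. The detection heuristic
// (substring match on the host) is an assumption for this sketch.
function pickVerificationModel(apiBaseUrl: string): string {
  return apiBaseUrl.includes("openrouter.ai")
    ? "openai/gpt-3.5-turbo"
    : "gpt-3.5-turbo";
}
```

The returned name would then be passed as `modelName` when constructing the `ChatOpenAI` instance used for the verification ping.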
```ts
// MemoryVectorStore exposes memoryVectors but it's protected/private in TS definition
// We need to access it to serialize.
// In LangChain JS MemoryVectorStore, the data is stored in `memoryVectors`
// Each item has { content: string, embedding: number[], metadata: object }

// However, MemoryVectorStore doesn't have a direct export method that gives us everything easily for reconstruction
// without re-embedding.
// But we can access `memoryVectors` by casting to any.

const memoryVectors = (vectorStore as any).memoryVectors;

// We need to store documents and vectors.
// Actually MemoryVectorStore.fromDocuments creates vectors from documents.
// If we want to avoid re-embedding, we need to save the embeddings.
// MemoryVectorStore.fromVectors(vectors, documents, embeddings) is what we want to use for restoration.
```
These comments describe internal implementation details about accessing private/protected members of MemoryVectorStore. While helpful, they span many lines and could be more concise. Consider consolidating them to: `// Access memoryVectors (protected property) to serialize embeddings and avoid re-embedding on restore`
Suggested change:

```diff
- // MemoryVectorStore exposes memoryVectors but it's protected/private in TS definition
- // We need to access it to serialize.
- // In LangChain JS MemoryVectorStore, the data is stored in `memoryVectors`
- // Each item has { content: string, embedding: number[], metadata: object }
- // However, MemoryVectorStore doesn't have a direct export method that gives us everything easily for reconstruction
- // without re-embedding.
- // But we can access `memoryVectors` by casting to any.
- const memoryVectors = (vectorStore as any).memoryVectors;
- // We need to store documents and vectors.
- // Actually MemoryVectorStore.fromDocuments creates vectors from documents.
- // If we want to avoid re-embedding, we need to save the embeddings.
- // MemoryVectorStore.fromVectors(vectors, documents, embeddings) is what we want to use for restoration.
+ // Access memoryVectors (protected property) to serialize embeddings and avoid re-embedding on restore
+ const memoryVectors = (vectorStore as any).memoryVectors;
```
```md
> [!Note]
> You can move your cursor or continue working in Logseq while the response is being generated.
```
The Note block at lines 67-68 appears to be duplicated from lines 54-55. Consider removing this duplication to keep the documentation clean.
Suggested change:

```diff
- > [!Note]
- > You can move your cursor or continue working in Logseq while the response is being generated.
```
```ts
console.error("IndexedDB error:", event);
reject("IndexedDB error");
```
The error message "IndexedDB error" is not descriptive enough. Consider including the actual error from the event: reject(new Error("IndexedDB error: " + (event.target as any)?.error?.message)) to help with debugging.
Suggested change:

```diff
- console.error("IndexedDB error:", event);
- reject("IndexedDB error");
+ const errorMsg = (event.target as any)?.error?.message || "";
+ console.error("IndexedDB error:", event);
+ reject(new Error("IndexedDB error: " + errorMsg));
```
```ts
const db = await openDB();
return new Promise((resolve, reject) => {
  const transaction = db.transaction([STORE_NAME], "readwrite");
  const store = transaction.objectStore(STORE_NAME);

  // MemoryVectorStore exposes memoryVectors but it's protected/private in TS definition
  // We need to access it to serialize.
  // In LangChain JS MemoryVectorStore, the data is stored in `memoryVectors`
  // Each item has { content: string, embedding: number[], metadata: object }

  // However, MemoryVectorStore doesn't have a direct export method that gives us everything easily for reconstruction
  // without re-embedding.
  // But we can access `memoryVectors` by casting to any.

  const memoryVectors = (vectorStore as any).memoryVectors;

  // We need to store documents and vectors.
  // Actually MemoryVectorStore.fromDocuments creates vectors from documents.
  // If we want to avoid re-embedding, we need to save the embeddings.
  // MemoryVectorStore.fromVectors(vectors, documents, embeddings) is what we want to use for restoration.

  const vectors = memoryVectors.map((v: any) => v.embedding);
  const documents = memoryVectors.map((v: any) => new Document({
    pageContent: v.content,
    metadata: v.metadata,
  }));

  const data: StoredVectorStore = {
    id: key,
    vectors,
    documents,
    timestamp: Date.now(),
  };

  const request = store.put(data);

  request.onsuccess = () => resolve();
  request.onerror = () => reject(request.error);
});
```
The database connection opened in openDB() is not being closed after operations complete. Consider adding db.close() calls after transactions finish in each function (saveVectorStoreToDB, loadVectorStoreFromDB, clearVectorStoreFromDB, clearAllVectorStoresFromDB) to avoid keeping unnecessary connections open.
```ts
const loader = new PDFLoader(pdf, {
  pdfjs: () => pdfjs as any,
});
const docs = await loader.load()
```
Avoid automated semicolon insertion (97% of all statements in the enclosing function have an explicit semicolon).
Suggested change:

```diff
- const docs = await loader.load()
+ const docs = await loader.load();
```
Thank you for your pull request. The features you added are exactly what this repo needs.
Changelog
v0.5.0