YOUR RESEARCH — YOUR DATA
Manage your publications library, ask questions, and get summaries — all on your own computer.
Connect a local LLM (for example llama.cpp) and ResearchBlocks can give you smarter answers, summaries, citations, and more. ResearchBlocks does not require an internet connection to run your library and local features.
Author: Jude Janitha Niroshan (Bukkbeek)
- Free and open source under the MIT License.
- You self-host ResearchBlocks (basic localhost use is enough; setup steps are below).
- An LLM is optional; ResearchBlocks functions fully for library, search, BibTeX, and extractive features without one. Natural-language answers and enrichment need a connected LLM.
- Researchers and students who want a private library of research publications.
- Anyone tired of juggling folders, citation managers, and ad hoc search.
- People who want plain-language questions powered by an LLM (when you add one).
- Anyone who wants a personal research assistant that only sees your data.
You do not need to be a programmer — just follow the setup steps once.
| You want to… | ResearchBlocks helps by… |
|---|---|
| Keep papers in one place | Library with search, tags, and optional PDF storage |
| Import PDFs quickly | Add Data — scan a folder or Browse (Chrome/Edge) to upload many PDFs at once |
| Organize by topic | Hierarchy — categories like Biology, Medicine, Agriculture, plus Discovered topics from your keywords |
| Link formal citations | BibTeX — import .bib, link entries to papers, export |
| Ask “what did my papers say about X?” | Query — finds relevant passages and (with an LLM) writes a short cited answer |
| Skim a long paper | Summary on each paper (extractive; richer summaries with Generate + LLM) |
| Fix messy metadata | Generate on a paper — cleanup, abstract and authors (with verification), keywords, hierarchy, summary |
| Start over | Settings → cleanup database (removes all local data) |
Without an LLM, you still get search, hierarchy, BibTeX, extractive summaries, and passage-based query results.
With an LLM, you get natural-language answers, smarter tagging, and automated enrichment.
- A computer — Windows, macOS, or Linux
- Node.js — v18 or newer (the LTS installer is fine)
- This repository — clone with Git or download the ZIP from GitHub
- (Optional, recommended for full assistant features) Local LLM — an OpenAI-compatible server such as llama.cpp's `llama-server` on port 8080 (see below)
| OS | How to install |
|---|---|
| Windows | Download the LTS .msi from nodejs.org. Restart the terminal, then run node -v (should be ≥ 18). |
| macOS | Use the LTS pkg from nodejs.org, or Homebrew: brew install node, or a version manager like nvm. |
| Linux (Debian/Ubuntu) | Example with NodeSource (Node 20): curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash - then sudo apt install -y nodejs. Or use your distro’s packages if they meet the version requirement. |
Verify:
```bash
node -v
npm -v
```

Git:

```bash
git clone https://github.com/bukkbeek/ResearchBlocks.git
cd ResearchBlocks
```

(Use your fork URL if you forked the repo.)
Without Git: On GitHub, use Code → Download ZIP, extract it, and open a terminal in the extracted folder.
```bash
npm install
```

This installs the libraries the app needs (web server, PDF text extraction, BibTeX parsing, etc.). Run this once per copy of the project.
```bash
npm start
```

You should see that the server is running on port 3000.
Go to http://localhost:3000.
Important: Always use the address above. Do not open public/index.html with Live Server or by double-clicking the file — the app needs the Node server.
Everything lives under data/db/ in the project folder (papers as JSON, optional PDF copies, BibTeX imports). That folder is created automatically. Back it up if you care about your library.
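A quick way to back it up is a date-stamped archive of that folder (a sketch; the archive name is an example, and the `mkdir -p` only makes the command safe to run before any data exists):

```shell
# Run from the ResearchBlocks project root.
mkdir -p data/db    # no-op if the folder already exists
tar -czf "researchblocks-backup-$(date +%Y%m%d).tar.gz" data/db
```

Restore by extracting the archive back into the project folder.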
ResearchBlocks expects an OpenAI-compatible HTTP API. Many people use llama.cpp's `llama-server`.
The steps below are a minimum-viable setup (small model, moderate context, CPU-friendly). If your machine has more RAM, VRAM, or GPU, you can scale the model and context—see Upgrading the LLM and context window.
Open llama.cpp Releases and download the archive that matches your OS and CPU, for example:
- Windows — `llama-*-bin-win-*.zip`
- macOS — `llama-*-bin-macos-*.zip` (choose arm64 or x64 as appropriate)
- Linux x64 — e.g. `llama-*-bin-ubuntu-x64.tar.gz`
Extract it somewhere convenient (e.g. ~/llm/llama-… on Mac/Linux, C:\llm\llama-… on Windows). The llama-server binary is usually in the root of the extracted folder (not always under bin/).
Example GGUF (Q4_K_M, ~2.1 GB) — lightweight default for laptops and low RAM; easy to run, still useful for summaries and queries:
```bash
# macOS / Linux (curl)
curl -L -o qwen2.5-3b-q4.gguf "https://huggingface.co/Qwen/Qwen2.5-3B-Instruct-GGUF/resolve/main/qwen2.5-3b-instruct-q4_k_m.gguf"
```

On Windows, you can paste the same URL into a browser or use PowerShell `Invoke-WebRequest` to save the file in the same folder as `llama-server.exe`.
macOS / Linux (from the folder that contains llama-server and the .gguf):
```bash
./llama-server -m qwen2.5-3b-q4.gguf -c 4096 -t 4 --host 127.0.0.1 --port 8080
```

Windows (PowerShell or Command Prompt, same folder as `llama-server.exe`):

```powershell
.\llama-server.exe -m qwen2.5-3b-q4.gguf -c 4096 -t 4 --host 127.0.0.1 --port 8080
```

- `-c 4096` — context length (ResearchBlocks benefits from a larger window when your hardware allows).
- `-t N` — CPU threads; set `N` to roughly your physical cores.
- `--host 127.0.0.1` — only this PC can reach the LLM (safest default). Use `0.0.0.0` only if you intentionally want other devices on your LAN to call the LLM (see LAN access).
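A quick way to pick the `-t` value (a sketch; assumes 2 logical CPUs per physical core, which holds on most hyperthreaded machines):

```shell
# Suggest a llama-server -t value from the CPU count (Linux; on macOS,
# `sysctl -n hw.physicalcpu` reports physical cores directly).
threads=$(( $(nproc) / 2 ))
if [ "$threads" -lt 1 ]; then threads=1; fi
echo "try: -t $threads"
```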
Wait until the log shows the server is listening. Quick check:
```bash
curl http://127.0.0.1:8080/health
```

(On Windows without curl, open http://127.0.0.1:8080/health in a browser.)
By default, ResearchBlocks uses http://127.0.0.1:8080. In a second terminal, from the project folder:
```bash
npm start
```

If the LLM runs on another host or port:
macOS / Linux (bash):
```bash
export LLM_URL=http://192.168.0.10:8080
npm start
```

Windows PowerShell:

```powershell
$env:LLM_URL = "http://192.168.0.10:8080"
npm start
```

Windows CMD:

```cmd
set LLM_URL=http://192.168.0.10:8080
npm start
```

When the connection works, an LLM badge appears in the sidebar.
On modest hardware, answers use a compact context. If your model has a larger context window, you can allow more characters:
bash / macOS / Linux:
```bash
LLM_QUERY_CONTEXT_CHARS=3500 npm start
```

PowerShell:

```powershell
$env:LLM_QUERY_CONTEXT_CHARS = "3500"
npm start
```

When the LLM is connected:
- Query → Ask a Question — natural-language answers with citations from your papers.
- Import PDFs — options such as extracting abstracts with the LLM, sorting keywords, and auto-matching BibTeX when matches are ambiguous.
- Paper detail → Keywords — “Sort keywords with LLM” for existing papers.
Everything above is tuned for minimal hardware: a ~3B quant, 4096 tokens of context in llama-server, and ResearchBlocks sending a relatively small passage budget to the model by default (`LLM_QUERY_CONTEXT_CHARS` defaults to 1400 characters when unset—roughly aligned with ~2K-token-class usage; the README example uses 3500 for ~4K-class servers).
If you have headroom, you can push quality and depth in three places: the model, the server context (-c), and how much text ResearchBlocks feeds into answers (LLM_QUERY_CONTEXT_CHARS).
- Browse Hugging Face GGUF collections (or a specific family such as Qwen, Llama, Mistral, Gemma) and pick a larger instruct model: 7B, 14B, 32B, and larger models generally deliver better reasoning and writing; 70B+ and MoE models are viable if you have enough unified RAM or VRAM (and patience on CPU).
- Use the same `llama-server -m your-model.gguf` flow; only the file path and hardware requirements change.
- Prefer GPU builds of llama.cpp when you can (CUDA on NVIDIA, Metal on Apple Silicon)—see the same releases page for variants that match your OS and accelerator.
- Quantization: Q4_K_M stays efficient; Q5_K_M / Q8_0 (or similar) can improve quality if you have spare memory.
- Raise `-c` to what the model supports (check the model card): 8192, 16384, 32768, or more on long-context checkpoints.
- More context = more RAM/VRAM and often slower generation; if the process is killed or swaps badly, use a smaller `-c`, a smaller model, or a smaller quant.
- Keep `-c` in sync with reality: there is no benefit to setting it far above the model’s effective training context.
The app limits how many characters of retrieved passages go into the LLM for Query / synthesis (see lib/llm.js — LLM_QUERY_CONTEXT_CHARS). Rough guidance:
| Server context (`-c`, tokens) | Example `LLM_QUERY_CONTEXT_CHARS` |
|---|---|
| ~2048 | 1400 (default) or leave unset |
| ~4096 | 3000–4000 |
| ~8192+ | 6000–12000+ (increase until answers improve or you hit latency / OOM) |
Set it when starting the app (same patterns as above):
```bash
LLM_QUERY_CONTEXT_CHARS=8000 npm start
```

There is no single “correct” number—tokens ≠ characters, so treat this as “tune until quality and speed feel right.”
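One way to sanity-check a value (an assumption: English text averages roughly 4 characters per token, so dividing the character budget by 4 approximates how many tokens it occupies):

```shell
# Estimate the token cost of a LLM_QUERY_CONTEXT_CHARS budget (~4 chars/token).
chars=3500
echo "~$(( chars / 4 )) tokens of the server's -c window"
```

For `chars=3500` this estimates about 875 tokens, comfortably inside a 4096-token window once the prompt and the answer itself are added.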
- Run the largest GGUF you can load (multi-GPU, high-RAM workstations, or a dedicated inference machine on your LAN with `LLM_URL` pointing to it).
- Point ResearchBlocks at any OpenAI-compatible endpoint (local or self-hosted stack)—not only llama.cpp—as long as it speaks the same HTTP API your setup expects.
To use ResearchBlocks from a phone or another PC on the same Wi‑Fi, you need the host computer’s LAN IP (for example 192.168.1.42).
- Windows: `ipconfig` → IPv4 Address
- macOS / Linux: `ip addr`, `ifconfig`, or System Settings → Network
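On Linux, a one-liner can print the first address directly (a sketch; assumes `hostname -I` is available, as it is on most distros):

```shell
# Print this machine's first non-loopback IPv4 address (Linux).
hostname -I | awk '{print $1}'
```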
Open http://<that-ip>:3000 on the other device.
If it does not load:
- Windows: Allow Node.js or port 3000 in Windows Defender Firewall when prompted, or add an inbound rule for TCP 3000.
- Linux (ufw): `sudo ufw allow 3000/tcp && sudo ufw reload`
- macOS: System Settings → Network → Firewall → Options and allow incoming for Node/your terminal app if needed.
Security: Exposing the app or the LLM on 0.0.0.0 increases risk. Prefer VPN or SSH tunneling for remote access; if you expose services to the internet, use HTTPS, a firewall, and proper access control.
On systemd-based distros you can run the LLM and ResearchBlocks as services.
LLM unit (/etc/systemd/system/researchblocks-llm.service) — adjust User, paths, and the llama-server line for your install:
```ini
[Unit]
Description=ResearchBlocks LLM (llama-server)
After=network.target

[Service]
Type=simple
User=YOUR_USER
WorkingDirectory=/home/YOUR_USER/llm/YOUR_LLAMA_FOLDER
ExecStart=/home/YOUR_USER/llm/YOUR_LLAMA_FOLDER/llama-server -m /home/YOUR_USER/llm/YOUR_LLAMA_FOLDER/qwen2.5-3b-q4.gguf -c 4096 -t 2 --host 127.0.0.1 --port 8080
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

ResearchBlocks unit (`/etc/systemd/system/researchblocks.service`):
```ini
[Unit]
Description=ResearchBlocks
After=network.target researchblocks-llm.service

[Service]
Type=simple
User=YOUR_USER
WorkingDirectory=/home/YOUR_USER/ResearchBlocks
ExecStart=/usr/bin/node /home/YOUR_USER/ResearchBlocks/server.js
Restart=on-failure
RestartSec=5
Environment=NODE_ENV=production
Environment=LLM_URL=http://127.0.0.1:8080

[Install]
WantedBy=multi-user.target
```

Then:
```bash
sudo systemctl daemon-reload
sudo systemctl enable researchblocks-llm researchblocks
sudo systemctl start researchblocks-llm researchblocks
```

If you have about 4 GB RAM and run a 3B model, extra swap can reduce out-of-memory kills (at the cost of speed):
```bash
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile && sudo mkswap /swapfile && sudo swapon /swapfile
```
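After `swapon`, you can confirm the swap is active before making it permanent (read-only checks; `swapon --show` falls back to `/proc/swaps` here in case the binary is not on your PATH):

```shell
# List active swap and show the memory/swap summary.
swapon --show || cat /proc/swaps
free -h
```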
```bash
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```

| Issue | What to try |
|---|---|
| `llama-server` not found | Run the command from the extracted release folder where the binary lives. |
| Port already in use | Change the port for llama-server or stop the other app. Check: Linux/macOS `lsof -i :3000` / `lsof -i :8080`; Windows `netstat -ano \| findstr :3000`. |
| LLM slow or OOM | Lower -c (e.g. 2048), reduce -t, use a smaller quant, or add swap (Linux). |
| No LLM badge | Confirm llama-server is running and LLM_URL matches; test /health on the LLM port. |
| Reset everything | Settings → Cleanup database (removes local library data). |
- Import: Add Data → folder or Browse → choose options (abstracts, keywords, BibTeX match) → Import.
- Questions: Query → type a full question → optional topic filter → read the answer and open sources.
- One paper needs work? Open it → Generate (next to Edit) to re-run enrichment with the LLM.
- BibTeX: Import under BibTeX, then link from each paper or use auto-match on import.
| Topic | Details |
|---|---|
| Stack | Node.js, Express, static frontend, file-based data/db/ |
| LLM module | lib/llm.js — chat, abstract, authors, verify, keywords, hierarchy hints, answer synthesis |
| Design notes | docs/LLM-INTEGRATION-REPORT.md |
```
ResearchBlocks/
├── server.js
├── lib/llm.js
├── public/     # UI
├── data/db/    # your database (default)
└── docs/
```
Licensed under the MIT License. Contributions, issues, and forks are welcome.
| Project | Bukkbeek |
| Development | Built with Cursor |
| PDF parsing | pdf-parse |
| BibTeX | bibtex-parse |
| Optional LLM | llama.cpp, GGUF models |
Your research. Your data. Your personal research assistant — on your terms.