Skip to content

rebase: a basic version 0.2#60

Merged
Sskift merged 12 commits intodevfrom
release0.2
Dec 23, 2025
Merged

rebase: a basic version 0.2#60
Sskift merged 12 commits intodevfrom
release0.2

Conversation

@Sskift
Copy link
Copy Markdown
Collaborator

@Sskift Sskift commented Dec 23, 2025

No description provided.

Copilot AI review requested due to automatic review settings December 23, 2025 09:55
@Sskift Sskift merged commit 084cd9a into dev Dec 23, 2025
3 of 4 checks passed
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements version 0.2 with significant infrastructure and feature enhancements, focusing on deployment automation, file upload conflict resolution, and improved dark mode support.

Key Changes

  • Deployment infrastructure: Replaces development scripts with production-ready deploy.sh featuring automated dependency installation, firewall configuration, and public network support
  • File conflict handling: Implements comprehensive upload conflict detection and resolution (overwrite/rename) across frontend and backend with user-friendly dialog interfaces
  • Database improvements: Adds paper-conversation binding and per-user filename uniqueness constraints to prevent duplicates

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 26 comments.

Show a summary per file
File Description
scripts/deploy.sh New production deployment script with UFW firewall setup, parallel dependency installation, and public IP configuration
scripts/docker-infra.sh Removed infrastructure-only startup script (consolidated into deploy.sh)
scripts/dev1.sh Removed development environment script (consolidated into deploy.sh)
redis.conf Changed bind address from 127.0.0.1 to 0.0.0.0 for broader network access
docker-compose.yml Updated Postgres password from default to project-specific value
backend/config/magic-pdf.config.json Switched device-mode from CPU to CUDA for GPU acceleration
backend/migrations/versions/20251223_upload_conflict_and_conversation_binding.py Adds paper_id to conversations table and unique constraint on user_id+original_filename for uploads
backend/app/models/uploaded_paper.py Adds conversations relationship to support paper-conversation binding
backend/app/models/conversation.py Adds paper_id foreign key and relationship for document binding
backend/app/utils/file_naming.py Implements build_storage_name_with_email for cross-user unique storage names
backend/app/services/mineru_service.py Changes file writing from md_writer to direct Path.write_text with error handling
backend/app/workers/mineru_runner.py Uses ensure_ascii=True to prevent UnicodeEncodeError with surrogate characters
backend/app/schemas/conversation.py Adds optional paper_id field to conversation schema
backend/app/db/repository.py Adds get_by_original_name and purge_cached_artifacts methods for conflict detection and cleanup
backend/app/db/note_repository.py Adds detach_uploaded_paper for unlinking notes when papers are deleted
backend/app/db/conversation_repository.py Adds paper-aware conversation retrieval and bulk deletion for paper cleanup
backend/app/api/v1/endpoints/users.py Changes account deletion from soft-delete to hard-delete with full data cleanup
backend/app/api/v1/endpoints/papers.py Implements upload conflict detection, resolution logic (overwrite/rename), and conversation cleanup
backend/app/api/v1/endpoints/library.py Updates save_pdf_bytes to use email-based storage naming
frontend/src/lib/api-client.ts Adds UploadConflictError class and conflict handling in uploadPaper function
frontend/src/components/note-editor/NoteEditorPanel.tsx Adds resetEditor method, improves dark mode support with data-color-mode attributes
frontend/src/components/layout/dashboard-shell.tsx Sets data-color-mode attributes for GitHub/Primer-based component theming
frontend/src/components/ai-agent/AgentChatPanel.tsx Improves conversation history loading with better error handling and cache cleanup
frontend/src/app/upload/page.tsx Adds conflict resolution dialog with rename/overwrite options
frontend/src/app/smart-reading/page.tsx Implements conflict handling UI and conversation cache cleanup on overwrite
frontend/src/app/profile/page.tsx Updates back button href from "/" to "/home" and adds data-color-mode support
frontend/src/app/notes/page.tsx Fixes SSR hydration mismatch by deferring token read to client-side
frontend/src/app/notes/new/page.tsx Adds hydration guard to prevent SSR/client state mismatches
frontend/src/app/notes/[id]/page.tsx Enhances dark mode styling and adds token check in bootstrap
frontend/src/app/library/page.tsx Changes anchor tag to Link component for client-side navigation
frontend/src/app/academic/page.tsx Uses fallback chain for userName display
Comments suppressed due to low confidence (1)

redis.conf:9

  • Changing Redis to bind 0.0.0.0 while protected-mode no is set, combined with the Docker mapping 6379:6379 in docker-compose.yml, exposes an unauthenticated Redis instance on all network interfaces. An external attacker who can reach this host can connect to Redis, read/modify all cached data, and potentially achieve remote code execution via configuration changes or module loading. Restrict Redis to listen only on localhost or an internal network interface, re-enable protected-mode, and avoid publishing port 6379 directly to untrusted networks.
bind 0.0.0.0


# Disable protected mode because access is restricted by Docker network
protected-mode no

Comment on lines +58 to +60
# 尝试自动获取公网 IP,如果失败则使用你提供的默认 IP
info "正在获取公网 IP 地址..."
PUBLIC_IP=$(curl -s --max-time 3 ifconfig.me || echo "36.103.203.223")
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The public IP address is hardcoded as a fallback value, which appears to be a specific personal or organizational IP. This should not be committed to version control as it exposes infrastructure details. Consider using a placeholder like "YOUR_PUBLIC_IP" or removing the fallback entirely to force explicit configuration.

Suggested change
# 尝试自动获取公网 IP,如果失败则使用你提供的默认 IP
info "正在获取公网 IP 地址..."
PUBLIC_IP=$(curl -s --max-time 3 ifconfig.me || echo "36.103.203.223")
# 尝试自动获取公网 IP;如果失败,则要求通过环境变量 PUBLIC_IP 显式提供
info "正在获取公网 IP 地址..."
if [ -z "${PUBLIC_IP-}" ]; then
PUBLIC_IP="$(curl -s --max-time 3 ifconfig.me || true)"
fi
if [ -z "$PUBLIC_IP" ]; then
error "无法自动获取公网 IP,请通过环境变量 PUBLIC_IP 指定,例如:PUBLIC_IP=\"YOUR_PUBLIC_IP\" ./scripts/deploy.sh"
exit 1
fi

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +43
"uq_uploaded_papers_user_original",
"uploaded_papers",
["user_id", "original_filename"],
)


def downgrade() -> None:
op.drop_constraint("uq_uploaded_papers_user_original", "uploaded_papers", type_="unique")
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unique constraint name "uq_uploaded_papers_user_original" could be more descriptive. Consider using a name that clearly indicates what is being constrained, such as "uq_uploaded_papers_user_id_original_filename" to make it easier to understand the constraint purpose when debugging database issues.

Suggested change
"uq_uploaded_papers_user_original",
"uploaded_papers",
["user_id", "original_filename"],
)
def downgrade() -> None:
op.drop_constraint("uq_uploaded_papers_user_original", "uploaded_papers", type_="unique")
"uq_uploaded_papers_user_id_original_filename",
"uploaded_papers",
["user_id", "original_filename"],
)
def downgrade() -> None:
op.drop_constraint(
"uq_uploaded_papers_user_id_original_filename",
"uploaded_papers",
type_="unique",
)

Copilot uses AI. Check for mistakes.
Comment on lines +105 to +123
# md_writer.write_string(f"{stem}.md", md_content)
(target_dir / f"{stem}.md").write_text(md_content, encoding="utf-8", errors="replace")

# 2. Content List (JSON)
content_list = union_make(pdf_info, MakeMode.CONTENT_LIST, "images")
md_writer.write_string(f"{stem}_content_list.json", json.dumps(content_list, ensure_ascii=False, indent=4))
# md_writer.write_string(f"{stem}_content_list.json", json.dumps(content_list, ensure_ascii=False, indent=4))
(target_dir / f"{stem}_content_list.json").write_text(
json.dumps(content_list, ensure_ascii=False, indent=4),
encoding="utf-8",
errors="replace"
)

# 3. Middle JSON
md_writer.write_string(f"{stem}_middle.json", json.dumps(middle_json, ensure_ascii=False, indent=4))
# md_writer.write_string(f"{stem}_middle.json", json.dumps(middle_json, ensure_ascii=False, indent=4))
(target_dir / f"{stem}_middle.json").write_text(
json.dumps(middle_json, ensure_ascii=False, indent=4),
encoding="utf-8",
errors="replace"
)
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented-out code using md_writer should be removed rather than left as comments. If this is a temporary change during debugging or testing, add a TODO comment explaining when and why it should be reverted. Leaving commented code reduces maintainability and creates confusion about the intended behavior.

Copilot uses AI. Check for mistakes.
"models-dir": "/tmp/magic-pdf/models",
"layoutreader-model-dir": "/tmp/magic-pdf/models/Layout",
"device-mode": "cpu",
"device-mode": "cuda",
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing device-mode from "cpu" to "cuda" assumes CUDA is available on all deployment environments. This will cause runtime errors on systems without GPU support. Consider making this configurable via environment variable or add a fallback mechanism to detect CUDA availability at runtime.

Suggested change
"device-mode": "cuda",
"device-mode": "cpu",

Copilot uses AI. Check for mistakes.
<div className="mx-auto w-full px-4 pt-6 md:px-6 md:pt-8">
<Link
href="/"
href="/home"
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link now navigates to "/home" instead of "/". Ensure that the "/home" route exists and is properly configured in your routing setup. If "/" was the intended home route, this change might cause a 404 error. Verify this is intentional and that the route is defined.

Suggested change
href="/home"
href="/"

Copilot uses AI. Check for mistakes.
Comment on lines +149 to +150
const message = error instanceof Error ? error.message : "";
if (message.includes("不存在") || message.toLowerCase().includes("not found")) {
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message check uses message.includes("不存在") || message.toLowerCase().includes("not found"), but this is fragile as it depends on specific error message text. If the backend error messages change, this logic will break. Consider using HTTP status codes or custom error codes instead of parsing error message strings for business logic decisions.

Suggested change
const message = error instanceof Error ? error.message : "";
if (message.includes("不存在") || message.toLowerCase().includes("not found")) {
const anyError = error as any;
const status = anyError?.status ?? anyError?.response?.status;
if (status === 404) {

Copilot uses AI. Check for mistakes.
existing = await repo.get_by_original_name(current_user.id, target_display_name)

# 解析冲突处理策略
resolution = (conflict_resolution or "").strip().lower() or None
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable resolution is normalized to lowercase and then compared against None and string literals. However, if conflict_resolution is an empty string after stripping, the expression resolves to None due to "or None". This logic could be clearer. Consider: resolution = conflict_resolution.strip().lower() if conflict_resolution and conflict_resolution.strip() else None

Suggested change
resolution = (conflict_resolution or "").strip().lower() or None
resolution = (
conflict_resolution.strip().lower()
if conflict_resolution and conflict_resolution.strip()
else None
)

Copilot uses AI. Check for mistakes.
Comment on lines +26 to +28
# 2. 核心安全防护:允许 SSH (端口22) 从任何地方访问,或者从你当前的 IP 访问
# 建议先开放全网 SSH 端口,防止失联。如果追求极高安全,可以只写你现在的 IP。
sudo ufw allow 22/tcp comment 'Allow SSH'
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The firewall configuration allows SSH from anywhere (port 22), which could be a security risk. While the comment acknowledges this, it's recommended to restrict SSH access to known IP ranges or use additional security measures like fail2ban, key-only authentication, or a VPN. Consider adding a warning or requiring explicit confirmation before running this setup.

Suggested change
# 2. 核心安全防护:允许 SSH (端口22) 从任何地方访问,或者从你当前的 IP 访问
# 建议先开放全网 SSH 端口,防止失联。如果追求极高安全,可以只写你现在的 IP。
sudo ufw allow 22/tcp comment 'Allow SSH'
# 2. 核心安全防护:SSH(22) 访问控制
warn "即将配置 SSH(22) 访问规则:从任意 IP 开放 SSH 存在安全风险(暴力破解、扫描等)。"
read -r -p "是否允许从任意 IP 访问 SSH(22)?输入 'yes' 确认,默认仅允许当前公网 IP: " ssh_anywhere
if [[ "${ssh_anywhere:-}" == "yes" ]]; then
info "允许从任意 IP 访问 SSH(22)。请确保已配置强密码/密钥登录等安全措施。"
sudo ufw allow 22/tcp comment 'Allow SSH from anywhere'
else
CURRENT_MY_IP=$(curl -s https://ifconfig.me || true)
if [[ -n "${CURRENT_MY_IP:-}" ]]; then
info "仅允许当前公网 IP ${CURRENT_MY_IP} 访问 SSH(22)。"
sudo ufw allow from "$CURRENT_MY_IP" to any port 22 proto tcp comment 'Allow SSH from current IP'
else
warn "无法自动获取当前公网 IP,请手动配置 SSH 访问规则(例如:sudo ufw allow from <YOUR_IP>/32 to any port 22 proto tcp)。"
fi
fi

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +48
if paper_id is not None:
stmt = stmt.where(Conversation.paper_id == paper_id)
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition paper_id is not None uses explicit None comparison, but on line 48 the comparison uses == paper_id. For consistency with Python best practices and the existing code style, use explicit None comparison throughout: if paper_id is not None:

Copilot uses AI. Check for mistakes.
Comment on lines +155 to +166
NEXT_PUBLIC_BACKEND_URL="http://$PUBLIC_IP:8000" npm run dev -- -H 0.0.0.0
) &
FRONTEND_PID=$!

# --- 4) 最终信息输出 ---
# 清屏并显示最终访问地址
sleep 2
success "================================================"
success " InsightReading 服务已全部在公网启动!"
success "================================================"
info "▶ 前端访问地址: http://$PUBLIC_IP:3000"
info "▶ 后端接口地址: http://$PUBLIC_IP:8000/docs"
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deployment script exposes the backend API over plain HTTP (http://$PUBLIC_IP:8000) and instructs users to access it directly, even though the backend implements authentication and password handling. This means login credentials, session tokens, and other sensitive API traffic can be intercepted or modified by any attacker on the network path between clients and the server. Terminate TLS in front of the backend (e.g., via a reverse proxy or load balancer) and ensure both the public URLs and NEXT_PUBLIC_BACKEND_URL use HTTPS for all external access.

Copilot uses AI. Check for mistakes.
@Dynamite2003 Dynamite2003 deleted the release0.2 branch January 11, 2026 05:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants