Description
Operating System
- macOS
- Linux
- Windows
Deployment Method
- Docker
- non-Docker
CUDA Usage
- Yes
- No
Training Process Details (if applicable)
No response
Second-Me version
latest master
Describe the bug
This is as far as I've been able to bring my video drivers and CUDA:
```
+-----------------------------------------+------------------------+----------------------+
| NVIDIA-SMI 555.42.02         Driver Version: 555.42.02         CUDA Version: 12.5      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:24:00.0  On |                  N/A |
| 31%   60C    P0            106W / 370W  |   1977MiB / 24576MiB   |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
```
I've had to fall back to Ubuntu 22.04 for compatibility. Despite answering yes to the CUDA question, I end up with the following output at the end of the build (there are no errors in the build itself, as I've made a number of changes to the Dockerfile as workarounds):
```
[+] Building 2/2
✔ backend Built 0.0s
✔ frontend Built 0.0s
[+] Running 3/3
✔ Network second-me_second-me-network Created 0.1s
✔ Container second-me-backend Started 0.5s
✔ Container second-me-frontend Started 0.6s
Container startup complete
Check CUDA support with: make docker-check-cuda
/r/P/Second-Me on 🌱 master [!?] via 🐍 v3.12.3 took 16m0s
→ make docker-check-cuda
Checking CUDA support in Docker containers...
Running CUDA support check in backend container:
=== GPU Support Check ===
llama-server binary found, checking for CUDA linkage...
❌ llama-server is not linked with CUDA libraries
Container was built without CUDA support
🔍 NVIDIA GPU is available at runtime, but llama-server doesn't support CUDA
To enable GPU support, rebuild using: make docker-up (and select CUDA support when prompted)
No GPU support detected in backend container
```
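For what it's worth, the linkage test that `make docker-check-cuda` performs can be reproduced by hand inside the backend container with `ldd`. A minimal sketch — the helper name is my own, assuming the repo's `check_gpu_support.sh` essentially greps `ldd` output for CUDA libraries:

```shell
# check_cuda_linkage is a hypothetical helper, not a function from the
# repo; it mirrors what check_gpu_support.sh most likely does.
check_cuda_linkage() {
    # A CUDA-enabled llama-server links against libcuda/libcublas;
    # a CPU-only build does not.
    if ldd "$1" 2>/dev/null | grep -qE 'libcuda|libcublas|libggml-cuda'; then
        echo "CUDA-linked"
    else
        echo "not CUDA-linked"
        return 1
    fi
}

# Inside the container you would point this at the real binary:
#   check_cuda_linkage /app/llama.cpp/build/bin/llama-server
# Demonstrated here on /bin/ls, which is of course not CUDA-linked:
check_cuda_linkage /bin/ls || true
```

If this prints "not CUDA-linked" against the container's `llama-server`, the binary was compiled without `-DGGML_CUDA=ON`, regardless of what the GPU looks like at runtime.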
Current Behavior
The container starts without CUDA support
Expected Behavior
The container starts properly with CUDA support
Reproduction Steps
Use the following Dockerfile.backend.cuda:
```dockerfile
FROM nvidia/cuda:12.4.1-base-ubuntu22.04
# Set working directory
WORKDIR /app
# Add build argument to conditionally skip llama.cpp build
ARG SKIP_LLAMA_BUILD=false
# Install system dependencies with noninteractive mode to avoid prompts
#RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
#    build-essential cmake git curl wget lsof vim unzip sqlite3 \
#    python3-pip python3-venv python3-full python3-poetry pipx \
#    && apt-get clean \
#    && rm -rf /var/lib/apt/lists/* \
#    && ln -sf /usr/bin/python3 /usr/bin/python
ENV DEBIAN_FRONTEND=noninteractive
# Install Python 3.12.2 from source
RUN apt-get update && \
    apt-get install -y \
    build-essential libssl-dev zlib1g-dev libbz2-dev \
    libreadline-dev libsqlite3-dev wget curl llvm \
    libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev \
    libffi-dev liblzma-dev git unzip cmake sqlite3 \
    cuda-toolkit-12-4 \
    && cd /usr/src && \
    wget https://www.python.org/ftp/python/3.12.2/Python-3.12.2.tgz && \
    tar xzf Python-3.12.2.tgz && cd Python-3.12.2 && \
    ./configure --enable-optimizations && \
    make -j"$(nproc)" && make altinstall && \
    ln -sf /usr/local/bin/python3.12 /usr/bin/python && \
    curl -sS https://bootstrap.pypa.io/get-pip.py | python && \
    rm -rf /var/lib/apt/lists/*
# Create a virtual environment to avoid PEP 668 restrictions
RUN python -m venv /app/venv
ENV PATH="/app/venv/bin:$PATH"
ENV VIRTUAL_ENV="/app/venv"
# Use the virtual environment's pip to install packages
RUN pip install --upgrade pip \
    && pip install poetry \
    && poetry config virtualenvs.create false
# Create directories
RUN mkdir -p /app/dependencies /app/data/sqlite /app/data/chroma_db /app/logs /app/run /app/resources
# Copy dependency files - files that rarely change
COPY dependencies/graphrag-1.2.1.dev27.tar.gz /app/dependencies/
COPY dependencies/llama.cpp.zip /app/dependencies/
# Copy GPU checker scripts
COPY docker/app/check_gpu_support.sh /app/
COPY docker/app/check_torch_cuda.py /app/
RUN chmod +x /app/check_gpu_support.sh
# Unpack llama.cpp and build with CUDA support (conditionally, based on SKIP_LLAMA_BUILD)
RUN if [ "$SKIP_LLAMA_BUILD" = "false" ]; then \
        echo "=====================================================================" && \
        echo "STARTING LLAMA.CPP BUILD WITH CUDA SUPPORT - THIS WILL TAKE SOME TIME" && \
        echo "=====================================================================" && \
        LLAMA_LOCAL_ZIP="dependencies/llama.cpp.zip" && \
        echo "Using local llama.cpp archive..." && \
        unzip -q "$LLAMA_LOCAL_ZIP" && \
        cd llama.cpp && \
        mkdir -p build && \
        cd build && \
        echo "Starting CMake configuration with CUDA support..." && \
        cmake -DGGML_CUDA=ON \
              -DCMAKE_BUILD_TYPE=Release \
              -DBUILD_SHARED_LIBS=OFF \
              -DLLAMA_NATIVE=ON \
              .. && \
        echo "Starting build process (this will take several minutes)..." && \
        cmake --build . --config Release -j$(nproc) --verbose && \
        echo "Build completed successfully" && \
        chmod +x /app/llama.cpp/build/bin/llama-server /app/llama.cpp/build/bin/llama-cli && \
        echo "====================================================================" && \
        echo "CUDA BUILD COMPLETED SUCCESSFULLY! GPU ACCELERATION IS NOW AVAILABLE" && \
        echo "===================================================================="; \
    else \
        echo "=====================================================================" && \
        echo "SKIPPING LLAMA.CPP BUILD (SKIP_LLAMA_BUILD=$SKIP_LLAMA_BUILD)" && \
        echo "Using existing llama.cpp build from Docker volume" && \
        echo "=====================================================================" && \
        LLAMA_LOCAL_ZIP="dependencies/llama.cpp.zip" && \
        echo "Just unpacking llama.cpp archive (no build)..." && \
        unzip -q "$LLAMA_LOCAL_ZIP" && \
        cd llama.cpp && \
        mkdir -p build; \
    fi
# Mark as GPU-optimized build for runtime reference
RUN mkdir -p /app/data && \
    echo "{ \"gpu_optimized\": true, \"optimized_on\": \"$(date -u +"%Y-%m-%dT%H:%M:%SZ")\" }" > /app/data/gpu_optimized.json && \
    echo "Created GPU-optimized marker file"
# Copy project configuration - files that occasionally change
COPY pyproject.toml README.md /app/
# Fix for potential package installation issues with Poetry
RUN pip install --upgrade setuptools wheel
RUN poetry install --no-interaction --no-root || poetry install --no-interaction --no-root --without dev
RUN pip install --force-reinstall dependencies/graphrag-1.2.1.dev27.tar.gz
# Copy source code - files that frequently change
COPY docker/ /app/docker/
COPY lpm_kernel/ /app/lpm_kernel/
# Check module import
RUN python -c "import lpm_kernel; print('Module import check passed')"
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONPATH=/app \
    BASE_DIR=/app/data \
    LOCAL_LOG_DIR=/app/logs \
    RUN_DIR=/app/run \
    RESOURCES_DIR=/app/resources \
    APP_ROOT=/app \
    FLASK_APP=lpm_kernel.app \
    LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# Expose ports
EXPOSE 8002 8080
# Set the startup command
CMD ["bash", "-c", "echo 'Checking SQLite database...' && if [ ! -s /app/data/sqlite/lpm.db ]; then echo 'SQLite database not found or empty, initializing...' && mkdir -p /app/data/sqlite && sqlite3 /app/data/sqlite/lpm.db '.read /app/docker/sqlite/init.sql' && echo 'SQLite database initialized successfully' && echo 'Tables created:' && sqlite3 /app/data/sqlite/lpm.db '.tables'; else echo 'SQLite database already exists, skipping initialization'; fi && echo 'Checking ChromaDB...' && if [ ! -d /app/data/chroma_db/documents ] || [ ! -d /app/data/chroma_db/document_chunks ]; then echo 'ChromaDB collections not found, initializing...' && python /app/docker/app/init_chroma.py && echo 'ChromaDB initialized successfully'; else echo 'ChromaDB already exists, skipping initialization'; fi && echo 'Starting application at '
```
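One thing worth double-checking alongside the build flags: even a CUDA-linked `llama-server` needs the GPU passed through at runtime. A sketch of the relevant compose fragment, assuming a service name of `backend` (Second-Me's actual docker-compose.yml may be organized differently):

```yaml
services:
  backend:
    build:
      context: .
      dockerfile: Dockerfile.backend.cuda
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Without a reservation like this (or `--gpus all` on a plain `docker run`), `nvidia-smi` can still work inside the container via the NVIDIA runtime while llama.cpp stays CPU-only, which matches the "GPU is available at runtime, but llama-server doesn't support CUDA" message above.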
Possible Workaround
No response
Additional Information
No response
Link to related Github discussion or issue
No response