Seeed-Projects/reComputer-RK-LLM

License: MIT

Introduction

This repository uses Docker to package large language models (LLMs) and multimodal models optimized for Rockchip platforms. Each packaged model exposes a unified, OpenAI-API-compatible interface, making the models easy to integrate and use.

Hardware Preparation

Supported devices: reComputer RK3588 and reComputer RK3576.

LLM

Fast start

| Device | Model |
|--------|-------|
| RK3588 | rk3588-deepseek-r1-distill-qwen:7b-w8a8-latest |
| RK3588 | rk3588-deepseek-r1-distill-qwen:1.5b-fp16-latest |
| RK3588 | rk3588-deepseek-r1-distill-qwen:1.5b-w8a8-latest |
| RK3576 | rk3576-deepseek-r1-distill-qwen:7b-w4a16-g128-latest |
| RK3576 | rk3576-deepseek-r1-distill-qwen:7b-w4a16-latest |
| RK3576 | rk3576-deepseek-r1-distill-qwen:1.5b-fp16-latest |
| RK3576 | rk3576-deepseek-r1-distill-qwen:1.5b-w4a16-g128-latest |
| RK3576 | rk3576-deepseek-r1-distill-qwen:1.5b-w4a16-latest |
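Once a model container is running, it can be queried through the OpenAI-compatible interface. The sketch below builds a standard `/v1/chat/completions` request with only the Python standard library; the host, port, and endpoint path are assumptions (adjust them to match your container's published port), not values documented by this repository.

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str):
    """Build an OpenAI-style chat-completion request (URL + JSON payload)."""
    url = f"{base_url}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return url, payload

if __name__ == "__main__":
    # Assumed host/port; change to wherever the container is listening.
    url, payload = build_chat_request(
        "http://localhost:8080",
        "rk3588-deepseek-r1-distill-qwen:1.5b-w8a8-latest",
        "Hello!",
    )
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    # resp = urllib.request.urlopen(req)  # uncomment with a running server
```

Because the interface follows the OpenAI schema, the official `openai` Python client (pointed at the local base URL) should also work.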

VLM

Fast start

| Device | Model |
|--------|-------|
| RK3588 | rk3588-qwen2-vl:7b-w8a8-latest |
| RK3588 | rk3588-qwen2-vl:2b-w8a8-latest |
| RK3576 | rk3576-qwen2.5-vl:3b-w4a16-latest |
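For the vision-language models, requests mix text and image content in a single message. A minimal sketch, assuming the server accepts the OpenAI vision-style message format with base64 data URLs (this is an assumption; check the container's docs for the exact schema it supports):

```python
import base64

def build_vision_message(prompt: str, image_bytes: bytes, mime: str = "image/png"):
    """Build one OpenAI vision-style chat message with text plus an inline image."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Image is embedded as a data URL rather than fetched from the web.
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }
```

The resulting message drops into the same `messages` list used for the text-only models.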

Speed test

Note: the script gives a rough estimate of a model's inference speed, covering both TTFT (time to first token) and TPOT (time per output token). Run python test_inference_speed.py --help to see the available options.

```shell
python -m venv .env && source .env/bin/activate
pip install requests
python llm_speed_test.py
```
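For reference, TTFT and TPOT can be derived from per-token arrival timestamps collected during a streamed generation. This is a minimal sketch of the two metrics, independent of the repository's script (which may measure them differently):

```python
def ttft_and_tpot(request_time: float, token_times: list[float]):
    """Compute time-to-first-token and mean time-per-output-token (seconds).

    request_time: wall-clock time the request was sent.
    token_times:  wall-clock arrival time of each generated token, in order.
    """
    if not token_times:
        raise ValueError("no tokens were generated")
    ttft = token_times[0] - request_time
    if len(token_times) == 1:
        return ttft, 0.0
    # TPOT: average gap between consecutive tokens after the first one.
    tpot = (token_times[-1] - token_times[0]) / (len(token_times) - 1)
    return ttft, tpot

# Example: request sent at t=0.0 s, tokens arrive at 0.5, 0.6, 0.7, 0.8 s
# → TTFT = 0.5 s, TPOT ≈ 0.1 s
```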

💞 Top contributors:

[contrib.rocks contributor image]

🌟 Star History

[Star History chart]

Reference: rknn-llm
