How to build OpenCV with CUDA for C++ on Windows
Does your Windows PC have a GPU made by NVIDIA?
Good news: you can use those CUDA cores with OpenCV! Even the weakest GPUs can cut your computation time. Bad news: you can't download this prebuilt. You will need to build OpenCV yourself to work with your GPU. Worse news: there are so many options that you might be overwhelmed by what to do and what not to do.
This guide is intended to help you quickly figure out what you need to do to build OpenCV so it can leverage your CUDA cores. This was created with a bit of trial and error.
- 64-bit Windows OS
- An NVIDIA GPU with CUDA cores
- ~5GB of space free on a single drive
Before you start, there are a few things that you will need to download and install on your machine:
- CMake: https://cmake.org/
- Visual Studio for C++ (Community/Express will work): https://www.visualstudio.com/
- CUDA Toolkit: https://developer.nvidia.com/cuda-toolkit
- CUDA Tools SDK (Maybe? 4.0 is the latest I can find...): https://developer.nvidia.com/cuda-toolkit-40
- The version of Visual Studio you need depends on your compatibility requirements.
- Make sure you install the C++ support and that it can build 64-bit.
- Make sure VS is in your path as well.
Get the latest release. This contains your source code.
Link: http://opencv.org/releases.html
CMake is a GUI tool to configure OpenCV to your liking before it's built.
- Point CMake to where the source code is. Look for the directory with the CMakeLists.txt file in it.
- Hit configure
- Check the following:
- WITH_CUDA
- WITH_CUBLAS
- CUDA_FAST_MATH
Check only if you do not care about absolute accuracy. The tradeoff here is accuracy for speed.
- BUILD_opencv_world
Select only if you don't plan to deploy this. This will build everything into one lib/dll. This is not lean at all.
- INSTALL_TESTS
Lets you verify everything works.
- Uncheck the BUILD_opencv_python2 flag if you don't plan on using python
- Otherwise you will need a debug build of Python on your machine
- Hit configure again, and repeat until the red text highlights go away
- Finally: hit the Generate button to generate the Visual Studio solution for the C++ code
The code has now been configured as you need it. Now we actually get to build OpenCV.
- MAKE SURE BUILDING 64 BIT
- Build debug and release
- Build ALL_BUILD
- Build INSTALL
- To build a project: go to Solution Explorer and right-click the project you want to build.
- Expect 2 hours to build each ALL_BUILD.
- Change your power settings temporarily so your computer doesn't fall asleep during this.
- Look in [build directory]/install/ for the dlls, libs, and test programs for debug and release
- Review the OpenCVConfig.cmake file to double check how things were configured
Run a subset (or all, if you really want to) of the test executables with Command Prompt/PowerShell to see if things built correctly.
- Put lena.png and lena.jpg in the exe directory. Some tests will need these. They can be found in the original OpenCV source directory.
- At a minimum, "opencv_test_core.exe" should be tested
Move your new OpenCV build to the place you want (C drive maybe?)
- What you want will be in the "../install" directory. Once you move it, rename it something like "../opencv"
- Add "../opencv/bin" to your path (has all the dlls)
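If a program later fails to start because an OpenCV dll cannot be found, one thing worth checking is whether the bin directory really is on the PATH the process sees. The sketch below is one way to do that with only the C++ standard library. The function names (`normalizeDir`, `pathListContains`, `isOnPath`) are made up for this example, and it assumes the Windows convention of `;` as the PATH separator and case-insensitive directory names.

```cpp
// MSVC may flag std::getenv as unsafe (warning C4996); this define silences it.
#define _CRT_SECURE_NO_WARNINGS
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper: lowercases a directory name and strips trailing
// slashes so that "C:\OpenCV\bin\" compares equal to "c:\opencv\bin".
std::string normalizeDir(std::string dir) {
    while (!dir.empty() && (dir.back() == '\\' || dir.back() == '/')) {
        dir.pop_back();
    }
    std::transform(dir.begin(), dir.end(), dir.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return dir;
}

// Hypothetical helper: true if `dir` is one of the ';'-separated entries
// in `pathList`.
bool pathListContains(const std::string& pathList, const std::string& dir) {
    const std::string wanted = normalizeDir(dir);
    std::istringstream entries(pathList);
    std::string entry;
    while (std::getline(entries, entry, ';')) {
        if (!entry.empty() && normalizeDir(entry) == wanted) {
            return true;
        }
    }
    return false;
}

// Checks the PATH environment variable of the current process.
bool isOnPath(const std::string& dir) {
    const char* path = std::getenv("PATH");
    return path != nullptr && pathListContains(path, dir);
}
```

Call `isOnPath` from your own `main` with wherever you moved the build, for example `isOnPath("C:\\opencv\\bin")` (that location is only an example). A process only sees the PATH it was started with, so restart Visual Studio or your terminal after editing it.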
This part will be very similar to what OpenCV's documentation has. A link to the 2.4 documentation is in the reference section below. I use the local method here, but you can use the global method if you wish.
- Build for 64 bit only. This is a requirement to use the GPU.
- Include Debug and Release libs that you require
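- Edit Project for C/C++: add the OpenCV include directory under Additional Include Directories
- Edit Project for Linker: add the OpenCV lib directory under Additional Library Directories
- Add the OpenCV .lib files you need under Linker > Input > Additional Dependencies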
- Repeat for release building
If Visual Studio complains about pdb symbols not being found, you may need to grab symbols from Microsoft for debugging:
- Tools->Options->Debugging->Symbols and select checkbox "Microsoft Symbol Servers"
- Firewalls/proxies may make things tricky for you! Manually downloading dlls and other items may be required.
- Remember that to use the GPU, you must first upload matrices to the GPU. Once you're finished, download the results back to the CPU.

