CNN implementation on CUDA using C++

This is a deliberately simplified implementation of a CNN (Convolutional Neural Network) in C++, using CUDA for GPU acceleration. It is intended for educational purposes and may not be optimized for performance or accuracy. Several features were omitted for simplicity, including proper data loading, a CLI, metrics, inference, and model persistence.

I tried to squeeze out as much performance as possible without resorting to complex optimizations. Because of my limited time, that came with some trade-offs: for example, the memory footprint is huge because of an improper iterator for data loading and extra copying in layers.

To simplify data handling, I used templates for most of the CUDA code, which should make it much easier to support different data types out of the box.
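As a rough illustration of the template approach, here is a minimal host-side sketch of an element-wise ReLU written once for any numeric type. The name `relu_forward` and its signature are assumptions made for this example and are not taken from the repository. In a `.cu` file the same body could plausibly live in a templated `__global__` kernel, with the loop replaced by a per-thread index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example, not the repository's code: a ReLU forward pass
// templated on the element type T, so float and double share one definition.
template <typename T>
void relu_forward(const T* input, T* output, std::size_t n) {
    // Null pointers are only acceptable when there is nothing to process.
    assert(n == 0 || (input != nullptr && output != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        // Negative values are clamped to zero; everything else passes through.
        output[i] = input[i] > T(0) ? input[i] : T(0);
    }
}
```

Calling `relu_forward<float>` or `relu_forward<double>` instantiates a separate version for each type at compile time, so supporting another data type does not require a second implementation.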

Installation and Running

To compile and run the code, you need to have a CUDA-capable GPU and the CUDA toolkit installed. You can follow these steps:

Clone the repo. Modify main.cu to change hyperparameters or dataset paths if needed (note that the file names I use for the unpacked MNIST data might differ slightly from yours).
Compile using CMake (I recommend opening the project in CLion, since it handles most of the setup automatically; make sure to select a compiler that includes CUDA support):
```
cmake --build ./cmake-build-<build-type> --target CNN_CUDA -j $(nproc)
```
You can most likely use nvcc directly instead of CMake, but I haven't tested that.
Run the executable:
```
./cnn_cuda
```

Ensure the MNIST dataset files are in the same directory as the executable, or modify the paths in the code accordingly.

I haven't added CLI arguments or proper inference: my goal was to implement the core CNN in CUDA as a small challenge, and I don't think they are that important for this demo. The same goes for block_size and grid_size, which are hardcoded in most places; making them configurable is a subject for future improvement.
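One possible direction for that improvement is to derive grid_size from the problem size instead of hardcoding it. The sketch below is an assumption about how this could look, not code from the repository: `LaunchConfig`, `ceil_div`, `make_launch_config`, and the default of 256 threads per block are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the repository: a 1D launch configuration.
struct LaunchConfig {
    unsigned int grid_size;   // number of blocks
    unsigned int block_size;  // threads per block
};

// Ceiling division: the smallest block count such that
// grid_size * block_size >= n.
constexpr unsigned int ceil_div(std::size_t n, unsigned int block_size) {
    return static_cast<unsigned int>((n + block_size - 1) / block_size);
}

// Builds a launch configuration covering n elements.
// The default block size of 256 is an arbitrary choice for this sketch.
inline LaunchConfig make_launch_config(std::size_t n, unsigned int block_size = 256) {
    // Current CUDA GPUs allow at most 1024 threads per block.
    assert(block_size > 0 && block_size <= 1024);
    return LaunchConfig{ceil_div(n, block_size), block_size};
}
```

A kernel would then be launched as `kernel<<<cfg.grid_size, cfg.block_size>>>(...)`. Because the last block may extend past `n`, the kernel needs a bounds check such as `if (idx < n)`, and the caller should skip the launch when `n` is 0, since a grid of zero blocks is not a valid configuration.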
