sra-vjti / pixels_seminar
SRA's seminar on Introduction to Computer Vision Fundamentals
License: MIT License
The aim of the pixels seminar is to introduce juniors to the basic concepts of image processing and computer vision. We started with Python to keep it as simple as possible, but I feel the need to move from Python to C++. C++ is complicated, but it is an industry standard.
If juniors were introduced to C++ in this seminar rather than Python, they might be more willing to go through large C++ codebases in GSoC or other programs and competitions. With the inclusion of C++, we can also introduce build systems (which, honestly, is the need of the hour).
This repository can also act as a basic template for CV people getting started with C++.
@meshtag @amanchhaparia @SAtacker kindly share your opinion on this.
Also keep in mind that at SRA, we advocate progress, regardless of how challenging or difficult it may seem at first.
Is your feature request related to a problem? Please describe.
The current implementation for working with image arrays is in Python. As discussed in this thread, we need to port it to C++.
Describe the solution you'd like
- cv::Mat's inbuilt slicing functionalities for demonstration purpose(s).
- for loops as well; this is done to make students aware of pointer manipulation and pointer arithmetic in C++.
- makefile to compile and build related executable(s).
- .md file for the theoretical part (and depicting expected results) as well.
Describe the bug
Makefiles fail to work on macOS after #107. A libsdl2 error is raised.
To Reproduce
Steps to reproduce the behaviour:
0. Try this on macOS Sonoma
4_cv_basics/1_image_representation/Makefile.
main.cpp:28:10: fatal error: 'SDL2/SDL.h' file not found
Expected behavior
The file should have compiled successfully to give an executable.
Desktop (please complete the following information):
The code 4_convolutions_filtering covers the following features:
This issue involves refactoring the provided convolution code into three separate files, each focusing on a specific aspect of convolution operations, and creating a separate folder for benchmarking the results of each file. The goal is to enhance code modularity, readability, and maintainability by isolating different functionalities into distinct files. This would also make the code less complex to understand.
Benchmarking can be used solely for displaying the results and comparing the efficiency of different convolution approaches during the seminar. The benchmarking code can be kept as optional material to explore.
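As a rough illustration of two approaches such a benchmark could compare, here is a minimal sketch of a naive 2D filter and its separable equivalent in plain C++. The `Image` struct, the function names, and the zero-padding choice are assumptions made for this sketch; they are not taken from the repository's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major grayscale image; not a type from the repository.
struct Image {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data; // rows * cols values
};

// Read a pixel, treating everything outside the image as zero (zero padding).
float pixel_or_zero(const Image& img, long r, long c) {
    if (r < 0 || c < 0 ||
        r >= static_cast<long>(img.rows) || c >= static_cast<long>(img.cols)) {
        return 0.0f;
    }
    return img.data[static_cast<std::size_t>(r) * img.cols + static_cast<std::size_t>(c)];
}

// Naive 2D filtering with a k x k kernel (k odd) using four nested loops.
// The kernel is applied without flipping (correlation); for symmetric
// kernels this gives the same result as true convolution.
Image filter2d_naive(const Image& src, const std::vector<float>& kernel, std::size_t k) {
    assert(k % 2 == 1 && kernel.size() == k * k);
    const long half = static_cast<long>(k / 2);
    Image dst{src.rows, src.cols, std::vector<float>(src.rows * src.cols, 0.0f)};
    for (std::size_t r = 0; r < src.rows; ++r) {
        for (std::size_t c = 0; c < src.cols; ++c) {
            float sum = 0.0f;
            for (std::size_t i = 0; i < k; ++i) {
                for (std::size_t j = 0; j < k; ++j) {
                    sum += kernel[i * k + j] *
                           pixel_or_zero(src, static_cast<long>(r + i) - half,
                                         static_cast<long>(c + j) - half);
                }
            }
            dst.data[r * src.cols + c] = sum;
        }
    }
    return dst;
}

// Separable filtering: a horizontal pass with kx, then a vertical pass with ky.
// Matches filter2d_naive when kernel[i * k + j] == ky[i] * kx[j].
Image filter2d_separable(const Image& src, const std::vector<float>& kx,
                         const std::vector<float>& ky) {
    assert(kx.size() % 2 == 1 && ky.size() % 2 == 1);
    const long hx = static_cast<long>(kx.size() / 2);
    const long hy = static_cast<long>(ky.size() / 2);
    Image tmp{src.rows, src.cols, std::vector<float>(src.rows * src.cols, 0.0f)};
    for (std::size_t r = 0; r < src.rows; ++r) {
        for (std::size_t c = 0; c < src.cols; ++c) {
            float sum = 0.0f;
            for (std::size_t j = 0; j < kx.size(); ++j) {
                sum += kx[j] * pixel_or_zero(src, static_cast<long>(r),
                                             static_cast<long>(c + j) - hx);
            }
            tmp.data[r * src.cols + c] = sum;
        }
    }
    Image dst{src.rows, src.cols, std::vector<float>(src.rows * src.cols, 0.0f)};
    for (std::size_t r = 0; r < src.rows; ++r) {
        for (std::size_t c = 0; c < src.cols; ++c) {
            float sum = 0.0f;
            for (std::size_t i = 0; i < ky.size(); ++i) {
                sum += ky[i] * pixel_or_zero(tmp, static_cast<long>(r + i) - hy,
                                             static_cast<long>(c));
            }
            dst.data[r * src.cols + c] = sum;
        }
    }
    return dst;
}
```

For a k x k kernel, the naive version does k * k multiplications per pixel while the separable version does 2 * k, which is the kind of difference a benchmark folder could make visible.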
Is your feature request related to a problem? Please describe.
The current Blob Detection algorithm is implemented in Python and needs to be ported to C++, as discussed in this thread.
Describe the solution you'd like
- Readme.md explaining the algorithm in detail; it should also contain the instructions for running the executable.
Is your feature request related to a problem? Please describe.
This feature request is created to keep a record of porting and potential addition of new assignments related to the seminar in C++ as discussed in this thread.
Describe the solution you'd like
In the existing code of 1_image_representation, we are reading both the file header and the image information header, which include metadata about the BMP file beyond just the pixel data. This metadata includes information such as file size, pixel data offset, image dimensions, color depth, compression method, etc. I feel this metadata is unnecessary since our goal here is to read the pixel data and display the image. However, reading the complete header can also provide useful information for debugging or error handling.
If the purpose of this code is only to read the pixel data and display the image, the code could be simplified by skipping the metadata it does not need. Since we want to reduce the complexity of this code, I see no other option, so this is my suggestion:
#include <cstdint>
#include <vector>

struct BMPImage {
    uint32_t width;              // Width of the image in pixels
    uint32_t height;             // Height of the image in pixels
    std::vector<uint8_t> pixels; // Vector to store pixel data
};
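To make the suggestion more concrete, here is a minimal sketch of a loader that fills this struct while reading only the header fields it needs (pixel data offset, width, height and bit depth). The function names `read_u32_le` and `load_bmp_pixels` are hypothetical, and the sketch assumes an uncompressed 24-bit BMP with a positive height (rows stored bottom-up); it is not the existing code of 1_image_representation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct BMPImage {
    uint32_t width;              // Width of the image in pixels
    uint32_t height;             // Height of the image in pixels
    std::vector<uint8_t> pixels; // Tightly packed BGR bytes, top row first
};

// Hypothetical helper: read a little-endian 32-bit value at a byte offset.
uint32_t read_u32_le(const std::vector<uint8_t>& buf, std::size_t off) {
    assert(off + 4 <= buf.size());
    return static_cast<uint32_t>(buf[off]) |
           (static_cast<uint32_t>(buf[off + 1]) << 8) |
           (static_cast<uint32_t>(buf[off + 2]) << 16) |
           (static_cast<uint32_t>(buf[off + 3]) << 24);
}

// Hypothetical loader. Assumes an uncompressed 24-bit BMP whose height field
// is positive, which means the rows are stored bottom-up in the file.
BMPImage load_bmp_pixels(const std::vector<uint8_t>& file) {
    if (file.size() < 54 || file[0] != 'B' || file[1] != 'M') {
        throw std::runtime_error("not a BMP file");
    }
    const uint32_t data_offset = read_u32_le(file, 10); // start of pixel data
    BMPImage img;
    img.width = read_u32_le(file, 18);
    img.height = read_u32_le(file, 22);
    const unsigned bits_per_pixel = file[28] | (file[29] << 8);
    if (bits_per_pixel != 24) {
        throw std::runtime_error("only 24-bit BMP is handled in this sketch");
    }

    // Each row in the file is padded to a multiple of 4 bytes.
    const std::size_t row_bytes = static_cast<std::size_t>(img.width) * 3;
    const std::size_t stride = (row_bytes + 3) / 4 * 4;
    if (file.size() < data_offset + stride * img.height) {
        throw std::runtime_error("truncated pixel data");
    }

    // Copy rows in reverse order so that pixels[0] belongs to the top row,
    // dropping the padding bytes at the end of each row.
    img.pixels.resize(row_bytes * img.height);
    for (uint32_t y = 0; y < img.height; ++y) {
        const uint8_t* src = file.data() + data_offset + (img.height - 1 - y) * stride;
        std::copy(src, src + row_bytes,
                  img.pixels.begin() + static_cast<std::ptrdiff_t>(y * row_bytes));
    }
    return img;
}
```

The trade-off mentioned above still applies: skipping the rest of the header keeps the code short, but fields such as the compression method are then not available for error handling.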
@advait-0 and @PritK99, please suggest whether there is any other way to reduce the complexity of the existing code besides removing the metadata; otherwise we can go ahead with this.
Is your feature request related to a problem? Please describe.
The contents of the Image Representation section are not up to the mark.
Describe the solution you'd like
The contents of the Image Representation section should be updated, and the documentation format should be updated as well.
Is your feature request related to a problem? Please describe.
We need to update the root level README.md as we add and port the contents to C++, as discussed in this thread.
Describe the solution you'd like
The README.md should be updated and follow the same standard as the other repositories of the org, e.g. Wall E.
Is your feature request related to a problem? Please describe.
Current implementation of convolution is in Python. As discussed in this thread, we need to port it (1D, 2D and separable convolution) to C++.
Describe the solution you'd like
- (... for loops by hand) as well as proper (using native OpenCV API) implementation in C++.
- makefile to compile and build related executable(s).
- .md file for the theoretical part.
- (... Google benchmarks) both implementations (naive for loops vs OpenCV's implementation) to give students a fair idea about the optimization provided by OpenCV. We can have a discussion on this point once all other tasks are complete.
Is your feature request related to a problem? Please describe.
As discussed in the thread, concepts of build systems should be added.
Describe the solution you'd like
Additional context
For reference, see: Embedded Study Group Week 2.
Note: Content is not finalised and open for discussion.
Is your feature request related to a problem? Please describe.
As discussed in the thread, it is important to be familiar with how images are stored.
Describe the solution you'd like
- .bmp, .tiff, .jpg, .png etc.
- .cpp file on how an image can be read from the bmp format.
- Makefile to compile and build the executable.
- .md file explaining the theory and instructions to build and run the executables.
Note: Content is not finalised and open for discussion.
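As one possible illustration for the image-format material proposed in this request, the sketch below guesses a file's format from its leading signature bytes. The function name `detect_image_format` and the returned strings are made up for this example; the signature bytes themselves are the standard ones for BMP, PNG, JPEG and TIFF.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper: guess the image format from the first bytes of a file.
std::string detect_image_format(const std::vector<uint8_t>& bytes) {
    // True when the buffer begins with the given signature.
    auto starts_with = [&bytes](std::initializer_list<uint8_t> sig) {
        if (bytes.size() < sig.size()) {
            return false;
        }
        return std::equal(sig.begin(), sig.end(), bytes.begin());
    };

    if (starts_with({0x42, 0x4D})) {                                     // "BM"
        return "bmp";
    }
    if (starts_with({0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A})) { // "\x89PNG\r\n\x1a\n"
        return "png";
    }
    if (starts_with({0xFF, 0xD8, 0xFF})) {                               // JPEG start-of-image
        return "jpg";
    }
    if (starts_with({0x49, 0x49, 0x2A, 0x00}) ||                         // "II*\0" (little-endian TIFF)
        starts_with({0x4D, 0x4D, 0x00, 0x2A})) {                         // "MM\0*" (big-endian TIFF)
        return "tiff";
    }
    return "unknown";
}

// Usage example: the first eight bytes of any PNG file.
void example_usage() {
    const std::vector<uint8_t> png_start = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
    assert(detect_image_format(png_start) == "png");
}
```

A check like this only looks at the container signature; it says nothing about whether the rest of the file is valid.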
Is your feature request related to a problem? Please describe.
We have a section covering some important introductory topics from OpenCV. This section currently covers all of them in Python; as discussed in this thread, we need to port it to C++.
Describe the solution you'd like
- OpenCV's native API wherever required.
- OpenCV's C++ APIs in accordance with the current Python APIs present inside the above-mentioned file.
- makefile to build and compile related executable(s).
- .md file for the theoretical part (and depicting expected results) as well.
Users who have freshly installed Ubuntu on their laptops may not have make, g++ or other essentials installed. This problem may look simple, but it can trip up those who are new to the Linux environment. To mitigate it, I suggest replacing the Makefile with a shell script (a .sh script like the ones we use for Wall-E and MARIO) that updates their system and installs the build-essential package, along with the usual installation of OpenCV and its dependencies.
Thank You
Is your feature request related to a problem? Please describe.
The current implementation of Morphological Transformation is in Python; it needs to be ported to C++, as discussed in this thread.
Describe the solution you'd like
- Readme.md containing a detailed explanation of the content, similar to the earlier Jupyter notebook notes; it should also contain the instructions to build and execute the code.
Is your feature request related to a problem? Please describe.
Since the seminar is being ported to C++ as discussed in this thread, it is important to teach some important C++ concepts.
Describe the solution you'd like
Note: Content is not finalised and open for discussion.
Is your feature request related to a problem? Please describe.
As discussed in the thread, concepts of interpolation can also be added.
Describe the solution you'd like
- Makefile to compile and build executables.
- .md file to explain the theory of interpolation and instructions to build and run the executables.
Additional context
Reference: Ancient Secrets of Computer Vision.
Note: Content is not finalised and open for discussion
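As a possible starting point for the interpolation content proposed in this request, here is a minimal sketch of nearest-neighbour and bilinear sampling on a grayscale image. The `GrayImage` type and the function names are assumptions for this sketch, as are the conventions that pixel values sit at integer coordinates and that coordinates outside the image are clamped to the border.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row-major grayscale image used only for this sketch.
struct GrayImage {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data; // rows * cols values
};

// Read a pixel, clamping out-of-range coordinates to the nearest border pixel.
float at_clamped(const GrayImage& img, long r, long c) {
    assert(img.rows > 0 && img.cols > 0);
    r = std::max(0L, std::min(r, static_cast<long>(img.rows) - 1));
    c = std::max(0L, std::min(c, static_cast<long>(img.cols) - 1));
    return img.data[static_cast<std::size_t>(r) * img.cols + static_cast<std::size_t>(c)];
}

// Nearest-neighbour: round the coordinates and return that pixel.
float sample_nearest(const GrayImage& img, float y, float x) {
    return at_clamped(img, std::lround(y), std::lround(x));
}

// Bilinear: blend the four surrounding pixels, weighted by distance.
float sample_bilinear(const GrayImage& img, float y, float x) {
    const float fy = std::floor(y);
    const float fx = std::floor(x);
    const long r0 = static_cast<long>(fy);
    const long c0 = static_cast<long>(fx);
    const float dy = y - fy; // vertical weight of the lower row
    const float dx = x - fx; // horizontal weight of the right column

    const float top = (1.0f - dx) * at_clamped(img, r0, c0) +
                      dx * at_clamped(img, r0, c0 + 1);
    const float bottom = (1.0f - dx) * at_clamped(img, r0 + 1, c0) +
                         dx * at_clamped(img, r0 + 1, c0 + 1);
    return (1.0f - dy) * top + dy * bottom;
}
```

Resizing an image then amounts to mapping each output pixel back to a fractional position in the input and calling one of these samplers.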
Is your feature request related to a problem? Please describe.
As we are moving the contents to C++ as discussed in this thread, it is important to redesign the existing file structure.
Describe the solution you'd like
I would like to propose the file structure as below:
.
├── 1_cpp_basic # Basic C++ concepts such as pointers.
├── 2_build_systems # Build system topics, including Makefile and CMake.
├── 3_git_github # Basic Git and GitHub contents.
├── 4_cv_basics # All the Computer Vision fundamentals.
│ ├── image_representation # Markdown contents on basic image representation.
│ ├── Color spaces and conversion # Introduction to image color spaces and their conversions.
│ │ ├── color_conversion.cpp # Contains the C++ code for conversion between color spaces.
│ │ ├── Makefile # Includes build commands.
│ │ └── Readme.md # Detailed explanation of the topic and instructions for running the code.
│ ├── Playing with image coordinates # Content on the OpenCV Mat container.
│ ├── Convolution and filtering # Contents on convolution and filtering.
│ ├── Masking # Contents on masking.
│ ├── Morphological Transformation # Contents on morphological transformation.
│ └── Blob Detection # Mini project on blob detection.
├── Assignments # Contains the assignments.
├── assets # Contains the images and GIFs.
├── .gitignore # Ignores unnecessary build files.
├── LICENSE # License for this project.
└── README.md # Contains documentation and topics of the repo.
Additional context
Reference: Ancient Secrets of Computer Vision
Note: File structure is not finalised and open for discussion.
Is your feature request related to a problem? Please describe.
The current contents and implementation of masking are in Python. As discussed in the thread, they need to be ported to C++.
Describe the solution you'd like
- Makefile to compile and build the executables.
- .md file explaining the concepts and instructions for running the executables.
Users, especially those new to C++, often face confusion and difficulty in understanding the proper usage of OpenCV classes due to the common practice of using the cv namespace. To improve clarity and encourage best practices, I think it is suitable to eliminate the use of the cv namespace in our examples.
Possible solution to this problem could be:
Is your feature request related to a problem? Please describe.
There are currently no examples of how data is stored in image files. Since the whole codebase was in Python, it did not make sense to add this until the discussion in this thread.
Describe the solution you'd like
- read() API to read the image bitmap file.
- struct is suggested.
- main.cpp (or any other name of choice) file may suffice for this purpose.
- README.md for detailed theory and explanation of code and how to compile it.
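A rough sketch of the struct-plus-read() approach suggested in this request could look as follows. Here read() is taken to mean `std::istream::read`, which is an assumption; the struct and function names are hypothetical. The sketch also assumes a little-endian host and a compiler that supports `#pragma pack` (GCC, Clang and MSVC do), so that the structs match the on-disk layout byte for byte.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

#pragma pack(push, 1) // the on-disk headers have no padding between fields
struct BMPFileHeader {
    uint16_t type;        // 0x4D42, i.e. "BM" read as a little-endian value
    uint32_t file_size;   // size of the whole file in bytes
    uint16_t reserved1;
    uint16_t reserved2;
    uint32_t data_offset; // where the pixel data starts
};

struct BMPInfoHeader {
    uint32_t header_size; // 40 for the common BITMAPINFOHEADER
    int32_t width;        // in pixels
    int32_t height;       // in pixels; positive means bottom-up rows
    uint16_t planes;
    uint16_t bits_per_pixel;
    uint32_t compression;
    uint32_t image_size;
    int32_t x_pixels_per_meter;
    int32_t y_pixels_per_meter;
    uint32_t colors_used;
    uint32_t colors_important;
};
#pragma pack(pop)

static_assert(sizeof(BMPFileHeader) == 14, "file header must be 14 bytes");
static_assert(sizeof(BMPInfoHeader) == 40, "info header must be 40 bytes");

// Hypothetical helper: fill both headers straight from a binary stream.
bool read_bmp_headers(std::istream& in, BMPFileHeader& fh, BMPInfoHeader& ih) {
    in.read(reinterpret_cast<char*>(&fh), sizeof(fh));
    in.read(reinterpret_cast<char*>(&ih), sizeof(ih));
    if (!in) {
        return false; // stream ended before both headers were read
    }
    return fh.type == 0x4D42;
}

// Usage example with an in-memory stream standing in for a file.
void example_usage() {
    std::string bytes(54, '\0');
    bytes[0] = 'B';
    bytes[1] = 'M';
    bytes[10] = 54; // pixel data offset
    bytes[14] = 40; // info header size
    std::istringstream in(bytes, std::ios::binary);
    BMPFileHeader fh{};
    BMPInfoHeader ih{};
    assert(read_bmp_headers(in, fh, ih));
    assert(fh.data_offset == 54 && ih.header_size == 40);
}
```

With a real file, the same function can be called on a `std::ifstream` opened with `std::ios::binary`.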