Question about Optical Character Recognition Software


Hello, I am hoping someone could point me in the right direction.

I know very little about programming, but I would like software that I could load videos into for analysis. Specifically, I'd like it to output a text file or spreadsheet that lists a timestamp for each time recognizable text appears, along with the text itself.

Would this even be possible to do? I'm assuming that building Optical Character Recognition from scratch would be prohibitively difficult and expensive, but I'm hoping there is some way to license existing technology for use in a desktop application that I'd like to develop.

Any thoughts or ideas are greatly appreciated.
Essentially, this application would break down as follows:
  • You load a video file into it.
  • The application detects and decodes the video format, then renders a "frame" image every X seconds, where X is an arbitrary interval, perhaps user-defined. A video engine such as FFmpeg will most likely be needed here.
  • These frame images are then run through a computer vision engine that analyzes them for text, most likely something like Google's Cloud Vision API.
  • Any returned text is added to an output file along with the number of the frame it was found in, which determines the time in the video at which it appeared.
  • A final "cleanup" routine then scrubs sequential duplicates and other over-reporting from the data file (for example, when a scene keeps switching between camera 1 and camera 2 and each shot contains some text).
If you are talking about facial recognition security technology, it's an amazing topic to talk about. This is a new technology that makes our society's work easier. AI facial recognition security software is designed to detect the size and position of human faces in digital photographs.
The most widely used machine learning approach for facial recognition is the deep convolutional neural network (CNN). CNNs are a type of artificial neural network well suited to image classification tasks.
Very interesting... but totally irrelevant to the question that was asked!
