Voice-controlled picture generator

Image credit: https://d1csarkz8obe9u.cloudfront.net/posterpreviews/actions-design-template-272fe678ae87badd19e38913b3249ac8_screen.jpg?ts=1593343965

Spook Louw, Jul 23, 2021
The original idea was to take the voice translation technology that already exists in programs like Google Translate and Speak and Translate and pair an entire dictionary of words with illustrations, as a way to communicate with deaf or hard of hearing people, and especially with people who speak a different language and cannot read.
I think this idea, or simply typing messages, is a better alternative for speaking to hard of hearing people.

The idea could still be helpful when trying to communicate with people who cannot read, but that got me thinking, "who else can't read?"

I think this could be an excellent educational tool for babies and toddlers. It could either help them understand the stories you are telling them or be a tool for them to practise speaking with.
It could be quite basic, with a specific picture connected to each specific word. Or it could be more intelligent, taking in an entire sentence or paragraph and displaying an appropriate sequence of pictures or, as the technology evolves, even a short animation. I don't think the technology is at that level yet, so I won't concentrate on it too much.

For educational purposes it would show the word along with the picture, just like posters in kindergarten classes. That way, users would hear the word, see an image of what it refers to and also see how it is spelled.
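
The basic version described above, with one picture per word and the word shown next to it, could be sketched roughly as follows. This is only an illustration under stated assumptions: the speech has already been turned into text by some existing speech-to-text tool, and the picture library, file paths and the function name words_to_cards are all made up for the example.

```python
import re

# Hypothetical picture library: each known word maps to one illustration file.
PICTURE_LIBRARY = {
    "dog": "images/dog.png",
    "ball": "images/ball.png",
    "tree": "images/tree.png",
}


def words_to_cards(transcript, library=PICTURE_LIBRARY):
    """Return (word, image_path) pairs for every known word, in spoken order."""
    # Lowercase the transcript and split it into plain words, dropping punctuation.
    words = re.findall(r"[a-z']+", transcript.lower())
    # Words without a picture in the library are simply skipped.
    return [(word, library[word]) for word in words if word in library]


# Assumed input: text that a speech-to-text step has already produced.
cards = words_to_cards("The dog chased the ball under the tree.")
for word, image in cards:
    # Show the spelling next to the picture, like a kindergarten poster.
    print(f"{word.upper()}: {image}")
```

A real tool would need a far larger library and a way to handle word forms such as plurals, which this sketch ignores.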

I have found many programs, tools and games that pronounce and spell words if you click on a picture, but none that work the other way around. If the program is written well, it could make stories even more interesting for children when they are listening, while motivating them to explore their vocabulary when they speak.
Creative contributions

Voice-controlled video generator

Shubhankar Kulkarni, Aug 26, 2021
Why stop at pictures? What if you could describe a scene and the tool created it in real time? You could then add to it and change it to fit your story.

For people who are hard of hearing, pictures would limit communication to things that do not involve action, or the teacher might need to break an entire process into consecutive images. A real-time video generator, on the other hand, would not break the flow.

This would also be a huge breakthrough in teaching. There would be no need to prepare presentations beforehand, which also spares the headache of last-minute changes. The presentation would be created in real time while the teacher is speaking.

Also, some dynamic concepts are easier to understand with a narrated video. However, you often cannot find the exact video you are looking for, so you have to edit existing videos or make a new one yourself. The voice-controlled video generator would eliminate all this hassle.

Something like Doodly comes to mind. Using Doodly, you can create doodle videos: they are basically images, but the tool draws them on screen, turning an image into a video. In the Doodly advertisement, the presenter explains how Doodly works while the video shows the same thing. Currently, we have to match the audio to the video manually. We would not have to do that with the voice-controlled video generator.
Darko Savic, 2 years ago
I can't find the right video, but AI is ready for this. I remember seeing a demo where an AI generated realistic video based on verbal descriptions from people. This is not it, but it is also cool:


The use of electroencephalographs to generate images from the user's thoughts instead

Samuel Bello, Aug 25, 2021
An electroencephalograph (EEG) is a machine that records electrical signals on the scalp. These signals have been shown to depend on the brain's activity, especially in the parts of the brain that are near the surface. An EEG-based system could be developed to generate images directly from the user's mind.

A brain-computer interface (BCI) that requires a chip to be implanted in the user's brain has been used for typing in one case. Though the speed is not impressive, it shows that using BCIs to generate images is achievable in the future. Not all BCIs require brain implants. Some can be worn by the user just like a cap, but the accuracy of such BCIs is quite low at our current level of technology.

This might feel like a complex solution to a simple problem, but a computer would not do a good job of generating images or animations that correspond to the user's thoughts just by recording the user's speech. The reason is that a large number of images may correspond to a particular word or sentence. Converting speech to text is possible because there is a one-to-one correspondence between spoken words and their written forms. Speech-to-image conversion is impossible for a computer program unless the image library is very small, because images generally contain much more data than words. But thought-to-image conversion can be made possible with advances in technology.
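
To make the one-to-many problem concrete, here is a small sketch with a deliberately tiny library, which is the only case where the lookup stays manageable. Everything in it is an assumption made for illustration: the candidate pictures, the context words attached to them and the function name pick_image. It guesses which picture was meant by counting how many context words appear in the spoken sentence, which is a crude stand-in for real understanding.

```python
# Hypothetical library: one word, several candidate pictures,
# each tagged with context words that hint at the intended meaning.
CANDIDATES = {
    "bat": [
        ("images/bat_animal.png", {"cave", "flew", "night", "wings"}),
        ("images/bat_baseball.png", {"ball", "hit", "game", "swing"}),
    ],
}


def pick_image(word, sentence, candidates=CANDIDATES):
    """Return the candidate picture whose context words best match the sentence.

    Returns None when the word is not in the library. Ties go to the
    first candidate listed.
    """
    options = candidates.get(word)
    if not options:
        return None
    # Strip full stops and split on whitespace to get the spoken words.
    spoken = set(sentence.lower().replace(".", "").split())
    # Score each candidate by how many of its context words were spoken.
    best = max(options, key=lambda option: len(option[1] & spoken))
    return best[0]


print(pick_image("bat", "The bat flew out of the cave at night."))
print(pick_image("bat", "He hit the ball with a bat."))
```

With only two candidates the guess is easy. With thousands of possible pictures per word, the same approach breaks down quickly, which is the point made above.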

General comments

Samuel Bello, 3 years ago
Such a program will be very helpful in learning new languages and training animals since images are a natural universal 'language' that most humans and animals can process.

For the simple cases, voice assistants can already search for images that correspond to some words and simple sentences. The generated images may not be the most suitable ones for your purpose though.