Technology has become part of our daily life. It serves us, helps us find solutions to problems, and eases our way through solving them. As technology progresses, new ways of interaction between people and computers appear. Computers already expand our knowledge, but I think the next step is for them to literally expand our view: capturing what we see, augmenting it, and putting solutions right in front of our eyes. That, I believe, is where technology is headed over the next few years. Augmented reality combined with artificial intelligence can make solutions to problems faster, more transparent, and more easily accessible.

When Google announced Google Glass, my dream of augmented reality seemed to come true: a HUD, like in games, right in front of your eyes. I was somewhat disappointed when I saw the first prototype, because it is just a small display and definitely does not use all the resources modern technology provides. But then I saw a video of new virtual reality glasses with no borders, no black edges, just a pure view, as if you were really there. Here is what I would do: mount those glasses on my head, add a Kinect-like camera with depth perception, and a few more cameras to cover the full field of view. The possibilities for augmenting reality with these technologies are endless. I have seen many videos and demos of them; with Kinect, for example, you can place a virtual object right into the scene. With object detection, face recognition, and other features, you could enhance your view, get more information from what you see, and solve problems that are right in front of you.
And this is where I come in. I want to take part in this and do something useful. I got a school assignment to use artificial intelligence to solve a simple problem: a red car trying to get out of a parking lot. Many of you know this game; it is called Parking, Parking Lot, or something like that. I wrote simple algorithms to solve it, and then I thought: who would use this if they had to type all the car positions into a console or a file? Nobody. So I started experimenting and added a new input method: creating the scene by clicking with the mouse. Sure, some people would use that, but it's still not enough. Lazy people want the solution right in front of their eyes.

So I thought: what if I could recognize the objects in a picture and recreate the problem inside my program? I started by searching for a Java library that can detect objects and faces and generally work with camera input, and found OpenCV with the JavaCV wrapper. I struggled a little with the installation, because there is no good documentation for JavaCV (and everyone is using Eclipse), but I managed to get it working. I was able to detect the objects in the scene, analyze their positions on the grid, and finally extract this data into my program. That is how my program gained image files as an input method.

Then I thought that people might have these puzzles on paper or on their mobile phones, so it would be great to capture those with a camera, and I added that as yet another input method. But still not enough. I want the user's experience to be perfect, so I thought: what if I could extract the data from the camera and write the solution directly onto the user's display, right where they see the captured image? That would be augmented reality powered by artificial intelligence. Pictures, tutorials, and much more are coming in the next posts.
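The post doesn't show any of the actual code, so here is a minimal sketch of how a brute-force solver for the red-car puzzle could look. Everything in it is my own illustration, not the author's program: I assume the standard 6x6 board, cars encoded as letters on a flat 36-character string ('.' = empty), the red car 'R' lying horizontally on the third row, and the exit on the right edge. A plain breadth-first search over board states is enough at this size:

```java
import java.util.*;

public class RushHourSolver {
    static final int N = 6; // assumed board size

    // BFS over board states; returns the minimum number of moves needed to
    // bring the red car 'R' to the right edge of its row, or -1 if unsolvable.
    // The board is a 36-char string, row-major, '.' meaning an empty cell.
    static int solve(String start) {
        Queue<String> queue = new ArrayDeque<>();
        Map<String, Integer> dist = new HashMap<>();
        queue.add(start);
        dist.put(start, 0);
        while (!queue.isEmpty()) {
            String s = queue.poll();
            // goal: red car occupies the exit cell (row 2, last column)
            if (s.charAt(2 * N + N - 1) == 'R') return dist.get(s);
            for (String next : moves(s)) {
                if (!dist.containsKey(next)) {
                    dist.put(next, dist.get(s) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    // All states reachable by sliding one car a single cell along its axis.
    static List<String> moves(String s) {
        List<String> out = new ArrayList<>();
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '.' || !seen.add(c)) continue; // skip empties and cars already handled
            List<Integer> cells = new ArrayList<>();
            for (int j = 0; j < s.length(); j++) if (s.charAt(j) == c) cells.add(j);
            // cars are at least 2 cells long; adjacent indices mean horizontal
            boolean horizontal = cells.get(1) - cells.get(0) == 1;
            int step = horizontal ? 1 : N;
            int head = cells.get(0), tail = cells.get(cells.size() - 1);
            if (canMove(s, head, -step, horizontal)) out.add(shift(s, cells, -step));
            if (canMove(s, tail, step, horizontal)) out.add(shift(s, cells, step));
        }
        return out;
    }

    static boolean canMove(String s, int end, int step, boolean horizontal) {
        int target = end + step;
        if (target < 0 || target >= s.length()) return false;
        if (horizontal && target / N != end / N) return false; // no wrapping between rows
        return s.charAt(target) == '.';
    }

    static String shift(String s, List<Integer> cells, int step) {
        char[] a = s.toCharArray();
        char c = a[cells.get(0)];
        for (int i : cells) a[i] = '.';
        for (int i : cells) a[i + step] = c;
        return new String(a);
    }
}
```

On a 6x6 board the state space is tiny, so plain BFS finds the shortest solution instantly; a real program would also record the move sequence (parent pointers in the `dist` map) instead of just the move count.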
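For the step of turning detected objects into grid positions, the idea is simple geometry: once you know where the board sits in the image, each detection's bounding box maps to a range of grid cells. This is a hypothetical sketch under my own assumptions (the class and parameter names are mine; a real version would take OpenCV's detected rectangles as input):

```java
public class GridMapper {
    // Map a detected bounding box (x, y, w, h in image pixels) onto the cells
    // of an n x n puzzle grid. boardX/boardY is the top-left pixel of the
    // board in the image, boardSize its side length in pixels.
    // Returns {rowStart, colStart, rowEnd, colEnd} of the covered cells.
    static int[] toCells(int x, int y, int w, int h,
                         int boardX, int boardY, int boardSize, int n) {
        int cell = boardSize / n;                 // pixel size of one grid cell
        int colStart = (x - boardX) / cell;
        int rowStart = (y - boardY) / cell;
        int colEnd = (x + w - 1 - boardX) / cell; // inclusive far corner
        int rowEnd = (y + h - 1 - boardY) / cell;
        return new int[]{rowStart, colStart, rowEnd, colEnd};
    }
}
```

For example, a box covering roughly two cell widths in one row comes back as a horizontal car spanning two columns, which is exactly the data the solver's board encoding needs.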
Then I asked myself: what if we could solve more problems like this by enhancing our view, detecting objects, and using artificial intelligence? If we add OCR, we can solve even more. A friend of mine has had this AR idea in mind for a while, and we decided to make it happen. I can't tell you what it is yet, but it will be useful for sure :).
The main idea of this post: if we combine AR and AI, we can solve problems that are right in front of us.
If you have a problem you think is solvable with this approach, feel free to contact me or write a comment :).