Vision

This application lets the robot use its camera and the Google Cloud APIs to take a photo and describe what it shows.

  • First we import the libraries we are going to use.

    • This includes the Google client libraries (Google Cloud API).

import base64
import httplib2
import os
from time import sleep

from apiclient.discovery import build
from oauth2client.client import GoogleCredentials
from google.cloud import translate
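
Before running the app it is worth checking that the key file is actually picked up. The snippet below is a minimal sketch: the path is a placeholder, not the key file used in the code further down.

import os
from oauth2client.client import GoogleCredentials

# Placeholder path: substitute your own service-account key file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/key.json"

# get_application_default() reads the variable above and raises
# ApplicationDefaultCredentialsError if the file is missing or invalid.
credentials = GoogleCredentials.get_application_default()
print('Credentials loaded:', type(credentials).__name__)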
  • Explaining the code.

    • First we load the credentials from the JSON key file by storing its path in the GOOGLE_APPLICATION_CREDENTIALS environment variable.

    • Then we open an authorized connection to the Vision API (vision.googleapis.com).

    • We also use the Translate API to convert the result to Spanish, so we set 'es' as the target language (a standalone sketch follows the code).

    • We use espeak so the user knows when the robot is about to take the photo.

    • We take the photo with the fswebcam command.

    • We send the photo to Google Cloud as a base64-encoded JSON request and wait for the result.

    • We keep the first label and ignore the next four, because the labels come back sorted by confidence score (see the sketch after this list).

    • We use espeak so the robot can say what it believes the object is.

    • Finally we display the photo we took and exit the app.
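
For reference, this is roughly what a label-detection reply looks like and how the first label is pulled out of it. The values below are invented for illustration; only the field names follow the Vision API's LABEL_DETECTION response.

# Illustrative reply; the values are made up, the field names are real.
response = {
    'responses': [{
        'labelAnnotations': [
            {'mid': '/m/0bt9lr', 'description': 'dog', 'score': 0.96},
            {'mid': '/m/01z5f', 'description': 'canidae', 'score': 0.93},
            # ...up to maxResults entries, sorted by descending score
        ]
    }]
}

# The loop in main() keeps only the first entry, the one the API
# is most confident about:
first = response['responses'][0]['labelAnnotations'][0]
print(first['description'], first['score'])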

def main():

    # Point the Google client libraries at the service-account key file.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/home/lupe/Documents/Vision2-befbab1fbecb.json"

    API_DISCOVERY_FILE = 'https://vision.googleapis.com/$discovery/rest?version=v1'
    http = httplib2.Http()

    # Load the application default credentials and authorize the HTTP client.
    credentials = GoogleCredentials.get_application_default().create_scoped(
        ['https://www.googleapis.com/auth/cloud-platform'])
    credentials.authorize(http)

    # Open a connection to the Vision API (v1).
    service = build('vision', 'v1', http, discoveryServiceUrl=API_DISCOVERY_FILE)

    # Translate client; results will be translated to Spanish ('es').
    translate_client = translate.Client()
    target = 'es'

    # Announce the photo ("I will take the photo now, please bring the
    # object close to my camera"), then capture it with fswebcam.
    os.system('espeak -v es-la -a 200 "Ahora tomare la foto, por favor acerca el objeto a mi camara"')
    sleep(2)
    os.system('fswebcam -r 640x480 --jpeg 85 -D 1 reconoce.jpg')
    sleep(5)

    # Read the photo, base64-encode it, and build a LABEL_DETECTION request.
    with open("reconoce.jpg", 'rb') as image:
        image_content = base64.b64encode(image.read())
        service_request = service.images().annotate(
            body={
                'requests': [{
                    'image': {
                        'content': image_content.decode("utf-8")
                    },
                    'features': [{
                        'type': 'LABEL_DETECTION',
                        'maxResults': 5,
                    }]
                }]
            })
        response = service_request.execute()

        # Keep only the first label: labels arrive sorted by confidence.
        opcion = ""
        for results in response['responses']:
            if 'labelAnnotations' in results:
                for annotations in results['labelAnnotations']:
                    print('Found label %s, score = %s' % (annotations['description'], annotations['score']))
                    opcion = annotations['description']
                    break

        # Translate the label to Spanish and have the robot speak it
        # ("Creo que es ..." = "I think it is ..."), while showing the photo.
        translation = translate_client.translate(opcion, target_language=target)
        print(translation['translatedText'])
        string = "Creo que es " + str(translation['translatedText'])
        os.system('feh -F reconoce.jpg &')
        os.system('espeak -v es-la -a 200 "{}"'.format(string))
        os.system('killall -9 feh')
        return 0

if __name__ == '__main__':
    main()
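
The translation step can also be tried on its own. This is a minimal sketch using the same legacy google-cloud-translate client the app imports; 'dog' is just an example input, and the client picks up the credentials from GOOGLE_APPLICATION_CREDENTIALS as before.

from google.cloud import translate

translate_client = translate.Client()

# translate() returns a dict; 'translatedText' holds the result.
result = translate_client.translate('dog', target_language='es')
print(result['translatedText'])  # e.g. 'perro'

The app runs the first Vision label through this same call and then hands the resulting sentence ("Creo que es ...") to espeak.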
