# Real-time object detection

This script detects which objects the camera is looking at, for example a person, a chair, a door, etc.

Activate the Python virtual environment:

```
# workon cv
```

The first step is to detect and identify the object that we want to track. To do that, run:

```
# python realDetectionl.py --prototxt MobileNetSSD_deploy.prototxt.txt --model MobileNetSSD_deploy.caffemodel
```

As we can see, the script takes two required parameters: the Caffe deploy prototxt file and the pre-trained Caffe model.

Explaining the code.

First, we import the packages that we are going to use:

```
# import the necessary packages
from imutils.video import VideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2
```

Initialize the command-line parameters that we are going to use:

* Caffe deploy prototxt file
* Caffe pre-trained model
* Minimum confidence (optional, defaults to 0.2)

```
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
        help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
        help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
        help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
```
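To check how these arguments end up in the `args` dictionary, the parser can be fed an explicit list instead of the real command line. The following is a minimal sketch; the `build_parser` helper is a hypothetical wrapper introduced only for this illustration, and the file names are the ones from the command shown earlier.

```python
import argparse


def build_parser():
    # Hypothetical helper: same three arguments as the script above.
    ap = argparse.ArgumentParser()
    ap.add_argument("-p", "--prototxt", required=True,
                    help="path to Caffe 'deploy' prototxt file")
    ap.add_argument("-m", "--model", required=True,
                    help="path to Caffe pre-trained model")
    ap.add_argument("-c", "--confidence", type=float, default=0.2,
                    help="minimum probability to filter weak detections")
    return ap


base = ["--prototxt", "MobileNetSSD_deploy.prototxt.txt",
        "--model", "MobileNetSSD_deploy.caffemodel"]

# Without --confidence, the default threshold of 0.2 is used.
args = vars(build_parser().parse_args(base))
print(args["confidence"])

# Passing -c raises the threshold, so fewer (but stronger) detections survive.
strict = vars(build_parser().parse_args(base + ["-c", "0.5"]))
print(strict["confidence"])
```

A higher `--confidence` value trades missed objects for fewer false boxes; the right value depends on the scene.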

Define the list of object classes that our MobileNet SSD was trained to detect, and generate a random color for each class. The color is used to draw the rectangle on the frame when an object of that class is detected.

```
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
        "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
        "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
        "sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))
```
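The network reports each detection as a numeric class index, so `CLASSES` and `COLORS` act as lookup tables. As a small sketch of that lookup (the index `15` and the confidence `0.75` are made-up values for illustration, not real network output):

```python
import numpy as np

CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]
# One random BGR color per class, each channel in the range [0, 255).
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

idx = 15            # illustrative class index
confidence = 0.75   # illustrative confidence

# Same label format used later when drawing on the frame.
label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
color = COLORS[idx]

print(label)        # person: 75.00%
print(color.shape)  # (3,)
```

Because the colors are random, they change on every run unless a seed is set with `np.random.seed`.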

Initialize the serialized model and the video stream.

```
print("Loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

print("Starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
fps = FPS().start()
```

Grab a frame, resize it to a width of 400 pixels, and then convert it to a blob.

\*This width is chosen to reduce the processing cost of each frame.

```
        frame = vs.read()
        frame = imutils.resize(frame, width=400)

        (h, w) = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                0.007843, (300, 300), 127.5)
```
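The two numeric arguments passed to `cv2.dnn.blobFromImage` are a scale factor (`0.007843`, roughly `1 / 127.5`) and a mean value (`127.5`). Together they map pixel values from the range 0..255 to roughly -1..1. The following NumPy sketch reproduces only that normalization and the channel reordering; it assumes the image is already 300x300, and `to_blob` is a hypothetical helper, not part of OpenCV.

```python
import numpy as np


def to_blob(image_bgr, scale=0.007843, mean=127.5):
    # Subtract the mean first, then multiply by the scale factor.
    normalized = (image_bgr.astype(np.float32) - mean) * scale
    # Reorder from (height, width, channels) to (channels, height, width)
    # and add a leading batch dimension.
    return normalized.transpose(2, 0, 1)[np.newaxis, ...]


# Synthetic 300x300 image: top half white (255), bottom half black (0).
image = np.zeros((300, 300, 3), dtype=np.uint8)
image[:150] = 255

blob = to_blob(image)
print(blob.shape)              # (1, 3, 300, 300)
print(blob.min(), blob.max())  # close to -1 and 1
```

The resulting shape, one image with 3 channels of 300x300 values, is the layout the network receives as input.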

Give the blob to the network so it can start predicting:

```
net.setInput(blob)
detections = net.forward()
```

Loop over the detections and extract the confidence (probability) associated with each prediction.

```
for i in np.arange(0, detections.shape[2]):
        confidence = detections[0, 0, i, 2]
```

We keep only the detections whose confidence is greater than the minimum confidence.

Then we get the index of the class label from the array that we defined and compute the box coordinates, so we can draw the boxes around the objects.

```
if confidence > args["confidence"]:
        # class index and bounding box, scaled back to the frame size
        idx = int(detections[0, 0, i, 1])
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # draw the box and the label on the frame
        label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
        cv2.rectangle(frame, (startX, startY), (endX, endY),
                COLORS[idx], 2)
        # put the label above the box, or inside it near the top edge
        y = startY - 15 if startY - 15 > 15 else startY + 15
        cv2.putText(frame, label, (startX, y),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)
```
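The indexing above implies that each detection row holds the class index at position 1, the confidence at position 2, and the box corners at positions 3 to 6 as fractions of the frame size. The sketch below runs the same filtering and scaling on a hand-written `detections` array, so it works without a camera or model; every value in it is invented for illustration.

```python
import numpy as np

# Hypothetical detections with shape (1, 1, N, 7). Each row is
# [image_id, class_id, confidence, x1, y1, x2, y2], coordinates in 0..1.
detections = np.array([[[
    [0, 15, 0.92, 0.125, 0.25, 0.5, 0.75],  # strong detection, kept
    [0, 9, 0.10, 0.5, 0.5, 0.75, 0.75],     # weak detection, dropped
]]])

(h, w) = (300, 400)    # assumed frame height and width
min_confidence = 0.2   # same default as the --confidence argument

kept = []
for i in np.arange(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > min_confidence:
        idx = int(detections[0, 0, i, 1])
        # Scale the fractional corners to pixel coordinates.
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")
        # Label goes above the box, or inside it near the top edge.
        y = startY - 15 if startY - 15 > 15 else startY + 15
        kept.append((idx, (int(startX), int(startY), int(endX), int(endY)), int(y)))

print(kept)  # [(15, (50, 75, 200, 225), 60)]
```

Only the first row passes the threshold. Its corners are scaled by the frame width and height, and the label baseline lands 15 pixels above the top of the box.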

Finally we show the frame to the user.

```
        cv2.imshow("Frame", frame)
```

