Teaching You Object Detection

Latest 07-12

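Editor's note:

Even if you are the product person, when engineering tells you something cannot be built, you should shove them aside and say "let me do it."

It started when I saw the bald-head detector by @景略集智 and 王司圖. I touched my own hair, noticed it had thinned out quite a bit again, and sank into thought...

Then, once I took off my black-framed glasses, I realized I could not see a thing, so I decided to build a detector for black-framed glasses.

No sooner said than done.

First, find a dataset. Given my feeble machine, I decided to start with just 20 images.

Then I fell for the guy in image 7 (not really).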

Of course, a dataset this small makes overfitting very likely... but I am lazy, qwq.

With a pointer from my buddy 王♂比利, I opened a project called LabelImg, hosted at https://github.com/tzutalin/labelImg

The usual setup routine:

pip install PyQt5

pip install XXX.whl

pip install pyqt5-tools

pip install lxml

After using cd to enter the labelImg-master folder:

pyrcc5 -o resources.py resources.qrc

And right after that:

python labelImg.py

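And then a blank white window popped up?

Next comes labeling. I marked the glasses region as "black glasses" and found the keyboard shortcuts incredibly handy!

Common shortcuts

Next image: D

Previous image: A

Create a rectangle box: W

Save: Ctrl+S

When labeling is finished, an XML file with the same name as each image is generated. Next up is creating the TFRecord files.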

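TFRecord is a binary file format. Traditionally, images and their labels are stored in separate files, whereas in a TFRecord each input image and its associated labels are stored together in the same file. By default TFRecord does not compress the data, so records can be loaded into memory quickly, which suits reading large volumes of data as a stream.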
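To make the "binary file" idea concrete, here is a minimal sketch of the record framing as I understand it from TensorFlow's documentation: each record is a little-endian 64-bit length, a masked CRC-32C of that length, the payload bytes, and a masked CRC-32C of the payload. It uses only the standard library. The helper names (`crc32c`, `masked_crc`, `write_record`, `read_records`) and the payloads are made up for illustration; in a real TFRecord the payload would be a serialized tf.train.Example, and you would use TensorFlow's own writer and reader rather than this.

```python
import io
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli); slow but dependency-free."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def masked_crc(data: bytes) -> int:
    """Rotate the CRC right by 15 bits and add a constant, modulo 2**32."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def write_record(stream, payload: bytes) -> None:
    """Append one framed record: length, CRC of length, payload, CRC of payload."""
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def read_records(stream):
    """Yield payloads one by one, verifying both checksums."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return
        (header_crc,) = struct.unpack("<I", stream.read(4))
        if header_crc != masked_crc(header):
            raise ValueError("corrupt length field")
        (length,) = struct.unpack("<Q", header)
        payload = stream.read(length)
        (payload_crc,) = struct.unpack("<I", stream.read(4))
        if payload_crc != masked_crc(payload):
            raise ValueError("corrupt payload")
        yield payload


# Stand-in payloads; a real file would hold serialized examples.
buffer = io.BytesIO()
for payload in (b"fake-jpeg-bytes + label 1", b"another record"):
    write_record(buffer, payload)
buffer.seek(0)
records = list(read_records(buffer))
print(records)
```

Because every record carries its own length, a reader can stream through the file sequentially without parsing the payloads, which is the property the paragraph above relies on.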

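Then, to be able to check how well training works, we split the labeled images into a training set and a test set, and generate a CSV file for each.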
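The post does not say how the 20 images were divided, so the following is only a sketch of one common approach: shuffle the file names with a fixed seed and hold out a fraction for testing. The function name `split_train_test`, the 80/20 ratio, and the file names are all assumptions for illustration.

```python
import random


def split_train_test(filenames, test_fraction=0.2, seed=42):
    """Return (train, test) lists; the same seed always gives the same split."""
    shuffled = sorted(filenames)  # sort first so the input order does not matter
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names standing in for the 20 collected images.
images = ["img_{:02d}.jpg".format(i) for i in range(1, 21)]
train, test = split_train_test(images)
print(len(train), "training images,", len(test), "test images")
```

Each image and its XML file would then go into a train or test folder accordingly, so that the CSV step can process the two folders separately.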

Here we build the xml_to_csv Python script:

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET


def xml_to_csv(path):
    """Collect every labeled box from the XML files in `path` into a DataFrame."""
    xml_list = []
    for xml_file in glob.glob(path + "/*.xml"):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall("object"):
            # Positional lookups rely on LabelImg's tag order:
            # size = [width, height, ...], object = [name, ..., bndbox]
            value = (root.find("filename").text,
                     int(root.find("size")[0].text),
                     int(root.find("size")[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    for directory in ["train", "test"]:
        # Assumption: the XML files live in images/train and images/test.
        # The original listing left image_path undefined; adjust to your layout.
        image_path = os.path.join(os.getcwd(), "images", directory)
        xml_df = xml_to_csv(image_path)
        xml_df.to_csv("data/{}_labels.csv".format(directory), index=False)
        print("Successfully converted xml to csv.")


main()

Script source: https://github.com/datitran/raccoon_dataset

Run the code with python or in Jupyter, and the CSV files are generated.
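The script above reads fields by position (`member[4][0]` and so on), which quietly depends on the order in which the labeling tool writes its tags. As a hedged alternative, this sketch looks every field up by tag name, using only the standard library. The XML sample is a hand-written, minimal Pascal VOC style annotation, not an actual file from this project, and `parse_voc` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample annotation; the values are invented.
SAMPLE_XML = """
<annotation>
  <filename>img_07.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>black glasses</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>210</xmin><ymin>150</ymin><xmax>430</xmax><ymax>230</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return one dict per labeled box, reading every field by tag name."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
            "class": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return rows


rows = parse_voc(SAMPLE_XML)
print(rows[0])
```

The resulting dicts use the same column names as the CSV, so they could be passed to pandas.DataFrame in place of the tuples if you prefer this style.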

Next, create the script generate_tfrecord.py:

from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

# Note: tf.app.flags, tf.gfile and tf.python_io are TensorFlow 1.x APIs.
flags = tf.app.flags
flags.DEFINE_string("csv_input", "", "Path to the CSV input")
flags.DEFINE_string("output_path", "", "Path to output TFRecord")
FLAGS = flags.FLAGS


# TODO: replace this with a label map
def class_text_to_int(row_label):
    if row_label == "black glasses":
        return 1
    else:
        return None


def split(df, group):
    """Group the CSV rows so that each image yields one (filename, rows) pair."""
    data = namedtuple("data", ["filename", "object"])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    # Read the raw image bytes; they are stored in the record unchanged.
    with tf.gfile.GFile(os.path.join(path, "{}".format(group.filename)), "rb") as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode("utf8")
    image_format = b"jpg"
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    # Box coordinates are normalized to [0, 1] by the image size.
    for index, row in group.object.iterrows():
        xmins.append(row["xmin"] / width)
        xmaxs.append(row["xmax"] / width)
        ymins.append(row["ymin"] / height)
        ymaxs.append(row["ymax"] / height)
        classes_text.append(row["class"].encode("utf8"))
        classes.append(class_text_to_int(row["class"]))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        "image/height": dataset_util.int64_feature(height),
        "image/width": dataset_util.int64_feature(width),
        "image/filename": dataset_util.bytes_feature(filename),
        "image/source_id": dataset_util.bytes_feature(filename),
        "image/encoded": dataset_util.bytes_feature(encoded_jpg),
        "image/format": dataset_util.bytes_feature(image_format),
        "image/object/bbox/xmin": dataset_util.float_list_feature(xmins),
        "image/object/bbox/xmax": dataset_util.float_list_feature(xmaxs),
        "image/object/bbox/ymin": dataset_util.float_list_feature(ymins),
        "image/object/bbox/ymax": dataset_util.float_list_feature(ymaxs),
        "image/object/class/text": dataset_util.bytes_list_feature(classes_text),
        "image/object/class/label": dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    # Assumption: the images named in the CSV sit in ./images.
    # The original listing left `path` undefined; adjust to your layout.
    path = os.path.join(os.getcwd(), "images")
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, "filename")
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print("Successfully created the TFRecords: {}".format(output_path))


if __name__ == "__main__":
    tf.app.run()
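Two steps in generate_tfrecord.py are easy to miss: rows are grouped by file name, so that one image with several boxes becomes a single example, and pixel coordinates are divided by the image size, so that every box coordinate lies between 0 and 1. The sketch below reproduces just those two steps with the standard library. The rows are invented sample data, `group_and_normalize` is a hypothetical helper, and it takes width and height from the CSV columns, whereas the script reads them from the image itself.

```python
from collections import defaultdict

# Invented sample rows in the same column layout as the CSV files.
ROWS = [
    {"filename": "a.jpg", "width": 640, "height": 480,
     "class": "black glasses", "xmin": 160, "ymin": 120, "xmax": 480, "ymax": 240},
    {"filename": "a.jpg", "width": 640, "height": 480,
     "class": "black glasses", "xmin": 0, "ymin": 0, "xmax": 320, "ymax": 480},
    {"filename": "b.jpg", "width": 200, "height": 100,
     "class": "black glasses", "xmin": 50, "ymin": 25, "xmax": 150, "ymax": 75},
]


def group_and_normalize(rows):
    """Map each filename to a list of (xmin, ymin, xmax, ymax) tuples in [0, 1]."""
    grouped = defaultdict(list)
    for row in rows:
        w, h = row["width"], row["height"]
        grouped[row["filename"]].append(
            (row["xmin"] / w, row["ymin"] / h, row["xmax"] / w, row["ymax"] / h)
        )
    return dict(grouped)


boxes = group_and_normalize(ROWS)
for name, normalized in boxes.items():
    print(name, normalized)
```

Normalizing this way keeps the labels valid when the training pipeline later resizes the images.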

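And then... it threw an error...

I checked and found that the package it imports was not installed. QwQ, so dumb I could cry~

So download the models repository from https://github.com/tensorflow/models

After installing the package, I ran these commands in CMD: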

python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record

python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record

Yes, here we have put all the CSV files in the data folder.

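Along the way I also read a few CSDN articles, and one guru's title reminded me of running in the sunset~

Then I came back, took a look, and... was completely lost...

It turned out to be a file path problem... (flips table)

The cause was that my computer has two Anaconda installations and three Pythons. Maybe my computer's name is Virus?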
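With several Python installations on one machine, a quick check of which interpreter is actually running, and whether it can see the needed packages, often explains an error like this one. This is a generic diagnostic sketch, not what the author ran. The `describe_environment` name is hypothetical, and the package list is taken from the imports in generate_tfrecord.py.

```python
import importlib.util
import sys


def describe_environment(packages):
    """Report the running interpreter and which top-level packages it can import."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "available": {name: importlib.util.find_spec(name) is not None
                      for name in packages},
    }


report = describe_environment(["pandas", "tensorflow", "PIL", "object_detection"])
print("Interpreter:", report["executable"], "(Python", report["version"] + ")")
for name, found in report["available"].items():
    print("  {:<18}{}".format(name, "found" if found else "MISSING"))
```

Running it with each installed interpreter in turn shows which one has the packages, and that is the one to use for the generate_tfrecord.py commands.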

After a lot of fiddling, the long-awaited files finally appeared, and I broke into a doting smile (huh?)

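Getting this far counts as being halfway there, but as the saying goes, on a hundred-li journey the ninety-li mark is only the halfway point.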

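Next comes feeding the training set to the code~

So we head over to the API's model repository to pick out the chubbiest model, the one that digests what it is fed the fastest.

We went with ssd_mobilenet_v1_coco. Awesome~

A quick comparison makes it obvious~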


Tag: 生而產品