TensorFlow Architecture Analysis and Custom Ops

TensorFlow architecture

kernel

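As far as I can tell, the lowest-level implementation in TensorFlow is the kernel, found under tensorflow/tensorflow/core/kernels/. This directory contains the CPU and GPU implementations of common operations such as conv and pooling, along with their gradient computations. Each kernel subclasses OpKernel, implements the Compute function, and is finally registered with TensorFlow by calling REGISTER_KERNEL_BUILDER.

In tensorflow/tensorflow/core/kernels/BUILD, a tf_kernel_library rule is defined for each kernel: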

tf_kernel_library(
    name = "control_flow_ops",
    prefix = "control_flow_ops",
    deps = [
        "//tensorflow/core:control_flow_ops_op_lib",
        "//tensorflow/core:framework",
        "//tensorflow/core:lib",
    ],
)

tensorflow/tensorflow/core/ops builds on top of the kernels: it calls REGISTER_OP to add attribute constraints, shape functions, and so on:

REGISTER_OP("Conv2D")
    .Input("input: T")
    .Input("filter: T")
    .Output("output: T")
    .Attr("T: {half, float, double}")
    .Attr("strides: list(int)")
    .Attr("use_cudnn_on_gpu: bool = true")
    .Attr(GetPaddingAttrString())
    .Attr(GetConvnetDataFormatAttrString())
    .SetShapeFn(shape_inference::Conv2DShape)
    .Doc(R"doc(
Computes a 2-D convolution given 4-D `input` and `filter` tensors.

Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
and a filter / kernel tensor of shape
`[filter_height, filter_width, in_channels, out_channels]`, this op
performs the following:

1. Flattens the filter to a 2-D matrix with shape
   `[filter_height * filter_width * in_channels, output_channels]`.
2. Extracts image patches from the input tensor to form a *virtual*
   tensor of shape `[batch, out_height, out_width,
   filter_height * filter_width * in_channels]`.
3. For each patch, right-multiplies the filter matrix and the image patch
   vector.

In detail, with the default NHWC format,

    output[b, i, j, k] =
        sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                        filter[di, dj, q, k]

Must have `strides[0] = strides[3] = 1`. For the most common case of the same
horizontal and vertices strides, `strides = [1, stride, stride, 1]`.

strides: 1-D of length 4. The stride of the sliding window for each dimension
  of `input`. Must be in the same order as the dimension specified with format.
padding: The type of padding algorithm to use.
data_format: Specify the data format of the input and output data. With the
    default format "NHWC", the data is stored in the order of:
        [batch, in_height, in_width, in_channels].
    Alternatively, the format could be "NCHW", the data storage order of:
        [batch, in_channels, in_height, in_width].
)doc");
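The NHWC formula in the docstring can be checked by hand with a small pure-Python sketch. It assumes VALID padding (only fully covered windows are evaluated) and uses nested lists in place of tensors. The helper name conv2d_nhwc_valid is hypothetical, and this is not how the TensorFlow kernel itself is implemented.

```python
def conv2d_nhwc_valid(inp, filt, strides):
    """Naive NHWC convolution following the docstring formula (VALID padding assumed).

    inp:     nested lists shaped [batch][in_height][in_width][in_channels]
    filt:    nested lists shaped [filter_height][filter_width][in_channels][out_channels]
    strides: [1, stride_h, stride_w, 1]
    """
    if strides[0] != 1 or strides[3] != 1:
        raise ValueError("strides[0] and strides[3] must be 1")
    batch, in_h, in_w = len(inp), len(inp[0]), len(inp[0][0])
    f_h, f_w = len(filt), len(filt[0])
    in_c, out_c = len(filt[0][0]), len(filt[0][0][0])
    # Output size for VALID padding: number of windows that fit entirely.
    out_h = (in_h - f_h) // strides[1] + 1
    out_w = (in_w - f_w) // strides[2] + 1

    out = [[[[0] * out_c for _ in range(out_w)] for _ in range(out_h)]
           for _ in range(batch)]
    for b in range(batch):
        for i in range(out_h):
            for j in range(out_w):
                for k in range(out_c):
                    # sum_{di, dj, q} input[b, s1*i + di, s2*j + dj, q] * filter[di, dj, q, k]
                    out[b][i][j][k] = sum(
                        inp[b][strides[1] * i + di][strides[2] * j + dj][q]
                        * filt[di][dj][q][k]
                        for di in range(f_h)
                        for dj in range(f_w)
                        for q in range(in_c)
                    )
    return out


# 1x3x3x1 input and a 2x2x1x1 filter of ones: each output is a 2x2 window sum.
image = [[[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]]
ones = [[[[1]], [[1]]], [[[1]], [[1]]]]
result = conv2d_nhwc_valid(image, ones, [1, 1, 1, 1])
```

With stride 1 the four outputs are the sums of the four overlapping 2x2 windows of the 3x3 image.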

ops

In $(tensorflow)/tensorflow/core/BUILD, the ops are linked into the core TensorFlow library here:

cc_library(
    name = "ops",
    visibility = ["//visibility:public"],
    deps = [
        ":array_ops_op_lib",
        ":candidate_sampling_ops_op_lib",
        ":control_flow_ops_op_lib",
        ":ctc_ops_op_lib",
        ":data_flow_ops_op_lib",
        ":function_ops_op_lib",
        ":functional_ops_op_lib",
        ":image_ops_op_lib",
        ":io_ops_op_lib",
        ":linalg_ops_op_lib",
        ":logging_ops_op_lib",
        ":math_ops_op_lib",
        ":nn_ops_op_lib",
        ":no_op_op_lib",
        ":parsing_ops_op_lib",
        ":random_ops_op_lib",
        ":script_ops_op_lib",
        ":sendrecv_ops_op_lib",
        ":sparse_ops_op_lib",
        ":state_ops_op_lib",
        ":string_ops_op_lib",
        ":training_ops_op_lib",
        ":user_ops_op_lib",
        "//tensorflow/models/embedding:word2vec_ops",
    ],
    alwayslink = 1,
)

and the kernels are linked in here

# This includes implementations of all kernels built into TensorFlow.
cc_library(
    name = "all_kernels",
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/core/kernels:array",
        "//tensorflow/core/kernels:candidate_sampler_ops",
        "//tensorflow/core/kernels:control_flow_ops",
        "//tensorflow/core/kernels:ctc_ops",
        "//tensorflow/core/kernels:data_flow",
        "//tensorflow/core/kernels:fact_op",
        "//tensorflow/core/kernels:image",
        "//tensorflow/core/kernels:io",
        "//tensorflow/core/kernels:linalg",
        "//tensorflow/core/kernels:logging",
        "//tensorflow/core/kernels:math",
        "//tensorflow/core/kernels:multinomial_op",
        "//tensorflow/core/kernels:nn",
        "//tensorflow/core/kernels:parameterized_truncated_normal_op",
        "//tensorflow/core/kernels:parsing",
        "//tensorflow/core/kernels:random_ops",
        "//tensorflow/core/kernels:required",
        "//tensorflow/core/kernels:sparse",
        "//tensorflow/core/kernels:state",
        "//tensorflow/core/kernels:string",
        "//tensorflow/core/kernels:training_ops",
        "//tensorflow/models/embedding:word2vec_kernels",
    ],
)

python api

In tensorflow/tensorflow/python/BUILD, the Python wrapper functions that user code calls are first generated under bazel-genfiles/tensorflow/python/ops/:

tf_gen_op_wrapper_private_py(
    name = "nn_ops_gen",
    require_shape_functions = True,
)

# tensorflow.bzl
def tf_gen_op_wrapper_private_py(name, out=None, deps=[], require_shape_functions=False):
    if not name.endswith("_gen"):
        fail("name must end in _gen")
    bare_op_name = name[:-4]  # Strip off the _gen
    tf_gen_op_wrapper_py(name=bare_op_name,
                         out=out,
                         hidden_file="ops/hidden_ops.txt",
                         visibility=["//visibility:private"],
                         deps=deps,
                         require_shape_functions=require_shape_functions,
                         generated_target_name=name,
    )
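The name handling in the macro can be tried out in plain Python. The sketch below mirrors the "_gen" suffix check and stripping. The generated_module_path helper is hypothetical: the "ops/gen_<name>.py" pattern is an assumption inferred from the gen_nn_ops module name used in this article, not read from the macro shown here.

```python
def bare_op_name(rule_name):
    """Mirror the macro's check: require the '_gen' suffix, then strip it."""
    if not rule_name.endswith("_gen"):
        raise ValueError("name must end in _gen")
    return rule_name[:-4]  # drop the 4 characters of "_gen"


def generated_module_path(rule_name):
    """Assumed default output path (inferred from the gen_nn_ops module name)."""
    return "ops/gen_" + bare_op_name(rule_name) + ".py"


module_path = generated_module_path("nn_ops_gen")
```

Under that assumption, the rule named nn_ops_gen yields the module that is later imported as gen_nn_ops.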

The files generated above are ultimately imported by the modules in the tensorflow/tensorflow/python/ops directory:

# tensorflow/tensorflow/python/ops/nn_ops.py
...
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import gen_nn_ops
from tensorflow.python.ops import math_ops
...

Finally, a py_library is produced:

py_library(
    name = "nn_ops",
    srcs = ["ops/nn_ops.py"],
    srcs_version = "PY2AND3",
    deps = [
        ":array_ops_gen",
        ":framework",
        ":math_ops",
        ":nn_ops_gen",
        ":random_ops",
    ],
)

user ops

tensorflow/tensorflow/core/user_ops/ in effect combines the roles of tensorflow/tensorflow/core/kernels/ and tensorflow/tensorflow/core/ops/. tensorflow/tensorflow/core/BUILD references user_ops_op_lib:

cc_library(
    name = "ops",
    visibility = ["//visibility:public"],
    deps = [
        ":array_ops_op_lib",
        ":candidate_sampling_ops_op_lib",
        ":control_flow_ops_op_lib",
        ":ctc_ops_op_lib",
        ":data_flow_ops_op_lib",
        ":function_ops_op_lib",
        ":functional_ops_op_lib",
        ":image_ops_op_lib",
        ":io_ops_op_lib",
        ":linalg_ops_op_lib",
        ":logging_ops_op_lib",
        ":math_ops_op_lib",
        ":nn_ops_op_lib",
        ":no_op_op_lib",
        ":parsing_ops_op_lib",
        ":random_ops_op_lib",
        ":script_ops_op_lib",
        ":sendrecv_ops_op_lib",
        ":sparse_ops_op_lib",
        ":state_ops_op_lib",
        ":string_ops_op_lib",
        ":training_ops_op_lib",
        ":user_ops_op_lib",  # here
        "//tensorflow/models/embedding:word2vec_ops",
    ],
    alwayslink = 1,
)

...

# And one for all user ops
cc_library(
    name = "user_ops_op_lib",
    srcs = glob(["user_ops/**/*.cc"]),
    copts = tf_copts(),
    linkstatic = 1,
    visibility = ["//visibility:public"],
    deps = [":framework"],
    alwayslink = 1,
)
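Because user_ops_op_lib collects its sources with glob(["user_ops/**/*.cc"]), any .cc file dropped into user_ops/ is picked up without editing the rule. The sketch below approximates that pattern with Python's pathlib in a temporary directory. Bazel's glob has its own semantics, so treat this only as an illustration; the nested/other.cc and README.md files are hypothetical.

```python
import tempfile
from pathlib import Path


def find_user_op_sources(core_dir):
    """Return the .cc files under core_dir/user_ops, relative to core_dir, sorted."""
    root = Path(core_dir)
    return sorted(p.relative_to(root).as_posix()
                  for p in root.glob("user_ops/**/*.cc"))


with tempfile.TemporaryDirectory() as tmp:
    user_ops = Path(tmp) / "user_ops"
    (user_ops / "nested").mkdir(parents=True)
    (user_ops / "fact.cc").touch()
    (user_ops / "zero_out.cc").touch()          # the new op source file
    (user_ops / "nested" / "other.cc").touch()  # hypothetical nested source
    (user_ops / "README.md").touch()            # ignored: not a .cc file
    matched = find_user_op_sources(tmp)
```

The "**" component matches zero or more directory levels, so both top-level and nested sources are included.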

It is then referenced in tensorflow/tensorflow/python:

tf_gen_op_wrapper_py(
    name = "user_ops",
    hidden = [
        "Fact",
    ],
    require_shape_functions = False,
)

Adding an op

Define the Op's interface

You define the interface of an Op by registering it with the TensorFlow system. In the registration, you specify the name of your Op, its inputs (types and names) and outputs (types and names), as well as docstrings and any attrs the Op might require.

To see how this works, suppose you'd like to create an Op that takes a tensor of int32s and outputs a copy of the tensor, with all but the first element set to zero. Create file tensorflow/core/user_ops/zero_out.cc and add a call to the REGISTER_OP macro that defines the interface for such an Op:

#include "tensorflow/core/framework/op.h"

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32");

This ZeroOut Op takes one tensor to_zero of 32-bit integers as input, and outputs a tensor zeroed of 32-bit integers.

A note on naming: The name of the Op should be unique and CamelCase. Names starting with an underscore (_) are reserved for internal use.
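The naming rules can be expressed as a small check. This is a simplified sketch: the regular expression (an uppercase first letter followed by letters and digits) is an assumption about what counts as CamelCase here, not TensorFlow's own validation, and both helper names are hypothetical.

```python
import re

# Simplified pattern: starts with an uppercase letter, then letters or digits only.
# A leading underscore therefore never matches (reserved for internal use).
_CAMEL_CASE = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_valid_user_op_name(name):
    """Return True if the name looks like a CamelCase user op name."""
    return _CAMEL_CASE.fullmatch(name) is not None


def register_op_name(registered, name):
    """Add name to the set of registered names, enforcing format and uniqueness."""
    if not is_valid_user_op_name(name):
        raise ValueError("invalid op name: %r" % name)
    if name in registered:
        raise ValueError("op name already registered: %r" % name)
    registered.add(name)


names = set()
register_op_name(names, "ZeroOut")
```

Registering "ZeroOut" a second time, or registering "_ZeroOut" or "zero_out", raises ValueError in this sketch.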

Implement the kernel for the Op

After you define the interface, provide one or more implementations of the Op. To create one of these kernels, create a class that extends OpKernel and overrides the Compute method. The Compute method provides one context argument of type OpKernelContext*, from which you can access useful things like the input and output tensors.

Important note: Instances of your OpKernel may be accessed concurrently. Your Compute method must be thread-safe. Guard any access to class members with a mutex (Or better yet, don't share state via class members! Consider using a ResourceMgr to keep track of Op state).
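The thread-safety advice can be illustrated with a Python analogy. The CountingKernel class below is a hypothetical stand-in, not a TensorFlow type: its output depends only on its input, so that part needs no lock, while the shared call counter is a class member and is guarded with a mutex.

```python
import threading


class CountingKernel:
    """Toy stand-in for a kernel that keeps mutable state in a member."""

    def __init__(self):
        self._lock = threading.Lock()
        self.calls = 0

    def compute(self, values):
        # Pure function of the input: safe to run concurrently without a lock.
        result = [v if i == 0 else 0 for i, v in enumerate(values)]
        # Shared member update is a read-modify-write, so guard it.
        with self._lock:
            self.calls += 1
        return result


kernel = CountingKernel()


def worker():
    for _ in range(1000):
        kernel.compute([1, 2, 3])


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Dropping the counter altogether would make the class stateless, which is the "better yet" option the note recommends.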

Add your kernel to the file you created above. The kernel might look something like this:

#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();

    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }

    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};
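For checking expected outputs without building TensorFlow, the kernel's behavior can be mirrored in a few lines of Python. This is a reference sketch over a flat list (the C++ kernel works on the flattened tensor); the name zero_out_reference is hypothetical.

```python
def zero_out_reference(values):
    """Return a copy of values with every element except the first set to 0."""
    output = [0] * len(values)
    # Preserve the first input value if there is one (mirrors the N > 0 check).
    if values:
        output[0] = values[0]
    return output


expected = zero_out_reference([1, 2, 3])
```

An empty input produces an empty output, matching the guard on N in the C++ code.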

After implementing your kernel, you register it with the TensorFlow system. In the registration, you specify different constraints under which this kernel will run. For example, you might have one kernel made for CPUs, and a separate one for GPUs.

To do this for the ZeroOut op, add the following to zero_out.cc:

REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
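Conceptually, registration associates an op name and a set of constraints with a kernel class. The sketch below models only the device constraint with a dictionary keyed on (op name, device). The registry, decorator, and class names are all hypothetical and say nothing about TensorFlow's internal data structures.

```python
KERNEL_REGISTRY = {}


def register_kernel(op_name, device):
    """Decorator that records a kernel class for an (op name, device) pair."""
    def decorator(kernel_cls):
        key = (op_name, device)
        if key in KERNEL_REGISTRY:
            raise ValueError("kernel already registered for %r" % (key,))
        KERNEL_REGISTRY[key] = kernel_cls
        return kernel_cls
    return decorator


@register_kernel("ZeroOut", "CPU")
class ZeroOutCpu:
    def compute(self, values):
        return [v if i == 0 else 0 for i, v in enumerate(values)]


def find_kernel(op_name, device):
    """Look up the kernel class for a device, failing clearly if none exists."""
    try:
        return KERNEL_REGISTRY[(op_name, device)]
    except KeyError:
        raise LookupError(
            "no kernel registered for op %r on device %r" % (op_name, device)
        ) from None


cpu_result = find_kernel("ZeroOut", "CPU")().compute([1, 2, 3])
```

Since only a CPU kernel is registered, asking for a "GPU" kernel raises LookupError, which is the situation a separate GPU registration would resolve.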

Adding the rule

In $(tensorflow)/tensorflow/python, modify the user_ops rule as follows:

tf_gen_op_wrapper_py(
    name = "user_ops",
    hidden = [
        "Fact",
        "ZeroOut"
    ],
    require_shape_functions = False,
)

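Implementing the Python interface

Add the following code to $(tensorflow)/tensorflow/python/user_ops/user_ops.py. Because ZeroOut is listed as hidden, the generated wrapper is named _zero_out, so don't forget the leading underscore:

def zero_out(arg):
    return gen_user_ops._zero_out(arg)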
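The relationship between the op name, the hidden list, and the wrapper name can be sketched as follows. The CamelCase to snake_case rule here is a simplified assumption that reproduces the names seen in this article; it is not the code generator's actual logic, and python_wrapper_name is a hypothetical helper.

```python
import re


def python_wrapper_name(op_name, hidden_ops):
    """Derive the generated Python function name for an op (simplified rule)."""
    # Insert "_" before an uppercase letter that follows a lowercase letter.
    snake = re.sub(r"(?<=[a-z])([A-Z])", r"_\1", op_name).lower()
    # Hidden ops get a leading underscore, so they are private to the module.
    return "_" + snake if op_name in hidden_ops else snake


hidden = ["Fact", "ZeroOut"]
wrapper = python_wrapper_name("ZeroOut", hidden)
```

Without "ZeroOut" in the hidden list, the same helper returns the public name zero_out.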

Building

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

Testing

$ python
>>> import tensorflow as tf
>>> a = tf.user_ops.zero_out([1, 2, 3])
>>> sess = tf.Session()
>>> sess.run(a)
array([1, 0, 0], dtype=int32)

References

http://bingotree.cn/?p=862

---------------------

Author: 沒出沒

Original: https://blog.csdn.net/wqzghost/article/details/52192462

Copyright notice: This is an original article by the blogger. Please include a link to the post when reposting.
