A Gesture-Recognition Human-Computer Interaction System Based on the MAX78000

码农世界, 2024-05-24

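Project page: https://www.eetree.cn/project/2604

Video: 基于MAX78000的手势识别人机交互系统 (bilibili)

(The project is a bit rough. I thought there was plenty of time, so I worked on it for a while right after receiving the board and then left it untouched. When I noticed the deadline was close, it collided with my final exams, so I had to wrap it up in a hurry.)

The model is a modified resnet18. Only the final version of the code is kept here.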

conda activate Maxim

(1) To use a different dataset, store it under the path inside ai8x-training-data and generate the txt file.

(2) Change the data-loading path in gesture.py under ai8x-training-datasets.
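
The post does not show the layout of the txt file, so the following is only a minimal sketch under an assumed convention: one sub-folder per gesture class, and one line per image in the form "class_dir/file label". The function names build_index_lines and write_index are hypothetical, and the format has to match whatever gesture.py actually parses.

```python
import os


def build_index_lines(root, exts=(".jpg", ".jpeg", ".png")):
    """Return one 'class_dir/file label' line per image found under root.

    Assumed layout (not from the original post): root/<class_name>/<image>.
    Labels are the indices of the alphabetically sorted class folders.
    """
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    lines = []
    for label, cls in enumerate(classes):
        for name in sorted(os.listdir(os.path.join(root, cls))):
            if name.lower().endswith(exts):
                lines.append(f"{cls}/{name} {label}")
    return lines


def write_index(root, txt_path):
    """Write the index lines to txt_path and return how many were written."""
    lines = build_index_lines(root)
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return len(lines)
```

Calling write_index on the dataset folder with a path such as train.txt would then produce the list file; the reader in gesture.py should split each line on the last space to recover the path and the label.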

I. Workflow and some details

Training

Under ai8x-training:

bash scripts/train-gesture.sh

#!/bin/sh
python train.py --epochs 200  --optimizer Adam --lr 0.001 --batch-size 64 --gpus 0 --deterministic --compress policies/schedule-gesture.yaml --model ai85net_gesture --dataset gesture --param-hist --pr-curves --embedding --device MAX78000 "$@"

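Copy the files produced by training (found in the log directory) into self_proj_gesture in ai8x-synthesis. The important one is the QAT checkpoint file, which the commands below reference as self_proj/gesture/qat_best.pth.tar.

I commented out two lines in ai8x-training/ai8x.py:

1808  :# b = target_attr.op.bias.data

1830  :# target_attr.op.bias.data = b_new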

Quantization

Under ai8x-synthesis:

scripts/quantize_gesture.sh

#!/bin/sh
python quantize.py self_proj/gesture/qat_best.pth.tar self_proj/gesture/qat_best-ai8x-q.pth.tar --device MAX78000 -v "$@"

I changed one line in ai8x-synthesis/izer/quantize.py:

159 :# bias_name = '.'.join([layer, operation, 'bias'])

160 : bias_name = 'bias'

Evaluation

Under ai8x-training:

scripts/evaluate_gesture.sh

#!/bin/sh
python ./train.py --model ai85net_gesture --dataset gesture --confusion --evaluate --exp-load-weights-from ../ai8x-synthesis/self_proj/gesture/qat_best-ai8x-q.pth.tar -8 --device MAX78000 "$@"

Generating the npy file

In ai8x-training:

./train.py --model ai85net_gesture --save-sample 10 --dataset gesture --evaluate --exp-load-weights-from ../ai8x-synthesis/self_proj/gesture/qat_best-ai8x-q.pth.tar -8 --device MAX78000 "$@"

This generates sample_fpr2.npy in the ai8x-training directory.

Move it to the test directory: ai8x-synthesis/tests/sample_fpr2.npy

Model conversion

In ai8x-synthesis:

./ai8xize.py --verbose --test-dir demos --prefix ai85-gesture --checkpoint-file self_proj/gesture/qat_best-ai8x-q.pth.tar --config-file networks/gesture-chw.yaml --device MAX78000 --compact-data --mexpress --softmax --overwrite

The YAML configuration file used for the conversion is given below for reference. Read it side by side with the model:

YAML file

---
# FaceNet sequential model ending with avg_pool, CHW(big data) data_format
arch: ai85net_gesture
dataset: gesture
layers:
  - out_offset: 0x1000
    processors: 0x0000000000000007
    operation: conv2d
    max_pool: 1
    pool_stride: 3
    pad: 1
    kernel_size: 3x3
    activate: ReLU
    data_format: HWC
  - processors: 0x0ffff00000000000 # 16
    out_offset: 0x0000
    operation: conv2d
    activate: ReLU
    write_gap: 1
    max_pool: 1
    pool_stride: 2
    kernel_size: 3x3
    pad: 1
    output_processors: 0x00000000ffffffff # 32
  - processors: 0x00000000ffffffff
    out_offset: 0x2000
    operation: passthrough
    write_gap: 1
    output_processors: 0x00000000ffffffff # 32
    name: res1
  - pad: 1
    operation: conv2d
    kernel_size: 3x3
    activate: ReLU
    out_offset: 0x4000
    processors: 0x00000000ffffffff
  - operation: conv2d
    out_offset: 0x2004
    kernel_size: 3x3
    pad: 1
    name: res2
    write_gap: 1
    processors: 0x00000000ffffffff
# layer4 + blk2
  - in_sequences: [res1, res2]
    processors: 0x00000000ffffffff
    in_offset: 0x2000
    out_offset: 0x0000
    operation: conv2d
    eltwise: add
    max_pool: 1
    pool_stride: 2
    kernel_size: 3x3
    pad: 1
  - processors: 0x00000000ffffffff
    out_offset: 0x2000
    operation: passthrough
    write_gap: 1
    output_processors: 0x00000000ffffffff
    name: res3
  - pad: 1
    operation: conv2d
    kernel_size: 3x3
    activate: ReLU
    out_offset: 0x4000
    processors: 0x00000000ffffffff
  - operation: conv2d
    out_offset: 0x2004
    kernel_size: 3x3
    pad: 1
    name: res4
    write_gap: 1
    processors: 0x00000000ffffffff
# layer8 + blk3
  - in_sequences: [res3, res4]
    processors: 0x00000000ffffffff
    in_offset: 0x2000
    out_offset: 0x0000
    operation: conv2d
    eltwise: add
    max_pool: 1
    pool_stride: 2
    kernel_size: 3x3
    pad: 1
  - processors: 0xffffffffffffffff # 64
    out_offset: 0x2000
    operation: passthrough
    write_gap: 1
    output_processors: 0xffffffffffffffff
    name: res5
  - pad: 1
    operation: conv2d
    kernel_size: 3x3
    activate: ReLU
    out_offset: 0x4000
    processors: 0xffffffffffffffff
  - operation: conv2d
    out_offset: 0x2004
    kernel_size: 3x3
    pad: 1
    name: res6
    write_gap: 1
    processors: 0xffffffffffffffff
# layer12 + blk4
  - in_sequences: [res5, res6]
    processors: 0xffffffffffffffff
    in_offset: 0x2000
    out_offset: 0x0000
    operation: conv2d
    eltwise: add
    max_pool: 1
    pool_stride: 2
    kernel_size: 3x3
    pad: 1
  - processors: 0xffffffffffffffff
    out_offset: 0x2000
    operation: passthrough
    write_gap: 1
    output_processors: 0xffffffffffffffff
    name: res7
  - pad: 1
    operation: conv2d
    kernel_size: 3x3
    activate: ReLU
    out_offset: 0x4000
    processors: 0xffffffffffffffff
  - operation: conv2d
    out_offset: 0x2004
    kernel_size: 3x3
    pad: 1
    name: res8
    write_gap: 1
    processors: 0xffffffffffffffff
# layer16
  - in_sequences: [res7, res8]
    in_offset: 0x2000
    out_offset: 0x0000
    eltwise: add
    avg_pool: [2,2] # 64*2*2 -> 64*1*1; with [1,1] it turns 64*2*2 into 16*2*2, which does not work
    pool_stride: 1
    operation: None
    processors: 0xffffffffffffffff
    output_processors: 0xffffffffffffffff
# Layer 18 - LINEAR
  - out_offset: 0x2000
    processors: 0xffffffffffffffff
    output_processors: 0x00000000000000f9
    operation: fc
    flatten: true
    output_width: 32

Model:

import torch
import torch.nn as nn
from torch.nn import functional as F
import ai8x
class ResBlk(nn.Module):
    """
    resnet block
    """
    def __init__(self, ch_in, ch_out, stride=1, bias=False, **kwargs):  # input and output channel counts must be passed in
        """
        :param ch_in:
        :param ch_out:
        """
        super(ResBlk, self).__init__()
        self.ch_in = ch_in
        self.ch_out = ch_out
        self.conv1 = ai8x.FusedMaxPoolConv2dReLU(ch_in, ch_out, kernel_size=3, pool_size=1, pool_stride=2, stride=stride, padding=1, bias=bias, **kwargs)	
        self.conv2 = ai8x.FusedConv2dReLU(ch_out, ch_out, kernel_size=3,stride=1, padding=1, bias=bias, **kwargs)
#        self.conv2 = ai8x.FusedConv2dReLU(ch_out, ch_out, kernel_size=3, stride=1, padding=1, #bias=bias, **kwargs)
        #self.conv1 = ai8x.Conv2d(ch_in, ch_out, kernel_size=3, stride=stride, padding=1)
        #self.bn1 = nn.BatchNorm2d(ch_out)
        #self.conv2 = ai8x.Conv2d(ch_out, ch_out, kernel_size=3, stride=1, padding=1)
        #self.bn2 = nn.BatchNorm2d(ch_out)
#        self.extra = nn.Sequential()
#        if ch_out != ch_in:
#            # [b, ch_in, h, w] => [b, ch_out, h, w]
#            self.extra = nn.Sequential(
#                ai8x.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride, bias=bias, **kwargs),
#                #nn.BatchNorm2d(ch_out)
#            )
        self.resid1 = ai8x.Add()
 
        self.extra = ai8x.Conv2d(ch_out, ch_out, kernel_size=3, stride=stride, padding=1, bias=bias, **kwargs)
#        self.extra = ai8x.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride, bias=bias, **kwargs)
            
    def forward(self, x):
        """
        :param x: [b, ch, h, w]
        :return:
        """
        #out = F.relu(self.bn1(self.conv1(x)))
        #out = self.bn2(self.conv2(out))
        #print("out1:", x)
        x = self.conv1(x)
        #print("x:", x.shape)
        out = self.conv2(x)
        #print("out:", x.shape)
        # short cut.
        # extra module: [b, ch_in, h, w] => [b, ch_out, ch_out, h, w]
        # element-wise add:
        # out = self.extra(x) + out
#        if self.ch_in != self.ch_out:
#            out = self.extra(x) + out
#        else:
#            out = x + out
#        out = self.extra(x) + out
        out = self.extra(out) 
        #print("out2:", out.shape)
        out = self.resid1(out, x)
        #print("out3:", out.shape)
        return out
class ResNet18(nn.Module):
    def __init__(self,num_classes=6, num_channels=1,dimensions=(64, 64),  bias=False, **kwargs):
        super(ResNet18, self).__init__()
        #self.conv1 = nn.Sequential(
        #    ai8x.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
        #    nn.BatchNorm2d(64)
        #)
#        self.conv1 = ai8x.FusedConv2dReLU(3, 32, kernel_size=3, stride=3, padding=0, bias=bias, #**kwargs)
        self.conv2 = ai8x.FusedMaxPoolConv2dReLU(3, 16, kernel_size=3, pool_size=1, pool_stride=3, stride=1, padding=1, bias=bias, **kwargs)
        # With stride=2, padding=1 this would give 64->32 instead
        # followed 4 blocks
        # [b, 64, h, w] => [b, 128, h, w]       # input 22*22
        self.blk1 = ResBlk(16, 32, stride=1)  # 11*11
        # [b, 128, h, w] => [b, 256, h, w]
        self.blk2 = ResBlk(32, 32, stride=1)  # 6*6
        # [b, 256, h, w] => [b, 512, h, w]
        self.blk3 = ResBlk(32, 64, stride=1)  # 3*3
        # [b, 512, h, w] => [b, 1024, h, w]
        self.blk4 = ResBlk(64, 64, stride=1)  # 2*2
        #self.out = ai8x.Conv2d(512, 256, kernel_size=1, stride=1, bias=bias, **kwargs)
        #self.out2 = ai8x.Conv2d(256, 128, kernel_size=1, stride=1, bias=bias, **kwargs)
        self.outlayer = ai8x.Linear(64 * 1 * 1, num_classes)
    def forward(self, x):
        """
        :param x:
        :return:
        """
        # x = F.relu(self.conv1(x))
        # print(x.shape) # torch.Size([64, 9, 64, 64])
        # x = x[:, :3, :, :]
        #print(x.shape) # torch.Size([3, 64, 64])
        #x = self.conv1(x)
        x = self.conv2(x)
        #print("x1:",x.shape)
        # [b, 64, h, w] => [b, 1024, h, w]
        x = self.blk1(x)
        #print("x2:",x.shape)
        x = self.blk2(x)
        #print("x3:",x.shape)
        x = self.blk3(x)
        #print("x4:",x.shape)
        x = self.blk4(x)
        #print("x5:",x.shape)
        # print('after conv:', x.shape) # [b, 512, 2, 2]
        # [b, 512, h, w] => [b, 512, 1, 1]
        # Whatever the input size is, this average pooling always produces a [1, 1] output
        x = F.adaptive_avg_pool2d(x, [1, 1])
        #print("x6:",x.shape)
        # print('after pool:', x.shape)
        # after pool: torch.Size([2, 512, 1, 1])
        #x = self.out(x)
        #x = self.out2(x)
        x = x.view(x.size(0), -1)
        #print("x7:",x.shape)
        x = self.outlayer(x)
        #print("x8:",x.shape)
        return x
    
def ai85net_gesture(pretrained=False, **kwargs):
    assert not pretrained
    return ResNet18(**kwargs)
models = [
    {
        'name': 'ai85net_gesture',
        'min_input': 1,
        'dim': 3,
    },
]
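
The size comments in the model (22*22, 11*11, 6*6, 3*3, 2*2) can be checked with the standard output-size formulas for pooling and convolution. The sketch below is plain arithmetic and does not use ai8x. It assumes a 64*64 input (the dimensions default in the constructor), a pooling window of 1 with the pool_stride values from the model, and 3*3 convolutions with padding 1 and stride 1. The helper names are illustrative only.

```python
def pool_out(size, pool_size, pool_stride):
    """Output length of an unpadded pooling window along one axis."""
    return (size - pool_size) // pool_stride + 1


def conv_out(size, kernel=3, pad=1, stride=1):
    """Output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_sizes(size=64):
    """Spatial size after the stem layer and after each of the four blocks."""
    sizes = []
    # Stem uses pool_stride=3; every ResBlk starts with pool_stride=2.
    # The remaining 3x3, pad=1, stride=1 convolutions keep the size unchanged.
    for pool_stride in (3, 2, 2, 2, 2):
        size = conv_out(pool_out(size, 1, pool_stride))
        sizes.append(size)
    return sizes


sizes = trace_sizes(64)                      # [22, 11, 6, 3, 2]
after_avg_pool = pool_out(sizes[-1], 2, 1)   # 2x2 window on a 2x2 map -> 1
flattened = 64 * after_avg_pool ** 2         # 64 inputs for the final Linear layer
```

The last two lines mirror avg_pool: [2,2] in the YAML: a 2*2 window over the final 2*2 feature map leaves 64*1*1 values, which is what ai8x.Linear(64 * 1 * 1, ...) expects.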

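II. Additional notes

(1) If accuracy drops badly when evaluating the quantized model (QAT is used throughout here), replace every nn.xx in your model with its ai8x.xx counterpart. Look up the matching names in ai8x.py.

(2) Use ai8x.Add() for residual connections in ai8x.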

(3) avg_pool: [2,2]

I am not sure of the exact meaning either, but the following may help:

[2,2]: 64*2*2 -> 64*1*1. With [1,1] it turns 64*2*2 into 16*2*2, which does not work.

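(4) As I understand it, processors and output_processors correspond to channel counts: processors covers the input channels and output_processors the output channels. Each value is a 64-bit mask, and the number of set bits matches the channel count:

0xffffffffffffffff = 64

0x00000000ffffffff = 32

0x0ffff00000000000 = 16

0x0000000000000007 = 3

0x00000000000000f9 = 6

I did not look into the exact calculation. The numbers above are simply the set-bit counts of each mask.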

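(5) [res3, res4] can also be written with layer numbers, for example [1,3].

(6) It seems that stride cannot be written in the YAML file. It probably defaults to 1, and only pool_stride can be set, so I changed my model to use stride 1 everywhere.

(7) For the YAML configuration, see: MaximAI_Documentation/Guides/YAML Quickstart.md in the analogdevicesinc/MaximAI_Documentation repository on GitHub.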
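
For note (4), the channel counts can be read off the masks by counting set bits. The short sketch below does that for the values used in the YAML file. The function name enabled_processors is made up for illustration, and the reading "one set bit per channel" is an interpretation that happens to fit every mask in this file (none of the layers has more than 64 channels).

```python
def enabled_processors(mask):
    """Return the indices of the set bits in a 64-bit processor mask."""
    return [i for i in range(64) if (mask >> i) & 1]


masks = [
    0xFFFFFFFFFFFFFFFF,
    0x00000000FFFFFFFF,
    0x0FFFF00000000000,
    0x0000000000000007,
    0x00000000000000F9,
]
counts = [len(enabled_processors(m)) for m in masks]  # [64, 32, 16, 3, 6]
```

Note that 0x00000000000000f9 enables bits 0, 3, 4, 5, 6 and 7, so the six output channels of the final layer are not mapped to six consecutive processors.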

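One more note on gesture-chw.yaml:

When the kernel is 1*1, pad must be set to 0.

(kernel_size and pad should probably be set together, otherwise pad takes its default value.)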
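
Why a 1*1 kernel needs pad 0 follows from the usual convolution output-size formula. The sketch below is plain arithmetic, not ai8x code, and the 11*11 input is just one of the feature-map sizes from this model. What the synthesis tool actually does when pad is omitted is not checked here; the example only shows the effect of a nonzero pad, which is what the author suspects happens by default.

```python
def conv_out(size, kernel, pad, stride=1):
    """Output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


same_3x3 = conv_out(11, kernel=3, pad=1)   # 11: 3x3 with pad 1 keeps the size
same_1x1 = conv_out(11, kernel=1, pad=0)   # 11: 1x1 with pad 0 keeps the size
grown_1x1 = conv_out(11, kernel=1, pad=1)  # 13: 1x1 with pad 1 grows the map by 2
```

So a 1*1 layer left with a pad of 1 would produce a feature map two pixels larger in each dimension than the layer it should match, which breaks any element-wise add with a parallel branch.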
