Choosing Fine-Tuning Parameters for BERT NER


码农世界, 2024-05-27

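I tested convergence speed for several combinations of batch size and learning rate. Conclusions:

  • For the same number of epochs, batch size 32 runs more optimizer steps than batch size 256 and therefore converges faster: after the first epoch every batch-size-32 run already scores above 0.71, while the batch-size-256 runs score at most 0.42.
  • With a larger batch, the learning rate can be set somewhat higher: at batch size 256, 3e-5 converges noticeably faster than 1e-5 or 2e-5, whereas at batch size 32 all three learning rates end near 0.80.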
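To make the step-count argument concrete, here is a minimal sketch of the arithmetic. It assumes the training set has 10748 examples (the total shown by the progress bar in the log), that the last partial batch is kept, and that no gradient accumulation is used; none of these is confirmed by the training script, which is not shown.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps in one epoch, counting the final partial batch."""
    return math.ceil(num_examples / batch_size)


NUM_TRAIN = 10748  # assumption: total taken from the progress bar in the log
EPOCHS = 6         # matches --epochs in the fine-tuning command

for batch_size in (32, 256):
    per_epoch = steps_per_epoch(NUM_TRAIN, batch_size)
    total = per_epoch * EPOCHS
    print(f"batch_size={batch_size}: {per_epoch} steps/epoch, {total} steps in {EPOCHS} epochs")
```

Under these assumptions batch size 32 performs 8 times as many weight updates per epoch as batch size 256, which is consistent with its much higher score after the first epoch.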

Plotting code (generated by DeepSeek):

    import matplotlib.pyplot as plt

    # Validation score after each of 6 epochs, keyed by (batch size, learning rate)
    dic = {
        (256, 1e-5): [0,        0.185357, 0.549124, 0.649283, 0.720528, 0.743900],
        (256, 2e-5): [0.086368, 0.604535, 0.731870, 0.763409, 0.773608, 0.781042],
        (256, 3e-5): [0.415224, 0.715375, 0.753391, 0.771326, 0.784421, 0.783432],
        (32,  1e-5): [0.710058, 0.769245, 0.781832, 0.786909, 0.792920, 0.799076],
        (32,  2e-5): [0.761296, 0.766089, 0.795317, 0.801602, 0.795861, 0.799864],
        (32,  3e-5): [0.771385, 0.788055, 0.791863, 0.793491, 0.800057, 0.799527],
    }
    # Draw one line per (batch size, learning rate) combination
    plt.figure(figsize=(10, 6))
    for (batch_size, lr), trajectory in dic.items():
        epochs = range(1, len(trajectory) + 1)  # epochs are numbered from 1
        plt.plot(epochs, trajectory, label=f'batch={batch_size}, lr={lr}')
    # Set the chart title and axis labels
    plt.title('Validation Score Trajectory for Different Parameters')
    plt.xlabel('Training Epochs')
    plt.ylabel('Performance Metric')
    # Add the legend
    plt.legend()
    # Show the chart
    plt.show()
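Besides reading the plot, convergence speed can be compared numerically. The sketch below reports the first epoch at which each run reaches a target score; the 0.78 threshold and the helper name `first_epoch_reaching` are arbitrary choices for illustration, not part of the original experiment.

```python
from typing import Optional, Sequence

# Same validation scores as in the plotting code above
dic = {
    (256, 1e-5): [0,        0.185357, 0.549124, 0.649283, 0.720528, 0.743900],
    (256, 2e-5): [0.086368, 0.604535, 0.731870, 0.763409, 0.773608, 0.781042],
    (256, 3e-5): [0.415224, 0.715375, 0.753391, 0.771326, 0.784421, 0.783432],
    (32,  1e-5): [0.710058, 0.769245, 0.781832, 0.786909, 0.792920, 0.799076],
    (32,  2e-5): [0.761296, 0.766089, 0.795317, 0.801602, 0.795861, 0.799864],
    (32,  3e-5): [0.771385, 0.788055, 0.791863, 0.793491, 0.800057, 0.799527],
}


def first_epoch_reaching(trajectory: Sequence[float], threshold: float) -> Optional[int]:
    """Return the 1-based epoch where the score first reaches threshold, or None."""
    for epoch, score in enumerate(trajectory, start=1):
        if score >= threshold:
            return epoch
    return None


THRESHOLD = 0.78  # hypothetical target score, chosen only for this comparison

for (batch_size, lr), trajectory in dic.items():
    epoch = first_epoch_reaching(trajectory, THRESHOLD)
    reached = f"epoch {epoch}" if epoch is not None else "not reached"
    print(f"batch={batch_size:<3} lr={lr}: {reached}, final={trajectory[-1]:.4f}")
```

With this threshold, the batch-size-32 runs get there by epoch 2 or 3, the batch-size-256 runs need 5 or 6 epochs at the higher learning rates, and (256, 1e-5) does not reach it within 6 epochs.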
    

Appendix

Fine-tuning command

    !python ner_finetune.py \
    --gpu_device 0 \
    --train_batch_size 32 \
    --valid_batch_size 32 \
    --epochs 6 \
    --learning_rate 3e-5 \
    --train_file data/cluener2020/train.json \
    --valid_file data/cluener2020/dev.json \
    --allow_label "{'name': 'PER', 'organization': 'ORG', 'address': 'LOC', 'company': 'ORG', 'government': 'ORG'}" \
    --pretrained_model models/bert-base-chinese \
    --tokenizer models/bert-base-chinese \
    --save_model_dir models/local/bert_tune_5
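The post does not include `ner_finetune.py` itself. As a rough sketch of how the flags in this command might be parsed, the example below uses `argparse` and converts the `--allow_label` string into a dict with `ast.literal_eval`. The function names `parse_label_dict` and `build_parser` are hypothetical; the defaults for `max_len` and `max_grad_norm` are the values that appear in the logged `Namespace` without being passed on the command line, and the real script may be organized differently.

```python
import argparse
import ast


def parse_label_dict(text: str) -> dict:
    """Turn a dict literal such as "{'name': 'PER'}" into a Python dict."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise argparse.ArgumentTypeError(f"invalid dict literal: {exc}")
    if not isinstance(value, dict):
        raise argparse.ArgumentTypeError("--allow_label must be a dict literal")
    return value


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction: flag names come from the command above,
    # everything else is an assumption.
    parser = argparse.ArgumentParser(description="BERT NER fine-tuning (sketch)")
    parser.add_argument("--gpu_device", type=str, default="0")
    parser.add_argument("--train_batch_size", type=int, default=32)
    parser.add_argument("--valid_batch_size", type=int, default=32)
    parser.add_argument("--epochs", type=int, default=6)
    parser.add_argument("--learning_rate", type=float, default=3e-5)
    parser.add_argument("--max_len", type=int, default=128)
    parser.add_argument("--max_grad_norm", type=float, default=10)
    parser.add_argument("--train_file", type=str, required=True)
    parser.add_argument("--valid_file", type=str, required=True)
    parser.add_argument("--allow_label", type=parse_label_dict, required=True)
    parser.add_argument("--pretrained_model", type=str, required=True)
    parser.add_argument("--tokenizer", type=str, required=True)
    parser.add_argument("--save_model_dir", type=str, required=True)
    return parser


# Parse the same arguments as the command above, passed as an explicit list
demo_argv = [
    "--gpu_device", "0",
    "--train_batch_size", "32",
    "--valid_batch_size", "32",
    "--epochs", "6",
    "--learning_rate", "3e-5",
    "--train_file", "data/cluener2020/train.json",
    "--valid_file", "data/cluener2020/dev.json",
    "--allow_label", "{'name': 'PER', 'organization': 'ORG', 'address': 'LOC', 'company': 'ORG', 'government': 'ORG'}",
    "--pretrained_model", "models/bert-base-chinese",
    "--tokenizer", "models/bert-base-chinese",
    "--save_model_dir", "models/local/bert_tune_5",
]
args = build_parser().parse_args(demo_argv)
print(args.allow_label)
```

`ast.literal_eval` is used here rather than `eval` because it only accepts Python literals, so a malformed or malicious `--allow_label` value cannot execute code.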
    

Log

    Namespace(allow_label={'name': 'PER', 'organization': 'ORG', 'address': 'LOC', 'company': 'ORG', 'government': 'ORG'}, epochs=6, gpu_device='0', learning_rate=3e-05, max_grad_norm=10, max_len=128, pretrained_model='models/bert-base-chinese', save_model_dir='models/local/bert_tune_5', tokenizer='models/bert-base-chinese', train_batch_size=32, train_file='data/cluener2020/train.json', valid_batch_size=32, valid_file='data/cluener2020/dev.json')
    CUDA is available!
    Number of CUDA devices: 1
    Device name: NVIDIA GeForce RTX 2080 Ti
    Device capability: (7, 5)
    标签映射: {'O': 0, 'B-PER': 1, 'B-ORG': 2, 'B-LOC': 3, 'I-PER': 4, 'I-ORG': 5, 'I-LOC': 6}
    加载数据集:data/cluener2020/train.json
      0%|  | 0/10748 [00:00
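The label mapping printed in the log can be derived from `--allow_label`: five CLUENER categories are merged into three tags (PER, ORG, LOC), and each tag gets a `B-` and an `I-` id after `O`. The sketch below is one way to reproduce that mapping and apply it to entity spans. The function names and the inclusive end index are assumptions for illustration, since the script's own implementation is not shown.

```python
from typing import Dict, List, Tuple


def build_label_map(allow_label: Dict[str, str]) -> Dict[str, int]:
    """Build a BIO label-to-id map: 'O' first, then all B- tags, then all I- tags."""
    # Unique target tags in order of first appearance (dicts keep insertion order)
    tags = list(dict.fromkeys(allow_label.values()))
    label_map = {"O": 0}
    for prefix in ("B", "I"):
        for tag in tags:
            label_map[f"{prefix}-{tag}"] = len(label_map)
    return label_map


def tag_spans(length: int,
              spans: List[Tuple[int, int, str]],
              label_map: Dict[str, int]) -> List[int]:
    """Convert (start, end, tag) spans into per-character label ids.

    Assumption: `end` is inclusive.
    """
    ids = [label_map["O"]] * length
    for start, end, tag in spans:
        ids[start] = label_map[f"B-{tag}"]
        for i in range(start + 1, end + 1):
            ids[i] = label_map[f"I-{tag}"]
    return ids


allow_label = {'name': 'PER', 'organization': 'ORG', 'address': 'LOC',
               'company': 'ORG', 'government': 'ORG'}
label_map = build_label_map(allow_label)
print(label_map)

# Made-up example: a 6-character sentence with a PER span and an ORG span
print(tag_spans(6, [(0, 1, "PER"), (3, 5, "ORG")], label_map))
```

For the `allow_label` value used in the command, `build_label_map` yields the same mapping as the log line above.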
                    
                    
                    
