
textclassifier's People

Contributors

jiangxinyang227


textclassifier's Issues

Question about input data dimensions

Hi, in the ELMo code I see that the input length is only capped at 200, but every sentence has a different length, so is the shape of the trained output vectors (batch_size, 200, dim)? Does ELMo accept padded sentences? If so, is the padding done with space tokens? That seems to be how tensorflow_hub wraps it.

The Transformer key-mask computation seems to be wrong

The position embedding that gets generated is a [batch_size, sequence_length, sequence_length] identity matrix;
it is then concatenated with the word embedding: self.embeddedWords = tf.concat([self.embedded, self.embeddedPosition], -1)
That matrix should therefore have shape [batch_size, sequence_length, embedding_size + sequence_length], and it is passed into multiheadAttention as Q/K/V;
When the key mask is computed as keyMasks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))), take a zero-padded final time step as an example: keys[:, -1, :] = [0, 0, 0, …, 0 | 0, …, 0, 1] (before the "|" is the word embedding, after it the position embedding). The expression above still evaluates to 1 for that step, so the key mask apparently never contains any 0.
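
For illustration, a minimal, self-contained sketch (toy shapes, not the repository's code) showing that once the one-hot position embedding is concatenated, tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))) can never yield 0, even for fully padded time steps:

import tensorflow as tf
import numpy as np

batch_size, seq_len, emb_size = 2, 4, 3

# word embeddings with the last two time steps zero-padded
embedded = np.zeros((batch_size, seq_len, emb_size), dtype=np.float32)
embedded[:, :2, :] = 1.0

# one-hot position embedding: an identity matrix tiled over the batch
embedded_position = np.tile(np.eye(seq_len, dtype=np.float32), (batch_size, 1, 1))

keys = tf.concat([tf.constant(embedded), tf.constant(embedded_position)], -1)
key_masks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1)))

with tf.Session() as sess:
    print(sess.run(key_masks))  # all ones: the position one-hot always contributes 1 to the sum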

Also, how well does the Transformer work for you on text classification? My Keras implementation is always 2-4 points worse than an LSTM. Have you run into the same thing?
Thanks!

Still unsure about how to write the test file

I am still unsure about how to write the test file. Following the blog link you gave, I only know how to load the checkpoint; I still don't really know how to pass the inputs afterwards, i.e. which parameters should be fed in, and the blog post doesn't show how the test set is loaded either. I hope you can help.
import tensorflow as tf
import numpy as np

graph = tf.Graph()
with graph.as_default():
    session_conf = tf.ConfigProto(allow_soft_placement=True, log_device_placement=False)
    session_conf.gpu_options.allow_growth = True
    session_conf.gpu_options.per_process_gpu_memory_fraction = 0.9  # cap the GPU memory usage

    sess = tf.Session(config=session_conf)

    with sess.as_default():
        checkpoint_file = tf.train.latest_checkpoint("C:/Users/韩泽峰/Desktop/textClassifier-master/model/transformer3classfier/")
        saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
        saver.restore(sess, checkpoint_file)

        # tensors that must be fed to the model: the inputs the output depends on
        # (use .outputs[0] to get the tensor rather than the operation)
        inputX = graph.get_operation_by_name("inputX").outputs[0]
        dropoutKeepProb = graph.get_operation_by_name("dropoutKeepProb").outputs[0]
        embeddedPosition = graph.get_operation_by_name("embeddedPosition").outputs[0]

        # fetch the output result ("inputY" is the label placeholder, so this is probably not the right tensor)
        pred_all = graph.get_operation_by_name("inputY").outputs[0]
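
Continuing the snippet above, a hedged sketch of the missing inference step. The tensor name "output/predictions:0", the 200-token sequence length, and the zero matrix standing in for the real test data are assumptions for illustration, not the repository's actual API:

import numpy as np

# Stand-in for the real test data: replace with your own word-id matrix,
# padded/truncated to the sequence length used at training time (assumed 200 here).
testX = np.zeros((8, 200), dtype=np.int64)

# The fixed one-hot position embedding the Transformer expects, tiled over the batch.
position_embedding = np.tile(np.eye(200, dtype=np.float32), (testX.shape[0], 1, 1))

# Assumed tensor name -- inspect graph.get_operations() if your graph names it differently.
predictions = graph.get_tensor_by_name("output/predictions:0")

preds = sess.run(predictions, feed_dict={inputX: testX,
                                         dropoutKeepProb: 1.0,  # no dropout at inference time
                                         embeddedPosition: position_embedding})
print(preds[:10])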

About batchSize in Transformer._positionEmbedding

Hi, I want to use Transformer._positionEmbedding, and I ran into a problem when setting batchSize:
at validation time the validation-set batch size differs from the training-set batch size (the training batch is larger; the validation batch is chosen to cut validation time while avoiding out-of-memory errors).
I could of course feed the position embedding in as an input to work around this, but is there a better way to solve it?
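
One possible direction, sketched below under the assumption that the position embedding is the fixed one-hot matrix described in the Transformer issue above: build it inside the graph from the runtime batch size (tf.shape) instead of the configured batchSize, so training and validation batches of different sizes both work without feeding it in. The names inputX and sequence_length are placeholders for whatever your model uses:

import tensorflow as tf

def dynamic_position_embedding(inputX, sequence_length):
    """One-hot position embedding tiled to the runtime batch size."""
    batch_size = tf.shape(inputX)[0]                     # dynamic, not the config value
    one_hot = tf.eye(sequence_length, dtype=tf.float32)  # [seq_len, seq_len]
    one_hot = tf.expand_dims(one_hot, 0)                 # [1, seq_len, seq_len]
    return tf.tile(one_hot, [batch_size, 1, 1])          # [batch, seq_len, seq_len]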

Bug in the bilstm-attention code

In the attention part, W probably should not be initialized as a one-dimensional vector: W = tf.Variable(tf.random_normal([hiddenSize], stddev=0.1))
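
For context, a minimal sketch of the attention formulation this code appears to follow (the Zhou et al. 2016 BiLSTM-attention style), in which w really is a single 1-D context vector that scores each time step; whether the repository intends exactly this formulation is an assumption:

import tensorflow as tf

def attention(H, hidden_size):
    """H: [batch, seq_len, hidden_size] BiLSTM outputs (forward and backward already summed)."""
    w = tf.Variable(tf.random_normal([hidden_size], stddev=0.1))  # 1-D context vector
    M = tf.tanh(H)                                                # [batch, seq_len, hidden_size]
    # score each time step by its dot product with w
    scores = tf.tensordot(M, w, axes=[[2], [0]])                  # [batch, seq_len]
    alpha = tf.nn.softmax(scores)                                 # attention weights
    # weighted sum of the hidden states
    r = tf.reduce_sum(H * tf.expand_dims(alpha, -1), axis=1)      # [batch, hidden_size]
    return tf.tanh(r)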

ELMo input-length problem

When running the ELMo code I also hit the error "ValueError: setting an array element with a sequence." My question is: how are sentences shorter than maxlen supposed to be handled?
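
That ValueError usually comes from calling np.array on token lists of unequal length. A hedged sketch of one common fix, padding or truncating every tokenized sentence to maxlen before stacking (the empty-string pad token and the maxlen value are assumptions, not necessarily what this repository expects):

import numpy as np

def pad_tokens(sentences, maxlen=200, pad_token=""):
    """Pad or truncate each tokenized sentence to exactly maxlen tokens."""
    padded = []
    for tokens in sentences:
        tokens = tokens[:maxlen]
        padded.append(tokens + [pad_token] * (maxlen - len(tokens)))
    return padded

sentences = [["this", "movie", "is", "great"], ["terrible"]]
batch = pad_tokens(sentences, maxlen=6)
print(np.array(batch).shape)  # (2, 6) -- equal lengths, so np.array no longer fails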

Question about testing

Hi, I'm a beginner. I read your code carefully and it is very thorough, but there seems to be no test (or eval) method. If I want to measure performance on a test set, how should I do that? Thanks!
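
A brief sketch of the metric step, assuming you already have the gold labels and the labels predicted by the restored model (how to obtain the predictions is discussed in the test-file issue above); using sklearn here is an assumption, any metric code will do:

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_true: gold labels of the test set, y_pred: labels predicted by the restored model (toy values here)
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]

acc = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary")
print("acc=%.4f precision=%.4f recall=%.4f f1=%.4f" % (acc, precision, recall, f1))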

bert_blstm_atten.py in the BERT code cannot run predict

When executing predict.sh:

python bert_blstm_atten.py \
  --task_name=imdb \
  --do_predict=True \
  --data_dir=data/ \
  --vocab_file=modelParams/uncased_L-12_H-768_A-12/vocab.txt \
  --bert_config_file=modelParams/uncased_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=blstm_output/ \
  --max_seq_length=128 \
  --output_dir=predicts/imdb/

it reports an error:
Instructions for updating:
Please use keras.layers.RNN(cell), which is equivalent to this API
INFO:tensorflow:H shape: Tensor("Attention/add:0", shape=(?, 128, 256), dtype=float32)
INFO:tensorflow:r shape: Tensor("Attention/MatMul_1:0", shape=(?, 256, 1), dtype=float32)
INFO:tensorflow:sequeeze_r shape: Tensor("Attention/Squeeze:0", shape=(?, 256), dtype=float32)
INFO:tensorflow:sentence embedding shape: Tensor("Attention/Tanh_1:0", shape=(?, 256), dtype=float32)
INFO:tensorflow:output shape: Tensor("Attention/dropout/mul:0", shape=(?, 256), dtype=float32)
INFO:tensorflow:output_w shape: <tf.Variable 'output/output_w:0' shape=(256, 2) dtype=float32_ref>
INFO:tensorflow:predictions: Tensor("predictions:0", shape=(?,), dtype=int64)
INFO:tensorflow:Error recorded from prediction_loop: Assignment map with scope only name Attention should map to scope only Attention/Variable. Should be 'scope/': 'other_scope/'.
INFO:tensorflow:prediction_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
File "bert_blstm_atten.py", line 865, in
tf.app.run()
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "bert_blstm_atten.py", line 847, in main
for (i, prediction) in enumerate(result):
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2500, in predict
rendezvous.raise_errors()
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
six.reraise(typ, value, traceback)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2494, in predict
yield_single_examples=yield_single_examples):
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 611, in predict
features, None, model_fn_lib.ModeKeys.PREDICT, self.config)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2251, in _call_model_fn
config)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2534, in _model_fn
features, labels, is_export_mode=is_export_mode)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1323, in call_without_tpu
return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1593, in _call_model_fn
estimator_spec = self._model_fn(features=features, **kwargs)
File "bert_blstm_atten.py", line 539, in model_fn
tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 190, in init_from_checkpoint
_init_from_checkpoint, args=(ckpt_dir_or_file, assignment_map))
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1516, in merge_call
return self._merge_call(merge_fn, args, kwargs)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1524, in _merge_call
return merge_fn(self._distribution_strategy, *args, **kwargs)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 246, in _init_from_checkpoint
scopes, tensor_name_in_ckpt))
ValueError: Assignment map with scope only name Attention should map to scope only Attention/Variable. Should be 'scope/': 'other_scope/'.

do_train and do_eval both work fine; it only fails when I run the predict.sh I wrote. It looks as if some variables are missing. Could you take a look at what is going on? Thanks!
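
For reference, the ValueError is raised by tf.train.init_from_checkpoint when an entry of the assignment map mixes the two legal forms: a scope-to-scope entry needs a trailing slash on both sides, while a variable-to-variable entry must name a full variable on both sides. A hedged illustration (the scope name Attention is taken from the log above; whether this is the actual fix for the script is an assumption):

# scope -> scope: both sides end with "/", every variable under the scope is mapped
assignment_map = {"Attention/": "Attention/"}

# variable -> variable: both sides are full variable names, no trailing slash
assignment_map = {"Attention/Variable": "Attention/Variable"}

# Mixing the two forms, e.g. mapping a bare scope name to a full variable name,
# is what produces "Should be 'scope/': 'other_scope/'." inside tf.train.init_from_checkpoint.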

bert_blstm_atten.py cannot do multi-class classification

Hi, after modifying bert_blstm_atten.py I found it cannot do multi-class evaluation; training itself runs, and modifying run_classifier in plain BERT works fine. I suspect the blstm_atten layer only supports binary classification. How should the code be changed so that bert_blstm can do multi-class classification?
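
Not having run this change myself, a common culprit is an output layer hard-coded to two classes (the log in the predict issue above shows output_w with shape (256, 2)). A hedged sketch of a classification head parameterized by num_labels instead; the variable names are illustrative and not necessarily those used in bert_blstm_atten.py:

import tensorflow as tf

def classification_head(sentence_embedding, num_labels, hidden_size=256):
    """sentence_embedding: [batch, hidden_size] output of the attention layer."""
    with tf.variable_scope("output"):
        output_w = tf.get_variable("output_w", shape=[hidden_size, num_labels])
        output_b = tf.get_variable("output_b", shape=[num_labels],
                                   initializer=tf.zeros_initializer())
        logits = tf.nn.xw_plus_b(sentence_embedding, output_w, output_b, name="logits")
        predictions = tf.argmax(logits, axis=-1, name="predictions")
    return logits, predictions

def loss_fn(logits, label_ids, num_labels):
    # one-hot labels plus softmax cross-entropy generalize to any number of classes
    one_hot_labels = tf.one_hot(label_ids, depth=num_labels, dtype=tf.float32)
    per_example_loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=one_hot_labels, logits=logits)
    return tf.reduce_mean(per_example_loss)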
