
textclassifier's People

Contributors

jiangxinyang227


textclassifier's Issues

Question about input data dimensions

Hi, in the ELMo code I see that the input length is only capped at 200, but every sentence has a different length, so is the shape of the trained output vectors (batch_size, 200, dim)? Does ELMo accept padded sentences? If so, is the padding done with space tokens? That seems to be how tensorflow_hub wraps it.

The Transformer key-mask computation seems to be wrong

The position embedding that gets generated is a [batch_size, sequence_length, sequence_length] identity matrix;
it is then concatenated with the word embedding: self.embeddedWords = tf.concat([self.embedded, self.embeddedPosition], -1)
That matrix should therefore have shape [batch_size, sequence_length, embedding_size + sequence_length], and it is passed into multiheadAttention as Q/K/V;
When the key mask is computed as keyMasks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))), take a zero-padded final time step as an example: keys[:, -1, :] = [0, 0, 0, …, 0 | 0, …, 0, 1] (before the "|" is the word embedding, after it the position embedding). The expression above still evaluates to 1 for that step, so the key mask apparently never contains any 0.
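
For illustration, a minimal, self-contained sketch (toy shapes, not the repository's code) showing that once the one-hot position embedding is concatenated, tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))) can never yield 0, even for fully padded time steps:

import tensorflow as tf
import numpy as np

batch_size, seq_len, emb_size = 2, 4, 3

# word embeddings with the last two time steps zero-padded
embedded = np.zeros((batch_size, seq_len, emb_size), dtype=np.float32)
embedded[:, :2, :] = 1.0

# one-hot position embedding: an identity matrix tiled over the batch
embedded_position = np.tile(np.eye(seq_len, dtype=np.float32), (batch_size, 1, 1))

keys = tf.concat([tf.constant(embedded), tf.constant(embedded_position)], -1)
key_masks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1)))

with tf.Session() as sess:
    print(sess.run(key_masks))  # all ones: the position one-hot always contributes 1 to the sum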

Also, how well does the Transformer work for you on text classification? My Keras implementation is always 2-4 points worse than an LSTM. Have you run into the same thing?
Thanks!

Still unsure about how to write the test file

I am still unsure about how to write the test file. Following the blog link you gave, I only know how to load the checkpoint; I still don't really know how to pass the inputs afterwards, i.e. which parameters should be fed in, and the blog post doesn't show how the test set is loaded either. I hope you can help.
import tensorflow as tf
import numpy as np

graph = tf.Graph()
with graph.as_default():
    session_conf = tf.ConfigProto(allow_soft_placement=True, log_device_placement=False)
    session_conf.gpu_options.allow_growth = True
    session_conf.gpu_options.per_process_gpu_memory_fraction = 0.9  # cap the GPU memory usage

    sess = tf.Session(config=session_conf)

    with sess.as_default():
        checkpoint_file = tf.train.latest_checkpoint("C:/Users/韩泽峰/Desktop/textClassifier-master/model/transformer3classfier/")
        saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
        saver.restore(sess, checkpoint_file)

        # tensors that must be fed to the model: the inputs the output depends on
        # (use .outputs[0] to get the tensor rather than the operation)
        inputX = graph.get_operation_by_name("inputX").outputs[0]
        dropoutKeepProb = graph.get_operation_by_name("dropoutKeepProb").outputs[0]
        embeddedPosition = graph.get_operation_by_name("embeddedPosition").outputs[0]

        # fetch the output result ("inputY" is the label placeholder, so this is probably not the right tensor)
        pred_all = graph.get_operation_by_name("inputY").outputs[0]
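
Continuing the snippet above, a hedged sketch of the missing inference step. The tensor name "output/predictions:0", the 200-token sequence length, and the zero matrix standing in for the real test data are assumptions for illustration, not the repository's actual API:

import numpy as np

# Stand-in for the real test data: replace with your own word-id matrix,
# padded/truncated to the sequence length used at training time (assumed 200 here).
testX = np.zeros((8, 200), dtype=np.int64)

# The fixed one-hot position embedding the Transformer expects, tiled over the batch.
position_embedding = np.tile(np.eye(200, dtype=np.float32), (testX.shape[0], 1, 1))

# Assumed tensor name -- inspect graph.get_operations() if your graph names it differently.
predictions = graph.get_tensor_by_name("output/predictions:0")

preds = sess.run(predictions, feed_dict={inputX: testX,
                                         dropoutKeepProb: 1.0,  # no dropout at inference time
                                         embeddedPosition: position_embedding})
print(preds[:10])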

About batchSize in Transformer._positionEmbedding

Hi, I want to use Transformer._positionEmbedding, and I ran into a problem when setting batchSize:
at validation time the validation-set batch size differs from the training-set batch size (the training batch is larger; the validation batch is chosen to cut validation time while avoiding out-of-memory errors).
I could of course feed the position embedding in as an input to work around this, but is there a better way to solve it?
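
One possible direction, sketched below under the assumption that the position embedding is the fixed one-hot matrix described in the Transformer issue above: build it inside the graph from the runtime batch size (tf.shape) instead of the configured batchSize, so training and validation batches of different sizes both work without feeding it in. The names inputX and sequence_length are placeholders for whatever your model uses:

import tensorflow as tf

def dynamic_position_embedding(inputX, sequence_length):
    """One-hot position embedding tiled to the runtime batch size."""
    batch_size = tf.shape(inputX)[0]                     # dynamic, not the config value
    one_hot = tf.eye(sequence_length, dtype=tf.float32)  # [seq_len, seq_len]
    one_hot = tf.expand_dims(one_hot, 0)                 # [1, seq_len, seq_len]
    return tf.tile(one_hot, [batch_size, 1, 1])          # [batch, seq_len, seq_len]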

Bug in the bilstm-attention code

In the attention part, W probably should not be initialized as a one-dimensional vector: W = tf.Variable(tf.random_normal([hiddenSize], stddev=0.1))
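
For context, a minimal sketch of the attention formulation this code appears to follow (the Zhou et al. 2016 BiLSTM-attention style), in which w really is a single 1-D context vector that scores each time step; whether the repository intends exactly this formulation is an assumption:

import tensorflow as tf

def attention(H, hidden_size):
    """H: [batch, seq_len, hidden_size] BiLSTM outputs (forward and backward already summed)."""
    w = tf.Variable(tf.random_normal([hidden_size], stddev=0.1))  # 1-D context vector
    M = tf.tanh(H)                                                # [batch, seq_len, hidden_size]
    # score each time step by its dot product with w
    scores = tf.tensordot(M, w, axes=[[2], [0]])                  # [batch, seq_len]
    alpha = tf.nn.softmax(scores)                                 # attention weights
    # weighted sum of the hidden states
    r = tf.reduce_sum(H * tf.expand_dims(alpha, -1), axis=1)      # [batch, hidden_size]
    return tf.tanh(r)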

ELMo input-length problem

When running the ELMo code I also hit the error "ValueError: setting an array element with a sequence." My question is: how are sentences shorter than maxlen supposed to be handled?
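
That ValueError usually comes from calling np.array on token lists of unequal length. A hedged sketch of one common fix, padding or truncating every tokenized sentence to maxlen before stacking (the empty-string pad token and the maxlen value are assumptions, not necessarily what this repository expects):

import numpy as np

def pad_tokens(sentences, maxlen=200, pad_token=""):
    """Pad or truncate each tokenized sentence to exactly maxlen tokens."""
    padded = []
    for tokens in sentences:
        tokens = tokens[:maxlen]
        padded.append(tokens + [pad_token] * (maxlen - len(tokens)))
    return padded

sentences = [["this", "movie", "is", "great"], ["terrible"]]
batch = pad_tokens(sentences, maxlen=6)
print(np.array(batch).shape)  # (2, 6) -- equal lengths, so np.array no longer fails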

Question about testing

Hi, I'm a beginner. I read your code carefully and it is very thorough, but there seems to be no test (or eval) method. If I want to measure performance on a test set, how should I do that? Thanks!
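
A brief sketch of the metric step, assuming you already have the gold labels and the labels predicted by the restored model (how to obtain the predictions is discussed in the test-file issue above); using sklearn here is an assumption, any metric code will do:

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_true: gold labels of the test set, y_pred: labels predicted by the restored model (toy values here)
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]

acc = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary")
print("acc=%.4f precision=%.4f recall=%.4f f1=%.4f" % (acc, precision, recall, f1))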

bert_blstm_atten.py in the BERT code cannot run predict

When executing predict.sh:

python bert_blstm_atten.py \
  --task_name=imdb \
  --do_predict=True \
  --data_dir=data/ \
  --vocab_file=modelParams/uncased_L-12_H-768_A-12/vocab.txt \
  --bert_config_file=modelParams/uncased_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=blstm_output/ \
  --max_seq_length=128 \
  --output_dir=predicts/imdb/

it reports an error:
Instructions for updating:
Please use keras.layers.RNN(cell), which is equivalent to this API
INFO:tensorflow:H shape: Tensor("Attention/add:0", shape=(?, 128, 256), dtype=float32)
INFO:tensorflow:r shape: Tensor("Attention/MatMul_1:0", shape=(?, 256, 1), dtype=float32)
INFO:tensorflow:sequeeze_r shape: Tensor("Attention/Squeeze:0", shape=(?, 256), dtype=float32)
INFO:tensorflow:sentence embedding shape: Tensor("Attention/Tanh_1:0", shape=(?, 256), dtype=float32)
INFO:tensorflow:output shape: Tensor("Attention/dropout/mul:0", shape=(?, 256), dtype=float32)
INFO:tensorflow:output_w shape: <tf.Variable 'output/output_w:0' shape=(256, 2) dtype=float32_ref>
INFO:tensorflow:predictions: Tensor("predictions:0", shape=(?,), dtype=int64)
INFO:tensorflow:Error recorded from prediction_loop: Assignment map with scope only name Attention should map to scope only Attention/Variable. Should be 'scope/': 'other_scope/'.
INFO:tensorflow:prediction_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
File "bert_blstm_atten.py", line 865, in
tf.app.run()
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "bert_blstm_atten.py", line 847, in main
for (i, prediction) in enumerate(result):
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2500, in predict
rendezvous.raise_errors()
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
six.reraise(typ, value, traceback)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2494, in predict
yield_single_examples=yield_single_examples):
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 611, in predict
features, None, model_fn_lib.ModeKeys.PREDICT, self.config)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2251, in _call_model_fn
config)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2534, in _model_fn
features, labels, is_export_mode=is_export_mode)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1323, in call_without_tpu
return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1593, in _call_model_fn
estimator_spec = self._model_fn(features=features, **kwargs)
File "bert_blstm_atten.py", line 539, in model_fn
tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 190, in init_from_checkpoint
_init_from_checkpoint, args=(ckpt_dir_or_file, assignment_map))
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1516, in merge_call
return self._merge_call(merge_fn, args, kwargs)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1524, in _merge_call
return merge_fn(self._distribution_strategy, *args, **kwargs)
File "/home/zys/anaconda3/envs/TensorFlow/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 246, in _init_from_checkpoint
scopes, tensor_name_in_ckpt))
ValueError: Assignment map with scope only name Attention should map to scope only Attention/Variable. Should be 'scope/': 'other_scope/'.

do_train and do_eval both work fine; it only fails when I run the predict.sh I wrote. It looks as if some variables are missing. Could you take a look at what is going on? Thanks!
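
For reference, the ValueError is raised by tf.train.init_from_checkpoint when an entry of the assignment map mixes the two legal forms: a scope-to-scope entry needs a trailing slash on both sides, while a variable-to-variable entry must name a full variable on both sides. A hedged illustration (the scope name Attention is taken from the log above; whether this is the actual fix for the script is an assumption):

# scope -> scope: both sides end with "/", every variable under the scope is mapped
assignment_map = {"Attention/": "Attention/"}

# variable -> variable: both sides are full variable names, no trailing slash
assignment_map = {"Attention/Variable": "Attention/Variable"}

# Mixing the two forms, e.g. mapping a bare scope name to a full variable name,
# is what produces "Should be 'scope/': 'other_scope/'." inside tf.train.init_from_checkpoint.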

bert_blstm_atten.py cannot do multi-class classification

Hi, after modifying bert_blstm_atten.py I found it cannot do multi-class evaluation; training itself runs, and modifying run_classifier in plain BERT works fine. I suspect the blstm_atten layer only supports binary classification. How should the code be changed so that bert_blstm can do multi-class classification?
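
Not having run this change myself, a common culprit is an output layer hard-coded to two classes (the log in the predict issue above shows output_w with shape (256, 2)). A hedged sketch of a classification head parameterized by num_labels instead; the variable names are illustrative and not necessarily those used in bert_blstm_atten.py:

import tensorflow as tf

def classification_head(sentence_embedding, num_labels, hidden_size=256):
    """sentence_embedding: [batch, hidden_size] output of the attention layer."""
    with tf.variable_scope("output"):
        output_w = tf.get_variable("output_w", shape=[hidden_size, num_labels])
        output_b = tf.get_variable("output_b", shape=[num_labels],
                                   initializer=tf.zeros_initializer())
        logits = tf.nn.xw_plus_b(sentence_embedding, output_w, output_b, name="logits")
        predictions = tf.argmax(logits, axis=-1, name="predictions")
    return logits, predictions

def loss_fn(logits, label_ids, num_labels):
    # one-hot labels plus softmax cross-entropy generalize to any number of classes
    one_hot_labels = tf.one_hot(label_ids, depth=num_labels, dtype=tf.float32)
    per_example_loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=one_hot_labels, logits=logits)
    return tf.reduce_mean(per_example_loss)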
