Comments (16)
I'm pretty new to Python, but I tried to do this. This is how I changed the load data function to support an arbitrary number of classes, each one with its own file in the data folder:
import os
import numpy as np

def load_data_and_labels():
    # Data directory with one text file per class (one example per line)
    # data_dir = "./data/templates/"
    data_dir = "./data/rt-polaritydata/"
    # Store the examples of each class as one list per class
    class_data = []
    label_list = []
    # Load data from files
    for i in os.listdir(data_dir):
        print(data_dir + i)
        examples = list(open(data_dir + i).readlines())
        examples = [s.strip() for s in examples]
        # Append these examples to the list of lists
        class_data.append(examples)
    # Concatenate class examples
    counter = 0
    for class_examples in class_data:
        # Build the one-hot label for this class
        # (this line originally called zerolist(class_data), which is not defined)
        temp_list = [0] * len(class_data)
        temp_list[counter] = 1
        label_list.append(temp_list)
        if counter == 0:
            x_text = class_examples
        else:
            x_text = x_text + class_examples
        counter += 1
    # Clean and split (clean_str is assumed to be defined alongside this function)
    x_text = [clean_str(sent) for sent in x_text]
    x_text = [s.split(" ") for s in x_text]
    # Generate labels: repeat each class's one-hot vector once per example
    final_labels = []
    counter = 0
    for class_examples in class_data:
        print(label_list[counter])
        final_labels.append([label_list[counter] for _ in class_data[counter]])
        counter += 1
    y = np.concatenate(final_labels, 0)
    return [x_text, y]
Works fine when I leave the original data set there, but when I add a third class as rt-polarity.neu I get:
python train.py
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcublas.so.7.0 locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcudnn.so.6.5 locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcufft.so.7.0 locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcurand.so.7.0 locally
Parameters:
ALLOW_SOFT_PLACEMENT=True
BATCH_SIZE=64
CHECKPOINT_EVERY=100
DROPOUT_KEEP_PROB=0.5
EMBEDDING_DIM=128
EVALUATE_EVERY=100
FILTER_SIZES=3,4,5
L2_REG_LAMBDA=0.0
LOG_DEVICE_PLACEMENT=False
NUM_EPOCHS=200
NUM_FILTERS=128
Loading data...
./data/rt-polaritydata/rt-polarity.pos
./data/rt-polaritydata/rt-polarity.neg
./data/rt-polaritydata/rt-polarity.neu
[1, 0, 0]
[0, 1, 0]
[0, 0, 1]
Vocabulary Size: 23975
Train/Dev split: 10191/1000
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 8
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:909] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:103] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 4.00GiB
Free memory: 3.91GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:127] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:137] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:669] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 8
Writing to /home/ec2-user/cnn-text-classification-tf/runs/1455075470
Traceback (most recent call last):
  File "train.py", line 159, in <module>
    train_step(x_batch, y_batch)
  File "train.py", line 131, in train_step
    feed_dict)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 364, in run
    tuple(subfeed_t.get_shape().dims)))
ValueError: Cannot feed value of shape (64, 3) for Tensor u'input_y:0', which has shape (Dimension(None), Dimension(2))
I think the issue may be with how it splits the training and testing set here:
# Split train/test set
# TODO: This is very crude, should use cross-validation
x_train, x_dev = x_shuffled[:-1000], x_shuffled[-1000:]
y_train, y_dev = y_shuffled[:-1000], y_shuffled[-1000:]
What do you think?
from cnn-text-classification-tf.
If you have 20 classes then the y vector would be of length 20, with all 0s, except for the label that is correct. So, it's a one-hot vector of your label. y actually represents a probability distribution over the possible labels.
Or did you mean a multilabel input (which is different from multiclass), where several labels can be true at the same time? I think you can adapt the softmax to do that, take a look at this: http://arxiv.org/abs/1312.5419 or http://arxiv.org/pdf/1312.4894.pdf
I don't have any personal experience with multilabel problems though.
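To make the one-hot description concrete, here is a minimal NumPy sketch. The helper name `one_hot` and the example class indices are made up for illustration; they are not part of the repository.

```python
import numpy as np

def one_hot(class_indices, num_classes):
    """Return one row per example with a single 1 in the column of its class."""
    # np.eye(num_classes) is the identity matrix; row k is the one-hot vector for class k.
    return np.eye(num_classes, dtype=int)[np.asarray(class_indices)]

# Hypothetical labels for four examples drawn from 20 classes.
y = one_hot([0, 3, 19, 3], num_classes=20)
print(y.shape)        # (4, 20)
print(y.sum(axis=1))  # every row sums to 1, like a probability distribution
```

Each row puts all of its probability mass on the correct class, which is why a softmax output can be compared against it directly.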
If you look at https://github.com/dennybritz/cnn-text-classification-tf/blob/master/train.py you can see that the num_classes parameter is hard-coded to 2, so it creates a placeholder whose second dimension is 2. You'd need to set it to 3 to match the number of classes in your examples.
I think the splitting of the train/test set is fine. The only thing you may want to consider is that you may get an uneven split of the classes in your train and test set. I think scikit-learn has built-in support for something called "stratified" splitting, which will take care of that. But that's not necessary to make it work.
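A rough sketch of both points, using NumPy only. It assumes `y` is a one-hot array like the one `load_data_and_labels` returns; the function name `stratified_split`, the `dev_fraction` parameter, and the toy data are hypothetical. scikit-learn's `train_test_split` accepts a `stratify` argument that handles the stratification for you; the hand-rolled version below only illustrates the idea.

```python
import numpy as np

def stratified_split(x, y, dev_fraction=0.1, seed=10):
    """Split x and one-hot y so each class keeps roughly the same share in train and dev."""
    rng = np.random.RandomState(seed)
    class_ids = np.argmax(y, axis=1)  # one-hot rows -> integer class ids
    train_idx, dev_idx = [], []
    for c in np.unique(class_ids):
        # Shuffle the indices of this class, then carve off its dev share.
        idx = rng.permutation(np.where(class_ids == c)[0])
        n_dev = max(1, int(round(len(idx) * dev_fraction)))
        dev_idx.extend(idx[:n_dev])
        train_idx.extend(idx[n_dev:])
    # Shuffle again so the classes are interleaved within each split.
    train_idx = rng.permutation(np.array(train_idx, dtype=int))
    dev_idx = rng.permutation(np.array(dev_idx, dtype=int))
    return x[train_idx], x[dev_idx], y[train_idx], y[dev_idx]

# Toy data: 30 examples, 3 classes, 10 examples each.
x = np.arange(30).reshape(30, 1)
y = np.eye(3, dtype=int)[np.repeat([0, 1, 2], 10)]

# The width of the label array is the value num_classes has to match.
num_classes = y.shape[1]

x_train, x_dev, y_train, y_dev = stratified_split(x, y, dev_fraction=0.2)
print(num_classes)        # 3
print(y_dev.sum(axis=0))  # dev examples per class
```

Reading `num_classes` from `y.shape[1]` instead of hard-coding it avoids the shape mismatch when a class file is added or removed.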
Ah, such a simple mistake! Thanks!
@LeapGamer please fix zerolist in your code! Thanks
Just replace the zerolist line with the following:
temp_list = [0] * len(class_data)
@dennybritz Hello Denny, thanks for sharing! I'm confused about how to handle multi-level labels. In your code, you represent the labels like [0,1] and [1,0] in train.py, and you mentioned that
(If you have 20 classes then the y vector would be of length 20, with all 0s, except for the label that is correct. So, it's a one-hot vector of your label. y actually represents a probability distribution over the possible labels.)
So is a vector like [0,0,0,0,0,1] the only way to represent a label? How can I represent levels of labels, for example positive including happy and good, while negative includes sad and fear?
Any ideas? Thanks a lot!
@ShadowsJF If you want to have multiple labels per class, that's a different classification problem, and you need to change the loss function because you are no longer predicting a probability distribution over labels. Of course you could do something like [0.5, 0, 0, 0.5] (as a probability distribution), but that's not quite the same.
If the classes are independent (sad/fear are probably not), you could change the output layer to multiple sigmoids (instead of one softmax) that each output a probability for one class.
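The difference between one softmax and multiple sigmoids can be sketched in plain NumPy. The function names, logits, and targets below are illustrative assumptions, not code from the repository. In TensorFlow, the corresponding loss op is `tf.nn.sigmoid_cross_entropy_with_logits`.

```python
import numpy as np

def softmax(logits):
    """Probabilities over mutually exclusive classes; each row sums to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sigmoid(logits):
    """Independent per-class probabilities; rows need not sum to 1."""
    return 1.0 / (1.0 + np.exp(-logits))

def multilabel_loss(logits, targets):
    """Mean binary cross-entropy: one yes/no decision per class."""
    p = np.clip(sigmoid(logits), 1e-7, 1 - 1e-7)
    return -np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p))

# Hypothetical scores for one example over 4 classes.
logits = np.array([[2.0, -1.0, -3.0, 2.0]])
targets = np.array([[1.0, 0.0, 0.0, 1.0]])  # two labels are true at once

print(softmax(logits).sum(axis=1))  # sums to 1: the classes compete
print(sigmoid(logits) > 0.5)        # classes 0 and 3 can both be "on"
print(multilabel_loss(logits, targets))
```

With softmax, raising one class's probability necessarily lowers the others; with sigmoids, each class is scored on its own, which is why the loss changes to a per-class binary cross-entropy.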
@dennybritz it helps a lot! thanks!
Hi Denny, just wanted to say thank you for your really good work, it helped me out a lot. :)
I managed to tweak your code to make it work on a 20-category dataset with a 9000/1000 train/dev split, but the accuracy is stagnating at around 0.74 for some reason. Any ideas on which parameters I should play with to maximize the performance of the network (maybe batch or embedding size)?
Thanks a lot!
Actually, never mind my last question; it turns out I was not splitting the train/dev set as expected. I now have an accuracy of 0.99, which seems too good to be true. So I was wondering: is there an easy way to get precision and recall values for the 20 classes of my classifier? I have to admit the code in text_cnn.py is too complicated for me to understand fully. Thanks a lot for your help.
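One way to get per-class precision and recall is to build a confusion matrix from the true and predicted class ids. This is a minimal NumPy sketch; the function name and the toy labels are hypothetical, and it assumes you can obtain integer predictions for the dev set from the model. scikit-learn's `sklearn.metrics.classification_report` produces the same numbers if that library is available.

```python
import numpy as np

def per_class_precision_recall(y_true, y_pred, num_classes):
    """Return (precision, recall) arrays with one entry per class."""
    # confusion[t, p] counts examples of true class t predicted as class p.
    confusion = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        confusion[t, p] += 1
    true_pos = np.diag(confusion).astype(float)
    predicted = confusion.sum(axis=0)  # column sums: how often each class was predicted
    actual = confusion.sum(axis=1)     # row sums: how often each class actually occurs
    # Leave 0.0 for classes that were never predicted or never present.
    precision = np.divide(true_pos, predicted, out=np.zeros(num_classes), where=predicted > 0)
    recall = np.divide(true_pos, actual, out=np.zeros(num_classes), where=actual > 0)
    return precision, recall

# Hypothetical dev-set labels and predictions as integer class ids
# (np.argmax(y_dev, axis=1) converts one-hot labels to this form).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

precision, recall = per_class_precision_recall(y_true, y_pred, num_classes=3)
print(precision)
print(recall)
```

A suspiciously high accuracy is worth checking against these per-class numbers and against any overlap between the train and dev sets.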
@LeapGamer Hello, did you solve the multi-class problem?
@lcorbel I'm also experimenting with classification on 20 Newsgroups, but I haven't been able to get results. Would you share your experiment code? Thanks a lot.
Hello @yifenzhong1920, what's your problem?
Hello, has anyone here tested the model with word2vec (W2Vector) embeddings?
@dennybritz I'm using 2 different datasets (1. Full_Economic_News [7775 documents], 2. News_Article_Wikipedia [3000 documents]). When I run ./train.py, the process is suddenly killed and stops after step 100.
How can I solve this issue?
I attached my 2 datasets.
Full-Economic-News.txt
bare_train_news_article_wikipedia.txt