
tf-tabnet's Introduction

TabNet for Tensorflow 2.0

A Tensorflow 2.0 port for the paper TabNet: Attentive Interpretable Tabular Learning, whose original codebase is available at https://github.com/google-research/google-research/blob/master/tabnet.

The above image is obtained from the paper, where the model is built of blocks in two stages - one to attend to the input features and another to construct the output of the model.

Differences from Paper

There are two major differences from the paper and the official implementation.

  1. This implementation offers a choice in the normalization method, between the regular Batch Normalization from the paper and Group Normalization.

    • It has been observed that the paper uses very large batch sizes to stabilize Batch Normalization and obtain good generalization. An issue with this is computational cost.
    • Therefore Group Normalization (with number of groups set as 1, aka Instance Normalization) offers a reasonable alternative which is independent of the batch size.
    • One can set num_groups to 1 for Instance Normalization type behaviour, or to -1 for Layer Normalization type behaviour.
  2. This implementation does not strictly need feature columns as input.

    • While this model was originally developed for tabular data, there is no hard requirement for that to be the only type of input it accepts.
    • By passing feature_columns=None and explicitly specifying the input dimensionality of the data (using num_features), we can get a semi-interpretable result from even image data (after flattening it into a long vector).
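
The batch-size independence mentioned in the first point can be illustrated with a minimal pure-Python sketch of group normalization. This is a generic formulation, not this package's layer: the names group_normalize and normalize_batch, the eps value, and the absence of learned scale and offset parameters are assumptions made for illustration. Each sample is normalized with statistics computed from its own features only, so its output does not change when the rest of the batch changes.

```python
import math

def group_normalize(sample, num_groups, eps=1e-5):
    """Normalize one sample (a flat list of features) group by group."""
    if len(sample) % num_groups != 0:
        raise ValueError("number of features must be divisible by num_groups")
    group_size = len(sample) // num_groups
    out = []
    for g in range(num_groups):
        group = sample[g * group_size:(g + 1) * group_size]
        mean = sum(group) / group_size
        var = sum((v - mean) ** 2 for v in group) / group_size
        out.extend((v - mean) / math.sqrt(var + eps) for v in group)
    return out

def normalize_batch(batch, num_groups):
    # Statistics are never shared between samples.
    return [group_normalize(sample, num_groups) for sample in batch]

sample = [1.0, 2.0, 3.0, 4.0]
small_batch = normalize_batch([sample], num_groups=2)
large_batch = normalize_batch(
    [sample, [10.0, -5.0, 0.5, 7.0], [0.0, 0.0, 1.0, 1.0]], num_groups=2
)
# The first sample is normalized identically in both batches.
print(small_batch[0] == large_batch[0])
```

Batch Normalization, by contrast, computes its statistics across the batch axis, which is why its behaviour depends on the batch size.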

Installation

  • For latest release branch
$ pip install --upgrade tabnet
  • For Master branch.
$ pip install git+https://github.com/titu1994/tf-TabNet.git

As Tensorflow can be used with either a CPU or GPU, the package can be installed with the conditional requirements using [cpu] or [gpu] as follows.

$ pip install tabnet[cpu]
$ pip install tabnet[gpu]

Usage

The script tabnet.py can be imported to yield either the TabNet building block, or the TabNetClassification and TabNetRegression models, which add appropriate heads for the basic TabNet model. If the classification or regression head is to be customized, it is recommended to compose a new model with the TabNet as the base of the model.

from tabnet import TabNet, TabNetClassifier

model = TabNetClassifier(feature_list, num_classes, ...)

Stacked TabNets

Regular TabNets can be stacked into various layers, thereby reducing interpretability but improving model capacity.

from tabnet import StackedTabNetClassifier

model = StackedTabNetClassifier(feature_list, num_classes, num_layers, ...)

As the models use custom objects, it is necessary to import custom_objects.py in an evaluation only script.

Mask Visualization

The masks of the TabNet can be obtained by using the TabNet class properties

  • feature_selection_masks: Returns a list of 1 or more masks at intermediate decision steps. Number of masks = number of decision steps - 1
  • aggregate_feature_selection_mask: Returns a single tensor which is the average activation of the masks over that batch of training samples.

These masks can be obtained as TabNet.feature_selection_masks. Since the TabNetClassification and TabNetRegression models are composed of TabNet, the masks can be obtained as model.tabnet.*

Mask Generation must be in Eager Execution Mode

Note: Due to autograph, the outputs of the model when using fit() or predict() Keras APIs will generally be graph based Tensors, not EagerTensors. Since the masks are generated inside the Model.call() method, it is necessary to force the model to behave in Eager execution mode, not in Graph mode.

Therefore there are two ways to force the model into eager mode:

  1. Get tensor data samples, and directly call the model using this data as below :
x, _ = next(iter(tf_dataset))  # Assuming it generates an (x, y) tuple.
_ = model(x)  # This forces eager execution.
  2. Alternatively, build a separate model (passing the dynamic=True flag to the model constructor), load the weights and parameters into this model, and call new_model.predict(x). This should also force eager execution mode.
new_model = TabNetClassification(..., dynamic=True)
new_model.load_weights('path/to/weights')

x, _ = next(iter(tf_dataset))  # Assuming it generates an (x, y) tuple.
new_model.predict(x)

After the model has been forced into Eager Execution mode, the masks can be visualized in Tensorboard as follows -

writer = tf.summary.create_file_writer("logs/")
with writer.as_default():
    for i, mask in enumerate(model.tabnet.feature_selection_masks):
        print("Saving mask {} of shape {}".format(i + 1, mask.shape))
        tf.summary.image('mask_at_iter_{}'.format(i + 1), step=0, data=mask, max_outputs=1)
        writer.flush()

    agg_mask = model.tabnet.aggregate_feature_selection_mask
    print("Saving aggregate mask of shape", agg_mask.shape)
    tf.summary.image("Aggregate Mask", step=0, data=agg_mask, max_outputs=1)
    writer.flush()
writer.close()

Requirements

  • Tensorflow 2.0+ (1.14+ with V2 compat enabled may be sufficient for 1.x)
  • Tensorflow-datasets (Only required for evaluating train_iris.py)

tf-tabnet's People

Contributors

atamazian, dlperf, mausam3407, sakshit1406, titu1994, zmjjmz

tf-tabnet's Issues

Check failed: work_element_count > 0 (0 vs. 0)

Hi,
If I run your mnist example I get an error message like this:

"Check failed: work_element_count > 0 (0 vs. 0)"

This seems to be related to TensorFlow, but TensorFlow generally works fine for me, so I'm posting it here. Do you have any idea?

Thanks!

Full error log is:

2020-03-05 21:26:48.868876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:0a:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.607GHz coreCount: 28 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 451.17GiB/s
2020-03-05 21:26:48.869296: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-03-05 21:26:48.869429: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-03-05 21:26:48.869561: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-03-05 21:26:48.869684: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-03-05 21:26:48.869806: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-03-05 21:26:48.869930: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-03-05 21:26:48.870056: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-03-05 21:26:48.870515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-03-05 21:26:48.870902: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:0a:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.607GHz coreCount: 28 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 451.17GiB/s
2020-03-05 21:26:48.871150: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-03-05 21:26:48.871277: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-03-05 21:26:48.871400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-03-05 21:26:48.871526: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-03-05 21:26:48.871655: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-03-05 21:26:48.871784: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-03-05 21:26:48.871912: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-03-05 21:26:48.872321: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-03-05 21:26:48.872526: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-05 21:26:48.872759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-03-05 21:26:48.872893: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-03-05 21:26:48.873502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9011 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1)
Epoch 1/5
2020-03-05 21:26:56.571332: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-03-05 21:26:56.963017: F .\tensorflow/core/util/gpu_launch_config.h:129] Check failed: work_element_count > 0 (0 vs. 0)
Process finished with exit code -1073740791 (0xC0000409)

How to use tabnet model to train models with both categorical features and continuous features?

I want to train a TabNetClassifier model on my own data, which mixes categorical and continuous features. The examples 'train_embedding.py' and 'train_iris.py' in /examples each use only their own kind of data. I handle continuous columns with tf.feature_column.numeric_column(col_name) and categorical columns with tf.feature_column.indicator_column(tf.feature_column.categorical_column_with_vocabulary_list(col_name,[1,2,10,20,30])), then pass them all as the feature_columns parameter of TabNetClassifier(). The console then raises a size-mismatch error such as:

ValueError: Dimensions must be equal, but are 89 and 458 for '{{node tab_net_classifier_2/tab_net_2/Mul_10}} = Mul[T=DT_FLOAT](tab_net_classifier_2/tab_net_2/PartitionedCall, tab_net_classifier_2/tab_net_2/input_gn/Reshape_3)' with input shapes: [?,89], [?,458]

So, how do I train the model with both kinds of data as input?

Unable to use the package with scikeras KerasClassifier Wrapper

Hi, I wanted to use the package with KerasClassifier so that it has sklearn compatibility, but whenever I try to pass it as the build_function argument of the wrapper, I run into a lot of issues. Is there any way to make it work with KerasClassifier?

tf-TabNet due for new PyPI release

Hey there,

I'm using tabnet in some stuff and would like to be able to install the version with the new fixes from PyPI. The latest release is from November, so I think it's due for a version bump.

Thanks!

Performance issues in /examples/train_embedding.py (by P3)

Hello! I've found a performance issue in /examples/train_embedding.py: batch() should be called before map(), which could make your program more efficient. Here is the tensorflow document to support it.

Detailed description is listed below:

  • ds_train = ds_train.batch(BATCH_SIZE)(line 33) should be called before ds_train = ds_train.map(transform)(line 32).
  • ds_test = ds_test.batch(BATCH_SIZE)(line 37) should be called before ds_test = ds_test.map(transform)(line 36).

In addition, check whether the function passed to map() (e.g., transform in ds_test.map(transform)) is affected by this change, so that the modified code still works properly. For example, if transform expected input of shape (x, y, z) before the fix, it will receive input of shape (batch_size, x, y, z) afterwards.
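
The effect of the reordering can be sketched without TensorFlow. The following is a pure-Python illustration, not tf.data code: make_batches, transform_one and transform_batch are hypothetical helpers, and the call counter is only there to make the difference visible. Mapping first invokes the transform once per element, while batching first invokes a vectorized transform once per batch, and both orders yield the same values.

```python
from itertools import islice

def make_batches(iterable, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

calls = {"per_element": 0, "per_batch": 0}

def transform_one(x):
    # Element-wise transform: sees a single value.
    calls["per_element"] += 1
    return x * 2.0

def transform_batch(xs):
    # Vectorized transform: sees a whole batch, so it gains a leading axis.
    calls["per_batch"] += 1
    return [x * 2.0 for x in xs]

data = list(range(10))
map_then_batch = list(make_batches(map(transform_one, data), 4))
batch_then_map = list(map(transform_batch, make_batches(data, 4)))

print(map_then_batch == batch_then_map)
print(calls)
```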

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.

tabnet mask loss bug

In tabnet.py, line 292 sets entropy_loss = 0.
This causes entropy_loss to always be 0.
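
For context, the sparsity regularization described in the TabNet paper is the entropy of the attention masks, averaged over the batch and over the decision steps. A minimal pure-Python sketch of that quantity is given below. It is not the code in tabnet.py: the function name mask_entropy, the nested-list layout, and the eps default are assumptions for illustration.

```python
import math

def mask_entropy(masks, eps=1e-5):
    """Average entropy of attention masks.

    masks: list over decision steps; each entry is a list over samples;
    each sample is a list of per-feature mask values that sum to 1.
    """
    total = 0.0
    for step_mask in masks:
        per_sample = [
            sum(-m * math.log(m + eps) for m in row) for row in step_mask
        ]
        total += sum(per_sample) / len(per_sample)
    return total / len(masks)

sparse = [[[1.0, 0.0, 0.0, 0.0]]]       # all attention on one feature
uniform = [[[0.25, 0.25, 0.25, 0.25]]]  # attention spread evenly

# A sparse mask has lower entropy than a uniform one.
print(mask_entropy(sparse) < mask_entropy(uniform))
```

A value that stays at exactly 0 regardless of the masks would mean the regularization term never contributes to training.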

TabNetRegressor Example

Hello, I am looking for an example of using TabNetRegressor on data that has both numeric and categorical features, and I can't find one in the examples. Thanks.

AttributeError: 'Tensor' object has no attribute 'numpy'

I got this error when converting the model.tabnet.aggregate_feature_selection_mask tensor into a NumPy array.
In Google Colab, this works simply by calling .numpy() (my notebook).
On my local machine, I wrote a Python script and tried many suggested fixes for it.

After that, I thought I needed to create a tensor operation to make it work. However, that produced another error.

My system information:

  • Tensorflow 2.5.0
  • Tabnet 0.16

Please help me to get the values in the tensor.
Thank you very much.

[Question / Not sure if it's an issue] Suggested choice of hyperparameters feat_dim (N_a) == output_dim (N_d) leads to ValueError

Both in docstring of TabNet class and in the original article they suggest N_a == N_d for most datasets.
(Dimensionalities of hidden representations and the outputs of each decision step)
But in the code (tabnet.py:129) there is a ValueError which is raised if N_a <= N_d.
I'm not sure if it's an issue or it's my comprehension of the code which is not correct.
Could you please clarify this point ?
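
One reading that would reconcile the docstring with the check, stated here as an assumption rather than a confirmed description of tabnet.py: the feature transformer output of width feature_dim is split, with the first output_dim columns used as the decision output (N_d) and the remaining feature_dim - output_dim columns passed on to the attentive transformer (N_a). Under that assumption, N_a == N_d corresponds to feature_dim == 2 * output_dim, and feature_dim <= output_dim would leave nothing for the attention part. The helper split_hidden below is hypothetical.

```python
def split_hidden(hidden, output_dim):
    """Split one hidden vector into (decision part, attention part)."""
    feature_dim = len(hidden)
    if feature_dim <= output_dim:
        # Nothing would be left for the attention part.
        raise ValueError("feature_dim must be larger than output_dim")
    return hidden[:output_dim], hidden[output_dim:]

hidden = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # feature_dim = 8
decision, attention = split_hidden(hidden, output_dim=4)
print(len(decision), len(attention))  # prints "4 4", i.e. N_d == N_a
```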

P.S.
I'd like to thank you for your implementation of a very interesting paper.
I'm trying to use tabnet module for a small POC with an imbalanced dataset containing ~20k samples, mostly categorical data.

Feature transformer is all decision-step independent transformer blocks

Hey there,

I think there's a mistake in the feature transformer implementation here (i.e., Fig. b) compared to the original implementation. My understanding based on the paper and the original is that Transforms 3 & 4 should be dependent on the decision step (i.e., the weights should not be shared between the decision steps), whereas Transforms 1 & 2 should be the same weights for each decision step.

Note that this is implemented in the original model by having the (I think now deprecated) tf.layers reuse flag set for the first two, and for the rest to have different names based on the decision step.

If I'm understanding this code correctly (and I might not be), there are only 4 instances of the TransformBlock model, and there's no difference in how these are instantiated or called between blocks 1&2 and blocks 3&4. I'm fairly certain this means that they're all shared weights between decision steps, which does not match the model detailed in the paper.

A possible solution would be to make two lists (one each for Transforms 3 & 4) of num_decision_steps TransformBlock instances and reference the appropriate TransformBlock in the call method. I think this would be a fairly simple change:

In __init__

self.transform_f3_list = [TransformBlock(2 * self.feature_dim, self.batch_momentum,       
   self.virtual_batch_size, self.num_groups) for _ in range(self.num_decision_steps)]

and then in call

transform_f3 = self.transform_f3_list[ni](transform_f2, training=training)
transform_f3 = (glu(transform_f3, self.feature_dim) +
                          transform_f2) * tf.math.sqrt(0.5)
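
The sharing pattern proposed above can be sketched in isolation, without TensorFlow. Block is a stand-in for TransformBlock that only records which set of weights it owns; the class FeatureTransformer, the method blocks_for_step, and the attribute names are hypothetical and do not come from tabnet.py.

```python
import itertools

class Block:
    """Stand-in for a TransformBlock; each instance owns its own weights."""
    _ids = itertools.count()

    def __init__(self):
        self.weight_id = next(Block._ids)

class FeatureTransformer:
    def __init__(self, num_decision_steps):
        # Transforms 1 & 2: one instance each, shared by every decision step.
        self.f1 = Block()
        self.f2 = Block()
        # Transforms 3 & 4: one instance per decision step.
        self.f3_list = [Block() for _ in range(num_decision_steps)]
        self.f4_list = [Block() for _ in range(num_decision_steps)]

    def blocks_for_step(self, ni):
        return [self.f1, self.f2, self.f3_list[ni], self.f4_list[ni]]

ft = FeatureTransformer(num_decision_steps=3)
step0 = ft.blocks_for_step(0)
step1 = ft.blocks_for_step(1)
# Shared block is the same object; step-dependent block is not.
print(step0[0] is step1[0], step0[2] is step1[2])  # prints "True False"
```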

Let me know what you think! I may go ahead and try this myself and submit a PR if it's appropriate.

F ./tensorflow/core/util/gpu_launch_config.h:129] Check failed: work_element_count > 0.

I would like to use Hyperband to find the hyperparameters of a CNN autoencoder, but on the GPU I hit this error: F ./tensorflow/core/util/gpu_launch_config.h:129] Check failed: work_element_count > 0.

Any solution?

Here is my code.




import kerastuner as kt
#from google.colab import drive
import pandas as pd
import glob
import pdb
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import os
import subprocess
import h5py
from tensorflow.keras import Sequential, layers, Model


#drive.mount("/content/gdrive")
data_path=r'C:\Users\q75714hz\New folder\UVLIF\PLAIR_HK\Processed\'

"""1) Define custom generators that will load the data from multiple CSV files in batches during the training phase. """

from tensorflow.keras.utils import Sequence

class DataGenerator(Sequence):
    """Generates data for Keras
    Sequence based data generator. Suitable for building data generator for training and prediction.
    CONTAINS SPECIFIC INFO FOR AUTOENCODERS
    """
    def __init__(self, list_files, to_fit=True, mini_batch = 1000, batch_size=1, shuffle=True):
        """Initialization
        :param data_path: path to datafiles
        :param list_files: list of image labels (file names)
        :param to_fit: True to return X and y, False to return X only
        :param batch_size: batch size at each iteration
        :param dim: tuple indicating image dimension
        :param shuffle: True to shuffle label indexes after every epoch
        """

        # We have to create a mapping to the file name and the subset of data
        # extracted from that file as a dictionary or list.
        # To do this we need to count the number of lines in each file
        # and then divide that by the mini_batch and loop through each
        # chunk and define a starting point to extract the data

        self.list_files = list_files
        self.mini_batch = mini_batch
        self.data_path = data_path
        #self.mask_path = mask_path
        self.to_fit = to_fit
        self.batch_size = batch_size
        #self.dim = dim
        #self.n_channels = n_channels
        #self.n_classes = n_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        """Denotes the number of batches per epoch
        :return: number of batches per epoch
        """
        return int(np.floor(len(self.list_files) / self.batch_size))

    def __getitem__(self, index):
        """Generate one batch of data
        :param index: index of the batch
        :return: X and y when fitting. X only when predicting
        """
        # Generate indexes of the batch
        indexes = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]

        # Find list of IDs
        list_files_temp = [self.list_files[k] for k in indexes]

        # Generate data
        X = self._generate_X(list_files_temp)

        #return X

        if self.to_fit:
            y = X
            return (X, y)
        else:
            return X

    def on_epoch_end(self):
        """Updates indexes after each epoch
        """
        self.indexes = np.arange(len(self.list_files))
        if self.shuffle == True:
            np.random.shuffle(self.indexes)

    def _generate_X(self, list_files_temp):
        """Generates data containing batch_size images
        :param list_IDs_temp: list of label ids to load
        :return: batch of images
        """
        # Initialization
        #X = np.empty((self.batch_size, self.dim, self.n_channels))


        if len(list_files_temp) == 1:
            path = list_files_temp[0][0]
            start_loc = list_files_temp[0][1]
            end_loc = list_files_temp[0][2]
            stop_point = min(start_loc+self.mini_batch,end_loc)

            #info_df=pd.read_csv(path,skiprows=start_loc+1,nrows=min(self.mini_batch,end_loc-start_loc))
            #Scattering_df = info_df.iloc[:, 34::][(info_df.iloc[:, 34::].T != 0).any()]
            #Scattering_df[Scattering_df < 0] = 0
            #Scattering_df=Scattering_df.div(Scattering_df.max(axis=1), axis=0)
            ## extract the numpy array and then reshape back to the original size.
            #images = Scattering_df.loc[:, Scattering_df.columns != 'label'].to_numpy()
            #X = np.reshape(images, (images.shape[0], 80, 24, 1))

            hf = h5py.File(path, 'r')
            data=hf['test']['block0_values'][start_loc:stop_point, 33::]

            #info_df=pd.read_hdf(filename, "test",start=start_loc,stop=stop_point)
            #data=info_df.iloc[:, 33::].to_numpy()
            data = data[~np.all(data == 0, axis=1)]
            data=data[~np.isnan(data).all(axis=1)]
            data=data[np.isfinite(data).all(axis=1)]

            # basic strategy here is to convert the image into a sharpened replica
            data=data/np.max(data,axis=1)[:,None]
            #std=np.std(data,axis=1)[:,None]
            #mean=np.mean(data,axis=1)[:,None]
            #data[data >= 0.001]=1.0
            #data[data < 0.001]=0.0

            #X = info_df.iloc[:, 33::].to_numpy().reshape(info_df.shape[0],80,24,1)
            #data[data > 0.0001]=1.0
            X=data.reshape(data.shape[0],80,24,1)

        else:
            Scattering_list=[None] * len(list_files_temp)
            step=0
            for entry in list_files_temp:
                path = entry[0]
                start_loc = entry[1]
                end_loc = entry[2]
                stop_point = min(start_loc+self.mini_batch,end_loc)
                #info_df=pd.read_csv(path,na_filter=False,header=None,skiprows=start_loc+1,nrows=min(self.mini_batch,end_loc-start_loc))
                #Scattering_df = info_df.iloc[:, 34::][(info_df.iloc[:, 34::].T != 0).any()]
                #Scattering_df[Scattering_df < 0] = 0
                #Scattering_df=Scattering_df.div(Scattering_df.max(axis=1), axis=0)
                #Scattering_df = Scattering_df.dropna()
                #info_df=pd.read_hdf(filename, "test",start=start_loc,stop=stop_point)
                hf = h5py.File(path, 'r')
                Scattering_list[step]=hf['test']['block0_values'][start_loc:stop_point, 33::]
                #Scattering_list.append(info_df)
                step+=1
            #Scattering_df2=pd.concat(Scattering_list,axis=0) #pd.DataFrame.from_dict(Scattering_dict, orient='index')
            #extract the numpy array and then reshape back to the original size.
            #images = Scattering_df2.loc[:, Scattering_df2.columns != 'label'].to_numpy()
            #pdb.set_trace()
            #data=Scattering_df2.iloc[:, 33::].to_numpy()
            data = Scattering_list[~np.all(Scattering_list == 0, axis=1)]
            data=data[~np.isnan(data).all(axis=1)]
            data=data[np.isfinite(data).all(axis=1)]
            data=data/np.max(data,axis=1)[:,None]
            #data[data >= 0.001]=1.0
            #data[data < 0.001]=0.0
            #data[data > 0.0001]=1.0
            #X = Scattering_df2.iloc[:, 33::].to_numpy().reshape(Scattering_df2.shape[0],80,24,1)
            X=data.reshape(data.shape[0],80,24,1)

        return X

list_files = glob.glob(data_path+'.hdf')

# define a minibatch which would normally be used in the standard training method
minibatch = 1000

list_of_mappings = []

for filename in list_files:
    # lines = int(subprocess.getoutput("sed -n '$=' " + filename))
    hf=pd.read_hdf(filename,mode='r')
    lines=int(hf.shape[0])
    #pdb.set_trace()
    chunks = int(np.ceil(lines / minibatch))
    for step in range(chunks):
        sublist=[]
        sublist.append(filename)
        sublist.append(step*minibatch)
        sublist.append(min((step + 1)*minibatch,lines-2))
        list_of_mappings.append(sublist)
print(list_of_mappings[0:10])
print(list_of_mappings[11:20])
# pdb.set_trace()

training_generator = DataGenerator(list_of_mappings[:])
validation_generator = DataGenerator(list_of_mappings[:])
print(len(list_of_mappings[:]))
print(len(training_generator))


# define tunner of ae

def model(hp):

    original_inputs = keras.Input(shape=(80, 24, 1), name='encoder_input')
    variance_scale = 0.3
    init = tf.keras.initializers.VarianceScaling(scale=variance_scale, mode='fan_in', distribution='uniform')
    layer= layers.Conv2D(filters=hp.Choice("num_filters_layer_1", values=[8, 32], default=8), kernel_size=3,
                         activation='relu', kernel_initializer=init, padding='same',
                         strides=1)(original_inputs)
    layer1 = layers.Conv2D(filters=hp.Int("num_filters_layer_2", min_value=16, max_value=64, step=16), kernel_size=3,
                           activation='relu', kernel_initializer=init, padding='same',
                           strides=1)(layer)
    layer2 = layers.Conv2D(filters=hp.Int("num_filters_layer_3", min_value=16, max_value=96, step=16), kernel_size=3,
                           activation='relu', kernel_initializer=init, padding='same',
                           strides=1)(layer1)
    layer3 = layers.Conv2D(filters=hp.Int("num_filters_layer_4", min_value=16, max_value=112, step=16), kernel_size=3,
                           activation='relu', kernel_initializer=init, padding='same',
                           strides=1)(layer2)



    layer_flatten = layers.Flatten()(layer3)
    h = layers.Dense(hp.Int("num_Dense", 0, 600, 200), activation='relu', name="encoding_5")(layer_flatten)
    latent_layer = layers.Dense(hp.Int("latent_space", 20, 40, 10), activation='relu')(h)

    #decoder
    latent_inputs_cnn = keras.Input(shape=(latent_layer.shape[1],), name='latent_input')
    dec_layer1_cnn = layers.Dense(h.shape[1], activation='relu')(latent_inputs_cnn)
    dec_layer2_cnn = layers.Dense(layer_flatten.shape[1], activation='relu')(dec_layer1_cnn)
    dec_layer = layers.Reshape((layer3.shape[1], layer3.shape[2], layer3.shape[3]))(dec_layer2_cnn)

    dec_layer3_cnn = layers.Conv2DTranspose(hp.Int("num_filters_layer_3", min_value=16, max_value=96, step=16),
                                            kernel_size=3, activation='relu', kernel_initializer=init,
                                            padding='same', strides=1)(dec_layer)

    dec_layer4_cnn = layers.Conv2DTranspose(filters=hp.Int("num_filters_layer_2", min_value=16, max_value=64, step=16), kernel_size=3,
                                            activation='relu', kernel_initializer=init, padding='same',
                                            strides=1)(dec_layer3_cnn)


    dec_layer5_cnn=layers.Conv2DTranspose(filters=hp.Choice("num_filters_layer_1",values=[8,32],default=8), kernel_size=3, activation='relu', kernel_initializer=init,
                                          padding='same', strides=1)(dec_layer4_cnn)
    dec_layer6_cnn = layers.Conv2DTranspose(original_inputs.shape[3], (3, 3), activation='sigmoid',
                                            kernel_initializer=init, padding='same', strides=1)(dec_layer5_cnn)
    dec_cnn = Model(inputs=latent_inputs_cnn, outputs=dec_layer6_cnn, name='decoder_cnn')
    outputs = dec_cnn(latent_layer)

    cnn_ae = Model(inputs=original_inputs, outputs=outputs, name='cnn_ae')

    cnn_ae.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', values=[1e-3, 1e-4, 1e-5])),
                   loss='binary_crossentropy', metrics=['accuracy'])

    return cnn_ae

tuner = kt.Hyperband(model,
                     objective='loss',
                     max_epochs=5,
                     factor=3,
                     directory='my_dir_SE',
                     project_name='intro_to_kt_se', overwrite=True)



tuner.search(training_generator, epochs=4, workers=4)

# Get the optimal hyperparameters
best_hps =tuner.get_best_hyperparameters(num_trials=1)[0]


# Now we can train the AE and save it, if needs be, for later use.
# Build the model with the optimal hyperparameters and train it on the data for 30 epochs
model = tuner.hypermodel.build(best_hps)
history = model.fit(training_generator, epochs=40, workers=4)
model.save('ae_sequence_scattering_40epochs.h5')
val_acc_per_epoch = history.history['val_loss']
best_epoch = val_acc_per_epoch.index(min(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))
hypermodel = tuner.hypermodel.build(best_hps)
# Retrain the model
history_new = hypermodel.fit(training_generator, epochs=best_epoch,workers=4)
# plot loss history
hypermodel.save('ae_sequence_scattering.h5')

[Maybe an issue?] No inter-sample variation of per-step attention masks

I was trying to get a sense of feature importance from TabNet using the attention masks, and essentially what I want to do is show the distribution over the sample axis of the attention paid to each feature per step. For a first approximation of that I'm plotting the mean attention paid w/the standard deviation as an error bar, and I noticed that the standard deviation is nearly zero!

Screenshot from 2020-06-24 14-13-49

This is the attention masks after running inference on the first batch in the included Iris example, and a text-only version can be found in this gist. The little lines on top of each bar are supposed to be error-bars. I've observed this on a larger dataset with proprietary features (that I can't share here), so it's not just an issue with there being only 50 examples in that inference batch.

So first off - am I interpreting these masks correctly? Secondly, am I wrong in assuming there even should be inter-sample variation here? Or is the model supposed to apply the same attention to each feature in each step, regardless of the sample? I'm not clear on that -- since it seems like the actual feature value should affect the transform_coef block that feeds the mask_values, but maybe I'm misunderstanding this.

FWIW, tested a few things (reducing sparsity coefficient, using batch norm) and I haven't found anything that changes this behavior. I haven't tried the original implementation to see if it exhibits similar behavior, but if it's not supposed to behave like this that would be the next step.
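
The summary described above (mean attention per feature, with the standard deviation across samples as the error bar) can be sketched as follows. This is a minimal standard-library illustration with made-up mask values; it assumes one step's mask has already been reshaped to a samples x features nested list, and the helper name per_feature_stats is hypothetical.

```python
import statistics

def per_feature_stats(mask):
    """mask: list over samples, each a list of per-feature attention values."""
    columns = list(zip(*mask))  # transpose to features x samples
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds

# Hypothetical mask for 3 samples and 3 features.
mask = [
    [0.7, 0.3, 0.0],
    [0.5, 0.5, 0.0],
    [0.6, 0.4, 0.0],
]
means, stds = per_feature_stats(mask)
for j, (m, s) in enumerate(zip(means, stds)):
    print("feature {}: mean={:.3f} std={:.3f}".format(j, m, s))
```

A standard deviation near zero for every feature means each sample receives almost the same attention pattern at that step.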

fitting the model using generators

I am working with a huge file that cannot be fully loaded into memory, so I have to use generators.

Here is the Iris example, in which I am trying to read the data from the CSV file in batches:

!wget https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv
    
def generate_data_from_file(params):
    data_out = pd.read_csv('iris.csv',  
                           skiprows = range(1, params['skiprows']),
                           index_col = 0, 
                           nrows = params['train_upto_row_num'], 
                           chunksize = params['num_observation'])

    for item_df in data_out:
        
        target = item_df['variety']
        item_df = item_df[["sepal.length","sepal.width","petal.length","petal.width"]]
        
        yield np.array(item_df), np.array(target)

types = (tf.float32, tf.int16)

training_params = {'skiprows': 0, 
                   'train_upto_row_num': 40, 
                   'num_observation': 5}

valid_params = {'skiprows': 40, 
                   'train_upto_row_num': 20, 
                   'num_observation': 5}

training_dataset = tf.data.Dataset.from_generator(lambda: generate_data_from_file(training_params),
                                         output_types=types
                                         #, output_shapes=shapes
                                        ).repeat(1)

validation_dataset = tf.data.Dataset.from_generator(lambda: generate_data_from_file(valid_params),
                                         output_types=types
                                         #, output_shapes=shapes
                                        ).repeat(1)

col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

feature_columns = []
for col_name in col_names:
    feature_columns.append(tf.feature_column.numeric_column(col_name))

    
model = tabnet.TabNetClassifier(feature_columns, num_classes=3,
                                feature_dim=8, output_dim=4,
                                num_decision_steps=4, relaxation_factor=1.0,
                                sparsity_coefficient=1e-5, batch_momentum=0.98,
                                virtual_batch_size=None, norm_type='group',
                                num_groups=1)

lr = tf.keras.optimizers.schedules.ExponentialDecay(0.01, decay_steps=100, decay_rate=0.9, staircase=False)
optimizer = tf.keras.optimizers.Adam(lr)
model.compile(optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(training_dataset, 
          epochs=100, 
          validation_data=validation_dataset, 
          verbose=2)

model.summary()

However, it gives me the following error:

ValueError: in user code:

    /opt/conda/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
        return step_function(self, iterator)
    /opt/conda/lib/python3.8/site-packages/tabnet/tabnet.py:421 call  *
        self.activations = self.tabnet(inputs, training=training)
    /opt/conda/lib/python3.8/site-packages/tabnet/tabnet.py:213 call  *
        features = self.input_features(inputs)
    /opt/conda/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py:1012 __call__  **
        outputs = call_fn(inputs, *args, **kwargs)
    /opt/conda/lib/python3.8/site-packages/tensorflow/python/keras/feature_column/dense_features.py:158 call  **
        raise ValueError('We expected a dictionary here. Instead we got: ',

    ValueError: ('We expected a dictionary here. Instead we got: ', <tf.Tensor 'IteratorGetNext:0' shape=<unknown> dtype=float32>)
    

Could you help me format the data correctly? Thanks.
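The error message suggests that the feature-column input layer wants a dictionary keyed by column name rather than a single array. A minimal sketch of that conversion, assuming the four column names from `col_names` above (the helper `to_feature_dict` is hypothetical and not part of the library):

```python
import numpy as np

# Column names taken from the snippet above; they must match the
# names given to tf.feature_column.numeric_column.
COL_NAMES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def to_feature_dict(features, col_names=COL_NAMES):
    """Split a (batch, n_features) array into {column_name: (batch,) array}.

    Hypothetical helper, shown only to illustrate the expected input shape.
    """
    features = np.asarray(features, dtype=np.float32)
    if features.ndim != 2 or features.shape[1] != len(col_names):
        raise ValueError(
            f"expected shape (batch, {len(col_names)}), got {features.shape}"
        )
    # One 1-D array per feature column, keyed by the column's name.
    return {name: features[:, i] for i, name in enumerate(col_names)}


batch = np.array([[5.1, 3.5, 1.4, 0.2],
                  [4.9, 3.0, 1.4, 0.2]])
feature_dict = to_feature_dict(batch)
```

Inside the generator, this would mean yielding `to_feature_dict(np.array(item_df)), np.array(target)`, and passing a matching nested structure such as `({name: tf.float32 for name in col_names}, tf.int16)` as `output_types`. Separately, `categorical_crossentropy` expects one-hot labels; with integer class labels, `sparse_categorical_crossentropy` is the usual choice.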

How can I use it as a feature extractor?

I want to use TabNet together with another network:

Input_1_dataframe ==> Model(x)   ==> Model_features
Input_2_dataframe ==> TabNet(x) ==> Attentive features

Final_layer            ==> Merge( Model_features, Attentive features) ==> Classification

How can I use TabNet as a feature extractor?

Feature Importance

When I calculated the feature importance ranking, I found that the column sums of the mask matrix differed from run to run. Is this normal? I'm confused.
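Raw column sums can plausibly differ between runs, since weights are initialized randomly and the sums also scale with the number of samples. One way to make runs easier to compare is to rescale the column sums so they add up to 1 and then compare the rankings. The sketch below assumes the mask is available as a 2-D array of shape (n_samples, n_features); the function name and the random masks are illustrative only:

```python
import numpy as np


def normalized_importance(mask):
    """Column-sum a (n_samples, n_features) mask and rescale to sum to 1."""
    mask = np.asarray(mask, dtype=np.float64)
    col_sums = mask.sum(axis=0)
    total = col_sums.sum()
    if total == 0:
        # No feature was selected at all; return zeros instead of dividing by 0.
        return np.zeros_like(col_sums)
    return col_sums / total


# Hypothetical masks standing in for two separate training runs.
rng = np.random.default_rng(0)
run_a = rng.random((16, 4))
run_b = rng.random((16, 4)) * 3.0  # different overall scale

importance_a = normalized_importance(run_a)
importance_b = normalized_importance(run_b)

# Feature indices ordered from most to least important for each run.
ranking_a = np.argsort(-importance_a)
ranking_b = np.argsort(-importance_b)
```

If the normalized rankings still disagree strongly between runs, fixing the random seed or averaging over several runs may give a more stable picture.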

Documentation question regarding N_d and N_a

In the TabNet docstring it says:

"""
- Adjustment of the values of Nd and Na is the most efficient way of obtaining a trade-off
between performance and complexity. Nd = Na is a reasonable choice for most datasets.
"""

However, tabnet line 128 is:

"""
if feature_dim <= output_dim:
    raise ValueError("To compute features_for_coef, feature_dim must be larger than output dim")
"""

I'm wondering whether this input validation is wrong or whether the documentation should be adjusted. I'm happy to open a PR with the fix; I just don't know which way it should go.
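One reading that would reconcile the two, stated here as an assumption rather than something confirmed from the code: if `feature_dim` is the total width that gets split into an output part of size `output_dim` (N_d) and a remainder used for the attention coefficients (N_a), then N_d = N_a corresponds to `feature_dim = 2 * output_dim`, which passes the check. A small sketch of that mapping (both helpers are hypothetical):

```python
def dims_from_nd_na(n_d, n_a):
    """Map (N_d, N_a) to (feature_dim, output_dim).

    ASSUMPTION: feature_dim = N_d + N_a and output_dim = N_d.
    This is an interpretation of the quoted check, not documented behaviour.
    """
    if n_d <= 0 or n_a <= 0:
        raise ValueError("n_d and n_a must be positive")
    return n_d + n_a, n_d


def passes_validation(feature_dim, output_dim):
    """Mirror of the quoted check: feature_dim must exceed output_dim."""
    return feature_dim > output_dim


# N_d = N_a = 4 gives feature_dim=8, output_dim=4 under this assumption.
feature_dim, output_dim = dims_from_nd_na(4, 4)
```

Under this reading, passing `feature_dim == output_dim` would leave nothing for the attention part, which would explain the `ValueError`.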

License

Great job - the code looks really clean! Would be keen to try it out on some of my datasets.

Can you kindly tell us what the license is?

Performance issue in the definition of call, tabnet/tabnet.py

Hello, I found a performance issue in the definition of call in tabnet/tabnet.py: tf.cast is created repeatedly during program execution, which reduces efficiency. I think it should be created once, before the loop.

Looking forward to your reply. I would be glad to open a PR to fix it if you are too busy.

v0.1.4 TypeError: 'module' object is not iterable

Hello! When I pip install the latest version, I cannot import the module:

>>> import tabnet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/Jacky/anaconda3/lib/python3.7/site-packages/tabnet/__init__.py", line 15, in <module>
    __all__ = [*tabnets, *stacked_tabnet]
TypeError: 'module' object is not iterable

The issue does not occur if I use 0.1.3.
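The traceback is consistent with `tabnets` and `stacked_tabnet` being module objects at that point: unpacking a module with `*` fails because modules are not iterable. The sketch below reproduces the failure with stand-in modules and shows one possible fix, unpacking each module's `__all__` list instead. The module contents here are hypothetical; only the `__all__ = [*tabnets, *stacked_tabnet]` line is taken from the report:

```python
import types

# Hypothetical stand-ins for the package's submodules.
tabnets = types.ModuleType("tabnets")
tabnets.__all__ = ["TabNet", "TabNetClassifier"]

stacked_tabnet = types.ModuleType("stacked_tabnet")
stacked_tabnet.__all__ = ["StackedTabNet"]  # illustrative name only

# Unpacking the modules themselves raises TypeError, as in the report.
error_message = None
try:
    broken_all = [*tabnets, *stacked_tabnet]
except TypeError as exc:
    error_message = str(exc)

# Unpacking each module's __all__ list works as intended.
combined_all = [*tabnets.__all__, *stacked_tabnet.__all__]
```

Until a fixed release is available, pinning the previous version with `pip install tabnet==0.1.3` matches the workaround described above.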
