cerlymarco / MEDIUM_NoteBook
Repository containing notebooks of my posts on Medium
License: MIT License
How do I split train and test sets for a multivariate analysis? I have a date column and 4 features.
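A minimal sketch of a chronological split (assuming a pandas DataFrame df with a 'date' column and the four feature columns; the 80/20 cutoff is illustrative):

df = df.sort_values('date')                       # keep chronological order, never shuffle
cutoff = int(len(df) * 0.8)                       # first 80% for training
train, test = df.iloc[:cutoff], df.iloc[cutoff:]  # test set stays strictly in the future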
Hello Marco, I was wondering if you could replicate your post https://towardsdatascience.com/time-series-generation-with-vae-lstm-5a6426365a1c but without the categorical variables? I tried to do it, but the model performed really badly.
Hi, I am very interested in the "Graph_TimeSeries_Forecasting" part, but I cannot open the full article linked in the README.md. Is it an academic paper? Can you tell me the name of the article or the website where it is hosted? Thank you very much!
My image size is 225, and the error is:
---> 23 out = np.dot(activation_maps.reshape((img.shape[0]*img.shape[1], 512)), class_weights).reshape(img.shape[0],img.shape[1])
24
25
ValueError: cannot reshape array of size 25690112 into shape (50625,512)
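A quick size check suggests where the mismatch comes from (an inference from the numbers alone, so treat it as an assumption):

assert 224 * 224 * 512 == 25690112   # the actual size of activation_maps in the error
assert 225 * 225 == 50625            # the target shape derived from img.shape
# the maps appear to come from a 224x224 input, so resizing the image to
# 224x224 (the usual VGG input size) should make the reshape consistent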
I used the program as-is on a cat/dog dataset. Even after training for a reasonable number of epochs, print(accuracy) shows 0.0. Is it just my problem, a dataset problem, or something else? Please reply, and thank you for the hard work you've shared.
Hi,
Thank you for tsmoothie. It is a nice library that I often use.
I am trying to bootstrap time series data using 'nbb' as the bootstrap type. I always get this error:
bootstrap_res = residuals[[0], bootstrap_id]
IndexError: index 18590 is out of bounds for axis 1 with size 18590
Here is an MWE that reproduces the error:
import yfinance
from tsmoothie.smoother import SpectralSmoother
from tsmoothie.bootstrap import BootstrappingWrapper

data = yfinance.download('^GSPC')
ret = data['Adj Close'].pct_change()

btype = ['nbb', 'mbb', 'cbb', 'sb'][0]
spc = SpectralSmoother(smooth_fraction=0.18, pad_len=12)
bts = BootstrappingWrapper(spc, bootstrap_type=btype, block_length=24)
bts_samples = bts.sample(data=ret.dropna(), n_samples=10)
It looks like 'nbb' does not take into account the sample size of the original data.
Thank you for the great tutorial! I didn't understand the logic of how you merged the time series of different cities into one and fed it into the model. I am working with a similar dataset where I have time series for different cities: I take the time series of each city separately, train the model, and start fresh for the next city. I want to build one heterogeneous model that takes the data of all the cities together (a generic pooling sketch follows below).
Thank you in advance!
deb
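One common way to pool several cities into a single training set (a sketch of a generic approach, not necessarily the notebook's; city_series is a hypothetical list of 1D numpy arrays, one per city):

import numpy as np

def make_windows(series, window):
    # sliding windows of length `window`; the next value is the target
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

window = 24
X_all, y_all, city_all = [], [], []
for city_id, series in enumerate(city_series):
    X, y = make_windows(series, window)
    X_all.append(X)
    y_all.append(y)
    city_all.append(np.full(len(y), city_id))   # city id as an extra (embeddable) feature

X_all, y_all = np.concatenate(X_all), np.concatenate(y_all)
city_all = np.concatenate(city_all)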
Hello, excuse me: where are the images for training the U-net in Dress_segmentation?
@cerlymarco
Hello, excuse me: where are the images for training the U-net?
@cerlymarco
Dear Marco
Thank you for sharing the VAE_timeseries code.
Hi Marco, how are you?
I am using your code for extreme event forecasting and I have a problem when predicting the new features: after the training process in a first notebook, I saved the architecture and the weights of the encoder model (encoder = Model(inputs_ae, encoded_ae)). When I load the layers and the weights in a second notebook, each time I run that second notebook on the same dataset I get different results from encoder.predict(X_train). Do you have any idea why this is happening?
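One plausible cause (an assumption, based on the stochastic-dropout pattern used elsewhere in these notebooks): if any encoder layer was built with training=True, dropout stays active at prediction time, so every predict call samples a different sub-network. A quick way to test the hypothesis:

import numpy as np

p1 = encoder.predict(X_train)
p2 = encoder.predict(X_train)
print(np.allclose(p1, p2))   # False here points to dropout (or another
                             # random op) being active at inference time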
Thanks a lot for this material mate, great stuff.
X = np.concatenate([X_train_c, X_train_o, X_test_c, X_test_o], axis=0)

sequence_autoencoder.fit(X[:len(X_train_c)+len(X_train_o)], X[:len(X_train_c)+len(X_train_o)],
                         batch_size=128, epochs=100, verbose=2, shuffle=True)
I was going through the extreme forecasting code and I saw that you used only the sequence of average prices as input to the autoencoder. I was wondering why you didn't use the extra features in the input to the autoencoder as well? To my knowledge, an autoencoder learns a complex representation of the feature space, so including more features should help it.
Thanks in advance.
When I run this code block:

import os
import random
import numpy as np
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping

### SET SEEDS FOR REPRODUCIBILITY ###
tf.compat.v1.set_random_seed(33)
os.environ['PYTHONHASHSEED'] = str(33)
np.random.seed(33)
random.seed(33)

session_conf = tf.compat.v1.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1
)
sess = tf.compat.v1.Session(
    graph=tf.compat.v1.get_default_graph(),
    config=session_conf
)
tf.compat.v1.keras.backend.set_session(sess)

es = EarlyStopping(monitor='val_loss', mode='auto', restore_best_weights=True, verbose=1, patience=5)
model = get_model()
model.fit(wrap_generator(train_generator),
          steps_per_epoch=train_generator.samples / train_generator.batch_size,
          epochs=20)
I get this error:
InvalidArgumentError: Requested tensor connection from unknown node: "dense_2_target:0".
I wanted to ask if it is possible to use this notebook for forecasting as well, with the predict function from tslearn together with tsmoothie.
And if so, could you be so kind as to show how to predict 1 or 2 time steps ahead, with the corresponding Python code to try it out?
One additional question: would you recommend the TimeSeries_Smoothing_Forecasting notebook for financial data as well? I am quite curious.
Thanks a lot for your effort and answer
Matthias
Thank you for sharing that. That's great.
I can't open your Medium articles because I would have to pay to read them.
Hi there,
I like your GitHub repo!
I want to implement your RNN heatmap visualization in my task. During testing I found that you are using the wrong indices in the activation_grad function. In your example the output shape is (15, 32) while the dimension of the weights is 15, so when the activation_grad(seq, model) function creates the cam variable it uses the first 15 of the 32 features. When somebody (like me) uses an output with shape (500, 20), it raises an error, because in that scenario the dimension of the weights will be 500.
I attached a picture of the problem.
Hi
I am confused about how to get the predicted average price that the model has learnt after this code:
### DEFINE LSTM FORECASTER ###
inputs2 = Input(shape=(X_train2.shape[1], X_train2.shape[2]))
lstm2 = LSTM(128, return_sequences=True, dropout=0.3)(inputs2, training=True)
lstm2 = LSTM(32, return_sequences=False, dropout=0.3)(lstm2, training=True)
dense2 = Dense(50)(lstm2)
out2 = Dense(1)(dense2)
model2 = Model(inputs2, out2)
model2.compile(loss='mse', optimizer='adam', metrics=['mse'])
### FIT FORECASTER ###
history = model2.fit(X_train2, y_train2, epochs=30, batch_size=128, verbose=2, shuffle=True)
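A minimal sketch of getting predictions from the fitted forecaster (assuming a test set X_test2 shaped like X_train2; since the LSTM layers above are built with training=True, dropout stays active at inference, so predictions are typically sampled several times and averaged):

import numpy as np

preds = np.stack([model2.predict(X_test2).ravel() for _ in range(100)])
mean_pred = preds.mean(axis=0)   # point forecast of the average price
std_pred = preds.std(axis=0)     # spread induced by the stochastic dropout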
Sorry for asking my query like this, sir; I urge you to please write a Medium article on attention too. This is about your StackOverflow answer (I have too low a reputation to post a comment there):
https://stackoverflow.com/questions/62948332/how-to-add-attention-layer-to-a-bi-lstm/62949137#62949137
Using this I have implemented the custom attention layer as well. What is the difference between return_sequences=True and return_sequences=False? Is return_sequences=True able to return the weighted sum of the attention vectors? I am trying to implement the attention of this paper. As per the paper, the context vector is the weighted sum of the softmax attention vectors at a given time step. With return_sequences=True, am I able to get the weighted sum of the context vectors over all time steps, or do I need to change it? The demo attention code is:

from tensorflow.keras.layers import Layer
from tensorflow.keras import backend as K

class Attention(Layer):

    def __init__(self, return_sequences=True):
        self.return_sequences = return_sequences
        super(Attention, self).__init__()

    def build(self, input_shape):
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1),
                                 initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1),
                                 initializer="zeros")
        super(Attention, self).build(input_shape)

    def call(self, x):
        e = K.tanh(K.dot(x, self.W) + self.b)   # attention score per time step
        a = K.softmax(e, axis=1)                # normalize over the time axis
        output = x * a                          # weight the inputs
        if self.return_sequences:
            return output                       # weighted sequence
        return K.sum(output, axis=1)            # context vector (weighted sum)

Please help me with this, sir; I've been stuck on it for days. Thanks in advance.
Hi,
I want to ask: in https://github.com/cerlymarco/MEDIUM_NoteBook/blob/master/TimeSeries_Cluster/TS_Cluster.ipynb, I couldn't understand why you put ks_matrix into sch.distance.pdist. What does it mean?
This is excellent work.
Question regarding NeuralNet FeatureImportance.
For a classification problem, while calculating the causation, what should we include: the predicted classes or the predicted probabilities (referring to "shuff_pred" & "real_pred")?
Thanks
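A minimal sketch of the probability-based variant of the question above (one common choice, not necessarily the notebook's implementation; assumes an sklearn-style model exposing predict_proba):

import numpy as np

def permutation_importance_proba(model, X, n_repeats=10, seed=33):
    # swap in model.predict(...) here to use class labels instead
    rng = np.random.default_rng(seed)
    real_pred = model.predict_proba(X)
    scores = []
    for j in range(X.shape[1]):
        diffs = []
        for _ in range(n_repeats):
            X_shuff = X.copy()
            rng.shuffle(X_shuff[:, j])            # break feature j only
            shuff_pred = model.predict_proba(X_shuff)
            diffs.append(np.abs(real_pred - shuff_pred).mean())
        scores.append(np.mean(diffs))
    return np.array(scores)                       # higher = more important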
@cerlymarco, Hi and many thanks for making your work available in this repo, it's an invaluable resource for a ML newbie like me.
Recently I've been thinking about clustering time series to ensure that the input dataset is focused for the ML modelling phase. Therefore, the TimeSeries_Cluster article (and code) is of particular interest to me.
In trying to understand the various steps in the code presented, it seems that the standardized data values are all clipped to the nearest integer. I think this is due to employing the pd.DataFrame.values attribute: the dtype of the individual ndarrays is automatically selected as integer (as all the initial data values are integers). If, instead of using .values to build the initial list of np.ndarrays, we use .to_numpy(dtype=np.float32), then the standardized data are no longer clipped. This makes a significant difference to the dendrograms.
Perhaps I'm not understanding the data preparation process correctly; anyway, I hope the above observation is of use.
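The dtype behaviour described above is easy to reproduce (a small illustration, not taken from the notebook):

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})   # all-integer data

print(df.values.dtype)                        # int64   - inferred from the data
print(df.to_numpy(dtype=np.float32).dtype)    # float32 - forced explicitly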
I am using TensorFlow 2.0 and got this error; can you please let me know how to resolve it?
Train on 6132 samples
Epoch 1/10
32/6132 [..............................] - ETA: 1:17:35 - loss: 2130.8425
---------------------------------------------------------------------------
NotFoundError Traceback (most recent call last)
<ipython-input-54-d5c9353e9609> in <module>()
4 nnT2V.compile(loss='mse', optimizer='adam')
5
----> 6 nnT2V.fit(X_train, y_train, epochs=10)
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
726 max_queue_size=max_queue_size,
727 workers=workers,
--> 728 use_multiprocessing=use_multiprocessing)
729
730 def evaluate(self,
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\keras\engine\training_v2.py in fit(self, model, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, **kwargs)
322 mode=ModeKeys.TRAIN,
323 training_context=training_context,
--> 324 total_epochs=epochs)
325 cbks.make_logs(model, epoch_logs, training_result, ModeKeys.TRAIN)
326
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\keras\engine\training_v2.py in run_one_epoch(model, iterator, execution_function, dataset_size, batch_size, strategy, steps_per_epoch, num_samples, mode, training_context, total_epochs)
121 step=step, mode=mode, size=current_batch_size) as batch_logs:
122 try:
--> 123 batch_outs = execution_function(iterator)
124 except (StopIteration, errors.OutOfRangeError):
125 # TODO(kaftan): File bug about tf function and errors.OutOfRangeError?
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py in execution_function(input_fn)
84 # `numpy` translates Tensors to values in Eager mode.
85 return nest.map_structure(_non_none_constant_value,
---> 86 distributed_function(input_fn))
87
88 return execution_function
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\eager\def_function.py in __call__(self, *args, **kwds)
455
456 tracing_count = self._get_tracing_count()
--> 457 result = self._call(*args, **kwds)
458 if tracing_count == self._get_tracing_count():
459 self._call_counter.called_without_tracing()
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\eager\def_function.py in _call(self, *args, **kwds)
485 # In this case we have created variables on the first call, so we run the
486 # defunned version which is guaranteed to never create variables.
--> 487 return self._stateless_fn(*args, **kwds) # pylint: disable=not-callable
488 elif self._stateful_fn is not None:
489 # Release the lock early so that multiple threads can perform the call
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\eager\function.py in __call__(self, *args, **kwargs)
1821 """Calls a graph function specialized to the inputs."""
1822 graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
-> 1823 return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
1824
1825 @property
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\eager\function.py in _filtered_call(self, args, kwargs)
1139 if isinstance(t, (ops.Tensor,
1140 resource_variable_ops.BaseResourceVariable))),
-> 1141 self.captured_inputs)
1142
1143 def _call_flat(self, args, captured_inputs, cancellation_manager=None):
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\eager\function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1222 if executing_eagerly:
1223 flat_outputs = forward_function.call(
-> 1224 ctx, args, cancellation_manager=cancellation_manager)
1225 else:
1226 gradient_name = self._delayed_rewrite_functions.register()
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\eager\function.py in call(self, ctx, args, cancellation_manager)
509 inputs=args,
510 attrs=("executor_type", executor_type, "config_proto", config),
--> 511 ctx=ctx)
512 else:
513 outputs = execute.execute_with_cancellation(
~\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
65 else:
66 message = e.message
---> 67 six.raise_from(core._status_to_exception(e.code, message), None)
68 except TypeError as e:
69 keras_symbolic_tensors = [
~\Anaconda3\lib\site-packages\six.py in raise_from(value, from_value)
NotFoundError: Resource AnonymousIterator/AnonymousIterator6/class tensorflow::data::IteratorResource does not exist.
[[node IteratorGetNext (defined at C:\Users\yogesh\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py:1751) ]] [Op:__inference_distributed_function_25761]
Function call stack:
distributed_function
Hi, Sir,
I want to ask: in your notebook here, you define the sequence_autoencoder variable and train it under TRAIN AUTOENCODER:
sequence_autoencoder = Model(inputs_ae, out_ae)
sequence_autoencoder.compile(optimizer='adam', loss='mse')
### TRAIN AUTOENCODER ###
es = EarlyStopping(patience=6, verbose=2, min_delta=0.001, monitor='val_loss', mode='auto', restore_best_weights=True)
sequence_autoencoder.fit(X_train, X_train, validation_data=(X_train, X_train),
batch_size=128, epochs=100, verbose=1, callbacks=[es])
So why is the sequence_autoencoder variable never used in the subsequent code?
Could you please tell me the dimensions for the case where input_dim=4 and output_dim=3? During my implementation, I have some problems with this part of the code:

class T2V(Layer):

    def __init__(self, output_dim=None, **kwargs):
        self.output_dim = output_dim
        super(T2V, self).__init__(**kwargs)

    def build(self, input_shape):
        self.W = self.add_weight(name='W', shape=(1, self.output_dim), initializer='uniform', trainable=True)
        self.P = self.add_weight(name='P', shape=(1, self.output_dim), initializer='uniform', trainable=True)
        self.w = self.add_weight(name='w', shape=(1, 1), initializer='uniform', trainable=True)
        self.p = self.add_weight(name='p', shape=(1, 1), initializer='uniform', trainable=True)
        super(T2V, self).build(input_shape)

For one input and one output it works well, but I don't know how to set the dimensions for the other cases. Thanks.
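One way to generalize the weight shapes to multivariate input (a guess at the intent, since the layer's call method isn't shown here; it assumes call contracts the feature dimension with K.dot against W and w):

    def build(self, input_shape):
        # input_shape = (batch, timesteps, n_features); also covers n_features = 4
        self.W = self.add_weight(name='W', shape=(input_shape[-1], self.output_dim),
                                 initializer='uniform', trainable=True)
        self.P = self.add_weight(name='P', shape=(input_shape[1], self.output_dim),
                                 initializer='uniform', trainable=True)
        self.w = self.add_weight(name='w', shape=(input_shape[-1], 1),
                                 initializer='uniform', trainable=True)
        self.p = self.add_weight(name='p', shape=(input_shape[1], 1),
                                 initializer='uniform', trainable=True)
        super(T2V, self).build(input_shape)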
Hi authors,
Can you push a requirements.txt for this source code? There is a problem with the environment.
I think it is okay to initialize a smoother at the very beginning and then just reuse it for the rest, which looks like this:

import numpy as np
from tqdm import tqdm
from tsmoothie.smoother import ConvolutionSmoother

smoother = ConvolutionSmoother(window_len=window_len, window_type='ones')
# smoother = ExponentialSmoother(window_len=window_len, alpha=0.4)

for i in tqdm(range(timesteps + 1), total=(timesteps + 1)):

    if i > window_len:
        # smoother.smooth(series['original'][:, -window_len-1:])
        smoother.smooth(series['original'][:, -window_len:])
        series['smooth'] = np.hstack([series['smooth'], smoother.smooth_data[:, [-1]]])

        _low, _up = smoother.get_intervals('sigma_interval', n_sigma=2)
        series['low'] = np.hstack([series['low'], _low[:, [-1]]])
        series['up'] = np.hstack([series['up'], _up[:, [-1]]])

        is_anomaly = np.logical_or(
            series['original'][:, -1] > series['up'][:, -1],
            series['original'][:, -1] < series['low'][:, -1]
        ).reshape(-1, 1)

        if is_anomaly.any():
            series['ano_id'] = np.hstack([series['ano_id'], is_anomaly * i]).astype(int)

    if i >= timesteps:
        continue

    series['original'] = np.hstack([series['original'], data['pv'].values.reshape(1, -1)[:, [i]]])

Perhaps I missed something; can you tell us more about your logic?
In the notebooks:
https://github.com/cerlymarco/MEDIUM_NoteBook/tree/master/Predictive_Maintenance_CRNN
https://github.com/cerlymarco/MEDIUM_NoteBook/tree/master/Predictive_Maintenance_ResNet

sample_mfcc = librosa.feature.mfcc(np.asfortranarray(sample), sr=40000)

and

sample_spectre = np.apply_along_axis(lambda x: librosa.feature.melspectrogram(x, sr=40_000), 0, sample)

give the same errors:

TypeError: melspectrogram() takes 0 positional arguments but 1 positional argument (and 1 keyword-only argument) were given
TypeError: mfcc() takes 0 positional arguments but 1 positional argument (and 1 keyword-only argument) were given
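This looks like a librosa API change (my reading of the error, so treat it as an assumption): recent librosa versions made the audio input keyword-only, so positional calls fail. Passing y= explicitly should restore both calls:

import numpy as np
import librosa

sample_mfcc = librosa.feature.mfcc(y=np.asfortranarray(sample), sr=40000)
sample_spectre = np.apply_along_axis(
    lambda x: librosa.feature.melspectrogram(y=x, sr=40_000), 0, sample)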
Hi Marco,
Thanks very much for the good work.
I have one question about the time in avocado.csv. The date column is not really ordered from the earliest to the latest: the years follow the right order, but within each year the dates run from December to January.
Since we used an LSTM, do you think this will affect your final results? Or is this format for another purpose?
Many thanks.
Best
Tan
Sir,
I executed the neural network ensemble model successfully. Finally, how can we test an unseen image to get the model's predictions (a sketch follows below)? Thanks in advance; I am waiting for your reply.
Kavitha
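A minimal sketch of scoring one unseen image with an ensemble (generic Keras usage under assumed names: models is the list of trained members, and the path, target size, and 1/255 scaling are placeholders that must match the training preprocessing):

import numpy as np
from tensorflow.keras.preprocessing import image

img = image.load_img('unseen.jpg', target_size=(224, 224))   # hypothetical file/size
x = image.img_to_array(img)[None, ...] / 255.                # add batch dim, rescale

probs = np.mean([m.predict(x) for m in models], axis=0)      # average the members
print(probs.argmax(axis=-1))                                 # predicted class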
Thank you for your great work.
I have a question about label setting.
In train_generator, I think class "cat" is given label 0, but in test_generator class "dog" is given label 0 and class "cat" label 1. Is this fine if the training data and the test data disagree on the labels?
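One way to check and enforce a consistent mapping (standard Keras ImageDataGenerator usage; the directory and class names are assumptions):

print(train_generator.class_indices)   # e.g. {'cat': 0, 'dog': 1}
print(test_generator.class_indices)    # should match the line above

# fix the order explicitly when building the generators:
# datagen.flow_from_directory('data/test', classes=['cat', 'dog'], ...)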
The official metric is SMAPE; I mistook it for MSE.
SMAPE: 14.5608 on the split test data.
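For reference, a common SMAPE definition (one of several variants in use, so an assumption about which one the competition means):

import numpy as np

def smape(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return 100 * np.mean(2 * np.abs(y_pred - y_true)
                         / (np.abs(y_true) + np.abs(y_pred)))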
Hello,
I used the one-class CNN code for cat/dog classification, but when I use only cat images for training and test on cat/dog data, my trained model predicts all test samples as cat, which means it cannot detect the anomalies. Since there is a bug with my TensorFlow 2.1 version, I only changed the code from pre_process = Lambda(preprocess_input) to pre_process = preprocess_input; the rest is the same. I do not know why. Could you also provide the dataset, please?
Hi Marco, I really enjoy this notebook!!! I am working on a similar problem (https://github.com/kcg2015/fiber_cuts_prediction). In your SiameseNet architecture, you add a Dropout layer after the difference calculation of the two encoders:

L1_layer = Lambda(lambda tensor: K.abs(tensor[0] - tensor[1]))
L1_distance = L1_layer([encoded_l, encoded_r])
drop = Dropout(0.2)(L1_distance)

Could you let me know the rationale behind this? More importantly, would adding this dropout layer significantly reduce overfitting? Thanks, Kyle
Thank you for your great articles and notebooks! They inspired me a lot when I was struggling to get started with anomaly detection tasks :)
May I translate your articles into Chinese and repost them to the Chinese community (namely Zhihu)? If approved, I will point out their original source. In addition, I am more accustomed to using PyTorch, so I hope to add the modified code when reposting. Similarly, I will credit the original developer and repository.
Hi Marco, good post on Medium and good code.
Given the paper http://roseyu.com/time-series-workshop/submissions/TSW2017_paper_3.pdf, they specify that a training dataset is created by splitting the historical data into sliding windows of input and output variables.
I don't see any of that in your code.
Furthermore, I feel like we are doing a sort of "data leakage" here: XX = encoder.predict(X). It seems that we are using all the available data, both test and train, to produce that array XX.
Thanks
Hello author, can you tell me the configuration environment? I can't view the link in the README.md.
@cerlymarco kudos! Amazing work (dress segmentation) on combining GMMs and autoencoders to get the mask of an image! I was replicating your approach from the notebook you have put up and couldn't find the fashion_unet.h5 model to replicate the results. Would it be possible to share the trained model with us, please? Thank you!
Your models are great. Can you also show an example with a strong trend?
I am using the LSTM VAR notebook and got this error:
ValueError Traceback (most recent call last)
<ipython-input-38-99b8bafb8d3d> in <module>
20 y_test = test.values
21
---> 22 X_test = np.concatenate([test_diff.values, test_ext.values], axis=1)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
Shapes of test_diff and test_ext: (1872, 9) and (1704, 9).
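Those shapes explain the error: np.concatenate with axis=1 needs both arrays to have the same number of rows (1872 vs 1704 here). A sketch of one possible fix (an assumption that both frames share an index worth aligning on):

import numpy as np

common = test_diff.index.intersection(test_ext.index)   # keep overlapping rows only
X_test = np.concatenate([test_diff.loc[common].values,
                         test_ext.loc[common].values], axis=1)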
Hello Marco,
Amazing code! I just have a small problem when I run it: I get the error "ValueError: Found input variables with inconsistent numbers of samples: [848, 16]" when I run the code below:

mae2_test = []
for i in tqdm.tqdm(range(0, 100)):
    mae2_test.append(mean_absolute_error(test_stoc_drop2('TotalUS', 0.5)[0], y2_test))

print(np.mean(mae2_test), np.std(mae2_test))
ValueError Traceback (most recent call last)
in
2 mae2_test = []
3 for i in tqdm.tqdm(range(0,100)):
----> 4 mae2_test.append(mean_absolute_error(test_stoc_drop2('TotalUS', 0.5)[0], y2_test))
5
6 print(np.mean(mae2_test), np.std(mae2_test))
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\metrics\regression.py in mean_absolute_error(y_true, y_pred, sample_weight, multioutput)
170 """
171 y_type, y_true, y_pred, multioutput = _check_reg_targets(
--> 172 y_true, y_pred, multioutput)
173 check_consistent_length(y_true, y_pred, sample_weight)
174 output_errors = np.average(np.abs(y_pred - y_true),
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\metrics\regression.py in _check_reg_targets(y_true, y_pred, multioutput)
75
76 """
---> 77 check_consistent_length(y_true, y_pred)
78 y_true = check_array(y_true, ensure_2d=False)
79 y_pred = check_array(y_pred, ensure_2d=False)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
203 if len(uniques) > 1:
204 raise ValueError("Found input variables with inconsistent numbers of"
--> 205 " samples: %r" % [int(l) for l in lengths])
206
207
ValueError: Found input variables with inconsistent numbers of samples: [848, 16]
Could you help me, please?
Thanks
I see that you have provided a link to the competition, but can you provide us with a link to the dataset that you used? Or maybe the trained model?
Sir, for a FAVAR model, what changes do you recommend in the code?
In the notebooks:
https://github.com/cerlymarco/MEDIUM_NoteBook/tree/master/Predictive_Maintenance_CRNN
https://github.com/cerlymarco/MEDIUM_NoteBook/tree/master/Predictive_Maintenance_ResNet

sample_mfcc = librosa.feature.mfcc(np.asfortranarray(sample), sr=40000)

and

sample_spectre = np.apply_along_axis(lambda x: librosa.feature.melspectrogram(x, sr=40_000), 0, sample)

give the same errors, please help!

TypeError: melspectrogram() takes 0 positional arguments but 1 positional argument (and 1 keyword-only argument) were given
and
TypeError: mfcc() takes 0 positional arguments but 1 positional argument (and 1 keyword-only argument) were given
Hi,
I am trying to reproduce your code for the one-class neural network exactly as it is (but with my own data), but I am facing some issues. Namely, I get the following error message:
"TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
runfile('C:/Users/SofiaPereira/Documents/Projects/CV/scripts/teste.py', wdir='C:/Users/SofiaPereira/Documents/Projects/CV/scripts')
Found 7 images belonging to 1 classes.
Found 2 images belonging to 2 classes.
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
Train for 0.02734375 steps
Epoch 1/20
1/0 [==============================] - 1s 735ms/step
Traceback (most recent call last):
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\execute.py", line 61, in quick_execute
num_outputs)
TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
@tf.function
def has_init_scope():
my_constant = tf.constant(1.)
with tf.init_scope():
added = my_constant * 2
The graph tensor has name: lambda/Const:0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\SofiaPereira\Documents\Projects\CV\scripts\teste.py", line 145, in
model.fit(wrap_generator(train_generator), steps_per_epoch=train_generator.samples/train_generator.batch_size, epochs=20)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 819, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 342, in fit
total_epochs=epochs)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 128, in run_one_epoch
batch_outs = execution_function(iterator)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 98, in execution_function
distributed_function(input_fn))
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 568, in call
result = self._call(*args, **kwds)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 632, in _call
return self._stateless_fn(*args, **kwds)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\function.py", line 2363, in call
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\function.py", line 1611, in _filtered_call
self.captured_inputs)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\function.py", line 1692, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\function.py", line 545, in call
ctx=ctx)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\eager\execute.py", line 75, in quick_execute
"tensors, but found {}".format(keras_symbolic_tensors))
_SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'lambda/Const:0' shape=(3,) dtype=float32>]
I've read somewhere that this has to do with TF 2.0's eager execution, so I added "tf.compat.v1.disable_eager_execution()" after importing tf. With this, I got the following error:
Traceback (most recent call last):
File "C:\Users\SofiaPereira\Documents\Projects\CV\scripts\teste.py", line 144, in
model = get_model()
File "C:\Users\SofiaPereira\Documents\Projects\CV\scripts\teste.py", line 92, in get_model
vgg_16_process = pre_process(GaussianNoise(0.1)(inp))
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 778, in call
outputs = call_fn(cast_inputs, *args, **kwargs)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\layers\core.py", line 846, in call
result = self.function(inputs, **kwargs)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\applications_init_.py", line 46, in wrapper
return base_fun(*args, **kwargs)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\applications\vgg16.py", line 44, in preprocess_input
return vgg16.preprocess_input(*args, **kwargs)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\keras_applications\imagenet_utils.py", line 195, in preprocess_input
mode=mode, **kwargs)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\keras_applications\imagenet_utils.py", line 151, in _preprocess_symbolic_input
x = backend.bias_add(x, _IMAGENET_MEAN, data_format)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\keras\backend.py", line 5559, in bias_add
x = nn.bias_add(x, bias, data_format='NHWC')
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 2728, in bias_add
with ops.name_scope(name, "BiasAdd", [value, bias]) as name:
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\framework\ops.py", line 6237, in enter
g_from_inputs = _get_graph_from_inputs(self._values)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\framework\ops.py", line 5883, in _get_graph_from_inputs
_assert_same_graph(original_graph_element, graph_element)
File "C:\Users\SofiaPereira\anaconda3\envs\tf2\lib\site-packages\tensorflow_core\python\framework\ops.py", line 5818, in _assert_same_graph
(item, original_item))
ValueError: Tensor("lambda/Const:0", shape=(3,), dtype=float32) must be from the same graph as Tensor("lambda_9/strided_slice:0", shape=(None, 224, 224, 3), dtype=float32).
Do you have any idea what's going on? I am using an Anaconda env with tf=2.0.
Thank you very much,
Sofia
Hello, may I ask if you could share the dataset?
Hi Marco
Thanks for all the great write-ups on Medium and code on GitHub. Great contributions!
I have a question I'm hoping you could shed light on. I'm working with a dataset with many correlated features, and I'm building an automated method to select the most important ones.
I'm looking at your work on SHAP with RFE, and I'm trying to understand if and why it would be better than a wrapper-based RFE, i.e. a method where the model is trained with one column left out at a time, and the column whose removal causes the least reduction in model performance (e.g. AUC) is then dropped from the dataset. This is repeated to find the optimal number of features and which ones to include (a sketch of this procedure follows below).
Do you have a position on which of these approaches is better? And if it is SHAP RFE, I'd love to know why.
What holds me back from the SHAP RFE approach is that SHAP values explain the effect a feature has on the model's output prediction, but they don't necessarily say whether that feature has a positive or negative impact on model performance. My knowledge of SHAP is limited, though.
Anyway - appreciate to hear your thoughts.
Best
Mark
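A minimal sketch of the wrapper-based, leave-one-column-out RFE described above (generic scikit-learn usage; the model, scoring, and data names are placeholders):

import numpy as np
from sklearn.model_selection import cross_val_score

def wrapper_rfe(model, X, y, min_features=1):
    # X: DataFrame of candidate features; greedily drop the least useful column
    features = list(X.columns)
    while len(features) > min_features:
        scores = {}
        for col in features:
            kept = [f for f in features if f != col]
            scores[col] = cross_val_score(model, X[kept], y,
                                          scoring='roc_auc', cv=5).mean()
        worst = max(scores, key=scores.get)   # its removal hurts AUC the least
        features.remove(worst)
        print(f'dropped {worst}; AUC without it: {scores[worst]:.4f}')
    return features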