
dnn_neurosim_v2.1's People

Contributors

neurosim, seeder-research, xfong


dnn_neurosim_v2.1's Issues

QE Blow Assertion

I'm attempting to study the effects of certain hardware parameters (cellBit, ADCPrecision, etc.) on accuracy and energy. I set "--inference 1" on a relatively unchanged clone of the repository and my GPU ran out of memory. After reducing the size of the layers but leaving everything else generally unchanged (apart from fixing a few errors), I keep getting a "QE Blow" assertion error. Using print statements, I found that the assertion fails during the second run of "backward" for WAGERounding. Changing grad_scale hasn't helped, nor has adjusting the network architecture; adding a small value to "x" (since it is zero) also doesn't help. Is there a possible explanation for why this error occurs?

Question about weight update latency

When I set param->batchSize to 1, the resulting weight update latency becomes inf ns. I don't understand the relation between batch size and weight update latency. Could anyone please explain? Thanks.

The program is stuck on Estimation of Layer 1

We are using the DNN_NeuroSim_V2.0 programs. The steps are as follows:

  1. Run make in the NeuroSim folder from the command line in Ubuntu; many .o files are produced.
  2. Run train.py; the program gets stuck on "Estimation of Layer 1" with no other logs.

Could you please help? Thank you very much.

Question about Amp latency calculation

Hi! May I ask why the latency in MultilevelSenseAmp::CalculatePower is not the same as the readLatency calculated in MultilevelSenseAmp::CalculateLatency? In CalculateLatency, readLatency = LatencyCol*numColMuxed, while CalculatePower only takes one LatencyCol. What is the energy consumption during the remaining (numColMuxed-1)*LatencyCol? I'd appreciate it a lot if someone could shed some light on this. :)
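For reference, the bookkeeping being questioned can be expressed as a small sketch. LatencyCol and numColMuxed follow the post; the energy model itself is an assumption made for illustration, not NeuroSim's actual implementation:

```python
# Sketch of the latency/energy bookkeeping in question. LatencyCol and
# numColMuxed follow the post; the energy model is an assumption made for
# illustration, NOT NeuroSim's actual implementation.
def read_latency(latency_col, num_col_muxed):
    # CalculateLatency-style: the sense amp is time-multiplexed over columns
    return latency_col * num_col_muxed

def read_energy(power, latency_col, num_col_muxed, all_cycles=True):
    # If only one LatencyCol is charged (as the post observes in
    # CalculatePower), energy for the other (numColMuxed - 1) cycles is missing.
    cycles = num_col_muxed if all_cycles else 1
    return power * latency_col * cycles

lat = read_latency(2e-9, 8)                           # 8 columns share one amp
e_all = read_energy(1e-3, 2e-9, 8)                    # charges all 8 cycles
e_one = read_energy(1e-3, 2e-9, 8, all_cycles=False)  # only one cycle charged
```

Under this model the single-cycle figure undercounts energy by exactly a factor of numColMuxed, which is the discrepancy the post describes.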

It seems the Input data is not all written to file

As shown below, only input_matrix[0, :] is converted into filled_matrix_bin and then saved to the CSV file. Does this mean that not all of the data in input_matrix is saved? If so, the simulator cannot read the whole input, which would lead to wrong results.

def write_matrix_activation_conv(input_matrix,fill_dimension,length,filename):
    filled_matrix_b = np.zeros([input_matrix.shape[2],input_matrix.shape[1]*length],dtype=np.str)
    filled_matrix_bin,scale = dec2bin(input_matrix[0,:],length)
    for i,b in enumerate(filled_matrix_bin):
        filled_matrix_b[:,i::length] =  b.transpose()
    activity = np.sum(filled_matrix_b.astype(np.float), axis=None)/np.size(filled_matrix_b)
    np.savetxt(filename, filled_matrix_b, delimiter=",",fmt='%s')
    return activity
This code comes from hook.py. Also, what exactly does the function dec2bin do, and what does each parameter mean?
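Regarding dec2bin: I don't have the exact implementation at hand, but helpers with this name typically scale a real-valued matrix to fixed-point integers and split each integer into per-bit planes, returning the planes together with the scale. A purely hypothetical sketch of that idea (NOT the actual hook.py code):

```python
import numpy as np

# Hypothetical sketch of a dec2bin-style helper: scale a real-valued matrix
# to fixed-point integers, then split each integer into `length` bit planes.
# This illustrates the general idea only; the real hook.py code may differ.
def dec2bin_sketch(x, length):
    scale = 2 ** (length - 1)                  # assumed fixed-point scale
    ints = np.round(np.abs(x) * scale).astype(np.int64)
    ints = np.clip(ints, 0, 2 ** length - 1)   # keep values within `length` bits
    planes = []
    for bit in range(length - 1, -1, -1):      # most significant bit first
        planes.append(((ints >> bit) & 1).astype(np.uint8))
    return planes, 1.0 / scale
```

Each element of `planes` has the same shape as `x`, one matrix per bit position, which would match how the loop in write_matrix_activation_conv interleaves the bits into filled_matrix_b.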

Tool not working with CUDA 11.6 (latest version)

Hi,
I set up and installed the tool according to the user manual, but after the floorplan stage the tool gives no output; it just keeps running with the default values provided in the manual. I have tried reducing the number of epochs and the batch size, but that did not work either.
I am using the latest CUDA version, i.e. 11.6. Can you please suggest something?
Thank you.

Energy calculation of cells

It seems that the energy consumed by the memory cells themselves, typically RRAM, is not taken into account when calculating dynamic energy. I'm wondering whether this affects the accuracy of the energy calculation.

DNN NeuroSim as MLP

Why can't I use DNN NeuroSim as an MLP by removing the convolutional layers, max pooling, and activation functions? Is there any advantage of using MLP NeuroSim over such an altered DNN NeuroSim?

I did try using DNN NeuroSim as an MLP, but the hardware performance estimation gets stuck at Layer 1. May I know the cause of this?

Where is the output

After running make, no errors are shown, but the output CSV files are not generated anywhere. I am a beginner at this. Am I missing something? Please help. Thank you.

code after return

This may sound stupid, but I have found plenty of code that appears after a function returns.
For example, in ProcessingUnit.cpp, lines 700-712:

vector<vector<double> > CopySubArray(const vector<vector<double> > &orginal, int positionRow, int positionCol, int numRow, int numCol) {
	vector<vector<double> > copy;
	for (int i=0; i<numRow; i++) {
		vector<double> copyRow;
		for (int j=0; j<numCol; j++) {
			copyRow.push_back(orginal[positionRow+i][positionCol+j]);
		}
		copy.push_back(copyRow);
		copyRow.clear();
	}
	return copy;
	copy.clear();
} 

It seems that code after a return (in this example, copy.clear()) is unreachable and therefore useless.
Does such code do any job? Why add it?
I'm confused. Can anyone explain this to me? Thank you.

Issue with ADC area calculation

Hello,

In ProcessingUnit.cpp, the ADC area per subarray is calculated with the equation below:

areaResults.push_back(subArray->areaADC*(numSubArrayRow*numSubArrayCol));

However, if numColMuxed is set to a value greater than 1 (e.g. 8, meaning 8 bitlines share one ADC), the total ADC area should be divided by numColMuxed, as below:

areaResults.push_back(subArray->areaADC*(numSubArrayRow*numSubArrayCol/numColMuxed));
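For illustration, the proposed correction can be checked with a small sketch (the numbers are arbitrary; whether subArray->areaADC already accounts for column muxing inside the subarray is exactly the open question here):

```python
# Illustrative sketch of the ADC-area accounting discussed above.
# The values are arbitrary; whether subArray->areaADC already folds column
# muxing into the per-subarray figure is exactly what this post asks about.
def adc_area_per_pe(area_adc, num_subarray_row, num_subarray_col, num_col_muxed=1):
    # If num_col_muxed bitlines share one ADC, only 1/num_col_muxed of the
    # per-subarray ADC area should be counted.
    return area_adc * num_subarray_row * num_subarray_col / num_col_muxed

full = adc_area_per_pe(100.0, 4, 4)        # no sharing: 1600.0
shared = adc_area_per_pe(100.0, 4, 4, 8)   # 8 bitlines per ADC: 200.0
```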

Am I missing some pieces?

Thanks
/T

writeLatency of Weight Update

Hi, I'm using your DNN_NeuroSim 2.1 for my research. Could you kindly tell me whether the "writeLatency of Weight Update" is the sum of the accumulation latency of the weight gradients and the latency of writing them back to the PEs?

DenseNet40 for On-Chip training

Hi @neurosim,

Has anyone tried using DenseNet40 for on-chip training using DNN_NeuroSim_V2.1?

I run into this error - ERROR: SubArray Size is too large, which break the chip hierarchey, please decrease the SubArray size!

Due to this error, I am unable to get the circuit-level performance metrics. Any suggestions on how to fix this error?

Thanks!

It seems wrong for tileLocaEachLayer

The function ChipFloorPlan in Training_pytorch/NeuroSIM/Chip.cpp calculates the double vector tileLocaEachLayer.
I presume that tileLocaEachLayer records the location of the first tile that stores each layer.
The following code that calculates tileLocaEachLayer seems wrong:

for (int i=0; i<netStructure.size(); i++) {
  if (i==0) {
	tileLocaEachLayerRow.push_back(0);
	tileLocaEachLayerCol.push_back(0);
  } else {
        // original code here
	// thisTileTotal += numTileEachLayer[0][i]*numTileEachLayer[1][i];
	tileLocaEachLayerRow.push_back((int)thisTileTotal/(*numTileRow));
	tileLocaEachLayerCol.push_back((int)thisTileTotal%(*numTileRow)-1);
  }
  // I think it should be moved here.
  thisTileTotal += numTileEachLayer[0][i]*numTileEachLayer[1][i];
}

I think the accumulation of thisTileTotal should be moved from inside the else clause to after it, so that it executes for every layer.
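The suggested fix amounts to a prefix sum: each layer starts at the tile index equal to the total number of tiles used by all earlier layers. A minimal sketch of that placement (the per-layer tile counts and num_tile_row are made-up values, and the -1 column offset in the original code is left aside):

```python
# Sketch of the prefix-sum placement the fix describes: layer i starts at the
# tile index equal to the total number of tiles used by layers 0..i-1.
# The per-layer tile counts and num_tile_row are made-up illustrative values.
def tile_locations(tiles_per_layer, num_tile_row):
    locations = []
    total = 0
    for n in tiles_per_layer:
        # record the starting (row, col) BEFORE accumulating this layer's tiles
        locations.append((total // num_tile_row, total % num_tile_row))
        total += n
    return locations

# three layers using 2, 3, and 1 tiles on a chip 4 tiles wide
locs = tile_locations([2, 3, 1], 4)
```

Accumulating inside the else clause instead, as the original code does, would skip layer 0's tiles and shift every later layer's starting position.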

Question about uniform random noise when calculating weight gradient

May I ask why uniform random noise is added at the end of the non-ideal weight gradient calculation, in wage_quantizer.py, lines 74-75?


It seems that the gradient calculated in line 73 is already quantized to the desired resolution. Why is the process of adding uniform random noise and re-quantizing the gradient necessary?
Can someone please explain the reason for this? Thank you.
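One common reason for this pattern is stochastic rounding: adding uniform noise before rounding makes the quantization unbiased in expectation, so gradient components smaller than one quantization step are not always rounded to zero. A minimal sketch of that idea (this is the standard technique in low-precision training, not necessarily the authors' exact rationale):

```python
import numpy as np

# Stochastic rounding sketch: adding uniform noise in [-0.5, 0.5) before
# rounding makes the result equal to x on average, so gradients smaller than
# one quantization step survive in expectation instead of always vanishing.
# This illustrates a common motivation; it may not be the authors' exact one.
def stochastic_round(x, step, rng):
    noise = rng.uniform(-0.5, 0.5, size=np.shape(x))
    return np.round(x / step + noise) * step

rng = np.random.default_rng(0)
g = np.full(100_000, 0.1)            # gradient far below the step size of 1.0
det = np.round(g / 1.0) * 1.0        # deterministic rounding: all zeros
sto = stochastic_round(g, 1.0, rng)  # roughly 10% of entries become 1.0
```

With deterministic rounding the whole gradient vanishes; with the noise, its mean is preserved across many elements and updates.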

Can't solve the issue that 'nan's keep appearing in the loss if the nonlinearity parameter in train.py is greater than 1.96

If I set args.nonlinearityLTP or args.nonlinearityLTD greater than 1.96, 'nan's keep appearing in the loss during the training phase, and an error is reported while converting the decimal data to binary data in hook.py.

First I tried adjusting the learning rate, but it didn't work. Then I tried adding normalization before the conversion, but that didn't work either. I don't know where the 'nan's first appear or how to fix them.
