
dnn_neurosim_v2.1's People

Contributors

neurosim, seeder-research, xfong


dnn_neurosim_v2.1's Issues

QE Blow Assertion

I'm attempting to study the effects of certain hardware parameters (cellBit, ADCPrecision, etc.) on accuracy and energy. I set "--inference 1" on a relatively unchanged clone of the repository and my GPU ran out of memory. After reducing the size of the layers but leaving everything else generally unchanged (apart from fixing a few errors), I keep getting a "QE Blow" assertion error. Using print statements, I found that the assertion fails during the second run of "backward" for WAGERounding. Changing grad_scale hasn't helped, nor has adjusting the network architecture; adding a small value to "x" (since it is zero) also doesn't help. Is there a possible explanation for why this error occurs?

Question about weight update latency

When I set param->batchSize to 1, the resulting weight update latency becomes inf ns. I don't understand the relation between batch size and weight update latency. Could anyone please explain? Thanks.

The program is stuck on Estimation of Layer 1

We are using the DNN_NeuroSim_V2.0 programs. The steps are as follows:

  1. Run make in the NeuroSim folder from the command line in Ubuntu; many .o files are produced.
  2. Run train.py; the program gets stuck on "Estimation of Layer 1" with no other logs.

Could you please help? Thank you very much.

Question about Amp latency calculation

Hi! May I ask why the latency in MultilevelSenseAmp::CalculatePower is not the same as the readLatency calculated in MultilevelSenseAmp::CalculateLatency? In CalculateLatency, readLatency = LatencyCol*numColMuxed, while CalculatePower only takes one LatencyCol. What is the energy consumption during the remaining (numColMuxed-1)*LatencyCol? I'd appreciate it a lot if someone could shed some light on this. :)
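For reference, the bookkeeping being questioned can be expressed as a small sketch. LatencyCol and numColMuxed follow the post; the energy model itself is an assumption made for illustration, not NeuroSim's actual implementation:

```python
# Sketch of the latency/energy bookkeeping in question. LatencyCol and
# numColMuxed follow the post; the energy model is an assumption made for
# illustration, NOT NeuroSim's actual implementation.
def read_latency(latency_col, num_col_muxed):
    # CalculateLatency-style: the sense amp is time-multiplexed over columns
    return latency_col * num_col_muxed

def read_energy(power, latency_col, num_col_muxed, all_cycles=True):
    # If only one LatencyCol is charged (as the post observes in
    # CalculatePower), energy for the other (numColMuxed - 1) cycles is missing.
    cycles = num_col_muxed if all_cycles else 1
    return power * latency_col * cycles

lat = read_latency(2e-9, 8)                           # 8 columns share one amp
e_all = read_energy(1e-3, 2e-9, 8)                    # charges all 8 cycles
e_one = read_energy(1e-3, 2e-9, 8, all_cycles=False)  # only one cycle charged
```

Under this model the single-cycle figure undercounts energy by exactly a factor of numColMuxed, which is the discrepancy the post describes.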

It seems the Input data is not all written to file

As shown below, only input_matrix[0, :] is converted into filled_matrix_bin and then saved to the CSV file. Does this mean that not all of the data in input_matrix is saved? If so, the simulator cannot read the whole input, which would lead to wrong results.

def write_matrix_activation_conv(input_matrix,fill_dimension,length,filename):
    filled_matrix_b = np.zeros([input_matrix.shape[2],input_matrix.shape[1]*length],dtype=np.str)
    filled_matrix_bin,scale = dec2bin(input_matrix[0,:],length)
    for i,b in enumerate(filled_matrix_bin):
        filled_matrix_b[:,i::length] =  b.transpose()
    activity = np.sum(filled_matrix_b.astype(np.float), axis=None)/np.size(filled_matrix_b)
    np.savetxt(filename, filled_matrix_b, delimiter=",",fmt='%s')
    return activity
This code comes from hook.py. Also, what exactly does the function dec2bin do, and what does each parameter mean?
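Regarding dec2bin: I don't have the exact implementation at hand, but helpers with this name typically scale a real-valued matrix to fixed-point integers and split each integer into per-bit planes, returning the planes together with the scale. A purely hypothetical sketch of that idea (NOT the actual hook.py code):

```python
import numpy as np

# Hypothetical sketch of a dec2bin-style helper: scale a real-valued matrix
# to fixed-point integers, then split each integer into `length` bit planes.
# This illustrates the general idea only; the real hook.py code may differ.
def dec2bin_sketch(x, length):
    scale = 2 ** (length - 1)                  # assumed fixed-point scale
    ints = np.round(np.abs(x) * scale).astype(np.int64)
    ints = np.clip(ints, 0, 2 ** length - 1)   # keep values within `length` bits
    planes = []
    for bit in range(length - 1, -1, -1):      # most significant bit first
        planes.append(((ints >> bit) & 1).astype(np.uint8))
    return planes, 1.0 / scale
```

Each element of `planes` has the same shape as `x`, one matrix per bit position, which would match how the loop in write_matrix_activation_conv interleaves the bits into filled_matrix_b.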

Tool not working with CUDA 11.6 (latest version)

Hi,
I set up and installed the tool according to the user manual, but after the floorplan stage the tool gives no output; it just keeps running with the default values provided in the manual. I have tried reducing the number of epochs and the batch size, but that did not work either.
I am using the latest CUDA version, i.e. 11.6. Can you please suggest something?
Thank you.

Energy calculation of cells

It seems that the energy consumed by the memory cells themselves, typically RRAM, is not taken into account when calculating dynamic energy. I'm wondering whether this affects the accuracy of the energy calculation.

DNN NeuroSim as MLP

Why can't I use DNN NeuroSim as an MLP by removing the convolutional layers, max pooling, and activation functions? Is there any advantage of using MLP NeuroSim over such an altered DNN NeuroSim?

I did try using DNN NeuroSim as an MLP, but the hardware performance estimation gets stuck at Layer 1. May I know the cause of this?

Where is the output

After running make, no errors are shown, but the output CSV files are not generated anywhere. I am a beginner at this. Am I missing something? Please help. Thank you.

code after return

This may sound stupid, but I have found plenty of code that appears after a function returns.
For example, in ProcessingUnit.cpp, lines 700-712:

vector<vector<double> > CopySubArray(const vector<vector<double> > &orginal, int positionRow, int positionCol, int numRow, int numCol) {
	vector<vector<double> > copy;
	for (int i=0; i<numRow; i++) {
		vector<double> copyRow;
		for (int j=0; j<numCol; j++) {
			copyRow.push_back(orginal[positionRow+i][positionCol+j]);
		}
		copy.push_back(copyRow);
		copyRow.clear();
	}
	return copy;
	copy.clear();
} 

It seems that code after a return (in this example, copy.clear()) is unreachable and therefore useless.
Does such code do any job? Why add it?
I'm confused. Can anyone explain this to me? Thank you.

Issue with ADC area calculation

Hello,

In ProcessingUnit.cpp, the ADC area per subarray is calculated with the equation below:

areaResults.push_back(subArray->areaADC*(numSubArrayRow*numSubArrayCol));

However, if numColMuxed is set to a value greater than 1 (e.g. 8, meaning 8 bitlines share one ADC), the total ADC area should be divided by numColMuxed, as below:

areaResults.push_back(subArray->areaADC*(numSubArrayRow*numSubArrayCol/numColMuxed));
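For illustration, the proposed correction can be checked with a small sketch (the numbers are arbitrary; whether subArray->areaADC already accounts for column muxing inside the subarray is exactly the open question here):

```python
# Illustrative sketch of the ADC-area accounting discussed above.
# The values are arbitrary; whether subArray->areaADC already folds column
# muxing into the per-subarray figure is exactly what this post asks about.
def adc_area_per_pe(area_adc, num_subarray_row, num_subarray_col, num_col_muxed=1):
    # If num_col_muxed bitlines share one ADC, only 1/num_col_muxed of the
    # per-subarray ADC area should be counted.
    return area_adc * num_subarray_row * num_subarray_col / num_col_muxed

full = adc_area_per_pe(100.0, 4, 4)        # no sharing: 1600.0
shared = adc_area_per_pe(100.0, 4, 4, 8)   # 8 bitlines per ADC: 200.0
```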

Am I missing some pieces?

Thanks
/T

writeLatency of Weight Update

Hi, I'm using your DNN_NeuroSim 2.1 for my research. Could you kindly tell me whether the "writeLatency of Weight Update" is the sum of the accumulation latency of the weight gradients and the latency of writing them back to the PEs?

DenseNet40 for On-Chip training

Hi @neurosim,

Has anyone tried using DenseNet40 for on-chip training using DNN_NeuroSim_V2.1?

I run into this error - ERROR: SubArray Size is too large, which break the chip hierarchey, please decrease the SubArray size!

Due to this error, I am unable to get the circuit-level performance metrics. Any suggestions on how to fix this error?

Thanks!

It seems wrong for tileLocaEachLayer

The function ChipFloorPlan in Training_pytorch/NeuroSIM/Chip.cpp calculates the double vector tileLocaEachLayer.
I presume that tileLocaEachLayer records the location of the first tile that stores each layer.
The following code that calculates tileLocaEachLayer seems wrong:

for (int i=0; i<netStructure.size(); i++) {
  if (i==0) {
	tileLocaEachLayerRow.push_back(0);
	tileLocaEachLayerCol.push_back(0);
  } else {
        // original code here
	// thisTileTotal += numTileEachLayer[0][i]*numTileEachLayer[1][i];
	tileLocaEachLayerRow.push_back((int)thisTileTotal/(*numTileRow));
	tileLocaEachLayerCol.push_back((int)thisTileTotal%(*numTileRow)-1);
  }
  // I think it should be moved here.
  thisTileTotal += numTileEachLayer[0][i]*numTileEachLayer[1][i];
}

I think the accumulation of thisTileTotal should be moved from inside the else clause to after it, so that it executes for every layer.
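The suggested fix amounts to a prefix sum: each layer starts at the tile index equal to the total number of tiles used by all earlier layers. A minimal sketch of that placement (the per-layer tile counts and num_tile_row are made-up values, and the -1 column offset in the original code is left aside):

```python
# Sketch of the prefix-sum placement the fix describes: layer i starts at the
# tile index equal to the total number of tiles used by layers 0..i-1.
# The per-layer tile counts and num_tile_row are made-up illustrative values.
def tile_locations(tiles_per_layer, num_tile_row):
    locations = []
    total = 0
    for n in tiles_per_layer:
        # record the starting (row, col) BEFORE accumulating this layer's tiles
        locations.append((total // num_tile_row, total % num_tile_row))
        total += n
    return locations

# three layers using 2, 3, and 1 tiles on a chip 4 tiles wide
locs = tile_locations([2, 3, 1], 4)
```

Accumulating inside the else clause instead, as the original code does, would skip layer 0's tiles and shift every later layer's starting position.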

Question about uniform random noise when calculating weight gradient

May I ask why uniform random noise is added at the end of the non-ideal weight gradient calculation, in wage_quantizer.py, lines 74-75?


It seems that the gradient calculated in line 73 is already quantized to the desired resolution. Why is the process of adding uniform random noise and re-quantizing the gradient necessary?
Can someone please explain the reason for this? Thank you.
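One common reason for this pattern is stochastic rounding: adding uniform noise before rounding makes the quantization unbiased in expectation, so gradient components smaller than one quantization step are not always rounded to zero. A minimal sketch of that idea (this is the standard technique in low-precision training, not necessarily the authors' exact rationale):

```python
import numpy as np

# Stochastic rounding sketch: adding uniform noise in [-0.5, 0.5) before
# rounding makes the result equal to x on average, so gradients smaller than
# one quantization step survive in expectation instead of always vanishing.
# This illustrates a common motivation; it may not be the authors' exact one.
def stochastic_round(x, step, rng):
    noise = rng.uniform(-0.5, 0.5, size=np.shape(x))
    return np.round(x / step + noise) * step

rng = np.random.default_rng(0)
g = np.full(100_000, 0.1)            # gradient far below the step size of 1.0
det = np.round(g / 1.0) * 1.0        # deterministic rounding: all zeros
sto = stochastic_round(g, 1.0, rng)  # roughly 10% of entries become 1.0
```

With deterministic rounding the whole gradient vanishes; with the noise, its mean is preserved across many elements and updates.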

Can't solve the issue that 'nan's keep appearing in the loss if the nonlinearity parameter in train.py is greater than 1.96

If I set args.nonlinearityLTP or args.nonlinearityLTD greater than 1.96, 'nan's keep appearing in the loss during the training phase, and an error is reported while converting the decimal data to binary data in hook.py.

First I tried adjusting the learning rate, but it didn't work. Then I tried adding normalization before the conversion, but that didn't work either. I don't know where the 'nan's first appear or how to fix them.
