Python - Theano: CSV to pkl file


I am trying to make a pkl file that can be loaded into Theano, using a CSV file as the starting point.

    import numpy as np
    import csv
    import gzip, cPickle
    from numpy import genfromtxt
    import theano
    import theano.tensor as T

    # open csv file and read in data
    csvfile = "filename.csv"
    my_data = genfromtxt(csvfile, delimiter=',', skip_header=1)
    data_shape = "There are " + repr(my_data.shape[0]) + " samples of vector length " + repr(my_data.shape[1])

    num_rows = my_data.shape[0] # number of data samples
    num_cols = my_data.shape[1] # length of data vector

    total_size = (num_cols-1) * num_rows

    data = np.arange(total_size)
    data = data.reshape(num_rows, num_cols-1) # 2d matrix of data points
    data = data.astype('float32')

    label = np.arange(num_rows)
    print label.shape
    #label = label.reshape(num_rows, 1) # 2d matrix of data points
    label = label.astype('float32')

    print data.shape

    # read through data file, assume label is in last col
    for i in range(my_data.shape[0]):
        label[i] = my_data[i][num_cols-1]
        for j in range(num_cols-1):
            data[i][j] = my_data[i][j]

    # split data in terms of 70% train, 10% val, 20% test
    train_num = int(num_rows * 0.7)
    val_num = int(num_rows * 0.1)
    test_num = int(num_rows * 0.2)

    datasetstate = "This dataset has " + repr(data.shape[0]) + " samples of length " + repr(data.shape[1]) + ". The number of training examples is " + repr(train_num)
    print datasetstate

    train_set_x = data[:train_num]
    train_set_y = label[:train_num]

    val_set_x = data[train_num+1:train_num+val_num]
    val_set_y = label[train_num+1:train_num+val_num]

    test_set_x = data[train_num+val_num+1:]
    test_set_y = label[train_num+val_num+1:]

    # divided dataset into 3 parts, split by percentage
    train_set = train_set_x, train_set_y
    val_set = val_set_x, val_set_y
    test_set = test_set_x, val_set_y

    dataset = [train_set, val_set, test_set]

    f = gzip.open(csvfile+'.pkl.gz','wb')
    cPickle.dump(dataset, f, protocol=2)
    f.close()

When I run the resulting pkl file through Theano (as a DBN or SdA), it pretrains fine, which makes me think the data is stored correctly.

However, when it comes to the finetuning, I get the following error:

    epoch 1, minibatch 2775/2775, validation error 0.000000 %
    Traceback (most recent call last):
      File "sda_custom.py", line 489, in <module>
        test_sda()
      File "sda_custom.py", line 463, in test_sda
        test_losses = test_model()
      File "sda_custom.py", line 321, in test_score
        return [test_score_i(i) for i in xrange(n_test_batches)]
      File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 606, in __call__
        storage_map=self.fn.storage_map)
      File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 595, in __call__
        outputs = self.fn()
    ValueError: Input dimension mis-match. (input[0].shape[0] = 10, input[1].shape[0] = 3)
    Apply node that caused the error: Elemwise{neq,no_inplace}(argmax, Subtensor{int64:int64:}.0)
    Inputs types: [TensorType(int64, vector), TensorType(int32, vector)]
    Inputs shapes: [(10,), (3,)]
    Inputs strides: [(8,), (4,)]
    Inputs values: ['not shown', array([0, 0, 0], dtype=int32)]

    Backtrace when the node is created:
      File "/home/dean/documents/deeplearningrepo/deeplearningtutorials-master/code/logistic_sgd.py", line 164, in errors
        return T.mean(T.neq(self.y_pred, y))

    HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

10 is the size of the batch. If I change to a batch size of 1, I get the following:

    ValueError: Input dimension mis-match. (input[0].shape[0] = 1, input[1].shape[0] = 0)

I think I am storing the labels wrong when I make the pkl, but I can't seem to spot what is happening or why changing the batch size alters the error.
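One possible reading of the two error messages, offered as a sketch rather than a confirmed diagnosis: if the test function slices the inputs and the labels with the same minibatch index, and the label array is shorter than the input array, the slices near the end of the labels come back short or empty. The script above builds `test_set` from `test_set_x` and `val_set_y`, which would give exactly that situation. The array lengths and the helper name `minibatch` below are made up for illustration.

```python
def minibatch(seq, index, batch_size):
    """Return the index-th minibatch of seq; slicing past the end just truncates."""
    return seq[index * batch_size:(index + 1) * batch_size]

# Hypothetical sizes: 40 test inputs, but only 23 labels
inputs = list(range(40))
labels = [0] * 23

# Batch size 10: the third batch has 10 inputs but only 3 labels
print(len(minibatch(inputs, 2, 10)), len(minibatch(labels, 2, 10)))  # 10 3

# Batch size 1: any batch index >= 23 has 1 input and 0 labels
print(len(minibatch(inputs, 23, 1)), len(minibatch(labels, 23, 1)))  # 1 0
```

Under this assumption, the second number in the error depends on where the batch boundaries fall relative to the end of the shorter label array, which would explain why changing the batch size changes the message.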

Hope you can help!

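I saw this while looking into a similar error I was getting. I am posting a reply for anyone else who might be looking for a similar error. For me, the error was resolved when I changed n_out from 2 to 1 in the dbn_test() parameter list. n_out is the number of labels rather than the number of output layers.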
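Independently of n_out, two details in the question's script are worth checking: `test_set` pairs `test_set_x` with `val_set_y`, and the `+1` in the slice starts drops one sample at each boundary. A minimal sketch of a split that keeps inputs and labels aligned might look like this. `split_dataset` and its fraction parameters are hypothetical names, not part of Theano or the tutorial code.

```python
def split_dataset(data, labels, train_frac=0.7, val_frac=0.1):
    """Split data and labels into aligned (x, y) train/val/test pairs."""
    if len(data) != len(labels):
        raise ValueError("data and labels must have the same length")
    n = len(data)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    # Each slice starts where the previous one ended, so no sample is skipped
    train_set = (data[:train_end], labels[:train_end])
    val_set = (data[train_end:val_end], labels[train_end:val_end])
    test_set = (data[val_end:], labels[val_end:])
    return [train_set, val_set, test_set]

# Example with plain lists; NumPy arrays slice the same way
data = [[i, i + 1] for i in range(20)]
labels = [i % 2 for i in range(20)]
for x, y in split_dataset(data, labels):
    print(len(x), len(y))  # the two lengths match for every part
```

The returned list has the same `[train_set, val_set, test_set]` layout that the script pickles, so it could replace the manual slicing before the dump step.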

