
difacto_dmlc's People

Contributors

cnevd, qiang2008

difacto_dmlc's Issues

how to fix run_yarn.sh python error

2016-06-30 14:44:10,301 INFO start listen on 10.32.44.171:9102
Traceback (most recent call last):
File "../../dmlc-core/tracker/../yarn//run_hdfs_prog.py", line 47, in
ret = subprocess.call(args = sys.argv[1:], env = env)
File "/usr/lib64/python2.6/subprocess.py", line 478, in call
p = Popen(*popenargs, **kwargs)
File "/usr/lib64/python2.6/subprocess.py", line 642, in init
errread, errwrite)
File "/usr/lib64/python2.6/subprocess.py", line 1238, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File "/home/formath/github/Difacto_DMLC/dmlc-core/tracker/tracker.py", line 345, in
self.thread = Thread(target = (lambda : subprocess.check_call(self.cmd, env=env, shell=True)), args = ())
File "/usr/lib64/python2.6/subprocess.py", line 505, in check_call
raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '../../dmlc-core/tracker/../yarn//run_hdfs_prog.py build/linear.dmlc guide/demo_hdfs.conf ' returned non-zero exit status 1
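For what it's worth, OSError: [Errno 2] from subprocess.call is Python's standard complaint that the program it was asked to launch does not exist at the given path, and run_hdfs_prog.py simply forwards sys.argv[1:] as that command. A minimal sketch of checking the forwarded program before launching, with the example arguments illustrative:

    import os
    import subprocess
    import sys

    # run_hdfs_prog.py-style forwarding: argv[1:] is the real command,
    # e.g. ['build/linear.dmlc', 'guide/demo_hdfs.conf'].
    args = sys.argv[1:]

    # OSError: [Errno 2] means args[0] does not resolve on this node; on
    # YARN the binary must have been shipped into the container's cwd.
    if not args or not os.path.exists(args[0]):
        sys.exit('program not found: %r' % args[:1])

    sys.exit(subprocess.call(args))

A common culprit is that the binary exists on the submitting machine but never reaches the container.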

Problem reading data

Hi there, I was testing the basic functionality of difacto on some simple data. The data is in libsvm-like format, e.g. 10 1 2 3 5 (10 is the label and the rest are feature ids).
But according to the dumped model it always misses some feature ids: for the example line above, the model only has w and V for 1, 2 and 3, so 5 is missing.
Can someone give me some suggestions? Thanks!
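For reference, standard LIBSVM lines pair every feature id with an explicit value (label id:value id:value ...), so in that notation the example line above would read:

    10 1:1 2:1 3:1 5:1

If difacto's parser expects the id:value form (an assumption here, not confirmed), bare ids could plausibly be misread; another possibility is that coordinates whose learned weight stayed exactly zero are simply omitted from the dump.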

L1/L2 has no effect on AUC

I have tested L1 and L2 parameters ranging from 0.00 to 1000 and found they no longer have any effect on the test AUC. Is something incomplete in the code of the original project, or am I using it in the wrong way?
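As a point of comparison, DiFacto-style solvers typically use FTRL-like updates, where the l1 term has a closed-form thresholding effect that visibly sparsifies weights and therefore moves AUC. A sketch of the standard FTRL-Proximal weight rule (McMahan et al.; parameter names here are illustrative, not this repo's config keys):

    import math

    def ftrl_weight(z, n, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        # Standard FTRL-Proximal closed form: any coordinate with
        # |z| <= l1 is clamped to exactly 0, so sweeping l1 from 0 to
        # 1000 should change which weights survive (and hence the AUC).
        if abs(z) <= l1:
            return 0.0
        return -(z - math.copysign(l1, z)) / ((beta + math.sqrt(n)) / alpha + l2)

    print(ftrl_weight(z=3.5, n=4.0, l1=0.0))     # nonzero weight
    print(ftrl_weight(z=3.5, n=4.0, l1=1000.0))  # clamped to 0.0

If a sweep over that range leaves the AUC untouched, it is worth checking that the config keys actually reach the updater.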

Difacto_DMLC installation fails

root@hadoop-master:~/Difacto_DMLC/src/difacto# make
g++ -O3 -ggdb -Wall -std=c++11 -I./ -I../ -I../../ps-lite/src -I../../dmlc-core/include -I../../dmlc-core/src -I../../ps-lite/deps/include -fopenmp -fPIC -DDMLC_USE_HDFS=0 -DDMLC_USE_S3=0 -DDMLC_USE_GLOG=1 -DDMLC_USE_AZURE=0 build/config.pb.o build/difacto.o ../../dmlc-core/libdmlc.a ../../ps-lite/build/libps.a -fopenmp -lrt ../../ps-lite/deps/lib/libglog.a ../../ps-lite/deps/lib/libprotobuf.a ../../ps-lite/deps/lib/libgflags.a ../../ps-lite/deps/lib/libzmq.a ../../ps-lite/deps/lib/libcityhash.a ../../ps-lite/deps/lib/liblz4.a -lgssapi_krb5 -o build/difacto.dmlc
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSFileSystem::~HDFSFileSystem()':
hdfs_filesys.cc:(.text+0xba): undefined reference to `hdfsDisconnect'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSFileSystem::Open(dmlc::io::URI const&, char const*, bool)':
hdfs_filesys.cc:(.text+0x6a0): undefined reference to `hdfsOpenFile'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSFileSystem::GetPathInfo(dmlc::io::URI const&)':
hdfs_filesys.cc:(.text+0x16df): undefined reference to `hdfsGetPathInfo'
hdfs_filesys.cc:(.text+0x1cbd): undefined reference to `hdfsFreeFileInfo'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSFileSystem::HDFSFileSystem()':
hdfs_filesys.cc:(.text+0x27e1): undefined reference to `hdfsConnect'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSFileSystem::ListDirectory(dmlc::io::URI const&, std::vector<dmlc::io::FileInfo, std::allocator<dmlc::io::FileInfo> >*)':
hdfs_filesys.cc:(.text+0x2d5a): undefined reference to `hdfsListDirectory'
hdfs_filesys.cc:(.text+0x3419): undefined reference to `hdfsFreeFileInfo'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSStream::~HDFSStream()':
hdfs_filesys.cc:(.text._ZN4dmlc2io10HDFSStreamD2Ev[_ZN4dmlc2io10HDFSStreamD5Ev]+0x41): undefined reference to `hdfsCloseFile'
hdfs_filesys.cc:(.text._ZN4dmlc2io10HDFSStreamD2Ev[_ZN4dmlc2io10HDFSStreamD5Ev]+0xa2): undefined reference to `hdfsDisconnect'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSStream::Read(void*, unsigned long)':
hdfs_filesys.cc:(.text._ZN4dmlc2io10HDFSStream4ReadEPvm[_ZN4dmlc2io10HDFSStream4ReadEPvm]+0x55): undefined reference to `hdfsRead'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSStream::Write(void const*, unsigned long)':
hdfs_filesys.cc:(.text._ZN4dmlc2io10HDFSStream5WriteEPKvm[_ZN4dmlc2io10HDFSStream5WriteEPKvm]+0x5e): undefined reference to `hdfsWrite'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSStream::Seek(unsigned long)':
hdfs_filesys.cc:(.text._ZN4dmlc2io10HDFSStream4SeekEm[_ZN4dmlc2io10HDFSStream4SeekEm]+0x2e): undefined reference to `hdfsSeek'
../../dmlc-core/libdmlc.a(hdfs_filesys.o): In function `dmlc::io::HDFSStream::Tell()':
hdfs_filesys.cc:(.text._ZN4dmlc2io10HDFSStream4TellEv[_ZN4dmlc2io10HDFSStream4TellEv]+0x2b): undefined reference to `hdfsTell'
collect2: error: ld returned 1 exit status
make: *** [build/difacto.dmlc] Error 1
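One observation on the log above: the compile line passes -DDMLC_USE_HDFS=0, yet libdmlc.a still contains hdfs_filesys.o, so dmlc-core was evidently built with HDFS enabled while difacto is linked without libhdfs. Making the USE_HDFS setting agree between the dmlc-core build and the difacto build (and, if HDFS support is wanted, linking against libhdfs and libjvm) is the usual way out of this mismatch; that reading is a diagnosis from the log, not a confirmed fix.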

How can Difacto_DMLC be run distributed?

@CNevd Hi, your Difacto_DMLC project gives two examples, running locally and running remotely on YARN. But before the simplification (parameter_server became ps-lite), the parameter server could be run distributed by supplying a list of IPs. Does Difacto_DMLC still have this capability? If not, what kind of changes would be needed to add it?
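In dmlc-core checkouts from around the same time there is also an ssh-based launcher next to the YARN one (tracker/dmlc_ssh.py), which takes a host file listing the machines' IPs plus worker and server counts, along the lines of:

    ../../dmlc-core/tracker/dmlc_ssh.py -H hosts -n 2 -s 1 build/difacto.dmlc guide/demo.conf

The exact flags here are from memory and should be checked against the script's --help; this is a pointer, not a confirmed recipe.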

run_yarn.sh fails

sh run_yarn.sh
the error information:

18/01/24 13:26:09 WARN util.NativeCodeLoader (NativeCodeLoader.java:(62)) : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/01/24 13:26:09 INFO client.RMProxy (RMProxy.java:createRMProxy(92)) : Connecting to ResourceManager at master4.osos.com/10.155.140.215:8032
18/01/24 13:26:16 INFO dmlc.Client (Client.java:run(289)) : jobname=DMLC[nworker=2,nsever=1]:difacto.dmlc,username=nlp_qrw
18/01/24 13:26:16 INFO dmlc.Client (Client.java:run(296)) : Submitting application application_1506450123858_250218
18/01/24 13:26:16 INFO impl.YarnClientImpl (YarnClientImpl.java:submitApplication(236)) : Submitted application application_1506450123858_250218
F0124 13:34:31.273530 7162 manager.cc:55] Timeout (500 sec) to wait all other nodes initialized. See commmets for more information
*** Check failure stack trace: ***
@ 0x482eaa google::LogMessage::Fail()
@ 0x484d72 google::LogMessage::SendToLog()
@ 0x482a8f google::LogMessage::Flush()
@ 0x48568e google::LogMessageFatal::~LogMessageFatal()
@ 0x4715e2 ps::Manager::Run()
@ 0x468e19 ps::Postoffice::Run()
@ 0x40813b main
@ 0x7f180e2dbb35 __libc_start_main
@ 0x40a221 (unknown)
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File "/data1/peak/peakzeng/Difacto_DMLC-master/dmlc-core/tracker/tracker.py", line 345, in
self.thread = Thread(target = (lambda : subprocess.check_call(self.cmd, env=env, shell=True)), args = ())
File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '../../dmlc-core/tracker/../yarn//run_hdfs_prog.py build/difacto.dmlc guide/demo_hdfs.conf ' returned non-zero exit status 250

how to set multiple hdfs training data paths

My hdfs data path is like this: /root/${date}/part-*

It causes an error when train_data is written as in the examples below:

1) /root/2016060*/part-*
2) /root/2016060[1-9]/part-*
3) /root/20160601/part-* /root/20160602/part-* ...
4) /root/20160601/part-*,/root/20160602/part-*, ...

So, how can I set multiple hdfs training data paths? Thanks.
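One way to narrow this down (a diagnostic sketch, not a fix): let HDFS itself expand each spelling. If hadoop fs -ls matches the files but training still fails, the limitation is in the trainer's config parsing rather than in HDFS globbing.

    import subprocess

    # Mirror the spellings from the examples above and see which ones
    # HDFS can expand at all.
    for pattern in ('/root/2016060*/part-*',
                    '/root/2016060[1-9]/part-*'):
        print('== %s ==' % pattern)
        subprocess.call(['hadoop', 'fs', '-ls', pattern])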

A new feature is needed

When predicting, we only get the probabilities, but we don't get the original samples.
for example:
y1 x11 x12 (sample 1)
y2 x21 x22 (sample 2)

we then get
predict_y1
predict_y2

but we don't know whether predict_y1 is the prediction for sample 1 or for sample 2.
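A workaround sketch, assuming the predictor emits exactly one score per input line and preserves input order (worth verifying on a small file first; the file names are illustrative):

    # Join each prediction back to the sample it came from, by line order.
    with open('test.libsvm') as samples, open('pred.txt') as preds, \
         open('joined.txt', 'w') as out:
        for sample, score in zip(samples, preds):
            out.write('%s\t%s\n' % (score.strip(), sample.strip()))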

sh run_local.sh fails!!!

root@hadoop-master:~/Difacto_DMLC/src/linear# sh run_local.sh
2016-11-16 15:27:54,784 INFO start listen on 172.18.0.2:9091
build/linear.dmlc: error while loading shared libraries: libhdfs.so.0.0.0: cannot open shared object file: No such file or directory
build/linear.dmlc: error while loading shared libraries: libhdfs.so.0.0.0: cannot open shared object file: No such file or directory
build/linear.dmlc: error while loading shared libraries: libhdfs.so.0.0.0: cannot open shared object file: No such file or directory
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/root/Difacto_DMLC/dmlc-core/tracker/tracker.py", line 345, in
self.thread = Thread(target = (lambda : subprocess.check_call(self.cmd, env=env, shell=True)), args = ())
File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
raise CalledProcessError(retcode, cmd)
CalledProcessError: Command 'build/linear.dmlc guide/demo.conf ' returned non-zero exit status 127

build/dump.dmlc: error while loading shared libraries: libhdfs.so.0.0.0: cannot open shared object file: No such file or directory
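Exit status 127 above is the dynamic loader failing before the trainer even starts: linear.dmlc was linked against libhdfs, but the runtime linker cannot find libhdfs.so.0.0.0. Pointing LD_LIBRARY_PATH at Hadoop's native library directory before launching usually resolves this; a sketch from the launcher side, with the path illustrative:

    import os
    import subprocess

    env = dict(os.environ)
    # Use whichever directory actually contains libhdfs.so.0.0.0 on your
    # machine ($HADOOP_HOME/lib/native on many installs).
    env['LD_LIBRARY_PATH'] = ('/usr/local/hadoop/lib/native:' +
                              env.get('LD_LIBRARY_PATH', ''))
    subprocess.check_call(['build/linear.dmlc', 'guide/demo.conf'], env=env)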

GLIBC_2.14

./difacto.dmlc: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./difacto.dmlc)
./difacto.dmlc: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./libstdc++.so.6)
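This usually means the binary was built against a newer glibc than the one on the machine running it (CentOS 6, for example, ships glibc 2.12); rebuilding on the target host, or on a build host whose glibc matches, is the standard fix. A quick way to see what the failing host provides:

    import platform

    # Reports the C library this host's Python is linked against, e.g.
    # ('glibc', '2.12') on CentOS 6 -- anything below 2.14 will refuse
    # to load this difacto.dmlc binary.
    print(platform.libc_ver())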

How is the training or prediction started?

I found that the training or prediction is started at the end of

void StartDispatch()
namely,

    /*ask all workers to start by sending an empty workload*/
    Workload wl;
    SendWorkload(ps::kWorkerGroup, wl);

But if the workload in the request sent to a worker node is empty, the worker just jumps out of its process method and responds with nothing. The scheduler node then receives no response, so it never assigns a new training or prediction workload to the workers. So how is the training or prediction started? I'm a little confused about this. Thanks for your response.
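For intuition, the usual pattern in dispatchers of this shape (sketched generically below; the names are illustrative Python, not this repo's C++ API) is that a worker acknowledges every request, even an empty one, and it is the scheduler's reply handler that hands out the first real workload:

    from collections import deque

    pending = deque(['block-0', 'block-1', 'block-2'])  # hypothetical workloads

    def worker(workload):
        # Workers acknowledge every request, even an empty one.
        if workload is not None:
            print('processing', workload)
        return 'ack'

    # The scheduler bootstraps with an empty workload; each ack, including
    # the one for the empty request, triggers the next real assignment.
    ack = worker(None)
    while ack and pending:
        ack = worker(pending.popleft())

Under that reading the empty workload is only a kick-start, and the empty reply (rather than silence) is what lets the scheduler begin assigning data blocks.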

run_yarn.sh fails: Timeout (500 sec) to wait all other nodes initialized. See commmets for more information

command is: ../../dmlc-core/tracker/dmlc_yarn.py --queue users.bigdata --hadoop_binary /opt/cloudera/parcels/CDH/bin/hadoop --vcores 1 -n 2 -s 1 build/linear.dmlc guide/demo_hdfs.conf

error log:
Container: container_1504869490009_0132_01_000006 on cdh246.test_8041

LogType:stderr
Log Upload Time:Fri Sep 15 12:23:24 +0800 2017
LogLength:606
Log Contents:
F0915 12:14:57.163543 2300 manager.cc:55] Timeout (500 sec) to wait all other nodes initialized. See commmets for more information
*** Check failure stack trace: ***
@ 0x496a2a google::LogMessage::Fail()
@ 0x4988f2 google::LogMessage::SendToLog()
@ 0x49660f google::LogMessage::Flush()
@ 0x49920e google::LogMessageFatal::~LogMessageFatal()
@ 0x47daf2 ps::Manager::Run()
@ 0x475339 ps::Postoffice::Run()
@ 0x408381 main
@ 0x7f3bdc936b15 __libc_start_main
@ 0x40a481 (unknown)

LogType:stdout
Log Upload Time:Fri Sep 15 12:23:24 +0800 2017
LogLength:75
Log Contents:
===============================argv: ['./linear.dmlc', './demo_hdfs.conf']
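In ps-lite setups this timeout most often means the worker/server containers never managed to connect back to the scheduler: a firewalled port, a wrong advertised IP, or fewer nodes starting than -n/-s requested. A quick reachability probe from a worker host, with the address illustrative (use the ip:port from the tracker's "start listen on" line):

    import socket

    # Replace host/port with the scheduler address the tracker printed.
    sock = socket.create_connection(('10.0.0.1', 9091), timeout=5)
    print('scheduler reachable')
    sock.close()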

dump bug

when I run the command "build/dump.dmlc model_in="agaricus_model_part-0" dump_out="dump.txt" need_inverse", it reports the following error:

[screenshot of the error message, 2016-11-18]

Can anyone help me?

Some confusion about using dump.cc to dump a readable model

I used the agaricus data to test dump.cc and dumped a readable model.
But the model's keys are very large, such as 602879701896396800, while the train/test data's feature indices only run from 0 to about 120.
So I am confused: what do the keys in the readable model mean, and how can I match them with the feature indices?
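Given the need_inverse flag in the dump issue above, one hypothesis (an assumption, not confirmed from the source) is that the stored keys are the raw feature ids with their 64-bit representation reversed, a trick for spreading consecutive ids evenly across servers. A quick way to test that on a dumped key; if the result does not land in the expected 0-120 range, the encoding is something else (hashing, say):

    def reverse_bits_64(key):
        # Undo a 64-bit bit reversal.
        out = 0
        for _ in range(64):
            out = (out << 1) | (key & 1)
            key >>= 1
        return out

    print(reverse_bits_64(602879701896396800))  # the large key from this issue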

run_local hangs

Hi, when I run the run_local.sh script it stops at "2017-07-27 16:17:08,738 INFO start listen on 127.0.0.1:9091" and goes no further. What could be the reason?
