swim's Issues

How can I find this data "FB-2009 and FB-2010"?

FB-2009 comes from historical Hadoop traces on a 600-machine cluster at Facebook. The original trace spans 6 months from May 2009 to October 2009, and contains roughly 1 million jobs.

FB-2010 comes from historical Hadoop traces on the same cluster at Facebook, now grown to 3000 machines. The original trace spans 1.5 months from October 2010 to November 2010, and also contains roughly 1 million jobs.

I am wondering how I can access this data. Thanks!

parse-hadoop-jobhistory.pl doesn't work

Hello everyone,

I used SWIM to test my MR cluster. After the run finished, I tried to use "parse-hadoop-jobhistory.pl" to analyse the job logs, but it doesn't work.

I tried it on CentOS 6 and Ubuntu 12.04, and it doesn't work on either.
I followed the guide "Step 1. Parse historical Hadoop logs".

I hope you can check it again.

Thanks

ZHANG Bo

Compilation error in WorkGen.java

Hi,

While compiling WorkGen.java, I encountered two "cannot find symbol" compilation errors, at lines 267 and 268. I think these lines need to be changed

From:
System.out.println("shuffleInputRatio = " + Double.parseDouble(job.getRaw("workGen.ratios.shuffleInputRatio")));
System.out.println("outputShuffleRatio = " + Double.parseDouble(job.getRaw("workGen.ratios.outputShuffleRatio")));

To:
System.out.println("shuffleInputRatio = " + Double.parseDouble(jobConf.getRaw("workGen.ratios.shuffleInputRatio")));
System.out.println("outputShuffleRatio = " + Double.parseDouble(jobConf.getRaw("workGen.ratios.outputShuffleRatio")));

Regards,
Saurav
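
For readers hitting the same error, here is a minimal sketch of the pattern the proposed fix relies on. javac reports "cannot find symbol" when a name is not declared in scope, which is presumably the case for `job` at those lines, whereas `jobConf` is declared. The `StubConf` class below is a hypothetical stand-in that models only a `getRaw(String)` lookup, the `readRatio` helper is invented for illustration, and the ratio values are made up. None of this is the actual WorkGen.java source.

```java
import java.util.HashMap;
import java.util.Map;

public class RatioPrintSketch {

    /** Hypothetical stand-in for the job configuration object; not the real Hadoop class. */
    static class StubConf {
        private final Map<String, String> props = new HashMap<>();

        void set(String key, String value) {
            props.put(key, value);
        }

        /** Returns the stored string for the key, or null if it is unset. */
        String getRaw(String key) {
            return props.get(key);
        }
    }

    /** Hypothetical helper: looks up a ratio by key and parses it as a double. */
    static double readRatio(StubConf jobConf, String key) {
        String raw = jobConf.getRaw(key);
        if (raw == null) {
            throw new IllegalArgumentException("missing configuration key: " + key);
        }
        return Double.parseDouble(raw);
    }

    public static void main(String[] args) {
        // The variable is named jobConf, so every later reference must use that name.
        StubConf jobConf = new StubConf();
        jobConf.set("workGen.ratios.shuffleInputRatio", "0.5");   // illustrative value
        jobConf.set("workGen.ratios.outputShuffleRatio", "2.0");  // illustrative value

        System.out.println("shuffleInputRatio = "
                + readRatio(jobConf, "workGen.ratios.shuffleInputRatio"));
        System.out.println("outputShuffleRatio = "
                + readRatio(jobConf, "workGen.ratios.outputShuffleRatio"));
    }
}
```

Referring to an undeclared name such as `job` inside `main` would make this sketch fail to compile with the same class of error.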

hadoop jar HDFSWrite.jar fails

I am following the instructions at https://github.com/SWIMProjectUCB/SWIM/wiki/Performance-measurement-by-executing-synthetic-or-historical-workloads.

I was able to successfully complete steps 1 through 4 for the synthetic workload provided as part of SWIM, specifically FB-2009_samples_24_times_1hr_0_first50jobs.tsv.

On the 5th step, when I run hadoop jar HDFSWrite.jar, it gives an error:

14/06/16 15:43:00 WARN mapred.JobClient: No job jar file set. User classes may not be

After this, the map tasks start to fail and the job fails. As a result, the input data is not created in HDFS and I am unable to move on to the next step.

Can someone help me figure out what I am missing?

Thanks,
Madhura

More details from the console:
[hdfs@hadoop1 workloadSuite]$ hadoop jar HDFSWrite.jar org.apache.hadoop.examples.HDFSWrite -conf conf/randomwriter_conf.xsl workGenInput
client.getClusterStatus().getMaxMapTasks() gives 72
client.getClusterStatus().getMaxReduceTasks() gives 36
Running on 6 nodes with 60 maps,
writing 64424509440 bytes with 1073741824 bytes per map.
Job started: Mon Jun 16 15:43:00 PDT 2014
14/06/16 15:43:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.

14/06/16 15:43:00 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.examples.HDFSWrite$RandomInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1806)
at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:620)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.examples.HDFSWrite$RandomInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1774)
at org.apache.hadoop.conf.Configuration.getClass(Co
14/06/16 15:22:33 INFO mapred.JobClient: Task Id : attempt_201406161518_0001_m_000046_1, Status : FAILED
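
As a sanity check on the console output above, the reported total equals the number of maps times the bytes per map (60 maps at 1 GiB each). That suggests the job was sized as intended and that the failure lies in class loading, not in the write sizes. A small sketch using only the figures printed in the log; the class and method names are hypothetical:

```java
public class WriteSizeCheck {

    /** Total bytes written = number of maps * bytes per map (overflow-checked). */
    static long totalBytes(int maps, long bytesPerMap) {
        return Math.multiplyExact((long) maps, bytesPerMap);
    }

    public static void main(String[] args) {
        long bytesPerMap = 1L << 30;            // 1073741824 bytes, as printed in the log
        long total = totalBytes(60, bytesPerMap);
        System.out.println(total);              // prints 64424509440, matching the log
    }
}
```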
