Cobalt now requires (queue=) in the RSL string; it was optional previously. If you do not specify either the standard or the industrial queue, Globus will throw an exception:
Globus error 37
Condor will also throw an exception:
Exec format error (Code 6, Subcode 8)
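For example, a minimal submission that names the queue explicitly might look like this (a sketch; the gatekeeper host and jobmanager name are placeholders for the actual Cobalt contact string):
globusrun -o -r cobaltGatekeeperHost:2119/jobmanager-pbs '&(executable=/bin/hostname)(queue=standard)'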
Wednesday, February 11, 2009
Submitting job to Ranger
Gatekeeper: gatekeeper.ranger.tacc.teragrid.org:2119/jobmanager-sge
Batch queue system: SGE
Check the queue status: showq -u
(Test 0) Try locally installed executables with data files stored in the scratch directory:
./cap3 /scratch/00891/tg459282/2mil/cluster1807.fsa -p 95 -o 49 -t 100
(Test 1) Try a simple job submission through the Globus Toolkit:
globusrun -o -r gatekeeper.ranger.tacc.teragrid.org:2119/jobmanager-sge '&(executable=/bin/ls)'
It works with Swarm!
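A natural next step (a sketch only, not something run as part of this test; the cap3 path and the queue name are placeholders) would be to launch cap3 itself through the same jobmanager:
globusrun -o -r gatekeeper.ranger.tacc.teragrid.org:2119/jobmanager-sge '&(executable=/path/to/cap3)(arguments=/scratch/00891/tg459282/2mil/cluster1807.fsa -p 95 -o 49 -t 100)(queue=normal)'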
Monday, February 9, 2009
Submit and Manage Vanilla Condor Jobs with Swarm
To submit vanilla Condor jobs to Swarm, you first have to install Swarm and configure the service to process vanilla Condor jobs. The source code is available from:
svn co https://ogce.svn.sourceforge.net/svnroot/ogce/ogce-services-incubator
For general information on installing Swarm, please refer to:
http://decemberstorm.blogspot.com/2008/11/swarm-sever-installation-guide.html
Now configure your Swarm installation for the vanilla Condor pool.
Step 1. Modify swarm/TGResource.properties to process only the vanilla Condor pool. These are my properties:
matchmaking=FirstAvailableResource
taskQueueMaxSize = 100
taskQueueScanningInterval = 3000
condorCluster=true
teragridHPC=false
eucalyptus=false
condorCluster_numberOfToken=10
condorRefreshInterval = 2000
Step 2. Start the Swarm server.
Upload swarm/build/jobsub.aar to your Axis2 installation.
Step 3. A client example for vanilla Condor jobs is stored at:
swarm/src/core/org/ogce/jobsub/clients/SubmitVanillaJob.java
There are three example methods:
submitJobWithStandardOutput(): a job with standard output, without input files
submitJobWithOutputTransfer(): a job with standard output and output files
submitJobWithInputOutputTransfer(): a job with standard output, output files, and local input files
Step 4. Compile the example file:
[swarm]ant clean
[swarm]ant
Step 5. Run the example file:
[swarm]./run.sh submit_vanilla_job
Step 6. Check the status:
[swarm]cd ClientKit
[ClientKit]./swarm status http://serverIP:8080/axis2/services/Swarm your_ticket_ID
Step 7. Get the location of the output file:
[ClientKit]./swarm outputURL http://serverIP:8080/axis2/services/Swarm your_ticket_ID
This will return the URL of the output file.
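Once jobs are flowing, it can also help to confirm on the Condor side that Swarm's submissions actually reached the vanilla pool. Assuming shell access to the Condor submit host (these are plain Condor commands, not part of the Swarm ClientKit):
condor_q
condor_status -avail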
Tuesday, December 9, 2008
Renewing the proxy
more /usr/local/gateway/RenewCred/renewCred.sh
#!/bin/bash
export GLOBUS_LOCATION=$HOME/globus-condor/globus
source $GLOBUS_LOCATION/etc/globus-user-env.sh
myproxy-logon -s myproxy.teragrid.org -l quakesim -t 5000 -S << EOF
PUT_PASSWORD_HERE
EOF
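This script is typically run on a schedule so the gateway's proxy never lapses. For example, a crontab entry along these lines would renew it twice a day (the schedule and log path are assumptions, not part of the original setup):
0 */12 * * * /usr/local/gateway/RenewCred/renewCred.sh >> $HOME/renewCred.log 2>&1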
Monday, December 1, 2008
Writing a simple Java application with the HBase APIs
To write a simple Java application with the HBase APIs, you need Hadoop and HBase installed on your machine. For this example, I used a Hadoop installation with a pseudo-distributed setup on localhost.
The code is mostly taken from the HBase site:
http://hadoop.apache.org/hbase/docs/r0.2.1/api/index.html
import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scanner;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.io.Cell;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.HBaseConfiguration;
public class MySimpleTest {
  public static void main(String args[]) throws IOException {
    // You need a configuration object to tell the client where to connect.
    // But don't worry, the defaults are pulled from the local config file.
    HBaseConfiguration config = new HBaseConfiguration();
    // This instantiates an HTable object that connects you to the "myTable" table.
    HTable table = new HTable(config, "myTable");
    // To do any sort of update on a row, you use an instance of the BatchUpdate
    // class. A BatchUpdate takes a row and optionally a timestamp which your
    // updates will affect.
    BatchUpdate batchUpdate = new BatchUpdate("myRow");
    // The BatchUpdate#put method takes the name of the cell you want to put a
    // value into (here given as a String), and a byte array that is the value
    // you want to store. Note that if you want to store strings, you have to
    // getBytes() from the string for HBase to understand how to store it.
    // (The same goes for primitives like ints and longs and user-defined
    // classes - you must find a way to reduce it to bytes.)
    batchUpdate.put("myColumnFamily:columnQualifier1",
        "columnQualifier1 value!".getBytes());
    // Deletes are batch operations in HBase as well.
    batchUpdate.delete("myColumnFamily:cellIWantDeleted");
    // Once you've done all the puts you want, you need to commit the results.
    // The HTable#commit method takes the BatchUpdate instance you've been
    // building and pushes the batch of changes you made into HBase.
    table.commit(batchUpdate);
    // Now, to retrieve the data we just wrote. The values that come back are
    // Cell instances. A Cell is a combination of the value as a byte array and
    // the timestamp the value was stored with. If you happen to know that the
    // value contained is a string and want an actual string, then you must
    // convert it yourself.
    Cell cell = table.get("myRow", "myColumnFamily:columnQualifier1");
    String valueStr = new String(cell.getValue());
    // Sometimes, you won't know the row you're looking for. In this case, you
    // use a Scanner. This will give you a cursor-like interface to the contents
    // of the table.
    Scanner scanner =
        // we want to get back only "myColumnFamily:columnQualifier1" when we iterate
        table.getScanner(new String[]{"myColumnFamily:columnQualifier1"});
    // Scanners in HBase 0.2 return RowResult instances. A RowResult is like the
    // row key and the columns all wrapped up in a single interface.
    // RowResult#getRow gives you the row key. RowResult also implements
    // Map, so you can get to your column results easily.
    // Now, for the actual iteration. One way is to use a while loop like so:
    RowResult rowResult = scanner.next();
    while (rowResult != null) {
      // print out the row we found and the columns we were looking for
      System.out.println("Found row: " + new String(rowResult.getRow()) + " with value: " +
          rowResult.get("myColumnFamily:columnQualifier1".getBytes()));
      rowResult = scanner.next();
    }
    // The other approach is to use a foreach loop. Scanners are iterable!
    for (RowResult result : scanner) {
      // print out the row we found and the columns we were looking for
      System.out.println("Found row: " + new String(result.getRow()) + " with value: " +
          result.get("myColumnFamily:columnQualifier1".getBytes()));
    }
    // Make sure you close your scanners when you are done!
    scanner.close();
  }
}
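To compile and run this against the pseudo-distributed setup, something along these lines should work (the jar names and versions are assumptions; adjust them to your installation, and add jars from the lib/ directories if classes are reported missing). Putting $HBASE_HOME/conf on the classpath lets the client pick up hbase-site.xml:
javac -cp $HBASE_HOME/hbase-0.2.1.jar:$HADOOP_HOME/hadoop-0.18.0-core.jar MySimpleTest.java
java -cp .:$HBASE_HOME/conf:$HBASE_HOME/hbase-0.2.1.jar:$HADOOP_HOME/hadoop-0.18.0-core.jar MySimpleTest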
Hadoop: java.io.IOException: Incompatible namespaceIDs
The error java.io.IOException: Incompatible namespaceIDs in the logs of a datanode (/logs/hadoop-hadoop-datanode-.log) might be caused by bug HADOOP-1212. Here is a site that explains how to get around this bug:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
The complete error message was:
... ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: Incompatible namespaceIDs in /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data: namenode namespaceID = 308967713; datanode namespaceID = 113030094
at org.apache.hadoop.dfs.DataStorage.doTransition(DataStorage.java:281)
at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:121)
at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:230)
at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:199)
at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:1202)
at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1146)
at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:1167)
at org.apache.hadoop.dfs.DataNode.main(DataNode.java:1326)
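The brute-force workaround described on that page (a sketch; note that it discards whatever blocks the affected datanode holds, and the data directory path is the one from the error message above) is to stop the cluster, wipe the datanode's data directory, and restart:
bin/stop-all.sh
rm -rf /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data
bin/start-all.sh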
Friday, November 21, 2008
Test #4 (2mil): Nov. 21, 2008
Full sequence test for 2 million sequences.
(1) Resource setup: BigRed(150), Ornl(80), Cobalt(80)
(2) Service setup: Status check interval: 60 secs
Job queue scan interval: 60 secs
Job queue size: 100
(3) Input files: 2mil.tar copied to each cluster's $TG_CLUSTER_SCRATCH directory.
The input files are referred to by arguments giving their full paths.
(4) Output files: staged out to swarm host
(5) Client-side setup: input files are located on my desktop (the same machine as the swarm host).
The client scans the directory and finds files that contain more than one sequence (using the grep unix command through Java Runtime; see the sketch after this list), then sends the requests to swarm in batches of 10 per RPC call.
(6) Results:
- Total duration of the submission: 170364307 milliseconds (around 47.3 hours).
- Total number of jobs submitted: 75533
- Total number of files scanned: 536825
(7) Held Jobs: To be added
(8) Open Issues:
Submission time needs to be improved.
Reasons:
- Loading 536825 objects representing the filenames takes too much memory. [Approach]: Use a file filter and load a partial list at a time.
- Using Java Runtime: Java Runtime requires extra memory to execute a system fork. [Approach]: Try checking the number of sequences with a Java FileInputStream instead.
- Running the client and the host on the same machine. [Approach]: Try running the client on a different machine.
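For reference, the per-file check that the client currently forks amounts to counting FASTA header lines, roughly like this (a sketch; it assumes the .fsa files use the usual '>' header convention):
grep -c '^>' cluster1807.fsa
A count greater than 1 means the file holds more than one sequence and gets submitted.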