SoftwareResources

Note to editors!

 

Don't split this page into multiple pages without clearing it with everyone else.

 

With a basic description of everything on this single page, you can easily use the browser's "search" command to find the program you want, if you don't know the name.

 

Lab software

 

You can list the software you can currently run by typing "ls $path | less -S". If several programs share the same name, the which command shows which one will be run. For example, if you find two versions of java, type "which java"; the result will be something like /usr/bin/somewhere/jre/java-1.4.2, and now you know exactly which executable file "java" refers to. Note that directories listed earlier in $path take precedence over later ones (keep this in mind when you edit the path in your config file).
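
Here is a small sketch of the precedence rule, written for a POSIX shell (where the variable is $PATH rather than the csh-style $path). The program name "mytool" and the two directories are made up for the demo, and "command -v" stands in for which.

```shell
#!/bin/sh
# Sketch: the first matching directory in PATH wins.
# "mytool", "first" and "second" are hypothetical names for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/first" "$demo/second"

# Two different executables with the same name.
printf '#!/bin/sh\necho "first copy"\n' > "$demo/first/mytool"
printf '#!/bin/sh\necho "second copy"\n' > "$demo/second/mytool"
chmod +x "$demo/first/mytool" "$demo/second/mytool"

# "first" is listed earlier, so it takes precedence over "second".
PATH="$demo/first:$demo/second:$PATH"
export PATH

resolved=$(command -v mytool)   # POSIX counterpart of "which mytool"
output=$(mytool)
echo "mytool resolves to: $resolved"
echo "running it prints:  $output"
```

Swapping the order of the two directories in the PATH line makes the other copy win.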

 

How / where to install new software

 

 

List of known bugs that need to be fixed

 

  • Known Bugs: Known bugs in our software, along with descriptions and workarounds.

 

Full list of installed lab software

 

/cse/grads/alexgw/bin:

  • These are all of Alex's files. They probably are not that interesting at the moment.

 

copy-my-ip.sh*

  • Copies the IP address of the computer it was run on. Might be useful for uploading it somewhere too.

 

display_large_files.pl

  • This can be run to figure out how much space you are using on sysbio. It prints out the results to a small text file in your home directory.

 

krogan_parse.pl

  • Alex's. I forget what it does, though.

 

make_matrix_from_two_cols.pl

  • Makes a matrix-style file from two columns of "adjacency matrix"-style data (I think).

 

randomize_terminal_color.pl

  • For Mac OS X Terminal users, if you run this script with the -cycle flag, your terminal windows will have unique background colors. You need to put a line that runs it in your OS X ~/.profile so that each new Terminal window will run the script.

 

rsync-script.sh

  • Useful for figuring out how the rsync backup / remote synchronization program works.

 

sd.pl

  • "Safer delete." It moves things to ~/.Trash, but it isn't very safe. Beware! Probably "del.pl" is a better version of this, so check that out instead.

 

tree_of_filestructure.sh

  • Generates a text description of the directory structure. You can print it out, or look at it with less or emacs or something.

 


 

$MYSRC/alexgw:

  • Alex's various scripts. Kind of a mess!

 

count_items_per_line.pl*

 

display_large_files.pl*

 

drug_interactions.pl*

 

krogan_parse.pl*

 

parse_pathway_links.pl*

 

shell_scripts/

 

strip_comments.pl*

 


 

$MYPERLDIR/Tools:

  • This is probably the most important section. These are very useful perl programs that perform common actions and extend the functionality of basic UNIX commands. Is "sort" annoying you with its lack of options? Try "sort.pl". Same for cut / cut.pl, paste / paste.pl, and many others.

 

add_column.pl*

 

add_to_column.pl*

 

aggregate.pl*

 

all_combinations.pl*

 

apply.pl*

 

bind.pl*

  • Replaces text in files. Used to create specific files from a general template. If a template Makefile says $LOCATION, you can use bind.pl to substitute a specific location, for example, and create a new specific Makefile.
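
To make the template idea concrete, here is a sketch that does the same kind of substitution with plain sed. It is not bind.pl (its options are not documented on this page), and the file names and the /projects/example location are hypothetical.

```shell
#!/bin/sh
# Sketch only: uses sed, not bind.pl. All names below are placeholders.
work=$(mktemp -d)

# A general template that refers to a placeholder called $LOCATION.
cat > "$work/Makefile.template" <<'EOF'
DATA_DIR = $LOCATION/data
OUT_DIR = $LOCATION/results
EOF

# Substitute a specific location to produce a concrete Makefile.
# The pattern is single-quoted so the shell leaves "$LOCATION" alone.
location=/projects/example
sed 's|\$LOCATION|'"$location"'|g' "$work/Makefile.template" > "$work/Makefile"

cat "$work/Makefile"
```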

 

bind_table.pl*

 

body.pl*

 

bootstrap.pl*

 

cap.pl*

  • Puts a header line onto a file. It's easier than writing to a temporary file and then using cat.
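
For comparison, this sketch shows the temporary-file approach that cap.pl spares you, next to a plain-shell way of getting the same result. It does not call cap.pl itself, and the column names and data are made up.

```shell
#!/bin/sh
# Sketch: adding a header line to a tab-delimited file (sample data is invented).
work=$(mktemp -d)
printf 'geneA\t1.5\ngeneB\t2.0\n' > "$work/scores.tab"

# The long way: write the header to a temporary file, then cat the two together.
printf 'gene\tscore\n' > "$work/header.tmp"
cat "$work/header.tmp" "$work/scores.tab" > "$work/with_header.tab"

# Same result without the temporary file, using a command group.
{ printf 'gene\tscore\n'; cat "$work/scores.tab"; } > "$work/with_header2.tab"

cat "$work/with_header2.tab"
```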

 

cast.pl*

 

cat.pl*

  • Extended version of UNIX cat.

 

chars.pl*

 

check_table.pl*

 

col_merge.pl*

 

collapse-and-center.pl*

 

collapse.pl*

 

cols.pl*

 

column_stats.pl*

 

combine.pl*

 

combine_columns.pl*

 

compute_all_pairwise_correlations.pl*

 

compute_correlation.pl*

 

compute_hyper_pvalue.pl*

 

compute_pairwise_correlations.pl*

 

compute_pvalues.pl*

 

compute_symmetric_uniform_cdf.pl*

 

concat.pl*

 

condense.pl*

 

connected_components.pl*

 

connectivity.pl*

 

consecutive_and.pl*

 

consecutive_correlation.pl*

 

count.pl*

 

count_ranks.pl*

 

cross.pl*

 

csvsql

  • Script to perform arbitrary SQL queries on a comma separated value file.  For example:  
    csvsql "select gene,score from geneScores.csv where score > 40"
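
The query above amounts to a filter on one column. A rough awk equivalent is sketched here, assuming geneScores.csv has a header row naming the gene and score columns and contains no quoted commas. Both are assumptions, and the sample rows are made up.

```shell
#!/bin/sh
# Sketch: what "select gene,score ... where score > 40" computes, done with awk.
work=$(mktemp -d)
printf 'gene,score\nYAL001C,55\nYBR002W,12\nYCR003W,41\n' > "$work/geneScores.csv"

selected=$(awk -F ',' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }   # map header names to column numbers
    $col["score"] + 0 > 40 { print $col["gene"] "," $col["score"] }
' "$work/geneScores.csv")

echo "$selected"
```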

cut.pl*

 

deblank.pl*

 

degree.pl*

 

del.pl*

  • Safer version of rm.

 

delete_columns.pl*

 

delim2delim.pl*

 

differ.pl*

 

discretize.pl*

 

dos2unix.pl*

 

edges2matrix.pl*

 

empirical_zscore.pl*

 

endnote2bibtex.pl*

 

exists.pl*

 

expand.pl*

  • The opposite of flatten.pl?

 

fasta2stab.pl*

 

fasta2tab.pl*

 

fasta_length.pl*

 

fill.pl*

 

fill_nan.pl*

 

filter.pl*

 

find_bidirectional.pl*

 

find_columns.pl*

 

find_rows.pl*

 

flatten.pl*

  • The opposite of expand.pl?

 

foreach.pl*

  • Useful when you run out of room on the command line. If you get "argument list too long" errors using the built-in shell "foreach," then you will want to figure out how to use this. Be warned, it requires a lot of escape characters ("\") when used in a Makefile.
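
The "argument list too long" error comes from putting too many file names on one command line. The sketch below shows the general workaround of streaming the names and handling them one at a time. It uses a plain shell loop, not foreach.pl, and the sample files are made up.

```shell
#!/bin/sh
# Sketch: process files one at a time instead of passing them all as arguments.
work=$(mktemp -d)

# A handful of sample files (a real case would have many thousands).
for i in 1 2 3 4 5; do
    echo "line $i" > "$work/sample_$i.txt"
done

# Write the names to a list, then read the list line by line.
find "$work" -name '*.txt' -type f | sort > "$work/filelist"

count=0
while IFS= read -r f; do
    wc -l < "$f" > /dev/null    # stand-in for the real per-file command
    count=$((count + 1))
done < "$work/filelist"

echo "processed $count files"
```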

 

format_xml.pl*

 

func.pl*

 

grep.pl*

  • Extended version of UNIX grep.

group.pl*

 

hyper_geometric.pl*

 

if.pl*

 

index.pl*

 

integrate.pl*

 

interconnectivity.pl*

 

interleave.pl*

 

join.pl*

  • Extended version of UNIX join. Unlike join, join.pl does NOT require the input files to be sorted already. Note that join.pl behaves slightly differently from join: it performs a left-inner join, so only the keys present in both files are kept, and they are printed in the order of the keys in the first file. For an example of this behavior, try: echo 'a' > test1 ; echo 'a\tb\na\tc' > test2 . Running join.pl test1 test2 will not report the "a b" match, while running join.pl test2 test1 will report it.
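
The behavior described above can be imitated with a few lines of awk. This is a sketch, not join.pl's actual code: the helper name keyed_join is hypothetical, and the rule that only the last line for a repeated key in the second file is remembered is an assumption chosen to reproduce the example.

```shell
#!/bin/sh
# Sketch of the described semantics (NOT the real join.pl implementation).
work=$(mktemp -d)
printf 'a\n' > "$work/test1"
printf 'a\tb\na\tc\n' > "$work/test2"

# keyed_join FIRST SECOND: keep keys present in both files, in FIRST's order.
# Assumption: if SECOND repeats a key, only its last line is remembered.
keyed_join() {
    awk -F '\t' '
        NR == FNR {                      # first pass: load SECOND into a table
            rest = ""
            for (i = 2; i <= NF; i++) rest = rest "\t" $i
            second[$1] = rest
            next
        }
        ($1 in second) { print $0 second[$1] }   # second pass: walk FIRST in order
    ' "$2" "$1"
}

forward=$(keyed_join "$work/test1" "$work/test2")
backward=$(keyed_join "$work/test2" "$work/test1")

echo "test1 vs test2:"; echo "$forward"
echo "test2 vs test1:"; echo "$backward"
```

Under that assumption, the first call prints only the "a c" pairing, while the second prints both lines of test2, which matches the asymmetry described above.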

 

join_col.pl*

 

join_combinations.pl*

 

join_fast.pl*

 

join_multi.pl*

 

join_multi_sorted.pl*

 

join_sorted.pl*

 

join_sorted_uniq.pl*

 

kill.pl*

 

kmeans.pl*

 

knn.pl*

 

leaves.pl*

 

limit_graph.pl*

 

lin.pl*

 

link_entropy.pl*

 

lists2matrix.pl*

 

ln.pl*

 

log.pl*

 

make_gnuplot_graph.pl*

 

mapper.pl*

 

mean.pl*

 

merge.pl*

 

merge_columns.pl*

 

merge_fields.pl*

 

meta_msgr.pl*

 

mi.pl*

 

modify_column.pl*

 

mv_if_diff.pl*

 

nand.pl*

 

neighbor_connectivity.pl*

 

neighborhood_overlaps.pl*

 

neighborhood_pairs.pl*

 

neighborhood_precision.pl*

 

newest.pl*

 

node.pl*

 

non_empty.pl*

 

nowhite.pl*

 

nums.pl*

 

old.sets2list.pl*

 

or_sets.pl*

 

order_file.pl*

 

order_keys.pl*

 

order_pairs.pl*

 

order_rows.pl*

 

overlap_combinations.pl*

 

overlaps.pl*

 

paste.pl*

  • Fancier version of UNIX paste.

 

pearson2fisher.pl*

 

pearson2pvalue.pl*

 

projection.pl*

 

quote.pl*

 

rand_lines.pl*

 

random.pl*

 

range.pl*

 

rank.pl*

 

redelim.pl*

 

rename.pl*

 

rename_duplicates_sorted.pl*

 

rename_num.pl*

 

rep.pl*

 

replace.pl*

 

resolve_keys.pl*

 

restrict_pairs.pl*

 

reverse.pl*

 

rewire_links.pl*

 

right.pl*

 

rm.pl*

 

rmcode.pl*

 

row_stats.pl*

 

rows.pl*

 

save/

 

scp.pl*

 

scramble_links.pl*

 

scrub.pl*

 

select.pl*

 

select_best_item.pl*

  • Allows you to select the best (minimum or maximum) item from a list. For example, if you have 10 results for every experiment, then you can use this to pick the most-highly-scoring repetition for each experiment. (Finds the min/max value for every key specified, and only prints that line.) In the event of a tie, the first such tying line found in the file is printed.
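
The selection rule (best value per key, first line wins a tie) can be sketched in awk as follows. This is an illustration of the rule, not the script's own code or options, and the column layout and scores are invented.

```shell
#!/bin/sh
# Sketch: keep the highest-scoring line for each key; on a tie keep the first one seen.
work=$(mktemp -d)
# Columns: experiment, repetition, score (tab-separated, made-up values).
printf 'exp1\trep1\t0.7\nexp1\trep2\t0.9\nexp2\trep1\t0.4\nexp2\trep2\t0.4\n' > "$work/results.tab"

best=$(awk -F '\t' '
    !($1 in max) || $3 + 0 > max[$1] {       # strict ">" so a tie keeps the earlier line
        if (!($1 in max)) order[++n] = $1    # remember keys in first-seen order
        max[$1] = $3 + 0
        line[$1] = $0
    }
    END { for (i = 1; i <= n; i++) print line[order[i]] }
' "$work/results.tab")

echo "$best"
```

Changing ">" to "<" would pick the minimum instead.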

 

self_reference.pl*

 

set_covering.pl*

 

set_diff.pl*

 

set_intersect.pl*

 

set_operations.pl*

  • Alex's attempt to make basic set operations a little easier. Duplicates some functionality of join.pl, but also has some additional features.

 

set_sizes.pl*

 

sets.pl*

 

sets2list.pl*

 

sets2matrix.pl*

 

sets_intersect.pl*

 

sets_overlap.pl*

 

shortest_path.pl*

 

skip.pl*

 

sort.pl*

 

sort_rows.pl*

 

space2tab.pl*

 

stab2fasta.pl*

 

stats.pl*

 

stretch.pl*

 

subst.pl*

 

substr.pl*

 

swap.pl*

 

symmuni.pl*

 

tab.pl*

 

tab2dotty.pl*

 

tab2fasta.pl*

 

tab2gml.pl*

 

tab2space.pl*

 

tab2xgr.pl*

 

table2visant.pl*

 

thisdir.pl*

 

threshold.pl*

 

tiling.pl*

 

topk.pl*

 

transform_matrix.pl*

 

translate_column.pl*

 

transpose.pl*

  • Transposes a matrix / table. ("Flips" it along the diagonal.)

 

transpose_fast.pl*

  • Same thing as transpose.pl, but runs significantly faster on large files...  probably should replace transpose.pl eventually.
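
For anyone unsure what transposing a table means here, this awk sketch flips a small tab-delimited matrix. It is only an illustration (it holds the whole table in memory) and is not how either script is implemented. The sample matrix is made up.

```shell
#!/bin/sh
# Sketch: transpose a tab-delimited table so rows become columns.
work=$(mktemp -d)
printf '1\t2\t3\n4\t5\t6\n' > "$work/matrix.tab"

transposed=$(awk -F '\t' '
    {
        for (j = 1; j <= NF; j++) cell[NR, j] = $j   # store every cell by (row, column)
        if (NF > ncol) ncol = NF
    }
    END {
        for (j = 1; j <= ncol; j++) {                # each input column becomes an output row
            row = cell[1, j]
            for (i = 2; i <= NR; i++) row = row "\t" cell[i, j]
            print row
        }
    }
' "$work/matrix.tab")

echo "$transposed"
```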

 

treeview2dotty.pl*

 

triangles.pl*

 

trimmed_mean.pl*

 

trunc.pl*

  • Truncate lines of specific columns only. Useful for reducing the number of decimal places of a result.

 

uniq.pl*

  • Extended version of UNIX uniq (find only unique items / delete repeats)

 

uu.pl*

 

wc.pl*

  • Extended version of UNIX wc.

 

wget_all.pl*

 

white_noise.pl*

 

xml.pl*

 

zipall.pl*

 

zipcode.pl*

 

ziplist.pl*

 

 


 

$MYPERLDIR/web:

 

create_image_map.pl*

 

create_image_map_from_file.pl*

 

create_links_section.pl*

 

html2tab.pl*

 

html_convert.pl*

 

html_utils.pl*

 

index_template.html*

 

list2htmltable.pl*

 

lists2htmltable.pl*

 

motif_template.html*

 

tab2html.pl*

 

wrap_url.pl*

 


 

 

$MYSRC/shell_scripts:

 

trimline.sh*

  • Trims every line in a file, completely deleting all characters after the nth. Example usage: cat myfile | trimline.sh 45 (all lines are <= 45 characters now). Note that Josh's script trunc.pl is probably a more general form of this.
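
The same effect can be had with the standard cut command, sketched below. This is not the contents of trimline.sh, just a plain-shell way to see what it does. The sample file and the width of 10 are made up.

```shell
#!/bin/sh
# Sketch: keep only the first n characters of every line.
work=$(mktemp -d)
printf 'short\nabcdefghijklmnopqrstuvwxyz\n' > "$work/myfile"

n=10
trimmed=$(cat "$work/myfile" | cut -c 1-"$n")

echo "$trimmed"
```

Lines shorter than n characters pass through unchanged.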

 


 

 

 

/projects/sysbio/system/i386/bin:

 

acyclic*

  • Part of the graphviz suite.

 

bcomps*

  • Part of the graphviz suite.

 

ccomps*

  • Part of the graphviz suite.

 

circo@

  • Part of the graphviz suite. Circle layout (expanded).

 

dijkstra*

  • Part of the graphviz suite.

 

dot*

  • Part of the graphviz suite. Does layout. Lets you output a graph as a PNG file / PS / etc.
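
A minimal sketch of going from a DOT file to a PNG with dot, using its -T (output format) and -o (output file) options. The graph and file names are made up, and the rendering step is skipped if dot is not on your path.

```shell
#!/bin/sh
# Sketch: write a tiny DOT graph and render it as a PNG if dot is available.
work=$(mktemp -d)

cat > "$work/example.dot" <<'EOF'
digraph example {
    geneA -> geneB;
    geneB -> geneC;
    geneA -> geneC;
}
EOF

if command -v dot > /dev/null 2>&1; then
    dot -Tpng "$work/example.dot" -o "$work/example.png" \
        && echo "wrote $work/example.png" \
        || echo "dot is installed but rendering failed"
else
    echo "dot not found in PATH; skipping rendering"
fi
```

Using -Tps instead of -Tpng gives PostScript output.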

 

dot2gxl@

  • Part of the graphviz suite. Converts DOT format to GXL, I assume.

 

dot_static*

  • Part of the graphviz suite.

 

dotty*

  • Part of the graphviz suite. XWindows GUI graph editor.

 

fdp@

  • Part of the graphviz suite. Force-directed layout program for undirected graphs.

 

gc*

  • Part of the graphviz suite.

 

gvcolor*

  • Part of the graphviz suite.

 

gvpack*

  • Part of the graphviz suite.

 

gvpr*

  • Part of the graphviz suite.

 

gxl2dot*

  • Part of the graphviz suite.

 

lefty*

  • Part of the graphviz suite.

 

lneato*

  • Part of the graphviz suite.

 

neato@

  • Part of the graphviz suite.

 

nop*

 

prune*

 

sccmap*

 

tred*

 

twopi@

  • Part of the graphviz suite. Outputs a compressed circle layout.

 

unflatten*

 


 

 

/projects/sysbio/system/x86_64/bin:

  • x86 64-bit code. Some of this is actually i386/i686 code which is installed here for some reason. But some of it is really 64-bit, and won't run on the 32-bit machines.

 

 

AlignACE

 

Ant/

 

BayesNets/

 

Blast/

 

Claquer/

 

Cliquer/

 

GeneXPress*

 

GeneXPress2.0/

 

KNNImpute1.0/

 

MCL/

 

MODES/

 

R/

 

blastall

 

cl

 

cluster-eisen*

  • The same functionality as the GUI "Cluster 3.0" program for microarray clustering, but it can be invoked directly from the command line. This is version 1.33. We have the source in /projects/sysbio/apps/src/. More at: http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm
  • The command to install the COMMAND LINE version of this program from source is: ./configure --without-x --prefix=/projects/sysbio/apps/$MACHTYPE && make && make install. Note that it will be installed as just plain cluster, but I have been calling it cluster-eisen in order to distinguish it from the multi-processor cluster program used in another lab. I think that's part of the Kent source or something.

 

coden1.0/

 

emacs*

  • A local copy. Some of the machines don't have emacs on them anywhere else.

 

formatdb

 

libpng-config

 

libpng12-config*

 

matlab/

 

mcl

 

modes

 

pwc*

 

runStandAloneMSGR_WithOut_Mouse

 

runStandAloneMSGR_With_Mouse

 

setup_ant_env*

 

setup_ant_env_old*

 

 

 

 


 

 

/projects/sysbio/system/java:

  • Java programs. Note that you normally run them from the command line by calling "(program name).runjar", which is a shell script giving the proper (hopefully) commands to run the jar file.
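
For orientation, a wrapper of this kind might look roughly like the sketch below. Everything in it is hypothetical (the file name example.runjar, the jar location, the memory setting); the real .runjar scripts in this directory may be written differently.

```shell
#!/bin/sh
# Sketch: create a hypothetical .runjar-style wrapper script.
work=$(mktemp -d)

cat > "$work/example.runjar" <<'EOF'
#!/bin/sh
# Hypothetical wrapper: the jar path and memory setting are placeholders.
JAR=/path/to/example.jar
exec java -Xmx512m -jar "$JAR" "$@"
EOF
chmod +x "$work/example.runjar"

cat "$work/example.runjar"
```

The "$@" passes any command-line arguments through to the Java program unchanged.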

 

eclipse.runjar (Eclipse Integrated Development Environment)

  • Java IDE. Can be used as a development environment for almost any language.

 

cytoscape.runjar*

  • Cytoscape is a network / graph viewer (similar to VisANT, but with more options).

 

gsea.runjar (GSEA)

  • Gene Set Enrichment Analysis

 

treeview.runjar (Java TreeView)

  • Java TreeView. For viewing microarray data.

 

visant.runjar (VisANT)

  • VisANT is a network / graph viewer (similar to Cytoscape, but faster and with native support for grouping of elements).

 

/projects/sysbio/apps/perl/weblogo/

 

  • seqlogo is the WebLogo sequence logo generator. Running the program with no arguments will generate help information. See http://weblogo.berkeley.edu/ for more information and a Web interface to WebLogo.
  • This directory also contains various example files and documentation.