Brigham Young University Computer Science Department
Computational Science Laboratory

Genomic Next-generation Universal MAPper (gnumap)

Running gnumap (quick start)

For help on how to run GNUMAP, type ./bin/gnumap into a terminal and the usage information will be displayed. A typical gnumap run specifies a genome, a sequence file, an output file, and an alignment threshold. For example, to map the sequence file examples/example_sequences_prb.txt against the genome examples/Cel_gen.fa, reporting only the locations with a local alignment score of 90% or better and printing the output to example.output, I would use:
	./bin/gnumap -g examples/Cel_gen.fa -o example.output -a .9 -p -v 1 examples/example_sequences_prb.txt
(Note: This command can also be run by typing make example.)
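
Because the quick start command uses relative paths, it only works from the top of the gnumap directory. The sketch below checks that the two input files exist before launching the run. check_inputs is a hypothetical helper written for this illustration and is not part of GNUMAP; the gnumap options are copied unchanged from the command above.

```shell
#!/bin/bash
# check_inputs is a hypothetical helper (not part of GNUMAP).
# It reports every path that is not a regular file and returns
# non-zero if at least one is missing.
check_inputs() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_inputs examples/Cel_gen.fa examples/example_sequences_prb.txt; then
    # Same command as the quick start example.
    ./bin/gnumap -g examples/Cel_gen.fa -o example.output -a .9 -p -v 1 examples/example_sequences_prb.txt
else
    echo "Input files not found; run this from the top of the gnumap directory." >&2
fi
```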



Running GNUMAP with MPI:

mpiexec -np N_MACH -machinefile MACH_FILE gnumap [options...]
where N_MACH is the number of machines you are using and MACH_FILE is a file listing the machines that are available to use. The -c option, which specifies the number of processors, can also be included with these parameters.
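
As a rough sketch of how these pieces fit together, the following builds a machine file and prints the resulting mpiexec command without running it. The host names (node01 to node03) and the file name my_machines are made-up placeholders. The one-host-per-line layout is an assumption: the exact machine file format depends on your MPI implementation. The gnumap options are taken from the other examples on this page.

```shell
#!/bin/bash
# Hypothetical host names; replace them with machines you can actually reach.
MACH_FILE=my_machines
printf '%s\n' node01 node02 node03 > "$MACH_FILE"

# Count the non-empty lines so that -np matches the number of machines listed.
N_MACH=$(grep -c . "$MACH_FILE")

# Dry run: build the command and print it instead of executing it.
CMD="mpiexec -np $N_MACH -machinefile $MACH_FILE ./bin/gnumap -g examples/Cel_gen.fa -o example.output -a .9 -c 8 examples/example_sequences_prb.txt"
echo "$CMD"
```

Once the printed command looks right, it can be pasted into the terminal or the echo can be replaced with the command itself.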

For those who are using BYU's supercomputer (or another PBS supercomputer), here is an example submission script:

#! /bin/bash

#PBS -N MPI_test
#PBS -l nodes=30:ppn=1:pmem=12gb,walltime=3:00:00
#PBS -q batch
#PBS -k oe 
#PBS -m bea
#PBS -M your_email@gmail.com

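# Matches nodes=30 in the resource request above
N_MACH=30
MACH_FILE=gnumap_mpi_file

GENOME="/path/to/genome/genome.fasta"
SEQFILES="$(ls /path/to/sequences/*_prb.txt)"
OUTPUT="/path/to/output/gnumap.out"
PROG="/path/to/gnumap/bin/gnumap"
# The sed calls turn the space-separated file lists into comma-separated lists
PROGARGS="-g \"$(echo $GENOME | sed -e 's/ /,/g')\" -o $OUTPUT -a .9 -p -c 8 \"$(echo $SEQFILES | sed -e 's/ /,/g')\" -m 12 -j 10 -v 1"

# Save the node list that PBS assigned to this job as the machine file
cp $PBS_NODEFILE $MACH_FILE

# Print the full command to the job log, then run it
echo "mpiexec -np $N_MACH -machinefile $MACH_FILE $PROG $PROGARGS"
mpiexec -np $N_MACH -machinefile $MACH_FILE $PROG $PROGARGS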
The $PBS_NODEFILE is a file that lists all the nodes your program is allowed to run on. For a large genome, add the flag --MPI_largemem to the end of the PROGARGS string.
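
One way to add the large-genome flag is sketched below. LARGE_GENOME is a hypothetical switch invented for this illustration, not a gnumap or PBS setting, and reads_prb.txt is a placeholder file name. Only the --MPI_largemem flag and the other options come from the script above.

```shell
#!/bin/bash
# Argument string in the style of the submission script; all paths are placeholders.
PROGARGS="-g /path/to/genome/genome.fasta -o /path/to/output/gnumap.out -a .9 -p -c 8 /path/to/sequences/reads_prb.txt -m 12 -j 10 -v 1"

# Hypothetical switch: set to 1 when mapping against a large genome.
LARGE_GENOME=1

if [ "$LARGE_GENOME" -eq 1 ]; then
    # The flag goes at the very end of the argument string.
    PROGARGS="$PROGARGS --MPI_largemem"
fi

echo "$PROGARGS"
```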


This page last modified Wednesday May 21, 2014