Problem Sets

Be sure to check out https://github.com:kyclark/abe487.git for the tests!

Hello!

Alter the greeting script to accept the flag "--excited" that adds an exclamation point to the end of the greeting:

$ ./greet.pl6 --excited --greeting="Hola" --name="Amigo"
Hola, Amigo!

Sum and mean

Create a script that reads in numbers from the command line and prints out their sum and mean.

Clean sequences

Write a bash script to clean up some raw sequence.

Create a FASTA file

Create a script called "txt2fasta.pl6" that accepts an input file of sequences, one-per-line, and emits FASTA-formatted sequences. Sequence IDs should be an incrementing integer value starting at 1. Extra credit: block the sequences to a maximum column width, default 50.

FASTA stats

Given one or more input files of sequences in FASTA format, recreate this output:

$ seqmagick info mouse.fa | awk '{print $1,$3,$4,$5,$6}' | column -t
name      min_len  max_len  avg_len  num_seqs
mouse.fa  50       100      84.32    500

Compute GC content

Solve the GC content problem on Rosalind (http://rosalind.info/problems/gc/). Then use that program to profile FASTA files and predict species.

Find motifs

Solve http://rosalind.info/problems/subs/ to find motifs in strings. Use this to find ORFs.

Compute Hamming

Solve http://rosalind.info/problems/hamm/ to find the number of mutations (SNPs/SNVs) between two sequences. Use this to determine sequence similarity.

Protein translation

Solve http://rosalind.info/problems/prot/. Read the translation table from the given "table.txt."

Shared k-mers

Create a program that will find the number of shared k-mers of a given size among a set of sequences in a FASTA file. Use this to determine sequence similarity. You can use the "fasta-kmer" program to create a list of k-mers in the given sequences.

results matching ""

    No results matching ""