CS 174 BioInformatics Assignment 3

Assignment 3. Similarity a la Needleman-Wunsch

As in assignment 1, your deliverables for this assignment consist of two parts. You need to hand in a) the code for computing the global similarity of a family of DNA strings and b) in a separate document, the output of your program together with the answers to the questions given below.
  1. Write a Java or Python program to a) read in a FASTA file of strings, b) compute the global similarity between any two strings, and c) print out the similarity matrix for all pairs of strings. Deposit this program in your folder. I thought writing the program to read in the strings was harder than the program for global similarity. In the worst case, and with a loss of credit, you can simply hard-code the strings in an array.
  2. From the masterhit course site get the file of 10 viruses. These are in FASTA format. Use the settings gap = -1, mismatch = -1, and match = +1. With these settings compute the similarity matrix (10 by 10) and put the matrix in a *.doc file. Add to the Word file the answers to the following questions.
  3. Which two strings have the greatest similarity and what is their similarity?
  4. Which two strings have the least similarity and what is their similarity?
  5. Describe strings s1 and s2 of lengths n and m that would have maximum similarity. Use the scoring scheme where the gap cost is -1. What would their similarity score be?
  6. Describe strings s1 and s2 of lengths n and m that would have minimum similarity. Again use the scoring scheme where the gap cost is -1. What would their similarity score be?
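
For item 1, the three pieces (reading FASTA, scoring one pair, building the matrix) could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not a reference solution: the function names read_fasta, global_similarity, and similarity_matrix are made up for illustration, the parser assumes a plain FASTA layout (a ">" header line followed by one or more sequence lines), and the default scores are the settings from item 2.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) pairs.

    Assumes each record starts with a '>' header line; sequence lines that
    follow are concatenated. Blank lines are ignored.
    """
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def global_similarity(s, t, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global similarity score of s and t.

    Only the score is needed, so two rows of the DP table are kept
    instead of the full (len(s)+1) x (len(t)+1) table.
    """
    # Row 0: aligning the empty prefix of s against t[:j] costs j gaps.
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]  # Column 0: s[:i] against the empty prefix of t.
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def similarity_matrix(seqs, **scores):
    """Symmetric matrix of pairwise global similarity scores."""
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = global_similarity(seqs[i], seqs[j], **scores)
            matrix[i][j] = matrix[j][i] = score
    return matrix


if __name__ == "__main__":
    # Tiny made-up input standing in for the real virus file.
    demo = [">seq1", "ACGT", ">seq2", "ACG", "T", ">seq3", "TTTT"]
    seqs = [seq for _, seq in read_fasta(demo)]
    for row in similarity_matrix(seqs):
        print(" ".join(f"{v:4d}" for v in row))
```

To run this on a real file, the open file object can be passed straight to the parser, for example `with open("viruses.fasta") as fh: records = read_fasta(fh)`, where the file name is hypothetical. Keeping only two rows is a space-saving choice for score-only output; recovering the alignment itself would need the full table.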
