Microarray Data Analysis Assignment

(50% of CA578/579, as appropriate)

NEW! Due Friday 13th August 2010 at 5pm
Please submit to the School Secretary in the School of Computing

Instructions:

  1. Take the raw data
    1. DataSet_1
    2. DataSet_2
    3. DataSet_3
    4. DataSet_4
    5. DataSet_5
  2. Implement the Hierarchical Clustering algorithm with UPGMA linkage (average distance between clusters) and the K-Means Clustering algorithm, each with various distance metrics (Pearson, Manhattan).
  3. Results required: a plot of the data in experiment space showing the clustering according to each distance metric, and a text file giving the gene clustering in nested-parenthesis notation, e.g. (A, ((B,C),D)), meaning B and C are closest, D joins them next, and A joins last.
  4. Write a concise (4 page max) report on the project summarising your results.
  5. Individual effort is required.  Please write and sign a declaration on your submission of the form: "Except where otherwise stated, the following is all my own work.  I have read and am aware of the University's rules concerning plagiarism."  These rules are shown here
  6. Due 5pm on Friday 13th August.  Penalty of 5% per day late.
  7. Here are some hints
  8. Notes from the tutorial
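
As a rough illustration of instructions 2 and 3, here is a minimal sketch of average-linkage (UPGMA) clustering that prints its result in the nested-parenthesis notation. It is not a reference solution. The function names (manhattan, pearson_distance, upgma), the use of 1 - r as the Pearson distance, and the four-gene toy data are all assumptions made for illustration; none of them come from the data sets or the tutorial notes. The order of the two members inside each pair of brackets carries no meaning.

```python
import math


def manhattan(x, y):
    """Sum of absolute differences between two expression profiles."""
    return sum(abs(a - b) for a, b in zip(x, y))


def pearson_distance(x, y):
    """1 - Pearson correlation (assumed convention; profiles must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def upgma(profiles, dist):
    """Average-linkage agglomerative clustering.

    profiles: dict mapping gene name -> list of expression values.
    dist:     function taking two profiles and returning a distance.
    Returns the merge order as a nested-parenthesis string.
    """
    # Each cluster is keyed by its label and stores its member genes.
    clusters = {name: [name] for name in profiles}
    while len(clusters) > 1:
        labels = list(clusters)
        best = None
        for i in range(len(labels)):
            for j in range(i + 1, len(labels)):
                a, b = labels[i], labels[j]
                # UPGMA: mean distance over all cross-cluster gene pairs.
                total = sum(dist(profiles[p], profiles[q])
                            for p in clusters[a] for q in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair; the new label records the nesting.
        merged = clusters.pop(a) + clusters.pop(b)
        clusters["(%s,%s)" % (a, b)] = merged
    return next(iter(clusters))


# Hypothetical toy data: four genes measured in two experiments.
toy = {"A": [0, 0], "B": [10, 10], "C": [10, 11], "D": [12, 14]}
tree = upgma(toy, manhattan)
print(tree)  # prints (A,(D,(B,C)))
```

Passing pearson_distance instead of manhattan as the second argument reruns the same clustering under the other metric. The sketch recomputes every cluster-to-cluster average on each pass, which is simple but slow; a real submission on the full data sets would probably cache the distance matrix.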