  • Written portions of assignments are due at the start of class (12:00pm) on their due date. Digital submissions (e.g. code, PDFs) are due by 11:00am on Blackboard (so that students show up to class on time).
  • The schedule is subject to change as the semester progresses; however, any changes will be made at least one week in advance of the dates affected.
  • You are expected to have completed the Lin and Dyer readings before class on the days indicated. We have provided suggested readings for White's book as well; however, since that book contains reference material on Hadoop in general, you will probably find it useful to jump around as necessary while completing homework assignments.

Week 1. 25-Aug-11: Introduction
Week 2. 1-Sep-11: MapReduce Basics [slides]

Week 3. 8-Sep-11: MapReduce Algorithm Design [slides]
  • READ: Lin/Dyer 3.1-3.3 (pp. 37-52), White Ch. 2 (finish)
  • DUE: Homework1b

Week 4. 15-Sep-11: MapReduce Algorithm Design, continued [slides]
  • READ: Lin/Dyer Ch. 3 (finish), White Ch. 3 (41-62)
  • DUE: Individual Project Ideas (1 page)

Week 5. 22-Sep-11: Inverted Indexing for Text Retrieval [slides]
  • READ: Lin/Dyer 4.1-4.4 (pp. 65-72), White Ch. 3 (finish)
  • DUE: Homework 2

Week 6. 29-Sep-11: IR and Spelling Correction, Index Compression, Popular Passages, Hadoop Chaining [slides]
  • READ: Lin/Dyer Ch. 4 (finish), White Ch. 4 (75-86)

Week 7. 6-Oct-11: Graph Algorithms [slides]
  • READ: Lin/Dyer Ch. 5, White Ch. 5
  • READ: White Ch. 4 (finish)
  • DUE: Project Proposal

Week 8. 13-Oct-11: HMMs and part-of-speech tagging [slides]
  • READ: Lin/Dyer Ch. 6 (pp. 105-113), White Ch. 6
  • DUE: Homework 3 (Indexing)

Week 9. 20-Oct-11: EM, Machine translation, word alignment [slides]
  • READ: Lin/Dyer Ch. 6 (pp. 105-130), White Ch. 6

Week 10. 27-Oct-11: Language models [slides], Pig [Pig slides]
  • READ: Lin/Dyer Ch. 6 (finish), White Ch. 11 (pp. 321-364)

Week 11. 3-Nov-11: Guest lecture by Flip Kromer (InfoChimps)
  • DUE: Project Progress Report

Week 12. 10-Nov-11: Actors, Akka and Label Propagation

Week 13. 17-Nov-11: Wrap-up, other approaches (including Spark)

Week 14. 24-Nov-11: Thanksgiving; no class

Week 15. 1-Dec-11: Project Presentations

Finals. 9-Dec-11
  • DUE: **9am** Project Final Papers
Jason Baldridge, Jan 19, 2012, 3:09 PM