• Unique IDs: 52653 (CS), 28590 (INF), 40780 (LIN)
  • Course Credit: Computer Science students can satisfy the same requirements toward graduation regardless of which section they enroll in. If the CS listing is full, enroll in one of the others and email to request the CS major credit (August 22, 2011).
Time, Place, & Instructors
  • Meeting Time: Th 12-3pm
  • Location: UTA 1.208 (at the iSchool, 16th and Guadalupe)
  • Instructors: the course will be co-taught by two instructors (see below).

Jason Baldridge
office hours:  Mon 10-noon, Fri 9:30-10:30
office: Calhoun 510
phone: 232-7682
Matt Lease
office hours: by appointment
office: UTA 5.442 
phone: 471-9350
email: [myinitials]

TA: Hohyon (Will) Ryu
Office hours: Tuesday Aug 30, 3-4pm
office: 5.548

NOTE ON EMAIL: *ALWAYS* include "dicta" in the subject line of any emails sent to instructors or TA.

Course discussion list. We will use a Google Group for class discussion, and students are welcome to post questions to it. Everyone needs to sign up for the list.


Java or Scala. Students taking this course should be comfortable programming in Java or Scala, since concepts taught in class will be reinforced through extensive programming exercises in the Hadoop implementation of MapReduce (which is in Java).

Basic Math, Statistics, and Probability. Students will be expected to have knowledge of basic statistics and probability (e.g., axioms of probability, Bayes' Theorem, relative frequency estimation, etc.) since much quantitative analysis of text centers on counting data observed and estimating probabilities.

Text Analysis. As the course will focus on text analysis applications, prior experience with quantitative text analysis will be extremely valuable (e.g. CS388 NLP, LIN 386M Semi-supervised Learning for Computational Linguistics).

No previous experience with MapReduce or parallel and distributed programming is necessary.

Note that this is a course on scalable algorithms and "thinking at scale" rather than nuts-and-bolts of Hadoop programming. We will expect you to largely "pick up" the details of the Hadoop API without explicit instruction from us. Of course, we will assist you by providing resources and a reasonable amount of guidance.

If in doubt about whether you have sufficient background to succeed in the course, please contact one or both of the instructors. Describe your background and let us help you to see if the class will be a good match for you based on your prior experience and existing skills.

Course Textbooks

There are two required textbooks for the course:

Selected readings from this text will be suggested, along with other readings made available for download or copying.

Required Work

Course Project

A major component of the course involves a group-based course project (~3 students, with groups specified by instructors). See the course project page for details.


There will be four homeworks. These will be posted throughout the semester to the assignments page.

Pair programming. We encourage students to work in pairs, though it is not required unless indicated for a particular assignment. To encourage wider student interaction, no two students may work together on more than one assignment. Students working together may submit the same code and answers. We expect partners to do an equal amount of work in completing an assignment. If either party is not holding up their end, please notify the instructors.


Course project (60%)
  • Initial Ideas (2%, individual submission).
  • Proposal (8%, group submission).
  • Progress Report (15%, group submission).
  • Final Report (30%, group submission).
  • Final Presentation (5%, group presentation)
Homeworks (40%): There will be four homeworks, each worth 10% of the total course grade.

Overall course grades. The grading scale is different from the usual one used in the USA.

80+ A
77-80 A-
74-77 B+
70-74 B
67-70 B-
64-67 C+
60-64 C
57-60 C-
54-57 D+
50-54 D
47-50 D-
0-47 F

This scale is inspired by the typical British grading scale. It allows us to give you a better sense of where you can improve by taking off points while still giving an A for quality work. Also, if you get 90+, it means you did an amazingly good job, above and beyond expectations.
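
As a worked illustration of how the grading weights and the scale above combine, here is a minimal Python sketch. It is not an official grade calculator. It assumes that every component is scored on a 0-100 scale, that there are four homeworks at 10% each as in the grading breakdown, and that a score exactly on a boundary (e.g., 80) earns the higher grade. The component names and function names are hypothetical.

```python
# Component weights, in percent of the course grade, from the grading breakdown.
WEIGHTS = {
    "initial_ideas": 2,
    "proposal": 8,
    "progress_report": 15,
    "final_report": 30,
    "final_presentation": 5,
    "homework_1": 10,
    "homework_2": 10,
    "homework_3": 10,
    "homework_4": 10,
}

# Lower bound of each letter band, from the scale above.
# Assumption: a score exactly on a boundary gets the higher grade.
SCALE = [
    (80, "A"), (77, "A-"), (74, "B+"), (70, "B"), (67, "B-"),
    (64, "C+"), (60, "C"), (57, "C-"), (54, "D+"), (50, "D"), (47, "D-"),
]


def course_score(scores):
    """Weighted average of component scores, each assumed to be on 0-100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 course score to a letter using the bands in SCALE."""
    for lower_bound, letter in SCALE:
        if score >= lower_bound:
            return letter
    return "F"


# Example: a student who scores 75 on every component.
example_scores = {name: 75 for name in WEIGHTS}
example_total = course_score(example_scores)   # 75.0
example_letter = letter_grade(example_total)   # "B+"
```

Under these assumptions, a uniform 75 across all components lands in the 74-77 band, which is a B+ on this scale.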

Late homework will be accepted only under exceptional circumstances (e.g., medical or family emergency) and at the discretion of the instructor; "exceptional" denotes a rare event. This policy allowing for exceptional circumstances is not a right, but a privilege and courtesy to be used when needed and not abused. Should you encounter such circumstances, simply email the assignment to the instructor and note "late submission due to exceptional circumstances". You do not need to provide any further justification or personally revealing information regarding the details.

Academic Honor Code

You are encouraged to discuss assignments with classmates, but all written submissions must reflect your own, original work. If in doubt, ask the instructor. Acts like plagiarism represent a serious violation of UT's Honor Code and standards of conduct:

Students who violate University rules on academic dishonesty are subject to severe disciplinary penalties, such as automatically failing the course and potentially being dismissed from the University. Don't risk it. Because honor code violations ultimately harm you, other students, and the integrity of the University, policies on academic honesty will be strictly enforced.

For further information please visit the Student Judicial Services Web site:

Notice about students with disabilities

The University of Texas at Austin provides appropriate accommodations for qualified students with disabilities. To determine if you qualify, please contact the Dean of Students at 512-471-6529 or UT Services for Students with Disabilities. If they certify your needs, we will work with you to make appropriate arrangements.

UT SSD Website:

Notice about missed work due to religious holy days

A student who misses an examination, work assignment, or other project due to the observance of a religious holy day will be given an opportunity to complete the work missed within a reasonable time after the absence, provided that he or she has properly notified the instructor. It is the policy of the University of Texas at Austin that the student must notify the instructor at least fourteen days prior to the classes scheduled on dates he or she will be absent to observe a religious holy day. For religious holy days that fall within the first two weeks of the semester, the notice should be given on the first day of the semester. The student will not be penalized for these excused absences, but the instructor may appropriately respond if the student fails to complete satisfactorily the missed assignment or examination within a reasonable time after the excused absence.