Course Syllabus

Large-scale digitization projects as well as increasing quantities of born-digital materials have put enormous collections of documents and data within our reach. The use of programmatic techniques is necessary for managing such massive collections and improving their utility for specific purposes. The ability to blend an understanding of digital collections, computational techniques (such as machine learning, data mining, image, audio, or video processing, and data analysis), and user experience design that you gain from this course will enable you to craft small projects that demonstrate the viability of your proposed solutions to your supervisors, whether in academic institutions, not-for-profit organizations, startups, or large corporations.

During the Fall 2015 semester, we will develop projects that focus on three themes:

with an emphasis on developing transferable competencies, such as concepts and techniques that you will be able to apply in other domains and settings.

During this semester, we will work on the projects described below. These projects describe high-level objectives and technologies, while leaving ample scope for teams to define the specifics within these broad definitions.

DPLA - Metadata Collection Aggregator (Python, JSON, data processing techniques)

DPLA - Collection Profiler (Python, JSON)

DPLA - Metadata Visualizer (D3, JavaScript, HTML, CSS)

CSH - Metadata extraction and formatting for Fedora 4 repository (Python/PHP, XML/JSON/CSV, Fedora 4)

Alamo - Digital management of conservation data (PHP, JavaScript, MongoDB)

The projects above provide opportunities for studying concepts related to the nature of activities regarding digital collections, for developing ground-breaking techniques for making collections more accessible, and for publishing the results of your efforts in highly regarded research and practitioner publications.  

Contributions from last year's course were published in the ACM/IEEE Joint Conference on Digital Libraries.



  • assess the quantitative and qualitative aspects of large datasets
  • identify a use case for the data and the necessary metadata for supporting the use case
  • generate or retrieve metadata from documents and Web-based collections
  • design storage structures (such as database schemas or XML/JSON documents) for extracted metadata
  • craft user interface widgets and features for supporting users in accessing the data to address the target problem
  • develop scripts that interface with third-party RESTful or library-based APIs
  • evaluate the developed scripts, techniques, and algorithms



We will adopt a hands-on, project-based learning approach in a studio setting. This course will enable (rather, require) you to synthesize competencies from several other courses taught in the iSchool, for example: Database Management, Understanding Research.

The key to success in this course is the ability to augment your existing skills on the fly. For example, if you have used simple database queries before, this course will demand more complex queries. Similarly, you will need to learn new features of a programming language that you are familiar with. 

Peer learning: Most of the work in this class will be conducted in groups, which will give you an opportunity both to learn from and to teach your teammates.

Instructor as guide: I believe in leading by example. Often, we will encounter situations without one correct answer or where I may not know the answer. I will explain the processes that I use for learning new techniques and provide a structure for such learning-on-the-fly. You will learn how to learn.



  • Graduate standing
  • Knowledge of a programming language (examples: PHP, Python, Java, C++)
  • Familiarity with at least one of the following, with a willingness to learn the other quickly on your own:
    • Using an external API (examples: MySQL or file APIs in PHP, Web-based RESTful APIs, JavaScript libraries)
    • Data modeling (examples: Entity-relationship diagrams, databases, XML)

Please contact me if you have doubts or concerns about satisfying these criteria. I am more than happy to discuss your individual case and suggest an appropriate course of action to maximize your learning opportunities.


Textbook and Readings

There is no textbook for this course. All readings and reference materials will be available online.



10% - Project proposal (due Wed., Sept. 16th)

10% - User interface prototype and/or system architecture (due Wed., Sept. 30th)

10% - Evaluation plan (due Wed., Oct. 7th)

10% - Evaluation report (due Tue., Dec. 1st)

30% - Project implementation

10% - Adherence to programming style guide

20% - Final poster presentation


Scheduling appointments

I have an open-door policy. You are welcome to drop by my office (UTA 5.408), and I will do my best to make time for you. To be sure that I will have time when you come by, please email me to set up an appointment.



Programming resources -- language and API reference, style manuals, etc.

Information about iSchool user accounts and passwords -- please create a login as soon as possible


Academic Integrity

All students are expected to abide by the University of Texas Honor Code, reproduced below for your convenience.

The core values of The University of Texas at Austin are learning, discovery, freedom, leadership, individual opportunity, and responsibility. Each member of the university is expected to uphold these values through integrity, honesty, trust, fairness, and respect toward peers and community.

Violation of academic integrity, especially plagiarism, will not be tolerated. The first infraction will result in a grade of zero for that component of the course as well as a formal reprimand in your student file for future reference. Penalty for a second violation will include failure of the course and University-level disciplinary action.


Disability Accommodation

The University of Texas at Austin provides, upon request, appropriate academic adjustments for qualified students with disabilities. For more information, contact Services for Students with Disabilities (SSD) at (512) 471-6259 (voice) or (512) 410-6644 (video phone). An official letter from SSD is required to receive academic accommodations.

Please notify me as quickly as possible if the material being presented in class is not accessible (for example, instructional videos need captioning, course packets are not readable for proper alternative text conversion, etc.).


Emergency Preparedness

Please see details in the files section.
