Syllabus: QTM 350 - Data Science Computing

Quantitative Theory and Methods Department, Emory University

Course Description

This course equips students with the computing skills and knowledge needed for data science applications. Students will build foundational knowledge of, and gain hands-on experience with, topics such as version control, project collaboration, data structures, and algorithms. Prospective data scientists, statisticians, and other quantitative professionals will learn the computational foundations needed to work efficiently with data, data structures, and algorithms.

Course Website: https://davi-moreira.github.io/2024S_dsc_emory_qtm_350

Instructor and TAs

Instructor: Professor Davi Moreira

  • Email: davi.moreira@emory.edu
  • Office hours: Tuesdays, 9:15am - 10:15am, or by appointment
    • The Zoom link is available on the course Canvas page.

Teaching Assistant: Michael Cao

Learning Outcomes

By the end of this course, students will be able to:

    1. Demonstrate proficiency in data science project collaboration and version control.
    2. Utilize advanced data storage, manipulation, and querying techniques.
    3. Demonstrate a high-level understanding of data structures and algorithms.
    4. Critically navigate emerging trends in data science computing.

Objectives

  • Conceptual Understanding: To provide students with a foundational grasp of data structures and algorithms.
  • Technical Proficiency: To equip students with practical skills in version control, Python programming, data structures, and algorithms, enabling them to carry out data manipulation and analysis tasks proficiently.
  • Critical Integrated Learning: To offer a holistic educational experience that combines theoretical learning with practice, ensuring students can apply their knowledge to real-world projects and foster an awareness of emerging trends in the data science computing landscape.

Course References

  • Computing Skills for Biologists: A toolbox with basic computational skills necessary for the course.
  • Elements of Data Science: A digital textbook by Allen Downey written in the form of Jupyter notebooks. It provides an introduction to data science in Python for students with limited programming experience.
  • Think Python: An introduction to programming using Python.
  • Applied Computing: Applied Computing is an online textbook. It provides an introduction to spreadsheets and SQL. To view the book, students need to register using the course name.
  • SQL & NoSQL Databases: Models, Languages, Consistency Options, and Architectures for Big Data Management: Explores relational (SQL) and non-relational (NoSQL) databases. Covers database management, modeling, languages, consistency, architecture, and more.
  • Pro Git Book: A comprehensive resource for learning Git, covering everything from the basics to advanced topics by Scott Chacon and Ben Straub.

Additional References

Assessment

Final grades will be based on:

Assignment                        Percentage
Lecture Quizzes and Activities    60%
Problem Sets                      40%

Lecture Quizzes and Activities

Each lecture will be accompanied by a set of questions and/or a practice activity, available on the course’s Canvas page or website, to be completed individually. Students may complete these quizzes and activities either during or after class; due dates will be posted on the course Canvas page. To accommodate the learning process, the two lowest quiz scores will be excluded from the final grade calculation. While individual submission is mandatory, collaborative discussions are encouraged. Please note that no extensions will be granted under any circumstances, to ensure fairness and consistency in assessment for all students.

Problem Sets

Problem sets aligned with each topic will be assigned to solidify and apply the concepts covered. These sets are to be developed collaboratively in groups of up to three members, emphasizing the importance of code collaboration. Consequently, individual submissions will not be accepted. Assignments will be distributed via Canvas and/or GitHub and may be formatted as a Jupyter Notebook (.ipynb), a Quarto document (.qmd), or an HTML file (.html). Groups will be required to submit the complete source code of their assignments (.ipynb, .qmd, or .html). Each problem set will be carefully evaluated, with grading based on both the accuracy and the overall quality of the work submitted. In particular, you must ensure the following:

  • All code must run;

  • Each problem set will have its own GitHub repository;

  • Readable Solutions: To facilitate effective evaluation and comprehension of the coding assignments, students must adhere to the following standards for code readability:

    1. Comprehensive Commenting: All code must include thorough comments. These comments are essential as they allow the Professor and Teaching Assistants to understand the purpose and functionality of the code solely through these annotations. It is crucial that the comments are clear and concise, providing insight into the logic and purpose behind each segment of code.
    2. Structured Code Segmentation: Solutions should be methodically organized into distinct code chunks within Jupyter or R Markdown notebooks. For clarity on this format, refer to examples provided in class or consult with the Professor or Teaching Assistants.
    3. Detailed Documentation of Functions: Every function defined by a student must be accompanied by a docstring. This documentation should clearly explain the function’s purpose, describe each input argument, and outline what the function returns.
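
As an illustration of the readability standards above, here is a minimal sketch of a commented function with a docstring. The function name `column_mean`, its arguments, and the sample data are hypothetical and chosen only for this example; the docstring layout shown is one common convention, not a required template.

```python
def column_mean(rows, column):
    """Compute the mean of a numeric column in a list of records.

    Args:
        rows (list[dict]): Records to summarize, one dictionary per row.
        column (str): Key of the numeric field to average.

    Returns:
        float: Arithmetic mean of the values stored under `column`.

    Raises:
        ValueError: If `rows` is empty.
    """
    # Guard against an empty input so we never divide by zero.
    if not rows:
        raise ValueError("rows must contain at least one record")

    # Collect the value of the requested column from every record.
    values = [row[column] for row in rows]

    # Divide the total by the number of records to get the mean.
    return sum(values) / len(values)


# Example usage with made-up data (not course data).
scores = [{"name": "a", "score": 80}, {"name": "b", "score": 90}]
print(column_mean(scores, "score"))  # prints 85.0
```

The docstring covers the three required elements (purpose, each input argument, and the return value), while the inline comments explain the reasoning behind each step.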

Grading

Each student’s final grade will be based on the following after rounding up to the nearest point:

Grade    Range
A        91% – 100%
A-       86% – 90%
B+       81% – 85%
B        76% – 80%
B-       71% – 75%
C        66% – 70%
D        60% – 65%
F        < 60%
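
To make the arithmetic concrete, the following is a rough sketch of how a final grade could be computed from the weights and scale in this syllabus. It rests on several assumptions that the syllabus does not state: all scores are on a 0 to 100 scale, items within each category are weighted equally, there are at least three quizzes, and "rounding up" means taking the ceiling of the weighted total. The function name `final_letter_grade` is hypothetical, and the actual calculation in Canvas may differ.

```python
import math

# Category weights from the Assessment section.
QUIZ_WEIGHT = 0.60
PROBLEM_SET_WEIGHT = 0.40

# Lowest whole-point score for each letter grade, highest grade first.
GRADE_FLOORS = [
    (91, "A"), (86, "A-"), (81, "B+"), (76, "B"),
    (71, "B-"), (66, "C"), (60, "D"),
]


def final_letter_grade(quiz_scores, problem_set_scores):
    """Estimate a final letter grade from category scores.

    Args:
        quiz_scores (list[float]): Quiz/activity scores on a 0-100 scale
            (assumed to contain at least three entries).
        problem_set_scores (list[float]): Problem set scores on a 0-100 scale.

    Returns:
        str: The letter grade matching the rounded-up weighted total.
    """
    # Exclude the two lowest quiz scores, as described in the syllabus.
    kept = sorted(quiz_scores)[2:]

    # Assumption: every item within a category counts equally.
    quiz_avg = sum(kept) / len(kept)
    ps_avg = sum(problem_set_scores) / len(problem_set_scores)

    # Combine the two categories using the 60/40 weights.
    total = QUIZ_WEIGHT * quiz_avg + PROBLEM_SET_WEIGHT * ps_avg

    # Round up to the next whole point; the inner round() removes tiny
    # floating-point noise so that an exact 90 is not bumped to 91.
    points = math.ceil(round(total, 9))

    # Return the first letter whose floor the point total reaches.
    for floor, letter in GRADE_FLOORS:
        if points >= floor:
            return letter
    return "F"


# Made-up scores: the quizzes kept after dropping two are 80, 90, and 100.
print(final_letter_grade([100, 80, 90, 0, 50], [85, 95]))  # prints A-
```

In this example, the kept quizzes average 90 and the problem sets average 90, so the weighted total is 90 points, which falls in the A- range.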

AI policy

I encourage you to use AI tools you believe will enhance your individual or group performance. Learning to use AI is a valuable and emerging skill, and I am available to provide support and assistance with these tools during office hours or by appointment.

Be aware of the following guidelines:

  • You are not allowed to use AI tools during the exams.

  • Providing low-effort prompts will result in low-quality outputs. You must refine your prompts to achieve desirable outcomes. Use the course knowledge for that!

  • Do not blindly trust the information provided by the output. If the output contains a number, index, analysis, conclusion, or fact, assume it is incorrect and check its veracity. Any errors or omissions resulting from using the AI tool will be your responsibility. Remember, the AI tool works better for topics that you already understand.

  • While AI is a tool, you must acknowledge its use. Always cite! Include a paragraph or note at the end of any document stating that you used AI in its development.

Academic Integrity

Upon every individual who is a part of Emory University falls the responsibility for maintaining in the life of Emory a standard of unimpeachable honor in all academic work. The Honor Code of Emory College is based on the fundamental assumption that every loyal person of the University not only will conduct his or her own life according to the dictates of the highest honor, but will also refuse to tolerate in others action which would sully the good name of the institution. Academic misconduct is an offense generally defined as any action or inaction which is offensive to the integrity and honesty of the members of the academic community. The typical sanction for a violation of the Emory Honor Code is an F in the course. Any suspected case of academic misconduct will be referred to the Emory Honor Council.

Communication

  • Check the Course Website and Canvas Page regularly to keep yourself informed with up-to-date information about the course. Also, be sure to check the course syllabus before asking any questions about the course schedule/policies.
  • If you cannot attend office hours because of a conflict with another course or attendance at a university-sanctioned event (proof required), email the instructor at least two days in advance to set up an appointment. Note that each appointment will be 15 minutes long and may be held individually or in a small group. No appointments will be scheduled close to exam dates.
  • When attending virtual office hours, make sure you are in a private setting with little to no background noise. The use of headphones is strongly encouraged. This is especially true when you are discussing private matters with the instructor.
  • Do not use email for asking content-related questions, and do not use Canvas messages.
  • Do not email me your private stories; set up an individual appointment to discuss such matters. Keep your emails brief, and you will receive a response from me within 48 hours, excluding weekends. Similarly, if you receive an email from me, you are expected to respond within 48 hours.
  • Finally, if you are experiencing situations that negatively impact your overall student life, you should immediately contact the Office of Undergraduate Education.

Regarding absences

  • If you miss a lecture for any reason, you are still responsible for the missed course materials. First review the missed materials; then you may attend the instructor's office hours to ask specific questions.
  • Attendance is not monitored in lecture except on the exam dates.
  • Emory College of Arts and Sciences policy states, “A student who fails to take any required midterm or final examination at the scheduled time may not make up the examination without written permission from a dean in the Office for Undergraduate Education. Permission will be granted only for illness or other compelling reasons, such as participation in scheduled events off-campus as an official representative of the University.”

Access and Disability Resources

Students with medical/health conditions that might impact academic success should visit the Department of Accessibility Services (DAS) to determine eligibility for appropriate accommodations. Students who receive accommodations must contact the instructor with an Accommodation Letter from the DAS at the beginning of the semester, or as soon as the accommodation is granted. If you have DAS accommodations, you must inform the instructor after confirming that your accommodation letter is available in the DAS web portal. The instructor will respond to your email confirming which accommodations you will receive for this class. If you wish to do so, you may request an individual meeting to further discuss the specific accommodations.

Subject to Change Policy

While I will try to adhere to the course schedule as much as possible, I also want to adapt to your learning pace and style. The syllabus and course plan may therefore change during the semester.

Schedule