Project Option

Last updated August 26, 2025

Everyone will work on the same project option this year, although the details will vary. We are limiting the options in order to focus on pedagogy and software engineering principles.

The client will act in a different capacity this year and we will meet the client in Term 2.

All the projects will be open-source.

Mining Digital Work Artifacts

Milestone #1 (October-December 07)

The focus of this milestone is to create the functionality for parsing the input and outputting information correctly. We will be very particular about your system design and testing approach during this phase. All output for this milestone is expected to be in text form (that is, you can opt for CSV, JSON, plain text, etc., or a combination that facilitates your future development). The specific requirements are below.

The system must be able to ... :

  1. Require the user to give consent for data access before proceeding
  2. Parse a specified zipped folder containing nested folders and files
  3. Return an error if the specified file is in the wrong format
  4. Request user permission before using external services (e.g., an LLM) and explain the data privacy implications for the user's data
  5. Have alternative analyses in place if sending data to an external service is not permitted
  6. Store user configurations for future use
  7. Distinguish individual projects from collaborative projects
  8. For a coding project, identify the programming language and framework used
  9. Extrapolate individual contributions for a given collaboration project
  10. Extract key contribution metrics for a project, including the duration of the project, the frequency of contributions by activity type (e.g., code vs. test vs. design vs. document), and other important information
  11. Extract key skills from a given project
  12. Output all the key information for a project
  13. Store project information into a database
  14. Retrieve previously generated portfolio information
  15. Retrieve a previously generated résumé item
  16. Rank importance of each project based on user's contributions
  17. Summarize the top ranked projects
  18. Delete previously generated insights while ensuring that files shared across multiple reports are not affected
  19. Produce a chronological list of projects
  20. Produce a chronological list of skills exercised
Your system is expected to be built using Python. The teaching staff considered other programming languages that were proposed, but we believe students will have the most success using Python for this part of the project.
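
As an illustration of requirements 2 and 3 above, the following is a minimal sketch using only the Python standard library. The names `InvalidArchiveError` and `list_archive_files` are hypothetical and not part of the project specification; teams are free to structure their parsing and error handling differently.

```python
import zipfile
from pathlib import PurePosixPath


class InvalidArchiveError(Exception):
    """Hypothetical error type raised when the input is not a zip archive."""


def list_archive_files(archive_path):
    """Return the sorted relative paths of all files inside a zip archive.

    Directory entries are skipped, so nested folders appear only through
    the paths of the files they contain.
    """
    # is_zipfile returns False for missing files and for non-zip content.
    if not zipfile.is_zipfile(archive_path):
        raise InvalidArchiveError(f"{archive_path} is not a valid zip archive")
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            str(PurePosixPath(info.filename))
            for info in archive.infolist()
            if not info.is_dir()
        )
```

Raising a dedicated exception keeps the "wrong format" case easy to test separately from other failures, which may help with the testing expectations for this milestone.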

Milestone #2 (January-March 01)

Tentatively -- Details to be finalized by January 05

Generally, the system should operate as a service through API calls. Aside from that, the focus of this milestone is to create functionality that supports a human-in-the-loop process. Since we cannot expect any system to perfectly extract the information that different people might want, the system should be designed in a way that facilitates user selection, customization, and correction. The additional requirements are below.

The system must be able to ... :

  1. Allow information to be added incrementally through another zipped folder of files for the same portfolio or résumé
  2. Recognize duplicate files and maintain only one copy in the system
  3. Allow users to choose which information is represented (e.g., re-ranking of projects, corrections to chronology, attributes for project comparison, skills to highlight, projects selected for showcase)
  4. Incorporate the key role of the user in a given project
  5. Incorporate evidence of success (e.g., metrics, feedback, evaluation) for a given project
  6. Allow the user to associate an image with a given project to use as its thumbnail
  7. Customize and save information about a portfolio showcase project
  8. Customize and save the wording of a project used for a résumé item
  9. Display textual information about a project as a portfolio showcase
  10. Display textual information about a project as a résumé item
Your system is expected to be built using Python. The teaching staff considered other programming languages that were proposed, but we believe students will have the most success using Python for this part of the project.
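
One possible way to approach requirement 2 above (recognizing duplicate files) is to compare files by a hash of their content rather than by name. The sketch below assumes files are already available as `(path, bytes)` pairs; the function names are hypothetical, and SHA-256 is only one reasonable choice of digest.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a hex digest that identifies a file by its content."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(files):
    """Map each distinct content digest to the first path seen with it.

    `files` is an iterable of (path, data) pairs. Later files whose
    content matches an earlier file are treated as duplicates and ignored.
    """
    kept = {}
    for path, data in files:
        kept.setdefault(content_digest(data), path)
    return kept
```

Keying on content means a file that is renamed or re-uploaded in a second zipped folder is still recognized, while two different files that happen to share a name are both kept.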

Milestone #3 (March-April 05)

Tentatively -- Details to be finalized by March 01

The focus of this milestone is to create a front-end for the user. Teams can choose how best to design their system to incorporate a front-end for their Milestone #2 work, and can choose the development framework that works best for their project. This also means that you can extend the technology stack to your liking at this point. Teams can choose to build an online website or a system that generates webpages locally. Deployment is not expected, but it will be treated as a bonus feature if it is done. By the end of the milestone, the system should generate these two items:

  1. A One-Page Résumé:
    • Education/Awards
    • Skills, categorized by expertise level
    • Projects, highlighting evidence of contributions/impact
  2. A Web Portfolio:
    • Timeline of skills, demonstrating learning progression and increase in expertise/depth
    • Heatmap of project activities, showing evidence of productivity over time
    • Showcase of top 3 projects, illustrating the process to demonstrate the evolution of changes
    • Dashboard supports a private mode where the user can interactively customize specific components or visualizations before going live
    • Dashboard supports a public mode where the dashboard information only changes based on search and filter
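
For the heatmap of project activities listed above, the underlying data could be as simple as a count of events per day. The sketch below assumes activity events are available as ISO 8601 timestamp strings; that input format and the name `daily_activity_counts` are assumptions for illustration, not requirements.

```python
from collections import Counter
from datetime import datetime


def daily_activity_counts(timestamps):
    """Count activity events per calendar day.

    `timestamps` is an iterable of ISO 8601 strings such as
    "2025-10-03T09:15:00". The result maps "YYYY-MM-DD" to a count.
    """
    return Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )
```

The resulting counts can be handed to whichever front-end framework the team chooses in order to render the heatmap itself.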