Project Option
Last updated August 26, 2025
Everyone will work on the same project option this year, although the
details will vary. We are limiting the options in order to focus on
pedagogy and software engineering principles.
The client will act in a different capacity this year and we will meet the client in Term 2.
All the projects will be open-source.
Mining Digital Work Artifacts
- Project goal:
This project investigates how you can mine individual work artifacts from a
laptop or any computer device to identify the digital outputs they create in
the course of professional or creative activities. Work artifacts may include
programming code and repositories, written documents and notes, or design
sketches and media files.
By analyzing these artifacts and their metadata, your project will uncover
meaningful insights in one's work contributions, creative processes, and
project evolution. This mining approach aims to help individuals showcase and
reflect on their productivity, creative direction, and skill development while
raising awareness of the ethical considerations of analyzing one’s personal
work history.
- Target users:
A graduating student or an early-career professional who works on a computer
regularly and wants to gather information about the projects they have worked
on over the years. They hope to showcase this information in a web portfolio,
or in a dashboard of some kind that lets them easily see the metrics and
highlights of their hard work, or to use descriptions and metrics to improve
their résumé.
- Technology Stack Requirement:
A programming language that allows you to focus on system architecture design
and decisions. You will eventually build an API and a front-end in future
milestones in Term 2.
Milestone #1 (October-December 07)
The focus of this milestone is to create the functionality for parsing work
artifacts and outputting information correctly. We will be very particular
about your system design and testing approach during this phase.
All the output for this milestone is expected to be in text (that is, you can opt for CSV, JSON, plain text, etc., or a combination that facilitates your future development). The specific requirements are below.
The system must be able to ... :
- Require the user to give consent for data access before proceeding
- Parse a specified zipped folder containing nested folders and files
- Return an error if the specified file is in the wrong format
- Request user permission before using external services (e.g., an LLM) and explain the data privacy implications for the user's data
- Have alternative analyses in place if sending data to an external service is not permitted
- Store user configurations for future use
- Distinguish individual projects from collaborative projects
- For a coding project, identify the programming language and framework used
- Extrapolate individual contributions for a given collaboration project
- Extract key contribution metrics for a project, such as the project's duration and the frequency of contributions by activity type (e.g., code vs. test vs. design vs. document), along with other important information
- Extract key skills from a given project
- Output all the key information for a project
- Store project information into a database
- Retrieve previously generated portfolio information
- Retrieve a previously generated résumé item
- Rank importance of each project based on user's contributions
- Summarize the top ranked projects
- Delete previously generated insights while ensuring that files shared across multiple reports are not affected
- Produce a chronological list of projects
- Produce a chronological list of skills exercised
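To make two of the requirements above concrete (parsing a zipped folder and returning an error for the wrong format, then identifying programming languages), here is a minimal sketch using only the Python standard library. The names `InvalidArchiveError`, `list_archive_files`, `count_languages`, and the `EXTENSION_LANGUAGES` table are hypothetical and only illustrate one possible design; guessing a language from a file extension is a simplifying assumption, not a required approach.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical lookup table; a real system would need a much fuller mapping.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".java": "Java",
    ".js": "JavaScript",
    ".ts": "TypeScript",
    ".c": "C",
    ".cpp": "C++",
}


class InvalidArchiveError(ValueError):
    """Raised when the supplied file is not a readable zip archive."""


def list_archive_files(source):
    """Return the paths of all files (not directories) inside a zip archive.

    `source` may be a filesystem path or a binary file-like object.
    Nested folders are handled because zip entries carry their full path.
    """
    if not zipfile.is_zipfile(source):
        raise InvalidArchiveError("expected a .zip archive")
    with zipfile.ZipFile(source) as archive:
        return [info.filename for info in archive.infolist() if not info.is_dir()]


def count_languages(paths):
    """Count files per language, using the file extension as a rough signal."""
    counts = Counter()
    for path in paths:
        language = EXTENSION_LANGUAGES.get(PurePosixPath(path).suffix.lower())
        if language:
            counts[language] += 1
    return counts
```

Raising a dedicated exception keeps the "wrong format" case easy to test separately from the parsing logic, which fits the emphasis on system design and testing in this milestone.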
Your system is expected to be built using Python. The teaching staff considered
other programming languages that were proposed, but we believe students will
have the most success using Python for this part of the project.
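Since this milestone is built in Python, the storage and deletion requirements could be prototyped with the standard-library `sqlite3` module. The sketch below is one possible design under stated assumptions: the table names (`report`, `file`, `report_file`) and the functions `open_store`, `add_report`, and `delete_report` are hypothetical. It shows how a link table can let a report be deleted while files still used by other reports are left untouched.

```python
import sqlite3

# Hypothetical schema: reports and files are linked many-to-many.
SCHEMA = """
CREATE TABLE IF NOT EXISTS report (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS file (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE);
CREATE TABLE IF NOT EXISTS report_file (
    report_id INTEGER NOT NULL REFERENCES report(id),
    file_id INTEGER NOT NULL REFERENCES file(id),
    PRIMARY KEY (report_id, file_id)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_report(conn, title, file_paths):
    """Store a report and link it to its files, reusing known file rows."""
    report_id = conn.execute(
        "INSERT INTO report (title) VALUES (?)", (title,)
    ).lastrowid
    for file_path in file_paths:
        conn.execute("INSERT OR IGNORE INTO file (path) VALUES (?)", (file_path,))
        file_id = conn.execute(
            "SELECT id FROM file WHERE path = ?", (file_path,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO report_file (report_id, file_id) VALUES (?, ?)",
            (report_id, file_id),
        )
    conn.commit()
    return report_id


def delete_report(conn, report_id):
    """Delete one report; remove only files no other report still links to."""
    conn.execute("DELETE FROM report_file WHERE report_id = ?", (report_id,))
    conn.execute("DELETE FROM report WHERE id = ?", (report_id,))
    conn.execute("DELETE FROM file WHERE id NOT IN (SELECT file_id FROM report_file)")
    conn.commit()
```

For example, if two reports both reference `shared/logo.png`, deleting the first report removes its own files but keeps `shared/logo.png`, because the second report still links to it.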
Milestone #2 (January-March 01)
Tentatively -- Details to be finalized by January 05
Generally, the system should operate as a service through API calls.
Aside from that, the focus of this milestone is to create functionality that supports a human-in-the-loop process. Since we cannot expect any system to perfectly extract the desired information that different people might want, the system should be designed in a way that facilitates user selection, customization, and corrections.
The additional requirements are below.
The system must be able to ... :
- Allow information to be added incrementally by uploading another zipped folder of files for the same portfolio or résumé
- Recognize duplicate files and maintain only one copy in the system
- Allow users to choose which information is represented (e.g., re-ranking of projects, corrections to chronology, attributes for project comparison, skills to highlight, projects selected for showcase)
- Incorporate the key role of the user in a given project
- Incorporate evidence of success (e.g., metrics, feedback, evaluation) for a given project
- Allow the user to associate an image with a given project to use as the thumbnail
- Customize and save information about a portfolio showcase project
- Customize and save the wording of a project used for a résumé item
- Display textual information about a project as a portfolio showcase
- Display textual information about a project as a résumé item
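The incremental-upload and duplicate-file requirements above could be approached by hashing file contents, so that two files with identical bytes are recognized even if their names or folders differ. This is a minimal sketch assuming SHA-256 content hashing is an acceptable definition of "duplicate"; the function names `file_digests` and `merge_unique` are hypothetical.

```python
import hashlib
import zipfile


def file_digests(archive_source):
    """Map each file path in a zip archive to the SHA-256 digest of its bytes."""
    digests = {}
    with zipfile.ZipFile(archive_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            digests[info.filename] = hashlib.sha256(archive.read(info)).hexdigest()
    return digests


def merge_unique(known, incoming):
    """Add incoming files to `known` (digest -> first path seen with that content).

    Returns the list of incoming paths skipped because their content
    was already present.
    """
    skipped = []
    for path, digest in incoming.items():
        if digest in known:
            skipped.append(path)
        else:
            known[digest] = path
    return skipped
```

Keying the store by digest rather than by path means a second zipped folder can be merged into the same portfolio without storing the same content twice. Reading each file fully into memory is a simplification; large files would call for chunked hashing.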
Your system is expected to be built using Python. The teaching staff considered
other programming languages that were proposed, but we believe students will
have the most success using Python for this part of the project.
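As an illustration of what "operating as a service through API calls" might look like in Python, here is a minimal sketch of a WSGI application built only on the standard library. It is not a required design: the route `/projects/<id>`, the in-memory `PROJECTS` data, and the helper `call` are all hypothetical, and teams may well prefer a web framework instead.

```python
import json
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory data standing in for the project database.
PROJECTS = {"1": {"name": "demo-project", "skills": ["Python"]}}


def _json_response(start_response, status, payload):
    """Serialize `payload` as JSON and start the WSGI response."""
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def app(environ, start_response):
    """Minimal WSGI app: GET /projects/<id> returns one project as JSON."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    is_get = environ.get("REQUEST_METHOD") == "GET"
    if is_get and len(parts) == 2 and parts[0] == "projects":
        project = PROJECTS.get(parts[1])
        if project is not None:
            return _json_response(start_response, "200 OK", project)
    return _json_response(start_response, "404 Not Found", {"error": "not found"})


def call(path):
    """Invoke the app in-process (no socket), returning (status, parsed JSON)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a default GET request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)
```

Because the app is a plain callable, it can be exercised in unit tests through `call("/projects/1")` without starting a server. To try it over HTTP, it could be served with `wsgiref.simple_server.make_server`.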
Milestone #3 (March-April 05)
Tentatively -- Details to be finalized by March 01
The focus of this milestone is to create a front-end for the user.
Teams can choose how best to design their system to incorporate a
front-end on top of their Milestone #2 work.
Teams can choose a development framework that works best for their project.
This also means that you can extend the technology stack to your liking at this
point.
Teams can choose to build an online website or a system that generates web
pages locally. There are no expectations for deployment, but it will be treated
as a bonus feature if it is done.
By the end of the milestone, the system should generate these two items:
- A One-Page Résumé:
  - Education/Awards
  - Skills, categorized by expertise level
  - Projects, highlighting evidence of contributions/impact
- A Web Portfolio:
  - Timeline of skills, demonstrating learning progression and increases in expertise/depth
  - Heatmap of project activities, showing evidence of productivity over time
  - Showcase of top 3 projects, illustrating process to demonstrate evolution of changes
  - Dashboard supports a private mode where the user can interactively customize specific components or visualizations before going live
  - Dashboard supports a public mode where the dashboard information only changes based on search and filter
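For the heatmap of project activities, the back-end data could be as simple as a count of activity events per calendar day. The sketch below assumes that each activity (for example, a file modification or a commit) has already been reduced to a `datetime`; that assumption and the function name `heatmap_cells` are hypothetical, and the front-end rendering is left open.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def heatmap_cells(timestamps, start, end):
    """Return (day, count) pairs for every day from `start` to `end` inclusive.

    Days with no activity are included with a count of 0, so the
    front-end can draw an unbroken grid.
    """
    counts = Counter(ts.date() for ts in timestamps)
    cells = []
    day = start
    while day <= end:
        cells.append((day, counts[day]))
        day += timedelta(days=1)
    return cells
```

Returning zero-count days explicitly keeps the gap-filling logic out of the visualization code, which can then map each count directly to a color intensity.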