IBM InstructLab
Increasing accuracy, efficiency, and usability in large-scale data review for large language model (LLM) training.
8 weeks
Product Designer
1 project lead, 5 designers
High-fidelity Figma prototype
Background
InstructLab is an open-source IBM project that trains and fine-tunes enterprise-level LLMs with synthetically generated data on IBM's flagship watsonx.ai product.
This allows enterprises to create and train custom models for tasks such as data analysis and customer service.
Challenge
Currently, InstructLab does not have a consistent way to review the hundreds of training data sets fed into the model.
Instead, individual teams must rely on ad hoc, manual review processes to refine their models.
Results
We designed a smoother, more intuitive workflow for reviewing synthetic data, developing features like modular toggles, collaboration tools, and source traceability to make the review process more efficient and transparent.
Key Problem: Reviewing synthetic data in InstructLab is unstructured, laborious, and full of bottlenecks.
The first version of our redesigned flow.
Redesigning the Synthetic Data Generation (SDG) Review Process
We started by reimagining the data viewing experience. Assuming that most reviewing teams were highly collaborative, we iterated on the commenting feature, emphasizing the ability to reference the discussion during review and to tag other commenters when a user was unsure about the content.
Using the current watsonx.ai interfaces as a reference, and with continuous feedback from two watsonx.ai UX designers and one InstructLab developer, we designed screens encompassing the three prioritized features: Collaborative Team Tools (filtering and commenting); List and Modular Views; and Approving, Denying, and Editing.
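To make the three features concrete, here is a minimal TypeScript sketch of the review model they imply. Every type, field, and function name is an assumption for illustration, not InstructLab's or watsonx.ai's actual code.

```typescript
// Hypothetical shape of one synthetically generated Q&A pair under
// review. All names are illustrative, not InstructLab's real schema.
type ReviewStatus = "pending" | "approved" | "denied" | "edited";

interface ReviewComment {
  author: string;     // reviewer who left the comment
  mentions: string[]; // teammates tagged into the discussion
  body: string;
  createdAt: Date;
}

interface ReviewItem {
  id: string;
  question: string;          // synthetically generated question
  answer: string;            // synthetically generated answer
  status: ReviewStatus;
  editedAnswer?: string;     // only set when status === "edited"
  comments: ReviewComment[];
  sourceDocument: string;    // reference document behind the generated data
}

// Approving, denying, and editing all reduce to one status transition,
// so review history stays in a single, filterable place.
function setStatus(
  item: ReviewItem,
  status: ReviewStatus,
  editedAnswer?: string
): ReviewItem {
  return { ...item, status, editedAnswer };
}
```

Modeling approve, deny, and edit as a single status field is what makes team-level filtering straightforward: the list and modular views can both query the same state.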
A glimpse into our low-fidelity mock-ups. Here, we're exploring the possibilities of the modular view.
How Feedback Reshaped Our Priorities
With our mid-fidelity prototype, our team gathered critical user feedback that reshaped our approach for the next iteration.
We learned that data review is a much more individualized process than we had anticipated. Introducing collaboration tools without careful consideration could disrupt, rather than bolster, a team's natural workflow.
Reviewers frequently leaned on the reference document to evaluate the quality of the synthetically generated data, a need our initial design did not prioritize.
Reviewers also strongly preferred the modular view over the list view for its richer functionality, but found toggling between the two screens unintuitive in our mid-fidelity interface.
These insights helped us prioritize three key areas for our next iteration: making navigation more intuitive, improving commenting, and increasing the accessibility of the reference document.
Key Improvements: Translating Feedback into Frictionless Design
While the high-fidelity designs cannot be shared, here are the specific adjustments we prioritized to better align with our users:
Navigation:
Introduced a toggle icon to clearly indicate the transition between views (see the sketch after this list).
Added a list view icon to indicate the current view.
Designed a colorful animated transition to clearly signal the shift to modular view.
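As an interaction sketch only (assuming a React-style component; names and structure are hypothetical, not the shipped watsonx.ai code), the toggle can hang off a single piece of view state so the icon and the rendered screen never disagree:

```tsx
import { useState } from "react";

type ViewMode = "list" | "modular";

// Illustrative only: one piece of state drives both the toggle icon
// and which view renders, so the two can never fall out of sync.
export function ViewToggle() {
  const [view, setView] = useState<ViewMode>("modular");
  const next: ViewMode = view === "list" ? "modular" : "list";

  return (
    <button onClick={() => setView(next)} aria-label={`Switch to ${next} view`}>
      {view === "list" ? "List view (current)" : "Modular view (current)"}
    </button>
  );
}
```

The animated transition would then key off changes to this one value rather than off separate per-screen flags.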
Commenting:
Implemented a minimally invasive comment display to avoid disrupting the review process.
Redesigned the comment modal to support short interactions between reviewers.
Reference Documents:
Embedded the reference document for each question in the modular view.
Added a PDF search tool for convenient information look-up (see the sketch after this list).
Enabled reference documents in list view for consistent access.
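As a rough sketch of the search behavior (assuming the PDF's text has already been extracted page by page; the function and inputs are hypothetical, not the actual tool):

```typescript
// Illustrative only: given per-page text extracted from the reference
// PDF, return the 1-indexed pages that contain the query.
function searchReferenceDocument(pageTexts: string[], query: string): number[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];

  return pageTexts
    .map((text, i) => (text.toLowerCase().includes(needle) ? i + 1 : -1))
    .filter((page) => page !== -1);
}

// Example: jump the embedded viewer to the first matching page.
const pages = ["Introduction to taxonomies", "Synthetic data generation details"];
console.log(searchReferenceDocument(pages, "synthetic")); // [2]
```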
"This is a big step forward from what we've been doing in the past — a large Improvement!"
Jacob Engelbrecht
Backend Software Engineer @ IBM