Faceted Browse for the SPARC Portal

TL;DR: The goal was to encourage discovery and reuse of datasets and models resulting from NIH-funded research on the autonomic nervous system and bioelectronic medicine.  In particular, we wanted to leverage the rich metadata connected via ontologies in an underlying knowledge graph, which would let users traverse the connections in the data far better than a keyword search over a dataset title or paragraph description.  The faceted browse project was part of this broader effort.  Over 7-8 months, I led its research and design phases, aiming for a cohesive experience for browsing all the site’s content while using the metadata to help users surface relevant content.  I conducted interviews, participatory design activities, card sorts, prototyping, and usability testing, and I worked closely with the data and development teams to ensure the designs would be feasible and to understand what the available metadata made possible.

Challenge | Approach | Results | Tools + Methods

Challenge

Through the SPARC program, the National Institutes of Health (NIH) funds research in bioelectronic medicine and on the autonomic nervous system.  NIH created the SPARC Portal as an open data portal to make datasets and models resulting from this research available for reuse by other researchers.

The question we set out to answer: How can we provide one place that users can go to see what’s available on the SPARC Portal and find relevant content that will help them accomplish their task?  Specifically, how can we help researchers browse and discover data and models relevant to their work?

Although fewer than 200 datasets and models were available at the time, all accessible via keyword search, there were several reasons to improve how outside researchers could discover relevant data and models:

  • Future plans to automate the curation process (adding metadata and preparing submitted datasets/models to be shared) meant there would be significant growth in the volume of available content moving forward.
  • There was an underlying knowledge graph composed of the vast amounts of metadata for each model and dataset, connecting them across all types of attributes (e.g., species, anatomical structure, experimental modality) based upon existing ontologies and mapping work.  The existing keyword search did not use this graph at all, which meant that someone searching for “heart” would not find items that had “cardiac” in the title or “left ventricle” in the description.  Leveraging the ontological relationships in the knowledge graph would allow someone searching for “heart” to find datasets or models that referenced substructures or involved cardiac data without the word “heart.”
  • Keyword search works best when someone knows exactly what they are looking for, while browsing supports discovery when someone wants to see what is available.
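
To make the “heart” example concrete, here is a minimal sketch of query expansion over ontological relationships.  It is not the SPARC Portal's implementation: the toy ontology, the dataset records, and the function names (`expand_term`, `keyword_search`, `ontology_search`) are all hypothetical and exist only to illustrate the idea.

```python
from collections import deque

# Hypothetical toy ontology (an assumption for illustration only).
# SYNONYMS maps a term to alternative labels; PARTS maps a term to its sub-structures.
SYNONYMS = {"heart": {"cardiac"}}
PARTS = {"heart": {"left ventricle", "right atrium"}}

# Hypothetical dataset records; none of them contains the literal word "heart".
DATASETS = [
    {"title": "Cardiac electrophysiology recordings", "description": "Recordings from rat."},
    {"title": "Pressure-volume study", "description": "Measurements in the left ventricle."},
    {"title": "Vagus nerve stimulation", "description": "Effects on stomach motility."},
]


def expand_term(term):
    """Return the term plus every synonym and sub-structure reachable from it."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        related = SYNONYMS.get(current, set()) | PARTS.get(current, set())
        for name in related:
            if name not in seen:
                seen.add(name)
                queue.append(name)
    return seen


def keyword_search(term, datasets):
    """Plain keyword search: match only the literal term in title or description."""
    term = term.lower()
    return [d["title"] for d in datasets
            if term in (d["title"] + " " + d["description"]).lower()]


def ontology_search(term, datasets):
    """Match any term in the expanded set, preserving dataset order."""
    terms = expand_term(term.lower())
    return [d["title"] for d in datasets
            if any(t in (d["title"] + " " + d["description"]).lower() for t in terms)]
```

With these toy records, `keyword_search("heart", DATASETS)` finds nothing, while `ontology_search("heart", DATASETS)` returns the cardiac and left-ventricle items.  A real knowledge graph would match on curated metadata identifiers rather than substrings of free text.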

Other considerations:

  • We needed to balance between the primary audience of scientists who might use the data and the secondary audience of dataset contributors (i.e., not make the submission and curation process too onerous).
  • This particular research area spans many different disciplines, meaning that there is a wide range of the data and models produced and their attributes.

The potential audiences were also a diverse group, from principal investigators and postdocs to data scientists and device designers, from researchers who focused on a particular anatomical structure or phenomenon to those who specialized in a particular technical approach, such as neuromodulation or mapping.

Approach

I was the primary design researcher and project lead for the UX design of the faceted browse.  With the guidance of a senior UX strategist, I developed all the protocols, often moderated the interviews or user research activities, and I consulted and collaborated with the UX designer, data engineers, the team that curated dataset submissions, and developers on design and implementation – all part of a globally distributed consortium working on the SPARC Portal.

The plan consisted of three phases of research and design, along with ongoing technical feasibility work and planning for implementation.  We focused on faceted browsing because browsing behavior differs from search behavior.

Title is "Y4 - Faceted Browse & Search Project Overview." Image consists of a horizontal timeline with milestones from left to right: Phase 1 Discovery, Began technical feasibility, Phase 2 Facets and Display, Co-design activity, Card sort activity, Phase 3 Prototyping and Testing, Test + tech review prototype 1, Test prototype 2, Ready for development, Faceted browse QA.  Then, the line turns from purple to orange to indicate the next project (Site Search) which begins with Faceted Browse AND Search implemented on staging, with the milestone of Test search configuration.

Knowing that faceted browse would be the first in a series of projects related to helping users explore and access SPARC program data, the first phase was discovery.  We began by meeting with internal stakeholders to understand the context, the constraints, and how we might collaborate.  From this, we gained an understanding of the metadata available, the dataset curation process, the underlying infrastructure, concurrent projects and dependencies, and future plans.  I leveraged my background in data management and governance heavily, and diagrammed the infrastructure to facilitate conversations with technical leads.

I then interviewed potential users interested in data from other researchers on their needs and their current workflows in searching for data or models.  I diagrammed the user workflow as well as some of the infrastructure.

Workflow diagram in two halves: Faceted Browse & Search, and Dataset Details Page.

Under Faceted Browse & Search are 2 phases: Search for data, Review for relevant results.  Search for data steps are: Keyword search, Select facets or filters.  Review for relevant results steps are: Read titles of results, If results are unsatisfactory iterate on search terms/facets, May read number of citations or publication date or author/investigators, Pass over irrelevant results, Click through to view in full.

The phases in the second half, Dataset Details page, are: Assess data for usefulness, Download, Assess data for usefulness, Use.  Steps are also included for each of these phases -- although this was used to inform a separate project.

Below the flowchart are boxes: Why look for data, Key questions for assessing data for usefulness, Must be able to....
Since the discovery would support a number of upcoming projects, I mapped the workflow beginning with faceted browse all the way through actually downloading a dataset or model.

I used our findings to plan the next two phases: refining the content model, and prototyping/testing.

For the second phase, our objective was to refine the content model: to deepen our understanding of both the metadata available and users’ mental models around datasets and models, including which attributes mattered to them.  Since this can be a rather abstract topic, I developed a participatory activity using Miro that allowed participants to build their own faceted browse interface, asking them to explain which attributes (or facets) were most useful for browsing vs. for evaluating results.

Screenshot from a Miro board.  At the top it asks "How often do you search for each? Drag label to appropriate point on line." Line is a spectrum from Never to Once a week or more. Labels are Simulations, Imaging Data, Other data.

Below is a large rectangle like a webpage, with a purple left column, and several example search results displayed in a list on the right with pictures, titles, and a brief description.  The large rectangle is flanked by small rectangles labeled with facet names, connected to boxes that list the options that would appear for that facet in the dropdown.  

On the far left is a yellow rectangle containing boxes for participants to create their own facets with dropdown options.

On the far right is a magenta box that also has small rectangles for participants to create their own tags to display facets on the search results.

Knowing that the relevant attributes/facets varied depending on what someone was looking for, we followed this with an unmoderated card sort to categorize result types (datasets vs. 3D models vs. simulations vs. data analysis, etc.).

The results of the second phase formed the building blocks for our prototypes.  We created categories of result types, with the facets to be used for browsing or displayed in results per type.  I used Airtable to prototype this initial model so that the prototype used in usability testing could draw on real data.  Given the technical nature of the content, it was critical that the participants not be distracted by scientifically inaccurate examples in the prototype.  I used this Airtable to provide the UX designer with the examples to demonstrate in the clickable prototype.

Screenshot of an Airtable base named Prototype Idea in Gallery view.  There are vertical rectangles that contain information about datasets and show colorful tags for different attributes.

I conducted two rounds of moderated usability testing with two iterations of the prototype.  In addition, I moderated a review of the prototype by the technical leads for data and development for feasibility. 

Webpage that shows the SPARC Portal header and below, the faceted browse interface.

The upper left shows the number of results and that 10 results are shown per page.  Across on the right are page numbers with page 1 highlighted.

In the left column are boxes with various facets, starting with Browse Categories (set to Models & Simulations), followed by things like Model Type, Anatomical Structure, Species, Age Category, etc. with dropdowns.  Some facets (Has Publications, Status, etc.) have checkboxes.

On the right is the list of results.  For each result it shows a thumbnail image with a dataset title, a short paragraph description, and date in international friendly format (May 20, 2020).
Prototype 1 (by Dominic Rogers)

We were able to validate the structure and concept, and to refine the UI to address usability issues in the final designs.

Results

From the discovery phase, we learned that open data represents a change from accessing information via peer-reviewed papers or via formal collaboration with another researcher.  Thus, in our designs, we sought to provide some familiarity while also leveraging the metadata by choosing facets that would support effective winnowing and browsing.  The facets provided information about the types of content available.  Metadata included in the results display helped users trust the browsing functionality and evaluate results for relevancy before clicking through. 

We had learned that transparency was crucial to users’ trust in what they were finding, including being able to see when something they expected was not there.

Due to the diversity of datasets/models, as well as the desire to create a cohesive experience for discovering relevant content (which also included tools, tutorials, and other resources), we decided upon grouping content per result type (Data, Models & Simulations, Tools & Resources, etc.) so that the browsing facets would reflect what was both available and useful in filtering for each category of content.  We also included a search bar since a future project would be to evaluate how the search could be improved in concert with faceted browsing.
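
The idea of giving each result type its own facets can be sketched as a small configuration plus a counting step.  This is an illustrative sketch only, not the portal's code: the facet names are a subset of those in the designs, but the mapping structure, the sample items, and the functions `facet_counts` and `apply_filters` are assumptions made for the example.

```python
from collections import Counter

# Facet names come from the designs; the mapping itself is a hypothetical structure
# and lists only a subset of the facets shown for each category.
FACETS_BY_CATEGORY = {
    "Data": ["Anatomical Structure", "Species", "Sex"],
    "News & Events": ["News/Event Type"],
}

# Hypothetical sample items, invented for illustration.
ITEMS = [
    {"category": "Data", "Anatomical Structure": "Heart", "Species": "Rat", "Sex": "Female"},
    {"category": "Data", "Anatomical Structure": "Stomach", "Species": "Rat", "Sex": "Male"},
    {"category": "Data", "Anatomical Structure": "Heart", "Species": "Pig"},
    {"category": "News & Events", "News/Event Type": "Webinar"},
]


def facet_counts(items, category):
    """Count how many items in a category carry each value of each of its facets."""
    counts = {facet: Counter() for facet in FACETS_BY_CATEGORY[category]}
    for item in items:
        if item["category"] != category:
            continue
        for facet in counts:
            value = item.get(facet)
            if value is not None:  # skip items that lack this piece of metadata
                counts[facet][value] += 1
    return counts


def apply_filters(items, category, selected):
    """Keep items in the category whose metadata matches every selected facet value."""
    return [
        item for item in items
        if item["category"] == category
        and all(item.get(facet) == value for facet, value in selected.items())
    ]
```

For example, `facet_counts(ITEMS, "Data")["Species"]` counts two rat items and one pig item, and `apply_filters(ITEMS, "Data", {"Species": "Rat", "Anatomical Structure": "Heart"})` keeps only the first record.  Showing counts only for facets that apply to the chosen category is one way to let the facets themselves communicate what content is available.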

Based on the research findings and analysis, I provided specs to the UX designer and worked with him to find solutions to usability issues.

Webpage with the SPARC header at the top, followed by the faceted browse interface.

Across the top are browse categories: Data, Models & Simulations, Tools & Resources, News & Events, SPARC Information.

This is followed by a search bar.

Below are two columns.  The left column consists of a box for applied filters followed by collapse/expand boxes for the facets: Anatomical Structure, Species, Sex, Age Category, Techniques.  There is a checkbox for Code Available.  There are collapse/expand boxes for Coding Language and File Format.  Then checkboxes for Has Publications and Availability.

In the right column is a list of results, with a number of results, results per page, and pages at the top.  Per each result is a thumbnail image, a title, a brief description, and then metadata listed with labels.
Webpage with the SPARC header at the top, followed by the faceted browse interface.

Across the top are browse categories: Data, Models & Simulations, Tools & Resources, News & Events, SPARC Information.

This is followed by a search bar.

Below are two columns.  The left column consists of a box for applied filters followed by facets: News/Event Type, Publication Date.  Publication Date offers Show all, Before (month year), During (month year), After (month year).

In the right column is a list of results, with a number of results and results per page at the top.  Per each result is a thumbnail image, a title, a brief description, and a publication date in international friendly format (Feb 21, 2021).
Final designs by Dominic Rogers

The existing search functionality on the SPARC Portal consisted of separate search bars that allowed users to filter within a page (e.g., within Data, or within News & Events) but required users to go to each section to search a different type of content.

Title is Current Pathways for Search & Browse.

On the left is a website header menu with an orange circle around Tools & Resources and pointing to a screenshot of the Tools & Resources page.

In the middle is a search bar set to News & Events with an arrow pointing towards a filtered News & Events page.

On the right is a search bar set to Support Center with an arrow pointing towards a filtered Support Center page.

Since these designs introduced a singular, dedicated search and browse page with results displays designed for each type of content, we needed to make other changes across the site. I decided to retool the existing search pages to be landing pages to guide users and provide context (e.g., where the data came from and how they were curated) in order to build trust.

Large white canvas showing a series of webpage screenshots connected with arrows.

Besides annotating the wireframes provided by the UX designer, I created user flows for the revised information architecture across the site and a taxonomy for the facets that included their definitions, so the user-friendly labels could be mapped to the corresponding metadata.  I prepared tickets breaking up the project for implementation over multiple sprints, and worked with the UX designer to create a table of the UI elements for the developers, since some were also part of building out the design system.  I worked closely with the technical leads in planning for implementation.  In addition, I wrote the reports (discovery, testing, and overall) and led the briefings, both for the initial kickoffs with the client and other stakeholders and to present the final designs to NIH.

As of December 2021, the faceted browse design is in the process of being developed incrementally, across multiple releases.  Progress to date can currently be viewed at https://sparc.science/data?type=dataset

Tools + Methods

  • Airtable
  • Abstract
  • InVision
  • Sketch
  • Miro
  • Zoom
  • Otter
  • Card Sort
  • Interview
  • Participatory Methods
  • Usability Testing
  • Stakeholder Engagement
  • Project Management