
Blue Brain Nexus

Open-Source Semantic Data Management

December 31, 2021 · Alex Ulbrich


Work in Progress

Hello! This page is still evolving. The information published is well-researched, but there is more to do, and always more references to find.

Some things I still need to do:

  • Explain the objectives of the product
  • Define user personas
  • Analyze the market
  • Show design iterations
  • Reflect with a retrospective

Blue Brain Nexus is a semantic research data management platform developed at the Blue Brain Project to support and accelerate simulation neuroscience specifically, and research generally.

Background and Context

The Blue Brain Project (BBP) aims to digitally reconstruct the mouse brain. It is a simulation neuroscience laboratory that is part of the École Polytechnique Fédérale de Lausanne (EPFL). At the time of writing, the lab has almost 150 collaborators: scientists, PhDs, postdocs, engineers, and one product manager. The initiative started in 2005 and uses a supercomputer to simulate parts of the mammalian brain.

In a nutshell, the process is:

  1. Collection of experimental (biological) data: cells such as neurons are sliced, stained, and their electrical behavior is measured
  2. Reconstruction and modeling: we use algorithms to identify different characteristics of cells, interactions between them, and build a detailed representation of the network of neurons (microcircuit)
  3. Simulation of the microcircuit: on a supercomputer, because there are millions of neurons and billions of synapses

Given the vast amount of data, both collected and generated, managing this data is one of the biggest challenges in the field.

In particular, while the Blue Brain Project is a long-term initiative, most researchers (PhDs, postdocs) stay only two to six years. It is crucial to onboard them quickly and to make sure their contributions are properly captured and reusable (not only their papers but also their code and data). As a result, half of the BBP’s staff is tasked with supporting scientists and their projects. This is the computing division.

There exist many tools to keep track of code (version control systems, continuous integration, …) and data (databases, data integration tools, …). Research, however, evolves fast by nature, and the existing data management alternatives were (in 2015) not sufficient:

  • new types of data were regularly created and standards were not keeping up,
  • models were evolving rapidly, new parameters were added, new methods tested,
  • knowledge transfer was difficult given the high turnover.

In addition, the ambition of the BBP was to simulate the whole mouse brain. We knew that there was not enough experimental data (in vivo or in vitro) to cover all of it. We had to rely on inferred information. The decision was thus made to build an in-house data management solution that would:

  • give us flexibility with data models (compared to fixed-schema relational databases),
  • allow us to easily leverage the intrinsic “links” between brain datasets,
  • therefore, pave the way for proper inference,
  • integrate all tools, software, storages used at the BBP in one common data platform.

Blue Brain Nexus was born.
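
To make the requirements above more concrete, here is a minimal Python sketch of what flexible, linked data can look like. It is an illustration only and not the actual Nexus data model or API: the resource names, the types, and the `derived_from` field are all hypothetical. Each resource is a free-form record (no fixed schema), and following the links between records recovers where a model came from.

```python
# Hypothetical resources, for illustration only (not the Nexus data model).
# Each record has free-form attributes and optional links to other records.
resources = {
    "slice-1": {"type": "BrainSlice", "species": "mouse"},
    "cell-1": {"type": "Neuron", "derived_from": ["slice-1"]},
    "trace-1": {
        "type": "Trace",
        "derived_from": ["cell-1"],
        "protocol": "step current",  # only traces carry this attribute
    },
    "model-1": {"type": "NeuronModel", "derived_from": ["cell-1", "trace-1"]},
}


def provenance(resource_id, graph):
    """Return the ids of every resource reachable via derived_from links."""
    seen = []
    stack = list(graph[resource_id].get("derived_from", []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited through another link
        seen.append(current)
        stack.extend(graph[current].get("derived_from", []))
    return sorted(seen)


print(provenance("model-1", resources))
# → ['cell-1', 'slice-1', 'trace-1']
```

Adding a new kind of record or a new attribute requires no schema migration here, and the links are what make questions such as "which experimental data does this model rely on?" answerable.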

Challenges and Opportunities

Working as a product manager in an academic research organization has its quirks. Here, I list challenges that I have experienced or witnessed, and potential workarounds or solutions for them.

An Academic Research Laboratory

The BBP is an academic research laboratory. It is run by university professors (also called PIs, for principal investigators). PIs are or were researchers and are used to directing the research of their PhD students (PhDs) or postdoctoral researchers (postdocs). They know their field really well, they are well connected, and, in a way, proper research is like proper product management. Create experiments, run them, iterate, until you reach your goal.

What happens when PIs start to direct software engineering or product management? What happens when the PIs run the computing division? Suddenly, they become the main users as well as your managers. From solving problems, you end up implementing their solutions, and the whole product management process falls apart.

How do you avoid being told what to implement, instead of letting the team come up with a solution?

  • Ask PIs to state the problems instead of the solutions. This is a hard one because solutions more easily align with their vision.
  • Don’t implement anything until you can actually validate the prototype. This is difficult in particular if teams work in silos (more on that in the next section).
  • (Preferred) Split the scientific leadership from the computing division altogether. The computing division should support the scientific roadmap but be completely independent of PIs. It should run like a business.

This last one is particularly hard but leads to the best results. Everyone ends up doing what they know best.

High Autonomy, Low Alignment

A symptom of the point above: especially in larger labs, there are different teams with different leads for different scopes. They all operate with high autonomy, PhDs in particular. This is fine but requires a high level of coordination if teams rely on each other. The computing division might end up siloed in a similar way.

Because leaders are full-time professors (or have responsibilities other than managing the lab full-time), you risk not having enough alignment across teams. High autonomy is great, but you also need high alignment.

How do you reach high alignment?

  • Frequent all-hands meetings where each team reports on its progress in plain, accessible language. The proceedings should be written down and shared afterwards.
  • Have at least one dedicated full-time executive that will manage the organization (like you would a business).
  • (Preferred) Set up Objectives and Key Results from top to bottom and up again. This requires a strong, dedicated, and full-time executive to drive.
  • Set up a project management office (PMO) that reports to that executive.
  • For the computing division, break down silos into engineering (skill-based) teams, and have cross-functional “product” teams to address scientific challenges.

Research Incentives

If you work in academia, you’ll know this: success is generally measured by research output (papers). The h-index is the most famous metric of research productivity.
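
As a reminder of what that metric captures, here is a short Python sketch of the standard h-index definition: the largest h such that the researcher has h papers with at least h citations each. The function name and the sample citation counts are made up for illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Made-up citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # → 4
```

The only inputs are papers and their citation counts. Data, code, documentation, and reproducibility do not appear anywhere in the calculation.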

Measuring success this way, unfortunately, does not incentivize researchers to:

  • publish their paper in open access,
  • publish their data (in open access),
  • produce high-quality, well-documented data,
  • publish their code and (statistical) analysis (in public repositories),
  • produce quality and well-documented code,
  • have reproducible experiments and results,
  • reproduce experiments to validate or reject results,
  • publish negative results.

These things are probably even more important than the number of papers, and, in my opinion, should be the main vectors for funding a laboratory.

Of course, to run a successful project in the long run (such as the BBP), you need many of these boxes to be checked (data, code, reproducibility).

How can we help the organization succeed in the long run (in order of importance)?

  1. Obviously, you’d need to change the system. Move away from evaluating research output, and instead design an index to holistically track the points listed above. This is hard, and journals in particular do not have much incentive to do so (financially).
  2. Onboard new hires (such as PhDs and postdocs) to use the tools and standards of the organization. Include these as objectives in their job description.
  3. Develop easy-to-use, well-documented software, tools, and standards for people to use. That’s a no-brainer, but it is more difficult than we think if we don’t have high alignment.
  4. Have a separate team (such as the computing division) take care of the code, data, and documentation quality. If all else fails, let the scientists do their science, and have someone else clean up after them.

Getting Started

I joined the Blue Brain Nexus team as their product manager in December 2019. The team had just published version 1 of the software, which included a new user interface that took aim at scientists.

Blue Brain Nexus was already open-source at that point, and we had adoption from two other organizations (namely the Human Brain Project in Geneva, and the Centre for Addiction and Mental Health in Toronto).

As befits any product manager, I started getting up to speed with the following:

  • Deep understanding of the
    • organization/business
    • users
    • data
    • market (alternatives and competition)
  • Technical understanding of the existing solution

This meant engaging with all stakeholders, directors and managers, and existing users, and listening in on all scientific meetings to understand the research. It also included looking at the numbers we had, if any, at what other organizations were doing, and at what the market had to offer.

Objectives

This section is under construction!

Identifying Users

This section is under construction!

Analyzing the Market

This section is under construction!

Retrospective

This section is under construction!