This is a blog post. The opinions expressed here are those of the author.
You will also find out how her trip to the Svalbard islands in the Artic in January ties in with what lays ahead.
Kate Crosman, Andre Hoffmann Fellow:
After almost four months as an André Hoffmann Fellow in Big Ocean Data, my scope of work is starting to settle into place. I have presented an initial research plan to all the research partners (NTNU, SFI Harvest, World Economic Forum Ocean Action Agenda and C4IR Oceans) and am ready to start moving forward.
At first glance, the topic may seem simple: do we believe that the data accurately portray the characteristics of the empirical reality that we seek to describe? However, even this formulation hides a wealth of complexity.
Characteristics of Big Data
As the first example of that complexity, consider that we are talking specifically about “big data” – meaning data that are high volume, rapidly accumulating, and that originate from a variety of platforms and come in a variety of formats.
Big data are therefore impossible to analyze using the standard statistical techniques employed in traditional data analysis, instead requiring the use of novel analytical techniques that rely on artificial intelligence (AI; e.g., machine learning, deep learning).
AI introduces accuracy concerns (e.g., algorithm selection, bias in training data) that extend far beyond – but do not supplant – the relatively simple question of whether or not the data themselves are accurate.
Whose trust, and to what ends?
As a second example, consider that the simple question of “do we believe that the data are accurate” omits definition of a crucial variable: who are “we”? In other words, whose trust are we talking about?
To add additional complexity, it has become clear that what is truly of interest to all research partners is not solely whether the data are accurate in the abstract. Rather, it is what trustworthiness (meaning the accuracy of data and information derived from analysis) and trust (meaning how the resulting information and knowledge are perceived and applied) mean for how the data are used.
How are Big Data for oceans used?
Maybe we can gain some traction by working backwards. Ultimately, we’re interested in using big data to make decisions for ocean sustainability.
Let’s look at an applied example.
Big data could, for instance, include images of water column suspended particulate, in which zooplankton have been identified and classified by a machine learning algorithm. They could also include multibeam echosounder data from sensors on fishing vessels, and streaming environmental data from surface buoys.
These data might be used to better understand the system drivers of plankton abundance and distribution. In turn, the data might allow the development of sustainable fishery management of that plankton, which then might provide a sustainable source of feed for Norway’s farmed salmon.
If we were concerned with only the accuracy of the information we gain from the data (i.e., trustworthiness), we might have questions about the accuracy of the data, the appropriateness of the machine learning algorithms, the tuning of the echosounders, and the spatial and temporal distribution of our buoys. This is just for starters. But in the above example, we can also start to see the outline of whose trust in data and decisions might bear consideration. A quick initial survey of potentially interested parties includes:
- buyers of farmed salmon
- salmon farmers themselves
- wild-capture fishers (both those that might target plankton and those in other fisheries that might be affected by novel fishery operations)
- fisheries managers
- other marine resource users
- a range of scientists who both provide and use the data in question.
Within each group, members are likely to share some similar concerns, but any individual might belong to more than one of these groups. Furthermore, despite their shared concerns, individuals within each group are likely to differ in their individual attitudes and experiences. So levels of trust or distrust are likely to vary with group membership(s), individual characteristics, and whether we are talking about trusting the data, the analysis, or the decisions, among other things (for instance, who is making the decision or doing the communicating).
Scientists and a tiny zooplankton
Where to start? Well, by digging into the example outlined above. As my first case, I’ll be looking at the stakeholder landscape, the big data inputs and transformations in the information value chain, and how both sets of variables intersect to create or challenge data trustworthiness and trust in the relatively young Calanus finmarchicus fishery in the Norwegian Sea.
This applied research context is an excellent jumping-off point as it is relatively small, well-bounded and local.
Research partners’ connections also mean I’m likely to have excellent access. I will be beginning the data collection phase of this project by interviewing scientists studying C. finmarchicus and related systems to understand the kinds of data they are collecting, the analytical methodologies they are using, and how their work feeds into decision-making.
I’ll also ask them about the barriers to trustworthiness and trust they experience and identify.
I’ll be interviewing members of other stakeholder groups as well, but as I will have access to many researchers (a captive audience!) as a member of a planned field expedition to Svalbard in January 2022, interviewing the scientists is an obvious starting point.
Expect some updates from the polar night after I return!
And there you have it. A starting point for a larger research plan that will extend my research to other applied contexts and identify particularly compelling trustworthiness and/or trust issues. More about that later.
For now, there’s plenty to be getting on with. For example, I am going to need some really good boots.
About Calanus finmarchicus – an ecological key species
The copepod by the Norwegian name “Raudåte” (Calanus finmarchicus) is an ecologically important species in the Norwegian and Barents Sea. In the food chain, it is located between algae and fish, and is a key species in transferring energy between primary producers and fish species. Off the coast of Norway, it periodically constitutes 90% of the standing stock of zooplankton.
About the Andre Hoffmann Fellowship for the Fourth Industrial Revolution:
The André Hoffmann Fellowship for the Fourth Industrial Revolution offers early-career academics the opportunity to work at the intersection of society, science and technology through a joint appointment between the World Economic Forum and leading academic institutions.
This particular fellowship is sponsored by the World Economic Forum’s Andre Hoffmann Fellowships for the Fourth Industrial Revolution, the Norwegian University of Science and Technology (NTNU) Department of Marine Technology, SFI Harvest and C4IR Ocean.