For much of the past year I have been wrestling with a question that began as a faint signal and has since grown into a central line of inquiry in my mind. It concerns a subtle but critical distinction that seems to underpin our entire data-driven enterprise: the boundary between observation and measurement.
This reflection is based on my emergent belief that the scientific community may be so fluent in the language of data that we risk forgetting the profound act of translation that occurs when the world is turned into data. This act has deep ethical consequences, especially given the changes being wrought by the rapid growth of AI.
The inalienable right to observe
Observation is the bedrock of science. It is a primary faculty of a conscious agent: the natural, inalienable right to perceive the signals of the world, to witness a phenomenon, and to ask “Why?”. This fundamental freedom is the engine of all inquiry.
Measurement, or what we might more accurately call datafication, is a different act altogether. It is the secondary, deliberate process of capturing, structuring, and recording an observation. It transforms a fleeting perception into a persistent, shareable, and analysable artifact: data. This is the moment a natural right is translated into a constructed asset, and it is here that the ethical landscape shifts entirely.
A researcher observing a community, a sensor measuring atmospheric pressure, and a database holding a genomic record: these are not points on a frictionless continuum. They are distinct stages, each requiring its own ethical considerations and, crucially, its own layer of consent. One has a right to look through the microscope, but the act of creating a permanent, shareable image of what is seen requires a different and more specific agreement, especially when the subject is a person or a community.
It is in this crucial moment of datafication that we have a blind spot. Our current frameworks are primarily concerned with the governance of data after it exists. We have not yet developed the language or the architecture to ethically manage the act of its creation.
AI’s true and transformative paradigm
The current discourse positions AI primarily as a tool for accelerating the existing scientific paradigm. It is seen as a technology that will optimise search and learning across vast, pre-existing datasets. As Canadian computer scientist Professor Richard Sutton compellingly argues in his highly influential 2019 essay, ‘The Bitter Lesson’, general methods that leverage computation will inevitably outperform those based on domain-specific human knowledge.
But this view frames AI as a competitor to human intellect, destined to render the researcher’s intuition obsolete. A ‘Brighter Lesson’ suggests a different, more collaborative path. The true transformative power of AI is not in its ability to analyse the data we have already collected, but in its potential to augment our capacity for observation and discernment before data is ever created.
Imagine AI not as a data-crunching engine at the end of the pipeline, but as an ethically-bound co-observer at the very beginning. Such an AI can be a tool of incredible discernment, capable of identifying subtle patterns and signals in the world that a human might miss. Its role should be to assist the observer, not just to analyse the measurement.
In this new paradigm, AI operates within a system of info-logical dynamics:
- Assisted observation: An AI, with the explicit consent of the observer, can be tasked to watch over a specific context.
- Discernment as a service: Its primary function is to identify events of interest based on pre-agreed rules and ethical principles.
- Sovereign datafication: Only when a meaningful event is discerned does the system, with consent, datafy the observation. Crucially, this newly created data asset belongs, by default, to the subject of the observation. It is their sovereign property.
This model moves AI from the back office to the field, transforming it from a mere analyst into a trusted instrument of perception.
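To make the three steps listed above concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption and not a reference design: the names `Observation`, `ConsentPolicy`, `SovereignStore` and `co_observe` are hypothetical, consent is reduced to two boolean flags, and the subject's store is a plain in-memory dictionary. What it shows is the ordering that matters: nothing is recorded unless the observer has consented to assistance, a pre-agreed rule has discerned an event, and the subject has consented to datafication.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Observation:
    subject_id: str  # who or what is being observed
    value: float     # a transient signal; not stored unless datafied


@dataclass
class ConsentPolicy:
    # Hypothetical consent flags; a real agreement would be far richer.
    observer_consents: bool
    subject_consents_to_datafication: bool


@dataclass
class SovereignStore:
    # Stand-in for a subject-controlled store: records are keyed by subject.
    records: Dict[str, List[float]] = field(default_factory=dict)

    def datafy(self, obs: Observation) -> None:
        self.records.setdefault(obs.subject_id, []).append(obs.value)


def co_observe(
    stream: Iterable[Observation],
    rule: Callable[[Observation], bool],
    policy: ConsentPolicy,
    store: SovereignStore,
) -> int:
    """Watch a stream and datafy only discerned, consented events.

    Returns the number of data records created.
    """
    # Assisted observation: without the observer's consent, do not watch.
    if not policy.observer_consents:
        return 0
    created = 0
    for obs in stream:
        # Discernment: apply the pre-agreed rule to each observation.
        if rule(obs) and policy.subject_consents_to_datafication:
            # Sovereign datafication: the record goes to the subject's store.
            store.datafy(obs)
            created += 1
    return created


store = SovereignStore()
stream = [Observation("subject-1", v) for v in (0.2, 0.9, 0.95)]
policy = ConsentPolicy(observer_consents=True, subject_consents_to_datafication=True)
created = co_observe(stream, lambda o: o.value > 0.8, policy, store)
print(created, store.records)  # 2 {'subject-1': [0.9, 0.95]}
```

Observations that fail the rule are simply never written anywhere, which is the point of separating observing from datafying.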
An architecture for a new scientific renaissance
This vision is not a theoretical fantasy.
The technical components to build this future exist today. The Solid protocol, a web decentralisation protocol initiated by Sir Tim Berners-Lee, provides the concept of a ‘digital domicile’ or Personal Online Data Store (POD), where individuals can hold their own data with sovereign control. In this model, a research subject’s data, be it genomic, social, or environmental, resides in their own secure POD.
A CKAN-powered open-data platform could serve not only as a repository of raw data, but also as a public, transparent registry of machine-readable agreements that govern how data in sovereign PODs can be accessed, used, and observed.
This is the paradigm shift: from centralised data repositories to a decentralised ecosystem of sovereign agents who grant revocable access to trusted researchers. It embeds the rights of the individual into the very architecture of the scientific process. It creates a high-trust environment where the immense power of AI discernment can be harnessed without sacrificing agency.
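The idea of revocable access granted through registered agreements can be sketched in a few lines of Python. This is a toy model under stated assumptions: `Agreement` and `AgreementRegistry` are hypothetical names, the registry is an in-memory dictionary, and nothing here uses the actual Solid or CKAN APIs. It only illustrates the property the architecture depends on: access is checked against a registry on every request, and the subject can withdraw it at any time.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Agreement:
    # Hypothetical machine-readable agreement between a subject and a researcher.
    subject_id: str
    researcher_id: str
    purpose: str
    revoked: bool = False


class AgreementRegistry:
    """Toy public registry of agreements (an assumption, not a CKAN API)."""

    def __init__(self) -> None:
        self._agreements: Dict[Tuple[str, str], Agreement] = {}

    def grant(self, subject_id: str, researcher_id: str, purpose: str) -> Agreement:
        # The subject grants access for a stated purpose.
        agreement = Agreement(subject_id, researcher_id, purpose)
        self._agreements[(subject_id, researcher_id)] = agreement
        return agreement

    def revoke(self, subject_id: str, researcher_id: str) -> None:
        # The subject withdraws access; the record stays for transparency.
        agreement = self._agreements.get((subject_id, researcher_id))
        if agreement is not None:
            agreement.revoked = True

    def may_access(self, subject_id: str, researcher_id: str) -> bool:
        # Access is allowed only while an unrevoked agreement exists.
        agreement = self._agreements.get((subject_id, researcher_id))
        return agreement is not None and not agreement.revoked


registry = AgreementRegistry()
registry.grant("subject-1", "researcher-9", purpose="air-quality study")
print(registry.may_access("subject-1", "researcher-9"))  # True
registry.revoke("subject-1", "researcher-9")
print(registry.may_access("subject-1", "researcher-9"))  # False
```

Keeping the revoked agreement in the registry, instead of deleting it, is a deliberate choice in this sketch: it preserves a transparent history of who was granted access and when it ended.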
The scientific community has always been the vanguard of progress. It has a historic opportunity to lead not just in technical innovation, but in pioneering the social and ethical frameworks that will define the coming info-logical age. By drawing a bright line between observation and measurement, and by re-imagining AI as a partner in perception, we can build a future where data empowers everyone, starting with the very subjects of our essential and unending observation of the universe.