Automatic FAIRness on DeSci Publish: What the Evidence Shows

FAIR principles are now deeply embedded in research policy. Funding bodies, journals, and institutions increasingly require that research outputs be findable, accessible, interoperable, and reusable (FAIR). Despite this, FAIR compliance remains uneven across disciplines and repositories.

Most researchers understand the principles in broad terms. The difficulty lies in implementation. FAIR requirements often depend on technical details that sit outside normal research workflows, such as persistent identifiers, structured metadata, licensing clarity, and provenance tracking. As a result, FAIR is frequently treated as a reporting obligation rather than as a property of the research itself.

This gap between principle and practice has led to the development of automated FAIR indicators, which provide a way to assess whether research outputs meet baseline technical expectations for reuse.

From Principles to Automated Indicators

The FAIR principles were intentionally written at a high level. They describe outcomes rather than prescribing specific technologies. To make FAIR measurable, the research data community has translated these principles into operational indicators that can be tested programmatically.

Automated FAIR assessment tools evaluate questions such as whether a research object has a globally unique identifier, whether metadata is machine-readable, whether access protocols are standardized, and whether reuse conditions are explicit. Tools such as F-UJI implement these checks to produce reproducible assessments across large collections of research outputs.
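
To make this concrete, here is an illustrative Python sketch of the kind of checks such tools run. It is not F-UJI's implementation; the identifier is hypothetical, and real assessment suites apply many more tests per principle.

```python
import requests

def looks_fair_findable(identifier_url: str) -> dict:
    """Illustrative checks in the spirit of automated FAIR indicators:
    does the identifier resolve, and is machine-readable metadata offered?
    (Hypothetical sketch; tools such as F-UJI run many more tests.)"""
    results = {}

    # F1/A1-style check: the persistent identifier should resolve over a
    # standard protocol (HTTP) without manual intervention.
    resp = requests.head(identifier_url, allow_redirects=True, timeout=10)
    results["identifier_resolves"] = resp.status_code == 200

    # F2/I1-style check: ask for machine-readable metadata via content
    # negotiation rather than scraping a human-oriented landing page.
    resp = requests.get(
        identifier_url,
        headers={"Accept": "application/ld+json"},
        timeout=10,
    )
    results["machine_readable_metadata"] = (
        resp.ok and "json" in resp.headers.get("Content-Type", "")
    )
    return results

# Hypothetical identifier, used purely for illustration.
print(looks_fair_findable("https://doi.org/10.1234/example"))
```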

These assessments are necessarily limited. They do not evaluate scientific quality, completeness, or disciplinary appropriateness. What they do reveal, however, is whether infrastructure choices systematically support or hinder FAIR alignment.

Across many repositories, automated FAIR scores remain modest. Common failure points include missing metadata, ambiguous licensing, delayed identifier assignment, and incomplete provenance information.

Why Infrastructure Shapes FAIR Outcomes

One of the most consistent findings across FAIR evaluations is that researcher intent is rarely the limiting factor. Researchers seldom choose to make their work less reusable on purpose; instead, they work within systems that make certain outcomes easier than others.

When identifiers are optional, metadata fields are free-text, and version history is informal, FAIR compliance becomes fragile. When these elements are treated as defaults and enforced by design, compliance becomes routine.

This distinction is important because it reframes FAIR not as a behavioral problem, but as an infrastructure problem.

A Case Study in Infrastructure-Led FAIR Design

DeSci Publish provides a concrete example of how FAIR-related requirements can be addressed at the system level. The platform’s technical documentation describes a formal FAIR Implementation Profile, outlining how specific architectural choices map to individual FAIR criteria.

Several aspects of this design are particularly relevant to automated FAIR indicators.

Persistent Identifiers From the Start

Research objects created on DeSci Publish receive decentralized persistent identifiers at the moment of creation rather than at the end of the publication process. These identifiers are version-aware and resolvable, supporting both findability and provenance tracking.

From an assessment perspective, early and consistent identifier assignment addresses one of the most common causes of low FAIR scores.
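
Resolvability of this kind is easy to verify mechanically. The sketch below checks that both the latest state and a specific version of an identifier resolve over HTTP; the URL pattern is a hypothetical dPID-style scheme and may not match the platform's actual one.

```python
import requests

# Hypothetical version-aware identifier pattern; the concrete dPID URL
# scheme used by DeSci Publish may differ.
BASE = "https://dpid.org/46"          # latest state of the research object
VERSIONED = "https://dpid.org/46/v1"  # a specific, immutable version

for url in (BASE, VERSIONED):
    resp = requests.head(url, allow_redirects=True, timeout=10)
    print(url, "->", resp.status_code)
```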

Structured, Machine-Readable Metadata

Metadata on DeSci Publish is generated using JSON-LD and RO-Crate conventions. This allows metadata to be processed by machines rather than inferred from static documents. The FAIR Implementation Profile explicitly maps this approach to findability, interoperability, and reusability criteria.

Metadata is treated as a primary research characteristic, not as supplementary information.
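
For readers unfamiliar with these conventions, the following sketch assembles a minimal RO-Crate 1.1 style metadata document. All names and values are hypothetical, and an actual DeSci Publish export would carry considerably more detail.

```python
import json

# Minimal RO-Crate 1.1 style JSON-LD document (illustrative values only).
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            # The metadata descriptor entity required by RO-Crate.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # The root dataset describing the research object itself.
            "@id": "./",
            "@type": "Dataset",
            "name": "Example research object",         # hypothetical
            "description": "Analysis code and data.",  # hypothetical
            "datePublished": "2024-01-01",
            "license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
            "hasPart": [{"@id": "analysis.py"}, {"@id": "results.csv"}],
        },
        {"@id": "analysis.py", "@type": "File", "encodingFormat": "text/x-python"},
        {"@id": "results.csv", "@type": "File", "encodingFormat": "text/csv"},
    ],
}

with open("ro-crate-metadata.json", "w") as fh:
    json.dump(crate, fh, indent=2)
```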

Defaults That Reduce Failure Modes

The documentation states that when optional metadata such as keywords or licenses is missing, defaults are applied: licenses are made explicit, and descriptive metadata is inferred where possible.

This design choice is significant because automated FAIR assessments frequently fail research objects due to missing administrative metadata rather than substantive shortcomings.
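
The effect of such defaults can be sketched as follows. The fallback values are assumptions chosen for illustration, not the platform's documented policy.

```python
# Illustrative metadata-default logic (assumed fallback values, not the
# platform's actual rules).
DEFAULT_LICENSE = "https://creativecommons.org/licenses/by/4.0/"

def apply_metadata_defaults(record: dict) -> dict:
    """Fill in administrative metadata that automated FAIR checks
    commonly fail on when it is absent."""
    filled = dict(record)

    # An explicit license is required for the 'reusable' dimension.
    filled.setdefault("license", DEFAULT_LICENSE)

    # Keywords can be inferred from the title when none are supplied.
    if not filled.get("keywords"):
        filled["keywords"] = [
            word.lower()
            for word in filled.get("title", "").split()
            if len(word) > 4
        ]
    return filled

print(apply_metadata_defaults({"title": "Neural decoding of retinal signals"}))
```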

File-Level Transparency

All files within a research object are enumerated with file names, sizes, formats, and access paths. This supports both discoverability and reuse, particularly for computational research where file structure matters.

Incomplete file inventories are a common source of poor FAIR performance in traditional repositories.
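
Generating such a manifest is mechanically simple, which is what makes its absence in many repositories notable. The sketch below shows the general idea; the output fields are assumed, not the platform's export format.

```python
import mimetypes
from pathlib import Path

def build_manifest(root: str) -> list[dict]:
    """Enumerate every file in a research object with its access path,
    size, and format (illustrative; field names are assumptions)."""
    manifest = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            mime, _ = mimetypes.guess_type(path.name)
            manifest.append({
                "path": str(path.relative_to(root)),
                "size_bytes": path.stat().st_size,
                "format": mime or "application/octet-stream",
            })
    return manifest

for entry in build_manifest("."):
    print(entry)
```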

Provenance and Version History

Updates to research objects create new, traceable versions with an immutable history. Provenance is encoded as part of the system rather than relying on informal documentation. This directly supports reuse and verification indicators.
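
Conceptually, an immutable version history behaves like an append-only chain in which each version commits to its predecessor by hash. The sketch below illustrates that idea only; it is not DeSci Publish's implementation.

```python
import hashlib
import json
import time

def new_version(history: list[dict], payload: dict) -> list[dict]:
    """Append an immutable version record that commits to the previous
    version via its hash (conceptual illustration only)."""
    prev_hash = history[-1]["hash"] if history else None
    record = {
        "version": len(history) + 1,
        "timestamp": time.time(),
        "payload": payload,
        "prev_hash": prev_hash,
    }
    # Hash the full record so any later tampering is detectable.
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return history + [record]

history = new_version([], {"title": "v1 of the dataset"})
history = new_version(history, {"title": "v2 with corrected metadata"})
for rec in history:
    print(rec["version"], rec["hash"][:12], "<-", rec["prev_hash"])
```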

What the Automated Assessments Show

To evaluate how platform-level infrastructure choices affect measurable FAIR performance, research objects hosted on DeSci Publish were assessed using the F-UJI automated FAIR assessment tool.

In total:

  • 309 research objects were evaluated
  • 17 FAIR indicators were applied across all four FAIR dimensions
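
A batch run of this kind can be reproduced against a self-hosted F-UJI server. The endpoint and payload below follow F-UJI's published REST API at the time of writing; the credentials and identifiers are placeholders, so consult the current F-UJI documentation before relying on them.

```python
import requests

# Batch-scoring sketch against a locally running F-UJI server
# (default port 1071; credentials are set in the server's config).
FUJI_URL = "http://localhost:1071/fuji/api/v1/evaluate"
AUTH = ("username", "password")  # placeholder credentials

identifiers = [
    "https://doi.org/10.1234/example-1",  # hypothetical identifiers
    "https://doi.org/10.1234/example-2",
]

scores = []
for pid in identifiers:
    resp = requests.post(
        FUJI_URL,
        json={"object_identifier": pid, "use_datacite": True},
        auth=AUTH,
        timeout=120,
    )
    resp.raise_for_status()
    # The response summary reports percentage scores per FAIR dimension.
    summary = resp.json().get("summary", {})
    fair_percent = summary.get("score_percent", {}).get("FAIR")
    scores.append(fair_percent)
    print(pid, "->", fair_percent)

valid = [s for s in scores if s is not None]
if valid:
    print("mean FAIR score:", sum(valid) / len(valid))
```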

Before optimization, the same research objects achieved mean FAIR scores of approximately 30 percent. At that stage, much of the required FAIR information already existed within the platform, but it was not exported in formats that automated assessment tools could reliably interpret.

After updating how metadata is structured and exported in machine-readable, FAIR-compliant formats on the DeSci Publish infrastructure, the mean FAIR score increased to 88.6 percent. As a result, 94.8 percent of research objects exceeded an 80 percent FAIR score threshold.

Importantly, this improvement required no additional effort from researchers, nor did it reflect changes to the underlying data itself. This demonstrates that FAIR outcomes can largely be delivered by publication infrastructure, provided researchers still exercise minimum due diligence.

Diversity of Research Outputs

A common concern with automated FAIR scoring is that high scores may only be achievable for narrowly defined data types. The assessed research objects on DeSci Publish span a wide range of formats, including source code, tabular data, structured documents, PDFs, and domain-specific scientific files.

More than 180 distinct file extensions were represented. High FAIR scores were observed across this diversity, indicating that the underlying infrastructure does not rely on format-specific assumptions.

Broader Implications

The documentation and assessment results together reinforce a broader lesson that is increasingly evident across research infrastructure work. FAIR compliance is difficult to achieve when it depends on individual researchers making correct technical decisions under time and incentive constraints.

When identifiers, metadata structure, licensing, indexing, and provenance are treated as defaults rather than options, FAIR alignment becomes a routine outcome rather than an exceptional one.

This does not remove the need for good data practices or domain expertise. It does, however, shift responsibility for FAIR mechanics away from individuals and into systems that are better positioned to apply them consistently.

Closing Reflection

FAIR compliance is often treated as an abstract ideal or an additional burden placed on researchers. The evidence reviewed here points to a more practical conclusion: when FAIR principles are built directly into publication infrastructure, compliance becomes the default rather than the exception. On DeSci Publish, researchers are able to share their work in a FAIR-compliant way without needing to understand the underlying standards or perform additional technical steps themselves. The platform automatically handles persistent identifiers, machine-readable metadata, file descriptions, licensing, and provenance, allowing researchers to focus on their research rather than on compliance mechanics. This case does not resolve every open question around FAIR assessment or long-term reuse. It does, however, demonstrate that high FAIR performance can be achieved when infrastructure takes responsibility for the technical details and does the heavy lifting automatically on behalf of the user.
