Tuesday, June 9, 2026


Meet PosterSentry: Guarding the Gate to FAIR Poster Data


Scientific posters hold some of the earliest results in research, yet most are seen once at a conference and then disappear. Posters.science exists to change that. Supported by a grant from The Navigation Fund, it is a platform that makes the knowledge inside posters FAIR (Findable, Accessible, Interoperable, and Reusable) and ready for AI. It works in two ways. It gives researchers a simple place to share their own posters, and it gathers posters that already live in other repositories, such as Zenodo and Figshare, so that all of them can be discovered in one place.

As we began gathering those posters, we ran into an unexpected problem. Many of the records labeled "poster" on those repositories are not posters at all. They are conference programs, abstracts, slide decks, and other documents filed under the same label. Before we can make posters easy to find, we have to be able to tell which files are genuinely posters.

That is exactly the problem PosterSentry solves. It is also the first of several tools we are developing with COMET, a large-scale collaboration to improve metadata quality, with the aim of enhancing poster metadata at the source in DataCite.

🛡️ What PosterSentry does

PosterSentry looks at a PDF and decides, quickly and cheaply, whether it is a genuine scientific poster. We deliberately kept it lightweight. Instead of running an expensive AI model on every file, it pairs a small modern language model with classic, well-understood machine learning. The resulting pipeline can run on an ordinary laptop with no special hardware and screen more than 30,000 candidate files in under half an hour.
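To make the "embedding plus classic classifier" idea concrete, here is a minimal sketch of that general pattern. It is not PosterSentry's code: a toy bag-of-words vector stands in for the language model's embedding, a nearest-centroid rule stands in for the classic machine learning step, and the function names (`embed`, `fit_centroids`, `classify`) and the tiny training set are all hypothetical.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy stand-in for a language-model embedding: L2-normalised word counts."""
    counts = Counter(text.lower().split())
    vec = [float(counts[tok]) for tok in vocab]  # Counter yields 0 for unseen tokens
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def fit_centroids(examples):
    """Build a vocabulary and one mean vector (centroid) per label."""
    vocab = sorted({tok for text, _ in examples for tok in text.lower().split()})
    grouped = {}
    for text, label in examples:
        grouped.setdefault(label, []).append(embed(text, vocab))
    centroids = {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }
    return vocab, centroids

def classify(text, vocab, centroids):
    """Return the label whose centroid has the highest dot product with the text."""
    vec = embed(text, vocab)

    def score(label):
        return sum(a * b for a, b in zip(vec, centroids[label]))

    return max(centroids, key=score)

# Hypothetical training examples (text already extracted from each PDF).
TRAIN = [
    ("introduction methods results conclusions acknowledgements", "poster"),
    ("background methods results references", "poster"),
    ("conference program schedule session room", "not_poster"),
    ("slide agenda outline questions slide", "not_poster"),
]

vocab, centroids = fit_centroids(TRAIN)
print(classify("methods results conclusions", vocab, centroids))
print(classify("conference program schedule", vocab, centroids))
```

The appeal of this shape is that the expensive part (turning a document into a vector) happens once per file, and the decision itself is a handful of arithmetic operations, which is what keeps a screening step cheap enough to run on everything.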

That efficiency is the whole point. Because PosterSentry runs first, before any of the heavier processing, it has to be cheap enough to check every file. On a heuristically derived training and testing set, it classifies posters with about 87% accuracy. When we ran it across the more than 30,000 PDFs gathered from records labeled "poster" on Zenodo and Figshare, it narrowed them to roughly 24,000 files it classifies as posters, filtering out the conference programs, abstracts, and slide decks that did not belong.
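For a rough sense of scale, the figures quoted in this post can be turned into a per-file budget. The sketch below uses the rounded numbers (30,000 files, 30 minutes, 24,000 kept) as assumptions, so the results are bounds and approximations rather than measurements; `screening_budget` is a hypothetical helper name.

```python
def screening_budget(n_files, seconds):
    """Return (files per second, average milliseconds per file)."""
    return n_files / seconds, 1000 * seconds / n_files

# Rounded figures from the post: "more than 30,000" files in "under half an hour".
files_per_sec, ms_per_file = screening_budget(30_000, 30 * 60)

# Share of candidate files kept after screening (roughly 24,000 of 30,000).
kept_fraction = 24_000 / 30_000

print(f"at least {files_per_sec:.1f} files/s, at most {ms_per_file:.0f} ms per file")
print(f"about {kept_fraction:.0%} of candidates kept")
```

Note that the share of files kept is a different quantity from the roughly 87% accuracy figure: the first describes how many candidates passed the filter, the second how often the classifier agreed with the labels in its test set.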

✨ Why this matters for FAIR data

The FAIR principles say research outputs should be Findable, Accessible, Interoperable, and Reusable. None of that holds if the underlying labels are wrong. A poster cannot be found among the right results if it is filed as a conference program, and it cannot be reused if a tool meant for posters keeps choking on slide decks.

Getting the labels right is the unglamorous foundation that FAIR depends on. PosterSentry is that foundation, built to run cheaply across a whole repository so that a large and overlooked category of research finally has a real chance to join the trustworthy scholarly record. Researchers gain attribution and discoverability with no extra effort on their part, and the open-science community ends up with a more complete picture of the work being done. This is why we are collaborating with COMET, both to improve PosterSentry and to use it to enhance poster records right at the source in the DataCite database.

⏱️ Help us validate it: two minutes

PosterSentry's results are only as trustworthy as the labels behind them, and that is where we need the community. We have assembled 3,606 candidate posters into a public survey, and the human judgments we collect become the gold-standard benchmark behind the PosterSentry publication. This is not a side activity. It is how the work gets validated.

survey.posters.science

It's easy: you'll see a file and just tell us whether it looks like a conference poster. That's all there is to it. No account, no setup, no expertise needed, and you can watch participation climb at survey.posters.science/metrics. If you have colleagues or a lab group who care about open science, please pass it along. Every judgment makes the benchmark stronger for everyone who reuses it.

🎤 Come find us at BOSC and ISMB 2026

PosterSentry has been accepted for presentation (talk and poster) at the Bioinformatics Open Source Conference (BOSC) at ISMB 2026, where we will share the full results and the road ahead with COMET. If you work on open science, metadata, or scholarly communication, come say hello. We are actively looking for collaborators.

🔓 Everything is open

PosterSentry is fully open source under the MIT License. The model, code, and training data are all public. Taking the survey at survey.posters.science is the most useful thing you can do to help right now.

