ChatGPT, Claude, Grok, and other generic large language models (LLMs) excel at summarizing, drafting, and coding support – but their limits quickly show when applied to highly specialized technical domains like subsurface geoscience. Vast datasets, inconsistent formats, and complex terminology demand a tailored approach.
To address this, Viridien launched one of the largest geoscience annotation initiatives to date.
Starting with a collection of publicly available well data – over 65 million documents from 10 million wells worldwide – subject matter experts have carefully curated and annotated more than one million data points.
Each annotation is guided by a detailed taxonomy of 30 document classes, 90 table classes, and 50 figure classes, ensuring broad coverage across document types and geoscience disciplines.
The annotation process blends human expertise with AI feedback loops. By monitoring model errors during curation, the team continuously refines both the taxonomy and the dataset, capturing edge cases and improving class balance. This iterative approach not only ensures quality but also drives measurable gains in model performance.
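The error-monitoring step can be pictured with a small sketch. It is not Viridien's pipeline: the function name, the thresholds, and the class labels (`well_log`, `core_report`, `drilling_report`) are all hypothetical, chosen only to show how comparing expert labels with model predictions can surface classes that need more examples or a clearer definition.

```python
from collections import Counter


def per_class_review(records, min_count=2, max_error_rate=0.25):
    """Flag classes that look under-represented or error-prone.

    `records` is an iterable of (expert_label, model_label) pairs.
    The thresholds are illustrative assumptions, not published values.
    """
    totals = Counter()
    errors = Counter()
    for expert_label, model_label in records:
        totals[expert_label] += 1
        if model_label != expert_label:
            errors[expert_label] += 1

    flagged = {}
    for label, count in totals.items():
        error_rate = errors[label] / count
        reasons = []
        if count < min_count:
            reasons.append("under-represented")
        if error_rate > max_error_rate:
            reasons.append("high error rate")
        if reasons:
            flagged[label] = {
                "count": count,
                "error_rate": error_rate,
                "reasons": reasons,
            }
    return flagged


# Hypothetical annotations: (expert label, model prediction)
sample = [
    ("well_log", "well_log"),
    ("well_log", "well_log"),
    ("well_log", "core_report"),
    ("well_log", "well_log"),
    ("core_report", "well_log"),
    ("core_report", "core_report"),
    ("drilling_report", "drilling_report"),
]

flagged = per_class_review(sample)
for label, info in flagged.items():
    print(label, info)
```

In this toy sample, the classes flagged for review would be candidates for additional annotation or for a revised taxonomy definition before the next training round.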
The payoff is clear: fine-tuned classifiers and embedding models trained on this annotated dataset consistently outperform generic LLMs in classification accuracy and document retrieval relevance. In other words, they move beyond hype and into practical, explainable AI tailored to subsurface workflows.
At the Dig X Subsurface 2025 conference in December, Jamie Richardson (Viridien) will share how this evolving dataset is laying the groundwork for operational AI in geoscience — helping industry teams turn decades of underutilized data into discovery.
Join us at Scandic Fornebu, December 3-4, 2025, to learn more. The program can be found on the conference website.
![Fine-tuning LLMs for geoscience](https://geo365.no/wp-content/uploads/2025/10/1000_Fig-1-Richardson-1024x675.jpg)