2025/02/18

Implementing AI to further scale and accelerate WorldCat de-duplication

The OCLC Metadata Quality team continuously enhances WorldCat data through manual and automated processes, ensuring its accuracy for global libraries. By integrating AI and human expertise, they refine metadata, improve duplicate detection, and enhance resource discovery.

In August 2023, OCLC launched a machine learning model to detect duplicate bibliographic records. The model incorporated feedback from more than 300 catalogers on 34,000 records and led to the removal of 5.4 million duplicates.

Now, AI-driven de-duplication expands to all formats, languages, and scripts. A test run in February 2025 merged 500,000 duplicate English print records, with broader cleanup efforts to follow. Libraries should enable WorldCat updates for seamless integration.