Harvesting client fails : Zenodo "Couperin" community / error with "language" field mapping. #7638

@tjouneau

Dear community,
I'm taking the liberty of opening an issue based on earlier discussions in the Dataverse-Users group. I'm trying to harvest some Zenodo communities. My client is set up as follows:

  • Alias: zenodo_test
  • Server URL: http://www.zenodo.org/oai2d
  • Local dataverse: zenodo_test_couperin
  • OAI set: user_couperin
  • Metadata format: oai_dc
  • Repository type: Generic OAI archive
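
For anyone who wants to check what the set exposes independently of Dataverse, the settings above correspond to a standard OAI-PMH `ListIdentifiers` request. The sketch below only builds the request URL; the helper name `build_oai_request` is made up for illustration, and the base URL is the one that appears in the error log further down.

```python
from urllib.parse import urlencode


def build_oai_request(base_url, verb, **params):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    # Hypothetical helper: OAI-PMH passes the verb and its arguments as query parameters.
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


url = build_oai_request(
    "https://zenodo.org/oai2d",
    "ListIdentifiers",
    metadataPrefix="oai_dc",
    set="user_couperin",
)
print(url)
```

Opening the resulting URL in a browser should list the record identifiers in the set, which can then be compared with what was actually harvested.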

Only 5 of the 20 datasets listed at https://zenodo.org/communities/couperin/?page=1&size=20 are harvested. The log shows errors such as:

Exception processing getRecord(), oaiUrl=https://zenodo.org/oai2d, identifier=oai:zenodo.org:3773762, edu.harvard.iq.dataverse.api.imports.ImportException, Failed to import harvested dataset: class edu.harvard.iq.dataverse.util.json.ControlledVocabularyException (Value 'eng' does not exist in type 'language')
(also: "Value 'fra' does not exist...")

Harvested for example: https://zenodo.org/record/4266132
Not harvested: https://zenodo.org/record/3948266

An exchange with @qqmyers (thanks!) linked this problem to a similar one with a SWORD Atom file import:
"With a quick look, it appears that that element is getting mapped to the Language field in the citation metadata block, whose values are controlled by the list of languages available. Those entries are all complete words (e.g. “English”) rather than the 2 or 3 letter ISO language codes. I haven’t checked to see if there’s an open issue to add support for language codes – might be something that could be addressed via an external service as being discussed in the CVV MDWG."
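
To make the mismatch concrete, here is a minimal sketch of the kind of translation that seems to be missing: turning an ISO 639 code into the full name the controlled vocabulary expects. This is not Dataverse code. The table `ISO_639_TO_NAME` and the function `normalize_language` are hypothetical, the table covers only the two languages from the errors above, and "French" as the vocabulary entry is an assumption.

```python
# Hypothetical subset of a code-to-name table; a real one would cover far more languages.
ISO_639_TO_NAME = {
    "en": "English", "eng": "English",
    "fr": "French", "fra": "French", "fre": "French",
}


def normalize_language(value):
    """Return the full language name for an ISO 639 code, or None if unknown."""
    key = value.strip().lower()
    if key in ISO_639_TO_NAME:
        return ISO_639_TO_NAME[key]
    # The value may already be a full name such as "English"; accept it if known.
    names = {name.lower(): name for name in ISO_639_TO_NAME.values()}
    return names.get(key)


print(normalize_language("eng"))
print(normalize_language("fra"))
```

Returning `None` for unknown values lets the caller decide whether to drop the field or reject the record, rather than failing the whole import.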

So the problem seems almost trivial, but it still blocks the harvesting. Until a fix is developed, is there a workaround I could try? Perhaps a trick to ignore the language field altogether?
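
To illustrate what "ignoring the language field" would mean at the metadata level, the sketch below removes every `dc:language` element from an `oai_dc` record. It is only an illustration under assumptions: I don't know of a Dataverse setting that does this, so it would have to run in some intermediate step outside Dataverse. The function name `drop_language` and the sample record are invented.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def drop_language(oai_dc_xml):
    """Return the record as a string with all dc:language elements removed."""
    root = ET.fromstring(oai_dc_xml)
    # Snapshot the elements first so the tree is not modified while iterating.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == f"{{{DC_NS}}}language":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Invented sample record, not taken from Zenodo.
sample = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Example</dc:title><dc:language>eng</dc:language></oai_dc:dc>"
)
cleaned = drop_language(sample)
print(cleaned)
```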

Best

Thomas
