Dear community,
I'm taking the liberty of opening an issue based on some earlier discussions in the Dataverse-Users group. I'm trying to harvest some Zenodo communities. My harvesting client is set up as follows:
- Alias: zenodo_test
- Server URL: http://www.zenodo.org/oai2d
- Local dataverse: zenodo_test_couperin
- OAI set: user_couperin
- Metadata format: oai_dc
- Repository type: Generic OAI archive
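For anyone who wants to inspect what the server returns for this set, the configuration above corresponds roughly to a standard OAI-PMH `ListRecords` request. The snippet below is only a sketch that builds the request URL; the helper name `build_list_records_url` is made up for illustration, and the base URL is the `https://zenodo.org/oai2d` form that appears in the log rather than the one entered in the client.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix, oai_set=None):
    """Build an OAI-PMH ListRecords URL (hypothetical helper, not Dataverse code)."""
    # verb, metadataPrefix and set are standard OAI-PMH request arguments.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if oai_set:
        params["set"] = oai_set
    return f"{base_url}?{urlencode(params)}"


# Values taken from the client configuration above.
url = build_list_records_url("https://zenodo.org/oai2d", "oai_dc", "user_couperin")
print(url)
```

Opening the resulting URL in a browser should show the raw `oai_dc` records, including the `dc:language` values that the harvester later tries to import.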
Only 5 of the 20 datasets listed here are harvested: https://zenodo.org/communities/couperin/?page=1&size=20
The log shows errors like this one:
Exception processing getRecord(), oaiUrl=https://zenodo.org/oai2d, identifier=oai:zenodo.org:3773762, edu.harvard.iq.dataverse.api.imports.ImportException, Failed to import harvested dataset: class edu.harvard.iq.dataverse.util.json.ControlledVocabularyException (Value 'eng' does not exist in type 'language')
(also: "Value 'fra' does not exist...")
Example of a harvested record: https://zenodo.org/record/4266132
Example of a record that was not harvested: https://zenodo.org/record/3948266
An exchange with @qqmyers (thanks!) linked this problem to a similar one involving a SWORD Atom file import:
"With a quick look, it appears that that element is getting mapped to the Language field in the citation metadata block, whose values are controlled by the list of languages available. Those entries are all complete words (e.g. “English”) rather than the 2 or 3 letter ISO language codes. I haven’t checked to see if there’s an open issue to add support for language codes – might be something that could be addressed via an external service as being discussed in the CVV MDWG."
So the problem seems almost trivial, but it still blocks harvesting. Until further development solves it, is there a workaround I could try? Perhaps a trick to ignore the language field altogether?
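To make the idea concrete, here is a minimal sketch of the kind of mapping that would avoid the error: ISO 639 codes are translated to full language names, and values that cannot be mapped are dropped instead of failing the whole import. This is not Dataverse code. The function names and the small lookup table are hypothetical, and the assumption that "French" is the vocabulary entry matching `fra` has not been checked against the actual citation metadata block.

```python
# Hypothetical lookup table: ISO 639 codes -> assumed controlled-vocabulary names.
ISO_TO_VOCAB = {
    "en": "English",
    "eng": "English",
    "fr": "French",
    "fra": "French",
    "fre": "French",
}
VOCAB_NAMES = set(ISO_TO_VOCAB.values())


def normalize_language(value):
    """Return the vocabulary name for a language value, or None if unknown."""
    cleaned = value.strip()
    if cleaned in VOCAB_NAMES:
        # Already a full language name, keep it unchanged.
        return cleaned
    return ISO_TO_VOCAB.get(cleaned.lower())


def clean_languages(values):
    """Map known values, skip unknown ones, and remove duplicates in order."""
    kept = []
    for value in values:
        name = normalize_language(value)
        if name is not None and name not in kept:
            kept.append(name)
    return kept


print(clean_languages(["eng", "fra", "xyz", "English"]))
```

Dropping unknown values corresponds to the "ignore the language field" fallback: the record would still be imported, only without a language entry.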
Best
Thomas