RDF URIs for UD v2 #613

Open
eroux opened this issue Feb 25, 2019 · 10 comments

@eroux
eroux commented Feb 25, 2019

I want to import a UD-tagged corpus using web annotations. I could define the UD URIs we need myself, but I'm wondering whether there is a project to provide an RDF export of UD v2. There is an RDF version of v1 here (if you click on the small logos on the right). I'm quite used to RDF, so I could help, but I don't have many (time) resources at the moment.

@dan-zeman dan-zeman added this to the later milestone Feb 25, 2019
@dan-zeman
Member

I am not aware of any such project.

@eroux
Author
eroux commented Feb 26, 2019

This seems like a reasonably well-scoped task that we could help with. Can you give some indication of the best way to do so? I think the easiest approach would be a script that reads the data in _data and fetches the labels in the various languages. It would then produce an RDF file that could live on GitHub. Does that sound about right?
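A minimal sketch of what such a script could look like, under stated assumptions: the base URI `http://example.org/ud/v2/` is a placeholder (no official UD v2 URIs are given in this thread), and the `LABELS` dictionary stands in for whatever would actually be read from _data, whose format is not assumed here. It only shows the last step, turning tag/label pairs into Turtle.

```python
# Hypothetical base URI: a placeholder, not an official UD namespace.
BASE = "http://example.org/ud/v2/"

# Stand-in for the tag -> {language: label} data a real script would load.
LABELS = {
    "NOUN": {"en": "noun"},
    "VERB": {"en": "verb", "fr": "verbe"},
}


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(labels, base=BASE):
    """Serialize tag labels as rdfs:label triples in Turtle."""
    lines = ["@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .", ""]
    for tag in sorted(labels):
        # One rdfs:label per language, joined with ';' for the same subject.
        parts = [
            f'rdfs:label "{escape(text)}"@{lang}'
            for lang, text in sorted(labels[tag].items())
        ]
        # Full IRIs are used so that tags containing ':' need no escaping.
        lines.append(f"<{base}{tag}> " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_turtle(LABELS))
```

The output could be committed as a `.ttl` file; an RDF library would be a reasonable replacement for the hand-written serialization once the input format is settled.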

@dan-zeman
Member

It depends on what data you need. The files in _data are updated irregularly, and they record which language-label pairs have documentation pages. That is not the same as being a valid label that can be used in a corpus; at least, it is not (yet) the same.

@eroux
Author
eroux commented Feb 27, 2019

I see, thanks!

@eroux
Author
eroux commented Feb 27, 2019

Here's another file containing UD RDF:

http://www.acoli.informatik.uni-frankfurt.de/resources/olia/ud-pos-link.rdf

I'll contact the authors

@dan-zeman
Member

Yes, Christian Chiarcos from Frankfurt is also the main person behind the UD V1 mapping you linked from your first post.

@chiarcos
Contributor
chiarcos commented Feb 1, 2020

Hi,

apologies for seeing this too late.

For UD data, you can use our CoNLL-to-RDF roundtripping with CoNLL-RDF. This includes support for CoNLL-U, and since December 2019, also CoNLL-U+: https://github.com/acoli-repo/conll-rdf

UD tagsets are covered by OLiA (UD v.1 so far; UD v.2 is still experimental and should be stable before May): https://github.com/acoli-repo/olia/tree/master/owl/stable/ud*; the upcoming revision (no linking models yet) is under https://github.com/acoli-repo/olia/tree/master/owl/experimental/univ_dep/built-from-html.

An earlier version of the UD ontologies was generated directly (on-the-fly!) from the documentation Markdown, but the file structure changed with UD v.2. A prototype (never merged with the main branch) is still available under http://fginter.github.io/docs/, click the RDF buttons.
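To illustrate the kind of mapping a CoNLL-to-RDF conversion involves, here is a minimal sketch that turns one CoNLL-U token line into N-Triples. It is not CoNLL-RDF's implementation or output format: the namespaces `NS` and `DOC` are hypothetical placeholders, and only the ten standard CoNLL-U columns are taken as given.

```python
# The ten standard CoNLL-U columns, in order.
COLUMNS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS",
           "FEATS", "HEAD", "DEPREL", "DEPS", "MISC"]

# Hypothetical namespaces, chosen only for this illustration.
NS = "http://example.org/conll#"
DOC = "http://example.org/doc/s1_"


def literal(value):
    """Format a plain string literal for N-Triples."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def token_to_ntriples(line):
    """Convert one tab-separated CoNLL-U token line to a list of triples."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    subject = f"<{DOC}{row['ID']}>"
    triples = []
    for col in COLUMNS[1:]:
        value = row[col]
        if value == "_":          # "_" marks an empty field in CoNLL-U
            continue
        if col == "HEAD":
            if value == "0":      # the root has no head token to link to
                continue
            obj = f"<{DOC}{value}>"   # link to the head token's IRI
        else:
            obj = literal(value)
        triples.append(f"{subject} <{NS}{col}> {obj} .")
    return triples


if __name__ == "__main__":
    sample = "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_"
    print("\n".join(token_to_ntriples(sample)))
```

For real data, the CoNLL-RDF tooling linked above is the better choice; the sketch only shows why a column-per-property mapping is straightforward.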

@arademaker
Contributor
arademaker commented Feb 1, 2020

One of my former students (Guilherme Passos) has also worked on transforming UD to RDF. The idea is to help identify inconsistent annotations.

We formalized the UD guidelines 2.0 by hand and implemented our own transformation.

Dissertation available at https://www.cos.ufrj.br/uploadfile/publicacao/2858.pdf

@arademaker
Contributor

Maybe @GPPassos can add something here too

@eroux
Author
eroux commented Feb 3, 2020

Thanks a lot, that's very helpful! I've looked at the dissertation but didn't find a link to the .owl file that's reproduced at the end of the PDF. I tried http://www.semanticweb.org/gppassos/ unsuccessfully; do you know if it's available somewhere?

To answer some of the questions: I'm not currently using this in production. I'm exploring the idea of converting a custom annotation format that uses UD (a very simple system based on character coordinates) into proper web annotations.
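A minimal sketch of that conversion, assuming the custom format boils down to a (start, end, tag) triple per annotation. It builds a W3C Web Annotation as JSON-LD with a `TextPositionSelector`; the `UD_BASE` namespace and the example IRIs are hypothetical placeholders, since choosing stable UD URIs is exactly the open question here.

```python
import json

# Hypothetical namespace for UD tags: a placeholder, not an official URI.
UD_BASE = "http://example.org/ud/v2/"


def make_annotation(anno_id, source, start, end, upos):
    """Build a Web Annotation tagging source[start:end] with a UD tag.

    start and end are 0-based character offsets; end is exclusive,
    as defined for TextPositionSelector.
    """
    if start < 0 or end <= start:
        raise ValueError("need 0 <= start < end")
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "classifying",
        "body": UD_BASE + upos,          # the body is the tag's IRI
        "target": {
            "source": source,
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": end,
            },
        },
    }


if __name__ == "__main__":
    anno = make_annotation(
        "http://example.org/anno/1",     # hypothetical annotation IRI
        "http://example.org/text/1",     # hypothetical source document IRI
        0, 4, "NOUN",
    )
    print(json.dumps(anno, indent=2))
```

Using the tag IRI directly as the body keeps each annotation small; a richer body (for example one that also carries features) would need a structured resource instead.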
