Update guidelines for fixed, flat, compound and be more careful about the term "multiword expression" #989

Closed
nschneid opened this issue Nov 5, 2023 · 16 comments

Comments

@nschneid
Contributor
nschneid commented Nov 5, 2023

It was agreed at the May 2023 Dagstuhl Seminar that the UD guidelines are not clear enough about the distinction between MWEs marked by semantic idiosyncrasy, which may be associated with a variety of kinds of expressions, and syntactic criteria for the fixed, flat, and compound relations. (Dagstuhl report, p. 45)

@jnivre has led a rewrite effort. This issue will track the guidelines updates.

@nschneid
Contributor Author
nschneid commented Nov 5, 2023

^ adds a new heading to illustrate fixed/flat vs. compound. If we keep this, a link should be added in the outline at the top of the page. But there's still no real explanation in the syntax overview of what compound is for, not even under nominals.

@jnivre
Contributor
jnivre commented Nov 5, 2023

I am not sure such an explanation is needed in the syntax overview, which does not provide an exhaustive discussion of all relations, but is a construction-oriented description focusing on major construction types like nominals and clauses. In the case of compound, I think the relation definition is sufficient in itself. Moreover, since compounding is not restricted to nouns, it doesn't really belong in the nominals section either (except if we want to mention the special case of the nominal head being a noun-noun compound). A general discussion of compounding would have to go into a (new) section on "word formation" or "lexical relations".

@nschneid
Contributor Author
nschneid commented Nov 5, 2023

How about ^ this? It generalizes the MWE section to also cover compounds and headless structures.

@jnivre
Contributor
jnivre commented Nov 6, 2023

I see the point of doing this, but the current version is a bit of a rag-bag. Compounds and MWEs sort of fit together as "things that behave like single words", but flat expressions go beyond this (because they may be composed of phrases).

@nschneid
Contributor Author
nschneid commented Nov 6, 2023

I do think most flat expressions, like compounds and fixed expressions, are "quasi-lexical"—they tend to be proper names, or else borrowed foreign phrases that would be listed in a dictionary. The main exception is quoted foreign material. But apart from that, if two large units (e.g. sentences) appear next to each other without grammatical linkage, we don't call it flat—we call it list or parataxis.

@jnivre
Contributor
jnivre commented Nov 9, 2023 via email

@sylvainkahane
Contributor

The definition starts with "Structures analyzed with fixed and flat are headless by definition".
I really don't like the examples given just after that, because both of them clearly have an internal syntactic structure:

  1. "in spite of": OK, it is a frozen expression, but it has a very clear syntactic structure, exactly like "in front of", "in support of", "in terms of", etc., which have a normal analysis.
  2. "Martin Luther King": there are occurrences of "Luther King" without "Martin". So it is not so flat; I think "Luther King" works as the last name and should be a phrase.
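
For reference, the flat analysis under discussion can be sketched in a few lines of Python. This is only an illustration, not part of the guidelines or of any UD tool: the tuple layout and the helper name `headless_violations` are hypothetical, and the check assumes the usual UD convention that every later token of a fixed or flat expression attaches directly to the first token, with no internal chaining.

```python
# Hypothetical toy tokens: (id, form, head, deprel); head 0 marks the root.
NAME = [
    (1, "Martin", 0, "root"),
    (2, "Luther", 1, "flat"),
    (3, "King", 1, "flat"),
]

# Relations treated as headless in this sketch.
HEADLESS = {"fixed", "flat"}


def headless_violations(tokens):
    """Return ids of fixed/flat dependents that break the assumed convention.

    A dependent is flagged if its head does not precede it, or if its head
    is itself a fixed/flat dependent (i.e. the expression is chained).
    """
    deprel_of = {tok_id: deprel for tok_id, _, _, deprel in tokens}
    bad = []
    for tok_id, _, head, deprel in tokens:
        if deprel not in HEADLESS:
            continue
        if head >= tok_id or deprel_of.get(head) in HEADLESS:
            bad.append(tok_id)
    return bad


print(headless_violations(NAME))  # []
```

Under this convention the annotation cannot express that "Luther King" forms a unit of its own, which is the limitation the second example points at.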

@nschneid
Contributor Author
nschneid commented Nov 9, 2023

I'm open to suggestions on reorganizing/changing examples. I just think it would be strange not to acknowledge compound as part of the general guidelines pages for UD syntax.

Just noticed this page also needs updating: https://universaldependencies.org/u/overview/syntax.html In fact, does all the discussion of MWEs belong there rather than under Other Constructions?

@sylvainkahane
Contributor

> I'm open to suggestions on reorganizing/changing examples.

Examples for fixed should be chosen among the things that have been annotated fixed: https://universal.grew.fr/?custom=654dc47a5079f
But even here, I am not sure I agree with all the choices made. Constructions such as in order to, instead of, according to, etc could be analyzed with a very standard syntactic structure. I do have the impression that we're mixing semantic criteria (frozenness) with syntactic ones. This is also a consequence of UD's initial choice to treat adpositions as dependents.

@nschneid
Contributor Author

@sylvainkahane I agree that the English fixed list is somewhat ad hoc (see e.g. UniversalDependencies/UD_English-EWT#400), and one of the priorities in connection with UniDive should be to decide what to do about that. But for the time being, we just need some good examples for the general guidelines.

@amir-zeldes
Contributor

> Constructions such as in order to, instead of, according to, etc could be analyzed with a very standard syntactic structure

I think that's a language-specific guidelines choice. In the case of these English items I agree that they are parallel to other constructions which are analyzed transparently, but this is true for almost any language - in UD French GSD we have "en ->fixed vigueur", which has a special legal meaning (a law is "in force"), but there is nothing special about the syntax AFAIK. In the English case we have some legacy choices from older editions of UD, and maybe those should have been decided differently, but I do not think it's unusual for fixed expressions to have some historically transparent syntax.

As a matter of principle, I think once we say something is fixed in UD, we assert that we no longer care about its etymological structure, even if it is transparent - we are saying that it functions as a unit. A better example is maybe "as well as", which also has a transparent etymological structure, but synchronically acts like a single coordinator, similar to "and" - so fixed says we recognize it as a multi-part version of "and", regardless of its origin.
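
To make the "as well as" point concrete, here is a minimal Python sketch. It is not taken from the thread or from any treebank: the toy sentence "cats as well as dogs", the tuple layout, and the helper names `fixed_expressions` and `external_relation` are assumptions for illustration. It assumes the usual UD convention that the later words of a fixed expression attach to its first word with `fixed`, so the whole expression relates to the rest of the sentence through a single `cc` edge, much as a one-word coordinator such as "and" would.

```python
# Hypothetical toy sentence: "cats as well as dogs".
# Each token is (id, form, head, deprel); head 0 marks the root.
TOKENS = [
    (1, "cats", 0, "root"),
    (2, "as", 5, "cc"),
    (3, "well", 2, "fixed"),
    (4, "as", 2, "fixed"),
    (5, "dogs", 1, "conj"),
]


def fixed_expressions(tokens):
    """Map the first token id of each fixed expression to its surface string."""
    groups = {}
    for tok_id, _, head, deprel in tokens:
        if deprel == "fixed":
            groups.setdefault(head, []).append(tok_id)
    forms = {tok_id: form for tok_id, form, _, _ in tokens}
    result = {}
    for head, deps in groups.items():
        ids = sorted([head] + deps)
        result[head] = " ".join(forms[i] for i in ids)
    return result


def external_relation(tokens, first_id):
    """Return (deprel, head) of the expression's first token, i.e. how the
    whole expression attaches to the rest of the sentence."""
    for tok_id, _, head, deprel in tokens:
        if tok_id == first_id:
            return deprel, head
    raise KeyError(first_id)


print(fixed_expressions(TOKENS))     # {2: 'as well as'}
print(external_relation(TOKENS, 2))  # ('cc', 5)
```

Nothing inside the expression records its etymological structure; only the external `cc` edge carries grammatical information.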

@nschneid
Contributor Author

OK, I've merged the revised content into https://universaldependencies.org/u/overview/syntax.html and removed it from Other Constructions. Did my best to take into account the above suggestions to make things clearer. Also improved cross-linking within and across pages. How does that look?

@sylvainkahane
Contributor

Thanks @nschneid. I am OK with this version (and I agree with @amir-zeldes too). I am not sure it is useful to have two separate sections, one on Multiword function words and another, further down, on Multiword expressions and headless structures, especially because only multiword function words are annotated as MWEs in UD (which could be said explicitly).
By the way, if you want a good example in French, I like the adverb "quand même", for which I would be unable to propose a reasonable internal structure: quand même 'even so, really', lit. when even.

@jnivre
Contributor
jnivre commented Nov 11, 2023 via email

@jnivre
Contributor
jnivre commented Nov 24, 2023

I modified the overview table by moving “compound” to the “Special” group and replacing “MWE” with “Headless”, which seems appropriate for the two remaining relations in that group (“fixed” and “flat”). Since none of the other proposals was completely convincing, I agree with the comment by Marie that it is preferable to make a minimal change, so that people will recognize most of the table.

nschneid added a commit that referenced this issue Dec 1, 2023
- Multiword Expressions (#974, #989)
- Semi-mandatory Relation Subtypes (#990)
nschneid closed this as completed Dec 1, 2023
nschneid added a commit that referenced this issue Mar 16, 2024