

Support finer-grained distinctions around NewsArticle [and news content] #1525

Open
danbri opened this issue Feb 16, 2017 · 47 comments
Labels
no-issue-activity Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!). schema.org vocab General top level tag for issues on the vocabulary status:work expected We are likely to, or would like to, or probably should try, ... to do something in this area.

Comments

@danbri
Contributor
danbri commented Feb 16, 2017

There is interest (e.g. from The Trust Project, and conversations around fact checking and our ClaimReview markup #1061) in using schema.org to describe more detailed kinds of NewsArticle, such as allowing sites to express that an article is an opinion piece, a backgrounder document, analysis, visualization, or that it is sponsored content. We have also touched on these themes here recently when discussing the idea of a 'Satirical News Article' subtype, see #1437.

See also https://www.theatlantic.com/technology/archive/2017/05/what-people-really-want-from-news-organizations/526902/

In the simplest case we could overload the "genre" property here, but I'd suggest (repeating myself from #1437) that a bit more structure will be more rewarding in terms of markup usability / quality. We could use a simple Enumeration list, but I lean towards types since they can be instantiated (to carry properties) and even subtyped further. More compellingly, in the (increasingly popular) JSON-LD format, markup for types is simple, and even multiple types are just a simple array of short tokens, whereas in general our markup for enumerated property values involves full URLs and is more error-prone.
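
To make the trade-off concrete, here is a rough sketch of the two styles in JSON-LD. The first node overloads "genre" with free text; the second uses types, including two at once as an array. The subtype names are candidates that come up elsewhere in this thread, and the headlines are placeholders, so treat this as illustrative rather than agreed vocabulary.

```json
[
  {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Placeholder headline for an opinion piece",
    "genre": "Opinion"
  },
  {
    "@context": "https://schema.org",
    "@type": ["OpinionNewsArticle", "AnalysisNewsArticle"],
    "headline": "Placeholder headline for an opinion piece"
  }
]
```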

Update (May 10th) from @danbri:

Several results of these investigations:

danbri added a commit that referenced this issue Feb 16, 2017
@danbri danbri self-assigned this Feb 16, 2017
@danbri danbri added pending.schema.org schema.org vocab General top level tag for issues on the vocabulary status:work expected We are likely to, or would like to, or probably should try, ... to do something in this area. labels Feb 16, 2017
@thadguidry
Contributor
thadguidry commented Feb 16, 2017

So we covered a bit of this before with rNews and how they basically use the Genre property for this currently.
But then I think again about some of the benefits/cons you mention, and then think about @rvguha's proposal of Compositional Terms / Types #1493, and feel much better about expanding types when/where there would be wide agreement. I think in the news domain this would get wide agreement. But in the end... it's still just taking Article and slapping a genre property on its back.

So what we are REALLY trying to do here is make it easier for heavy markup publishers... like news organizations, and say that "it's OK and beneficial to break norms and add some useful types for this heavy markup domain instead of properties".

So... +1 for expanding typing for News domain.

@danbri
Contributor Author
danbri commented Feb 16, 2017

@thadguidry this would be more a Compositional Terms matter if we were trying for FootballNewsArticle, CricketNewsArticle, {huge-list-of-every-sport-enumerated}NewsArticle. The current situation in the news industry is rather different; different kinds of articles are distinguished within the publishing organizations, but they are encouraged by schema.org to call a lot of quite varying situations "news articles" due to our lack of detail. The "genre" field, like "keywords", is certainly a workaround, but then we'd end up pushing everything down into a tagging-like structure.

I'll post a draft for discussion shortly...

@thadguidry
Contributor
thadguidry commented Feb 16, 2017

@danbri it can still apply to the Compositional use case... SatiricalNewsArticle ... versus FootballNewsArticle. But I guess you're saying it's not about "MainSubject"

  1. "MainSubject" NewsArticle ... like the FootballNewsArticle
    but more about
  2. "KindofNewsArticle" NewsArticle ... like a SatiricalNewsArticle

I'm just pointing out that both 1 and 2 can look very similar to a lot of publishers, so you want to make it very clear, as I did above, that this is not about easier typing for Subject composition, but instead for Genre composition :) ... I.E. Subjects will still need to go into a property. Right ?

@danbri
Contributor Author
danbri commented Feb 16, 2017

I agree that making usecases very clear for publishers would be critical here. I'd like to float some designs in our "pending" area (as we did quite successfully with ClaimReview for fact checking) and get their feedback.

@thadguidry
Contributor
thadguidry commented Feb 16, 2017

@danbri Updated: read my last clarity sentence above please and answer that. "I.E. Subjects will still need to go into a property. Right ?"

@danbri
Contributor Author
danbri commented Feb 16, 2017

Yes, subjects would still be described with properties.
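
A minimal sketch of that split, assuming the SatiricalNewsArticle subtype discussed in #1437: the kind of article goes in "@type", while the subject stays in a property such as "about". The headline and subject are placeholders.

```json
{
  "@context": "https://schema.org",
  "@type": "SatiricalNewsArticle",
  "headline": "Placeholder headline",
  "about": {
    "@type": "Thing",
    "name": "Football"
  }
}
```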

@thadguidry
Contributor

@danbri OK. Well, here is a Genre visualization URL on what currently passes along the NewsML wires as being a sameAs subType of Article... and just seeing now some of the NewsCodes being used and traveling across Reuters alone, makes me scratch my head...lol...

http://show.newscodes.org/index.html?newscodes=genre&lang=en-GB&startTo=Show

A few good ones make sense for us to add as subtypes...

Advisory
Biography
Interview
Obituary
Opinion
Question and Answer
etc.

Others in that listing are not very good candidates to add as a subtype of Article... and are clearly much more of a pure Genre or even Subject.

History
Music
Quote
Raw Sound
etc.

From https://iptc.org/standards/newsml-g2/iptc-catalog/

Genre
The <genre> element indicates the style of the content, in this example "interview" as a property that is distinct from <subject> that is used to indicate the subject matter of content. In the example, an IPTC Genre NewsCodes value of "Interview" is used:
<genre qcode="genre:interview">
<name xml:lang="en-GB">Interview</name>
</genre>

I also wonder if we need to worry about Media Topic as an additional property and separate issue ?
http://show.newscodes.org/index.html?newscodes=medtop&lang=en-GB&startTo=Show

@ghost
ghost commented Feb 24, 2017

https://en.wikipedia.org/wiki/List_of_genres#Literary_genres (note the list down the side of the page). I've spent a bit of time looking at this, can't find the reference to that work yet.

Genres are not simply for written works: https://www.wikidata.org/wiki/Property:P136
Classifications have of course been used in libraries for a very long time.

The problem really comes down to the fact that people who are using WordPress may need simple tools. A WordPress plugin would be ideal, though perhaps boolean datatypes would be most useful. NB also: found http://www.tagtraum.com/download/schreiber_learnedgenreontologies_ismir2016.pdf

@ghost
ghost commented Feb 24, 2017

danbri added a commit that referenced this issue Apr 26, 2017
(exploring related changes here from #1525 too)
danbri added a commit that referenced this issue Apr 26, 2017
danbri added a commit that referenced this issue Apr 27, 2017
danbri added a commit that referenced this issue Apr 27, 2017
These are not automatically inherited (or displayed). #1525
@subbuvincent

Just so all of you have news media examples for the HelpUsReport/AskPublic sub-types, here are some.

ProPublica Help Us Investigate Maternal Mortality (there is an @type of NewsArticle on this page)
https://www.propublica.org/getinvolved/help-propublica-and-npr-investigate-maternal-mortality

WBUR (Boston's NPR 90.9FM) crowdsourcing insights from the Boston public ahead of a Nov 2016 charter schools ballot measure report series
http://www.wbur.org/edify/2016/08/15/mass-charter-school-debate-stories

KALW (SF Local 91.7 FM) callout/crowdsourcing insights from Bay Area citizens ahead of local election reporting 2016
http://kalw.org/post/help-us-report-2016-elections#stream/0

(Please keep in mind that the Screendoor web form used on these Asks won't expand on the pages because the deadline has long passed. Disclosure: I was involved with the WBUR and KALW examples.)

Last week Vox put out a call for citizens to submit ER bills.
https://erbills.vox.com/ (this is set up as a separate website, but the schema idea does apply).

@danbri @chaals @thadguidry @RichardWallis

@danbri
Contributor Author
danbri commented Oct 30, 2017

@thadguidry I see where you're going with that, but I'd suggest here that we resist the urge to try to decompose crowdsourcing requests into some complex expression built from schema.org primitives. As you say it's possible we might later find some commonalities across existing terms. I'd suggest we think about news publishers making crowdsourcing requests simply as a "common enough kind of a thing" that we can dedicate a term to it, rather than putting that on hold while trying to generalize to all kinds of requests. If you did want to go in that direction you might start with our existing http://schema.org/Question type.

I believe there's an opportunity through the work that the Trust Project are doing with newsrooms and journalists, for the proposed NewsArticle subtypes (like this one) to become drop-in replacements in existing news publishing workflows. The more complicated we make it, the harder that simple adoption becomes. Even figuring out what kind of article something is, is a big enough ask, so I'd argue strongly to keep the markup as basic as possible at this stage. Let's give it 6-12 months in "pending" and see how all these terms are looking in practice. If a richer model for describing requests more generally surfaces, of course we can discuss it, but let's not try to cover too much in the first step. There are so many similar-sounding but different ideas around "asking" that I fear we could get lost in the distinctions, whereas "newspaper article encourages public to comment on topic" seems a reasonably distinct phenomenon.

@jeannieh

The following code currently checks out perfectly with the SDTT for the page https://www.propublica.org/getinvolved/help-propublica-and-npr-investigate-maternal-mortality

<script type="application/ld+json"> { "@context": "http://schema.org", "@type": "NewsArticle", "name": "ProPublica", "description": "By many measures, the United States has become the most dangerous industrialized country in which to give birth.", "image": { "@type": "ImageObject", "url": "https://assets.propublica.org/images/getInvolved/20170210-maternal-mortality-1200x630.jpg", "width": "1200", "height": "630" }, "url": "https://www.propublica.org/getinvolved/help-propublica-and-npr-investigate-maternal-mortality", "mainEntityOfPage": "https://www.propublica.org/getinvolved/help-propublica-and-npr-investigate-maternal-mortality", "inLanguage": "en_us", "headline": "Do You Know Someone Who Died or Nearly Died in Childbirth? Help Us Investigate Maternal Health", "keywords": ["many","measures","states","dangerous","country","birth"], "dateCreated": "2017-05-30T05:40:41+0000", "dateModified": "2017-09-07T22:01:57+0000", "datePublished": "2017-02-10T14:00:00+0000", "copyrightYear": "2017", "author": { "@type": "NGO", "name": "ProPublica", "description": "ProPublica is an independent, non-profit newsroom that produces investigative journalism in the public interest.", "url": "https://www.propublica.org", "sameAs": ["https://twitter.com/propublica","https://www.facebook.com/propublica","https://en.wikipedia.org/wiki/ProPublica"], "image": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "telephone": "1-212-514-5250", "email": "info@propublica.org", "address": { "@type": "PostalAddress", "streetAddress": "155 Avenue of the Americas", "addressLocality": "13th Floor", "addressRegion": "New York", "postalCode": "N.Y. 10013", "addressCountry": "US" }, "logo": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "location": { "@type": "Place", "name": "ProPublica", "description": "ProPublica is an independent, non-profit newsroom that produces investigative journalism in the public interest.", "hasMap": "http://maps.google.com/maps?q=ProPublica%2C+155+Avenue+of+the+Americas%2C+13th+Floor%2C+New+York+N.Y.+10013%2C+US", "telephone": "1-212-514-5250", "image": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "logo": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "url": "https://www.propublica.org", "sameAs": ["https://twitter.com/propublica","https://www.facebook.com/propublica","https://en.wikipedia.org/wiki/ProPublica"], "geo": { "@type": "GeoCoordinates", "latitude": "40.725395", "longitude": "-74.0048036" }, "address": { "@type": "PostalAddress", "streetAddress": "155 Avenue of the Americas", "addressLocality": "13th Floor", "addressRegion": "New York", "postalCode": "N.Y. 10013", "addressCountry": "US" } } }, "copyrightHolder": { "@type": "NGO", "name": "ProPublica", "description": "ProPublica is an independent, non-profit newsroom that produces investigative journalism in the public interest.", "url": "https://www.propublica.org", "sameAs": ["https://twitter.com/propublica","https://www.facebook.com/propublica","https://en.wikipedia.org/wiki/ProPublica"], "image": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "telephone": "1-212-514-5250", "email": "info@propublica.org", "address": { "@type": "PostalAddress", "streetAddress": "155 Avenue of the Americas", "addressLocality": "13th Floor", "addressRegion": "New York", "postalCode": "N.Y. 10013", "addressCountry": "US" }, "logo": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "location": { "@type": "Place", "name": "ProPublica", "description": "ProPublica is an independent, non-profit newsroom that produces investigative journalism in the public interest.", "hasMap": "http://maps.google.com/maps?q=ProPublica%2C+155+Avenue+of+the+Americas%2C+13th+Floor%2C+New+York+N.Y.+10013%2C+US", "telephone": "1-212-514-5250", "image": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "logo": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "url": "https://www.propublica.org", "sameAs": ["https://twitter.com/propublica","https://www.facebook.com/propublica","https://en.wikipedia.org/wiki/ProPublica"], "geo": { "@type": "GeoCoordinates", "latitude": "40.725395", "longitude": "-74.0048036" }, "address": { "@type": "PostalAddress", "streetAddress": "155 Avenue of the Americas", "addressLocality": "13th Floor", "addressRegion": "New York", "postalCode": "N.Y. 10013", "addressCountry": "US" } } }, "publisher": { "@type": "Organization", "name": "ProPublica", "description": "ProPublica is an independent, non-profit newsroom that produces investigative journalism in the public interest.", "url": "https://www.propublica.org", "sameAs": ["https://twitter.com/propublica","https://www.facebook.com/propublica","https://en.wikipedia.org/wiki/ProPublica"], "image": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "telephone": "1-212-514-5250", "email": "info@propublica.org", "address": { "@type": "PostalAddress", "streetAddress": "155 Avenue of the Americas", "addressLocality": "13th Floor", "addressRegion": "New York", "postalCode": "N.Y. 10013", "addressCountry": "US" }, "logo": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "location": { "@type": "Place", "name": "ProPublica", "description": "ProPublica is an independent, non-profit newsroom that produces investigative journalism in the public interest.", "hasMap": "http://maps.google.com/maps?q=ProPublica%2C+155+Avenue+of+the+Americas%2C+13th+Floor%2C+New+York+N.Y.+10013%2C+US", "telephone": "1-212-514-5250", "image": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "logo": { "@type": "ImageObject", "url": "https://assets.propublica.org/2017-pp-open-graph-1200x630.jpg", "height": "630", "width": "1200" }, "url": "https://www.propublica.org", "sameAs": ["https://twitter.com/propublica","https://www.facebook.com/propublica","https://en.wikipedia.org/wiki/ProPublica"], "geo": { "@type": "GeoCoordinates", "latitude": "40.725395", "longitude": "-74.0048036" }, "address": { "@type": "PostalAddress", "streetAddress": "155 Avenue of the Americas", "addressLocality": "13th Floor", "addressRegion": "New York", "postalCode": "N.Y. 10013", "addressCountry": "US" } } }, "articleSection": "Lost Mothers", "creator": ["Adriana Gallardo"] } </script>

@subbuvincent

@jeannieh Did you mean 'following code', meaning the code below your line?

Yes, if we went with the new sub-type of NewsArticle we are discussing here, one line will change

from
"@type": "NewsArticle"

to
"@type": "HelpUsReportNewsArticle"

or
"@type": "AskPublicNewsArticle"

Also, two more tidbits to give everyone more info (@danbri already knows this):

1 The origin of the words HelpUsReport: the Trust Project's type-of-work indicator protocol guidelines themselves, where the crowdsourcing type of call to users is defined with the label Help Us Report.

2 The origin of 'Ask' is its usage in this article (see the headline in particular), which a bunch of journalists (myself included) wrote in late 2016 and early 2017.
https://medium.com/@subbuvincent/ask-and-you-shall-receive-the-power-of-crowdsourcing-for-public-radio-7bff4f189607
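
To illustrate the one-line @type change described above, a trimmed sketch follows. It keeps only a few properties from the ProPublica markup earlier in the thread and uses AskPublicNewsArticle, one of the two candidate names; neither had been settled at this point.

```json
{
  "@context": "http://schema.org",
  "@type": "AskPublicNewsArticle",
  "headline": "Do You Know Someone Who Died or Nearly Died in Childbirth? Help Us Investigate Maternal Health",
  "url": "https://www.propublica.org/getinvolved/help-propublica-and-npr-investigate-maternal-mortality",
  "datePublished": "2017-02-10T14:00:00+0000",
  "publisher": {
    "@type": "Organization",
    "name": "ProPublica",
    "url": "https://www.propublica.org"
  }
}
```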

@thadguidry
Contributor
thadguidry commented Oct 31, 2017

@danbri This is why I hate email and comments, because all the subtlety of my voice is just chucked out the window. :) Re-read my comment and before all of it put the context "IN THE FUTURE WE COULD...". I didn't say "hold everything", "I hate this idea", "we need to land the ASK type first before everything." I was merely expanding thoughts for our future. I'll be sure to preface future comments like that to avoid disruption.

+1 for getting any types that The Trust Project needs. How's that for my unwavering support, even if it's not clear enough how deeply I care about journalists and provide a data tool for them like OpenRefine to boot? :) :)

@subbuvincent Yeah, Subra, I know about the "Ask" and journalists' challenges around getting responses just to get something going. For you and me and journalists the "Ask" usually ends up being a "Beg". Ain't I right? lololol

@TheTrustProject

@thadguidry thanks for your unwavering support. :> It is much appreciated! The care and engagement this community puts into this vocabulary is inspiring.

@danbri
Contributor Author
danbri commented Oct 31, 2017

Thanks @thadguidry :)) yeah, email/github misses a lot. I'd hoped I might run into you at the Wikidata conference. Maybe next time?

Ok, so where are we?

AskPublicNewsArticle, HelpUsReportNewsArticle, AskingPublicNewsArticle, ... any other candidates?

@thadguidry
Contributor

Let's make it even better, by combining the ideas into...

PublicHelpOnNewsArticle
HelpReportOnNewsArticle

@danbri
Contributor Author
danbri commented Oct 31, 2017

Ok, how about "HelpReportOnNewsArticle" - @subbuvincent @TheTrustProject, is that consistent with the TP intent? It is a little verbose, but most NewsArticles won't be of this type.

I would also like to clarify how general low-key "contact us if you can help with this story" messaging fits in here. My understanding is that some articles are very heavily focussed on the crowdsourcing, but you still often see a smaller message, as an aside, in other kinds of news article.
Perhaps in those cases we can take it as given that news organizations are open to being contacted and the important thing is to have their contact information, and to be clear that the contact information is for the newsroom/journalists rather than for the public? Per-article authorship information can also provide more specific contact opportunities too.

@chaals
Contributor
chaals commented Oct 31, 2017

I am wondering if "we welcome more information" (which is what I think YetToBeNamedClassOfNewsArticle is about) is actually a property of things, not a class.

As far as I know Standard and Poor's, Moody's, and some other rating agencies don't actually invite the public to provide their own ratings, while others do... an AccessibilityDescription is likely to be something with this property too. And of course all kinds of NewsArticles, how-tos, and so on.

@TheTrustProject

@danbri Your context description is quite right - journalism has always depended on the public for tips and sources. There's a growing movement, though, among newsrooms to reach out more actively, and we heard a strong call for this from our public interviews.

"HelpUsReportOnNewsArticle" is a little off because the call isn't necessarily for that particular article; it's for a breaking story, a beat, an area of coverage, etc. HelpUsReportNewsArticle is a type of news article that calls for the public to help the newsroom report.

@danbri
Contributor Author
danbri commented Nov 1, 2017

@chaals how about we do both? Let's add this new type - since crowdsourcing of news articles is a major phenomenon, and worth being able to distinguish from the many other situations in which feedback can be sought. But let's also expand the domain of http://schema.org/contactPoint to CreativeWork, which would allow it to be applied to all kinds of documents...

@subbuvincent

@danbri expanding contactPoint to CreativeWork is a great idea.

It adds an anchor directly connected to a practical nuance in crowdsourced journalism. Many newsrooms separate out a custom email address, e.g. for receiving responses from a form tool or even direct email invited from readers. They tend to avoid using the editors@newsroom.com or letters@newsroom.com account for special projects.
Allowing a contactPoint for a CreativeWork will allow a NewsArticle that is of the type HelpUsReport or AskPublic to additionally mark up this custom ContactPoint when needed.
(It might not always get used.)
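
As a sketch of what that could look like, assuming the proposed extension of contactPoint to CreativeWork is adopted (it is only a proposal at this point in the thread): the type name is one of the candidates under discussion, and the email address and contactType value are hypothetical placeholders.

```json
{
  "@context": "https://schema.org",
  "@type": "AskPublicNewsArticle",
  "headline": "Placeholder headline for a crowdsourcing callout",
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "crowdsourcing responses",
    "email": "callout-responses@newsroom.example"
  }
}
```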

Independently, what is interesting is that the Actionable Feedback (AF) Trust Indicator anyhow requires a special ContactPoint to be identified. And HelpUsReport/Crowdsourcing type callouts will usually run under an AF umbrella/program in a newsroom. So if CreativeWork can have a CP attached, that same AF CP could be reinvoked here too, even in cases where the newsroom is not creating a custom email address/contact for the callout episode. i.e. This adds flexibility to newsroom CMSes. (Sorry about going off on a detailed Trust Protocol nuance here, but the dots do connect.)

All in all, a thumbs up.

Regarding the difference between HelpUsReportNewsArticle and HelpUsReportOnNewsArticle - I hope @TheTrustProject's distinction in an earlier comment was clear, @thadguidry?

Also please note that vocabulary wise, we were trying for a term like ReportageNewsArticle, AnalysisNewsArticle, OpinionNewsArticle where the first part of the term is kind of a news genre, and the second part 'NewsArticle' is just the major type.

So HelpUsReport or AskPublic was an effort to verbalise the genre, which does not require an added connector.

@thadguidry
Contributor

@subbuvincent Yes the distinction is clear.

@chaals
Contributor
chaals commented Nov 3, 2017

The recent discussion is strengthening my opposition to making a class. The class of NewsArticles we want input on seems to sit badly with articles of different genres.

@github-actions

This issue is being tagged as Stale due to inactivity.

@github-actions github-actions bot added the no-issue-activity Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!). label Aug 30, 2020
@marco-brandizi

Hi all, do we need a specific issue to extend the domain of contactPoint? I have yet another case where I would need to have CreativeWork in the domain of this property, specifically I need a contact point for biological studies.

@baconjulie

We are interested in bringing these new schema types into our articles to distinguish Opinion vs. News.

I see this issue is tagged as stale, but it notes new subtypes for NewsArticle, again pending: AnalysisNewsArticle, BackgroundNewsArticle, ReportageNewsArticle, OpinionNewsArticle, ReviewNewsArticle. I do see these published on schema.org and labeled as "new" (e.g. https://schema.org/OpinionNewsArticle). Are these still pending? If we implement them within our site, will they be recognized by SERPs to help bring clarity around context?

@marco-brandizi

This seems to be one of those cases where it's best to have a generic class like Article, and then leave specific classifications to other schemas, controlled vocabularies, ontologies, etc. For instance, the types you have in science (peer-reviewed article, book chapter, thesis, tech report) might be different from the types used in current affairs (report, opinion, infographic...).

So, I agree with the other comments proposing to represent this information with additionalType or genre, rather than introducing many new types in schema.
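
A sketch of that alternative, keeping the generic type and pushing the classification into properties. The genre text and the additionalType URL are hypothetical stand-ins for whatever external vocabulary a publisher chooses.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Placeholder headline",
  "genre": "Opinion",
  "additionalType": "https://vocab.example.org/article-types/opinion"
}
```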

@baconjulie

Are there other valid use cases for OpinionNewsArticle or are there plans to remove it from the documentation if it is not recommended for use?

@TheTrustProject

@baconjulie All the news types you list above are in use by Trust Project news partners. We were advised that "pending" would eventually be lifted. @danbri, can you help?

