Archive for the ‘AP’ tag

577 US sites publishing hNews news


The San Francisco Chronicle was founded in 1865. It is the only daily broadsheet newspaper in San Francisco – and is published online at SFgate.com. In the 1960s Paul Avery was a police reporter at the Chronicle when he started investigating the so-called ‘Zodiac Killer’. And earlier this year, Mark Fiore won a Pulitzer Prize for his animated online cartoons for the paper (well worth watching his cartoon with Snuggly the security bear demonstrating how to make the internet ‘wire tap friendly’).

The Chronicle is also one of 577 US news sites now publishing articles with hNews (full list here).

hNews is the news microformat we developed with the Associated Press that makes the provenance of news articles clear, consistent and machine readable. A news article with hNews will – by definition – identify its author, its source organisation, its title, when it was published and – in most cases – the license associated with its use and a link to the principles to which it adheres (e.g. see AP essential news). It could also have where it was written, when it was updated, and a bunch of other useful stuff.
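To make that list of fields a little more concrete, here is a rough sketch in Python of what marking up an article might look like. The class and rel names (hnews, hentry, entry-title, author, source-org, published, item-license, principles) are given as I understand them from the hNews draft and the hAtom format it builds on, so please check them against the specification before relying on them. The function name and the example values are made up for illustration.

```python
from html import escape


def hnews_article(title, author, source_org, published_iso,
                  license_url, principles_url, body):
    """Return an HTML fragment carrying hNews-style provenance fields.

    Hypothetical helper: the class/rel names follow the hNews draft as
    recalled here and should be verified against the specification.
    """
    return (
        '<div class="hnews hentry">\n'
        f'  <h1 class="entry-title">{escape(title)}</h1>\n'
        # Who wrote it
        f'  <span class="author vcard"><span class="fn">{escape(author)}</span></span>\n'
        # Which organisation it came from
        f'  <span class="source-org vcard"><span class="org fn">{escape(source_org)}</span></span>\n'
        # When it was published (machine-readable date in the title attribute)
        f'  <abbr class="published" title="{escape(published_iso)}">{escape(published_iso)}</abbr>\n'
        # The license for reuse and the principles the article adheres to
        f'  <a rel="item-license" href="{escape(license_url)}">License</a>\n'
        f'  <a rel="principles" href="{escape(principles_url)}">Principles</a>\n'
        f'  <div class="entry-content">{escape(body)}</div>\n'
        '</div>'
    )


if __name__ == "__main__":
    print(hnews_article(
        title="Example headline",
        author="A. Reporter",
        source_org="Example News",
        published_iso="2010-10-12T09:00:00Z",
        license_url="https://example.org/license",
        principles_url="https://example.org/principles",
        body="Article text goes here.",
    ))
```

In practice a CMS template or plugin would add these attributes to existing article markup rather than build strings by hand; the point is only that each piece of provenance ends up in a predictable, machine-readable place.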

Essentially, hNews makes the provenance of a news article a lot more transparent – which is good news for whoever produces the article (gains credit, creates potential revenue models etc.), and good news for the end user (better able to assess its provenance, greater credibility etc.).

Up to now, though we have been aware that many sites have been integrating hNews, there has not been a published list of these sites. This seemed to us a little unsatisfactory. So we went out and found as many of them as we could and have now published them on a list as an open Google doc.

There are, I understand, a few hundred more sites that have either already integrated hNews or are in the process of integrating it. We haven’t found them yet but will add them when we do. If you know of one (or if you are one) please let us know and we’ll add it.

If you’re interested in integrating hNews and are wondering why you would, you can read a piece I wrote for PBS MediaShift (‘How metadata can eliminate the need for paywalls’), see the official specification at hNews microformats wiki, watch an hNews presentation by Stuart Myles, view a (slightly dated) slideshow on why it creates ‘Value Added News’, or see how to add hNews to WordPress.

hNews was developed as part of the transparency initiative of the Media Standards Trust, which aims to make news on the web more transparent. The initiative has been funded by the MacArthur Foundation and the Knight Foundation. You can read more about the transparency initiative elsewhere on this site.

This post was first published on the Media Standards Trust site on Tuesday 12th October, 2010

Update: I’m grateful to Max Cutler for spotting a number of duplicate entries in the original list, which have now been cleaned up. It’s still 577 sites since in the process of cleaning we found a few more. And, as I wrote in my original post, this number is by no means final. There are almost certainly a lot more sites publishing with hNews; it’s just a matter of finding them (through sweat and scrapers). So if you spot any that aren’t on the list, please let me know.

Written by Martin Moore

October 15th, 2010 at 3:10 pm

Posted in hNews


How Metadata Can Eliminate the Need for Pay Walls


This piece I wrote was first published on PBS MediaShift on 18th August 2010, and subsequently at j-source.ca on 31st August 2010.

You have to admire his chutzpah. Rupert Murdoch, the so-called nemesis of public interest news, is now being hailed by some as its potential savior. Sick and tired of people reading his news outlets for free online, Murdoch has erected pay walls around his sites (or some of them at least).

Anyone who wants to see what is published on thetimes.co.uk will have to pay at least £1. That includes search engines, which are not even allowed to index the Times’ online content. Now we have to wait and see if the subscription revenues start rolling in.

Yet even those who hope the pay wall succeeds have reservations. Pay walls represent both a practical and philosophical shift in the provision of news on the net. They represent a shift from the openness that has defined the early history of the web, to a closed world much more reminiscent of the 20th century’s constrained media environment. Erect a pay wall and you immediately cut yourself off from much of the web community. You disable the vast majority of people from recommending, linking, commenting, quoting, and discussing.

It is for this reason that any forward thinking journalist cannot help but be disheartened by the pay wall. It cuts you off from a much bigger potential audience. It suffocates networked journalism, whereby you engage with your readers to source, expand, deepen, and extend your story. It limits your opportunity to enhance your own brand, as opposed to that of the publication. But worst of all, it turns its back on the reason for the net’s success — the flowering of millions of conversations. As the lawyer who stopped writing for the Times after it put up its pay wall said, “inside the paywall no-one can even hear you scream.”

Fortunately, there is an alternative. A way in which news can remain distributed, open, even re-usable. A way in which journalism can work with the grain of the web, and continue to grow, extend, and integrate. And it is a way — crucially — that journalism can still make money.

But first, a story.

LIBRARY OF ALEXANDRIA

In the fourth century BC, a student of Aristotle, Demetrius of Phaleron, set up a library in Alexandria. It was a little different from the libraries we’re now familiar with. It had lecture halls, a dining room, meeting rooms, and a “walk.” It also had a reading room and lots of books (or scrolls, as they then were). Within a few decades it had acquired almost half a million scrolls, many containing multiple works. Such an abundance of scrolls would quickly have become unmanageable had it not been for Callimachus of Cyrene. Callimachus started “the first subject catalogue in the world, the Pinakes,” according to Roy Macleod in “The Library of Alexandria.” This was made up of six sections and catalogued some 120,000 scrolls of classical poetry and prose. His methods were then adopted and extended by other librarians.

Thanks in no small part to the cataloguing, people were able to build on each other’s knowledge. Scholars began to compare the texts and try to understand the reasons why they differed. Hence cross-textual analysis was born. People were able to contrast and evaluate various scientific methods. Archimedes (of “Eureka” fame) worked out methods for calculating areas and volumes while at the library that later formed the basis for calculus.

The library at Alexandria became the most famous of the ancient world, and spawned many further libraries and even whole university towns such as Bologna and Oxford. Yet had its books not been catalogued none of this might have happened. Had the books not had metadata giving basic details about who wrote them, when they were written, what they should be classified as, then there would not have been the foundations on which scholars could build.

Metadata is just a fancy word for information about information. A library catalogue is metadata because it categorizes the books and describes where you can find them. You find metadata on the side of every food packet, only we don’t call it metadata, we call it ingredients. The equivalent metadata about a news article would capture information about where it was written, who wrote it, when it was first published, when it was updated. All pretty basic stuff, but critical to properly identifying it and helping its distribution.

IMPORTANCE OF METADATA

Metadata did not matter so much when news was all tidily packaged together in a newspaper. You knew when something was published because it was inside that day’s paper. You knew who had published it because it was on the masthead and at the top of every page. There was — is — lots of metadata about news in newspapers, we just tend to take it all for granted.

The Internet, and the search engines and social networks that power the web, have broken the newspaper package down into discrete pieces of content. These atomized chunks — individual news articles, photographs, video clips, audio clips — are what we consume online. We do not read an online paper cover to cover, as we would a print paper. That would be exhausting. The BBC news website publishes about 150,000 words each day. To skim every individual article would take upwards of 17 hours. Instead we pick and choose, we unbundle.

Rather than seeing unbundling as a problem, news outlets should see it as an opportunity. An opportunity to distribute news all around the web. An opportunity to get readers to help sell their news – by recommending pieces to their colleagues and friends, and by linking to stories from their networks and blogs. The only thing news producers need to do before publishing a news article, is make sure it has metadata integrated to it. This way whenever people — or machines (i.e. search engines) — see it, they can also see its provenance, recognize what category of information it is, and give credit to its creator.

Having basic information about who produced something is to the mutual advantage of the person who wrote the article (or took the photograph or shot the film footage), and of the public who is reading it. The producer gets proper credit for what they created, and the public gets to see who created it — giving the news greater transparency and a measure of accountability.

When you think about it, it seems remarkable that so much content does not have this sort of metadata already. It is like houses not having house numbers or zip codes. Or like movies not having opening or closing credits. Or like a can of food without an ingredients label. As Jeff Jarvis wrote recently, “When it comes to products, we want to know: where it was made, by whom, in what conditions, using what materials, causing what damage, traveling what distance, with whose assurances of quality, with whose assurances of safety.” Why should news be any different?

HNEWS

hNews is just one of a number of methods of adding metadata. It is a simple, open standard that is free and that anyone can implement. We at the Media Standards Trust in Britain developed it in partnership with Sir Tim Berners-Lee’s Web Science Trust, and in the latter stages by working with the Associated Press. (This was made possible thanks to two foundation grants, one from the MacArthur Foundation and one from the Knight Foundation. You can read my blog posts about the development of hNews over at Idea Lab, a Knight-funded sister site of PBS MediaShift.)

There are other ways to add metadata to news, for example using RDF or linked data. hNews is an easy entry point since it is built on existing standards (microformats), fits easily within any CMS (there is a WordPress and a Blogger plugin), and is entirely reversible. Almost 500 news sites in the US have already implemented hNews, including the Associated Press and AOL. But you choose whichever one suits you best. (Some sample implementations are available here.)

Once hNews is added there are some immediate benefits. Every news article has consistent information about who wrote it, who published it, when it was published etc. built into it. Every article also has an embedded link to the license associated with its reuse (so ignorance is no excuse). And every article has a link to the principles to which it adheres. These principles should not only help to distinguish the article as journalism, but should make the principles that define journalism (which are right now opaque and little understood by the public) transparent. Moreover, all this information is made ‘machine-readable’ by hNews. In other words a machine (like a search engine) can understand it.

Making this information machine-readable opens up the less immediate, but more exciting aspects of metadata. It creates an ecology of structured data that makes search more intelligent, enables innovation, and opens up new revenue opportunities.

It is a little known truth that much of the evolution of the web has already been driven by open standards. And that many of the uses of open standards are not at first apparent to those who create them. Who could have known that RSS (Really Simple Syndication), a simple standard for syndicating web content, would now be the way millions of people consume audio podcasts? Or that OAuth and OpenID would so simplify the sharing of private information across websites?

The openness and re-usability of hNews enables people to build stuff with it and on top of it. It allows you, for example, to add a “news ingredients” label to the bottom of each article. This is what Open Democracy are doing. Under each article that has hNews embedded they will automatically add an hNews icon. Scroll over this icon and you will get a pop-up box with all the basic details of the article (author, publish date/time, principles etc.). Rather like the ingredients on a food packet. Some of this information is hyperlinked so that you can click directly through to more information, like the license associated with re-use of the article. Imagine labels like these on all news articles. At a stroke you would have transformed their transparency and accountability.

Embedding metadata like hNews has countless other potential uses. As a simple illustration of the type of thing it enables, we built a browser plugin – itchanged.org – that allows you to track changes in news articles. Another application might be more intelligent recommendations (e.g. see readness.com). But most importantly, structuring data creates an environment in which invention becomes possible — in the same way, for example, that library catalogues do.

AP NEWS REGISTRY

It can also help news organizations work out ways to make money. For example, the Associated Press has built its News Registry on top of hNews. The news registry is AP’s way of tracking its news around the web so that it has much better metrics that it can use to charge more accurately for its content, and work out revenue sharing opportunities for advertising associated with its content.

How it does this is pretty straightforward. In addition to hNews the AP embeds an image file, probably a transparent pixel, in each news article. This file is equivalent to a photograph in a web page, except that it is not intended to be seen. But like a photograph in a web page, this image file has to be served up from a separate server, in this case AP’s servers. So whenever the article is viewed on a computer, the browser (Internet Explorer, Firefox etc.) notices the image file and asks AP’s server to deliver it. That way the AP knows where the article is being read. It’s a little like a carrier pigeon. The pigeon can fly wherever it likes but always knows where its home is.
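For anyone who prefers to see the mechanism in code, here is a minimal sketch in Python of the general tracking-pixel idea. It is not the AP's actual implementation, which I have not seen. The beacon address, the "article" query parameter and both function names are invented for illustration.

```python
from collections import Counter
from html import escape
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical address of the server that serves the invisible image.
BEACON_URL = "https://registry.example.org/beacon.gif"


def beacon_tag(article_id):
    """Build the invisible <img> tag that travels with the article."""
    src = BEACON_URL + "?" + urlencode({"article": article_id})
    return f'<img src="{escape(src)}" width="1" height="1" alt="">'


def record_hit(request_url, referrer, hits):
    """Log one request for the beacon image.

    request_url is the URL the browser asked for; referrer is the page
    the article was embedded in (browsers usually send this along).
    """
    params = parse_qs(urlparse(request_url).query)
    article_id = params.get("article", ["unknown"])[0]
    site = urlparse(referrer).netloc or "unknown"
    hits[(article_id, site)] += 1
    return article_id, site


if __name__ == "__main__":
    hits = Counter()
    print(beacon_tag("story-123"))
    record_hit(BEACON_URL + "?article=story-123",
               "https://news.example.com/world/story", hits)
    print(hits)
```

Each time a browser fetches the image, the home server learns which article was viewed and on which site, which is the raw material for the metrics described above.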

Pay walls will rise and pay walls will fall. But in the world of information abundance in which we now live, pay walls are a step backwards. If news wants to benefit from the remarkable openness and dynamism that the internet has unleashed then it should embrace the distributed network and take advantage of it, not turn its back.

Written by Martin Moore

September 13th, 2010 at 1:42 pm

News organisations must innovate or die


This post was originally published at PBS Idea Lab on July 8th 2010.

People in news don’t generally think of innovation as their job. It’s that old CP Snow thing of the two cultures, where innovation sits on the science not the arts side. I had my own experience of this at the American Society of Newspaper Editors conference in Washington a couple of months ago.

After one of the sessions I spotted an editor whose newspaper had adopted hNews (the Knight-funded news metadata standard we developed with the AP). “How’s it going?” I asked him. “Is it helping your online search? Are you using it to mark up your archive?”

Before I had even finished the editor was jotting something down on his notepad. “Here,” he said, “Call this guy. He’s our technical director — he’ll be able to help you out.”

Technology and innovation still remain, for most editors, something the techies do.

So it’s not that surprising that over much of the last decade, innovation in news has been happening outside the news industry. In news aggregation, the work of filtering and providing context has been done by Google News, YouTube, Digg, Reddit, NowPublic, Demotix and Wikipedia…I could go on. In community engagement, Facebook, MySpace, and Twitter led the way. In news-related services (the ones that tend to earn money) it has been Craigslist, Google AdWords and now mobile services like Foursquare.

Rather than trying to innovate themselves, many news organisations have chosen instead to gripe from the sidelines. Rupert Murdoch called Google a “thief” and a “parasite.” The U.K.’s Daily Mail has published stories about how using Facebook could raise your risk of cancer, referred to someone as a “Facebook killer” (as in murderer), and run scare stories about Facebook and child safety. And let’s not even start to take apart various news commentators’ dismissive attitude towards Twitter.

When they have seen the value of innovation, news organizations have tended to try and buy it in rather than do it themselves, with decidedly mixed results. Murdoch’s purchase of MySpace initially looked very smart, but now, as John Naughton wrote over the weekend, it “is beginning to look like a liability.” The AOL/Time Warner mashup never worked. Associated Newspapers in the U.K. have done slightly better by making smaller investments in classified sites.

Most news organisations do not see innovation as a critical element of what they do. This is not that unexpected since they spend their day jobs gathering and publishing news. Unfortunately for them, if it doesn’t become more central to their DNA they are liable to become extinct.

Speed and Unpredictability of Innovation

At last week’s Guardian Activate Summit, Eric Schmidt, Google’s CEO, was asked what kept him awake at night. “Almost all deaths in the IT industry are self-inflicted,” Schmidt said. “Large-scale companies make mistakes because they don’t continue to innovate.”

Schmidt does not need to look far to see how quickly startups can rise and fall. Bebo was started in 2005, was bought by AOL in 2008 for $850 million, and then was sold again this month to Criterion Capital for a fee reported to be under $10 million.

The problem for Schmidt — and one that is even more acute for news organizations — is the increasing speed and unpredictability of innovation. “I’m surprised at how random the future has become,” Clay Shirky said at the same Activate summit, meaning that the breadth of participation in the digital economy is now so wide that innovation can come from almost anyone, anywhere.

As an example he cited Ushahidi, a service built by two young guys in Kenya to map violence following the election in early 2008 that has now become a platform that “allows anyone to gather distributed data via SMS, email or web and visualize it on a map or timeline.” It has been used in South Africa, the Democratic Republic of Congo, India, Pakistan, Gaza, Haiti and in the U.S.

He might also have cited Mendeley, a company which aims to organize the world’s academic research papers online. Though only 16 months old, the service already has over 29 million documents in its library, and is used by over 10,000 institutions and over 400,000 people. It won a prize at Activate for the startup “most likely to change the world for the better.”

The tools to innovate are much more widely available than they were. Meaning a good idea could be conceived in Nairobi, Bangalore or Vilnius, and also developed and launched there too, and then spread across the world. “The future is harder to predict,” Shirky said, “but easier to see.”

That’s why Google gives one day a week to its employees to work on an innovation of their choice (Google News famously emerged from one employee’s hobby project). It is why foundations like Knight have recognized the value of competition to innovation. And it’s why Facebook will only enjoy a spell at the peak.

Some Exceptions

There are exceptions in the news industry. The New York Times now has an R&D department, has taken the leap towards linked data, and published its whole archive in reusable RDF. The Guardian innovated with Comment is Free, its Open platform, and the Guardian Data Store. The BBC developed the iPlayer.

The Daily Telegraph had a go, setting up “Euston Partners” under then editor Will Lewis. (Although setting up an innovation center three miles away from the main office did not suggest it was seen as central to the future of the business.) The project was brought back in-house shortly after Lewis left the Telegraph in May 2010 and has been renamed the “Digital Futures Division.”

But mostly people in news don’t really do innovation. They’re too focused on generating content. But as the Knight Foundation has recognized, doing news in the same old way not only doesn’t pay — it doesn’t even solve the democratic problems many of those in news are so rightly concerned about. For some people FixMyStreet.com or its U.S. equivalent SeeClickFix is now more likely to give them a direct relationship with their council than the local newspaper.

News and media organizations have to realize that they are in the communications business, and being in that business means helping people to communicate. Giving them news to talk about is a big part of this, but it’s not the only part. The sooner they realize this and start to innovate, the better chance they have of surviving the next couple of decades.

Written by Martin Moore

July 22nd, 2010 at 2:42 pm

What are the universal principles that guide journalism?


This blog was first published on the PBS MediaShift Idea Lab.

Defining principles of journalism is difficult. Rewarding, but difficult.

Back in 2005 it took the Los Angeles Times a year of internal discussions to settle on its ethical guidelines for journalists. The Committee of Concerned Journalists took four years, did oodles of research and held 20 public forums in order to come up with a Statement of Shared Purpose with nine principles (which was subsequently fleshed out in the excellent “The Elements of Journalism” by Kovach and Rosenstiel).

Time spent thinking can then translate into a lot of principles. The BBC’s editorial guidelines — which include guidance about more than just journalism — run to 228 pages. The New York Times’ policy on ethics in journalism has more than 10,000 words. Principles needn’t be so wordy. The National Union of Journalists (U.K.) code of conduct, first drafted in 1936, has 12 principles adding up to barely more than 200 words.

But, once defined, these principles serve multiple functions. They act as a spur to good journalism, as well as a constraint on bad. They provide protection for freedom of speech and of the press — particularly from threats or intimidation by the government or commercial organizations. And they protect the public by preventing undue intrusion and providing a means of response or redress.

Principles in the Online World

In an online world, principles can serve another function. They can help to differentiate journalism from other content published on the web, whether that be government information, advertising, promotion, or institutional or personal information.

One of the key elements of hNews — the draft microformat the Media Standards Trust developed with the AP to make news more transparent — is rel-principles. This is a line of code that embeds a link within each article to the news principles to which it adheres. It doesn’t specify what those principles should be, just that the article should link to some.
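As an illustration of what "machine readable" means here, the sketch below uses Python's standard html.parser module to pull the principles link out of a page. It assumes the link is an ordinary a element whose rel attribute includes the value principles, which is how I understand rel-principles to work. The class and function names are my own, and the sample markup is invented.

```python
from html.parser import HTMLParser


class PrinciplesLinkFinder(HTMLParser):
    """Collect the href of every <a> whose rel includes 'principles'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, e.g. "principles nofollow"
        rels = (attr.get("rel") or "").split()
        if "principles" in rels and attr.get("href"):
            self.links.append(attr["href"])


def find_principles(html_text):
    """Return all principles URLs found in the given HTML string."""
    finder = PrinciplesLinkFinder()
    finder.feed(html_text)
    return finder.links


if __name__ == "__main__":
    sample = (
        '<div class="hnews hentry">'
        '<a rel="principles" href="https://example.org/principles">'
        'Our principles</a></div>'
    )
    print(find_principles(sample))
```

A search engine or aggregator could do something similar to check whether an article declares any principles at all, and then follow the link to see what they are.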

Now that lots of news sites are implementing hNews (over 200 sites implemented the microformat in January), we’re getting some pushback on this. News sites, and bloggers, generally recognize that transparent principles are a good idea but, having not previously made them explicit online, many of them aren’t entirely sure what they should be.

When we started working with OpenDemocracy, for example, they realized they had not made their principles explicit. As a result of integrating hNews, they now have. Similarly, the information architect and blogger Martin Belam, who blogs at currybet.net and integrated hNews in January 2010, wrote: “it turned out that what I thought would be a technical implementation task actually generated a lot of questions addressing the fundamentals of what the site is about… It meant that for the first time I had to articulate my blogging principles.”

So, in an effort to help those who haven’t yet defined their principles, we’re in the process of gathering together as many as we can find, and pulling out the key themes.

This is where you can help.

Asking for Feedback

We’ve identified 10 themes that we think characterize many journalism statements of principle. This is a result of reviewing dozens of different (English language) principles statements available on the web. The statements were accessed via the very useful journalism ethics page on Wikipedia; via links provided by the Project for Excellence in Journalism; and from the Media Accountability Systems listed on the website of the Donald W. Reynolds Institute of Journalism.

These themes are by no means comprehensive — nor are they intended to be. They are a starting point for those, be they news organizations or bloggers, who are drawing up their own principles and need a place to start.

We’d really like some feedback on whether these are right, if ten is too many, if there are any big themes missing, and which ones have most relevance to the web.

Ten Themes

Our 10 themes are:

  1. Public interest Example: “… to serve the general welfare by informing the people and enabling them to make judgments on the issues of the time” (American Society of Newspaper Editors)
  2. Truth and accuracy Example: “[The journalist] strives to ensure that information disseminated is honestly conveyed, accurate and fair” (National Union of Journalists, UK)
  3. Verification Example: “Seeking out multiple witnesses, disclosing as much as possible about sources, or asking various sides for comment… [The] discipline of verification is what separates journalism from other modes of communication, such as propaganda, fiction or entertainment” (Principles of Journalism, from Project for Excellence in Journalism)
  4. Fairness Example: “… our goal is to cover the news impartially and to treat readers, news sources, advertisers and all parts of our society fairly and openly, and to be seen as doing so” (New York Times Company Policy on Ethics in Journalism)
  5. Distinguishing fact and comment Example: “… whilst free to be partisan, [the press] must distinguish clearly between comment, conjecture and fact” (Editors Code of Practice, PCC, U.K.)
  6. Accountability Example: “The journalist shall do the utmost to rectify any published information which is found to be harmfully inaccurate” (International Federation of Journalists, Principles on the Conduct of Journalists)
  7. Independence Example: “Journalists should be free of obligation to any interest other than the public’s right to know… [and] Avoid conflicts of interest, real or perceived” (Society of Professional Journalists)
  8. Transparency (regarding sources) Example: “Aim to attribute all information to its source. Where a source seeks anonymity, do not agree without first considering the source’s motives and any alternative, attributable source. Where confidences are accepted, respect them in all circumstances” (Australian Journalists Code)
  9. Restraint (around harassment and intrusion) Example: “The public has a right to know about its institutions and the people who are elected or hired to serve its interests. People also have a right to privacy and those accused of crimes have a right to a fair trial. There are inevitable conflicts between the right to privacy, the public good and the public’s right to be informed. Each situation should be judged in the light of common sense, humanity and the public’s rights to know” (Canadian Association of Journalists)
  10. Originality (i.e. not plagiarizing) Example: “An AP staffer who reports and writes a story must use original content, language and phrasing. We do not plagiarise, meaning that we do not take the work of others and pass it off as our own” (Associated Press Statement of news values and principles)

There are, of course, many principles excluded here. We could, for example, have gone into much more depth in the area of “limitation from harm,” which is only briefly referred to in number nine. Principles to inform newsgathering could form another whole section in itself.

There is also the growing area of commercial influence. In the U.S., the FTC has become pretty animated about bloggers taking money to promote goods while appearing to be impartial. In the online world, the line between editorial and commercial content can get pretty blurred. Right now this is just covered by number seven, independence. Should there be a separate principle around independence from commercial influence?

Any and all responses are much appreciated, so please leave them in the comments. Also feel free to get in touch directly if you’d like to continue the discussion (I’m at martin DOT moore AT mediastandardstrust DOT org).

Written by Martin Moore

February 3rd, 2010 at 6:03 am

Posted in Uncategorized
