Serving the right digital content to the right person at the right time from a pool of published materials is a challenging task for media outlets without software that locates, accesses and delivers the piece that will interest the reader. Technology can be taught to analyse, categorise and align data the way a human does, and can take over from editors and journalists the drudgery of manually reviewing and indexing each piece of content. We have developed solutions that enable media organisations to make their staff more efficient and to tackle complex data classification, clustering and interlinking jobs.
Recommending related content
Studies show that most readers of web content are likely to click on links to related stories for more information. Publishers, trying to maximise their visibility and reach, offer related content, but in most cases the editorial staff select the headlines by hand. This cumbersome task takes a lot of time, which can be spent more creatively once the organisation has an intelligent engine that generates recommendations automatically. With millions of documents, software has no trouble finding articles of interest, whereas a human does. On top of that, software is not swayed by a particular mood or opinion - its choices are determined by statistics. Automatically generated recommendations that match reader interest have a number of benefits. User engagement metrics improve considerably - readers browse more articles and stay on the website longer, and the bounce rate drops. In the long run this leads to increased conversion and loyalty and, last but not least, to better SEO.
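As an illustration of what "determined by statistics" can mean, here is a minimal sketch that ranks related articles by TF-IDF cosine similarity. This is one common technique, not necessarily the one used in our engine; the sample headlines and the function names (tokenize, tfidf_vectors, recommend) are assumptions made up for the example.

```python
import math
import re
from collections import Counter

# Hypothetical sample headlines, used only for illustration.
ARTICLES = [
    "Central bank raises interest rates to curb inflation",
    "Inflation slows as the central bank holds interest rates",
    "Local football team wins the championship final",
    "Football championship final draws record crowd",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed inverse document frequency, so rare terms weigh more.
        vectors.append({
            t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        })
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * \
           math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def recommend(index, docs, top_n=2):
    """Return the indices of the top_n documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [(cosine(vectors[index], v), i)
              for i, v in enumerate(vectors) if i != index]
    scores.sort(reverse=True)
    return [i for _, i in scores[:top_n]]

print(recommend(0, ARTICLES, top_n=1))
```

For the first headline the closest match is the other central-bank story, because the two share several terms while the football stories share none with it. A production system would typically add stop-word handling, stemming and an index that avoids comparing every pair of documents.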
Named entity recognition
Named Entity Recognition (NER) finds mentions of entities such as persons, organisations, brands, locations, events, e-mail addresses, etc., in large corpora of digital text. It is an important stage in information retrieval tasks. Extracting information from unstructured data is a daunting task, considering the complexity of natural languages.
How does named entity recognition work? The first step is to train the model on a data set of manually labelled entities, because named entity terms can be ambiguous. For example, in a text about corporate dealings “Apple” is likely to refer to Apple Inc. rather than to the fruit. “Fannie Mae” looks like a person's name but will be marked as an organisation in the phrase “mortgage company Fannie Mae”.
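The role of labelled examples in resolving ambiguity can be sketched with a toy classifier that counts which context words appear around each labelled mention. This is a deliberately simplified stand-in for a real NER model, not our actual method; the labelled sentences, the type names and the functions (context_words, train, classify) are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical hand-labelled examples: (sentence, mention, entity type).
LABELLED = [
    ("Apple reported record quarterly profit", "Apple", "ORGANISATION"),
    ("Shares of Apple rose after the company announcement", "Apple", "ORGANISATION"),
    ("She ate an apple and a pear for lunch", "apple", "FRUIT"),
    ("The apple pie recipe needs fresh fruit", "apple", "FRUIT"),
]

def context_words(sentence, mention):
    """Return the lowercase words of the sentence, excluding the mention itself."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w != mention.lower()]

def train(examples):
    """Count, per entity type, how often each context word was seen."""
    counts = defaultdict(Counter)
    for sentence, mention, label in examples:
        counts[label].update(context_words(sentence, mention))
    return counts

def classify(model, sentence, mention):
    """Pick the entity type whose training contexts overlap most with this one."""
    words = context_words(sentence, mention)
    scores = {label: sum(seen[w] for w in words) for label, seen in model.items()}
    return max(scores, key=scores.get)

model = train(LABELLED)
print(classify(model, "Apple shares fell as the company cut its profit forecast", "Apple"))
print(classify(model, "He picked a fresh apple for lunch", "apple"))
```

The first sentence shares words such as "shares", "company" and "profit" with the organisation examples, so the mention is resolved as an organisation; the second overlaps with the fruit examples. Real models use far richer features and much larger labelled sets, but the dependence on labelled context is the same.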
The model quickly learns to identify textual mentions of the named entities from the pre-defined data set. The next stage is to check how well it performs and to give it feedback so that it improves. We provide a fast and user-friendly interface in which editors correct the model's mistakes. This allows us to maintain the quality of our service and to meet industry standards.
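One simple way to measure how well a model is doing against editor corrections is to compute precision and recall, and to keep the corrected items as new training examples. The sketch below assumes entities are represented as (text, type) pairs; that representation, the sample data and the function name precision_recall are assumptions for illustration only.

```python
def precision_recall(predicted, corrected):
    """Compare the model's entities with the editor-approved set."""
    predicted, corrected = set(predicted), set(corrected)
    true_positives = len(predicted & corrected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(corrected) if corrected else 0.0
    return precision, recall

# Hypothetical model output and the version after an editor's review.
model_output = {
    ("Fannie Mae", "PERSON"),
    ("Apple", "ORGANISATION"),
    ("London", "LOCATION"),
}
editor_version = {
    ("Fannie Mae", "ORGANISATION"),
    ("Apple", "ORGANISATION"),
    ("London", "LOCATION"),
}

precision, recall = precision_recall(model_output, editor_version)

# Entities the editor had to fix or add become new training examples.
new_training_examples = editor_version - model_output

print(f"precision={precision:.2f} recall={recall:.2f}")
print(new_training_examples)
```

Here the model labelled one of three entities with the wrong type, so two of its three predictions are confirmed by the editor and two of the three correct entities were found. The single corrected pair is what would be fed back for retraining.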
Using software instead of editorial staff to classify content helps publishers deal with large volumes of articles quickly and efficiently. What you need is a pre-defined taxonomy and some examples of how this taxonomy is applied, which the algorithm will use for indexing articles. Journalists then only need to give the machine feedback on whether it is doing well. Retraining is necessary because the initial data set may not cover all the categories well, or the taxonomy may change over time, for example when new topics are added.
We have created an automatic tagging system which can analyse an article and assign it to a given set of categories: for example, this story is about agriculture, that one should go to the sports section, and so on. The service can be applied to large volumes of content to identify all the components and sort them according to a taxonomy created for the purpose. The Taxonomy Classifier developed by our team evaluates an article and tags it with every category to which it belongs; in other words, it is a multi-label tagger.
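The idea of multi-label tagging from a taxonomy plus examples can be sketched as follows. This is not the Taxonomy Classifier itself, only a toy vocabulary-overlap approach; the categories, example snippets, the min_overlap threshold and the function names (build_profiles, tag) are hypothetical.

```python
import re

# Hypothetical taxonomy: each category comes with a few example snippets.
TAXONOMY_EXAMPLES = {
    "agriculture": ["wheat harvest farmers crop yields",
                    "farm subsidies crop irrigation"],
    "sports": ["football team wins league match",
               "tennis player wins tournament final"],
    "economy": ["prices rise inflation hits market",
                "export market prices trade figures"],
}

def tokens(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_profiles(examples):
    """Build one vocabulary set per category from its example snippets."""
    return {
        category: {t for text in texts for t in tokens(text)}
        for category, texts in examples.items()
    }

def tag(article, profiles, min_overlap=2):
    """Return every category sharing at least min_overlap words with the article."""
    words = set(tokens(article))
    return sorted(
        category for category, vocabulary in profiles.items()
        if len(words & vocabulary) >= min_overlap
    )

profiles = build_profiles(TAXONOMY_EXAMPLES)
print(tag("Wheat prices rise after poor harvest hits farmers and the export market",
          profiles))
```

Because the article overlaps with both the agriculture and the economy vocabularies, it receives both tags, which is the point of a multi-label tagger as opposed to a classifier that picks a single section. Adding a new topic only requires adding a category with examples and rebuilding the profiles.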
One of the advantages of accurately tagged content is better targeted advertising. Tagging helps online media match banners to user interest and monetise that interest with advertisers. For example, if a user is viewing articles about cars, it is quite natural for a car maker’s ad to appear.
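A minimal sketch of matching banners to reader interest, assuming each viewed article carries a list of tags and each ad targets one tag. The ad inventory, tag names and the function pick_ad are hypothetical and only illustrate the principle.

```python
from collections import Counter

# Hypothetical ad inventory: target tag -> banner.
ADS = {
    "cars": "Car maker banner",
    "travel": "Airline banner",
    "sports": "Sportswear banner",
}

def pick_ad(viewed_article_tags, ads):
    """Return the banner for the reader's most frequent tag that has an ad."""
    interest = Counter(tag for tags in viewed_article_tags for tag in tags)
    for tag, _count in interest.most_common():
        if tag in ads:
            return ads[tag]
    return None  # no tag of interest has a matching ad

# Tags of the articles a reader has viewed in this session.
viewed = [["cars", "economy"], ["cars"], ["travel"]]
print(pick_ad(viewed, ADS))
```

The reader above has viewed two car-related articles, so the car maker's banner is chosen ahead of the airline one.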