Out of their hands?

The six major UK mobile operators have just signed a code of practice with the aim of protecting children under 18 from accessing adult internet content using their mobiles. What does this mean and what gaps does it leave? Nick Outteridge of content-filtering specialist SurfControl reports.

As new 2.5G and 3G mobile phones become pocket-sized Web browsers, the mobile industry has been quick to address growing concerns about children having access to adult and other inappropriate content from their handsets.

Recently the six major mobile operators in the UK (Orange, O2, T-Mobile, 3, Vodafone and Virgin) have joined forces and signed up to a Code of Practice, which will impose an “18” classification, along the lines of the systems used in film and TV, on adult content including images, video, gambling, games, chatrooms and Web access.

This means Web content with an “18” certificate will only be available when the network operators can verify the age of the user. Identification is expected to be accomplished by managing individual user profiles, accessed by username and password. In addition, parents and carers will be able to use filters to restrict content delivery to children in their care.
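By way of illustration, the logic behind that age check amounts to a simple profile lookup before any “18”-classified content is served. The sketch below is much simplified; the profile store, field names and content categories are assumptions made for the example, not any operator's actual design.

```python
# Illustrative sketch of profile-based age verification for "18"-rated content.
# The profile store, field names and category list are assumptions made for
# the example, not any operator's actual implementation.

ADULT_CATEGORIES = {"adult", "gambling", "unmoderated-chat"}

# In practice the profiles would live inside the operator's authentication
# infrastructure rather than in application code.
PROFILES = {
    "alice": {"age_verified": True,  "parental_filter": False},
    "bobby": {"age_verified": False, "parental_filter": True},
}

def may_access(username: str, category: str) -> bool:
    """Allow a request unless it is '18'-rated and the user is unverified
    or covered by a parental filter."""
    profile = PROFILES.get(username)
    if profile is None:
        return False                 # unknown user: block by default
    if category not in ADULT_CATEGORIES:
        return True                  # unrestricted content
    return profile["age_verified"] and not profile["parental_filter"]

print(may_access("alice", "gambling"))   # True
print(may_access("bobby", "gambling"))   # False
```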

While these measures go some way toward addressing the issues of content filtering and parental controls, there are still a number of key questions which are as yet unanswered:

• How fast can Websites be categorised to conform to the Code?
• How many sites will be categorised when filtering services are launched?
• Will categorisation be able to keep up with the proliferation of new Websites?
• How will the filtering and control solution work?
• How quickly can it be deployed to deliver protection?
• Will there be a performance penalty for using these controls?
 
To address these questions, let’s look first at how the mobile operators’ Code of Practice intends that sites should be identified and categorised, and compare this method with accepted practice in the corporate and home Web access environments.

Labelling a moving target

The Code of Practice states that an independent authority will be established to decide on standards for the website classification system, in a similar way to other parts of the media industry. This authority will in turn be regulated by ICSTIS (the Independent Committee for the Supervision of Standards of Telephone Information Services). However, classifying many millions of websites is not a simple or quick task.
 
The authority will require a significant lead-time to develop a list of adult sites that can be used meaningfully to enable filtering. Furthermore, the Internet is not a static medium. Websites are launched, closed down, change names and move IP address at bewildering speed — which means that many sites need to be re-checked and reclassified, further complicating the job of the authority.

Self-labelling options

An often-discussed classification option is self-labelling of Web pages by Webmasters, using the Internet Content Rating Association (ICRA) system. However, only 120,000 websites have chosen to self-label, and many sites carrying adult or inappropriate content have no interest in being seen as reputable or in maintaining a clean public profile. Self-labelling is therefore limited in scope and effectiveness, and needs to be supported by other filtering methods.
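For completeness, here is a minimal sketch of how a filter might at least detect whether a page carries a self-label at all. It assumes the label is delivered as a PICS-style meta tag, as ICRA labels commonly were, and it deliberately stops short of interpreting the rating vocabulary itself.

```python
# Illustrative sketch: detect whether an HTML page declares a PICS/ICRA
# self-label. It assumes the common <meta http-equiv="pics-label"> convention
# and does not attempt to interpret the rating itself.
from html.parser import HTMLParser

class PICSLabelDetector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.has_label = False

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and (dict(attrs).get("http-equiv") or "").lower() == "pics-label":
            self.has_label = True

def is_self_labelled(html: str) -> bool:
    detector = PICSLabelDetector()
    detector.feed(html)
    return detector.has_label

page = '<html><head><meta http-equiv="pics-label" content="(pics-1.1 ...)"></head></html>'
print(is_self_labelled(page))   # True
```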

Artificial intelligence

Artificial intelligence (AI) technology is a commonly used, and powerful, tool for categorising new websites. Proponents of AI claim that it can analyse a website with no latency, even under load, and can achieve 99.7% accuracy.

However, the reality is different. To assess the category of a website accurately, the site must be analysed in depth, and in-depth analysis cannot be done with zero latency. The only way to be fast is to look at meta-tags or skim the home page for keywords; this improves performance, but produces false positives.
AI has been used for many years to assist in categorisation, typically offline, supporting human Internet researchers in analysing sites. It works by ‘spidering’ a site, bringing back a minimum of ten pages and then analysing the data to determine a category, a far more in-depth approach. So AI certainly has a role to play, but it demands sufficient data to analyse.
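To make the offline approach concrete, the sketch below spiders a handful of pages from a site and scores the combined text against per-category keyword lists. The keyword lists, scoring rule and threshold are simplifying assumptions; a production classifier is far more sophisticated.

```python
# Illustrative sketch of offline, AI-assisted categorisation: spider a handful
# of pages from a site, then score the combined text against per-category
# keyword lists. The keyword lists, scoring rule and threshold are simplifying
# assumptions, not a description of any commercial classifier.
import re
from collections import Counter
from urllib.parse import urljoin
from urllib.request import urlopen

CATEGORY_KEYWORDS = {
    "gambling": {"casino", "poker", "betting", "odds"},
    "adult":    {"xxx", "explicit", "escort"},
}

def fetch(url: str) -> str:
    """Fetch a page as text; any failure simply yields an empty string."""
    try:
        with urlopen(url, timeout=5) as response:
            return response.read().decode("utf-8", errors="replace")
    except Exception:
        return ""

def spider(start_url: str, max_pages: int = 10) -> str:
    """Crawl up to max_pages pages reachable from start_url, returning combined text."""
    seen, queue, text = set(), [start_url], []
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        page = fetch(url)
        text.append(page)
        # Naive link extraction; a real spider would restrict itself to the same site.
        for href in re.findall(r'href=["\'](.*?)["\']', page):
            queue.append(urljoin(url, href))
    return " ".join(text).lower()

def categorise(start_url: str) -> str:
    words = Counter(re.findall(r"[a-z]+", spider(start_url)))
    scores = {category: sum(words[w] for w in keywords)
              for category, keywords in CATEGORY_KEYWORDS.items()}
    best, hits = max(scores.items(), key=lambda item: item[1])
    return best if hits >= 5 else "uncategorised"
```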

An alternative approach

So what’s the alternative? In the corporate and home computing sectors, the most effective, proven and fastest method of filtering and blocking website access is by reference to an established database of ready-categorised Internet addresses, or URLs.

As an example, SurfControl’s database comprises over six million URLs and more than one billion web pages divided into 40 categories. Over 35,000 new sites are categorised and added to the database each week.

This gives a comprehensive picture of Internet content, and allows rapid decisions on whether to allow access to, or block, a request for a Web page. Database look-ups of this kind offer the highest performance for the user, and the lowest latency.
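In code terms, the database approach reduces each request to a straightforward lookup. The sketch below is illustrative only; the tiny in-memory dictionary, the example hostnames and the blocked-category set stand in for a full commercial database with path-level entries and continuous updates.

```python
# Minimal sketch of filtering against a database of pre-categorised URLs.
# The dictionary, hostnames and blocked-category set are placeholders for a
# far larger, continuously updated commercial database.
from urllib.parse import urlsplit

URL_DATABASE = {
    "example-casino.com": "gambling",
    "example-news.com":   "news",
}

BLOCKED_FOR_UNDER_18 = {"adult", "gambling", "unmoderated-chat"}

def lookup_category(url: str) -> str | None:
    host = urlsplit(url).hostname or ""
    return URL_DATABASE.get(host.lower())

def allow(url: str) -> bool:
    category = lookup_category(url)
    if category is None:
        return False                 # unknown site: treat as blocked until categorised
    return category not in BLOCKED_FOR_UNDER_18

print(allow("http://example-news.com/today"))     # True
print(allow("http://example-casino.com/poker"))   # False
```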

This approach can be easily migrated to the mobile environment without the need for powerful computing resources on the device itself. In fact, there is no need for any software on the mobile phone. All that is needed is the 2.5G or 3G Internet connection, and a categorising server on the Internet, linked to the mobile operator’s Web authentication infrastructure.

Cascading analysis

Before Web access is granted from the phone, users first log in with their username and password. Their requests for URLs are then intercepted and handled by the categorising server. To speed response and minimise latency, the server has two caches: a custom cache and a dedicated URL cache. These hold regularly requested URLs, suitably categorised, and can also hold other data, such as ICRA-labelled sites. If the requested URL is not found in either cache, the request cascades to a check against the full URL database.

If the URL is still not found, it is subjected to on-the-fly AI analysis and categorisation. The point of this “cascading” approach to analysing URLs is that it combines comprehensive filtering and profiling with the appropriate level of performance at every stage.
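The cascade itself can be expressed very compactly. In the sketch below, the cache objects, full_database_lookup() and analyse_on_the_fly() are hypothetical stand-ins for the real components described above.

```python
# Sketch of the cascading lookup described above: check the fast caches first,
# then the full URL database, and only fall back to on-the-fly analysis when
# both miss. Every name here is a hypothetical stand-in.

custom_cache: dict[str, str] = {}   # operator-specific entries, e.g. ICRA-labelled sites
url_cache: dict[str, str] = {}      # regularly requested URLs, already categorised

def full_database_lookup(url: str) -> str | None:
    """Placeholder for a lookup against the full database of categorised URLs."""
    return None

def analyse_on_the_fly(url: str) -> str:
    """Placeholder for real-time AI categorisation of a previously unseen URL."""
    return "uncategorised"

def categorise_request(url: str) -> str:
    # 1. Fast path: the two caches held close to the authentication infrastructure.
    for cache in (custom_cache, url_cache):
        if url in cache:
            return cache[url]
    # 2. The full database of pre-categorised URLs.
    category = full_database_lookup(url)
    # 3. Last resort: analyse the site on the fly, and cache the result.
    if category is None:
        category = analyse_on_the_fly(url)
    url_cache[url] = category
    return category
```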

The Code of Practice has gone part of the way towards giving customers more protection, but the journey will not be smooth, given the complex nature of the Web and of mobile access. Increasingly, mobile service providers need to take up the challenge of responsive content management, and those that do will be far better placed for the future.