Chapter 3

Requirements and Methods of Investigation




Introduction

This chapter describes the research undertaken for the project and its results. It explains the methods used and why they were chosen, and presents the conclusions drawn from the research, which lead to a full list of requirements for the system. The chapter is divided into three sections:

3.1 Method of Investigation
3.2 Research
3.3 Requirements


3.1. Method of Investigation

The main source of information used while researching this project was the Internet. This is primarily because the project is to develop a web application, and much of the information about the World Wide Web is available on the web itself. Most charities for the blind also have web pages, which were useful for obtaining information about the rights of visually impaired people and the services that exist to support them.

During the research, someone who works for a talking newspaper for the blind, which is distributed on audio cassette in the Luton area, was consulted about the project. He agreed to put a message in the next edition of the newspaper asking any visually impaired computer users who were interested in helping with the project to get in contact. One visually impaired person did so, and discussed his ideas about the project. He suggested information he would like to see included in the newspaper, as well as some ideas on how the system could work. He also gave valuable feedback on various ideas about the system’s operation. As many of his suggestions as possible have been used in the course of developing the system.

While developing the project itself, books were used to aid the design [CARROL 89] and implementation [PETZOLD 89] of the system. Notes from university lectures, and sections of code developed as part of exercises set in lectures, were also a valuable resource.


3.2. Research

In the course of researching the project, a great deal of information was found about the systems in place to help visually impaired people get access to news and information. The ways in which computers are made accessible to visually impaired users, for example through screen readers, were also investigated.

The design of popular news web pages was also investigated. It was discovered that almost all such pages use tables to create complex page layouts, and that screen readers and talking browsers have trouble conveying the information in them in an intelligible manner. Other HTML tags commonly used in web pages, and how they are used to structure pages, were also researched.

The official HTML 4.0 specification [HTML] has guidelines for “non-visual user agents” such as the one this project aims to produce. Unfortunately, many of them rely on page authors having followed certain guidelines when writing their pages, and most (if not all) of the pages examined during research failed to follow these guidelines.


3.3. Requirements

This section will outline the main requirements for the system to be developed, divided into sections representing different aspects of its operation. The requirements were obtained by investigation of existing systems, discussions with a potential user, and the author’s ideas about how the system should operate.

It was decided that the best way to present the information in a page to the user simply and understandably was to remove all the parts of the page that are not necessary for understanding the content, such as adverts and navigation menus. This is much simpler than trying to infer the meaning and importance of the different sections in a page (in most cases this is impossible).
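The idea of removing unwanted parts of a page can be illustrated with a minimal sketch. It assumes, purely for illustration, that the sections to ignore can be identified by the `id` attribute of an HTML element; how sections are actually identified is not specified here, and the names `SectionStripper` and `strip_sections` are hypothetical.

```python
from html.parser import HTMLParser


class SectionStripper(HTMLParser):
    """Collect page text, skipping elements whose id is in ignore_ids.

    Assumption: unwanted sections are marked by an id attribute.
    """

    def __init__(self, ignore_ids):
        super().__init__()
        self.ignore_ids = set(ignore_ids)
        self.skip_tag = None   # tag name of the element being skipped
        self.skip_depth = 0    # nesting depth of that tag name
        self.chunks = []       # pieces of text that are kept

    def handle_starttag(self, tag, attrs):
        if self.skip_tag is not None:
            # Already skipping: only track nested tags of the same name.
            if tag == self.skip_tag:
                self.skip_depth += 1
        elif dict(attrs).get("id") in self.ignore_ids:
            self.skip_tag = tag
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if self.skip_tag is not None and tag == self.skip_tag:
            self.skip_depth -= 1
            if self.skip_depth == 0:
                self.skip_tag = None

    def handle_data(self, data):
        # Keep non-empty text only when outside an ignored section.
        if self.skip_tag is None and data.strip():
            self.chunks.append(data.strip())


def strip_sections(html, ignore_ids):
    """Return the text of html with the ignored sections removed."""
    parser = SectionStripper(ignore_ids)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

For example, calling `strip_sections(page, ["nav", "ad"])` on a page whose menu and advert are wrapped in elements with those ids would return only the remaining text.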

A number of aspects of how the system could work became clear as research progressed. The system would have to be maintained from a central location, because producing a completely automated system for finding information on the Internet would be extremely difficult. Central maintenance requires a database of the pages that can be used in producing the newspaper. It must contain not only the addresses of the pages, but also details of how they can be converted into text that a screen reader can present intelligibly. Attempting to devise algorithms to extract relevant information from a web page automatically is beyond the scope of this project (if it is possible at all).
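As a rough illustration of what one entry in such a database might hold, the sketch below models a page record using the fields listed in figure 3.3a. The class name, field names, types and example URLs are all assumptions made for the example, not the format actually used by the system.

```python
from dataclasses import dataclass, field


@dataclass
class PageRecord:
    """One entry in the page database (field names are hypothetical)."""
    page_id: int                   # unique ID number for the page
    description: str               # short description of the page
    url: str                       # address of the page
    categories: list = field(default_factory=list)       # categories the page is in
    ignore_sections: list = field(default_factory=list)  # sections to leave out
    sub_page_links: list = field(default_factory=list)   # which sub-pages to download
    sub_page_ignore: list = field(default_factory=list)  # formatting for sub-pages


def pages_in_category(database, category):
    """Return the records that belong to the given category."""
    return [page for page in database if category in page.categories]


# Example data; the URLs are placeholders, not real sources.
database = [
    PageRecord(1, "National headlines", "http://example.com/news",
               categories=["News"], ignore_sections=["nav", "ad"]),
    PageRecord(2, "Football results", "http://example.com/sport",
               categories=["Sport", "News"]),
]
```

Keeping the formatting details alongside the address means the maintainer can correct how a page is converted without any change to the application itself.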

In order to allow the user to browse the newspaper as a sighted user would, it has to be divided into sections so that the user can choose to read certain sections and skip others. To achieve this the newspaper has to be structured as a hierarchical document.
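A hierarchical document of this kind can be pictured as a tree of sections. The following is only a sketch of one possible representation; the `Section` class and `outline` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """A section of the newspaper, which may contain sub-sections."""
    title: str
    text: str = ""
    children: list = field(default_factory=list)


def outline(section, depth=0):
    """Return the section titles as an indented list, one per line."""
    lines = ["  " * depth + section.title]
    for child in section.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Example newspaper with two top-level sections.
paper = Section("Newspaper", children=[
    Section("News", children=[Section("Story 1", "Text of the story.")]),
    Section("Sport"),
])
```

With this structure the user can be offered the list of titles at one level and can choose whether to descend into a section or skip it.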

In order to get the important information from a web site, more than just the main page is needed. Generally, the main page is a contents page containing links to each item or news story. It was decided that by getting the main page, and following certain links on it, enough information could be retrieved from the site. Some experiments were done with following links to a greater depth, i.e. following the links on the main page, then the links on each of the linked pages, and so on. This proved to be very complicated, and it was decided that the extra information gained was negligible.
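The approach of following only certain links on the main page can be sketched as below. The rule used to select links (a URL prefix) is an assumption made for the example, since the actual selection rule is part of the page database; `LinkCollector` and `sub_page_urls` are hypothetical names, and the sketch works on HTML that has already been downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every link on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the main page address.
                self.links.append(urljoin(self.base_url, href))


def sub_page_urls(main_html, base_url, wanted_prefix):
    """Return the links to follow, one level deep, without duplicates.

    Assumption: a link is wanted if its URL starts with wanted_prefix.
    """
    collector = LinkCollector(base_url)
    collector.feed(main_html)
    collector.close()
    wanted = []
    for url in collector.links:
        if url.startswith(wanted_prefix) and url not in wanted:
            wanted.append(url)
    return wanted
```

Because the links found on the sub-pages are never followed, the amount downloaded stays proportional to the number of stories on the main page.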

The name “Web News Speak” was chosen for the application, and it will be referred to as such in the remainder of this document.

A formal list of requirements is shown in figure 3.3a.

Page Database
  • The pages that can be included in the newspaper should be divided into categories to make browsing them easier.
  • All the information about the pages that can be included in the newspaper must be maintained in a “page database”. The information maintained should include:
      • A unique ID number for the page.
      • A short description of the page.
      • The URL of the page.
      • Categories that the page is in.
      • Page formatting information (including which sections to ignore).
      • Sub-page information: which sub-pages to download.
      • Formatting information for sub-pages.
  • An up-to-date copy of the page database must be kept on the UMIST server, and should be maintained by someone at UMIST.
  • A copy of the page database should be stored locally on the user’s machine.
  • The user must be able to download the newest version of the page database at any time from the UMIST server.
User Interface
  • The system must be completely controllable using the keyboard, in a way that is standard for all Windows applications.
  • The user interface to the system should be as simple as possible to avoid confusion.
  • The system should attempt to follow all appropriate Windows conventions in its interface, so that it is compatible with screen readers.
  • There must be a complete set of online user instructions that are easily accessible from the main menu of the application.
Downloading Pages
  • The system should have a log file that stores information about the pages downloaded and problems encountered. This is to allow easier debugging of the application. The log file can be included in any bug reports that come from users to help the person maintaining the system to track down the error.
  • The system must be able to be put into “sleep mode”, in which it remains idle until a predetermined time is reached and then initiates the download process.
  • The system should be able to download the pages automatically on demand without any user intervention.
  • The user must be able to cancel the process of downloading the pages from the Internet at any time.
  • The system should download only the information required for the newspaper, and should minimise the amount of time spent connected to the Internet.
System Output (i.e. the newspaper itself)
  • The newspaper must be a browsable document, split into sections so that the user can choose which sections to read, and in what order they want to read them.
  • The user should be able to specify the font size and other characteristics of the newspaper produced.
  • The newspaper must contain only text, with no images or complex formatting (e.g. tables or frames).
  • When a page cannot be downloaded for some reason, information about the error that occurred should be included in the relevant section of the newspaper so that the user can report the problem to the person maintaining the system.
  • The newspaper should not contain any special characters that may not be understood by a screen reader, e.g. & or £, but should have the textual equivalent instead – ampersand, pound.
Figure 3.3a - Requirements

Where the word “should” is used, the requirement is desired but not absolutely necessary. Where the word “must” is used, the requirement is necessary for the system’s operation.
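The requirement in figure 3.3a that special characters be replaced by their textual equivalents could be met along the following lines. This is a minimal sketch: the table contains only the two characters given as examples in the requirement, and the names `SPOKEN_EQUIVALENTS` and `speakable` are hypothetical.

```python
# Only the two characters named in the requirement are listed here;
# a real table would need to cover more characters.
SPOKEN_EQUIVALENTS = {"&": "ampersand", "£": "pound"}


def speakable(text):
    """Replace special characters with words a screen reader can speak."""
    for char, word in SPOKEN_EQUIVALENTS.items():
        text = text.replace(char, f" {word} ")
    # Collapse the extra spaces introduced by the replacements.
    # Note that this also turns line breaks into single spaces.
    return " ".join(text.split())
```

The replacement is applied to the text after the page has been converted, so it does not interfere with character entities in the original HTML.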