The subject of this paper is the incorporation of temporal semantics into database rules and how the syntax needed to support this might be reconciled with the evolving SQL standard. A working syntax, free of any limitations of the SQL standard, is defined and presented and an operational definition for this syntax is evolved through the application of this working syntax. A graphical representation of the domain is given using an adapted object-oriented modelling. When an attempt was made to re-cast our working syntax into SQL, a satisfactory mapping which would have succeeded in preserving the semantics of the original could not be achieved. This suggests a shortcoming in the current standards. Support for time-based triggers; cyclic operations; delayed actions and rule lifetimes necessitated the development of appropriate modifications to the basic SQL3 draft syntax. The proposed extensions capture all of the semantics required for the specification of time-based rules.
The problem of (semi-)automatically turning text into hypertext is one that has been identified as important to the growth and development of hypertext as a way of organising information. In this paper we describe an approach we have developed to semi-automatically generate a hypertext from linear texts. This is based on initially creating nodes and composite nodes composed of mini-hypertexts. Following this we then compute node-node similarity values using standard information retrieval techniques. These similarity measures are then used to selectively create node-node links based on the strength of similarity between nodes. What makes our process novel is that the link creation process also uses values from a dynamically computed metric which measures the topological compactness of the overall hypertext being generated. Thus link creation is a selective process based not only on node-node similarity but also on the overall layout of the hypertext. Experiments on generating a hypertext from a collection of 846 software product descriptions comprising 8.5 Mbytes of text are described. Our experiments with a variety of IR techniques and link creation approaches yield some guidelines on how the process should be automated. Finally, this text to hypertext conversion method is put into the context of an overall hypertext authoring tool currently under development.
The application of natural language processing tools and techniques to information retrieval tasks has long since been identified as potentially useful for the quality of information retrieval. Traditionally, IR has been based on matching words or terms in a query with words or terms in a document. In this paper we introduce an approach to IR based on computing a semantic distance measurement between concepts or words and using this word distance to compute a similarity between a query and a document. Two such semantic distance measures are presented in this paper and both are benchmarked on queries and documents from the TREC collection. Although our results in terms of precision and recall are disappointing, we rationalise this in terms of our experimental setup and our results show promise for future work in this area.
A variety of methods for speeding up the response time of Information Retrieval processes have been put forward and one of these is the idea of thresholding. The basic assumption behind the concept of thresholding is that in information retrieval storage structures, data can be organised in such a way so as to allow cut-off points to be employed during processing. These cut-off points or thresholds are designed and used to reduce the amount of information to be processed while maintaining the quality or allowing minimal degradation of the response to the users query. In this paper we report experiments we have carried out with a portion of the TREC data where we introduced some features into the retrieval process designed to improve response time and we found that these features would significantly improve response time while maintaining the same level of retrieval effectiveness.
In this paper we discuss the implementation and design of an automatic word sense disambiguator. This semantic tagger is used in an overall Knowledge Based Information Retrieval (KBIR) system which uses a knowledge base derived from WordNet and two independent semantic similarity estimators. The KB is used as a controlled vocabulary to represent documents and queries and the semantic similarity estimators are employed to determine the degree of relatedness between these KB representations.
The process of arriving at a KB representation of documents and queries involved locating KB nodes equivalent to index and query terms. It was found that on average, 72% of index terms had more than one KB sense and the semantic tagger was subsequently developed to determine the most likely sense in these cases. The surrounding words in the text were used as a context when disambiguating ambiguous terms. the tagger itself is made up of four independent components, each of which allocate scores to the senses of ambiguous terms. The sense(s) with the highest scores are deemed the "winners". Evaluation of the semantic tagger has so far been within the context of results from the KBIR system, however an independent evaluation using a manually tagged corpus is currently taking place.
Despite the advantages associated with the use of formal (mathematical) techniques for the specification of computerised systems, industry has been slow to abandon existing informal techniques. A suggested solution to this situation involves integrating informal and formal techniques within the development process. Such an integration would allow the advantages of both techniques to be exploited. This paper describes a method for integrating the widely used models of Real Time Structured Analysis with the formal specification technique LOTOS. The approach suggested uses the Real Time Structured Analysis specification as a cognitive vehicle for producing a LOTOS specification, and is illustrated by means of a small case study.
Steganography is the science of hiding data in otherwise plain text or images. This document is split into two sections. Section one gives a brief history of the use of steganography by mankind. It describes interesting events in history where steganography was used to great or sometimes disastrous effect. Section two describes some applications that exist on computer which apply steganographic techniques to allow a user to hide information in data files.
It is well-known that there are many problems inhibiting the spread of hypertext and hypermedia and among those problems there is the cognitive overload or strain on the authors of such material. In addition to authoring the content of each node in a hypertext, an author is responsible for creating information links and link anchors while also having to maintain an overall balanced hypertext document. Our recent work at Dublin City University has addressed this aspect of the authoring problem by investigating whether the values of graph topology metrics can be used as an aid to a hypertext author in maintaining an overall balanced document. In this position paper we report on the progress we have made to date towards the goal of realising such an aid. As an exercse we have applied the compactness metric to a number of hypertexts which have been hand-cracted, constructed semi-automatically or generated in a completely automatic way and each of these hypertexts has been created for different purposes. The results of these exercises are reported here. Some of these documents have been marked up in HTML and can thus form part of the World Wide Web.
Institutions are abstract frameworks used for the description of formal
systems; one common application is giving formal semantics for
languages used in formal specification. This facilitates comparison
and integration of specifications written using these formalisms.
Two of the main formalisms used in the description of programming
language syntax are regular expressions and context-free grammars. In
this report we treat these as separate specification formalisms, and
construct an institution for each.
Building on previous work in programming language syntax, this paper gives institutions for some formalisms used in the description of programming language semantics. Specifically we construct an insitution for attribute grammars and van-Wijngaarden grammars, and examine the possibility of extending the framework to include denotational- and axiomatic-style definitions.
This paper attempts to asess the potential of back-propagation neural networks in predicting spot foreign exchange rates. A new and more recently developed technique is also introduced an appraised: genetic programming based symbolic regression. both techniques are used to predict 1,2,3,4 and 20 day ahead spot U.S. Dollar/ Japanese Yen exchange rates using four years of daily sampled values taken from the period 1990-1994.
The system propsed here is intended to address two aspects of a police investigation. The first is the need for a facility to extract and compare information and the second is the necessity for an advanced text retrieval tool.
Check-out Check-in Replication is an attempt to address the need of the more volatile mobile computing environment. It allows data to be 'checked-out' of a server database and down-loaded onto a PC or lap-top. While the data is checked-out, it can be independently updated both at the Server site and on the PC (or 'Client' site). At some stage, the data is later 'Checked-In' again. The check-in process updates the Server database with any changes made to the checked out copy. The aim of this project is to simulate this situation by building and testing a prototype implementation.
This paper examines how best to carry out software cost estination. The aim of the paper is to recommend a cost estinating methodology suitable for a medium sized software house which develops and maintains customised packages. Thus the focus is on a stable environment where there is a lot of relevant expertise, and many projects are similar in nature.
This paper presents a formal approach to the co-design of hybrid systems based on object-oriented analysis and design, and the formal description languages VHDL and SDL. This methodology covers the whole development process from requirements capture, through design and implementation, to validation. The paper also presents some of our experiences todate with the methodology.
This work presents a method by which hypertexts may be initially authored by presenting anchors as a list of nodes separate from the information content of the nodes and subsequently automatically resolving locations for replacement anchors which form part of the actual information content of the nodes thus allowing the cognitive overload problem for the author to be eased while not compromising the useability of the final product.
In the work we report here we are concerned with delivering information
retrieval applications to users where our over-riding constraints are
for handling large volumes of domain independent texts. Our
observations from published work and our own efforts previously
reported are that NLP techniques, beyond simple tagging and
morphological analysis, do not yet offer enhancements to retrieval
effectiveness and certainly not to retrieval efficiency when dealing
with large volume, domain-independent applications. With this in mind
we have adopted a philosophy of using NLP resources rather than NLP
processes in our work. Here we outline details of 4 of our ongoing
projects, retrieval on +2Gbytes English texts, retrieval on +200
Mbytes Spanish newspaper texts, retrieval on +3000 multimedia images
and information filtering of USENET NeWS articles.
This article appears in the proceedings of IA95, June 1995,
Montpellier, France
Techniques are described to speed up the calculation of modular products, which are required at the heart of software implementations of many cryptographic protocols such as the well-known RSA Public Key algorithm. By combining ideas from Comba and Dusse and Kaliski, and exploiting Montgomery's method for modular reduction, very efficient machine code routines can be generated. The method is illustrated for the popular 80386/80486/Pentium Family of processors, but uses only a very small subset of the instruction set, and so can easily be adapted to many other conventional processors with a word length of 16 bits or more.
We address the problems associated with storing and retrieving multimedia presentations where the multimedia objects making up the presentation are stored in a distributed manner on a network of computers. In order to deliver multimedia presentations to users in an efficient manner, quality of service (QoS) parameters for various components like networks, disks, etc, need to be defined before a multimedia presentation is launched. It is our hypothesis that a multimedia presentation authoring tool should use an internal representation known as a coloured petri net, in order to represent the synchronisation and the spatial aspects of a multimedia presentation. The CPN representation for a presentation would be represented internally only and need not be visable. In this report we outline how some commonly used multimedia sychronisation features can be represented using CPNs and we discuss what kinds of persistent storage would be needed for this.
This report describes work done as part of the TREC-4 benchmarking exercise by a team from Dublin City University with help from the CLIPS-IMAG laboratory while the first author was on sabbatical leave at the CLIPS-IMAG laboratory in Grenoble, France. In the first section we briefly outline what TREC is and what its modus operandi is and following that we describe the experimental environment in which we worked. The bulk of the work reported is based on an automatic expansion of the original query using a knowledgebase derived from WordNet. The knowledgebase, WordNet and the previous applications we have used this for are presented also. In section 4 we give a chronological presentation of the query expansion methods we tried and the experimental results, vis-a-vis retrieval effectiveness, we obtained. This presentation is accompanied by an analysis of results as they are presented and in a concluding section we analyse and try to explain the results we obtained.
Discourse Representation Theory (DRT) developed by Hans Kamp (Kamp 1981), works on a discourse processing it sentence by sentence building on what has gone before to produce a semantic representation, the Discourse Representation Structure (DRS). It is a departure from traditional methods which treat sentences in isolation giving each sentence a separate semantic representation, and so fail to find antecedents for anaphoric pronouns within a discourse. In DRT anaphora are analysed not as a relationship between pronouns and other NPs, but as one between pronouns and the discourse referents that are already present in the semantic representation under construction. In this paper the basic mechanisms, construction rules and antecedent prediction methods of DRT will be discussed.
In this paper we describe work done as part of the TREC-4 benchmarking exercise by a team from Dublin City University. In TREC-4 we had 3 activities as follows:
Recent advances in mathematical programming include the improved performance of Integer Programming (IP) using strong Linear Programming formulations. One focus is on the area of reformulation and preprocessing of the LP relaxation. Preprocessing refers to elementary operations that can be performed automatically to tighten a given formulated model. These strategies can lead to vast improvements in solution times of IP programs. The aim of this paper is two fold: Firstly we present a brief survey of the early efforts on the search for the general solution of IP and MIP problems by the cutting planes approach. We give a brief description of the general valid inequalities as derived for this purpose. Secondly, we present procedures that transform a given formulated model automatically into a tighter equivalent representation (preprocessing). We implement these procedures and present computational results.