Intelligent Software Agents in Knowledge Management

The goal of this article is to introduce the reader to the concepts and theory behind the knowledge management process and to show how intelligent software agents can help manage that process.

Before we get into the crux of this article, it is important to understand why this topic matters. Various types of decision support systems (DSS) are widely used in enterprises today. Organizations might need an enterprise information system (EIS) that not only caters to the needs of management but also serves the rest of the enterprise. Those needs are defined by the role each person plays. For example, an EIS can be used to check demand and match it closely to supply, or to perform forecasting based on trends or patterns in the data. The validity of the results depends on the knowledge that was used to produce them.

The key to the success of a DSS is access to a reliable, valid and growing knowledge base. Human actors will often enter knowledge into the knowledge base directly, but that is not always feasible and is sometimes not possible at all. This is where intelligent software agents can help.

Intelligent software agents, in the context of knowledge management, are automated software modules that act on behalf of the knowledge management system to collect knowledge, validate it, organize it and then add it to the knowledge base.

Knowledge

Typical production databases are transactional in nature. They can be seen as the databases of operations, where all business transactions take place. Orders are placed, inventory is tracked, and customers are managed, among other activities. Here the emphasis is on managing data, and the challenge is to provide efficient and reliable access to it. Data is often organized into tables to form meaningful information.

But there is a parallel requirement in many large enterprises to have a different view of the data: a view that is used by upper management (and others) to track sales, to forecast trends (such as demand and supply), to troubleshoot specific performance problems, and so on. Here the emphasis is not on pure data. Instead, what is needed is to consolidate the information from the many databases spread across the organization and bring it together to provide what we term knowledge. Knowledge gives meaning, or more substance, to data so that decisions can be made using it.

Knowledge Base

Knowledge is typically collected and organized into a knowledge base (similar to how data is stored in databases). These knowledge bases are often implemented as data warehouses and data marts, and they are separate from the operational databases. In fact, this separation should be a requirement for your knowledge base. Some of the activities that run on a knowledge base can be very intensive and could slow down an already fully loaded operational database.

Typically the knowledge base is another DBMS that caters purely to the knowledge management subsystem. This could be an RDBMS (such as Oracle or DB2), or you can even use an XML-enabled database.

Knowledge Management Process

Knowledge management is the process of collecting, rearranging and validating data to produce knowledge. Knowledge can be gathered from various sources, such as:

Human actors using the system.
Automatic feeds from internal or external sources.
Periodic manual feeds from internal or external sources.
Random feeds from internal or external sources.
Knowledge engineers working in tandem with experts.
Experts.

Knowledge can be gathered from any of the above sources, or from all of them. Here is a simple checklist to keep in mind when gathering knowledge. Assume the system is getting a new feed of data that is to be entered into the knowledge base. In a good knowledge management process:

It is important to identify knowledge sources and to verify that the data is coming from those reliable sources (capture phase).
The data should adhere to pre-defined formats (refine phase).
The data should follow conventions for any business rules that have been previously identified (refine phase).
The data needs to be massaged into a form that can be understood by the knowledge collection subsystem (refine phase).
The massaged data needs to be validated to check for correctness and verified to check that it follows all the pre-defined business rules (validate phase).
Finally, the newly constructed knowledge should be added to the knowledge base (store phase).
The process should be able to react appropriately in case of errors in the feed.
When knowledge is requested, it will be retrieved and, if needed, consolidated with other knowledge and returned (disseminate phase).
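
The checklist above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the names KnowledgeBase, TRUSTED_SOURCES and REQUIRED_FIELDS, the record fields, and the sample business rule are all hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass, field

# All names here (TRUSTED_SOURCES, REQUIRED_FIELDS, the record fields) are
# hypothetical and chosen only for illustration.
TRUSTED_SOURCES = {"orders-feed", "inventory-feed"}
REQUIRED_FIELDS = ("sku", "quantity")


@dataclass
class KnowledgeBase:
    entries: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def store(self, item):
        # Store phase: add the newly constructed knowledge.
        self.entries.append(item)

    def disseminate(self, sku):
        # Disseminate phase: retrieve and consolidate entries for one SKU.
        matching = [e for e in self.entries if e["sku"] == sku]
        return {"sku": sku, "total_quantity": sum(e["quantity"] for e in matching)}


def capture(source, record):
    # Capture phase: accept data only from known, reliable sources.
    if source not in TRUSTED_SOURCES:
        raise ValueError(f"untrusted source: {source}")
    return dict(record)


def refine(record):
    # Refine phase: enforce the expected format and normalize values.
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {"sku": str(record["sku"]).strip().upper(),
            "quantity": int(record["quantity"])}


def validate(item):
    # Validate phase: apply a sample business rule.
    if item["quantity"] < 0:
        raise ValueError("quantity must not be negative")
    return item


def process_feed(kb, source, records):
    for record in records:
        try:
            kb.store(validate(refine(capture(source, record))))
        except ValueError as err:
            # React to errors in the feed instead of aborting the whole run.
            kb.rejected.append((record, str(err)))


kb = KnowledgeBase()
process_feed(kb, "orders-feed", [
    {"sku": " ab-1 ", "quantity": "3"},
    {"sku": "ab-1", "quantity": 4},
    {"sku": "zz-9", "quantity": -1},   # violates the business rule
    {"sku": "no-qty"},                 # violates the format
])
summary = kb.disseminate("AB-1")
```

Bad records are set aside with the reason for rejection rather than stopping the feed, which is one possible reading of "react appropriately"; a real system might instead alert an operator or retry.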

An important function of a knowledge management process is to continuously grow its knowledge base (following the process we outlined above). An inference engine will use this knowledge to provide value to a client. An inference engine is only as good as its backend knowledge base.

Very often it is not feasible to leave it to human agents to enter data into the knowledge base. Sometimes the volume of data is so large that manual entry is not possible. The information may also arrive periodically at predefined times, or on no schedule at all. This is where we can use intelligent software agents.

Intelligent Software Agents

Intelligent software agents are independent, autonomous software programs that gather knowledge by following the process we outlined earlier. In doing so they require no help from human agents; the process is completely automated. They are termed intelligent because they possess all the process information needed to read incoming information and convert it into knowledge to be stored in the knowledge base. These agents can be written in various programming languages such as C, C++ and Java. They can also be implemented using newer technologies such as Web Services.

Agents can be one of two types: static agents or mobile agents. Let's discuss each in some detail, along with how they can be used in the knowledge management process. There are other classifications of software agents, but for the purposes of this article we will concentrate on the static and mobile classifications only.

Static Agents

Static agents are so called because they do not move or relocate themselves from the computer that started them. If a particular computer starts a static agent, the agent will continue to run on that very computer throughout its life cycle.

The life cycle of a static agent can be better understood using the diagram below.

Initially there are no agents in the knowledge management system. Agents come to life either when the first knowledge gathering task is initiated or when a pre-configured number of agents is set up in a pool of free agents. When a new knowledge task arrives, either an available agent from the pool is allocated or a new one is created.

While the agent is running, it applies the knowledge process we outlined earlier. When the agent finishes execution, it is added back to the pool of available agents. Keeping a pool such as this can be useful in improving the reliability and scalability of the knowledge management system.
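
A minimal sketch of this life cycle follows. The StaticAgent and AgentPool classes are hypothetical names invented for illustration, and run() merely stands in for the knowledge process.

```python
from collections import deque


class StaticAgent:
    """Hypothetical agent; run() stands in for the knowledge process."""

    _next_id = 1

    def __init__(self):
        self.agent_id = StaticAgent._next_id
        StaticAgent._next_id += 1

    def run(self, task):
        return f"agent {self.agent_id} processed {task}"


class AgentPool:
    def __init__(self, preconfigured=0):
        # Optionally start with a pre-configured number of free agents.
        self._free = deque(StaticAgent() for _ in range(preconfigured))
        self.created = preconfigured

    def acquire(self):
        # Allocate a free agent if one exists; otherwise create a new one.
        if self._free:
            return self._free.popleft()
        self.created += 1
        return StaticAgent()

    def release(self, agent):
        # A finished agent goes back to the pool for reuse.
        self._free.append(agent)

    def submit(self, task):
        agent = self.acquire()
        try:
            return agent.run(task)
        finally:
            self.release(agent)


pool = AgentPool(preconfigured=1)
first = pool.submit("feed-A")
second = pool.submit("feed-B")   # reuses the same agent, no new one is created
```

This sketch is single-threaded; a pool shared between concurrent tasks would also need locking around acquire and release.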

The next question we need to ask is how clients call these agents. There are many ways this can be done, but let's discuss one very innovative method. Today Web Services is the “in-thing” in the tech world. Beyond the hype, this technology is extremely viable for implementing static intelligent agents. Web Services allow us to expose interfaces on existing or new business objects to our business partners.

Our static agent could be designed as a J2EE (or .NET) object running on a remote server. This object, though private to the server, exposes some of its interfaces using the web services suite of protocols (SOAP, WSDL, UDDI). Clients who need to feed in data can do so by calling the appropriate web service interfaces. They would provide the data as an XML document that conforms to a predefined XML Schema. Because of the XML Schema, data formats are strictly enforced and some amount of data validation is applied right at this step. This can save a considerable amount of processing on the server and can better facilitate the knowledge validation process.
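
To illustrate the idea of rejecting malformed feeds at the door, here is a rough sketch. It is a hand-written structural check and not real XML Schema validation, which would normally be done by a schema-aware parser; the feed layout (feed, item, sku, quantity) and the check_feed function are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed layout:
# <feed><item><sku>..</sku><quantity>..</quantity></item></feed>
REQUIRED_CHILDREN = ("sku", "quantity")


def check_feed(xml_text):
    """Return (items, errors) for a feed document.

    A hand-written structural check standing in for XML Schema validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [], [f"not well-formed: {err}"]
    if root.tag != "feed":
        return [], [f"unexpected root element: {root.tag}"]

    items, errors = [], []
    for index, item in enumerate(root.findall("item")):
        values = {name: item.findtext(name) for name in REQUIRED_CHILDREN}
        missing = [name for name, value in values.items() if value is None]
        if missing:
            errors.append(f"item {index}: missing {missing}")
            continue
        try:
            quantity = int(values["quantity"])
        except ValueError:
            errors.append(f"item {index}: quantity is not an integer")
            continue
        items.append({"sku": values["sku"].strip(), "quantity": quantity})
    return items, errors


good = "<feed><item><sku>AB-1</sku><quantity>3</quantity></item></feed>"
bad = ("<feed><item><sku>AB-1</sku></item>"
       "<item><sku>X</sku><quantity>n/a</quantity></item></feed>")
items, errors = check_feed(good)
bad_items, bad_errors = check_feed(bad)
```

Only documents that pass this check would be handed on to the agent, so the later validate phase can concentrate on business rules rather than on format.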

Static agents can also be of a reactive kind, reacting to new data that is added to a database or received by an application server. In such cases the application server or database can spawn a reactive static agent and delegate to it the task of knowledge processing. Alternatively, these agents can run in the background, always waiting for new data to come in. Once they detect new data, they read it and run the knowledge gathering process to move it into a separate knowledge base. Very often, though, knowledge creation is done at predefined times, maybe once a day. During that time slot the agent will read the operational database and process the new or updated data. For large amounts of highly volatile data this can be a very beneficial approach.
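
One simple way to detect new data is to remember the highest row id already processed and ask only for rows beyond it. The sketch below assumes a hypothetical orders table in an in-memory SQLite database standing in for the operational database; ReactiveAgent and its knowledge dictionary are illustrative names. It picks up newly inserted rows only, so updated rows would need a different mechanism such as a modification timestamp.

```python
import sqlite3


def setup_operational_db():
    # In-memory stand-in for the operational database (hypothetical schema).
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, sku TEXT, quantity INTEGER)"
    )
    return db


class ReactiveAgent:
    def __init__(self, operational_db):
        self.db = operational_db
        self.last_seen_id = 0   # high-water mark of rows already processed
        self.knowledge = {}     # stand-in knowledge base: total demand per SKU

    def poll(self):
        """Process rows added since the previous poll; return how many were new."""
        rows = self.db.execute(
            "SELECT id, sku, quantity FROM orders WHERE id > ? ORDER BY id",
            (self.last_seen_id,),
        ).fetchall()
        for row_id, sku, quantity in rows:
            self.knowledge[sku] = self.knowledge.get(sku, 0) + quantity
            self.last_seen_id = row_id
        return len(rows)


db = setup_operational_db()
agent = ReactiveAgent(db)
db.executemany(
    "INSERT INTO orders (sku, quantity) VALUES (?, ?)", [("AB-1", 3), ("ZZ-9", 1)]
)
first_batch = agent.poll()
db.execute("INSERT INTO orders (sku, quantity) VALUES (?, ?)", ("AB-1", 4))
second_batch = agent.poll()
idle = agent.poll()   # nothing new since the last poll
```

The same poll method serves both styles described above: it can be called in a background loop, or once a day from a scheduler.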

Static agents are especially well suited for data-mining tasks. Typically data mining involves wading through large amounts of data in a data warehouse or data mart to find patterns in existing knowledge. Agents can perform these background tasks at predefined time intervals, when certain data arrives, or simply when a request for knowledge access comes in.

Mobile Agents

This is a very innovative field of research. Ironically what makes it so innovative is also a reason why this technology is not in common use.

Mobile agents are software modules that do not necessarily stay on the server that initiated them. Simply put, these agents travel. Say we start an agent on one computer (the parent). To perform its work the agent needs to communicate with a remote server. The agent might start performing some of its duties on the parent computer and then decide to move from the parent towards the remote server. In doing so it might travel the network and move ever closer to the remote server. Finally, once it finishes its task, it will notify the remote server or it may even destroy itself. The parent can at all times send messages, such as control messages, to its agents.

Some agents might interact with other static or mobile agents to perform their tasks. Some may even spawn additional sub-agents to which they delegate some tasks. At all times the agent maintains a reference to the parent server.

One might ask how this could be of any use in a knowledge management system. The answer is simple: it depends on what type of information your knowledge base is tracking. Let's say we have an enterprise with a large, globally distributed computing facility. The network is so complex that it has become difficult to track what is happening on it and to collect performance and security related information. We need to have this knowledge added to a knowledge base periodically, so that we can later analyze and maybe even predict network performance.

We can create a mobile agent that will roam our network, moving from one node to the other and always collecting network performance statistics as it roams. Periodically the mobile agent can send the information back to the parent, which can then add it to the knowledge base. The agent can communicate with the network elements using SNMP. Based on this simple yet realistic example you can see the power behind mobile agents.

Due to the mobility of these agents they may be limited to gathering data only and maybe performing some initial validations on it. Once they gather data they can call a static agent on the parent to perform the remaining tasks. It is important that the mobile agent keep doing its main task, which is to keep moving and gathering new data.
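
The division of labor described here can be sketched as a simulation. Nothing actually moves between machines in this example and no SNMP queries are made: the network is a plain dictionary, and the Parent and MobileAgent classes, the latency_ms field and the report_every parameter are all hypothetical.

```python
class Parent:
    """Stand-in for the parent server and its static agent."""

    def __init__(self):
        self.knowledge_base = []

    def receive(self, batch):
        # The remaining phases (full validation, storage) would run here.
        self.knowledge_base.extend(batch)


class MobileAgent:
    def __init__(self, parent, report_every=2):
        self.parent = parent          # the agent keeps a reference to its parent
        self.report_every = report_every
        self.buffer = []
        self.location = "parent"

    def roam(self, network, itinerary):
        for node in itinerary:
            self.location = node      # simulated move to the next node
            stats = network[node]
            # Initial validation only: drop readings that are clearly malformed.
            if stats.get("latency_ms", -1) >= 0:
                self.buffer.append({"node": node, **stats})
            if len(self.buffer) >= self.report_every:
                self.flush()
        self.flush()                  # send whatever is left at the end

    def flush(self):
        if self.buffer:
            self.parent.receive(self.buffer)
            self.buffer = []


# Hypothetical network: node name -> statistics that SNMP queries might yield.
network = {
    "router-1": {"latency_ms": 12},
    "router-2": {"latency_ms": -5},   # malformed reading, filtered by the agent
    "switch-7": {"latency_ms": 3},
    "router-9": {"latency_ms": 40},
}
parent = Parent()
agent = MobileAgent(parent)
agent.roam(network, ["router-1", "router-2", "switch-7", "router-9"])
```

The agent does only cheap filtering and batching before handing data off, which keeps it free to continue moving.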

Limitations of Mobile Agents

Mobile agents face many challenges, among them:

Security concerns in general.
What if someone tampers with the agent's runtime code?
How does the agent find a suitable platform on which to execute as it moves?
How does the parent know that the data it is receiving is from its agent and not from some other, malicious agent?
How does the parent know if a child agent is still alive?
What if the agent loses the ability to communicate with the parent?
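
Two of these questions, whether a report really comes from the agent and whether the agent is still alive, have well-known partial answers. The sketch below uses a message authentication code (HMAC-SHA256) and a heartbeat timeout. It assumes the parent and the agent share a secret key before the agent departs; the key value, the function names and the 30 second timeout are illustrative only. It does not address tampering with the agent's code, and a hostile host that extracts the key from the agent defeats the scheme.

```python
import hashlib
import hmac
import json

# Assumption: parent and agent share this secret before the agent departs.
SHARED_KEY = b"example-key-not-for-production"


def sign_report(agent_id, payload, key=SHARED_KEY):
    """Agent side: serialize the report and attach an HMAC-SHA256 tag."""
    body = json.dumps(
        {"agent": agent_id, "payload": payload}, sort_keys=True
    ).encode()
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, tag


def verify_report(body, tag, key=SHARED_KEY):
    """Parent side: accept the report only if the tag matches."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)


def is_alive(last_heartbeat, now, timeout_seconds=30.0):
    """Parent side: treat an agent as lost if its last heartbeat is too old."""
    return (now - last_heartbeat) <= timeout_seconds


body, tag = sign_report("agent-1", {"node": "router-1", "latency_ms": 12})
genuine = verify_report(body, tag)
# A report altered in transit no longer matches its tag.
forged = verify_report(body.replace(b"12", b"99"), tag)
```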

Conclusion

Automated knowledge management using intelligent software agents is very much a reality. Advances in newer technologies such as J2EE, Web Services and .NET allow us to create more reliable, scalable and secure intelligent agents. Static agents are more common than mobile agents, but as discussed earlier, mobile agents are definitely useful in certain types of applications.

Resources

For information on XML Schema refer to http://www.w3.org/XML/Schema

ADK for mobile agent development http://develop.tryllian.com/

IBM’s Aglet mobile agent development

Voyager from http://www.recursionsw.com/

Software Agents on Distributed Knowledge Management Systems (DKMS Brief No. Three, July 30, 1998)
