Automated network management application and knowledge discovery framework.
Ciolac, Camelia Elena ; Radulescu, Florin
1. INTRODUCTION
In the knowledge-based society, the IT infrastructure is the
"blood system" of the enterprise organism, and its underlying
"vessels" should therefore be monitored in real time. If its
"blood", the information, is blocked from reaching a certain
part of the organization in real time, that part might be prevented
from taking the right business decisions.
Therefore, we built a tool to support the IT decision-making process,
using Oracle, Java technologies and Data Mining in a synergistic manner.
2. PROBLEM STATEMENT
While IT manufacturers provide excessively customized monitoring
solutions, especially configured for proprietary hardware, developing a
platform-independent network management system represents a challenging
project.
Moreover, providing support for an enterprise's intranet
requires the use of flexible, stable, future-oriented technologies and
strict adherence to standards.
Time-series graphs for individual pieces of equipment put the burden
of analysis and data correlation on the administrator. Data-mining
techniques relieve part of this burden by providing decision-makers
with valuable, ready-to-use knowledge.
3. LITERATURE REVIEW
A constant concern of both theoreticians and practitioners,
automatic network management has evolved from automatic incident
reporting towards defining device coalitions that exchange information
to reach consensus on action policy (Burgess, 2004).
Decentralized management architectures, with local device
hierarchies that aggregate regional results and forward them to the
superior hierarchical network manager, are suitable for the RMON
protocol and asynchronous reporting (Sturt, 1994).
Current network management techniques include network traffic
control using collaborative simulation (Tao Ye et al., 2001), game
theory for risk assessment within networks (Cui Xiaolin et al., 2008),
and a Bayesian approach to compressing the storage space of
historic monitoring data and event correlation used in fault prediction
(Garofalakis and Rastogi, 2001).
4. RESEARCH COURSE AND TECHNICAL SOLUTIONS
The requirements analysis consisted of comprehensive interviews
with network administrators from several medium-sized and large
enterprises, including a company in the aerospace industry where
information delivery is vital.
The results of this exploratory research emphasized that current IT
management systems provide localized data that prevent a global
overview of the network. Packet analysis through traffic interception
significantly increases delays, while administrators are left to choose
among a set of separate, non-integrated software products, each one
offering a specific view of the network.
To respond to these issues, our application offers an integrated
framework for network management that addresses both communication and
workload aspects of an enterprise's intranet.
Technologies that support the developed application include:
* Oracle Database 10g Enterprise Edition, which includes Oracle
interMedia
* Oracle Data Miner 10g, which supports knowledge extraction from
data through clustering algorithms
* Java Development Kit 1.6, used in the development of the web
service, and JSP/JSPx for the user interface to the service
* Business Components for Java (BC4J) based on Oracle ADF 11g, and
managed beans built inside J2EE that extend the functionality of Oracle
ADF Faces elements
* WebLogic 10.3 integrated server, which offers robustness and
flexibility
Another strength of this application lies in its conformance with
standards such as XML, JPEG, ASN.1 notation and BER encoding rules.
The software application has a complex architecture that deals with
data acquisition, storage and analysis. We chose SNMP (RFC 1157) for
data communication due to its widespread implementation in IT devices,
its device-oriented approach and its advantages compared to flow-based
protocols.
The data acquisition from the SNMP agents distributed in the
network is performed by a web service with specialized threads. Human
intervention is minimal: once the system is configured, the network
administrator chooses the desired function and the device polling
period, and the rest of the process is transparent to the administrator.
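The acquisition loop described above can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the application itself is a Java web service; the class name `DevicePoller` and the function `fetch_metrics` are hypothetical, and `fetch_metrics` is only a stand-in for an SNMP GET, not a call into any real SNMP library.

```python
import threading
import time
from typing import Callable, Dict, List, Tuple


def fetch_metrics(ip: str) -> Dict[str, int]:
    """Hypothetical stand-in for an SNMP GET against the agent at `ip`."""
    return {"ifOutOctets": 0}


class DevicePoller(threading.Thread):
    """One worker thread that polls a single device at a fixed period."""

    def __init__(self, ip: str, period_s: float,
                 fetch: Callable[[str], Dict[str, int]],
                 sink: List[Tuple[str, float, Dict[str, int]]]):
        super().__init__(daemon=True)
        self.ip = ip
        self.period_s = period_s      # polling period chosen by the administrator
        self.fetch = fetch
        self.sink = sink              # shared list that collects the samples
        self._stop_event = threading.Event()

    def run(self) -> None:
        while not self._stop_event.is_set():
            self.sink.append((self.ip, time.time(), self.fetch(self.ip)))
            # Sleep for one period, but wake up at once if stop() is called.
            self._stop_event.wait(self.period_s)

    def stop(self) -> None:
        self._stop_event.set()
```

In this sketch the administrator's only inputs are the device address and the polling period; storing the samples in the database is left out.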
5. RESULTS
The primary management functions FCAPS (Fault, Configuration,
Accounting, Performance and Security) (Burgess, 2004) are divided into
sub-functions detailed at different levels: interfaces, protocols (IP,
TCP, and UDP), host workload (used storage, CPU load, devices,
running/installed software).
The network management station manages the query process and
organizes it in iterations and stages. The metrics are taken from
MIB II (RFC 1213) and Host-MIB (RFC 2790) and grouped according to the
target sub-functions. The approach used in device polling is adaptive
and proves its efficiency when devices of the intranet go down and
polling them would result in useless network overload.
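The paper does not state the exact adaptation rule, so the following is only a sketch of one possible policy, under the assumption that the polling period of an unreachable device is doubled up to a cap and reset as soon as the device answers again. The class `PollSchedule` and its parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PollSchedule:
    base_period_s: float            # period chosen by the administrator
    max_period_s: float = 3600.0    # assumed upper bound for the back-off
    current_period_s: float = 0.0

    def __post_init__(self) -> None:
        self.current_period_s = self.base_period_s

    def record(self, reachable: bool) -> float:
        """Update the period after a poll attempt and return the next delay."""
        if reachable:
            # A responding device goes back to the configured period.
            self.current_period_s = self.base_period_s
        else:
            # An unreachable device is polled less and less often.
            self.current_period_s = min(self.current_period_s * 2,
                                        self.max_period_s)
        return self.current_period_s
```

With a rule of this kind, a device that stays down costs progressively fewer wasted requests, while a recovered device returns to normal monitoring after its first successful poll.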
Another issue solved by the application is configuration
management, achieved both by remote setting of parameters on network
devices and by the topology management function.
The Topology Management function takes advantage of the multimedia
features of Oracle interMedia to store, inside the database, graphical
descriptions of the devices and of the types of cables that link them
in the network. Using Java Graphics 2D and Java Applet technologies,
the application is able to present the network topology to its
administrator. Workstations and communication devices are placed in the
corresponding enterprise departments (delimited by size-scaled
rectangular regions on the canvas) together with labels indicating their
assigned IP addresses.
Thus, an overview is provided to the administrator to assist him in
fault localization and network redesign.
Along with time-series graphs that present an evolutionary
perspective on the analyzed sub-functions and status meters/gauges for
fault severity alerts, the application aims to go beyond descriptive
analysis.
Data-mining algorithms are used to cluster the time intervals of
working days according to the registered traffic. To accomplish this
task, data preparation is required in order to move from the
device-centric SNMP view to aggregate measures. Formula (1) is used
to obtain an aggregate measure of the outbound traffic through all
the interfaces of all the devices in a given time interval of a given
day of the week:
\[
\mathrm{Traff}[day][hour] = \operatorname{AVG}_{day,\,hour}\Bigl(\sum_{IP}\ \sum_{interface} \max(\mathit{IfOutBytes})\Bigr) \qquad (1)
\]
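One way to read formula (1) is sketched below in Python. The interpretation is an assumption: within each calendar hour the maximum IfOutBytes reading is taken per interface, the maxima are summed over all interfaces of all devices, and the sums are then averaged over all occurrences of the same weekday and hour. The function name `aggregate_traffic` and the sample layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, Tuple

# (timestamp, device IP, interface index, IfOutBytes reading)
Sample = Tuple[datetime, str, int, int]


def aggregate_traffic(samples: Iterable[Sample]) -> Dict[Tuple[int, int], float]:
    """Return Traff[(weekday, hour)] under the reading of formula (1) above."""
    # Step 1: maximum reading per (calendar date, hour, IP, interface).
    peak = defaultdict(int)
    for ts, ip, iface, out_bytes in samples:
        key = (ts.date(), ts.hour, ip, iface)
        peak[key] = max(peak[key], out_bytes)

    # Step 2: sum the maxima over all interfaces of all devices.
    totals = defaultdict(int)
    for (date, hour, _ip, _iface), value in peak.items():
        totals[(date, hour)] += value

    # Step 3: average the sums over occurrences of the same weekday and hour.
    buckets = defaultdict(list)
    for (date, hour), total in totals.items():
        buckets[(date.weekday(), hour)].append(total)
    return {slot: mean(values) for slot, values in buckets.items()}
```

The result is keyed by `(weekday, hour)` with Monday as 0, which gives one record per time interval of the week, the shape needed as input for clustering.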
The parameters used in the data-mining process were: the ODM
K-means clustering algorithm (Oracle Documentation, 2003), 4 clusters,
Euclidean distance, 0.01 convergence tolerance and variance as the
criterion for splitting the nodes of the ODM clusters' hierarchy. The
clusters obtained after two months of data gathering from an SME are
presented in figure 1, in an Oracle ADF Faces Bubble Graph.
[FIGURE 1 OMITTED]
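For readers without access to Oracle Data Miner, the role of these parameters can be illustrated with a plain k-means (Lloyd's algorithm) in Python. This is not the ODM implementation, which builds a cluster hierarchy and splits nodes by variance; the sketch only shows the number of clusters, the Euclidean distance and a convergence tolerance. Interpreting the tolerance as the largest centroid movement between iterations is an assumption, and all names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points: List[Point], k: int = 4, tol: float = 0.01,
           max_iter: int = 100, seed: int = 0
           ) -> Tuple[List[List[float]], List[int]]:
    """Plain Lloyd's k-means; returns (centroids, cluster label per point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        shift = 0.0
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                continue  # an empty cluster keeps its previous centroid
            new = [sum(axis) / len(members) for axis in zip(*members)]
            shift = max(shift, euclidean(new, centroids[c]))
            centroids[c] = new
        if shift < tol:  # stop once no centroid moved more than the tolerance
            break
    return centroids, labels
```

Applied to the aggregated records, each point would be one time interval of the week described by its traffic measure, and the four resulting groups correspond to the bubbles of figure 1.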
The knowledge learned through traffic clustering highlights the
periods when the IT infrastructure gets overloaded as well as the
intervals of the week when the existing infrastructure is underused.
Thus we obtain a basis for planning maintenance and modernization
jobs with minimal impact on current network operations. Also, the
histogram that Oracle Data Miner provides along with the clustering
model offers a perspective on the share of traffic falling in different
octet ranges. High percentages registered in the high traffic ranges
represent an incentive for network expansion and modernization.
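The kind of summary such a histogram gives can be approximated with a short Python sketch. Equal-width ranges are an assumption, since the binning used by Oracle Data Miner is not described here, and the function name `traffic_histogram` is hypothetical.

```python
from typing import List, Sequence, Tuple


def traffic_histogram(values: Sequence[float], bins: int = 4
                      ) -> List[Tuple[float, float, float]]:
    """Return (range_start, range_end, percent_of_observations) per range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values match
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last range.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, 100.0 * c / len(values))
            for i, c in enumerate(counts)]
```

A large percentage in the last range means that many time intervals operate near the highest observed traffic, which is the signal the text associates with a need for expansion.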
6. FURTHER RESEARCH
Further development of this application includes defining
customized agents within the workstations in order to detail traffic at
application level using JMX (Java Management Extensions). For now,
only device-level analysis is performed.
The data-mining algorithms will be used to expand the knowledge
discovery process with anomaly detection in hosts' workload. In
conjunction with JMX, the application aims to identify those software
applications that cause CPU overload or memory-intensive usage. This
approach is also expected to bring results in the software integration
field.
Another step that our application aims to address in the future is
building a regression model to quantify the relationship between various
management metrics contained in the MIB II and Host-MIB. Equations that
estimate the network flow based on the number and types of software
running on a workstation are further steps of our research.
7. CONCLUSION
The application manages to cross the borders of a localized
monitoring approach. It offers its administrator an overview of the
whole network from a multitude of perspectives (communication, host
workload, performance of running software, used storage capacity), as
well as knowledge extracted from the acquired data as a basis for
network modernization and optimization.
8. REFERENCES
Burgess M. (2004). Principles of network and system administration,
John Wiley & Sons, 0-470-86807-4, USA
Cui Xiaolin, Tan Xiaobin, Zhang Yong, Xi Hongsheng (2008). A Markov
Game Theory-based Risk Assessment Model for Network Information Systems,
from: ftp://ftp.computer.org/press/outgoing/proceedings/csse08/data/3336f057.pdf
Accessed: 2009-02-03
Garofalakis M., Rastogi R. (2001). Data Mining Meets Network
Management: The NEMESIS Project, from:
http://www.cs.cornell.edu/johannes/papers/dmkd2001-papers/p1_garofalakis.pdf
Accessed: 2009-02-03
Sturt E. (1994). Network management: concepts and tools, Springer,
9780412578106, France
Tao Ye, David Harrison, Bin Mo, Shivkumar Kalyanaraman, Boleslaw
Szymanski, Ken Vastola, Biplab Sikdar, Hema Tahilramani Kaur (2001).
Traffic Management and Network Control Using Collaborative On-line
Simulation, from:
http://www.ecse.rpi.edu/Homepages/shivkuma/research/papers/tao-icc2001-camready.pdf
Accessed: 2009-02-03
*** (2003). Oracle Documentation, Oracle Data Mining Application
Developer's Guide, from:
http://download.oracle.com/docs/html/B10699_01/toc.htm
Accessed: 2009-01-10