Network Problem Solving:
A diary study and model of telecommunications trouble shooting
Abstract
This paper will describe the results of a study of network problem
solving. The paper will present an initial model of network problem solving, based on
this study. Finally the paper will describe the implications of this model of
network problem solving for enabling people who are not technical experts to
diagnose and in many cases solve these problems. This network problem solving
model also has implications for more general models of problem solving.
Introduction
As more and more people begin to use telecommunications in their everyday
lives, there is a myth that the initial technical problems are a temporary
burden which will soon be lifted. In some cases this is true, at least for the
short term, but since technology continues to change at a rapid rate,
"technical difficulties" will always be with us. So some level of expertise
with network problem solving will continue to be essential.
There is a second myth that technical problems can only be solved by technical
experts. This myth leaves ordinary people immobilized
when a problem does arise, despite the fact that they can solve many of the network problems they encounter.
The Diary Study of Network Problem Solving
I was a co-principal investigator of a three-year research project, the Teaching
Teleapprenticeships Project, which studied ways that electronic
networks can improve teacher education (Levin, Waugh, Brown, & Clift,
1994). When we started the first year, there were many technical problems,
which we assumed would go away once we had solved them. During the second
year, we encountered more technical problems, which we again believed would go
away after we worked to solve them. During the third year, when more technical
problems occurred, we began to realize that technical problems will always be
with us.
This is largely because technology continues to change at a rapid rate, and so
there are always new models of hardware and new versions of software, which
can interact with one another in negative ways. For networks, this
interaction has a way of multiplying problems, since all of the pieces of
hardware and software need to co-operate along the way in order for network
communication to occur. So, while technology in many ways gets easier to use,
problems will continue to occur, and so users need to have an effective
strategy for problem solving, especially for solving network problems.
Starting at the beginning of the third year of our project, I began to keep a
diary of the network problems that I was faced with, and to ask the other
members of our project to send me reports of problems they had encountered.
The majority of the problem reports came in via email, and for those that were
reported otherwise, I sent myself a description of the problem via email, which
I collected in a separate mailbox.
The corpus of network problem messages.
There were 60 messages collected between September 2, 1994 and April 23, 1995.
These messages described approximately 30 different problem solving episodes
(some messages contained more than one problem, but there were also sequences
of messages concerning a single problem). Here is a typical message describing
a problem.
From: [teacher's email address]
Date: Wed, 15 Feb 1995 00:21:50 -0500
To: Jim Levin <jim-levin@uiuc.edu>
Subject: RE:Re: Installer disks
Jim,
Sorry to trouble you, again. Apparently installed Eudora
without difficulty. Just can't get it to work. Suspect I
don't know something I need to know about setting it up or
whatever. Do I need a manual? I read through README to no
avail. Frustrating!
Any suggestions? Much appreciate your help!
[teacher's name]
The key part of this message is the exclamation
"Frustrating!" When things don't work, the typical user is left with few
options other than calling upon others for help.
Let us look at the set of "message sequences" in this corpus to see the details
of network problem solving.
Many of these problems presented themselves initially as a very general problem
("It doesn't work."), and then as the diagnosis proceeded, the problem could be
described more specifically ("The kind of modem was set incorrectly.") This
general to specific sequence was characteristic of many of the successful
problem solving episodes.
A second common feature was the use of the analysis of the "flow" of
communication, to identify where things were going wrong. Especially when
establishing a communication session, there is a sequence of discrete steps, in
which the connection is established to progressively more distant components of
the network, until a functional network connection is established. By watching
(and listening to) these steps and observing what happens (and what does NOT
happen), the problem solvers were able to focus in on one or more problem
areas.
Another commonly occurring sequence in these problem solving processes is the
technique of swapping presumably equivalent parts. When things did not work
properly in an identified problem area, a commonly used strategy was to replace
the questionable part with an equivalent part known to work.
A Model of Network Problem Solving
The model of network problem solving that emerges from this diary study is one
that is based on multiple conceptual levels of communication flow.
Communication Flow.
Network communication flows between the user and the other end of the network,
and everything along the way needs to work in order for the communication to
occur. Network problems can occur at any point along the way. When a problem
occurs, then solving the problem requires locating the problem and then taking
some action to correct the problem.
So when a problem arises, does that mean that you should start tracking down
the bits as they flow from your keyboard through your computer through your
modem, etc.? Most people do not have the expertise to trouble shoot at this
level. However, many network problems can be solved by diagnosis and treatment
at a much more qualitative, global level of thinking about network
communications.
Conceptual Levels of Analysis.
Novices differ from experts in that experts have many ways of thinking about
their domain of expertise, while novices may have only one way (Larkin,
McDermott, Simon & Simon, 1980; Chi, Feltovich, & Glaser, 1981).
Experts typically start with a global, qualitative level of analysis, then use
that to guide them to a selection of progressively more detailed levels of
analysis. This has provided a useful model for solving technical problems with
stand-alone personal computers (Levin & Miyake, 1996). Let us see
whether this same concept is useful for network problem solving as well.
Level 1 conceptualization
What is the simplest, most global level of conceptualization of network
communication? Let us call this a "level 1" analysis, as shown in Figure 1.
In this level, there are only three subparts. How does this very simple level
of conceptualization help with network problem solving?
Check connections.
The first thing is to establish that there is IN FACT a continuous path between
yourself and the remote other. Now usually you cannot physically examine the
complete path, but a surprisingly high percentage of network problems can be solved
by checking whether there is in fact SOME kind of connection as far as you can
easily trace it. Many problems with telecommunications are due to the simple
fact that you aren't really plugged into a network. "Things work better when
they're plugged in."
Try alternatives.
One useful way to narrow down network problems is to try alternatives. In this
case, you have two options. First you can try to establish a successful
network connection with an alternative remote other. If that works, then that
tends to support the conclusion that the problem is with the "remote other", not
the network connection.
Secondly you can try to establish an alternative network connection between
yourself and the remote other. If that works, then that will identify the
original connection as being the problem. If the alternative connection does
NOT work, then that points to the problem being with the "remote other".
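The reasoning behind these two tests can be written down as a small decision
function. The following Python sketch is only an illustration: the function
name and its arguments are assumptions introduced here, not part of the study,
and the conclusions it returns are the tentative ones stated above.

```python
def level1_suspect(alt_remote_works=None, alt_connection_works=None):
    """Suggest a suspect from the level 1 "try alternatives" tests.

    Each argument is True or False for the outcome of a test, or None
    if that test was not tried. (Hypothetical helper for illustration.)
    """
    if alt_remote_works is True:
        # The same connection reaches a different remote other.
        return "remote other"
    if alt_connection_works is True:
        # A different connection reaches the same remote other.
        return "network connection"
    if alt_connection_works is False:
        # Even a different connection cannot reach the remote other.
        return "remote other"
    # The text draws no conclusion from the remaining outcomes.
    return "unknown"


print(level1_suspect(alt_connection_works=True))
```

The result is a suspect rather than a verdict; it tells you where to look
next or what to report when asking someone else for help.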
Level 2 conceptualization of networks
If the remote other doesn't respond, then there is not much you can do to fix
the problem beyond notifying them of the problem and hoping that they can fix
it. However if the problem is in the connection between you and the remote
other, then you can move to a more detailed conceptualization of networks.
Figure 4 shows a next level of detail.
Telecommunications is a two way flow of information. At this level, the
network system has been represented as three parts: my computer system, the
network, and the remote computer system.
Flow diagnosis.
When you're establishing a connection, there is often a sequence of actions, as
each part is activated, starting with the most local and moving outward. For
example, often you start up some communication program on the local computer
system, which establishes a link to a network and then finally reaches the
remote computer system. When the whole system is working correctly, the user
doesn't need to know about this flow. However, when things fail, then paying
attention to this sequence can be an important diagnosis tool. How far does
this startup process proceed before an error message appears? Problem solving
can start by focusing on the part in which the connection process gets stuck.
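Flow diagnosis amounts to walking through the startup steps in order, from the
most local outward, and stopping at the first one that fails. A minimal Python
sketch of that idea follows. The step names and the observed outcomes are
hypothetical; in practice each check would be something the user watches or
listens for.

```python
def first_failing_stage(stages):
    """Return the name of the first stage that fails, or None if all pass.

    `stages` is an ordered list of (name, check) pairs, most local first.
    Each check is a callable that returns True when that step succeeded.
    """
    for name, check in stages:
        if not check():
            return name
    return None


# Hypothetical observations from watching one connection attempt.
observed = [
    ("communication program starts", lambda: True),
    ("link to the network is established", lambda: False),
    ("remote computer system responds", lambda: False),
]

print(first_failing_stage(observed))
```

Later stages are not examined once a failure is found, since they depend on
the earlier ones.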
Check the connections.
Once you have a concept of a system with a set of interconnected parts, then
you can proceed with problem solving by checking the connections between these
parts. There is a wire that connects your computer system with the network
(either a LAN or telephone network). Is that connected? There is also a wire
that connects the network with the remote computer, which you may be able to
have somebody at the remote computer end check.
Swap equivalent parts.
If the problem is not in the connections, you can narrow down the problem by
swapping equivalent parts. Guided by the "flow diagnosis" described above, you
can try replacing the suspect part with an "equivalent" part. You can try a
different local computer system with the same network to the same remote other.
This is shown in Figure 5.
Or you can try a different network with the same local computer to the same
remote other. This is shown in Figure 6.
Or you can try the same local computer with the same network to a different
computer used by the remote other. This is shown in Figure 7.
Suppose that you determine by this "swap equivalent parts" action that one of
these subparts is the likely cause of the problem. What then? Well, at this
point you can call in an expert, directing him or her toward the problematic
part. Or you can continue your network problem solving process, moving to a
more detailed level of conceptualization.
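The three swaps described in this section can be recorded and summarized very
simply: a part becomes a suspect when replacing it makes the connection work.
The Python sketch below assumes the outcomes are kept in a dictionary; the
function name and the sample results are hypothetical.

```python
def suspects_from_swaps(swap_results):
    """List the parts whose replacement made the connection work.

    `swap_results` maps the name of the part that was replaced to True
    if the connection worked after the replacement, otherwise False.
    """
    return [part for part, worked in swap_results.items() if worked]


# Hypothetical outcomes of the three level 2 swaps.
results = {
    "local computer system": False,
    "network": True,
    "remote computer system": False,
}

print(suspects_from_swaps(results))
```

An empty list means that no single swap helped, which suggests rechecking the
connections or moving to a more detailed level.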
Level 3 conceptualization of networks
Each level of conceptualization consists of a set of subparts, interrelated in
ways that represent the structure of the overall concept. At a given level, each
subpart is a "black box", defined only in terms of its functions with no
internal structure (Miyake, 1986). As we proceed to more detailed levels, the
subpart itself is structurally represented by a set of interrelated
sub-subparts. Thus at the next level of conceptualization of networks, each of
the subparts from level 2 can be conceptualized in more detail (Levin &
Miyake, 1996).
So that our figures do not get too complex, let us choose the local computer
system to represent at this level. This is the part of the system that the
typical person has the most access to in terms of problem solving.
At this level, we can represent the local computer system in terms of a local
computer and the network interface box (either a modem or an ethernet adapter
or some sort of other network box). This is shown in Figure 8.
Even though this level of analysis is only slightly more detailed than the
previous level, it does highlight the special hardware associated with
connecting the local computer with the network. There is the local network
interface box (a modem or an ethernet box or ...) that is either built
into the local computer or is connected to a port on the computer. And there
is a network jack (either a phone jack or another kind of network jack), often
the only easily visible part of the network to the typical network user.
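The step from level 2 to level 3 can be pictured as expanding one black box
into its subparts while leaving the others closed. The Python sketch below
uses the part names from the text; representing the model as a nested
dictionary, and the function name, are assumptions made for illustration.

```python
# Each part is a black box until it is expanded into its subparts.
# An empty dictionary means no internal structure has been recorded.
MODEL = {
    "local computer system": {
        "local computer": {},
        "network interface box": {},
    },
    "network": {},
    "remote computer system": {},
}


def visible_parts(model, expansions):
    """List the parts seen after expanding the model `expansions` times.

    With 0 expansions only the top-level parts are listed. A part with
    no recorded subparts stays a black box however far we expand.
    """
    if expansions <= 0:
        return list(model)
    names = []
    for name, subparts in model.items():
        if subparts:
            names.extend(visible_parts(subparts, expansions - 1))
        else:
            names.append(name)
    return names


print(visible_parts(MODEL, 0))  # the three level 2 parts
print(visible_parts(MODEL, 1))  # local computer system opened up
```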
Check the connections.
At each more detailed level of analysis, there are more connections to check.
At this level, there is the connection between the computer and the network
box, the connection between the network box and the network jack, the
connection between the network jack and the rest of the network, etc. Let's
take the first of these as an example: Is there a connection between the
computer and the network box? This can be checked in two ways. First, is
there a physical connection? Is there a wire that goes from the computer to
the network box? Secondly, is there a functioning connection? Can you issue
commands to the network box successfully? For example, if the network box is a
modem, then you can use a simple communications application to issue a simple
"AT" command. Do you get "OK" back? If so, then there is a functioning
connection between the computer and the modem.
You can then move on to check the next connection. Is there a physical wire
that goes from the network box to a network jack? If so, then you can again
try to check the functionality of the connection. If the network box is a
modem and you issue an "ATD" command, do you hear a dial tone? If so, then you
have verified both of the next two connections, the one from the modem to the
phone jack, and the one from the phone jack to the rest of the telephone
network.
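The functional check of the first connection (computer to modem) can be
sketched in Python as follows. The sketch assumes that the caller supplies two
functions for writing to and reading from the port; these, and the FakeModem
class used to demonstrate the check without hardware, are hypothetical
stand-ins for whatever communications application or serial library is
actually in use.

```python
def modem_answers_ok(send, receive):
    """Send a bare "AT" command and report whether "OK" comes back.

    `send` writes text to the port the modem is attached to and
    `receive` returns whatever text the modem sent in reply.
    """
    send("AT\r")  # modem commands are ended with a carriage return
    reply = receive()
    # Look for OK as a separate word, ignoring line endings and any echo.
    return "OK" in reply.split()


class FakeModem:
    """Stand-in for a connected modem, used only for this demonstration."""

    def __init__(self):
        self._pending = ""

    def send(self, text):
        if text.strip().upper() == "AT":
            self._pending = "\r\nOK\r\n"

    def receive(self):
        reply, self._pending = self._pending, ""
        return reply


modem = FakeModem()
print(modem_answers_ok(modem.send, modem.receive))
```

If nothing comes back at all, the physical connection between the computer
and the modem is the first thing to recheck.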
Swap equivalent parts.
Level 4 conceptualization of networks and beyond
As may now be apparent, there are conceptualizations of the network at more and
more specific levels of analysis, where each component at a given level is
represented by an interacting set of subcomponents at the next more detailed
level. For many problem solving purposes, however, these first few levels are
sufficient either to solve network problems or to narrow down the problem far
enough that the faulty component can be fixed or permanently replaced by
a functional component.
As has been shown by other research on expert problem solvers, this strategy of
starting with a general level of conceptualization and then moving progressively
to more and more specific levels allows the problem solving to follow a
systematic plan, rather than to unsystematically examine the large number of
possible problem areas presented at a detailed level of conceptualization.
Even if the network problem is not solved, this "top-down" strategy can often
progress far enough to eliminate some of the possible problem areas, making it
easier to turn the problem over to someone else to solve.
References
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and
representation of physics problems by experts and novices. Cognitive
Science, 5, 121-152.
Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980). Expert
and novice performance in solving physics problems. Science, 208,
1335-1342.
Levin, J. A., & Miyake, N. (1996). Care and repair of your computer: A
top-down strategy for the novice. Learning and Leading with
Technology, 23(8), 53-56.
Miyake, N. (1986). Constructive interaction and the iterative process of
understanding. Cognitive Science, 10(2), 151-177.
Acknowledgments
This material is based upon work supported by the National Science Foundation
under Grant No. RED-9253423. The Government has certain rights in this material. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the
author and do not necessarily reflect the views of the National Science Foundation.
Appendix A: The logic of swapping equivalent parts.
Given: System S that does not work. System S has subparts A & B.
Given: "Equivalent" System S' that does work. System S' has subparts A' &
B'.
Action: Swap A for A'
Possible results:
1. System S (with A' & B) now works. System S' (with A & B') does not
work.
Conclusion: suspect subpart A.
2. System S (with A' & B) still does not work. System S' (with A & B')
still works.
Conclusion: suspect subpart B.
3. System S (with A' & B) now works. System S' (with A & B') works.
Conclusion: suspect the connections of A.
4. System S (with A' & B) still does not work. System S' (with A & B')
does not work.
Conclusion: suspect a negative interaction of A with B & B'.
Verification Action: Swap A back into System S
Results (in order of decreasing probability):
- Original state (reinforces original conclusion)
- Both work (better connections)
- Neither work (negative interaction - System S destroyed A')
- Original works; other doesn't (highly unlikely)
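The four possible results of the swap form a small lookup table keyed by
whether each system works afterwards. The Python sketch below restates the
conclusions listed in this appendix; the function name and the use of a
dictionary are assumptions made for illustration.

```python
def swap_conclusion(s_works, s_prime_works):
    """Conclusion after swapping A and A' between systems S and S'.

    s_works:       does System S (now with A' & B) work?
    s_prime_works: does System S' (now with A & B') work?
    """
    table = {
        (True, False): "suspect subpart A",
        (False, True): "suspect subpart B",
        (True, True): "suspect the connections of A",
        (False, False): "suspect a negative interaction of A with B & B'",
    }
    return table[(s_works, s_prime_works)]


print(swap_conclusion(True, False))
```

Each conclusion names only a suspect; the verification action of swapping A
back into System S is still needed before acting on it.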