In this part of the course we look at ways to describe a system in
terms of the hardware, software, and people in the system. We will start by
reviewing some of the options for hardware. After that we
will look at techniques for documenting
many different systems: the current, proposed, planned, and implemented systems.
Systems Architecture is strongly oriented towards physical models of the current and
the future systems. We use the same diagrams to show how it actually is and how it should be.
System Architecture defines
a Physical model that (hopefully) will implement or realize the more
abstract logical models. It should also have the right qualities: speed, security,
reliability, and so on. Logically we consider the options under four headings:
We will cover the options for Input, Connections, Output, Storage, and Processors
before looking at tools for expressing architectures.
First however some problems and patterns. First be aware that computer hardware is
very versatile. It can simulate other pieces of hardware. This is often called
Virtualization.
For example, most computers these days have their own machine code, but someone
has written a program called the JVM
(Java Virtual Machine) that lets the machine execute Java byte code.
Another example (that goes back to the 60's) is
virtual memory
which uses disks to simulate RAM. The result is a slightly slow computer with
a lot more apparent memory. For 40 years, operating systems have allowed many
programs to all be running at one time on one piece of hardware. If each program
seems to have its own computer -- and cannot access another program's data -- then
we have a virtualization system running under a hypervisor. This makes it very easy
to move programs from one machine to another and to use all the power of an
expensive piece of hardware. There has been a strong virtualization trend in
the last ten years.
New hardware appears every month. It pays a computer professional to
keep an eye on new developments. I don't know of a really good source on the
web.... but I do follow
[ http://slashdot.com ]
and
[ http://www.wired.com ]
as straws in the wind.
I subscribe to a blog entitled "Coding Horror" that often has interesting and enlightening
comments on software development. As an example, when you have time, check out
this article
[ 001003.html ]
describing the ways in which system and software architecture goes astray
and ultimately becomes unmanageable.
A better technique is to join a professional group like the
ACM
-- Association for Computing Machinery
and/or the
IEEE-CS
-- Institute of Electrical and Electronic Engineers Computer Society.
Both offer student memberships. And discounts if you are a member of the
other one. These publish excellent magazines and journals. These, in turn,
are available on this campus as electronic digital libraries. On campus,
you can check the societies out for nothing at
[ http://www.acm.org/ ]
and
[ index.jsp ]
and even drill down into their digital libraries.
Be aware that old technology is kept in use
for a long time. You are likely to meet
examples of old devices
[ a3.dotmatrix.html ]
still in use in a current system. Always find out why -- sometimes the
old device is still the best solution or the only solution to a systems problem. If,
and only if, there is no
reason -- the old device can become a focus for change.
Here is a story, from SlashDot, about the use of older computers
[ http://www.silicon.com/management/public-sector/2010/09/25/space-exploration-the-computers-that-power-mans-conquest-of-the-stars-39746245/ ]
(regular)
[ http://m.silicon.com/management/public-sector/2010/09/25/space-exploration-the-computers-that-power-mans-conquest-of-the-stars-39746245/ ]
(mobile).
Here is another example. I learned to program on my college's Elliott 803 minicomputer.
When I became a lecturer, 7 years later, the CS Department still had an Elliott 803,
and the university was using another one as a peripheral controller. I ran the
department's 803 at a profit for 5 years. It cost roughly $400 per year to maintain
and a company paid $800 to use it. They had no choice because they used software
that couldn't be ported
to another machine. Meanwhile the department used it for experiments.
- Wild Idea
- Laboratory demonstration
- Hype
- Early Adoption
- Mainstream -- Competing similar technologies
- Obsolete
Follow the links to Wikipedia articles if you need more information.
- Mainframe
[ Mainframe_computer ]
- Peer-to-Peer
[ Peer_to_peer ]
- Client-Server
[ Client-Server ]
- Cloud Computing
[ Cloud_computing ]
. . . . . . . . . ( end of section Introduction)
Here is a list of processor types:
- Supercomputer
- Mainframe...enterprise server
- Clusters, Grids, Clouds,...: many processors+memories in a tight fast network.
- Multicore PCs -- several CPUs on one chip. More power with same clock
and heat.
- Many PCs in a rack -- shared display and keyboard.
- PC -- Personal Computers (Apple ][, IBM PC, ...)
- Laptops -- PCs that can be carried
- Tablets -- Laptops minus the keyboard
- Palmtops (Palm, iPod, ...) and Game consoles
- Embedded Chip
- Special purpose processors: GPU (Graphics), Peripherals, ...
- The key difference to the user is not the hardware so much as the operating system.
Here again there is an incredible range of possibilities.
- Mainframes and Minis had their own special OSs. To find out about these -- ask any faculty member!
- UNIX: AT&T, BSD, Linux, MacOS X, iOS, Android... If you want the history the Wikipedia
[ Unix ]
article is quite good.
- The MS Family: DOS (Disk OS), Windows 3.0, Windows98, Windows2k, Windows XP, Windows Vista, Windows 7, etc. Again, for details (if you want) check out
[ Microsoft_DOS ]
[ Microsoft_Windows ]
on the Wikipedia.
. . . . . . . . . ( end of section Operating Systems)
. . . . . . . . . ( end of section Processors)
We can classify I/O (Input and Output) devices/options in several ways:
- Form factor
- Embedded chip/board or Circuit
- Cell Phone: may become the one peripheral that everybody in the world
owns. Their functionality is increasing under strong competitive
pressure. Go to any mall and play with their demo machines!
We have had Blackberries and Palm Treos for some time.
Now we have the iPhone, iPod Touch, and Google's Android (2008)...
This appeared in 2007 --
UMTS Universal Mobile Telecommunication System
[ UMTS ]
Hard to tell which will be the winner and which the
whiner!
- Normal phone
- Hand Held Device
- Hand held bar code reader
- Game controller
- Palmtop/PDA/cell phone/MP3 player/Zune/...
- Tablet
- Laptop/Terminal
- Workstation/PC
- Special purpose work station -- e.g. Point Of Sale (POS)
- Input technology
[ Input_device ]
(Wikipedia).
- Keyboard
- Radio Frequency IDentification
[ RFID ]
(Wikipedia).
- Micro-technology embedded in bodies: medical uses!
- Headgear can read eye positions -- either with an infra-red beam or by reading signals to muscles.
- Sound and (more complex) speech.
- Body measurements --
Biometrics.
These are technologies that extract information by measuring the body.
They are mostly used to ID people. They include: finger print and palm
print readers, iris scans, retinal scanners, ... The earliest (from Doug
Engelbart) was to use people's weight to recognize and log them in!
- Data capture devices: eg. bar code readers
[ Barcode_reader ]
[ Barcode ]
(Wikipedia).
- Digital camera
- Smart phone with camera and QR app
[ QR_code ]
(Wikipedia).
- Electronic Whiteboards
- Graphics
- Stylus/pen-based
- Mouse
- Touch screen:
Originally you had a screen or tablet plus a special pen. Some use
a magnetic pen (WACOM). Special coatings can also be used, or double
layers pushed together. A common example is the PalmOS driven devices.
I'm not sure how they digitize the pen movements -- but they can work
well (I have had a slow spot in one part of the screen, and a digitizer that
reads taps as strokes and misreads the position on the screen by about 0.1 inches).
Now some technologies let you use a finger -- very popular
for kiosk machines like ATMs and voting, and now the impressive Apple Touch interface.
Other manufacturers are introducing their own
touch
devices.
- MICR (Magnetic Ink Character Recognition): think checks!
[ MICR ]
(Wikipedia).
- Scanners
[ Image_scanner ]
- OCR::=
[ Optical_character_recognition ]
-- On a special font it is excellent, and on a fixed known
font it is quite good, but scanning regular text
with its many fonts, typefaces, wrinkles in the paper, spots and so on, OCR
only gets 80% to 90% accuracy. The key technology is to convert
the image into a two-dimensional array of points and try and
match parts of it with known templates. There are improvements on this
using special data structures and algorithms. I'm not sure where
to get the nitty-gritty details. Some forms of OCR input can
even handle hand-printed numbers.
- Cards -- old
[ Computer_punch_card ]
(Wikipedia), new
[ Magnetic_stripe_card ]
(Wikipedia), and smart
[ Smart_card ]
(Wikipedia).
- Phone Keyboard -- 4><12 array of buttons + some special.
- Voice input and Speech recognition:
in my experience flaky and typically needs training. Possible
exception: Chinese. Chinese uses inflexions and tone to communicate
meaning and voice recognition technology tends to react well to inflection and
tone. (A result of a simple experiment at CSUSB CSCI dept in the 1980's).
Speech recognition is a good way to input data
when the hands are busy and the vocabulary
is small, fixed, and discrete. Recognizing normal speech is less effective
-- and the
technology is probably proprietary (= secret). Most techniques are based
on separating out the different frequencies of sound that make up the
sound: Fourier or Spectrum Analysis. This gives patterns that can be
correctly recognized in many cases. However even recognizing where one
word begins and another ends in normal speech turns out to be very difficult.
It doesn't help that in normal speech we run words together and omit
sounds that are supposed to be there.
For details try the Wikipedia
[ Speech_recognition ]
- Game controllers: hand held, buttons, joysticks, ... forms of motion sensing,
including the Wii Controller
- Motion Sensing devices: iPod touch,
[ Wii_Remote ]
[ Wii Dupe.shtml ]
- Haptic devices -- you hold and manipulate and get force feedback.
- Manual push buttons, switches, knobs, rollers, etc.
- CD-ROM -- Compact Disk Read Only Memory, and lately DVD...
- Sound
- Many other sensors -- eg. detecting particular molecules, ionization,
humidity, pressure, temperature, ... -- all depending on Analog to
Digital conversion.
. . . . . . . . . ( end of section Input)
As a rule -- get the input into your system as close to where it is created as possible.
Collect it
automatically if possible. Once data is stored securely, reuse it so that it does not have to be re-input.
Note: Re-inputting data into a web form is a common design error on web systems.
The first thing a system must do when data is input is to verify and validate it.
Failure to do this has led to embarrassment, security break-ins, loss
of money, and so on. Not stringently checking the user's input
is just like putting a "Welcome Mat" outside your front door, leaving it
unlocked, and going out for the night. Anything can happen....
For example I have a simple PHP script that searches this web site. The user
inputs a string they are interested in and the script "GETs" it and searches
a dictionary of important terms... And all was well for 4 years and then
I got an email from the campus Information Security Office. They were
checking every script on all the web servers and found out that if you
called the script directly, the script output an error message -- not too
worrying except that the error message contains the path to the script
on the server. This is like handing out a map of your house showing
that you have your expensive sound system behind a flimsy wall...
I spent 24 hours patching this and 20 other scripts to avoid this
problem.
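Here is a minimal sketch of the "verify and validate first" rule, written in Python rather than the PHP of the script described above. The names `validate_search_term` and `handle_request`, the 40 character limit, and the allowed-character pattern are all assumptions made for illustration; they are not the checks used on this site.

```python
import re

# Assumed policy for illustration: 1 to 40 letters, digits, spaces, or hyphens.
ALLOWED = re.compile(r"[A-Za-z0-9 \-]{1,40}")

def validate_search_term(raw):
    """Return a cleaned search term, or None if the input is missing or unacceptable."""
    if raw is None:              # the script was called directly, with no input at all
        return None
    term = raw.strip()
    if ALLOWED.fullmatch(term) is None:
        return None
    return term

def handle_request(raw):
    """Answer bad input with a generic message; never echo server details such as paths."""
    term = validate_search_term(raw)
    if term is None:
        return "Please enter a search term (letters, digits, spaces, hyphens)."
    return "Searching for: " + term
```

The point of the design is that the missing-input case is handled deliberately, so the user never sees whatever error message the language runtime would have produced.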
Here
[ Output_device ]
is the Wikipedia summary.
- Types of Output: audio, fax, COM, COLD, EMail, Internet, Mobile, Special, printers, screens, sound, CD-RW, DVD-RW, ....
- Special: POS
[ Point_of_sale ]
, ATMs, special printers, plotters, photos, TVs, VCRs, Toasters, Blinking
VCR displays, speakers, and earphones.
- Screens
- Printers: laser printer, page printer, line printers, ...
[ Computer_printer ]
[ Laser_printer ]
[ Inkjet ]
[ a3.dotmatrix.html ]
[ Line_printer ]
- Special displays: lights, LEDs, LCDs, ...
- Mobile: cell phone, wireless PDA, ...
- EMail and Email attachments -- a simple way to get data from a computer to a
remote or mobile user.
- Web page - open and insecure -- Again a simple way to share data that
is not particularly secure.
- COLD
[ Computer_Output_to_Laser_Disk ]
(the predecessor to the CD-ROM and DVD).
- CD-ROM, DVDs, Blu-ray -- still a developing technology.
- COM -- Computer Output of Microfilm
[ Microfilm ]
- Fax -- optional printer on many PCs/Macs.
- Audio -- These days speech is on a chip.
We will talk more about different ways of encoding data
(EBCDIC, ASCII, Unicode, XML, .... ) later in the course.
The principle of locality is one of the most important principles for
choosing and organizing data. It
relates the design of data processing and software systems to their
performance. Quite simply...
Where data is stored determines how fast it
can be found and retrieved. So, the closer the data is to where it is
processed, the faster the system can run. Similarly, when the sequence
of data accesses moves from a position to a nearby one
then the system will run faster. For example, consider a normal
telephone directory/list of contacts... It is easy to find the phone number of
a person.... but try finding their neighbor's phone number in the
same phone book. (No! You can't phone up the person and ask them for the
name of their neighbor).
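The phone book example can be sketched in a few lines of Python. The directory entries are made up, and the function names are hypothetical; the sketch only shows that a lookup which follows the stored order can jump straight to the answer, while a lookup on any other field has to read entry after entry.

```python
import bisect

# Hypothetical directory, kept sorted by name (all entries are made up).
directory = [
    ("Adams", "12 Oak St", "555-0101"),
    ("Baker", "98 Elm St", "555-0102"),
    ("Chen", "14 Oak St", "555-0103"),
    ("Diaz", "3 Pine Rd", "555-0104"),
]
names = [entry[0] for entry in directory]

def phone_by_name(name):
    """Binary search: possible only because the book is sorted by name."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return directory[i][2]
    return None

def phone_by_address(address):
    """The book is not ordered by address, so every entry may have to be read."""
    for _, addr, phone in directory:
        if addr == address:
            return phone
    return None
```

Finding the neighbor of "12 Oak St" means calling `phone_by_address("14 Oak St")`, which scans the whole list in the worst case.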
Or consider the old magnetic tape, which can retrieve (and write) data
very quickly once it starts moving at full speed, and as long as you don't
stop or start. You can pick up the next piece of data almost instantly, but it
takes several minutes to go back to the beginning of the tape, or to the
end.
I learned this when I first used a new magnetic tape based compiler
in ICI (Yorkshire, England) in the 1960's. Immediately after
my first compilation, all the compilations by my team were
taking 4 or 5 minutes! It turned out that I shouldn't have asked the compiler
to compile my program to tape until I had removed all the compile errors. A
bug in the compiler
left an incomplete file on the tape if the compiler halted on an error
while writing to tape. It did not write an
end-of-tape marker. As a result my compilations (and my team's compilations)
involved spooling 200ft of tape to get to the "end". My name was mud! But
the compiler team thanked me for finding the bug. And then said
"don't do that again!"
Moving to disks did not change the principle of locality. When the
operating system scatters a file all over the disk the computer slows down.
This is called fragmentation. We have special programs to defragment
disks. But clever data design can make parts of the
software run much faster. If the
data is read in a sequence that makes the disk head jump at random then
each read has an average time proportional to the size of the data set. But,
sequential access is faster and depends (on average) on how fast the disk
moves not on how much data is stored.
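A toy model makes the difference between random and sequential access visible. It assumes one block of data per track and measures cost only as the number of tracks the head crosses; rotation, caching, and real disk geometry are ignored, so treat the numbers as an illustration and not as a measurement.

```python
import random

def head_travel(track_sequence, start=0):
    """Total distance the head moves, in tracks, visiting tracks in the given order."""
    position, total = start, 0
    for track in track_sequence:
        total += abs(track - position)
        position = track
    return total

tracks = list(range(1000))          # assumed layout: one block on each of 1000 tracks
sequential = head_travel(tracks)

shuffled = tracks[:]
random.Random(1).shuffle(shuffled)  # fixed seed so the run is repeatable
scattered = head_travel(shuffled)

print("sequential:", sequential, "tracks crossed")
print("scattered: ", scattered, "tracks crossed")
```

In this model each sequential read moves the head by at most one track, while a random read moves it across a sizeable fraction of the disk, so the cost per read grows with the size of the data set.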
The principle of locality also applies to networks. As Admiral Grace
Hopper observed, light takes 1 nanosecond to travel about 11.8 inches. She used to
hand out pieces of wire cut to this length. I have one in my office.
She used to observe
that even her fellow admirals would understand that there are a
lot of nanoseconds from the ground to a satellite and so one could not
instantly communicate with people on the other side of the world. The
long delay is an example of latency. It can be a major pain in
web applications. By the time your server has communicated with the
user's client they have lost attention. In some cases it is even worth
having multiple copies of the data in many different servers so that
it can be delivered rapidly to the processes that need it... but this
needs subtle programming to make sure the different copies are synchronized.
As an example if you want to download a copy of Real Player you will be invited
to choose the closest of several servers. Similarly Netflix places its
servers in the same building with various ISP hubs to reduce latency and
increase bandwidth.
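To put a number on "a lot of nanoseconds", here is a small calculation. It assumes a geostationary satellite directly overhead at an altitude of 35,786 km and a signal moving at the speed of light in a vacuum; both are assumptions added for this example, and real links add processing and routing delays on top.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458      # in a vacuum
GEOSTATIONARY_ALTITUDE_M = 35_786_000     # assumed example: satellite directly overhead

def light_travel_time(distance_m):
    """Seconds for a signal moving at the speed of light to cover distance_m."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S

up_seconds = light_travel_time(GEOSTATIONARY_ALTITUDE_M)
up_ms = up_seconds * 1000
up_ns = up_seconds * 1e9

print(f"ground to satellite: {up_ns:,.0f} nanoseconds ({up_ms:.0f} ms)")
print(f"up and back down:    {2 * up_ms:.0f} ms")
```

That is roughly 119 million nanoseconds each way before any equipment has done any work, which is why no amount of faster hardware removes this kind of latency.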
The principle of locality even holds
at the machine code level: (fastest) data in cache vs data in RAM, data in RAM vs
data in virtual (disk) memory, ...(slowest).
The
principle of locality
means that
there is a sequence that lets you access the data faster than other
sequences. As a result defragmentation is a key way to improve
badly designed disk storage systems. Similarly, sorting data is a key technique for
improving the performance of a computer system.
The best device depends on many factors -- what you want it to do,
cost, size, how much data, how fast, and how reliably, and how mobile, ...
A key decision for a business is where to store its data. This is
a common choice. Unfortunately the best answers have to be worked
out on a case by case basis and even change as technology changes.
For example Daniel Truckenmiller's senior project
[ seminar/20120608DarinTruckenmiller.txt ]
(June 2012) turned on replacing a networked storage device. It took
research and some simple mathematics to select the best way forward.
- Registers and cache in CPU
Very fast, small, and transient.
- RAM/Primary memory/Core
[ Random_access ]
Fast, getting bigger, but transient.
- Memory chips for cameras and hand held devices.
- Flash drives -- portable storage of data.
[ Flash_drive ]
Portable SSD.
The computer spy's best friend.
- Solid State Disks --
SSD
[ SSD ]
Slower than RAM but faster than disks.
Persistent storage.
Will fail after a large number of overwrites.
- Disks -- Direct access -- move head and wait for data to go by.
[ Hard_disk ]
[ Floppy_disk ]
also Zip Disks, etc.
Started out the size of a washing machine...
Persistent storage.
Survives until jostled or shut down incorrectly too often (
disk crash
).
- Optical storage devices: CDs,....
- Tapes -- Sequential Access -- but fast when you get up to speed.
[ Magnetic_tape_data_storage ]
Very Persistent storage.
. . . . . . . . . ( end of section Storage devices)
- Data Base -- A collection of linked files -- CMS.
- File -- a collection of records of one type -- all the student records we have.
- Record -- collection of elements referring to one entity -- Your student record.
- Element -- An indivisible atom of information -- example: student Id.
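The four levels above can be sketched as Python data structures. The record types, field names, and sample values are hypothetical and chosen only to mirror the student example; a real data base would of course live in a data base management system.

```python
from dataclasses import dataclass, field

@dataclass
class StudentRecord:            # Record: elements describing one entity
    student_id: str             # Element: an indivisible item of data
    name: str

@dataclass
class CourseRecord:
    course_id: str
    title: str

@dataclass
class Database:                 # Data base: a collection of linked files
    students: list = field(default_factory=list)     # File: records of one type
    courses: list = field(default_factory=list)
    enrollments: list = field(default_factory=list)  # links: (student_id, course_id)

db = Database()
db.students.append(StudentRecord("S001", "Pat Example"))
db.courses.append(CourseRecord("C101", "Example Course"))
db.enrollments.append(("S001", "C101"))

def courses_for(db, student_id):
    """Follow the links from a student record to the course records."""
    wanted = {c for s, c in db.enrollments if s == student_id}
    return [course.title for course in db.courses if course.course_id in wanted]
```

The `enrollments` list is what makes the files "linked": it connects records in one file to records in another by their identifying elements.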
. . . . . . . . . ( end of section Storage)
Notice that data flows between processes can be internal to a computer
or through a network. You can even connect outputs to inputs. However
one common and simple improvement to a system is to spot a place where a human
re-inputs data that is produced by a computer. This tends to be slow and
error-prone and something to be avoided except for a good reason -- like security.
You can quantify the behavior of a connection in terms of three key values. Wise computer
people tend to think in these terms: Latency, Bandwidth, and Reliability.
Latency
is the delay between when a signal/message is sent, and when it arrives.
Latency is a time measured in microseconds, milliseconds, seconds, minutes, ...
Bandwidth
is a measure of how much data you can transmit in a given time.
Typically you have to wait for the first message (Latency) and then the data starts
flowing at the rate of the Bandwidth. This is measured in terms of the amount of information that can be sent
per unit of time. For example bits per second, bytes per second, ... There is a special unit, the Baud, that
is approximately bits per second. It is said that the Kludge Komputer Korporation had such bad
connections that the salesmen quoted the bandwidth in cpf, which stood for characters per fortnight:-) Related to Bandwidth is the user level concept of
Throughput
-- the number of transactions that can be done in a given time.
You can find a lot of useful reference materials by starting at
[ Bandwidth_(computing) ]
on the Wikipedia.
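The "wait for the latency, then flow at the bandwidth" description turns into a one-line formula: time = latency + size / bandwidth. The sketch below applies it to two made-up connections; the figures are illustrative assumptions, not measurements of any real link.

```python
def transfer_time(size_bits, latency_s, bandwidth_bps):
    """Seconds until the last bit arrives: wait for the latency, then stream at the bandwidth."""
    return latency_s + size_bits / bandwidth_bps

one_megabyte = 8 * 1_000_000    # bits
short_message = 1_000           # bits

# Connection A: slow to start, fast once flowing. Connection B: the reverse.
a_big = transfer_time(one_megabyte, latency_s=0.5, bandwidth_bps=10_000_000)
b_big = transfer_time(one_megabyte, latency_s=0.01, bandwidth_bps=1_000_000)
a_small = transfer_time(short_message, latency_s=0.5, bandwidth_bps=10_000_000)
b_small = transfer_time(short_message, latency_s=0.01, bandwidth_bps=1_000_000)

print(f"1 MB:      A {a_big:.2f} s, B {b_big:.2f} s")
print(f"1000 bits: A {a_small:.4f} s, B {b_small:.4f} s")
```

For the large transfer the high-bandwidth connection wins; for the short message the low-latency one does. Which figure matters depends on the size of what you send.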
Reliability
is a measure of how few errors are introduced when the data is sent through the connection. Also
the chance of the connection being broken. Reliability is a complex property with no single measure.
We can list the following problems with connections:
- Items are transmitted but are never received.
- Items are received but never transmitted.
- Items are transmitted and received more than once.
- Items are distorted as they move from transmitter to receiver.
- Items are received in a different order from that in which they were transmitted.
Working with a connection with a high probability of one or more of the above faults
is difficult. You may have to waste bandwidth to detect and/or correct the errors.
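As a sketch of how extra bandwidth buys error detection, the code below adds a sequence number and a CRC-32 checksum to each item. The frame layout and the function names are assumptions for illustration, not any particular protocol. It detects lost, duplicated, distorted, and reordered items; it does not attempt to detect items that were received but never transmitted.

```python
import zlib

def make_frame(seq, payload):
    """Sender side: attach a sequence number and a CRC-32 checksum to the payload."""
    return (seq, payload, zlib.crc32(payload))

def receive(frames, expected_count):
    """Receiver side: rebuild the data and report the faults visible in what arrived."""
    problems = []
    seen = {}
    last_seq = -1
    for seq, payload, checksum in frames:
        if zlib.crc32(payload) != checksum:
            problems.append(f"frame {seq}: distorted")
            continue
        if seq in seen:
            problems.append(f"frame {seq}: received more than once")
            continue
        if seq < last_seq:
            problems.append(f"frame {seq}: out of order")
        seen[seq] = payload
        last_seq = max(last_seq, seq)
    for seq in range(expected_count):
        if seq not in seen:
            problems.append(f"frame {seq}: no intact copy received")
    data = b"".join(seen[s] for s in sorted(seen))
    return data, problems

sent = [make_frame(i, p) for i, p in enumerate([b"AB", b"CD", b"EF", b"GH"])]
# Simulated bad connection: reordered, duplicated, one payload altered, last frame lost.
arrived = [sent[1], sent[0], sent[0], (2, b"EX", sent[2][2])]
data, problems = receive(arrived, expected_count=4)
for line in problems:
    print(line)
```

A real protocol would go on to ask for the missing frames again, which costs still more bandwidth and adds latency.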
Latency: the time to make the call. Bandwidth: How fast can you talk? Reliability: Distortion, frequencies clipped,
Breaking up, bad coverage, dropped calls.
Latency: time to log in and go back to application. Bandwidth: Depends on load and which one you choose -- about 58Kbps -- any exact figures?
Reliability: Seems pretty good... what do you think?
For example: A company uses pigeons to take a memory stick from a camera to their home base. Why?
OK... the latency is not good.... it takes 30 minutes for the pigeon to fly down the Grand Canyon.
All the company wants is to have the photographs available to the people rafting down the canyon before
they get back.
But the bandwidth is high: each flight delivers the whole contents of the memory stick. Reliability? Depends on the presence of
hawks!
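The pigeon's bandwidth can be estimated as the capacity carried divided by the flight time. The notes do not say how big the memory stick is, so the 16 GB figure below is purely an assumption; only the 30 minute flight comes from the example.

```python
def effective_bandwidth_bps(capacity_bytes, trip_seconds):
    """Average bits per second when a whole storage device is carried in one trip."""
    return capacity_bytes * 8 / trip_seconds

STICK_BYTES = 16 * 10**9        # assumed 16 GB memory stick (the notes give no size)
FLIGHT_SECONDS = 30 * 60        # the 30 minute flight from the example

pigeon_bps = effective_bandwidth_bps(STICK_BYTES, FLIGHT_SECONDS)
print(f"{pigeon_bps / 1e6:.0f} Mbps on average, but nothing arrives for {FLIGHT_SECONDS} seconds")
```

Under that assumption the pigeon averages about 71 Mbps: high bandwidth combined with very high latency.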
Exercise -- evaluate some other technologies.
- Paperwork -- provides a record of the communication. Can be scanned back in or retyped.
Best done by other people!
- Sneakernet -- Copy data to a memory chip/flash drive/floppy/zip/tape/etc and walk it to
the other part.
Slow, cheap, but reliable!
- Face to Face.
- Phone...
- VOIP/SKYPE?
- A series of "Best technologies" for connecting devices no more than
6 feet apart by wire, like a disk drive and a PC:
- Ancient Proprietary systems.
- Serial -- the venerable RS232 interface and twisted pairs...
- Parallel -- the classic Centronics Printer ribbon cable.
- Small Computer System Interface
[ SCSI ]
(pronounced: skuzzy).
- Universal Serial Bus
[ USB ]
- Firewire
(The latest IEEE sponsored way of hooking up devices. Examples
include cameras to PCs. IEEE 1394. See Wikipedia
[ Firewire ]
and this IEEE tutorial
[ 2.730740 ]
)
- ...
Ethernet
-- a protocol for transmitting data inside a single network. Originally designed for wide area radio networks (e.g. Hawaii) then adapted to
coaxial cables and then to twisted pairs.
You can build an
Internet
on top of any technology -- even phones and modems.
The first internet connected several Ethernet networks.
The Internet is defined by the TCP/IP stack of protocols. TCP defines how to move data and IP defines how
to navigate multiple networks. Internet technology is largely defined by
the Internet Engineering Task Force
[ http://www.ietf.org/ ]
(IETF) and the "Requests For Comment"
[ http://tools.ietf.org/html/ ]
that they archive.
There is now a ton of technology that drives the Internet including
repeaters, routers, switches, firewalls, Domain Name Servers (DNS),
and so on.
WWW
-- The World Wide Web is built on top of all the previous Internet
technologies...
VPN -- Virtual Private Network,
using encryption to fake an isolated
network. The introduction of the VPN technology looks
a little fuzzy but the ideas were around in the 1990s and
were standardized by 2000. Here is what I dug out of the Internet.
- VPN's are mentioned and defined in a 1995 editorial by Darren Boulding:
[ security.html ]
- There were IEEE research papers in 1996
[ SDNE.1996.502456 ]
and then an editorial
[ 2.634834 ]
in 1996.
- The first standard (I can find) is an Internet Request For Comment
[ rfc2764 ]
posted in February 2000 by Gleeson, et al.
- In 2001 Don Hall claimed VPNs were
covered by his 1992 US Patent #5,126,728.
. . . . . . . . . ( end of section Wired Connections)
- The security maven's nightmare.... but so convenient.
- Bluetooth for very local connections... Check out
[ BlueTooth ]
in the Wikipedia when you need details.
- IEEE 802.11 -- WiFi, WiMax, ... There is a wide array of IEEE standards
for wireless connections. See
[ 802_11 ]
on the Wikipedia for details.
- Data can be sent through a cell phone connection: hand-set to cell tower to internet.
. . . . . . . . . ( end of section Wireless Connections)
The word "Topology" means "the science of position" and in the context
of networks indicates the connectivity or structure of the network.
So "network topology" is a question of how the parts are connected. We talk
about nodes as the parts and arcs as the connections.
More connections mean more money and complexity.... but more connections
mean greater reliability:
- Bus -- High speed backbone with branches. Used to connect peripherals and central
processor and memory together inside a single computer.
- Linear -- simplest, cheapest, and most likely to be separated. Each node connected to one or two neighbors.
- Star -- A mathematical tree guarantees that there is one path from any node to another.
Or none, if the network breaks down.
- Ring -- Send the token round!
- Network -- many paths give reliability etc. But costs more. Signals can take the shortest route -- saving time.
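The trade-off between cost (number of connections) and reliability can be checked with a small sketch. The builder functions and the six-node size are assumptions for illustration; "full mesh" stands in for the many-path network in the list, and the reliability test used here is simply whether the network stays connected when any one link is cut.

```python
from collections import deque

def linear(n):
    return [(i, i + 1) for i in range(n - 1)]

def ring(n):
    return linear(n) + [(n - 1, 0)]

def star(n):
    return [(0, i) for i in range(1, n)]          # node 0 is the hub

def full_mesh(n):
    return [(i, j) for i in range(n) for j in range(i + 1, n)]

def connected(n, links):
    """Breadth-first search: can every node be reached from node 0?"""
    neighbours = {i: [] for i in range(n)}
    for a, b in links:
        neighbours[a].append(b)
        neighbours[b].append(a)
    reached, queue = {0}, deque([0])
    while queue:
        for nxt in neighbours[queue.popleft()]:
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return len(reached) == n

def survives_any_single_break(n, links):
    """True if the network stays connected whichever one link is cut."""
    return all(connected(n, links[:i] + links[i + 1:]) for i in range(len(links)))

n = 6
for name, build in [("linear", linear), ("star", star), ("ring", ring), ("full mesh", full_mesh)]:
    links = build(n)
    print(f"{name:10} links={len(links):2}  survives one break: {survives_any_single_break(n, links)}")
```

With six nodes, the linear and star layouts need 5 links each and fail on a single break, the ring needs 6 and survives, and the full mesh needs 15 and survives.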
I wrote the following notes in response to a request from a student. I hope they help.
However I don't expect you to memorize these hints when I write quizzes and
final questions.
- Get Trained!
Take our System Admin sequence: CSCI360, CSCI365, and CSCI366 -- they are
part of a BA option.
- Abandon any idea of being up-to-date and cutting edge. Remember the
bath-tub curve: reliability is best in the middle of a technology's life.
The chance of failure is high initially and increases at the end of the
life time.
NASA on-board computers are always several generations behind to maximize
reliability. Best choose things that other people have already had
good experiences with. Others can be bleeding edge:-)
- On the other hand keep your software up to date -- MS products get monthly security patches and anti-virus products seem to update several times a week.
- Next: how reliable? 365/24/7 is more expensive than 20/6.
- Reliable cabling: hidden and redundant.
Even so, a backhoe, a rat, or a squirrel can bring things crashing down.
- Reliable hardware -- and that means a controlled and secure
environment. I've known a server to shut down for an hour once a week
when the custodian unplugged it and plugged in a floor polisher!
Lock up key servers.
- Don't forget you will need backup processors and a way to backup and
recover data.
- Then you need to set up redundant servers for running a network: DNS,
NIS or LDAP servers, NFS or other file sharing, Web servers, ...
- All software must be fully up to date with patches, else get the last
stable release. Design a system that keeps all MS systems up to date.
- If you've gone for Wi-Fi then you must secure it.
Recall: WiFi works through walls.
- Then the security system: fire walls (perimeter and between sub networks).
- Did I mention backing up all the data?
- Develop admin procedures that monitor and improve reliability.
- You need to maintain a configuration management inventory! It documents the version of
each component you have on each computer. You will need to know when each component was
last updated.
- Then train the administrators
- Set up the panic button schedule: who
comes in at 11 at night to reboot the system after a power cut?
- Then come the client machines. Big question: what hardware and what
platform? How do you make sure that large numbers of workstations are backed up?
- Then train the users in reliable computing.
- Did I mention backing up all the data?
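The configuration management inventory mentioned in the list can start out very simply. The sketch below is one possible shape for it, with hypothetical machine names, versions, and dates; the 90 day threshold is likewise just an assumed policy.

```python
from datetime import date

# Hypothetical inventory rows: (computer, component, version, last updated).
inventory = [
    ("server-1", "web server", "2.4", date(2012, 5, 1)),
    ("server-1", "operating system", "6.2", date(2011, 1, 15)),
    ("lab-pc-07", "browser", "12.0", date(2012, 4, 20)),
]

def stale_components(rows, today, max_age_days):
    """List (computer, component) pairs not updated within max_age_days of today."""
    return [(computer, component)
            for computer, component, _version, updated in rows
            if (today - updated).days > max_age_days]

overdue = stale_components(inventory, today=date(2012, 6, 1), max_age_days=90)
print(overdue)
```

Even a list like this answers the two questions in the notes: which version is on which computer, and when it was last updated.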
. . . . . . . . . ( end of section Connections)
The UML provides a new standard way to describe the architecture of a system:
the hardware and the software that it executes. Prior to that there was a branch of flow charting
(see
[ Systems Flowchart ]
below)
that was used to indicate the physical devices in a system and how they were connected.
These two figures show the system I used to use up until Summer
2008. And the replacement I was hoping for. Later I will show you what
actually happened. In both
diagrams I'm using the old UML1 notation from the
old Rational Rose free student edition.
In a deployment diagram there are three-dimensional cubes or boxes called
nodes.
In the diagrams above the cubical boxes represent hardware devices and computers.
Deployment diagrams also show the
connections
between nodes as simple lines
with no arrowheads. Finally the software that is deployed onto the
hardware is also shown inside the 3D boxes (nodes). This notation
changed in 2003 from UML1 to UML2.
Deployment Diagrams
- Show Nodes and Artifacts.
- The
Nodes are 3-D boxes.
They can be hardware or software as long as they
execute programs. Examples -- A Laptop. An iPod. A Mainframe. The
Java Virtual Machine. The UNIX Shell. Visual BASIC.
- Nodes have Artifacts.
- These
Artifacts can be listed in the box or shown inside rectangles in cubes.
They are files, programs, data, etc.
- The
Communication paths
are labeled with protocol names. Example: HTTP or TCP/IP or SSH.
They are shown as simple lines with no arrowheads.
- Artifacts depend on each other. This is shown as a dotted line with an
arrow pointing from an artifact to the artifact it depends on. Example:
A browser depends on a web server. A PHP script depends on the
shell scripts and libraries it calls.
- Deployment diagrams can show the components that are manifested by artifacts. But
this is rare. This links the system level parts to the software architecture.
- From the UML2 Language Reference Manual
- Example UML2 -- my 2010 Hardware
- Use simple connections between nodes. No Arrows. Just lines marked with the
protocol.
- No connections between artifacts. But dependencies are OK and common.
- Nodes represent "Execution environments" including computers and operating
systems.
- Nodes can be put inside nodes to show that one executes the other. For
example to show that the client PC executes a browser and a Java Virtual
Machine.
- Artifacts are things that are created: data, programs, scripts, libraries, ...
- Artifacts manifest elements of other models -- components, classes, ...
- Hardware -> Nodes
- Op Systems (if special) -> Nodes inside hardware nodes, else use tagged values.
- Virtual Machines -> Nodes. Example you could show the "Java Virtual Machine" on a PC as the execution environment for compiled Java Applets.
- Data bases executing SQL: SQL artifacts on Data Base node.
- Browsers that are asked to execute significant scripts and/or applets would also be nodes placed inside hardware.
The scripts and applets are shown as artifacts.
If the browser executes a virtual machine then this would also be
an execution environment. A common example is the
JVM
-- Java Virtual Machine. Meanwhile on the server
one might wish to show Microsoft's Common Language Infrastructure (CLI)
as an execution environment for systems that use their .NET Framework.
- Simple data bases -> artifacts
- files->artifacts stereotyped <<file>>
- programs->artifacts stereotyped <<process>>
- The UML defines how symbols in other kinds of diagram are linked
to symbols in deployment diagrams. Classes are encapsulated in Components. Components are manifested as artifacts. Artifacts are deployed to nodes.
- In CSE557 we use these diagrams to analyze and design systems. In CS375 we will
use them to design software.
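The chain described in the rules above (classes are encapsulated in components, components are manifested as artifacts, artifacts are deployed to nodes) can be sketched as three lookup tables. This is only an illustrative sketch in Python: every class, component, artifact, and node name below is hypothetical and is not taken from any diagram in these notes.

```python
# Hypothetical names throughout; only the three-step chain comes from the notes.
ENCAPSULATED_IN = {"Student": "Enrollment", "Course": "Enrollment"}  # class -> component
MANIFESTED_AS = {"Enrollment": "enroll.jar"}                         # component -> artifact
DEPLOYED_TO = {"enroll.jar": "Web Server"}                           # artifact -> node


def node_for_class(class_name: str) -> str:
    """Follow class -> component -> artifact -> node and return the node name."""
    component = ENCAPSULATED_IN[class_name]
    artifact = MANIFESTED_AS[component]
    return DEPLOYED_TO[artifact]


print(node_for_class("Student"))
```

Tracing a class through the tables answers the question "which machine does this code end up running on?", which is the link between the software models and the deployment diagram.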
The short article
[ Deployment_diagram ]
on Wikipedia gives a brief description.
In the UML you can add constraints to things by using tagged values
that look like this
{webserver="Apache Tomcat"}
{OS="MS XP"}
{CPU="Intel ...."}
{author=RJB, file="a3.html", source="a3.mth", language="MATHS"}
These are a loose but useful way of supplying data about nodes and artifacts.
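If you keep such data in a program or a spreadsheet export, the tagged-value text can be generated rather than typed. The following is a minimal sketch in Python; the function name `tagged_values` is made up for this illustration, and it always quotes the values (one of the examples above leaves `RJB` unquoted).

```python
def tagged_values(**tags: str) -> str:
    """Render keyword arguments in the {name="value", ...} form shown above."""
    return "{" + ", ".join(f'{name}="{value}"' for name, value in tags.items()) + "}"


print(tagged_values(webserver="Apache Tomcat"))
print(tagged_values(OS="MS XP", file="a3.html"))
```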
You can also attach some useful stereotypes to artifacts. The following are
well known:

| Stereotype | Meaning (UML2) |
|---|---|
| <<file>> | A physical file in the context of the system developed. |
| <<script>> | A script file that can be interpreted by an execution environment or node. |
| <<executable>> | A program file that can be executed on a computer system. |
| <<library>> | A static or dynamic library file. |
| <<source>> | A source file that can be compiled into an executable file. |
| <<document>> | A generic file that is not a source file or executable. |
The above diagram summarizes these facts:
- The iPod runs iOS4 and communicates using the HTTP protocol with a web server.
- The web server runs Linux and the Apache web server.
- Apache executes PHP.
- The iPod runs Safari, which depends on Apache.
- The web server stores pages and a data base called "Data" which PHP uses.
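The same facts can be written down as a small data structure, which is a handy way to check that you have understood nesting, artifacts, connections, and dependencies. This is a sketch in Python under stated assumptions: the `Node` class and `outline` function are invented for this illustration, and the layout (PHP nested inside Apache, "pages" and "Data" as artifacts on the web server) is only one possible reading of the facts listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An execution environment; nested nodes are executed by their parent."""
    name: str
    tags: dict = field(default_factory=dict)
    nodes: list = field(default_factory=list)      # nested execution environments
    artifacts: list = field(default_factory=list)  # names of deployed artifacts


def outline(node: Node, depth: int = 0) -> list:
    """Return the node tree as a list of indented text lines."""
    pad = "  " * depth
    tags = "".join(f' {{{k}="{v}"}}' for k, v in node.tags.items())
    lines = [f"{pad}node {node.name}{tags}"]
    lines += [f"{pad}  artifact {a}" for a in node.artifacts]
    for child in node.nodes:
        lines += outline(child, depth + 1)
    return lines


# One possible reading of the listed facts (the layout is an assumption).
php = Node("PHP")
apache = Node("Apache", nodes=[php])
server = Node("Web Server", tags={"OS": "Linux"}, nodes=[apache],
              artifacts=["pages", "Data"])
ipod = Node("iPod", tags={"OS": "iOS4"}, nodes=[Node("Safari")])

connections = [("iPod", "Web Server", "HTTP")]          # plain lines labeled with a protocol
dependencies = [("Safari", "Apache"), ("PHP", "Data")]  # dependencies (dashed arrows in UML)

for top in (ipod, server):
    print("\n".join(outline(top)))
for a, b, protocol in connections:
    print(f"{a} --{protocol}-- {b}")
for user, used in dependencies:
    print(f"{user} depends on {used}")
```

Note that the connection is between the two hardware nodes, while the dependencies are between the things placed inside them.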
You may see some of the older style deployment diagrams. So here
are the rules:
- Show nodes and components.
- Nodes are 3-D boxes. Components can be listed under the box or shown as
rectangles with "tongues" on the left inside the nodes.
- Computers -> Nodes
- Special nodes for devices other than computers.
- Connections between nodes are labeled with protocols. No Arrows.
- Op Systems shown as components or as a tagged value in a node.
- Nodes contain Components.
- Virtual Machines
- Data bases
- files
- programs
- Connections between components show dependency. You can also show the interfaces
provided by the components as lollipops.
The web is full of obsolete diagrams.
Use UML2 in this class and CS375.
The old notation suffers from clutter because it shows both the system
architecture and the software architecture (components) on one diagram.
UML2 is less cluttered. It separates the systems
architecture (deployment) from software architecture (components).
UML1 deployment diagrams could also show devices that were not computers.
This feature is missing from UML2 deployment diagrams.
Advice: Only use UML1 if your organization has a policy or standard that you
can not change.
Click through to these diagrams I found on the web.
[ DemoSysDeploy.jpg ]
[ MbariDeployment.gif ]
[ deployment-diagram1.png ]
[ deployment_diagram.gif ]
[ 12779.jpg ]
[ way_dep_diagram.jpg ]
Which of the above are UML1.* and which UML2.0? Which are correct? What conclusion can you draw?
- Mainframe plus card input/output and line printers.
- Mainframe+terminals: A terminal is a special device with limited functions
-- a keyboard for input and a screen for display.
- Mainframe+clients emulating terminals.
- Stand alone processing. Workstations without connections.
Use sneakernet to share data.
- File sharing: Networked peer-to-peer workstations share data.
- Client/server: Dedicated server serves many client workstations.
- Fat and thin clients: A thin client has little special software and can not execute general purpose programs.
- Multi-tier client/server: Many servers with different functions.
- Middle-ware: Specialized "Glue" software for connecting tiers.
- AJAX -- (asynchronous JavaScript and XML)
[ AJAX ]
- Peer-to-Peer: All machines have similar power and share the load.
- Virtualized Client/Server -- The servers are executed by one machine and each
assumes it has the whole machine.
- Cloud Computing: relying on the internet and special hidden servers which other people
run, power, cool, secure, and maintain.
Even though the functions of the Student Information System have not changed, its
name and architecture have changed many times.
Architectures:
- Mainframe+cards and line printer
- Mainframe running SIS+ with line printer and cards
- Mainframe running SIS+ with access through a PC running a T3270
terminal emulator.
- Mainframe running SIS+ and TRACS etc.
- Mainframe running SIS+ and TRACS and webreg etc.
Currently we have a new architecture -- find out about it on the field trips.
Peoplesoft = CMS
An architecture describes the overall structure of something: the parts
and how they are connected. A Systems Architecture describes the hardware
and software making up the system. The performance, cost, and reliability of
a system are often determined by the architecture. Thus we need to
be able to record and evaluate architectures: what software and data are placed
where on which hardware.
Architecture is also a process of choosing the parts to meet the
requirements of the stakeholders. We will look at this later
[ c2.html ]
(Choosing an Architecture) +
[ r3.html ]
(How requirements drive architecture).
There will be more in CSCI375.
Yes... but usually because they haven't learned the new standard.
To get a summary of the differences look at
[ ../papers/20050502Abstract.html ]
and follow the links into the outline and then to the details.
They are about the same. Except that the UML1.0 notation is harder
to figure out.
No! Artifacts are placed inside nodes: they are deployed on a node.
These represent interfaces -- lists of functions that are called/used on
one side of the "lollipop" and implemented/provided by code on the other side.
The particular diagram has a problem: it doesn't make clear which component
provides the functions and which one uses them. It is better to have a dotted
line from the client that uses the interface to the circle indicating it. Then a short solid line (think lollipop) from the interface to the component
or class that provides it. The UML2.0 version has a cup for the client but
this is hard to draw!
We'll talk more about this notation in CSCI375.
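In code, the two sides of an interface look something like the following sketch in Python. All of the names here (`Grades`, `GradeBook`, `ReportPrinter`) are hypothetical; the point is only to show which part is the interface, which part provides it, and which part uses it.

```python
from typing import Protocol


class Grades(Protocol):
    """The interface (the circle of the lollipop): a list of functions."""
    def grade_for(self, student: str) -> str: ...


class GradeBook:
    """Provider: implements the functions (solid line to the circle)."""
    def __init__(self) -> None:
        self._grades = {"pat": "A"}

    def grade_for(self, student: str) -> str:
        return self._grades.get(student, "no grade")


class ReportPrinter:
    """Client: calls the functions (dotted line to the circle)."""
    def __init__(self, grades: Grades) -> None:
        self.grades = grades

    def line(self, student: str) -> str:
        return f"{student}: {self.grades.grade_for(student)}"


print(ReportPrinter(GradeBook()).line("pat"))
```

The client only knows about `Grades`, so any other provider with the same functions could be swapped in without changing the client.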
A Java Virtual Machine or JVM.
A Browser that interprets JavaScript.
MySQL.
The .NET runtime environment.
A VB interpreter.
Tomcat Java Server Pages....
ASP -- Active Server Pages.
Interpreters for scripting languages: Perl, Python, Ruby, PHP, Unix shell.
However only show these as nodes if there is something special and unobvious
that needs explaining. It is simpler to just mention them as
[ Tagged Values ]
in the node. Some people just list the internal environments and artifacts inside a node -- with no special notation.
Many people get this wrong. The result is nonstandard.
In your work in this class we will do it right.
Yes. The American National Standards Institute
(ANSI) and European Computer
Manufacturers Association (ECMA) provide very similar rules for systems
architectures. They define special shaped symbols for different devices
and connections between them. This is called a
Systems Flowchart
and here is an example from my Ph. D. Thesis (1971, Brunel University)
- It shows the British mainframe (ICL 1900) with its magnetic storage, line printer, card reader,
and hard disks. I seem to have forgotten the Card Punch -- or else it arrived after
I drew the diagram.
- It shows the ICL 803b with its teletypes, paper tape station, plotter, and
a prototype graphical display unit called the ETOM.
- The two computers were connected by a standard interface -- the British forerunner
of the later SCSI.
- There are also two comments supplying information on the storage available
on each machine: 1900: 32K * 24 bits, 803: 8k * 39 bits.
- There was no standard way to show two-way flows, digitizers, or plotters.
I had to fake them.
- In those days computer people all owned a "Flow Charting Stencil" and a
collection of drawing tools. My thesis was about the algorithms and languages
needed so that people could draw diagrams using a computer instead.
All hail to "MacPaint", "Macdraw", ... "Dia", and even "Visio"!
Here are some of the symbols from that era drawn by the free "Dia" tool.
However, Systems Flowcharts do not let you show what is deployed on the hardware
or the details (when needed) of the nodes, as the UML diagrams do. For example,
if I drew a UML2 Deployment diagram of the old system it would not show the
peripheral devices but it would show the software. The 803 had Algol, SAP, and
PictAlgol while the 1900 had FORTRAN, COBOL, and PLAN. The 1900 had an
Operating System (OS) called George 3 but the 803 had no OS:
(Drawn using the Visio UML2 template by Hruby)
No. They are about the physical connections between hardware
and software. DFDs are about the abstract flows of information
between abstract processes and data stores.
A node is something that can execute code.
An artifact is anything that is made and placed on a computer -- including
data, documentation, files, programs, etc.
There is one tough choice: is an interpreter that you must
program an artifact
or a node? My answer: only show it as a node if you have an
artifact that it will execute. Then you need a box in which
to place the artifact.
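That rule of thumb is simple enough to write as a few lines of Python. This is only a sketch: the function name, the list of interpreters, and the artifact `register.php` are hypothetical, and the "tagged value" alternative is the simpler option these notes suggest for interpreters that need no special explanation.

```python
def classify_interpreters(installed: list, executed_by: dict) -> dict:
    """Map each interpreter to "node" if some artifact runs on it, else "tagged value"."""
    used = set(executed_by.values())
    return {name: ("node" if name in used else "tagged value") for name in installed}


# Hypothetical server: three interpreters installed, but only PHP runs one of our artifacts.
installed = ["PHP", "Perl", "Python"]
executed_by = {"register.php": "PHP"}  # artifact -> interpreter that executes it

print(classify_interpreters(installed, executed_by))
```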
Depends on the project -- look at the requirements.
Look at the properties of the hardware: the client's machine, the
connections, the server. How fast are they?
Can you easily download the extra software
to "fatten the client"?
We will cover this in detail later. Right now understand that
it is the desired qualities of a system (security,
reliability, speed, size, ...)
plus the reality
that drive systems architecture.
The most secure architecture is a mainframe in a locked room
with no connection to the outside world:-)
I think that operating systems that tackled security a long time ago
tend to be more secure... especially when they are not the most
popular ones. So I like the UNIX based ones. I've had good security
experiences with the BSD versions of UNIX
from way back.
Linux or Mac OS X seem to be fairly secure. To make Windows 2K secure is
tough if not impossible: I unplug mine from the network whenever I leave
the office, and I run a personal firewall, and a virus checker, and the MS
updates,... and set all the software to its most secure settings, and
use a suite of tools that encrypts its data...
The most insecure component in any architecture is a foolish person --
fools are too ingenious.