A young and naive officer was visiting an army barracks on the South
Coast of England in the 1930's. He observed that the sentry at the gate
handed a pair of thick rubber gloves to the next sentry when they
changed the guard. The next sentry hung the rubber gloves in the sentry box
and at the end of his "shift" took them out and handed them to the incoming
sentry. The officer was intrigued and asked questions. The sentry could explain what
he had to do, but he was reduced to gibbering when the officer tried to
find out why they did this. The sergeant, when asked "Why?", just repeated the
manual, which described the procedure but didn't explain its purpose.
As far as the sergeant was concerned the only purpose of the rubber gloves
was for the sentries to hand them on to each other!
It took a lot of research in the history of the barracks to find out why.
(I'll give the explanation in class).
Moral1: Always document the reasons for an activity in a system. Do it when
the activity is thought of, or do it 10 years later when you are forced to
remove or change it.
Moral2: Today's solution is tomorrow's confusion and the next day's problem.
It is a good idea to have incoming Email pop up as it arrives.
This solves the problem: "I want to know when I have new Email".
It is a natural solution.
But in some cases
it causes a problem. For example I saw a presentation at the IEEE/ACM Software Engineering
conference ruined when the speaker's Email popped up in the middle of his slides. Now "Smart Phones"
and "iPods" make sure that we always know when new mail arrives, at the cost of continuous distractions.
A similar example starts with the problem that a fixed image on a cathode ray tube tends to
damage it. So computers had special programs called "screen savers" that turn on and display
a moving image when the machine is idle. But, in the middle of a presentation, video, or when watching
a news channel, a screen saver turning on is irritating.
Again you can use the techniques below to model systems at two levels.
You can include all the
technological details. This gives you a physical model.
You can also create abstract or essential models that will not change
when the technology implementing them changes.
We have already seen the two most popular diagrams for
Data Modeling -- DFDs and ERDs:
[ a4.html ]
and this section adds to these diagrams some text based ways of "documenting a system". Later we
will get to some specialized and detailed techniques:
[ r1.html ]
(Rules and procedures),
[ r2.html ]
[ r3.html ]
These notes cover
- Lists of definitions
[ Glossaries ]
- Detailed definitions of the data, entities, stores, processes, and flows in a system
[ Data Dictionaries ]
- Simple text descriptions of processes and procedures
[ Goals and Stories ]
[ Functions and Formulas ]
- Sequences of steps describing a possible behavior of a system
[ Scenarios ]
- Programs that explore possible processes
[ Prototypes ]
A glossary is one of the most useful documents you can develop when analyzing an enterprise.
It is just a list of definitions. But you can use a glossary to define
data, words, processes, entities, stores, etc.
All a glossary does is record the meanings given to the words and phrases
used in an enterprise. Notice that it is common for
different parts of an enterprise to use different terms for the same thing and the same term
for different things. This is one reason why you need to track both the term
and its context (= where it is used). This is why I use a notation like this
when I put a glossary on the web.
You also need to track aliases because the same idea
is often referred to by different phrases in different parts of an organization.
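The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the tool used for this course: the class name `Glossary`, its methods, and the sample definitions are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Glossary:
    """Definitions keyed by (term, context), plus a table of aliases."""
    entries: dict = field(default_factory=dict)   # (term, context) -> definition
    aliases: dict = field(default_factory=dict)   # alias -> official term

    def define(self, term, context, definition):
        self.entries[(term, context)] = definition

    def add_alias(self, alias, term):
        self.aliases[alias] = term

    def lookup(self, word, context):
        term = self.aliases.get(word, word)        # resolve an alias first
        return self.entries.get((term, context))

    def contexts_of(self, word):
        """All contexts in which a term (or its alias) has a definition."""
        term = self.aliases.get(word, word)
        return sorted(c for (t, c) in self.entries if t == term)


glossary = Glossary()
# Same term, two contexts (both definitions are invented for illustration).
glossary.define("record", "data_dictionary", "a collection of data elements handled together")
glossary.define("record", "registrar", "a student's transcript")
glossary.add_alias("row", "record")

print(glossary.lookup("row", "data_dictionary"))
print(glossary.contexts_of("record"))
```

Keying on the pair (term, context) is what lets the same word carry different meanings in different parts of the organization without one definition overwriting the other.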
I have prepared a glossary for this class by extracting definitions from the
web pages. Here is what part of it looks like
- DFD::="A diagram that shows how data moves through a system through processes and stores, from sources to sinks".
[ a4.html#DFD ]
- DFDs::="Data Flow Diagrams".
[ a4.html#DFDs ]
[ d2.html#digit ]
- DRY::="Don't Repeat Yourself", in computer code (data and program) say everything once and only once.
[ d1.html#DRY ]
- ERD::="A diagram showing Entity and Relation Types".
[ d3.html#ERD ]
You can use pieces of paper, pages in a notebook, index cards, text files (notepad, wordpad, memos, notes),
web pages, spread sheets,
or even a small
data base to implement a glossary. Informal glossaries are easy to jot down
or type. For example
I use this syntax
defined_term :: context = definition.
Notice that I also define the contexts for each definition. Contexts start to analyze the structure of the
In Computer Science, in the mid-60's, we adopted the idea of listing definitions
to define the syntax
of computer languages. John Backus and Peter Naur proposed the original
BNF (Backus-Naur Form)
as a practical form of the theoretical "context free grammars"
developed by Noam Chomsky. Since then just about every programming language has had
its syntax defined by using EBNF
(Extended BNF) that has ways to show repeated and optional patterns in the data.
If you want you can look at
[ EBNF ]
on the Wikipedia for history, examples, and details.
I have many examples -- see the
[ Online Enrichment ]
exercises at the end of this page.
BNF was adapted to define data by Tom DeMarco as part of Structured Analysis. You would define,
login::= user_id password,
user supplies their id and password when they login.
between 1 to 8 characters, inclusive.
any number of characters.
The XBNF notation uses a more mathematical
format based on EBNF that includes discrete mathematics. For example you can define
SSN::data_element=("0".."9")^9, unique identifier for USA residents.
date::USA= month "/" day "/" year.
date::UK=day "/" month "/" year.
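The two `date` definitions show why the context matters: the same string means a different day in each. Here is a minimal Python sketch of this, assuming each part of a date is written in digits. The names `parse_date`, `is_ssn`, and `FIELD_ORDER` are illustrative and not part of XBNF.

```python
import re

# Field order for each context, following the two definitions of `date` above.
FIELD_ORDER = {
    "USA": ("month", "day", "year"),
    "UK": ("day", "month", "year"),
}

SSN_PATTERN = re.compile(r"[0-9]{9}")   # ("0".."9")^9: exactly nine digits


def parse_date(text, context):
    """Split 'a/b/c' into named integer fields according to the context."""
    parts = text.split("/")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a date: {text!r}")
    return dict(zip(FIELD_ORDER[context], map(int, parts)))


def is_ssn(text):
    return SSN_PATTERN.fullmatch(text) is not None


print(parse_date("3/4/2010", "USA"))   # read as month 3, day 4
print(parse_date("3/4/2010", "UK"))    # read as day 3, month 4
print(is_ssn("123456789"), is_ssn("12345"))
```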
XBNF is defined as a computer language. As a result I have tools for handling it. If
you go to
[ ../hole.html ]
you will find a web app that will convert XBNF into HTML.
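To give a feel for what such a tool has to do, here is a rough Python sketch that splits simple one-line definitions of the form `term::context=definition.` into their parts. It is not the web app mentioned above and it handles only this one-line form; the regular expression and the function name `parse_definition` are assumptions for illustration.

```python
import re

# A term, an optional context, then "=" and the definition up to the final period.
DEFINITION = re.compile(
    r"^\s*(?P<term>\w+)\s*::\s*(?P<context>\w*)\s*=\s*(?P<body>.*?)\.?\s*$"
)


def parse_definition(line):
    """Return (term, context, definition), or None if the line is not a definition."""
    match = DEFINITION.match(line)
    if match is None:
        return None
    context = match.group("context") or None   # "term::=..." has no context
    return match.group("term"), context, match.group("body")


for line in ['DFDs::="Data Flow Diagrams".', 'date::UK=day "/" month "/" year.']:
    print(parse_definition(line))
```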
I generated a complete glossary
[ glossary.html ]
of all the materials for this course by extracting and sorting all the definitions on all the
pages. You might like to look at it.
A glossary can grow into a data dictionary if you add detailed information on all
the processes, external entities, and data for the enterprise. This in turn becomes the
information you must have to design new systems. Here for example is part of the data
dictionary for a recent student project.
See the online Enrichment at the bottom of this page to see the whole glossary/data dictionary.
A data dictionary is a complete description, often in a data base, that describes a data
base. These are commonly part of the DBMS
-- Data Base Management System. More generally, a data dictionary is any manual or computerized system that lets you
record, read, access, and organize information about the data and processes
in a system. Like any dictionary it defines the meanings of words and phrases.
Often, the information about the data is placed in tables with a fixed layout.
In this case the words and phrases in the dictionary
describe data and processing. For example you might find
|SSN||data element||9 digits uniquely identifying humans in the USA.
in an informal dictionary.
[ Data_dictionary ]
(the Wikipedia entry).
Most of the data dictionaries on the market
use a data base rather than a set of simple files. They are typically part of a proprietary DBMS.
Rather than promote a particular DBMS in this course I
will describe data dictionaries in terms of tables.
All data dictionaries must define the four levels of data:
- Element or Item
- Record or Row
- File or Table
- Data Base
Each level is made up of the lower levels: a record has elements for example.
I think the first use of the name was for a CDC product in the late
It is a place to record some important facts about existing and planned
systems. It complements the various diagrams we draw by providing
and organizing quantitative and textual information about a system.
The only other place where this data is stored is in source code. However,
you will find it distributed and duplicated in many different pieces of
code. Indeed a significant source of bugs comes from incompatible
assumptions made about the data by different pieces of software.
You need to record information about existing systems so that you
can create better systems. You also need to specify
the data in your new systems so that the data base can be created
and the software written to access it.
Many media can hold data in an enterprise.
You should gather samples, print outs, data base designs,
You also need to record ideas and designs. Thus you need to record data
about the data -- so-called metadata.
- metadata::jargon="data about data", for example the format of a date.
A data dictionary describes the data in a system in great detail. The typical format is a
data base with forms as a user interface. Some methodologists use
the computer science technique of writing BNF (above) to
describe data. Some data dictionaries include information on the
processes in a system as well as the data stored and flowing through it.
Data dictionaries are a good place to record physical details like the media and
format of data and processes. This may change but you can then avoid
putting this detail into the more abstract models: DFDs, Scenarios, ERDs, etc.
Notice that you can use any number of different CASE and Systems Analysis
tools to record this data. Visio, Rational Rose, and Dia can use the
Unified Modeling Language to create visual data dictionaries. Or you can
use simple text files (memos or notes on
a PDA) or 3x5 cards and a pencil. Posters are common.
It is also easy to generate web
pages that store the information -- and then you can link the different
pieces of data together.
- Data Elements -- gathered into record structures -- Example: an SSN.
- Record Structures -- describe data flows and stores -- Example: your record on SIS+.
- Data Flows -- list the record structures flowing through the link -- Example: a PAWS report.
- Data Stores -- hold records with the same structure (ideally) -- Example: the student records in the CMS data base.
- External Entities -- sources and sinks, linked to their data flows -- Example: you!
- Processes -- describe activities, linked to their data flows -- Example: "Print rosters".
Notice: all the data in the data dictionary should be given a date when it
was recorded... Ideally the data dictionary should keep a history of
|Item||Example||Description
|Label||Social Security Number||A carefully selected official name for the data
|Alias||SSN||List of other names for the data
|Type||N||Numeric, alphanumeric, date, string, ...
|Length||9||How many bytes are needed?
|Default Value||NONE||If any
|Constraints||Positive 0-999999999||What values are allowed/forbidden
|Syntax||999-99-9999||Layout of the data on input/output
|Security||Human Resources||Who has the right to see/change this data
|Records||...||Links to record structures where the item appears
We will discuss the design of data elements later (
[ d2.html ]
) in this course.
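The table above translates naturally into a small record type. The following Python sketch is one possible rendering, not a standard: the class `DataElement`, its field names, and the rule that turns the `999-99-9999` layout into a regular expression are all assumptions for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DataElement:
    label: str                                   # official name
    aliases: list = field(default_factory=list)  # other names in use
    data_type: str = "N"                         # N = numeric, as in the table
    length: int = 0                              # number of characters stored
    syntax: str = ""                             # layout; each 9 stands for one digit
    security: str = ""                           # who may see or change the data

    def pattern(self):
        """Turn a layout such as 999-99-9999 into a regular expression."""
        return "".join(r"\d" if ch == "9" else re.escape(ch) for ch in self.syntax)

    def accepts(self, text):
        return re.fullmatch(self.pattern(), text) is not None


ssn = DataElement(
    label="Social Security Number",
    aliases=["SSN"],
    data_type="N",
    length=9,
    syntax="999-99-9999",
    security="Human Resources",
)

print(ssn.accepts("123-45-6789"))
print(ssn.accepts("123456789"))
```

Keeping the layout with the element means every program that reads or prints the item can share one rule instead of each inventing its own.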
A record is a collection of data elements that are always handled together. The idea
goes back to the beginning of computer systems when a punched card was a record.
The items or fields were pieces of the card. The idea moved on to all other
data media and into COBOL, and from COBOL to Algol68, Pascal, and C.
Example of an instance of a record:
|Field Label||Field Data
A record is a sequence of named fields -- just like a C/C++ struct
or the data members in a C++/Java class. One way of documenting record
structures is to draw a UML class diagram.
The simplest technique is
to list the field names. But a complete description requires a lot of metadata:
|Items describing a record
|Label -- name of record type
|Description / Purpose
- Name of field
- Type --
[ Data Element ]
- Key? (more later
[ d2.html ]
)
|Usage -- data flows and data stores
These record structures also map closely into data base descriptions.
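As a rough sketch of this metadata in Python (the classes `Field` and `RecordStructure` and the sample student record are hypothetical, invented for the example):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    element: str          # name of the data element that gives the field its type
    is_key: bool = False


@dataclass(frozen=True)
class RecordStructure:
    label: str
    purpose: str
    fields: tuple

    def key_fields(self):
        return [f.name for f in self.fields if f.is_key]

    def missing_fields(self, record):
        """Names of fields that an instance (a dict) fails to supply."""
        return [f.name for f in self.fields if f.name not in record]


# A hypothetical student record; the field names are invented for the example.
student = RecordStructure(
    label="Student",
    purpose="Identify a student and their major",
    fields=(
        Field("student_id", "StudentId", is_key=True),
        Field("name", "PersonName"),
        Field("major", "MajorCode"),
    ),
)

print(student.key_fields())
print(student.missing_fields({"student_id": "001", "name": "Jo"}))
```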
To document a data flow you need to list the records that flow through
it and (optionally) how the flow is (or will be) implemented. A data dictionary
should link each data flow to its source and destination.
|Items describing a Data Flow
|Records flowing through data flow
Data stores should, ideally, store only a single type of record.
You must identify the types of record in a data store.
You should indicate if a subset of the items forms a key.
Each key value determines a unique record.
You should note the sequence of the records -- or state that they
are in random order.
You need to note how many records there are now
and how fast this number is growing.
|Items needed for a Data Store
|List of Record Structures -- in a normalized system there is only one record type per store.
|Estimate of size
|Estimate of growth %/year
|Relationships with other data
|Growth||10% per year
|Related to||Majors, Address, EMailId, GPA, ...
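The size and growth estimates let you project how large a store will become. Here is a minimal Python sketch, assuming the growth compounds annually and starting from a made-up size of 20,000 records (no size is given in these notes):

```python
def projected_size(current_size, growth_percent_per_year, years):
    """Compound the annual growth rate over the given number of years."""
    return current_size * (1 + growth_percent_per_year / 100) ** years


# 20,000 records is a hypothetical starting size; 10% per year is from the example.
for years in (1, 5, 10):
    print(years, round(projected_size(20_000, 10, years)))
```

Even a modest 10% per year more than doubles the store within ten years, which is why the growth figure belongs in the dictionary next to the current size.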
A set of inter-related data stores should be documented using an Entity-Relationship Diagram (ERD).
Ultimately data stores are mapped into data base tables, which will be covered later
in the class.
|Description / Persona
[ Data Flow ]
|Description||A younger tech-savvy person with a need to take the classes in their major
|Data Flows||login, course_id, course_status, ...
|Items describing a Process
|Description of goals and requirements
|List of connected
[ Data Flow ]
[ Data Store ]
|Desirable Qualities (eg secure, reliable, cheap, ...)
|Details -- Link to a detailed description (if any).
Process descriptions can require special documentation of the detailed logic; see
[ r1.html ]
[ r2.html ]
for details. However initially you use
the techniques in the next section.
. . . . . . . . . ( end of section Information in a typical data dictionary)
You can document a lot of the information in a data dictionary in the UML.
You treat each record type as a class with attributes but no operations.
This works OK for small projects when the diagram fits on a page. When it
becomes poster sized you may need to go to more complex tools.
Nearly all the necessary information to define a modern data base can be expressed
in a set of simple tables all with the same format. Each table describes
the records in a file in the data base and lists the data elements.
Here is an example transcribed from Meng-Chun Ling's MS Project
"Senior Health Care System" (July 2005)[in the Pfau library]:
|USER_ID||User login ID||String
|LAST_NAME||physician's last name||String
|SPECIALTY||identifier of specialty||Int
|Name||Name of the program||String
|Name||Referral source name||String
|... 14 attributes
- ... and so on...
(End of Net)
. . . . . . . . . ( end of section Data Dictionaries)
It is often best to define a process in terms of what it aims
to achieve (or does achieve). This is the actual goal
of the process. A goal has to be
stated in clear, objective, and observable effects. It also needs
to be short -- just a sentence or two at most. The ideal is a process
whose name describes its goal:
sort student data
To be more precise you specify the resources needed by the process
and the results that it achieves.
Given randomly ordered student records sort them by student I.D.
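A goal stated this way is easy to turn into a check on the result. As a sketch in Python, with invented sample records and field names:

```python
def sort_student_data(records):
    """Goal: given student records in any order, return them sorted by student ID."""
    return sorted(records, key=lambda record: record["student_id"])


def is_sorted_by_id(records):
    """The observable effect that tells us the goal was achieved."""
    ids = [record["student_id"] for record in records]
    return all(a <= b for a, b in zip(ids, ids[1:]))


# Hypothetical records for illustration only.
students = [
    {"student_id": "003", "name": "Jo"},
    {"student_id": "001", "name": "Sam"},
    {"student_id": "002", "name": "Al"},
]

result = sort_student_data(students)
print([s["student_id"] for s in result])
print(is_sorted_by_id(result))
```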
Another approach is to write a story
describing what the process does. A story is a simple paragraph describing
what we want the process to do.
It is wise to record the goal of a process -- since this can get lost
as the system develops.
In many businesses part of the know-how is in the form of mathematical formulas. Many domains
have standard calculations. Once you have a Process that can be expressed this way then it is a bad
mistake to draw a DFD of it. Instead you need to define the meaning of the process as a formula.
Expressing a formula in text is easily done by using expressions from a programming language. Or you
can use the Equation Writer that is available with most word processors. There is a standard Computer
Science language called ΤΕΧ (pronounced "teck") that is designed for writing mathematics.
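For instance, a grade point average is a calculation that reads more clearly as a formula than as a diagram. The Python sketch below is illustrative only: weighting grade points by units is a common convention, not a rule taken from these notes, and the sample values are made up.

```python
def grade_point_average(courses):
    """GPA = sum(grade_points * units) / sum(units) over all courses taken."""
    total_units = sum(units for _, units in courses)
    if total_units == 0:
        raise ValueError("no units attempted")
    return sum(points * units for points, units in courses) / total_units


# (grade points, units) pairs; the values are made up.
print(grade_point_average([(4.0, 4), (3.0, 4), (2.0, 2)]))
```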
There have been experiments in using more complex logic and discrete math
to describe the behavior of a system in abstract form and
then execute the model to see if the ideas work. It
is also possible to manipulate them to search out unwanted behaviors.
These are a part of
[ ../cs556/ ]
Formal Methods. So they are not covered here.
A scenario provides more details on a complex process.
It is a numbered sequence of simple actions by subjects
that achieves some stated goal.
A scenario must mention who does each action (the subjects).
To completely describe a complex process it is easiest to write
many simple scenarios.
Each is a simple slice through the possibilities.
To handle algorithms and complex decisions you need more complicated techniques.
A scenario can be used to describe any pattern of activity
that you either (1) observe in the real world, (2) imagine as existing,
or (3) plan to be part of your future system. A Scenario can
be Physical (mentioning the technology) or Essential (abstracted
from the technology). Essential Scenarios are Logical Models.
Scenarios are easy to understand, both by users and by technologists -- people "get"
scenarios.
They map simply
into specifications and designs for programs. They are also a very
good Modeling tool to use with non-computer people.
- Scenario::Wikipedia= See http://en.wikipedia.org/wiki/Scenario_(computing).
As far as possible avoid mentioning technology in your scenarios.
There was a time when scenarios tended to include statements like
"put a new stack of cards in the hopper". These days lots
of scenarios tend to refer to web-specific actions:
Jo clicks the Red Cancel Link.
Try to avoid this if you can find an alternative. For example
Jo selects "Cancel"
or
Jo cancels ....
Keep in mind that a significant number of users are not using a mouse and so
can not "click" anything!
Notice: a scenario has no branches, conditions, exceptions, extensions, or
parallelism. It should be strictly sequential.
You should also use simple language in scenarios and borrow from the
stakeholder's words and phrases.
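One way to keep scenarios disciplined is to store them as data: a goal plus a numbered, strictly sequential list of steps, each naming its subject. Here is a rough Python sketch, where the class names and the sample steps are assumptions made up for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    subject: str    # who does the action
    action: str     # what they do, in the stakeholder's words


@dataclass(frozen=True)
class Scenario:
    goal: str
    steps: tuple    # ordered; no branches, conditions, or parallelism

    def numbered(self):
        return [f"{n}. {s.subject} {s.action}." for n, s in enumerate(self.steps, start=1)]

    def subjects(self):
        return sorted({s.subject for s in self.steps})


cancel_order = Scenario(
    goal="Jo cancels an order",          # hypothetical example
    steps=(
        Step("Jo", "selects an order"),
        Step("The system", "displays the order"),
        Step("Jo", 'selects "Cancel"'),
        Step("The system", "confirms the cancellation"),
    ),
)

print("\n".join(cancel_order.numbered()))
```

Because a step has room only for a subject and an action, there is nowhere to hide a branch or a condition: an alternative path has to become a second scenario.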
A Scenario can not express all the things that happen in real systems.
It is a slice of life: it shows one simple path through the possibilities.
Real systems make decisions and follow different branches at different
times. A single scenario can not describe all the possibilities. Neither
can it handle parallelism -- when many things are happening and can happen in
different orders. For example: three people are working on a common set of
tasks -- stuffing, stamping, and sorting envelopes for example. There
is no one scenario which describes all the possible sequences of events.
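A small calculation shows how quickly the possibilities multiply. The Python sketch below assumes just two envelopes, each of which must be stuffed, then stamped, then sorted, and counts the orders in which the six events could occur. The envelope labels and event names are invented for the example.

```python
from itertools import permutations

STAGES = ("stuff", "stamp", "sort")
ENVELOPES = ("A", "B")
EVENTS = [(envelope, stage) for envelope in ENVELOPES for stage in STAGES]


def respects_stage_order(sequence):
    """True if every envelope is stuffed before it is stamped before it is sorted."""
    for envelope in ENVELOPES:
        positions = [sequence.index((envelope, stage)) for stage in STAGES]
        if positions != sorted(positions):
            return False
    return True


valid = [seq for seq in permutations(EVENTS) if respects_stage_order(seq)]
print(len(valid))   # 20 possible sequences, each of which would need its own scenario
```

With only two envelopes there are already 20 valid sequences, so listing one scenario per sequence is clearly not practical for a realistic workload.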
To express more complex processes an activity diagram
[ r1.html#Activity%20Diagrams ]
can be used. I would make each "activity box" stand for
a simple scenario.
You can handle the non-determinism of parallel systems by drawing a Data Flow Diagram
[ a4.html ]
See SD's Agile Modeling Newsletter January 2006
, Modeling the Real-World: Usage Scenarios By Scott W. Ambler
Here's a scenario outlining a successful withdrawal attempt at an
automated teller machine (ATM).
- John Smith presses the "Withdraw Funds" button.
- The ATM displays the preset withdrawal amounts ($20, $40, ...).
- John chooses the option to specify the amount of the
- The ATM displays an input field for the withdrawal amount.
- John indicates that he wishes to withdraw $50 dollars.
- The ATM displays a list of John's accounts: a checking and two
- John chooses his checking account.
- The ATM verifies that the amount may be withdrawn from his
- The ATM verifies that there is at least $50 available to be
disbursed from the machine.
- The ATM debits John's account by $50.
- The ATM disburses $50 in cash.
- The ATM displays the "Do you wish to print a receipt"
- John indicates "Yes".
- The ATM prints the receipt.
Now let's define a wider range of ATM functionality:
- Sally Jones places her bank card into the ATM.
- Sally successfully logs into the ATM using her personal
- Sally deposits her weekly paycheck of $350 into her savings
- Sally pays her phone bill of $75, her electric bill of $145,
her cable bill of $55, and her water bill of $85 from her savings
- Sally attempts to withdraw $100 from her savings account for
the weekend, but discovers that she has insufficient funds.
- Sally withdraws $40 and gets her card back.
As you can imagine, there are several differences between use
cases and scenarios. First, a use case typically refers to
generic actors, such as Customer, while scenarios typically refer
to specific actors such as John Smith and Sally Jones. You could
write a generic scenario, but it's usually better to personalize
it to increase understandability. Second, usage scenarios
describe a single path of logic, whereas use cases typically
describe several paths (the basic course, plus any appropriate
alternate paths). [...]
. . . . . . . . . ( end of section Quotations from Scott Ambler)
We will discuss use cases later in this course and study them in
detail in CSCI375.
. . . . . . . . . ( end of section Scenarios)
"Never talk to a user without a prototype in your hand".
|A prototype ||A prototype
|is a Physical Model||is not a Logical Model
|can be an executable program||is not useful to the users
|can have a data base||is not full of all the real data
|demonstrates how something could work||is not how it will be coded
|demonstrates some behaviors||is not a finished product
|works on some input||is not completely correct
|is produced quickly||is not a high quality long term project
|runs||is not efficient on real data
|tests ideas||is not a test of a program
|is for the user||is not always executable code
|provokes discussion||is not the best possible solution to the problem
|can be used to sell a project||is not a promise that a project is feasible
An iteration is a high quality, usable subset of a project.
It meets some of the requirements and is designed so that it is easy
to add more requirements. Each iteration modifies the existing system.
Each iteration is well made and documented but
doesn't meet all requirements, yet.
A prototype is often incomplete and nearly always of low quality.
It is produced quickly using specialized tools. The result does not run quickly.
It is not designed so that it can be easily reused in the project.
There is a trap with trying to change a prototype into an iteration:
it is harder to put quality into bad software than it is to
add features to good software.
It helps to study prototypes used outside the software field.
The following all have analogies in software and systems development.
A poster showing the successive screens/pages as a scenario
The term comes from the movie industry where directors use them.
You will often find images of story boards as "extras" on DVDs.
These are not high-tech prototypes. Use a board and stick cards
Use story boards as a basis of planning and discussion.
A mock up looks like the real thing and may even do some things
but is not properly constructed.
It's possible the term comes from publishers. A computer
example is faking a web page to show its look-and-feel.
Many automobile companies produce strange and wonderful one off
cars in exhibitions to show what is possible.
Most don't make it into production.
Use concept cars to sell a new system to management?
Once upon a time electronic engineers would try out ideas on the kitchen table
and use a bread board and pins to hold the wires in place.
To this day you can buy special electronic breadboards.
Typically they become an untidy mess of wires with many inputs, meters, and probes attached
Used by software people to try out an algorithm -- full of extra outputs
and with a user interface designed for techies. Don't let the user
A variation on breadboard prototyping is for complex systems which may have unexpected
performance problems and timing hazards. Here the prototype is for a component. It provides
all the necessary interfaces and even responds similarly to the real component... however it doesn't compute the results correctly. This is a recent proposal
that needs thinking about.
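As a rough illustration of the idea, the Python sketch below gives a stand-in component the same interface as the real one and a simulated response time, but returns a canned answer. All the names (`StubCreditCheck`, `approve`, `place_order`) and the delay value are hypothetical.

```python
import time


class StubCreditCheck:
    """Stands in for a real component: same interface, similar timing, fake result."""

    def __init__(self, simulated_delay_seconds=0.01):
        self.simulated_delay_seconds = simulated_delay_seconds
        self.calls = []                            # record traffic for later analysis

    def approve(self, customer_id, amount):
        self.calls.append((customer_id, amount))
        time.sleep(self.simulated_delay_seconds)   # imitate the real response time
        return True                                # canned answer, not a real computation


def place_order(credit_check, customer_id, amount):
    """The rest of the system calls the component without knowing it is a stub."""
    return "accepted" if credit_check.approve(customer_id, amount) else "rejected"


stub = StubCreditCheck()
print(place_order(stub, "C042", 99.50))
print(len(stub.calls))
```

The recorded calls and the adjustable delay are what make such a stand-in useful for hunting performance and timing problems before the real component exists.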
A scale model has all the functionality of the final system but
does not have the full data base. Its purpose is to spot
logic problems quickly and to test if the design will scale up.
. . . . . . . . . ( end of section Types of Prototypes)
. . . . . . . . . ( end of section Prototypes)
. . . . . . . . . ( end of section Describing Processes)
- Give an example from your experience of a solution that has become a problem.
- Write a basic scenario for one of the following. What happens when you
- Pump gas.
- Register for a class.
- Submit CS372 assigned work.
- Plan to interview someone.
- Write a program.
- Try to solve a problem.
- Describe and distinguish: glossary, BNF, and data dictionary.
- What is the purpose of a data dictionary?
- Compare and contrast briefly: item vs record, record vs file, file vs data base.
- What information should a data dictionary record about
- a data element
- a data record
- a data store
- a data flow
- a process
- What fields would you expect to find in the following records:
- Name of a person
- Address of person or location
- A Phone number in the USA
- List the types of prototype.
. . . . . . . . . ( end of section Review Questions)
I have several dozen
[ ../samples/ ]
glossaries written in my own style -- XBNF.
This is an expanded glossary
[ dd01.html ]
for a project carried out by one of my graduate students.
[ Backus-Naur_Form ]
- DeMarco::person=inventor of a kind of DFD and data dictionary,
[ Tom_DeMarco] .
. . . . . . . . . ( end of section System Modeling II)