CSIT 2nd year 4th semester: DBMS notes (Part I)
What do you mean by Data and Database?
Data can be divided into three categories.
Raw data: a value such as “85” has no meaning when it stands alone. It might mean something if you knew it was the weight of a man in kilograms.
Related raw data: a group (data set or data file) of organized raw data that can be tied together. For example, it could be a group of names, weights, blood groups and identification numbers, all tied to the identity cards issued to patients at a hospital.
Cleaned raw data: any of the above after it has been validated or otherwise processed. Such a process might ensure that a blood group never holds a value such as “red” or “black”; the only allowed values would be of the kind A+, A-, B+, B-, etc.
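The validation step described above can be sketched in a few lines of Python. This is only an illustration: the list of allowed blood groups, the field names and the sample patients are assumptions made for the example, not part of any real hospital system.

```python
# Assumed list of allowed values; a real system may define it differently.
ALLOWED_BLOOD_GROUPS = {"A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-"}


def clean_records(records):
    """Split raw patient records into validated and rejected lists."""
    valid, rejected = [], []
    for record in records:
        if record.get("blood_group") in ALLOWED_BLOOD_GROUPS:
            valid.append(record)
        else:
            rejected.append(record)
    return valid, rejected


# Hypothetical related raw data, tied together by the patient id.
raw = [
    {"id": 1, "name": "Arjun", "weight_kg": 85, "blood_group": "B+"},
    {"id": 2, "name": "Sita", "weight_kg": 60, "blood_group": "red"},
]

valid, rejected = clean_records(raw)
print(len(valid), len(rejected))  # 1 1
```

Only the records in valid would count as cleaned raw data; the rejected ones need correction before any analysis.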
Data can be acquired from many different sources. It must always be evaluated to determine which category it belongs to, and whether it needs any additional validation before the analysis that produces information.
Database:
A database is an organized collection of interrelated data for one or more uses, typically in digital form.
Examples include the database of an educational institute, a bank, a library or a railway reservation system.
What Is a DBMS?
A DBMS consists of two things: a database and a set of programs.
1 The database is a very large, integrated collection of data.
2 The set of programs is used to access and process the database.
3 So a DBMS can be defined as a software package designed to store and manage or process the database.
Management of data involves
o Definition of structures for the storage of information
o Methods to manipulate information
o Safety of the information stored despite system crashes.
A database models a real-world enterprise through entities and relationships.
o Entities (e.g., students, courses, class, subject)
o Relationships (e.g., Arjun studies in Class -EEE VII)
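As a rough sketch, entities and relationships can be pictured as small record types. The class and field names below (Student, SchoolClass, StudiesIn, student_id, class_id) are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """An entity: one student."""
    student_id: int
    name: str


@dataclass(frozen=True)
class SchoolClass:
    """An entity: one class."""
    class_id: str


@dataclass(frozen=True)
class StudiesIn:
    """A relationship: links a student to a class through their ids."""
    student_id: int
    class_id: str


arjun = Student(student_id=1, name="Arjun")
eee7 = SchoolClass(class_id="EEE VII")
enrolment = StudiesIn(student_id=arjun.student_id, class_id=eee7.class_id)
print(enrolment)
```

The relationship record holds no student or class details of its own; it only refers to the two entities it connects.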
File System
• Data is stored in different files in the form of records.
• Programs are written from time to time, as requirements arise, to manipulate the data within the files:
o A program to debit and credit an account
o A program to find the balance of an account
o A program to generate monthly statements
Disadvantages of a File System Compared to a DBMS
The most evident and significant disadvantages of a file system, when compared to a database management system, are as follows:
• Data Redundancy: Files are created in the file system as and when an enterprise requires them over the course of its growth. As a result, repetition of information about an entity cannot be avoided.
E.g. the addresses of customers will be present in the file maintaining information about customers holding savings accounts, and also in the file maintaining current accounts. When the same customer has both a savings account and a current account, his address will be stored in two places.
• Data Inconsistency: Data redundancy leads to a greater problem than just wasted storage: it may lead to inconsistent data. The same data repeated at several places may no longer match after it has been updated at only some of them.
For example: Suppose a customer asks the bank to change the address on his account, and the program is executed to update the savings account file only, while his current account file is not updated. Afterwards the addresses of the same customer in the savings account file and the current account file will not match. Moreover, there will be no way to find out which of the two addresses is the latest.
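The address example can be reproduced in a small Python sketch. The two "files" are modelled as dictionaries, and the customer id, name and addresses are made up for the illustration.

```python
# Two independent "files", modelled as dictionaries keyed by customer id.
# The same address is stored redundantly in both.
savings_file = {101: {"name": "Arjun", "address": "12 Old Street"}}
current_file = {101: {"name": "Arjun", "address": "12 Old Street"}}


def update_savings_address(customer_id, new_address):
    """Mimics a program that was written for the savings file only."""
    savings_file[customer_id]["address"] = new_address


update_savings_address(101, "7 New Road")

# The same customer now has two different addresses on record.
mismatch = savings_file[101]["address"] != current_file[101]["address"]
print(mismatch)  # True
```

Nothing in either file records when the address was last changed, so the data alone cannot show which copy is the latest.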
• Difficulty in Accessing Data: For ad hoc reports, no program will already exist, and the only options are to write a new program to generate the requested report or to work manually. Either option takes an impractical amount of time and is more expensive.
For example: Suppose the administrator suddenly gets a request for a list of all customers holding a savings account who live in a particular locality of the city. The administrator will not have a program already written to generate that list, but say he has one that can list all customers holding a savings account. He can then either go through that list manually to select the customers living in the particular locality, or write a new program to generate the new list. Both approaches take a long time, which would generally be impractical.
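For contrast, here is a minimal sketch of how the same ad hoc request might be answered when the data sits in a DBMS, using Python's built-in sqlite3 module. The table name, column names and sample rows are assumptions made for this example.

```python
import sqlite3

# In-memory database with a hypothetical savings customer table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE savings_customer (name TEXT, locality TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO savings_customer VALUES (?, ?, ?)",
    [
        ("Arjun", "Lakeside", "Pokhara"),
        ("Sita", "Thamel", "Kathmandu"),
        ("Ram", "Lakeside", "Pokhara"),
    ],
)

# The ad hoc request becomes a single query instead of a new program.
rows = conn.execute(
    "SELECT name FROM savings_customer WHERE locality = ? ORDER BY name",
    ("Lakeside",),
).fetchall()
names = [name for (name,) in rows]
print(names)  # ['Arjun', 'Ram']
conn.close()
```

A request for a different locality, or a different condition altogether, only changes the query text.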
• Data Isolation: Since the data files are created at different times, and presumably by different people, the structures of different files generally will not match. The data for a particular entity will be scattered across different files, so it will be difficult to obtain the appropriate data.
For example: Suppose the address in the savings account file has the fields Add line1, Add line2, City, State, Pin, while the address in the current account file has the fields House No., Street No., Locality, City, State, Pin. The administrator is asked to provide a list of customers living in a particular locality. Providing a consolidated list of all such customers requires looking in both files, but the two files store the address in different ways. Writing a program to generate such a list will be difficult.
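The difficulty shows up as soon as one tries to write that program. In the sketch below the two layouts follow the field lists given above, while the sample customers, the pin values and the substring matching rule are assumptions made for the illustration.

```python
# Savings file layout: Add line1, Add line2, City, State, Pin.
savings_file = [
    {"name": "Arjun", "add_line1": "House 12, Street 4", "add_line2": "Lakeside",
     "city": "Pokhara", "state": "Gandaki", "pin": "33700"},
    {"name": "Ram", "add_line1": "House 3, Street 9", "add_line2": "Thamel",
     "city": "Kathmandu", "state": "Bagmati", "pin": "44600"},
]

# Current file layout: House No., Street No., Locality, City, State, Pin.
current_file = [
    {"name": "Sita", "house_no": "7", "street_no": "2", "locality": "Lakeside",
     "city": "Pokhara", "state": "Gandaki", "pin": "33700"},
]


def customers_in_locality(locality):
    """Consolidated list of customers in a locality, across both layouts."""
    wanted = locality.lower()
    found = set()
    # Current file: the locality is a field of its own, so an exact match works.
    for record in current_file:
        if record["locality"].lower() == wanted:
            found.add(record["name"])
    # Savings file: the locality is buried in free text, so we can only guess
    # with a substring search, which may miss or wrongly match some records.
    for record in savings_file:
        free_text = (record["add_line1"] + " " + record["add_line2"]).lower()
        if wanted in free_text:
            found.add(record["name"])
    return sorted(found)


print(customers_in_locality("Lakeside"))  # ['Arjun', 'Sita']
```

Each file needs its own matching logic, and the free-text search is only a guess: a street that happens to share the locality's name would be matched as well.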
• Integrity Problems: All consistency constraints have to be applied to the data through appropriate checks coded in the programs. This is very difficult when the number of such constraints is very large.
For example: An account should not have a balance of less than Rs. 500. To enforce this constraint, an appropriate check must be added to the program that adds a record and to the program that withdraws from an account. Suppose this limit is later increased; then all those checks must be updated to avoid inconsistency. These periodic changes to the programs will be a great headache for the administrator.
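By way of contrast, a DBMS lets the minimum balance rule be declared once, next to the data. The sketch below uses Python's built-in sqlite3 module and a CHECK constraint; the table and column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The rule "balance must be at least 500" is stated once, in the table itself.
conn.execute(
    "CREATE TABLE account ("
    "acc_no INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 500))"
)
conn.execute("INSERT INTO account VALUES (1, 600)")

# A withdrawal that would leave Rs. 400 is refused by the database itself.
try:
    conn.execute("UPDATE account SET balance = balance - 200 WHERE acc_no = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute(
    "SELECT balance FROM account WHERE acc_no = 1"
).fetchone()[0]
print(rejected, balance)  # True 600
conn.close()
```

Every program that touches the table is covered by the same rule, so a change to the limit is made in one place only.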
• Security and Access Control: The database should be protected from unauthorized users. Not every user should be allowed to access all the data. Since application programs are added to the system in an ad hoc manner, enforcing such security constraints is difficult.
For example: The payroll personnel in a bank should not be allowed to access the account information of customers.
• Concurrency Problems: These arise when more than one user is allowed to process the database. If, in that environment, two or more users try to update a shared data element at about the same time, the result may be inconsistent data.
For example: Suppose the balance of an account is Rs. 500, and users A and B try to withdraw Rs. 100 and Rs. 50 respectively at almost the same time using the Update process.
Update:
1. Read the balance amount.
2. Subtract the withdrawn amount from the balance.
3. Write the updated balance value.
Suppose A performs steps 1 and 2 on the balance, i.e. he reads 500 and subtracts 100 from it. At the same time B withdraws Rs. 50: he performs the Update process, also reads the balance as 500, subtracts 50 and writes back 450. User A will also write his updated balance of 400. The two writes may happen in either order, depending on the systems being used by the two users, so the final balance will be either 400 or 450. Both values are wrong, because the correct balance after both withdrawals is Rs. 350 (500 - 100 - 50), and the balance now holds an inconsistent value permanently.
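The interleaving described above can be replayed step by step in Python. No real threads are used: the order of the reads and writes is fixed by hand so that the result is repeatable. The function names are hypothetical and chosen for this sketch.

```python
def interleaved_withdrawals(start, amount_a, amount_b, a_writes_last):
    """Both users read before either writes, so one update is lost."""
    balance = start
    # Step 1: both users read the same balance.
    read_a = balance
    read_b = balance
    # Step 2: each subtracts its own withdrawal from the value it read.
    new_a = read_a - amount_a
    new_b = read_b - amount_b
    # Step 3: the writes land one after the other; the later write wins.
    if a_writes_last:
        balance = new_b
        balance = new_a
    else:
        balance = new_a
        balance = new_b
    return balance


def serial_withdrawals(start, amount_a, amount_b):
    """Each withdrawal runs the full Update process before the next starts."""
    balance = start
    balance = balance - amount_a
    balance = balance - amount_b
    return balance


print(interleaved_withdrawals(500, 100, 50, a_writes_last=True))   # 400
print(interleaved_withdrawals(500, 100, 50, a_writes_last=False))  # 450
print(serial_withdrawals(500, 100, 50))                            # 350
```

Only the serial run gives Rs. 350. In both interleaved runs one of the two withdrawals is overwritten and disappears from the balance.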