M2-R3: BUSINESS SYSTEM

JUNE 2004

 

NOTE:

1.There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions.

 

2. PART ONE is to be answered in the TEAR-OFF ANSWER SHEET only, attached to the question paper, as per the instructions contained therein. PART ONE is NOT to be answered in the answer book.        

3. Maximum time allotted for PART ONE is ONE HOUR. Answer book for PART TWO will be supplied at the table when the answer sheet for PART ONE is returned. However, candidates, who               complete PART ONE earlier than one hour, can collect the answer book for PART TWO immediately after handing over the answer sheet for PART ONE.

 

TOTAL TIME: 3 HOURS                                                                           TOTAL MARKS: 100

(PART ONE - 40; PART TWO - 60)

PART ONE

(Answer all the question)

 

1. Each question below gives a multiple choice of answers.  Choose the most appropriate one and enter in the “tear-off” answer sheet attached to the question paper, following instructions therein.

1.1 The objective of maintaining an inventory control operation is

A) To maximize the money invested

B) To reduce the cost of storing, handling and insuring a large quantity of goods

C) To search stock

D) None of the above

Solution :  B

Reason Inventory control is the phenomenon of maintaining the stock to make the optimum utilization of resources optimum utilizations means making maximum utilization with minimum cost.

 

1.2 which of the following is true about a disk file where WRITE statement will always place a record in the next physical position?

A) Sequential file

B) Relative file

C) Indexed file

D) None of the above

Solution :  (D)

 

1.3 The most efficient and well designed computerized payroll system would be

A) On-line real time system

B) Batch processing system

C) Both A) and B)

D) None of the above

Solution :  (B)

Reason Payroll system of an organization is responsible for maintaining the salary details and payslip generation at the end of the pay period. In batch processing records data is collected over a period of time and processed after the time period, hence is most suitable for payroll system.         

1.4 Purchases from suppliers is part of

A) Subsidiary ledger

B) Payroll transaction file

C) Balance sheet

D) Sales register

Solution :

 

1.5 If AGE and SALARY are numeric fields in a database file, which of the following commands is incorrect?

A) REPLACE ALL AGE WITH AGE+1 FOR AGE < 20

B) REPLACE ALL AGE WITH AGE+1 WHILE AGE > 20

C) REPLACE AGE WITH AGE+1 FOR AGE < 20 ALL

D) All of the above

Solution :  ?????????

 

1.6 The command to create a new database file based on two open database files is

A) SET RELATION

B) UPDATE

C) JOIN

D) None of the above

Solution :  C 

 

1.7 ROUND ( ) function is

A) A numeric function 

B) Used to round a number to a specified number of decimal places

C) Both A) and B) above are correct

D) None of the above correct

Solution : (C)

Reason Round function is a numeric function used for rounding off the specific number of decimal places.          

 

1.8 ? STUFF (“Baby’,3, 2, “ba”) returns

A) baby

B) baba

C) Baba

D) Bob

Solution : (C)

Reason Stuff function is used to replace specific number of characters in a given text starting from a given location.

 

1.9 The command WAIT

A) Causes all Foxpro processing to pause until any key is pressed

B) permits usage of a prompt which must be a character expression

C) has a default prompt which is “press any key to continue.....”

D) all of the above

Solution : (D)

Reason Wait is used to halt the processing till a key is pressed. A text message can be given to tell the user about the same.           

 

1.10 The command REINDEX will

A) Rebuild all open index files in the currently selected work area using the key expression defined by the original INDEX command.

B) Rebuild all the index files on the hard disk

C) Rebuild all the index files in a particular directory.

D) Both B) and C) above are correct

Solution : (A)

Reason Reindex is a back used compalibility command. It is used with single index and compound index files. These files are not updated automatically like structured compound index files.

 

2. Each statement below is either TRUE or FALSE.  Choose the most appropriate one and in the “tear-off” sheet attached to the question paper, following instructions therein.       

2.1 Structured programming diminishes the efficiency of coding programs.

2.2 a purchase transaction increases the liability of a company.

2.3 Invoice detail file is a master file in a sales invoicing system.

2.4 In indexed sequential operation, records are held in sequential order on the disk and the index is kept permanently on the main storage for fast access.

2.5 While creating a report with CREATE REPORT, each record can occupy only one detail line.

2.6 SORT ON NAME + AMOUNT TO BANK is a valid command.

2.7 If a database file contains 5 fields, its structural compound index file can have a maximum of 5 index tags.

2.8 ALLTRIM( ) removes leading blanks from the specified character expression.

2.9 Consider the following code:

X = “Hello! How are you?”

? LEN(X)

It will return the value as 16.

2.10 A task started within DO WHILE and ENDDO loop will be continued until the condition is satisfied.

Answers

2.1: False

Reason Structured programming makes the program more efficient and easy to understand.

2.2: True

Reason Liability means expenses since any purchase means expences. Hence, it increases the liability of the company.

2.3:   True

Reason Sales invoicing system maintains the details of sales invoices. Hence, the invoice details are kept in master file.

2.4: True

2.5: True

2.6: True

2.7: True     

2.8: False

Reason ALLTRIM () function removes leading and trailing blanks from a specified text expression.

2.9: False

Reason LEN() function counts the total space occupied by the character expression including white space.

2.10: False

Reason “Do while” Loop works on the principle “do it  as long as ....”. So a took started within DOWHILE and END DO loop will be continued as long as condition is satistied.

 

3.  Match words and phrased in column X with the closest related meaning/word(s) /phrases in column Y.   Enter your selection in the “tear-off” answer sheet attached to the question paper, following instructions therein.     

           X                            Y

3.1 Zero inventory level                     A. Programming aid

3.2 Pseudocode                     B. Warning signal

3.3 Re-order level                     C. sends to printer

3.4 ???                     D. displays in the same line

3.5 VAL ( )                     E. low inventory storage

3.6 TRANSFORM( )                     F. it is a numeric function to change                                  character strings to numeric

3.7 IIF ( )                     G. provides a picture formatting to                          the numeric output

3.8 RANGE                     H. result in compactness in the pro                                 gram code

3.9 ALIAS                     I. copies the contents of previous                                 record to the text

3.10 SET CARRY ON                     J. links the active database file to an                                      other database file

                     K. alternate name for a database file

                     L. restricts entry in a numeric field

3.1: E

3.2: A

Reason It is a programming aid. A pseudocode is an algorithm written using the syntax of a language.

3.3: B

Reason It is a worring signal in the sense that we need to order for new stock purchase once the quantity of the  stock reaches to this level.

3.4: C

Reason It sends the output directly to the default printer if it is on.

3.5: F

Reason VAL() function return the numeric value of a digital character expression e.g. “992” is a character expression and 9992 is a numeric value.

3.6:

3.7: H

Reason IIF() function results in compactness in the program code.

3.8: L

Reason It is a validation check for a numeric field. It allows to restrict  the entries in a numeric field.

3.9: K

Reason ALIAS is used to give an easy to remember new name to data base file. It does not change the name of the file.

3.10: I

Reason SET CARRY ON is used when we need to keep the contents of new records same as the previous records or we need to make little changes in the previous records to get the new record.

 

4. Each statement below has a blank space to fit one of the word(s) or phrases in the list below.  Enter your choice in the “tear-off” answer sheet attached to the question paper, following instructions therein.      

A. query language         B. fields C. data processing

D. report generator         E. tools          F. data definition language

G. RETURN           H. LOCATE         I. data ampliation language

J. transaction file     K. SEEK         L. close database/use

4.1 __________is defined as a series of actions or operations, which convert data into useful information.

4.2 The__________is used to define the structure of a database.

4.3 The__________enables users to define their requirements for extracting the desired information from the database in the form of queries.

4.4 The _________enables the users of a database to design the layout of a report in the desired format.

4.5 A(n)_________ is used to store input data until it can be processed.

4.6 ________is the basic of FoxPro’s indexes.

4.7 To start the Table Wizard, open the __________option front the system menu and select Wizards. From the list select the Table Wizard.

4.8 If the table has no appropriate indexes, you have to use_________.

4.9 The_________command is used to close a database file in use.

4.10 Each function should end with a(n)________statement.

4.1: C

4.2: F

4.3: A

4.4: D

4.5: J

4.6: K

4.7: E

4.8: H

Reason If the table ----, you have to use LOCATE for searching FIND/SEEK can work only on index files.

4.9: L

4.10:           G

PART TWO

                              (Answer any FOUR question)        

 

5a) What is business data processing?

Solution : a. business data processing - data processing in accounting or business management.

Methods or apparatus performing systematic operations upon data or information exemplified by functions such as data or information transferring, merging, sorting, and calculating (i.e., arithmetic operations or logical operations).

 

b) State some of the advantages and disadvantages of getting the accounting data processed by a computer.

Solution :

Advantages

1. A computerised system means less paperwork, which may lead to greater effeciency and less likelihood of error.

2. Computers enable you to have an integrated system, so data only need to be entered once.

3. Such a system can quickly provide useful information to many different parts of an entity.

Disadvantages

1. Computerised systems require capital investment.

2. Some employees may resist the use of computers.

3. It can lead to the loss of an audit trail.

4. It requires special security procedures to prevent loss or corruption of data and unauthorised access to programmes.

 

c) What are the main problems that can arise in capturing raw data for use within a computer system?

Solution :

 

6a) What is meant by file organization? What are the three commonly used file organizations in business data processing? How do we decide to use a particular file organization for an application?

Solution :  The way records in a file are physically arranged on a storage device. Fortran files can have sequential or relative organization. On OpenVMS systems, files can also have indexed organization

A file is organized logically as a sequence of records.

Records are mapped onto disk blocks.

Files are provided as a basic construct in operating systems, so we assume the existence of an underlying file system.

Blocks are of a fixed size determined by the operating system.

Record sizes vary.

In relational database, tuples of distinct relations may be of different sizes.

One approach to mapping database to files is to store records of one length in a given file.

An alternative is to structure files to accommodate variable-length records. (Fixed-length is easier to implement.)

Assume that there are 10,000 employees at a particular company. Assume that each employee has been assigned a 5-digit employee number, so that there are 100,000 possible employee numbers ranging from 00000 to 99999. Assume that each employee has 95 characters of information to be stored on the file in addition to the 5 character employee number, so that each employee record is 100 characters long.

Assume that the information is to be stored on a DASD in which each sector contains 100 characters. Assume that the DASD contains 100,000 such sectors, numbered from 00000 to 99999. In all the statistical analyses, it is assumed that the actual employee numbers are randomly distributed over the set of all possible employee numbers. It is also assumed that each employee record is accessed with equal likelihood. The format of the employee records is shown in figure 8.1. A schematic layout of the DASD is shown in figure 8.2.

 

(Insert Figure 8.1 here.)????????

 

(Insert Figure 8.2 here.)?????????

 

 

 

 

 

1 NON-SEQUENTIAL

 

In this method, the first 10,000 sectors are reserved for the 100 character records of each of the 10,000 employees. The records are stored in random order. A diagram of this technique is shown in figure 8.3. (Insert Figure 8.3 here.) A record is retrieved by searching sequentially from the beginning of the file (sector 00000) until the desired employee number is found or until the end of the file (sector 09999) is encountered (in which case the employee number sought is not in the file). If the desired employee number exists in the file, then the number of accesses required to find the employee will be a minimum of 1 access, a maximum of 10,000 accesses, and an average of number of accesses of 5,000.5.

If the desired employee number does not exist in the file, then it will always require 10,000 accesses to determine that fact. An addition to the file is made by inserting the new record in the sector immediately following the current end of file. This operation will always require exactly one access. A quick deletion can be made by locating the record and flagging it as deleted. Over time this will require additional storage and slowly degrade performance. A quick deletion requires one more accesses (to flag the record as deleted) than a search. A complete deletion from the file is made by first locating the record to be deleted. After the record is located, the last record in the file is moved to the location of the deleted record. Assuming the record to be deleted is found, deletion requires two more accesses than just finding the record. Thus deletion requires a minimum of 3 accesses, a maximum of 10,001 accesses (if the last record is deleted,no data movement is necessary), and an average of 5,002.4998 accesses. (The correction of 2/10,000 is necessary because deletion of the last record does not require the two extra accesses.) Accessing all the records in the file in sequence by employee number requires that the file first be sorted. Since the best sorting techniques will require a number of accesses proportional to n log n, this operation will require roughly 92,100 accesses. (This number can be substantially reduced if there is a considerable amount of storage available in the random access main memory and  efficient DASD sorting techniques are used.) Creation of the file requires only 10,000 accesses to store the records. This technique requires only 10,000 sectors of storage, all of which contain useful information. This technique requires no storage overhead. The advantages of this technique are that it requires a minimum amount of storage, it is easy to design, and it is straightforward (although brute force) to implement. The main disadvantage of this techniques is this it is extremely slow (except for file additions).

1 SEQUENTIAL (OR SORTED SEQUENTIAL) In this method, as in the non-sequential method, the first 10,000 sectors are reserved for the 100 character records of each of the 10,000 employees. However, with this method the records are stored in order by ascending employee number. A diagram of this technique is shown in figure 8.4. (Insert Figure 8.4 here.) A record is retrieved by performing a binary search. The binary

search is performed by first examining the record in the middle of the file (sector 05000). By examining the employee number in the middle record, it can be determined whether the desired record is in the first half of the list (sectors 00000 through 04999), is in the second half of the list (sectors 05001 through 09999), or has been found (at sector 05000). If it is determined that the desired record is in the first half of the list, then sector 02500 is examined next. If it is determined that the desired record is in the second half of the list, then sector 07500 is examined next. This process of halving the list is repeated until the desired record is found or until it is determined that it is impossible for the employee number requested to be in the file. If the desired employee number exists in the file, then the number of accesses required to find the employee will be a minimum of 1 access, a maximum of 14 accesses, and an average of 12.3631 accesses. If the desired employee number does not exist in the file, then it will require a minimum of 13 accesses, a maximum of 14 accesses, and an average of 13.1809 accesses to determine that fact. To add a record to the file, it is necessary to start at the end of the file and move each record to the next highest numbered sector until the proper position for the new record is found. Once the proper position is found, the new record is inserted into the space created by the movement of all the records containing a higher employee number. Note that each time a record is moved, two record accesses are required: one to read the record and one two write it somewhere else. Thus the minimum number of accesses is 2, the maximum number of accesses is 20,001, and the average number of accesses is 10,002. (The exact average is 1/10,001 less than 10,002 because insertions before the first item do not require an extra comparison.) A quick deletion can be made by accessing the record to be deleted and flagging it as deleted. Care must be taken to retain the value of the key so that the binary search can still be performed without modification. Over time this will require additional storage space and slowly degrade performance. A quick deletion requires one

more access (to flag the record) than a search. A complete deletion from the file is most safely made by first locating the record to be deleted to make sure that it is in the file.

Then, starting at the deleted record each record is moved to the next lowest numbered sector until the end of file is reached. Again, each record movement requires two accesses. Since it takes 13 accesses to find the last item, the minimum number of accesses is 13. Since it takes 14 accesses to find the first item and 19,998 accesses to move everything else, the maximum number of accesses is 20,012. Since the average movement requires 9,999 accesses and the average search 12.3631 accesses, the average deletion will require 10011.3631 accesses. Accessing all the records in the file in sequence by employee number requires only 10,000 accesses since the file is always kept sorted.  Creation of the file requires only 10,000 accesses to store the records, assuming the file is already sorted. If the file is not sorted additional accesses will be needed to sort the file. This will be the equivalent in effort to accessing all the records in a non-sequential file in sequence by employee number. One technique that should not be adopted is to make 10,000 insertions into a sequential list of increasing length. These repetitive insertions would require a minimum of 20,001 accesses (assuming the list was

sorted), a maximum of 100,010,001 accesses (assuming the list was in reverse order), and an average of 50,024,992.2126 accesses. This technique requires only 10,000 sectors of storage, all of which contain useful information. This technique requires no storage overhead.

The advantages of this technique are that it requires a minimum of storage, and it is easy to design. The main disadvantage of this technique is that additions to and deletions from the file are extremely time-consuming.

DIRECT RELATION

In this method, space is allocated for an entire record for every possible identifier (employee number), whether or not a record exists for the employee number. Since each record occupies one sector (100 characters), one sector will be reserved for every possible employee number. Since there are 100,000 possible employee numbers, 100,000 sectors will be reserved. The employee number is used to compute the absolute or relative DASD address of where the record for the employee is stored. In the example used, this relationship is trivial: the employee number is used as the sector number. In general, the function that relates the record identifier to the DASD address must be a one-to-one correspondence that is order preserving. In practice, the function is usually a simple linear function with a positive slope.

A record is retrieved by going directly to the DASD address that corresponds to the employee number. This always requires exactly one access. An addition to the file is made by writing the record for the new employee on the sector corresponding to the new employee’s employee number. Again, this operation will always require exactly one access. A deletion from the file requires accessing the record for the employee being deleted, and altering the contents of the sector in some way so that it can be recognized that the record has been deleted. This last step is necessary so that a search for a particular record will be able to distinguish between a valid employee record in the file and an unoccupied cell. A deletion always requires exactly one access. Accessing all the records in the file in sequence by  employee number requires that the entire space reserved be searched sequentially for existing records. This first requires that at the time of file creation all unused sectors on the DASD were marked in some way to flag a deleted record (or empty cell). Then the data is inserted into the file and maintained in such a way that all deleted records are similarly flagged. The sequential search itself involves accessing all 100,000 sectors, processing all non-deleted sectors. This requires exactly 100,000 accesses each time the list is accessed sequentially, plus a one-time fixed overhead of an additional 100,000 accesses. Creation of the file requires 110,000 accesses (100,000 to flag unused sectors plus 10,000 to insert the records). This technique requires 100,000 sectors of storage. Only 10,000 of these sectors contain useful information. In addition, each sector contains redundant information because the employee number does not need to be stored in the record. (This is true because of the fundamental premise that “you always know where you are.” If the DASD  location is known, so is the employee number.) Thus only 9.5% of the required storage contains useful information. The storage overhead is 900% as compared with non-sequential or sequential. The advantages of this technique are that is a simple and unsophisticated method that needs little system design, is simple and easy to program, contains a low probability of programming error, has automatic updating, and requires extremely few accesses for all requests except accessing all items sequentially. The disadvantages of this technique are that it is rigid and inflexible, with the possibility of a great waste of storage. The storage of data is generally limited to numeric keys that do not contain duplicate values. Finally, processing the file sequentially is very time-consuming.

DIRECT RELATION AND INDEXED NONSEQUENTIAL

In this method space is allocated in a table (or index) for every possible identifier (employee number), whether or not a record exists for the employee number. Space is reserved in the table only for another address (a pointer), and not for the entire record as was the case in direct relation. The records themselves are stored in any way desired, but are usually packed together in a block in no specific order (i.e. are stored in non-sequential form). Each location in the identifier table  (the direct relation index) contains the address of the associated record. The length of the pointer in the direct relation table is clearly shorter than the space required by the entire record using the direct relation method. If the maximum number of employees is 10,000, then a relative pointer could be a maximum of 4 digits long. However, in this example, it is desirable to make additions to the employee file. Even a single addition would create problems with a 4 digit relative pointer. A 5 digit pointer is capable of directly addressing the entire DASD, and hence is adequate for the direct relation index table. With each pointer 5 digits long, 20 pointers will fit into a single sector. Since there are 100,000 possible keys, 5000 sectors need to be reserved for the 100,000 possible pointers in the direct relation index. The non-sequential storage of the records themselves takes up 10,000 sectors as was seen previously. Thus the total amount of space required for this technique is 15,000 sectors.

A record is retrieved by first finding the pointer in the direct  relation index, and then using the index pointer to locate the record  desired. If it is assumed that the index starts at sector 00000, then to find the pointer in the direct relation index, the employee number is divided by 20. The quotient from this division gives the sector in which the pointer is stored. The remainder from this division gives the relative position within the sector so that the specific pointer desired can be located. Once the pointer is found, the record can be accessed directly using the sector number contained in the index. This accessing operation always requires two accesses. An addition to the file is made by first inserting the new record at the end of the non-sequential file as was done in the non-sequential method. After inserting the record itself, a new entry is made in the index table. The proper location in the index table is made by dividing the employee number by 20 (pointers per sector), using the quotient as the sector number and the remainder as the position within the sector. Again, this operation always requires exactly two accesses. A quick deletion from the file can be done by merely flagging the record as deleted in the direct relation index. This has two serious disadvantages. First, the non-sequential file will enlarge with additions but will remain the same size instead of contracting with deletions. Secondly, the record will not be flagged as deleted in the non-sequential file so that all processing not using the direct relation index will be done incorrectly. A quick deletion only requires one access to perform and does not degrade processing speed as long as the  index is always used. A complete deletion requires moving the last record in the non-sequential file into the position occupied by the deleted record and altering the index. The employee number in the last record specifies which index to alter while the index entry for the employee number being deleted specifies where the last record is being moved. The entire operation can be performed in four accesses, unless the last item in the file is being deleted in which case only one access is required. (It is presumed that a pointer to the end of the non-sequential file is kept for addition and deletion purposes. To delete the last item from the non-sequential file requires only altering the end-of-file pointer. Thus the only access needed to delete the last item is to alter the index.) Thus the average number of accesses for a complete deletion is 3.9997. Accessing all the records in the file in sequence by employee number requires that the entire index be searched sequentially for pointers to existing records. As in direct relation, this first requires that at the time of file creation all unused pointer positions in the index were marked in some way to flag a deleted record (or unused key). Then the pointers are inserted into the index and maintained in such a way that all index entries for deleted records are similarly flagged. The sequential search itself involves accessing all 5,000 sectors in the index table as well as the 10,000 sectors containing the related records. This requires exactly 15,000 accesses each time the list is accessed sequentially, plus a one-time fixed overhead of an additional 5,000 accesses. Creation of the file requires 25,000 accesses (5,000 to flag unused pointers in the index table plus two accesses per record to insert records into the non-sequential file and index them). This technique requires 15,000 sectors of storage. All 10,000 sectors of the non-sequential file contain useful information. In addition, 10% of the index contains useful pointer information (the remaining 90% is positional information). Thus 70% of the storage required contains useful information. The storage overhead as compared with non-sequential or sequential is 50%. The advantages of this technique are that wasted space is reduced without the lengthy access time required for non-sequential or sequential. Also, a storage organization is not imposed on the records themselves since they are stored non-sequentially. The disadvantages of this technique are that it requires at least twice as many accesses as direct relation for all operations except accessing all items sequentially. As with direct relation, the keys are generally limited to numeric keys that do not contain duplicate values.

 DISTRIBUTED

In this method, sections of storage, called buckets, are set aside in which the records will be stored randomly. Within each bucket, the records are stored using the non-sequential method. However, if there are many buckets with only a few records in each bucket, then the time required for processing the short non-sequential file can be kept reasonable. Statistical experience shows that for efficient usage each  bucket  should average half full. Knowing the number of records and the number of buckets desired, the size of each bucket can be determined. A randomizing technique is used to assign keys (employee numbers) to buckets. The most common of these techniques is the remainder after division technique. First the key is divided by the number of buckets, then the bucket number is computed as the remainder after division. Some divisors (number of buckets) may yield a better dispersion than others, and some experimentation is generally called for. Most experiments show that a prime number works as well as or better than a composite number. Since each bucket reserves space for a fixed number of records, a simple linear relationship can be established between bucket number and the beginning sector of each bucket. Suppose it is desired to have about 1000 buckets. Then each bucket would contain an average of 10 records. Using a 50% fill factor, each bucket would reserve space for 20 records and hence be 20 sectors long. Since the closest prime number to 1000 is 997, practical values for storing the employee file using the distributed method are 997 buckets each 20 sectors long. Suppose bucket number 0 occupies sectors 0 through 19, bucket number 1 occupies sectors 20 through 39, etc., ending with bucket number 996 occupying sectors 19920 through 19939. Then the starting sector of each bucket is merely the bucket number multiplied by 20.

The file is initially created by dividing the employee number by 997, then using the remainder after division as the bucket number. This remainder is next multiplied by 20 to obtain the number of the starting sector of the bucket. The record is then inserted in the first available record position in the bucket. Essentially, the record has been inserted into one of 997 non-sequential files. Note that while each bucket contains room for 20 records, it is possible that more than 20 records in the file will have the same remainder after division. In those cases, the bucket associated with that remainder will overflow. Provisions must be made for this occurrence so that information is not lost. While there are many ways of handling this overflow problem, one method is to treat each bucket as a node in a linked list. Whenever a bucket overflows, another bucket can be linked to It. A record is retrieved by dividing the employee number by 997, then using the remainder after division as the bucket number. This remainder is next multiplied by 20 to obtain the number of the starting sector of the bucket. Finally, the bucket is searched sequentially like a short non-sequential file to locate the desired record. If the bucket has overflowed and the desired record is not in the first bucket examined, then links are followed through the overflow buckets until the record is found. The minimum number of accesses required to find an employee is 1. Since there are at most 101 numbers in the range 0 to 99,999 that have the same remainder after division by 997, the maximum number of accesses required to find an employee is 101. Intuitively, since each bucket is on the average half full, and since each search will find the desired record half way through the list, the average number of accesses is about 5. Actually, a slight correction is needed since there are only 997 buckets instead of 1000. Buckets 0 through 299 have an expected length of 10.1 while the remaining 697 buckets have an expected length of 10. Since the average number of accesses is 5.05 in buckets 0 through 299 and 5 in buckets 300 through 996, the overall average is 5.015045 accesses. The number of records placed in each bucket varies with a hypergeometric distribution. If the desired employee number does not exist in the file, a bucket will have to be exhaustively searched. Unless pointers are kept to the ends of each of the 997 buckets, an unsuccessful search will require one more access than the number of items in the bucket. This will require a minimum of one access (for an empty bucket), a maximum of 102 accesses, and an average of 11.03009 accesses. If pointers are kept to the ends of the buckets, an unsuccessful search requires a minimum of zero, a maximum of 101, and an average of 10.03009 accesses. An addition to the file is made by inserting the new record at the end of the appropriate bucket. Without pointers to the ends of the buckets, locating the end of a bucket is equivalent to an unsuccessful search, requiring a minimum of one access to insert into an empty bucket, a maximum of 101 accesses (since no more than 101 employee numbers can yield the same bucket number), and an average of 11.03009 accesses. If pointers are kept to the ends of the buckets, an insertion is done in one access plus the effort required to do the bookkeeping. The necessary bookkeeping involves not only altering the pointer to the end of the bucket, but also finding and establishing a link to an overflow bucket when overflow occurs. A quick deletion can be made by merely flagging the unwanted record as deleted. While this approach does not require extra storage, performance will rapidly degrade if there are numerous deletions. A quick deletion requires one more access (to flag the record) than searching for the desired record. A complete deletion from the file is made by first locating the record to be deleted. After the record is located, the last record in the same bucket is moved to the location of the deleted record. Without pointers to the ends of the buckets, deletion requires at most two more accesses than adding a record. The entire bucket must be searched to find the end, and in the process the record to be deleted is found. If the system has two buffer areas, the last record can be retained when the first empty cell in the bucket is located. This will require one less access. If pointers are kept to the ends of the buckets, the deletion will usually require two more accesses than searching for the record to be deleted. More accesses will be required only when an overflow bucket is emptied as a result of the deletion. Accessing all the records in the file in sequence by employee number requires that all the buckets be sorted and merged. Assuming each bucket requires n log n accesses to be sorted, it can be shown that the expected number of accesses is 23.5841 per bucket. Since there are 997 buckets, sorting requires 23,513.34 accesses. Assuming there is room in main memory for 997 keys and pointers (but not 997 records), the merge will require 2 accesses per record (one to retrieve the key and one to subsequently retrieve the record). Thus accessing the entire file in sequential order will require roughly 43,513 accesses. Creation of the file with pointers to the ends of the buckets requires only 10,000 accesses plus much bookkeeping. Creation of the file without pointers to the ends of the buckets requires 20,000 accesses (assuming space is reserved for three overflow buckets) to flag the cells as empty. Insertion of the records into the buckets requires roughly an additional 60,155 accesses on the average, bringing the total to 80,155 accesses. From the above discussion, it might seem that it is obviously better to keep track of the ends of the 997 buckets. However, it must be remembered that these 997 pointers will themselves need to be stored (in general on a DASD), and must be retrieved every time any bucket is accessed. A reasonable compromise is to keep track of the ends of the buckets during file creation, then flag the cells at the ends of the buckets as empty, and finally discard the end of bucket pointers. This reduces the number of accesses required for creation to 20,000. Handling of the overflow buckets has been somewhat glossed over in this initial discussion. In this example, there was no room reserved for an overflow pointer in each bucket. Since it is highly improbable that the overflow bucket will itself be even half full, there is no great loss in using an entire record position (sector) for the overflow pointer when one is needed. An analysis of the statistical distribution of the number of records in each bucket shows that the probability of a bucket overflowing is 0.00091877 in buckets 000 through 299 and 0.00080233 in buckets 300 through 696, yielding an overall probability of 0.00083737. With 997 buckets, the expected number of buckets requiring an overflow bucket is roughly 0.8349. Providing three overflow buckets is probably adequate. This technique requires 20,000 sectors of storage, 50% of which contain useful information. The storage overhead is 100% as compared with non-sequential or sequential. One advantage of this technique is that alphabetic keys can be used as easily as numeric keys. The bit patterns used to code alphabetic characters can be treated as if they were binary coded integers. Another advantage of this technique is that it can take advantage of DASD hardware. If the bucket is sized so that the DASD can read an entire bucket without head movement, a desired record can be accessed more rapidly than the number of accesses required predicts. A major disadvantage of this technique is that half of the reserved space is not used. If the records to be stored are long, a DASD may only be able to access a few records without head movement, thus degrading performance by requiring head movement during the sequential search of a bucket.

 

b) Distinguish between structure programming and modular programming.

Solution : A style of programming usually associated with languages such as C, Fortran, Pascal and so on. Using structured programming techniques, a problem is often solved using a divide and conquer approach such as stepwise refinement. An initially large problem is broken into several smaller sub-problems. Each of these is then progressively broken into even smaller sub-problems, until the level of difficulty is considered to be manageable. At the lowest level, a solution is implemented in terms of data structures and procedures. This approach is often used with imperative programming languages that are not object-oriented languages, i.e. the data structures and procedures are not implemented as classes.

Modular programming involves building a program in such a way as to create it in specific sections; modules. This can be done using SUB declarations, GOSUBS, and FUNCTIONS. Using modular structure is preferred because, elements that need to be repeated can be created once, than invoked repeatedly. Good programming practice will also allow you to “transport” SUB modules for inclusion in other programs you create.

modular programming is one aspect of structured programming in which individual tasks are programmed as distinct sections or modules (subprograms). One advantage is the ease with which individual sections can be modified without reference to other sections.

What is modular programming

A module, in modular programming, is a series of functions (or procedures) that are related in some way. This way of programming is used in many applications.

Way of work

Given a problem, you begin with analysis of the problem. You carefully look at the requirements of the problem and you make sure all questions are answered before you begin with design of the modules. Then, you break the problem in subproblems. Each of these problems you solve in one or more modules. All the modules together solve the complete problem. Modules itself may be divided into smaller modules.

 

7a) How information is stored by FoxPro?

Solution :You choose a data storage container according to the amount and type of data you need to store, and how you want to use it. You determine the availability of data by the way you declare it and where you create it in the program. This range of availability or effectiveness is called scope.

Most programming languages make it possible for you to store data in constants, variables, and arrays. In Visual FoxPro, you can also store data in records and objects.

 

b) What is a relational database system? Explain with example.

Solution : Relational Database Management System. A computer program that lets you store, index, and retrieve tables of data. The simplest way to look at an RDBMS is as a spreadsheet that multiple users can update. The most important thing that an RDBMS does is provide transactions.

A Relational Database Management System stores information in the form of tables and then links or relates those tables to provide answer to usersí diverse questions.

RDBMS were invented around 1970 by Codd.

There are many RDBMS on market today such as ORACLE, DB2, Microsoft SQL, etc

Databases have been a staple of business computing from the very beginning of the digital era. In fact, the relational database was born in 1970 when E.F. Codd, a researcher at IBM, wrote a paper outlining the process. Since then, relational databases have grown in popularity to become the standard.

Originally, databases were flat. This means that the information was stored in one long text file, called a tab delimited file. Each entry in the tab delimited file is separated by a special character, such as a vertical bar (|). Each entry contains multiple pieces of information (fields) about a particular object or person grouped together as a record. The text file makes it difficult to search for specific information or to create reports that include only certain fields from each record. Here’s an example of the file created by a flat database:

Lname, FName, Age, Salary|Smith, John, 35, $280|Doe, Jane, 28, $325|Brown, Scott, 41, $265|Howard, Shemp, 48, $359|Taylor, Tom, 22, $250

You can see that you have to search sequentially through the entire file to gather related information, such as age or salary. A relational database allows you to easily find specific information. It also allows you to sort based on any field and generate reports that contain only certain fields from each record. Relational databases use tables to store information. The standard fields and records are represented as columns (fields) and rows (records) in a table. Look at this example:

LName FName City Age Salary

Smith John 3 35 $280

Doe Jane 1 28 $325

Brown Scott 3 41 $265

Howard Shemp 4 48 $359

Taylor Tom 2 22 $250

In the relational database example, you can quickly compare salaries and ages because of the arrangement of data in columns. The relational database model takes advantage of this uniformity to build completely new tables out of required information from existing tables. In other words, it uses the relationship of similar data to increase the speed and versatility of the database.

The “relational” part of the name comes into play because of the other tables. A typical relational database has anywhere from 10 to more than 1,000 tables. Each table contains a column or columns that other tables can key on to gather information from that table. Look at the table below that matches the number in the City column of the above table with the name of a city.

City # City Name

1 Boston

2 London

3 New York

4 Los Angeles

By storing this information in another table, the database can create a single small table with the locations that can then be used for a variety of purposes by other tables in the database. A typical large database, like the one a big Web site, such as Amazon would have, will contain hundreds or thousands of tables like this all used together to quickly find the exact information needed at any given time.

Relational databases are created using a special programming language, structured query language (SQL), that is the standard for database interoperability. SQL is the foundation for all of the popular database applications available today, from Access to Oracle.

 

c) How does indexing works in Database?

Solution : Indexes

Indexes in databases are similar to indexes in books. In a book, an index allows you to find information quickly without reading the entire book. In a database, an index allows the database program to find data in a table without scanning the entire table. An index in a book is a list of words with the page numbers that contain each word. An index in a database is a list of values in a table with the storage locations of rows in the table that contain each value. Indexes can be created on either a single column or a combination of columns in a table and are implemented in the form of B-trees. An index contains an entry with one or more columns (the search key) from each row in a table. A B-tree is sorted on the search key, and can be searched efficiently on any leading subset of the search key. For example, an index on columns A, B, C can be searched efficiently on A, on A, B, and A, B, C.

Most books contain one general index of words, names, places, and so on. Databases contain individual indexes for selected types or columns of data: this is similar to a book that contains one index for names of people and another index for places. When you create a database and tune it for performance, you should create indexes for the columns used in queries to find data.

In the pubs sample database provided with Microsoft® SQL Server™ 2000, the employee table has an index on the emp_id column. The following illustration shows how the index stores each emp_id value and points to the rows of data in the table with each value.

When SQL Server executes a statement to find data in the employee table based on a specified emp_id value, it recognizes the index for the emp_id column and uses the index to find the data. If the index is not present, it performs a full table scan starting at the beginning of the table and stepping through each row, searching for the specified emp_id value.

SQL Server automatically creates indexes for certain types of constraints (for example, PRIMARY KEY and UNIQUE constraints). You can further customize the table definitions by creating indexes that are independent of constraints.

The performance benefits of indexes, however, do come with a cost. Tables with indexes require more storage space in the database. Also, commands that insert, update, or delete data can take longer and require more processing time to maintain the indexes. When you design and create indexes, you should ensure that the performance benefits outweigh the extra cost in storage space and processing resources.

 

8. Explain the following:

a) What is the purpose of  SET FILTER command? Give an example.

b) What do you understand by object linking and embedding?

c) Explain Query designer and Multiple tables.

 

9a) Explain the IIF( ) Function with an example.

b) In a payroll system, you are provided with Employee number, Employee name, Department, Designation, city which is either a Metro or a Non-metro and Basic pay. Write a FoxPro program to read this file, calculate allowances, total income and net salary as follows: 

 i. Dearness Allowance (DA):

         Up to Rs.8000                 60% of basic pay

         Above Rs.8000         50% of basic pay

 

ii. If city is Metro, House Rent allowance (HRA) is 30% of basic pay else if it is non-metro, HRA is 15% of basic pay.

 

iii. City compensatory allowance (CCA) is Rs.300.

iv. Provided fund is 6% of the basic pay.

v. Total income = Basic pay + DA+HRA+CCA ABD

         NET Salary = Total income - PF

Each payslip printed should contain the information for each employee as per the sample payslip output below:

PAYSLIP

Employee#         :7567                                 Department: COMPUTER

Employee Name: RAJ           Designation: LECTURER

Payslip for the month of: December, 2003

         INCOME                                             Rupees

         Basic Pay                                             6000

         Dearness Allowance                                  3600

         HRA                                                       1800

         CCA                                                       300

         Total Income                                                11700

         DEDUTIONS

         PF                                                           360

         Net Salary                                                  11340

                                                   ACCOUNTS OFFICER

Solution : clear

use emp

do while not eof()

if bp<=8000

da=0.6*bp

else

da=.5*bp

endif

if city="metro"

hr=.3*bp

else

hr=.15*bp

endif

cca=300

 pf=.06*bp

allo=da+hr+cca

ded=pf

ts=bp+da+hr+cca

ns=ts-pf

@ 3,60 SAY"PAYSLIP"

@4,5 say" EMPLOYEE NO:"+ ENO

@5,5 SAY" EMPLOYEE NAME:" +NAME

@4,70 say"DEPARTMENT:"+DEPT

@5,70 SAY"DESIGNATION:"+DESIG

@6,50 SAY" PAYSLIP FOR THE MONTH OF:"+CMONTH(DATE())+LTRIM(STR(YEAR(DATE())))

@7,5 SAY REPLICATE ("-",150)

@8,5 SAY" INCOME:"

@8,50 SAY "RUPEES"

@9,5 SAY REPLICATE ("-",150)

@10,5 SAY" BASIC PAY :"

@10,50 SAY LTRIM(STR(BP))

@11,5 SAY" DEARNESS ALLOWANCE :"

@11,50 SAY LTRIM(STR(DA))

@12,5 SAY" HOUSE RENT ALLOWANCE:"

@12,50 SAY LTRIM(STR(HR))

@13,5 SAY" CITY COMPENSATORY ALLOWANCE :"

@13,50 SAY LTRIM(STR(CCA))

@14,5 SAY" TOTAL INCOME:"

@14,50 SAY LTRIM(STR(ALLO))

@15,5 SAY REPLICATE ("-",150)

@17,5  SAY" DEDUCTIONS:"

@18,5 SAY REPLICATE ("-",150)

@20,5 SAY" PROVIDEND FUND:"

@20,50 SAY LTRIM(STR(PF))

@22,5 SAY" TOTAL DEDUCTIONS:"

@22,50 SAY LTRIM(STR(DED))

@23,5 SAY REPLICATE ("-",150)

@24,5  SAY" NET SALARY :"

@24,50 SAY LTRIM(STR(NS))

WAIT

CLEAR

skip

enddo