
Friday, November 23, 2007

UNIX

Unix (officially trademarked as UNIX®, sometimes also written as Unix or Unix® with small caps) is a computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs including Ken Thompson, Dennis Ritchie and Douglas McIlroy. Today's Unix systems are split into various branches, developed over time by AT&T as well as various commercial vendors and non-profit organizations.
As of 2007, the owner of the
trademark UNIX® is The Open Group, an industry standards consortium. Only systems fully compliant with and certified to the Single UNIX Specification qualify as "UNIX®" (others are called "Unix system-like" or "Unix-like").
During the late 1970s and early 1980s, Unix's influence in academic circles led to large-scale adoption of Unix (particularly of the
BSD variant, originating from the University of California, Berkeley) by commercial startups, the most notable of which is Sun Microsystems. Today, in addition to certified Unix systems, Unix-like operating systems such as Linux and BSD are commonly encountered. Sometimes, "traditional Unix" may be used to describe a Unix or an operating system that has the characteristics of either Version 7 Unix or UNIX System V.
What is UNIX®?
The Open Group holds the definition of what a UNIX system is and its associated trademark in trust for the industry.
In 1994 Novell (who had acquired the UNIX systems business of AT&T/USL) decided to get out of that business. Rather than sell the business as a single entity, Novell transferred the rights to the UNIX trademark and the specification (that subsequently became the Single UNIX Specification) to The Open Group (at the time X/Open Company). Subsequently, it sold the source code and the product implementation (UNIXWARE) to SCO. The Open Group also owns the trademark UNIXWARE, transferred to them from SCO more recently.
Today, the definition of UNIX® takes the form of the worldwide Single UNIX Specification, integrating X/Open Company's XPG4, IEEE's POSIX standards and ISO C. Through continual evolution, the Single UNIX Specification is the de facto and de jure standard definition for the UNIX system application programming interfaces. As the owner of the UNIX trademark, The Open Group has separated the UNIX trademark from any actual code stream itself, thus allowing multiple implementations. Since the introduction of the Single UNIX Specification, there has been a single, open, consensus specification that defines the requirements for a conformant UNIX system.
There is also a mark, or brand, that is used to identify those products that have been certified as conforming to the Single UNIX Specification, initially UNIX 93, followed subsequently by UNIX 95, UNIX 98 and now UNIX 03.
The Open Group is committed to working with the community to further the development of standards-conformant systems by evolving and maintaining the Single UNIX Specification and participating in other related standards efforts. Recent examples of this are making the standard freely available on the web, permitting reuse of the standard in open source documentation projects, providing test tools, and developing the POSIX and LSB certification programs.
From this page you can read about the history of the UNIX system over the past 30 years or more. You can learn about the Single UNIX Specification, and read or download online versions of the specification. You can also get involved in the ongoing development and maintenance of the Single UNIX Specification by joining the Austin Group (whose approach to specification development is "write once, adopt everywhere") or The Open Group's Base Working Group, or by getting involved in the UNIX Certification program.

The Creation of the UNIX* Operating System
After three decades of use, the UNIX* computer operating system from Bell Labs is still regarded as one of the most powerful, versatile, and flexible operating systems (OS) in the computer world. Its popularity is due to many factors, including its ability to run on a wide variety of machines, from micros to supercomputers, and its portability -- all of which led to its adoption by many manufacturers.
Like another legendary creature whose name also ends in 'x,' UNIX rose from the ashes of a multi-organizational effort in the early 1960s to develop a dependable timesharing operating system.
The joint effort was not successful, but a few survivors from Bell Labs tried again, and what followed was a system that offers its users a work environment that has been described as "of unusual simplicity, power, and elegance...."
The system also fostered a distinctive approach to software design -- solving a problem by interconnecting simpler tools, rather than creating large monolithic application programs.
Its development and evolution led to a new philosophy of computing, and it has been a never-ending source of both challenges and joy to programmers around the world.

Thursday, November 22, 2007

What is loosely coupled hardware? Tightly coupled?

"Loosely" and "tightly" coupled usually refer to things other than hardware... like connections. Maybe the same concepts can apply to hardware in some situations, and I just haven't run into it.

Tightly coupled usually means synchronous connections where if there is any interruption both ends of the communication have to wait until the connection is restored.

Loosely coupled usually means asynchronous connections where if there is any interruption, one or both ends of the communication are able to function separately until the connection is restored.

There can be reasons to support one or the other: "tightly coupled" communications ensure the integrity and state of both ends, while "loosely coupled" communications can mean better fault tolerance, in that things get done as much as possible even when conditions aren't perfect.

Coupling refers to how much a component (software or hardware) relies on another component. Coupling can be loose or tight. With low coupling, a change in one module will not require a change in the implementation of another module. Low coupling is often a sign of a well-structured computer system.
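A minimal Python sketch of the distinction (the class and method names are invented for illustration): the tightly coupled sensor calls its collaborator directly and fails with it, while the loosely coupled one only touches a shared buffer and keeps working even when no consumer is around.

```python
import queue

# Tightly coupled: the caller invokes the logger directly and blocks on it.
# If the logger is down, the caller fails with it.
class TightlyCoupledSensor:
    def __init__(self, logger):
        self.logger = logger          # hard, synchronous dependency

    def read(self, value):
        self.logger.write(value)      # caller waits; a crash here propagates

# Loosely coupled: the sensor drops readings into a buffer and moves on.
# A consumer drains the buffer whenever it happens to be available.
class LooselyCoupledSensor:
    def __init__(self, buffer):
        self.buffer = buffer          # only dependency is the shared queue

    def read(self, value):
        self.buffer.put(value)        # doesn't care if the consumer is down

buf = queue.Queue()
sensor = LooselyCoupledSensor(buf)
sensor.read(42)                       # succeeds with no consumer running
print(buf.get())                      # a consumer picks it up later -> 42
```

The queue here plays the role of the asynchronous connection described above: an interruption on one side does not stop the other.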

What do you think is the advantages of distributed systems over parallel systems? Parallel systems over distributed systems?

A distributed system's main advantage over a parallel system approach is the ability to re-distribute its different functions onto different platforms. That gives you the ability both to scale up (substitute bigger boxes for existing ones) and to scale out (add additional boxes that provide the same functionality).

A parallel system approach instead tries to concentrate on internal speed, which is nearly instantaneous compared to the latencies involved for a distributed system.

Tuesday, November 20, 2007

Main Differences Between Mainframe and Personal Computer Operating Systems

*An operating system is the software that actually runs the computer and is the interface between the user and the computer. There are different operating systems for mainframes and personal computers. Mainframe operating systems are more powerful and expensive, whereas the OS's for PCs are readily available and not as powerful.

*The OS for a mainframe is designed to handle hundreds of users at a time. This means it manages hundreds of displays (monitors) and keyboards, and keeps track of the input from each user, the processing required/requested on those inputs, and the resulting output. On the other hand, most other operating systems are not truly multiuser. If you create four accounts on Windows XP, that does not make the OS multiuser, because only a single user can log in and interact with the machine at a time.

*PC OS's are not concerned with fair or maximal use of computer resources. Instead, they try to optimize the usefulness of the computer for an individual user, usually at the expense of overall efficiency; consider how many CPU cycles are consumed by graphical user interfaces (GUIs). Mainframe OS's need more complex scheduling and I/O algorithms to keep the various system components efficiently occupied.

*Generally, operating systems for batch systems have simpler requirements than those for personal computers. Batch systems do not have to be concerned with interacting with a user, whereas an operating system for a PC must be concerned with response time for an interactive user. A pure batch system also may not have to handle time sharing, whereas an interactive operating system must switch rapidly between different jobs.


*Personal computer OS's assume a very fixed hardware structure: the computer needs a keyboard, a mouse, one processor, at least one monitor, and a primary hard drive. Mainframes, by contrast, require a very flexible structure: a mainframe can have several hard drives and several processors that cooperate to do the job.

*Mainframe software also has more robust process control, so one crashing program will not usually crash the mainframe, while on a PC there is less room for such safeguards and one process can crash the entire system.

*Generally, the difference looks like the difference between a boxer who fights alone and a big army that has many divisions that cooperate simultaneously to do a huge task on a large scale.




Multiprogramming

Early computer systems were extremely expensive to purchase and to operate. Unfortunately they often sat idle as their human operators tended to their duties at a human's pace. Even a group of highly trained computer operators could not work fast enough to keep even the earliest tape-fed batch system's CPU busy. The advent of multiprogramming broke through the utilization barrier by removing most of the human factor in CPU utilization.
In a multiprogramming-capable system, jobs to be executed are loaded into a pool. Some number of those jobs are loaded into main memory, and one is selected from the pool for execution by the CPU. If at some point the program in progress terminates or requires the services of a peripheral device, the control of the CPU is given to the next job in the pool. As programs terminate, more jobs are loaded into memory for execution, and CPU control is switched to another job in memory. In this way the CPU is always executing some program or some portion thereof, instead of waiting for a printer, tape drive, or console input [1].
An important concept in multiprogramming is the degree of multiprogramming. The degree of multiprogramming describes the maximum number of processes that a single-processor system can accommodate efficiently[2]. The primary factor affecting the degree of multiprogramming is the amount of memory available to be allocated to executing processes. If the amount of memory is too limited, the degree of multiprogramming will be limited because fewer processes will fit in memory. A factor inherent in the operating system itself is the means by which resources are allocated to processes. If the operating system can not allocate resources to executing processes in a fair and orderly fashion, the system will waste time in reallocation, or process execution could enter into a deadlock state as programs wait for allocated resources to be freed by other blocked processes. Other factors affecting the degree of multiprogramming are program I/O needs, program CPU needs, and memory and disk access speed[3].
Processes maintained in the computer's main memory are considered to be executing concurrently even though the CPU is (usually) capable of executing only one instruction at a time. The number of processes that can be held in memory depends on the amount of memory available to the system. That amount may be any combination of real or virtual memory, virtual memory being a portion of on-line mass storage allocated to hold code in the process of being executed that cannot fit into available real memory. If a process is too large to fit into the memory allocated to it, portions of its code may be stored temporarily on disk. When this code is required, the operating system loads it into memory and execution continues. The management process by which this code is swapped into and out of memory is referred to as paging. A similar system can be used to manage data-segment memory[3][1].
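As a sketch of the bookkeeping behind paging, here is a small FIFO page-replacement simulation in Python (the reference string is an invented example; real systems use more sophisticated replacement policies):

```python
from collections import deque

def fifo_paging(reference_string, frame_count):
    """Simulate FIFO page replacement; return the number of page faults."""
    frames = deque()                  # pages currently resident in real memory
    faults = 0
    for page in reference_string:
        if page not in frames:        # page fault: page must be loaded from disk
            faults += 1
            if len(frames) == frame_count:
                frames.popleft()      # evict the oldest resident page
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_paging(refs, 3))   # -> 9 faults with 3 frames
print(fifo_paging(refs, 4))   # -> 10 faults with 4 frames (Belady's anomaly)
```

The second result illustrates how paging behavior can be counter-intuitive: with FIFO replacement, adding memory can occasionally increase the fault count for a given reference string.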
As the number of processes (the degree of multiprogramming) increases in a system that supports paging, the amount of memory available to executing processes decreases and the number of paging operations required increases. At some point the amount of time the CPU spends paging code and data drags system performance down. This phenomenon, called "thrashing," is a manifestation of exceeding the degree of multiprogramming for a system [2][3][1].
In contrast to a multiprogramming system, a batch system executes its jobs sequentially, and might be referred to as a "monoprogramming" system (though certain batch systems have monitor programs, which might classify them loosely as multiprogramming systems). Batch systems were developed prior to multiprogramming as a means of increasing the efficiency of computer time use. In a batch system, jobs with similar requirements (e.g., needing the same compiler) are collected together and loaded into the system. Each job is taken sequentially, and the next job's execution starts after the previous one stops.
The batch system is not as efficient as the multiprogrammed system because the CPU must sit idle while waiting on I/O devices or human input. Unlike in a multiprogrammed system, however, the program in progress always has full use of the CPU and its resources. Enhancements to early batch systems included copying all punch cards to tape and placing all output onto tape for later printing, both operations having been done on a separate system dedicated to those purposes [1].
Efficient use of computer time was the impetus for the development of multiprogramming systems. Multiprogramming freed the CPU from relying on the inherent slowness of the operator, and permitted it to work while waiting on peripheral devices. Every computer system has a limit to the degree of multiprogramming it will support. This limit is primarily based on the amount of main memory available to the system and the efficiency of the operating system's resource allocation algorithms.

Timesharing
Timesharing, also called "time slicing" or "time division multiplexing", or even "multitasking", is a system whereby system focus is switched between each of its users. Ideally, users of a timesharing system will not be aware that they are sharing the computer with one or more users.
Timesharing came about as an extension of multiprogramming. Old batch systems, including multiprogramming batch systems, were usually isolated from their end-users. Due to this isolation, programmers were required to plan for every contingency in designing the input for their code. Many times a programmer would receive an output printout or a memory dump of their program. Interaction with the computer system was not possible due to the inefficiency introduced by extensive human interaction [1].
Timesharing provided a means by which system users could interact with the computer and the computer could direct its idle time toward other users or other processes. Now programmers could write programs that interacted with the user; they even had the opportunity to debug their code in a live environment rather than a stack of paper printout [1].
To provide service to individual users and maintain their operating environment, timesharing systems employ some means of context switching. In context switching, a user's execution environment is loaded into memory, some operation is performed, and then the environment is placed back into temporary storage until the CPU comes back to service the next request [2].
Timeshare systems allowed multiple users access to the same system, so protection had to be provided for each user's environment and for the operating system itself. Timeshare systems also had to provide the illusion of immediate system interaction in order to be practical and attractive to users demanding real-time interaction.
To achieve effective user interaction, processes were executed in a defined order and were only given a certain amount of CPU time. This way everyone gets the CPU's attention at some point, and no one can take complete control. Two common schemes are FIFO and round robin. The FIFO scheme processes requests in the order in which they are received. Unfortunately FIFO is not conducive to interactive processing because it permits jobs to run to completion; thus short jobs wait for long jobs, and job priority cannot be defined [3].
On the other hand, round-robin scheduling is a preemptive method of allocating CPU time. In this scheme, processes gain CPU control in a FIFO method, but after a certain amount of time are stopped, or preempted, and placed back into the queue. It is in this scheme that context switching becomes a central issue. Efficiency in setup and clean up for processes determines the overhead introduced by the system. TSS, the time sharing system for the IBM System 360/67, used a scheduling scheme similar to a combination FIFO/round-robin. [2]
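The difference between the two schemes can be seen in a minimal Python sketch of round-robin scheduling (the job names, burst times, and quantum are invented for the example):

```python
from collections import deque

def round_robin(burst_times, quantum):
    """Simulate round-robin scheduling.
    burst_times maps job name -> CPU time needed; returns a map of
    job name -> completion time (context-switch overhead ignored)."""
    ready = deque(burst_times.items())   # FIFO ready queue
    clock = 0
    finished = {}
    while ready:
        job, remaining = ready.popleft()
        slice_ = min(quantum, remaining) # run for one quantum, or less
        clock += slice_
        if remaining > slice_:
            ready.append((job, remaining - slice_))  # preempt: back of queue
        else:
            finished[job] = clock        # job done; record completion time
    return finished

# A short job no longer waits behind a long one as it would under pure FIFO.
print(round_robin({"long": 9, "short": 2}, quantum=3))
# -> {'short': 5, 'long': 11}
```

Under pure FIFO the short job would finish at time 11; with preemption it finishes at time 5, which is exactly the responsiveness an interactive timesharing system needs.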
Compared to a timeshare system, the scheduling policy of a batch system is a hybrid of round-robin and FIFO, but with more emphasis on FIFO. Earlier systems relied entirely on a FIFO scheme; one job was loaded into the system and ran to completion before another could start. Later batch systems improved on this theme to provide one job with the CPU for as long as it could execute without requesting an I/O device. Upon such a request, the CPU switched execution to the next job in the queue [1].
To make them practical, timesharing systems usually supported some sort of on-line central file system and a means of organizing it. MULTICS, the Multiplexed Information and Computing Service, was the first timesharing system to support a centralized file system organized hierarchically [4].
Timesharing systems, the evolution of multiprogrammed systems, brought the computer to the user. Utilizing process scheduling and queuing schemes, individual users were given the illusion of being the sole user of the system, unlike batch systems, where jobs had the opportunity to capture the attention of the CPU from other users. These systems typically incorporated a shared file system where user code and data could be maintained and developed. The timesharing system leveraged the computing investment and at the same time increased the level of service to the end user.



Multiprogramming

Multiprogramming is a rudimentary form of parallel processing in which several programs are run at the same time on a uniprocessor. Since there is only one processor, there can be no true simultaneous execution of different programs. Instead, the operating system executes part of one program, then part of another, and so on. To the user it appears that all programs are executing at the same time.
If the machine has the capability of causing an
interrupt after a specified time interval, then the operating system will execute each program for a given length of time, regain control, and then execute another program for a given length of time, and so on. In the absence of this mechanism, the operating system has no choice but to begin to execute a program with the expectation, but not the certainty, that the program will eventually return control to the operating system.
If the machine has the capability of protecting
memory, then a bug in one program is less likely to interfere with the execution of other programs. In a system without memory protection, one program can change the contents of storage assigned to other programs or even the storage assigned to the operating system. The resulting system crashes are not only disruptive, they may be very difficult to debug since it may not be obvious which of several programs is at fault.



Multiprogramming
Early computers ran one process at a time. While the process waited for servicing by another device, the CPU was idle. In an I/O-intensive process, the CPU could be idle as much as 80% of the time. Advancements in operating systems led to computers that load several independent processes into memory and switch the CPU from one job to another when the first becomes blocked waiting for service from another device. This idea of multiprogramming reduces the idle time of the CPU and accelerates the throughput of the system by using CPU time efficiently.
Programs in a multiprogrammed environment appear to run at the same time. Processes running in a multiprogrammed environment are called concurrent processes. In actuality, the CPU processes one instruction at a time, but can execute instructions from any active process.


CPU utilization of a system can be improved by using multiprogramming. Let P be the fraction of time that a process spends away from the CPU. If there is one process in memory, the CPU utilization is (1 - P). If there are N processes in memory, the probability that all N processes are waiting for I/O at the same time is P*P*...*P (N times), i.e. P^N, so the CPU utilization is (1 - P^N), where N is called the multiprogramming level (MPL) or the degree of multiprogramming. As N increases, the CPU utilization increases. While this equation suggests that the CPU keeps working more efficiently as more and more processes are added, logically this cannot be true: once the system passes the point of optimal CPU utilization, it thrashes.
(For more on performance issues of operating systems, refer to the Operating System Performance Module.)
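The utilization formula can be evaluated directly; as a rough sketch (which assumes the processes' I/O waits are independent and ignores the memory limits and thrashing just described):

```python
def cpu_utilization(p, n):
    """CPU utilization with n processes in memory, where each process
    spends a fraction p of its time away from the CPU (waiting on I/O).
    The CPU idles only when all n processes wait at once: p**n."""
    return 1 - p ** n

# With an 80% I/O wait, one process keeps the CPU only 20% busy;
# raising the degree of multiprogramming N improves this quickly.
for n in (1, 2, 5, 10):
    print(n, round(cpu_utilization(0.8, n), 3))
```

With P = 0.8 the utilization climbs from 0.2 at N = 1 to roughly 0.89 at N = 10, which is why loading more jobs into memory pays off until the system starts to thrash.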
In order to use the multiprogramming concept, processes must be loaded into independent sections or
partitions of memory. So, main memory is divided into fixed-sized or variable-sized partitions. Since a partition may not be large enough for the entire process, virtual memory is implemented to keep the processes executing. The answers to several questions are important to implementing an efficient virtual memory system in a multiprogrammed environment.


Spooling

In computer science, spooling refers to a process of communicating data to another program by placing it in a temporary working area, where the other program can access it at some later point in time. Traditional uses of the term apply it to situations where there is little or no direct communication between the program that writes the data and the program that reads it. Spooling is often used when devices access data at different rates: the temporary working area provides a waiting station where data can reside while a slower device processes it at its own rate. New data is only added and deleted at the ends of the area, i.e., there is no random access or editing.
It can also refer to a
storage device that incorporates a physical spool, such as a tape drive.
The most common spooling application is print spooling. In print spooling,
documents are loaded into a buffer (usually an area on a disk), and then the printer pulls them off the buffer at its own rate. Because the documents are in a buffer where they can be accessed by the printer, the user is free to perform other operations on the computer while the printing takes place in the background. Spooling also lets users place a number of print jobs in a queue instead of waiting for each one to finish before specifying the next one.
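The buffer-and-drain idea can be sketched in a few lines of Python, with an in-memory queue standing in for the on-disk spool area (the function names and documents are invented for the example):

```python
import queue

spool = queue.Queue()       # the buffer (on a real system, an area on disk)

def submit(document):
    """The user drops a job in the spool and is immediately free again."""
    spool.put(document)

def printer_loop():
    """The (slow) printer drains the spool at its own rate."""
    printed = []
    while not spool.empty():
        printed.append(spool.get())  # one document per printer cycle
    return printed

# Jobs queue up instead of each user waiting for the previous printout.
submit("report.txt")
submit("invoice.pdf")
print(printer_loop())       # -> ['report.txt', 'invoice.pdf']
```

Because `submit` returns as soon as the document is in the buffer, the user's session is free while printing happens in the background, exactly as described above.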
The temporary storage area to which
e-mail is delivered by a Mail Transfer Agent and in which it waits to be picked up by a Mail User Agent is sometimes called a mail spool. Likewise, a storage area for Usenet articles may be referred to as a news spool. (On Unix-like systems, these areas are usually located in the /var/spool directory.) Unlike other spools, mail and news spools usually allow random access to individual messages.

Origin of the term

Magnetic recording tape wound onto a spool or reel.
"Spool" is supposedly an
acronym for simultaneous peripheral operations on-line (although this is thought by some to be a backronym), or as for printers: simultaneous peripheral output on line (yes, not on-line but on line). Early mainframe computers had, by current standards, small and expensive hard disks. These costs made it necessary to reserve the disks for files that required random access, while writing large sequential files to reels of tape. Typical programs would run for hours and produce hundreds or thousands of pages of printed output. Periodically the program would stop printing for a while to do a lengthy search or sort. But it was desirable to keep the printer(s) running continuously. Thus, when a program was running, it would write the printable file to a spool of tape, which would later be read back in by the program controlling the printer.
Spooling also improved the multiprogramming capability of systems. Most programs required input, and produced output, using slow peripherals such as card readers and line printers. Without spooling, the number of tasks that could be multiprogrammed was limited by the availability of peripherals; with spooling, a task did not need access to a real device: slow peripheral input and output for several tasks could be held on shared system storage, written and read by separate system processes running asynchronously with those tasks.



Spooling
Acronym for simultaneous peripheral operations on-line, spooling refers to putting
jobs in a buffer, a special area in memory or on a disk where a device can access them when it is ready. Spooling is useful because devices access data at different rates. The buffer provides a waiting station where data can rest while the slower device catches up.
The most common spooling
application is print spooling. In print spooling, documents are loaded into a buffer (usually an area on a disk), and then the printer pulls them off the buffer at its own rate. Because the documents are in a buffer where they can be accessed by the printer, you can perform other operations on the computer while the printing takes place in the background. Spooling also lets you place a number of print jobs on a queue instead of waiting for each one to finish before specifying the next one.

Data Spooling
Bacula allows you to specify that you want the Storage daemon to initially write your data to disk and then subsequently to tape. This serves several important purposes.
It can take a long time for data to come in from the File daemon during an Incremental backup. If the data is written directly to tape, the tape will repeatedly start and stop (or "shoe-shine", as it is often called), causing tape wear. By first writing the data to disk, then writing it to tape, the tape can be kept in continual motion.
While the spooled data is being written to the tape, the despooling process has exclusive use of the tape. This means that you can spool multiple simultaneous jobs to disk, then have them very efficiently despooled one at a time without having the data blocks from several jobs intermingled, thus substantially improving the time needed to restore files. While despooling, all jobs spooling continue running.
Writing to a tape can be slow. By first spooling your data to disk, you can often reduce the time the File daemon is running on a system, thus reducing downtime, and/or interference with users. Of course, if your spool device is not large enough to hold all the data from your File daemon, you may actually slow down the overall backup.
Data spooling is exactly that: spooling. It is not a way to first write a "backup" to a disk file and then to a tape. When the backup has only been spooled to disk, it is not complete yet and cannot be restored until it is written to tape. In a future version, Bacula will support writing a backup to disk and then later Migrating or Copying it to a tape.
The remainder of this chapter explains the various directives that you can use in the spooling process.

Data Spooling Directives
The following directives can be used to control data spooling.
To turn data spooling on/off at the Job level, in the Job resource of the Director's conf file (default no):
SpoolData = yes|no
To override the Job specification, in a Schedule Run directive of the Director's conf file:
SpoolData = yes|no
To limit the maximum total size of the spooled data for a particular device, in the Device resource of the Storage daemon's conf file (default unlimited):
Maximum Spool Size = size, where size is the maximum spool size for all jobs, specified in bytes.
To limit the maximum size of the spooled data for a single job on a particular device, in the Device resource of the Storage daemon's conf file (default unlimited):
Maximum Job Spool Size = size, where size is the maximum spool file size for a single job, specified in bytes.
To specify the spool directory for a particular device. Specified in the Device Resource of the Storage daemon's conf file (default, the working directory).
Spool Directory = directory
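Putting the directives above together, a configuration fragment might look like the following (the device and job names, paths, and sizes are invented examples, and only the spooling-related directives are shown; a real Device resource needs additional directives such as the archive device and media type):

```
# In the Storage daemon's conf file (Device resource)
Device {
  Name = "ExampleTapeDrive"
  Spool Directory = /var/bacula/spool
  Maximum Spool Size = 100000000000     # 100 GB total for all jobs, in bytes
  Maximum Job Spool Size = 20000000000  # 20 GB per job, in bytes
}

# In the Director's conf file (Job resource)
Job {
  Name = "ExampleBackup"
  SpoolData = yes
}
```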

!!! MAJOR WARNING !!!
Please be very careful to exclude the spool directory from any backup; otherwise, your job will write enormous amounts of data to the Volume and most probably terminate in error. This is because, in attempting to back up the spool file, the backup data is written a second time to the spool file, and so on ad infinitum.
It is also advisable to always specify a maximum spool size so that your disk doesn't fill up completely. In principle, data spooling will properly detect a full disk and despool data, allowing the job to continue. However, attribute spooling is not so kind to the user: if the disk on which attributes are being spooled fills, the job will be canceled. In addition, if your working directory is on the same partition as the spool directory, then Bacula jobs will fail, possibly in bizarre ways, when the spool fills.

Other Points
When data spooling is enabled, Bacula automatically turns on attribute spooling. In other words, it also spools the catalog entries to disk. This is done so that in case the job fails, there will be no catalog entries pointing to non-existent tape backups.
Attribute despooling is done at the end of the job; as a consequence, after Bacula stops writing the data to the tape, there may be a pause while the attributes are sent to the Director and entered into the catalog before the job terminates.
Attribute spool files are always placed in the working directory.
When Bacula begins despooling data spooled to disk, it takes exclusive use of the tape. This has the major advantage that, when running multiple simultaneous jobs, the blocks of the several jobs will not be intermingled.
It probably does not make a lot of sense to enable data spooling if you are writing to disk files.
It is probably best to provide as large a spool file as possible to avoid repeatedly spooling/despooling. Also, while a job is despooling to tape, the File daemon must wait (i.e. spooling stops for the job while it is despooling).
If you are running multiple simultaneous jobs, Bacula will continue spooling other jobs while one is despooling to tape, provided there is sufficient spool file space.

Batch System

Batch processing is execution of a series of programs ("jobs") on a computer without human interaction.
Batch jobs are set up so they can run to completion without human interaction, so all input data is preselected through scripts or command-line parameters. This is in contrast to interactive programs, which prompt the user for such input.
Batch processing has these benefits:
It allows sharing of computer resources among many users,
It shifts the time of job processing to when the computing resources are less busy,
It avoids idling the computing resources with minute-by-minute human interaction and supervision,
By keeping a high overall rate of utilization, it better amortizes the cost of a computer, especially an expensive one.
Batch processing has been associated with
mainframe computers since the earliest days of electronic computing in the 1950s. Because such computers were enormously costly, batch processing was the only economically viable way to use them. In those days, interactive sessions with either text-based computer terminal interfaces or graphical user interfaces were not widespread. Initially, computers were not even capable of having multiple programs loaded into main memory.
Batch processing has grown beyond its mainframe origins, and is now frequently used in
UNIX environments, where the cron and at facilities allow for the scheduling of complex job scripts. Similarly, Microsoft DOS and Windows systems refer to their command-scripting files as batch files, and Windows has a job scheduler.
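For example, unattended jobs can be scheduled with cron and at along the following lines (the script paths are hypothetical):

```
# crontab entry (edit with `crontab -e`): minute hour day month weekday
30 2 * * *  /usr/local/bin/nightly-backup.sh    # every night at 02:30

# one-off job with at: run a script at the next 17:00
echo /usr/local/bin/build-report.sh | at 17:00
```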
A popular computerized batch processing procedure is printing. This normally involves the operator selecting the documents to be printed and indicating to the batch printing software when and where they should be output. Batch processing is also used for efficient bulk database updates and automated
transaction processing, as contrasted with interactive online transaction processing (OLTP) applications.





Batch Computing
In the batch era, computing power was extremely scarce and expensive. The largest computers of that time commanded fewer logic cycles per second than a typical toaster or microwave oven does today, and quite a bit fewer than today's cars, digital watches, or cellphones. User interfaces were, accordingly, rudimentary. Users had to accommodate computers rather than the other way around; user interfaces were considered overhead, and software was designed to keep the processor at maximum utilization with as little overhead as possible.
The input side of the user interfaces for batch machines were mainly punched cards or equivalent media like paper tape. The output side added line printers to these media. With the limited exception of the system operator's console, human beings did not interact with batch machines in real time at all.
Submitting a job to a batch machine involved, first, preparing a deck of punched cards describing a program and a dataset. Punching the program cards wasn't done on the computer itself, but on specialized typewriter-like machines that were notoriously balky, unforgiving, and prone to mechanical failure. The software interface was similarly unforgiving, with very strict syntaxes meant to be parsed by the smallest possible compilers and interpreters.


Figure 2.1. IBM 029 card punch.




Once the cards were punched, one would drop them in a job queue and wait. Eventually, operators would feed the deck to the computer, perhaps mounting magnetic tapes to supply another dataset or helper software. The job would generate a printout, containing final results or (all too often) an abort notice with an attached error log. Successful runs might also write a result on magnetic tape or generate some data cards to be used in a later computation.
The turnaround time for a single job often spanned entire days. If one were very lucky, it might be hours; real-time response was unheard of. But there were worse fates than the card queue; some computers actually required an even more tedious and error-prone process of toggling in programs in binary code using console switches. The very earliest machines actually had to be partly rewired to incorporate program logic into themselves, using devices known as plugboards.
Early batch systems gave the currently running job the entire computer; program decks and tapes had to include what we would now think of as operating-system code to talk to I/O devices and do whatever other housekeeping was needed. Midway through the batch period, after 1957, various groups began to experiment with so-called “load-and-go” systems. These used a monitor program which was always resident on the computer. Programs could call the monitor for services. Another function of the monitor was to do better error checking on submitted jobs, catching errors earlier and more intelligently and generating more useful feedback to the users. Thus, monitors represented a first step towards both operating systems and explicitly designed user interfaces.

A batch system is an efficient way to make use of the available resources on cds, and it does not require you to watch over the execution of a job. The jobs you submit to the batch system are placed on a queue and are run when the resources become available.
The Base Operating System on cds provides a very simple batch system. It only has one queue and provides no mechanism for looking at the queue or the progress of your jobs.


Stacked Job Batch Systems (mid 1950s - mid 1960s)
A batch system is one in which jobs are bundled together with the instructions necessary to allow them to be processed without intervention.
Often jobs of a similar nature can be bundled together to further increase economy.
The monitor is system software that is responsible for interpreting and carrying out the instructions in the batch jobs. When the monitor started a job, it handed over control of the entire computer to the job, which then controlled the computer until it finished.
A sample batch job might look like:
$JOB user_spec ; identify the user for accounting purposes
... ; program and data cards for the job
$EOJ ; end of job
Often magnetic tapes and drums were used to store intermediate data and compiled programs.
Advantages of batch systems
move much of the work of the operator to the computer
increased performance, since it was possible for a job to start as soon as the previous job finished
Disadvantages
turn-around time can be large from the user's standpoint
more difficult to debug programs
due to the lack of a protection scheme, one batch job can affect pending jobs (read too many cards, etc.)
a job could corrupt the monitor, thus affecting pending jobs
a job could enter an infinite loop
As mentioned above, one of the major shortcomings of early batch systems was that there was no protection scheme to prevent one job from adversely affecting other jobs.
The solution to this was a simple protection scheme, where certain regions of memory (e.g. where the monitor resides) were made off-limits to user programs. This prevented user programs from corrupting the monitor.
To keep user programs from reading too many (or not enough) cards, the hardware was changed to allow the computer to operate in one of two modes: one for the monitor and one for the user programs. IO could only be performed in monitor mode, so that IO requests from the user programs were passed to the monitor. In this way, the monitor could keep a job from reading past its own $EOJ card.
To prevent an infinite loop, a timer was added to the system and the $JOB card was modified so that a maximum execution time for the job was passed to the monitor. The computer would interrupt the job and return control to the monitor when this time was exceeded.
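The timer idea can be sketched in modern terms. This is a toy, Unix-only model using SIGALRM, not a description of the original hardware:

```python
import signal  # Unix-only: SIGALRM is unavailable on Windows

class TimeLimitExceeded(Exception):
    """Raised when a job runs past its maximum execution time."""

def run_with_limit(job, seconds):
    """Run `job`, but return control to the caller (the 'monitor')
    if it exceeds the time limit passed on its '$JOB card'."""
    def on_alarm(signum, frame):
        raise TimeLimitExceeded
    old = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)                  # arm the interval timer
    try:
        return job()
    finally:
        signal.alarm(0)                    # disarm the timer
        signal.signal(signal.SIGALRM, old)
```

A well-behaved job simply returns its result; a runaway job (here, one that sleeps too long) is interrupted and the monitor regains control.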
Spooling Batch Systems (mid 1960s - late 1970s)
One difficulty with simple batch systems is that the computer still needs to read the deck of cards before it can begin to execute the job. This means that the CPU is idle (or nearly so) during these relatively slow operations.
Since it is faster to read from a magnetic tape than from a deck of cards, it became common for computer centers to have one or more less powerful computers in addition to their main computer. The smaller computers were used to read decks of cards onto a tape, so that the tape would contain many batch jobs. This tape was then loaded on the main computer and the jobs on the tape were executed. The output from the jobs would be written to another tape, which would then be removed and loaded on a less powerful computer to produce any hardcopy or other desired output.
It was a logical extension of the timer idea described above to have a timer that would only let jobs execute for a short time before interrupting them so that the monitor could start an IO operation. Since the IO operation could proceed while the CPU was crunching on a user program, little degradation in performance was noticed.
Since the computer could now perform IO in parallel with computation, it became possible to have the computer read a deck of cards to a tape, drum or disk, and to write output to a tape or printer while it was computing. This process is called SPOOLing: Simultaneous Peripheral Operation OnLine.
Spooling batch systems were the first and are the simplest of the multiprogramming systems.
One advantage of spooling batch systems was that the output from jobs was available as soon as the job completed, rather than only after all jobs in the current cycle were finished.
Multiprogramming Systems (1960s - present)
As machines with more and more memory became available, it was possible to extend the idea of multiprogramming (or multiprocessing) as used in spooling batch systems to create systems that would load several jobs into memory at once and cycle through them in some order, working on each one for a specified period of time. At this point the monitor is growing to the point where it begins to resemble a modern operating system.
As a simple, yet common example, consider a machine that can run two jobs at once. Further, suppose that one job is IO intensive and that the other is CPU intensive. One way for the monitor to allocate CPU time between these jobs would be to divide time equally between them. However, the CPU would be idle much of the time the IO bound process was executing.
A good solution in this case is to allow the CPU bound process (the background job) to execute until the IO bound process (the foreground job) needs some CPU time, at which point the monitor permits it to run. Presumably it will soon need to do some IO and the monitor can return the CPU to the background job.
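This policy can be modeled in a few lines (purely illustrative; real monitors made this decision on interrupts, not per tick):

```python
def schedule(fg_needs_cpu):
    """One CPU, two jobs: the IO-bound foreground job gets the CPU
    whenever it asks for it; otherwise the CPU-bound background job
    runs, so the CPU is never left idle.

    `fg_needs_cpu` is a per-tick list of booleans; the result is
    which job ran at each tick.
    """
    return ["foreground" if needs else "background" for needs in fg_needs_cpu]

# The foreground job briefly needs the CPU after each IO completes:
print(schedule([True, False, False, True, False]))
# -> ['foreground', 'background', 'background', 'foreground', 'background']
```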
Timesharing Systems (1970s - present)
Back in the days of the "bare" computers without any operating system to speak of, the programmer had complete access to the machine. As hardware and software were developed to create monitors, simple and spooling batch systems, and finally multiprogrammed systems, the separation between the user and the computer became more and more pronounced.
Users, and programmers in particular, longed to be able to "get to the machine" without having to go through the batch process. In the 1970s, and especially in the 1980s, this became possible in two different ways.
The first involved timesharing or timeslicing. The idea of multiprogramming was extended to allow multiple terminals to be connected to the computer, with each in-use terminal associated with one or more jobs on the computer. The operating system is responsible for switching between the jobs, now often called processes, in a way that favors user interaction. If the context switches occur quickly enough, the user has the impression that he or she has direct access to the computer.
Interactive processes are given a higher priority so that when IO is requested (e.g. a key is pressed), the associated process is quickly given control of the CPU so that it can handle the input. This is usually done through the use of an interrupt that causes the computer to realize that an IO event has occurred.
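The core switching loop can be sketched as round-robin scheduling. This is a simplified model; real timesharing schedulers also handle priorities and interrupts:

```python
from collections import deque

def round_robin(work, quantum):
    """Round-robin timesharing sketch.

    `work` maps process name to remaining time units; each process
    runs for at most `quantum` units per turn, then goes to the back
    of the queue. Returns the order in which processes complete.
    """
    queue = deque(work.items())
    finished = []
    while queue:
        name, remaining = queue.popleft()
        remaining -= quantum           # the process gets one time slice
        if remaining > 0:
            queue.append((name, remaining))   # context switch: back of queue
        else:
            finished.append(name)
    return finished

print(round_robin({"editor": 2, "compiler": 3, "shell": 1}, 1))
# -> ['shell', 'editor', 'compiler']
```

If the quantum is small and the switches are fast, each user at a terminal perceives a dedicated machine.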
It should be mentioned that there are several different types of time sharing systems. One type is represented by computers like our VAX/VMS computers and UNIX workstations. In these computers entire processes are in memory (albeit virtual memory) and the computer switches between executing code in each of them. In other types of systems, such as airline reservation systems, a single application may actually do much of the timesharing between terminals. This way there does not need to be a different running program associated with each terminal.
Personal Computers
The second way that programmers and users got back to the machine was the advent of personal computers around 1980. Finally, computers became small enough and inexpensive enough that an individual could own one, and hence have complete access to it.
Real-Time, Multiprocessor, and Distributed/Networked Systems
A real-time computer is one that executes programs that are guaranteed to have an upper bound on the time taken for the tasks they carry out. Usually it is desired that the upper bound be very small. Examples include guided missile systems and medical monitoring equipment. The operating system on real-time computers is severely constrained by the timing requirements.
Dedicated computers are special-purpose computers used to perform only one or a few tasks. Often these are real-time computers, and include applications such as the guided missile mentioned above and the computer in modern cars that controls the fuel injection system.
A multiprocessor computer is one with more than one CPU. The category of multiprocessor computers can be divided into the following sub-categories:
shared memory multiprocessors have multiple CPUs, all with access to the same memory. Communication between the processors is easy to implement, but care must be taken so that memory accesses are synchronized.
distributed memory multiprocessors also have multiple CPUs, but each CPU has its own associated memory. Here, memory access synchronization is not a problem, but communication between the processors is often slow and complicated.
Related to multiprocessors are the following:
networked systems consist of multiple computers that are networked together, usually with a common operating system and shared resources. Users, however, are aware of the different computers that make up the system.
distributed systems also consist of multiple computers but differ from networked systems in that the multiple computers are transparent to the user. Often there are redundant resources and a sharing of the workload among the different computers, but this is all transparent to the user.

Portable Batch System

Portable Batch System (or simply PBS) is the name of computer software that performs job scheduling. Its primary task is to allocate computational tasks, i.e., batch jobs, among the available computing resources. It is often used in conjunction with UNIX cluster environments. Several spin-offs of this software have appeared under various names; however, the overall architecture and command-line interface remain essentially the same.
PBS is one of the job scheduler mechanisms supported by GRAM (
Grid Resource Allocation Manager), a component of the Globus Toolkit.
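A PBS job is typically described by a shell script annotated with #PBS directives and submitted with qsub. A minimal sketch follows; the job name, resource values, and paths are illustrative, so consult your site's PBS documentation for the exact resource syntax:

```
#!/bin/sh
#PBS -N demo_job                 # job name
#PBS -l nodes=1:ppn=1            # one node, one processor
#PBS -l walltime=00:10:00        # maximum run time
#PBS -o demo_job.out             # where standard output goes

cd "$PBS_O_WORKDIR"              # directory qsub was invoked from
echo "running on $(hostname)"
```

The script would be submitted with `qsub demo_job.sh` and monitored with `qstat`.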

History and versions
PBS was originally developed by MRJ for
NASA in the early to mid-1990s. MRJ was taken over by Veridian, which was later taken over by Altair Engineering, which currently distributes PBS Pro commercially.
The following versions of Portable Batch System are currently available:
OpenPBS — unsupported original open source version


TORQUE — a fork of OpenPBS 2.3.12.[1] Paid support is available through Cluster Resources.
PBS Professional (PBS Pro) — a version maintained and sold commercially by Altair Engineering


VAX

"VAX" was originally an acronym for Virtual Address eXtension, both because the VAX was seen as a 32-bit extension of the older 16-bit PDP-11 and because it was (after Prime Computer) an early adopter of virtual memory to manage this larger address space. Early versions of the VAX processor implemented a "compatibility mode" that emulated many of the PDP-11's instructions, and were in fact called VAX-11 to highlight this compatibility and the fact that VAX-11 was an outgrowth of the PDP-11 family. Later versions offloaded the compatibility mode and some of the less used CISC instructions to emulation in the operating system software. The plural form of VAX is usually VAXes, but VAXen is also hear.
Operating systems
The "native" VAX
operating system is DEC's VAX/VMS (renamed OpenVMS in 1991 or 1992 when it was ported to DEC Alpha, "branded" by the X/Open consortium, and modified to comply with POSIX standards[1]). The VAX architecture and VMS operating system were "engineered concurrently" to take maximum advantage of each other, as was the initial implementation of the VAXcluster facility. Other VAX operating systems have included various releases of BSD UNIX up to 4.3BSD, Ultrix-32 and VAXeln. More recently, NetBSD and OpenBSD support various VAX models, and some work has been done on porting Linux to the VAX architecture.




History
The first VAX model sold was the VAX-11/780, which was introduced on
October 25, 1977 at the Digital Equipment Corporation's Annual Meeting of Shareholders[1]. The architect of this model was Bill Strecker. Many different models with different prices, performance levels, and capacities were subsequently created. VAX superminis were very popular in the early 1980s.
For a while the VAX-11/780 was used as a baseline in
CPU benchmarks because its speed was about one MIPS. Ironically, though, the actual number of instructions executed in one second was about 500,000. One VAX MIPS was the speed of a VAX-11/780; a computer performing at 27 VAX MIPS would run the same program roughly 27 times as fast as the VAX-11/780. Within the Digital community, the term VUP (VAX Unit of Performance) was more common, because MIPS do not compare well across different architectures. The related term cluster VUPs was informally used to describe the aggregate performance of a VAXcluster. The performance of the VAX-11/780 still serves as the baseline metric in the BRL-CAD Benchmark, a performance analysis suite included in the BRL-CAD solid modeling software distribution.
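The VUP arithmetic is simple enough to state directly (an illustrative calculation, not an official DEC formula):

```python
VAX_11_780_VUPS = 1.0   # the baseline machine defines 1 VUP

def speedup(vups):
    """A machine rated at `vups` runs the same program roughly
    `vups` times as fast as a VAX-11/780."""
    return vups / VAX_11_780_VUPS

def relative_runtime(vups):
    """Runtime relative to the VAX-11/780 baseline."""
    return 1.0 / speedup(vups)

print(speedup(27))            # -> 27.0
print(relative_runtime(27))   # a 27-VUP machine needs about 1/27 the time
```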

VAX 8350 front view with cover removed.
The VAX went through many different implementations. The original VAX was implemented in
TTL and filled more than one rack for a single CPU. CPU implementations that consisted of multiple ECL gate array or macrocell array chips included the 8600, 8800 superminis and finally the 9000 mainframe class machines. CPU implementations that consisted of multiple MOSFET custom chips included the 8100 and 8200 class machines.
The
MicroVAX I represented a major transition within the VAX family. At the time of its design, it was not yet possible to implement the full VAX architecture as a single VLSI chip (or even a few VLSI chips as was later done with the V-11 CPU of the VAX 8200/8300). Instead, the MicroVAX I was the first VAX implementation to move most of the complexity of the VAX instruction set into emulation software, preserving just the core instructions in hardware. This new partitioning substantially reduced the amount of microcode required and was referred to as the "MicroVAX" architecture. In the MicroVAX I, the ALU and registers were implemented as a single gate-array chip while the rest of the machine control was conventional logic.
A full VLSI (
microprocessor) implementation of the MicroVAX architecture then arrived with the MicroVAX II's 78032 CPU and 78132 FPU. This was followed by the V-11, CVAX, SOC ("System On Chip", a single-chip CVAX), Rigel, Mariah and NVAX implementations. The VAX microprocessors extended the architecture to inexpensive workstations and later also supplanted the high-end VAX models. This wide range of platforms (mainframe to workstation) using one architecture was unique in the computer industry at that time.
The VAX architecture was eventually superseded by
RISC technology. In 1989 DEC introduced a range of workstations based on processors from MIPS Technologies and running Ultrix. In 1992 DEC introduced their own RISC processor, the Alpha (originally named Alpha AXP), a high performance 64-bit architecture capable of running OpenVMS.
In August 2000, Compaq announced that the remaining VAX models would be discontinued by the end of the year
[2]. By 2005 all manufacturing of VAX computers had ceased, but old systems remain in widespread use.
The SRI CHARON-VAX and
SIMH software-based VAX emulators remain available.
A port of Linux to the VAX Architecture.
Welcome to the Linux/VAX porting project. This is a port of the Linux kernel to the Digital Equipment Corporation (DEC) (now owned by Compaq/Hewlett-Packard) VAX and MicroVAX minicomputer systems. VAX (Virtual Address eXtension) computers were designed and built from the mid-1970s through to the retirement of the line in 1999/2000.
Here you will find all you need to get the Linux kernel up and running on a VAX computer. We don't support many of the models yet, and it is very early days - we don't (at the time of writing) have dynamic libraries yet! But if you have one of the systems listed in the FAQ, you can run a limited version of Linux on it. See the Documentation link for details on the installation process. However, it's very rough at the moment, and you need to be very comfortable with the command line and the root prompt to be successful.
If you don't like the idea of that, then we suggest that you visit our friends over at the NetBSD project at
http://www.netbsd.org/ who make a free UNIX that runs on a wide variety of VAX hardware. NetBSD has a much easier installation, not to mention fewer bugs (at the moment...).
Still here?
You might like to join the Mailing list if you are serious about this...
The mailing list is somewhat imaginatively titled linux-vax. To subscribe, please visit
http://www.pergamentum.com/mailman/listinfo/linux-vax. This is the Mailman interface for our list. Because the list used to live on a different machine, there are currently two archives: the old one at http://solar.physics.montana.edu/hypermail/linux-vax/ and the new one, which belongs to the mailing list mentioned above, at http://www.pergamentum.com/pipermail/linux-vax/. They'll be merged soon, though.
Additionally, you could subscribe to the CVS checkin mailing list for
Kernel (archive), toolchain (archive) and userland tools (archive).
This project would like to acknowledge the support of the following companies or organisations:

Friday, November 16, 2007

Lynx Browser


Lynx is a World Wide Web browser, just like Internet Explorer and Netscape. What makes it different is that it is a text-only browser - it does not display the graphics on web pages. Lynx first started life as a UNIX application, written by the University of Kansas as part of their campus-wide information system. In time, it became a gopher client, then a WWW browser. Lynx was released to the public under the terms of the GNU General Public License of the Free Software Foundation, Inc. It is constantly being improved by a group of developers (the Lynx-Dev Group).

Two Types of Browser
Graphical browser- displays web pages with images, layout and other graphical elements; Internet Explorer and Netscape are examples.
Text browser- text-based browsers simply display the contents of a page as text. They do not support images, JavaScript, Java, CSS (Cascading Style Sheets), plug-ins or dynamic HTML. As such, they are extremely fast when browsing the web. They are also excellent tools when testing your site for accessibility.
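For example, Lynx can be used interactively or non-interactively from the command line (the URL is a placeholder):

```
lynx http://www.example.org/          # browse interactively in the terminal
lynx -dump http://www.example.org/    # render the page as plain text to stdout
```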