Post 613 – by Gautam Shah


A document is a self-sufficient, unitized set of information. It is a meaningful entity because its contents have a logical order and interrelationship. The word document originates from the Latin documentum, meaning lesson or teaching. A document serves preservation (recording or storage) and representation. Documents become reliable primarily through their date identification and secondarily through their content. Documents offer evidence of intentions and reports of activities.


CERN datacenter with World Wide Web www and mail servers 2010 > Wikipedia image by Hugovanmeijeren

Documents have many forms: tablets (clay, stone, wood, etc.), inscriptions, scrolls, books, articles, reports, records, letters, movies, photograph albums, cassettes, disk drives and solid state devices. Documents are created for immediate communication or stored for future access.

Documents are formatted information. Here in-forming implies that a form is impressed onto a medium. A formatted expression (words, symbols, representational graphics or doodles) placed on a medium for the purpose of communication or storage is less likely to get lost with time. The forming media are physical, such as paper and magnetic tape, and the formatting tools are languages, images, graphics, metaphors, etc.


Storage of IBM Punched cards 1959 > Wikipedia image

The medium, as estate or space for storage, is costly or rare, and the effort required to record on it is extraordinary, so the information to be recorded or communicated is abridged through processing. With every process of expression, perception, recording, retrieval, etc., the content of a document may get corrupted. An information originator accessing their own records at some other time and place cannot revert to the original physical and mental state, and so cannot re-experience or re-establish the original. The communicated information manifests slightly differently, yet it remains a reliable ‘knowledge transmission process’.

Traditional documents have a linear or sequential arrangement of information. Access is generally sequential, or through preset strategies such as keywords, summaries, content lists, indices, etc. A card catalogue is a pre-sorted listing. Another method of facilitating access was to place subsections of a document on loose sheets held together as a folder by a thread (French: fil), wire or metal rod. Documents were identified by projecting tags, coloured edges or notched pages, as employed in telephone or address books and account ledgers.


Film Archive storage Flickr image by DR-Byen DRs Kulturarvsprojekt

Very large databases, such as police records, telephone directories and library records, are however difficult to access quickly through cards. Mechanical punched-card reader systems were used to read the information and reposition (sort) the cards accordingly. The language of punched and non-punched locations made information transmission not only faster and faultless, but also repeatable. Later, such systems allowed commands to be executed through information on punched cards.
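The text does not say how the cards were repositioned. As a hedged illustration only, sorting a deck one column at a time (the technique usually called radix sort) can be sketched in Python. The ten-pocket layout, the three-digit codes and the function name `sort_cards` are assumptions made for this example, not a description of any particular machine.

```python
def sort_cards(cards, columns):
    """Sort fixed-width numeric card codes by repeated single-column passes.

    Each pass drops every card into one of ten pockets according to the
    digit in one column, then restacks the pockets in order. Passes run
    from the rightmost column to the leftmost (least significant first).
    """
    deck = list(cards)
    for column in reversed(range(columns)):
        pockets = [[] for _ in range(10)]
        for card in deck:
            pockets[int(card[column])].append(card)
        # Restack pocket 0 first, keeping the order within each pocket.
        deck = [card for pocket in pockets for card in pocket]
    return deck


# Hypothetical three-digit codes, one per card.
deck = ["305", "012", "120", "007", "305", "121"]
print(sort_cards(deck, columns=3))
```

Because each pass keeps the existing order within a pocket, the same deck always comes out in the same sequence, which matches the point about repeatability.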


Card catalogues can contain such information, but with online processes these are replaced by databases that are digitally searchable > Wikipedia image by Tomwsulcer

Documents are stored at a place and in a manner where they can be accessed. Reports or documents are stored with many other similar documents. All storage arrangements involve some degree of classification.

FIRST, or the basic, classification is the order of arrival. By itself this carries little meaning, but for administrative handling it shows the order of arrival, what is new (and so the latest), and what is old (possibly redundant). For this purpose documents are either time-date stamped or given a sequential identifier (a chronological number: numeric, alphanumeric or alphabetical).
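The two options described above, a time-date stamp and a sequential identifier, can be sketched together in a few lines of Python. This is only an illustrative sketch: the `AccessionRegister` class, its field names and the `DOC` prefix are assumptions made for the example, not part of any real cataloguing system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count


@dataclass
class Entry:
    identifier: str      # sequential, alphanumeric identifier
    title: str
    received: datetime   # time-date stamp


@dataclass
class AccessionRegister:
    """Hypothetical register that records documents in order of arrival."""
    prefix: str = "DOC"
    entries: list = field(default_factory=list)
    _counter: count = field(default_factory=lambda: count(1), repr=False)

    def receive(self, title, when=None):
        # Stamp the document with its arrival time (now, unless given).
        stamp = when or datetime.now(timezone.utc)
        identifier = f"{self.prefix}-{next(self._counter):05d}"
        entry = Entry(identifier, title, stamp)
        self.entries.append(entry)
        return entry

    def latest(self):
        # The most recent arrival is simply the last entry.
        return self.entries[-1] if self.entries else None


register = AccessionRegister()
first = register.receive("Annual report")
second = register.receive("Site survey")
print(first.identifier, second.identifier)  # DOC-00001 DOC-00002
```

The identifier alone says nothing about content, but it is enough to tell which document is newer.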

SECOND classification, for administrative relevance, is the size and nature of the document (book size, number of pages, bytes or MBs of data).

Document storage

THIRD relates to the name of the document. Documents have a primary title as provided by the author (or the publisher), which may have personal relevance, and so can additionally have a ‘technical title’ meant to explain the content or theme of the document. These additional titles can be longer. Digital documents such as computer files or internet file protocols have abridged (or expanded) titles which include search characters, numbers, words or keys.

Many documents have identical titles, and these can be distinguished by various appendages such as the author’s name, the publisher’s name, or the date of publication or of arrival in the storage system. Computer file systems and internet site addresses use extension codes for the same purpose.
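A minimal sketch of this idea in Python, assuming each record carries a title, an author and a year: appendages are added only when two or more documents share a title. The field names, the label format and the sample records are invented for illustration.

```python
from collections import Counter


def distinguishing_labels(records):
    """Return one label per record, adding appendages only when titles clash.

    Each record is a dict with 'title', 'author' and 'year' keys
    (hypothetical field names chosen for this sketch).
    """
    title_counts = Counter(r["title"] for r in records)
    labels = []
    for r in records:
        label = r["title"]
        if title_counts[label] > 1:
            # Identical titles: append the author's name and the year.
            label = f"{label} ({r['author']}, {r['year']})"
        labels.append(label)
    return labels


# Invented sample records, not real documents.
records = [
    {"title": "Annual Report", "author": "A. Rao", "year": 2009},
    {"title": "Annual Report", "author": "B. Mehta", "year": 2010},
    {"title": "Site Survey", "author": "A. Rao", "year": 2010},
]
for label in distinguishing_labels(records):
    print(label)
```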

FOURTH classification concerns the titles provided by the author, librarian or storage handler. These are usually of two to three types or tiers. The main title broadly describes the contents and sometimes the purpose of the report. It is usually more than one word long, and often runs to two or three lines or sentences. The main title distinguishes the document from other reports dealing with similar or parallel subjects. The main title of a report should be specific, never general.


Markham Stouffville Hospital Library > Wikipedia image by Raysonho @ Open Grid Scheduler / Grid Engine

For example, ‘Study of lighting in Interiors’ is a non-specific title, because lighting in an interior could be natural, artificial, mixed, direct, reflected, borrowed, even, spot, day, night, evening, purpose-related or general illumination. Interiors could be residential, public, commercial or industrial. Unless the report covers all of these, a more specific title would be ‘Study of daytime artificial lighting needs in industrial interiors’, or ‘Study of lighting in terms of its effect on the perception of heights in interior spaces’.

FIFTH classification range is the identity of the author (or editor, or compiler). If the author is well known, a certain level of content and quality can be presumed. For this reason a brief note on the author, or reference links to other works, is included.

SIXTH classification presents the document’s relevance to other fields of knowledge. The contents of documents often refer to two or more distinct branches of knowledge, and authors often fail to mention such inclusions in the main or other tiers of the title. These classifications may include an abstract, a brief description, an excerpt or a summary. Such short descriptions are also used for primary dissemination of information, and function as a mini document.

SEVENTH classification range derives from the parts of the document. An index and a table of contents show the sequence, size and placement of the sub-parts of the document. The section, chapter and paragraph headings, and other media presentations (photographs, illustrations, audio-video clips, links to other chapters, references to other documents, internet links to other resources), provide some idea about the contents.

Topics that are dealt with at lower levels, i.e. at sentence or paragraph level, may not be adequately covered by these headings. A glossary of key words or terms provides an ideal reference for such sub-topics. Internet search engines and research institutions draw out such keywords and add them to their master database of terms. The database provides reference not only to the location of terms but also to their context.
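A database that records both the location of a term and its context can be sketched as a small keyword index. The Python below is an illustrative sketch, not how any particular search engine works: the function name `build_keyword_index`, the `width` parameter and the sample paragraphs are all assumptions made for the example.

```python
import re
from collections import defaultdict


def build_keyword_index(paragraphs, keywords, width=30):
    """Map each keyword to a list of (paragraph number, context snippet)."""
    index = defaultdict(list)
    for number, text in enumerate(paragraphs, start=1):
        for keyword in keywords:
            # Match whole words only, ignoring upper and lower case.
            pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                # Keep up to `width` characters on either side as context.
                start = max(match.start() - width, 0)
                end = min(match.end() + width, len(text))
                index[keyword].append((number, text[start:end].strip()))
    return dict(index)


# Invented sample text for the sketch.
paragraphs = [
    "Daylight enters the workshop through roof glazing.",
    "At night the workshop relies on artificial lighting.",
]
index = build_keyword_index(paragraphs, ["workshop", "lighting"])
for term, hits in index.items():
    for number, context in hits:
        print(f"{term}: paragraph {number}: ...{context}...")
```

Each hit gives the paragraph where the term occurs (its location) and the surrounding words (its context), which is the pairing the paragraph above describes.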


The format of a document has completely changed with modern-day electronic multitasking and multimedia-capable systems. Terms like index, glossary, list and appendix were indicative of the physical placement of various categories of information. Once, these physical locations were difficult to access. Digital media allow interactive presentation formats in audio, video, virtual reality, etc. Hypertext has become a tool for interactive access. Documents in other storage devices located at different geographical locations are accessible.