Database Glossary – D
Useful Terms For Working With Data
Data acquisition includes extracting and filtering data from operational and/or legacy systems; consolidating it (merging separate and/or disparate tables into one master table); scrubbing the resulting data (restructuring records, translating field values to a common data dictionary, and checking data integrity and consistency); transforming the data (adding time stamps and dates, summarizing data, and deriving new fields); and loading the “clean” data into the warehouse database, as well as updating any warehouse indexes.
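The stages named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source rows, the STATE_CODES lookup, and the customer table are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3
from datetime import date

# Hypothetical rows extracted from two source systems with different conventions.
crm_rows = [{"name": "Acme Corp", "state": "calif."}, {"name": "Bolt Ltd", "state": "NY"}]
legacy_rows = [{"name": "Cogs Inc ", "state": "New York"}]

# Hypothetical common dictionary used to translate field values.
STATE_CODES = {"calif.": "CA", "ny": "NY", "new york": "NY"}

def scrub(row):
    """Translate the state to the common code; reject values not in the dictionary."""
    key = row["state"].strip().lower()
    if key not in STATE_CODES:
        raise ValueError(f"unknown state: {row['state']!r}")
    return {"name": row["name"].strip(), "state": STATE_CODES[key]}

def transform(row, load_date):
    """Add a load date and derive a new field from existing data."""
    return {**row, "loaded_on": load_date.isoformat(), "name_length": len(row["name"])}

# Consolidate: merge the separate extracts into one master list.
consolidated = crm_rows + legacy_rows
load_date = date(2024, 1, 1)  # fixed date so the example is reproducible
clean = [transform(scrub(r), load_date) for r in consolidated]

# Load the clean rows and update the warehouse index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, state TEXT, loaded_on TEXT, name_length INTEGER)")
conn.executemany("INSERT INTO customer VALUES (:name, :state, :loaded_on, :name_length)", clean)
conn.execute("CREATE INDEX idx_customer_state ON customer (state)")

states = [r[0] for r in conn.execute("SELECT state FROM customer ORDER BY name")]
print(states)
```

In a real pipeline each stage would usually be a separate, logged step, and rows that fail scrubbing would typically be quarantined for review rather than raising an error.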
The database stores metadata in an area called the data dictionary, which describes the tables, fields, indexes, constraints, and other related items that make up the database.
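As one concrete illustration, SQLite exposes its data dictionary through the sqlite_master schema table and the table_info pragma. The sketch below uses Python's standard sqlite3 module; the customer table and its index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

# sqlite_master lists the objects (tables, indexes, ...) that make up the database.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2], row[3]) for row in conn.execute("PRAGMA table_info(customer)")]

print(objects)
print(columns)
```

Other database systems expose the same kind of metadata through their own catalog views, so the query text here is specific to SQLite.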
Another term for data denormalization. In normalized data (not to be confused with database normalization), each column contains only values that pertain to the attribute stored there. An example of data drift is a single column, labeled e.g. “Recipient” or “Account”, that stores both company names and the names of individual contacts.
1) A subset of highly summarized data from the data warehouse designed to support the specific requirements of an organization.
2) A small data warehouse.
The process of transferring data between storage types, formats, or computer systems. Data migration is usually performed programmatically to achieve an automated migration, freeing people from tedious manual work. It is required when organizations or individuals change computer systems or upgrade to new systems, or when systems merge (such as when the organizations that use them undergo a merger or takeover). To achieve an effective data migration procedure, data on the old system is mapped to the new system, providing a design for data extraction and data loading. The design relates old data formats to the new system’s formats and requirements. Programmatic data migration may involve many phases, but at a minimum it includes data extraction, where data is read from the old system, and data loading, where data is written to the new system.
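A minimal sketch of the mapping, extraction, and loading steps follows. Everything named here is an assumption for illustration: the old cust table, the new customer table, the FIELD_MAP design, and the date formats.

```python
import sqlite3

# Hypothetical old system: terse column names and US-style dates.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE cust (cust_nm TEXT, signup TEXT)")
old.executemany(
    "INSERT INTO cust VALUES (?, ?)",
    [("Acme Corp ", "01/31/2020"), ("Bolt Ltd", "12/05/2021")],
)

# Hypothetical new system: descriptive column names and ISO dates.
new = sqlite3.connect(":memory:")
new.execute("CREATE TABLE customer (name TEXT, signup_date TEXT)")

def to_iso(mmddyyyy):
    """Convert MM/DD/YYYY to YYYY-MM-DD."""
    month, day, year = mmddyyyy.split("/")
    return f"{year}-{month}-{day}"

# The mapping design: old column -> (new column, conversion function).
FIELD_MAP = {"cust_nm": ("name", str.strip), "signup": ("signup_date", to_iso)}

# Extraction: read the mapped columns from the old system.
old_columns = list(FIELD_MAP)
rows = old.execute(f"SELECT {', '.join(old_columns)} FROM cust").fetchall()

# Conversion: apply each column's function to relate old formats to new ones.
converted = [
    tuple(FIELD_MAP[col][1](value) for col, value in zip(old_columns, row))
    for row in rows
]

# Loading: write the converted rows to the new system.
new_columns = [FIELD_MAP[col][0] for col in old_columns]
placeholders = ", ".join("?" for _ in new_columns)
new.executemany(
    f"INSERT INTO customer ({', '.join(new_columns)}) VALUES ({placeholders})",
    converted,
)

migrated = new.execute("SELECT name, signup_date FROM customer ORDER BY name").fetchall()
print(migrated)
```

Keeping the mapping in one table-like structure makes the design reviewable on its own, separately from the code that reads and writes the data.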
The logical data structures, including the operations and constraints, provided by the DBMS to process data effectively; also, the system used to represent the data (e.g. an ERD or the relational model).
Having the same data stored in more than one place in a database.
The extraction of data from disparate sources (mostly operational, some legacy), typically in different formats.
A source of data used by a database application. It may be a DBMS, a table, or a data file.
A logical relationship among data elements that is designed to support specific data manipulation functions (trees, lists, and tables).
Processing that changes the characteristics of data extracted from the operational system; integrates dissimilar data types; changes codes; and selectively calculates, summarizes, and reconciles disparate update cycles.
Every field in every table in a database must be declared as a specific type of data with defined parameters and limitations (e.g. numeric, character or text, date, logical, etc.), known as a data type.
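The sketch below shows declared types with parameters and limits in a hypothetical product table, using Python's sqlite3 module. SQLite enforces declared types only loosely by default, so the CHECK constraints here are an assumption added to make the limits explicit; most other database systems enforce the declared type directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        name     TEXT    NOT NULL CHECK (typeof(name) = 'text' AND length(name) <= 20),
        price    REAL    NOT NULL CHECK (typeof(price) = 'real' AND price >= 0),
        in_stock INTEGER NOT NULL CHECK (in_stock IN (0, 1))  -- logical field
    )
""")

# A row that matches every declared type and limit is accepted.
conn.execute("INSERT INTO product VALUES (?, ?, ?)", ("Widget", 9.99, 1))

# A text value in the numeric price field violates its CHECK and is rejected.
try:
    conn.execute("INSERT INTO product VALUES (?, ?, ?)", ("Gadget", "cheap", 1))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(rejected, row_count)
```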
1) A collection of all the data needed by a person or organization to perform their required functions.
2) A collection of related files or tables.
3) Any collection of data organized to answer queries.
4) [Informally,] a database management system.
Databases usually consist of both data and metadata [data about the database’s data]. When a database contains a description of its own structure, it is said to be self-describing. A database is integrated when it includes the relationships among data items as well as the data items themselves.
Database Administrator [DBA]
The person who is ultimately responsible for the functionality, integrity, and safety of the database.
That part of the DBMS that directly interacts with the database (part of the back-end).
Database Management System [DBMS]
Also called a database manager. An integrated collection of programs designed to enable people to design databases, enter and maintain data, and perform queries.
1) The person with primary responsibility for the design, construction, and maintenance of a database. 2) [Informally,] a database management system.
Short: A copy of transaction data specifically structured for query, analysis and reporting.
Long: The database warehouse, a single repository depicting a logical view of an enterprise’s data, accessible to developers and business users alike. Effective database warehousing requires frequent updates and impeccable data quality to ensure that business end users and decision makers are using the same data, at the same extraction level, as everyone else when they run queries and reports or formulate analyses.
See: Database Administrator.
The removal of redundant data by deleting duplicate records, so that only one copy of the data is kept.
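A minimal sketch of the idea, keeping the first copy of each record. The records, the dedupe helper, and the rule that two records match when their key fields are equal after trimming and lowercasing are all assumptions for illustration; real matching rules are usually more involved.

```python
def dedupe(records, key_fields):
    """Return records with duplicates removed, keeping the first copy of each."""
    seen = set()
    unique = []
    for record in records:
        # Assumed matching rule: key fields compared after trimming and lowercasing.
        key = tuple(record[field].strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical contact records; the second duplicates the first.
records = [
    {"name": "Acme Corp", "email": "info@acme.example"},
    {"name": "acme corp ", "email": "INFO@acme.example"},
    {"name": "Bolt Ltd", "email": "sales@bolt.example"},
]

unique_records = dedupe(records, key_fields=("name", "email"))
print([r["name"] for r in unique_records])
```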
See: Purge; Purging.
To place normalized data in a duplicate location, thus optimizing the performance of the system.
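As an illustration, the sketch below copies data from two normalized tables into one flat reporting table so that reads no longer need a join. The customer, orders, and order_report tables are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized form: each customer name is stored once and referenced by id.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Acme Corp"), (2, "Bolt Ltd")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 99.5), (12, 2, 40.0)],
)

# Denormalized copy: the customer name is duplicated onto every order row.
conn.execute("""
    CREATE TABLE order_report AS
    SELECT o.id AS order_id, c.name AS customer_name, o.total
    FROM orders AS o
    JOIN customer AS c ON c.id = o.customer_id
""")

report = conn.execute(
    "SELECT order_id, customer_name, total FROM order_report ORDER BY order_id"
).fetchall()
print(report)
```

The trade-off is that the duplicated names must be refreshed whenever the normalized source changes.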
To break down, or separate into parts, as to classify or analyze: e.g. to disaggregate census data according to household size.
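The census example can be sketched as follows. The household records and income figures are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical census records.
households = [
    {"size": 1, "income": 30000},
    {"size": 2, "income": 52000},
    {"size": 2, "income": 48000},
    {"size": 4, "income": 75000},
]

# Aggregate view: one figure for the whole population.
overall_mean = sum(h["income"] for h in households) / len(households)

# Disaggregated view: the same figure broken down by household size.
incomes_by_size = defaultdict(list)
for h in households:
    incomes_by_size[h["size"]].append(h["income"])

breakdown = {
    size: {"count": len(incomes), "mean_income": sum(incomes) / len(incomes)}
    for size, incomes in sorted(incomes_by_size.items())
}

print(overall_mean)
print(breakdown)
```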
Distributed Data Processing
A system in which the data processed is distributed across multiple servers.
A database in which resources are stored on more than one computer system, often at different physical locations.
A collection or range of all the possible values a field can contain. Although a field’s domain is typically finite, it may be infinite as well.
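One way to picture the difference is that a finite domain can be listed out, while an infinite one can only be described by a rule. Both domains below (WEEKDAY_DOMAIN and the quantity rule) are hypothetical examples.

```python
# Finite domain: every permitted value can be enumerated.
WEEKDAY_DOMAIN = frozenset({"Mon", "Tue", "Wed", "Thu", "Fri"})

def in_quantity_domain(value):
    """Infinite domain described by a rule: any non-negative whole number."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0

print("Sat" in WEEKDAY_DOMAIN)
print(in_quantity_domain(3), in_quantity_domain(-1))
```

In practice a database stores values in a fixed-size type, so even a conceptually infinite domain is bounded by what that type can hold.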
1) That part of the database management system that interfaces directly with the database.
2) A special kind of program for Windows that extends other programs’ abilities to communicate with external devices or file formats. In the case of databases, a driver would contain information about the format and structure of another manufacturer’s files, as well as how to access them either across a network connection or from a third party application.
See also: ODBC.