Fixing Dirty Data
Topics: fix dirty data | repair bad data | fix corrupted data
Errors in databases can be the result of human error in entering the data,
the merging of two databases, a lack of company-wide or industry-wide data coding
standards, or old systems that contain inaccurate or outdated data.
Northwest Database Services offers several data scrubbing services, ranging from
full renovation and merge/purge to simple USPS CASS, ACS and NCOA certified coding.
Data Scrubbing ensures that your data is accurate, consistent and correctly
formatted. In other words, the process of data scrubbing (aka data cleaning
or data cleansing) gets your data ready to go to work for you.
Find out more about how data scrubbing ensures accurate data.
A thorough data cleaning typically calls for any or all of the following
data scrubbing operations (listed in their typical sequential order):
-
File Transfer: The first order of business is to get
a file (or files) to us here at Northwest Database Services so that
we can analyze and report what we believe to be the state of your data.
At that point we can give you an estimate of what a thorough data
cleansing will cost.
File exchange is easy. If your file is small enough (2 MB or less), just
email it. If your file(s) are larger than 2 MB but smaller than 30 MB, we'll
create a secure folder to which only you and Northwest Database Services will
have access; uploading your data is as easy as clicking a button and logging
in. If your file(s) exceed 30 MB, we will set you up with your own FTP site,
send you free, simple-to-use FTP transfer software (FileZilla) and walk you
through the connection process if you're new to this.
-
Data Exchange: We can also help you import data from multiple
disparate file and data formats into one common file or table, ready for
data cleansing, reporting and analysis. Your file should be in one of the
standard industry database/mailing list file formats so that other data
contractors and personnel, such as your mailing house or IT department
staff, can work with it easily.
Standard formatting facilitates data scrubbing processes, whether yours or ours.
Delivering your data in one of the following formats makes data cleaning far
easier, and therefore less costly. Common database file formats include:
Microsoft Access (.MDB), dBase III, FoxPro and Alpha Five (.DBF). File formats
such as Excel (.XLS) and fixed-width and delimited ASCII files (.TXT, .CSV, .ASC)
are typically used only for data exchange.
When it's time to send your newly scrubbed data, we would be happy to output
your data in any file format and/or layout you require.
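As a rough illustration of bringing disparate delimited files into one common layout, here is a minimal Python sketch. The common field list, the column aliases and the sample rows are all hypothetical, chosen only for this example; they are not the layout or tooling Northwest Database Services uses.

```python
import csv
import io

# Hypothetical common layout and column aliases, for illustration only.
COMMON_FIELDS = ["name", "address", "city", "state", "zip"]
COLUMN_ALIASES = {
    "full name": "name", "name": "name",
    "street": "address", "address": "address", "addr1": "address",
    "city": "city", "town": "city",
    "st": "state", "state": "state",
    "zip": "zip", "zipcode": "zip", "postal code": "zip",
}

def read_delimited(text, delimiter=","):
    """Read delimited text and return rows keyed by the common field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for raw in reader:
        row = dict.fromkeys(COMMON_FIELDS, "")
        for column, value in raw.items():
            # Columns with no known alias are ignored rather than guessed at.
            target = COLUMN_ALIASES.get((column or "").strip().lower())
            if target:
                row[target] = (value or "").strip()
        rows.append(row)
    return rows

def merge_sources(sources):
    """Combine several (text, delimiter) sources into one list of rows."""
    merged = []
    for text, delimiter in sources:
        merged.extend(read_delimited(text, delimiter))
    return merged

# Two made-up sources: one comma-delimited, one tab-delimited.
csv_text = "Full Name,Street,Town,ST,ZipCode\nAnn Lee,12 Oak St,Salem,OR,97301\n"
tab_text = "name\taddr1\tcity\tstate\tzip\nBob Ray\t9 Elm Ave\tBend\tOR\t97701\n"
merged = merge_sources([(csv_text, ","), (tab_text, "\t")])
```

After the merge, every row carries the same five field names regardless of what the source file called them, which is what makes the later scrubbing steps straightforward.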
-
Normalization: Data normalization (alignment) finds data that has
ended up in the wrong field and moves it back to the field where it belongs. Once
this operation has been run, all addresses are in the appropriate address fields,
names in their proper columns, etc. Even though you may be printing out labels
that look OK, sorting, querying, coding and reporting will be inaccurate and
misleading if your data isn't consistently found where you and your database look for it.
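To make the idea of alignment concrete, the sketch below handles one common case: a state and ZIP code that were typed into the city field. The field names and the two patterns are assumptions made for this example; real alignment work covers many more cases and usually needs manual review of anything ambiguous.

```python
import re

# Assumed patterns: a ZIP (or ZIP+4) or a two-letter state code at the end of a value.
ZIP_AT_END = re.compile(r"[,\s]+(\d{5}(?:-\d{4})?)$")
STATE_AT_END = re.compile(r"[,\s]+([A-Z]{2})$")

def realign(record):
    """Move a trailing state and ZIP out of the city field into their own fields."""
    fixed = dict(record)
    city = fixed.get("city", "").strip()

    # Only move the ZIP if the zip field is empty, so existing data is never overwritten.
    match = ZIP_AT_END.search(city)
    if match and not fixed.get("zip", "").strip():
        fixed["zip"] = match.group(1)
        city = city[:match.start()]

    # Same rule for the state code.
    match = STATE_AT_END.search(city)
    if match and not fixed.get("state", "").strip():
        fixed["state"] = match.group(1)
        city = city[:match.start()]

    fixed["city"] = city.strip(" ,")
    return fixed

misaligned = {"city": "Salem, OR 97301", "state": "", "zip": ""}
aligned = realign(misaligned)
# aligned == {"city": "Salem", "state": "OR", "zip": "97301"}
```

A record that is already aligned passes through unchanged, so the function can be run safely over a whole file.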
-
Correction: It's sad but true: data often contain incorrect or
missing entries. Based on the availability of standardized data
lexicons (databases to which your data can be matched and from which, when matched,
your data can be corrected and/or updated), we can almost always improve the
accuracy and completeness of your data. In many cases, such as with the USPS
national address database (used for CASS, the Coding Accuracy Support System) and
the NCOA (National Change Of Address) and ACS (Address Change Service) databases,
we can determine if your data are out of date and update them if possible. In
other cases, special enhancement (adding or updating area codes; adding SIC codes,
gender codes, US Census block and block group numbers, etc.) might be just what's
needed to optimize your data's potential for performance, reporting and prediction.
More about avoiding data errors on our
Data Entry Tips page.
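The match-and-correct idea behind a data lexicon can be sketched in a few lines of Python. The two-entry table below is only a stand-in for a real reference database, and using the ZIP code as the match key is an assumption made for this example; it is not how CASS or NCOA processing is actually performed.

```python
# Tiny stand-in lexicon keyed by 5-digit ZIP code (illustrative entries only).
ZIP_LEXICON = {
    "97301": ("Salem", "OR"),
    "97701": ("Bend", "OR"),
}

def correct_record(record, lexicon=ZIP_LEXICON):
    """Return (corrected_record, changed_fields), matching on the first 5 ZIP digits."""
    fixed = dict(record)
    changes = []
    match = lexicon.get(fixed.get("zip", "").strip()[:5])
    if match is None:
        # No match in the lexicon: leave the record untouched.
        return fixed, changes
    for field, value in zip(("city", "state"), match):
        if fixed.get(field, "").strip() != value:
            fixed[field] = value
            changes.append(field)
    return fixed, changes

# A misspelled city and a missing state are both repaired from the lexicon.
fixed, changes = correct_record(
    {"name": "Ann Lee", "city": "Salme", "state": "", "zip": "97301-1234"}
)
# fixed["city"] == "Salem", fixed["state"] == "OR", changes == ["city", "state"]
```

Returning the list of changed fields alongside the record makes it possible to report what was corrected, rather than silently overwriting the original values.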
-
Data Standardization and format: Once your data have been aligned,
we will standardize the way they are written in your file, based on industry
standards or on your specific business rules; the choice is yours. Standardized
data yields faster data cleansing procedures, too.
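As a small illustration of what standardization rules can look like, the following Python sketch applies two of them: upper-casing addresses with abbreviated street suffixes, and writing phone numbers in one format. Both rules, the short suffix table and the chosen phone format are assumptions for this example; your own business rules could just as easily call for something different.

```python
# Assumed suffix abbreviations; a real rule set would be far more complete.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD", "ROAD": "RD", "DRIVE": "DR"}

def standardize_address(address, suffixes=SUFFIXES):
    """Upper-case the address, drop periods and abbreviate a trailing street suffix."""
    words = address.upper().replace(".", "").split()
    if words and words[-1] in suffixes:
        words[-1] = suffixes[words[-1]]
    return " ".join(words)

def standardize_phone(phone):
    """Rewrite a 10-digit US phone number as (AAA) PPP-NNNN."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return phone  # leave anything unexpected for manual review
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

address = standardize_address("12 oak Street.")   # "12 OAK ST"
phone = standardize_phone("1-503-555-0100")       # "(503) 555-0100"
```

Values that do not fit a rule are returned unchanged, so nothing is lost when a record needs a human decision.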
-
De-duplication: Duplication of data plagues every database and mailing
list. Duplication is inevitable, keeps growing and significantly erodes
the quality of your data. You can slow its progress, but it's virtually impossible
to stop duplication of data completely. Find out how Data Deduplication can
improve your database.
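One simple way to spot duplicates is to compare records on a normalized match key, so that differences in case, spacing and punctuation do not hide a match. The Python sketch below assumes name, address and 5-digit ZIP make up that key, which is only one possible choice; it performs exact matching on the key and will not catch misspellings or nicknames.

```python
import re

def match_key(record):
    """Build a comparison key that ignores case, spacing and punctuation."""
    parts = (
        record.get("name", ""),
        record.get("address", ""),
        record.get("zip", "")[:5],  # compare on the 5-digit ZIP only
    )
    return tuple(re.sub(r"[^a-z0-9]", "", part.lower()) for part in parts)

def dedupe(records):
    """Return (unique_records, duplicates), keeping the first record for each key."""
    seen = {}
    duplicates = []
    for record in records:
        key = match_key(record)
        if key in seen:
            duplicates.append(record)
        else:
            seen[key] = record
    return list(seen.values()), duplicates

records = [
    {"name": "Ann Lee", "address": "12 Oak St", "zip": "97301"},
    {"name": "ANN  LEE.", "address": "12 oak st.", "zip": "97301-1234"},
    {"name": "Bob Ray", "address": "9 Elm Ave", "zip": "97701"},
]
unique, duplicates = dedupe(records)
# unique holds the first Ann Lee record and Bob Ray; duplicates holds the second Ann Lee.
```

Keeping the duplicates in a separate list, instead of discarding them, leaves room to review them or merge their information into the surviving record.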
Contact us for more Data Scrubbing information