SQL Server query to check for normalization in row columns

If you have a table with duplicated rows, a common requirement is to try to normalize those rows by extracting distinct information from each row, placing it into another table, and relating them with an ID.

But how do you find if there are any differences between non-normalized fields?

To define terms, normalization means taking repeated data such as a name of a customer, and removing that from a data table (let’s say you imported it from Excel). You want to put the customer names into their own table with an ID, and then you can just store the ID in the Orders table.

But many times the data isn’t duplicated quite right because it’s been typed in many times.

Let’s say you want to check an imported table of orders in which the customer name was entered on each row representing line items in the single order.

If you want to find typos in the orders table for a particular order you can do something like this:

SELECT Order_ID, Count(Distinct Firstname), Count(Distinct Lastname)
FROM Orders
GROUP BY Order_ID
HAVING
    Count(Distinct Firstname) > 1 OR
    Count(Distinct Lastname) > 1

This will return all order IDs that have mismatching first or last names.

If you are doing this with an Access table, this will not work. This syntax and command is only available in SQL Server. You can do a Pass Through query though against SQL server if that is your backend to run and get the results of this query in Access.

Access JumpStart 2.0 | Blog

A Rapid Development Framework for Microsoft Access

SQL Server query to check for normalization in row columns

Submit a Comment Cancel reply

Recent Posts

Recent Comments

Quick Links

About

Contact