How to remove duplicate data in Excel

MS Excel spreadsheets are used to analyze the data they contain. Sometimes the data can be analyzed as is, but more often it needs some cleanup first. In this article, we will look at how to get rid of unnecessary duplicate data that complicates working with a file.

Search for duplicate values

Consider an example. A cosmetics store regularly receives price lists from suppliers in Excel format. For convenience, suppose that all suppliers use the same product names. Before placing an order, we want to determine which supplier offers the better price for each product.

To do this, we need to find matching records and compare their prices. We are not deleting duplicates in Excel yet; for now we only want to find identical products for a cost analysis. Conditional formatting is the tool for this. Select the column with the product names and go to the "Styles" group on the "Home" tab. The "Conditional Formatting" button opens a drop-down list of commands, of which we need the item "Highlight Cells Rules".


The rule we need is "Duplicate Values". In the window that opens, specify the color that will mark the repeating elements of the range. Once the cells are colored, you can filter by that color to show only them, sort them by product name, analyze prices, and remove duplicates in Excel.
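The same logic can be sketched outside Excel. The Python snippet below is only an illustration, not what Excel does internally: the sample rows, supplier names, and the helper functions `duplicated_names` and `cheapest_offers` are hypothetical, and names are compared exactly as typed. It marks product names that occur more than once and then picks the cheapest offer for each of them.

```python
from collections import Counter

# Hypothetical merged price list: (product name, supplier, price).
sample_rows = [
    ("Hand cream", "Supplier A", 4.50),
    ("Lip balm", "Supplier A", 2.10),
    ("Hand cream", "Supplier B", 4.20),
    ("Shampoo", "Supplier B", 6.00),
    ("Lip balm", "Supplier C", 2.30),
]


def duplicated_names(rows):
    """Return the set of product names that occur more than once."""
    counts = Counter(name for name, _, _ in rows)
    return {name for name, n in counts.items() if n > 1}


def cheapest_offers(rows):
    """For each duplicated product, keep the (supplier, price) pair with the lowest price."""
    dupes = duplicated_names(rows)
    best = {}
    for name, supplier, price in rows:
        if name in dupes and (name not in best or price < best[name][1]):
            best[name] = (supplier, price)
    return best


print(duplicated_names(sample_rows))
print(cheapest_offers(sample_rows))
```

"Shampoo" is absent from both results because only one supplier offers it, so there is nothing to compare.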


Search for unique values

The method described above also works for highlighting elements that occur only once. The steps are the same as in the previous section. Go to the "Styles" group on the "Home" tab and click the "Conditional Formatting" button. In the list of commands, select "Highlight Cells Rules", then "Duplicate Values".

This time, however, in the rule settings window, select "Unique" instead of "Duplicate" from the drop-down list. The program will mark with the chosen color only those elements of the column that occur exactly once.
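For comparison, here is a minimal Python sketch of the same selection: values that occur exactly once in a column. The function name `unique_values` and the sample names are hypothetical.

```python
from collections import Counter


def unique_values(values):
    """Return the values that occur exactly once, in their original order."""
    counts = Counter(values)
    return [v for v in values if counts[v] == 1]


product_names = ["Hand cream", "Lip balm", "Hand cream", "Shampoo", "Lip balm", "Mascara"]
singles = unique_values(product_names)
print(singles)
```

Note that "unique" here means "occurs once", not "one copy of every value": "Hand cream" and "Lip balm" are excluded entirely.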


Duplicate Removal

After several price lists are merged into one file, the table contains many duplicate entries that need to be deleted. Since the list of products is very long, processing it manually would take a lot of time and effort. It is much more convenient to use the option that the program offers.

Excel has a dedicated command for this: "Remove Duplicates". It is located on the "Data" tab in the "Data Tools" group. Clicking the button opens a dialog box. If no group of cells was selected on the worksheet before calling the command, the program immediately offers to choose the columns in which to find and remove repetitions.

If you pre-select the cells of a single column, Excel will display a warning in which you need to choose whether to continue with only the selected range or to expand the selection.


Then check the columns in which to look for duplicates. The command is convenient because it can remove rows that match in full as well as rows that match only in the selected fields.


For example, in our case we can look for repetitions only by the name, code, type, and manufacturer of the goods, knowing that prices and suppliers will differ. This leaves one entry per product in the list, which is what we need for compiling our own price list or catalog.
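The idea of matching on selected columns only can be sketched in Python as follows. This is an illustration under stated assumptions, not Excel's implementation: the function `remove_duplicates`, the column names, and the sample data are hypothetical. The sketch keeps the first row for each key combination and drops later ones.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each combination of values in key_columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical merged price list; prices and suppliers differ between rows.
price_list = [
    {"name": "Hand cream", "code": "HC-01", "supplier": "Supplier A", "price": 4.50},
    {"name": "Hand cream", "code": "HC-01", "supplier": "Supplier B", "price": 4.20},
    {"name": "Lip balm", "code": "LB-07", "supplier": "Supplier A", "price": 2.10},
    {"name": "Lip balm", "code": "LB-07", "supplier": "Supplier C", "price": 2.30},
    {"name": "Shampoo", "code": "SH-03", "supplier": "Supplier B", "price": 6.00},
]

# Match on name and code only, ignoring supplier and price.
catalog = remove_duplicates(price_list, ["name", "code"])
for item in catalog:
    print(item["name"], item["code"])
```

Passing all four column names as keys would remove nothing here, because no two rows match in every field. That is the difference between a full-row match and a match on individual fields.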

Selecting Unique Records

Another way to remove duplicates in Excel is to extract only the unique values of the selected range. Let us show this with an example. Select the group of cells from which you want to remove duplicate values and open the "Data" tab. Find the "Sort & Filter" group there and click "Advanced".

In the window that opens, set the filter parameters. If you do not need to keep the original table, select the option "Filter the list, in-place". But if you are not finished working with it, it is better to choose "Copy to another location" and put the filter results elsewhere.


Specify the range to process and the cell where the filtered data should be placed. To get only unique occurrences in the result, check the box "Unique records only".

In our example, the initial 27 records are reduced to 19 without repetitions. This method also works in Excel 2003, whereas the previous one appeared only in the 2007 version of the program.
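A rough Python analogue of copying unique records to another location is shown below. It assumes that each record is a tuple of cell values and that two records are duplicates only when every field matches; the function name `unique_records` and the data are hypothetical.

```python
def unique_records(records):
    """Return one copy of each distinct record, in first-seen order.

    The input list is left untouched, as when the results are copied elsewhere.
    """
    # dict keys are unique and keep insertion order, so this drops exact repeats.
    return list(dict.fromkeys(records))


source_records = [
    ("Hand cream", "HC-01", 4.50),
    ("Lip balm", "LB-07", 2.10),
    ("Hand cream", "HC-01", 4.50),
    ("Shampoo", "SH-03", 6.00),
    ("Lip balm", "LB-07", 2.10),
]

filtered = unique_records(source_records)
print(len(source_records), "->", len(filtered))
```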

Additional recommendations

Keep a few tips in mind before deleting information from tables. First of all, make a copy of the table and work on the copy, or keep the copy as a backup and make the changes in the original. Otherwise, you may lose data from your file or alter the sheet formatting.

If the source table contains grouping (an outline) or subtotals, you must remove them before deleting duplicates in an Excel column.

Finding duplicate items does not work in pivot table reports.

In addition to the above, duplicate column elements can be removed using formulas, but this method is quite time-consuming and is hardly worthwhile in modern versions of the program.

Source: https://habr.com/ru/post/C46606/

