Find and remove duplicates
Applies to: Excel for Microsoft 365, Excel 2021, Excel 2019, Excel 2016, Excel 2013, Excel 2010, Excel 2007, Excel Starter 2010
Sometimes duplicate data is useful; sometimes it just makes your data harder to understand. Use conditional formatting to find and highlight duplicate data. That way you can review the duplicates and decide whether to remove them.
- Select the cells you want to check for duplicates.
Note: Excel can’t highlight duplicates in the Values area of a PivotTable report.
- Click Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values.
- In the box next to values with, pick the formatting you want to apply to the duplicate values, and then click OK.
Remove duplicate values
When you use the Remove Duplicates feature, the duplicate data will be permanently deleted. Before you delete the duplicates, it’s a good idea to copy the original data to another worksheet so you don’t accidentally lose any information.
- Select the range of cells that has duplicate values you want to remove.
- Click Data > Remove Duplicates, and then, under Columns, check or uncheck the columns where you want to remove the duplicates.
For example, in this worksheet, the January column has price information I want to keep.
So, I unchecked January in the Remove Duplicates box.
- Click OK.
Note: The counts of duplicate and unique values given after removal may include empty cells, spaces, etc.
When you are working with spreadsheets in Microsoft Excel and accidentally copy rows, or if you are making a composite spreadsheet of several others, you will encounter duplicate rows which you need to delete. This can be a very mindless, repetitive, time-consuming task, but there are several tricks that make it simpler.
Getting Started
Today we will talk about a few handy methods for identifying and deleting duplicate rows in Excel. If you don’t have any files with duplicate rows now, feel free to download our handy resource with several duplicate rows created for this tutorial. Once you have downloaded and opened the resource, or opened your own document, you are ready to proceed.
Option 1 – Remove Duplicates in Excel
If you are using Microsoft Office, you will have a bit of an advantage because there is a built-in feature for finding and deleting duplicates.
Begin by selecting the cells you want to target for your search. In this case, we will select the entire table by pressing Ctrl+A on Windows or Command+A on Mac.
Once you have successfully selected the table, you will need to click on the Data tab on the top of the screen and then select “Remove Duplicates” in the Data Tools drop-down box as shown below.
Once you have clicked on it, a small dialog box will appear. You will notice that the first row has automatically been deselected. The reason for this is that the My Data Has Headers box is ticked.
In this case, we do not have any headers since the table starts at Row 1. We will deselect the My Data Has Headers box. Once you have done that, you will notice that the whole table has been highlighted again and the “Columns” section changed from “duplicates” to “Column A, B, and C.”
Now that the entire table is selected, you just press “OK” to delete all duplicates. In this case, all the rows with duplicate information except for one have been deleted and the details of the deletion are displayed in the popup dialog box.
Option 2 – Advanced Filtering in Excel
The second tool you can use in Excel to identify and delete duplicates is the Advanced Filter. This method also applies to Excel 2003. Let us start again by opening up the Excel spreadsheet. To filter your spreadsheet, first select the whole table using Ctrl+A or Command+A as shown earlier.
After selecting your table, simply click the Data tab, and in the Sort & Filter section, click “Advanced.” If you are using Excel 2003, click Data > Filter, then choose “Advanced Filter.”
Now you will need to select the Unique Records Only check box.
Once you click “OK,” every duplicate row except one should be hidden from view. In this case, two copies were left because one of the duplicates sits in Row 1.
This method always treats the first row of your table as a header, so that row is never compared with the rows below it. If you want the extra copy in the first row gone, you will have to delete it manually. If the first row had held real headers rather than a duplicate, only one copy of each duplicated row would have been left.
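The header behavior described above can be sketched in Python. This is only a rough model of the idea, not Excel's implementation; the function name `unique_records` and the sample rows are made up for illustration. The first row is kept as a header without being compared, so a data row identical to it survives the filter.

```python
def unique_records(rows):
    """Keep the first row as a header, then keep each distinct data row once."""
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    seen = set()
    result = [header]          # the header is kept without being compared
    for row in data:
        if row not in seen:    # first time this data row appears
            seen.add(row)
            result.append(row)
    return result


# Hypothetical table with no real header: row 1 is itself a duplicated record.
table = [
    ("duplicate", "duplicate", "duplicate"),
    ("unique", "row", "one"),
    ("duplicate", "duplicate", "duplicate"),
    ("duplicate", "duplicate", "duplicate"),
]

filtered = unique_records(table)
for row in filtered:
    print(row)
```

With this sample, three rows remain and two of them are the duplicated record, which mirrors the two copies left over in the walkthrough.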
Option 3 – Replace
This method is great for smaller spreadsheets if you want to identify entire rows that are duplicated. In this case, we will be using the simple Replace tool that is built into all Microsoft Office products. You will need to begin by opening the spreadsheet you want to work on.
Once it is open, you need to select a cell with the content you want to find and replace and copy it. Click on the cell and press Ctrl+C on Windows or Command+C on Mac.
Once you have copied the word you want to search for, you will need to press Ctrl+H on Windows or Control+H on Mac to bring up the replace function. Once it is up, you can paste the word you copied into the Find What section by pressing Ctrl+V or Command+V.
Now that you have identified what you are looking for, press “Options.” Select the Match Entire Cell Contents checkbox. The reason for this is that sometimes your word may be present in other cells with other words. If you do not select this option, you could inadvertently end up deleting cells that you need to keep. Ensure that all the other settings match those shown in the image below.
Now you will need to enter a value in the Replace With box. For this example, we will use the number “1.” Once you have entered the value, press “Replace All.”
You will notice that all the values that matched “duplicate” have been changed to “1.” The reason we used the number one is that it is small and stands out. Now you can easily identify which rows had duplicate content.
In order to retain one copy of the duplicates, simply paste the original text back into the first row that has been replaced by 1’s.
Now that you have identified all the rows with duplicate content, go through the document and hold Ctrl on Windows or Command on Mac while clicking on the number of each duplicate row as shown below.
Once you have selected all the rows that need to be deleted, right-click on one of the grayed-out numbers, and select “Delete.” The reason you need to do this instead of pressing the Delete key on your computer is that it will delete the rows rather than just the content.
Once you are done you will notice that all your remaining rows are unique values.
One of the most common data cleaning tasks in Excel is removing duplicate or redundant records/rows.
While removing rows that are duplicated in full is straightforward, it’s a little trickier when you’re trying to remove entire rows that are duplicates based on one or more columns.
In this tutorial, we are going to look at three easy ways to remove duplicate rows based on one column in Excel.
Using the ‘Remove Duplicates’ Feature
This method is quite straightforward and the most commonly used way to remove duplicate rows in Excel.
Suppose you have a dataset as shown below, and you want to remove all the duplicate records based on Column A.
In this data, you can see that I have multiple instances of ‘Sleeping Bag’ and ‘Karaoke machine’ in column A, and I only want to retain the first row of each instance and get rid of all the others.
To remove all duplicate rows from our sample dataset (shown in the figure above), follow the steps listed below:
- Select the entire dataset, along with the column headers.
- From the Data tab, under the Data Tools group select the Remove Duplicates button.
- This will open the Remove Duplicates dialog box.
- If your selection in step 1 included column headers, then make sure the ‘My data has headers’ checkbox is checked.
- Click on the Unselect All button, to uncheck all the checkboxes under ‘Columns’.
- Now check the column based on which you want to remove the duplicate rows. In our example, we want to remove duplicate rows based on Product Name, so we will simply check the box next to its heading.
- Click OK to close the Remove Duplicates dialog box.
- You should see a notification telling you how many duplicate rows were removed and how many unique rows have been retained, as shown in the image below:
- Click OK
After applying the above steps on our sample dataset, here’s how the dataset should look:
Notice from the above image that exactly one instance corresponding to each product has been retained while all other duplicate rows have been removed.
In other words, this method does not get rid of all duplicate rows. It keeps just one copy and removes all others.
Note: If you want to remove duplicates based on more than one column, you can check the boxes next to the columns you want to include in step 6.
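The keep-the-first-copy behavior described above can be sketched in Python. This is an illustration of the logic only, not how Excel implements it, and Excel's own comparison rules (for example, how it treats letter case) may differ. The function name `remove_duplicates`, the column layout, and the sample values are all made up.

```python
def remove_duplicates(rows, key_columns):
    """Return rows with only the first occurrence of each key kept.

    rows        -- list of tuples, header row excluded
    key_columns -- zero-based indexes of the columns to compare
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        if key not in seen:      # first row with this key: keep it
            seen.add(key)
            kept.append(row)
    return kept


# Made-up sample data: (product name, quantity, price)
data = [
    ("Sleeping Bag", 2, 40),
    ("Tent", 1, 120),
    ("Sleeping Bag", 5, 45),
    ("Karaoke machine", 1, 90),
    ("Karaoke machine", 3, 85),
]

by_name = remove_duplicates(data, key_columns=[0])
removed = len(data) - len(by_name)
print(f"{removed} duplicate rows removed; {len(by_name)} unique rows remain.")
```

Passing more than one index in `key_columns`, such as `[0, 1]`, corresponds to checking more than one column in the dialog box.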
Suppose you have a dataset as shown below, and you want to remove all the duplicate records based on Column A.
You can also do this easily with a short VBA macro. This method is useful when you have to do this quite often and don’t want to follow many steps.
Instead, you can save the macro and add it to the Quick Access Toolbar, so that you can run it with a single click.
If you are comfortable with a little coding, then you can use this method to remove duplicate rows based on a single column.
Even if you’re not that keen on coding, you can just copy-paste the following code:
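Sub Delete_duplicate_rows()
    ' Work on whatever range is currently selected.
    Dim Rng As Range
    Set Rng = Selection
    ' Compare rows on column 1 of the selection; the first row is a header.
    Rng.RemoveDuplicates Columns:=Array(1), Header:=xlYes
End Sub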
This code uses VBA’s built-in Range.RemoveDuplicates method.
It takes the selected range, as well as the columns that you want to base the duplicate removal on.
In the above code, we specified this as Column 1 or the Product Names column. The code then removes all the rows from the range that contain duplicate product names.
Note: You cannot undo changes made by this VBA script, so we suggest you keep a backup copy of your dataset before running the code.
To enter the above code, copy it and paste it into a module in the Visual Basic Editor.
Here’s how to do this:
- From the Developer tab, select Visual Basic.
- Once your VBA window opens, you will see all your files and folders in the Project Explorer on the left side.
- Make sure ‘ThisWorkbook’ is selected under the VBA project with the same name as your Excel workbook.
- Click Insert->Module. A new module window should open up.
- Now you can start coding. Copy the above script and paste into the module window.
- Close the VBA window.
If you can’t see the Developer ribbon, from the File menu, go to Options. Select Customize Ribbon and check the Developer option from Main Tabs. Finally, Click OK.
Your macro is now ready to use.
Note: If you want to remove duplicates based on more than one column, you can specify the column numbers in the RemoveDuplicates line of the code. So if you want to search based on columns 1 and 2, that line would be:
Rng.RemoveDuplicates Columns:=Array(1,2), Header:=xlYes
Running the Macro
To run your macro, do the following:
- Select the range of cells that you want to work with. In our case, select A1:C9
- Select the Developer tab.
- Click on the Macros button (under the Code group).
- This will open the Macro window, where you will find the names of all the macros that you have created so far.
- Select the macro (or module) named ‘Delete_duplicate_rows’.
- Click Run.
- Click OK.
All the duplicate rows should now be removed, leaving only the first row for each value.
Explanation of the Code
The above code simply took the selected range of cells and assigned it to a variable, Rng.
The RemoveDuplicates command then took this range, searched for duplicates in Column 1 (as we had specified Columns:=Array(1)), and then removed every row that repeated a value already seen in Column 1, keeping the first row for each value.
Note: We specified Header:=xlYes to tell Excel that the first row of our selected range contains column headings.
Using Filters and the COUNTIF Function to Remove Duplicate Rows based on one Column
You can also use a formula to help you find duplicate values in your data. This method involves two steps.
Suppose you have the same dataset as shown below, and you want to remove all the duplicate records based on Column A.
First, we will use the COUNTIF function to count the first occurrence of a product as 1, its second occurrence as 2, and so on.
Then, we will use this result to filter out those rows that occur for the second time or more.
The result is the set of all duplicate rows. We can then delete these visible rows and remove the filter to obtain rows containing only unique product names.
The COUNTIF function helps count cells in a range that satisfy a given condition. The syntax for the function is as follows:
=COUNTIF(range, condition)
Here,
- range is the range of cells containing the data you want the function to work on (or count)
- condition is the condition that you want satisfied in order to include a cell in the count.
So, if you want to find the number of times the value at cell reference A2 appears in the range A2:A9, use the function as follows:
=COUNTIF(A2:A9,A2)
But if we want to count the number of times the value appears in the range A2 to the current row, then we can use the function as follows:
=COUNTIF($A$2:$A2,A2)
Note: We locked the starting cell reference in the first parameter with ‘$’ because we don’t want this reference to change when the formula is copied to the other cells.
Here, we want the cell reference $A$2 to remain constant, irrespective of which row the formula is copied to. In this way, we can count how many times the value appears up to the current row.
This will ensure that the first occurrence of the cell value is counted as 1, the second occurrence is counted as 2, and so on.
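The expanding-range count can be sketched in Python to show what the formula produces once it is copied down the column. This is a plain-Python stand-in for =COUNTIF($A$2:$A2,A2), not Excel code; the function name `running_counts` and the product list are made up.

```python
from collections import defaultdict


def running_counts(values):
    """For each value, count how many times it has appeared so far (inclusive)."""
    seen = defaultdict(int)
    counts = []
    for value in values:
        seen[value] += 1
        counts.append(seen[value])
    return counts


# Made-up product names standing in for column A.
products = ["Sleeping Bag", "Tent", "Sleeping Bag", "Karaoke machine", "Sleeping Bag"]
counts = running_counts(products)
print(counts)  # [1, 1, 2, 1, 3]

# Values with a count greater than 1 are second or later occurrences.
duplicates = [p for p, c in zip(products, counts) if c > 1]
print(duplicates)  # ['Sleeping Bag', 'Sleeping Bag']
```

Because each row only looks at the rows above it and itself, the first occurrence always gets 1, which is what makes a "greater than 1" filter safe for finding the extra copies.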
To remove all duplicate rows from our sample dataset, follow the steps listed below:
- Create a new column with the heading ‘Count’ next to the dataset.
- In the first cell below the heading, type the formula: =COUNTIF($A$2:$A2,A2).
- Copy this formula to the rest of the cells in the column by dragging down the cell’s fill handle. This will display, for each row, how many times that product name has appeared in column A so far.
- Now we need to filter the dataset to show only the duplicate rows. Select the entire dataset, along with the column headers.
- From the Data tab, under the Sort & Filter group select the Filter button.
- You should now see arrows next to each column’s heading.
- Click on the arrow next to the ‘Count’ heading.
- From the dropdown menu that appears, select Number Filters->greater than.
- This will open the Custom Autofilter dialog box. Type ‘1’ in the input box next to “is greater than”.
- Click OK.
- You should see all the duplicate rows only.
- Delete these filtered rows by selecting them, right-clicking and choosing Delete Rows from the context menu that appears.
- Remove the filter now by clicking on the Filter button again (under the Data tab).
- You should now be left with all the unique rows of the original dataset.
You can now get rid of the Count column.
In this tutorial, we looked at three ways in which you can remove duplicate rows based on one or more columns.
You can feel free to choose the method that best suits your requirement at hand.
We hope this was helpful.
This tutorial demonstrates how to remove duplicate rows in Excel and Google Sheets.
Remove Duplicate Rows
In Excel, you can use the built-in functionality to delete duplicate rows comparing several columns. First, look at the data set below, containing information about product, month, and price.
As you can see in the picture above, there are multiple prices for the same product during the same month. For example, the product keyboard in Jan-21 has two prices: $1,000 (in Row 2) and $595 (in Row 8). Again, for Feb-21, there are two prices: $595 (Row 6) and $600 (Row 16). To delete duplicate values comparing both fields (product and month) and get a unique price for this combination, follow these steps.
- Click anywhere in the data range (here, B2:D17) and in the Ribbon, go to Data > Remove Duplicates.
- Excel automatically recognizes how the data and headers are formatted, and all columns are checked by default. First, uncheck Product Price, as you want to compare data by product and month, and click OK.
- The information message below pops up, saying that eight duplicates were removed and eight unique rows remain.
Since all products initially had two rows for each month (Jan-21 and Feb-21), the first appearance of a product in Jan-21 and Feb-21 is kept, while the second is deleted.
Note: You can also use VBA code to delete duplicate rows.
Remove Duplicate Rows in Google Sheets
You can also remove duplicate rows based on one or more columns in Google Sheets.
- Click anywhere in the data range (B2:D17) and in the Menu, go to Data > Remove duplicates.
- Google Sheets takes the whole data range into account. First, check Data has header row to show the column descriptions, and uncheck Column D – Product Price under Columns to analyze. Finally, click Remove duplicates.
- Like in Excel, you get the pop-up message below that eight duplicate rows were removed, while eight unique rows are kept.
The final output is the data range with unique combinations of product and month.
Duplicate values in your data can be a big problem! They can lead to substantial errors and cause you to overestimate your results.
But finding and removing them from your data is actually quite easy in Excel.
In this tutorial, we are going to look at 7 different methods to locate and remove duplicate values from your data.
What Is A Duplicate Value?
Duplicate values happen when the same value or set of values appear in your data.
For a given set of data you can define duplicates in many different ways.
In the above example, there is a simple set of data with 3 columns for the Make, Model and Year for a list of cars.
- The first image highlights all the duplicates based only on the Make of the car.
- The second image highlights all the duplicates based on the Make and Model of the car. This results in one fewer duplicate.
- The third image highlights all the duplicates based on all columns in the table. This results in even fewer values being considered duplicates.
The results from duplicates based on a single column vs the entire table can be very different. You should always be aware which version you want and what Excel is doing.
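How much the definition matters can be sketched in Python by counting duplicates under each choice of columns. The car list below is made up (it is not the data from the images), the function name `count_duplicates` is hypothetical, and the count covers only second and later occurrences.

```python
def count_duplicates(rows, key_columns):
    """Count rows whose key already appeared in an earlier row."""
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


# Made-up car list: (make, model, year)
cars = [
    ("Toyota", "Corolla", 2015),
    ("Toyota", "Corolla", 2015),
    ("Toyota", "Corolla", 2018),
    ("Toyota", "Camry", 2015),
    ("Honda", "Civic", 2015),
]

by_make = count_duplicates(cars, [0])            # Make only
by_make_model = count_duplicates(cars, [0, 1])   # Make and Model
by_all = count_duplicates(cars, [0, 1, 2])       # every column
print(by_make, by_make_model, by_all)  # 3 2 1
```

The same five rows yield three, two, or one duplicate depending on the columns compared, so it is worth settling the definition before deleting anything.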
Find And Remove Duplicate Values With The Remove Duplicates Command
Removing duplicate values in data is a very common task. It’s so common, there’s a dedicated command to do it in the ribbon.
Select a cell inside the data which you want to remove duplicates from and go to the Data tab and click on the Remove Duplicates command.
Excel will then select the entire set of data and open up the Remove Duplicates window.
- You then need to tell Excel if the data contains column headers in the first row. If this is checked, then the first row of data will be excluded when finding and removing duplicate values.
- You can then select which columns to use to determine duplicates. There are also handy Select All and Unselect All buttons above you can use if you’ve got a long list of columns in your data.
When you press OK, Excel will then remove all the duplicate values it finds and give you a summary count of how many values were removed and how many values remain.
This command will alter your data so it’s best to perform the command on a copy of your data to retain the original data intact.
Find And Remove Duplicate Values With Advanced Filters
There is also another way to get rid of any duplicate values in your data from the ribbon. This is possible from the advanced filters.
Select a cell inside the data and go to the Data tab and click on the Advanced filter command.
This will open up the Advanced Filter window.
- You can choose either to Filter the list in place or to Copy to another location. Filtering the list in place will hide rows containing any duplicates, while copying to another location will create a copy of the data.
- Excel will guess the range of data, but you can adjust it in the List range. The Criteria range can be left blank and the Copy to field will need to be filled if the Copy to another location option was chosen.
- Check the box for Unique records only.
Press OK and you will eliminate the duplicate values.
Advanced filters can be a handy option for getting rid of your duplicate values and creating a copy of your data at the same time. But advanced filters will only be able to perform this on the entire table.
Find And Remove Duplicate Values With A Pivot Table
Pivot tables are just for analyzing your data, right?
You can actually use them to remove duplicate data as well!
You won’t actually be removing duplicate values from your data with this method, you will be using a pivot table to display only the unique values from the data set.
First, create a pivot table based on your data. Select a cell inside your data or the entire range of data ➜ go to the Insert tab ➜ select PivotTable ➜ press OK in the Create PivotTable dialog box.
With the new blank pivot table add all fields into the Rows area of the pivot table.
You will then need to change the layout of the resulting pivot table so it’s in a tabular format. With the pivot table selected, go to the Design tab and select Report Layout. There are two options you will need to change here.
- Select the Show in Tabular Form option.
- Select the Repeat All Item Labels option.
You will also need to remove any subtotals from the pivot table. Go to the Design tab ➜ select Subtotals ➜ select Do Not Show Subtotals.
You now have a pivot table that mimics a tabular set of data!
Pivot tables only list unique values for items in the Rows area, so this pivot table will automatically remove any duplicates in your data.
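The effect is similar to collecting each distinct combination of field values once while leaving the source untouched. A minimal Python sketch of that idea follows, using a made-up car list; the order in which a real pivot table displays its items may differ from the first-seen order used here.

```python
# Made-up car list: (make, model, year)
cars = [
    ("Toyota", "Corolla", 2015),
    ("Honda", "Civic", 2015),
    ("Toyota", "Corolla", 2015),
    ("Honda", "Civic", 2015),
    ("Toyota", "Camry", 2018),
]

# dict.fromkeys keeps one entry per distinct tuple, in first-seen order.
unique_rows = list(dict.fromkeys(cars))

for make, model, year in unique_rows:
    print(make, model, year)

# The source list is untouched, just as the pivot table leaves the data alone.
print(len(cars), len(unique_rows))  # 5 3
```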
Find And Remove Duplicate Values With Power Query
Power Query is all about data transformation, so you can be sure it has the ability to find and remove duplicate values.
Select the table of values which you want to remove duplicates from ➜ go to the Data tab ➜ choose a From Table/Range query.
Remove Duplicates Based On One Or More Columns
With Power Query, you can remove duplicates based on one or more columns in the table.
You need to select which columns to remove duplicates based on. You can hold Ctrl to select multiple columns.
Right click on the selected column heading and choose Remove Duplicates from the menu.
You can also access this command from the Home tab ➜ Remove Rows ➜ Remove Duplicates.
= Table.Distinct(#"Previous Step", {"Make", "Model"})
If you look at the formula that’s created, it is using the Table.Distinct function with the second parameter referencing which columns to use.
Remove Duplicates Based On The Entire Table
To remove duplicates based on the entire table, you could select all the columns in the table then remove duplicates. But there is a faster method that doesn’t require selecting all the columns.
There is a button in the top left corner of the data preview with a selection of commands that can be applied to the entire table.
Click on the table button in the top left corner ➜ then choose Remove Duplicates.
= Table.Distinct(#"Previous Step")
If you look at the formula that’s created, it uses the same Table.Distinct function with no second parameter. Without the second parameter, the function will act on the whole table.
Keep Duplicates Based On A Single Column Or On The Entire Table
In Power Query, there are also commands for keeping duplicates for selected columns or for the entire table.
Follow the same steps as removing duplicates, but use the Keep Rows ➜ Keep Duplicates command instead. This will show you all the data that has a duplicate value.
Find And Remove Duplicate Values Using A Formula
You can use a formula to help you find duplicate values in your data.
First you will need to add a helper column that combines the data from any columns which you want to base your duplicate definition on.
= [@Make] & [@Model] & [@Year]
The above formula will concatenate all three columns into a single column. It uses the ampersand operator to join each column.
= TEXTJOIN("", FALSE, CarList[@[Make]:[Year]])
If you have a long list of columns to combine, you can use the above formula instead. This way you can simply reference all the columns as a single range.
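One caveat worth checking when combining columns this way: joining with an empty delimiter can make two different rows produce the same text. The Python sketch below illustrates the point with made-up values; `helper_key` is a hypothetical stand-in for the helper-column formula, and the "|" delimiter is an assumption that only works if that character never appears in the data.

```python
def helper_key(row, delimiter=""):
    """Join every field of a row into one text value, like a helper column."""
    return delimiter.join(str(field) for field in row)


# Two made-up rows that are different but concatenate to the same text.
row_a = ("Ford", "F150", 2015)
row_b = ("FordF", "150", 2015)

print(helper_key(row_a) == helper_key(row_b))            # True: false duplicate
print(helper_key(row_a, "|") == helper_key(row_b, "|"))  # False: rows stay distinct
```

In the formulas above, the equivalent change would be to pass a delimiter as the first argument of TEXTJOIN, or to add one between the ampersands.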
You will then need to add another column to count the duplicate values. This will be used later to filter out rows of data that appear more than once.
= COUNTIFS($E$3:E3, E3)
Copy the above formula down the column and it will count the number of times the current value has appeared in the list so far, up to and including the current row.
If the count is 1 then it’s the first time the value is appearing in the data and you will keep this in your set of unique values. If the count is 2 or more then the value has already appeared in the data and it is a duplicate value which can be removed.
Add filters to your data list.
- Go to the Data tab and select the Filter command.
- Use the keyboard shortcut Ctrl + Shift + L.
Now you can filter on the Count column. Filtering on 1 will produce all the unique values and remove any duplicates.
You can then select the visible cells from the resulting filter to copy and paste elsewhere. Use the keyboard shortcut Alt + ; to select only the visible cells.
Find And Remove Duplicate Values With Conditional Formatting
With conditional formatting, there’s a way to highlight duplicate values in your data.
Just like the formula method, you need to add a helper column that combines the data from columns. The conditional formatting doesn’t work with data across rows, so you’ll need this combined column if you want to detect duplicates based on more than one column.
Then you need to select the column of combined data.
To create the conditional formatting, go to the Home tab ➜ select Conditional Formatting ➜ Highlight Cells Rules ➜ Duplicate Values.
This will open up the conditional formatting Duplicate Values window.
- You can select to either highlight Duplicate or Unique values.
- You can also choose from a selection of predefined cell formats to highlight the values or create your own custom format.
Warning: The previous methods to find and remove duplicates do not treat the first occurrence of a value as a duplicate and leave it intact. However, this method highlights the first occurrence as well and makes no distinction.
With the values highlighted, you can now filter on either the duplicate or unique values with the filter by color option. Make sure to add filters to your data. Go to the Data tab and select the Filter command or use the keyboard shortcut Ctrl + Shift + L.
- Click on the filter toggle.
- Select Filter by Color in the menu.
- Filter on the color used in the conditional formatting to select duplicate values or filter on No Fill to select unique values.
You can then select just the visible cells with the keyboard shortcut Alt + ;.
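The difference noted in the warning above can be sketched in Python by flagging the same values in two ways. Both functions are hypothetical illustrations of the logic rather than Excel features, and the combined values are made up.

```python
from collections import Counter


def highlight_flags(values):
    """True for every value that occurs more than once, first occurrence included."""
    totals = Counter(values)
    return [totals[v] > 1 for v in values]


def later_occurrence_flags(values):
    """True only for the second and later occurrences of a value."""
    seen = set()
    flags = []
    for v in values:
        flags.append(v in seen)
        seen.add(v)
    return flags


# Made-up combined helper-column values.
combined = ["FordF1502015", "HondaCivic2015", "FordF1502015", "FordF1502015"]

print(highlight_flags(combined))         # [True, False, True, True]
print(later_occurrence_flags(combined))  # [False, False, True, True]
```

Deleting every row flagged by the first function would remove all copies of a repeated value, so one highlighted row per value has to be kept by hand.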
Find And Remove Duplicate Values Using VBA
VBA has a built-in method, Range.RemoveDuplicates, that can remove duplicates from the range of a list object (an Excel table).
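Sub RemoveDuplicates()
    ' The whole table named CarList, header row included.
    Dim DuplicateValues As Range
    Set DuplicateValues = ActiveSheet.ListObjects("CarList").Range
    ' Compare rows on columns 1 to 3; the first row holds the headers.
    DuplicateValues.RemoveDuplicates Columns:=Array(1, 2, 3), Header:=xlYes
End Sub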
The above procedure will remove duplicates from an Excel table named CarList.
Columns:=Array(1, 2, 3)
The above part of the procedure will set which columns to base duplicate detection on. In this case it will be on the entire table since all three columns are listed.
Header:=xlYes
The above part of the procedure tells Excel the first row in our list contains column headings.
You will want to create a copy of your data before running this VBA code, as it can’t be undone after the code runs.
Conclusions
Duplicate values in your data can be a big obstacle to a clean data set.
Thankfully, there are many options in Excel to easily remove those pesky duplicate values.
So, what’s your go-to method to remove duplicates?
About the Author
John is a Microsoft MVP and qualified actuary with over 15 years of experience. He has worked in a variety of industries, including insurance, ad tech, and most recently Power Platform consulting. He is a keen problem solver and has a passion for using technology to make businesses more efficient.
Excel spreadsheets continue to represent a key tool for data storage and visualization. Functionalities such as Find & Replace or Sort help users speed up repetitive tasks that would otherwise be time-consuming and inefficient. Just like working on a spreadsheet with blank rows or cells that interfere with the correct application of rules and formulae, duplicate data can cause similar issues.
In this post, you will learn different ways to find duplicate values to either highlight this information or delete as many duplicates as needed. From more basic highlighting features to more advanced filtering options, you’ll learn how to work with the full potential of the desktop version of Excel.
How to find and remove duplicate rows in Excel?
The various methods shown in this article will first find the duplicate values to be removed and then show how to delete them. This two-step process is crucial, especially considering that you may not want to delete the duplicates automatically and keep only the unique value. Let’s look at the first method to remove all duplicates.
How to remove duplicates using the Remove Duplicates feature?
What is the shortcut to removing duplicates in Excel? The shortcut is actually a built-in command available in the ribbon, which you can use in the following way.
- 1. Open your Excel spreadsheet and select any range in your spreadsheet which you want to delete duplicate rows from.
How to Find and Remove Duplicates in Excel — Find duplicate rows
- 2. Go to Data > Remove duplicates.
How to Find and Remove Duplicates in Excel — Remove duplicates
If you haven’t selected all the data in your spreadsheet, Excel will offer to expand the selection to the adjacent data, which is recommended. Click “OK”.
- 3. If your selection has a header row, tick the header checkbox (“My data has headers” in Excel for Windows) so that the header row is not counted in the duplicate search. Then tick the columns Excel should compare when looking for duplicates. I want to compare every column in my example, so I’ll leave all boxes ticked. Click “OK”.
- 4. Excel prompts you with a dialog box informing you about the exact number of duplicate values it found and removed, as well as the number of unique values remaining in your spreadsheet.
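To make the behavior of this command concrete, here is a minimal Python sketch of the same idea: keep the first occurrence of each row and report how many rows were dropped. The function name and the sample data are hypothetical and are not taken from the workbook used in this article. The sketch compares text case-sensitively, which may not match Excel's own comparison rules.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row and count the rows dropped."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row)  # the whole row acts as the comparison key
        if key not in seen:
            seen.add(key)
            kept.append(row)
    removed = len(rows) - len(kept)
    return kept, removed


# Hypothetical sample data (illustration only).
data = [
    ("Anna", "Berlin"),
    ("Ben", "Paris"),
    ("Anna", "Berlin"),
    ("Cara", "Rome"),
    ("Ben", "Paris"),
]

unique_rows, removed_count = remove_duplicates(data)
print(f"{removed_count} duplicate values found and removed; "
      f"{len(unique_rows)} unique values remain.")
```

As with the Excel command, the order of the remaining rows is the order in which each row first appeared.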
How to delete duplicates in Excel but keep one?
Although the previous method keeps the first occurrence of each row, it permanently deletes every other copy from your dataset. If you would rather leave the original data untouched, you may want to explore the following methods.
Here’s how to delete duplicates in Excel but keep one. We strongly recommend that you always keep a copy of the spreadsheet in case you want to go back to the original dataset.
How to remove duplicates using the Advanced Filter option?
This is a straightforward way to get rid of duplicate rows without deleting them; instead, the Advanced filter option hides the duplicates in your dataset.
- 1. Select a cell in your dataset and go to Data > Advanced filter to the far right.
- 2. Choose to “Filter the list, in-place” or “Copy to another location”. The first option will hide any row containing duplicates, while the second will make a copy of the data.
Leave the “List range” field empty if you want Excel to fill it in automatically. You can also leave the “Criteria range” empty. The only mandatory field is “Copy to”, and only if you selected the “Copy to another location” option.
- 3. Tick the “Unique records only” box to keep the unique values, and then click “OK” to remove all duplicates.
Advanced filters are an excellent way to remove duplicate values while keeping a copy of the original data. Keep in mind that the Advanced filter option compares entire rows of the list range, so a row is only treated as a duplicate when every cell in it matches another row.
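The difference between the two Advanced filter modes can be sketched in Python as follows. This is only an illustration under assumed sample data: `unique_records_mask` is a hypothetical helper that returns a visibility flag per row (the in-place mode), and the list comprehension builds a separate copy (the copy mode). In both cases the original list is left untouched.

```python
def unique_records_mask(rows):
    """Return one flag per row: True if the row stays visible, False if hidden."""
    seen = set()
    mask = []
    for row in rows:
        key = tuple(row)
        mask.append(key not in seen)  # only the first occurrence stays visible
        seen.add(key)
    return mask


# Hypothetical sample data (illustration only).
data = [
    ("Anna", "Berlin"),
    ("Ben", "Paris"),
    ("Anna", "Berlin"),
    ("Cara", "Rome"),
    ("Ben", "Paris"),
]

# "Filter the list, in-place": the rows still exist, duplicates are only hidden.
visible = unique_records_mask(data)

# "Copy to another location": the unique rows are written to a new list.
copied = [row for row, show in zip(data, visible) if show]

print(visible)
print(copied)
```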
How to remove duplicates using Excel formulae?
Although you can combine various formulae to remove duplicates in Excel, Microsoft introduced the UNIQUE function in 2018 to make this process much easier. It is available in Excel for Microsoft 365 and in Excel 2021 or later. First, let’s explore the syntax of the UNIQUE function:
=UNIQUE(array, [by_col], [exactly_once])
- array refers to the range of cells we will extract unique values from and is the only required argument.
- [by_col] is an optional parameter that determines whether values are compared by row or by column. “FALSE” (the default) compares rows against each other, while “TRUE” compares columns.
- [exactly_once] is the other optional parameter and sets the behavior for values that appear more than once. If you want the formula to return only items that appear exactly once, write “TRUE”; if you want it to return every distinct item, write “FALSE” (the default).
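The effect of the [exactly_once] parameter can be illustrated with a short Python sketch. The `unique` function below is a hypothetical stand-in that mimics the row-wise behavior described above; it is not Excel's implementation, and the sample rows are assumptions made for the example.

```python
from collections import Counter


def unique(rows, exactly_once=False):
    """Return distinct rows in order of first appearance.

    With exactly_once=True, rows that occur more than once are left out
    entirely instead of being reduced to a single copy.
    """
    counts = Counter(rows)
    emitted = set()
    result = []
    for row in rows:
        if row in emitted:
            continue  # a copy of this row has already been returned
        if exactly_once and counts[row] > 1:
            continue  # the row is repeated, so skip every copy of it
        emitted.add(row)
        result.append(row)
    return result


# Hypothetical sample data: the last two rows are duplicates of each other.
data = [
    ("Anna", "Berlin"),
    ("Ben", "Paris"),
    ("Cara", "Rome"),
    ("Dev", "Oslo"),
    ("Dev", "Oslo"),
]

only_once = unique(data, exactly_once=True)   # both "Dev" rows are left out
distinct = unique(data, exactly_once=False)   # one "Dev" row is kept

print(only_once)
print(distinct)
```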
Let’s now apply the =UNIQUE formula to our dataset.
- 1. Enter the formula next to your data. You can either leave one empty column in between or place it directly next to the last data column. As with most Excel formulae, Excel suggests the function name as soon as you start typing it. Select the range you want to apply the formula to.
- 2. You can skip the second parameter [by_col] by leaving it empty between two commas. Let’s first see what happens when we include “TRUE” for the [exactly_once] parameter.
- 3. As soon as you press the Return key, Excel returns only the rows that appear exactly once. In this example, it has left out rows 5 and 6, which duplicate each other.
Let’s see how, with “FALSE” as the last parameter, Excel keeps one copy of each duplicated value.
- 1. Follow the previous steps, but now write “FALSE” to return every distinct value.
- 2. Now, the UNIQUE formula has returned row 5 and only left out the duplicate value in row 6.
How to remove duplicates using conditional formatting?
Conditional formatting is an Excel feature that formats cells according to built-in rules or custom ones created by the user, which makes data easier to filter, sort, and organize. The most commonly used group of rules is “Highlight Cells Rules”, which lets you format cells by color, font, and various other styles based on their values. Although this method won’t directly remove duplicates, it makes them easy to identify.
- 1. Select the range of cells you want to apply the conditional formatting rule to. Then go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values.
- 2. Set the “Style” to “Classic” and then choose “Format only unique or duplicate values”. Make sure the drop-down menu is set to “duplicate”. Finally, choose the formatting style using the “Format with” drop-down menu. Click “OK”.
- 3. Excel highlights every cell that contains a duplicate value, not just rows that are duplicated in full. Make sure you only delete rows that are complete duplicates, unless you actually want to remove every duplicate value.
In case you want to highlight rows, you can combine all row values in one cell using the =CONCAT formula; if you would like to learn more about this function, read this article on the Microsoft support page.
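The distinction between duplicate cell values and duplicate rows can be sketched in Python. Both helper functions and the sample data are hypothetical. The row-level version uses the whole row as a single key, which plays the same role as combining a row's values into one helper cell.

```python
from collections import Counter


def duplicate_cells(values):
    """Flag every value that occurs more than once in the list."""
    counts = Counter(values)
    return [counts[value] > 1 for value in values]


def duplicate_rows(rows):
    """Flag every row whose full contents occur more than once."""
    keys = [tuple(row) for row in rows]  # one combined key per row
    return duplicate_cells(keys)


# Hypothetical sample data (illustration only).
rows = [
    ("Anna", "Berlin"),
    ("Ben", "Berlin"),
    ("Anna", "Berlin"),
    ("Cara", "Rome"),
]
cities = [city for _, city in rows]

cell_flags = duplicate_cells(cities)  # [True, True, True, False]
row_flags = duplicate_rows(rows)      # [True, False, True, False]

print(cell_flags)
print(row_flags)
```

Here the row for Ben is flagged at the cell level because “Berlin” repeats, but not at the row level, since no other row has the same name and city.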
How to remove duplicates based on one or more columns in Excel?
As a more advanced use of Excel, you can remove duplicates based on one or more columns using Power Query. This feature allows you to select the columns you would like to remove the duplicates from. Let’s explore how to use Power Query to remove duplicates based on one or more columns.
- 1. Go to Data > Get Data (Power Query).
- 2. Choose “Excel workbook” as your data source.
- 3. Browse through your files and select the spreadsheet you want to apply the Power Query function to. Click “Next”.
- 4. Tick the checkbox next to the worksheet containing your data (located in the left-side menu). Then, click “Load” in the bottom right-hand corner.
- 5. As you can see, the dataset has been transformed into a table.
- 6. Select the columns you want to check for duplicates by holding Ctrl/Cmd and clicking each column.
- 7. To delete duplicates, simply click on “Remove Duplicates” in the “Data” tab. Then click “OK” in the pop-up dialog box.
- 8. Excel will inform you about the number of duplicates removed and how many unique values remain.
Don’t worry about removing all duplicates, since the dataset you worked on is a copy created by the Power Query function. However, if you want to keep unique values, follow the steps outlined in the sections on the Advanced Filter option or =UNIQUE formula in Excel.
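Removing duplicates based on selected columns can be sketched in Python as follows. The function name, column names, and records are hypothetical. The point is only that two rows count as duplicates when they match in the chosen columns, even if other columns differ.

```python
def remove_duplicates_by(records, columns):
    """Keep the first record for each combination of values in `columns`."""
    seen = set()
    kept = []
    for record in records:
        key = tuple(record[column] for column in columns)
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


# Hypothetical sample data (illustration only).
records = [
    {"name": "Anna", "city": "Berlin", "january": 10},
    {"name": "Anna", "city": "Berlin", "january": 12},
    {"name": "Ben", "city": "Paris", "january": 10},
]

# Compare only name and city: the second Anna row counts as a duplicate.
by_name_city = remove_duplicates_by(records, ["name", "city"])

# Compare every column: no two rows match, so nothing is removed.
by_all = remove_duplicates_by(records, ["name", "city", "january"])

print(len(by_name_city), len(by_all))
```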
Conclusion
As we have seen, there are many ways to identify and eliminate duplicates in your data, depending on your needs. Not only can you now organize your data correctly, but removing duplicates also makes it easier to identify key patterns and create accurate reports, particularly when working with larger datasets.
This post will guide you through removing duplicate rows from a Microsoft Excel spreadsheet.
- Remove Duplicate Rows with Remove Duplicates Command
- Remove Duplicate Rows with Advanced Filter
- Remove Duplicate Rows with Formula
Assume that you have a list of data in range A1:B6 that contains duplicate rows, and you want to remove them and keep only the unique rows. This post will show you three methods to remove duplicate rows.
Remove Duplicate Rows with Remove Duplicates Command
The easiest way to remove duplicate rows from a selected range is to use the Remove Duplicates command. Follow these steps:
Step1: select the range that you want to remove duplicate rows from.
Step2: go to the Data tab in the Excel Ribbon, and click the Remove Duplicates command under the Data Tools group. The Remove Duplicates dialog will open.
Step3: check all column options under the Columns list box, and click the OK button. (Note: the dialog box lets you select which columns of your range are included in the comparison.)
Step4: a prompt box will tell you how many rows were removed and how many unique rows remain.
Remove Duplicate Rows with Advanced Filter
You can also use the Advanced Filter feature to filter unique rows in a Microsoft Excel spreadsheet and copy the result to a new range. Follow these steps:
Step1: select the range that you want to remove the duplicate rows from.
Step2: go to the Data tab, and click the Advanced command under the Sort & Filter group. The Advanced Filter dialog will appear.
Step3: select the Copy to another location option in the Advanced Filter dialog box. Make sure that the selected range has been entered into the List range text box. Then select one blank cell in the Copy to box as the new location. Make sure to check the Unique records only box. Click the OK button.
Step4: the new range is created without duplicate rows.
Remove Duplicate Rows with Formula
You can also use Excel formulas to accomplish the same result. First, concatenate all columns into one helper column so that each row is represented by a single value. Then, in a second helper column, use the COUNTIF function to count how many times each combined value has occurred so far. Finally, filter for counts greater than 1 and delete the filtered rows, which are the duplicate rows. Follow the steps below:
Step1: select a single cell adjacent to your data, such as C1. Enter the following formula into cell C1 and press the Enter key. Then copy the formula down to all other rows.
=A1&B1
Step2: the contents of columns A-B have now been concatenated into column C. Next, find the duplicates in the combined column C with another formula based on the COUNTIF function. Select a cell adjacent to column C, such as D1, enter the following formula, and copy it down to all other rows.
=COUNTIF(C$1:C1,C1)
Step3: keep column D selected, go to the Data tab, and click the Filter button under the Sort & Filter group. A filter arrow will be added to cell D1.
Step4: click the filter arrow in cell D1 and select the rows that are not equal to 1; that is, uncheck the value 1. Click the OK button.
Step5: the first occurrence of every row is now hidden, and only the duplicate rows are displayed. Select all filtered rows, right-click them, and click Delete Rows from the popup menu.
Step6: remove the filter from the column D.
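The two helper columns in the steps above can be sketched in Python to show why the running count works. The sample rows are hypothetical. The list `combined` mirrors =A1&B1, and `running_count` mirrors =COUNTIF(C$1:C1,C1) copied down the column. This sketch compares text exactly, which may differ from how COUNTIF matches text in Excel.

```python
# Hypothetical sample data standing in for columns A and B.
rows = [
    ("Anna", "Berlin"),
    ("Ben", "Paris"),
    ("Anna", "Berlin"),
    ("Cara", "Rome"),
    ("Anna", "Berlin"),
]

# Column C: join both cells of each row into one value, like =A1&B1.
combined = [a + b for a, b in rows]

# Column D: count how often the value has appeared from the first row
# down to the current row, like =COUNTIF(C$1:C1,C1).
running_count = [combined[: i + 1].count(key) for i, key in enumerate(combined)]

# A count of 1 marks the first occurrence; anything higher is a duplicate.
kept = [row for row, count in zip(rows, running_count) if count == 1]

print(running_count)
print(kept)
```

One caveat with plain concatenation is that different rows can produce the same combined value (for example, "ab" joined with "c" and "a" joined with "bc" both give "abc"), so placing a separator between the columns is a safer choice when that could happen in your data.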
Related Functions
- Excel COUNTIF function
The Excel COUNTIF function counts the number of cells in a range that meet a given criterion. It can be used to count cells that contain numbers, dates, text values, blanks, non-blanks, or specific characters. Syntax: =COUNTIF(range, criteria)…